When AI Use Becomes Performance
How process audits reward coherence over cognition
The Second Draft: #0086
I write weekly articles for educators who are ready to get unstuck from outdated curriculum, resistant institutions, and a career that was built for a world that no longer exists.
We started this series by noting the growing discipline of AI transparency.
Not AI integration. Not AI literacy.
Just: AI auditing. The emerging belief that if students use AI, they should also submit:
→ Their prompts
→ Their chat logs
→ A documented thinking trace
→ A structured process narrative
We’re watching, in real time, the rise of a new genre of assignment: the AI Process Report.
In Part 1, I argued this movement rests on three fundamental misunderstandings:
1/ This isn’t how people actually think.
2/ This isn’t how people should use AI.
3/ This isn’t something we actually know how to assess.
Last time, we examined the first myth.
We looked at confabulation—making up rationale for choices after the fact.
At verbal overshadowing—how articulating a memory can actually degrade the memory.
At the rhizomatic nature of cognition—that human cognition is non-linear and largely inscrutable.
To summarize: when we ask students to reconstruct their thinking, we are asking them to generate a new artifact. Producing that artifact is a different skill in itself, the result likely bears little resemblance to what actually occurred, and the act of reconstruction might inadvertently degrade learning from the experience.
Now we turn to the second myth:
This isn’t how people should use AI.
Here’s the big idea: if the interaction can be neatly audited, then the collaboration with AI probably wasn’t effective in the first place.
Why We Ask for the Process
Before dismantling this, it’s worth stating clearly:
There are legitimate reasons educators are moving in this direction.
As far as I can tell, there are at least five intuitions driving the AI transparency movement.
1. We Want Evidence of Learning
If the product can be generated by AI, then the product alone no longer feels like proof of understanding.
So we ask for process.
We hope the thinking trace will reveal whether learning occurred.
That instinct makes sense.
2. We Want Evidence of Agency
We don’t want students outsourcing their cognition.
We want to know that they remained the author of the work.
That they exercised judgment.
That they made decisions.
So we ask them to narrate their choices.
Again, entirely reasonable.
3. We Want to Scaffold Metacognition
We believe that if students articulate their reasoning, they will become more aware of it.
More reflection now.
More deliberative action later.
So documentation becomes a developmental tool.
That instinct has deep roots in education (and, if you’ve read my series on metacognition, you know it’s usually well-intentioned and poorly executed).
4. We Want Integrity Without Prohibition
We know blanket bans don’t work.
We know students will use AI.
So instead of forbidding it, we regulate it.
Transparency feels like the middle ground between naïve acceptance and rigid control.
5. We Want Something We Can Grade
If outputs are unstable as evidence, we need another artifact.
Something observable.
Something perhaps rubric-able.
Process documentation gives us material to evaluate.
Even if we’re not entirely sure how.
None of these impulses are absurd.
They’re absolutely understandable.
The problem is that every one of these reasons assumes something about how thinking and AI collaboration actually work.
And that assumption is wrong.
Why the Audit Logic Fails
Here’s the deeper issue:
Audit frameworks assume that effective AI use follows a process that can be anticipated in advance.
Ask. Evaluate. Refine. Select.
But generative collaboration does not necessarily follow a stable template, which is the entire point of it being generative!
1. Effective Interactions Don’t Follow a Predictable Shape
A successful exchange might look like:
One carefully composed prompt.
One response.
Done.
Or:
Ten chaotic prompts.
Lots of contradictions.
Abandoned intentions.
Half-formed, ill-conceived attempts.
Or:
A short, sloppy prompt that unexpectedly surfaces exactly what was needed.
Or:
A beautifully structured prompt that produces nothing useful.
There is no consistent, visible pattern that distinguishes “good AI use” from “bad AI use,” and in those cases there is no necessarily meaningful distinction between “good thinking” and “bad thinking.”
That is, the form of the interaction does not reliably signal the quality of the thinking.
But an audit necessarily presumes that it does, or at least that it can.
2. The Most Important Work Happens Before and Between the Prompts
Again, given that “we think in our head,” the real thinking happens before anything is typed.
You sit.
You turn the problem over.
You imagine different angles.
Somehow your mind arrives at a prompt worth sending.
None of that is documentable, as such.
It is reconstructable to a degree, but a reconstruction is not a reliable record.
Other times, the real shift happens in the gap between response and next prompt, in the re-cognition that an encounter with new ideas can provoke.
The model says something.
You feel:
“Yes. That’s it.”
Or:
“No. That’s wrong.”
That recognition is not fully articulable.
It is tacit pattern matching.
The audit can capture what was typed.
Or what we imagine our minds should have done (confabulation).
But it cannot capture why something resonated in the moment it did.
And resonance is often the decisive move.
3. The Audit Presumes We Know the “Right” Process Shape
When we ask:
What was your goal?
What was your prompt?
What did you keep and reject?
We are implicitly assuming that effective AI collaboration contains:
Deliberate planning
Explicit evaluation
Rational selection
Clear justification
In other words, we assume it should look like a controlled reasoning chain.
But generative AI does not inherently reward pre-scripted control.
Quite often, by design, it rewards the exact opposite—unplanned exploration.
The very nature of the tool undermines the assumption that we can know ahead of time what the “correct” process will look like, or look back and define why a series of chaotic, inchoate, and uncertain prompts led us to our final output.
But the audit requires that we can.
Or at least that we imagine we can make sense of our chaos.
The Temptation
At this point, the obvious response is:
“Exactly. That’s why I want them to catalog it . . . If thinking is tacit, if collaboration is messy, if the decisive moments are fleeting—then all the more reason to slow it down and make it visible.”
That instinct is again completely understandable.
But we already addressed part of that in the first myth.
1/ When we force students to reconstruct their cognition after the fact, we invite confabulation. We get a coherent story, not necessarily an accurate one.
2/ When we force articulation of experiences that are partly tacit and pattern-based, we risk replacing lived cognitive texture with a linguistic substitute. The narration becomes the artifact. The artifact becomes the grade.
In another series, I’ve argued that metacognition can work, but only when the questions are positioned as developmental prompts for the learner, not as compliance artifacts for the evaluator.
There is a difference between cultivating awareness and auditing performance.
ICYMI: Here’s the link to the first article in that series, which details exactly the types of questions that can work: https://open.substack.com/pub/thepatrickdempsey/p/student-reflections-on-ai-use-doesnt?utm_campaign=post-expanded-share&utm_medium=web
For now, we’ll leave it there.
Because, here’s the thing:
Even if we grant every good intention behind AI transparency.
Even if we assume students can meaningfully identify “key moments.”
Even if we collect the logs, the traces, the crosswalks, the fingerprints . . .
We still face the one misunderstanding that brings the entire structure down:
3/ We wouldn’t know what to do with any of it anyway.
And that’s for next time.


