Why Williamson’s Account of Anti-Luminosity Has Important Implications for Educational Assessment

(Draft of a paper to be submitted to JOPE if I ever manage to get logged into their submissions site. Don't hold your breath.)

This essay sets out, first, to reconstruct Timothy Williamson’s anti-luminosity argument in a careful and accessible way, showing how the intuitive idea that whenever a condition obtains we can know that it obtains collapses under conditions of gradual change and limited discrimination (Williamson 2000). It then examines Williamson’s defence of this result against attempts to replace knowledge with probabilistic notions (Williamson 2012), arguing that appeals to high probability or expected value do not restore the transparency that luminosity promises. Finally, the essay applies these insights to educational assessment, suggesting that many contemporary systems are implicitly built around a luminosity ideal, the assumption that whenever a standard is met it can be known to be met, and that this assumption generates structural tensions, instability, and misplaced confidence in precision within current assessment practices.

What Williamson does with his anti-luminosity argument is to put pressure on a very tempting picture of rationality, self-knowledge, and evidence, a picture that at first looks almost undeniable. The temptation begins from the thought that rationality is meant to be fair to finite agents like us. Rationality is supposed to assess us in a way that takes seriously the limits on our information. If that is right, then it can seem obvious that whenever rationality requires something of us, we must at least be in a position to know that it requires it. If rationality demanded something that we could not even in principle tell was demanded, then it would look as though we could not be faulted for failing to comply. Rationality, on that way of thinking, should not make hidden demands.

That initial thought has a very wide reach. Once one starts from it, one is naturally led toward the idea that if something is part of one’s evidence, then one should be in a position to know that it is part of one’s evidence. After all, if rationality requires us to proportion belief to evidence, and if we are to be assessed fairly for doing so or failing to do so, then it seems that what counts as our evidence must itself be available to us. The same pattern of thought has often been extended to mental states. Philosophers have often supposed that if one is in pain, then one must be in a position to know that one is in pain. If one is having a certain sort of conscious experience, then one should be in a position to know that one is having it. One can think of this either as an independent source of the idea, or as part of the same general picture. Rationality seems to need some realm of conditions that are in that sense self-revealing, and mental states seem like the most plausible candidates.

Williamson introduces a general label for conditions of this kind. He says that a condition is luminous if, whenever it obtains, one is in a position to know that it obtains. By the phrase “in a position to know” he does not mean that one must actually have formed the belief already. He is allowing for the fact that the question may never have occurred to the subject. The claim is rather that as soon as the question does arise, the knowledge is there for the taking. If you satisfy the condition, then you can know that you do.

Williamson in Knowledge and Its Limits (Williamson 2000) argues that very few, if any, non-trivial conditions are luminous in this sense. The argument is meant to be highly general. It does not depend on the special content of any one example. It is really an argument pattern, something like a schema, which can be applied to many candidate luminous conditions.

The strategy is to begin by taking any condition that is supposed to be luminous and then to reason towards a contradiction. One imagines a very gradual process. At one end of the process there is a clear case in which the condition obtains. If the example is pain, then at the beginning there is a clear case of being in pain. At the other end there is a clear case in which it does not obtain, a clear case of not being in pain. The route from one end to the other is continuous and extremely gradual. The underlying changes are tiny. One then divides the process into a long series of very small stages. The crucial feature of the case is that the difference between any two adjacent stages is below the subject’s threshold of discrimination. The subject cannot tell those adjacent stages apart.

Once that setup is in place, Williamson introduces the pivotal claim. If at one stage you know that the condition obtains, then at the next stage it must still obtain. The thought behind this is that if the condition had already ceased to obtain at the next stage, then your belief at the earlier stage that it obtained would not have been secure enough to count as knowledge. Since the difference between the two stages is beneath your power of discrimination, a belief formed at the earlier stage would have been too close to error. So if you genuinely knew at the earlier stage that the condition obtained, then it must still obtain at the adjacent stage.

That claim by itself is only part of the setup. To get the contradiction, one combines it with luminosity. If the condition is luminous, then whenever it obtains, you are in a position to know that it obtains. Since Williamson is imagining a case in which the question is actually in play, he can set aside the mere “in a position to know” qualification and move directly to the point. If the condition obtains at one stage, then you know it obtains there. If you know it obtains there, then it must also obtain at the next stage. So if it obtains at one stage, it obtains at the next. Once you have that, iteration does the rest. The condition obtains at the beginning, so it must obtain one step later, and one step later again, and so on all the way across the whole series. But that is absurd, because the far end of the series was stipulated to be a clear case in which the condition does not obtain. So the luminosity assumption has generated a contradiction. Williamson’s conclusion is that the assumption of luminosity must be rejected.

At first glance, the structure resembles the sorites paradox, the ancient puzzle about heaps, baldness, and similar cases in which one tiny change seems unable to make a difference, but repeated many times does. Williamson acknowledges that. He also notes that this similarity might tempt one to think that the whole issue is somehow merely a problem about vagueness. But he thinks the luminosity argument actually gives an additional reason to think that the matter is not simply about vague language. What is really at issue, he suggests, is not linguistic vagueness but the relation between discrimination, evidence, and epistemic assessment.

Williamson has long argued that the concept of knowledge is central to epistemology, and he wants to continue defending that view (Williamson 2000). But he recognises a natural objection. Perhaps knowledge is just part of folk epistemology, part of our ordinary, unscientific conceptual scheme. Perhaps as epistemology becomes more serious and more scientific it should stop relying on the ordinary concept of knowledge and instead be reconstructed in probabilistic terms. One might think that what really matters is not knowledge but rational degree of belief, evidential support, credence, and Bayesian updating. Someone with that picture in mind might be tempted to dismiss the anti-luminosity argument as a quirk of the folk concept of knowledge, one that a properly probabilistic epistemology could simply leave behind.

Williamson does not think that anti-luminosity is a quirk of the folk concept of knowledge. But his strategy is to grant the move to a probabilistic epistemology for the sake of argument and to show that very similar anti-luminosity considerations reappear within it. In doing so he is trying to show that the argument has a deeper and more robust significance than one might have thought. It is not merely an argument against one familiar philosophical picture of self-knowledge. It is a structural obstacle that re-emerges even in a framework that was supposed to replace knowledge.

To make that move, Williamson sorts out different notions of probability. Not every notion of probability is relevant to epistemology. He distinguishes first between objective probability, often called chance, and subjective probability, often thought of as credence or degree of belief. Objective probability is supposed to be a feature of the world itself. It concerns propensities or chances in nature, not the epistemic position of any particular thinker. If determinism were true in the strongest sense, then objective probabilities might all collapse to zero or one, but in a quantum world one might think there are genuine objective chances strictly between zero and one. In any case, this notion is not what epistemology primarily needs, because it is not sensitive to what evidence an agent has. A law of nature might have objective probability one even when no one has any evidence for it. So chance does not by itself play the role the epistemologist requires.
Subjective probability, by contrast, is tied to the agent’s own degrees of belief. It is more clearly epistemic, but Williamson thinks it is too weak and too permissive to replace knowledge. One can have crazy subjective probabilities and still satisfy the formal probability axioms. Someone can assign extremely high confidence to ridiculous hypotheses. Merely having coherent credences is not enough to guarantee anything like a good epistemology. One can assign probability one to Trump being honest and still satisfy the probability axioms. The problem is that subjective probability without further discipline is not a sufficiently objective guide to evidence and truth.

What epistemology needs, he says, is an intermediate notion, something between objective chance and subjective credence. He calls this evidential probability. Evidential probability is the probability of a hypothesis on one’s evidence. It is less objective than chance because it depends on the evidence available to an agent, but it is more objective than mere credence because it is not simply whatever probability the agent happens to assign. It is the probability supported by the evidence. This is the notion Williamson thinks a scientific epistemology would have to rely on if it tried to do without knowledge.

In Knowledge and Its Limits he gave his own theory of evidential probability, and roughly speaking that theory builds it out of prior probability conditionalised on one’s evidence, where one’s evidence in turn consists of everything one knows. That would make evidential probability itself knowledge-based. But here he deliberately wants a probabilistic reconstruction of anti-luminosity that does not presuppose knowledge. So he treats evidential probability as a primitive notion that his opponent can use. He does not assume any special relation to knowledge, even though he personally believes there is one, because he wants to show that the anti-luminosity problem appears even on the most charitable interpretation of a probabilistic epistemology.

The simplest way to reproduce luminosity in a probabilistic framework is to replace knowledge with probability one. A condition would then be luminous in the new sense if, whenever it obtains, its evidential probability is one. That is, whenever the condition holds, the evidence leaves no room whatever for its failure. At first this might look like a very strong but natural probabilistic analogue of the original luminosity idea.

Williamson then shows that the old anti-luminosity structure returns almost unchanged. Suppose that at the first stage the evidential probability that the condition obtains is one. Could the condition fail to obtain at the next stage? No, says Williamson, because the next stage is too close to the present one. If the condition failed there, then already now there would have to be some non-zero evidential probability that it fails, since you cannot discriminate your precise place in the gradual series with that level of exactness. The very closeness of the adjacent stage means that if the condition fails there, then your present evidence cannot rule that out with certainty. So one gets the analogue of the earlier key principle. If at one stage the condition has evidential probability one, then at the next stage it must still obtain. Once again, if the condition were luminous in the probability-one sense, iteration would force it to obtain all the way along the series, which contradicts the setup. So probability-one luminosity fails in just the way ordinary luminosity fails.

But perhaps we think certainty is too much to ask. Perhaps the condition need only have very high evidential probability, say 0.9 or 0.95, whenever it obtains. Would that be enough to preserve the spirit of luminosity while avoiding the contradiction? Williamson responds in two stages, one about motivation and one about argument.

On motivation, he says that lowering the bar does not really capture the original point of luminosity. If the condition is pain, then the thought that if you are in pain there is a 0.9 probability on your evidence that you are in pain is not remotely what defenders of luminous self-knowledge wanted. They wanted immediacy and transparency, not high confidence with residual uncertainty. Likewise in the case of rationality. If rationality demands something of you, it is not obvious that you are criticisable merely because your evidence made it 0.9 likely that rationality demanded it. Maybe there are huge costs attached to complying, and it was reasonable to take the chance that the demand was not really there. So from the point of view of the original motivations, anything less than full certainty is already a significant retreat.

Williamson, in his initial probabilistic version of the argument, talks as if we can assign probabilities to exact points, for example, “the probability that I am exactly at time t,” or “exactly at this level of pain.” Quite rightly a critic will say that this way of talking is too simple and, in fact, mathematically problematic when we are dealing with things that vary continuously, like time, temperature, or degrees of pain.

Here is the key issue. If something varies continuously, then between any two points there are infinitely many intermediate points. There is no next point in a strict sense. So if you say, “I assign a small non-zero probability to being at the next point because I cannot discriminate it from where I am,” then the same reasoning applies to all the infinitely many nearby points. You would have to assign a non-zero probability to each of them.

But now we run into a basic rule of probability. The probabilities assigned to mutually exclusive possibilities have to add up to at most one. If you assign even a tiny fixed positive probability, say 0.001, to infinitely many distinct points, then when you add them all together, you get something that exceeds one. That is not allowed in standard probability theory. So Williamson’s initial way of describing things, in terms of assigning non-zero probability to exact points, cannot be taken literally if we want to stay within orthodox mathematics. It is a useful intuitive picture, but not a mathematically precise one.

Williamson therefore changes the framework. Instead of talking about exact points, he talks about intervals, that is, small stretches of the continuum that have some width. In probability theory, it is perfectly standard to assign probabilities to intervals rather than to individual points. For example, instead of asking “What is the probability that the temperature is exactly 20.000000 degrees?”, we ask “What is the probability that the temperature lies between 20 and 21 degrees?” That avoids the problem of having to assign positive probability to infinitely many exact points.

Now suppose there is some small interval of cases, not just a single point, throughout which the condition does not hold. For example, imagine a tiny range of situations where a person is not in pain. The assumption that such an interval exists is very weak and realistic. In any gradual transition from pain to no pain, there will be some stretch where the person is clearly not in pain. Now consider a point just outside that interval, very close to it. Because the subject’s powers of discrimination are limited, they cannot sharply distinguish their current situation from nearby ones. That means that from their current position, they cannot rule out the possibility that they are actually in that nearby interval where the condition fails. In probabilistic terms, their evidence must assign some positive probability to being in that interval.

If their evidence assigns some positive probability to being in an interval where the condition fails, then they cannot assign probability one to the condition’s obtaining at their current point. Probability one would mean that the evidence completely rules out failure, but here failure is still a live possibility. So we get a revised version of the key principle. If there is any nearby interval where the condition fails, then points sufficiently close to that interval cannot assign probability one to the condition holding. In other words, the “failure” of the condition spreads outward from that interval into neighbouring regions.

Williamson then uses the same iterative strategy as before. Start with a tiny interval where the condition does not hold. From that, we infer that slightly larger neighbouring regions cannot support probability one for the condition. Then slightly larger regions still, and so on. Step by step, the influence of that small interval propagates across the whole space. The upshot is that even without talking about exact points, and without making any mathematically dubious assumptions, we still get the same conclusion. There cannot be a condition such that, whenever it obtains, its evidential probability is one. The attempt to reconstruct luminosity in probabilistic terms still fails.
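The shape of this propagation can be made vivid with a small computational sketch. The numbers below (a thousand-point discretised series, a ten-point failure interval, a five-point discrimination radius) are invented for illustration and are not Williamson’s; the point is only to show how, once probability-one luminosity is assumed, the loss of certainty spreads from the initial failure interval across the whole series.

```python
# Illustrative sketch with invented numbers, not Williamson's own model.
# A gradual series is discretised into N positions. The condition C fails on a
# small initial interval. Under probability-one luminosity, any position whose
# evidential probability for C is below one is a position where C fails, so the
# loss of certainty (and hence, by the assumption, of C itself) spreads outward
# to every position within the discrimination radius.

N = 1000                      # number of positions in the discretised series (assumed)
failure = set(range(0, 10))   # the tiny interval where C does not hold (assumed)
RADIUS = 5                    # indiscriminable distance between positions (assumed)

not_certain = set(failure)    # positions that cannot assign probability one to C
changed = True
while changed:
    changed = False
    for i in range(N):
        if i in not_certain:
            continue
        # Evidence at i cannot rule out being at a nearby position where, given
        # the luminosity assumption, C fails; so certainty is lost at i as well.
        if any(abs(i - j) <= RADIUS for j in not_certain):
            not_certain.add(i)
            changed = True

print(len(not_certain) == N)  # -> True: no position retains probability one for C
```

The reductio is that the “clear case” end of the series, where C plainly does hold, is among the positions that lose certainty, so the probability-one version of luminosity cannot be maintained.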

The important thing to see is that the structure of the argument remains intact. The combination of three ideas, gradual change, limited discrimination, and the requirement for certainty, still generates the same pressure. Even in a fully continuous, mathematically respectable setting, the idea that we can always be certain when a condition obtains cannot be sustained.
At this point the temptation to move to probability less than one returns. So, instead of saying “Whenever condition C holds, you can be completely certain (probability = 1) that it holds,” we weaken it to something like “Whenever C holds, your evidence makes it very likely, say 0.9 or 0.8 or 0.6.” Now we are no longer demanding certainty, just strong confidence.

Now imagine a gradual transition. At one end, C clearly holds. At the other end, it clearly does not. Between them is a smooth spectrum of cases, each only slightly different from the next. Think of something like gradually dimming a light, or slowly reducing pain, or moving from a clearly excellent essay to a clearly weak one. Somewhere along this spectrum there is a boundary region, not a sharp line, but a zone where it shifts from “C” to “not-C.” Near that region, things are genuinely borderline. Further, add a key assumption. Your evidence cannot tell exactly where you are in this gradual sequence. That is, if you are in one position, you cannot sharply distinguish it from nearby positions. Importantly, your uncertainty does not favour one side over the other. It is symmetric. Being slightly on the “C” side looks very similar to being slightly on the “not-C” side.

Now think about what your evidence must look like when you are very close to the boundary, but still technically on the “C” side.

Because you cannot distinguish your exact position, your evidence must spread its probability across nearby possibilities. Some of those possibilities are cases where C holds, and some are cases where it does not. So your evidence does not concentrate entirely on C. It is divided. As you move closer and closer to the boundary, this effect becomes stronger. The possibilities on both sides become more and more similar. Your evidence has less and less reason to favour “C” over “not-C.” So the probability you assign to C gets closer and closer to one half.

That is Williamson’s key point. Near the boundary, your evidence cannot strongly support either side. So the probability of C drops, not because C is false, but because your evidence cannot clearly distinguish it from nearby cases where it is false. But in this scenario we wanted it to be true that whenever C holds, its probability is at least, say, 0.9. But we have just seen that if you go close enough to the boundary, while C still holds, the probability of C will fall below 0.9. In fact, it can be pushed arbitrarily close to 0.5. So no matter what threshold above one half you choose, 0.9, 0.8, 0.6, you can always find cases where C is still true but the probability drops below that threshold.

That means the weakened version of luminosity fails. Even if we relax the requirement from certainty to “high probability,” the structure of gradual change and limited discrimination still undermines the idea. The problem is not that we demanded too much certainty. The problem is that near boundaries, our evidence simply cannot support strong confidence in either direction. So any rule that says “whenever C is true, it must be highly probable” will break down in those borderline cases.
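A toy calculation makes the drift toward one half concrete. In the sketch below the numbers are invented: the underlying magnitude runs from 0 to 1, C holds below a boundary at 0.5, and the evidence locates the subject only to within a symmetric window of half-width 0.05 around the true position, so the evidential probability of C is taken to be the share of that window lying on the C side.

```python
# Toy sketch with invented numbers: evidential probability of C near a boundary.
# C holds for positions below 0.5. Evidence spreads uniformly over a window of
# half-width 0.05 around the true position, so P(C) is the fraction of the
# window lying on the C side of the boundary.

BOUNDARY = 0.5
HALF_WIDTH = 0.05  # limit of discrimination (assumed)

def evidential_prob_of_C(true_position: float) -> float:
    lo, hi = true_position - HALF_WIDTH, true_position + HALF_WIDTH
    c_side = max(0.0, min(hi, BOUNDARY) - lo)   # length of the window below the boundary
    return min(1.0, c_side / (hi - lo))

for x in (0.30, 0.46, 0.48, 0.49, 0.499):
    print(f"true position {x:.3f} (C holds): P(C) = {evidential_prob_of_C(x):.2f}")
# -> 1.00, 0.90, 0.70, 0.60, 0.51: C still holds at every one of these positions,
#    yet its evidential probability sinks toward 0.5, so no threshold above 0.5
#    can be guaranteed whenever C holds.
```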

That already shows that high-probability luminosity is much harder to sustain than one might have imagined. But Williamson explores an even subtler push-back against his position. Perhaps what matters is not the probability that the condition holds but something like the expected value of a variable. An expected value is the average value of a variable weighted by the probabilities of the possibilities. If there is a one-third chance that a variable is 7 and a two-thirds chance that it is 15, its expectation is one-third of 7 plus two-thirds of 15, which comes to 37/3, a little over 12. One might imagine a variable whose actual value is always equal to its expected value. That would give a weaker kind of epistemic transparency. You might not know the exact value with certainty, but if you used the expectation in your calculations, you would still get the right result.

Williamson gives a toy model to show that this is at least logically possible. Suppose time comes in five simple steps: 1, 2, 3, 4, 5. Now imagine that when the actual time is, say, 4, your evidence is a bit blurry. You cannot tell exactly which of three nearby times you are at. So your evidential probabilities are: 1/3 that it is 3, 1/3 that it is 4, and 1/3 that it is 5.

So you are not certain it is 4. In fact, you only give 4 a probability of one third. That means the exact time is definitely not luminous here. If it is actually 4, you are not in a position to know with certainty that it is 4. Now ask a different question. Even though you are uncertain, what is your expected time?

The expected value is just the weighted average: (1/3)⋅3 + (1/3)⋅4 + (1/3)⋅5 = 3/3 + 4/3 + 5/3 = 12/3 = 4. So your expected value is exactly 4.

That is the point Williamson is making in the toy model. You do not know the exact time, but your evidence is arranged in such a neat and symmetrical way that the average of the possibilities comes out exactly right.

So there is a kind of weaker match between world and evidence: the actual time is 4, and your expected time is also 4, even though you are uncertain whether the time is 3, 4, or 5. That is the first point.
Now the second point is why Williamson thinks this does not really save us.

At first sight you might think it does. You might say, “Fine, maybe we cannot have luminosity in the strong sense. Maybe we are never certain. But perhaps a weaker ideal is enough. Perhaps the actual value always matching the expected value gives us something almost as good.” Williamson’s answer is that this only works in very special toy cases, and when you try to generalise it, it collapses. The key idea is this. Imagine not just time, but any variable spread over a whole state space. A state space is just the range of possible situations you might be in. The variable could be time, temperature, amount of pain, level of confidence, degree of literary merit in an essay, or whatever.

Now suppose this variable takes different values at different states. For example, let us say the variable is highest at one particular state. Call that state the maximum point. Maybe the variable there is 10, and everywhere else it is lower than 10. Now suppose that from that state there is some non-zero probability of being at a nearby lower state: say, probability 0.8 that you are at the maximum state, where the value is 10, and probability 0.2 that you are at a nearby state, where the value is 9. Then the expected value is 0.8⋅10 + 0.2⋅9 = 8 + 1.8 = 9.8.

But the actual value at the maximum state is 10. So the expected value is less than the actual value. That means that at this maximum point, actual value and expected value do not match. And that is the general phenomenon Williamson is exploiting. If a variable really has a genuine peak somewhere, and there is any non-zero evidential spread toward lower-value alternatives, then the average will be dragged downward. So the expected value cannot equal the actual value there.
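Both halves of the contrast can be checked with a few lines of arithmetic. The figures below are the ones already used above (the uniform spread over times 3, 4, 5, and the 0.8/0.2 spread at the peak); only the code wrapper is mine.

```python
# Minimal check of the two cases discussed above, using exact fractions.
from fractions import Fraction

def expectation(distribution):
    """Expected value of a variable given {value: probability} pairs."""
    return sum(value * prob for value, prob in distribution.items())

# Symmetric time case: actual time 4, evidence spread evenly over 3, 4, 5.
symmetric = {3: Fraction(1, 3), 4: Fraction(1, 3), 5: Fraction(1, 3)}
print(expectation(symmetric))   # -> 4: the expectation matches the actual value

# Peak case: the actual value is 10 (the maximum), but the evidence leaves
# a 0.2 chance of being at a nearby state where the value is only 9.
at_peak = {10: Fraction(4, 5), 9: Fraction(1, 5)}
print(expectation(at_peak))     # -> 49/5, i.e. 9.8, strictly below the actual value 10
```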

The same would happen at a genuine minimum, except the average would be dragged upward if there is non-zero probability of higher alternatives. So the only way to avoid this problem everywhere is for there to be no genuine highs and lows at all. In other words, the variable must not vary. It must be the same everywhere. That is why he says that, in the relevant kinds of state spaces, if a variable always equals its expectation, then it must in effect be constant.

A constant variable is something like this: value 7 at state 1, value 7 at state 2, value 7 at state 3, value 7 at state 4. Then of course the expected value is always 7, because every possibility has value 7.

Williamson says such a variable is epistemically trivial. Why trivial? Because the whole point of evidence is to help distinguish where you are in the space of possibilities. If the variable has the same value at every point, then learning its value gives you no information about which state you are in.

Compare two cases. In the informative case, state A has value 3, state B has value 7, and state C has value 10; here, learning the value helps you locate yourself. In the trivial case, states A, B, and C all have value 7; here, learning the value tells you nothing about whether you are in A, B, or C.
So Williamson’s thought is that the expectation trick only works robustly when the variable has stopped being useful for epistemology. That is why he thinks this weaker substitute for luminosity is no real substitute at all. It either works only in a highly artificial toy symmetry case, like the time example, or, when generalised into a stable principle, it forces the variable to become constant and therefore uninformative.

The toy model shows that you can fail to know exactly where you are and yet still have an expected value equal to the actual value, so perhaps certainty is not needed. But the broader argument then shows that if you want this match to hold systematically across a rich space of possibilities, and the space really contains informative variation, and there is non-zero evidential spread to nearby alternatives, then the match breaks down. The average gets pulled away from the true value unless the value is flat everywhere. So the toy model is useful because it shows a logical possibility, but Williamson thinks it does not provide a serious epistemological replacement for knowledge or luminosity.

Williamson’s point is that certainty is too strong, and that the ideal of the expected value always matching reality looks weaker, and therefore more attainable, but that if you try to turn it into a general epistemic ideal it becomes either false or trivial. And that supports his larger conclusion that epistemology needs an intermediate notion like knowledge, not a perfectly transparent one.

Williamson considers another push-back against his position. Even if we cannot achieve luminosity or probability-one transparency outright, perhaps we can approximate it by iteration. Start with a variable x, then take its expectation, then the expectation of that expectation, and so on. Maybe repeated higher-order expectation gets us closer and closer to a luminous ideal. Williamson argues that in realistic state spaces the opposite happens. The more one iterates expectation, the more the differences between states are washed out. The process converges toward a state-independent limit. That means it tends not toward greater informational sharpness but toward a constant, uninformative value. What looked like an approximation to perfect epistemic privilege turns out to be a route by which the very distinctions that make learning possible are eroded.
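This flattening can also be watched happening in a small simulation. The setup below is invented for illustration: ten states carry different values of a variable, the evidence at each state spreads uniformly over that state and its immediate neighbours, and taking the expectation is applied over and over. The gap between the highest and lowest state-values shrinks toward zero, which is the convergence toward a state-independent, uninformative limit described above.

```python
# Invented toy simulation: iterating the expectation operator washes out the
# differences between states and drives the variable toward a constant.

def blur(values):
    """Replace each state's value by its expectation, where the evidence at a
    state spreads uniformly over that state and its immediate neighbours."""
    n = len(values)
    out = []
    for i in range(n):
        neighbourhood = values[max(0, i - 1): min(n, i + 2)]
        out.append(sum(neighbourhood) / len(neighbourhood))
    return out

values = [0, 1, 3, 7, 2, 9, 4, 8, 5, 6]   # an informative variable: it varies across states
for step in (0, 20, 40, 60):
    print(f"after {step:2d} iterations, spread between states = {max(values) - min(values):.3f}")
    for _ in range(20):
        values = blur(values)
# The printed spread shrinks toward zero: higher and higher iterated
# expectations no longer distinguish the states at all.
```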

Williamson is saying that epistemology cannot be built around notions that are too objective, like bare truth, because those notions do not reflect the epistemic situation of finite agents at all. But it also cannot be built around notions that are too privileged, too transparent, too internal to the subject, because once one strengthens epistemic notions to that level they either become impossible or they collapse into triviality. Luminosity is one version of that impossible ideal. Even sophisticated probabilistic approximations to luminosity end up sacrificing exactly what an epistemology needs, the possibility of learning from evidence and experience. An epistemology worthy of finite agents must therefore use intermediate notions, notions that are genuinely epistemic but not maximally self-transparent.

Williamson says that knowledge is exactly such an intermediate notion. Knowledge is not mere truth, because it is sensitive to the agent’s position and constraints. But it is not so internal or luminous that it guarantees self-transparency in every case. It requires truth, but it also reflects evidence, reliability, and the situation of the knower. That is why he thinks knowledge is not just one topic among others in epistemology, nor a relic of folk theory, but the kind of notion epistemology genuinely needs. Its intermediate status is not a defect. It is what makes it fit for purpose.

One might contend that if anti-luminosity can be restated in probabilistic terms, perhaps this weakens Williamson’s knowledge-first strategy rather than strengthening it. Perhaps Bayesian or rational-belief epistemologists can simply say that their framework reproduces the same insight, and so knowledge is no longer special. Williamson’s reply is that this would be a problem only if probabilistic epistemology could stand on its own. But he thinks it cannot. Bayesian theories talk constantly about updating on evidence while giving little account of what evidence itself is. His further argument is that the best candidate for what counts as one’s evidence is what one knows. So even if the anti-luminosity argument can be reconstructed in probabilistic terms, that does not mean knowledge is dispensable. It means only that the anti-luminosity phenomenon is robust. Knowledge still retains its foundational role because evidential probability itself, properly understood, depends on knowledge.

We might worry that his argument is illicitly helped by assumptions about reliability, or by holding too much fixed in the examples. Williamson makes it clear that his examples are meant as counterexamples to universal luminosity claims, so he is allowed to stipulate ordinary, non-miraculous settings. If someone wants to say that God could tell you exactly whether you are in pain in a borderline case, that does not matter, because luminosity is a universal thesis and one clean non-supernatural counterexample is enough to refute it. Similarly, if you argue that changing evidence across time disrupts the argument, he would point out that the changing evidence is built into the structure of the case. The whole point is to track how evidence and probabilities evolve in a gradual process. One does not need a different kind of argument to accommodate that. One only needs to choose the intervals small enough that the relevant discrimination assumptions continue to hold.

We might ask why rationality should require probability one rather than merely some high probability. Here Williamson would return us to the motivating thought. If your probabilistic calculations themselves are uncertain, then those uncertainties would have to be fed into still higher-order calculations, and those into higher-order ones again. Unless somewhere the process stops with some probabilities that are taken as fixed, the theory cannot properly guide reasoning. The ideal of luminosity reappears here as the demand for a stopping point in higher-order uncertainty. Anti-luminosity then tells us that such a stopping point will not take the strong, transparent form one might have hoped for.

All this shows that the philosophical pressure against luminosity is not an artefact of Williamson’s knowledge-first outlook. The pressure re-emerges even if one shifts the discussion into probabilistic terms and tries to reconstruct epistemology without relying on knowledge. But the result of that reconstruction is not that knowledge becomes dispensable. It is rather that probabilistic epistemology inherits the same structural limitations and then reveals its own need for an intermediate notion of evidence that, Williamson thinks, is best understood through knowledge. 

Probability does not rescue luminosity. It deepens the lesson, by showing why epistemology needs notions that are neither too weakly connected to the knower nor too strongly privileged from the knower’s point of view. Knowledge, for Williamson, remains the central notion precisely because it occupies that difficult but indispensable middle ground.

Now if we take Williamson’s framework and treat it not simply as an abstract epistemological argument but as a model of how knowing and judging actually work under human limitations, we can begin to see something quite revealing about educational assessment. What at first looks like a technical debate about knowledge, probability, and self-awareness turns out to map closely onto the everyday practices of marking essays, grading exams, and evaluating student understanding, where judgment is nuanced, interpretive, and often gradual rather than sharply defined.

To see this, it helps to restate the core insight in plain terms. Williamson shows that we cannot build a system in which a person always knows, with certainty, whether a condition holds, even when that condition is something as apparently immediate as being in pain. The reason is that human judgment operates under constraints: we cannot detect arbitrarily small differences, and our judgments must be stable across nearby cases if they are to count as knowledge. When these constraints are combined with gradual change, the idea of perfect self-transparency collapses. There will always be borderline cases where the condition holds but we cannot know that it holds.

Now imagine translating that structure into the domain of educational assessment. Educational systems often implicitly rely on a version of Williamsonian luminosity. They assume something like this: if a student meets a standard, then it should be possible to determine that they meet it; if a piece of work is excellent (a “Grade A”), then a competent assessor should be able to recognise that with certainty.

This assumption shows up in many familiar practices: detailed mark schemes that aim to specify exactly what counts as a top-grade essay, rubrics that break performance into finely graded descriptors, moderation procedures designed to ensure that different assessors reach the same judgment, and accountability systems that treat grades as precise indicators of ability. All of these reflect a desire for epistemic transparency in assessment. The system is trying to ensure that educational quality is something that can be clearly identified, known, and justified.

Now consider what actual student work looks like. Take English literature. Imagine a series of essays responding to the same question about, say, Shakespeare’s Macbeth. At one end, you have a clearly outstanding essay with sophisticated interpretation, careful attention to language, coherent argument and insight into themes and context. At the other end, you have a clearly weak essay with misunderstandings of the text, little structure, minimal evidence and superficial commentary. Between these two extremes lies a continuous range of essays, each differing only slightly from its neighbours.

One essay is slightly more insightful than another, another uses quotations more effectively and yet another is just a bit more coherent. The differences are often extremely fine-grained. In practice, assessors are dealing with something like Williamson’s gradual series: a progression from clear success to clear failure through many small steps.

Now add Williamson’s second key constraint of limited discrimination. Even experienced teachers cannot perfectly distinguish between very similar pieces of work. Two essays might be equally well argued but in slightly different ways, one slightly more elegant stylistically, the other slightly stronger analytically, one with better structure, the other with deeper insight.

In such cases the difference between the essays may fall below the assessor’s reliable threshold of discrimination. This is not a failure of professionalism. It is a structural feature of human judgment. Just as we cannot detect infinitesimal differences in temperature or colour, we cannot reliably detect every subtle difference in quality between two complex pieces of writing.

Williamson’s argument also relies on a reliability condition: if you know that something is the case, your judgment must be stable across nearby cases. Translated into assessment, this means that if an assessor confidently judges an essay to be “Grade A,” then very similar essays should also be “Grade A.” Otherwise, the judgment would be unreliable. If two nearly identical essays receive different grades, we begin to suspect arbitrariness. This is why educational systems emphasise consistency, standardisation and moderation. These are attempts to enforce a kind of local stability in judgments.

Now we can see the structure of Williamson’s argument playing out in assessment. Suppose we try to maintain the luminosity assumption that whenever an essay deserves Grade A, it can be known to deserve Grade A. Combine this with gradual differences between essays, limited discrimination, and the requirement of consistency across similar cases, and we get a problem.

If Essay 1 is clearly Grade A, then Essay 2, which is almost identical, should also be Grade A. Then Essay 3, which is only slightly worse than Essay 2, should also be Grade A. And so on. By repeating this reasoning across the gradual series, we are pushed toward the conclusion that even clearly weaker essays should be Grade A. But that is bonkers. So something has to give.
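The slide can be made painfully explicit with a schematic sketch. The numbers below are invented: one hundred essays whose quality declines in steps of 0.4 points, an assessor who cannot reliably discriminate differences smaller than 0.5 points, and the consistency rule that an essay indistinguishable from a Grade A essay must itself receive a Grade A.

```python
# Schematic sketch with invented numbers: the grading analogue of the sorites slide.

qualities = [90 - 0.4 * i for i in range(100)]  # essay 1 scores 90; essay 100 scores about 50
THRESHOLD = 0.5   # quality differences below this are indiscriminable to the assessor (assumed)

grade_a = [False] * len(qualities)
grade_a[0] = True  # essay 1 is stipulated to be a clear Grade A

# Consistency rule: if an essay is Grade A and the next essay in the series is
# indistinguishably close in quality, the next essay must be Grade A as well.
for i in range(len(qualities) - 1):
    if grade_a[i] and abs(qualities[i] - qualities[i + 1]) < THRESHOLD:
        grade_a[i + 1] = True

print(all(grade_a))               # -> True: every essay ends up a Grade A
print(round(qualities[-1], 1))    # -> 50.4: including one that plainly is not
```

Holding all three commitments at once, luminous standards, indiscriminable neighbours, and consistency, is what generates the absurd conclusion; the sketch simply makes the iteration visible.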

Williamson’s conclusion in the philosophical case is that luminosity must be rejected. In the educational case, the analogue is that it is not always possible to know, with certainty, whether a piece of work meets a given standard. There will be borderline cases where the work meets the standard but we are not in a position to know that it does.

Educational systems, like philosophers, do not usually accept this conclusion quietly. Instead, they attempt to repair the model. One response is to increase precision. Add more criteria. Break grades into sub-levels. Specify finer distinctions. So, for example, in history “Demonstrates understanding of context” becomes “Accurate contextual knowledge”, “Relevant contextual integration” and “Sophisticated contextual analysis”. The hope is that more detail will eliminate ambiguity. But Williamson’s framework predicts that this will not work. The problem is not lack of detail. It is the combination of gradual variation and limited discrimination. No matter how fine-grained the rubric becomes, there will still be borderline cases.
Another response is to accept uncertainty. “This essay is probably a Grade A”. “There is high confidence it meets the standard”. This kind of thing corresponds to Williamson’s move from knowledge to probability. But in practice, this creates problems. Decisions still have to be made (a grade must be assigned). High probability is not certainty. Stakeholders (students, parents, institutions) expect definite outcomes. And near boundaries, confidence naturally drops. Two essays near a grade boundary might both have only moderate confidence of being in either category. So this approach does not remove the structural difficulty.

A third response is to increase agreement. Multiple markers. External moderation. Standardisation meetings. These aim to stabilise judgments across assessors. They can improve consistency, but they do not eliminate the underlying issue. Instead, they often produce negotiated judgments, consensus-based decisions and institutional conventions. These are practical solutions, but they do not restore luminosity. They do not make it true that whenever a standard is met, it can be known with certainty that it is met.

When we step back, Williamson’s framework helps us see that educational assessment is operating under a deep structural constraint. There is no way to build an assessment system that is both fully precise and fully usable by human agents. If we try to make it fully precise, we demand distinctions that cannot reliably be made, and we generate inconsistency or arbitrariness. If we relax precision, we accept uncertainty and borderline cases, and we lose the ideal of exact classification.

This is not a flaw in particular systems. It is a feature of the domain. Consider two essays on Macbeth. Essay A has a subtle reading of ambition and elegant prose, but is slightly thin on textual evidence. Essay B has strong textual evidence but slightly less interpretive depth. Which is better? Different assessors may reasonably disagree. Even the same assessor might hesitate. Williamson’s framework explains why. The essays are very close in quality. The difference may be below reliable discrimination. Any sharp decision introduces instability.

Or consider two history essays on the causes of World War I. Essay A has strong factual accuracy but a weaker analytical structure. Essay B has a strong argument but a few minor inaccuracies. Again, the difference is multi-dimensional and gradual. There is no clear point at which “This is definitively a Grade A, and this is not”, even if the grading scheme demands such a boundary.

The deeper implication is thus that assessment is inherently non-luminous. There will always be cases where a student’s work meets a standard but it cannot be known with certainty that it does. And conversely, a work may fail to meet a standard, but not in a way that can be decisively demonstrated. This is not because assessors are incompetent, but because the domain is continuous, human judgment is limited, and knowledge requires stability. (Systems, machine-based or otherwise, that claim they can discriminate across all cases are therefore bogus.)

Williamson’s framework suggests that a better model of assessment would accept borderline cases as inevitable, treat grades as approximations rather than exact measures, recognise the role of judgment under uncertainty, and avoid the illusion that more detail equals more knowledge.

In other words, assessment should be understood as operating in an intermediate epistemic space, neither fully precise nor arbitrary, but structured, constrained, and limited. What Williamson ultimately shows, when applied to education, is something both sobering and clarifying. The dream of a perfectly precise, fully transparent assessment system, one that can always tell exactly who deserves what, is not just difficult. It is structurally impossible.

And yet, this does not make assessment meaningless. Rather, it means that assessment must be fallible, context-sensitive, and reliant on informed judgment. In that sense, the role of the teacher or assessor is not to mechanically detect pre-existing, perfectly knowable categories, but to exercise disciplined judgment within a space where knowing and not knowing are inevitably intertwined.

References


Williamson, Timothy. Knowledge and Its Limits. Oxford: Oxford University Press, 2000.
Williamson, Timothy. “Timothy Williamson – Probabilistic Luminosity.” Lecture, YouTube video, October 17, 2012. https://www.youtube.com/watch?v=zDq2pIoSCbc