On a Heuristic Point of View Concerning the Production and Transformation of Light

Author

A. Einstein

Published

1905

Einstein, A. Ann. Phys. 1905, 17, 132–148. Original German title: “Über einen die Erzeugung und Verwandlung des Lichtes betreffenden heuristischen Gesichtspunkt”

To the Reader

This is the 1905 paper for which Einstein received the Nobel Prize in Physics (1921). It introduces the revolutionary concept of the light quantum (later called the photon) and explains the photoelectric effect, the emission of electrons when light shines on a metal surface.

For General Chemistry students, this paper explains why light behaves as if it comes in discrete packets of energy E = hν. Einstein shows that the energy of ejected electrons depends on light frequency, not intensity. Classical wave theory cannot explain this. The quantum nature of light underlies everything from spectroscopy to solar cells.

Eight years after this paper, Niels Bohr would apply the same quantum relation to atomic spectra: when an electron drops from a higher to a lower energy level, the atom emits light whose energy E = hν equals the energy difference between levels. This is why hydrogen emits specific colors (the Balmer series) rather than a continuous rainbow. Electrons can only occupy certain quantized energy levels.

The historical significance of this paper is remarkable. When Planck introduced his constant h in 1900, he viewed quantization as a mathematical trick to make the equations work, not as a statement about physical reality. Einstein took it seriously as physics, proposing that light itself comes in discrete packets. Most physicists, including Planck himself, initially rejected this idea. Full acceptance came only after Millikan’s precise photoelectric measurements (1916) and the Compton effect (1923). Einstein received the 1921 Nobel Prize specifically for this work on the photoelectric effect, not for relativity.

On a Heuristic¹ Point of View Concerning the Production and Transformation of Light

¹ Heuristic means “serving as an aid to learning or discovery.” Einstein chose this cautious word because he knew the light quantum idea was revolutionary and not yet proven. 1905 was Einstein’s Annus Mirabilis (miracle year): the 26-year-old patent clerk published four papers that transformed physics, on special relativity, Brownian motion, the photoelectric effect (this paper), and mass-energy equivalence (E = mc²). He later called this paper “very revolutionary,” even more so than special relativity. Why? The wave theory of light had decades of experimental confirmation (interference, diffraction, polarization). Proposing that light comes in particles seemed to contradict all of that.

by A. Einstein

Between the theoretical conceptions which physicists have formed about gases and other ponderable bodies, and Maxwell’s theory of electromagnetic processes in so-called empty space, there exists a profound formal difference.² While we conceive the state of a body as completely determined by the positions and velocities of a very large but nevertheless finite number of atoms and electrons, we make use of continuous spatial functions to determine the electromagnetic state of a space, so that a finite number of quantities is not to be regarded as sufficient for the complete specification of the electromagnetic state of a space. According to Maxwell’s theory, energy is to be conceived as a continuous spatial function for all purely electromagnetic phenomena, hence also for light, while according to the present conception of physicists the energy of a ponderable body is to be represented as a sum extending over the atoms and electrons.³ The energy of a ponderable body cannot be divided into arbitrarily many, arbitrarily small parts, while according to Maxwell’s theory (or more generally according to any wave theory) the energy of a light ray emitted from a point source continuously distributes itself over an ever-increasing volume.

² Einstein opens by highlighting a fundamental inconsistency in physics circa 1905. Matter was known to be made of discrete atoms, but light was treated as a continuous wave. This paper proposes that light, too, is discrete.

³ “Ponderable” means “having weight,” i.e., ordinary matter. Einstein contrasts the atomic theory of matter (discrete particles) with Maxwell’s electromagnetic theory (continuous waves). This tension will be resolved by quantizing light.

The wave theory of light operating with continuous spatial functions has proved itself admirably in the representation of purely optical phenomena and will probably never be replaced by another theory.⁴ It should be kept in mind, however, that optical observations refer to time averages rather than to instantaneous values, and it is quite conceivable, despite the complete confirmation of the theory of diffraction, reflection, refraction, dispersion, etc., by experiment, that the theory of light operating with continuous spatial functions leads to contradictions with experience when one applies it to the phenomena of production and transformation of light.

⁴ Einstein acknowledges the wave theory’s success in explaining interference, diffraction, and other optical phenomena. He’s not rejecting waves entirely. Rather, he’s proposing that light behaves as particles in certain processes. The full reconciliation of wave and particle pictures, wave-particle duality, would come later with de Broglie (1924) and quantum mechanics.

It seems to me, in fact, that the observations about “black-body radiation,” photoluminescence, the production of cathode rays by ultraviolet light, and other groups of phenomena concerning the production or transformation of light appear more intelligible under the assumption that the energy of light is distributed discontinuously in space.⁵ According to the assumption to be considered here, when a light ray spreads out from a point source, the energy is not continuously distributed over an ever-increasing volume, but consists of a finite number of energy quanta localized at points in space, which move without dividing and which can only be absorbed or emitted as wholes.⁶

⁵ Here Einstein lists three phenomena that wave theory struggles to explain: (1) black-body radiation, the spectrum of light emitted by hot objects; (2) photoluminescence, materials that absorb light and re-emit it at different wavelengths; (3) cathode rays from UV light, the photoelectric effect, where light ejects electrons from metals.

⁶ Einstein proposes that light consists of discrete “energy quanta,” particles we now call photons (the word was coined by Gilbert Lewis in 1926). Each photon is indivisible: it’s absorbed or emitted whole, never split. This was radical in 1905; most physicists, including Planck, initially rejected it. Planck had quantized the oscillators in his black-body theory, not light itself. Einstein went further, and Planck thought he went too far. In 1913, when recommending Einstein for the Prussian Academy, Planck wrote that Einstein “may sometimes have missed the target in his speculations, as for example in his hypothesis of light quanta.” The word “quantum” (plural: quanta) means a discrete, indivisible unit, the smallest possible “packet” of something. In Gen Chem, you’ll encounter quantized electron energy levels in atoms, which explains why atoms emit light at specific wavelengths (line spectra).

In the following I wish to present the train of thought and cite the facts which have led me to this path, in the hope that the point of view to be presented may prove useful to some researchers in their investigations.

§1. On a Difficulty Concerning the Theory of “Black-Body Radiation”

We first place ourselves on the standpoint of Maxwell’s theory and electron theory and consider the following case.⁷ In a space enclosed by completely reflecting walls, let there be a number of gas molecules and electrons which are freely movable and which exert conservative forces on each other when they come very close together, i.e., can collide with each other like gas molecules according to kinetic gas theory.⁽¹⁾ Let a number of electrons furthermore be bound to widely separated points of the space by forces directed toward these points and proportional to the elongations from these points. These electrons shall also enter into conservative interaction with the free molecules and electrons when the latter come very close. We call the electrons bound to points in space “resonators”; they emit and absorb electromagnetic waves of definite period.⁸

⁷ Einstein sets up a thought experiment: a box with perfectly reflecting walls containing gas molecules, free electrons, and bound “resonator” electrons that can oscillate and emit/absorb light. This models how matter and radiation reach thermal equilibrium.

⁸ “Resonators” are electrons bound by spring-like forces (proportional to displacement). They oscillate at specific frequencies and emit electromagnetic radiation. This model, developed by Planck, represents atoms interacting with light. In this section Einstein treats the resonators classically; in Planck’s theory their energies are quantized.

According to the present view about the origin of light, the radiation in the space considered, which is found for the case of dynamical equilibrium on the basis of Maxwell’s theory, must be identical with “black-body radiation”, at least provided that resonators of all frequencies under consideration are regarded as present.⁹

⁹ A black body is an idealized object that absorbs all incident radiation and, when heated, emits a characteristic spectrum depending only on temperature. Hot objects like stars, molten metal, and incandescent light bulb filaments approximate black bodies. The cavity with reflecting walls and resonators at all frequencies serves as a theoretical model of a black body.

For the time being we disregard the radiation emitted and absorbed by the resonators and inquire into the condition for dynamical equilibrium corresponding to the interaction (collisions) of molecules and electrons. Kinetic gas theory provides for the latter the condition that the mean kinetic energy of a resonator electron must equal the mean kinetic energy of the translational motion of a gas molecule.¹⁰ If we decompose the motion of a resonator electron into three oscillations in three mutually perpendicular directions, we find for the mean value \(\bar{E}\) of the energy of one such linear oscillatory motion¹¹

¹⁰ The equipartition theorem from classical statistical mechanics: at thermal equilibrium, energy is shared equally among all degrees of freedom. Each oscillator should have the same average energy as a gas molecule’s translational motion.

¹¹ Einstein derives what classical physics predicts for the average energy of an oscillator. The result, RT/N = kT (where k is Boltzmann’s constant), is the classical equipartition value. It leads to disaster.

\[\bar{E} = \frac{R}{N}T\]

where R denotes the absolute gas constant, N the number of “actual molecules” in one gram-equivalent,¹² and T the absolute temperature.

¹² A “gram-equivalent” is what we now call a mole. The number N is Avogadro’s number, approximately 6.022 × 10²³ particles per mole. Einstein’s R/N equals Boltzmann’s constant k.

The energy \(\bar{E}\) is, namely, equal to ⅔ of the mean kinetic energy of a free monatomic gas molecule on account of the equality of the time averages of kinetic and potential energy of the resonator. If through some cause—in our case through radiation processes—it should happen that the energy of a resonator has a time average greater or smaller than \(\bar{E}\), then the collisions with the free electrons and molecules would lead to an energy exchange with the gas that on average is different from zero. Thus dynamical equilibrium is only possible in the case we are considering when each resonator has the mean energy \(\bar{E}\).

We now make a similar consideration with respect to the interaction of the resonators and the radiation present in the space. Mr. Planck has derived⁽²⁾ the condition for dynamical equilibrium in this case under the assumption that the radiation can be treated as the most disordered conceivable process.⁽³⁾ He found:¹³

¹³ Planck showed that when resonators are in equilibrium with radiation, there’s a relationship between the resonator’s average energy and the radiation energy density. This equation connects microscopic oscillators to the macroscopic radiation field.

\[\bar{E}_\nu = \frac{L^3}{8\pi\nu^2}\varrho_\nu\]

\(\bar{E}_\nu\) is here the mean energy of a resonator of eigenfrequency¹⁴ ν (per oscillation component), L the speed of light,¹⁵ ν the frequency, and ρ_ν dν the energy per unit volume of that part of the radiation whose frequency lies between ν and ν + dν.

¹⁴ “Eigenfrequency” means natural frequency or resonant frequency, the frequency at which a system naturally oscillates. From German eigen (own, characteristic) + frequency.

¹⁵ Einstein uses L for the speed of light; modern notation uses c (from Latin celeritas, speed). The value is approximately 3 × 10⁸ m s⁻¹.

If the radiation energy of frequency ν is not to be constantly increased or decreased at the expense of the energy of matter, the following must hold:¹⁶

¹⁶ Combining the two equilibrium conditions (resonators with gas molecules and resonators with radiation), Einstein derives the Rayleigh-Jeans law. This is what classical physics predicts.

\[\frac{R}{N}T = \bar{E} = \bar{E}_\nu = \frac{L^3}{8\pi\nu^2}\varrho_\nu\]

\[\varrho_\nu = \frac{R}{N}\frac{8\pi\nu^2}{L^3}T\]

This relation, found as the condition of dynamical equilibrium, not only lacks agreement with experience; it also says that in our picture there can be no question of a definite energy distribution between aether¹⁷ and matter.¹⁸ The wider the range of oscillation numbers of the resonators is chosen, the greater becomes the radiation energy of the space, and in the limit we obtain¹⁹

¹⁷ The luminiferous aether was a hypothetical medium thought to pervade all space and transmit light waves. Einstein’s special relativity paper, published later the same year, would make the aether concept unnecessary.

¹⁸ The classical result fails catastrophically. It predicts radiation energy density increases as ν², giving more and more energy at higher frequencies. This contradicts experiments showing black-body radiation peaks at a specific frequency and then decreases.

¹⁹ This divergent integral represents the ultraviolet catastrophe (a term coined later by Paul Ehrenfest). Classical physics absurdly predicts that a black body should radiate infinite power, concentrated at high frequencies. This crisis demanded a revolutionary solution: quantum theory.

\[\int_0^\infty\varrho_\nu\,d\nu = \frac{R}{N}\frac{8\pi}{L^3}T\int_0^\infty\nu^2\,d\nu = \infty\]

§2. On Planck’s Determination of the Elementary Quanta

We want to show in the following that Mr. Planck’s determination of the elementary quanta is to a certain degree independent of the theory of “black-body radiation” established by him.²⁰

²⁰ Einstein will show that Planck’s constant h (hidden in the constants α and β) can be extracted from experimental data without fully accepting Planck’s theoretical framework. This makes the quantum hypothesis more robust.

Planck’s formula⁽⁴⁾ for ρ_ν, which satisfies all experiments up to now, reads:²¹

²¹ Planck’s radiation law (1900) perfectly matches experimental black-body spectra. The key is the exponential term with βν/T in the denominator, which prevents the ultraviolet catastrophe by suppressing high-frequency contributions.

\[\varrho_\nu = \frac{\alpha\nu^3}{e^{\frac{\beta\nu}{T}} - 1}\]

where

\[\alpha = 6.10 \times 10^{-56}\] \[\beta = 4.866 \times 10^{-11}\]

For large values of T/ν, i.e., for large wavelengths and radiation densities, this formula goes over in the limit into the following:

\[\varrho_\nu = \frac{\alpha}{\beta}\nu^2 T\]

One sees that this formula agrees with the one developed in §1 from Maxwell’s theory and electron theory.²² By equating the coefficients of both formulas one obtains:

²² At low frequencies (long wavelengths), Planck’s formula reduces to the classical Rayleigh-Jeans result. Classical physics works in this limit. The quantum effects only become important at high frequencies, where the exponential term dominates.

\[\frac{R}{N}\frac{8\pi}{L^3} = \frac{\alpha}{\beta}\]

\[N = \frac{\beta}{\alpha}\frac{8\pi R}{L^3} = 6.17 \times 10^{23}\]

i.e., one atom of hydrogen weighs 1/N gram = 1.62 × 10⁻²⁴ g.²³ This is exactly the value found by Mr. Planck, which agrees satisfactorily with values found by other methods for this quantity.

²³ Einstein extracts Avogadro’s number N = 6.17 × 10²³ from black-body radiation data. The modern value is 6.022 × 10²³. This connected the quantum theory of light to the atomic theory of matter.
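The coefficient comparison can be checked numerically. The sketch below assumes CGS units (R in erg per mole per kelvin, L in cm/s), which the text does not state explicitly; the function name and the rounded values of R and L are illustrative choices. One caveat: with α exactly as printed above (6.10 × 10⁻⁵⁶) the formula gives a result ten times smaller than the quoted one. The quoted N = 6.17 × 10²³ is recovered with α = 6.10 × 10⁻⁵⁷, which is also approximately what 8πh/L³ gives in CGS units, so the printed exponent appears to be off by one.

```python
import math

R_CGS = 8.31e7      # gas constant in erg/(mol K); assumed CGS value
L_CGS = 3.00e10     # speed of light in cm/s; assumed rounded value
BETA = 4.866e-11    # beta as given in the text

def avogadro_from_radiation(alpha, beta=BETA, R=R_CGS, L=L_CGS):
    """N = (beta/alpha) * 8*pi*R / L**3, from equating the two coefficients."""
    return (beta / alpha) * 8.0 * math.pi * R / L**3

N_printed = avogadro_from_radiation(6.10e-56)    # alpha as printed in the text
N_adjusted = avogadro_from_radiation(6.10e-57)   # alpha with exponent -57 (assumption)

print(f"alpha = 6.10e-56 -> N = {N_printed:.3g}")
print(f"alpha = 6.10e-57 -> N = {N_adjusted:.3g}")
print(f"mass of one hydrogen atom = {1.0 / N_adjusted:.3g} g")
```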

We arrive therefore at the conclusion: the greater the energy density and wavelength of a radiation, the more usable the theoretical foundations we have used prove to be; for small wavelengths and small radiation densities, however, they fail completely.²⁴

²⁴ A key insight: classical physics works for long wavelengths and high intensities, but fails for short wavelengths (high frequencies) and low intensities. This is precisely where quantum effects dominate. Remember: wavelength (λ) and frequency (ν) are inversely related: c = λν. Short wavelength = high frequency = high photon energy. UV and X-rays (short λ) carry more energy per photon than visible or infrared light (long λ).
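To see how quickly the classical formula breaks down, one can compare the two radiation laws through the dimensionless variable x = βν/T. Dividing Planck’s formula by the limiting form (α/β)ν²T gives x/(eˣ − 1), independent of α. The sketch below is illustrative only; the function name and the sample values of x are hypothetical.

```python
import math

def planck_over_classical(x):
    """Ratio of Planck's rho_nu to the classical (alpha/beta) nu^2 T, with x = beta*nu/T."""
    return x / math.expm1(x)   # expm1(x) = e^x - 1, accurate for small x

for x in (0.01, 0.1, 1.0, 5.0, 10.0):
    print(f"x = {x:5.2f}   Planck / classical = {planck_over_classical(x):.4g}")
```

For x much smaller than 1 (long wavelengths, high temperatures) the ratio stays close to 1, so the classical foundations remain usable; for x much larger than 1 it falls off roughly as x e⁻ˣ, which is the regime where they fail.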

In the following, “black-body radiation” shall be considered based on experience without establishing a picture about the production and propagation of the radiation.

§3. On the Entropy of Radiation

The following consideration is contained in a famous work by Mr. W. Wien and is included here only for the sake of completeness.²⁵

²⁵ Wilhelm Wien won the 1911 Nobel Prize for his work on black-body radiation. His “displacement law” relates the peak wavelength of a black body’s emission to its temperature: λ_max × T = constant.

Let there be a radiation occupying the volume v. We assume that the observable properties of this radiation are completely determined when the radiation density ρ(ν) is given for all frequencies.⁽⁵⁾ Since radiations of different frequencies are to be regarded as separable from each other without doing work and without supplying heat, the entropy of the radiation can be represented in the form²⁶

²⁶ Entropy (S) measures the dispersal of energy in a system. Einstein treats radiation of each frequency as independent, so the total entropy is the sum (integral) of contributions from all frequencies. This additivity is crucial for his argument.

\[S = v\int_0^\infty\varphi(\varrho,\nu)\,d\nu\]

where φ is a function of the variables ρ and ν. φ can be reduced to a function of only one variable through formulation of the statement that adiabatic compression of a radiation between reflecting walls does not change its entropy. We shall not enter into this, however, but shall immediately investigate how the function φ can be determined from the radiation law of black bodies.

For “black-body radiation,” ρ is such a function of ν that the entropy is a maximum for given energy, i.e., that²⁷

²⁷ The second law of thermodynamics: at equilibrium, entropy is maximized. For black-body radiation, this constrains the form of the entropy function φ. Einstein uses calculus of variations to find this constraint.

\[\delta\int_0^\infty\varphi(\varrho,\nu)\,d\nu = 0\]

when

\[\delta\int_0^\infty\varrho\,d\nu = 0\]

From this it follows that for every choice of δρ as a function of ν

\[\int_0^\infty\left(\frac{\partial\varphi}{\partial\varrho} - \lambda\right)\delta\varrho\,d\nu = 0\]

where λ is independent of ν. For black-body radiation, therefore, ∂φ/∂ρ is independent of ν.

For the temperature increase of a black-body radiation of volume v = 1 by dT, the equation holds:

\[dS = \int_{\nu=0}^{\nu=\infty}\frac{\partial\varphi}{\partial\varrho}\,d\varrho\,d\nu\]

or, since ∂φ/∂ρ is independent of ν:

\[dS = \frac{\partial\varphi}{\partial\varrho}\,dE\]

Since dE is equal to the heat added and the process is reversible, it also holds:

\[dS = \frac{1}{T}\,dE\]

By comparison one obtains:²⁸

²⁸ This connects entropy to temperature via the fundamental thermodynamic relation dS = dE/T (for reversible processes at constant volume). Einstein can now derive the entropy of radiation from any radiation law.

\[\frac{\partial\varphi}{\partial\varrho} = \frac{1}{T}\]

This is the law of black-body radiation. One can therefore determine the law of black-body radiation from the function φ and, conversely, determine the function φ from the law by integration, bearing in mind that φ vanishes for ρ = 0.

§4. Limiting Law for the Entropy of Monochromatic Radiation at Low Radiation Density

From the observations made so far concerning “black-body radiation” it emerges that the law originally established by Mr. W. Wien for “black-body radiation”²⁹

²⁹ Wien’s approximation predates Planck’s exact formula. It works well at high frequencies (short wavelengths) but fails at low frequencies. Einstein deliberately uses Wien’s law because he’s interested in the high-frequency regime where quantum effects dominate.

\[\varrho = \alpha\nu^3 e^{-\frac{\beta\nu}{T}}\]

is not exactly valid. However, the same has been completely confirmed by experiment for large values of ν/T. We base our calculations on this formula, but bear in mind that our results are only valid within certain limits.

From this formula one first obtains:³⁰

³⁰ The original manuscript uses “lg” for logarithm. We have modernized this to “ln” (natural logarithm) throughout, as this is standard notation in chemistry courses.

\[\frac{1}{T} = -\frac{1}{\beta\nu}\ln\frac{\varrho}{\alpha\nu^3}\]

and further, using the relation found in the previous paragraph:

\[\varphi(\varrho,\nu) = -\frac{\varrho}{\beta\nu}\left\{\ln\frac{\varrho}{\alpha\nu^3} - 1\right\}\]
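The integration step, which the text omits, can be sketched as follows. Inserting the expression for 1/T into the relation ∂φ/∂ρ = 1/T from §3 and integrating with respect to ρ at fixed ν, with φ = 0 at ρ = 0 (the integration variable ρ′ is introduced here only for clarity):

\[\frac{\partial\varphi}{\partial\varrho} = -\frac{1}{\beta\nu}\ln\frac{\varrho}{\alpha\nu^3}\]

\[\varphi(\varrho,\nu) = -\frac{1}{\beta\nu}\int_0^{\varrho}\ln\frac{\varrho'}{\alpha\nu^3}\,d\varrho' = -\frac{1}{\beta\nu}\left[\varrho'\ln\frac{\varrho'}{\alpha\nu^3} - \varrho'\right]_0^{\varrho} = -\frac{\varrho}{\beta\nu}\left\{\ln\frac{\varrho}{\alpha\nu^3} - 1\right\}\]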

Let there now be a radiation of energy E, whose frequency lies between ν and ν + dν. Let the radiation occupy the volume v. The entropy of this radiation is:

\[S = v\,\varphi(\varrho,\nu)\,d\nu = -\frac{E}{\beta\nu}\left\{\ln\frac{E}{v\alpha\nu^3\,d\nu} - 1\right\}\]

If we restrict ourselves to investigating the dependence of the entropy on the volume occupied by the radiation, and if we denote by S₀ the entropy of the radiation when it occupies the volume v₀, we obtain:³¹

³¹ This equation is the crucial result of this section. The entropy difference depends on the logarithm of the volume ratio, exactly the same form as the entropy of an ideal gas. This “coincidence” will lead Einstein to the light quantum hypothesis.

\[S - S_0 = \frac{E}{\beta\nu}\ln\left(\frac{v}{v_0}\right)\]

This equation shows that the entropy of a monochromatic radiation of sufficiently small density varies with volume according to the same law as the entropy of an ideal gas or of a dilute solution. The equation just found will be interpreted in the following on the basis of the principle introduced by Mr. Boltzmann into physics, according to which the entropy of a system is a function of the probability of its state.
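Written out, the volume dependence follows by subtracting the entropy expression at volume v₀ from the one at volume v, holding E, ν, and dν fixed (a sketch of the intermediate algebra; the constant terms cancel):

\[S - S_0 = -\frac{E}{\beta\nu}\left\{\ln\frac{E}{v\alpha\nu^3\,d\nu} - \ln\frac{E}{v_0\alpha\nu^3\,d\nu}\right\} = -\frac{E}{\beta\nu}\ln\left(\frac{v_0}{v}\right) = \frac{E}{\beta\nu}\ln\left(\frac{v}{v_0}\right)\]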

§5. Molecular-Theoretical Investigation of the Dependence of the Entropy of Gases and Dilute Solutions on the Volume

In calculating the entropy by molecular-theoretical methods, the word “probability” is frequently used in a meaning that does not conform to the definition of probability given in probability calculus.³² In particular, “cases of equal probability” are often hypothetically stipulated in cases where the theoretical pictures employed are sufficiently definite to deduce probabilities rather than stipulate them hypothetically. I will show in a separate paper that in considerations about thermal processes it is entirely sufficient to use the so-called “statistical probability,” and hope thereby to eliminate a logical difficulty that still obstructs the implementation of Boltzmann’s principle. Here, however, only its general formulation and its application to quite special cases shall be given.

³² Einstein was deeply concerned with the logical foundations of statistical mechanics. The “separate paper” he mentions may be his 1905 paper on Brownian motion, which established the statistical approach on firm ground.

If it makes sense to speak of the probability of a state of a system, and if furthermore every increase of entropy can be conceived as a transition to a more probable state, then the entropy S₁ of a system is a function of the probability W₁ of its momentary state.³³ If, therefore, two systems S₁ and S₂ not interacting with each other are present, one can set:

³³ Boltzmann’s insight: entropy measures probability. A high-entropy state is more probable than a low-entropy state. The second law of thermodynamics (entropy increases) simply says systems evolve toward more probable configurations.

\[S_1 = \varphi_1(W_1)\] \[S_2 = \varphi_2(W_2)\]

If one considers these two systems as a single system of entropy S and probability W, then:

\[S = S_1 + S_2 = \varphi(W)\]

and

\[W = W_1 \times W_2\]

The latter relation states that the states of the two systems are independent events.

From these equations it follows:

\[\varphi(W_1 \times W_2) = \varphi_1(W_1) + \varphi_2(W_2)\]

and from this finally³⁴

³⁴ The functional equation φ(W₁ × W₂) = φ₁(W₁) + φ₂(W₂) has only one solution: the logarithm. This mathematical requirement forces S ∝ ln W, giving Boltzmann’s formula S = k ln W (inscribed on Boltzmann’s tombstone).

\[\varphi_1(W_1) = C\ln(W_1) + \text{const.}\] \[\varphi_2(W_2) = C\ln(W_2) + \text{const.}\] \[\varphi(W) = C\ln(W) + \text{const.}\]
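One way to see why the logarithm is forced, sketched here under the added assumption that the functions are differentiable (the text does not state this): differentiate the functional equation with respect to W₁ and multiply by W₁, then do the same with W₂. Both results equal the same expression in the product W₁W₂, and since W₁ and W₂ can be varied independently, each side must be a constant C:

\[W_1 W_2\,\varphi'(W_1 W_2) = W_1\,\varphi_1'(W_1) = W_2\,\varphi_2'(W_2) = C\]

\[\varphi'(W) = \frac{C}{W} \quad\Longrightarrow\quad \varphi(W) = C\ln(W) + \text{const.}\]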

The quantity C is therefore a universal constant; from kinetic gas theory it follows that its value is R/N, where the constants R and N have the same meaning as above.³⁵ If S₀ denotes the entropy at a certain initial state of the system considered and W the relative probability of a state of entropy S, we obtain in general:

³⁵ The constant C = R/N is Boltzmann’s constant k = 1.38 × 10⁻²³ J/K. It connects the microscopic world (individual molecules) to macroscopic thermodynamics. Modern notation: S = k ln W.

\[S - S_0 = \frac{R}{N}\ln W\]

We first treat the following special case. In a volume v₀ let there be a number (n) of movable points (e.g., molecules), to which our consideration shall refer. Besides these there may be in the space arbitrarily many other movable points of any kind. Concerning the law according to which the considered points move in the space, no assumption shall be made except that no part of the space (and no direction) is distinguished with respect to this motion. The number of the (firstly considered) movable points shall further be so small that their mutual interaction can be disregarded.

To the system considered, which may for example be an ideal gas or a dilute solution, a certain entropy S₀ belongs. Let us imagine a part of the volume v₀ of size v and all n movable points transferred into the volume v, without anything else in the system being changed. This state obviously corresponds to a different value of the entropy (S), and we now want to determine the entropy difference with the help of Boltzmann’s principle.

We ask: how great is the probability of the last-mentioned state relative to the original? Or: how great is the probability that at a randomly chosen instant of time all n independently moving points (by chance) are found in the volume v?³⁶

³⁶ A simple counting argument: each molecule has probability v/v₀ of being in the smaller volume. For n independent molecules, the total probability is (v/v₀)ⁿ. This gives the entropy change via Boltzmann’s formula.

For this probability, which is a “statistical probability,” one obviously obtains the value:

\[W = \left(\frac{v}{v_0}\right)^n\]

from this one obtains by application of Boltzmann’s principle:

\[S - S_0 = R\left(\frac{n}{N}\right)\ln\left(\frac{v}{v_0}\right)\]

It is noteworthy that the derivation of this equation, from which the Boyle-Gay-Lussac law³⁷ and the identical law of osmotic pressure can be thermodynamically derived with ease,⁽⁶⁾ requires no assumption about the law of motion of the molecules.

³⁷ The Boyle-Gay-Lussac law combines Boyle’s law (PV = constant at fixed T) and Gay-Lussac’s law (P/T = constant at fixed V) into the ideal gas law: PV = nRT. Einstein shows this fundamental chemistry equation emerges from statistical mechanics.
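The counting argument behind W = (v/v₀)ⁿ can be checked with a small simulation. The sketch below is only illustrative: the function name, the use of a single uniform coordinate in [0, 1) to stand in for position within v₀, and the sample sizes are all assumptions, not part of the text. It places n independent points at random and estimates how often all of them fall inside the partial volume, then evaluates the entropy change for one mole (n = N) at the same volume ratio.

```python
import math
import random

def fraction_all_inside(n_points, volume_ratio, trials, seed=0):
    """Estimate the probability that n independent uniform points all lie in v,
    where volume_ratio = v / v0."""
    rng = random.Random(seed)   # fixed seed so the run is repeatable
    hits = 0
    for _ in range(trials):
        # A point is inside v when its uniform coordinate is below v / v0.
        if all(rng.random() < volume_ratio for _ in range(n_points)):
            hits += 1
    return hits / trials

n, ratio = 3, 0.5
estimate = fraction_all_inside(n, ratio, trials=200_000)
exact = ratio ** n                      # W = (v / v0)^n
print(f"simulated W = {estimate:.4f}, (v/v0)^n = {exact:.4f}")

# Entropy change for one mole (n = N), so S - S0 = R ln(v / v0).
R = 8.314                               # gas constant in J/(mol K)
delta_S = R * math.log(ratio)
print(f"S - S0 for one mole at v/v0 = {ratio}: {delta_S:.2f} J/K")
```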

§6. Interpretation of the Expression for the Dependence of the Entropy of Monochromatic Radiation on Volume According to Boltzmann’s Principle

We have found in §4 for the dependence of the entropy of monochromatic radiation on volume the expression:³⁸

³⁸ This section contains the paper’s central argument. Einstein compares the entropy formula for radiation (§4) with the entropy formula for an ideal gas (§5). The mathematical identity forces a revolutionary conclusion.

\[S - S_0 = \frac{E}{\beta\nu}\ln\left(\frac{v}{v_0}\right)\]

If one writes this formula in the form

\[S - S_0 = \frac{R}{N}\ln\left[\left(\frac{v}{v_0}\right)^{\frac{N}{R}\frac{E}{\beta\nu}}\right]\]

and compares it with the general formula expressing Boltzmann’s principle,

\[S - S_0 = \frac{R}{N}\ln W\]

one arrives at the following conclusion:

If monochromatic radiation of frequency ν and energy E is enclosed (by reflecting walls) in the volume v₀, then the probability that at a randomly chosen instant of time the entire radiation energy is found in the partial volume v of the volume v₀ is:³⁹

³⁹ The exponent (N/R)(E/βν) = E/(Rβν/N) must equal the number of independent entities n. So radiation behaves as if it consists of n = E/(Rβν/N) independent “particles,” each with energy Rβν/N.

\[W = \left(\frac{v}{v_0}\right)^{\frac{N}{R}\frac{E}{\beta\nu}}\]

From this we further conclude:

Monochromatic radiation of low density (within the validity range of Wien’s radiation formula) behaves thermodynamically as if it consisted of mutually independent energy quanta of the size Rβν/N.⁴⁰

⁴⁰ Einstein shows that light quanta have energy Rβν/N. He never uses the symbol h in this paper. However, since Rβ/N equals Planck’s constant h (from Planck’s 1900 work), this is equivalent to E = hν. Before this paper, h was merely a fitting parameter. By showing that light itself comes in discrete packets of energy Rβν/N, Einstein demonstrated that this constant has fundamental physical significance. In modern Gen Chem notation: E = hν, with h = 6.626 × 10⁻³⁴ J·s.
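As a numerical illustration, the quantum size Rβν/N can be evaluated directly. The sketch assumes modern SI values for R and N and an arbitrary example frequency (green light near 5.5 × 10¹⁴ Hz); none of these numbers appear in the text, and the function name is hypothetical.

```python
R = 8.314             # gas constant, J/(mol K); assumed modern value
N = 6.022e23          # Avogadro's number, 1/mol; assumed modern value
BETA = 4.866e-11      # beta from Planck's formula as quoted in the text, s K
H_MODERN = 6.626e-34  # Planck's constant, J s, for comparison only

def quantum_energy(nu, beta=BETA):
    """Energy in joules of one light quantum of frequency nu: (R/N) * beta * nu."""
    return (R / N) * beta * nu

h_from_text = R * BETA / N   # the combination R*beta/N plays the role of h
nu_green = 5.5e14            # Hz; illustrative frequency
print(f"R*beta/N = {h_from_text:.3e} J s (modern h = {H_MODERN:.3e} J s)")
print(f"quantum at {nu_green:.2e} Hz: {quantum_energy(nu_green):.3e} J")
```

The combination built from the 1905 value of β lands within about two percent of the modern value of h.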

We still wish to compare the mean size of the energy quanta of “black-body radiation” with the mean kinetic energy of the center-of-mass motion of a molecule at the same temperature. The latter is (3/2)(R/N)T, while for the mean size of the energy quantum one obtains, using Wien’s formula:⁴¹

⁴¹ Einstein calculates the average photon energy in black-body radiation: 3kT, compared to (3/2)kT for a gas molecule’s translational kinetic energy. The average photon energy is twice the average molecular translational energy. This difference hints that photons don’t follow classical equipartition.

\[\frac{\int_0^\infty\alpha\nu^3 e^{-\frac{\beta\nu}{T}}\,d\nu}{\int_0^\infty\frac{N}{R\beta\nu}\alpha\nu^3 e^{-\frac{\beta\nu}{T}}\,d\nu} = 3\frac{R}{N}T\]
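The numerator is the total energy density under Wien’s formula and the denominator is the corresponding number of quanta per unit volume. Both integrals can be evaluated with the substitution x = βν/T and the standard result that the integral of xⁿe⁻ˣ from 0 to ∞ equals n! (a worked sketch of the step the text leaves out):

\[\int_0^\infty\alpha\nu^3 e^{-\frac{\beta\nu}{T}}\,d\nu = \alpha\left(\frac{T}{\beta}\right)^4\int_0^\infty x^3 e^{-x}\,dx = 6\,\alpha\left(\frac{T}{\beta}\right)^4\]

\[\int_0^\infty\frac{N}{R\beta\nu}\alpha\nu^3 e^{-\frac{\beta\nu}{T}}\,d\nu = \frac{N\alpha}{R\beta}\left(\frac{T}{\beta}\right)^3\int_0^\infty x^2 e^{-x}\,dx = 2\,\frac{N\alpha}{R\beta}\left(\frac{T}{\beta}\right)^3\]

\[\frac{6\,\alpha\left(\frac{T}{\beta}\right)^4}{2\,\frac{N\alpha}{R\beta}\left(\frac{T}{\beta}\right)^3} = 3\,\frac{R\beta}{N}\cdot\frac{T}{\beta} = 3\frac{R}{N}T\]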

If, now, monochromatic radiation (of sufficiently low density) behaves with respect to the dependence of entropy on volume like a discontinuous medium consisting of energy quanta of size Rβν/N, it is natural to investigate whether the laws of production and transformation of light are also constituted as if light consisted of such energy quanta.⁴² With this question we shall occupy ourselves in the following.

⁴² Having established the light quantum concept thermodynamically, Einstein now tests it against three phenomena: Stokes’s rule in fluorescence (§7), the photoelectric effect (§8), and gas ionization (§9). The predictions match experiment.

§7. On Stokes’s Rule

Let monochromatic light be transformed into light of a different frequency by photoluminescence, and in accordance with the result just obtained, let us assume that both the producing and the produced light consist of energy quanta of the size (R/N)βν, where ν denotes the relevant frequency.⁴³ The transformation process is then to be interpreted as follows. Each producing energy quantum of frequency ν₁ is absorbed and gives rise—at least at sufficiently small distribution density of the producing energy quanta—by itself alone to a light quantum of frequency ν₂; possibly during absorption of the producing light quantum, light quanta of frequencies ν₃, ν₄, etc., as well as energy of other kinds (e.g., heat) can be generated simultaneously. By what intermediate processes this final result comes about is immaterial. If the photoluminescent substance is not to be regarded as a permanent source of energy, the energy of a produced energy quantum cannot be greater than that of a producing light quantum; thus the relation must hold:⁴⁴

⁴³ Photoluminescence (including fluorescence and phosphorescence) is when a material absorbs light at one wavelength and emits it at another. Fluorescent lights and highlighter markers work this way.

⁴⁴ Energy conservation for photons: the emitted photon cannot have more energy than the absorbed photon. Since E = hν, this means ν_emitted ≤ ν_absorbed.

\[\frac{R}{N}\beta\nu_2 \leq \frac{R}{N}\beta\nu_1\]

\[\nu_2 \leq \nu_1\]

This is the well-known Stokes’s rule.⁴⁵

⁴⁵ Stokes’s rule (1852): fluorescent light has a longer wavelength (lower frequency) than the exciting light. Your clothes glow under UV “black lights” because they absorb high-frequency UV and emit lower-frequency visible light. Einstein’s quantum theory explains why.
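
The energy bookkeeping behind Stokes's rule can be sketched in a few lines. The example below uses modern constants and two illustrative wavelengths (365 nm absorbed, 450 nm emitted); these numbers and the function names are assumptions chosen for the illustration, not values from the paper.

```python
# Sketch: Stokes's rule as energy bookkeeping for one absorbed and one emitted quantum.
# Assumptions: modern constants; the 365 nm / 450 nm wavelengths are illustrative only.
H = 6.62607015e-34   # Planck's constant, J s
C = 2.99792458e8     # speed of light, m/s

def photon_energy(wavelength_m):
    """Energy of one light quantum, E = h*nu = h*c/lambda, in joules."""
    return H * C / wavelength_m

def obeys_stokes_rule(absorbed_wavelength_m, emitted_wavelength_m):
    """True when the emitted quantum carries no more energy than the absorbed one."""
    return photon_energy(emitted_wavelength_m) <= photon_energy(absorbed_wavelength_m)

absorbed = 365e-9   # near-UV excitation (hypothetical example)
emitted = 450e-9    # blue fluorescence (hypothetical example)

# Whatever is not re-emitted must appear as heat or as further quanta.
energy_left_over = photon_energy(absorbed) - photon_energy(emitted)
print(obeys_stokes_rule(absorbed, emitted))
print(f"energy left for heat or other quanta: {energy_left_over:.2e} J")
```

Swapping the two wavelengths makes `obeys_stokes_rule` return `False`, which corresponds to the anti-Stokes case that a single absorbed quantum cannot supply by itself.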

It should be particularly emphasized that under weak illumination the amount of light produced must be proportional to the intensity of the illuminating light under otherwise identical circumstances, since each producing energy quantum causes an elementary process of the kind indicated above, independently of the action of the other producing energy quanta. In particular, there will be no lower limit for the intensity of the illuminating light below which the light would be unable to produce luminescence.⁴⁶

⁴⁶ Each photon acts independently. Even at very low intensities, occasional photons will still cause fluorescence. There’s no threshold intensity required. This contrasts with some nonlinear optical processes that require multiple photons acting together.

Deviations from Stokes’s rule are conceivable according to the above conception of the phenomena in the following cases:

1. when the number of energy quanta simultaneously involved in transformation per unit volume is so large that an energy quantum of the produced light can receive its energy from several producing energy quanta;⁴⁷

⁴⁷ Anti-Stokes emission can occur when multiple photons contribute energy to one emitted photon, or when the material supplies thermal energy. This is rare at low intensities but observable in special circumstances.

2. when the producing (or produced) light is not of the energetic constitution of a “black-body radiation” from the validity range of Wien’s law, which is the case, for example, when the producing light is produced by a body of such high temperature that for the wavelengths in question Wien’s law no longer holds.

The last-mentioned possibility deserves special attention. According to the conception developed, it is not excluded that a “non-Wien radiation,” even in great dilution, behaves energetically differently from a “black-body radiation” from the validity range of Wien’s law.

§8. On the Production of Cathode Rays by Illumination of Solid Bodies

The usual conception that the energy of light is continuously distributed over the space traversed finds particularly great difficulties in attempting to explain the photoelectric phenomena, as has been shown in a pioneering work by Mr. Lenard.⁽⁷⁾ ⁴⁸

⁴⁸ The photoelectric effect: when light shines on a metal surface, electrons are ejected. Philipp Lenard’s experiments (Nobel Prize 1905) revealed puzzling features that wave theory couldn’t explain. This section provides the quantum explanation.

According to the conception that the exciting light consists of energy quanta of the energy (R/N)βν, the production of cathode rays by light can be understood as follows.⁴⁹ Energy quanta penetrate into the surface layer of the body, and their energy is transformed at least in part into kinetic energy of electrons. The simplest conception is that a light quantum gives up all its energy to a single electron; we shall assume that this happens. It shall, however, not be excluded that electrons only take up part of the energy of light quanta. An electron in the interior of the body provided with kinetic energy will have lost part of its kinetic energy by the time it has reached the surface. Furthermore, it is to be assumed that each electron, on leaving the body, has to perform a work P characteristic of the body.⁵⁰ The electrons leaving the body with the greatest normal velocity will be those which were directly at the surface and excited normal to it. The kinetic energy of such electrons is⁵¹

⁴⁹ “Cathode rays” are beams of electrons. When light ejects electrons from a metal, they form cathode rays. Einstein’s explanation: one photon gives its energy hν to one electron.

⁵⁰ Work function P: the minimum energy needed to remove an electron from the metal’s surface. Different metals have different work functions (e.g., cesium ~2 eV, platinum ~5.6 eV). This is why some metals are more sensitive to visible light than others.

⁵¹ Einstein’s photoelectric equation: Kinetic energy = photon energy − work function, or KE = hν − P. This simple equation explains all the puzzling features of the photoelectric effect. Note: if hν < P, the photon doesn’t have enough energy to eject an electron, and KE would be negative (impossible). This defines the threshold frequency ν₀ = P/h: light below this frequency cannot eject electrons regardless of intensity. This is a common Gen Chem problem: calculating ν₀ from work function values.

\[\frac{R}{N}\beta\nu - P\]
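
In modern notation this expression is hν − P. A minimal sketch of how it is typically used follows, assuming modern constants and a work function of 2.0 eV (roughly the cesium value mentioned in the annotations). The function names are hypothetical and chosen only for this illustration.

```python
# Sketch: Einstein's photoelectric relation in modern notation, KE_max = h*nu - P.
# Assumptions: modern constants; the 2.0 eV work function is an approximate
# cesium value, used here only as an example.
H = 6.62607015e-34      # Planck's constant, J s
EV = 1.602176634e-19    # joules per electronvolt

def max_kinetic_energy_ev(frequency_hz, work_function_ev):
    """Maximum kinetic energy of an ejected electron in eV, or None below threshold."""
    kinetic_energy = H * frequency_hz / EV - work_function_ev
    return kinetic_energy if kinetic_energy >= 0 else None

def threshold_frequency_hz(work_function_ev):
    """Lowest frequency that can eject an electron, nu_0 = P/h."""
    return work_function_ev * EV / H

work_function = 2.0  # eV
print(f"threshold frequency: {threshold_frequency_hz(work_function):.2e} Hz")
print(max_kinetic_energy_ev(1.03e15, work_function))  # ultraviolet: above threshold
print(max_kinetic_energy_ev(4.0e14, work_function))   # red light: below threshold
```

Returning `None` below threshold is a design choice of this sketch: it makes explicit that no electron is ejected, rather than reporting a negative kinetic energy.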

If the body is charged to a positive potential Π and surrounded by conductors at zero potential, and if Π is just sufficient to prevent loss of electricity by the body, we must have:⁵²

⁵² The stopping potential Π is the voltage needed to stop the fastest electrons. Measuring Π versus ν gives a straight line whose slope determines Planck’s constant. Millikan spent a decade (1905–1915) trying to disprove Einstein’s “reckless” hypothesis but ended up confirming it precisely, earning part of his 1923 Nobel Prize.

\[\Pi\varepsilon = \frac{R}{N}\beta\nu - P\]

where ε denotes the electric mass⁵³ of the electron, or

⁵³ “Electric mass” is an archaic term for electric charge. The electron charge is e = 1.602 × 10⁻¹⁹ C.

\[\Pi E = R\beta\nu - P'\]

where E denotes the charge of a gram-equivalent of a monovalent ion⁵⁴ and P′ the potential of this amount of negative electricity with respect to the body.⁽⁸⁾

⁵⁴ The “charge of a gram-equivalent” is the Faraday constant F = N_A × e ≈ 96,485 C/mol. It’s the total charge of one mole of electrons.

If one sets E = 9.6 × 10³, then Π × 10⁻⁸ is the potential in volts which the body assumes under irradiation in vacuum.

In order to see first whether the derived relation agrees with experience in order of magnitude, we set P′ = 0, ν = 1.03 × 10¹⁵ (corresponding to the limit of the solar spectrum toward the ultraviolet) and β = 4.866 × 10⁻¹¹. We obtain Π × 10⁻⁸ = 4.3 volts, a result agreeing in order of magnitude with the results of Mr. Lenard.⁽⁹⁾ ⁵⁵

⁵⁵ Einstein’s prediction (about 4 volts stopping potential for UV light) matched Lenard’s experimental measurements. This agreement supported the light quantum hypothesis, though full verification came later with Millikan’s precision measurements.
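
The arithmetic behind this estimate can be reproduced directly. The sketch below assumes the CGS value R = 8.31 × 10⁷ erg/(mol·K) and takes the charge of a gram-equivalent as 9.6 × 10³ in electromagnetic CGS units, which is the value needed to arrive at about 4.3 volts. Both constants are assumptions supplied for this illustration; β and ν are the values quoted in the text.

```python
# Sketch: Einstein's order-of-magnitude estimate of the stopping potential.
# Assumptions: CGS values R = 8.31e7 erg/(mol K) and E = 9.6e3 (Faraday constant
# in electromagnetic units); beta and nu are the values quoted in the text.
R_CGS = 8.31e7       # gas constant, erg/(mol K)
BETA = 4.866e-11     # Einstein's beta, K s
NU = 1.03e15         # ultraviolet limit of the solar spectrum, 1/s
E_FARADAY = 9.6e3    # charge of a gram-equivalent, electromagnetic CGS units
P_PRIME = 0.0        # work term neglected, as in the text

potential_emu = (R_CGS * BETA * NU - P_PRIME) / E_FARADAY  # Pi in electromagnetic units
potential_volts = potential_emu * 1e-8                      # 1 emu of potential = 1e-8 V

print(f"stopping potential = {potential_volts:.1f} V")
```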

If the derived formula is correct, then Π must be, when presented as a function of the frequency of the exciting light in Cartesian coordinates, a straight line whose slope is independent of the nature of the substance investigated.⁵⁶

⁵⁶ The key prediction: plotting stopping potential vs. frequency gives a straight line with slope h/e, the same for all metals. The intercept (where Π = 0) varies with the metal’s work function. Millikan initially set out to disprove what he called Einstein’s “bold, not to say reckless, hypothesis” but his meticulous experiments provided its strongest confirmation.
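
To see why the slope carries no information about the metal, one can generate ideal stopping potentials for two different work functions and fit a line to each. This is a sketch on synthetic, noise-free data. The work functions (2.0 eV and 4.5 eV), the frequencies, and the helper names are hypothetical choices, not measurements.

```python
# Sketch: the predicted straight line Pi = (h/e)*nu - P/e and its slope.
# Assumptions: modern constants; work functions of 2.0 eV and 4.5 eV are
# hypothetical stand-ins for two different metals; the data are synthetic.
H = 6.62607015e-34          # Planck's constant, J s
E_CHARGE = 1.602176634e-19  # elementary charge, C

def stopping_potential(frequency_hz, work_function_ev):
    """Stopping potential in volts from Pi*e = h*nu - P (ideal, noise-free case)."""
    return H * frequency_hz / E_CHARGE - work_function_ev

def least_squares_slope(xs, ys):
    """Slope of the best-fit straight line through the points (xs, ys)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    numerator = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    denominator = sum((x - mean_x) ** 2 for x in xs)
    return numerator / denominator

frequencies = [1.2e15, 1.4e15, 1.6e15, 1.8e15]  # Hz, all above both thresholds
slopes = {}
for work_function_ev in (2.0, 4.5):
    potentials = [stopping_potential(nu, work_function_ev) for nu in frequencies]
    slopes[work_function_ev] = least_squares_slope(frequencies, potentials)

for work_function_ev, slope in slopes.items():
    # Multiplying the slope (V s) by e recovers h (J s).
    print(f"P = {work_function_ev} eV: slope*e = {slope * E_CHARGE:.3e} J s")
```

Both fits return the same slope, and multiplying it by the electron charge gives back the value of h that was put in. With real data the same fit would be the route from measured potentials to h.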

As far as I can see, our conception does not conflict with the properties of the photoelectric effect observed by Mr. Lenard. If each energy quantum of the exciting light gives up its energy to electrons independently of all others, then the velocity distribution of the electrons, i.e., the quality of the produced cathode radiation, will be independent of the intensity of the exciting light; on the other hand, the number of electrons leaving the body will, under otherwise identical circumstances, be proportional to the intensity of the exciting light.⁽¹⁰⁾ ⁵⁷

⁵⁷ Two crucial predictions that wave theory cannot explain: (1) Electron energy depends on frequency, not intensity. Brighter light ejects more electrons, not faster ones. (2) No threshold intensity. Even dim light ejects electrons instantly if the frequency is high enough. Closely related principles underlie modern solar cells: photons with energy above the semiconductor’s band gap free electrons that can then carry electric current. Below the corresponding threshold frequency, no amount of intense light produces electricity.
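
The separation between intensity and frequency can be sketched numerically. The example assumes a monochromatic beam, modern constants, and hypothetical values for the beam power, frequency, and work function. The number of ejected electrons is taken to be only proportional to the number of arriving quanta, not equal to it.

```python
# Sketch: intensity sets how many quanta arrive; frequency sets the energy of each.
# Assumptions: modern constants; the 2.0 eV work function, 1.0e15 Hz frequency,
# and microwatt beam powers are hypothetical example values.
H = 6.62607015e-34      # Planck's constant, J s
EV = 1.602176634e-19    # joules per electronvolt

def quanta_per_second(power_w, frequency_hz):
    """Number of light quanta delivered per second by a monochromatic beam."""
    return power_w / (H * frequency_hz)

def max_kinetic_energy_ev(frequency_hz, work_function_ev):
    """Einstein's relation h*nu - P in eV; note that beam power does not appear."""
    return H * frequency_hz / EV - work_function_ev

frequency = 1.0e15  # Hz
for power in (1.0e-6, 2.0e-6):
    rate = quanta_per_second(power, frequency)
    energy = max_kinetic_energy_ev(frequency, 2.0)
    print(f"{power:.1e} W: {rate:.2e} quanta/s, max KE = {energy:.2f} eV")
```

Doubling the power doubles the rate of arriving quanta while the maximum kinetic energy stays fixed.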

Remarks similar to those made about the presumed deviations from Stokes’s rule apply to the presumed limits of validity of the above-mentioned regularities.

In the foregoing it is assumed that the energy of at least part of the energy quanta of the producing light is delivered completely to a single electron. If one does not make this obvious assumption, one obtains instead of the above equation:

\[\Pi E + P' \leq R\beta\nu\]

For cathodoluminescence, which forms the inverse process to the one just considered, one obtains by an analogous consideration:⁵⁸

⁵⁸ Cathodoluminescence is the inverse of the photoelectric effect: electrons strike a material and produce light. Television CRTs worked this way. The inequality is reversed because now electron energy is converted to photon energy.

\[\Pi E + P' \geq R\beta\nu\]

For the substances investigated by Mr. Lenard, ΠE is always considerably larger than Rβν, since the voltage which the cathode rays have had to traverse in order to be able to produce visible light amounts in some cases to hundreds, in others to thousands of volts.⁽¹¹⁾ It is therefore to be assumed that the kinetic energy of an electron is used to produce many light energy quanta.
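
A rough sense of "many" follows from energy conservation alone. The sketch below assumes modern constants, an accelerating voltage of 1000 V, and visible light of 500 nm. These are illustrative choices, not values taken from Lenard's data, and the result is only an upper bound.

```python
# Sketch: how many visible light quanta one fast electron could supply at most.
# Assumptions: modern constants; the 1000 V accelerating voltage and 500 nm
# wavelength are illustrative choices, not values taken from Lenard's data.
H = 6.62607015e-34          # Planck's constant, J s
C = 2.99792458e8            # speed of light, m/s
E_CHARGE = 1.602176634e-19  # elementary charge, C

def max_quanta_per_electron(voltage_v, wavelength_m):
    """Upper bound on light quanta from one electron, by energy conservation alone."""
    electron_energy = E_CHARGE * voltage_v   # kinetic energy gained, J
    quantum_energy = H * C / wavelength_m    # energy of one light quantum, J
    return int(electron_energy // quantum_energy)

print(max_quanta_per_electron(1000.0, 500e-9))
```

Real luminescent materials convert only a fraction of the electron's energy into light, so the actual number is smaller than this bound.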

§9. On the Ionization of Gases by Ultraviolet Light

We shall have to assume that in the ionization of a gas by ultraviolet light an absorbed light energy quantum is used for the ionization of one gas molecule.⁵⁹ From this it follows immediately that the work of ionization (i.e., the work theoretically necessary for ionization) of a molecule cannot be greater than the energy of an absorbed effective light energy quantum. If one denotes by J the (theoretical) ionization work per gram-equivalent, then it must hold:

⁵⁹ Photoionization: a photon with enough energy can knock an electron completely out of an atom or molecule, creating an ion. Related photochemistry explains how UV light damages DNA. UV-C photons (200–280 nm) have enough energy (~4–6 eV) to break chemical bonds and cause mutations. The ozone layer absorbs most UV-C, protecting life on Earth. UV-A in tanning beds (320–400 nm) has lower photon energy but still causes skin damage over time.

\[R\beta\nu \geq J\]

According to Lenard’s measurements, the greatest effective wavelength for air is about 1.9 × 10⁻⁵ cm, thus:

\[R\beta\nu = 6.4 \times 10^{12}\,\text{erg} \geq J\] ⁶⁰

⁶⁰ Einstein uses CGS units common in 1905 physics. Energy is measured in ergs (1 erg = 10⁻⁷ J). Modern chemistry uses SI units (joules). To convert: 6.4 × 10¹² erg/mol = 640 kJ/mol.
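
The number 6.4 × 10¹² can be checked from the quoted wavelength. This sketch assumes CGS values for R and the speed of light and uses the β quoted in §8. The variable names are illustrative.

```python
# Sketch: upper bound on the molar ionization work of air from Lenard's wavelength.
# Assumptions: CGS values for R and c; beta as quoted in section 8.
R_CGS = 8.31e7          # gas constant, erg/(mol K)
BETA = 4.866e-11        # Einstein's beta, K s
C_CGS = 2.998e10        # speed of light, cm/s
WAVELENGTH_CM = 1.9e-5  # greatest effective wavelength for air

frequency = C_CGS / WAVELENGTH_CM                   # 1/s
bound_erg_per_mol = R_CGS * BETA * frequency        # R*beta*nu, erg per gram-equivalent
bound_kj_per_mol = bound_erg_per_mol * 1e-7 / 1e3   # 1 erg = 1e-7 J, then J to kJ

print(f"R*beta*nu = {bound_erg_per_mol:.1e} erg = {bound_kj_per_mol:.0f} kJ/mol")
```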

An upper limit for the ionization work is also obtained from the ionization voltages in dilute gases. According to J. Stark,⁽¹²⁾ ⁶¹ the smallest measured ionization voltage (at platinum anodes) for air is about 10 volts.⁽¹³⁾

⁶¹ Johannes Stark (1874–1957) received the 1919 Nobel Prize in Physics for discovering the Doppler effect in canal rays and the splitting of spectral lines in electric fields (Stark effect).

One thus obtains for J the upper limit 9.6 × 10¹², which is nearly equal to the value found above.⁶² There results yet another consequence whose verification by experiment appears to me to be of great importance. If every absorbed light energy quantum ionizes a molecule, then the following relation must hold between the absorbed light quantity L and the number j of gram-molecules ionized by it:⁶³

⁶² The upper limit on the ionization energy obtained from UV absorption (~6.4 × 10¹² erg/mol) is close to the upper limit obtained from ionization voltages in dilute gases (~9.6 × 10¹² erg/mol). This consistency supports the quantum hypothesis.

⁶³ Quantum yield: the ratio of molecules affected to photons absorbed. If every photon ionizes one molecule, the yield is 1. Einstein predicts j = L/(Rβν), which in modern notation is j = L/(N_A hν) with j in moles, testable by measuring both the absorbed light energy and the number of ions produced. Quantum yield is central to photochemistry: photosynthesis has a quantum yield near 95%, meaning almost every absorbed photon drives useful chemistry.

\[j = \frac{L}{R\beta\nu}\]

This relation must, if our conception corresponds to reality, hold for every gas which (at the frequency in question) exhibits no appreciable absorption not accompanied by ionization.
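
As an illustration of how this relation would be applied, the sketch below computes j for a hypothetical absorbed energy of 10⁷ erg (1 J) at the wavelength 1.9 × 10⁻⁵ cm mentioned above. The absorbed energy and the CGS constants are assumptions of the example.

```python
# Sketch: moles ionized if every absorbed quantum ionizes one molecule, j = L/(R*beta*nu).
# Assumptions: CGS values; the absorbed energy of 1e7 erg (1 J) is hypothetical.
R_CGS = 8.31e7     # gas constant, erg/(mol K)
BETA = 4.866e-11   # Einstein's beta, K s

def moles_ionized(absorbed_energy_erg, frequency_hz):
    """Gram-molecules ionized, assuming one ionization per absorbed light quantum."""
    return absorbed_energy_erg / (R_CGS * BETA * frequency_hz)

frequency = 2.998e10 / 1.9e-5   # 1/s, light of wavelength 1.9e-5 cm
j = moles_ionized(1.0e7, frequency)
print(f"j = {j:.2e} mol")
```

A measured value of j smaller than this would indicate that some absorption is not accompanied by ionization, which is exactly the caveat in the paragraph above.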

Bern, 17 March 1905.

(Received 18 March 1905.)

Notes

⁽¹⁾ This assumption is equivalent to the presupposition that the mean kinetic energies of gas molecules and electrons are equal to each other at temperature equilibrium. With the help of this latter presupposition Mr. Drude has derived theoretically the ratio of thermal and electrical conductivities of metals.

⁽²⁾ M. Planck, Ann. d. Phys. 1. p. 99. 1900.

⁽³⁾ This presupposition can be formulated as follows. We develop the Z-component of the electric force (Z) at an arbitrary point of the space considered between the time limits t = 0 and t = T (where T denotes a time large relative to all oscillation periods coming into consideration) into a Fourier series

\[Z = \sum_{\nu=1}^{\nu=\infty}A_\nu\sin\left(2\pi\nu\frac{t}{T} + \alpha_\nu\right)\]

where A_ν ≥ 0 and 0 ≤ α_ν ≤ 2π. If one imagines such a development performed at the same points in space for randomly chosen initial instants of time, one will obtain for the quantities A_ν and α_ν various systems of values. There then exist for the frequency of the various value combinations of the quantities A_ν and α_ν (statistical) probabilities dW of the form:

dW = f(A₁, A₂, … α₁, α₂, …) dA₁ dA₂ … dα₁ dα₂ …

The radiation is then as disordered as conceivable when

f(A₁, A₂, … α₁, α₂, …) = F₁(A₁)F₂(A₂) … f₁(α₁) f₂(α₂) …,

i.e., when the probability of one of the quantities A or α having a certain value is independent of the values which the other quantities A or α have. The more closely the condition is fulfilled that the individual pairs of quantities A_ν and α_ν depend on the emission and absorption processes of particular groups of resonators, the closer with increasing approximation the radiation will be regarded as “one as disordered as conceivable” in the case considered by us.

⁽⁴⁾ M. Planck, Ann. d. Phys. 4. p. 561. 1901.

⁽⁵⁾ This assumption is an arbitrary one. One will naturally adhere to this simplest assumption as long as experiment does not force one to abandon it.

⁽⁶⁾ If E is the energy of the system, one obtains:

−d(E − TS) = p dv = T dS = R(n/N)(dv/v);

thus

pv = R(n/N)T.

⁽⁷⁾ P. Lenard, Ann. d. Phys. 8. p. 169 and 170. 1902.

⁽⁸⁾ If one assumes that the individual electron must be detached from a neutral molecule by light with the expenditure of a certain work, there is nothing to change in the derived relation; only P′ is to be conceived as the sum of two summands.

⁽⁹⁾ P. Lenard, Ann. d. Phys. 8. p. 165 and 184. Plate I, Fig. 2. 1902.

⁽¹⁰⁾ P. Lenard, l. c. p. 150 and p. 166–168.

⁽¹¹⁾ P. Lenard, Ann. d. Phys. 12. p. 469. 1903.

⁽¹²⁾ J. Stark, Die Elektrizität in Gasen p. 57. Leipzig 1902.

⁽¹³⁾ In gas interiors the ionization voltage for negative ions is about five times larger.