
Concepts in Thermal Physics

Stephen J. Blundell and Katherine M. Blundell
Department of Physics, University of Oxford, UK


Great Clarendon Street, Oxford OX2 6DP Oxford University Press is a department of the University of Oxford. It furthers the University’s objective of excellence in research, scholarship, and education by publishing worldwide in Oxford New York Auckland Cape Town Dar es Salaam Hong Kong Karachi Kuala Lumpur Madrid Melbourne Mexico City Nairobi New Delhi Shanghai Taipei Toronto With offices in Argentina Austria Brazil Chile Czech Republic France Greece Guatemala Hungary Italy Japan Poland Portugal Singapore South Korea Switzerland Thailand Turkey Ukraine Vietnam Oxford is a registered trade mark of Oxford University Press in the UK and in certain other countries Published in the United States by Oxford University Press Inc., New York © Oxford University Press 2006 The moral rights of the authors have been asserted Database right Oxford University Press (maker) First published 2006 All rights reserved. No part of this publication may be reproduced, stored in a retrieval system, or transmitted, in any form or by any means, without the prior permission in writing of Oxford University Press, or as expressly permitted by law, or under terms agreed with the appropriate reprographics rights organization. Enquiries concerning reproduction outside the scope of the above should be sent to the Rights Department, Oxford University Press, at the address above You must not circulate this book in any other binding or cover and you must impose the same condition on any acquirer British Library Cataloguing in Publication Data Data available Library of Congress Cataloging in Publication Data Data available Printed in Great Britain on acid-free paper by CPI Antony Rowe, Chippenham, Wilts. ISBN 0–19–856769–3 978–0–19–856769–1 ISBN 0–19–856770–7 (Pbk.) 978–0–19–856770–7 (Pbk.) 10 9 8 7 6 5 4 3 2 1

To our dear parents Alan and Daphne Blundell Alan and Christine Sanders with love.


Preface

“In the beginning was the Word. . .” (John 1:1, 1st century AD)

“Consider sunbeams. When the sun’s rays let in Pass through the darkness of a shuttered room, You will see a multitude of tiny bodies All mingling in a multitude of ways Inside the sunbeam, moving in the void, Seeming to be engaged in endless strife, Battle, and warfare, troop attacking troop, And never a respite, harried constantly, With meetings and with partings everywhere. From this you can imagine what it is For atoms to be tossed perpetually In endless motion through the mighty void.” (On the Nature of Things, Lucretius, 1st century BC)

“. . . (we) have borne the burden of the work and the heat of the day.” (Matthew 20:12, 1st century AD)

Thermal physics forms a key part of any undergraduate physics course. It includes the fundamentals of classical thermodynamics (which was founded largely in the nineteenth century and motivated by a desire to understand the conversion of heat into work using engines) and also statistical mechanics (which was founded by Boltzmann and Gibbs, and is concerned with the statistical behaviour of the underlying microstates of the system). Students often ﬁnd these topics hard, and this problem is not helped by a lack of familiarity with basic concepts in mathematics, particularly in probability and statistics. Moreover, the traditional focus of thermodynamics on steam engines seems remote and largely irrelevant to a twenty-ﬁrst century student. This is unfortunate since an understanding of thermal physics is crucial to almost all modern physics and to the important technological challenges which face us in this century. The aim of this book is to provide an introduction to the key concepts in thermal physics, ﬂeshed out with plenty of modern examples from astrophysics, atmospheric physics, laser physics, condensed matter physics and information theory. The important mathematical principles, particularly concerning probability and statistics, are expounded in some detail. This aims to make up for the material which can no longer be automatically assumed to have been covered in every school


mathematics course. In addition, the appendices contain useful mathematics, such as various integrals, mathematical results and identities. There is unfortunately no shortcut to mastering the necessary mathematics in studying thermal physics, but the material in the appendix provides a useful aide-mémoire.

Many courses on this subject are taught historically: the kinetic theory of gases and then classical thermodynamics are taught first, with statistical mechanics taught last. In other courses, one starts with the principles of classical thermodynamics, followed by statistical mechanics, and kinetic theory is saved until the end. Although there is merit in both approaches, we have aimed at a more integrated treatment. For example, we introduce temperature using a straightforward statistical mechanical argument, rather than on the basis of a somewhat abstract Carnot engine. However, we do postpone detailed consideration of the partition function and statistical mechanics until after we have introduced the functions of state which manipulation of the partition function so conveniently produces. We present the kinetic theory of gases fairly early on, since it provides a simple, well-defined arena in which to practise simple concepts in probability distributions. This has worked well in the course given in Oxford, but since kinetic theory is only studied at a later stage in courses in other places, we have designed the book so that the kinetic theory chapters can be omitted without causing problems; see Fig. 1.5 on page 10 for details. In addition, some parts of the book contain material which is much more advanced (often placed in boxes, or in the final part of the book), and these can be skipped at first reading.

The book is arranged in a series of short, easily digestible chapters, each one introducing a new concept or illustrating an important application.
Most people learn from examples, so plenty of worked examples are given in order that the reader can gain familiarity with the concepts as they are introduced. Exercises are provided at the end of each chapter to allow the students to gain practice in each area. In choosing which topics to include, and at what level, we have aimed for a balance between pedagogy and rigour, providing a comprehensible introduction with suﬃcient details to satisfy more advanced readers. We have also tried to balance fundamental principles with practical applications. However, this book does not treat real engines in any engineering depth, nor does it venture into the deep waters of ergodic theory. Nevertheless, we hope that there is enough in this book for a thorough grounding in thermal physics and the recommended further reading gives pointers for additional material. An important theme running through this book is the concept of information, and its connection with entropy. The black hole shown at the start of this preface, with its surface covered in ‘bits’ of information, is a helpful picture of the deep connection between information, thermodynamics, radiation and the Universe. The history of thermal physics is a fascinating one, and we have provided a selection of short biographical sketches of some of the key pioneers in thermal physics. To qualify for inclusion, the person had to have


made a particularly important contribution and/or had a particularly interesting life – and be dead! Therefore one should not conclude from the list of people we have chosen that the subject of thermal physics is in any sense finished; it is just harder to write with the same perspective about current work in this subject. The biographical sketches are necessarily brief, giving only a glimpse of the life-story, so the Bibliography should be consulted for a list of more comprehensive biographies. However, the sketches are designed to provide some light relief in the main narrative and demonstrate that science is a human endeavour.

It is a great pleasure to record our gratitude to those who taught us the subject while we were undergraduates in Cambridge, particularly Owen Saxton and Peter Scheuer, and to our friends in Oxford: we have benefitted from many enlightening discussions with colleagues in the physics department, from the intelligent questioning of our Oxford students and from the stimulating environments provided by both Mansfield College and St John’s College. In the writing of this book, we have enjoyed the steadfast encouragement of Sönke Adlung and his colleagues at OUP, and in particular Julie Harris’ black-belt LaTeX support. A number of friends and colleagues in Oxford and elsewhere have been kind enough to give their time and read drafts of chapters of this book; they have made numerous helpful comments which have greatly improved the final result: Fathallah Alouani Bibi, James Analytis, David Andrews, Arzhang Ardavan, Tony Beasley, Michael Bowler, Peter Duffy, Paul Goddard, Stephen Justham, Michael Mackey, Philipp Podsiadlowski, Linda Schmidtobreick, John Singleton and Katrien Steenbrugge. Particular thanks are due to Tom Lancaster, who twice read the entire manuscript at early stages and made many constructive and imaginative suggestions, and to Harvey Brown, whose insights were always stimulating and whose encouragement was always constant.
To all these friends, our warmest thanks are due.

Errors which we discover after going to press will be posted on the book’s website, which may be found at:

http://users.ox.ac.uk/~sjb/ctp

It is our earnest hope that this book will make the study of thermal physics enjoyable and fascinating and that we have managed to communicate something of the enthusiasm we feel for this subject. Moreover, understanding the concepts of thermal physics is vital for humanity’s future; the impending energy crisis and the potential consequences of climate change mandate creative, scientific and technological innovations at the highest levels. This means that thermal physics is a field which some of tomorrow’s best minds need to master today.

SJB & KMB
Oxford
June 2006


Contents

I Preliminaries

1 Introduction
    1.1 What is a mole?
    1.2 The thermodynamic limit
    1.3 The ideal gas
    1.4 Combinatorial problems
    1.5 Plan of the book
    Exercises

2 Heat
    2.1 A definition of heat
    2.2 Heat capacity
    Exercises

3 Probability
    3.1 Discrete probability distributions
    3.2 Continuous probability distributions
    3.3 Linear transformation
    3.4 Variance
    3.5 Linear transformation and the variance
    3.6 Independent variables
    Further reading
    Exercises

4 Temperature and the Boltzmann factor
    4.1 Thermal equilibrium
    4.2 Thermometers
    4.3 The microstates and macrostates
    4.4 A statistical definition of temperature
    4.5 Ensembles
    4.6 Canonical ensemble
    4.7 Applications of the Boltzmann distribution
    Further reading
    Exercises

II Kinetic theory of gases

5 The Maxwell–Boltzmann distribution
    5.1 The velocity distribution
    5.2 The speed distribution
        5.2.1 $\langle v \rangle$ and $\langle v^2 \rangle$
        5.2.2 The mean kinetic energy of a gas molecule
        5.2.3 The maximum of $f(v)$
    5.3 Experimental justification
    Exercises

6 Pressure
    6.1 Molecular distributions
        6.1.1 Solid angles
        6.1.2 The number of molecules travelling in a certain direction at a certain speed
        6.1.3 The number of molecules hitting a wall
    6.2 The ideal gas law
    6.3 Dalton’s law
    Exercises

7 Molecular effusion
    7.1 Flux
    7.2 Effusion
    Exercises

8 The mean free path and collisions
    8.1 The mean collision time
    8.2 The collision cross-section
    8.3 The mean free path
    Exercises

III Transport and thermal diffusion

9 Transport properties in gases
    9.1 Viscosity
    9.2 Thermal conductivity
    9.3 Diffusion
    9.4 More-detailed theory
    Further reading
    Exercises

10 The thermal diffusion equation
    10.1 Derivation of the thermal diffusion equation
    10.2 The one-dimensional thermal diffusion equation
    10.3 The steady state
    10.4 The thermal diffusion equation for a sphere
    10.5 Newton’s law of cooling
    10.6 The Prandtl number
    10.7 Sources of heat
    Exercises

IV The first law

11 Energy
    11.1 Some definitions
        11.1.1 A system in thermal equilibrium
        11.1.2 Functions of state
    11.2 The first law of thermodynamics
    11.3 Heat capacity
    Exercises

12 Isothermal and adiabatic processes
    12.1 Reversibility
    12.2 Isothermal expansion of an ideal gas
    12.3 Adiabatic expansion of an ideal gas
    12.4 Adiabatic atmosphere
    Exercises

V The second law

13 Heat engines and the second law
    13.1 The second law of thermodynamics
    13.2 The Carnot engine
    13.3 Carnot’s theorem
    13.4 Equivalence of Clausius and Kelvin statements
    13.5 Examples of heat engines
    13.6 Heat engines running backwards
    13.7 Clausius’ theorem
    Further reading
    Exercises

14 Entropy
    14.1 Definition of entropy
    14.2 Irreversible change
    14.3 The first law revisited
    14.4 The Joule expansion
    14.5 The statistical basis for entropy
    14.6 The entropy of mixing
    14.7 Maxwell’s demon
    14.8 Entropy and probability
    Exercises

15 Information theory
    15.1 Information and Shannon entropy
    15.2 Information and thermodynamics
    15.3 Data compression
    15.4 Quantum information
    Further reading
    Exercises

VI Thermodynamics in action

16 Thermodynamic potentials
    16.1 Internal energy, U
    16.2 Enthalpy, H
    16.3 Helmholtz function, F
    16.4 Gibbs function, G
    16.5 Availability
    16.6 Maxwell’s relations
    Exercises

17 Rods, bubbles and magnets
    17.1 Elastic rod
    17.2 Surface tension
    17.3 Paramagnetism
    Exercises

18 The third law
    18.1 Different statements of the third law
    18.2 Consequences of the third law
    Exercises

VII Statistical mechanics

19 Equipartition of energy
    19.1 Equipartition theorem
    19.2 Applications
        19.2.1 Translational motion in a monatomic gas
        19.2.2 Rotational motion in a diatomic gas
        19.2.3 Vibrational motion in a diatomic gas
        19.2.4 The heat capacity of a solid
    19.3 Assumptions made
    19.4 Brownian motion
    Exercises

20 The partition function
    20.1 Writing down the partition function
    20.2 Obtaining the functions of state
    20.3 The big idea
    20.4 Combining partition functions
    Exercises

21 Statistical mechanics of an ideal gas
    21.1 Density of states
    21.2 Quantum concentration
    21.3 Distinguishability
    21.4 Functions of state of the ideal gas
    21.5 Gibbs paradox
    21.6 Heat capacity of a diatomic gas
    Exercises

22 The chemical potential
    22.1 A definition of the chemical potential
    22.2 The meaning of the chemical potential
    22.3 Grand partition function
    22.4 Grand potential
    22.5 Chemical potential as Gibbs function per particle
    22.6 Many types of particle
    22.7 Particle number conservation laws
    22.8 Chemical potential and chemical reactions
    Further reading
    Exercises

23 Photons
    23.1 The classical thermodynamics of electromagnetic radiation
    23.2 Spectral energy density
    23.3 Kirchhoff’s law
    23.4 Radiation pressure
    23.5 The statistical mechanics of the photon gas
    23.6 Black body distribution
    23.7 Cosmic Microwave Background radiation
    23.8 The Einstein A and B coefficients
    Further reading
    Exercises

24 Phonons
    24.1 The Einstein model
    24.2 The Debye model
    24.3 Phonon dispersion
    Further reading
    Exercises

VIII Beyond the ideal gas

25 Relativistic gases
    25.1 Relativistic dispersion relation for massive particles
    25.2 The ultrarelativistic gas
    25.3 Adiabatic expansion of an ultrarelativistic gas
    Exercises

26 Real gases
    26.1 The van der Waals gas
    26.2 The Dieterici equation
    26.3 Virial expansion
    26.4 The law of corresponding states
    Exercises

27 Cooling real gases
    27.1 The Joule expansion
    27.2 Isothermal expansion
    27.3 Joule–Kelvin expansion
    27.4 Liquefaction of gases
    Exercises

28 Phase transitions
    28.1 Latent heat
    28.2 Chemical potential and phase changes
    28.3 The Clausius–Clapeyron equation
    28.4 Stability & metastability
    28.5 The Gibbs phase rule
    28.6 Colligative properties
    28.7 Classification of phase transitions
    Further reading
    Exercises

29 Bose–Einstein and Fermi–Dirac distributions
    29.1 Exchange and symmetry
    29.2 Wave functions of identical particles
    29.3 The statistics of identical particles
    Further reading
    Exercises

30 Quantum gases and condensates
    30.1 The non-interacting quantum fluid
    30.2 The Fermi gas
    30.3 The Bose gas
    30.4 Bose–Einstein condensation (BEC)
    Further reading
    Exercises

IX Special topics

31 Sound waves
    31.1 Sound waves under isothermal conditions
    31.2 Sound waves under adiabatic conditions
    31.3 Are sound waves in general adiabatic or isothermal?
    31.4 Derivation of the speed of sound within fluids
    Further reading
    Exercises

32 Shock waves
    32.1 The Mach number
    32.2 Structure of shock waves
    32.3 Shock conservation laws
    32.4 The Rankine–Hugoniot conditions
    Further reading
    Exercises

33 Brownian motion and fluctuations
    33.1 Brownian motion
    33.2 Johnson noise
    33.3 Fluctuations
    33.4 Fluctuations and the availability
    33.5 Linear response
    33.6 Correlation functions
    Further reading
    Exercises

34 Non-equilibrium thermodynamics
    34.1 Entropy production
    34.2 The kinetic coefficients
    34.3 Proof of the Onsager reciprocal relations
    34.4 Thermoelectricity
    34.5 Time reversal and the arrow of time
    Further reading
    Exercises

35 Stars
    35.1 Gravitational interaction
        35.1.1 Gravitational collapse and the Jeans criterion
        35.1.2 Hydrostatic equilibrium
        35.1.3 The virial theorem
    35.2 Nuclear reactions
    35.3 Heat transfer
        35.3.1 Heat transfer by photon diffusion
        35.3.2 Heat transfer by convection
        35.3.3 Scaling relations
    Further reading
    Exercises

36 Compact objects
    36.1 Electron degeneracy pressure
    36.2 White dwarfs
    36.3 Neutron stars
    36.4 Black holes
    36.5 Accretion
    36.6 Black holes and entropy
    36.7 Life, the Universe and Entropy
    Further reading
    Exercises

37 Earth’s atmosphere
    37.1 Solar energy
    37.2 The temperature profile in the atmosphere
    37.3 The greenhouse effect
    Further reading
    Exercises

A Fundamental constants

B Useful formulae

C Useful mathematics
    C.1 The factorial integral
    C.2 The Gaussian integral
    C.3 Stirling’s formula
    C.4 Riemann zeta function
    C.5 The polylogarithm
    C.6 Partial derivatives
    C.7 Exact differentials
    C.8 Volume of a hypersphere
    C.9 Jacobians
    C.10 The Dirac delta function
    C.11 Fourier transforms
    C.12 Solution of the diffusion equation
    C.13 Lagrange multipliers

D The electromagnetic spectrum

E Some thermodynamical definitions

F Thermodynamic expansion formulae

G Reduced mass

H Glossary of main symbols

I Bibliography

Index

Part I

Preliminaries

In order to explore and understand the rich and beautiful subject that is thermal physics, we need some essential tools in place. Part I provides these, as follows:

• In Chapter 1 we explore the concept of large numbers, showing why large numbers appear in thermal physics and explaining how to handle them. Large numbers arise in thermal physics because the number of atoms in the bit of matter under study is usually very large (for example, it can be typically of the order of $10^{23}$), but also because many thermal physics problems involve combinatorial calculations (and this can produce numbers like $10^{23}!$, where “!” here means a factorial). We introduce Stirling’s approximation, which is useful for handling expressions such as $\ln N!$ that frequently appear in thermal physics. We discuss the thermodynamic limit and state the ideal gas equation (derived later, in Chapter 6, from the kinetic theory of gases).
• In Chapter 2 we explore the concept of heat, defining it as “energy in transit”, and introduce the idea of a heat capacity.
• The way in which thermal systems behave is determined by the laws of probability, so we outline the notion of probability in Chapter 3 and apply it to a number of problems. This chapter may well cover ground that is familiar to some readers, but is a useful introduction to the subject.
• We then use these ideas to define the temperature of a system from a statistical perspective and hence derive the Boltzmann distribution in Chapter 4. This distribution describes how a thermal system behaves when it is placed in thermal contact with a large thermal reservoir. This is a key concept in thermal physics and forms the basis of all that follows.
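To give a feel for why Stirling’s approximation is so useful for expressions such as $\ln N!$, the following is a minimal sketch, assuming the leading-order form $\ln N! \approx N \ln N - N$ (the book’s own statement of the formula comes later and is not reproduced here). The function names are illustrative only; the comparison value is obtained from the log-gamma function, using $N! = \Gamma(N+1)$.

```python
import math


def ln_factorial(n: float) -> float:
    """ln(n!) evaluated via the log-gamma function, since n! = Gamma(n + 1)."""
    return math.lgamma(n + 1.0)


def ln_factorial_stirling(n: float) -> float:
    """Leading-order Stirling approximation: ln(n!) ~ n ln n - n (assumed form)."""
    return n * math.log(n) - n


def relative_error(n: float) -> float:
    """Relative error of the Stirling estimate compared with the log-gamma value."""
    exact = ln_factorial(n)
    return abs(exact - ln_factorial_stirling(n)) / exact


if __name__ == "__main__":
    # The approximation improves rapidly as N grows towards thermal-physics sizes.
    for n in (10.0, 100.0, 1.0e6, 1.0e23):
        print(f"N = {n:.0e}: ln N! = {ln_factorial(n):.6e}, "
              f"relative error = {relative_error(n):.2e}")
```

For numbers of the order of $10^{23}$ the relative error is far below anything that matters in practice, which is why the simple form is usually sufficient.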

1 Introduction

1.1 What is a mole?
1.2 The thermodynamic limit
1.3 The ideal gas
1.4 Combinatorial problems
1.5 Plan of the book
Chapter summary
Exercises

Some large numbers:

    million        $10^{6}$
    billion        $10^{9}$
    trillion       $10^{12}$
    quadrillion    $10^{15}$
    quintillion    $10^{18}$
    googol         $10^{100}$
    googolplex     $10^{10^{100}}$

Note: these values assume the US billion, trillion etc., which are now in general use.

¹ Still more hopeless would be the task of measuring where each molecule is and how fast it is moving in its initial state!

The subject of thermal physics involves studying assemblies of large numbers of atoms. As we will see, it is the large numbers involved in macroscopic systems which allow us to treat some of their properties in a statistical fashion. What do we mean by a large number? Large numbers turn up in many spheres of life. A book might sell a million ($10^{6}$) copies (probably not this one), the Earth’s population is (at the time of writing) between six and seven billion people ($6$–$7 \times 10^{9}$), and the US budget deficit is currently around half a trillion dollars ($5 \times 10^{11}$ US dollars). But even these large numbers pale into insignificance compared with the numbers involved in thermal physics. The number of atoms in an average-sized piece of matter is usually ten to the power of twenty-something, and this puts extreme limits on what sort of calculations we can do to understand them.

Example 1.1
One kilogramme of nitrogen gas contains approximately $2 \times 10^{25}$ N$_2$ molecules. Let us see how easy it would be to make predictions about the motion of the molecules in this amount of gas. In one year, there are about $3.2 \times 10^{7}$ seconds, so that a 3 GHz personal computer can count molecules at a rate of roughly $10^{17}$ year$^{-1}$, if it counts one molecule every computer clock cycle. Therefore it would take about 0.2 billion years just for this computer to count all the molecules in one kilogramme of nitrogen gas (a time which is roughly a few percent of the age of the Universe!). Counting the molecules is a computationally simpler task than calculating all their movements and collisions with each other. Therefore modelling this quantity of matter by following each and every particle is a hopeless task.¹

Hence, to make progress in thermal physics it is necessary to make approximations and deal with the statistical properties of molecules, i.e. to study how they behave on average. Chapter 3 therefore contains a discussion of probability and statistical methods which are foundational for understanding thermal physics. In this chapter, we will briefly review the definition of a mole (which will be used throughout the book), consider why very big numbers arise from combinatorial problems in thermal physics and introduce the thermodynamic limit and the ideal gas equation.
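The arithmetic of Example 1.1 can be checked in a few lines. This is only a sketch: the molar mass of N$_2$ (taken here as 28 g) is an assumption not stated in the example, and the variable names are illustrative.

```python
AVOGADRO = 6.022e23          # Avogadro number, to 4 significant figures
MOLAR_MASS_N2_G = 28.0       # assumed molar mass of N2 in grammes
SECONDS_PER_YEAR = 3.2e7     # value used in Example 1.1
CLOCK_RATE_HZ = 3.0e9        # 3 GHz, one molecule counted per clock cycle

# Number of N2 molecules in one kilogramme of nitrogen gas (about 2e25).
molecules = 1000.0 / MOLAR_MASS_N2_G * AVOGADRO

# Molecules counted per year (about 1e17).
counts_per_year = CLOCK_RATE_HZ * SECONDS_PER_YEAR

# Time needed just to count them all, in years (about 0.2 billion).
years_to_count = molecules / counts_per_year

if __name__ == "__main__":
    print(f"molecules in 1 kg of N2 : {molecules:.2e}")
    print(f"counting rate per year  : {counts_per_year:.2e}")
    print(f"years needed to count   : {years_to_count:.2e}")
```

Using unrounded intermediate values shifts the answer slightly from the rounded figures in the example, but the order of magnitude is unchanged.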

1.1 What is a mole?

A mole is, of course, a small burrowing animal, but also a name (first coined about a century ago from the German ‘Molekül’ [molecule]) representing a certain numerical quantity of stuff. It functions in the same way as the word ‘dozen’, which describes a certain number of eggs (12), or ‘score’, which describes a certain number of years (20). It might be easier if we could use the word dozen when describing a certain number of atoms, but a dozen atoms is not many (unless you are building a quantum computer) and since a million, a billion, and even a quadrillion are also too small to be useful, we have ended up with using an even bigger number. Unfortunately, for historical reasons, it isn’t a power of ten.

The mole: A mole is defined as the quantity of matter that contains as many objects (for example, atoms, molecules, formula units, or ions) as the number of atoms in exactly 12 g (= 0.012 kg) of $^{12}$C.

A mole is also approximately the quantity of matter that contains as many objects (for example, atoms, molecules, formula units, ions) as the number of atoms in exactly 1 g (= 0.001 kg) of $^{1}$H, but carbon was chosen as a more convenient international standard since solids are easier to weigh accurately. A mole of atoms is equivalent to an Avogadro number $N_{\rm A}$ of atoms. The Avogadro number, expressed to 4 significant figures, is

$$ N_{\rm A} = 6.022 \times 10^{23}. \tag{1.1} $$

Example 1.2
• 1 mole of carbon is $6.022 \times 10^{23}$ atoms of carbon.
• 1 mole of benzene is $6.022 \times 10^{23}$ molecules of benzene.
• 1 mole of NaCl contains $6.022 \times 10^{23}$ NaCl formula units, etc.

The Avogadro number is an exceedingly large number: a mole of eggs would make an omelette with about half the mass of the Moon! The molar mass of a substance is the mass of one mole of the substance. Thus the molar mass of carbon is 12 g, but the molar mass of water is close to 18 g (because the mass of a water molecule is about $18/12$ times larger than the mass of a carbon atom). The mass $m$ of a single molecule or atom is therefore the molar mass of that substance divided by the Avogadro number. Equivalently:

$$ \text{molar mass} = m N_{\rm A}. \tag{1.2} $$
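Equation 1.2 and the omelette estimate are easy to check numerically. The sketch below assumes an egg mass of about 60 g and a lunar mass of about $7.35 \times 10^{22}$ kg; neither figure appears in the text, and the names are illustrative.

```python
AVOGADRO = 6.022e23            # Avogadro number (eqn 1.1)

MOLAR_MASS_WATER_KG = 0.018    # molar mass of water, close to 18 g
EGG_MASS_KG = 0.060            # assumed mass of a typical egg
MOON_MASS_KG = 7.35e22         # assumed mass of the Moon


def mass_per_particle(molar_mass_kg: float) -> float:
    """Mass m of one molecule or atom from eqn 1.2: molar mass = m * N_A."""
    return molar_mass_kg / AVOGADRO


# Mass of a single water molecule, roughly 3e-26 kg.
water_molecule_mass_kg = mass_per_particle(MOLAR_MASS_WATER_KG)

# Mass of a mole of eggs compared with the mass of the Moon.
omelette_to_moon_ratio = AVOGADRO * EGG_MASS_KG / MOON_MASS_KG

if __name__ == "__main__":
    print(f"mass of one water molecule: {water_molecule_mass_kg:.3e} kg")
    print(f"mole of eggs / Moon mass  : {omelette_to_moon_ratio:.2f}")
```

With these assumed values the ratio comes out close to one half, in line with the claim above.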

One can write $N_{\rm A}$ as $6.022 \times 10^{23}$ mol$^{-1}$ as a reminder of its definition, but $N_{\rm A}$ is dimensionless, as are moles. They are both numbers. By the same logic, one would have to define the ‘eggbox number’ as 12 dozen$^{-1}$.

1.2 The thermodynamic limit

Fig. 1.1 Graphs of the force F on a roof as a function of time t due to falling raindrops, shown in three panels (a), (b) and (c).

² An impulse is the product of force and a time interval. The impulse is equal to the change of momentum.

In this section, we will explain how the large numbers of molecules in a typical thermodynamic system mean that it is possible to deal with average quantities. Our explanation proceeds using an analogy: imagine that you are sitting inside a tiny hut with a flat roof. It is raining outside, and you can hear the occasional raindrop striking the roof. The raindrops arrive randomly, so sometimes two arrive close together, but sometimes there is quite a long gap between raindrops. Each raindrop transfers its momentum to the roof and exerts an impulse² on it. If you knew the mass and terminal velocity of a raindrop, you could estimate the force on the roof of the hut. The force as a function of time would look like that shown in Fig. 1.1(a), each little blip corresponding to the impulse from one raindrop.

Now imagine that you are sitting inside a much bigger hut with a flat roof a thousand times the area of the first roof. Many more raindrops will now be falling on the larger roof area and the force as a function of time would look like that shown in Fig. 1.1(b). Now scale up the area of the flat roof by a further factor of one hundred and the force would look like that shown in Fig. 1.1(c). Notice two key things about these graphs:

(1) The force, on average, gets bigger as the area of the roof gets bigger. This is not surprising because a bigger roof catches more raindrops.
(2) The fluctuations in the force get smoothed out and the force looks like it stays much closer to its average value. In fact, the fluctuations are still big but, as the area of the roof increases, they grow more slowly than the average force does.

The force grows with area, so it is useful to consider the pressure, which is defined as

$$ \text{pressure} = \frac{\text{force}}{\text{area}}. \tag{1.3} $$

The average pressure due to the falling raindrops will not change as the area of the roof increases, but the fluctuations in the pressure will decrease.
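The smoothing of fluctuations in the raindrop analogy can be illustrated with a small simulation. This is a rough sketch rather than a model of real rain: it assumes each drop lands in a randomly chosen time bin, treats the force in a bin as proportional to the number of drops in it, and uses roof areas of 1, 100 and 10 000 (smaller than the factors quoted above, to keep the run time short). All names and parameter values are illustrative.

```python
import random
import statistics


def relative_fluctuation(area: int,
                         drops_per_bin_per_area: int = 5,
                         n_bins: int = 20,
                         seed: int = 0) -> float:
    """Standard deviation of the drops-per-bin count divided by its mean.

    The force on the roof in each time bin is taken to be proportional to the
    number of drops landing in that bin, so this ratio is also the relative
    fluctuation of the force (and of the pressure).
    """
    rng = random.Random(seed)
    n_drops = area * drops_per_bin_per_area * n_bins
    counts = [0] * n_bins
    for _ in range(n_drops):
        # Each drop arrives at a random time, i.e. in a random bin.
        counts[rng.randrange(n_bins)] += 1
    return statistics.pstdev(counts) / statistics.mean(counts)


if __name__ == "__main__":
    for area in (1, 100, 10_000):
        print(f"area = {area:>6}: relative fluctuation = "
              f"{relative_fluctuation(area):.4f}")
```

The mean count per bin grows in proportion to the area, while the relative fluctuation falls roughly as one over the square root of the number of drops, which is the behaviour described in point (2).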
In fact, we can completely ignore the fluctuations in the pressure in the limit that the area of the roof grows to infinity. This is precisely analogous to the limit we refer to as the thermodynamic limit.

Consider now the molecules of a gas which are bouncing around in a container. Each time the molecules bounce off the walls of the container, they exert an impulse on the walls. The net effect of all these impulses is a pressure, a force per unit area, exerted on the walls of the container. If the container were very small, we would have to worry about fluctuations in the pressure (the random arrival of individual molecules on the wall, much like the raindrops in Fig. 1.1(a)). However, in most cases that one meets, the number of molecules in a container of gas is extremely large, so these fluctuations can be ignored and the pressure of the gas appears to be completely uniform. Again, our description of the pressure of this system can be said to be ‘in the thermodynamic limit’, where we have let the number of molecules be regarded as tending to infinity in such a way that the density of the gas is a constant.

Suppose that the container of gas has volume $V$, that the temperature is $T$, the pressure is $p$ and the kinetic energy of all the gas molecules adds up to $U$. Imagine slicing the container of gas in half with an imaginary plane, and now just focus your attention on the gas on one side of the plane. The volume of this half of the gas, let’s call it $V^*$, is by definition half that of the original container, i.e.

$$ V^* = \frac{V}{2}. \tag{1.4} $$

The kinetic energy of this half of the gas, let’s call it $U^*$, is clearly half that of the total kinetic energy, i.e.

$$ U^* = \frac{U}{2}. \tag{1.5} $$

However, the pressure $p^*$ and the temperature $T^*$ of this half of the gas are the same as for the whole container of gas, so that

$$ p^* = p, \tag{1.6} $$

$$ T^* = T. \tag{1.7} $$

Variables which scale with the system size, like $V$ and $U$, are called extensive variables. Those which are independent of system size, like $p$ and $T$, are called intensive variables.

Thermal physics evolved in various stages and has left us with various approaches to the subject:

• The subject of classical thermodynamics deals with macroscopic properties, such as pressure, volume and temperature, without worrying about the underlying microscopic physics. It applies to systems which are sufficiently large that microscopic fluctuations can be ignored, and it does not assume that there is an underlying atomic structure to matter.
• The kinetic theory of gases tries to determine the properties of gases by considering probability distributions associated with the motions of individual molecules. This was initially somewhat controversial since the existence of atoms and molecules was doubted by many until the late nineteenth and early twentieth centuries.
• The realization that atoms and molecules exist led to the development of statistical mechanics. Rather than starting with descriptions of macroscopic properties (as in thermodynamics) this approach begins with trying to describe the individual microscopic states of a system and then uses statistical methods to derive the macroscopic properties from them. This approach received an additional impetus with the development of quantum theory which showed explicitly how to describe the microscopic quantum states of different systems. The thermodynamic behaviour of a system is then asymptotically approximated by the results of statistical mechanics in the thermodynamic limit, i.e. as the number of particles tends to infinity (with intensive quantities such as pressure and density remaining finite).

In the next section, we will state the ideal gas law which was first found experimentally but can be deduced from the kinetic theory of gases (see Chapter 6).

1.3 The ideal gas

Experiments on gases show that the pressure $p$ of a volume $V$ of gas depends on its temperature $T$. For example, a fixed amount of gas at constant temperature obeys

$$p \propto 1/V, \tag{1.8}$$

a result which is known as Boyle's law (sometimes as the Boyle–Mariotte law); it was discovered experimentally by Robert Boyle (1627–1691) in 1662 and independently by Edmé Mariotte (1620–1684) in 1676. At constant pressure, the gas also obeys

$$V \propto T, \tag{1.9}$$

where $T$ is measured in Kelvin. This is known as Charles' law and was discovered experimentally, in a crude fashion, by Jacques Charles (1746–1823) in 1787, and more completely by Joseph Louis Gay-Lussac (1778–1850) in 1802, though their work was partly anticipated by Guillaume Amontons (1663–1705) in 1699, who also noticed that a fixed volume of gas obeys

$$p \propto T, \tag{1.10}$$

a result that Gay-Lussac himself found independently in 1809 and is often known as Gay-Lussac's law.³ These three empirical laws can be combined to give

$$pV \propto T. \tag{1.11}$$

It turns out that, if there are $N$ molecules in the gas, this finding can be expressed as follows:

$$pV = N k_{\rm B} T. \tag{1.12}$$

This is known as the ideal gas equation, and the constant $k_{\rm B}$ is known as the Boltzmann constant.⁴

³ Note that none of these scientists expressed temperature in this way, since the Kelvin scale and absolute zero had yet to be invented. For example, Gay-Lussac found merely that $V = V_0(1 + \alpha\tilde{T})$, where $V_0$ and $\alpha$ are constants and $\tilde{T}$ is temperature in his scale.

⁴ It takes the numerical value $k_{\rm B} = 1.3807 \times 10^{-23}$ J K$^{-1}$. We will meet this constant again in eqn 4.7.

We now make some comments about the ideal gas equation.

• We have stated this law purely as an empirical law, observed in experiment. We will derive it from first principles using the kinetic theory of gases in Chapter 6. This theory assumes that a gas can be modelled as a collection of individual tiny particles which can bounce off the walls of the container, and each other (see Fig. 1.2).

• Why do we call it ‘ideal’? The microscopic justification which we will present in Chapter 6 proceeds under various assumptions: (i) we assume that there are no intermolecular forces, so that the molecules are not attracted to each other; (ii) we assume that molecules are point-like and have zero size. These are idealized assumptions and so we do not expect the ideal gas model to describe real gases under all circumstances. However, it does have the virtue of simplicity: eqn 1.12 is simple to write down and remember. Perhaps more importantly, it does describe gases quite well under quite a wide range of conditions.

• The ideal gas equation forms the basis of much of our study of classical thermodynamics. Gases are common in nature: they are encountered in astrophysics and atmospheric physics and it is gases which are used to drive engines, and thermodynamics was invented to try and understand engines. Therefore this equation is fundamental in our treatment of thermodynamics and should be memorized.

• The ideal gas law, however, doesn't describe all important gases, and several chapters in this book are devoted to seeing what happens when various assumptions fail. For example, the ideal gas equation assumes that the gas molecules move non-relativistically. When this is not the case, we have to develop a model of relativistic gases (see Chapter 25). At low temperatures and high densities, gas molecules do attract one another (this must occur for liquids and solids to form) and this is considered in Chapters 26, 27 and 28. Furthermore, when quantum effects are important we need a model of quantum gases, and this is outlined in Chapter 30.

• Of course, thermodynamics applies also to systems which are not gaseous (so the ideal gas equation, though useful, is not a cure for all ills), and we will look at the thermodynamics of rods, bubbles and magnets in Chapter 17.
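As a rough numerical illustration of eqn 1.12, the following Python sketch solves the ideal gas equation for the number of molecules and checks Boyle's law. The Boltzmann constant is the value quoted in the text; the pressure, temperature and volume are assumed illustrative conditions that are not taken from the text, and the function names are hypothetical.

```python
# Boltzmann constant as quoted in the text, in J/K.
K_B = 1.3807e-23


def number_of_molecules(pressure_pa, volume_m3, temperature_k):
    """Solve p V = N k_B T (eqn 1.12) for the number of molecules N."""
    return pressure_pa * volume_m3 / (K_B * temperature_k)


def pressure(n_molecules, volume_m3, temperature_k):
    """Solve p V = N k_B T (eqn 1.12) for the pressure p."""
    return n_molecules * K_B * temperature_k / volume_m3


# Assumed illustrative conditions (not from the text):
P_ATM = 1.013e5   # pressure in Pa, roughly atmospheric
T_ROOM = 300.0    # temperature in K, roughly room temperature
V_LITRE = 1.0e-3  # volume in m^3 (one litre)

n = number_of_molecules(P_ATM, V_LITRE, T_ROOM)
print(f"N = {n:.3e} molecules in one litre")

# Boyle's law (eqn 1.8): halving V at fixed N and T doubles p.
ratio = pressure(n, V_LITRE / 2, T_ROOM) / P_ATM
print(f"p(V/2) / p(V) = {ratio:.3f}")
```

Under these assumed conditions a single litre of gas already contains of order 10^22 molecules, which is why the thermodynamic limit is such a good approximation for everyday samples.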

1.4 Combinatorial problems

Even larger numbers than NA occur in problems involving combinations, and these turn out to be very important in thermal physics. The following example illustrates a simple combinatorial problem which captures the essence of what we are going to have to deal with.

Example 1.3 Let us imagine that a certain system contains ten atoms. Each of these atoms can exist in one of two states, according to whether it has zero units or one unit of energy. These ‘units’ of energy are called quanta of energy. How many distinct arrangements of quanta are possible for this system if you have at your disposal (a) 10 quanta of energy; (b) 4 quanta of energy?


Fig. 1.2 In the kinetic theory of gases, a gas is modelled as a number of individual tiny particles which can bounce oﬀ the walls of the container, and each other.


Fig. 1.3 Ten atoms which can accommodate four quanta of energy. An atom with a single quantum of energy is shown as a ﬁlled circle, otherwise it is shown as an empty circle. One conﬁguration is shown here.

Solution: We can represent the ten atoms by drawing ten boxes; an empty box signifies an atom with zero quanta of energy; a filled box signifies an atom with one quantum of energy (see Fig. 1.3). We give two methods for calculating the number of ways of arranging $r$ quanta among $n$ atoms:

(1) In the first method, we realize that the first quantum can be assigned to any of the $n$ atoms, the second quantum can be assigned to any of the remaining atoms (there are $n - 1$ of them), and so on until the $r$th quantum can be assigned to any of the remaining $n - r + 1$ atoms. Thus our first guess for the number of possible arrangements of the $r$ quanta we have assigned is $\Omega_{\rm guess} = n \times (n-1) \times (n-2) \times \ldots \times (n-r+1)$. This can be simplified as follows:

$$\Omega_{\rm guess} = \frac{n \times (n-1) \times (n-2) \times \ldots \times 1}{(n-r) \times (n-r-1) \times \ldots \times 1} = \frac{n!}{(n-r)!}. \tag{1.13}$$

However, this assumes that we have labelled the quanta as ‘the first quantum’, ‘the second quantum’ etc. In fact, we don't care which quantum is which because they are indistinguishable. We can rearrange the $r$ quanta in any one of $r!$ arrangements. Hence our answer $\Omega_{\rm guess}$ needs to be divided by $r!$, so that the number $\Omega$ of unique arrangements is

$$\Omega = \frac{n!}{(n-r)!\,r!} \equiv {}^nC_r, \tag{1.14}$$

where ${}^nC_r$ is the symbol for a combination.⁵

(2) In the second method, we recognize that there are $r$ atoms each with one quantum and $n - r$ atoms with zero quanta. The number of arrangements is then simply the number of ways of arranging $r$ ones and $n - r$ zeros. There are $n!$ ways of arranging a sequence of $n$ distinguishable symbols. If $r$ of these symbols are the same (all ones), there are $r!$ ways of arranging these without changing the pattern. If the remaining $n - r$ symbols are all the same (all zeros), there are $(n-r)!$ ways of arranging these without changing the pattern. Hence we again find that

$$\Omega = \frac{n!}{(n-r)!\,r!}. \tag{1.15}$$

For the specific cases shown in Fig. 1.4:

(a) $n = 10$, $r = 10$, so $\Omega = 10!/(10! \times 0!) = 1$. This one possibility, with each atom having a quantum of energy, is shown in Fig. 1.4(a).

(b) $n = 10$, $r = 4$, so $\Omega = 10!/(6! \times 4!) = 210$. A few of these possibilities are shown in Fig. 1.4(b).

If instead we had chosen 10 times as many atoms (so $n = 100$) and 10 times as many quanta, the numbers for (b) would have come out much, much bigger. In this case, we would have $r = 40$, $\Omega \sim 10^{28}$. A further factor of 10 sends these numbers up much further, so for $n = 1000$ and $r = 400$, $\Omega \sim 10^{290}$, a staggeringly large number.

⁵ Other symbols sometimes used for ${}^nC_r$ include ${}_nC_r$ and $\binom{n}{r}$.

Fig. 1.4 Each row shows the ten atoms which can accommodate r quanta of energy. An atom with a single quantum of energy is shown as a filled circle, otherwise it is shown as an empty circle. (a) For r = 10 there is only one possible configuration. (b) For r = 4, there are 210 possibilities, of which three are shown.
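The counts quoted in Example 1.3 can be checked directly, since Python integers have arbitrary precision. The sketch below evaluates eqn 1.15 exactly for the four (n, r) pairs mentioned in the example; the function name `arrangements` is a hypothetical choice for illustration.

```python
import math


def arrangements(n, r):
    """Number of ways of placing r indistinguishable quanta on n atoms,
    with at most one quantum per atom: n! / ((n - r)! r!)  (eqn 1.15)."""
    return math.factorial(n) // (math.factorial(n - r) * math.factorial(r))


for n, r in [(10, 10), (10, 4), (100, 40), (1000, 400)]:
    omega = arrangements(n, r)
    # math.log10 accepts arbitrarily large integers, so the order of
    # magnitude can be read off even when omega overflows a float.
    print(f"n = {n:4d}, r = {r:3d}: log10(Omega) = {math.log10(omega):.1f}")
```

The exact integers are only practical for modest n; for a mole of atoms one has to work with logarithms instead, as the text goes on to explain.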

The numbers in the above example are so large because factorials increase very quickly. In our example we treated 10 atoms; we are clearly going to run into trouble when we deal with a mole of atoms, i.e. when $n = 6 \times 10^{23}$. One way of bringing large numbers down to size is to look at their logarithms.⁶ Thus, if $\Omega$ is given by eqn 1.15, we could calculate

$$\ln\Omega = \ln(n!) - \ln((n-r)!) - \ln(r!). \tag{1.16}$$

This expression involves the logarithm of a factorial, and it is going to be very useful to be able to evaluate this. Most pocket calculators have difficulty in evaluating factorials above 69! (because $70! > 10^{100}$ and many pocket calculators give an overflow error for numbers above $9.999 \times 10^{99}$), so some low cunning will be needed to overcome this. Such low cunning is provided by an expression termed Stirling's formula:

$$\ln n! \approx n\ln n - n. \tag{1.17}$$

This expression⁷ is derived in Appendix C.3.

⁶ We will use ‘ln’ to signify log to the base e, i.e. $\ln = \log_e$. This is known as the natural logarithm.

⁷ As shown in Appendix C.3, it is slightly more accurate to use the formula $\ln n! \approx n\ln n - n + \frac{1}{2}\ln 2\pi n$, but this only gives a significant advantage when $n$ is not too large.

Example 1.4 Estimate the order of magnitude of $10^{23}!$.

Solution: Using Stirling's formula, we can estimate

$$\ln 10^{23}! \approx 10^{23}\ln 10^{23} - 10^{23} = 5.2 \times 10^{24}, \tag{1.18}$$

and hence

$$10^{23}! = \exp(\ln 10^{23}!) \approx \exp(5.20 \times 10^{24}). \tag{1.19}$$

We have our answer in the form $e^x$, but we would really like it as ten to some power. Now if $e^x = 10^y$, then $y = x/\ln 10$ and hence

$$10^{23}! \approx 10^{2.26 \times 10^{24}}. \tag{1.20}$$

Just pause for a moment to take in how big this number is. It is roughly one followed by about $2.26 \times 10^{24}$ zeros! Our claim that combinatorial numbers are big seems to be justified!
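Stirling's formula is easy to test numerically. The following sketch compares eqn 1.17, and the improved form with the extra ½ ln 2πn term, against ln n! obtained from the standard-library log-gamma function (using ln n! = lnΓ(n + 1)), and then repeats the estimate of Example 1.4. The choice n = 70 echoes the pocket-calculator limit mentioned in the text; the function names are hypothetical.

```python
import math


def ln_factorial(n):
    """ln n!, evaluated via the log-gamma function: ln n! = lgamma(n + 1)."""
    return math.lgamma(n + 1)


def stirling(n):
    """Stirling's formula, eqn 1.17: ln n! ~ n ln n - n."""
    return n * math.log(n) - n


def stirling_improved(n):
    """Stirling's formula with the additional (1/2) ln(2 pi n) term."""
    return stirling(n) + 0.5 * math.log(2 * math.pi * n)


def log10_factorial_estimate(n):
    """Exponent y such that n! ~ 10**y, using y = x / ln 10 with x from eqn 1.17."""
    return stirling(n) / math.log(10)


n = 70
print(f"ln 70!            = {ln_factorial(n):.4f}")
print(f"Stirling          = {stirling(n):.4f}")
print(f"Stirling improved = {stirling_improved(n):.4f}")

# Example 1.4: the exponent of ten in 10^23!
y = log10_factorial_estimate(1e23)
print(f"10^23! ~ 10^({y:.3e})")
```

For n = 70 the simple formula is off by a little over one per cent, while the improved form agrees far more closely; for n of order 10^23 the correction term is utterly negligible compared with n ln n − n.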

1.5 Plan of the book

This book aims to introduce the concepts of thermal physics one by one, steadily building up the techniques and ideas which make up the subject. Part I contains various preliminary topics. In Chapter 2 we define heat and introduce the idea of heat capacity. In Chapter 3, the ideas of probability are presented for discrete and continuous distributions. (For a reader familiar with probability theory, this chapter can be omitted.) We then define temperature in Chapter 4, and this allows us to introduce the Boltzmann distribution, which is the probability distribution for systems in contact with a thermal reservoir. The plan for the remaining parts of the book is sketched in Fig. 1.5.

Fig. 1.5 Organization of the book. The dashed line shows a possible route through the material which avoids the kinetic theory of gases. The numbers of the core chapters are given in bold type. The other chapters can be omitted on a first reading, or for a reduced-content course.

The following two parts contain a presentation of the kinetic theory of gases which justifies the ideal gas equation from a microscopic model. Part II presents the Maxwell–Boltzmann distribution of molecular speeds in a gas and the derivation of formulae for pressure, molecular effusion and mean free path. Part III concentrates on transport and thermal diffusion. Parts II and III can be omitted in courses in which kinetic theory is treated at a later stage.

In Part IV, we begin our introduction to mainstream thermodynamics. The concept of energy is covered in Chapter 11, along with the zeroth and first laws of thermodynamics. These are applied to isothermal and adiabatic processes in Chapter 12.

Part V contains the crucial second law of thermodynamics. The idea of a heat engine is introduced in Chapter 13, which leads to various statements of the second law of thermodynamics. Hence the important concept of entropy is presented in Chapter 14 and its application to information theory is discussed in Chapter 15.

Part VI introduces the rest of the machinery of thermodynamics. Various thermodynamic potentials, such as the enthalpy, Helmholtz function and Gibbs function, are introduced in Chapter 16, and their usage illustrated. Thermal systems include not only gases, and Chapter 17 looks at other possible systems such as elastic rods and magnetic systems. The third law of thermodynamics is described in Chapter 18 and provides a deeper understanding of how entropy behaves as the temperature is reduced to absolute zero.

Part VII focusses on statistical mechanics.
Following a discussion of the equipartition of energy in Chapter 19, so useful for understanding high temperature limits, the concept of the partition function is presented in some detail in Chapter 20 which is foundational for understanding statistical mechanics. The idea is applied to the ideal gas in Chapter 21. Particle number becomes important when considering different types of particle, so the chemical potential and grand partition function are presented in Chapter 22. Two simple applications where the chemical potential is zero are photons and phonons, discussed in Chapters 23 and 24 respectively. The discussion up to this point has concentrated on the ideal gas model and we go beyond this in Part VIII: Chapter 25 discusses the eﬀect of relativistic velocities and Chapters 26 and 27 discuss the effect of intermolecular interactions while phase transitions are discussed in Chapter 28, where the important Clausius–Clapeyron equation for a phase boundary is derived. Another quantum mechanical implication is the existence of identical particles and the diﬀerence between fermions and bosons, discussed in Chapter 29, and the consequences for the properties of quantum gases are presented in Chapter 30.


The remainder of the book, Part IX, contains more detailed information on various special topics which allow the power of thermal physics to be demonstrated. In Chapters 31 and 32 we describe sound waves and shock waves in fluids. We draw some of the statistical ideas of the book together in Chapter 33 and discuss non-equilibrium thermodynamics and the arrow of time in Chapter 34. Applications of the concepts in the book are described for astrophysics in Chapters 35 and 36 and for atmospheric physics in Chapter 37.

Chapter summary

• In this chapter, the idea of big numbers has been introduced. These arise in thermal physics for two main reasons:

(1) The number of atoms in a typical macroscopic lump of matter is large. It is measured in the units of the mole. One mole of atoms contains $N_{\rm A}$ atoms where $N_{\rm A} = 6.022 \times 10^{23}$.

(2) Combinatorial problems generate very large numbers. To make these numbers manageable, we often consider their logarithms and use Stirling's approximation: $\ln n! \approx n\ln n - n$.

Exercises

(1.1) What is the mass of 3 moles of carbon dioxide (CO$_2$)? (1 mole of oxygen atoms has a mass of 16 g.)

(1.2) A typical bacterium has a mass of $10^{-12}$ g. Calculate the mass of a mole of bacteria. (Interestingly, this is about the total number of bacteria living in the guts of all humans living on planet Earth.) Give your answer in units of elephant-masses (elephants have a mass ≈ 5000 kg).

(1.3) (a) How many water molecules are there in your body? (Assume that you are nearly all water.) (b) How many drops of water are there in all the oceans of the world? (The mass of the world's oceans is about $10^{21}$ kg. Estimate the size of a typical drop of water.) (c) Which of these two numbers from (a) and (b) is the larger?

(1.4) A system contains $n$ atoms, each of which can only have zero or one quanta of energy. How many ways can you arrange $r$ quanta of energy when (a) $n = 2$, $r = 1$; (b) $n = 20$, $r = 10$; (c) $n = 2 \times 10^{23}$, $r = 10^{23}$?

(1.5) What fractional error do you make when using Stirling's approximation (in the form $\ln n! \approx n\ln n - n$) to evaluate (a) $\ln 10!$, (b) $\ln 100!$ and (c) $\ln 1000!$?

(1.6) Show that eqn C.19 is equivalent to writing

$$n! \approx n^n e^{-n}\sqrt{2\pi n}, \tag{1.21}$$

and

$$n! \approx \sqrt{2\pi}\, n^{n+\frac{1}{2}} e^{-n}. \tag{1.22}$$

2 Heat

In this chapter, we will introduce the concepts of heat and heat capacity.

2.1 A definition of heat

We all have an intuitive notion of what heat is: sitting next to a roaring ﬁre in winter, we feel its heat warming us up, increasing our temperature; lying outside in the sunshine on a warm day, we feel the Sun’s heat warming us up. In contrast, holding a snowball, we feel heat leaving our hand and transferring to the snowball, making our hand feel cold. Heat seems to be some sort of energy transferred from hot things to cold things when they come into contact. We therefore make the following deﬁnition: heat is energy in transit. We now stress a couple of important points about this deﬁnition. (1) Experiments suggest that heat spontaneously transfers from a hotter body to a colder body when they are in contact, and not in the reverse direction. However, there are circumstances when it is possible for heat to go in the reverse direction. A good example of this is a kitchen freezer: you place food, initially at room temperature, into the freezer and shut the door; the freezer then sucks heat out of the food and cools the food down to below freezing point. Heat is being transferred from your warmer food to the colder freezer, apparently in the ‘wrong’ direction. Of course, to achieve this, you have to be paying your electricity bill and therefore be putting in energy to your freezer. If there is a power cut, heat will slowly leak back into the freezer from the warmer kitchen and thaw out all your frozen food. This shows that it is possible to reverse the direction of heat ﬂow, but only if you intervene by putting additional energy in. We will return to this point in Section 13.5 when we consider refrigerators, but for now let us note that we are deﬁning heat as energy in transit and not hard-wiring into the deﬁnition anything about which direction it goes. (2) The ‘in transit’ part of our deﬁnition is very important. 
Though you can add heat to an object, you cannot say that ‘an object contains a certain quantity of heat.’ This is very different to the case of the fuel in your car: you can add fuel to your car, and you are quite entitled to say that your car ‘contains a certain quantity of fuel’. You even have a gauge for measuring it! But heat is quite different. Objects do not and cannot have gauges which read out how much heat they contain, because heat only makes sense when it is ‘in transit’.¹ To see this, consider your cold hands on a chilly winter day. You can increase the temperature of your hands in two different ways: (i) by adding heat, for example by putting your hands close to something hot, like a roaring fire; (ii) by rubbing your hands together. In one case you have added heat from the outside, in the other case you have not added any heat but have done some work. In both cases, you end up with the same final situation: hands which have increased in temperature. There is no physical difference between hands which have been warmed by heat and hands which have been warmed by work.²

¹ We will see later that objects can contain a certain quantity of energy, so it is possible, at least in principle, to have a gauge which reads out how much energy is contained.

² We have made this point by giving a plausible example, but in Chapter 11 we will show using more mathematical arguments that heat only makes sense as energy ‘in transit’.

Heat is measured in joules (J). The rate of heating has the units of watts (W), where 1 W = 1 J s$^{-1}$.

Example 2.1 A 1 kW electric heater is switched on for ten minutes. How much heat does it produce?

Solution: Ten minutes equals 600 s, so the heat $Q$ is given by

$$Q = 1\ \text{kW} \times 600\ \text{s} = 600\ \text{kJ}. \tag{2.1}$$

Notice in this last example that the power in the heater is supplied by electrical work. Thus it is possible to produce heat by doing work. We will return to the question of whether one can produce work from heat in Chapter 13.

2.2 Heat capacity

In the previous section, we explained that it is not possible for an object to contain a certain quantity of heat, because heat is deﬁned as ‘energy in transit’. It is therefore with a somewhat heavy heart that we turn to the topic of ‘heat capacity’, since we have argued that objects have no capacity for heat! (This is one of those occasions in physics when decades of use of a name have made it completely standard, even though it is really the wrong name to use.) What we are going to derive in this section might be better termed ‘energy capacity’, but to do this would put us at odds with common usage throughout physics. All of this being said, we can proceed quite legitimately by asking the following simple question:

How much heat needs to be supplied to an object to raise its temperature by a small amount $dT$? The answer to this question is the heat $dQ = C\,dT$, where we define the heat capacity $C$ of an object using

$$C = \frac{dQ}{dT}. \tag{2.2}$$

As long as we remember that heat capacity tells us simply how much heat is needed to warm an object (and is nothing about the capacity of an object for heat) we shall be on safe ground. As can be inferred from eqn 2.2, the heat capacity $C$ has units J K$^{-1}$. As shown in the following example, although objects have a heat capacity, one can also express the heat capacity of a particular substance per unit mass, or per unit volume.³

Example 2.2 The heat capacity of 0.125 kg of water is measured to be 523 J K$^{-1}$ at room temperature. Hence calculate the heat capacity of water (a) per unit mass and (b) per unit volume.

Solution: (a) The heat capacity per unit mass $c$ is given by dividing the heat capacity by the mass, and hence

$$c = \frac{523\ \text{J K}^{-1}}{0.125\ \text{kg}} = 4.184 \times 10^3\ \text{J K}^{-1}\,\text{kg}^{-1}. \tag{2.3}$$

(b) The heat capacity per unit volume $C$ is obtained by multiplying the previous answer by the density of water, namely 1000 kg m$^{-3}$, so that

$$C = 4.184 \times 10^3\ \text{J K}^{-1}\,\text{kg}^{-1} \times 1000\ \text{kg m}^{-3} = 4.184 \times 10^6\ \text{J K}^{-1}\,\text{m}^{-3}. \tag{2.4}$$

The heat capacity per unit mass c occurs quite frequently, and it is given a special name: the speciﬁc heat capacity.

Example 2.3 Calculate the specific heat capacity of water.

Solution: This is given in answer (a) from the previous example: the specific heat capacity of water is $4.184 \times 10^3$ J K$^{-1}$ kg$^{-1}$.
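The arithmetic of Examples 2.2 and 2.3 can be written as a short Python sketch. The sample data are those of Example 2.2; the function names and the 10 K temperature rise used at the end are assumptions introduced purely for illustration, and the heat capacity is taken to be independent of temperature over that range.

```python
def specific_heat_capacity(heat_capacity, mass):
    """Heat capacity per unit mass, c = C / m, in J K^-1 kg^-1."""
    return heat_capacity / mass


def volumetric_heat_capacity(specific_heat, density):
    """Heat capacity per unit volume, c * rho, in J K^-1 m^-3."""
    return specific_heat * density


def heat_required(heat_capacity, delta_t):
    """Heat Q = C * dT for a small temperature rise dT (from eqn 2.2),
    assuming C is constant over that rise."""
    return heat_capacity * delta_t


# Data from Example 2.2.
C_SAMPLE = 523.0        # heat capacity of the water sample, J K^-1
MASS = 0.125            # mass of the sample, kg
DENSITY_WATER = 1000.0  # density of water, kg m^-3

c_water = specific_heat_capacity(C_SAMPLE, MASS)
c_volume = volumetric_heat_capacity(c_water, DENSITY_WATER)
print(f"c            = {c_water:.4g} J K^-1 kg^-1")
print(f"C per volume = {c_volume:.4g} J K^-1 m^-3")

# Assumed illustration (not from the text): warm the sample by 10 K.
q = heat_required(C_SAMPLE, 10.0)
print(f"Q for a 10 K rise = {q:.0f} J")
```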

³ We will use the symbol $C$ to represent a heat capacity, whether of an object, or per unit volume, or per mole. We will always state which is being used. The heat capacity per unit mass is distinguished by the use of the lower-case symbol $c$. We will usually reserve the use of subscripts on the heat capacity to denote the constraint being applied (see later).


Also useful is the molar heat capacity, which is the heat capacity of one mole of the substance.

Example 2.4 Calculate the molar heat capacity of water. (The molar mass of water is 18 g.)

Solution: The molar heat capacity is obtained by multiplying the specific heat capacity by the molar mass, and hence

$$C = 4.184 \times 10^3\ \text{J K}^{-1}\,\text{kg}^{-1} \times 0.018\ \text{kg mol}^{-1} = 75.3\ \text{J K}^{-1}\,\text{mol}^{-1}. \tag{2.5}$$

When we think about the heat capacity of a gas, there is a further complication.⁴ We are trying to ask the question: how much heat should you add to raise the temperature of our gas by one kelvin? But we can imagine doing the experiment in two ways (see also Fig. 2.1):

(1) Place our gas in a sealed box and add heat (Fig. 2.1(a)). As the temperature rises, the gas will not be allowed to expand because its volume is fixed, so its pressure will increase. This method is known as heating at constant volume.

(2) Place our gas in a chamber connected to a piston and heat it (Fig. 2.1(b)). The piston is well lubricated, and so will slide in and out to keep the pressure in the chamber identical to that in the lab. As the temperature rises, the piston is forced out (doing work against the atmosphere) and the gas is allowed to expand, keeping its pressure constant. This method is known as heating at constant pressure.

⁴ This complication is there for liquids and solids, but doesn't make such a big difference.

Fig. 2.1 Two methods of heating a gas: (a) constant volume, (b) constant pressure.

In both cases, we are applying a constraint to the system, either constraining the volume of the gas to be fixed, or constraining the pressure of the gas to be fixed. We need to modify our definition of heat capacity given in eqn 2.2, and hence we define two new quantities: $C_V$ is the heat capacity at constant volume and $C_p$ is the heat capacity at constant pressure. We can write them using partial differentials as follows:

$$C_V = \left(\frac{\partial Q}{\partial T}\right)_V, \tag{2.6}$$

$$C_p = \left(\frac{\partial Q}{\partial T}\right)_p. \tag{2.7}$$

We expect that $C_p$ will be bigger than $C_V$ for the simple reason that more heat will need to be added when heating at constant pressure than when heating at constant volume. This is because in the former case additional energy will be expended on doing work on the atmosphere as the gas expands. It turns out that indeed $C_p$ is bigger than $C_V$ in practice.⁵

⁵ We will calculate the relative sizes of $C_V$ and $C_p$ in Section 11.3.


Example 2.5 The specific heat capacity of helium gas is measured to be 3.12 kJ K$^{-1}$ kg$^{-1}$ at constant volume and 5.19 kJ K$^{-1}$ kg$^{-1}$ at constant pressure. Calculate the molar heat capacities. (The molar mass of helium is 4 g.)

Solution: The molar heat capacity is obtained by multiplying the specific heat capacity by the molar mass, and hence

$$C_V = 12.48\ \text{J K}^{-1}\,\text{mol}^{-1}, \tag{2.8}$$

$$C_p = 20.76\ \text{J K}^{-1}\,\text{mol}^{-1}. \tag{2.9}$$

(Interestingly, these answers are almost exactly $\frac{3}{2}R$ and $\frac{5}{2}R$. We will see why in Section 11.3.)
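The closing remark of Example 2.5 can be checked numerically. The sketch below converts the quoted specific heat capacities of helium to molar values and expresses them in units of the molar gas constant R. The value R = 8.314 J K^-1 mol^-1 is an assumption here, since it is not quoted in this section, and the function name is hypothetical.

```python
# Assumed value of the molar gas constant (not quoted in this section).
R = 8.314  # J K^-1 mol^-1

# Data from Example 2.5.
MOLAR_MASS_HE = 0.004  # kg mol^-1 (4 g per mole)
C_V_SPECIFIC = 3.12e3  # J K^-1 kg^-1 at constant volume
C_P_SPECIFIC = 5.19e3  # J K^-1 kg^-1 at constant pressure


def molar_heat_capacity(specific_heat, molar_mass):
    """Molar heat capacity = specific heat capacity x molar mass."""
    return specific_heat * molar_mass


cv_molar = molar_heat_capacity(C_V_SPECIFIC, MOLAR_MASS_HE)
cp_molar = molar_heat_capacity(C_P_SPECIFIC, MOLAR_MASS_HE)

print(f"C_V = {cv_molar:.2f} J K^-1 mol^-1 = {cv_molar / R:.3f} R")
print(f"C_p = {cp_molar:.2f} J K^-1 mol^-1 = {cp_molar / R:.3f} R")
print(f"C_p - C_V = {(cp_molar - cv_molar) / R:.3f} R")
```

With this value of R the two ratios come out within about 0.2% of 3/2 and 5/2, and their difference is very close to R itself.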

Chapter summary

• In this chapter, the concepts of heat and heat capacity have been introduced.

• Heat is ‘energy in transit’.

• The heat capacity $C$ of an object is given by $C = dQ/dT$. The heat capacity of a substance can also be expressed per unit volume or per unit mass (in the latter case it is called specific heat capacity).

Exercises

(2.1) Using data from this chapter, estimate the energy needed to (a) boil enough tap water to make a cup of tea, (b) heat the water for a bath.

(2.2) The world's oceans contain approximately $10^{21}$ kg of water. Estimate the total heat capacity of the world's oceans.

(2.3) The world's power consumption is currently about 13 TW, and growing! (1 TW = $10^{12}$ W.) Burning one ton of crude oil (which is nearly seven barrels worth) produces about 42 GJ (1 GJ = $10^9$ J). If the world's total power needs were to come from burning oil (a large fraction currently does), how much oil would we be burning per second?

(2.4) The molar heat capacity of gold is 25.4 J mol$^{-1}$ K$^{-1}$. Its density is $19.3 \times 10^3$ kg m$^{-3}$. Calculate the specific heat capacity of gold and the heat capacity per unit volume. What is the heat capacity of $4 \times 10^6$ kg of gold? (This is roughly the holdings of Fort Knox.)

(2.5) Two bodies, with heat capacities $C_1$ and $C_2$ (assumed independent of temperature) and initial temperatures $T_1$ and $T_2$ respectively, are placed in thermal contact. Show that their final temperature $T_{\rm f}$ is given by $T_{\rm f} = (C_1 T_1 + C_2 T_2)/(C_1 + C_2)$. If $C_1$ is much larger than $C_2$, show that $T_{\rm f} \approx T_1 + C_2(T_2 - T_1)/C_1$.

3 Probability

3.1 Discrete probability distributions
3.2 Continuous probability distributions
3.3 Linear transformation
3.4 Variance
3.5 Linear transformation and the variance
3.6 Independent variables
Chapter summary
Further reading
Exercises

Life is full of uncertainties, and has to be lived according to our best guesses based on the information available to us. This is because the chain of events that lead to various outcomes can be so complex that the exact outcomes are unpredictable. Nevertheless, things can still be said even in an uncertain world: for example, it is more helpful to know that there is a 20% chance of rain tomorrow than that the weather forecaster has absolutely no idea; or worse still that he/she claims that there will deﬁnitely be no rain, when there might be! Probability is therefore an enormously useful and powerful subject, since it can be used to quantify uncertainty. The foundations of probability theory were laid by the French mathematicians Pierre de Fermat (1601–1665) and Blaise Pascal (1623–1662), in a correspondence in 1654 which originated from a problem set to them by a gentleman gambler. The ideas proved to be intellectually infectious and the ﬁrst probability textbook was written by the Dutch physicist Christian Huygens (1629–1695) in 1657, who applied it to the working out of life expectancy. Probability was thought to be useful only for determining possible outcomes in situations in which we lacked complete knowledge. The supposition was that if we could know the motions of all particles at the microscopic level, we could determine every outcome precisely. In the twentieth century, the discovery of quantum theory has led to the understanding that, at the microscopic level, outcomes are purely probabilistic. Probability has had a huge impact on thermal physics. This is because we are often interested in systems containing huge numbers of particles, so that predictions based on probability turn out to be precise enough for most purposes. In a thermal physics problem, one is often interested in the values of quantities which are the sum of many small contributions from individual atoms. 
Though each atom behaves differently, the average behaviour is what comes through, and therefore it becomes necessary to be able to extract average values from probability distributions. In this chapter, we will deﬁne some basic concepts in probability theory. Let us begin by stating that the probability of occurrence of a particular event, taken from a ﬁnite set of possible events, is zero if that event is impossible, is one if that event is certain, and takes a value somewhere in between zero and one if that event is possible but not certain. We begin by considering two diﬀerent types of probability distribution: discrete and continuous.

3.1 Discrete probability distributions

Discrete random variables can only take a finite number of values. Examples include the number obtained when throwing a die (1, 2, 3, 4, 5 or 6), the number of children in each family (0, 1, 2, . . .), and the number of people killed per year in the UK in bizarre gardening accidents (0, 1, 2, . . .). Let $x$ be a discrete random variable which takes values $x_i$ with probability $P_i$. We require that the sum of the probabilities of every possible outcome adds up to one. This may be written

$$\sum_i P_i = 1. \tag{3.1}$$

We define the mean (or average or expected value) of $x$ to be

$$\langle x \rangle = \sum_i x_i P_i. \tag{3.2}$$

The idea is that you weight by its probability each value taken by the random variable $x$.

Alternative notations for the mean of $x$ include $\bar{x}$ and $E(x)$. We prefer the one given in the main text since it is easier to distinguish quantities such as $\langle x^2 \rangle$ and $\langle x \rangle^2$ with this notation, particularly when writing quickly.

Example 3.1 Note that the mean, $\langle x \rangle$, may be a value which $x$ cannot actually take. A common example of this is the number of children in families, which is often quoted as 2.4. Any individual couple can only have an integer number of children. Thus the expected value of $x$ is actually an impossibility!

It is also possible to define the mean squared value of $x$ using

$$\langle x^2 \rangle = \sum_i x_i^2 P_i. \tag{3.3}$$

In fact, any function of $x$ can be averaged, using (by analogy)

$$\langle f(x) \rangle = \sum_i f(x_i) P_i. \tag{3.4}$$
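Equations 3.1 to 3.4 translate almost line for line into code. The sketch below applies them to the fair six-sided die mentioned above, using exact fractions so that no rounding occurs. The assumption that each face has probability 1/6 and the helper name `mean_of` are illustrative choices, not taken from the text.

```python
from fractions import Fraction


def mean_of(f, values, probabilities):
    """Average of f(x) over a discrete distribution:
    <f(x)> = sum_i f(x_i) P_i   (eqn 3.4).

    Raises ValueError if the probabilities do not sum to one (eqn 3.1)."""
    if sum(probabilities) != 1:
        raise ValueError("probabilities must sum to one")
    return sum(f(x) * p for x, p in zip(values, probabilities))


# A fair die: six equally likely outcomes (an assumed distribution).
faces = [1, 2, 3, 4, 5, 6]
probs = [Fraction(1, 6)] * 6

mean_x = mean_of(lambda x: x, faces, probs)            # eqn 3.2
mean_x_squared = mean_of(lambda x: x * x, faces, probs)  # eqn 3.3

print(f"<x>   = {mean_x}")
print(f"<x^2> = {mean_x_squared}")
print(f"<x>^2 = {mean_x ** 2}")
```

As in Example 3.1, the mean of 7/2 is not a value the die can actually show, and the output makes plain that the mean squared value and the square of the mean are different quantities.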

Now let us actually evaluate the mean of x for a particular discrete distribution.

Example 3.2 Let x take values 0, 1 and 2 with probabilities 12 , 14 and 14 respectively. This distribution is shown in Figure 3.1. Calculate x and x2 .

Px

i

x Fig. 3.1 An example of a discrete probability distribution.

Solution: First check that $\sum_i P_i = 1$. Since $\tfrac{1}{2} + \tfrac{1}{4} + \tfrac{1}{4} = 1$, this is fine. Now we can calculate the averages as follows:

$$\langle x \rangle = \sum_i x_i P_i = 0 \cdot \tfrac{1}{2} + 1 \cdot \tfrac{1}{4} + 2 \cdot \tfrac{1}{4} = \tfrac{3}{4}. \tag{3.5}$$

Again, we find that the mean $\langle x \rangle$ is not actually one of the possible values of x. We can now calculate the value of $\langle x^2 \rangle$ as follows:

$$\langle x^2 \rangle = \sum_i x_i^2 P_i = 0 \cdot \tfrac{1}{2} + 1 \cdot \tfrac{1}{4} + 4 \cdot \tfrac{1}{4} = \tfrac{5}{4}. \tag{3.6}$$

3.2

Continuous probability distributions

1 For a continuous random variable, there are an infinite number of possible values it can take, so the probability of any one of them occurring is zero! Hence we talk about the probability of the variable lying in some range, such as ‘between x and x + dx’.

Let x now be a continuous random variable,1 which has a probability $P(x)\,dx$ of having a value between x and x + dx. Continuous random variables can take a range of possible values. Examples include the height of children in a class, the length of time spent in a waiting room, and the amount a person’s blood pressure increases when they read their mobile-phone bill. These quantities are not restricted to any finite set of values, but can take a continuous set of values. As before, we require that the total probability of all possible outcomes is one. Because we are dealing with continuous distributions, the sums become integrals, and we have

$$\int P(x)\,dx = 1. \tag{3.7}$$

The mean is defined as

$$\langle x \rangle = \int x\,P(x)\,dx. \tag{3.8}$$

Similarly, the mean square value is defined as

$$\langle x^2 \rangle = \int x^2\,P(x)\,dx, \tag{3.9}$$

and the mean of any function of x, f(x), can be defined as

$$\langle f(x) \rangle = \int f(x)\,P(x)\,dx. \tag{3.10}$$

Example 3.3 Let $P(x) = C e^{-x^2/2a^2}$ where C and a are constants. This probability is illustrated in Figure 3.2 and this curve is known as a Gaussian.2 Calculate $\langle x \rangle$ and $\langle x^2 \rangle$ given this probability distribution.

Solution: The first thing to do is to normalize the probability distribution (i.e. to ensure that the sum over all probabilities is one). This allows us to find the constant C using eqn C.3 to do the integral:

$$1 = \int_{-\infty}^{\infty} P(x)\,dx = C \int_{-\infty}^{\infty} e^{-x^2/2a^2}\,dx \tag{3.11}$$

$$= C\sqrt{2\pi a^2}, \tag{3.12}$$

so we find that $C = 1/\sqrt{2\pi a^2}$, which gives

$$P(x) = \frac{1}{\sqrt{2\pi a^2}}\, e^{-x^2/2a^2}. \tag{3.13}$$

The mean of x can then be evaluated using

$$\langle x \rangle = \frac{1}{\sqrt{2\pi a^2}} \int_{-\infty}^{\infty} x\, e^{-x^2/2a^2}\,dx = 0, \tag{3.14}$$

because the integrand is an odd function. The mean of $x^2$ can also be evaluated as follows:

$$\langle x^2 \rangle = \frac{1}{\sqrt{2\pi a^2}} \int_{-\infty}^{\infty} x^2\, e^{-x^2/2a^2}\,dx = \frac{1}{\sqrt{2\pi a^2}} \cdot \frac{1}{2}\sqrt{8\pi a^6} = a^2, \tag{3.15}$$

where the integrals are performed as described in Appendix C.2.
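The three Gaussian integrals in Example 3.3 can be checked numerically. The following is a minimal sketch rather than part of the original text: the function names, the width a = 1.5, the truncation of the infinite range to ±10a and the midpoint rule are all illustrative assumptions.

```python
import math


def gaussian(x, a):
    """Normalized Gaussian P(x) = exp(-x^2 / 2a^2) / sqrt(2 pi a^2)."""
    return math.exp(-x * x / (2.0 * a * a)) / math.sqrt(2.0 * math.pi * a * a)


def average(f, a, half_width=10.0, steps=20000):
    """Midpoint-rule estimate of the integral of f(x) P(x) dx.

    The infinite range is truncated to [-half_width * a, +half_width * a];
    outside this interval the Gaussian weight is negligible.
    """
    lo = -half_width * a
    dx = 2.0 * half_width * a / steps
    total = 0.0
    for k in range(steps):
        x = lo + (k + 0.5) * dx
        total += f(x) * gaussian(x, a) * dx
    return total


a = 1.5  # arbitrary illustrative width (an assumption, not from the text)
norm = average(lambda x: 1.0, a)       # should be close to 1
mean = average(lambda x: x, a)         # should be close to 0
mean_sq = average(lambda x: x * x, a)  # should be close to a^2
print(norm, mean, mean_sq, a * a)
```

The normalization comes out close to one, the mean close to zero (the odd integrand cancels pairwise on the symmetric grid) and the mean square close to $a^2$, in line with eqns 3.12, 3.14 and 3.15.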

3.3

Linear transformation

Sometimes one has a random variable, and one wants to make a second random variable by performing a linear transformation on the first one. If y is a random variable which is related to the random variable x by the equation

$$y = ax + b \tag{3.16}$$

where a and b are constants, then the average value of y is given by

$$\langle y \rangle = \langle ax + b \rangle = a\langle x \rangle + b. \tag{3.17}$$

The proof of this result is straightforward and is left as an exercise.

2 See Appendix C.2.

Fig. 3.2 An example of a continuous probability distribution.

Example 3.4 Temperatures in Celsius and Fahrenheit are related by the simple formula $C = \tfrac{5}{9}(F - 32)$, where C is the temperature in Celsius and F the temperature in Fahrenheit. Hence the average temperature of a particular temperature distribution is $\langle C \rangle = \tfrac{5}{9}(\langle F \rangle - 32)$. The average annual temperature in New York Central Park is 54°F. One can convert this to Celsius using the formula above to get ≈ 12°C.

3.4

Variance

We now know how to calculate the average of a set of values, but what about the spread in the values? The first idea one might have to quantify the spread of values in a distribution is to consider the deviation from the mean for a particular value of x. This is defined by

$$x - \langle x \rangle. \tag{3.18}$$

This quantity tells you by how much a particular value is above or below the mean value. We can work out the average of the deviation (averaging over all values of x) as follows:

$$\langle x - \langle x \rangle \rangle = \langle x \rangle - \langle x \rangle = 0, \tag{3.19}$$

which follows from using the equation for linear transformation (eqn 3.17). Thus the average deviation is not going to be a very helpful indicator! Of course, the problem is that the deviation is sometimes positive and sometimes negative, and the positive and negative deviations cancel out. A more useful quantity would be the modulus of the deviation,

$$|x - \langle x \rangle|, \tag{3.20}$$

which is never negative, but this suffers from the disadvantage that modulus signs in algebra can be both confusing and tedious. Therefore, another approach is to use another quantity which is never negative, the square of the deviation, $(x - \langle x \rangle)^2$. This quantity is what we need: never negative and easy to manipulate algebraically. Hence, its average is given a special name, the variance. Consequently, the variance of x, written as $\sigma_x^2$, is defined as the mean squared deviation:3

$$\sigma_x^2 = \langle (x - \langle x \rangle)^2 \rangle. \tag{3.21}$$

We further define the standard deviation, $\sigma_x$, as the square root of the variance:

$$\sigma_x = \sqrt{\langle (x - \langle x \rangle)^2 \rangle}. \tag{3.22}$$

3 In fact, in general we can define the kth moment about the mean as $\langle (x - \langle x \rangle)^k \rangle$. The first moment about the mean is the mean deviation, and it is zero, as we have seen. The second moment about the mean is the variance. The third moment about the mean is known as the skewness parameter, and sometimes turns out to be useful. The fourth moment about the mean is called the kurtosis.

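The standard deviation represents the ‘root mean square’ (known as the ‘r.m.s.’) scatter or spread in the data. The following identity is extremely useful:

$$\begin{aligned} \sigma_x^2 &= \langle (x - \langle x \rangle)^2 \rangle \\ &= \langle x^2 - 2x\langle x \rangle + \langle x \rangle^2 \rangle \\ &= \langle x^2 \rangle - 2\langle x \rangle\langle x \rangle + \langle x \rangle^2 \\ &= \langle x^2 \rangle - \langle x \rangle^2. \end{aligned} \tag{3.23}$$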
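As a quick check of eqn 3.23, the following minimal Python sketch evaluates the variance of the discrete distribution of Example 3.2 in two ways: directly from the definition (eqn 3.21) and from the identity. The helper name `average` and the use of exact fractions are illustrative choices, not something taken from the text.

```python
from fractions import Fraction


def average(values, probs, f=lambda x: x):
    """Return <f(x)> = sum_i f(x_i) P_i for a discrete distribution."""
    assert sum(probs) == 1, "probabilities must sum to one"
    return sum(f(x) * p for x, p in zip(values, probs))


# Distribution of Example 3.2: x = 0, 1, 2 with P = 1/2, 1/4, 1/4.
values = [0, 1, 2]
probs = [Fraction(1, 2), Fraction(1, 4), Fraction(1, 4)]

x_mean = average(values, probs)                    # <x>
x2_mean = average(values, probs, lambda x: x * x)  # <x^2>

# Variance from the definition, <(x - <x>)^2> ...
var_definition = average(values, probs, lambda x: (x - x_mean) ** 2)
# ... and from the identity <x^2> - <x>^2.
var_identity = x2_mean - x_mean ** 2

print(x_mean, x2_mean, var_definition, var_identity)
```

Because the arithmetic is done with exact fractions, the two routes to the variance agree exactly rather than only to rounding error.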

Example 3.5 For Examples 3.2 and 3.3 above, work out $\sigma_x^2$, the variance of the distribution, in each case.

Solution: For Example 3.2

$$\sigma_x^2 = \langle x^2 \rangle - \langle x \rangle^2 = \frac{5}{4} - \frac{9}{16} = \frac{11}{16}. \tag{3.24}$$

For Example 3.3

$$\sigma_x^2 = \langle x^2 \rangle - \langle x \rangle^2 = a^2 - 0 = a^2. \tag{3.25}$$

3.5

Linear transformation and the variance

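We return to the problem of a linear transformation of a random variable. What happens to the variance in this case? If y is a random variable which is related to the random variable x by the equation

$$y = ax + b, \tag{3.26}$$

where a and b are constants, then we have seen that

$$\langle y \rangle = \langle ax + b \rangle = a\langle x \rangle + b. \tag{3.27}$$

Hence, we can work out $\langle y^2 \rangle$, which is

$$\langle y^2 \rangle = \langle (ax + b)^2 \rangle = \langle a^2x^2 + 2abx + b^2 \rangle = a^2\langle x^2 \rangle + 2ab\langle x \rangle + b^2. \tag{3.28}$$

Also, we can work out $\langle y \rangle^2$, which is

$$\langle y \rangle^2 = (a\langle x \rangle + b)^2 = a^2\langle x \rangle^2 + 2ab\langle x \rangle + b^2. \tag{3.29}$$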

Hence, using eqn 3.23, the variance in y is given by eqn 3.28 minus eqn 3.29, i.e.

$$\sigma_y^2 = \langle y^2 \rangle - \langle y \rangle^2 = a^2\langle x^2 \rangle - a^2\langle x \rangle^2 = a^2\sigma_x^2. \tag{3.30}$$

Notice that the variance depends on a but not on b. This makes sense because the variance tells us about the width of a distribution, and nothing about its absolute position. The standard deviation of y is therefore given by

$$\sigma_y = |a|\,\sigma_x, \tag{3.31}$$

where the modulus keeps the standard deviation non-negative if a is negative.
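The effect of a linear transformation on the mean and variance can be illustrated with a small exact calculation. This is only a sketch: the distribution is borrowed from Example 3.2, while the constants a = −3 and b = 7 and the helper name `moments` are arbitrary assumptions made for illustration (a is taken negative to show that the variance scales with $a^2$ regardless of sign).

```python
from fractions import Fraction


def moments(values, probs):
    """Return (mean, variance) of a discrete distribution."""
    m = sum(x * p for x, p in zip(values, probs))
    m2 = sum(x * x * p for x, p in zip(values, probs))
    return m, m2 - m * m


values = [0, 1, 2]
probs = [Fraction(1, 2), Fraction(1, 4), Fraction(1, 4)]
a, b = Fraction(-3), Fraction(7)  # arbitrary constants; a < 0 on purpose

mean_x, var_x = moments(values, probs)
# y = a x + b takes the transformed values with the same probabilities.
mean_y, var_y = moments([a * x + b for x in values], probs)

print(mean_y, a * mean_x + b)  # the two means agree
print(var_y, a * a * var_x)    # the variance scales with a^2; b drops out
```

Changing b in this sketch shifts the mean but leaves the variance untouched, which is the content of the remark above about width versus absolute position.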

Example 3.6 The average temperature in a town in the USA in January is 23°F and the standard deviation is 9°F. Convert these figures into Celsius using the relation in Example 3.4.

Solution: The average temperature in Celsius is given by

$$\langle C \rangle = \frac{5}{9}(\langle F \rangle - 32) = \frac{5}{9}(23 - 32) = -5^\circ\mathrm{C},$$

and the standard deviation is given by

$$\frac{5}{9} \times 9 = 5^\circ\mathrm{C}. \tag{3.32}$$

3.6

Independent variables

4 Two random variables are independent if knowing the value of one of them yields no information about the value of the other. For example, the height of a person chosen at random from a city and the number of hours of rainfall in that city on the first Tuesday of September are two independent random variables.

If u and v are independent random variables,4 the probability that u is in the range from u to u + du and v is in the range from v to v + dv is given by the product

$$P_u(u)\,du\,P_v(v)\,dv. \tag{3.33}$$

Hence, the average value of the product of u and v is

$$\langle uv \rangle = \iint uv\,P_u(u)P_v(v)\,du\,dv = \int u\,P_u(u)\,du \int v\,P_v(v)\,dv = \langle u \rangle\langle v \rangle, \tag{3.34}$$

because the integrals separate for independent random variables. Thus the average value of the product of u and v is equal to the product of their average values.
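The factorization in eqn 3.34 can be seen in a discrete analogue, where the double integral becomes a double sum over a joint probability that is the product of the individual probabilities. The sketch below is illustrative only; the two small distributions are made-up assumptions, and the final line shows that the result fails when the two variables are not independent (here, when v is simply a copy of u).

```python
from fractions import Fraction

# Two made-up discrete distributions, {value: probability}.
P_u = {0: Fraction(1, 2), 1: Fraction(1, 4), 2: Fraction(1, 4)}
P_v = {1: Fraction(1, 3), 3: Fraction(2, 3)}

mean_u = sum(u * p for u, p in P_u.items())
mean_v = sum(v * p for v, p in P_v.items())

# Independent case: the joint probability is the product P_u(u) P_v(v),
# so the double sum separates into a product of two single sums.
mean_uv = sum(u * v * pu * pv
              for u, pu in P_u.items()
              for v, pv in P_v.items())

# Dependent case for contrast: v is forced to equal u, so <uv> = <u^2>.
mean_uu = sum(u * u * p for u, p in P_u.items())

print(mean_uv, mean_u * mean_v)  # equal for independent variables
print(mean_uu, mean_u * mean_u)  # not equal when v depends on u
```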

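Example 3.7 Suppose that there are n independent random variables, $X_i$, each with the same mean $\langle X \rangle$ and variance $\sigma_X^2$. Let Y be the sum of the random variables, so that $Y = X_1 + X_2 + \cdots + X_n$. Find the mean and variance of Y.

Solution: The mean of Y is simply

$$\langle Y \rangle = \langle X_1 \rangle + \langle X_2 \rangle + \cdots + \langle X_n \rangle, \tag{3.35}$$

but since all the $X_i$ have the same mean $\langle X \rangle$ this can be written

$$\langle Y \rangle = n\langle X \rangle. \tag{3.36}$$

Hence the mean of Y is n times the mean of the $X_i$. To find the variance of Y, we can use the formula

$$\sigma_Y^2 = \langle Y^2 \rangle - \langle Y \rangle^2. \tag{3.37}$$

Hence

$$\begin{aligned} \langle Y^2 \rangle &= \langle X_1^2 + \cdots + X_n^2 + X_1X_2 + X_2X_1 + X_1X_3 + \cdots \rangle \\ &= \langle X_1^2 \rangle + \cdots + \langle X_n^2 \rangle + \langle X_1X_2 \rangle + \langle X_2X_1 \rangle + \langle X_1X_3 \rangle + \cdots \end{aligned} \tag{3.38}$$

There are n terms like $\langle X_1^2 \rangle$ on the right-hand side, and n(n − 1) terms like $\langle X_1X_2 \rangle$. The former terms take the value $\langle X^2 \rangle$ and the latter terms (because they are the product of two independent random variables) take the value $\langle X \rangle\langle X \rangle = \langle X \rangle^2$. Hence, using eqn 3.36,

$$\langle Y^2 \rangle = n\langle X^2 \rangle + n(n-1)\langle X \rangle^2, \tag{3.39}$$

so that

$$\sigma_Y^2 = \langle Y^2 \rangle - \langle Y \rangle^2 = n\langle X^2 \rangle - n\langle X \rangle^2 = n\sigma_X^2. \tag{3.40}$$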
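The results of Example 3.7 can be verified exactly for a small case by building the distribution of the sum explicitly. The following is a minimal sketch under stated assumptions: the single-variable distribution is the one from Example 3.2, n = 5 is an arbitrary choice, and the function names are hypothetical.

```python
from fractions import Fraction


def add_independent(dist_a, dist_b):
    """Distribution of the sum of two independent discrete variables.

    Each distribution is a dict {value: probability}; independence means
    the joint probability of a pair of values is the product pa * pb.
    """
    out = {}
    for xa, pa in dist_a.items():
        for xb, pb in dist_b.items():
            out[xa + xb] = out.get(xa + xb, 0) + pa * pb
    return out


def mean_and_variance(dist):
    m = sum(x * p for x, p in dist.items())
    m2 = sum(x * x * p for x, p in dist.items())
    return m, m2 - m * m


single = {0: Fraction(1, 2), 1: Fraction(1, 4), 2: Fraction(1, 4)}
n = 5  # arbitrary number of variables in the sum

total = {0: Fraction(1)}  # distribution of an empty sum: Y = 0 with certainty
for _ in range(n):
    total = add_independent(total, single)

mean_x, var_x = mean_and_variance(single)
mean_y, var_y = mean_and_variance(total)

print(mean_y, n * mean_x)  # <Y> = n <X>
print(var_y, n * var_x)    # sigma_Y^2 = n sigma_X^2
```

The variance of the sum grows linearly with n, so its standard deviation grows only as $\sqrt{n}$.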

The results proved in this last example have some interesting applications. The first concerns experimental measurements. Imagine that a quantity X is measured n times, each time with an independent error, which we call $\sigma_X$. If you add up the results of the measurements to make $Y = \sum_i X_i$, then the rms error in Y is only $\sqrt{n}$ times the rms error of a single X. Hence if you try to get a good estimate of X by calculating $(\sum_i X_i)/n$, the error in this quantity is equal to $\sigma_X/\sqrt{n}$. Thus, for example, if you make four measurements of a quantity and average your results, the random error in your average is half of what it would be if you’d just taken a single measurement. Of course, you may still have systematic errors in your experiment. If you are consistently overestimating your quantity by an error in your experimental setup, that error won’t reduce by repeated measurement!

A second application is in the theory of random walks. Imagine a drunken person staggering out of a pub and attempting to walk along a narrow street (which confines him or her to motion in one dimension). Let’s pretend that with each inebriated step, the drunken person is equally likely to travel one step forward or one step backward. The effects of intoxication are such that each step is uncorrelated with the previous one. Thus the average distance travelled in a single step is $\langle X \rangle = 0$. After n such steps, we would have an expected total distance travelled of $\langle Y \rangle = \sum_i \langle X_i \rangle = 0$. However, in this case the root mean squared distance is more revealing. In this case $\langle Y^2 \rangle = n\langle X^2 \rangle$, so that the rms length of a random walk of n steps is $\sqrt{n}$ times the length of a single step. This result will be useful in considering Brownian motion in Chapter 33.
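The random-walk result can be checked without any random sampling, because the position after n unbiased unit steps follows a binomial distribution. The sketch below is illustrative; the function names are hypothetical, the step length is taken as one unit, and `math.comb` requires Python 3.8 or later.

```python
from fractions import Fraction
from math import comb, sqrt


def walk_distribution(n):
    """Exact distribution of the position Y after n unit steps.

    With k forward steps out of n, Y = k - (n - k) = 2k - n, which occurs
    with probability C(n, k) / 2^n for an unbiased walk.
    """
    return {2 * k - n: Fraction(comb(n, k), 2 ** n) for k in range(n + 1)}


def mean_position(n):
    return sum(y * p for y, p in walk_distribution(n).items())


def rms_distance(n):
    mean_sq = sum(y * y * p for y, p in walk_distribution(n).items())
    return sqrt(mean_sq)


for n in (1, 4, 100):
    print(n, mean_position(n), rms_distance(n), sqrt(n))
```

For each n the mean position is zero while the rms distance equals $\sqrt{n}$ step lengths, so a walk of 100 steps typically ends about 10 steps from where it started.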

Chapter summary

• In this chapter, several introductory concepts in probability theory have been introduced.

• The mean of a discrete probability distribution is given by $\langle x \rangle = \sum_i x_i P_i$, and the mean of a continuous probability distribution is given by $\langle x \rangle = \int x\,P(x)\,dx$.

• The variance is given by $\sigma_x^2 = \langle (x - \langle x \rangle)^2 \rangle$, where $\sigma_x$ is the standard deviation.

• If y = ax + b, then $\langle y \rangle = a\langle x \rangle + b$ and $\sigma_y = |a|\,\sigma_x$.

• If u and v are independent random variables, then $\langle uv \rangle = \langle u \rangle\langle v \rangle$. In particular, if $Y = X_1 + X_2 + \cdots + X_n$, where the X’s are independent and all from the same distribution, $\langle Y \rangle = n\langle X \rangle$ and $\sigma_Y = \sqrt{n}\,\sigma_X$.

Further reading

There are many good books on probability theory and statistics. Recommended ones include Papoulis (1984), Wall and Jenkins (2003) and Sivia and Skilling (2006).

Exercises

(3.1) A throw of a regular die yields the numbers 1, 2, . . . , 6, each with probability 1/6. Find the mean, variance and standard deviation of the numbers obtained.

(3.2) The mean birth weight of babies in the UK is about 3.2 kg with a standard deviation of 0.5 kg. Convert these figures into pounds (lb), given that 1 kg = 2.2 lb.

(3.3) This question is about a discrete probability distribution known as the Poisson distribution. Let x be a discrete random variable which can take the values 0, 1, 2, . . . A quantity is said to be Poisson distributed if one obtains the value x with probability

$$P(x) = \frac{e^{-m} m^x}{x!},$$

where m is a particular number (which we will show in part (b) of this exercise is the mean value of x).

(a) Show that P(x) is a well-behaved probability distribution in the sense that

$$\sum_{x=0}^{\infty} P(x) = 1.$$

(Why is this condition important?)

(b) Show that the mean value of the probability distribution is $\langle x \rangle = \sum_{x=0}^{\infty} x P(x) = m$.

(c) The Poisson distribution is useful for describing very rare events which occur independently and whose average rate does not change over the period of interest. Examples include birth defects measured per year, traffic accidents at a particular junction per year, numbers of typographical errors on a page, and number of activations of a Geiger counter per minute. The first recorded example of a Poisson distribution, the one which in fact motivated Poisson, was connected with the rare event of someone being kicked to death by a horse in the Prussian army. The number of horse-kick deaths of Prussian military personnel was recorded for each of 10 corps in each of 20 years from 1875–1894 and the following data recorded:

Number of deaths per year per corps    Observed frequency
0                                      109
1                                      65
2                                      22
3                                      3
4                                      1
≥5                                     0
Total                                  200

Calculate the mean number of deaths per year per corps. Compare the observed frequency with a calculated frequency assuming the number of deaths per year per corps are Poisson distributed with this mean.

(3.4) This question is about a continuous probability distribution known as the exponential distribution. Let x be a continuous random variable which can take any value x ≥ 0. A quantity is said to be exponentially distributed if it takes values between x and x + dx with probability

$$P(x)\,dx = A e^{-x/\lambda}\,dx$$

where λ and A are constants.

(a) Find the value of A that makes P(x) a well-defined continuous probability distribution so that $\int_0^{\infty} P(x)\,dx = 1$.

(b) Show that the mean value of the probability distribution is $\langle x \rangle = \int_0^{\infty} x P(x)\,dx = \lambda$.

(c) Find the variance and standard deviation of this probability distribution.

Both the exponential distribution and the Poisson distribution are used to describe similar processes, but for the exponential distribution x is the actual time between, for example, successive radioactive decays, successive molecular collisions, or successive horse-kicking incidents (rather than, as with the Poisson distribution, x being simply the number of such events in a specified interval).

(3.5) If θ is a continuous random variable which is uniformly distributed between 0 and π, write down an expression for P(θ). Hence find the value of the following averages:
(i) $\langle \theta \rangle$;
(ii) $\langle \theta - \tfrac{\pi}{2} \rangle$;
(iii) $\langle \theta^2 \rangle$;
(iv) $\langle \theta^n \rangle$ (for the case n ≥ 0);
(v) $\langle \cos\theta \rangle$;
(vi) $\langle \sin\theta \rangle$;
(vii) $\langle |\cos\theta| \rangle$;
(viii) $\langle \cos^2\theta \rangle$;
(ix) $\langle \sin^2\theta \rangle$;
(x) $\langle \cos^2\theta + \sin^2\theta \rangle$.
Check that your answers are what you expect.

(3.6) In experimental physics, it is important to repeat measurements. Assuming that errors are random, show that if the error in making a single measurement of a quantity X is ∆, the error obtained after using n measurements is $\Delta/\sqrt{n}$. (Hint: After n measurements, the procedure would be to take the n results and average them. So you require the standard deviation of the quantity $Y = (X_1 + X_2 + \cdots + X_n)/n$ where $X_1, X_2, \ldots, X_n$ can be assumed to be independent, and each has standard deviation ∆.)

Ludwig Boltzmann (1844–1906)

Ludwig Boltzmann made major contributions to the applications of probability to thermal physics. He worked out much of the kinetic theory of gases independently of Maxwell, and together they share the credit for the Maxwell–Boltzmann distribution (see Chapter 5). Boltzmann was very much in awe of Maxwell all his life, and was one of the first to see the significance of Maxwell’s theory of electromagnetism. “Was it a god who wrote these lines?” was Boltzmann’s comment (quoting Goethe) on Maxwell’s work. Boltzmann’s great insight was to recognize the statistical connection between thermodynamic entropy and the number of microstates, and through a series of technical papers was able to put the subject of statistical mechanics on a firm footing (his work was, independently, substantially extended by the American physicist Gibbs). Boltzmann was able to show that the second law of thermodynamics (considered in Part IV of this book) could be derived from the principles of classical mechanics, although the fact that classical mechanics makes no distinction between the direction of time meant that he had to smuggle in some assumptions that mired his approach in some controversy. However, his derivation of what is known as the Boltzmann transport equation, which extends the ideas of the kinetic theory of gases, led to important developments in the electron transport theory of metals and in plasma physics. Boltzmann also showed how to derive from the principles of thermodynamics the empirical law discovered by his teacher, Josef Stefan, which stated that the total radiation from a hot body was proportional to the fourth power of its absolute temperature (see Chapter 23).

Fig. 3.3 Ludwig Boltzmann

Boltzmann was born in Vienna and did his doctorate in the kinetic theory of gases at the University of Vienna under the supervision of Stefan. His subsequent career took him to Graz, Heidelberg, Berlin, then Vienna again, back to Graz, then Vienna, Leipzig, and finally back to Vienna. His own temperament was in accord with this physical restlessness and lack of stability. The moving around was also partly due to his difficult relationships with various other physicists, particularly Ernst Mach, who was appointed to a chair in Vienna (which occasioned Boltzmann’s move to Leipzig in 1900), and Wilhelm Ostwald (whose opposition in Leipzig, together with Mach’s retirement in 1901, motivated Boltzmann’s return to Vienna in 1902, although not before Boltzmann had attempted suicide).

The notions of irreversibility inherent in thermodynamics led to some controversial implications, particularly to a Universe based on Newtonian mechanics which are reversible in time. Boltzmann’s approach used probability to understand how the behaviour of atoms determined the properties of matter. Ostwald, a physical chemist, who had himself recognized the importance of Gibbs’ work (see Chapters 16, 20 and 22) to the extent that he had translated Gibbs’ papers into German, was nevertheless a vigorous opponent of theories that involved what he saw as unmeasurable quantities. Ostwald was one of the last opponents of atomism, and became a dedicated opponent of Boltzmann. Ostwald himself was finally convinced of the validity of atoms nearly a decade after Boltzmann’s death, by which time Ostwald had been awarded a Nobel Prize, in 1909, for his work on catalysis. Boltzmann died just before his atomistic viewpoint became obviously vindicated and universally accepted.

Boltzmann had suffered from depression and mood swings throughout his life. On holiday in Italy in 1906, Ludwig Boltzmann hanged himself while his wife and daughter were swimming. His famous equation relating entropy S with number of microstates W (Ω in this book) is

$$S = k \log W \tag{3.41}$$

and is engraved on his tombstone in Vienna. The constant k is called the Boltzmann constant, and is written as $k_{\mathrm{B}}$ in this book.

4

Temperature and the Boltzmann factor

4.1 Thermal equilibrium
4.2 Thermometers
4.3 The microstates and macrostates
4.4 A statistical definition of temperature
4.5 Ensembles
4.6 Canonical ensemble
4.7 Applications of the Boltzmann distribution
Chapter summary
Further reading
Exercises

Fig. 4.1 (a) Two objects at different temperatures. (b) The objects are now placed in thermal contact and heat flows from the hot object to the cold object. (c) After a long time, the two objects have the same final temperature $T_{\mathrm{f}}$.

In this chapter, we will explore the concept of temperature and show how it can be deﬁned in a statistical manner. This leads to the idea of a Boltzmann distribution and a Boltzmann factor. Now of course the concept of temperature seems such an intuitively obvious one that you might wonder why we need a whole chapter to discuss it. Temperature is simply a measure of ‘hotness’ or ‘coldness’, so that we say that a hot body has a higher temperature than a cold one. For example, as shown in Fig. 4.1(a) if an object has temperature T1 and is hotter than a second body with temperature T2 , we expect that T1 > T2 . But what do these numbers T1 and T2 signify? What does temperature actually mean?

4.1

Thermal equilibrium

To begin to answer these questions, let us consider what happens if our hot and cold bodies are placed in thermal contact, which means that they are able to exchange energy. As described in Chapter 2, heat is ‘energy in transit’ and experiment suggests that, if nothing else is going on,1 heat will always flow from the hotter body to the colder body, as shown in Fig. 4.1(b). This is backed up by our experience of the world: we always seem to burn ourselves when we touch something very hot (heat flows into us from the hot object) and become very chilled when we touch something very cold (heat flows out of us into the cold object). As heat flows from the hotter body to the colder body, we expect that the energy content and the temperatures of the two bodies will each change with time. After some time being in thermal contact, we reach the situation in Fig. 4.1(c). The macroscopic properties of the two bodies are now no longer changing with time. If any energy flows from the first body to the second body, this is equal to the energy flowing from the second body to the first body; thus, there is no net heat flow between the two bodies. The two bodies are said to be in thermal equilibrium, which is defined by saying that the energy content and the temperatures of the two bodies will no longer be changing with time. We would expect that the two bodies in thermal equilibrium are now at the same temperature.

It seems that something irreversible has happened. Once the two bodies were put in thermal contact, the change from Fig. 4.1(b) to Fig. 4.1(c) proceeds inevitably. However, if we started with two bodies at the same temperature and placed them in thermal contact as in Fig. 4.1(c), the reverse process, i.e. ending up with Fig. 4.1(b), would not occur.2 Thus as a function of time, systems in thermal contact tend towards thermal equilibrium, rather than away from it. The process that leads to thermal equilibrium is called thermalization. If various bodies are all in thermal equilibrium with each other, then we would expect that their temperatures should be the same. This idea is encapsulated in the zeroth law of thermodynamics, which states that

Zeroth law of thermodynamics: Two systems, each separately in thermal equilibrium with a third, are in equilibrium with each other.

You can tell by the numbering of the law that although it is an assumption that comes before the other laws of thermodynamics, it was added after the first three laws had been formulated. Early workers in thermodynamics took the content of the zeroth law as so obvious it hardly needed stating, and you might well agree with them! Nevertheless, the zeroth law gives us some justification for how to actually measure temperature: we place the body whose temperature needs to be measured in thermal contact with a second body which displays some property which has a well-known dependence on temperature and wait for them to come into thermal equilibrium. The second body is called a thermometer. The zeroth law then guarantees that if we have calibrated this second body against any other standard thermometer, we should always get consistent results. Thus, a more succinct statement of the zeroth law3 is: ‘thermometers work’.

1 This is assuming that no additional power is being fed into the systems, such as occurs in the operation of a refrigerator which sucks heat out of the cold interior and dumps it into your warmer kitchen, but only because you are supplying electrical power.

2 Thermal processes thus define an arrow of time. We will return to this point later in Section 34.5.

3 This version is from our colleague M.G. Bowler.

4.2

Thermometers

We now make some remarks concerning thermometers.

• For a thermometer to work well, its heat capacity must be much lower than that of the object whose temperature one wants to measure. If this is not the case, the action of measurement (placing the thermometer in thermal contact with the object) could alter the temperature of the object.

• A common type of thermometer utilizes the fact that liquids expand when they are heated. Galileo Galilei used a water thermometer based on this principle in 1593, but it was Daniel Gabriel Fahrenheit (1686–1736) who devised thermometers based on alcohol (1709) and mercury (1714) that bear most resemblance to modern household thermometers. He introduced his famous temperature scale which was then superseded by the more logical scheme devised by Anders Celsius (1701–1744).

• Another method is to measure the electrical resistance of a material which has a well-known dependence of resistance on temperature. Platinum is a popular choice since it is chemically resistant, ductile (so can be easily drawn into wires) and has a large temperature coefficient of resistance; see Fig. 4.2. Other commonly used thermometers are based on doped germanium (a semiconductor which is very stable after repeated thermal cycling), carbon sensors and RuO2 (in contrast to platinum, the electrical resistance of these thermometers increases as they are cooled; see Fig. 4.3).

• Using the ideal gas equation (eqn 1.12), one can measure the temperature of a gas by measuring its pressure with its volume fixed (or by measuring its volume with its pressure fixed). This works well as far as the ideal gas equation works, although at very low temperature, gases liquefy and show departures from the ideal gas equation.

• Another method which is useful in cryogenics is to have a liquid coexisting with its vapour and to measure the vapour pressure. For example, liquid helium ($^4$He, the most common isotope) has a vapour pressure dependence on temperature which is shown in Fig. 4.4.

Fig. 4.2 The temperature dependence of the resistance of a typical platinum sensor.

Fig. 4.3 The temperature dependence of the resistance of a typical RuO2 sensor.

Fig. 4.4 The vapour pressure of $^4$He as a function of temperature. The dashed line labels atmospheric pressure and the corresponding boiling point for liquid $^4$He.

4 We will introduce the Carnot engine in Section 13.2. The definition of temperature which arises from this is based upon eqn 13.7 and states that the ratio of the temperature of a body to the heat flow from it is a constant in a reversible Carnot cycle.

All of these methods use some measurable property, like resistance or pressure, which depends in some, sometimes complicated, manner on temperature. However, none of them are completely linear across the entire temperature range of interest: mercury solidifies at very low temperature and becomes gaseous at very high temperature, the resistance of platinum saturates at very low temperature and platinum wire melts at very high temperature, etc. But against what standard thermometer can one possibly assess the relative merits of these different thermometers? Which thermometer is perfect and gives the real thing, against which all other thermometers should be judged? It is clear that we need some absolute definition of temperature based on fundamental physics. In the nineteenth century, one such definition was found, and it was based on a hypothetical machine, which has never been built, called a Carnot engine.4 Subsequently, it was found that temperature could be defined in terms of a purely statistical argument using ideas from probability theory, and this is the definition we will use; we introduce it in Section 4.4. In the following section we will introduce the terminology of microstates and macrostates that will be needed for this argument.

4.3

The microstates and macrostates

To make the distinction between microstates and macrostates, consider the following example.

Example 4.1 Imagine that you have a large box containing 100 identical coins. With the lid on the box, you give it a really good long and hard shake, so that you can hear the coins flipping, rattling and being generally tossed around. Now you open the lid and look inside the box. Some of the coins will be lying with heads facing up and some with tails facing up. There are lots of possible configurations that one could achieve ($2^{100}$ to be precise, which is approximately $10^{30}$) and we will assume that each of these different configurations is equally likely. Each possible configuration therefore has a probability of approximately $10^{-30}$. We will call each particular configuration a microstate of this system. An example of one of these microstates would be: ‘Coin number 1 is heads, coin number 2 is heads, coin number 3 is tails, etc’. To identify a microstate, you would somehow need to identify each coin individually, which would be a bit of a bore. However, probably the way you would categorize the outcome of this experiment is by simply counting the number of coins which are heads and the number which are tails (e.g. 53 heads and 47 tails). This sort of categorization we call a macrostate of this system. The macrostates are not equally likely. For example, of the ≈ $10^{30}$ possible individual configurations (microstates),

# of configurations with 50 heads and 50 tails = $\dfrac{100!}{(50!)^2} \approx 1 \times 10^{29}$,

# of configurations with 53 heads and 47 tails = $\dfrac{100!}{53!\,47!} \approx 8 \times 10^{28}$,

# of configurations with 90 heads and 10 tails = $\dfrac{100!}{90!\,10!} \approx 10^{13}$,

# of configurations with 100 heads and 0 tails = 1.

Thus, the outcome with all 100 coins with heads facing up is a very unlikely outcome. This macrostate contains a single microstate. If that were the result of the experiment, you would probably conclude that (i) your shaking had not been very vigorous and that (ii) someone had carefully prepared the coins to be lying heads up at the start of the experiment. Of course, a particular microstate with 53 heads and 47 tails is just as unlikely; it is just that there are about $8 \times 10^{28}$ other microstates which have 53 heads and 47 tails which look extremely similar.
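The counting in Example 4.1 is easy to reproduce exactly, since Python integers have arbitrary precision. The sketch below is illustrative rather than part of the text: the function name `microstates` is hypothetical, and `math.comb` requires Python 3.8 or later.

```python
from math import comb

N = 100                     # number of coins in the box
total_microstates = 2 ** N  # every coin is either heads or tails


def microstates(heads, n=N):
    """Number of microstates in the macrostate with `heads` heads.

    This is the binomial coefficient n! / (heads! (n - heads)!).
    """
    return comb(n, heads)


for heads in (50, 53, 90, 100):
    omega = microstates(heads)
    print(f"{heads:3d} heads: {omega:.3e} microstates, "
          f"probability {omega / total_microstates:.3e}")
```

Summing `microstates(h)` over all h from 0 to 100 recovers $2^{100}$, which confirms that the macrostates together account for every microstate exactly once.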

This simple example shows two crucial points:

• The system could be described by a very large number of equally likely microstates.

• What you actually measure5 is a property of the macrostate of the system. The macrostates are not equally likely, because different macrostates correspond to different numbers of microstates. The most likely macrostate that the system will find itself in is the one which corresponds to the largest number of microstates.

5 In our example, the measurement was opening the large box and counting the number of coins which were heads and those which were tails.

Thermal systems behave in a very similar way to the example we have just considered. To specify a microstate for a thermal system, you would need to give the microscopic configurations (perhaps position and velocity, or perhaps energy) of each and every atom in the system. In general it is impossible to measure which microstate the system is in. The macrostate of a thermal system on the other hand would be specified only by giving the macroscopic properties of the system, such as the pressure, the total energy or the volume. A macroscopic configuration, such as a gas with pressure $10^5$ Pa in a volume 1 m$^3$, would be associated with an enormous number of microstates. In the next section, we are going to give a statistical definition of temperature which is based on the idea that a thermal system can have a large number of equally likely microstates, but you are only able to measure the macrostate of the system. At this stage, we are not going to worry about what the microstates of the system actually are; we are simply going to posit their existence and say that if the system has energy E, then it could be in any one of Ω(E) equally likely microstates, where Ω(E) is some enormous number.

4.4 A statistical definition of temperature

We return to our example of Section 4.1 and consider two large systems which can exchange energy with each other, but not with anything else (Fig. 4.5). In other words, the two systems are in thermal contact with each other, but thermally isolated from their surroundings. The first system has energy $E_1$ and the second system has energy $E_2$. The total energy $E = E_1 + E_2$ is therefore assumed fixed since the two systems cannot exchange energy with anything else. Hence the value of $E_1$ is enough to determine the macrostate of this joint system.

Fig. 4.5 Two systems able to exchange energy between them.

Each of these systems can be in a number of possible microstates. This number of possible microstates could in principle be calculated as in Section 1.4 (and in particular, Example 1.3) and will be a very large, combinatorial number, but we will not worry about the details of this. Let us assume that the first system can be in any one of $\Omega_1(E_1)$ microstates and the second system can be in any one of $\Omega_2(E_2)$ microstates. Thus the whole system can be in any one of $\Omega_1(E_1)\Omega_2(E_2)$ microstates.⁶

⁶ We use the product of the two quantities, $\Omega_1(E_1)$ and $\Omega_2(E_2)$, because for each of the $\Omega_1(E_1)$ states of the first system, the second system can be in any of its $\Omega_2(E_2)$ different states. Hence the total number of possible combined states is the product of $\Omega_1(E_1)$ and $\Omega_2(E_2)$.

The systems are able to exchange energy with each other, and we will assume that they have been left in the condition of being joined together for a sufficiently long time that they have come into thermal equilibrium. This means that $E_1$ and $E_2$ have come to fixed values. The crucial insight which we must make is that a system will appear to choose a macroscopic configuration which maximizes the number of microstates. This idea is based upon the following assumptions:

(1) each one of the possible microstates of a system is equally likely to occur;
(2) the system's internal dynamics are such that the microstates of the system are continually changing;
(3) given enough time, the system will explore all possible microstates and spend an equal time in each of them.⁷

⁷ This is the so-called ergodic hypothesis.

These assumptions imply that the system will most likely be found in a configuration which is represented by the most microstates. For a large system our phrase 'most likely' becomes 'absolutely, overwhelmingly likely'; what appears at first sight to be a somewhat weak, probabilistic statement (perhaps on the same level as a five-day weather forecast) becomes an utterly reliable prediction on whose basis you can design an aircraft engine and trust your life to it!

For our problem of two connected systems, the most probable division of energy between the two systems is the one which maximizes $\Omega_1(E_1)\Omega_2(E_2)$, because this will correspond to the greatest number of possible microstates. Our systems are large and hence we can use calculus to study their properties; we can therefore consider making infinitesimal changes to the energy of one of the systems and seeing what happens. Therefore, we can maximize this expression with respect to $E_1$ by writing

$$\frac{d}{dE_1}\bigl(\Omega_1(E_1)\,\Omega_2(E_2)\bigr) = 0 \tag{4.1}$$

and hence, using standard rules for differentiation of a product, we have

$$\Omega_2(E_2)\,\frac{d\Omega_1(E_1)}{dE_1} + \Omega_1(E_1)\,\frac{d\Omega_2(E_2)}{dE_2}\,\frac{dE_2}{dE_1} = 0. \tag{4.2}$$

Since the total energy $E = E_1 + E_2$ is assumed fixed, this implies that

$$dE_1 = -dE_2, \tag{4.3}$$

and hence

$$\frac{dE_2}{dE_1} = -1, \tag{4.4}$$

so that eqn 4.2 becomes

$$\frac{1}{\Omega_1}\frac{d\Omega_1}{dE_1} - \frac{1}{\Omega_2}\frac{d\Omega_2}{dE_2} = 0, \tag{4.5}$$

and hence

$$\frac{d\ln\Omega_1}{dE_1} = \frac{d\ln\Omega_2}{dE_2}. \tag{4.6}$$

This condition defines the most likely division of energy between the two systems if they are allowed to exchange energy since it maximizes the total number of microstates. This division of energy is, of course, more usually called 'being at the same temperature', and so we identify $d\ln\Omega/dE$ with the temperature $T$ (so that our two systems have $T_1 = T_2$). We will define the temperature $T$ by

$$\frac{1}{k_{\rm B}T} = \frac{d\ln\Omega}{dE}, \tag{4.7}$$

where $k_{\rm B}$ is the Boltzmann constant, which is given by

$$k_{\rm B} = 1.3807\times10^{-23}\ {\rm J\,K^{-1}}. \tag{4.8}$$

We will see later (Section 14.5) that in statistical mechanics, the quantity $k_{\rm B}\ln\Omega$ is called the entropy, $S$, and hence eqn 4.7 is equivalent to $\dfrac{1}{T} = \dfrac{dS}{dE}$.

With this choice of constant, T has its usual interpretation and is measured in Kelvin. We will show in later chapters that this choice of deﬁnition leads to experimentally veriﬁable consequences, such as the correct expression for the pressure of a gas.
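The maximization argument of this section can be checked numerically on a toy model. The sketch below is only an illustration under stated assumptions: each system is taken to be a collection of two-level units, so that the number of microstates is the binomial coefficient quoted in Exercise 4.3, energies are counted in whole quanta, and the names `ln_omega`, `most_likely_split` and `beta_estimate` are hypothetical helpers introduced here. It searches for the division of quanta that maximizes $\Omega_1\Omega_2$ and then compares finite-difference estimates of $d\ln\Omega/dE$ for the two systems.

```python
from math import comb, log


def ln_omega(n_units: int, quanta: int) -> float:
    """ln of the number of ways to place `quanta` excitations on `n_units` two-level units."""
    return log(comb(n_units, quanta))


def most_likely_split(n1: int, n2: int, total_quanta: int) -> int:
    """Number of quanta in system 1 that maximizes Omega_1 * Omega_2 at fixed total."""
    lowest = max(0, total_quanta - n2)
    highest = min(n1, total_quanta)
    return max(
        range(lowest, highest + 1),
        key=lambda r1: ln_omega(n1, r1) + ln_omega(n2, total_quanta - r1),
    )


def beta_estimate(n_units: int, quanta: int) -> float:
    """Centred finite-difference estimate of d ln(Omega)/dE, with E measured in quanta."""
    return (ln_omega(n_units, quanta + 1) - ln_omega(n_units, quanta - 1)) / 2.0


# Assumed sizes: 300 and 700 units sharing 200 quanta in total.
n1, n2, total = 300, 700, 200
r1 = most_likely_split(n1, n2, total)
print("most likely quanta in system 1:", r1)
print("d ln(Omega_1)/dE_1 ~", beta_estimate(n1, r1))
print("d ln(Omega_2)/dE_2 ~", beta_estimate(n2, total - r1))
```

For these assumed sizes the two slopes agree to within the coarseness of a one-quantum step, which is the discrete counterpart of eqn 4.6.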

4.5 Ensembles

We are using probability to describe thermal systems and our approach is to imagine repeating an experiment to measure a property of a system again and again because we cannot control the microscopic properties (as described by the system's microstates). In an attempt to formalize this, Josiah Willard Gibbs in 1878 introduced a concept known as an ensemble. This is an idealization in which one considers making a large number of mental 'photocopies' of the system, each one of which represents a possible state the system could be in. There are three main ensembles that tend to be used in thermal physics:

(1) The microcanonical ensemble: an ensemble of systems that each have the same fixed energy.
(2) The canonical ensemble: an ensemble of systems, each of which can exchange its energy with a large reservoir of heat. As we shall see, this fixes (and defines) the temperature of the system.
(3) The grand canonical ensemble: an ensemble of systems, each of which can exchange both energy and particles with a large reservoir. (This fixes the system's temperature and a quantity known as the system's chemical potential. We will not consider this again until Chapter 22 and it can be ignored for the present.)

In the next section we will consider the canonical ensemble in more detail and use it to derive the probability of a system at a fixed temperature being in a particular microstate.

4.6 Canonical ensemble

We now consider two systems coupled as before in such a way that they can exchange energy (Fig. 4.6). This time, we will make one of them enormous, and call it the reservoir (also known as a heat bath). It is so large that you can take quite a lot of energy out of it and yet it can remain at essentially the same temperature. In the same way, if you stand on the sea shore and take an eggcup-full of water out of the ocean, you do not notice the level of the ocean going down (although it does in fact go down, but by an unmeasurably small amount). The number of ways of arranging the quanta of energy of the reservoir will therefore be colossal. The other system is small and will be known as the system.

Fig. 4.6 A large reservoir (or heat bath) at temperature $T$ connected to a small system.

We will assume that for each allowed energy of the system there is only a single microstate, and therefore the system has a value of $\Omega$ equal to one. Once again, we fix⁸ the total energy of the system plus reservoir to be $E$. The energy of the reservoir is taken to be $E - \epsilon$ while the energy of the system is taken to be $\epsilon$. This situation of a system in thermal contact with a large reservoir is very important and is known as the canonical ensemble.⁹

⁸ We thus treat the system plus reservoir as being in what is known as the microcanonical ensemble, which has fixed energy with each of its microstates being equally likely.

⁹ 'Canonical' means part of the 'canon', the store of generally accepted things one should know. It's an odd word, but we're stuck with it. Focussing on a system whose energy is not fixed, but which can exchange energy with a big reservoir, is something we do a lot in thermal physics and is therefore in some sense canonical.

The probability $P(\epsilon)$ that the system has energy $\epsilon$ is proportional to the number of microstates which are accessible to the reservoir multiplied by the number of microstates which are accessible to the system. This is therefore

$$P(\epsilon) \propto \Omega(E - \epsilon) \times 1. \tag{4.9}$$

Since we have an expression for temperature in terms of the logarithm of $\Omega$ (eqn 4.7), and since $\epsilon \ll E$, we can perform a Taylor expansion¹⁰ of $\ln\Omega(E - \epsilon)$ around $\epsilon = 0$, so that

$$\ln\Omega(E - \epsilon) = \ln\Omega(E) - \frac{d\ln\Omega(E)}{dE}\,\epsilon + \cdots \tag{4.10}$$

¹⁰ See Appendix B.

and so now using eqn 4.7, we have

$$\ln\Omega(E - \epsilon) = \ln\Omega(E) - \frac{\epsilon}{k_{\rm B}T} + \cdots, \tag{4.11}$$

where $T$ is the temperature of the reservoir. In fact, we can neglect the further terms in the Taylor expansion (see Exercise 4.4) and hence eqn 4.11 becomes

$$\Omega(E - \epsilon) = \Omega(E)\,e^{-\epsilon/k_{\rm B}T}. \tag{4.12}$$

Using eqn 4.9 we thus arrive at the following result for the probability distribution describing the system which is given by

$$P(\epsilon) \propto e^{-\epsilon/k_{\rm B}T}. \tag{4.13}$$

Since the system is now in equilibrium with the reservoir, it also must have the same temperature as the reservoir. But notice that although the system therefore has fixed temperature $T$, its energy $\epsilon$ is not a constant but is governed by the probability distribution in eqn 4.13 (and is plotted in Fig. 4.7). This is known as the Boltzmann distribution and also as the canonical distribution. The term $e^{-\epsilon/k_{\rm B}T}$ is known as a Boltzmann factor.

Fig. 4.7 The Boltzmann distribution. The dashed curve corresponds to a higher temperature than the solid curve.

We now have a probability distribution which describes exactly how a small system behaves when coupled to a large reservoir at temperature $T$. The system has a reasonable chance of achieving an energy $\epsilon$ which is less than $k_{\rm B}T$, but the exponential in the Boltzmann distribution begins to quickly reduce the probability of achieving an energy much greater than $k_{\rm B}T$. However, to quantify this properly we need to normalize the probability distribution. If a system is in contact with a reservoir and has a microstate $r$ with energy $E_r$, then

$$P(\text{microstate } r) = \frac{e^{-E_r/k_{\rm B}T}}{\sum_i e^{-E_i/k_{\rm B}T}}, \tag{4.14}$$

where the sum in the denominator makes sure that the probability is normalized. The sum in the denominator is called the partition function and is given the symbol $Z$. (The partition function is the subject of Chapter 20.)

We have derived the Boltzmann distribution on the basis of statistical arguments which show that this distribution of energy maximizes the number of microstates. It is instructive to verify this for a small system, so the following example presents the results of a computer experiment to demonstrate the validity of the Boltzmann distribution.

Example 4.2 To illustrate the statistical nature of the Boltzmann distribution, let us play a game in which quanta of energy are distributed in a lattice. We choose a lattice of 400 sites, arranged for convenience on a 20×20 grid. Each site initially contains a single energy quantum, as shown in Fig. 4.8(a). The adjacent histogram shows that there are 400 sites with one quantum on each. We now choose a site at random and remove the quantum from that site and place it on a second, randomly-chosen site. The resulting distribution is shown in Fig. 4.8(b), and the histogram shows that we now have 398 sites each with 1 quantum, 1 site with no quanta and 1 site with two quanta. This redistribution process is repeated many times and the resulting distribution is as shown in Fig. 4.8(c). The histogram describing this looks very much like a Boltzmann exponential distribution.

The initial distribution shown in Fig. 4.8(a) is very equitable and gives a distribution of energy quanta between sites of which Karl Marx would have been proud. It is however very statistically unlikely because it is associated with only a single microstate, i.e. $\Omega = 1$. There are many more microstates associated with other macrostates, as we shall now show. For example, the state obtained after a single iteration, such as the one shown in Fig. 4.8(b), is much more likely, since there are 400 ways to choose the site from which a quantum has been removed, and then 399 ways to choose the site to which a quantum is added; hence $\Omega = 400 \times 399 = 159600$ for this histogram (which contains 398 singly occupied sites, one site with zero quanta and one site with two quanta). The state obtained after many iterations in Fig. 4.8(c) is much, much more likely to occur if quanta are allowed to rearrange randomly as the number of microstates associated with the Boltzmann distribution is absolutely enormous. The Boltzmann distribution is simply a matter of probability.

In the model considered in this example, the rôle of temperature is played by the total number of energy quanta in play. So, for example, if instead the initial arrangement had been two quanta per site rather than one quantum per site, then after many iterations one would obtain the arrangement shown in Fig. 4.8(d). Since the initial arrangement has more energy, the final state is a Boltzmann distribution with a higher temperature (leading to more sites with more energy quanta).


Fig. 4.8 Energy quanta distributed on a 20×20 lattice. (a) In the initial state, one quantum is placed on each site. (b) A site is chosen at random and a quantum is removed from that site and placed on a second randomly-chosen site. (c) After many repetitions of this process, the resulting distribution resembles a Boltzmann distribution. (d) The analogous ﬁnal distribution following redistribution from an initial state with two quanta per site. The adjacent histogram in each case shows how many quanta are placed on each site.

Let us now start with a bigger lattice, now containing $10^6$ sites, and place a quantum of energy on each site. We randomly move quanta from site to site as before, and in our computer program we let this proceed for a large number of iterations (in this case $10^{10}$). The resulting distribution is shown in Fig. 4.9, which displays a graph on a logarithmic scale of the number of sites $N$ with $n$ quanta. The straight line is a fit to the expected Boltzmann distribution. This example is considered in more detail in the exercises.

Fig. 4.9 The final distribution for a lattice of size 1000×1000 with one quantum of energy initially placed on each site, plotted as the number of sites $N$ (logarithmic axis) against the number of quanta $n$. The error bars are calculated by assuming Poisson statistics and have length $\sqrt{N}$, where $N$ is the number of sites having $n$ quanta.
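The game of Example 4.2 is easy to try on a computer. The following is a minimal sketch, not the program used to produce the figures: the function name `redistribute`, the lattice size, the number of attempted moves and the random seed are all assumptions made for illustration, and the sketch assumes that picking an empty site simply leaves the lattice unchanged for that attempt.

```python
import random
from collections import Counter


def redistribute(n_sites: int, quanta_per_site: int, n_attempts: int, seed: int = 0) -> Counter:
    """Move single quanta between randomly chosen sites and return the occupancy histogram.

    The returned Counter maps n (quanta on a site) to the number of sites holding n quanta.
    """
    rng = random.Random(seed)
    occupancy = [quanta_per_site] * n_sites
    for _ in range(n_attempts):
        source = rng.randrange(n_sites)
        if occupancy[source] == 0:
            continue  # assumption: an empty site has nothing to give, so this attempt does nothing
        target = rng.randrange(n_sites)
        occupancy[source] -= 1
        occupancy[target] += 1
    return Counter(occupancy)


# Assumed parameters: 4000 sites, one quantum each, 100 attempted moves per site.
histogram = redistribute(n_sites=4000, quanta_per_site=1, n_attempts=400_000)
for n in range(8):
    print(n, histogram[n])
```

With one quantum per site on average, the printed counts should fall by roughly a factor of two for each extra quantum, which is the exponential (Boltzmann-like) behaviour described in the example. Starting from two quanta per site instead should give a slower decay, in line with the 'higher temperature' case of Fig. 4.8(d).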

4.7 Applications of the Boltzmann distribution

To illustrate the application of the Boltzmann distribution, we now conclude this chapter with some examples. These examples involve little more than a simple application of the Boltzmann distribution, but they have important consequences. Before we do so, let us introduce a piece of shorthand. Since we will often need to write the quantity $1/k_{\rm B}T$, we will use the shorthand

$$\beta \equiv \frac{1}{k_{\rm B}T}, \tag{4.15}$$

so that the Boltzmann factor becomes simply $e^{-\beta E}$. Using this shorthand, we can also write eqn 4.7 as

$$\beta = \frac{d\ln\Omega}{dE}. \tag{4.16}$$

Example 4.3 The two-state system. The first example is one of the simplest one can think of. In a two-state system, there are only two states, one with energy 0 and the other with energy $\epsilon > 0$. What is the average energy of the system?

Solution: The probability of being in the lower state is given by eqn 4.14, so we have

$$P(0) = \frac{1}{1 + e^{-\beta\epsilon}}. \tag{4.17}$$

Similarly, the probability of being in the upper state is

$$P(\epsilon) = \frac{e^{-\beta\epsilon}}{1 + e^{-\beta\epsilon}}. \tag{4.18}$$

The average energy $\langle E\rangle$ of the system is then

$$\langle E\rangle = 0\cdot P(0) + \epsilon\cdot P(\epsilon) = \epsilon\,\frac{e^{-\beta\epsilon}}{1 + e^{-\beta\epsilon}} = \frac{\epsilon}{e^{\beta\epsilon} + 1}. \tag{4.19}$$

This expression (plotted in Fig. 4.10) behaves as expected: when $T$ is very low, $k_{\rm B}T \ll \epsilon$, and so $\beta\epsilon \gg 1$ and $\langle E\rangle \to 0$ (the system is in the ground state). When $T$ is very high, $k_{\rm B}T \gg \epsilon$, and so $\beta\epsilon \ll 1$ and $\langle E\rangle \to \epsilon/2$ (both levels are equally occupied on average).

Fig. 4.10 The value of $\langle E\rangle$ as a function of $\epsilon/k_{\rm B}T = \beta\epsilon$, following eqn 4.19. As $T \to \infty$, each energy level is equally likely to be occupied and so $\langle E\rangle = \epsilon/2$. When $T \to 0$, only the lower level is occupied and $\langle E\rangle = 0$.
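The two limits quoted in Example 4.3 can be confirmed directly from eqns 4.17 to 4.19. The short sketch below works in units of $\epsilon$, so its only input is the dimensionless product $\beta\epsilon$; the function names are hypothetical and chosen here purely for illustration.

```python
from math import exp
from typing import Tuple


def two_state_probabilities(beta_eps: float) -> Tuple[float, float]:
    """Return (P(0), P(eps)) for a two-state system, following eqns 4.17 and 4.18."""
    z = 1.0 + exp(-beta_eps)  # normalizing sum over the two states
    return 1.0 / z, exp(-beta_eps) / z


def two_state_mean_energy(beta_eps: float) -> float:
    """Return <E>/eps = 1/(exp(beta*eps) + 1), following eqn 4.19."""
    return 1.0 / (exp(beta_eps) + 1.0)


for beta_eps in (0.01, 1.0, 10.0):
    p_lower, p_upper = two_state_probabilities(beta_eps)
    print(beta_eps, p_lower, p_upper, two_state_mean_energy(beta_eps))
```

Small $\beta\epsilon$ (high temperature) gives a mean energy close to $\epsilon/2$, while large $\beta\epsilon$ (low temperature) gives a mean energy close to zero.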

Example 4.4 Isothermal atmosphere: Estimate the number of molecules in an isothermal¹¹ atmosphere as a function of height.

¹¹ 'Isothermal' means constant temperature. A more sophisticated treatment of the atmosphere is postponed until Section 12.4; see also Chapter 37.

Solution: This is our first attempt at modelling the atmosphere, where we make the rather naive assumption that the temperature of the atmosphere is constant. Consider a molecule in an ideal gas at temperature $T$ in the presence of gravity. The probability $P(z)$ of the molecule of mass $m$ being at height $z$ is given by

$$P(z) \propto e^{-mgz/k_{\rm B}T}, \tag{4.20}$$

because its potential energy is $mgz$. Hence, the number density¹² of molecules $n(z)$ at height $z$, which will be proportional to the probability function $P(z)$ of finding a molecule at height $z$, is given by

$$n(z) = n(0)\,e^{-mgz/k_{\rm B}T}. \tag{4.21}$$

¹² Number density means number per unit volume.

This result (plotted in Fig. 4.11) agrees with a more pedestrian derivation which goes as follows: consider a layer of gas between height $z$ and $z + dz$. There are $n\,dz$ molecules per unit area in this layer, and therefore they exert a pressure (force per unit area)

$$dp = -n\,dz \cdot mg \tag{4.22}$$

downwards (because each molecule has weight $mg$). We note in passing that eqn 4.22 can be rearranged using $\rho = nm$ to show that

$$dp = -\rho g\,dz, \tag{4.23}$$

which is known as the hydrostatic equation. Using the ideal gas law (in the form derived in Chapter 6), which is $p = nk_{\rm B}T$, we have that

$$\frac{dn}{n} = -\frac{mg}{k_{\rm B}T}\,dz, \tag{4.24}$$

which is a simple differential equation yielding

$$\ln n(z) - \ln n(0) = -\frac{mg}{k_{\rm B}T}\,z, \tag{4.25}$$

so that, again, we have

$$n(z) = n(0)\,e^{-mgz/k_{\rm B}T}. \tag{4.26}$$

Fig. 4.11 The number density $n(z)$ of molecules at height $z$ for an isothermal atmosphere.

Our prediction is that the number density falls off exponentially with height, but the reality is different. Our assumption of constant $T$ is at fault (the temperature falls as the altitude increases, at least initially) and we will return to this problem in Section 12.4, and also in Chapter 37.
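To get a feel for the numbers in Example 4.4, it helps to evaluate the height $k_{\rm B}T/mg$ over which $n(z)$ in eqn 4.26 falls by a factor of $e$. The sketch below does this under assumptions that are not part of the example: the gas is taken to be pure N₂ (28 g per mole, as in Example 5.1), the temperature is taken as 300 K, and $g = 9.81$ m s⁻² is treated as constant with height. The function names are hypothetical.

```python
from math import exp

K_B = 1.3807e-23   # Boltzmann constant in J/K (eqn 4.8)
N_A = 6.022e23     # Avogadro number in 1/mol
G = 9.81           # assumed gravitational acceleration in m/s^2, taken as constant


def e_folding_height(molar_mass_kg: float, temperature_k: float) -> float:
    """Height k_B*T/(m*g) over which the isothermal number density falls by a factor e."""
    molecular_mass = molar_mass_kg / N_A
    return K_B * temperature_k / (molecular_mass * G)


def density_ratio(height_m: float, molar_mass_kg: float, temperature_k: float) -> float:
    """n(z)/n(0) from eqn 4.26 for an isothermal atmosphere."""
    return exp(-height_m / e_folding_height(molar_mass_kg, temperature_k))


height = e_folding_height(0.028, 300.0)
print("e-folding height for N2 at 300 K (m):", height)
print("n(z)/n(0) at 5 km:", density_ratio(5000.0, 0.028, 300.0))
```

Under these assumptions the e-folding height comes out at roughly 9 km, so the exponential decay acts on the scale of kilometres rather than metres.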

Example 4.5 Chemical reactions: Many chemical reactions have an activation energy $E_{\rm act}$ which is about ½ eV. At $T = 300$ K, which is about room temperature, the probability that a particular reaction occurs is proportional to

$$\exp\bigl(-E_{\rm act}/(k_{\rm B}T)\bigr). \tag{4.27}$$

If the temperature is increased to $T + \Delta T = 310$ K, the probability increases to

$$\exp\bigl(-E_{\rm act}/(k_{\rm B}(T + \Delta T))\bigr), \tag{4.28}$$

which is larger by a factor

$$\frac{\exp\bigl(-E_{\rm act}/(k_{\rm B}(T + \Delta T))\bigr)}{\exp\bigl(-E_{\rm act}/(k_{\rm B}T)\bigr)} = \exp\left(-\frac{E_{\rm act}}{k_{\rm B}}\left[(T + \Delta T)^{-1} - T^{-1}\right]\right) \approx \exp\left(\frac{E_{\rm act}}{k_{\rm B}T}\,\frac{\Delta T}{T}\right) \approx 2. \tag{4.29}$$

Hence many chemical reactions roughly double in speed when the temperature is increased by about 10 degrees.
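The factor of about 2 in eqn 4.29 can be reproduced with a few lines of arithmetic. The sketch below evaluates both the exact ratio of Boltzmann factors and the approximate form; the value of the Boltzmann constant in eV K⁻¹ is an assumed conversion of eqn 4.8 using $1\ {\rm eV} \approx 1.602\times10^{-19}$ J, and the function names are hypothetical.

```python
from math import exp

K_B_EV = 8.617e-5  # Boltzmann constant in eV/K (assumed conversion of 1.3807e-23 J/K)


def exact_ratio(e_act_ev: float, t_low: float, t_high: float) -> float:
    """Ratio of Boltzmann factors exp(-E/(k_B*t_high)) / exp(-E/(k_B*t_low))."""
    return exp(-(e_act_ev / K_B_EV) * (1.0 / t_high - 1.0 / t_low))


def approximate_ratio(e_act_ev: float, t_low: float, delta_t: float) -> float:
    """Approximate form exp((E/(k_B*T)) * (dT/T)) used in eqn 4.29."""
    return exp(e_act_ev / (K_B_EV * t_low) * delta_t / t_low)


print("exact ratio:", exact_ratio(0.5, 300.0, 310.0))
print("approximate ratio:", approximate_ratio(0.5, 300.0, 10.0))
```

Both values come out a little below 2, consistent with the rule of thumb that the rate roughly doubles for a 10 degree rise.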

Example 4.6 The Sun: The main fusion reaction in the Sun¹³ is

$$p^+ + p^+ \to d^+ + e^+ + \bar{\nu}, \tag{4.30}$$

but the main barrier to this occurring is the electrostatic repulsion of the two protons coming together in the first place. This energy is

$$E = \frac{e^2}{4\pi\epsilon_0 r}, \tag{4.31}$$

which, for $r = 10^{-15}$ m (the distance to within which they must approach each other), is about 1 MeV. The Boltzmann factor for this process at a temperature of $T \approx 10^7$ K (at the centre of the Sun) is

$$e^{-E/k_{\rm B}T} \approx 10^{-400}. \tag{4.32}$$

This is extremely small, suggesting that the Sun is unlikely to undergo fusion. However, our lazy sunny afternoons are saved by the fact that quantum mechanical tunnelling allows the protons to pass through this barrier vastly more often than this calculation predicts that they could pass over the top of it.

¹³ $p^+$ is a proton, $d^+$ is a deuteron (a proton and a neutron), $e^+$ is a positron and $\bar{\nu}$ is a neutrino. This reaction and its consequences are explored more fully in Section 35.2.

Chapter summary

• The temperature $T$ of a system is given by

$$\beta \equiv \frac{1}{k_{\rm B}T} = \frac{d\ln\Omega}{dE},$$

where $k_{\rm B}$ is the Boltzmann constant, $E$ is its energy, and $\Omega$ is the number of microstates (i.e. the number of ways of arranging the quanta of energy in the system).
• The microcanonical ensemble is an idealized collection of systems which all have the same fixed energy.
• The canonical ensemble is an idealized collection of systems, each of which can exchange its energy with a large reservoir of heat.
• For the canonical ensemble, the probability that a particular system has energy $\epsilon$ is given by $P(\epsilon) \propto e^{-\beta\epsilon}$ (Boltzmann distribution), and the factor $e^{-\beta\epsilon}$ is known as the Boltzmann factor. Its use has been illustrated for a number of physical situations.

Further reading

Methods of measuring temperature are described in Pobell (1996) and White and Meeson (2002).

Exercises

(4.1) Check that the probability in eqn 4.14 is normalized, so that the sum of all possible probabilities is one.

(4.2) For the two-state system described in Example 4.3, derive an expression for the variance of the energy.

(4.3) A system comprises $N$ states which can have energy 0 or $\Delta$. Show that the number of ways $\Omega(E)$ of arranging the total system to have energy $E = r\Delta$ (where $r$ is an integer) is given by

$$\Omega(E) = \frac{N!}{r!\,(N - r)!}. \tag{4.33}$$

Now remove a small amount of energy $s\Delta$ from the system, where $s \ll r$. Show that

$$\Omega(E - \epsilon) \approx \Omega(E)\,\frac{r^s}{(N - r)^s}, \tag{4.34}$$

and hence show that the system has temperature $T$ given by

$$\frac{1}{k_{\rm B}T} = \frac{1}{\Delta}\ln\left(\frac{N - r}{r}\right). \tag{4.35}$$

Sketch $k_{\rm B}T$ as a function of $r$ from $r = 0$ to $r = N$ and explain the result.

(4.4) In eqn 4.11, we neglected the next term in the Taylor expansion which is

$$\frac{d^2\ln\Omega}{dE^2}\,\epsilon^2. \tag{4.36}$$

Show that this term equals

$$-\frac{\epsilon^2}{k_{\rm B}T^2}\,\frac{dT}{dE}, \tag{4.37}$$

and hence show that it can be neglected compared to the first two terms if the reservoir is large. (Hint: how much should the temperature of the reservoir change when you change its energy by of order $\epsilon$?)

(4.5) A visible photon with energy 2 eV is absorbed by a macroscopic body held at room temperature. By what factor does $\Omega$ for the macroscopic body change? Repeat the calculation for a photon which originated from an FM radio transmitter.

(4.6) Figure 4.10 is a plot of $\langle E\rangle$ as a function of $\beta\epsilon$. Sketch $\langle E\rangle$ as a function of temperature $T$ (measured in units of $\epsilon/k_{\rm B}$).

(4.7) Find the average energy $\langle E\rangle$ for
(a) An $n$-state system, in which a given state can have energy $0, \epsilon, 2\epsilon, \ldots, n\epsilon$.
(b) A harmonic oscillator, in which a given state can have energy $0, \epsilon, 2\epsilon, \ldots$ (i.e. with no upper limit).

(4.8) Estimate $k_{\rm B}T$ at room temperature, and convert this energy into electronvolts (eV). Using this result, answer the following:
(a) Would you expect hydrogen atoms to be ionized at room temperature? (The binding energy of an electron in a hydrogen atom is 13.6 eV.)
(b) Would you expect the rotational energy levels of diatomic molecules to be excited at room temperature? (It costs about $10^{-4}$ eV to promote such a system to an excited rotational energy level.)

(4.9) Write a computer program to reproduce the results in Example 4.2. For the case of $N \gg 1$ sites with initially one quantum per site, show that after many iterations you would expect there to be $N(n)$ sites with $n$ quanta, where

$$N(n) \approx 2^{-n}N, \tag{4.38}$$

and explain why this is a Boltzmann distribution. Generalize your results for $Q \gg 1$ quanta distributed on $N \gg 1$ sites.

Part II

Kinetic theory of gases

In the second part of this book, we apply the results of Part I to the properties of gases. This is the kinetic theory of gases, in which it is the motion of individual gas atoms, behaving according to the Boltzmann distribution, which determines quantities such as the pressure of a gas, or the rate of effusion. This part is structured as follows:

• In Chapter 5, we show that the Boltzmann distribution applied to gases gives rise to a speed distribution known as the Maxwell–Boltzmann distribution. We show how this can be measured experimentally.
• A treatment of pressure in Chapter 6 using the results so far developed allows us to derive Boyle's law and the ideal gas law.
• We are then able to treat the effusion of gases through small holes in Chapter 7, which also introduces the concept of flux.
• Chapter 8 considers the nature of molecular collisions and introduces the concepts of the mean scattering time, the collision cross-section and the mean free path.

5 The Maxwell–Boltzmann distribution

5.1 The velocity distribution
5.2 The speed distribution
5.3 Experimental justification
Chapter summary
Exercises

In this chapter we will apply the results of the Boltzmann distribution (eqn 4.13) to the problem of the motion of molecules in a gas. For the present, we will neglect any rotational or vibrational motion of the molecules and consider only translational motion (so these results are strictly applicable only to a monatomic gas). In this case the energy of a molecule is given by

$$\frac{1}{2}mv_x^2 + \frac{1}{2}mv_y^2 + \frac{1}{2}mv_z^2 = \frac{1}{2}mv^2, \tag{5.1}$$

where $\mathbf{v} = (v_x, v_y, v_z)$ is the molecular velocity, and $v = |\mathbf{v}|$ is the molecular speed. This molecular velocity can be represented in velocity space (see Fig. 5.1). The aim is to determine the distribution of molecular velocities and to determine the distribution of molecular speeds. This we will do in the next two sections.

Fig. 5.1 The velocity of a molecule is shown as a vector in velocity space.

To make some progress, we will make a couple of assumptions: first, that the molecular size is much less than the intermolecular separation, so that we assume that molecules spend most of their time whizzing around and only rarely bump into each other; second, we will ignore any intermolecular forces. Molecules can exchange energy with each other due to collisions, but everything remains in equilibrium. Each molecule therefore behaves like a small system connected to a heat reservoir at temperature $T$, where the heat reservoir is 'all the other molecules in the gas'. Hence the results of the Boltzmann distribution of energies (described in the previous chapter) will hold.

5.1 The velocity distribution

To work out the velocity distribution of molecules in a gas, we must first choose a given direction and see how many molecules have particular components of velocity along it. We define the velocity distribution function as the fraction of molecules with velocities in, say, the $x$-direction,¹ between $v_x$ and $v_x + dv_x$, as $g(v_x)\,dv_x$. The velocity distribution function is proportional to a Boltzmann factor, namely $e$ to the power of minus the relevant energy, in this case $\frac{1}{2}mv_x^2$, divided by $k_{\rm B}T$. Hence

$$g(v_x) \propto e^{-mv_x^2/2k_{\rm B}T}. \tag{5.2}$$

¹ But we could choose any direction of motion we like!

This velocity distribution function is sketched in Fig. 5.2. To normalize this function, so that $\int_{-\infty}^{\infty} g(v_x)\,dv_x = 1$, we need to evaluate the integral²

$$\int_{-\infty}^{\infty} e^{-mv_x^2/2k_{\rm B}T}\,dv_x = \sqrt{\frac{\pi}{m/2k_{\rm B}T}} = \sqrt{\frac{2\pi k_{\rm B}T}{m}}, \tag{5.3}$$

so that

$$g(v_x) = \sqrt{\frac{m}{2\pi k_{\rm B}T}}\;e^{-mv_x^2/2k_{\rm B}T}. \tag{5.4}$$

² The integral may be evaluated using eqn C.3.

It is then possible to find the following expected values of this distribution (using the integrals in Appendix C.2):

$$\langle v_x\rangle = \int_{-\infty}^{\infty} v_x\,g(v_x)\,dv_x = 0, \tag{5.5}$$

$$\langle |v_x|\rangle = 2\int_{0}^{\infty} v_x\,g(v_x)\,dv_x = \sqrt{\frac{2k_{\rm B}T}{\pi m}}, \tag{5.6}$$

$$\langle v_x^2\rangle = \int_{-\infty}^{\infty} v_x^2\,g(v_x)\,dv_x = \frac{k_{\rm B}T}{m}. \tag{5.7}$$

Fig. 5.2 $g(v_x)$, the distribution function for a particular component of molecular velocity (which is a Gaussian distribution).

Of course, it does not matter which component of the velocity was initially chosen. Identical results would have been obtained for $v_y$ and $v_z$. Hence the fraction of molecules with velocities between $(v_x, v_y, v_z)$ and $(v_x + dv_x, v_y + dv_y, v_z + dv_z)$ is given by

$$g(v_x)\,dv_x\;g(v_y)\,dv_y\;g(v_z)\,dv_z \propto e^{-mv_x^2/2k_{\rm B}T}\,dv_x\;e^{-mv_y^2/2k_{\rm B}T}\,dv_y\;e^{-mv_z^2/2k_{\rm B}T}\,dv_z = e^{-mv^2/2k_{\rm B}T}\,dv_x\,dv_y\,dv_z. \tag{5.8}$$

5.2 The speed distribution

We now wish to turn to the problem of working out the distribution of molecular speeds in a gas. We want the fraction of molecules which are travelling with speeds between $v = |\mathbf{v}|$ and $v + dv$, and this corresponds to a spherical shell in velocity space of radius $v$ and thickness $dv$ (see Fig. 5.3). The volume of velocity space corresponding to speeds between $v$ and $v + dv$ is therefore equal to

$$4\pi v^2\,dv, \tag{5.9}$$

so that the fraction of molecules with speeds between $v$ and $v + dv$ can be defined as $f(v)\,dv$, where $f(v)$ is given by

$$f(v)\,dv \propto v^2\,dv\;e^{-mv^2/2k_{\rm B}T}. \tag{5.10}$$

In this expression the $4\pi$ factor has been absorbed in the proportionality sign.

Fig. 5.3 Molecules with speeds between $v$ and $v + dv$ occupy a volume of velocity space inside a spherical shell of radius $v$ and thickness $dv$. (An octant of this sphere is shown cut-away.)

To normalize³ this function, so that $\int_0^{\infty} f(v)\,dv = 1$, we must evaluate the integral (using eqn C.3)

$$\int_0^{\infty} v^2 e^{-mv^2/2k_{\rm B}T}\,dv = \frac{1}{4}\sqrt{\frac{\pi}{(m/2k_{\rm B}T)^3}}, \tag{5.11}$$

so that

$$f(v)\,dv = \frac{4}{\sqrt{\pi}}\left(\frac{m}{2k_{\rm B}T}\right)^{3/2} v^2\,dv\;e^{-mv^2/2k_{\rm B}T}. \tag{5.12}$$

³ We integrate between 0 and $\infty$, not between $-\infty$ and $\infty$, because the speed $v = |\mathbf{v}|$ is a positive quantity.

Fig. 5.4 $f(v)$, the distribution function for molecular speeds (Maxwell–Boltzmann distribution).

This speed distribution function is known as the Maxwell–Boltzmann speed distribution, or sometimes simply as a Maxwellian distribution and is plotted in Fig. 5.4. Having derived the Maxwell–Boltzmann distribution function in eqn 5.10, we are now in a position to derive some of its properties.

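5.2.1 $\langle v\rangle$ and $\langle v^2\rangle$

It is straightforward to find the following expected values of the Maxwell–Boltzmann distribution:

$$\langle v\rangle = \int_0^{\infty} v\,f(v)\,dv = \sqrt{\frac{8k_{\rm B}T}{\pi m}}, \tag{5.13}$$

$$\langle v^2\rangle = \int_0^{\infty} v^2 f(v)\,dv = \frac{3k_{\rm B}T}{m}. \tag{5.14}$$

Note that using eqns 5.7 and 5.14 we can write

$$\langle v_x^2\rangle + \langle v_y^2\rangle + \langle v_z^2\rangle = \frac{k_{\rm B}T}{m} + \frac{k_{\rm B}T}{m} + \frac{k_{\rm B}T}{m} = \frac{3k_{\rm B}T}{m} = \langle v^2\rangle \tag{5.15}$$

as expected. Note also that the root mean squared speed of a molecule

$$v_{\rm rms} = \sqrt{\langle v^2\rangle} = \sqrt{\frac{3k_{\rm B}T}{m}} \tag{5.16}$$

is proportional to $m^{-1/2}$.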
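The normalization in eqn 5.12 and the averages in eqns 5.13 and 5.14 can be checked by brute-force numerical integration. The sketch below works in units where $k_{\rm B}T/m = 1$ (an assumption made to keep the numbers dimensionless), uses a plain midpoint rule rather than any library routine, and truncates the integral at an assumed upper limit where the integrand is negligible. The function names are hypothetical.

```python
from math import exp, pi, sqrt


def speed_distribution(v: float) -> float:
    """Maxwell-Boltzmann speed distribution of eqn 5.12 in units where k_B*T/m = 1."""
    return 4.0 / sqrt(pi) * 0.5 ** 1.5 * v * v * exp(-0.5 * v * v)


def moment(power: int, v_limit: float = 12.0, steps: int = 24_000) -> float:
    """Midpoint-rule estimate of the integral of v**power * f(v) from 0 to v_limit."""
    dv = v_limit / steps
    total = 0.0
    for i in range(steps):
        v = (i + 0.5) * dv
        total += v ** power * speed_distribution(v)
    return total * dv


print("normalization:", moment(0))            # eqn 5.12 predicts 1
print("<v>:", moment(1), sqrt(8.0 / pi))      # eqn 5.13 predicts sqrt(8/pi)
print("<v^2>:", moment(2))                    # eqn 5.14 predicts 3
```

Multiplying the dimensionless speeds by $\sqrt{k_{\rm B}T/m}$ recovers values in m s⁻¹ for any particular gas and temperature.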

5.2.2 The mean kinetic energy of a gas molecule

The mean kinetic energy of a gas molecule is given by

$$\langle E_{\rm KE}\rangle = \frac{1}{2}m\langle v^2\rangle = \frac{3}{2}k_{\rm B}T. \tag{5.17}$$

This is an important result, and we will later derive it again by a different route (see Section 19.2.1). It demonstrates that the average energy of a molecule in a gas depends only on temperature.

5.2.3 The maximum of $f(v)$

The maximum value of $f(v)$ is found by setting

$$\frac{df}{dv} = 0, \tag{5.18}$$

which yields

$$v_{\rm max} = \sqrt{\frac{2k_{\rm B}T}{m}}. \tag{5.19}$$

Since

$$\sqrt{2} < \sqrt{\frac{8}{\pi}} < \sqrt{3}, \tag{5.20}$$

we have that

$$v_{\rm max} < \langle v\rangle < v_{\rm rms} \tag{5.21}$$

and hence the points marked on Fig. 5.4 are in the order drawn. The mean speed of the Maxwell–Boltzmann distribution is higher than the value of the speed corresponding to the maximum in the distribution since the shape of $f(v)$ is such that the tail to the right is very long.

Example 5.1 Calculate the rms speed of a nitrogen (N₂) molecule at room temperature. [One mole of N₂ has a mass of 28 g.]

Solution: For nitrogen at room temperature, $m = (0.028\ {\rm kg})/(6.022\times10^{23})$ and so $v_{\rm rms} \approx 500$ m s⁻¹. This is about 1100 miles per hour, and is the same order of magnitude as the speed of sound.
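The estimate in Example 5.1 can be reproduced, together with the other two characteristic speeds of eqns 5.13 and 5.19, in a few lines. This is a sketch under the assumption that 'room temperature' means 300 K; the function name `characteristic_speeds` is hypothetical.

```python
from math import pi, sqrt
from typing import Dict

K_B = 1.3807e-23   # Boltzmann constant in J/K (eqn 4.8)
N_A = 6.022e23     # Avogadro number in 1/mol, as used in Example 5.1


def characteristic_speeds(molar_mass_kg: float, temperature_k: float) -> Dict[str, float]:
    """Most probable, mean and rms speeds (eqns 5.19, 5.13 and 5.16) in m/s."""
    kt_over_m = K_B * temperature_k / (molar_mass_kg / N_A)
    return {
        "v_max": sqrt(2.0 * kt_over_m),
        "v_mean": sqrt(8.0 * kt_over_m / pi),
        "v_rms": sqrt(3.0 * kt_over_m),
    }


speeds = characteristic_speeds(molar_mass_kg=0.028, temperature_k=300.0)  # N2, assumed 300 K
for name, value in speeds.items():
    print(name, round(value, 1), "m/s")
```

The three speeds come out in the order required by eqn 5.21, with the rms value close to the 500 m s⁻¹ quoted in the example.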

5.3 Experimental justification

How do you demonstrate that the velocity distribution in a gas obeys the Maxwell–Boltzmann distribution? A possible experimental apparatus is shown in Fig. 5.5. This consists of an oven, a velocity selector, and a detector which are mounted on an optical bench. Hot gas atoms emerge from the oven and pass through a collimating slit. Velocity selection of molecules is achieved using discs with slits cut into them which are rotated at high angular speed by a motor. A phase shifter varies the phase of the voltage fed to the motor for one disc relative to that of the other. Thus only molecules travelling with a particular speed from the oven will pass through the slits in both discs. A beam of light can be used to determine when the velocity selector is set for zero transit time. This beam is produced by a small light source near one disc, passes through the velocity selector and is detected by a photocell near the other disc.

Fig. 5.5 The experimental apparatus which can be used to measure the Maxwell–Boltzmann distribution.


Another way of doing the velocity selection is shown in Fig. 5.6. This consists of a solid cylinder on whose surface is cut a helical slot and which is capable of rotation around the cylinder’s axis at a rate ω. A molecule of velocity v which goes through the slot without changing its position relative to the sides of the slot will satisfy the equation

$$v = \frac{\omega L}{\phi} \tag{5.22}$$

in which φ and L are the fixed angle and length shown in Fig. 5.6. Tuning ω allows you to tune the selected velocity v.

Fig. 5.6 Diagram of the velocity selector. (After R. C. Miller and P. Kusch, Phys. Rev. 99, 1314 (1955).) Copyright (1955) by the American Physical Society.
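As a rough illustration of eqn 5.22, the sketch below converts between the rotation rate ω and the selected speed v. The slot length and slot angle used here are hypothetical values chosen only for the example, and the function names are not from the original.

```python
import math


def selected_speed(omega, slot_length, slot_angle):
    """Speed passed by the helical slot, v = omega * L / phi (eqn 5.22)."""
    return omega * slot_length / slot_angle


def rotation_rate_for(speed, slot_length, slot_angle):
    """Rotation rate omega needed to select a given speed (eqn 5.22 inverted)."""
    return speed * slot_angle / slot_length


# Hypothetical geometry, chosen only for illustration.
L_SLOT = 0.20            # slot length L, in metres
PHI = math.radians(5.0)  # slot angle phi, in radians

omega_500 = rotation_rate_for(500.0, L_SLOT, PHI)  # rad/s
print(f"omega for 500 m/s: {omega_500:.1f} rad/s "
      f"({omega_500 / (2 * math.pi):.1f} rev/s)")
```

Because v is proportional to ω, doubling the rotation rate doubles the selected speed for a fixed geometry.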

Fig. 5.7 Intensity data measured for potassium atoms using the velocity selector shown in Fig. 5.6. (After R. C. Miller and P. Kusch, Phys. Rev. 99, 1314 (1955).) Copyright (1955) by the American Physical Society.

Data from this experiment are shown in Fig. 5.7. In fact, the intensity as a function of velocity v does not follow the expected $v^2 e^{-mv^2/2k_{\rm B}T}$ distribution but instead fits to $v^4 e^{-mv^2/2k_{\rm B}T}$. What has gone wrong? Nothing has gone wrong, but there are two factors of v which have to be included for two different reasons. One factor of v comes from the fact that the gas atoms emerging through the small aperture in the wall of the oven are not completely representative of the atoms inside the oven. This effect will be analysed in Chapter 7. The other factor of v comes from the fact that as the velocity selector is spun faster, it accepts a smaller fraction of molecules. This can be understood in detail as follows. Because of the finite width of the slit, the velocity selector selects molecules with a range of velocities. The limiting velocities correspond to molecules which enter the slot at one wall and leave the slot at the opposite wall. This leads to velocities which range all the way from


$\omega L/\phi_-$ to $\omega L/\phi_+$, where $\phi_\pm = \phi \pm l/r$ and l and r are as defined in Fig. 5.6. Thus the range, Δv, of velocities transmitted is given by

$$\Delta v = \omega L \left( \frac{1}{\phi_-} - \frac{1}{\phi_+} \right) \approx \frac{2l}{\phi r}\, v, \tag{5.23}$$

and thus increases as the selected velocity increases. This gives rise to the second additional factor of v.

Another way to experimentally justify the treatment in this chapter is to look at spectral lines of hot gas atoms. The limit on resolution is often set by Doppler broadening: atoms travelling with a component of velocity $v_x$ towards the detector will have transition frequencies which differ from those of atoms at rest due to the Doppler shift. A spectral line with frequency $\omega_0$ (and wavelength $\lambda_0 = 2\pi c/\omega_0$, where c is the speed of light) will be Doppler-shifted to a frequency $\omega_0(1 \pm v_x/c)$, where the ± sign reflects molecules travelling towards or away from the detector. The Gaussian distribution of velocities given by eqn 5.2 now gives rise to a Gaussian shape of the spectral line I(ω) (see Fig. 5.8) which is given by

$$I(\omega) \propto \exp\left( -\frac{mc^2(\omega_0 - \omega)^2}{2k_{\rm B}T\omega_0^2} \right), \tag{5.24}$$

and the full-width at half-maximum of this spectral line, $\Delta\omega_{\rm FWHM}$ (or in wavelength $\Delta\lambda_{\rm FWHM}$), is given by

$$\frac{I(\omega_0 + \Delta\omega_{\rm FWHM}/2)}{I(\omega_0)} = \frac{1}{2}, \tag{5.25}$$

so that

$$\frac{\Delta\omega_{\rm FWHM}}{\omega_0} = \frac{\Delta\lambda_{\rm FWHM}}{\lambda_0} = 2\sqrt{2\ln 2\, \frac{k_{\rm B}T}{mc^2}}. \tag{5.26}$$

Another source of broadening of spectral lines arises from molecular collisions. This is called collisional broadening or sometimes pressure broadening (since collisions are more frequent in a gas when the pressure is higher, see Section 8.1). Doppler broadening is therefore most important in low-pressure gases.
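To get a feel for the size of the Doppler width in eqn 5.26, the following sketch evaluates the fractional width for helium atoms at 300 K. The choice of gas and temperature is an assumption made purely for illustration, as is the function name.

```python
import math

K_B = 1.380649e-23   # Boltzmann constant, J/K
C = 2.99792458e8     # speed of light, m/s
U = 1.66053907e-27   # atomic mass unit, kg


def doppler_fractional_fwhm(mass_kg, temperature_k):
    """Return Delta(lambda)_FWHM / lambda_0 = 2 sqrt(2 ln2 k_B T / (m c^2))."""
    return 2.0 * math.sqrt(2.0 * math.log(2.0) * K_B * temperature_k
                           / (mass_kg * C ** 2))


# Assumed example: helium (mass 4 u) at 300 K.
fractional_width = doppler_fractional_fwhm(4.0 * U, 300.0)
print(f"Fractional FWHM: {fractional_width:.2e}")
```

The fractional width scales as the square root of T/m, so hot, light atoms give the broadest lines.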

Fig. 5.8 The intensity of a Doppler-broadened spectral line.


Chapter summary

• A physical situation which is very important in kinetic theory is the translational motion of atoms or molecules in a gas. The probability distribution for a given component of velocity is given by

$$g(v_x) \propto e^{-mv_x^2/2k_{\rm B}T}.$$

• We have shown that the corresponding expression for the probability distribution of molecular speeds is given by

$$f(v) \propto v^2 e^{-mv^2/2k_{\rm B}T}.$$

This is known as a Maxwell–Boltzmann distribution, or sometimes as a Maxwellian distribution.

• Two important average values of the Maxwell–Boltzmann distribution are

$$\langle v \rangle = \sqrt{\frac{8k_{\rm B}T}{\pi m}}, \qquad \langle v^2 \rangle = \frac{3k_{\rm B}T}{m}.$$

Exercises

(5.1) Do the integrals in eqns 5.5–5.7 and eqns 5.13 and 5.14, and check that you get the same answers.

(5.2) Calculate the rms speed of hydrogen (H$_2$), helium (He) and oxygen (O$_2$) at room temperature. [The atomic masses of H, He and O are 1, 4 and 16 respectively.] Compare these speeds with the escape velocity on the surface of (i) the Earth, (ii) the Sun.

(5.3) What fractional error do you make if you approximate $\sqrt{\langle v^2 \rangle}$ by $\langle v \rangle$ for a Maxwell–Boltzmann gas?

(5.4) A Maxwell–Boltzmann distribution implies that a given molecule (mass m) will have a speed between v and v + dv with probability equal to f(v) dv where

$$f(v) \propto v^2 e^{-mv^2/2k_{\rm B}T},$$

and the proportionality sign is used because a normalization constant has been omitted. (You can correct for this by dividing any averages you work out by $\int_0^\infty f(v)\, dv$.) For this distribution, calculate the mean speed $\langle v \rangle$ and the mean inverse speed $\langle 1/v \rangle$. Show that

$$\langle v \rangle \langle 1/v \rangle = \frac{4}{\pi}.$$

(5.5) The width of a spectral line (FWHM) is often quoted as

$$\Delta\lambda_{\rm FWHM} = 7.16 \times 10^{-7}\, \lambda_0 \sqrt{\frac{T}{m}}, \tag{5.27}$$

where T is the temperature in Kelvin, $\lambda_0$ is the wavelength at the centre of the spectral line in the rest frame and m is the atomic mass of the gas measured in atomic mass units (i.e. multiples of the mass of a proton). Does this formula make sense?

(5.6) What is the Doppler broadening of the 21 cm line in an interstellar gas cloud (temperature 100 K) composed of neutral hydrogen? (Express your answer in kHz.)

(5.7) Calculate the rms speed of a sodium atom in the solar atmosphere at 6000 K. (The atomic mass of sodium is 23.) The sodium D lines (λ = 5900 Å) are observed in a solar spectrum. Estimate the Doppler broadening in GHz.


James Clerk Maxwell (1831–1879)

Born in Edinburgh, James Clerk Maxwell was brought up in the Scottish countryside at Glenlair. He was educated at home until, at the age of 10, he was sent to the Edinburgh Academy where his unusual homemade clothes and distracted air earned him the nickname “Dafty”. But a lot was going on in his head and he wrote his first scientific paper at age 14. Maxwell went to Peterhouse, Cambridge in 1850 but then moved to Trinity College, where he gained a fellowship in 1854. There he worked on the perception of colour, and also put Michael Faraday’s ideas of lines of electrical force onto a sound mathematical basis. In 1856 he took up a chair in Natural Philosophy in Aberdeen where he worked on a theory of the rings of Saturn (confirmed by the Voyager spacecraft visits of the 1980s) and, in 1858, married the College Principal’s daughter, Katherine Mary Dewar. In 1859, he was inspired by a paper of Clausius on diffusion in gases to conceive of his theory of speed distributions in gases, outlined in Chapter 5, which, with its subsequent elaborations by Boltzmann, is known as the Maxwell–Boltzmann distribution. These triumphs were not enough to preserve him from the consequences of the merging of Aberdeen’s two Universities in 1860 when, incredibly, the powers that be decided that it was Maxwell out of the two Professors of Natural Philosophy who should be made redundant. He failed to obtain a chair at Edinburgh (losing out to Tait) but instead moved to King’s College London.

Fig. 5.9 James Clerk Maxwell

There, he produced the world’s first colour photograph, came up with his theory of electromagnetism that proposed that light was an electromagnetic wave and explained its speed in terms of electrical properties, and chaired a committee to decide on a new system of units to incorporate the new understanding of the link between electricity and magnetism (and which became known as the ‘Gaussian’ system, or c.g.s. system – though ‘Maxwellian system’ would have been more appropriate). He also constructed his apparatus for measuring the viscosity of gases (see Chapter 9), verifying some of his predictions, but not others. In 1865, he resigned his chair at King’s and moved full time to Glenlair, where he wrote his ‘Theory of Heat’ which introduced what are now known as Maxwell relations (Chapter 16) and the concept of Maxwell’s demon (Section 14.7). He applied for, but did not get, the position of Principal of St Andrews’ University, but in 1871 was appointed to the newly established Professorship of Experimental Physics in Cambridge (after William Thomson and Hermann Helmholtz both turned the job down). There he supervised the building of the Cavendish Laboratory and wrote his celebrated treatise on ‘Electricity and Magnetism’ (1873) where his four electromagnetic equations (‘Maxwell’s equations’) first appear. In 1877 he was diagnosed with abdominal cancer and died in Cambridge in 1879.

In his short life Maxwell had been one of the most prolific, inspirational and creative scientists that has ever lived. His work has had far-reaching implications in much of physics, not just in thermodynamics. He had also lived a devout and contemplative life in which he had been free of pride, selfishness and ego, always generous and courteous to everyone. The doctor who tended him in his last days wrote

I must say that he is one of the best men I have ever met, and a greater merit than his scientific achievements is his being, so far as human judgement can discern, a most perfect example of a Christian gentleman.

Maxwell summed up his own philosophy as follows:

Happy is the man who can recognize in the work of Today a connected portion of the work of life, and an embodiment of the work of Eternity. The foundations of his confidence are unchangeable, for he has been made a partaker of Infinity.

6 Pressure

6.1 Molecular distributions
6.2 The ideal gas law
6.3 Dalton’s law
Chapter summary
Exercises

One of the most fundamental variables in the study of gases is pressure. The pressure p due to a gas (or in fact any fluid) is defined as the ratio of the perpendicular contact force to the area of contact. The unit is therefore that of force (N) divided by that of area (m$^2$) and is called the Pascal (Pa = N m$^{-2}$). The direction in which pressure acts is always at right angles to the surface upon which it is acting. Other units for measuring pressure are sometimes encountered, such as the bar (1 bar = $10^5$ Pa) and the almost equivalent atmosphere (1 atm = $1.01325 \times 10^5$ Pa). The pressure of the atmosphere at sea-level actually varies depending on the weather by approximately ±50 mbar around the standard atmosphere of 1013.25 mbar, though pressures (adjusted for sea level) as low as 882 mbar and as high as 1084 mbar have been recorded. An archaic unit is the Torr, which is equal to a millimetre of mercury (Hg): 1 Torr = 133.32 Pa.

Example 6.1 Air has a density of about 1.29 kg m$^{-3}$. Give a rough estimate of the height of the atmosphere assuming that the density of air in the atmosphere is uniform. Solution: Atmospheric pressure $p \approx 10^5$ Pa is due to the weight of air ρgh in the atmosphere (with assumed height h and uniform density ρ) pressing down on each square metre. Hence $h = p/\rho g \approx 10^4$ m (which is about the cruising altitude of planes). Of course, in reality the density of the atmosphere falls off with increasing height (see Chapter 37).

The pressure p of a volume V of gas (comprising N molecules) depends on its temperature T via an equation of state, which is an expression of the form

$$p = f(T, V, N), \tag{6.1}$$

where f is some function. One example of an equation of state is that for an ideal gas, which was given in eqn 1.12:

$$pV = Nk_{\rm B}T. \tag{6.2}$$


Daniel Bernoulli (1700–1782) attempted an explanation of Boyle’s law (p ∝ 1/V ) by assuming (controversially at the time) that gases were composed of a vast number of tiny particles (see Fig. 6.1). This was the ﬁrst serious attempt at a kinetic theory of gases of the sort that we will describe in this chapter to derive the ideal gas equation.

6.1 Molecular distributions

In the previous chapter we derived the Maxwell–Boltzmann speed distribution function f (v). We denote the total number of molecules per unit volume by the symbol n. The number of molecules per unit volume which are travelling with speeds between v and v + dv is then given by nf (v) dv. We now seek to determine the distribution function of molecules travelling in diﬀerent directions.

Fig. 6.1 In the kinetic theory of gases, a gas is modelled as a number of individual tiny particles which can bounce off the walls of the container, and each other.

6.1.1 Solid angles

Recall that an angle θ in a circle is defined by dividing the arc length s which the angle subtends by the radius r (see Fig. 6.2), so that

$$\theta = \frac{s}{r}. \tag{6.3}$$

The angle is measured in radians. The angle subtended by the whole circle at its centre is then

$$\frac{2\pi r}{r} = 2\pi. \tag{6.4}$$

Fig. 6.2 The definition of angle θ in terms of the arc length.

By analogy, a solid angle Ω in a sphere (see Fig. 6.3) is defined by dividing the surface area A which the solid angle subtends by the radius squared, so that

$$\Omega = \frac{A}{r^2}. \tag{6.5}$$

The solid angle is measured in steradians. The solid angle subtended by a whole sphere at its centre is then

$$\frac{4\pi r^2}{r^2} = 4\pi. \tag{6.6}$$

6.1.2 The number of molecules travelling in a certain direction at a certain speed

If all molecules are equally likely to be travelling in any direction, the fraction whose trajectories lie in an elemental solid angle dΩ is

$$\frac{d\Omega}{4\pi}. \tag{6.7}$$

If we choose a particular direction, then the solid angle dΩ corresponding to molecules travelling at angles between θ and θ + dθ to that direction is equal to the area of the annular region shown shaded in the unit-radius sphere of Fig. 6.4 which is given by

$$d\Omega = 2\pi \sin\theta\, d\theta, \tag{6.8}$$

so that

$$\frac{d\Omega}{4\pi} = \tfrac{1}{2} \sin\theta\, d\theta. \tag{6.9}$$

Therefore, a number of molecules per unit volume given by

$$n f(v)\, dv\, \tfrac{1}{2} \sin\theta\, d\theta \tag{6.10}$$

have speeds between v and v + dv and are travelling at angles between θ and θ + dθ to the chosen direction, where f(v) is the speed distribution function.

Fig. 6.3 The definition of solid angle $\Omega = A/r^2$ where r is the radius of the sphere and A is the surface area over the region of the sphere indicated.

Fig. 6.4 The area of the shaded region on this sphere of unit radius is equal to the circumference of a circle of radius sin θ multiplied by the width dθ and is hence given by 2π sin θ dθ.

6.1.3 The number of molecules hitting a wall

We now let our particular direction, up until now arbitrarily chosen, lie perpendicular to a wall of area A (see Fig. 6.5). In a small time dt, the molecules travelling at angle θ to the normal to the wall sweep out a volume

$$A\, v\, dt \cos\theta. \tag{6.11}$$

Multiplying this volume by the number in expression 6.10 implies that in time dt, the number of molecules hitting a wall of area A is

$$A\, v\, dt \cos\theta\; n f(v)\, dv\, \tfrac{1}{2} \sin\theta\, d\theta. \tag{6.12}$$

Hence, the number of molecules hitting unit area of wall in unit time, and having speeds between v and v + dv and travelling at angles between θ and θ + dθ, is given by

$$v \cos\theta\; n f(v)\, dv\, \tfrac{1}{2} \sin\theta\, d\theta. \tag{6.13}$$

Fig. 6.5 Molecules hit a region of wall (of cross-sectional area $A^{1/2} \times A^{1/2} = A$) at an angle θ. The number hitting in time dt is the volume of the shaded region ($A\, v\, dt \cos\theta$) multiplied by $n f(v)\, dv\, \tfrac{1}{2} \sin\theta\, d\theta$.

6.2 The ideal gas law

We are now in a position to calculate the pressure of a gas on its container. Each molecule which hits the wall of the container has a momentum change of 2mv cos θ which is perpendicular to the wall. This change of momentum is equivalent to an impulse. Hence, if we multiply 2mv cos θ (the momentum change arising from one molecule hitting the container walls) by the number of molecules hitting unit area per unit time, and having speeds between v and v + dv and angles between θ and θ + dθ (which we derived in eqn 6.13), and then integrate over θ and v, we should get the pressure p. Thus

$$\begin{aligned} p &= \int_0^\infty \int_0^{\pi/2} (2mv\cos\theta) \left( v\cos\theta\; n f(v)\, dv\, \tfrac{1}{2}\sin\theta\, d\theta \right) \\ &= mn \int_0^\infty dv\, v^2 f(v) \int_0^{\pi/2} \cos^2\theta\, \sin\theta\, d\theta, \end{aligned} \tag{6.14}$$

and using the integral $\int_0^{\pi/2} \cos^2\theta\, \sin\theta\, d\theta = \tfrac{1}{3}$, we have that

$$p = \tfrac{1}{3} nm\langle v^2 \rangle. \tag{6.15}$$

If we write the total number of molecules N in volume V as

$$N = nV, \tag{6.16}$$

then this equation can be written as

$$pV = \tfrac{1}{3} Nm\langle v^2 \rangle. \tag{6.17}$$

Using $\langle v^2 \rangle = 3k_{\rm B}T/m$, this can be rewritten as

$$pV = Nk_{\rm B}T, \tag{6.18}$$

which is the ideal gas equation which we met in eqn 1.12. This completes the kinetic theory derivation of the ideal gas law.

Equivalent forms of the ideal gas law:

• The form given in eqn 6.18 is $pV = Nk_{\rm B}T$, and contains an N which we reiterate is the total number of molecules in the gas.

• An equivalent form of the ideal gas equation can be derived by dividing both sides of eqn 6.18 by volume, so that

$$p = nk_{\rm B}T, \tag{6.19}$$

where n = N/V is the number of molecules per unit volume.

• Another form of the ideal gas law can be obtained by writing the number of molecules $N = n_{\rm m} N_{\rm A}$ where $n_{\rm m}$ is the number of moles and $N_{\rm A}$ is the Avogadro number (the number of molecules in a mole, see Section 1.1). In this case, eqn 6.18 becomes

$$pV = n_{\rm m} RT, \tag{6.20}$$

where

$$R = N_{\rm A} k_{\rm B} \tag{6.21}$$

is the gas constant (R = 8.31447 J K$^{-1}$ mol$^{-1}$).

The formula $p = nk_{\rm B}T$ expresses the important point that the pressure of an ideal gas does not depend on the mass m of the molecules. Although more massive molecules transfer greater momentum to the container walls than light molecules, their mean speed is lower and so they make fewer collisions with the walls. Therefore the pressure is the same for a gas of light or massive molecules; it depends only on n, the number per unit volume, and the temperature.
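The kinetic derivation above can be checked numerically. The sketch below evaluates the angular integral in eqn 6.14 and the mean square speed for a Maxwell–Boltzmann distribution with a simple midpoint rule, then compares the kinetic pressure with $nk_{\rm B}T$. The molecular mass, temperature and number density are assumed illustrative values, and the helper `midpoint` is not from the original.

```python
import math

K_B = 1.380649e-23  # Boltzmann constant, J/K


def midpoint(func, a, b, steps=20000):
    """Midpoint-rule estimate of the integral of func from a to b."""
    h = (b - a) / steps
    return h * sum(func(a + (i + 0.5) * h) for i in range(steps))


# Angular integral appearing in eqn 6.14; it should be close to 1/3.
angular = midpoint(lambda t: math.cos(t) ** 2 * math.sin(t), 0.0, math.pi / 2)

# Assumed illustrative values (roughly nitrogen near room conditions).
T = 300.0      # temperature, K
m = 4.65e-26   # molecular mass, kg
n = 2.5e25     # number density, m^-3

v_th = math.sqrt(2.0 * K_B * T / m)   # characteristic thermal speed


def f(v):
    """Unnormalised Maxwell-Boltzmann speed distribution."""
    return v ** 2 * math.exp(-(v / v_th) ** 2)


v_cut = 10.0 * v_th                   # the tail beyond this is negligible
mean_v2 = (midpoint(lambda v: v ** 2 * f(v), 0.0, v_cut)
           / midpoint(f, 0.0, v_cut))

p_kinetic = m * n * mean_v2 * angular  # eqn 6.14
p_ideal = n * K_B * T                  # eqn 6.19
print(f"angular integral = {angular:.6f}")
print(f"p_kinetic / p_ideal = {p_kinetic / p_ideal:.6f}")
```

Changing m in this sketch leaves the ratio unchanged, which illustrates the point that the pressure does not depend on the molecular mass.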


Example 6.2 What is the volume occupied by one mole of ideal gas at standard temperature and pressure (STP, defined as 0°C and 1 atm)? Solution: At $p = 1.01325 \times 10^5$ Pa and T = 273.15 K, the molar volume $V_{\rm m}$ can be obtained from eqn 6.20 as

$$V_{\rm m} = \frac{RT}{p} = 0.022414\ \mathrm{m^3} = 22.414\ \text{litres}. \tag{6.22}$$
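The number in Example 6.2 follows directly from eqn 6.20 with one mole of gas. A minimal sketch of the arithmetic, using the value of R quoted in the text (the function name is an assumption):

```python
R = 8.31447  # gas constant, J K^-1 mol^-1 (value quoted in the text)


def molar_volume(temperature_k, pressure_pa):
    """Volume of one mole of ideal gas, V_m = R T / p, in cubic metres."""
    return R * temperature_k / pressure_pa


v_m_stp = molar_volume(273.15, 1.01325e5)
print(f"V_m at STP = {v_m_stp:.6f} m^3 = {v_m_stp * 1000:.3f} litres")
```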

Example 6.3 What is the connection between pressure and kinetic energy density? Solution: The kinetic energy of a gas molecule moving with speed v is

$$\tfrac{1}{2} mv^2. \tag{6.23}$$

The total kinetic energy of the molecules of a gas per unit volume, i.e. the kinetic energy density which we will call u, is therefore given by

$$u = n \int_0^\infty \tfrac{1}{2} mv^2 f(v)\, dv = \tfrac{1}{2} nm\langle v^2 \rangle, \tag{6.24}$$

so that comparing with eqn 6.15 we have that

$$p = \tfrac{2}{3} u. \tag{6.25}$$

This expression is true for a non-relativistic gas of particles. For an ultra-relativistic gas, the correct expression is given in eqn 25.21.

6.3 Dalton’s law

If one has a mixture of gases in thermal equilibrium, then the total pressure $p = nk_{\rm B}T$ is simply the sum of the pressures due to each component of the mixture. We can write n as

$$n = \sum_i n_i, \tag{6.26}$$

where $n_i$ is the number density of the ith species. Therefore

$$p = \sum_i n_i k_{\rm B}T = \sum_i p_i, \tag{6.27}$$

where $p_i = n_i k_{\rm B}T$ is known as the partial pressure of the ith species. The observation that $p = \sum_i p_i$ is known as Dalton’s law, after the British chemist John Dalton (1766–1844), who was a pioneer of the atomic theory.


Example 6.4 Air is 75.5% N$_2$, 23.2% O$_2$, 1.3% Ar and 0.05% CO$_2$ by mass. Calculate the partial pressure of CO$_2$ in air at atmospheric pressure. Solution: Dalton’s law states that the partial pressure is proportional to the number density. The number density is proportional to the mass fraction divided by the molar mass. The molar masses of the species (in grammes) are 28 (N$_2$), 32 (O$_2$), 40 (Ar) and 44 (CO$_2$). Hence, the partial pressure of CO$_2$ is

$$p_{\rm CO_2} = \frac{\dfrac{0.05}{44}}{\dfrac{75.5}{28} + \dfrac{23.2}{32} + \dfrac{1.3}{40} + \dfrac{0.05}{44}} \times 1\ \mathrm{atm} = 0.00033\ \mathrm{atm}. \tag{6.28}$$
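The calculation in Example 6.4 amounts to converting mass fractions into mole fractions. A short sketch, using only the composition and molar masses given in the example (the variable names are illustrative):

```python
# Composition of air by mass (per cent) and molar masses (g/mol),
# as given in Example 6.4.
mass_percent = {"N2": 75.5, "O2": 23.2, "Ar": 1.3, "CO2": 0.05}
molar_mass = {"N2": 28.0, "O2": 32.0, "Ar": 40.0, "CO2": 44.0}

# Relative number of moles of each species: mass fraction / molar mass.
relative_moles = {s: mass_percent[s] / molar_mass[s] for s in mass_percent}
total_moles = sum(relative_moles.values())

# Dalton's law: the partial pressure is the mole fraction times the total.
total_pressure_atm = 1.0
p_co2 = total_pressure_atm * relative_moles["CO2"] / total_moles
print(f"p(CO2) = {p_co2:.5f} atm")
```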

Chapter summary

• The pressure, p, is given by

$$p = \tfrac{1}{3} nm\langle v^2 \rangle,$$

where n is the number of molecules per unit volume and m is the molecular mass.

• This expression agrees with the ideal gas equation, $p = nk_{\rm B}T$ (equivalently $pV = Nk_{\rm B}T$ with N = nV), where V is the volume, T is the temperature and $k_{\rm B}$ is the Boltzmann constant.

Exercises

(6.1) What is the volume occupied by 1 mole of gas at $10^{-10}$ Torr, the pressure inside an ‘ultra high vacuum’ (UHV) chamber?

(6.2) Calculate u, the kinetic energy density, for air at atmospheric pressure.

(6.3) Mr Fourier sits in his living room at 18°C. He decides he is rather cold and turns the heating up so that the temperature is 25°C. What happens to the total energy of the air in his living room? [Hint: what controls the pressure in the room?]

(6.4) A diffuse cloud of neutral hydrogen atoms (known as HI) in space has a temperature of 50 K. Calculate the pressure (in Pa) and the volume (in cubic light years) occupied by the cloud if its mass is $100 M_\odot$. ($M_\odot$ is the symbol for the mass of the Sun, see Appendix A.)

(6.5) (a) Given that the number of molecules hitting unit area of a surface per second with speeds between v and v + dv and angles between θ and θ + dθ to the normal is

$$\tfrac{1}{2}\, v\, n f(v)\, dv\, \sin\theta \cos\theta\, d\theta,$$

show that the average value of cos θ for these molecules is $\tfrac{2}{3}$.
(b) Using the results above, show that for a gas obeying the Maxwellian distribution (i.e. $f(v) \propto v^2 e^{-mv^2/2k_{\rm B}T}$) the average energy of all the molecules is $\tfrac{3}{2} k_{\rm B}T$, but the average energy of those which hit the surface is $2k_{\rm B}T$.

(6.6) The molecules in a gas travel with different velocities. A particular molecule will have velocity v and speed v = |v| and will move at an angle θ to some chosen fixed axis. We have shown that the number of molecules in a gas with speeds between v and v + dv, and moving at angles between θ and θ + dθ to any chosen axis is given by

$$\tfrac{1}{2}\, n f(v)\, dv\, \sin\theta\, d\theta,$$

where n is the number of molecules per unit volume and f(v) is some function of v only. [f(v) could be the Maxwellian distribution given above; however you should not assume this but rather calculate the general case.] Hence show by integration that:
(a) $\langle u \rangle = 0$
(b) $\langle u^2 \rangle = \tfrac{1}{3} \langle v^2 \rangle$
(c) $\langle |u| \rangle = \tfrac{1}{2} \langle v \rangle$
where u is any one Cartesian component of v, i.e. $v_x$, $v_y$ or $v_z$. [Hint: You can take u as the z-component of v without loss of generality. Why? Then express u in terms of v and θ and average over v and θ. You can use expressions such as

$$\langle v \rangle = \frac{\int_0^\infty v f(v)\, dv}{\int_0^\infty f(v)\, dv}$$

and similarly for $\langle v^2 \rangle$. Make sure you understand why.]

(6.7) If $v_1$, $v_2$, $v_3$ are three Cartesian components of v, what value do you expect for $\langle v_1 v_2 \rangle$, $\langle v_1 v_3 \rangle$ and $\langle v_2 v_3 \rangle$? Evaluate one of them by integration to check your deduction.

(6.8) Calculate the partial pressure of O$_2$ in air at atmospheric pressure.


Robert Boyle (1627–1691)

Robert Boyle was born into wealth. His father was a self-made man of humble yeoman stock who, at the age of 22, had left England for Ireland to seek his fortune. This his father found or, possibly more accurately, “grabbed” and through rapid land acquisition of a rather dubious nature Boyle senior became one of England’s richest men and the Earl of Cork to boot. Robert was born when his father was in his sixties and was the last but one of his father’s sixteen children. His father, as a new member of the aristocracy, believed in the best education for his children, and Robert was duly packed off to Eton and then, at the age of 12, sent off for a European Grand Tour, taking in Geneva, Venice and Florence. Boyle studied the works of Galileo, who died in Florence while Boyle was staying in the city. Meanwhile, his father was getting into a spot of bother with the Irish rebellion of 1641–1642, resulting in the loss of the rents that kept him and his family in the manner to which they had become accustomed, and hence also causing Robert Boyle some financial difficulty. He was almost married off at this time to a wealthy heiress, but Boyle managed to escape this fate and remained unmarried for the rest of his life. His father died in 1643 and Boyle returned to England the following year, inheriting his father’s Dorset estate. However, by this time the Civil War (which had started in 1642) was in full swing and Boyle tried hard not to take sides. He kept his head down, devoting his time to study, building a chemical laboratory in his house and working on moral and theological essays. Cromwell’s defeat of the Irish in 1652 worked well for Boyle as many Irish lands were handed over to the English colonists. Financially, Boyle was now secure and ready to live the life of a gentleman.

Fig. 6.6 Robert Boyle

In London, he had met John Wilkins who had founded an intellectual society which he called “The Invisible College” and which suddenly brought Boyle into contact with the leading thinkers of the day. When Wilkins was appointed Warden of Wadham College, Oxford, Boyle decided to move to Oxford and set up a laboratory there. He set up an air pump and, together with a number of talented assistants (the most famous of whom was Robert Hooke, later to discover his law of springs and to observe a cell with a microscope, in addition to numerous other discoveries), Boyle and his team conducted a large number of elaborate experiments in this new vacuum. They showed that sound did not travel in a vacuum, and that flames and living organisms could not be sustained, and discovered the “spring of air”, namely that compressing air resulted in its pressure increasing, and that the pressure of a gas and its volume were in inverse proportion. Boyle was much taken with the atomistic viewpoint as described by the French philosopher Pierre Gassendi (1592–1655), which seems particularly appropriate for someone whose work led to the path for the development of the kinetic theory of gases. His greatest legacy was in his reliance on experiment as a means of determining scientific truth. He was, however, also someone who often worked vicariously through a band of assistants, citing his weakness of health and of eyesight as a reason for failing to write his papers as he wished to and to have read other people’s works as he ought; his writings are, however, full of criticisms of his assistants for making mistakes, failing to record data and generally slowing down his research endeavours.

With the restoration of the monarchy in 1660, the Invisible College, which had been meeting for several years in Gresham College, London, sought the blessing of the newly crowned Charles II and became the Royal Society, which has existed ever since as a thriving scientific society. In 1680, Boyle (who had been a founding fellow of the Royal Society) was elected President of the Royal Society, but declined to hold the office, citing an unwillingness to take the necessary oaths. Boyle retained a strong Christian faith throughout his life, and prided himself on his honesty and pure seeking of the truth. In 1670, Boyle suffered a stroke but made a good recovery, staying active in research until the mid-1680s. He died in 1691, shortly after the death of his sister Katherine to whom he had been extremely close.

7 Molecular effusion

7.1 Flux
7.2 Effusion
Chapter summary
Exercises

Isotopes (the word means ‘same place’) are atoms of a chemical element with the same atomic number Z (and hence number of protons in the nucleus) but diﬀerent atomic weights A (and hence diﬀerent number of neutrons in the nucleus).

Eﬀusion is the process by which a gas escapes from a very small hole. The empirical relation known as Graham’s law of eﬀusion [after Thomas Graham (1805–1869)] states that the rate of eﬀusion is inversely proportional to the square root of the mass of the eﬀusing molecule.

Example 7.1 Effusion can be used to separate different isotopes of a gas (which cannot be separated chemically). For example, in the separation of $^{238}$UF$_6$ and $^{235}$UF$_6$ the ratio of the effusion rates of the two gases is equal to

$$\sqrt{\frac{\text{mass of } ^{238}\mathrm{UF}_6}{\text{mass of } ^{235}\mathrm{UF}_6}} = \sqrt{\frac{352.0412}{348.0343}} = 1.00574, \tag{7.1}$$

which, although small, was enough for many kilogrammes of $^{235}$UF$_6$ to be extracted for the Manhattan project in 1945 to produce the first uranium atom bomb, which was subsequently dropped on Hiroshima.

Example 7.2 How much faster does helium gas effuse out of a small hole than N$_2$? Solution:

$$\sqrt{\frac{\text{mass of N}_2}{\text{mass of He}}} = \sqrt{\frac{28}{4}} = 2.6. \tag{7.2}$$
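Both examples above apply Graham's law in the same way: the ratio of effusion rates is the square root of the inverse mass ratio. A small sketch reproducing the two numbers (the function name is an assumption, and the masses are those quoted in the examples):

```python
import math


def effusion_rate_ratio(mass_a, mass_b):
    """Graham's law: rate of A divided by rate of B equals sqrt(m_B / m_A)."""
    return math.sqrt(mass_b / mass_a)


# Example 7.2: helium (mass 4) relative to N2 (mass 28).
he_vs_n2 = effusion_rate_ratio(4.0, 28.0)

# Example 7.1: 235-UF6 relative to 238-UF6.
uf6_ratio = effusion_rate_ratio(348.0343, 352.0412)

print(f"He / N2 effusion rate ratio:        {he_vs_n2:.2f}")
print(f"235-UF6 / 238-UF6 effusion ratio:   {uf6_ratio:.5f}")
```

Only the ratio of the masses matters, so any consistent mass unit can be used.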

In this chapter, we will discover where Graham’s law comes from. We begin by evaluating the ﬂux of particles hitting the inside walls of the container of a gas.

7.1 Flux

The concept of flux is a very important one in thermal physics. It quantifies the flow of particles or the flow of energy or even the flow of momentum. Of relevance to this chapter is the molecular flux, Φ, which is defined to be the number of molecules which strike unit area per second. Thus

$$\text{molecular flux} = \frac{\text{number of molecules}}{\text{area} \times \text{time}}. \tag{7.3}$$

The units of molecular flux are therefore m$^{-2}$ s$^{-1}$. We can also define heat flux using

$$\text{heat flux} = \frac{\text{amount of heat}}{\text{area} \times \text{time}}. \tag{7.4}$$

The units of heat flux are therefore J m$^{-2}$ s$^{-1}$. In Section 9.1, we will also come across a flux of momentum. Returning to the effusion problem, we note that the flux of molecules in a gas can be evaluated by integrating expression 6.13 over all θ and v, so that

$$\begin{aligned} \Phi &= \int_0^\infty \int_0^{\pi/2} v\cos\theta\; n f(v)\, dv\, \tfrac{1}{2}\sin\theta\, d\theta \\ &= \frac{n}{2} \int_0^\infty dv\, v f(v) \int_0^{\pi/2} d\theta\, \cos\theta \sin\theta, \end{aligned} \tag{7.5}$$

so that

$$\Phi = \tfrac{1}{4} n \langle v \rangle. \tag{7.6}$$

An alternative expression for Φ can be found as follows: rearranging the ideal gas law $p = nk_{\rm B}T$, we can write

$$n = \frac{p}{k_{\rm B}T}, \tag{7.7}$$

and using the expression for the average speed of molecules in a gas from eqn 5.13

$$\langle v \rangle = \sqrt{\frac{8k_{\rm B}T}{\pi m}}, \tag{7.8}$$

we can substitute these expressions into eqn 7.6 and obtain

$$\Phi = \frac{p}{\sqrt{2\pi m k_{\rm B}T}}. \tag{7.9}$$

Note that consideration of eqn 7.9 shows us that the effusion rate depends inversely on the square root of the mass in agreement with Graham’s law.

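Note: $\int_0^{\pi/2} \sin\theta \cos\theta\, d\theta = \tfrac{1}{2}$. (Hint: substitute u = sin θ, du = cos θ dθ, so that the integral becomes $\int_0^1 u\, du = \tfrac{1}{2}$.)

Example 7.3 Calculate the particle flux from N$_2$ gas at STP (standard temperature and pressure, i.e. 1 atm and 0°C). Solution:

$$\Phi = \frac{1.01325 \times 10^5\ \mathrm{Pa}}{\sqrt{2\pi \times (28 \times 1.67 \times 10^{-27}\ \mathrm{kg}) \times 1.38 \times 10^{-23}\ \mathrm{J\,K^{-1}} \times 273\ \mathrm{K}}} \approx 3 \times 10^{27}\ \mathrm{m^{-2}\,s^{-1}}. \tag{7.10}$$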
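The flux in Example 7.3 can be evaluated in two equivalent ways, from eqn 7.9 directly or from eqn 7.6 using the number density and mean speed. The sketch below does both as a consistency check; the function names are assumptions and the constants are standard values.

```python
import math

K_B = 1.380649e-23   # Boltzmann constant, J/K
U = 1.66053907e-27   # atomic mass unit, kg


def molecular_flux(p, m, T):
    """Flux from eqn 7.9: Phi = p / sqrt(2 pi m k_B T)."""
    return p / math.sqrt(2.0 * math.pi * m * K_B * T)


def molecular_flux_from_mean_speed(p, m, T):
    """Flux from eqn 7.6: Phi = n <v> / 4, with n and <v> from eqns 7.7, 7.8."""
    n = p / (K_B * T)
    mean_speed = math.sqrt(8.0 * K_B * T / (math.pi * m))
    return 0.25 * n * mean_speed


# N2 at STP, as in Example 7.3.
flux_n2 = molecular_flux(1.01325e5, 28.0 * U, 273.15)
flux_n2_check = molecular_flux_from_mean_speed(1.01325e5, 28.0 * U, 273.15)
print(f"Phi = {flux_n2:.2e} m^-2 s^-1")
```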


7.2 Effusion

Consider a container of gas with a small hole of area A in the side. Gas will leak (i.e. effuse) out of the hole (see Fig. 7.1). The hole is small, so that the equilibrium of gas in the container is not disturbed. The number of molecules escaping per unit time is just the number of molecules hitting the hole area in the closed box per second, so is given by ΦA per second, where Φ is the molecular flux. This is the effusion rate.

Fig. 7.1 A gas effuses from a small hole in its container.

Example 7.4 In the Knudsen method of measuring vapour pressure p from a liquid containing molecules of mass m at temperature T, the liquid is placed in the bottom of a container which has a small hole of area A at the top (see Fig. 7.2). The container is placed on a weighing balance and its weight Mg is measured as a function of time. In equilibrium, the effusion rate is

$$\Phi A = \frac{pA}{\sqrt{2\pi m k_{\rm B}T}}, \tag{7.11}$$

so that the rate of change of mass, dM/dt, is given by mΦA. Hence

$$p = \sqrt{\frac{2\pi k_{\rm B}T}{m}}\, \frac{1}{A}\, \frac{dM}{dt}. \tag{7.12}$$

Fig. 7.2 The Knudsen method.

Fig. 7.3 The distribution function for molecular speeds (Maxwell–Boltzmann distribution) in a gas is proportional to $v^2 e^{-mv^2/2k_{\rm B}T}$ (solid line) but the gas which effuses from a small hole has a distribution function which is proportional to $v^3 e^{-mv^2/2k_{\rm B}T}$ (dashed line).
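The Knudsen method of Example 7.4 can be sketched as a pair of functions: one giving the mass loss rate mΦA for a known vapour pressure, and one inverting it as in eqn 7.12. All of the numbers below (molecular mass, temperature, hole area and measured rate) are hypothetical, and dM/dt is treated as the magnitude of the rate of mass loss.

```python
import math

K_B = 1.380649e-23   # Boltzmann constant, J/K
U = 1.66053907e-27   # atomic mass unit, kg


def mass_loss_rate(p, m, T, area):
    """Magnitude of dM/dt = m * Phi * A, with Phi from eqn 7.11."""
    return m * p * area / math.sqrt(2.0 * math.pi * m * K_B * T)


def vapour_pressure(dM_dt, m, T, area):
    """Vapour pressure from the measured mass loss rate (eqn 7.12)."""
    return math.sqrt(2.0 * math.pi * K_B * T / m) * dM_dt / area


# Hypothetical measurement, for illustration only.
M_MOLECULE = 100.0 * U   # molecular mass, kg
T_LIQUID = 400.0         # temperature, K
HOLE_AREA = 1.0e-6       # hole area, m^2
MEASURED_RATE = 1.0e-9   # magnitude of dM/dt, kg/s

p_estimate = vapour_pressure(MEASURED_RATE, M_MOLECULE, T_LIQUID, HOLE_AREA)
print(f"Estimated vapour pressure: {p_estimate:.3f} Pa")
```

Feeding the estimated pressure back into `mass_loss_rate` should recover the measured rate, which is a quick way to check that the two expressions are mutually consistent.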

Effusion preferentially selects faster molecules. Therefore the speed distribution of molecules effusing through the hole is not Maxwellian. This result seems paradoxical at first glance: aren’t the molecules emerging from the box the same ones that were inside beforehand? How can their distribution be different? The reason is that the faster molecules inside the box travel more quickly and have a greater probability of reaching the hole than their slower cousins.¹ This can be expressed mathematically by noticing that the number of molecules which hit a wall (or a hole) is given by eqn 6.13 and this has an extra factor of v in it. Thus the distribution of molecules effusing through the hole is proportional to

$$v^3 e^{-mv^2/2k_{\rm B}T}. \tag{7.13}$$

Note the extra factor of v in this expression compared with the usual Maxwell–Boltzmann distribution in eqn 5.10. The molecules in the Maxwellian gas had an average energy of $\tfrac{1}{2} m\langle v^2 \rangle = \tfrac{3}{2} k_{\rm B}T$, but the molecules in the effusing gas have a higher energy, as the following example will demonstrate.

¹ An analogy may help here: the foreign tourists who visit your country are not completely representative of the nation from which they have come; this is because they are likely to be at least a little more adventurous than their average countrymen and countrywomen by the very fact that they have actually stepped out of their own borders.

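Example 7.5 What is the mean kinetic energy of gas molecules effusing out of a small hole?
Solution:
$$\langle\text{kinetic energy}\rangle = \left\langle \tfrac{1}{2}mv^2 \right\rangle = \frac{\tfrac{1}{2}m\int_0^\infty v^2\, v^3 e^{-\frac{1}{2}mv^2/k_\mathrm{B}T}\,\mathrm{d}v}{\int_0^\infty v^3 e^{-\frac{1}{2}mv^2/k_\mathrm{B}T}\,\mathrm{d}v} = \frac{1}{2}m\,\frac{2k_\mathrm{B}T}{m}\,\frac{\int_0^\infty u^2 e^{-u}\,\mathrm{d}u}{\int_0^\infty u\, e^{-u}\,\mathrm{d}u}, \tag{7.14}$$
where the substitution $u = mv^2/2k_\mathrm{B}T$ has been made. Using the standard integral $\int_0^\infty x^n e^{-x}\,\mathrm{d}x = n!$ (see Appendix C.1), we have that
$$\langle\text{kinetic energy}\rangle = 2k_\mathrm{B}T. \tag{7.15}$$
This is larger by a factor of $\frac{4}{3}$ compared to the mean kinetic energy of molecules in the gas. This is because effusion preferentially selects higher energy molecules.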
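The result of Example 7.5 can be checked by brute-force integration. The sketch below is only an illustration under stated assumptions: it works with a dimensionless speed $x = v/\sqrt{k_\mathrm{B}T/m}$, uses a simple midpoint rule, and the function name, cut-off and step count are arbitrary choices rather than anything prescribed by the text.

```python
import math


def mean_kinetic_energy_over_kT(power, x_max=12.0, steps=20000):
    """<(1/2) m v^2> / (k_B T) for a speed distribution proportional to
    v**power * exp(-m v^2 / (2 k_B T)).

    Uses the dimensionless speed x = v / sqrt(k_B T / m) and a midpoint rule;
    x_max and steps are assumed numerical settings.
    """
    dx = x_max / steps
    numerator = 0.0
    denominator = 0.0
    for i in range(steps):
        x = (i + 0.5) * dx
        weight = x**power * math.exp(-0.5 * x * x)
        numerator += 0.5 * x * x * weight  # kinetic energy in units of k_B T
        denominator += weight
    return numerator / denominator


bulk = mean_kinetic_energy_over_kT(2)      # Maxwell-Boltzmann gas, weight v^2
effusing = mean_kinetic_energy_over_kT(3)  # effusing molecules, weight v^3
print(f"bulk: {bulk:.4f} k_B T, effusing: {effusing:.4f} k_B T, ratio: {effusing / bulk:.4f}")
```

The two numbers should come out close to $\frac{3}{2}$ and $2$ respectively, reproducing the factor of $\frac{4}{3}$.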

The hole has to be small. How small? The diameter of the hole has to be much less² than the mean free path $\lambda$, defined in Section 8.3.

Example 7.6 Consider a container divided by a partition with a small hole, diameter $D$, containing the same gas on each side. The gas on the left-hand side has temperature $T_1$ and pressure $p_1$. The gas on the right-hand side has temperature $T_2$ and pressure $p_2$. If $D \gg \lambda$, then
$$p_1 = p_2. \tag{7.16}$$
If $D \ll \lambda$, we are in the effusion regime and the system will achieve equilibrium when the molecular fluxes balance, so that $\Phi_1 = \Phi_2$, so that, using eqn 7.9, we may write
$$\frac{p_1}{\sqrt{T_1}} = \frac{p_2}{\sqrt{T_2}}. \tag{7.17}$$

2 This is because, as we shall see in Section 8.3, the mean free path controls the characteristic distance between collisions. If the hole is small on this scale, molecules can effuse out without the rest of the gas ‘noticing’, i.e. without a pressure gradient developing close to the hole.
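To get a feel for the effusion-regime condition of Example 7.6, the pressure ratio across the partition can be evaluated for a pair of temperatures. This is a minimal sketch; the function name and the temperatures 300 K and 77 K are assumptions chosen purely for illustration.

```python
import math


def effusion_pressure_ratio(T1, T2):
    """p1 / p2 when the fluxes balance through a hole with D << lambda (eqn 7.17)."""
    return math.sqrt(T1 / T2)


# Hypothetical illustration: one side at 300 K, the other at 77 K.
ratio = effusion_pressure_ratio(300.0, 77.0)
print(f"p1/p2 = {ratio:.3f}")
```

In the opposite limit, where the hole is much larger than the mean free path, the ratio is simply 1 whatever the temperatures.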


A final example gives an approximate derivation of the flow rate of gas down a pipe at low pressures.

Example 7.7 Estimate the mass flow rate of gas down a long pipe of length $L$ and diameter $D$ at very low pressures in terms of the difference in pressures $p_1 - p_2$ between the two ends of the pipe.
Solution: This type of flow is known as Knudsen flow. At very low pressures, molecules make collisions much more often with the walls of the tube than they do with each other. Let us define a coordinate $x$ which measures the distance along the pipe. The net flux $\Phi(x)$ of molecules flowing down the pipe (in the direction of increasing $x$) at position $x$ can be estimated by subtracting the molecules effusing up the pipe since their last collision (roughly a distance $D$ downstream) from the molecules effusing down the pipe since their last collision (roughly a distance $D$ upstream). Thus
$$\Phi(x) \approx \frac{1}{4}\langle v\rangle\,[n(x - D) - n(x + D)], \tag{7.18}$$
where $n(x)$ is the number density of molecules at position $x$. Using $p = \frac{1}{3}nm\langle v^2\rangle$ (eqn 6.15), this can be written
$$\Phi(x) \approx \frac{3}{4m}\frac{\langle v\rangle}{\langle v^2\rangle}\,[p(x - D) - p(x + D)]. \tag{7.19}$$
We can write
$$p(x - D) - p(x + D) \approx -2D\,\frac{\mathrm{d}p}{\mathrm{d}x}, \tag{7.20}$$
but also notice that in steady state $\Phi$ must be the same along the tube, so that
$$\frac{\mathrm{d}p}{\mathrm{d}x} = \frac{p_2 - p_1}{L}. \tag{7.21}$$
Hence the mass flow rate $\dot{M} = m\Phi(\pi D^2/4)$ (where $\pi D^2/4$ is the cross-sectional area of the pipe) is given by
$$\dot{M} \approx \frac{3}{8}\frac{\langle v\rangle}{\langle v^2\rangle}\,\pi D^3\,\frac{p_1 - p_2}{L}. \tag{7.22}$$
With eqns 5.13 and 5.14, we have that
$$\frac{\langle v\rangle^2}{\langle v^2\rangle} = \frac{8}{3\pi}, \tag{7.23}$$
and hence our estimate of the Knudsen flow rate is
$$\dot{M} \approx \frac{D^3}{\langle v\rangle}\,\frac{p_1 - p_2}{L}. \tag{7.24}$$
Note that the flow rate is proportional to $D^3$, so it is much more efficient to pump gas through wide pipes to obtain low pressures.
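The $D^3$ scaling of the Knudsen flow estimate (eqn 7.24) is easy to see numerically. The sketch below is only an illustration: the function names are invented here, and the pipe dimensions, pressures and the choice of N$_2$ at 300 K are assumed values picked so that the mean free path is longer than the pipe diameter.

```python
import math

K_B = 1.380649e-23    # Boltzmann constant in J/K
AMU = 1.66053907e-27  # atomic mass unit in kg


def mean_speed(m, T):
    """Mean speed <v> = sqrt(8 k_B T / (pi m)) of a Maxwell-Boltzmann gas."""
    return math.sqrt(8.0 * K_B * T / (math.pi * m))


def knudsen_mass_flow(D, L, p1, p2, m, T):
    """Order-of-magnitude estimate of eqn 7.24: D^3 (p1 - p2) / (<v> L), in kg/s."""
    return D**3 * (p1 - p2) / (mean_speed(m, T) * L)


# Hypothetical illustration: N2 at 300 K, 0.1 Pa at one end and vacuum at the other,
# through pipes 1 m long with diameters 1 cm and 2 cm (all values assumed).
m_n2 = 28.0 * AMU
narrow = knudsen_mass_flow(0.01, 1.0, 0.1, 0.0, m_n2, 300.0)
wide = knudsen_mass_flow(0.02, 1.0, 0.1, 0.0, m_n2, 300.0)
print(f"narrow pipe: {narrow:.3g} kg/s, wide pipe: {wide:.3g} kg/s, ratio: {wide / narrow:.1f}")
```

Doubling the diameter increases the estimated flow rate eightfold, which is the point made in the text about using wide pumping lines.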


Chapter summary

• The molecular flux, $\Phi$, is the number of molecules which strike unit area per second and is given by
$$\Phi = \frac{1}{4}n\langle v\rangle.$$
• This expression, together with the ideal gas equation, can be used to derive an alternative expression for the particle flux:
$$\Phi = \frac{p}{\sqrt{2\pi m k_\mathrm{B} T}}.$$
• These expressions also govern molecular effusion through a small hole.

Exercises

(7.1) In a vacuum chamber designed for surface science experiments, the pressure of residual gas is kept as low as possible so that surfaces can be kept clean. The coverage of a surface by a single monolayer requires about $10^{19}$ atoms per m$^2$. What pressure would be needed to deposit less than one monolayer per hour from residual gas? You may assume that if a molecule hits the surface, it sticks.

(7.2) A vessel contains a monatomic gas at temperature $T$. Use the Maxwell–Boltzmann distribution of speeds to calculate the mean kinetic energy of the molecules. Molecules of the gas stream through a small hole into a vacuum. A box is opened for a short time and catches some of the molecules. Neglecting the thermal capacity of the box, calculate the final temperature of the gas trapped in the box.

(7.3) A closed vessel is partially filled with liquid mercury; there is a hole of area $10^{-7}$ m$^2$ above the liquid level. The vessel is placed in a region of high vacuum at 273 K and after 30 days is found to be lighter by $2.4 \times 10^{-5}$ kg. Estimate the vapour pressure of mercury at 273 K. (The relative molecular mass of mercury is 200.59.)

(7.4) Calculate the mean speed and most probable speed for a molecule of mass $m$ which has effused out of an enclosure at temperature $T$. Which of the two speeds is the larger?

(7.5) A gas effuses into a vacuum through a small hole of area $A$. The particles are then collimated by passing through a very small circular hole of radius $a$, in a screen a distance $d$ from the first hole. Show that the rate at which particles emerge from the second hole is $\frac{1}{4}nA\langle v\rangle(a^2/d^2)$, where $n$ is the particle density and $\langle v\rangle$ is the average speed. (Assume no collisions take place after the gas effuses through the second hole, and that $d \gg a$.)

(7.6) Show that if a gas were allowed to leak through a small hole into an evacuated sphere, and the particles condensed where they first hit the surface, they would form a uniform coating.

(7.7) An astronaut goes for a space walk and her space suit is pressurised to 1 atm. Unfortunately, a tiny piece of space dust punctures her suit and it develops a small hole of radius 1 µm. What force does she feel due to the effusing gas?

(7.8) Show that the time dependence of the pressure inside an oven (volume $V$) containing hot gas (molecular mass $m$, temperature $T$) with a small hole of area $A$ is given by
$$p(t) = p(0)\,e^{-t/\tau}, \tag{7.25}$$
with
$$\tau = \frac{V}{A}\sqrt{\frac{2\pi m}{k_\mathrm{B}T}}. \tag{7.26}$$

8 The mean free path and collisions

8.1 The mean collision time
8.2 The collision cross-section
8.3 The mean free path
Chapter summary
Exercises

1

It turns out that large-angle scattering dominates transport processes in most gases (described in Chapter 9) and is largely independent of energy and therefore temperature; this allows us to use a rigid-sphere model of collisions, i.e. to model atoms in a gas as billiard balls.

At room temperature, the r.m.s. speed of O$_2$ or N$_2$ is about 500 m s$^{-1}$. Processes such as the diffusion of one gas into another would therefore be almost instantaneous, were it not for the occurrence of collisions between molecules. Collisions are fundamentally quantum mechanical events, but in a dilute gas, molecules spend most of their time between collisions and so we can consider them as classical billiard balls and ignore the details of what actually happens during a collision. All that we care about is that after collisions the molecules’ velocities become essentially randomized.¹ In this chapter we will model the effect of collisions in a gas and develop the concepts of a mean collision time, the collision cross-section and the mean free path.

8.1 The mean collision time

In this section, we aim to calculate the average time between molecular collisions. Let us consider a particular molecule moving in a gas of other similar molecules. To make things simple to start with, we suppose the molecule under consideration is travelling at speed $v$ and that the other molecules in the gas are stationary. This is clearly a gross oversimplification, but we will relax this assumption later. We will also attribute a collision cross-section $\sigma$ to each molecule which is something like the cross-sectional area of our molecule. Again, we will refine this definition later in the chapter. In a time $\mathrm{d}t$, our molecule will sweep out a volume $\sigma v\,\mathrm{d}t$. If another molecule happens to lie inside this volume, there will be a collision. With $n$ molecules per unit volume, the probability of a collision in time $\mathrm{d}t$ is therefore $n\sigma v\,\mathrm{d}t$. Let us define $P(t)$ as follows:
$$P(t) = \text{the probability of a molecule not colliding up to time } t. \tag{8.1}$$
Elementary calculus then implies that
$$P(t + \mathrm{d}t) = P(t) + \frac{\mathrm{d}P}{\mathrm{d}t}\,\mathrm{d}t, \tag{8.2}$$
but $P(t + \mathrm{d}t)$ is also the probability of a molecule not colliding up to time $t$ multiplied by the probability of not colliding in subsequent time $\mathrm{d}t$, i.e. that
$$P(t + \mathrm{d}t) = P(t)(1 - n\sigma v\,\mathrm{d}t). \tag{8.3}$$


Hence rearranging gives
$$\frac{1}{P}\frac{\mathrm{d}P}{\mathrm{d}t} = -n\sigma v \tag{8.4}$$
and therefore that (using $P(0) = 1$)
$$P(t) = e^{-n\sigma v t}. \tag{8.5}$$
Now the probability of surviving without collision up to time $t$ but then colliding in the next $\mathrm{d}t$ is
$$e^{-n\sigma v t}\,n\sigma v\,\mathrm{d}t. \tag{8.6}$$
We can check that this is a proper probability by integrating it,
$$\int_0^\infty e^{-n\sigma v t}\,n\sigma v\,\mathrm{d}t = 1, \tag{8.7}$$
and confirming that it is equal to unity. Here, use has been made of the integral
$$\int_0^\infty e^{-x}\,\mathrm{d}x = 0! = 1 \tag{8.8}$$
(see Appendix C.1). We are now in a position to calculate the mean scattering time $\tau$, which is the average time elapsed between collisions for a given molecule. This is given by
$$\tau = \int_0^\infty t\,e^{-n\sigma v t}\,n\sigma v\,\mathrm{d}t = \frac{1}{n\sigma v}\int_0^\infty (n\sigma v t)\,e^{-n\sigma v t}\,\mathrm{d}(n\sigma v t) = \frac{1}{n\sigma v}\int_0^\infty x\,e^{-x}\,\mathrm{d}x, \tag{8.9}$$
where the integral has been simplified by the substitution $x = n\sigma v t$. Hence we find that
$$\tau = \frac{1}{n\sigma v}, \tag{8.10}$$
where use has been made of the integral (again, see Appendix C.1)
$$\int_0^\infty x\,e^{-x}\,\mathrm{d}x = 1! = 1. \tag{8.11}$$
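The argument leading to eqns 8.5 and 8.10 can be mimicked with a small simulation. The sketch below is an illustration under assumptions of our own: time is chopped into steps of length `dt`, a collision occurs in each step with probability $n\sigma v\,\mathrm{d}t$, and the collision rate $n\sigma v$ is set to 1 in arbitrary units. The function name, step size, sample count and seed are all arbitrary choices.

```python
import math
import random


def simulate_free_times(rate, dt, samples, seed=0):
    """Simulate times to the next collision for a molecule with collision rate
    rate = n * sigma * v, using the step-by-step picture of eqn 8.3:
    in each interval dt the molecule collides with probability rate * dt.
    """
    rng = random.Random(seed)
    p_collide = rate * dt
    times = []
    for _ in range(samples):
        steps = 1
        while rng.random() >= p_collide:  # survived this interval, try the next one
            steps += 1
        times.append(steps * dt)
    return times


rate = 1.0  # n * sigma * v in arbitrary units (assumed value)
times = simulate_free_times(rate, dt=0.01, samples=20000)
tau_estimate = sum(times) / len(times)
surviving_fraction = sum(t > 1.0 / rate for t in times) / len(times)
print(f"mean time between collisions: {tau_estimate:.3f} (eqn 8.10 gives {1.0 / rate:.3f})")
print(f"fraction surviving beyond tau: {surviving_fraction:.3f} (eqn 8.5 gives {math.exp(-1.0):.3f})")
```

With a small enough `dt` the simulated mean should approach $1/(n\sigma v)$ and the fraction of molecules surviving beyond $\tau$ should approach $e^{-1}$, up to statistical noise.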

8.2 The collision cross-section

In this section we will consider the factor $\sigma$ in much more detail. To be as general as possible, we will consider two spherical molecules of radii $a_1$ and $a_2$ with a hard-sphere potential between them (see Fig. 8.1).

Fig. 8.1 Two spherical molecules of radii $a_1$ and $a_2$ with a hard-sphere potential between them.


This implies that there is a potential energy function $V(R)$ that depends on the relative separation $R$ of their centres, and is given by
$$V(R) = \begin{cases} 0 & R > a_1 + a_2 \\ \infty & R \le a_1 + a_2 \end{cases} \tag{8.12}$$
and this is sketched in Fig. 8.2. The impact parameter $b$ between two moving molecules is defined as the distance of closest approach that would result if the molecular trajectories were undeflected by the collision. Thus for a hard-sphere potential there is only a collision if the impact parameter $b < a_1 + a_2$. Focus on one of these molecules (let’s say the one with radius $a_1$). This is depicted in Fig. 8.3. Now imagine molecules of the other type (with radius $a_2$) nearby. A collision will only take place if the centre of these other molecules comes inside a tube of radius $a_1 + a_2$ (so that the molecule labelled A would not collide, whereas B and C would). Thus our first molecule can be considered to sweep out an imaginary tube of space of cross-sectional area $\pi(a_1 + a_2)^2$ that defines its ‘personal space’. The area of this tube is called the collision cross-section $\sigma$ and is then given by
$$\sigma = \pi(a_1 + a_2)^2. \tag{8.13}$$
If $a_1 = a_2 = a$, then
$$\sigma = \pi d^2, \tag{8.14}$$
where $d = 2a$ is the molecular diameter.

Fig. 8.2 The hard-sphere potential $V(R)$.

Fig. 8.3 A molecule sweeps out an imaginary tube of space of cross-sectional area $\sigma = \pi(a_1 + a_2)^2$. If the centre of another molecule enters this tube, there will be a collision.

Is the hard-sphere potential correct? It is a good approximation at lower temperatures,² but progressively worsens as the temperature increases. Molecules are not really hard spheres but slightly squashy objects, and when they move at higher speeds and plough into each other with more momentum, you need more of a direct hit to cause a collision. Thus as the gas is warmed, the molecules may appear to have a smaller cross-sectional area.³

2 But not too low a temperature, or quantum effects become important.

3 Cross-sections in nuclear and particle physics can be much larger than the size of the object, expressing the fact that an object (in this case a particle) can react strongly with things a long distance away from it.

8.3 The mean free path

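Having derived the mean collision time, it is tempting to derive the mean free path as
$$\lambda = \langle v\rangle\tau = \frac{\langle v\rangle}{n\sigma v}, \tag{8.15}$$
but what should we take as $v$? A first guess is to use $\langle v\rangle$, but that turns out to be not quite right. What has gone wrong? Our picture of molecular scattering has been to focus on one molecule as the moving one, and think of all of the others as sitting ducks, fixed in space waiting patiently for a collision to occur. The reality is quite different: all molecules are whizzing around. We should therefore take $v$ as the average relative velocity, i.e. $\langle v_\mathrm{r}\rangle$, where
$$\mathbf{v}_\mathrm{r} = \mathbf{v}_1 - \mathbf{v}_2 \tag{8.16}$$
and $\mathbf{v}_1$ and $\mathbf{v}_2$ are the velocities of two molecules labelled 1 and 2. Now,
$$v_\mathrm{r}^2 = v_1^2 + v_2^2 - 2\,\mathbf{v}_1\cdot\mathbf{v}_2, \tag{8.17}$$
so that
$$\langle v_\mathrm{r}^2\rangle = \langle v_1^2\rangle + \langle v_2^2\rangle = 2\langle v^2\rangle, \tag{8.18}$$
because $\langle\mathbf{v}_1\cdot\mathbf{v}_2\rangle = 0$ (which follows because $\langle\cos\theta\rangle = 0$). The quantity which we want is $\langle v_\mathrm{r}\rangle$, but what we have an expression for is $\langle v_\mathrm{r}^2\rangle$. If the probability distribution is a Maxwell–Boltzmann distribution, then the error in writing $\langle v_\mathrm{r}\rangle \approx \sqrt{\langle v_\mathrm{r}^2\rangle}$ is small,⁴ so to a reasonable degree of approximation we can write
$$\langle v_\mathrm{r}\rangle \approx \sqrt{\langle v_\mathrm{r}^2\rangle} \approx \sqrt{2}\,\langle v\rangle \tag{8.19}$$
and hence we obtain an expression for $\lambda$ as follows:
$$\lambda \approx \frac{1}{\sqrt{2}\,n\sigma}. \tag{8.20}$$
Substitution of $p = nk_\mathrm{B}T$ yields the expression
$$\lambda \approx \frac{k_\mathrm{B}T}{\sqrt{2}\,p\sigma}. \tag{8.21}$$
To increase the mean free path by a certain factor, the pressure needs to be decreased by the same factor.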
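The factor of $\sqrt{2}$ relating the mean relative speed to the mean speed can be checked by sampling. The following is a rough sketch under assumptions of our own: velocity components are drawn from unit-variance Gaussians (i.e. speeds are measured in units of $\sqrt{k_\mathrm{B}T/m}$), and the function name, sample count and seed are arbitrary.

```python
import math
import random


def relative_to_mean_speed_ratio(samples=50000, seed=1):
    """Estimate <v_r> / <v> for pairs of molecules whose velocity components
    are independent Gaussians (a Maxwell-Boltzmann gas in scaled units)."""
    rng = random.Random(seed)
    total_speed = 0.0
    total_relative_speed = 0.0
    for _ in range(samples):
        v1 = [rng.gauss(0.0, 1.0) for _ in range(3)]
        v2 = [rng.gauss(0.0, 1.0) for _ in range(3)]
        total_speed += math.sqrt(sum(c * c for c in v1))
        total_relative_speed += math.sqrt(sum((a - b) ** 2 for a, b in zip(v1, v2)))
    return total_relative_speed / total_speed


ratio = relative_to_mean_speed_ratio()
print(f"<v_r>/<v> = {ratio:.3f}, sqrt(2) = {math.sqrt(2.0):.3f}")
```

The sampled ratio should sit close to $\sqrt{2}$, in line with eqn 8.19, with a small scatter from the finite number of samples.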

Example 8.1 Calculate the mean free path for a gas of N$_2$ at room temperature and pressure. (For N$_2$, take $d = 0.37$ nm.)
Solution: The collision cross-section is $\pi d^2 = 4.3 \times 10^{-19}$ m$^2$. We have $p \approx 10^5$ Pa and $T \approx 300$ K, so the number density is $n = p/k_\mathrm{B}T \approx 10^5/(1.38 \times 10^{-23} \times 300) \approx 2.4 \times 10^{25}$ m$^{-3}$. This leads to $\lambda \approx 1/(\sqrt{2}\,n\sigma) = 6.8 \times 10^{-8}$ m.

4 Equation 7.23 implies that $\langle v\rangle/\sqrt{\langle v^2\rangle} = \sqrt{8/3\pi} = 0.92$, so the error is less than 10%.


Notice that both λ and τ decrease with increasing pressure at ﬁxed temperature. Thus the frequency of collisions increases with increasing pressure.
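The arithmetic of Example 8.1 can be reproduced in a few lines. This is a minimal sketch: the function names are our own, the mass of N$_2$ is taken as 28 atomic mass units (an assumed round value), and the collision time uses $\langle v_\mathrm{r}\rangle \approx \sqrt{2}\langle v\rangle$, so that $\tau = \lambda/\langle v\rangle$.

```python
import math

K_B = 1.380649e-23    # Boltzmann constant in J/K
AMU = 1.66053907e-27  # atomic mass unit in kg


def mean_free_path(p, T, d):
    """lambda = k_B T / (sqrt(2) p sigma) with sigma = pi d^2 (eqns 8.14 and 8.21)."""
    sigma = math.pi * d**2
    return K_B * T / (math.sqrt(2.0) * p * sigma)


def mean_collision_time(p, T, d, m):
    """tau = 1 / (n sigma <v_r>) with <v_r> ~ sqrt(2) <v>, i.e. lambda / <v>."""
    mean_speed = math.sqrt(8.0 * K_B * T / (math.pi * m))
    return mean_free_path(p, T, d) / mean_speed


# Values from Example 8.1; the N2 mass of 28 u is an assumed round number.
d_n2 = 0.37e-9
m_n2 = 28.0 * AMU
lam = mean_free_path(1.0e5, 300.0, d_n2)
tau = mean_collision_time(1.0e5, 300.0, d_n2, m_n2)
print(f"mean free path: {lam:.2e} m, mean collision time: {tau:.2e} s")
```

Calling `mean_free_path` with a different pressure shows directly that halving the pressure doubles the mean free path.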

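Chapter summary

• The mean scattering time is given by
$$\tau = \frac{1}{n\sigma\langle v_\mathrm{r}\rangle},$$
where the collision cross-section is $\sigma = \pi d^2$, $d$ is the molecular diameter and $\langle v_\mathrm{r}\rangle \approx \sqrt{2}\,\langle v\rangle$.
• The mean free path is
$$\lambda \approx \frac{1}{\sqrt{2}\,n\sigma}.$$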

Exercises

(8.1) What is the mean free path of an N$_2$ molecule in an ultra-high-vacuum chamber at a pressure of $10^{-10}$ mbar? What is the mean collision time? The chamber has a diameter of 0.5 m. On average, how many collisions will the molecule make with the chamber walls compared with collisions with other molecules? If the pressure is suddenly raised to $10^{-6}$ mbar, how do these results change?

(8.2) (a) Show that the root mean square free path is given by $\sqrt{2}\,\lambda$ where $\lambda$ is the mean free path.
(b) What is the most probable free path length?
(c) What percentage of molecules travel a distance greater (i) than $\lambda$, (ii) than $2\lambda$, (iii) than $5\lambda$?

(8.3) Show that particles hitting a plane boundary have travelled a distance $2\lambda/3$ perpendicular to the plane since their last collision, on average.

(8.4) A diffuse cloud of neutral hydrogen atoms in space has a temperature of 50 K. Estimate the mean scattering time (in years) between hydrogen atoms in the cloud and the mean free path (in Astronomical Units). (1 Astronomical Unit is the Earth–Sun distance; see Appendix A for a numerical value.)

Part III

Transport and thermal diffusion

In the third part of this book, we use our results from the kinetic theory of gases to derive various transport properties of gases and then apply this to solving the thermal diffusion equation. This part is structured as follows:

• In Chapter 9, we use the intuition developed from considering molecular collisions and the mean free path to determine various transport properties, in particular viscosity, thermal conductivity and diffusion. These correspond to the transport of momentum, heat and particles respectively.
• In Chapter 10 we derive the thermal diffusion equation which shows how heat is transported between regions of different temperature. This equation is a differential equation and can be applied to a variety of physical situations, and we show how to solve it in certain cases of high symmetry.

9 Transport properties in gases

9.1 Viscosity
9.2 Thermal conductivity
9.3 Diffusion
9.4 More-detailed theory
Chapter summary
Further reading
Exercises

In this chapter, we wish to describe how a gas can transport momentum, energy or particles, from one place to another. The model we have used so far has been that of a gas in equilibrium, so that none of its parameters are time-dependent. Now we consider non-equilibrium situations, but still in the steady state, i.e. so that the system parameters are time-independent, but the surroundings will be time-dependent. The phenomena we want to treat are called transport properties and we will consider (1) viscosity, which is the transport of momentum, (2) thermal conductivity which is the transport of heat, and (3) diﬀusion, which is the transport of particles.

9.1 Viscosity

Viscosity is the measure of the resistance of a fluid to the deformation produced by a shear stress. For straight, parallel and uniform flow, the shear stress between the layers is proportional¹ to the velocity gradient in the direction perpendicular to the layers. The constant of proportionality, given the symbol $\eta$, is called the coefficient of viscosity, the dynamic viscosity or simply the viscosity.² Consider the scenario in Fig. 9.1 in which a fluid is sandwiched between two plates of area $A$ which each lie in the $xy$ plane. A shear stress $\tau_{xz} = F/A$ is applied to the fluid by sliding the top plate over it at speed $u$ while keeping the bottom plate stationary. A shear force $F$ is applied. A velocity gradient $\mathrm{d}\langle u_x\rangle/\mathrm{d}z$ is set up, so that $\langle u_x\rangle = 0$ near the bottom plate and $\langle u_x\rangle = u$ near the top plate. If the fluid is a gas, then this extra motion in the $x$-direction is superimposed on the Maxwell–Boltzmann motion in the $x$, $y$ and $z$ directions (and hence the use of the average $\langle u_x\rangle$, rather than $u_x$). The viscosity $\eta$ is then defined by
$$\tau_{xz} = \frac{F}{A} = \eta\,\frac{\mathrm{d}\langle u_x\rangle}{\mathrm{d}z}. \tag{9.1}$$

Fig. 9.1 A fluid is sandwiched between two plates of area $A$ which each lie in an $xy$ plane.

1 This proportionality was suggested by Isaac Newton and holds for many liquids and most gases, which are thus termed Newtonian fluids. Non-Newtonian fluids have a viscosity which is a function of the applied shear stress.

2 Also used is the kinematic viscosity $\nu$, defined by $\nu = \eta/\rho$ where $\rho$ is the density. This is useful because one often wants to compare the viscous forces with inertial forces. The unit of kinematic viscosity is m$^2$ s$^{-1}$.

The units of viscosity are Pa s (= N m$^{-2}$ s). Force is rate of change of momentum, and hence transverse momentum is being transported through the fluid. This is achieved because molecules travelling in the $+z$ direction move from a layer in which $\langle u_x\rangle$ is smaller to one in which $\langle u_x\rangle$ is larger, and hence they transfer net momentum to that layer in the $-x$ direction. Molecules travelling parallel to $-z$ have the opposite effect. Hence, the shear stress $\tau_{xz}$ is equal to the transverse momentum transported across each square metre per second, and hence $\tau_{xz}$ is equal to a flux of momentum (though note that there must be a minus sign involved, because the momentum flux must be from regions of high transverse velocity to regions of low transverse velocity, which is in the opposite direction to the velocity gradient). The velocity gradient $\partial\langle u_x\rangle/\partial z$ therefore drives a momentum flux $\Pi_z$, according to
$$\Pi_z = -\eta\,\frac{\partial\langle u_x\rangle}{\partial z}. \tag{9.2}$$

The viscosity can be calculated using kinetic theory as follows. Recall first that we showed before in eqn 6.13 that the number of molecules which hit unit area per second is $v\cos\theta\; n f(v)\,\mathrm{d}v\;\frac{1}{2}\sin\theta\,\mathrm{d}\theta$. Consider molecules travelling at an angle $\theta$ to the $z$-axis (see Fig. 9.2). Then molecules which cross a plane of constant $z$ will have travelled on average a distance $\lambda$ since their last collision, and so they will have travelled a distance $\lambda\cos\theta$ parallel to the $z$-axis since their last collision. Over that distance there is an average increase in $\langle u_x\rangle$ given by $(\partial\langle u_x\rangle/\partial z)\,\lambda\cos\theta$, so these upward travelling molecules bring an excess momentum in the $x$-direction given by³
$$-m\,\frac{\partial\langle u_x\rangle}{\partial z}\,\lambda\cos\theta. \tag{9.3}$$
Hence the total $x$-momentum transported across unit area perpendicular to $z$ in unit time is the momentum flux $\Pi_z$ given by
$$\Pi_z = \int_0^\infty\!\!\int_0^\pi v\cos\theta\; n f(v)\,\mathrm{d}v\;\tfrac{1}{2}\sin\theta\,\mathrm{d}\theta \cdot m\left(-\frac{\partial\langle u_x\rangle}{\partial z}\right)\lambda\cos\theta = \frac{1}{2}nm\lambda\int_0^\infty v f(v)\,\mathrm{d}v\left(-\frac{\partial\langle u_x\rangle}{\partial z}\right)\int_0^\pi \cos^2\theta\,\sin\theta\,\mathrm{d}\theta = -\frac{1}{3}nm\lambda\langle v\rangle\,\frac{\partial\langle u_x\rangle}{\partial z}. \tag{9.4}$$
Hence the viscosity is given by
$$\eta = \frac{1}{3}nm\lambda\langle v\rangle. \tag{9.5}$$

Fig. 9.2 Molecular velocity $\mathbf{v}$ for molecules travelling at an angle $\theta$ to the $z$-axis. These will have travelled on average a distance $\lambda$ since their last collision, and so they will have travelled a distance $\lambda\cos\theta$ parallel to the $z$-axis since their last collision.

3 The negative sign is because the molecules moving in the $+z$ direction are moving up the velocity gradient from a slower to a faster region and so bring a deficit in $x$-momentum if $\partial\langle u_x\rangle/\partial z$ is positive. It is the same reason for the negative sign in eqn 9.2.

Fig. 9.3 The apparent viscosity of air as a function of pressure at 288 K. It is found to be constant over a wide range of pressure.

Fig. 9.4 The dependence of the viscosity of various gases on $\sqrt{m}/d^2$. The dotted line is the prediction of eqn 9.6. The solid line is the prediction of eqn 9.45.

Equation 9.5 has some important consequences.

• $\eta$ independent of pressure. Because $\lambda \approx 1/(\sqrt{2}\,n\sigma) \propto n^{-1}$, the viscosity is independent of $n$ and hence (at constant temperature) it is independent of pressure. This is at first sight a weird result: as you increase the pressure, and hence $n$, you should be better at transmitting momentum because you have more molecules to do it with. However, your mean free path reduces correspondingly, so that each molecule becomes less effective at transmitting momentum in such a way as to precisely cancel out the effect of having more of them. This result holds impressively well over quite a range of pressures (see Fig. 9.3) although it begins to fail at very low or very high pressures.
• $\eta \propto T^{1/2}$. Because $\eta$ is independent of $n$, the only temperature dependence is from $\langle v\rangle \propto T^{1/2}$, and hence $\eta \propto T^{1/2}$. Note therefore that the viscosity of gases increases with $T$, which is different from most liquids, which get runnier (i.e. less viscous) when you heat them.
• Substituting in $\lambda = (\sqrt{2}\,n\sigma)^{-1}$, $\sigma = \pi d^2$ and $\langle v\rangle = (8k_\mathrm{B}T/\pi m)^{1/2}$ yields a more useful (though less memorable) expression for the viscosity:
$$\eta = \frac{2}{3\pi d^2}\left(\frac{mk_\mathrm{B}T}{\pi}\right)^{1/2}. \tag{9.6}$$
• Equation 9.6 predicts that the viscosity will be proportional to $\sqrt{m}/d^2$ at constant temperature. This holds very well, as shown in Fig. 9.4.

• Various approximations have gone into this approach, and a condition for their validity is that
$$L \gg \lambda \gg d, \tag{9.7}$$
where $L$ is the size of the container holding the gas and $d$ is the molecular diameter. We need $\lambda \gg d$ (pressure not too high) so that we can neglect collisions involving more than two particles. We need $\lambda \ll L$ (pressure not too low) so that molecules mainly collide with each other and not with the container walls. If $\lambda$ is of the same order of magnitude or greater than $L$, most of a molecule’s collisions will be with the container walls. Figure 9.3 indeed shows that the pressure-independence of the viscosity begins to break down when the pressure is too low or too high.
• The factor of $\frac{1}{3}$ in eqn 9.5 is not quite right, so that eqn 9.6 leads to the dotted line in Fig. 9.4. To get a precise numerical factor, you need to consider the fact that the velocity distribution is different in different layers (because of the shear stress applied) and then average over the distribution of path lengths. This will be done in Section 9.4 and leads to a prediction which gives the solid line in Fig. 9.4.
• The measured temperature dependence of the viscosity of various gases broadly agrees with our prediction that $\eta \propto \sqrt{T}$, as shown in Fig. 9.5, but the agreement is not quite perfect. The reason for this is that the collision cross-section, $\sigma = \pi d^2$, is actually temperature-dependent. At high temperatures, molecules move faster and hence have to collide more directly to have a proper momentum-randomizing collision. We have been assuming that molecules behave as perfect hard spheres and that any collision perfectly randomizes the molecular motion, but this is not precisely true. This means that the effective molecular diameter shrinks as you increase the temperature, increasing the viscosity over and above the expected $\sqrt{T}$ dependence. This is evident in the data presented in Fig. 9.5.
• Viscosity can be measured by the damping of torsional oscillations in the apparatus shown in the box.

Fig. 9.5 The temperature dependence of the viscosity of various gases. The agreement with the predicted $T^{1/2}$ behaviour is satisfactory as a first approximation, but not very good in detail.
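The pressure independence and the $T^{1/2}$ scaling of the kinetic-theory viscosity can be seen by evaluating eqns 9.5 and 9.6 side by side. This is a sketch under stated assumptions: the function names are our own, and the numbers describe N$_2$ with $d = 0.37$ nm (the value used in Example 8.1) and an assumed mass of 28 atomic mass units. As the text notes, the factor of $\frac{1}{3}$ is not quite right, so the output is an estimate rather than a measured viscosity.

```python
import math

K_B = 1.380649e-23    # Boltzmann constant in J/K
AMU = 1.66053907e-27  # atomic mass unit in kg


def viscosity_eqn_9_5(p, T, d, m):
    """eta = (1/3) n m lambda <v>, built from its separate ingredients."""
    n = p / (K_B * T)                                      # ideal gas number density
    lam = 1.0 / (math.sqrt(2.0) * n * math.pi * d**2)      # mean free path, eqn 8.20
    mean_speed = math.sqrt(8.0 * K_B * T / (math.pi * m))  # <v>
    return n * m * lam * mean_speed / 3.0


def viscosity_eqn_9_6(T, d, m):
    """eta = (2 / (3 pi d^2)) * sqrt(m k_B T / pi); no pressure appears at all."""
    return 2.0 / (3.0 * math.pi * d**2) * math.sqrt(m * K_B * T / math.pi)


# N2 with d = 0.37 nm as in Example 8.1; the mass of 28 u is an assumed round number.
d_n2 = 0.37e-9
m_n2 = 28.0 * AMU
for pressure in (1.0e3, 1.0e5):
    eta = viscosity_eqn_9_5(pressure, 300.0, d_n2, m_n2)
    print(f"p = {pressure:.0e} Pa: eta = {eta:.3e} Pa s")
print(f"eqn 9.6 at 300 K: eta = {viscosity_eqn_9_6(300.0, d_n2, m_n2):.3e} Pa s")
```

The same viscosity appears at both pressures because the factors of $n$ in the number density and in the mean free path cancel.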


Measurement of viscosity

Maxwell developed a method for measuring the viscosity of a gas by observing the damping rate of oscillations of a disk suspended from a fixed support by a torsion fibre. It is positioned halfway between two fixed horizontal disks and oscillates parallel to them in the gas. This is shown in Fig. 9.6(a), with the fixed horizontal disks shaded and the oscillating disk in white. The damping of the torsional oscillations is from the viscous damping due to the gas trapped on each side of the oscillating disk between the fixed disks. The fixed disks are mounted inside a vacuum chamber in which the composition and pressure of the gas to be measured can be varied.

A very accurate method is the rotating-cylinder method in which gas is confined between two vertical coaxial cylinders. It is shown in Fig. 9.6(b). The outer cylinder (inner radius $b$) is rotated by a motor at a constant angular speed $\omega_0$, while the inner cylinder (outer radius $a$) is suspended by a torsion fibre from a fixed support. The torque $G$ on the outer cylinder is transmitted via the gas to the inner cylinder and a resulting torque on the torsion fibre. The velocity $u(r)$ of the gas is related to the angular velocity $\omega(r)$ by $u(r) = r\omega(r)$, and we expect that $\omega$ varies all the way from 0 at $r = a$ to $\omega_0$ at $r = b$. The velocity gradient is thus
$$\frac{\mathrm{d}u}{\mathrm{d}r} = \omega + r\frac{\mathrm{d}\omega}{\mathrm{d}r}, \tag{9.8}$$
but the first term here simply corresponds to the velocity gradient due to rigid rotation and does not contribute to the viscous shearing stress, which is thus $\eta r\,\mathrm{d}\omega/\mathrm{d}r$. The force $F$ on a cylindrical element of gas (of length $l$) is then just this viscous stress multiplied by the area of the cylinder $2\pi rl$, i.e.
$$F = 2\pi r l\eta \times r\frac{\mathrm{d}\omega}{\mathrm{d}r}, \tag{9.9}$$
and so the torque $G = rF$ on this cylindrical element is
$$G = 2\pi r^3 l\eta\,\frac{\mathrm{d}\omega}{\mathrm{d}r}. \tag{9.10}$$
In the steady state, there is no change in viscous torque from the outer to the inner cylinder (if there were, angular acceleration would be induced somewhere and the system would change) so this torque is transmitted to the suspended cylinder. Hence rearranging and integrating give
$$G\int_a^b \frac{\mathrm{d}r}{r^3} = 2\pi l\eta\int_0^{\omega_0}\mathrm{d}\omega = 2\pi l\eta\,\omega_0, \tag{9.11}$$
so that
$$\eta = \frac{G}{4\pi\omega_0 l}\left(\frac{1}{a^2} - \frac{1}{b^2}\right). \tag{9.12}$$
The torque $G$ is related to the angular deflection $\phi$ of the inner cylinder by $G = \alpha\phi$. The angular deflection can be measured using a light beam reflected from a small mirror attached to the torsion fibre. The coefficient $\alpha$ is known as the torsion constant. This can be found by measuring the period $T$ of torsional oscillations of an object of moment of inertia $I$ suspended from the wire, which is
$$T = 2\pi\sqrt{\frac{I}{\alpha}}. \tag{9.13}$$
Knowledge of $I$ and $T$ yields $\alpha$ which can be used with the measured $\phi$ to obtain $G$ and hence $\eta$.

Fig. 9.6 Measuring viscosity by (a) Maxwell’s method and (b) the rotating cylinder method.
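The chain of steps in the rotating-cylinder method (period gives torsion constant, deflection gives torque, torque gives viscosity) can be laid out as a short calculation. The sketch below assumes the inner cylinder has radius $a$ and the outer cylinder radius $b > a$, consistent with $\omega$ running from 0 at $r = a$ to $\omega_0$ at $r = b$. The function names and every number in the example are invented for illustration and do not describe a real apparatus.

```python
import math


def torsion_constant(moment_of_inertia, period):
    """alpha from eqn 9.13, T = 2 pi sqrt(I / alpha), rearranged for alpha."""
    return 4.0 * math.pi**2 * moment_of_inertia / period**2


def viscosity_from_torque(G, omega0, length, a, b):
    """Eqn 9.12: eta = G / (4 pi omega0 l) * (1/a^2 - 1/b^2), with a < b."""
    return G / (4.0 * math.pi * omega0 * length) * (1.0 / a**2 - 1.0 / b**2)


def torque_from_viscosity(eta, omega0, length, a, b):
    """Eqn 9.12 solved for the torque G transmitted through the gas."""
    return 4.0 * math.pi * omega0 * length * eta / (1.0 / a**2 - 1.0 / b**2)


# Hypothetical measurement (all numbers assumed for illustration only).
alpha = torsion_constant(moment_of_inertia=1.0e-6, period=5.0)  # N m per radian
phi = 0.05                                                      # deflection in radians
G = alpha * phi                                                 # torque on the inner cylinder
eta = viscosity_from_torque(G, omega0=10.0, length=0.1, a=0.020, b=0.025)
print(f"alpha = {alpha:.3e} N m/rad, G = {G:.3e} N m, eta = {eta:.3e} Pa s")
```

The helper `torque_from_viscosity` simply inverts eqn 9.12, which is handy for estimating how large a deflection to expect for a gas of known viscosity.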

9.2 Thermal conductivity

We have defined heat as ‘energy in transit’.⁴ It quantifies the transfer of energy in response to a temperature gradient. The amount of heat which flows along a temperature gradient depends upon the thermal conductivity of the material which we will now define. Thermal conductivity can be considered in one dimension using the diagram shown in Fig. 9.7. Heat flows from hot to cold, and so flows against the temperature gradient. The flow of heat can be described by a heat flux vector $\mathbf{J}$, whose direction lies along the direction of flow of heat and whose magnitude is equal to the heat energy flowing per unit time per unit area (measured in J s$^{-1}$ m$^{-2}$ = W m$^{-2}$). The heat flux $J_z$ in the $z$-direction is given by
$$J_z = -\kappa\,\frac{\partial T}{\partial z}, \tag{9.14}$$
where the negative sign is because heat flows ‘downhill’. The constant $\kappa$ is called the thermal conductivity⁵ of the gas. In general, in three dimensions we can write that the heat flux $\mathbf{J}$ is related to temperature using
$$\mathbf{J} = -\kappa\nabla T. \tag{9.15}$$

How do molecules in a gas ‘carry’ heat? Gas molecules have energy, and as we found in eqn 5.17 their mean translational kinetic energy $\frac{1}{2}m\langle v^2\rangle = \frac{3}{2}k_\mathrm{B}T$ depends on the temperature. Therefore to increase the temperature of a gas by 1 K, one has to increase the mean kinetic energy by $\frac{3}{2}k_\mathrm{B}$ per molecule. The heat capacity⁶ $C$ of the gas is the heat required to increase the temperature of gas by 1 K. The heat capacity $C_\mathrm{molecule}$ of a gas molecule is therefore equal to $\frac{3}{2}k_\mathrm{B}$, though we will later see that it can be larger than this if the molecule can store energy in forms other than translational kinetic energy.⁷

The derivation of the thermal conductivity of a gas is very similar to that for viscosity. Consider molecules travelling at an angle $\theta$ to the $z$-axis. Then molecules which cross a plane of constant $z$ will have travelled on average a distance $\lambda$ since their last collision, and so they will have travelled a distance $\lambda\cos\theta$ parallel to the $z$-axis since their last collision. Therefore, they bring a deficit of thermal energy given by
$$C_\mathrm{molecule}\times\Delta T = C_\mathrm{molecule}\,\frac{\partial T}{\partial z}\,\lambda\cos\theta, \tag{9.16}$$
where $C_\mathrm{molecule}$ is the heat capacity of a single molecule. Hence the total thermal energy transported across unit area in unit time, i.e. the heat flux, is given by
$$J_z = \int_0^\infty\!\!\int_0^\pi \left(-C_\mathrm{molecule}\,\frac{\partial T}{\partial z}\,\lambda\cos\theta\right) v\cos\theta\; n f(v)\,\tfrac{1}{2}\sin\theta\,\mathrm{d}\theta\,\mathrm{d}v = -\frac{1}{2}nC_\mathrm{molecule}\,\lambda\int_0^\infty v f(v)\,\mathrm{d}v\;\frac{\partial T}{\partial z}\int_0^\pi \cos^2\theta\,\sin\theta\,\mathrm{d}\theta = -\frac{1}{3}nC_\mathrm{molecule}\,\lambda\langle v\rangle\,\frac{\partial T}{\partial z}. \tag{9.17}$$

4 See Chapter 2.

Fig. 9.7 Heat flows in the opposite direction to the temperature gradient.

5 Thermal conductivity has units W m$^{-1}$ K$^{-1}$.

6 See Section 2.2.

7 Other forms include rotational kinetic energy or vibrational energy, if the gas molecules are polyatomic.


Hence, the thermal conductivity $\kappa$ is given by
$$\kappa = \frac{1}{3}C_V\lambda\langle v\rangle, \tag{9.18}$$
where $C_V = nC_\mathrm{molecule}$ is the heat capacity per unit volume (though the subscript $V$ here refers to a temperature change at constant volume). Equation 9.18 has some important consequences.

T

Fig. 9.8 The thermal conductivity of various gases as a function of temperature. The agreement with the predicted T 1/2 behaviour is satisfactory as a ﬁrst approximation, but not very good in detail.

• κ is independent of pressure. The argument is the same as for η. Because $\lambda \approx 1/(\sqrt{2}\,n\sigma) \propto n^{-1}$ while $C_V = nC_{\rm molecule} \propto n$, κ is independent of n and hence (at constant temperature) it is independent of pressure.
• $\kappa \propto T^{1/2}$. The argument is also the same as for η. Because κ is independent of n, the only temperature dependence is from $\langle v\rangle \propto \sqrt{T}$, and hence $\kappa \propto T^{1/2}$. This holds reasonably well for a number of gases (see Fig. 9.8).
• As for viscosity, substituting in $\lambda = (\sqrt{2}\,n\sigma)^{-1}$, $\sigma = \pi d^2$ and $\langle v\rangle = (8k_{\rm B}T/\pi m)^{1/2}$ yields a more useful (though less memorable) expression for the thermal conductivity:
$$\kappa = \frac{2}{3\pi d^2}\, C_{\rm molecule} \left(\frac{k_{\rm B}T}{\pi m}\right)^{1/2}. \tag{9.19}$$
• $L \gg \lambda \gg d$ is again the relevant condition for our treatment to hold.
• Equation 9.19 predicts that the thermal conductivity will be proportional to $1/(\sqrt{m}\,d^2)$ at constant temperature. This holds very well, as shown in Fig. 9.9.
• Thermal conductivity can be measured by various techniques, see the box.

The similarity of η and κ would suggest that
$$\frac{\kappa}{\eta} = \frac{C_{\rm molecule}}{m}. \tag{9.20}$$
The ratio $C_{\rm molecule}/m$ is the specific heat capacity$^8$ $c_V$ (the subscript V indicating a measurement at constant volume), so equivalently
$$\kappa = c_V\,\eta. \tag{9.21}$$

8 See Section 2.2.

Fig. 9.9 The dependence of the thermal conductivity of various gases on $1/(\sqrt{m}\,d^2)$. The dotted line is the prediction of eqn 9.19. The solid line is the prediction of eqn 9.46, which works very well for the monatomic noble gases, but a little less well for diatomic N$_2$.

However, neither of these relations holds very well. Faster molecules cross a given plane more often than slow ones. These carry more kinetic energy and therefore do carry more heat. However, they don't necessarily carry more average momentum in the x-direction. We will return to this point in Section 9.4.
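The size of the discrepancy can be seen by putting numbers into eqn 9.19 and into the corrected expression, eqn 9.46 of Section 9.4. The following is a minimal sketch, not a definitive calculation: the function names are invented for illustration, and the inputs assume argon with the atomic diameter 0.34 nm quoted in Exercise 9.5 and $C_{\rm molecule} = \tfrac{3}{2}k_{\rm B}$ for a monatomic gas. For comparison, Exercise 9.2 quotes a measured value of 1.6×10$^{-2}$ W m$^{-1}$ K$^{-1}$ at S.T.P.

```python
import math

K_B = 1.380649e-23      # Boltzmann constant, J/K
AMU = 1.66053907e-27    # atomic mass unit, kg


def kappa_simple(c_molecule, d, m, temperature):
    """Eqn 9.19: kappa = (2 / (3 pi d^2)) * C_molecule * sqrt(k_B T / (pi m))."""
    return (2.0 / (3.0 * math.pi * d**2)) * c_molecule * math.sqrt(
        K_B * temperature / (math.pi * m))


def kappa_chapman_enskog(c_molecule, d, m, temperature):
    """Eqn 9.46: same dependences, with prefactor 25/32 in place of 2/(3 pi)."""
    return (25.0 / (32.0 * d**2)) * c_molecule * math.sqrt(
        K_B * temperature / (math.pi * m))


# Assumed illustrative inputs for argon (monatomic, so C_molecule = 3/2 k_B).
m_ar = 40.0 * AMU       # kg
d_ar = 0.34e-9          # m, diameter quoted in Exercise 9.5
c_ar = 1.5 * K_B        # J/K per atom
T = 273.15              # K

k_simple = kappa_simple(c_ar, d_ar, m_ar, T)
k_ce = kappa_chapman_enskog(c_ar, d_ar, m_ar, T)
print(f"eqn 9.19: {k_simple:.4f} W/m/K, eqn 9.46: {k_ce:.4f} W/m/K")
```

Because both expressions share the same dependence on $C_{\rm molecule}$, d, m and T, their ratio is the pure number $75\pi/64$, whatever gas is chosen.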

Measurement of thermal conductivity

The thermal conductivity κ can be measured using the hot-wire method. Gas fills the space between two coaxial cylinders (inner cylinder radius a, outer cylinder radius b) as shown in Fig. 9.10. The outer cylinder is connected to a constant-temperature bath of temperature $T_b$, while heat is generated in the inner cylinder (the hot wire) at rate Q per unit length of the cylinder (measured in units of W m$^{-1}$). The temperature of the inner cylinder rises to $T_a$. The rate Q can be connected with the radial heat flux $J_r$ using
$$Q = 2\pi r J_r, \tag{9.22}$$
and $J_r$ itself is given by $-\kappa\,\partial T/\partial r$, as in eqn 9.14. Hence
$$Q = -2\pi r\kappa\,\frac{\partial T}{\partial r}, \tag{9.23}$$
and rearranging and integrating yields
$$Q\int_a^b \frac{dr}{r} = -2\pi\kappa \int_{T_a}^{T_b} dT, \tag{9.24}$$
and hence
$$\kappa = \frac{Q}{2\pi}\,\frac{\ln(b/a)}{T_a - T_b}. \tag{9.25}$$

Fig. 9.10 The hot-wire method for measuring thermal conductivity.

Since Q is known (it is the power supplied to heat the inner cylinder) and $T_a$ and $T_b$ can be measured, the value of κ can be deduced. An important application of this technique is in the Pirani gauge, which is commonly used in vacuum systems to measure pressure. A sensor wire is heated electrically, and the pressure of the gas is determined by measuring the current needed to keep the wire at a constant temperature. (The resistance of the wire is temperature dependent, so the temperature is estimated by measuring the resistance of the wire.) The Pirani gauge thus relies on the fact that at low pressure the thermal conductivity is a function of pressure (since the condition $\lambda \ll L$, where L is a linear dimension in the gauge, is not met). In fact, a typical Pirani gauge will not work to detect pressures much above 1 mbar because, above these pressures, the thermal conductivity of the gases no longer changes with pressure. The thermal conductivity of each gas is different, so the gauge has to be calibrated for the individual gas being measured.
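As a rough illustration of how eqn 9.25 turns hot-wire readings into a value of κ, the sketch below evaluates the formula directly. The function name and the readings are hypothetical, chosen only to show the arithmetic; they are not measurements from the text.

```python
import math


def hot_wire_kappa(q_per_length, a, b, t_inner, t_outer):
    """Eqn 9.25: kappa = (Q / 2 pi) * ln(b / a) / (T_a - T_b)."""
    if not (0.0 < a < b):
        raise ValueError("need 0 < a < b")
    if t_inner <= t_outer:
        raise ValueError("the heated wire must be hotter than the bath")
    return q_per_length * math.log(b / a) / (2.0 * math.pi * (t_inner - t_outer))


# Hypothetical readings: Q in W/m, radii in m, temperatures in K.
kappa = hot_wire_kappa(q_per_length=0.5, a=5e-5, b=5e-3,
                       t_inner=315.0, t_outer=300.0)
print(f"kappa = {kappa:.4f} W/m/K")
```

Only the ratio b/a enters, through a logarithm, so the result is fairly insensitive to errors in the cylinder radii compared with errors in the temperature difference.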

9.3 Diffusion

Consider a distribution of similar molecules, some of which are labelled (e.g. by being radioactive). Let there be $n^*(z)$ of these labelled molecules per unit volume, but note that $n^*$ is allowed to be a function of the z coordinate. The flux $\Phi_z$ of labelled molecules parallel to the z-direction (measured in m$^{-2}$ s$^{-1}$) is$^9$
$$\Phi_z = -D\,\frac{\partial n^*}{\partial z}, \tag{9.26}$$
where D is the coefficient of self-diffusion.$^{10}$

9 In three dimensions, this equation is written $\boldsymbol{\Phi} = -D\nabla n^*$.

10 We use the phrase self-diffusion because the molecules which are diffusing are the same (apart from being labelled) as the molecules into which they are diffusing. Below we will consider diffusion of molecules into dissimilar molecules.

Now consider a thin slab of gas of thickness dz and area A, as shown in Fig. 9.11. The flux into the slab is
$$A\Phi_z, \tag{9.27}$$


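and the flux out of the slab is
$$A\left(\Phi_z + \frac{\partial \Phi_z}{\partial z}\,dz\right). \tag{9.28}$$
The difference in these two fluxes must be balanced by the time-dependent changes in the number of labelled particles inside the region. Hence
$$\frac{\partial}{\partial t}\left(n^* A\,dz\right) = -A\,\frac{\partial \Phi_z}{\partial z}\,dz, \tag{9.29}$$
so that
$$\frac{\partial n^*}{\partial t} = -\frac{\partial \Phi_z}{\partial z}, \tag{9.30}$$
and hence that
$$\frac{\partial n^*}{\partial t} = D\,\frac{\partial^2 n^*}{\partial z^2}. \tag{9.31}$$
This is the diffusion equation. A derivation of the diffusion equation in three dimensions is shown in the box.$^{11}$

11 See also Appendix C.12.

Fig. 9.11 The fluxes into and out of a thin slab of gas of thickness dz and area A.

Three-dimensional derivation of the diffusion equation

The total number of labelled particles that flow out of a closed surface S is given by the integral
$$\int_S \boldsymbol{\Phi}\cdot d\boldsymbol{S}, \tag{9.32}$$
and this must be balanced by the rate of decrease of labelled particles inside the volume V which is surrounded by S, i.e.
$$\int_S \boldsymbol{\Phi}\cdot d\boldsymbol{S} = -\frac{\partial}{\partial t}\int_V n^*\,dV. \tag{9.33}$$
The divergence theorem implies that
$$\int_S \boldsymbol{\Phi}\cdot d\boldsymbol{S} = \int_V \nabla\cdot\boldsymbol{\Phi}\,dV, \tag{9.34}$$
and hence that
$$\nabla\cdot\boldsymbol{\Phi} = -\frac{\partial n^*}{\partial t}. \tag{9.35}$$
Substituting in $\boldsymbol{\Phi} = -D\nabla n^*$ then yields the diffusion equation, which is
$$\frac{\partial n^*}{\partial t} = D\nabla^2 n^*. \tag{9.36}$$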
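The two ingredients of the derivation, the flux law (eqn 9.26) and the balance of fluxes into and out of a slab (eqn 9.30), can be applied cell by cell to evolve a concentration profile numerically. The following is a minimal sketch using a simple explicit finite-difference scheme, which is an assumption of this illustration rather than a method from the text; the grid, time step and diffusion coefficient are arbitrary illustrative values.

```python
def diffuse(n, D, dz, dt, steps):
    """Advance dn/dt = D d2n/dz2 with an explicit (FTCS) scheme.

    Zero-flux boundaries are imposed, so no labelled molecules leave the box.
    """
    r = D * dt / dz**2
    if r > 0.5:
        raise ValueError("explicit scheme is unstable for D*dt/dz^2 > 1/2")
    n = list(n)
    for _ in range(steps):
        # Flux across each interior cell face, Phi = -D dn/dz (eqn 9.26).
        flux = [-D * (n[i + 1] - n[i]) / dz for i in range(len(n) - 1)]
        # Zero flux through the two end walls.
        flux = [0.0] + flux + [0.0]
        # Balance of fluxes for each cell, dn/dt = -dPhi/dz (eqn 9.30).
        n = [n[i] - dt * (flux[i + 1] - flux[i]) / dz for i in range(len(n))]
    return n


# Assumed illustrative values: a spike of labelled molecules in the middle cell.
cells, dz, D, dt = 41, 1e-3, 2e-5, 0.02
start = [0.0] * cells
start[cells // 2] = 1.0
end = diffuse(start, D, dz, dt, steps=200)
print(f"total before {sum(start):.6f}, after {sum(end):.6f}, peak {max(end):.4f}")
```

Writing the update in terms of face fluxes makes the conservation of labelled molecules explicit: whatever leaves one cell enters its neighbour, so the total is unchanged while the spike spreads out.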

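A kinetic theory derivation of D proceeds as follows. The excess labelled molecules hitting unit area per second is
$$\begin{aligned}
\Phi_z &= \int_0^\pi\!\!\int_0^\infty \left(-\frac{\partial n^*}{\partial z}\,\lambda\cos\theta\right) v\cos\theta\, f(v)\,dv\;\tfrac{1}{2}\sin\theta\,d\theta \\
&= -\tfrac{1}{3}\,\lambda\langle v\rangle\,\frac{\partial n^*}{\partial z},
\end{aligned} \tag{9.37}$$
and hence
$$D = \tfrac{1}{3}\,\lambda\langle v\rangle. \tag{9.38}$$
This equation has some important implications: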

• $D \propto p^{-1}$. In this case, there is no factor of n, but $\lambda \propto 1/n$ and hence $D \propto n^{-1}$ and at fixed temperature $D \propto p^{-1}$ (this holds quite well experimentally, see Fig. 9.12).
• $D \propto T^{3/2}$. Because $p = nk_{\rm B}T$ and $\langle v\rangle \propto T^{1/2}$, we have that $D \propto T^{3/2}$ at fixed pressure.
• $D\rho = \eta$. The only difference between the formula for D and that for η is a factor of $\rho = nm$, and so
$$D\rho = \eta. \tag{9.39}$$
• $D \propto m^{-1/2}d^{-2}$, which is the same dependence as thermal conductivity.
• The less-memorable formula for D is, as before, obtained by substituting in the expressions for $\langle v\rangle$ and λ, yielding
$$D = \frac{2}{3\pi n d^2}\left(\frac{k_{\rm B}T}{\pi m}\right)^{1/2}. \tag{9.40}$$

This section has been about self-diffusion, where labelled atoms (or molecules) diffuse amongst unlabelled, but otherwise identical atoms (or molecules). Experimentally, it is easier to measure the diffusion of atoms (or molecules) of one type (call them type 1, mass $m_1$, diameter $d_1$) amongst atoms (or molecules) of another type (call them type 2, mass $m_2$, diameter $d_2$). In this case the diffusion constant $D_{12}$ is used, which is given by eqn 9.40 with d replaced by $(d_1 + d_2)/2$ and m replaced by $2m_1m_2/(m_1 + m_2)$, so that
$$D_{12} = \frac{2}{3\pi n\left(\tfrac{1}{2}[d_1 + d_2]\right)^2}\left(\frac{k_{\rm B}T\,(m_1 + m_2)}{2\pi m_1 m_2}\right)^{1/2}. \tag{9.41}$$

Fig. 9.12 Diffusion as a function of pressure.

9.4 More-detailed theory

The treatment of the transport properties presented so far in this chapter has the merit that it allows one to get the basic dependences fairly straightforwardly, and gives good insight as to what is going on. However, some of the details of the predictions are not in complete agreement with experiment and it is the purpose of this section to offer a critique of this approach and see how things might be improved. This section contains more advanced material than considered in the rest of this chapter and can be skipped at first reading.

One effect which we have ignored is the persistence of velocity after a collision. Our assumption has been that following a collision, a molecule's velocity becomes completely randomized and is completely uncorrelated with its velocity before the collision. However, although that is the simplest approximation to take, it is not correct. After most collisions, a molecule will retain some component of its velocity in the direction of its original motion. Moreover, our treatment has implicitly assumed a Maxwellian distribution of molecular velocities and that the different components of v are uncorrelated with each other, so that they can be considered to be independent random variables.$^{12}$ However, these components are actually partially correlated with each other and so are not independent random variables.

12 See Section 3.6.

A further effect which becomes important at low pressure is the presence of boundaries; the details of the collisions of molecules with walls of a container can be quite important, and such collisions become more important as the pressure is reduced so that the mean free path increases.

Yet another consideration is the interconversion between the internal energy of a molecule and its translational degrees of freedom. As we will see in later chapters, the heat capacity of a molecule contains terms not only due to its translational motion ($C_{\rm molecule} = \tfrac{3}{2}k_{\rm B}$) but also due to its rotational and vibrational degrees of freedom. Collisions can give rise to processes where a molecule's energy can be redistributed throughout these different degrees of freedom. Thus if the molar heat capacity $C_V$ can be written as the sum of two terms, $C_V = C_V' + C_V''$, where $C_V'$ is due to translational degrees of freedom and $C_V''$ is due to other degrees of freedom, then it turns out that eqn 9.21 should be amended to give
$$\kappa = \left(\tfrac{5}{2}\,C_V' + C_V''\right)\eta. \tag{9.42}$$
The $\tfrac{5}{2}$ factor reflects the correlations that exist between momentum, energy and translational motion. The most energetic molecules are the most rapid and therefore possess longer mean free paths. This leads to Eucken's formula, which states that
$$\kappa = \tfrac{1}{4}(9\gamma - 5)\,\eta\,C_V. \tag{9.43}$$
For an ideal monatomic gas $\gamma = \tfrac{5}{3}$ and hence
$$\kappa = \tfrac{5}{2}\,\eta\,C_V, \tag{9.44}$$

which supersedes eqn 9.21. A more accurate treatment of the effects mentioned in this section has been performed by Chapman and Enskog (in the twentieth century); the methods used go beyond the scope of this text, but we summarize the results.

• The viscosity, which was written as $\eta = (2/3\pi d^2)(mk_{\rm B}T/\pi)^{1/2}$ in eqn 9.6, should be replaced by
$$\eta = \frac{5}{16}\,\frac{1}{d^2}\left(\frac{mk_{\rm B}T}{\pi}\right)^{1/2}, \tag{9.45}$$
i.e. the $2/3\pi$ should be replaced by $5/16$.
• The corrected formula for κ (which we had evaluated in eqn 9.19) can be obtained from this expression of η using Eucken's formula, eqn 9.43, and hence reads
$$\kappa = \frac{25}{32\,d^2}\,C_{\rm molecule}\left(\frac{k_{\rm B}T}{\pi m}\right)^{1/2}, \tag{9.46}$$
i.e. the $2/3\pi$ should be replaced by $25/32$.
• The formula for D, which appears in eqn 9.40, should now be replaced by
$$D = \frac{3}{8}\,\frac{1}{nd^2}\left(\frac{k_{\rm B}T}{\pi m}\right)^{1/2}, \tag{9.47}$$
i.e. the $2/3\pi$ should be replaced by $3/8$. Similarly, eqn 9.41 should be replaced by
$$D_{12} = \frac{3}{8n\left(\tfrac{1}{2}[d_1 + d_2]\right)^2}\left(\frac{k_{\rm B}T\,(m_1 + m_2)}{2\pi m_1 m_2}\right)^{1/2}. \tag{9.48}$$

This also alters other conclusions, such as eqn 9.39, which becomes
$$D\rho = \frac{3/8}{5/16}\,\eta = \frac{6\eta}{5}. \tag{9.49}$$
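The consistency of the corrected prefactors can be checked numerically: evaluating eqns 9.45 and 9.47 for any gas and forming Dρ/η should give 6/5, as in eqn 9.49. The sketch below does this under assumed conditions (argon-like atoms of diameter 0.34 nm at 273.15 K and atmospheric pressure, with the number density taken from the ideal gas law); the function names are invented for illustration.

```python
import math

K_B = 1.380649e-23  # Boltzmann constant, J/K


def eta_ce(d, m, T):
    """Eqn 9.45: eta = (5/16) (1/d^2) sqrt(m k_B T / pi)."""
    return (5.0 / 16.0) / d**2 * math.sqrt(m * K_B * T / math.pi)


def diffusion_ce(n, d, m, T):
    """Eqn 9.47: D = (3/8) (1/(n d^2)) sqrt(k_B T / (pi m))."""
    return (3.0 / 8.0) / (n * d**2) * math.sqrt(K_B * T / (math.pi * m))


# Assumed illustrative gas: argon-like atoms at 273.15 K and 101325 Pa.
m, d, T, p = 40.0 * 1.66053907e-27, 0.34e-9, 273.15, 101325.0
n = p / (K_B * T)          # ideal-gas number density, m^-3
rho = n * m                # mass density, kg m^-3
ratio = diffusion_ce(n, d, m, T) * rho / eta_ce(d, m, T)
print(f"D rho / eta = {ratio:.6f}")
```

The number density cancels in the ratio, which is why the result does not depend on the pressure chosen.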

Chapter summary

• Viscosity, η, defined by $\Pi_z = -\eta\,\partial\langle u_x\rangle/\partial z$, is (approximately)
$$\eta = \tfrac{1}{3}\,nm\lambda\langle v\rangle.$$
• Thermal conductivity, κ, defined by $J_z = -\kappa\,\partial T/\partial z$, is (approximately)
$$\kappa = \tfrac{1}{3}\,C_V\lambda\langle v\rangle.$$
• Diffusion, D, defined by $\Phi_z = -D\,\partial n^*/\partial z$, is (approximately)
$$D = \tfrac{1}{3}\,\lambda\langle v\rangle.$$
• These relationships assume that $L \gg \lambda \gg d$. The results of a more detailed theory have been summarized (which serve only to alter the numerical factors at the start of each equation).
• The predicted pressure, temperature, molecular mass and molecular diameter dependences are:

                        η                      κ                       D
    pressure            ∝ $p^0$                ∝ $p^0$                 ∝ $p^{-1}$
    temperature         ∝ $T^{1/2}$            ∝ $T^{1/2}$             ∝ $T^{3/2}$
    mass and diameter   ∝ $m^{1/2}d^{-2}$      ∝ $m^{-1/2}d^{-2}$      ∝ $m^{-1/2}d^{-2}$

(In this table, ∝ $p^0$ means independent of pressure.)

Further reading

Chapman and Cowling (1970) is the classic treatise describing the more advanced treatment of transport properties in gases.


Exercises

(9.1) Is air more viscous than water? Compare the dynamic viscosity η and the kinematic viscosity ν = η/ρ using the following data:

             ρ (kg m$^{-3}$)    η (Pa s)
    Air      1.3                17.4 × 10$^{-6}$
    Water    1000               1.0 × 10$^{-3}$

(9.2) Obtain an expression for the thermal conductivity of a gas at ordinary pressures. The thermal conductivity of argon (atomic weight 40) at S.T.P. is 1.6 × 10$^{-2}$ W m$^{-1}$ K$^{-1}$. Use this to calculate the mean free path in argon at S.T.P. Express the mean free path in terms of an effective atomic radius for collisions and find the value of this radius. Solid argon has a close-packed cubic structure in which, if the atoms are regarded as hard spheres, 0.74 of the volume of the structure is filled. The density of solid argon is 1.6 × 10$^{3}$ kg m$^{-3}$. Compare the effective atomic radius obtained from this information with your effective collision radius. Comment on your result.

(9.3) Define the coefficient of viscosity. Use kinetic theory to show that the coefficient of viscosity of a gas is given, with suitable approximations, by
$$\eta = K\rho\langle c\rangle\lambda,$$
where ρ is the density of the gas, λ is the mean free path of the gas molecules, $\langle c\rangle$ is their mean speed, and K is a number which depends on the approximations you make. In 1660 Boyle set up a pendulum inside a vessel which was attached to a pump which could remove air from the vessel. He was surprised to find that there was no observable change in the rate of damping of the swings of the pendulum when the pump was set going. Explain his observation in terms of the above formula. Make a rough order of magnitude estimate of the lower limit to the pressure which Boyle obtained; use reasonable assumptions concerning the apparatus which Boyle might have used. [The viscosity of air at atmospheric pressure and at 293 K is 18.2 µN s m$^{-2}$.] Explain why the damping is nearly independent of pressure despite the fact that fewer molecules collide with the pendulum as the pressure is reduced.

(9.4) Two plane disks, each of radius 5 cm, are mounted coaxially with their adjacent surfaces 1 mm apart. They are in a chamber containing Ar gas at S.T.P. (viscosity 2.1 × 10$^{-5}$ N s m$^{-2}$) and are free to rotate about their common axis. One of them rotates with an angular velocity of 10 rad s$^{-1}$. Find the torque which must be applied to the other to keep it stationary.

(9.5) Measurements of the viscosity, η, of argon gas ($^{40}$Ar) over a range of pressures yield the following results at two temperatures: at 500 K, η ≈ 3.5 × 10$^{-5}$ kg m$^{-1}$ s$^{-1}$; at 2000 K, η ≈ 8.0 × 10$^{-5}$ kg m$^{-1}$ s$^{-1}$. The viscosity is found to be approximately independent of pressure. Discuss the extent to which these data are consistent with (i) simple kinetic theory, and (ii) the diameter of the argon atom (0.34 nm) deduced from the density of solid argon at low temperatures.

(9.6) In Section 11.3, we will define γ as the ratio of $C_p$ to $C_V$. We will also show that $C_p = C_V + R$, where the heat capacities here are per mole. Show that these definitions lead to
$$C_V = \frac{R}{\gamma - 1}. \tag{9.50}$$
Starting with the formulae $C_V = C_V' + C_V''$ and $\kappa = \left(\tfrac{5}{2}C_V' + C_V''\right)\eta$, show that if $C_V'/R = \tfrac{3}{2}$, then
$$\kappa = \tfrac{1}{4}(9\gamma - 5)\,\eta\,C_V, \tag{9.51}$$
which is Eucken's formula. Deduce the value of γ for each of the following monatomic gases measured at room temperature.

    Species    κ/(ηC_V)
    He         2.45
    Ne         2.52
    Ar         2.48
    Kr         2.54
    Xe         2.58

Deduce what proportion of the heat capacity of the molecules is associated with the translational degrees of freedom for these gases. (Hint: notice the word 'monatomic'.)

10 The thermal diffusion equation

10.1 Derivation of the thermal diffusion equation
10.2 The one-dimensional thermal diffusion equation
10.3 The steady state
10.4 The thermal diffusion equation for a sphere
10.5 Newton's law of cooling
10.6 The Prandtl number
10.7 Sources of heat
Chapter summary
Exercises

This section assumes familiarity with solving diﬀerential equations (see e.g. Boas (1983), Riley et al. (2006)). It can be omitted at ﬁrst reading.

In the previous chapter, we have seen how the thermal conductivity of a gas can be calculated using kinetic theory. In this chapter, we look at solving problems involving the thermal conductivity of matter using a technique which was developed by mathematicians in the late eighteenth and early nineteenth centuries. The key equation describes thermal diﬀusion, i.e. how heat appears to ‘diﬀuse’ from one place to the other, and most of this chapter introduces techniques for solving this equation.

10.1 Derivation of the thermal diffusion equation

Recall from eqn 9.15 that the heat flux J is given by
$$\boldsymbol{J} = -\kappa\nabla T. \tag{10.1}$$
This equation is very similar mathematically to the equation for particle flux Φ in eqn 9.26 which is, in three dimensions,
$$\boldsymbol{\Phi} = -D\nabla n, \tag{10.2}$$
where D is the diffusion constant, and also to the flow of electrical current given by the current density $\boldsymbol{J}_e$ defined by
$$\boldsymbol{J}_e = \sigma\boldsymbol{E} = -\sigma\nabla\phi, \tag{10.3}$$
where σ is the conductivity, E is the electric field and φ here is the electric potential. Because of this mathematical similarity, an equation which is analogous to the diffusion equation (eqn 9.36) holds in each case. We will derive the thermal diffusion equation in this section. In fact in all these phenomena, there needs to be some account of the fact that you can't destroy energy, or particles, or charge. (We will only treat the thermal case here.)

Fig. 10.1 A closed surface S encloses a volume V. The total heat flow out of S is given by $\int_S \boldsymbol{J}\cdot d\boldsymbol{S}$.

The total heat flow out of a closed surface S is given by the integral
$$\int_S \boldsymbol{J}\cdot d\boldsymbol{S}, \tag{10.4}$$
and is a quantity which has the dimension of power. It is therefore equal to the rate at which the material inside the surface is losing energy.

This can be expressed as the rate of change of the total thermal energy inside the volume V which is surrounded by the closed surface S. The thermal energy can be written as the volume integral $\int_V CT\,dV$, where C here is the heat capacity per unit volume (measured in J K$^{-1}$ m$^{-3}$) and is equal to ρc, where ρ is the density and c is the heat capacity per unit mass (the specific heat capacity, see Section 2.2). Hence
$$\int_S \boldsymbol{J}\cdot d\boldsymbol{S} = -\frac{\partial}{\partial t}\int_V CT\,dV. \tag{10.5}$$
The divergence theorem implies that
$$\int_S \boldsymbol{J}\cdot d\boldsymbol{S} = \int_V \nabla\cdot\boldsymbol{J}\,dV, \tag{10.6}$$
and hence that
$$\nabla\cdot\boldsymbol{J} = -C\,\frac{\partial T}{\partial t}. \tag{10.7}$$
Substituting in eqn 10.1 then yields the thermal diffusion equation which is
$$\frac{\partial T}{\partial t} = D\nabla^2 T, \tag{10.8}$$
where D = κ/C is the thermal diffusivity. Since κ has units W m$^{-1}$ K$^{-1}$ and C = ρc has units J K$^{-1}$ m$^{-3}$, D has units m$^2$ s$^{-1}$.
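To get a feel for the size of D, the sketch below evaluates D = κ/C using the values for limestone quoted in Exercise 10.1 (κ = 1.6 W m$^{-1}$ K$^{-1}$, C = 2.5 × 10$^6$ J K$^{-1}$ m$^{-3}$). The time $L^2/D$ is offered only as an order-of-magnitude estimate suggested by the units of D; it is an assumption of this illustration, not a result derived in the text, and the function names are hypothetical.

```python
def thermal_diffusivity(kappa, heat_capacity_per_volume):
    """D = kappa / C, with kappa in W/m/K and C = rho*c in J/K/m^3; D in m^2/s."""
    return kappa / heat_capacity_per_volume


def diffusion_time(length, D):
    """Order-of-magnitude time L^2 / D for heat to diffuse a distance L."""
    return length**2 / D


kappa_limestone = 1.6      # W m^-1 K^-1, from Exercise 10.1
C_limestone = 2.5e6        # J K^-1 m^-3, from Exercise 10.1
D_limestone = thermal_diffusivity(kappa_limestone, C_limestone)
tau = diffusion_time(3.0, D_limestone)
print(f"D = {D_limestone:.2e} m^2/s; L^2/D for L = 3 m is about {tau / 86400:.0f} days")
```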

Note on eqn 10.5: We haven't worried about what the 'zero' of thermal energy is; there could also be an additive, time-independent, constant in the expression for total thermal energy, but since we are going to differentiate this with respect to time to obtain the rate of change of thermal energy, it doesn't matter.

10.2 The one-dimensional thermal diffusion equation

In one dimension, this equation becomes
$$\frac{\partial T}{\partial t} = D\,\frac{\partial^2 T}{\partial x^2}, \tag{10.9}$$
and can be solved using conventional methods.

Example 10.1 Solution of the one-dimensional thermal diffusion equation

The one-dimensional thermal diffusion equation looks a bit like a wave equation. Therefore, one method to solve eqn 10.9 is to look for wave-like solutions of the form
$$T(x,t) \propto \exp(i(kx - \omega t)), \tag{10.10}$$
where k = 2π/λ is the wave vector, ω = 2πf is the angular frequency, λ is the wavelength and f is the frequency. Substitution of this equation into eqn 10.9 yields
$$-i\omega = -Dk^2, \tag{10.11}$$
and hence
$$k^2 = \frac{i\omega}{D}, \tag{10.12}$$
so that
$$k = \pm(1 + i)\sqrt{\frac{\omega}{2D}}. \tag{10.13}$$
The spatial part of the wave, which looks like exp(ikx), can either be of the form
$$\exp\left((i - 1)\sqrt{\frac{\omega}{2D}}\,x\right), \quad \text{which blows up as } x \to -\infty, \tag{10.14}$$
or
$$\exp\left((-i + 1)\sqrt{\frac{\omega}{2D}}\,x\right), \quad \text{which blows up as } x \to \infty. \tag{10.15}$$

Let us now solve a problem in which a boundary condition is applied at x = 0 and a solution is desired in the region x > 0. We don't want solutions which blow up as x → ∞ and pick the first type of solution (i.e. eqn 10.14). Hence our general solution for x ≥ 0 can be written as
$$T(x,t) = \sum_\omega A(\omega)\exp(-i\omega t)\exp\left((i - 1)\sqrt{\frac{\omega}{2D}}\,x\right), \tag{10.16}$$
where we have summed over all possible frequencies. To find which frequencies are needed, we have to be specific about the boundary condition for which we want to solve. Let us imagine that we want to solve the one-dimensional problem of the propagation of sinusoidal temperature waves into the ground. The waves could be due to the alternation of day and night (for a wave with period 1 day), or winter and summer (for a wave with period 1 year). The boundary condition can be written as
$$T(0,t) = T_0 + \Delta T\cos\Omega t. \tag{10.17}$$
This boundary condition can be rewritten
$$T(0,t) = T_0 + \frac{\Delta T}{2}\,e^{i\Omega t} + \frac{\Delta T}{2}\,e^{-i\Omega t}. \tag{10.18}$$
However, at x = 0 the general solution (eqn 10.16) becomes
$$T(0,t) = \sum_\omega A(\omega)\exp(-i\omega t). \tag{10.19}$$
Comparison of eqns 10.18 and 10.19 implies that the only non-zero values of A(ω) are
$$A(0) = T_0, \qquad A(-\Omega) = \frac{\Delta T}{2} \qquad \text{and} \qquad A(\Omega) = \frac{\Delta T}{2}. \tag{10.20}$$
Hence the solution to our problem for x ≥ 0 is
$$T(x,t) = T_0 + \Delta T\,e^{-x/\delta}\cos\left(\Omega t - \frac{x}{\delta}\right), \tag{10.21}$$
which reduces to the boundary condition (eqn 10.17) at x = 0, and where
$$\delta = \sqrt{\frac{2D}{\Omega}} = \sqrt{\frac{2\kappa}{\Omega C}} \tag{10.22}$$
is known as the skin depth. The solution in eqn 10.21 is plotted in Fig. 10.2. [Note that the use of the term skin depth brings out the analogy between this effect and the skin depth which arises when electromagnetic waves are incident on a metal surface, see e.g. Griffiths (2003).] We note the following important features of this solution:

• T falls off exponentially as $e^{-x/\delta}$.
• There is a phase shift of x/δ radians in the oscillations.
• $\delta \propto \Omega^{-1/2}$ so that faster oscillations fall off faster.

Fig. 10.2 A contour plot and a surface plot of eqn 10.21, showing that the temperature falls off exponentially as $e^{-x/\delta}$. The contour plot shows that there is a phase shift in the oscillations as x increases.
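The skin depth of eqn 10.22 can be evaluated for daily and annual temperature waves as follows. This is a minimal sketch: the diffusivity of the ground is an assumed round number used purely for illustration, the function names are invented, and the amplitude of the damped wave is fixed by requiring that it reproduce the boundary condition $T(0,t) = T_0 + \Delta T\cos\Omega t$ of eqn 10.17.

```python
import math


def skin_depth(D, period):
    """Eqn 10.22: delta = sqrt(2 D / Omega), with Omega = 2 pi / period."""
    omega = 2.0 * math.pi / period
    return math.sqrt(2.0 * D / omega)


def temperature(x, t, T0, dT, D, period):
    """Damped temperature wave for x >= 0 that equals T0 + dT cos(Omega t) at x = 0."""
    omega = 2.0 * math.pi / period
    delta = skin_depth(D, period)
    return T0 + dT * math.exp(-x / delta) * math.cos(omega * t - x / delta)


D_ground = 5e-7                          # m^2/s, an assumed value for illustration only
day, year = 86400.0, 365.25 * 86400.0    # periods in seconds
print(f"daily skin depth  {skin_depth(D_ground, day):.3f} m")
print(f"annual skin depth {skin_depth(D_ground, year):.3f} m")
```

Since $\delta \propto \Omega^{-1/2}$, the annual wave penetrates $\sqrt{365.25} \approx 19$ times deeper than the daily one, whatever value of D is assumed.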

10.3 The steady state

If the system has reached a steady state, its properties are not time-dependent. This includes the temperature, so that
$$\frac{\partial T}{\partial t} = 0. \tag{10.23}$$
Hence in this case, the thermal diffusion equation reduces to
$$\nabla^2 T = 0, \tag{10.24}$$
which is Laplace's equation.

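10.4 The thermal diffusion equation for a sphere

Very often, heat transfer problems have spherical symmetry (e.g. the cooling of the Earth or the Sun). In this section we will show that one can also solve the (rather forbidding looking) problem of the thermal diffusion equation in a system with spherical symmetry. In spherical polars, we have in general that $\nabla^2 T$ is given by$^1$
$$\nabla^2 T = \frac{1}{r^2}\frac{\partial}{\partial r}\left(r^2\frac{\partial T}{\partial r}\right) + \frac{1}{r^2\sin\theta}\frac{\partial}{\partial\theta}\left(\sin\theta\,\frac{\partial T}{\partial\theta}\right) + \frac{1}{r^2\sin^2\theta}\frac{\partial^2 T}{\partial\phi^2}, \tag{10.25}$$
so that if T is not a function of θ or φ we can write
$$\nabla^2 T = \frac{1}{r^2}\frac{\partial}{\partial r}\left(r^2\frac{\partial T}{\partial r}\right), \tag{10.26}$$
and hence the diffusion equation becomes
$$\frac{\partial T}{\partial t} = \frac{\kappa}{C}\,\frac{1}{r^2}\frac{\partial}{\partial r}\left(r^2\frac{\partial T}{\partial r}\right). \tag{10.27}$$

1 See Appendix B.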

Example 10.2 The thermal diffusion equation for a sphere in the steady state

In the steady state, ∂T/∂t = 0 and hence we need to solve
$$\frac{1}{r^2}\frac{\partial}{\partial r}\left(r^2\frac{\partial T}{\partial r}\right) = 0. \tag{10.28}$$
Now if T is independent of r, ∂T/∂r = 0 and this will be a solution. Moreover, if $r^2(\partial T/\partial r)$ is independent of r, this will generate another solution. Now $r^2(\partial T/\partial r)$ = constant implies that $T \propto r^{-1}$. Hence a general solution is
$$T = A + \frac{B}{r}, \tag{10.29}$$
where A and B are constants. This should not surprise us if we know some electromagnetism, as we are solving Laplace's equation in spherical coordinates assuming spherical symmetry, and in electromagnetism the solution for the electric potential in this case is an arbitrary constant plus a Coulomb potential which is proportional to 1/r.
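As an illustration of how the constants A and B are fixed, consider a hypothetical spherical shell (not an example from the text) whose inner surface at radius $r_{\rm in}$ is held at temperature $T_{\rm in}$ and whose outer surface at radius $r_{\rm out}$ is held at $T_{\rm out}$. Applying eqn 10.29 at the two surfaces gives two linear equations for A and B:

```latex
\begin{align*}
  A + \frac{B}{r_{\rm in}} &= T_{\rm in}, &
  A + \frac{B}{r_{\rm out}} &= T_{\rm out}, \\[4pt]
  \Rightarrow\quad
  B &= \frac{r_{\rm in}\, r_{\rm out}\,(T_{\rm in} - T_{\rm out})}{r_{\rm out} - r_{\rm in}}, &
  A &= \frac{r_{\rm out} T_{\rm out} - r_{\rm in} T_{\rm in}}{r_{\rm out} - r_{\rm in}}.
\end{align*}
% Radial heat flux and total outward heat flow through a sphere of radius r:
\begin{align*}
  J_r = -\kappa\,\frac{\partial T}{\partial r} = \frac{\kappa B}{r^2},
  \qquad
  4\pi r^2 J_r = 4\pi\kappa B .
\end{align*}
```

The total heat flow $4\pi\kappa B$ is independent of r, as it must be in a steady state with no sources of heat between the two surfaces.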

A practical problem one often needs to solve is cooking a slab of meat. The meat is initially at some cool temperature (the temperature of the kitchen or of the refrigerator) and it is placed into a hot oven. The skill in cooking is getting the inside up to temperature. How long does it take? The next example shows how to calculate this for the (rather artiﬁcial) example of a spherical chicken!

Example 10.3 The spherical chicken

A spherical chicken$^2$ of radius a at initial temperature $T_0$ is placed into an oven at temperature $T_1$ at time t = 0 (see Figure 10.3). The boundary conditions are that the oven is at temperature $T_1$ so that
$$T(a,t) = T_1, \tag{10.30}$$
and the chicken is originally at temperature $T_0$, so that
$$T(r,0) = T_0. \tag{10.31}$$

2 The methods in this example can also be applied to a spherical nut roast.

Fig. 10.3 Initial condition of a spherical chicken of radius a at initial temperature $T_0$ which is placed into an oven at temperature $T_1$ at time t = 0.

We want to obtain the temperature as a function of time at the centre of the chicken, i.e. T(0, t).

Solution: We will show how we can transform this to a one-dimensional diffusion equation. This is accomplished using a substitution
$$T(r,t) = T_1 + \frac{B(r,t)}{r}, \tag{10.32}$$
where B(r, t) is now a function of r and t. This substitution is motivated by the solution to the steady-state problem in eqn 10.29 and of course means that we can write B as $B = r(T - T_1)$. We now need to work out some partial differentials:
$$\frac{\partial T}{\partial t} = \frac{1}{r}\frac{\partial B}{\partial t}, \tag{10.33}$$
$$\frac{\partial T}{\partial r} = -\frac{B}{r^2} + \frac{1}{r}\frac{\partial B}{\partial r}, \tag{10.34}$$
and hence multiplying eqn 10.34 by $r^2$ we have that
$$r^2\frac{\partial T}{\partial r} = -B + r\frac{\partial B}{\partial r}, \tag{10.35}$$

and therefore
$$\frac{\partial}{\partial r}\left(r^2\frac{\partial T}{\partial r}\right) = r\,\frac{\partial^2 B}{\partial r^2}, \tag{10.36}$$
which reduces the problem to
$$\frac{\partial B}{\partial t} = D\,\frac{\partial^2 B}{\partial r^2}, \tag{10.37}$$
where D = κ/C. This is a one-dimensional diffusion equation and is therefore much easier to solve than the one with which we started. The new boundary conditions can be rewritten as follows:

(1) because $B = r(T - T_1)$ we have that B = 0 when r = 0:
$$B(0,t) = 0; \tag{10.38}$$
(2) because $T = T_1$ at r = a we have that:
$$B(a,t) = 0; \tag{10.39}$$
(3) because $T = T_0$ at t = 0 we have that:
$$B(r,0) = r(T_0 - T_1). \tag{10.40}$$

We look for wave-like solutions with these boundary conditions and hence are led to try
$$B = \sin(kr)\,e^{-i\omega t}, \tag{10.41}$$
and hence
$$i\omega = Dk^2. \tag{10.42}$$

The relation ka = nπ where n is an integer fits the first two boundary conditions and hence
$$i\omega = D\left(\frac{n\pi}{a}\right)^2, \tag{10.43}$$
and hence our general solution is
$$B(r,t) = \sum_{n=1}^{\infty} A_n \sin\left(\frac{n\pi r}{a}\right) e^{-D(n\pi/a)^2 t}. \tag{10.44}$$
To find $A_n$, we need to match this solution at t = 0 using our third boundary condition. Hence
$$r(T_0 - T_1) = \sum_{n=1}^{\infty} A_n \sin\left(\frac{n\pi r}{a}\right). \tag{10.45}$$
We multiply both sides by $\sin(m\pi r/a)$ and integrate, so that
$$\int_0^a \sin\left(\frac{m\pi r}{a}\right) r\,(T_0 - T_1)\,dr = \sum_{n=1}^{\infty} A_n \int_0^a \sin\left(\frac{m\pi r}{a}\right)\sin\left(\frac{n\pi r}{a}\right) dr. \tag{10.46}$$
Notice that the functions $\sin(n\pi r/a)$ and $\sin(m\pi r/a)$ are orthogonal unless m = n.

The right-hand side yields $A_m a/2$ and the left-hand side can be integrated by parts. This yields
$$A_m = \frac{2a}{m\pi}\,(T_1 - T_0)(-1)^m, \tag{10.47}$$
and hence that
$$B(r,t) = \frac{2a}{\pi}\,(T_1 - T_0)\sum_{n=1}^{\infty}\frac{(-1)^n}{n}\,\sin(n\pi r/a)\,e^{-D(n\pi/a)^2 t}, \tag{10.48}$$
so that using eqn 10.32 the temperature T(r, t) inside the chicken (r ≤ a) behaves as
$$T(r,t) = T_1 + \frac{2a}{\pi}\,(T_1 - T_0)\sum_{n=1}^{\infty}\frac{(-1)^n}{n}\,\frac{\sin(n\pi r/a)}{r}\,e^{-D(n\pi/a)^2 t}. \tag{10.49}$$
The centre of the chicken has temperature
$$T(0,t) = T_1 + 2(T_1 - T_0)\sum_{n=1}^{\infty}(-1)^n e^{-D(n\pi/a)^2 t}, \tag{10.50}$$
using the fact that as r → 0,
$$\frac{1}{r}\sin\left(\frac{n\pi r}{a}\right) \to \frac{n\pi}{a}. \tag{10.51}$$
The expression in eqn 10.50 becomes dominated by the first exponential in the sum as time t increases, so that
$$T(0,t) \approx T_1 - 2(T_1 - T_0)\,e^{-D(\pi/a)^2 t}, \tag{10.52}$$
for $t \gg a^2/D\pi^2$. Analogous behaviour is of course found for a warm sphere which is cooling in a colder environment. A cooling or warming body thus behaves like a low-pass filter, with the smallest exponent dominating at long times. The smaller the sphere, the shorter the time before it warms or cools according to a simple exponential law.
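The series for the centre temperature (eqn 10.50) and its long-time limit (eqn 10.52) are easy to compare numerically. The sketch below is illustrative only: the radius, diffusivity and temperatures are assumed values that do not come from the text, and the function names are invented. It evaluates both expressions at the time $a^2\ln 20/(\pi^2 D)$ that appears in Exercise 10.3.

```python
import math


def centre_temperature(t, a, D, T0, T1, terms=200):
    """Eqn 10.50: T(0,t) = T1 + 2 (T1 - T0) sum_n (-1)^n exp(-D (n pi / a)^2 t)."""
    if t <= 0.0:
        return T0  # the series does not converge at t = 0; use the initial condition
    total = 0.0
    for n in range(1, terms + 1):
        total += (-1) ** n * math.exp(-D * (n * math.pi / a) ** 2 * t)
    return T1 + 2.0 * (T1 - T0) * total


def centre_temperature_long_time(t, a, D, T0, T1):
    """Eqn 10.52: the n = 1 term only, valid for t >> a^2 / (D pi^2)."""
    return T1 - 2.0 * (T1 - T0) * math.exp(-D * (math.pi / a) ** 2 * t)


# Assumed illustrative values (not from the text): radius 0.1 m, a water-like
# diffusivity of 1.4e-7 m^2/s, initial temperature 5 C, oven at 200 C.
a, D, T0, T1 = 0.1, 1.4e-7, 5.0, 200.0
t90 = a**2 * math.log(20.0) / (math.pi**2 * D)
full = centre_temperature(t90, a, D, T0, T1)
approx = centre_temperature_long_time(t90, a, D, T0, T1)
print(f"t90 = {t90 / 3600:.2f} h, series {full:.3f} C, single term {approx:.3f} C")
```

At this time the n = 2 term is already smaller than the n = 1 term by a factor of roughly $e^{-3\ln 20}$, which is why the single exponential is such a good approximation.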

10.5 Newton's law of cooling

Newton's law of cooling states that the temperature of a cooling body falls exponentially towards the temperature of its surroundings with a rate which is proportional to the area of contact between the body and the environment. The results of the previous section indicate that it is an approximation to reality, as a cooling sphere only cools exponentially at long times.

Newton's law of cooling is often stated as follows: the heat loss of a solid or liquid surface (a hot central heating pipe or the exposed surface of a cup of tea) to the surrounding gas (usually air, which is free to convect the heat away) is proportional to the area of contact multiplied by the temperature difference between the solid/liquid and the gas. Mathematically, this can be expressed as an equation for the heat flux J which is
$$\boldsymbol{J} = \boldsymbol{h}\,\Delta T, \tag{10.53}$$
where ∆T is the temperature difference between the body and its environment and h is a vector whose direction is normal to the surface of the body and whose magnitude $h = |\boldsymbol{h}|$ is a heat transfer coefficient. In general, h depends on the temperature of the body and its surroundings and varies over the surface, so that Newton's "law" of cooling is more of an empirical relation. This alternative definition generates an exponential decay of temperature as demonstrated in the following example.

Fig. 10.4 The sum of the first few terms of $T(0,t) = T_1 + 2(T_1 - T_0)\sum_{n=1}^{\infty}(-1)^n e^{-D(n\pi/a)^2 t}$ are shown, together with T(0, t) evaluated from all terms (thick solid line). The sums of only the first few terms fail near t = 0 and one needs more and more terms to give an accurate estimate of the temperatures as t gets closer to 0 (although this is the region where one knows what the temperature is anyway!).

Example 10.4

A polystyrene cup containing tea at temperature $T_{\rm hot}$ at t = 0 stands for a while in a room with air temperature $T_{\rm air}$. The heat loss through the surface area A exposed to the air is, according to Newton's law of cooling, proportional to $A(T(t) - T_{\rm air})$, where T(t) is the temperature of the tea at time t. Ignoring the heat lost by other means, we have that
$$-C\,\frac{\partial T}{\partial t} = JA = hA(T - T_{\rm air}), \tag{10.54}$$
where J is the heat flux, C is the heat capacity of the cup of tea and h is a constant, so that
$$T = T_{\rm air} + (T_{\rm hot} - T_{\rm air})\,e^{-\lambda t}, \tag{10.55}$$
where λ = Ah/C.
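That eqn 10.55 really is the solution of eqn 10.54 can be checked by stepping the differential equation forward in time and comparing with the exponential. The sketch below does this with a simple forward Euler integration; the values of h, A and C are rough guesses for a cup of tea, assumed here purely for illustration, and the function names are hypothetical.

```python
import math


def tea_temperature(t, T_hot, T_air, h, A, C):
    """Eqn 10.55: T = T_air + (T_hot - T_air) exp(-lambda t), lambda = A h / C."""
    lam = A * h / C
    return T_air + (T_hot - T_air) * math.exp(-lam * t)


def tea_temperature_euler(t, T_hot, T_air, h, A, C, steps=100000):
    """Integrate eqn 10.54, -C dT/dt = h A (T - T_air), by forward Euler steps."""
    dt = t / steps
    T = T_hot
    for _ in range(steps):
        T -= dt * h * A * (T - T_air) / C
    return T


# Assumed illustrative values: h, A and C are guesses, not data from the text.
h, A, C = 10.0, 5e-3, 1000.0        # W m^-2 K^-1, m^2, J K^-1
T_hot, T_air, t = 90.0, 20.0, 1800.0
analytic = tea_temperature(t, T_hot, T_air, h, A, C)
numeric = tea_temperature_euler(t, T_hot, T_air, h, A, C)
print(f"analytic {analytic:.3f} C, Euler {numeric:.3f} C")
```

The time constant 1/λ = C/(Ah) sets the scale: a larger heat capacity slows the cooling, while a larger exposed area or heat transfer coefficient speeds it up.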

What makes these types of calculations of heat transfer so diﬃcult is that heat transfer from bodies into their surrounding gas or liquid often is dominated by convection.3 Convection can be deﬁned as the transfer of heat by the motion of or within a ﬂuid (i.e. within a liquid or a gas). Convection is often driven by the fact that warmer ﬂuid expands and rises, while colder ﬂuid contracts and sinks; this causes currents in the ﬂuid to be set up which rather eﬃciently transfer heat. Our analysis of the thermal conductivity in a gas ignores such currents. Convection is a very complicated process and can depend on the precise details of the geometry of the surroundings. A third form of heat transfer is by thermal radiation and this will be the subject of chapter 23.

10.6 The Prandtl number

How valid is it to ignore convection? It's clearly fine to ignore it in a solid, but for a fluid we need to know the relative strength of the diffusion of momentum and heat. Convection dominates if momentum diffusion dominates (because convection involves transport of the gas itself) but conduction dominates if heat diffusion dominates. We can express these two diffusivities using the kinematic viscosity ν = η/ρ (with units m$^2$ s$^{-1}$) and the thermal diffusivity $D = \kappa/\rho c_p$ (also with units m$^2$ s$^{-1}$), where ρ is the density. To examine their relative magnitudes, we define the Prandtl number as the dimensionless ratio $\sigma_p$ obtained by dividing ν by D, so that
$$\sigma_p = \frac{\nu}{D} = \frac{\eta c_p}{\kappa}. \tag{10.56}$$
For an ideal monatomic gas, we can use $c_p/c_V = \gamma = \tfrac{5}{3}$, and using eqn 9.21 (which states that $\kappa = c_V\eta$) we arrive at $\sigma_p = \tfrac{5}{3}$. However, eqn 9.21 resulted from an approximate treatment, and the corrected version is eqn 9.44 (which states that $\kappa = \tfrac{5}{2}\eta c_V$), and hence we arrive at
$$\sigma_p = \frac{2}{3}. \tag{10.57}$$
For many gases, the Prandtl number is found to be around this value. It is between 100 and 40000 for engine oil and around 0.015 for mercury. When $\sigma_p \gg 1$, diffusion of momentum (i.e. viscosity) dominates over diffusion of heat (i.e. thermal conductivity), and convection is the dominant mode of heat transport. When $\sigma_p \ll 1$ the reverse is true, and thermal conduction dominates the heat transport.
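The two estimates of the Prandtl number for an ideal monatomic gas follow from the same one-line calculation, differing only in the numerical factor relating κ to $\eta c_V$. A minimal sketch, with function and parameter names invented for illustration:

```python
def prandtl_number(eta, c_p, kappa):
    """Eqn 10.56: sigma_p = nu / D = eta c_p / kappa."""
    return eta * c_p / kappa


def prandtl_monatomic_ideal_gas(gamma=5.0 / 3.0, eucken_factor=2.5):
    """sigma_p when kappa = eucken_factor * eta * c_V and c_p = gamma * c_V."""
    return gamma / eucken_factor


# Corrected treatment, kappa = (5/2) eta c_V (eqn 9.44).
print(prandtl_monatomic_ideal_gas())
# Simple treatment, kappa = c_V eta (eqn 9.21).
print(prandtl_monatomic_ideal_gas(eucken_factor=1.0))
```

The viscosity and the specific heat capacity cancel, so for this idealized gas the Prandtl number is a pure number fixed by γ and the prefactor alone.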

3 One can either have forced convection, in which ﬂuid is driven past the cooling body by some external input of work (provided by means of a pump, fan, propulsive motion of an aircraft etc.), or free convection, in which any external ﬂuid motion is driven only by the temperature diﬀerence between the cooling body and the surrounding ﬂuid. Newton’s law of cooling is actually only correct for forced convection, while for free convection (which one should probably use for the example of the cooling of a cup of tea in air) the heat transfer coeﬃcient is temperature dependent (h ∝ (∆T )1/4 for laminar ﬂow, h ∝ (∆T )1/3 in the turbulent regime). We examine convection in stars in more detail in Section 35.3.2.

98 The thermal diﬀusion equation

10.7

Sources of heat

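If heat is generated at a rate $H$ per unit volume (so $H$ is measured in W m⁻³), this will add to the divergence of $\mathbf{J}$ so that eqn 10.7 becomes

$$\nabla \cdot \mathbf{J} = -C\frac{\partial T}{\partial t} + H, \tag{10.58}$$

and hence the thermal diffusion equation becomes

$$\nabla^2 T = \frac{C}{\kappa}\frac{\partial T}{\partial t} - \frac{H}{\kappa}, \tag{10.59}$$

or equivalently

$$\frac{\partial T}{\partial t} = D\nabla^2 T + \frac{H}{C}. \tag{10.60}$$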

Example 10.5 A metallic bar of length $L$ with both ends maintained at $T = T_0$ passes a current which generates heat $H$ per unit length of the bar per second. Find the temperature at the centre of the bar in steady state.

Solution: In steady state,

$$\frac{\partial T}{\partial t} = 0, \tag{10.61}$$

and so

$$\frac{\partial^2 T}{\partial x^2} = -\frac{H}{\kappa}. \tag{10.62}$$

Integrating this twice yields

$$T = \alpha x + \beta - \frac{H}{2\kappa}x^2, \tag{10.63}$$

where $\alpha$ and $\beta$ are constants of integration. The boundary conditions imply that

$$T - T_0 = \frac{H}{2\kappa}x(L - x), \tag{10.64}$$

so that at $x = L/2$ we have that the temperature is

$$T = T_0 + \frac{HL^2}{8\kappa}. \tag{10.65}$$
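The result of Example 10.5 can be checked numerically by solving eqn 10.62 on a grid. The sketch below uses second-order central differences and the Thomas algorithm for the resulting tridiagonal system. The function name `steady_state_bar`, the grid size and the numerical values of $L$, $T_0$, $H$ and $\kappa$ are all assumptions chosen for illustration, and $H$ is treated here as a heating rate per unit volume so that $H/\kappa$ has the units required by eqn 10.62.

```python
def steady_state_bar(L, T0, H, kappa, n=101):
    """Solve d2T/dx2 = -H/kappa on [0, L] with T(0) = T(L) = T0.

    Uses central differences on n equally spaced grid points and the
    Thomas algorithm for the tridiagonal system of interior unknowns.
    """
    dx = L / (n - 1)
    m = n - 2                       # number of interior grid points
    # Interior equations: T[i-1] - 2*T[i] + T[i+1] = -H*dx**2/kappa
    a = [1.0] * m                   # sub-diagonal
    b = [-2.0] * m                  # diagonal
    c = [1.0] * m                   # super-diagonal
    d = [-H * dx**2 / kappa] * m    # right-hand side
    d[0] -= T0                      # fixed temperature at x = 0
    d[-1] -= T0                     # fixed temperature at x = L
    # Forward elimination.
    for i in range(1, m):
        w = a[i] / b[i - 1]
        b[i] -= w * c[i - 1]
        d[i] -= w * d[i - 1]
    # Back substitution.
    T = [0.0] * m
    T[-1] = d[-1] / b[-1]
    for i in range(m - 2, -1, -1):
        T[i] = (d[i] - c[i] * T[i + 1]) / b[i]
    return [T0] + T + [T0]


# Illustrative (assumed) values in SI units.
L, T0, H, kappa = 1.0, 300.0, 1.0e4, 400.0
T = steady_state_bar(L, T0, H, kappa)
T_centre_numeric = T[len(T) // 2]
T_centre_analytic = T0 + H * L**2 / (8 * kappa)   # eqn 10.65
print(T_centre_numeric, T_centre_analytic)
```

Because the exact solution (eqn 10.64) is a quadratic in $x$, central differences reproduce it up to rounding error, so the two printed numbers should agree very closely.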


Chapter summary

• The thermal diffusion equation (in the absence of a heat source) is

$$\frac{\partial T}{\partial t} = D\nabla^2 T, \tag{10.66}$$

where $D = \kappa/C$ is the thermal diffusivity.

• ‘Steady state’ implies that

$$\frac{\partial}{\partial t}(\text{physical quantity}) = 0. \tag{10.67}$$

• If heat is generated at a rate $H$ per unit volume per unit time, then the thermal diffusion equation becomes

$$\frac{\partial T}{\partial t} = D\nabla^2 T + \frac{H}{C}. \tag{10.68}$$

• Newton’s law of cooling states that the heat loss from a solid or liquid surface is proportional to the area of the surface multiplied by the temperature difference between the solid/liquid and the gas.

Exercises

(10.1) One face of a thick uniform layer is subject to sinusoidal temperature variations of angular frequency $\omega$. Show that damped sinusoidal temperature oscillations propagate into the layer and give an expression for the decay length of the oscillation amplitude. A cellar is built underground and is covered by a ceiling which is 3 m thick and made of limestone. The outside temperature is subject to daily fluctuations of amplitude 10°C and annual fluctuations of 20°C. Estimate the magnitude of the daily and annual temperature variations within the cellar. Assuming that January is the coldest month of the year, when will the cellar’s temperature be at its lowest? [The thermal conductivity of limestone is 1.6 W m⁻¹ K⁻¹, and the heat capacity of limestone is 2.5 × 10⁶ J K⁻¹ m⁻³.]

(10.2) (a) A cylindrical wire of thermal conductivity $\kappa$, radius $a$ and resistivity $\rho$ uniformly carries a current $I$. The temperature of its surface is fixed at $T_0$ using water cooling. Show that the temperature $T(r)$ inside the wire at radius $r$ is given by

$$T(r) = T_0 + \frac{\rho I^2}{4\pi^2 a^4 \kappa}(a^2 - r^2).$$

(b) The wire is now placed in air at temperature $T_{\rm air}$ and the wire loses heat from its surface according to Newton’s law of cooling (so that the heat flux from the surface of the wire is given by $\alpha(T(a) - T_{\rm air})$, where $\alpha$ is a constant). Find the temperature $T(r)$.

(10.3) Show that for the problem of a spherical chicken being cooked in an oven considered in Example 10.3 in this chapter, the temperature $T$ gets 90% of the way from $T_0$ to $T_1$ after a time $\sim a^2 \ln 20/(\pi^2 D)$.

(10.4) A microprocessor has an array of metal fins attached to it, whose purpose is to remove heat generated within the processor. Each fin may be represented by a long thin cylindrical copper rod with one end attached to the processor; heat received by the rod through this end is lost to the surroundings through its sides. Show that the temperature $T(x, t)$ at location $x$ along the rod at time $t$ obeys the equation

$$\rho C_p \frac{\partial T}{\partial t} = \kappa\frac{\partial^2 T}{\partial x^2} - \frac{2}{a}R(T),$$

where $a$ is the radius of the rod, and $R(T)$ is the rate of heat loss per unit area of surface at temperature $T$. The surroundings of the rod are at temperature $T_0$. Assume that $R(T)$ has the form of Newton’s law of cooling, namely $R(T) = A(T - T_0)$. In the steady state:
(a) obtain an expression for $T$ as a function of $x$ for the case of an infinitely long rod whose hot end has temperature $T_m$;
(b) show that the heat that can be transported away by a long rod (with radius $a$) is proportional to $a^{3/2}$, provided that $A$ is independent of $a$.
In practice the rod is not infinitely long. What length does it need to have for the results above to be approximately valid? The radius of the rod, $a$, is 1.5 mm. [The thermal conductivity of copper is 380 W m⁻¹ K⁻¹. The cooling constant $A$ = 250 W m⁻² K⁻¹.]

(10.5) For oscillations at frequency $\omega$, a viscous penetration depth $\delta_v$ can be defined by

$$\delta_v = \left(\frac{2\eta}{\rho\omega}\right)^{1/2}, \tag{10.69}$$

analogously to the thermal penetration depth

$$\delta = \left(\frac{2\kappa}{\rho c_p \omega}\right)^{1/2} \tag{10.70}$$

defined in this chapter. Show that

$$\left(\frac{\delta_v}{\delta}\right)^2 = \sigma_p, \tag{10.71}$$

where $\sigma_p$ is the Prandtl number (see eqn 10.56).

(10.6) For thermal waves, calculate the magnitude of the group velocity. This shows that the thermal diffusion equation cannot hold exactly as the velocity of propagation can become larger than that of the carriers. An alternative equation can be derived as follows. Consider the number density $n$ of thermal carriers in a material. In equilibrium, $n = n_0$, so that

$$\frac{\partial n}{\partial t} = -\left(\mathbf{v}\cdot\nabla n + \frac{n - n_0}{\tau}\right), \tag{10.72}$$

where $\tau$ is a relaxation time and $\mathbf{v}$ is the carrier velocity. Multiply this equation by $\hbar\omega\tau\mathbf{v}$, where $\hbar\omega$ is the energy of a carrier, and sum over all $\mathbf{k}$ states. Using the fact that $\sum_{\mathbf{k}} n_0\mathbf{v} = 0$ and $\mathbf{J} = \sum_{\mathbf{k}} \hbar\omega n\mathbf{v}$, and that $|n - n_0| \ll n_0$, show that

$$\mathbf{J} + \tau\frac{d\mathbf{J}}{dt} = -\kappa\nabla T, \tag{10.73}$$

and hence the modified thermal diffusion equation becomes

$$\frac{\partial T}{\partial t} + \tau\frac{\partial^2 T}{\partial t^2} = D\nabla^2 T. \tag{10.74}$$

Show that this does not suffer from a group velocity whose magnitude can ever become infinite. Is this modification ever necessary?

(10.7) A series of $N$ large, flat rectangular slabs with thickness $\Delta x_i$ and thermal conductivity $\kappa_i$ are placed on top of one another. The top and bottom surfaces are maintained at temperature $T_i$ and $T_f$ respectively. Show that the heat flux $J$ through the slabs is given by $J = (T_i - T_f)/\sum_i R_i$, where $R_i = \Delta x_i/\kappa_i$.

(10.8) The space between two concentric cylinders is filled with material of thermal conductivity $\kappa$. The inner (outer) cylinder has radius $r_1$ ($r_2$) and is maintained at temperature $T_1$ ($T_2$). Derive an expression for the heat flow per unit length between the cylinders.

(10.9) A pipe of radius $R$ is maintained at a uniform temperature $T$. In order to reduce heat loss from the pipe, it is lagged by an insulating material of thermal conductivity $\kappa$. The lagged pipe has radius $r > R$. Assume that all surfaces lose heat according to Newton’s law of cooling $\mathbf{J} = \mathbf{h}\Delta T$, where $h = |\mathbf{h}|$ can be taken to be a constant. Show that the heat loss per unit length of pipe is inversely proportional to

$$\frac{1}{hr} + \frac{1}{\kappa}\ln\left(\frac{r}{R}\right), \tag{10.75}$$

and hence show that thin lagging doesn’t reduce heat loss if $R < \kappa/h$.


Jean Baptiste Joseph Fourier (1768–1830)

[Fig. 10.5 J.B.J. Fourier]

Fourier was born in Auxerre, France, the son of a tailor. He was schooled there in the École Royale Militaire where he showed early mathematical promise. In 1787 he entered a Benedictine abbey to train for the priesthood, but the pull of science was too great and he never followed that vocation, instead becoming a teacher at his old school in Auxerre. He was also interested in politics, and unfortunately there was a lot of it around at the time; Fourier became embroiled in the Revolutionary ferment and in 1794 came close to being guillotined, but following Robespierre’s execution by the same means, the political tide turned in Fourier’s favour. He was able to study at the École Normale in Paris under such luminaries as Lagrange and Laplace, and in 1795 took up a chair at the École Polytechnique.

Fourier joined Napoleon on his invasion of Egypt in 1798, becoming governor of Lower Egypt in the process. There he carried out archaeological explorations and later wrote a book about Egypt (which Napoleon then edited to make the history sections more favourable to himself). Nelson’s defeat of the French fleet in late 1798 rendered Fourier isolated there, but he nevertheless set up political institutions. He managed to slink back to France in 1801 to resume his academic post, but Napoleon (a hard man to refuse) sent him back to an administrative position in Grenoble where he ended up on such highbrow activities as supervising the draining of swamps and organizing the construction of a road between Grenoble and Turin. He nevertheless found enough time to work on experiments on the propagation of heat and published, in 1807, his memoir on this subject. Lagrange and Laplace criticized his mathematics (Fourier had been forced to invent new techniques to solve the problem, which we now call Fourier series, and this was fearsomely unfamiliar stuff at the time), while the notoriously difficult Biot (he of the Biot-Savart law fame) claimed that Fourier had ignored his own crucial work on the subject (Fourier had discounted it, as Biot’s work on this subject was wrong). Fourier’s work won him a prize, but reservations about its importance or correctness remained.

In 1814, Napoleon was exiled to Elba and Fourier managed to avoid Napoleon who was due to pass through Grenoble en route out of France. When Napoleon escaped, he brought an army to Grenoble and Fourier avoided him again, earning Napoleon’s displeasure, but he managed to patch things up and got himself made Prefect of Rhône, a position from which he resigned as soon as he could. Following Napoleon’s final defeat at Waterloo, Fourier became somewhat out of favour in political circles and was able to continue working on physics and mathematics back in Paris. In 1822 he published his Théorie analytique de la chaleur (Analytical Theory of Heat) which included all his work on thermal diffusion and the use of Fourier series, a work that was to prove influential with many later thermodynamicists of the nineteenth century.

In 1824, Fourier wrote an essay which pointed towards what we now call the greenhouse effect; he realised that the insulating effect of the atmosphere might increase the Earth’s surface temperature. He understood the way planets lose heat via infrared radiation (though he called it “chaleur obscure”). Since so much of his scientific work had been bound up with the nature of heat (even his work on Fourier series was only performed so he could solve heat problems) he became, in his later years, somewhat obsessed by the imagined healing powers of heat. He kept his house overheated, and wore excessively warm clothes, in order to maximize the effect of the supposedly lifegiving heat. He died in 1830 after falling down the stairs.


Part IV The first law

In this part we are now ready to think about energy in some detail and hence introduce the first law of thermodynamics. This part is structured as follows:

• In Chapter 11, we present the notion of a function of state, of which internal energy is one of the most useful. We discuss in detail the first law of thermodynamics, which states that energy is conserved and heat is a form of energy. We derive expressions for the heat capacity measured at constant volume or pressure for an ideal gas.

• In Chapter 12 we introduce the key concept of reversibility and discuss isothermal and adiabatic processes.

11 Energy

11.1 Some definitions
11.2 The first law of thermodynamics
11.3 Heat capacity
Chapter summary
Exercises

In this chapter we are going to focus on one of the key concepts in thermal physics, that of energy. What happens when energy is changed from one form to another? How much work can you get out of a quantity of heat? These are key questions to be answered. We are now beginning a study of thermodynamics proper, and in this chapter we will introduce the ﬁrst law of thermodynamics. Before the ﬁrst law, the most important concept in this chapter, we will introduce some additional ideas.

11.1 Some definitions

11.1.1 A system in thermal equilibrium

In thermodynamics, we deﬁne a system to be whatever part of the Universe we select for study. Near the system are its surroundings. We recall from Section 4.1 that a system is in thermal equilibrium when its macroscopic observables (such as its pressure or its temperature) have ceased to change with time. If you take a gas in a container which has been held at a certain stable temperature for a considerable period of time, the gas is likely to be in thermal equilibrium. A system in thermal equilibrium having a particular set of macroscopic observables is said to be in a particular equilibrium state. If however, you suddenly apply a lot of heat to one side of the box, then initially at least, the gas is likely to be in a non-equilibrium state.

11.1.2 Functions of state

A system is in an equilibrium state if macroscopic observable properties have fixed, definite values, independent of ‘how they got there’. These properties are functions of state (sometimes called variables of state). A function of state is any physical quantity that has a well-defined value for each equilibrium state of the system. Thus, in thermal equilibrium these variables of state have no time dependence. Examples are volume, pressure, temperature and internal energy, and we will introduce a lot more in what follows. Examples of quantities which are not functions of state include the position of particle number 4325667, the total work done on a system and the total heat put into the system. Below, we will show in detail why work and heat are not functions of state. However, the point can be understood as follows: the fact that your hands are warm or cold depends on their current temperature (a function of state), independently of how you got them to that temperature. For example, you can get to the same final thermodynamic state of having warm hands by different combinations of working and heating, e.g. you can end up with warm hands by rubbing them together (using the muscles in your arms to do work on them) or putting them in a toaster¹ (adding heat).

1 NB don’t try this at home.

We now give a more mathematical treatment of what is meant by a function of state. Let the state of a system be described by parameters $\mathbf{x} = (x_1, x_2, \ldots)$ and let $f(\mathbf{x})$ be some function of state. [Note that this could be a very trivial function, such as $f(\mathbf{x}) = x_1$, since what we’ve called ‘parameters’ are themselves functions of state. But we want to allow for more complicated functions of state which might be combinations of these ‘parameters’.] Then if the system parameters change from $\mathbf{x}_i$ to $\mathbf{x}_f$, the change in $f$ is

$$\Delta f = \int_{\mathbf{x}_i}^{\mathbf{x}_f} df = f(\mathbf{x}_f) - f(\mathbf{x}_i). \tag{11.1}$$

This only depends on the endpoints xi and xf . The quantity df is an exact diﬀerential (see Appendix C.7) and functions of state have exact diﬀerentials. By contrast, a quantity which is represented by an inexact diﬀerential is not a function of state. The following example illustrates these kinds of diﬀerentials.

Example 11.1 Let a system be described by two parameters, $x$ and $y$. Let $f = xy$ so that

$$df = d(xy) = y\,dx + x\,dy. \tag{11.2}$$

Then if $(x, y)$ changes from $(0, 0)$ to $(1, 1)$, the change in $f$ is given by

$$\Delta f = \int_{(0,0)}^{(1,1)} df = [xy]_{(0,0)}^{(1,1)} = (1 \times 1) - (0 \times 0) = 1. \tag{11.3}$$

This answer is independent of the exact path taken (it could be any of those shown in Fig. 11.1) because $df$ is an exact differential. Now consider² đ$g = y\,dx$. The change in $g$ when $(x, y)$ changes from $(0, 0)$ to $(1, 1)$ along the path shown in Fig. 11.1(a) is given by

$$\Delta g = \int_{(0,0)}^{(1,1)} y\,dx = \int_0^1 x\,dx = \frac{1}{2}. \tag{11.4}$$

However when the integral is not carried out along the line $y = x$, but along the path shown in Fig. 11.1(b), it is given by

$$\Delta g = \int_{(0,0)}^{(1,0)} y\,dx + \int_{(1,0)}^{(1,1)} y\,dx = 0. \tag{11.5}$$

If the integral is taken along the path shown in Fig. 11.1(c), yet another result would be obtained, but we are not going to attempt to calculate that! Hence we find that the value of $\Delta g$ depends on the path taken, and this is because đ$g$ is an inexact differential.³

Fig. 11.1 Three possible paths between the points $(x, y) = (0, 0)$ and $(x, y) = (1, 1)$.

2 We put a line through quantities such as the d in đ$g$ to signify that it is an inexact differential.

3 Note that if $x$ is taken to be volume, $V$, and $y$ is taken to be pressure, $p$, then the quantity $f$ is proportional to temperature, while đ$g$ is the negative of the work đ$W = -p\,dV$. This demonstrates that temperature is a function of state and work is not.

Recall from Section 1.2 that functions of state can either be:

• extensive (proportional to system size), e.g. energy, volume, magnetization, mass, or

• intensive (independent of system size), e.g. temperature, pressure, magnetic field, density, energy density.

In general one can find an equation of state which connects functions of state: for a gas this takes the form $f(p, V, T) = 0$. An example is the equation of state for an ideal gas, $pV = nRT$, which we met in eqn 1.12.
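The path dependence found in Example 11.1 can also be seen numerically. The sketch below integrates $P\,dx + Q\,dy$ along a parametrized path with the midpoint rule. The helper `line_integral` and the two path functions are hypothetical names introduced only for this illustration; path (a) is the straight line $y = x$ and path (b) runs along the $x$-axis to $(1, 0)$ and then straight up to $(1, 1)$, as in eqns 11.4 and 11.5.

```python
def line_integral(P, Q, path, n=20_000):
    """Midpoint-rule estimate of the integral of P dx + Q dy along
    the path t -> (x(t), y(t)) for 0 <= t <= 1 (n should be even so
    that t = 0.5 falls on a grid point)."""
    total = 0.0
    for k in range(n):
        t0, t1 = k / n, (k + 1) / n
        x0, y0 = path(t0)
        x1, y1 = path(t1)
        xm, ym = path(0.5 * (t0 + t1))
        total += P(xm, ym) * (x1 - x0) + Q(xm, ym) * (y1 - y0)
    return total


def path_a(t):
    """Straight line y = x from (0, 0) to (1, 1)."""
    return (t, t)


def path_b(t):
    """Along the x-axis to (1, 0), then vertically up to (1, 1)."""
    return (2 * t, 0.0) if t < 0.5 else (1.0, 2 * t - 1.0)


# Exact differential df = y dx + x dy.
df_a = line_integral(lambda x, y: y, lambda x, y: x, path_a)
df_b = line_integral(lambda x, y: y, lambda x, y: x, path_b)

# Inexact differential dg = y dx.
dg_a = line_integral(lambda x, y: y, lambda x, y: 0.0, path_a)
dg_b = line_integral(lambda x, y: y, lambda x, y: 0.0, path_b)

print(df_a, df_b)   # both close to 1: independent of path
print(dg_a, dg_b)   # close to 1/2 and 0: depends on path
```

The same helper could be applied to any other path between the two points, which would give a further check that only the integral of the exact differential is unchanged.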

11.2 The first law of thermodynamics

Though the idea that heat and work are both forms of energy seems obvious to a modern physicist, the idea took some getting used to. Lavoisier had, in 1789, proposed that heat was a weightless, conserved fluid called caloric. Caloric was a fundamental element that couldn’t be created or destroyed. Lavoisier’s notion ‘explained’ a number of phenomena, such as combustion (fuels have stored caloric which is released on burning). Rumford in 1798 realised that something was wrong with the caloric theory: heating could be produced by friction, and if you keep on drilling through a cannon barrel (to take the example that drew the problem to his attention) almost limitless supplies of heat can be extracted. Where does all this caloric come from? Mayer quantified this in 1842 with an elegant experiment in which he frictionally generated heat in paper pulp and measured the temperature rise. Joule⁴ independently performed similar experiments, but more accurately, in the period 1840–1845 (and his results became better known so that he was able to claim the credit!). Joule let a mass tied to a string slowly descend a certain height, while the other end of the string turns a paddle wheel immersed in a certain mass of water. The turning of the paddle frictionally heats the water. After a number of descents, Joule measured the temperature rise of the water. In this way he was able to deduce the ‘mechanical equivalent of heat’. He also measured the heat output of a resistor (which, in modern units, is equal to $I^2R$, where $I$ is the current and $R$ the resistance). He was able to show that the same heat was produced for the same energy used, independent of the method of delivery. This implied that heat is a form of energy. Joule’s experiments therefore consigned the caloric theory of heat to a footnote in history. However, it was Mayer and later Helmholtz who elevated the experimental observations into a grand principle, which we can state as follows:

4 The S.I. unit of energy is named after Joule. 1 J = 1 N m. Older units are still in use in some places: 1 calorie is defined as the energy required to raise 1 g of water by 1°C (actually from 14.5°C to 15.5°C at sea level) and 1 cal = 4.184 J. The energy contained in food is usually measured in kilocalories (kcal), where 1 kcal = 1000 cal. Older books sometimes used the erg: 1 erg = 10⁻⁷ J. The British thermal unit (Btu) is an archaic unit, no longer commonly used in Britain: 1 Btu = 1055 J. The foot-pound is 1 ft lb = 1.356 J. Electricity bills often record energy in kilowatt hours (1 kWh = 3.6 MJ). Useful in atomic physics is the electron volt: 1 eV = 1.602 × 10⁻¹⁹ J.


The first law of thermodynamics: Energy is conserved and heat and work are both forms of energy.

A system has an internal energy $U$ which is the sum of the energy of all the internal degrees of freedom that the system possesses. $U$ is a function of state because it has a well-defined value for each equilibrium state of the system. We can change the internal energy of the system by heating it or by doing work on it. The heat $Q$ and work $W$ are not functions of state since they concern the manner in which energy is delivered to (or extracted from) the system. After the event of delivering energy to the system, you have no way of telling which of $Q$ or $W$ was added to (or subtracted from) the system by examining the system’s state.

The following analogy may be helpful: your personal bank balance behaves something like the internal energy $U$ in that it acts like a function of state of your finances; cheques and cash are like heat and work in that they both result in a change in your bank balance, but after they have been paid in, you can’t tell by simply looking at the value of your bank balance by which method the money was paid in.

The change in internal energy $U$ of a system can be written

$$\Delta U = \Delta Q + \Delta W, \tag{11.6}$$

where $\Delta Q$ is the heat supplied to the system and $\Delta W$ is the work done on the system. Note the convention: $\Delta Q$ is positive for heat supplied to the system; if $\Delta Q$ is negative, heat is extracted from the system; $\Delta W$ is positive for work done on the system; if $\Delta W$ is negative, the system does work on its surroundings.

We define a thermally isolated system as a system which cannot exchange heat with its surroundings. In this case we find that $\Delta U = \Delta W$, because no heat can pass in or out of a thermally isolated system.

For a differential change, we write eqn 11.6 as

$$dU = đQ + đW, \tag{11.7}$$

where đ$W$ and đ$Q$ are inexact differentials. The work done on stretching a wire by a distance $dx$ with a tension $F$ is (see Fig. 11.2(a))

$$đW = F\,dx. \tag{11.8}$$

The work done by compressing a gas (pressure $p$, volume $V$) by a piston can be calculated in a similar fashion (see Fig. 11.2(b)). In this case the force is $F = pA$, where $A$ is the area of the piston, and $A\,dx = -dV$, so that

$$đW = -p\,dV. \tag{11.9}$$

In this equation, the negative sign ensures that the work đ$W$ done on the system is positive when $dV$ is negative, i.e. when the gas is being compressed.

Fig. 11.2 (a) The work done stretching a wire by a distance $dx$ is $F\,dx$. (b) The work done compressing a gas is $-p\,dV$.


It turns out that eqn 11.9 is only strictly true for a reversible change, a point we will explain further in Section 12.1. The idea is that if the piston is not frictionless, or if you move the piston too suddenly and generate shock waves, you will need to do more work to compress the gas because more heat is dissipated in the process.

11.3 Heat capacity

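We now want to understand in greater detail how adding heat can change the internal energy of a gas. In general, the internal energy will be a function of temperature and volume, so that we can write $U = U(T, V)$. Hence a small change in $U$ can be related to changes in $T$ and $V$ by

$$dU = \left(\frac{\partial U}{\partial T}\right)_V dT + \left(\frac{\partial U}{\partial V}\right)_T dV. \tag{11.10}$$

Rearranging eqn 11.7 with eqn 11.9 yields

$$đQ = dU + p\,dV, \tag{11.11}$$

and now using eqn 11.10 we have that

$$đQ = \left(\frac{\partial U}{\partial T}\right)_V dT + \left[\left(\frac{\partial U}{\partial V}\right)_T + p\right] dV. \tag{11.12}$$

We can divide eqn 11.12 by $dT$ to obtain

$$\frac{đQ}{dT} = \left(\frac{\partial U}{\partial T}\right)_V + \left[\left(\frac{\partial U}{\partial V}\right)_T + p\right]\frac{dV}{dT}, \tag{11.13}$$

which is valid for any change in $T$ or $V$. However, what we want to know is what is the amount of heat we have to add to effect a change of temperature under certain constraints. The first constraint is that of keeping the volume constant. We recall the definition of the heat capacity at constant volume $C_V$ (see Section 2.2, eqn 2.6) as

$$C_V = \left(\frac{\partial Q}{\partial T}\right)_V. \tag{11.14}$$

From eqn 11.13, this constraint knocks out the second term and implies that

$$C_V = \left(\frac{\partial U}{\partial T}\right)_V. \tag{11.15}$$

The heat capacity at constant pressure is then, using eqns 2.7 and 11.13, given by

$$C_p = \left(\frac{\partial Q}{\partial T}\right)_p \tag{11.16}$$

$$\phantom{C_p} = \left(\frac{\partial U}{\partial T}\right)_V + \left[\left(\frac{\partial U}{\partial V}\right)_T + p\right]\left(\frac{\partial V}{\partial T}\right)_p \tag{11.17}$$

so that

$$C_p - C_V = \left[\left(\frac{\partial U}{\partial V}\right)_T + p\right]\left(\frac{\partial V}{\partial T}\right)_p. \tag{11.18}$$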

Recall from Section 2.2 that heat capacities are measured in J K⁻¹ and refer to the heat capacity of a certain quantity of gas. We will sometimes wish to talk about the heat capacity per mole of gas, or sometimes the heat capacity per mass of gas. We will use small $c$ for the latter, known as the specific heat capacities:

$$c_V = \frac{C_V}{M}, \tag{11.19}$$

$$c_p = \frac{C_p}{M}, \tag{11.20}$$

where $M$ is the mass of the material. Specific heat capacities are measured in J K⁻¹ kg⁻¹.

Example 11.2 Heat capacity of an ideal monatomic gas

For an ideal monatomic gas, the internal energy $U$ is due to the kinetic energy, and hence $U = \frac{3}{2}RT$ per mole (see eqn 5.17; this result arises from the kinetic theory of gases). This means that $U$ is only a function of temperature. Hence

$$\left(\frac{\partial U}{\partial V}\right)_T = 0. \tag{11.22}$$

The equation of state for 1 mole of ideal gas is

$$pV = RT, \tag{11.23}$$

so that

$$V = \frac{RT}{p}, \tag{11.24}$$

and hence

$$\left(\frac{\partial V}{\partial T}\right)_p = \frac{R}{p}, \tag{11.25}$$

and hence using eqns 11.18, 11.22 and 11.25 we have that

$$C_p - C_V = \left[\left(\frac{\partial U}{\partial V}\right)_T + p\right]\left(\frac{\partial V}{\partial T}\right)_p = R. \tag{11.26}$$

Because $U = \frac{3}{2}RT$, we therefore have that

$$C_V = \left(\frac{\partial U}{\partial T}\right)_V = \frac{3}{2}R \text{ per mole}, \tag{11.27}$$

and

$$C_p = C_V + R = \frac{5}{2}R \text{ per mole}. \tag{11.28}$$
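The steps of Example 11.2 can be mirrored numerically by estimating the partial derivatives in eqns 11.15 and 11.18 with central differences. This is only a sketch: the helper `partial`, the step sizes and the chosen state ($T$ = 300 K, $p$ = 10⁵ Pa) are assumptions made for illustration, and the value of $R$ is rounded.

```python
R = 8.314  # molar gas constant in J K^-1 mol^-1 (rounded)


def U(T, V):
    """Internal energy of one mole of ideal monatomic gas (V is unused)."""
    return 1.5 * R * T


def V_of(T, p):
    """Volume of one mole of ideal gas from pV = RT (eqn 11.24)."""
    return R * T / p


def partial(f, x, h):
    """Central-difference estimate of df/dx at x with step h."""
    return (f(x + h) - f(x - h)) / (2 * h)


# Assumed state point for the check.
T, p = 300.0, 1.0e5
V = V_of(T, p)

dU_dT_V = partial(lambda t: U(t, V), T, 1e-2)     # (dU/dT)_V
dU_dV_T = partial(lambda v: U(T, v), V, 1e-6)     # (dU/dV)_T, eqn 11.22
dV_dT_p = partial(lambda t: V_of(t, p), T, 1e-2)  # (dV/dT)_p, eqn 11.25

C_V = dU_dT_V                                     # eqn 11.15
C_p = C_V + (dU_dV_T + p) * dV_dT_p               # eqn 11.18

print(C_V / R, C_p / R, (C_p - C_V) / R)
```

The three printed ratios should come out close to 3/2, 5/2 and 1 respectively, in line with eqns 11.26 to 11.28. Replacing `U` and `V_of` by the corresponding functions for a non-ideal gas would give a rough numerical estimate of $C_p - C_V$ in cases where $(\partial U/\partial V)_T$ does not vanish.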


Example 11.3 Is it always true that $dU = C_V\,dT$?

Solution: No, in general eqn 11.10 and eqn 11.15 imply that

$$dU = C_V\,dT + \left(\frac{\partial U}{\partial V}\right)_T dV. \tag{11.29}$$

For an ideal gas, $\left(\frac{\partial U}{\partial V}\right)_T = 0$ (eqn 11.22) so it is true that

$$dU = C_V\,dT, \tag{11.30}$$

but for non-ideal gases, $\left(\frac{\partial U}{\partial V}\right)_T \neq 0$ and hence $dU \neq C_V\,dT$.

The ratio of $C_p$ to $C_V$ turns out to be a very useful quantity (we will see why in the following chapter) and therefore we give it a special name. Hence, we define the adiabatic index⁵ $\gamma$ as the ratio of $C_p$ and $C_V$, so that

$$\gamma = \frac{C_p}{C_V}. \tag{11.31}$$

The reason for the name will become clear in the following chapter.

5 $\gamma$ is sometimes called the adiabatic exponent.

Example 11.4 What is $\gamma$ for an ideal monatomic gas?

Solution: Using the results from the previous example⁶

$$\gamma = \frac{C_p}{C_V} = \frac{C_V + R}{C_V} = 1 + \frac{R}{C_V} = \frac{5}{3}. \tag{11.32}$$

6 If the gas is not monatomic, $\gamma$ can take a different value; see Section 19.2.

Example 11.5 Assuming $U = C_V T$ for an ideal gas, find (i) the internal energy per unit mass and (ii) the internal energy per unit volume.

Solution: Using the ideal gas equation $pV = N k_B T$ and the density $\rho = Nm/V$ (where $m$ is the mass of one molecule), we find that

$$\frac{p}{\rho} = \frac{k_B T}{m}. \tag{11.33}$$

Using eqn 11.32, we have that the heat capacity per mole is given by

$$C_V = \frac{R}{\gamma - 1}. \tag{11.34}$$

Hence, we can write that the internal energy for one mole of gas is

$$U = C_V T = \frac{RT}{\gamma - 1} = \frac{N_A k_B T}{\gamma - 1}. \tag{11.35}$$

The molar mass is $mN_A$, and so dividing eqn 11.35 by the molar mass yields $\tilde{u}$, the internal energy per unit mass, given by

$$\tilde{u} = \frac{p}{\rho(\gamma - 1)}. \tag{11.36}$$

Multiplying $\tilde{u}$ by the density $\rho$ gives $u$, the internal energy per unit volume, as

$$u = \rho\tilde{u} = \frac{p}{\gamma - 1}. \tag{11.37}$$
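To get a feel for the sizes involved in Example 11.5, the results can be evaluated for a particular gas. The numbers below are assumptions chosen for illustration only: helium treated as an ideal monatomic gas at $p$ = 10⁵ Pa and $T$ = 300 K, with an approximate atomic mass.

```python
k_B = 1.380649e-23   # Boltzmann constant in J K^-1
m = 6.6465e-27       # approximate mass of one helium atom in kg (assumed gas)
gamma = 5.0 / 3.0    # adiabatic index of a monatomic ideal gas (eqn 11.32)

p = 1.0e5            # assumed pressure in Pa
T = 300.0            # assumed temperature in K

rho = p * m / (k_B * T)                # density from eqn 11.33
u_tilde = p / (rho * (gamma - 1.0))    # energy per unit mass, eqn 11.36
u = rho * u_tilde                      # energy per unit volume, eqn 11.37

print(rho)       # density in kg m^-3
print(u_tilde)   # internal energy per unit mass in J kg^-1
print(u)         # internal energy per unit volume in J m^-3
```

Note that $u = p/(\gamma - 1)$ depends only on the pressure and on $\gamma$, so the choice of helium affects $\rho$ and $\tilde{u}$ but not $u$.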

Chapter summary

• Functions of state have exact differentials.

• The first law of thermodynamics states that ‘energy is conserved and heat is a form of energy’.

• $dU = đW + đQ$.

• For a reversible change, $đW = -p\,dV$.

• $C_V = \left(\frac{\partial Q}{\partial T}\right)_V = \left(\frac{\partial U}{\partial T}\right)_V$.

• $C_p = \left(\frac{\partial Q}{\partial T}\right)_p$ and $C_p - C_V = R$ for a mole of ideal gas.

• The adiabatic index is $\gamma = C_p/C_V$.

Exercises

(11.1) One mole of ideal monatomic gas is confined in a cylinder by a piston and is maintained at a constant temperature $T_0$ by thermal contact with a heat reservoir. The gas slowly expands from $V_1$ to $V_2$ while being held at the same temperature $T_0$. Why does the internal energy of the gas not change? Calculate the work done by the gas and the heat flow into the gas.

(11.2) Show that, for an ideal gas,

$$\frac{R}{C_V} = \gamma - 1 \tag{11.38}$$

and

$$\frac{R}{C_p} = \frac{\gamma - 1}{\gamma}, \tag{11.39}$$

where $C_V$ and $C_p$ are the heat capacities per mole.

(11.3) Consider the differential

$$dz = 2xy\,dx + (x^2 + 2y)\,dy. \tag{11.40}$$

Evaluate the integral $\int_{(x_1,y_1)}^{(x_2,y_2)} dz$ along the paths consisting of straight-line segments
(i) $(x_1, y_1) \to (x_2, y_1)$ and then $(x_2, y_1) \to (x_2, y_2)$.
(ii) $(x_1, y_1) \to (x_1, y_2)$ and then $(x_1, y_2) \to (x_2, y_2)$.
Is $dz$ an exact differential?

(11.4) In polar coordinates, $x = r\cos\theta$ and $y = r\sin\theta$. The definition of $x$ implies that

$$\frac{\partial x}{\partial r} = \cos\theta = \frac{x}{r}. \tag{11.41}$$

But we also have $x^2 + y^2 = r^2$, so differentiating with respect to $r$ gives

$$2x\frac{\partial x}{\partial r} = 2r \implies \frac{\partial x}{\partial r} = \frac{r}{x}. \tag{11.42}$$

But eqns 11.41 and 11.42 imply that

$$\frac{\partial x}{\partial r} = \frac{\partial r}{\partial x}. \tag{11.43}$$

What’s gone wrong?

(11.5) In the comic song by Flanders and Swann about the laws of thermodynamics, they summarize the first law by the statement:

Heat is work and work is heat

Is that a good summary?


Antoine Lavoisier (1743–1794)

[Fig. 11.3 Antoine Lavoisier]

All flammable materials contain the odourless, colourless, tasteless substance phlogiston, and the process of burning them releases this phlogiston into the air. The burned material is said to be “dephlogistonated”. That this notion is completely untrue was first shown by Antoine Lavoisier, who was born into a wealthy Parisian family. Lavoisier showed that both sulphur and phosphorus increase in weight once burned, but the weight gain was lost from the air. He demonstrated that it was oxygen which was responsible for combustion, not phlogiston, and also that oxygen was responsible for the rusting of metals (his oxygen work was helped by results communicated to him by Joseph Priestley, and Lavoisier was a little lax in giving Priestley credit for this). Lavoisier showed that hydrogen and oxygen combined to make water and also identified the concept of an element as a fundamental substance that could not be broken down into simpler constituents by chemical processes. Lavoisier combined great experimental skill (and in this, he was ably assisted by his wife) and theoretical insight and is considered a founder of modern chemistry. Unfortunately, he added to his list of elemental substances both light and caloric, his proposed fluid which carried heat. Thus while ridding science of an unnecessary mythical substance (phlogiston), he introduced another one (caloric).

Lavoisier was a tax collector and thus found himself in the firing line when the French revolution started, the fact that he ploughed his dubiously-gotten gains into scientific research cutting no ice with revolutionaries. He had unfortunately made an enemy of Jean-Paul Marat, a journalist with an interest in science who in 1780 had wanted to join the French Academy of Sciences, but was blocked by Lavoisier. In 1792 Marat, now a firebrand revolutionary leader, demanded Lavoisier’s death. Although Marat was himself assassinated in 1793 (while lying in his bath), Lavoisier was guillotined the following year.

Benjamin Thompson [Count Rumford] (1753–1814)

[Fig. 11.4 Benjamin Thompson]

Thompson was born in rural Massachusetts and had an early interest in science. In 1772, as a humble doctor’s apprentice he married a rich heiress, moved to Rumford, New Hampshire, and got himself appointed as a major in a local militia. He threw his lot in with the British during the American Revolution, feeding them information about the location of American forces and performing scientific work on the force of gunpowder. His British loyalties made him few friends in the land of his birth and he fled to Britain, abandoning his wife. He subsequently fell out with the British and moved, in 1785, to Bavaria where he worked for Elector Karl Theodor who made him a Count, and henceforth he was known as Count Rumford. He organised the poor workhouses, established the cultivation of the potato in Bavaria and invented Rumford soup. He continued to work on science, sometimes erratically (he believed that gases and liquids were perfect insulators of heat) but sometimes brilliantly; he noticed that the drilling of metal cannon barrels produced apparently limitless amounts of heat and his subsequent understanding of the production of heat by friction allowed him to put an end to Lavoisier’s caloric theory. Not content with simply destroying Lavoisier’s theory, he married Lavoisier’s widow in 1804, though they separated four years later (Rumford unkindly remarked that Antoine Lavoisier had been lucky to have been guillotined rather than to have stayed married to her!).

In 1799, Rumford founded the Royal Institution of Great Britain, establishing Davy as the first lecturer (Michael Faraday was appointed there 14 years later). He also endowed a medal for the Royal Society and a chair at Harvard. Rumford was also a prolific inventor and gave the world the Rumford fireplace, the double boiler, a drip coffeepot and, perhaps improbably, Baked Alaska (though Rumford’s priority on the latter invention is not universally accepted).

12 Isothermal and adiabatic processes

12.1 Reversibility
12.2 Isothermal expansion of an ideal gas
12.3 Adiabatic expansion of an ideal gas
12.4 Adiabatic atmosphere
Chapter summary
Exercises

In this chapter we will apply the results of the previous chapter to illustrate some properties concerning isothermal and adiabatic expansions of gases. These results will assume that the expansions are reversible, and so the ﬁrst part of this chapter explores the key concept of reversibility. This will be important for our discussion of entropy in subsequent chapters.

12.1 Reversibility

The laws of physics are reversible, so that if any process is allowed, then the time-reversed process can also occur. For example, if you could film the molecules in a gas bouncing off each other and the container walls, then when watching the film it would be hard to tell whether the film was being played forwards or backwards. However, there are plenty of processes which you see in nature which seem to be irreversible. For example, consider an egg rolling off the edge of a table and smashing on the floor. Potential energy is converted into kinetic energy as the egg falls, and ultimately the energy ends up as a small amount of heat in the broken egg and the floor. The law of conservation of energy does not forbid the conversion of that heat back into kinetic energy of the reassembled egg which would then leap off the ground and back on to the table. However, this is never observed to happen. As another example, consider a battery which drives a current $I$ through a resistor with resistance $R$ and dissipates heat $I^2 R$ into the environment. Again, one never finds heat being absorbed by a resistor from its environment, resulting in the generation of a spontaneous current which can be used to recharge the battery. Lots of processes are like this, in which the final outcome is some potential, chemical or kinetic energy that gets converted into heat, which is then dissipated into the environment. As we shall see, the reason seems to be that there are lots more ways that the energy can be distributed in heat than in any other way, and this is therefore the most probable outcome. To try and understand this statistical nature of reversibility, it is helpful to consider the following example.


Example 12.1 We return to the situation described in Example 4.1. To recap, you are given a large box containing 100 identical coins. With the lid on the box, you give it a really good long and hard shake, so that you can hear the coins flipping, rattling and being generally tossed around. Now you open the lid and look inside the box. Some of the coins will be lying with heads facing up and some with tails facing up. We assume that each of the $2^{100}$ possible configurations (the microstates) is equally likely to be found, so each has a probability of occurrence of approximately $10^{-30}$. However, the measurement made is counting the number of coins which are heads and the number which are tails (the macrostates), and the results of this measurement are not equally likely. In Example 3.1 we showed that of the $\approx 10^{30}$ individual microstates, a large number ($\approx 10^{29}$) corresponded to 50 heads and 50 tails, but only one microstate corresponded to 100 heads and 0 tails. Now, imagine that you had in fact carefully prepared the coins so that they were all lying heads up. Following a good shake, the coins will most probably be a mixture of heads and tails. If, on the other hand, you carefully prepared a mixed arrangement of heads and tails, a good shake of the box is very unlikely to achieve a state in which all the coins lie with heads facing up. The process of shaking the box seems to almost always randomize the number of heads and tails, and this is an irreversible process.
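The counting behind this example can be checked directly. The following is a minimal sketch in Python (the function name `count_microstates` is ours, not the book's): it counts the microstates belonging to a given macrostate with a binomial coefficient and compares the evenly mixed macrostate with the all-heads one.

```python
import math

N_COINS = 100  # number of coins in the box, as in the example


def count_microstates(n_heads: int, n_coins: int = N_COINS) -> int:
    """Number of microstates with exactly n_heads coins showing heads."""
    return math.comb(n_coins, n_heads)


total = 2 ** N_COINS                        # every coin is heads or tails
p_single_microstate = 1 / total             # each microstate is equally likely
p_mixed = count_microstates(50) / total     # macrostate: 50 heads, 50 tails
p_all_heads = count_microstates(100) / total  # macrostate: 100 heads

print(f"total microstates        : {total:.3e}")
print(f"microstates with 50 heads: {count_microstates(50):.3e}")
print(f"P(50 heads)              : {p_mixed:.4f}")
print(f"P(100 heads)             : {p_all_heads:.3e}")
```

A single shake lands on exactly 50 heads about 8% of the time, whereas the all-heads macrostate, with its single microstate, has a probability of order $10^{-30}$.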

This shows that the statistical behaviour of large systems is such as to make certain outcomes (such as a box of coins with mixed heads and tails) more likely than certain others (such as a box of coins containing coins the same way up). The statistics of large numbers therefore seems to drive many physical changes in an irreversible direction.

How can we do a process in a reversible fashion? The early researchers in thermodynamics wrestled with this problem which was of enormous practical importance in the design of engines in which you want to waste as little heat as possible in order to make your engine as efficient as possible. It was realized that when gases are expanded or compressed, it is possible to irreversibly convert energy into heat, and this will generally occur when we perform the expansion or the compression very fast, causing shock waves to be propagated through the gas (we will consider this effect in more detail in Chapter 32). However, it is possible to perform the expansion or compression reversibly if we do it sufficiently slowly so that the gas remains in equilibrium throughout the entire process and passes seamlessly from one equilibrium state to the next, each equilibrium state differing from the previous one by an infinitesimal change in the system parameters. Such a process is said to be quasistatic, since the process is almost in completely unchanging static equilibrium. As we shall see, heat can nevertheless be absorbed or emitted in the process, while still maintaining reversibility.1 In contrast, for an irreversible process, a non-zero change (rather than a sequence of infinitesimal changes) is made to the system, and therefore the system is not at equilibrium throughout the process. An important (but given the name, perhaps not surprising) property of reversible processes is that you can run them in reverse. This fact we will use a great deal in Chapter 13. Of course, it would take an infinite amount of time for a strictly reversible process to occur, so most processes we term reversible are approximations to the ‘real thing’.

1 This is an important point: reversibility does not necessarily exclude the generation of heat. However, reversibility does require the absence of friction; a vehicle braking and coming to a complete stop, converting its kinetic energy into heat through friction in the brakes, is an irreversible process.

12.2 Isothermal expansion of an ideal gas

In this section, we will calculate the heat change in a reversible isothermal expansion of an ideal gas. The word isothermal means ‘at constant temperature’, and hence in an isothermal process

$$\Delta T = 0. \tag{12.1}$$

For an ideal gas, we showed in eqn 11.30 that $dU = C_V\,dT$, and so this means that for an isothermal change

$$\Delta U = 0, \tag{12.2}$$

since $U$ is a function of temperature only. Equation 12.2 implies that $dU = 0$ and hence from eqn 11.7

$$\text{đ}W = -\text{đ}Q, \tag{12.3}$$

so that the work done by the gas on its surroundings as it expands is equal to the heat absorbed by the gas. We can use $\text{đ}W = -p\,dV$ (eqn 11.9) which is the correct expression for the work done in a reversible expansion. Hence the heat absorbed by the gas during an isothermal expansion from volume $V_1$ to volume $V_2$ of 1 mole of an ideal gas at temperature $T$ is

$$\begin{align}
\Delta Q &= \int \text{đ}Q \tag{12.4}\\
&= -\int \text{đ}W \tag{12.5}\\
&= \int_{V_1}^{V_2} p\,dV \tag{12.6}\\
&= \int_{V_1}^{V_2} \frac{RT}{V}\,dV \tag{12.7}\\
&= RT \ln\frac{V_2}{V_1}. \tag{12.8}
\end{align}$$

For an expansion, $V_2 > V_1$, and so $\Delta Q > 0$. The internal energy has stayed the same, but the volume has increased so that the energy density has gone down. The energy density and the pressure are proportional to one another,2 so that pressure will also have decreased.

2 See eqn 6.25.
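As a numerical illustration of eqn 12.8, here is a minimal Python sketch; the temperature and volumes are assumed values chosen only for the example. It evaluates $RT\ln(V_2/V_1)$ for one mole and checks it against a direct midpoint-rule integration of $p\,dV$ with $p = RT/V$.

```python
import math

R = 8.314462618  # molar gas constant, J K^-1 mol^-1


def isothermal_heat(T: float, V1: float, V2: float) -> float:
    """Heat absorbed by 1 mole of ideal gas at temperature T (eqn 12.8)."""
    return R * T * math.log(V2 / V1)


def isothermal_heat_numeric(T: float, V1: float, V2: float,
                            steps: int = 100_000) -> float:
    """Midpoint-rule estimate of the integral of p dV with p = RT/V."""
    dV = (V2 - V1) / steps
    return sum(R * T / (V1 + (i + 0.5) * dV) * dV for i in range(steps))


# Assumed example: 1 mole at 300 K doubling its volume from 1 to 2 litres.
T, V1, V2 = 300.0, 1.0e-3, 2.0e-3
print(f"analytic : {isothermal_heat(T, V1, V2):.2f} J")
print(f"numerical: {isothermal_heat_numeric(T, V1, V2):.2f} J")
```

Only the ratio $V_2/V_1$ matters, so doubling the volume absorbs the same heat whatever the starting volume.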

12.3 Adiabatic expansion of an ideal gas

The word adiathermal means ‘without flow of heat’. A system bounded by adiathermal walls is said to be thermally isolated. Any work done on such a system produces an adiathermal change. We define a change to be adiabatic if it is both adiathermal and reversible. In an adiabatic expansion, therefore, there is no flow of heat and we have

$$\text{đ}Q = 0. \tag{12.9}$$

The first law of thermodynamics therefore implies that

$$dU = \text{đ}W. \tag{12.10}$$

For an ideal gas, $dU = C_V\,dT$, and using $\text{đ}W = -p\,dV$ for a reversible change, we find that, for 1 mole of ideal gas,

$$C_V\,dT = -p\,dV = -\frac{RT}{V}\,dV, \tag{12.11}$$

so that, dividing by $C_V T$ and integrating from the initial state $(T_1, V_1)$ to the final state $(T_2, V_2)$,

$$\ln\frac{T_2}{T_1} = -\frac{R}{C_V}\ln\frac{V_2}{V_1}. \tag{12.12}$$

Now $C_p = C_V + R$, and dividing this by $C_V$ yields $\gamma = C_p/C_V = 1 + R/C_V$, and therefore $-(R/C_V) = 1 - \gamma$, so that eqn 12.12 becomes

$$T V^{\gamma-1} = \text{constant}, \tag{12.13}$$

or equivalently (using $pV \propto T$ for an ideal gas)

$$p^{1-\gamma} T^{\gamma} = \text{constant} \tag{12.14}$$

and

$$p V^{\gamma} = \text{constant}, \tag{12.15}$$

the last equation probably being the most memorable. Figure 12.1 shows isotherms (lines of constant temperature, as would be followed in an isothermal expansion) and adiabats (lines followed by an adiabatic expansion in which heat cannot enter or leave the system) for an ideal gas on a graph of pressure against volume. At each point, the adiabats have a steeper gradient than the isotherms, a fact we will return to in a later chapter.
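The three adiabatic relations can be checked against one another numerically. The sketch below is illustrative only: the starting state and the choice $\gamma = 5/3$ (a monatomic gas) are assumptions, and the helper name is ours. It uses eqn 12.13 for the final temperature and eqn 12.15 for the final pressure, then confirms that the results obey the ideal gas law and eqn 12.14.

```python
R = 8.314462618  # molar gas constant, J K^-1 mol^-1


def adiabatic_final_state(T1: float, V1: float, V2: float, gamma: float):
    """Final (T2, p1, p2) for 1 mole taken reversibly and adiabatically V1 -> V2."""
    T2 = T1 * (V1 / V2) ** (gamma - 1.0)  # T V^(gamma-1) = constant (eqn 12.13)
    p1 = R * T1 / V1                      # ideal gas law for the initial state
    p2 = p1 * (V1 / V2) ** gamma          # p V^gamma = constant (eqn 12.15)
    return T2, p1, p2


# Assumed example: monatomic gas at 300 K doubling its volume.
gamma = 5.0 / 3.0
T1, V1, V2 = 300.0, 1.0e-3, 2.0e-3
T2, p1, p2 = adiabatic_final_state(T1, V1, V2, gamma)

print(f"T2 = {T2:.2f} K")
print(f"p2 = {p2:.4e} Pa, ideal-gas value R*T2/V2 = {R * T2 / V2:.4e} Pa")
# eqn 12.14: p^(1-gamma) T^gamma takes the same value before and after
print(p1 ** (1 - gamma) * T1 ** gamma, p2 ** (1 - gamma) * T2 ** gamma)
```

Unlike the isothermal case, the gas cools as it expands, because the work it does is paid for entirely out of its internal energy.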

Fig. 12.1 Isotherms (solid lines) and adiabats (dashed lines), plotted as pressure $p$ against volume $V$.

Note on eqn 12.11: $C_V$ here is per mole, since we are dealing with 1 mole of ideal gas.

12.4 Adiabatic atmosphere

The hydrostatic equation (eqn 4.23) expresses the additional pressure due to a thickness $dz$ of atmosphere with density $\rho$ and is

$$dp = -\rho g\,dz. \tag{12.16}$$

Since $p = n k_B T$ and $\rho = nm$, where $m$ is the mass of one molecule, we can write $\rho = mp/k_B T$ and hence

$$\frac{dp}{dz} = -\frac{mgp}{k_B T}, \tag{12.17}$$

which implies that

$$T\,\frac{dp}{p} = -\frac{mg}{k_B}\,dz. \tag{12.18}$$

For an isothermal atmosphere, $T$ is a constant, and one obtains the results of Example 4.4. This assumes that the whole atmosphere is at a uniform temperature and is very unrealistic. A much better approximation (although nevertheless still an approximation to reality) is that each parcel of air3 does not exchange heat with its surroundings. This means that if a parcel of air rises, it expands adiabatically. In this case, eqn 12.18 can be solved by recalling that for an adiabatic expansion $p^{1-\gamma} T^{\gamma}$ is a constant (see eqn 12.14) and hence that

$$(1-\gamma)\frac{dp}{p} + \gamma\frac{dT}{T} = 0. \tag{12.19}$$

Substituting this into eqn 12.18 yields

$$\frac{dT}{dz} = -\frac{\gamma-1}{\gamma}\,\frac{mg}{k_B}, \tag{12.20}$$

which is an expression relating the rate of decrease of temperature with height, predicting it to be linear. We can rewrite $(\gamma-1)/\gamma = R/C_p$, and using $R = N_A k_B$ and writing the molar mass $M_{\text{molar}} = N_A m$ we can write eqn 12.20 as

$$\frac{dT}{dz} = -\frac{M_{\text{molar}}\,g}{C_p}. \tag{12.21}$$

3 Atmospheric physicists call a ‘bit’ of air a ‘parcel’.


The quantity Mmolar g/Cp is known as the adiabatic lapse rate. For dry air (mostly nitrogen), it comes out as 9.7 K/km. Experimental values in the atmosphere are closer to 6–7 K/km (due partly to the fact that the atmosphere isn’t dry, and latent heat eﬀects, due to the heat needed to evaporate water droplets [and sometimes thaw ice crystals], are also important).
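The dry-air figure can be reproduced with a few lines of Python. This is a rough sketch under stated assumptions: the molar mass of dry air is taken as 0.02897 kg/mol, $g$ as 9.81 m/s², and air is treated as an ideal diatomic gas with molar $C_p = \tfrac{7}{2}R$; none of these inputs is quoted in the text.

```python
R = 8.314462618   # molar gas constant, J K^-1 mol^-1
G = 9.81          # assumed gravitational acceleration, m s^-2
M_AIR = 0.02897   # assumed molar mass of dry air, kg mol^-1


def adiabatic_lapse_rate(molar_mass: float, c_p: float, g: float = G) -> float:
    """Magnitude of dT/dz in K per metre, i.e. M_molar * g / C_p (eqn 12.21)."""
    return molar_mass * g / c_p


c_p_dry_air = 3.5 * R  # assumed: ideal diatomic gas, C_p = (7/2) R per mole
lapse_K_per_km = adiabatic_lapse_rate(M_AIR, c_p_dry_air) * 1000.0
print(f"dry adiabatic lapse rate = {lapse_K_per_km:.2f} K/km")
```

With these inputs the result is a little under 10 K/km, in line with the dry-air value quoted above.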

Chapter summary

• In an isothermal expansion $\Delta T = 0$.
• An adiabatic change is both adiathermal (no flow of heat) and reversible. In an adiabatic expansion of an ideal gas, $pV^{\gamma}$ is constant.

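Exercises

(12.1) In an adiabatic expansion of an ideal gas, $pV^{\gamma}$ is constant. Show also that

$$T V^{\gamma-1} = \text{constant}, \tag{12.22}$$

$$T = \text{constant} \times p^{1-1/\gamma}. \tag{12.23}$$

(12.2) Assume that gases behave according to a law given by $pV = f(T)$, where $f(T)$ is a function of temperature. Show that this implies

$$\left(\frac{\partial p}{\partial T}\right)_V = \frac{1}{V}\frac{df}{dT}, \tag{12.24}$$

$$\left(\frac{\partial V}{\partial T}\right)_p = \frac{1}{p}\frac{df}{dT}. \tag{12.25}$$

Show also that

$$\left(\frac{\partial Q}{\partial V}\right)_p = C_p\left(\frac{\partial T}{\partial V}\right)_p, \tag{12.26}$$

$$\left(\frac{\partial Q}{\partial p}\right)_V = C_V\left(\frac{\partial T}{\partial p}\right)_V. \tag{12.27}$$

In an adiabatic change, we have that

$$dQ = \left(\frac{\partial Q}{\partial p}\right)_V dp + \left(\frac{\partial Q}{\partial V}\right)_p dV = 0. \tag{12.28}$$

Hence show that $pV^{\gamma}$ is a constant.

(12.3) Explain why we can write

$$\text{đ}Q = C_p\,dT + A\,dp \tag{12.29}$$

and

$$\text{đ}Q = C_V\,dT + B\,dV, \tag{12.30}$$

where $A$ and $B$ are constants. Subtract these equations and show that

$$(C_p - C_V)\,dT = B\,dV - A\,dp, \tag{12.31}$$

and that at constant temperature

$$\left(\frac{\partial p}{\partial V}\right)_T = \frac{B}{A}. \tag{12.32}$$

In an adiabatic change, show that

$$dp = -(C_p/A)\,dT, \tag{12.33}$$

$$dV = -(C_V/B)\,dT. \tag{12.34}$$

Hence show that in an adiabatic change, we have that

$$\left(\frac{\partial p}{\partial V}\right)_{\text{adiabatic}} = \gamma\left(\frac{\partial p}{\partial V}\right)_T, \tag{12.35}$$

$$\left(\frac{\partial V}{\partial T}\right)_{\text{adiabatic}} = \frac{1}{1-\gamma}\left(\frac{\partial V}{\partial T}\right)_p, \tag{12.36}$$

$$\left(\frac{\partial p}{\partial T}\right)_{\text{adiabatic}} = \frac{\gamma}{\gamma-1}\left(\frac{\partial p}{\partial T}\right)_V. \tag{12.37}$$

(12.4) Using eqn 12.35, relate the gradients of adiabats and isotherms on a p–V diagram.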

(12.5) Two thermally insulated cylinders, A and B, of equal volume, both equipped with pistons, are connected by a valve. Initially A has its piston fully withdrawn and contains a perfect monatomic gas at temperature $T$, while B has its piston fully inserted, and the valve is closed. Calculate the final temperature of the gas after the following operations, which each start with the same initial arrangement. The thermal capacity of the cylinders is to be ignored.
(a) The valve is fully opened and the gas slowly drawn into B by pulling out the piston B; piston A remains stationary.
(b) Piston B is fully withdrawn and the valve is opened slightly; the gas is then driven as far as it will go into B by pushing home piston A at such a rate that the pressure in A remains constant: the cylinders are in thermal contact.

(12.6) In Rüchhardt’s method of measuring $\gamma$, illustrated in Fig. 12.2, a ball of mass $m$ is placed snugly inside a tube (cross-sectional area $A$) connected to a container of gas (volume $V$). The pressure $p$ of the gas inside the container is slightly greater than atmospheric pressure $p_0$ because of the downward force of the ball, so that

$$p = p_0 + \frac{mg}{A}. \tag{12.38}$$

Show that if the ball is given a slight downwards displacement, it will undergo simple harmonic motion with period $\tau$ given by

$$\tau = 2\pi\sqrt{\frac{mV}{\gamma p A^2}}. \tag{12.39}$$

[You may neglect friction. As the oscillations are fairly rapid, the changes in $p$ and $V$ which occur can be treated as occurring adiabatically.] In Rinkel’s 1929 modification of this experiment, the ball is held in position in the neck where the gas pressure $p$ in the container is exactly equal to air pressure, and then let drop; the distance $L$ that it falls before it starts to go up again is measured. Show that this distance is given by

$$mgL = \frac{\gamma p A^2 L^2}{2V}. \tag{12.40}$$

Fig. 12.2 Rüchhardt’s apparatus for measuring $\gamma$. A ball of mass $m$ oscillates up and down inside a tube.
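To give a feel for the sizes involved in Rüchhardt’s method, the following Python sketch evaluates eqn 12.39 and then inverts it to recover $\gamma$ from a period, as an experimenter would. It does not derive the formula, which is the point of the exercise. The ball mass, tube area, container volume and the value $\gamma = 1.4$ are all hypothetical numbers chosen for illustration.

```python
import math

G = 9.81         # assumed gravitational acceleration, m s^-2
P_ATM = 101325.0  # assumed atmospheric pressure, Pa


def ruchhardt_period(m: float, V: float, p: float, A: float, gamma: float) -> float:
    """Oscillation period from eqn 12.39."""
    return 2.0 * math.pi * math.sqrt(m * V / (gamma * p * A ** 2))


def gamma_from_period(m: float, V: float, p: float, A: float, tau: float) -> float:
    """Eqn 12.39 solved for gamma, given a measured period tau."""
    return 4.0 * math.pi ** 2 * m * V / (p * A ** 2 * tau ** 2)


# Hypothetical apparatus: 16 g ball, 2 cm^2 tube, 10 litre container.
m, A, V = 0.016, 2.0e-4, 1.0e-2
p = P_ATM + m * G / A                 # equilibrium gas pressure, eqn 12.38
tau = ruchhardt_period(m, V, p, A, gamma=1.4)
print(f"period = {tau:.3f} s")
print(f"gamma recovered from that period = {gamma_from_period(m, V, p, A, tau):.3f}")
```

A period of order one second is easy to time over many oscillations, which is part of what makes the method practical.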

Part V The second law

In this part we introduce the second law of thermodynamics and follow its consequences. This part is structured as follows:

• In Chapter 13, we consider heat engines, which are cyclic processes that convert heat into work. We state various forms of the second law of thermodynamics and prove their equivalence, in particular showing that no engine can be more efficient than a Carnot engine. We also prove Clausius’ theorem, which applies to any cyclic process.
• In Chapter 14 we show how the results from the preceding chapter lead to the concept of entropy. We derive the important equation $dU = T\,dS - p\,dV$, which combines the first and second laws of thermodynamics. We also introduce the Joule expansion and use it to discuss the statistical interpretation of entropy and Maxwell’s demon.
• There is a very deep connection between entropy and information, and we explore this in Chapter 15, briefly touching upon data compression and quantum information.

13 Heat engines and the second law

13.1 The second law of thermodynamics
13.2 The Carnot engine
13.3 Carnot’s theorem
13.4 Equivalence of Clausius and Kelvin statements
13.5 Examples of heat engines
13.6 Heat engines running backwards
13.7 Clausius’ theorem
Chapter summary
Further reading
Exercises

In this chapter, we introduce the second law of thermodynamics, probably the most important and far-reaching of all concepts in thermal physics. We are going to illustrate it with an application to the theory of ‘heat engines’, which are machines that produce work from a temperature diﬀerence between two reservoirs.1 It was by considering such engines that nineteenth century physicists such as Carnot, Clausius and Kelvin came to develop their diﬀerent statements of the second law of thermodynamics. However, as we will see in subsequent chapters, the second law of thermodynamics has a wider applicability, aﬀecting all types of processes in large systems and bringing insights in information theory and cosmology. In this chapter, we will begin by stating two alternative forms of the second law of thermodynamics and then discuss how these statements impact on the eﬃciency of heat engines.

1 A reservoir in this context is a body which is sufficiently large that we can consider it to have essentially infinite heat capacity. This means that you can keep sucking heat out of it, or dumping heat into it, without its temperature changing. See Section 4.6.

2 The ‘in isolation’ phrase is very important here. In a refrigerator, heat is sucked out of cold food and squirted out of the back into your warm kitchen, so that it flows in the ‘wrong’ direction: from cold to hot. However, this process is not happening in isolation. Work is being done by the refrigerator motor and electrical power is being consumed, adding to your electricity bill.

13.1 The second law of thermodynamics

The second law of thermodynamics can be formulated as a statement about the direction of heat flow that occurs as a system approaches equilibrium (and hence there is a connection with the direction of the ‘arrow of time’). Heat is always observed to flow from a hot body to a cold body, and the reverse process, in isolation,2 never occurs. Therefore, following Clausius, we can state the second law of thermodynamics as follows:

Clausius’ statement of the second law of thermodynamics: ‘No process is possible whose sole result is the transfer of heat from a colder to a hotter body.’

It turns out that an equivalent statement of the second law of thermodynamics can be made, concerning how easy it is to change energy between different forms, in particular between work and heat. It is very easy to convert work into heat. For example, pick up a brick of mass $m$ and carry it up to the top of a building of height $h$ (thus doing work on it equal to $mgh$) and then let it fall back to ground level by dropping it off the top (being careful not to hit passing pedestrians). All the work that you’ve done in carrying the brick to the top of the building will be dissipated in heat (and a small amount of sound energy) as the brick hits the ground. However, conversion of heat into work is much harder, and in fact the complete conversion of heat into work is impossible. This point is expressed in Kelvin’s statement of the second law of thermodynamics:

Kelvin’s statement of the second law of thermodynamics: ‘No process is possible whose sole result is the complete conversion of heat into work.’

These two statements of the second law of thermodynamics do not seem to be obviously connected, but the equivalence of these two statements will be proved in Section 13.4.

13.2 The Carnot engine

Kelvin’s statement of the second law of thermodynamics says that you can’t completely convert heat into work. However, it does not forbid some conversion of heat into work. How good a conversion from heat to work is possible? To answer this question, we have to introduce the concept of an engine. We deﬁne an engine as a system operating a cyclic process that converts heat into work. It has to be cyclic so that it can be continuously operated, producing a steady power.

One such engine is the Carnot engine, which is based upon a process called a Carnot cycle and which is illustrated in Figure 13.1. An equivalent plot which is easier to sketch is shown in Figure 13.2. The Carnot cycle consists of two reversible adiabats and two reversible isotherms for an ideal gas. The engine operates between two heat reservoirs, one at the high temperature of $T_h$ and one at the lower temperature of $T_\ell$. Heat enters and leaves only during the reversible isotherms (because no heat can enter or leave during an adiabat). Heat $Q_h$ enters during the expansion A→B and heat $Q_\ell$ leaves during the compression C→D. Because the process is cyclic, the change of internal energy (a state function) in going round the cycle is zero. Hence the work output by the engine, $W$, is given by

$$W = Q_h - Q_\ell. \tag{13.1}$$

Fig. 13.1 A Carnot cycle consists of two reversible adiabats (BC and DA) and two reversible isotherms (AB and CD). The Carnot cycle is here shown on a p–V plot. It is operated in the direction A→B→C→D→A, i.e. clockwise around the solid curve. Heat $Q_h$ enters in the isotherm A→B and heat $Q_\ell$ leaves in the isotherm C→D.

Fig. 13.2 A Carnot cycle can be drawn on replotted axes where the isotherms are shown as horizontal lines ($T$ is constant for an isotherm) and the adiabats are shown as vertical lines (where the quantity $S$, which must be some function of $pV^{\gamma}$, is constant in an adiabatic expansion; in Chapter 14 we will give a physical interpretation of $S$).

Example 13.1 Find an expression for $Q_h/Q_\ell$ for an ideal gas undergoing a Carnot cycle in terms of the temperatures $T_h$ and $T_\ell$.

Solution: Using the results of Section 12.2, we can write down

$$\begin{align}
\text{A} \to \text{B}:&\quad Q_h = R T_h \ln\frac{V_B}{V_A}, \tag{13.2}\\
\text{B} \to \text{C}:&\quad \frac{T_h}{T_\ell} = \left(\frac{V_C}{V_B}\right)^{\gamma-1}, \tag{13.3}\\
\text{C} \to \text{D}:&\quad Q_\ell = -R T_\ell \ln\frac{V_D}{V_C}, \tag{13.4}\\
\text{D} \to \text{A}:&\quad \frac{T_\ell}{T_h} = \left(\frac{V_A}{V_D}\right)^{\gamma-1}. \tag{13.5}
\end{align}$$

Equations 13.3 and 13.5 lead to

$$\frac{V_B}{V_A} = \frac{V_C}{V_D}, \tag{13.6}$$

and dividing eqn 13.2 by eqn 13.4 and substituting in eqn 13.6 leads to

$$\frac{Q_h}{Q_\ell} = \frac{T_h}{T_\ell}. \tag{13.7}$$

This is a key result.3

3 In fact, when we later prove in Section 13.3 that all reversible engines have this efficiency, one can use eqn 13.7 as a thermodynamic definition of temperature. In this book, we have preferred to define temperature using a statistical argument via eqn 4.7.
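The result of this example can be verified numerically by walking round the four legs of the cycle. The sketch below is only an illustration: the reservoir temperatures, the volumes at A and B, and $\gamma = 5/3$ are assumed values, and the function name is ours. The adiabats fix the volumes at C and D, the isotherms give the two heats, and the ratio of heats is then compared with the ratio of temperatures.

```python
import math

R = 8.314462618  # molar gas constant, J K^-1 mol^-1


def carnot_heats(T_hot: float, T_cold: float, V_A: float, V_B: float,
                 gamma: float):
    """Heats exchanged by 1 mole of ideal gas in one Carnot cycle A->B->C->D->A."""
    # Along an adiabat T V^(gamma-1) is constant, so both adiabats stretch
    # the volume by the same factor (this is the content of eqn 13.6).
    stretch = (T_hot / T_cold) ** (1.0 / (gamma - 1.0))
    V_C = V_B * stretch                           # end of adiabat B -> C
    V_D = V_A * stretch                           # start of adiabat D -> A
    Q_hot = R * T_hot * math.log(V_B / V_A)       # heat in along A -> B
    Q_cold = -R * T_cold * math.log(V_D / V_C)    # heat out along C -> D
    return Q_hot, Q_cold


# Assumed example values.
Q_hot, Q_cold = carnot_heats(T_hot=500.0, T_cold=300.0,
                             V_A=1.0e-3, V_B=3.0e-3, gamma=5.0 / 3.0)
print(f"Q_h / Q_l = {Q_hot / Q_cold:.6f}")
print(f"T_h / T_l = {500.0 / 300.0:.6f}")
print(f"work per cycle W = Q_h - Q_l = {Q_hot - Q_cold:.1f} J")
```

Changing `V_A`, `V_B` or `gamma` changes the individual heats but leaves their ratio fixed at $T_h/T_\ell$.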

The Carnot engine is shown schematically in Fig. 13.3. It is drawn as a machine with heat input $Q_h$ from a reservoir at temperature $T_h$, drawn as a horizontal line, and two outputs, one of work $W$ and the other of heat $Q_\ell$ which passes into the reservoir at temperature $T_\ell$. The concept of efficiency is important to characterize engines. It is the ratio of ‘what you want to achieve’ to ‘what you have to do to achieve it’. For an engine, what you want to achieve is work (to pull a train up a hill for example) and what you have to do to achieve it is to put heat in (by shovelling coal into the furnace), keeping the hot reservoir at $T_h$ and providing heat $Q_h$ for the engine. We therefore define the efficiency $\eta$ of an engine as the ratio of the work out to the heat in. Thus

$$\eta = \frac{W}{Q_h}. \tag{13.8}$$

Note that since the work out cannot be greater than the heat in (i.e. $W < Q_h$) we must have that $\eta < 1$. The efficiency must be below 100%.

Example 13.2 For the Carnot engine, the efficiency can be calculated using eqns 13.1, 13.7 and 13.8 as follows: substituting eqn 13.1 into 13.8 yields

$$\eta_{\text{Carnot}} = \frac{Q_h - Q_\ell}{Q_h}, \tag{13.9}$$

and eqn 13.7 then implies that

$$\eta_{\text{Carnot}} = \frac{T_h - T_\ell}{T_h} = 1 - \frac{T_\ell}{T_h}. \tag{13.10}$$

How does this eﬃciency compare to that of a real engine? It turns out that real engines are much less eﬃcient than Carnot engines.

Example 13.3 A power station steam turbine operates between $T_h \sim 800$ K and $T_\ell = 300$ K. If it were a Carnot engine, it could achieve an efficiency of $\eta_{\text{Carnot}} = (T_h - T_\ell)/T_h \approx 60\%$, but in fact real power stations do not achieve the maximum efficiency and figures closer to 40% are typical.
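The arithmetic of this example is easily reproduced. In the minimal Python sketch below, the function name is ours and the 40% figure is simply the ‘typical’ value quoted in the example, not a measurement.

```python
def carnot_efficiency(T_hot: float, T_cold: float) -> float:
    """Carnot efficiency 1 - T_cold / T_hot, with temperatures in kelvin."""
    if not T_hot > T_cold > 0.0:
        raise ValueError("require T_hot > T_cold > 0 (kelvin)")
    return 1.0 - T_cold / T_hot


eta_max = carnot_efficiency(800.0, 300.0)  # reservoir temperatures from the example
eta_typical = 0.40                         # typical real figure quoted in the example
print(f"Carnot limit          : {eta_max:.1%}")
print(f"fraction of the limit : {eta_typical / eta_max:.0%}")
```

Taking $T_h$ to be exactly 800 K gives 62.5%, consistent with the rounded 60% quoted for $T_h \sim 800$ K.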

Fig. 13.3 A Carnot engine shown schematically. In diagrams such as this one, the arrows are labelled with the heat/work ﬂowing in one cycle of the engine.


13.3 Carnot’s theorem

The Carnot engine is in fact the most eﬃcient engine possible! This is stated in Carnot’s theorem, as follows: Carnot’s theorem: Of all the heat engines working between two given temperatures, none is more eﬃcient than a Carnot engine.

4 This means that Carnot’s theorem is, in itself, a statement of the second law of thermodynamics.

Remarkably, one can prove Carnot’s theorem on the basis of Clausius’ statement of the second law of thermodynamics.4 The proof follows a reductio ad absurdum argument.

Proof: Imagine that E is an engine which is more efficient than a Carnot engine (i.e. $\eta_E > \eta_{\text{Carnot}}$). The Carnot engine is reversible so one can run it in reverse. Engine E, and a Carnot engine run in reverse, are connected together as shown in Fig. 13.4, so that the work $W$ produced by E drives the Carnot engine. Primed quantities ($Q_h'$, $Q_\ell'$) refer to engine E and unprimed quantities ($Q_h$, $Q_\ell$) to the Carnot engine. Now since $\eta_E > \eta_{\text{Carnot}}$, we have that

$$\frac{W}{Q_h'} > \frac{W}{Q_h}, \tag{13.11}$$

and so

$$Q_h > Q_h'. \tag{13.12}$$

The first law of thermodynamics implies that

$$W = Q_h' - Q_\ell' = Q_h - Q_\ell, \tag{13.13}$$

so that

$$Q_h - Q_h' = Q_\ell - Q_\ell'. \tag{13.14}$$

Now $Q_h - Q_h'$ is positive because of eqn 13.12, and therefore so is $Q_\ell - Q_\ell'$. The expression $Q_h - Q_h'$ is the net amount of heat dumped into the reservoir at temperature $T_h$. The expression $Q_\ell - Q_\ell'$ is the net amount of heat extracted from the reservoir at temperature $T_\ell$. Because both these expressions are positive, the combined system shown in Fig. 13.4 simply extracts heat from the reservoir at $T_\ell$ and dumps it into the reservoir at $T_h$. This violates Clausius’ statement of the second law of thermodynamics, and therefore engine E cannot exist.

Fig. 13.4 A hypothetical engine E, which is more efficient than a Carnot engine, is connected to a Carnot engine.

Corollary: All reversible engines have the same efficiency $\eta_{\text{Carnot}}$.

Proof: Imagine another reversible engine R. Its efficiency $\eta_R \le \eta_{\text{Carnot}}$ by Carnot’s theorem. We run it in reverse and connect it to a Carnot engine going forwards, as shown in Figure 13.5. This arrangement will simply transfer heat from the cold reservoir to the hot reservoir and violates Clausius’ statement of the second law of thermodynamics unless $\eta_R = \eta_{\text{Carnot}}$. Therefore all reversible engines have the same efficiency

$$\eta_{\text{Carnot}} = \frac{T_h - T_\ell}{T_h}. \tag{13.15}$$

Fig. 13.5 A hypothetical reversible engine R is connected to a Carnot engine.
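The bookkeeping in the proof of Carnot’s theorem can be made concrete with numbers. The sketch below is purely illustrative (the temperatures, the work per cycle and the trial efficiencies are assumptions, and the function is ours): engine E runs forwards, the Carnot engine runs in reverse on the same work, and the code reports the net heat delivered to the hot reservoir and the net heat drawn from the cold one.

```python
def combined_heat_flows(T_hot: float, T_cold: float, work: float, eta_E: float):
    """Net heat flows per cycle for engine E driving a reversed Carnot engine.

    Returns (net heat dumped into the hot reservoir,
             net heat extracted from the cold reservoir).
    """
    eta_carnot = 1.0 - T_cold / T_hot
    # Engine E running forwards: takes heat from hot, rejects the rest to cold.
    Qh_E = work / eta_E
    Ql_E = Qh_E - work
    # Carnot engine in reverse: absorbs `work`, takes heat from cold,
    # delivers heat to hot.
    Qh_C = work / eta_carnot
    Ql_C = Qh_C - work
    return Qh_C - Qh_E, Ql_C - Ql_E


# Assumed reservoirs at 800 K and 300 K (Carnot efficiency 0.625), W = 100 J.
for eta_E in (0.50, 0.625, 0.70):
    into_hot, from_cold = combined_heat_flows(800.0, 300.0, 100.0, eta_E)
    print(f"eta_E = {eta_E:.3f}: heat into hot = {into_hot:+.2f} J, "
          f"heat out of cold = {from_cold:+.2f} J")
```

The two net flows are always equal, as the first law requires. Only the hypothetical case with an efficiency above the Carnot value gives a positive transfer from cold to hot with no other effect, which is exactly what Clausius’ statement forbids.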

13.4 Equivalence of Clausius and Kelvin statements

We first prove the proposition that if a system violates Kelvin’s statement of the second law of thermodynamics, it violates Clausius’ statement of the second law of thermodynamics.

Proof: If a system violates Kelvin’s statement of the second law of thermodynamics, one could connect it to a Carnot engine as shown in Figure 13.6. Here $Q_h'$ is the heat which the Kelvin violator takes from the reservoir at $T_h$ and converts entirely into work $W$, and this work drives the Carnot engine. The first law implies that

$$Q_h' = W \tag{13.16}$$

and that

$$Q_h = W + Q_\ell. \tag{13.17}$$

The heat dumped in the reservoir at temperature $T_h$ is

$$Q_h - Q_h' = Q_\ell. \tag{13.18}$$

This is also equal to the heat extracted from the reservoir at temperature $T_\ell$. The combined process therefore has the net result of transferring heat $Q_\ell$ from the reservoir at $T_\ell$ to the reservoir at $T_h$ as its sole effect and thus violates Clausius’ statement of the second law of thermodynamics. Therefore the Kelvin violator does not exist.

Fig. 13.6 A Kelvin violator is connected to a Carnot engine.

We now prove the opposite proposition, that if a system violates Clausius’ statement of the second law of thermodynamics, it violates Kelvin’s statement of the second law of thermodynamics.

Proof: If a system violates Clausius’ statement of the second law of thermodynamics, one could connect it to a Carnot engine as shown in Figure 13.7. The first law implies that

$$Q_h - Q_\ell = W. \tag{13.19}$$

The sole effect of this process is thus to convert heat $Q_h - Q_\ell$ into work and thus violates Kelvin’s statement. We have thus shown the equivalence of Clausius’ and Kelvin’s statements of the second law of thermodynamics.

13.5 Examples of heat engines

One of the ﬁrst engines to be constructed was made in the ﬁrst century by Hero of Alexandria, and is sketched in Fig. 13.8(a). It consists of an airtight sphere with a pair of bent pipes projecting from it. Steam is fed via another pair of pipes and once expelled through the bent pipes causes rotational motion. Though Hero’s engine convincingly converts heat into work, and thus qualiﬁes as a bona ﬁde heat engine, it was little more than an entertaining toy. More practical was the engine sketched in Fig. 13.8(b) which was designed by Thomas Newcomen (1664–1729).

Fig. 13.7 A Clausius violator is connected to a Carnot engine.

This was one of the first practical steam engines and was used for pumping water out of mines. Steam is used to push the piston upwards. Then, cold water is injected from the tank and condenses the steam, reducing the pressure in the piston. Atmospheric pressure then pushes the piston down and raises the beam on the other side of the fulcrum. The problem with Newcomen’s engine was that one had then to heat up the steam chamber again before steam could be readmitted and so it was extremely inefficient. James Watt (1736–1819) famously improved the design so that condensation took place in a separate chamber which was connected to the steam cylinder by a pipe. This work laid the foundation of the industrial revolution.

Fig. 13.8 Sketches of (a) Hero’s engine, (b) Newcomen’s engine and (c) Stirling’s engine.

Another design of an engine is Stirling’s engine, the brainchild of the Rev. Robert Stirling (1790–1878), which is sketched in Fig. 13.8(c). It works purely by the repeated heating and cooling of a sealed amount of gas. In the particular engine shown in Fig. 13.8(c), the crankshaft is driven by the two pistons in an oscillatory fashion, but the 90° bend ensures that the two pistons move out of phase. The motion is driven by a temperature differential between the top and bottom surfaces of the engine. The design is very simple and contains no valves and operates at relatively low pressures. However, such an engine literally has to ‘warm up’ to establish the temperature differential and so it is harder to regulate power output.

One of the most popular engines is the internal combustion engine used in most automobile applications. Rather than externally heating water to produce steam (as with Newcomen’s and Watt’s engines) or to produce a temperature differential (as with Stirling’s engine), here the burning of fuel inside the engine’s combustion chamber generates the high temperature and pressure necessary to produce useful work. Different fuels can be used to drive these engines, including diesel, gasoline, natural gas and even biofuels such as ethanol. These engines all produce carbon dioxide, and this has important consequences for Earth’s atmosphere, as we shall discuss in Chapter 37. There are many different types of internal combustion engines, including piston engines (in which pressure is converted into rotating motion using a set of pistons), combustion turbines (in which gas flow is used to spin a turbine’s blades) and jet engines (in which a fast moving jet of gas is used to generate thrust).5

5 In Exercise 13.5 we consider the Otto cycle, which models the spark-ignition (petrol) engine, a type of internal combustion engine.

13.6 Heat engines running backwards

In this section we discuss two applications of heat engines in which the engine is run in reverse, putting in work in order to move heat around.

Example 13.4

(a) The refrigerator: The refrigerator is a heat engine which is run backwards so that you put work in and cause a heat flow from a cold reservoir to a hot reservoir (see Figure 13.9). In this case, the cold reservoir is the food inside the refrigerator which you wish to keep cold and the hot reservoir is usually your kitchen. For a refrigerator, we must define the efficiency in a different way from the efficiency of a heat engine. This is because what you want to achieve is ‘heat sucked out of the contents of the refrigerator’ and what you have to do to achieve it is ‘electrical work’ from the mains electricity supply. Thus we define the efficiency of a refrigerator as

$$\eta = \frac{Q_\ell}{W}. \qquad (13.20)$$

For a refrigerator fitted with a Carnot engine, it is then easy to show that

$$\eta_{\mathrm{Carnot}} = \frac{T_\ell}{T_h - T_\ell}, \qquad (13.21)$$

which can yield an efficiency above 100%.

(b) The heat pump: A heat pump is essentially a refrigerator (Figure 13.9 applies also for a heat pump), but it is utilized in a different way. It is used to pump heat from a reservoir to a place where it is desired to add heat. For example, the reservoir could be the soil/rock several metres underground and heat could be pumped out of the reservoir into a house which needs heating. In one cycle of the engine, we want to add heat $Q_h$ to the house, and now $W$ is the work we must apply (in the form of electrical work) to accomplish this. The efficiency of a heat pump is therefore defined as

$$\eta = \frac{Q_h}{W}. \qquad (13.22)$$

Fig. 13.9 A refrigerator or a heat pump. Both devices are heat engines run in reverse (i.e. reversing the arrows on the cycle shown in Fig. 13.3).
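As a rough numerical illustration of the two definitions in eqns 13.20 and 13.22, the Python sketch below evaluates their Carnot limits. The function names and the example temperatures are illustrative assumptions, not values from the text. The heat-pump limit uses $Q_h = Q_\ell + W$ together with the Carnot relation $Q_h/Q_\ell = T_h/T_\ell$ of eqn 13.7.

```python
def carnot_refrigerator_efficiency(t_cold, t_hot):
    """Carnot limit of eta = Q_l / W for a refrigerator (temperatures in kelvin)."""
    if not 0 < t_cold < t_hot:
        raise ValueError("require 0 < t_cold < t_hot")
    return t_cold / (t_hot - t_cold)


def carnot_heat_pump_efficiency(t_cold, t_hot):
    """Carnot limit of eta = Q_h / W for a heat pump (temperatures in kelvin)."""
    if not 0 < t_cold < t_hot:
        raise ValueError("require 0 < t_cold < t_hot")
    return t_hot / (t_hot - t_cold)


# Assumed example: food compartment at 277 K, kitchen at 293 K.
fridge = carnot_refrigerator_efficiency(277.0, 293.0)
# Assumed example: ground at 283 K, house at 293 K.
pump = carnot_heat_pump_efficiency(283.0, 293.0)
print(f"refrigerator: {fridge:.1f}, heat pump: {pump:.1f}")
```

Because $Q_h = Q_\ell + W$, the two Carnot limits differ by exactly 1 when evaluated for the same pair of temperatures, and both grow without bound as the two temperatures approach each other.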


6 However, the capital cost means that heat pumps have not become popular until recently.

Note that $Q_h > W$ and so $\eta > 1$. The efficiency is always above 100%! (See Exercise 13.1.) This shows why heat pumps are attractive⁶ for heating. It is always possible to turn work into heat with 100% efficiency (an electric fire turns electrical work into heat in this way), but a heat pump can allow you to get even more heat into your house for the same electrical work (and hence for the same electricity bill!). For a heat pump fitted with a Carnot engine, it is easy to show that

$$\eta_{\mathrm{Carnot}} = \frac{T_h}{T_h - T_\ell}. \qquad (13.23)$$

13.7 Clausius’ theorem

Consider a Carnot cycle. In one cycle, heat $Q_h$ enters and heat $Q_\ell$ leaves. Heat is therefore not a conserved quantity of the cycle. However, we found in eqn 13.7 that for a Carnot cycle

$$\frac{Q_h}{Q_\ell} = \frac{T_h}{T_\ell}, \qquad (13.24)$$

and so if we define⁷ $\Delta Q_{\mathrm{rev}}$ as the heat entering the system at each point, we have that

$$\sum_{\mathrm{cycle}} \frac{\Delta Q_{\mathrm{rev}}}{T} = \frac{Q_h}{T_h} + \frac{(-Q_\ell)}{T_\ell} = 0, \qquad (13.25)$$

and so $\Delta Q_{\mathrm{rev}}/T$ is a quantity which sums to zero around the cycle. Replacing the sum by an integral, we could write

$$\oint \frac{đQ_{\mathrm{rev}}}{T} = 0 \qquad (13.26)$$

for this Carnot cycle.

7 The subscript ‘rev’ on $\Delta Q_{\mathrm{rev}}$ is there to remind us that we are dealing with a reversible engine.

Our argument so far has been in terms of a Carnot cycle which operates between two distinct heat reservoirs. Real engine cycles can be much more complicated than this in that their ‘working substance’ changes temperature in a much more complicated way and, moreover, real engines do not behave perfectly reversibly.⁸ Therefore we would like to generalize our treatment so that it can be applied to a general cycle operating between a whole series of reservoirs and we would like the cycle to be either reversible or irreversible. Our general cycle is illustrated in Fig. 13.10(a). For this cycle, heat $đQ_i$ enters at a particular part of the cycle. At this point the system is connected to a reservoir which is at temperature $T_i$. The total work extracted from the cycle is $\Delta W$, given by

$$\Delta W = \sum_{\mathrm{cycle}} đQ_i, \qquad (13.27)$$

from the first law of thermodynamics. The sum here is taken around the whole cycle, indicated schematically by the dotted circle in Fig. 13.10(a).

8 You need to get the energy out of a real engine quickly, so you do not have time to do everything quasistatically!


Next we imagine that the heat at each point is supplied via a Carnot engine which is connected between a reservoir at temperature $T$ and the reservoir at temperature $T_i$ (see Fig. 13.10(b)). The reservoir at $T$ is common for all the Carnot engines connected at all points of the cycle. Each Carnot engine produces work $đW_i$, and for a Carnot engine we know that

$$\frac{\text{heat to reservoir at } T_i}{T_i} = \frac{\text{heat from reservoir at } T}{T}, \qquad (13.28)$$

and hence

$$\frac{đQ_i}{T_i} = \frac{đQ_i + đW_i}{T}. \qquad (13.29)$$

Rearranging, we have that

$$đW_i = đQ_i \left( \frac{T}{T_i} - 1 \right). \qquad (13.30)$$

The thermodynamic system in Fig. 13.10(b) looks at first sight to do nothing other than convert heat to work, which is not allowed according to Kelvin’s statement of the second law of thermodynamics, and hence we must insist that this is not the case. Hence

$$\text{total work produced per cycle} = \Delta W + \sum_{\mathrm{cycle}} đW_i \le 0. \qquad (13.31)$$

Fig. 13.10 (a) A general cycle in which heat $đQ_i$ enters in part of the cycle from a reservoir at temperature $T_i$. Work $\Delta W$ is extracted from each cycle. (b) The same cycle, but showing the heat $đQ_i$ entering the reservoir at $T_i$ from a reservoir at temperature $T$ via a Carnot engine (labelled $C_i$).


Using eqns 13.27, 13.30 and 13.31, we therefore have that

$$T \sum_{\mathrm{cycle}} \frac{đQ_i}{T_i} \le 0. \qquad (13.32)$$

Since $T > 0$, we have that

$$\sum_{\mathrm{cycle}} \frac{đQ_i}{T_i} \le 0, \qquad (13.33)$$

and replacing the sum by an integral, we can write this as

$$\oint \frac{đQ}{T} \le 0, \qquad (13.34)$$

which is known as the Clausius inequality, embodied in the expression of Clausius’ theorem:

Clausius’ theorem: For any closed cycle, $\oint đQ/T \le 0$, where equality necessarily holds for a reversible cycle.
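The inequality can be checked numerically in the simplest case of an engine working between just two reservoirs. The following Python sketch is only an illustration under stated assumptions: the function name, the reservoir temperatures and the heat input are invented for the example, and the irreversible engine is modelled simply as one whose efficiency is below the Carnot value.

```python
def clausius_sum(q_hot, t_hot, t_cold, efficiency):
    """Sum of (heat entering the working substance) / T over one two-reservoir cycle."""
    work = efficiency * q_hot      # work extracted per cycle
    q_cold = q_hot - work          # heat rejected to the cold reservoir
    return q_hot / t_hot - q_cold / t_cold


T_HOT, T_COLD, Q_HOT = 500.0, 300.0, 1000.0   # assumed values (K, K, J)
eta_carnot = 1.0 - T_COLD / T_HOT

reversible = clausius_sum(Q_HOT, T_HOT, T_COLD, eta_carnot)
irreversible = clausius_sum(Q_HOT, T_HOT, T_COLD, 0.5 * eta_carnot)
print(reversible, irreversible)
```

With these assumed numbers the reversible engine gives a sum of zero (up to rounding), while the less efficient engine rejects more heat to the cold reservoir and gives a negative sum, in line with eqn 13.34.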

Chapter summary

• No process is possible whose sole result is the transfer of heat from a colder to a hotter body. (Clausius’ statement of the second law of thermodynamics)
• No process is possible whose sole result is the complete conversion of heat into work. (Kelvin’s statement of the second law of thermodynamics)
• Of all the heat engines working between two given temperatures, none is more efficient than a Carnot engine. (Carnot’s theorem)
• All the above are equivalent statements of the second law of thermodynamics.
• All reversible engines operating between temperatures $T_h$ and $T_\ell$ have the efficiency of a Carnot engine: $\eta_{\mathrm{Carnot}} = (T_h - T_\ell)/T_h$.
• For a Carnot engine: $Q_h/Q_\ell = T_h/T_\ell$.
• Clausius’ theorem states that for any closed cycle, $\oint đQ/T \le 0$, where equality necessarily holds for a reversible cycle.

Further reading

An entertaining account of how steam engines really work may be found in Semmens and Goldfinch (2000). A short account of Watt’s development of his engine is Marsden (2002).

Exercises

(13.1) A heat pump has an efficiency greater than 100%. Does this violate the laws of thermodynamics?

(13.2) What is the maximum possible efficiency of an engine operating between two thermal reservoirs, one at 100°C and the other at 0°C?

(13.3) The history of science is littered with various schemes for producing perpetual motion. A machine which does this is sometimes referred to as a perpetuum mobile, which is the Latin term for a perpetual motion machine.
• A perpetual motion machine of the first kind produces more energy than it uses.
• A perpetual motion machine of the second kind produces exactly the same amount of energy as it uses, but it continues running forever indefinitely by converting all its waste heat back into mechanical work.
Give a critique of these two types of machine and state which laws of thermodynamics they each break, if any.

(13.4) A possible ideal-gas cycle operates as follows: (i) from an initial state $(p_1, V_1)$ the gas is cooled at constant pressure to $(p_1, V_2)$; (ii) the gas is heated at constant volume to $(p_2, V_2)$; (iii) the gas expands adiabatically back to $(p_1, V_1)$. Assuming constant heat capacities, show that the thermal efficiency is

$$1 - \gamma \, \frac{(V_1/V_2) - 1}{(p_2/p_1) - 1}. \qquad (13.35)$$

(You may quote the fact that in an adiabatic change of an ideal gas, $pV^\gamma$ stays constant, where $\gamma = c_p/c_V$.)

(13.5) Show that the efficiency of the standard Otto cycle (shown in Fig. 13.11) is $1 - r^{1-\gamma}$, where $r = V_1/V_2$ is the compression ratio. The Otto cycle is the four-stroke cycle in internal combustion engines in cars, lorries and electrical generators.

Fig. 13.11 The Otto cycle.

(13.6) An ideal air conditioner operating on a Carnot cycle absorbs heat $Q_2$ from a house at temperature $T_2$ and discharges $Q_1$ to the outside at temperature $T_1$, consuming electrical energy $E$. Heat leakage into the house follows Newton’s law,

$$Q = A[T_1 - T_2], \qquad (13.36)$$

where $A$ is a constant. Derive an expression for $T_2$ in terms of $T_1$, $E$ and $A$ for continuous operation when the steady state has been reached. The air conditioner is controlled by a thermostat. The system is designed so that with the thermostat set at 20°C and outside temperature 30°C the system operates at 30% of the maximum electrical energy input. Find the highest outside temperature for which the house may be maintained inside at 20°C.

(13.7) Two identical bodies of constant heat capacity $C_p$ at temperatures $T_1$ and $T_2$ respectively are used as reservoirs for a heat engine. If the bodies remain at constant pressure, show that the amount of work obtainable is

$$W = C_p (T_1 + T_2 - 2T_f), \qquad (13.37)$$

where $T_f$ is the final temperature attained by both bodies. Show that if the most efficient engine is used, then $T_f^2 = T_1 T_2$.

(13.8) A building is maintained at a temperature $T$ by means of an ideal heat pump which uses a river at temperature $T_0$ as a source of heat. The heat pump consumes power $W$, and the building loses heat to its surroundings at a rate $\alpha(T - T_0)$, where $\alpha$ is a positive constant. Show that $T$ is given by

$$T = T_0 + \frac{W}{2\alpha} \left( 1 + \sqrt{1 + 4\alpha T_0/W} \right). \qquad (13.38)$$

(13.9) Three identical bodies of constant thermal capacity are at temperatures 300 K, 300 K and 100 K. If no work or heat is supplied from outside, what is the highest temperature to which any one of these bodies can be raised by the operation of heat engines? If you set this problem up correctly you may have to solve a cubic equation. This looks hard to solve but in fact you can deduce one of the roots [hint: what is the highest temperature of the bodies if you do nothing to connect them?].

Biography

Sadi Carnot (1796–1832)

Sadi Carnot’s father, Lazare Carnot (1753–1823), was an engineer and mathematician who founded the École Polytechnique in Paris, was briefly Napoleon Bonaparte’s minister of war and served as his military governor of Antwerp. After Napoleon’s defeat, Lazare Carnot was forced into exile. He fled to Warsaw in 1815 and then moved to Magdeburg in Germany in 1816. It was there in 1818 that he saw a steam engine, and both he and his son Sadi Carnot, who visited him there in 1821, became hooked on the problem of understanding how it worked.

Fig. 13.12 Sadi Carnot

Sadi Carnot had been educated as a child by his father. In 1812 he entered the École Polytechnique and studied with Poisson and Ampère. He then moved to Metz and studied military engineering, worked for a while as a military engineer, and then moved back to Paris in 1819. There he became interested in a variety of industrial problems as well as the theory of gases. He had now become skilled in tackling various problems, but it was his visit to Magdeburg that proved crucial in bringing him the problem that was to be his life’s most important work. In this, his father’s influence was a significant factor in the solution to the problem.

Lazare Carnot had been obsessed by the operation of machines all his life and had been particularly interested in thinking about the operation of water-wheels. In a water-wheel, falling water can be made to produce useful mechanical work. The water falls from a reservoir of high potential energy to a reservoir of low potential energy, and on the way down, the water turns a wheel which then drives some useful machine such as a flour mill. Lazare Carnot had thought a great deal about how you could make such systems as efficient as possible and convert as much of the potential energy of the water as possible into useful work.

Sadi Carnot was struck by the analogy between such a water-wheel and a steam engine, in which heat (rather than water) flows from a reservoir at high temperature to a reservoir at low temperature. Carnot’s genius was that rather than focus on the details of the steam engine he decided to consider an engine in abstracted form, focussing purely on the flow of heat between two thermal reservoirs. He idealized the workings of an engine as consisting of simple gas cycles (in what we now know as a Carnot cycle) and worked out its efficiency. He realised that to be as efficient as possible, the engine had to pass slowly through a series of equilibrium states and that it therefore had to be reversible. At any stage, you could reverse its operation and send it the other way around the cycle. He was then able to use this fact to prove that all reversible heat engines operating between two temperatures had the same efficiency. This work was summarized in his paper on the subject, Réflexions sur la puissance motrice du feu et sur les machines propres à développer cette puissance (Reflections on the motive power of fire and machines fitted to develop that power) which was published in 1824.

Carnot’s paper was favourably reviewed, but had little immediate impact. Few could see the relevance of his work, or at least see past the abstract argument and the unfamiliar notions of idealized engine cycles; his introduction, in which he praised the technical superiority of English engine designers, may not have helped win his French audience. Carnot died in 1832 during a cholera epidemic, and most of his papers were destroyed (the standard precaution following a cholera fatality). The French physicist Émile Clapeyron later noticed his work and published his own paper on it in 1834.
However, it was yet another decade before the work simultaneously came to the notice of a young German student, Rudolf Clausius, and a recent graduate of Cambridge University, William Thomson (later Lord Kelvin), who would each individually make much of Carnot’s ideas. In particular, Clausius patched up and modernized Carnot’s arguments (which had assumed the validity of the prevailing, but subsequently discredited, caloric theory of heat) and was motivated by Carnot’s ideas to introduce the concept of entropy.

14 Entropy

14.1 Definition of entropy
14.2 Irreversible change
14.3 The first law revisited
14.4 The Joule expansion
14.5 The statistical basis for entropy
14.6 The entropy of mixing
14.7 Maxwell’s demon
14.8 Entropy and probability
Chapter summary
Exercises

In this chapter we will use the results from Chapter 13 to define a quantity called entropy and to understand how entropy changes in reversible and irreversible processes. We will also consider the statistical basis for entropy, and use this to understand the entropy of mixing, the apparent conundrum of Maxwell’s demon and the connection between entropy and probability.

14.1 Definition of entropy

In this section, we introduce a thermodynamic definition of entropy. We begin by recalling from eqn 13.26 that $\oint đQ_{\mathrm{rev}}/T = 0$. This means that the integral

$$\int_A^B \frac{đQ_{\mathrm{rev}}}{T}$$

is path independent (see Appendix C.7). Therefore the quantity $đQ_{\mathrm{rev}}/T$ is an exact differential and we can write down a new state function which we call entropy. We therefore define the entropy $S$ by

$$dS = \frac{đQ_{\mathrm{rev}}}{T}, \qquad (14.1)$$

so that

$$S(B) - S(A) = \int_A^B \frac{đQ_{\mathrm{rev}}}{T}, \qquad (14.2)$$

and $S$ is a function of state. For an adiabatic process (a reversible adiathermal process) we have that

$$đQ_{\mathrm{rev}} = 0. \qquad (14.3)$$

Hence an adiabatic process involves no change in entropy (the process is also called isentropic).
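The path independence behind eqn 14.2 can be illustrated numerically. The Python sketch below is a minimal example under assumptions that are not part of the text: it takes one mole of a monatomic ideal gas (so that $C_V = \tfrac{3}{2}R$ and $đQ_{\mathrm{rev}} = C_V\,dT + p\,dV$ with $p = RT/V$), picks two arbitrary end points A and B, and integrates $đQ_{\mathrm{rev}}/T$ along two different reversible routes. All names and numbers are invented for the illustration.

```python
import math

R = 8.314          # molar gas constant, J K^-1 mol^-1
C_V = 1.5 * R      # assumption: one mole of monatomic ideal gas


def entropy_change(path, steps=20000):
    """Midpoint-rule estimate of the integral of dQ_rev / T along path(s), 0 <= s <= 1.

    path(s) returns (T, V); dQ_rev = C_V dT + p dV with p = R T / V.
    """
    total = 0.0
    t_prev, v_prev = path(0.0)
    for k in range(1, steps + 1):
        t_next, v_next = path(k / steps)
        t_mid = 0.5 * (t_prev + t_next)
        v_mid = 0.5 * (v_prev + v_next)
        dq_rev = C_V * (t_next - t_prev) + (R * t_mid / v_mid) * (v_next - v_prev)
        total += dq_rev / t_mid
        t_prev, v_prev = t_next, v_next
    return total


# Assumed end points A and B, given as (T in K, V in m^3).
T_A, V_A = 300.0, 0.010
T_B, V_B = 450.0, 0.030


def straight_line(s):
    """Straight line from A to B in the (T, V) plane."""
    return T_A + s * (T_B - T_A), V_A + s * (V_B - V_A)


def isotherm_then_isochore(s):
    """Isothermal expansion at T_A, then heating at constant volume V_B."""
    if s <= 0.5:
        return T_A, V_A + 2.0 * s * (V_B - V_A)
    return T_A + (2.0 * s - 1.0) * (T_B - T_A), V_B


exact = C_V * math.log(T_B / T_A) + R * math.log(V_B / V_A)
print(entropy_change(straight_line), entropy_change(isotherm_then_isochore), exact)
```

Both routes give the same value to within the accuracy of the numerical integration, and that value agrees with the closed form $C_V \ln(T_B/T_A) + R \ln(V_B/V_A)$ for this model gas, as expected if $S$ is a function of state.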

14.2 Irreversible change

Entropy $S$ is defined in terms of reversible changes of heat. Since $S$ is a state function, the integral of $dS$ around a closed loop is zero, so that

$$\oint \frac{đQ_{\mathrm{rev}}}{T} = 0. \qquad (14.4)$$

Let us now consider a loop which contains an irreversible section (A→B) and a reversible section (B→A), as shown in Fig. 14.1. The Clausius inequality (eqn 13.34) implies that, integrating around this loop, we have that

$$\oint \frac{đQ}{T} \le 0. \qquad (14.5)$$

Writing out the left-hand side in detail, we have that

$$\int_A^B \frac{đQ}{T} + \int_B^A \frac{đQ_{\mathrm{rev}}}{T} \le 0, \qquad (14.6)$$

and hence rearranging gives

$$\int_A^B \frac{đQ}{T} \le \int_A^B \frac{đQ_{\mathrm{rev}}}{T}. \qquad (14.7)$$

Fig. 14.1 An irreversible and a reversible change between two points A and B in p–V parameter space.

This is true however close A and B get to each other, so in general we can write that the change in entropy $dS$ is given by

$$dS = \frac{đQ_{\mathrm{rev}}}{T} \ge \frac{đQ}{T}. \qquad (14.8)$$

The equality in this expression is only obtained (somewhat trivially) if the process on the right-hand side is actually reversible. Note that because $S$ is a state function, the entropy change in going from A to B is independent of the route. Consider a thermally isolated system. In such a system $đQ = 0$ for any process, so that the above inequality becomes

$$dS \ge 0. \qquad (14.9)$$

This is a very important equation and is, in fact, another statement of the second law of thermodynamics. It shows that any change for this thermally isolated system always results in the entropy either staying the same (for a reversible change)¹ or increasing (for an irreversible change). This gives us yet another statement of the second law, namely that: ‘the entropy of an isolated system tends to a maximum.’

1 For a reversible process in a thermally isolated system, $T\,dS \equiv đQ_{\mathrm{rev}} = 0$ because no heat can flow in or out.

We can tentatively apply these ideas to the Universe as a whole, under the assumption that the Universe itself is a thermally isolated system:

Application to the Universe: Assuming that the Universe can be treated as an isolated system, the first two laws of thermodynamics become:
(1) $U_{\mathrm{Universe}}$ = constant.
(2) $S_{\mathrm{Universe}}$ can only increase.

The following example illustrates how the entropy of a particular system and a reservoir, as well as the Universe (taken to be the system plus reservoir), changes in an irreversible process.


Example 14.1

A large reservoir at temperature $T_R$ is placed in thermal contact with a small system at temperature $T_S$. They both end up at the temperature of the reservoir, $T_R$. The heat transferred from the reservoir to the system is $\Delta Q = C(T_R - T_S)$, where $C$ is the heat capacity of the system.
• If $T_R > T_S$, heat is transferred from reservoir to system, the system warms and its entropy increases; the entropy of the reservoir decreases, because heat flows out of it.
• If $T_R < T_S$, heat is transferred from system to reservoir, the system cools and its entropy decreases; the entropy of the reservoir increases, because heat flows into it.

Let us calculate these entropy changes in detail. The entropy change in the reservoir, which has constant temperature $T_R$, is found by integrating the heat $đQ$ entering the reservoir, which totals $-\Delta Q$:

$$\Delta S_{\mathrm{reservoir}} = \int \frac{đQ}{T_R} = \frac{1}{T_R} \int đQ = -\frac{\Delta Q}{T_R} = \frac{C(T_S - T_R)}{T_R}, \qquad (14.10)$$

while the entropy change in the system is

$$\Delta S_{\mathrm{system}} = \int \frac{đQ}{T} = \int_{T_S}^{T_R} \frac{C\,dT}{T} = C \ln \frac{T_R}{T_S}. \qquad (14.11)$$

Hence, the total entropy change in the Universe is

$$\Delta S_{\mathrm{Universe}} = \Delta S_{\mathrm{system}} + \Delta S_{\mathrm{reservoir}} = C \left[ \ln \frac{T_R}{T_S} + \frac{T_S}{T_R} - 1 \right]. \qquad (14.12)$$

These expressions are plotted in Fig. 14.2 and demonstrate that even though $\Delta S_{\mathrm{reservoir}}$ and $\Delta S_{\mathrm{system}}$ can each be positive or negative, we always have that

$$\Delta S_{\mathrm{Universe}} \ge 0. \qquad (14.13)$$

Fig. 14.2 The entropy change in the simple process in which a small system is placed in contact with a large reservoir.
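The behaviour described by eqns 14.10 to 14.13 is easy to tabulate. The Python sketch below is only an illustration: the function name, the reservoir temperature and the list of system temperatures are assumed for the example, and the heat capacity is set to 1 so that the results are in units of $C$.

```python
import math


def entropy_changes(t_system, t_reservoir, heat_capacity=1.0):
    """Return (dS_system, dS_reservoir, dS_universe) following eqns 14.10-14.12."""
    ds_system = heat_capacity * math.log(t_reservoir / t_system)
    ds_reservoir = heat_capacity * (t_system - t_reservoir) / t_reservoir
    return ds_system, ds_reservoir, ds_system + ds_reservoir


T_R = 300.0  # assumed reservoir temperature in kelvin
for t_s in (150.0, 250.0, 300.0, 350.0, 600.0):
    ds_sys, ds_res, ds_uni = entropy_changes(t_s, T_R)
    print(f"T_S = {t_s:5.0f} K: system {ds_sys:+.4f}, "
          f"reservoir {ds_res:+.4f}, Universe {ds_uni:+.4f}")
```

The system and reservoir terms change sign as $T_S$ passes through $T_R$, but their sum is never negative and vanishes only when $T_S = T_R$, where no heat flows at all.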

14.3 The first law revisited

Using our new notion of entropy, it is possible to obtain a much more elegant and useful statement of the first law of thermodynamics. We recall from eqn 11.7 that the first law is given by

$$dU = đQ + đW. \qquad (14.14)$$

Now, for a reversible change only, we have that

$$đQ = T\,dS \qquad (14.15)$$

and

$$đW = -p\,dV. \qquad (14.16)$$

Combining these, we find that

$$dU = T\,dS - p\,dV. \qquad (14.17)$$

Constructing this equation, we stress, has assumed that the change is reversible. However, since all the quantities in eqn 14.17 are functions of state, and are therefore path independent, this equation holds for irreversible processes as well! For an irreversible change, $đQ \le T\,dS$ and also $đW \ge -p\,dV$, but with $đQ$ being smaller than for the reversible case and $đW$ being larger than for the reversible case so that $dU$ is the same whether the change is reversible or irreversible. Therefore, we always have that:

$$dU = T\,dS - p\,dV. \qquad (14.18)$$

This equation implies that the internal energy $U$ changes when either $S$ or $V$ changes. Thus, the function $U$ can be written in terms of the variables $S$ and $V$ which are its so-called natural variables. These variables are both extensive (i.e. they scale with the size of the system).² The variables $p$ and $T$ are both intensive (i.e. they do not scale with the size of the system) and behave a bit like forces, since they show how the internal energy changes with respect to some parameter. In fact, since mathematically we can write $dU$ as

$$dU = \left( \frac{\partial U}{\partial S} \right)_V dS + \left( \frac{\partial U}{\partial V} \right)_S dV, \qquad (14.19)$$

we can make the identification of $T$ and $p$ using

$$T = \left( \frac{\partial U}{\partial S} \right)_V \qquad (14.20)$$

and

$$p = -\left( \frac{\partial U}{\partial V} \right)_S. \qquad (14.21)$$

2 See Section 11.1.2.

The ratio of $p$ and $T$ can also be written in terms of the variables $U$, $S$ and $V$, as follows:

$$\frac{p}{T} = -\left( \frac{\partial U}{\partial V} \right)_S \left( \frac{\partial S}{\partial U} \right)_V, \qquad (14.22)$$

using the reciprocal theorem (see eqn C.41). Hence

$$\frac{p}{T} = \left( \frac{\partial S}{\partial V} \right)_U, \qquad (14.23)$$

using the reciprocity theorem (see eqn C.42). These equations are used in the following example.
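To see the identifications in eqns 14.20 and 14.21 at work, one can take an explicit function $U(S, V)$ and differentiate it numerically. The Python sketch below assumes, purely for illustration, the form $U = U_0 (V_0/V)^{2/3} \exp[2(S - S_0)/3R]$ appropriate to one mole of a monatomic ideal gas; this model, the reference values and the function names are assumptions and do not come from the text.

```python
import math

R = 8.314  # molar gas constant, J K^-1 mol^-1

# Assumed reference state for one mole of monatomic ideal gas.
U0, V0, S0 = 3741.3, 0.0249, 0.0   # J, m^3, J K^-1


def internal_energy(s, v):
    """Assumed model U(S, V) for one mole of monatomic ideal gas."""
    return U0 * (V0 / v) ** (2.0 / 3.0) * math.exp(2.0 * (s - S0) / (3.0 * R))


def temperature(s, v, ds=1e-4):
    """T = (dU/dS) at constant V, estimated by a central difference."""
    return (internal_energy(s + ds, v) - internal_energy(s - ds, v)) / (2.0 * ds)


def pressure(s, v, dv=1e-8):
    """p = -(dU/dV) at constant S, estimated by a central difference."""
    return -(internal_energy(s, v + dv) - internal_energy(s, v - dv)) / (2.0 * dv)


s, v = 5.0, 0.0300   # an arbitrary state (assumed values)
T, p = temperature(s, v), pressure(s, v)
print(T, p, p * v / (R * T))
```

For this assumed $U(S, V)$ the derivatives reproduce the ideal gas law, $pV = RT$, and the relation $U = \tfrac{3}{2}RT$, so the last number printed is 1 to within the accuracy of the finite differences.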


Fig. 14.3 Two systems, 1 and 2, which are able to exchange volume and internal energy.

Example 14.2

Consider two systems, with pressures $p_1$ and $p_2$ and temperatures $T_1$ and $T_2$. If internal energy $\Delta U$ is transferred from system 1 to system 2, and volume $\Delta V$ is transferred from system 1 to system 2 (see Fig. 14.3), find the change of entropy. Show that equilibrium results when $T_1 = T_2$ and $p_1 = p_2$.

Solution: Equation 14.18 can be rewritten as

$$dS = \frac{1}{T}\,dU + \frac{p}{T}\,dV. \qquad (14.24)$$

If we now apply this to our problem, system 1 loses $\Delta U$ and $\Delta V$ while system 2 gains them, so the total change in entropy is

$$\Delta S = \left( \frac{1}{T_2} - \frac{1}{T_1} \right) \Delta U + \left( \frac{p_2}{T_2} - \frac{p_1}{T_1} \right) \Delta V. \qquad (14.25)$$

Equation 14.9 shows that the entropy of the thermally isolated joint system never decreases in any physical process. Thus, when equilibrium is achieved, the entropy will have reached a maximum, so that $\Delta S = 0$ for any further small exchange. This means that the joint system cannot increase its entropy by further exchanging volume or internal energy between system 1 and system 2. $\Delta S = 0$ for arbitrary $\Delta U$ and $\Delta V$ can only be achieved when $T_1 = T_2$ and $p_1 = p_2$.

Eqn 14.18 is an important equation that will be used a great deal in subsequent chapters. Before proceeding, we pause to summarize the most important equations in this section and state their applicability.

Summary
$dU = đQ + đW$          always true
$đQ = T\,dS$            only true for reversible changes
$đW = -p\,dV$           only true for reversible changes
$dU = T\,dS - p\,dV$    always true
For irreversible changes: $đQ \le T\,dS$, $đW \ge -p\,dV$

14.4 The Joule expansion

In this section, we describe in detail an irreversible process which is known as the Joule expansion. One mole of ideal gas (pressure $p_i$, temperature $T_i$) is confined to the left-hand side of a thermally isolated container and occupies a volume $V_0$. The right-hand side of the container (also volume $V_0$) is evacuated. The tap between the two parts of the container is then suddenly opened and the gas fills the entire container of volume $2V_0$ (and has new temperature $T_f$ and pressure $p_f$). Both containers are assumed to be thermally isolated from their surroundings. For the initial state, the ideal gas law implies that

$$p_i V_0 = R T_i, \qquad (14.26)$$

and for the final state that

$$p_f (2V_0) = R T_f. \qquad (14.27)$$

Since the system is thermally isolated from its surroundings and no work is done on it (the gas expands into a vacuum), $\Delta U = 0$. Also, since $U$ is only a function of $T$ for an ideal gas, $\Delta T = 0$ and hence $T_i = T_f$. This implies that $p_i V_0 = p_f (2V_0)$, so that the pressure halves, i.e.

$$p_f = \frac{p_i}{2}. \qquad (14.28)$$

It is hard to calculate directly the change of entropy of a gas in a Joule expansion along the route that it takes from its initial state to the final state. The pressure and volume of the system are undefined during the process immediately after the partition is removed since the gas is in a non-equilibrium state. However, entropy is a function of state and therefore for the purposes of the calculation, we can take another route from the initial state to the final state since changes of functions of state are independent of the route taken. Let us calculate the change in entropy for a reversible isothermal expansion of the gas from volume $V_0$ to volume $2V_0$ (as indicated in Fig. 14.5). Since the internal energy is constant in the isothermal expansion of an ideal gas, $dU = 0$, and hence the new form of the first law in eqn 14.18 gives us $T\,dS = p\,dV$, so that

$$\Delta S = \int_i^f dS = \int_{V_0}^{2V_0} \frac{p\,dV}{T} = \int_{V_0}^{2V_0} \frac{R\,dV}{V} = R \ln 2. \qquad (14.29)$$

Since $S$ is a function of state, this increase in entropy $R \ln 2$ is also the change of entropy for the Joule expansion.

Example 14.3

What is the change of entropy in the gas, surroundings and Universe during a Joule expansion?

Solution: Above, we have worked out $\Delta S_{\mathrm{gas}}$ for the reversible isothermal expansion and the Joule expansion: they have to be the same. What about the surroundings and the Universe in each case? For the reversible isothermal expansion of the gas, we deduce the change of entropy in the surroundings so that the entropy in the Universe does not increase (because we are dealing with a reversible situation).

$$\begin{aligned} \Delta S_{\mathrm{gas}} &= R \ln 2, \\ \Delta S_{\mathrm{surroundings}} &= -R \ln 2, \\ \Delta S_{\mathrm{Universe}} &= \Delta S_{\mathrm{gas}} + \Delta S_{\mathrm{surroundings}} = 0. \end{aligned} \qquad (14.30)$$

Fig. 14.4 The Joule expansion between volume $V_0$ and volume $2V_0$. One mole of ideal gas (pressure $p_i$, temperature $T_i$) is confined to the left-hand side of a container in a volume $V_0$. The container is thermally isolated from its surroundings. The tap between the two parts of the container is then suddenly opened and the gas fills the entire container of volume $2V_0$ (and has new temperature $T_f$ and pressure $p_f$).

Fig. 14.5 The Joule expansion between volume $V_0$ and volume $2V_0$ and a reversible isothermal expansion of a gas between the same volumes. The path in the p–V plane for the Joule expansion is undefined, whereas it is well defined for the reversible isothermal expansion. In each case however, the start and end points are well defined. Since entropy is a function of state, the change in entropy for the two processes is the same, regardless of route.

Notice that the entropy of the surroundings goes down. This does not contradict the second law of thermodynamics. The entropy of something can decrease if that something is not isolated. Here the surroundings are not isolated because they are able to exchange heat with the system. For the Joule expansion, the system is thermally isolated so that the entropy of the surroundings does not change. Hence

$$\begin{aligned} \Delta S_{\mathrm{gas}} &= R \ln 2, \\ \Delta S_{\mathrm{surroundings}} &= 0, \\ \Delta S_{\mathrm{Universe}} &= \Delta S_{\mathrm{gas}} + \Delta S_{\mathrm{surroundings}} = R \ln 2. \end{aligned} \qquad (14.31)$$

Once the Joule expansion has occurred, you can only put the gas back in the left-hand side by compressing it. The best³ you can do is to do this reversibly, by a reversible isothermal compression, which takes work $\Delta W$ given (for 1 mole of gas) by

$$\Delta W = -\int_{2V_0}^{V_0} p\,dV = -\int_{2V_0}^{V_0} \frac{RT}{V}\,dV = RT \ln 2 = T\,\Delta S_{\mathrm{gas}}. \qquad (14.32)$$

The increase of entropy in a Joule expansion is thus $\Delta W/T$.

3 In other words, the method involving the least work.

A paradox?:
• In the Joule expansion, the system is thermally isolated so no heat can be exchanged: $\Delta Q = 0$.
• No work is done: $\Delta W = 0$.
• Hence $\Delta U = 0$ (so for an ideal gas, $\Delta T = 0$).
• But if $\Delta Q = 0$, doesn’t that imply that $\Delta S = \Delta Q/T = 0$?

The above reasoning is correct, until the very end: the answer to the question in the last point is NO! The equation $đQ = T\,dS$ is only true for reversible changes. In general $đQ \le T\,dS$, and here we have $\Delta Q = 0$ and $\Delta S = R \ln 2$, so we have that $\Delta Q \le T \Delta S$.
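The numbers in this section can be reproduced with a few lines of Python. The sketch below is an illustration only: the initial volume and the temperature are assumed values, and the function name is invented. It integrates $dS = R\,dV/V$ numerically along the reversible isothermal route of eqn 14.29 and compares the result with the work of recompression in eqn 14.32.

```python
import math

R = 8.314  # molar gas constant, J K^-1 mol^-1


def isothermal_entropy_change(v_initial, v_final, steps=100000):
    """Midpoint-rule integral of dS = R dV / V for one mole of ideal gas."""
    dv = (v_final - v_initial) / steps
    return sum(R * dv / (v_initial + (k + 0.5) * dv) for k in range(steps))


V0 = 1.0e-3      # assumed initial volume in m^3
T = 300.0        # assumed temperature in kelvin

delta_s = isothermal_entropy_change(V0, 2.0 * V0)
work_to_recompress = R * T * math.log(2.0)   # eqn 14.32 for one mole

print(delta_s, R * math.log(2.0), work_to_recompress / T)
```

The three printed values agree: the entropy change depends only on the ratio of the volumes, not on $V_0$ or $T$, whereas the work needed to undo the expansion scales with the temperature at which the compression is carried out.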

14.5 The statistical basis for entropy

We now want to show that as well as defining entropy via thermodynamics, i.e. using $dS = đQ_{\mathrm{rev}}/T$, it is also possible to define entropy via statistics. We will motivate this as follows: As we showed in eqn 14.20, the first law $dU = T\,dS - p\,dV$ implies that

$$T = \left( \frac{\partial U}{\partial S} \right)_V, \qquad (14.33)$$

or equivalently

$$\frac{1}{T} = \left( \frac{\partial S}{\partial U} \right)_V. \qquad (14.34)$$

Now, recall from eqn 4.7 that

$$\frac{1}{k_B T} = \frac{d \ln \Omega}{dE}. \qquad (14.35)$$

Comparing these last two equations motivates the identification of $S$ with $k_B \ln \Omega$, i.e.

$$S = k_B \ln \Omega. \qquad (14.36)$$

This is the expression for the entropy of a system which is in a particular macrostate in terms of $\Omega$, the number of microstates associated with that macrostate. We are assuming that the system is in a particular macrostate which has fixed energy, and this situation is known as the microcanonical ensemble (see Section 4.5). Later in this chapter (see Section 14.8), and also later in the book, we will generalize this result to express the entropy for more complicated situations. Nevertheless, this expression is sufficiently important that it was inscribed on Boltzmann’s tombstone, although on the tombstone the symbol $\Omega$ is written as a ‘W’.⁴ In the following example, we will apply this expression to understanding the Joule expansion which we introduced in Section 14.4.

Example 14.4

Joule expansion: Following a Joule expansion, each molecule can be either on the left-hand side or the right-hand side of the container. For each molecule there are therefore two ways of placing it. For one mole ($N_A$ molecules) there are $2^{N_A}$ ways of placing them. The number of microstates associated with the gas being in a container twice as big as the initial volume is therefore larger by a factor

$$\Omega = 2^{N_A} \qquad (14.37)$$

for one mole ($N_A$ molecules) of gas, so that

$$\Delta S = k_B \ln 2^{N_A} = k_B N_A \ln 2 = R \ln 2, \qquad (14.38)$$

which is the same expression as written in eqn 14.29.
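The counting argument of eqn 14.38 can be evaluated directly, provided the logarithm is taken before the enormous number $2^{N_A}$ is ever formed. The Python sketch below is a minimal illustration; the function name is an assumption, and the constants are the exact SI values of $k_B$ and $N_A$.

```python
import math

K_B = 1.380649e-23    # Boltzmann constant, J K^-1
N_A = 6.02214076e23   # Avogadro constant, mol^-1


def entropy_increase_on_doubling(n_molecules):
    """Delta S = k_B ln(2**N), evaluated as N k_B ln 2 to avoid forming 2**N."""
    return n_molecules * K_B * math.log(2.0)


delta_s = entropy_increase_on_doubling(N_A)
print(f"Delta S = {delta_s:.3f} J/K per mole")

# For a small number of molecules the count can be checked by brute force.
n_small = 10
assert math.isclose(K_B * math.log(2 ** n_small),
                    entropy_increase_on_doubling(n_small))
```

For one mole this gives about 5.76 J K⁻¹, which is $R \ln 2$, the same value obtained from the thermodynamic route.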

4 See page 29.

14.6 The entropy of mixing

Consider two different ideal gases (call them 1 and 2) which are in separate vessels with volumes $xV$ and $(1 - x)V$ respectively at the same pressures $p$ and temperatures $T$ (see Fig. 14.6). Since the pressures and temperatures are the same on each side, and since $p = (N/V) k_B T$, the number of molecules of gas 1 is $xN$ and of gas 2 is $(1 - x)N$, where $N$ is the total number of molecules. If the tap on the pipe connecting the two vessels is opened, the gases will spontaneously mix, resulting in an increase in entropy, known as the entropy of mixing.

Fig. 14.6 Gas 1 is confined in a vessel of volume $xV$, while gas 2 is confined in a vessel of volume $(1 - x)V$. Both gases are at pressure $p$ and temperature $T$. Mixing occurs once the tap on the pipe connecting the two vessels is opened.

As for the Joule expansion, we can imagine going from the starting state (gas 1 in the first vessel, gas 2 in the second vessel) to the final state (a homogeneous mixture of gas 1 and gas 2 distributed throughout both vessels) via a reversible route, so that we imagine a reversible expansion of gas 1 from $xV$ into the combined volume $V$ and a reversible expansion of gas 2 from $(1 - x)V$ into the combined volume $V$. For an isothermal expansion of an ideal gas, the internal energy doesn’t change and hence $T\,dS = p\,dV$ so that $dS = (p/T)\,dV = N k_B\,dV/V$, using the ideal gas law (with $N$ here standing for the number of molecules of the gas that is expanding). This means that the entropy of mixing for our problem is

$$\Delta S = x N k_B \int_{xV}^{V} \frac{dV_1}{V_1} + (1 - x) N k_B \int_{(1-x)V}^{V} \frac{dV_2}{V_2}, \qquad (14.39)$$

and hence

$$\Delta S = -N k_B \left( x \ln x + (1 - x) \ln(1 - x) \right). \qquad (14.40)$$

Fig. 14.7 The entropy of mixing according to eqn 14.40.

This equation is plotted in Fig. 14.7. As expected, there is no entropy increase when $x = 0$ or $x = 1$. The maximum entropy change occurs when $x = \tfrac{1}{2}$ in which case $\Delta S = N k_B \ln 2$. This of course corresponds to the equilibrium state in which no further increase of entropy is possible.

This expression for $x = \tfrac{1}{2}$ also admits a very simple statistical interpretation. Before the mixing of the gases takes place, we know that gas 1 is only in the first vessel and gas 2 is only in the second vessel. After mixing, each molecule can exist in twice as many ‘microstates’ as before; for every microstate with a molecule of gas 1 on the left there is now an additional one with a molecule of gas 1 now on the right. Therefore $\Omega$ must be multiplied by $2^N$ and hence $S$ must increase by $k_B \ln 2^N$ which is $N k_B \ln 2$.

This treatment has an important consequence: distinguishability is an important concept! We have assumed that there is some tangible difference between gas 1 and gas 2, so that there is some way to label whether a particular molecule is gas 1 or gas 2. For example, if the two gases were nitrogen and oxygen, one could measure the mass of the molecules to determine which was which. But what if the two gases were actually the same? Physically, we would expect that mixing them would have no observable consequences, so there should be no increase in entropy. Thus mixing should only increase entropy if the gases really are distinguishable. We will return to this issue of distinguishability in Chapter 29.
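The shape of the entropy of mixing in eqn 14.40 can be checked with a short Python sketch. The function name and the sample values of $x$ are assumptions made for the illustration, and the limit $0 \ln 0 = 0$ is imposed by hand at the end points.

```python
import math


def mixing_entropy_per_nkb(x):
    """Delta S / (N k_B) = -(x ln x + (1 - x) ln(1 - x)), with 0 ln 0 taken as 0."""
    if not 0.0 <= x <= 1.0:
        raise ValueError("x must lie between 0 and 1")
    total = 0.0
    for fraction in (x, 1.0 - x):
        if fraction > 0.0:
            total -= fraction * math.log(fraction)
    return total


for x in (0.0, 0.1, 0.25, 0.5, 0.75, 1.0):
    print(f"x = {x:4.2f}: Delta S / (N k_B) = {mixing_entropy_per_nkb(x):.4f}")
```

The function vanishes at $x = 0$ and $x = 1$, is symmetric about $x = \tfrac{1}{2}$, and reaches its maximum value of $\ln 2$ there, matching the discussion above.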

14.7 Maxwell’s demon

In 1867, James Clerk Maxwell came up with an intriguing puzzle via a thought experiment. This has turned out to be much more illuminating and hard to solve than he might ever have imagined. The thought experiment can be stated as follows: imagine performing a Joule expansion on a gas. A gas is initially in one chamber, which is connected via a closed tap to a second chamber containing only a vacuum (see Fig. 14.4). The tap is opened and the gas in the ﬁrst chamber expands to ﬁll both chambers. Equilibrium is established and the pressure in each chamber is now half of what it was in the ﬁrst chamber at the start. The Joule expansion is formally irreversible as there is no way to get the gas back into the initial chamber without doing work. Or is there? Maxwell imagined that the tap was operated by a microscopic intelligent creature, now called Maxwell’s demon, who was able to watch the individual molecules bouncing around close to the tap (see Fig. 14.8). If the demon sees a gas molecule heading from the second chamber back into the ﬁrst, it quickly opens the tap and then shuts it straight away, just letting the molecule through. If it spots a gas molecule heading from the ﬁrst chamber back into the second chamber, it keeps the tap closed. The demon does no work5 and yet it can make sure that the gas molecules in the second chamber all go back into the ﬁrst chamber. Thus it creates a pressure diﬀerence between the two chambers where none existed before the demon started its mischief.

⁵ It does no work in the $p\,\mathrm{d}V$ sense, though it does do some in the brain sense.

Fig. 14.8 Maxwell’s demon watches the gas molecules in chambers A and B and intelligently opens and shuts the trap door connecting the chambers. The demon is therefore able to reverse the Joule expansion and only let molecules travel from B to A, thus apparently contravening the second law of thermodynamics.


Now, a similar demon could be employed to make hot molecules go the wrong way (i.e. so that heat ﬂows the wrong way, from cold to hot – this in fact was Maxwell’s original implementation of the demon), or even to sort out molecules of diﬀerent types (and thus subvert the ‘entropy of mixing’, see Section 14.6). It looks as if the demon could therefore cause entropy to decrease in a system with no consequent increase in entropy anywhere else. In short, Maxwell’s demon appears to make a mockery out of the second law of thermodynamics. How on earth does it get away with it? Many very good minds have addressed this problem. One early idea was that the demon needs to make measurements of where all the gas molecules are, and to do this would need to shine light on the molecules; thus the process of observation of the molecules might be thought to rescue us from Maxwell’s demon. However, this idea turned out not to be correct as it was found to be possible, even in principle, to detect a molecule with arbitrarily little work and dissipation. Remarkably, it turns out that because a demon needs to have a memory to operate (so that it can remember where it has observed a molecule and any other results of its measurement process), this act of storing information (actually it is the act of erasing information, as we will discuss below) is associated with an increase of entropy, and this increase cancels out any decrease in entropy that the demon might be able to eﬀect in the system. This connection between information and entropy is an extremely important insight and will be explored in Chapter 15. The demon is in fact a type of computational device that processes and stores information about the world. It is possible to design a computational process which proceeds entirely reversibly, and therefore has no increase in entropy associated with it. 
However, the act of erasing information is irreversible (as anyone who has ever failed to back up their data and then had their computer crash will testify). Erasing information always has an associated increase in entropy (of $k_{\mathrm{B}} \ln 2$ per bit, as we shall see in Chapter 15); Maxwell's demon can therefore operate reversibly, but only if it has a large enough hard disk that it never needs to clear space to continue operating. The Maxwell demon therefore beautifully illustrates the connection between entropy and information.

14.8

Entropy and probability

The entropy that you measure is due to the number of diﬀerent states in which the system can exist, according to S = kB ln Ω (eqn 14.36). However, each state may consist of a large number of microstates that we can’t directly measure. Since the system could exist in any one of those microstates, there is extra entropy associated with them. An example should make this idea clear.


Example 14.5 A system has 5 possible equally likely states in which it can exist, and which of those states it occupies can be distinguished by some easy physical measurement. The entropy is therefore, using eqn 14.36,

$$ S = k_{\mathrm{B}} \ln 5. \tag{14.41} $$

However, each of those 5 states is made up of 3 equally likely microstates and it is not possible to measure easily which of those microstates it is in. The extra entropy associated with these microstates is $S_{\mathrm{micro}} = k_{\mathrm{B}} \ln 3$. The system therefore really has $3 \times 5 = 15$ states and the total entropy is therefore $S_{\mathrm{tot}} = k_{\mathrm{B}} \ln 15$. This can be decomposed into

$$ S_{\mathrm{tot}} = S + S_{\mathrm{micro}}. \tag{14.42} $$

Now let us suppose that a system can have $N$ different, equally likely microstates. As usual, it is hard to measure the details of these microstates directly, but let us assume that they are there. These microstates are divided into various groups (we will call these groups macrostates) with $n_i$ microstates contained in the $i$th macrostate. The macrostates are easier to distinguish using experiment because they correspond to some macroscopic, measurable property. We must have that the sum of all the microstates in each macrostate is equal to the total number of microstates, so that

$$ \sum_i n_i = N. \tag{14.43} $$

The probability $P_i$ of finding the system in the $i$th macrostate is then given by

$$ P_i = \frac{n_i}{N}. \tag{14.44} $$

Equation 14.43 then implies that $\sum_i P_i = 1$ as required. The total entropy is of course $S_{\mathrm{tot}} = k_{\mathrm{B}} \ln N$, though we can't measure that directly (having no information about the microstates which is easily accessible). Nevertheless, $S_{\mathrm{tot}}$ is equal to the sum of the entropy associated with the freedom of being able to be in different macrostates, which is our measured entropy $S$, and the entropy $S_{\mathrm{micro}}$ associated with it being able to be in different microstates within a macrostate. Putting this statement in an equation, we have

$$ S_{\mathrm{tot}} = S + S_{\mathrm{micro}}, \tag{14.45} $$

which is identical to eqn 14.42. The entropy associated with being able to be in different microstates (the aspect we can't measure) is given by

$$ S_{\mathrm{micro}} = \langle S_i \rangle = \sum_i P_i S_i, \tag{14.46} $$


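where $S_i = k_{\mathrm{B}} \ln n_i$ is the entropy of the microstates in the $i$th macrostate and, to recap, $P_i$ is the probability of a particular macrostate being occupied. Hence

$$ S = S_{\mathrm{tot}} - S_{\mathrm{micro}} = k_{\mathrm{B}} \Big( \ln N - \sum_i P_i \ln n_i \Big) = k_{\mathrm{B}} \sum_i P_i (\ln N - \ln n_i), \tag{14.47} $$

where the last step uses $\sum_i P_i = 1$. Using $\ln N - \ln n_i = -\ln(n_i/N) = -\ln P_i$ (from eqn 14.44) then yields Gibbs' expression for the entropy:

$$ S = -k_{\mathrm{B}} \sum_i P_i \ln P_i. \tag{14.48} $$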
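The decomposition $S_{\mathrm{tot}} = S + S_{\mathrm{micro}}$ and the resulting Gibbs expression can be verified numerically. The following is a minimal sketch, working in units of $k_{\mathrm{B}}$; the function name and the second set of microstate counts are hypothetical choices made for illustration.

```python
import math

def entropy_decomposition(microstate_counts):
    """Return (S_tot, S_micro, S_gibbs) in units of k_B for macrostates
    containing the given numbers (each >= 1) of equally likely microstates."""
    total = sum(microstate_counts)                   # N, eqn 14.43
    probs = [n / total for n in microstate_counts]   # P_i, eqn 14.44
    s_tot = math.log(total)                          # ln N
    s_micro = sum(p * math.log(n)                    # sum_i P_i ln n_i, eqn 14.46
                  for p, n in zip(probs, microstate_counts))
    s_gibbs = -sum(p * math.log(p) for p in probs)   # eqn 14.48
    return s_tot, s_micro, s_gibbs

# Example 14.5: five macrostates with three microstates each
print(entropy_decomposition([3, 3, 3, 3, 3]))

# Hypothetical unequal macrostates
print(entropy_decomposition([1, 2, 7, 10]))
```

In both cases the first value minus the second should equal the third, to within rounding error.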

Example 14.6 Find the entropy for a system with $\Omega$ macrostates, each with probability $P_i = 1/\Omega$ (i.e. assuming the microcanonical ensemble).

Solution: Using eqn 14.48, substitution of $P_i = 1/\Omega$ yields

$$ S = -k_{\mathrm{B}} \sum_i P_i \ln P_i = -k_{\mathrm{B}} \sum_{i=1}^{\Omega} \frac{1}{\Omega} \ln \frac{1}{\Omega} = -k_{\mathrm{B}} \ln \frac{1}{\Omega} = k_{\mathrm{B}} \ln \Omega, \tag{14.49} $$

which is the same as eqn 14.36.

A connection between the Boltzmann probability and the expression for entropy in eqn 14.48 is demonstrated in the following example.

⁶ See Appendix C.13.

Example 14.7 Maximize $S = -k_{\mathrm{B}} \sum_i P_i \ln P_i$ (eqn 14.48) subject to the constraints that $\sum_i P_i = 1$ and $\sum_i P_i E_i = U$.

Solution: Use the method of Lagrange multipliers,⁶ in which we maximize

$$ \frac{S}{k_{\mathrm{B}}} - \alpha \times (\text{constraint 1}) - \beta \times (\text{constraint 2}), \tag{14.50} $$

where $\alpha$ and $\beta$ are Lagrange multipliers. Thus we vary this expression with respect to one of the probabilities $P_j$ and get

$$ \frac{\partial}{\partial P_j} \sum_i \left( -P_i \ln P_i - \alpha P_i - \beta P_i E_i \right) = 0, \tag{14.51} $$

so that

$$ -\ln P_j - 1 - \alpha - \beta E_j = 0. \tag{14.52} $$

This can be rearranged to give

$$ P_j = \frac{e^{-\beta E_j}}{e^{1+\alpha}}, \tag{14.53} $$

so that with $Z = e^{1+\alpha}$ we have

$$ P_j = \frac{e^{-\beta E_j}}{Z}, \tag{14.54} $$

which is our familiar expression for the Boltzmann probability (eqn 4.13).
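As a numerical illustration of this result, the sketch below compares the Gibbs entropy of a Boltzmann distribution with that of nearby distributions that satisfy the same two constraints. The three equally spaced energy levels, the value of $\beta$ and the perturbation direction are hypothetical choices made only for this check; they do not come from the text.

```python
import math

def boltzmann(energies, beta):
    """P_j = exp(-beta E_j) / Z, as in eqn 14.54."""
    weights = [math.exp(-beta * e) for e in energies]
    z = sum(weights)  # Z plays the role of e^(1 + alpha)
    return [w / z for w in weights]

def gibbs_entropy(probs):
    """S / k_B = -sum_i P_i ln P_i, as in eqn 14.48."""
    return -sum(p * math.log(p) for p in probs if p > 0.0)

def mean_energy(probs, energies):
    return sum(p * e for p, e in zip(probs, energies))

energies = [0.0, 1.0, 2.0]  # hypothetical equally spaced levels
beta = 0.7                  # hypothetical value of the multiplier
p = boltzmann(energies, beta)
u = mean_energy(p, energies)
s_max = gibbs_entropy(p)

# Shifting probability by eps * (1, -2, 1) leaves both sum(P) and sum(P E)
# unchanged for these levels, so every trial obeys the two constraints.
trials = []
for eps in (-0.02, -0.01, 0.01, 0.02):
    q = [p[0] + eps, p[1] - 2 * eps, p[2] + eps]
    trials.append((eps, mean_energy(q, energies), gibbs_entropy(q)))

for eps, u_q, s_q in trials:
    print(f"eps = {eps:+.2f}  U = {u_q:.6f}  S/k_B = {s_q:.6f}  (Boltzmann: {s_max:.6f})")
```

Each perturbed distribution has the same $U$ as the Boltzmann distribution but a slightly smaller entropy, as expected if eqn 14.54 is the constrained maximum.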

Chapter summary

• Entropy is defined by $\mathrm{d}S = đQ_{\mathrm{rev}}/T$.
• The entropy of an isolated system tends to a maximum.
• The entropy of an isolated system attains this maximum at equilibrium.
• The laws of thermodynamics can be stated as follows:
  (1) $U_{\mathrm{Universe}}$ = constant.
  (2) $S_{\mathrm{Universe}}$ can only increase.
• These can be combined to give $\mathrm{d}U = T\,\mathrm{d}S - p\,\mathrm{d}V$, which always holds.
• The statistical definition of entropy is $S = k_{\mathrm{B}} \ln \Omega$.
• The general definition of entropy, due to Gibbs, is $S = -k_{\mathrm{B}} \sum_i P_i \ln P_i$.

Exercises

(14.1) A mug of tea has been left to cool from 90 °C to 18 °C. If there is 0.2 kg of tea in the mug, and the tea has specific heat capacity $4200\ \mathrm{J\,K^{-1}\,kg^{-1}}$, show that the entropy of the tea has decreased by $185.7\ \mathrm{J\,K^{-1}}$. Comment on the sign of this result.

(14.2) In a free expansion of a perfect gas (also called Joule expansion), we know $U$ does not change, and no work is done. However, the entropy must increase because the process is irreversible. Are these statements compatible with the first law $\mathrm{d}U = T\,\mathrm{d}S - p\,\mathrm{d}V$?

(14.3) A 10 Ω resistor is held at a temperature of 300 K. A current of 5 A is passed through the resistor for 2 minutes. Ignoring changes in the source of the current, what is the change of entropy in (a) the resistor and (b) the Universe?

(14.4) Calculate the change of entropy
(a) of a bath containing water, initially at 20 °C, when it is placed in thermal contact with a very large heat reservoir at 80 °C,
(b) of the reservoir when process (a) occurs,
(c) of the bath and of the reservoir if the bath is brought to 80 °C through the operation of a Carnot engine between them.
The bath and its contents have total heat capacity $10^4\ \mathrm{J\,K^{-1}}$.

(14.5) A block of lead of heat capacity $1\ \mathrm{kJ\,K^{-1}}$ is cooled from 200 K to 100 K in two ways.
(a) It is plunged into a large liquid bath at 100 K.
(b) The block is first cooled to 150 K in one liquid bath and then to 100 K in another bath.
Calculate the entropy changes in the system comprising block plus baths in cooling from 200 K to 100 K in these two cases. Prove that in the limit of an infinite number of intermediate baths the total entropy change is zero.

(14.6) Calculate the changes in entropy of the Universe as a result of the following processes:
(a) A capacitor of capacitance 1 µF is connected to a battery of e.m.f. 100 V at 0 °C. (NB think carefully about what happens when a capacitor is charged from a battery.)
(b) The same capacitor, after being charged to 100 V, is discharged through a resistor at 0 °C.
(c) One mole of gas at 0 °C is expanded reversibly and isothermally to twice its initial volume.
(d) One mole of gas at 0 °C is expanded reversibly and adiabatically to twice its initial volume.
(e) The same expansion as in (f) is carried out by opening a valve to an evacuated container of equal volume.

(14.7) Consider $n$ moles of a gas, initially confined within a volume $V$ and held at temperature $T$. The gas is expanded to a total volume $\alpha V$, where $\alpha$ is a constant, by (a) a reversible isothermal expansion and (b) removing a partition and allowing a free expansion into the vacuum. Both cases are illustrated in Fig. 14.9. Assuming the gas is ideal, derive an expression for the change of entropy of the gas in each case.

Fig. 14.9 Diagram showing $n$ moles of gas, initially confined within a volume $V$.

Repeat this calculation for case (a), assuming that the gas obeys the van der Waals equation of state

$$ \left( p + \frac{n^2 a}{V^2} \right) (V - nb) = nRT. \tag{14.55} $$

Show further that for case (b) the temperature of the van der Waals gas falls by an amount proportional to $(\alpha - 1)/\alpha$.

(14.8) The probability of a system being in the $i$th microstate is

$$ P_i = e^{-\beta E_i}/Z, \tag{14.56} $$

where $E_i$ is the energy of the $i$th microstate and $\beta$ and $Z$ are constants. Show that the entropy is given by

$$ S/k_{\mathrm{B}} = \ln Z + \beta U, \tag{14.57} $$

where $U = \sum_i P_i E_i$ is the internal energy.

(14.9) Use the Gibbs expression for entropy (eqn 14.48) to derive the formula for the entropy of mixing (eqn 14.40).


Julius Robert Mayer (1814-1878)

Fig. 14.10 Robert Mayer

Robert Mayer studied medicine in Tübingen and took the somewhat unusual career route of signing up as a ship's doctor with a Dutch vessel bound for the East Indies. While letting blood from sailors in the tropics, he noticed that their venous blood was redder than observed back home and concluded that the metabolic oxidation rate in hotter climates was slower. Since a constant body temperature was required for life, the body must reduce its oxidation rate because oxidation of material from food produces internal heat. Though there was some questionable physiological reasoning in his logic, Mayer was on to something. He had realised that energy was something that needed to be conserved in any physical process. Back in Heilbronn, Germany, Mayer set to work on a measurement of the mechanical equivalent of heat and wrote a paper in 1841 which was the first statement of the conservation of energy (though he used the word 'force'). Mayer's work predated the ideas of Joule and Helmholtz (though his experiment was not as accurate as Joule's) and his notion of the conservation of energy had a wider scope than that of Helmholtz; not only were mechanical energy and heat convertible, but his principle could be applied to tides, meteorites, solar energy and living things. His paper was eventually published in 1842, but received little acclaim. A later more detailed paper in 1845 was rejected and he published it privately.

Mayer then went through a bit of a bad patch, to put it mildly: others began to get the credit for ideas he thought he had pioneered, three of his children died in the late 1840's and he attempted suicide in 1850, jumping out of a third-story window, but only succeeding in permanently laming himself. In 1851 he checked into a mental institution where he received sometimes brutal treatment and was discharged in 1853, with the doctors unable to offer him any hope of a cure. In 1858, he was even referred to as being dead in a lecture by Liebig (famous for his condenser, and editor of the journal that had accepted Mayer's 1842 paper). Mayer's scientific reputation began to recover in the 1860's and he was awarded the Copley Medal of the Royal Society of London in 1871, the year after they awarded it to Joule.

James Prescott Joule (1818-1889)

Fig. 14.11 James Joule

James Joule was the son of a wealthy brewer in Salford, near Manchester, England. Joule was educated at home, and his tutors included John Dalton, the father of modern atomic theory. In 1833, illness forced his father to retire, and Joule was left in charge of the family brewery. He had a passion for scientific research and set up a laboratory, working there in the early morning and late evening so that he could continue his day job. In 1840, he showed that the heat dissipated by an electric current $I$ in a resistor $R$ was proportional to $I^2 R$ (what we now call Joule heating). In 1846, Joule discovered the phenomenon of magnetostriction (by which a magnet changes its length when magnetized). Joule's work did not, however, impress the Royal Society and he was dismissed as a mere provincial dilettante.

Undeterred, Joule decided to work on the convertibility of energy and to try to measure the mechanical equivalent of heat. In his most famous experiment he measured the increase in temperature of a thermally insulated barrel of water, stirred by a paddle-wheel which was driven by a falling weight. But this was just one of an exhaustive series of meticulously performed experiments which aimed to determine the mechanical equivalent of heat, using electrical circuits, chemical reactions, viscous heating, mechanical contraptions and gas compression. He even attempted to measure the temperature difference between water at the top and bottom of a waterfall, an opportunity afforded to him by being in Switzerland on his honeymoon!

Joule's obsessive industry paid off: his completely different experimental methods gave consistent results. Part of Joule's success was in designing thermometers with unprecedented accuracy which could measure temperature changes as small as 1/200 degrees Fahrenheit. This was necessary as the effects he was looking for tended to be small. His methods proved to be accurate and even his early measurements were within several percent of the modern accepted value of the mechanical equivalent of heat, and his 1850 experiment was within 1 percent. However, the smallness of the effect led to scepticism, particularly from the scientific establishment, who had all had proper educations, didn't spend their days making beer and knew that you couldn't measure temperature differences as tiny as Joule claimed to have observed.

However, the tide began to turn in Joule's favour in the late 1840's. Helmholtz recognized Joule's contribution to the conservation of energy in his paper of 1847. In the same year, Joule gave a talk at a British Association meeting in Oxford where Stokes, Faraday and Thomson were in attendance. Thomson was intrigued and the two struck up a correspondence, resulting in a fruitful collaboration from 1852 to 1856. They measured the temperature fall in the expansion of a gas, and discovered the Joule–Thomson effect. Joule refused all academic appointments, preferring to work independently. Though without advanced education, Joule had excellent instincts and was an early defender of the kinetic theory of gases, and felt his way towards a kinetic theory of heat, perhaps because of his youthful exposure to Dalton's teachings. On Joule's gravestone is inscribed the number '772.55', the number of foot-pounds required to heat a pound of water by one degree Fahrenheit. It is fitting that today, mechanical and thermal energy are measured in the same unit: the joule.

Rudolf Clausius (1822-1888)

Fig. 14.12 Rudolf Clausius

Rudolf Clausius studied mathematics and physics in Berlin, and did his doctorate in Halle University on the colour of the sky. Clausius turned his attention to the theory of heat and, in 1850, he published a paper which essentially saw him picking up the baton left by Sadi Carnot (via an 1834 paper by Emile Clapeyron) and running with it. He defined the internal energy, $U$, of a system and wrote that the change of heat was given by $\mathrm{d}Q = \mathrm{d}U + (1/J)\,p\,\mathrm{d}V$, where the factor $J$ (the mechanical equivalent of heat) was necessary to convert mechanical energy $p\,\mathrm{d}V$ into the same units as thermal energy (a conversion which in today's units is, of course, unnecessary). He also showed that in a Carnot process, the integral round a closed loop of $f(T)\,\mathrm{d}Q$ was zero, where $f(T)$ was some function of temperature. His work brought him a professorship in Berlin, though he subsequently moved to chairs in Zürich (1855), Würzburg (1867) and Bonn (1869). In 1854, he wrote a paper in which he stated that heat cannot of itself pass from a colder to a warmer body, a statement of the second law of thermodynamics. He also showed that his function $f(T)$ could be written (in modern notation) as $f(T) = 1/T$. In 1865 he was ready to give $f(T)\,\mathrm{d}Q$ a name, defining the entropy (a word he made up to sound like 'energy' but contain 'trope' meaning 'turning', as in the word 'heliotrope', a plant which turns towards the Sun) using $\mathrm{d}S = \mathrm{d}Q/T$ for a reversible process. He also summarized the first and second laws of thermodynamics by stating that the energy of the world is constant and its entropy tends to a maximum.

When Bismarck started the Franco-German war, Clausius patriotically ran a volunteer ambulance corps of Bonn students in 1870–1871, carrying off the wounded from battles in Vionville and Gravelotte. He was wounded in the knee, but received the Iron Cross for his efforts in 1871. He was no less zealous in defending Germany's preeminence in thermal physics in various priority disputes, being provoked into siding with Mayer's claim over Joule's, and in various debates with Tait, Thomson and Maxwell. Clausius however showed little interest in the work of Boltzmann and Gibbs that aimed to understand the molecular origin of the irreversibility that he had discovered and named.

15

Information theory

In this chapter we are going to examine the concept of information and relate it to thermodynamic entropy. At first sight, this seems a slightly crazy thing to do. What on earth does something to do with heat engines have in common with something to do with bits and bytes? It turns out that there is a very deep connection between these two concepts. To understand why, we begin our account by trying to formulate one definition of information.

15.1

Information and Shannon entropy

Consider the following three true statements about Isaac Newton (1643–1727) and his birthday.¹

(1) Isaac Newton's birthday falls on a particular day of the year.
(2) Isaac Newton's birthday falls in the second half of the year.
(3) Isaac Newton's birthday falls on the 25th of a month.

The first statement has, by any sensible measure, no information content. All birthdays fall on a particular day of the year. The second statement has more information content: at least we now know which half of the year his birthday is. The third statement is much more specific and has the greatest information content.

How do we quantify information content? Well, one property we could notice is that the greater the probability of the statement being true in the absence of any prior information, the less the information content of the statement. Thus if you knew no prior information about Newton's birthday, then you would say that statement 1 has probability $P_1 = 1$, statement 2 has probability $P_2 = \frac{1}{2}$, and statement 3 has probability² $P_3 = \frac{12}{365}$; so as the probability decreases, the information content increases. Moreover, since the useful statements 2 and 3 are independent, then if you are given statements 2 and 3 together, their information contents should add. The probability of statements 2 and 3 both being true, in the absence of prior information, is $P_2 \times P_3 = \frac{6}{365}$. Since the probability of two independent statements being true is the product of their individual probabilities, and since it is natural to assume that information content is additive, one is motivated to adopt the definition of information which was proposed by Claude Shannon (1916–2001) as follows:


¹ The statements take as prior information that Newton was born in 1643 and that the dates are expressed according to the calendar which was used in his day. The Gregorian calendar was not adopted in England until 1752.

² We are using the fact that 1643 was not a leap year!


The information content $Q$ of a statement is defined by

$$ Q = -k \log P, \tag{15.1} $$

where $P$ is the probability of the statement and $k$ is a positive constant.³ If we use $\log_2$ (log to the base 2) for the logarithm in this expression and also $k = 1$, then the information $Q$ is measured in bits. If instead we use $\ln \equiv \log_e$ and choose $k = k_{\mathrm{B}}$, then we have a definition which, as we shall see, will match what we have found in thermodynamics. In this chapter, we will stick with the former convention since bits are a useful quantity with which to think about information. Thus, if we have a set of statements with probability $P_i$, with corresponding information $Q_i = -k \log P_i$, then the average information content $S$ is given by

$$ S = \langle Q \rangle = \sum_i Q_i P_i = -k \sum_i P_i \log P_i. \tag{15.2} $$

³ We need $k$ to be a positive constant so that as $P$ goes up, $Q$ goes down.

The average information is called the Shannon entropy.
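The additivity that motivates eqn 15.1 can be checked directly for the statements about Newton's birthday. This is a small sketch using $k = 1$ and base-2 logarithms, so that $Q$ is in bits; the function name is an illustrative choice.

```python
import math

def information_bits(probability):
    """Information content Q = -log2(P) in bits (k = 1, base-2 logarithm)."""
    return math.log2(1 / probability)

p1 = 1.0        # statement 1: a particular day of the year
p2 = 1 / 2      # statement 2: second half of the year
p3 = 12 / 365   # statement 3: the 25th of a month

q1 = information_bits(p1)
q2 = information_bits(p2)
q3 = information_bits(p3)
q_both = information_bits(p2 * p3)  # statements 2 and 3 together

print(f"Q1 = {q1:.3f} bits, Q2 = {q2:.3f} bits, Q3 = {q3:.3f} bits")
print(f"Q2 + Q3 = {q2 + q3:.3f} bits, Q(2 and 3) = {q_both:.3f} bits")
```

The last two numbers agree because the logarithm turns the product $P_2 \times P_3$ into a sum.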

Example 15.1

• A fair die produces outcomes 1, 2, 3, 4, 5 and 6 with probabilities $\frac{1}{6}, \frac{1}{6}, \frac{1}{6}, \frac{1}{6}, \frac{1}{6}, \frac{1}{6}$. The information associated with each outcome is $Q = -k \log \frac{1}{6} = k \log 6$ and the average information content is then $S = k \log 6$. Taking $k = 1$ and using log to the base 2 gives a Shannon entropy of 2.58 bits.

• A biased die produces outcomes 1, 2, 3, 4, 5 and 6 with probabilities $\frac{1}{10}, \frac{1}{10}, \frac{1}{10}, \frac{1}{10}, \frac{1}{10}, \frac{1}{2}$. The information contents associated with the outcomes are $k \log 10$, $k \log 10$, $k \log 10$, $k \log 10$, $k \log 10$ and $k \log 2$. (These are 3.32, 3.32, 3.32, 3.32, 3.32 and 1 bit respectively.) If we take $k = 1$ again, the Shannon entropy is then $S = k(5 \times \frac{1}{10} \log 10 + \frac{1}{2} \log 2) = k \log \sqrt{20}$ (this is 2.16 bits). This Shannon entropy is smaller than in the case of the fair die.
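The two dice can be compared with a few lines of code. The sketch below evaluates eqn 15.2 with $k = 1$ and base-2 logarithms; the function name is hypothetical.

```python
import math

def shannon_entropy_bits(probs):
    """Shannon entropy S = -sum_i P_i log2(P_i) in bits (eqn 15.2, k = 1)."""
    return sum(p * math.log2(1 / p) for p in probs if p > 0)

fair_die = [1 / 6] * 6
biased_die = [1 / 10] * 5 + [1 / 2]

s_fair = shannon_entropy_bits(fair_die)
s_biased = shannon_entropy_bits(biased_die)

print(f"fair die:   {s_fair:.2f} bits")
print(f"biased die: {s_biased:.2f} bits")
```

The biased die has the lower entropy because one outcome is much more predictable than the others.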

The Shannon entropy quantiﬁes how much information we gain, on average, following a measurement of a particular quantity. (Another way of looking at it is to say the Shannon entropy quantiﬁes the amount of uncertainty we have about a quantity before we measure it.) To make these ideas more concrete, let us study a simple example in which there are only two possible outcomes of a particular random process (such as the tossing of a coin, or asking the question ‘will it rain tomorrow?’).


Example 15.2 What is the Shannon entropy for a Bernoulli⁴ trial (a two-outcome random variable) with probabilities $P$ and $1 - P$ of the two outcomes?

Solution:

$$ S = -\sum_i P_i \log P_i = -P \log P - (1 - P) \log(1 - P), \tag{15.3} $$

where we have set $k = 1$. This behaviour is sketched in Fig. 15.1. The Shannon entropy has a maximum when $P = \frac{1}{2}$ (greatest uncertainty about the outcome, or greatest information gained, 1 bit, following a trial) and a minimum when $P = 0$ or 1 (least uncertainty about the outcome, or least information gained, 0 bits, following a trial). The information associated with each of the two possible outcomes is also shown in Fig. 15.1 as dashed lines. The information associated with the outcome having probability $P$ is given by $Q_1 = -\log_2 P$ and decreases as $P$ increases. Clearly when this outcome is very unlikely ($P$ small) the information associated with getting that outcome is very large ($Q_1$ is many bits of information). However, such an outcome doesn't happen very often so it doesn't contribute much to the average information (i.e. to the Shannon entropy, the solid line in Fig. 15.1). When this outcome is almost certain ($P$ almost 1) it is heavily weighted in the average but has very little information content. For the other outcome, with probability $1 - P$, $Q_2 = -\log_2(1 - P)$ and the behaviour is simply a mirror image of this. The maximum average information is when $P = 1 - P = \frac{1}{2}$ and both outcomes have 1 bit of information associated with them.

⁴ James Bernoulli (1654–1705).
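The trade-off described in this example, between how informative an outcome is and how often it occurs, can be tabulated. This is an illustrative sketch valid for $0 < P < 1$; the function name and the sample values of $P$ are assumptions made for the illustration.

```python
import math

def bernoulli_terms(p):
    """Return (Q1, Q2, S) in bits for outcome probabilities p and 1 - p,
    following eqn 15.3 with base-2 logarithms. Requires 0 < p < 1."""
    q1 = math.log2(1 / p)        # information of the outcome with probability p
    q2 = math.log2(1 / (1 - p))  # information of the other outcome
    return q1, q2, p * q1 + (1 - p) * q2

for p in (0.01, 0.1, 0.5, 0.9, 0.99):
    q1, q2, s = bernoulli_terms(p)
    print(f"P = {p:4.2f}  Q1 = {q1:5.2f}  P*Q1 = {p * q1:5.3f}  S = {s:5.3f}")
```

The column `P*Q1` is the contribution of the first outcome to the average: it is small both when the outcome is rare and when it is almost certain.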

15.2

Information and thermodynamics

Remarkably, the formula for Shannon entropy in eqn 15.2 is identical (apart from whether you take your constant as k or kB ) to Gibbs’ expression for thermodynamic entropy in eqn 14.48. This gives us a useful perspective on what thermodynamic entropy is. It is a measure of our uncertainty of a system, based upon our limited knowledge of its properties and ignorance about which of its microstates it is in. In making inferences on the basis of partial information, we can assign probabilities on the basis that we maximize entropy subject to the constraints provided by what is known about the system. This is exactly what we did in Example 14.7, when we maximized the Gibbs entropy of an isolated system subject to the constraint that the total energy U was constant; hey presto, we found that we recovered the Boltzmann probability distribution. With this viewpoint, one can begin to understand thermodynamics from an information theory viewpoint.

Fig. 15.1 The Shannon entropy of a Bernoulli trial (a two-outcome random variable) with probabilities $P$ and $1 - P$ of the two outcomes. The units are chosen so that the Shannon entropy is in bits. Also shown is the information associated with each outcome (dashed lines).

⁵ We could equally well reset the bits to one.

However, not only does information theory apply to physical systems, but as pointed out by Rolf Landauer (1927–1999), information itself is a physical quantity. Imagine a physical computing device which has stored $N$ bits of information and is connected to a thermal reservoir of temperature $T$. The bits can be either one or zero. Now we decide to physically erase that information. Erasure must be irreversible. There must be no vestige of the original stored information left in the erased state of the system. Let us erase the information by resetting all the bits to zero.⁵ Then this irreversible process reduces the number of states of the system by a factor of $2^N$, so that $\ln \Omega$ falls by $\ln 2^N$ and hence the entropy of the system goes down by $N k_{\mathrm{B}} \ln 2$, or $k_{\mathrm{B}} \ln 2$ per bit. For the total entropy of the Universe not to decrease, the entropy of the surroundings must go up by $k_{\mathrm{B}} \ln 2$ per bit and so we must dissipate heat in the surroundings equal to $k_{\mathrm{B}} T \ln 2$ per bit erased.

This connection between entropy and information helps us in our understanding of Maxwell's demon discussed in Section 14.7. By performing computations about molecules and their velocities, the demon has to store information. Each bit of information is associated with entropy, as becomes clear when the demon has to free up some space on its hard disk to continue computing. The process of erasing one bit of information gives rise to an increase of entropy of $k_{\mathrm{B}} \ln 2$. If Maxwell's demon reverses the Joule expansion of 1 mole of gas, it might therefore seem like it has decreased the entropy of the Universe by $N_{\mathrm{A}} k_{\mathrm{B}} \ln 2 = R \ln 2$, but it will have had to store at least $N_{\mathrm{A}}$ bits of information to do this.
Assuming that Maxwell’s demons only have on-board a storage capacity of a few hundred gigabytes, which is much less than NA bits, the demon will have had to erase its disk many many times in the process of its operation, thus leading to an increase in entropy of the Universe which at least equals, and probably outweighs, the decrease of entropy of the Universe it was aiming to achieve. If the demon is somehow ﬁtted with a vast on-board memory so that it doesn’t have to erase its memory to do the computation, then the increase in entropy of the Universe can be delayed until the demon needs to free up some memory space. Eventually, one supposes, as the demon begins to age and becomes forgetful, the Universe will reclaim all that entropy!
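To get a feel for the magnitudes involved, the following sketch evaluates the heat dissipated per erased bit and the demon's bookkeeping for one mole of gas. The temperature of 300 K and the 500-gigabyte disk are hypothetical values chosen for illustration; the text says only 'a few hundred gigabytes'.

```python
import math

K_B = 1.380649e-23   # Boltzmann constant in J/K
N_A = 6.02214076e23  # Avogadro constant in 1/mol

def landauer_heat(temperature_kelvin, bits=1):
    """Minimum heat in joules dissipated on erasing `bits` bits: k_B T ln 2 each."""
    return bits * K_B * temperature_kelvin * math.log(2)

temperature = 300.0                          # assumed reservoir temperature in K
heat_per_bit = landauer_heat(temperature)

entropy_decrease = N_A * K_B * math.log(2)   # R ln 2 for one mole, in J/K
disk_bits = 500e9 * 8                        # hypothetical 500-gigabyte disk
erasures = N_A / disk_bits                   # full-disk erasures to handle N_A bits

print(f"heat per erased bit at {temperature:.0f} K: {heat_per_bit:.3e} J")
print(f"entropy decrease for one mole: {entropy_decrease:.3f} J/K")
print(f"disk erasures needed: {erasures:.2e}")
```

The heat per bit is tiny, but the number of bits needed to track a mole of molecules is so large that the disk would have to be wiped of order $10^{11}$ times.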

15.3

Data compression

Information must be stored, or sometimes transmitted from one place to another. It is therefore useful if it can be compressed down to its minimum possible size. This raises the question of what the actual irreducible amount of real information in a particular block of data is; many messages, political speeches, and even sometimes book chapters, contain large amounts of extraneous padding that is not really needed. Of course, when we compress a file down on a computer we often get something which is unreadable to human beings. The English language has various quirks, such as when you see a letter 'q' it is almost always followed by a 'u', so is that 'u' really needed when you know it is coming? A good data compression algorithm will get rid of extra things like that, plus much more besides. Hence, the question of how many bits are in a given source of data seems like a useful question for computer scientists to attempt to answer; in fact we will see it has implications for physics! We will not prove Shannon's noiseless channel coding theorem here, but we will motivate it and then state it.

Example 15.3 Let us consider the simplest case in which our data are stored in the form of the binary digits '0' and '1'. Let us further suppose that the data contain '0' with probability $P$ and '1' with probability $1 - P$. If $P = \frac{1}{2}$ then our data cannot really be compressed, as each bit of data contains real information. Let us now suppose that $P = 0.9$ so that the data contain more 0's than 1's. In this case, the data contain less information, and it is not hard to find a way of taking advantage of this. For example, let us read the data into our compression algorithm in pairs of bits, rather than one bit at a time, and make the following transformations:

    00  →  0
    10  →  10
    01  →  110
    11  →  1110

In each of the transformations, we end on a single '0', which lets the decompression algorithm know that it can start reading the next sequence. Now, of course, although the pair of symbols '00' has been compressed to '0', saving a bit, the pair of symbols '01' has been enlarged to '110' and '11' has been even more enlarged to '1110', costing 1 extra or 2 extra bits respectively. However, '00' is very likely to occur (probability 0.81) while '01' and '11' are much less likely to occur (probabilities 0.09 and 0.01 respectively), so overall we save bits using this compression scheme.
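The scheme of Example 15.3 is simple enough to implement directly. The sketch below encodes and decodes with the four transformations given above and computes the expected number of coded bits per input pair; the function names and the sample message are hypothetical.

```python
CODE = {"00": "0", "10": "10", "01": "110", "11": "1110"}
DECODE = {codeword: pair for pair, codeword in CODE.items()}

def compress(bits):
    """Encode a string of '0'/'1' characters of even length, two at a time."""
    if len(bits) % 2:
        raise ValueError("expected an even number of bits")
    return "".join(CODE[bits[i:i + 2]] for i in range(0, len(bits), 2))

def decompress(coded):
    """Every codeword ends at its first '0', so split there and look it up."""
    pairs, word = [], ""
    for bit in coded:
        word += bit
        if bit == "0":
            pairs.append(DECODE[word])
            word = ""
    return "".join(pairs)

def expected_bits_per_pair(p_zero):
    """Average codeword length when each bit is '0' with probability p_zero."""
    prob = {"0": p_zero, "1": 1.0 - p_zero}
    return sum(prob[a] * prob[b] * len(CODE[a + b]) for a in "01" for b in "01")

message = "0000100000010011"  # hypothetical data, mostly zeros
coded = compress(message)
print(message, "->", coded, "->", decompress(coded))
print("bits per pair at P = 0.9:", expected_bits_per_pair(0.9))
print("bits per pair at P = 0.5:", expected_bits_per_pair(0.5))
```

For $P = 0.9$ the average is about 1.3 coded bits per pair of input bits, a saving; for $P = \frac{1}{2}$ the same code needs 2.5 bits per pair, so it makes incompressible data longer.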

This example gives us a clue as to how to compress data more generally. The aim is to identify in a sequence of data what the typical sequences are and then eﬃciently code only those. When the amount of data becomes very large, then anything other than these typical sequences is very unlikely to occur. Because there are fewer typical sequences than there are sequences in general, a saving can be made. Hence, let us divide up some data into sequences of length n. Assuming the elements in the data do not depend on each other, then the


probability of finding a sequence $x_1, x_2, \ldots, x_n$ is
$$P(x_1, x_2, \ldots, x_n) = P(x_1)P(x_2)\cdots P(x_n) \approx P^{nP}(1-P)^{n(1-P)}, \tag{15.4}$$
for typical sequences. Taking logarithms to base 2 of both sides gives
$$-\log_2 P(x_1, x_2, \ldots, x_n) \approx -nP\log_2 P - n(1-P)\log_2(1-P) = nS, \tag{15.5}$$
where $S$ is the entropy for a Bernoulli trial with probability $P$. Hence
$$P(x_1, x_2, \ldots, x_n) \approx \frac{1}{2^{nS}}. \tag{15.6}$$

This shows that there are at most only $2^{nS}$ typical sequences and hence it only requires $nS$ bits to code them. As $n$ becomes larger, and the typical sequences become longer, the probability of this scheme failing becomes smaller and smaller. A compression algorithm will take a typical sequence of $n$ terms $x_1, x_2, \ldots, x_n$ and turn them into a string of length $nR$. Hence, the smaller $R$ is, the greater the compression. Shannon’s noiseless channel coding theorem states that if we have a source of information with entropy $S$, and if $R > S$, then there exists a reliable compression scheme of compression factor $R$. Conversely, if $R < S$ then any compression scheme will not be reliable. Thus the entropy $S$ sets the ultimate compression limit on a set of data.
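The counting argument can be illustrated numerically. The sketch below, written in Python under the simplifying assumption that "typical" means exactly $nP$ zeros (the typical set proper also includes nearby counts), compares the logarithm of the number of such sequences with $nS$.

```python
import math

def entropy_bits(p: float) -> float:
    """Entropy in bits of a Bernoulli trial with probability p."""
    return -p * math.log2(p) - (1 - p) * math.log2(1 - p)

def log2_typical_count(n: int, p: float) -> float:
    """log2 of the number of length-n sequences with exactly round(n*p) zeros."""
    return math.log2(math.comb(n, round(n * p)))

P = 0.9
for n in (100, 1000, 10000):
    ratio = log2_typical_count(n, P) / (n * entropy_bits(P))
    print(n, ratio)   # the ratio stays below 1 and approaches 1 as n grows
```

The ratio staying below one is consistent with the statement that there are at most $2^{nS}$ typical sequences.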

15.4 Quantum information

6 The operator Tr means the trace of the following matrix, i.e. the sum of the diagonal elements.

This section shows how the concept of information can be extended to quantum systems and assumes familiarity with the main results of quantum mechanics. In this chapter we have seen that in classical systems the information content is connected with the probability. In quantum systems, these probabilities are replaced by density matrices. A density matrix is used to describe the statistical state of a quantum system, as can arise for a quantum system in thermal equilibrium at finite temperature. A summary of the main results concerning density matrices is given in the box on page 159. For quantum systems, the information is represented by the operator $-k \log\rho$, where $\rho$ is the density matrix; as before we take $k = 1$. Hence the average information, or entropy, would be $\langle -\log\rho \rangle$. This leads to the definition of the von Neumann entropy $S$ as$^{6}$
$$S(\rho) = -\mathrm{Tr}(\rho \log \rho). \tag{15.7}$$
If the eigenvalues of $\rho$ are $\lambda_1, \lambda_2, \ldots$, then the von Neumann entropy becomes
$$S(\rho) = -\sum_i \lambda_i \log \lambda_i, \tag{15.8}$$
which looks like the Shannon entropy.


The density matrix:

• If a quantum system is in one of a number of states $|\psi_i\rangle$ with probability $P_i$, then the density matrix $\rho$ for the system is defined by
$$\rho = \sum_i P_i\, |\psi_i\rangle\langle\psi_i|. \tag{15.9}$$

• As an example, think of a three-state system and think of $|\psi_1\rangle$ as a column vector $\begin{pmatrix} 1 \\ 0 \\ 0 \end{pmatrix}$, and hence $\langle\psi_1|$ as a row vector $(1, 0, 0)$, and similarly for $|\psi_2\rangle$, $\langle\psi_2|$, $|\psi_3\rangle$ and $\langle\psi_3|$. Then
$$\rho = P_1 \begin{pmatrix} 1 & 0 & 0 \\ 0 & 0 & 0 \\ 0 & 0 & 0 \end{pmatrix} + P_2 \begin{pmatrix} 0 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 0 \end{pmatrix} + P_3 \begin{pmatrix} 0 & 0 & 0 \\ 0 & 0 & 0 \\ 0 & 0 & 1 \end{pmatrix} = \begin{pmatrix} P_1 & 0 & 0 \\ 0 & P_2 & 0 \\ 0 & 0 & P_3 \end{pmatrix}. \tag{15.10}$$
This form of the density matrix looks very simple, but this is only because we have expressed it in a very simple basis.

• If $P_j \neq 0$ and $P_{i \neq j} = 0$, then the system is said to be in a pure state and $\rho$ can be written in the simple form
$$\rho = |\psi_j\rangle\langle\psi_j|. \tag{15.11}$$
Otherwise, it is said to be in a mixed state.

• One can show that the expectation value $\langle \hat{A} \rangle$ of a quantum mechanical operator $\hat{A}$ is equal to
$$\langle \hat{A} \rangle = \mathrm{Tr}(\hat{A}\rho). \tag{15.12}$$

• One can also prove that
$$\mathrm{Tr}\,\rho = 1, \tag{15.13}$$
where $\mathrm{Tr}\,\rho$ means the trace of the density matrix. This expresses the fact that the sum of the probabilities must equal unity, and is in fact a special case of eqn 15.12 setting $\hat{A} = 1$.

• One can also show that $\mathrm{Tr}\,\rho^2 \le 1$ with equality if and only if the state is pure.

• For a system in thermal equilibrium at temperature $T$, $P_i$ is given by the Boltzmann factor $e^{-\beta E_i}$ where $E_i$ is an eigenvalue of the Hamiltonian $\hat{H}$. The thermal density matrix $\rho_{\mathrm{th}}$ is
$$\rho_{\mathrm{th}} = \sum_i e^{-\beta E_i}\, |\psi_i\rangle\langle\psi_i| = \exp(-\beta \hat{H}). \tag{15.14}$$
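The trace properties listed in the box can be checked for the three-state example. This is a minimal sketch in Python restricted to real basis vectors; the helper names and the probabilities 0.5, 0.3, 0.2 are illustrative assumptions.

```python
def outer(ket):
    """|psi><psi| for a real column vector given as a list."""
    return [[a * b for b in ket] for a in ket]

def density_matrix(probs, kets):
    """rho = sum_i P_i |psi_i><psi_i| (eqn 15.9), for real kets."""
    n = len(kets[0])
    rho = [[0.0] * n for _ in range(n)]
    for p, ket in zip(probs, kets):
        proj = outer(ket)
        for r in range(n):
            for c in range(n):
                rho[r][c] += p * proj[r][c]
    return rho

def trace(m):
    """Sum of the diagonal elements."""
    return sum(m[i][i] for i in range(len(m)))

def matmul(a, b):
    """Product of two square matrices stored as nested lists."""
    n = len(a)
    return [[sum(a[r][k] * b[k][c] for k in range(n)) for c in range(n)] for r in range(n)]

basis = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]
mixed = density_matrix([0.5, 0.3, 0.2], basis)   # a mixed state
pure = density_matrix([1.0, 0.0, 0.0], basis)    # a pure state

print(trace(mixed), trace(matmul(mixed, mixed)))  # Tr rho = 1, Tr rho^2 = 0.38 < 1
print(trace(pure), trace(matmul(pure, pure)))     # Tr rho = 1, Tr rho^2 = 1
```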


7 A pure state is defined in the box on page 159.

8 Note that we take $0 \ln 0 = 0$.

9 An arbitrary qubit can be written as $|\psi\rangle = \alpha|0\rangle + \beta|1\rangle$ where $|\alpha|^2 + |\beta|^2 = 1$.

10 Einstein called entanglement ‘spooky action at a distance’, and used it to argue against the Copenhagen interpretation of quantum mechanics and to argue that quantum mechanics is incomplete.

11 It turns out that a unitary operator, such as the time-evolution operator, acting on a state leaves the entropy unchanged. This is akin to our results in thermodynamics that reversibility is connected with the preservation of entropy.

Example 15.4 Show that the entropy of a pure state$^{7}$ is zero. How can you maximize the entropy?

Solution: (i) As shown in the box on page 159, the trace of the density matrix is equal to one ($\mathrm{Tr}\,\rho = 1$), and hence
$$\sum_i \lambda_i = 1. \tag{15.15}$$
For a pure state only one eigenvalue will be one and all the other eigenvalues will be zero, and hence$^{8}$ $S(\rho) = 0$, i.e. the entropy of a pure state is zero. This is not surprising, since for a pure state there is no ‘uncertainty’ about the state of the system.

(ii) The entropy is maximized when $\lambda_i = 1/n$ for all $i$, where $n$ is the dimension of the density matrix. In this case, the entropy is $S(\rho) = n \times \left(-\frac{1}{n}\log\frac{1}{n}\right) = \log n$. This corresponds to there being maximal uncertainty in its precise state.
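Both results of Example 15.4 are easy to confirm from the eigenvalue form of the von Neumann entropy (eqn 15.8). The following Python sketch uses natural logarithms and three illustrative sets of eigenvalues for $n = 4$; the function name is an assumption made for this example.

```python
import math

def von_neumann_entropy(eigenvalues):
    """S = -sum_i lambda_i log(lambda_i), with 0 log 0 taken as 0."""
    return sum(-lam * math.log(lam) for lam in eigenvalues if lam > 0)

n = 4
pure = [1.0, 0.0, 0.0, 0.0]     # one eigenvalue equal to one
uniform = [1.0 / n] * n         # all eigenvalues equal to 1/n
skewed = [0.7, 0.1, 0.1, 0.1]   # an intermediate mixed state

print(von_neumann_entropy(pure))     # 0.0: no uncertainty
print(von_neumann_entropy(uniform))  # log 4, the maximum for n = 4
print(von_neumann_entropy(skewed))   # strictly between 0 and log 4
```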

Classical information is made up only of sequences of 0’s and 1’s (in a sense, all information can be broken down into a series of ‘yes/no’ questions). Quantum information is composed of quantum bits (known as qubits), which are two-level quantum systems which can be represented by linear combinations$^{9}$ of the states $|0\rangle$ and $|1\rangle$. Quantum mechanical states can also be entangled with each other. The phenomenon of entanglement$^{10}$ has no classical counterpart. Quantum information therefore also contains entangled superpositions such as $(|01\rangle + |10\rangle)/\sqrt{2}$. Here the quantum states of two objects must be described with reference to each other; measurement of the first bit in the sequence to be a 0 forces the second bit to be 1; if the measurement of the first bit gives a 1, the second bit has to be 0; these correlations persist in an entangled quantum system even if the individual objects encoding each bit are spatially separated. Entangled systems cannot be described by pure states of the individual subsystems, and this is where entropy plays a rôle, as a quantifier of the degree of mixing of states. If the overall system is pure, the entropy of its subsystems can be used to measure its degree of entanglement with the other subsystems.$^{11}$ In this text we do not have space to provide many details about the subject of quantum information, which is a rapidly developing area of current research. Suffice it to say that the processing of information in quantum mechanical systems has some intriguing facets which are not present in the study of classical information. Entanglement of bits is just one example. As another example, the no-cloning theorem states that it is impossible to make a copy of non-orthogonal quantum mechanical states (for classical systems, there is no physical mechanism to stop you copying information, only copyright laws). All of these features lead to the very rich structure of quantum information theory.
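The use of subsystem entropy as a measure of entanglement can be made concrete for two qubits. The sketch below, in Python, traces out the second qubit of a pure two-qubit state and evaluates the entropy of what remains; the closed-form eigenvalues assume a 2×2 density matrix of unit trace, and all names are illustrative.

```python
import math

def reduced_density_matrix(amplitudes):
    """Trace out the second qubit of a two-qubit pure state.

    amplitudes = [a00, a01, a10, a11] are the coefficients of |00>, |01>, |10>, |11>.
    """
    psi = [amplitudes[0:2], amplitudes[2:4]]   # psi[i][k]: first qubit i, second qubit k
    return [[sum(psi[i][k] * psi[j][k].conjugate() for k in range(2))
             for j in range(2)] for i in range(2)]

def entropy_bits_2x2(rho):
    """von Neumann entropy (base 2) of a 2x2 density matrix with unit trace."""
    a, d, b = rho[0][0].real, rho[1][1].real, rho[0][1]
    gap = math.sqrt((a - d) ** 2 + 4 * abs(b) ** 2)
    eigenvalues = [(1 + gap) / 2, (1 - gap) / 2]
    return sum(-lam * math.log2(lam) for lam in eigenvalues if lam > 1e-12)

s = 1 / math.sqrt(2)
entangled = [0, s, s, 0]   # (|01> + |10>)/sqrt(2)
product = [0, 1, 0, 0]     # |01>, a product state with no entanglement

print(entropy_bits_2x2(reduced_density_matrix(entangled)))  # close to 1 bit
print(entropy_bits_2x2(reduced_density_matrix(product)))    # 0.0
```

The overall state is pure in both cases, yet the first qubit on its own is maximally mixed only in the entangled case.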


Chapter summary

• The information $Q$ is given by $Q = -\ln P$ where $P$ is the probability.
• The entropy is the average information $S = \langle Q \rangle = -\sum_i P_i \log P_i$.
• The quantum mechanical generalization of this is the von Neumann entropy given by $S(\rho) = -\mathrm{Tr}(\rho \log \rho)$ where $\rho$ is the density matrix.

Further reading

The results which we have stated in this chapter concerning Shannon’s coding theorems, and which we considered only for the case of Bernoulli trials, i.e. for binary outputs, can be proved for the general case. Shannon also studied communication over noisy channels in which the presence of noise randomly flips bits with a certain probability. In this case it is also possible to show how much information can be reliably transmitted using such a channel (essentially how many times you have to ‘repeat’ the message to get yourself ‘heard’, though actually this is done using error-correcting codes). Further information may be found in Feynman (1996) and MacKay (2003). An excellent account of the problem of Maxwell’s demon may be found in Leff and Rex (2003). Quantum information theory has become a very hot research topic in the last few years and an excellent introduction is Nielsen and Chuang (2000).

Exercises

(15.1) In a typical microchip, a bit is stored by a 5 fF capacitor using a voltage of 3 V. Calculate the energy stored in eV per bit and compare this with the minimum heat dissipation by erasure, which is $k_{\mathrm{B}} T \ln 2$ per bit, at room temperature.

(15.2) A particular logic gate takes two binary inputs $A$ and $B$ and has two binary outputs $A'$ and $B'$. Its truth table is

| $A$ | $B$ | $A'$ | $B'$ |
|---|---|---|---|
| 0 | 0 | 1 | 1 |
| 0 | 1 | 1 | 0 |
| 1 | 0 | 0 | 1 |
| 1 | 1 | 0 | 0 |

and this is produced by $A' = \mathrm{NOT}\ A$ and $B' = \mathrm{NOT}\ B$. The input has a Shannon entropy of 2 bits. Show that the output has a Shannon entropy of 2 bits. A second logic gate has a truth table given by

| $A$ | $B$ | $A'$ | $B'$ |
|---|---|---|---|
| 0 | 0 | 0 | 0 |
| 0 | 1 | 1 | 0 |
| 1 | 0 | 1 | 0 |
| 1 | 1 | 1 | 1 |

This can be achieved using $A' = A\ \mathrm{OR}\ B$ and $B' = A\ \mathrm{AND}\ B$. Show that the output now has an entropy of $\frac{3}{2}$ bits. What is the difference between the two logic gates?

(15.3) Maximize the Shannon entropy $S = -k\sum_i P_i \log P_i$ subject to the constraints that $\sum_i P_i = 1$ and $\langle f(x) \rangle = \sum_i P_i f(x_i)$ and show that
$$P_i = \frac{1}{Z(\beta)}\, e^{-\beta f(x_i)}, \tag{15.16}$$
$$Z(\beta) = \sum_i e^{-\beta f(x_i)}, \tag{15.17}$$
$$\langle f(x) \rangle = -\frac{d}{d\beta} \ln Z(\beta). \tag{15.18}$$

(15.4) Noise in a communication channel flips bits at random with probability $P$. Argue that the entropy associated with this process is
$$S = -P \log P - (1 - P)\log(1 - P). \tag{15.19}$$
It turns out that the rate $R$ at which we can pass information along this noisy channel is $1 - S$. (This is an application of Shannon’s noisy channel coding theorem, and a nice proof of this theorem is given on page 548 of Nielsen and Chuang (2000).)

(15.5) (a) The relative entropy measures the closeness of two probability distributions $P$ and $Q$ and is defined by
$$S(P\|Q) = \sum_i P_i \log\left(\frac{P_i}{Q_i}\right) = -S_P - \sum_i P_i \log Q_i, \tag{15.20}$$
where $S_P = -\sum_i P_i \log P_i$. Show that $S(P\|Q) \ge 0$ with equality if and only if $P_i = Q_i$ for all $i$.

(b) If $i$ takes $N$ values with probability $P_i$, then show that
$$S(P\|Q) = -S_P + \log N, \tag{15.21}$$
where $Q_i = 1/N$ for all $i$. Hence show that
$$S_P \le \log N, \tag{15.22}$$
with equality if and only if $P_i$ is uniformly distributed between all $N$ outcomes.

Part VI Thermodynamics in action

In this part we use the laws of thermodynamics developed in Part V to solve real problems in thermodynamics. Part VI is structured as follows:

• In Chapter 16 we derive various functions of state called thermodynamic potentials, in particular the enthalpy, Helmholtz function and the Gibbs function, and show how they can be used to study thermodynamic systems under various constraints. We introduce the Maxwell relations, which allow us to relate various partial differentials in thermal physics.
• In Chapter 17 we show that the results derived so far can be extended straightforwardly to a variety of different thermodynamic systems other than the ideal gas.
• In Chapter 18 we introduce the third law of thermodynamics, which is really an addendum to the second law, and explain some of its consequences.

16 Thermodynamic potentials

16.1 Internal energy, U
16.2 Enthalpy, H
16.3 Helmholtz function, F
16.4 Gibbs function, G
16.5 Availability
16.6 Maxwell’s relations
Chapter summary
Exercises

The internal energy U of a system is a function of state, which means that a system undergoes the same change in U when we move it from one equilibrium state to another, irrespective of which route we take through parameter space. This makes U a very useful quantity, though not a uniquely useful quantity. In fact, we can make a number of other functions of state, simply by adding to U various other combinations of the functions of state p, V , T and S in such a way as to give the resulting quantity the dimensions of energy. These new functions of state are called thermodynamic potentials, and examples include U + T S, U − pV , U + 2pV − 3T S. However, most thermodynamic potentials that one could pick are really not very useful (including the ones we’ve just used as examples!) but three of them are extremely useful and are given special symbols: H = U + pV , F = U − T S and G = U + pV − T S. In this chapter, we will explore why these three quantities are so useful. First, however, we will review some properties concerning the internal energy U .

16.1 Internal energy, U

Let us review the results concerning the internal energy that were derived in Section 14.3. Changes in the internal energy $U$ of a system are given by the first law of thermodynamics written in the form (eqn 14.17):
$$dU = T\,dS - p\,dV. \tag{16.1}$$
This equation shows that the natural variables$^{1}$ to describe $U$ are $S$ and $V$, since changes in $U$ are due to changes in $S$ and/or $V$. Hence we write $U = U(S, V)$ to show that $U$ is a function of $S$ and $V$. Moreover, if $S$ and $V$ are held constant for the system, then
$$dU = 0, \tag{16.2}$$
which is the same as saying that $U$ is a constant. Equation 16.1 implies that the temperature $T$ can be expressed as a differential of $U$ using
$$T = \left(\frac{\partial U}{\partial S}\right)_V, \tag{16.3}$$
and similarly the pressure $p$ can be expressed as
$$p = -\left(\frac{\partial U}{\partial V}\right)_S. \tag{16.4}$$

1 See Section 14.3.
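Equations 16.3 and 16.4 can be checked numerically for a simple model. The Python sketch below assumes one mole of monatomic ideal gas with entropy $S = C_V \ln T + R \ln V + \text{constant}$, which lets $U$ be written in its natural variables; the reference constant, step sizes and state point are illustrative assumptions.

```python
import math

R = 8.314            # J K^-1 mol^-1
C_V = 1.5 * R        # monatomic ideal gas (assumption)
S_REF = 0.0          # arbitrary entropy constant (assumption)

def temperature(S, V):
    """Invert S = C_V ln T + R ln V + S_REF for T."""
    return math.exp((S - S_REF - R * math.log(V)) / C_V)

def U(S, V):
    """Internal energy in its natural variables, U = C_V T(S, V)."""
    return C_V * temperature(S, V)

def partial(f, x, y, wrt, h):
    """Central-difference partial derivative of f(x, y) with respect to argument wrt."""
    if wrt == 0:
        return (f(x + h, y) - f(x - h, y)) / (2 * h)
    return (f(x, y + h) - f(x, y - h)) / (2 * h)

V0 = 0.0248                                              # m^3
S0 = C_V * math.log(300.0) + R * math.log(V0) + S_REF    # entropy at T = 300 K
T_num = partial(U, S0, V0, wrt=0, h=1e-4)                # (dU/dS)_V
p_num = -partial(U, S0, V0, wrt=1, h=1e-7)               # -(dU/dV)_S
print(T_num, temperature(S0, V0))                        # both close to 300 K
print(p_num, R * 300.0 / V0)                             # both close to RT/V
```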


We also have that for isochoric processes (where isochoric means that $V$ is constant),
$$dU = T\,dS, \tag{16.5}$$
and for reversible$^{2}$ isochoric processes
$$dU = \bar{d}Q_{\mathrm{rev}} = C_V\,dT, \tag{16.6}$$
and hence
$$\Delta U = \int_{T_1}^{T_2} C_V\,dT. \tag{16.7}$$

2 For a reversible process, $\bar{d}Q = T\,dS$, see Section 14.3.

This is only true for systems held at constant volume; we would like to be able to extend this to systems held at constant pressure (an easier constraint to apply experimentally), and this can be achieved using the thermodynamic potential called enthalpy which we describe next.

16.2 Enthalpy, H

We define the enthalpy $H$ by
$$H = U + pV. \tag{16.8}$$
This definition together with eqn 16.1 implies that
$$dH = T\,dS - p\,dV + p\,dV + V\,dp = T\,dS + V\,dp. \tag{16.9}$$
The natural variables for $H$ are thus $S$ and $p$, and we have that $H = H(S, p)$. We can therefore immediately write down that for an isobaric (i.e. constant pressure) process,
$$dH = T\,dS, \tag{16.10}$$
and for a reversible isobaric process
$$dH = \bar{d}Q_{\mathrm{rev}} = C_p\,dT, \tag{16.11}$$
so that
$$\Delta H = \int_{T_1}^{T_2} C_p\,dT. \tag{16.12}$$
This shows the importance of $H$, that for reversible isobaric processes the enthalpy represents the heat absorbed by the system.$^{3}$ Isobaric conditions are relatively easy to obtain: an experiment which is open to the air in a laboratory is usually at constant pressure since pressure is provided by the atmosphere.$^{4}$ We also conclude from eqn 16.9 that if both $S$ and $p$ are constant, we have that $dH = 0$. Equation 16.9 also implies that
$$T = \left(\frac{\partial H}{\partial S}\right)_p, \tag{16.13}$$
and
$$V = \left(\frac{\partial H}{\partial p}\right)_S. \tag{16.14}$$

3 If you add heat to the system at constant pressure, the enthalpy $H$ of the system goes up. If heat is provided by the system to its surroundings, $H$ goes down.

4 At a given latitude, the atmosphere provides a constant pressure, small changes due to weather fronts notwithstanding.

Both U and H suﬀer from the drawback that one of their natural variables is the entropy S, which is not a very easy parameter to vary in a lab. It would be more convenient if we could substitute that for the temperature T , which is, of course, a much easier quantity to control and to vary. This is accomplished for both of our next two functions of state, the Helmholtz and Gibbs functions.

16.3 Helmholtz function, F

We define the Helmholtz function$^{5}$ using
$$F = U - TS. \tag{16.15}$$

5 This is sometimes called the Helmholtz free energy.

Hence we find that
$$dF = T\,dS - p\,dV - T\,dS - S\,dT = -S\,dT - p\,dV. \tag{16.16}$$
This implies that the natural variables for $F$ are $V$ and $T$, and we can therefore write $F = F(T, V)$. For an isothermal process (constant $T$), we can simplify eqn 16.16 further and write that
$$dF = -p\,dV, \tag{16.17}$$
and hence
$$\Delta F = -\int_{V_1}^{V_2} p\,dV. \tag{16.18}$$
Hence a positive change in $F$ represents reversible work done on the system by the surroundings, while a negative change in $F$ represents reversible work done on the surroundings by the system. As we shall see in Section 16.5, $F$ represents the maximum amount of work you can get out of a system at constant temperature, since the system will do work on its surroundings until its Helmholtz function reaches a minimum. Equation 16.16 implies that the entropy $S$ can be written as
$$S = -\left(\frac{\partial F}{\partial T}\right)_V, \tag{16.19}$$
and the pressure $p$ as
$$p = -\left(\frac{\partial F}{\partial V}\right)_T. \tag{16.20}$$
If $T$ and $V$ are constant, we have that $dF = 0$ and $F$ is a constant.
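Equation 16.18 can be evaluated for an isothermal expansion. The following Python sketch assumes one mole of ideal gas and integrates $-p\,dV$ with the trapezium rule; the temperature, volumes and step count are illustrative.

```python
import math

R = 8.314  # J K^-1 mol^-1

def pressure(V, T):
    """Ideal gas law for one mole."""
    return R * T / V

def delta_F_isothermal(V1, V2, T, steps=10000):
    """Delta F = -integral of p dV at fixed T (eqn 16.18), by the trapezium rule."""
    dV = (V2 - V1) / steps
    total = 0.5 * (pressure(V1, T) + pressure(V2, T))
    total += sum(pressure(V1 + i * dV, T) for i in range(1, steps))
    return -total * dV

T, V1, V2 = 300.0, 0.010, 0.020
print(delta_F_isothermal(V1, V2, T))   # negative: the gas does work on the surroundings
print(-R * T * math.log(V2 / V1))      # analytic result -RT ln 2, about -1729 J
```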

16.4 Gibbs function, G

We define the Gibbs function$^{6}$ using
$$G = H - TS. \tag{16.21}$$

6 This is sometimes called the Gibbs free energy.

Hence we find that
$$dG = T\,dS + V\,dp - T\,dS - S\,dT = -S\,dT + V\,dp, \tag{16.22}$$
and hence the natural variables of $G$ are $T$ and $p$. [Hence we can write $G = G(T, p)$.] Having $T$ and $p$ as natural variables is particularly convenient as $T$ and $p$ are the easiest quantities to manipulate and control for most experimental systems. In particular, note that if $T$ and $p$ are constant, $dG = 0$. Hence $G$ is conserved in any isothermal isobaric process.$^{7}$ The expression in eqn 16.22 allows us to write down expressions for entropy and volume as follows:
$$S = -\left(\frac{\partial G}{\partial T}\right)_p \tag{16.23}$$
and
$$V = \left(\frac{\partial G}{\partial p}\right)_T. \tag{16.24}$$

7 For example, at a phase transition between two different phases (call them phase 1 and phase 2), there is phase coexistence between the two phases at the same pressure at the transition temperature. Hence the Gibbs functions for phase 1 and phase 2 must be equal at the phase transition. This will be particularly useful for us in Chapter 28.

We have now deﬁned the four main thermodynamic potentials which are useful in much of thermal physics: the internal energy U , the enthalpy H, the Helmholtz function F and the Gibbs function G. Before proceeding further, we summarize the main equations which we have used so far.

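| Function of state | Differential | Natural variables | First derivatives |
|---|---|---|---|
| Internal energy, $U$ | $dU = T\,dS - p\,dV$ | $U = U(S, V)$ | $T = \left(\frac{\partial U}{\partial S}\right)_V$, $p = -\left(\frac{\partial U}{\partial V}\right)_S$ |
| Enthalpy, $H = U + pV$ | $dH = T\,dS + V\,dp$ | $H = H(S, p)$ | $T = \left(\frac{\partial H}{\partial S}\right)_p$, $V = \left(\frac{\partial H}{\partial p}\right)_S$ |
| Helmholtz function, $F = U - TS$ | $dF = -S\,dT - p\,dV$ | $F = F(T, V)$ | $S = -\left(\frac{\partial F}{\partial T}\right)_V$, $p = -\left(\frac{\partial F}{\partial V}\right)_T$ |
| Gibbs function, $G = H - TS$ | $dG = -S\,dT + V\,dp$ | $G = G(T, p)$ | $S = -\left(\frac{\partial G}{\partial T}\right)_p$, $V = \left(\frac{\partial G}{\partial p}\right)_T$ |

Note that to derive these equations quickly, all you need to do is memorize the definitions of $H$, $F$ and $G$ and the first law in the form $dU = T\,dS - p\,dV$ and the rest can be written down straightforwardly.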


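Example 16.1

Show that $U = -T^2 \left(\frac{\partial (F/T)}{\partial T}\right)_V$ and $H = -T^2 \left(\frac{\partial (G/T)}{\partial T}\right)_p$.

Solution: Using the expressions
$$S = -\left(\frac{\partial F}{\partial T}\right)_V \quad\text{and}\quad S = -\left(\frac{\partial G}{\partial T}\right)_p,$$
we can write down
$$U = F + TS = F - T\left(\frac{\partial F}{\partial T}\right)_V = -T^2 \left(\frac{\partial (F/T)}{\partial T}\right)_V, \tag{16.25}$$
and
$$H = G + TS = G - T\left(\frac{\partial G}{\partial T}\right)_p = -T^2 \left(\frac{\partial (G/T)}{\partial T}\right)_p. \tag{16.26}$$
These equations are known as the Gibbs–Helmholtz equations and are useful in chemical thermodynamics.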
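The first Gibbs–Helmholtz equation (eqn 16.25) can be tested numerically. This Python sketch assumes one mole of monatomic ideal gas, for which $U = C_V T$ and $S = C_V \ln T + R \ln V + \text{constant}$; the constant and the finite-difference step are illustrative assumptions.

```python
import math

R = 8.314
C_V = 1.5 * R     # monatomic ideal gas (assumption)
S_REF = 0.0       # arbitrary entropy constant (assumption)

def F(T, V):
    """Helmholtz function F = U - T S with U = C_V T and S = C_V ln T + R ln V + S_REF."""
    S = C_V * math.log(T) + R * math.log(V) + S_REF
    return C_V * T - T * S

def U_from_F(T, V, h=1e-3):
    """U = -T^2 [d(F/T)/dT]_V evaluated by a central difference."""
    d = (F(T + h, V) / (T + h) - F(T - h, V) / (T - h)) / (2 * h)
    return -T * T * d

T, V = 300.0, 0.0248
print(U_from_F(T, V), C_V * T)   # both close to 3741 J
```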

16.5 Availability

We want to try now to work out how to find the equilibrium properties of a system when it is placed in contact with its surroundings. In general, a system is able to exchange heat with its surroundings and also to do work on its surroundings. Let us now consider a system in contact with surroundings which are at temperature $T_0$ and pressure $p_0$ (see Fig. 16.1). Let us consider what happens when we transfer energy $dU$ and volume $dV$ from the surroundings to the system. The internal energy of the surroundings changes by $dU_0$, where
$$dU_0 = -dU = T_0\,dS_0 - p_0(-dV), \tag{16.27}$$
where the minus signs express the fact that the energy and volume in the surroundings are decreasing.

Fig. 16.1 A system in contact with surroundings at temperature $T_0$ and pressure $p_0$.

We can rearrange this expression to give the change of entropy in the surroundings as
$$dS_0 = -\left[\frac{dU + p_0\,dV}{T_0}\right]. \tag{16.28}$$
If the entropy of the system changes by $dS$, then the total change of entropy $dS_{\mathrm{tot}}$ is
$$dS_{\mathrm{tot}} = dS_0 + dS, \tag{16.29}$$
and the second law of thermodynamics implies that $dS_{\mathrm{tot}} \ge 0$. Using eqns 16.28 and 16.29, we have that
$$T_0\,dS_{\mathrm{tot}} = -\left[dU + p_0\,dV - T_0\,dS\right] \ge 0. \tag{16.30}$$
Hence
$$dU + p_0\,dV - T_0\,dS \le 0. \tag{16.31}$$
We now define the availability $A$ by
$$A = U + p_0 V - T_0 S, \tag{16.32}$$
and because $p_0$ and $T_0$ are constants, then
$$dA = dU + p_0\,dV - T_0\,dS. \tag{16.33}$$
Hence eqn 16.31 becomes
$$dA \le 0. \tag{16.34}$$

We have derived this inequality from the second law of thermodynamics. It demonstrates that changes in $A$ are never positive. As a system settles down to equilibrium, any changes will always force $A$ downwards. Once the system has reached equilibrium, $A$ will be constant at this minimum level. Hence equilibrium can only be achieved by minimizing $A$. However, the type of equilibrium achieved depends on the nature of the constraints, as we will now show.

• System with fixed entropy and volume: In this case $dS = dV = 0$ and hence eqns 16.33 and 16.34 imply
$$dA = dU \le 0, \tag{16.35}$$
so we must minimize $U$ to find the equilibrium state of this system.

• System with fixed entropy and pressure: In this case $dS = dp = 0$, and hence eqn 16.33 implies
$$dA = dU + p_0\,dV. \tag{16.36}$$
The change in enthalpy is $dH = dU + p\,dV + V\,dp$, and since $p = p_0$ and $dp = 0$, we have that
$$dH = dU + p_0\,dV, \tag{16.37}$$
and hence
$$dA = dH \le 0, \tag{16.38}$$
so we must minimize $H$ to find the equilibrium state.

• System thermally isolated and with fixed volume: Since no heat can enter the system and the system can do no work on its surroundings, $dU = 0$. Hence eqn 16.33 becomes $dA = -T_0\,dS$ and hence $dA \le 0$ implies that $dS \ge 0$. Thus we must maximize $S$ to find the equilibrium state.

• System with fixed volume at constant temperature: $dA = dU - T_0\,dS \le 0$, but because $dT = 0$ and $dF = dU - T_0\,dS - S\,dT = dU - T_0\,dS$, we have that
$$dA = dF \le 0, \tag{16.39}$$
so we must minimize $F$ to find the equilibrium state.

• System at constant pressure and temperature: Eqn 16.33 gives $dA = dU - T_0\,dS + p_0\,dV \le 0$. We can write $dG$ (from the definition $G = H - TS$) as
$$dG = dU + p_0\,dV + V\,dp - T_0\,dS - S\,dT = dU - T_0\,dS + p_0\,dV, \tag{16.40}$$
since $dp = dT = 0$, and hence
$$dA = dG \le 0, \tag{16.41}$$
so we must minimize $G$ to find the equilibrium state.
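The statement that equilibrium corresponds to a minimum of $A$ can be illustrated with a brute-force search. The Python sketch below assumes one mole of monatomic ideal gas free to exchange both heat and volume with surroundings at 300 K and 1 bar; the grid ranges are illustrative, and the additive entropy constant is dropped because it only shifts $A$ by a constant.

```python
import math

R = 8.314
C_V = 1.5 * R              # monatomic ideal gas (assumption)
T0, P0 = 300.0, 1.0e5      # surroundings: 300 K and 1 bar (illustrative values)

def availability(T, V):
    """A = U + p0 V - T0 S for one mole, with the entropy constant dropped."""
    U = C_V * T
    S = C_V * math.log(T) + R * math.log(V)
    return U + P0 * V - T0 * S

temps = [250.0 + i for i in range(101)]              # 250 K ... 350 K in 1 K steps
volumes = [0.015 + 1e-4 * j for j in range(201)]     # 0.015 m^3 ... 0.035 m^3
A_min, T_best, V_best = min((availability(T, V), T, V) for T in temps for V in volumes)

print(T_best)                  # 300.0, i.e. T = T0
print(V_best, R * T0 / P0)     # grid point nearest R T0 / p0, i.e. p = p0
```

The minimum sits where the gas has the temperature and pressure of its surroundings, as expected for a system constrained only by contact with the reservoir.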

Example 16.2 Chemistry laboratories are usually at constant pressure. If a chemical reaction is carried out at constant pressure, then by eqn 16.10 we have that
$$\Delta H = \Delta Q, \tag{16.42}$$
and hence $\Delta H$ is the reversible heat added to the system, i.e. the heat absorbed by the reaction. (Recall that our convention is that $\Delta Q$ is the heat entering the system, and in this case the system is the reacting chemicals.)

• If $\Delta H < 0$, the reaction is called exothermic and heat will be emitted.
• If $\Delta H > 0$, the reaction is called endothermic and heat will be absorbed.

However, this does not tell you whether or not a chemical reaction will actually proceed. Usually reactions occur$^{8}$ at constant $T$ and $p$, so if the system is trying to minimize its availability, then we need to consider $\Delta G$. The second law of thermodynamics (via eqn 16.34 and hence eqn 16.41) therefore implies that a chemical system will minimize $G$, so that if $\Delta G < 0$, the reaction may spontaneously occur.$^{9}$

8 The temperature may rise during a reaction, but if the final products cool to the original temperature, one only needs to think about the beginning and end points, since $G$ is a function of state.

9 However, one may also need to consider the kinetics of the reaction. Often a reaction has to pass via a metastable intermediate state which may have a higher Gibbs function, so the system cannot spontaneously lower its Gibbs function without having it slightly raised first. This gives a reaction an activation energy which must be added before the reaction can proceed, even though the completion of the reaction gives you all that energy back and more.
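The distinction drawn in Example 16.2 between the sign of $\Delta H$ and the sign of $\Delta G$ can be illustrated with the melting of ice. The Python sketch below uses $\Delta G = \Delta H - T\Delta S$, which follows from $G = H - TS$ at constant $T$; the molar values are approximate and are assumed not to vary with temperature.

```python
def delta_G(delta_H, delta_S, T):
    """Delta G = Delta H - T Delta S for a process at constant T and p."""
    return delta_H - T * delta_S

# Melting of ice, approximate molar values (assumed independent of temperature).
DELTA_H = 6010.0             # J mol^-1, positive, so the process is endothermic
T_MELT = 273.15              # K
DELTA_S = DELTA_H / T_MELT   # J K^-1 mol^-1, chosen so that Delta G = 0 at the melting point

for T in (263.15, 283.15):
    dG = delta_G(DELTA_H, DELTA_S, T)
    verdict = "may proceed spontaneously" if dG < 0 else "does not proceed spontaneously"
    print(T, verdict)   # Delta G is positive below the melting point and negative above it
```

Melting is endothermic at both temperatures, yet it is favoured only above the melting point, which is why $\Delta G$ rather than $\Delta H$ decides the direction of change.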

16.6 Maxwell’s relations

In this section, we are going to derive four equations which are known as Maxwell’s relations. These equations are very useful in solving problems in thermodynamics, since each one relates a partial differential between quantities that can be hard to measure to a partial differential between quantities that can be much easier to measure. The derivation proceeds along the following lines: a state function $f$ is a function of variables $x$ and $y$. A change in $f$ can be written as
$$df = \left(\frac{\partial f}{\partial x}\right)_y dx + \left(\frac{\partial f}{\partial y}\right)_x dy. \tag{16.43}$$
Because $df$ is an exact differential (see Appendix C.7), we have that
$$\left(\frac{\partial^2 f}{\partial x\,\partial y}\right) = \left(\frac{\partial^2 f}{\partial y\,\partial x}\right). \tag{16.44}$$
Hence writing
$$F_x = \left(\frac{\partial f}{\partial x}\right)_y \quad\text{and}\quad F_y = \left(\frac{\partial f}{\partial y}\right)_x, \tag{16.45}$$
we have that
$$\left(\frac{\partial F_y}{\partial x}\right) = \left(\frac{\partial F_x}{\partial y}\right). \tag{16.46}$$
We can now apply this idea to each of the state functions $U$, $H$, $F$ and $G$ in turn.

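Example 16.3 The Maxwell relation based on $G$ can be derived as follows. We write down an expression for $dG$:
$$dG = -S\,dT + V\,dp. \tag{16.47}$$
We can also write
$$dG = \left(\frac{\partial G}{\partial T}\right)_p dT + \left(\frac{\partial G}{\partial p}\right)_T dp, \tag{16.48}$$
and hence we can write $S = -\left(\frac{\partial G}{\partial T}\right)_p$ and $V = \left(\frac{\partial G}{\partial p}\right)_T$. Because $dG$ is an exact differential, we have that
$$\left(\frac{\partial^2 G}{\partial T\,\partial p}\right) = \left(\frac{\partial^2 G}{\partial p\,\partial T}\right), \tag{16.49}$$
and hence we have the following Maxwell relation:
$$\left(\frac{\partial S}{\partial p}\right)_T = -\left(\frac{\partial V}{\partial T}\right)_p. \tag{16.50}$$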

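This reasoning can be applied to each of the thermodynamic potentials $U$, $H$, $F$ and $G$ to yield the four Maxwell relations:

Maxwell’s relations:
$$\left(\frac{\partial T}{\partial V}\right)_S = -\left(\frac{\partial p}{\partial S}\right)_V \tag{16.51}$$
$$\left(\frac{\partial T}{\partial p}\right)_S = \left(\frac{\partial V}{\partial S}\right)_p \tag{16.52}$$
$$\left(\frac{\partial S}{\partial V}\right)_T = \left(\frac{\partial p}{\partial T}\right)_V \tag{16.53}$$
$$\left(\frac{\partial S}{\partial p}\right)_T = -\left(\frac{\partial V}{\partial T}\right)_p \tag{16.54}$$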
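The Maxwell relation derived from $G$ (eqn 16.54) can be verified numerically by differentiating a single function $G(T, p)$. The Python sketch below assumes one mole of monatomic ideal gas, for which $H = C_p T$ and $S = C_p \ln T - R \ln p + \text{constant}$; the constant and the step sizes are illustrative assumptions.

```python
import math

R = 8.314
C_P = 2.5 * R    # monatomic ideal gas (assumption)
S_REF = 0.0      # arbitrary entropy constant (assumption)

def G(T, p):
    """Gibbs function G = H - T S with H = C_P T and S = C_P ln T - R ln p + S_REF."""
    return C_P * T - T * (C_P * math.log(T) - R * math.log(p) + S_REF)

def S(T, p, h=0.1):
    """S = -(dG/dT)_p by central difference."""
    return -(G(T + h, p) - G(T - h, p)) / (2 * h)

def V(T, p, k=100.0):
    """V = (dG/dp)_T by central difference."""
    return (G(T, p + k) - G(T, p - k)) / (2 * k)

T, p, h, k = 300.0, 1.0e5, 0.1, 100.0
dS_dp = (S(T, p + k) - S(T, p - k)) / (2 * k)      # (dS/dp)_T
dV_dT = (V(T + h, p) - V(T - h, p)) / (2 * h)      # (dV/dT)_p
print(dS_dp, -dV_dT, -R / p)                       # all three close to -8.3e-5
```

Both derivatives are mixed second derivatives of the same function, which is exactly why the relation holds; the comparison with $-R/p$ checks the result against the ideal gas law.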

10 If you do, however, insist on memorizing them, then lots of mnemonics exist. One useful way of remembering them is as follows. Each Maxwell relation is of the form
$$\left(\frac{\partial\, *}{\partial\, \ddagger}\right)_{\star} = \pm\left(\frac{\partial\, \dagger}{\partial\, \star}\right)_{\ddagger}$$
where the pairs of symbols which are similar to each other ($\star$ and $*$, or $\dagger$ and $\ddagger$) signify conjugate variables, so that their product has the dimensions of energy: e.g. $T$ and $S$, and $p$ and $V$. Thus you can notice that, for each Maxwell relation, terms diagonally opposite each other are conjugate variables. The variable held constant is conjugate to the one on the top of the partial differential. Another point is that you always have a minus sign when $V$ and $T$ are on the same side of the equation.

These equations should not be memorized;$^{10}$ rather it is better to remember how to derive them! A more sophisticated way of deriving these equations based on Jacobians (which may not be to everybody’s taste) is outlined in the box below. It has the attractive virtue of producing all four Maxwell relations in one go by directly relating the work done and heat absorbed in a cyclic process, but the unfortunate vice of requiring easy familiarity with the use of Jacobian transformations.

An alternative derivation of Maxwell’s relations:

The following derivation is more elegant, but requires a knowledge of Jacobians (see Appendix C.9): Consider a cyclic process which can be described in both the $T$–$S$ and $p$–$V$ planes. The internal energy $U$ is a state function and therefore doesn’t change in a cycle, so $\oint p\,dV = \oint T\,dS$, and hence we have
$$\iint dp\,dV = \iint dT\,dS, \tag{16.55}$$
so that the work done (the area enclosed by the cycle in the $p$–$V$ plane) is equal to the heat absorbed (the area enclosed by the cycle in the $T$–$S$ plane). However, one can also write
$$\iint dp\,dV\, \frac{\partial(T, S)}{\partial(p, V)} = \iint dT\,dS, \tag{16.56}$$
where $\partial(T, S)/\partial(p, V)$ is the Jacobian of the transformation from the $p$–$V$ plane to the $T$–$S$ plane, and so this implies that
$$\frac{\partial(T, S)}{\partial(p, V)} = 1. \tag{16.57}$$
This equation is sufficient to generate all four Maxwell relations via
$$\frac{\partial(T, S)}{\partial(x, y)} = \frac{\partial(p, V)}{\partial(x, y)}, \tag{16.58}$$
where $(x, y)$ are taken as (i) $(T, p)$, (ii) $(T, V)$, (iii) $(p, S)$ and (iv) $(S, V)$, and using the identities in Appendix C.9.

We will now give several examples of how Maxwell’s relations can be used to solve problems in thermodynamics.

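Example 16.4 Find expressions for $(\partial C_p/\partial p)_T$ and $(\partial C_V/\partial V)_T$ in terms of $p$, $V$ and $T$.

Solution: By the definitions of $C_V$ and $C_p$ we have that
$$C_V = \left(\frac{\partial Q}{\partial T}\right)_V = T\left(\frac{\partial S}{\partial T}\right)_V \tag{16.59}$$
and
$$C_p = \left(\frac{\partial Q}{\partial T}\right)_p = T\left(\frac{\partial S}{\partial T}\right)_p. \tag{16.60}$$
Now
$$\left(\frac{\partial C_p}{\partial p}\right)_T = \left[\frac{\partial}{\partial p}\left(T\left(\frac{\partial S}{\partial T}\right)_p\right)\right]_T = T\left[\frac{\partial}{\partial p}\left(\frac{\partial S}{\partial T}\right)_p\right]_T = T\left[\frac{\partial}{\partial T}\left(\frac{\partial S}{\partial p}\right)_T\right]_p, \tag{16.61}$$
and using a Maxwell relation
$$\left(\frac{\partial C_p}{\partial p}\right)_T = -T\left[\frac{\partial}{\partial T}\left(\frac{\partial V}{\partial T}\right)_p\right]_p = -T\left(\frac{\partial^2 V}{\partial T^2}\right)_p. \tag{16.62}$$
Similarly
$$\left(\frac{\partial C_V}{\partial V}\right)_T = T\left(\frac{\partial^2 p}{\partial T^2}\right)_V. \tag{16.63}$$
Both the expressions in eqns 16.62 and 16.63 are zero for a perfect gas.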

Before proceeding further with the examples, we will pause to list the tools which you have at your disposal to solve these sorts of problems. Any given problem may not require you to use all of these, but you may have to use more than one of these ‘techniques’.

(1) Write down a function of state in terms of particular variables. If $f$ is a function of $x$ and $y$, so that $f = f(x, y)$, you then have immediately that
$$df = \left(\frac{\partial f}{\partial x}\right)_y dx + \left(\frac{\partial f}{\partial y}\right)_x dy. \tag{16.64}$$

(2) Use Maxwell’s relations to transform the partial differential you start with into a more convenient one. Use the Maxwell relations in eqns 16.51–16.54.

(3) Invert a Maxwell relation using the reciprocal theorem. The reciprocal theorem states that
$$\left(\frac{\partial x}{\partial z}\right)_y = \frac{1}{\left(\frac{\partial z}{\partial x}\right)_y}, \tag{16.65}$$
and this is proved in Appendix C.6 (see eqn C.41).

(4) Combine partial differentials using the reciprocity theorem. The reciprocity theorem states that
$$\left(\frac{\partial x}{\partial y}\right)_z \left(\frac{\partial y}{\partial z}\right)_x \left(\frac{\partial z}{\partial x}\right)_y = -1, \tag{16.66}$$
which is proved in Appendix C.6 (see eqn C.42). This can be combined with the reciprocal theorem to write that
$$\left(\frac{\partial x}{\partial y}\right)_z = -\left(\frac{\partial x}{\partial z}\right)_y \left(\frac{\partial z}{\partial y}\right)_x, \tag{16.67}$$
which is a very useful identity.

(5) Identify a heat capacity. Some of the partial differentials which appear in Maxwell’s relations relate to real, measurable properties. As we have seen in Example 16.4, both $\left(\frac{\partial S}{\partial T}\right)_V$ and $\left(\frac{\partial S}{\partial T}\right)_p$ can be related to heat capacities:
$$\frac{C_V}{T} = \left(\frac{\partial S}{\partial T}\right)_V \quad\text{and}\quad \frac{C_p}{T} = \left(\frac{\partial S}{\partial T}\right)_p. \tag{16.68}$$

(6) Identify a “generalized susceptibility”. A generalized susceptibility quantifies how much a particular variable changes when a generalized force is applied. A generalized force is a variable such as $T$ or $p$ which is a differential of the internal energy with respect to some other parameter.$^{11}$ An example of a generalized susceptibility is $\left(\frac{\partial V}{\partial T}\right)_x$ which, you will recall, answers the question “keeping $x$ constant, how much does the volume change when you change the temperature?”. It is related to the thermal expansivity at constant $x$, where $x$ is pressure or entropy. Thus the isobaric expansivity $\beta_p$ is defined as
$$\beta_p = \frac{1}{V}\left(\frac{\partial V}{\partial T}\right)_p, \tag{16.69}$$
while the adiabatic expansivity $\beta_S$ is defined as
$$\beta_S = \frac{1}{V}\left(\frac{\partial V}{\partial T}\right)_S. \tag{16.70}$$
Expansivities measure the fractional change in volume with a change in temperature.

Another useful generalized susceptibility is the compressibility. This quantifies how large a fractional volume change you achieve when you apply pressure. The isothermal compressibility $\kappa_T$ is defined as
$$\kappa_T = -\frac{1}{V}\left(\frac{\partial V}{\partial p}\right)_T, \tag{16.71}$$
while the adiabatic compressibility $\kappa_S$ is defined as
$$\kappa_S = -\frac{1}{V}\left(\frac{\partial V}{\partial p}\right)_S. \tag{16.72}$$
Both quantities have a minus sign so that the compressibilities are positive (this is because things get smaller when you press them, so fractional volume changes are negative when positive pressure is applied). None of these expansivities or compressibilities appears directly in a Maxwell relation, but each can easily be related to those that do using the reciprocal and reciprocity theorems.

11 Recall that $T = (\partial U/\partial S)_V$ and $p = -(\partial U/\partial V)_S$.

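Example 16.5 By considering $S = S(T, V)$, show that $C_p - C_V = V T \beta_p^2/\kappa_T$.

Solution: Considering $S = S(T, V)$ allows us to write down immediately that
$$dS = \left(\frac{\partial S}{\partial T}\right)_V dT + \left(\frac{\partial S}{\partial V}\right)_T dV. \tag{16.73}$$
Differentiating this equation with respect to $T$ at constant $p$ yields
$$\left(\frac{\partial S}{\partial T}\right)_p = \left(\frac{\partial S}{\partial T}\right)_V + \left(\frac{\partial S}{\partial V}\right)_T \left(\frac{\partial V}{\partial T}\right)_p. \tag{16.74}$$
Now the first two terms can be replaced by $C_p/T$ and $C_V/T$ respectively, while use of a Maxwell relation and a partial differential identity (see eqn 16.67) yields
$$\left(\frac{\partial S}{\partial V}\right)_T = \left(\frac{\partial p}{\partial T}\right)_V = -\left(\frac{\partial p}{\partial V}\right)_T \left(\frac{\partial V}{\partial T}\right)_p, \tag{16.75}$$
and hence using eqns 16.69 and 16.71 we have that
$$C_p - C_V = \frac{V T \beta_p^2}{\kappa_T}. \tag{16.76}$$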
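For an ideal gas the right-hand side of eqn 16.76 should reduce to $R$ per mole. The Python sketch below evaluates $\beta_p$ and $\kappa_T$ by finite differences from the equation of state alone; the state point and step sizes are illustrative assumptions.

```python
R = 8.314

def volume(T, p):
    """Equation of state V(T, p) for one mole of ideal gas."""
    return R * T / p

def expansivity(T, p, h=1e-3):
    """beta_p = (1/V)(dV/dT)_p by central difference."""
    return (volume(T + h, p) - volume(T - h, p)) / (2 * h) / volume(T, p)

def compressibility(T, p, k=1.0):
    """kappa_T = -(1/V)(dV/dp)_T by central difference."""
    return -(volume(T, p + k) - volume(T, p - k)) / (2 * k) / volume(T, p)

T, p = 300.0, 1.0e5
difference = volume(T, p) * T * expansivity(T, p) ** 2 / compressibility(T, p)
print(difference, R)   # C_p - C_V comes out close to R for an ideal gas
```

Replacing `volume` with another equation of state would give the corresponding $C_p - C_V$ without any further derivation, which is the practical value of eqn 16.76.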

The next example shows how to calculate the entropy of an ideal gas.


Example 16.6
Find the entropy of 1 mole of ideal gas.
Solution: For one mole of ideal gas $pV = RT$. Consider the entropy $S$ as a function of volume and temperature, i.e.

$$S = S(T, V), \tag{16.77}$$

so that

$$dS = \left(\frac{\partial S}{\partial T}\right)_V dT + \left(\frac{\partial S}{\partial V}\right)_T dV \tag{16.78}$$

$$\phantom{dS} = \frac{C_V}{T}\,dT + \left(\frac{\partial p}{\partial T}\right)_V dV, \tag{16.79}$$

using eqn 16.53 and eqn 16.68. The ideal gas law for 1 mole, $p = RT/V$, implies that

$$\left(\frac{\partial p}{\partial T}\right)_V = R/V, \tag{16.80}$$

and hence, if we integrate eqn 16.79,

$$S = \int \frac{C_V\,dT}{T} + \int \frac{R\,dV}{V}. \tag{16.81}$$

If $C_V$ is not a function of temperature (which is true for an ideal gas) simple integration yields

$$S = C_V \ln T + R \ln V + \text{constant}. \tag{16.82}$$

The entropy of an ideal gas increases with increasing temperature and increasing volume.
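Since eqn 16.82 fixes $S$ only up to a constant, it is entropy differences between two states that can be computed. The following is a minimal sketch of that calculation; the function name is hypothetical, and the default $C_V = \tfrac{3}{2}R$ (a monatomic gas) is an assumption made purely for illustration.

```python
import math

R = 8.314462618  # molar gas constant in J/(mol K)


def entropy_change(T1, V1, T2, V2, C_V=1.5 * R):
    """Entropy change of 1 mole of ideal gas between (T1, V1) and (T2, V2).

    Uses S = C_V ln T + R ln V + constant (eqn 16.82); the constant cancels.
    The default C_V = 3R/2 is an assumed, illustrative value.
    """
    return C_V * math.log(T2 / T1) + R * math.log(V2 / V1)


if __name__ == "__main__":
    # Isothermal doubling of the volume: Delta S = R ln 2.
    print(entropy_change(300.0, 1.0, 300.0, 2.0))
    # Reversible adiabat, T V^(R/C_V) = constant: Delta S should vanish.
    T2 = 300.0 * (1.0 / 2.0) ** (R / (1.5 * R))
    print(entropy_change(300.0, 1.0, T2, 2.0))
```

The second call illustrates that a state reached along a reversible adiabat has the same entropy as the starting state, as expected.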

The ﬁnal example in this chapter shows how to prove that the ratio of the isothermal and adiabatic compressibilities, κT /κS , is equal to γ.

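Example 16.7
Find the ratio of the isothermal and adiabatic compressibilities.
Solution: This follows using straightforward manipulations of partial differentials. To begin with, we write

$$\frac{\kappa_T}{\kappa_S} = \frac{\dfrac{1}{V}\left(\dfrac{\partial V}{\partial p}\right)_T}{\dfrac{1}{V}\left(\dfrac{\partial V}{\partial p}\right)_S}, \tag{16.83}$$

which follows from the definition of $\kappa_T$ and $\kappa_S$ (eqns 16.71 and 16.72). Then we proceed as follows:

$$\begin{aligned}
\frac{\kappa_T}{\kappa_S}
&= \frac{-\left(\dfrac{\partial V}{\partial T}\right)_p \left(\dfrac{\partial T}{\partial p}\right)_V}{-\left(\dfrac{\partial V}{\partial S}\right)_p \left(\dfrac{\partial S}{\partial p}\right)_V} && \text{reciprocity theorem} \\
&= \frac{\left(\dfrac{\partial V}{\partial T}\right)_p \left(\dfrac{\partial S}{\partial V}\right)_p}{\left(\dfrac{\partial p}{\partial T}\right)_V \left(\dfrac{\partial S}{\partial p}\right)_V} && \text{Maxwell's relations} \\
&= \frac{\left(\dfrac{\partial S}{\partial T}\right)_p}{\left(\dfrac{\partial S}{\partial T}\right)_V} && \text{reciprocity theorem} \\
&= \frac{C_p}{C_V} = \gamma.
\end{aligned} \tag{16.84}$$

We can show that this equation is correct for the case of an ideal gas as follows. Assuming the ideal gas equation $pV \propto T$, we have for constant temperature that

$$\frac{dp}{p} = -\frac{dV}{V}, \tag{16.85}$$

and hence

$$\kappa_T = \frac{1}{p}. \tag{16.86}$$

For an adiabatic change $p \propto V^{-\gamma}$ and hence

$$\frac{dp}{p} = -\gamma \frac{dV}{V}, \tag{16.87}$$

and hence

$$\kappa_S = \frac{1}{\gamma p}. \tag{16.88}$$

This agrees with eqn 16.84 above. We note that because $\kappa_T$ is larger than $\kappa_S$ (because $\gamma > 1$), the isotherms always have a smaller gradient than the adiabats on a p–V plot (see Fig. 12.1).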
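The ideal-gas check in eqns 16.85 to 16.88 can also be done numerically by differentiating $V(p)$ along an isotherm and along an adiabat. The sketch below uses a central finite difference; the function names, the reference state and the value $\gamma = 5/3$ are assumptions chosen for illustration only.

```python
def compressibility(volume_of_p, p, rel_step=1e-6):
    """Return -(1/V) dV/dp at pressure p, using a central finite difference."""
    dp = p * rel_step
    dV_dp = (volume_of_p(p + dp) - volume_of_p(p - dp)) / (2.0 * dp)
    return -dV_dp / volume_of_p(p)


def compressibility_ratio(gamma=5.0 / 3.0, p0=1.0e5, V0=0.025):
    """Return kappa_T / kappa_S for an ideal gas at the state (p0, V0)."""

    def isotherm(p):
        return V0 * p0 / p  # pV = constant

    def adiabat(p):
        return V0 * (p0 / p) ** (1.0 / gamma)  # p V^gamma = constant

    return compressibility(isotherm, p0) / compressibility(adiabat, p0)


if __name__ == "__main__":
    print(compressibility_ratio())  # should be close to gamma = 5/3
```

Both curves pass through the same point $(p_0, V_0)$, so the ratio compares the slopes of the isotherm and the adiabat at a common state, which is what eqn 16.84 describes.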


Chapter summary

• We define the following thermodynamic potentials:

$$U, \qquad H = U + pV, \qquad F = U - TS, \qquad G = H - TS,$$

which are then related by the following differentials:

$$\boxed{\begin{aligned}
dU &= T\,dS - p\,dV \\
dH &= T\,dS + V\,dp \\
dF &= -S\,dT - p\,dV \\
dG &= -S\,dT + V\,dp
\end{aligned}}$$

• The availability $A$ is given by $A = U + p_0 V - T_0 S$, and for any spontaneous change we have that $dA \le 0$. This means that a system in contact with a reservoir (temperature $T_0$, pressure $p_0$) will minimize $A$ which means
  – minimizing $U$ when $S$ and $V$ are fixed;
  – minimizing $H$ when $S$ and $p$ are fixed;
  – minimizing $F$ when $T$ and $V$ are fixed;
  – minimizing $G$ when $T$ and $p$ are fixed.

• Four Maxwell relations can be derived from the boxed equations above, and used to solve many problems in thermodynamics.

Exercises

(16.1) (a) Using the first law $dU = T\,dS - p\,dV$ to provide a reminder, write down the definitions of the four thermodynamic potentials $U$, $H$, $F$, $G$ (in terms of $U$, $S$, $T$, $p$, $V$), and give $dU$, $dH$, $dF$, $dG$ in terms of $T$, $S$, $p$, $V$ and their derivatives.
(b) Derive all the Maxwell relations.

(16.2) (a) Derive the following general relations

(i) $\left(\dfrac{\partial T}{\partial V}\right)_U = -\dfrac{1}{C_V}\left[T\left(\dfrac{\partial p}{\partial T}\right)_V - p\right]$

(ii) $\left(\dfrac{\partial T}{\partial V}\right)_S = -\dfrac{1}{C_V}\,T\left(\dfrac{\partial p}{\partial T}\right)_V$

(iii) $\left(\dfrac{\partial T}{\partial p}\right)_H = \dfrac{1}{C_p}\left[T\left(\dfrac{\partial V}{\partial T}\right)_p - V\right]$

In each case the quantity on the left hand side is the appropriate thing to consider for a particular type of expansion. State what type of expansion each refers to.
(b) Using these relations, verify that for an ideal gas $(\partial T/\partial V)_U = 0$ and $(\partial T/\partial p)_H = 0$, and that $(\partial T/\partial V)_S$ leads to the familiar relation $pV^\gamma = \text{constant}$ along an isentrope.

(16.3) Use the first law of thermodynamics to show that

$$\left(\frac{\partial U}{\partial V}\right)_T = \frac{C_p - C_V}{V\beta_p} - p, \tag{16.89}$$

where $\beta_p$ is the coefficient of volume expansivity and the other symbols have their usual meanings.

(16.4) (a) The natural variables for $U$ are $S$ and $V$. This means that if you know $S$ and $V$, you can find $U(S, V)$. Show that this also gives you simple expressions for $T$ and $p$.
(b) Suppose instead that you know $V$, $T$ and the function $U(T, V)$ (i.e. you have expressed $U$ in terms of variables which are not all the natural variables of $U$). Show that this leads to a (much more complicated) expression for $p$, namely

$$\frac{p}{T} = \int \left(\frac{\partial U}{\partial V}\right)_T \frac{dT}{T^2} + f(V), \tag{16.90}$$

where $f(V)$ is some (unknown) function of $V$.

(16.5) Use thermodynamic arguments to obtain the general result that, for any gas at temperature $T$, the pressure is given by

$$P = T\left(\frac{\partial P}{\partial T}\right)_V - \left(\frac{\partial U}{\partial V}\right)_T, \tag{16.91}$$

where $U$ is the total energy of the gas.

(16.6) Show that another expression for the entropy per mole of an ideal gas is

$$S = C_p \ln T - R \ln p + \text{constant}. \tag{16.92}$$

(16.7) Show that the entropy of an ideal gas can be expressed as

$$S = C_V \ln\left(\frac{p}{\rho^\gamma}\right) + \text{constant}. \tag{16.93}$$


Hermann von Helmholtz (1821–1894)

Since his family couldn’t afford to give him an academic education in physics, the seventeen-year-old Helmholtz found himself at a Berlin medical school getting a free four-year medical education, the catch being that he then had to serve as a surgeon in the Prussian army. It was during his time in the army that he submitted a scientific paper ‘On the conservation of force’ (his use of the word ‘force’ is more akin to what we call ‘energy’, the two concepts being poorly differentiated at the time). It was a blow against the notion of a ‘vital force’, an indwelling ‘life source’ which was widely proposed by physiologists to explain biological systems. Helmholtz intuited that such a vital force was mere metaphysical speculation and instead all physical and chemical processes involved the exchange of energy from one form to another, and that ‘all organic processes are reducible to physics’. Thus he began a remarkable career based on his remarkable physical insight into physiology.

In 1849 he was appointed professor of physiology at Königsberg, and six years later took up a professorship in anatomy in Bonn, moving to Heidelberg three years later. During this period he pioneered the application of physical and mathematical techniques to physiology: he invented the ophthalmoscope (for looking into the eye), the ophthalmometer (for measuring the curvature and refractive errors in the eye) and worked on the problem of three-colour vision; he did pioneering research in physiological acoustics, explaining the operation of the inner ear; he also measured the speed of nerve impulses in a frog. He even found time to make important contributions in understanding vortices in fluids. In 1871, he was appointed to a chair in Berlin, but this time it was in physics; here he pursued work in electrodynamics, non-Euclidean geometry and physical chemistry. Helmholtz mentored and influenced many highly talented students in Berlin, including Planck, Wien and Hertz. Helmholtz’s scientific life was characterized by a search for unity and clarity. He once said that ‘whoever in the pursuit of science, seeks after immediate practical utility may rest assured that he seeks in vain’, but there can be only a few scientists in history whose work has had the result of greater practical utility.

Fig. 16.2 H. von Helmholtz

William Thomson [Lord Kelvin] (1824–1907)

William Thomson was something of a prodigy: born in Belfast, the son of a mathematician, he studied in Glasgow University and then moved to Peterhouse, Cambridge. By the time he had graduated, he had written 12 research papers, the first of the 661 of his career. He became Professor of Natural Philosophy in the University of Glasgow at 22, a Fellow of the Royal Society at 27, was knighted at 42, and in 1892 became Baron Kelvin of Largs (taking his new title from the River Kelvin in Glasgow), an appointment which occurred during his presidency of the Royal Society. When he died, he was buried next to Isaac Newton in Westminster Abbey.

Thomson made pioneering contributions in fundamental electromagnetism and fluid dynamics, but also involved himself in large engineering projects. After working out how to solve the problem of sending signals down very long cables, he was involved in laying the first transatlantic telegraph cables in 1866. In 1893, he headed an international commission to plan the design of the Niagara Falls power station and was convinced by Nikola Tesla, somewhat against his better judgement, to use three-phase AC power rather than his preferred DC power transmission. On this point he was unable to accurately forecast the future (which was of course AC, not DC); in similar vein, he pronounced heavier-than-air flying machines ‘impossible’, thought that radio had ‘no future’ and that war, as a ‘relic of barbarism’ would ‘become as obsolete as duelling’. If only.

It is his progress in thermodynamics that interests us here. Inspired by the meticulous thermometric measurements of Henri Regnault which he had observed during a postgraduate stay in Paris, Thomson proposed an absolute temperature scale in 1848. Thomson was also profoundly influenced by Fourier’s theory of heat (which he had read in his teens) and Carnot’s work via the paper of Clapeyron. These had assumed a caloric theory of heat, which Thomson had initially adopted, but his encounter with Joule at the 1847 British Association meeting in Oxford had sown some seeds of doubt in caloric. After much thought, Thomson groped his way towards his ‘dynamical theory of heat’ which he published in 1851, a synthesis of Joule and Carnot, containing a description of the degradation of energy and speculations about the heat death of the Universe. He just missed a full articulation of the concept of entropy, but grasped the essential details of the first and second laws of thermodynamics. His subsequent fruitful collaboration with Joule led to the Joule-Thomson (or Joule-Kelvin) effect. Thomson also discovered many key results concerning thermoelectricity.

His most controversial result was however his estimate of the age of the Earth, based on Fourier’s thermal diffusion equation. He concluded that if the Earth had originally been a red-hot globe, and had cooled to its present temperature, its age must be about $10^8$ years. This pleased nobody: the Earth was too old for those who believed in a six-thousand-year-old planet but too young for Darwin’s evolution to produce the present biological diversity. Thomson could not have known that radioactivity (undiscovered until the very end of the nineteenth century) acts as an additional heat source in the Earth, allowing the Earth to be nearly two orders of magnitude older than he estimated. His lasting legacy however has been his new temperature scale, so that his ‘absolute zero’, the lowest possible temperature obtainable, is zero kelvin.

Fig. 16.3 William Thomson

Josiah Willard Gibbs (1839–1903)

Willard Gibbs was born in New Haven and died in New Haven, living his entire life (a brief postdoctoral period in France and Germany excepted) at Yale, where he remained unmarried. His father was also called Josiah Willard Gibbs and had also been a professor at Yale, though in Sacred Literature rather than in mathematical physics. Willard Gibbs’ life was quiet and secluded, well away from the centres of intense scientific activity at the time, which were all in Europe. This gave this gentle and scholarly man the opportunity to perform clear-thinking, profound and independent work in chemical thermodynamics, work which turned out to be completely revolutionary, though this took time to be appreciated. Willard Gibbs’ key papers were published in a series of installments in the Transactions of the Connecticut Academy of Sciences, which was hardly required reading at the time; moreover his mathematical style did not make his papers easily accessible. Maxwell was one of the few who were very impressed.

Gibbs established the key principles of chemical thermodynamics, defined the free energy and chemical potential, completely described phase equilibria with more than one component and championed a geometric view of thermodynamics. Not only did he substantially formulate thermodynamics and statistical mechanics in the form we know it today, but he also championed the use of vector calculus, in its modern form, to describe electromagnetism (in the face of spirited opposition from various prominent Europeans who maintained that the only way to describe electromagnetism was using quaternions). Gibbs didn’t interact a great deal with scientific colleagues in other institutions; he was privately secure in himself and in his ideas. One contemporary wrote of him: ‘Unassuming in manner, genial and kindly in his intercourse with his fellow-men, never showing impatience or irritation, devoid of personal ambition of the baser sort or of the slightest desire to exalt himself, he went far toward realising the ideal of the unselfish, Christian gentleman’.

Fig. 16.4 J. W. Gibbs

17 Rods, bubbles and magnets

17.1 Elastic rod
17.2 Surface tension
17.3 Paramagnetism
Chapter summary
Exercises

In this book, we have been illustrating the development of thermodynamics using the ideal gas as our chief example. We have written the first law of thermodynamics as

$$dU = T\,dS - p\,dV, \tag{17.1}$$

and everything has followed from this. However, in this chapter we want to show that thermodynamics can be applied to other types of systems. In general we will write the work đW as

$$đW = X\,dx, \tag{17.2}$$

where $X$ is some (intensive¹) generalized force and $x$ is some (extensive) generalized displacement. Examples of these are given in Table 17.1. In this chapter we will examine only three of these examples in detail: the elastic rod, the surface tension in a liquid and the assembly of magnetic moments in a paramagnet.

¹ Recall from Section 11.1.2 that intensive variables are independent of the size of the system whereas extensive variables are proportional to the size of the system.

System         X      x       đW
fluid          −p     V       −p dV
elastic rod    f      L       f dL
liquid film    γ      A       γ dA
dielectric     E      p_E     E · dp_E
magnetic       B      m       B · dm

Table 17.1 Generalized force X and generalized displacement x for various different systems. In this table, p = pressure, V = volume, f = tension, L = length, γ = surface tension, A = area, E = electric field, p_E = electric dipole moment, B = magnetic field, m = magnetic dipole moment.

Fig. 17.1 An elastic material of length L and cross-sectional area A is extended a length dL by a tension df.

17.1 Elastic rod

Consider a rod with cross-sectional area $A$ and length $L$, held at temperature $T$. The rod is made from any elastic material (such as a metal or rubber) and is placed under an infinitesimal tension $df$, which leads to the rod extending by an infinitesimal length $dL$ (see Fig. 17.1). We define the isothermal Young’s modulus $E_T$ as the ratio of stress $\sigma = df/A$ to strain $\epsilon = dL/L$, so that

$$E_T = \frac{\sigma}{\epsilon} = \frac{L}{A}\left(\frac{\partial f}{\partial L}\right)_T. \tag{17.3}$$

The Young’s modulus $E_T$ is always a positive quantity. There is another useful quantity that characterizes an elastic rod. We can also define the linear expansivity at constant tension, $\alpha_f$, by

$$\alpha_f = \frac{1}{L}\left(\frac{\partial L}{\partial T}\right)_f, \tag{17.4}$$

which is the fractional change in length with temperature. This quantity is positive in most elastic systems (though not rubber). If you hang a weight onto the end of a metal wire (thus keeping the tension $f$ in the wire constant) and heat the wire, it will extend. This implies that $\alpha_f > 0$ for the metal wire. However, if you hang a weight on a piece of rubber and heat it, you will find that the rubber will often contract, which implies that $\alpha_f < 0$ for rubber.

Example 17.1
How does the tension of a wire held at constant length change with temperature?
Solution: Our definitions of $E_T$ and $\alpha_f$ allow us to calculate this. Using eqn C.42, we have that

$$\left(\frac{\partial f}{\partial T}\right)_L = -\left(\frac{\partial f}{\partial L}\right)_T \left(\frac{\partial L}{\partial T}\right)_f = -A E_T \alpha_f, \tag{17.5}$$

where the last step is obtained using eqns 17.3 and 17.4.

We are now in a position to do some thermodynamics on our elastic system. We will rewrite the first law of thermodynamics for this case as

$$dU = T\,dS + f\,dL. \tag{17.6}$$

We can also obtain other thermodynamic potentials, such as the Helmholtz function $F = U - TS$, so that $dF = dU - T\,dS - S\,dT$, and hence

$$dF = -S\,dT + f\,dL. \tag{17.7}$$

Equation 17.7 implies that the entropy $S$ is

$$S = -\left(\frac{\partial F}{\partial T}\right)_L, \tag{17.8}$$

and similarly the tension $f$ is

$$f = \left(\frac{\partial F}{\partial L}\right)_T. \tag{17.9}$$
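To get a feel for the size of the effect in eqn 17.5, the sketch below evaluates $(\partial f/\partial T)_L = -A E_T \alpha_f$ for a thin wire. The material values are assumptions of roughly the order of magnitude expected for a steel wire, chosen only for illustration; the function name is hypothetical.

```python
def tension_change_per_kelvin(area, youngs_modulus, linear_expansivity):
    """Return (df/dT)_L = -A E_T alpha_f in N/K (eqn 17.5)."""
    return -area * youngs_modulus * linear_expansivity


if __name__ == "__main__":
    # Assumed, illustrative values (not taken from the text):
    A = 1.0e-6        # cross-sectional area in m^2 (1 mm^2)
    E_T = 2.0e11      # isothermal Young's modulus in Pa
    alpha_f = 1.2e-5  # linear expansivity in 1/K
    # Negative result: the tension in a clamped metal wire drops on heating.
    print(tension_change_per_kelvin(A, E_T, alpha_f))
```

For rubber, with $\alpha_f < 0$, the same function returns a positive number: a rubber band held at fixed length pulls harder when warmed.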

A Maxwell-relation-type step² then leads to an expression for the isothermal change in entropy on extension as

$$\left(\frac{\partial S}{\partial L}\right)_T = -\left(\frac{\partial f}{\partial T}\right)_L. \tag{17.10}$$

The right-hand side of this equation was worked out in eqn 17.5, so that

$$\left(\frac{\partial S}{\partial L}\right)_T = A E_T \alpha_f, \tag{17.11}$$

where $A$ is the area (presumed not to change), and so stretching the rod (increasing $L$) results in an increase in entropy if $\alpha_f > 0$. This is like the case of an ideal gas for which

$$\left(\frac{\partial S}{\partial V}\right)_T = \left(\frac{\partial p}{\partial T}\right)_V > 0, \tag{17.12}$$

so that expanding the gas (increasing $V$) results in an increase in entropy. If the entropy of the system goes up as it is expanded isothermally, then heat must be absorbed. For the case of the elastic rod (assuming it is not made of rubber), extending it isothermally (and reversibly) by $\Delta L$ would then lead to an absorption of heat $\Delta Q$ given by

$$\Delta Q = T\,\Delta S = A E_T T \alpha_f\,\Delta L. \tag{17.13}$$

Why does stretching a wire increase its entropy? Let us consider the case of a metallic wire. This contains many small crystallites which have low entropy. The action of stretching the wire distorts those small crystallites, and that increases their entropy and so heat is absorbed.³ However, for rubber $\alpha_f < 0$, and hence an isothermal extension means that heat is emitted. The action of stretching a piece of rubber at constant temperature results in the alignment of the long rubber molecules, reducing their entropy (see Fig. 17.2) and causing heat to be released.

² As in the case of a gas, the Maxwell relation allows us to relate some differential of entropy (which is hard to measure experimentally, but is telling us something fundamental about the system) to a differential which we can measure in an experiment, here the change in tension with temperature of a rod held at constant length.

³ For example, the crystallites might distort from cubic to tetragonal symmetry, thus lowering the entropy. In addition, the stretching of the wire may increase the volume per atom in the wire and this also increases the entropy.

Fig. 17.2 Rubber consists of long-chain molecules. (a) With no force applied, the rubber molecule is quite coiled up and the average end-to-end distance is short, and the entropy is large. This picture has been drawn by taking each segment of the chain to point randomly. (b) With a force applied (about a vertical axis in this diagram), the molecule becomes more aligned with the direction of the applied force, and the end-to-end distance is large, reducing the entropy (see Exercise 17.3).

Example 17.2
The internal energy $U$ for an ideal gas does not change when it is expanded isothermally. How does $U$ change for an elastic rod when it is extended isothermally?
Solution: The change in internal energy on isothermal extension can be worked out from eqn 17.6 and eqn 17.11 by writing

$$\left(\frac{\partial U}{\partial L}\right)_T = T\left(\frac{\partial S}{\partial L}\right)_T + f = f + A T E_T \alpha_f. \tag{17.14}$$

This is the sum of a positive term expressing the energy going into the rod by work and a term expressing the heat flow into the rod due to an isothermal change of length. (For an ideal gas, a similar analysis applies, but the work done by the gas and the heat that flows into it balance perfectly, so that $U$ does not change.)

17.2 Surface tension

We now consider the case of a liquid surface with surface area $A$. Here the expression for the work is given by

$$đW = \gamma\,dA, \tag{17.15}$$

where $\gamma$ is the surface tension. Consider the arrangement shown in Fig. 17.3. If the piston moves down, work $đW = F\,dx = +p\,dV$ is done on the liquid (which is assumed to be incompressible). The droplet radius will therefore increase by an amount $dr$ such that $dV = 4\pi r^2\,dr$, and the surface area of the droplet will change by an amount

$$dA = 4\pi (r + dr)^2 - 4\pi r^2 \approx 8\pi r\,dr, \tag{17.16}$$

so that

$$đW = \gamma\,dA = 8\pi\gamma r\,dr. \tag{17.17}$$

Equating this to $đW = F\,dx = +p\,dV = p \cdot 4\pi r^2\,dr$ yields

$$p = \frac{2\gamma}{r}. \tag{17.18}$$

The pressure $p$ in this expression is, of course, really the pressure difference between the pressure in the liquid and the atmospheric pressure against which the surface of the drop pushes.

Fig. 17.3 A spherical droplet of liquid of radius r is suspended from a thin pipe connected to a piston which maintains the pressure p of the liquid.

Example 17.3
What is the pressure of gas inside a spherical bubble of radius $r$?
Solution: The bubble (see Fig. 17.4) has two surfaces, and so the pressure $p_{\text{bubble}}$ of gas inside the bubble, minus the pressure $p_0$ outside the bubble, has to support two lots of surface tension. Hence, assuming the liquid wall of the bubble is thin (so that the radii of inner and outer walls are both $\approx r$),

$$p_{\text{bubble}} - p_0 = \frac{4\gamma}{r}. \tag{17.19}$$
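Eqns 17.18 and 17.19 are easy to evaluate for concrete sizes. The sketch below does so for a droplet and for a thin-walled bubble; the function names are hypothetical, and the surface tension of 0.072 N/m (roughly that of water near room temperature) is an assumed value used only for illustration.

```python
def droplet_excess_pressure(gamma, r):
    """Excess pressure inside a liquid droplet, p = 2 gamma / r (eqn 17.18)."""
    return 2.0 * gamma / r


def bubble_excess_pressure(gamma, r):
    """Excess pressure inside a thin-walled bubble, 4 gamma / r (eqn 17.19)."""
    return 4.0 * gamma / r


if __name__ == "__main__":
    gamma = 0.072  # N/m, assumed illustrative value
    for radius in (1.0e-3, 1.0e-6):  # radii in metres
        print(radius,
              droplet_excess_pressure(gamma, radius),
              bubble_excess_pressure(gamma, radius))
```

The $1/r$ dependence means that the excess pressure is negligible for millimetre-sized drops compared with atmospheric pressure but becomes comparable to it at micrometre scales.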

Notice that surface tension has a microscopic explanation. A molecule in the bulk of the liquid is attracted to its nearest neighbours by intermolecular forces (which is what holds a liquid together), and these forces are applied to a given molecule by its neighbours from all directions. One can think of these forces almost as weak chemical bonds. The molecules at the surface are only attracted by their neighbouring molecules in one direction, back towards the bulk of the liquid, but there is no corresponding attractive force out into the ‘wild blue yonder’. The surface has a higher energy than the bulk because bonds have to be broken in order to make a surface, and $\gamma$ tells you how much energy you need to form unit area of surface (which gives an estimate of the size of the intermolecular forces).

Fig. 17.4 A bubble of radius r has an inner and an outer surface.

We can write the first law of thermodynamics for our surface of area $A$ as

$$dU = T\,dS + \gamma\,dA \tag{17.20}$$

and similarly changes in the Helmholtz function can be written

$$dF = -S\,dT + \gamma\,dA, \tag{17.21}$$

which yields the Maxwell relation

$$\left(\frac{\partial S}{\partial A}\right)_T = -\left(\frac{\partial \gamma}{\partial T}\right)_A. \tag{17.22}$$

Equation 17.20 implies that

$$\left(\frac{\partial U}{\partial A}\right)_T = T\left(\frac{\partial S}{\partial A}\right)_T + \gamma, \tag{17.23}$$

and hence using eqn 17.22, we have that

$$\left(\frac{\partial U}{\partial A}\right)_T = \gamma - T\left(\frac{\partial \gamma}{\partial T}\right)_A, \tag{17.24}$$

the sum of a positive term expressing the energy going into a surface by work and a negative term expressing the heat flow into the surface due to an isothermal change of area. Usually, the surface tension has a temperature dependence as shown in Fig. 17.5, and hence $(\partial\gamma/\partial T)_A < 0$, so in fact both terms contribute a positive amount. Heat $\Delta Q$ is given by

$$\Delta Q = T\left(\frac{\partial S}{\partial A}\right)_T \Delta A = -T\left(\frac{\partial \gamma}{\partial T}\right)_A \Delta A > 0, \tag{17.25}$$

and this is absorbed on isothermally stretching a surface to increase its area by $\Delta A$. This quantity is positive and so heat really is absorbed. Since $\left(\frac{\partial S}{\partial A}\right)_T$ is positive, this shows that the surface has an additional entropy compared to the bulk, in addition to costing extra energy.

Fig. 17.5 Schematic diagram of the surface tension γ of a liquid as a function of temperature. Since γ must vanish at the boiling temperature $T_b$, we expect that $(\partial\gamma/\partial T)_A < 0$.

17.3 Paramagnetism

Consider a system of magnetic moments arranged in a lattice at temperature $T$. We assume that the magnetic moments cannot interact with each other. If the application of a magnetic field causes the magnetic moments to line up, the system is said to exhibit paramagnetism. The equivalent formulation of the first law of thermodynamics for a paramagnet is

$$dU = T\,dS + B\,dm, \tag{17.26}$$

where $m$ is the magnetic moment and $B$ is the magnetic field.⁴ The magnetic moment $m = MV$, where $M$ is the magnetization and $V$ is the volume. The magnetic susceptibility $\chi$ is given by

$$\chi = \lim_{H \to 0} \frac{M}{H}. \tag{17.27}$$

⁴ B is often known as the magnetic flux density or the magnetic induction, but following common usage, we refer to B as the magnetic field; see Blundell (2001). The magnetic field H (often called the magnetic field strength) is related to B and the magnetization M by $B = \mu_0(H + M)$.

For most paramagnets $\chi \ll 1$, so that $M \ll H$ and hence $B = \mu_0(H + M) \approx \mu_0 H$. This implies that we can write the magnetic susceptibility $\chi$ as

$$\chi \approx \frac{\mu_0 M}{B}. \tag{17.28}$$

Paramagnetic systems obey Curie’s law which states that

$$\chi \propto \frac{1}{T}, \tag{17.29}$$

as shown in Fig. 17.6, and hence

$$\left(\frac{\partial \chi}{\partial T}\right)_B < 0, \tag{17.30}$$

a fact that we will use below.

Fig. 17.6 The magnetic susceptibility for a paramagnet follows Curie’s law which states that χ ∝ 1/T.

Example 17.4
Show that heat is emitted in an isothermal increase in $B$ (a process known as isothermal magnetization) but that temperature is reduced for an adiabatic reduction in $B$ (a process known as adiabatic demagnetization).
Solution: For this problem, it is useful to include the magnetic energy $-mB$ into the Helmholtz function, so we write it as

$$F = U - TS - mB. \tag{17.31}$$

This implies that (assuming $V$ is constant)

$$dF = -S\,dT - m\,dB, \tag{17.32}$$

which yields the Maxwell relation

$$\left(\frac{\partial S}{\partial B}\right)_T = \left(\frac{\partial m}{\partial T}\right)_B \approx \frac{VB}{\mu_0}\left(\frac{\partial \chi}{\partial T}\right)_B, \tag{17.33}$$

which relates the isothermal change of entropy with field to a differential of the susceptibility $\chi$. The heat absorbed in an isothermal change of $B$ is

$$\Delta Q = T\left(\frac{\partial S}{\partial B}\right)_T \Delta B = \frac{TVB}{\mu_0}\left(\frac{\partial \chi}{\partial T}\right)_B \Delta B < 0, \tag{17.34}$$

and since it is negative it implies that heat is actually emitted. The change in temperature in an adiabatic change of $B$ is

$$\left(\frac{\partial T}{\partial B}\right)_S = -\left(\frac{\partial T}{\partial S}\right)_B \left(\frac{\partial S}{\partial B}\right)_T. \tag{17.35}$$

If we define $C_B = T\left(\frac{\partial S}{\partial T}\right)_B$, the heat capacity at constant $B$, then substitution of this and eqn 17.33 into eqn 17.35 yields

$$\left(\frac{\partial T}{\partial B}\right)_S = -\frac{TVB}{\mu_0 C_B}\left(\frac{\partial \chi}{\partial T}\right)_B. \tag{17.36}$$

Equation 17.30 implies that $\left(\frac{\partial T}{\partial B}\right)_S > 0$, and hence we can cool a material using an adiabatic demagnetization, i.e. by reducing the magnetic field on a sample while keeping it at constant entropy. This can yield temperatures as low as a few millikelvin for electronic systems and a few microkelvin for nuclear systems. This coupling between thermal and magnetic properties is known as the magnetocaloric effect.

Let us now consider why adiabatic demagnetization results in the cooling of a material from a microscopic point of view. Consider a sample of a paramagnetic salt, which contains $N$ independent magnetic moments. Without a magnetic field applied, the magnetic moments will point in random directions (because we are assuming that they do not interact with each other) and the system will have no net magnetization. An applied field $B$ will, however, tend to line up the magnetic moments and produce a magnetization. Increasing temperature reduces the magnetization, and increasing magnetic field increases the magnetization. At very high temperature, the magnetic moments all point in random directions and the net magnetization is zero (see Fig. 17.7(a)). The thermal energy $k_B T$ is so large that all states are equally populated, irrespective of whether or not the state is energetically favourable. If the magnetic moments have angular momentum quantum number $J = \tfrac{1}{2}$ they can only point parallel or antiparallel to the magnetic field: hence there are $\Omega = 2^N$ ways of arranging up and down magnetic moments. Hence the magnetic contribution to the entropy, $S$, is

$$S = k_B \ln \Omega = N k_B \ln 2. \tag{17.37}$$

In the general case of $J > \tfrac{1}{2}$, $\Omega = (2J + 1)^N$ and the entropy is

$$S = N k_B \ln(2J + 1). \tag{17.38}$$

Fig. 17.7 (a) At high temperature, the spins in a paramagnet are in random directions because the thermal energy $k_B T$ is much larger than the magnetic energy $mB$. This state has high entropy. (b) At low temperature, the spins become aligned with the field because the thermal energy $k_B T$ is much smaller than the magnetic energy $mB$. This state has low entropy.
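The two limits discussed here (entropy $N k_B \ln 2$ at high temperature, falling towards zero at low temperature) can be seen in a single formula. The sketch below uses the standard result for independent two-level moments, $S/(N k_B) = \ln(2\cosh x) - x\tanh x$ with $x = \mu B/(k_B T)$; this expression is not derived in this section, and the symbol $\mu$ for the moment of a single spin and the function name are assumptions introduced for the illustration.

```python
import math


def spin_half_entropy_per_spin(x):
    """Entropy per spin, in units of k_B, of independent two-level moments.

    x = mu * B / (k_B * T) is the (assumed) ratio of magnetic to thermal energy.
    Implements ln(2 cosh x) - x tanh x, written to avoid overflow at large |x|.
    """
    ax = abs(x)
    log_two_cosh = ax + math.log1p(math.exp(-2.0 * ax))  # equals ln(2 cosh x)
    return log_two_cosh - x * math.tanh(x)


if __name__ == "__main__":
    for x in (0.0, 0.5, 1.0, 2.0, 5.0):
        print(x, spin_half_entropy_per_spin(x))
```

Because this entropy depends on $B$ and $T$ only through the ratio $B/T$, holding it constant while lowering $B$ forces $T$ to fall in proportion, which is the idealized version of adiabatic demagnetization for non-interacting moments.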

At lower temperature, the entropy of the paramagnetic salt must reduce as only the lowest energy levels are occupied, corresponding to the average alignment of the magnetic moments with the applied field increasing. At very low temperature, all the magnetic moments will align with the magnetic field to minimize their energy (see Fig. 17.7(b)). In this case there is only one way of arranging the system (with all spins aligned) so $\Omega = 1$ and $S = 0$.

The procedure for magnetically cooling a sample is as follows. The paramagnet is first cooled to a low starting temperature using liquid helium. The magnetic cooling then proceeds via two steps (see also Fig. 17.8).

The first step is isothermal magnetization. The energy of a paramagnet is reduced by alignment of the moments parallel to a magnetic field. At a given temperature the alignment of the moments may therefore be enhanced by increasing the strength of an applied magnetic field. This is performed isothermally (see Fig. 17.8, step a → b) by having the sample thermally connected to a bath of liquid helium (the boiling point of helium at atmospheric pressure is 4.2 K), or perhaps with the liquid helium bath at reduced pressure so that the temperature can be less than 4.2 K. The temperature of the sample does not change and the helium bath absorbs the heat liberated by the sample as its energy and entropy decrease. The thermal connection is usually provided by low-pressure helium gas in the sample chamber which conducts heat between the sample and the chamber walls, the chamber itself sitting inside the helium bath. (The gas is often called ‘exchange’ gas because it allows the sample and the bath to exchange heat.)

The second step is to thermally isolate the sample from the helium bath (by pumping away the exchange gas). The magnetic field is then slowly reduced to zero, slowly so that the process is quasistatic and the entropy is constant. This step is called adiabatic demagnetization (see Fig. 17.8, step b → c) and it reduces the temperature of the system. During the adiabatic demagnetization the entropy of the sample remains constant; the entropy of the magnetic moments increases (because the moments randomize as the field is turned down) and this is precisely balanced by the decrease in the entropy of the phonons (the lattice vibrations) as the sample cools. Entropy is thus exchanged between the phonons and the spins.

Fig. 17.8 The entropy of a paramagnetic salt as a function of temperature for several different applied magnetic fields between zero and some maximum value which we will call $B_b$. Magnetic cooling of a paramagnetic salt from temperature $T_i$ to $T_f$ is accomplished as indicated in two steps: first, isothermal magnetization from a to b by increasing the magnetic field from 0 to $B_b$ at constant temperature $T_i$; second, adiabatic demagnetization from b to c. The $S(T)$ curves have been calculated assuming $J = \tfrac{1}{2}$. A term $\propto T^3$ has been added to these curves to simulate the entropy of the lattice vibrations. The curve for $B = 0$ is actually for small, but nonzero, $B$ to simulate the effect of a small residual field.

There is another way of looking at adiabatic demagnetization: Consider the energy levels of magnetic ions in a paramagnetic salt which is subjected to an applied magnetic field. The population of magnetic ions

in each energy level is given by the Boltzmann distribution, as indicated schematically in Fig. 17.9(a). The rate at which the levels decrease in population as the energy increases is determined by the temperature $T$. When we perform an isothermal magnetization (increasing the applied magnetic field while keeping the temperature constant) we are increasing the spacing between the energy levels of the paramagnetic salt [see Fig. 17.9(b)], but the occupation of each level is determined by the same Boltzmann distribution because the temperature $T$ is constant. Thus the higher-energy levels become depopulated. This depopulation is the result of transitions between energy levels caused by interaction with the surroundings which are keeping the system at constant temperature. In an adiabatic demagnetization, the external magnetic field is reduced to its original value, closing up the energy levels again. However, because the salt is now thermally isolated, no transitions between energy levels are possible and the populations of each level remain the same [see Fig. 17.9(c)]. Another way of saying this is that in an adiabatic process the entropy $S = -k_B \sum_i P_i \ln P_i$ (eqn 14.48) of the system is constant, and this expression only involves the probability $P_i$ of occupying the $i$th level, not the energy. Thus the temperature of the paramagnetic salt following the adiabatic demagnetization is lower because the occupancies now correspond to a Boltzmann distribution with a lower temperature.

Fig. 17.9 Schematic diagram showing the energy levels in a magnetic system (a) initially, (b) following isothermal magnetization and (c) following adiabatic demagnetization.

Does adiabatic demagnetization as a method of cooling have a limit? At first sight it looks like the entropy for $B = 0$ would be $S = N k_B \ln(2J + 1)$ for all temperatures $T > 0$ and therefore would fall to zero only at absolute zero. Thus adiabatic demagnetization looks like it might work as a cooling method all the way to absolute zero.
However, in real paramagnetic salts there is always some small residual internal ﬁeld due to interactions between the moments which ensures that the entropy falls prematurely towards zero when the temperature is a little above absolute zero (see Fig. 17.8). The size of this ﬁeld puts a limit on the lowest temperature to which the paramagnetic salt can be cooled. In certain paramagnetic salts which have a very small residual internal ﬁeld, temperatures of a few milliKelvin can be achieved. The failure of Curie’s law as we approach T = 0 is just one of the consequences of the third law of thermodynamics which we will treat in the following chapter.
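The population argument for adiabatic demagnetization can be checked numerically. The sketch below assumes the simplest possible case, a two-level (spin-1/2) moment with energies −μB and +μB, and it ignores the residual internal field discussed above. The function names, the choice of the Bohr magneton for the moment, and the field and temperature values are illustrative assumptions, not taken from the text.

```python
import math

K_B = 1.380649e-23        # Boltzmann constant, J/K
MU_B = 9.2740100783e-24   # Bohr magneton, J/T (assumed size of the moment)

def populations(B, T, mu=MU_B):
    """Boltzmann occupation probabilities of the levels -mu*B and +mu*B."""
    x = mu * B / (K_B * T)
    z = math.exp(x) + math.exp(-x)
    return math.exp(x) / z, math.exp(-x) / z

def entropy(probs):
    """Entropy per moment, S = -k_B sum_i P_i ln P_i."""
    return -K_B * sum(p * math.log(p) for p in probs if p > 0)

def temperature_from_populations(p_low, p_high, B, mu=MU_B):
    """Temperature whose Boltzmann distribution reproduces the given populations."""
    return 2 * mu * B / (K_B * math.log(p_low / p_high))

# Illustrative values (assumptions): magnetize at 1 K in 1 T, then reduce the field to 0.01 T.
T_i, B_high, B_low = 1.0, 1.0, 0.01
p = populations(B_high, T_i)                    # state after isothermal magnetization
T_f = temperature_from_populations(*p, B_low)   # populations frozen, level spacing reduced
print(T_f, T_i * B_low / B_high)
```

Because the populations depend only on the ratio B/T in this idealized model, holding them fixed while the field is reduced lowers the temperature in the same proportion, and the entropy is unchanged.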


Fig. 17.10 Entropy increases when (a) a gas is expanded isothermally, (b) a metallic rod is stretched isothermally. Entropy decreases when (c) rubber is stretched isothermally and (d) a paramagnet is magnetized isothermally.

Chapter summary

• The first law for a gas is dU = T dS − p dV. An isothermal expansion results in S increasing (see Fig. 17.10(a)). An adiabatic compression results in T increasing.
• The first law for an elastic rod is dU = T dS + f dL. An isothermal extension of a metal wire results in S increasing (see Fig. 17.10(b)) but for rubber S decreases (see Fig. 17.10(c)). An adiabatic contraction of a metal wire results in T increasing (but for rubber T decreases).
• The first law for a liquid film is dU = T dS + γ dA. An isothermal stretching results in S increasing. An adiabatic contraction results in T increasing.
• The first law for a magnetic system is dU = T dS + B dm. An isothermal magnetization results in S decreasing (see Fig. 17.10(d)). An adiabatic demagnetization results in T decreasing.


Exercises

(17.1) For an elastic rod, show that
$$\left(\frac{\partial C_L}{\partial L}\right)_T = -T\left(\frac{\partial^2 f}{\partial T^2}\right)_L, \qquad (17.39)$$
where $C_L$ is the heat capacity at constant length L.

(17.2) For an elastic rod, show that
$$\left(\frac{\partial T}{\partial L}\right)_S = -\frac{T A E_T \alpha_f}{C_L}. \qquad (17.40)$$
For rubber, explain why this quantity is positive. Hence explain why, if you take a rubber band which has been under tension for some time and suddenly release the tension to zero, the rubber band appears to have cooled.

(17.3) A rubber molecule can be modelled in one dimension as a chain consisting of a series of $N = N_+ + N_-$ links, where $N_+$ links point in the +x direction, while $N_-$ links point in the −x direction. If the length of one link in the chain is a, show that the length L of the chain is
$$L = a(N_+ - N_-). \qquad (17.41)$$
Show further that the number of ways Ω(L) of arranging the links to achieve a length L can be written as
$$\Omega(L) = \frac{N!}{N_+!\,N_-!}, \qquad (17.42)$$
and also that the entropy $S = k_B \ln \Omega(L)$ can be written approximately as
$$S = N k_B\left[\ln 2 - \frac{L^2}{2N^2a^2}\right] \qquad (17.43)$$
when $L \ll Na$, and hence that S decreases as L increases.

(17.4) The entropy S of a surface can be written as a function of area A and temperature T. Hence show that
$$dU = T\,dS + \gamma\,dA = C_A\,dT + \left[\gamma - T\left(\frac{\partial\gamma}{\partial T}\right)_A\right]dA. \qquad (17.44)$$

(17.5) Consider a liquid of density ρ with molar mass M. Explain why the number of molecules per unit area in the surface is approximately
$$(\rho N_A/M)^{2/3}. \qquad (17.45)$$
Hence, the energy contribution per molecule to the surface tension γ is approximately
$$\gamma/(\rho N_A/M)^{2/3}. \qquad (17.46)$$
Evaluate this quantity for water (surface tension at 20 °C is approximately 72 mJ m$^{-2}$) and express your answer in eV. Compare your result with the latent heat per molecule (the molar latent heat of water is $4.4 \times 10^4$ J mol$^{-1}$).

(17.6) For a stretched rubber band, it is observed experimentally that the tension f is proportional to the temperature T if the length L is held constant. Prove that:
(a) the internal energy U is a function of temperature only;
(b) adiabatic stretching of the band results in an increase in temperature;
(c) the band will contract if warmed while kept under constant tension.

(17.7) A soap bubble of radius $R_1$ and surface tension γ is expanded at constant temperature by forcing in air by driving in a piston containing volume $V_{\rm piston}$ fully home. Show that the work ∆W needed to increase the bubble's radius to $R_2$ is
$$\Delta W = p_2 V_2 \ln\frac{p_2}{p_1} + 8\pi\gamma(R_2^2 - R_1^2) + p_0(V_2 - V_1 - V_{\rm piston}), \qquad (17.47)$$
where $p_1$ and $p_2$ are the initial and final pressures in the bubble, $p_0$ is the pressure of the atmosphere and $V_1 = \frac{4}{3}\pi R_1^3$ and $V_2 = \frac{4}{3}\pi R_2^3$.

18

The third law

18.1 Different statements of the third law
18.2 Consequences of the third law
Chapter summary
Exercises

In Chapter 13, we presented the second law of thermodynamics in various different forms. In Chapter 14, we related this to the concept of entropy and showed that the entropy of an isolated system always either stays the same or increases with time. But what value does the entropy of a system take, and how can you measure it? One way of measuring the entropy of a system is to measure its heat capacity. For example, if measurements of $C_p$, the heat capacity at constant pressure, are made as a function of temperature, then using
$$C_p = T\left(\frac{\partial S}{\partial T}\right)_p, \qquad (18.1)$$
we can obtain entropy S by integration, so that
$$S = \int \frac{C_p}{T}\,dT. \qquad (18.2)$$

This is all very well, but when you integrate, you have to worry about constants of integration. Writing eqn 18.2 as a definite integral, we have that the entropy S(T), measured at temperature T, is
$$S(T) = S(T_0) + \int_{T_0}^{T} \frac{C_p}{T}\,dT, \qquad (18.3)$$
where $T_0$ is some different temperature (see Fig. 18.1). Thus it seems that we are only able to learn about changes in entropy, for example as a system is warmed from $T_0$ to T, and we are not able to obtain an absolute measurement of entropy itself. The third law of thermodynamics, presented in this chapter, gives us additional information because it provides a value for the entropy at one particular temperature, namely absolute zero.
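The integration in eqn 18.3 is easy to carry out numerically once heat capacity data are available. The following sketch is only an illustration: it assumes a low-temperature heat capacity of the form Cp = A T³ with a made-up coefficient A, so that the numerical result can be compared with the exact integral A(T³ − T0³)/3. The function names and all numerical values are hypothetical.

```python
def entropy_change(cp, T0, T, steps=10_000):
    """Trapezoidal estimate of the integral of Cp(T')/T' from T0 to T (eqn 18.3 without S(T0))."""
    h = (T - T0) / steps
    total = 0.5 * (cp(T0) / T0 + cp(T) / T)
    for i in range(1, steps):
        Ti = T0 + i * h
        total += cp(Ti) / Ti
    return total * h

A = 2.0e-3  # hypothetical coefficient, J K^-4 mol^-1 (an assumption for illustration)

def cp_model(T):
    """Assumed low-temperature heat capacity Cp = A * T**3."""
    return A * T**3

dS_numeric = entropy_change(cp_model, 1.0, 10.0)
dS_exact = A * (10.0**3 - 1.0**3) / 3
print(dS_numeric, dS_exact)
```

Note that the calculation only ever yields the difference S(T) − S(T0); the value of S(T0) itself has to come from elsewhere, which is the gap the third law fills.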

18.1

Diﬀerent statements of the third law

Fig. 18.1 A graphical representation of eqn 18.3.

Walther H. Nernst (1864–1941) (Fig. 18.2) came up with the first statement of the third law of thermodynamics after examining data on chemical thermodynamics and doing experiments with electrochemical cells. The essential conclusion he came to concerned the change in enthalpy ∆H in a reaction (the heat of the reaction, positive if endothermic, negative if exothermic; see Section 16.5), and the change in Gibbs' function ∆G (which determines in which direction the reaction goes). Since G = H − TS, we expect that
$$\Delta G = \Delta H - T\,\Delta S, \qquad (18.4)$$
so that as T → 0, ∆G → ∆H. Experimental data showed that this was true; moreover, ∆G and ∆H not only came closer together on cooling, they approached each other asymptotically. On the basis of the data, Nernst also postulated that ∆S → 0 as T → 0. His statement of the third law, dating from 1906, can be written as

Nernst's statement of the third law
Near absolute zero, all reactions in a system in internal equilibrium take place with no change in entropy.

Fig. 18.2 W. Nernst

Max Planck (1858–1947) (Fig. 18.3) added more meat to the bones of the statement by making a further hypothesis in 1911, namely that:

Planck's statement of the third law
The entropy of all systems in internal equilibrium is the same at absolute zero, and may be taken to be zero.

Fig. 18.3 M. Planck

Planck actually made his statement only about perfect crystals. However, it is believed to be true about any system, as long as it is in internal equilibrium (i.e. that all parts of a system are in equilibrium with each other). There are a number of systems, such as $^4$He and $^3$He, which are liquids even at very low temperature. Electrons in a metal can be treated as a gas all the way down to T = 0. The third law applies to all of these systems. However, note that the systems have to be in internal equilibrium for the third law to apply. An example of a system not in equilibrium is a glass, which has frozen-in disorder. For a solid, the lowest-energy phase is the perfect crystal, but the glass phase is higher in energy and is unstable. The glass phase will eventually relax back to the perfect crystalline phase but it may take many years or centuries to do this.

Planck's choice of zero for the entropy was further motivated by the development of statistical mechanics, a subject we will tackle later in this book. It suffices to say here that the statistical definition of entropy, presented in eqn 14.36 ($S = k_B \ln \Omega$), implies that zero entropy is equivalent to Ω = 1. Thus at absolute zero, when a system finds its ground state, the entropy being equal to zero implies that this ground state is non-degenerate.

At this point, we can raise a potential objection to the third law in Planck's form. Consider a perfect crystal composed of N spinless atoms. We are told by the third law that its entropy is zero. However, let us further suppose that each atom has at its centre a nucleus with angular momentum quantum number I. If no magnetic field is applied to this system, then we appear to have a contradiction. The degeneracy of the nuclear spin is 2I + 1 and if I > 0, this will not be equal to one. How can we reconcile this with zero entropy since the non-zero nuclear spin implies that the entropy S of this system should be $S = N k_B \ln(2I + 1)$, to however low a temperature we cool it?

The answer to this apparent contradiction is as follows: in a real system in internal equilibrium, the individual components of the system must be able to exchange energy with each other, i.e. to interact with each other. Nuclear spins actually feel a tiny, but non-zero, magnetic field due to the dipolar fields produced by each other, and this lifts the degeneracy. Another way of looking at this is to say that the interactions give rise to collective excitations of the nuclear spins. These collective excitations are nuclear spin waves, and the lowest-energy nuclear spin wave, corresponding to the longest-wavelength mode, will be non-degenerate. At sufficiently low temperatures (and this will be extremely low!) only that long-wavelength mode will be thermally occupied and the entropy of the nuclear spin system will be zero.

However, this example raises an important point. If we cool a crystal, we will extract energy from the lattice and its entropy will drop towards zero. However, the nuclear spins will still retain their entropy until cooled to a much lower temperature (reflecting the weaker interactions between nuclear spins compared to the bonds between atoms in the lattice). If we find a method of cooling the nuclei, there might still be some residual entropy associated with the individual nucleons. All these thermodynamic subsystems (the electrons, the nuclear spins, and the nucleons) are very weakly coupled to each other, but their entropies are additive. Francis Simon (1893–1956) (Fig. 18.4) in 1937 called these different subsystems 'aspects' and formulated the third law as follows:

Simon's statement of the third law
The contribution to the entropy of a system by each aspect of the system which is in internal thermodynamic equilibrium tends to zero as T → 0.

Simon's statement is convenient because it allows us to focus on a particular aspect of interest, knowing that its entropy will tend to zero as T approaches 0, while ignoring the aspects that we don't care about and which might not lose their entropy until much closer to T = 0.

18.2

Consequences of the third law

Fig. 18.4 F. E. Simon

Having provided various statements of the third law, it is time to examine some of its consequences.

• Heat capacities tend to zero as T → 0
This consequence is easy to prove. Any heat capacity C satisfies
$$C = T\left(\frac{\partial S}{\partial T}\right) = \left(\frac{\partial S}{\partial \ln T}\right) \to 0, \qquad (18.5)$$
because as T → 0, ln T → −∞ and S → 0. Hence C → 0. Note that this result disagrees with the classical prediction of C = R/2 per mole per degree of freedom. (We note for future reference that this observation emphasizes the fact that the equipartition theorem, to be presented in Chapter 19, is a high temperature theory and fails at low temperature.)

• Thermal expansion stops
Since S → 0 as T → 0, we have for example that
$$\left(\frac{\partial S}{\partial p}\right)_T \to 0 \qquad (18.6)$$
as T → 0, but by a Maxwell relation, $(\partial S/\partial p)_T = -(\partial V/\partial T)_p$, this implies that
$$\frac{1}{V}\left(\frac{\partial V}{\partial T}\right)_p \to 0 \qquad (18.7)$$
and hence the isobaric expansivity $\beta_p \to 0$.

• No gases remain ideal as T → 0
The ideal monatomic gas has served us well in this book as a simple model which allows us to obtain tractable results. One of these results is eqn 11.26, which states that for an ideal gas, $C_p - C_V = R$ per mole. However, as T → 0, both $C_p$ and $C_V$ tend to zero, and this equation cannot be satisfied. Moreover, we expect that $C_V = 3R/2$ per mole, and as we have seen, this also does not work down to absolute zero. Yet another nail in the coffin of the ideal gas is the expression for its entropy given in eqn 16.82 ($S = C_V \ln T + R \ln V + \text{constant}$). As T → 0, this equation yields S → −∞, which is as far from zero as you can get! Thus we see that the third law forces us to abandon the ideal gas model when thinking about gases at low temperature. Of course, it is at low temperature that the weak interactions between gas molecules (blissfully neglected so far since we have modelled gas molecules as independent entities) become more important. More sophisticated models of gases will be considered in Chapter 26.

• Curie's law breaks down
Curie's law states that the susceptibility χ is proportional to 1/T and hence χ → ∞ as T → 0. However, the third law implies that $(\partial S/\partial B)_T \to 0$ and hence
$$\left(\frac{\partial S}{\partial B}\right)_T = \left(\frac{\partial m}{\partial T}\right)_B = \frac{VB}{\mu_0}\left(\frac{\partial \chi}{\partial T}\right)_B \qquad (18.8)$$
must tend to zero. Thus $\partial\chi/\partial T \to 0$, in disagreement with Curie's law. Why does it break down? You may begin to see a theme developing: it is interactions again! Curie's law is derived by considering magnetic moments to be entirely independent, in which case their properties can be determined by considering only the balance between the applied field (driving the moments to align) and temperature (driving the moments to randomize). The susceptibility measures their infinitesimal response to an infinitesimal applied field; this becomes infinite when the thermal fluctuations are removed at T = 0. However, if interactions between the magnetic moments are switched on, then an applied field will have much less of an effect because the magnetic moments will already be driven into some partially ordered state by each other.

There is a basic underlying message here: the microscopic parts of a system can behave independently at high temperature, where the thermal energy $k_B T$ is much larger than any interaction energy. At low temperature, these interactions become important and all notions of independence break down. To paraphrase (badly) the poet John Donne: No man is an island, and especially not as T → 0.

• Unattainability of absolute zero
The final point can almost be elevated to the status of another statement of the third law:

It is impossible to cool to T = 0 in a ﬁnite number of steps.

This is messy to prove rigorously, but we can justify the argument by reference to Fig. 18.5, which shows plots of S against T for different values of a parameter X (which might be magnetic field, for example). Cooling is produced by isothermal increases in X and adiabatic decreases in X. If the third law did not hold, it would be possible to proceed according to Fig. 18.5(a) and cool all the way to absolute zero. However, because of the third law, the situation is as in Fig. 18.5(b) and the number of steps needed to get to absolute zero becomes infinite.

Fig. 18.5 The entropy as a function of temperature for two different values of a parameter X. Cooling is produced by isothermal increases in X (i.e. $X_1 \to X_2$) and adiabatic decreases in X (i.e. $X_2 \to X_1$). (a) If S does not go to 0 as T → 0 it is possible to cool to absolute zero in a finite number of steps. (b) If the third law is obeyed, then it is impossible to cool to absolute zero in a finite number of steps.

Before concluding this chapter, we make one remark concerning Carnot engines. Consider a Carnot engine, operating between reservoirs with temperatures $T_\ell$ and $T_h$, having an efficiency which is equal to $\eta = 1 - (T_\ell/T_h)$ (eqn 13.10). If $T_\ell \to 0$, the efficiency η tends to 1. If you operated this Carnot engine, you would then get perfect conversion of heat into work, in violation of Kelvin's statement of the second law of thermodynamics. It seems at first sight that the unattainability of absolute zero (a version of the third law) is a simple consequence of the second law. However, there are difficulties in considering a Carnot engine operating between two reservoirs, one of which is at absolute zero. It is not clear how you can perform an isothermal process at absolute zero, because once a system is at absolute zero it is not possible to get it to change its thermodynamical state without warming it. Thus it is generally believed that the third law is indeed a separate postulate which is independent of the second law.

The third law points to the fact that many of our 'simple' thermodynamic models, such as the ideal gas equation and Curie's law of paramagnets, need substantial modification if they are to give correct predictions as T → 0. It is therefore opportune to consider more sophisticated models based upon the microscopic properties of real systems, and that brings us to statistical mechanics, the subject of the next part of this book.
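The step-counting argument behind Fig. 18.5(b) can be made concrete with a toy calculation. The sketch below assumes, purely for illustration, that the two entropy curves are straight lines through the origin, S(T, X) = c(X) T with c(X2) < c(X1), so that they meet at S = 0 when T = 0 as the third law requires. The function name, the linear form and the values of c are hypothetical choices, not part of the text.

```python
def cooling_steps(T_start, c1, c2, T_target, max_steps=1000):
    """Count cooling cycles needed to get below T_target.

    Toy model (an assumption): S(T, X) = c(X) * T with c2 = c(X2) < c1 = c(X1),
    so both curves reach S = 0 at T = 0.
    """
    T, steps = T_start, 0
    while T > T_target and steps < max_steps:
        S = c2 * T   # isothermal increase X1 -> X2 lowers the entropy at fixed T
        T = S / c1   # adiabatic decrease X2 -> X1 keeps S fixed, so T drops
        steps += 1
    return steps, T

for target in (1e-1, 1e-3, 1e-6, 1e-9):
    n, T = cooling_steps(1.0, c1=1.0, c2=0.5, T_target=target)
    print(f"target {target:g} K: {n} steps, reached {T:.3g} K")
```

In this model each cycle multiplies the temperature by the fixed ratio c2/c1, so the temperature falls geometrically: any non-zero target is reached in a finite number of steps, but the number of steps grows without bound as the target approaches zero.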

Chapter summary

• The third law of thermodynamics can be stated in various ways:
  • Nernst: Near absolute zero, all reactions in a system in internal equilibrium take place with no change in entropy.
  • Planck: The entropy of all systems in internal equilibrium is the same at absolute zero, and may be taken to be zero.
  • Simon: The contribution to the entropy of a system by each aspect of the system which is in internal thermodynamic equilibrium tends to zero as T → 0.
  • Unattainability of T = 0: it is impossible to cool to T = 0 in a finite number of steps.
• The third law implies that heat capacities and thermal expansivities tend to zero as T → 0.
• Interactions between the constituents of a system become important as T → 0, and this leads to the breakdown of the concept of an ideal gas and also the breakdown of Curie's law.

Exercises

(18.1) Summarize the main consequences of the third law of thermodynamics. Explain how it casts a shadow of doubt on some of the conclusions from various thermodynamic models.

(18.2) Recall from eqn 16.26 that
$$H = G - T\left(\frac{\partial G}{\partial T}\right)_p. \qquad (18.9)$$
Hence show that
$$\Delta G - \Delta H = T\left(\frac{\partial \Delta G}{\partial T}\right)_p, \qquad (18.10)$$
and explain what happens to these terms as T → 0.

Part VII

Statistical mechanics

In this part we introduce the subject of statistical mechanics. This is a thermodynamic theory in which account is taken of the microscopic properties of individual atoms or molecules analysed in a statistical fashion. Statistical mechanics allows macroscopic properties to be calculated from the statistical distribution of the microscopic behaviour of individual atoms and molecules. This part is structured as follows:

• In Chapter 19, we present the equipartition theorem, a principle that states that the internal energy of a classical system composed of a large number of particles in thermal equilibrium will distribute itself evenly among each of the quadratic degrees of freedom accessible to the particles of the system.
• In Chapter 20, we introduce the partition function, which encodes all the information concerning the states of a system and their thermal occupation. Having the partition function allows you to calculate all the thermodynamic properties of the system.
• In Chapter 21, we calculate the partition function for an ideal gas and use this to define the quantum concentration. We show how the indistinguishability of molecules affects the statistical properties and has thermodynamic consequences.
• In Chapter 22, we extend our results on partition functions to systems in which the number of particles can vary. This allows us to define the chemical potential and introduce the grand partition function.
• In Chapter 23, we consider the statistical mechanics of light, which is quantized as photons, introducing blackbody radiation, radiation pressure, and the cosmic microwave background.
• In Chapter 24, we discuss the analogous behaviour of lattice vibrations, quantized as phonons, and introduce the Einstein model and Debye model of the thermal properties of solids.

19

Equipartition of energy

19.1 Equipartition theorem
19.2 Applications
19.3 Assumptions made
19.4 Brownian motion
Chapter summary
Exercises

Before introducing the partition function in Chapter 20, which will allow us to calculate many diﬀerent properties of thermodynamic systems on the basis of their microscopic energy levels (which can be deduced using quantum mechanics), we devote this chapter to the equipartition theorem. This theorem provides a simple, classical theory of thermal systems. It gives remarkably good answers, but only at high temperature, where the details of quantized energy levels can be safely ignored. We will motivate and prove this theorem in the following section, and then apply it to various physical situations in Section 19.2, demonstrating that it provides a rapid and straightforward method for deriving heat capacities. Finally, in Section 19.3, we will critically examine the assumptions which we have made in the derivation of the equipartition theorem.

19.1

Equipartition theorem

Very often in physics one is faced with an energy dependence which is quadratic in some variable.¹ An example would be the kinetic energy $E_{\rm KE}$ of a particle with mass m and velocity v, which is given by
$$E_{\rm KE} = \frac{1}{2}mv^2. \qquad (19.1)$$
Another example would be the potential energy $E_{\rm PE}$ of a mass suspended at one end of a spring with spring constant k and displaced by a distance x from its equilibrium point (see Fig. 19.1). This is given by
$$E_{\rm PE} = \frac{1}{2}kx^2. \qquad (19.2)$$
In fact, the total energy E of a moving mass on the end of a spring is given by the sum of these two terms, so that
$$E = E_{\rm KE} + E_{\rm PE} = \frac{1}{2}mv^2 + \frac{1}{2}kx^2, \qquad (19.3)$$
and, as the mass undergoes simple harmonic motion, energy is exchanged between $E_{\rm KE}$ and $E_{\rm PE}$, while the total energy remains fixed.

¹ We will show later in Section 19.3 that this quadratic dependence is very common; most potential wells are approximately quadratic near the bottom of the well.

Fig. 19.1 A mass m suspended on a spring with spring constant k. The mass is displaced by a distance x from its equilibrium or 'rest' position.

Let us suppose that a system whose energy has a quadratic dependence on some variable is allowed to interact with a heat bath. It is then able to borrow energy occasionally from its environment, or even give it back into the environment. What mean thermal energy would it have? The thermal energy would be stored as kinetic or potential energy, so if a mass on a spring is allowed to come into thermal equilibrium with its environment, one could in principle take a very big magnifying glass and see the mass on a spring jiggling around all by itself owing to such thermal vibrations. How big would such vibrations be?

The calculation is quite straightforward. Let the energy E of a particular system be given by
$$E = \alpha x^2, \qquad (19.4)$$
where α is some positive constant and x is some variable (see Fig. 19.2). Let us also assume that x could in principle take any value with equal probability. The probability P(x) of the system having a particular energy $\alpha x^2$ is proportional to the Boltzmann factor $e^{-\beta\alpha x^2}$ (see eqn 4.13), so that after normalizing, we have
$$P(x) = \frac{e^{-\beta\alpha x^2}}{\int_{-\infty}^{\infty} e^{-\beta\alpha x^2}\,dx}, \qquad (19.5)$$
and the mean energy is
$$\langle E\rangle = \int_{-\infty}^{\infty} E\,P(x)\,dx = \frac{\int_{-\infty}^{\infty} \alpha x^2 e^{-\beta\alpha x^2}\,dx}{\int_{-\infty}^{\infty} e^{-\beta\alpha x^2}\,dx} = \frac{1}{2\beta} = \frac{1}{2}k_B T. \qquad (19.6)$$
This is a really remarkable result. It is independent of the constant α and gives a mean energy which is proportional to temperature. The theorem can be extended straightforwardly to the energy being the sum of n quadratic terms, as shown in the following example.

Fig. 19.2 The energy E of a system is $E = \alpha x^2$.

Example 19.1
Assume that the energy E of a system can be given by the sum of n independent quadratic terms, so that
$$E = \sum_{i=1}^{n} \alpha_i x_i^2, \qquad (19.7)$$
where $\alpha_i$ are constants and $x_i$ are some variables. Assume also that each $x_i$ could in principle take any value with equal probability. Calculate the mean energy.

Solution: The mean energy $\langle E\rangle$ is given by
$$\langle E\rangle = \int_{-\infty}^{\infty}\cdots\int_{-\infty}^{\infty} E\,P(x_1, x_2, \ldots, x_n)\,dx_1\,dx_2\cdots dx_n. \qquad (19.8)$$

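This now looks quite complicated when we substitute in the probability as follows:
$$\langle E\rangle = \frac{\int_{-\infty}^{\infty}\cdots\int_{-\infty}^{\infty}\left(\sum_{i=1}^{n}\alpha_i x_i^2\right)\exp\left(-\beta\sum_{j=1}^{n}\alpha_j x_j^2\right)dx_1\,dx_2\cdots dx_n}{\int_{-\infty}^{\infty}\cdots\int_{-\infty}^{\infty}\exp\left(-\beta\sum_{j=1}^{n}\alpha_j x_j^2\right)dx_1\,dx_2\cdots dx_n}, \qquad (19.9)$$
where i and j have been used to distinguish different sums. This expression can be simplified by recognizing that it is the sum of n similar terms (write out the sums to convince yourself):
$$\langle E\rangle = \sum_{i=1}^{n}\frac{\int_{-\infty}^{\infty}\cdots\int_{-\infty}^{\infty}\alpha_i x_i^2\exp\left(-\beta\sum_{j=1}^{n}\alpha_j x_j^2\right)dx_1\,dx_2\cdots dx_n}{\int_{-\infty}^{\infty}\cdots\int_{-\infty}^{\infty}\exp\left(-\beta\sum_{j=1}^{n}\alpha_j x_j^2\right)dx_1\,dx_2\cdots dx_n}, \qquad (19.10)$$
and then all but one integral cancels between the numerator and denominator of each term, so that
$$\langle E\rangle = \sum_{i=1}^{n}\frac{\int_{-\infty}^{\infty}\alpha_i x_i^2\exp\left(-\beta\alpha_i x_i^2\right)dx_i}{\int_{-\infty}^{\infty}\exp\left(-\beta\alpha_i x_i^2\right)dx_i}. \qquad (19.11)$$
Now each term in this sum is the same as the one treated above in eqn 19.6. Hence
$$\langle E\rangle = \sum_{i=1}^{n}\langle\alpha_i x_i^2\rangle = \sum_{i=1}^{n}\frac{1}{2}k_B T = \frac{n}{2}k_B T. \qquad (19.12)$$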
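The result of Example 19.1 can be checked numerically by evaluating the one-dimensional integrals of eqn 19.11 directly. This is a minimal sketch: the function name, the midpoint-rule integration window, and the three values of α are arbitrary choices made for illustration, not part of the text.

```python
import math

K_B = 1.380649e-23  # Boltzmann constant, J/K

def mean_mode_energy(alpha, beta, half_width_sigmas=12.0, steps=20_000):
    """Midpoint-rule estimate of <alpha x^2> with Boltzmann weight exp(-beta*alpha*x^2).

    The window is wide enough for the Gaussian tails to be negligible.
    """
    sigma = 1.0 / math.sqrt(2.0 * beta * alpha)   # width of the Boltzmann weight
    L = half_width_sigmas * sigma
    h = 2.0 * L / steps
    num = den = 0.0
    for i in range(steps):
        x = -L + (i + 0.5) * h
        w = math.exp(-beta * alpha * x * x)
        num += alpha * x * x * w
        den += w
    return num / den

T = 300.0
beta = 1.0 / (K_B * T)
alphas = [0.5, 3.0, 40.0]   # arbitrary illustrative constants (assumptions)
total = sum(mean_mode_energy(a, beta) for a in alphas)
print(total / (K_B * T))    # ratio <E>/(k_B T) for n = 3 modes
```

Each mode contributes the same amount whatever its α, so the printed ratio should be very close to n/2 = 1.5 for these three modes.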

Each quadratic energy dependence of the system is called a mode of the system (or sometimes a degree of freedom of the system). The spring, our example at the beginning of this chapter, has two such modes. The result of the example above shows that each mode of the system contributes an amount of energy equal to $\frac{1}{2}k_B T$ to the total mean energy of the system. This result is the basis of the equipartition theorem, which we state as follows:

Equipartition theorem: If the energy of a classical system is the sum of n quadratic modes, and that system is in contact with a heat reservoir at temperature T, the mean energy of the system is given by $n \times \frac{1}{2}k_B T$.

The equipartition theorem expresses the fact that energy is 'equally partitioned' between all the separate modes of the system, each mode having a mean energy of precisely $\frac{1}{2}k_B T$.


Example 19.2
We return to our example of a mass on a spring, whose energy is given by the sum of two quadratic energy modes (see eqn 19.3). The equipartition theorem then implies that the mean energy is given by
$$2 \times \frac{1}{2}k_B T = k_B T. \qquad (19.13)$$

How big is this energy? At room temperature, $k_B T \approx 4 \times 10^{-21}$ J ≈ 0.025 eV, which is a tiny energy. This energy isn't going to set a 10 kg mass on a stiff spring vibrating very much! However, the extraordinary thing about the equipartition theorem is that the result holds independently of the size of the system, so that $k_B T = 0.025$ eV is also the mean energy of an atom on the end of a chemical bond (which can be modelled as a spring) at room temperature. For an atom, $k_B T = 0.025$ eV goes a very long way and this explains why atoms in molecules jiggle around a lot at room temperature. We will explore this in more detail below.
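To put rough numbers on the size of these thermal vibrations, one can use the potential-energy mode alone: setting ½ k ⟨x²⟩ = ½ k_B T gives a root-mean-square displacement of √(k_B T / k). The sketch below is a classical estimate only, and the two spring constants are order-of-magnitude assumptions chosen for illustration; neither value comes from the text.

```python
import math

K_B = 1.380649e-23  # Boltzmann constant, J/K

def rms_displacement(k_spring, T):
    """Classical x_rms obtained from (1/2) k <x^2> = (1/2) k_B T."""
    return math.sqrt(K_B * T / k_spring)

T = 300.0
# Spring constants are order-of-magnitude assumptions, not values from the text.
macroscopic = rms_displacement(1.0e4, T)   # stiff laboratory spring, N/m
bond = rms_displacement(5.0e2, T)          # typical chemical-bond stiffness, N/m
print(f"macroscopic spring: {macroscopic:.2e} m")
print(f"chemical bond:      {bond:.2e} m")
```

With these assumed stiffnesses both displacements come out at around a picometre. What differs is the scale they should be compared with: such a displacement is a few per cent of a typical bond length of about 10⁻¹⁰ m, but is utterly negligible for a macroscopic mass.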

19.2

Applications

We now consider four applications of the equipartition theorem.

19.2.1

Translational motion in a monatomic gas

The energy of each atom in a monatomic gas is given by
$$E = \frac{1}{2}mv_x^2 + \frac{1}{2}mv_y^2 + \frac{1}{2}mv_z^2, \qquad (19.14)$$
where $\mathbf{v} = (v_x, v_y, v_z)$ is the velocity of the atom (see Fig. 19.3). This energy is the sum of three independent quadratic modes, and thus the equipartition theorem gives the mean energy as
$$\langle E\rangle = 3 \times \frac{1}{2}k_B T = \frac{3}{2}k_B T. \qquad (19.15)$$

This is in agreement with our earlier derivation of the mean kinetic energy of a gas (see eqn 5.17).
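As a quick numerical illustration of eqn 19.15, equating the mean kinetic energy ½ m ⟨v²⟩ to (3/2) k_B T gives a root-mean-square speed of √(3 k_B T / m). The sketch below evaluates this for helium and argon at 300 K; the choice of gases and temperature, and the function name, are assumptions made for illustration.

```python
import math

K_B = 1.380649e-23        # Boltzmann constant, J/K
U = 1.66053906660e-27     # atomic mass constant, kg

def rms_speed(mass_u, T):
    """v_rms from (1/2) m <v^2> = (3/2) k_B T, with the atomic mass given in u."""
    return math.sqrt(3.0 * K_B * T / (mass_u * U))

T = 300.0
v_he = rms_speed(4.0026, T)    # helium
v_ar = rms_speed(39.948, T)    # argon
print(f"He: {v_he:.0f} m/s, Ar: {v_ar:.0f} m/s")
```

Both gases have the same mean kinetic energy per atom at a given temperature, so the lighter atom must move faster.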

19.2.2

Rotational motion in a diatomic gas

Fig. 19.3 The velocity of a molecule in a gas.

In a diatomic gas, there is an additional possible energy source to consider, namely that of rotational kinetic energy. This adds two terms to the energy,
$$\frac{L_1^2}{2I_1} + \frac{L_2^2}{2I_2}, \qquad (19.16)$$
where $L_1$ and $L_2$ are the angular momenta along the two principal directions shown in Fig. 19.4 and $I_1$ and $I_2$ are the corresponding moments of inertia. We do not need to worry about the direction along the diatomic molecule's bond, the axis labelled '3' in Fig. 19.4. (This is because the moment of inertia in this direction is very small (so that the corresponding rotational kinetic energy is very large), so rotational modes in this direction cannot be excited at ordinary temperature; such rotational modes are connected with the individual molecular electronic levels and we will therefore ignore them.) The total energy is thus the sum of five terms, three due to translational kinetic energy and two due to rotational kinetic energy,
$$E = \frac{1}{2}mv_x^2 + \frac{1}{2}mv_y^2 + \frac{1}{2}mv_z^2 + \frac{L_1^2}{2I_1} + \frac{L_2^2}{2I_2}, \qquad (19.17)$$
and all of these energy modes are independent of one another. Using the equipartition theorem, we can immediately write down the mean energy as
$$\langle E\rangle = 5 \times \frac{1}{2}k_B T = \frac{5}{2}k_B T. \qquad (19.18)$$

Fig. 19.4 Rotational motion in a diatomic gas.

Fig. 19.5 A diatomic molecule can be modelled as two masses connected by a spring.

² See Appendix G.

19.2.3

Vibrational motion in a diatomic gas

If we also include the vibrational motion of the bond linking the two atoms in our diatomic molecule, there are two additional modes to include. The intramolecular bond can be modelled as a spring (see Fig. 19.5), so that the two extra energy terms are the kinetic energy due to relative motion of the two atoms and the potential energy in the bond (let us suppose it has spring constant k). Writing the positions of the two atoms as $\mathbf{r}_1$ and $\mathbf{r}_2$ with respect to some fixed origin, the energy of the molecule can be written
$$E = \frac{1}{2}mv_x^2 + \frac{1}{2}mv_y^2 + \frac{1}{2}mv_z^2 + \frac{L_1^2}{2I_1} + \frac{L_2^2}{2I_2} + \frac{1}{2}\mu(\dot{\mathbf{r}}_1 - \dot{\mathbf{r}}_2)^2 + \frac{1}{2}k(\mathbf{r}_1 - \mathbf{r}_2)^2, \qquad (19.19)$$
where $\mu = m_1 m_2/(m_1 + m_2)$ is the reduced mass² of the system. The equipartition theorem just cares about the number of modes in the system, so the mean energy is simply
$$\langle E\rangle = 7 \times \frac{1}{2}k_B T = \frac{7}{2}k_B T. \qquad (19.20)$$

The heat capacity of the systems described above can be obtained by differentiating the energy with respect to temperature. The mean energy is given by
$$\langle E\rangle = \frac{f}{2}k_B T, \qquad (19.21)$$
where f is the number of degrees of freedom. This equation implies that
$$C_V \text{ per mole} = \frac{f}{2}R, \qquad (19.22)$$
and using eqn 11.28 we have
$$C_p \text{ per mole} = \left(\frac{f}{2} + 1\right)R, \qquad (19.23)$$
from which we may derive
$$\gamma = \frac{C_p}{C_V} = \frac{\left(\frac{f}{2} + 1\right)R}{\frac{f}{2}R} = 1 + \frac{2}{f}. \qquad (19.24)$$

We can summarize our results for the heat capacity of gases, per atom/molecule, as follows:

Gas         Modes                                        f    ⟨E⟩            γ
Monatomic   translational only                           3    (3/2) k_B T    5/3
Diatomic    translational and rotational                 5    (5/2) k_B T    7/5
Diatomic    translational, rotational and vibrational    7    (7/2) k_B T    9/7
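The entries in this summary follow mechanically from eqns 19.22 to 19.24 once the number of modes f is known. As a small sketch, the following reproduces them with exact fractions; the function name and the row labels are illustrative choices.

```python
from fractions import Fraction

def heat_capacity_summary(f):
    """Return (C_V/R, C_p/R, gamma) for a gas with f quadratic modes per molecule."""
    cv = Fraction(f, 2)   # eqn 19.22
    cp = cv + 1           # eqn 19.23
    return cv, cp, cp / cv  # eqn 19.24: gamma = 1 + 2/f

for label, f in [("monatomic", 3), ("diatomic, rot.", 5), ("diatomic, rot. + vib.", 7)]:
    cv, cp, gamma = heat_capacity_summary(f)
    print(f"{label:22s} f={f}  C_V={cv} R  C_p={cp} R  gamma={gamma}")
```

Using exact fractions rather than floating-point numbers keeps the ratios in the same form as they appear in the summary.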

19.2.4

The heat capacity of a solid

In a solid, the atoms are held rigidly in the lattice and there is no possibility of translational motion. However, the atoms can vibrate about their mean positions. Consider a cubic solid in which each atom is connected by springs (chemical bonds) to six neighbours (one above, one below, one in front, one behind, one to the right, one to the left). Since each spring joins two atoms, then if there are N atoms in the solid, there are 3N springs (neglecting the surface of the solid, a reasonable approximation if N is large). Each spring has two quadratic modes of energy (one kinetic, one potential) and hence a mean thermal energy equal to $2 \times \frac{1}{2}k_B T = k_B T$. Hence the mean energy of the solid is
$$\langle E\rangle = 3N k_B T, \qquad (19.25)$$
and the heat capacity is $\partial\langle E\rangle/\partial T = 3N k_B$. Because $R = N_A k_B$, the molar heat capacity of a solid is then expected to be $3N_A k_B = 3R$.
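For reference, the numerical value of this prediction is easily evaluated. The short sketch below simply computes 3 N k_B for one mole of atoms using the defined SI values of the constants; the function name is an illustrative choice.

```python
K_B = 1.380649e-23      # Boltzmann constant, J/K
N_A = 6.02214076e23     # Avogadro constant, 1/mol

def solid_heat_capacity(n_atoms):
    """Equipartition estimate C = 3 N k_B for a solid containing n_atoms atoms."""
    return 3 * n_atoms * K_B

R = N_A * K_B
print(f"R  = {R:.3f} J K^-1 mol^-1")
print(f"3R = {solid_heat_capacity(N_A):.2f} J K^-1 mol^-1")
```

This value, about 25 J K⁻¹ mol⁻¹, is the classical Dulong–Petit value; like all equipartition results, it should be read as a high-temperature limit.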

19.3

Assumptions made

The equipartition theorem seems to be an extremely powerful tool for evaluating thermal energies of systems. However, it does have some limitations, and to discover what these are, it is worth thinking about the assumptions we have made in deriving it.

Fig. 19.6 In a cubic solid, each atom is connected by chemical bonds, modelled as springs, to six nearest neighbours, two along each of the three Cartesian axes. Each spring is shared between two atoms.


• We have assumed that the parameter for which we have taken the energy to be quadratic can take any possible value. In the derivation, the variables $x_i$ could be integrated continuously from $-\infty$ to $\infty$. However, quantum mechanics insists that certain quantities can only take particular ‘quantized’ values. For example, the problem of a mass on a spring is shown by quantum mechanics to have an energy spectrum which is quantized into levels given by $(n + \frac{1}{2})\hbar\omega$. When the thermal energy $k_{\rm B} T$ is of the same order as, or lower than, $\hbar\omega$, the approximation made by ignoring the quantized nature of this energy spectrum is going to be a very bad one. However, when $k_{\rm B} T \gg \hbar\omega$, the quantized nature of the energy spectrum is going to be largely irrelevant, in much the same way that you don’t notice that the different shades of grey in a newspaper photograph are actually made up of lots of little dots if you don’t look closely. Thus we come to an important conclusion:

The equipartition theorem is generally valid only at high temperature, so that the thermal energy is larger than the energy gap between quantized energy levels. Results based on the equipartition theorem should emerge as the high-temperature limit of more detailed theories.

Fig. 19.7 $V(x)$ is a function which is more complicated than a quadratic but which has a minimum at $x = x_0$.

3 Using a Taylor expansion; see Appendix B.

4 The argument that the bottom of almost all potential wells tends to be approximately quadratic could fail if $(\partial^2 V/\partial x^2)_{x_0}$ turned out to be zero. This would happen if, for example, $V(x) = \alpha (x - x_0)^4$.

• Everywhere we have assumed that modes are quadratic. Is that always valid? To give a concrete example, imagine that an atom moves with coordinate $x$ in a potential well given by $V(x)$, which is a function which might be more complicated than a quadratic (see for example Fig. 19.7). At absolute zero, the atom finds a potential minimum at, say, $x_0$ (so that, for the usual reasons, $\partial V/\partial x = 0$ and $\partial^2 V/\partial x^2 > 0$ at $x = x_0$). At temperature $T > 0$, the atom can explore regions away from $x_0$ by borrowing energy of order $k_{\rm B} T$ from its environment. Near $x_0$, the potential $V(x)$ can be expanded³ as

\[ V(x) = V(x_0) + \left( \frac{\partial V}{\partial x} \right)_{x_0} (x - x_0) + \frac{1}{2} \left( \frac{\partial^2 V}{\partial x^2} \right)_{x_0} (x - x_0)^2 + \cdots, \tag{19.26} \]

so that using $(\partial V/\partial x)_{x_0} = 0$, we find that the potential energy is

\[ V(x) = \text{constant} + \frac{1}{2} \left( \frac{\partial^2 V}{\partial x^2} \right)_{x_0} (x - x_0)^2 + \cdots, \tag{19.27} \]

which is a quadratic again. This demonstrates that the bottom of almost all potential wells tends to be approximately quadratic (this is known as the harmonic approximation).⁴ If the temperature gets too high, the system will be able to access positions far away from $x_0$, and the higher-order (cubic, quartic, etc.) terms in the Taylor expansion (known as the anharmonic terms) may become important, so that the approximation of ignoring them breaks down.


However, we have just said that the equipartition theorem is only valid at high temperature. Thus we see that the temperature must be high enough that we can safely ignore the quantum nature of the energy spectrum, but not so high that we invalidate the approximation of treating the relevant potential wells as perfectly quadratic. Fortunately there is plenty of room between these two extremes.
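The harmonic approximation discussed above can be checked numerically. The sketch below uses a hypothetical double-well potential $V(x) = x^4 - 2x^2$ (chosen purely for illustration; it does not appear in the text), which has a minimum at $x_0 = 1$. It estimates the curvature there by a central finite difference and compares the quadratic approximation with the exact potential close to and far from the minimum.

```python
def potential(x: float) -> float:
    """Hypothetical anharmonic potential with a minimum at x0 = 1."""
    return x**4 - 2.0 * x**2


def second_derivative(func, x0: float, h: float = 1e-4) -> float:
    """Central finite-difference estimate of d^2 func / dx^2 at x0."""
    return (func(x0 + h) - 2.0 * func(x0) + func(x0 - h)) / h**2


def harmonic_approximation(func, x0: float):
    """Return the quadratic function of eqn 19.27 built around the minimum x0."""
    v0 = func(x0)
    curvature = second_derivative(func, x0)
    return lambda x: v0 + 0.5 * curvature * (x - x0) ** 2


if __name__ == "__main__":
    approx = harmonic_approximation(potential, 1.0)
    for x in (1.01, 1.1, 1.5):
        print(f"x = {x}: exact = {potential(x):.6f}, harmonic = {approx(x):.6f}")
```

Close to the minimum the two agree well, while at larger displacements the neglected cubic and quartic terms dominate the error.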

19.4 Brownian motion

We close this chapter with one example in which the eﬀect of the equipartition of energy is encountered.

Example 19.3 Brownian motion: In 1827, Robert Brown used a microscope to observe pollen grains jiggling about in water. He was not the first to make such an observation (any small particles suspended in a fluid will do the same, and are very apparent when looking down a microscope), but this effect has come to be known as Brownian motion. The motion is very irregular, consisting of translations and rotations, with grains moving independently, even when moving close to each other. The motion is found to be more active the smaller the particles. The motion is also found to be more active the less viscous the fluid. Brown was able to discount a ‘vital’ explanation of the effect, i.e. that the pollen grains were somehow ‘alive’, but he was not able to give a correct explanation. Something resembling a modern theory of Brownian motion was proposed by Wiener in 1863, though the major breakthrough was made by Einstein in 1905.

We will postpone a full discussion of Brownian motion until Chapter 33, but using the equipartition theorem, the origin of the effect can be understood in outline. Each pollen grain (of mass $m$) is free to move translationally and so has mean kinetic energy $\frac{1}{2} m \langle v^2 \rangle = \frac{3}{2} k_{\rm B} T$. This energy is very small, as we have seen, but leads to a measurable amplitude of vibration for a small pollen grain. The amplitude of vibration is greater for smaller pollen grains because a mean kinetic energy of $\frac{3}{2} k_{\rm B} T$ gives more mean square velocity $\langle v^2 \rangle$ to less massive grains. The thermally excited vibrations are resisted by viscous damping, so the motion is expected to be more pronounced in less viscous fluids.
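As a rough illustration of the equipartition estimate in this example, the sketch below evaluates the root-mean-square speed $\sqrt{\langle v^2 \rangle} = \sqrt{3 k_{\rm B} T / m}$ of a suspended grain. The grain mass of $10^{-14}$ kg is an assumed, order-of-magnitude value chosen only for illustration; it is not taken from the text.

```python
import math

K_B = 1.380649e-23  # Boltzmann constant in J/K (exact SI value)


def rms_speed(mass_kg: float, temperature_k: float) -> float:
    """Root-mean-square speed from (1/2) m <v^2> = (3/2) kB T."""
    return math.sqrt(3.0 * K_B * temperature_k / mass_kg)


if __name__ == "__main__":
    assumed_mass = 1e-14  # kg, hypothetical grain mass for illustration only
    for mass in (assumed_mass, assumed_mass / 4.0):
        print(f"m = {mass:.2e} kg: v_rms = {rms_speed(mass, 300.0):.2e} m/s")
```

Reducing the mass by a factor of four doubles the rms speed, in line with the statement that smaller grains move more actively.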


Chapter summary

• The equipartition theorem states that if the energy of a system is the sum of $n$ quadratic modes, and the system is in contact with a heat reservoir at temperature $T$, the mean energy of the system is given by $n \times \frac{1}{2} k_{\rm B} T$.

• The equipartition theorem is a high-temperature result and gives incorrect predictions at low temperature, where the discrete nature of the energy spectrum cannot be ignored.

Exercises

(19.1) What is the mean kinetic energy in eV at room temperature of a gaseous (a) He atom, (b) Xe atom, (c) Ar atom and (d) Kr atom? [Hint: do you have to do four separate calculations?]

(19.2) Comment on the following values of molar heat capacity in J K⁻¹ mol⁻¹, all measured at constant pressure at 298 K.

Al   24.35      Pb   26.44
Ar   20.79      Ne   20.79
Au   25.42      N2   29.13
Cu   24.44      O2   29.36
He   20.79      Ag   25.53
H2   28.82      Xe   20.79
Fe   25.10      Zn   25.40

[Hint: express them in terms of R; which of the substances is a solid and which is gaseous?]

(19.3) A particle at position $r$ is in a potential well $V(r)$ given by

\[ V(r) = \frac{A}{r^n} - \frac{B}{r}, \tag{19.28} \]

where $A$ and $B$ are positive constants and $n > 2$. Show that the bottom of the well is approximately quadratic in $r$. Hence find the particle’s mean thermal energy at temperature $T$ above the bottom of the well assuming the validity of the equipartition theorem in this situation.

(19.4) In Example 19.1, show that

\[ \langle x_i^2 \rangle = \frac{k_{\rm B} T}{2\alpha_i}. \tag{19.29} \]

(19.5) If the energy $E$ of a system is not quadratic, but behaves like $E = \alpha |x|$ where $\alpha > 0$, show that $\langle E \rangle = k_{\rm B} T$.

(19.6) If the energy $E$ of a system behaves like $E = \alpha |x|^n$, where $n = 1, 2, 3, \ldots$ and $\alpha > 0$, show that $\langle E \rangle = \xi k_{\rm B} T$, where $\xi$ is a numerical constant.

(19.7) A simple pendulum with length $\ell$ makes an angle $\theta$ with the vertical, where $\theta \ll 1$. Show that it oscillates with a period given by $2\pi\sqrt{\ell/g}$. The pendulum is now placed at rest and allowed to come into equilibrium with its surroundings at temperature $T$. Derive an expression for $\langle \theta^2 \rangle$.

20 The partition function

The probability that a system is in some particular state $\alpha$ is proportional to the Boltzmann factor $e^{-\beta E_\alpha}$. We define the partition function¹ $Z$ by a sum over all the states of the Boltzmann factors, so that

\[ Z = \sum_\alpha e^{-\beta E_\alpha}, \tag{20.1} \]

where the sum is over all states of the system (each one labelled by $\alpha$). The partition function $Z$ contains all the information about the energies of the states of the system, and the fantastic thing about the partition function is that all thermodynamical quantities can be obtained from it. It behaves like a zipped-up and compressed version of all the properties of the system; once you have $Z$, you only have to know how to uncompress and unzip it to get functions of state like energy, entropy, Helmholtz function, or heat capacity to simply drop out. We can therefore reduce problem-solving in statistical mechanics to two steps:

Steps to solving statistical mechanics problems:
(1) Write down the partition function $Z$. (see Section 20.1)
(2) Go through some standard procedures to obtain the functions of state you want from $Z$. (see Section 20.2)

We will outline these two steps in the sections that follow. Before we do that, let us pause to notice an important feature about the partition function.

• The zero of energy is always somewhat arbitrary: one can always choose to measure energy with respect to a different zero, since it is only energy differences which are important. Hence the partition function is defined up to an arbitrary multiplicative constant. This seems somewhat strange, but it turns out that many physical quantities are related to the logarithm of the partition function and therefore these quantities are defined up to an additive constant (which might reflect, for example, the rest mass of particles). Other physical quantities, however, are determined by a differential of the logarithm of the partition function and therefore these quantities can be determined precisely.

20.1 Writing down the partition function
20.2 Obtaining the functions of state
20.3 The big idea
20.4 Combining partition functions
Chapter summary
Exercises

1

The partition function is given the symbol Z because the concept was ﬁrst coined in German. Zustandssumme means ‘sum over states’, which is exactly what Z is. The English name ‘partition function’ reﬂects the way in which Z measures how energy is ‘partitioned’ between states of the system.


This point needs to be remembered whenever the partition function is obtained.

Everything in this chapter refers to what is known as the single-particle partition function. We are working out $Z$ for one particle of matter which may well be coupled to a reservoir of other particles, but our attention is only on that single particle of matter. We will defer discussion of how to treat aggregates of particles until the next two chapters. With that in mind, we are now ready to write down some partition functions.

20.1 Writing down the partition function

The partition function contains all the information we need to work out the thermodynamical properties of a system. In this section, we show how you can write down the partition function in the ﬁrst place. This procedure is not complicated! Writing down the partition function is nothing more than evaluating eqn 20.1 for diﬀerent situations. We demonstrate this for a couple of commonly encountered and important examples.

Example 20.1

(a) The two-level system (see Fig. 20.1(a)): Let the energy of a system be either $-\Delta/2$ or $\Delta/2$. Then

\[ Z = \sum_\alpha e^{-\beta E_\alpha} = e^{\beta\Delta/2} + e^{-\beta\Delta/2} = 2\cosh\left( \frac{\beta\Delta}{2} \right), \tag{20.2} \]

where the final result follows from the definition $\cosh x \equiv \frac{1}{2}(e^x + e^{-x})$ (see Appendix B).

(b) The simple harmonic oscillator (see Fig. 20.1(b)): The energy of the system is $(n + \frac{1}{2})\hbar\omega$ where $n = 0, 1, 2, \ldots$, and hence

\[ Z = \sum_\alpha e^{-\beta E_\alpha} = \sum_{n=0}^{\infty} e^{-\beta (n + \frac{1}{2}) \hbar\omega} = e^{-\frac{1}{2}\beta\hbar\omega} \sum_{n=0}^{\infty} e^{-n\beta\hbar\omega} = \frac{e^{-\frac{1}{2}\beta\hbar\omega}}{1 - e^{-\beta\hbar\omega}}, \tag{20.3} \]

where the sum is evaluated using the standard result for the sum of an infinite geometric progression (see Appendix B).

An alternative form of this result is found by multiplying top and bottom by $e^{\frac{1}{2}\beta\hbar\omega}$ to obtain the result $Z = 1/(2\sinh(\beta\hbar\omega/2))$.

Fig. 20.1 Energy levels of (a) a two-level system and (b) a simple harmonic oscillator.
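The closed forms in eqns 20.2 and 20.3 can be checked against a direct evaluation of the sum in eqn 20.1. The sketch below is one way to do this, working in units where energies are measured in the same unit as $1/\beta$; the function names and the truncation `n_max` of the infinite sum are illustrative assumptions.

```python
import math


def z_two_level(beta: float, delta: float) -> float:
    """Direct sum over the two states with energies -delta/2 and +delta/2."""
    return math.exp(beta * delta / 2.0) + math.exp(-beta * delta / 2.0)


def z_sho_sum(beta: float, hbar_omega: float, n_max: int = 2000) -> float:
    """Truncated direct sum over the levels (n + 1/2) hbar omega."""
    return sum(math.exp(-beta * (n + 0.5) * hbar_omega) for n in range(n_max))


def z_sho_closed(beta: float, hbar_omega: float) -> float:
    """Closed form of eqn 20.3 from the geometric progression."""
    x = beta * hbar_omega
    return math.exp(-0.5 * x) / (1.0 - math.exp(-x))


if __name__ == "__main__":
    print(z_two_level(0.7, 1.3), 2.0 * math.cosh(0.7 * 1.3 / 2.0))
    print(z_sho_sum(0.5, 1.0), z_sho_closed(0.5, 1.0))
```

The truncated sum is adequate only when $n_{\max}\,\beta\hbar\omega \gg 1$; at very high temperature more terms would be needed.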

Two further, slightly more complicated, examples are the set of N equally spaced energy levels and the energy levels appropriate for the rotational states of a diatomic molecule.


Example 20.2

(c) The N-level system (see Fig. 20.2(c)): Let the energy levels of a system be $0, \hbar\omega, 2\hbar\omega, \ldots, (N-1)\hbar\omega$. Then

\[ Z = \sum_\alpha e^{-\beta E_\alpha} = \sum_{j=0}^{N-1} e^{-j\beta\hbar\omega} = \frac{1 - e^{-N\beta\hbar\omega}}{1 - e^{-\beta\hbar\omega}}, \tag{20.4} \]

where the sum is evaluated using the standard result for the sum of a finite geometric progression (see Appendix B).

(d) Rotational energy levels (see Fig. 20.2(d)): The rotational kinetic energy of a molecule with moment of inertia $I$ is given by $\hat{J}^2/2I$, where $\hat{J}$ is the total angular momentum operator. The eigenvalues of $\hat{J}^2$ are given by $\hbar^2 J(J+1)$, where the angular momentum quantum number, $J$, takes the values $J = 0, 1, 2, \ldots$ The energy levels of this system are given by

\[ E_J = \frac{\hbar^2}{2I} J(J+1), \tag{20.5} \]

and have degeneracy $2J + 1$. Hence the partition function is

\[ Z = \sum_\alpha e^{-\beta E_\alpha} = \sum_{J=0}^{\infty} (2J+1)\, e^{-\beta\hbar^2 J(J+1)/2I}, \tag{20.6} \]

where the factor $(2J + 1)$ takes into account the degeneracy of the level.
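As with the previous example, eqn 20.4 can be verified by summing the Boltzmann factors directly. The sketch below does this for the N-level system; the function names are illustrative, and energies are expressed through the dimensionless product $x = \beta\hbar\omega$.

```python
import math


def z_n_level_sum(x: float, n_levels: int) -> float:
    """Direct sum over levels 0, hw, 2hw, ..., (N-1)hw with x = beta * hw."""
    return sum(math.exp(-j * x) for j in range(n_levels))


def z_n_level_closed(x: float, n_levels: int) -> float:
    """Closed form of eqn 20.4 (finite geometric progression)."""
    return (1.0 - math.exp(-n_levels * x)) / (1.0 - math.exp(-x))


if __name__ == "__main__":
    for x in (0.1, 1.0, 5.0):
        print(x, z_n_level_sum(x, 20), z_n_level_closed(x, 20))
```

In the limit $x \to 0$ every level contributes equally and $Z$ approaches $N$, which is a useful sanity check on either form.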

Fig. 20.2 Energy levels of (c) an N-level system and (d) a rotational system.

20.2 Obtaining the functions of state

Once $Z$ has been written down, we can place it in our mathematical sausage machine (see Fig. 20.3) which processes it and spits out fully-fledged thermodynamical functions of state. We now outline the derivations of the components of our sausage machine so that you can derive all these functions of state for any given $Z$.

Fig. 20.3 Given $Z$, it takes only a turn of the handle on our ‘sausage machine’ to produce other functions of state ($U$, $F$, $p$, $S$, $C_V$, $H$).

• Internal energy U

The internal energy $U$ is given by

\[ U = \frac{\sum_i E_i e^{-\beta E_i}}{\sum_i e^{-\beta E_i}}. \tag{20.7} \]

Now the denominator of this expression is the partition function $Z = \sum_i e^{-\beta E_i}$, but the numerator is simply

\[ -\frac{{\rm d}Z}{{\rm d}\beta} = \sum_i E_i e^{-\beta E_i}. \tag{20.8} \]


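Thus $U = -(1/Z)({\rm d}Z/{\rm d}\beta)$, or more simply,

\[ U = -\frac{{\rm d}\ln Z}{{\rm d}\beta}. \tag{20.9} \]

This is a useful form since $Z$ is normally expressed in terms of $\beta$. If you prefer things in terms of temperature $T$, then using $\beta = 1/k_{\rm B}T$ (and hence ${\rm d}/{\rm d}\beta = -k_{\rm B}T^2({\rm d}/{\rm d}T)$) one obtains

\[ U = k_{\rm B} T^2 \frac{{\rm d}\ln Z}{{\rm d}T}. \tag{20.10} \]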

• Entropy S

Since the probability $P_j$ is given by a Boltzmann factor divided by the partition function (so that the sum of the probabilities is one, as can be shown using eqn 20.1), we have $P_j = e^{-\beta E_j}/Z$ and hence

\[ \ln P_j = -\beta E_j - \ln Z. \tag{20.11} \]

Equation 14.48 therefore gives us an expression for the entropy as follows:

\[ S = -k_{\rm B} \sum_i P_i \ln P_i = k_{\rm B} \sum_i P_i (\beta E_i + \ln Z) = k_{\rm B} (\beta U + \ln Z), \tag{20.12} \]

where we have used $U = \sum_i P_i E_i$ and $\sum_i P_i = 1$. Using $\beta = 1/k_{\rm B}T$ we have that

\[ S = \frac{U}{T} + k_{\rm B} \ln Z. \tag{20.13} \]

• Helmholtz function F

The Helmholtz function is defined via $F = U - TS$, so using eqn 20.13 we have that

\[ F = -k_{\rm B} T \ln Z. \tag{20.14} \]

This can also be cast into the memorable form

\[ Z = e^{-\beta F}. \tag{20.15} \]

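Once we have an expression for the Helmholtz function, a lot of things come out in the wash. For example, using eqn 16.19 we have that

\[ S = -\left( \frac{\partial F}{\partial T} \right)_V = k_{\rm B} \ln Z + k_{\rm B} T \left( \frac{\partial \ln Z}{\partial T} \right)_V, \tag{20.16} \]

which, using eqn 20.10, is equivalent to eqn 20.13 above. This expression then leads to the heat capacity, via (recall eqn 16.68)

\[ C_V = T \left( \frac{\partial S}{\partial T} \right)_V, \tag{20.17} \]

or one can use

\[ C_V = \left( \frac{\partial U}{\partial T} \right)_V. \tag{20.18} \]

Either way,

\[ C_V = k_{\rm B} T \left[ 2 \left( \frac{\partial \ln Z}{\partial T} \right)_V + T \left( \frac{\partial^2 \ln Z}{\partial T^2} \right)_V \right]. \tag{20.19} \]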

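• Pressure p

The pressure can be obtained from $F$ using eqn 16.20, so that

\[ p = -\left( \frac{\partial F}{\partial V} \right)_T = k_{\rm B} T \left( \frac{\partial \ln Z}{\partial V} \right)_T. \tag{20.20} \]

Having got the pressure we can then write down the enthalpy and the Gibbs function.

• Enthalpy H

\[ H = U + pV = k_{\rm B} T \left[ T \left( \frac{\partial \ln Z}{\partial T} \right)_V + V \left( \frac{\partial \ln Z}{\partial V} \right)_T \right] \tag{20.21} \]

• Gibbs function G

\[ G = F + pV = k_{\rm B} T \left[ -\ln Z + V \left( \frac{\partial \ln Z}{\partial V} \right)_T \right] \tag{20.22} \]

Table 20.1 Thermodynamic quantities derived from the partition function Z.

Function of state                                            Statistical mechanical expression
$U$                                                          $-\dfrac{{\rm d}\ln Z}{{\rm d}\beta}$
$F$                                                          $-k_{\rm B} T \ln Z$
$S = -\left(\dfrac{\partial F}{\partial T}\right)_V = \dfrac{U - F}{T}$     $k_{\rm B} \ln Z + k_{\rm B} T \left(\dfrac{\partial \ln Z}{\partial T}\right)_V$
$p = -\left(\dfrac{\partial F}{\partial V}\right)_T$                        $k_{\rm B} T \left(\dfrac{\partial \ln Z}{\partial V}\right)_T$
$H = U + pV$                                                 $k_{\rm B} T \left[ T \left(\dfrac{\partial \ln Z}{\partial T}\right)_V + V \left(\dfrac{\partial \ln Z}{\partial V}\right)_T \right]$
$G = F + pV = H - TS$                                        $k_{\rm B} T \left[ -\ln Z + V \left(\dfrac{\partial \ln Z}{\partial V}\right)_T \right]$
$C_V = \left(\dfrac{\partial U}{\partial T}\right)_V$                       $k_{\rm B} T \left[ 2 \left(\dfrac{\partial \ln Z}{\partial T}\right)_V + T \left(\dfrac{\partial^2 \ln Z}{\partial T^2}\right)_V \right]$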

These relations are summarized in Table 20.1. In practice, it is easiest to only remember the relations for U and F , since the others can be derived (using the relations shown in the left column of the table). Now that we have described how the process works, we can set about practising this for diﬀerent partition functions.
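The procedure summarized above can be sketched numerically for any finite list of energy levels. The following Python fragment is a minimal illustration, not a definitive implementation: it works in units where $k_{\rm B} = 1$ (so `kT` and the energies share one unit, and $S$ and $C_V$ come out in units of $k_{\rm B}$), and it obtains $C_V$ from a central finite difference of $U$ rather than from a closed formula. All function and key names are assumptions made for this sketch.

```python
import math


def ln_z_and_u(energies, degeneracies, kT):
    """Return (ln Z, U) for the given levels at thermal energy kT."""
    beta = 1.0 / kT
    e0 = min(energies)  # shift by the lowest level to avoid overflow
    weights = [g * math.exp(-beta * (e - e0))
               for e, g in zip(energies, degeneracies)]
    z_shifted = sum(weights)
    u = sum(w * e for w, e in zip(weights, energies)) / z_shifted  # eqn 20.7
    ln_z = -beta * e0 + math.log(z_shifted)
    return ln_z, u


def functions_of_state(energies, degeneracies, kT, rel_step=1e-5):
    """Return ln Z, U, F, S/kB and CV/kB for the given levels."""
    ln_z, u = ln_z_and_u(energies, degeneracies, kT)
    f = -kT * ln_z                       # F = -kB T ln Z (eqn 20.14)
    s = (u - f) / kT                     # S/kB = (U - F)/(kB T)
    h = rel_step * kT
    _, u_plus = ln_z_and_u(energies, degeneracies, kT + h)
    _, u_minus = ln_z_and_u(energies, degeneracies, kT - h)
    cv = (u_plus - u_minus) / (2.0 * h)  # CV/kB = dU/d(kB T), central difference
    return {"lnZ": ln_z, "U": u, "F": f, "S": s, "CV": cv}


if __name__ == "__main__":
    # Two-level system with energies -1/2 and +1/2 (i.e. Delta = 1) at kB T = 0.5
    print(functions_of_state([-0.5, 0.5], [1, 1], 0.5))
```

Shifting the energies by the lowest level before exponentiating changes $Z$ only by a multiplicative constant, which is added back into $\ln Z$; this keeps the sums well behaved at low temperature.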

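Example 20.3

(a) Two-level system: The partition function for a two-level system (whose energy is either $-\Delta/2$ or $\Delta/2$) is given by eqn 20.2, which states that

\[ Z = 2\cosh\left( \frac{\beta\Delta}{2} \right). \tag{20.23} \]

Having obtained $Z$, we can immediately compute the internal energy $U$ and find that

\[ U = -\frac{{\rm d}\ln Z}{{\rm d}\beta} = -\frac{\Delta}{2} \tanh\left( \frac{\beta\Delta}{2} \right). \tag{20.24} \]

Hence the heat capacity $C_V$ is

\[ C_V = \left( \frac{\partial U}{\partial T} \right)_V = k_{\rm B} \left( \frac{\beta\Delta}{2} \right)^2 {\rm sech}^2\left( \frac{\beta\Delta}{2} \right). \tag{20.25} \]

The Helmholtz function is

\[ F = -k_{\rm B} T \ln Z = -k_{\rm B} T \ln\left[ 2\cosh\left( \frac{\beta\Delta}{2} \right) \right], \tag{20.26} \]

and hence the entropy is

\[ S = \frac{U - F}{T} = -\frac{\Delta}{2T} \tanh\left( \frac{\beta\Delta}{2} \right) + k_{\rm B} \ln\left[ 2\cosh\left( \frac{\beta\Delta}{2} \right) \right]. \tag{20.27} \]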

These results are plotted in Fig. 20.4(a). At low temperature, the system is in the lower level and the internal energy $U$ is $-\Delta/2$. The entropy $S$ is $k_{\rm B} \ln \Omega$, where $\Omega$ is the degeneracy and hence $\Omega = 1$ and so $S = k_{\rm B} \ln 1 = 0$. At high temperature, the two levels are each occupied with probability $\frac{1}{2}$, $U$ therefore tends to 0 (which is half-way between $-\Delta/2$ and $\Delta/2$), and the entropy tends to $k_{\rm B} \ln 2$ as expected. The entropy rises as the temperature increases because it reflects the freedom of the system to exist in different states, and at high temperature the system has more freedom (in that it can exist in either of the two states). Conversely, cooling corresponds to a kind of ‘ordering’ in which the system can only exist in one state (the lower), and this gives rise to a reduction in the entropy.
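The low- and high-temperature limits described above can be confirmed by evaluating eqns 20.24 and 20.27 directly. This is a small sketch in units where $k_{\rm B} = 1$, so that the entropy is returned in units of $k_{\rm B}$; the function names are illustrative.

```python
import math


def two_level_u(kT: float, delta: float) -> float:
    """Internal energy of eqn 20.24, in the same units as delta."""
    return -0.5 * delta * math.tanh(0.5 * delta / kT)


def two_level_s(kT: float, delta: float) -> float:
    """Entropy of eqn 20.27 in units of kB."""
    x = 0.5 * delta / kT  # x = beta * delta / 2
    return -x * math.tanh(x) + math.log(2.0 * math.cosh(x))


if __name__ == "__main__":
    for kT in (0.01, 1.0, 1000.0):
        print(f"kT = {kT}: U = {two_level_u(kT, 1.0):.6f}, S/kB = {two_level_s(kT, 1.0):.6f}")
```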


Fig. 20.4 The internal energy U , the entropy S and the heat capacity CV for (a) the two-state system (with energy levels ±∆/2) and (b) the simple harmonic oscillator.

The heat capacity is very small both (i) at low temperature ($k_{\rm B} T \ll \Delta$) and (ii) at very high temperature ($k_{\rm B} T \gg \Delta$), because changes in temperature have no effect on the internal energy when (i) the temperature is so low that only the lower level is occupied and even a small change in temperature won’t alter that, and (ii) the temperature is so high that both levels are occupied equally and a small change in temperature won’t alter this. At very low temperature, it is hard to change the energy of the system because there is not enough energy to excite transitions from the ground state and therefore the system is ‘stuck’. At very high temperature, it is hard to change the energy of the system because both states are equally occupied. In between, roughly around a temperature $T \approx \Delta/k_{\rm B}$, the heat capacity rises to a maximum, known as a Schottky anomaly,² as shown in the lowest panel of Fig. 20.4(a).

2

Walter Schottky (1886–1976).


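This arises because at this temperature, it is possible to thermally excite transitions between the two states of the system. Note, however, that the Schottky anomaly is not a sharp peak, cusp or spike, as might be associated with a phase transition (see Section 28.7), but is a smooth, fairly broad maximum.

(b) Simple harmonic oscillator: The partition function for the simple harmonic oscillator (from eqn 20.3) is

\[ Z = \frac{e^{-\frac{1}{2}\beta\hbar\omega}}{1 - e^{-\beta\hbar\omega}}. \tag{20.28} \]

Hence (referring to Table 20.1), we find that $U$ is given by

\[ U = -\frac{{\rm d}\ln Z}{{\rm d}\beta} = \hbar\omega \left[ \frac{1}{2} + \frac{1}{e^{\beta\hbar\omega} - 1} \right] \tag{20.29} \]

and hence that $C_V$ is

\[ C_V = \left( \frac{\partial U}{\partial T} \right)_V = k_{\rm B} (\beta\hbar\omega)^2 \frac{e^{\beta\hbar\omega}}{(e^{\beta\hbar\omega} - 1)^2}. \tag{20.30} \]

At high temperature, $\beta\hbar\omega \ll 1$ and so $(e^{\beta\hbar\omega} - 1) \approx \beta\hbar\omega$ and so $C_V \to k_{\rm B}$ (the equipartition result). Similarly, $U \to \frac{\hbar\omega}{2} + k_{\rm B} T \approx k_{\rm B} T$. The Helmholtz function is (referring to Table 20.1)

\[ F = -k_{\rm B} T \ln Z = \frac{\hbar\omega}{2} + k_{\rm B} T \ln(1 - e^{-\beta\hbar\omega}), \tag{20.31} \]

and hence the entropy is (referring again to Table 20.1)

\[ S = \frac{U - F}{T} = k_{\rm B} \left[ \frac{\beta\hbar\omega}{e^{\beta\hbar\omega} - 1} - \ln(1 - e^{-\beta\hbar\omega}) \right]. \tag{20.32} \]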

These results are plotted in Fig. 20.4(b). At absolute zero, only the lowest level is occupied, so the internal energy is $\frac{1}{2}\hbar\omega$ and the entropy is $k_{\rm B} \ln 1 = 0$. The heat capacity is also zero. As the temperature rises, more and more energy levels in the ladder can be occupied, and $U$ rises without limit. The entropy also rises (and follows a dependence which is approximately $k_{\rm B} \ln(k_{\rm B} T/\hbar\omega)$ where $k_{\rm B} T/\hbar\omega$ is approximately the number of occupied levels). Both functions carry on rising because the ladder of energy levels increases without limit. The heat capacity rises to a plateau at $C_V = k_{\rm B}$, which is the equipartition result (see eqn 19.13).
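The limiting behaviour of the simple harmonic oscillator can be explored by evaluating eqns 20.29, 20.30 and 20.32 numerically. The sketch below uses units where $k_{\rm B} = 1$ and, by default, $\hbar\omega = 1$; `math.expm1` is used so that $e^{x} - 1$ stays accurate for small $x$. The function names are illustrative, and very low temperatures (roughly $\hbar\omega/k_{\rm B}T$ above 700) would overflow in this simple form.

```python
import math


def sho_u(kT: float, hbar_omega: float = 1.0) -> float:
    """Internal energy of eqn 20.29."""
    x = hbar_omega / kT
    return hbar_omega * (0.5 + 1.0 / math.expm1(x))


def sho_cv(kT: float, hbar_omega: float = 1.0) -> float:
    """Heat capacity of eqn 20.30 in units of kB."""
    x = hbar_omega / kT
    return x * x * math.exp(x) / math.expm1(x) ** 2


def sho_s(kT: float, hbar_omega: float = 1.0) -> float:
    """Entropy of eqn 20.32 in units of kB."""
    x = hbar_omega / kT
    # ln(1 - e^{-x}) is written as log(-expm1(-x)) for accuracy at small x
    return x / math.expm1(x) - math.log(-math.expm1(-x))


if __name__ == "__main__":
    for kT in (0.01, 1.0, 100.0):
        print(f"kT = {kT}: U = {sho_u(kT):.4f}, CV/kB = {sho_cv(kT):.4f}, S/kB = {sho_s(kT):.4f}")
```

At high temperature the entropy grows like the leading logarithm $\ln(k_{\rm B}T/\hbar\omega)$ plus a constant of order one, consistent with the qualitative statement in the text.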

The results for two further examples are plotted in Fig. 20.5 and are shown without derivation. The ﬁrst is an N -level system and is shown in Fig. 20.5(a). At low temperature, the behaviour of the thermodynamic functions resembles that of the simple harmonic oscillator, but at higher temperature, U and S begin to saturate and CV falls, because the system has a limited number of energy levels.

Fig. 20.5 The internal energy $U$, the entropy $S$ and the heat capacity $C_V$ for (a) the N-level system (the simulation is shown for $N = 20$) and (b) the rotating diatomic molecule (in this case $\Delta = \hbar^2/2I$ where $I$ is the moment of inertia).

The second plot in Fig. 20.5(b) is for the rotating diatomic molecule. This resembles the simple harmonic oscillator at higher temperature (the heat capacity saturates at $C_V = k_{\rm B}$) but differs at low temperature owing to the detailed difference in the structure of the energy levels. At high temperature, the heat capacity is given by the equipartition result (see eqn 19.13). This can be verified directly using the partition function which, at high temperature, can be represented by the following integral:

\[ Z = \sum_{J=0}^{\infty} (2J+1)\, e^{-\beta\Delta J(J+1)} \approx \int_0^{\infty} (2J+1)\, e^{-\beta\Delta J(J+1)}\, {\rm d}J, \tag{20.33} \]

where $\Delta = \hbar^2/2I$. Using

\[ \frac{{\rm d}}{{\rm d}J} e^{-\beta\Delta J(J+1)} = -(2J+1)\beta\Delta\, e^{-\beta\Delta J(J+1)}, \tag{20.34} \]

we have that

\[ Z = -\frac{1}{\beta\Delta} \left[ e^{-\beta\Delta J(J+1)} \right]_0^{\infty} = \frac{1}{\beta\Delta}. \tag{20.35} \]

This implies that $U = -{\rm d}\ln Z/{\rm d}\beta = 1/\beta = k_{\rm B} T$ and hence $C_V = ({\rm d}U/{\rm d}T)_V = k_{\rm B}$.
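The quality of the integral approximation in eqn 20.33 can be gauged by summing the rotational series directly. The sketch below does so in terms of the dimensionless product $\beta\Delta$, and obtains $U$ (in units of $\Delta$) from a central finite difference of $\ln Z$ with respect to $\beta\Delta$. The function names, the cutoff `j_max` and the step size are assumptions of this sketch; the cutoff must be large enough that the omitted terms are negligible.

```python
import math


def z_rot(beta_delta: float, j_max: int = 2000) -> float:
    """Truncated direct sum of eqn 20.33 with Delta = hbar^2 / 2I."""
    return sum((2 * j + 1) * math.exp(-beta_delta * j * (j + 1))
               for j in range(j_max + 1))


def u_rot(beta_delta: float, rel_step: float = 1e-5) -> float:
    """U/Delta = -d ln Z / d(beta Delta), by central finite difference."""
    h = rel_step * beta_delta
    return -(math.log(z_rot(beta_delta + h))
             - math.log(z_rot(beta_delta - h))) / (2.0 * h)


if __name__ == "__main__":
    x = 0.01  # beta * Delta, i.e. kB T = 100 Delta
    print(f"sum: Z = {z_rot(x):.3f}, integral estimate: {1.0 / x:.3f}")
    print(f"U/Delta = {u_rot(x):.3f}, kB T/Delta = {1.0 / x:.3f}")
```

At $k_{\rm B}T = 100\Delta$ the sum and the integral estimate agree to better than one per cent, and the agreement degrades as the temperature is lowered towards $\Delta/k_{\rm B}$.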

20.3 The big idea

The examples above illustrate the ‘big idea’ of statistical mechanics: you describe a system by its energy levels $E_\alpha$ and evaluate its properties by following the prescription given by the two steps:

(1) Write down $Z = \sum_\alpha e^{-\beta E_\alpha}$.
(2) Evaluate various functions of state using the expressions in Table 20.1.

And that’s really all there is to it!³ You can understand the results by comparing the energy $k_{\rm B} T$ to the spacings between energy levels.

• If $k_{\rm B} T$ is much less than the spacing between the lowest energy level and the first excited level then the system will sit in the lowest level.

• If there is a finite set of levels and $k_{\rm B} T$ is much larger than the energy spacing between the lowest and highest levels, then each energy level will be occupied with equal probability.

• If there is an infinite ladder of levels and $k_{\rm B} T$ is much larger than the energy spacing between adjacent levels, then the mean energy rises linearly with $T$ and one obtains a result consistent with the equipartition theorem.

3 Well, almost. The Schrödinger equation can only be solved for a few systems, and if you don’t know the energy levels of your system, you can’t write down $Z$. Fortunately, there are quite a number of systems for which you can solve the Schrödinger equation, some of which we are considering in this chapter, and they describe lots and lots of important physical systems, enough to keep us going in this book!
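The first two regimes listed above can be illustrated with a short calculation of the occupation probabilities $P_j = e^{-\beta E_j}/Z$ for a finite set of non-degenerate levels. The four equally spaced levels used here are a hypothetical example, and the function name is an illustrative choice; units are such that $k_{\rm B} = 1$.

```python
import math


def occupation_probabilities(energies, kT):
    """Boltzmann probabilities P_j = exp(-E_j / kT) / Z for non-degenerate levels."""
    e0 = min(energies)  # shift the zero of energy; the probabilities are unchanged
    weights = [math.exp(-(e - e0) / kT) for e in energies]
    z = sum(weights)
    return [w / z for w in weights]


if __name__ == "__main__":
    levels = [0.0, 1.0, 2.0, 3.0]  # hypothetical, equally spaced levels
    for kT in (0.01, 1.0, 1.0e4):
        probs = occupation_probabilities(levels, kT)
        print(f"kT = {kT}: " + ", ".join(f"{p:.4f}" for p in probs))
```

Shifting all energies by a constant rescales $Z$ but leaves every $P_j$ unchanged, which is the arbitrariness in the zero of energy noted at the start of the chapter.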

20.4 Combining partition functions

Consider the case when the energy $E$ of a particular system depends on various independent contributions. For example, suppose it is a sum of two contributions $a$ and $b$, so that the energy levels are given by $E_{i,j}$ where

\[ E_{i,j} = E_i^{(a)} + E_j^{(b)}, \tag{20.36} \]

and where $E_i^{(a)}$ is the $i$th level due to contribution $a$ and $E_j^{(b)}$ is the $j$th level due to contribution $b$, so the partition function $Z$ is

\[ Z = \sum_i \sum_j e^{-\beta (E_i^{(a)} + E_j^{(b)})} = \sum_i e^{-\beta E_i^{(a)}} \sum_j e^{-\beta E_j^{(b)}} = Z_a Z_b, \tag{20.37} \]

so that the partition functions of the independent contributions multiply. Hence also $\ln Z = \ln Z_a + \ln Z_b$, and the effect on functions of state which depend on $\ln Z$ is that the independent contributions add.
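The factorization in eqn 20.37 is easy to confirm numerically: build every combined level $E_i^{(a)} + E_j^{(b)}$, sum the Boltzmann factors, and compare with the product $Z_a Z_b$. The two small sets of levels below are arbitrary illustrative values, and the function names are assumptions of this sketch.

```python
import math


def partition_function(energies, beta: float) -> float:
    """Z = sum over states of exp(-beta * E)."""
    return sum(math.exp(-beta * e) for e in energies)


def combined_levels(levels_a, levels_b):
    """All sums E_i^(a) + E_j^(b) of two independent contributions (eqn 20.36)."""
    return [ea + eb for ea in levels_a for eb in levels_b]


if __name__ == "__main__":
    levels_a = [0.0, 0.3, 1.1]        # arbitrary illustrative levels
    levels_b = [-0.5, 0.5, 0.9, 2.0]  # arbitrary illustrative levels
    beta = 1.7
    z_total = partition_function(combined_levels(levels_a, levels_b), beta)
    z_product = partition_function(levels_a, beta) * partition_function(levels_b, beta)
    print(z_total, z_product)
```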


Example 20.4

(i) The partition function $Z$ for $N$ independent simple harmonic oscillators is given by

\[ Z = Z_{\rm SHO}^N, \tag{20.38} \]

where $Z_{\rm SHO} = e^{-\frac{1}{2}\beta\hbar\omega}/(1 - e^{-\beta\hbar\omega})$, from eqn 20.3, is the partition function for a single simple harmonic oscillator.

(ii) A diatomic molecule with both vibrational and rotational degrees of freedom has a partition function $Z$ given by

\[ Z = Z_{\rm vib} Z_{\rm rot}, \tag{20.39} \]

where $Z_{\rm vib}$ is the vibrational partition function $Z_{\rm vib} = e^{-\frac{1}{2}\beta\hbar\omega}/(1 - e^{-\beta\hbar\omega})$, from eqn 20.3, and $Z_{\rm rot}$ is the rotational partition function

\[ Z_{\rm rot} = \sum_\alpha e^{-\beta E_\alpha} = \sum_{J=0}^{\infty} (2J+1)\, e^{-\beta\hbar^2 J(J+1)/2I}, \tag{20.40} \]

from eqn 20.6. For a gas of diatomic molecules, we would also need a factor in the partition function corresponding to translational motion. We will derive this in the following chapter.

Chapter summary

• The partition function $Z = \sum_\alpha e^{-\beta E_\alpha}$ contains the information needed to find many thermodynamic properties.

• The equations $U = -{\rm d}\ln Z/{\rm d}\beta$, $F = -k_{\rm B} T \ln Z$, $S = (U - F)/T$, $p = -(\partial F/\partial V)_T$, $H = U + pV$, $G = H - TS$ can be used to generate the relevant thermodynamic properties from $Z$.

Exercises

(20.1) Show that at high temperature, such that $k_{\rm B} T \gg \hbar\omega$, the partition function of the simple harmonic oscillator is approximately $Z \approx (\beta\hbar\omega)^{-1}$. Hence find $U$, $C$, $F$ and $S$ at high temperature. Repeat the problem for the high temperature limit of the rotational energy levels of the diatomic molecule for which $Z \approx (\beta\hbar^2/2I)^{-1}$ (see eqn 20.35).

(20.2) Show that

\[ \ln P_j = \beta (F - E_j). \tag{20.41} \]

(20.3) Show that eqn 20.29 can be rewritten as

\[ U = \frac{\hbar\omega}{2} \coth\frac{\beta\hbar\omega}{2}, \tag{20.42} \]

and eqn 20.32 can be rewritten as

\[ S = k_{\rm B} \left[ \frac{\beta\hbar\omega}{2} \coth\frac{\beta\hbar\omega}{2} - \ln\left( 2\sinh\frac{\beta\hbar\omega}{2} \right) \right]. \tag{20.43} \]

(20.4) Show that the zero-point energy of a simple harmonic oscillator does not contribute to its entropy or heat capacity, but does contribute to its energy and Helmholtz function.

(20.5) A spin-$\frac{1}{2}$ paramagnet in a magnetic field $B$ can be modelled as a set of independent two-level systems with energy $-\mu_{\rm B} B$ and $\mu_{\rm B} B$ (where $\mu_{\rm B} \equiv e\hbar/2m$ is the Bohr magneton).

(a) Show that for one magnetic ion, the partition function is

\[ Z = 2\cosh(\beta\mu_{\rm B} B). \tag{20.44} \]

(b) For $N$ independent magnetic ions, the partition function $Z_N$ is $Z_N = Z^N$. Show that the Helmholtz function is given by

\[ F = -N k_{\rm B} T \ln[2\cosh(\beta\mu_{\rm B} B)]. \tag{20.45} \]

(c) Eqn 17.32 implies that the magnetic moment $m$ is given by $m = -(\partial F/\partial B)_T$. Hence show that

\[ m = N\mu_{\rm B} \tanh(\beta\mu_{\rm B} B). \tag{20.46} \]

Sketch $m$ as a function of $B$.

(d) Show further that for small fields, $\mu_{\rm B} B \ll k_{\rm B} T$,

\[ m \approx N\mu_{\rm B}^2 B / k_{\rm B} T. \tag{20.47} \]

(e) The magnetic susceptibility is defined as $\chi \approx \mu_0 M/B$ (see Blundell (2001)) for small $B$. Hence show that $\chi \propto 1/T$, which is Curie’s law.

(20.6) A certain magnetic system contains $n$ independent molecules per unit volume, each of which has four energy levels given by $0$, $\Delta - g\mu_{\rm B} B$, $\Delta$, $\Delta + g\mu_{\rm B} B$ ($g$ is a constant). Write down the partition function, compute the Helmholtz function and hence compute the magnetization $M$. Hence show that the magnetic susceptibility $\chi$ is given by

\[ \chi = \lim_{B \to 0} \frac{\mu_0 M}{B} = \frac{2 n \mu_0 g^2 \mu_{\rm B}^2}{k_{\rm B} T (3 + e^{\Delta/k_{\rm B} T})}. \tag{20.48} \]

(20.7) The energy $E$ of a system of three independent harmonic oscillators is given by

\[ E = (n_x + \tfrac{1}{2})\hbar\omega + (n_y + \tfrac{1}{2})\hbar\omega + (n_z + \tfrac{1}{2})\hbar\omega. \tag{20.49} \]

Show that the partition function $Z$ is given by

\[ Z = Z_{\rm SHO}^3, \tag{20.50} \]

where $Z_{\rm SHO}$ is the partition function of a simple harmonic oscillator given in eqn 20.3. Hence show that the Helmholtz function is given by

\[ F = \frac{3}{2}\hbar\omega + 3 k_{\rm B} T \ln(1 - e^{-\beta\hbar\omega}), \tag{20.51} \]

and that the heat capacity tends to $3k_{\rm B}$ at high temperature.

(20.8) The internal levels of an isolated hydrogen atom are given by $E = -R/n^2$ where $R = 13.6$ eV. The degeneracy of each level is given by $2n^2$.

(a) Sketch the energy levels.

(b) Show that

\[ Z = \sum_{n=1}^{\infty} 2n^2 \exp\left( \frac{R}{n^2 k_{\rm B} T} \right). \tag{20.52} \]

Note that when $T \neq 0$, this expression for $Z$ diverges. This is because of the large degeneracy of the hydrogen atom’s highly excited states. If the hydrogen atom were to be confined in a box of finite size, this would cut off the highly excited states and $Z$ would not then diverge. By approximating $Z$ as follows:

\[ Z \approx \sum_{n=1}^{2} 2n^2 \exp\left( \frac{R}{n^2 k_{\rm B} T} \right), \tag{20.53} \]

i.e. by ignoring all but the $n = 1$ and $n = 2$ states, estimate the mean energy of a hydrogen atom at 300 K.

21 Statistical mechanics of an ideal gas

The partition function is a sum over all the states of a system of the relevant Boltzmann factors. As we saw in Chapter 20, constructing the partition function is the first step to deriving all the thermodynamic properties of a system. A very important example of this technique is the ideal gas. To determine the partition function of an ideal gas, we have to know what the relevant energy levels are so that we can label the states of the system. Our first step, outlined in the following section, is to work out how many states lie in a certain energy or momentum interval, and this leads us to the density of states to be defined below.

21.1 Density of states
21.2 Quantum concentration
21.3 Distinguishability
21.4 Functions of state of the ideal gas
21.5 Gibbs paradox
21.6 Heat capacity of a diatomic gas
Chapter summary
Exercises

21.1 Density of states

Consider a cubical box of dimensions $L \times L \times L$ and volume $V = L^3$. The box is filled with gas molecules, and we want to consider the momentum states of these gas molecules. It is convenient to label each molecule (we assume all have mass $m$) in the gas by its momentum $\mathbf{p}$ divided by $\hbar$, i.e. by its wave vector $\mathbf{k} = \mathbf{p}/\hbar$. We assume that the molecules behave like free particles inside the box, but that they are completely confined within the walls of the box. Their wave functions are thus the solution to the Schrödinger equation for the three-dimensional particle-in-a-box problem.¹ We can hence write the wave function of a molecule with wave vector $\mathbf{k}$ as²

\[ \psi(x, y, z) = \left( \frac{8}{V} \right)^{1/2} \sin(k_x x) \sin(k_y y) \sin(k_z z). \tag{21.1} \]

The factor $(8/V)^{1/2}$ is simply to ensure that the wave function is normalized over the volume of the box, so that $\int |\psi(x, y, z)|^2\, {\rm d}V = 1$ (each factor $\sin^2$ integrates to $L/2$ over the box). Since the molecules are confined inside the box, we want this wave function to go to zero at the boundaries of the box (the six planes $x = 0$, $x = L$, $y = 0$, $y = L$, $z = 0$ and $z = L$) and this will occur if

\[ k_x = \frac{n_x \pi}{L}, \qquad k_y = \frac{n_y \pi}{L}, \qquad k_z = \frac{n_z \pi}{L}, \tag{21.2} \]

where $n_x$, $n_y$ and $n_z$ are positive integers. We can thus label each state by this triplet of integers. An allowed state can be represented by a point in three-dimensional k-space, and these points are uniformly distributed [in each direction, points are separated by a distance $\pi/L$, see Fig. 21.1(a)]. A single point in k-space occupies a volume

\[ \frac{\pi}{L} \times \frac{\pi}{L} \times \frac{\pi}{L} = \left( \frac{\pi}{L} \right)^3. \tag{21.3} \]

1 We here assume familiarity with basic quantum mechanics.

2 This wave function is a sum of plane waves travelling in opposite directions. Thus, in this treatment, $k_x$, $k_y$ and $k_z$ can only be positive since negating any of them results in the same probability density $|\psi(x, y, z)|^2$.

Let us now focus on the magnitude of the wave vector given by $k = |\mathbf{k}|$. Allowed states with a wave vector whose magnitude lies between $k$ and $k + {\rm d}k$ lie on one octant of a spherical shell of radius $k$ and thickness ${\rm d}k$ (see Fig. 21.1(b)). It is just one octant since we only allow positive wave vectors in this approach. The volume of this shell is therefore

\[ \frac{1}{8} \times 4\pi k^2\, {\rm d}k. \tag{21.4} \]

The number of allowed states with a wave vector whose magnitude lies between $k$ and $k + {\rm d}k$ is described by the function $g(k)\, {\rm d}k$, where $g(k)$ is the density of states. This number is then given by

\[ g(k)\, {\rm d}k = \frac{\text{volume in k-space of one octant of a spherical shell}}{\text{volume in k-space occupied per allowed state}}. \tag{21.5} \]

This implies that

\[ g(k)\, {\rm d}k = \frac{\frac{1}{8} \times 4\pi k^2\, {\rm d}k}{(\pi/L)^3} = \frac{V k^2\, {\rm d}k}{2\pi^2}. \tag{21.6} \]

Example 21.1 An alternative method of calculating eqn 21.6 is to centre the box of gas at the origin, so that it is bounded by the planes $x = \pm L/2$, $y = \pm L/2$ and $z = \pm L/2$, and to apply periodic boundary conditions. In this case, the wave function is given by

$$\psi(x, y, z) = \frac{1}{V^{1/2}}\, e^{i\mathbf{k}\cdot\mathbf{r}} = \frac{1}{V^{1/2}}\, e^{ik_x x}\, e^{ik_y y}\, e^{ik_z z}. \tag{21.7}$$

Fig. 21.1 (a) States in k-space are separated by $\pi/L$. Each state occupies a volume $(\pi/L)^3$. (b) The density of states can be calculated by considering the volume in k-space between states with wave vector $k$ and states with wave vector $k + dk$, namely $4\pi k^2\,dk$. One octant of the sphere is shown. (c) In Example 21.1, our alternative formulation allows states in k-space to have positive or negative wave vectors and these states are separated by $2\pi/L$. Each state now occupies a volume $(2\pi/L)^3$.

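The periodic boundary conditions can now be applied:

$$\psi\left(\tfrac{L}{2}, y, z\right) = \psi\left(-\tfrac{L}{2}, y, z\right), \tag{21.8}$$

implies that

$$e^{ik_x L/2} = e^{-ik_x L/2}, \tag{21.9}$$

and hence

$$k_x = \frac{2\pi n_x}{L}, \tag{21.10}$$

where $n_x$ is an integer. Similarly we have that

$$k_y = \frac{2 n_y \pi}{L}, \quad\text{and}\quad k_z = \frac{2 n_z \pi}{L}. \tag{21.11}$$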


The points in k-space are now spaced twice as far apart compared to our earlier treatment (see Fig. 21.1(c)), but $n_x$, $n_y$ and $n_z$ can now be positive or negative, meaning that a complete sphere of values in k-space is used in this formalism. Thus the density of states is now

$$g(k)\,dk = \frac{\text{volume in k-space of a complete spherical shell}}{\text{volume in k-space occupied per allowed state}}. \tag{21.12}$$

This implies that

$$g(k)\,dk = \frac{4\pi k^2\,dk}{(2\pi/L)^3} = \frac{V k^2\,dk}{2\pi^2}, \tag{21.13}$$

as before in eqn 21.6. Having calculated the density of states in eqn 21.6 (and identically in eqn 21.13), we are now in a position to calculate the partition function of an ideal gas.

21.2 Quantum concentration

The single-particle partition function³ for the ideal gas is given by a generalization of eqn 20.1 in which we replace the sum by an integral. Hence we have

$$Z_1 = \int_0^\infty e^{-\beta E(k)}\, g(k)\,dk, \tag{21.14}$$

where the energy of a single molecule with wave vector $k$ is given by

$$E(k) = \frac{\hbar^2 k^2}{2m}. \tag{21.15}$$

Hence,

$$Z_1 = \int_0^\infty e^{-\beta\hbar^2 k^2/2m}\, \frac{V k^2\,dk}{2\pi^2} = \frac{V}{\hbar^3}\left(\frac{m k_B T}{2\pi}\right)^{3/2}, \tag{21.16}$$

which can be written in the appealingly simple form

$$Z_1 = V n_Q, \quad\text{where}\quad n_Q = \frac{1}{\hbar^3}\left(\frac{m k_B T}{2\pi}\right)^{3/2}, \tag{21.17}$$

where $n_Q$ is known as the quantum concentration. We can define $\lambda_{\rm th}$, the thermal wavelength, as follows:

$$\lambda_{\rm th} = n_Q^{-1/3} = \frac{h}{\sqrt{2\pi m k_B T}}, \tag{21.18}$$

and hence we can also write

$$Z_1 = \frac{V}{\lambda_{\rm th}^3}. \tag{21.19}$$

Equation 21.17 (and 21.19) brings out the important fact that the partition function is proportional to the volume of the system (and also proportional to temperature to the power of 3/2). The importance of this will be seen in the following section.
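As a sanity check on eqns 21.16 to 21.19, the integral for $Z_1/V$ can be evaluated numerically and compared with $n_Q = 1/\lambda_{\rm th}^3$. The sketch below is illustrative only: it assumes a molecule of mass 28 u (roughly N2) at 300 K, uses a simple midpoint rule, and truncates the integral where the Boltzmann factor is negligible. The function names are our own.

```python
import math

H = 6.62607015e-34          # Planck constant, J s
HBAR = H / (2 * math.pi)    # reduced Planck constant, J s
KB = 1.380649e-23           # Boltzmann constant, J/K
U_MASS = 1.66053906660e-27  # atomic mass unit, kg

def thermal_wavelength(mass, temperature):
    """lambda_th = h / sqrt(2 pi m k_B T), eqn 21.18."""
    return H / math.sqrt(2 * math.pi * mass * KB * temperature)

def quantum_concentration(mass, temperature):
    """n_Q = 1 / lambda_th^3."""
    return thermal_wavelength(mass, temperature) ** -3

def z1_per_volume_numeric(mass, temperature, steps=20000, cutoff=8.0):
    """Midpoint-rule estimate of (1/V) * integral of
    exp(-beta hbar^2 k^2 / 2m) * k^2 dk / (2 pi^2), eqn 21.16."""
    k_scale = math.sqrt(2 * mass * KB * temperature) / HBAR  # k at which beta*E = 1
    dk = cutoff * k_scale / steps
    total = 0.0
    for i in range(steps):
        k = (i + 0.5) * dk
        total += math.exp(-(k / k_scale) ** 2) * k * k
    return total * dk / (2 * math.pi ** 2)

m_n2 = 28.0 * U_MASS  # approximate mass of an N2 molecule (assumption)
T = 300.0
numeric = z1_per_volume_numeric(m_n2, T)
closed_form = quantum_concentration(m_n2, T)
print(f"lambda_th = {thermal_wavelength(m_n2, T):.3e} m")
print(f"Z1/V numeric = {numeric:.4e} m^-3, n_Q = {closed_form:.4e} m^-3")
```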

3 There is a distinction between the partition function associated with 'single-particle states' (where we focus our attention only on a single particle in our system, assuming it has freedom to exist in any state without having to worry about not occupying a state which has already been taken by another particle) and the partition function associated with the whole system. This point will be made clear in the following section. However, we will introduce the subscript 1 at this point to remind ourselves that we are thinking about single-particle states.


21.3 Distinguishability

In this section, we want to attempt to understand what happens for our gas of $N$ molecules, moving on from considering only single-particle states to considering the $N$-particle state. This is a surprisingly subtle point and, to see why, we study the following, much simpler, example.

Example 21.2 Consider a particle which can exist in two states. We model this particle as a thermodynamic system in which the energy can be either 0 or $\epsilon$. The two states of the system are shown in Fig. 21.2(a) and the single-particle partition function is

$$Z_1 = e^0 + e^{-\beta\epsilon} = 1 + e^{-\beta\epsilon}. \tag{21.20}$$

Now consider two such particles which behave in the same way and let us suppose that they are distinguishable (for example, they might have different physical locations, or they might have some different attribute, like colour). The possible states of the combined system are shown in Fig. 21.2(b), and we have made them distinguishable in the diagram by depicting them with different symbols. In this case we can write down the two-particle partition function $Z_2$ as a sum over those four possible states, and hence

$$Z_2 = e^0 + e^{-\beta\epsilon} + e^{-\beta\epsilon} + e^{-2\beta\epsilon}, \tag{21.21}$$

and in this case we see that

$$Z_2 = (Z_1)^2. \tag{21.22}$$

In much the same way, we could work out the $N$-particle partition function for $N$ distinguishable particles and show that it is given by

$$Z_N = (Z_1)^N. \tag{21.23}$$

Fig. 21.2 (a) A particle is described by a two-state system with energy 0 or $\epsilon$. (b) The possible states for two such particles if they are distinguishable. (c) The possible states for two such particles if they are indistinguishable.

However, what happens if the particles are indistinguishable? Returning to the combination of two systems, there are now only three possible states of the combined system, as shown in Fig. 21.2(c). The partition function is now

$$Z_2 = e^0 + e^{-\beta\epsilon} + e^{-2\beta\epsilon} \neq (Z_1)^2. \tag{21.24}$$

What has happened is that $(Z_1)^2$ correctly accounts for those states in which the particles are in the same energy level, but has overcounted (by a factor of 2) those states in which the particles are in different energy levels. Similarly, for $N$ indistinguishable particles, the $N$-particle partition function $Z_N \neq (Z_1)^N$ because $(Z_1)^N$ overcounts states in which all $N$ particles are in different states by a factor of $N!$.
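The counting in Example 21.2 can be reproduced by brute-force enumeration. The sketch below is an illustration under stated assumptions: energies are measured in units of $\epsilon$, the value of $\beta\epsilon$ is arbitrary, and ordered pairs stand for distinguishable particles while unordered pairs stand for indistinguishable ones.

```python
import math
from itertools import product, combinations_with_replacement

def partition_function(state_energies, beta):
    """Sum of Boltzmann factors over an explicit list of state energies."""
    return sum(math.exp(-beta * e) for e in state_energies)

levels = (0.0, 1.0)  # single-particle energies 0 and epsilon, in units of epsilon
beta = 0.7           # illustrative value of beta * epsilon

z1 = partition_function(levels, beta)

# Distinguishable particles: ordered pairs (a, b), four states in total.
dist = [a + b for a, b in product(levels, repeat=2)]
z2_dist = partition_function(dist, beta)

# Indistinguishable particles: unordered pairs, only three states.
indist = [a + b for a, b in combinations_with_replacement(levels, 2)]
z2_indist = partition_function(indist, beta)

print(len(dist), z2_dist, z1 ** 2)      # four states, Z2 equals Z1^2
print(len(indist), z2_indist, z1 ** 2)  # three states, Z2 differs from Z1^2
```

For this tiny system neither $(Z_1)^2$ nor $(Z_1)^2/2!$ reproduces the indistinguishable result exactly, because the states with both particles in the same level are a large fraction of the total.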


Let us summarize the results of this example. If the $N$ particles are distinguishable, then we can write the $N$-particle partition function $Z_N$ as

$$Z_N = (Z_1)^N. \tag{21.25}$$

If they are indistinguishable, then it is much more complicated.⁴ However, we can make a rather crafty approximation, as follows. If it is possible to ignore those configurations in which two or more particles are occupying the same energy level, then we can assume exactly the same answer as the distinguishable case and so we only have to worry about the single overcounting factor which we make when we ignore indistinguishability. If we have $N$ particles all in different states, then that overcounting factor is $N!$ (the number of different arrangements of $N$ distinguishable particles on $N$ distinct sites). Hence we can write the $N$-particle partition function $Z_N$ for indistinguishable particles as

$$Z_N = \frac{(Z_1)^N}{N!}. \tag{21.26}$$

This result has assumed that it is possible to ignore those states in which two or more particles occupy the same energy level. When is this approximation possible? We will have only one particle occupying any given state if the system is in a regime when the number of available states is much larger than the number of particles. So for the ideal gas, we require that the number of thermally accessible energy levels must be much larger than the number of molecules in the gas. This occurs when $n$, the number density of molecules, is much less than the quantum concentration $n_Q$. Thus the condition for validity of eqn 21.26 for an ideal gas is

$$n \ll n_Q. \tag{21.27}$$

If this condition holds, the $N$-particle partition function for an ideal gas can be written as

$$Z_N = \frac{1}{N!}\left(\frac{V}{\lambda_{\rm th}^3}\right)^N. \tag{21.28}$$

The quantum concentration $n_Q$ is plotted in Fig. 21.3 for electrons, protons, N2 molecules and C60 molecules (known as buckyballs). At room temperature, the quantum concentration of N2 molecules is much higher than the actual number density of molecules in air ($\approx 10^{25}\,\mathrm{m^{-3}}$) and so the approximation in eqn 21.26 is a good one. Electrons in a metal have a concentration $\approx 10^{29}\,\mathrm{m^{-3}}$ which is larger than the quantum concentration for electrons at room temperature, so the approximation in eqn 21.26 will not work for electrons and their quantum properties have to be considered in more detail.
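The two comparisons quoted above are easy to reproduce. The following sketch evaluates $n/n_Q$ at 300 K using the number densities given in the text; the mass of 28 u for N2 and the variable names are assumptions made for illustration.

```python
import math

H = 6.62607015e-34           # Planck constant, J s
KB = 1.380649e-23            # Boltzmann constant, J/K
U_MASS = 1.66053906660e-27   # atomic mass unit, kg
M_ELECTRON = 9.1093837015e-31  # electron mass, kg

def quantum_concentration(mass, temperature):
    """n_Q = (2 pi m k_B T)^(3/2) / h^3, equivalent to 1 / lambda_th^3."""
    return (2 * math.pi * mass * KB * temperature) ** 1.5 / H ** 3

T = 300.0
# (mass in kg, number density in m^-3); densities are the values quoted in the text
cases = {
    "N2 in air": (28.0 * U_MASS, 1e25),
    "electrons in a metal": (M_ELECTRON, 1e29),
}

ratios = {}
for name, (mass, n) in cases.items():
    ratios[name] = n / quantum_concentration(mass, T)
    print(f"{name}: n/nQ = {ratios[name]:.2e}")
```

A ratio far below one means eqn 21.26 is safe; a ratio far above one, as for the electrons, signals that the $1/N!$ approximation fails.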

4 Note that identical (and hence indistinguishable) particles can be made to be distinguishable if they are localized. The particles can then be distinguished by their physical location. Electrons in a gas are indistinguishable if there is no means of labelling which is which, but the electrons sitting in a particular magnetic orbital, one per atom of a magnetic solid, are distinguishable.

Fig. 21.3 The quantum concentration $n_Q$ and thermal wavelength $\lambda_{\rm th}$ for electrons, protons, N2 molecules and buckyballs.

21.4 Functions of state of the ideal gas

Having obtained the partition function of an ideal gas, we are now in a position to use the machinery of statistical mechanics, developed in Chapter 20, to derive all the relevant thermodynamic properties. This we do in the following example.

Example 21.3 The partition function for $N$ molecules in a gas is given in eqn 21.28 by

$$Z_N = \frac{1}{N!}\left(\frac{V}{\lambda_{\rm th}^3}\right)^N \propto (V T^{3/2})^N, \tag{21.29}$$

since $\lambda_{\rm th} \propto T^{-1/2}$. Hence we can write

$$\ln Z_N = N \ln V + \frac{3N}{2} \ln T + \text{constants}. \tag{21.30}$$

The internal energy $U$ is given by

$$U = -\frac{d \ln Z_N}{d\beta} = \frac{3}{2} N k_B T, \tag{21.31}$$

so that the heat capacity is $C_V = \frac{3}{2} N k_B$, in agreement with previous results.


The Helmholtz function is

$$F = -k_B T \ln Z_N = -k_B T N \ln V - k_B T\,\frac{3N}{2} \ln T - k_B T \times \text{constants}, \tag{21.32}$$

so that

$$p = -\left(\frac{\partial F}{\partial V}\right)_T = \frac{N k_B T}{V} = n k_B T, \tag{21.33}$$

which is, reassuringly, the ideal gas equation. This also gives the enthalpy $H$ via

$$H = U + pV = \frac{5}{2} N k_B T. \tag{21.34}$$

Before proceeding to the entropy, it is going to be necessary to worry about what the constants are in eqn 21.30. Returning to eqn 21.29, we write

$$\ln Z_N = N \ln V - 3N \ln \lambda_{\rm th} - N \ln N + N = N \ln\left(\frac{V e}{N \lambda_{\rm th}^3}\right), \tag{21.35}$$

where we have used Stirling's approximation, $\ln N! \approx N \ln N - N$ (see eqn 1.17). Hence we can obtain the following expression for the Helmholtz function $F$:

$$F = -N k_B T \ln\left(\frac{V e}{N \lambda_{\rm th}^3}\right) = N k_B T\,[\ln(n\lambda_{\rm th}^3) - 1]. \tag{21.36}$$

This allows us to derive the entropy $S$:

$$S = \frac{U - F}{T} = \frac{3}{2} N k_B + N k_B \ln\left(\frac{V e}{N \lambda_{\rm th}^3}\right) = N k_B \ln\left(\frac{V e^{5/2}}{N \lambda_{\rm th}^3}\right) = N k_B \left[\frac{5}{2} - \ln(n\lambda_{\rm th}^3)\right], \tag{21.37}$$

and hence the entropy is expressed in terms of the thermal wavelength of the molecules. We can also derive the Gibbs function $G$:

$$G = H - TS = \frac{5}{2} N k_B T - N k_B T \ln\left(\frac{V e^{5/2}}{N \lambda_{\rm th}^3}\right) = N k_B T \ln(n\lambda_{\rm th}^3). \tag{21.38}$$
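The functions of state derived in Example 21.3 can be evaluated directly. The sketch below is an illustration rather than part of the original derivation: it assumes one mole of argon (mass 39.948 u) at 298.15 K and $10^5$ Pa, and the function and variable names are our own. It also confirms that the expressions satisfy $F = U - TS$ and $G = H - TS$.

```python
import math

H_PLANCK = 6.62607015e-34   # Planck constant, J s
KB = 1.380649e-23           # Boltzmann constant, J/K
N_A = 6.02214076e23         # Avogadro constant, 1/mol
U_MASS = 1.66053906660e-27  # atomic mass unit, kg

def thermal_wavelength(mass, temperature):
    return H_PLANCK / math.sqrt(2 * math.pi * mass * KB * temperature)

def functions_of_state(n_particles, volume, mass, temperature):
    """Ideal-gas functions of state from eqns 21.31, 21.33, 21.34, 21.36-21.38."""
    lam = thermal_wavelength(mass, temperature)
    x = math.log(n_particles / volume * lam ** 3)  # ln(n lambda_th^3)
    nkt = n_particles * KB * temperature
    return {
        "U": 1.5 * nkt,
        "H": 2.5 * nkt,
        "F": nkt * (x - 1.0),
        "S": n_particles * KB * (2.5 - x),
        "G": nkt * x,
        "p": nkt / volume,
    }

T = 298.15                 # K (illustrative)
p = 1.0e5                  # Pa (illustrative)
N = N_A                    # one mole of atoms
V = N * KB * T / p         # volume from the ideal gas equation
fs = functions_of_state(N, V, 39.948 * U_MASS, T)
print(f"S = {fs['S']:.1f} J/K for one mole")
```

The entropy obtained this way is close to 155 J K⁻¹ mol⁻¹, in line with tabulated values for argon, which is a useful check that the constants in eqn 21.30 have been handled correctly.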

Fig. 21.4 (a) Joule expansion of an ideal gas (an irreversible process). (b) Mixing of two different gases, equivalent to the Joule expansion of each of the gases (an irreversible process). (c) Mixing of two identical gases, which is clearly a reversible process: how can you tell if they have been mixed?

21.5 Gibbs paradox

The expression for the entropy in eqn 21.37 is called the Sackur-Tetrode equation and can be used to demonstrate the Gibbs paradox. Consider the process shown in Fig. 21.4(a), namely the Joule expansion of $N$ molecules of an ideal gas. This is an irreversible process which halves the number density $n$ so that the increase in entropy is given by

$$\Delta S = S_{\rm final} - S_{\rm initial} = N k_B \left[\frac{5}{2} - \ln\left(\frac{n}{2}\lambda_{\rm th}^3\right)\right] - N k_B \left[\frac{5}{2} - \ln(n\lambda_{\rm th}^3)\right] = N k_B \ln 2, \tag{21.39}$$

in agreement with eqn 14.29. This reflects the fact that, following the Joule expansion, we have an uncertainty about each molecule as to whether it is on the left or right-hand side of the chamber, whereas beforehand there was no uncertainty (all molecules were on the left-hand side). Hence the uncertainty is 1 bit per molecule, and hence $\Delta S/k_B = N \ln 2$.

Now consider the situation depicted in Fig. 21.4(b) in which two different gases are allowed to mix following the removal of a partition which separated them. This is clearly an irreversible process and is equivalent to the Joule expansion of each gas. Thus the entropy increase is

$$\Delta S = 2N k_B \ln 2. \tag{21.40}$$

An apparently similar case is shown in Fig. 21.4(c), but this time the two gases on either side of the partition are indistinguishable. Removing the partition is now an eminently reversible operation so ∆S = 0. Yet, it might be argued, is it not the case that the removal of the partition simply allows the gases which were initially on either side of the partition to each undergo a Joule expansion? Surely, the change of entropy would then be ∆S = 2N kB ln 2. This apparent paradox is resolved by understanding that indistinguishable really means indistinguishable! In other words, the case shown in Fig. 21.4(c) is fundamentally diﬀerent from that shown in Fig. 21.4(b). Removing the partition in the case of Fig. 21.4(c) is a reversible operation since we have no way of losing information about which side of the partition certain bits of gas are; this is because all molecules of this gas look the same to us and we never had such information in the ﬁrst place. Hence ∆S = 0. Gibbs resolved this paradox himself by realising that indistinguishability was fundamental and that all states of the system that diﬀer only by a permutation of identical molecules should be considered as the same state. Failure to do this results in an expression for the entropy which is not extensive (see Exercise 21.2, which was the original manifestation of the Gibbs paradox).
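The entropy bookkeeping in this section can be made concrete with a few lines of arithmetic. The sketch below works in units of $k_B$ with $\lambda_{\rm th} = 1$ (both choices are conveniences, as are the illustrative values of $N$ and $V$). It compares the Sackur-Tetrode entropy of eqn 21.37 with the entropy for distinguishable particles quoted in eqn 21.44, which lacks the $1/N!$ factor.

```python
import math

def entropy_indistinguishable(n, volume, lam=1.0):
    """Sackur-Tetrode entropy, eqn 21.37, in units of k_B."""
    return n * (2.5 - math.log(n * lam ** 3 / volume))

def entropy_distinguishable(n, volume, lam=1.0):
    """Entropy without the 1/N! factor, eqn 21.44, in units of k_B."""
    return n * (1.5 - math.log(lam ** 3 / volume))

N, V = 1000.0, 1.0e6  # illustrative values; lam = 1 sets the unit of length

# (a) Joule expansion of N molecules from V to 2V.
joule = entropy_indistinguishable(N, 2 * V) - entropy_indistinguishable(N, V)

# (b) Two different gases mixing: each gas undergoes its own Joule expansion.
mix_different = 2 * joule

# (c) Removing a partition between two samples of the same gas.
same_gas = entropy_indistinguishable(2 * N, 2 * V) - 2 * entropy_indistinguishable(N, V)

# The same operation evaluated with the non-extensive entropy.
same_gas_no_factorial = entropy_distinguishable(2 * N, 2 * V) - 2 * entropy_distinguishable(N, V)

print(joule / N, mix_different / N, same_gas, same_gas_no_factorial / N)
```

With the $1/N!$ factor included, removing the partition between identical gases gives $\Delta S = 0$; without it, the spurious $2N k_B \ln 2$ reappears, which is the paradox in its original form.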

21.6 Heat capacity of a diatomic gas

The energy of a diatomic molecule in a gas can be written using eqn 19.19 as the sum of three translational, two rotational and two vibrational terms, giving seven modes in total. The equipartition theorem shows that the mean energy per molecule at high temperature is therefore $\frac{7}{2} k_B T$ (see eqn 19.20). Because the modes are independent, the partition function of a diatomic molecule, $Z$, can be written as the product of partition functions for the translational, rotational and vibrational modes as

$$Z = Z_{\rm trans}\, Z_{\rm vib}\, Z_{\rm rot}, \tag{21.41}$$

where $Z_{\rm trans} = V/\lambda_{\rm th}^3$ from eqn 21.19, $Z_{\rm vib} = e^{-\frac{1}{2}\beta\hbar\omega}/(1 - e^{-\beta\hbar\omega})$, from eqn 20.3, and $Z_{\rm rot}$ is the rotational partition function

$$Z_{\rm rot} = \sum_\alpha e^{-\beta E_\alpha} = \sum_{J=0}^{\infty} (2J + 1)\, e^{-\beta\hbar^2 J(J+1)/2I}, \tag{21.42}$$

from eqn 20.6. Thus the mean energy $U$ of such a diatomic molecule is given by $U = -d\ln Z/d\beta$ and is the sum of the energies of the individual modes. Similarly, the heat capacity $C_V$ is the sum of the heat capacities of the individual modes. This gives rise to the behaviour shown in Fig. 21.5 in which the heat capacity goes through a series of plateaus: at any non-zero temperature, all the translational modes are excited (a failure of the ideal gas model, because $C_V$ should go to zero as $T \to 0$, see Chapter 18) and $C_V = \frac{3}{2}R$ (for one mole of gas); above $T \approx \hbar^2/2Ik_B$ the rotational modes are also excited and $C_V$ rises to $\frac{5}{2}R$; above $T \approx \hbar\omega/k_B$, the vibrational modes are excited and hence $C_V$ rises to $\frac{7}{2}R$.

Fig. 21.5 The molar heat capacity at constant volume of a diatomic gas as a function of temperature.
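The plateaus described above can be reproduced from the partition functions in eqns 21.41 and 21.42. The sketch below is illustrative only. It writes the level spacings as characteristic temperatures $\theta_{\rm rot} = \hbar^2/2Ik_B$ and $\theta_{\rm vib} = \hbar\omega/k_B$, uses assumed values of 2.9 K and 3400 K (of the order of those for N2), and obtains the rotational heat capacity from the energy fluctuation $C = (\langle E^2\rangle - \langle E\rangle^2)/k_B T^2$ rather than by differentiating $\ln Z$ symbolically.

```python
import math

def c_rot(temperature, theta_rot, j_max=2000):
    """Rotational heat capacity per molecule in units of k_B, from the
    level sum of eqn 21.42 with E_J / k_B = J (J + 1) theta_rot."""
    z = e1 = e2 = 0.0
    for j in range(j_max + 1):
        energy = j * (j + 1) * theta_rot  # in kelvin
        weight = (2 * j + 1) * math.exp(-energy / temperature)
        z += weight
        e1 += weight * energy
        e2 += weight * energy * energy
    mean, mean_sq = e1 / z, e2 / z
    return (mean_sq - mean * mean) / temperature ** 2

def c_vib(temperature, theta_vib):
    """Vibrational heat capacity per molecule in units of k_B, from Z_vib."""
    x = theta_vib / temperature
    ex = math.exp(-x)  # written with exp(-x) to avoid overflow at low T
    return x * x * ex / (1.0 - ex) ** 2

def c_v_total(temperature, theta_rot=2.9, theta_vib=3400.0):
    """C_V per molecule in units of k_B: translation + rotation + vibration."""
    return 1.5 + c_rot(temperature, theta_rot) + c_vib(temperature, theta_vib)

for T in (0.2, 300.0, 5.0e4):
    print(f"T = {T:g} K: C_V / (N k_B) = {c_v_total(T):.3f}")
```

The three temperatures are chosen to sit well inside the three plateaus, so the printed values should be close to 3/2, 5/2 and 7/2 respectively.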

Chapter summary

• For an ideal gas $Z_1 = V/\lambda_{\rm th}^3$ where $\lambda_{\rm th} = h/\sqrt{2\pi m k_B T}$ is the thermal wavelength.
• The quantum concentration $n_Q = 1/\lambda_{\rm th}^3$.
• The $N$-particle partition function $Z_N = (Z_1)^N/N!$ for indistinguishable particles in the low-density case when $n/n_Q \ll 1$ so that $n\lambda_{\rm th}^3 \ll 1$.

Exercises

(21.1) Show that the single-particle partition function $Z_1$ of a two-dimensional gas confined in an area $A$ is given by

$$Z_1 = \frac{A}{\lambda_{\rm th}^2}, \tag{21.43}$$

where $\lambda_{\rm th} = h/\sqrt{2\pi m k_B T}$.

(21.2) Show that $S$ as given by eqn 21.37 (the Sackur-Tetrode equation) is an extensive quantity, but that the entropy of a gas of distinguishable particles is given by

$$S = N k_B \left[\frac{3}{2} - \ln(\lambda_{\rm th}^3/V)\right], \tag{21.44}$$

and show that this quantity is not extensive. This non-extensive entropy provided the original version of the Gibbs paradox.

(21.3) Show that the number of states in a gas with energies below $E_{\max}$ is

$$\int_0^{\sqrt{2mE_{\max}}/\hbar} g(k)\,dk = \frac{V}{6\pi^2}\left(\frac{2mE_{\max}}{\hbar^2}\right)^{3/2}. \tag{21.45}$$

Putting $E_{\max} = \frac{3}{2} k_B T$, show that the number of states is $\Xi V n_Q$ where $\Xi$ is a numerical constant of order unity.

(21.4) An atom in a solid has two energy levels: a ground state of degeneracy $g_1$ and an excited state of degeneracy $g_2$ at an energy $\Delta$ above the ground state. Show that the partition function $Z_{\rm atom}$ is

$$Z_{\rm atom} = g_1 + g_2 e^{-\beta\Delta}. \tag{21.46}$$

Show that the heat capacity of the atom is given by

$$C = \frac{g_1 g_2 \Delta^2 e^{-\beta\Delta}}{k_B T^2 (g_1 + g_2 e^{-\beta\Delta})^2}. \tag{21.47}$$

A monatomic gas of such atoms has a partition function given by

$$Z = Z_{\rm atom} Z_N, \tag{21.48}$$

where $Z_N$ is the partition function due to the translational motion of the gas atoms and is given by $Z_N = (1/N!)[V/\lambda_{\rm th}^3]^N$. Show that the heat capacity of such a gas is

$$C = N\left[\frac{3}{2} k_B + \frac{g_1 g_2 \Delta^2 e^{-\beta\Delta}}{k_B T^2 (g_1 + g_2 e^{-\beta\Delta})^2}\right]. \tag{21.49}$$

(21.5) Explain the behaviour of the experimental heat capacity (measured at constant pressure) of hydrogen (H2) gas shown in Fig. 21.6.

Fig. 21.6 The heat capacity of hydrogen gas as a function of temperature.

(21.6) Show that the single-particle partition function $Z_1$ of a gas of hydrogen atoms is given approximately by

$$Z_1 = \frac{V e^{\beta R}}{\lambda_{\rm th}^3}, \tag{21.50}$$

where $R = 13.6$ eV and the contribution due to excited states has been neglected.

22 The chemical potential

22.1 A definition of the chemical potential
22.2 The meaning of the chemical potential
22.3 Grand partition function
22.4 Grand potential
22.5 Chemical potential as Gibbs function per particle
22.6 Many types of particle
22.7 Particle number conservation laws
22.8 Chemical potential and chemical reactions
Chapter summary
Further reading
Exercises

We now want to consider systems which can exchange particles with their surroundings and we will show in this chapter that this feature leads to a new concept known as the chemical potential. Differences in the chemical potential drive the flow of particles from one place to another in much the same way as differences in temperature drive the flow of heat. The chemical potential turns up in chemical reactions (hence the name) because if you are doing a reaction such as

$$2\mathrm{H_2} + \mathrm{O_2} \to 2\mathrm{H_2O}, \tag{22.1}$$

you are changing the number of particles in your system (3 molecules on the left, 2 on the right). However, as we shall see, the chemical potential applies to more than just chemical systems. It is connected with conservation laws, so that particles such as electrons (which are conserved) and photons (which are not) have different chemical potentials and this has consequences for their behaviour.

22.1 A definition of the chemical potential

If you add a particle to a system, then the internal energy will change by an amount which we call the chemical potential $\mu$. Thus the first and second laws of thermodynamics expressed in eqn 14.18 must, in the case of changing numbers of particles, be modified to contain an extra term, so that

$$dU = T\,dS - p\,dV + \mu\,dN, \tag{22.2}$$

where $N$ is the number of particles in the system.¹

1 If we are dealing with discrete particles, then $N$ is an integer and can only change by integer amounts; hence using calculus expressions like $dN$ is a bit sloppy, but this is an indiscretion for which we may be excused if $N$ is large. However, there exist systems such as quantum dots which are semiconductor nanocrystals whose size is a few nanometres. Quantum dots are so small that $\mu$ jumps discontinuously when you add one electron to the quantum dot.

This means that we can write an expression for $\mu$ as a partial differential of $U$ as follows:

$$\mu = \left(\frac{\partial U}{\partial N}\right)_{S,V}. \tag{22.3}$$

However, keeping $S$ and $V$ constant is a difficult constraint to apply, so it is convenient to consider other thermodynamic potentials. Equation 22.2 together with the definitions $F = U - TS$ and $G = U + pV - TS$ imply that

$$dF = -p\,dV - S\,dT + \mu\,dN, \tag{22.4}$$

$$dG = V\,dp - S\,dT + \mu\,dN, \tag{22.5}$$

and hence we can make the more useful definitions:

$$\mu = \left(\frac{\partial F}{\partial N}\right)_{V,T} \tag{22.6}$$

or

$$\mu = \left(\frac{\partial G}{\partial N}\right)_{p,T}. \tag{22.7}$$

The constraints of constant p and T are experimentally convenient for chemical systems and so eqn 22.7 will be particularly useful.

22.2 The meaning of the chemical potential

What drives a system to form a particular equilibrium state? As we have seen in Chapter 14, it is the second law of thermodynamics which states that entropy always increases. The entropy of a system can be considered to be a function of $U$, $V$ and $N$, so that $S = S(U, V, N)$. Therefore, we can immediately write down

$$dS = \left(\frac{\partial S}{\partial U}\right)_{N,V} dU + \left(\frac{\partial S}{\partial V}\right)_{N,U} dV + \left(\frac{\partial S}{\partial N}\right)_{U,V} dN. \tag{22.8}$$

Equation 22.2 implies that

$$dS = \frac{dU}{T} + \frac{p\,dV}{T} - \frac{\mu\,dN}{T}. \tag{22.9}$$

Comparison of eqn 22.8 and 22.9 implies that we can therefore make the following identifications:

$$\left(\frac{\partial S}{\partial U}\right)_{N,V} = \frac{1}{T}, \quad \left(\frac{\partial S}{\partial V}\right)_{N,U} = \frac{p}{T}, \quad \left(\frac{\partial S}{\partial N}\right)_{U,V} = -\frac{\mu}{T}. \tag{22.10}$$

Now consider two systems which are able to exchange heat or particles between them. If we write down an expression for $dS$, then we can use the second law of thermodynamics in the form $dS \geq 0$ to determine the equilibrium state. We repeat this analysis for two cases as follows:

• The case of heat flow
Consider two systems which are able to exchange heat with each other while remaining thermally isolated from their surroundings (see Fig. 22.1). If system 1 loses internal energy $dU$, system 2 must gain internal energy $dU$. Thus the change of entropy is

$$dS = \left(\frac{\partial S_1}{\partial U_1}\right)_{N,V} dU_1 + \left(\frac{\partial S_2}{\partial U_2}\right)_{N,V} dU_2 = \left(\frac{\partial S_1}{\partial U_1}\right)_{N,V} (-dU) + \left(\frac{\partial S_2}{\partial U_2}\right)_{N,V} (dU) = \left(-\frac{1}{T_1} + \frac{1}{T_2}\right) dU \geq 0. \tag{22.11}$$

So $dU > 0$, i.e. energy flows from 1 to 2, when $T_1 > T_2$. As expected, equilibrium is found when $T_1 = T_2$, i.e. when the temperatures of the two systems are equal.

Fig. 22.1 Two systems which are able to exchange heat with each other.

Fig. 22.2 Two systems which are able to exchange particles with each other.

• The case of particle exchange
Now consider two systems which are able to exchange particles with each other, but remain isolated from their surroundings (see Fig. 22.2). If system 1 loses $dN$ particles, system 2 must gain $dN$ particles. Thus the change of entropy is

$$dS = \left(\frac{\partial S_1}{\partial N_1}\right)_{U,V} dN_1 + \left(\frac{\partial S_2}{\partial N_2}\right)_{U,V} dN_2 = \left(\frac{\partial S_1}{\partial N_1}\right)_{U,V} (-dN) + \left(\frac{\partial S_2}{\partial N_2}\right)_{U,V} (dN) = \left(\frac{\mu_1}{T_1} - \frac{\mu_2}{T_2}\right) dN \geq 0. \tag{22.12}$$

Assuming that $T_1 = T_2$, we find that $dN > 0$ (so that particles flow from 1 to 2) when $\mu_1 > \mu_2$. Similarly, if $\mu_1 < \mu_2$, then $dN < 0$. Hence equilibrium is found when $\mu_1 = \mu_2$, i.e. when the chemical potentials are the same for each system. This demonstrates that chemical potential plays a similar role in particle exchange as 1/temperature does in heat exchange.

Example 22.1 Find the chemical potential for an ideal gas.

Solution: We use eqn 22.6 ($\mu = (\partial F/\partial N)_{V,T}$), which relates $\mu$ to $F$, together with eqn 21.36, which gives an expression for $F$, namely

$$F = N k_B T\,[\ln(n\lambda_{\rm th}^3) - 1]. \tag{22.13}$$

Recalling also that $n = N/V$, we find that

$$\mu = k_B T\,[\ln(n\lambda_{\rm th}^3) - 1] + N k_B T \left(\frac{1}{N}\right), \tag{22.14}$$

and hence

$$\mu = k_B T \ln(n\lambda_{\rm th}^3). \tag{22.15}$$

In this case, comparison with eqn 21.38 shows that µ = G/N . We will see in Section 22.5 that this property has more general applicability than just this speciﬁc case.
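The result of Example 22.1 can be checked by differentiating $F$ numerically. The sketch below is an illustration under stated assumptions: a gas of molecules of mass 28 u at 300 K with $n = 2.5 \times 10^{25}$ m⁻³, a central finite difference in $N$ at fixed $V$ and $T$, and function names of our own choosing.

```python
import math

H = 6.62607015e-34          # Planck constant, J s
KB = 1.380649e-23           # Boltzmann constant, J/K
U_MASS = 1.66053906660e-27  # atomic mass unit, kg
EV = 1.602176634e-19        # joules per electronvolt

def thermal_wavelength(mass, temperature):
    return H / math.sqrt(2 * math.pi * mass * KB * temperature)

def helmholtz(n_particles, volume, mass, temperature):
    """F = N k_B T [ln(n lambda_th^3) - 1], eqn 22.13."""
    lam = thermal_wavelength(mass, temperature)
    return n_particles * KB * temperature * (
        math.log(n_particles * lam ** 3 / volume) - 1.0
    )

def chemical_potential(number_density, mass, temperature):
    """mu = k_B T ln(n lambda_th^3), eqn 22.15."""
    lam = thermal_wavelength(mass, temperature)
    return KB * temperature * math.log(number_density * lam ** 3)

mass = 28.0 * U_MASS             # roughly an N2 molecule (assumption)
T, V, N = 300.0, 1.0e-3, 2.5e22  # illustrative state, n = 2.5e25 m^-3
dN = 1.0e16                      # small compared with N, large enough to beat rounding

# Central finite difference for (dF/dN) at constant V and T, eqn 22.6.
mu_fd = (helmholtz(N + dN, V, mass, T) - helmholtz(N - dN, V, mass, T)) / (2 * dN)
mu = chemical_potential(N / V, mass, T)
print(f"mu (eqn 22.15) = {mu / EV:.4f} eV, finite difference = {mu_fd / EV:.4f} eV")
```

The chemical potential comes out negative, as it must whenever $n\lambda_{\rm th}^3 \ll 1$.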

22.3 Grand partition function

In this section we will introduce a version of the partition function we met in Chapter 20 but now generalized to include the effect of variable numbers of particles. To do this, we have to generalize the canonical ensemble we met in Chapter 4 to the case of both energy and particle exchange. Let us write the entropy $S$ as a function of internal energy $U$ and particle number $\mathcal{N}$. Consider a small system with fixed volume $V$ and with energy $\epsilon$ and containing $N$ particles, connected to a reservoir with energy $U - \epsilon$ and $\mathcal{N} - N$ particles (see Fig. 22.3). We assume that $U \gg \epsilon$ and $\mathcal{N} \gg N$. Using a Taylor expansion, we can write the entropy of the reservoir as

$$S(U - \epsilon, \mathcal{N} - N) = S(U, \mathcal{N}) - \epsilon\left(\frac{dS}{dU}\right)_{\mathcal{N},V} - N\left(\frac{dS}{d\mathcal{N}}\right)_{U,V}, \tag{22.16}$$

and using the differentials defined in eqn 22.10, we have that

$$S(U - \epsilon, \mathcal{N} - N) = S(U, \mathcal{N}) - \frac{1}{T}(\epsilon - \mu N). \tag{22.17}$$

Fig. 22.3 A small system with energy $\epsilon$ and containing $N$ particles, connected to a reservoir with energy $U - \epsilon$ and $\mathcal{N} - N$ particles.

The probability $P(\epsilon, N)$ that the system chooses a particular macrostate is proportional to the number $\Omega$ of microstates corresponding to that macrostate, and using $S = k_B \ln \Omega$ we have that

$$P(\epsilon, N) \propto e^{S(U - \epsilon, \mathcal{N} - N)/k_B} \propto e^{\beta(\mu N - \epsilon)}. \tag{22.18}$$

This is known as the Gibbs distribution and the situation is known as the grand canonical ensemble. In the case in which $\mu = 0$, this reverts to the Boltzmann distribution (the canonical ensemble). Normalizing this distribution, we have that the probability of a state of the system with energy $E_i$ and with $N_i$ particles is given by

$$P_i = \frac{e^{\beta(\mu N_i - E_i)}}{\mathcal{Z}}, \tag{22.19}$$

where $\mathcal{Z}$ is a normalization constant. The normalization constant is known as the grand partition function $\mathcal{Z}$, which we write as follows:

$$\mathcal{Z} = \sum_i e^{\beta(\mu N_i - E_i)}, \tag{22.20}$$

which is a sum over all states of the system. The grand partition function $\mathcal{Z}$ can be used to derive many thermodynamic quantities, and we write down the most useful equations here without detailed proof.²

2 See Exercise 22.4.

$$N = \sum_i N_i P_i = k_B T \left(\frac{\partial \ln \mathcal{Z}}{\partial \mu}\right)_\beta, \tag{22.21}$$

$$U = \sum_i E_i P_i = -\left(\frac{\partial \ln \mathcal{Z}}{\partial \beta}\right)_\mu + \mu N, \tag{22.22}$$

and

$$S = -k_B \sum_i P_i \ln P_i = \frac{U - \mu N + k_B T \ln \mathcal{Z}}{T}. \tag{22.23}$$
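Equations 22.21 to 22.23 are stated without proof, but they are straightforward to test on a toy system. The sketch below is purely illustrative: the list of states $(N_i, E_i)$ is hypothetical, units with $k_B = 1$ are assumed, and the derivatives of $\ln\mathcal{Z}$ are taken by central finite differences.

```python
import math

KB = 1.0  # work in units where k_B = 1 (assumption for convenience)

# Hypothetical small system: each state is (N_i, E_i).
STATES = [(0, 0.0), (1, 0.3), (1, 0.8), (2, 1.5)]

def ln_grand_z(beta, mu):
    """ln of the grand partition function, eqn 22.20."""
    return math.log(sum(math.exp(beta * (mu * n - e)) for n, e in STATES))

def probabilities(beta, mu):
    """Gibbs distribution, eqn 22.19."""
    weights = [math.exp(beta * (mu * n - e)) for n, e in STATES]
    z = sum(weights)
    return [w / z for w in weights]

T, mu = 0.5, 0.2  # illustrative temperature and chemical potential
beta = 1.0 / (KB * T)
p = probabilities(beta, mu)

# Direct averages over the distribution.
n_mean = sum(pi * n for pi, (n, _) in zip(p, STATES))
u_mean = sum(pi * e for pi, (_, e) in zip(p, STATES))
s_gibbs = -KB * sum(pi * math.log(pi) for pi in p)

# The same quantities from derivatives of ln Z (eqns 22.21-22.23).
h = 1e-6
n_from_z = KB * T * (ln_grand_z(beta, mu + h) - ln_grand_z(beta, mu - h)) / (2 * h)
u_from_z = -(ln_grand_z(beta + h, mu) - ln_grand_z(beta - h, mu)) / (2 * h) + mu * n_mean
s_from_z = (u_mean - mu * n_mean + KB * T * ln_grand_z(beta, mu)) / T

print(n_mean, n_from_z)
print(u_mean, u_from_z)
print(s_gibbs, s_from_z)
```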

For convenience, let us summarize the various ensembles considered in statistical mechanics.

(1) The microcanonical ensemble: an ensemble of systems which all have the same fixed energy. The entropy $S$ is related to the number of microstates by $S = k_B \ln \Omega$, and hence by

$$\Omega = e^{\beta T S}. \tag{22.24}$$

(2) The canonical ensemble: an ensemble of systems, each of which can exchange its energy with a large reservoir of heat. As we shall see, this fixes (and defines) the temperature of the system. Since $F = -k_B T \ln Z$, the partition function is given by

$$Z = e^{-\beta F}, \tag{22.25}$$

where $F$ is the Helmholtz function.

(3) The grand canonical ensemble: an ensemble of systems, each of which can exchange both energy and particles with a large reservoir. This fixes the system's temperature and chemical potential. By analogy with the canonical ensemble, we write the grand partition function as

$$\mathcal{Z} = e^{-\beta\Phi_G}, \tag{22.26}$$

where $\Phi_G$ is the grand potential, which we discuss in the next section.

22.4 Grand potential

Using eqn 22.26, we have defined a new state function, the grand potential $\Phi_G$, by

$$\Phi_G = -k_B T \ln \mathcal{Z}. \tag{22.27}$$

Rearranging eqn 22.23, we have that

$$-k_B T \ln \mathcal{Z} = U - TS - \mu N, \tag{22.28}$$

and hence

$$\Phi_G = U - TS - \mu N = F - \mu N. \tag{22.29}$$

The grand potential has differential $d\Phi_G$ given by

$$d\Phi_G = dF - \mu\,dN - N\,d\mu, \tag{22.30}$$

and, substituting in eqn 22.4, we therefore have

$$d\Phi_G = -S\,dT - p\,dV - N\,d\mu, \tag{22.31}$$

and this leads to the following equations for $S$, $p$ and $N$:

$$S = -\left(\frac{\partial \Phi_G}{\partial T}\right)_{V,\mu}, \tag{22.32}$$

$$p = -\left(\frac{\partial \Phi_G}{\partial V}\right)_{T,\mu}, \tag{22.33}$$

$$N = -\left(\frac{\partial \Phi_G}{\partial \mu}\right)_{T,V}. \tag{22.34}$$

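Example 22.2 Find the grand potential for an ideal gas, and show that eqns 22.33 and 22.34 lead to the correct expressions for $p$ and $N$.

Solution: Using eqns 21.36 and 22.15 we have that

$$\Phi_G = N k_B T\,[\ln(n\lambda_{\rm th}^3) - 1] - N k_B T \ln(n\lambda_{\rm th}^3) = -N k_B T, \tag{22.35}$$

and using the ideal gas equation ($pV = N k_B T$) this becomes

$$\Phi_G = -pV. \tag{22.36}$$

We can check that eqn 22.34 leads to the correct value of $N$ by evaluating

$$\left(\frac{\partial \Phi_G}{\partial \mu}\right)_{T,V} = \left(\frac{\partial \Phi_G}{\partial N}\right)_{T,V} \left(\frac{\partial N}{\partial \mu}\right)_{T,V}, \tag{22.37}$$

and since $\left(\frac{\partial \Phi_G}{\partial N}\right)_{T,V} = -k_B T$ (from eqn 22.35) and $\left(\frac{\partial \mu}{\partial N}\right)_{T,V} = k_B T/N$ we have that

$$\left(\frac{\partial \Phi_G}{\partial \mu}\right)_{T,V} = -k_B T \times \frac{N}{k_B T} = -N, \tag{22.38}$$

justifying eqn 22.34. Similarly,³

$$\left(\frac{\partial \Phi_G}{\partial V}\right)_{T,\mu} = -\left(\frac{\partial \Phi_G}{\partial \mu}\right)_{T,V} \left(\frac{\partial \mu}{\partial V}\right)_{T,\Phi_G} = N \left(\frac{\partial \mu}{\partial V}\right)_{T,\Phi_G}, \tag{22.39}$$

3 Using the reciprocity theorem, with $T$ held constant for all terms.

and since the constraint of constant $T$ and constant $\Phi_G = -N k_B T$ means constant $T$ and $N$, and using $N = nV$, we can use eqn 22.15 to obtain

$$\left(\frac{\partial \mu}{\partial V}\right)_{T,N} = k_B T \left(\frac{\partial \ln(N\lambda_{\rm th}^3/V)}{\partial V}\right)_{T,N} = -\frac{k_B T}{V}, \tag{22.40}$$

and eqn 22.39 becomes

$$\left(\frac{\partial \Phi_G}{\partial V}\right)_{T,\mu} = -\frac{N k_B T}{V} = -p, \tag{22.41}$$

thus justifying eqn 22.33.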
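The chain-rule manipulations of Example 22.2 can also be verified numerically by writing $\Phi_G$ as an explicit function of $T$, $V$ and $\mu$. The sketch below assumes that eqn 22.15 is solved for $N$, giving $N = V e^{\beta\mu}/\lambda_{\rm th}^3$, and then uses $\Phi_G = -N k_B T$ from eqn 22.35. The mass of 28 u, the state point and the step sizes are illustrative assumptions.

```python
import math

H = 6.62607015e-34          # Planck constant, J s
KB = 1.380649e-23           # Boltzmann constant, J/K
U_MASS = 1.66053906660e-27  # atomic mass unit, kg
EV = 1.602176634e-19        # joules per electronvolt

MASS = 28.0 * U_MASS        # roughly an N2 molecule (assumption)

def particle_number(temperature, volume, mu):
    """N(T, V, mu) obtained by solving eqn 22.15 for N."""
    lam = H / math.sqrt(2 * math.pi * MASS * KB * temperature)
    return volume * math.exp(mu / (KB * temperature)) / lam ** 3

def grand_potential(temperature, volume, mu):
    """Phi_G = -N k_B T (eqn 22.35), as a function of T, V and mu."""
    return -particle_number(temperature, volume, mu) * KB * temperature

T, V, mu = 300.0, 1.0e-3, -0.40 * EV  # illustrative state point

# N = -(dPhi_G/dmu) at constant T and V, eqn 22.34.
dmu = 1.0e-4 * KB * T
n_fd = -(grand_potential(T, V, mu + dmu) - grand_potential(T, V, mu - dmu)) / (2 * dmu)

# p = -(dPhi_G/dV) at constant T and mu, eqn 22.33.
dV = 1.0e-7
p_fd = -(grand_potential(T, V + dV, mu) - grand_potential(T, V - dV, mu)) / (2 * dV)

n_exact = particle_number(T, V, mu)
p_exact = n_exact * KB * T / V  # ideal gas equation
print(n_fd / n_exact, p_fd / p_exact)
```

Both ratios should be very close to one, confirming that the natural variables of $\Phi_G$ are $T$, $V$ and $\mu$.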

22.5 Chemical potential as Gibbs function per particle

If we scale a system by a factor $\lambda$, then we expect all the extensive⁴ variables will scale with $\lambda$, thus

$$U \to \lambda U, \quad S \to \lambda S, \quad V \to \lambda V, \quad N \to \lambda N, \tag{22.42}$$

4 The distinction between intensive and extensive variables is discussed in Section 11.1.2.

and writing the entropy $S$ as a function of $U$, $V$ and $N$, we have

$$\lambda S(U, V, N) = S(\lambda U, \lambda V, \lambda N), \tag{22.43}$$

so that differentiating with respect to $\lambda$ we have

$$S = \frac{\partial S}{\partial(\lambda U)}\frac{\partial(\lambda U)}{\partial \lambda} + \frac{\partial S}{\partial(\lambda V)}\frac{\partial(\lambda V)}{\partial \lambda} + \frac{\partial S}{\partial(\lambda N)}\frac{\partial(\lambda N)}{\partial \lambda}, \tag{22.44}$$

so that setting $\lambda = 1$ and using eqn 22.10, we have that

$$S = \frac{U}{T} + \frac{pV}{T} - \frac{\mu N}{T}, \tag{22.45}$$

and hence

$$U - TS + pV = \mu N. \tag{22.46}$$

We recognize the left-hand side of this equation as the Gibbs function, and so we have

$$G = \mu N. \tag{22.47}$$

This gives a new interpretation for the chemical potential: by rearranging the above equation, one has that

$$\mu = \frac{G}{N}, \tag{22.48}$$

so that the chemical potential $\mu$ can be thought of as the Gibbs function per particle. This analysis also implies that the grand potential $\Phi_G = F - \mu N = U - TS - \mu N$ can be rewritten (using eqn 22.46) as

$$\Phi_G = -pV. \tag{22.49}$$

This equation has been demonstrated to be correct for the speciﬁc example of the ideal gas (see eqn 22.36), but we have now shown that it is always correct if entropy is an extensive property.

22.6 Many types of particle

If there is more than one type of particle, then one can generalize the treatment in Section 22.5, and write

$$dU = T\,dS - p\,dV + \sum_i \mu_i\,dN_i, \tag{22.50}$$

where $N_i$ is the number of particles of species $i$ and $\mu_i$ is the chemical potential of species $i$. Correspondingly, we have the equations

$$dF = -p\,dV - S\,dT + \sum_i \mu_i\,dN_i, \tag{22.51}$$

$$dG = V\,dp - S\,dT + \sum_i \mu_i\,dN_i, \tag{22.52}$$

and in particular, when the pressure and temperature are held constant we have that

$$dG = \sum_i \mu_i\,dN_i. \tag{22.53}$$

This generalization will be useful in our treatment of chemical reactions in Section 22.8. In the following section, we make the connection between µ and the conservation of particle number.

22.7 Particle number conservation laws

Imagine that one has a set of particles in a box in which particle number is not conserved. This means that we are free to create or destroy particles at will. There might be an energy cost associated with doing this, but provided we have energy to 'pay' for the particles, no conservation laws would be broken. In this case, the system will try to minimize its availability (see Section 16.5) and if the constraints are that the box has fixed volume and fixed temperature, then the appropriate availability is the Helmholtz function⁵ $F$. The system will therefore choose a number of particles $N$ by minimizing $F$ with respect to $N$, i.e.

$$\left(\frac{\partial F}{\partial N}\right)_{V,T} = 0. \tag{22.54}$$

5 If the constraints were constant pressure and temperature, we would be dealing with $G$ not $F$; see Section 16.5.

This means that, from eqn 22.6,

$$\mu = 0. \tag{22.55}$$

We arrive at the important result that, for a set of particles with no conservation law concerning particle number, the chemical potential $\mu$ is zero. One example of such a particle is the photon.⁶

6 Strictly this is only for photons in a vacuum, which the following example will assume. Photons can under some circumstances have a non-zero chemical potential. For example, if electrons and holes combine in a light-emitting diode, it may be that the chemical potential of the electrons $\mu_e$, from the conduction band, is not balanced by the chemical potential of the holes $\mu_h$, from the valence band, and this leads to light with a non-zero chemical potential $\mu_\gamma = \mu_e + \mu_h$.

To understand this further, let us consider a set of particles for which particle number is a conserved quantity. Consider a gas of electrons. Electrons do have a conservation law: electron number has to be conserved, so the only way of annihilating an electron is by reacting it with a positron⁷ via the reaction

$$e^- + e^+ \rightleftharpoons \gamma, \tag{22.56}$$

where $\gamma$ denotes a photon.

7 A positron $e^+$ is an antielectron.

Thus imagine that our box contains $N_-$ electrons and $N_+$ positrons. We are constrained by our conservation law to fix the number $N = N_+ - N_-$, which also serves to ensure that charge is conserved. The system is at fixed $T$ and $V$, and hence we should minimize $F$ with respect to any variable, so let us choose $N_-$ as a variable to vary. Thus

$$\left(\frac{\partial F}{\partial N_-}\right)_{V,T,N} = 0. \tag{22.57}$$

In this case, $F$ is the sum of a term due to the Helmholtz function for the electrons and one for the positrons. Thus

$$\left(\frac{\partial F}{\partial N_-}\right)_{V,T,N_+} + \left(\frac{\partial F}{\partial N_+}\right)_{V,T,N_-} \frac{dN_+}{dN_-} = 0. \tag{22.58}$$

Now we have that

$$\left(\frac{\partial F}{\partial N_-}\right)_{V,T,N_+} = \mu_-, \tag{22.59}$$

the chemical potential of the electrons, while

$$\left(\frac{\partial F}{\partial N_+}\right)_{V,T,N_-} = \mu_+, \tag{22.60}$$

the chemical potential of the positrons. Moreover, since

$$\frac{dN_-}{dN_+} = 1, \tag{22.61}$$

we have that

$$\mu_+ + \mu_- = 0. \tag{22.62}$$

We are ignoring the chemical potential of the photons, since this is zero because photons do not have a conservation law.⁸

8 Again, this is true for most circumstances, the photons from a light-emitting diode being a notable counterexample.

22.8 Chemical potential and chemical reactions

We next want to consider how the chemical potential can be used to determine the equilibrium position of a chemical reaction. Before proceeding, we will prove an important result concerning the way the chemical potential of an ideal gas depends on pressure.

Example 22.3 Derive an expression for the dependence of the chemical potential of an ideal gas on pressure at fixed temperature.

Solution: Equation 22.15 and the ideal gas equation ($p = n k_B T$) imply that

$$\mu = k_B T \ln\left(\frac{\lambda_{\rm th}^3}{k_B T}\right) + k_B T \ln p. \tag{22.63}$$

It is useful to compare the chemical potential at standard temperature (298 K) and pressure ($p^{\ominus}$ = 1 bar = $10^5$ Pa), which we denote by $\mu^{\ominus}$, with the chemical potential measured at some other pressure p. Here the symbol $\ominus$ denotes the value of a function measured at standard temperature and pressure. The chemical potential µ(p) at pressure p is then given by

$$\mu(p) = \mu^{\ominus} + k_B T \ln\frac{p}{p^{\ominus}}. \tag{22.64}$$

Chemists often define their chemical potentials as the Gibbs function per mole, rather than per particle. In those units, one would have

$$\mu(p) = \mu^{\ominus} + RT \ln\frac{p}{p^{\ominus}}. \tag{22.65}$$

Margin note: Another way of solving this is to use the equation for the change in Gibbs function $dG = V\,dp - S\,dT$, which when the temperature is constant is $dG = V\,dp$. This can be integrated to give

$$G(p) = G^{\ominus} + \int_{p^{\ominus}}^{p} V\,dp$$

and hence

$$G(p) = G^{\ominus} + n_m RT \ln\frac{p}{p^{\ominus}}$$

for $n_m$ moles of gas. Equation 22.65 then follows.
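As a rough numerical illustration of eqn 22.65, the following sketch evaluates the shift µ(p) − µ⊖ per mole for an ideal gas. The function name and the sample pressures are assumptions made for this illustration; the standard pressure and temperature are the ones quoted above.

```python
import math

R = 8.314462618  # molar gas constant, J mol^-1 K^-1

def mu_shift_per_mole(p, p_standard=1.0e5, T=298.0):
    """Return mu(p) - mu_standard in J/mol for an ideal gas (eqn 22.65)."""
    return R * T * math.log(p / p_standard)

# Sample pressures (illustrative choices, not taken from the text).
for p in (0.1e5, 1.0e5, 10.0e5):
    print(f"p = {p:9.3g} Pa: mu - mu_standard = {mu_shift_per_mole(p):8.1f} J/mol")
```

Because the dependence is logarithmic, each factor of ten in pressure shifts the molar chemical potential by the same amount, RT ln 10, which is about 5.7 kJ mol⁻¹ at 298 K.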

We are now ready to think about a simple chemical reaction. Consider the chemical reaction

$$A \rightleftharpoons B. \tag{22.66}$$

The symbol $\rightleftharpoons$ indicates that in this reaction it is possible to have both the forward reaction A→B and the backward reaction B→A. If we have a container filled with a mixture of A and B, and we leave it to react for a while, then depending on whether A→B is more or less important than B→A, we can determine the equilibrium concentrations of A and B. For gaseous reactions, the concentration of A (or B) is related to that species' partial pressure$^{9}$ $p_A$ (or $p_B$). We define the equilibrium constant K as the ratio of these two partial pressures at equilibrium, i.e.

$$K = \frac{p_B}{p_A}. \tag{22.67}$$

When $K \ll 1$, the backwards reaction dominates and our container will be mainly filled with A. When $K \gg 1$, the forwards reaction dominates and our container will be mainly filled with B. The change in Gibbs function as this reaction proceeds is

$$dG = \mu_A\,dN_A + \mu_B\,dN_B. \tag{22.68}$$

However, since an increase in B is always accompanied by a corresponding decrease in A, we have that

$$dN_B = -dN_A, \tag{22.69}$$

and hence

$$dG = (\mu_B - \mu_A)\,dN_B. \tag{22.70}$$

Let us now denote the total molar Gibbs function change in a reaction by the symbol $\Delta_r G$. For a gaseous reaction, eqn 22.65 implies that

$$\Delta_r G = \Delta_r G^{\ominus} + RT \ln\frac{p_B}{p_A}, \tag{22.71}$$

where $\Delta_r G^{\ominus}$ is the difference between the molar chemical potentials of the two species under standard conditions. When $\Delta_r G < 0$, the forward reaction A → B occurs spontaneously. When $\Delta_r G > 0$, the backward reaction B → A occurs spontaneously. Equilibrium occurs when $\Delta_r G = 0$, and substituting this into eqn 22.71 and using eqn 22.67 shows that

$$\ln K = -\frac{\Delta_r G^{\ominus}}{RT}. \tag{22.72}$$

Hence there is a direct relationship between the equilibrium constant of a reaction and the difference in chemical potentials (measured under standard conditions) of the product and reactant.$^{10}$

9 The partial pressure of a gas in a mixture is what the pressure of that gas would be if all other components suddenly vanished. Dalton's law states that the total pressure of a mixture of gases is equal to the sum of the individual partial pressures of the gases in the mixture (see Section 6.3).

10 The reactant is defined to be the chemical on the left-hand side of the reaction; the product is defined to be the chemical on the right-hand side of the reaction.

It is useful to generalize these ideas to the case in which the chemical reaction is a bit more complicated than $A \rightleftharpoons B$. A general chemical reaction, with p reactants and q products, can be written in the form

$$\sum_{j=1}^{p} (-\nu_j) A_j \rightarrow \sum_{j=p+1}^{p+q} (+\nu_j) A_j, \tag{22.73}$$

where the $\nu_j$ coefficients are here defined to be negative for the reactants and where $A_j$ represents the jth substance. This can be rearranged to give

$$0 \rightarrow \sum_{j=1}^{p+q} \nu_j A_j. \tag{22.74}$$

Example 22.4 Equation 22.53 can be applied to chemical reactions, such as

$$\mathrm{N_2} + 3\mathrm{H_2} \rightarrow 2\mathrm{NH_3}. \tag{22.75}$$

This can be cast into the general form of eqn 22.74 by writing

$$\nu_1 = -1, \qquad \nu_2 = -3, \qquad \nu_3 = 2. \tag{22.76}$$

In a chemical system in equilibrium at constant temperature and pressure we have that the Gibbs function is minimized and so eqn 22.53 gives

$$\sum_{j=1}^{p+q} \mu_j\,dN_j = 0, \tag{22.77}$$

where $N_j$ is the number of molecules of type $A_j$. In order to keep the reaction balanced, the $dN_j$ must be proportional to $\nu_j$ and hence

$$\sum_{j=1}^{p+q} \nu_j \mu_j = 0. \tag{22.78}$$

This equation is very general.

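Example 22.5 For the chemical reaction $\mathrm{N_2} + 3\mathrm{H_2} \rightarrow 2\mathrm{NH_3}$, eqn 22.78 implies that

$$-\mu_{\mathrm{N_2}} - 3\mu_{\mathrm{H_2}} + 2\mu_{\mathrm{NH_3}} = 0. \tag{22.79}$$

One can generalize the previous definition of the equilibrium constant for a gaseous reaction in eqn 22.67 (for a simple $A \rightleftharpoons B$ reaction) to the following expression (for our general reaction in eqn 22.74):

$$K = \prod_{j=1}^{p+q} \left(\frac{p_j}{p^{\ominus}}\right)^{\nu_j}. \tag{22.80}$$

Example 22.6 For the chemical reaction $\mathrm{N_2} + 3\mathrm{H_2} \rightarrow 2\mathrm{NH_3}$, the equilibrium constant is

$$K = \frac{(p_{\mathrm{NH_3}}/p^{\ominus})^2}{(p_{\mathrm{N_2}}/p^{\ominus})(p_{\mathrm{H_2}}/p^{\ominus})^3} = \frac{p_{\mathrm{NH_3}}^2\,(p^{\ominus})^2}{p_{\mathrm{N_2}}\,p_{\mathrm{H_2}}^3}. \tag{22.81}$$

Equilibrium, given by eqn 22.78, implies that

$$\sum_{j=1}^{p+q} \nu_j \left(\mu_j^{\ominus} + RT \ln\frac{p_j}{p^{\ominus}}\right) = 0, \tag{22.82}$$

and writing

$$\Delta_r G^{\ominus} = \sum_{j=1}^{p+q} \nu_j \mu_j^{\ominus}, \tag{22.83}$$

we have that

$$\Delta_r G^{\ominus} + RT \sum_{j=1}^{p+q} \nu_j \ln\frac{p_j}{p^{\ominus}} = 0, \tag{22.84}$$

and hence

$$\Delta_r G^{\ominus} + RT \ln K = 0, \tag{22.85}$$

or equivalently

$$\ln K = -\frac{\Delta_r G^{\ominus}}{RT}, \tag{22.86}$$

in agreement with eqn 22.72 (which was proved only for the simple reaction $A \rightleftharpoons B$). Since $\ln K = -\Delta_r G^{\ominus}/RT$, we have that

$$\frac{d\ln K}{dT} = -\frac{1}{R}\,\frac{d(\Delta_r G^{\ominus}/T)}{dT}, \tag{22.87}$$

and using the Gibbs–Helmholtz relation (eqn 16.26) this becomes

$$\frac{d\ln K}{dT} = \frac{\Delta_r H^{\ominus}}{RT^2}. \tag{22.88}$$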

Note that if the reaction is exothermic under standard conditions, then $\Delta_r H^{\ominus} < 0$ and hence K decreases as temperature increases. Equilibrium therefore shifts away from the products of the reaction. If on the other hand the reaction is endothermic under standard conditions, then $\Delta_r H^{\ominus} > 0$ and hence K increases as temperature increases. Equilibrium therefore shifts towards the products of the reaction. This observation agrees with Le Chatelier's principle which states that 'a system at equilibrium, when subjected to a disturbance, responds in such a way as to minimize that disturbance'. In this case an exothermic reaction produces heat and this can raise the temperature, which then slows the forward reaction towards the products. In the case of an endothermic reaction, heat is absorbed by the reactants and this can lower the temperature which would speed up the forward reaction towards the products. Equation 22.88 can be written in the following form:

$$\frac{d\ln K}{d(1/T)} = -\frac{\Delta_r H^{\ominus}}{R}, \tag{22.89}$$

which is known as the van 't Hoff equation.$^{11}$ This implies that a graph of ln K against 1/T should yield a straight line whose gradient is $-\Delta_r H^{\ominus}/R$. This fact is used in the following example.

11 Jacobus Henricus van 't Hoff (1852–1911).
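The relations ln K = −∆rG⊖/RT and the van 't Hoff equation can be sketched in a few lines of code. The values of ∆rG⊖ and ∆rH⊖ below are hypothetical numbers chosen only for illustration, and the second function assumes that ∆rH⊖ does not vary with temperature over the range considered, which is an approximation.

```python
import math

R = 8.314462618  # molar gas constant, J mol^-1 K^-1

def equilibrium_constant(delta_g_std, T):
    """K from ln K = -Delta_r G_standard / (R T); delta_g_std in J/mol."""
    return math.exp(-delta_g_std / (R * T))

def k_at_new_temperature(K1, T1, T2, delta_h_std):
    """Integrate d ln K / d(1/T) = -Delta_r H_standard / R from T1 to T2.

    ASSUMPTION: Delta_r H_standard is treated as temperature independent.
    """
    return K1 * math.exp(-(delta_h_std / R) * (1.0 / T2 - 1.0 / T1))

# Hypothetical illustrative values (not taken from the text or from data).
DELTA_G = -10.0e3  # J/mol
DELTA_H = -50.0e3  # J/mol, i.e. an exothermic reaction

K_298 = equilibrium_constant(DELTA_G, 298.0)
K_350 = k_at_new_temperature(K_298, 298.0, 350.0, DELTA_H)
print(f"K(298 K) = {K_298:.3g}, K(350 K) = {K_350:.3g}")
```

For the exothermic case chosen here, K falls as the temperature rises, in line with the Le Chatelier argument above; flipping the sign of the assumed enthalpy makes K rise instead.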

Example 22.7 Consider the dissociation reaction of molecular hydrogen into atomic hydrogen, i.e. the reaction

$$\mathrm{H_2} \rightarrow \mathrm{H}\cdot + \mathrm{H}\cdot \tag{22.90}$$

The equilibrium constant for this reaction is plotted in Fig. 22.4. The plot of K against T emphasizes that the 'equilibrium for this reaction is well and truly on the left', meaning that the main constituent is H$_2$; molecular hydrogen is only very slightly dissociated even at 2000 K. Plotting the same data as ln K against 1/T yields a straight-line graph whose gradient yields $-\Delta H^{\ominus}/R$ for this reaction. For these data we find that $\Delta H^{\ominus}$ is about 440 kJ mol$^{-1}$. This is positive and hence the reaction is endothermic, which makes sense because you need to heat H$_2$ to break the molecular bond. This corresponds to a bond enthalpy per hydrogen molecule of $(440\ \mathrm{kJ\,mol^{-1}})/(N_A e) \approx 4.5$ eV.
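The procedure used in this example (fit a straight line to ln K against 1/T, read off the gradient, convert to eV) can be sketched as follows. The data of Fig. 22.4 are not reproduced here, so the sketch generates synthetic points from an assumed enthalpy of 440 kJ mol⁻¹ and an arbitrary intercept, and then checks that the fit recovers the assumed value. All function names are illustrative.

```python
R = 8.314462618             # molar gas constant, J mol^-1 K^-1
N_A = 6.02214076e23         # Avogadro constant, mol^-1
E_CHARGE = 1.602176634e-19  # elementary charge, C

def fit_slope(xs, ys):
    """Least-squares gradient of y against x."""
    n = len(xs)
    x_mean = sum(xs) / n
    y_mean = sum(ys) / n
    num = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
    den = sum((x - x_mean) ** 2 for x in xs)
    return num / den

def enthalpy_from_lnK(temps, lnKs):
    """Delta H (J/mol) from the gradient of ln K against 1/T, which is -Delta H / R."""
    inv_T = [1.0 / T for T in temps]
    return -R * fit_slope(inv_T, lnKs)

def kj_per_mol_to_ev(h_kj_per_mol):
    """Convert a molar enthalpy in kJ/mol to eV per molecule."""
    return h_kj_per_mol * 1.0e3 / (N_A * E_CHARGE)

# Synthetic data: ASSUMED enthalpy and an ARBITRARY intercept, for illustration only.
DH_ASSUMED = 440.0e3  # J/mol
INTERCEPT = 12.0
temps = [1000.0, 1250.0, 1500.0, 1750.0, 2000.0]
lnKs = [INTERCEPT - DH_ASSUMED / (R * T) for T in temps]

dh_fit = enthalpy_from_lnK(temps, lnKs)
print(f"fitted Delta H = {dh_fit / 1e3:.1f} kJ/mol")
print(f"per molecule   = {kj_per_mol_to_ev(dh_fit / 1e3):.2f} eV")
```

With 440 kJ mol⁻¹ the conversion gives about 4.56 eV per molecule, consistent with the rounded figure quoted above given that ∆H is only known approximately.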

Fig. 22.4 The equilibrium constant for the reaction H$_2$ → H· + H·, as a function of temperature. The same data are plotted in two different ways.

Chapter summary

• An extra term is appropriately introduced into the combined first and second law to give $dU = T\,dS - p\,dV + \mu\,dN$, and this allows for cases in which the number of particles can vary.
• µ is the chemical potential, which can be expressed as $\mu = \left(\frac{\partial G}{\partial N}\right)_{p,T}$. It is also the Gibbs function per particle.
• For a system which can exchange particles with its surroundings, the chemical potential plays a similar rôle in particle exchange as temperature does in heat exchange.
• The grand partition function $\mathcal{Z}$ is given by $\mathcal{Z} = \sum_i e^{\beta(\mu N_i - E_i)}$.
• The grand potential is $\Phi_G = -k_B T \ln \mathcal{Z} = U - TS - \mu N = -pV$.
• µ = 0 for particles with no conservation law.
• For a chemical reaction $dG = \sum_j \mu_j\,dN_j = 0$ and hence $\sum_j \nu_j \mu_j = 0$.
• The equilibrium constant K can be written as $\ln K = -\Delta_r G^{\ominus}/RT$.
• The temperature dependence of K follows $d\ln K/dT = \Delta_r H^{\ominus}/RT^2$.

Further reading

• Baierlein (2001) and Cook and Dickerson (1995) are both excellent articles concerning the nature of the chemical potential.
• Atkins and de Paula (2006) contains a treatment of the chemical potential from the perspective of chemistry.

Exercises

(22.1) Maximize the entropy $S = -k_B \sum_i P_i \ln P_i$, where $P_i$ is the probability of the ith level being occupied, subject to the constraints that $\sum_i P_i = 1$, $\sum_i P_i E_i = U$ and $\sum_i P_i N_i = N$ to rederive the grand canonical ensemble.

(22.2) The fugacity z is defined as $z = e^{\beta\mu}$. Using eqn 22.15, show that

$$z = n\lambda_{\rm th}^3 \tag{22.91}$$

for an ideal gas, and comment on the limits $z \ll 1$ and $z \gg 1$.

(22.3) Estimate the bond enthalpy of Br$_2$ using the data plotted in Fig. 22.5.

Fig. 22.5 The equilibrium constant for the reaction Br$_2$ → Br· + Br·, as a function of temperature.

(22.4) Derive eqns 22.21, 22.22 and 22.23.

(22.5) If the partition function $Z_N$ of a gas of N indistinguishable particles is given by $Z_N = Z_1^N/N!$, where $Z_1$ is the single-particle partition function, show that the chemical potential is given by

$$\mu = -k_B T \ln\frac{Z_1}{N}. \tag{22.92}$$

(22.6) (a) Consider the ionization of atomic hydrogen, governed by the equation

$$\mathrm{H} \rightleftharpoons p^+ + e^-, \tag{22.93}$$

where $p^+$ is a proton (equivalently a positively ionized hydrogen) and $e^-$ is an electron. Explain why

$$\mu_{\rm H} = \mu_p + \mu_e. \tag{22.94}$$

Using the partition function for hydrogen atoms from eqn 21.50, and using eqn 22.92, show that

$$-k_B T \ln\frac{Z_1^p}{N_p} - k_B T \ln\frac{Z_1^e}{N_e} = -k_B T \ln\left(\frac{Z_1^{\rm H}}{N_{\rm H}}\,e^{\beta R}\right), \tag{22.95}$$

where $Z_1^x$ and $N_x$ are the single-particle partition function and number of particles for species x, and where R = 13.6 eV. Hence show that

$$\frac{n_e n_p}{n_{\rm H}} = \frac{(2\pi m_e k_B T)^{3/2}}{h^3}\,e^{-\beta R}, \tag{22.96}$$

where $n_x = N_x/V$ is the number density of species x, stating any approximations you make. Equation 22.96 is known as the Saha equation.

(b) Explain why charge neutrality implies that $n_e = n_p$ and conservation of nucleons implies $n_{\rm H} + n_p = n$, where n is the total number density of hydrogen (neutral and ionized). Writing $y = n_p/n$ as the degree of ionization, show that

$$\frac{y^2}{1-y} = \frac{e^{-\beta R}}{n\lambda_{\rm th}^3}, \tag{22.97}$$

where $\lambda_{\rm th}$ is the thermal wavelength for the electrons. Find the degree of ionization of a cloud of atomic hydrogen at 1000 K and density $10^{20}$ m$^{-3}$.

(c) Equation 22.97 shows that the degree of ionization goes up when the density n goes down. Why is that?

23 Photons

23.1 The classical thermodynamics of electromagnetic radiation
23.2 Spectral energy density
23.3 Kirchhoff's law
23.4 Radiation pressure
23.5 The statistical mechanics of the photon gas
23.6 Black body distribution
23.7 Cosmic Microwave Background radiation
23.8 The Einstein A and B coefficients
Chapter summary
Further reading
Exercises

In this chapter, we will consider the thermodynamics of electromagnetic radiation. It was Maxwell who realized that light was an electromagnetic wave and that the speed of light, c, could be expressed in terms of fundamental constants taken from the theories of electricity and magnetism. In modern notation, this relation is

$$c = 1/\sqrt{\epsilon_0 \mu_0}, \tag{23.1}$$

where $\epsilon_0$ and $\mu_0$ are the permittivity and permeability of free space respectively. Later, Planck realized that light behaved not only like a wave but also like a particle. In the language of quantum mechanics, electromagnetic waves can be quantized as a set of particles which are known as photons. Each photon has an energy $\hbar\omega$ where $\omega = 2\pi\nu$ is the angular frequency.$^{1}$ Each photon has a momentum $\hbar k$ where k is the wave vector.$^{2}$ The ratio of the energy to the momentum of a photon is

$$\frac{\hbar\omega}{\hbar k} = 2\pi\nu \times \frac{\lambda}{2\pi} = \nu\lambda = c. \tag{23.2}$$

Electromagnetic radiation is emitted from any substance at non-zero temperature. This is known as thermal radiation. For objects at room temperature, you may not have noticed this effect because the frequency of the electromagnetic radiation is low and most of the emission is in the infrared region of the electromagnetic spectrum. Our eyes are only sensitive to electromagnetic radiation in the visible region. However, you may have noticed that a piece of metal in a furnace glows 'red hot' so that, for such objects at higher temperature, your eyes are able to pick up some of the thermal radiation.$^{3}$

This chapter is all about the properties of this thermal radiation. We will begin in Sections 23.1–23.4 by restricting ourselves to simple thermodynamics arguments to derive as much as we can about thermal radiation without going into the gory details, in much the same way as was originally done in the nineteenth century. This approach doesn't get us the whole way, but provides a lot of insight. Then in Sections 23.5–23.6, we will use the more advanced statistical mechanical techniques introduced in the previous chapters to do the job properly. The final sections concern the thermal radiation that exists in the Universe as a remnant of the hot Big Bang and the effect of thermal radiation on the behaviour of atoms and hence the operation of the laser.

1 ν is the frequency. The energy can also be expressed as hν. Recall also that $\hbar = h/(2\pi)$.

2 The wave vector $k = 2\pi/\lambda$ where λ is the wavelength.

3 Your eyes can pick up a lot of the thermal radiation if they are assisted by infrared goggles.

23.1 The classical thermodynamics of electromagnetic radiation

In this section, we will consider the thermodynamics of electromagnetic radiation from a classical standpoint, although we will allow ourselves the post-nineteenth century luxury of considering the electromagnetic radiation to consist of a gas of photons. First we will consider the effect of a collection of photons on the surroundings which contain it. Let us consider the surroundings to be a container of volume V, which in this subject is termed a 'cavity', which is held at temperature T. The photons inside the cavity are in thermal equilibrium with the cavity walls, and form electromagnetic standing waves. The walls of the cavity, shown in Fig. 23.1, are made of diathermal material (i.e. they transmit heat between the gas of photons inside the cavity and the surroundings). If n photons per unit volume comprise the gas of photons in the cavity then the energy density u of the gas may be written as:

$$u = \frac{U}{V} = n\hbar\omega, \tag{23.3}$$

where $\hbar\omega$ is the mean energy of a photon.

Fig. 23.1 A cavity of photons whose walls are diathermal, meaning they are in thermal contact with their surroundings, so that the temperature within may be controlled.

From kinetic theory (eqn 6.15), the pressure p of a gas of particles is $\frac{1}{3}nm\langle v^2\rangle$. For photons, we replace $\langle v^2\rangle$ in this formula by $c^2$, the square of the speed of light. Interpreting $mc^2$ as the energy of a photon, we then have that p is one third of the energy density. Thus

$$p = \frac{u}{3}, \tag{23.4}$$

which is different from the expression in eqn 6.25 ($p = 2u/3$) from the kinetic theory of gases, a point which we will return to in Section 25.2 (see eqn 25.21).$^{4}$ Equation 23.4 gives an expression for the radiation pressure due to the electromagnetic radiation.

4 The factor of two difference arises from writing the kinetic energy as $mc^2$ and not as $\frac{1}{2}mv^2$, and thus reflects the difference in form between the equation for the relativistic energy of a photon and that for the kinetic energy of a non-relativistic particle.

Also from kinetic theory (eqn 7.6), the flux Φ of photons on the walls of their container, that is to say the number of photons striking unit area of their container per second, is given by

$$\Phi = \frac{1}{4}nc, \tag{23.5}$$

where c is the speed of light. From this, and eqn 23.3, we can write the power incident per unit area of cavity wall, due to the photons, as

$$P = \hbar\omega\,\Phi = \frac{1}{4}uc. \tag{23.6}$$

This relation will be important as we now derive the Stefan–Boltzmann law, which relates the temperature of a body to the energy flux radiating from it in the form of electromagnetic radiation. We can derive this using the first law of thermodynamics in the form $dU = T\,dS - p\,dV$ to give

$$\left(\frac{\partial U}{\partial V}\right)_T = T\left(\frac{\partial S}{\partial V}\right)_T - p = T\left(\frac{\partial p}{\partial T}\right)_V - p, \tag{23.7}$$

where the last equality follows from using a Maxwell relation. The left-hand side of eqn 23.7 is simply$^{5}$ the energy density u. Hence, using eqn 23.7, together with eqn 23.4, we obtain

$$u = \frac{1}{3}T\left(\frac{\partial u}{\partial T}\right)_V - \frac{u}{3}. \tag{23.8}$$

Rearranging gives:

$$4u = T\left(\frac{\partial u}{\partial T}\right)_V, \tag{23.9}$$

from which follows

$$4\,\frac{dT}{T} = \frac{du}{u}. \tag{23.10}$$

Equation 23.10 may be integrated to give:

$$u = AT^4, \tag{23.11}$$

where A is a constant of the integration with units J K$^{-4}$ m$^{-3}$. We can now use eqn 23.6 to give us the power incident$^{6}$ per unit area.$^{7}$

$$P = \frac{1}{4}uc = \left(\frac{1}{4}Ac\right)T^4 = \sigma T^4, \tag{23.12}$$

where the term in brackets, $\sigma = \frac{1}{4}Ac$, is known as the Stefan–Boltzmann constant. Equation 23.12 is known as the Stefan–Boltzmann law or sometimes as Stefan's law. For the moment, we have no idea what value the constant σ takes and this is something that was originally determined from experiment. In Section 23.5, using the techniques of statistical mechanics, we will derive an expression for this constant.

5 This should be obvious since it is the definition of energy density. However, if you want to convince yourself, notice that differentiating U = uV with respect to V yields $\left(\frac{\partial U}{\partial V}\right)_T = u + V\left(\frac{\partial u}{\partial V}\right)_T = u$, because $\left(\frac{\partial u}{\partial V}\right)_T = 0$ since u, an energy density, is independent of volume.

6 Note that when the cavity is in equilibrium with the radiation inside it, the power incident is equal to the power emitted; hence the expression for P expresses the power emitted by the surface and the power incident on the surface.

7 The power per unit area is equal to an energy flux.

23.2 Spectral energy density

The energy density u of electromagnetic radiation is a quantity which tells you how many Joules are stored in a cubic metre of cavity. What we want to do now is to specify in which frequency ranges that energy is stored. All of this will fall out of the statistical mechanical treatment in Section 23.5, but we want to continue to apply a classical treatment to see how far we can get. To do this, consider two containers, each in contact with thermal reservoirs at temperature T and joined to one another by a tube, as illustrated schematically in Fig. 23.2. The system is allowed to come to equilibrium. The thermal reservoirs are at the same temperature T and so we know from the second law of thermodynamics that there can be no net heat flow from either one of the bodies to the other. Therefore there can be no net energy flux along the tube, so that the energy flux from the soot-lined cavity along the tube from left to right must be balanced by the energy flux from the mirror-lined cavity along the tube from right to left. Equation 23.12 thus tells us that each cavity must have the same energy density u. This argument can be repeated for cavities of different shape and size as well as different coatings. Hence we conclude that u is independent of shape, size or material of the cavity. But maybe one cavity might have more energy density than the other at certain wavelengths, even if it has to have the same energy density overall? This is not the case, as we shall now prove. First, we make a definition.

Fig. 23.2 Two cavities at temperature T; one is lined with soot and the other with a mirror coating.

• The spectral energy density $u_\lambda$ is defined as follows: $u_\lambda\,d\lambda$ is the energy density due to those photons which have wavelengths between λ and λ + dλ. The total energy density is then

$$u = \int u_\lambda\,d\lambda. \tag{23.13}$$

Margin note: $u_\lambda$ has units J m$^{-3}$ m$^{-1}$. We can also define a spectral density in terms of frequency ν, so that $u_\nu\,d\nu$ is the energy density due to those photons which have frequencies between ν and ν + dν.

Now imagine that a filter, which only allows a narrow band of radiation at wavelength λ to pass, is inserted at point A in Fig. 23.2 and the system is left to come to equilibrium. The same arguments listed above apply in this case: there is no net energy flux from one cavity to the other and hence the specific internal energy within a narrow wavelength range is the same for each case:

$$u_\lambda^{\rm soot}(T) = u_\lambda^{\rm mirror}(T). \tag{23.14}$$

This demonstrates that the spectral internal energy has no dependence on the material, shape, size or nature of a cavity. The spectral energy density is thus a universal function of λ and T only.

23.3 Kirchhoff's law

We now wish to discuss how well particular surfaces of a cavity will absorb or emit electromagnetic radiation of a particular frequency or wavelength. We therefore make the following additional definitions:

• The spectral absorptivity $\alpha_\lambda$ is the fraction of the incident radiation which is absorbed at wavelength λ.
• The spectral emissive power $e_\lambda$ of a surface is a function such that $e_\lambda\,d\lambda$ is the power emitted per unit area by the electromagnetic radiation having wavelengths between λ and λ + dλ.

Margin note: $\alpha_\lambda$ is dimensionless. $e_\lambda$ has units W m$^{-2}$ m$^{-1}$.

Using these definitions, we may now write down the form for the power per unit area absorbed by a surface, if the incident spectral energy density is $u_\lambda\,d\lambda$, as follows:

$$\frac{1}{4}\,u_\lambda\,d\lambda\;c\,\alpha_\lambda. \tag{23.15}$$

The power per unit area emitted by a surface is given by

$$e_\lambda\,d\lambda. \tag{23.16}$$

In equilibrium, the expressions in eqns 23.15 and 23.16 must be equal, and hence

$$\frac{e_\lambda}{\alpha_\lambda} = \frac{c}{4}\,u_\lambda. \tag{23.17}$$

Equation 23.17 expresses Kirchhoff's law, which states that the ratio $e_\lambda/\alpha_\lambda$ is a universal function of λ and T. Therefore, if you fix λ and T, the ratio $e_\lambda/\alpha_\lambda$ is fixed and hence $e_\lambda \propto \alpha_\lambda$. In other words 'good absorbers are good emitters' and 'bad absorbers are bad emitters'.

Example 23.1 Dark coloured objects which absorb most of the light that falls on them will be good at emitting thermal radiation. One has to be a bit careful here because you have to be sure about which wavelength you are talking about. A better statement of Kirchhoff's law would be 'good absorbers at one wavelength are good emitters at the same wavelength'. For example, a white coffee mug absorbs poorly in visible wavelengths so looks white. A black, but otherwise identical, coffee mug absorbs well in visible wavelengths so looks black. Which one is best at keeping your coffee warm? You might conclude that it is the white mug because 'poor absorbers are poor emitters' and that the mug will lose less heat by thermal radiation. However, a hot mug emits radiation mainly in the infrared region of the electromagnetic spectrum,$^{8}$ and so the mug being white in the visible is immaterial; what you need to know is what 'colour' each mug is in the infrared, i.e. measuring their absorption spectra at infrared wavelengths will tell you about their emission properties there.

8 See Appendix D.

A perfect black body is an object which is defined to have $\alpha_\lambda = 1$ for all λ. Kirchhoff's law expressed in eqn 23.17 tells us that for this maximum value of α, a black body is the best possible emitter. It is often useful to think of a black body cavity which is an enclosure whose walls have $\alpha_\lambda = 1$ for all λ and which contains a gas of photons at the same temperature as the walls, due to emission and absorption of photons by the atoms in the walls. The gas of photons contained in the black body cavity is known as black body radiation.

Example 23.2 The temperature of the Earth's surface is maintained by radiation from the Sun. By making the approximation that the Sun and the Earth behave as black bodies, show that the ratio of the Earth's temperature to that of the Sun is given by

$$\frac{T_{\rm Earth}}{T_{\rm Sun}} = \sqrt{\frac{R_{\rm Sun}}{2D}}, \tag{23.18}$$

where $R_{\rm Sun}$ is the radius of the Sun and the Earth–Sun separation is D.

Solution: The Sun emits a power equal to its surface area $4\pi R_{\rm Sun}^2$ multiplied by $\sigma T_{\rm Sun}^4$. This power is known as its luminosity L (measured in Watts), so that

$$L = 4\pi R_{\rm Sun}^2\,\sigma T_{\rm Sun}^4. \tag{23.19}$$

At a distance D from the Sun, this power is uniformly distributed over a sphere with surface area $4\pi D^2$, and the Earth is only able to 'catch' this power over its projected area $\pi R_{\rm Earth}^2$. Thus the power incident on the Earth is

$$\text{power incident} = L\,\frac{\pi R_{\rm Earth}^2}{4\pi D^2}. \tag{23.20}$$

The power emitted by the Earth, assuming it has a uniform temperature $T_{\rm Earth}$ and behaves as a black body, is simply $\sigma T_{\rm Earth}^4$ multiplied by the Earth's surface area $4\pi R_{\rm Earth}^2$, so that

$$\text{power emitted} = 4\pi R_{\rm Earth}^2\,\sigma T_{\rm Earth}^4. \tag{23.21}$$

Equating eqn 23.20 and eqn 23.21 yields the desired result. Putting in the numbers $R_{\rm Sun} = 7 \times 10^8$ m, $D = 1.5 \times 10^{11}$ m and $T_{\rm Sun} = 5800$ K yields $T_{\rm Earth} = 280$ K, which is not bad given the crudeness of the assumptions.
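The arithmetic of this example is easy to check with a short script. The sketch below uses the numbers quoted in the solution; the Earth's radius (about 6.37 × 10⁶ m) is not given in the text and is included only to confirm that it cancels from the power balance. The function names are illustrative.

```python
import math

SIGMA = 5.670374e-8  # Stefan-Boltzmann constant, W m^-2 K^-4

def earth_temperature(T_sun, R_sun, D):
    """Black-body estimate T_Earth = T_Sun * sqrt(R_Sun / (2 D)) (eqn 23.18)."""
    return T_sun * math.sqrt(R_sun / (2.0 * D))

def power_incident(T_sun, R_sun, D, R_earth):
    """Solar luminosity (eqn 23.19) times the fraction caught by the Earth (eqn 23.20)."""
    luminosity = 4.0 * math.pi * R_sun**2 * SIGMA * T_sun**4
    return luminosity * (math.pi * R_earth**2) / (4.0 * math.pi * D**2)

def power_emitted(T_earth, R_earth):
    """Black-body emission from the whole surface of the Earth (eqn 23.21)."""
    return 4.0 * math.pi * R_earth**2 * SIGMA * T_earth**4

T_SUN, R_SUN, D = 5800.0, 7.0e8, 1.5e11  # values quoted in the example
R_EARTH = 6.37e6                         # m; not from the text, it cancels in the result

T_earth = earth_temperature(T_SUN, R_SUN, D)
print(f"T_Earth = {T_earth:.0f} K")
print(f"incident / emitted = {power_incident(T_SUN, R_SUN, D, R_EARTH) / power_emitted(T_earth, R_EARTH):.6f}")
```

The temperature comes out at about 280 K, and the ratio of incident to emitted power is unity whatever radius is assumed for the Earth.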

23.4 Radiation pressure

To summarize the results of the earlier sections in this chapter, for black body radiation we have:

$$\text{power radiated per unit area}\quad P = \frac{1}{4}uc = \sigma T^4, \tag{23.22}$$

$$\text{energy density in radiation}\quad u = \frac{4\sigma}{c}\,T^4, \tag{23.23}$$

$$\text{pressure on cavity walls}\quad p = \frac{u}{3} = \frac{4\sigma T^4}{3c}. \tag{23.24}$$

If, however, one is dealing with a beam of light, in which all the photons are going in the same direction (rather than in each and every direction as we have in a gas of photons) then these results need to be modified. The pressure exerted by a collimated beam of light can be calculated as follows: a cubic metre of this beam has momentum $n\hbar k = n\hbar\omega/c$, and this momentum is absorbed by a unit area of surface, normal to the beam, in a time 1/c. Thus the pressure is $p = [n\hbar\omega/c]/[1/c] = n\hbar\omega = u$. A cubic metre of the beam has energy $n\hbar\omega$, so the power P incident on unit area of surface is $P = n\hbar\omega/(1/c) = uc$. Hence, we have

$$\text{power radiated per unit area}\quad P = uc = \sigma T^4, \tag{23.25}$$

$$\text{energy density in radiation}\quad u = \frac{\sigma}{c}\,T^4, \tag{23.26}$$

$$\text{pressure on cavity walls}\quad p = u = \frac{\sigma T^4}{c}. \tag{23.27}$$

It is worth emphasising that electromagnetic radiation exerts a real pressure on a surface and this can be calculated using eqn 23.24 or eqn 23.27 as appropriate. An example of a calculation of radiation pressure is given below.

Example 23.3 Sunlight falls on the surface of the Earth with a power per unit area equal to P = 1370 W m$^{-2}$. Calculate the radiation pressure and compare it to atmospheric pressure.

Solution: Sunlight on the Earth's surface consists of photons all going in the same direction,$^{9}$ and hence we can use

$$p = \frac{P}{c} = 4.6\ \mu\text{Pa}, \tag{23.28}$$

which is more than ten orders of magnitude lower than atmospheric pressure (which is $\sim 10^5$ Pa).

9 We make this approximation because the Sun is sufficiently far from the Earth, so that all the rays of light arriving on Earth are parallel.
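As a quick check of this estimate, the following sketch evaluates p = P/c for the quoted solar power per unit area and compares it with an atmospheric pressure of 10⁵ Pa. The function name is illustrative, and the surface is assumed to absorb the beam completely, as in the derivation of eqn 23.27.

```python
C = 299792458.0  # speed of light, m s^-1

def beam_pressure(power_per_area):
    """Radiation pressure p = P / c of a collimated, fully absorbed beam."""
    return power_per_area / C

p_sunlight = beam_pressure(1370.0)  # W m^-2, value quoted in the example
P_ATMOSPHERE = 1.0e5                # Pa, order-of-magnitude value used in the text

print(f"radiation pressure = {p_sunlight * 1e6:.2f} micropascal")
print(f"atmospheric / radiation = {P_ATMOSPHERE / p_sunlight:.2e}")
```

The ratio is a little over 2 × 10¹⁰, which is the 'more than ten orders of magnitude' quoted above.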

23.5 The statistical mechanics of the photon gas

Our argument so far has only used classical thermodynamics. We have been able to predict that the energy density u of a photon gas behaves as $AT^4$ but we have been able to say nothing about the constant A. It was only through the development of quantum theory that it was possible to derive what A is, and we will present this in what follows. The crucial insight is that electromagnetic waves in a cavity can be described by simple harmonic oscillators. The angular frequency ω of each mode of oscillation is related to the wave vector k by

$$\omega = ck \tag{23.29}$$

(see Fig. 23.3) and hence the density of states$^{10}$ of electromagnetic waves as a function of wave vector k is given by

$$g(k)\,dk = \frac{4\pi k^2\,dk}{(2\pi/L)^3} \times 2, \tag{23.30}$$

where the cavity is assumed to be a cube of volume $V = L^3$ and the factor 2 corresponds to the two possible polarizations of the electromagnetic waves. Thus

$$g(k)\,dk = \frac{V k^2\,dk}{\pi^2}, \tag{23.31}$$

Fig. 23.3 The relation between ω and k, for example that in eqn 23.29, is known as a dispersion relation. For light (plotted here) this relation is very simple and is called non-dispersive because the phase velocity (ω/k) and the group velocity (dω/dk) are equal.

10 This treatment is similar to the analysis in Section 21.1 for the ideal gas.

and hence the density of states g(ω), now written as a function of frequency using eqn 23.29, is

$$g(\omega) = g(k)\,\frac{dk}{d\omega} = \frac{g(k)}{c}, \tag{23.32}$$

and hence

$$g(\omega)\,d\omega = \frac{V\omega^2\,d\omega}{\pi^2 c^3}. \tag{23.33}$$

We can derive U for the photon gas by using the expression for U for a single simple harmonic oscillator in eqn 20.29 to give

$$U = \int_0^\infty g(\omega)\,d\omega\;\hbar\omega\left(\frac{1}{2} + \frac{1}{e^{\beta\hbar\omega} - 1}\right). \tag{23.34}$$

This presents us with a problem since the first part of this expression, due to the sum of all the zero-point energies, diverges:

$$\int_0^\infty g(\omega)\,d\omega\;\frac{1}{2}\hbar\omega \rightarrow \infty. \tag{23.35}$$

This must correspond to the energy of the vacuum, so after swallowing hard we redefine our zero of energy so that this infinite contribution is swept conveniently under the carpet. We are therefore left with

$$U = \int_0^\infty g(\omega)\,d\omega\;\frac{\hbar\omega}{e^{\beta\hbar\omega} - 1} = \frac{V\hbar}{\pi^2 c^3}\int_0^\infty \frac{\omega^3\,d\omega}{e^{\beta\hbar\omega} - 1}. \tag{23.36}$$

If we make the substitution $x = \beta\hbar\omega$, we can rewrite this as

$$U = \frac{V\hbar}{\pi^2 c^3}\left(\frac{1}{\hbar\beta}\right)^4 \int_0^\infty \frac{x^3\,dx}{e^x - 1} = \frac{V\pi^2 k_B^4}{15 c^3 \hbar^3}\,T^4, \tag{23.37}$$

and hence $u = U/V = AT^4$. Here, use has been made of the integral

$$\int_0^\infty \frac{x^3\,dx}{e^x - 1} = \zeta(4)\Gamma(4) = \frac{\pi^4}{15}, \tag{23.38}$$

which is proved in Appendix C.4 (see eqn C.25). This therefore establishes that the constant $A = 4\sigma/c$ is given by

$$A = \frac{\pi^2 k_B^4}{15 c^3 \hbar^3}, \tag{23.39}$$

and hence the Stefan–Boltzmann constant$^{11}$ σ is

$$\sigma = \frac{\pi^2 k_B^4}{60 c^2 \hbar^3} = 5.67 \times 10^{-8}\ \text{W m}^{-2}\,\text{K}^{-4}. \tag{23.40}$$

11 If you prefer to use h, rather than ħ, the Stefan–Boltzmann constant is written as $\sigma = \dfrac{2\pi^5 k_B^4}{15 c^2 h^3}$.
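Both the integral in eqn 23.38 and the numerical value in eqn 23.40 can be checked with a few lines of code. The sketch below uses Simpson's rule with an upper cut-off of x = 60 (beyond which the integrand is negligible), and the exact SI values of the fundamental constants; the function names and the choice of integration method are assumptions of this illustration, not part of the derivation.

```python
import math

K_B = 1.380649e-23      # Boltzmann constant, J K^-1
H = 6.62607015e-34      # Planck constant, J s
HBAR = H / (2.0 * math.pi)
C = 299792458.0         # speed of light, m s^-1

def bose_integrand(x):
    """x^3 / (e^x - 1), with the x -> 0 limit (zero) handled explicitly."""
    return 0.0 if x == 0.0 else x**3 / math.expm1(x)

def simpson(f, a, b, n):
    """Composite Simpson's rule with an even number n of intervals."""
    if n % 2:
        raise ValueError("n must be even")
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3.0

integral = simpson(bose_integrand, 0.0, 60.0, 20000)
sigma_hbar = math.pi**2 * K_B**4 / (60.0 * C**2 * HBAR**3)   # eqn 23.40
sigma_h = 2.0 * math.pi**5 * K_B**4 / (15.0 * C**2 * H**3)   # form using h
A = 4.0 * sigma_hbar / C                                     # constant in u = A T^4

print(f"integral   = {integral:.6f}  (pi^4/15 = {math.pi**4 / 15.0:.6f})")
print(f"sigma      = {sigma_hbar:.4e} W m^-2 K^-4")
print(f"A = 4 sigma / c = {A:.4e} J K^-4 m^-3")
```

The two forms of σ agree, and the constant A comes out with units J K⁻⁴ m⁻³ as required by u = AT⁴.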

23.6 Black body distribution

The expression in eqn 23.36 can be rewritten as

$$u = \frac{U}{V} = \int u_\omega\,d\omega, \tag{23.41}$$

where $u_\omega$ is a different form of the spectral energy density (written this time as a function of angular frequency $\omega = 2\pi\nu$). It thus takes the form

$$u_\omega = \frac{\hbar}{\pi^2 c^3}\,\frac{\omega^3}{e^{\beta\hbar\omega} - 1}. \tag{23.42}$$

This spectral energy density function is known as a black body distribution. We can also express this in terms of frequency ν by writing $u_\omega\,d\omega = u_\nu\,d\nu$, and using $\omega = 2\pi\nu$ and hence $d\omega/d\nu = 2\pi$. This yields

$$u_\nu = \frac{8\pi h}{c^3}\,\frac{\nu^3}{e^{\beta h\nu} - 1}. \tag{23.43}$$

This function is plotted in Fig. 23.4(a). Similarly, we can transform this into wavelength, by writing $u_\nu\,d\nu = u_\lambda\,d\lambda$, and using $\nu = c/\lambda$ and hence $d\nu/d\lambda = -c/\lambda^2$. This yields an expression for $u_\lambda$ as follows:

$$u_\lambda = \frac{8\pi hc}{\lambda^5}\,\frac{1}{e^{\beta hc/\lambda} - 1}. \tag{23.44}$$

Fig. 23.4 The black body distribution of spectral energy density, plotted for 200 K, 250 K and 300 K as a function of (a) frequency and (b) wavelength. The upper scale shows the frequency in inverse centimetres, a unit beloved of spectroscopists.
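The change of variable between the frequency and wavelength forms is a common source of slips, so it may help to check it numerically. The following sketch codes eqns 23.43 and 23.44 directly and confirms that they are related by $u_\lambda = u_\nu\,|d\nu/d\lambda| = u_\nu\,c/\lambda^2$; the function names and the sample temperature and wavelength are illustrative choices.

```python
import math

K_B = 1.380649e-23  # Boltzmann constant, J K^-1
H = 6.62607015e-34  # Planck constant, J s
C = 299792458.0     # speed of light, m s^-1

def u_nu(nu, T):
    """Spectral energy density per unit frequency (eqn 23.43), J m^-3 Hz^-1."""
    return (8.0 * math.pi * H * nu**3 / C**3) / math.expm1(H * nu / (K_B * T))

def u_lambda(lam, T):
    """Spectral energy density per unit wavelength (eqn 23.44), J m^-3 m^-1."""
    return (8.0 * math.pi * H * C / lam**5) / math.expm1(H * C / (lam * K_B * T))

T = 300.0     # K, one of the temperatures plotted in Fig. 23.4
lam = 10.0e-6 # m, a sample infrared wavelength
nu = C / lam

converted = u_nu(nu, T) * C / lam**2  # u_nu |d nu / d lambda|
print(f"u_lambda directly   : {u_lambda(lam, T):.6e}")
print(f"u_nu * c / lambda^2 : {converted:.6e}")
```

The factor $c/\lambda^2$ is the reason the two curves in Fig. 23.4 do not peak at corresponding values of ν and λ.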

This is shown in Fig. 23.4(b). We note several features of this black body distribution. • At low frequency (i.e. long wavelength), when hν/kB T 1, the exponential term can be written as eβhν ≈ 1 +

hν , kB T

(23.45)

256 Photons

and hence uν →

8πkB T ν 2 , c3

(23.46)

and equivalently

8πkB T . (23.47) λ4 These two expressions are diﬀerent forms of the Rayleigh–Jeans law, and were derived in the nineteenth century before the advent of quantum mechanics. As that might imply, Planck’s constant h does not appear in them. These expressions are the correct limit of the black body distribution, as shown in Fig. 23.5. They created problems at the time, because if you take the Rayleigh–Jeans form of uλ and assume it is true for all wavelengths, and then try and integrate it to get the total internal energy U , you ﬁnd that ∞ ∞ 8πkB T dλ uλ dλ = → ∞. (23.48) U= λ4 0 0

u

uλ →

Fig. 23.5 The black body energy density uλ (thick solid line), together with the Rayleigh–Jeans equation (eqn 23.47) which is the longwavelength limit of the black body distribution.

12

Note that if we divide the energy density by the time taken for unit volume of photons to pass through unit area of surface, namely 1/c, we have the energy ﬂux.

This apparent divergence in $U$ was called the ultraviolet catastrophe, because integrating down to small wavelengths (towards the ultraviolet) produced a divergence. In fact, such high-energy electromagnetic waves are not excited because light is quantized and it costs too much energy to produce an ultraviolet photon when the temperature is too low. Of course, using the correct black body $u_\lambda$ from eqn 23.44, the correct form

$$ U = \int_0^\infty u_\lambda\,d\lambda = \frac{4\sigma}{c}\,T^4 \tag{23.49} $$

is obtained.

• One can also define the radiance (or surface brightness) $B_\nu$ as the flux of radiation per steradian (the unit of solid angle, abbreviated to sr) in a unit frequency interval. This function gives the power through an element of unit area, per unit frequency, from an element of solid angle. The units of radiance are W m$^{-2}$ Hz$^{-1}$ sr$^{-1}$. Because there are a total of $4\pi$ steradians, we have that$^{12}$

$$ B_\nu(T) = \frac{c}{4\pi}\,u_\nu(T) = \frac{2h}{c^2}\,\frac{\nu^3}{e^{\beta h\nu} - 1}. \tag{23.50} $$

By analogy, $B_\lambda$, with units W m$^{-2}$ m$^{-1}$ sr$^{-1}$, is defined by

$$ B_\lambda(T) = \frac{c}{4\pi}\,u_\lambda(T) = \frac{2hc^2}{\lambda^5}\,\frac{1}{e^{\beta hc/\lambda} - 1}. \tag{23.51} $$

• Wien found experimentally in 1896, before the advent of quantum mechanics, that the product of the temperature and the wavelength at which the black body distribution $u_\lambda$ has its maximum is a constant. This is a statement of what is known as Wien's law:

$$ \lambda_{\max} T = \text{a constant}. \tag{23.52} $$

Wien's law follows from the fact that $\lambda_{\max}$ can be determined by the condition $du_\lambda/d\lambda = 0$, and applying this to eqn 23.44 leads to $\beta hc/\lambda_{\max} =$ a constant. Hence $\lambda_{\max}T$ is a constant, which is Wien's law. The law tells us that at room temperature, objects which are approximately black bodies will radiate the most at wavelength $\lambda_{\max} \approx 10\,\mu$m, which is in the infrared region of the electromagnetic spectrum, as demonstrated in Fig. 23.4(b). One can easily show$^{13}$ that the maximum in $u_\nu$ occurs at a frequency given by

$$ \frac{h\nu}{k_{\rm B}T} = 2.82144, \tag{23.53} $$

and the maximum in $u_\lambda$ occurs at a wavelength given by

$$ \frac{hc}{\lambda k_{\rm B}T} = 4.96511. \tag{23.54} $$

This can be used to show that the product $\lambda T$ is given by

$$ \lambda T = \begin{cases} 5.1\ \text{mm K} & \text{at the maximum of } u_\nu(T), \\ 2.9\ \text{mm K} & \text{at the maximum of } u_\lambda(T). \end{cases} \tag{23.55} $$

These maxima do not occur at the same place for each distribution because one is measured per unit frequency interval and the other per unit wavelength interval, and these are different.$^{14}$ Figure 23.6(a) shows how the shape of the distribution changes with temperature for $u_\nu$ and Fig. 23.6(b) for $u_\lambda$ on log-log scales. These diagrams show how the peak of the black body distribution lies in the optical region of the spectrum for temperatures of several thousand kelvin, but in the microwave region for a temperature of a few kelvin. This fact is very relevant for the black body radiation in the Universe, which we describe in the following section.

13 See Exercise 23.2.
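The numbers in eqns 23.53 to 23.55 can be checked with the fixed-point iteration suggested in Exercise 23.2. The following is a minimal sketch: the function name `wien_root`, the tolerance and the iteration cap are our own choices, and the physical constants are the CODATA SI values.

```python
import math

# CODATA constants in SI units
H = 6.62607015e-34      # Planck constant, J s
C = 2.99792458e8        # speed of light, m/s
K_B = 1.380649e-23      # Boltzmann constant, J/K


def wien_root(alpha, x0=1.0, tol=1e-12, max_iter=200):
    """Solve x = alpha * (1 - exp(-x)) by fixed-point iteration (eqn 23.67)."""
    x = x0
    for _ in range(max_iter):
        x_new = alpha * (1.0 - math.exp(-x))
        if abs(x_new - x) < tol:
            return x_new
        x = x_new
    raise RuntimeError("fixed-point iteration did not converge")


x_nu = wien_root(3)       # maximum of u_nu:     x = h nu / (k_B T)
x_lambda = wien_root(5)   # maximum of u_lambda: x = h c / (lambda k_B T)

# Product lambda * T at each maximum, in metre kelvin
lambda_T_nu = H * C / (K_B * x_nu)
lambda_T_lambda = H * C / (K_B * x_lambda)

print(f"x_nu = {x_nu:.5f}, x_lambda = {x_lambda:.5f}")
print(f"lambda*T = {lambda_T_nu * 1e3:.2f} mm K (u_nu), "
      f"{lambda_T_lambda * 1e3:.2f} mm K (u_lambda)")
```

Starting from the guess x = 1, the iteration settles within a few tens of steps because the map is strongly contracting near each root.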

14 The difference between $d\nu$ and $d\lambda$ is derived as follows: $c = \nu\lambda$, and hence $\nu = c/\lambda$, so that $d\nu = -(c/\lambda^2)\,d\lambda$.

23.7 Cosmic Microwave Background radiation

In 1978, Penzias and Wilson of Bell Labs, New Jersey, USA won the Nobel Prize for their serendipitous discovery (in 1963–1965) of seemingly uniform microwave emission coming from all directions in the sky, which has come to be known as the cosmic microwave background (CMB). Remarkably, the spectral shape of this emission exhibits, to high precision, the distribution for black body radiation of temperature 2.7 K (see Fig. 23.7) with a peak in the emission spectrum at a wavelength of about 1 mm. It is startling that the radiation is uniform, or isotropic, to better than 1 part in $10^5$ (meaning that its spectrum and intensity are almost the same if you measure in different directions in the sky). This is one of the key pieces of evidence in favour of the hot Big Bang model for the origin of the Universe. It implies that there was a time when all of the Universe we see now was in thermal equilibrium.$^{15}$

15 Note that diﬀerent black body distributions, that is multiple curves corresponding to regions at a variety of different temperatures, do not superpose to form a single black body distribution.


Fig. 23.6 The black body distribution of spectral energy density, plotted on a logarithmic scale for four diﬀerent temperatures as a function of (a) frequency and (b) wavelength.

We can make various inferences about the origin of the Universe from observations of the cosmic microwave background. It can be shown that the energy density of radiation in the expanding Universe is inversely proportional to the fourth power of the scale factor (which you can think of as the linear magnification factor describing the separation of a pair of marker galaxies in the Universe, a quantity which increases with cosmic time). From the Stefan–Boltzmann law, the energy density of radiation is proportional to $T^4$, so temperature and scale factor are inversely proportional to one another, and the Universe cools as it expands. Conversely, when the Universe was much younger, it was much smaller and much hotter. Extrapolating back in time, one finds that temperatures were such that physical conditions were very different. For example, it was too hot for matter to exist as atoms, and everything was ionized. Further back in cosmic time still, even hadrons such as protons and neutrons are thought to have been dissociated into their constituent quarks.
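The scaling argument in this paragraph can be written out in one line. Here $u$ is the radiation energy density and $a$ is our own label for the scale factor (the text does not assign it a symbol):

```latex
u \propto a^{-4}, \qquad u = \frac{4\sigma}{c}\,T^{4}
\quad\Longrightarrow\quad T^{4} \propto a^{-4}
\quad\Longrightarrow\quad T \propto \frac{1}{a}.
```

As a purely illustrative use of this result, when the scale factor was a thousand times smaller than it is today, the radiation temperature would have been about $1000 \times 2.7\ \text{K} \approx 2700$ K.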

23.8 The Einstein A and B coefficients

If a gas of atoms is subjected to thermal radiation, the atoms can respond by making transitions between different energy levels. We can think about this effect in terms of absorption and emission of photons by the atom. The atoms are sitting in a bath of photons which we call the radiation field, and it has an energy density $u_\omega$ given by eqn 23.42. In this section, we will consider the effect of this radiation field on the transitions between atomic energy levels by modelling the atom as a simple two-level system.

Fig. 23.7 The experimentally determined spectrum of the cosmic microwave background (data courtesy NASA).

Consider the two-level system shown in Fig. 23.8 which comprises two energy levels, a lower level 1 and an upper level 2, separated by an energy $\hbar\omega$. In the absence of the radiation field, atoms in the upper level can decay to the lower level by the process of spontaneous emission of a photon [Fig. 23.8(a)]. The number of atoms in the upper level, $N_2$, is given by solving a simple differential equation

$$ \frac{dN_2}{dt} = -A_{21}N_2, \tag{23.56} $$

where $A_{21}$ is a constant. This expresses simply that the decay rate depends on the number of atoms in the upper level. The solution of this equation is

$$ N_2(t) = N_2(0)\,e^{-t/\tau}, \tag{23.57} $$

where $\tau \equiv 1/A_{21}$ is the natural radiative lifetime of the upper level. In the presence of a radiation field of energy density $u_\omega$, two further processes are possible:

• An atom in level 1 can absorb a photon of energy $\hbar\omega$ and will end up in level 2 [Fig. 23.8(b)]. This process is called absorption, and will occur at a rate which is proportional both to $u_\omega$ and to the number of atoms in level 1. Thus the rate can be written as $N_1B_{12}u_\omega$, where $B_{12}$ is a constant.

Fig. 23.8 Transitions for a two-level system: (a) spontaneous emission of a photon; (b) absorption of a photon; (c) stimulated emission of a photon.


• Quantum mechanics allows the reverse process to occur. Thus an atom in level 2 can emit a photon of energy $\hbar\omega$ as a direct result of the radiation field, and the atom will end up in level 1 [Fig. 23.8(c)]. In terms of individual photons, this process involves two photons: the presence of a first photon in the radiation field (which is absorbed and then re-emitted) stimulates the emission by the atom of an additional photon. This process is called stimulated emission, and will occur at a rate which is proportional both to $u_\omega$ and to the number of atoms in level 2. Thus the rate can be written as $N_2B_{21}u_\omega$, where $B_{21}$ is a constant.

The constants $A_{21}$, $B_{12}$ and $B_{21}$ are called the Einstein A and B coefficients. To summarize, our three processes are:
(1) spontaneous emission (one photon emitted);
(2) absorption (one photon absorbed);
(3) stimulated emission (one photon absorbed, two photons emitted).

In the steady state, with all three processes occurring simultaneously, we must have

$$ N_2B_{21}u_\omega + N_2A_{21} = N_1B_{12}u_\omega. \tag{23.58} $$

This can be rearranged to give

$$ u_\omega = \frac{A_{21}/B_{21}}{(N_1B_{12}/N_2B_{21}) - 1}. \tag{23.59} $$

If the system is in thermal equilibrium, then the relative populations of the two levels must be given by a Boltzmann factor, i.e.

$$ \frac{N_2}{N_1} = \frac{g_2}{g_1}\,e^{-\beta\hbar\omega}, \tag{23.60} $$

where $g_1$ and $g_2$ are the degeneracies of levels 1 and 2 respectively. Substitution of eqn 23.60 into eqn 23.59 yields

$$ u_\omega = \frac{A_{21}/B_{21}}{(g_1B_{12}/g_2B_{21})\,e^{\beta\hbar\omega} - 1}, \tag{23.61} $$

and comparison with eqn 23.42 yields the following relations between the Einstein A and B coefficients:

$$ \frac{B_{21}}{B_{12}} = \frac{g_1}{g_2} \qquad\text{and}\qquad A_{21} = \frac{\hbar\omega^3}{\pi^2c^3}\,B_{21}. \tag{23.62} $$
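The relations in eqn 23.62 can be checked numerically against the steady-state condition of eqn 23.58. The sketch below is illustrative only: the function names, the choice $B_{21} = 1$ in arbitrary units, the degeneracies and the transition wavelength and temperature are all assumptions made for the example.

```python
import math

HBAR = 1.054571817e-34  # reduced Planck constant, J s
C = 2.99792458e8        # speed of light, m/s
K_B = 1.380649e-23      # Boltzmann constant, J/K


def u_omega(omega, T):
    """Black body energy density per unit angular frequency (eqn 23.42)."""
    x = HBAR * omega / (K_B * T)
    return HBAR * omega**3 / (math.pi**2 * C**3) / math.expm1(x)


def equilibrium_rates(omega, T, g1=1, g2=1, B21=1.0, N1=1.0):
    """Return (total emission rate, absorption rate) in thermal equilibrium."""
    x = HBAR * omega / (K_B * T)
    B12 = (g2 / g1) * B21                              # eqn 23.62, first relation
    A21 = HBAR * omega**3 / (math.pi**2 * C**3) * B21  # eqn 23.62, second relation
    N2 = N1 * (g2 / g1) * math.exp(-x)                 # Boltzmann populations
    u = u_omega(omega, T)
    emission = N2 * B21 * u + N2 * A21                 # stimulated + spontaneous
    absorption = N1 * B12 * u
    return emission, absorption


# Assumed example: a 500 nm transition in radiation at 6000 K
T = 6000.0
omega = 2 * math.pi * C / 500e-9
emission, absorption = equilibrium_rates(omega, T, g1=2, g2=4)

# Spontaneous-to-stimulated ratio A21 / (B21 u_omega) equals exp(x) - 1
spontaneous_over_stimulated = math.expm1(HBAR * omega / (K_B * T))
print(emission, absorption, spontaneous_over_stimulated)
```

The two rates agree to rounding error, and the last quantity shows that spontaneous emission dominates over stimulated emission whenever $\hbar\omega$ exceeds $k_{\rm B}T \ln 2$.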

Example 23.4
When will a system of atoms in a radiation field exhibit gain, i.e. produce more photons than they absorb?
Solution: The atoms will produce more photons than they absorb if the rate of stimulated emission is greater than the absorption rate, and this will occur if

$$ N_2B_{21}u_\omega > N_1B_{12}u_\omega, \tag{23.63} $$

which implies that

$$ \frac{N_2}{g_2} > \frac{N_1}{g_1}. \tag{23.64} $$

This means that we need to have a population inversion, so that the number of atoms ('the population') in the upper state (per degenerate level) exceeds that in the lower state. This is the principle behind the operation of the laser (a word that stands for light amplification by stimulated emission of radiation). However, in our two-level system such a population inversion is not possible in thermal equilibrium. For laser operation, it is necessary to have further energy levels to provide additional transitions: these can provide a mechanism to ensure that level 2 is pumped (fed by transitions from another level, keeping its population high) and that level 1 can drain away (into another lower level, so that level 1 has a low population).
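The statement that thermal equilibrium rules out a population inversion follows directly from the Boltzmann populations. As a short worked step, dividing the equilibrium populations by their degeneracies gives

```latex
\frac{N_2/g_2}{N_1/g_1} = e^{-\beta\hbar\omega} < 1
\qquad \text{for any } T > 0 \text{ and } \hbar\omega > 0,
```

so the condition in eqn 23.64 can never be met in equilibrium, however high the temperature; the two sides only become equal in the limit $T \to \infty$.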

Chapter summary

• The power emitted per unit area of a black body surface at temperature $T$ is given by $\sigma T^4$, where

$$ \sigma = \frac{\pi^2 k_{\rm B}^4}{60\,c^2\hbar^3} = 5.67 \times 10^{-8}\ \text{W m}^{-2}\,\text{K}^{-4}. $$

• Radiation pressure $p$ due to black body photons is equal to $u/3$ where $u$ is the energy density. Radiation pressure due to a collimated beam of light is equal to $u$.

• The spectral energy density $u_\omega$ takes the form of a black body distribution. This form fits well to the experimentally measured form of the cosmic microwave background. It is also important in the theory of lasers.
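As a numerical cross-check of the summary, the sketch below evaluates the formula for $\sigma$ and integrates the black body distribution $u_\nu$ (eqn 23.43) over frequency to compare with $4\sigma T^4/c$ (eqn 23.49). The function names, the cut-off `x_max` and the number of Simpson intervals are our own choices, not part of the text.

```python
import math

H = 6.62607015e-34          # Planck constant, J s
HBAR = H / (2 * math.pi)    # reduced Planck constant, J s
C = 2.99792458e8            # speed of light, m/s
K_B = 1.380649e-23          # Boltzmann constant, J/K

# Stefan-Boltzmann constant from the chapter-summary formula
SIGMA = math.pi**2 * K_B**4 / (60 * C**2 * HBAR**3)


def u_nu(nu, T):
    """Spectral energy density per unit frequency (eqn 23.43), J m^-3 Hz^-1."""
    if nu == 0.0:
        return 0.0  # u_nu vanishes like nu^2 at low frequency
    return 8 * math.pi * H * nu**3 / C**3 / math.expm1(H * nu / (K_B * T))


def total_energy_density(T, x_max=60.0, n=6000):
    """Integrate u_nu from 0 to x_max*k_B*T/h with composite Simpson's rule.

    n must be even; the neglected tail beyond x_max is exponentially small.
    """
    nu_max = x_max * K_B * T / H
    step = nu_max / n
    total = u_nu(0.0, T) + u_nu(nu_max, T)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * u_nu(i * step, T)
    return total * step / 3


T = 300.0
u_numeric = total_energy_density(T)
u_exact = 4 * SIGMA * T**4 / C
print(f"sigma = {SIGMA:.4e} W m^-2 K^-4")
print(f"u (numeric) = {u_numeric:.6e} J m^-3, 4 sigma T^4 / c = {u_exact:.6e} J m^-3")
```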

Further reading • A discussion of lasers may be found in Foot (2004), chapters 1 and 7. • More information concerning the cosmic microwave background is in Liddle (2003) chapter 10 and Carroll and Ostlie (1996) chapter 27.


Exercises

(23.1) The temperature of the Earth's surface is maintained by radiation from the Sun. By making the approximation that the Sun is a black body, but now assuming that the Earth is a grey body with albedo $A$ (this means that it reflects a fraction $A$ of the incident energy), show that the Earth's temperature is related to that of the Sun by

$$ T_{\rm Earth} = T_{\rm Sun}\,(1 - A)^{1/4}\sqrt{\frac{R_{\rm Sun}}{2D}}, \tag{23.65} $$

where $R_{\rm Sun}$ is the radius of the Sun and the Earth–Sun separation is $D$.

(23.2) Show that the maxima in the functions $u_\nu$ and $u_\lambda$ can be computed by maximising the function $x^\alpha/(e^x - 1)$ for $\alpha = 3$ and $\alpha = 5$ respectively. Show that this implies that

$$ x = \alpha(1 - e^{-x}). \tag{23.66} $$

This equation can be solved by iterating

$$ x_n = \alpha(1 - e^{-x_{n-1}}); \tag{23.67} $$

now show that (using an initial guess of $x_1 = 1$) this leads to the values given in eqns 23.53 and 23.54.

(23.3) The cosmic microwave background (CMB) radiation has a temperature of 2.73 K.
(a) What is the photon energy density in the Universe?
(b) Estimate the number of CMB photons which fall on the outstretched palm of your hand every second.
(c) What is the average energy due to CMB radiation which lands on your outstretched palm every second?
(d) What radiation pressure do you feel from CMB radiation?

(23.4) What is the ratio of the number of photons from the Sun to the number of CMB photons which irradiate your outstretched hand every second (during the daytime!)?

(23.5) Thermal radiation can be treated thermodynamically as a gas of photons with internal energy $U = u(T)V$ and pressure $p = u(T)/3$, where $u(T)$ is the energy density. Show that:
(a) the entropy density $s$ is given by $s = 4p/T$;
(b) the Gibbs function $G = 0$;
(c) the heat capacity at constant volume $C_V = 3s$ per unit volume;
(d) the heat capacity at constant pressure, $C_p$, is infinite. (What on earth does that mean?)

(23.6) Ignoring the zero-point energy, show that the partition function $Z$ for a gas of photons in volume $V$ is given by

$$ \ln Z = -\frac{V}{\pi^2c^3}\int_0^\infty \omega^2 \ln(1 - e^{-\hbar\omega\beta})\,d\omega, \tag{23.68} $$

and hence, by integrating by parts, that

$$ \ln Z = \frac{V\pi^2(k_{\rm B}T)^3}{45\hbar^3c^3}. \tag{23.69} $$

Hence show that

$$ F = -\frac{4\sigma VT^4}{3c}, \tag{23.70} $$

$$ S = \frac{16\sigma VT^3}{3c}, \tag{23.71} $$

$$ U = \frac{4\sigma VT^4}{c}, \tag{23.72} $$

$$ p = \frac{4\sigma T^4}{3c}, \tag{23.73} $$

and hence that $U = -3F$, $pV = U/3$ and $S = 4U/3T$.

(23.7) Show that the total number $N$ of photons in black body radiation contained in a volume $V$ is

$$ N = \int_0^\infty \frac{g(\omega)\,d\omega}{e^{\hbar\omega/k_{\rm B}T} - 1} = \frac{2\zeta(3)}{\pi^2}\left(\frac{k_{\rm B}T}{\hbar c}\right)^3 V, \tag{23.74} $$

where $\zeta(3) = 1.20206$ is a Riemann-zeta function (see Appendix C.4). Hence show that the average energy per photon is

$$ \frac{U}{N} = \frac{\pi^4}{30\zeta(3)}\,k_{\rm B}T = 2.701\,k_{\rm B}T, \tag{23.75} $$

and that the average entropy per photon is

$$ \frac{S}{N} = \frac{2\pi^4}{45\zeta(3)}\,k_{\rm B} = 3.602\,k_{\rm B}. \tag{23.76} $$

The result for the internal energy of a photon gas is therefore $U = 2.701\,Nk_{\rm B}T$, whereas for a classical ideal gas one obtains $U = \frac{3}{2}Nk_{\rm B}T$. Why should the two results be different? Compare the expression for the entropy of a photon gas with that for an ideal gas (the Sackur-Tetrode equation); what is the physical reason for the difference?

24 Phonons

In a solid, energy can be stored in vibrations of the atoms which are arranged in a lattice.$^{1}$ In the same way that photons are quantized electromagnetic waves that describe the elementary excitations of the electromagnetic field, phonons are the quantized lattice waves that describe the elementary excitations of vibrations of the lattice. Rather than treating the vibration of each individual atom, our focus is on the normal modes of the system which oscillate independently of each other. Each normal mode can be treated as a simple harmonic oscillator, and thus can contain an integer number of energy quanta. These energy quanta can be considered discrete 'particles', known as phonons. The thermodynamic properties of a solid can therefore be calculated in much the same way as was done for photons in the previous chapter – by evaluating the statistical mechanics of a set of simple harmonic oscillators. The problem here is more complex because of the dispersive nature of lattice waves, but two models (the Einstein model and the Debye model) are commonly used to describe solids and we evaluate each in turn in the following two sections.

24.1 The Einstein model
24.2 The Debye model
24.3 Phonon dispersion
Chapter summary
Further reading
Exercises

1 We assume a crystalline solid, though analogous results can be derived for non-crystalline solids. A lattice is a three-dimensional array of regularly spaced points, each point coinciding with the mean position of the atoms in the crystal.

24.1 The Einstein model

The Einstein model treats the problem by making the assumption that all vibrational modes of the solid have the same frequency $\omega_{\rm E}$. There are $3N$ such modes$^{2}$ (each atom of the solid has three vibrational degrees of freedom). We will assume that these normal modes are independent and do not interact with each other. In this case, the partition function $Z$ can be written as the product

$$ Z = \prod_{k=1}^{3N} Z_k, \tag{24.1} $$

where $Z_k$ is the partition function of a single mode. Hence, the logarithm of the partition function is a simple sum over all the modes of the system:

$$ \ln Z = \sum_{k=1}^{3N} \ln Z_k. \tag{24.2} $$

Each mode can be modelled as a simple harmonic oscillator, so we can use the expression in eqn 20.3 to write down the partition function of a single mode as

$$ Z_k = \sum_{n=0}^{\infty} e^{-(n+\frac{1}{2})\hbar\omega_{\rm E}\beta} = \frac{e^{-\frac{1}{2}\hbar\omega_{\rm E}\beta}}{1 - e^{-\hbar\omega_{\rm E}\beta}}. \tag{24.3} $$

2 Strictly speaking, a solid has $3N - 6$ vibrational modes, since although each atom can move in one of three directions (hence $3N$ degrees of freedom) one has to subtract 6 modes which correspond to translation and rotation of the solid as a whole. When $N$ is large, as it will be for any macroscopic sample, a correction of 6 modes is irrelevant.

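This expression is independent of $k$ because all the modes are identical, and so the partition function is $Z = (Z_k)^{3N}$ and hence

$$ \ln Z = 3N\left[-\tfrac{1}{2}\hbar\omega_{\rm E}\beta - \ln(1 - e^{-\hbar\omega_{\rm E}\beta})\right], \tag{24.4} $$

and so the internal energy $U$ is

$$ U = -\frac{\partial \ln Z}{\partial\beta} = \frac{3N}{2}\hbar\omega_{\rm E} + \frac{3N\hbar\omega_{\rm E}\,e^{-\hbar\omega_{\rm E}\beta}}{1 - e^{-\hbar\omega_{\rm E}\beta}} = \frac{3N}{2}\hbar\omega_{\rm E} + \frac{3N\hbar\omega_{\rm E}}{e^{\hbar\omega_{\rm E}\beta} - 1}. \tag{24.5} $$

In fact, we could have got immediately to eqn 24.5 simply by multiplying $3N$ by the expression in eqn 20.29, but we have taken a longer route to reiterate the basic principles. Writing $\hbar\omega_{\rm E} = k_{\rm B}\Theta_{\rm E}$ defines a temperature $\Theta_{\rm E}$ which scales with the vibrational frequency in the Einstein model. This allows us to rewrite eqn 24.5 as

$$ U = 3R\Theta_{\rm E}\left[\frac{1}{2} + \frac{1}{e^{\Theta_{\rm E}/T} - 1}\right], \tag{24.6} $$

where $U$ is now per mole of solid (recall that $N_{\rm A}k_{\rm B} = R$). In the high-temperature limit, $U \to 3RT$ because

$$ \frac{1}{e^{\Theta_{\rm E}/T} - 1} \to \frac{T}{\Theta_{\rm E}} \quad\text{as } T \to \infty. \tag{24.7} $$

Example 24.1
Derive the molar heat capacity of an Einstein solid as a function of temperature, and show how it behaves in the low- and high-temperature limits.
Solution: Using the expression for the molar internal energy in eqn 24.6, one can use $C = (\partial U/\partial T)$ to show that$^{3}$

$$ C = 3R\Theta_{\rm E}\,\frac{-1}{(e^{\Theta_{\rm E}/T} - 1)^2}\,e^{\Theta_{\rm E}/T}\left(-\frac{\Theta_{\rm E}}{T^2}\right) = 3R\,\frac{x^2e^x}{(e^x - 1)^2}, \tag{24.8} $$

where $x = \Theta_{\rm E}/T$.
• As $T \to 0$, $x \to \infty$ and $C \to 3Rx^2e^{-x}$.
• As $T \to \infty$, $x \to 0$ and $C \to 3R$.

3 For a solid, $C_V \approx C_p$, and so the subscript will be omitted.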

The high-temperature result is known as the Dulong–Petit rule.$^{4}$ In summary, the molar heat capacity of an Einstein solid falls off very fast at low temperature (because it will be dominated by the $e^{-\Theta_{\rm E}/T}$ term), but saturates to a value of $3R$ at high temperature.

4 The Dulong–Petit rule is named after P.L. Dulong and A.T. Petit who measured it in 1819. It agrees with our expectations based on the equipartition theorem, see eqn 19.25.
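The two limits of eqn 24.8 are easy to confirm numerically. The following sketch assumes an illustrative Einstein temperature of 200 K; the function name and the chosen temperatures are ours.

```python
import math

R = 8.314462618  # molar gas constant, J K^-1 mol^-1


def einstein_heat_capacity(T, theta_E):
    """Molar heat capacity of an Einstein solid (eqn 24.8), with x = theta_E / T.

    x^2 e^x / (e^x - 1)^2 is rewritten as x^2 e^-x / (1 - e^-x)^2 so that
    the exponential cannot overflow at low temperature.
    """
    x = theta_E / T
    return 3 * R * x**2 * math.exp(-x) / math.expm1(-x) ** 2


theta_E = 200.0  # assumed Einstein temperature in kelvin, for illustration only

c_high = einstein_heat_capacity(100 * theta_E, theta_E)  # T >> theta_E
c_low = einstein_heat_capacity(theta_E / 20, theta_E)    # T << theta_E, x = 20
c_low_limit = 3 * R * 20**2 * math.exp(-20)              # 3 R x^2 e^-x

print(f"C(T >> theta_E) / 3R = {c_high / (3 * R):.6f}")
print(f"C(T << theta_E) / (3R x^2 e^-x) = {c_low / c_low_limit:.6f}")
```

Both printed ratios are very close to one, matching the Dulong–Petit value at high temperature and the exponentially small form at low temperature.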

24.2 The Debye model

The Einstein model makes a rather gross assumption that the normal modes of a solid all have the same frequency. It is clearly better to assume a distribution of frequencies. Hence, we would like to choose a function $g(\omega)$ which is the density of vibrational states. The number of vibrational states with frequencies between $\omega$ and $\omega + d\omega$ should be given by $g(\omega)\,d\omega$ and we require that the total number of normal modes be given by

$$ \int g(\omega)\,d\omega = 3N. \tag{24.9} $$

The Einstein model took the density of states to be simply a delta function, i.e.

$$ g_{\rm Einstein}(\omega) = 3N\,\delta(\omega - \omega_{\rm E}), \tag{24.10} $$

as shown in Fig. 24.1, but we would now like to do better. The next simplest approximation is to assume that lattice vibrations correspond to waves, all with the same speed $v_{\rm s}$, which is the speed of sound in the solid. Thus we assume that

$$ \omega = v_{\rm s}q, \tag{24.11} $$

where $q$ is the wave vector of the lattice vibration.$^{5}$ The density of states of lattice vibrations in three dimensions as a function of $q$ is given by

$$ g(q)\,dq = \frac{4\pi q^2\,dq}{(2\pi/L)^3} \times 3, \tag{24.12} $$

where the solid is assumed to be a cube of volume $V = L^3$ and the factor 3 corresponds to the three possible 'polarizations' of the lattice vibration (one longitudinal and two transverse polarizations are possible for each value of $q$). Thus

$$ g(q)\,dq = \frac{3Vq^2\,dq}{2\pi^2}, \tag{24.13} $$

and hence

$$ g(\omega)\,d\omega = \frac{3V\omega^2\,d\omega}{2\pi^2v_{\rm s}^3}. \tag{24.14} $$

Because there is a limit ($3N$) on the total number of modes, we will now assume that lattice vibrations are possible up to a maximum frequency $\omega_{\rm D}$ known as the Debye frequency. This is defined by

$$ \int_0^{\omega_{\rm D}} g(\omega)\,d\omega = 3N, \tag{24.15} $$

which, using eqn 24.14, implies that

$$ \omega_{\rm D} = \left(\frac{6N\pi^2v_{\rm s}^3}{V}\right)^{1/3}. \tag{24.16} $$

This allows us to rewrite eqn 24.14 as

$$ g(\omega)\,d\omega = \frac{9N\omega^2\,d\omega}{\omega_{\rm D}^3}. \tag{24.17} $$

The density of states for the Debye model is shown in Fig. 24.2. We also define the Debye temperature$^{6}$ $\Theta_{\rm D}$ by

$$ \Theta_{\rm D} = \frac{\hbar\omega_{\rm D}}{k_{\rm B}}, \tag{24.18} $$

which gives the temperature scale corresponding to the Debye frequency. We are now ready to roll up our sleeves and tackle the statistical mechanics of this model.

Fig. 24.1 The density of states for the Einstein model, using eqn 24.10.

Fig. 24.2 The density of states for the Debye model, using eqn 24.14.

5 P. Debye (1884–1966) introduced this model in 1912, but assumed that a solid was a continuous elastic medium with a linear dispersion relation. We will improve on this dispersion relation in Section 24.3.

6 Some example Debye temperatures are shown in the following table:

material       Θ_D (K)
Ne             63
Na             150
NaCl           321
Al             394
Si             625
C (diamond)    1860

The Debye temperature is higher for harder materials, since the bonds are stiffer and the phonon frequencies correspondingly higher.
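Eqns 24.16 and 24.18 translate directly into a small calculation. In the sketch below, the number density and sound speed are illustrative assumptions (they are not data from the text), and the function names are our own; the final line checks that the Debye cut-off really does account for $3N$ modes.

```python
import math

HBAR = 1.054571817e-34  # reduced Planck constant, J s
K_B = 1.380649e-23      # Boltzmann constant, J/K


def debye_frequency(n, v_s):
    """Debye angular frequency (eqn 24.16), with n = N/V the atomic number density."""
    return (6 * math.pi**2 * n * v_s**3) ** (1 / 3)


def debye_temperature(n, v_s):
    """Debye temperature (eqn 24.18)."""
    return HBAR * debye_frequency(n, v_s) / K_B


# Assumed, order-of-magnitude inputs for a generic solid
n = 5.0e28     # atoms per cubic metre
v_s = 3000.0   # speed of sound, m/s

omega_D = debye_frequency(n, v_s)
theta_D = debye_temperature(n, v_s)

# Integrating g(omega) = 3 V omega^2 / (2 pi^2 v_s^3) from 0 to omega_D
# must give 3N modes, i.e. 3n modes per unit volume (eqn 24.15).
modes_per_volume = omega_D**3 / (2 * math.pi**2 * v_s**3)

print(f"omega_D = {omega_D:.3e} rad/s, theta_D = {theta_D:.0f} K")
print(f"modes per unit volume / 3n = {modes_per_volume / (3 * n):.6f}")
```

With these assumed inputs the Debye temperature comes out at a few hundred kelvin, the same order as the tabulated values for common solids.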

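Example 24.2
Derive the molar heat capacity of a Debye solid as a function of temperature.
Solution: To obtain $C = (\partial U/\partial T)$, we first need to obtain $U$, which we can do by one of two methods.

Method 1 (Starting from the partition function.)
We begin by writing down the logarithm of the partition function as follows:

$$ \ln Z = \int_0^{\omega_{\rm D}} d\omega\, g(\omega)\,\ln\left[\frac{e^{-\frac{1}{2}\hbar\omega\beta}}{1 - e^{-\hbar\omega\beta}}\right]. \tag{24.19} $$

This integral looks a bit daunting, but the logarithm of the quotient splits into two terms:

$$ \ln Z = -\int_0^{\omega_{\rm D}} \tfrac{1}{2}\hbar\omega\beta\, g(\omega)\,d\omega - \int_0^{\omega_{\rm D}} g(\omega)\ln(1 - e^{-\hbar\omega\beta})\,d\omega. \tag{24.20} $$

Using eqn 24.17, the first term of eqn 24.20 is easily evaluated to be $-\frac{9}{8}N\hbar\omega_{\rm D}\beta$, while the second term we will leave unevaluated for the moment. Thus we have

$$ \ln Z = -\frac{9}{8}N\hbar\omega_{\rm D}\beta - \frac{9N}{\omega_{\rm D}^3}\int_0^{\omega_{\rm D}} \omega^2\ln(1 - e^{-\hbar\omega\beta})\,d\omega. \tag{24.21} $$

Now we can use $U = -\partial\ln Z/\partial\beta$, and hence we find that

$$ U = \frac{9}{8}N\hbar\omega_{\rm D} + \frac{9N\hbar}{\omega_{\rm D}^3}\int_0^{\omega_{\rm D}} \frac{\omega^3\,d\omega}{e^{\hbar\omega\beta} - 1}. \tag{24.22} $$

Method 2 (Using the expression for U of a simple harmonic oscillator.)
We can derive the internal energy $U$ by using the expression for $U$ for a single simple harmonic oscillator in eqn 20.29 to give

$$ U = \int_0^{\omega_{\rm D}} g(\omega)\,d\omega\;\hbar\omega\left[\frac{1}{2} + \frac{1}{e^{\beta\hbar\omega} - 1}\right], \tag{24.23} $$

which results in eqn 24.22 after substituting in eqn 24.17 and integrating.

Obtaining C
The heat capacity can be derived from $C = (\partial U/\partial T)$ and hence, using eqn 24.22, we have that

$$ C = \frac{9N\hbar}{\omega_{\rm D}^3}\int_0^{\omega_{\rm D}} \frac{-\omega^3\,d\omega}{(e^{\hbar\omega\beta} - 1)^2}\,e^{\hbar\omega\beta}\left(-\frac{\hbar\omega}{k_{\rm B}T^2}\right). \tag{24.24} $$

Making the substitution $x = \beta\hbar\omega$, and hence $x_{\rm D} = \beta\hbar\omega_{\rm D}$, eqn 24.24 can be rewritten (for one mole of solid) as

$$ C = \frac{9R}{x_{\rm D}^3}\int_0^{x_{\rm D}} \frac{x^4e^x\,dx}{(e^x - 1)^2}. \tag{24.25} $$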

The expression in eqn 24.25 is quite complicated and it is not obvious, just by looking at the equation, what the temperature dependence of the heat capacity will be. This is because $x_{\rm D} = \hbar\omega_{\rm D}\beta$ is temperature dependent and hence both the prefactor $9/x_{\rm D}^3$ and the integral are temperature dependent. The full temperature dependence is plotted in Fig. 24.3, but the following example shows how to obtain the high-temperature and low-temperature limiting behaviours analytically.

Fig. 24.3 The molar specific heat capacity for the Einstein solid and the Debye solid, according to eqn 24.8 and eqn 24.25 respectively. The inset shows the same information on a log–log scale, illustrating the difference between the low temperature specific heat capacities of the two models. The Debye model predicts a cubic temperature dependence at low temperature according to eqn 24.28, as shown by the dotted line. The figure is drawn with $\Theta_{\rm E} = \Theta_{\rm D}$.

Example 24.3
Show how the molar heat capacity of a Debye solid, derived in eqn 24.25, behaves in the low- and high-temperature limits.
Solution:
• At high temperature, $x \to 0$ and hence $e^x - 1 \to x$. Hence, the heat capacity $C$ behaves as

$$ C \to \frac{9R}{x_{\rm D}^3}\int_0^{x_{\rm D}} \frac{x^4}{x^2}\,dx = 3R, \tag{24.26} $$

which is the equipartition result (Dulong–Petit rule, eqn 19.25) again.
• At low temperature, $x_{\rm D}$ becomes very large, so the upper limit of the integral can be extended to infinity (the integrand is negligible where $e^x \gg 1$). The heat capacity is given by

$$ C \to \frac{9R}{x_{\rm D}^3}\int_0^{\infty} \frac{x^4e^x\,dx}{(e^x - 1)^2} = \frac{12R\pi^4}{5x_{\rm D}^3}. \tag{24.27} $$

Here we can use the integral $\int_0^\infty x^4e^x\,dx/(e^x - 1)^2 = 4\pi^4/15$, which is derived in the appendix, see eqn C.31. Thus an expression for the low temperature heat capacity of a solid is

$$ C = 3R \times \frac{4\pi^4}{5}\left(\frac{T}{\Theta_{\rm D}}\right)^3. \tag{24.28} $$

This demonstrates that the molar heat capacity of a Debye solid saturates to a value of $3R$ at high temperature and is proportional to $T^3$ at low temperature.
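Between these two limits eqn 24.25 has to be evaluated numerically. The sketch below does this with composite Simpson's rule and compares the result with the limits in eqns 24.26 and 24.28; the Debye temperature of 300 K, the function names and the number of intervals are assumptions made for illustration.

```python
import math

R = 8.314462618  # molar gas constant, J K^-1 mol^-1


def debye_integrand(x):
    """x^4 e^x / (e^x - 1)^2, written with exp(-x) to avoid overflow at large x."""
    if x == 0.0:
        return 0.0  # the integrand behaves like x^2 near zero
    return x**4 * math.exp(-x) / math.expm1(-x) ** 2


def debye_heat_capacity(T, theta_D, n=2000):
    """Molar heat capacity of a Debye solid (eqn 24.25); n must be even."""
    x_D = theta_D / T
    step = x_D / n
    total = debye_integrand(0.0) + debye_integrand(x_D)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * debye_integrand(i * step)
    integral = total * step / 3
    return 9 * R * integral / x_D**3


theta_D = 300.0  # assumed Debye temperature in kelvin, for illustration only

c_high = debye_heat_capacity(100 * theta_D, theta_D)      # T >> theta_D
c_low = debye_heat_capacity(theta_D / 50, theta_D)        # T << theta_D
c_low_t3 = 3 * R * (4 * math.pi**4 / 5) * (1 / 50) ** 3   # eqn 24.28

print(f"C(T >> theta_D) / 3R = {c_high / (3 * R):.6f}")
print(f"C(T << theta_D) / T^3 law = {c_low / c_low_t3:.6f}")
```

Calling `debye_heat_capacity` over a range of temperatures would reproduce the kind of curve described for Fig. 24.3.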


Fig. 24.4 A monatomic linear chain.

24.3 Phonon dispersion

We have so far assumed that the phonon dispersion relation is given by eqn 24.11. In this section, we will improve on this substantially. Let us first consider the vibrations on a monatomic linear chain of atoms, each atom with mass $m$ connected to its nearest neighbour by a spring with force constant $K$ (see Fig. 24.4). The displacement from the equilibrium position of the $n$th mass is given the symbol $u_n$. Hence, the equation of motion of the $n$th mass is given by

$$ m\ddot{u}_n = K(u_{n+1} - u_n) - K(u_n - u_{n-1}) = K(u_{n+1} - 2u_n + u_{n-1}). \tag{24.29} $$

In order to solve this equation, we look for a wave-like solution. A trial normal mode solution $u_n = \exp[i(qna - \omega t)]$ yields

$$ -m\omega^2 = K(e^{iqa} - 2 + e^{-iqa}), \tag{24.30} $$

and hence

$$ m\omega^2 = 2K(1 - \cos qa), \tag{24.31} $$

which simplifies to

$$ \omega^2 = \frac{4K}{m}\sin^2(qa/2), \tag{24.32} $$

and hence

$$ \omega = \left(\frac{4K}{m}\right)^{1/2}|\sin(qa/2)|. \tag{24.33} $$

This result is plotted in Fig. 24.5.

Fig. 24.5 The dispersion relation for a monatomic linear chain, given in eqn 24.33.

In the long-wavelength limit when $qa \to 0$, we have that $\omega \to v_{\rm s}q$, where

$$ v_{\rm s} = a\left(\frac{K}{m}\right)^{1/2}, \tag{24.34} $$

and hence eqn 24.11 is obtained in this limit.
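The long-wavelength limit can be confirmed numerically from eqns 24.33 and 24.34. In this sketch the force constant, atomic mass and lattice spacing are illustrative assumptions, and the function names are our own.

```python
import math


def chain_frequency(q, K, m, a):
    """Angular frequency of a monatomic-chain mode with wave vector q (eqn 24.33)."""
    return math.sqrt(4 * K / m) * abs(math.sin(q * a / 2))


def sound_speed(K, m, a):
    """Long-wavelength sound speed of the chain (eqn 24.34)."""
    return a * math.sqrt(K / m)


# Assumed, order-of-magnitude parameters
K = 10.0       # force constant, N/m
m = 1.0e-25    # atomic mass, kg
a = 3.0e-10    # lattice spacing, m

v_s = sound_speed(K, m, a)

# Well inside the zone (q << pi/a) the dispersion is linear: omega ~ v_s q
q_small = 1e-3 * math.pi / a
ratio_long_wavelength = chain_frequency(q_small, K, m, a) / (v_s * q_small)

# At the zone boundary q = pi/a the frequency reaches its maximum 2 sqrt(K/m)
omega_zone_boundary = chain_frequency(math.pi / a, K, m, a)

print(f"v_s = {v_s:.0f} m/s, omega/(v_s q) at small q = {ratio_long_wavelength:.8f}")
print(f"omega at zone boundary = {omega_zone_boundary:.3e} rad/s")
```

The ratio at small $q$ differs from one only by a term of order $(qa)^2/24$, which is the leading dispersive correction to the linear relation.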

Fig. 24.6 The phonon dispersion in copper (Cu). Because Cu is a three-dimensional metal, the phonon dispersion has to be evaluated in three dimensions, and it is here shown as a function of wave vector in different directions. Along the (101) direction, bands can be seen which look somewhat like the simple monatomic chain. Both longitudinal (L) and transverse (T) modes are present. The wave vector $q$ is plotted in units of $\pi/a$ where $a$ is the lattice spacing. The data shown are from Svensson et al., Phys. Rev. B 155, 619 (1967), and are obtained using inelastic neutron scattering. In this technique, a beam of slow neutrons is scattered from the sample and the changes in both the energy and momentum of the neutrons are measured. This can be used to infer the energy $\hbar\omega$ and momentum $\hbar q$ of the phonons. Copyright (1967) by the American Physical Society.

The measured phonon dispersion for copper (Cu), which is a monatomic metal with a face-centred cubic structure, is shown in Fig. 24.6 and demonstrates that for small wave vectors (long wavelengths) the angular frequency $\omega$ is indeed proportional to the wave vector $q$ and hence $\omega = v_{\rm s}q$ is a good approximation in this limit. However, there are both longitudinal and transverse modes present and so in the Debye model one would need to use a suitably modified sound speed.$^{7}$ Where the bands flatten over, peaks can be seen in the phonon density of states because states are uniformly distributed in wave vector and so will be concentrated at energies corresponding to regions in the dispersion relation which are horizontal. This is illustrated in Fig. 24.7, and you can compare each peak in this graph with a flattening of the band in some part of the dispersion relation shown in Fig. 24.6. The phonon density of states clearly follows a quadratic dependence at low frequency, corresponding to the non-dispersive parts of the dispersion relation.

If the solid contains more than one crystallographically distinct atom per unit cell, the situation is a little more complicated. To gain insight into this problem, one can solve the diatomic linear chain problem (see Fig. 24.8) which is composed of an alternating series of two different atoms. The dispersion relation for this is plotted in Fig. 24.9, and shows two branches. The acoustic branch is very similar to the monatomic linear chain dispersion and near $q = 0$ corresponds to neighbouring atoms vibrating almost in phase (and the group velocity near $q = 0$ is the speed of sound in the material, hence the adjective 'acoustic'). The modes of vibration in the acoustic branch are called acoustic modes. The optic branch has non-zero $\omega$ at $q = 0$ and near $q = 0$ corresponds to vibrations in which neighbouring atoms vibrate almost out of phase. The modes of vibration in the optic branch are called optic modes. It is called an optic branch because, if the chain contains ions of different charges, an oscillation with small $q$ causes an oscillating electric dipole moment which can couple with electromagnetic radiation. An example of such a phonon dispersion in which optic modes are present is provided by germanium (Ge), shown in Fig. 24.10. Although all the atoms in Ge are identical, there are two crystallographically distinct atomic sites and hence an optic branch is observed in the phonon dispersion. These data have also been measured by inelastic neutron scattering.

7 Usually what is used in the expression for the Debye frequency is

$$ \frac{3}{v_{\rm s}^3} = \frac{2}{v_{\rm s,T}^3} + \frac{1}{v_{\rm s,L}^3}, $$

where $v_{\rm s,T}$ and $v_{\rm s,L}$ are the transverse and longitudinal sound speeds respectively. The weighting of 2:1 is because there are two orthogonal transverse modes and only one longitudinal mode.

Fig. 24.7 The density of states $g(\omega)$ for the phonons in copper. The curve is obtained by numerical analysis of the measured phonon dispersion relation. Data from Svensson et al., Phys. Rev. B 155, 619 (1967). Copyright (1967) by the American Physical Society.

Fig. 24.8 A diatomic linear chain.

Fig. 24.9 The dispersion relation for a diatomic linear chain. The lower curve is the acoustic branch, the upper curve is the optic branch.

Fig. 24.10 The measured phonon dispersion relation in germanium. B. N. Brockhouse, Rev. Mod. Phys. 74, 1131 (2002). Copyright (2002) by the American Physical Society.

Though the phonon dispersion relations of real solids are more complicated than the linear relation, $\omega = v_{\rm s}q$, assumed by the Debye model, they are linear at low frequency. With this relationship we have that the phonon density of states is approximately quadratic at low frequency. At low temperature (where only low-energy, i.e. low-frequency, phonons can be excited), the heat capacity of most solids therefore shows the Debye $T^3$ behaviour. In practice, the acoustic modes of a solid can be well described by the Debye model, while the optic modes (whose frequencies do not vary much with wave vector) are quite well described by the Einstein model.

Chapter summary

• A phonon is a quantized lattice vibration.
• The Einstein model of a solid assumes that all phonons have the same frequency.
• The Debye model allows a range of phonon frequencies up to a maximum frequency called the Debye frequency. The density of states is quadratic in frequency, and this assumes that $\omega = v_{\rm s}q$.
• The dispersion relation of a real solid is more complicated and may contain acoustic and optic branches. It can be experimentally determined using inelastic neutron scattering.
• The heat capacity of a three-dimensional solid is proportional to $T^3$ at low temperature and saturates to a value of $3R$ at high temperature.

Further reading A wonderful introduction to waves in periodic structures may be found in Brillouin (1953). Useful information about phonons may be found in Ashcroft and Mermin (1976) chapters 22–24, Dove (2003) chapters 8 and 9 and Singleton (2001) Appendix D.

Exercises

(24.1) A primitive cubic crystal has lattice parameter 0.3 nm and Debye temperature 100 K. Estimate the maximum phonon frequency in Hz and the speed of sound in m s$^{-1}$.

(24.2) Show that eqn 24.22 can be rewritten as

$$ U = \frac{9}{8}N_{\rm A}\hbar\omega_{\rm D} + \frac{9RT}{x_{\rm D}^3}\int_0^{x_{\rm D}} \frac{x^3\,dx}{e^x - 1}, \tag{24.35} $$

by making the substitution $x = \beta\hbar\omega$, and hence $x_{\rm D} = \beta\hbar\omega_{\rm D}$.

(24.3) Show that the Debye model of a $d$-dimensional crystal predicts that the low temperature heat capacity is proportional to $T^d$.

(24.4) Show that the density of states of lattice vibrations on a monatomic linear chain (see Section 24.3) is given by $g(\omega) = (2/\pi a)\,[4K/m - \omega^2]^{-1/2}$. Sketch $g(\omega)$ and comment on the singularity at $\omega^2 = 4K/m$.

(24.5) Generalize the treatment of a monatomic linear chain to the transverse vibrations of atoms on a (two-dimensional) square lattice of atoms and show that

$$ \omega = \left(\frac{2K}{m}\right)^{1/2}\left[2 - \cos q_xa - \cos q_ya\right]^{1/2}, \tag{24.36} $$

and derive an expression for the speed of sound.

(24.6) Show that the dispersion relation for the diatomic chain shown in Fig. 24.8 is

$$ \frac{\omega^2}{K} = \left(\frac{1}{M} + \frac{1}{m}\right) \pm \left[\left(\frac{1}{M} + \frac{1}{m}\right)^2 - \frac{4}{Mm}\sin^2 qa\right]^{1/2}. \tag{24.37} $$

(24.7) The treatment of the monatomic linear chain in Section 24.3 included only nearest-neighbour interactions. Show that if a force constant $K_j$ links an atom with one $j$ atoms away, then the dispersion relation becomes

$$ \omega^2 = \frac{4}{m}\sum_j K_j\sin^2\frac{jqa}{2}. \tag{24.38} $$

A measurement is made of $\omega(q)$. Show that the force constants can be obtained from $\omega(q)$ using

$$ K_j = -\frac{ma}{2\pi}\int_{-\pi/a}^{\pi/a} dq\;\omega^2(q)\cos jqa. \tag{24.39} $$

Part VIII

Beyond the ideal gas

In this part we introduce various extensions to the ideal gas model which allow us to take account of various complications which make the subject of thermal physics more rich and interesting, but of course also slightly more complicated! This part is structured as follows:

• In Chapter 25, we study the consequences of allowing the dispersion relation, the equation which connects energy and momentum, to be relativistic. We examine the differences between the relativistic and non-relativistic cases.
• In Chapter 26, we introduce several equations of state which take into account the interactions between molecules in a gas. These include the van der Waals model, the Dieterici model and the virial expansion. We discuss the law of corresponding states.
• In Chapter 27, we discuss how to cool real gases using the Joule–Kelvin expansion and the operation of a liquefier.
• In Chapter 28, we discuss phase transitions, discussing latent heat and deriving the Clausius–Clapeyron equation. We discuss the criteria for stability and metastability and derive the Gibbs phase rule. We introduce colligative properties and classify the different types of phase transition.
• In Chapter 29, we examine the effect that exchange symmetry has on the quantum wave functions of collections of identical particles. This allows us to introduce bosons and fermions, which can be used to describe the Bose–Einstein distribution and Fermi–Dirac distribution respectively.
• In Chapter 30, we show how the results of the previous chapter can be applied to quantum gases, and we consider non-interacting fermion and boson gases and discuss Bose–Einstein condensation.

25 Relativistic gases

25.1 Relativistic dispersion relation for massive particles
25.2 The ultrarelativistic gas
25.3 Adiabatic expansion of an ultrarelativistic gas
Chapter summary
Exercises

In this chapter we will repeat our derivation of the partition function for a gas, and hence of the other thermodynamic properties which can be obtained from it, but this time include relativistic eﬀects. We will see that this leads to some subtle changes in these properties which have profound consequences. First we will review the full relativistic dispersion relation for particles with non-zero mass and then derive the partition function for ultrarelativistic particles.

25.1 Relativistic dispersion relation for massive particles

In deriving the partition function for a gas, we assumed that the kinetic energy $E$ of a molecule of mass $m$ was equal to $p^2/2m$, where $p$ is the momentum (and using $p = \hbar k$, we wrote down $E(k) = \hbar^2 k^2/2m$; see eqn 21.15). This is a classical approximation valid only when $p/m \ll c$ (where $c$ is the speed of light), and in general we should use the relativistic formula
\[ E^2 = p^2 c^2 + m^2 c^4, \tag{25.1} \]
where $m$ is now taken to be the rest mass, i.e. the mass of the molecule in its rest frame. This is plotted in Fig. 25.1. When $p \ll mc$ (the non-relativistic limit) this reduces to
\[ E = \frac{p^2}{2m} + mc^2, \tag{25.2} \]
which is identical to our classical approximation $E = p^2/2m$ apart from the extra constant $mc^2$ (the rest mass energy), which just defines a new 'zero' for the energy (see Fig. 25.1). In the case $p \gg mc$ (the ultrarelativistic limit), eqn 25.1 reduces to
\[ E = pc, \tag{25.3} \]
which is the appropriate relation¹ for photons (this is the straight line in Fig. 25.1).

Fig. 25.1 The dispersion relation of a particle with mass (thick solid line) according to eqn 25.1. The dashed line is the non-relativistic limit ($p \ll mc$). The dotted line is the ultrarelativistic limit ($p \gg mc$).

¹ The relation between $E$ and $p$ is known as a dispersion relation. By scaling $E = \hbar\omega$ and $p = \hbar k$ by a factor $\hbar$ we have a relation between $\omega$ and $k$, which is perhaps more familiar as a dispersion relation from wave physics.
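The two limits of eqn 25.1 can be compared numerically. The following is a minimal sketch: the function names are chosen here for illustration, and the electron mass (rounded) is used only as a convenient example of a massive particle.

```python
import math

C = 299_792_458.0  # speed of light, m s^-1


def energy(p, m):
    """Full relativistic energy from E^2 = p^2 c^2 + m^2 c^4 (eqn 25.1)."""
    return math.sqrt((p * C) ** 2 + (m * C**2) ** 2)


def energy_nonrelativistic(p, m):
    """Non-relativistic limit, including the rest mass energy (eqn 25.2)."""
    return p**2 / (2.0 * m) + m * C**2


def energy_ultrarelativistic(p):
    """Ultrarelativistic limit E = pc (eqn 25.3)."""
    return p * C


if __name__ == "__main__":
    m = 9.109e-31  # example mass in kg (electron, rounded)
    for ratio in (0.01, 1.0, 100.0):  # ratio = p / (m c)
        p = ratio * m * C
        E = energy(p, m)
        print(ratio, energy_nonrelativistic(p, m) / E, energy_ultrarelativistic(p) / E)
```

For $p \ll mc$ the second column tends to one, and for $p \gg mc$ the third column does; near $p \approx mc$ neither approximation is good.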

25.2 The ultrarelativistic gas

Let us now consider a gas of particles with non-zero mass in the ultrarelativistic limit, which means that $E = pc$. Such a linear dispersion relation means that some of the algebra in this chapter is actually much simpler than we had to deal with for the partition function in the non-relativistic case, where the dispersion relation is quadratic. Using the ultrarelativistic limit means that all the particles (or at the very least, the vast majority of them) will be moving so quickly that their kinetic energy is much greater than their rest mass energy.² Using the ultrarelativistic limit $E = pc = \hbar k c$, we can write down the single-particle partition function
\[ Z_1 = \int_0^\infty e^{-\beta\hbar k c}\, g(k)\,dk, \tag{25.4} \]
where we recall that (eqn 21.6)
\[ g(k)\,dk = \frac{V k^2\,dk}{2\pi^2}, \tag{25.5} \]
and so, using the substitution $x = \beta\hbar k c$, we have
\[ Z_1 = \frac{V}{2\pi^2} \left( \frac{1}{\beta\hbar c} \right)^3 \int_0^\infty e^{-x} x^2\,dx, \tag{25.6} \]
and recognizing that the integral is $2!$, we have finally that
\[ Z_1 = \frac{V}{\pi^2} \left( \frac{k_B T}{\hbar c} \right)^3. \tag{25.7} \]

² Note however that we are ignoring any quantum effects which may come into play; these will be considered in Chapter 30.

Notice immediately that we find that $Z_1 \propto V T^3$, whereas in the non-relativistic case we had that $Z_1 \propto V T^{3/2}$. We can also write eqn 25.7 in a familiar form
\[ Z_1 = \frac{V}{\Lambda^3}, \tag{25.8} \]
where $\Lambda$ is not the same as the expression for the thermal wavelength in eqn 21.18, but is given by
\[ \Lambda = \frac{\hbar c \pi^{2/3}}{k_B T}. \tag{25.9} \]
Equivalently, one can write
\[ \Lambda = \frac{hc}{2\pi^{1/3} k_B T}. \]
It now becomes a simple exercise to determine all the properties of the ultrarelativistic gas using our practised methods of partition functions.
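As a numerical cross-check of eqns 25.6 to 25.9, the sketch below evaluates the integral in eqn 25.6 with Simpson's rule and compares the result with the closed forms. The function names, the truncation of the integral at $x = 60$ and the test values of $V$ and $T$ are assumptions made for illustration; the physical constants are standard SI values.

```python
import math

HBAR = 1.054571817e-34  # reduced Planck constant, J s
H = 2.0 * math.pi * HBAR  # Planck constant, J s
K_B = 1.380649e-23  # Boltzmann constant, J K^-1
C = 299_792_458.0  # speed of light, m s^-1


def lambda_ultrarel(T):
    """Lambda = hbar c pi^(2/3) / (k_B T), eqn 25.9."""
    return HBAR * C * math.pi ** (2.0 / 3.0) / (K_B * T)


def z1_closed_form(V, T):
    """Z_1 = (V / pi^2) (k_B T / hbar c)^3, eqn 25.7."""
    return V / math.pi**2 * (K_B * T / (HBAR * C)) ** 3


def z1_numeric(V, T, x_max=60.0, steps=6000):
    """Eqn 25.6 with the x-integral done by Simpson's rule on [0, x_max]."""
    def f(x):
        return x * x * math.exp(-x)

    h = x_max / steps
    total = f(0.0) + f(x_max)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * f(i * h)
    integral = total * h / 3.0  # should be close to 2! = 2
    return V / (2.0 * math.pi**2) * (K_B * T / (HBAR * C)) ** 3 * integral


if __name__ == "__main__":
    V, T = 1.0e-3, 1.0e10  # illustrative volume (m^3) and temperature (K)
    print(z1_numeric(V, T) / z1_closed_form(V, T))
    print(V / lambda_ultrarel(T) ** 3 / z1_closed_form(V, T))
```

Both printed ratios should be very close to one, confirming that the integral is $2!$ and that $V/\Lambda^3$ reproduces eqn 25.7.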

Example 25.1
Find $U$, $C_V$, $F$, $p$, $S$, $H$ and $G$ for an ultrarelativistic gas of indistinguishable particles.
Solution: The $N$-particle partition function $Z_N$ is given by³
\[ Z_N = \frac{Z_1^N}{N!}, \tag{25.10} \]
and hence
\[ \ln Z_N = N \ln V + 3N \ln T + \text{constants}. \tag{25.11} \]
The internal energy $U$ is given by
\[ U = -\frac{d \ln Z_N}{d\beta} = 3N k_B T, \tag{25.12} \]
which is different from the non-relativistic case (which gave $U = \frac{3}{2} N k_B T$). The heat capacity $C_V$ is
\[ C_V = \left( \frac{\partial U}{\partial T} \right)_V, \tag{25.13} \]
and hence is given by⁴ $C_V = 3N k_B$. The Helmholtz function is
\[ F = -k_B T \ln Z_N = -k_B T N \ln V - 3N k_B T \ln T - k_B T \times \text{constants}, \tag{25.14} \]
so that
\[ p = -\left( \frac{\partial F}{\partial V} \right)_T = \frac{N k_B T}{V} = n k_B T, \tag{25.15} \]
which is the ideal gas equation,⁵ as for the non-relativistic case. This also gives the enthalpy $H$ via
\[ H = U + pV = 4N k_B T. \tag{25.16} \]

³ This is assuming the density is not so high that this approximation breaks down.

⁴ Notice that this does not agree with the equipartition theorem, which would predict $C_V = \frac{3}{2} N k_B$, half of the value that we have found. Why does the equipartition theorem fail? Because the dispersion relation is not a quadratic one (i.e. $E \propto p^2$), as is needed for the equipartition theorem to hold, but instead is a linear one ($E \propto p$).

⁵ Note that we have $p = n k_B T$ for both the non-relativistic and ultrarelativistic cases. This is because $Z_1 \propto V$ in both cases; hence $Z_N \propto V^N$ and $F = -k_B T N \ln V$ + (other terms not involving $V$), so that $p = -(\partial F/\partial V)_T = n k_B T$.

As we found for the non-relativistic case, getting the entropy involves bothering with what the constants are in eqn 25.11. Hence, let us write this equation as
\[ \ln Z_N = N \ln V - 3N \ln \Lambda - N \ln N + N = N \ln\left( \frac{1}{n\Lambda^3} \right) + N, \tag{25.17} \]
where $n = N/V$, so we immediately have (using the usual statistical mechanics manipulations listed in Table 20.1):
\[ F = -k_B T \ln Z_N = N k_B T \left[ \ln(n\Lambda^3) - 1 \right], \tag{25.18} \]
\[ S = \frac{U - F}{T} = N k_B \left[ 4 - \ln(n\Lambda^3) \right], \tag{25.19} \]
\[ G = H - TS = 4N k_B T - N k_B T \left[ 4 - \ln(n\Lambda^3) \right] = N k_B T \ln(n\Lambda^3). \tag{25.20} \]

The results from this problem are summarized in Table 25.1. One consequence of these results is that the pressure $p$ is related to the energy density $u = U/V$ using
\[ p = \frac{u}{3}, \tag{25.21} \]
which is very different from the non-relativistic case $p = 2u/3$ (see eqn 6.25). This has some rather dramatic consequences for the structure of stars (see Section 35.1.3).
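The results of Example 25.1 can be reproduced by differentiating $\ln Z_N$ numerically. The sketch below works in units with $\hbar = c = k_B = 1$ (so that $Z_1 = V/\pi^2\beta^3$) and uses central finite differences. The function names, the step size and the test values of $T$, $V$ and $N$ are illustrative assumptions.

```python
import math


def ln_ZN(beta, V, N):
    """ln Z_N = N ln Z_1 - ln N! in units with hbar = c = k_B = 1."""
    ln_Z1 = math.log(V) - 2.0 * math.log(math.pi) - 3.0 * math.log(beta)
    return N * ln_Z1 - math.lgamma(N + 1.0)


def thermodynamics(T, V, N, rel_step=1e-6):
    """U, p, F, S, H and G from numerical derivatives of ln Z_N."""
    beta = 1.0 / T
    db = beta * rel_step
    dV = V * rel_step
    # U = -d(ln Z_N)/d(beta) and p = k_B T d(ln Z_N)/dV, by central differences
    U = -(ln_ZN(beta + db, V, N) - ln_ZN(beta - db, V, N)) / (2.0 * db)
    p = T * (ln_ZN(beta, V + dV, N) - ln_ZN(beta, V - dV, N)) / (2.0 * dV)
    F = -T * ln_ZN(beta, V, N)
    S = (U - F) / T
    H = U + p * V
    G = H - T * S
    return {"U": U, "p": p, "F": F, "S": S, "H": H, "G": G}


if __name__ == "__main__":
    N, V, T = 1.0e6, 1.0e12, 2.0  # dilute gas: n * Lambda^3 << 1
    r = thermodynamics(T, V, N)
    print(r["U"] / (N * T), r["H"] / (N * T), r["p"] * V / r["U"])
```

The three printed ratios should be close to 3, 4 and 1/3 respectively, matching eqns 25.12, 25.16 and 25.21.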

| Property | Non-relativistic | Ultrarelativistic |
|---|---|---|
| $Z_1$ | $V/\lambda_{th}^3$ | $V/\Lambda^3$ |
| | $\lambda_{th} = h/\sqrt{2\pi m k_B T}$ | $\Lambda = \hbar c \pi^{2/3}/(k_B T)$ |
| $U$ | $\frac{3}{2} N k_B T$ | $3 N k_B T$ |
| $H$ | $\frac{5}{2} N k_B T$ | $4 N k_B T$ |
| $p$ | $N k_B T/V = 2u/3$ | $N k_B T/V = u/3$ |
| $F$ | $N k_B T [\ln(n\lambda_{th}^3) - 1]$ | $N k_B T [\ln(n\Lambda^3) - 1]$ |
| $S$ | $N k_B [\frac{5}{2} - \ln(n\lambda_{th}^3)]$ | $N k_B [4 - \ln(n\Lambda^3)]$ |
| $G$ | $N k_B T \ln(n\lambda_{th}^3)$ | $N k_B T \ln(n\Lambda^3)$ |
| Adiabatic expansion | $V T^{3/2}$ = constant | $V T^3$ = constant |
| | $p V^{5/3}$ = constant | $p V^{4/3}$ = constant |

Table 25.1 The properties of non-relativistic and ultrarelativistic monatomic gases of indistinguishable particles of mass $m$.

25.3 Adiabatic expansion of an ultrarelativistic gas

We will now consider the adiabatic expansion of an ultrarelativistic monatomic gas. This means that we will keep the gas thermally isolated from its surroundings and no heat will enter or leave. The entropy stays constant in such a process, and hence (from Table 25.1) so does $n\Lambda^3$, which implies that
\[ V T^3 = \text{constant}, \tag{25.22} \]
or equivalently (using $pV \propto T$)
\[ p V^{4/3} = \text{constant}. \tag{25.23} \]
This implies that the adiabatic index $\gamma = 4/3$. This contrasts with the non-relativistic case (for which $V T^{3/2}$ and $p V^{5/3}$ are constants, and $\gamma = 5/3$).
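The step from eqn 25.22 to eqn 25.23 can be spelled out as follows. This is a short worked sketch using only $n = N/V$ with fixed $N$, $\Lambda \propto 1/T$ and $pV \propto T$ from the text.

```latex
% Constant entropy means constant n\Lambda^3 (Table 25.1), with N fixed:
n\Lambda^3 \propto \frac{1}{V T^3} = \text{constant}
\quad\Longrightarrow\quad V T^3 = \text{constant}.
% Substitute T \propto pV (ideal gas equation, eqn 25.15):
V (pV)^3 = p^3 V^4 = \text{constant}
\quad\Longrightarrow\quad p V^{4/3} = \text{constant}.
```

The same two steps with $\lambda_{th} \propto T^{-1/2}$ give $V T^{3/2}$ = constant and $p V^{5/3}$ = constant for the non-relativistic gas.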

Example 25.2
An example of the adiabatic expansion of an ultrarelativistic gas relates to the expansion of the Universe. If the Universe expands adiabatically (how can heat enter or leave it when it presumably doesn't have any 'surroundings' by definition?) then we expect that an ultrarelativistic gas inside the Universe, such as the cosmic microwave background photons,⁶ behaves according to
\[ V T^3 = \text{constant}, \tag{25.24} \]
where $T$ is the temperature of the Universe and $V$ is its volume. Hence
\[ T \propto V^{-1/3} \propto a^{-1}, \tag{25.25} \]
where $a$ is the scale factor⁷ of the Universe ($V \propto a^3$). Thus the temperature of the cosmic microwave background is inversely proportional to the scale factor of the Universe. A non-relativistic gas in the Universe would behave according to
\[ V T^{3/2} = \text{constant}, \tag{25.26} \]
in which case
\[ T \propto V^{-2/3} \propto a^{-2}, \tag{25.27} \]
so the non-relativistic gas would cool faster than the cosmic microwave background as the Universe expands. We can also work out the density $\rho$ of both types of gas as a function of the scale factor $a$. For the adiabatic expansion of a gas of non-relativistic particles, the density $\rho \propto V^{-1}$ (because the mass stays constant) and hence
\[ \rho \propto a^{-3}. \tag{25.28} \]
For relativistic particles,
\[ \rho = \frac{u}{c^2}, \tag{25.29} \]
where $u = U/V$ is the energy density. Now $u = 3p$ (by eqn 25.21) and since $p \propto V^{-4/3}$ for relativistic particles, we have that
\[ \rho \propto a^{-4}. \tag{25.30} \]
Thus the density drops off faster for a gas of relativistic particles than it does for non-relativistic particles, as the Universe expands.⁸ The Universe contains both matter (mostly non-relativistic) and photons (clearly ultrarelativistic). This simple analysis shows that as the Universe expanded, the matter cooled faster than the photons, but the density of the matter decreased less quickly than that due to the photons. The density of the early Universe is said to be radiation dominated but as time has passed the Universe has become matter dominated as far as its density (and hence expansion dynamics) is concerned.

⁶ See Section 23.7.

⁷ See Section 23.7.

⁸ This is because, for both cases, you have the effect of volume dilution due to the Universe expanding which goes as $a^3$; but only for the relativistic case do you have an energy loss (and hence a density loss) due to the Universe expanding, giving an extra factor of $a$.
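The scaling laws of Example 25.2 are easy to tabulate. The sketch below is illustrative only: the function names are chosen here, and the present-day density ratio passed to `equality_scale_factor` is a made-up number, not a measured value.

```python
def temperature_ratio(a_ratio, relativistic):
    """T_final / T_initial when the scale factor grows by a_ratio.

    Uses T ∝ 1/a (eqn 25.25) or T ∝ 1/a^2 (eqn 25.27).
    """
    return a_ratio ** (-1.0 if relativistic else -2.0)


def density_ratio(a_ratio, relativistic):
    """rho_final / rho_initial, from rho ∝ a^-4 (eqn 25.30) or a^-3 (eqn 25.28)."""
    return a_ratio ** (-4.0 if relativistic else -3.0)


def equality_scale_factor(rho_matter_now, rho_radiation_now):
    """Scale factor (with a = 1 now) at which the two densities were equal.

    Setting rho_m a^-3 = rho_r a^-4 gives a = rho_r / rho_m.
    """
    return rho_radiation_now / rho_matter_now


if __name__ == "__main__":
    # Doubling the scale factor: photons cool by 2, matter by 4.
    print(temperature_ratio(2.0, True), temperature_ratio(2.0, False))
    # Hypothetical present-day ratio rho_radiation / rho_matter = 1e-4.
    print(equality_scale_factor(1.0, 1.0e-4))
```

Because the radiation density falls with one extra power of $a$, any Universe containing both components was radiation dominated at sufficiently small $a$, whatever the present-day ratio.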

Chapter summary

• Using the ultrarelativistic dispersion relation $E = pc$, rather than the non-relativistic dispersion relation $E = p^2/2m$, leads to changes in various thermodynamic functions, as listed in Table 25.1.

Exercises

(25.1) Find the phase velocity and the group velocity for a relativistic particle whose energy $E$ is $E^2 = p^2 c^2 + m_0^2 c^4$ and examine the limits $p \ll m_0 c$ and $p \gg m_0 c$.

(25.2) In $D$ dimensions, show that the density of states of particles with spin-degeneracy $g$ in a volume $V$ is
\[ g(k)\,dk = \frac{g V D \pi^{D/2} k^{D-1}\,dk}{\Gamma(\frac{D}{2} + 1)(2\pi)^D}. \tag{25.31} \]
You may need to use the fact that the volume of a sphere of radius $r$ in $D$ dimensions is (see Appendix C.8)
\[ \frac{\pi^{D/2} r^D}{\Gamma(\frac{D}{2} + 1)}. \tag{25.32} \]

(25.3) Consider a general dispersion relation of the form
\[ E = \alpha p^s, \tag{25.33} \]
where $p$ is the momentum and $\alpha$ and $s$ are constants. Using the result of the previous question, show that the density of states as a function of energy is
\[ g(E)\,dE = \frac{g V D \pi^{D/2}}{h^D \alpha^{D/s} s\, \Gamma(\frac{D}{2} + 1)}\, E^{\frac{D}{s} - 1}\,dE. \tag{25.34} \]
Hence show that the single-particle partition function takes the form
\[ Z_1 = \frac{V}{\lambda^D}, \tag{25.35} \]
where $\lambda$ is given by
\[ \lambda = \frac{h}{\pi^{1/2}} \left( \frac{\alpha}{k_B T} \right)^{1/s} \left[ \frac{\Gamma(\frac{D}{2} + 1)}{\Gamma(\frac{D}{s} + 1)} \right]^{1/D}. \tag{25.36} \]
Show that this result for three dimensions ($D = 3$) agrees with (i) the non-relativistic case when $s = 2$ and (ii) the ultrarelativistic case when $s = 1$.

26 Real gases

26.1 The van der Waals gas
26.2 The Dieterici equation
26.3 Virial expansion
26.4 The law of corresponding states
Chapter summary
Exercises

Fig. 26.1 Isotherms of the ideal gas for three different temperatures $T_3 > T_2 > T_1$.

In this book we have spent a lot of time considering the so-called ideal (sometimes called 'perfect') gas, which has an equation of state given by
\[ pV = n_{\rm moles} RT, \tag{26.1} \]
where $n_{\rm moles}$ is the number of moles, or equivalently by
\[ pV_m = RT, \tag{26.2} \]
where $V_m = V/n_{\rm moles}$ is the molar volume (i.e. the volume occupied by 1 mole). This equation of state leads to isotherms as plotted in Fig. 26.1. However, real gases don't behave quite like this, particularly when the pressure is high and the volume is small. For a start, if you get a real gas cold enough it will liquefy, and this is something that the ideal gas equation does not predict or describe. In a liquid, the intermolecular attractions, which we have so far preferred to ignore, are really significant. In fact, even before the gas liquefies, there are departures from ideal-gas behaviour. This chapter deals with how this additional element of real behaviour can be modelled, by introducing various extensions to the ideal gas model, including those introduced by van der Waals (Section 26.1) and Dieterici (Section 26.2). An alternative series expansion approach is the so-called virial expansion in Section 26.3. Many similar systems behave in similar ways once the differences in the magnitude of the intermolecular interactions have been factored out by some appropriate scaling. This forms the basis of the law of corresponding states in Section 26.4.

26.1 The van der Waals gas

The most commonly used model of real gas behaviour is the van der Waals gas. This is the simplest real gas model which includes the two crucial ingredients we need: (i) intermolecular interactions (gas molecules actually weakly attract one another) and (ii) the non-zero size of molecules (gas molecules don’t have freedom to move around in all the volume of the container, because some of the volume is occupied by the other gas molecules!). Like the ideal gas, the van der Waals gas is only a model of real behaviour, but by being a slightly more complicated description than the ideal gas (more complicated in the right way!) it is able to describe more of the physical properties exhibited by real gases.

Origin of the $a/V_m^2$ term

Assume $n_{\rm moles}$ moles of gas in volume $V$. The number of nearest neighbours is proportional to $n_{\rm moles}/V$, and so attractive intermolecular interactions lower the total potential energy by an amount proportional to the number of atoms multiplied by the number of nearest neighbours, i.e. we can write the energy change as
\[ -\frac{a n_{\rm moles}^2}{V}, \tag{26.3} \]
where $a$ is a constant. Hence, if you change $V$, the energy changes by an amount
\[ a n_{\rm moles}^2 \frac{dV}{V^2}, \tag{26.4} \]
but this energy change can be thought of as being due to an effective pressure $p_{\rm eff}$, so that the energy change would be $-p_{\rm eff}\,dV$. Hence
\[ p_{\rm eff} = -a \frac{n_{\rm moles}^2}{V^2} = -\frac{a}{V_m^2}. \tag{26.5} \]
The pressure $p$ that you measure is the sum of the pressure $p_{\rm ideal}$ neglecting intermolecular interactions and $p_{\rm eff}$. Therefore
\[ p_{\rm ideal} = p - p_{\rm eff} = p + \frac{a}{V_m^2} \tag{26.6} \]
is the pressure which you have to enter into the formula for the ideal gas,
\[ p_{\rm ideal} V_m = RT, \tag{26.7} \]
making the correction $V_m \to V_m - b$ to take account of the excluded volume. This yields
\[ \left( p + \frac{a}{V_m^2} \right)(V_m - b) = RT, \tag{26.8} \]
in agreement with eqn 26.10. This equation of state can also be justified from statistical mechanics as follows: taking the expression for the partition function of $N$ molecules in a gas, $Z_N = (1/N!)(V/\lambda_{th}^3)^N$, we replace the volume $V$ by $V - n_{\rm moles} b$, the volume actually available for molecules to move around in; we also include a Boltzmann factor $e^{-\beta(-a n_{\rm moles}^2/V)}$ to give
\[ Z_N = \frac{1}{N!} \left( \frac{V - n_{\rm moles} b}{\lambda_{th}^3} \right)^N e^{\beta a n_{\rm moles}^2/V}, \tag{26.9} \]
which after using $F = -k_B T \ln Z_N$ and $p = -(\partial F/\partial V)_T$ yields the van der Waals equation of state.
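The final step of the box, from the partition function of eqn 26.9 to the equation of state, can be written out explicitly. This is a sketch of the intermediate algebra, using only $N k_B = n_{\rm moles} R$ in addition to the relations quoted above.

```latex
\ln Z_N = N \ln(V - n_{\rm moles} b) - 3N \ln \lambda_{th} - \ln N!
          + \frac{\beta a n_{\rm moles}^2}{V},
\\[4pt]
p = -\left(\frac{\partial F}{\partial V}\right)_T
  = k_B T \left(\frac{\partial \ln Z_N}{\partial V}\right)_T
  = \frac{N k_B T}{V - n_{\rm moles} b} - \frac{a n_{\rm moles}^2}{V^2},
\\[4pt]
\left(p + \frac{a n_{\rm moles}^2}{V^2}\right)(V - n_{\rm moles} b)
  = n_{\rm moles} R T
\quad\Longrightarrow\quad
\left(p + \frac{a}{V_m^2}\right)(V_m - b) = RT .
```

The last line follows on dividing through by $n_{\rm moles}$ and writing $V_m = V/n_{\rm moles}$.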

The equation of state for a van der Waals gas is
\[ \left( p + \frac{a}{V_m^2} \right)(V_m - b) = RT. \tag{26.10} \]
In this equation, the constant $a$ parameterizes the strength of the intermolecular interactions, while the constant $b$ accounts for the volume excluded owing to the finite size of molecules. If $a$ and $b$ are both set to zero, we recover the equation of state for an ideal gas, $pV_m = RT$. Moreover, in the low-density limit (when $V_m \gg b$ and $V_m \gg (a/p)^{1/2}$) we also recover the ideal gas behaviour. However, when the density is high, and we try to make $V_m$ approach $b$, the pressure $p$ shoots up.¹ The motivation for the $a/V_m^2$ term in the van der Waals model is outlined in the box 'Origin of the $a/V_m^2$ term'.

¹ To understand this, consider eqn 26.10 at some fixed $T$. When $V_m \to b$, the term $(V_m - b)$ is very small, and hence $(p + a/V_m^2) \approx (p + a/b^2)$ is very big and hence $p$ increases.

Fig. 26.2 Isotherms of the van der Waals gas. Isotherms towards the top right of the graph correspond to higher temperatures. The dashed line shows the region in which liquid and vapour are in equilibrium (see the end of Section 26.1). The thick line is the critical isotherm and the dot marks the critical point.

Multiplying eqn 26.10 by $V^2$ for one mole of van der Waals gas (where $V_m = V$), we have
\[ pV^3 - (pb + RT)V^2 + aV - ab = 0, \tag{26.11} \]
which is a cubic equation in $V$. The equation of state of the van der Waals gas is plotted in Fig. 26.2 for various isotherms. As the temperature is lowered, the isotherms change from being somewhat ideal-gas like, at the top right of the figure, to exhibiting an S-shape with a minimum and a maximum (as expected for a general cubic equation) in the lower left of the figure. This provides us with a complication: the isothermal compressibility (eqn 16.71) is $\kappa_T = -\frac{1}{V}(\partial V/\partial p)_T$, and for the ideal gas this is always positive (and equal to the reciprocal of the pressure of the gas). However, for the van der Waals gas, when the isotherms become S-shaped, there is a region in which the gradient $(\partial V/\partial p)_T$ is positive and hence the compressibility $\kappa_T$ will be negative. This is not a stable situation: a negative compressibility means that when you try to compress the gas it gets bigger! If a pressure fluctuation momentarily increases the pressure, the volume increases (rather than decreases) and negative work is done on the gas, providing energy to amplify the pressure fluctuation; thus a negative compressibility means that the system is unstable with respect to fluctuations. The problem starts when the isotherms become S-shaped, and this happens when the temperature is lower than a certain critical temperature. This temperature is that of the critical isotherm, which is indicated by the thick solid line in Fig. 26.2. This does not have a maximum or minimum but shows a point of inflection, known as the critical point, which is marked by the dot in Fig. 26.2.
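The region of negative compressibility can be located numerically by scanning an isotherm for points where $(\partial p/\partial V)_T > 0$. The sketch below is illustrative: the values of `a` and `b` are arbitrary round numbers of a plausible order of magnitude rather than data for a particular gas, and the temperature scale $8a/27Rb$ used to pick the two test temperatures is the critical temperature derived in Example 26.1.

```python
R = 8.314462618  # molar gas constant, J K^-1 mol^-1


def pressure(V, T, a, b):
    """van der Waals pressure for one mole (eqn 26.10 solved for p)."""
    return R * T / (V - b) - a / V**2


def dp_dV(V, T, a, b):
    """Isothermal slope (dp/dV)_T of the van der Waals isotherm."""
    return -R * T / (V - b) ** 2 + 2.0 * a / V**3


def unstable_range(T, a, b, n=20000, V_max_factor=50.0):
    """Return (V_low, V_high) bounding the scanned volumes with dp/dV > 0.

    Volumes are scanned on a uniform grid from just above b up to
    V_max_factor * b. Returns None if the slope is negative everywhere.
    """
    lo = hi = None
    for i in range(1, n + 1):
        V = b * (1.0 + (V_max_factor - 1.0) * i / n)
        if dp_dV(V, T, a, b) > 0.0:
            if lo is None:
                lo = V
            hi = V
    return None if lo is None else (lo, hi)


if __name__ == "__main__":
    a, b = 0.14, 4.0e-5  # illustrative values (Pa m^6 mol^-2, m^3 mol^-1)
    T_scale = 8.0 * a / (27.0 * R * b)
    print(unstable_range(0.9 * T_scale, a, b))
    print(unstable_range(1.1 * T_scale, a, b))
```

Below the critical temperature the scan returns a finite window of volumes in which the isotherm slopes upwards; above it the function returns `None`, since every isotherm is then monotonic.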

Example 26.1
Find the temperature $T_c$, pressure $p_c$ and volume $V_c$ at the critical point of a van der Waals gas, and calculate the ratio $p_c V_c/RT_c$.
Solution: The equation of state for one mole of van der Waals gas can be rewritten with $p$ as the subject as follows:
\[ p = \frac{RT}{V - b} - \frac{a}{V^2}. \tag{26.12} \]
The point of inflection can be found by using
\[ \left( \frac{\partial p}{\partial V} \right)_T = -\frac{RT}{(V - b)^2} + \frac{2a}{V^3} = 0 \tag{26.13} \]
and
\[ \left( \frac{\partial^2 p}{\partial V^2} \right)_T = \frac{2RT}{(V - b)^3} - \frac{6a}{V^4} = 0. \tag{26.14} \]
Equation 26.13 implies that
\[ RT = \frac{2a(V - b)^2}{V^3}, \tag{26.15} \]
while eqn 26.14 implies that
\[ RT = \frac{3a(V - b)^3}{V^4}, \tag{26.16} \]
and equating these last two equations gives
\[ \frac{3(V - b)}{V} = 2, \tag{26.17} \]
which implies that $V = V_c$, where $V_c$ is the critical volume given by
\[ V_c = 3b. \tag{26.18} \]
Substituting this back into eqn 26.14 yields $RT = 8a/27b$ and hence $T = T_c$, where $T_c$ is the critical temperature given by
\[ T_c = \frac{8a}{27Rb}. \tag{26.19} \]
Substituting our expressions for $V_c$ and $T_c$ back into the equation of state for a van der Waals gas gives the critical pressure $p_c$ as
\[ p_c = \frac{a}{27b^2}. \tag{26.20} \]
We then have that
\[ \frac{p_c V_c}{RT_c} = \frac{3}{8} = 0.375, \tag{26.21} \]
independent of both $a$ and $b$. At the critical point,
\[ \left( \frac{\partial p}{\partial V} \right)_{T_c} = 0, \tag{26.22} \]
and hence the isothermal compressibility diverges since
\[ \kappa_T = -\frac{1}{V} \left( \frac{\partial V}{\partial p} \right)_T \to \infty. \tag{26.23} \]

We have found that, when $T < T_c$, the compressibility $\kappa_T$ is negative over part of each isotherm, and so the system is then unstable. Let us now examine the isotherms below the critical temperature. Since the constraints in an experiment are often those of constant pressure and temperature, it is instructive to examine the Gibbs function for the van der Waals gas, which we can obtain as follows. The Helmholtz function $F$ is related to $p$ by $p = -(\partial F/\partial V)_T$ and so (for 1 mole)
\[ F = f(T) - RT \ln(V - b) - \frac{a}{V}, \tag{26.24} \]
where $f(T)$ is a function of temperature. Hence the Gibbs function is
\[ G = F + pV = f(T) - RT \ln(V - b) - \frac{a}{V} + pV, \tag{26.25} \]
and this is plotted as a function of pressure $p$ in the lower half of Fig. 26.3 for a temperature $T = 0.9T_c$, i.e. below the critical temperature. What is found is that the Gibbs function becomes multiply valued for certain values of pressure. Since a system held at constant temperature and pressure will minimize its Gibbs function, the system will normally ignore the upper loop of the Gibbs function, i.e. the path BXYB in Fig. 26.3, and proceed from A to B to C as the pressure is reduced. The upper part of Fig. 26.3 also shows the corresponding behaviour of the volume as a function of pressure for this same temperature. We see here that the two points B1 and B2 on the curve representing the volume correspond to the single point B on the curve representing the Gibbs function. Since the Gibbs function is the same for these two points, phases corresponding to these two points can be in equilibrium with each other. The point B is thus a two-phase region in which gas and liquid coexist. Thus liquid (with a much smaller compressibility) is stable in the region A→B and gas (with a much larger compressibility) is stable in the region B→C.

Fig. 26.3 The behaviour of the volume $V$ and Gibbs function $G$ of a van der Waals gas as a function of pressure at $T = 0.9T_c$.

The line BX represents a metastable state, in this case superheated liquid. The line BY represents another metastable state, supercooled gas. These metastable states are not the phase corresponding to the lowest Gibbs function of the system for the given conditions of temperature and pressure. They can, however, exist for limited periods of time. The dependence of the Gibbs function on volume for various pressures, expressed in eqn 26.25 and plotted in Fig. 26.4, helps us to understand why. At high pressure, there is a single minimum in the Gibbs function corresponding to a low-volume state (the liquid). At low pressure, there is a single minimum in the Gibbs function corresponding to a high-volume state (the gas). At the critical pressure (the thick solid line in Fig. 26.4) there are two minima, corresponding to the coexisting liquid and gas states. If you take gas initially at low pressure and raise the pressure, then when you reach the critical pressure, the system will still be in the right-hand minimum of the Gibbs function. Raising the pressure above $p_c$ would make the left-hand minimum (liquid state) the more stable state, but the system might be stuck in the right-hand minimum (gaseous state) because there is a small energy barrier to surmount to achieve the true stable state. The system is thus, at least temporarily, stuck in a metastable state.

Fig. 26.4 The Gibbs function for different pressures for the van der Waals gas with $T/T_c = 0.9$. The line corresponding to highest (lowest) pressure is at the top (bottom) of the figure. The thick solid line corresponds to the critical pressure $p_c$.

Of course, the triangle BXY vanishes for temperatures above the critical temperature and then there is simply a crossover between a system with low compressibility to one with progressively higher compressibility as the pressure is reduced. When $T > T_c$, the sharp distinction between liquid and gas is lost and you cannot really tell precisely where the system stops being liquid and starts being a gas. This is a point we will return to in Section 28.7.

We have noted that at points B1 and B2 in Fig. 26.5 we have phase coexistence because the Gibbs function is equal at these points. In general, we can always write that the Gibbs function at some pressure $p_1$ is related to the Gibbs function at some pressure $p_0$ by
\[ G(p_1, T) = G(p_0, T) + \int_{p_0}^{p_1} \left( \frac{\partial G}{\partial p} \right)_T dp, \tag{26.26} \]
and since
\[ \left( \frac{\partial G}{\partial p} \right)_T = V, \tag{26.27} \]
we have
\[ G(p_1, T) = G(p_0, T) + \int_{p_0}^{p_1} V\,dp. \tag{26.28} \]
Applying this equation between the points B1 and B2 we have that
\[ G(p_{B_2}, T) = G(p_{B_1}, T) + \int_{B_1}^{B_2} V\,dp \tag{26.29} \]
and since $G(p_{B_1}, T) = G(p_{B_2}, T)$, we have that
\[ \int_{B_1}^{B_2} V\,dp = 0. \tag{26.30} \]

Fig. 26.5 The Maxwell construction for the van der Waals gas. Phase coexistence occurs between points B1 and B2 when the shaded areas are equal. The dotted line shows the locus of such points for different temperatures (and is identical to the dashed line in Fig. 26.2).

This result gives us a useful way of identifying the points B1 and B2 , as illustrated in Fig. 26.5. These two points show phase coexistence when the two shaded areas are equal, and this follows directly from eqn 26.30.
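The equal-area condition can be solved numerically. The sketch below works in reduced variables $p/p_c$, $V/V_c$ and $T/T_c$, in which the van der Waals equation becomes $p = 8T/(3V - 1) - 3/V^2$ (this follows from substituting eqns 26.18 to 26.20 into eqn 26.12). Because the pressure is the same at B1 and B2, integrating by parts turns eqn 26.30 into $\int_{V_l}^{V_g} p\,dV = p\,(V_g - V_l)$, which is the condition used here. The function names, the nested bisection strategy and the restriction to $0.85 \le T/T_c \le 0.99$ are choices made for this illustration.

```python
import math


def p_reduced(v, t):
    """Reduced van der Waals pressure p/p_c at reduced volume v, temperature t."""
    return 8.0 * t / (3.0 * v - 1.0) - 3.0 / v**2


def dp_dv(v, t):
    """Slope of the reduced isotherm."""
    return -24.0 * t / (3.0 * v - 1.0) ** 2 + 6.0 / v**3


def integral_p_dv(v, t):
    """Antiderivative of p_reduced with respect to v."""
    return (8.0 * t / 3.0) * math.log(3.0 * v - 1.0) + 3.0 / v


def bisect(f, lo, hi, tol=1e-12):
    """Root of f in [lo, hi], assuming f changes sign exactly once there."""
    f_lo = f(lo)
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        f_mid = f(mid)
        if (f_mid > 0.0) == (f_lo > 0.0):
            lo, f_lo = mid, f_mid
        else:
            hi = mid
        if hi - lo < tol:
            break
    return 0.5 * (lo + hi)


def coexistence(t):
    """Return (p, v_liquid, v_gas) in reduced units from the Maxwell construction."""
    if not 0.85 <= t <= 0.99:
        raise ValueError("this sketch assumes 0.85 <= T/T_c <= 0.99")
    eps = 1e-9
    # Turning points of the isotherm: local minimum, then local maximum.
    v_min = bisect(lambda v: dp_dv(v, t), 1.0 / 3.0 + eps, 1.0)
    v_max = bisect(lambda v: dp_dv(v, t), 1.0, 1000.0)

    def outer_roots(p):
        v_l = bisect(lambda v: p_reduced(v, t) - p, 1.0 / 3.0 + eps, v_min)
        v_g = bisect(lambda v: p_reduced(v, t) - p, v_max, 1000.0)
        return v_l, v_g

    def area_mismatch(p):
        v_l, v_g = outer_roots(p)
        return integral_p_dv(v_g, t) - integral_p_dv(v_l, t) - p * (v_g - v_l)

    p = bisect(area_mismatch, p_reduced(v_min, t) + eps, p_reduced(v_max, t) - eps)
    v_l, v_g = outer_roots(p)
    return p, v_l, v_g


if __name__ == "__main__":
    print(coexistence(0.9))
```

Repeating the calculation for a range of temperatures traces out the coexistence curve, i.e. the locus of B1 and B2 in the $p$–$V$ plane and the line of phase coexistence in the $p$–$T$ plane.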

Fig. 26.6 The $p$–$T$ phase diagram for a van der Waals gas.

The horizontal dashed line separating the two equal shaded areas in Fig. 26.5 is known as the Maxwell construction. The dotted line in Fig. 26.5 shows the locus of such points of coexistence for different temperatures (and is identical to the dashed line in Fig. 26.2). This allows us to plot the phase diagram shown in Fig. 26.6, which shows $p$ against $T$. The line of phase coexistence is shown, ending in the critical point at $T = T_c$ and $p = p_c$. At fixed pressure, the stable low-temperature state is the liquid, while the stable high-temperature state is the gas. Note that when $p > p_c$ and $T > T_c$, there is no sharp phase boundary separating gas from liquid. Thus it is possible to 'avoid' a sharp phase transition between gas and liquid by, for example, starting with a gas, heating it at low pressure to above $T_c$, isothermally pressurizing above $p_c$, and then isobarically cooling to below $T_c$ and obtaining a liquid. We will consider these transitions between different phases in more detail in Chapter 28.

26.2 The Dieterici equation

The van der Waals equation of state can be written in the form
\[ p = p_{\rm repulsive} + p_{\rm attractive}, \tag{26.31} \]
where the first term is a repulsive hard sphere interaction
\[ p_{\rm repulsive} = \frac{RT}{V - b}, \tag{26.32} \]
which is an ideal-gas-like term but with the denominator being the volume available to gas molecules, namely that of the container $V$ minus that of the molecules, $b$. The second term is the attractive interaction
\[ p_{\rm attractive} = -\frac{a}{V^2}. \tag{26.33} \]
There have been other attempts to model non-ideal gases. In the Berthelot equation, the attractive force is made temperature-dependent by writing
\[ p_{\rm attractive} = -\frac{a}{TV^2}. \tag{26.34} \]
Another approach is due to Dieterici,² who in 1899 proposed an alternative equation of state in which he wrote that
\[ p = p_{\rm repulsive} \exp\left( -\frac{a}{RTV} \right), \tag{26.35} \]
and using eqn 26.32 this leads to
\[ p(V_m - b) = RT \exp\left( -\frac{a}{RTV_m} \right), \tag{26.36} \]
which is the Dieterici equation, here written in terms of the molar volume. The constant $a$ is, again, a parameter which controls the strength of attractive interactions. Isotherms of the Dieterici equation of state are shown in Fig. 26.7; they are similar to the ones for the van der Waals gas (Fig. 26.2), showing a very sudden increase in pressure as $V$ approaches $b$. The critical point can be identified for this model by evaluating
\[ \left( \frac{\partial p}{\partial V} \right)_T = \left( \frac{\partial^2 p}{\partial V^2} \right)_T = 0, \tag{26.37} \]
and this yields (after a little algebra)
\[ T_c = \frac{a}{4Rb}, \qquad p_c = \frac{a}{4e^2 b^2}, \qquad V_c = 2b, \tag{26.38} \]
for the critical temperature, pressure and volume, and hence
\[ \frac{p_c V_c}{RT_c} = \frac{2}{e^2} = 0.271. \tag{26.39} \]
This value agrees well with those listed in Table 26.1 (and is better than the van der Waals result, which is 0.375, as shown in eqn 26.21).

² Conrad Dieterici (1858–1929)

| | Ne | Ar | Kr | Xe |
|---|---|---|---|---|
| $p_c V_c/RT_c$ | 0.287 | 0.292 | 0.291 | 0.290 |

Table 26.1 The values of $p_c V_c/RT_c$ for various noble gases.

Fig. 26.7 Isotherms for the Dieterici equation of state.
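The comparison between the two models and Table 26.1 can be made explicit with a few lines of arithmetic. In this sketch the dictionary simply copies the four values from Table 26.1; the variable names are chosen here.

```python
import math

# Measured critical ratios p_c V_c / (R T_c), copied from Table 26.1.
NOBLE_GAS_RATIOS = {"Ne": 0.287, "Ar": 0.292, "Kr": 0.291, "Xe": 0.290}

VAN_DER_WAALS_RATIO = 3.0 / 8.0  # eqn 26.21
DIETERICI_RATIO = 2.0 / math.e**2  # eqn 26.39

mean_measured = sum(NOBLE_GAS_RATIOS.values()) / len(NOBLE_GAS_RATIOS)

if __name__ == "__main__":
    for name, ratio in (("van der Waals", VAN_DER_WAALS_RATIO),
                        ("Dieterici", DIETERICI_RATIO)):
        deviation = (ratio - mean_measured) / mean_measured
        print(f"{name}: {ratio:.3f} (relative deviation {deviation:+.1%})")
```

The Dieterici value lies within a few per cent of the tabulated ratios, whereas the van der Waals value overestimates them by roughly a third.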

26.3 Virial expansion

Another method to model real gases is to take the ideal gas equation and modify it using a power series in $1/V_m$ (where $V_m$ is the molar volume). This leads to the following virial expansion:
\[ \frac{pV_m}{RT} = 1 + \frac{B}{V_m} + \frac{C}{V_m^2} + \cdots \tag{26.40} \]
In this equation, the parameters $B$, $C$, etc. are called virial coefficients and can be made to be temperature dependent (so that we will denote them by $B(T)$ and $C(T)$). The temperature at which the virial coefficient $B(T)$ goes to zero is called the Boyle temperature $T_B$ since it is the temperature at which Boyle's law is approximately obeyed (neglecting the higher-order virial coefficients), as shown in Fig. 26.8.

Fig. 26.8 The temperature dependence of the virial coefficient $B$.

Example 26.2
Express the van der Waals equation of state in terms of a virial expansion and hence find the Boyle temperature in terms of the critical temperature.
Solution: The van der Waals equation of state can be rewritten as
\[ p = \frac{RT}{V - b} - \frac{a}{V^2} = \frac{RT}{V} \left( 1 - \frac{b}{V} \right)^{-1} - \frac{a}{V^2}, \tag{26.41} \]
and using the binomial expansion, the term in brackets can be expanded into a series, resulting in
\[ \frac{pV}{RT} = 1 + \frac{1}{V} \left( b - \frac{a}{RT} \right) + \left( \frac{b}{V} \right)^2 + \left( \frac{b}{V} \right)^3 + \cdots, \tag{26.42} \]
which is in the same form as the virial expansion in eqn 26.40 with
\[ B(T) = b - \frac{a}{RT}. \tag{26.43} \]
The Boyle temperature $T_B$ is defined by $B(T_B) = 0$ and hence
\[ T_B = \frac{a}{bR}, \tag{26.44} \]
and hence using eqn 26.19 we have that
\[ T_B = \frac{27T_c}{8}. \tag{26.45} \]
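The virial form of the van der Waals equation can be checked against the full equation of state. The sketch below computes $B(T)$, the Boyle temperature and the compressibility factor $pV_m/RT$; the function names and the values of `a` and `b` are illustrative assumptions, not data for a particular gas.

```python
R = 8.314462618  # molar gas constant, J K^-1 mol^-1


def second_virial_vdw(T, a, b):
    """B(T) = b - a/(RT) for the van der Waals gas, eqn 26.43."""
    return b - a / (R * T)


def boyle_temperature(a, b):
    """T_B = a/(bR), eqn 26.44."""
    return a / (b * R)


def critical_temperature(a, b):
    """T_c = 8a/(27Rb), eqn 26.19."""
    return 8.0 * a / (27.0 * R * b)


def compressibility_factor(V_m, T, a, b):
    """p V_m / (R T) from the full van der Waals equation of state."""
    p = R * T / (V_m - b) - a / V_m**2
    return p * V_m / (R * T)


if __name__ == "__main__":
    a, b = 0.14, 4.0e-5  # illustrative values (Pa m^6 mol^-2, m^3 mol^-1)
    T, V_m = 300.0, 0.04
    exact = compressibility_factor(V_m, T, a, b)
    truncated = 1.0 + second_virial_vdw(T, a, b) / V_m
    print(exact, truncated)
    print(boyle_temperature(a, b) / critical_temperature(a, b))
```

At low density the expansion truncated after $B/V_m$ differs from the full result only by terms of order $(b/V_m)^2$, and the final ratio reproduces $T_B/T_c = 27/8$.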

The additional terms in the virial expansion give information about the nature of the intermolecular interactions. We can show this using the following argument, which shows how to model intermolecular interactions in the dilute gas limit using a statistical mechanical argument.3

3 This argument is a little more technical than the material in the rest of this chapter and can be skipped at first reading.

The total internal energy U of the molecules (each with mass m) in a gas can be written as

$$U = U_{\rm K.E.} + U_{\rm P.E.}, \tag{26.46}$$

where the kinetic energy $U_{\rm K.E.}$ is given by a sum over all N molecules

$$U_{\rm K.E.} = \sum_{i=1}^{N} \frac{p_i^2}{2m}, \tag{26.47}$$

where $p_i$ is the momentum of the ith molecule, and the potential energy is given by

$$U_{\rm P.E.} = \frac{1}{2}\sum_{i\neq j} \mathcal{V}(|\mathbf{r}_i - \mathbf{r}_j|), \tag{26.48}$$

where $\mathcal{V}(|\mathbf{r}_i - \mathbf{r}_j|)$ is the potential energy between the ith and jth molecules and the factor $\frac{1}{2}$ is to avoid double counting of pairs of molecules in the sum. The partition function Z is then given by

$$Z = \int\cdots\int d^3r_1\cdots d^3r_N\, d^3p_1\cdots d^3p_N\, e^{-\beta[U_{\rm K.E.}(\{\mathbf{p}_i\}) + U_{\rm P.E.}(\{\mathbf{r}_i\})]} = Z_{\rm K.E.} Z_{\rm P.E.}, \tag{26.49}$$

where the last equality follows because the integrals in momentum and in position variables are separable. Now $Z_{\rm K.E.}$ is the partition function for the ideal gas which we have already derived (Chapter 21) and which leads to the ideal gas equation pV = RT, so we focus entirely on $Z_{\rm P.E.}$, which is given by

$$Z_{\rm P.E.} = \frac{1}{V^N}\int\cdots\int d^3r_1\cdots d^3r_N\, e^{-\beta U_{\rm P.E.}}, \tag{26.50}$$

where we have included the factor $1/V^N$ so that when $U_{\rm P.E.} = 0$ then $Z_{\rm P.E.} = 1$. Hence

$$Z_{\rm P.E.} = \frac{1}{V^N}\int\cdots\int d^3r_1\cdots d^3r_N\, e^{-\frac{\beta}{2}\sum_{i\neq j}\mathcal{V}(|\mathbf{r}_i-\mathbf{r}_j|)}, \tag{26.51}$$

and adding one and subtracting one from this equation,4 we have

$$Z_{\rm P.E.} = 1 + \frac{1}{V^N}\int\cdots\int d^3r_1\cdots d^3r_N \left[e^{-\frac{\beta}{2}\sum_{i\neq j}\mathcal{V}(|\mathbf{r}_i-\mathbf{r}_j|)} - 1\right]. \tag{26.52}$$

4 This trick is done because we know that later we will want to play with the log of the partition function, and $\ln(1 + x) \approx x$ for $x \ll 1$, so having Z in the form one plus something small is convenient.

We presume that the intermolecular interactions are only significant for molecules which are virtually touching, so the integrand is appreciably different from zero only when two molecules are very close together. If the gas is dilute this condition of two molecules being close will only happen relatively rarely, and so we will assume that this condition occurs only for one pair of molecules at any one time. There are N ways of picking the first molecule for a collision, and N − 1 ways of picking the second molecule for a collision, and since we don’t care which is the ‘first’ molecule and which is the ‘second’, the number of ways to select a pair of molecules from N molecules is

$$\frac{N(N-1)}{2}, \tag{26.53}$$

which is approximately $N^2/2$ when N is large. Writing r for the coordinate separating these two molecules, we then have

$$Z_{\rm P.E.} \approx 1 + \frac{N^2}{2V^N}\int\cdots\int d^3r_1\cdots d^3r_N \left[e^{-\beta\mathcal{V}(r)} - 1\right]. \tag{26.54}$$

Since the integral depends only on the separation r of these two molecules, we can integrate out the other N − 1 volume coordinates (each integral contributing a factor of the volume V, so that the prefactor $1/V^N$ is reduced to $1/V$) and obtain

$$Z_{\rm P.E.} \approx 1 + \frac{N^2}{2V}\int d^3r \left[e^{-\beta\mathcal{V}(r)} - 1\right], \tag{26.55}$$

and writing B(T) (the virial coefficient) as

$$B(T) = \frac{N}{2}\int d^3r \left[1 - e^{-\beta\mathcal{V}(r)}\right], \tag{26.56}$$

we have that

$$Z_{\rm P.E.} \approx 1 - \frac{N B(T)}{V}, \tag{26.57}$$

and hence

$$F = -k_{\rm B}T\ln Z = -k_{\rm B}T\ln(Z_{\rm K.E.}Z_{\rm P.E.}) = F_0 + \frac{N k_{\rm B}T B(T)}{V}, \tag{26.58}$$

where $F_0$ is the Helmholtz function of the ideal gas and the last equality is accomplished using $\ln(1 + x) \approx x$ for $x \ll 1$. Hence, we can evaluate the pressure p as follows:

$$p = -\left(\frac{\partial F}{\partial V}\right)_T = \frac{N k_{\rm B}T}{V} + \frac{N k_{\rm B}T B(T)}{V^2}. \tag{26.59}$$

Rearranging, we have that for one mole of gas

$$\frac{pV}{RT} = 1 + \frac{B(T)}{V}, \tag{26.60}$$

which is of the form of the virial expansion in eqn 26.40 but with only a single non-ideal term.

The temperature dependence of the virial coefficient B(T) for argon is shown in Fig. 26.9. It is large and negative at low temperatures but changes sign (at the Boyle temperature) and then becomes small and positive at higher temperatures. We can understand this from the expression for B(T) which is given in eqn 26.56. This is an integral of the function $1 - e^{-\beta\mathcal{V}(r)}$, which is shown in Fig. 26.10(b). At low temperatures, the integral of this function is dominated by the negative peak which is centred around $r_{\min}$, the minimum in the potential well (corresponding to the particles spending more time with this intermolecular spacing). Hence B(T) is negative and large at low temperatures. As temperature increases, the peak in this function broadens out as molecules spend more time a long way from each other, resulting in a weakened average potential energy. Here, the effect of the positive plateau below $r_{\min}$ begins to dominate the integral and B changes sign.

Fig. 26.9 The temperature dependence of the virial coefficient B(T) in argon. Argon has a boiling point at atmospheric pressure of $T_b$ = 87 K and the critical point is at $T_c$ = 151 K and $p_c$ = 4.86 MPa.
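The sign change of B(T) can be reproduced by evaluating the integral in eqn 26.56 numerically. The sketch below is an illustration only: it assumes a Lennard-Jones form for the intermolecular potential (a model choice that the text does not make), works in reduced units in which lengths are measured in units of the Lennard-Jones length σ and temperatures in units of the well depth divided by $k_{\rm B}$, and uses hypothetical function names.

```python
import math

def lj_potential(x):
    """Assumed Lennard-Jones potential in units of the well depth; x = r/sigma."""
    inv6 = x ** -6
    return 4.0 * (inv6 * inv6 - inv6)

def reduced_b(t_star, x_max=20.0, n=8000):
    """B(T) of eqn 26.56 divided by (2*pi*N*sigma^3/3), at reduced temperature t_star.

    With d^3r = 4*pi*r^2 dr, eqn 26.56 becomes
    B = 2*pi*N*sigma^3 * integral of x^2 * (1 - exp(-V(x)/t_star)) dx,
    which is evaluated here with the midpoint rule on [0, x_max].
    """
    dx = x_max / n
    total = 0.0
    for k in range(n):
        x = (k + 0.5) * dx
        total += x * x * (1.0 - math.exp(-lj_potential(x) / t_star))
    return 3.0 * total * dx

def boyle_temperature(lo=2.0, hi=5.0, tol=1e-4):
    """Locate the reduced temperature at which B changes sign, by bisection."""
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if reduced_b(mid) < 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

if __name__ == "__main__":
    for t_star in (1.0, 2.0, 5.0, 10.0):
        print(t_star, reduced_b(t_star))
    print("reduced Boyle temperature:", boyle_temperature())
```

In this model the coefficient is large and negative at low reduced temperature and small and positive at high reduced temperature, in line with the qualitative argument above. The integration range and step size are arbitrary choices, so the located sign change is approximate.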

Fig. 26.10 The upper graph, (a), shows the intermolecular potential energy $\mathcal{V}(r)$. The integrand in eqn 26.56 is $1 - e^{-\beta\mathcal{V}(r)}$ and this is plotted in the lower graph, (b), for different values of β. The solid curve shows a value with large β (low temperature) and the other curves show the effect of reducing β (raising the temperature), where in order of decreasing β the lines are dashed, dotted, long-dashed, and dot-dashed.

26.4 The law of corresponding states

For different substances, the size of the molecules (which controls b in the van der Waals model) and the strength of the intermolecular interactions (which controls a in the van der Waals model) will vary, and hence their phase diagrams will be different. For example, the critical temperatures and pressures for different gases are different. However, the phase diagram of substances should be the same when plotted in reduced coordinates, which can be obtained by dividing a quantity by its value at the critical point. Hence, if we replace the quantities p, V, T by their reduced coordinates $\tilde{p}$, $\tilde{V}$, $\tilde{T}$ defined by

$$\tilde{p} = \frac{p}{p_c}, \qquad \tilde{V} = \frac{V}{V_c}, \qquad \tilde{T} = \frac{T}{T_c}, \tag{26.61}$$

then phase diagrams of materials which are not wholly different from one another should lie on top of each other. This is called the law of corresponding states.

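Example 26.3 Express the equation of state of the van der Waals gas in reduced coordinates.

Solution: Substituting eqns 26.61 into eqn 26.10 we find that

$$p_c\tilde{p} = \frac{RT_c\tilde{T}}{V_c\tilde{V} - b} - \frac{a}{V_c^2\tilde{V}^2}, \tag{26.62}$$

and, using the van der Waals critical values $V_c = 3b$, $p_c = a/27b^2$ and $RT_c = 8a/27b$, this can be rearranged to give

$$\tilde{p} + \frac{3}{\tilde{V}^2} = \frac{8\tilde{T}}{3\tilde{V} - 1}. \tag{26.63}$$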
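A short numerical check makes the content of Example 26.3 concrete: the pressure of a van der Waals gas, divided by its critical pressure, should depend only on the reduced temperature and reduced volume, whatever a and b are. The sketch below assumes the standard van der Waals critical values $V_c = 3b$, $p_c = a/27b^2$ and $T_c = 8a/27Rb$; the function names and the two sets of (a, b) values are illustrative assumptions.

```python
R = 8.314  # molar gas constant, J K^-1 mol^-1

def vdw_pressure(T, V, a, b):
    """van der Waals pressure for one mole: p = RT/(V - b) - a/V^2."""
    return R * T / (V - b) - a / V ** 2

def vdw_critical_point(a, b):
    """Critical pressure, molar volume and temperature of a van der Waals gas."""
    return a / (27.0 * b ** 2), 3.0 * b, 8.0 * a / (27.0 * R * b)

def reduced_pressure(T_red, V_red):
    """Reduced equation of state, eqn 26.63 solved for the reduced pressure."""
    return 8.0 * T_red / (3.0 * V_red - 1.0) - 3.0 / V_red ** 2

def max_mismatch(a, b):
    """Largest difference between p/p_c and eqn 26.63 over a small grid of states."""
    p_c, V_c, T_c = vdw_critical_point(a, b)
    worst = 0.0
    for T_red in (0.9, 1.0, 1.5, 3.0):
        for V_red in (0.6, 1.0, 2.5, 10.0):
            direct = vdw_pressure(T_red * T_c, V_red * V_c, a, b) / p_c
            worst = max(worst, abs(direct - reduced_pressure(T_red, V_red)))
    return worst

if __name__ == "__main__":
    # Two illustrative (assumed) parameter sets, in Pa m^6 mol^-2 and m^3 mol^-1.
    for a, b in ((0.1355, 3.2e-5), (0.364, 4.27e-5)):
        print(a, b, max_mismatch(a, b))
```

Both parameter sets give mismatches at the level of floating-point rounding, which is the law of corresponding states in its simplest form: within the van der Waals model, every gas obeys the same reduced equation.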

The law of corresponding states works well in practice for real experimental data, since the intermolecular potential energies are usually of a similar form in different substances, as shown in Fig. 26.10(a). There is a repulsive region at small distances, a stable minimum at a separation $r_{\min}$ corresponding to a potential well depth of $-\epsilon$, and then a long-range attractive region at larger distances. For different molecules, the length scale $r_{\min}$ and the energy scale $\epsilon$ may be different, but these two parameters together are sufficient to give a reasonable description of the intermolecular potential energy. The parameter $r_{\min}$ sets the scale of the molecular size and the parameter $\epsilon$ sets the scale of the intermolecular interactions. Dividing p, V and T by their values at the critical point removes these scales and allows the different phase diagrams to be superimposed.


Fig. 26.11 The liquid–gas coexistence for a number of diﬀerent substances can be superimposed once they are plotted in reduced coordinates. The solid line is a scaling relation. This plot is adapted from Guggenheim (1945).

An example of this for real data is shown in Fig. 26.11. The form of the liquid–gas coexistence is diﬀerent in detail from that predicted by the van der Waals equation, but shows that the underlying behaviour in diﬀerent real systems is similar and shows ‘universal’ features.

Chapter summary

• Attractive intermolecular interactions and the non-zero size of molecules lead to departures from ideal gas behaviour.

• The van der Waals equation of state is $\left(p + \dfrac{a}{V_m^2}\right)(V_m - b) = RT$.

• The Dieterici equation of state is $p(V_m - b) = RT\,e^{-a/RTV_m}$.

• The virial expansion of a gas can be written as $\dfrac{pV_m}{RT} = 1 + \dfrac{B}{V_m} + \dfrac{C}{V_m^2} + \cdots$

• The law of corresponding states implies that if the variables p, V and T are scaled by their values at the critical point, the behaviour of different gases in these scaled variables is often very similar to that of other gases scaled in the same way.

Exercises

(26.1) Show that the isothermal compressibility $\kappa_T$ of a van der Waals gas can be written as

$$\kappa_T = \frac{4b}{3R}(T - T_c)^{-1}. \tag{26.64}$$

Sketch the temperature dependence of $\kappa_T$ and explain what happens to the properties of the gas when the temperature is lowered through the critical temperature.

(26.2) The equation of state of a certain gas is p(V − b) = RT, where b is a constant. What order of magnitude do you expect b to be? Show that the internal energy of this gas is a function of temperature only.

(26.3) Show that the Dieterici equation of state, $p(V - b) = RT\,e^{-a/RTV}$, can be written in reduced units as

$$\tilde{P}(2\tilde{V} - 1) = \tilde{T}\exp\left[2\left(1 - \frac{1}{\tilde{T}\tilde{V}}\right)\right],$$

where $\tilde{P} = P/P_c$, $\tilde{T} = T/T_c$, $\tilde{V} = V/V_c$, and $(P_c, T_c, V_c)$ is the critical point.

(26.4) Show that the isobaric expansivity $\beta_p$ of the van der Waals gas is given by

$$\beta_p = \frac{1}{T}\left(1 + \frac{b}{V-b} - \frac{2a}{pV^2 + a}\right)^{-1}. \tag{26.65}$$

What happens to this quantity close to the critical point?

(26.5) Show that eqn 26.9 leads to

$$U = \frac{3}{2}RT - \frac{a}{V}, \tag{26.66}$$

for one mole of gas.

(26.6) The total energy of one mole of a van der Waals gas can be written as

$$U = \frac{f}{2}RT - \frac{a}{V}, \tag{26.67}$$

where f is the number of degrees of freedom (see eqn 19.22). Show that

$$C_V = \frac{f}{2}R \tag{26.68}$$

and

$$C_p - C_V \approx R + \frac{2a}{TV}. \tag{26.69}$$

27 Cooling real gases

In Chapter 26, we considered how to model the properties of real gases using various corrections to the ideal gas model. In this chapter, we will use these results to explore some of the deviations from ideal gas behaviour which can be observed in practice, in particular with changes in the behaviour of a Joule expansion. Then we will introduce the Joule–Kelvin throttling process (which has no effect on an ideal gas, but which can lead to cooling of a real gas) and discuss how real gases can be liquefied.

27.1 The Joule expansion
27.2 Isothermal expansion
27.3 Joule–Kelvin expansion
27.4 Liquefaction of gases
Chapter summary
Exercises

27.1 The Joule expansion

We have discussed the properties of non-ideal gases in some detail. In this section, we will see how the intermolecular interactions in such gases lead to departures from ideal-gas behaviour for the Joule expansion. Recall from Section 14.4 that a Joule expansion is an irreversible expansion of a gas into a vacuum which can be accomplished by opening a tap connecting the vessel containing gas and an evacuated vessel (see Fig. 27.1). The entire system is isolated from its surroundings and so no heat enters or leaves. No work is done, so the internal energy U is unchanged. We are interested in finding out whether the gas warms, cools or remains at constant temperature in this expansion. To answer this, we define the Joule coefficient $\mu_J$ using

$$\mu_J = \left(\frac{\partial T}{\partial V}\right)_U, \tag{27.1}$$

where the constraint of constant U is relevant for the Joule expansion. This partial differential can be transformed using eqn 16.67 and the definition of $C_V$ to give

$$\mu_J = -\left(\frac{\partial T}{\partial U}\right)_V \left(\frac{\partial U}{\partial V}\right)_T = -\frac{1}{C_V}\left(\frac{\partial U}{\partial V}\right)_T. \tag{27.2}$$

Now the first law, dU = T dS − p dV, implies that

$$\left(\frac{\partial U}{\partial V}\right)_T = T\left(\frac{\partial S}{\partial V}\right)_T - p, \tag{27.3}$$

and using a Maxwell relation (eqn 16.53) this becomes

$$\left(\frac{\partial U}{\partial V}\right)_T = T\left(\frac{\partial p}{\partial T}\right)_V - p, \tag{27.4}$$

and hence

$$\mu_J = -\frac{1}{C_V}\left[T\left(\frac{\partial p}{\partial T}\right)_V - p\right]. \tag{27.5}$$

Fig. 27.1 The Joule expansion: (a) before opening the tap and (b) after opening the tap.

For an ideal gas, p = RT/V, $(\partial p/\partial T)_V = R/V$, and hence $\mu_J = 0$. Hence, as we found in Section 14.4, the temperature of an ideal gas is unchanged in a Joule expansion. For real gases, you always get cooling because of the attractive effect of interactions. This is because $C_V > 0$ and $(\partial U/\partial V)_T > 0$ and so $\mu_J = -\frac{1}{C_V}(\partial U/\partial V)_T < 0$. This can be understood physically as follows: when a gas freely expands into a vacuum, the time-averaged distance between neighbouring molecules increases and the magnitude of the potential energy resulting from the attractive intermolecular interactions is reduced. However, this potential energy is a negative quantity (because the interactions are attractive) and so the potential energy is actually increased (because it is made less negative).1 Since U must be unchanged in a Joule expansion (no heat enters or leaves and no work is done), the kinetic energy must be reduced (by the same amount by which the potential energy rises) and hence the temperature falls.

1 Of course, at very high densities, the intermolecular interactions become repulsive rather than attractive, but at such a density one is probably dealing with a solid rather than a gas.

Example 27.1 Evaluate the Joule coefficient for a van der Waals gas.

Solution: The equation of state is $p = RT/(V - b) - a/V^2$ and so

$$\left(\frac{\partial p}{\partial T}\right)_V = \frac{R}{V-b}, \tag{27.6}$$

and hence

$$\mu_J = -\frac{1}{C_V}\left[\frac{RT}{V-b} - \frac{RT}{V-b} + \frac{a}{V^2}\right] = -\frac{a}{C_V V^2}. \tag{27.7}$$

The temperature change in a Joule expansion from $V_1$ to $V_2$ can be evaluated simply by integrating the Joule coefficient as follows:

$$\Delta T = \int_{V_1}^{V_2} \mu_J\, dV = -\int_{V_1}^{V_2} \frac{1}{C_V}\left[T\left(\frac{\partial p}{\partial T}\right)_V - p\right] dV. \tag{27.8}$$

Example 27.2 Evaluate the change in temperature for a van der Waals gas which undergoes a Joule expansion from volume $V_1$ to volume $V_2$.

Solution: Using eqn 27.8, we have that

$$\Delta T = -\frac{a}{C_V}\int_{V_1}^{V_2} \frac{dV}{V^2} = -\frac{a}{C_V}\left(\frac{1}{V_1} - \frac{1}{V_2}\right) < 0 \tag{27.9}$$

since $V_2 > V_1$ in an expansion.
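To get a feel for the size of the effect, eqn 27.9 can be evaluated for a definite case and compared with a direct numerical integration of the Joule coefficient in eqn 27.7. This is a sketch under stated assumptions: the value of a is an illustrative number of roughly the size quoted for argon, $C_V$ is taken to be the monatomic value 3R/2 and treated as constant, the volumes are arbitrary, and the function names are hypothetical.

```python
R = 8.314  # molar gas constant, J K^-1 mol^-1

def joule_coefficient_vdw(V, a, C_V):
    """Joule coefficient of a van der Waals gas, eqn 27.7."""
    return -a / (C_V * V ** 2)

def delta_T_closed_form(V1, V2, a, C_V):
    """Temperature change in a Joule expansion, eqn 27.9."""
    return -(a / C_V) * (1.0 / V1 - 1.0 / V2)

def delta_T_numeric(V1, V2, a, C_V, n=20000):
    """Midpoint-rule integration of the Joule coefficient, eqn 27.8."""
    dV = (V2 - V1) / n
    total = 0.0
    for k in range(n):
        total += joule_coefficient_vdw(V1 + (k + 0.5) * dV, a, C_V)
    return total * dV

# Assumed illustrative values: one mole doubling its volume from 1 litre.
A_EXAMPLE = 0.1355      # Pa m^6 mol^-2
C_V_EXAMPLE = 1.5 * R   # J K^-1 mol^-1, monatomic gas
V1_EXAMPLE = 1.0e-3     # m^3
V2_EXAMPLE = 2.0e-3     # m^3

if __name__ == "__main__":
    print(delta_T_closed_form(V1_EXAMPLE, V2_EXAMPLE, A_EXAMPLE, C_V_EXAMPLE))
    print(delta_T_numeric(V1_EXAMPLE, V2_EXAMPLE, A_EXAMPLE, C_V_EXAMPLE))
```

The two routes agree, and the result is negative for any expansion, as eqn 27.9 requires. Note that b does not enter: in this model the cooling is controlled entirely by the attractive interactions.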

27.2 Isothermal expansion

Consider the isothermal expansion of a non-ideal gas. Equation 27.4 states that

$$\left(\frac{\partial U}{\partial V}\right)_T = T\left(\frac{\partial p}{\partial T}\right)_V - p, \tag{27.10}$$

so that the change of U in an isothermal expansion is

$$\Delta U = \int_{V_1}^{V_2}\left[T\left(\frac{\partial p}{\partial T}\right)_V - p\right] dV. \tag{27.11}$$

• For an ideal gas,2 $\Delta U = 0$.

• For a van der Waals gas, $\Delta U = \int_{V_1}^{V_2} \dfrac{a}{V^2}\, dV = a(1/V_1 - 1/V_2)$.

Note that U depends on a, not b (it is influenced by the intermolecular interactions but does not ‘care’ that the molecules have non-zero size). Note also that for large volumes, U becomes independent of V and one recovers the ideal gas limit.

Example 27.3 Calculate the entropy of a van der Waals gas.

Solution: The entropy S can be written as a function of T and V so that S = S(T, V). Hence

$$dS = \left(\frac{\partial S}{\partial T}\right)_V dT + \left(\frac{\partial S}{\partial V}\right)_T dV = \frac{C_V}{T}\, dT + \left(\frac{\partial p}{\partial T}\right)_V dV, \tag{27.12}$$

where eqns 16.68 and 16.53 have been used to obtain the second equality. For the van der Waals gas, we can write $(\partial p/\partial T)_V = R/(V - b)$, and hence we have that

$$S = C_V \ln T + R\ln(V - b) + \text{constant}. \tag{27.13}$$

Note that the entropy depends on the constant b, but not a. Entropy ‘cares’ about the volume occupied by the molecules in the gas (because this determines how much available space there is for the molecules to move around in, and this in turn determines the number of possible microstates of the system) but not about the intermolecular interactions.

2 For an ideal gas, p = RT/V, $(\partial p/\partial T)_V = R/V$, and hence $T(\partial p/\partial T)_V - p = 0$.

27.3 Joule–Kelvin expansion

The Joule expansion is a useful conceptual process, but it is not of much practical use for cooling gases. A gas cools slightly when it is expanded into a second evacuated vessel, but what do you do with it then? What is wanted is some kind of flow process where warm gas can be fed into some kind of a ‘cooling machine’ and cold gas (or better still, cold liquid) emerges from the other end. Such a process was discovered by James Joule and William Thomson (later Lord Kelvin) and is known as a Joule–Thomson expansion or a Joule–Kelvin expansion.

Consider a steady flow process in which gas at high pressure $p_1$ is forced through a throttle valve or a porous plug to a lower pressure $p_2$. This is illustrated in Fig. 27.2. Consider a volume $V_1$ of gas on the high-pressure side. Its internal energy is $U_1$. To push the gas through the constriction, the high-pressure gas behind it has to do work on it equal to $p_1V_1$ (since the pressure $p_1$ is maintained on the high-pressure side of the constriction). The gas expands as it passes through to the low-pressure region and now occupies volume $V_2$ which is larger than $V_1$. It has to do work on the low-pressure gas in front of it which is at pressure $p_2$ and hence this work is $p_2V_2$. The gas may change its temperature in the process and hence its new internal energy is $U_2$. The change in internal energy $(U_2 - U_1)$ must be equal to the work done on the gas $(p_1V_1)$ minus the work done by the gas $(p_2V_2)$. Thus

$$U_1 + p_1V_1 = U_2 + p_2V_2 \tag{27.14}$$

or equivalently

$$H_1 = H_2, \tag{27.15}$$

so that it is enthalpy that is conserved in this flow process.

Fig. 27.2 A throttling process.

Since we are now interested in how much the gas changes temperature when we reduce its pressure at constant enthalpy, we define the Joule–Kelvin coefficient by

$$\mu_{JK} = \left(\frac{\partial T}{\partial p}\right)_H. \tag{27.16}$$

This can be transformed using the reciprocity theorem (eqn C.42) and the definition of $C_p$ to give

$$\mu_{JK} = -\left(\frac{\partial T}{\partial H}\right)_p \left(\frac{\partial H}{\partial p}\right)_T = -\frac{1}{C_p}\left(\frac{\partial H}{\partial p}\right)_T. \tag{27.17}$$

Now the relation dH = T dS + V dp implies that

$$\left(\frac{\partial H}{\partial p}\right)_T = T\left(\frac{\partial S}{\partial p}\right)_T + V, \tag{27.18}$$

and using a Maxwell relation (eqn 16.54) this becomes

$$\left(\frac{\partial H}{\partial p}\right)_T = -T\left(\frac{\partial V}{\partial T}\right)_p + V, \tag{27.19}$$

and hence

$$\mu_{JK} = \frac{1}{C_p}\left[T\left(\frac{\partial V}{\partial T}\right)_p - V\right]. \tag{27.20}$$

The change in temperature for a gas following a Joule–Kelvin expansion from pressure $p_1$ to pressure $p_2$ is given by

$$\Delta T = \int_{p_1}^{p_2} \frac{1}{C_p}\left[T\left(\frac{\partial V}{\partial T}\right)_p - V\right] dp. \tag{27.21}$$

Since dH = T dS + V dp = 0, the entropy change is

$$\Delta S = -\int_{p_1}^{p_2} \frac{V}{T}\, dp, \tag{27.22}$$

and for an ideal gas this is $R\ln(p_1/p_2) > 0$. Thus this is an irreversible process.

Whether the Joule–Kelvin expansion results in heating or cooling is more subtle and in fact $\mu_{JK}$ can take either sign. It is convenient to consider when $\mu_{JK}$ changes sign, and this will occur when $\mu_{JK} = 0$, i.e. when $T(\partial V/\partial T)_p - V = 0$, or equivalently

$$\left(\frac{\partial V}{\partial T}\right)_p = \frac{V}{T}. \tag{27.23}$$

Fig. 27.3 The inversion curve of the van der Waals gas is shown as the heavy dashed line. The isenthalps (lines of constant enthalpy) are shown as thin solid lines. When the gradients of the isenthalps on this diagram are positive, then cooling can be obtained when pressure is reduced at constant enthalpy (i.e. in a Joule–Kelvin expansion). Also shown (as a solid line near the bottom left-hand corner of the graph which terminates at the dot) is the line of coexistence of liquid and gas (from Fig. 26.6) ending in the critical point (p = pc , T = Tc , shown by the dot).

Table 27.1 The maximum inversion temperature in kelvin for several gases.

Gas                                 4He    H2     N2     Ar     CO2
Maximum inversion temperature (K)   43     204    607    794    1275

This equation defines the so-called inversion curve in the T–p plane. This is plotted for the van der Waals gas in Fig. 27.3 as a heavy solid line. The lines of constant enthalpy are also shown and their gradients change sign when they cross the inversion curve. When the gradients of the isenthalps on this diagram are positive, cooling can be obtained when pressure is reduced at constant enthalpy (i.e. in a Joule–Kelvin expansion). A crucial parameter is the maximum inversion temperature, below which the Joule–Kelvin expansion can result in cooling. Values are listed for several real gases in Table 27.1. In the case of helium, this temperature is 43 K, so helium gas must be cooled to below this temperature by some other means before it can be liquefied using the Joule–Kelvin process.
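For the van der Waals gas the inversion curve can be traced explicitly. The sketch below is a worked illustration rather than the construction used for Fig. 27.3: combining eqn 27.23 with the reduced equation of state (eqn 26.63) and writing $x = 1/\tilde{V}$ gives, after some algebra, $\tilde{T} = \frac{27}{4}(1 - x/3)^2$ and $\tilde{p} = 18x - 9x^2$ along the curve. These closed forms, the parametrization by x, and the function names are assumptions of this sketch, and the code checks them against eqn 27.23 directly.

```python
def inversion_T(x):
    """Reduced temperature on the van der Waals inversion curve; x = V_c/V."""
    return 6.75 * (1.0 - x / 3.0) ** 2

def inversion_p(x):
    """Reduced pressure on the van der Waals inversion curve; x = V_c/V."""
    return 18.0 * x - 9.0 * x ** 2

def inversion_residual(T_red, V_red):
    """T*(dV/dT)_p - V in reduced units, which must vanish on the curve (eqn 27.23).

    Uses (dV/dT)_p = -(dp/dT)_V / (dp/dV)_T with the reduced equation of state
    p = 8T/(3V - 1) - 3/V^2.
    """
    dp_dT = 8.0 / (3.0 * V_red - 1.0)
    dp_dV = -24.0 * T_red / (3.0 * V_red - 1.0) ** 2 + 6.0 / V_red ** 3
    return T_red * (-dp_dT / dp_dV) - V_red

if __name__ == "__main__":
    # The curve has positive pressure for 0 < x < 2.
    for x in (0.05, 0.5, 1.0, 1.5, 1.95):
        T_red, p_red = inversion_T(x), inversion_p(x)
        print(x, T_red, p_red, inversion_residual(T_red, 1.0 / x))
```

In this model the curve meets the temperature axis at $\tilde{T} = 27/4$ (the maximum inversion temperature) and at $\tilde{T} = 3/4$, and the largest pressure on it is $\tilde{p} = 9$, reached at $\tilde{T} = 3$. Real gases follow these numbers only roughly.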

27.4 Liquefaction of gases

Fig. 27.4 A schematic diagram of a liquefier.

Fig. 27.5 A block diagram of the liquefaction process. [Diagram labels: compressed gas entering the liquefier; exhaust gas and liquid (mass fraction y) leaving.]

For achieving the liquefaction of gases, the Joule–Kelvin process is extremely useful, though it must be carried out below the maximum inversion temperature of the particular gas in question. A schematic diagram of a liquefier is shown in Fig. 27.4. High-pressure gas is forced through a throttle valve, resulting in cooling by the Joule–Kelvin process. Low-pressure gas plus liquid results, and the process is made more efficient by use of a counter-current heat exchanger by which the outgoing cold low-pressure gas is used to precool the incoming warm high-pressure gas, helping to ensure that by the time it reaches the throttle valve the incoming high-pressure gas is already as cool as possible and at least at a temperature such that the Joule–Kelvin effect will result in cooling. We can consider the liquefier as a ‘black box’ into which you put 1 kg of warm gas and get out y kg of liquid, as well as (1 − y) kg of exhaust gas (see Fig. 27.5). The variable y is the efficiency of the liquefier, i.e. the mass fraction of incoming gas which is liquefied. Since enthalpy is conserved in a Joule–Kelvin process, we have that

$$h_i = y h_L + (1 - y) h_f, \tag{27.24}$$

where $h_i$ is the specific enthalpy of the incoming gas, $h_L$ is the specific enthalpy of the liquid, and $h_f$ is the specific enthalpy of the outgoing gas. Hence the efficiency y is given by

$$y = \frac{h_f - h_i}{h_f - h_L}. \tag{27.25}$$
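The enthalpy balance of eqns 27.24 and 27.25 is simple enough to express as a small function. The sketch below is illustrative only: the function name is hypothetical and the enthalpy values in the usage lines are made-up round numbers chosen to show the arithmetic, not data for any real gas.

```python
def liquefaction_yield(h_in, h_out, h_liquid):
    """Mass fraction liquefied, y = (h_f - h_i)/(h_f - h_L), from eqn 27.25.

    All three specific enthalpies must be in the same units.
    """
    if h_out <= h_liquid:
        raise ValueError("exhaust-gas enthalpy must exceed the liquid enthalpy")
    return (h_out - h_in) / (h_out - h_liquid)

if __name__ == "__main__":
    # Hypothetical specific enthalpies in kJ/kg, for illustration only.
    h_in, h_out, h_liquid = 80.0, 90.0, 10.0
    y = liquefaction_yield(h_in, h_out, h_liquid)
    print("yield:", y)
    # Check the enthalpy balance of eqn 27.24.
    print("balance:", y * h_liquid + (1.0 - y) * h_out, "should equal", h_in)
```

Because $h_f$ and $h_L$ are fixed by the exhaust and liquid conditions, the function makes it plain that the only way to raise y is to lower the enthalpy of the incoming gas.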

For an efficient heat exchanger, the temperature of the compressed gas $T_i$ and the exhaust gas $T_f$ will be the same. We also have that $p_f$ = 1 atm, and $T_L$ is fixed (because the liquid will be in equilibrium with its vapour). Therefore $h_f$ and $h_L$ are fixed. The only parameter to vary is then $h_i$ and to maximize y we must minimize $h_i$, i.e.

$$\left(\frac{\partial h_i}{\partial p_i}\right)_{T_i} = 0, \tag{27.26}$$

and since, from eqn 27.17, $(\partial h/\partial p)_T = -c_p\,\mu_{JK}$ (where $c_p$ is the specific heat capacity at constant pressure), we therefore require that

$$\mu_{JK} = 0. \tag{27.27}$$

This means that it is best to work the liquefier right on the inversion curve ($\mu_{JK} = 0$) for maximum efficiency. Most gases could be liquefied by the end of the nineteenth century, but the modern form of gas liquefier dates back to the work of the German chemist Karl von Linde (1842–1934) who commercialized liquid-air production in 1895 using the Joule–Kelvin effect with a counter-current heat exchanger (as shown in Fig. 27.4; this is known as the Linde process) and discovered various uses of liquid nitrogen. James Dewar (1842–1923) was the first to liquefy hydrogen using the Linde process in 1898, and solidified it in 1899. Dewar was also the first, in 1891, to study the magnetic properties of liquid oxygen. The Dutch physicist Heike Kamerlingh Onnes (1853–1926) was the first to produce liquid helium in 1908 by a similar process, precooling the helium gas using liquid hydrogen. Using liquid helium, he then discovered superconductivity in 1911, and was awarded the Nobel Prize in 1913 for ‘his investigations on the properties of matter at low temperatures which led, inter alia, to the production of liquid helium’.

Chapter summary

• The Joule expansion results in cooling for non-ideal gases because of the attractive interactions between molecules.

• The entropy of a gas depends on the non-zero size of molecules.

• The Joule–Kelvin expansion is a steady flow process in which enthalpy is conserved. It can result in either warming or cooling of a gas. It forms the basis of many gas liquefaction techniques.

Exercises

(27.1) (a) Derive the following general relations:

(a) $\left(\dfrac{\partial T}{\partial V}\right)_U = -\dfrac{1}{C_V}\left[T\left(\dfrac{\partial p}{\partial T}\right)_V - p\right]$,

(b) $\left(\dfrac{\partial T}{\partial V}\right)_S = -\dfrac{1}{C_V}\, T\left(\dfrac{\partial p}{\partial T}\right)_V$,

(c) $\left(\dfrac{\partial T}{\partial p}\right)_H = \dfrac{1}{C_p}\left[T\left(\dfrac{\partial V}{\partial T}\right)_p - V\right]$.

In each case the quantity on the left-hand side is the appropriate thing to consider for a particular type of expansion. State what type of expansion each refers to.

(b) Using these relations, verify that for an ideal gas $(\partial T/\partial V)_U = 0$ and $(\partial T/\partial p)_H = 0$, and that $(\partial T/\partial V)_S$ leads to the familiar relation $pV^\gamma$ = constant along an isentrope.

(27.2) In a Joule–Kelvin liquefier, gas is cooled by expansion through an insulated throttle, a simple but inefficient process with no moving parts at low temperature. Explain why enthalpy is conserved in this process. Deduce that

$$\left(\frac{\partial T}{\partial P}\right)_H = \frac{1}{C_P}\left[T\left(\frac{\partial V}{\partial T}\right)_P - V\right].$$

Estimate the highest starting temperature at which the process will work for helium at low densities, on the following assumptions: (i) the pressure is given at low densities by a virial expansion of the form

$$\frac{PV}{RT} = 1 + \left(b - \frac{a}{RT}\right)\frac{1}{V} + \cdots,$$

and (ii) the Boyle temperature a/bR (the temperature at which the second virial coefficient vanishes) is known from experiment to be 19 K for helium.

[Hint: One method of solving this problem is to remember that p is easily made the subject of the equation of state and one can then use $(\partial V/\partial T)_p = -(\partial p/\partial T)_V/(\partial p/\partial V)_T$.]

(27.3) For a gas obeying Dieterici’s equation of state $p(V - b) = RT\,e^{-a/RTV}$, for 1 mole, prove that the equation of the inversion curve is

$$p = \left(\frac{2a}{b^2} - \frac{RT}{b}\right)\exp\left(\frac{1}{2} - \frac{a}{RTb}\right),$$

and hence find the maximum inversion temperature $T_{\max}$.

(27.4) Show that the equation for the inversion curve of the Dieterici gas in reduced units is

$$\tilde{P} = (8 - \tilde{T})\exp\left[\frac{5}{2} - \frac{4}{\tilde{T}}\right],$$

and sketch it in the $\tilde{T}$–$\tilde{P}$ plane.

(27.5) Why is enthalpy conserved in steady flow processes? A helium liquefier in its final stage of liquefaction takes in compressed helium gas at 14 K, liquefies a fraction α, and rejects the rest at 14 K and atmospheric pressure. Use the values of enthalpy H of helium gas at 14 K as a function of pressure p in the table below to determine the input pressure which allows α to take its maximum value, and determine what this value is.

p (atm)          0      10     20     30     40
H (kJ kg$^{-1}$)     87.4   78.5   73.1   71.8   72.6

[Enthalpy of liquid helium at atmospheric pressure = 10.1 kJ kg$^{-1}$.]

28 Phase transitions

In this chapter we will consider phase transitions, in which one thermodynamic phase changes into another. An example would be the transition from liquid water to gaseous steam which occurs when you boil a kettle of water. If you start with cold water, and warm it in the kettle, all that happens initially is that the water gets progressively hotter. However, when the temperature of the water reaches 100 °C interesting things begin to happen. Bubbles of gas form with different sizes, making the kettle considerably noisier, and water molecules begin to leave the liquid surface in large quantities and steam is emitted. The transition between different phases is very sudden. It is only when the boiling point is reached that liquid water becomes thermodynamically unstable and gaseous water, steam, becomes thermodynamically stable. In this chapter, we will look in detail at the thermodynamics of this and other phase transitions.

28.1 Latent heat
28.2 Chemical potential and phase changes
28.3 The Clausius–Clapeyron equation
28.4 Stability & metastability
28.5 The Gibbs phase rule
28.6 Colligative properties
28.7 Classification of phase transitions
Chapter summary
Further reading
Exercises

28.1 Latent heat

To increase the temperature of a substance, one needs to apply heat, and how much heat is needed can be calculated from the heat capacity because adding heat to the substance increases its entropy. The gradient of entropy with temperature is related to the heat capacity via

$$C_x = T\left(\frac{\partial S}{\partial T}\right)_x, \tag{28.1}$$

where x is the appropriate constraint (e.g. p, V, B etc). Now consider two phases which are in thermodynamic equilibrium at a critical temperature $T_c$. Very often, it is found that to change from phase 1 to phase 2 at a constant temperature $T_c$, you need to supply some extra heat, known as the latent heat L, which is given by

$$L = \Delta Q_{\rm rev} = T_c(S_2 - S_1), \tag{28.2}$$

where $S_1$ is the entropy of phase 1 and $S_2$ is the entropy of phase 2. This, together with eqn 28.1, implies that there will be a spike in the heat capacity $C_x$ as a function of temperature.

An example of a phase transition which involves a latent heat is the liquid–gas transition. The entropy as a function of temperature for H2O is shown in Fig. 28.1. The entropy is shown to change discontinuously at the phase transition. The heat capacity1 $C_p$ of the liquid phase, water, is about 75 J K$^{-1}$ mol$^{-1}$ (equivalent to about 4.2 kJ kg$^{-1}$ K$^{-1}$) at temperatures below the boiling point $T_b$, and this is responsible for the gradient of S(T) below the transition (because $\Delta S = \int C_p\, dT/T$), while the heat capacity of the gaseous phase, steam, is about 34 J K$^{-1}$ mol$^{-1}$, and this is responsible for the gradient in S(T) above the transition. The sudden, discontinuous change in S which occurs at $T_b$ is a jump of magnitude $L/T_b$, where L is the latent heat, equal to 40.7 kJ mol$^{-1}$ (or equivalently 2.26 MJ kg$^{-1}$).

1 We use $C_p$ because the constraint usually applied in the laboratory is that of constant pressure.

Fig. 28.1 The entropy of H2O as a function of temperature. The boiling point is $T_b$ = 373 K.

Example 28.1 If it takes 3 minutes to boil a kettle of water which was initially at 20 °C, how much longer will it take to boil the kettle dry?

Solution: Using the data above, the energy required to raise water from 20 °C to 100 °C is 80 × 4.2 = 336 kJ kg$^{-1}$. The energy required to turn it into steam at 100 °C is 2.26 MJ kg$^{-1}$, which is 6.7 times as big. Therefore it would take 6.7 × 3 ≈ 20 minutes to boil the kettle dry (though, of course, having an automatic switch-off mechanism saves this from happening!).
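The arithmetic of Example 28.1 can be written out as a short script. This is a minimal sketch using the specific heat capacity and latent heat quoted above; it assumes (as the example implicitly does) that the kettle delivers constant power, that no heat is lost, and that the heat capacity is constant over the range. The function and constant names are hypothetical.

```python
SPECIFIC_HEAT_WATER = 4.2   # kJ kg^-1 K^-1, value quoted in the text
LATENT_HEAT_WATER = 2260.0  # kJ kg^-1, i.e. 2.26 MJ kg^-1 as quoted in the text

def boil_dry_time(heating_minutes, T_start=20.0, T_boil=100.0):
    """Extra time (minutes) to boil the kettle dry, given the time taken to reach boiling.

    At constant power, time is proportional to energy, so the mass of water cancels.
    """
    heating_energy = SPECIFIC_HEAT_WATER * (T_boil - T_start)  # kJ per kg
    return heating_minutes * LATENT_HEAT_WATER / heating_energy

if __name__ == "__main__":
    print("energy to reach boiling, kJ/kg:", SPECIFIC_HEAT_WATER * 80.0)
    print("extra minutes to boil dry:", boil_dry_time(3.0))
```

The mass of water never needs to be specified, which is why the example can be answered without knowing how full the kettle is.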

Let us now perform a rough estimate for the entropy discontinuity at a vapour2–liquid transition. The number of microstates Ω available to a single gas molecule is proportional to its volume,3 and hence the ratio of Ω for one mole of vapour and one mole of liquid is

$$\frac{\Omega_{\rm vapour}}{\Omega_{\rm liquid}} = \left(\frac{V_{\rm vapour}}{V_{\rm liquid}}\right)^{N_A}. \tag{28.3}$$

Hence

$$\frac{\Omega_{\rm vapour}}{\Omega_{\rm liquid}} = \left(\frac{\rho_{\rm liquid}}{\rho_{\rm vapour}}\right)^{N_A} \sim (10^3)^{N_A}, \tag{28.4}$$

since the density of the vapour is roughly $10^3$ times smaller than the density of the liquid. Hence, using $S = k_{\rm B}\ln\Omega$ (and remembering that $R = N_A k_{\rm B}$), we have that the entropy discontinuity is approximately

$$\Delta S = \Delta(k_{\rm B}\ln\Omega) = k_{\rm B}\ln(10^3)^{N_A} = R\ln 10^3 \approx 7R, \tag{28.5}$$

so that

$$L \approx 7RT_b. \tag{28.6}$$

2 The word vapour is a synonym for gas, but is often used when conditions are such that the substance in the gas can also exist as a liquid or solid; if $T < T_c$, the vapour can be condensed into a liquid or solid with the application of pressure.

3 Recall from Section 21.1 that one microstate occupies a volume in k-space equal to $(2\pi/L)^3 \propto 1/V$, and hence the density of states is proportional to the system volume V.

(28.6)

This relationship is known as Trouton’s rule, and is an empirical relationship which has been noticed for many systems, although it is usually stated with a slightly diﬀerent prefactor: L ≈ 10RTb .

(28.7)

The fact that the latent heat is slightly larger than expected from our simple argument stems from the fact that the latent heat also involves a contribution from the attractive intermolecular potential. However, the law of corresponding states4 implies that if substances have similar shaped intermolecular potentials then certain properties should scale in the same way, so we do expect L/RTb to be a constant.

4 See Section 26.4.

Substance   Tb (K)    L (kJ mol$^{-1}$)   L/RTb
Ne          27.1      1.77            7.85
Ar          87.3      6.52            8.98
Kr          119.8     9.03            9.06
Xe          165.0     12.64           9.21
He          4.22      0.084           2.39
H2O         373.15    40.7            13.1
CH4         111.7     8.18            8.80
C6H6        353.9     30.7            10.5

Table 28.1 The values of $T_b$, L and $L/RT_b$ for several common substances.

This can be tested for various real substances (see Table 28.1) and indeed it is found that for many substances the ratio L/RTb is around 8– 10, conﬁrming Trouton’s empirical rule. Notable outliers include helium (He) for which quantum eﬀects are very important (see Chapter 30) and water5 (H2 O), which is a polar liquid (because the water molecule has a dipole moment) and which therefore possesses a rather diﬀerent intermolecular potential.

5

Water being a special case has a lot of consequences; see, for example, Section 37.3.
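The ratios in Table 28.1 can be recomputed from the tabulated boiling points and latent heats, and set beside the simple estimate $\ln 10^3 \approx 7$ of eqn 28.5. The sketch below uses only the numbers in the table; the value of R, the data structure and the function name are choices made for this illustration.

```python
import math

R = 8.314  # molar gas constant, J K^-1 mol^-1

# (boiling point in K, latent heat in kJ/mol, tabulated L/(R Tb)) from Table 28.1
TABLE_28_1 = {
    "Ne": (27.1, 1.77, 7.85),
    "Ar": (87.3, 6.52, 8.98),
    "Kr": (119.8, 9.03, 9.06),
    "Xe": (165.0, 12.64, 9.21),
    "He": (4.22, 0.084, 2.39),
    "H2O": (373.15, 40.7, 13.1),
    "CH4": (111.7, 8.18, 8.80),
    "C6H6": (353.9, 30.7, 10.5),
}

def trouton_ratio(T_b, L_kJ_per_mol):
    """Dimensionless ratio L/(R Tb), with L supplied in kJ/mol."""
    return 1000.0 * L_kJ_per_mol / (R * T_b)

if __name__ == "__main__":
    print("simple estimate ln(10^3):", math.log(1.0e3))
    for name, (T_b, L, tabulated) in TABLE_28_1.items():
        print(name, round(trouton_ratio(T_b, L), 2), "tabulated:", tabulated)
```

The recomputed ratios agree with the tabulated column to within rounding of the inputs, and helium and water stand out from the rest exactly as the text describes.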

28.2 Chemical potential and phase changes

We have seen in Section 16.5 that the Gibbs function is the quantity that must be minimized when systems are held at constant pressure and temperature. In Section 22.5 we found that the chemical potential is the Gibbs function per particle. We were also able to write in eqn 22.52 that

$$dG = V\,dp - S\,dT + \sum_i \mu_i\, dN_i. \tag{28.8}$$

Now consider the situation in Fig. 28.2 in which $N_1$ particles of phase 1 are in equilibrium with $N_2$ particles of phase 2. Then the total Gibbs free energy is

$$G_{\rm tot} = N_1\mu_1 + N_2\mu_2, \tag{28.9}$$

and since we are in equilibrium we must have

$$dG_{\rm tot} = 0, \tag{28.10}$$

and hence

$$dG_{\rm tot} = dN_1\,\mu_1 + dN_2\,\mu_2 = 0. \tag{28.11}$$

But if we increase the number of particles in phase 1, the number of particles in phase 2 must decrease by the same amount, so that $dN_1 = -dN_2$. Hence we have that

$$\mu_1 = \mu_2. \tag{28.12}$$

Thus in phase equilibrium, each coexisting phase has the same chemical potential. Away from coexistence, the phase with the lowest µ is the stable one. Along a line of coexistence $\mu_1 = \mu_2$.

Fig. 28.2 Two phases in equilibrium at constant pressure (the constraint of constant pressure is maintained by the piston).

Fig. 28.3 Two phases in the p–T plane coexist at the phase boundary, shown by the solid line.

28.3 The Clausius–Clapeyron equation

We now want to find the equation which describes the phase boundary in the p–T plane (see Fig. 28.3). This line of coexistence of the two phases is determined by the equation
$$\mu_1(p, T) = \mu_2(p, T). \tag{28.13}$$
If we move along this phase boundary, we must also have
$$\mu_1(p + dp, T + dT) = \mu_2(p + dp, T + dT), \tag{28.14}$$
so that when we change p to p + dp and T to T + dT we must have
$$d\mu_1 = d\mu_2. \tag{28.15}$$
This implies that (using eqns 16.22 and 22.48)
$$-s_1\,dT + v_1\,dp = -s_2\,dT + v_2\,dp, \tag{28.16}$$
where $s_1$ and $s_2$ are the entropy per particle in phases 1 and 2, and $v_1$ and $v_2$ are the volume per particle in phases 1 and 2. Rearranging this equation therefore gives that
$$\frac{dp}{dT} = \frac{s_2 - s_1}{v_2 - v_1}. \tag{28.17}$$
If we define the latent heat per particle as $l = T\Delta s$, we then have that
$$\frac{dp}{dT} = \frac{l}{T(v_2 - v_1)}, \tag{28.18}$$
or equivalently
$$\frac{dp}{dT} = \frac{L}{T(V_2 - V_1)}, \tag{28.19}$$
which is known as the Clausius–Clapeyron equation. This shows that the gradient of the phase boundary in the p–T plane is purely determined by the latent heat, the temperature at the phase boundary and the difference in volume between the two phases.⁶

⁶ This can be obtained from the difference in densities.

Example 28.2
Derive an equation for the phase boundary of the liquid and gas phases under the assumptions that the latent heat L is temperature independent, that the vapour can be treated as an ideal gas, and that $V_{\rm vapour} = V \gg V_{\rm liquid}$.
Solution: Assuming that $V_{\rm vapour} = V \gg V_{\rm liquid}$ and that $pV = RT$ for one mole, the Clausius–Clapeyron equation becomes
$$\frac{dp}{dT} = \frac{Lp}{RT^2}. \tag{28.20}$$
This can be rearranged to give
$$\frac{dp}{p} = \frac{L\,dT}{RT^2}, \tag{28.21}$$
and hence integrating we obtain
$$\ln p = -\frac{L}{RT} + \text{constant}. \tag{28.22}$$
Hence the equation of the phase boundary is
$$p = p_0 \exp\left(-\frac{L}{RT}\right), \tag{28.23}$$
where the exponential looks like a Boltzmann factor $e^{-\beta l}$ with $l = L/N_A$, the latent heat per particle. (Remember again that $R = N_A k_B$.)
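As a numerical illustration of eqn 28.23, the sketch below fixes the constant $p_0$ by requiring the curve to pass through a known reference point. The values of L and Tb for water are taken from Table 28.1; the gas constant and the choice of one standard atmosphere as the reference pressure are assumptions, and the function names are illustrative only.

```python
import math

R = 8.314          # J mol^-1 K^-1 (assumed value)
L_WATER = 40.7e3   # J mol^-1, from Table 28.1
T_BOIL = 373.15    # K, from Table 28.1
P_ATM = 101325.0   # Pa, reference pressure at T_BOIL (assumed: one standard atmosphere)


def vapour_pressure(T, L=L_WATER, T_ref=T_BOIL, p_ref=P_ATM):
    """Eqn 28.23 with p0 eliminated: p = p_ref * exp(-(L/R) * (1/T - 1/T_ref))."""
    return p_ref * math.exp(-(L / R) * (1.0 / T - 1.0 / T_ref))


def boundary_slope(T, L=L_WATER):
    """Gradient dp/dT of the phase boundary from eqn 28.20."""
    return L * vapour_pressure(T, L) / (R * T ** 2)


if __name__ == "__main__":
    for T in (333.15, 353.15, 373.15):
        print(f"T = {T:7.2f} K   p = {vapour_pressure(T) / 1e3:7.2f} kPa")
    print(f"dp/dT at Tb = {boundary_slope(T_BOIL):.0f} Pa/K")
```

With these inputs the model gives roughly 48 kPa at 353.15 K and a slope of about 3.6 kPa per kelvin at the normal boiling point. The constant-L assumption is what limits the accuracy further from Tb.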


The temperature dependence of the latent heat of many substances cannot be neglected. As an example, the temperature dependence of the latent heat of water is shown in Fig. 28.4 and this shows a weak temperature dependence. A method of treating this is outlined in the following example.

Fig. 28.4 The temperature dependence of the latent heat of water. The solid line is according to eqn 28.30.

Fig. 28.5 The phase boundary.

Example 28.3
Evaluate the temperature dependence of the latent heat along the phase boundary in a liquid–gas transition and hence deduce the equation of the phase boundary including this temperature dependence.
Solution: Along the phase boundary, we can write that the gradient in the temperature is (see Fig. 28.5)
$$\frac{d}{dT} = \left(\frac{\partial}{\partial T}\right)_p + \frac{dp}{dT}\left(\frac{\partial}{\partial p}\right)_T. \tag{28.24}$$
Hence, applying this to the quantity $\Delta S = S_v - S_L = L/T$, where the subscripts v and L refer to vapour and liquid respectively, we have that
$$\frac{d}{dT}\left(\frac{L}{T}\right) = \left(\frac{\partial(\Delta S)}{\partial T}\right)_p + \frac{dp}{dT}\left(\frac{\partial(\Delta S)}{\partial p}\right)_T = \frac{C_{pv} - C_{pL}}{T} + \left[\left(\frac{\partial S_v}{\partial p}\right)_T - \left(\frac{\partial S_L}{\partial p}\right)_T\right]\frac{dp}{dT}, \tag{28.25}$$
so that
$$\frac{d}{dT}\left(\frac{L}{T}\right) = \frac{C_{pv} - C_{pL}}{T} - \left[\frac{\partial}{\partial T}(V_v - V_L)\right]_p \frac{dp}{dT}. \tag{28.26}$$
Using $V_v \gg V_L$ and $pV_v = RT$, we have that
$$\frac{d}{dT}\left(\frac{L}{T}\right) = \frac{C_{pv} - C_{pL}}{T} - \frac{R}{p} \times \frac{Lp}{RT^2}, \tag{28.27}$$
and expanding
$$\frac{d}{dT}\left(\frac{L}{T}\right) = \frac{1}{T}\frac{dL}{dT} - \frac{L}{T^2} \tag{28.28}$$
yields
$$dL = (C_{pv} - C_{pL})\,dT, \tag{28.29}$$
so that
$$L = L_0 + (C_{pv} - C_{pL})T. \tag{28.30}$$
Thus the latent heat contains a linear temperature dependence and this is shown by the solid line in Fig. 28.4. The negative slope is due to the fact that $C_{pL} > C_{pv}$. Substituting this value of L into eqn 28.19 yields the equation of the phase boundary:
$$p = p_0 \exp\left(-\frac{L_0}{RT} + \frac{(C_{pv} - C_{pL})\ln T}{R}\right). \tag{28.31}$$
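One way to check eqn 28.31 is to integrate $d(\ln p)/dT = L(T)/RT^2$ numerically with the linear L(T) of eqn 28.30 and compare with the closed form. The sketch below does this with Simpson's rule. The heat-capacity difference used for water (about −41.7 J mol⁻¹ K⁻¹) is an assumed, illustrative figure, as is the gas constant; L at the boiling point is taken from Table 28.1, and $L_0$ is chosen so that eqn 28.30 reproduces it.

```python
import math

R = 8.314            # J mol^-1 K^-1 (assumed value)
DELTA_CP = -41.7     # Cpv - CpL in J mol^-1 K^-1 (assumed, illustrative value for water)
T_REF = 373.15       # K, normal boiling point from Table 28.1
L_REF = 40.7e3       # J mol^-1 at T_REF, from Table 28.1
L0 = L_REF - DELTA_CP * T_REF  # chosen so that eqn 28.30 gives L_REF at T_REF


def latent_heat(T):
    """Eqn 28.30: linear temperature dependence of the latent heat."""
    return L0 + DELTA_CP * T


def ln_p_ratio_closed(T, T_ref=T_REF):
    """ln(p/p_ref) from eqn 28.31, with p0 eliminated using a reference point."""
    return -(L0 / R) * (1.0 / T - 1.0 / T_ref) + (DELTA_CP / R) * math.log(T / T_ref)


def ln_p_ratio_numeric(T, T_ref=T_REF, steps=2000):
    """Integrate d(ln p)/dT = L(T)/(R T^2) from T_ref to T with Simpson's rule."""
    h = (T - T_ref) / steps  # steps must be even for Simpson's rule

    def integrand(x):
        return latent_heat(x) / (R * x * x)

    total = integrand(T_ref) + integrand(T)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * integrand(T_ref + i * h)
    return total * h / 3.0


if __name__ == "__main__":
    for T in (300.0, 330.0, 360.0):
        print(T, ln_p_ratio_closed(T), ln_p_ratio_numeric(T))
```

The two columns should agree to many significant figures, which confirms that eqn 28.31 is the integral of the Clausius–Clapeyron equation under the stated approximations. It says nothing about how well those approximations describe a real substance.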


We can also use the Clausius–Clapeyron equation to derive the phase boundary of the liquid–solid coexistence line, as shown in the following example.

Example 28.4
Find the equation in the p–T plane for the phase boundary between the liquid and solid phases of a substance.
Solution: The Clausius–Clapeyron equation (eqn 28.19) can be rearranged to give
$$dp = \frac{L\,dT}{T\,\Delta V}, \tag{28.32}$$
and neglecting the temperature dependence of L and ∆V, we find that this integrates to
$$p = p_0 + \frac{L}{\Delta V}\ln\left(\frac{T}{T_0}\right), \tag{28.33}$$
where $T_0$ and $p_0$ are constants such that $(T, p) = (T_0, p_0)$ is a point on the phase boundary. The volume change ∆V on melting is relatively small, so that the gradient of the phase boundary in the p–T plane is very steep.
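To get a feel for how steep the melting curve of eqn 28.33 is, the sketch below evaluates it for a hypothetical substance. All the material parameters are made-up round numbers chosen only to be of a plausible order of magnitude; they do not describe any particular material.

```python
import math

# Hypothetical material parameters (illustrative round numbers only).
L_FUSION = 2.0e5   # latent heat of melting, J kg^-1
DELTA_V = 1.0e-5   # specific volume change on melting (liquid minus solid), m^3 kg^-1
T0 = 300.0         # melting temperature at pressure P0, K
P0 = 101325.0      # reference pressure, Pa


def melting_pressure(T):
    """Eqn 28.33: p = p0 + (L/dV) ln(T/T0), with L and dV treated as constants."""
    return P0 + (L_FUSION / DELTA_V) * math.log(T / T0)


def melting_slope(T):
    """Eqn 28.32 rearranged: dp/dT = L / (T dV)."""
    return L_FUSION / (T * DELTA_V)


if __name__ == "__main__":
    print(f"slope at T0: {melting_slope(T0):.3e} Pa/K")
    print(f"pressure rise for +1 K: {melting_pressure(T0 + 1.0) - P0:.3e} Pa")
```

For these numbers the slope is several times 10⁷ Pa K⁻¹, so raising the melting point by one kelvin requires hundreds of atmospheres. This is why the solid–liquid line looks almost vertical on a phase diagram.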

A phase diagram of a hypothetical pure substance is shown in Fig. 28.6 and shows the solid, liquid and gaseous phases coexisting with the phase boundaries calculated from the Clausius–Clapeyron equation. The three phases coexist at the triple point. The solid–liquid phase boundary is very steep, reflecting the large change in entropy in going from liquid to solid and the very small change in volume. This phase boundary does not terminate, but continues indefinitely. By way of contrast, the phase boundary between liquid and gas terminates at the critical point, as we have seen in Section 26.1. (We will have more to say about this observation in Section 28.7.) Note also that, at temperatures close to the triple point, the latent heat of sublimation (changing from solid to gas) is equal to the sum of the latent heat of melting (solid→liquid)⁷ and the latent heat of vaporization (liquid→gas).⁸

Fig. 28.6 A schematic phase diagram of a (hypothetical) pure substance.

⁷ The latent heat of melting is sometimes known as the latent heat of fusion.

⁸ This fact will be used in Exercise 28.5.

Fig. 28.7 The phase diagram of H2O showing the solid (ice), liquid (water) and gaseous (steam) phases. The horizontal dashed line corresponds to atmospheric pressure, and the normally experienced freezing and boiling points of water are indicated by the open circles.

Fig. 28.8 Schematic diagram of hydrogen bonding in water.

⁹ A hydrogen bond is a weak attractive interaction between a hydrogen atom and a strongly electronegative atom such as oxygen or nitrogen. The electron cloud around the hydrogen nucleus is attracted by the electronegative atom and leaves the hydrogen with a partial positive charge, and because of its size this results in a large charge density. Hydrogen bonding is responsible for the linking of the base pairs in DNA and the structure of many proteins. It is also responsible for the high boiling point of water (which given its low molecular mass would be expected to boil at much lower temperatures than it does).

The gradient of the liquid–solid coexistence line is normally positive because most substances expand when they melt. A notable counterexample is water, which slightly shrinks when it melts. Hence, the gradient of the ice-water coexistence line is negative (see Fig. 28.7; because the solid-liquid line is so steep, it is not easy to see that the slope is negative). This eﬀect occurs because of the hydrogen bonding9 in water (see Fig. 28.8) which results in a rather open structure of the ice crystal lattice. This collapses on melting, resulting in a slightly denser liquid. This result has many consequences: for example, icebergs ﬂoat on the ocean and ice cubes ﬂoat in your gin and tonic. The pressure dependence of the coexistence line means that pressing ice can cause it to melt, an eﬀect which is responsible for the movement of glaciers, which can press against rock, melt near the region of contact with the rock, and slowly creep downhill.

28.4


Stability & metastability

We have seen in Section 28.2 that the phase with the lowest chemical potential µ is the most stable. Let us see how the phase transition varies as a function of pressure. Since µ is the Gibbs function per particle, eqn 16.24 implies that
$$\left(\frac{\partial\mu}{\partial p}\right)_T = v, \tag{28.34}$$
where v is the volume per particle. Since v > 0, the gradient of the chemical potential with pressure must always be positive. The behaviour of µ as a function of pressure as one crosses the phase transition between the liquid and gas phases is shown in Fig. 28.9. This figure shows that the phase which is stable at the highest pressure must therefore have the smallest volume. This of course makes sense since, when you apply large pressure, you expect the smallest space-occupying phase to be the most stable.

Fig. 28.9 The chemical potential as a function of pressure.

We can also think about µ as a function of temperature. Equation 16.23 implies that
$$\left(\frac{\partial\mu}{\partial T}\right)_p = -s, \tag{28.35}$$
where s is the entropy per particle. Since s > 0, the gradient of µ as a function of temperature must always be negative. The behaviour of µ as a function of temperature as you cross the phase transition between the liquid and gas phases is shown in Fig. 28.10. This figure shows that the phase which is stable at the highest temperature must therefore have the highest entropy. This makes sense because G = H − TS and so at higher temperature, you minimize G by maximizing S.

Fig. 28.10 The chemical potential as a function of temperature.

This also shows that as you warm a substance through its boiling point, it is possible to momentarily continue on the curve corresponding to $\mu_{\rm liq}$ and to form superheated liquid, which is a metastable state. Although for temperatures above the boiling point it is the gaseous state which is thermodynamically stable (i.e. has the lowest Gibbs function), there may be reasons why this state cannot be formed immediately and the liquid state persists. Similarly, if you cool a gas below the boiling point, it is possible to momentarily continue on the curve corresponding to $\mu_{\rm gas}$ and to form supercooled vapour, which is a metastable state. Again, this is not the thermodynamically stable state of the system but there may be reasons why the liquid state cannot nucleate immediately and the gaseous state persists.

Let us now try and fathom the reason why the thermodynamically most stable state sometimes doesn't form. Consider a liquid with pressure $p_{\rm liq}$ in equilibrium with a vapour at pressure p. The chemical potentials of the liquid and vapour must be equal. Now imagine that the liquid pressure slightly increases to $p_{\rm liq} + dp_{\rm liq}$. If the vapour is still in equilibrium with the liquid, then its pressure must increase to p + dp and we must have that
$$\left(\frac{\partial\mu_{\rm liq}}{\partial p_{\rm liq}}\right)_T dp_{\rm liq} = \left(\frac{\partial\mu_{\rm vap}}{\partial p}\right)_T dp, \tag{28.36}$$

so that the chemical potentials of the liquid and vapour are still equal. Using eqn 28.34, this implies that
$$v_{\rm liq}\,dp_{\rm liq} = v_{\rm vap}\,dp, \tag{28.37}$$
where $v_{\rm liq}$ is the volume per particle occupied by the liquid and $v_{\rm vap}$ is the volume per particle occupied by the gas. Hence multiplying this by $N_A$ and using pV = RT for one mole of the gas, we find that
$$V_{\rm liq}\,dp_{\rm liq} = RT\,\frac{dp}{p}, \tag{28.38}$$
where $V_{\rm liq}$ is the molar volume of the liquid. We can use this to find the dependence of the vapour pressure¹⁰ on the pressure in the liquid at constant temperature. Integrating eqn 28.38 (treating $V_{\rm liq}$ as constant) leads to
$$p = p_0 \exp\left(\frac{V_{\rm liq}\,\Delta p_{\rm liq}}{RT}\right), \tag{28.39}$$
where $\Delta p_{\rm liq}$ is the extra pressure applied to the liquid, $p_0$ is the vapour pressure of the gas with no excess pressure applied to the liquid and p is the vapour pressure of the gas with excess pressure $\Delta p_{\rm liq}$ in the liquid.

¹⁰ The vapour pressure of a liquid (or a solid) is the pressure of vapour in equilibrium with the liquid (or the solid).

This result can be used to derive the vapour pressure of a droplet of liquid. Recall from eqn 17.18 that the excess pressure in a droplet of liquid of radius r can be obtained as
$$\Delta p_{\rm liq} = \frac{2\gamma}{r}, \tag{28.40}$$
where γ is the surface tension. Hence we find that
$$p = p_0 \exp\left(\frac{2\gamma V_{\rm liq}}{rRT}\right), \tag{28.41}$$
which is known as Kelvin's formula. This formula shows that small droplets have a very high vapour pressure, and this gives some understanding about why the vapour sometimes doesn't condense when you cool it through the boiling temperature. Small droplets initially begin to nucleate, but have a very high vapour pressure and therefore instead of growing can evaporate. This stabilises the vapour, even though the liquid is the thermodynamically stable phase. The thermodynamic driving force to condense is overcome by the tendency to evaporate. This effect occurs very often in the atmosphere, which contains water vapour which has risen to an altitude where it is sufficiently cold to condense into water droplets, but the droplets cannot form owing to this tendency to evaporate. Clouds do form through the nucleation of droplets on minute dust particles which have sufficient surface area for the liquid to condense and then grow above the critical size.

A similar effect occurs for superheated liquids. The pressure of liquid near a vapour-filled cavity of radius r is less than that in the bulk liquid according to
$$\Delta p_{\rm liq} = -\frac{2\gamma}{r}, \tag{28.42}$$
and hence the vapour pressure inside the cavity follows:
$$p = p_0 \exp\left(-\frac{2\gamma V_{\rm liq}}{rRT}\right). \tag{28.43}$$
Thus the vapour pressure inside a cavity is lower than one might expect. As you boil a liquid, any bubble of vapour which does form tends to collapse. This means the liquid can become superheated and kinetically stable above its boiling point, even though the vapour is the true thermodynamic ground state. The only bubbles which then do survive are very large ones, and this causes the violent bumping which can be observed in boiling liquids. This can be avoided by boiling liquids with small pieces of glass or ceramic, so that there are plenty of nucleation centres for small bubbles to form.
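The size dependence in eqns 28.41 and 28.43 is easiest to appreciate with numbers. The sketch below evaluates the ratio $p/p_0$ for water droplets and cavities of various radii. The surface tension, molar volume and temperature are assumed, approximate room-temperature values for water (none is quoted in this section), and the function name is illustrative.

```python
import math

R = 8.314        # J mol^-1 K^-1 (assumed value)
GAMMA = 0.072    # surface tension of water, N m^-1 (assumed, approximate)
V_LIQ = 1.8e-5   # molar volume of liquid water, m^3 mol^-1 (assumed, approximate)
T = 293.0        # temperature, K (assumed)


def kelvin_ratio(r, cavity=False):
    """Return p/p0 for a droplet (eqn 28.41) or a vapour-filled cavity (eqn 28.43)."""
    exponent = 2.0 * GAMMA * V_LIQ / (r * R * T)
    return math.exp(-exponent if cavity else exponent)


if __name__ == "__main__":
    for r in (1e-9, 1e-8, 1e-7, 1e-6):
        print(f"r = {r:.0e} m  droplet p/p0 = {kelvin_ratio(r):.4f}"
              f"  cavity p/p0 = {kelvin_ratio(r, cavity=True):.4f}")
```

With these inputs a droplet of radius 1 nm has a vapour pressure almost three times that of a flat surface, whereas at 1 µm the enhancement is only about 0.1%. The effect therefore matters only at the very earliest stage of nucleation.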

Example 28.5 A bubble chamber is used in particle physics to detect electrically charged subatomic particles. It consists of a container ﬁlled with a superheated transparent liquid such as liquid hydrogen, at a temperature just below its boiling point. The motion of the charged particle is suﬃcient to nucleate a string of bubbles of vapour which display the track of the particle. A magnetic ﬁeld can be applied to the chamber so that the shape of the curved tracks of the particle can be used to infer its charge to mass ratio. Its invention in 1952 earned Donald Glaser (1926–) the 1960 Nobel Prize for Physics.

Example 28.6
Calculate the Gibbs function for a droplet of liquid of radius r (and hence surface area $A = 4\pi r^2$) in equilibrium with vapour. Assume the temperature is such that the liquid is the thermodynamically stable phase.
Solution: Writing the number of particles (of mass m) in the liquid and vapour as $N_{\rm liq}$ and $N_{\rm vap}$ respectively, the change in Gibbs function is
$$dG = \mu_{\rm liq}\,dN_{\rm liq} + \mu_{\rm vap}\,dN_{\rm vap} + \gamma\,dA, \tag{28.44}$$
where γ is the surface tension. Since particles must be conserved, $dN_{\rm vap} = -dN_{\rm liq}$. Differentiating $A = 4\pi r^2$ yields $dA = 8\pi r\,dr$, and the number of particles in the droplet changes by $dN_{\rm liq} = 4\pi r^2\rho_{\rm liq}\,dr/m$, where $\rho_{\rm liq}$ is the density of the liquid. Writing $\Delta\mu = \mu_{\rm vap} - \mu_{\rm liq}$ (which will be positive since the liquid is the thermodynamically stable phase) we have
$$dG = \left[8\pi\gamma r - \frac{4\pi r^2\,\Delta\mu\,\rho_{\rm liq}}{m}\right]dr. \tag{28.45}$$
This can be integrated to yield
$$G(r) = G(0) + 4\pi\gamma r^2 - \frac{4\pi\,\Delta\mu\,\rho_{\rm liq}}{3m}\,r^3, \tag{28.46}$$
and hence equilibrium is established when dG/dr = 0, and this occurs at the critical radius $r^*$ given by
$$r^* = \frac{2\gamma m}{\rho_{\rm liq}\,\Delta\mu}. \tag{28.47}$$
This function is sketched in Fig. 28.11 and shows that $r^*$ is indeed a stationary point, but is a maximum in G, not a minimum! Thus $r = r^*$ is a point of unstable equilibrium. If $r < r^*$, the system can minimize G by shrinking r to zero, i.e. the droplet evaporates. If $r > r^*$, the system can minimize G by the droplet growing to infinite size. This effect occurs as water condenses in a cloud. The large droplets keep the partial pressure of the water vapour low. The smaller droplets therefore evaporate and the water can transfer from the smaller to the larger droplets.

Fig. 28.11 The Gibbs function of the droplet as a function of radius, plotted in units of $G_0 = 16\pi\gamma^3 m^2/(3(\rho_{\rm liq}\Delta\mu)^2)$.
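The claims that $r^*$ is a maximum of G and that the barrier height equals the unit $G_0$ quoted in the caption of Fig. 28.11 can be checked directly. The sketch below does so in arbitrary units, with every parameter set to 1; these are illustrative values, not data for any real liquid.

```python
import math


def gibbs_droplet(r, gamma, delta_mu, rho, m):
    """G(r) - G(0) from eqn 28.46."""
    return 4.0 * math.pi * gamma * r ** 2 - 4.0 * math.pi * delta_mu * rho * r ** 3 / (3.0 * m)


def critical_radius(gamma, delta_mu, rho, m):
    """r* from eqn 28.47."""
    return 2.0 * gamma * m / (rho * delta_mu)


def barrier_height(gamma, delta_mu, rho, m):
    """G0 = 16 pi gamma^3 m^2 / (3 (rho delta_mu)^2), the unit used in Fig. 28.11."""
    return 16.0 * math.pi * gamma ** 3 * m ** 2 / (3.0 * (rho * delta_mu) ** 2)


# Arbitrary illustrative parameters (all set to 1 in arbitrary units).
PARAMS = dict(gamma=1.0, delta_mu=1.0, rho=1.0, m=1.0)
r_star = critical_radius(**PARAMS)
g_star = gibbs_droplet(r_star, **PARAMS)

if __name__ == "__main__":
    print("r*       =", r_star)
    print("G(r*)    =", g_star)
    print("G0       =", barrier_height(**PARAMS))
    for factor in (0.5, 0.9, 1.0, 1.1, 2.0):
        print(f"G({factor} r*) = {gibbs_droplet(factor * r_star, **PARAMS):.4f}")
```

The value of G on either side of $r^*$ is lower than at $r^*$, which is the numerical signature of the unstable equilibrium described in the example.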

28.5

The Gibbs phase rule

In this section, we want to find out how much freedom a system has to change its internal parameters while keeping the different substances in various combinations of phases in equilibrium with each other. We want to include the possibility of having mixtures of different substances, and we will call the different substances components. A component is a chemically independent constituent of the system. To keep track of the number of molecules in these different components, we introduce the mole fraction $x_i$, which is defined to be the ratio of the number of moles, $n_i$, of the ith substance to the total number of moles n, so that
$$x_i = \frac{n_i}{n}. \tag{28.48}$$
By definition, we have that
$$\sum_i x_i = 1. \tag{28.49}$$
Each of the components can be in different phases (where here we mean phases such as 'solid', 'liquid' and 'gas', but we might also wish to include other possibilities, such as 'ferromagnetic' and 'paramagnetic' phases, or 'superconducting' and 'non-superconducting' phases). We denote by the symbol F the number of degrees of freedom the system has while keeping the different phases in equilibrium, and it is this quantity we now want to calculate, following a method introduced by Gibbs.

Consider a multicomponent system, containing C components. Each component can be in any one of P different phases. The system is characterized by the intensive variables, the pressure p, the temperature T and the mole fractions of C − 1 of the components (we don't need all C of them, since $\sum_{i=1}^{C} x_i = 1$) for each of the P phases, so that is
$$2 + P(C - 1) \tag{28.50}$$
variables. If the phases of each component are in equilibrium with one another, then we must have, as i runs from 1 to C,
$$\mu_i(\text{phase 1}) = \mu_i(\text{phase 2}) = \cdots = \mu_i(\text{phase } P), \tag{28.51}$$
which gives us P − 1 equations to solve for each component, and hence C(P − 1) equations in total for the C components. The number of degrees of freedom F the system has is given by the difference between the number of variables and the number of constraining equations to solve. Hence F = [P(C − 1) + 2] − C(P − 1), and thus
$$F = C - P + 2, \tag{28.52}$$
which is known as the Gibbs phase rule.

Example 28.7
For a single-component system, C = 1 and hence F = 3 − P. Thus:
• If there is one phase, F = 2, and the whole p–T plane is accessible.
• If there are two phases, F = 1, and these two phases can only coexist at a common line of coexistence in the p–T plane.
• If there are three phases, F = 0, and these three phases can only coexist at a common point of coexistence in the p–T plane (the triple point).
For a two-component system, C = 2 and hence F = 4 − P. If we fix the pressure, then the number of remaining degrees of freedom is F′ = F − 1 = 3 − P. Having fixed the pressure, we have two variables which are temperature T and the mole fraction $x_1$ of the first component (the mole fraction of the second component being given by $1 - x_1$). Thus:
• If there is one phase, F′ = 2, and the whole $x_1$–T plane is accessible.
• If there are two phases, F′ = 1, and these two phases can only coexist at a common line of coexistence in the $x_1$–T plane.
• If there are three phases, F′ = 0, and these three phases can only coexist at a common point of coexistence in the $x_1$–T plane.
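The counting in Example 28.7 is simple enough to tabulate mechanically. The following sketch encodes eqn 28.52; the optional `fixed` argument, which subtracts any externally fixed intensive variables such as the pressure, is an illustrative convenience and not part of the rule as stated in the text.

```python
def degrees_of_freedom(components, phases, fixed=0):
    """Gibbs phase rule F = C - P + 2, minus any intensive variables held fixed."""
    if components < 1 or phases < 1:
        raise ValueError("need at least one component and one phase")
    f = components - phases + 2 - fixed
    if f < 0:
        raise ValueError("more coexisting phases than the phase rule allows")
    return f


if __name__ == "__main__":
    # Single-component system: the whole p-T plane, a line, then the triple point.
    for p in (1, 2, 3):
        print("C=1, P=%d: F=%d" % (p, degrees_of_freedom(1, p)))
    # Two-component system at fixed pressure.
    for p in (1, 2, 3):
        print("C=2, P=%d, pressure fixed: F'=%d" % (p, degrees_of_freedom(2, p, fixed=1)))
```

Asking for four coexisting phases in a single-component system raises an error, reflecting the fact that F cannot be negative.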

The Gibbs phase rule is of great use in interpreting complex phase diagrams of mixtures of substances.


28.6

Colligative properties

When a liquid of a particular material (we will call it A) has another species, B, dissolved in it, the chemical potential of A is decreased. The result of this is that the boiling point of the liquid A is elevated and the freezing point of the liquid is depressed compared to that of the pure liquid. These effects are known as colligative properties.¹¹ The magnitude of the effect can be worked out from the reduction of the chemical potential.

¹¹ The word colligative means a collection of things fastened together.

Example 28.8
Find the chemical potential of a solvent A with a solute B dissolved in it. The mole fraction of the solvent is $x_A$. (The liquid which is the main component is known as the solvent, while the material which is dissolved in it is known as the solute.)
Solution: Recall from eqn 22.65 that the chemical potential of a gas (let us call it a gas of molecules of A) with pressure $p_A^*$ is given by
$$\mu_A^{(g)*} = \mu_A^{\ominus} + RT\ln\frac{p_A^*}{p^{\ominus}}, \tag{28.53}$$
where the superscript (g) indicates gas and the superscript * indicates that we are dealing with a pure substance. If this is in equilibrium with the liquid form of A, then we also have
$$\mu_A^{(\ell)*} = \mu_A^{\ominus} + RT\ln\frac{p_A^*}{p^{\ominus}}, \tag{28.54}$$
where the superscript $(\ell)$ indicates liquid. Now imagine that we mix some B molecules into the liquid. The mole fraction of A, $x_A$, is now less than one. The chemical potential of A in the liquid is now still equal to the chemical potential of A in the gas, but the gas has a different vapour pressure $p_A$ (no asterisk because we are no longer dealing with pure substances). Thus
$$\mu_A^{(\ell)} = \mu_A^{(g)} = \mu_A^{\ominus} + RT\ln\frac{p_A}{p^{\ominus}}. \tag{28.55}$$
Equations 28.54 and 28.55 give that
$$\mu_A^{(g)} = \mu_A^{(\ell)*} + RT\ln\frac{p_A}{p_A^*}. \tag{28.56}$$
The vapour pressure of A in the mixed system can be estimated using Raoult's law, which states that $p_A = x_A p_A^*$ (i.e. that the vapour pressure of A is proportional to its mole fraction). Hence eqn 28.56 becomes
$$\mu_A^{(\ell)} = \mu_A^{(\ell)*} + RT\ln x_A. \tag{28.57}$$
Since $x_A < 1$, we find that $\mu_A^{(\ell)} < \mu_A^{(\ell)*}$ and so the chemical potential is indeed depressed compared to the pure case.


We can now derive formulae to describe the colligative properties. At the boiling point of the solution, the liquid A in the mixture is in equilibrium with pure A vapour, so that $\mu_A^{(\ell)} = \mu_A^{(g)*}$, and eqn 28.57 can be rewritten as
$$\ln x_A = \frac{\Delta G_{\rm vap}}{RT}, \tag{28.58}$$
where $\Delta G_{\rm vap} = \mu_A^{(g)*} - \mu_A^{(\ell)*}$. When $x_A = 1$, then equilibrium between vapour and liquid occurs at a temperature $T^*$ given by (using eqn 28.58 with $x_A = 1$)
$$\frac{\Delta G_{\rm vap}(T^*)}{RT^*} = 0, \tag{28.59}$$
which implies that (recall that G = H − TS)
$$\Delta H_{\rm vap}(T^*) - T^*\Delta S_{\rm vap}(T^*) = 0. \tag{28.60}$$
When $x_B = 1 - x_A$ is very small, then we have that
$$\ln(1 - x_B) \approx -x_B, \tag{28.61}$$
and hence eqn 28.58 implies that
$$-x_B = \frac{\Delta G_{\rm vap}}{RT} = \frac{1}{RT}\left[\Delta H_{\rm vap}(T) - T\Delta S_{\rm vap}(T)\right], \tag{28.62}$$
and assuming that $\Delta H_{\rm vap}$ and $\Delta S_{\rm vap}$ are only weakly temperature-dependent, so that eqn 28.60 gives $\Delta S_{\rm vap} \approx \Delta H_{\rm vap}/T^*$, this yields
$$-x_B = \frac{\Delta H_{\rm vap}}{R}\left(\frac{1}{T} - \frac{1}{T^*}\right) \approx -\frac{\Delta H_{\rm vap}}{RT^{*2}}(T - T^*). \tag{28.63}$$
Hence $T - T^*$, the elevation in boiling point, is given approximately by
$$T - T^* \approx \frac{RT^{*2}}{\Delta H_{\rm vap}}\,x_B. \tag{28.64}$$
It is often written $T - T^* = K_b x_B$, where $K_b \approx RT^{*2}/\Delta H_{\rm vap}$ is known as the ebullioscopic constant. For water, $K_b$ = 0.51 K kg mol⁻¹ when the solute concentration is expressed as a molality (moles of solute per kilogram of solvent) rather than as a mole fraction. There is a similar effect on the depression of the freezing point. One can show similarly that the freezing point is depressed by an amount $T^* - T = K_f x_B$ where $K_f \approx RT^{*2}/\Delta H_{\rm fus}$ is the cryoscopic constant. The salt water in the oceans freezes at a lower temperature than fresh water. The effect is also relevant for salt being put on pavements (sidewalks) in winter to stop them becoming icy.

Adding a small quantity of solute to a solvent increases the entropy of the solvent because the solute atoms are randomly located in the solvent. This means that there is a weaker tendency to form a gas (which would increase the solvent's entropy) because the entropy of the solvent has been increased anyway. This results in an elevation of the boiling point. Similarly, this additional entropy opposes the tendency to freeze and the freezing point is depressed.
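The quoted ebullioscopic constant of water can be recovered from eqn 28.64 and the entries for water in Table 28.1. The sketch below evaluates $K_b$ per unit mole fraction and then converts it to a molality basis using the dilute-solution relation $x_B \approx b\,M_A$, where b is the molality and $M_A$ the molar mass of the solvent. The molar mass of water and the gas constant are assumed values not given in this section.

```python
R = 8.314           # J mol^-1 K^-1 (assumed value)
T_BOIL = 373.15     # K, boiling point of water from Table 28.1
DH_VAP = 40.7e3     # J mol^-1, latent heat of vaporization from Table 28.1
M_WATER = 0.018015  # kg mol^-1, molar mass of water (assumed value)


def kb_mole_fraction(t_star=T_BOIL, dh_vap=DH_VAP):
    """Kb = R T*^2 / dH_vap, in kelvin per unit mole fraction of solute."""
    return R * t_star ** 2 / dh_vap


def kb_molal(t_star=T_BOIL, dh_vap=DH_VAP, molar_mass=M_WATER):
    """Kb in K kg mol^-1, using x_B ~ b * M_A for a dilute solution."""
    return kb_mole_fraction(t_star, dh_vap) * molar_mass


def boiling_point_elevation(molality):
    """Elevation T - T* in kelvin for a solute molality in mol kg^-1."""
    return kb_molal() * molality


if __name__ == "__main__":
    print(f"Kb = {kb_mole_fraction():.2f} K per unit mole fraction")
    print(f"Kb = {kb_molal():.3f} K kg mol^-1")
    print(f"elevation at 0.5 mol/kg: {boiling_point_elevation(0.5):.3f} K")
```

The molal value comes out close to 0.51 K kg mol⁻¹, consistent with the figure quoted above. For an electrolyte the effective molality would need to count each dissolved ion, which this sketch does not attempt.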

Fig. 28.12 Ehrenfest's classification of phase transitions. (a) First-order phase transition. (b) Second-order phase transition. The critical temperature $T_c$ is marked by a vertical dotted line in each case. (Panels: G, V and C plotted against T.)

28.7

Classification of phase transitions

Paul Ehrenfest (1880–1933) proposed a classification of phase transitions which goes as follows: the order of a phase transition is the order of the lowest differential of G (or µ) which shows a discontinuity at $T_c$. Thus first-order phase transitions involve a latent heat because the entropy (a first differential of G) shows a discontinuity. The volume is also a first differential of G and this also shows a discontinuous jump. The heat capacity is a second differential of G and thus it shows a sharp spike, as does the compressibility. This is illustrated in Fig. 28.12(a). Examples of first-order phase transitions include the solid–liquid transition, the solid–vapour transition, and the liquid–vapour transition.

By Ehrenfest's classification, a second-order phase transition has no latent heat because the entropy does not show a discontinuity (and neither does the volume; both are first differentials of G), but quantities like the heat capacity and compressibility (second differentials of G) do. This is illustrated in Fig. 28.12(b). Examples of second-order phase transitions include the superconducting transition, or the order–disorder transition in β-brass.

However, a big problem with the approach we have been using so far in studying phase transitions is that one key approximation made in thermodynamics, namely that the number of particles is so large that average properties such as pressure and density are well defined, breaks down at a phase transition. Fluctuations build up near a phase transition and so the behaviour of the system does not follow the expectations of our analysis very close to the phase transition temperature. This critical region is characterized by fluctuations, at all length scales. For example, when a saucepan of water is heated, the water warms quite quietly and unobtrusively until near the boiling point when it makes a great deal of noise and bubbles violently.¹² We have already analysed the behaviour of the formation of bubbles in Section 28.4. Therefore, it has been found that Ehrenfest's approach is rather too simple. We will have more to say concerning fluctuations in Chapters 33 and 34.

A more modern approach to classifying phase transitions simply distinguishes between those which show a latent heat, for which Ehrenfest's term "first-order phase transition" is retained, and those which do not, which are called continuous phase transitions (and include Ehrenfest's phase transitions of second order, third order, fourth order, etc., all lumped together).

¹² A visual demonstration of this is found in the phenomenon known as critical opalescence, which is the blurring and clouding of images seen through a volume of gas near its critical point. This occurs because density fluctuations are strong near the critical point and give rise to large variations in refractive index.

Example 28.9
• The liquid–gas phase transition is a first-order transition, except at the critical point where the phase transition involves no latent heat and is a continuous phase transition.
• A ferromagnet¹³ such as iron loses its ferromagnetism when heated to the Curie temperature, $T_C$ (a particular example of a critical temperature). This phase transition is a continuous phase transition, since there is no latent heat. The magnetization is a first differential of the Gibbs function and does not change discontinuously at $T_C$. The specific heat $C_B$, at constant magnetic field B, has a finite peak at $T_C$.

A further classification of phase transitions involves the notion of symmetry breaking. Figure 28.13 shows atoms in a liquid and in a solid. As a liquid cools there is a very slight contraction of the system but it retains a very high degree of symmetry. However, below the melting temperature, the liquid becomes a solid and that symmetry is broken. This may at first sight seem surprising because the picture of the solid 'looks' more symmetrical than that of the liquid. The atoms in the solid are all symmetrically lined up while in the liquid they are all over the place. The crucial observation is that any point in a liquid is, on average, exactly the same as any other. If you average the system over time, each position is visited by atoms as often as any other. There are no unique directions or axes along which atoms line up. In short, the system possesses complete translational and rotational symmetry. In the solid, however, this high degree of symmetry is nearly all lost. The solid drawn in Fig. 28.13 still possesses some residual symmetry: rather than being invariant under arbitrary rotations, it is invariant under four-fold rotations (π/2, π, 3π/2, 2π); rather than being invariant under arbitrary translations, it is now invariant under a translation of an integer combination of lattice basis vectors. Therefore not all symmetry has been lost but the high symmetry of the liquid state has been, to use the technical term, 'broken'. It is impossible to change symmetry gradually. Either a particular symmetry is present or it is not. Hence, phase transitions are sharp and there is a clear delineation between the ordered and disordered states.

Fig. 28.13 The liquid–solid phase transition. Top: The high temperature state (statistically averaged) has complete translational and rotational symmetry. Bottom: These symmetries are broken as the system becomes a solid below the critical temperature $T_c$.

¹³ A ferromagnet is a material containing magnetic moments which are all aligned in parallel below a transition temperature called the Curie temperature. Above this temperature, the magnetic moments become randomly aligned. This state is known as the paramagnetic state.

Not all phase transitions involve a change of symmetry. Consider the liquid–gas coexistence line again (see Fig. 28.7). The boundary line between the liquid and gas regions is terminated by a critical point. Hence it is possible to 'cheat' the sharp phase transition by taking a path through the phase diagram which avoids a discontinuous change. For temperatures above the critical temperature (647 K for water) the gaseous and liquid states are distinguished only by their density. The transition between a gas and a liquid involves no change of symmetry and therefore it is possible to avoid it by working round the critical end point. In contrast, the solid–liquid transition involves a change of symmetry and consequently there is no critical point for the melting curve.

Symmetry-breaking phase transitions include those between the ferromagnetic and paramagnetic states (in which the low-temperature state does not possess the rotational symmetries of the high-temperature state) and those between the superconducting and normal metal states of certain materials (in which the low-temperature state does not possess the same symmetry in the phase of the wavefunction as the high-temperature state). The concept of broken symmetry is very wide-ranging and is used to explain how the electromagnetic and weak forces originated. In the early Universe, when the temperature was very high, it is believed that the electromagnetic and weak forces were part of the same, unified, electroweak force. When the temperature cooled¹⁴ to below about 10¹¹ eV a symmetry was broken and a phase transition occurred, via what is known as the Higgs mechanism, and the W and Z bosons (mediating the weak force) acquired mass while the photon (mediating the electromagnetic force) remained massless. It is suggested that, at even earlier times, when the temperature of the Universe was around 10²¹ eV, the electroweak and strong forces were unified, and as the Universe expanded and its temperature lowered, another symmetry-breaking transition caused them to appear as different forces.

¹⁴ In other words when $k_B T$ was lower than this energy, corresponding to a temperature T ∼ 10¹⁵ K.
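The conversion between an energy scale and the corresponding temperature, $T = E/k_B$, is a one-line calculation. The sketch below carries it out for the two energy scales mentioned above, using the value of the Boltzmann constant in eV K⁻¹ (an assumed constant, not quoted in this section).

```python
K_B_EV = 8.617333e-5  # Boltzmann constant in eV K^-1 (assumed value)


def temperature_from_energy(energy_ev):
    """Return the temperature T (in K) at which k_B * T equals the given energy in eV."""
    return energy_ev / K_B_EV


if __name__ == "__main__":
    for energy in (1e11, 1e21):
        print(f"E = {energy:.0e} eV  ->  T = {temperature_from_energy(energy):.2e} K")
```

An energy of 10¹¹ eV corresponds to a temperature of about 10¹⁵ K, in line with the order-of-magnitude figure quoted in the footnote.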

Chapter summary

• The latent heat is related to the change in entropy at a first-order phase transition.
• The Clausius–Clapeyron equation states that
$$\frac{dp}{dT} = \frac{L}{T(V_2 - V_1)},$$
and this can be used to determine the shape of the phase boundary.
• The Kelvin formula states that the vapour pressure of a droplet of radius r is given by
$$p = p_0 \exp\left(\frac{2\gamma V_{\rm liq}}{rRT}\right).$$
• The Gibbs phase rule states that F = C − P + 2.
• Dissolving a solute in a solvent results in the elevation of the solvent's boiling point and a depression of its freezing point.
• A first-order phase transition involves a latent heat while a continuous phase transition does not.
• Certain phase transitions involve the breaking of symmetry.

Further reading

More information on phase transitions may be found in Binney et al. (1992), Yeomans (1992), Le Bellac (2004), Blundell (2001) and Anderson (1984).

Exercises

(28.1) When lead is melted at atmospheric pressure, the melting point is 327.0 °C, the density decreases from $1.101 \times 10^{4}$ to $1.065 \times 10^{4}$ kg m$^{-3}$ and the latent heat is 24.5 kJ kg$^{-1}$. Estimate the melting point of lead at a pressure of 100 atm.

(28.2) Some tea connoisseurs claim that a good cup of tea cannot be brewed with water at a temperature less than 97 °C. Assuming this to be the case, is it possible for an astronomer, working on the summit of Mauna Kea in Hawaii (elevation 4194 m, though you don't need to know this to solve the problem) where the air pressure is 615 mbar, to make a good cup of tea without the aid of a pressure vessel?

(28.3) The gradient of the melting line of water on a $p$–$T$ diagram close to 0 °C is $-1.4 \times 10^{7}$ Pa K$^{-1}$. At 0 °C, the specific volume of water is $1.00 \times 10^{-3}$ m$^3$ kg$^{-1}$ and of ice is $1.09 \times 10^{-3}$ m$^3$ kg$^{-1}$. Using this information, deduce the latent heat of fusion of ice. In winter, a lake of water is covered initially by a uniform layer of ice of thickness 1 cm. The air temperature at the surface of the ice is −0.5 °C. Estimate the rate at which the layer of ice begins to thicken, assuming that the temperature of the water just below the ice is 0 °C. You can also assume steady state conditions and ignore convection. The temperature of the water at the bottom of the lake, depth 1 m, is maintained at 2 °C. Find the thickness of ice which will eventually be formed. [The thermal conductivity of ice is 2.3 W m$^{-1}$ K$^{-1}$ and of water is 0.56 W m$^{-1}$ K$^{-1}$.]

(28.4) (a) Show that the temperature dependence of the latent heat of vaporization $L$ is given by the following expression:
$$\frac{d}{dT}\left(\frac{L}{T}\right) = \frac{C_{pv} - C_{pL}}{T} + \left[\left(\frac{\partial S_v}{\partial p}\right)_T - \left(\frac{\partial S_L}{\partial p}\right)_T\right]\frac{dp}{dT}. \tag{28.65}$$
In this equation, $S_v$ and $S_L$ are the entropies of the vapour and liquid and $C_{pv}$ and $C_{pL}$ are the heat capacities of the vapour and liquid. Hence show that $L = L_0 + L_1 T$ where $L_0$ and $L_1$ are constants.
(b) Show further that when the saturated vapour of an incompressible liquid is expanded adiabatically, some liquid condenses out if
$$C_{pL} + T\frac{d}{dT}\left(\frac{L}{T}\right) < 0,$$
where $C_{pL}$ is the heat capacity of the liquid (which is assumed constant) and $L$ is the latent heat of vaporization. (Hint: consider the gradient of the phase boundary in the $p$–$T$ plane and the corresponding curve for adiabatic expansion.)

(28.5) The equilibrium vapour pressure $p$ of water as a function of temperature is given in the following table:

T (°C)    0     10     20     30     40     50
p (Pa)    611   1228   2339   4246   7384   12349

Deduce a value for the latent heat of evaporation $L_v$ of water. State clearly any simplifying assumptions that you make. Estimate the pressure at which ice and water are in equilibrium at −2 °C given that ice cubes float with 4/5 of their volume submerged in water at the triple point (0.01 °C, 612 Pa). [Latent heat of sublimation of ice at the triple point, $L_s = 2776 \times 10^{3}$ J kg$^{-1}$.]

(28.6) It is sometimes stated that the weight of a skater pressing down on their thin skates is enough to melt ice, so that the skater can glide around on a thin film of liquid water. Assuming an ice rink at −5 °C, do some estimates and show that this mechanism won't work. [In fact, frictional heating of ice is much more important, see S. C. Colbeck, Am. J. Phys. 63, 888 (1995) and S. C. Colbeck, L. Najarian, and H. B. Smith, Am. J. Phys. 65, 488 (1997).]

29 Bose–Einstein and Fermi–Dirac distributions

In this chapter, we are going to consider the way in which quantum mechanics changes the statistical properties of gases. The crucial ingredient is the concept of identical particles. The results of quantum mechanics show that there are two types of identical particle: bosons and fermions. Bosons can share quantum states, while fermions cannot share quantum states. Another way of stating this is to say that bosons are not subject to the Pauli exclusion principle, while fermions are. This difference in ability to share quantum states (arising from what we shall call exchange symmetry) has a profound effect on the statistical distribution of these particles over the energy states of the system. This distribution over energy states is called the statistics of these particles, and we will demonstrate the effect of exchange symmetry on statistics. However, it can also be shown that another difference between bosons and fermions is the type of spin angular momentum that they may possess. This is enshrined in the spin-statistics theorem, which we will not prove but which states that bosons have integer spin while fermions have half-integer spin.

Example 29.1
• Examples of bosons include: photons (spin 1), $^4$He atoms (spin 0).
• Examples of fermions include: electrons (spin $\frac{1}{2}$), neutrons (spin $\frac{1}{2}$), protons (spin $\frac{1}{2}$), $^3$He atoms (spin $\frac{1}{2}$), $^7$Li nuclei (spin $\frac{3}{2}$).

29.1

Exchange and symmetry

In this section, we will argue why a two-particle wave function can be either symmetric or antisymmetric under exchange of particles. Consider two identical particles, one at position $\mathbf{r}_1$ and the other at position $\mathbf{r}_2$. The wave function which describes this is $\psi(\mathbf{r}_1, \mathbf{r}_2)$. We now define an exchange operator $\hat{P}_{12}$ which exchanges particles 1 and 2. Thus
$$\hat{P}_{12}\,\psi(\mathbf{r}_1, \mathbf{r}_2) = \psi(\mathbf{r}_2, \mathbf{r}_1). \tag{29.1}$$


Since the particles are identical, we also expect that the Hamiltonian $\hat{H}$ which describes this two-particle system must commute with $\hat{P}_{12}$, i.e.
$$[\hat{H}, \hat{P}_{12}] = 0, \tag{29.2}$$
so that the energy eigenfunctions must be simultaneously eigenfunctions of the exchange operator. However, because the particles are identical, swapping them over must have no effect on the probability density. Thus
$$|\psi(\mathbf{r}_1, \mathbf{r}_2)|^2 = |\psi(\mathbf{r}_2, \mathbf{r}_1)|^2. \tag{29.3}$$
If $\hat{P}_{12}$ is a Hermitian¹ operator, it must have real eigenvalues, so we expect that $\hat{P}_{12}\psi = \lambda\psi$, where $\lambda$ is a real eigenvalue. Equation 29.3 shows that the only solution to this is $\lambda = \pm 1$, i.e.
$$\hat{P}_{12}\,\psi(\mathbf{r}_1, \mathbf{r}_2) = \psi(\mathbf{r}_2, \mathbf{r}_1) = \pm\psi(\mathbf{r}_1, \mathbf{r}_2). \tag{29.4}$$

¹ A Hermitian operator has real eigenvalues, so is useful for representing real physical quantities in quantum mechanics.

The wave function must therefore have one of two types of exchange symmetry, as follows:
• The wave function is symmetric under exchange of particles:
$$\psi(\mathbf{r}_2, \mathbf{r}_1) = \psi(\mathbf{r}_1, \mathbf{r}_2), \tag{29.5}$$
and the particles are called bosons.
• The wave function is antisymmetric under exchange of particles:
$$\psi(\mathbf{r}_2, \mathbf{r}_1) = -\psi(\mathbf{r}_1, \mathbf{r}_2), \tag{29.6}$$
and the particles are called fermions.

This argument is valid for particles in three dimensions, the situation we usually encounter in our three-dimensional world, but fails in two dimensions. This occurs because you have to be a bit more careful than we've been here about how you exchange two particles. This point is rather esoteric, but the interested reader can follow it up in the box on anyons and in the further reading.
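The two kinds of exchange symmetry in eqns 29.5 and 29.6 can be checked numerically. The following is only a minimal sketch under our own assumptions: it works in one dimension rather than three, and the orbitals `phi_a` and `phi_b` are hypothetical Gaussians chosen purely for illustration. It builds the symmetric and antisymmetric combinations of a product wave function and tests the eigenvalue of the exchange operator at a few sample points.

```python
import math

def phi_a(x):
    """Hypothetical single-particle orbital: a Gaussian centred at x = -1."""
    return math.exp(-(x + 1.0) ** 2)

def phi_b(x):
    """Hypothetical single-particle orbital: a Gaussian centred at x = +1."""
    return math.exp(-(x - 1.0) ** 2)

def product_state(x1, x2):
    """Unsymmetrized product: particle 1 in phi_a, particle 2 in phi_b."""
    return phi_a(x1) * phi_b(x2)

def exchange(psi):
    """Exchange operator P12: maps psi to the function (x1, x2) -> psi(x2, x1)."""
    return lambda x1, x2: psi(x2, x1)

def symmetric(psi):
    """Symmetric combination (eqn 29.5)."""
    return lambda x1, x2: (psi(x1, x2) + psi(x2, x1)) / math.sqrt(2)

def antisymmetric(psi):
    """Antisymmetric combination (eqn 29.6)."""
    return lambda x1, x2: (psi(x1, x2) - psi(x2, x1)) / math.sqrt(2)

def exchange_eigenvalue(psi, points, tol=1e-12):
    """Return +1 or -1 if P12 psi = (+/-) psi at every sample point, else None."""
    swapped = exchange(psi)
    for sign in (+1, -1):
        if all(abs(swapped(x1, x2) - sign * psi(x1, x2)) <= tol
               for x1, x2 in points):
            return sign
    return None

# A few arbitrary sample points (x1, x2).
points = [(-1.3, 0.4), (0.2, 0.9), (1.5, -0.7), (0.0, 2.0)]

boson_eig = exchange_eigenvalue(symmetric(product_state), points)
fermion_eig = exchange_eigenvalue(antisymmetric(product_state), points)
product_eig = exchange_eigenvalue(product_state, points)
print(boson_eig, fermion_eig, product_eig)
```

The bare product state is not an eigenfunction of the exchange operator, so the check returns `None` for it, while the two combinations return +1 and −1 respectively. The antisymmetric combination also vanishes whenever the two coordinates coincide.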

29.2

Wave functions of identical particles

In the previous section, we wrote down a two-particle wave function $\psi(\mathbf{r}_1, \mathbf{r}_2)$ which labelled the particles according to their position. However, there are many other ways in which one could label a particle, such as which orbital state it is in, or what its momentum is. To keep things completely general, we will label the particles according to their state in a more abstract way. The effect on the statistics will then be more transparent, and is demonstrated in Example 29.2.

Anyons

The argument that we have used to describe exchange symmetry is, in fact, only strictly valid in three dimensions. In two dimensions, there are further possibilities other than fermions and bosons. For the interested reader, we give a more detailed description in this box.

We begin by noticing that eqn 29.3 allows the solution $\psi(\mathbf{r}_2, \mathbf{r}_1) = e^{i\theta}\psi(\mathbf{r}_1, \mathbf{r}_2)$, where $e^{i\theta}$ is a phase factor. Thus exchanging identical particles means that the wave function acquires a phase $\theta$. Defining $\mathbf{r} = \mathbf{r}_2 - \mathbf{r}_1$, the action of exchanging the position coordinates of two particles involves letting this vector execute some path from $\mathbf{r}$ to $-\mathbf{r}$, but avoiding the origin so that the two particles do not ever occupy the same position. We therefore can imagine the exchange of particles as a path in $\mathbf{r}$-space. Without loss of generality, we can keep $|\mathbf{r}|$ fixed, so that in the process of exchanging the two particles, they move relative to each other at a fixed separation.

Thus, for the case of three dimensions, the path is on the surface of a sphere in $\mathbf{r}$-space. Since the two particles are identical, opposite points on the surface of the sphere are equivalent and must be identified (giving $\mathbf{r}$-space the topology of what is known as real two-dimensional projective space). It turns out that all paths on this surface fall into two classes: ones which are contractible to a point [and thus correspond to no exchange of particles, yielding $\theta = 0$ to ensure the wave function is single-valued; see Fig. 29.1(a)] and those which are not [and thus correspond to exchange of particles; see Fig. 29.1(b)]. For this latter case we have to assign $\theta = \pi$, so that two exchanges correspond to no exchange, i.e. $e^{i\theta}e^{i\theta} = 1$, so that $\theta = \pi$. This argument thus justifies that the phase factor $e^{i\theta} = \pm 1$, giving rise to bosons ($e^{i\theta} = +1$) and fermions ($e^{i\theta} = -1$).

Fig. 29.1 Paths in $\mathbf{r}$-space, for the three-dimensional case, corresponding to (a) no exchange of particles and (b) exchange of particles.

However, the argument fails in two dimensions. In the two-dimensional case, the path is on a circle in $\mathbf{r}$-space in which opposite points on the circle are equivalent and are identified. In this case, the paths in $\mathbf{r}$-space can wind round the origin an integer number of times. This means that two successive exchanges of the particles [as shown in Fig. 29.2(c)] are not topologically equivalent to zero exchanges [if performed by winding round the origin in the same direction, as shown in Fig. 29.2(a)] and thus the phase $\theta$ can take any value. (In this case, $\mathbf{r}$-space has the topology of real one-dimensional projective space, which is the same as that of a circle.) The resulting particles have more complicated statistical properties than either bosons or fermions and are called anyons (because $\theta$ can take 'any' value). Since $\theta/\pi$ is no longer forced to be 0 or 1 (i.e. $e^{i\theta}$ is no longer forced to be $\pm 1$), and can take any fractional value in between, anyons can have fractional statistics.

Fig. 29.2 Paths in $\mathbf{r}$-space, for the two-dimensional case, corresponding to (a) no exchange, (b) a single exchange and (c) two exchanges of particles.

The crucial distinction between $\mathbf{r}$-space in two and three dimensions is that the removal of the origin in two-dimensional space makes the space multiply connected (allowing paths which wind around the origin), whereas three-dimensional space remains singly connected (and a path which tries to wind round the origin can be deformed into one which does not). We live in a three-dimensional world, so is any of this relevant? In fact, anyons turn out to be important in the fractional quantum Hall effect, which occurs in certain two-dimensional electron systems under high magnetic field. For more details concerning anyons, see the further reading.


Example 29.2
Imagine that a particle can exist in one of two states, which we will label $|0\rangle$ and $|1\rangle$. We now consider two such particles, and describe their joint state by a product state. Thus
$$|0\rangle|1\rangle \tag{29.7}$$
describes the state in which the first particle is in state 0 and the second particle is in state 1. What are the possible states for this system if the particles are (a) distinguishable, (b) indistinguishable, but classical, (c) indistinguishable bosons, and (d) indistinguishable fermions?

Solution:
(a) Distinguishable particles: There are four possible states, which are
$$|0\rangle|0\rangle, \quad |1\rangle|0\rangle, \quad |0\rangle|1\rangle, \quad |1\rangle|1\rangle. \tag{29.8}$$
(b) Indistinguishable, but classical, particles: There are now only three possible states, which are
$$|0\rangle|0\rangle, \quad |1\rangle|0\rangle, \quad |1\rangle|1\rangle. \tag{29.9}$$
Since the particles are indistinguishable, there is no way of distinguishing the state $|1\rangle|0\rangle$ from $|0\rangle|1\rangle$.
(c) Indistinguishable bosons: There are also only three possible states. Clearly both $|0\rangle|0\rangle$ and $|1\rangle|1\rangle$ are eigenstates of the exchange operator, but $|1\rangle|0\rangle$ and $|0\rangle|1\rangle$ are not. However, if we make a linear combination²
$$\frac{1}{\sqrt{2}}\left(|1\rangle|0\rangle + |0\rangle|1\rangle\right), \tag{29.10}$$
this will be an eigenstate of the exchange operator with eigenvalue 1. Thus the three possible states are:
$$|0\rangle|0\rangle, \quad |1\rangle|1\rangle, \quad \frac{1}{\sqrt{2}}\left(|1\rangle|0\rangle + |0\rangle|1\rangle\right). \tag{29.11}$$
(d) Indistinguishable fermions: No two fermions can be in the same quantum state (by the Pauli exclusion principle), so $|0\rangle|0\rangle$ and $|1\rangle|1\rangle$ are not allowed. Thus only one state is possible, which is
$$\frac{1}{\sqrt{2}}\left(|1\rangle|0\rangle - |0\rangle|1\rangle\right). \tag{29.12}$$
This wave function is an eigenstate of the exchange operator with eigenvalue −1.

² This example demonstrates quantum-mechanical entanglement. The states of the two particles are entangled, in the sense that if one particle is in the $|0\rangle$ state, the other particle has to be in the $|1\rangle$ state, and vice versa.
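The counting in Example 29.2 can be reproduced by brute-force enumeration. The sketch below is illustrative only: the function name `count_states` and its dictionary keys are our own choices. It counts only how many states exist in each case and does not construct the symmetrized wave functions themselves.

```python
from itertools import combinations, combinations_with_replacement, product

def count_states(n_particles, n_levels):
    """Count the allowed many-particle states for each kind of particle."""
    levels = range(n_levels)
    # Distinguishable: every labelled particle is assigned a level independently.
    distinguishable = sum(1 for _ in product(levels, repeat=n_particles))
    # Indistinguishable (classical or bosonic): only the occupation numbers
    # matter, so count unordered selections with repetition allowed.
    unordered = sum(1 for _ in combinations_with_replacement(levels, n_particles))
    # Fermions: unordered selections in which no level is used twice.
    fermions = sum(1 for _ in combinations(levels, n_particles))
    return {
        "distinguishable": distinguishable,
        "indistinguishable classical": unordered,
        "bosons": unordered,
        "fermions": fermions,
    }

counts = count_states(2, 2)
print(counts)
```

For two particles in two levels this gives 4, 3, 3 and 1 states respectively, in agreement with parts (a) to (d) of the example. The same function can be called with larger arguments to see how quickly the counts diverge from one another.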


In general, for fermions, the requirement that $\hat{P}_{12}|\psi\rangle = -|\psi\rangle$ means that if $|\psi\rangle$ is a two-particle state consisting of two particles in the same quantum state, i.e. if $|\psi\rangle = |\varphi\rangle|\varphi\rangle$, then
$$\hat{P}_{12}|\varphi\rangle|\varphi\rangle = |\varphi\rangle|\varphi\rangle = -|\varphi\rangle|\varphi\rangle, \tag{29.13}$$
so that
$$|\varphi\rangle|\varphi\rangle = 0, \tag{29.14}$$
i.e. the doubly-occupied state cannot exist. This, again, illustrates the Pauli exclusion principle, namely that two identical fermions cannot coexist in the same quantum state.

29.3

The statistics of identical particles

In the last section, we demonstrated that exchange symmetry has an important effect on the statistics of two identical particles. Now we want to do the same for cases in which we have many more than two identical particles. Our derivation will be easiest if we do this by finding the grand partition function $\mathcal{Z}$ (see Section 22.3) for a system composed either of fermions or bosons. In this approach, the total number of particles is not fixed, and this is an easy constraint to apply as we shall see. If one is treating a system in which the number of particles is fixed, we can always fix it at the end of our calculation. Our method will be to use the expression $\mathcal{Z} = \sum_\alpha e^{\beta(\mu N_\alpha - E_\alpha)}$ (from eqn 22.20). Here, $\alpha$ denotes a particular state of the system. We assume that there are certain possible quantum states in which to place our particles, and that the energy cost of putting a particle into the $i$th state is given by $E_i$. We will put $n_i$ particles into the $i$th state; here $n_i$ is called the occupation number of the $i$th state. A particular configuration of the system is then described by the product
$$\left(e^{\beta(\mu - E_1)}\right)^{n_1} \times \left(e^{\beta(\mu - E_2)}\right)^{n_2} \times \cdots = \prod_i e^{n_i\beta(\mu - E_i)}. \tag{29.15}$$
The grand partition function is the sum of such products for all sets of occupation numbers which are allowed by the symmetry of the particles. Hence
$$\mathcal{Z} = \sum_{\{n_i\}} \prod_i e^{n_i\beta(\mu - E_i)}, \tag{29.16}$$
where the symbol $\{n_i\}$ denotes a set of occupation numbers allowed by the symmetry of the particles. Fortunately, the total number of particles $\sum_i n_i$ does not have to be fixed,³ because that would have been a fiddly constraint to apply to this expression. In fact, we will only be considering two cases: fermions, for which $\{n_i\} = \{0, 1\}$ (independent of $i$), and bosons, for which $\{n_i\} = \{0, 1, 2, 3, \ldots\}$ (independent of $i$). This allows us to factor out the terms in the product for each state $i$ and hence write
$$\mathcal{Z} = \prod_i \sum_{\{n_i\}} e^{n_i\beta(\mu - E_i)}. \tag{29.17}$$

³ The total number of particles is not fixed in the grand canonical ensemble, which is the one we are using here.


Example 29.3
Evaluate $\ln\mathcal{Z}$ for a gas of (i) fermions and (ii) bosons.

Solution:
(i) For fermions, each state can either be empty or singly occupied, so that $\{n_i\} = \{0, 1\}$, and hence eqn 29.17 becomes
$$\mathcal{Z} = \prod_i \left(1 + e^{\beta(\mu - E_i)}\right). \tag{29.18}$$
Hence
$$\ln\mathcal{Z} = \sum_i \ln\left(1 + e^{\beta(\mu - E_i)}\right). \tag{29.19}$$
(ii) For bosons, each state can contain any integer number of particles, so that $\{n_i\} = \{0, 1, 2, 3, \ldots\}$, and hence eqn 29.17 becomes
$$\mathcal{Z} = \prod_i \left(1 + e^{\beta(\mu - E_i)} + e^{2\beta(\mu - E_i)} + \cdots\right) \tag{29.20}$$
and therefore, by summing this geometric series, we have that
$$\mathcal{Z} = \prod_i \frac{1}{1 - e^{\beta(\mu - E_i)}}, \tag{29.21}$$
and hence
$$\ln\mathcal{Z} = -\sum_i \ln\left(1 - e^{\beta(\mu - E_i)}\right). \tag{29.22}$$
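The closed forms in eqns 29.19 and 29.22 can be compared with a direct sum over occupation numbers for a single level. This is only a rough numerical sketch: the function names, the shorthand $x = \beta(\mu - E_i)$ and the cut-off `n_max` on the boson sum are our own assumptions. The geometric series for bosons converges only when $x < 0$, so the closed form refuses other values.

```python
import math

def ln_Z_level_direct(x, statistics, n_max=2000):
    """ln of the single-level grand partition function by explicit summation.

    x stands for beta * (mu - E_i). The boson sum is truncated at n_max terms.
    """
    if statistics == "fermion":
        occupations = (0, 1)
    elif statistics == "boson":
        occupations = range(n_max + 1)
    else:
        raise ValueError("statistics must be 'fermion' or 'boson'")
    return math.log(sum(math.exp(n * x) for n in occupations))

def ln_Z_level_closed(x, statistics):
    """Closed forms: one term of eqn 29.19 (fermions) or eqn 29.22 (bosons)."""
    if statistics == "fermion":
        return math.log1p(math.exp(x))
    if statistics == "boson":
        if x >= 0:
            raise ValueError("geometric series diverges unless mu < E_i")
        return -math.log1p(-math.exp(x))
    raise ValueError("statistics must be 'fermion' or 'boson'")

def ln_Z(energies, mu, beta, statistics):
    """ln Z for a set of independent levels: a sum of single-level terms."""
    return sum(ln_Z_level_closed(beta * (mu - E), statistics) for E in energies)

for x in (-0.1, -1.0, -3.0):
    for statistics in ("fermion", "boson"):
        direct = ln_Z_level_direct(x, statistics)
        closed = ln_Z_level_closed(x, statistics)
        print(statistics, x, direct, closed)
```

The two columns of output should agree to within rounding, which is a quick check on the summation of the geometric series in part (ii).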

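Summarizing the results of the previous example, we can write
$$\ln\mathcal{Z} = \pm\sum_i \ln\left(1 \pm e^{\beta(\mu - E_i)}\right), \tag{29.23}$$
where the ± sign means + for fermions and − for bosons. The number of particles in each energy level is given by
$$\langle n_i \rangle = -\frac{1}{\beta}\frac{\partial\ln\mathcal{Z}}{\partial E_i} = \frac{e^{\beta(\mu - E_i)}}{1 \pm e^{\beta(\mu - E_i)}}, \tag{29.24}$$
and hence dividing top and bottom by $e^{\beta(\mu - E_i)}$ gives
$$\langle n_i \rangle = \frac{1}{e^{\beta(E_i - \mu)} \pm 1}, \tag{29.25}$$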

where, again, the ± sign means + for fermions and − for bosons. If $\mu$ and $T$ are fixed for a particular system, eqn 29.25 shows that the mean occupation of the $i$th state, $\langle n_i \rangle$, is a function only of the energy $E_i$. It is therefore convenient to consider the distribution function $f(E)$ for fermions and bosons, which is defined to be the mean occupation of a state with energy $E$. We can therefore immediately write down the distribution function $f(E)$ for fermions as
$$f(E) = \frac{1}{e^{\beta(E - \mu)} + 1}, \tag{29.26}$$
which is known as the Fermi–Dirac distribution function, and for bosons as
$$f(E) = \frac{1}{e^{\beta(E - \mu)} - 1}, \tag{29.27}$$
which is known as the Bose–Einstein distribution function. Sometimes the term on the right-hand side of eqn 29.26 is referred to as the Fermi factor and the term on the right-hand side of eqn 29.27 is referred to as the Bose factor. These are sketched in Fig. 29.3.

Fig. 29.3 The Fermi–Dirac and Bose–Einstein distribution functions.

Note that in the limit $\beta(E - \mu) \gg 1$, both functions tend to the Boltzmann distribution $e^{-\beta(E - \mu)}$. This is because this limit corresponds to low density ($\mu$ small) and here there are many more states thermally accessible to the particles than there are particles; thus double occupancy never occurs and the requirements of exchange symmetry become irrelevant and both fermions and bosons behave like classical particles. The differences, however, are particularly felt at high density. In particular, note that the distribution function for bosons diverges when $\mu = E$. Thus for bosons, the chemical potential must always be below, even if only slightly, the lowest-energy state. If it is not, then the lowest-energy state would become occupied with an infinite number of particles, which is unphysical. The implications for the properties of quantum gases will be considered in the next chapter.
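The behaviour of the two distribution functions described above can be explored numerically. The following sketch is illustrative rather than definitive: the function names are our own, energies are measured in arbitrary units with $\beta = 1$, and both functions are rewritten in terms of $e^{-\beta(E - \mu)}$ where needed so that large arguments do not overflow.

```python
import math

def fermi_dirac(E, mu, beta):
    """Fermi-Dirac distribution, eqn 29.26, written to avoid overflow."""
    x = beta * (E - mu)
    if x > 0:
        e = math.exp(-x)          # multiply top and bottom by exp(-x)
        return e / (1.0 + e)
    return 1.0 / (math.exp(x) + 1.0)

def bose_einstein(E, mu, beta):
    """Bose-Einstein distribution, eqn 29.27; only defined for E > mu."""
    x = beta * (E - mu)
    if x <= 0:
        raise ValueError("Bose-Einstein occupation requires E > mu")
    # 1 / (exp(x) - 1) = exp(-x) / (1 - exp(-x)), and 1 - exp(-x) = -expm1(-x)
    return math.exp(-x) / (-math.expm1(-x))

def boltzmann(E, mu, beta):
    """Classical limit exp(-beta (E - mu))."""
    return math.exp(-beta * (E - mu))

mu, beta = 0.0, 1.0
for E in (0.01, 1.0, 5.0, 10.0):
    print(E, fermi_dirac(E, mu, beta), boltzmann(E, mu, beta),
          bose_einstein(E, mu, beta))
```

For $\beta(E - \mu) = 10$ the three values agree to within about one part in $10^4$, while close to $E = \mu$ the Bose–Einstein occupation grows without bound and the Fermi–Dirac occupation approaches 1/2.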

Chapter summary
• The wave function of a pair of bosons is symmetric under exchange of particles, while the wave function of a pair of fermions is antisymmetric under exchange of particles.
• Bosons can share quantum states, while fermions cannot share quantum states.
• Bosons obey Bose–Einstein statistics, given by
$$f(E) = \frac{1}{e^{\beta(E - \mu)} - 1},$$
while fermions obey Fermi–Dirac statistics, given by
$$f(E) = \frac{1}{e^{\beta(E - \mu)} + 1}.$$

Further reading
More information about anyons may be found in Canright and Girvin (1990), Rao (1992) and in the collection of articles in Shapere and Wilczek (1989).

Exercises

(29.1) Differentiate between particles that obey Bose–Einstein and Fermi–Dirac statistics, giving two examples of each.

(29.2) For the particles considered in Example 29.2, what is the probability that both particles are in the $|0\rangle$ state when (a) distinguishable, (b) indistinguishable, but classical, (c) indistinguishable bosons, and (d) indistinguishable fermions?

(29.3) By rewriting the Fermi–Dirac function
$$f(E) = \frac{1}{e^{\beta(E - \mu)} + 1} \tag{29.28}$$
as
$$f(E) = \frac{1}{2}\left[1 - \tanh\left(\frac{1}{2}\beta(E - \mu)\right)\right], \tag{29.29}$$
show that $f(E)$ is symmetric about $E = \mu$ and sketch it. Find simplified expressions for $f(E)$ when (i) $E \ll \mu$, (ii) $E \gg \mu$ and (iii) $E$ is very close to $\mu$.

(29.4) Are identical particles always indistinguishable?

(29.5) Hydrogen (H$_2$) gas can exist in two forms. If the proton spins are in an exchange symmetric triplet ($S = 1$) state, it is known as ortho-hydrogen. If the proton spins are in an exchange antisymmetric singlet ($S = 0$) state, it is known as para-hydrogen. The symmetry of the total wave function must be antisymmetric overall, so that the rotational part of the wave function must be antisymmetric for ortho-hydrogen (so that the angular momentum quantum number $J$ is 1, 3, 5, ...) or symmetric for para-hydrogen (so that $J = 0, 2, 4, \ldots$). The proton separation in hydrogen is $7.4 \times 10^{-11}$ m. Estimate the spacing in kelvin between the ground state and first excited state in para-hydrogen. Show that the ratio $f$ of ortho-hydrogen to para-hydrogen is given by
$$f = 3\,\frac{\sum_{J=1,3,5,\ldots} (2J + 1)\,e^{-J(J+1)\hbar^2/2Ik_{\rm B}T}}{\sum_{J=0,2,4,\ldots} (2J + 1)\,e^{-J(J+1)\hbar^2/2Ik_{\rm B}T}}, \tag{29.30}$$
and find $f$ at 50 K.

(29.6) In this exercise, we derive Fermi–Dirac and Bose–Einstein statistics using the microcanonical ensemble.
(a) Show that the number of ways of distributing $n_j$ fermions among $g_j$ states with not more than one particle in each is
$$\Omega_j = \frac{g_j!}{n_j!\,(g_j - n_j)!}. \tag{29.31}$$
Here, $j$ labels a particular group of states. Hence the entropy $S$ is given by
$$S = k_{\rm B} \ln\left[\prod_j \frac{g_j!}{n_j!\,(g_j - n_j)!}\right]. \tag{29.32}$$
Hence show (using Stirling's approximation) that
$$S = -k_{\rm B}\sum_j g_j\left[\bar{n}_j \ln\bar{n}_j + (1 - \bar{n}_j)\ln(1 - \bar{n}_j)\right], \tag{29.33}$$
where $\bar{n}_j = n_j/g_j$ are the mean occupation numbers of the quantum states. Maximize this expression subject to the constraint that the total energy $E$ and number of particles $N$ are constant, and hence show that
$$\bar{n}_j = \frac{1}{e^{\alpha + \beta E_j} + 1}. \tag{29.34}$$
(b) Show that the number of ways of distributing $n_j$ bosons among $g_j$ states with any number of particles in each is
$$\Omega_j = \frac{(g_j + n_j - 1)!}{n_j!\,(g_j - 1)!}. \tag{29.35}$$
Hence show that
$$S = k_{\rm B}\sum_j g_j\left[(1 + \bar{n}_j)\ln(1 + \bar{n}_j) - \bar{n}_j \ln\bar{n}_j\right]. \tag{29.36}$$
Maximize this expression subject to the constraint that the total energy $E$ and number of particles $N$ are constant, and hence show that
$$\bar{n}_j = \frac{1}{e^{\alpha + \beta E_j} - 1}. \tag{29.37}$$


Albert Einstein (1879–1955)

Fig. 29.4 Albert Einstein

Albert Einstein's academic career began badly. In 1895, he failed to get into the prestigious Eidgenössische Technische Hochschule (ETH) in Zürich, and was sent to nearby Aarau to finish secondary school. He enrolled at ETH the following year, but failed to get a teaching assistant job there after his degree. After teaching maths at technical schools in Winterthur and Schaffhausen, Einstein finally landed a job at a patent office in Bern in 1902 and was to stay there for seven years. Though Einstein was present in the office, his mind was elsewhere and he combined the day job with doctoral studies at the University of Zürich.

In 1905 this unknown patent clerk published his doctoral thesis (which derived a relationship between diffusion and frictional forces, and which contained a new method to determine molecular radii) and also published four revolutionary papers in the journal Annalen der Physik. The first paper proposed that Planck's energy quanta were real entities and would show up in the photoelectric effect, work for which he was awarded the 1921 Nobel Prize. The citation stated that the prize was "for his services to Theoretical Physics, and especially for his discovery of the law of the photoelectric effect". The second paper explained Brownian motion on the basis of statistical mechanical fluctuations of atoms. The third and fourth papers introduced his special theory of relativity and his famous equation $E = mc^2$. Any one of these developments alone was sufficient to earn him a major place in the history of physics; the combined achievement led to more modest immediate rewards: the following year, Einstein was promoted by the patent office to "technical examiner second class".

Einstein only became a professor (at Zürich) in 1909, moving to Prague in 1911, ETH in 1912 and Berlin in 1914. In 1915, Einstein presented his general theory of relativity, which included gravity. The consequences of this theory include the phenomena of gravitational lensing and gravitational waves, and the general theory of relativity is of fundamental importance in modern astrophysics. In the 1920s, Einstein battled with Bohr on the interpretation of quantum theory, a subject which he had helped found through his work on the photoelectric effect. Einstein did not believe quantum theory to be complete, while completeness was the central thesis of Bohr's Copenhagen interpretation. Einstein seemed to lose the battles, but his criticisms illuminated the understanding of quantum mechanics, particularly concerning the nature of quantum entanglement. Einstein also contributed to quantum statistical mechanics through his work on Bose–Einstein statistics (see the biography of Bose).

The rise of Nazi Germany led to Einstein's departure in 1933 from the country of his birth, and after receiving offers from Jerusalem, Leiden, Oxford, Madrid and Paris, he settled on Princeton where he remained for the rest of his life. When he arrived there in 1935, and was asked what he would require for his study, he is reported to have replied "A desk, some pads and a pencil, and a large wastebasket to hold all of my mistakes." In 1939, following persuasion from Szilárd, he played a crucial rôle in alerting President Roosevelt to the theoretical possibility of nuclear weapons being developed based on the discovery of nuclear fission and the need for the Allies to have this before the Nazis; this eventually led to the Manhattan project and the development of the atomic bomb. Einstein's final years were spent in an unsuccessful search for a grand unified theory which would combine the fundamental forces into a single theory.

Interestingly, Einstein said that his search for the principle of relativity had been motivated by his yearning for a grand universal principle which was on the same level as the second law of thermodynamics. He saw many theories of physics as constructive, such as the kinetic theory of gases, which build up a description of complex behaviour from a simple scheme of mechanical and diffusional processes. Instead, he was after something much grander, in which many subtle consequences followed from a single universal principle. His model was thermodynamics, in which everything flowed from a fundamental principle about increase of entropy. Thus in some sense, thermodynamics was the template for relativity.


Satyendranath Bose (1894–1974)

Fig. 29.5 S. Bose

Satyendranath Bose was born in Calcutta and graduated from the Presidency College there in 1915. He was appointed to Calcutta's new research institute, University College, in 1917 along with M. N. Saha and, the following year, C. V. Raman. All three were to make pioneering contributions to physics. Four years later, Bose moved to the University of Dacca as Reader of Physics (though he returned to Calcutta in 1945). Bose had a prodigious memory and was legendary for giving highly polished lectures without consulting any notes. In 1924, Bose sent a paper to Einstein in Berlin, together with a handwritten covering letter:

Respected Sir: I have ventured to send you the accompanying article for your perusal and opinion. I am anxious to know what you think of it. You will see that I have tried to deduce the coefficient of $8\pi\nu^2/c^3$ in Planck's Law independent of the classical electrodynamics, only assuming that the ultimate elementary regions in the phase-space has the content $h^3$.

Bose had treated black body radiation as a photon gas, using phase space arguments; Planck's distribution came simply from maximising the entropy. Einstein was impressed and translated Bose's paper into German and submitted it to Zeitschrift für Physik on Bose's behalf. Einstein followed up Bose's work in 1924 by generalising it to non-relativistic particles with non-zero mass and in 1925 he deduced the phenomenon now known as Bose–Einstein condensation. This purely theoretical proposal was a full thirteen years before Fritz London proposed interpreting the superfluid transition in $^4$He as just such a Bose–Einstein condensation.

Enrico Fermi (1901–1954)

Fig. 29.6 E. Fermi

Enrico Fermi was born in Rome and gained a degree at the University of Pisa in 1922. He spent a brief period working with Born and then returned to Italy, first as a lecturer in Florence (where he worked out 'Fermi statistics', the statistical mechanics of particles subject to the Pauli exclusion principle) and then as a professor of physics at Rome in 1927. In Rome, Fermi made important contributions, including the theory of beta decay, the demonstration of nuclear transformation in elements subjected to neutron bombardment, and the discovery of slow neutrons. These results demonstrate Fermi's extraordinary ability to excel in both theory and experiment. Though extremely adept at detailed mathematical analysis, Fermi disliked complicated theories and had an aptitude for getting the right answer simply and quickly using the most efficient method possible.

Fermi was awarded the Nobel Prize in 1938 for his "demonstrations of the existence of new radioactive elements produced by neutron irradiation, and for his related discovery of nuclear reactions brought about by slow neutrons". After picking up his prize in Stockholm, he emigrated to the United States. He was one of the first to realize the possibility of a chain reaction in uranium, demonstrating the first self-sustaining nuclear reaction in a squash court near the University of Chicago in December 1942. Following this event, a coded phone call was sent to the leaders of the Manhattan project, with the message: 'The Italian navigator has landed in the new world... The natives were very friendly'. Fermi became a major player in the Manhattan project, and following the end of World War II he remained in Chicago, working in high energy physics and cosmic rays until his untimely death due to stomach cancer.


Paul Dirac (1902–1984)

Fig. 29.7 P.A.M. Dirac

Paul Adrien Maurice Dirac was brought up in Bristol by his English mother and Swiss father. His father insisted that only French was spoken at the dinner table, a stipulation that left Dirac with something of a distaste for speaking at all. He read engineering at Bristol University, graduating in 1921, and then took another degree in maths and got a first in 1923. This led him to doctoral research in Cambridge under the supervision (if one can use such a word of what was rather a tenuous relationship) of Fowler. During this period, Dirac's brother committed suicide and Dirac broke off contact with his father; this all contributed to making Dirac even more socially withdrawn.

In 1925, he read Heisenberg's paper on commutators and realized the connection with Poisson brackets from classical mechanics. His Ph.D. thesis, submitted the following year, was entitled simply Quantum Mechanics. In 1926, Dirac showed how the antisymmetry of the wave function under particle exchange led to statistics which were identical to those derived by Fermi. Particles obeying such Fermi–Dirac statistics Dirac called (generously) 'fermions', while those obeying Bose–Einstein statistics were 'bosons'. After having spent time with Bohr in Copenhagen, Born in Göttingen and Ehrenfest in Leiden, Dirac returned to Cambridge in 1927 to take up a fellowship at St John's College. His famous Dirac equation (which predicted the existence of the positron) appeared in 1928 and his book, The Principles of Quantum Mechanics (still highly readable, and in print), in 1930. In 1932 he was appointed to the Lucasian chair (held before by Newton, Airy, Babbage, Stokes and Larmor, and later by Hawking) and the following year he shared the Nobel Prize with Schrödinger "for the discovery of new productive forms of atomic theory". Following a sabbatical visit to work with Eugene Wigner at Princeton, Dirac married Wigner's sister Margrit in 1937. In 1969, Dirac retired from Cambridge and moved to Tallahassee, Florida, where he became a professor at FSU.

Dirac had a very high view of mathematics, stating in the preface to his 1930 book that it was "the tool specially suited for dealing with abstract concepts of any kind and there is no limit to its power in this field." Later he remarked that in science "one tries to tell people, in such a way as to be understood by everyone, something that no one ever knew before. But in poetry, it's the exact opposite." Clarity for Dirac was fundamental, as was beauty, as it was "more important to have beauty in one's equations than to have them fit experiment." Failure to match the results of experimental data can be rectified by further experiment, or by the sorting out of some minor feature not taken into account that subsequent theoretical development will resolve; but for Dirac, an ugly theory could never be right.

Dirac said "I was taught at school never to start a sentence without knowing the end of it." This explains a lot. Dirac's famously taciturn and precise nature spawned many "Dirac stories". Dirac once fell asleep during someone else's lecture, but woke during a moment when the speaker was getting stuck in a mathematical derivation, muttering: "Here is a minus sign where there should be a plus. I seem to have dropped a minus sign somewhere." Dirac opened one eye and interjected: "Or an odd number of them." One further example concerns a conference lecture he himself gave, following which a questioner indicated that he had not followed a particular part of Dirac's argument. A long silence ensued, broken finally by the chairman asking if Professor Dirac would deal with the question. Dirac responded, "It was a statement, not a question."

30 Quantum gases and condensates

Exchange symmetry affects the occupation of allowed states in quantum gases. If the density of the gas is very low, such that $n\lambda_{\rm th}^3 \ll 1$, we can ignore this and forget about exchange symmetry; this is what we do for gases at room temperature. But if the density is high, the effects of exchange symmetry become very important and it really starts to matter whether the particles you are considering are fermions or bosons. In this chapter, we consider quantum gases in detail and explore the possible effects that one can observe.

30.1 The non-interacting quantum fluid
30.2 The Fermi gas
30.3 The Bose gas
30.4 Bose–Einstein condensation (BEC)
Chapter summary
Further reading
Exercises

30.1 The non-interacting quantum fluid

We first consider a fluid composed of non-interacting particles. To keep things completely general for the moment, we will consider particles with spin S. This means that each allowed momentum state is associated with 2S + 1 possible spin states.1 If we can ignore interactions between particles, the grand partition function $\mathcal{Z}$ is simply the product of single-particle partition functions, so that

$$\mathcal{Z} = \prod_k \mathcal{Z}_k^{2S+1}, \tag{30.1}$$

where

$$\mathcal{Z}_k = \left(1 \pm e^{-\beta(E_k-\mu)}\right)^{\pm 1} \tag{30.2}$$

is a single-particle partition function and where the ± sign is + for fermions and − for bosons.2

1 If the spin is S, there are 2S + 1 possible states corresponding to the z-component of angular momentum being −S, −S + 1, . . . , S.

2 These results follow directly from eqns 29.18 and 29.20.

Example 30.1
Find the grand potential for a three-dimensional gas of non-interacting bosons and fermions with spin S.
Solution: The grand potential $\Phi_G$ is obtained from eqn 30.1 as follows:

$$\Phi_G = -k_B T \ln \mathcal{Z} = \mp k_B T\,(2S+1) \sum_k \ln\left(1 \pm e^{-\beta(E_k-\mu)}\right) = \mp k_B T \int_0^\infty \ln\left(1 \pm e^{-\beta(E-\mu)}\right) g(E)\,dE, \tag{30.3}$$

where g(E) is the density of states (which, as defined here, already includes the spin degeneracy factor 2S + 1) and can be derived as follows. States in k-space are uniformly distributed, and so

$$g(k)\,dk = \frac{4\pi k^2\,dk}{(2\pi/L)^3} \times (2S+1) = \frac{(2S+1) V k^2\,dk}{2\pi^2}, \tag{30.4}$$

where (2S + 1) is the spin degeneracy factor and $V = L^3$ is the volume. Using $E = \hbar^2 k^2/2m$ we can transform this into

$$g(E)\,dE = \frac{(2S+1) V E^{1/2}\,dE}{(2\pi)^2} \left(\frac{2m}{\hbar^2}\right)^{3/2}, \tag{30.5}$$

and hence

$$\Phi_G = \mp k_B T\,\frac{(2S+1)V}{(2\pi)^2} \left(\frac{2m}{\hbar^2}\right)^{3/2} \int_0^\infty \ln\left(1 \pm e^{-\beta(E-\mu)}\right) E^{1/2}\,dE, \tag{30.6}$$

which after integrating by parts yields

$$\Phi_G = -\frac{2}{3}\,\frac{(2S+1)V}{(2\pi)^2} \left(\frac{2m}{\hbar^2}\right)^{3/2} \int_0^\infty \frac{E^{3/2}\,dE}{e^{\beta(E-\mu)} \pm 1}. \tag{30.7}$$

3 Note that in the derived expressions, the ± sign means + for fermions and − for bosons.

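The grand potential evaluated in the previous example can be used to derive various thermodynamic functions for fermions and bosons.3 Another way to get to the same result is to evaluate the mean occupation $\langle n_k \rangle$ of a state with wave vector k, which is given by

$$\langle n_k \rangle = k_B T \frac{\partial}{\partial \mu} \ln \mathcal{Z}_k = \frac{1}{e^{\beta(E_k-\mu)} \pm 1}, \tag{30.8}$$

and then use this expression to derive directly quantities such as

$$N = \sum_k \langle n_k \rangle = \int_0^\infty \frac{g(E)\,dE}{e^{\beta(E-\mu)} \pm 1}, \tag{30.9}$$

and

$$U = \sum_k \langle n_k \rangle E_k = \int_0^\infty \frac{E\,g(E)\,dE}{e^{\beta(E-\mu)} \pm 1}. \tag{30.10}$$

For reasons which will become more clear below, we will write $e^{\beta\mu}$ as the fugacity z, i.e.

$$z = e^{\beta\mu}. \tag{30.11}$$

These give expressions for N and U as follows:

$$N = \frac{(2S+1)V}{(2\pi)^2} \left(\frac{2m}{\hbar^2}\right)^{3/2} \int_0^\infty \frac{E^{1/2}\,dE}{z^{-1} e^{\beta E} \pm 1} \tag{30.12}$$

and

$$U = \frac{(2S+1)V}{(2\pi)^2} \left(\frac{2m}{\hbar^2}\right)^{3/2} \int_0^\infty \frac{E^{3/2}\,dE}{z^{-1} e^{\beta E} \pm 1}. \tag{30.13}$$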

One problem with all these types of formula, such as eqns 30.7, 30.12 and 30.13, is that to simplify them any further, you have to do a difficult integral. Fortunately, we can show that these integrals are related to the polylogarithm function ${\rm Li}_n(x)$ (see Appendix C.5), so that

$$\int_0^\infty \frac{E^{n-1}\,dE}{z^{-1} e^{\beta E} \pm 1} = (k_B T)^n\,\Gamma(n)\,[\mp {\rm Li}_n(\mp z)], \tag{30.14}$$

where Γ(n) is a gamma function. This result is proved in the appendix (eqn C.36). The crucial thing to realize is that ${\rm Li}_n(z)$ is just a numerical function of z, i.e. of the temperature and the chemical potential. This integral then allows us to establish, after a small amount of algebra, that the number N of particles is given by

$$N = \frac{(2S+1)V}{\lambda_{\rm th}^3}\,[\mp {\rm Li}_{3/2}(\mp z)], \tag{30.15}$$

and the internal energy U is given by

$$U = \frac{3}{2} k_B T\,\frac{(2S+1)V}{\lambda_{\rm th}^3}\,[\mp {\rm Li}_{5/2}(\mp z)] = \frac{3}{2} N k_B T\,\frac{{\rm Li}_{5/2}(\mp z)}{{\rm Li}_{3/2}(\mp z)}. \tag{30.16}$$

We will use these equations in subsequent sections. Note also that we have from eqns 30.7 and 30.13 that

$$\Phi_G = -\frac{2}{3} U. \tag{30.17}$$
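The "small amount of algebra" between eqns 30.12, 30.14 and 30.15 can be sketched as follows. This is only an outline, and it assumes the thermal wavelength is defined as $\lambda_{\rm th} = h/\sqrt{2\pi m k_B T}$, a definition that is not restated in this chapter.

```latex
% Put n = 3/2 in eqn 30.14 and insert the result in eqn 30.12:
\begin{align*}
N &= \frac{(2S+1)V}{(2\pi)^2}\left(\frac{2m}{\hbar^2}\right)^{3/2}
     (k_B T)^{3/2}\,\Gamma(\tfrac32)\,[\mp{\rm Li}_{3/2}(\mp z)],
     \qquad \Gamma(\tfrac32)=\frac{\sqrt{\pi}}{2},\\
  &= (2S+1)V\,\frac{1}{8\pi^{3/2}}\left(\frac{2mk_BT}{\hbar^2}\right)^{3/2}
     [\mp{\rm Li}_{3/2}(\mp z)]
   = (2S+1)V\left(\frac{mk_BT}{2\pi\hbar^2}\right)^{3/2}[\mp{\rm Li}_{3/2}(\mp z)],
\end{align*}
% and (m k_B T / 2\pi\hbar^2)^{3/2} = 1/\lambda_{\rm th}^3 under the assumed definition,
% which is eqn 30.15. For U, put n = 5/2 and use
% \Gamma(5/2) = (3/2)\Gamma(3/2); the extra factor (3/2) k_B T gives eqn 30.16.
```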

Example 30.2
Evaluate N, U and $\Phi_G$ (from eqns 30.15, 30.16 and 30.17) in the high-temperature limit.
Solution: In the high-temperature limit, namely $z = e^{\beta\mu} \ll 1$ (so that βµ is large and negative), we can use the fact that ${\rm Li}_n(z) \approx z$ when $|z| \ll 1$. Hence

$$N \approx \frac{(2S+1)V z}{\lambda_{\rm th}^3}, \tag{30.18}$$
$$U \approx \frac{3}{2} N k_B T, \tag{30.19}$$
$$\Phi_G \approx -N k_B T. \tag{30.20}$$

These three equations are reassuringly familiar. The equation for N shows that the number density of particles N/V is such that, on average, (2S + 1)z particles (z for each spin state) occupy a volume $\lambda_{\rm th}^3$; since $z \ll 1$, this is the dilute classical regime. The equation for U asserts that the energy per particle is the familiar equipartition result $\frac{3}{2} k_B T$. The equation for $\Phi_G$, together with $\Phi_G = -pV$ (from eqn 22.49), yields the ideal gas law $pV = N k_B T$.

30.2 The Fermi gas

What we have done so far is to consider bosons and fermions on an equal footing. Let us now restrict our attention to a gas of fermions (known as a Fermi gas) and, to get a feel for what is going on, let us also consider T = 0. Fermions will occupy the lowest-energy states, but we can only put one fermion in each state, and thus only 2S + 1 in each energy level. The fermions will fill up the energy levels until they get to an energy $E_F$, known as the Fermi energy, which is the energy of the highest occupied state at a temperature of absolute zero.4 Thus we define

$$E_F = \mu(T = 0). \tag{30.21}$$

This makes sense because $\mu(T = 0) = \partial E/\partial N$, which gives $\mu(T = 0) = E(N) - E(N-1) = E_F$. At absolute zero, we have that β → ∞, and hence the occupation $\langle n_k \rangle$ is given by

$$\langle n_k \rangle = \frac{1}{e^{\beta(E_k-\mu)} + 1} = \theta(\mu - E_k), \tag{30.22}$$

where θ(x) is a Heaviside step function.5 At absolute zero, therefore, the number of occupied states is given by

$$N = \int_0^{k_F} g(k)\,dk, \tag{30.23}$$

where $k_F$ is the Fermi wave vector, defined by

$$E_F = \frac{\hbar^2 k_F^2}{2m}. \tag{30.24}$$

Hence the number of fermions N is given by

$$N = \frac{(2S+1)V}{2\pi^2}\,\frac{k_F^3}{3}, \tag{30.25}$$

so that writing n = N/V, we have

$$k_F = \left(\frac{6\pi^2 n}{2S+1}\right)^{1/3}, \tag{30.26}$$

and hence

$$E_F = \frac{\hbar^2}{2m} \left(\frac{6\pi^2 n}{2S+1}\right)^{2/3}. \tag{30.27}$$

4 The highest filled energy level at T = 0 is known as the Fermi level, though this can be a misleading term as, for example in semiconductors, there may not be any states at the chemical potential (which lies somewhere in the energy gap).

5 The Heaviside step function θ(x) is defined by θ(x) = 0 for x < 0 and θ(x) = 1 for x > 0. It is plotted in Fig. 30.1.

Fig. 30.1 The Heaviside step function.

Example 30.3
Evaluate $k_F$ and $E_F$ for spin-1/2 particles.
Solution: When S = 1/2, 2S + 1 = 2 and hence eqns 30.26 and 30.27 become

$$k_F = \left(3\pi^2 n\right)^{1/3}, \tag{30.28}$$

and

$$E_F = \frac{\hbar^2}{2m} \left(3\pi^2 n\right)^{2/3}. \tag{30.29}$$

Fig. 30.2 (a) The Fermi function f(E) defined by eqn 29.26. The thick line is for T = 0. The step function is smoothed out as the temperature is increased (shown as thinner lines). The temperatures shown are T = 0, $T = 0.01\mu/k_B$, $T = 0.05\mu/k_B$ and $T = 0.1\mu/k_B$. (b) The density of states g(E) for a non-interacting fermion gas in three dimensions is proportional to $E^{1/2}$. (c) f(E)g(E) for the same temperatures as in (a).

At T = 0, the distribution function f(E) is a Heaviside step function, taking the value 1 for E < µ and 0 for E > µ. This step is smoothed out as the temperature T increases, as shown in Fig. 30.2(a). The density of states g(E) for a non-interacting fermion gas in three dimensions is proportional to $E^{1/2}$ (as shown in eqn 30.5) and this is plotted in Fig. 30.2(b). The product f(E)g(E) gives the actual number distribution of fermions, and this is shown in Fig. 30.2(c). The sharp cutoff you would expect at T = 0 is smoothed over an energy scale $k_B T$ around the chemical potential µ.

The electrons in a metal can be treated as a non-interacting gas of fermions. Using the number density n of electrons in a metal, one can calculate the Fermi energy using eqn 30.29, and some example results are shown in Table 30.1. The Fermi energies are all several eV; converting each number into a temperature, the so-called Fermi temperature $T_F = E_F/k_B$, yields values of several tens of thousands of kelvin. Thus the Fermi energy is a large energy scale, and hence for most metals the Fermi function is close to a step function at pretty much all temperatures below their melting temperature. In this case, the electrons in a metal are said to be in the degenerate limit. The pressure of these electrons is given (by using eqns 22.49 and 30.17) as

$$p = \frac{2U}{3V}, \tag{30.30}$$

as is appropriate for non-relativistic electrons (see Table 25.1). The mean energy of the electrons at T = 0 is given by

$$\langle E \rangle = \frac{\int_0^{E_F} E\,g(E)\,dE}{\int_0^{E_F} g(E)\,dE}, \tag{30.31}$$

which with $g(E) \propto E^{1/2}$ gives $\langle E \rangle = \frac{3}{5} E_F$. Writing $U = N\langle E \rangle$, so that $U/V = n\langle E \rangle$, and noting that $U \propto V^{-2/3}$ at fixed N, we have that the bulk modulus B is

$$B = -V \frac{\partial p}{\partial V} = \frac{10U}{9V} = \frac{2}{3} n E_F. \tag{30.32}$$

This expression is evaluated in Table 30.1 and gives results which are of the same order of magnitude as experimental values. The next example computes an integral which is useful for considering analytically the effect of finite temperature.

Table 30.1 Properties of selected metals

Metal   n (10^28 m^-3)   E_F (eV)   (2/3) n E_F (10^9 N m^-2)   B (10^9 N m^-2)
Li      4.70             4.74       23.8                        11.1
Na      2.65             3.24       9.2                         6.3
K       1.40             2.12       3.2                         3.1
Cu      8.47             7.00       63.3                        137.8
Ag      5.86             5.49       34.3                        103.6
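As a rough numerical check, the calculated columns of Table 30.1 can be reproduced from eqns 30.29 and 30.32. The sketch below is illustrative only: the function names are made up for this example, the constants are standard SI values, and the electron densities are those listed in the table.

```python
import math

HBAR = 1.054571817e-34   # reduced Planck constant, J s
M_E = 9.1093837015e-31   # electron mass, kg
EV = 1.602176634e-19     # joules per electronvolt

def fermi_energy_ev(n):
    """Fermi energy in eV of a spin-1/2 gas with number density n (m^-3), eqn 30.29."""
    return HBAR**2 / (2 * M_E) * (3 * math.pi**2 * n) ** (2 / 3) / EV

def bulk_modulus_gpa(n):
    """Free-electron estimate B = (2/3) n E_F in units of 10^9 N m^-2, eqn 30.32."""
    return (2 / 3) * n * fermi_energy_ev(n) * EV / 1e9

# Electron number densities from Table 30.1, in m^-3.
DENSITIES = {"Li": 4.70e28, "Na": 2.65e28, "K": 1.40e28, "Cu": 8.47e28, "Ag": 5.86e28}

for metal, n in DENSITIES.items():
    print(f"{metal}: E_F = {fermi_energy_ev(n):.2f} eV, "
          f"(2/3) n E_F = {bulk_modulus_gpa(n):.1f} x 10^9 N m^-2")
```

The printed values should agree with the table to within rounding of the input densities.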

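Example 30.4
Evaluate the integral $I = \int_0^\infty \phi(E) f(E)\,dE$ as a power series in temperature.
Solution: Consider the function $\psi(E) = \int_0^E \phi(E')\,dE'$, which is defined so that $\phi(E) = d\psi/dE$ and therefore

$$I = \int_0^\infty \frac{d\psi}{dE} f(E)\,dE = \left[f(E)\psi(E)\right]_0^\infty - \int_0^\infty \psi(E) \frac{df}{dE}\,dE = -\int_0^\infty \psi(E) \frac{df}{dE}\,dE. \tag{30.33}$$

Now put $x = (E - \mu)/k_B T$ and hence

$$\frac{df}{dE} = -\frac{1}{k_B T}\,\frac{e^x}{(e^x+1)^2}. \tag{30.34}$$

Writing ψ(E) as a power series in x as

$$\psi(E) = \sum_{s=0}^\infty \frac{x^s}{s!} \left(\frac{d^s\psi}{dx^s}\right)_{x=0}, \tag{30.35}$$

we can express I as a power series of integrals as follows:

$$I = \sum_{s=0}^\infty \frac{1}{s!} \left(\frac{d^s\psi}{dx^s}\right)_{x=0} \int_{-E_F/k_B T}^\infty \frac{x^s e^x\,dx}{(e^x+1)^2}. \tag{30.36}$$

The integral part of this can be simplified by replacing6 the lower limit by −∞. It vanishes for odd s, but for even s

$$\int_{-\infty}^\infty \frac{x^s e^x\,dx}{(e^x+1)^2} = 2\int_0^\infty \frac{x^s e^x\,dx}{(e^x+1)^2} = 2\int_0^\infty dx\,x^s \sum_{n=0}^\infty (n+1)(-1)^n e^{-(n+1)x}$$
$$= 2\sum_{n=1}^\infty (-1)^{n+1}\,n \int_0^\infty x^s e^{-nx}\,dx = 2(s!)\sum_{n=1}^\infty \frac{(-1)^{n+1}}{n^s} = 2(s!)(1 - 2^{1-s})\,\zeta(s), \tag{30.37}$$

where ζ(s) is the Riemann zeta function. Thus the integral is

$$I = \sum_{s=0,\ s\ {\rm even}}^\infty \left(\frac{d^s\psi}{dx^s}\right)_{x=0} 2(1 - 2^{1-s})\,\zeta(s)$$
$$= \psi\big|_{x=0} + \frac{\pi^2}{6}\left(\frac{d^2\psi}{dx^2}\right)_{x=0} + \frac{7\pi^4}{360}\left(\frac{d^4\psi}{dx^4}\right)_{x=0} + \cdots$$
$$= \int_0^\mu \phi(E)\,dE + \frac{\pi^2}{6}(k_B T)^2 \left(\frac{d\phi}{dE}\right)_{E=\mu} + \frac{7\pi^4}{360}(k_B T)^4 \left(\frac{d^3\phi}{dE^3}\right)_{E=\mu} + \ldots \tag{30.38}$$

This expression is known as the Sommerfeld formula.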
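The Sommerfeld formula can be checked numerically for a simple case. The sketch below takes $\phi(E) = E^{1/2}$, works in units where µ = 1, and compares a Simpson's-rule evaluation of the integral with the first two terms of the expansion. The function names, the cutoff at µ + 40 k_B T and the number of integration steps are choices made for this illustration, not part of the text.

```python
import math

def fermi_integral_numeric(t, n_steps=20000):
    """Integral of E^(1/2) f(E) over E >= 0 with mu = 1 and t = k_B T / mu (Simpson's rule)."""
    # Substituting E = u^2 removes the square-root behaviour at E = 0.
    u_max = math.sqrt(1.0 + 40.0 * t)  # f(E) is negligible beyond E = mu + 40 k_B T
    h = u_max / n_steps                # n_steps must be even for Simpson's rule

    def integrand(u):
        return 2.0 * u * u / (math.exp((u * u - 1.0) / t) + 1.0)

    total = integrand(0.0) + integrand(u_max)
    for i in range(1, n_steps):
        total += (4 if i % 2 else 2) * integrand(i * h)
    return total * h / 3.0

def fermi_integral_sommerfeld(t):
    """First two terms of the Sommerfeld formula for phi(E) = E^(1/2) with mu = 1."""
    return (2.0 / 3.0) * (1.0 + (math.pi**2 / 8.0) * t**2)

for t in (0.01, 0.05, 0.1):
    exact = fermi_integral_numeric(t)
    approx = fermi_integral_sommerfeld(t)
    print(f"k_B T/mu = {t}: numeric = {exact:.8f}, Sommerfeld = {approx:.8f}")
```

The residual difference between the two columns is expected to scale as $(k_B T/\mu)^4$, the order of the first neglected term.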

6 This approximation is valid when $k_B T \ll E_F$.

Having derived the Sommerfeld formula, we can now evaluate N and U quite easily. Let us choose S = 1/2, just to make the equations a little less cumbersome. Then

$$N = \frac{V}{2\pi^2} \left(\frac{2m}{\hbar^2}\right)^{3/2} \int_0^\infty E^{1/2} f(E)\,dE = \frac{V}{3\pi^2} \left(\frac{2m}{\hbar^2}\right)^{3/2} \mu^{3/2} \left[1 + \frac{\pi^2}{8}\left(\frac{k_B T}{\mu}\right)^2 + \ldots\right], \tag{30.39}$$

which implies that

$$\mu(T) = \mu(0) \left[1 - \frac{\pi^2}{12}\left(\frac{k_B T}{\mu(0)}\right)^2 + \ldots\right]. \tag{30.40}$$

In fact, equating EF and µ is good to 0.01% for typical metals even at room temperature, although it is worthwhile keeping in the back of one’s mind that the two quantities are not the same. We can also compute the heat capacity of electrons in a metal by a similar technique, as shown in the following example.

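Example 30.5
Compute the heat capacity of non-interacting free electrons in a three-dimensional metal.
Solution:

$$U = \frac{V}{2\pi^2} \left(\frac{2m}{\hbar^2}\right)^{3/2} \int_0^\infty E^{3/2} f(E)\,dE$$
$$= \frac{V}{5\pi^2} \left(\frac{2m}{\hbar^2}\right)^{3/2} \mu(T)^{5/2} \left[1 + \frac{5\pi^2}{8}\left(\frac{k_B T}{\mu(0)}\right)^2 + \ldots\right]$$
$$= \frac{3}{5} N \mu(T) \left[1 + \frac{\pi^2}{2}\left(\frac{k_B T}{\mu(0)}\right)^2 + \ldots\right]$$
$$= \frac{3}{5} N \mu(0) \left[1 + \frac{5\pi^2}{12}\left(\frac{k_B T}{\mu(0)}\right)^2 + \ldots\right] \tag{30.41}$$

and hence

$$C_V = \frac{3}{2} N k_B \left(\frac{\pi^2}{3}\,\frac{k_B T}{\mu(0)}\right) + O(T^3). \tag{30.42}$$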

Thus the contribution to the heat capacity from electrons is linear in temperature (recall from Chapter 24 that the heat capacity from lattice vibrations (phonons) is proportional to $T^3$ at low temperature) and will therefore dominate the heat capacity of a metal at very low temperatures.
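To get a feel for the sizes involved, the leading terms of eqns 30.40 and 30.42 can be evaluated for the Fermi energies of Table 30.1. This is a sketch only: the temperature of 300 K and the helper names are assumptions made for the example.

```python
import math

K_B_EV = 8.617333262e-5  # Boltzmann constant in eV/K

def mu_relative_shift(e_fermi_ev, temperature):
    """Leading fractional reduction of mu below E_F, (pi^2/12)(k_B T/E_F)^2, from eqn 30.40."""
    return (math.pi**2 / 12) * (K_B_EV * temperature / e_fermi_ev) ** 2

def heat_capacity_per_electron(e_fermi_ev, temperature):
    """C_V/(N k_B) = (pi^2/2)(k_B T/E_F), the leading term of eqn 30.42."""
    return (math.pi**2 / 2) * K_B_EV * temperature / e_fermi_ev

# Fermi energies in eV from Table 30.1.
FERMI_ENERGIES = {"Li": 4.74, "Na": 3.24, "K": 2.12, "Cu": 7.00, "Ag": 5.49}
T_ROOM = 300.0  # assumed room temperature in K

for metal, e_f in FERMI_ENERGIES.items():
    print(f"{metal}: (E_F - mu)/E_F = {mu_relative_shift(e_f, T_ROOM):.1e}, "
          f"C_V/(N k_B) = {heat_capacity_per_electron(e_f, T_ROOM):.4f}")
```

For these metals the shift in µ comes out at roughly the 0.01% level or below, and the electronic heat capacity is a few per cent of the classical value $\frac{3}{2} N k_B$.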

7 The periodic potential which exists in crystalline metals can lead to the formation of energy gaps, i.e. intervals in energy in which there are no allowed states.

The Fermi surface is the set of points in k-space whose energy is equal to the chemical potential. If the chemical potential lies in a gap7 between energy bands, then the material is a semiconductor or an insulator and there will be no Fermi surface. Thus a metal is a material with a Fermi surface.

30.3 The Bose gas

For the Bose gas (a gas composed of bosons), we can use our expressions for N and U in eqns 30.15 and 30.16 to give

$$N = \frac{(2S+1)V}{\lambda_{\rm th}^3}\,{\rm Li}_{3/2}(z) \tag{30.43}$$

and

$$U = \frac{3}{2} N k_B T\,\frac{{\rm Li}_{5/2}(z)}{{\rm Li}_{3/2}(z)}. \tag{30.44}$$

Example 30.6
Evaluate eqns 30.43 and 30.44 for the case µ = 0.
Solution: If µ = 0 then z = 1. Now ${\rm Li}_n(1) = \zeta(n)$ where ζ(n) is the Riemann zeta function. Therefore

$$N = \frac{(2S+1)V}{\lambda_{\rm th}^3}\,\zeta\!\left(\tfrac{3}{2}\right) \tag{30.45}$$

and

$$U = \frac{3}{2} N k_B T\,\frac{\zeta(\frac{5}{2})}{\zeta(\frac{3}{2})}. \tag{30.46}$$

The numerical values are $\zeta(\frac{3}{2}) = 2.612$, $\zeta(\frac{5}{2}) = 1.341$, and hence we have that $\zeta(\frac{5}{2})/\zeta(\frac{3}{2}) = 0.513$. Note that these results will not apply to photons because we have assumed at the beginning that $E = \hbar^2 k^2/2m$, whereas for a photon $E = \hbar k c$. This is worked through in the following example.

Example 30.7
Rederive the equation for U for a gas of photons using the formalism of this chapter.
Solution: The density of states is $g(k)\,dk = (2S+1)V k^2\,dk/(2\pi^2)$. A photon has a spin of 1, but the 0 state is not allowed, so the spin degeneracy factor (2S + 1) is in this case only 2. Using $E = \hbar k c$ we arrive at

$$g(E)\,dE = \frac{V}{\pi^2 \hbar^3 c^3}\,E^2\,dE, \tag{30.47}$$

and hence

$$U = \int_0^\infty \frac{E\,g(E)\,dE}{z^{-1} e^{\beta E} - 1} = \frac{V}{\pi^2 \hbar^3 c^3} \int_0^\infty \frac{E^3\,dE}{z^{-1} e^{\beta E} - 1}, \tag{30.48}$$

and using

$$\int_0^\infty \frac{E^3\,dE}{z^{-1} e^{\beta E} - 1} = (k_B T)^4\,\Gamma(4)\,{\rm Li}_4(z), \tag{30.49}$$

and recognizing that z = 1 because µ = 0 and hence ${\rm Li}_4(z) = \zeta(4) = \pi^4/90$, and using Γ(4) = 3! = 6, we have that

$$U = \frac{V \pi^2}{15 \hbar^3 c^3}\,(k_B T)^4, \tag{30.50}$$

which agrees with eqn 23.37.

For Bose systems with a dispersion relation like $E = \hbar^2 k^2/2m$ (i.e. for a gapless dispersion, where the lowest-energy level, corresponding to k = 0 or infinite wavelength, is at zero energy), the chemical potential has to be negative. If it were not, the level at E = 0 would have infinite occupation. Thus µ < 0, and hence the fugacity $z = e^{\beta\mu}$ must lie in the range 0 < z < 1. But what value will the chemical potential take? Equation 30.43 can be rearranged to give

$$\frac{n\lambda_{\rm th}^3}{2S+1} = {\rm Li}_{3/2}(z), \tag{30.51}$$

and here we hit an uncomfortable problem. The left-hand side can be increased if n = N/V increases or if T decreases (because $\lambda_{\rm th} \propto T^{-1/2}$). We can plug numbers for n and T into the left-hand side and then read off a value for z from the graph in Fig. 30.3, which shows the behaviour of the function ${\rm Li}_{3/2}(z)$ (and also ${\rm Li}_{5/2}(z)$). As we raise n or decrease T, we make the left-hand side of eqn 30.51 bigger and hence z bigger, so that µ becomes less negative, approaching 0 from below. However, if

$$\frac{n\lambda_{\rm th}^3}{2S+1} > \zeta\!\left(\tfrac{3}{2}\right) = 2.612, \tag{30.52}$$

there is no solution to eqn 30.51. What has happened?

30.4 Bose–Einstein condensation (BEC)

The solution to the conundrum raised in the previous section is remarkably subtle, but has far-reaching consequences. As the chemical potential has become closer and closer to zero energy, approaching this from below, the lowest energy level has become macroscopically occupied. The reason our mathematics has broken down is that our usual, normally perfectly reasonable, approximation in going from a sum to an integral in evaluating our grand partition function is no longer valid. In fact, we can see when this fails using a rearranged version of eqn 30.52. Failure occurs when we fall below a temperature $T_c$ given by

$$k_B T_c = \frac{2\pi\hbar^2}{m} \left(\frac{n}{2.612\,(2S+1)}\right)^{2/3}. \tag{30.53}$$

Fig. 30.3 The functions ${\rm Li}_{3/2}(z)$ and ${\rm Li}_{5/2}(z)$. For $z \ll 1$ (the classical regime), ${\rm Li}_n(z) \approx z$. Also, ${\rm Li}_n(1) = \zeta(n)$.

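We can perform a corrected analysis of the problem as follows. We separate N into two terms:

$$N = N_0 + N_1, \tag{30.54}$$

where $N_0$ is

$$N_0 = \frac{1}{e^{-\beta\mu} - 1} = \frac{z}{1 - z}, \tag{30.55}$$

the number of particles in the ground state, and $N_1$ is our original integral representing all the other states. Thus above $T_c$,

$$N = N_1 = \frac{(2S+1)V}{\lambda_{\rm th}^3}\,{\rm Li}_{3/2}(z), \tag{30.56}$$

but below $T_c$, $N_1$ is fixed to be

$$N_1 = \frac{(2S+1)V}{\lambda_{\rm th}^3}\,{\rm Li}_{3/2}(1), \tag{30.57}$$

so that the concentration of particles in the excited states is

$$n_1 \equiv \frac{N_1}{V} = \frac{(2S+1)\,\zeta(\frac{3}{2})}{\lambda_{\rm th}^3}. \tag{30.58}$$

Any remaining particles must be in the ground state. The total concentration n is fixed by the condition that, at $T = T_c$ itself, the excited states can only just accommodate all the particles, so that

$$n \equiv \frac{N}{V} = \frac{(2S+1)\,\zeta(\frac{3}{2})}{\lambda_{\rm th}(T_c)^3}. \tag{30.59}$$

Hence, since $\lambda_{\rm th} \propto T^{-1/2}$,

$$\frac{n_0}{n} = \frac{n - n_1}{n} = 1 - \left(\frac{T}{T_c}\right)^{3/2}. \tag{30.60}$$

Fig. 30.4 The number of particles in the ground state as a function of temperature, after eqn 30.60.

8 This is often abbreviated to BEC.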

This function is plotted in Fig. 30.4 and shows how the number of particles in the ground state grows as the temperature is cooled below Tc . This macroscopic occupation of the ground state is known as Bose– Einstein condensation.8 Note that this transition is not driven by interactions between particles (as we had for the liquid-gas transition); we have so far only considered non-interacting particles; the transition is driven purely by the requirements of exchange symmetry on the quantum statistics of the bosons. The term ‘condensation’ often implies a condensation in space, as when liquid water condenses on a cold window in a steamy bathroom. However, for Bose–Einstein condensation it is a condensation in k-space, with a macroscopic occupation of the lowest energy state occurring below Tc .
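The two regimes can be explored numerically: below $T_c$ the condensate fraction follows eqn 30.60, and above $T_c$ the fugacity is found by solving eqn 30.51 for z. The sketch below sums the polylogarithm series directly and inverts it by bisection. The function names are made up for this illustration, and the direct series is only practical when z is not too close to 1, so the solver is restricted to $n\lambda_{\rm th}^3/(2S+1) \le 2.5$.

```python
ZETA_3_2 = 2.612  # zeta(3/2), the value quoted in the text

def polylog(s, z, tol=1e-14, max_terms=200000):
    """Li_s(z) = sum over k >= 1 of z^k / k^s, by direct summation (for 0 <= z < 1)."""
    total, power = 0.0, 1.0
    for k in range(1, max_terms + 1):
        power *= z
        term = power / k**s
        total += term
        if term < tol:
            break
    return total

def fugacity(x, tol=1e-12):
    """Solve Li_{3/2}(z) = x for z by bisection, where x = n lambda_th^3 / (2S + 1)."""
    if not 0.0 <= x <= 2.5:
        # Beyond 2.5 the direct series converges too slowly to be trusted,
        # and beyond zeta(3/2) = 2.612 there is no solution at all (eqn 30.52).
        raise ValueError("x must lie between 0 and 2.5 for this simple solver")
    lo, hi = 0.0, 1.0
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if polylog(1.5, mid) < x:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def condensate_fraction(t_over_tc):
    """Ground-state fraction n_0/n from eqn 30.60; zero above T_c."""
    return max(0.0, 1.0 - t_over_tc**1.5)

for x in (0.01, 0.5, 1.0, 2.0, 2.5):
    print(f"n lambda^3/(2S+1) = {x}: z = {fugacity(x):.6f}")
for ratio in (0.25, 0.5, 0.9, 1.0, 1.5):
    print(f"T/T_c = {ratio}: n_0/n = {condensate_fraction(ratio):.4f}")
```

In the dilute limit the solver returns $z \approx n\lambda_{\rm th}^3/(2S+1)$, as expected from ${\rm Li}_{3/2}(z) \approx z$, and z climbs towards 1 as the left-hand side of eqn 30.51 approaches 2.612.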

Example 30.8
Find the internal energy U(T) at temperature T for the Bose gas.
Solution: The internal energy of the system only depends on the excited states, since the macroscopically occupied ground state has zero energy. Since z = 1 for $T \le T_c$, we have that

$$U = \frac{3}{2} N_1 k_B T\,\frac{\zeta(\frac{5}{2})}{\zeta(\frac{3}{2})} = \frac{3}{2} N k_B T\,\frac{\zeta(\frac{5}{2})}{\zeta(\frac{3}{2})} \left(\frac{T}{T_c}\right)^{3/2} = 0.77\,N k_B T_c \left(\frac{T}{T_c}\right)^{5/2}. \tag{30.61}$$

For $T > T_c$ we have (from eqn 30.44)

$$U = \frac{3}{2} N k_B T\,\frac{{\rm Li}_{5/2}(z)}{{\rm Li}_{3/2}(z)}. \tag{30.62}$$

This example gives the high-temperature results as a function of the fugacity, but z is temperature-dependent. For a system with a fixed number N of bosons, we can extract z via $N/V = (2S+1)\,{\rm Li}_{3/2}(z)/\lambda_{\rm th}^3$, and equating this with eqn 30.59 yields

$$\left(\frac{T}{T_c}\right)^{3/2} = \frac{\zeta(\frac{3}{2})}{{\rm Li}_{3/2}(z)}, \tag{30.63}$$

which, although it cannot be straightforwardly inverted to make z the subject, does show how z is related to T above $T_c$. (Below $T_c$, z is practically one.) The fugacity z, internal energy U and heat capacity $C_V$, calculated for non-interacting bosons, are plotted in Fig. 30.5. The fugacity is obtained by numerical inversion of eqn 30.63; it rises towards unity as you cool, and below $T_c$ is not actually one but very close to it. The internal energy U in Fig. 30.5(b) is obtained from eqn 30.62, while the heat capacity $C_V$ is plotted from eqn 30.66, to be proven in the exercises at the end of this chapter.

Fig. 30.5 The (a) fugacity, (b) internal energy and (c) heat capacity for a system of bosons as a function of temperature.

The Indian physicist S. N. Bose wrote to Einstein in 1924 describing his work on the statistical mechanics of photons. Einstein appreciated the significance of this work and used Bose’s approach to predict what is now called Bose–Einstein condensation. In the late 1930s, it was discovered that liquid $^4$He becomes a superfluid when cooled below about 2.2 K. Superfluidity is a quantum-mechanical state of matter with very unusual properties, such as the ability to flow through very small capillaries with no measurable viscosity. Speculation arose as to whether this state of matter was connected with Bose–Einstein condensation.

Example 30.9
Estimate the Bose–Einstein condensation temperature for liquid $^4$He, given that $m_{\rm He} \approx 4m_p$ and that the density $\rho \approx 145$ kg m$^{-3}$.
Solution: Using n = ρ/m, eqn 30.53 yields $T_c \approx 3.1$ K, which is remarkably close to the experimental value of the superfluid transition temperature.
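The arithmetic behind this estimate is short enough to reproduce directly. The following sketch evaluates eqn 30.53 with the numbers given in the example; the function name and the standard SI constants are the only additions, and spin S = 0 is assumed for $^4$He.

```python
import math

HBAR = 1.054571817e-34   # reduced Planck constant, J s
K_B = 1.380649e-23       # Boltzmann constant, J/K
M_P = 1.67262192e-27     # proton mass, kg
ZETA_3_2 = 2.612         # zeta(3/2), as quoted in the text

def bec_temperature(n, mass, spin=0):
    """Condensation temperature in K from eqn 30.53, for number density n (m^-3)."""
    prefactor = 2 * math.pi * HBAR**2 / (mass * K_B)
    return prefactor * (n / (ZETA_3_2 * (2 * spin + 1))) ** (2 / 3)

rho = 145.0              # density of liquid helium-4, kg m^-3 (from the example)
m_he = 4 * M_P           # approximate mass of a helium-4 atom
t_c = bec_temperature(rho / m_he, m_he)
print(f"T_c = {t_c:.2f} K")
```

The same function can be reused for a dilute atomic gas by passing its much lower number density and larger atomic mass.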

Fig. 30.6 Observation of Bose–Einstein condensation by absorption imaging. The data are shown as shadow pictures (upper panel) and as a three-dimensional plot (lower panel); the blackness of the shadow in the upper panel is here represented by height in the lower panel. These pictures measure the slow expansion of the trapped atoms observed after a 0.006 s time of flight, and thus measure the momentum distribution inside the cloud. The left-hand picture shows an expanding cloud cooled to just above the transition point. In the right-hand picture we see the velocity distribution well below $T_c$ where almost all the atoms are condensed into the zero-velocity peak. (Image courtesy W. Ketterle.)

9 Alkali atoms are in Group I of the periodic table and include Li, Na, K, Rb and Cs.

10 The 2001 Nobel Prize was awarded to Eric Cornell and Carl Wieman (who did the experiment with rubidium atoms) and to Wolfgang Ketterle (who did it with sodium atoms).

Despite the agreement between this estimate and the experimental value, things are a bit more complicated. The particle density of $^4$He is very high and interactions between helium atoms cannot be ignored; $^4$He is a strongly interacting Bose gas, and therefore the predictions of the theory outlined in this chapter have to be modified. A more suitable example of Bose–Einstein condensation is provided by the very dilute gases of alkali metal atoms9 that can be prepared inside magnetic ion traps. The atoms, usually about $10^4$–$10^6$ of them, can be trapped and cooled using the newly developed techniques of laser cooling. These alkali atoms have a single electronic spin due to their one valence electron and this can couple with the non-zero nuclear spin. Each atom therefore has a magnetic moment and thus can be trapped inside local minima of magnetic field. The density of these ultracold atomic gases inside the traps is very low, more than seven orders of magnitude lower than that in $^4$He, though their masses are higher. The Bose–Einstein condensation temperature is therefore also very low, typically $10^{-8}$–$10^{-6}$ K, but these temperatures can be reached using laser cooling. The low density precludes significant three-body collisions (in which two atoms bind with the third taking away the excess kinetic energy, thus causing clustering), but two-body collisions do occur which allow the cloud of atoms to thermalize. Example data from one such experiment are shown in Fig. 30.6, and clearly show that Bose–Einstein condensation is taking place below a critical temperature.10 Superfluidity is also found in these ultracold atomic gases; it turns out that the very weak interactions that exist between the alkali atoms are important for this to occur (a non-interacting Bose gas does not show superfluidity).
Other experiments have explored the intriguing consequences of macroscopic quantum coherence, the property that in the condensed state all the atoms exist in a coherent quantum superposition.


Electrons do not exhibit Bose–Einstein condensation because they are fermions, not bosons, but they can show other condensation eﬀects such as superconductivity. In a superconductor, a weak attractive interaction (which can be mediated by phonons) allows pairs of electrons to form Cooper pairs. A Cooper pair is a boson, and the Cooper pairs themselves can form a coherent state below the superconducting transition temperature. Many common superconductors can be described in this way using the BCS theory of superconductivity,11 though many newly discovered superconductors, such as the high-temperature superconductors which are ceramics, do not seem to be described by this model.

11

BCS is named after its discoverers, John Bardeen, Leon Cooper, and Robert Schrieﬀer.

Chapter summary

• Non-interacting fermions and bosons can be described using the equations

$$N = \frac{(2S+1)V}{\lambda_{\rm th}^3}\,[\mp {\rm Li}_{3/2}(\mp z)], \qquad U = \frac{3}{2} N k_B T\,\frac{{\rm Li}_{5/2}(\mp z)}{{\rm Li}_{3/2}(\mp z)}, \qquad \Phi_G = -\frac{2}{3} U,$$

where the upper sign is for fermions and the lower sign for bosons.

• In a Fermi gas (a gas of fermions), electrons fill states up to $E_F$ at absolute zero. At non-zero temperature, electrons within $k_B T$ of $E_F$ are important in determining the properties.

• The results for a Fermi gas can be applied to the electrons in a metal.

• In a Bose gas, Bose–Einstein condensation can occur below a temperature given by

$$k_B T_c = \frac{2\pi\hbar^2}{m} \left(\frac{n}{2.612\,(2S+1)}\right)^{2/3}.$$

• The results for a Bose gas can be applied to liquid $^4$He and dilute ultracold atomic gases.

Further reading

For further information, see Ashcroft and Mermin (1976), Annett (2004), Foot (2004), Ketterle (2002) and Pethick and Smith (2002).

Exercises

(30.1) Show that in the classical limit, when the fugacity $z = e^{\beta\mu} \ll 1$, z is the ratio of the thermal volume to the volume per particle of a single-spin excitation.

(30.2) Show that the pressure p exerted by a Fermi gas at absolute zero is

$$p = \frac{2}{5} n E_F, \tag{30.64}$$

where n is the number density of particles.

(30.3) Show that for a gas of fermions with density of states g(E), the chemical potential is given by

$$\mu(T) = E_F - \frac{\pi^2}{6}\,(k_B T)^2\,\frac{g'(E_F)}{g(E_F)} + \ldots \tag{30.65}$$

(30.4) Show that the heat capacity of a system of non-interacting bosons is given by

$$C_V = \frac{15}{4}\,\frac{\zeta(\frac{5}{2})}{\zeta(\frac{3}{2})}\,N k_B \left(\frac{T}{T_c}\right)^{3/2}, \quad T < T_c,$$
$$C_V = \frac{3}{2} N k_B \left(\frac{5}{2}\,\frac{{\rm Li}_{5/2}(z)}{{\rm Li}_{3/2}(z)} - \frac{3}{2}\,\frac{{\rm Li}_{3/2}(z)}{{\rm Li}_{1/2}(z)}\right), \quad T > T_c. \tag{30.66}$$

(30.5) Show that Bose–Einstein condensation does not occur in two dimensions.

(30.6) In Bose–Einstein condensation, the ground state becomes macroscopically occupied. What about the first excited state, which might be only a small energy above the ground state; is it also macroscopically occupied?

Part IX Special topics

In this final part, we apply some of the material presented earlier in this book to some specialized topics. This part is structured as follows:

• In Chapter 31 we describe sound waves and prove that these are adiabatic. We derive an expression for the speed of sound in a fluid.
• A particular type of sound wave is the shock wave, and we consider such waves in Chapter 32. We define the Mach number and derive the Rankine–Hugoniot conditions, which allow us to consider the changes in density and pressure at a shock front.
• In Chapter 33, we examine how fluctuations can be studied in thermodynamics and lead to effects such as Brownian motion. We consider the linear response of a system to a generalized force and derive the fluctuation–dissipation theorem.
• In Chapter 34, we discuss non-equilibrium thermodynamics and show how fluctuations lead to the Onsager reciprocal relations, which connect certain kinetic coefficients. We apply these ideas to thermoelectric phenomena and briefly discuss time-reversal symmetry.
• In Chapter 35, we consider the physics of stars and study how gravity, nuclear reactions, convection and conduction all lead to the observed properties of stellar material.
• In Chapter 36 we discuss what happens to stars when they run out of fuel, and consider the properties of white dwarfs, neutron stars and black holes.
• In Chapter 37, we apply thermal physics to the atmosphere, attempting to understand how solar energy keeps the Earth at a certain temperature, the rôle played by the greenhouse effect and how mankind may be causing climate change.

31 Sound waves

31.1 Sound waves under isothermal conditions
31.2 Sound waves under adiabatic conditions
31.3 Are sound waves in general adiabatic or isothermal?
31.4 Derivation of the speed of sound within fluids
Chapter summary
Further reading
Exercises

Sound waves can be propagated in various fluids, such as liquids or gases, and consist of oscillations in the local pressure and density of the fluid. They are longitudinal waves (in which the displacement of molecules from their equilibrium positions is in the same direction as the wave motion) and can be described by alternating regions of compression and rarefaction (see Fig. 31.1). The speed at which sound travels through a material is therefore related to the material’s compressibility (measured by its bulk modulus, see below) as well as to its inertia (represented by its density). In this chapter, we will show that the speed of sound $v_s$ is given by

$$v_s = \sqrt{\frac{B}{\rho}}, \tag{31.1}$$

where ρ is the density and B is the bulk modulus of the material. The bulk modulus describes how much the volume of the fluid will change with changing pressure, so it is defined as the pressure increment dp divided by the fractional volume increment dV/V; since a pressure increase usually results in a volume decrease, the definition is therefore

$$B = -V \frac{\partial p}{\partial V}, \tag{31.2}$$

Fig. 31.1 A sound wave in a ﬂuid is a longitudinal wave consisting of compressions and rarefactions.

in order to ensure that B > 0. It is also helpful to write the bulk modulus in terms of density rather than volume. Density ρ and volume V are related, for a fixed mass of material M, by

$$\rho = \frac{M}{V}, \tag{31.3}$$

which means that fractional changes in density and in volume are related by dρ/ρ = −dV/V, and hence

$$B = \rho \frac{\partial p}{\partial \rho}. \tag{31.4}$$

Later in this chapter we will see how to derive the equation for the speed of sound which is quoted in eqn 31.1, but first we are going to see how it works for two different possible constraints introduced in the previous chapter, adiabatic and isothermal. These constraints determine the way in which we evaluate the partial differential in eqn 31.4.

31.1 Sound waves under isothermal conditions

We first begin by supposing that sound waves propagate under isothermal conditions. Simple differentiation of the ideal gas equation (eqn 6.20) at constant temperature gives that1

$$B_T = -V \left(\frac{\partial p}{\partial V}\right)_T = p, \tag{31.5}$$

where the subscript T indicates that it is the temperature which is held constant (isothermal conditions). Thus, using eqn 31.1, and then substituting in eqn 6.15 and writing the density as ρ = nm, we may write

$$v_s = \sqrt{\frac{B_T}{\rho}} = \sqrt{\frac{p}{\rho}} = \sqrt{\frac{\frac{1}{3} n m \langle v^2 \rangle}{\rho}} = \sqrt{\frac{\langle v^2 \rangle}{3}}. \tag{31.6}$$

This implies that we can write

$$v_s = \sqrt{\langle v_x^2 \rangle}, \tag{31.7}$$

where $v_x$ is as defined in eqn 5.15. This implies that the sound speed is very similar to the mean molecular speed in a given direction and is consistent with molecular interactions being the mediator of bulk sound waves.

1 For an ideal gas at constant temperature, pV is a constant and hence $p \propto V^{-1}$. This implies that dp/p = −dV/V, and hence $-V(\partial p/\partial V)_T = p$.

31.2 Sound waves under adiabatic conditions

A gas under adiabatic conditions obeys eqn 12.15 ($pV^\gamma$ is constant) and hence $p \propto V^{-\gamma}$ so that

$$\frac{dp}{p} = -\gamma \frac{dV}{V}, \tag{31.8}$$

and hence the adiabatic2 bulk modulus $B_S$ is

$$B_S = -V \left(\frac{\partial p}{\partial V}\right)_S = \gamma p. \tag{31.9}$$

Hence the equation for sound speed under these conditions then becomes

$$v_s = \sqrt{\frac{\gamma p}{\rho}} = \sqrt{\frac{\gamma \langle v^2 \rangle}{3}}. \tag{31.10}$$

Comparison of the sound speed under isothermal and adiabatic conditions (i.e. eqns 31.6 and 31.10) tells us that the speed under adiabatic conditions is larger, by a factor $\gamma^{1/2}$, than it would be under isothermal conditions.

2 The subscript S is because the entropy S is constant in an adiabatic process.

Example 31.1
What is the temperature dependence of the speed of sound assuming adiabatic conditions?
Solution: The relationship between the sound speed $v_s$ and the mean square speed of molecules in air $\langle v^2 \rangle$ given in eqn 31.10 enables us to relate the sound speed in air to its temperature. Using $\langle v^2 \rangle = 3k_B T/m$, we have that

$$v_s = \sqrt{\frac{\gamma \langle v^2 \rangle}{3}} = \sqrt{\frac{\gamma k_B T}{m}}. \tag{31.11}$$

This shows that the speed of sound is a function of temperature and mass alone.3 It is unsurprising that the speed of sound, i.e. the speed at which a pressure disturbance can be propagated, follows the same temperature dependence as the mean molecular speed since the molecular collision rates that govern the propagation of disturbances are proportional to the mean molecular speed.

3 Note that γ can be weakly temperature dependent.
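Eqn 31.11 is easy to evaluate for air. In the sketch below the values γ = 1.4, a mean molecular mass of 28.97 atomic mass units and a temperature of 293 K are assumptions chosen for illustration; they are not taken from the text. Setting γ = 1 gives the isothermal speed of eqn 31.6 for comparison.

```python
import math

K_B = 1.380649e-23                 # Boltzmann constant, J/K
ATOMIC_MASS_UNIT = 1.66053907e-27  # kg

def sound_speed(gamma, temperature, molecular_mass):
    """Speed of sound in an ideal gas from eqn 31.11, in m/s."""
    return math.sqrt(gamma * K_B * temperature / molecular_mass)

# Assumed illustrative values for dry air near room temperature.
GAMMA_AIR = 1.4
M_AIR = 28.97 * ATOMIC_MASS_UNIT
T_AIR = 293.0

v_adiabatic = sound_speed(GAMMA_AIR, T_AIR, M_AIR)
v_isothermal = sound_speed(1.0, T_AIR, M_AIR)
print(f"adiabatic:  {v_adiabatic:.0f} m/s")
print(f"isothermal: {v_isothermal:.0f} m/s")
print(f"ratio:      {v_adiabatic / v_isothermal:.3f}")
```

The ratio of the two speeds is $\gamma^{1/2}$, as stated at the end of Section 31.2.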

31.3 Are sound waves in general adiabatic or isothermal?

Because of the ideal gas law, one would expect that at the compressions in a sound wave the temperature rises, while at the rarefactions there is cooling. If there were sufficient time for thermal equilibration to take place as the sound wave passes (i.e. as the compressions and rarefactions reverse positions) then the wave would be isothermal. However, if there is insufficient time, then the wave is said to be adiabatic since there is no time for heat to flow. To establish whether sound waves are usually likely to be adiabatic or isothermal, we are going to consider how far thermal changes can propagate in comparison with the length scale of a sound wave. The latter is given by the wavelength[4] $\lambda$ of the sound wave, which is related to the angular frequency $\omega$ in a medium with sound speed $v_s$, by

$$\lambda = \frac{2\pi v_s}{\omega}. \tag{31.12}$$

[4] Do not confuse $\lambda$ as wavelength with $\lambda$ as mean free path. The context should indicate which is meant.

The distance over which a thermal wave can propagate is the skin depth $\delta$ which we met in eqn 10.22. Thus the characteristic depth to which heat diffuses in a certain time $T$ (using $T = 2\pi/\omega$ for the 'thermal wave' which is driven at frequency $\omega$) is given by

$$\delta^2 = \frac{2D}{\omega} = \frac{DT}{\pi}. \tag{31.13}$$

The frequency dependence of these two length scales, the wavelength of the sound wave and the skin depth or propagation distance of the heat wave driven at the same frequency, is shown in Fig. 31.2. In different frequency ranges, either $\lambda$ or $\delta$ will be larger because they have a different frequency dependence ($\lambda \propto \omega^{-1}$ and $\delta \propto \omega^{-1/2}$). In the high-frequency regime, for which $\lambda < \delta$, the heat wave has propagated over a larger distance so any sound waves would be isothermal. In the low-frequency regime, for which $\lambda > \delta$, the sound waves would be adiabatic. In fact, it turns out that the latter situation is usually satisfied in practice and sound waves are adiabatic. You can demonstrate this by substituting typical values for $D$ and $\omega$ into eqn 31.13 to estimate $\delta$ and show that the wavelength of a sound wave will exceed the skin depth. In fact, for these typical values of $\delta$, the wavelengths required to be in the isothermal regime are so tiny that they are smaller than the mean free path of the molecules in the gas (see Exercise 31.3).
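The comparison of the two length scales can be sketched numerically. The sound speed (340 m s⁻¹) and thermal diffusivity of air (2 × 10⁻⁵ m² s⁻¹) used below are assumed order-of-magnitude values, not figures taken from the text, and the function names are illustrative only.

```python
import math

def wavelength(freq, v_s):
    """Sound wavelength from eqn 31.12 with omega = 2 pi f."""
    return v_s / freq

def skin_depth(freq, D):
    """Thermal skin depth from eqn 31.13: delta = sqrt(2 D / omega)."""
    return math.sqrt(2.0 * D / (2.0 * math.pi * freq))

V_S = 340.0    # assumed sound speed in air, m/s
D_AIR = 2e-5   # assumed thermal diffusivity of air, m^2/s

for f in (1.0, 2.0e4):
    lam = wavelength(f, V_S)
    delta = skin_depth(f, D_AIR)
    print(f"f = {f:g} Hz: lambda = {lam:.3g} m, delta = {delta:.3g} m")

# Setting lambda = delta in eqns 31.12 and 31.13 gives omega = 2 pi^2 v_s^2 / D.
omega_cross = 2.0 * math.pi**2 * V_S**2 / D_AIR
f_cross = omega_cross / (2.0 * math.pi)
print(f"crossover frequency ~ {f_cross:.2g} Hz, "
      f"wavelength there ~ {wavelength(f_cross, V_S):.2g} m")
```

For any audible frequency the wavelength exceeds the skin depth by several orders of magnitude under these assumptions, which is the adiabatic regime.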

Example 31.2 What is the speed of sound in a relativistic gas? Solution: For a non-relativistic gas we have from eqn 6.15 that $p = \tfrac{1}{3}nm\langle v^2\rangle$. Using $\rho = nm$, we can write this as $p = \tfrac{1}{3}\rho\langle v^2\rangle$. For a relativistic gas, this should be replaced by

$$p = \tfrac{1}{3}\rho c^2, \tag{31.14}$$

where $c$ is the speed of light. Since $\rho \propto 1/V$, we have that $B = p$ and hence

$$v_s = \sqrt{B/\rho} = \frac{c}{\sqrt{3}}. \tag{31.15}$$

31.4 Derivation of the speed of sound within fluids

The speed of sound formula in eqn 31.1 can be derived by combining two equations, the continuity equation and the Euler equation (see boxes on page 358), to give a wave equation whose speed can be clearly identified. These equations are fully three dimensional, and a derivation for three dimensions is straightforward. However, fluids such as air cannot transmit shear and so no transverse waves can be propagated, only longitudinal waves. For this reason, we will just present the one-dimensional derivation appropriate for longitudinal waves; this is illustrative and perfectly analogous to the three-dimensional version. The continuity equation in one dimension (see box on page 358) is given by

$$\frac{\partial(\rho u)}{\partial x} = -\frac{\partial\rho}{\partial t}. \tag{31.16}$$

Fig. 31.2 Propagation distance of a sound wave and of a thermal wave as a function of frequency. In the region where λ < δ the sound waves would be isothermal and in the region where λ > δ the sound waves would be adiabatic.


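The continuity equation

The continuity equation for a fluid (that is, for a liquid or a gas) can be derived in a similar manner to the diffusion equation, eqn 9.35. The mass flux out of a closed surface $S$ is

$$\oint_S \rho\,\mathbf{u}\cdot d\mathbf{S}, \tag{31.17}$$

where $\rho$ is the density and $\mathbf{u}$ is the local fluid velocity. This flux must be balanced by the rate of decrease of fluid concentration inside the volume:

$$\oint_S \rho\,\mathbf{u}\cdot d\mathbf{S} = -\frac{\partial}{\partial t}\int_V \rho\,dV. \tag{31.18}$$

The divergence theorem then implies that

$$\int_V \nabla\cdot(\rho\mathbf{u})\,dV = -\int_V \frac{\partial\rho}{\partial t}\,dV \tag{31.19}$$

and hence

$$\nabla\cdot(\rho\mathbf{u}) = -\frac{\partial\rho}{\partial t}, \tag{31.20}$$

or in one dimension that

$$\frac{\partial(\rho u)}{\partial x} = -\frac{\partial\rho}{\partial t}. \tag{31.21}$$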

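The Euler equation

The force per unit mass on an element of fluid owing to a pressure gradient $\nabla p$ is $-(1/\rho)\nabla p$. This leads to the Euler equation:

$$-\frac{1}{\rho}\nabla p = \frac{D\mathbf{u}}{Dt}, \tag{31.22}$$

where $D\mathbf{u}/Dt$ is the local acceleration of the fluid, described in the co-moving frame of the fluid via the convective derivative

$$\frac{DX}{Dt} \equiv \frac{\partial X}{\partial t} + (\mathbf{u}\cdot\nabla)X. \tag{31.23}$$

Here, $DX/Dt$ is the rate of change of property $X$ with time following the fluid. Thus, eqn 31.22 becomes

$$-\frac{1}{\rho}\nabla p = \frac{\partial\mathbf{u}}{\partial t} + (\mathbf{u}\cdot\nabla)\mathbf{u}, \tag{31.24}$$

or in one dimension

$$-\frac{1}{\rho}\frac{\partial p}{\partial x} = \frac{\partial u}{\partial t} + u\frac{\partial u}{\partial x}. \tag{31.25}$$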


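Euler's equation for a fluid in one dimension (see box on page 358) is

$$-\frac{1}{\rho}\frac{\partial p}{\partial x} = \frac{\partial u}{\partial t} + u\frac{\partial u}{\partial x}. \tag{31.26}$$

Equation 31.16 may be expanded as

$$\frac{\partial(\rho u)}{\partial x} = u\frac{\partial\rho}{\partial x} + \rho\frac{\partial u}{\partial x} = -\frac{\partial\rho}{\partial t}. \tag{31.27}$$

Dividing through by $\rho$ and writing $s = \delta\rho/\rho$ yields

$$u\frac{\partial s}{\partial x} + \frac{\partial u}{\partial x} = -\frac{\partial s}{\partial t}. \tag{31.28}$$

For small-amplitude sound waves, any terms which are second order in $u$, such as $u\,\partial s/\partial x$, may be neglected so that eqn 31.27 becomes

$$\frac{\partial u}{\partial x} = -\frac{\partial s}{\partial t}. \tag{31.29}$$

Again neglecting terms which are second order in $u$, one finds that eqn 31.26 becomes

$$\frac{\partial u}{\partial t} = -\frac{1}{\rho}\frac{\partial p}{\partial x}. \tag{31.30}$$

In terms of a bulk modulus defined in eqn 31.4, we may re-write eqn 31.30 as

$$\frac{\partial u}{\partial t} = -\frac{B}{\rho}\frac{\partial s}{\partial x}, \tag{31.31}$$

and then eliminating $u$ from this equation and from eqn 31.29 we have a one-dimensional wave equation:

$$\frac{\partial^2 s}{\partial x^2} = \frac{\rho}{B}\frac{\partial^2 s}{\partial t^2}. \tag{31.32}$$

This has solutions which may be recognized as travelling waves of the form

$$s \propto e^{i(kx-\omega t)}, \tag{31.33}$$

for which the wave speed is then given by substituting eqn 31.33 into eqn 31.32 and obtaining

$$v_s = \frac{\omega}{k} = \sqrt{\frac{B}{\rho}}. \tag{31.34}$$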
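The final step can be checked numerically: a travelling wave with $\omega = k\sqrt{B/\rho}$ should satisfy eqn 31.32. The sketch below tests this with central finite differences on the real part of eqn 31.33. The values of $B$ and $\rho$ are assumed, roughly air-like numbers chosen only for illustration, and all function names are hypothetical.

```python
import math

def s_wave(x, t, k, omega):
    """Real part of the travelling-wave solution, eqn 31.33."""
    return math.cos(k * x - omega * t)

def second_derivative(f, z, h):
    """Central finite-difference estimate of f''(z)."""
    return (f(z + h) - 2.0 * f(z) + f(z - h)) / h**2

B = 1.4e5    # assumed bulk modulus, Pa
RHO = 1.2    # assumed density, kg/m^3
k = 2.0      # arbitrary wavenumber, rad/m
omega = k * math.sqrt(B / RHO)   # dispersion relation from eqn 31.34

x0, t0 = 0.3, 1.0e-3             # arbitrary point at which to test the equation
d2s_dx2 = second_derivative(lambda x: s_wave(x, t0, k, omega), x0, 1e-4)
d2s_dt2 = second_derivative(lambda t: s_wave(x0, t, k, omega), t0, 1e-6)

# Residual of the wave equation 31.32; it should vanish up to discretization error.
residual = d2s_dx2 - (RHO / B) * d2s_dt2
print(f"d2s/dx2 = {d2s_dx2:.6f}, (rho/B) d2s/dt2 = {(RHO / B) * d2s_dt2:.6f}")
print(f"residual = {residual:.2e}")
```

Choosing any other ratio $\omega/k$ leaves a residual of order $k^2$, which is one way to see that the wave speed is fixed by $B/\rho$ alone.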

Chapter summary
• The speed of sound is defined by $v_s = \sqrt{B/\rho}$, where $B$ is given by $B = -V\,\partial p/\partial V$.
• For adiabatic sound waves the speed of sound is given by $v_s = \sqrt{\gamma\langle v^2\rangle/3} = \sqrt{\gamma k_B T/m}$.
• In a relativistic gas, the speed of sound is given by $v_s = c/\sqrt{3}$.


Further reading Faber (1995) has a good discussion of sound waves in gases and liquids and is a useful primer on ﬂuid dynamics in general.

Exercises

(31.1) The speed of sound in air at 0 °C is 331.5 m s⁻¹. Estimate the speed of sound at an aircraft's cruising altitude where the temperature is −60 °C.

(31.2) Calculate the speed of sound in nitrogen at 200 °C.

(31.3) For sound waves in air of frequency (a) 1 Hz and (b) 20 kHz estimate both the wavelength $\lambda$ of the sound wave and the skin depth $\delta$ (the characteristic depth to which a thermal wave of this frequency will diffuse). Hence show that sound waves are invariably adiabatic and not isothermal. For what frequency would $\delta = \lambda$?

(31.4) The speed of sound in air, hydrogen and carbon dioxide at 0 °C is 331.5 m s⁻¹, 1270 m s⁻¹ and 258 m s⁻¹ respectively. Explain the relative magnitude of these values.

(31.5) Breathing helium gas can result in your voice sounding higher (do not try this as asphyxiation is a serious risk); explain this effect (and note that the actual pitch of the voice is not higher).

(31.6) Estimate the time taken for a sound wave to cross the Sun using eqn 31.11, assuming that the average temperature of the Sun is $6\times10^6$ K. [Assume that the Sun is mostly ionized hydrogen (protons plus electrons) so that the average mass per particle is about $m_p/2$. The radius $R$ of the Sun is $6.96\times10^8$ m.]

32 Shock waves

Shock waves (known for short as shocks) occur when a disturbance is propagating through a medium faster than the sound speed of the medium. In this chapter we are going to consider the nature of shocks in gases and the thermodynamic properties of the gas on either side of such a shock.

32.1 The Mach number
32.2 Structure of shock waves
32.3 Shock conservation laws
32.4 The Rankine–Hugoniot conditions
Chapter summary
Further reading
Exercises

32.1 The Mach number

The Mach number $M$ of a disturbance is defined to be the ratio of the speed $w$ at which the disturbance is passing through a medium to the sound speed $v_s$ of the medium. Thus we have

$$M = \frac{w}{v_s}. \tag{32.1}$$

When $M > 1$, the disturbance is called a shock front and the speed of the disturbance is supersonic. The development of a shock wave can be seen in Fig. 32.1, which shows wavefronts from a moving point source. The point source, moving at speed $w$, emits circular wavefronts and these wavefronts overlap constructively to form a single conical-shaped wavefront when $w > v_s$, i.e. when $M > 1$ (the cone looks like the two sides of a triangle in the figure, which is necessarily printed in two dimensions!). The semi-angle of the cone decreases as $M$ increases. This shock wave is responsible for the sonic 'boom' which can be heard when a supersonic aircraft passes overhead (a double boom is often heard owing to the fact that shock waves originate from both the nose and the tail of the aircraft). Because the semi-angle of the cone decreases for very high speeds, a very fast aircraft at high altitude does not produce a boom at ground level because the cone does not intersect the ground.
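The narrowing of the cone with increasing Mach number can be tabulated with a short sketch. It uses the result $\sin^{-1}(1/M)$ for the semi-angle that is quoted in Exercise 32.1; the function name is an illustrative choice.

```python
import math

def mach_cone_semi_angle(M):
    """Semi-angle (in radians) of the shock cone, sin^-1(1/M), for M >= 1."""
    if M < 1.0:
        raise ValueError("no shock cone forms for M < 1")
    return math.asin(1.0 / M)

mach_numbers = (1.0, 1.2, 1.4, 2.0, 5.0)
angles_deg = [math.degrees(mach_cone_semi_angle(M)) for M in mach_numbers]

for M, angle in zip(mach_numbers, angles_deg):
    print(f"M = {M}: semi-angle = {angle:.1f} degrees")
```

At $M = 1$ the cone opens out into a plane wavefront (semi-angle 90°), and the angle falls monotonically as $M$ grows.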

Fig. 32.1 The propagation of a shock wave for subsonic and supersonic flows. (a) M = 0.8, (b) M = 1, (c) M = 1.2, (d) M = 1.4.

32.2 Structure of shock waves

What is actually going on at a shock front? In order to establish the thermodynamic properties either side of a shock front, it is helpful to treat it as a mathematical discontinuity across which there is an abrupt change in the values of the properties because of the motion of the shock. In reality, the width of the shock front is finite but its detailed structure does not matter for our purposes although we will discuss it briefly in Section 32.4. Figure 32.2 illustrates the velocities of unshocked and shocked gas with respect to a shock front (illustrated as a grey rectangle in each frame). This is shown for the two frames of reference in which it is convenient to work for these situations: the rest frame of the unshocked gas and the rest frame of the shock front (which we shall call the shock frame). In the rest frame of the undisturbed gas, the shock front moves at velocity $w$ while the gas through which the shock has already passed moves at velocity $w_2$ (where $w_2 < w$). There is a shock because the shock front propagates at speed $w > v_{s1}$, where $v_{s1}$ is the sound speed in the unshocked gas. If $w \gg v_{s1}$ then there is said to be a strong shock whereas if $w$ is just a little above $v_{s1}$ then there is said to be a weak shock. In the shock frame, the gas through which the shock front has passed moves away from the shock at velocity $v_2$ while the as-yet undisturbed gas moves towards it at velocity $v_1$. Therefore, $v_1 = w$, since this is the speed at which the undisturbed gas enters the shock front. In the same frame, the speed at which the shocked gas leaves the back of the shock is given by

$$v_2 = w - w_2. \tag{32.2}$$


Fig. 32.2 Structure of a shock front in the rest frame of the undisturbed gas and in the rest frame of the shock front (which we call the shock frame). The terms ‘upstream’ and ‘downstream’ are best understood in the rest frame of the shock: in this frame, the shock is stationary and high–velocity gas (velocity v1 ), which is yet to be disturbed by the shock front, streams towards the shock front (from ‘upstream’) from region 1, while slower (velocity v2 ) shocked gas moves away (‘downstream’) in region 2. Region 1 contains gas with lower internal energy, temperature, entropy, pressure and density but higher velocity and hence bulk kinetic energy than region 2.

32.3 Shock conservation laws

To establish the physical properties of the gas before and after the passage of the shock, we have to think about the conservation laws, of mass, momentum and energy, either side of the shock front. It is most convenient at this point to work in the shock frame (right panel of Fig. 32.2). We then have the following three conservation equations:

• The conservation of mass is applied by stating that the mass flux $\Phi_m$, that is the mass crossing unit area in unit time, is equal on either side of the shock. Denoting the upstream region by 1 and the downstream region by 2, we may write

$$\rho_2 v_2 = \rho_1 v_1 = \Phi_m. \tag{32.3}$$

• The conservation of momentum requires that the momentum flux should be continuous; this means that the force per unit area plus the rate at which momentum is transported across unit area should be matched on either side of the shock front, giving

$$p_2 + \rho_2 v_2^2 = p_1 + \rho_1 v_1^2. \tag{32.4}$$

• The conservation of energy requires that the rate at which gas pressure does work per unit area (given by $pv$) plus the rate of transport of internal and kinetic energy per unit area ($(\rho\tilde{u} + \tfrac{1}{2}\rho v^2)v$, where $\tilde{u}$ is the internal energy per unit mass) is constant across a shock, which gives

$$p_2 v_2 + \left(\rho_2\tilde{u}_2 + \tfrac{1}{2}\rho_2 v_2^2\right)v_2 = p_1 v_1 + \left(\rho_1\tilde{u}_1 + \tfrac{1}{2}\rho_1 v_1^2\right)v_1. \tag{32.5}$$

The following example illustrates a simple algebraic manipulation of two of the conservation laws.

Example 32.1 Rearrange eqn 32.3 and eqn 32.4 to show that

$$\Phi_m^2 = \frac{p_2 - p_1}{\rho_1^{-1} - \rho_2^{-1}}, \tag{32.6}$$

and hence find an expression for $v_1^2 - v_2^2$ in terms of pressures and densities. Solution: Equation 32.3 implies that $v_i = \rho_i^{-1}\Phi_m$. This, together with eqn 32.4, can be simply rearranged to give

$$p_2 - p_1 = \rho_1 v_1^2 - \rho_2 v_2^2 = \Phi_m^2(\rho_1^{-1} - \rho_2^{-1}), \tag{32.7}$$

and the desired result follows. The final step can be achieved by writing

$$v_1^2 - v_2^2 = (v_1 - v_2)(v_1 + v_2) = \Phi_m^2(\rho_1^{-1} - \rho_2^{-1})(\rho_1^{-1} + \rho_2^{-1}), \tag{32.8}$$

and substitution of eqn 32.7 yields

$$v_1^2 - v_2^2 = (p_2 - p_1)(\rho_1^{-1} + \rho_2^{-1}). \tag{32.9}$$


32.4 The Rankine–Hugoniot conditions

Having written down the conservation laws, we now wish to solve these simultaneously to find the pressures, densities and temperatures on either side of the shock front.[1] We will treat the gas as an ideal gas, so that the internal energy per unit mass, $\tilde{u}$, is given by (see eqn 11.36)

$$\tilde{u} = \frac{p}{(\gamma - 1)\rho}. \tag{32.10}$$

[1] The derivation in this section is nothing more than algebraic manipulations following from the conservation laws, but we give it in full since it is somewhat fiddly. If you are not concerned with these details, you can skip straight to equation 32.19.

Rearranging this gives $p = (\gamma - 1)\rho\tilde{u}$ and substituting this into eqn 32.5 gives

$$\rho_2 v_2\left(\gamma\tilde{u}_2 + \tfrac{1}{2}v_2^2\right) = \rho_1 v_1\left(\gamma\tilde{u}_1 + \tfrac{1}{2}v_1^2\right). \tag{32.11}$$

Dividing the left-hand side by $\rho_2 v_2$ and the right-hand side by $\rho_1 v_1$ (and eqn 32.3 implies that these two factors are equal) and using eqn 32.10 yields

$$\frac{\gamma p_2}{(\gamma - 1)\rho_2} + \tfrac{1}{2}v_2^2 = \frac{\gamma p_1}{(\gamma - 1)\rho_1} + \tfrac{1}{2}v_1^2. \tag{32.12}$$

Using eqn 32.9, and multiplying by $2(\gamma - 1)$, this can be rearranged to give

$$2\gamma(p_1\rho_1^{-1} - p_2\rho_2^{-1}) + (\gamma - 1)(p_2 - p_1)(\rho_1^{-1} + \rho_2^{-1}) = 0. \tag{32.13}$$

Hence, we have that

$$\frac{\rho_2^{-1}}{\rho_1^{-1}} = \frac{(\gamma + 1)p_1 + (\gamma - 1)p_2}{(\gamma - 1)p_1 + (\gamma + 1)p_2}. \tag{32.14}$$

Substitution into eqn 32.6 gives

$$\Phi_m^2 = \frac{p_2 - p_1}{\rho_1^{-1}\left[1 - \rho_2^{-1}/\rho_1^{-1}\right]} = \tfrac{1}{2}\rho_1\left[(\gamma - 1)p_1 + (\gamma + 1)p_2\right], \tag{32.15}$$

and hence

$$v_1^2 = \Phi_m^2\rho_1^{-2} = \tfrac{1}{2}\rho_1^{-1}\left[(\gamma - 1)p_1 + (\gamma + 1)p_2\right]. \tag{32.16}$$

We would like to express everything in terms of the Mach number $M_1$ of the shock, and recalling that $M_1 = v_1/v_{s1}$ and $v_{s1} = \sqrt{\gamma p_1/\rho_1}$, we have that

$$M_1^2 = \frac{\rho_1 v_1^2}{\gamma p_1}. \tag{32.17}$$

Substitution of eqn 32.17 into eqn 32.16 gives

$$\rho_1 v_1^2 = M_1^2\gamma p_1 = \tfrac{1}{2}\left[(\gamma - 1)p_1 + (\gamma + 1)p_2\right], \tag{32.18}$$

and rearranging gives our desired equation relating the pressure on either side of the shock front:

$$\frac{p_2}{p_1} = \frac{2\gamma M_1^2 - (\gamma - 1)}{\gamma + 1}. \tag{32.19}$$


Substitution of eqn 32.19 into eqn 32.14, and using eqn 32.3, gives an equation for the ratio of the densities (and velocities) on either side of the shock:

$$\frac{\rho_2}{\rho_1} = \frac{v_1}{v_2} = \frac{(\gamma + 1)M_1^2}{2 + (\gamma - 1)M_1^2}. \tag{32.20}$$

Equations 32.19 and 32.20 are known as the Rankine–Hugoniot conditions and describe the physical properties of material on either side of the shock front. The results are plotted in Fig. 32.3.
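Because the algebra leading to eqns 32.19 and 32.20 is fiddly, a numerical cross-check is reassuring. The sketch below evaluates both ratios and confirms that the resulting downstream state satisfies the momentum and energy conservation laws (eqns 32.4 and 32.12). The upstream values $\rho_1 = p_1 = 1$ (arbitrary units) and $M_1 = 2$ are assumptions chosen for illustration, as are the function names.

```python
import math

def pressure_ratio(M1, gamma):
    """p2/p1 from eqn 32.19."""
    return (2.0 * gamma * M1**2 - (gamma - 1.0)) / (gamma + 1.0)

def density_ratio(M1, gamma):
    """rho2/rho1 = v1/v2 from eqn 32.20."""
    return (gamma + 1.0) * M1**2 / (2.0 + (gamma - 1.0) * M1**2)

gamma, M1 = 5.0 / 3.0, 2.0
rho1, p1 = 1.0, 1.0                       # assumed upstream state, arbitrary units
v1 = M1 * math.sqrt(gamma * p1 / rho1)    # v1 = M1 * v_s1

rho2 = rho1 * density_ratio(M1, gamma)
v2 = v1 / density_ratio(M1, gamma)        # mass conservation, eqn 32.3
p2 = p1 * pressure_ratio(M1, gamma)

# Momentum flux on each side (eqn 32.4).
momentum_1 = p1 + rho1 * v1**2
momentum_2 = p2 + rho2 * v2**2

# Energy per unit mass on each side (eqn 32.12).
energy_1 = gamma * p1 / ((gamma - 1.0) * rho1) + 0.5 * v1**2
energy_2 = gamma * p2 / ((gamma - 1.0) * rho2) + 0.5 * v2**2

print(f"p2/p1 = {p2 / p1:.4f}, rho2/rho1 = {rho2 / rho1:.4f}")
print(f"momentum flux: {momentum_1:.6f} vs {momentum_2:.6f}")
print(f"energy per unit mass: {energy_1:.6f} vs {energy_2:.6f}")
```

The same check can be repeated for other values of $\gamma$ and $M_1$; agreement of both pairs of numbers is what the Rankine–Hugoniot conditions guarantee.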

Example 32.2 What are the ranges of values that can be taken by the following quantities for a shock front? (i) $\rho_2/\rho_1$, (ii) $v_2/v_1$ and (iii) $p_2/p_1$. Solution: When $M_1 = 1$, each of these quantities takes the value unity. In the limit as $M_1 \to \infty$, we find that

$$\frac{\rho_2}{\rho_1} \to \frac{\gamma + 1}{\gamma - 1}, \tag{32.21}$$

$$\frac{v_2}{v_1} \to \frac{\gamma - 1}{\gamma + 1}, \tag{32.22}$$

$$\frac{p_2}{p_1} \to \frac{2\gamma M_1^2}{\gamma + 1}, \tag{32.23}$$

so that $\rho_2/\rho_1$ and $v_2/v_1$ both saturate (at values of 4 and 1/4 respectively in the case of $\gamma = 5/3$) but $p_2/p_1$ can increase without limit. This is demonstrated in Fig. 32.3.

Example 32.3 Show that for a monatomic gas, the ratio $\rho_2/\rho_1$ can never exceed 4 and $v_2/v_1$ can never be lower than $\tfrac{1}{4}$. Solution: Equation 32.21, together with $\gamma = 5/3$ for a monatomic gas, shows that $\rho_2/\rho_1$ can never exceed $(\gamma + 1)/(\gamma - 1) = 4$. Since $v_2/v_1 = \rho_1/\rho_2$, this ratio can never be lower than $\tfrac{1}{4}$.

Fig. 32.3 The Rankine–Hugoniot conditions for a shock front as a function of Mach number $M_1$, where $\gamma$ is assumed to take the value 5/3 (this is the value $\gamma$ takes for a non-relativistic, monatomic gas).

The Rankine–Hugoniot conditions, eqns 32.19 and 32.20 together with eqn 32.29, as they stand permit expansive shocks, that is with a reversal of roles for the two regions pictured in Fig. 32.2. The physical picture here would be that subsonically moving hot gas expands at a shock front and accelerates to become supersonic cool gas, i.e. internal energy would convert to bulk kinetic energy at the shock front. Such a situation is forbidden by the second law of thermodynamics (Chapter 14), which says that entropy can only increase. The second law, together with the Rankine–Hugoniot conditions, permits only compressive shocks in which the shock speed $w$ exceeds the sound speed $v_{s1}$, i.e. the Mach number $M_1 > 1$. In the shock frame, the flow ahead of the shock ('upstream') is supersonic and the flow behind the shock ('downstream') is subsonic. For a shock to be compressive means that $p_2 > p_1$ and $\rho_2 > \rho_1$ (which is of course consistent with $v_2 < v_1$). The ideal gas equation implies that $p/\rho \propto T$ and hence that

$$\frac{T_2}{T_1} = \frac{p_2/\rho_2}{p_1/\rho_1}. \tag{32.24}$$

This can be used to show that

$$T_2 > T_1, \tag{32.25}$$

so that a shock wave not only slows the gas but also heats it up, thus converting kinetic energy into thermal energy. The conversion of ordered energy into random motion occurs via collisions. The thickness of a shock front is thus usually of the order of the collisional mean free path. We would expect from this that entropy increases as kinetic energy is converted into heat. The entropy increase to the gas downstream of the shock compared with that upstream is straightforwardly computed by using the relationship we established in eqn 16.93, namely

$$S = C_V\ln\left(\frac{p}{\rho^\gamma}\right) + \text{constant}. \tag{32.26}$$

Hence, the difference in entropy $\Delta S$ between the two regions is given by

$$\Delta S = S_2 - S_1 = C_V\ln\left[\frac{p_2}{p_1}\left(\frac{\rho_1}{\rho_2}\right)^{\gamma}\right]. \tag{32.27}$$

When we substitute eqns 32.19 and 32.20 into eqn 32.27 we obtain the following expression for the entropy difference across a shock:

$$\Delta S = C_V\ln\left\{\left[\frac{2\gamma M_1^2 - (\gamma - 1)}{\gamma + 1}\right]\left[\frac{2 + (\gamma - 1)M_1^2}{(\gamma + 1)M_1^2}\right]^{\gamma}\right\}. \tag{32.28}$$

This equation can be used to show that $\Delta S > 0$, so that entropy always increases as gas is shocked. Equation 32.28 is plotted in Fig. 32.4.

Fig. 32.4 The entropy change $\Delta S$ in units of $R$ for one mole of gas, as a function of Mach number $M_1$.
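The sign of the entropy change in eqn 32.28 can be explored with a minimal sketch. It evaluates $\Delta S/C_V$ for a monatomic gas ($\gamma = 5/3$) at a few Mach numbers; the sample values and the function name are illustrative assumptions.

```python
import math

def entropy_jump_over_cv(M1, gamma):
    """Delta S / C_V across a shock, from eqn 32.28."""
    pressure_ratio = (2.0 * gamma * M1**2 - (gamma - 1.0)) / (gamma + 1.0)
    inverse_density_ratio = (2.0 + (gamma - 1.0) * M1**2) / ((gamma + 1.0) * M1**2)
    return math.log(pressure_ratio * inverse_density_ratio**gamma)

gamma = 5.0 / 3.0
for M1 in (0.8, 1.0, 1.5, 2.0, 5.0):
    print(f"M1 = {M1}: Delta S / C_V = {entropy_jump_over_cv(M1, gamma):+.4f}")
```

The value vanishes at $M_1 = 1$ and is positive for $M_1 > 1$. Formally inserting $M_1 < 1$ gives a negative value, which is the expansive case that the second law rules out.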

Chapter summary
• Shock waves occur when a disturbance is propagating through a medium at a speed $w$ which is faster than the sound speed of the medium $v_s$.
• The Mach number $M = w/v_s$.
• Shocks convert kinetic energy into thermal energy.


Further reading Faber (1995) contains useful information on shocks in ﬂuids.

Exercises

(32.1) Show that the semi-angle of the cone of the shock waves shown in Fig. 32.1 is given by $\sin^{-1}(1/M)$, where $M$ is the Mach number.

(32.2) Use eqn 32.24 to show that

$$\frac{T_2}{T_1} = \frac{[2\gamma M_1^2 - (\gamma - 1)][2 + (\gamma - 1)M_1^2]}{(\gamma + 1)^2 M_1^2}, \tag{32.29}$$

and hence for $M_1 \gg 1$ we have that

$$\frac{T_2}{T_1} \to \frac{2\gamma(\gamma - 1)M_1^2}{(\gamma + 1)^2}. \tag{32.30}$$

(32.3) For a shock wave in a monatomic gas show that

$$\frac{\rho_2}{\rho_1} \to 4, \qquad \frac{p_2}{p_1} \to \frac{5}{4}M_1^2, \qquad \frac{T_2}{T_1} \to \frac{5}{16}M_1^2, \tag{32.31}$$

in the limit $M_1 \gg 1$.

(32.4) Air is mostly nitrogen (N₂) and oxygen (O₂), which are both diatomic gases and for which $\gamma = 7/5$. Show that in this case, in the limit $M_1 \gg 1$ we have that

$$\frac{\rho_2}{\rho_1} \to 6, \qquad \frac{p_2}{p_1} \to \frac{7}{6}M_1^2, \qquad \frac{T_2}{T_1} \to \frac{7}{36}M_1^2. \tag{32.32}$$

(32.5) Show that in the limit as the Mach number of a shock becomes large, the increase in entropy from the upstream material flowing into the shock to the downstream material flowing away from it is given by

$$\Delta S = C_V\ln\left[\frac{2\gamma M_1^2}{\gamma + 1}\left(\frac{\gamma - 1}{\gamma + 1}\right)^{\gamma}\right]. \tag{32.33}$$

33 Brownian motion and fluctuations

33.1 Brownian motion
33.2 Johnson noise
33.3 Fluctuations
33.4 Fluctuations and the availability
33.5 Linear response
33.6 Correlation functions
Chapter summary
Further reading
Exercises

Our treatment of the thermodynamic properties of thermal systems has assumed that we can replace quantities such as pressure by their average values. Even though the molecules in a gas hit the walls of their container stochastically, there are so many of them that the pressure does not appear to ﬂuctuate. But with very small systems, these ﬂuctuations can become important. In this chapter, we consider these ﬂuctuations in detail. A useful insight comes from the ﬂuctuation–dissipation theorem, which is derived from the assumption that the response of a system in thermodynamic equilibrium to a small external perturbation is the same as its response to a spontaneous ﬂuctuation. This implies that there is a direct relation between the ﬂuctuation properties of a thermal system and what are known as its linear response properties.

33.1 Brownian motion

We introduced Brownian motion in Section 19.4. There we showed that the equipartition theorem implies that the translational motion of particles at temperature $T$ fluctuates since each particle must have mean kinetic energy given by $\tfrac{1}{2}m\langle v^2\rangle = \tfrac{3}{2}k_BT$. Einstein, in his 1905 paper on Brownian motion, noted that the same random forces which cause Brownian motion of a particle would also cause drag if the particle were pulled through the fluid.

Example 33.1 Find the solution to the equation of motion (known as the Langevin equation) for the velocity $v$ of a particle of mass $m$ which is given by

$$m\dot{v} = -\alpha v + F(t), \tag{33.1}$$

where $\alpha$ is a damping constant (arising from friction) and $F(t)$ is a random force whose average value over a long time period, $\langle F\rangle$, is zero. Solution: Note first that in the absence of the random force, eqn 33.1 becomes

$$m\dot{v} = -\alpha v, \tag{33.2}$$

which has solution

$$v(t) = v(0)\exp[-t/(m\alpha^{-1})], \tag{33.3}$$

so that any velocity component dies away with a time constant given by $m/\alpha$. The random force $F(t)$ is necessary to give a model in which the particle's motion does not die away. To solve eqn 33.1, write $v = \dot{x}$ and premultiply both sides by $x$. This leads to

$$mx\ddot{x} = -\alpha x\dot{x} + xF(t). \tag{33.4}$$

Now

$$\frac{d}{dt}(x\dot{x}) = x\ddot{x} + \dot{x}^2, \tag{33.5}$$

and hence we have that

$$m\frac{d}{dt}(x\dot{x}) = m\dot{x}^2 - \alpha x\dot{x} + xF(t). \tag{33.6}$$

We now average this result over time. We note that $x$ and $F$ are uncorrelated, and hence $\langle xF\rangle = \langle x\rangle\langle F\rangle = 0$. We can also use the equipartition theorem, which here states that

$$\tfrac{1}{2}m\langle\dot{x}^2\rangle = \tfrac{1}{2}k_BT. \tag{33.7}$$

Hence, using eqn 33.7 in eqn 33.6, we have

$$m\frac{d}{dt}\langle x\dot{x}\rangle = k_BT - \alpha\langle x\dot{x}\rangle, \tag{33.8}$$

or equivalently

$$\left(\frac{d}{dt} + \frac{\alpha}{m}\right)\langle x\dot{x}\rangle = \frac{k_BT}{m}, \tag{33.9}$$

which has a solution

$$\langle x\dot{x}\rangle = Ce^{-\alpha t/m} + \frac{k_BT}{\alpha}. \tag{33.10}$$

Putting the boundary condition that $x = 0$ when $t = 0$, one can find that the constant $C = -k_BT/\alpha$, and hence

$$\langle x\dot{x}\rangle = \frac{k_BT}{\alpha}\left(1 - e^{-\alpha t/m}\right). \tag{33.11}$$

Using the identity

$$\frac{1}{2}\frac{d}{dt}\langle x^2\rangle = \langle x\dot{x}\rangle, \tag{33.12}$$

we then have

$$\langle x^2\rangle = \frac{2k_BT}{\alpha}\left[t - \frac{m}{\alpha}\left(1 - e^{-\alpha t/m}\right)\right]. \tag{33.13}$$

When $t \ll m/\alpha$,

$$\langle x^2\rangle = \frac{k_BTt^2}{m}, \tag{33.14}$$

while for $t \gg m/\alpha$,

$$\langle x^2\rangle = \frac{2k_BTt}{\alpha}. \tag{33.15}$$

Writing[1] $\langle x^2\rangle = 2Dt$, where $D$ is the diffusion constant, yields $D = k_BT/\alpha$.

[1] See Appendix C.12.
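The crossover between the short-time and long-time behaviour of eqn 33.13 can be seen numerically. In the sketch below the mass ($10^{-15}$ kg), damping constant ($10^{-8}$ kg s⁻¹) and temperature (300 K) are assumed, order-of-magnitude values for a micron-sized particle in a liquid; they do not come from the text, and the function name is illustrative.

```python
import math

K_B = 1.380649e-23  # Boltzmann constant, J/K

def mean_square_displacement(t, T, alpha, m):
    """<x^2> from eqn 33.13."""
    tau = m / alpha
    # expm1(-t/tau) = exp(-t/tau) - 1, so t + tau*expm1(-t/tau) = t - tau*(1 - exp(-t/tau)).
    return (2.0 * K_B * T / alpha) * (t + tau * math.expm1(-t / tau))

T, alpha, m = 300.0, 1.0e-8, 1.0e-15   # assumed illustrative values
tau = m / alpha                        # relaxation time m/alpha

t_short, t_long = 1e-3 * tau, 1e3 * tau
msd_short = mean_square_displacement(t_short, T, alpha, m)
msd_long = mean_square_displacement(t_long, T, alpha, m)

ballistic = K_B * T * t_short**2 / m       # eqn 33.14
diffusive = 2.0 * K_B * T * t_long / alpha  # eqn 33.15

print(f"t << m/alpha: {msd_short:.4e} vs ballistic {ballistic:.4e}")
print(f"t >> m/alpha: {msd_long:.4e} vs diffusive {diffusive:.4e}")
print(f"D = k_B T / alpha = {K_B * T / alpha:.3e} m^2/s")
```

The mass enters only through the relaxation time, so changing `m` shifts where the crossover occurs but leaves the long-time diffusive result untouched.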


If a steady force $F$ had been applied instead of a random one, then the terminal velocity (the velocity achieved in the steady state, with $\dot{v} = 0$) of the particle could have been obtained from

$$m\dot{v} = -\alpha v + F = 0, \tag{33.16}$$

yielding $v = \alpha^{-1}F$, and so $\alpha^{-1}$ plays the rôle of a mobility (the ratio of velocity to force). It is easy to understand that the terminal velocity should be limited by frictional forces, and hence depends on $\alpha$. However, the previous example shows that the diffusion constant $D$ is proportional to $k_BT$ and also to the mobility $\alpha^{-1}$. Note that the diffusion constant $D = k_BT/\alpha$ is independent of mass. The mass only enters in the transient term in eqn 33.13 (see also eqn 33.14) that disappears at long times. Remarkably, we have found that the diffusion rate $D$, describing the random fluctuations of the particle's position, is related to the frictional damping $\alpha$. The formula $D = k_BT/\alpha$ is an example of the fluctuation–dissipation theorem, which we will prove later in the chapter (Section 33.6). As a prelude to what will come later, the following example considers the correlation function for the Brownian motion problem.

Correlation functions are discussed in more detail in Section 33.6. The velocity correlation function $\langle v(0)v(t)\rangle$ is defined by

$$\langle v(0)v(t)\rangle = \lim_{T\to\infty}\frac{1}{T}\int_{-T/2}^{T/2} dt'\,v(t')\,v(t + t'),$$

and describes how well, on average, the velocity at a certain time is correlated with the velocity at a later time.

Example 33.2 Derive an expression for the velocity correlation function $\langle v(0)v(t)\rangle$ for the Brownian motion problem. Solution: The rate of change of $v$ is given by

$$\dot{v}(t) = \frac{v(t + \tau) - v(t)}{\tau} \tag{33.17}$$

in the limit in which $\tau \to 0$. Inserting this into eqn 33.1 and premultiplying by $v(0)$ gives

$$\frac{v(0)v(t + \tau) - v(0)v(t)}{\tau} = -\frac{\alpha}{m}v(0)v(t) + \frac{v(0)F(t)}{m}. \tag{33.18}$$

Averaging this equation, and noting that $\langle v(0)F(t)\rangle = 0$ because $v$ and $F$ are uncorrelated, yields

$$\frac{\langle v(0)v(t + \tau)\rangle - \langle v(0)v(t)\rangle}{\tau} = -\frac{\alpha}{m}\langle v(0)v(t)\rangle, \tag{33.19}$$

and taking the limit in which $\tau \to 0$ yields

$$\frac{d}{dt}\langle v(0)v(t)\rangle = -\frac{\alpha}{m}\langle v(0)v(t)\rangle, \tag{33.20}$$

and hence

$$\langle v(0)v(t)\rangle = \langle v(0)^2\rangle e^{-\alpha t/m}. \tag{33.21}$$

This example shows that the velocity correlation function decays to zero as time increases at exactly the same rate that the velocity itself relaxes (see eqn 33.3).

33.2 Johnson noise

We now consider another fluctuating system: the noise voltage which is generated across a resistor of resistance $R$ by thermal fluctuations. Let us suppose that the resistor is connected to a transmission line of length $L$ which is correctly terminated at each end, as shown in Fig. 33.1.[2] Because the transmission line is matched, it should not matter whether it is connected or not. The transmission line can support modes of wave vector $k = n\pi/L$ and frequency $\omega = ck$, and therefore there is one mode per frequency interval $\Delta\omega$ given by

$$\Delta\omega = \frac{c\pi}{L}. \tag{33.22}$$

By the equipartition theorem, each mode has mean energy $k_BT$, and hence the energy per unit length of transmission line, in an interval $\Delta\omega$, is given by

$$k_BT\frac{\Delta\omega}{c\pi}. \tag{33.23}$$

Half this energy is travelling from left to right, and half from right to left. Hence, the mean power incident on the resistor is given by

$$\frac{1}{2\pi}k_BT\,\Delta\omega, \tag{33.24}$$

and in equilibrium this must equal the mean power dissipated by the resistor, which is given by

$$\langle I^2\rangle R. \tag{33.25}$$

In the circuit, we have $I = V/(2R)$ and hence

$$\frac{\langle V^2\rangle}{4R} = \langle I^2\rangle R = \frac{1}{2\pi}k_BT\,\Delta\omega, \tag{33.26}$$

and hence

$$\langle V^2\rangle = \frac{2}{\pi}k_BTR\,\Delta\omega, \tag{33.27}$$

which, using $\Delta\omega = 2\pi\Delta f$, can be written in the form

$$\langle V^2\rangle = 4k_BTR\,\Delta f. \tag{33.28}$$

This expression is known as the Johnson noise produced across a resistor in a frequency interval $\Delta f$. It is another example of the connection between fluctuations and dissipation, since it relates fluctuating noise power ($\langle V^2\rangle$) to the dissipation in the circuit ($R$). We can derive a quantum mechanical version of the Johnson noise formula by replacing $k_BT$ by $\hbar\omega/(e^{\beta\hbar\omega} - 1)$, which yields

$$\langle V^2\rangle = \frac{2R}{\pi}\frac{\hbar\omega\,\Delta\omega}{e^{\beta\hbar\omega} - 1}. \tag{33.29}$$

[2] We will give a method of calculating the noise voltage that may seem a little artificial at first, but provides a convenient way of calculating how the resistor can exchange energy with a thermal reservoir. A more elegant approach will be given in Example 33.9.

Fig. 33.1 The equivalent circuit to consider the Johnson noise across a resistor. The resistor is connected to a matched transmission line which is correctly terminated, hence the presence of the second resistor; one can consider the noise voltage as being an alternating voltage source which is connected in series with the second resistor.
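To get a feel for the size of the Johnson noise in eqn 33.28, the following sketch evaluates the r.m.s. noise voltage $\sqrt{\langle V^2\rangle}$. The resistance (1 kΩ), temperature (300 K) and bandwidth (10 kHz) are assumed example values, and the function name is illustrative.

```python
import math

K_B = 1.380649e-23  # Boltzmann constant, J/K

def johnson_noise_vrms(R, T, bandwidth):
    """r.m.s. noise voltage sqrt(<V^2>) = sqrt(4 k_B T R df), eqn 33.28."""
    return math.sqrt(4.0 * K_B * T * R * bandwidth)

# Assumed example: 1 kilo-ohm resistor at 300 K measured over a 10 kHz bandwidth.
v_rms = johnson_noise_vrms(R=1.0e3, T=300.0, bandwidth=1.0e4)
print(f"r.m.s. noise voltage: {v_rms * 1e6:.2f} microvolts")

# Quadrupling R (or T, or the bandwidth) doubles the r.m.s. voltage.
v_rms_4R = johnson_noise_vrms(R=4.0e3, T=300.0, bandwidth=1.0e4)
print(f"ratio for 4R: {v_rms_4R / v_rms:.3f}")
```

For these inputs the noise is a fraction of a microvolt, which is why it matters mainly in sensitive, high-gain measurements.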


33.3 Fluctuations

In this section, we will consider the origin of fluctuations and show how much freedom a system has to allow the functions of state to fluctuate. We will focus on one such function of state, which we will call $x$, and ask the question: if the system is in equilibrium, what is the probability distribution of $x$? Let us suppose that the number of microstates associated with a system characterised by this parameter $x$ and having energy $E$ (which we will consider fixed[3]) is given by

$$\Omega(x, E). \tag{33.30}$$

[3] This part of the argument assumes that we are working in the microcanonical ensemble (see Section 4.6).

If $x$ were constrained to this value, the entropy $S$ of the system would be

$$S(x, E) = k_B\ln\Omega(x, E), \tag{33.31}$$

which we could write equivalently as $\Omega(x, E) = e^{S(x,E)/k_B}$. If $x$ were not constrained, its probability distribution function would then follow the function $p(x)$, where

$$p(x) \propto \Omega(x, E) = e^{S(x,E)/k_B}. \tag{33.32}$$

At equilibrium the system will maximize its entropy, and let us suppose that this occurs when $x = x_0$. Hence

$$\left(\frac{\partial S(x, E)}{\partial x}\right)_E = 0 \quad\text{when } x = x_0. \tag{33.33}$$

Let us now write a Taylor expansion of $S(x, E)$ around the equilibrium point $x = x_0$:

$$S(x, E) = S(x_0, E) + \left(\frac{\partial S}{\partial x}\right)_{x=x_0}(x - x_0) + \frac{1}{2}\left(\frac{\partial^2 S}{\partial x^2}\right)_{x=x_0}(x - x_0)^2 + \cdots, \tag{33.34}$$

which with eqn 33.33 implies that

$$S(x) = S(x_0) + \frac{1}{2}\left(\frac{\partial^2 S}{\partial x^2}\right)_{x=x_0}(x - x_0)^2 + \cdots. \tag{33.35}$$

Hence, defining $\Delta x = x - x_0$, we can write the probability function as a Gaussian,

$$p(x) \propto \exp\left(-\frac{(\Delta x)^2}{2\langle(\Delta x)^2\rangle}\right), \tag{33.36}$$

where

$$\langle(\Delta x)^2\rangle = -\frac{k_B}{\left(\dfrac{\partial^2 S}{\partial x^2}\right)_E}. \tag{33.37}$$

This equation shows that if the entropy $S$ changes rapidly as a function of $x$, we are more likely to find the system with $x$ close to $x_0$. This makes sense.

Example 33.3 Let $x$ be the internal energy $U$ for a system with fixed volume. Using $T = (\partial U/\partial S)_V$, we have that

$$\left(\frac{\partial^2 S}{\partial U^2}\right)_V = \left(\frac{\partial(1/T)}{\partial U}\right)_V = -\frac{1}{T^2C_V}, \tag{33.38}$$

and hence

$$\langle(\Delta U)^2\rangle = -\frac{k_B}{\left(\dfrac{\partial^2 S}{\partial U^2}\right)_V} = k_BT^2C_V. \tag{33.39}$$

So if a system is in contact with a bath at temperature $T$, there is a nonzero probability that we may find the system away from the equilibrium internal energy: thus $U$ can fluctuate. The size of the fluctuations is larger if the heat capacity is larger.

Both the heat capacity $C_V$ and the internal energy $U$ are extensive parameters and therefore they scale with the size of the system. The r.m.s. fluctuations of $U$ scale with the square root of the size of the system, so the fractional r.m.s. fluctuations scale with the size of the system to the power $-\tfrac{1}{2}$. Thus if the system has $N$ atoms, then

$$C \propto N, \qquad U \propto N, \qquad \sqrt{\langle(\Delta U)^2\rangle} \propto \sqrt{N}, \tag{33.40}$$

and

$$\frac{\sqrt{\langle(\Delta U)^2\rangle}}{U} \propto \frac{1}{\sqrt{N}}. \tag{33.41}$$

Hence as $N \to \infty$, we can ignore fluctuations. Fluctuations are more important in small systems. However, note that at a critical point for a first-order phase transition, $C \to \infty$ and hence

$$\frac{\sqrt{\langle(\Delta U)^2\rangle}}{U} \to \infty. \tag{33.42}$$

Hence fluctuations become divergent at the critical point and cannot be ignored, even for large systems.
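The $1/\sqrt{N}$ scaling of eqn 33.41 can be made concrete for a simple case. The sketch below applies eqn 33.39 to a monatomic ideal gas, assuming $C_V = \tfrac{3}{2}Nk_B$ and $U = \tfrac{3}{2}Nk_BT$ (standard results that are brought in here as assumptions, since they are not restated in this section). The function names are illustrative.

```python
import math

K_B = 1.380649e-23  # Boltzmann constant, J/K

def energy_fluctuation_rms(T, C_V):
    """sqrt(<(Delta U)^2>) = sqrt(k_B T^2 C_V), from eqn 33.39."""
    return math.sqrt(K_B * T**2 * C_V)

def fractional_energy_fluctuation(N, T):
    """r.m.s. fluctuation of U divided by U for a monatomic ideal gas (assumed model)."""
    C_V = 1.5 * N * K_B       # assumed heat capacity of a monatomic ideal gas
    U = 1.5 * N * K_B * T     # assumed internal energy of a monatomic ideal gas
    return energy_fluctuation_rms(T, C_V) / U

for N in (1e2, 1e6, 1e12, 1e24):
    print(f"N = {N:.0e}: fractional fluctuation = {fractional_energy_fluctuation(N, 300.0):.3e}")
```

Under these assumptions the ratio reduces to $\sqrt{2/(3N)}$, independent of temperature, so only very small systems show appreciable energy fluctuations away from a critical point.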

33.4

Fluctuations and the availability

We now generalize an argument presented in Section 16.5 to the case in which numbers of particles can ﬂuctuate. Consider a system in contact with a reservoir. The reservoir has temperature T0 , pressure p0 and chemical potential µ0 . Let us consider what happens when we transfer energy dU , volume dV and dN particles from the reservoir to the system. The internal energy of the reservoir changes by dU0 , where dU0 = −dU = T0 dS0 − p0 (−dV ) + µ0 (−dN ),

(33.43)

Fluctuations and the availability 373

374 Brownian motion and ﬂuctuations

where the minus signs express the fact that the energy, volume and number of particles in the reservoir are decreasing. We can rearrange this expression to give the change of entropy in the reservoir as
$$dS_0 = \frac{-dU - p_0\,dV + \mu_0\,dN}{T_0}. \qquad (33.44)$$

If the entropy of the system changes by $dS$, then the total change of entropy $dS_{\rm tot}$ is
$$dS_{\rm tot} = dS + dS_0, \qquad (33.45)$$
and the second law of thermodynamics implies that $dS_{\rm tot} \geq 0$. Using eqn 33.44, we have that
$$dS_{\rm tot} = -\frac{dU - T_0\,dS + p_0\,dV - \mu_0\,dN}{T_0}, \qquad (33.46)$$
which can be written as
$$dS_{\rm tot} = -\frac{dA}{T_0}, \qquad (33.47)$$

where $A = U - T_0 S + p_0 V - \mu_0 N$ is the availability (this generalizes eqn 16.32). We now apply the concept of availability to fluctuations. Let us suppose that the availability depends on some variable $x$, so that we can write a function $A(x)$. Equilibrium will be achieved when $A(x)$ is minimized (so that $S_{\rm tot}$ is maximized, see eqn 33.47) and let us suppose that this occurs when $x = x_0$. Hence we can similarly write $A(x)$ in a Taylor expansion around the equilibrium point and hence
$$A(x) = A(x_0) + \frac{1}{2}\left(\frac{\partial^2 A}{\partial x^2}\right)_{x=x_0}(\Delta x)^2 + \cdots, \qquad (33.48)$$
so that we can recover the probability distribution in eqn 33.36 with
$$\langle(\Delta x)^2\rangle = \frac{k_B T_0}{\left(\partial^2 A/\partial x^2\right)_{x=x_0}}. \qquad (33.49)$$
The second derivative is positive because $A$ is a minimum at $x_0$, so the mean square fluctuation is positive.

Example 33.4 A system with a fixed number $N$ of particles is in thermal contact with a reservoir at temperature $T$. It is surrounded by a tensionless membrane so that its volume is able to fluctuate. Calculate the mean square volume fluctuations. For the special case of an ideal gas, show that $\langle(\Delta V)^2\rangle = V^2/N$. Solution: Fixing $T$ and $N$ means that $U$ can fluctuate. Fixing $N$ implies that $dN = 0$ and hence we have that
$$dU = T\,dS - p\,dV. \qquad (33.50)$$
Changes in the availability therefore follow:
$$dA = dU - T_0\,dS + p_0\,dV = (T - T_0)\,dS + (p_0 - p)\,dV, \qquad (33.51)$$
and hence
$$\left(\frac{\partial A}{\partial V}\right)_{T,N} = p_0 - p \qquad (33.52)$$
and
$$\left(\frac{\partial^2 A}{\partial V^2}\right)_{T,N} = -\left(\frac{\partial p}{\partial V}\right)_{T,N}. \qquad (33.53)$$
Hence
$$\langle(\Delta V)^2\rangle = -k_B T_0\left(\frac{\partial V}{\partial p}\right)_{T,N}. \qquad (33.54)$$
For an ideal gas, $(\partial V/\partial p)_{T,N} = -Nk_BT/p^2 = -V/p$, and hence
$$\langle(\Delta V)^2\rangle = \frac{V^2}{N}. \qquad (33.55)$$

Equation 33.55 implies that the fractional volume fluctuations follow
$$\frac{\sqrt{\langle(\Delta V)^2\rangle}}{V} = \frac{1}{N^{1/2}}. \qquad (33.56)$$
Thus for a box containing $10^{24}$ molecules of gas (a little over a mole of gas), the fractional volume fluctuations are at the level of one part in $10^{12}$. We can derive other similar expressions for other fluctuating variables, including
$$\langle(\Delta T)^2\rangle = \frac{k_B T^2}{C_V}, \qquad (33.57)$$
$$\langle(\Delta S)^2\rangle = k_B C_p, \qquad (33.58)$$
$$\langle(\Delta p)^2\rangle = \frac{k_B T}{V\kappa_S}, \qquad (33.59)$$

where κS is the adiabatic compressibility (see eqn 16.72).
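As a numerical illustration of eqn 33.54, the sketch below evaluates $-k_BT(\partial V/\partial p)_{T,N}$ for an ideal gas with a central finite difference and compares it with $V^2/N$ from eqn 33.55. The function names, the step size and the chosen pressure, temperature and particle number are assumptions made for the example.

```python
import math

K_B = 1.380649e-23  # Boltzmann constant in J/K


def ideal_gas_volume(pressure, n_particles, temperature):
    """Equation of state V = N k_B T / p."""
    return n_particles * K_B * temperature / pressure


def mean_square_volume_fluctuation(pressure, n_particles, temperature, rel_step=1e-6):
    """Evaluate eqn 33.54 with a central-difference estimate of (dV/dp)_{T,N}."""
    dp = pressure * rel_step
    dv_dp = (ideal_gas_volume(pressure + dp, n_particles, temperature)
             - ideal_gas_volume(pressure - dp, n_particles, temperature)) / (2.0 * dp)
    return -K_B * temperature * dv_dp


# Illustrative values: 1e24 molecules at 300 K and 1e5 Pa.
p, n, t = 1.0e5, 1.0e24, 300.0
volume = ideal_gas_volume(p, n, t)
fractional = math.sqrt(mean_square_volume_fluctuation(p, n, t)) / volume
print(f"fractional volume fluctuation = {fractional:.3e}")
```

The printed ratio should be close to $N^{-1/2} = 10^{-12}$, in line with eqn 33.56.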

33.5

Linear response

In order to understand in more detail the relationship between ﬂuctuations and dissipation, it is necessary to consider how systems respond to external forces in a rather more general way. We consider a displacement variable x(t) that is the result of some force f (t), and require that the product xf has the dimensions of energy. (We will say that x and f are conjugate variables if their product has the dimensions of energy.) We assume that the response of x to a force f is linear (so that, for example, doubling the force doubles the response), but there could be


some delay in the way in which the system responds. The most general way of writing this down is as follows: we say that the average value of $x$ at time $t$ is denoted by $\langle x(t)\rangle_f$ (the subscript $f$ reminds us that a force $f$ has been applied) and is given by
$$\langle x(t)\rangle_f = \int_{-\infty}^{\infty} \chi(t - t')\,f(t')\,dt', \qquad (33.60)$$
where $\chi(t - t')$ is a response function. This relates the value of $x(t)$ to a sum over values of the force $f(t')$ at all other times. Now it makes sense to sum over past values of the force, but not to sum over future values of the force. This will force the response function $\chi(t - t')$ to be zero if $t < t'$. Before seeing what effect this has, we need to Fourier transform eqn 33.60 to make it simpler to deal with. The Fourier transform of $x(t)$ is the function $\tilde{x}(\omega)$ given by
$$\tilde{x}(\omega) = \int_{-\infty}^{\infty} dt\, e^{-i\omega t}\,x(t). \qquad (33.61)$$
The inverse transform is then given by
$$x(t) = \frac{1}{2\pi}\int_{-\infty}^{\infty} d\omega\, e^{i\omega t}\,\tilde{x}(\omega). \qquad (33.62)$$
The expression in eqn 33.60 is a convolution of the functions $\chi$ and $f$, and hence by the convolution theorem we can write this equation in Fourier transform form as
$$\langle\tilde{x}(\omega)\rangle_f = \tilde{\chi}(\omega)\,\tilde{f}(\omega). \qquad (33.63)$$

This is much simpler than eqn 33.60 as it is a product, rather than a convolution. Note that the response function $\tilde{\chi}(\omega)$ can be complex. The real part of the response function gives the part of the displacement which is in phase with the force. The imaginary part of the response function gives a displacement which is $\pi/2$ out of phase with the force. It corresponds to dissipation because the external force does work on the system at a rate given by the force multiplied by the velocity, i.e. $f(t)\dot{x}(t)$, and this work is dissipated as heat. For $f(t)$ and $\dot{x}(t)$ to be in phase, and hence give a non-zero average, $f(t)$ and $x(t)$ have to be $\pi/2$ out of phase (see Exercise 33.2). We can build causality into our problem by writing the response function as
$$\chi(t) = y(t)\,\theta(t), \qquad (33.64)$$
where $\theta(t)$ is a Heaviside step function (see Fig. 30.1) and $y(t)$ is a function which equals $\chi(t)$ when $t > 0$ and can equal anything at all when $t < 0$. For the convenience of the following derivation, we will set $y(t) = -\chi(|t|)$ when $t < 0$, making $y(t)$ an odd function (and, importantly, making $\tilde{y}(\omega)$ purely imaginary). By the inverse convolution theorem, the Fourier transform of $\chi(t)$ is given by the convolution
$$\tilde{\chi}(\omega) = \frac{1}{2\pi}\int_{-\infty}^{\infty} d\omega'\, \tilde{\theta}(\omega' - \omega)\,\tilde{y}(\omega'). \qquad (33.65)$$

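Writing the Heaviside step function as
$$\theta(t) = \begin{cases} e^{-\epsilon t} & t > 0 \\ 0 & t < 0 \end{cases} \qquad (33.66)$$
in the limit in which $\epsilon \to 0$, its Fourier transform is given by
$$\tilde{\theta}(\omega) = \int_0^{\infty} dt\, e^{-i\omega t}e^{-\epsilon t} = \frac{1}{i\omega + \epsilon} = \frac{\epsilon}{\omega^2 + \epsilon^2} - \frac{i\omega}{\omega^2 + \epsilon^2}. \qquad (33.67)$$
Thus, taking the limit $\epsilon \to 0$, we have that
$$\tilde{\theta}(\omega) = \pi\delta(\omega) - \frac{i}{\omega}. \qquad (33.68)$$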

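Substituting this into eqn 33.65 yields4
$$\tilde{\chi}(\omega) = \frac{1}{2}\tilde{y}(\omega) - \frac{i}{2\pi}\,\mathcal{P}\!\int_{-\infty}^{\infty} \frac{\tilde{y}(\omega')\,d\omega'}{\omega' - \omega}. \qquad (33.69)$$
We now write $\tilde{\chi}(\omega)$ in terms of its real and imaginary parts:
$$\tilde{\chi}(\omega) = \tilde{\chi}'(\omega) + i\tilde{\chi}''(\omega), \qquad (33.70)$$
and since $\tilde{y}(\omega)$ is purely imaginary, eqn 33.69 yields
$$i\tilde{\chi}''(\omega) = \frac{1}{2}\tilde{y}(\omega), \qquad (33.71)$$
and hence
$$\tilde{\chi}'(\omega) = \mathcal{P}\!\int_{-\infty}^{\infty} \frac{d\omega'}{\pi}\,\frac{\tilde{\chi}''(\omega')}{\omega' - \omega}. \qquad (33.72)$$

4 The symbol $\mathcal{P}$ denotes the Cauchy principal value of the integral. This means that an integral whose integrand blows up at some value is evaluated using an appropriate limit. For example, $\int_{-1}^{1} dx/x$ is undefined since $1/x \to \infty$ at $x = 0$, but
$$\mathcal{P}\!\int_{-1}^{1} \frac{dx}{x} = \lim_{\epsilon \to 0^+}\left(\int_{-1}^{-\epsilon} \frac{dx}{x} + \int_{\epsilon}^{1} \frac{dx}{x}\right) = 0.$$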

This is one of the Kramers–Kronig relations which connects the real and imaginary parts of the response function.5 Note that our derivation has only assumed that the response is linear (eqn 33.60) and causal, so that the Kramers–Kronig relations are very general. By putting $\omega = 0$ into eqn 33.72, we obtain another very useful result:
$$\tilde{\chi}'(0) = \mathcal{P}\!\int_{-\infty}^{\infty} \frac{d\omega'}{\pi}\,\frac{\tilde{\chi}''(\omega')}{\omega'}. \qquad (33.73)$$

Sometimes the response function is called a generalized susceptibility, and the zero frequency real part, $\tilde{\chi}'(0)$, is called the static susceptibility. As discussed above, the imaginary part of the response function, $\tilde{\chi}''(\omega)$, corresponds to the dissipation of the system. Equation 33.73 therefore shows that the static susceptibility (the response at zero frequency) is related to an integral of the total dissipation of the system.

5

The other Kramers–Kronig relation is derived in Exercise 33.3.
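Equation 33.73 can be checked numerically. The sketch below assumes the damped harmonic oscillator response function $\tilde{\chi}(\omega) = 1/[m(\omega_0^2 - \omega^2 - i\omega\gamma)]$ treated in Example 33.5, with illustrative parameter values, and uses a plain midpoint rule with a finite cut-off frequency. The function names, parameters and cut-off are assumptions of this example.

```python
import math


def chi(omega, m=1.0, omega0=1.0, gamma=0.5):
    """Complex response function of a damped harmonic oscillator (eqn 33.76)."""
    return 1.0 / (m * (omega0**2 - omega**2 - 1j * omega * gamma))


def static_susceptibility_from_dissipation(m=1.0, omega0=1.0, gamma=0.5,
                                           omega_max=200.0, steps=200_000):
    """Right-hand side of eqn 33.73, integrated with the midpoint rule.

    The integrand chi''(omega)/omega is even in omega and finite at omega = 0,
    so the integral over the whole axis is twice the integral over (0, omega_max).
    The neglected tail beyond omega_max falls off as omega**(-4).
    """
    h = omega_max / steps
    total = 0.0
    for i in range(steps):
        w = (i + 0.5) * h  # midpoints never hit omega = 0 exactly
        total += chi(w, m, omega0, gamma).imag / w
    return 2.0 * total * h / math.pi


lhs = chi(0.0).real                             # static susceptibility chi'(0)
rhs = static_susceptibility_from_dissipation()  # integral of the dissipation
print(f"chi'(0) = {lhs:.6f}, integral = {rhs:.6f}")
```

Both numbers should agree with $1/(m\omega_0^2)$ to within the small truncation error of the cut-off.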


Example 33.5 Find the response function for the damped harmonic oscillator (mass $m$, spring constant $k$, damping $\alpha$) whose equation of motion is given by
$$m\ddot{x} + \alpha\dot{x} + kx = f \qquad (33.74)$$
and show that eqn 33.73 holds. Solution: Writing the resonant frequency $\omega_0^2 = k/m$, and writing the damping $\gamma = \alpha/m$, we have
$$\ddot{x} + \gamma\dot{x} + \omega_0^2 x = \frac{f}{m}, \qquad (33.75)$$
and Fourier transforming this gives immediately that
$$\tilde{\chi}(\omega) = \frac{\tilde{x}(\omega)}{\tilde{f}(\omega)} = \frac{1}{m}\,\frac{1}{\omega_0^2 - \omega^2 - i\omega\gamma}. \qquad (33.76)$$
Hence, the imaginary part of the response function is
$$\tilde{\chi}''(\omega) = \frac{1}{m}\,\frac{\omega\gamma}{(\omega^2 - \omega_0^2)^2 + (\omega\gamma)^2}, \qquad (33.77)$$
and the static susceptibility is
$$\tilde{\chi}'(0) = \frac{1}{m\omega_0^2} = \frac{1}{k}. \qquad (33.78)$$
The real and imaginary parts of $\tilde{\chi}(\omega)$ are plotted in Fig. 33.2(a). The imaginary part shows a peak near $\omega_0$. Equation 33.77 shows that $\tilde{\chi}''(\omega)/\omega = (\gamma/m)/[(\omega^2 - \omega_0^2)^2 + (\omega\gamma)^2]$ and straightforward integration shows that $\int_{-\infty}^{\infty} (\tilde{\chi}''(\omega)/\omega)\,d\omega = \pi/(m\omega_0^2) = \pi\tilde{\chi}'(0)$ and hence that eqn 33.73 holds. This is illustrated in Fig. 33.2(b).

Fig. 33.2 (a) The real and imaginary parts of $\tilde{\chi}(\omega)$ as a function of $\omega$. (b) An illustration of eqn 33.73 for the damped harmonic oscillator.

33.6

Correlation functions

Consider a function $x(t)$. Its Fourier transform6 is given by
$$\tilde{x}(\omega) = \int_{-\infty}^{\infty} dt\, e^{-i\omega t}\,x(t), \qquad (33.79)$$
as before, and we define the power spectral density as $\langle|\tilde{x}(\omega)|^2\rangle$. This function shows how much power is associated with different parts of the frequency spectrum. We now define the autocorrelation function $C_{xx}(t)$ by
$$C_{xx}(t) = \langle x(0)x(t)\rangle = \int_{-\infty}^{\infty} x^*(t')\,x(t' + t)\,dt'. \qquad (33.80)$$

6 See Appendix C.11.

The notation here is that the double subscript means we are measuring how much $x$ at one time is correlated with $x$ at another time. (We could also define a cross-correlation function $C_{xy}(t) = \langle x(0)y(t)\rangle$ which measures how much $x$ at one time is correlated with a different variable $y$ at another time.) The autocorrelation function is connected to the power spectral density by the Wiener–Khinchin theorem7 which states that the power spectral density is given by the Fourier transform of the autocorrelation function:
$$\langle|\tilde{x}(\omega)|^2\rangle = \tilde{C}_{xx}(\omega) = \int_{-\infty}^{\infty} e^{-i\omega t}\,\langle x(0)x(t)\rangle\,dt. \qquad (33.81)$$
The inverse relation also must hold:
$$\langle x(0)x(t)\rangle = \frac{1}{2\pi}\int_{-\infty}^{\infty} e^{i\omega t}\,\langle|\tilde{x}(\omega)|^2\rangle\,d\omega, \qquad (33.82)$$
and hence for $t = 0$ we have that
$$\langle x(0)x(0)\rangle = \frac{1}{2\pi}\int_{-\infty}^{\infty} \langle|\tilde{x}(\omega)|^2\rangle\,d\omega, \qquad (33.83)$$
or, more succinctly,
$$\langle x^2\rangle = \frac{1}{2\pi}\int_{-\infty}^{\infty} \tilde{C}_{xx}(\omega)\,d\omega. \qquad (33.84)$$

7 Norbert Wiener (1894–1964); Alexsandr Y. Khinchin (1894–1959). The proof of this theorem is given in Appendix C.11.

This is a form of Parseval’s theorem that states that the integrated power is the same whether you integrate over time or over frequency.8

Example 33.6 A random force $F(t)$ has average value given by
$$\langle F(t)\rangle = 0 \qquad (33.85)$$
and its autocorrelation function is given by
$$\langle F(t)F(t')\rangle = A\,\delta(t - t'), \qquad (33.86)$$
where $\delta(t - t')$ is a Dirac delta function.9 Find the power spectrum. Solution: By the Wiener–Khinchin theorem, the power spectrum is simply the Fourier transform of the autocorrelation function, and hence
$$\langle|\tilde{F}(\omega)|^2\rangle = A \qquad (33.87)$$
is a flat power spectrum.

This demonstrates that if the random force $F(t)$ has zero autocorrelation, it must have infinite frequency content.

8 Parseval's theorem is actually nothing more than Pythagoras' theorem in an infinite-dimensional vector space. If you think of the function $x(t)$, or its transform $\tilde{x}(\omega)$, as a single vector in such a space, then the square of the length of the vector is equal to the sum of the squares on the 'other sides', which in this case is the sum of the squares of the components (i.e. an integral of the squares of the values of the function).

9 See Appendix C.10.

Example 33.7 Find the velocity autocorrelation for the Brownian motion particle governed by eqn 33.1 where the random force $F(t)$ is as described in the previous example, i.e. with $\langle F(t)F(t')\rangle = A\,\delta(t - t')$. Hence relate the constant $A$ to the temperature $T$. Solution: Equation 33.1 states that
$$m\dot{v} = -\alpha v + F(t), \qquad (33.88)$$
and the Fourier transform of this equation is
$$\tilde{v}(\omega) = \frac{\tilde{F}(\omega)}{\alpha - im\omega}. \qquad (33.89)$$
This implies that the Fourier transform of the velocity autocorrelation function is
$$\tilde{C}_{vv}(\omega) = \langle|\tilde{v}(\omega)|^2\rangle = \frac{A}{\alpha^2 + m^2\omega^2}, \qquad (33.90)$$
using the result of eqn 33.87. The Wiener–Khinchin theorem states that
$$\tilde{C}_{vv}(\omega) = \int e^{-i\omega t}\,\langle v(0)v(t)\rangle\,dt, \qquad (33.91)$$
and hence
$$C_{vv}(t) = \langle v(0)v(t)\rangle = \langle v^2\rangle\,e^{-\alpha t/m}, \qquad (33.92)$$
in agreement with eqn 33.21 derived earlier using another method. Parseval's theorem (eqn 33.84) implies that
$$\langle v^2\rangle = \int_{-\infty}^{\infty} \frac{d\omega}{2\pi}\,\tilde{C}_{vv}(\omega) = \frac{A}{2m\alpha}. \qquad (33.93)$$
Equipartition, which gives that $\frac{1}{2}m\langle v^2\rangle = \frac{1}{2}k_BT$, leads immediately to
$$A = 2\alpha k_BT. \qquad (33.94)$$
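The result $A = 2\alpha k_BT$ can be illustrated with a small stochastic simulation. The sketch below integrates eqn 33.88 with a simple Euler scheme in which the delta-correlated force of eqn 33.86 is replaced by independent Gaussian kicks of variance $A\,\Delta t$ per time step. That discretization, the reduced units ($m = \alpha = k_BT = 1$), the step size and the seed are all assumptions of this example rather than anything specified in the text.

```python
import math
import random


def simulate_mean_square_velocity(m=1.0, alpha=1.0, kT=1.0, dt=0.01,
                                  steps=500_000, seed=0):
    """Time-average of v**2 for m dv/dt = -alpha v + F(t), eqn 33.88.

    The random force has <F(t)F(t')> = A delta(t - t') with A = 2 alpha kT
    (eqn 33.94). Over one step of length dt its impulse is modelled as a
    Gaussian with variance A * dt (an Euler-Maruyama discretization).
    """
    rng = random.Random(seed)
    a_const = 2.0 * alpha * kT
    kick = math.sqrt(a_const * dt) / m  # r.m.s. velocity change per step
    v = 0.0
    total = 0.0
    for _ in range(steps):
        v += -(alpha / m) * v * dt + kick * rng.gauss(0.0, 1.0)
        total += v * v
    return total / steps


mean_v2 = simulate_mean_square_velocity()
print(f"m<v^2>/kT = {mean_v2:.3f} (equipartition predicts a value close to 1)")
```

The estimate carries a statistical error of a few per cent and a small bias from the finite time step, so it should be read as a consistency check of eqns 33.93 and 33.94, not as a precise measurement.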

Let us next suppose that the energy $E$ of a harmonic system is given by $E = \frac{1}{2}\alpha x^2$ (as in Chapter 19). The probability $P(x)$ of the system taking the value $x$ is given by a Boltzmann factor $e^{-\beta E}$ and hence
$$P(x) = \mathcal{N}e^{-\beta\alpha x^2/2}, \qquad (33.95)$$
where $\mathcal{N}$ is a normalization constant. Now we apply a force $f$ which is conjugate to $x$ so that the energy $E$ is lowered by $xf$. The probability $P(x)$ becomes
$$P(x) = \mathcal{N}e^{-\beta(\alpha x^2/2 - xf)}, \qquad (33.96)$$
and by completing the square, this can be rewritten as
$$P(x) = \mathcal{N}'e^{-\frac{\beta\alpha}{2}\left(x - \frac{f}{\alpha}\right)^2}, \qquad (33.97)$$
where $\mathcal{N}'$ is a different normalization constant. This equation is of the usual Gaussian form
$$P(x) = \mathcal{N}'e^{-(x - \langle x\rangle_f)^2/2\langle x^2\rangle}, \qquad (33.98)$$
where $\langle x\rangle_f = f/\alpha$ and $\langle x^2\rangle = 1/\beta\alpha$. Notice that $\langle x\rangle_f$ is telling us about the average value of $x$ in response to the force $f$, while $\langle x^2\rangle = k_BT/\alpha$ is telling us about fluctuations in $x$. The ratio of these two quantities is given by
$$\frac{\langle x\rangle_f}{\langle x^2\rangle} = \beta f. \qquad (33.99)$$
Now $\langle x\rangle_f$ is the average value $x$ takes when a force $f$ is applied, and we know that $\langle x\rangle_f$ is related10 to $f$ by the static susceptibility by
$$\frac{\langle x\rangle_f}{f} = \tilde{\chi}'(0), \qquad (33.100)$$

so that eqn 33.99 can be rewritten as
$$\langle x^2\rangle = k_BT\,\tilde{\chi}'(0). \qquad (33.101)$$
Equation 33.101 thus relates $\langle x^2\rangle$ to the static susceptibility of the system. Using eqn 33.73, we can express this relationship as
$$\langle x^2\rangle = k_BT\int_{-\infty}^{\infty} \frac{d\omega'}{\pi}\,\frac{\tilde{\chi}''(\omega')}{\omega'}, \qquad (33.102)$$
and together with eqn 33.84, this motivates
$$\tilde{C}_{xx}(\omega) = 2k_BT\,\frac{\tilde{\chi}''(\omega)}{\omega}, \qquad (33.103)$$
which is a statement of the fluctuation–dissipation theorem. This shows that there is a direct connection between the autocorrelation function of the fluctuations, $\tilde{C}_{xx}(\omega)$, and the imaginary part $\tilde{\chi}''(\omega)$ of the response function which is associated with dissipation.

Example 33.8 Show that eqn 33.103 holds for the problem considered in Example 33.5. Solution: Recall from Example 33.7 that
$$\tilde{C}_{xx}(\omega) = \int e^{-i\omega t}\,\langle x(0)x(t)\rangle\,dt = \langle|\tilde{x}(\omega)|^2\rangle = A\,|\tilde{\chi}(\omega)|^2, \qquad (33.104)$$
and hence using $\tilde{\chi}(\omega)$ from eqn 33.76 and $A$ from eqn 33.94, we have that
$$\tilde{C}_{xx}(\omega) = \frac{2\gamma k_BT}{m}\,\frac{1}{(\omega^2 - \omega_0^2)^2 + (\omega\gamma)^2}. \qquad (33.105)$$
Equation 33.77 shows that
$$2k_BT\,\frac{\tilde{\chi}''(\omega)}{\omega} = \frac{2\gamma k_BT}{m}\,\frac{1}{(\omega^2 - \omega_0^2)^2 + (\omega\gamma)^2}, \qquad (33.106)$$
and hence eqn 33.103 holds.

10 Here we are making the assumption that the linear response function $\tilde{\chi}(\omega)$ governs both fluctuations and the usual response to perturbations.
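The equality of eqns 33.105 and 33.106 can also be confirmed numerically at arbitrary frequencies. The sketch below compares $A|\tilde{\chi}(\omega)|^2$ with $2k_BT\tilde{\chi}''(\omega)/\omega$ for the damped oscillator; the function names and the parameter values are illustrative assumptions.

```python
import math


def chi(omega, m, omega0, gamma):
    """Damped-oscillator response function, eqn 33.76."""
    return 1.0 / (m * (omega0**2 - omega**2 - 1j * omega * gamma))


def fluctuation_side(omega, m, omega0, gamma, kT):
    """A |chi|^2 with A = 2 alpha kT and alpha = m gamma (eqns 33.94, 33.104)."""
    a_const = 2.0 * (m * gamma) * kT
    return a_const * abs(chi(omega, m, omega0, gamma)) ** 2


def dissipation_side(omega, m, omega0, gamma, kT):
    """2 kT chi''(omega) / omega, the right-hand side of eqn 33.103."""
    return 2.0 * kT * chi(omega, m, omega0, gamma).imag / omega


# Illustrative parameters in arbitrary units.
params = dict(m=2.0, omega0=3.0, gamma=0.7, kT=1.5)
for omega in (0.1, 1.0, 3.0, 10.0):
    left = fluctuation_side(omega, **params)
    right = dissipation_side(omega, **params)
    print(f"omega = {omega:5.1f}: {left:.6e}  {right:.6e}")
```

The two columns agree to floating-point precision, which is the content of the fluctuation–dissipation theorem for this system.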

Example 33.9 Derive an expression for the Johnson noise across a resistor $R$ using the circuit in Fig. 33.3 (which includes the small capacitance across the ends of the resistor). Solution: Simple circuit theory yields
$$V + IR = \frac{Q}{C}. \qquad (33.107)$$
The charge $Q$ and voltage $V$ are conjugate variables (their product has dimensions of energy) and so we write
$$\tilde{Q}(\omega) = \tilde{\chi}(\omega)\,\tilde{V}(\omega), \qquad (33.108)$$
where the response function $\tilde{\chi}(\omega)$ is given for this circuit by
$$\tilde{\chi}(\omega) = \frac{1}{C^{-1} - i\omega R}. \qquad (33.109)$$
Hence $\tilde{\chi}''(\omega)$ is given by
$$\tilde{\chi}''(\omega) = \frac{\omega R}{C^{-2} + \omega^2R^2}. \qquad (33.110)$$
At low frequency ($\omega \ll 1/RC$, and since the capacitance will be small, $1/RC$ will be very high so that this is not a severe restriction) we have that $\tilde{\chi}''(\omega) \to \omega RC^2$. Thus the fluctuation–dissipation theorem (eqn 33.103) gives
$$\tilde{C}_{QQ}(\omega) = 2k_BT\,\frac{\tilde{\chi}''(\omega)}{\omega} = 2k_BT\,RC^2. \qquad (33.111)$$
Because $Q = CV$ for a capacitor, correlations in $Q$ and $V$ are related by
$$\tilde{C}_{VV}(\omega) = \frac{\tilde{C}_{QQ}(\omega)}{C^2}, \qquad (33.112)$$
and hence
$$\tilde{C}_{VV}(\omega) = 2k_BT\,R. \qquad (33.113)$$
Equation 33.84 implies that
$$\langle V^2\rangle = \frac{1}{2\pi}\int_{-\infty}^{\infty} \tilde{C}_{VV}(\omega)\,d\omega, \qquad (33.114)$$
and hence if this integral is carried out, not over all frequencies, but only in a small interval $\Delta f = \Delta\omega/(2\pi)$ about some frequency $\pm\omega_0$ (see Fig. 33.4),
$$\langle V^2\rangle = 2\tilde{C}_{VV}(\omega)\,\Delta f = 4k_BT\,R\,\Delta f, \qquad (33.115)$$
in agreement with eqn 33.28.

Fig. 33.3 Circuit for analysing Johnson noise across a resistor.
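To give a feel for the magnitudes involved, the sketch below evaluates the r.m.s. noise voltage from eqn 33.115 and the voltage spectrum obtained from eqns 33.103, 33.110 and 33.112 without taking the low-frequency limit. The resistor, capacitance, temperature and bandwidth values are hypothetical, and the function names are chosen for this example.

```python
import math

K_B = 1.380649e-23  # Boltzmann constant in J/K


def johnson_rms_voltage(resistance, temperature, bandwidth):
    """sqrt(<V^2>) from eqn 33.115, with the bandwidth Delta f in Hz."""
    return math.sqrt(4.0 * K_B * temperature * resistance * bandwidth)


def voltage_spectrum(omega, resistance, capacitance, temperature):
    """C_VV(omega) = 2 kT chi''(omega) / (omega C^2), using eqn 33.110 for chi''."""
    chi_imag = omega * resistance / (capacitance**-2 + (omega * resistance) ** 2)
    return 2.0 * K_B * temperature * chi_imag / (omega * capacitance**2)


# Hypothetical component values: 1 kOhm, 1 pF, 300 K, 10 kHz bandwidth.
r, c, t, df = 1.0e3, 1.0e-12, 300.0, 1.0e4
print(f"r.m.s. noise voltage: {johnson_rms_voltage(r, t, df):.3e} V")
print(f"C_VV at omega = 1e3 rad/s: {voltage_spectrum(1.0e3, r, c, t):.3e} V^2 s")
print(f"low-frequency limit 2 kT R: {2.0 * K_B * t * r:.3e} V^2 s")
```

For these values $\omega RC$ is tiny at audio frequencies, so the full spectrum is indistinguishable from the flat value $2k_BTR$ of eqn 33.113.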

We close this chapter by remarking that our treatment so far applies only to classical systems. The quantum mechanical version of the fluctuation–dissipation theorem can be evaluated by replacing $k_BT$, the mean thermal energy in a classical system, by
$$\hbar\omega\left(n(\omega) + \frac{1}{2}\right) \equiv \frac{\hbar\omega}{2}\coth\frac{\beta\hbar\omega}{2}, \qquad (33.116)$$
which is the mean energy in a quantum harmonic oscillator. In eqn 33.116,
$$n(\omega) = \frac{1}{e^{\beta\hbar\omega} - 1} \qquad (33.117)$$
is the Bose factor, which is the mean number of quanta in the harmonic oscillator at temperature $T$. Hence, in the quantum mechanical case, eqn 33.103 is replaced by
$$\tilde{C}_{xx}(\omega) = \hbar\,\tilde{\chi}''(\omega)\coth\frac{\beta\hbar\omega}{2}. \qquad (33.118)$$
At high temperature, $\coth(\beta\hbar\omega/2) \to 2/(\beta\hbar\omega)$ and we recover eqn 33.103. The quantum mechanical version of eqn 33.102 is
$$\langle x^2\rangle = \frac{\hbar}{2\pi}\int_{-\infty}^{\infty} d\omega'\,\tilde{\chi}''(\omega')\coth\frac{\beta\hbar\omega'}{2}. \qquad (33.119)$$
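The crossover between the classical and quantum regimes in eqn 33.116 is easy to tabulate. The sketch below computes the mean oscillator energy in both forms of eqn 33.116 and compares it with $k_BT$; the function names and the sample frequencies are assumptions of this example.

```python
import math

HBAR = 1.054571817e-34  # reduced Planck constant in J s
K_B = 1.380649e-23      # Boltzmann constant in J/K


def mean_oscillator_energy(omega, temperature):
    """(hbar omega / 2) coth(beta hbar omega / 2), right-hand side of eqn 33.116."""
    x = HBAR * omega / (2.0 * K_B * temperature)
    return 0.5 * HBAR * omega / math.tanh(x)


def bose_factor(omega, temperature):
    """Mean number of quanta, eqn 33.117 (expm1 keeps accuracy when the argument is small)."""
    return 1.0 / math.expm1(HBAR * omega / (K_B * temperature))


temperature = 300.0
for omega in (1.0e9, 1.0e12, 1.0e13, 1.0e14):
    ratio = mean_oscillator_energy(omega, temperature) / (K_B * temperature)
    print(f"omega = {omega:.0e} rad/s: mean energy / kT = {ratio:.4f}")
```

The ratio stays close to 1 while $\hbar\omega \ll k_BT$ and grows once $\hbar\omega$ becomes comparable to $k_BT$, where the zero-point term $\hbar\omega/2$ starts to dominate.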

Fig. 33.4 The voltage fluctuations $\langle V^2\rangle$ in a small frequency interval $\Delta f = \Delta\omega/(2\pi)$ centred on $\pm\omega_0$ are due to the part of the $\tilde{C}_{VV}(\omega)$ shown by the shaded boxes. One can imagine that the noise is examined through a filter which only allows these frequencies through, so that the integral in eqn 33.114 only picks up the regions shown by the shaded boxes.


Chapter summary

• The fluctuation–dissipation theorem implies that there is a direct relation between the fluctuation properties of the thermal system (e.g. the diffusion constant) and its linear response properties (e.g. the mobility). If you've characterised one, you've characterised the other.

• Fluctuations are more important for small systems than for large systems, though they are always dominant near the critical point of a phase transition, even for large systems.

• Fluctuations in a variable $x$ are given by $\langle(\Delta x)^2\rangle = k_BT_0/(\partial^2A/\partial x^2)$, where the derivative is evaluated at the minimum of the availability $A$.

• A response function is defined by
$$\langle x(t)\rangle_f = \int_{-\infty}^{\infty} \chi(t - t')\,f(t')\,dt',$$
and causality implies the Kramers–Kronig relations.

• The Fourier transform of the correlation function gives the power spectrum. This allows us to show that
$$\langle x^2\rangle = \frac{1}{2\pi}\int_{-\infty}^{\infty} \tilde{C}_{xx}(\omega)\,d\omega.$$

• The fluctuation–dissipation theorem states that
$$\tilde{C}_{xx}(\omega) = 2k_BT\,\frac{\tilde{\chi}''(\omega)}{\omega},$$
and relates the autocorrelation function to the dissipation via the imaginary part of the response function.

Further reading

A good introduction to fluctuations and response functions can be found in Chaikin and Lubensky (1995). Another useful source of information is chapter 12 of Landau and Lifshitz (1980).

Exercises

(33.1) If a system is held at fixed $T$, $N$ and $p$, show that the fluctuations in a variable $x$ are governed by the probability function
$$p(x) \propto e^{-G(x)/k_BT}, \qquad (33.120)$$
where $G(x)$ is the Gibbs function.

(33.2) A system has displacement $\tilde{x}(\omega) = \tilde{\chi}(\omega)\tilde{f}(\omega)$ in response to a force $\tilde{f}(\omega)$. Show that if the force is given by $f(t) = f_0\cos\omega t$, the average power dissipated is $\frac{1}{2}f_0^2\,\omega\,\tilde{\chi}''(\omega)$.

(33.3) Repeat the derivation that led to eqn 33.72 (one of the Kramers–Kronig relations), but this time set $y(-t) = y(t)$, so that $\tilde{y}(\omega)$ is purely real. In this case, prove the other Kramers–Kronig relation which states that
$$\tilde{\chi}''(\omega) = -\mathcal{P}\!\int_{-\infty}^{\infty} \frac{d\omega'}{\pi}\,\frac{\tilde{\chi}'(\omega')}{\omega' - \omega}. \qquad (33.121)$$

34

Non-equilibrium thermodynamics

34.1 Entropy production
34.2 The kinetic coefficients
34.3 Proof of the Onsager reciprocal relations
34.4 Thermoelectricity
34.5 Time reversal and the arrow of time
Chapter summary
Further reading
Exercises

Much of the material in this book has been concerned with the properties of systems in thermodynamic equilibrium, in which the functions of state are time-independent. However, we have also touched upon transport properties (see Chapter 9) which deal with the ﬂow of momentum, heat or particles from one place to another. Such processes are usually irreversible and result in entropy production. In this chapter, we will use the theory of ﬂuctuations developed in Chapter 33 to derive a general relation concerning diﬀerent transport processes, and apply it to thermoelectric eﬀects. We conclude the chapter by discussing the asymmetry of time.

34.1

Entropy production

Changes in the internal energy density $u$ of a system are related to entropy density $s$, number $N_j$ of particles of type $j$ and electric charge density $\rho_e$ by the combined first and second law of thermodynamics, which states that
$$du = T\,ds + \sum_j \mu_j\,dN_j + \phi\,d\rho_e, \qquad (34.1)$$
where $\mu_j$ is the chemical potential of atoms of type $j$ and $\phi$ is the electric potential. Rearranging this equation to make entropy changes the subject gives
$$ds = \frac{1}{T}\,du - \sum_j \frac{\mu_j}{T}\,dN_j - \frac{\phi}{T}\,d\rho_e, \qquad (34.2)$$
and this is of the form
$$ds = \sum_k \phi_k\,d\rho_k, \qquad (34.3)$$
where $\rho_k$ is a generalized density and $\phi_k = \partial s/\partial\rho_k$ is the corresponding generalized potential. Possible values for these variables are listed in Table 34.1. Each of the generalized densities is subject to a continuity equation of the form
$$\frac{\partial\rho_k}{\partial t} + \nabla\cdot\mathbf{J}_k = 0, \qquad (34.4)$$
where $\mathbf{J}_k$ is a generalized current density. We can associate each of these currents with a flow of entropy which is itself measured by the entropy

Table 34.1 Terms in eqn 34.3.

Quantity                          $\rho_k$      $\phi_k$          $\nabla\phi_k$
energy                            $u$           $1/T$             $\nabla(1/T)$
number of particles of type $j$   $N_j$         $-\mu_j/T$        $-\nabla(\mu_j/T)$
charge density                    $\rho_e$      $-\phi_e/T$       $-\nabla(\phi_e/T)$

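current density $\mathbf{J}_s$. This will be subject to its own continuity equation which states that the local entropy production rate $\Sigma$ is given by
$$\Sigma = \frac{\partial s}{\partial t} + \nabla\cdot\mathbf{J}_s. \qquad (34.5)$$
We can relate the entropy current density $\mathbf{J}_s$ to the other current densities via the following equation:
$$\mathbf{J}_s = \sum_k \phi_k\,\mathbf{J}_k. \qquad (34.6)$$
Inserting this into eqn 34.5 yields
$$\Sigma = \sum_k \phi_k\,\dot{\rho}_k + \nabla\cdot\left(\sum_k \phi_k\,\mathbf{J}_k\right). \qquad (34.7)$$
Now some straightforward vector calculus, and use of eqn 34.4, yields
$$\nabla\cdot\left(\sum_k \phi_k\,\mathbf{J}_k\right) = \sum_k \nabla\phi_k\cdot\mathbf{J}_k + \sum_k \phi_k\,\nabla\cdot\mathbf{J}_k = \sum_k \nabla\phi_k\cdot\mathbf{J}_k + \sum_k \phi_k(-\dot{\rho}_k), \qquad (34.8)$$
and hence
$$\Sigma = \sum_k \nabla\phi_k\cdot\mathbf{J}_k. \qquad (34.9)$$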

This equation relates the local entropy production rate $\Sigma$ to the generalized current densities $\mathbf{J}_k$ and the generalized force fields $\nabla\phi_k$.

34.2

The kinetic coeﬃcients

Very often the response of a system to a force is to produce a steady current. For example, a constant electric field applied to an electrical conductor produces an electric current; a constant temperature gradient applied to a thermal conductor produces a flow of heat. Assuming a linear response, one can write in general that the generalized current density $\mathbf{J}_i$ is related to the generalized force fields by the equation
$$\mathbf{J}_i = \sum_j L_{ij}\,\nabla\phi_j, \qquad (34.10)$$
where the coefficients $L_{ij}$ are called kinetic coefficients.

Example 34.1 Recall the equation for heat flow (eqn 9.15):
$$\mathbf{J} = -\kappa\nabla T. \qquad (34.11)$$
This can be cast into the form
$$\mathbf{J}_u = L_{uu}\,\nabla(1/T), \qquad (34.12)$$
where $L_{uu} = \kappa T^2$.

Substituting eqn 34.10 into eqn 34.9 shows that the local entropy production $\Sigma$ is given by
$$\Sigma = \sum_{ij} \nabla\phi_i\cdot L_{ij}\,\nabla\phi_j. \qquad (34.13)$$

The second law of thermodynamics can be stated in the form that entropy must increase globally. However, entropy can go down in one place if it goes up at least as much somewhere else. Equation 34.5 relates the entropy produced locally in a small region to the entropy which is transported into or out of that region (perhaps by matter, charge, heat, or some combination of these, being imported or exported). An even stronger statement can be made by insisting that not only is the global entropy change always positive but that so is the local entropy production rate: $\Sigma \geq 0$. Equation 34.13 then implies that $L_{ij}$ must be a positive-definite matrix (all its eigenvalues must be positive). A further statement about $L_{ij}$ can be made which follows from Onsager's reciprocal relations, which state that
$$L_{ij} = L_{ji}. \qquad (34.14)$$
We will prove these relations, first derived by Lars Onsager (Fig. 34.1) in 1929, in the following section.

Fig. 34.1 Lars Onsager (1903–1976).

34.3

Proof of the Onsager reciprocal relations

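Near an equilibrium state, we define the variable $\alpha_k = \rho_k - \rho_k^{\rm eqm}$, which measures the departure of the $k$th density variable from its equilibrium value. The probability of the system having density fluctuations given by $\boldsymbol{\alpha} = (\alpha_1, \alpha_2, \ldots, \alpha_m)$ can be written as
$$P(\boldsymbol{\alpha}) \propto e^{\Delta S/k_B}. \qquad (34.15)$$
We assume that the probability $P$ is suitably normalized, so that
$$\int P\,d\boldsymbol{\alpha} = 1. \qquad (34.16)$$
The entropy change for a fluctuation $\Delta S$ is a function of $\boldsymbol{\alpha}$ which we can express using a Taylor expansion in $\boldsymbol{\alpha}$. There are no linear terms since we are measuring departures from equilibrium, where $S$ is maximized, and hence we write
$$\Delta S = -\frac{1}{2}\sum_{ij} g_{ij}\,\alpha_i\alpha_j, \qquad (34.17)$$
where $g_{ij} = -(\partial^2\Delta S/\partial\alpha_i\partial\alpha_j)_{\boldsymbol{\alpha}=0}$. Thus we can write the logarithm of the probability as
$$\ln P = \frac{\Delta S}{k_B} + \text{constant}, \qquad (34.18)$$
and hence
$$\frac{\partial\ln P}{\partial\alpha_i} = \frac{1}{k_B}\frac{\partial\Delta S}{\partial\alpha_i}. \qquad (34.19)$$
The next part of the proof involves working out a couple of averages of a fluctuation of one of the density variables with some other quantity.

(1) We begin by deriving an expression for $\langle(\partial S/\partial\alpha_i)\,\alpha_j\rangle$:
$$\left\langle\frac{\partial S}{\partial\alpha_i}\alpha_j\right\rangle = k_B\left\langle\frac{\partial\ln P}{\partial\alpha_i}\alpha_j\right\rangle = k_B\int\frac{\partial\ln P}{\partial\alpha_i}\,\alpha_j\,P\,d\boldsymbol{\alpha} = k_B\int\frac{\partial P}{\partial\alpha_i}\,\alpha_j\,d\boldsymbol{\alpha} = k_B\left(\int d\boldsymbol{\alpha}'\,[P\alpha_j]_{-\infty}^{\infty} - \int\frac{\partial\alpha_j}{\partial\alpha_i}\,P\,d\boldsymbol{\alpha}\right). \qquad (34.20)$$
In this equation, $d\boldsymbol{\alpha}' = d\alpha_1\cdots d\alpha_{j-1}\,d\alpha_{j+1}\cdots d\alpha_m$, i.e. the product of all the $d\alpha_i$ except $d\alpha_j$. The term $[P\alpha_j]_{-\infty}^{\infty}$ is zero because $P \propto \exp[-\frac{1}{2k_B}\sum_{ij} g_{ij}\alpha_i\alpha_j]$ and hence goes to zero as $\alpha_j \to \pm\infty$. Using $\partial\alpha_j/\partial\alpha_i = \delta_{ij}$, we can therefore show that
$$\left\langle\frac{\partial S}{\partial\alpha_i}\alpha_j\right\rangle = -k_B\,\delta_{ij}. \qquad (34.21)$$

(2) We now derive an expression for $\langle\alpha_i\alpha_j\rangle$:
$$\frac{\partial\Delta S}{\partial\alpha_i} = -\sum_k g_{ik}\,\alpha_k, \qquad (34.22)$$
and hence
$$\sum_k g_{ik}\,\langle\alpha_k\alpha_j\rangle = -\left\langle\frac{\partial S}{\partial\alpha_i}\alpha_j\right\rangle = k_B\,\delta_{ij}. \qquad (34.23)$$
Hence
$$\langle\alpha_i\alpha_j\rangle = k_B\,(g^{-1})_{ij}. \qquad (34.24)$$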


Example 34.2 Show that $\langle\Delta S\rangle = -mk_B/2$, explain the sign of the answer, and interpret the answer in terms of the equipartition theorem. Solution:
$$\langle\Delta S\rangle = -\frac{1}{2}\left\langle\sum_{ij} g_{ij}\,\alpha_i\alpha_j\right\rangle = -\frac{1}{2}\sum_{ij} g_{ij}\,\langle\alpha_i\alpha_j\rangle = -\frac{k_B}{2}\sum_{i=1}^{m}\delta_{ii} = -\frac{mk_B}{2}. \qquad (34.25)$$
The equilibrium configuration, $\boldsymbol{\alpha} = 0$, corresponds to maximum entropy, so $\langle\Delta S\rangle$ should be negative; a fluctuation corresponds to a statistically less likely state. If the system has $m$ degrees of freedom, then its mean thermal energy is $mk_BT/2$, which is equal to $-T\langle\Delta S\rangle$.
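Equations 34.23 to 34.25 can be verified directly for a small case. The sketch below takes a hypothetical symmetric, positive-definite $2\times 2$ matrix $g_{ij}$, builds the correlation matrix $k_B(g^{-1})_{ij}$ of eqn 34.24, and then checks eqn 34.23 and the value of $\langle\Delta S\rangle$. Working in units where $k_B = 1$, and the particular matrix chosen, are assumptions of this example.

```python
K_B = 1.0  # work in units of the Boltzmann constant


def inverse_2x2(g):
    """Inverse of a 2x2 matrix given as nested lists."""
    (a, b), (c, d) = g
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]


def correlation_matrix(g):
    """<alpha_i alpha_j> = k_B (g^-1)_ij, eqn 34.24."""
    return [[K_B * entry for entry in row] for row in inverse_2x2(g)]


def mean_entropy_change(g):
    """<Delta S> = -(1/2) sum_ij g_ij <alpha_i alpha_j>, eqn 34.25."""
    corr = correlation_matrix(g)
    return -0.5 * sum(g[i][j] * corr[i][j] for i in range(2) for j in range(2))


g = [[2.0, 0.5], [0.5, 1.0]]  # hypothetical positive-definite matrix
corr = correlation_matrix(g)
# Eqn 34.23: sum_k g_ik <alpha_k alpha_j> should equal k_B delta_ij.
product = [[sum(g[i][k] * corr[k][j] for k in range(2)) for j in range(2)]
           for i in range(2)]
print("g times correlation matrix:", product)
print("mean entropy change:", mean_entropy_change(g))
```

With $m = 2$ variables the mean entropy change comes out as $-k_B$, matching $-mk_B/2$ whatever positive-definite $g$ is used.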

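We are now in a position to work out some correlation functions of the fluctuations. We now make the crucial assumption that any microscopic process and its reverse process take place on average with the same frequency. This is the principle of microscopic reversibility. This implies that
$$\langle\alpha_i(0)\alpha_j(t)\rangle = \langle\alpha_i(0)\alpha_j(-t)\rangle = \langle\alpha_i(t)\alpha_j(0)\rangle. \qquad (34.26)$$
Subtracting $\langle\alpha_i(0)\alpha_j(0)\rangle$ from both sides of eqn 34.26 yields
$$\langle\alpha_i(0)\alpha_j(t)\rangle - \langle\alpha_i(0)\alpha_j(0)\rangle = \langle\alpha_i(t)\alpha_j(0)\rangle - \langle\alpha_i(0)\alpha_j(0)\rangle. \qquad (34.27)$$
Dividing eqn 34.27 by $t$ and factoring out common factors gives
$$\left\langle\alpha_i(0)\,\frac{\alpha_j(t) - \alpha_j(0)}{t}\right\rangle = \left\langle\frac{\alpha_i(t) - \alpha_i(0)}{t}\,\alpha_j(0)\right\rangle, \qquad (34.28)$$
and in the limit as $t \to 0$, this can be written
$$\langle\alpha_i\dot{\alpha}_j\rangle = \langle\dot{\alpha}_i\alpha_j\rangle. \qquad (34.29)$$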

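Now, assuming that the decay of fluctuations is governed by the same laws as the macroscopic flows as they respond to generalized forces, so that we can use the kinetic coefficients $L_{ij}$ to describe the fluctuations, we have that
$$\dot{\alpha}_i = \sum_k L_{ik}\,\frac{\partial S}{\partial\alpha_k} \qquad (34.30)$$
and hence substituting into eqn 34.29 yields
$$\left\langle\alpha_i\left(\sum_k L_{jk}\,\frac{\partial S}{\partial\alpha_k}\right)\right\rangle = \left\langle\left(\sum_k L_{ik}\,\frac{\partial S}{\partial\alpha_k}\right)\alpha_j\right\rangle, \qquad (34.31)$$
which simplifies to
$$\sum_k L_{jk}\left\langle\frac{\partial S}{\partial\alpha_k}\alpha_i\right\rangle = \sum_k L_{ik}\left\langle\frac{\partial S}{\partial\alpha_k}\alpha_j\right\rangle. \qquad (34.32)$$
Using the relation in eqn 34.21, we have that
$$\sum_k L_{jk}\,(-k_B\,\delta_{ik}) = \sum_k L_{ik}\,(-k_B\,\delta_{jk}) \qquad (34.33)$$
and hence we have the Onsager reciprocal relations:
$$L_{ji} = L_{ij}. \qquad (34.34)$$

34.4

Thermoelectricity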

In this section we apply the Onsager reciprocal relations and the other ideas developed in this chapter to the problem of thermoelectricity which describes the relationship between flows of heat and electrical current. It is not surprising that heat current and electrical currents in metals should be related; both result from the flow of electrons and electrons carry both charge and energy. Consider two dissimilar metals A and B, with different work functions1 and chemical potentials, whose energy levels are shown schematically in Fig. 34.2(a). These two metals are connected together, as shown in Fig. 34.2(b), and both held at the same temperature $T$. Because initially $\mu_A \neq \mu_B$, some electrons will diffuse from A and diffuse into B, resulting in a small build up of positive charge on A and a small build up of negative charge on B. This will lead to a small electric field across the junction of A and B that will oppose any further electrons moving into B. Once equilibrium is established, the chemical potential in A and B must equalize and hence $\mu_A = \mu_B$ (see Section 22.2). No voltage develops between the ends of the metals if they remain at the same temperature, but if the ends of A and B are at different temperatures, there will be a voltage difference. Electrons respond both to an applied electric field $\mathbf{E}$ and a gradient in the chemical potential $\nabla\mu$, the former producing a drift current and the latter a diffusion current. Near the junction between A and B shown in Fig. 34.2(b), these two currents coexist but are equal and opposite and therefore precisely cancel in equilibrium, if A and B are held at the same temperature. Thus a voltmeter responds not to the integrated electric field given by
$$\oint\mathbf{E}\cdot d\mathbf{l} \qquad (34.35)$$
around a circuit, but rather to
$$\oint\boldsymbol{\mathcal{E}}\cdot d\mathbf{l}, \qquad (34.36)$$
where
$$\boldsymbol{\mathcal{E}} = \mathbf{E} + \frac{1}{e}\nabla\mu \qquad (34.37)$$
is the electromotive field, which combines the effects of the fields driving the drift and diffusion currents. We thus write the current densities for charge and heat, $\mathbf{J}_e$ and $\mathbf{J}_Q$, in terms of the electromotive field and temperature gradient which drive them, in the following general way:
$$\mathbf{J}_e = \mathcal{L}_{EE}\,\boldsymbol{\mathcal{E}} + \mathcal{L}_{ET}\,\nabla T, \qquad (34.38)$$
$$\mathbf{J}_Q = \mathcal{L}_{TE}\,\boldsymbol{\mathcal{E}} + \mathcal{L}_{TT}\,\nabla T. \qquad (34.39)$$

1 The work function of a metal is the minimum energy needed to remove an electron from the Fermi level to a point in the vacuum well away from the surface.

Fig. 34.2 (a) Two dissimilar metals with different work functions $w_A$ and $w_B$ and chemical potentials $\mu_A = -w_A$ and $\mu_B = -w_B$. (b) Metals A and B, held at the same temperature, are connected together.

Here the kinetic coefficients $\mathcal{L}_{EE}$, $\mathcal{L}_{ET}$, $\mathcal{L}_{TE}$ and $\mathcal{L}_{TT}$ are written using the symbol $\mathcal{L}$ rather than $L$ because we haven't yet written them in the form of eqn 34.10. To work out what these coefficients are, let us examine some special cases:

• No temperature gradient: If $\nabla T = 0$, then we expect that
$$\mathbf{J}_e = \sigma\boldsymbol{\mathcal{E}}, \qquad (34.40)$$
where $\sigma$ is the electrical conductivity, and hence we identify $\mathcal{L}_{EE} = \sigma$ from eqn 34.38. In this case, the heat current density is given by eqn 34.39 and hence
$$\mathbf{J}_Q = \mathcal{L}_{TE}\,\boldsymbol{\mathcal{E}} = \frac{\mathcal{L}_{TE}}{\mathcal{L}_{EE}}\,\mathbf{J}_e = \Pi\,\mathbf{J}_e, \qquad (34.41)$$
where $\Pi = \mathcal{L}_{TE}/\mathcal{L}_{EE}$ is known as the Peltier coefficient. (The Peltier coefficient has dimensions of energy/charge, and so is measured in volts.) Thus, an electrical current is associated with a heat current, and this is known as the Peltier effect.2

2 J. C. A. Peltier (1785–1845) first observed the effect in 1834.

Fig. 34.3 A junction between wires of different metals, A and B, carrying an electrical current has a discontinuous jump in its heat current. This is the Peltier effect.

Consider an electrical current $\mathbf{J}_e$ flowing along a wire of metal A and then via a junction to a wire of metal B, as shown in Fig. 34.3. The electrical current must be the same in both wires, so the heat current must exhibit a discontinuous jump at the junction. This jump is given by $(\Pi_A - \Pi_B)\mathbf{J}_e$, and this results in the liberation of heat
$$\Pi_{AB}\,\mathbf{J}_e \qquad (34.42)$$
at the junction, where $\Pi_{AB} = \Pi_A - \Pi_B$. If $\Pi_{AB} < 0$ this results in cooling, and this is the principle behind Peltier cooling in which heat is removed from a region by putting it in thermal contact

with a junction between dissimilar wires and passing a current along the wires (see Fig. 34.4). Of course, the heat removed is simultaneously liberated elsewhere in the circuit, as shown in Fig. 34.4. The Peltier heat flow is reversible and so if the electrical currents are reversed, then so are the heat flows.

• No electrical current: If $\mathbf{J}_e = 0$, then
$$\mathbf{J}_Q = -\kappa\nabla T, \qquad (34.43)$$
where $\kappa$ is the thermal conductivity. However, we also have an electric field $\boldsymbol{\mathcal{E}}$ given by
$$\boldsymbol{\mathcal{E}} = \epsilon\nabla T \qquad (34.44)$$
where $\epsilon$ is the Seebeck coefficient3 or the thermopower (units V K$^{-1}$). Thus a thermal gradient is associated with an electric field: this is called the Seebeck effect. Equation 34.38 and eqn 34.44 imply that
$$\epsilon = -\frac{\mathcal{L}_{ET}}{\mathcal{L}_{EE}}. \qquad (34.45)$$
A circuit which consists of a single material with a temperature gradient around it would produce a voltage given by
$$\oint\boldsymbol{\mathcal{E}}\cdot d\mathbf{l} = \oint\epsilon\nabla T\cdot d\mathbf{l} = 0. \qquad (34.46)$$
To observe the thermopower, one needs a circuit containing two different metals: this is known as a thermocouple and such a circuit is shown in Fig. 34.5. An equivalent circuit is shown in Fig. 34.6. Thus the Seebeck e.m.f. (electromotive force) $\Delta\phi_S$ measured by the voltmeter in the circuit in Fig. 34.6 is given by
$$\Delta\phi_S = -\oint\boldsymbol{\mathcal{E}}\cdot d\mathbf{l} = \int_{T_0}^{T_1}\epsilon_B\,dT + \int_{T_1}^{T_2}\epsilon_A\,dT + \int_{T_2}^{T_0}\epsilon_B\,dT = \int_{T_1}^{T_2}(\epsilon_A - \epsilon_B)\,dT, \qquad (34.47)$$
and we write
$$\epsilon_A - \epsilon_B = \frac{d\phi_S}{dT}. \qquad (34.48)$$

3 T. J. Seebeck (1770–1831) discovered this in 1821.

Fig. 34.4 In this circuit, $\Pi_{AB} < 0$, so that for the current direction shown, the junction on the right-hand side absorbs heat while the junction on the left-hand side liberates heat.

Fig. 34.5 Thermocouple circuit for measuring the differences between thermoelectric voltages.

Fig. 34.6 Equivalent thermocouple circuit.

Example 34.3 Derive an expression for $\kappa$ in terms of the kinetic coefficients. Solution: Substitution of eqn 34.45 into eqn 34.44 yields
$$\boldsymbol{\mathcal{E}} = -\frac{\mathcal{L}_{ET}}{\mathcal{L}_{EE}}\,\nabla T. \qquad (34.49)$$
Putting this into eqn 34.39 implies that
$$\mathbf{J}_Q = \frac{\mathcal{L}_{EE}\mathcal{L}_{TT} - \mathcal{L}_{TE}\mathcal{L}_{ET}}{\mathcal{L}_{EE}}\,\nabla T, \qquad (34.50)$$
and hence comparison with eqn 34.43 yields
$$\kappa = -\frac{\mathcal{L}_{EE}\mathcal{L}_{TT} - \mathcal{L}_{TE}\mathcal{L}_{ET}}{\mathcal{L}_{EE}}. \qquad (34.51)$$

Putting eqns 34.38 and 34.39 into the form of eqn 34.10, we have

$$\mathbf{J}_e = L_{EE}\nabla(-\phi/T) + L_{ET}\nabla(1/T), \tag{34.52}$$

$$\mathbf{J}_Q = L_{TE}\nabla(-\phi/T) + L_{TT}\nabla(1/T), \tag{34.53}$$

where

$$L_{EE} = T\mathcal{L}_{EE}, \qquad L_{TE} = T\mathcal{L}_{TE}, \qquad L_{ET} = -T^2\mathcal{L}_{ET}, \qquad L_{TT} = -T^2\mathcal{L}_{TT}. \tag{34.54}$$

The Onsager reciprocal relation in this case thus implies that

$$L_{TE} = L_{ET}, \tag{34.55}$$

and hence $\mathcal{L}_{TE} = -T\mathcal{L}_{ET}$, so that

$$\Pi = T\epsilon. \tag{34.56}$$

This yields

$$\Pi_{AB} = T(\epsilon_A - \epsilon_B), \tag{34.57}$$

which is known as Thomson's second relation.⁴ It is a very good example of the power of Onsager's approach: we have been able to relate the Peltier and Seebeck coefficients on the basis of Onsager's general theorem concerning the symmetry of general kinetic coefficients.

There is one other thermoelectric effect which we wish to consider. If the thermopower is temperature dependent, then there will even be heat liberated by an electric current which flows in a single metal. This heat is known as Thomson heat.⁵ An electrical current $\mathbf{J}_e$ corresponds to a heat current $\mathbf{J}_Q = \Pi\mathbf{J}_e$ (by eqn 34.41). The heat liberated at a particular point per second is therefore given by the divergence of $\mathbf{J}_Q$ and hence

$$\nabla\cdot\mathbf{J}_Q = \nabla\cdot(T\epsilon\,\mathbf{J}_e), \tag{34.58}$$

using eqn 34.56. If no charges build up, then $\mathbf{J}_e$ is divergenceless and hence

$$\nabla\cdot\mathbf{J}_Q = \mathbf{J}_e\cdot\nabla(T\epsilon) = \epsilon\,\mathbf{J}_e\cdot\nabla T + T\,\mathbf{J}_e\cdot\nabla\epsilon. \tag{34.59}$$

4 William Thomson, also known as Lord Kelvin (1824–1907). Thomson's proof was, of course, not based on the Onsager reciprocal relations and is somewhat suspect.

5 Lord Kelvin again!

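Writing $\nabla\epsilon = (\mathrm{d}\epsilon/\mathrm{d}T)\nabla T$ and using eqn 34.44, we have finally that

$$\nabla\cdot\mathbf{J}_Q = \mathbf{J}_e\cdot\mathbf{E} + \tau\,\mathbf{J}_e\cdot\nabla T, \tag{34.60}$$

which is the sum of a resistive heating term ($\mathbf{J}_e\cdot\mathbf{E}$) and a thermal gradient term ($\tau\,\mathbf{J}_e\cdot\nabla T$). In this equation, the Thomson coefficient $\tau$ is given by

$$\tau = T\frac{\mathrm{d}\epsilon}{\mathrm{d}T}. \tag{34.61}$$

The Thomson coefficient is the heat generated per second per unit current per unit temperature gradient. Equation 34.57 implies that

$$T\frac{\mathrm{d}}{\mathrm{d}T}\left(\frac{\Pi_{AB}}{T}\right) = \tau_A - \tau_B, \tag{34.62}$$

and, expanding the derivative and using eqn 34.57 once more, this implies that

$$\frac{\mathrm{d}\Pi_{AB}}{\mathrm{d}T} - (\epsilon_A - \epsilon_B) = \tau_A - \tau_B, \tag{34.63}$$

which is known as Thomson's first relation.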
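The two Thomson relations can be checked numerically. The following is a minimal sketch, assuming two purely hypothetical metals whose thermopowers vary linearly with temperature (the functions `seebeck_a` and `seebeck_b` and their coefficients are illustrative and are not data for any real material). It builds the Peltier coefficient from Thomson's second relation, $\Pi_{AB} = T(\epsilon_A - \epsilon_B)$, and the Thomson coefficients from $\tau = T\,\mathrm{d}\epsilon/\mathrm{d}T$, and then compares $\mathrm{d}\Pi_{AB}/\mathrm{d}T - (\epsilon_A - \epsilon_B)$ with $\tau_A - \tau_B$.

```python
def seebeck_a(t):
    """Hypothetical thermopower of metal A in V/K (illustrative numbers only)."""
    return 2.0e-6 + 1.0e-8 * t


def seebeck_b(t):
    """Hypothetical thermopower of metal B in V/K (illustrative numbers only)."""
    return -1.0e-6 + 4.0e-9 * t


def derivative(f, t, h=1e-3):
    """Central finite-difference estimate of df/dT."""
    return (f(t + h) - f(t - h)) / (2.0 * h)


def thomson(seebeck, t):
    """Thomson coefficient tau = T d(epsilon)/dT."""
    return t * derivative(seebeck, t)


def peltier_ab(t):
    """Peltier coefficient of the junction, Pi_AB = T (eps_A - eps_B)."""
    return t * (seebeck_a(t) - seebeck_b(t))


temperature = 300.0  # K
# Left-hand side: dPi_AB/dT - (eps_A - eps_B)
lhs = derivative(peltier_ab, temperature) - (
    seebeck_a(temperature) - seebeck_b(temperature)
)
# Right-hand side: tau_A - tau_B
rhs = thomson(seebeck_a, temperature) - thomson(seebeck_b, temperature)

print(f"dPi_AB/dT - (eps_A - eps_B) = {lhs:.3e} V/K")
print(f"tau_A - tau_B               = {rhs:.3e} V/K")
```

For these assumed thermopowers both sides come out at the same value, which is simply a consistency check of the algebra rather than a statement about real thermocouples.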

34.5 Time reversal and the arrow of time

The proof of the Onsager reciprocal relations rested upon the hypothesis of microscopic reversibility. This makes some degree of sense since molecular collisions and processes are based upon laws of motion which are themselves symmetric under time reversal. The heat produced in the Peltier effect considered in the previous section is reversible (one simply has to reverse the current) and this adds to our feeling that we are dealing with underlying reversible processes. But of course, that is not the whole story. The second law of thermodynamics insists that entropy never decreases and in fact increases in irreversible processes. This presents us with a dilemma since to explain this we have to understand why microscopic time-symmetric laws generate a Universe in which time is definitely asymmetric: eggs break but do not unbreak; heat flows from hot to cold and never the reverse; we remember the past but not the future. In our Universe, +t is manifestly different from −t.

This problem afflicted Boltzmann when he tried to prove the second law on the basis of classical mechanics and derived his famous H-theorem, which showed how the Maxwell–Boltzmann distribution of velocities in a gas would emerge as a function of time on the basis of molecular collisions. One hypothesis which had gone into his proof was the innocent-looking principle of molecular chaos (the stoßzahlansatz), which states that the velocities of molecules undergoing a collision are statistically indistinguishable from those of any pair of molecules in the gas selected at random. However, this cannot be right; Boltzmann's approach showed how molecules retain correlations in their motion following a collision and this 'memory' of the collision is progressively redistributed among the molecular velocities until they adopt the most likely Maxwell–Boltzmann distribution. However, because the underlying dynamics are time symmetric, before a collision molecules must possess pre-collisional correlations, which are "harbingers of collisions to come".⁶ This makes a nonsense of the stoßzahlansatz.

It seems more likely that the source of the time asymmetry is not in the dynamics but in the boundary conditions. If we watch an irreversible process, we are watching how a system prepared in a low-entropy state evolves to a state of higher entropy. For example, in a Joule expansion, it is the experimenter who prepares the two chambers appropriately in a low-entropy state (by producing entropy elsewhere in the Universe, pumping gas out from the second chamber). There is a boundary condition at the start, putting all the gas in one chamber in a non-equilibrium state, but not at the end. This lopsided nature of the boundary conditions results in asymmetric time evolution. Thus the operation of the second law of thermodynamics in our Universe may come about because the Universe was prepared in a low-entropy state; in this view, the boundary condition of the Universe is therefore the driving force for the asymmetry in time.

Or is it something to do with the operation of the microscopic laws which leads to the asymmetry in the flow of time? Not perhaps as Boltzmann attempted, using classical mechanics, but at the quantum mechanical (or possibly at the quantum-gravity⁷) level? These questions are far from being resolved. We are so familiar with the arrow of time that we are perhaps not struck more often by how odd it is and how much it is out of alignment with our present understanding of the reversible, microscopic laws of physics.

6 This wonderful phrase is used by Lockwood (2005).

7 Wild speculations about quantum gravity are possible since we have, at present, no adequate theory of quantum gravity.

Chapter summary

• The local entropy production $\Sigma$ is given by
$$\Sigma = \sum_k \nabla\phi_k\cdot\mathbf{J}_k = \sum_{ij}\nabla\phi_i\,L_{ij}\,\nabla\phi_j \ge 0.$$
• The Onsager reciprocal relations state that $L_{ij} = L_{ji}$.
• The Peltier effect is the liberation of heat at a junction owing to the flow of electrical current.
• The Seebeck effect is the development of a voltage in a circuit containing a junction between two metals owing to the presence of a temperature gradient between them.
• Onsager's reciprocal relations rest on the principle of microscopic reversibility. The arrow of time, which shows the direction in which irreversible processes occur, may result from the asymmetry in the boundary conditions applied to a system.

Further reading

• A good introduction to non-equilibrium thermodynamics may be found in Kondepudi and Prigogine (1998), Plischke and Bergersen (1989) and chapter 12 of Landau and Lifshitz (1980).
• The problem of the arrow of time is discussed in a highly readable and thought-provoking fashion in Lockwood (2005).

Exercises

(34.1) If a system is held at fixed $T$, $N$ and $p$, show that the fluctuations in a variable $x$ are governed by the probability function
$$p(x) \propto \mathrm{e}^{-G(x)/k_\mathrm{B}T}, \tag{34.64}$$
where $G(x)$ is the Gibbs function.

(34.2) For the thermoelectric problem considered in Section 34.4, show that
$$L_{EE} = T\sigma, \tag{34.65}$$
$$L_{ET} = T^2\sigma\epsilon, \tag{34.66}$$
$$L_{TE} = T^2\sigma\epsilon, \tag{34.67}$$
$$L_{TT} = \kappa T^2 + \epsilon^2 T^3\sigma. \tag{34.68}$$

(34.3) At 0 °C the measured Peltier coefficient for a Cu–Ni thermocouple is 5.08 mV. Hence estimate the Seebeck coefficient at this temperature and compare your answer with the measured value of 20.0 µV K$^{-1}$.

(34.4) (a) Explain why the thermopower is a measure of the entropy per carrier.
(b) Consider a classical gas of charged particles and explain why the thermopower should be of the order of $k_\mathrm{B}/e = 87$ µV K$^{-1}$ and be independent of temperature $T$.
(c) In a metal, the measured thermopower is much less than 87 µV K$^{-1}$ and decreases as the metal is cooled. Give an argument for why one might expect the thermopower to behave as
$$\epsilon \approx \frac{k_\mathrm{B}}{e}\,\frac{T}{T_\mathrm{F}}, \tag{34.69}$$
where $T_\mathrm{F}$ is the Fermi temperature.
(d) In a semiconductor, the measured thermopower is much larger than 87 µV K$^{-1}$ and increases as the semiconductor is cooled. Give an argument for why one might expect the thermopower to behave as
$$\epsilon \approx \frac{k_\mathrm{B}}{e}\,\frac{E_\mathrm{g}}{2k_\mathrm{B}T}, \tag{34.70}$$
where $E_\mathrm{g}$ is the energy gap of the semiconductor.
(e) Since thermopower is a function of the entropy of the carriers, the third law of thermodynamics leads one to expect that it should go to zero as $T \to 0$. Is this a problem for the semiconductor considered in (d)?

35 Stars

35.1 Gravitational interaction
35.2 Nuclear reactions
35.3 Heat transfer
Chapter summary
Further reading
Exercises

1 There are thought to be at least $10^{18}$ galaxies in the observable Universe. On average, a galaxy might contain $10^{11}$ stars.

2 The interstellar medium is the dilute gas, dust and plasma which exists between the stars within a galaxy.

3 Luminosity is a term used to mean energy radiated per unit time, i.e. power, and has the units of watts. In astrophysics, one often uses spectral luminosity (which is often what astrophysicists mean when they say luminosity), which is the power radiated per unit energy band or per wavelength interval or per frequency interval, and so in the latter case has units W Hz$^{-1}$.

4 The adjective Galactic pertains to our own Galaxy, the Milky Way, while the adjective galactic pertains to galaxies in general.

In this chapter we apply some of the concepts of thermal physics developed earlier in this book to stellar astrophysics. Astrophysics is the study of the physical properties of the Universe and the objects therein. In this field, we make the fundamental assumption that the laws of physics, including those governing the properties of atoms and gravitational and electromagnetic fields, which are all largely obtained from experiment on Earth, are valid throughout the entire Universe, way beyond the confines of the Solar System where they have been well tested. It is further assumed that the fundamental constants do not vary in time and space.

The Universe contains a great many galaxies.¹ Each of these galaxies contains a great many stars, which are born out of the condensation of the denser gas in the interstellar medium.² Gravitational collapse produces extremely high temperatures, permitting fusion to take place and hence the radiation of energy. Stars live and evolve, seeming to follow the laws of physics with impressive obedience, changing size, temperature and luminosity.³ Ultimately stars die, some exploding as supernovae and returning their mass (at least partially) to the Galactic⁴ interstellar medium.

The star about which we know the most is the Sun. It seems to be a rather average star in our Galaxy, and some of its properties are summarized in the following box. The first three properties are measured while the remaining ones are model-dependent.

Solar quantities:

  mass                    $M_\odot$          $1.99 \times 10^{30}$ kg
  radius                  $R_\odot$          $6.96 \times 10^{8}$ m
  luminosity              $L_\odot$          $3.83 \times 10^{26}$ W
  effective temperature   $T_\mathrm{eff}$   5780 K
  age                     $t_\odot$          $4.55 \times 10^{9}$ yr
  central density         $\rho_\mathrm{c}$  $1.45 \times 10^{5}$ kg m$^{-3}$
  central temperature     $T_\mathrm{c}$     $15.6 \times 10^{6}$ K
  central pressure        $p_\mathrm{c}$     $2.29 \times 10^{16}$ Pa

Stellar astrophysics, the subject of this chapter, is a very interesting field because, using fairly simple physics, we can make predictions which can be tested observationally. We will consider the main processes which determine the properties of stars (Section 35.1 gravity, Section 35.2 nuclear reactions and Section 35.3 heat transfer) and, importantly, derive the main equations of stellar structure used to model stars. We will not, however, address more complicated issues such as magnetic fields in stars or detailed particle physics. In the following chapter, we will consider what happens to stars at the ends of their lives.

35.1 Gravitational interaction

The fundamental force which causes stars to form and which produces huge pressures and temperatures in the centre of stars is gravity. In this section, we explore how the eﬀect of gravity governs the behaviour of stars.

35.1.1 Gravitational collapse and the Jeans criterion

How do stars form in the ﬁrst place? In order for a gas cloud to condense into stars, the cloud must be suﬃciently dense that attractive gravitational forces predominate over the pressure (which is proportional to the internal energy) otherwise the cloud would expand and disperse. The critical condition for condensation, i.e. for a gas cloud to be gravitationally bound, is that the total energy E must be less than zero. Now E = U + Ω, where U is the kinetic energy and Ω is the gravitational potential energy. To be gravitationally bound requires E < 0 and hence −Ω > U . The gravitational potential energy is negative, and hence the condition for condensation is |Ω| > U.

(35.1)

Now consider a spherical gas cloud of radius $R$ and mass $M$ which is in thermal equilibrium at temperature $T$. The cloud consists of $N$ particles, each assumed to be of the same type and of mass $m = M/N$. The gravitational potential energy of this cloud is given by

$$\Omega = -f\frac{GM^2}{R}, \tag{35.2}$$

where $G$ is the gravitational constant and $f$ is a factor of order unity which reflects the density profile within the cloud.⁵ For simplicity, we will set $f = 1$ in what follows. The thermal kinetic energy $U$ of the cloud is found by assuming that each particle contributes $\frac{3}{2}k_\mathrm{B}T$, so that

$$U = \frac{3}{2}Nk_\mathrm{B}T. \tag{35.3}$$

Thus making use of eqn 35.1, a gas cloud will collapse if its mass $M$ exceeds the Jeans mass⁶ $M_\mathrm{J}$ given by

$$M_\mathrm{J} = \frac{3k_\mathrm{B}T}{2Gm}R. \tag{35.4}$$

Thus the Jeans mass is the minimum mass of a gas cloud that will collapse under gravity. Increasing the temperature $T$ causes the particles to move faster and thus makes it harder for the cloud to collapse; increasing the mass $m$ of each particle favours gravitational collapse. The condition

$$M > M_\mathrm{J} \tag{35.5}$$

is known as the Jeans criterion. It is often helpful to write the Jeans mass in terms of the density $\rho$ of the cloud given by

$$\rho = \frac{M}{\frac{4}{3}\pi R^3}, \tag{35.6}$$

assuming spherical symmetry. This can be rearranged to give

$$R = \left(\frac{3M}{4\pi\rho}\right)^{1/3}, \tag{35.7}$$

and hence the Jeans criterion can be written as

$$R > R_\mathrm{J} = \left(\frac{9k_\mathrm{B}T}{8\pi Gm\rho}\right)^{1/2}, \tag{35.8}$$

where $R_\mathrm{J}$ is the Jeans length. Substitution of eqn 35.8 into eqn 35.4 yields another expression for the Jeans mass:

$$M_\mathrm{J} = \left(\frac{3k_\mathrm{B}T}{2Gm}\right)^{3/2}\left(\frac{3}{4\pi\rho}\right)^{1/2}. \tag{35.9}$$

Equivalently, one may also write that a cloud of mass $M$ will condense if its average density exceeds

$$\rho_\mathrm{J} = \frac{3}{4\pi M^2}\left(\frac{3k_\mathrm{B}T}{2Gm}\right)^3, \tag{35.10}$$

where $\rho_\mathrm{J}$ is known as the Jeans density.

5 For a spherical cloud of uniform density, $f = \frac{3}{5}$. For a spherical shell, $f = 1$.

6 Sir James Jeans (1877–1946).

Example 35.1 What is the Jeans density of a cloud composed of hydrogen atoms and with total mass $M_\odot$ at 10 K?
Solution: Using eqn 35.10,

$$\rho_\mathrm{J} = \frac{3}{4\pi M_\odot^2}\left(\frac{3k_\mathrm{B}\times 10}{2Gm_\mathrm{H}}\right)^3 \approx 5\times 10^{-17}\ \mathrm{kg\,m^{-3}}, \tag{35.11}$$

which corresponds to about $3\times 10^{10}$ particles per cubic metre.
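The Jeans formulae are easy to evaluate numerically. The following is a minimal sketch, assuming rounded values of the physical constants and $f = 1$; the function names are ours. With these inputs, taking $m$ to be the mass of a single hydrogen atom gives roughly $4\times10^{-16}$ kg m$^{-3}$ from eqn 35.10, while taking $m = 2m_\mathrm{H}$ (molecular hydrogen, which is an assumption on our part and is not stated in the example) reproduces the quoted value of about $5\times10^{-17}$ kg m$^{-3}$.

```python
import math

# Rounded physical constants (SI units)
G = 6.674e-11      # gravitational constant, m^3 kg^-1 s^-2
K_B = 1.381e-23    # Boltzmann constant, J K^-1
M_SUN = 1.99e30    # solar mass, kg
M_H = 1.67e-27     # mass of a hydrogen atom, kg


def jeans_density(cloud_mass, temperature, particle_mass):
    """Jeans density from eqn 35.10 (with f = 1)."""
    thermal = 3.0 * K_B * temperature / (2.0 * G * particle_mass)
    return 3.0 / (4.0 * math.pi * cloud_mass**2) * thermal**3


def jeans_mass(density, temperature, particle_mass):
    """Jeans mass from eqn 35.9 (with f = 1)."""
    thermal = 3.0 * K_B * temperature / (2.0 * G * particle_mass)
    return thermal**1.5 * math.sqrt(3.0 / (4.0 * math.pi * density))


rho_atomic = jeans_density(M_SUN, 10.0, M_H)           # m = m_H
rho_molecular = jeans_density(M_SUN, 10.0, 2.0 * M_H)  # m = 2 m_H (our assumption)

print(f"rho_J (m = m_H)   = {rho_atomic:.2e} kg m^-3")
print(f"rho_J (m = 2 m_H) = {rho_molecular:.2e} kg m^-3")
# Consistency check: a cloud at its Jeans density has a Jeans mass equal to its own mass.
print(f"M_J at rho_J      = {jeans_mass(rho_atomic, 10.0, M_H):.3e} kg")
```

Because $\rho_\mathrm{J} \propto m^{-3}$, the result is sensitive to the assumed particle mass: doubling $m$ lowers the Jeans density by a factor of eight.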

35.1.2 Hydrostatic equilibrium

As we have seen, gravity is responsible for gas clouds condensing into stars. It also contributes to the pressure inside a star. Consider a spherical body of gas of mass $M$ and radius $R$, in which the only forces acting are due to gravity and internal pressure. The mass enclosed by a spherical shell of radius $r$ is

$$m(r) = \int_0^r \rho(r')\,4\pi r'^2\,\mathrm{d}r', \tag{35.12}$$

where $\rho(r)$ is the density of the star at radius $r$, and so is responsible for a gravitational acceleration given by

$$g(r) = \frac{Gm(r)}{r^2}. \tag{35.13}$$

Fig. 35.1 Schematic illustration of a star of mass $M$ and total radius $R$. Consider a small element at radius $r$ from the centre having area $\Delta A$ perpendicular to the radius. We denote the pressure on the inner surface of the element at radius $r$ as $p$ and that at radius $r + \Delta r$ as $p + (\mathrm{d}p/\mathrm{d}r)\Delta r$.

In equilibrium, this is balanced by the internal pressure $p$ of the star. Consider a small volume element at radius $r$ extending to $r + \Delta r$ and having cross-sectional area $\Delta A$. The force on the element due to the pressure either side is given by

$$\left[p(r) + \frac{\mathrm{d}p}{\mathrm{d}r}\Delta r\right]\Delta A - p(r)\Delta A = \frac{\mathrm{d}p}{\mathrm{d}r}\Delta r\,\Delta A. \tag{35.14}$$

The gravitational attraction of the mass $m(r)$ within radius $r$ is equal to $g(r)\rho(r)\Delta r\,\Delta A = g(r)\Delta M$. Since the mass of the element $\Delta M$ is given by $\rho(r)\Delta r\,\Delta A$, the inward acceleration of any element of mass at distance $r$ from the centre due to gravity and pressure is

$$-\frac{\mathrm{d}^2r}{\mathrm{d}t^2} = g(r) + \frac{1}{\rho(r)}\frac{\mathrm{d}p}{\mathrm{d}r}. \tag{35.15}$$

If the star is gravitationally stable it is said to be in hydrostatic equilibrium and an elemental volume will undergo no acceleration towards the centre of the star, since the gravitational acceleration $g(r) = Gm(r)/r^2$ will be balanced by the increased pressure on the inner surface compared with that on the outer surface. If this is true for all $r$ then the left-hand side of eqn 35.15 will be zero, enabling us to rewrite this in a form known as the equation of hydrostatic equilibrium:

$$\frac{\mathrm{d}p}{\mathrm{d}r} = -\frac{Gm(r)\rho(r)}{r^2}. \tag{35.16}$$

The equation of hydrostatic equilibrium is one of the fundamental equations satisfied by static stellar structures.
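To get a feel for eqn 35.16, it can be integrated numerically for the simplest possible model. The following is a rough sketch, assuming a star of uniform density (an idealization chosen only because it has a closed-form answer, $p_\mathrm{c} = 3GM^2/8\pi R^4$); the function name and step count are ours. The pressure is set to zero at the surface and built up by stepping inwards.

```python
import math

G = 6.674e-11     # gravitational constant, m^3 kg^-1 s^-2
M_SUN = 1.99e30   # solar mass, kg
R_SUN = 6.96e8    # solar radius, m


def central_pressure_uniform(mass, radius, steps=10_000):
    """Integrate dp/dr = -G m(r) rho / r^2 inwards from p(R) = 0,
    assuming a constant density rho (uniform-sphere idealization)."""
    rho = mass / (4.0 / 3.0 * math.pi * radius**3)
    dr = radius / steps
    pressure = 0.0
    for i in range(steps):
        r = radius - (i + 0.5) * dr                 # midpoint of the current shell
        m_r = 4.0 / 3.0 * math.pi * r**3 * rho      # mass enclosed within r
        dp_dr = -G * m_r * rho / r**2               # eqn 35.16
        pressure -= dp_dr * dr                      # stepping inwards raises p
    return pressure


p_numeric = central_pressure_uniform(M_SUN, R_SUN)
p_analytic = 3.0 * G * M_SUN**2 / (8.0 * math.pi * R_SUN**4)

print(f"numerical central pressure: {p_numeric:.3e} Pa")
print(f"analytic  central pressure: {p_analytic:.3e} Pa")
```

For solar values this gives a central pressure of order $10^{14}$ Pa, well below the model-dependent value of $2.29\times10^{16}$ Pa quoted earlier, which reflects the fact that a real star is strongly centrally condensed rather than uniform.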

35.1.3 The virial theorem

The virial theorem relates the average pressure needed to support a self-gravitating system (which is related to the internal kinetic energy) to the gravitational potential energy of the system, thus balancing the gravitational potential energy against the kinetic energy. To derive this we first need to relate pressure to internal kinetic energy. Recall from Section 11.3 that the adiabatic index $\gamma$ is used to describe the relation between the pressure and the volume of a gas during adiabatic compression or expansion, i.e. when the internal energy changes solely because of the work done on it. For such a process, $pV^\gamma$ is a constant, and so we may write

$$0 = \gamma\frac{\mathrm{d}V}{V} + \frac{\mathrm{d}p}{p}. \tag{35.17}$$

Hence we can write

$$\mathrm{d}(pV) = p\,\mathrm{d}V + V\,\mathrm{d}p = -(\gamma - 1)\,p\,\mathrm{d}V. \tag{35.18}$$

If we denote the change in internal energy due to translational kinetic energy by $\mathrm{d}U$ then

$$\mathrm{d}U = -p\,\mathrm{d}V, \tag{35.19}$$

and hence

$$\mathrm{d}U = \frac{1}{\gamma - 1}\,\mathrm{d}(pV). \tag{35.20}$$

If the adiabatic index is a constant (which is not the case if different energy levels, for example rotational and vibrational, become excited) then this equation simply integrates to

$$U = \frac{pV}{\gamma - 1}. \tag{35.21}$$

Hence the internal energy density $u = U/V$ is given by

$$u = \frac{p}{\gamma - 1}. \tag{35.22}$$

Example 35.2 Use eqn 35.22 to derive the energy density of a gas of (i) non-relativistic particles ($\gamma = \frac{5}{3}$) and (ii) relativistic particles ($\gamma = \frac{4}{3}$).
Solution: Straightforward substitution into eqn 35.22 yields

$$u = \tfrac{3}{2}p \quad \text{for } \gamma = \tfrac{5}{3}, \tag{35.23}$$

$$u = 3p \quad \text{for } \gamma = \tfrac{4}{3}, \tag{35.24}$$

in agreement with eqns 6.25 and 25.21.

The next part of the derivation of the virial theorem proceeds by multiplying both sides of the hydrostatic equation (eqn 35.16) by $4\pi r^3$ and integrating with respect to $r$ from $r = 0$ to $r = R$. This leads to

$$\int_0^R 4\pi r^3\frac{\mathrm{d}p}{\mathrm{d}r}\,\mathrm{d}r = -\int_0^R \frac{Gm(r)\rho(r)}{r}\,4\pi r^2\,\mathrm{d}r, \tag{35.25}$$

which becomes

$$\left[p(r)\,4\pi r^3\right]_0^R - 3\int_0^R p(r)\,4\pi r^2\,\mathrm{d}r = -\int_{m=0}^{m=M}\frac{Gm(r)}{r}\,\mathrm{d}m. \tag{35.26}$$

The first term on the left-hand side is zero, because the surface of a star is defined to be at the radius where the pressure has fallen to zero. The second term on the left-hand side is equal to $-3\langle p\rangle V$, where $V$ is the star's entire volume and $\langle p\rangle$ is the average pressure. The right-hand side is the gravitational potential energy of the star, $\Omega$, so eqn 35.26, which came from the equation of hydrostatic equilibrium, leads us to

$$\langle p\rangle V = -\frac{\Omega}{3}, \tag{35.27}$$

which is a statement of the virial theorem. Equation 35.27 substituted into eqn 35.21 [which implies that $\langle p\rangle V = (\gamma - 1)U$] yields

$$3(\gamma - 1)U + \Omega = 0, \tag{35.28}$$

which is another statement of the virial theorem. The total energy $E$ is the sum of the potential energy $\Omega$ and the kinetic energy $U$, i.e.

$$E = U + \Omega. \tag{35.29}$$

Putting together eqns 35.28 and 35.29 gives

$$E = (4 - 3\gamma)U = \frac{3\gamma - 4}{3(\gamma - 1)}\Omega. \tag{35.30}$$
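The virial theorem in the form of eqn 35.27 can be verified numerically for a simple model. The sketch below again assumes a uniform-density star (our choice of idealization), for which hydrostatic equilibrium gives $p(r) = p_\mathrm{c}(1 - r^2/R^2)$ with $p_\mathrm{c} = 3GM^2/8\pi R^4$, and for which $\Omega = -\frac{3}{5}GM^2/R$. It integrates the pressure over the volume and compares the result with $-\Omega/3$, and then evaluates eqn 35.30 for the two values of $\gamma$ used in this section.

```python
import math

G = 6.674e-11     # gravitational constant, m^3 kg^-1 s^-2
M_SUN = 1.99e30   # solar mass, kg
R_SUN = 6.96e8    # solar radius, m


def pressure_volume_integral(mass, radius, steps=10_000):
    """Integral of p(r) 4 pi r^2 dr (= <p> V) for a uniform-density star."""
    p_c = 3.0 * G * mass**2 / (8.0 * math.pi * radius**4)
    dr = radius / steps
    total = 0.0
    for i in range(steps):
        r = (i + 0.5) * dr                       # midpoint rule
        p = p_c * (1.0 - (r / radius) ** 2)      # hydrostatic profile, uniform density
        total += p * 4.0 * math.pi * r**2 * dr
    return total


def total_energy_over_u(gamma):
    """E / U from eqn 35.30."""
    return 4.0 - 3.0 * gamma


omega = -0.6 * G * M_SUN**2 / R_SUN              # f = 3/5 for a uniform sphere
p_times_v = pressure_volume_integral(M_SUN, R_SUN)

print(f"<p> V    = {p_times_v:.4e} J")
print(f"-Omega/3 = {-omega / 3.0:.4e} J")
print(f"E/U for gamma = 5/3: {total_energy_over_u(5.0 / 3.0):+.2f}")
print(f"E/U for gamma = 4/3: {total_energy_over_u(4.0 / 3.0):+.2f}")
```

The two energies agree to within the accuracy of the numerical integration, and the sign of $E/U$ shows directly that the $\gamma = \frac{5}{3}$ gas is bound while the $\gamma = \frac{4}{3}$ gas sits at $E = 0$.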

Example 35.3 Use eqns 35.28 and 35.30 to relate $U$, $\Omega$ and $E$ for a gas of (i) non-relativistic particles ($\gamma = \frac{5}{3}$) and (ii) relativistic particles ($\gamma = \frac{4}{3}$).
Solution: (i) For a gas of non-relativistic particles, we have (using $\gamma = \frac{5}{3}$ in eqns 35.28 and 35.30) that

$$2U + \Omega = 0, \tag{35.31}$$

and hence

$$E = -U = \frac{\Omega}{2}. \tag{35.32}$$

Since the kinetic energy $U$ is positive, the total energy $E$ is negative, and thus the system is bound. Moreover, this shows that if the total energy $E$ of a star decreases, this corresponds to a decrease in the gravitational potential energy $\Omega$, but an increase in the kinetic energy $U$. Since $U$ is directly related to the temperature $T$, we conclude that a star has a 'negative heat capacity': as a star radiates energy ($E$ decreases), it contracts and heats up! This allows the nuclear heating process to be, to some extent, self-regulating. If a star loses energy from its surface, it contracts and heats up; therefore nuclear burning can increase, leading to an expansion, which cools the stellar core.

(ii) For a gas of relativistic particles, we have (using $\gamma = \frac{4}{3}$ in eqns 35.28 and 35.30) that

$$U + \Omega = 0, \tag{35.33}$$

and hence

$$E = 0. \tag{35.34}$$

Because the total energy is zero, a gravitationally bound state is not stable.

35.2 Nuclear reactions

The energy production in a star is dominated by nuclear reactions. These reactions are fusion processes, in which two or more nuclei combine and release energy. This is often called nuclear burning, though note that it is not burning in the conventional sense (we normally use the term burning to denote a chemical reaction with atmospheric oxygen; here we are talking about nuclear fusion reactions). Young stars are composed mainly of hydrogen, and the most important fusion reaction is hydrogen burning via the so-called PP chain, the first part of which is known as PP1 and is described in eqns 35.35. In these equations, $^1\mathrm{H}$ is hydrogen, $^2\mathrm{D}$ is deuterium, $\gamma$ is a photon and $^3\mathrm{He}$ and $^4\mathrm{He}$ are isotopes of helium.

$$\begin{aligned}
{}^1\mathrm{H} + {}^1\mathrm{H} &\to {}^2\mathrm{D} + \mathrm{e}^+ + \nu; \\
{}^2\mathrm{D} + {}^1\mathrm{H} &\to {}^3\mathrm{He} + \gamma; \\
{}^3\mathrm{He} + {}^3\mathrm{He} &\to {}^4\mathrm{He} + {}^1\mathrm{H} + {}^1\mathrm{H}.
\end{aligned} \tag{35.35}$$

This process releases 26.5 MeV of energy (of which about 0.3 MeV is carried away by the neutrino $\nu$) by converting four $^1\mathrm{H}$ to one $^4\mathrm{He}$. When helium becomes sufficiently abundant in the star, it too can burn in further cycles. Additional reactions can occur involving carbon and nitrogen to produce oxygen, which catalyse a further helium-burning reaction; this is called the CNO cycle. This complex series of reactions, and other such cycles which can produce elements as heavy as Fe, are now quite well understood, and can be used to understand the observed abundance of various chemical elements in the Universe, and then to infer primordial abundances.

We will not examine the details of these reactions here, but suffice it to say that when hydrogen is transmuted into iron via various complicated reaction pathways, the maximum possible energy release is equivalent to about 0.8% of the converted mass. In other words, the mass defect is 0.008. Hence the total energy available to the Sun can be estimated as $0.008\,M_\odot c^2$, leading to an estimated solar lifetime $t_\mathrm{lifetime}$ of

$$t_\mathrm{lifetime} \sim \frac{0.008\,M_\odot c^2}{L_\odot} \sim 10^{11}\ \text{years}. \tag{35.36}$$

The current age of the Sun is estimated to be $4.55\times10^9$ years, and hence the very rough estimate of the total solar lifetime is not obviously unrealistic. In fact, the long lifetimes of stars can only be explained by nuclear reactions.
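The arithmetic behind eqn 35.36 can be spelled out in a few lines. This is a minimal sketch using the solar mass and luminosity quoted earlier in the chapter and rounded values for the speed of light and the length of a year; the function name is ours.

```python
C = 2.998e8           # speed of light, m s^-1
M_SUN = 1.99e30       # solar mass, kg
L_SUN = 3.83e26       # solar luminosity, W
YEAR = 3.156e7        # seconds in a year (rounded)


def nuclear_lifetime_years(mass, luminosity, mass_defect=0.008):
    """Upper estimate of a stellar lifetime: available nuclear energy
    (mass_defect * M * c^2) divided by a constant luminosity."""
    energy = mass_defect * mass * C**2      # joules
    return energy / luminosity / YEAR       # years


t_sun = nuclear_lifetime_years(M_SUN, L_SUN)
print(f"estimated solar lifetime ~ {t_sun:.1e} years")
```

The estimate assumes that all of the Sun's mass is eventually processed and that the luminosity stays constant, so it should be read as an order-of-magnitude upper bound.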

35.3 Heat transfer

We have just seen that the release of nuclear energy is responsible for much of the energy radiated from stars. Energy is also released (or absorbed) owing to gravitational contraction (or expansion). A small mass $\mathrm{d}m$ makes a contribution $\mathrm{d}L$ to the total luminosity $L$ of a star given by

$$\mathrm{d}L = \epsilon\,\mathrm{d}m, \tag{35.37}$$

where $\epsilon$ is the total power released per unit mass by nuclear reactions and gravity. For a spherically symmetric star, the luminosity $\mathrm{d}L(r)$ of a thin shell of thickness $\mathrm{d}r$ (and mass $\mathrm{d}m = 4\pi r^2\rho\,\mathrm{d}r$), writing $\epsilon = \epsilon(r)$, satisfies

$$\frac{\mathrm{d}L(r)}{\mathrm{d}r} = 4\pi r^2\rho(r)\,\epsilon(r). \tag{35.38}$$

How the luminosity varies with radius depends on how the heat is transported to the surface of the star, either by photon diffusion or by convection. We consider each of these in turn.

35.3.1 Heat transfer by photon diffusion

The passage of photons through a star towards its surface is a diffusive process and is precisely analogous to thermal conductivity via the free electrons in a metal. As such, we may use eqn 9.14 to describe the radial heat flux $J(r)$:

$$J(r) = -\kappa_\mathrm{photon}\frac{\partial T}{\partial r}, \tag{35.39}$$

where $\kappa_\mathrm{photon}$ is the thermal conductivity (see Section 9.2) due to photons in the star. Treating the photons as a classical gas, we can use a result from the kinetic theory of gases, eqn 9.18, to write

$$\kappa_\mathrm{photon} = \frac{1}{3}Clc, \tag{35.40}$$

where $C$ here is the heat capacity of the photon gas per unit volume, $l$ is the mean free path⁷ for photons and $c$ is the mean speed of the particles in the 'gas', which here can be equated to the speed of light $c$. The heat capacity per unit volume $C$ may be obtained from the energy density of a gas of photons in thermal equilibrium at temperature $T$, which is given by

$$u = \frac{4\sigma}{c}T^4, \tag{35.41}$$

and from which we may derive the heat capacity per unit volume $C = \mathrm{d}u/\mathrm{d}T$ as

$$C = \frac{16\sigma}{c}T^3. \tag{35.42}$$

We next turn to the mean free path of the photons. This is determined by any process which results in photons being absorbed or scattered. Consider a beam of light with intensity $I_\lambda$ at wavelength $\lambda$. The change in intensity $\mathrm{d}I_\lambda$ of this beam as it travels through the stellar material is proportional to its intensity $I_\lambda$, the distance it has travelled $\mathrm{d}s$, and the density of the gas $\rho$. So we have

$$\mathrm{d}I_\lambda = -\kappa_\lambda\rho I_\lambda\,\mathrm{d}s, \tag{35.43}$$

where the minus sign shows that the intensity decreases with distance due to absorption. The constant $\kappa_\lambda$ is called the absorption coefficient or opacity. Equation 35.43 integrates to a dependence on distance of the form $I_\lambda(s) = I_\lambda(0)\,\mathrm{e}^{-s/l}$, where $l = 1/(\kappa_\lambda\rho)$ is the mean free path. Hence we obtain a new and useful expression for the thermal conductivity of a gas of photons by substituting eqn 35.42 into eqn 35.40:

$$\kappa_\mathrm{photon} = \frac{16}{3}\frac{\sigma T^3}{\kappa(r)\rho(r)}. \tag{35.44}$$

The total radiative flux at radius $r$ is $4\pi r^2J(r)$ and this is equal to $L(r)$. Hence using eqns 35.39 and 35.44, we can write

$$L(r) = -4\pi r^2\,\frac{16\sigma[T(r)]^3}{3\kappa(r)\rho(r)}\frac{\mathrm{d}T}{\mathrm{d}r}. \tag{35.45}$$

For many stars, the dominant heat transfer mechanism is radiative diffusion, which crucially depends on the temperature gradient $\mathrm{d}T/\mathrm{d}r$.

7 We have used the symbol $l$ for mean free path in this section so as to keep the symbol $\lambda$ for wavelength.
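The chain of results in eqns 35.40 to 35.45 can be sketched in code. The opacity used below (0.04 m$^2$ kg$^{-1}$) and the temperature and temperature gradient are illustrative assumptions only, not values taken from the text; the density is the mean solar density computed from the mass and radius quoted earlier, and the function names are ours.

```python
import math

SIGMA = 5.670e-8   # Stefan-Boltzmann constant, W m^-2 K^-4
C = 2.998e8        # speed of light, m s^-1
M_SUN = 1.99e30    # solar mass, kg
R_SUN = 6.96e8     # solar radius, m


def mean_free_path(opacity, density):
    """Photon mean free path l = 1 / (kappa * rho)."""
    return 1.0 / (opacity * density)


def photon_conductivity(temperature, opacity, density):
    """Thermal conductivity of the photon gas, eqn 35.44."""
    return 16.0 * SIGMA * temperature**3 / (3.0 * opacity * density)


def radiative_luminosity(r, temperature, dT_dr, opacity, density):
    """Luminosity carried by photon diffusion at radius r, eqn 35.45."""
    kappa_photon = photon_conductivity(temperature, opacity, density)
    return -4.0 * math.pi * r**2 * kappa_photon * dT_dr


# Illustrative (assumed) inputs
opacity = 0.04                                            # m^2 kg^-1, hypothetical
density = M_SUN / (4.0 / 3.0 * math.pi * R_SUN**3)        # mean solar density
temperature = 5.0e6                                       # K, hypothetical
dT_dr = -0.02                                             # K m^-1, hypothetical

l = mean_free_path(opacity, density)
kappa_photon = photon_conductivity(temperature, opacity, density)
# Cross-check against eqn 35.40 with C = 16 sigma T^3 / c
heat_capacity = 16.0 * SIGMA * temperature**3 / C
kappa_check = heat_capacity * l * C / 3.0

print(f"mean free path      l = {l:.3e} m")
print(f"photon conductivity   = {kappa_photon:.3e} W m^-1 K^-1")
print(f"L at r = R_sun / 2    = "
      f"{radiative_luminosity(0.5 * R_SUN, temperature, dT_dr, opacity, density):.3e} W")
```

With these assumed inputs the mean free path comes out at the centimetre scale, tiny compared with the stellar radius, which is why the transport of photons is diffusive.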

We can now summarize the main equations of stellar structure which we have obtained so far.

Equations of stellar structure

$$\frac{\mathrm{d}m(r)}{\mathrm{d}r} = 4\pi r^2\rho(r) \tag{35.12}$$

$$\frac{\mathrm{d}p(r)}{\mathrm{d}r} = -\frac{Gm(r)\rho(r)}{r^2} \tag{35.16}$$

$$\frac{\mathrm{d}L(r)}{\mathrm{d}r} = 4\pi r^2\rho(r)\,\epsilon(r) \tag{35.38}$$

$$\frac{\mathrm{d}T}{\mathrm{d}r} = -\frac{3\kappa(r)\rho(r)L(r)}{64\pi r^2\sigma[T(r)]^3} \tag{35.45}$$

In these equations, the energy release due to nuclear reactions, $\epsilon(r)$, may need to be corrected for a term which includes the release of gravitational potential energy. Under certain circumstances, this term may in fact be dominant.⁸ These equations ignore convection, which we will consider in the following section.

8 One also has to consider the heat capacity of the stellar material whenever the stellar structure changes.

35.3.2 Heat transfer by convection

If the temperature gradient exceeds a certain critical value then the heat transfer in a star is governed by convection. The following analysis was first produced by Schwarzschild⁹ in 1906. Consider a parcel of stellar material at radius $r$ having initial values of density and pressure $\rho_*(r)$ and $p_*(r)$ respectively. The parcel subsequently rises by a distance $\mathrm{d}r$ through ambient material of density and pressure $\rho(r)$ and $p(r)$. Initially the parcel is in pressure equilibrium with its surroundings and so $p_*(r) = p(r)$; it initially has the same density as its surroundings, and hence $\rho_*(r) = \rho(r)$. We will assume that the parcel rises adiabatically, and hence $p_*\rho_*^{-\gamma}$ is constant,¹⁰ where $\gamma$ is the adiabatic index. The parcel will be buoyant and will continue to rise if its density is lower than that of its surroundings (see Fig. 35.2), i.e. convection is possible if

$$\rho_* < \rho, \tag{35.46}$$

which implies that

$$\frac{\mathrm{d}\rho_*}{\mathrm{d}r} < \frac{\mathrm{d}\rho}{\mathrm{d}r}. \tag{35.47}$$

Because the parcel rises adiabatically, the constancy of $p_*\rho_*^{-\gamma}$ implies that

$$\frac{1}{p_*}\frac{\mathrm{d}p_*}{\mathrm{d}r} = \frac{\gamma}{\rho_*}\frac{\mathrm{d}\rho_*}{\mathrm{d}r}. \tag{35.48}$$

Fig. 35.2 A parcel of stellar material (of density $\rho_*$) will rise in its surroundings (density $\rho$) if $\rho_* < \rho$. This is the condition for convection to occur.

We can treat the ambient material as an ideal gas (so that $p \propto \rho T$; see eqn 6.18) and hence

$$\frac{1}{p}\frac{\mathrm{d}p}{\mathrm{d}r} = \frac{1}{\rho}\frac{\mathrm{d}\rho}{\mathrm{d}r} + \frac{1}{T}\frac{\mathrm{d}T}{\mathrm{d}r}. \tag{35.49}$$

Substituting eqns 35.48 and 35.49 into eqn 35.47 leads to

$$\frac{1}{\gamma}\frac{\rho_*}{p_*}\frac{\mathrm{d}p_*}{\mathrm{d}r} < \frac{\rho}{p}\frac{\mathrm{d}p}{\mathrm{d}r} - \frac{\rho}{T}\frac{\mathrm{d}T}{\mathrm{d}r}. \tag{35.50}$$

Since pressure equilibration happens very rapidly, we can assume that $p(r) = p_*(r)$. Moreover, $\rho_* \approx \rho$ to first order, and hence

$$\left(\frac{1}{\gamma} - 1\right)\frac{\mathrm{d}p}{\mathrm{d}r} < -\frac{p}{T}\frac{\mathrm{d}T}{\mathrm{d}r}, \tag{35.51}$$

and thus

$$\frac{\mathrm{d}T}{\mathrm{d}r} < \left(1 - \frac{1}{\gamma}\right)\frac{T}{p}\frac{\mathrm{d}p}{\mathrm{d}r} \tag{35.52}$$

is the condition for convection to occur. In fact, both temperature and pressure decrease with increasing distance from the centre in a star; hence both the temperature and pressure gradients are negative. Thus, it is more convenient to write the condition for convection to occur as

$$\left|\frac{\mathrm{d}T}{\mathrm{d}r}\right| > \left(1 - \frac{1}{\gamma}\right)\frac{T}{p}\left|\frac{\mathrm{d}p}{\mathrm{d}r}\right|. \tag{35.53}$$

In this equation, the pressure gradient is governed by hydrostatic equilibrium, which we met in eqn 35.16. This equation shows that convection will occur if the temperature gradient is very large, or if $\gamma$ becomes close to 1 (which makes the right-hand side of the equation small), as happens when a gas becomes partially ionized (see Exercise 35.5).

9 Karl Schwarzschild (1873–1916).

10 This follows from $pV^\gamma$ being constant; see eqn 12.15.
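The criterion in eqn 35.53 is simple to apply once the local temperature, pressure and their gradients are known. The following is a minimal sketch; the function names and the numerical values of $T$, $p$ and the gradients are hypothetical and serve only to illustrate how the test behaves as the temperature gradient or $\gamma$ changes.

```python
def critical_gradient(temperature, pressure, dp_dr, gamma):
    """Right-hand side of eqn 35.53: the magnitude of the adiabatic
    temperature gradient, (1 - 1/gamma) (T/p) |dp/dr|."""
    return (1.0 - 1.0 / gamma) * (temperature / pressure) * abs(dp_dr)


def is_convective(temperature, pressure, dT_dr, dp_dr, gamma):
    """Return True if |dT/dr| exceeds the critical value in eqn 35.53."""
    return abs(dT_dr) > critical_gradient(temperature, pressure, dp_dr, gamma)


# Hypothetical local conditions (illustrative numbers only)
T = 1.0e6        # K
p = 1.0e12       # Pa
dp_dr = -1.0e4   # Pa m^-1

steep = is_convective(T, p, dT_dr=-5.0e-3, dp_dr=dp_dr, gamma=5.0 / 3.0)
shallow = is_convective(T, p, dT_dr=-3.0e-3, dp_dr=dp_dr, gamma=5.0 / 3.0)
ionizing = is_convective(T, p, dT_dr=-3.0e-3, dp_dr=dp_dr, gamma=1.1)

print(f"critical |dT/dr| (gamma = 5/3): "
      f"{critical_gradient(T, p, dp_dr, 5.0 / 3.0):.2e} K m^-1")
print("steep gradient, gamma = 5/3   ->", steep)
print("shallow gradient, gamma = 5/3 ->", shallow)
print("shallow gradient, gamma = 1.1 ->", ionizing)
```

The third case shows the effect described in the text: a gradient that is stable for $\gamma = \frac{5}{3}$ becomes unstable to convection when $\gamma$ is lowered towards 1.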

35.3.3 Scaling relations

To find the detailed pressure and temperature dependences inside a star requires one to simultaneously solve eqns 35.12, 35.16, 35.38 and 35.45 (these are tabulated in the box at the end of Section 35.3.1) with a realistic form of the opacity $\kappa(r)$. This is very complicated and has to be performed numerically. However, we can gain considerable insight into general trends by deriving scaling relations. To do this, we assume that the principle of homology applies, which says that if a star of total mass $M$ expands or contracts, its physical properties change by the same factor at all radii. This means that the radial profiles as a function of the fractional mass are the same for all stars, the only difference being a constant factor which depends on the mass $M$. For example, this implies that a pressure interval $\mathrm{d}p$ scales in exactly the same way as the central pressure $p_\mathrm{c}$, and that the density profile $\rho(r)$ scales in the same way as the mean density $\langle\rho\rangle$. The following example demonstrates the use of the principle of homology in deriving scaling relations for various stellar properties.

Example 35.4 Using the principle of homology for a star of total mass $M$ and radius $R$, show that (a) $p(r) \propto R^{-4}$ and (b) $T(r) \propto R^{-1}$.
Solution: (a) The equation for hydrostatic equilibrium, eqn 35.16, states that

$$\frac{\mathrm{d}p}{\mathrm{d}r} = -\frac{Gm(r)\rho(r)}{r^2}, \tag{35.54}$$

so using $\rho \propto MR^{-3}$ and writing $|\mathrm{d}p/\mathrm{d}r| \sim p_\mathrm{c}/R$, we deduce that

$$\frac{p_\mathrm{c}}{R} \propto M^2R^{-5}. \tag{35.55}$$

Equation 35.55 means that $p_\mathrm{c} \propto M^2R^{-4}$, and using the principle of homology,

$$p(r) \propto M^2R^{-4}. \tag{35.56}$$

(b) We next consider a relationship for scaling the temperature throughout a star. Our starting point this time is the ideal gas law, which we met in eqn 6.18, from which we may write the following:

$$T(r) \propto \frac{p(r)}{\rho(r)}. \tag{35.57}$$

Using $\rho \propto MR^{-3}$ and eqn 35.56, we have

$$T(r) \propto MR^{-1}. \tag{35.58}$$

Hence as the star shrinks, its central temperature increases. Note that this does not give information on the surface temperature $T(R)$, since this depends on the precise form of $T(r)$.

For a low-mass star, the opacity $\kappa(r)$ increases with density and decreases with temperature roughly according to
\[ \kappa(r) \propto \rho(r)\, T(r)^{-3.5}, \tag{35.59} \]
which is known as Kramers opacity.$^{11}$ In this case, scaling yields (via $\rho \propto M R^{-3}$ and eqn 35.58)
\[ \kappa(r) \propto M^{-2.5} R^{0.5}. \tag{35.60} \]
For a very massive star, in which electron scattering dominates the opacity, $\kappa(r)$ is a constant.

11

H. A. Kramers (1894–1952).


Example 35.5
Determine the scaling of the luminosity $L$ with $M$ and $R$ for (a) a low-mass star and (b) a high-mass star.

Solution: By the principle of homology, a temperature increment $dT$ scales in the same way as $T$, which eqn 35.58 gives as $T(r) \propto M R^{-1}$. An increment in radius, however, scales with radius, i.e. $dr \propto R$. Therefore the temperature gradient follows $dT/dr \propto M R^{-1}/R$, giving
\[ \frac{dT}{dr} \propto M R^{-2}. \tag{35.61} \]
Equation 35.45 becomes
\[ \frac{L(r)}{r^2} \propto -\frac{T(r)^3}{\rho(r)\kappa(r)} \frac{dT}{dr}, \tag{35.62} \]
and hence in case (a), for which $\kappa(r) \propto \rho(r) T(r)^{-3.5}$, we find
\[ L(r) \propto \frac{M^{5.5}}{R^{0.5}}. \tag{35.63} \]
The assumption of homology means that if the luminosity at any radius $r$ scales as $M^{5.5} R^{-0.5}$, then the surface luminosity scales in this way, so we may write
\[ L \propto \frac{M^{5.5}}{R^{0.5}}. \tag{35.64} \]
For case (b), since $\kappa(r)$ is a constant, we find $L(r) \propto M^3$ and hence
\[ L \propto M^3. \tag{35.65} \]

12 Ejnar Hertzsprung (1873–1967), Henry Norris Russell (1877–1957).
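The exponent bookkeeping in Examples 35.4 and 35.5 can be checked mechanically. The following is a minimal sketch, not part of the original text: it represents any homology scaling $X \propto M^a R^b$ by the pair $(a, b)$, so that multiplying quantities adds exponents. The helper names `mul`, `power` and `luminosity` are illustrative choices only.

```python
from fractions import Fraction as F

# A scaling X ∝ M^a R^b is stored as the exponent pair (a, b).
def mul(x, y):
    """Product of two scalings: exponents add."""
    return (x[0] + y[0], x[1] + y[1])

def power(x, n):
    """Scaling raised to the power n: exponents multiply by n."""
    return (x[0] * n, x[1] * n)

M = (F(1), F(0))
R = (F(0), F(1))

rho = mul(M, power(R, -3))               # rho ∝ M R^-3
p = mul(mul(M, rho), power(R, -1))       # p/R ∝ M rho / R^2  (eqns 35.54-35.56)
T = mul(p, power(rho, -1))               # T ∝ p / rho         (eqn 35.57)
dT_dr = mul(T, power(R, -1))             # dT/dr ∝ T / R       (eqn 35.61)

def luminosity(kappa):
    """L ∝ r^2 T^3 / (rho kappa) * |dT/dr|, following eqn 35.62."""
    out = mul(power(R, 2), power(T, 3))
    out = mul(out, power(mul(rho, kappa), -1))
    return mul(out, dT_dr)

kappa_kramers = mul(rho, power(T, F(-7, 2)))   # eqn 35.59
kappa_constant = (F(0), F(0))                  # electron scattering

L_low = luminosity(kappa_kramers)     # low-mass star
L_high = luminosity(kappa_constant)   # high-mass star
print("low mass :", L_low)
print("high mass:", L_high)
```

Using exact fractions rather than floats keeps half-integer exponents such as 5.5 free of rounding error.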

The Hertzsprung–Russell diagram$^{12}$ is a plot of the luminosity of a collection of stars against their effective surface temperature $T_{\rm eff}$, where the latter quantity is obtained by measuring the colour of a star, and hence the wavelength of its peak emission, which is inversely proportional to $T_{\rm eff}$ by Wien’s law. Fig. 35.3 shows a Hertzsprung–Russell diagram for a selection of stars in our Galaxy. The most striking feature of this diagram is the main sequence, which represents stars which are burning mainly hydrogen; this is how almost all stars spend most of their ‘active’ life. The correlation between $L$ and $T_{\rm eff}$ occurs because both quantities depend on the star’s mass. Empirically it is found that, for main sequence stars, $L \propto M^a$, where $a$ is a positive constant which takes a value of about 3.5 (which is intermediate between the value of 5.5 for low-mass stars and 3 for massive stars which we found in Example 35.5). Note that the lifetime of a star must be proportional to $M/L$ (since the total mass $M$ measures how much ‘fuel’ is ‘on board’) and hence is proportional to $M^{1-a}$. Hence more massive stars burn up faster than less massive stars.
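To make the lifetime scaling concrete, here is a minimal sketch (an illustration, not taken from the original text) that evaluates $M^{1-a}$ relative to a reference star. The function name `relative_lifetime` and the sample mass ratios are assumptions made for the example; the default $a = 3.5$ is the empirical value quoted above.

```python
def relative_lifetime(mass_ratio, a=3.5):
    """Lifetime relative to a reference star, from t ∝ M / L with L ∝ M**a."""
    return mass_ratio ** (1.0 - a)

# Sample mass ratios (illustrative values only).
for m in (0.5, 1.0, 10.0):
    print(f"M = {m:4.1f} x reference  ->  lifetime x {relative_lifetime(m):.3g}")
```

With $a = 3.5$, a star ten times more massive than the reference lives for only a fraction $10^{-2.5}$ of the reference lifetime.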

Fig. 35.3 A schematic Hertzsprung–Russell diagram (image courtesy of the Open University).

The Hertzsprung–Russell diagram in Fig. 35.3 also shows various red giants, which are stars that have exhausted the supply of hydrogen in their cores. Red giants are very luminous because a very hot, inert helium core (far hotter than in a main-sequence star) causes the hydrogen shell around it (which undergoes nuclear fusion) to expand greatly; the surface is therefore very large and has a lower temperature. Eventually, the temperature in the helium core rises so high that beryllium and carbon can be formed; the outer part of the core can be ejected, leading to the formation of a nebula, and the remaining core can collapse to form a white dwarf. White dwarfs, which are not very luminous but have a high surface temperature, will be described in the following chapter. It is expected that our own Sun will eventually pass through a red-giant phase, and that its core will ultimately become a white dwarf.

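Chapter summary

• A gas cloud will condense if its density exceeds the Jeans density.
• The equation of hydrostatic equilibrium is
\[ \frac{dp(r)}{dr} = -\frac{G m(r) \rho(r)}{r^2}. \]
• The luminosity obeys
\[ \frac{dL(r)}{dr} = 4\pi r^2 \rho(r)\, \varepsilon(r), \]
where $\varepsilon(r)$ is the rate of energy release per unit mass.
• The temperature profile inside a star obeys
\[ \frac{dT}{dr} = -\frac{3\kappa(r)\rho(r)L(r)}{64\pi r^2 \sigma [T(r)]^3}. \]
• The virial theorem states that
\[ \langle p \rangle V = -\frac{\Omega}{3} \quad \text{and} \quad 3(\gamma - 1)U + \Omega = 0. \]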

Further reading

Recommended texts on stellar physics include Binney & Merrifield (1998), Prialnik (2000), Carroll & Ostlie (1996) and Zeilik & Gregory (1998).

Exercises

(35.1) Estimate the number of protons in the Sun.

(35.2) Find the critical density for condensation of a cloud of molecular hydrogen gas of total mass $1000 M_\odot$ at 20 K, expressing your answer in number of molecules per cubic metre. How would this answer change if (a) the mass of the cloud was only one solar mass, (b) the temperature was 100 K?

(35.3) Assume that the density of baryonic matter in the Universe is $3 \times 10^{-27}$ kg m$^{-3}$ and that the distance to the edge of the Universe is given by $c\tau$, where $\tau$ is the age of the Universe, $13 \times 10^9$ years, and $c$ is the speed of light. Given that a typical galaxy has a mass $10^{11} M_\odot$, estimate the number of galaxies in the observable Universe. Estimate how many protons there are in the observable Universe, stating all your assumptions.

(35.4) Show that for a uniform density cloud in eqn 35.2, $f = 3/5$.

(35.5) Consider a gas consisting of neutral hydrogen atoms with number density $n_0$, protons of number density $n_+$, and electrons with number density $n_e = n_+$. The ionization potential is $\chi$. Find the adiabatic index $\gamma$.

(35.6) Show that for low-mass stars, the luminosity $L$ scales with the effective surface temperature $T_{\rm eff}$ and mass $M$ according to $L \propto M^{22/5} T_{\rm eff}^{4/5}$ (combine eqn 35.64 with $L \propto R^2 T_{\rm eff}^4$).

36

Compact objects

When a star is near the end of its lifetime, and all of its fuel is used up, there is no longer enough outward pressure due to radiation to resist the inward pull of gravity and the star starts to collapse again. However, there is another source of internal pressure. The electrons inside a star, being fermions, are subject to the Pauli exclusion principle and take unkindly to being squashed into a small space. They produce an outward electron degeneracy pressure which we calculate in the following section. This concept leads to white dwarfs (Section 36.2) and, for the case of neutron degeneracy pressure, neutron stars (Section 36.3). More massive stars can turn into black holes (Section 36.4). We consider how mass can accrete onto such objects in Section 36.5 and conclude the chapter by considering the entropy of a black hole in Section 36.6.

36.1

Electron degeneracy pressure

Using the results from Chapter 30 concerning fermion gases, we can write the Fermi momentum $p_F$ as
\[ p_F = \hbar (3\pi^2 n)^{1/3}, \tag{36.1} \]
where $n$ is the number density of electrons, so that equivalently $n$ can be written as
\[ n = \frac{1}{3\pi^2} \left( \frac{p_F}{\hbar} \right)^3. \tag{36.2} \]
If we assume that the electrons behave non-relativistically, the Fermi energy is
\[ E_F = \frac{p_F^2}{2m_e}, \tag{36.3} \]
and the average internal energy density $u$ is
\[ u = \frac{3}{5} n E_F = \frac{3\hbar^2}{10 m_e} (3\pi^2)^{2/3} n^{5/3}. \tag{36.4} \]
This gives an expression for the electron degeneracy pressure $p_{\rm electron}$ (using eqn 6.25) as
\[ p_{\rm electron} = \frac{2}{3} u = \frac{\hbar^2}{5 m_e} (3\pi^2)^{2/3} n^{5/3}. \tag{36.5} \]
Note that our expression for the electron degeneracy pressure is inversely proportional to the electron mass $m_e$. This is why we have worried about electron degeneracy pressure, and not proton or neutron degeneracy pressure, since the pressure produced by neutrons and protons is much smaller because they are more massive.

We can relate the number density of electrons, $n$, to the density $\rho$ of the star by the following argument. If the star contains nuclei with atomic number $Z$ and mass number $A$, each nucleus has mass $A m_p$ and positive charge $+Ze$ (where $-e$ is the charge of an electron). For charge balance, for every nucleus there must be $Z$ electrons. Hence, by ignoring the mass of the electrons themselves (which is much less than the mass of the nuclei), $n$ is given by
\[ n \approx \frac{Z\rho}{A m_p}. \tag{36.6} \]
Putting this into eqn 36.5, we find that the electron degeneracy pressure $p_{\rm electron} \propto \rho^{5/3}$. This outward electron degeneracy pressure must balance the inward pressure due to the gravitational force. This pressure, which we will here denote by $p_{\rm grav}$, is related by eqn 35.27 to the gravitational potential energy $\Omega$, which is given by
\[ \Omega = -\frac{3GM^2}{5R}, \tag{36.7} \]
so that
\[ p_{\rm grav} = \frac{\Omega}{3V} = -\frac{G}{5} \left( \frac{4\pi}{3} \right)^{1/3} M^{2/3} \rho^{4/3}, \tag{36.8} \]
where we have used $\rho = M/V$ and $R^3 = 3M/(4\pi\rho)$ to obtain the final result. Note the important results that, for non-relativistic electrons:

• The outward pressure is $p_{\rm electron} \propto \rho^{5/3}$.
• The inward pressure is $p_{\rm grav} \propto \rho^{4/3}$.

This leads to a stable situation since, if a star supported only by electron degeneracy pressure begins to shrink so that $\rho$ begins to increase, the outward pressure $p_{\rm electron}$ increases faster than $p_{\rm grav}$, producing an outward restoring force.

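Example 36.1
What is the condition for balancing $p_{\rm electron}$ and $p_{\rm grav}$?

Solution: We set $p_{\rm electron} = |p_{\rm grav}|$, and using eqns 36.5, 36.6 and 36.8 this implies that
\[ \frac{\hbar^2}{5 m_e} (3\pi^2)^{2/3} \left( \frac{Z\rho}{A m_p} \right)^{5/3} = \frac{G}{5} \left( \frac{4\pi}{3} \right)^{1/3} M^{2/3} \rho^{4/3}, \tag{36.9} \]
and hence
\[ \rho = \frac{4 G^3 M^2 m_e^3}{27 \pi^3 \hbar^6} \left( \frac{A m_p}{Z} \right)^5. \tag{36.10} \]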
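As a numerical illustration of eqn 36.10, the following minimal sketch evaluates the equilibrium density for a given mass and the radius of a uniform sphere with that density. It is an estimate under stated assumptions, not a stellar model: the constants are rounded standard values, $Z/A = 0.5$ is assumed by default, and the function names are illustrative.

```python
import math

# Rounded physical constants in SI units (assumed values for this sketch).
G = 6.674e-11       # gravitational constant, m^3 kg^-1 s^-2
HBAR = 1.0546e-34   # reduced Planck constant, J s
M_E = 9.109e-31     # electron mass, kg
M_P = 1.6726e-27    # proton mass, kg
M_SUN = 1.989e30    # solar mass, kg

def white_dwarf_density(mass, z_over_a=0.5):
    """Equilibrium density from eqn 36.10 (non-relativistic electrons)."""
    prefactor = 4 * G**3 * M_E**3 / (27 * math.pi**3 * HBAR**6)
    return prefactor * mass**2 * (M_P / z_over_a) ** 5

def white_dwarf_radius(mass, z_over_a=0.5):
    """Radius of a uniform sphere of the given mass at that density."""
    rho = white_dwarf_density(mass, z_over_a)
    return (3 * mass / (4 * math.pi * rho)) ** (1 / 3)

rho_sun = white_dwarf_density(M_SUN)
r_sun = white_dwarf_radius(M_SUN)
print(f"density ~ {rho_sun:.2e} kg m^-3, radius ~ {r_sun / 1e3:.0f} km")
```

For one solar mass this gives a density of order $10^9$ kg m$^{-3}$ and a radius of several thousand kilometres, comparable to the size of the Earth.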

36.2


White dwarfs

A star supported from further collapse only by electron degeneracy pressure is called a white dwarf$^1$ and is the fate of many stars once they have exhausted their nuclear fuel. Equation 36.10 shows that
\[ \rho \propto M^2, \tag{36.11} \]
which together with $\rho \propto M/R^3$ implies that
\[ R \propto M^{-1/3}. \tag{36.12} \]
This implies that the radius of a white dwarf decreases as the mass increases.

Example 36.2
What is the electron degeneracy pressure for relativistic electrons?

Solution: The Fermi energy is now
\[ E_F = p_F c, \tag{36.13} \]
and the average internal energy density is
\[ u = \frac{3}{4} n E_F = \frac{3\hbar c}{4} (3\pi^2)^{1/3} n^{4/3}. \tag{36.14} \]
The pressure $p_{\rm electron}$ now follows from eqn 25.21 and is
\[ p_{\rm electron} = \frac{u}{3} = \frac{\hbar c}{4} (3\pi^2)^{1/3} n^{4/3}. \tag{36.15} \]
Note the important result that, for relativistic electrons:

• The outward pressure is $p_{\rm electron} \propto \rho^{4/3}$.
• The inward pressure is $p_{\rm grav} \propto \rho^{4/3}$.

This leads to an unstable situation, since now if a star begins to shrink, so that $\rho$ begins to increase, the outward pressure $p_{\rm electron}$ increases at exactly the same rate as $p_{\rm grav}$. Electron degeneracy pressure cannot halt further collapse.

We can estimate the mass above which the electrons in a white dwarf will behave relativistically. This will occur when
\[ p_F \gtrsim m_e c, \tag{36.16} \]
and hence when
\[ n \gtrsim \frac{1}{3\pi^2} \left( \frac{m_e c}{\hbar} \right)^3, \tag{36.17} \]
or equivalently
\[ \rho \gtrsim \frac{A m_p}{Z} \frac{1}{3\pi^2} \left( \frac{m_e c}{\hbar} \right)^3. \tag{36.18} \]
Substituting in eqn 36.10 for $\rho$ in this equation, and then rearranging, yields
\[ M \gtrsim \left( \frac{Z}{A m_p} \right)^2 \frac{3\sqrt{\pi}}{2} \left( \frac{\hbar c}{G} \right)^{3/2} \approx 1.2 M_\odot, \tag{36.19} \]
assuming that $Z/A = 0.5$ (appropriate for helium, carbon or oxygen). A more exact treatment leads to an estimate around $1.4 M_\odot$. This is known as the Chandrasekhar limit$^2$, and is the mass above which a white dwarf is no longer stable. Above the Chandrasekhar limit, the electron degeneracy pressure is no longer sufficient to support the star against gravitational collapse.

White dwarfs are fairly common and it is believed that most small and medium-size stars will end up in this state, often after going through a red-giant phase. The first-discovered white dwarf was Sirius B, the so-called dark companion of Sirius A (the brightest star visible in the night sky, to be found in the constellation of Canis Major), which is shown in an X-ray image in Fig. 36.1. Though Sirius B is much less bright in the visible region of the spectrum, it is a stronger emitter of X-rays because of its high temperature and thus appears as the brighter object in the X-ray image.

1 White dwarfs are called dwarfs because they are small and white because they are hot and luminous.

2 Subrahmanyan Chandrasekhar (1910–1995).

Fig. 36.1 Sirius is the brightest star in the night sky, but is actually a binary star. What you see with the naked eye is the bright normal star ‘Sirius A’, but the small star in orbit around it, known as ‘Sirius B’ (discovered by Alvan G. Clark in 1862), is a white dwarf. Because a white dwarf is so dense, it is very hot and can emit X-rays. The X-ray image shown in the figure was taken with the High Resolution Camera on the Chandra satellite. In this X-ray image the white dwarf Sirius B is much brighter than Sirius A. The bright ‘spokes’ in the image are produced by X-rays scattered by the support structure of a diffraction grating which was in the optical path for this observation. (Image courtesy of NASA.)
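The estimate in eqn 36.19 is easy to evaluate numerically. The sketch below is illustrative only: the constants are rounded standard values and the function name `relativistic_onset_mass` is an assumption of this example rather than anything defined in the text.

```python
import math

# Rounded physical constants in SI units (assumed values for this sketch).
G = 6.674e-11       # gravitational constant, m^3 kg^-1 s^-2
HBAR = 1.0546e-34   # reduced Planck constant, J s
C = 2.998e8         # speed of light, m s^-1
M_P = 1.6726e-27    # proton mass, kg
M_SUN = 1.989e30    # solar mass, kg

def relativistic_onset_mass(z_over_a):
    """Mass above which the degenerate fermions become relativistic (eqn 36.19)."""
    return ((z_over_a / M_P) ** 2
            * 1.5 * math.sqrt(math.pi)
            * (HBAR * C / G) ** 1.5)

m_limit = relativistic_onset_mass(0.5) / M_SUN
print(f"estimated limit ~ {m_limit:.2f} solar masses")
```

The result is close to the 1.2 solar masses quoted in eqn 36.19; the more exact treatment mentioned in the text raises this to about 1.4 solar masses.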

36.3

Neutron stars

Once a star is more massive than about $1.4 M_\odot$, electrons behave relativistically and cannot prevent further collapse. However, the star will contain neutrons and these will still be non-relativistic since the neutron mass is larger than the electron mass. Neutrons are fermions and their pressure, albeit lower than the electron pressure below the Chandrasekhar limit, will follow $\rho^{5/3}$ and therefore can balance the inward gravitational pressure. Free neutrons decay with a mean lifetime of about 15 minutes, but in a star one has to consider the equilibrium
\[ n \rightleftharpoons p^+ + e^- + \bar{\nu}_e. \tag{36.20} \]
Because the electrons are relativistic, their Fermi energy is proportional to $p_F \propto n^{1/3}$, while the neutrons are non-relativistic and so their Fermi energy is proportional to $p_F^2 \propto n^{2/3}$. Thus at high density, an equilibrium can be established in the reaction in eqn 36.20. This implies that the Fermi momentum of the electrons is much smaller than that of the neutrons, and hence the number density of electrons will be much smaller than that of the neutrons. This moves the equilibrium towards the left-hand side of eqn 36.20.

A compact object composed mainly of neutrons is called a neutron star. The first observational evidence of such an object came from the discovery of pulsars by Jocelyn Bell Burnell in 1967. These were soon identified as rapidly rotating neutron stars which emit beams of radiation from their north and south magnetic poles. If their axis of rotation is not aligned with the poles, then lighthouse-type sweeping beams are produced as they rotate. When these intersect with the line of sight of an observer, pulses of radiation with a regular frequency are seen. The physical mechanism by which the radiation is emitted from pulsars is currently the subject of active research.

Neutron stars are thought to form from the collapsed remnant of a very massive star after a supernova explosion. Even though neutron stars have masses of a few solar masses, they are very compact, having radii in the range 10–20 km (see Exercise 36.3). One such neutron star is found at the centre of the Crab Nebula, in the constellation of Taurus. This object is 6500 light years from us and is the remnant of a supernova explosion which was recorded by Chinese and Arab astronomers in 1054 as being visible during daylight for over three weeks. The neutron star at the centre currently rotates at a rate of thirty times per second.

Fig. 36.2 The Crab Nebula, as seen by the VLT telescope in Paranal, Chile. At the centre of the nebula is a neutron star. (Figure courtesy European Southern Observatory.)

Example 36.3
Estimate the minimum rotation period $\tau$ of a pulsar of radius $R$ and mass $M$.

Solution: For a neutron star rotating at $\omega = 2\pi/\tau$, the gravitational force per unit mass at the equator, $GM/R^2$, must be bigger than the centrifugal force per unit mass, $\omega^2 R$, so that the minimum period is
\[ \tau = 2\pi \sqrt{\frac{R^3}{GM}}. \tag{36.21} \]

By analogy with a white dwarf, the radius $R$ of a neutron star follows $R \propto M^{-1/3}$, so that more massive neutron stars are smaller than lighter ones. When the mass of a neutron star becomes very large, the neutrons behave relativistically and the neutron star becomes unstable.

Example 36.4
Above what mass will a neutron star become unstable?

Solution: The high gravitational fields and compact nature of neutron stars mean that we really ought to include the effects of general relativity and the strong nuclear interactions. However, ignoring these, we can make an estimate on the basis that the neutron star will become unstable when the neutrons themselves become relativistic. By analogy with eqn 36.19, and taking $Z/A = 1$, we have the maximum mass$^3$ as
\[ M \approx \frac{3\sqrt{\pi}}{2 m_p^2} \left( \frac{\hbar c}{G} \right)^{3/2} \approx 5 M_\odot. \tag{36.22} \]

3 Including general relativity reduces the maximum mass to about $0.7 M_\odot$, but including a more realistic equation of state raises the maximum mass up again, to somewhere around 2–3 $M_\odot$.
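The minimum period of eqn 36.21 can be evaluated for representative numbers. In this minimal sketch the mass of two solar masses and the radius of 10 km are assumed inputs (the text quotes radii of 10–20 km), the constants are rounded standard values, and the function name is illustrative.

```python
import math

G = 6.674e-11      # gravitational constant, m^3 kg^-1 s^-2 (rounded)
M_SUN = 1.989e30   # solar mass, kg (rounded)

def min_rotation_period(mass, radius):
    """Shortest period for which gravity still holds the equator (eqn 36.21)."""
    return 2 * math.pi * math.sqrt(radius**3 / (G * mass))

# Assumed example values: 2 solar masses, 10 km radius.
tau = min_rotation_period(2 * M_SUN, 10e3)
print(f"minimum period ~ {tau * 1e3:.2f} ms")
```

The result is a fraction of a millisecond, comfortably shorter than the period of about 1/30 s quoted above for the neutron star in the Crab Nebula.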

36.4

Black holes

If a neutron star undergoes gravitational collapse, there is no other pressure to balance the gravitational attraction and the gravitational collapse of the star is total. The result is a black hole. To treat black holes properly requires general relativity, but we can derive a few results about them using simple arguments. The escape velocity $v_{\rm esc}$ at the surface of a star can be obtained by equating kinetic energy $\frac{1}{2} m v_{\rm esc}^2$ to the magnitude of the gravitational potential energy $GMm/R$ so that
\[ v_{\rm esc} = \sqrt{\frac{2GM}{R}}. \tag{36.23} \]
For a black hole of mass $M$, the escape velocity reaches the speed of light, $c$, at the Schwarzschild radius$^4$ $R_S$ given by
\[ R_S = \frac{2GM}{c^2}. \tag{36.24} \]
This result seems to imply that photons from a black hole cannot escape and the black hole appears black to an observer. Actually this is not quite true for two reasons, one practical and one esoteric.

(1) Matter falling into a black hole is ripped apart by the enormous gravitational tidal forces, well before it enters the event horizon$^5$ at the Schwarzschild radius. This results in powerful emission of X-rays and radiation at other wavelengths. Supermassive black holes at the centres of certain galaxies, the most luminous active galactic nuclei having masses $10^8 M_\odot$, are responsible for the most powerful sustained electromagnetic radiation in the Universe.

(2) Even neglecting this powerful observed emission, there is believed to be weak emission of radiation from black holes due to quantum fluctuations close to the event horizon. This Hawking radiation can be thought of as resulting from vacuum fluctuations which produce particle–antiparticle pairs in which one half of the virtual pair falls into the black hole and the other half escapes. Because it emits energy, a black hole must have a temperature. The Hawking temperature $T_H$ of a black hole of mass $M$ is given by
\[ k_B T_H = \frac{\hbar c^3}{8\pi G M}, \tag{36.25} \]
so that as the black hole loses energy owing to Hawking radiation it becomes hotter. It also loses mass and this is termed black hole evaporation. If we ignore all other processes, the lifetime of a black hole can be estimated using
\[ \frac{dM}{dt} c^2 = -4\pi R_S^2 \sigma T_H^4, \tag{36.26} \]
which leads to a lifetime which is proportional to $M^3$. Thus small black holes evaporate due to Hawking radiation much faster than very massive ones.

4 Karl Schwarzschild (1873–1916).
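The three results above can be put into numbers. The following is a minimal sketch under stated assumptions: rounded standard constants, all processes other than Hawking radiation ignored, and function names chosen for illustration. Since $R_S^2 T_H^4 \propto M^{-2}$, eqn 36.26 has the form $dM/dt = -k/M^2$, which integrates to a lifetime $M^3/(3k)$.

```python
import math

# Rounded physical constants in SI units (assumed values for this sketch).
G = 6.674e-11       # gravitational constant, m^3 kg^-1 s^-2
HBAR = 1.0546e-34   # reduced Planck constant, J s
C = 2.998e8         # speed of light, m s^-1
K_B = 1.381e-23     # Boltzmann constant, J K^-1
SIGMA = 5.670e-8    # Stefan-Boltzmann constant, W m^-2 K^-4
M_SUN = 1.989e30    # solar mass, kg

def schwarzschild_radius(mass):
    """Eqn 36.24."""
    return 2 * G * mass / C**2

def hawking_temperature(mass):
    """Eqn 36.25."""
    return HBAR * C**3 / (8 * math.pi * G * mass * K_B)

def evaporation_time(mass):
    """Lifetime from eqn 36.26, written as dM/dt = -k / M^2."""
    power = 4 * math.pi * schwarzschild_radius(mass)**2 * SIGMA * hawking_temperature(mass)**4
    k = power * mass**2 / C**2      # independent of mass
    return mass**3 / (3 * k)

print(f"R_S  ~ {schwarzschild_radius(M_SUN):.0f} m")
print(f"T_H  ~ {hawking_temperature(M_SUN):.2e} K")
print(f"life ~ {evaporation_time(M_SUN):.2e} s")
```

For a solar-mass black hole the Schwarzschild radius is about 3 km and the Hawking temperature is far below that of the 2.7 K background radiation, which is why evaporation is only significant for very small black holes.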

36.5

Accretion

Black holes and neutron stars increase their mass as matter falls on to them. There is, however, a maximum rate of this accretion of mass onto any compact object. This occurs because the higher the rate of accretion, the greater the luminosity due to the infalling matter, and hence the higher the outward radiation flux. Therefore the radiation pressure increases, pushing outwards on any further matter attempting to fall inwards and accrete. To analyse this situation, consider a piece of matter accreting onto a star at radius $R$. This piece of matter has density $\rho$ and volume $dA\,dR$. The gravitational force dragging it towards the star is
\[ -\frac{GM}{R^2} \rho \, dA \, dR. \tag{36.27} \]
However, the radiation from the luminosity $L$ of the stellar object produces a radiation pressure on the falling matter which results in an outward force equal to
\[ \frac{L}{4\pi R^2 c} \, dA \times \kappa \rho \, dR, \tag{36.28} \]
where the factor $\kappa\rho\,dR$ is the fraction of radiant energy absorbed in the matter. The piece of matter will be able to accrete onto the star if the gravitational force dominates, so that
\[ \frac{GM}{R^2} \rho \, dA \, dR > \frac{L}{4\pi R^2 c} \, dA \times \kappa \rho \, dR, \tag{36.29} \]
so that
\[ L < L_{\rm edd} = \frac{4\pi G M c}{\kappa}, \tag{36.30} \]
where $L_{\rm edd}$ is the Eddington luminosity$^6$. If the luminosity $L$ is entirely produced by accreting matter, then $L = GM\dot{M}/R$, so that there is a maximum rate of accretion given by
\[ \dot{M}_{\rm edd} = \frac{4\pi c R}{\kappa}. \tag{36.31} \]
This assumes spherically symmetric accretion and luminosity. Many compact objects actually accrete mass at a rate above the Eddington limit given by eqn 36.31, by accreting near the object’s equator but radiating photons from the object’s polar regions.

5 An event horizon is a mathematical, rather than physical, surface surrounding a black hole within which the escape velocity for a particle exceeds the speed of light, making escape impossible.

6 Arthur Stanley Eddington (1882–1944).
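To get a feel for the size of the Eddington limit, the sketch below evaluates eqns 36.30 and 36.31. The opacity is not specified in the text, so this example assumes $\kappa \approx 0.04$ m$^2$ kg$^{-1}$, roughly the electron-scattering value for ionized hydrogen; the 10 km radius, the rounded constants and the function names are likewise assumptions made for illustration.

```python
import math

G = 6.674e-11      # gravitational constant, m^3 kg^-1 s^-2 (rounded)
C = 2.998e8        # speed of light, m s^-1 (rounded)
M_SUN = 1.989e30   # solar mass, kg (rounded)
KAPPA_ES = 0.04    # m^2 kg^-1, ASSUMED electron-scattering opacity

def eddington_luminosity(mass, kappa=KAPPA_ES):
    """Eqn 36.30: maximum luminosity for spherically symmetric accretion."""
    return 4 * math.pi * G * mass * C / kappa

def eddington_accretion_rate(radius, kappa=KAPPA_ES):
    """Eqn 36.31: maximum accretion rate onto an object of the given radius."""
    return 4 * math.pi * C * radius / kappa

l_edd = eddington_luminosity(M_SUN)
mdot_edd = eddington_accretion_rate(10e3)   # assumed 10 km radius
print(f"L_edd    ~ {l_edd:.2e} W")
print(f"Mdot_edd ~ {mdot_edd:.2e} kg/s")
```

Note that the accretion-rate limit depends on the radius of the object but not on its mass, while the luminosity limit depends on the mass but not on the radius.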

36.6

Black holes and entropy

In this section we consider the entropy of black holes. If we ignore the quantum mechanical Hawking radiation, the mass of a black hole can only increase since mass can enter but not leave. This means that the event horizon expands and the area $A$ of the horizon, given by $4\pi R_S^2$, only increases. It turns out that the area of an event horizon can be associated with its entropy $S$ according to
\[ S = k_B \frac{A}{4 l_P^2}, \tag{36.32} \]
where $l_P = (G\hbar/c^3)^{1/2}$ is the Planck length, a result obtained by Hawking and Bekenstein.$^7$ The entropy (and hence the area) of a black hole increases in all classical processes, as it should according to the second law of thermodynamics. Since all information concerning matter is lost when it falls into a black hole, the entropy of a black hole can be thought of as a large reservoir of missing information. Information can be measured in bits, and relating information to entropy (see Chapter 15) implies that for a black hole, one bit corresponds to four Planck areas (where the Planck area is $l_P^2$). This is indicated schematically in Fig. 36.3.

The entropy of the black hole measures the uncertainty concerning which of its internal configurations are realized. We can speculate that a particular black hole may have been formed from a collapsing neutron star, the collapse of a normal star, or (somewhat improbably) the collapse of a giant cosmic spaghetti monster: we have no way of telling which, because all of this information has become completely inaccessible to us and all we can measure is the black hole’s mass, charge and angular momentum. Information about the black hole’s past history or its current chemical composition is hidden from our eyes.

As the mass $M$ of a black hole increases, so too does $R_S$ and hence so does $S$. Therefore the maximal limit of entropy (and hence information) for any ordinary region of space is directly proportional not to the region’s volume, but to its area. This is a counterexample to the usual rule that entropy is an extensive property, being proportional to volume. Although the entropy of a black hole increases in all classical processes, it decreases in the quantum mechanical black hole evaporation due to Hawking radiation. Finding out what has happened to the information in black hole evaporation, and whether information can ever escape from a black hole, is a current conundrum in black hole physics.

It is useful to consider what happens when a body containing ordinary entropy falls into a black hole. The ordinary body has entropy for the usual reasons, namely that it can exist in a wide variety of different configurations and its entropy expresses our uncertainty in knowledge of its precise configuration. All that entropy seems at first sight to have been lost when the body falls into the black hole, since it can now only exist in one single configuration: the state of being annihilated! It therefore appears that the entropy of the Universe has gone down. However, the increase in mass of the black hole leads to an increase in the black hole’s area and hence in its entropy. It turns out that this more than compensates for any entropy apparently ‘lost’ by matter falling into the black hole. This motivates Bekenstein’s generalized second law of thermodynamics, which states that the sum of the usual entropy of matter in the Universe plus the entropy of the black holes never decreases.

7 The presence of a factor $\hbar$ belies the fact that this is a classical result. Entropy of even classical systems is essentially a count over states, and the underlying states are quantum in nature.
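The magnitude of the entropy in eqn 36.32 is worth seeing in numbers. The following minimal sketch evaluates $S/k_B = A/(4 l_P^2)$ for a black hole of given mass; the rounded constants and the function names are assumptions of this example.

```python
import math

# Rounded physical constants in SI units (assumed values for this sketch).
G = 6.674e-11       # gravitational constant, m^3 kg^-1 s^-2
HBAR = 1.0546e-34   # reduced Planck constant, J s
C = 2.998e8         # speed of light, m s^-1
M_SUN = 1.989e30    # solar mass, kg

def planck_area():
    """Square of the Planck length, l_P^2 = G hbar / c^3."""
    return G * HBAR / C**3

def horizon_area(mass):
    """Area 4 pi R_S^2 of the event horizon, with R_S = 2GM/c^2."""
    r_s = 2 * G * mass / C**2
    return 4 * math.pi * r_s**2

def entropy_over_kb(mass):
    """Dimensionless entropy S / k_B from eqn 36.32."""
    return horizon_area(mass) / (4 * planck_area())

s_sun = entropy_over_kb(M_SUN)
print(f"S / k_B ~ {s_sun:.2e} for one solar mass")
```

Because the area scales as $M^2$, doubling the mass quadruples the entropy, which illustrates that the entropy grows with the area of the horizon rather than with an enclosed volume.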

Fig. 36.3 The entropy of a black hole is proportional to its area $A$. This corresponds to a quantity of information such that one bit is ‘stored’ in four Planck areas across the surface of the black hole.

36.7

Life, the Universe and Entropy

We often hear it said that we receive our energy from the Sun. This is true, but though Earth receives about $1.5 \times 10^{17}$ W of power, mainly in ultraviolet and visible photons (radiation corresponding to the temperature on the surface of the Sun), the planet ultimately radiates it again as infrared photons (radiation corresponding to the temperature in Earth’s atmosphere$^8$). If it did not do this, our planet would get progressively warmer and warmer, and so for the conditions on Earth to be approximately time independent, we require that the total solar energy arriving at Earth must balance the total energy leaving Earth. The crucial point is that the frequency of radiation coming in is higher than that going out; a visible or ultraviolet photon thus has more energy than an infrared photon. Thus fewer photons arrive than leave. The entropy per photon is a constant, independent of frequency, so that by having fewer high-energy photons coming in and a larger number of lower-energy photons leaving, the incoming energy is low-entropy energy while the energy that leaves is high entropy. Thus the Sun is, for planet Earth, a convenient low-entropy energy source and the planet benefits from this incoming flux of low-entropy energy. This allows acorns to grow into oak trees, a process which in itself corresponds to a decrease in entropy but which can occur because a greater increase of entropy occurs elsewhere. When we digest food, and our body builds new cells and tissue, we are extracting some low-entropy energy from the plant and animal matter which we have eaten, all of which derives from the Sun. Similarly, the process of evolution over millions of years, in which the complexity of life on Earth has increased with time, is driven by this flux of solar low-entropy energy.

Since the Universe is bathed in 2.7 K black body radiation, the Sun, with its 6000 K surface temperature, is clearly in a non-equilibrium state. The ‘ultimate equilibrium state’ of the Universe would be everything sitting at some uniform, low temperature, such as 2.7 K. During the Sun’s lifetime, almost all its low-entropy energy will be dissipated, filling space with photons; they will travel through the Universe and eventually interact with matter. The resulting high-entropy energy will tend to eventually feed into the cosmic slush of the ultimate equilibrium state. However, it is in the process of these photons interacting with matter that fun can begin: life is a non-equilibrium state, and prospers on Earth through non-equilibrium states that are driven by the constant influx of low-entropy energy.

The origin of the Sun’s low entropy is of course gravity. The Sun has gravitationally condensed from a uniform hydrogen cloud which is a source of low entropy as far as gravity is concerned (the operation of gravity is to cause such a cloud to condense and the entropy increases as the particles clump together). The clouds of gas of course came from the matter dispersed in the Big Bang. A crucial insight is to realize that although the matter and electromagnetic degrees of freedom in the early Universe were in thermal equilibrium (i.e. in a thermalized, high-entropy state, and thus producing the almost perfectly uniform cosmic microwave background we see today), the gravitational degrees of freedom were not thermalized. These unthermalized gravitational degrees of freedom provided the reservoir of low entropy which could drive gravitational collapse, and hence lead to the emission of low-entropy energy from stars which can, in favourable circumstances, drive life itself.

8 See Chapter 37.
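The photon-counting argument of this section can be given a rough number. In thermal radiation the mean photon energy is proportional to the temperature, so for equal incoming and outgoing power the number of photons, and with it the entropy carried, scales as $1/T$. The sketch below is a back-of-envelope illustration only: it uses the 6000 K solar surface temperature quoted above and assumes an emission temperature of 255 K for the Earth (the radiative temperature discussed in Chapter 37); the function name is illustrative.

```python
T_SUN = 6000.0    # K, solar surface temperature quoted in the text
T_EARTH = 255.0   # K, ASSUMED radiative temperature of the Earth (Chapter 37)

def photons_out_per_photon_in(t_in=T_SUN, t_out=T_EARTH):
    """Ratio of outgoing to incoming photon numbers at equal power.

    Mean photon energy in thermal radiation scales with temperature,
    so the photon count for a fixed power scales as 1/T.
    """
    return t_in / t_out

ratio = photons_out_per_photon_in()
print(f"about {ratio:.0f} infrared photons leave for every solar photon absorbed")
```

On this estimate the Earth exports roughly twenty times as much entropy in radiation as it receives, which is the budget available to drive non-equilibrium processes on the planet.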

Chapter summary

• Electron degeneracy pressure is proportional to $\rho^{5/3}$ for non-relativistic electrons and to $\rho^{4/3}$ for relativistic electrons. In the former case, it can balance the gravitational pressure, which is proportional to $\rho^{4/3}$.
• A white dwarf is stable up to $1.4 M_\odot$ and is supported by electron degeneracy pressure. Its radius $R$ depends on mass $M$ as $R \propto M^{-1/3}$.
• For $1.4 M_\odot < M \lesssim 5 M_\odot$, electrons behave relativistically, but a star can be supported by neutron degeneracy pressure, resulting in the formation of a neutron star. These are very compact and rotate with a period $\propto R^{3/2} M^{-1/2}$.
• The Schwarzschild radius $R_S$ of a black hole is $2GM/c^2$.
• The maximum accretion rate for spherically symmetric accretion is given by the Eddington limit $\dot{M}_{\rm edd} = 4\pi c R/\kappa$.
• A black hole has entropy $S/k_B = A/(4 l_P^2)$ so that one bit of information can be associated with four Planck areas.

Further reading

More information may be found in Carroll & Ostlie (1996), Cheng (2005), Prialnik (2000), Perkins (2003) and Zeilik & Gregory (1998).

Exercises

(36.1) Show that for a white dwarf $MV$ is a constant.

(36.2) Estimate the radius of a white dwarf with mass $M_\odot$.

(36.3) Estimate the radius of a neutron star with mass $2 M_\odot$ and calculate its minimum rotation period.

(36.4) What is the Schwarzschild radius of a black hole with mass (i) $10 M_\odot$, (ii) $10^8 M_\odot$ and (iii) $10^{-8} M_\odot$?

(36.5) For a black hole of mass $100 M_\odot$, estimate the Schwarzschild radius, the Hawking temperature and the entropy.

37

Earth’s atmosphere

The atmosphere is the layer of gases gravitationally bound to the Earth, composed of ∼78% N$_2$, 21% O$_2$ and very small amounts of other gases. The Earth has radius $R_\oplus = 6378$ km, and atmospheric pressure at sea-level is $p = 10^5$ Pa, and hence the mass $M_{\rm atmos}$ of the atmosphere is given by
\[ M_{\rm atmos} = \frac{4\pi R_\oplus^2 p}{g} = 5 \times 10^{18}\ \text{kg}. \tag{37.1} \]
Thus, $M_{\rm atmos}/M_\oplus \sim 10^{-6}$, where $M_\oplus$ is the mass of the Earth$^1$. The atmosphere is able to exchange thermal energy with the ocean (whose mass, ≈ $10^{21}$ kg, is considerably larger than that of the atmosphere) and also with space (absorbing ultraviolet and visible radiation from the Sun, and emitting infrared radiation). In this chapter, we will examine briefly a few of the thermodynamic properties of the atmosphere. More details on all of these issues may be found in the further reading at the end of the chapter.

1 The mass of the Earth is $M_\oplus = 5.97 \times 10^{24}$ kg.
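Equation 37.1 simply says that the weight of the atmosphere, spread over the Earth's surface, equals the sea-level pressure. A minimal sketch of the arithmetic follows; the value $g = 9.81$ m s$^{-2}$ is an assumed standard value not quoted in the text, and the function name is illustrative.

```python
import math

R_EARTH = 6.378e6    # m, Earth radius from the text
P_SURFACE = 1.0e5    # Pa, sea-level pressure from the text
G_SURFACE = 9.81     # m s^-2, ASSUMED standard surface gravity
M_EARTH = 5.97e24    # kg, Earth mass from the text

def atmosphere_mass(radius=R_EARTH, pressure=P_SURFACE, gravity=G_SURFACE):
    """Mass whose weight over the whole surface gives the surface pressure (eqn 37.1)."""
    return 4 * math.pi * radius**2 * pressure / gravity

m_atmos = atmosphere_mass()
print(f"M_atmos ~ {m_atmos:.1e} kg, M_atmos / M_earth ~ {m_atmos / M_EARTH:.1e}")
```

The estimate treats $g$ as constant over the depth of the atmosphere, which is reasonable because the atmosphere is thin compared with the Earth's radius.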

37.1

Solar energy

Energy is continuously pumped into the atmosphere by the Sun. The luminosity of the Sun is $L_\odot = 3.83 \times 10^{26}$ W and can be related to the Sun’s effective surface temperature $T_\odot$ via
\[ L_\odot = 4\pi R_\odot^2 \sigma T_\odot^4, \tag{37.2} \]
where $R_\odot = 6.96 \times 10^8$ m is the solar radius. This gives $T_\odot \approx 5800$ K. The power incident on unit area on the equator of the Earth at a distance from the Sun equal to one astronomical unit (approximately the Earth–Sun distance, equal to $1.496 \times 10^{11}$ m) is
\[ S = \frac{L_\odot}{4\pi R_{\rm ES}^2} = 1.36\ \text{kW m}^{-2}, \tag{37.3} \]
and is called the solar constant. The Earth absorbs energy at a rate $\pi R_\oplus^2 S (1 - A)$, where $A \approx 0.31$ is the Earth’s albedo, defined as the fraction of solar radiation reflected. The Earth emits radiation at a rate given by $4\pi R_\oplus^2 \sigma T_E^4$, where $T_E$ is the radiative temperature of the Earth, sometimes called the radiometric temperature of the Earth. Balancing the power absorbed with the power emitted yields
\[ \pi R_\oplus^2 S (1 - A) = 4\pi R_\oplus^2 \sigma T_E^4, \tag{37.4} \]
and hence
\[ T_E = T_\odot \left( \frac{R_\odot}{2 R_{\rm ES}} \right)^{1/2} (1 - A)^{1/4}, \tag{37.5} \]
and this leads to $T_E \approx 255$ K, which is ∼ −20°C. This is much lower than the mean surface temperature, which is ∼ 283 K. This is because most of the thermal radiation into space comes from high up in the atmosphere, where the temperature is lower than it is at the surface.

Fig. 37.1 Schematic illustration of (a) the solar power received on the Earth’s surface and (b) the power radiated from the Earth as a result of its illumination by the Sun.
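The chain of results in eqns 37.2 to 37.5 can be reproduced in a few lines. This is a minimal sketch using the values quoted in the text plus a rounded Stefan–Boltzmann constant (an assumed standard value); the function names are illustrative.

```python
import math

SIGMA = 5.670e-8   # W m^-2 K^-4, Stefan-Boltzmann constant (rounded, assumed)
L_SUN = 3.83e26    # W, solar luminosity from the text
R_SUN = 6.96e8     # m, solar radius from the text
R_ES = 1.496e11    # m, Earth-Sun distance from the text
ALBEDO = 0.31      # Earth's albedo from the text

def solar_temperature():
    """Effective surface temperature of the Sun from eqn 37.2."""
    return (L_SUN / (4 * math.pi * R_SUN**2 * SIGMA)) ** 0.25

def solar_constant():
    """Power per unit area at the Earth's distance, eqn 37.3."""
    return L_SUN / (4 * math.pi * R_ES**2)

def radiative_temperature(albedo=ALBEDO):
    """Radiative temperature of the Earth from the balance in eqn 37.4."""
    return (solar_constant() * (1 - albedo) / (4 * SIGMA)) ** 0.25

t_sun = solar_temperature()
s = solar_constant()
t_earth = radiative_temperature()
print(f"T_sun ~ {t_sun:.0f} K, S ~ {s:.0f} W m^-2, T_E ~ {t_earth:.0f} K")
```

Solving eqn 37.4 directly, as done here, gives the same value as the closed form in eqn 37.5, and setting the albedo to zero shows how much of the result is due to reflection alone.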

Example 37.1 How large a solar panel do you need to drive a television (which needs 100 W to run) on a sunny day, assuming that the solar panel operates at 15% efficiency? Solution: Assuming that you have the full $S = 1.36$ kW m$^{-2}$ at your disposal, the area needed is

$$\frac{100\ \mathrm{W}}{0.15 \times 1.36 \times 10^{3}\ \mathrm{W\,m^{-2}}} \approx 0.5\ \mathrm{m^2}. \tag{37.6}$$
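The arithmetic of Example 37.1 generalizes to any load and panel efficiency. The sketch below assumes, as the example does, that the full solar constant is available at the panel; `panel_area` is a hypothetical helper name introduced here for illustration.

```python
def panel_area(power_w, efficiency, irradiance=1.36e3):
    """Panel area in m^2 needed to deliver power_w watts.

    irradiance is the incident solar power per unit area in W m^-2;
    the default is the full solar constant S, as assumed in Example 37.1.
    """
    if not 0 < efficiency <= 1:
        raise ValueError("efficiency must lie in (0, 1]")
    return power_w / (efficiency * irradiance)


print(f"{panel_area(100.0, 0.15):.2f} m^2")
```

In practice the irradiance reaching the ground is lower than $S$, so passing a smaller `irradiance` gives a correspondingly larger area.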

37.2 The temperature profile in the atmosphere

In this section we wish to derive the dependence of the temperature $T$ on height $z$ above the ground. In the lowest region of the atmosphere, the temperature profile is governed by the adiabatic lapse rate (see Section 12.4), whose derivation we will briefly review. Consider a fixed mass of dry air which retains its identity as it rises. If it does not exchange heat with its surroundings ($\text{đ}Q = 0$) it can be treated adiabatically. Its change of enthalpy $dH$ is given by

$$dH = C_p\,dT = \text{đ}Q + V\,dp, \tag{37.7}$$

and hence

$$C_p\,dT = V\,dp. \tag{37.8}$$

Pressure $p$ can be related to height $z$ using the hydrostatic equation which we met in eqn 4.23,

$$dp = -\rho g\,dz, \tag{37.9}$$

and this leads to

$$\frac{dT}{dz} = -\frac{\rho g V}{C_p} = -\frac{g}{c_p} \equiv -\Gamma, \tag{37.10}$$

where $c_p = C_p/\rho V$ is the specific heat capacity of dry air at constant pressure. We define $\Gamma = g/c_p$ to be the adiabatic lapse rate.

Considerable heat transfer takes place within the lowest $\sim 10$ km of the atmosphere, which is termed the troposphere. Air is warmed by contact with the Earth's surface and absorption of solar energy. The heating of the air drives the temperature gradient $|dT/dz|$ to be larger than $|\Gamma|$, making the air unstable to convection. When the temperature gradient vertically upwards from the Earth becomes too great (so that air at low altitudes is too warm and air higher up is too cool) then convection will take place, just as we learned it takes place within the interior of stars (Section 35.3.2). As the air rises into lower-pressure regions, it cools owing to adiabatic expansion. This instability to convection is why this region of the atmosphere is termed the troposphere (the name comes from the Greek tropos, meaning 'turning'). Moreover, if the temperature gradient as a function of latitude is similarly too great, when combined with the Coriolis$^2$ force due to the rotation of Earth, the atmosphere exhibits baroclinic instability, giving rise to cyclones and anticyclones which can transport considerable energy between the equator and the poles.

2 The Coriolis force arises because the Earth is rotating. A description of this may be found in Andrews (2000) and in books on mechanics.

At the top of the troposphere, there is an interface region called the tropopause, where there is no convection. Vertically above this is the next layer, called the stratosphere, and in the lowest part of this layer temperature is often invariant with height $z$ (see Fig. 37.2). The atmosphere becomes 'stratified' into layers which tend not to move up or down, but just hang there ('in much the same way that bricks don't', to borrow a phrase from Douglas Adams). The stratosphere is 'optically thin' and hence absorbs little energy from the incoming solar radiation.
If the stratosphere has absorptivity $\epsilon$, it will absorb energy radiated at infrared wavelengths from the Earth's surface at the rate $\epsilon\sigma T_E^4$ per unit area, where $T_E$ is the effective radiative temperature of the Earth (including the troposphere). If the temperature of the stratosphere is $T_{\rm strat}$, it will emit (mainly infrared) radiation at a rate $\epsilon\sigma T_{\rm strat}^4$ from its upper surface and $\epsilon\sigma T_{\rm strat}^4$ from its lower surface, i.e. at a total rate of $2\epsilon\sigma T_{\rm strat}^4$. Balancing the absorbed and emitted power gives

$$T_{\rm strat} = \frac{T_E}{2^{1/4}}. \tag{37.11}$$

The effective radiative temperature of the Earth is $\approx 255$ K, and this yields $T_{\rm strat} \sim 214$ K, not far from what is observed.
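Both estimates in this section, the adiabatic lapse rate of eqn 37.10 and the stratospheric temperature of eqn 37.11, are easy to evaluate numerically. The sketch below is illustrative only: the values of $g$ and of $c_p$ for dry air are assumed typical values that are not quoted in the text, and $T_E = 255$ K is taken from Section 37.1.

```python
G = 9.81             # gravitational acceleration in m s^-2 (assumed value)
CP_DRY_AIR = 1005.0  # specific heat capacity of dry air in J kg^-1 K^-1 (assumed value)


def adiabatic_lapse_rate(g=G, cp=CP_DRY_AIR):
    """Gamma = g / c_p (eqn 37.10), in kelvin per metre."""
    return g / cp


def stratosphere_temperature(t_earth):
    """T_strat = T_E / 2**(1/4) (eqn 37.11)."""
    return t_earth / 2 ** 0.25


print(f"Gamma   = {1e3 * adiabatic_lapse_rate():.1f} K per km")
print(f"T_strat = {stratosphere_temperature(255.0):.0f} K")
```

With these assumed inputs the dry adiabatic lapse rate comes out a little under 10 K per kilometre of height.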

At higher altitudes in the stratosphere, the temperature starts rising with increasing height, owing to absorption of ultraviolet radiation in the ozone$^3$ layer, reaching around 270 K. At about 50 km is the stratopause, which is the interface between the stratosphere and the mesosphere. In the mesosphere, the temperature falls again owing to the absence of ozone, bottoming out below 200 K at $\sim 90$ km, roughly the location of the mesopause. Above this is the thermosphere, where the temperature rises very high (to above 1000 °C) owing to very energetic solar photons and cosmic ray particles which cause dissociation of molecules in the upper atmosphere.

Fig. 37.2 Diagrammatic form of a very simple model of the troposphere and the stratosphere (axes: height $z$ in km and temperature $T$; $T_{\rm strat}$ and $T_E$ are marked). For real data, see Taylor (2005).

3 Ozone is the name given to the O$_3$ molecule.

37.3 The greenhouse effect

The different molecules which are found in the atmosphere respond differently to incident radiation from the Earth, which is at infrared wavelengths (see Fig. 37.3). The main constituents of air are N$_2$ and O$_2$. Both these molecules are composed of two identical atoms and are termed diatomic homonuclear molecules. They do not couple directly to infrared radiation because they can only stretch along the bond, and such vibrations do not produce a dipole moment.$^4$ However, for heteronuclear molecules like CO$_2$, the situation is different. Two of the vibrational modes of CO$_2$, which is a linear molecule, are the asymmetric stretch mode (at $\sim 5$ µm) and the bending mode (at $\sim 15$–$20$ µm). These are both infrared active because they correspond to a change in dipole moment when the vibration takes place. The symmetric stretch mode is not infrared active. Water (H$_2$O) behaves similarly to CO$_2$, but because H$_2$O is a bent molecule with a permanent dipole moment, all three normal modes of vibration are infrared active, although the symmetric stretch and bending modes are at high frequencies ($< 3$ µm). The antisymmetric stretch mode (at $\sim 3$ µm) is relevant to atmospheric absorption. These vibrational modes are sketched in Fig. 37.3.

4 A molecule is said to have a dipole moment if there is charge separation across the molecule. Dipole moment is a vector quantity, and if two charges $+q$ and $-q$ are separated by a distance $D$ then it takes the value $qD$ in the direction from the negative charge towards the positive charge. A molecule can possess a permanent dipole moment, or have one induced by a vibrational mode.

Fig. 37.3 This graph shows a black body spectrum at 255 K analogous to radiation emitted from the Earth. Shown above are cartoons of relevant normal modes of the CO$_2$ and H$_2$O molecules. The grey vertical arrows indicate the relevant vibrational wavelengths.

The strong infrared absorption of gases like CO$_2$ and H$_2$O (but not N$_2$ and O$_2$) gives rise to the greenhouse effect.$^5$ This effect depends on very small concentrations of these heteronuclear molecules, or greenhouse gases, in the atmosphere. Greenhouse gases are capable of absorbing radiation emitted by the Earth, and produce strong absorption in the emitted spectrum as shown in Fig. 37.4. What this means is that the radiation at these wavelengths, which would pass out of the atmosphere in the absence of the greenhouse gases, is retained in the atmosphere: the greenhouse gases act as a 'winter coat' at these particular wavelengths and increase the temperature at the Earth's surface. To some extent, of course, this is a good thing as this planet would be a very cold place without any of the winter coat effect of H$_2$O, CO$_2$ and the other greenhouse gases. However, too much winter coat when it is not needed will elevate the temperature of the planet, with potentially disastrous consequences. There is now much accumulated evidence from independent measurements that the concentrations of greenhouse gases, and in particular CO$_2$, are changing$^6$ as a result of human activity and, further, are giving rise to global warming (the elevation of the temperature of Earth's atmosphere, see Fig. 37.5) and consequent climate change. This is termed anthropogenic climate change, with anthropogenic meaning 'having its origins in the activities of humans'.

5 The term 'greenhouse effect' was coined in 1827 by Jean Baptiste Joseph Fourier, whom we met on page 101.

6 CO$_2$ levels are rising at a rate unprecedented in the last 20 million years.

Fig. 37.4 Thermal radiation in the infrared emitted from the Earth's surface and atmosphere (see Fig. 37.1) as observed over the Mediterranean Sea from the Nimbus 4 satellite by Hanel et al. (1971). The atmosphere is not transparent around 9.5 microns or around 15 microns owing to absorption by ozone (O$_3$) and by CO$_2$ respectively. This figure is reproduced from Houghton (2005).

Fig. 37.5 Variations in the globally averaged near-surface air temperature over the last 40 years; reproduced by kind permission of P. Jones of the Climate Research Unit, University of East Anglia.

When one considers that over the last few hundred years, since the industrial revolution, we have released into the atmosphere the carbon stored in fossil fuels which were laid down over a few hundred million years, it is perhaps unsurprising that the small changes in the chemical composition of the Earth's atmosphere this brings can have considerable influence on climate. The immense heat capacity of the oceans means that the full consequences of global warming and consequent climate change do not instantly become apparent.$^7$ Already, however, these are significant, as can be seen in Fig. 37.5, which shows measurements of globally averaged temperatures since 1861.

Predictions of global warming are complicated because this is a multi-parameter problem which depends on boundary conditions which themselves cannot be known exactly; for example, the details of cloud cover cannot be predicted accurately. Clouds play a part in the Earth's radiation balance because they reflect some of the incident radiation from the Sun, but they also absorb and emit thermal radiation and have the same winter coat insulating effect as greenhouse gases. In addition, the presence of water vapour (water in gaseous form as distinct from water droplets in a cloud) plays an important role as a greenhouse gas. Furthermore, as the atmosphere heats up, it can hold more water vapour before it begins to condense out as liquid droplets.$^8$ This increased capacity further increases the winter coat effect. The increasing presence of CO$_2$ in the atmosphere thus gives rise to a positive feedback mechanism: as the global temperature rises, the atmosphere can hold a greater amount of H$_2$O before saturation and precipitation$^9$ are reached. This leads to an even larger greenhouse effect from atmospheric H$_2$O. These and other feedback effects will influence the future of this planet. There is a competition between positive feedback effects (i.e. warming triggering further warming; for example, as ice cover on the planet is reduced, more land is exposed which, being less reflective, absorbs heat more quickly) and negative feedback effects (e.g. higher temperatures will tend to promote the growth rate of plants and trees, which will increase their intake of CO$_2$), but it seems that positive feedback effects have a much greater effect.

7 This is explored more in Exercise 37.3.

8 It is helpful to consider the capacity of the atmosphere to 'hold' increasing water vapour as its temperature increases in terms of the phase diagram for water (shown in Fig. 28.7): on the phase boundary between gas and liquid, $p$ increases with increasing $T$. Here, $p$ should be interpreted as the partial pressure of water vapour, i.e. a measure of how much water vapour is present. Fig. 28.7 shows that as temperature increases, a larger partial pressure of water vapour can be attained before condensation takes place.

9 Precipitation is any form of water that falls from the sky, such as rain, snow, sleet and hail.

It is also difficult to forecast accurately future trends in the world's human population, especially in developing countries that are, in addition, becoming increasingly industrialized. It is also difficult to make precise predictions about the economies of developed and developing worlds and their reliance on fossil fuels rather than carbon neutral$^{10}$ energy supplies. Figure 37.6 gives some sense of the uncertainty in future CO$_2$ production: the width of each bar represents the population of each nation (or group of nations) in millions and the height represents the CO$_2$ emission per capita. Both the rate of change of width and the rate of change of height of each bar are uncertain, but it seems highly likely that increasing population and increasing industrialization will lead to increasing CO$_2$ production worldwide. However, although many of these factors are uncertain, a very wide range of plausible input models for global warming (covering extremes from a world with a continuously increasing population to one which has emphasis on local solutions to economic, social and environmental sustainability) predict a temperature rise of at least two degrees in 2100 compared with that in the first half of the twentieth century (see Fig. 37.7).

Fig. 37.6 CO$_2$ emissions in 2000 from different countries or groups of countries in tonnes per capita versus their population in millions. Data from Grubb (2003).

10 Carbon neutral (sometimes referred to as 'zero carbon') means that there is no net input of CO$_2$ into the atmosphere as a result of that particular energy supply.

Fig. 37.7 Predictions of global warming from a wide range of diﬀerent input models. Data from Cubasch et al. (2001) and ﬁgure from Houghton (2005).

Some consequences of global warming are already apparent: at the time of writing we have observed a 0.6 °C rise in the annual average global temperature and a 1.8 °C rise in the average Arctic temperature; 90% of the planet's glaciers have been retreating since 1850; and Arctic sea-ice has reduced by 15–20%. One of the predicted consequences of global warming is a rise of 2 °C in the average global temperature (see Fig. 37.7) by the second half of this century. This will promote the melting of the Greenland ice and cause sea water to expand. Both effects will lead to a significant rise in sea level and consequent reduction of habitable land on the planet.


Chapter summary

• Earth receives about 1.4 kW m$^{-2}$ from the Sun as radiation.
• The presence of some CO$_2$ molecules in the atmosphere keeps Earth from being a much colder place to inhabit.
• The CO$_2$ concentrations in the atmosphere have increased significantly since the industrial revolution.
• Increasing CO$_2$ in the atmosphere catalyses the increasing temperature of the atmosphere, by promoting the presence in the atmosphere of another greenhouse gas, H$_2$O vapour.
• Although there are considerable uncertainties in the time scales over which global warming will take place, it seems hard to avoid the conclusion that significant and devastating global warming has begun.

Further reading

• The Intergovernmental Panel on Climate Change: http://www.ipcc.ch
• The Climate Research Unit: http://www.cru.uea.ac.uk
• Climate prediction for everyone: http://www.climateprediction.net
• Useful background reading and an introduction to the physics of atmospheres may be found in Andrews (2000), Taylor (2005) and Houghton (2005).

Exercises

(37.1) What is the power per unit area of Earth received from the Sun, averaged over a year, (a) on the equator, (b) at 35° latitude and (c) over all the Earth?
(37.2) Given the mass of Earth's atmosphere at the start of this chapter, estimate its heat capacity.
(37.3) Given the mass of Earth's ocean at the start of this chapter, find its heat capacity. Compare your answer with that for the previous question.
(37.4) Suppose that the Earth did not have any atmosphere. Neglecting any thermal conduction between the oceans and the land, estimate how long it would take for the power from the Sun to bring the ocean to the boil. State any further assumptions that you make.
(37.5) The total rate of energy consumption at the start of the 21st century is about 13 TW ($13 \times 10^{12}$ W). If the efficiency of a solar panel is 15%, what area of land would you need to cover with solar panels (a) at the equator and (b) at 35° latitude, to supply the energy needs for the Earth's population?

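A Fundamental constants

Quantity                              Symbol                                   Value
Bohr radius                           $a_0$                                    $5.292 \times 10^{-11}$ m
speed of light in free space          $c$                                      $2.9979 \times 10^{8}$ m s$^{-1}$
electronic charge                     $e$                                      $1.6022 \times 10^{-19}$ C
Planck constant                       $h$                                      $6.626 \times 10^{-34}$ J s
                                      $h/2\pi = \hbar$                         $1.0546 \times 10^{-34}$ J s
Boltzmann constant                    $k_{\rm B}$                              $1.3807 \times 10^{-23}$ J K$^{-1}$
electron rest mass                    $m_e$                                    $9.109 \times 10^{-31}$ kg
proton rest mass                      $m_p$                                    $1.6726 \times 10^{-27}$ kg
Avogadro number                       $N_{\rm A}$                              $6.022 \times 10^{23}$ mol$^{-1}$
standard molar volume                                                          $22.414 \times 10^{-3}$ m$^3$ mol$^{-1}$
molar gas constant                    $R$                                      $8.315$ J mol$^{-1}$ K$^{-1}$
fine structure constant               $\alpha = e^2/(4\pi\epsilon_0\hbar c)$   $(137.04)^{-1}$
permittivity of free space            $\epsilon_0$                             $8.854 \times 10^{-12}$ F m$^{-1}$
magnetic permeability of free space   $\mu_0$                                  $4\pi \times 10^{-7}$ H m$^{-1}$
Bohr magneton                         $\mu_{\rm B}$                            $9.274 \times 10^{-24}$ A m$^2$ or J T$^{-1}$
nuclear magneton                      $\mu_{\rm N}$                            $5.051 \times 10^{-27}$ A m$^2$ or J T$^{-1}$
neutron magnetic moment               $\mu_n$                                  $-1.9130\,\mu_{\rm N}$
proton magnetic moment                $\mu_p$                                  $2.7928\,\mu_{\rm N}$
Rydberg constant                      $R_\infty$                               $1.0974 \times 10^{7}$ m$^{-1}$
                                      $R_\infty hc$                            $13.606$ eV
Stefan constant                       $\sigma$                                 $5.671 \times 10^{-8}$ W m$^{-2}$ K$^{-4}$
gravitational constant                $G$                                      $6.673 \times 10^{-11}$ N m$^2$ kg$^{-2}$
mass of the Sun                       $M_\odot$                                $1.99 \times 10^{30}$ kg
mass of the Earth                     $M_\oplus$                               $5.97 \times 10^{24}$ kg
radius of the Sun                     $R_\odot$                                $6.96 \times 10^{8}$ m
radius of the Earth                   $R_\oplus$                               $6.378 \times 10^{6}$ m
1 astronomical unit                                                            $1.496 \times 10^{11}$ m
1 light year                                                                   $9.460 \times 10^{15}$ m
1 parsec                                                                       $3.086 \times 10^{16}$ m
Planck length                         $l_{\rm P} = \sqrt{\hbar G/c^3}$         $1.616 \times 10^{-35}$ m
Planck mass                           $m_{\rm P} = \sqrt{\hbar c/G}$           $2.176 \times 10^{-8}$ kg
Planck time                           $t_{\rm P} = l_{\rm P}/c$                $5.391 \times 10^{-44}$ s

B Useful formulae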

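(1) Trigonometry:

$$e^{i\theta} = \cos\theta + i\sin\theta$$
$$\sin\theta = \frac{e^{i\theta} - e^{-i\theta}}{2i}, \qquad \cos\theta = \frac{e^{i\theta} + e^{-i\theta}}{2}$$
$$\sin(\theta + \phi) = \sin\theta\cos\phi + \cos\theta\sin\phi$$
$$\cos(\theta + \phi) = \cos\theta\cos\phi - \sin\theta\sin\phi$$
$$\tan\theta = \sin\theta/\cos\theta$$
$$\cos^2\theta + \sin^2\theta = 1$$
$$\cos 2\theta = \cos^2\theta - \sin^2\theta$$
$$\sin 2\theta = 2\cos\theta\sin\theta$$

(2) Hyperbolics:

$$\sinh x = \frac{e^{x} - e^{-x}}{2}, \qquad \cosh x = \frac{e^{x} + e^{-x}}{2}$$
$$\cosh^2 x - \sinh^2 x = 1$$
$$\cosh 2x = \cosh^2 x + \sinh^2 x$$
$$\sinh 2x = 2\cosh x\sinh x$$
$$\tanh x = \sinh x/\cosh x$$

(3) Logarithms:

$$\log_b(xy) = \log_b(x) + \log_b(y)$$
$$\log_b(x/y) = \log_b(x) - \log_b(y)$$
$$\log_b(x) = \frac{\log_k(x)}{\log_k(b)}$$
$$\ln(x) \equiv \log_e(x) \quad \text{where } e = 2.71828182846\ldots$$

(4) Geometric progression:

N-term series:
$$a + ar + ar^2 + \cdots + ar^{N-1} = a\sum_{n=0}^{N-1} r^n = \frac{a(1 - r^N)}{1 - r}.$$

∞-term series:
$$a + ar + ar^2 + \cdots = a\sum_{n=0}^{\infty} r^n = \frac{a}{1 - r}.$$

(5) Taylor and Maclaurin series:

A Taylor series of a real function $f(x)$ about a point $x = a$ is given by
$$f(x) = f(a) + (x - a)\left.\frac{df}{dx}\right|_{x=a} + \frac{(x - a)^2}{2!}\left.\frac{d^2f}{dx^2}\right|_{x=a} + \cdots$$
If $a = 0$, the expansion is a Maclaurin series
$$f(x) = f(0) + x\left.\frac{df}{dx}\right|_{x=0} + \frac{x^2}{2!}\left.\frac{d^2f}{dx^2}\right|_{x=0} + \cdots$$

(6) Some Maclaurin series (valid for $|x| < 1$):

$$(1 + x)^n = 1 + nx + \frac{n(n-1)}{2!}x^2 + \frac{n(n-1)(n-2)}{3!}x^3 + \cdots$$
$$(1 - x)^{-1} = 1 + x + x^2 + x^3 + \cdots$$
$$e^x = 1 + x + \frac{x^2}{2!} + \frac{x^3}{3!} + \frac{x^4}{4!} + \cdots$$
$$\sin x = x - \frac{x^3}{3!} + \frac{x^5}{5!} - \cdots$$
$$\cos x = 1 - \frac{x^2}{2!} + \frac{x^4}{4!} - \cdots$$
$$\tan x = x + \frac{x^3}{3} + \frac{2x^5}{15} + \cdots$$
$$\tanh x = x - \frac{x^3}{3} + \frac{2x^5}{15} - \cdots$$
$$\tanh^{-1} x = x + \frac{x^3}{3} + \frac{x^5}{5} + \frac{x^7}{7} + \cdots$$
$$\ln(1 + x) = x - \frac{x^2}{2} + \frac{x^3}{3} - \cdots$$

(7) Integrals:

Indefinite (with $a > 0$):
$$\int \frac{dx}{x^2 + a^2} = \frac{1}{a}\tan^{-1}\frac{x}{a}$$
$$\int \frac{dx}{x^2 - a^2} = \frac{1}{2a}\ln\left|\frac{x - a}{x + a}\right|$$
$$\int \frac{dx}{\sqrt{x^2 + a^2}} = \sinh^{-1}\frac{x}{a}$$
$$\int \frac{dx}{\sqrt{x^2 - a^2}} = \begin{cases} \cosh^{-1}\dfrac{x}{a} & \text{if } x > a \\[2ex] -\cosh^{-1}\left|\dfrac{x}{a}\right| & \text{if } x < -a \end{cases}$$
$$\int \frac{dx}{\sqrt{a^2 - x^2}} = \sin^{-1}\frac{x}{a}$$

(8) Vector operators:

• grad acts on a scalar field to produce a vector field:
$$\operatorname{grad}\phi = \nabla\phi = \left(\frac{\partial\phi}{\partial x}, \frac{\partial\phi}{\partial y}, \frac{\partial\phi}{\partial z}\right)$$
• div acts on a vector field to produce a scalar field:
$$\operatorname{div}\mathbf{A} = \nabla\cdot\mathbf{A} = \frac{\partial A_x}{\partial x} + \frac{\partial A_y}{\partial y} + \frac{\partial A_z}{\partial z}$$
• curl acts on a vector field to produce another vector field:
$$\operatorname{curl}\mathbf{A} = \nabla\times\mathbf{A} = \begin{vmatrix} \mathbf{i} & \mathbf{j} & \mathbf{k} \\ \partial/\partial x & \partial/\partial y & \partial/\partial z \\ A_x & A_y & A_z \end{vmatrix}$$

where $\phi(\mathbf{r})$ and $\mathbf{A}(\mathbf{r})$ are any given scalar and vector field respectively.

(9) Vector identities:

$$\nabla\cdot(\nabla\phi) = \nabla^2\phi$$
$$\nabla\times(\nabla\phi) = 0$$
$$\nabla\cdot(\nabla\times\mathbf{A}) = 0$$
$$\nabla\cdot(\phi\mathbf{A}) = \mathbf{A}\cdot\nabla\phi + \phi\,\nabla\cdot\mathbf{A}$$
$$\nabla\times(\phi\mathbf{A}) = \phi\,\nabla\times\mathbf{A} - \mathbf{A}\times\nabla\phi$$
$$\nabla\times(\nabla\times\mathbf{A}) = \nabla(\nabla\cdot\mathbf{A}) - \nabla^2\mathbf{A}$$
$$\nabla\cdot(\mathbf{A}\times\mathbf{B}) = \mathbf{B}\cdot\nabla\times\mathbf{A} - \mathbf{A}\cdot\nabla\times\mathbf{B}$$
$$\nabla(\mathbf{A}\cdot\mathbf{B}) = (\mathbf{A}\cdot\nabla)\mathbf{B} + (\mathbf{B}\cdot\nabla)\mathbf{A} + \mathbf{A}\times(\nabla\times\mathbf{B}) + \mathbf{B}\times(\nabla\times\mathbf{A})$$
$$\nabla\times(\mathbf{A}\times\mathbf{B}) = (\mathbf{B}\cdot\nabla)\mathbf{A} - (\mathbf{A}\cdot\nabla)\mathbf{B} + \mathbf{A}(\nabla\cdot\mathbf{B}) - \mathbf{B}(\nabla\cdot\mathbf{A})$$

These identities can be easily proved by application of the alternating tensor and use of the summation convention. The alternating tensor $\epsilon_{ijk}$ is defined according to:
$$\epsilon_{ijk} = \begin{cases} 1 & \text{if } ijk \text{ is an even permutation of } 123 \\ -1 & \text{if } ijk \text{ is an odd permutation of } 123 \\ 0 & \text{if any two of } i, j \text{ or } k \text{ are equal} \end{cases}$$
so that the vector product can be written
$$(\mathbf{A}\times\mathbf{B})_i = \epsilon_{ijk}A_jB_k.$$
The summation convention is used here, so that twice repeated indices are assumed summed. The scalar product is then
$$\mathbf{A}\cdot\mathbf{B} = A_iB_i.$$
Use can be made of the identity
$$\epsilon_{ijk}\epsilon_{ilm} = \delta_{jl}\delta_{km} - \delta_{jm}\delta_{kl},$$
where $\delta_{ij}$ is the Kronecker delta given by
$$\delta_{ij} = \begin{cases} 1 & i = j \\ 0 & i \neq j \end{cases}$$
The vector triple product is given by
$$\mathbf{A}\times(\mathbf{B}\times\mathbf{C}) = (\mathbf{A}\cdot\mathbf{C})\mathbf{B} - (\mathbf{A}\cdot\mathbf{B})\mathbf{C}.$$

(10) Cylindrical coordinates (the azimuthal angle is written $\varphi$ here to distinguish it from the scalar field $\phi$):

$$\nabla^2\phi = \frac{1}{r}\frac{\partial}{\partial r}\left(r\frac{\partial\phi}{\partial r}\right) + \frac{1}{r^2}\frac{\partial^2\phi}{\partial\varphi^2} + \frac{\partial^2\phi}{\partial z^2}$$
$$\nabla\phi = \left(\frac{\partial\phi}{\partial r}, \frac{1}{r}\frac{\partial\phi}{\partial\varphi}, \frac{\partial\phi}{\partial z}\right)$$

(11) Spherical polar coordinates (again with azimuthal angle $\varphi$):

$$\nabla^2\phi = \frac{1}{r^2}\frac{\partial}{\partial r}\left(r^2\frac{\partial\phi}{\partial r}\right) + \frac{1}{r^2\sin\theta}\frac{\partial}{\partial\theta}\left(\sin\theta\frac{\partial\phi}{\partial\theta}\right) + \frac{1}{r^2\sin^2\theta}\frac{\partial^2\phi}{\partial\varphi^2}$$
$$\nabla\phi = \left(\frac{\partial\phi}{\partial r}, \frac{1}{r}\frac{\partial\phi}{\partial\theta}, \frac{1}{r\sin\theta}\frac{\partial\phi}{\partial\varphi}\right)$$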

C Useful mathematics

C.1 The factorial integral
C.2 The Gaussian integral
C.3 Stirling's formula
C.4 Riemann zeta function
C.5 The polylogarithm
C.6 Partial derivatives
C.7 Exact differentials
C.8 Volume of a hypersphere
C.9 Jacobians
C.10 The Dirac delta function
C.11 Fourier transforms
C.12 Solution of the diffusion equation
C.13 Lagrange multipliers

C.1 The factorial integral

One of the most useful integrals in thermodynamics problems is the following one (which is worth memorizing):

$$n! = \int_0^\infty x^n e^{-x}\,dx \tag{C.1}$$

• This integral is simple to prove by induction as follows: First, show that it is true for the case $n = 0$. Then assume it is true for $n = k$ and prove it is true for $n = k + 1$. (Hint: integrate $(k + 1)! = \int_0^\infty x^{k+1} e^{-x}\,dx$ by parts.)

• It allows you to define the factorial of non-integer numbers. This is so useful that the integral is given a special name, the gamma function. The traditional definition of the gamma function is

$$\Gamma(n) = \int_0^\infty x^{n-1} e^{-x}\,dx \tag{C.2}$$

so that $\Gamma(n) = (n - 1)!$, i.e. the factorial function and the gamma function are 'out of step' with each other, a rather confusing feature. The gamma function is plotted in Fig. C.1 and has a surprisingly complicated structure for negative $n$. Selected values of the gamma function are listed in Table C.1. The gamma function will appear again in later integrals.

$z$        $\Gamma(z)$
$-3/2$     $4\sqrt{\pi}/3$
$-1/2$     $-2\sqrt{\pi}$
$1/2$      $\sqrt{\pi}$
$1$        $1$
$3/2$      $\sqrt{\pi}/2$
$2$        $1$
$5/2$      $3\sqrt{\pi}/4$
$3$        $2$
$4$        $6$

Table C.1 Selected values of the gamma function. Other values can be generated using $\Gamma(z + 1) = z\Gamma(z)$.
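Eqn C.1 and the entries of Table C.1 can be checked numerically. The following is a rough sketch rather than a rigorous method: it truncates the infinite upper limit at an assumed cutoff of 60, uses a hand-written Simpson's rule (`simpson` and `factorial_integral` are illustrative names), and compares against the gamma function in Python's standard library.

```python
import math


def simpson(f, a, b, steps=20000):
    """Composite Simpson's rule for the integral of f from a to b (steps must be even)."""
    h = (b - a) / steps
    total = f(a) + f(b)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3


def factorial_integral(n, upper=60.0):
    """Approximate the integral of x**n * exp(-x) from 0 to `upper` (eqn C.1)."""
    return simpson(lambda x: x**n * math.exp(-x), 0.0, upper)


# Integer n: the integral should reproduce n!.
for n in range(6):
    print(n, math.factorial(n), round(factorial_integral(n), 6))

# Half-integer entries of Table C.1, expressed as multiples of sqrt(pi).
for z in (-1.5, -0.5, 0.5, 1.5, 2.5):
    print(z, math.gamma(z) / math.sqrt(math.pi))
```

The cutoff is harmless for small $n$ because the integrand decays exponentially; for much larger $n$ the upper limit would need to grow with $n$.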

Fig. C.1 The gamma function $\Gamma(n)$ showing the singularities for integer values of $n \le 0$. For positive, integer $n$, $\Gamma(n) = (n - 1)!$.

C.2 The Gaussian integral

The Gaussian is a function of the form $e^{-\alpha x^2}$, which is plotted in Fig. C.2. It has a maximum at $x = 0$ and a shape which has been likened to that of a bell. It turns up in many statistical problems, often under the name of the normal distribution. The integral of a Gaussian is another extremely useful integral:

$$\int_{-\infty}^{\infty} e^{-\alpha x^2}\,dx = \sqrt{\frac{\pi}{\alpha}}. \tag{C.3}$$

Fig. C.2 A Gaussian $e^{-\alpha x^2}$.

• It can be proved by evaluating the two-dimensional integral

$$\int_{-\infty}^{\infty} dx \int_{-\infty}^{\infty} dy\, e^{-\alpha(x^2 + y^2)} = \int_{-\infty}^{\infty} dx\, e^{-\alpha x^2} \int_{-\infty}^{\infty} dy\, e^{-\alpha y^2} = I^2, \tag{C.4}$$

where $I$ is our desired integral. We can evaluate the left-hand side using polar coordinates, so that

$$I^2 = \int_0^{2\pi} d\theta \int_0^{\infty} dr\, r e^{-\alpha r^2}, \tag{C.5}$$

which with the substitution $z = \alpha r^2$ (and hence $dz = 2\alpha r\,dr$) gives

$$I^2 = 2\pi \times \frac{1}{2\alpha} \int_0^{\infty} dz\, e^{-z} = \frac{\pi}{\alpha}, \tag{C.6}$$

and hence $I = \sqrt{\pi/\alpha}$ is proved.

• Even more fun begins when we employ a cunning stratagem: we differentiate both sides of the equation with respect to $\alpha$. Because $x$ does not depend on $\alpha$, this is easy to do. Hence $(d/d\alpha)\,e^{-\alpha x^2} = -x^2 e^{-\alpha x^2}$ and $(d/d\alpha)\sqrt{\pi/\alpha} = -\sqrt{\pi}/2\alpha^{3/2}$ so that

$$\int_{-\infty}^{\infty} x^2 e^{-\alpha x^2}\,dx = \frac{1}{2}\sqrt{\frac{\pi}{\alpha^3}}. \tag{C.7}$$

• This trick can be repeated with equal ease. Differentiating again gives

$$\int_{-\infty}^{\infty} x^4 e^{-\alpha x^2}\,dx = \frac{3}{4}\sqrt{\frac{\pi}{\alpha^5}}. \tag{C.8}$$

• Therefore we have a way of generating the integrals between $-\infty$ and $\infty$ of $x^{2n} e^{-\alpha x^2}$, where $n \ge 0$ is an integer.$^1$ Because these functions are even, the integrals of the same functions between 0 and $\infty$ are just half of these results:

$$\int_0^{\infty} e^{-\alpha x^2}\,dx = \frac{1}{2}\sqrt{\frac{\pi}{\alpha}},$$
$$\int_0^{\infty} x^2 e^{-\alpha x^2}\,dx = \frac{1}{4}\sqrt{\frac{\pi}{\alpha^3}},$$
$$\int_0^{\infty} x^4 e^{-\alpha x^2}\,dx = \frac{3}{8}\sqrt{\frac{\pi}{\alpha^5}}.$$

• To integrate $x^{2n+1} e^{-\alpha x^2}$ between $-\infty$ and $\infty$ is easy: the functions are all odd and so the integrals are all zero. To integrate between 0 and $\infty$, start off with $\int_0^\infty x e^{-\alpha x^2}\,dx$, which can be done by noticing that $x e^{-\alpha x^2}$ is almost what you get when you differentiate $e^{-\alpha x^2}$. All the odd powers of $x$ can now be obtained$^2$ by differentiating that integral with respect to $\alpha$. Hence,

$$\int_0^{\infty} x e^{-\alpha x^2}\,dx = \frac{1}{2\alpha},$$
$$\int_0^{\infty} x^3 e^{-\alpha x^2}\,dx = \frac{1}{2\alpha^2},$$
$$\int_0^{\infty} x^5 e^{-\alpha x^2}\,dx = \frac{1}{\alpha^3}.$$

• A useful expression for a normalized Gaussian (one whose integral is unity) is

$$\frac{1}{\sqrt{2\pi\sigma^2}}\, e^{-(x - \mu)^2/2\sigma^2}. \tag{C.9}$$

This has mean $\langle x \rangle = \mu$ and variance $\langle (x - \langle x \rangle)^2 \rangle = \sigma^2$.

1 A general formula is
$$\int_{-\infty}^{\infty} x^{2n} e^{-\alpha x^2}\,dx = \frac{(2n)!}{n!\,2^{2n}}\sqrt{\frac{\pi}{\alpha^{2n+1}}},$$
for integer $n \ge 0$.

2 A general formula is
$$\int_0^{\infty} x^{2n+1} e^{-\alpha x^2}\,dx = \frac{n!}{2\alpha^{n+1}},$$
for integer $n \ge 0$. Another method of getting these integrals is to make the substitution $y = \alpha x^2$ and turn them into the factorial integrals considered above. This is all very well, but you need to know things like $(-\frac{1}{2})! = \sqrt{\pi}$ to proceed.
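The two general formulae for the even and odd Gaussian moments can be compared against direct numerical integration. This is an illustrative check only: the value of $\alpha$, the integration cutoff and the helper names are assumptions made here, not part of the original text.

```python
import math


def simpson(f, a, b, steps=20000):
    """Composite Simpson's rule for the integral of f from a to b (steps must be even)."""
    h = (b - a) / steps
    total = f(a) + f(b)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3


def even_moment(n, alpha):
    """Closed form for the integral of x**(2n) exp(-alpha x**2) over (-inf, inf)."""
    return (math.factorial(2 * n) / (math.factorial(n) * 2 ** (2 * n))
            * math.sqrt(math.pi / alpha ** (2 * n + 1)))


def odd_moment(n, alpha):
    """Closed form for the integral of x**(2n+1) exp(-alpha x**2) over (0, inf)."""
    return math.factorial(n) / (2 * alpha ** (n + 1))


def numeric_moment(power, alpha, lower, upper):
    """Numerical integral of x**power * exp(-alpha x**2) between the given limits."""
    return simpson(lambda x: x**power * math.exp(-alpha * x * x), lower, upper)


alpha = 1.7                      # arbitrary test value (assumed)
cutoff = 12 / math.sqrt(alpha)   # the integrand is negligible beyond this point
for n in range(3):
    print(n,
          even_moment(n, alpha), numeric_moment(2 * n, alpha, -cutoff, cutoff),
          odd_moment(n, alpha), numeric_moment(2 * n + 1, alpha, 0.0, cutoff))
```

Each printed row should show the closed form and the numerical value agreeing to many significant figures, for both the even and the odd case.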

C.3 Stirling's formula

The derivation of Stirling's formula proceeds by using the integral expression for $n!$ in eqn C.1, namely

$$n! = \int_0^\infty x^n e^{-x}\,dx. \tag{C.10}$$

We will play with the right-hand side of this integral and develop an approximation for it. We notice that the integrand $x^n e^{-x}$ consists of a function which increases with $x$ (the function $x^n$) and a function which decreases with $x$ (the function $e^{-x}$), and so it must have a maximum somewhere (see Fig. C.3(a)). Most of the integral is due to the bulge around this maximum, so we will try to approximate this region around the bulge. As we are eventually going to take logs of this integral, it is natural to work with the logarithm of this integrand, which we will call $f(x)$. Hence we define the function $f(x)$ by

$$e^{f(x)} = x^n e^{-x}. \tag{C.11}$$

This implies that $f(x)$ is given by

$$f(x) = n\ln x - x, \tag{C.12}$$

which is sketched in Fig. C.3(b). When the integrand has a maximum, so will $f(x)$. Hence the maximum of the integrand, and also the maximum of this function $f(x)$, can be found using

$$\frac{df}{dx} = \frac{n}{x} - 1 = 0, \tag{C.13}$$

which implies that the maximum in $f$ is at $x = n$. We can differentiate again and get

$$\frac{d^2f}{dx^2} = -\frac{n}{x^2}. \tag{C.14}$$

Now we can perform a Taylor expansion$^3$ around the maximum, so that

$$\begin{aligned} f(x) &= f(n) + \left.\frac{df}{dx}\right|_{x=n}(x - n) + \frac{1}{2!}\left.\frac{d^2f}{dx^2}\right|_{x=n}(x - n)^2 + \cdots \\ &= n\ln n - n + 0 \times (x - n) - \frac{1}{2}\frac{n}{n^2}(x - n)^2 + \cdots \\ &= n\ln n - n - \frac{(x - n)^2}{2n} + \cdots \end{aligned} \tag{C.15}$$

The Taylor expansion approximates $f(x)$ by a quadratic (see the dotted line in Fig. C.3) and hence $e^{f(x)}$ approximates to a Gaussian.$^4$ Putting this as the integrand in eqn C.1, and removing from this integral the terms which do not depend on $x$, we have

$$n! = e^{n\ln n - n}\int_0^\infty e^{-(x - n)^2/2n + \cdots}\,dx. \tag{C.16}$$

Fig. C.3 (a) The integrand $x^n e^{-x}$ (solid line) contains a maximum. (b) The function $f(x) = -x + n\ln x$ (solid line) which is the natural logarithm of the integrand. The dotted line is the Taylor expansion around the maximum (from eqn C.15). These curves have been plotted for $n = 3$, but the ability of the Taylor expansion to model the solid line improves as $n$ increases. Note that (b) shows the natural logarithm of the curves in (a).

3 See Appendix B.

4 See Appendix C.2.

The integral in this expression can be evaluated with the help of eqn C.3 to be

$$\int_0^\infty e^{-(x - n)^2/2n + \cdots}\,dx \approx \int_{-\infty}^\infty e^{-(x - n)^2/2n}\,dx = \sqrt{2\pi n}. \tag{C.17}$$

(Here we have used the fact that it doesn't matter if you put the lower limit of the integral as $-\infty$ rather than 0 since the integrand, $e^{-(x - n)^2/2n}$, is a Gaussian centred at $x = n$ with a width that scales as $\sqrt{n}$, so that the contribution to the integral from the region between $-\infty$ and 0 is vanishingly small as $n$ becomes large.) We have that

$$n! \approx e^{n\ln n - n}\sqrt{2\pi n}, \tag{C.18}$$

and hence

$$\ln n! \approx n\ln n - n + \tfrac{1}{2}\ln 2\pi n, \tag{C.19}$$

which is one version of Stirling's formula. When $n$ is very large, this can be written

$$\ln n! \approx n\ln n - n, \tag{C.20}$$

which is another version of Stirling's formula.

Fig. C.4 Stirling's approximation for $\ln n!$. The dots are the exact results. The solid line is according to eqn C.19, while the dashed line is eqn C.20. The inset shows the two lines for larger values of $n$ and demonstrates that as $n$ becomes large, eqn C.20 becomes a very good approximation.

The approximation in eqn C.19 is very good, as can be seen in Fig. C.4. The approximation in eqn C.20 (the dashed line in Fig. C.4) slightly underestimates the exact result when $n$ is small, but as $n$ becomes large (as is often the case in thermal physics problems) it becomes a very good approximation (as shown in the inset to Fig. C.4).
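The quality of the two versions of Stirling's formula is easy to examine numerically. The sketch below uses `math.lgamma` from Python's standard library to obtain $\ln n!$ through $n! = \Gamma(n + 1)$; the function names are illustrative.

```python
import math


def ln_factorial(n):
    """ln n! evaluated via the log-gamma function, using n! = Gamma(n + 1)."""
    return math.lgamma(n + 1)


def stirling_full(n):
    """Eqn C.19: n ln n - n + (1/2) ln(2 pi n)."""
    return n * math.log(n) - n + 0.5 * math.log(2 * math.pi * n)


def stirling_simple(n):
    """Eqn C.20: n ln n - n."""
    return n * math.log(n) - n


# Columns: n, ln n!, error of eqn C.19, error of eqn C.20.
for n in (1, 10, 100, 10**6):
    exact = ln_factorial(n)
    print(n, exact, stirling_full(n) - exact, stirling_simple(n) - exact)
```

The absolute error of eqn C.20 grows slowly with $n$ (it is the neglected $\tfrac{1}{2}\ln 2\pi n$ term), but the relative error shrinks, which is what matters when $\ln n!$ itself is enormous.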

C.4 Riemann zeta function

The Riemann zeta function $\zeta(s)$ is usually defined by

$$\zeta(s) = \sum_{n=1}^{\infty} \frac{1}{n^s}, \tag{C.21}$$

and converges for $s > 1$ (see Fig. C.5). For $s = 1$ it gives a divergent series. Some useful values are listed in Table C.2.

Fig. C.5 The Riemann zeta function $\zeta(s)$ for $s > 1$.

$s$       $\zeta(s)$
$1$       $\infty$
$3/2$     $\approx 2.612$
$2$       $\pi^2/6 \approx 1.645$
$5/2$     $\approx 1.341$
$3$       $\approx 1.20206$
$4$       $\pi^4/90 \approx 1.0823$
$5$       $\approx 1.0369$
$6$       $\pi^6/945 \approx 1.017$

Table C.2 Selected values of the Riemann zeta function.

Our reason for introducing the Riemann zeta function is that it is involved in many useful integrals. One such is the Bose integral $I_{\rm B}(n)$ defined by

$$I_{\rm B}(n) = \int_0^\infty dx\, \frac{x^n}{e^x - 1}. \tag{C.22}$$

We can evaluate this as follows:

$$\begin{aligned} I_{\rm B}(n) &= \int_0^\infty dx\, \frac{x^n e^{-x}}{1 - e^{-x}} \\ &= \sum_{k=0}^{\infty} \int_0^\infty dx\, x^n e^{-(k+1)x} \\ &= \sum_{k=0}^{\infty} \frac{1}{(k + 1)^{n+1}} \int_0^\infty dy\, y^n e^{-y} \\ &= \zeta(n + 1)\,\Gamma(n + 1). \end{aligned} \tag{C.23}$$

Thus we have that

$$I_{\rm B}(n) = \int_0^\infty dx\, \frac{x^n}{e^x - 1} = \zeta(n + 1)\,\Gamma(n + 1). \tag{C.24}$$

So for example,

$$\int_0^\infty dx\, \frac{x^3}{e^x - 1} = \zeta(4)\,\Gamma(4) = \frac{\pi^4}{90} \times 3! = \frac{\pi^4}{15}. \tag{C.25}$$
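Eqn C.24 can be verified numerically. Python's standard library has no zeta function, so the sketch below assumes a simple implementation: a partial sum of eqn C.21 plus an Euler-Maclaurin estimate of the neglected tail. The integral is truncated at an assumed cutoff of 60, and all helper names are illustrative.

```python
import math


def simpson(f, a, b, steps=20000):
    """Composite Simpson's rule for the integral of f from a to b (steps must be even)."""
    h = (b - a) / steps
    total = f(a) + f(b)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3


def zeta(s, terms=1000):
    """Riemann zeta function for s > 1: partial sum of eqn C.21 plus a tail estimate."""
    partial = sum(k ** -s for k in range(1, terms + 1))
    tail = terms ** (1 - s) / (s - 1) - 0.5 * terms ** -s
    return partial + tail


def bose_integral(n, upper=60.0):
    """Numerical integral of x**n / (exp(x) - 1) from 0 to `upper`, for n >= 1."""
    def integrand(x):
        if x == 0.0:
            return 1.0 if n == 1 else 0.0   # limiting value as x -> 0
        return x**n / math.expm1(x)
    return simpson(integrand, 0.0, upper)


for n in (1, 2, 3):
    print(n, bose_integral(n), zeta(n + 1) * math.gamma(n + 1))
```

For $n = 3$ both columns should also agree with $\pi^4/15$, the value quoted in eqn C.25.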

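Another useful integral can be derived as follows. Consider the integral

$$I = \int_0^\infty dx\, \frac{x^{n-1}}{e^{ax} - 1}. \tag{C.26}$$

This can be evaluated easily by making the substitution $y = ax$, yielding

$$I = \frac{1}{a^n} \int_0^\infty dy\, \frac{y^{n-1}}{e^y - 1}. \tag{C.27}$$

Now, differentiating $I$ with respect to $a$ using eqn C.26 gives

$$\frac{dI}{da} = -\int_0^\infty dx\, \frac{x^n e^{ax}}{(e^{ax} - 1)^2}, \tag{C.28}$$

while using eqn C.27 yields

$$\frac{dI}{da} = -\frac{n}{a^{n+1}} \int_0^\infty dy\, \frac{y^{n-1}}{e^y - 1}. \tag{C.29}$$

These two expressions should be the same. Equating them, putting $a = 1$, and using eqn C.24 (with $n$ replaced by $n - 1$) to evaluate the remaining integral yields

$$\int_0^\infty dx\, \frac{x^n e^x}{(e^x - 1)^2} = n\,\zeta(n)\,\Gamma(n). \tag{C.30}$$

So for example,

$$\int_0^\infty dx\, \frac{x^4 e^x}{(e^x - 1)^2} = 4\,\zeta(4)\,\Gamma(4) = 4 \times \frac{\pi^4}{90} \times 3! = \frac{4\pi^4}{15}. \tag{C.31}$$

C.5 The polylogarithm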

The polylogarithm function ${\rm Li}_n(z)$ (also known as Jonquière's function) is defined as

$${\rm Li}_n(z) = \sum_{k=1}^{\infty} \frac{z^k}{k^n}, \tag{C.32}$$

where $z$ is in the open unit disc in the complex plane, i.e. $|z| < 1$. The definition over the whole complex plane follows via the process of analytic continuation. The polylogarithm is useful in the evaluation of integrals of Bose–Einstein and Fermi–Dirac distribution functions. First note that we can write

$$\frac{1}{z^{-1}e^x - 1} = \frac{ze^{-x}}{1 - ze^{-x}} = \sum_{m=0}^{\infty} (ze^{-x})^{m+1}, \tag{C.33}$$

i.e. as a geometric progression. Hence we can evaluate the following integral:

$$\begin{aligned} \int_0^\infty \frac{x^{n-1}\,dx}{z^{-1}e^x - 1} &= \int_0^\infty \sum_{m=0}^{\infty} x^{n-1}(ze^{-x})^{m+1}\,dx \\ &= \sum_{m=0}^{\infty} z^{m+1} \int_0^\infty x^{n-1} e^{-(m+1)x}\,dx \\ &= \sum_{m=0}^{\infty} \frac{z^{m+1}}{(m + 1)^n} \int_0^\infty y^{n-1} e^{-y}\,dy \\ &= \Gamma(n) \sum_{m=0}^{\infty} \frac{z^{m+1}}{(m + 1)^n} \\ &= \Gamma(n) \sum_{k=1}^{\infty} \frac{z^k}{k^n} \\ &= \Gamma(n)\,{\rm Li}_n(z). \end{aligned} \tag{C.34}$$

Similarly one can show that

$$\int_0^\infty \frac{x^{n-1}\,dx}{z^{-1}e^x + 1} = -\Gamma(n)\,{\rm Li}_n(-z). \tag{C.35}$$

Combining these equations, one can write in general that

$$\int_0^\infty \frac{x^{n-1}\,dx}{z^{-1}e^x \pm 1} = \mp\Gamma(n)\,{\rm Li}_n(\mp z). \tag{C.36}$$

Note that when $|z| \ll 1$, only the first term in the series in eqn C.32 contributes, and

$${\rm Li}_n(z) \approx z. \tag{C.37}$$

Note also that

$${\rm Li}_n(1) = \sum_{k=1}^{\infty} \frac{1}{k^n} = \zeta(n), \tag{C.38}$$

where $\zeta(n)$ is the Riemann zeta function (eqn C.21).
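Eqn C.36 can be checked numerically for both signs. The sketch below evaluates the polylogarithm directly from the series of eqn C.32, which is adequate only when $|z|$ is comfortably below 1; the test value of $z$, the cutoff of 60 and the helper names are assumptions made for illustration.

```python
import math


def simpson(f, a, b, steps=20000):
    """Composite Simpson's rule for the integral of f from a to b (steps must be even)."""
    h = (b - a) / steps
    total = f(a) + f(b)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3


def polylog(n, z, terms=200):
    """Li_n(z) from the defining series (eqn C.32); intended for |z| well below 1."""
    return sum(z**k / k**n for k in range(1, terms + 1))


def distribution_integral(n, z, sign, upper=60.0):
    """Numerical integral of x**(n-1) / (exp(x)/z + sign) from 0 to `upper`.

    sign = -1 gives the Bose-Einstein form, sign = +1 the Fermi-Dirac form.
    """
    return simpson(lambda x: x ** (n - 1) / (math.exp(x) / z + sign), 0.0, upper)


def closed_form(n, z, sign):
    """Right-hand side of eqn C.36 for the chosen sign in the denominator."""
    return -sign * math.gamma(n) * polylog(n, -sign * z)


z = 0.5   # arbitrary test value inside the unit disc (assumed)
for n in (2, 3):
    for sign in (-1, +1):
        print(n, sign, distribution_integral(n, z, sign), closed_form(n, z, sign))
```

Setting `sign = -1` reproduces eqn C.34 and `sign = +1` reproduces eqn C.35, so one function covers both distributions.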

C.6 Partial derivatives

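Consider $x$ as a function of two variables $y$ and $z$. This can be written $x = x(y, z)$, and we have that

$$dx = \left(\frac{\partial x}{\partial y}\right)_z dy + \left(\frac{\partial x}{\partial z}\right)_y dz. \tag{C.39}$$

But rearranging $x = x(y, z)$ can lead to having $z$ as a function of $x$ and $y$ so that $z = z(x, y)$, in which case

$$dz = \left(\frac{\partial z}{\partial x}\right)_y dx + \left(\frac{\partial z}{\partial y}\right)_x dy. \tag{C.40}$$

Substituting C.40 into C.39 gives

$$dx = \left(\frac{\partial x}{\partial z}\right)_y \left(\frac{\partial z}{\partial x}\right)_y dx + \left[\left(\frac{\partial x}{\partial y}\right)_z + \left(\frac{\partial x}{\partial z}\right)_y \left(\frac{\partial z}{\partial y}\right)_x\right] dy.$$

The terms multiplying $dx$ give the reciprocal theorem

$$\left(\frac{\partial x}{\partial z}\right)_y = \frac{1}{\left(\frac{\partial z}{\partial x}\right)_y}, \tag{C.41}$$

and the terms multiplying $dy$ give the reciprocity theorem

$$\left(\frac{\partial x}{\partial y}\right)_z \left(\frac{\partial y}{\partial z}\right)_x \left(\frac{\partial z}{\partial x}\right)_y = -1. \tag{C.42}$$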
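The reciprocal and reciprocity theorems (eqns C.41 and C.42) can be illustrated numerically. The sketch below assumes one mole of an ideal gas, $PV = RT$, with $(x, y, z) = (P, V, T)$, and estimates each partial derivative by a central finite difference. The choice of gas, the helper names and the step size are illustrative assumptions.

```python
R = 8.314  # molar gas constant in J/(K mol), rounded


def pressure(V, T):
    return R * T / V


def volume(T, P):
    return R * T / P


def temperature(P, V):
    return P * V / R


def partial(f, args, index, rel_step=1e-6):
    """Central-difference derivative of f with respect to args[index],
    holding the other arguments fixed."""
    h = abs(args[index]) * rel_step
    up = list(args)
    down = list(args)
    up[index] += h
    down[index] -= h
    return (f(*up) - f(*down)) / (2.0 * h)


def reciprocity_product(V, T):
    """(dP/dV)_T (dV/dT)_P (dT/dP)_V, which eqn C.42 says equals -1."""
    P = pressure(V, T)
    dP_dV = partial(pressure, (V, T), 0)
    dV_dT = partial(volume, (T, P), 0)
    dT_dP = partial(temperature, (P, V), 0)
    return dP_dV * dV_dT * dT_dP


def reciprocal_product(V, T):
    """(dP/dT)_V (dT/dP)_V, which eqn C.41 says equals +1."""
    P = pressure(V, T)
    dP_dT = partial(pressure, (V, T), 1)
    dT_dP = partial(temperature, (P, V), 0)
    return dP_dT * dT_dP
```

Evaluating `reciprocity_product(0.024, 300.0)` should return a value very close to $-1$, and `reciprocal_product(0.024, 300.0)` a value very close to $+1$.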

C.7 Exact differentials

An expression such as $F_1(x, y)\,dx + F_2(x, y)\,dy$ is known as an exact differential if it can be written as the differential

$$df = \frac{\partial f}{\partial x}\,dx + \frac{\partial f}{\partial y}\,dy, \tag{C.43}$$

of a differentiable single-valued function $f(x, y)$. This implies that

$$F_1 = \frac{\partial f}{\partial x}, \qquad F_2 = \frac{\partial f}{\partial y}, \tag{C.44}$$

or in vector form, $\mathbf{F} = \nabla f$. Hence the integral of an exact differential is path-independent, so that [where 1 and 2 are shorthands for $(x_1, y_1)$ and $(x_2, y_2)$]

$$\int_1^2 F_1(x, y)\,dx + F_2(x, y)\,dy = \int_1^2 \mathbf{F} \cdot d\mathbf{r} = \int_1^2 df = f(2) - f(1), \tag{C.45}$$

and the answer depends only on the initial and final states of the system. For an inexact differential this is not true and knowledge of the initial and final states is not sufficient to evaluate the integral: you have to know which path was taken. For an exact differential the integral round a closed loop is zero:

$$\oint F_1(x, y)\,dx + F_2(x, y)\,dy = \oint \mathbf{F} \cdot d\mathbf{r} = \oint df = 0, \tag{C.46}$$

which implies that $\nabla \times \mathbf{F} = 0$ (by Stokes' theorem) and hence

$$\frac{\partial F_2}{\partial x} = \frac{\partial F_1}{\partial y} \quad \text{or} \quad \frac{\partial^2 f}{\partial x\,\partial y} = \frac{\partial^2 f}{\partial y\,\partial x}. \tag{C.47}$$

For thermal physics, a crucial point to remember is that functions of state have exact diﬀerentials.
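The path-independence of an exact differential can be seen in a small numerical experiment. The sketch below integrates the exact differential $y\,dx + x\,dy$ (for which $f = xy$) and the inexact differential $y\,dx$ along two different paths from $(0, 0)$ to $(1, 1)$. The particular differentials, the paths and the helper name `line_integral` are illustrative assumptions.

```python
def line_integral(F1, F2, path, steps=1000):
    """Midpoint-rule estimate of the integral of F1 dx + F2 dy along a path
    made of straight segments joining the listed (x, y) points."""
    total = 0.0
    for (x0, y0), (x1, y1) in zip(path, path[1:]):
        dx = (x1 - x0) / steps
        dy = (y1 - y0) / steps
        for i in range(steps):
            t = (i + 0.5) / steps
            x = x0 + t * (x1 - x0)
            y = y0 + t * (y1 - y0)
            total += F1(x, y) * dx + F2(x, y) * dy
    return total


# Two routes between the same end points.
path_a = [(0.0, 0.0), (1.0, 0.0), (1.0, 1.0)]  # along x first, then y
path_b = [(0.0, 0.0), (0.0, 1.0), (1.0, 1.0)]  # along y first, then x

# Exact differential: y dx + x dy = d(xy).
exact_a = line_integral(lambda x, y: y, lambda x, y: x, path_a)
exact_b = line_integral(lambda x, y: y, lambda x, y: x, path_b)

# Inexact differential: y dx alone.
inexact_a = line_integral(lambda x, y: y, lambda x, y: 0.0, path_a)
inexact_b = line_integral(lambda x, y: y, lambda x, y: 0.0, path_b)
```

The two exact results agree with $f(2) - f(1) = 1$, whereas the two inexact results differ from each other, which is the behaviour described by eqn C.45 and the remarks following it.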

C.8 Volume of a hypersphere

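A hypersphere in $D$ dimensions and with radius $r$ is described by the equation

$$\sum_{i=1}^{D} x_i^2 = r^2. \tag{C.48}$$

It has volume $V_D$ given by

$$V_D = \alpha r^D, \tag{C.49}$$

where $\alpha$ is a numerical constant which we will now determine. Consider the integral $I$ given by

$$I = \int_{-\infty}^{\infty} dx_1 \cdots \int_{-\infty}^{\infty} dx_D\, \exp\left(-\sum_{i=1}^{D} x_i^2\right). \tag{C.50}$$

This can be evaluated as follows:

$$I = \left[\int_{-\infty}^{\infty} dx\, e^{-x^2}\right]^D = \pi^{D/2}. \tag{C.51}$$

Alternatively, we can evaluate it in hyperspherical polars as follows:

$$I = \int_0^\infty dV_D\, e^{-r^2}, \tag{C.52}$$

where the volume element is given by $dV_D = \alpha D r^{D-1}\,dr$. Hence, equating eqn C.51 and eqn C.52 we have that

$$\pi^{D/2} = \alpha D \int_0^\infty dr\, r^{D-1} e^{-r^2}, \tag{C.53}$$

and hence, since the remaining integral is equal to $\Gamma(D/2)/2$,

$$\alpha = \frac{2\pi^{D/2}}{D\,\Gamma(D/2)}. \tag{C.54}$$

Hence we obtain the volume of a hypersphere in $D$ dimensions as

$$V_D = \frac{2\pi^{D/2} r^D}{D\,\Gamma(D/2)} = \frac{\pi^{D/2} r^D}{\Gamma\!\left(\frac{D}{2} + 1\right)}. \tag{C.55}$$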
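A quick check of the hypersphere volume against the familiar low-dimensional cases is sketched below. The function names are illustrative assumptions; the formulae used are $\alpha = 2\pi^{D/2}/[D\,\Gamma(D/2)]$ from eqn C.54 and the equivalent form $V_D = \pi^{D/2} r^D / \Gamma(D/2 + 1)$, which follows from $\Gamma(D/2 + 1) = (D/2)\,\Gamma(D/2)$.

```python
import math


def alpha(D):
    """Numerical constant of eqn C.54."""
    return 2.0 * math.pi ** (D / 2) / (D * math.gamma(D / 2))


def hypersphere_volume(D, r=1.0):
    """Volume of a D-dimensional hypersphere of radius r."""
    return math.pi ** (D / 2) * r**D / math.gamma(D / 2 + 1)
```

For $D = 1$, 2 and 3 this should reproduce the length $2r$, the area $\pi r^2$ and the volume $\tfrac{4}{3}\pi r^3$ respectively.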

C.9 Jacobians

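Let $x = g(u, v)$ and $y = h(u, v)$ be a transformation of the plane. Then the Jacobian of this transformation is

$$\frac{\partial(x, y)}{\partial(u, v)} = \begin{vmatrix} \dfrac{\partial x}{\partial u} & \dfrac{\partial x}{\partial v} \\[2mm] \dfrac{\partial y}{\partial u} & \dfrac{\partial y}{\partial v} \end{vmatrix} = \frac{\partial x}{\partial u}\frac{\partial y}{\partial v} - \frac{\partial x}{\partial v}\frac{\partial y}{\partial u}. \tag{C.56}$$

Example C.1 The Jacobian of the polar coordinate transformation $x(r, \theta) = r\cos\theta$ and $y(r, \theta) = r\sin\theta$ is

$$\frac{\partial(x, y)}{\partial(r, \theta)} = \begin{vmatrix} \dfrac{\partial x}{\partial r} & \dfrac{\partial x}{\partial \theta} \\[2mm] \dfrac{\partial y}{\partial r} & \dfrac{\partial y}{\partial \theta} \end{vmatrix} = \begin{vmatrix} \cos\theta & -r\sin\theta \\ \sin\theta & r\cos\theta \end{vmatrix} = r. \tag{C.57}$$

If $g$ and $h$ have continuous partial derivatives such that the Jacobian is never zero, we then have

$$\iint_R f(x, y)\,dx\,dy = \iint_S f(g(u, v), h(u, v)) \left|\frac{\partial(x, y)}{\partial(u, v)}\right| du\,dv. \tag{C.58}$$

So in our example, we would have

$$\iint_R f(x, y)\,dx\,dy = \iint_S f(g(r, \theta), h(r, \theta))\, r\,dr\,d\theta. \tag{C.59}$$

The Jacobian of the inverse transformation is the reciprocal of the Jacobian of the original transformation,

$$\left|\frac{\partial(x, y)}{\partial(u, v)}\right| = \frac{1}{\left|\dfrac{\partial(u, v)}{\partial(x, y)}\right|}, \tag{C.60}$$

which is a consequence of the fact that the determinant of the inverse of a matrix is the reciprocal of the determinant of the matrix. Other useful identities are

$$\frac{\partial(x, y)}{\partial(u, v)} = -\frac{\partial(y, x)}{\partial(u, v)} = \frac{\partial(y, x)}{\partial(v, u)}, \tag{C.61}$$

$$\frac{\partial(x, y)}{\partial(x, y)} = 1, \tag{C.62}$$

$$\frac{\partial(x, y)}{\partial(x, z)} = \left(\frac{\partial y}{\partial z}\right)_x, \tag{C.63}$$

and

$$\frac{\partial(x, y)}{\partial(u, v)} = \frac{\partial(x, y)}{\partial(a, b)}\,\frac{\partial(a, b)}{\partial(u, v)}. \tag{C.64}$$

Quick exercise: The Jacobian can be generalized to three dimensions, as

$$\frac{\partial(x, y, z)}{\partial(u, v, w)} = \begin{vmatrix} \dfrac{\partial x}{\partial u} & \dfrac{\partial x}{\partial v} & \dfrac{\partial x}{\partial w} \\[2mm] \dfrac{\partial y}{\partial u} & \dfrac{\partial y}{\partial v} & \dfrac{\partial y}{\partial w} \\[2mm] \dfrac{\partial z}{\partial u} & \dfrac{\partial z}{\partial v} & \dfrac{\partial z}{\partial w} \end{vmatrix}. \tag{C.65}$$

Show that for the transformation of spherical polars $x = r\sin\theta\cos\phi$, $y = r\sin\theta\sin\phi$, $z = r\cos\theta$, the Jacobian is

$$\frac{\partial(x, y, z)}{\partial(r, \theta, \phi)} = r^2 \sin\theta. \tag{C.66}$$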
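The Jacobians of eqns C.57 and C.66 can be verified numerically before attempting the algebra. The sketch below builds the matrix of partial derivatives by central finite differences and evaluates its determinant by expansion along the first row. The helper names and the step size are illustrative assumptions, and the check is of course no substitute for doing the exercise by hand.

```python
import math


def determinant(m):
    """Determinant by Laplace expansion along the first row (small matrices only)."""
    if len(m) == 1:
        return m[0][0]
    total = 0.0
    for col, value in enumerate(m[0]):
        minor = [row[:col] + row[col + 1:] for row in m[1:]]
        total += (-1) ** col * value * determinant(minor)
    return total


def jacobian_determinant(transform, point, h=1e-6):
    """Finite-difference estimate of the Jacobian of transform at point."""
    n = len(point)
    matrix = [[0.0] * n for _ in range(n)]
    for j in range(n):
        up = list(point)
        down = list(point)
        up[j] += h
        down[j] -= h
        f_up = transform(*up)
        f_down = transform(*down)
        for i in range(n):
            matrix[i][j] = (f_up[i] - f_down[i]) / (2.0 * h)
    return determinant(matrix)


def plane_polars(r, theta):
    return (r * math.cos(theta), r * math.sin(theta))


def spherical_polars(r, theta, phi):
    return (
        r * math.sin(theta) * math.cos(phi),
        r * math.sin(theta) * math.sin(phi),
        r * math.cos(theta),
    )
```

At the arbitrary point $r = 2$, $\theta = 0.7$, $\phi = 1.1$, the two estimates should be close to $r$ and $r^2\sin\theta$ respectively.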

C.10 The Dirac delta function

The Dirac delta function $\delta(x - a)$ centred at $x = a$ is zero for all $x$ not equal to $a$, but its area is 1. Hence

$$\int_{-\infty}^{\infty} \delta(x - a)\,dx = 1. \tag{C.67}$$

Because the Dirac delta function is such a narrow 'spike', integrals of the Dirac delta function multiplied by any other function $f(x)$ are simple to do:

$$\int_{-\infty}^{\infty} f(x)\,\delta(x - a)\,dx = f(a). \tag{C.68}$$
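The sifting property of eqn C.68 can be illustrated by replacing the delta function with a very narrow normalized Gaussian, which is one common way of approximating it (an assumption of this sketch rather than a definition used above). The names `narrow_gaussian` and `sift`, and the chosen width, are illustrative.

```python
import math


def narrow_gaussian(x, a, width):
    """Normalized Gaussian of the given width centred at x = a."""
    return math.exp(-0.5 * ((x - a) / width) ** 2) / (width * math.sqrt(2.0 * math.pi))


def sift(f, a, width=1e-4, steps=4000):
    """Trapezoidal estimate of the integral of f(x) times the narrow Gaussian,
    taken over ten widths either side of x = a."""
    lo = a - 10.0 * width
    h = 20.0 * width / steps
    total = 0.0
    for i in range(steps + 1):
        x = lo + i * h
        weight = 0.5 if i in (0, steps) else 1.0
        total += weight * f(x) * narrow_gaussian(x, a, width)
    return total * h
```

As the width is reduced, `sift(math.cos, 0.5)` approaches $\cos(0.5)$ and `sift(lambda x: 1.0, 0.5)` approaches 1, in line with eqns C.67 and C.68.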

C.11 Fourier transforms

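Consider a function $x(t)$. Its Fourier transform is defined by

$$\tilde{x}(\omega) = \int_{-\infty}^{\infty} dt\, e^{-i\omega t} x(t). \tag{C.69}$$

The inverse transform is

$$x(t) = \frac{1}{2\pi} \int_{-\infty}^{\infty} d\omega\, e^{i\omega t} \tilde{x}(\omega). \tag{C.70}$$

We now state some useful results concerning Fourier transforms.

• The Fourier transform of a delta function $\delta(t - t')$ is given by

$$\int_{-\infty}^{\infty} dt\, e^{-i\omega t} \delta(t - t') = e^{-i\omega t'}, \tag{C.71}$$

and putting this into the inverse transform (and then exchanging the roles of $t$ and $\omega$) shows that

$$\int_{-\infty}^{\infty} dt\, e^{i(\omega - \omega')t} = 2\pi\delta(\omega - \omega'), \tag{C.72}$$

which is an identity that will be useful later.

• The Fourier transform of $\dot{x}(t)$ is $i\omega\tilde{x}(\omega)$, and so differential equations can be Fourier transformed into algebraic equations.

• Parseval's theorem states that

$$\int_{-\infty}^{\infty} dt\, |x(t)|^2 = \frac{1}{2\pi} \int_{-\infty}^{\infty} d\omega\, |\tilde{x}(\omega)|^2. \tag{C.73}$$

• The convolution $h(t)$ of two functions $f(t)$ and $g(t)$ is defined by

$$h(t) = \int_{-\infty}^{\infty} dt'\, f(t - t') g(t'). \tag{C.74}$$

The convolution theorem states that the Fourier transform of $h(t)$ is then given by the multiplication of the Fourier transforms of $f(t)$ and $g(t)$, i.e.

$$\tilde{h}(\omega) = \tilde{f}(\omega)\tilde{g}(\omega). \tag{C.75}$$

• We now prove the Wiener–Khinchin theorem (mentioned in Section 33.6). Using the inverse Fourier transform, we can write the correlation function $C_{xx}(t)$ as

\begin{align}
C_{xx}(t) &= \int_{-\infty}^{\infty} x^*(t')\, x(t' + t)\, dt' \tag{C.76} \\
&= \int_{-\infty}^{\infty} dt' \left[\frac{1}{2\pi} \int_{-\infty}^{\infty} d\omega\, e^{-i\omega t'} \tilde{x}^*(-\omega)\right] \left[\frac{1}{2\pi} \int_{-\infty}^{\infty} d\omega'\, e^{i\omega'(t + t')} \tilde{x}(\omega')\right] \notag \\
&= \left(\frac{1}{2\pi}\right)^2 \int_{-\infty}^{\infty} d\omega \int_{-\infty}^{\infty} d\omega'\, e^{i\omega' t}\, \tilde{x}^*(-\omega)\, \tilde{x}(\omega') \int_{-\infty}^{\infty} dt'\, e^{i(\omega' - \omega)t'}, \notag
\end{align}

where $\tilde{x}^*(\omega)$ denotes the Fourier transform of $x^*(t)$, so that $\tilde{x}^*(-\omega) = [\tilde{x}(\omega)]^*$. Using eqn C.72, this reduces to

$$C_{xx}(t) = \frac{1}{2\pi} \int_{-\infty}^{\infty} d\omega\, e^{i\omega t}\, \tilde{x}^*(-\omega)\, \tilde{x}(\omega), \tag{C.77}$$

i.e. the inverse Fourier transform of $|\tilde{x}(\omega)|^2$.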
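Parseval's theorem and the convolution theorem have direct analogues for the discrete Fourier transform, which makes them easy to test on a computer. The sketch below is such a discrete analogue only: the factor $1/2\pi$ of eqn C.73 becomes $1/N$ for $N$ samples, and the convolution is circular. The function names and the test signals are illustrative assumptions.

```python
import cmath
import math


def dft(x):
    """Discrete Fourier transform X_k = sum_n x_n exp(-2 pi i k n / N)."""
    n_points = len(x)
    return [
        sum(x[n] * cmath.exp(-2j * math.pi * k * n / n_points) for n in range(n_points))
        for k in range(n_points)
    ]


def circular_convolution(f, g):
    """h_n = sum_m f_(n-m) g_m with indices taken modulo N."""
    n_points = len(f)
    return [
        sum(f[(n - m) % n_points] * g[m] for m in range(n_points))
        for n in range(n_points)
    ]


def parseval_sides(x):
    """Return (sum |x_n|^2, (1/N) sum |X_k|^2), which should be equal."""
    time_side = sum(abs(v) ** 2 for v in x)
    freq_side = sum(abs(v) ** 2 for v in dft(x)) / len(x)
    return time_side, freq_side


# Two arbitrary test signals of 16 samples each.
f_signal = [math.sin(0.3 * n) for n in range(16)]
g_signal = [math.exp(-0.2 * n) for n in range(16)]

h_transform = dft(circular_convolution(f_signal, g_signal))
product = [a * b for a, b in zip(dft(f_signal), dft(g_signal))]
```

The lists `h_transform` and `product` should agree element by element, mirroring eqn C.75, and the two values returned by `parseval_sides` should coincide.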

C.12 Solution of the diffusion equation

The diffusion equation

$$\frac{\partial n}{\partial t} = D \frac{\partial^2 n}{\partial x^2} \tag{C.78}$$

can be solved by Fourier transforming $n(x, t)$ using

$$\tilde{n}(k, t) = \int_{-\infty}^{\infty} dx\, e^{-ikx} n(x, t), \tag{C.79}$$

so that

$$ik\,\tilde{n}(k, t) = \int_{-\infty}^{\infty} dx\, e^{-ikx} \frac{\partial n(x, t)}{\partial x}. \tag{C.80}$$

Hence eqn C.78 becomes

$$\frac{\partial \tilde{n}(k, t)}{\partial t} = -Dk^2\, \tilde{n}(k, t), \tag{C.81}$$

which is now a simple first-order differential equation whose solution is

$$\tilde{n}(k, t) = \tilde{n}(k, 0)\, e^{-Dk^2 t}. \tag{C.82}$$

Inverse Fourier transforming then yields

$$n(x, t) = \frac{1}{2\pi} \int_{-\infty}^{\infty} dk\, e^{ikx} e^{-Dk^2 t}\, \tilde{n}(k, 0). \tag{C.83}$$

In particular, if the initial distribution of $n$ is given by

$$n(x, 0) = n_0\, \delta(x), \tag{C.84}$$

then

$$\tilde{n}(k, 0) = n_0, \tag{C.85}$$

and hence

$$n(x, t) = \frac{n_0}{\sqrt{4\pi Dt}}\, e^{-x^2/(4Dt)}. \tag{C.86}$$

This equation is plotted in Fig. C.6 and describes a Gaussian whose width increases with time. Note that $\langle x^2 \rangle = 2Dt$.

Quick exercise: Repeat this in three dimensions for the diffusion equation

$$\frac{\partial n}{\partial t} = D\nabla^2 n \tag{C.87}$$

and show that if $n(\mathbf{r}, 0) = n_0\, \delta(\mathbf{r})$ then

$$n(\mathbf{r}, t) = \frac{n_0}{(4\pi Dt)^{3/2}}\, e^{-r^2/(4Dt)}. \tag{C.88}$$
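The one-dimensional solution of eqn C.86 can be checked numerically in two ways: by confirming that it satisfies eqn C.78 to within finite-difference error, and by confirming that its mean square displacement is $2Dt$. The sketch below does both; the function names, step sizes and sample point are illustrative assumptions.

```python
import math


def n_profile(x, t, D=1.0, n0=1.0):
    """Gaussian solution of eqn C.86."""
    return n0 / math.sqrt(4.0 * math.pi * D * t) * math.exp(-x * x / (4.0 * D * t))


def pde_residual(x, t, D=1.0, h=1e-3):
    """Finite-difference estimate of dn/dt - D d2n/dx2, which should vanish."""
    dn_dt = (n_profile(x, t + h, D) - n_profile(x, t - h, D)) / (2.0 * h)
    d2n_dx2 = (
        n_profile(x + h, t, D) - 2.0 * n_profile(x, t, D) + n_profile(x - h, t, D)
    ) / (h * h)
    return dn_dt - D * d2n_dx2


def mean_square_displacement(t, D=1.0, steps=4000):
    """Trapezoidal estimate of <x^2> for the profile of eqn C.86."""
    sigma = math.sqrt(2.0 * D * t)
    lo = -12.0 * sigma
    h = 24.0 * sigma / steps
    norm = 0.0
    second_moment = 0.0
    for i in range(steps + 1):
        x = lo + i * h
        weight = 0.5 if i in (0, steps) else 1.0
        value = weight * n_profile(x, t, D)
        norm += value
        second_moment += x * x * value
    return second_moment / norm
```

For example, `mean_square_displacement(0.5, D=2.0)` should be close to $2Dt = 2$, and `pde_residual(0.7, 0.5)` should be close to zero.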

C.13 Lagrange multipliers

Fig. C.6 Equation C.86 plotted for various values of t. At t = 0, n(x, t) is a delta function at the origin, i.e. n(x, 0) = n0 δ(x). As t increases, n(x, t) becomes broader and the distribution spreads out.

Fig. C.7 We wish to ﬁnd the maximum of the function f subject to the constraint that g = 0. This occurs at the point P at which one of the contours of f and the curve g = 0 touch tangentially.

The method of Lagrange multipliers⁵ is used to find the extrema of a function of several variables subject to one or more constraints. Suppose we wish to maximize (or minimize) a function $f(\mathbf{x})$ subject to the constraint $g(\mathbf{x}) = 0$. Both $f$ and $g$ are functions of the $N$ variables $\mathbf{x} = (x_1, x_2, \ldots, x_N)$. The maximum (or minimum) will occur when one of the contours of $f$ and the curve $g = 0$ touch tangentially; let us call the set of points at which this occurs P (this is shown in Fig. C.7 for a two-dimensional case). Now $\nabla f$ is a vector normal to the contours of $f$ and $\nabla g$ is a vector normal to the curve $g = 0$, and these two vectors will be parallel to each other at P. Hence

$$\nabla[f + \lambda g] = 0, \tag{C.89}$$

where $\lambda$ is a constant, called the Lagrange multiplier. Thus we have $N$ equations to solve:

$$\frac{\partial F}{\partial x_k} = 0, \tag{C.90}$$

where $F = f + \lambda g$. This allows us to find $\lambda$ and hence identify the $(N - 2)$-dimensional surface on which $f$ is extremized subject to the constraint $g = 0$. If there are $M$ constraints, so that for example $g_i(\mathbf{x}) = 0$ where $i = 1, \ldots, M$, then we solve eqn C.90 with

$$F = f + \sum_{i=1}^{M} \lambda_i g_i, \tag{C.91}$$

where $\lambda_1, \ldots, \lambda_M$ are Lagrange multipliers.

⁵ Joseph-Louis, Comte de Lagrange (1736–1813).

Example C.2 Find the ratio of the radius $r$ to the height $h$ of a cylinder which minimizes its total surface area subject to the constraint that its volume is constant.

Solution: The volume $V = \pi r^2 h$ and area $A = 2\pi rh + 2\pi r^2$, so we consider the function $F$ given by

$$F = A + \lambda V, \tag{C.92}$$

and solve

$$\frac{\partial F}{\partial h} = 2\pi r + \lambda\pi r^2 = 0, \tag{C.93}$$

$$\frac{\partial F}{\partial r} = 2\pi h + 4\pi r + 2\lambda\pi rh = 0, \tag{C.94}$$

which yields $\lambda = -2/r$ and hence $h = 2r$, i.e. $r/h = 1/2$.
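The result of Example C.2 can be cross-checked without Lagrange multipliers by eliminating $h$ using the volume constraint and searching numerically for the stationary point of the area. The sketch below uses a simple ternary search, which assumes that the constrained area has a single minimum on the bracketing interval; the function names and the bracket are illustrative assumptions.

```python
import math


def area(r, h):
    """Total surface area of a closed cylinder."""
    return 2.0 * math.pi * r * h + 2.0 * math.pi * r * r


def constrained_area(r, volume):
    """Area as a function of r alone, with h fixed by V = pi r^2 h."""
    h = volume / (math.pi * r * r)
    return area(r, h)


def extremal_cylinder(volume, lo=1e-3, hi=10.0, iterations=200):
    """Ternary search for the radius giving the stationary (minimum) area."""
    for _ in range(iterations):
        m1 = lo + (hi - lo) / 3.0
        m2 = hi - (hi - lo) / 3.0
        if constrained_area(m1, volume) < constrained_area(m2, volume):
            hi = m2
        else:
            lo = m1
    r = 0.5 * (lo + hi)
    h = volume / (math.pi * r * r)
    return r, h
```

For unit volume, `extremal_cylinder(1.0)` should return a pair with $h/r$ very close to 2, and substituting $\lambda = -2/r$ into eqns C.93 and C.94 at that point should give values close to zero.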

The electromagnetic spectrum

D

Fig. D.1 The electromagnetic spectrum. The energy of a photon is shown as a temperature T = E/kB in K and as an energy E in eV. The corresponding frequency f is shown in Hz and, because the unit is often quoted in spectroscopy, in cm⁻¹. The cm⁻¹ scale is marked with some common molecular transitions and excitations (the typical ranges for molecular rotations and vibrations are shown, together with the C–H bending and stretching modes). The energies of typical π and σ bonds are also shown. The wavelength λ = c/f of the photon is shown (where c is the speed of light). The particular temperatures marked on the temperature scale are TCMB (the temperature of the cosmic microwave background), the boiling points of liquid helium (⁴He) and nitrogen (N₂), both at atmospheric pressure, and also the value of room temperature. Other abbreviations on this diagram are IR = infrared, UV = ultraviolet, R = red, G = green, V = violet. The letter H marks 13.6 eV, the magnitude of the energy of the 1s electron in hydrogen. The frequency axis also contains descriptions of the main regions of the electromagnetic spectrum: radio, microwave, infrared (both 'near' and 'far'), optical and UV.
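The different scales on this diagram are related by $E = k_{\rm B}T = hf$ and $\lambda = c/f$, with the spectroscopic wavenumber given by $f/c$ expressed in cm⁻¹. The following minimal sketch converts a temperature into the other scales. The function and dictionary key names are illustrative assumptions; the constants are the exact SI values.

```python
K_B = 1.380649e-23         # Boltzmann constant, J/K
H_PLANCK = 6.62607015e-34  # Planck constant, J s
E_CHARGE = 1.602176634e-19 # elementary charge, C (so 1 eV in J)
C_LIGHT = 299792458.0      # speed of light, m/s


def photon_scales(temperature_K):
    """Energy, frequency, wavenumber and wavelength equivalent to T = E/k_B."""
    energy_J = K_B * temperature_K
    frequency_Hz = energy_J / H_PLANCK
    return {
        "energy_eV": energy_J / E_CHARGE,
        "frequency_Hz": frequency_Hz,
        "wavenumber_per_cm": frequency_Hz / (C_LIGHT * 100.0),
        "wavelength_m": C_LIGHT / frequency_Hz,
    }


def temperature_from_eV(energy_eV):
    """Temperature T = E/k_B corresponding to an energy in eV."""
    return energy_eV * E_CHARGE / K_B
```

For instance, `photon_scales(300.0)` gives the room-temperature values (an energy of about 0.026 eV), and `temperature_from_eV(13.6)` gives the temperature equivalent of the hydrogen 1s energy marked H on the diagram.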

E

Some thermodynamical definitions

• System = whatever part of the Universe we select.
• Open systems can exchange particles with their surroundings.
• Closed systems cannot.
• An isolated system is not influenced from outside its boundaries.
• Adiathermal = without flow of heat. A system bounded by adiathermal walls is thermally isolated. Any work done on such a system produces adiathermal change.
• Diathermal walls allow flow of heat. Two systems separated by diathermal walls are said to be in thermal contact.
• Adiabatic = adiathermal and reversible (often used synonymously with adiathermal).
• Put a system in thermal contact with some new surroundings. Heat flows and/or work is done. Eventually no further change takes place: the system is said to be in a state of thermal equilibrium.
• A quasistatic process is one carried out so slowly that the system passes through a series of equilibrium states so is always in equilibrium. A process which is quasistatic and has no hysteresis is said to be reversible.
• Isobaric = at constant pressure.
• Isochoric = at constant volume.
• Isenthalpic = at constant enthalpy.
• Isentropic = at constant entropy.
• Isothermal = at constant temperature.

Thermodynamic expansion formulae

F

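$$
\begin{array}{l|ccccccc}
 & (\ast)_T & (\ast)_P & (\ast)_V & (\ast)_S & (\ast)_U & (\ast)_H & (\ast)_F \\
\hline
(\partial G) & -1 & -S/V & \kappa S - \alpha V & \alpha S - C_p/T & S(T\alpha - P\kappa) - C_p + PV\alpha & S(T\alpha - 1) - C_p & S - P(\kappa S - V\alpha) \\
(\partial F) & -\kappa P & -(S/V) - P\alpha & \kappa S & \alpha S - P\kappa C_V/T & S(T\alpha - P\kappa) - P\kappa C_V & S(T\alpha - 1) - P(\kappa C_V + V\alpha) & 0 \\
(\partial H) & T\alpha - 1 & C_p/V & -\kappa C_V - V\alpha & -C_p/T & P(\kappa C_V + V\alpha) - C_p & 0 & \\
(\partial U) & T\alpha - P\kappa & (C_p/V) - P\alpha & -\kappa C_V & -P\kappa C_V/T & 0 & & \\
(\partial S) & \alpha & C_p/(TV) & -\kappa C_V/T & 0 & & & \\
(\partial V) & \kappa & \alpha & 0 & & & & \\
(\partial P) & -1/V & 0 & & & & &
\end{array}
$$

Table F.1 Expansion formulae for first-order partial derivatives of thermal variables. (After E. W. Dearden, Eur. J. Phys. 16 76 (1995).)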

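Table F.1 contains a listing of various partial derivatives, some of which have been derived in this book. To evaluate a partial derivative, one has to take the ratio of two terms in this table using the equation

$$\left(\frac{\partial x}{\partial y}\right)_z \equiv \frac{(\partial x)_z}{(\partial y)_z}. \tag{F.1}$$

Note that $(\partial A)_B \equiv -(\partial B)_A$.

Example F.1 To evaluate the Joule-Kelvin coefficient:

$$\mu_{\rm JK} = \left(\frac{\partial T}{\partial P}\right)_H = \frac{(\partial T)_H}{(\partial P)_H} = \frac{-(\partial H)_T}{-(\partial H)_P} = \frac{V(T\alpha - 1)}{C_p}. \tag{F.2}$$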
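The use of the table can be sketched in a few lines of code. The example below stores only the $(\ast)_T$ and $(\ast)_P$ columns of Table F.1 for the rows needed, applies the rule $(\partial A)_B = -(\partial B)_A$ for entries stored the other way round, and forms ratios as in eqn F.1. The function names, the dictionary layout and the choice of a monatomic ideal gas (for which $\alpha = 1/T$ and $\kappa = 1/P$) as a test case are illustrative assumptions.

```python
def table_entries(T, P, V, alpha, kappa, Cp):
    """Selected entries (dX)_Y from the T and P columns of Table F.1."""
    return {
        ("P", "T"): -1.0 / V,
        ("V", "T"): kappa,
        ("S", "T"): alpha,
        ("U", "T"): T * alpha - P * kappa,
        ("H", "T"): T * alpha - 1.0,
        ("V", "P"): alpha,
        ("S", "P"): Cp / (T * V),
        ("H", "P"): Cp / V,
        ("U", "P"): Cp / V - P * alpha,
    }


def bracket(entries, a, b):
    """(da)_b, using (dA)_B = -(dB)_A when only the reversed entry is stored."""
    if a == b:
        return 0.0
    if (a, b) in entries:
        return entries[(a, b)]
    return -entries[(b, a)]


def partial(entries, x, y, z):
    """(dx/dy)_z = (dx)_z / (dy)_z, as in eqn F.1."""
    return bracket(entries, x, z) / bracket(entries, y, z)


# One mole of a monatomic ideal gas at 300 K and 1e5 Pa (illustrative values).
R = 8.314
T0, P0 = 300.0, 1.0e5
V0 = R * T0 / P0
gas = table_entries(T0, P0, V0, alpha=1.0 / T0, kappa=1.0 / P0, Cp=2.5 * R)

mu_JK = partial(gas, "T", "P", "H")    # Joule-Kelvin coefficient
dS_dV_T = partial(gas, "S", "V", "T")  # equals alpha/kappa
dU_dV_T = partial(gas, "U", "V", "T")  # vanishes for an ideal gas
```

For the ideal gas the Joule-Kelvin coefficient and $(\partial U/\partial V)_T$ both vanish, and $(\partial S/\partial V)_T = \alpha/\kappa = P/T$, which is the value expected from the Maxwell relation $(\partial S/\partial V)_T = (\partial P/\partial T)_V$.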

G

Reduced mass

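Consider two particles with masses $m_1$ and $m_2$ located at positions $\mathbf{r}_1$ and $\mathbf{r}_2$ and held together by a force $\mathbf{F}(r)$ that depends only on the distance $r = |\mathbf{r}| = |\mathbf{r}_1 - \mathbf{r}_2|$ (see Fig. G.1). Thus we have

$$m_1 \ddot{\mathbf{r}}_1 = \mathbf{F}(r), \tag{G.1}$$

$$m_2 \ddot{\mathbf{r}}_2 = -\mathbf{F}(r), \tag{G.2}$$

and hence

$$\ddot{\mathbf{r}} = (m_1^{-1} + m_2^{-1})\,\mathbf{F}(r), \tag{G.4}$$

which can be written

$$\mu \ddot{\mathbf{r}} = \mathbf{F}(r), \tag{G.5}$$

where $\mu$ is the reduced mass given by

$$\frac{1}{\mu} = \frac{1}{m_1} + \frac{1}{m_2}, \tag{G.6}$$

or equivalently

$$\mu = \frac{m_1 m_2}{m_1 + m_2}. \tag{G.7}$$

Fig. G.1 The forces exerted by two particles on one another.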
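As a small illustration of eqn G.7, the sketch below evaluates the reduced mass for two limiting cases and for the electron and proton in a hydrogen atom. The function name and the rounded values of the particle masses are assumptions made for the example.

```python
def reduced_mass(m1, m2):
    """Reduced mass mu = m1 m2 / (m1 + m2), as in eqn G.7."""
    return m1 * m2 / (m1 + m2)


M_ELECTRON = 9.1093837015e-31  # kg
M_PROTON = 1.67262192369e-27   # kg

# Ratio of the reduced mass of the electron-proton system to the electron mass.
hydrogen_ratio = reduced_mass(M_ELECTRON, M_PROTON) / M_ELECTRON
```

Two equal masses $m$ give $\mu = m/2$, while for $m_2 \gg m_1$ the reduced mass approaches $m_1$; for hydrogen the ratio is therefore only slightly less than one.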

Glossary of main symbols

α damping constant
α_λ spectral absorptivity
β = 1/(k_B T)
γ adiabatic index
γ surface tension
Γ(n) gamma function
δ skin depth
ε Seebeck coefficient
ε_0 permittivity of free space
ζ(s) Riemann zeta function
η viscosity
θ(x) Heaviside step function
κ thermal conductivity
Λ relativistic thermal wavelength
λ mean free path
λ wavelength
λ_th thermal wavelength
µ chemical potential
µ_0 permeability of free space
µ^⊖ chemical potential at STP
µ_J Joule coefficient
µ_JK Joule-Kelvin coefficient
ν frequency
π = 3.1415926535 . . .
Π momentum flux
Π Peltier coefficient
ρ density
ρ resistivity
ρ_J Jeans density
Σ local entropy production
Φ flux
χ magnetic susceptibility
χ(t − t′) response function
χ(t) response function
ψ(r) wave function
Ω solid angle
Ω potential energy
Ω(E) number of microstates with energy E
ω angular frequency
A availability
A area
A albedo
A_21 Einstein coefficient
A_12 Einstein coefficient
B_12 Einstein coefficient
B bulk modulus
B magnetic field
B_T bulk modulus at constant temperature
B_S bulk modulus at constant entropy
B_ν radiance or surface brightness in a frequency interval
B_λ radiance or surface brightness in a wavelength interval
B(T) virial coefficient as a function of T
C heat capacity
C number of chemically distinct constituents
C capacitance
CMB cosmic microwave background
c speed of light
c specific heat capacity
D coefficient of self-diffusion
E electric field

E electr