4-Vector Notation


To some extent, this discussion of 4-vector notation runs a little ahead of the later introduction of some key implications of special relativity. However, there are a number of key aspects of special relativity, associated with space-time and energy-momentum, which 4-vectors can help to explain. Of course, you can always jump ahead and then return to the idea of 4-vectors at a later point. With this said, we will start anchored in the idea of the Lorentz transforms, which were described as a logical extension of the Galilean transforms that imposes the requirement that the speed of light [c] be the same in all frames of reference. For the purposes of this discussion, we will simply re-state the primary Lorentz transforms as reference:

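[1]      $x' = \gamma\,(x - vt); \quad y' = y; \quad z' = z; \quad t' = \gamma\left(t - \frac{vx}{c^2}\right); \quad \gamma = \frac{1}{\sqrt{1 - v^2/c^2}}$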

The concept of a 4-vector is, unsurprisingly, linked to the definition of spacetime in 4 dimensions, i.e. time plus 3 spatial directions. As such, we might introduce a position 4-vector [X] as follows:

[2]      $X = [x_0, x_1, x_2, x_3] = [ct, x\hat{i}, y\hat{j}, z\hat{k}] = [ct, \vec{R}]$

In the first set of brackets [x0..x3], we see the generalisation of the 4 components that describe spacetime, normalised to the units of distance, such that [x0=ct]. In contrast, [x1,x2,x3] represent the spatial components that we might associate with the unit vectors [ijk] and magnitudes [xyz], which in turn can be simplified to the vector sum [R]. If we consider the form of [2] and apply this logic to [1], we end up with the Lorentz transforms aligned to the requirements of 4-vectors:

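[3]      $x_0' = \gamma\,(x_0 - \beta x_1); \quad x_1' = \gamma\,(x_1 - \beta x_0); \quad x_2' = x_2; \quad x_3' = x_3; \quad \beta = \frac{v}{c}; \quad \gamma = \frac{1}{\sqrt{1 - \beta^2}}$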

At this point, we will simply introduce the concept of a ‘spacetime interval [s]’ that corresponds to a separation in spacetime, which we might initially consider to be normalised to the units of distance:

[4]       $s^2 = (ct)^2 - x^2 - y^2 - z^2$

However, for the sake of simplicity, we shall continue the discussion by considering only one spatial axis [x] and adopt the nomenclature introduced in [3] to produce [5] as the equivalent of [4]:

[5]      $s^2 = x_0^2 - x_1^2$

For the spacetime interval [s] to have the property of ‘invariance’, such that all observers determine [s] to be the same in all frames of reference, then we would need to show that the following equation is true:

[6]      $(x_0')^2 - (x_1')^2 = x_0^2 - x_1^2$

We can demonstrate this property by substituting the equations in [3] into [6]:

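[7]      $(x_0')^2 - (x_1')^2 = \gamma^2 (x_0 - \beta x_1)^2 - \gamma^2 (x_1 - \beta x_0)^2 = \gamma^2 (1 - \beta^2)(x_0^2 - x_1^2) = x_0^2 - x_1^2$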
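The substitution of [3] into [6] can also be checked numerically. The following is a minimal sketch, assuming the one-dimensional form of the transforms with [x0=ct] and [β=v/c]; the function names `boost` and `interval_squared` are illustrative only and the event values are arbitrary.

```python
import math

def boost(x0, x1, beta):
    """Apply the 1-D Lorentz transform to (x0, x1), where x0 = ct and beta = v/c."""
    gamma = 1.0 / math.sqrt(1.0 - beta ** 2)
    return gamma * (x0 - beta * x1), gamma * (x1 - beta * x0)

def interval_squared(x0, x1):
    """Spacetime interval squared along one axis: s^2 = x0^2 - x1^2."""
    return x0 ** 2 - x1 ** 2

# Arbitrary illustrative event, in any consistent unit of distance.
x0, x1 = 5.0, 3.0

# The interval should come out the same for every relative velocity.
for beta in (0.0, 0.3, 0.6, 0.9):
    x0p, x1p = boost(x0, x1, beta)
    print(f"beta={beta:.1f}  x0'={x0p:.4f}  x1'={x1p:.4f}  s^2={interval_squared(x0p, x1p):.4f}")
```

Although the transformed components change with each choice of [β], the printed value of s² should agree to within floating-point rounding.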

So, based on the Lorentz transforms, the quantity defined as the spacetime interval [s] is shown to be invariant for all observers, irrespective of their relative velocity [v].

However, let us return to the form of [4] and consider the initial implications of this definition of spacetime in terms of a simple graph of observed time [t] against distance [x]. In the diagram right, an observer sees an object travel from [A] to [B], which corresponds to a distance of 3 light-seconds covered in a time of 5 seconds. In this local frame, the velocity [v] of the object is 0.6 of the speed of light [c]. Within the context of this diagram, the speed of light [c] is shown as a line at 45° to the vertical, which represents the maximum speed under special relativity. As such, we have created the basis of what is called a spacetime diagram, which presents a view of the spacetime interval [s] between [A] and [B].

So how might we initially interpret the spacetime interval [s]?

Actually, there are several interpretations that will be discussed in another section, but for now, we might focus on 2 perspectives. One perspective represents the time and distance, as shown by the time and distance axes, in which the object moves from [A] to [B] with velocity [v=0.6c]. The second perspective could be described as an observer co-moving with the object from [A] to [B], such that the relative velocity [v] of the object is zero. What we might realise from this description is that the distance offset [x] in this secondary frame remains zero for the co-moving observer, such that the spacetime interval [s] must be perceived in terms of time [t] only. However, we have also shown that the measure of the spacetime interval [s] is invariant for all observers. Given the information provided about the first frame of reference, we can calculate [s] as follows:

[8]       $s^2 = (ct)^2 - x^2 = 5^2 - 3^2 = 16 \;\Rightarrow\; s = 4 \text{ light-seconds}$

However, for the reasons stated above, the secondary co-moving observer must perceive the spacetime interval [s] in terms of time [t] only. In fact, we might better describe this situation in terms of a clock travelling between [A] and [B], where the elapsed time on the clock would be 4 seconds, while 5 seconds would have elapsed in the stationary frame selected. To gain some further insight, we might re-arrange [8] by substituting for [x=vt]:

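[9]       $s^2 = (ct)^2 - (vt)^2 = (ct)^2\left(1 - \frac{v^2}{c^2}\right) \;\Rightarrow\; s = ct\sqrt{1 - v^2/c^2}$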
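The worked example of the object travelling 3 light-seconds in 5 seconds can be reproduced in a few lines. This is only a sketch, assuming units in which [c] is 1 light-second per second; the function names `interval` and `proper_time` are illustrative.

```python
import math

C = 1.0  # speed of light in light-seconds per second (unit choice for this example)

def interval(t, x, c=C):
    """Spacetime interval s from elapsed time t and distance x along one axis."""
    return math.sqrt((c * t) ** 2 - x ** 2)

def proper_time(t, v, c=C):
    """Elapsed time on a clock moving at velocity v while t elapses in the chosen frame."""
    return t * math.sqrt(1.0 - (v / c) ** 2)

t, x = 5.0, 3.0   # seconds and light-seconds, as in the diagram
v = x / t         # 0.6c in these units

print("interval s   =", interval(t, x), "light-seconds")
print("clock time   =", proper_time(t, v), "seconds")
```

Both routes give a value of 4, up to floating-point rounding: the interval as 4 light-seconds in the observer's frame and the elapsed time as 4 seconds on the co-moving clock.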

In [9], we have retained the consistency of the units, in terms of distance, by adopting [ct] to normalise the units of time. However, by definition, it is clear that the spacetime interval can be a composite of both time and distance, plus we have cited a specific example in which this interval is perceived in terms of time only. We might formalise this ‘duality’ as follows:

[10]     $ds = c\,d\tau = c\,dt\sqrt{1 - v^2/c^2} \;\Rightarrow\; d\tau = dt\sqrt{1 - \beta^2}$

The term [dτ] relates to the ‘proper time [τ]’ and corresponds to the time on the clock co-moving between 2 points in spacetime. This concept will now allow us to extend the description of 4-vectors beyond the initial description of the position 4-vector [X].

[11]    11

As in classical physics, any change in the position 4-vector with time [t] also suggests some form of velocity [v]. However, the velocity 4-vector [V] requires the time to be invariant for all frames of reference and, as such, we might consider the previous definition of ‘proper time [τ]’ along with the assumption that velocity [V] is orientated along the [x1] axis:

[12]    $V = \frac{dX}{d\tau} = \left[\frac{dx_0}{d\tau}, \frac{dx_1}{d\tau}\right]$

However, we might recognise that [dx0=c.dt] such that we might substitute [10] into [12]:

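[13]    $V = \left[\frac{c\,dt}{d\tau}, \frac{dx_1}{d\tau}\right] = \left[\frac{c}{\sqrt{1-\beta^2}}, \frac{dx_1}{d\tau}\right]$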

We might also realise that we can use the chain rule to expand the spatial vector component:

[14]    $\frac{dx_1}{d\tau} = \frac{dx_1}{dt}\,\frac{dt}{d\tau} = \frac{v}{\sqrt{1-\beta^2}}$

If we now substitute for [14] back into [13]:

[15]    $V = \left[\frac{c}{\sqrt{1-\beta^2}}, \frac{v}{\sqrt{1-\beta^2}}\right]$
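To put some numbers to the velocity 4-vector, the sketch below evaluates both components for the earlier example of an object moving at 0.6c. It assumes motion along the [x1] axis only; the function name `four_velocity` is illustrative.

```python
import math

C = 299_792_458.0  # speed of light in metres per second

def four_velocity(v, c=C):
    """Return (V0, V1) for motion along the x1 axis: both components scale by 1/sqrt(1 - beta^2)."""
    gamma = 1.0 / math.sqrt(1.0 - (v / c) ** 2)
    return gamma * c, gamma * v

for fraction in (0.0, 0.1, 0.6, 0.9):
    v0, v1 = four_velocity(fraction * C)
    # Express the components as multiples of c for readability.
    print(f"v={fraction:.1f}c  V0={v0 / C:.4f}c  V1={v1 / C:.4f}c")
```

At rest, the 4-velocity reduces to [c, 0], while at 0.6c the scalar component grows to 1.25c and the spatial component to 0.75c.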

Having established a definition for the velocity 4-vector, we might in turn extend this definition to momentum [P]:

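[16]    $P = m_0 V = [P_0, P_1] = \left[\frac{m_0 c}{\sqrt{1-\beta^2}}, \frac{m_0 v}{\sqrt{1-\beta^2}}\right]$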

The vector component of [16] might be readily interpreted as the relativistic momentum of a particle of rest mass [m0] moving along the x-axis with velocity [v]. As the velocity [v] reduces to non-relativistic speeds, the denominator approaches unity and the momentum converges back towards the Newtonian form [ρ=mv]. However, the interpretation of the first scalar component [P0] is not so obvious, as simply collapsing the velocity [v], implicit in [β=v/c], to zero presents the form [P0=m0c], which does not necessarily convey any obvious physical meaning. However, we might be able to pursue an inference of [P0] if we first expand the expression using a binomial series of the form:

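[17]   $P_0 = m_0 c\,(1-\beta^2)^{-1/2} = m_0 c\left(1 + \frac{1}{2}\beta^2 + \frac{3}{8}\beta^4 + \dots\right)$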

In isolation, this expansion might not actually appear to help, but we might get another insight into the relationship between energy and momentum by multiplying both sides of [17] by [c], such that the units of momentum become the units of energy:

[18]   $P_0 c = m_0 c^2 + \frac{1}{2} m_0 v^2 + \frac{3}{8}\frac{m_0 v^4}{c^2} + \dots$

What might now begin to emerge from [18] is that the scalar [P0] is representative of the total energy of a particle with a rest mass [m0]. The first term reflects Einstein’s famous equation that relates mass and energy, while the second term corresponds to the Newtonian expression for kinetic energy.

But what are the implications of the additional terms in the binomial series in [18]?

The introduction of 4-vectors is highlighting that the Newtonian expression for kinetic energy is only an approximation, which requires the higher order terms as [v] approaches [c].
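The size of the higher-order terms can be illustrated numerically. The sketch below compares the Newtonian kinetic energy with the full relativistic value, taken here as the total energy minus the rest energy, and with the series truncated after the v⁴ term. It assumes units in which [m0] and [c] are both 1; the function names are illustrative.

```python
import math

def kinetic_exact(m0, v, c):
    """Relativistic kinetic energy: total energy minus rest energy."""
    gamma = 1.0 / math.sqrt(1.0 - (v / c) ** 2)
    return (gamma - 1.0) * m0 * c ** 2

def kinetic_newton(m0, v):
    """Newtonian kinetic energy, i.e. the second term of the binomial series."""
    return 0.5 * m0 * v ** 2

def kinetic_series(m0, v, c):
    """Binomial series for kinetic energy truncated after the v^4 term."""
    return 0.5 * m0 * v ** 2 + 0.375 * m0 * v ** 4 / c ** 2

m0, c = 1.0, 1.0  # unit choice for this illustration only
for beta in (0.01, 0.1, 0.5, 0.9):
    v = beta * c
    exact = kinetic_exact(m0, v, c)
    print(f"beta={beta:<4}  newton/exact={kinetic_newton(m0, v) / exact:.4f}  "
          f"series/exact={kinetic_series(m0, v, c) / exact:.4f}")
```

At low velocities both ratios sit very close to 1, but the Newtonian value falls increasingly short of the exact value as [v] approaches [c], where even the truncated series is no longer adequate.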

Note: A subsequent discussion related to the derivation of kinetic energy is entitled 'A Matter of Energy' that might be of interest at this point.

However, dividing [18] by [c] allows us to attach some physical meaning to [P0=E/c] and substitute this meaning back into [16]:

[19]    $P = [P_0, P_1] = \left[\frac{E}{c}, \frac{m_0 v}{\sqrt{1-\beta^2}}\right]$

We can take the physical interpretation of [19] one step further, but first we need to clarify that all 4-vectors obey the same dot product arithmetic that was implicit in the spacetime interval, when restricting the definition to just the x-axis.

[20]    $X \cdot X = x_0^2 - x_1^2 = s^2; \quad P \cdot P = P_0^2 - P_1^2$

If we now substitute the solution in [16] into [20]:

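[21]    $P \cdot P = P_0^2 - P_1^2 = \frac{m_0^2 c^2}{1-\beta^2} - \frac{m_0^2 v^2}{1-\beta^2} = \frac{m_0^2 c^2 (1-\beta^2)}{1-\beta^2} = m_0^2 c^2$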

We can verify the solution on the right by remembering that the solution of [21] must be invariant in all frames of reference, including the co-moving case, where [v=0], which then simplifies [21] to [m0²c²]. We can now generalise the solution by replacing [P0] as indicated in [19] and simply presenting [P1] as the momentum vector [ρ]:

[22]    $\frac{E^2}{c^2} - \rho^2 = m_0^2 c^2 \;\Rightarrow\; E^2 = \rho^2 c^2 + m_0^2 c^4$

As such, we have arrived at the relativistic energy of a particle expressed in terms of its extended kinetic energy linked to the momentum vector and scalar rest mass energy. So coupling the Lorentz transforms, as shown in [3], with the idea of 4-vectors has demonstrated how special relativity came to change Newtonian physics, not only in terms of space and time, but equally in terms of momentum and energy.
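As a closing check, the invariance of the momentum 4-vector and the resulting energy-momentum relation can be verified numerically. This is a minimal sketch, assuming motion along a single axis and units in which [m0] and [c] are both 1; the function names `four_momentum` and `invariant` are illustrative.

```python
import math

def four_momentum(m0, v, c):
    """Return (P0, P1) = (E/c, relativistic momentum) for motion along one axis."""
    gamma = 1.0 / math.sqrt(1.0 - (v / c) ** 2)
    return gamma * m0 * c, gamma * m0 * v

def invariant(p0, p1):
    """4-vector dot product restricted to one spatial axis: P0^2 - P1^2."""
    return p0 ** 2 - p1 ** 2

m0, c = 1.0, 1.0  # unit choice for this illustration only
for beta in (0.0, 0.3, 0.6, 0.9):
    p0, p1 = four_momentum(m0, beta * c, c)
    energy = p0 * c
    # Compare E^2 against (momentum * c)^2 + (rest energy)^2.
    rhs = (p1 * c) ** 2 + (m0 * c ** 2) ** 2
    print(f"beta={beta:.1f}  P.P={invariant(p0, p1):.6f}  E^2={energy ** 2:.6f}  p^2c^2+m0^2c^4={rhs:.6f}")
```

For every velocity tried, the dot product should remain at [m0²c²] and the two energy expressions should agree to within floating-point rounding.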