How much has the sea surface warmed?

One of the longest instrumental records we have of the changing climate

Ships have routinely and systematically measured sea-surface temperature (SST) since the mid-19th century. SST records are therefore one of the longest instrumental records we have of the changing climate. Combined with data from land stations, they form the basis of estimates of global temperature. Given the importance accorded to temperature change since the pre-industrial period, it is important to understand how reliable SST measurements are.

How was sea-surface temperature measured and why does it matter?

Historically, sea-surface temperature has been measured in three principal ways. Early on, ships used wooden or canvas buckets to draw water up to the deck, where a thermometer was used to measure the temperature of the water sample. In the mid-20th century, new insulated buckets were introduced after it was realised that heat loss from the widely used canvas buckets was influencing the recorded SSTs, making the readings a little cooler than the true SST.

Starting from around the 1930s, measurements were taken of the temperature of water drawn in below the surface for various purposes, such as cooling the ship’s engines. Field studies have shown that this method typically gives readings that are a little higher than the true SST. The engine room method is now the most common way of making SST measurements from ships and it still runs a little warm.

In the late 1970s, buoys of various kinds were developed. These have dedicated SST sensors, and nowadays drifting and moored buoys give near-global coverage coupled with relatively high accuracy; measurements from drifting buoys have a typical accuracy of a few tenths of a degree.

When bringing these different sources of data together, it’s important to account for the relative differences between the measurement methods. If this isn’t done, then changes in the proportions of each kind of measurement will lead to slight but noticeable changes in the estimated temperature that are not due to changes in the climate.
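The effect of a changing mix of methods can be illustrated with a small sketch. The bias values and the constant "true" temperature below are made-up numbers chosen only for illustration (they are not taken from any data set), and the function names are hypothetical. The point is that the average reported temperature drifts as the share of engine room readings grows, even though the water temperature never changes.

```python
# Illustrative values only; none of these numbers come from a real data set.
TRUE_SST = 15.0      # assumed constant "true" sea-surface temperature, in °C
BUCKET_BIAS = -0.3   # assumed cool bias of uninsulated bucket readings, in °C
ENGINE_BIAS = 0.3    # assumed warm bias of engine room readings, in °C


def mean_reported_sst(engine_fraction):
    """Average reported SST when a given fraction of readings are engine room."""
    bucket_fraction = 1.0 - engine_fraction
    return (bucket_fraction * (TRUE_SST + BUCKET_BIAS)
            + engine_fraction * (TRUE_SST + ENGINE_BIAS))


def apparent_change(start_fraction, end_fraction):
    """Change in the average caused only by the shift between methods."""
    return mean_reported_sst(end_fraction) - mean_reported_sst(start_fraction)


if __name__ == "__main__":
    # The true SST is fixed, so any change printed here is an artefact.
    for fraction in (0.0, 0.25, 0.5, 0.75, 1.0):
        print(f"engine room share {fraction:.2f}: "
              f"mean reported SST {mean_reported_sst(fraction):.2f} °C")
```

With these assumed values, moving from an all-bucket fleet to an all-engine-room fleet would look like a warming equal to the difference between the two biases, which is why the adjustment has to be made before the records are combined.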

Three teams have addressed the issue of biases, the term used for the systematic warming or cooling caused by different measurement methods. The National Oceanic and Atmospheric Administration (NOAA) in the US uses differences between air temperatures over the oceans and sea-surface temperatures to separate real changes in SST from biases due to shifts in instrumentation. The Met Office previously used a physical model of how buckets lose heat on average, combined with estimates of engine room biases from papers and technical reports, to remove artificial changes. A similar approach was taken by a team from the Japan Meteorological Agency.

Comparisons of the data sets produced by these three centres showed differences between them. We expect some differences because each centre uses its own processing method. We also expect that there will be differences between the true SST and the estimated SST, as there are with any measurement in the sciences. The NOAA and Met Office data sets include estimates of how large this discrepancy might be. Each puts a range on its data in which we would expect to find the true SST. This range is referred to as the uncertainty. However, in some times and places the ranges do not overlap, which suggests that one or the other of the data sets is incorrect and that the uncertainty is underestimated.
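The overlap test described above amounts to a simple interval comparison. The sketch below assumes each data set reports a central estimate and a symmetric half-width for its uncertainty range; that representation, the function names and the example numbers are illustrative assumptions, not the format used by either centre.

```python
def uncertainty_range(estimate, half_width):
    """Return the (low, high) range around an estimate."""
    return (estimate - half_width, estimate + half_width)


def ranges_overlap(range_a, range_b):
    """True if two (low, high) ranges share at least one value."""
    low_a, high_a = range_a
    low_b, high_b = range_b
    return low_a <= high_b and low_b <= high_a


if __name__ == "__main__":
    # Made-up anomaly estimates (°C) for one place and time.
    dataset_a = uncertainty_range(0.10, 0.03)
    dataset_b = uncertainty_range(0.20, 0.03)
    if ranges_overlap(dataset_a, dataset_b):
        print("Ranges overlap: the two estimates are consistent.")
    else:
        print("Ranges do not overlap: at least one uncertainty "
              "is probably too small.")
```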

There are a number of reasons why the data sets might disagree. In no particular order: there are residual biases in the air temperature measurements used by NOAA to adjust their SSTs (the air temperatures are already corrected for changing ship height); information about how measurements were made (particularly which ship used which method) is limited; other metadata, such as ship call signs, are not available; and estimates of the biases associated with each method might be wrong.

Using oceanographic measurements to improve our SST records

We tried to understand some of these difficulties by comparing SST measurements to water temperature measurements made by research vessels and drifting buoys. These are generally of higher quality than SST measurements from ships in the voluntary observing ship (VOS) fleet, so they provide a good benchmark for assessing biases in the SST data. We used the comparisons to make new estimates of engine room biases and modern bucket biases. We also used them to try to refine our understanding of how many ships employed each method and to pin down the timing of the shift from uninsulated to insulated buckets, which happened in the mid-20th century.
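In its simplest form, a benchmark comparison of this kind can be sketched as the average difference between paired readings. The example below is a minimal illustration rather than the method used in the study: it assumes the ship and reference measurements have already been matched in space and time, and the function name and the sample values are hypothetical.

```python
from math import sqrt
from statistics import mean, stdev


def estimate_bias(ship_temps, reference_temps):
    """Estimate the mean ship-minus-reference difference and its standard error.

    Both inputs are equal-length lists of temperatures in °C, where each
    ship reading is assumed to be paired with a nearby reference reading.
    """
    if len(ship_temps) != len(reference_temps):
        raise ValueError("each ship reading needs one reference reading")
    if len(ship_temps) < 2:
        raise ValueError("at least two pairs are needed")
    differences = [ship - ref for ship, ref in zip(ship_temps, reference_temps)]
    bias = mean(differences)
    standard_error = stdev(differences) / sqrt(len(differences))
    return bias, standard_error


if __name__ == "__main__":
    # Made-up paired readings, for illustration only.
    ship = [15.4, 16.1, 14.9, 15.8]
    reference = [15.1, 15.9, 14.6, 15.6]
    bias, standard_error = estimate_bias(ship, reference)
    print(f"estimated bias: {bias:+.2f} °C (standard error {standard_error:.2f})")
```

A positive result means the ship readings run warm relative to the benchmark, a negative one that they run cool. A real analysis would also need to account for how well the pairs are matched and for errors in the reference measurements themselves.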


There were some interesting results. First, ships making engine room measurements have seen a gradual reduction in bias over time. In the 1960s and 1970s the measurements were biased warm by up to around 0.5°C. By the 2000s, that bias had been greatly reduced. The reduction in bias has had an impact on recent temperature trends in the unadjusted data, leading to a slight underestimate of the temperature increase since the early 2000s, an effect previously noted and corrected by NOAA.

Second, we found that modern buckets tend to gain heat rather than lose it as their pre-modern counterparts did. Previous studies have focused more on the mechanisms of heat loss – evaporation, cooling by the air – but our results suggest that solar effects might also be important in some places at certain times of year.

Third, using these new estimates to reduce the effect of biases in the combined data set, we found that air temperatures cooled relative to sea-surface temperatures in the early 1990s. This is likely due to residual uncorrected biases in air or sea temperatures, although the possibility that the relative change is real cannot be ruled out.

These findings make only a small difference to our understanding of the temperature change since the late 19th century.

Where next?

As well as improving our understanding of how changing instrumentation affects global temperature trends, the study highlights some areas where more work is needed. Metadata remains a key area for improvement. To remove the effects of instrumentation changes, we need metadata for how measurements were made. And to better understand the contribution of individual ships, metadata identifying ships is needed.