The Banana-Doughnut Debate Explained

I have written this somewhat simplified account after many requests from colleagues and students to explain what the series of articles, Comments and Replies in Geophysical Journal is about. It is not intended to replace that discussion, which should be the ultimate source of information. We ourselves have struggled to understand some of the issues raised by our colleagues, and to clarify our own writings in a more accessible language - we are aware that the original literature is not easily accessible.

Why go bananas?

For more than a hundred years, seismologists have treated seismic P and S waves as if they satisfy the same laws as optical rays (ray theory). This approach has been extraordinarily successful: it enabled Beno Gutenberg to determine the radius of the Earth's core in 1913, Inge Lehmann to discover the tiny inner core in 1936, and Harold Jeffreys and Keith Bullen to derive a spherical Earth model that could accurately predict seismic travel times and be used to locate earthquakes by reading the arrival times of P and S waves from seismograms recorded far away.

But even after those gigantic accomplishments, ray theory continued to help seismologists make fundamental discoveries. By the 1970s, it had become evident that a spherical Earth model was not sufficient: some regions in the Earth are seismically slower or faster than others, basically because they are warmer or cooler. Researchers at MIT and Harvard, led by Keiti Aki and Adam Dziewonski, pioneered the technique of seismic tomography. This gave us the first tangible evidence that the deep seismicity near the ocean trenches is actually caused by oceanic lithosphere sinking back into Earth's mantle. In 1990, an undergraduate at Utrecht, Suzan van der Lee - now at Northwestern University - came up with the first image of a slab (the Aegean) sinking into the lower mantle for her senior thesis project; because this image ran counter to the prevailing opinion that the less dense upper mantle did not mix with heavier lower mantle rock, it took a while to double-check it with her advisors and get it into print [1].

At the same time Rob van der Hilst - also at Utrecht, now at MIT - came up with extensive evidence that the slabs around the Pacific were able to sink into the lower mantle [2]. These findings seemed to make mincemeat of the so-called two-layer convection model. This model states that Earth cools off with little or no exchange of mass between upper and lower mantle. It was successful in explaining geochemical observations. These required a hidden reservoir where argon and helium could be held since Earth's formation without escaping into the atmosphere, and the lower mantle was the obvious candidate for that as long as its rock stayed where it was: deep.

But after all these successes, limitations of ray theory became apparent. Wavelengths of seismic waves often measure in the hundreds of kilometers, and this makes it difficult to see small objects with seismic waves. This is very similar to the limitation of the optical microscope, which cannot resolve objects below a certain size, given by the Fresnel zone. If we treat seismic waves like optical rays, they must also have Fresnel zones. Because seismic rays curve back to the surface of Earth, these zones take on the shape of bananas, as is shown in the following figure:

Figure 1. This banana-shaped curve shows the Fresnel zones for seismic waves: regions that influence a seismic P wave inside Earth. The outer curve is for a P wave with a dominant period of 4 s, the innermost one for a period of 1 s. Objects smaller than the Fresnel zone cannot be seen using ray theory (Source: G. Nolet, Imaging the deep earth: technical possibilities and theoretical limitations, Proc. XXIIth Assembly ESC Barcelona 1990, ed. A. Roca, 107-115, 1991).
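
To get a feeling for the numbers, here is a minimal sketch in Python. It assumes a homogeneous medium with a straight ray (so the zone is a cigar rather than a banana), and it takes the first Fresnel zone to be all points whose detour path is at most half a wavelength longer than the ray. The velocity of 10 km/s and the path length of 8000 km are illustrative assumptions, not values read from Figure 1.

```python
import math

def fresnel_width_km(period_s, velocity_km_s=10.0, path_km=8000.0):
    """Approximate maximum width (km) of the first Fresnel zone.

    Assumes a homogeneous medium and a straight ray (illustrative only).
    A point at offset h from the path midpoint lengthens the path by about
    2*h**2/path_km; setting this detour equal to half a wavelength gives
    h = sqrt(wavelength*path)/2, i.e. a full width of sqrt(wavelength*path).
    """
    wavelength_km = velocity_km_s * period_s
    return math.sqrt(wavelength_km * path_km)

for period in (1.0, 4.0):
    print(f"period {period:.0f} s: width ~ {fresnel_width_km(period):.0f} km")
```

Under these assumptions the width grows only as the square root of the period: going from 1 s to 4 s doubles the width of the zone.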

Why doughnuts?

The simple concept of a Fresnel zone gave us an idea of what we could resolve in the limit of ray theory, but it gave us no way to improve on it. But independently from the work on P and S waves, seismologists working with very low-frequency waves (surface waves and normal modes) had already figured out how complicated the actual sensitivity of seismic waves is. Figure 2 shows that the sensitivity for a normal mode is also spread out over the surface of the Earth in a band-like structure (though it looks more like a caterpillar than a banana).

Figure 2. At one (very low) frequency the sensitivity of a normal mode to Earth's structure is band-like (Source: J. Woodhouse and T.P. Girnius, Geophys. J. Roy. astr. Soc., 68, 653-673, 1982).

In the mid 1980s, Roel Snieder at Utrecht (now at Colorado School of Mines) was the first to derive similar sensitivities for surface waves at much higher frequencies. He also attempted to interpret perturbations in such seismic waveforms tomographically, using a first order perturbation theory. But it was not until about ten years later that it became evident that one could get a Fresnel zone-like sensitivity kernel by summing over many such surface wave modes, as shown in Figure 3.


Figure 3. Summing normal modes yields a banana-like sensitivity for S waves (Source: X.-D. Li and T. Tanimoto, Geophys. J. Int., 112, 92-102, 1993)

These kernels still look very much like the Fresnel zone sketched in Figure 1. The doughnut hole made its appearance in the theory when postdoc Henk Marquering at Princeton started to look at the sensitivity of arrival times of seismic S waves, using techniques very similar to those used earlier by Snieder and by Li and Tanimoto.

Marquering's results were initially baffling and seemed to contradict every seismology textbook published in the 20th century: the location of the ray itself was shown to be a region of zero sensitivity for the travel time of the wave! In striking contrast, ray theory defines the ray as the only region where the travel time is sensitive to Earth's structure, and the two theories are in obvious conflict with each other. But after grappling for a while with this paradox we understood it (an explanation is provided in [18]), and extensive numerical tests soon proved that it was, in fact, ray theory that is deficient, not the new finite frequency theory. Princeton's Tony Dahlen soon developed a very efficient, though approximate, way to compute the sensitivity kernels for travel time (named banana-doughnut kernels by Marquering because the region of zero sensitivity creates a doughnut hole at the center of the banana). Figure 4 shows two such kernels, computed with Dahlen's theory [3].

Figure 4. Sections from banana-doughnut kernels, seen from the side and in cross-section, for a P wave at an epicentral distance of 60 degrees. For a dominant period of 20 s, the doughnut hole is massive. For a shorter period (2 s) the hole is much narrower and only clearly visible in the white line that denotes the amplitude of sensitivity across the midpoint of the kernel (Source: Dahlen et al., Geophys. J. Int., 141, 157-174, 2000).
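
The way the hole widens with period can be sketched numerically. The Python fragment below is only an illustration under strong assumptions: a homogeneous medium, a straight ray, and a cross-path shape governed by the integral of omega^3 P(omega) sin(omega dT) over frequency, divided by the integral of omega^2 P(omega), where dT is the detour time of a scatterer and P the power spectrum of the pulse. That frequency dependence is what I take from the paraxial form of the kernel in [3]; geometrical spreading, sign and constants are dropped, and the Gaussian spectrum, the velocity of 10 km/s and the path length of 6000 km are all assumed values.

```python
import numpy as np

def cross_path_shape(period_s, offsets_km, velocity_km_s=10.0, path_km=6000.0):
    """Unnormalized cross-path shape of a travel-time kernel at the path midpoint.

    Illustrative assumptions: homogeneous medium, straight ray, Gaussian power
    spectrum centred on the dominant frequency; geometrical spreading, sign
    and constants are dropped.
    """
    w0 = 2.0 * np.pi / period_s                      # dominant angular frequency
    w = np.linspace(0.01 * w0, 3.0 * w0, 2000)       # frequency grid
    power = np.exp(-((w - w0) / (0.3 * w0)) ** 2)    # assumed power spectrum
    # Paraxial detour time of a scatterer at offset h from the midpoint.
    detour_s = 2.0 * offsets_km ** 2 / (path_km * velocity_km_s)
    numerator = (w ** 3 * power * np.sin(np.outer(detour_s, w))).sum(axis=1)
    return numerator / (w ** 2 * power).sum()

offsets = np.linspace(0.0, 1000.0, 1001)             # km away from the ray
for period in (2.0, 20.0):
    shape = cross_path_shape(period, offsets)
    peak = offsets[np.argmax(np.abs(shape))]
    print(f"period {period:.0f} s: on-ray value {shape[0]:.1f}, peak at ~{peak:.0f} km")
```

On the ray the detour time is zero, so sin(omega dT) vanishes at every frequency and the sensitivity is zero whatever the spectrum. The offset at which the sensitivity peaks, a rough measure of the size of the hole, is larger for the 20 s wave than for the 2 s wave.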

Why dispute it?

The concept that travel times cannot be influenced by Earth structures located on the ray itself is counterintuitive, and has met with resistance of two kinds. Many - even very esteemed - seismologists initially expressed disbelief, but that hesitancy usually disappeared when we showed how well the theory worked when tested on purely synthetic seismograms computed with the pseudospectral method (see the publications of Shu-Huei Hung and Adam Baig in Geophys. J. Int. [4-8]). The second resistance was of the "it doesn't make a difference in practice" type. This ignores that a short period delay, e.g. as reported by the ISC, has a very narrow sensitivity kernel, whereas a long period delay can have a sensitivity kernel with a width of 1000 km or more. If a delay is visible in the ISC data set but not in the long period data set, this tells us something about the size of the heterogeneity. Montelli et al. [24] used this successfully to image lower mantle plumes. Since then, tomographic studies by several authors [21-23] have shown that one can exploit the frequency dependence of the sensitivity kernels even further by using up to seven filter bands, and so significantly increase resolution. Indeed, the kernels soon turned out to make a first decisive contribution to the resolution of small features when Princeton graduate student Raffaella Montelli started to test the new theory on actual data and - to her own surprise - discovered more than a dozen plumes (hot upwellings under islands like Hawaii) in the lower mantle of Earth [24]. By coincidence, this discovery came at a time when the plume hypothesis itself was under fire. I suspect that this has helped to fuel the debate about the banana-doughnut kernels (paradoxically, plume opponents and I may soon find common ground, since analysis of the lower mantle plumes seems to raise significant problems for the model of whole mantle convection and to bring us back to a very limited mass exchange between upper and lower mantle).

So the kernels do make a difference! Not everyone agrees, however. Martijn de Hoop (now at Purdue) and Rob van der Hilst wrote a paper in Geophysical Journal [9] in which they claim that, despite "the sensitivity being identical zero on the unperturbed source-receiver ray", "the kernel itself does not have a zero on the unperturbed ray". The discussion is very much about mathematical niceties that an ordinary seismologist (such as me) would never wish to worry about, but in the end this is nonsense in our view, as we have detailed in our Comment (Dahlen, F.A. and G. Nolet, Geophys. J. Int., 163, 949-951, 2005) [10].

One particular aspect of the paper by de Hoop and van der Hilst may lead readers astray, and for a proper understanding of finite frequency theory we explore it in some detail. They claim that one can mollify kernels to make the zero sensitivity disappear. They state that this mollifying happens implicitly when the matched filter that is used in the measurement of the time delay is not exactly equal to the observed P or S pulse, or when it is time-shifted, e.g. as a consequence of an error in the earthquake's origin time.

First of all, this view reflects a basic misconception of the role of banana-doughnut kernels: they are designed to interpret the difference between the matched and observed waveforms! This has nothing to do with the zero sensitivity on the ray. The argument that origin time errors can annihilate the zero sensitivity is correct, but only if the tomographer allows origin time bias to propagate into the model - something I teach my students to avoid at all costs. The rather complicated mathematical arguments can be paraphrased as follows:

Assume we have a tomographic system of equations with a model m, data d, time errors e, and a matrix A (containing the banana-doughnut kernels in some discretized version):

A m = d + e

De Hoop and van der Hilst map the error e back into the model by a transformation e=Em, and rephrase the system as:

(A-E) m = d

to conclude that when one reads back the kernels in the new mollified matrix (A-E), these kernels have no zeroes on the ray. Magic? Well, this is correct in so far as one could actually do the mapping e=Em. That is a problem unless the errors can be adequately modeled by m and, most importantly, unless one actually knows the errors e - but why then not subtract the error from the data? If the errors, on the other hand, represent a systematic shift (e.g. because all origin times are estimated too early), no one would do tomography this way, because it maps the bias explicitly into the model. In practice, tomographers center the distribution of delay times around 0 precisely to prevent such bias from affecting the model. This point was not adequately addressed by de Hoop and van der Hilst in their Reply [11].
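
The algebra can be checked on a toy system. The Python sketch below is purely illustrative: the matrix A, the model m and the constant origin-time error are made-up numbers, and the matrix E built here (the outer product of e and m divided by m.m) is just one of many matrices that satisfy e = Em. It is not the construction used in [9].

```python
import numpy as np

n_data, n_model = 6, 5
ray = n_model // 2                                   # cell containing the ray

# Toy kernel rows: sensitivity grows away from the ray and is zero on it.
A = np.abs(np.arange(n_model) - ray)[None, :] * (1.0 + 0.1 * np.arange(n_data))[:, None]

m = np.array([0.3, -0.2, 0.5, 0.1, -0.4])            # hypothetical "true" model
e = np.full(n_data, 0.7)                             # same origin-time error on every datum
d = A @ m - e                                        # observed delays, so that A m = d + e

# Mapping e = E m: this particular E can only be built if e and m are known.
E = np.outer(e, m) / (m @ m)
A_mollified = A - E

print("A on the ray:    ", A[:, ray])
print("(A-E) on the ray:", A_mollified[:, ray])
print("(A-E) m == d:    ", np.allclose(A_mollified @ m, d))

# Centering: subtract the mean from the delays and from each kernel column.
center = np.eye(n_data) - np.ones((n_data, n_data)) / n_data
print("bias removed:    ", np.allclose(center @ d, center @ (A @ m)))
```

The rows of (A-E) indeed have no zero on the ray, but only because the bias has been written into the matrix using knowledge of both e and m. Centering the delays, on the other hand, removes a constant shift without any such knowledge.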

The second attack by van der Hilst and de Hoop is in another paper published in Geophysical Journal [12]. The argument is mainly that the beneficial effects of banana-doughnut kernels are limited to a small magnification of amplitude anomalies that could easily have been accomplished by reducing the damping (or regularization) of the inversion that is at the basis of every tomographic interpretation. We disagree with the authors that there is that much freedom in choosing one's damping strategy, because data errors are known quite well and the misfit of the data (or chi-square in the language of statistics) cannot be too large. We reject the examples shown by the authors to illustrate their point of view because they are selected for cases where the Fresnel zone is narrow (near the surface) or where the anomalies are much larger than the Fresnel zone (slabs). We also reject their statistical analysis because it fails to take into account that wavefront healing has both positive and negative signs - resulting in a null result when averaging over all effects, as the authors do. But these papers are much easier to understand and we refer the interested reader to [12] and to a preprint of our response.
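
How the data errors constrain the damping can be illustrated with a small synthetic experiment. The Python sketch below uses simple norm damping on a random matrix as a stand-in for a real tomographic system; the matrix, the model, the noise level sigma and the range of damping values are all assumptions made for the illustration, not values from any of the studies cited here.

```python
import numpy as np

rng = np.random.default_rng(0)
n_data, n_model, sigma = 200, 50, 0.5                # sigma: assumed known data error (s)

A = rng.normal(size=(n_data, n_model))               # stand-in for discretized kernels
m_true = rng.normal(size=n_model)
d = A @ m_true + rng.normal(scale=sigma, size=n_data)

def damped_solution(damping):
    """Minimize |A m - d|^2 + damping * |m|^2 (simple norm damping)."""
    lhs = A.T @ A + damping * np.eye(n_model)
    return np.linalg.solve(lhs, A.T @ d)

dampings = np.logspace(-2, 4, 25)
chi2_per_datum = np.array([
    np.sum(((A @ damped_solution(lam) - d) / sigma) ** 2) / n_data
    for lam in dampings
])

for lam, chi2 in zip(dampings[::4], chi2_per_datum[::4]):
    print(f"damping {lam:10.2f}: chi2/N = {chi2:.2f}")
```

The misfit rises steadily with the damping. If the errors are known, only the damping values that bring chi-square per datum close to 1 are defensible: much weaker damping fits noise, much stronger damping leaves signal unexplained.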

As a footnote, there is also a paper that questions the beneficial effects of finite frequency inversions for surface wave phases [14]. Such kernels have been developed by Ying Zhou (now at Virginia Tech) [15-17]. In contrast to Zhou, Anne Sieminski does not invert for the 3D structure of the Earth, but attempts to retrieve a 2D map of phase velocities. This approach has two limitations. First of all, inverting data for one frequency only ignores the beneficial effects of the different widths of sensitivity kernels at different frequencies (the width is correlated with the depth of the sensitivity, but that is no reason not to exploit it). But most importantly there is again a conceptual misunderstanding: the concept of local phase velocity implies the validity of ray theory for surface waves! If wavefronts are deformed by heterogeneities off the geometrical raypath, their local speed is determined by their past history, and is not unique. Ying Zhou [15] has shown how large the approximations are that one has to make to adopt a local velocity, and that these approximations usually annihilate the beneficial effects of finite frequency interpretations. Still, if the path coverage is dense, and the kernel width need not be exploited to get a good resolution, even the local velocity approach - despite its significant shortcomings - can yield a phenomenal increase in resolution [19].

Further reading

The original paper to study finite-frequency tomography for body waves is Dahlen et al. (Geophys. J. Int., 2000) [3]. However, many may find this a hard nut to crack. A simplified derivation of the same results, together with a method to compute kernels in local 3D structures, can be found in [18] ("Dahlen for Dummies").


[1] Spakman, W., van der Lee, S. & van der Hilst, R., 1993. Travel-time tomography of the European-Mediterranean mantle down to 1400 km, Phys. Earth Planet. Inter., 79, 3-74.
[2] van der Hilst, R., Engdahl, E.R., Spakman, W. & Nolet, G., 1991. Tomographic imaging of subducted lithosphere below northwest Pacific island arcs, Nature, 353, 37-43.
[3] Dahlen, F.A., Hung, S.-H. & Nolet, G., 2000. Fréchet kernels for finite-frequency travel times - I. Theory, Geophys. J. Int., 141, 157-174.
[4] Baig, A. & Dahlen, F., 2004. Statistics of traveltimes and amplitudes in random media, Geophys. J. Int., 158, 187-210.
[5] Baig, A., Dahlen, F., & Hung, S.-H., 2003. Traveltimes of waves in three-dimensional random media, Geophys. J. Int., 153, 467-482.
[6] Hung, S.-H., Dahlen, F., & Nolet, G., 2000. Fréchet kernels for finite-frequency travel times - II. Examples, Geophys. J. Int., 141, 175-203.
[7] Hung, S.-H., Dahlen, F., & Nolet, G., 2001. Wavefront healing: a banana-doughnut perspective, Geophys. J. Int., 146, 289-312.
[8] Baig, A. & Dahlen, F., 2004. Traveltime biases in random media and the s-wave discrepancy, Geophys. J. Int., 158, 922-938.
[9] de Hoop, M. & van der Hilst, R., 2005. On sensitivity kernels for wave equation transmission tomography, Geophys. J. Int., 160, 621-633.
[10] Dahlen, F. & Nolet, G., 2005. Comment on the paper on sensitivity kernels for wave equation transmission tomography by de Hoop and van der Hilst, Geophys. J. Int., 163, 949-951.
[11] de Hoop, M. & van der Hilst, R., 2005. Reply to a comment by F.A. Dahlen and G. Nolet on: On sensitivity kernels for wave equation tomography, Geophys. J. Int., 163, 952-955.
[12] van der Hilst, R. & de Hoop, M., 2005. Banana-doughnut kernels and mantle tomography, Geophys. J. Int., 163, 956-961.
[14] Sieminski, A., Lévêque, J.-J., & Debayle, E., 2004. Can finite-frequency effects be accounted for in ray theory surface wave tomography?, Geophys. Res. Lett., 31, doi:10.1029/2004GL021402.
[15] Zhou, Y., Dahlen, F., & Nolet, G., 2004. Three-dimensional sensitivity kernels for surface wave observables, Geophys. J. Int., 158, 142-168.
[16] Zhou, Y., Dahlen, F., Nolet, G., & Laske, G., 2005. Finite-frequency effects in global surface wave tomography, Geophys. J. Int., 163, 1087-1111.
[17] Zhou, Y., Nolet, G., Dahlen, F., & Laske, G., 2005. Global upper mantle structure from finite-frequency surface-wave tomography, J. Geophys. Res., in press.
[18] Nolet, G., Dahlen, F., & Montelli, R., 2005. Traveltimes and amplitudes of seismic waves: a re-assessment, in: A. Levander and G. Nolet (eds.), Array analysis of broadband seismograms, AGU Monograph Ser., 157, 37-48.
[19] Yang, T. & Forsyth, D., 2006. Regional tomographic inversion of the amplitude and phase of Rayleigh waves with 2-D sensitivity kernels, Geophys. J. Int., in press.
[20] Nolet, G. & Montelli, R., 2005. Optimum parameterization of tomographic models, Geophys. J. Int., 161, 365-372.
[21] Yang, T., Shen, Y., van der Lee, S., Solomon, S.C. & Hung, S.-H., 2006. Upper mantle beneath the Azores hotspot from finite-frequency seismic tomography, Earth Planet. Sci. Lett., 250, 11-26.
[22] Hung, S.-H., Shen, Y. & Chiao, L.-Y., 2004. Imaging seismic velocity structure beneath the Iceland hotspot: a finite frequency approach, J. Geophys. Res., 109, B08305.
[23] Sigloch, K., McQuarrie, N. & Nolet, G., 2008. Two-stage subduction history under North America inferred from multiple-frequency tomography, Nature Geosci., 1, 458-462.
[24] Montelli, R., Nolet, G., Dahlen, F.A., Masters, G., Engdahl, E.R. & Hung, S.-H., 2004. Finite frequency tomography reveals a variety of plumes in the mantle, Science, 303, 338-343.