Re: [Sursound] Why do you need to decode ambisonic/b format signals ?

fons Sun, 23 Jan 2011 05:53:59 -0800

On Sun, Jan 23, 2011 at 12:39:19AM +0100, Jörn Nettingsmeier wrote:

> http://stackingdwarves.net/public_stuff/linux_audio/tmt10/TMT2010_J%c3%b6rn_Nettingsmeier-Higher_order_Ambisonics-Slides.pdf


Nice ! But I don't really agree with some of the reasoning :-)

There's a logical 'jump' in there which is not explained 
(and it would be hard to explain it):

A. In your first example (the Kirchhoff-Helmholtz Integral) the
   zillion mics sample the sound field on a surface surrounding
   the listener. This is pure WFS. 

B. Then you try to do the same thing with fewer mics, exploiting
   their polar patterns. But here the mics have to be coincident,
   not on a surface surrounding the listener.

Now I don't think you can say that (B) is some form of (A) scaled
down to a practical size. It is something completely different.
There is no continuous way from (B) to (A) by increasing the 
number of channels - the mics would remain coincident and not on
the surface, even in the 'infinite' limit case.

The basis of (B) is not Kirchhoff-Helmholtz, but Fourier-Bessel,
which is the expansion of a sound field in spherical coordinates.
Doing this it turns out that the angular dependency is given by
the spherical harmonic functions, and the radial one by the 
spherical Bessel functions.
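
To make the split into an angular and a radial part concrete, here is a
minimal sketch for the simplest case, a single plane wave. It uses the
standard expansion exp(i kr cos g) = sum over l of
(2l+1) i^l j_l(kr) P_l(cos g), where g is the angle between the wave
direction and the observation point. The Legendre polynomial P_l is what
the spherical harmonics collapse to for one direction. The function
names, the test values and the truncation order are my own choices for
illustration.

```python
import cmath
import math


def spherical_jn(order, x, terms=40):
    """Spherical Bessel function j_l(x), summed from its power series."""
    # First series term: x^l / (2l+1)!!
    term = x ** order
    for odd in range(1, 2 * order + 2, 2):
        term /= odd
    total = 0.0
    for k in range(terms):
        total += term
        term *= -(x * x / 2.0) / ((k + 1) * (2 * order + 2 * k + 3))
    return total


def legendre(order, u):
    """Legendre polynomial P_l(u) via the three-term recurrence."""
    p_prev, p = 1.0, u
    if order == 0:
        return p_prev
    for n in range(1, order):
        p_prev, p = p, ((2 * n + 1) * u * p - n * p_prev) / (n + 1)
    return p


def plane_wave_series(kr, cos_gamma, max_order):
    """Truncated expansion: radial part j_l(kr), angular part P_l(cos_gamma)."""
    total = 0j
    for l in range(max_order + 1):
        total += (2 * l + 1) * (1j ** l) * spherical_jn(l, kr) * legendre(l, cos_gamma)
    return total


kr = 3.0                                   # arbitrary test value
cos_gamma = math.cos(math.radians(40.0))   # arbitrary test angle
exact = cmath.exp(1j * kr * cos_gamma)
approx = plane_wave_series(kr, cos_gamma, max_order=12)
error = abs(exact - approx)
print("truncation error at order 12:", error)
```

Lowering max_order makes the error grow, and it grows faster for larger
kr: the order needed goes up with frequency and with distance from the
centre.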

The angular part, the spherical harmonics, can be seen as a sort
of spectral transform acting on the surface of a sphere. Just as
a 1-D FT maps a cyclic function (i.e. a function defined on a
circle) to a discrete spectrum, the spherical harmonics represent
the discrete spectrum of a function defined on the sphere (i.e. a
function of direction in 3-D space). The reason why this is not a
2-D FT is because a 2-D FT has a torus [*] as its domain, not a
sphere: on a torus azim and elev are independent, on a sphere they
are not.
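
A small numerical sketch of that last point, assuming real spherical
harmonics up to order 1 in N3D normalisation (my choice, other
conventions only change constant factors). The harmonics are orthonormal
when averaged over the sphere with the area weight cos(elev). If azim
and elev are treated as two independent flat coordinates, as they would
be on a torus, that property is lost.

```python
import math


def real_sh_order1(azim, elev):
    """Real spherical harmonics up to order 1 (N3D normalisation assumed)."""
    c = math.cos(elev)
    root3 = math.sqrt(3.0)
    return [1.0,
            root3 * c * math.cos(azim),
            root3 * c * math.sin(azim),
            root3 * math.sin(elev)]


def gram_matrix(use_area_weight, n_azim=32, n_elev=200):
    """Average of Y_a * Y_b over all directions, by midpoint quadrature."""
    d_az = 2.0 * math.pi / n_azim
    d_el = math.pi / n_elev
    gram = [[0.0] * 4 for _ in range(4)]
    norm = 0.0
    for i in range(n_azim):
        azim = (i + 0.5) * d_az
        for j in range(n_elev):
            elev = -math.pi / 2.0 + (j + 0.5) * d_el
            # On the sphere a patch at elevation elev has area ~ cos(elev).
            weight = math.cos(elev) if use_area_weight else 1.0
            norm += weight
            y = real_sh_order1(azim, elev)
            for a in range(4):
                for b in range(4):
                    gram[a][b] += weight * y[a] * y[b]
    return [[value / norm for value in row] for row in gram]


sphere = gram_matrix(use_area_weight=True)    # close to the identity matrix
flat = gram_matrix(use_area_weight=False)     # diagonal no longer all ones
for row in sphere:
    print([round(value, 3) for value in row])
```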

So each component of an AMB signal set is really an element of the
discrete spatial spectrum, not a sample of space or the sphere.

There is also a fundamental difference in the way a tetra mic and
spherical high order mics work, and it is similar to the difference
between (A) and (B). A conventional tetra mic would still work if
the capsules were really coincident, it uses the polar patterns
of the mics and the output signals correspond to spatial spectrum
components, this is (B). An HOA mic consisting of omni capsules in
free space samples the space, this is (A). Clearly the capsules must
not be coincident. Its signals do not represent spatial spectrum
components and have to be processed by a 'spherical harmonics transform'
to provide B-format. An HOA mic with omni capsules on a solid body
(e.g. Eigenmike) is a mix of the two, it's (A) at LF and (B) at HF
where diffraction by the solid body makes each individual capsule
directional.

An AMB decoder is in fact doing the inverse transform: from a spectral
representation to speaker signals which are 'samples in space'. 
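
As a rough illustration of 'spectrum in, spatial samples out', here is a
sketch of a very basic horizontal first-order decoder for a regular
polygon of speakers. It assumes the traditional W = S/sqrt(2) weighting
and a plain projection ('sampling') decode, with no shelf filters or
near-field compensation, so it is not a practical decoder. All function
names are made up for the example.

```python
import math


def encode_horizontal(signal, azim):
    """Encode one sample into horizontal first-order B-format (W, X, Y)."""
    return (signal / math.sqrt(2.0),
            signal * math.cos(azim),
            signal * math.sin(azim))


def speaker_azimuths(n_speakers):
    """Azimuths of a regular polygon layout, first speaker at 0 rad."""
    return [2.0 * math.pi * i / n_speakers for i in range(n_speakers)]


def decode_regular_polygon(w, x, y, n_speakers):
    """Projection decode: evaluate the spatial spectrum at each speaker."""
    feeds = []
    for spk in speaker_azimuths(n_speakers):
        feed = math.sqrt(2.0) * w + 2.0 * (x * math.cos(spk) + y * math.sin(spk))
        feeds.append(feed / n_speakers)
    return feeds


source_azim = math.radians(30.0)
w, x, y = encode_horizontal(1.0, source_azim)
feeds = decode_regular_polygon(w, x, y, n_speakers=6)

# Forward transform again: project the speaker feeds back onto the harmonics.
azims = speaker_azimuths(6)
w_back = sum(feeds) / math.sqrt(2.0)
x_back = sum(f * math.cos(a) for f, a in zip(feeds, azims))
y_back = sum(f * math.sin(a) for f, a in zip(feeds, azims))
print(feeds)
```

Re-encoding the speaker feeds gives back the original W, X, Y, which is
the sense in which the decode is an inverse transform here.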

Filippo Fazi does decoding for signals from an HOA mic in a different
way: he takes the original spatial samples, and expands them using
the Fourier-Bessel integral directly to speaker signals, without
ever going into the spatial spectral domain. It's a form of extra-
polation in fact and clearly illustrates the potential instability
(which when going to B-format instead must be handled by limiting
the frequency ranges of the higher order signals).
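
To give an idea of where the instability comes from, here is a
simplified sketch, not Filippo's method. It assumes an open sphere of
omni capsules, for which the pressure contributed by order l scales with
the spherical Bessel function j_l(kr), so recovering that order needs a
gain of roughly 1/|j_l(kr)|. The 5 cm radius and the test frequencies
are hypothetical values picked for the example.

```python
import math

SPEED_OF_SOUND = 343.0  # m/s, assumed


def spherical_jn(order, x, terms=30):
    """Spherical Bessel function j_l(x), summed from its power series."""
    term = x ** order
    for odd in range(1, 2 * order + 2, 2):
        term /= odd
    total = 0.0
    for k in range(terms):
        total += term
        term *= -(x * x / 2.0) / ((k + 1) * (2 * order + 2 * k + 3))
    return total


def mode_gain_db(order, freq_hz, radius_m):
    """Gain in dB needed to undo the radial term of one order."""
    kr = 2.0 * math.pi * freq_hz * radius_m / SPEED_OF_SOUND
    return 20.0 * math.log10(1.0 / abs(spherical_jn(order, kr)))


radius = 0.05  # hypothetical array radius in metres
for order in range(4):
    gains = [round(mode_gain_db(order, f, radius)) for f in (100.0, 1000.0, 4000.0)]
    print("order", order, "gain in dB at 100 Hz, 1 kHz, 4 kHz:", gains)
```

The required gain rises steeply with order at low frequencies, which is
why each higher order has to be restricted to the range where the gain
stays reasonable.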

[*] Suppose you have a function defined on an infinite 2-D plane, i.e.
a function of x,y. You could take the 2-D FT of this, providing a
spectrum. Now assume the function is cyclic in both x and y, with
period 2 pi. So it consists of an infinite number of identical square
tiles. Its FT is then discrete, consisting of isolated points. We take
a single tile (which still contains all information) and bend it so the
top and bottom ends meet, giving a cylinder. This makes the y-axis cyclic.
Now we bend the cylinder so its two ends meet, making the x-axis cyclic.
The result is a torus. Each point on it can be located by its original
x,y values, which now can be interpreted as azimuth and elevation, both
of them having a range of 2 pi and fully independent. 
You can't warp a square into a sphere this way: a sphere is a fundamentally
different type of surface, on which azim and elev are not independent.
So its spectrum is not a 2-D FT. It turns out to be spherical harmonics.
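
The torus half of this can be checked with a few lines. The sketch below
samples one 2 pi by 2 pi tile of cos(2x + 3y) on an 8 x 8 grid (function
and grid size are arbitrary choices) and takes a brute-force 2-D DFT.
Only two isolated bins come out non-zero.

```python
import cmath
import math


def dft2(samples):
    """Brute-force 2-D DFT of a square grid (fine for tiny sizes)."""
    n = len(samples)
    spectrum = [[0j] * n for _ in range(n)]
    for u in range(n):
        for v in range(n):
            acc = 0j
            for j in range(n):
                for k in range(n):
                    phase = -2.0 * math.pi * (u * j + v * k) / n
                    acc += samples[j][k] * cmath.exp(1j * phase)
            spectrum[u][v] = acc
    return spectrum


N = 8
step = 2.0 * math.pi / N
# One tile of a function that is cyclic in both x and y with period 2 pi.
tile = [[math.cos(2.0 * j * step + 3.0 * k * step) for k in range(N)]
        for j in range(N)]

spectrum = dft2(tile)
# Bins (6, 5) are the aliases of the negative frequencies (-2, -3).
peaks = [(u, v) for u in range(N) for v in range(N)
         if abs(spectrum[u][v]) > 1e-6]
print(peaks)
```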


Ciao,

-- 
FA

There are three of them, and Alleline.

_______________________________________________
Sursound mailing list
Sursound@music.vt.edu
https://mail.music.vt.edu/mailman/listinfo/sursound
