Re: [Sursound] Why do you need to decode ambisonic/b format signals ?

Jörn Nettingsmeier Sun, 23 Jan 2011 17:47:34 -0800

On 01/23/2011 02:53 PM, f...@kokkinizita.net wrote:

On Sun, Jan 23, 2011 at 12:39:19AM +0100, Jörn Nettingsmeier wrote:

http://stackingdwarves.net/public_stuff/linux_audio/tmt10/TMT2010_J%c3%b6rn_Nettingsmeier-Higher_order_Ambisonics-Slides.pdf


Nice ! But I don't really agree with some of the reasoning :-)

There's a logical 'jump' in there which is not explained
(and it would be hard to explain it):

A. In your first example (the Kirchhoff-Helmholtz Integral) the
    zillion mics sample the sound field on a surface surrounding
    the listener. This is pure WFS.

B. Then you try to do the same thing with fewer mics, exploiting
    their polar patterns. But here the mics have to be coincident,
    not on a surface surrounding the listener.

Now I don't think you can say that (B) is some form of (A) scaled
down to a practical size. It is something completely different.
There is no continuous way from (B) to (A) by increasing the
number of channels - the mics would remain coincident and not on
the surface, even in the 'infinite' limit case.

true. my reasoning is that instead of sampling the sphere on every point
on the surface, you use measuring microphones that look at the sphere
surface from the inside.
i should maybe emphasize that the size of the sphere is variable as a
function of order and wavelength.

The basis of (B) is not Kirchhoff-Helmholtz, but Fourier-Bessel,
which is the expansion of a sound field in spherical coordinates.
Doing this it turns out that the angular dependency is given by
the spherical harmonic functions, and the radial one by the
spherical Bessel functions.
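
For reference, the expansion described above is usually written as
follows for a source-free region around the origin. This is a sketch
of the textbook form, not something taken from the slides; the symbols
(wavenumber k, coefficients A_nm, spherical Bessel functions j_n,
spherical harmonics Y_n^m) are the conventional ones and are assumed
here.

```latex
p(r, \theta, \phi, k) \;=\; \sum_{n=0}^{\infty} \sum_{m=-n}^{n}
    A_{nm}(k)\, j_n(kr)\, Y_n^m(\theta, \phi)
```

Truncating the outer sum at n = N gives an order-N representation with
(N+1)^2 coefficients per frequency.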

in fact, i did think about tackling the problem from the fourier-bessel
angle, but i haven't yet gained enough insight to really try. plus i
find that the kh approach is somewhat more intuitive. and since both are
ultimately converging to "perfect", i thought i might get away with it.

The angular part, the spherical harmonics, can be seen as a sort
of spectral transform acting on the surface of a sphere. Just as
a 1-D FT maps a cyclic function (i.e. a function defined on a
circle) to a discrete spectrum, the spherical harmonics represent
the discrete spectrum of a function defined on the sphere (i.e. a
function of direction in 3-D space). The reason why this is not a
2-D FT is because a 2-D FT has a torus [*] as its domain, not a
sphere: on a torus azim and elev are independent, on a sphere they
are not.

So each component of an AMB signal set is really an element of the
discrete spatial spectrum, not a sample of space or the sphere.
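
A minimal sketch of that point, assuming a simple numerical quadrature
over the sphere and a normalization picked only for illustration (it is
not claimed to match FuMa, SN3D or any other convention). The function
names are hypothetical. It projects a directional function onto the
order-0 and order-1 components; each resulting number is an integral
over the whole sphere, not the value at any one point.

```python
import math

def project_order1(f, n_az=72, n_el=90):
    """Project f(az, el) onto the order-0/1 components (W, X, Y, Z).

    Midpoint-rule quadrature over the unit sphere. The scaling
    (1/4pi for W, 3/4pi for X, Y, Z) is an illustrative assumption.
    """
    w = x = y = z = 0.0
    d_az = 2 * math.pi / n_az
    d_el = math.pi / n_el
    for i in range(n_az):
        az = (i + 0.5) * d_az
        for j in range(n_el):
            el = -math.pi / 2 + (j + 0.5) * d_el
            d_omega = math.cos(el) * d_az * d_el  # area element
            v = f(az, el) * d_omega
            w += v
            x += v * math.cos(az) * math.cos(el)
            y += v * math.sin(az) * math.cos(el)
            z += v * math.sin(el)
    norm = 4 * math.pi
    return w / norm, 3 * x / norm, 3 * y / norm, 3 * z / norm

def cardioid(az0, el0):
    """Return a cardioid-shaped function of direction aimed at (az0, el0)."""
    ux0 = math.cos(az0) * math.cos(el0)
    uy0 = math.sin(az0) * math.cos(el0)
    uz0 = math.sin(el0)
    def pattern(az, el):
        dot = (math.cos(az) * math.cos(el) * ux0
               + math.sin(az) * math.cos(el) * uy0
               + math.sin(el) * uz0)
        return 0.5 * (1 + dot)
    return pattern

# A cardioid aimed to the left (azimuth 90 degrees, elevation 0).
w, x, y, z = project_order1(cardioid(math.pi / 2, 0.0))
print(w, x, y, z)
```

With this scaling the four numbers come out close to 0.5, 0, 0.5, 0:
the whole pattern is described by four spectral coefficients.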


true.

it does not correspond to any particular point on the sphere, but it is
a sample nonetheless. when explaining that to students, i often use an
image analogy:
when i want to transmit an image of a meadow, it's best to just paint
one big green pixel first. not much, but already gives you a rough idea
it's not going to be about polar bears.
next, divide it into four, the upper two blue, the lower two green. then
sixteen, and so on, and eventually we get the cow and the birds. point
is, we will be able to make out the cow and the birds way earlier than
if we had used very small pixels right from the start and displayed them
left-to-right, top-to-bottom.

that is hopefully similar to the discrete sampling approach of the naive
KH microphone curtain vs. spherical decomposition to get a reasonably
correct idea across.
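
The coarse-to-fine idea in the image analogy can be sketched in a few
lines. This is only an illustration of the analogy, with a made-up 4x4
"meadow" (1 = sky, 0 = grass, 2 = cow) and a hypothetical helper name;
it is not an Ambisonic operation.

```python
def block_means(img, blocks):
    """Average a square image over a blocks x blocks grid of tiles."""
    n = len(img)
    size = n // blocks  # assumes n is divisible by blocks
    out = []
    for by in range(blocks):
        row = []
        for bx in range(blocks):
            tile = [img[yy][xx]
                    for yy in range(by * size, (by + 1) * size)
                    for xx in range(bx * size, (bx + 1) * size)]
            row.append(sum(tile) / len(tile))
        out.append(row)
    return out

meadow = [
    [1, 1, 1, 1],   # sky
    [1, 1, 1, 1],   # sky
    [0, 0, 0, 0],   # grass
    [0, 0, 2, 0],   # grass with a cow
]

for blocks in (1, 2, 4):  # one pixel, then four, then sixteen
    print(blocks, block_means(meadow, blocks))
```

Each refinement step adds detail everywhere at once, which is the
property the analogy is after.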

There is also a fundamental difference in the way a tetra mic and
spherical high order mics work, and it is similar to the difference
between (A) and (B). A conventional tetra mic would still work if
the capsules were really coincident, it uses the polar patterns
of the mics and the output signals correspond to spatial spectrum
components, this is (B). An HOA mic consisting of omni capsules in
free space samples the space, this is (A). Clearly the capsules must
not be coincident. Its signals do not represent spatial spectrum
components and have to be processed by a 'spherical harmonics transform'
to provide B-format. An HOA mic with omni capsules on a solid body
(e.g. Eigenmike) is a mix of the two, it's (A) at LF and (B) at HF
where diffraction by the solid body makes each individual capsule
directional.


also true, but alas, there is only so much you can do in 20 minutes :)
some things have to give.
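
A minimal sketch of case (B) for a tetrahedral mic, assuming ideal
coincident cardioid capsules and the common capsule naming FLU, FRD,
BLD, BRU (front-left-up and so on). The sums and differences below are
unscaled and leave out the frequency-dependent correction filters a
real mic with spaced capsules needs; all names are hypothetical.

```python
import math

S = 1 / math.sqrt(3)
# Capsule axes of an idealized tetrahedral array (assumed layout).
CAPSULES = {
    "FLU": (S, S, S),
    "FRD": (S, -S, -S),
    "BLD": (-S, S, -S),
    "BRU": (-S, -S, S),
}

def cardioid_gain(axis, source):
    """Gain of an ideal cardioid aimed along axis for a unit source direction."""
    dot = sum(a * s for a, s in zip(axis, source))
    return 0.5 * (1 + dot)

def a_to_b(flu, frd, bld, bru):
    """Unscaled A-format to first-order B-format sums and differences."""
    w = flu + frd + bld + bru   # omni
    x = flu + frd - bld - bru   # front minus back
    y = flu - frd + bld - bru   # left minus right
    z = flu - frd - bld + bru   # up minus down
    return w, x, y, z

def encode(source):
    """Capsule outputs for a plane wave from source, converted to W, X, Y, Z."""
    a = [cardioid_gain(CAPSULES[name], source)
         for name in ("FLU", "FRD", "BLD", "BRU")]
    return a_to_b(*a)

print(encode((1.0, 0.0, 0.0)))  # source straight ahead
print(encode((0.0, 0.0, 1.0)))  # source straight above
```

Nothing here depends on capsule positions, only on their polar
patterns, which is why the scheme would still work with truly
coincident capsules.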

i also fully expect to roast in hell for the simplistic decoding
matrices in the slides - i always add a huge bunch of caveats in my
presentation, but i know at least one unlucky soul who has tried to
implement them as-is. the point is: here's how it works in principle,
the hairy details omitted.
you can't imagine how difficult it is to sell the concept to
tonmeisters - the more experienced, the harder. no point in frightening
them off right from the start ;)

Filippo Fazi does decoding for signals from an HOA mic in a different
way: he takes the original spatial samples, and expands them using
the Fourier-Bessel integral directly to speaker signals, without
ever going into the spatial spectral domain. It's a form of extra-
polation in fact and clearly illustrates the potential instability
(which when going to B-format instead must be handled by limiting
the frequency ranges of the higher order signals).

i look forward to the day when i'll be able to digest filippo's papers.
incidentally, i have no pressing plans for the next decade, so there is
hope :)

[*] Suppose you have a function defined on an infinite 2-D plane, i.e.
a function of x,y. You could take the 2-D FT of this, providing a
spectrum. Now assume the function is cyclic in both x and y, with
period 2 pi. So it consists of an infinite number of identical square
tiles. Its FT is then discrete, consisting of isolated points. We take
a single tile (which still contains all information) and bend it so the
top and bottom ends meet, giving a cylinder. This makes the y-axis cyclic.
Now we bend the cylinder so its two ends meet, making the x-axis cyclic.
The result is a torus. Each point on it can be located by its original
x,y values, which now can be interpreted as azimuth and elevation, both
of them having a range of 2 pi and fully independent.
You can't warp a square into a sphere this way, it's a fundamentally
different type of surface on which azim and elev are not independent.
So its spectrum is not a 2-D FT. It turns out to be spherical harmonics.
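
The tile part of this argument can be checked numerically. The sketch
below is an illustration under assumptions of my own: an 8x8 sampling
of one tile and the test function cos(x) * cos(2y). It computes a plain
2-D DFT and lists the bins that are not zero.

```python
import cmath
import math

def dft2(tile):
    """Normalized 2-D DFT of a square tile (direct O(n^4) evaluation)."""
    n = len(tile)
    spec = [[0j] * n for _ in range(n)]
    for kx in range(n):
        for ky in range(n):
            s = 0j
            for ix in range(n):
                for iy in range(n):
                    phase = -2j * math.pi * (kx * ix + ky * iy) / n
                    s += tile[ix][iy] * cmath.exp(phase)
            spec[kx][ky] = s / (n * n)
    return spec

N = 8
# One 2*pi x 2*pi tile of a function that is cyclic in both x and y.
tile = [[math.cos(2 * math.pi * ix / N) * math.cos(2 * 2 * math.pi * iy / N)
         for iy in range(N)]
        for ix in range(N)]

spec = dft2(tile)
peaks = sorted((kx, ky)
               for kx in range(N) for ky in range(N)
               if abs(spec[kx][ky]) > 1e-9)
print(peaks)
```

Only four isolated bins survive, at x-frequency +/-1 and y-frequency
+/-2 (bins 7 and 6 are the negative frequencies). The two indices are
independent, which is the torus property that a sphere lacks.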


thanks for this explanation!

best,

jörn



--
Jörn Nettingsmeier
Lortzingstr. 11, 45128 Essen, Tel. +49 177 7937487

Meister für Veranstaltungstechnik (Bühne/Studio), Elektrofachkraft
Audio and event engineer - Ambisonic surround recordings

http://stackingdwarves.net

_______________________________________________
Sursound mailing list
Sursound@music.vt.edu
https://mail.music.vt.edu/mailman/listinfo/sursound
