On 2023-06-01, Jan Jacob Hofmann wrote:

is it possible/ reasonable to mix ambisonic encoded information of different order?

It's possible and it's reasonable, and as Fons Adriaensen said above, at the rather high orders you're talking about, it's not much below optimality either. This has also been talked about in the past, with the (granted, somewhat shocking) revelation to me and some others that orders mixed this way do *not* automatically decode optimally, in either the lower or the higher order decoder.

But theoretically, this ought to be purely a decoding side issue. When you're mixing into or in B-format, you're essentially dealing with an isotropic approximation of a soundfield around a central point. That approximation is always a physical one, and in ambisonic work it's going to be orthogonal by the basic math. If you want to add extra directional accuracy, you add orders to your directional decomposition. If you can't or won't, then you don't. But in the end, the fact that the (3D) Fourier-Bessel series, properly normalized, preserves the power of point sources and is an isotropic decomposition of an inbound far field guarantees that the *only* thing you lose at lower order is directional accuracy. When going to B-format, which is the representation meant to capture the physics, mixing two orders cannot lose anything.
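
To make that concrete, here's a minimal numpy sketch, assuming ACN channel ordering and a shared normalization (say SN3D) for both signals; the names and signal shapes are mine, purely for illustration. Mixing a lower order signal into a higher order one is nothing more than zero-padding the missing harmonics and summing.

import numpy as np

def mix_ambisonic(sig_a, sig_b):
    # Both inputs: (channels, samples) arrays in the same channel ordering
    # and normalization convention (e.g. ACN/SN3D).  The lower order signal
    # is zero-padded up to the higher channel count: its missing higher-order
    # harmonics are simply zero, so only directional sharpness is "lost".
    n_ch = max(sig_a.shape[0], sig_b.shape[0])
    n_smp = min(sig_a.shape[1], sig_b.shape[1])
    out = np.zeros((n_ch, n_smp))
    out[:sig_a.shape[0], :] += sig_a[:, :n_smp]
    out[:sig_b.shape[0], :] += sig_b[:, :n_smp]
    return out

# e.g. a first order (4 ch) reverb bed into a third order (16 ch) dry scene:
reverb = np.random.randn(4, 48000)
scene = np.random.randn(16, 48000)
mixed = mix_ambisonic(reverb, scene)   # shape (16, 48000)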

So the real trouble comes when decoding B-format into D-format. If you have a set of first order (POA) signals, you have one particular, optimal equation set for how you'd lay the sound out over your speakers. If you have a second order HOA signal running into something like 5.1, the optimal set differs quite a lot, especially at the higher frequencies, since the theory doesn't work by easy interference principles there, but by second order psychoacoustical ones, coming from the stereo work of Makita. Solving the problem optimally becomes rather finicky.
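
For a feel of the order dependence, here's the simplest possible mode-matching (pseudo-inverse) decoder, restricted to horizontal harmonics to keep it short. The layout and the complete lack of max-rE/dual-band shading are assumptions of this sketch, not a recipe for an actual decoder.

import numpy as np

def horizontal_sh(order, azimuths):
    # Real circular (horizontal-only) harmonics up to the given order,
    # evaluated at the speaker azimuths in radians.
    rows = [np.ones_like(azimuths)]
    for m in range(1, order + 1):
        rows.append(np.cos(m * azimuths))
        rows.append(np.sin(m * azimuths))
    return np.vstack(rows)               # (2*order+1, n_speakers)

def basic_decoder(order, azimuths):
    # Mode matching only: pseudo-inverse of the re-encoding matrix.  A real
    # decoder layers frequency-dependent (e.g. max-rE) weighting on top.
    return np.linalg.pinv(horizontal_sh(order, azimuths))

speakers = np.deg2rad([30.0, 110.0, 250.0, 330.0, 0.0])   # a 5.0-ish ring
D1 = basic_decoder(1, speakers)   # 5 x 3, first order
D2 = basic_decoder(2, speakers)   # 5 x 5, second order
# On an irregular layout like this, the first order columns of D2 are not
# the same as D1: the "optimal set" really does shift with the order.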

Then, solving it for mixed orders (not usually the term used for this situation, but for leaving out certain spherical harmonics, e.g. for horizontal, pantophonic work) is even messier. How could we know in decoding alone, blindly, that we have a superposition of, say, first and second (or arbitrary) order signals, so that we could apply the optimum decoding rule to them all at the same time?

I've been toying around with this problem for a decade or so, and haven't found a satisfactory solution to it all. My intuition says this has something to do with non-negative matrix factorization and convex optimization, but even if that's it, I'm not quite there yet.

On the Dolby Surround and HARPEX side of things, I've been toying around with doing them in the pure spherical harmonic domain to arbitrary order, towards a generalizable, infinite order decoder; on the DirAC side, with tensoring the STFT/MDCT domain with the directional Fourier domain, in the complex; and then with some classical LTI DSP, statistical learning and information/compression/rate-distortion theory on top. All in an effort to solve the problem of how to make full spatial audio pack well.

And then there was NFC-HOA. I was already making some progress, but that totally stopped me. In that one, you can mix several orders of signals, but suddenly you can't mix ones of separate radii. Fuck, back to the drawing board for me as well. :/

The sound-information (synthesized) is encoded in Ambisonic 7th order while the spatial reverberation of that very sound is encoded „only“ to third order.

In fact Fons asked you already: why go to such a high order? You'd need an extraordinary number of speakers to utilize such a signal, plus extraordinary computing power and a lot of real life measurement of your speaker rig to even align your decoding solution optimally. Whereas at a low, matched order, you can do it right with a day's computation time.
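
For scale: a full periphonic order-N signal carries (N+1)^2 spherical harmonic channels, so 7th order means 64 channels and at least that many sensibly placed speakers to exploit them, against 16 for 3rd order and 4 for first. A throwaway check:

def periphonic_channels(order):
    # Spherical harmonic channel count for a full 3D signal of this order.
    return (order + 1) ** 2

for n in (1, 3, 7):
    print(n, "->", periphonic_channels(n))   # 1 -> 4, 3 -> 16, 7 -> 64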

Reason for doing so: My reverberant information comes from several directions in space. If these did not have to be encoded all the way up to 7th order, it would save some calculation time and computation effort.

They really don't have to. Take a look at Ville Pulkki's DirAC work, here in Finland. The gist of it is that it reconstructs both specular sources and reverberation, separately. The first part is identified via time coherence and averaging, much like Dolby Surround does it over its four constrained channels, and like HARPEX does it better in the ambisonic framework.

Ville's work, however, is fully general and frequency dependent in its source recognition. And it goes beyond: it actually tries to identify reverberant modes from a SoundField recording, using the imaginary axis of the Fourier transform in time. Which has also been discussed years before on-list, when Angelo (I think) talked about his car interiors.
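
If you want the flavour of it, here's a rough per-bin sketch of DirAC-style parameters from first order STFT bins: direction from the active intensity vector, diffuseness from intensity magnitude against energy density. Normalization constants and the time averaging the real method relies on are glossed over, so treat it as a cartoon of the idea, not Ville's actual algorithm.

import numpy as np

def dirac_parameters(W, X, Y, Z):
    # W, X, Y, Z: complex STFT bins of a first order signal (scaling
    # convention glossed over).  The active intensity points at the specular
    # source; diffuseness compares its magnitude to the energy density.
    # Real DirAC also time-averages these quantities before using them.
    V = np.stack([X, Y, Z])
    intensity = np.real(np.conj(W) * V)          # active part, per bin
    energy = 0.5 * (np.abs(W) ** 2 + np.sum(np.abs(V) ** 2, axis=0))
    azimuth = np.arctan2(intensity[1], intensity[0])
    elevation = np.arctan2(intensity[2], np.hypot(intensity[0], intensity[1]))
    diffuseness = 1.0 - np.linalg.norm(intensity, axis=0) / np.maximum(energy, 1e-12)
    return azimuth, elevation, diffuseness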

Also the reverberant information may well be more „blurry“ in respect to the actual sound, as it may stay in the background of perception anyway.

So for reverberation, why not try out a SoundField for the measurement? The original Ambisonic mic? Because it's actually calibrated to measure not only the pointwise pressure, as its W channel, but also the velocity in X, Y and Z. The latter are where you get the reverberant, echoing, reactive field measurements from.
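
In code terms, the split the mic buys you is just the real versus imaginary part of the pressure-velocity cross-spectrum; again only a sketch, with the convention-dependent constants left out.

import numpy as np

def active_reactive(W, X, Y, Z):
    # Complex pressure-velocity cross-spectrum of a first order recording.
    # Real part: active intensity, net energy flow, dominated by the direct
    # sound.  Imaginary part: reactive intensity, the standing, non-
    # propagating energy where the reverberant, modal character lives.
    cross = np.conj(W) * np.stack([X, Y, Z])
    return np.real(cross), np.imag(cross)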

But my emphasis is on the question whether a decode of 3rd *and* 7th order information - ending up in one encoded file - would be mathematically correct when it comes to the decoding of the higher order content. Would something be missing (maybe an overall lower amplitude of the third order content)?

As said, nothing will be missing from the higher order content. As your order goes higher, in the higher order decoder, you'll get better and better decodings at the higher frequencies, just as Fons Adriaensen said above. Doing it right, you will necessarily start to approach the far field diffraction limit of your array, both low and high.

However, at the same time, your decode for the lower order will not be psychoacoustically optimal, and won't approach it by these principles. If you mix in lower order content, it won't decode optimally without severe extra work. At something like a 3rd versus 7th order split you probably won't hear the difference, but if you mix together even first and third order, you definitely will; an optimal third order decoder does not handle a superimposed first order signal to the degree a specialised first order rig (esp. four speaker panto or six speaker peri) would.
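
One concrete way to see it: the per-order max-rE gains a decoder applies depend on the decoder's own order. Using the usual closed-form approximation rE ~= cos(137.9 deg / (N + 1.51)) (due to Zotter and Frank, if I recall; an assumption of this sketch, not something from the thread), the order 0 and 1 weights of a third order decoder come out quite different from what a dedicated first order decoder would use:

import numpy as np
from numpy.polynomial.legendre import Legendre

def max_re_weights(order):
    # Per-order gains g_n = P_n(rE_max) for a 3D max-rE decode, with the
    # closed-form approximation rE_max ~= cos(137.9 deg / (N + 1.51)).
    r_e = np.cos(np.deg2rad(137.9) / (order + 1.51))
    return np.array([Legendre.basis(n)(r_e) for n in range(order + 1)])

print(max_re_weights(1))   # what a dedicated first order decoder would apply
print(max_re_weights(3))   # what a third order decoder applies to orders 0..3
# The order 1 entries differ (roughly 0.57 vs 0.86): a first order signal
# riding inside a third order decode gets the wrong shading for its own order.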

The higher order stuff will mix in, though, when done right. It will spread out to a lower order rig, even if the solution is rather difficult to find. E.g. on-list we've talked about many numerical solutions to the problems, like Wiggins's Tabu search. But if you try to apply the higher order optimization problem to the lower orders, it doesn't pan out.

My long term problem is how to at least partially, blindly, tell arbitrary order decompositions/additions apart from each other. I'm not there, even yet. :/
--
Sampo Syreeni, aka decoy - de...@iki.fi, http://decoy.iki.fi/front
+358-40-3751464, 025E D175 ABE5 027C 9494 EEB0 E090 8BA9 0509 85C2