On 2023-06-01, Jan Jacob Hofmann wrote:

is it possible/ reasonable to mix ambisonic encoded information of different order?

It's possible and it's reasonable, and as Fons Adriaensen said above, at the rather high orders you're talking about, it's not much below optimality either. This has also been talked about in the past, with the (granted, somewhat shocking) revelation to me and some others that orders mixed this way do *not* automatically decode optimally, in either the lower or the higher order decoder.

But theoretically, this ought to be purely a decoding side issue. When you're mixing into or in B-format, you're essentially dealing with an isotropic approximation of a soundfield around a central point. That approximation is always a physical one, and in ambisonic work it's going to be orthogonal by the basic math. If you want to add extra directional accuracy, you add orders to your directional decomposition. If you can't or won't, then you don't. But in the end, the fact that the (3D) Fourier-Bessel series, properly normalized, preserves the power of point sources and is an isotropic decomposition of an inbound far field guarantees that the *only* thing you lose at lower order is directional accuracy. When going to B-format, which is the representation meant to capture the physics, mixing two orders cannot lose anything.
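
To make that concrete, here's a minimal numpy sketch, assuming ACN channel ordering and a shared normalization (say SN3D) for both signals; the names and signal shapes are mine, purely for illustration. Mixing a lower order signal into a higher order one is nothing more than zero-padding the missing harmonics and summing.

import numpy as np

def mix_ambisonic(sig_a, sig_b):
    # Both inputs: (channels, samples) arrays in the same channel ordering
    # and normalization convention (e.g. ACN/SN3D).  The lower order signal
    # is zero-padded up to the higher channel count: its missing higher-order
    # harmonics are simply zero, so only directional sharpness is "lost".
    n_ch = max(sig_a.shape[0], sig_b.shape[0])
    n_smp = min(sig_a.shape[1], sig_b.shape[1])
    out = np.zeros((n_ch, n_smp))
    out[:sig_a.shape[0], :] += sig_a[:, :n_smp]
    out[:sig_b.shape[0], :] += sig_b[:, :n_smp]
    return out

# e.g. a first order (4 ch) reverb bed into a third order (16 ch) dry scene:
reverb = np.random.randn(4, 48000)
scene = np.random.randn(16, 48000)
mixed = mix_ambisonic(reverb, scene)   # shape (16, 48000)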

So the real trouble comes when decoding B-format into D-format. If you have a set of first order (POA) signals, you have one particular, optimal equation set for how you'd lay the sound out over your speakers. If you have a second order HOA signal running into something like 5.1, the optimal set differs quite a lot, especially at the higher frequencies, since the theory doesn't work by easy interference principles there, but by second order psychoacoustical ones, coming from the stereo work of Makita. Solving the problem optimally becomes rather finicky.
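
For a feel of the order dependence, here's the simplest possible mode-matching (pseudo-inverse) decoder, restricted to horizontal harmonics to keep it short. The layout and the complete lack of max-rE/dual-band shading are assumptions of this sketch, not a recipe for an actual decoder.

import numpy as np

def horizontal_sh(order, azimuths):
    # Real circular (horizontal-only) harmonics up to the given order,
    # evaluated at the speaker azimuths in radians.
    rows = [np.ones_like(azimuths)]
    for m in range(1, order + 1):
        rows.append(np.cos(m * azimuths))
        rows.append(np.sin(m * azimuths))
    return np.vstack(rows)               # (2*order+1, n_speakers)

def basic_decoder(order, azimuths):
    # Mode matching only: pseudo-inverse of the re-encoding matrix.  A real
    # decoder layers frequency-dependent (e.g. max-rE) weighting on top.
    return np.linalg.pinv(horizontal_sh(order, azimuths))

speakers = np.deg2rad([30.0, 110.0, 250.0, 330.0, 0.0])   # a 5.0-ish ring
D1 = basic_decoder(1, speakers)   # 5 x 3, first order
D2 = basic_decoder(2, speakers)   # 5 x 5, second order
# On an irregular layout like this, the first order columns of D2 are not
# the same as D1: the "optimal set" really does shift with the order.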

Then, solving it for mixed orders (not usually the term used for this situation, but for leaving out certain spherical harmonics, e.g. for horizontal, pantophonic work) is even messier. How could we know in decoding alone, blindly, that we have a superposition of, say, first and second (or arbitrary) order signals, so that we could apply the optimum decoding rule to them all at the same time?

I've been toying around with this problem for a decade or so, and haven't found a satisfactory solution to it all. My intuition says this has something to do with non-negative matrix factorization and convex optimization, but even if that's it, I'm not quite there yet.

On the Dolby Surround and HARPEX side of things, I've been toying around with doing them in the pure spherical harmonic domain to arbitrary order, towards a generalizable, infinite order decoder; on the DirAC side, with tensoring the STFT/MDCT domain with the directional Fourier domain, in the complex; and then with some classical LTI DSP, statistical learning and information/compression/rate-distortion theory on top. All in an effort to solve the problem of how to make full spatial audio pack well.

And then there was NFC-HOA. I was already making some progress, but that totally stopped me. In that one, you can mix several orders of signals, but suddenly you can't mix ones of separate radii. Fuck, back to the drawing board for me as well. :/

The sound-information (synthesized) is encoded in Ambisonic 7th order while the spatial reverberation of that very sound is encoded „only“ to third order.

In fact Fons asked you already: why go to such a high order? You'd need an extraordinary number of speakers to utilize such a signal, plus extraordinary computing power and a lot of real life measurement of your speaker rig to even align your decoding solution optimally. Whereas at a low, matched order, you can do it right with a day's computation time.
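
For scale: a full periphonic order-N signal carries (N+1)^2 spherical harmonic channels, so 7th order means 64 channels and at least that many sensibly placed speakers to exploit them, against 16 for 3rd order and 4 for first. A throwaway check:

def periphonic_channels(order):
    # Spherical harmonic channel count for a full 3D signal of this order.
    return (order + 1) ** 2

for n in (1, 3, 7):
    print(n, "->", periphonic_channels(n))   # 1 -> 4, 3 -> 16, 7 -> 64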

Reason for doing so: My reverberant information comes from several directions in space. If these did not have to be encoded all the way up to 7th order, it would save some calculation time and computation effort.

They really don't have to. Take a look at Ville Pulkki's DirAC work, here in Finland. The gist of it is that it reconstructs both specular sources and reverberation, separately. The first part is identified via time coherence and averaging, much like Dolby Surround does it over its four constrained channels, and like HARPEX does it better in the ambisonic framework.

Ville's work, however, is fully general and frequency dependent in its source recognition. And it goes beyond: it actually tries to identify reverberant modes from a SoundField recording, using the imaginary axis of the Fourier transform in time. Which has also been discussed years before on-list, when Angelo (I think) talked about his car interiors.
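
If you want the flavour of it, here's a rough per-bin sketch of DirAC-style parameters from first order STFT bins: direction from the active intensity vector, diffuseness from intensity magnitude against energy density. Normalization constants and the time averaging the real method relies on are glossed over, so treat it as a cartoon of the idea, not Ville's actual algorithm.

import numpy as np

def dirac_parameters(W, X, Y, Z):
    # W, X, Y, Z: complex STFT bins of a first order signal (scaling
    # convention glossed over).  The active intensity points at the specular
    # source; diffuseness compares its magnitude to the energy density.
    # Real DirAC also time-averages these quantities before using them.
    V = np.stack([X, Y, Z])
    intensity = np.real(np.conj(W) * V)          # active part, per bin
    energy = 0.5 * (np.abs(W) ** 2 + np.sum(np.abs(V) ** 2, axis=0))
    azimuth = np.arctan2(intensity[1], intensity[0])
    elevation = np.arctan2(intensity[2], np.hypot(intensity[0], intensity[1]))
    diffuseness = 1.0 - np.linalg.norm(intensity, axis=0) / np.maximum(energy, 1e-12)
    return azimuth, elevation, diffuseness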

Also the reverberant information may well be more „blurry“ in respect to the actual sound, as it may stay in the background of perception anyway.

So for reverberation, why not try out a SoundField for the measurement? The original Ambisonic mic? Because it's actually calibrated to measure not only the pointwise pressure, as its W channel, but also the velocity in X, Y and Z. The latter are where you get the reverberant, echoing, reactive field measurements from.
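
In code terms, the split the mic buys you is just the real versus imaginary part of the pressure-velocity cross-spectrum; again only a sketch, with the convention-dependent constants left out.

import numpy as np

def active_reactive(W, X, Y, Z):
    # Complex pressure-velocity cross-spectrum of a first order recording.
    # Real part: active intensity, net energy flow, dominated by the direct
    # sound.  Imaginary part: reactive intensity, the standing, non-
    # propagating energy where the reverberant, modal character lives.
    cross = np.conj(W) * np.stack([X, Y, Z])
    return np.real(cross), np.imag(cross)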

But my emphasis is on the question whether a decode of 3rd *and* 7th order information - ending up in one encoded file - would be mathematically correct when it comes to the decoding of the higher order content. Would something be missing (maybe an overall lower amplitude of the third order content)?

As said, nothing will be missing from the higher order content. As your order goes higher, in the higher order decoder, you'll get better and better decodings at the higher frequencies, just as Fons Adriaensen said above. Doing it right, you will necessarily start to approach the far field diffraction limit of your array, both low and high.

However, at the same time, your decode for the lower order will not be psychoacoustically optimal, and won't approach it by these principles. If you mix in lower order content, it won't decode optimally without severe extra work. At something like a 3rd versus 7th order split you probably won't hear the difference, but if you mix together even first and third order, you definitely will; an optimal third order decoder does not handle a superimposed first order signal to the degree a specialised first order rig (esp. four speaker panto or six speaker peri) would.
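
One concrete way to see it: the per-order max-rE gains a decoder applies depend on the decoder's own order. Using the usual closed-form approximation rE ~= cos(137.9 deg / (N + 1.51)) (due to Zotter and Frank, if I recall; an assumption of this sketch, not something from the thread), the order 0 and 1 weights of a third order decoder come out quite different from what a dedicated first order decoder would use:

import numpy as np
from numpy.polynomial.legendre import Legendre

def max_re_weights(order):
    # Per-order gains g_n = P_n(rE_max) for a 3D max-rE decode, with the
    # closed-form approximation rE_max ~= cos(137.9 deg / (N + 1.51)).
    r_e = np.cos(np.deg2rad(137.9) / (order + 1.51))
    return np.array([Legendre.basis(n)(r_e) for n in range(order + 1)])

print(max_re_weights(1))   # what a dedicated first order decoder would apply
print(max_re_weights(3))   # what a third order decoder applies to orders 0..3
# The order 1 entries differ (roughly 0.57 vs 0.86): a first order signal
# riding inside a third order decode gets the wrong shading for its own order.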

The higher order stuff will mix in, though, when done right. It will spread out to a lower order rig, even if the solution is rather difficult to find. E.g. on-list we've talked about many numerical solutions to the problems, like Wiggins's Tabu search. But if you try to apply the higher order optimization problem to the lower orders, it doesn't pan out.

My long term problem is how to at least partially, blindly, tell arbitrary order decompositions/additions apart from each other. I'm not there, even yet. :/
--
Sampo Syreeni, aka decoy - de...@iki.fi, http://decoy.iki.fi/front
+358-40-3751464, 025E D175 ABE5 027C 9494 EEB0 E090 8BA9 0509 85C2