On 2022-02-26, Eero Aro wrote:

The way I understood Mark Anderson's question is that he is looking for a _practical_ way to decode his existing UHJ recordings for surround loudspeaker playback with a software solution, i.e. a software replacement for the UHJ decoding in a tuner amplifier such as the Onkyo SV909.

Check. Once more I should have been more constructive. Angelo's solution is quite workable to this end, but somehow I feel it's not quite the final answer here. Especially since conversions of this kind tend to leave the original material deleted, so that one ought to get it "right" or "bestest" the first time around.

...unless Gerzon, Craven and Stuart's MLP work saves us here. In it, they developed a rather general framework of exactly invertible transforms in the digital domain. Using that, it's possible to approximate, to a high degree, almost any short-term linear and shift-invariant MIMO system in a form which gives you back the precise bits you threw at it.

So I'd say: do use Angelo's formulae, but discretize them, so that they can be inverted back into the original UHJ at a later time, to perchance be reprocessed.
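For concreteness, here's what one such exactly invertible step looks like: a single plane rotation of the coefficient matrix, factored into three integer lifting ("ladder") steps. A sketch only; it ignores the headroom and overflow care a real implementation needs, and a full UHJ transform would chain several of these plus reversible allpass sections for the 90-degree phase shifts.

  #include <math.h>
  #include <stdint.h>

  /* Rotation by angle a via three lifting steps with rounding, after
   * the MLP lossless-transform idea: R(a) factors into the shears
   * [1 t; 0 1][1 0; -s 1][1 t; 0 1] with t = tan(a/2), s = sin(a).
   * Each step adds a rounded function of the *other* channel, so the
   * inverse subtracts the very same rounded values: bit-exact. */
  static void lift_rot_fwd(int32_t *x, int32_t *y, double a)
  {
      double t = tan(a / 2.0), s = sin(a);
      *x += (int32_t)lround(t * *y);
      *y -= (int32_t)lround(s * *x);
      *x += (int32_t)lround(t * *y);
  }

  static void lift_rot_inv(int32_t *x, int32_t *y, double a)
  {
      double t = tan(a / 2.0), s = sin(a);
      *x -= (int32_t)lround(t * *y);
      *y += (int32_t)lround(s * *x);
      *x -= (int32_t)lround(t * *y);
  }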

This is easy to understand, because the demand for such software would be very small. (Both the Onkyo 909 and the Meridian 565 use digital processing and do decode UHJ, but they are not what we are talking about.)

BTW, I'm not too certain about such software being in short demand. I think the trouble is that there is no easy plug'n'play library out there to do the job. If there were, it could make handling UHJ/BHJ easy enough for it to become a much more widely used stereo encoding.
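For scale, the per-bin core of such a library is tiny. A sketch in C99, working on STFT bins, using the commonly quoted two-channel decode coefficients; I'm quoting them from memory, so check against your preferred reference before trusting them. The j operator is a wideband 90-degree phase shift, which per bin is just multiplication by I:

  #include <complex.h>

  /* Two-channel UHJ -> planar B-format (W, X, Y), one STFT bin at a
   * time.  Coefficients are the usual published 2-channel decode
   * set; verify before shipping. */
  static void uhj_decode_bin(double complex L, double complex R,
                             double complex *W, double complex *X,
                             double complex *Y)
  {
      double complex S = L + R;                /* sum        */
      double complex D = L - R;                /* difference */
      *W = 0.982 * S + 0.164 * I * D;
      *X = 0.419 * S - 0.828 * I * D;
      *Y = 0.763 * D + 0.385 * I * S;
  }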

The nasty thing, though, is that BHJ isn't isotropic, while isotropy is largely required for gaming work; that is, the thing driving *all* of VR and high-end consumer sound spatialisation right now. BHJ *can* be reverted to a form where phasiness hasn't been redistributed to the back, and so it *can* be resynthesized to any rotational frame, but again, plug'n'play libraries to do that aren't yet in place.
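The rotation part, at least, is trivial once you're back in B-format; the missing piece really is the packaging. A sketch:

  #include <math.h>

  /* Rotate a planar first-order B-format frame by theta radians
   * about the vertical axis.  W is rotation-invariant; X and Y
   * transform as a plain 2-D rotation.  For BHJ you'd decode to
   * B-format first (undoing the phasiness redistribution), rotate,
   * then re-encode. */
  static void bformat_rotate_z(float *x, float *y, double theta)
  {
      double c = cos(theta), s = sin(theta);
      double xr = c * *x - s * *y;
      double yr = s * *x + c * *y;
      *x = (float)xr;
      *y = (float)yr;
  }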

In particular, nobody's been bold enough yet to implement zero-delay, provably constant-effort convolution of the Gardner/Lake DSP kind, so that the library wouldn't have to be thought about as a component at all.
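For the record, Gardner's idea is simple enough to state: convolve the head of the impulse response directly, with no lookahead, and cover the tail with FFT partitions whose sizes double, so each partition's result is ready exactly when the output needs it. Here's a sketch of the partition plan alone, not the convolution; the two-of-each-size rule is one common way to meet the readiness deadlines, and the size cap keeps the per-block effort bounded:

  #include <stdio.h>

  int main(void)
  {
      int ir_len = 48000;   /* impulse response length, samples  */
      int head   = 64;      /* direct-form head: zero-delay part */
      int offset = head, size = head;

      printf("direct head: samples 0..%d\n", head - 1);
      while (offset < ir_len) {
          for (int k = 0; k < 2 && offset < ir_len; k++) {
              printf("FFT partition: %5d samples at offset %6d\n",
                     size, offset);
              offset += size;
          }
          if (size < 8192)
              size *= 2;
      }
      return 0;
  }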

As Sampo says, when the signal has been UHJ encoded, there is no way to retrieve the original B-Format.

Though active decoding of UHJ, mindful of its doings, could come pretty close to the best active decode of full B-format. If done right.

One of the things I've been wondering about for the longest time then is how to optimally, actively decode a B-format signal. How to do what I think Angelo Farina at one time called "an infinite order decode".

Nobody's done the exercise for a general soundfield yet. We do have the result that passive LTI shelves help, via Makita theory. We do have DirAC as an active decoder, and we do have Harpex. But none of those solutions goes to arbitrary order, and all of them are in a sense unprincipled: the classical shelving solution does an isotropic optimization over Makita criteria, assuming distant point sources; the Harpex one tries to reconstruct exactly two point sources; and DirAC, as well as it does work for point reconstruction and ambience separately, is frankly speaking a theoretical mess. E.g. whoever told us cardioid responses have anything at all to do with how the sound field really behaves?

So what would be the systematic way of dealing with active decoding? Well, I think the first two things to look at would be directional power over time, expressed as a higher order spherical harmonic decomposition, and in some dual sense how singular a signal seems wrt the decomposition we use. That is to say, since we usually detect a point source for active matrixing by it being "strong in a certain direction", that usually translates into a directional power operator of some sort, followed by sharpening along the detected direction.

In DirAC, we also do a sort of division between focused power and ambience, and further spread the ambient field across the speakers -- a powerful idea, and an eminently workable one, as I've heard in practice at Aalto. But it's still unprincipled, because it cannot differentiate between zero, one, two, however many focused sources in pantophony, or zero, one, two, three, however many in periphony, nor partition the ambience quite right if some of the sources are somewhat coherent with each other. (Think lead violin and cello in a frontal orchestra.)
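To make that "directional power operator" concrete: per STFT bin, the DirAC front end boils down to an active intensity vector and an energy density. One common form of the estimator, assuming FuMa-normalised W; in practice you'd time-average the intensity and the energy before taking their ratio:

  #include <complex.h>
  #include <math.h>

  typedef struct { double azi, ele, psi; } dirac_bin;

  /* Active intensity ~ Re{ conj(pressure) * velocity }; the sqrt(2)
   * undoes the FuMa -3 dB on W so a lone plane wave gives psi = 0. */
  static dirac_bin dirac_estimate(double complex W, double complex X,
                                  double complex Y, double complex Z)
  {
      double Ix = sqrt(2.0) * creal(conj(W) * X);
      double Iy = sqrt(2.0) * creal(conj(W) * Y);
      double Iz = sqrt(2.0) * creal(conj(W) * Z);
      double In = sqrt(Ix*Ix + Iy*Iy + Iz*Iz);

      /* energy density, up to convention-dependent constants */
      double E = cabs(W)*cabs(W)
               + 0.5 * (cabs(X)*cabs(X) + cabs(Y)*cabs(Y)
                        + cabs(Z)*cabs(Z));

      dirac_bin out;
      out.azi = atan2(Iy, Ix);
      out.ele = atan2(Iz, sqrt(Ix*Ix + Iy*Iy));
      out.psi = 1.0 - In / (E + 1e-30); /* 0 = plane wave, 1 = diffuse */
      return out;
  }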

I believe the principled way to go about this would be to treat the field as complex and harmonic, then to square it in order to find point sources, express that instantaneous solution as a higher order complex spherical harmonic expansion, extract the out-of-phase component for DirAC-like processing, and apply some time-running polynomial of the adjugate of the system function to set a variable time-frequency tradeoff. Wiener filtering theory might come in handy when dealing with the noise-signal tradeoff.

If you set it out this way, an arbitrary field with arbitrarily many, perhaps not even resolved, sources can be handled just as well as a plane wave from a single source. The amount of sharpening for a source would depend continuously on its coherence properties, and if more than one source were present, their possible mutual coherence would naturally be taken into account. DirAC processing would also take heed of anisotropic reverberation, such as when recording close to a wall or close to an orifice into a wider space. And the funkiest thing would be that the math stays of finite order: every operation would necessarily be of at most the square of the order of the original spherical harmonic decomposition, while still describing any and all distributions of point sources and ambience over the unit circle/sphere (pantophony/periphony).

I have quite a lot of Ambisonic UHJ CDs and I'd rather listen to them decoded into a surround setup than listen to them in stereo with two speakers.

Actually, even though you can't really invert UHJ to B-format, it's even more difficult to go from B-format to something like 5.1. The optimum decoding equations are a full nonlinear mess, even in the basic Makita sense, and suffer from multiple local optima. (I believe Bruce Wiggins's thesis, which used tabu search to find viable non-symmetric decoders, was an attempt to deal with the problem.)
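To see where the nonlinearity comes from, here's the heart of the Makita/Gerzon criteria for one test direction. The design problem is to pick a decoder gain matrix that makes *both* vectors point the right way with length near one, across all directions and both frequency bands at once; the gains enter linearly in one and squared in the other, whence the mess of local optima:

  #include <math.h>

  /* Gerzon velocity (rV) and energy (rE) vector lengths for a 2-D
   * rig, given the real speaker gains g[i] a candidate decoder
   * produces for one test direction, speakers at azimuths az[i] in
   * radians.  Sketch: no guard against P or E being zero. */
  static void gerzon_vectors(const double *g, const double *az, int n,
                             double *rV, double *rE)
  {
      double P = 0, E = 0, vx = 0, vy = 0, ex = 0, ey = 0;
      for (int i = 0; i < n; i++) {
          P  += g[i];
          E  += g[i] * g[i];
          vx += g[i] * cos(az[i]);        vy += g[i] * sin(az[i]);
          ex += g[i] * g[i] * cos(az[i]); ey += g[i] * g[i] * sin(az[i]);
      }
      *rV = sqrt(vx*vx + vy*vy) / P;   /* low-frequency localisation  */
      *rE = sqrt(ex*ex + ey*ey) / E;   /* high-frequency localisation */
  }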

DirAC is stupendously good at this, at least perceptually speaking. It adapts to pretty much any speaker array, and from five unevenly spaced speakers onwards sounds like there is no rig at all.

Yet I'm perfectly certain, based on the above, I could derive a physical signal which the system would decode badly. Pretty much every current system would. Say, a narrow band signal coming from around a corner, so that it has a large out-of-phase component, spreading sharply in space.

And by the way, the majority of UHJ encoded music releases _were_ recorded with a Soundfield type microphone, because the largest number of them were made by Nimbus Records.

Also thankfully so: the SoundField series is an unusually robust piece of work. Solid theory, fine engineering, unbelievably close adherence to acoustical theory which wasn't really even understood at the time the mics were designed.

The Mk4 and Mk5 have been used as *measurement* mics in acoustical research. I don't think any other mic, in any other audio discipline, really has.

Nimbus didn't use the SoundField-made microphone; they used their own setup made of two figure-of-eights and an omni.

Yes. Them idjits. 'Cause there is going to be some high frequency phasing there. It might sound good, pace the ORTF crowd, but it isn't *real* or *accurate*.

They did that mainly because the Soundfield was too noisy and they didn't need the Z signal, as it couldn't be encoded into UHJ and carved onto vinyl anyway.

Actually you *do* need Z. That's the point where I alluded to Christoph Faller above: if you cut out the third dimension, your reconstructed field will show an extra 1/r attenuation term from the rig inwards, because you're bleeding off energy into the third dimension. This is not much of a problem when you have a fully propagating field, but when you're attempting to reproduce standing waves, the problem grows much wilder. Then you really, *really* need at least some modicum of periphonic control, in order to keep the central pressure field at what it was supposed to be.

So, all Reaper users out there, please tell Mark how to do the routing in Reaper. David was already on the case.

Does anybody want to sketch out that library I talked about? I'm a theoretician, so not much of a coder. Yet I could guide a seven-year-old through the process of writing such a thing, in plain C.
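To get the ball rolling, here's one possible API surface. Every name in it is mine, a strawman to argue against, nothing that exists:

  /* uhjlib.h -- strawman API for the plug'n'play UHJ/BHJ library. */
  #ifndef UHJLIB_H
  #define UHJLIB_H

  #include <stddef.h>
  #include <stdint.h>

  typedef struct uhj_ctx uhj_ctx;  /* opaque: FFT plans, filter state */

  uhj_ctx *uhj_open(int sample_rate, size_t block_len);
  void     uhj_close(uhj_ctx *c);

  /* Two-channel UHJ -> planar first-order B-format (W, X, Y). */
  int uhj_decode(uhj_ctx *c, const float *left, const float *right,
                 float *w, float *x, float *y, size_t frames);

  /* Bit-exact reversible pair, lifting-based as sketched above:
   * uhj_encode_lossless(uhj_decode_lossless(x)) == x, to the bit. */
  int uhj_decode_lossless(uhj_ctx *c, const int32_t *lr, int32_t *wxy,
                          size_t frames);
  int uhj_encode_lossless(uhj_ctx *c, const int32_t *wxy, int32_t *lr,
                          size_t frames);

  #endif /* UHJLIB_H */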
--
Sampo Syreeni, aka decoy - de...@iki.fi, http://decoy.iki.fi/front
+358-40-3751464, 025E D175 ABE5 027C 9494 EEB0 E090 8BA9 0509 85C2