On 21/01/2013 23:51, Gabriel Wolf wrote:
...
>> Ambisonic's "problem" was that people were happy, a posteriori, to agree
>> that AMB was inadequate, but were unable to agree on what a proper HOA
>> format should comprise, except inasmuch as plain old 3rd order 3D (the
>> maximum AMB supports) was not good enough.
> Not good enough in terms of ...?


It only supports up to 3rd-order periphonic (16 channels), because it relies on the channel count alone being unambiguous for each combination of horizontal and height orders (which avoids the need to store empty channels), and that holds only up to that limit. It also presumes the conventional 3dB scaling of the W channel as per the "original" B-Format spec, which is now regarded as both inconvenient and obsolete. A "proper" file format for HOA needs metadata in the header detailing the nature of the encoding, with agreed channel orderings and idents (especially where unused channels are omitted). The AMB format has no metadata, just a WAVEX GUID identifying the format.
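To make the channel-count argument concrete, here is a rough Python sketch. It assumes the usual mixed-order counting rule (a full-sphere set of order V, plus two horizontal-only channels for each horizontal order above V); the function names are mine and are not part of any spec.

```python
from collections import defaultdict


def channel_count(h, v):
    """Channel count for horizontal order h and height order v (v <= h).

    Assumed rule: (v + 1)^2 full-sphere channels, plus 2 extra
    horizontal-only channels for each horizontal order above v.
    """
    if v < 0 or v > h:
        raise ValueError("need 0 <= v <= h")
    return (v + 1) ** 2 + 2 * (h - v)


def ambiguous_counts(max_order):
    """Return the channel counts shared by more than one (h, v) combination."""
    by_count = defaultdict(list)
    for h in range(max_order + 1):
        for v in range(h + 1):
            by_count[channel_count(h, v)].append((h, v))
    return {n: combos for n, combos in by_count.items() if len(combos) > 1}


if __name__ == "__main__":
    print(ambiguous_counts(3))  # no collisions up to 3rd order
    print(ambiguous_counts(4))  # 9 channels: 2nd-order 3D or 4th-order horizontal
```

Up to 3rd order every combination gives a different channel count, so the count alone identifies the layout. As soon as 4th order is allowed, 9 channels could mean either full 2nd-order periphonic or 4th-order horizontal-only, and the scheme breaks down.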

Put most simply, the file header needs to supply all the information required to enable an appropriate decoding to be used. The file should be fully self-describing, robust and unambiguous, so that any program can confirm purely by reading the header that the file is properly constructed, and can selectively extract whatever metadata is provided.
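Purely as an illustration of what "confirm by reading the header" might mean in practice, here is a Python sketch of a header check. Everything in it is hypothetical: the HoaHeader record, its field names and the list of normalization labels are my assumptions for the example, not a proposal or any existing format.

```python
from dataclasses import dataclass

# Illustrative labels only; a real format would define its own agreed set.
KNOWN_NORMALIZATIONS = {"FuMa", "SN3D", "N3D"}


@dataclass(frozen=True)
class HoaHeader:
    """Hypothetical header record, for illustration only."""
    num_channels: int
    normalization: str
    channel_ids: tuple  # one (order, degree) pair per channel actually stored


def header_problems(header):
    """Return a list of consistency problems; an empty list means the header looks sane."""
    problems = []
    if header.normalization not in KNOWN_NORMALIZATIONS:
        problems.append("unknown normalization")
    if len(header.channel_ids) != header.num_channels:
        problems.append("channel count does not match ident list")
    if len(set(header.channel_ids)) != len(header.channel_ids):
        problems.append("duplicate channel ident")
    for order, degree in header.channel_ids:
        if order < 0 or abs(degree) > order:
            problems.append(f"invalid ident ({order}, {degree})")
    return problems


if __name__ == "__main__":
    # First-order horizontal-only: the height channel is simply omitted.
    horizontal = HoaHeader(3, "SN3D", ((0, 0), (1, 1), (1, -1)))
    print(header_problems(horizontal))
```

The point is only that, with explicit idents per stored channel, omitted channels need no placeholder and a reader never has to infer the layout from the channel count.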

Ideally it also needs to be efficient in storage, by excluding any unused B-Format channels. One solution that has been defined is to rely on lossless compression to do this, i.e. incorporate the compression into the file format definition itself.

Also, AMB is based on the standard WAVE format with 32-bit chunk sizes, so it is only able to handle file sizes up to 4GB, which is seriously limiting for HOA with high-resolution samples (all the more so if empty channels are included). This was reasonable enough back in 2000, when the WAVEFORMATEXTENSIBLE format itself was very new, but is a serious limitation today.
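To put a number on the 4GB limit, a back-of-envelope Python sketch follows. It assumes uncompressed PCM and ignores header overhead; the function name and the example figures (16 channels, 96 kHz, 32-bit samples) are just illustrative choices.

```python
MAX_CHUNK_BYTES = 2**32 - 1  # largest size a 32-bit chunk-size field can hold


def max_duration_seconds(channels, sample_rate, bytes_per_sample):
    """Longest recording (in seconds) that fits in a single 32-bit-sized data chunk."""
    bytes_per_second = channels * sample_rate * bytes_per_sample
    return MAX_CHUNK_BYTES / bytes_per_second


if __name__ == "__main__":
    # Full 3rd-order periphonic, 96 kHz, 32-bit samples.
    seconds = max_duration_seconds(16, 96000, 4)
    print(f"{seconds / 60:.1f} minutes")
```

With those figures the file fills up after somewhere between 11 and 12 minutes, and any higher order or padded empty channels shorten that further.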

So, defining such a format is non-trivial, even if the core issues are clear. There are so many options, and nobody working in HOA (which, as this list demonstrates, continues to be a heavily research-active topic) really wants to have to deal with file format limitations. I would guess that MPEG will want a much narrower specification, and maybe base it on some patentable compression scheme, not least as their target speaker arrangement is ostensibly fixed. By contrast, a defining characteristic of Ambisonics (HO or otherwise) is that, while there are more or less optimum layouts, speaker arrangements are not fixed.

In practice, those defining such a format need not only to define the file format itself, but also to define and publish basic tools to create/encode and decode to a suitably wide range of representative speaker arrangements; and of course to be able to confirm the whole thing with listening tests. As well as expertise, that requires considerable physical resources, to say nothing of the generation of source material for test purposes. One way and another, it is an expensive business!

Richard Dobson


_______________________________________________
Sursound mailing list
Sursound@music.vt.edu
https://mail.music.vt.edu/mailman/listinfo/sursound
