On 21/01/2013 23:51, Gabriel Wolf wrote:
...
> > Ambisonic's "problem" was that people were happy, a posteriori, to agree
> > that AMB was inadequate, but were unable to agree on what a proper HOA
> > format should comprise, except inasmuch as plain old 3rd order 3D (the
> > maximum AMB supports) was not good enough.
> Not good enough in terms of ...?
AMB only supports up to 3rd order periphonic (16 channels), as it relies
on the number of channels (avoiding the need to store empty channels)
being unambiguous, for each combination of horizontal and height orders,
as they are up to that limit. It also presumes the conventional 3dB
scaling of the W channel as per the "original" B-Format spec, and that
is now regarded as both inconvenient and obsolete. A "proper" file
format for HOA needs metadata in the header detailing the nature of the
encoding, agreed channel orderings and idents (especially where unused
channels are omitted). The AMB format has no metadata, just a WAVEX GUID
identifying the format.
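
To illustrate why the channel count stops being a reliable identifier above 3rd order, here is a minimal Python sketch. It assumes the usual mixed-order convention (a full-sphere set up to the height order, plus two horizontal-only components for each order above that); the function names are invented for illustration and are not part of any AMB tooling.

```python
from collections import defaultdict


def channel_count(h_order: int, v_order: int) -> int:
    """Channels in a mixed-order set (assumed convention): full sphere up to
    v_order, plus two horizontal-only components per order above it."""
    if not 0 <= v_order <= h_order:
        raise ValueError("need 0 <= v_order <= h_order")
    return (v_order + 1) ** 2 + 2 * (h_order - v_order)


def collisions(max_order: int) -> dict:
    """Map each ambiguous channel count to the (h, v) pairs that share it."""
    by_count = defaultdict(list)
    for h in range(1, max_order + 1):
        for v in range(h + 1):
            by_count[channel_count(h, v)].append((h, v))
    return {n: pairs for n, pairs in by_count.items() if len(pairs) > 1}


print(collisions(3))  # {} : every count up to 3rd order is unique
print(collisions(4))  # {9: [(2, 2), (4, 0)]} : 9 channels is now ambiguous
```

Under that assumption, a 9-channel file could be either full 2nd order or horizontal-only 4th order, so the count alone can no longer identify the content.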
Put most simply, the file header needs to supply all the information
required to enable an appropriate decoding to be used. The file should
be fully self-describing, robust and unambiguous, so that any program
can confirm purely by reading the header that the file is properly
constructed, and can selectively extract whatever metadata is provided.
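
As a rough idea of what "self-describing" could mean in practice, the following Python sketch models a header as a small record that a reader can validate before touching any audio data. Everything here is hypothetical: the class name, the field names and the checks are assumptions made for illustration, not a description of any published or proposed format.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class HoaHeader:
    """Hypothetical header record; all fields are illustrative assumptions."""
    h_order: int                  # horizontal order
    v_order: int                  # height order
    channel_ids: Tuple[str, ...]  # idents of the channels actually stored
    w_scaling_db: float           # e.g. -3.0 for the classic B-Format W
    normalisation: str            # free-form label for the encoding convention

    def validate(self) -> None:
        """Raise ValueError if the header is internally inconsistent."""
        if not 0 <= self.v_order <= self.h_order:
            raise ValueError("height order must be between 0 and horizontal order")
        if not self.channel_ids:
            raise ValueError("at least one channel must be stored")
        if len(set(self.channel_ids)) != len(self.channel_ids):
            raise ValueError("duplicate channel ident")
        # Assumed mixed-order channel count; omitted channels make it an upper bound.
        full = (self.v_order + 1) ** 2 + 2 * (self.h_order - self.v_order)
        if len(self.channel_ids) > full:
            raise ValueError("more channels listed than the stated orders allow")


header = HoaHeader(1, 1, ("W", "X", "Y", "Z"), -3.0, "example-label")
header.validate()  # passes silently for a consistent header
```

The point of the sketch is only that the checks need nothing beyond the header itself.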
Ideally it also needs to be efficient in storage, by excluding any
unused B-Format channels. One solution that has been defined is to rely
on lossless compression to do this, i.e. incorporate the compression
into the file format definition itself.
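
A quick Python sketch of why lossless compression takes care of unused channels almost for free. It uses zlib purely as a stand-in for whatever lossless codec a real format would specify (the post does not name one), and random bytes as a crude stand-in for a channel that carries signal.

```python
import os
import zlib


def compressed_size(samples: bytes) -> int:
    """Size in bytes after zlib compression at the highest level."""
    return len(zlib.compress(samples, 9))


n_samples = 48000
silent = bytes(2 * n_samples)     # one second of 16-bit silence (all zeros)
busy = os.urandom(2 * n_samples)  # incompressible stand-in for real signal

print(len(silent), compressed_size(silent))  # silence shrinks to a tiny fraction
print(len(busy), compressed_size(busy))      # random data barely changes in size
```

An empty channel costs next to nothing once compressed, so the format would not need a separate mechanism for omitting it.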
Also, AMB is based on the standard WAVE format with 32-bit chunk sizes,
so is only able to handle file sizes up to 4GB, which is seriously
limiting for HOA with high-resolution samples (all the more so if empty
channels are included). This was reasonable enough back in 2000, when
the WAVEFORMATEXTENSIBLE format itself was very new, but is a serious
limitation today.
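
To put a number on that limit, here is a back-of-envelope calculation in Python. It ignores header overhead and simply divides the largest size a 32-bit field can express by the raw data rate; the function name and the example settings are illustrative assumptions.

```python
def max_duration_seconds(channels: int, sample_rate: int,
                         bytes_per_sample: int,
                         limit: int = 2**32 - 1) -> float:
    """Longest recording (in seconds) whose raw sample data fits in `limit` bytes."""
    return limit / (channels * sample_rate * bytes_per_sample)


# Full 3rd order (16 channels), 96 kHz, 24-bit samples.
seconds = max_duration_seconds(16, 96000, 3)
print(f"{seconds:.0f} s, about {seconds / 60:.1f} minutes")  # 932 s, about 15.5 minutes
```

So a full 3rd order recording at 96 kHz/24-bit runs out of room after roughly a quarter of an hour.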
So, defining such a format is non-trivial, even if the core issues are
clear. There are so many options, and nobody working in HOA (which as
this list demonstrates continues to be a heavily research-active topic)
really wants to have to deal with file format limitations. I would guess
that MPEG will want a much narrower specification, and maybe base it on
some patentable compression scheme, not least as their target speaker
arrangement is ostensibly fixed. Whereas a defining characteristic of
Ambisonics (HO or otherwise) is that while there are more or less
optimum layouts, speaker arrangements are not fixed.
In practice, those defining such a format need not only to define the
file format itself, but also define and publish basic tools to
create/encode and decode, to a suitably wide range of representative
speaker arrangements; and of course to be able to confirm the whole
thing with listening tests. As well as expertise, that requires
considerable physical resources, to say nothing of the generation of
source material for test purposes. One way and another, it is an
expensive business!
Richard Dobson
_______________________________________________
Sursound mailing list
Sursound@music.vt.edu
https://mail.music.vt.edu/mailman/listinfo/sursound