Dear colleagues...
I would like to remind everybody who is interested or already involved
that ITU/MPEG plans to define and issue a 3D audio standard (better: a
3D audio standard framework) during this year. The 3D audio codec is
meant to be part of the (wider) MPEG-H standard.
This all makes a lot of sense, 'cos ;-) there is already some
competition around:
1. Hamasaki 22.2, well known as the audio part of the earlier UHDTV
(Super Hi-Vision) proposals.
2. http://www.auro-3d.com/system/listening-formats
(Note:
a)
The Auro-3D® Engine comprises:
Auro Codec: The revolutionary codec that delivers native, discrete
Auro-3D® content.
Auro-Matic: The groundbreaking up-mixing algorithm that converts
legacy content into the Auro-3D® format.
Auro-3D® Headphone: Like other audio configurations, similar results
can be achieved with headphones that use binaural technology.
b)
Film, Broadcast, Gaming, Mobile, Automotive and Multimedia industries
are all searching for a next generation sound format. With 3D
Stereoscopic imagery becoming commonplace, the time is right for an
audio experience that matches this increased level of fidelity. Sound
in 3D is clearly the next step.)
3. http://www.dolby.com/us/en/consumer/technology/movie/dolby-atmos.html
(IMHO, Dolby won't participate in the MPEG standardization process. And
even if they did, Dolby Atmos already seems to be finished.)
The current situation at MPEG:
http://www.itu.int/en/ITU-T/studygroups/com16/video/Pages/jctvc.aspx
Next meetings:
* Geneva, Switzerland, October 2013 (tentative)
* Vienna, Austria, 27 July - 2 August 2013 (tentative)
* Incheon, Korea, 20-26 April 2013 (tentative)
* Geneva, Switzerland, 14-23 January 2013 (tentative)
At the next meeting (January, Geneva), the important HEVC codec
should be technically finished. (Status: FDIS, for "Final Draft
International Standard")
A final call for a 3D audio codec will also be issued:
At the 102nd MPEG meeting MPEG has issued a Draft Call for Proposals
(CfP) on 3D Audio Coding.
(This was the last meeting, Shanghai, October 2012)
MPEG-H 3D Audio is envisaged to provide a highly immersive audio
experience to accompany the highly immersive experience provided by
MPEG-H HEVC. Such an immersive listening experience will be realized
by the rendering of a realistic and compelling 3D audio scene either
by using a large number of loudspeakers, such as for 22.2 channel
audio programs, or by using headphones supporting binauralization.
Key issues to be addressed are a compact and bit-efficient
representation of multi-channel audio programs and the ability to
flexibly render an audio program to an arbitrary number of
loudspeakers with arbitrary configurations. 3D Audio support via
headphones is also a key capability in order to deliver an immersive
experience for users of mobile devices.
A final CfP will be issued at the 103rd meeting in January 2012,
(they mean January 2013, of course...)
with selection of technology from amongst the responses received at
the 105th meeting in July 2013. This technology will form the basis
for MPEG-H 3D Audio, the Audio part (Part 3) of the MPEG-H (ISO/IEC
23008) suite of technologies.
Taken together, the final deadline for any proposal seems to be around
April 2013. (Incheon, Korea meeting, April 2013)
If an Ambisonics-based audio codec is going to be proposed (this has
been done, but as an official proposal??), I would like to add some
observations.
Cinema audio and UHD TV (and this is where the push comes from) include
some "discrete" elements, and everybody has to be aware of this. Firstly,
there are one or two (Hamasaki 22.2) separate LFE channels. (LFE
channels make sense for movies and in the cinema, even if some people
will always dispute this... we are not talking about most music you will
listen to at home, but about cinema sound with special effects.)
Secondly, a lot of sound is tied to the screen. The narrow-spaced front
speakers might be a problem for Ambisonics, at least for
low-order Ambisonics. (Dolby Atmos actually defines up to 5 "screen"
loudspeakers, which means three or five. Note that the front C channel
is often used as the voice/dialogue channel.)
A possible solution would be to offer some kind of B"+" option, the
"plus" part being the front and LFE channels. 2D/3D surround for all the
remaining sound field would be offered via the B format (order?) sound
field, or HOA sound field. (Mixing such a hybrid sound format is rather
trivial, I would say. Just leave the front and the LFE parts out of the
surround/3D field... )
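The routing described above ("leave the front and LFE parts out of the
surround/3D field") could be sketched roughly like this. It is only an
illustration under my own assumptions: the bus names, the dictionary
layout of the sources and the function names are all made up here, and
the encoder uses the classic first-order B-format equations (W scaled by
1/sqrt(2)) just to keep it short. A real proposal would presumably use a
higher order.

```python
import math

# Hypothetical discrete buses for the "plus" part (three screen channels + LFE).
DISCRETE_BUSES = ("L", "C", "R", "LFE")


def encode_foa(sample, azimuth_deg, elevation_deg):
    """Encode one mono sample into classic first-order B-format (W, X, Y, Z)."""
    az = math.radians(azimuth_deg)
    el = math.radians(elevation_deg)
    return (
        sample / math.sqrt(2.0),               # W: omnidirectional component
        sample * math.cos(az) * math.cos(el),  # X: front-back
        sample * math.sin(az) * math.cos(el),  # Y: left-right
        sample * math.sin(el),                 # Z: up-down
    )


def mix_hybrid(sources):
    """Sum screen/LFE sources on discrete buses and everything else into the field."""
    bed = {name: 0.0 for name in DISCRETE_BUSES}
    field = [0.0, 0.0, 0.0, 0.0]
    for src in sources:
        if src.get("bus") in bed:
            # Screen-bound or LFE material stays discrete and never enters the field.
            bed[src["bus"]] += src["sample"]
        else:
            # Everything else is encoded into the Ambisonics sound field.
            encoded = encode_foa(src["sample"], src["az"], src["el"])
            for i, component in enumerate(encoded):
                field[i] += component
    return bed, tuple(field)


# Assumed example: dialogue on the centre bus, one effect hard left in the field.
bed, field = mix_hybrid([
    {"sample": 1.0, "bus": "C"},
    {"sample": 0.5, "az": 90.0, "el": 0.0},
])
```

The point of the sketch is only that the discrete part and the sound
field never have to know about each other at mixing time.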
So maybe define some "purist" solution (say B format 3rd order, or
horizontal 4th order mixed with vertical 1st/2nd order, or whatever),
and also some "B+" option. (The original B+ proposal was FOA + 2 stereo
channels. Note that a direct consequence of the "hybrid" Ambisonics
option would be that a 2nd or 3rd order soundfield should be enough to
represent the surround and height channels. In fact, you can
decode to 5.1, 7.1, Hamasaki 22.2, Auro-3D and Dolby Atmos surround
layouts. The B format "resolution" should be more than enough for any of
these layouts - maybe even at 2nd order, certainly at 3rd. The
narrow-spaced front wouldn't be a problem, by definition. LFE channels
are discrete in any case, as stated before.)
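To put rough numbers on these options, here is a small sketch of the
channel counts involved. A full 3D sound field of order N needs (N+1)^2
channels. For the mixed-order case I am assuming the usual "#H#V" scheme
(horizontal order H, vertical order V), which gives (V+1)^2 + 2(H-V)
channels. The function names and the "3 screen + 2 LFE" example are
purely illustrative, not part of any proposal.

```python
def full_3d_channels(order):
    """Channel count of a full-sphere (periphonic) sound field of the given order."""
    return (order + 1) ** 2


def mixed_order_channels(h_order, v_order):
    """Channel count for mixed order, assuming the #H#V scheme with H >= V."""
    if v_order > h_order:
        raise ValueError("vertical order must not exceed horizontal order")
    return (v_order + 1) ** 2 + 2 * (h_order - v_order)


def b_plus_channels(sound_field_channels, discrete_channels):
    """Total channels of a hybrid: sound field plus discrete front/LFE channels."""
    return sound_field_channels + discrete_channels


print("3rd order, full 3D:", full_3d_channels(3))
print("4th horizontal / 1st vertical:", mixed_order_channels(4, 1))
print("4th horizontal / 2nd vertical:", mixed_order_channels(4, 2))
print("original B+ (FOA + 2):", b_plus_channels(full_3d_channels(1), 2))
# Hypothetical hybrid: 3rd order field plus 3 screen channels and 2 LFE channels.
print("3rd order + 3 screen + 2 LFE:", b_plus_channels(full_3d_channels(3), 5))
```

So the "purist" 3rd order option is 16 channels, the mixed-order variants
come to 10 or 13, and the original B+ is the 6-channel format mentioned
in the P.S.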
I wouldn't be afraid to offer some hybrid option, anyway. (Dolby Atmos
defines up to 64 channels, and also audio objects for different
loudspeaker layouts. Therefore, Dolby Atmos is itself a hybrid system -
based on discrete channels and audio objects.)
I just wanted to give a small hint ;-) on how anybody might set up a valid
proposal. The B+ could and < should > be included as an option. The
basic idea behind this is that cinema audio has some specific
properties, which have to be covered by any system. (The < front > is
extremely important, because voice and many sounds are tied to events on
the screen; LFE channels are discrete; the C channel is mostly used in a
discrete way, as the voice channel.)
Note also that the clock is already ticking, and I absolutely mean this.
MPEG can choose from some valid proposals, (Hamasaki) 22.2 and
Auro-3D among these.
Ambisonics has defined a 3D audio field since the 70s, so it would seem
logical to include Ambisonics in any 3D audio standard. There are also
some clear advantages, which are getting more and more important.
(Different cinemas won't offer anywhere near the same loudspeaker
layouts; pretty safe bet.)
Because MPEG will basically choose from existing proposals, somebody
has to define a valid Ambisonics-based proposal.
I apologize to the experts already involved for having written at a
pretty basic, or say introductory, level. But nobody has done this here
before, and I think not everybody is sufficiently informed about these
issues - maybe even some very competent people.
However/but:
The next two or three MPEG meetings are not just like the next
spatial audio or Linux audio conference ;-) , we are talking (also)
about the probable next real-world standard for 3D audio. Once MPEG-H
Part 3 (audio) and Dolby Atmos exist, every future endeavour would
face an (extremely difficult) uphill battle.
If Ambisonics is not included for reasons of laziness, infighting tribes
or whatever else, I would say: Game over for Ambisonics in the
real world... (I don't mean this in a rude way. The thing is just
that MPEG won't wait for even the most beautiful HOA standard which
will be presented in the year 2015 or 2020...)
The advantages of Ambisonics are clear: It is by definition a 3D audio
theory/codec, and you can decode to different loudspeaker layouts and
headphones. (This is of course very basic, but you have to tell people
this when presenting a proposal.)
Best regards,
Stefan Schreiber Lisbon
P.S.: I personally would/will work with any 3D audio standard. Because
MPEG-H Part 3 (audio) will be a selection of several
codecs/approaches, Ambisonics should be included. If so, I would define
two options: some "purist" approach, but also some "B+" approach, which
maybe fits real-world cinema audio better.
Now Thomas Chen (still lurking on this list?) would probably agree,
because the original (6-channel) B+ proposal is from him. Unfortunately
he works at Dolby... :-X