Dear colleagues,

The following posting is a standard proposal and, as such, quite long. (Obviously anybody can stop reading if not interested.) In any case, an Ambisonics-based proposal had to be presented on this audio list.

--------------------------------------------------------------------------------------------------


I believe that Ambisonics has come to a state which requires the introduction of some agreed standard for relatively easy and user-centric implementation, including certain wider recommendations for a successful application at home, in mobile devices etc.

The proposed open and free-to-use standard will include both 1st order Ambisonics and HOA. The standard should be extensible, in two senses: I will present a "minimalistic" but nevertheless powerful mandatory part (currently restricted to 8 channels) and an optional part (currently offering two variants with more than 8 channels), and we will talk about possible future extensions.

Unlike B format or, say, AmbiX, the standard proposal doesn't aim to present a complete hierarchy, for reasons of simplicity and easier implementation.

Other intentions:

The presented proposal is also designed to be usable as a CE (consumer electronics) standard: for home use, on mobile devices, and in areas like gaming and VR (think of "Oculus Rift"). Although the use of the term "CE" has been criticized here before, it has practical implications if you think about what typical CE applications are. In the case of Ambisonics, you will always need decoding software, but also audio hardware such as sound cards, receivers, loudspeakers, headphones... This is important because, at least in my opinion, some people working in this field seem to underestimate the problems related to the availability of audio hardware and equipment at home. The term "CE" also applies to certain applications. Many, if not most, current TVs include some function or < app > for YouTube or Netflix, for example. If you define a new audio format for streaming based on the suggested standard proposal, you would certainly have to use compressed audio for this purpose, currently very probably AAC, or maybe DD+ or the Opus codec. Therefore the standard is presented in a way which can be implemented in a flexible form and in any format, just like B format itself. As for lossless codecs, the normal "CE" and "music enthusiast" codec seems to be FLAC, not WavPack. (This is an important remark, not an intentional "kick" to insult other standard proponents. We will talk later about the elements of this standard proposal which are backward-compatible to stereo. If this matters, use FLAC, because FLAC is the codec which people already use.)

B format: I see B format as an Ambisonics channel ordering and normalization scheme, up to 3rd order. However, even B format is not a "complete" description of Ambisonics up to 3rd order, in the sense that a relatively new mixed order scheme - the xhyv hierarchy - has been proposed independently by different people. This hierarchy can use the B format components up to 3rd order, but is not described by B format itself.



Short summary:

The proposed standard is a description of mandatory and optional elements, which can be implemented in a flexible way.

Because the standard makes heavy use of both FOA and HOA B format (but is not identical to B format in the optional part, and even less in future extensions), I propose the name "sound field format" (.SFF) as preliminary standard name. (If this standard goes on in any form, it needs a name.)


I. FOA, soundfield mikes

This is the basic and historically first form of Ambisonics. For legacy reasons, because of the many existing recordings and because of the continuing relevance of different soundfield microphones, FOA has to be included into any Ambisonics based standard for surround sound/3D audio.

The original B format (1st order, so WXYZ) has been superseded by the FuMa B format scheme up to 3rd order, and the related .amb file format. The .amb format is widely used in available Ambisonics decoders, plugins, software tools etc. In this sense, B format/.amb is the de facto standard for Ambisonics up to 3rd order.

Source:

http://dream.cs.bath.ac.uk/researchdev/wave-ex/bformat.html

There is currently no accepted lossless or lossy compressed file format associated with B format, at least as far as I know. (It has been shown that AAC is a good codec to compress Ambisonics channels, and could be applied at different/decreasing bitrates up to very high orders - if bitrate matters.)


Recent parametric or non-conventional decoders like Harpex and DirAC try to improve the perceived resolution of FOA outside the sweet spot.

(
Further reading:

http://harpex.net/
)


Note that Harpex also provides upsampling of 1st order to 3rd order (both B format). As far as I understood, this applies at least to the horizontal case. (Commentaries/feedback very welcome)


II. HOA

Both B format and .AMB are only defined up to 3rd order.

Here is the link to a background paper about the recent and current HOA standardization efforts, written from the perspective of the people behind a standard proposal beyond 3rd order Ambisonics (AmbiX):

http://iem.kug.ac.at/fileadmin/media/iem/projects/2011/ambisonics11_nachbar_zotter_sontacchi_deleflie.pdf


Because you have to start from some point, I already have proposed to combine "just" 1st order and 3rd order Ambisonics into one simple yet powerful standard for home and mobile use, recording etc.

Therefore, the mandatory part of SFF (v. 0.1) is a subset of B format. (The actual standard proposal will also include certain recommendations for decoding, backward-compatibility to stereo, mandatory and optional loudspeaker configurations etc.)

Because I have proposed this before, I will cite the respective former posting (9th of September 2013):

Dear colleagues...

To continue the proposal to use < certain > forms of .AMB as a real-world format for the transport/storage of 3D audio (including music recordings), I would like to hint to some further and important issues involved.

A full .AMB decoder would have to be able to decode the nine different combinations of .AMB to different (standard?) loudspeaker configurations, and also to headphones. (The latter would be some important point in my "requirement list".) This means there will be plenty of combinations, and some great opportunities to mess things up if anybody wants to implement the 9*9 "or so" combinations ... :-)

It would be advantageous if we would be able to limit .AMB to some < CE profile > with far fewer combinations!

(To cover just FOA won't be enough. We know that FOA has certain limitations and won't be good enough for all applications. Think just of the sweet spot issues.)

My impression is that you would have to use < at least > 3rd order to overcome many/most of the typical FOA problems.

Some advantages of TOA, compared to FOA:

- much larger sweet spot (not only support for individual listeners; IMO this is very important, as I would like to be able to demonstrate some wonderful recordings to "at least one" friend, even better to "some friends". If you don't have friends, don't bother... :-D )

- angular resolution significantly improved, compared to FOA (improvement of more than factor 2)

- improved performance at higher frequencies

- we know that FOA has certain problems to present sound "from the sides", even if the playback rig would include loudspeakers at direct lateral positions.

http://www.acoustics.hut.fi/~ville/papers/pulkkiicmc2001.pdf

- Decoding TOA to (ITU) 5.1/7.1 will show much better results than FOA to 5.1/7.1. (Comments? I know that 5.1 is an underspecified irregular array from an Ambisonics TOA perspective, but you can decode this and the results will be better than in the FOA to 5.1 case...)

- Improved behaviour at higher frequencies


Altogether, a practical "CE" format based on Ambisonics and .AMB could be introduced in the following, simplified form

I) FOA/ UHJ (3-4 UHJ channels, the proposed backward-compatible form to stereo of FOA...)

3/4 channels

(Classical decoders and other decoders, supposed to improve on classical ones...)

II) 3rd order horizontal-only and 3h/1p, which you could combine to just 3h1p (1st order vertical)

7/8 channels,  or 8 channels

(Might still be offered in some "UHJ", stereo-compatible fashion; "UHJ" for 2nd/3rd order doesn't exist yet, but it can be done.)

III) 3h/3p, 16 channels (call this the .AMB master format? Anyway, this is the upper end of FuMa .AMB...)


In the 1st version of SFF (v0.1 - 1.0), I would restrict the < mandatory > channel count to 8. (8 channels is still an important limit, to comply with maximum channel numbers in formats, CE interfaces, DAW audio buses and workflows, etc.)

This means that you would combine I and II, referring to the cited passage above. (III cancelled. III might be the mastering format used in a DAW, but it is not part of SFF itself.)


I have seen later that I and II have also been used in a (well-received) audio engine for gaming:

http://etiennedeleflie.net/2008/06/24/codemasters-ups-their-useage-of-ambisonics-on-race-driver-grid/

I thought you might like to know that our latest game RaceDriver GRID deploys hybrid third-order Ambisonics (again) on PlayStation 3, plus a lot more concurrent sound sources (at times hundreds play at once, each generating eight WXYZPQUV Ambisonic components)


(The above combination is 3h1p, in FuMa terms.)

This audio engine is related to Blue Ripple Sound and Richard Furse, who has recently introduced TOA (VST) plugins.

So, the < mandatory > subset of .amb which will be included into the SFF proposal (v0.1) will be FOA (3/4 channels) and mixed-order 3h1p TOA (7/8 channels), and nothing else.

Note that in both the 1st order (3-4 channel) and 3rd order (7-8 channel) case you can use the < same > decoder, which means the same decoder for 1st and 3rd order horizontal and 3D audio decoding. (If I understood this well enough, you can't decode 3h1p via a full-periphonic 16-channel TOA decoder leaving unused channels at "0", because you would obtain certain errors. This means that different mixed-order variants need a different decoder. The above case is specific, because the B format z component can be ignored if you decode to any horizontal configuration. In other words: You can ignore z if elevated speakers - positive and negative elevation possible - don't exist in the output configuration. The common horizontal/3D decoder has to include z in each case, of course. )
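To illustrate the point about ignoring z, here is a minimal sketch of a basic first-order decode to a regular horizontal polygon. It assumes FuMa weighting (W carries a factor of 1/sqrt(2)) and a plain projection decode with no shelf filters; the function names are hypothetical and this is not meant as the decoder SFF would prescribe.

```python
import math

def encode_foa(s, azimuth, elevation=0.0):
    """FuMa first-order encoding of one mono sample s (angles in radians)."""
    w = s / math.sqrt(2.0)
    x = s * math.cos(azimuth) * math.cos(elevation)
    y = s * math.sin(azimuth) * math.cos(elevation)
    z = s * math.sin(elevation)
    return w, x, y, z

def decode_horizontal_foa(w, x, y, z, n_speakers):
    """Basic projection decode to a regular polygon of n_speakers.

    z is accepted but deliberately unused: without elevated speakers
    the height component does not contribute to any speaker feed.
    """
    feeds = []
    for i in range(n_speakers):
        # Speaker azimuth, counted anticlockwise from straight ahead.
        phi = 2.0 * math.pi * i / n_speakers
        feed = (math.sqrt(2.0) * w
                + 2.0 * (x * math.cos(phi) + y * math.sin(phi))) / n_speakers
        feeds.append(feed)
    return feeds

# Usage: a source at 45 degrees to the left, decoded to a hexagon.
w, x, y, z = encode_foa(1.0, math.radians(45.0))
print(decode_horizontal_foa(w, x, y, z, 6))
```

The same structure extends to the horizontal 2nd and 3rd order components (UV, PQ) by adding the cos/sin terms of 2*phi and 3*phi, which is why one decoder can serve both the 3/4-channel and the 7/8-channel case.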

I propose that the following mixed order variants could already be included into the v0.1 version of SFF, as < optional > elements:

- 2h1v (8 channels),
- 3h1v (12 channels), and/or
- 3h2p (11 channels)

(Optional means that a decoder or plugin could be able to reproduce these variants, but doesn't have to. A decoder or plugin < should not > crash if reading more than 8 channels, though... This is a superfluous definition, but has to be respected in any case. So, don't sue me for any standard implementation which doesn't work, destroys your computer file system, or whatever other minor reason... ;-) )


At this point (optional part) we have left the backward-compatibility to B format/.amb, including some combinations of the xhyv hierarchy. (B format uses xhyp for mixed orders)

As a 3D audio format, xhyv is more stable than xhyp, in the sense that the horizontal resolution stays the same, independently of the elevation/height from which the sound comes. (As long as spatial music happens mostly in the horizontal plane, I would say that most if not all current music recordings and most spatial performance projects could be "done" in xhyp. IMO 3h1p is still and clearly a better recording format than 2h1v, because of the higher horizontal resolution. If you have content which makes heavy use of the full sphere, you would have to cope. Then and only then, 2h1v might be perceived to be a better option to present isotropic 3D audio. In this case, the reign of mighty WFS would also come to an end, in epic failure style... )

An introduction to different mixed-order schemes can be found here:

Ch. Travis "A New Mixed-Order Scheme for Ambisonic Signals"
http://ambisonics.iem.at/symposium2009/proceedings/ambisym09-travis-newmixedorder.pdf/?searchterm=travis

(Not restricted to 3rd order)
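The channel counts quoted in this posting can be checked with a short sketch. It assumes the usual reading of xhyp (full sphere up to order p, plus the two horizontal components of each order above p) and, for xhyv, my reading of the Travis scheme (keep every component of degree l <= h whose l - |m| does not exceed v). The function names are hypothetical.

```python
def channels_xhyp(h, p):
    """Channel count of the xhyp (#H#P) mixed-order scheme."""
    if p > h:
        raise ValueError("periphonic order must not exceed horizontal order")
    # (p+1)^2 full-sphere components, plus 2 sectoral ones per extra order.
    return (p + 1) ** 2 + 2 * (h - p)

def channels_xhyv(h, v):
    """Channel count of the xhyv (#H#V) scheme, assumed rule: l - |m| <= v."""
    if v > h:
        raise ValueError("vertical order must not exceed horizontal order")
    count = 0
    for l in range(h + 1):
        for m in range(-l, l + 1):
            if l - abs(m) <= v:
                count += 1
    return count

# Usage: the variants named in this proposal.
for name, n in [("3h1p", channels_xhyp(3, 1)),
                ("3h2p", channels_xhyp(3, 2)),
                ("3h3p", channels_xhyp(3, 3)),
                ("2h1v", channels_xhyv(2, 1)),
                ("3h1v", channels_xhyv(3, 1))]:
    print(name, n)
```

Under these assumptions the functions reproduce the 8, 11, 16, 8 and 12 channels stated for 3h1p, 3h2p, 3h/3p, 2h1v and 3h1v.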

The mandatory part of SFF is a subset of .amb < and > respects an upper 8-channel limit. (6 channels is nowadays not a serious limit, because you won't distribute B format or "SFF v1.0" via DD, DTS or S/PDIF. Think also of the fact that AAC, Opus etc. offer much better compression than DD and DTS. I don't see < any > bitrate problem for 8 channels of AAC/Opus compressed audio if 5.1 DD didn't have a bitrate problem in the 90s.)


It is good to include some optional format extensions with a higher channel count right from the start, as suggested. At worst you would prepare future extensions, but you didn't "break" current software and interfaces in the mandatory part. (I am aware that optional features are not always implemented, because people tend to be pragmatic, vulgo "lazy". On the other hand end users/professionals/musicians will be able to choose between different implementations. Choice means that optional features will be implemented at least sometimes IF a certain demand exists in the 1st place.)



III. Loudspeaker configurations (presentation as a sketch)

1. Mandatory support for:

Stereo, quad (4 speakers), 5.1, 7.1

Binaural for a mobile device. (Obviously. What else?)



2. Optional

I propose to put both the regular hexagon and octagon into this category. (Very nice for enthusiasts and for people working in this field, but statistically not yet found in the wild. Can you really expect that consumers will move around their 5.1/7.1 speakers?)

Binaural via loudspeakers, using X-talk cancellation. (Ambiophonics, BACCH etc.) (You could say that the people behind might adapt their technology to reproduce SFF.)


3. "New" horizontal configurations:

a) ITU 6.1

This is 5.1 + a CB (center back) speaker. This configuration didn't succeed for home theater use. One reason could be that the angle between SL/SR and CB is 80º according to the standard, so maybe a bit too wide for effective pairwise panning.

In the Ambisonics case, the cited panning problem doesn't exist, and there should be some real improvement compared to the normal 5.1 configuration. 6.1 is then an easy extension of 5.1, and very probably better for soundfield generation. (The existing home theater installation is both extended and stays < unchanged >. No side effects, you only have to install one more speaker.)

Because the center back speaker might have to be closer to the listener than other speakers for practical reasons ("not enough space behind your sofa"), some distance compensation/delay might be necessary.

This same plane-wave assumption makes it possible to vary the distance of speakers within reasonable limits without upsetting the correct function of the decoder, provided that the difference is compensated with delay, the power is adjusted for uniform loudness at the center, and that per-speaker near-field compensation is used. Distance does not affect the decoder matrix.

(Source:
http://en.wikipedia.org/wiki/Ambisonic_reproduction_systems
)
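The delay and level part of this compensation is simple arithmetic. The following is a minimal sketch, assuming a speed of sound of 343 m/s and a 1/r level law; the function name is hypothetical, and per-speaker near-field compensation (which the quoted passage also asks for) is not covered here.

```python
SPEED_OF_SOUND = 343.0  # m/s, assumed value for room temperature

def distance_compensation(distances_m):
    """Return one (delay_seconds, gain) pair per speaker.

    Every speaker is aligned to the farthest one: closer speakers are
    delayed by the extra travel time and attenuated by r / r_max.
    """
    r_max = max(distances_m)
    result = []
    for r in distances_m:
        delay = (r_max - r) / SPEED_OF_SOUND
        gain = r / r_max  # 1/r law: a closer speaker is louder, so turn it down
        result.append((delay, gain))
    return result

# Usage: front speakers at 2.5 m, a center back speaker at only 1.5 m.
for delay, gain in distance_compensation([2.5, 2.5, 1.5]):
    print(round(delay * 1000.0, 2), "ms", round(gain, 2))
```

As the quoted text says, none of this touches the decoder matrix itself; it is applied to the speaker feeds afterwards.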


b) "ITU 8.1"

Actually, this configuration is not a defined standard, and might not exist anywhere yet. It is ITU 7.1 plus a CB (center back) speaker. While this doesn't improve things for panned audio (unless you want to have sharp "non-phantom signals from a back position"), this might be a good configuration for TOA decoders. You have 8 speakers, and also a configuration that is symmetric about an imaginary axis through the middle, going from -90º to 90º. (This configuration is relatively close to an "ideal" octagon, and backward compatible to all versions of 5.1/6.1/7.1 film audio.)

The ITU 6.1 and "8.1" configurations would add just one loudspeaker to existing configurations for film sound.


4. Loudspeaker extensions for 3D audio.

You would add (at least) 4 upper speakers, as long as SFF stays restricted to 1st and 2nd vertical orders. 2nd order would require at least one more elevation level. (Full periphonic 3rd order would require an 8-6-1, for the upper hemisphere. Full 3h1p decoding requires 8-4 or 8-4-1. 8-4-1 fits even for the optional 3h2p case, BTW. "8" can refer to octagon or "ITU 8.1", see above. Decoding of TOA to 5.1/6.1 is possible, even if detail is lost.)

Negative elevations are possible and have been done, in certain home loudspeaker configurations. (Including the 7.13D configuration, if we talk once more about gaming. )


IV. Binaural decoding

Decoding sound field format/SFF via HRTF sets via headphones, optionally including HT. (Note that motion-sensors, gyroscopes and position tracking via GPS are included in smartphones and many other devices. See also recent discussions on sursound.)


V. Working with SFF, mixing, cutting, editing...

All current mandatory and optional combinations of SFF v0.1 (4-12 channels) could be derived from a full-periphonic 3rd order B format mix (16 channels), presented probably in .AMB format. This is why we currently stay at 3rd order/TOA, and don't aim for orders above.
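A minimal sketch of such a derivation follows. It assumes the FuMa channel sequence WXYZ RSTUV KLMNOPQ for the 16-channel mix and assumes that "deriving" a variant simply means dropping the unused channels, without any re-weighting. The labels "1h" and "1h1p" for horizontal and full FOA, the xhyv channel sets (which follow my reading of the Travis scheme) and the function name are all hypothetical.

```python
FUMA_ORDER = "WXYZRSTUVKLMNOPQ"  # assumed FuMa channel sequence, 3rd order

SFF_SUBSETS = {
    # mandatory part
    "1h":   "WXY",
    "1h1p": "WXYZ",
    "3h":   "WXYUVPQ",
    "3h1p": "WXYZUVPQ",
    # optional part
    "2h1v": "WXYZSTUV",
    "3h2p": "WXYZRSTUVPQ",
    "3h1v": "WXYZSTUVNOPQ",
}

def extract(frame16, variant):
    """Pick the channels of one variant out of a 16-channel frame.

    frame16 holds one sample per channel in FUMA_ORDER.
    """
    if len(frame16) != len(FUMA_ORDER):
        raise ValueError("expected a 16-channel frame")
    return [frame16[FUMA_ORDER.index(name)] for name in SFF_SUBSETS[variant]]

# Usage: show which positions of the master frame end up in 3h1p.
print(extract(list(range(16)), "3h1p"))
```

A table like this also gives a decoder a cheap sanity check: an unknown variant or an unexpected channel count can be rejected instead of crashing.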

(When going from 1st order Ambisonics to TOA, there is a significant improvement in quality. The improvements are less clear if you compare SOA to FOA. If SFF or any accepted TOA format could be established as a kind of 5.1 successor, any later version might very well skip 4th order and go directly to somewhere between 5th and 7th order, to obtain some clear improvement. And probably we would again use some mixed-order scheme, maybe 5h3p or 6h3p. 20 or 22 channels sound like an awful lot compared to 5.1, but it is about a factor of 4. If each Ambisonics channel is compressed with AAC at, say, 80 kbit/s, you would (still and only) need about 1.6 Mbit/s for 20 audio channels. I also don't see any problem in handling about 20 uncompressed channels in a DAW. Just think of the requirements for video editing and even Photoshop, and relax...)

8 channels can be coded easily in 640kbit/s. This is a Dolby Digital data rate. (The above data rate of 1.6 MBit/s would compare to 5.1 DTS, but is wayyyy better... There exist certain ideas how you could compress very high orders even more, but this is beyond the scope of the current SFF project.)
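The bitrate figures above are plain multiplication; a small sketch makes them easy to replay with other numbers. The 80 kbit/s per channel is the figure used in this posting, while the 48 kHz / 24 bit PCM parameters in the second function are my own assumption for the uncompressed DAW case.

```python
def total_bitrate_kbps(channels, per_channel_kbps=80):
    """Total rate when every channel is coded separately at the same bitrate."""
    return channels * per_channel_kbps

def pcm_bitrate_kbps(channels, sample_rate_hz=48000, bits_per_sample=24):
    """Uncompressed PCM rate, for comparison (assumed 48 kHz / 24 bit)."""
    return channels * sample_rate_hz * bits_per_sample / 1000.0

# Usage: 8 mandatory channels, and a possible 20-channel future extension.
print(total_bitrate_kbps(8), "kbit/s")
print(total_bitrate_kbps(20), "kbit/s")
print(pcm_bitrate_kbps(20), "kbit/s uncompressed")
```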

A far bigger problem for any application of Ambisonics >= 4th order at home than memory space or available bitrates is the fact that you will have to use (or re-use) existing loudspeaker configurations, too. (At least if you don't expect that your customer will run to Wal Mart or Lidl to buy plenty of cheap speakers and cables for the installation of his/her newest Ambisonics home system.) TOA has already been shown to work successfully in home applications, see the cited gaming audio engine in section II.

SFF is meant to be used for specific areas, especially as a home and mobile surround/3D audio format. We are not covering yet all possible applications/needs such as live concerts, audio installations, cinemas, theaters, operas etc. However, it would be possible to present some future version of SFF which will cover all those applications and areas, and might include HOA order >=4. (So, it made sense to talk about extensions to 5-7h/3p, or similar proposals, probably mixed-order variants. Any future version of SFF would be backward-compatible to "SFF v1.0", of course. Otherwise you could not speak about "SFF" at all)


VI. Microphones

The FOA soundfield microphones got some recent boost via the Harpex decoders and upsamplers, which in some cases will improve results if you compare the Harpex decoder to classical dual-band, rE or in-phase decoders. Harpex is also very useful (I hear) if you want to mix a FOA recording into some (synthetic) HOA field. (I believe this works only horizontally, for now.)

A spherical array microphone with 32 capsules can record 3rd and maybe 4th order. (If I understood this well, you could introduce a version with 20 capsules as a simpler TOA mike. Comments?)

You'll find an excellent article about both HOA and HOA microphones here:

http://ambisonics.iem.at/symposium2009/proceedings/ambisym09-daniel-evolvingviewsonhoa.pdf/at_download/file

It seems to me that direct 3rd order recording is currently just about feasible. (EigenMike)


VII. Backward-compatibility to stereo

I have suggested before that FOA (1st order B format) could be distributed in a form which is backward-compatible to stereo, using the 3/4 channel form of UHJ.

http://en.wikipedia.org/wiki/Ambisonic_UHJ_format#The_UHJ_hierarchy

If a third channel (T) is available, this can be used to give improved localisation accuracy to the planar surround effect when decoded via a 3-channel UHJ decoder.


Adding a fourth channel (Q) to the UHJ system allows the encoding of full surround sound with height, known as Periphony, with a level of accuracy identical to 4-channel B-Format.


Which gives just another form of WXYZ, coded in a backward-compatible fashion to stereo, which is LRTQ.

The LRTQ presentation could very probably be used to distribute AAC stereo/FOA files. If lossless distribution is required, you might distribute LRTQ via FLAC. (Note that you need always a decoder for the surround signals. LRTQ will always be brought back into B format form, if you decode to surround sound. Nobody would expect to need an Ambisonics/UHJ decoder for POSL/plain old stereo listening, unless you are a real expert and decode B format to 2 speakers. It works and you do this maybe after recording with a SFM, but... O:-) )


This idea could be extended to higher orders, as I have suggested before. A very easy way to do so has been proposed by Jörn Nettingsmeier.

Take Ambisonics B format orders 3h and (mixed-order) 3h1p. The components are WXY(Z) UV PQ.

Jörn's suggestion is to use only 1st order to derive L/R, which leads us to LRTQ UV PQ.

In the meantime, I have found some possible problem with this: The Q and Z channels are actually different!

(The Q channel is not used to derive L/R. A possible explanation could be in different forms of filtering. See the section presenting UHJ coding and decoding equations in the cited link:

Q = 0.9772*Z

Z = 1.023*Q

Note that two-channel UHJ requires the player to use different shelf filters than for 3/4-channel UHJ and B-Format);

So, according to Wikipedia, there is actually no different filter if you compare 3/4 channel B format and UHJ format.

Why are Q and Z then different at all?! Note also that the decoding equations for WXY are exactly identical for 3- and 4-channel UHJ.

(Comments/help very welcome)
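For what it is worth, the two quoted coefficients can at least be checked against each other. The sketch below uses only the numbers cited above; it shows what they imply if taken at face value, not why UHJ chose this factor.

```python
import math

Q_FROM_Z = 0.9772  # quoted encoding coefficient: Q = 0.9772 * Z
Z_FROM_Q = 1.023   # quoted decoding coefficient: Z = 1.023 * Q

# Encoding followed by decoding should give back Z (up to rounding).
round_trip = Q_FROM_Z * Z_FROM_Q

# The same factor expressed as a level difference in decibels.
gain_db = 20.0 * math.log10(Q_FROM_Z)

print(round_trip)
print(gain_db)
```

To the quoted precision the two factors are reciprocals, and 0.9772 corresponds to a level change of roughly -0.2 dB. On these numbers alone, Q would just be Z with a small fixed gain and no filtering, so the "naive" proposal might only need a gain correction on that channel. Whether that is the whole story is exactly the open question.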

Because Q and Z are not identical, there might be some problem with our "naive" proposal. (This problem looks very fixable, but where are ye UHJ experts who show us how to extend UHJ to infinite orders, using Jörn's "don't care for higher orders if deriving L/R" (TM) approach? ;-) )

The hierarchical stereo-surround UHJ presentation of SFF saves not only memory space or transmission bitrate (you don't have to distribute a stereo file and a surround file), it also might save money in case that certain right holders would ask for double license fees. Having worked in the field of optical discs and having proposed (different) hybrid discs, I would suspect this would be the normal case, unless recently things have changed a lot in the music label and publishing rights world.

One file instead of two is also a desirable simplification, from a consumer perspective.


VIII. So what?

What is suggested is a relatively simple standard for (mainly) 1st and 3rd order Ambisonics.

SOA (2nd order Ambisonics) is skipped (with the exception of 2h1v as option), specifically 2h and 2h1p. (Going up to 3rd order provides some clear improvements compared to FOA, such as the perfect or say near-perfect reconstruction of a much wider frequency range in the "head zone", a wider sweet spot/zone and improved localization for all frequencies out of the sweet spot.)

Considering the wide use of B format in existing software, plugins, decoders and software toolboxes/libraries, TOA is clearly preferable to 4th order, from the view of a software engineer/developer.

The SFF standard proposal is built around existing solutions wherever possible, including some new elements. (Backward-compatibility to stereo, 6.1 and "8.1" loudspeaker configurations, xhyv hierarchy... All these elements are < optional >, which is intentional. Concerning the use of the UHJ hierarchy to obtain easy backward-compatibility to stereo, I suggest that these transformations can be done in an easy way, so I strongly recommend actually implementing/using this feature. If this were not my own recommendation, I would have said this should be a < mandatory > part. But so... 8-) )


Although the channel count of 8 is moderate from a current perspective, 3rd order Ambisonics is powerful. Maybe VBAP rendering via 8 speakers would give the same accuracy of localization as (3rd order) Ambisonics rendering, but I doubt that Ambisonics would test worse. Besides this, Ambisonics is loudspeaker independent, and we have a real and flexible surround/3DA format. (Whereas VBAP is a rendering technology, not a format.)

Two optional forms of SFF (3h2p and 3h1v) use 11 and 12 channels, respectively. These are thought to improve applications which would stress the 3D audio aspect. 12 channels is about the same as the channel count for Auro-3D, which in its most normal form is 11.1. (Auro-3D is a simple discrete format for 3D audio, so the comparison of SFF to both 7.1 and 11.1 seems fair. You/we can test later...)

This has been a long posting. However, it is a standard proposal, in request-for-comments form. Unlike in a final standard description, I also wanted to provide some < reasons > why I propose certain features/elements, and some real basis for discussion. 5.1 won't go away, but the proposal (sound field format/SFF) is clearly both more powerful and flexible. It is also more complicated, we have to admit. (It is an open standard, in this sense different from Mpeg-H 3DA. Without being able to give any proof because you can't prepare unfinished states, I believe that SFF could/will be a lot simpler than Mpeg-H 3DA.)

Special thanks go to Michael Chapman, Aaron Heller, Jörn Nettingsmeier, Richard Furse and Svein Berge. You have answered both rather competent and exceptionally ignorant questions, which to my feeling and considering my limited capabilities still made a lot of sense O:-) ... (The SFF proposal and all expressed views are just my own, although each of the cited persons had some influence on my views. I am completely unaware if any of the cited persons agrees with any of my views... )

Speaking about Harpex, you could see this as a kind of bridge between 1st and 3rd order Ambisonics. (Harpex might actually be judged to be even "sharper" than 3rd order under some circumstances, which will depend on the decoded material. In general, "native" 3rd order should prove to be more natural. Feedback welcome...)

We have also discussed if parametric decoders (like Harpex or DirAC) could be extended to HOA and specifically 2nd/3rd order, which theoretically should be possible. However, this doesn't seem to have been done yet.
(Comments...)

An open standard doesn't mean that commercial solutions which would implement/use SFF should not be welcome.

The proposed ("home", "CE", whatever...) standard SFF = Sound Field Format should be controlled by the community (and/or possibly by certain open organisations/foundations), not by companies.

Some attention has been given to the application and decoding side, including < mandatory > and < optional > loudspeaker configurations and binaural decoding options. Whereas some decoders (such as Rapture3D) offer free speaker positioning - which is impressive -, this is not the normal case, nor < should > all decoders implement such a feature. It is obvious that any "flexible" decoder could pre-install all mandatory and optional configurations in an easy and selectable way.


Best regards,

Stefan Schreiber                                            Lisbon



---------------------------------------------------------------------------------------------

Further related thread:


-------- Original Message --------
Subject: [Sursound] Two new approaches for the distribution of surround sound/3D audio
Date:   Mon, 29 Jul 2013 03:57:34 +0100
From:   Stefan Schreiber <st...@mail.telepac.pt>
Reply-To:       Surround Sound discussion group <sursound@music.vt.edu>
To:     Surround Sound discussion group <sursound@music.vt.edu>



(Continuation of: The commercial future of Ambisonics, 15/5/2013)


Dear colleagues,

following the recent standardization of 3D audio by Mpeg (ISO/IEC 23008-3) and related activities, I have come to the conclusion that the (older) B format up to 3rd order might need some updates.

However, I also came to the conclusion that FOA (first order Ambisonics) could be easily included into all current distribution models for audio on the Internet, which are (to "99.98%") stereo-based. We nearly have been "there", in the above cited thread! ("The commercial future of Ambisonics")

I will start with this part, because you can see this as a format of its own. Which might be the perfect bridge or transition format for future surround/3D audio (3DA) formats...


....


IV. Improved binaural representation via headphones

Note that headphones with HT "chips" and motion-corrected binaural playback of surround sound (including 3D audio) could easily be realized, with available and actually quite affordable chips.

Oculus Rift is the direct example for this, as this is a full (and certainly more complex) VR and gaming device.

http://worthplaying.com/event/E3_2013/PostE3_2013/89888/

In its current state, the Oculus Rift is an amazing piece of work, and after decades of dealing with VR technology, it seems that we may finally see a VR unit that is going to get it right.


Wikipedia writes about the Oculus Rift motion-tracking:

"Initial prototypes used a Hillcrest 3DoF head tracker that is normally 120 Hz, with a special firmware that John Carmack requested which makes it run at 250 Hz, tracker latency being vital due to the dependency of virtual reality's realism on response time. The latest version includes Oculus' new 1000 Hz Adjacent Reality Tracker that will allow for much lower latency tracking than almost any other tracker. It uses a combination of 3-axis gyros, accelerometers, and magnetometers, which make it capable of absolute (relative to earth) head orientation tracking without drift.[20][25]"


Now, apply the same or similar HT silicon (which is already very affordable) to HT/motion-tracking headphones... (I could give some detailed recommendations how to do this, but this is also one of the next steps... Nice to see that at least the video and gaming people have kept some sense for cool technology and seemingly "weird" ideas, so to speak. How many motion updates per second would a "fluent" head-tracking binaural decoder/decoding program actually require? Ye Ambisonics experts, what do you think or better know?! You would have to decode some UHJ/".AMB+" file and shift the soundfield relative to the head position, I guess. The head position needs some regular and frequent updates, which we easily get by now. You could also track the < absolute > movements of persons within some area. Say: Your decoder program tracks the movements of the visitors in some museum or building, and plays the associated audio/explanations fitting to the current position. "This is the dining hall of the castle, which was quite cold during winter, but warm or even hot during summer." Ok, this was a truly dull example... :-) )
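The "shift the soundfield relative to the head position" step can be sketched for the simplest case: first order and yaw only. The sketch assumes azimuth counted anticlockwise from straight ahead and a tracker that reports head yaw in the same sense; the function names are hypothetical, and pitch/roll or higher orders would need full rotation matrices.

```python
import math

def rotate_yaw(w, x, y, z, angle):
    """Rotate a first-order sound field about the vertical axis.

    A source at azimuth a ends up at azimuth a + angle (radians).
    W and Z are unaffected by a rotation about the vertical axis.
    """
    c, s = math.cos(angle), math.sin(angle)
    return w, c * x - s * y, s * x + c * y, z

def compensate_head_yaw(w, x, y, z, head_yaw):
    """Counter-rotate the field so that sources stay fixed in the room."""
    return rotate_yaw(w, x, y, z, -head_yaw)

# Usage: a source 30 degrees to the left; the listener turns to face it.
a = math.radians(30.0)
w, x, y, z = 1.0 / math.sqrt(2.0), math.cos(a), math.sin(a), 0.0
print(compensate_head_yaw(w, x, y, z, math.radians(30.0)))
```

This rotation would be applied per tracker update, before the rotated B format is decoded binaurally through the HRTF set.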


Best regards

Stefan Schreiber                                          Lisbon



_______________________________________________
Sursound mailing list
Sursound@music.vt.edu
https://mail.music.vt.edu/mailman/listinfo/sursound
