On 2014-11-12, Adam Somers wrote:
> VR video (or as we call it, Cinematic VR) is in some ways the perfect
> use-case for ambisonics. This year we've created hundreds, if not
> thousands, of b-format recordings with accompanying 360º 3D video.
It really is, because of the most basic, old-fashioned ambisonic
principle: fully uncompromised isotropy in encoding. (Note, I'm not
saying anything about decoding.) In that respect the technology fits
*abominably* well with stereoscopy, and especially with looking around
from a fixed viewpoint, just as on the optical side.
Where ambisonics might *not* be so good is in virtual environments
where you move about. That's because of its centred, angularly
parametrized framework, which pretty much only lends itself to a fixed
"view"point into the acoustic field.
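One concrete way to see that fixed-viewpoint limitation: head *rotation* is trivial in B-format, while head *translation* has no comparably simple closed form. A minimal sketch in Python (the function name is mine, not anyone's API), rotating a first-order horizontal soundfield for a listener yaw:

```python
import numpy as np

def rotate_bformat_yaw(w, x, y, z, yaw):
    """Counter-rotate a first-order B-format frame for a head yaw of
    `yaw` radians (counter-clockwise, toward the left).

    W (omnidirectional pressure) is rotation-invariant; X and Y
    transform with a plain 2-D rotation matrix, so a source at
    azimuth phi ends up at phi - yaw. No such static matrix exists
    for moving the listening point off-centre, which is exactly the
    fixed-viewpoint limitation described above.
    """
    c, s = np.cos(yaw), np.sin(yaw)
    xr = c * x + s * y
    yr = -s * x + c * y
    return w, xr, yr, z
```

Head tracking in VR playback reduces to exactly this kind of cheap matrix update per video frame; translation does not.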
You can then make it work in synthetic and even recorded-recreated
acoustic environments, but not by direct recording and playback. You
have to do something extra in between. You have to somehow abstract off
your B-format recording, so that the auditory cues still match: things
like reverberation falloff, the auditory parallax of close sources as
you move, and the mutual, directional correlation coefficients of the
stuff you perceive as being part of the space and envelopment.
Ville's and Archontis's work can't do all of that yet. But still,
given how nice they sound and how they also perform that kind of
abstraction along the way, I'd say they are at the forefront of the
stuff which could eventually become a Gaming/Movie/Department-store
miracle-development.
Still, I've yet to find a B-format-to-binaural solution which is as
convincing as some of the BRIR-based object-sound spatialization
packages (e.g. DTS Headphone:X and VisiSonics RealSpace).
I believe I know where the problem is, or at least I believe I can
participate meaningfully in a process which leads to a nigh-optimal
solution.
And in this one, I do mean it, for real. I have some real ideas here,
with my only problem being that I'm lazy, poor, already well underway
into hard alcoholism...and short of hands who'd take my ideas seriously.
The spherical surface harmonic kinds of ideas.
Just give me the usual doctorand, starved for life and scholarship, even
on-list, and I'll tell you how B-to-binaural is done. If not as a final
solution, then as a bunch of processes and guidelines. À la Gerzon
Himself. :D
I think what's primarily lacking is externalization, which perhaps can
be 'faked' with BRIRs.
Ville Pulkki's work with DirAC, and his and his workgroup's two
demonstrations, have me convinced that even fourth-order ambisonics
leaves too much artificial correlation in the soundfield, at the scale
of a human head, to sound natural. That then also means you can't just
naïvely, linearly, statically matrix down from any extant order of
(periphonic or otherwise) ambisonics to binaural, even with full head
tracking, and expect it to sound as good as the best object panning
format.
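For reference, the naïve static approach looks like this: decode to a ring of virtual loudspeakers with fixed gains, convolve each feed with that direction's HRIR pair, and sum. A sketch (the speaker layout, decode gains, and impulse "HRIRs" below are illustrative assumptions; a real system would use measured HRTF data):

```python
import numpy as np

def static_bformat_to_binaural(w, x, y, az_deg, hrirs_l, hrirs_r):
    """Naive static binaural decode of horizontal first-order B-format
    (ambiX-style, unscaled W) via virtual loudspeakers.

    Each virtual speaker at azimuth az gets a fixed cardioid-ish feed
    0.5*W + 0.5*(cos(az)*X + sin(az)*Y), convolved with that
    direction's left/right HRIRs. Everything here is one static
    linear matrix plus fixed filters -- the approach argued above to
    leave too much inter-channel correlation to sound natural.
    """
    n_out = len(w) + hrirs_l.shape[1] - 1
    left, right = np.zeros(n_out), np.zeros(n_out)
    for k, az in enumerate(np.deg2rad(az_deg)):
        feed = 0.5 * w + 0.5 * (np.cos(az) * x + np.sin(az) * y)
        left += np.convolve(feed, hrirs_l[k])
        right += np.convolve(feed, hrirs_r[k])
    return left, right
```

Head tracking only rotates X/Y before this matrix; the matrix itself, and hence the correlation structure it imposes, stays fixed.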
So you need some active decoding magic in between. The Gerzon-era
problem with active decoding was really that it was being used for the
wrong purposes, too aggressively, and at such a low level of electronic
sophistication that it just couldn't work. Gerzon never touched
Mathematica or MATLAB either, so much of his analysis was that of an
engineer, not of an all-knowing AI-mathematician. Now we can do a bit
better in all of those regards. We even have things like Bruce's Tabu
search results well entrenched in the open literature; something the
old days could not have dared to dream about.
So, why *not* go with active decoding once again? It's no blasphemy
if all of the original counterpoints have been answered, and we have
psychoacoustic data (and, in my case, stupendously obvious anecdotal
evidence) showing we can simply do better with active processing. I say
there's no reason not to.
I.e. let's not talk about faking at all, anywhere. Let's only talk
about capturing the most, most processable, simplest auditory data on
the scene, and then about how to make the best of that captured data
when replaying it. Via any means at all, aiming at transparency.
That's then why I'm once again such a fanboy of the Aalto people's work:
even the ideal, synthetic fourth-order playback sounded like shit
compared to the reference. After the newfangled DirAC processing, lo and
behold, it came pretty close to the original. So if you think -- like I
do -- of the outcomes, you too would have to beg for Eigenmikes, Aalto's
software for them, and then home AV gear cognizant of decorrelation,
B-format-minded sound intensity analysis, and that newfangled variety of
infinite-order decoder we call Higher Order DirAC.
Let me coin a shorthand: as opposed to HOA, it's HODAC.
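To make the "active decoding magic" concrete, here is a toy, per-sample version of the DirAC-style analysis step (real DirAC works on short-time spectra, per frequency band): the active intensity vector gives a direction of arrival, and a diffuseness estimate splits the field into a point-source part, to be panned sharply, and a diffuse part, to be decorrelated. Assumes ambiX-style first-order input with unscaled W; function name mine.

```python
import numpy as np

def dirac_analysis(w, x, y, z, eps=1e-12):
    """Toy DirAC-style analysis of first-order B-format (unscaled W).

    Direction of arrival comes from the instantaneous intensity
    vector I ~ W * [X, Y, Z]; diffuseness psi in [0, 1] compares |I|
    to the energy density. A single plane wave gives psi ~ 0; a
    diffuse field, where time-averaged intensity cancels, drives
    psi toward 1.
    """
    ix, iy, iz = w * x, w * y, w * z
    azimuth = np.arctan2(iy, ix)
    elevation = np.arctan2(iz, np.hypot(ix, iy))
    energy = 0.5 * (w**2 + x**2 + y**2 + z**2)
    psi = 1.0 - np.sqrt(ix**2 + iy**2 + iz**2) / np.maximum(energy, eps)
    return azimuth, elevation, psi
```

The synthesis stage then re-renders the non-diffuse fraction as a hard-panned source at the estimated direction and decorrelates the rest, which is where the reduction in artificial correlation comes from.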
--
Sampo Syreeni, aka decoy - de...@iki.fi, http://decoy.iki.fi/front
+358-40-3255353, 025E D175 ABE5 027C 9494 EEB0 E090 8BA9 0509 85C2
___
Sursound mailing list