On Tue, Jan 26, 2016 at 09:52:21PM +0100, Jörn Nettingsmeier wrote:
> On 01/26/2016 06:36 PM, Stefan Schreiber wrote:
>
> > 2. < 8 > impulses (for 4 virtual speakers) implies that you don't
> > support 3D decoders (?). If not, why this? (Immersive/3D audio is on
> > the requirement list for VR. It wouldn't make a lot of sense if all
> > sound sources will follow your gaze - looking upwards or downwards.)
>
> I think the 8 impulses are used differently. I'm scared of trying to
> explain something of which my own understanding is somewhat hazy, but
> here it goes: please correct me ruthlessly. Even if in the end I wish
> I'd never been born, there might be something to learn from the
> resulting discussion :)
>
> ...
Your understanding is 100% correct. I've been reading this thread with much interest, as it is exactly the topic I've been working on for the last two months. The result is a prototype system converting full 2nd order to binaural. Decoder and binaural rendering are combined, in the way Joern explained, into a 9 * 2 convolution matrix. Motion tracking uses a cheap (50 Euro) USB sensor which provides around 90 quaternions per second, and the corresponding rotation is done in the Ambisonic domain. The whole thing works quite well so far.

Unfortunately I can't tell much more, just a few comments on some of the topics raised in the thread.

* SOFA is just a format for representing data such as HRIRs. Apart from the actual IRs, a SOFA file will provide things like the set of directions, source distance, etc. But it does not impose any standard values for that metadata, so converting a SOFA data set into the N * 2 convolutions that are required in the end can't be done blindly. I've been using at least five different sources; all of them were in SOFA format, but each one required some quite specific treatment. In other words, this is a format for researchers, not for end users.

* Most HRIR sets have an LF response that is almost certainly not correct: up to a few hundred Hz it should be flat. One essential preparation step is to fix this, and how this is best done depends on the particular data set. If it is done correctly, you can trim the IRs to a few ms without any adverse effect.

* Another essential step is to align the delays. The HRIRs must be shifted in time so that all the ipsilateral sides have the same delay. This avoids comb filtering, which would provide false spectral cues.

* From my personal experience I'd agree with Dave Malham: you will adapt to a specific set of HRIRs if they are not your personal ones. And it's my impression that this adaptation remains - you'll 'remember' it.
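The rendering scheme described above (rotate the sound field in the Ambisonic domain, then apply the combined 9 * 2 convolution matrix) can be sketched roughly as follows. This is an illustrative reconstruction, not the author's code: the ACN channel ordering, the yaw-only rotation, and all function names are my assumptions, and the sign of the rotation depends on the handedness of your tracker.

```python
import numpy as np

def rotate_acn_yaw(amb, yaw):
    """Rotate a 2nd-order ambisonic signal (9 channels, assumed ACN order)
    about the vertical axis. Yaw-only for brevity; a full head-tracked
    renderer would build the rotation from the sensor quaternion.
    Sign convention is arbitrary here - flip the sine terms if your
    tracker rotates the other way."""
    out = amb.copy()
    # ACN pairs that rotate together: (1,3) = Y/X (degree 1, |m|=1),
    # (5,7) = T/S (degree 2, |m|=1), (4,8) = V/U (degree 2, |m|=2).
    # Channels 0 (W), 2 (Z) and 6 (R) are invariant under yaw.
    for (neg, pos), m in [((1, 3), 1), ((5, 7), 1), ((4, 8), 2)]:
        c, s = np.cos(m * yaw), np.sin(m * yaw)
        out[neg] = c * amb[neg] + s * amb[pos]
        out[pos] = -s * amb[neg] + c * amb[pos]
    return out

def binauralise(amb, hrir_l, hrir_r):
    """amb: (9, n) ambisonic signal; hrir_l, hrir_r: (9, k) filters,
    i.e. decoder and HRIRs pre-combined into a 9 x 2 convolution
    matrix. Returns a (2, n + k - 1) stereo signal."""
    left = sum(np.convolve(amb[i], hrir_l[i]) for i in range(9))
    right = sum(np.convolve(amb[i], hrir_r[i]) for i in range(9))
    return np.stack([left, right])
```

In a real-time version the nine convolutions per ear would run block-wise (e.g. partitioned FFT convolution), with the rotation updated from the ~90 Hz quaternion stream.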
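The delay-alignment step in the list above can be illustrated with a small sketch. The onset-detection threshold and the integer-sample shift via `np.roll` are my simplifications for illustration; a careful implementation would estimate delays more robustly and use fractional-delay filters for sub-sample accuracy.

```python
import numpy as np

def onset(ir, frac=0.1):
    """Index of the first sample whose magnitude exceeds frac * peak -
    a crude onset estimate of the HRIR's time of arrival."""
    return int(np.argmax(np.abs(ir) >= frac * np.max(np.abs(ir))))

def align(hrirs, target=32):
    """Shift each IR so its onset lands on a common sample index,
    removing the direction-dependent delay offsets that would otherwise
    cause comb filtering when the convolutions are summed.
    hrirs: (ndir, n) array. Assumes enough leading/trailing zeros
    that np.roll does not wrap meaningful samples around."""
    out = np.zeros_like(hrirs)
    for i, ir in enumerate(hrirs):
        out[i] = np.roll(ir, target - onset(ir))
    return out
```

After alignment, only the interaural delay (the difference between the left- and right-ear sets) is kept as a timing cue; the per-direction common delay is normalised away.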
* I don't think that the exact inter-ear distance is that important - I've been able to modify it (within reason) without any ill effect. What makes a set of HRIRs personal is probably more the complex direction-dependent filtering by the pinnae.

* All content derived from non-surround sources (e.g. plain stereo or 5.1) requires some 'room sound' to work well. Externalisation seems to depend on having early reflections from different directions (which would allow the brain to compare their spectra). Generating such room sound can be done in the AMB domain. What exactly is required, and how to do it efficiently, is my current research problem.

Ciao,

-- 
FA

A world of exhaustive, reliable metadata would be a utopia. It's also a pipe-dream, founded on self-delusion, nerd hubris and hysterically inflated market opportunities. (Cory Doctorow)

_______________________________________________
Sursound mailing list
Sursound@music.vt.edu
https://mail.music.vt.edu/mailman/listinfo/sursound - unsubscribe here, edit account or options, view archives and so on.