Hi y'all. There's been a lot of discussion over the past couple of years about rendering ambisonic soundfields down to binaural. I don't think this problem has been solved in any principled fashion yet, so I'd like to invite some discussion. Especially since I once tried my hand at the problem, and found my skills woefully lacking.

AFAICS, the task is to take a set of HRTF measurements -- the well-known and open KEMAR set, but also any other -- and to derive from it an LTI mapping from a representation of the soundfield to the two ears perceiving the field. The representation ought to be isotropic, as per basic ambisonic principles, and it ought to be matched to the order of the ambisonic field. If you had a neat set of measurements, over the whole sphere of directions, designed to lie in perfect quadrature, this would be easy as cheese.
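
To make the easy case concrete, here's a minimal sketch in Python/NumPy of what that discrete projection would look like, assuming (hypothetically) HRIRs measured at directions that form an exact spherical quadrature with known weights. The names hrirs, az, zen and weights are stand-ins, not any real dataset:

import numpy as np
from scipy.special import sph_harm

def real_sh(l, m, az, zen):
    # Real, orthonormal (N3D) spherical harmonic, built from scipy's
    # complex sph_harm(m, l, azimuth, zenith) convention.
    if m > 0:
        return np.sqrt(2) * (-1) ** m * sph_harm(m, l, az, zen).real
    if m < 0:
        return np.sqrt(2) * (-1) ** m * sph_harm(-m, l, az, zen).imag
    return sph_harm(0, l, az, zen).real

def sh_to_binaural(hrirs, az, zen, weights, order):
    # hrirs: (ndirs, 2, taps) measured left/right pairs; az, zen,
    # weights: (ndirs,). Returns (n_sh, 2, taps): one FIR filter per
    # ambisonic channel per ear.
    Y = np.stack([real_sh(l, m, az, zen)
                  for l in range(order + 1)
                  for m in range(-l, l + 1)])          # (n_sh, ndirs)
    # With perfect quadrature, this weighted sum *is* the projection
    # integral of the HRIR field onto each spherical harmonic.
    return np.einsum('kd,d,det->ket', Y, weights, hrirs)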

The trouble is that no set of measurements really behaves this way. They're not in quadrature at all, and almost *always* you'll have sparse coverage, or even a full gap, towards the direction straight down. If the directional sampling were statistically uniform over the whole sphere of directions, and in addition the sampled directions were in quadrature, it would be an easy exercise in discrete summation to gain the transform matrix we need. But in practice it very much isn't.
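
When the points aren't in quadrature, the usual workaround -- and, to the point of this post, not an obviously principled one -- is to replace the sum with a regularized least-squares fit. A sketch over the same hypothetical inputs, with Y the matrix of real spherical harmonics sampled at the measured directions, as in the previous sketch:

import numpy as np

def fit_sh_ls(hrirs, Y, lam=1e-3):
    # Y: (n_sh, ndirs); hrirs: (ndirs, 2, taps). Solves, per ear and
    # per tap, min_c ||Y.T c - h||^2 + lam ||c||^2. The ridge term lam
    # trades aliasing from the sparsely covered regions against
    # smoothing of the fit; its value here is arbitrary.
    A = Y.T                                        # (ndirs, n_sh)
    n_sh = A.shape[1]
    lhs = A.T @ A + lam * np.eye(n_sh)
    rhs = A.T @ hrirs.reshape(hrirs.shape[0], -1)  # (n_sh, 2*taps)
    c = np.linalg.solve(lhs, rhs)
    return c.reshape(n_sh, *hrirs.shape[1:])       # (n_sh, 2, taps)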

It truly isn't when you have those gaps of coverage in the HRTF data above, and especially below, the head. They lead to divergent, numerically touchy problems in *very* high dimension: if even one of the points in the KEMAR set happens to lie out of perfect quadrature, that single data point contributes error at every spherical harmonic order, to infinite order.
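
That touchiness is easy to demonstrate numerically. This toy computes the condition number of the spherical-harmonic sampling matrix for a synthetic direction set with a gap straight down, mimicking the missing measurements; it grows rapidly with order (the exact figures depend on the gap and the random draw):

import numpy as np
from scipy.special import sph_harm

rng = np.random.default_rng(0)
ndirs = 500
az = rng.uniform(0, 2 * np.pi, ndirs)
# Uniform over the spherical cap above 135 degrees zenith, i.e. no
# samples at all in the cone pointing straight down.
zen = np.arccos(rng.uniform(np.cos(3 * np.pi / 4), 1.0, ndirs))

for order in range(1, 8):
    Y = np.column_stack([sph_harm(m, l, az, zen)
                         for l in range(order + 1)
                         for m in range(-l, l + 1)])  # (ndirs, n_sh)
    print(order, np.linalg.cond(Y))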

It also doesn't help that, directionally speaking, our known HRTF/HRIR sets don't come in quadrature, so that each measurement contributes to directional aliasing. *Statistically*, the individual error contributions might be made to cancel each other out, to a degree. But then again, I know of *no* global, stochastic error metric out there, nor any optimization strategy, proven to be optimal for this sort of task.
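
For lack of a proven metric, one candidate stand-in -- my assumption, nothing established -- is plain k-fold cross-validation over the measured directions: fit the expansion on part of the point cloud, score the squared reconstruction error on the held-out rest. Roughly:

import numpy as np

def cv_error(hrirs, Y, fit, k=10, seed=0):
    # hrirs: (ndirs, 2, taps); Y: (n_sh, ndirs) real SH sampled at the
    # measured directions; fit: e.g. fit_sh_ls from the earlier sketch.
    rng = np.random.default_rng(seed)
    ndirs = hrirs.shape[0]
    folds = np.array_split(rng.permutation(ndirs), k)
    err = 0.0
    for held in folds:
        train = np.setdiff1d(np.arange(ndirs), held)
        c = fit(hrirs[train], Y[:, train])            # (n_sh, 2, taps)
        # Reconstruct the held-out HRIRs from the fitted coefficients
        # and accumulate the mean squared error.
        recon = np.einsum('sd,set->det', Y[:, held], c)
        err += np.mean((recon - hrirs[held]) ** 2)
    return err / k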

So the best framework I could think of, years past, was to interpolate the incoming directional point cloud from the KEMAR and other sets out to the whole sphere, and then to integrate; using a priori knowledge for the edge, singular cases, where a number of the empirical observations prove to be co-planar, and as such singular under inversion. I tried things such as the information-theoretic Kullback-Leibler divergence and the Vapnik-Chervonenkis dimension in order to pare the problem down. What I settled on was a kind of mutual recursion between the directional mutual information of each empirical point gained/removed and the Mahalanobis distance of each spherical harmonic added/removed. It ought to have worked.
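
For reference, a bare-bones version of just the interpolate-then-integrate step -- none of the selection machinery above, and not my old heuristic -- could look like this: smooth-spline one scalar slice of the data over the sphere with scipy, resample on a Gauss-Legendre-by-uniform-azimuth grid (an exact quadrature for band-limited integrands), and project. The smoothing factor s is a knob, not a principled choice:

import numpy as np
from scipy.interpolate import SmoothSphereBivariateSpline
from scipy.special import sph_harm

def project_via_interpolation(values, az, zen, order, s=1e-2):
    # values: (ndirs,) one real scalar per direction, e.g. one frequency
    # bin of one ear's HRTF magnitude; az, zen in radians.
    spline = SmoothSphereBivariateSpline(zen, az, values, s=s)
    # Quadrature grid: Gauss-Legendre nodes in cos(zenith), uniform
    # azimuth spacing.
    nq = 2 * (order + 1)
    x, w = np.polynomial.legendre.leggauss(nq)
    qzen, w = np.arccos(x)[::-1], w[::-1]    # sort zenith ascending
    qaz = np.linspace(0, 2 * np.pi, 2 * nq, endpoint=False)
    g = spline(qzen, qaz)                    # (nq, 2*nq) resampled grid
    daz = 2 * np.pi / (2 * nq)
    coeffs = []
    for l in range(order + 1):
        for m in range(-l, l + 1):
            Y = sph_harm(m, l, qaz[None, :], qzen[:, None])
            coeffs.append(np.sum(w[:, None] * np.conj(Y) * g) * daz)
    return np.array(coeffs)                  # complex SH coefficients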

But it didn't. My heuristic, even utilizing exhaustive search at points, didn't come even close. It didn't even approach what Gerzon did analytically for 4.0 or 5.1.

So, any better ideas on how to interpolate and integrate, using ex ante knowledge, in order to go from arbitrary point clouds to regularized, isotropic, optimized ambisonic -> binaural mappings?
--
Sampo Syreeni, aka decoy - de...@iki.fi, http://decoy.iki.fi/front
+358-40-3751464, 025E D175 ABE5 027C 9494 EEB0 E090 8BA9 0509 85C2
