Re: [Sursound] A 7th-order array with 16 microphones
On 2021-12-02, Jens Ahrens wrote:

> It’s hard to tell how exactly the high orders contribute.

No, it is not. You can calculate, via ordinary linear field theory, exactly how anything contributes: from the field, to your ostensibly linear sensors, over the ostensibly rigid sphere upon which the sensors have been embedded. That's just math. Complicated field math, to be sure, but eminently doable, and deterministic to boot.

> One aspect is the interaural coherence that needs to be appropriate. The other main aspect is what I typically term the equalization: Below the aliasing frequency, things are fine anyway.

So why not give us the geometry of your ball-and-mic array? We don't need any derivative measurement, because given the primary measurement, we can calculate yours on our own.

> Above the aliasing frequency, the spectral balance of the binaural signals tends to be more even the higher the orders are that are present. The deviations from the ideal spectral balance also tend to be less strongly dependent on the incidence angle of the sound if higher orders are present.

This is already well known from the WFS work of those French and German friends/fiends of ours. That WFS lot. Only they mostly talk about things in rectangular coordinates, whereas us ambisonic fiends do the spherical kind. Going between those two coordinate systems isn't easy: the transformation spreads any excitation or normal wave *terribly* badly and unintuitively over the modes of the other representation.

> Much of the angle dependent deviations of the spectral balance can be mitigated, for example, by MagLS [...]

What is "MagLS"?

> [...] so that the perceptual difference between, say, 7th order and infinite order is small.

That has been done via 3rd order periphonic, with active decoding, already. It certainly needs fewer channels than straight 7th order pantophonic. So what are you doing here, really?

> I can’t tell if it gets any smaller with higher orders.
> My (informal) feeling is that somewhere between 5th and 10th order is where the perceptual difference to the ground truth saturates, both in terms of equalization and the coherence.

My hearing is that it in fact seems to cohere at about 3rd or 4th order, periphonically. That's about 16 independent channels over the whole sphere. Maybe with active, nonlinear, dynamic matrix processing, as in the case of DirAC. In the case of 7th order pantophonic processing, the independent channels would have to be 15. So rather close in DSP power. And yet at the same time, they couldn't come close to isotropy, as in the case of 3rd degree ambisonics. They couldn't come close to the kind of work needed for full 3D VR work, vis-à-vis holding a Ferris wheel or roller coaster ride perceptually constant over the whole ride. This system would alias, noticeably, unlike full, isotropic ambisonics.

--
Sampo Syreeni, aka decoy - de...@iki.fi, http://decoy.iki.fi/front
+358-40-3751464, 025E D175 ABE5 027C 9494 EEB0 E090 8BA9 0509 85C2
_______________________________________________
Sursound mailing list
Sursound@music.vt.edu
https://mail.music.vt.edu/mailman/listinfo/sursound - unsubscribe here, edit account or options, view archives and so on.
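[Editor's aside: the channel-count comparison above can be sanity-checked with the standard ambisonic component counts -- (N+1)^2 for a full-sphere (periphonic) set, 2N+1 for a horizontal-only (pantophonic) one. A minimal sketch, assuming those standard formulas:]

```python
# Ambisonic signal-set sizes, per the standard component counts:
#   periphonic (full sphere), order N:  (N + 1)**2 components
#   pantophonic (horizontal only), order N:  2*N + 1 components

def periphonic_channels(n: int) -> int:
    """Number of spherical-harmonic components up to order n."""
    return (n + 1) ** 2

def pantophonic_channels(n: int) -> int:
    """Number of circular-harmonic components up to order n."""
    return 2 * n + 1

# The comparison made in the post: 3rd order periphonic vs 7th order pantophonic.
print(periphonic_channels(3))   # 16
print(pantophonic_channels(7))  # 15
```

So 16 versus 15 channels: close in raw DSP load, as the post argues, even though the two sets cover very different solid angles.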
Re: [Sursound] A 7th-order array with 16 microphones
On 2021-12-07, Hannes Helmholz wrote:

> (Also: SMA here refers to spherical microphone array)

Thank you for the clarification. It's not self-evident that it is spherical, though, since it's really just circular, by said symmetry. As a wannabe-mathematician, I kinda worry about the precise topology and symmetry. Especially since it does go to my argument about how oblique modes in the acoustic field excite a discrete array of microphones.
Re: [Sursound] A 7th-order array with 16 microphones
On 2021-12-02, eric benjamin wrote:

> I believe that Nando may have been thinking about reproduction with loudspeaker arrays. He has a system with eight loudspeakers on the horizontal plane, as do I.

So good up to third order. What is interesting here, to me, is that sampling on the recording side, and reconstruction on the playback side by discrete speakers -- also an instance of sampling in space -- are not the same, and they deteriorate the reconstruction of the soundfield separately. Sampling in the recording array and sampling in the reconstruction array... I've never really seen them analyzed at the same time, in the same framework. It's always been so that we go to an intermediate domain, which is continuous, with a little bit of wobble angularly in noise or gain figures, and then back the same way. It's all well and good, if you can assume independence in all of the errors on the way. But then you can't: the above Swedish case which I've been arguing *certainly* doesn't admit such symmetry or independence assumptions. So the statistical assumptions which underlie e.g. Makita theory, and thence Gerzon's, don't go through.

In particular, since we're dealing with wave phenomena, there is interference to be contended with. That doesn't come through at *all* in statistical analysis, across 2D and 3D analyses; 3D coupling to a 2D sensor is *wildly* uneven, and if you have a box around the sensor, it can be shown that the sensor, coupled with its idealized surroundings, can exhibit resonant modes which run off to infinite degree, within an infinitesimally small angle. It will *always* be nasty at the edge.

> But I actually have 24 full-range loudspeakers available. Would it be advantageous to expand our systems to higher order?

When you have those, the next thing is, you need an anechoic chamber, and well-calibrated microphones. I mean, you have the machinery to launch physical signals, in 3D. Now you need measurement machinery to catch what you launched, and a silent space in between which doesn't perturb your signals. Is that not so? ;)
Re: [Sursound] A 7th-order array with 16 microphones
The audacity on this mailing list is incredible. I am not only referring to the last respondent. Questions and discussions could be nice and fruitful. But why not some humility and decency?!

On 2021-12-07, 21:19, "Sursound on behalf of Sampo Syreeni" wrote:

>> I don’t actually think that there are any special requirements.
>
> I think there are. And you know, I think you came to the right place: we might even be able to tell you where you're wrong, where you're right, and help you measure and quantify what your product is really about. Sursounders really like products of your kind to hit the market. They're just the *thingy*, in our beloved technology. It's just that we like to know what they're about, and how to make them the best they can be. 8)

Right, well, the shown work is not a product. Therefore, you're welcome to work with or on the method yourself. Recent publications on the work are out there. Also, code was made available to allow for direct reproducibility.

@Sampo: So actually, all the interesting questions you're demanding be answered -- you could investigate them yourself and report the results!

(Also: SMA here refers to spherical microphone array)

Kind regards,
/Hannes
Re: [Sursound] A 7th-order array with 16 microphones
On 2021-12-02, Fons Adriaensen wrote:

> If I’m not misreading, then the 7th order is available somewhere between 2 kHz and 3 kHz and higher. Aliasing kicks in at around 4 kHz-ish. So the question is if this small range (less than one octave) actually contributes anything useful.

1-2 (at most 3) kHz is the so-called phoneme range. In there, both spectral contour and synchronized neural firing of the auditory neurons (via subharmonics, and en masse, because the firing rate of any few neurons doesn't go above a kilohertz) help us to hear what kind of an implement or person a sound comes from. That particular range actually serves a known and useful function, even if it doesn't constitute *all* of it.

> My guess is that it is not more or less sensitive than SMAs.

Again, what is an SMA?
Re: [Sursound] A 7th-order array with 16 microphones
On 2021-12-01, Fernando Lopez-Lezcano wrote:

> Cool. The correctly recovered harmonics for 7th order span about 1 octave of useful range, if I understand correctly.

I'd argue that in order to have proper field reconstruction, you at least need to have aliasing artifacts below the noise floor of hearing; or, if you don't expect full reconstruction, then the noise needs to be well matched to the expected noise floor, and its joint coding. It needs to follow something like rate-distortion theory. Since that kind of theory comes from information theory, it expects to know all of the possible sources of information, from all around. So, if you know of some 3D information, it will have to be incorporated. In this case, to my mind, it hasn't been.

> Is it perceptually significant to have 7th order components?

I've heard up to third order, in a research setting, in an anechoic room, using dozens of speakers. So, full periphony. I've also been presented with pantophony in various configurations. (Ville Pulkki is the professor of acoustics and signal processing here; Eero Aro the hard-hitting practitioner, and avid ambisonic amateur, on the broadcasting side of things.)

That 7th order try at pantophonic ambisonics probably is nice, because even the third order is good. Even the third order leads to very good localisation over the circle of horizontal directions. Though at the same time, what you're doing here is seventh order analysis, oversampling, while not doing seventh order transmission: that would, even periphonically, lead to a lot more mics than you have. So somehow you're downsampling from what you have. And because you only sample spatially on the equator, that will lead to lots of mis-sampling of oblique wavefronts; say, reverb modes which go up and down. Even of those wavefronts which hit the near field of the mic slightly transversely, and excite ringing modes around the sphere transducer. Those cannot be controlled without transducers over the poles. Not even theoretically. Which is why ambisonics traditionally leads to Gaussian quadrature over the entire sphere: there, *anything* at all can be computationally controlled. At least in theory.

> Or, in other words, as you add spherical harmonics to your encoding process, how does the spatial perception change?

Exactly. And how does it work if the field exciting your mic contains, physically, components which aren't equatorially symmetric? They *are* going to be there, after all.

> Or from the other end, if you start with a 7th order recording and you start truncating the order to lower and lower values in the decoding process, how does the perception of the recording change? Is there a decrease in order for which you can say, "well, that one did not add much, did it?" Actually this reminds me of how Gerzon (perhaps Craven as well) optimized POA for 5.1 linear decoding.

Maybe that's what they do at seventh order now, because Gerzon did it for 5.1 already. That leads to rather an unsymmetrical decoding solution. Which would fit with how badly the above matched-symmetrical field behaved -- maybe they just don't understand how to do a dual decode, over all of the field, and over the frequencies?
Re: [Sursound] A 7th-order array with 16 microphones
On 2021-12-01, Marc Lavallée wrote:

> With a bit of algebra, f_a = c N / ( R 2 pi ). So a smaller radius for the sphere would improve f_a? Was 0.0875 m chosen in order to embed some hardware?

By the way, I think it would be nice to talk about the two different forms of spatial aliasing: that which manifests in linear coordinates, as in WFS, and that in spherical coordinates, as in HOA. Those two means of analyzing spatial aliasing are not at all the same, and cannot be neatly put into conjunction. If you try to do it, the necessary intermediary functions are rather special, and difficult to the hilt to master. You'll immediately run into something like Clebsch-Gordan coefficients, which ain't nice. Even proper mathematicians shy away from that stuff unless *absolutely* necessary. :/
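[Editor's aside: the rearranged formula above is easy to evaluate with the numbers quoted in the thread. A hedged sketch, assuming c = 343 m/s and the stated N = 7, R = 0.0875 m:]

```python
import math

# Spatial aliasing frequency of the array, rearranged from the quoted
# relation N = (2 * pi * f_a / c) * R  =>  f_a = c * N / (2 * pi * R).
c = 343.0    # speed of sound in m/s (assumed, room temperature)
N = 7        # array order, from the thread
R = 0.0875   # sphere radius in metres, from the thread

f_a = c * N / (2 * math.pi * R)
print(round(f_a))  # ~4367 Hz, consistent with the "around 4 kHz-ish" upthread

# And yes: halving the radius doubles f_a, answering the question above.
print(round(c * N / (2 * math.pi * (R / 2))))
```

Which also suggests why the radius was not made much smaller: shrinking the sphere raises the aliasing frequency, but leaves less room for hardware and less baffle for the low orders.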
Re: [Sursound] A 7th-order array with 16 microphones
On 2021-12-01, Jens Ahrens wrote:

> For this type of array, the spatial aliasing frequency f_a is dependent on order N and radius R of the array in the exact same manner as with spherical microphone arrays (SMAs): N = (2 pi f_a / c) R

But it is also dependent on the angle of incidence above the equatorial plane. In wideband theory, if a plane wave hits a ring of discrete sensors just right, obliquely, from the third dimension, there is hell to pay in aliasing. And of course there's the near, reactive field to consider, with your sort of hard-core sensor. Monopoles on top of a rigid sphere, right? The fields near a hard ball, and their equivalent far fields in free space, under Sommerfeld, are highly nontrivial, and they couple lateral to vertical field components. Such near fields can of course be symmetric over the equator, but only as long as the overall acoustic field is symmetric that way, too. In practice it never is. No source, or ambient reflector, like a room, ever is. No source really lies on the equatorial plane. And also, if I'm not thoroughly mistaken, the sampling over the sphere, and the sphere-induced near field, amplify the problems.

> 0th and 1st order are available for all frequencies. 2nd order approx. above 200 Hz, 3rd order approx. above 500 Hz, etc.

You mean the cutoff, right? Do you quantify the bands in rise above the equator, too?

> I cannot comment on calibration requirements because we did calibrate the array…

Against which precise standard? Over the whole of the sphere of directions?

> (Nor did we measure how well it was calibrated out-of-the-box.)

Which you should. :)

> I don’t actually think that there are any special requirements.

I think there are. And you know, I think you came to the right place: we might even be able to tell you where you're wrong, where you're right, and help you measure and quantify what your product is really about. Sursounders really like products of your kind to hit the market. They're just the *thingy*, in our beloved technology. It's just that we like to know what they're about, and how to make them the best they can be. 8)

> As before, much of the physical limitations are qualitatively (and also quantitatively) similar to SMAs.

Pray tell, what is an SMA?
Re: [Sursound] A 7th-order array with 16 microphones
On 2021-12-01, Jens Ahrens wrote:

> We would like to make you aware of the concept of equatorial microphone arrays, which use a spherical scattering body and microphones along the equator of that body. Here’s a 3-minute video of a binaural rendering of the signals from such an array: https://youtu.be/95qDd13pVVY

Impressive, but of course 7th order horizontal-to-binaural always is. The auditory parallax you demonstrate by speaking close to the array is especially convincing.

What isn't so convincing is the performance to the back. There's something wonky going on there, because the soundscape is easily heard to be pulling in. Are you sure your HRTFs are symmetrical, as befits a hard spherical array? Or did you instead start with something like KEMAR data, which would make the array inherently asymmetrical? I also ask this because where you do the dual test of turning the array and moving the source at the same time, I can *definitely* hear skew in the field. Maybe even a quadrupole moment. That might suggest your algorithms are turning the spherical harmonics the wrong way round; not dually as they should be, but something else.

Finally, I'd like to hear this done with extremely wideband and impulsive sources as well. A soft speaking voice ain't gonna cut it as a test. I'd like to hear balloons being popped, triangles being rung, small tambourines being pounded. And at the other end of the spectrum, the lowest notes of an organ, to measure how the sensor fares in reactive fields. In particular, what happens above the equatorial plane: does the sensor adequately and reliably mitigate the spatial aliasing which *necessarily* comes with off-plane waves, over the entire frequency range of it?

It *can* theoretically do something like that, even in the linear regime. That's just about joint spatial-temporal domain regularization. About taming the off-sightline aliasing modes, over time-frequency, enough to make the thing seem isotropic over a wide band. You blur out where there would be aliasing in direction; you don't blur where there's symmetry enough not to need it. Now, if you do something besides this, tell us what it is. There's all sorts of work to be done in the nonlinear, machine learning, whatnot, vein. "Superresolution" it's called, including all of the active decoders of the analogue era, like Dolby Surround's active matrix, and the lot. All of that could be done much better now, using AI methods and statistical learning, target tracking algorithms, and whatnot. If you're doing those, do tell, I'd be highly interested.

But right now, I don't really see what is so special about the array. It sounds like a conventional horizontal array on a rigid sphere, and then just conventional processing into binaural playback. That sort of thing has been done in the ambisonic ambit for decades. I've even heard, in free air, better simulacra of duality, with POA. So, why not publish your equations and methodology? Let's read back from there. :)

> Their main advantage over conventional spherical microphone arrays is the fact that they require only 2N+1 microphones for Nth spherical harmonic order (conventional arrays require (N+1)^2 microphones). The price to pay is the circumstance that the array does not capture the actual sound field but a horizontal projection of it.

In fact you don't capture a horizontal projection. The only way you could do that is with an infinite vertical stack of microphones on a stiff cylinder, sensitive to plane waves only. Here, the soundfield about the spherical sensor will very much be sensitive to sound from above -- as you showed in the video. That sound will be sensitive to directional aliasing. We just don't hear it, because you talk to the mic array in a muffled speaking tone. Wideband signals of the kind I suggested would spatially alias widely, leading to a *lot* of audible artifacts, direction reversals, and the like, when the source and/or the array is made to revolve.
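[Editor's aside: the microphone savings claimed in the quoted passage are easy to tabulate. A minimal sketch, using only the two counts quoted above (2N+1 equatorial, (N+1)^2 spherical):]

```python
# Microphone counts per order, per the quoted claim:
#   equatorial array:              2*N + 1 mics for order N
#   conventional spherical array: (N + 1)**2 mics for order N
for n in range(1, 8):
    equatorial = 2 * n + 1
    spherical = (n + 1) ** 2
    print(f"order {n}: equatorial {equatorial:2d} vs spherical {spherical:2d}")
# At 7th order: 15 vs 64 -- the 16 microphones in the subject line
# sit just above the 2N+1 minimum.
```

Which makes the trade-off in the quoted passage concrete: at 7th order the equatorial design needs roughly a quarter of the capsules, at the price of only ever resolving the horizontal harmonics.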
> This poses the question of what it may sound like if the array captures sound that originates from outside of the horizontal plane?!? The video is going to demonstrate this!

Now test it the way I wanted, and give us the results. It's going to fail, because it's not isotropic. ;)

Though, maybe it's not *supposed* to be isotropic. When you go about it that way, it's going to be a kind of pantophonic array. In that role it probably will perform well. It's just that you can't use that sort of array in anything but a free field. In confined spaces, sound does not propagate in two dimensions, but in three, with coupled modes between the 2D and 3D fields. If you try to capture those with any anisotropic probe, there will be interplay across dimensions. For instance, reverberation will be diminished because it spreads out into the third dimension, which isn't being captured, and if there's a standing oblique mode in