Re: [Sursound] the recent 2-channel 3D sound formats and their viability for actual 360 degree sound
On 2011-07-23, Robert Greene wrote:

> I feel a little diffident in commenting on this in the presence of so many experts on the Soundfield mike in theory as well as in practice, but unless I am misunderstanding how it works, there are VERY serious problems of other kinds with using it at the kinds of distances (fractions of a meter, less than 1/2 and often enough much less) where proximity effect becomes really major.

That is true as well. I seem to remember that the original impetus for the reactivity and sound intensity talks a few years back eventually proved to be not so much about sound fields in general, but about the poor response of the classical SoundField design to surrounding soundfields with higher order components. Rejection of directional aliasing from the higher order components of the local soundfield, that is. Which is precisely what progressively mounts up when you bring a small source up close, and/or have nearby reflective geometry (as Angelo did in his car work, back then).

It turned out that SoundFields have no problem with out-of-phase zeroth and first order components; in fact they respond remarkably accurately to them. The only explanation left standing was poor rejection of directional aliasing. Which in retrospect should be rather self-evident, shouldn't it? :)

> Namely, as I understand it, the way the B format signals are built is predicated upon the distances among the four capsules being quite small compared to the distance of the source, for the following reason:

That, but not only that. The capsules are also physically extensive, with an innate directivity of their own.

> Compensation is needed for the fact that the capsules are on the faces of a tetrahedron, not coincident and all at the center.

Yes, and this compensation is based on the capsules having the cardioid characteristic, plus being approximately coincident. Both only hold approximately, especially at the higher frequencies. Thus, the overall mic won't totally reject the higher harmonics. Again, we've gone over similar considerations with Filippo in the past, just considering spherical clouds of omnis. I think we even got into topology, with the difference between plain spherical clouds and clouds on top of an acoustically opaque sphere. ;)

> So it seems to me (and I am prepared to be all wrong!) that the Soundfield mike could not be expected to work at all well except when the source is quite far away--a matter of meters, not inches.

Long story short, I suspect very much the same, and I think there's already good empirical showing of the problems that follow, even on this very list.
--
Sampo Syreeni, aka decoy - de...@iki.fi, http://decoy.iki.fi/front
+358-50-5756111, 025E D175 ABE5 027C 9494 EEB0 E090 8BA9 0509 85C2
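For reference, the B-format construction being discussed is, in the idealized coincident, far-field case, just a fixed sum-and-difference matrix; all of the contentious parts (capsule spacing, capsule directivity, near sources) live in the corrections around it. A minimal sketch in Python, assuming the usual left-front-up / right-front-down / back-left-down / back-right-up capsule layout, the x = front, y = left, z = up axis convention, and an arbitrary 0.5 scaling (real units differ in scaling and add frequency-dependent spacing correction on top):

    import numpy as np

    def a_to_b(lfu, rfd, bld, bru):
        # Sum-and-difference matrixing of four cardioid capsule signals into
        # first-order B-format. Valid only for coincident capsules and far sources.
        w = 0.5 * (lfu + rfd + bld + bru)   # omni (pressure) component
        x = 0.5 * (lfu + rfd - bld - bru)   # front-back figure-of-eight
        y = 0.5 * (lfu - rfd + bld - bru)   # left-right figure-of-eight
        z = 0.5 * (lfu - rfd - bld + bru)   # up-down figure-of-eight
        return w, x, y, z

    # Idealized check: coincident cardioids, far-field plane wave from direction s.
    axes = np.array([[ 1,  1,  1],    # LFU capsule axis
                     [ 1, -1, -1],    # RFD
                     [-1,  1, -1],    # BLD
                     [-1, -1,  1]]) / np.sqrt(3.0)   # BRU
    s = np.array([0.6, 0.8, 0.0])                    # unit source direction
    caps = 0.5 * (1.0 + axes @ s)                    # ideal cardioid gains
    w, x, y, z = a_to_b(*caps)
    print(np.array([x, y, z]) / np.linalg.norm([x, y, z]))   # recovers ~[0.6, 0.8, 0.0]

Under those idealizations the arrival direction drops straight out of (X, Y, Z); the worries above are about what happens once the idealizations stop holding.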
Re: [Sursound] the recent 2-channel 3D sound formats and their viability for actual 360 degree sound
On 2011-07-23, dave.mal...@york.ac.uk wrote:

(For some reason Dave H's post didn't arrive at me as-is. Dave Malham appears quoted once and Dave Hunt twice below. Sorry about the hassle.)

> I have an interesting question (well, I think it's interesting). The Soundfield microphone, like any directional microphone, has a boosted bass response to close sounds. When listening to this through a speaker rig, we hear this boost and tend to interpret it as meaning the sound is close, especially in a dry acoustic with a Greene-Lee head brace etc., etc. However, surely (unless I am being more dense than usual tonight) this is a learnt response based on the behaviour we have heard from directional mics?

To a degree it is. I think the worst thing here is that we continuously close-mic acoustic sources, to "give them that close, intimate feel". That's originally about the proximity effect and about HF attenuation with distance, but since it's now part of a culturally shared idiom as well, we associate extra power with it as such, beyond the mere psychoacoustics. Electronic distortion with its synthetic overtones doesn't help here, either.

But still, at the very bottom, you'd have this very same reaction because of the pure, physical, acoustical proximity effect. E.g. I'm pretty sure someone like Filippo Fazi could explain, with a neat visualisation, how the boundary conditions represented by the human head turn this curved-wavefront reactivity into a noticeable bass boost at the tympanic membrane, given the shape of the human head, upper torso and pinnae. It really isn't just about the reproduction technology; it's about how soundfields work and how we naturally perceive them from within them.

> After all, taken individually, at those sort of frequencies our ears are essentially omnidirectional and not subject to bass boost (to anything like the same degree).

The NFC-HOA theory doesn't take into account how we hear things. It's a physical theory and not a psychoacoustic one. It just recreates soundfields, and lets us hear them as we happen to. If we don't hear those spatially differentiated low frequencies, so what? They were still recreated with perfect fidelity, and we heard them as such, if perhaps losing some of the impact in the process, given that we're no blue whales.

If you then look at the second NFC-HOA paper, they actually exploit this fact to arrive at a lower energy intake transmission pipe. They don't go into psychoacoustics just yet, but they do cut out the huge low frequency anti-phase signals which come from the naïve soundfield reconstruction math.

> You're right that POA assumes plane waves. The encoded signals are reproduced at the distance of the loudspeakers.

No. Those two are in direct contradiction. If a point source comes from the surface of the rig, it's going to be a spherical wave originating from that distance. If it's a plane wave instead, it's either just a plane wave, or a (very tight) approximation of an infinitely far-away point source (that also being a plane wave when you measure it locally).
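To put a rough number on that contradiction: a plane wave and a point source sitting at a typical rig radius behave measurably differently over the listening area, in level as well as in wavefront curvature. A small sketch; the 2 m rig radius, 250 Hz test frequency and 30 cm offset are illustrative figures only, not anything from this thread:

    import numpy as np

    c = 343.0                 # speed of sound, m/s
    f = 250.0                 # test frequency, Hz
    k = 2 * np.pi * f / c     # wavenumber
    R = 2.0                   # assumed rig radius, m

    def plane_wave(x):
        # Unit-amplitude plane wave travelling along +x.
        return np.exp(-1j * k * x)

    def point_source(x):
        # Monopole placed on the rig surface at x = -R, observed on the x axis.
        r = R + x
        return np.exp(-1j * k * r) / r

    for x in (0.0, 0.3):      # centre of the rig and 30 cm downstream
        print(x, abs(plane_wave(x)), abs(point_source(x)))
    # The plane wave keeps its amplitude over the 30 cm; the source at rig
    # distance drops from 0.5 to about 0.435, roughly 1.2 dB, and its wavefront
    # stays curved. "Reproduced at the distance of the loudspeakers" and
    # "plane wave" are therefore two different claims.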
> The shelf filters in a BLaH compliant decoder are (as I understand it) an attempt to compensate for the speakers' finite distance, and the fact that they don't produce plane waves at the listener. This is often referred to as 'distance compensation'.

Sorta. In the conventional ambisonic decoder there are two separate circuits. The first (set) is the shelf one, which is based on psychoacoustics alone. In a sense it tries to compensate for the very low, seventies bandwidth of just four channels for periphony. It does so by frequency-selectively varying the relative amplitude of the zeroth and first order components. At low frequencies it goes with a velocity, or systematic, decode, because there we seem to (or seemed to) hear interaural phase differences pretty well, but not the amplitude ones. Going higher up, the shelf filtering is optimized to reproduce power differences between the two ears, which seem(ed) to work pretty well, both a) for a well-centered single listener, and b) for purely mathematical, statistical reasons, for non-centered listeners as well. This first circuitry is never something you can switch off in a conventional ambisonic decoder, because it's about pure psychoacoustical optimization with regard to the pure, physical inbound signal that is being received.

The distance compensation circuit, on the other hand, can be switched on or off, because it has to do with the receiver-end rig geometry, which is then naturally variable by design. In reality it too ought to be a continuous knob, just as the aspect ratio knob in the four speaker decoders is. But then this would lead to a total bitch of a filter circuit in the analog domain, for little gain in the usual domestic situation which Ambisonics was originally aimed at. Thus, the classical decoders simply give you the choice of no distance compensation, or a switched [...]
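As a rough illustration of what that first, purely psychoacoustic circuit is aiming at, here are the usual textbook gain targets for first order periphony: a velocity (rV = 1) decode in the low band and a max-rE decode in the high band, the latter scaling the first order components by 1/sqrt(3) relative to W. The crossover frequency (a few hundred Hz) and the overall normalisation are implementation choices rather than anything fixed by the theory, so treat the sketch as indicative only:

    import math

    def order_gains(band):
        # Relative gains (g0, g1) for W and (X, Y, Z) in the given band.
        if band == "low":                      # velocity / systematic decode
            return 1.0, 1.0
        if band == "high":                     # max-rE decode, 3-D first order
            return 1.0, 1.0 / math.sqrt(3.0)
        raise ValueError(band)

    def shelve(w, x, y, z, band):
        # Apply the band's gains to one B-format frame before the matrix decode.
        # A real decoder realises this as phase-matched shelf filters, not as a
        # hard band split.
        g0, g1 = order_gains(band)
        return g0 * w, g1 * x, g1 * y, g1 * z

    print(order_gains("high"))                 # (1.0, 0.577...)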
Re: [Sursound] the recent 2-channel 3D sound formats and their viability for actual 360 degree sound
I feel a little diffident in commenting on this in the presence of so many experts on the Soundfield mike in theory as well as in practice, but unless I am misunderstanding how it works, there are VERY serious problems of other kinds with using it at the kinds of distances (fractions of a meter, less than 1/2 and often enough much less) where proximity effect becomes really major.

Namely, as I understand it, the way the B format signals are built is predicated upon the distances among the four capsules being quite small compared to the distance of the source, for the following reason: Compensation is needed for the fact that the capsules are on the faces of a tetrahedron, not coincident and all at the center. This compensation is based on the fact that at reasonable distances to the source, the differences of the distances to the mikes are obtained by orthogonal projection onto the axis of arrival of the sound (to a very good approximation).

To make sense of this jargon, suppose a source is on the line that is equidistant from three of the capsules. Then its distance to those three will always be the same, and if the source is reasonably far away the difference in distance to the fourth capsule will be essentially a constant. This comes from the Pythagorean theorem limit case, in effect: at large distances, the difference between A to S and B to S is equal to the length of the projection of the line from A to B onto the line from A to S (or B to S, these being parallel in the limit case).

If one does NOT have such a large distance to the source, the variation of distances to the capsules will be extreme and also complicated. Just think of how the distances to the four face centers of the tetrahedron will vary in odd ways when the source is close by!

So it seems to me (and I am prepared to be all wrong!) that the Soundfield mike could not be expected to work at all well except when the source is quite far away--a matter of meters, not inches. At close distances, there will be wild phase differentials among the four mike capsule outputs of a kind that depends on the distance of the source from the center of the mike--something which the mike does not "know", so that it cannot be compensated for.

Am I all wet here?
Robert

On Sat, 23 Jul 2011, dave.mal...@york.ac.uk wrote:

> Hi Folks,
>
> I have an interesting question (well, I think it's interesting). The Soundfield microphone, like any directional microphone, has a boosted bass response to close sounds. When listening to this through a speaker rig, we hear this boost and tend to interpret it as meaning the sound is close, especially in a dry acoustic with a Greene-Lee head brace etc., etc. However, surely (unless I am being more dense than usual tonight) this is a learnt response based on the behaviour we have heard from directional mics? After all, taken individually, at those sort of frequencies our ears are essentially omnidirectional and not subject to bass boost (to anything like the same degree).
>
> Any thoughts, anyone?
>
> Dave
>
> On Jul 23 2011, Dave Hunt wrote:
>
>> Hi again,
>>
>>> Date: Thu, 21 Jul 2011 21:01:41 +0300 (EEST)
>>> From: Sampo Syreeni
>>>
>>> On 2011-07-21, Dave Hunt wrote:
>>>
>>>> There is certainly no consideration of values outside the unit sphere. [...]
>>>
>>> Correct, and we've been here before.
>>
>> We certainly have.
>>
>>> As BLaH points out, even the first order decoder handles distance as well as it possibly can. So does the SoundField mic on the encoding side. The encoding and decoding are well matched.
>>
>> In some ways hardly surprising.
>>> But the classical synthetic encoding equation is for infinitely far away sources only, that is, plane waves. Running the result through a proper, BLaH compliant decoder then reconstructs a simulacrum of such a plane wave, with first order directional blurring, spatial aliasing caused by the discrete rig, and the purposely imposed psychoacoustic optimizations overlaid on top of the original, extended soundfield. So in fact it's wrong to say that the source is produced at the distance of the rig: instead it's produced infinitely far away, modulo the above three complications. (That is bound to be one part of why even synthetically panned sources localise so nicely even when listening from outside the rig.)
>>
>> I have already admitted the error of my original statement. You're right that POA assumes plane waves. The encoded signals are reproduced at the distance of the loudspeakers. The shelf filters in a BLaH compliant decoder are (as I understand it) an attempt to compensate for the speakers' finite distance, and the fact that they don't produce plane waves at the listener. This is often referred to as 'distance compensation'.
>>
>>> If you want to synthetically encode a near-field source so to speak "by the book", you'll have to lift the source term from Daniel, Nicol and Moreau's NFC work. I seem to remember it amounts to a first order filter on the first order part of the source signal in the continuous domain, which you'll then have to discretize. [...]
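Robert's projection argument is easy to check numerically: the far-field approximation of the inter-capsule path differences is excellent at a few metres and falls apart at centimetre source distances. A quick sketch; the ~1.5 cm capsule radius and the test direction are assumed figures for illustration, not the actual SoundField geometry:

    import numpy as np

    r_caps = 0.015                                  # assumed capsule radius, m
    verts = np.array([[ 1,  1,  1], [ 1, -1, -1],
                      [-1,  1, -1], [-1, -1,  1]], float)
    verts *= r_caps / np.sqrt(3.0)                  # capsule positions on a tetrahedron

    direction = np.array([0.3, 0.9, 0.3])           # arbitrary arrival direction
    direction /= np.linalg.norm(direction)

    for dist in (10.0, 1.0, 0.2, 0.05):
        src = dist * direction
        exact = np.linalg.norm(src - verts, axis=1) - dist  # true extra path per capsule
        farfield = -verts @ direction                       # projection approximation
        err_mm = 1e3 * np.max(np.abs(exact - farfield))
        print(f"{dist:5.2f} m source: worst projection error {err_mm:.3f} mm")

At 10 m the projection is good to around a hundredth of a millimetre; by 5 cm the error is on the order of a millimetre, no longer negligible against the capsule spacing or a high-frequency wavelength, and that is before the level differences and the capsules' own directivity are counted.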
Re: [Sursound] the recent 2-channel 3D sound formats and their viability for actual 360 degree sound
Hi Folks,

I have an interesting question (well, I think it's interesting). The Soundfield microphone, like any directional microphone, has a boosted bass response to close sounds. When listening to this through a speaker rig, we hear this boost and tend to interpret it as meaning the sound is close, especially in a dry acoustic with a Greene-Lee head brace etc., etc. However, surely (unless I am being more dense than usual tonight) this is a learnt response based on the behaviour we have heard from directional mics? After all, taken individually, at those sort of frequencies our ears are essentially omnidirectional and not subject to bass boost (to anything like the same degree).

Any thoughts, anyone?

Dave

On Jul 23 2011, Dave Hunt wrote:

> Hi again,
>
>> Date: Thu, 21 Jul 2011 21:01:41 +0300 (EEST)
>> From: Sampo Syreeni
>>
>> On 2011-07-21, Dave Hunt wrote:
>>
>>> There is certainly no consideration of values outside the unit sphere. [...]
>>
>> Correct, and we've been here before.
>
> We certainly have.
>
>> As BLaH points out, even the first order decoder handles distance as well as it possibly can. So does the SoundField mic on the encoding side. The encoding and decoding are well matched.
>
> In some ways hardly surprising.
>
>> But the classical synthetic encoding equation is for infinitely far away sources only, that is, plane waves. Running the result through a proper, BLaH compliant decoder then reconstructs a simulacrum of such a plane wave, with first order directional blurring, spatial aliasing caused by the discrete rig, and the purposely imposed psychoacoustic optimizations overlaid on top of the original, extended soundfield. So in fact it's wrong to say that the source is produced at the distance of the rig: instead it's produced infinitely far away, modulo the above three complications. (That is bound to be one part of why even synthetically panned sources localise so nicely even when listening from outside the rig.)
>
> I have already admitted the error of my original statement. You're right that POA assumes plane waves. The encoded signals are reproduced at the distance of the loudspeakers. The shelf filters in a BLaH compliant decoder are (as I understand it) an attempt to compensate for the speakers' finite distance, and the fact that they don't produce plane waves at the listener. This is often referred to as 'distance compensation'.
>
>> If you want to synthetically encode a near-field source so to speak "by the book", you'll have to lift the source term from Daniel, Nicol and Moreau's NFC work. I seem to remember it amounts to a first order filter on the first order part of the source signal in the continuous domain, which you'll then have to discretize. (But don't take my word for it, it's been a while since I went through DN&R.)
>
> Me too, but as I remember it tries to build the 'distance compensation' into the encoding, and thus is dependent on the distance of the loudspeakers. Thus the encoding is only suitable for an identical or similar rig, and is not transferable to other rigs. Amplitude/delay based systems such as WFS, Delta Stereophony and TiMax have similar problems. The encoding has to be matched to the speaker rig.
>
>> Simply manipulating the relative amplitude or even the spectral contour doesn't in theory cut it, though it's a cheap way to get some of the psychoacoustic effects of a nearby source.
>
> Agreed that it is far from perfect, but this is obviously not a trivial problem.
> What I'm suggesting is a fudge, though it can produce simulations of sources both inside and outside the loudspeaker radius which can be psychoacoustically effective, and are transferable to different rigs. We're still left with the "40 foot high geese" problem.
>
>> The only minor nit is that synthetic panning needs a bit more refinement for near sources that wasn't being handled by the older literature.
>
> The "(potentially nasty) bass boost" you refer to is obviously a problem. You could limit it from going extremely large at very small distances, and ensure that the output only went to 0dBFS maximum, but this would require a huge dynamic range throughout the whole system: large bit depth, good DACs, very quiet amplifiers etc.
>
> If you could do the encoding assuming a given speaker distance, then modify the decoding for a different distance it might help, though I've no idea how to do this.
>
> Ciao,
>
> Dave Hunt
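For what it's worth, the near-field term being discussed can at least be sketched numerically. The filter form below is a from-memory reconstruction of the first order case in Daniel's NFC work (encode a source at distance rho, referenced to loudspeakers at distance R), not a quotation of it, so check it against Daniel, Nicol and Moreau before relying on it; the 0.5 m source and 2 m rig figures are only for illustration:

    import numpy as np

    c = 343.0                        # speed of sound, m/s

    def nfc_gain(f, rho, R):
        # Magnitude of the assumed first order near-field filter
        # H(jw) = (1 + c/(jw*rho)) / (1 + c/(jw*R)) at frequency f (Hz).
        jw = 1j * 2 * np.pi * f
        return abs((1 + c / (jw * rho)) / (1 + c / (jw * R)))

    # A source encoded at 0.5 m for an assumed 2 m rig: the low-frequency boost
    # asked of the first order channels is substantial, and it only cancels
    # correctly on a rig at (or near) that assumed 2 m radius -- which is
    # exactly the rig-dependence complained about above.
    for f in (20, 50, 100, 200, 500, 1000, 5000):
        print(f"{f:5d} Hz  gain {20 * np.log10(nfc_gain(f, 0.5, 2.0)):6.2f} dB")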
Re: [Sursound] the recent 2-channel 3D sound formats and their viability for actual 360 degree sound
Hi again,

> Date: Thu, 21 Jul 2011 21:01:41 +0300 (EEST)
> From: Sampo Syreeni
>
> On 2011-07-21, Dave Hunt wrote:
>
>> There is certainly no consideration of values outside the unit sphere. [...]
>
> Correct, and we've been here before.

We certainly have.

> As BLaH points out, even the first order decoder handles distance as well as it possibly can. So does the SoundField mic on the encoding side. The encoding and decoding are well matched.

In some ways hardly surprising.

> But the classical synthetic encoding equation is for infinitely far away sources only, that is, plane waves. Running the result through a proper, BLaH compliant decoder then reconstructs a simulacrum of such a plane wave, with first order directional blurring, spatial aliasing caused by the discrete rig, and the purposely imposed psychoacoustic optimizations overlaid on top of the original, extended soundfield. So in fact it's wrong to say that the source is produced at the distance of the rig: instead it's produced infinitely far away, modulo the above three complications. (That is bound to be one part of why even synthetically panned sources localise so nicely even when listening from outside the rig.)

I have already admitted the error of my original statement. You're right that POA assumes plane waves. The encoded signals are reproduced at the distance of the loudspeakers. The shelf filters in a BLaH compliant decoder are (as I understand it) an attempt to compensate for the speakers' finite distance, and the fact that they don't produce plane waves at the listener. This is often referred to as 'distance compensation'.

> If you want to synthetically encode a near-field source so to speak "by the book", you'll have to lift the source term from Daniel, Nicol and Moreau's NFC work. I seem to remember it amounts to a first order filter on the first order part of the source signal in the continuous domain, which you'll then have to discretize. (But don't take my word for it, it's been a while since I went through DN&R.)

Me too, but as I remember it tries to build the 'distance compensation' into the encoding, and thus is dependent on the distance of the loudspeakers. Thus the encoding is only suitable for an identical or similar rig, and is not transferable to other rigs. Amplitude/delay based systems such as WFS, Delta Stereophony and TiMax have similar problems. The encoding has to be matched to the speaker rig.

> Simply manipulating the relative amplitude or even the spectral contour doesn't in theory cut it, though it's a cheap way to get some of the psychoacoustic effects of a nearby source.

Agreed that it is far from perfect, but this is obviously not a trivial problem. What I'm suggesting is a fudge, though it can produce simulations of sources both inside and outside the loudspeaker radius which can be psychoacoustically effective, and are transferable to different rigs. We're still left with the "40 foot high geese" problem.

> The only minor nit is that synthetic panning needs a bit more refinement for near sources that wasn't being handled by the older literature.

The "(potentially nasty) bass boost" you refer to is obviously a problem. You could limit it from going extremely large at very small distances, and ensure that the output only went to 0dBFS maximum, but this would require a huge dynamic range throughout the whole system: large bit depth, good DACs, very quiet amplifiers etc.
If you could do the encoding assuming a given speaker distance, then modify the decoding for a different distance it might help, though I've no idea how to do this.

Ciao,

Dave Hunt
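Two pieces of that fudge can at least be sketched, purely as a guess at one way of doing it rather than as anything from this thread or from a published decoder: clamp the distance used for the near-field gain so that the boost and the headroom demand stay finite, and re-reference material encoded for one assumed speaker distance to another rig with a first order correction of the same general shape as the NFC filters. All constants below are illustrative:

    import numpy as np

    c = 343.0                                      # speed of sound, m/s

    def clamped_distance_gain(r, r_min=0.25, r_ref=2.0):
        # Plain 1/r level panning, clamped so that very close sources cannot
        # demand unbounded gain (and hence unbounded dynamic range downstream).
        return r_ref / max(r, r_min)

    def rereference(f, R_enc, R_dec):
        # Magnitude of a hypothetical first order fix-up from an encoding that
        # assumed speakers at R_enc to an actual rig at R_dec.
        jw = 1j * 2 * np.pi * f
        return abs((1 + c / (jw * R_enc)) / (1 + c / (jw * R_dec)))

    print(clamped_distance_gain(0.05))             # 8.0 rather than 40, thanks to the clamp
    print(round(rereference(50.0, 2.0, 5.0), 3))   # mild LF correction from a 2 m to a 5 m rig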