Re: [Sursound] the recent 2-channel 3D sound formats and their viability for actual 360 degree sound

2011-07-23 Thread Sampo Syreeni

On 2011-07-23, Robert Greene wrote:

I feel a little diffident in commenting on this in the presence of so 
many experts on the Soundfield mike in theory as well as in practice, 
but unless I am misunderstanding how it works, there are VERY serious 
problems of other kinds with using it at the kinds of distances 
(fractions of a meter, less than 1/2 m and often enough much less) where 
proximity effect becomes really major.


That is true as well. I seem to remember the original impetus for the 
reactivity and sound intensity talks a few years back eventually proved 
to be not so much about sound fields per se, but about the poor response 
of the classical SoundField design to surrounding soundfields with 
higher order components. Rejection of directional aliasing from the 
higher order components of the local soundfield, that is. Which is 
precisely what progressively mounts up when you bring a small source up 
close, and/or have a close-by reflective geometry (such as Angelo did in 
his car work, then).


It proved that SoundFields have no problem with out-of-phase zeroth and 
first order components, but in fact responded remarkably accurately to 
them; the only explanation left standing was poor rejection of 
directional aliasing. Which in retrospect should be rather 
self-evident, shouldn't it? :)


Namely, as I understand it, the way the B format signals are built is 
predicated upon the distances among the four capsules being quite 
small compared to the distance of the source, for the following 
reason:


That, but not only that. They're also physically extensive, with an 
innate directivity of their own.


Compensation is needed for the fact that the capsules are on the faces 
of a tetrahedron, not coincident and all at the center.


Yes, and this is based on them having the cardioid characteristic, plus 
being approximately coincident. That is only an approximation in total, 
especially at the higher frequencies. Thus, the overall mic won't 
totally reject the higher-order spherical harmonics. Again, we've gone over similar
considerations with Filippo in the past, just considering spherical 
clouds of omnis. I think we even got into topology, with the difference 
between just spherical clouds, and clouds on top of an acoustically 
opaque sphere. ;)
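For concreteness, the idealized A-to-B conversion, assuming perfectly coincident capsules (which is exactly the approximation under discussion), is just a 4x4 sum/difference matrix; real SoundField processors add filters on top to patch up the finite spacing. A sketch in Python, with the usual FLU/FRD/BLD/BRU capsule labelling assumed:

```python
import numpy as np

# Capsule order assumed: FLU, FRD, BLD, BRU
# (front-left-up, front-right-down, back-left-down, back-right-up).
A_TO_B = 0.5 * np.array([
    [1,  1,  1,  1],   # W: omnidirectional (pressure) component
    [1,  1, -1, -1],   # X: front-back figure-of-eight
    [1, -1,  1, -1],   # Y: left-right figure-of-eight
    [1, -1, -1,  1],   # Z: up-down figure-of-eight
], dtype=float)

def a_to_b(a_format):
    """Convert a (4, n) block of A-format capsule samples to B-format
    (W, X, Y, Z), in the ideal coincident limit."""
    return A_TO_B @ a_format
```

The departures from this ideal at high frequencies and small source distances are precisely what the thread is about.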


So it seems to me(and I am prepared to be all wrong!) that the 
Soundfield mike could not be expected to work at all well except when 
the source is quite far away--a matter of meters, not inches.


Long story short, I suspect very much the same, and I think there's 
already good empirical showing for the problems that follow, even on 
this very list.

--
Sampo Syreeni, aka decoy - de...@iki.fi, http://decoy.iki.fi/front
+358-50-5756111, 025E D175 ABE5 027C 9494 EEB0 E090 8BA9 0509 85C2
___
Sursound mailing list
Sursound@music.vt.edu
https://mail.music.vt.edu/mailman/listinfo/sursound


Re: [Sursound] the recent 2-channel 3D sound formats and their viability for actual 360 degree sound

2011-07-23 Thread Sampo Syreeni

On 2011-07-23, dave.mal...@york.ac.uk wrote:

(For some reason Dave H's post didn't arrive at me as-is. Dave Malham 
appears quoted once and Dave Hunt twice, below. Sorry about the 
hassle.)


I have an interesting question (well, I think it's interesting). The 
Soundfield microphone, like any directional microphone, has a boosted 
bass response to close sounds. When listening to this through a 
speaker rig, we hear this boost and tend to interpret it as meaning 
the sound is close, especially in a dry acoustic with a Greene-Lee head 
brace, etc. However, surely (unless I am being more dense than
usual tonight) this is a learnt response based on the behaviour we 
have heard from directional mics?


To a degree it is. I think the worst thing here is that we continuously 
close-mic acoustic sources, to "give them that close, intimate feel". 
That's originally about the proximity effect and about HF attenuation 
with distance, but since it's now a part of a culturally shared idiom as 
well, we associate extra power with it as such, beyond the mere 
psychoacoustics. Electronic distortion with its synthetic overtones 
doesn't help here, either.


But still, at the very bottom, you'd have this very same reaction 
because of the pure, physical acoustical proximity effect. E.g. I'm 
pretty sure someone like Filippo Fazi could explain with a neat 
visualisation how the boundary conditions represented by the human head 
turn this curved wavefront reactivity into a noticeable bass-boost at 
the tympanic membrane, given the shape of the human head, upper torso 
and pinnae. It really isn't just about the reproduction technology; it's 
about how soundfields work and how we naturally perceive them from 
within them.
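The magnitude of that physical proximity effect is easy to put numbers on. For a point source, the pressure-gradient (figure-of-eight) response to a spherical wave is boosted by sqrt(1 + 1/(kr)^2) relative to the plane-wave case; this is the textbook first-order proximity effect. A quick sketch, with frequencies and distances chosen purely for illustration:

```python
import numpy as np

C = 343.0  # speed of sound, m/s

def gradient_boost_db(f, r):
    """Level of a spherical wave's pressure-gradient component relative
    to the plane-wave case, sqrt(1 + 1/(k*r)**2), in dB.  A figure-of-
    eight capsule (and hence the X/Y/Z components of B-format) sees this
    boost, while the omni W component does not."""
    kr = 2 * np.pi * f * r / C
    return 20 * np.log10(np.sqrt(1 + 1 / kr**2))

# Close-miked source 10 cm away: a massive lift at 50 Hz,
# versus a source at 2 m, where the effect is already mild.
close = gradient_boost_db(50.0, 0.10)   # roughly 21 dB
far   = gradient_boost_db(50.0, 2.00)   # roughly 1 dB
```

The steep growth of the boost as r shrinks is why close-miking reads so unambiguously as "intimate".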


After all, taken individually, at those sort of frequencies our ears 
are essentially omnidirectional and not subject to bass boost (to 
anything like the same degree).


The NFC-HOA theory doesn't take into account how we hear things. It's a 
physical theory and not a psychoacoustic one. It just recreates 
soundfields, and lets us hear them as we happen to do. If we don't hear 
those spatially differentiated low frequencies, so what? They were still 
recreated with perfect fidelity, and we heard them as such, if perhaps 
losing some of the impact in the process, given that we're no blue 
whales.


If you then look at the second NFC-HOA paper, they actually exploit this 
fact to arrive at a lower energy intake transmission pipe. They don't go 
into psychoacoustics just yet, but they do cut out the huge low 
frequency anti-phase signals which come from the naïve soundfield 
reconstruction math.


You're right that POA assumes plane waves. The encoded signals are 
reproduced at the distance of the loudspeakers.


No. Those two are in direct contradiction. If a point source comes from 
the surface of the rig, it's going to be a spherical wave originating 
from that distance. If it's a plane wave instead, it's either just a 
plane wave, or a (very tight) approximation of an infinitely far-away 
point source (that also being a plane wave when you measure it locally).


The shelf filters in a BLaH compliant decoder are (as I understand 
it) an attempt to compensate for the speakers' finite distance, and 
for the fact that they don't produce plane waves at the listener. This is often 
referred to as 'distance compensation'.


Sorta. In the conventional ambisonic decoder there are two separate 
circuits. The first (set) is the shelf one, which is based on 
psychoacoustics alone. In a sense it tries to compensate for the very 
low, seventies bandwidth of just four channels for periphony. It does so 
by frequency selectively varying the relative amplitude of the zeroth 
and first components. At low frequencies it goes with a velocity, or 
systematic decode, because there we seem to (or seemed to) hear 
interaural phase differences pretty well, but not the amplitude ones. 
Going higher up, the shelf filtering is optimized to reproduce power 
differences between the two ears, which seem(ed) to work pretty well 
both a) for a well-centered single listener, and b) for purely 
mathematical, statistical reasons, for non-centered listeners as well.
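The effect of those shelves can be illustrated numerically. With a first-order, mode-matched decode to (say) a cube of eight speakers, weighting the first-order components down from 1 to 1/sqrt(3) moves the magnitude of the Gerzon energy vector rE from 0.5 up to its first-order 3D maximum of 1/sqrt(3), at the cost of the low-frequency velocity match. A sketch; the cube rig and the decode convention are my assumptions, not something from the thread:

```python
import numpy as np

# Cube rig: 8 loudspeakers at the vertices of a cube, as unit vectors.
spk = np.array([[x, y, z] for x in (-1, 1) for y in (-1, 1)
                for z in (-1, 1)], dtype=float) / np.sqrt(3)

def energy_vector(a0, a1, src=np.array([0.0, 0.0, 1.0])):
    """Magnitude of the Gerzon energy vector rE for a first-order,
    mode-matched decode with per-order weights a0 (W) and a1 (X/Y/Z)."""
    g = a0 + 3 * a1 * (spk @ src)       # speaker gains for a panned source
    e = g**2
    return np.linalg.norm((e[:, None] * spk).sum(axis=0)) / e.sum()

basic  = energy_vector(1.0, 1.0)              # velocity weighting: rE = 0.5
max_re = energy_vector(1.0, 1.0 / np.sqrt(3)) # max-rE weighting: ~0.577
```

The shelf filter simply cross-fades between these two weightings across frequency.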


This first circuitry is never something you can switch off in a 
conventional ambisonic decoder, because it's about pure psychoacoustical 
optimization, with regard to the pure, physical bound signal that is 
being received.


The distance compensation circuit on the other hand can be switched on or 
off, because it has to do with the receiver end rig geometry, which is 
then naturally variable by design. In reality it too ought to be a 
continuous knob, just as the aspect ratio knob in the four speaker 
decoders is. But then this would lead to a total bitch of a filter 
circuit in the analog domain, for little gain in the usual domestic 
situation which ambisonic was originally aimed at. Thus, the classical 
decoders simply give you the choice of no distance compensation, or a 
swi

Re: [Sursound] the recent 2-channel 3D sound formats and their viability for actual 360 degree sound

2011-07-23 Thread Robert Greene


I feel a little diffident in commenting on this in the presence of so many 
experts on the Soundfield mike in theory as well as in practice,
but unless I am misunderstanding how it works, there are VERY serious 
problems of other kinds with using it at the kinds of distances (fractions 
of a meter, less than 1/2 m and often enough much less) where proximity

effect becomes really major.

Namely, as I understand it, the way the B format signals are built is 
predicated upon the distances among the four capsules being quite small

compared to the distance of the source, for the following reason:
Compensation is needed for the fact that the capsules are on the faces of 
a tetrahedron, not coincident and all at the center. This compensation
is based on the fact that at reasonable distances to the source, the 
differences of the distances to the mikes are obtained by orthogonal 
projection on the axis of arrival of the sound (to a very good 
approximation).


To make sense of this jargon, suppose a source is on the line that is 
equidistant from three of the capsules. Then its distance to those three 
will always be the same, and if the source is reasonably far away the 
distance to the fourth capsule will differ from them by a constant. This 
comes from the Pythagorean theorem limit case in effect: at large 
distances, the difference between A to S and B to S is equal to the 
length of the projection of the line from A to B onto the line from A to 
S (or B to S, these being parallel in the limit case).


If one does NOT have such large distance to the source, the variation of 
distances to the capsules will be extreme and also complicated.
Just think of how the distances to the four face centers of the 
tetrahedron will vary in odd ways when the source is close by!
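Robert's projection argument is easy to check numerically: for two capsules a small distance apart, the exact path difference to a source converges on d*cos(theta), the orthogonal projection, as the source recedes, and visibly departs from it close in. A quick sketch; the 2 cm spacing is an illustrative figure, not the actual SoundField geometry:

```python
import numpy as np

d = 0.02  # capsule spacing in meters, an illustrative figure

def path_difference(r, theta):
    """Exact |S-A| - |S-B| for capsules at +-d/2 on the x-axis and a
    source S at distance r, at angle theta from that axis."""
    s = r * np.array([np.cos(theta), np.sin(theta)])
    a = np.array([-d / 2, 0.0])
    b = np.array([+d / 2, 0.0])
    return np.linalg.norm(s - a) - np.linalg.norm(s - b)

# Far-field limit: the difference tends to the projection d*cos(theta).
projection = d * np.cos(np.pi / 3)          # 0.01 m
far  = path_difference(10.0, np.pi / 3)     # essentially the projection
near = path_difference(0.05, np.pi / 3)     # measurably off it
```

The near-field error is exactly the source-distance-dependent term the mic "does not know" and hence cannot compensate.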


So it seems to me(and I am prepared to be all wrong!) that
the Soundfield mike could not be expected to work at all well
except when the source is quite far away--a matter of meters, not
inches.  At close distances, there will be wild phase differentials among 
the four mike capsule outputs of a kind that depends on the distance

of the source from the center of the mike--something which the mike
does not "know" so that it cannot be compensated for.

Am I all wet here?

Robert

On Sat, 23 Jul 2011, dave.mal...@york.ac.uk wrote:


Hi Folks,
I have an interesting question (well, I think it's interesting). The 
Soundfield microphone, like any directional microphone, has a boosted bass 
response to close sounds. When listening to this through a speaker rig, we 
hear this boost and tend to interpret it as meaning the sound is close, 
especially in a dry acoustic with a Greene-Lee head brace, etc. 
However, surely (unless I am being more dense than usual tonight) this is a 
learnt response based on the behaviour we have heard from directional mics? 
After all, taken individually, at those sort of frequencies our ears are 
essentially omnidirectional and not subject to bass boost (to anything like 
the same degree).


Any thoughts, anyone?

 Dave
On Jul 23 2011, Dave Hunt wrote:


Hi again,


Date: Thu, 21 Jul 2011 21:01:41 +0300 (EEST)
From: Sampo Syreeni 

On 2011-07-21, Dave Hunt wrote:


There is certainly no consideration of values outside the unit  sphere.
[...]


Correct, and we've been here before.


We certainly have.


As BLaH points out, even the first
order decoder handles distance as well as it possibly can. So does the
SoundField mic on the encoding side.


The encoding and decoding are well matched. In some ways hardly 
surprising.



But the classical synthetic
encoding equation is for infinitely far away sources only, that is,
plane waves. Running the result through a proper, BLaH compliant  decoder
then reconstructs a simulacrum of such a plane wave, with first order
directional blurring, spatial aliasing caused by the discrete rig, and
the purposely imposed psychoacoustic optimizations overlaid on top of
the original, extended soundfield. So in fact it's wrong to say  that the
source is produced at the distance of the rig: instead it's produced
infinitely far away, modulo the above three complications. (That is
bound to be one part of why even synthetically panned sources localise
so nicely even when listening from outside the rig.)


I have already admitted the error of my original statement. You're  right 
that POA assumes plane waves. The encoded signals are  reproduced at the 
distance of the loudspeakers. The shelf filters in  a BLaH compliant 
decoder are (as I understand it) an attempt to  compensate for the speakers 
finite distance, and that they don't  produce plane waves at the listener. 
This is often referred to as  'distance compensation'.



If you want to synthetically encode a near-field source so to speak  "by
the book", you'll have to lift the source term from Daniel, Nicol and
Moreau's NFC work. I seem to remember it amounts to a first order  filter
on the first order part of the source signal in the continuous domain,
which you'll then have to discretize. (But don't take my word for it, 
it's been a while since I went through DN&R.)
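From memory, the order-1 near-field term has the form F(w, r) = 1 + c/(jwr), applied as a ratio between the encoded source distance and the loudspeaker distance being compensated for. A sketch of its magnitude response follows; this is my reconstruction of the first-order case, so check it against Daniel, Nicol and Moreau before trusting it:

```python
import numpy as np

C = 343.0  # speed of sound, m/s

def nfc_order1_mag(f, r_src, r_spk):
    """Magnitude of an order-1 near-field coding filter of the form
    H(w) = F(w, r_src) / F(w, r_spk), with F(w, r) = 1 + c/(j*w*r):
    encoding a source at r_src while compensating loudspeakers at
    r_spk.  (A sketch from memory, not a transcription of the paper.)"""
    w = 2 * np.pi * f
    F = lambda r: 1 + C / (1j * w * r)
    return np.abs(F(r_src) / F(r_spk))
```

Note the low-frequency gain tends to r_spk/r_src, which is where the huge anti-phase bass signals of naive near-field synthesis come from, and what the NFC formulation keeps bounded.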

Re: [Sursound] the recent 2-channel 3D sound formats and their viability for actual 360 degree sound

2011-07-23 Thread dave . malham

Hi Folks,
 I have an interesting question (well, I think it's interesting). The 
Soundfield microphone, like any directional microphone, has a boosted bass 
response to close sounds. When listening to this through a speaker rig, we 
hear this boost and tend to interpret it as meaning the sound is close, 
especially in a dry acoustic with a Greene-Lee head brace, etc. 
However, surely (unless I am being more dense than usual tonight) this is a 
learnt response based on the behaviour we have heard from directional 
mics? After all, taken individually, at those sort of frequencies our ears 
are essentially omnidirectional and not subject to bass boost (to anything 
like the same degree).


Any thoughts, anyone?

  Dave
On Jul 23 2011, Dave Hunt wrote:


Hi again,


Date: Thu, 21 Jul 2011 21:01:41 +0300 (EEST)
From: Sampo Syreeni 

On 2011-07-21, Dave Hunt wrote:

There is certainly no consideration of values outside the unit  
sphere.

[...]


Correct, and we've been here before.


We certainly have.


As BLaH points out, even the first
order decoder handles distance as well as it possibly can. So does the
SoundField mic on the encoding side.


The encoding and decoding are well matched. In some ways hardly  
surprising.



But the classical synthetic
encoding equation is for infinitely far away sources only, that is,
plane waves. Running the result through a proper, BLaH compliant  
decoder

then reconstructs a simulacrum of such a plane wave, with first order
directional blurring, spatial aliasing caused by the discrete rig, and
the purposely imposed psychoacoustic optimizations overlaid on top of
the original, extended soundfield. So in fact it's wrong to say  
that the

source is produced at the distance of the rig: instead it's produced
infinitely far away, modulo the above three complications. (That is
bound to be one part of why even synthetically panned sources localise
so nicely even when listening from outside the rig.)


I have already admitted the error of my original statement. You're  
right that POA assumes plane waves. The encoded signals are  
reproduced at the distance of the loudspeakers. The shelf filters in  
a BLaH compliant decoder are (as I understand it) an attempt to  
compensate for the speakers finite distance, and that they don't  
produce plane waves at the listener. This is often referred to as  
'distance compensation'.


If you want to synthetically encode a near-field source so to speak  
"by

the book", you'll have to lift the source term from Daniel, Nicol and
Moreau's NFC work. I seem to remember it amounts to a first order  
filter

on the first order part of the source signal in the continuous domain,
which you'll then have to discretize. (But don't take my word for it,
it's been a while since I went through DN&R.)


Me too, but as I remember it tries to build the 'distance  
compensation' into the encoding, and thus is dependent on the  
distance of the loudspeakers. Thus the encoding is only suitable for  
an identical or similar rig, and is not transferable to other rigs.  
Amplitude/delay based systems such as WFS, Delta stereophony and  
TiMax have similar problems. The encoding has to be matched to the  
speaker rig.



 Simply
manipulating the relative amplitude or even the spectral contour  
doesn't

in theory cut it, though it's a cheap way to get some of the
psychoacoustic effects of a nearby source.


Agreed that it is far from perfect, but this is obviously not a  
trivial problem. What I'm suggesting is a fudge, though it can  
produce simulations of sources both inside and outside the  
loudspeaker radius which can be psychoacoustically effective, and are  
transferable to different rigs.


We're still left with the "40 foot high geese" problem.


 The only minor nit is that synthetic
panning needs a bit more refinement for near sources that wasn't being
handled by the older literature.


The "(potentially nasty) bass boost" you refer to is obviously a  
problem. You could limit it from going extremely large at very small  
distances, and ensure that  the output only went to 0dBFS maximum,  
but this would require a huge dynamic range throughout the whole  
system: large bit depth, good DACs, very quiet amplifiers etc..


If you could do the encoding assuming a given speaker distance, then  
modify the decoding for a different distance it might help, though  
I've no idea how to do this.


Ciao,

Dave Hunt







Re: [Sursound] the recent 2-channel 3D sound formats and their viability for actual 360 degree sound

2011-07-23 Thread Dave Hunt

Hi again,


Date: Thu, 21 Jul 2011 21:01:41 +0300 (EEST)
From: Sampo Syreeni 

On 2011-07-21, Dave Hunt wrote:

There is certainly no consideration of values outside the unit  
sphere.

[...]


Correct, and we've been here before.


We certainly have.


As BLaH points out, even the first
order decoder handles distance as well as it possibly can. So does the
SoundField mic on the encoding side.


The encoding and decoding are well matched. In some ways hardly  
surprising.



But the classical synthetic
encoding equation is for infinitely far away sources only, that is,
plane waves. Running the result through a proper, BLaH compliant  
decoder

then reconstructs a simulacrum of such a plane wave, with first order
directional blurring, spatial aliasing caused by the discrete rig, and
the purposely imposed psychoacoustic optimizations overlaid on top of
the original, extended soundfield. So in fact it's wrong to say  
that the

source is produced at the distance of the rig: instead it's produced
infinitely far away, modulo the above three complications. (That is
bound to be one part of why even synthetically panned sources localise
so nicely even when listening from outside the rig.)


I have already admitted the error of my original statement. You're  
right that POA assumes plane waves. The encoded signals are  
reproduced at the distance of the loudspeakers. The shelf filters in  
a BLaH compliant decoder are (as I understand it) an attempt to  
compensate for the speakers' finite distance, and that they don't  
produce plane waves at the listener. This is often referred to as  
'distance compensation'.


If you want to synthetically encode a near-field source so to speak  
"by

the book", you'll have to lift the source term from Daniel, Nicol and
Moreau's NFC work. I seem to remember it amounts to a first order  
filter

on the first order part of the source signal in the continuous domain,
which you'll then have to discretize. (But don't take my word for it,
it's been a while since I went through DN&R.)


Me too, but as I remember it tries to build the 'distance  
compensation' into the encoding, and thus is dependent on the  
distance of the loudspeakers. Thus the encoding is only suitable for  
an identical or similar rig, and is not transferable to other rigs.  
Amplitude/delay based systems such as WFS, Delta stereophony and  
TiMax have similar problems. The encoding has to be matched to the  
speaker rig.



 Simply
manipulating the relative amplitude or even the spectral contour  
doesn't

in theory cut it, though it's a cheap way to get some of the
psychoacoustic effects of a nearby source.


Agreed that it is far from perfect, but this is obviously not a  
trivial problem. What I'm suggesting is a fudge, though it can  
produce simulations of sources both inside and outside the  
loudspeaker radius which can be psychoacoustically effective, and are  
transferable to different rigs.


We're still left with the "40 foot high geese" problem.


 The only minor nit is that synthetic
panning needs a bit more refinement for near sources that wasn't being
handled by the older literature.


The "(potentially nasty) bass boost" you refer to is obviously a  
problem. You could limit it from going extremely large at very small  
distances, and ensure that  the output only went to 0dBFS maximum,  
but this would require a huge dynamic range throughout the whole  
system: large bit depth, good DACs, very quiet amplifiers etc..


If you could do the encoding assuming a given speaker distance, then  
modify the decoding for a different distance it might help, though  
I've no idea how to do this.


Ciao,

Dave Hunt
