Re: aucat(1) mixing: saturating-addition instead of add-and-divide-by-n_inputs

Sviatoslav Chagaev Wed, 11 May 2011 14:41:45 -0700

On Wed, 11 May 2011 09:40:16 +0200
Alexandre Ratchov <[email protected]> wrote:


> On Wed, May 11, 2011 at 02:50:36AM +0300, Sviatoslav Chagaev wrote:
> > I'm sitting at work, listening to music, debugging a web-application
> > with JavaScript alert()s. Each time an alert window pops up, the
> > browser plays a sound. For a brief moment, the volume drops by half
> > then goes back to normal. This is annoying and doesn't make sense.
> 
> I agree, this is annoying.
> 
> > In real life, if you are surrounded by multiple sound sources, their
> > sound volumes will not be divided by the total number of sound sources.
> > Their sounds will add up until they blur and you can't distinguish
> > anything anymore. Other operating systems, such as Macrohard Doors, do
> > mixing by modeling this real world behaviour.
> 
> my physics lessons say that pressure is additive, so the resulting
> pressure of two sources close to each other is the sum of their
> respective pressures. And there's no clipping in nature, so no need to
> test against any MIN and MAX value.
> 
> A simple addition is what our ears expect.

True, with a note that I test for MIN and MAX because without the check
the variable overflows, resulting in terrible distortion as soon as two
files are playing simultaneously and are loud enough.

> 
> On the other hand DACs operate on a limited dynamic range, so there's
> a MIN and a MAX value. This is not how the laws of physics work: there
> are no MIN and MAX values for pressure.
> 
> So keeping full dynamic range of the DAC and doing the physics
> correctly at the same time is simply mathematically impossible.

Yes, the computer is limited and can only model real world to some
degree.

> 
> What options do we have?
> 
>  (1) prescale streams => lose a few dB of dynamic range
>  (2) clipping => is not natural except if there's no clipping
>  (3) using (x + y - x * y) => distortion, similar to (2)
>  (4) do (1) but with DACs with larger dynamic range => ok
>  (5) ...
> 

Hm, perhaps some sort of a hybrid approach could be employed (for the
-l, aka system sound server functionality).

> The choice behind aucat is to never add distortion, clipping or
> whatever. So (1) and (4) are the only options afaics
> 
> > In this sense, aucat violates the principle of least surprise.
> > I'm used to how sound interacts in real world and then aucat steps in
> > and introduces its own laws of physics.
> > 
> > To remedy this, aucat has an option -v, which lets you pre-divide the
> > volume of inputs. This results in loss of dynamic range (quiet sounds
> > might disappear and the maximum volume that you can set decreases). And
> > also, if during usage the count of inputs rises above what I
> > predicted, the volume starts to jump up and down again.
> 
> If you have N streams, the relative jump is N / (N + 1), so there's
> almost no step if N is large enough (it tends to 1). My experience is
> that for N > 3, I hear no step, except if I pay special attention
> and/or I use particular recordings.
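To put numbers on the N / (N + 1) figure, here is a small sketch. It
assumes each of the N playing streams is scaled by 1/N and that one more
stream then joins; join_step() is a name made up for this example, not
something from aucat.

```c
/*
 * Relative gain applied to the streams that are already playing when
 * one more stream joins, assuming every stream is scaled by 1/n before
 * and by 1/(n + 1) after. join_step() is a hypothetical helper.
 */
static double
join_step(int n)
{
	return (double)n / (double)(n + 1);
}
```

Calling it for N = 1, 2, 3, 7 gives 0.5, about 0.667, 0.75 and 0.875, so
the step shrinks quickly as more streams are already playing.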
> 
> > 
> > Experimentally, I've found that if you do a saturating addition between
> > inputs, it sounds very much how it might have sounded in real world
> 
> I don't agree. Sound doesn't saturate in real world. When two persons
> are speaking around me at the same time, I don't hear any
> clipping/distortion.
> 

That is true.

> Human ears might saturate at very elevated sound levels but at such
> level they are being damaged.
> 
> > and
> > how Macrohard Doors, among others, sounds when playing
> > multiple sounds.
> > 
> 
> I bet it prescales, but nobody noticed it because it prescales all the
> time. I bet that if "-v 100" was the aucat default, we wouldn't have
> this discussion. We would be discussing aucat defaults being
> impractical for conversions, or the volume being too low when a
> single stream is playing.
> 

Now that I think about it, it might be using a hybrid approach:
prescaling a bit and then adding with saturation (or something
like that). I'll try to do some experiments.
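A rough sketch of what "prescale a bit, then add with saturation" could
look like. Everything here is an assumption made for illustration:
16-bit signed samples, a fixed 3/4 prescale (about -2.5 dB), and the
names prescale() and mix_hybrid(). It is not what any particular
operating system is known to do.

```c
#include <stdint.h>

/* Hypothetical fixed prescale: multiply by 3/4 (about -2.5 dB). */
static int
prescale(int x)
{
	return x * 3 / 4;
}

/*
 * Mix two 16-bit samples: prescale each one, add in a wider int so
 * the sum itself cannot overflow, then clamp to the 16-bit range.
 */
static int16_t
mix_hybrid(int16_t x, int16_t y)
{
	int sum = prescale(x) + prescale(y);

	if (sum > INT16_MAX)
		return INT16_MAX;
	if (sum < INT16_MIN)
		return INT16_MIN;
	return (int16_t)sum;
}
```

With this prescale, two inputs at 20000 mix to 30000 without clipping,
while two full-scale inputs still saturate. The trade-off is the same
as with -v, only smaller: some dynamic range is given up in exchange for
clipping less often.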

> > 
> > So, why is what I'm proposing better than what currently exists:
> > 
> > * Resembles how sound behaves in real world more closely;
> > * Doesn't violate the principle of least surprise;
> > * No more annoying volume jumps up and down;
> > * No need to use the -v option anymore / less stuff to remember / "it
> > just works";
> > * No more choosing between being annoyed by volume jumps or losing
> > dynamic range.
> > 
> 
> I guess this works well with your recordings by accident, as it would
> with mine. I bet they are pre-divided, so you almost never hit the
> ADATA_MIN and ADATA_MAX boundary, and there's almost no clipping, is
> there?
> 

Before posting, I tested by playing, I don't remember exactly, either
2 or 3 files simultaneously. And honestly, I didn't notice any
distortion. This is probably because, like you said, it's rare that
all the waves have a +/-ADATA_UNIT value at a given moment. After
reading your replies, I tried playing ~8 files simultaneously and I
did clearly hear distortion.

> If so, for such streams you could do:
> 
>       int
>       adata_sadd(int x, int y)
>       {
>               return x + y;
>       }
> 

The ADATA_MIN and ADATA_MAX checks are needed because without them you
will start hearing distortion as soon as you play 2 files, and this
distortion sounds much worse than the saturation distortion.
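To illustrate the difference, here is a minimal sketch, assuming 16-bit
signed samples. INT16_MIN and INT16_MAX stand in for ADATA_MIN and
ADATA_MAX, and add_wrap() and add_sat() are names made up for this
example.

```c
#include <stdint.h>

/*
 * Plain addition stored back into 16 bits. The sum is reduced modulo
 * 2^16 in unsigned arithmetic; converting it back to int16_t is
 * implementation-defined in older C standards, but wraps in two's
 * complement on common compilers.
 */
static int16_t
add_wrap(int16_t x, int16_t y)
{
	uint16_t u = (uint16_t)((uint16_t)x + (uint16_t)y);

	return (int16_t)u;
}

/* Saturating addition: add in a wider int, then clamp to 16 bits. */
static int16_t
add_sat(int16_t x, int16_t y)
{
	int sum = (int)x + (int)y;

	if (sum > INT16_MAX)
		return INT16_MAX;
	if (sum < INT16_MIN)
		return INT16_MIN;
	return (int16_t)sum;
}
```

For two loud samples of 30000 each, add_wrap() yields -5536, so a
positive peak turns into a negative value, while add_sat() yields 32767,
a flattened peak. For quiet samples both functions return the same sum.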

> and the result would be almost the same.
> 
> -- Alexandre
