*To use log magnitude you'd first have to normalize it to look like a
probability density (non-negative, sums to one). Meaning you add an offset
so that the lowest value is zero, and then normalize.  Obviously that puts
restrictions on the class of signals it can handle - there can't be any
zeros on the unit circle (in practice we'd just apply a minimum threshold
at, say, -60dB or whatever) - and involves other complications (I'm not
sure there's a sensible time-domain interpretation).*
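
For concreteness, here is a minimal Python sketch of that offset-and-normalize step, set against the raw dB weighting from the two-sine example quoted further down. The function names and the choice to map the floor itself to zero weight are assumptions made for illustration; nobody in the thread posted this code.

```python
def centroid(freqs_hz, weights):
    """Weighted mean frequency: sum(f * w) / sum(w)."""
    total = sum(weights)
    if total == 0:
        raise ValueError("weights sum to zero; centroid is undefined")
    return sum(f * w for f, w in zip(freqs_hz, weights)) / total


def log_weights(levels_db, floor_db=-60.0):
    """Turn dB levels into non-negative weights that sum to one.

    Assumption: levels are clipped at floor_db, and the offset is chosen
    so that the floor maps to zero weight.
    """
    shifted = [max(level, floor_db) - floor_db for level in levels_db]
    total = sum(shifted)
    if total == 0:
        raise ValueError("every level is at or below the floor")
    return [s / total for s in shifted]


freqs = [100.0, 200.0]

# Raw dB values used directly as weights: -6 dB at 100 Hz, 12 dB at 200 Hz.
raw = centroid(freqs, [-6.0, 12.0])  # 300.0, outside the 100-200 Hz range

# Offset and normalized weights: the result stays between 100 and 200 Hz.
fixed = centroid(freqs, log_weights([-6.0, 12.0]))

print(raw, fixed)
```

Note that the normalized result moves with the choice of floor_db, which is one of the complications mentioned above.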

I've solved a lot of problems in the past by "massaging" numbers into
ranges or formats where they suit the problem I'm trying to solve.  That
approach -- adding mathematical complexity according to convenience and
intuition rather than specific theoretical justification -- is
unscientific, and time and time again it has sent me down false leads and
produced discouraging results when solving low-level problems.  (That said, the
failures that result from "fumbling in the dark" can sometimes lead to
groundbreaking discoveries.)

Research into perception tells us that most phenomena are perceived in
proportion to the logarithm of their intensity.  It tells us further that
auditory stimuli are received in a form *resembling* the frequency domain.
We're mathematicians, not neuroscientists, and our discipline comes with a
powerful confirmation bias for simple, "elegant" solutions.  But the
cochlea is not cleanly modeled
<http://www.cns.nyu.edu/~david/courses/perception/lecturenotes/pitch/pitch.html>
by a Fourier transform, and as to what happens beyond, Minsky said it best:
the simplest explanation is that there is no simple explanation.  In the
absence of hard research, we can't reasonably expect that adding logarithm
flavoring to such a simple formula will make it converge with the result of
billions of years of evolution.

Anyway, that's why -- in spite of my extensive research in pitch tracking
-- I don't touch perception modeling with a ten-foot pole.  It's a soft
science and it's all too easy to develop the misconception that you know
what you're doing.  Because it will be a long time before the perceptual
properties of any brightness metric can be clearly understood, I'll stick
to formulas whose mathematical properties are transparent -- these lend
themselves infinitely better to being small pieces of larger systems.

– Evan Balster
creator of imitone <http://imitone.com>

On Thu, Feb 18, 2016 at 11:24 AM, Ethan Duni <ethan.d...@gmail.com> wrote:

> >Weighting a mean with log-magnitude can quickly lead to nonsense.
>
> To use log magnitude you'd first have to normalize it to look like a
> probability density (non-negative, sums to one). Meaning you add an offset
> so that the lowest value is zero, and then normalize. Obviously that puts
> restrictions on the class of signals it can handle - there can't be any
> zeros on the unit circle (in practice we'd just apply a minimum threshold
> at, say, -60dB or whatever) - and involves other complications (I'm not
> sure there's a sensible time-domain interpretation).
>
> >I apply Occam's razor when making decisions about what metrics correspond
> most closely to nature
>
> What is the natural phenomenon that we're trying to model here?
>
> > log-magnitude is rarely sensible outside of perception modeling
>
> But isn't the goal here to estimate the "brightness" of a signal?
> Perceptual modelling is exactly why I bring log spectra up.
>
> E
>
>
>
> On Thu, Feb 18, 2016 at 7:42 AM, Evan Balster <e...@imitone.com> wrote:
>
>> Weighting a mean with log-magnitude can quickly lead to nonsense.
>> Trivial examples:
>>
>>    - 0 dB sine at 100 Hz, 6 dB sine at 200 Hz --> log centroid is 200 Hz
>>    - -6 dB sine at 100 Hz, 12 dB sine at 200 Hz --> log centroid is 300 Hz (!)
>>
>> Sanfilippo's adaptive median finding technique is still applicable, but
>> will produce the same result as a power or magnitude version.
>>
>> I apply Occam's razor when making decisions about what metrics correspond
>> most closely to nature.  I choose the formula which is mathematically
>> simplest while utilizing operations that make sense for the dimensionality
>> of the operands and do not induce undue discontinuities.  Power is simpler
>> to compute than magnitude, log-magnitude is rarely sensible outside of
>> perception modeling, and (unlike zero-crossing techniques) a small change
>> in the signal will always produce a proportionally small change in the
>> metrics.
>>
>> At next opportunity I should post up some code describing how to compute
>> higher moments with the differential brightness estimator.
>>
>> – Evan Balster
>> creator of imitone <http://imitone.com>
>>
>> On Thu, Feb 18, 2016 at 1:00 AM, Ethan Duni <ethan.d...@gmail.com> wrote:
>>
>>> >normalized to fundamental frequency or not
>>> >normalized (so that no pitch detector is needed)?
>>>
>>> Yeah tonal signals open up a whole other can of worms. I'd like to
>>> understand the broadband case first, with relatively simple spectral
>>> statistics that correspond to the clever time-domain estimators discussed
>>> so far in the thread.
>>>
>>> The ideas for time-domain approaches got me thinking about what the
>>> optimal time-domain approach would look like. But of course it depends on
>>> what definition of spectral centroid you use. For the mean of the power
>>> spectrum it seems relatively straightforward to get some tractable
>>> expressions - I guess this is the inspiration for the one based on an
>>> approximate differentiator. But I suspect that mean of the log power
>>> spectrum is more perceptually meaningful.
>>>
>>> E
>>>
>>> On Wed, Feb 17, 2016 at 8:34 PM, robert bristow-johnson <
>>> r...@audioimagination.com> wrote:
>>>
>>>>
>>>>
>>>> ---------------------------- Original Message
>>>> ----------------------------
>>>> Subject: Re: [music-dsp] Cheap spectral centroid recipe
>>>> From: "Ethan Duni" <ethan.d...@gmail.com>
>>>> Date: Wed, February 17, 2016 11:21 pm
>>>> To: "A discussion list for music-related DSP" <
>>>> music-dsp@music.columbia.edu>
>>>>
>>>> --------------------------------------------------------------------------
>>>>
>>>> >>It's essentially computing a frequency median,
>>>> >>rather than a frequency mean as is the case
>>>> >>with the derivative-power technique described
>>>> >> in my original approach.
>>>> >
>>>> > So I'm wondering, is there any consensus on what is the best measure
>>>> > of central tendency for a music signal spectrum? There's the median vs
>>>> > the mean (vs trimmed means, mode, etc). But what is the right domain in
>>>> > the first place: magnitude spectrum, power spectrum, log power spectrum
>>>> > or ???
>>>>
>>>> normalized to fundamental frequency or not normalized (so that no pitch
>>>> detector is needed)?  should identical waveforms at higher pitches have the
>>>> same centroid parameter or a higher centroid?
>>>>
>>>> spectral "brightness" is a multi-dimensional perceptual parameter.  you
>>>> can have two tones with the same spectral centroid (however consistently
>>>> you measure it) that sound very different if the "second moment" or
>>>> "variance" is much different.
>>>>
>>>>
>>>>
>>>> --
>>>>
>>>>
>>>> r b-j                   r...@audioimagination.com
>>>>
>>>>
>>>>
>>>>
>>>> "Imagination is more important than knowledge."
>>>>
>>>> _______________________________________________
>>>> dupswapdrop: music-dsp mailing list
>>>> music-dsp@music.columbia.edu
>>>> https://lists.columbia.edu/mailman/listinfo/music-dsp
>>>>
