Ross' explanation is solid.

Robert:  R refers to range ("delay line" size, one could say) and N refers
to signal length.


To clarify the weird linear/binary search:  It's necessary in order to
maintain O(N) total complexity for processing N samples.  This took me a
while to grasp:

If we do a backward linear search (as in the original Lemire algorithm) we
get that O(N) total complexity, but with an O(R) worst case per sample.
Lemire's paper explains why the former holds.

If we do a binary search, we get O(log2(R)) worst case per sample, but the
total complexity is O(N*log2(R)) in the worst case where the signal is
monotonically decreasing---because our wedge will be size R at all times.

If we do a linear search over the first log2(R) samples, though, we get the
best of both worlds:  The Lemire-Fenn algorithm.  The reasoning is that
once we've done log2(R) steps of the linear search, performing a log2(R)
binary search only multiplies the work by a constant factor.  Think of this
as "clamping" the worst-case performance of the search to log2(R) without
increasing the complexity of cheaper search-and-prune operations.

(If anyone can explain that better, I invite them to!)
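
For the curious, here's a rough C++ sketch of the hybrid search (names and
details are illustrative, not the exact GitHub code).  The wedge holds
(value, index) pairs, newest at the back, in decreasing value order; a new
sample x shadows everything at the back that is <= x:

    #include <algorithm>
    #include <deque>
    #include <utility>

    using Entry = std::pair<float, long>;   // (value, sample index)

    // Trim everything shadowed by x off the back of the wedge.
    // linearLimit should be about ceil(log2(R)).
    void trimBack(std::deque<Entry> &wedge, float x, int linearLimit)
    {
        // Linear phase: cheap for smooth signals, bounded at linearLimit.
        int steps = 0;
        while (steps < linearLimit &&
               !wedge.empty() && wedge.back().first <= x)
        {
            wedge.pop_back();
            ++steps;
        }

        // Binary phase: reached only after log2(R) linear steps, so it at
        // most multiplies the work already done by a constant factor.
        if (steps == linearLimit &&
            !wedge.empty() && wedge.back().first <= x)
        {
            // The wedge is decreasing, so everything from the first entry
            // with value <= x through the back is shadowed.
            auto cut = std::lower_bound(wedge.begin(), wedge.end(), x,
                [](const Entry &e, float v) { return e.first > v; });
            wedge.erase(cut, wedge.end());
        }
    }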


Re:  Using STL for DSP, I largely agree (though I'll often do it when I can
effectively control allocations).  The GitHub code is suitable for
high-level use and as a reference implementation.

– Evan Balster
creator of imitone <http://imitone.com>

On Sat, Sep 3, 2016 at 12:42 AM, Ross Bencina <rossb-li...@audiomulch.com>
wrote:

> On 3/09/2016 2:14 PM, robert bristow-johnson wrote:
>
>> and in discussing iterators, says nothing about push_back()
>> and the like.
>>
>
> push_back(), push_front(), pop_back(), pop_front() are methods generally
> available on queue-like containers.
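>
> For instance, with std::deque (just an illustration, not the code under
> discussion):
>
>     #include <deque>
>
>     int main()
>     {
>         std::deque<float> q;
>         q.push_back(3.0f);   // newest value goes in at the back
>         q.push_back(1.0f);
>         q.pop_front();       // oldest value leaves from the front (FIFO)
>         q.pop_back();        // ...or trim from the back (LIFO)
>     }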
>
>
>> from online i can get an idea, but it seems to me to be a
>> LIFO stack, not a FIFO buffer.  so a sample value gets pushed onto this
>> thing and pop_back() pops it off, but how does the oldest sample get
>> pushed off the front?  what makes this vector a circular FIFO thingie?
>>
>
> What you're missing is that the code to drop the oldest samples is omitted
> from my example entirely, and is in a separate file in Evan's code.
>
> You're kinda right though: there's a LIFO process that deals with the
> incoming data, and old samples drop off the far end of the queue (FIFO)
> when they age beyond the window size. The running maximum is always at the
> far end of the queue (since the sequence in the queue is decreasing).
>
> In my code, /front/ is the oldest value and /back/ is the newest value.
> The queue only contains the decreasing segments, so it's a discontiguous
> history -- the oldest value in the queue is not usually windowSize samples
> old, it's probably newer than that.
>
> Each queue entry is a pair (value, index). The index is some integer that
> counts input samples.
>
> During decreasing segments, (value, index) are pushed on the back of the
> queue. During increasing segments, samples are trimmed off the back. (This
> is the LIFO part)
>
> Samples are dropped off the front when index is older than the current
> input index (This is the FIFO part).
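>
> In rough C++ (a sketch of the scheme described above, not Evan's actual
> code; the names are made up):
>
>     #include <deque>
>
>     struct Entry { float value; long index; };
>     std::deque<Entry> history;
>
>     // One step of the running max. 'now' counts input samples.
>     float step(float x, long now, long windowSize)
>     {
>         // LIFO: trim samples shadowed by x off the back.
>         while (!history.empty() && history.back().value <= x)
>             history.pop_back();
>         history.push_back({x, now});
>
>         // FIFO: drop the front once it ages out of the window.
>         if (history.front().index <= now - windowSize)
>             history.pop_front();
>
>         return history.front().value;   // the running maximum
>     }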
>
>
>> that said, the "10 lines of code" is deceptive.  it's 10 lines of code
>> with function calls.  you gotta count the cost of the function calls.
>>
>
> I agree that it's opaque. But a modern C++ compiler will inline everything
> and optimize away most of the overhead. I know this sounds like fanboy
> talk, but the C++ optimizers are surprisingly good these days.
>
> Personally I steer clear of STL for dsp code.
>
>
>> now, Ross, this is Evan's diagram (from earlier today), but maybe you
>> can help me:
>>
>>
>> [diagram: http://interactopia.com/archive/images/lemire_algorithm.png]
>>
>
> I read that with time running from left to right.
>
> The newest samples are added on the right, the oldest samples are dropped
> from the left.
>
> The red segments are the portions stored in the running max history buffer.
>
>
>>> The algorithm can safely forget anything in grey because it has been
>>> "shadowed" by newer maximum or minimum values.
>>
>
>> what is not shown on the diagram is what happens when the current
>> running max value expires (or falls off the edge of the delay buffer)?
>>  when the current max value falls offa the edge, what must you do to
>> find the new maximum over the past N samples?
>>
>
> Let's call the edge "front". That's the leftmost sample in Evan's picture.
> So we expire that one, it falls off the edge, it's no longer there. Notice
> how the stored (red) samples are all in a decreasing sequence? So the new
> maximum over N samples is just the new "front".
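>
> (A tiny example: if the queue holds (9, t), (7, t+3), (4, t+5) and the 9
> expires, the new running max is immediately 7; no search is needed.)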
>
>
>> you would have to be riding the slope down on the left side of the peak
>> that just fell offa the edge and then how do you compare that value to
>> any peaks (that are lower for the moment) that have not yet expired?
>>
>
> If they're all guaranteed lower, you don't need to compare them.
>
> It would be impossible for them to be higher, because the LIFO process at
> the input end of the queue ensures that all samples in the history form a
> decreasing sequence.
>
>> they have to be other wedges, right?  with their own record of height
>> and location, right?  the only thing shown on that diagram is what
>> happens when new maxima come into the delay line on the left and the
>> running max value increases.  it does not show how it decreases and the
>> running max must decrease eventually.
>>
>
> If the history is non-empty, the running max decreases every time you pop
> a sample off the front. As I said above, that wouldn't necessarily happen
> at every step. For example, if a new global max arrives, the queue would be
> completely flushed, and that global max would remain the "front" value in
> the queue for windowSize samples.
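>
> (For example, with windowSize = 4: if the queue holds (3, t), (2, t+1)
> and a 5 arrives, both entries get popped and the queue becomes just
> (5, t+2); that 5 then sits at the "front" for the next 4 samples.)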
>
>
>> i cannot see how this strategy works without keeping a record of *every*
>> local maxima.  both its value and its location.
>>
>
> It keeps a record of a decreasing monotonic sequence of inputs, including
> their locations.
>
>
>> that record would also
>> be FIFO and the size of that record would be large if there were a lotta
>> peaks (like a high frequency signal), but each entry would be
>> chronologically placed.
>>
>
> Correct.
>
>
>> any local maximum between the current running max and the "runner up"
>> (which is more current) can be ignored.  perhaps this means that you
>> only need to note where the current running max is (both location and
>> value) and the runner-up that is more current.  when the current max
>> falls offa the edge on the right, you have to ride the left slope of
>> that peak (which is at the maximum delay location) down until it's the
>> value of the runner up and then all samples between the new running max
>> and the end of the delay buffer can be ignored.  but here is the
>> question, when the runner up is promoted to the running max, how is the
>> *new* runner up found?
>>
>
> It's the next sample in the stored decreasing history sequence. The
> running max is at the left of Evan's picture; the next running max is the
> next red pixel down the hill. (Note that this might involve a big time
> gap: the big grey valley in the middle of the picture.)
>
>> that could be anywhere between the buffer input
>> (on the left side) and the new running max.  if you are not already
>> performing a binary tree on that , i cannot see how finding the new
>> runner up is cheap, unless you have a record of every local max between
>> the buffer input (on the left side) and the new running max.  and each
>> time a new local max is detected, you have to do the O(log2(something))
>> binary tree thing.
>>
>
> I'm not sure if I follow you. But as I've said before, if a new global
> maximum arrives, then in the worst case the whole history needs to be
> trimmed (or, more generally, trimmed back to the first value that's
> greater than the new input sample).
>
> Each time a locally increasing sample arrives you do have to do
> O(log2(something)) work, that's true. The constraints on something are kind
> of interesting though.
>
> "something" = <queue size> only if the queue is "full" which happens with
> a decreasing signal for windowSize samples and the new input is a new
> maximum over the whole window.
>
> For a smooth signal, "something" ~= 1, since it will just "peel" a few
> samples off the back of the queue. This is, I think, why Ethan suggested
> doing a few linear steps first (as would be appropriate for a smooth input
> like the one shown at the right of Evan's diagram).
>
> To a first-order approximation, "something" depends both on the number of
> history samples in the queue (e.g. windowSize/2 for a symmetric waveform)
> and on the size of the positive jump in the first difference.
>
>
> Ross.
_______________________________________________
dupswapdrop: music-dsp mailing list
music-dsp@music.columbia.edu
https://lists.columbia.edu/mailman/listinfo/music-dsp
