Re: [Jprogramming] kadane algorithm vs brute force timing and memory

Thomas McGuire Fri, 07 Apr 2023 00:09:38 -0700

In my haste to produce an email after finally figuring out how Pepe’s elegant J 
implementation of the maximal sum of the subsequences worked ,I can see how you 
might interpret this as I was disappointed in the implementation as not being 
“pure” kadane. I meant only to classify the implementation so that the CPU 
times I was recording made sense


From the perspective of big O analysis. Pure kadane is O(n). Where Pepe’s 
algorithm is being applied to a list (of length n)  of lists of varying length 
1..n, The operation performed 1+1+2+…+n-1 times. Which places it about O(n^2) 
though it’s been a while since I have done any algorithm analysis.

In terms of a J implementation of the broader category of maximal sum of the 
subsequences of a list of numbers, Pepe’s algorithm is fantastic. From the 
perspective of my tutorial it is an incredibly insightful jump from the naive 
brute force implementations I had used.  It is such a concise J implementation 
that highlights a significant portion of J’s advanced techniques. I intend on 
adding it to my tutorial I just hope I can explain it well enough to make it 
accessible to my target audience. 


> On Apr 6, 2023, at 10:55 PM, Henry Rich <[email protected]> wrote:
> 
> I think that when you say Pepe's code is not the Kadane algorithm you are 
> conflating algorithm and implementation.
> 
> Kadane's idea is that (1) at each new prospective starting point for the 
> maximal sequence, you have the option of starting at 0 or the previous best 
> total; (2) every position is a prospective ending point.  Pepe's algorithm 
> follows these observations.
> 
> The usual implementations keep two values through the loop: (a) the maximum 
> value found on sequences that have ended; (b) the current total on sequences 
> that the new value may add onto.
> 
> Well, that's one way to do it.  Another is to produce (a) as an intermediate 
> result for each possible ending position, and then at the end see which of 
> those values is the largest.  That's what Pepe did.  I approve his decision, 
> since keeping two values through the scan would require ponderous indexing 
> operations. This is the sort of implementation decision that surprises users 
> of compiled languages.
> 
> Pepe's implementation in no way changes the algorithm.  It's merely a 
> reordering of when comparisons are made.  Pepe's algorithm is O(n), just as 
> the usual algorithms are.

As I tried to explain above the ([ >. +)/ is O(n) but when used on suffixes 
with the ‘\.’ operator it becomes O(n^2). the kadane”0 implementation that 
Elijah proposed operates through the array one at a time in O(n) producing a 
list of possible maximal sums which is extracted with >./ also in O(n). That 
kadane algorithm uses your technique in Chapter 25 Loopless Code VI, to perform 
a single run through the array.

Just trying to be accurate so I can explain the implementations correctly in my 
essay. 
> 
> Pepe knew that >./ is very fast, probably 10% of the time of the other bit.  
> The time spent in (([ >. +)/\.) is O(n), not worth our discussion; but if you 
> want to blame someone for the speed, blame the implementation (i. e. me), not 
> the algorithm. Perhaps (+ 00&>.)~/\. would be faster; I haven't tested it, 
> but we're down to minute questions of instruction ordering.

No complaints about J’s speed. Even the naive brute force implementation runs 
in less than a second on a 10000 element array. I have been using the timings 
as a relative guage that the various implementations work the way I think they 
are supposed to. 

The one thing I did have a question on now that I’ve drawn you into this 
thread. Is that the memory requirement for Pepe’s algorithm is about 1/5 the 
size of the kadane”0 algorithm. The global variable assignment appears to be 
the operation that causes the large amount of memory to be used. Elijah tried 
to explain it and hsi explanation is valid for some of the implementations such 
as the use of ‘\’ and Fold. But I don’t understand in the setting of a global 
variable.

Tom McGuire
> 
> Henry Rich
> 
> 
> 
> 
> On 4/6/2023 8:12 PM, Thomas McGuire wrote:
>> 
>>> On Apr 3, 2023, at 1:28 PM, Jose Mario Quintana 
>>> <[email protected]> wrote:
>>> 
>>> For what it is worth...
>>> 
>>> kad=. >./@:(([ >. +)/\.)
>>> 
>> OK that is worth a lot. But it’s not really the kadane algorithm. It inserts 
>> a kadane-like summation in between elements for each suffix generated. 
>> Kadane obtains the sum on a single pass through the array. However it’s an 
>> incredibly elegant refactoring of the brute force example and saves 
>> performing partial sums on the suffixes.
>> 
>> I will say after figuring out exactly what is was doing it wasn’t 
>> intuitively obvious (to me anyway) that this should work. A naive view is 
>> that the largest sum in the example below is produced by the subsequence :
>> 4 _1 2 1
>> 
>> So that sequence is never generated in Jose’s version of maximum sum. 
>> However due to it’s Kadane-like properties it is generating the highest sum 
>> for each suffix as if the partial sum started at the first element. In 
>> essence using a kadane-like approach in place of performing partial sums. 
>> Therefore in one pass on each suffix you find the greatest partial sum 
>> without having to calculate each partial sum individually.
>> 
>> Jose’s version is fairly quick coming in at about double the processing time 
>> of Jose’s kadane”0 version and well below the brute force method of taking 
>> the partial sums of all the suffixes.
>> 
>> Tom McGuire
>> 
>>> For example, the maximum sum subarray,
>>> 
>>> 
>>>  kad _2 1 _3 4 _1 2 1 _5 4
>>> 6
>>> 
>>> 
>>>   kad i.0
>>> __
>>> 
>>> 
>>> 
>>> 
>>> 
>>> On Mon, Apr 3, 2023 at 5:22 AM Elijah Stone <[email protected]> wrote:
>>> 
>>>> Your kadane routines are all bound to have asymptotically poor space usage
>>>> under the present interpreter because >./ is not fused; so they all must
>>>> create an intermediate result array whose length is the same as ranvec,
>>>> and
>>>> then apply >./.  I think (haven't looked very closely) that locmax is the
>>>> same
>>>> as your final answer; so you could skip generating the intermediate array
>>>> and
>>>> simply look at locmax.  Better yet, get rid of the global variable, and
>>>> turn
>>>> this all into a fold single (rather than fold multiple), where locmax
>>>> becomes
>>>> the running value.
>>>> 
>>>> On Mon, 3 Apr 2023, Elijah Stone wrote:
>>>> 
>>>>>  7!:2 '>./ 1 kadane\ ranvec' [ locmax=: __
>>>>> 1412576
>>>>>   7!:2 '>./ kadane"0 ranvec' [ locmax=: __
>>>>> 772960
>>>>>   7!:2 '>./ kadane"1 ranvec2' [ locmax=: __ [ ranvec2=: ,.ranvec
>>>>> 1412960
>>>>>   1 <@$@$\ i.5
>>>>> ┌─┬─┬─┬─┬─┐
>>>>> │1│1│1│1│1│
>>>>> └─┴─┴─┴─┴─┘
>>>>>   <@$@$"0 i.5
>>>>> ┌─┬─┬─┬─┬─┐
>>>>> │0│0│0│0│0│
>>>>> └─┴─┴─┴─┴─┘
>>>>> 
>>>>> 1 kadane\ y applies kadane to length-1 subsequences of y--that is, in
>>>> this
>>>>> case, vectors, which have length 1.  Whereas kadane"0 applies to cells
>>>> of y,
>>>>> which have rank 0.  More space is required to store a length-1 vector
>>>> than a
>>>>> scalar, since in the former case the length also needs to be stored.
>>>>> 
>>>>> On Mon, 3 Apr 2023, Thomas McGuire wrote:
>>>>> 
>>>>>> I’m wondering if someone with more J internals knowledge can tell me
>>>> why my
>>>>> Kadane algorithm implementation is taking up more memory than my naive
>>>> brute
>>>>> force method that generates all subsequences then sums them. The time is
>>>>> about what I would expect except for the FOLD operation. The time space
>>>> data
>>>>> were generated on a MacBook Pro 2019 era I9 processor 2.3GHz.
>>>>>>  ranvec =: _5 + ?10000$10
>>>>>>  kadane =: 3 : 'locmax =: y >. locmax + y'
>>>>>>  kadane1 =: 3 : 'locmax =: (] >. locmax&+) y'
>>>>>> 
>>>>>>  NB. first brute force implementation
>>>>>>  timespacex '>./(>./@(+/\))\.ranvec'
>>>>>> 0.032948 396448
>>>>>> 
>>>>>>  NB. fork version brute force
>>>>>>  timespacex '>./([: >./ (+/\))\. ranvec'
>>>>>> 0.034208 396448
>>>>>> 
>>>>>>  NB. first kadane implementation
>>>>>>  locmax =: __
>>>>>>  timespacex '>./ 1 kadane\ranvec'
>>>>>> 0.004034 1.41251e6
>>>>>> 
>>>>>>  NB. fork kadane version
>>>>>>  locmax =: __
>>>>>>  timespacex '>./ 1 kadane1\ranvec'
>>>>>> 0.005267 1.41277e6
>>>>>> 
>>>>>>  NB. new dnfs anonymous kadane function
>>>>>>  locmax =: __
>>>>>>  timespacex '>./ 1 {{locmax =: y >. locmax + y}}\ranvec'
>>>>>> 0.003625 1.41347e6
>>>>>> 
>>>>>>  NB. fork dnfs anonymous kadane function
>>>>>>  locmax =: __
>>>>>>  timespacex '>./ 1 {{locmax =: (] >. locmax&+) y}}\ranvec'
>>>>>> 0.004776 1.41386e6
>>>>>> 
>>>>>>  NB. Fold kadane implementation
>>>>>>  timespacex '>./ __ ] F:. ([ >. +) ranvec'
>>>>>> 0.017393 905792
>>>>>> 
>>>>>> This was all part of a J Tutorial I have written
>>>>> https://code.jsoftware.com/wiki/User:Tom_McGuire/KadaneAlgorithmTutorial
>>>>>> Tom McGuire
>>>>>> ----------------------------------------------------------------------
>>>>>> For information about J forums see http://www.jsoftware.com/forums.htm
>>>>> ----------------------------------------------------------------------
>>>>> For information about J forums see http://www.jsoftware.com/forums.htm
>>>>> 
>>>> ----------------------------------------------------------------------
>>>> For information about J forums see http://www.jsoftware.com/forums.htm
>>>> 
>>> ----------------------------------------------------------------------
>>> For information about J forums see http://www.jsoftware.com/forums.htm
>> ----------------------------------------------------------------------
>> For information about J forums see http://www.jsoftware.com/forums.htm
> 
> ----------------------------------------------------------------------
> For information about J forums see http://www.jsoftware.com/forums.htm

----------------------------------------------------------------------
For information about J forums see http://www.jsoftware.com/forums.htm

Re: [Jprogramming] kadane algorithm vs brute force timing and memory

Reply via email to