Re: [Jprogramming] kadane algorithm vs brute force timing and memory

Raul Miller Fri, 07 Apr 2023 00:54:28 -0700

Actually, my summary line should have been

   2 %/\ 5.98958 0.57413 0.0674812 0.0142999 0.002802
10.4324 8.508 4.719 5.10346


Also, for comparison:

   timex '?1e4#0'
0.0002895
   timex '?1e5#0'
0.0034291
   timex '?1e6#0'
0.0150786
   timex '?1e7#0'
0.081682
   timex '?1e8#0'
0.927786

   2 %/\ 0.927786 0.081682 0.0150786 0.0034291 0.0002895
11.3585 5.41708 4.39725 11.8449

There's two major sources of timing discrepancies here for what would
"theoretically" be a linear (factor of 10) increase:

(1) arbitrary conflicts with system processes, and

(2) memory cache.

FYI,

-- 
Raul

On Fri, Apr 7, 2023 at 3:47 AM Raul Miller <[email protected]> wrote:
>
> This does't look like O(n^2)
>
>    kad=. >./@:(([ >. +)/\.)
>    timex 'kad ?1e4#0'
> 0.002802
>    timex 'kad ?1e5#0'
> 0.0142999
>    timex 'kad ?1e6#0'
> 0.0674812
>    timex 'kad ?1e7#0'
> 0.57413
>    timex 'kad ?1e8#0'
> 5.98958
>
>    2 %~/\ 5.98958 0.57413 0.0674812 0.0142999 0.002802
> 0.0958548 0.117536 0.211909 0.195945
>
> --
> Raul
>
> On Fri, Apr 7, 2023 at 3:09 AM Thomas McGuire <[email protected]> wrote:
> >
> > In my haste to produce an email after finally figuring out how Pepe’s 
> > elegant J implementation of the maximal sum of the subsequences worked ,I 
> > can see how you might interpret this as I was disappointed in the 
> > implementation as not being “pure” kadane. I meant only to classify the 
> > implementation so that the CPU times I was recording made sense
> >
> > From the perspective of big O analysis. Pure kadane is O(n). Where Pepe’s 
> > algorithm is being applied to a list (of length n)  of lists of varying 
> > length 1..n, The operation performed 1+1+2+…+n-1 times. Which places it 
> > about O(n^2) though it’s been a while since I have done any algorithm 
> > analysis.
> >
> > In terms of a J implementation of the broader category of maximal sum of 
> > the subsequences of a list of numbers, Pepe’s algorithm is fantastic. From 
> > the perspective of my tutorial it is an incredibly insightful jump from the 
> > naive brute force implementations I had used.  It is such a concise J 
> > implementation that highlights a significant portion of J’s advanced 
> > techniques. I intend on adding it to my tutorial I just hope I can explain 
> > it well enough to make it accessible to my target audience.
> >
> >
> > > On Apr 6, 2023, at 10:55 PM, Henry Rich <[email protected]> wrote:
> > >
> > > I think that when you say Pepe's code is not the Kadane algorithm you are 
> > > conflating algorithm and implementation.
> > >
> > > Kadane's idea is that (1) at each new prospective starting point for the 
> > > maximal sequence, you have the option of starting at 0 or the previous 
> > > best total; (2) every position is a prospective ending point.  Pepe's 
> > > algorithm follows these observations.
> > >
> > > The usual implementations keep two values through the loop: (a) the 
> > > maximum value found on sequences that have ended; (b) the current total 
> > > on sequences that the new value may add onto.
> > >
> > > Well, that's one way to do it.  Another is to produce (a) as an 
> > > intermediate result for each possible ending position, and then at the 
> > > end see which of those values is the largest.  That's what Pepe did.  I 
> > > approve his decision, since keeping two values through the scan would 
> > > require ponderous indexing operations. This is the sort of implementation 
> > > decision that surprises users of compiled languages.
> > >
> > > Pepe's implementation in no way changes the algorithm.  It's merely a 
> > > reordering of when comparisons are made.  Pepe's algorithm is O(n), just 
> > > as the usual algorithms are.
> >
> > As I tried to explain above the ([ >. +)/ is O(n) but when used on suffixes 
> > with the ‘\.’ operator it becomes O(n^2). the kadane”0 implementation that 
> > Elijah proposed operates through the array one at a time in O(n) producing 
> > a list of possible maximal sums which is extracted with >./ also in O(n). 
> > That kadane algorithm uses your technique in Chapter 25 Loopless Code VI, 
> > to perform a single run through the array.
> >
> > Just trying to be accurate so I can explain the implementations correctly 
> > in my essay.
> > >
> > > Pepe knew that >./ is very fast, probably 10% of the time of the other 
> > > bit.  The time spent in (([ >. +)/\.) is O(n), not worth our discussion; 
> > > but if you want to blame someone for the speed, blame the implementation 
> > > (i. e. me), not the algorithm. Perhaps (+ 00&>.)~/\. would be faster; I 
> > > haven't tested it, but we're down to minute questions of instruction 
> > > ordering.
> >
> > No complaints about J’s speed. Even the naive brute force implementation 
> > runs in less than a second on a 10000 element array. I have been using the 
> > timings as a relative guage that the various implementations work the way I 
> > think they are supposed to.
> >
> > The one thing I did have a question on now that I’ve drawn you into this 
> > thread. Is that the memory requirement for Pepe’s algorithm is about 1/5 
> > the size of the kadane”0 algorithm. The global variable assignment appears 
> > to be the operation that causes the large amount of memory to be used. 
> > Elijah tried to explain it and hsi explanation is valid for some of the 
> > implementations such as the use of ‘\’ and Fold. But I don’t understand in 
> > the setting of a global variable.
> >
> > Tom McGuire
> > >
> > > Henry Rich
> > >
> > >
> > >
> > >
> > > On 4/6/2023 8:12 PM, Thomas McGuire wrote:
> > >>
> > >>> On Apr 3, 2023, at 1:28 PM, Jose Mario Quintana 
> > >>> <[email protected]> wrote:
> > >>>
> > >>> For what it is worth...
> > >>>
> > >>> kad=. >./@:(([ >. +)/\.)
> > >>>
> > >> OK that is worth a lot. But it’s not really the kadane algorithm. It 
> > >> inserts a kadane-like summation in between elements for each suffix 
> > >> generated. Kadane obtains the sum on a single pass through the array. 
> > >> However it’s an incredibly elegant refactoring of the brute force 
> > >> example and saves performing partial sums on the suffixes.
> > >>
> > >> I will say after figuring out exactly what is was doing it wasn’t 
> > >> intuitively obvious (to me anyway) that this should work. A naive view 
> > >> is that the largest sum in the example below is produced by the 
> > >> subsequence :
> > >> 4 _1 2 1
> > >>
> > >> So that sequence is never generated in Jose’s version of maximum sum. 
> > >> However due to it’s Kadane-like properties it is generating the highest 
> > >> sum for each suffix as if the partial sum started at the first element. 
> > >> In essence using a kadane-like approach in place of performing partial 
> > >> sums. Therefore in one pass on each suffix you find the greatest partial 
> > >> sum without having to calculate each partial sum individually.
> > >>
> > >> Jose’s version is fairly quick coming in at about double the processing 
> > >> time of Jose’s kadane”0 version and well below the brute force method of 
> > >> taking the partial sums of all the suffixes.
> > >>
> > >> Tom McGuire
> > >>
> > >>> For example, the maximum sum subarray,
> > >>>
> > >>>
> > >>>  kad _2 1 _3 4 _1 2 1 _5 4
> > >>> 6
> > >>>
> > >>>
> > >>>   kad i.0
> > >>> __
> > >>>
> > >>>
> > >>>
> > >>>
> > >>>
> > >>> On Mon, Apr 3, 2023 at 5:22 AM Elijah Stone <[email protected]> wrote:
> > >>>
> > >>>> Your kadane routines are all bound to have asymptotically poor space 
> > >>>> usage
> > >>>> under the present interpreter because >./ is not fused; so they all 
> > >>>> must
> > >>>> create an intermediate result array whose length is the same as ranvec,
> > >>>> and
> > >>>> then apply >./.  I think (haven't looked very closely) that locmax is 
> > >>>> the
> > >>>> same
> > >>>> as your final answer; so you could skip generating the intermediate 
> > >>>> array
> > >>>> and
> > >>>> simply look at locmax.  Better yet, get rid of the global variable, and
> > >>>> turn
> > >>>> this all into a fold single (rather than fold multiple), where locmax
> > >>>> becomes
> > >>>> the running value.
> > >>>>
> > >>>> On Mon, 3 Apr 2023, Elijah Stone wrote:
> > >>>>
> > >>>>>  7!:2 '>./ 1 kadane\ ranvec' [ locmax=: __
> > >>>>> 1412576
> > >>>>>   7!:2 '>./ kadane"0 ranvec' [ locmax=: __
> > >>>>> 772960
> > >>>>>   7!:2 '>./ kadane"1 ranvec2' [ locmax=: __ [ ranvec2=: ,.ranvec
> > >>>>> 1412960
> > >>>>>   1 <@$@$\ i.5
> > >>>>> ┌─┬─┬─┬─┬─┐
> > >>>>> │1│1│1│1│1│
> > >>>>> └─┴─┴─┴─┴─┘
> > >>>>>   <@$@$"0 i.5
> > >>>>> ┌─┬─┬─┬─┬─┐
> > >>>>> │0│0│0│0│0│
> > >>>>> └─┴─┴─┴─┴─┘
> > >>>>>
> > >>>>> 1 kadane\ y applies kadane to length-1 subsequences of y--that is, in
> > >>>> this
> > >>>>> case, vectors, which have length 1.  Whereas kadane"0 applies to cells
> > >>>> of y,
> > >>>>> which have rank 0.  More space is required to store a length-1 vector
> > >>>> than a
> > >>>>> scalar, since in the former case the length also needs to be stored.
> > >>>>>
> > >>>>> On Mon, 3 Apr 2023, Thomas McGuire wrote:
> > >>>>>
> > >>>>>> I’m wondering if someone with more J internals knowledge can tell me
> > >>>> why my
> > >>>>> Kadane algorithm implementation is taking up more memory than my naive
> > >>>> brute
> > >>>>> force method that generates all subsequences then sums them. The time 
> > >>>>> is
> > >>>>> about what I would expect except for the FOLD operation. The time 
> > >>>>> space
> > >>>> data
> > >>>>> were generated on a MacBook Pro 2019 era I9 processor 2.3GHz.
> > >>>>>>  ranvec =: _5 + ?10000$10
> > >>>>>>  kadane =: 3 : 'locmax =: y >. locmax + y'
> > >>>>>>  kadane1 =: 3 : 'locmax =: (] >. locmax&+) y'
> > >>>>>>
> > >>>>>>  NB. first brute force implementation
> > >>>>>>  timespacex '>./(>./@(+/\))\.ranvec'
> > >>>>>> 0.032948 396448
> > >>>>>>
> > >>>>>>  NB. fork version brute force
> > >>>>>>  timespacex '>./([: >./ (+/\))\. ranvec'
> > >>>>>> 0.034208 396448
> > >>>>>>
> > >>>>>>  NB. first kadane implementation
> > >>>>>>  locmax =: __
> > >>>>>>  timespacex '>./ 1 kadane\ranvec'
> > >>>>>> 0.004034 1.41251e6
> > >>>>>>
> > >>>>>>  NB. fork kadane version
> > >>>>>>  locmax =: __
> > >>>>>>  timespacex '>./ 1 kadane1\ranvec'
> > >>>>>> 0.005267 1.41277e6
> > >>>>>>
> > >>>>>>  NB. new dnfs anonymous kadane function
> > >>>>>>  locmax =: __
> > >>>>>>  timespacex '>./ 1 {{locmax =: y >. locmax + y}}\ranvec'
> > >>>>>> 0.003625 1.41347e6
> > >>>>>>
> > >>>>>>  NB. fork dnfs anonymous kadane function
> > >>>>>>  locmax =: __
> > >>>>>>  timespacex '>./ 1 {{locmax =: (] >. locmax&+) y}}\ranvec'
> > >>>>>> 0.004776 1.41386e6
> > >>>>>>
> > >>>>>>  NB. Fold kadane implementation
> > >>>>>>  timespacex '>./ __ ] F:. ([ >. +) ranvec'
> > >>>>>> 0.017393 905792
> > >>>>>>
> > >>>>>> This was all part of a J Tutorial I have written
> > >>>>> https://code.jsoftware.com/wiki/User:Tom_McGuire/KadaneAlgorithmTutorial
> > >>>>>> Tom McGuire
> > >>>>>> ----------------------------------------------------------------------
> > >>>>>> For information about J forums see 
> > >>>>>> http://www.jsoftware.com/forums.htm
> > >>>>> ----------------------------------------------------------------------
> > >>>>> For information about J forums see http://www.jsoftware.com/forums.htm
> > >>>>>
> > >>>> ----------------------------------------------------------------------
> > >>>> For information about J forums see http://www.jsoftware.com/forums.htm
> > >>>>
> > >>> ----------------------------------------------------------------------
> > >>> For information about J forums see http://www.jsoftware.com/forums.htm
> > >> ----------------------------------------------------------------------
> > >> For information about J forums see http://www.jsoftware.com/forums.htm
> > >
> > > ----------------------------------------------------------------------
> > > For information about J forums see http://www.jsoftware.com/forums.htm
> >
> > ----------------------------------------------------------------------
> > For information about J forums see http://www.jsoftware.com/forums.htm
----------------------------------------------------------------------
For information about J forums see http://www.jsoftware.com/forums.htm

Re: [Jprogramming] kadane algorithm vs brute force timing and memory

Reply via email to