Actually, my summary line should have been 2 %/\ 5.98958 0.57413 0.0674812 0.0142999 0.002802 10.4324 8.508 4.719 5.10346
Also, for comparison: timex '?1e4#0' 0.0002895 timex '?1e5#0' 0.0034291 timex '?1e6#0' 0.0150786 timex '?1e7#0' 0.081682 timex '?1e8#0' 0.927786 2 %/\ 0.927786 0.081682 0.0150786 0.0034291 0.0002895 11.3585 5.41708 4.39725 11.8449 There's two major sources of timing discrepancies here for what would "theoretically" be a linear (factor of 10) increase: (1) arbitrary conflicts with system processes, and (2) memory cache. FYI, -- Raul On Fri, Apr 7, 2023 at 3:47 AM Raul Miller <rauldmil...@gmail.com> wrote: > > This does't look like O(n^2) > > kad=. >./@:(([ >. +)/\.) > timex 'kad ?1e4#0' > 0.002802 > timex 'kad ?1e5#0' > 0.0142999 > timex 'kad ?1e6#0' > 0.0674812 > timex 'kad ?1e7#0' > 0.57413 > timex 'kad ?1e8#0' > 5.98958 > > 2 %~/\ 5.98958 0.57413 0.0674812 0.0142999 0.002802 > 0.0958548 0.117536 0.211909 0.195945 > > -- > Raul > > On Fri, Apr 7, 2023 at 3:09 AM Thomas McGuire <tmcguir...@gmail.com> wrote: > > > > In my haste to produce an email after finally figuring out how Pepe’s > > elegant J implementation of the maximal sum of the subsequences worked ,I > > can see how you might interpret this as I was disappointed in the > > implementation as not being “pure” kadane. I meant only to classify the > > implementation so that the CPU times I was recording made sense > > > > From the perspective of big O analysis. Pure kadane is O(n). Where Pepe’s > > algorithm is being applied to a list (of length n) of lists of varying > > length 1..n, The operation performed 1+1+2+…+n-1 times. Which places it > > about O(n^2) though it’s been a while since I have done any algorithm > > analysis. > > > > In terms of a J implementation of the broader category of maximal sum of > > the subsequences of a list of numbers, Pepe’s algorithm is fantastic. From > > the perspective of my tutorial it is an incredibly insightful jump from the > > naive brute force implementations I had used. It is such a concise J > > implementation that highlights a significant portion of J’s advanced > > techniques. I intend on adding it to my tutorial I just hope I can explain > > it well enough to make it accessible to my target audience. > > > > > > > On Apr 6, 2023, at 10:55 PM, Henry Rich <henryhr...@gmail.com> wrote: > > > > > > I think that when you say Pepe's code is not the Kadane algorithm you are > > > conflating algorithm and implementation. > > > > > > Kadane's idea is that (1) at each new prospective starting point for the > > > maximal sequence, you have the option of starting at 0 or the previous > > > best total; (2) every position is a prospective ending point. Pepe's > > > algorithm follows these observations. > > > > > > The usual implementations keep two values through the loop: (a) the > > > maximum value found on sequences that have ended; (b) the current total > > > on sequences that the new value may add onto. > > > > > > Well, that's one way to do it. Another is to produce (a) as an > > > intermediate result for each possible ending position, and then at the > > > end see which of those values is the largest. That's what Pepe did. I > > > approve his decision, since keeping two values through the scan would > > > require ponderous indexing operations. This is the sort of implementation > > > decision that surprises users of compiled languages. > > > > > > Pepe's implementation in no way changes the algorithm. It's merely a > > > reordering of when comparisons are made. Pepe's algorithm is O(n), just > > > as the usual algorithms are. > > > > As I tried to explain above the ([ >. +)/ is O(n) but when used on suffixes > > with the ‘\.’ operator it becomes O(n^2). the kadane”0 implementation that > > Elijah proposed operates through the array one at a time in O(n) producing > > a list of possible maximal sums which is extracted with >./ also in O(n). > > That kadane algorithm uses your technique in Chapter 25 Loopless Code VI, > > to perform a single run through the array. > > > > Just trying to be accurate so I can explain the implementations correctly > > in my essay. > > > > > > Pepe knew that >./ is very fast, probably 10% of the time of the other > > > bit. The time spent in (([ >. +)/\.) is O(n), not worth our discussion; > > > but if you want to blame someone for the speed, blame the implementation > > > (i. e. me), not the algorithm. Perhaps (+ 00&>.)~/\. would be faster; I > > > haven't tested it, but we're down to minute questions of instruction > > > ordering. > > > > No complaints about J’s speed. Even the naive brute force implementation > > runs in less than a second on a 10000 element array. I have been using the > > timings as a relative guage that the various implementations work the way I > > think they are supposed to. > > > > The one thing I did have a question on now that I’ve drawn you into this > > thread. Is that the memory requirement for Pepe’s algorithm is about 1/5 > > the size of the kadane”0 algorithm. The global variable assignment appears > > to be the operation that causes the large amount of memory to be used. > > Elijah tried to explain it and hsi explanation is valid for some of the > > implementations such as the use of ‘\’ and Fold. But I don’t understand in > > the setting of a global variable. > > > > Tom McGuire > > > > > > Henry Rich > > > > > > > > > > > > > > > On 4/6/2023 8:12 PM, Thomas McGuire wrote: > > >> > > >>> On Apr 3, 2023, at 1:28 PM, Jose Mario Quintana > > >>> <jose.mario.quint...@gmail.com> wrote: > > >>> > > >>> For what it is worth... > > >>> > > >>> kad=. >./@:(([ >. +)/\.) > > >>> > > >> OK that is worth a lot. But it’s not really the kadane algorithm. It > > >> inserts a kadane-like summation in between elements for each suffix > > >> generated. Kadane obtains the sum on a single pass through the array. > > >> However it’s an incredibly elegant refactoring of the brute force > > >> example and saves performing partial sums on the suffixes. > > >> > > >> I will say after figuring out exactly what is was doing it wasn’t > > >> intuitively obvious (to me anyway) that this should work. A naive view > > >> is that the largest sum in the example below is produced by the > > >> subsequence : > > >> 4 _1 2 1 > > >> > > >> So that sequence is never generated in Jose’s version of maximum sum. > > >> However due to it’s Kadane-like properties it is generating the highest > > >> sum for each suffix as if the partial sum started at the first element. > > >> In essence using a kadane-like approach in place of performing partial > > >> sums. Therefore in one pass on each suffix you find the greatest partial > > >> sum without having to calculate each partial sum individually. > > >> > > >> Jose’s version is fairly quick coming in at about double the processing > > >> time of Jose’s kadane”0 version and well below the brute force method of > > >> taking the partial sums of all the suffixes. > > >> > > >> Tom McGuire > > >> > > >>> For example, the maximum sum subarray, > > >>> > > >>> > > >>> kad _2 1 _3 4 _1 2 1 _5 4 > > >>> 6 > > >>> > > >>> > > >>> kad i.0 > > >>> __ > > >>> > > >>> > > >>> > > >>> > > >>> > > >>> On Mon, Apr 3, 2023 at 5:22 AM Elijah Stone <elro...@elronnd.net> wrote: > > >>> > > >>>> Your kadane routines are all bound to have asymptotically poor space > > >>>> usage > > >>>> under the present interpreter because >./ is not fused; so they all > > >>>> must > > >>>> create an intermediate result array whose length is the same as ranvec, > > >>>> and > > >>>> then apply >./. I think (haven't looked very closely) that locmax is > > >>>> the > > >>>> same > > >>>> as your final answer; so you could skip generating the intermediate > > >>>> array > > >>>> and > > >>>> simply look at locmax. Better yet, get rid of the global variable, and > > >>>> turn > > >>>> this all into a fold single (rather than fold multiple), where locmax > > >>>> becomes > > >>>> the running value. > > >>>> > > >>>> On Mon, 3 Apr 2023, Elijah Stone wrote: > > >>>> > > >>>>> 7!:2 '>./ 1 kadane\ ranvec' [ locmax=: __ > > >>>>> 1412576 > > >>>>> 7!:2 '>./ kadane"0 ranvec' [ locmax=: __ > > >>>>> 772960 > > >>>>> 7!:2 '>./ kadane"1 ranvec2' [ locmax=: __ [ ranvec2=: ,.ranvec > > >>>>> 1412960 > > >>>>> 1 <@$@$\ i.5 > > >>>>> ┌─┬─┬─┬─┬─┐ > > >>>>> │1│1│1│1│1│ > > >>>>> └─┴─┴─┴─┴─┘ > > >>>>> <@$@$"0 i.5 > > >>>>> ┌─┬─┬─┬─┬─┐ > > >>>>> │0│0│0│0│0│ > > >>>>> └─┴─┴─┴─┴─┘ > > >>>>> > > >>>>> 1 kadane\ y applies kadane to length-1 subsequences of y--that is, in > > >>>> this > > >>>>> case, vectors, which have length 1. Whereas kadane"0 applies to cells > > >>>> of y, > > >>>>> which have rank 0. More space is required to store a length-1 vector > > >>>> than a > > >>>>> scalar, since in the former case the length also needs to be stored. > > >>>>> > > >>>>> On Mon, 3 Apr 2023, Thomas McGuire wrote: > > >>>>> > > >>>>>> I’m wondering if someone with more J internals knowledge can tell me > > >>>> why my > > >>>>> Kadane algorithm implementation is taking up more memory than my naive > > >>>> brute > > >>>>> force method that generates all subsequences then sums them. The time > > >>>>> is > > >>>>> about what I would expect except for the FOLD operation. The time > > >>>>> space > > >>>> data > > >>>>> were generated on a MacBook Pro 2019 era I9 processor 2.3GHz. > > >>>>>> ranvec =: _5 + ?10000$10 > > >>>>>> kadane =: 3 : 'locmax =: y >. locmax + y' > > >>>>>> kadane1 =: 3 : 'locmax =: (] >. locmax&+) y' > > >>>>>> > > >>>>>> NB. first brute force implementation > > >>>>>> timespacex '>./(>./@(+/\))\.ranvec' > > >>>>>> 0.032948 396448 > > >>>>>> > > >>>>>> NB. fork version brute force > > >>>>>> timespacex '>./([: >./ (+/\))\. ranvec' > > >>>>>> 0.034208 396448 > > >>>>>> > > >>>>>> NB. first kadane implementation > > >>>>>> locmax =: __ > > >>>>>> timespacex '>./ 1 kadane\ranvec' > > >>>>>> 0.004034 1.41251e6 > > >>>>>> > > >>>>>> NB. fork kadane version > > >>>>>> locmax =: __ > > >>>>>> timespacex '>./ 1 kadane1\ranvec' > > >>>>>> 0.005267 1.41277e6 > > >>>>>> > > >>>>>> NB. new dnfs anonymous kadane function > > >>>>>> locmax =: __ > > >>>>>> timespacex '>./ 1 {{locmax =: y >. locmax + y}}\ranvec' > > >>>>>> 0.003625 1.41347e6 > > >>>>>> > > >>>>>> NB. fork dnfs anonymous kadane function > > >>>>>> locmax =: __ > > >>>>>> timespacex '>./ 1 {{locmax =: (] >. locmax&+) y}}\ranvec' > > >>>>>> 0.004776 1.41386e6 > > >>>>>> > > >>>>>> NB. Fold kadane implementation > > >>>>>> timespacex '>./ __ ] F:. ([ >. +) ranvec' > > >>>>>> 0.017393 905792 > > >>>>>> > > >>>>>> This was all part of a J Tutorial I have written > > >>>>> https://code.jsoftware.com/wiki/User:Tom_McGuire/KadaneAlgorithmTutorial > > >>>>>> Tom McGuire > > >>>>>> ---------------------------------------------------------------------- > > >>>>>> For information about J forums see > > >>>>>> http://www.jsoftware.com/forums.htm > > >>>>> ---------------------------------------------------------------------- > > >>>>> For information about J forums see http://www.jsoftware.com/forums.htm > > >>>>> > > >>>> ---------------------------------------------------------------------- > > >>>> For information about J forums see http://www.jsoftware.com/forums.htm > > >>>> > > >>> ---------------------------------------------------------------------- > > >>> For information about J forums see http://www.jsoftware.com/forums.htm > > >> ---------------------------------------------------------------------- > > >> For information about J forums see http://www.jsoftware.com/forums.htm > > > > > > ---------------------------------------------------------------------- > > > For information about J forums see http://www.jsoftware.com/forums.htm > > > > ---------------------------------------------------------------------- > > For information about J forums see http://www.jsoftware.com/forums.htm ---------------------------------------------------------------------- For information about J forums see http://www.jsoftware.com/forums.htm