This is an interesting approach: having the interpreter automatically apply
the typical memory-allocation optimisations rather than you having to do
them yourself.

That said, I think it needs to be a bit more clever. There are many cases
in which one appends a single element to a large array; if every one of
those caused a doubling of memory, the effect could be quite serious. On
the other hand, if the algorithm only kicks in for arrays larger than some
threshold 𝓁, and the increment is made by a factor 𝒇, where 𝓁 is on the
order of, say, 30 and 𝒇 perhaps 1.5 (both configurable through a
quad-parameter, of course), then it will be easy to experiment to come up
with good numbers for a given workload.

Regards,
Elias
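A minimal sketch of such a growth rule, in C++ since that is what the
interpreter itself is written in. Nothing here is GNU APL code: the name
next_capacity, the defaults (threshold 30, factor 1.5) and the idea that
they would be read from a quad-parameter are assumptions for illustration
only.

#include <cmath>
#include <cstddef>

// Decide how many cells to allocate when a vector that currently has room
// for `capacity` cells must hold `needed` cells. Small arrays (below the
// threshold) grow by exactly as much as is needed; larger arrays grow
// geometrically by `factor`.
std::size_t next_capacity(std::size_t needed, std::size_t capacity,
                          std::size_t threshold = 30, double factor = 1.5)
{
    if (needed <= capacity) return capacity;     // enough slack already
    if (needed < threshold) return needed;       // small array: exact fit
    std::size_t grown = (std::size_t)std::ceil(capacity * factor);
    return grown < needed ? needed : grown;      // never less than required
}

With a factor of 1.5 the worst-case slack is about half of the data rather
than all of it, at the price of more frequent reallocations than plain
doubling; that trade-off is exactly what a configurable parameter would let
one experiment with.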
On 1 July 2014 13:44, David B. Lamkins <dlamk...@gmail.com> wrote:

> Again: proofread, *then* post... Sorry. :(
>
> - To clarify: Part 2 is not really an implementation outline, but
> rather a discussion of techniques.
>
> - The test.apl file's test2 function doesn't implement "double and
> copy", but rather "preallocate all". This is a much simpler case.
> Here's the "double and copy" function:
>
> ⍝ test3 starts with an empty vector and doubles its size every time we
> ⍝ need more space. I think that this'll be somewhat worse than O(n)
> ⍝ but better than O(n^2).
>
> ∇test3 n
> x←0⍴''
> i←0
> a←0
> loop:
> →(i≥n)/0
> i←i+1
> →(i≤a)/mutate
> x←x,(1⌈⍴x)⍴''
> a←⍴x
> mutate:
> x[i]←'x'
> →loop
> ∇
>
> Informally (i.e. I didn't use a stopwatch or timing function), test3 is
> about 2/3 as fast as test2 with N=5000000. Keep in mind that there are
> a few more instructions in the test3 loop.
>
> On Mon, 2014-06-30 at 21:34 -0700, David Lamkins wrote:
> > This is a long post. Part one is background to motivate the
> > implementation outline in part two. If you already understand order
> > analysis (i.e. the asymptotic analysis of program runtime), then you
> > already understand everything I'll say in Part 1; you can skip to
> > Part 2.
> >
> > Part 1 - Motivation
> >
> > So long as you can work with data of definite size, APL is able to
> > allocate the correct amount of storage in time proportional to the
> > size of the storage (assuming that allocation must be accompanied by
> > initialization). That's pretty good. (There might be OS-dependent
> > tricks to finesse the initialization time, but that's not the
> > subject of this note...)
> >
> > If you have to work with data of indeterminate size and store that
> > data in an APL vector or array, that's where things get tricky.
> >
> > The classic example of this is accumulating data elements from an
> > input stream of indeterminate length:
> >
> > 0. Initialize an empty vector; this will, while the program is
> > running, accumulate data elements.
> >
> > 1. If there's no more data to read, we're done. The vector contains
> > the result.
> >
> > 2. Read a data element from the input stream and append the element
> > to the vector.
> >
> > 3. Go to step 1.
> >
> > As this program runs, the vector will take on lengths in the series
> > 0, 1, 2, 3, 4, 5, ... Each time the vector expands, all of the data
> > must be copied to the new vector before appending the next element.
> > If you look at copying the data as the "cost" of this algorithm,
> > you'll see that this is a really inefficient way to do things: the
> > i-th append has to copy all i-1 elements that are already there. The
> > runtime is O(n^2). In other words, the runtime grows as the square
> > of the input size.
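As a concrete illustration of that cost, a small stand-alone model of the
loop above that counts element copies rather than measuring time. It is
neither APL nor interpreter code, and the element count of 50000 is
arbitrary.

#include <cstdio>

// Model of "grow by exactly one element per append": every append
// reallocates and therefore copies every element stored so far.
int main()
{
    const long long n = 50000;      // number of appends (arbitrary)
    long long length = 0;           // current vector length
    long long copies = 0;           // total element copies performed
    for (long long i = 0; i < n; ++i) {
        copies += length;           // reallocation copies the whole vector
        ++length;                   // then the new element is stored
    }
    std::printf("%lld appends cost %lld element copies (about n*n/2 = %lld)\n",
                n, copies, n * n / 2);
    return 0;
}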
> > One way to optimize this program is to extend the vector by a fixed
> > amount every time you need more space for the next element. IOW,
> > rather than making space for one more element, you'd make space for
> > a hundred or a thousand elements; by doing so, you reduce the number
> > of times you must copy all the existing data to an expanded vector.
> > This seems like a smart approach, and it definitely gives you better
> > performance for a data set of a given size. However, looking at the
> > asymptotic limit, this is still an O(n^2) algorithm: the copies
> > merely happen once per chunk instead of once per element.
> >
> > So how do you improve the runtime of this algorithm...? A common
> > approach is to double the size of the vector each time you run out
> > of space. The size of the vector will increase pretty rapidly.
> > Intuitively, you're giving yourself twice as much time to run a
> > linear algorithm each time you run out of space for a new element.
> > You'll still have to stop and copy, but at increasingly infrequent
> > intervals; each element ends up being copied only a constant number
> > of times on average, so the total work is linear in the amortized
> > sense, far better than O(n^2).
> >
> > What's the downside? Your program will eat more memory than it
> > really needs. In fact, depending upon where you run out of input,
> > the program may have allocated up to twice as much memory as it
> > really needs.
> >
> > Here's where I argue that trading memory for time is a good thing...
> > Memory is cheap. If you're going to be solving a problem that taxes
> > physical memory on a modern computer, you (or your employer) can
> > almost certainly afford to double your machine's memory. Also,
> > memory is slow: a modern processor can execute lots and lots of
> > cycles in the time it takes to read or write one word of data in
> > memory. This is why copies dominate the runtime of many algorithms.
> >
> > The trick in making this "double the size before copying" algorithm
> > work is to somehow manage that overallocated space. That's the
> > subject of Part 2.
> >
> > Part 2 - Double, Copy, Trim
> >
> > I haven't yet looked at how all this might fit into GNU APL. Some of
> > my assumptions may be wrong. This section is intended to provoke
> > discussion, not to serve as a design sketch.
> >
> > The "double the size before copying" part of the algorithm seems
> > simple enough. (Keep in mind that I'll use this phrase repeatedly
> > without also mentioning the necessary test for having run out of
> > allocated-but-unused space.) The only tricky parts are (a) knowing
> > when to do it, (b) adding a new bit of header information that keeps
> > track of allocated size as opposed to actual size, and (c) handling
> > the new case of mutating an existing variable (specifically, doing
> > an update in place over a previously allocated-but-unused element)
> > rather than allocating a fresh copy of precisely the right size and
> > copying the entire result.
> >
> > Why is "knowing when to do it" tricky? I'm assuming that GNU APL
> > evaluates an expression of the form a←a,x by computing the size of
> > a,x before allocating a new a and copying all the data. We could
> > blindly double the size of a each time we need to make more room; at
> > worst, our programs would use about twice as much memory. But we
> > might be able to do better...
> >
> > What we need is an inexpensive way of predicting that a vector may
> > be subject to repeated expansion.
> >
> > We could use static analysis of the program text: every time we see
> > an expression of the form a←a,... we could choose the "double the
> > size before copying" technique. We might even strengthen the test by
> > confirming that the assignment appears inside a loop.
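Pulling points (a) to (c) together, a minimal sketch of what that
bookkeeping could look like, assuming nothing about GNU APL's actual value
representation: CellVector, its field names, and the fixed doubling factor
are hypothetical, chosen only to show how a←a,x could become an update in
place whenever reserved space is available.

#include <cstdlib>
#include <cstddef>

// Hypothetical vector header: `length` is what ⍴ would report,
// `capacity` is how many cells are actually allocated.
struct CellVector {
    char*       data;
    std::size_t length;
    std::size_t capacity;
};

// Append one element, reusing reserved space when possible and growing
// geometrically (here: doubling) when the reserved space runs out.
// Allocation-failure handling is omitted for brevity.
void append_in_place(CellVector& v, char element)
{
    if (v.length == v.capacity) {                       // no slack left
        std::size_t new_cap = v.capacity ? 2 * v.capacity : 1;
        v.data = (char*)std::realloc(v.data, new_cap);  // the expensive copy
        v.capacity = new_cap;
    }
    v.data[v.length++] = element;                       // update in place
}

Whether the interpreter ever reserves more than it strictly needs in the
first place is the "knowing when to do it" question; the static-analysis
heuristic above and the access-pattern idea that follows are two ways of
deciding that per variable.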
> > Alternatively, we might somehow look at access patterns. If the
> > variable is infrequently updated (this assumes that we can have a
> > timestamp associated with the variable's header), then we might take
> > a chance and do a minimal reallocation (i.e. just enough space for
> > the new element). On the other hand, if we're updating that variable
> > frequently (as we'd do in a tight loop), then it'd make sense to
> > apply the "double the size before copying" technique.
> >
> > That's about all I can say for now about the double and copy parts.
> > What about trim?
> >
> > Let's assume that we'd like to do better than to potentially double
> > the runtime size of our GNU APL programs. What techniques might we
> > use to mitigate that memory growth?
> >
> > We might set an upper bound on "overhead" memory allocation. (In
> > this case, the overhead is the allocated-but-unused space reserved
> > in variables that might be subject to growth.) That seems like an
> > attractive approach until you consider that whatever threshold you
> > select for when to stop reserving extra space, you could reach that
> > threshold before hitting the one part of your program that would
> > really benefit from the optimization. We really need to make this
> > work without having to tweak interpreter parameters on a per-program
> > and per-run basis.
> >
> > One possible approach might be to "garbage-collect" reserved space
> > upon some trigger condition (maybe memory pressure, maybe time,
> > maybe something else...). The downside is that this can fragment the
> > heap. On the other hand, we might get away with it thanks to VM and
> > locality of access; it may take some experiments to work out the
> > best approach.
> >
> > Another possible approach might be to dynamically prioritize which
> > variables are subject to reservation of extra space upon expansion,
> > based upon frequency of updates or how recently the variable was
> > updated. Again, experimentation may be necessary.
> >
> > Finally, the same "double the size before copying" technique should
> > be applied to array assignments (e.g. a[m]←a[m],x).
> >
> > Appendix
> >
> > The attached test.apl file contains two functions: test1 and test2.
> > (I'm tempted to break into Dr. Seuss rhyme, but I'll refrain...)
> >
> > The test1 function expands a vector by one element each time an
> > element is added.
> >
> > The test2 function doubles the size of the vector each time the
> > current allocation fills up.
> >
> > For small N, you'd be hard pressed to notice a difference between
> > the runtimes of test1 and test2. For example, try
> >
> >       test1 5
> >       test2 5
> >       test1 50
> >       test2 50
> >       ... etc.
> >
> > At some point, you'll see a dramatic difference. On my machine,
> > there's a noticeable difference at 5000. At 50000, the difference is
> > stunning.
> >
> > --
> > "The secret to creativity is knowing how to hide your sources."
> >    Albert Einstein
> >
> > http://soundcloud.com/davidlamkins
> > http://reverbnation.com/lamkins
> > http://reverbnation.com/lcw
> > http://lamkins-guitar.com/
> > http://lamkins.net/
> > http://successful-lisp.com/
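For the trim step discussed in Part 2 above, a minimal sketch of the
"garbage-collect reserved space" idea: release slack, least recently
updated variables first, once the total reserved-but-unused space passes a
configurable bound. The Slot record, the last_update field, and the
max_slack bound are all hypothetical, not GNU APL's actual workspace
layout.

#include <algorithm>
#include <cstdlib>
#include <cstddef>
#include <vector>

// One growable variable in the workspace (hypothetical layout).
struct Slot {
    char*       data;
    std::size_t length;        // cells in use
    std::size_t capacity;      // cells allocated
    unsigned    last_update;   // bumped on every append (pseudo-timestamp)
};

// Release reserved space, coldest variables first, until the total slack
// drops below `max_slack`.
void trim_reserved(std::vector<Slot>& workspace, std::size_t max_slack)
{
    std::size_t slack = 0;
    for (const Slot& s : workspace) slack += s.capacity - s.length;
    if (slack <= max_slack) return;                 // under the bound: nothing to do

    // Visit the least recently updated variables first.
    std::vector<Slot*> by_age;
    for (Slot& s : workspace) by_age.push_back(&s);
    std::sort(by_age.begin(), by_age.end(),
              [](const Slot* a, const Slot* b) { return a->last_update < b->last_update; });

    for (Slot* s : by_age) {
        if (slack <= max_slack) break;
        slack -= s->capacity - s->length;
        if (s->length) s->data = (char*)std::realloc(s->data, s->length);
        else { std::free(s->data); s->data = nullptr; }
        s->capacity = s->length;                    // back to an exact fit
    }
}

Shrinking with realloc can fragment the heap, as the post notes; whether
that matters in practice, or is hidden by virtual memory and locality, is
something only experiments can settle.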