Re: [Boston.pm] Perl and recursion
On Tue, Apr 09, 2013 at 10:24:45PM -0400, Bob Rogers wrote:

> > map should produce better code and does read better, but it also
> > iterates the entire list and you can't break out of it . . .
>
> Not quite true, as you can goto a label outside of the block:
>
>     sub find_odd {
>         # Return the first odd argument.
>         my (@values) = @_;
>         my $result;
>         map { $result = $_ and goto LAST if $_ & 1; } @values;
>     LAST:
>         return $result;
>     }

Indeed, you are correct. However, when you break out of a map in this
way, the map returns an empty list, regardless of how many results had
been previously collected. Because your example handles a single result,
it is not clear that, in fact, any return values MUST be collected
similarly. Unless, of course, map is used simply for side effects.

> It's hard to think of a case where this would be better than a simple
> loop, which is both smaller and more readable. However, this technique
> also gives you an early exit when mapping over a tree recursively.

Even then, the need to handle collected values manually would seem to
obviate any advantage in doing so.

This discussion has confirmed, for me, that writing code is, in many
ways, similar to writing prose. Though there may initially appear to be
many ways to communicate, thoughtfulness, good taste and circumstance
quickly show that there are only a few good ways to do so.

-Gyepi

___
Boston-pm mailing list
Boston-pm@mail.pm.org
http://mail.pm.org/mailman/listinfo/boston-pm
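The "simple loop" alternative mentioned above can be sketched like this
(my own illustration, not code from the thread); List::Util's first
gives the same early exit without the goto:

```perl
use strict;
use warnings;
use List::Util qw(first);

# Return the first odd argument, stopping as soon as one is found.
sub find_odd_loop {
    my (@values) = @_;
    for my $v (@values) {
        return $v if $v & 1;    # same bitwise oddness test as the map version
    }
    return undef;
}

# Equivalent via List::Util::first, which also stops at the first match.
sub find_odd_first {
    my (@values) = @_;
    return first { $_ & 1 } @values;
}
```

Both versions avoid the collected-values problem entirely, since they
never pretend to be building a list in the first place.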
Re: [Boston.pm] Perl and recursion
From: Gyepi SAM gy...@praxis-sw.com
Date: Wed, 10 Apr 2013 13:54:06 -0400

> On Tue, Apr 09, 2013 at 10:24:45PM -0400, Bob Rogers wrote:
> > Not quite true, as you can goto a label outside of the block:
> > [first example omitted]
>
> Indeed, you are correct. However, when you break out of a map in this
> way, the map returns an empty list, regardless of how many results had
> been previously collected.

Actually, the map does not return at all, as the following variation
shows:

    sub find_odd {
        my (@values) = @_;
        my $result;
        my @results = qw(nothing);
        @results = map { $result = $_ and goto LAST if $_ & 1; $_; } @values;
    LAST:
        warn "got ", join(' ', @results), "\n";
        return $result;
    }

You will see a "got nothing" warning when @values has an odd number, so
that the goto is taken. In that case, map never returns, and the
assignment to @results is never made.

> Because your example handles a single result, it is not clear that, in
> fact, any return values MUST be collected similarly. Unless, of
> course, map is used simply for side effects.

I would hope (but don't know how to check) that Perl is smart enough to
notice that the map in my original example is in a void context, and
won't bother to store any return values from the block. Is that what
you mean?

> > It's hard to think of a case where this would be better than a simple
> > loop, which is both smaller and more readable. However, this
> > technique also gives you an early exit when mapping over a tree
> > recursively.
>
> Even then, the need to handle collected values manually would seem to
> obviate any advantage in doing so.

Over the years, I have seen many applications for mapping over trees
(and also written a few), and most of them just produce side effects,
either to the tree or externally. Those that don't tend to be of the
nature "collect (or count) all nodes that satisfy X", for which a
non-local exit would defeat the purpose. Those that do require a
non-local exit are of the form "find the *first* X" or "return true if
*any* X", where the burden of handling a scalar return is minimal and
the benefit from being able to skip the remainder of the tree is clear.
In other words, I believe the "collect values" cases and the "nonlocal
exit" cases are effectively disjoint. I've never found the need for
(e.g.) a "collect the first N nodes that satisfy X" routine.

> This discussion has confirmed, for me, that writing code is, in many
> ways, similar to writing prose. Though there may initially appear to
> be many ways to communicate, thoughtfulness, good taste and
> circumstance quickly show that there are only a few good ways to do
> so.
>
> -Gyepi

To the extent that there are vastly more ways to write bad (code|prose)
than good, I heartily agree. But I submit that that still leaves an
enormous field of possibilities. Of course, I always try to write the
best (code|prose) I can, subject to constraints, of course, and it
often seems to me that there's only a few acceptable solutions, maybe
only one. But I have strong preferences in coding style, so I suspect
that's just my biases showing.

Sorry to come back at you with so much disagreement; that was not my
intention.

-- Bob
Re: [Boston.pm] Perl and recursion
On Wed, Apr 10, 2013 at 1:53 PM, Bob Rogers rogers-...@rgrjr.dyndns.org wrote:

> From: Gyepi SAM gy...@praxis-sw.com
> Date: Wed, 10 Apr 2013 13:54:06 -0400
>
> > On Tue, Apr 09, 2013 at 10:24:45PM -0400, Bob Rogers wrote: [...]
> >
> > Because your example handles a single result, it is not clear that,
> > in fact, any return values MUST be collected similarly. Unless, of
> > course, map is used simply for side effects.
>
> I would hope (but don't know how to check) that Perl is smart enough
> to notice that the map in my original example is in a void context,
> and won't bother to store any return values from the block. Is that
> what you mean?

Current versions of Perl are sufficiently smart to not build a return
list. You can verify by creating objects with destructors inside the
map and seeing when they get destroyed.

> > It's hard to think of a case where this would be better than a
> > simple loop, which is both smaller and more readable. However, this
> > technique also gives you an early exit when mapping over a tree
> > recursively.
> >
> > Even then, the need to handle collected values manually would seem
> > to obviate any advantage in doing so.
>
> Over the years, I have seen many applications for mapping over trees
> (and also written a few), and most of them just produce side effects,
> either to the tree or externally. Those that don't tend to be of the
> nature "collect (or count) all nodes that satisfy X", for which a
> non-local exit would defeat the purpose. Those that do require
> non-local exit are of the form "find the *first* X" or "return true
> if *any* X", where the burden of handling a scalar return is minimal
> and the benefit from being able to skip the remainder of the tree is
> clear.

Many also are of the form, "Start where you left off, give me the next
one." That gets you into internal versus external iterators.

> In other words, I believe the "collect values" cases and the
> "nonlocal exit" cases are effectively disjoint. I've never found the
> need for (e.g.) a "collect the first N nodes that satisfy X" routine.

Returning paginated results in a web page.

> > This discussion has confirmed, for me, that writing code is, in
> > many ways, similar to writing prose. Though there may initially
> > appear to be many ways to communicate, thoughtfulness, good taste
> > and circumstance quickly show that there are only a few good ways
> > to do so.
> >
> > -Gyepi
>
> To the extent that there are vastly more ways to write bad
> (code|prose) than good, I heartily agree. But I submit that that
> still leaves an enormous field of possibilities. Of course, I always
> try to write the best (code|prose) I can, subject to constraints, of
> course, and it often seems to me that there's only a few acceptable
> solutions, maybe only one. But I have strong preferences in coding
> style, so I suspect that's just my biases showing.

I absolutely agree. There is not just one right way.

> Sorry to come back at you with so much disagreement; that was not my
> intention.
>
> -- Bob
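Ben's destructor suggestion could be sketched roughly as follows (my own
illustration, not code from the thread; exact destruction timing is an
implementation detail, so treat the printed counts as an experiment
rather than a guarantee):

```perl
use strict;
use warnings;

my $destroyed = 0;
sub Tracer::DESTROY { $destroyed++ }

# Void context: the block's results are discarded, so every Tracer
# should be gone by the time the statement finishes.
map { bless {}, 'Tracer' } 1 .. 3;
my $after_void = $destroyed;

# List context: the blessed objects survive inside @kept.
my @kept = map { bless {}, 'Tracer' } 1 .. 3;
my $after_list = $destroyed;

print "destroyed after void map: $after_void, after list map: $after_list\n";
```

If the void-context map is optimized as described, the first count will
already include all three objects while the list-context ones stay alive
in @kept.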
Re: [Boston.pm] Perl and recursion
From: Ben Tilly bti...@gmail.com
Date: Wed, 10 Apr 2013 14:09:19 -0700

> On Wed, Apr 10, 2013 at 1:53 PM, Bob Rogers rogers-...@rgrjr.dyndns.org wrote:
> > I would hope (but don't know how to check) that Perl is smart
> > enough to notice that the map in my original example is in a void
> > context, and won't bother to store any return values from the
> > block . . .
>
> Current versions of Perl are sufficiently smart to not build a return
> list. You can verify by creating objects with destructors inside the
> map and seeing when they get destroyed.

Good idea; I doubt I would have thought of that.

> > Over the years, I have seen many applications for mapping over
> > trees (and also written a few), and most of them just produce side
> > effects, either to the tree or externally. Those that don't tend to
> > be of the nature "collect (or count) all nodes that satisfy X", for
> > which a non-local exit would defeat the purpose. Those that do
> > require non-local exit are of the form "find the *first* X" or
> > "return true if *any* X", where the burden of handling a scalar
> > return is minimal and the benefit from being able to skip the
> > remainder of the tree is clear.
>
> Many also are of the form, "Start where you left off, give me the
> next one." That gets you into internal versus external iterators.

That *is* an iterator, or the specification of one, but we are talking
about mapping over trees. If you have a map_tree function, then
iterators are unnecessary, besides being harder to implement [1]. To
be fair, though, I've never used iterators per se, so that colors my
experience. But I also think it's true that I've used trees more in
the more powerful languages (chiefly Common Lisp and Perl) because
trees are harder to work with in less-powerful languages (e.g. Pascal
and Fortran), largely because of the lack of functional programming
tools.

> > In other words, I believe the "collect values" cases and the
> > "nonlocal exit" cases are effectively disjoint. I've never found
> > the need for (e.g.) a "collect the first N nodes that satisfy X"
> > routine.
>
> Returning paginated results in a web page.

Personally, I have never encountered this sort of partial results
problem where the data were stored in a tree. The database engine for
a Web app may well have to deal with trees, if the underlying tables
are tree-structured. But there the problem is "send to the client the
first N rows that satisfy X" -- since N could be huge, I suspect that
would have to be written in nonlocal-exit style, rather than by
collecting the rows before sending them.

-- Bob

[1] Henry G. Baker, "Iterators: Signs of Weakness in Object-Oriented
Languages", 1992. http://home.pipeline.com/~hbaker1/Iterator.html
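The "find the *first* X" shape discussed above might look like this
over a simple hash-based tree (my own sketch; the node layout is
invented for the example):

```perl
use strict;
use warnings;

# Each node is { value => ..., children => [...] } -- a made-up layout.
# Returns the first node (depth-first) whose value satisfies the
# predicate, skipping the rest of the tree via early return.
sub find_first_node {
    my ($node, $pred) = @_;
    return $node if $pred->($node->{value});
    for my $child (@{ $node->{children} || [] }) {
        my $found = find_first_node($child, $pred);
        return $found if $found;    # non-local exit: prune remaining siblings
    }
    return undef;
}

my $tree = {
    value    => 2,
    children => [
        { value => 4, children => [ { value => 7 } ] },
        { value => 9 },
    ],
};
my $hit = find_first_node($tree, sub { $_[0] & 1 });
print "first odd value: $hit->{value}\n";   # 7, found depth-first
```

The scalar return keeps the burden of handling results minimal, exactly
as described: either you get a node or you get undef.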
Re: [Boston.pm] Perl and recursion
From: Gyepi SAM gy...@praxis-sw.com
Date: Mon, 8 Apr 2013 15:02:55 -0400

> On Mon, Apr 08, 2013 at 10:54:47AM -0400, Ricker, William wrote:
> > Have we mentioned the other solution? The implicit loops of
> > map {block} list; should produce more optimized code and reads
> > better too.
>
> map should produce better code and does read better, but it also
> iterates the entire list and you can't break out of it . . .

Not quite true, as you can goto a label outside of the block:

    sub find_odd {
        # Return the first odd argument.
        my (@values) = @_;
        my $result;
        map { $result = $_ and goto LAST if $_ & 1; } @values;
    LAST:
        return $result;
    }

It's hard to think of a case where this would be better than a simple
loop, which is both smaller and more readable. However, this technique
also gives you an early exit when mapping over a tree recursively.

-- Bob Rogers
   http://www.rgrjr.com/
Re: [Boston.pm] Perl and recursion
On Fri, Apr 05, 2013 at 09:20:16PM -0400, Jerrad Pierce wrote:

> Regardless, my understanding was that although perl's sub calls are
> somewhat expensive ...

I think it's *method calls* that are expensive, not subroutine calls.
The reason is that the subroutine that a method resolves to can change
at runtime.

-- 
David Cantrell | top google result for "topless karaoke murders"

There's a hole in my bucket, dear Liza, dear Liza.
WHAT MAKES YOU SAY THERE IS A HOLE IN YOUR BUCKET?
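A rough way to see the difference David describes, using the core
Benchmark module (my own sketch; absolute numbers vary by machine and
perl version):

```perl
use strict;
use warnings;
use Benchmark qw(cmpthese);

package Counter;
sub new  { bless { n => 0 }, shift }
sub bump { $_[0]{n}++ }    # method: dispatched through the package at runtime

package main;

sub bump_sub { $_[0]{n}++ }    # plain sub: resolved at compile time

my $obj = Counter->new;
cmpthese(200_000, {
    method_call => sub { $obj->bump },
    sub_call    => sub { bump_sub($obj) },
});
```

On the perls I have tried, the plain sub comes out ahead, consistent
with method resolution carrying extra runtime work even with perl's
method cache.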
Re: [Boston.pm] Perl and recursion
On Fri, Apr 5, 2013 at 8:22 PM, Jerrad Pierce belg4...@pthbb.org wrote:

> > at each level of recursion. What seems to be the case though is
> > that when we start going back up the stack that memory doesn't seem
> > to be released at each pop. If, say, at max depth 500MB of RAM has
> > been allocated I don't see that released at any point except for
> > when perl exits and then of course it is all released at once. Or
> > at least that is what seems to happen.
>
> Perl doesn't release memory, it keeps it for reallocation. This is
> (was?) one of the issues one had to be aware of with mod_perl.
>
> Normally the memory is not given back, but Perl does make a good
> faith effort to try to give it back if it is convenient.
> http://www.perlmonks.org/?node_id=746953 has a couple of examples of
> where it can happen. However this is not an issue in practice very
> often.

After making sure the recursive implementation was correct I re-did it
iteratively and did some comparisons with Benchmark. As expected, the
recursive implementation was slower (output from cmpthese()):

    recursive  6.98    --   -42%
    iterative  4.04   73%     --

More amusing though was the memory usage. Looking at the RSS values
for the perl process, the iterative solution used ~9MB of RAM and the
recursive solution came in at about 350MB. This was for a recursive
depth of 4000, which represents a very large, but not wholly
unreasonable, input.
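The recursive-to-iterative rewrite described above follows a standard
pattern: replace the call stack with an explicit array. A generic
sketch of the transformation (my own, not the poster's actual
algorithm):

```perl
use strict;
use warnings;

# Recursive: perl holds a stack frame (plus lexicals) for every level,
# and none of that is popped until the deepest call returns.
sub sum_recursive {
    my ($n) = @_;
    return 0 if $n == 0;
    return $n + sum_recursive($n - 1);
}

# Iterative: a single frame; the pending work lives in an ordinary
# array that we push to and pop from ourselves.
sub sum_iterative {
    my ($n) = @_;
    my $total = 0;
    my @stack = ($n);
    while (@stack) {
        my $v = pop @stack;
        next if $v == 0;
        $total += $v;
        push @stack, $v - 1;
    }
    return $total;
}
```

A running sum doesn't really need a stack at all, but the same push/pop
skeleton carries over directly to tree walks, where the pushed items
are child nodes rather than counters.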
Re: [Boston.pm] Perl and recursion
Are languages that have mark and sweep garbage collection better about
returning memory to the system than languages like Perl that use
reference count garbage collection?

Also, if you really want to see how and why Perl is using memory, take
a look at Tim Bunce's Devel::SizeMe:
http://search.cpan.org/~timb/Devel-SizeMe-0.06/lib/Devel/SizeMe.pm

--
David

On Sat, Apr 6, 2013 at 12:20 AM, Conor Walsh c...@adverb.ly wrote:

> On 4/6/2013 12:01 AM, Adam Russell wrote:
> > Ah! Ok, so maybe I was confused about this. Even if I set the last
> > reference to an object to undef perl will keep the memory until
> > exit? The high water mark for memory usage never goes down? Well,
> > that is fine I suppose, it isn't like this process will be really
> > all that long lived. It also means that the iterative form of this
> > algorithm will use all that much less ram, I think.
>
> Yeah, this is how the perl executable works. Undef'd (well,
> garbage-collected) stuff becomes free to be re-used by additional
> stuff within that particular instance of perl, but never to the OS.
> Uri's point about rarely-used pages getting swapped to disk by the OS
> stands, though.
>
> -C.
Re: [Boston.pm] Perl and recursion
David Larochelle writes:

> Are languages that have mark and sweep garbage collection better
> about returning memory to the system than languages like Perl that
> use reference count garbage collection?

Not in practice. While that potential is there, it is not utilized.
Most programs that allocate then free a lot of memory will either
terminate in the near future or will reallocate that memory again in
the near future. One can imagine two programs on a system which took
turns allocating then freeing a lot of memory, such that it would make
sense for each of them to return the memory to the host for it to
allocate to the other -- but since such programs are rare, the chances
of having two of them that happen to synchronize thusly is even lower
-- it's just not worth the effort to make it happen -- particularly
because there's another approach which makes more sense and permits
applications to exercise fine control over memory use.

Programs that want to do this kind of thing -- alone or cooperatively
-- should use mmap (UNIX) or CreateFileMapping (Windows). This
provides the application with a virtual address mapping that is backed
(generally) by a file on disk. When the application wants to release
that memory, it merely unmaps the file. At the point where actual
content is read or written to the file, physical RAM pages are
allocated and used, but they are released with the file itself. This
is generally the technique programs should use when they want control
over this kind of thing -- mmap also makes it easy for one program to
map-in what another program has mapped-out, thus instantly sharing a
complex data structure (using offsets rather than absolute addressing
for internal pointers). If two programs both mmap a file at the same
time, they can share the physical memory and thus communicate with
each other that way (generally also using semaphores for locking).

There are a number of options which exist other than simply mapping a
regular file -- it is also possible to map directly against swap, and
to control how the mapped pages are addressed and cached and so forth.
This is essentially a user-space mechanism to tell the kernel exactly
how certain segments of memory need to be handled.
Re: [Boston.pm] Perl and recursion
On Sat, Apr 06, 2013 at 09:11:55AM -0400, David Larochelle wrote:

> Are languages that have mark and sweep garbage collection better
> about returning memory to the system than languages like Perl that
> use reference count garbage collection?

Not usually. On Unix systems, at least, memory is acquired from (and
released to) the system using sbrk. malloc(3) and free(3) are actually
front ends to sbrk. free marks memory as available to be reallocated,
but only returns memory to the system when there is enough of it at
the high end of the heap. Depending on usage patterns, this is
unlikely to happen naturally unless memory is first compacted. That is
an additional step beyond garbage collection, and not many languages
do it AFAIK. I know Lisp can do this, but only to avoid fragmentation;
it does not return the memory.

It is generally not necessary to return memory to the system because
most modern operating systems provide a virtual view of memory, so
every program appears to have access to the entire memory space. Most
of the time this works fine. Programs that allocate too much get
swapped out when necessary. If you don't want that for your program,
you can always use mlock(2) and friends.

-Gyepi
Re: [Boston.pm] Perl and recursion
On Sat, Apr 6, 2013 at 12:20 AM, Conor Walsh c...@adverb.ly wrote:

> > Ah! Ok, so maybe I was confused about this. Even if I set the last
> > reference to an object to undef perl will keep the memory until
> > exit? The high water mark for memory usage never goes down? Well,
> > that is fine I suppose, it isn't like this process will be really
> > all that long lived. It also means that the iterative form of this
> > algorithm will use all that much less ram, I think.
>
> Yeah, this is how the perl executable works. Undef'd (well,
> garbage-collected) stuff becomes free to be re-used by additional
> stuff within that particular instance of perl, but never to the OS.
> Uri's point about rarely-used pages getting swapped to disk by the OS
> stands, though.

As Uri said, this is not peculiar to perl. When the C heap grows, it
never shrinks. C programs can release non-heap, non-stack special
segments they may have mapped previously, but those are peculiar
techniques. The only difference with Java (whose interpreter, like
perl's, is built on C) is that there's a start flag to limit max heap
growth; it still doesn't give the memory back.

As long as the heap doesn't get too fragmented (Java's GC moves
things, perl's refcounting doesn't) and there's no significant
leakage, this isn't a problem. But it *can* be a problem. To avoid
fragmentation pushing perpetual growth, perl sometimes needs a C-like
pre-allocation pool pattern (or dedicated heap) for the big objects
and buffers.

-- Bill @n1vux bill.n1...@gmail.com
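The pre-allocation pattern Bill mentions can be as simple as growing a
container to its expected maximum once, up front, so the large buffer
is allocated in one step instead of through repeated incremental
growth. A minimal sketch (my own illustration; perl generally keeps the
underlying storage attached to the variable after it is emptied, though
that is an implementation detail rather than a documented guarantee):

```perl
use strict;
use warnings;

# Pre-extend an array to 10_000 slots in one step; emptying it later
# leaves the variable ready for reuse without the growth churn.
my @pool;
$#pool = 9_999;
@pool = ();

# Pre-grow a string buffer the same way: one 64KB allocation up front,
# then reset the contents to empty before reuse.
my $buf = ' ' x 65_536;
$buf = '';

print "pool: ", scalar(@pool), " elements; buffer length: ", length($buf), "\n";
```

The point is to pay the allocation cost once, at a predictable time,
rather than letting many small reallocations fragment the heap.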
Re: [Boston.pm] Perl and recursion
On Fri, Apr 5, 2013 at 8:22 PM, Jerrad Pierce belg4...@pthbb.org wrote: at each level of recursion. What seems to be the case though is that when we start going bac up the stack that memory doesn't seem to be released at each pop. If, say, at max depth 500mb of ram has been allocated I don't see that released at any point except for when perl exits and then of course it is all released at once. Or at least that is what seems to happen. Perl doesn't release memory, it keeps it for reallocation. This is (was?) one of the issues one had to be aware of with mod_perl. Normally the memory is not given back, but Perl does make a good faith effort to try to give it back if it is convenient. http://www.perlmonks.org/?node_id=746953 has a couple of examples of where it can happen. However this is not an issue in practice very often. ___ Boston-pm mailing list Boston-pm@mail.pm.org http://mail.pm.org/mailman/listinfo/boston-pm
Re: [Boston.pm] Perl and recursion
On 04/06/2013 10:30 AM, John Redford wrote:

> David Larochelle writes:
> > Are languages that have mark and sweep garbage collection better
> > about returning memory to the system than languages like Perl that
> > use reference count garbage collection?
>
> Programs that want to do this kind of thing -- alone or cooperatively
> -- should use mmap (UNIX) or CreateFileMapping (Windows). This
> provides the application with a virtual address mapping that is
> backed (generally) by a file on disk. When the application wants to
> release that memory, it merely unmaps the file. At the point where
> actual content is read or written to the file, physical RAM pages are
> allocated and used, but they are released with the file itself. This
> is generally the technique programs should use when they want control
> over this kind of thing -- mmap also makes it easy for one program to
> map-in what another program has mapped-out, thus instantly sharing a
> complex data structure (using offsets rather than absolute addressing
> for internal pointers). If two programs both mmap a file at the same
> time, they can share the physical memory and thus communicate with
> each other that way (generally also using semaphores for locking).

mmap won't help with returning ram to the system. there is still a
malloc/free involved and that will come from the same place as the
rest of ram. the only way to return virtual ram is by lowering the top
virtual address with brk()/sbrk() calls. and that needs contiguous ram
at the top of the address space. and that means freed ram must be
collected into contiguous blocks. that is nothing to do with
mark/sweep vs ref counting or any other type of gc. mark sweep will
use a large contiguous area to copy active data but it needs the other
buffer to do it again. it won't ever want to free that ram. the only
way this ever happens is the custom code that manages the ram blocks
by calling brk() directly. it is too tricky to do in most cases.

and with virtual ram it doesn't really matter as i said, it will be
swapped out until the process dies if you don't use it again.

uri

> There are a number of options which exist other than simply mapping a
> regular file -- it is also possible to map directly against swap, and
> to control how the mapped pages are addressed and cached and so
> forth. This is essentially a user-space mechanism to tell the kernel
> exactly how certain segments of memory need to be handled.
Re: [Boston.pm] Perl and recursion
Uri Guttman writes:

> mmap won't help with returning ram to the system. there is still a
> malloc/free involved and that will come from the same place as the
> rest of ram. the only way to return virtual ram is by lowering the
> top virtual address with brk()/sbrk() calls. and that needs
> contiguous ram at the top of the address space. and that means freed
> ram must be collected into contiguous blocks. that is nothing to do
> with mark/sweep vs ref counting or any other type of gc. mark sweep
> will use a large contiguous area to copy active data but it needs the
> other buffer to do it again. it won't ever want to free that ram. the
> only way this ever happens is the custom code that manages the ram
> blocks by calling brk() directly. it is too tricky to do in most
> cases.
>
> and with virtual ram it doesn't really matter as i said, it will be
> swapped out until the process dies if you don't use it again.

At the risk of again poking this prickly mailing list... this is
simply wrong. Certainly I cannot speak for every variation of every
operating system ever written, and your understanding of this may be
based on some system that actually worked this way, unlikely though
that seems. But mmap does not perform any malloc/free -- nor does it
use brk/sbrk to extend the heap boundary.

At any given point in time, every process has a _virtual_ address
space which is essentially from 0 to MAXINT/MAXLONG/MAXLONGLONG/etc.,
depending on the specific OS -- that is, the virtual address space is
every possible location that a pointer can hold, even if nothing is at
that address. To keep things simple, processes have a region of memory
called the heap, which is basically one big... heap. It's a block of
addresses, and when a process calls malloc, it attempts to find some
unused memory in that block, and if it cannot, it uses brk/sbrk to
extend the heap -- make it bigger.

At any given point in time, if a process attempts to access a bit of
memory, it passes through one or more layers of
virtual-address-translation. If the address is in a location the
kernel says is ok, then the translation will map to a page of physical
RAM -- if this is a new mapping, the page is faulted in when the VAT
mapping is established. When a process brk's the heap, the kernel will
be prepared to facilitate faulting RAM to back the additional address
space.

However, mmap does not use the heap -- it picks other locations in the
virtual address space to use (often near the stack, far away from the
heap) -- the caller can even specify where in that address space they
want the mapping to occur, which can facilitate using non-offset
internal pointers. When the kernel services a mmap call, it arranges
to handle page faults within the range provided by mmap (base address
+ size) by using the VAT to map the virtual address to a page of
physical RAM which contains (by reading it from the file the first
time it's accessed) the content of the mapped file. (When mmap is used
without a file, the content is zero-filled.) If the mmap call is
private/exclusive, then this page of physical RAM will only be mapped
to this process's virtual addressing; if it is shared, then multiple
processes, from distinct virtual addresses, will VAT to the same page
of physical RAM. This can facilitate many efficiencies.

Again, to keep this simple... in order for the VAT to become unmapped,
and return the physical RAM to the pool where it may be reused
(multiple such pools exist, but I'll not dwell on that), the kernel
has to be told that the process no longer wants that VAT to work --
that is, it arranges for accesses to those virtual addresses to result
in a segmentation violation (segv). At that point, the RAM itself is
free and available for any other process.

In order for the heap's range of virtual addresses to return RAM, the
end of the heap must be lowered, and thus -- as people have said --
the process needs to be sure that all the addressable stuff it is
using lies below some point, such that the end can be lowered to that
point. In order for the range of addresses provided by any given mmap
to return RAM, the kernel only needs to be told to unmap the memory.
Now, it seems obvious, but I will point out that this does require
that the process be sure that it is not still using virtual addresses
that are in that range -- one cannot have one's memory and free it
too. But it does not require that anything be shuffled around to fit
into a smaller contiguous block such as the heap.

It is worth pointing out that (in general) if one simply mmap's a
100MB file and accesses a single byte in the middle of that file, then
one will only fault in _one_ page of RAM, and when one unmap's the
file, one will only release _one_ page of RAM. It is possible to
instruct the kernel to pre-fault pages into RAM to enhance
performance, but as a general rule, you will only use as much RAM as
is required based on the actual reads and writes to virtual addresses
mapped into a file.

Anyway, to
[Boston.pm] Perl and recursion
I am currently in the midst of implementing a fairly non-trivial
recursive algorithm in Perl. The depth of the recursion is quite
large, so much so that I have set "no warnings 'recursion'", which
otherwise warns at a depth over 100. This seems pretty small to me! If
the default is to warn at a depth of 100, does this imply that Perl
runs into issues internally somewhere? A quick profiling via top does
indeed confirm what I consider an unusually large memory allocation
when run with inputs that force a depth of greater than a few hundred.
Googling did not provide much insight, and in fact it seems only
small, trivial examples are mainly discussed on this topic. Can anyone
here comment on what should be considered here?

Best Regards,
Adam

Sent from my iPhone
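For reference, the pragma in question, no warnings 'recursion', is
lexically scoped, so it can be limited to just the deep routine; a
minimal sketch (my own example, not the poster's algorithm):

```perl
use strict;
use warnings;

sub countdown {
    # Silence the "Deep recursion on subroutine" warning that perl
    # emits once a sub calls itself more than 100 levels deep; the
    # pragma only applies inside this block.
    no warnings 'recursion';
    my ($n) = @_;
    return 0 if $n <= 0;
    return 1 + countdown($n - 1);
}

print countdown(500), "\n";
```

The warning threshold is purely advisory: perl will happily recurse far
deeper, at the cost of the per-frame memory discussed later in this
thread.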
Message: 1 Date: Fri, 5 Apr 2013 15:04:21 -0400 From: John Redford eire...@hotmail.com To: 'Ben Tilly' bti...@gmail.com Cc: 'L-boston-pm' boston-pm@mail.pm.org Subject: Re: [Boston.pm] Passing large complex data structures between process Ben Tilly emitted: Pro tip. I've seen both push based systems and pull based systems at work. The push based systems tend to break whenever the thing that you're pushing to has problems. Pull-based systems tend to be much more reliable in my experience. [...] If you disregard this tip, then learn from experience and give thought in advance to how you're going to monitor the things that you're pushing to, notice their problems, and fix them when they break. (Rather than 2 weeks later when someone wonders why their data stopped updating.) Your writing is FUD. Pro tip. Learn to use a database. I know that it can be fun to play with the latest piece of shiny technofrippery, like Redis, and to imagine that because it is new, it somehow is better than anything that came before and that it can solve problems that have never been solved before. It's not. There's nothing specifically wrong with it, but it's not a silver bullet and parallelism is not a werewolf. -- Message: 2 Date: Fri, 5 Apr 2013 16:26:41 -0400 From: Anthony Caravello t...@caravello.us To: John Redford eire...@hotmail.com Cc: L-boston-pm boston-pm@mail.pm.org Subject: Re: [Boston.pm] Passing large complex data structures between process Queuing systems aren't really new or 'technofrippery'. In-memory FIFO stacks are ridiculously fast compared to transaction-safe RDBMSes for this simple purpose. Databases incur a lot of overhead for wonderful things that don't aid this cause.
This isn't magic, sometimes it's just the right tool for the job. On Fri, Apr 5, 2013 at 3:04 PM, John Redford eire...@hotmail.com wrote: [...]
Re: [Boston.pm] Perl and recursion
On Apr 5, 2013 8:24 PM, Uri Guttman u...@stemsystems.com wrote: as for your ram usage, all recursions can be unrolled into plain loops by managing your own stack. this is a classic way to save ram and sub call overhead. with perl it would be a fairly trivial thing to do. use an array for the stack and each element could be a hash ref containing all the data/args for that level's call. then you just loop and decide to 'recurse' or not. if you recurse you push a new hash ref onto the stack and loop. if you don't recurse you pop a value from the stack and maybe modify the previous top of the stack (like doing a return early in recursion). i leave the details to you. this would save you a ton of ram and cpu for each call if you go very deep and complex. Uri, you know more perlguts than I do, so maybe this is a dumb question, but... why is this that much faster than actual recursion? That speaks poorly of lowercase-p perl. -Conor ___ Boston-pm mailing list Boston-pm@mail.pm.org http://mail.pm.org/mailman/listinfo/boston-pm
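A short sketch of the unrolling Uri describes, under made-up assumptions (a nested-arrayref tree and a `sum_leaves` function invented for illustration): each stack element stands in for one call frame, pushing a child replaces the recursive call, and popping replaces the return:

```perl
use strict;
use warnings;

# Sum the leaves of a nested arrayref tree iteratively.
# The @stack array replaces the call stack: each element is
# the data one level of recursion would have received.
sub sum_leaves {
    my ($tree) = @_;
    my @stack = ($tree);    # initial "call frame"
    my $sum   = 0;
    while (@stack) {
        my $node = pop @stack;
        if ( ref $node eq 'ARRAY' ) {
            push @stack, @$node;    # "recurse" by pushing children
        }
        else {
            $sum += $node;          # leaf: the base-case work
        }
    }
    return $sum;
}

print sum_leaves( [ 1, [ 2, [ 3, 4 ] ], 5 ] ), "\n";    # prints 15
```

No matter how deeply the input nests, the only growth is the @stack array itself, with none of the per-call bookkeeping discussed below.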
Re: [Boston.pm] Perl and recursion
On Fri, Apr 5, 2013 at 6:03 PM, Conor Walsh c...@adverb.ly wrote: [...] Uri, you know more perlguts than I do, so maybe this is a dumb question, but... why is this that much faster than actual recursion? That speaks poorly of lowercase-p perl. Do you like having detailed stack backtraces on program crashes? It takes a lot of work to maintain that information, and you're paying for it whether or not you crash.
Re: [Boston.pm] Perl and recursion
Detailed? What's kept beyond "a called b (arguments...)"? That's not a lot of bytes, unless it's complete deep copies of structures. -C. On Apr 5, 2013 9:08 PM, Ben Tilly bti...@gmail.com wrote: [...] Do you like having detailed stack backtraces on program crashes? It takes a lot of work to maintain that information, and you're paying for it whether or not you crash.
Re: [Boston.pm] Perl and recursion
Detailed? What's kept beyond "a called b (arguments...)"? That's not a lot of bytes, unless it's complete deep copies of structures. perldoc -f caller: package, line number, etc. Regardless, my understanding was that although perl's sub calls are somewhat expensive compared to some other languages', recursive algorithms, while simple, are always unrollable to something more efficient... just as Uri proposes.
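A small illustration of what each activation record carries (sub names here are invented for the example): caller(0), per perldoc -f caller, reports the calling package, file, and line, plus the fully qualified name of the running sub:

```perl
use strict;
use warnings;

# Each frame on the call stack lets caller() report where the
# current sub was called from and what it is named.
sub inner {
    my ( $pkg, $file, $line, $subname ) = caller(0);
    return "$pkg / $subname";    # caller's package, current sub's name
}

sub outer { return inner() }

print outer(), "\n";    # prints "main / main::inner"
```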
Re: [Boston.pm] Perl and recursion
THEORY Every general computer science over-simplification has a BUT that is very important. Recursion is as efficient as iteration ... ... IF AND ONLY IF Tail Recursion Optimization is in effect. When Tail Recursion is in effect, you do NOT have all that call stack; you're only one level down the entire time (which means no overhead and no recursion limit either). (Whether you can be thread safe and tail recursive in any modern language i haven't heard.) PRACTICE When in doubt, benchmark. -- Bill @n1vux bill.n1...@gmail.com
Re: [Boston.pm] Perl and recursion
On Fri, Apr 5, 2013 at 6:39 PM, Bill Ricker bill.n1...@gmail.com wrote: [...] (Whether you can be thread safe and tail recursive in any modern language i haven't heard.) Erlang is the best example. PRACTICE When in doubt, benchmark. I always doubt my benchmarks.
Re: [Boston.pm] Perl and recursion
On Fri, Apr 5, 2013 at 9:51 PM, Ben Tilly bti...@gmail.com wrote: When in doubt, benchmark. I always doubt my benchmarks. I also doubt benchmarks, but doubt extrapolating from theory to reality even more. -- Bill @n1vux bill.n1...@gmail.com
Re: [Boston.pm] Perl and recursion
On Fri, Apr 05, 2013 at 09:03:13PM -0400, Conor Walsh wrote: why is this that much faster than actual recursion? That speaks poorly of lowercase-p perl. This is not a perl-specific issue (for the most part). Most languages that support function calls need to maintain an activation record for every call in order to keep track of, at the very least, parameters, return values, next action, etc. [1] A recursive function creates a chain of these records. So, Uri's suggestion is faster because replacing recursion with iteration and stack operations uses fewer resources than the runtime does in maintaining a call stack. More to the point, though, one avoids the built-in limits, or rather, exchanges that limit for memory limits. I don't think this speaks poorly of perl at all. Every operation has a cost, after all, and function calls are relatively expensive. Certainly more so than iteration or array operations. [1] There are a couple of ways to avoid this problem while still maintaining the benefits of recursion. One is to use threaded code, as in Forth, or, more commonly, to use tail recursion (or tail call elimination), which re-uses the current activation record for the recursive call, iff the recursive call occurs at the end of the function, since that means that the only state we need to store is the return values. Since those are on the stack, they can, with a bit of pointer twiddling or copying, become the parameters to the next call. A number of languages do this, to great advantage. Perl does not have tail call elimination. However, it is mentioned in Porting/todo.pod -Gyepi
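Although Perl does not do tail call elimination automatically, `goto &sub` (documented in perlsub) reuses the current activation record by hand. A sketch with a made-up accumulator-style factorial:

```perl
use strict;
use warnings;

# Tail-recursive factorial with an accumulator. `goto &fact`
# replaces the current call frame with a fresh call to fact(),
# so the stack stays one level deep no matter how large $n is,
# and no deep-recursion warning ever fires.
sub fact {
    my ( $n, $acc ) = @_;
    $acc = 1 unless defined $acc;
    return $acc if $n <= 1;
    @_ = ( $n - 1, $n * $acc );    # arguments for the replacement call
    goto &fact;                    # tail call: reuse this activation record
}

print fact(10), "\n";    # prints 3628800
```

The cost is exactly what Gyepi notes: state that would have lived in later frames must be threaded through explicitly (here, via $acc).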
Re: [Boston.pm] Perl and recursion
function calls are relatively expensive. Certainly more so than iteration or array operations. Maybe we could get a new pragma: no overhead 'subs'; ;/ I've been fighting a perl script problem for a while now and just recently figured out a potential solution that just happens to involve a whole lot of subroutine calls, including potential recursion.
Re: [Boston.pm] Perl and recursion
Date: Fri, 05 Apr 2013 20:23:30 -0400 From: Uri Guttman u...@stemsystems.com To: boston-pm@mail.pm.org Subject: Re: [Boston.pm] Perl and recursion On 04/05/2013 08:13 PM, Adam Russell wrote: [...] massive delete Ha! Sorry about that. The massive amount of quoted text was caused by careless use of email by phone. the depth of 100 to warn for deep recursion was sort of chosen arbitrarily. rarely do most recursive programs need to go that deep and then you can shut off the warning as you know. i read something on p5p very recently where some other thing was limited to 100 and the recursion depth warning was mentioned. they may be changing it to make it 1000. afaik, it wasn't due to ram usage but just a general warning that you might have infinite recursion happening. Yes, that is what the docs say: a depth of 100 is a symptom of an infinite recursion. This is just too low in many practical cases but, yes, it is just a warning that can easily be turned off. as for your ram usage, all recursions can be unrolled into plain loops by managing your own stack. this is a classic way to save ram and sub call overhead. with perl it would be a fairly trivial thing to do.
use an array for the stack and each element could be a hash ref containing all the data/args for that level's call. then you just loop and decide to 'recurse' or not. if you recurse you push a new hash ref onto the stack and loop. if you don't recurse you pop a value from the stack and maybe modify the previous top of the stack (like doing a return early in recursion). i leave the details to you. this would save you a ton of ram and cpu for each call if you go very deep and complex. The initial implementation was of the algorithm from the literature, as a reference and baseline. I do plan on eventually doing an iterative solution; I was just curious about why that had to be the case. I now understand the interpreter is storing lots of info at each level of recursion. What seems to be the case, though, is that when we start going back up the stack, that memory doesn't seem to be released at each pop. If, say, at max depth 500MB of RAM has been allocated, I don't see that released at any point except for when perl exits, and then of course it is all released at once. Or at least that is what seems to happen.
Re: [Boston.pm] Perl and recursion
at each level of recursion. What seems to be the case though is that when we start going back up the stack that memory doesn't seem to be released at each pop. If, say, at max depth 500MB of RAM has been allocated I don't see that released at any point except for when perl exits and then of course it is all released at once. Or at least that is what seems to happen. Perl doesn't release memory, it keeps it for reallocation. This is (was?) one of the issues one had to be aware of with mod_perl.
Re: [Boston.pm] Perl and recursion
On 04/05/2013 11:22 PM, Jerrad Pierce wrote: [...] Perl doesn't release memory, it keeps it for reallocation. This is (was?) one of the issues one had to be aware of with mod_perl. this is true for almost all programs. releasing ram back to the system is tricky on unix flavors. it can only happen at the top of the ram space by using the brk() system call. this means your process has to not only free its ram, it has to consolidate it and check when it can release it to the system. this needs special and likely much slower malloc/free calls. it can be done with careful and special ram allocation but with the massive number of small mallocs that perl does (and most langs do) it is a practical impossibility. but remember, this is virtual ram and not physical. if perl stops accessing ram it has freed as it doesn't need it anymore, it will swap out naturally. it is still more resource intensive than an iterative solution but not as bad as you (the OP) think. recursion is a great concept and i use it for template parsing in Template::Simple. i don't expect template structures to nest 100 deep! :) in that code an iterative version would be harder to code than a simple single call to parse the nested sub-template. but recursion 100's deep would bring up serious speed and resource issues and i would explore iterative solutions from the start. others mention tail recursion and that is only good if you make the recursive call last. that isn't always possible and when you can do it, you may need to do some entangled code to force that to happen.
for example in my template module, i need to do work after the recursive call and it would be very convoluted to make that become a tail recursion. one thing i dislike about tail recursion is how it is used just for simple looping. maybe it is as fast if done right but it feels like the "everything looks like a nail when you have a hammer" syndrome. loops are fine as they are. uri
Re: [Boston.pm] Perl and recursion
[...] Perl doesn't release memory, it keeps it for reallocation. This is (was?) one of the issues one had to be aware of with mod_perl. Ah! Ok, so maybe I was confused about this. Even if I set the last reference to an object to undef, perl will keep the memory until exit? The high water mark for memory usage never goes down? Well, that is fine I suppose; it isn't like this process will be really all that long lived. It also means that the iterative form of this algorithm will use that much less RAM, I think.
Re: [Boston.pm] Perl and recursion
On 4/6/2013 12:01 AM, Adam Russell wrote: Ah! Ok, so maybe I was confused about this. Even if I set the last reference to an object to undef, perl will keep the memory until exit? The high water mark for memory usage never goes down? Well, that is fine I suppose; it isn't like this process will be really all that long lived. It also means that the iterative form of this algorithm will use that much less RAM, I think. Yeah, this is how the perl executable works. Undef'd (well, garbage-collected) stuff becomes free to be re-used by additional stuff within that particular instance of perl, but is never returned to the OS. Uri's point about rarely-used pages getting swapped to disk by the OS stands, though. -C.