Re: [Boston.pm] Perl and recursion

2013-04-10 Thread Gyepi SAM
On Tue, Apr 09, 2013 at 10:24:45PM -0400, Bob Rogers wrote:
map should produce better code and does read better, but it also
iterates the entire list and you can't break out of it . . .
 
 Not quite true, as you can goto a label outside of the block:
 
 sub find_odd {
   # Return the first odd argument.
   my (@values) = @_;
 
   my $result;
   map { $result = $_ and goto LAST
  if $_ & 1;
   } @values;
   LAST:
   return $result;
 }

Indeed, you are correct. However, when you break out of a map in this way,
the map returns an empty list, regardless of how many results had been
previously collected.

Because your example handles a single result, it is not clear that, in fact,
any return values MUST be collected similarly. Unless, of course, map is used
simply for side effects.
 
 It's hard to think of a case where this would be better than a simple
 loop, which is both smaller and more readable.  However, this technique
 also gives you an early exit when mapping over a tree recursively.

Even then, the need to handle collected values manually would seem to obviate
any advantage in doing so.
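
For comparison, the simple loop alternative can be sketched like this; it is a
minimal version with the same interface as the find_odd example above (the sub
name is invented for illustration):

```perl
use strict;
use warnings;

# Sketch of the plain-loop version: same job as find_odd, but the
# early exit is an ordinary return rather than a goto out of a map.
sub find_odd_loop {
    my (@values) = @_;
    for my $v (@values) {
        return $v if $v & 1;    # first odd value wins; the rest is skipped
    }
    return undef;               # no odd value found
}
```

The loop also sidesteps the question of what map returns when abandoned mid-way.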

This discussion has confirmed, for me, that writing code is, in many ways,
similar to writing prose. Though there may initially appear to be many ways
to communicate, thoughtfulness, good taste, and circumstance quickly show that
there are only a few good ways to do so.

-Gyepi 

___
Boston-pm mailing list
Boston-pm@mail.pm.org
http://mail.pm.org/mailman/listinfo/boston-pm


Re: [Boston.pm] Perl and recursion

2013-04-10 Thread Bob Rogers
   From: Gyepi SAM gy...@praxis-sw.com
   Date: Wed, 10 Apr 2013 13:54:06 -0400

   On Tue, Apr 09, 2013 at 10:24:45PM -0400, Bob Rogers wrote:

Not quite true, as you can goto a label outside of the block:

[first example omitted]

   Indeed, you are correct. However, when you break out of a map in this
   way, the map returns an empty list, regardless of how many results
   had been previously collected.

Actually, the map does not return at all, as the following variation
shows:

sub find_odd {
my (@values) = @_;

my $result;
my @results = qw(nothing);
@results = map { $result = $_ and goto LAST
 if $_ & 1;
 $_;
} @values;
  LAST:
warn "got ", join(' ', @results);
return $result;
}

You will see a "got nothing" warning when @values has an odd number, so
that the goto is taken.  In that case, map never returns, and the
assignment to @results is never made.

   Because your example handles a single result, it is not clear that,
   in fact, any return values MUST be collected similarly. Unless, of
   course, map is used simply for side effects.

I would hope (but don't know how to check) that Perl is smart enough to
notice that the map in my original example is in a void context, and
won't bother to store any return values from the block.  Is that what
you mean?

It's hard to think of a case where this would be better than a simple
loop, which is both smaller and more readable.  However, this technique
also gives you an early exit when mapping over a tree recursively.

   Even then, the need to handle collected values manually would seem to
   obviate any advantage in doing so.

Over the years, I have seen many applications for mapping over trees
(and also written a few), and most of them just produce side effects,
either to the tree or externally.  Those that don't tend to be of the
nature "collect (or count) all nodes that satisfy X", for which a
non-local exit would defeat the purpose.  Those that do require
non-local exit are of the form "find the *first* X" or "return true if
*any* X", where the burden of handling a scalar return is minimal and
the benefit from being able to skip the remainder of the tree is clear.
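
A recursive find-the-first-X over a tree gets its early exit from plain
returns, no goto required. A sketch, assuming nodes are hashrefs with
`value` and `children` keys (the node shape is invented for illustration):

```perl
use strict;
use warnings;

# Sketch: return the first node (depth-first) whose value satisfies $pred.
sub tree_find_first {
    my ($node, $pred) = @_;
    return $node if $pred->($node->{value});
    for my $child (@{ $node->{children} || [] }) {
        my $hit = tree_find_first($child, $pred);
        return $hit if defined $hit;   # unwinds at once, skipping the rest
    }
    return undef;
}
```

The scalar return makes the "burden of handling" trivially small, as noted above.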

   In other words, I believe the "collect values" cases and the
"nonlocal exit" cases are effectively disjoint.  I've never found the
need for (e.g.) a "collect the first N nodes that satisfy X" routine.

   This discussion has confirmed, for me, that writing code is, in many
   ways, similar to writing prose. Though there may initially appear to
   be many ways to communicate, thoughtfulness, good taste and
   circumstance quickly show that there are only a few good ways to do
   so.

   -Gyepi 

To the extent that there are vastly more ways to write bad (code|prose)
than good, I heartily agree.  But I submit that that still leaves an
enormous field of possibilities.  Of course, I always try to write the
best (code|prose) I can, subject to constraints, of course, and it often
seems to me that there's only a few acceptable solutions, maybe only
one.  But I have strong preferences in coding style, so I suspect that's
just my biases showing.

   Sorry to come back at you with so much disagreement; that was not my
intention.

-- Bob



Re: [Boston.pm] Perl and recursion

2013-04-10 Thread Ben Tilly
On Wed, Apr 10, 2013 at 1:53 PM, Bob Rogers rogers-...@rgrjr.dyndns.org wrote:
From: Gyepi SAM gy...@praxis-sw.com
Date: Wed, 10 Apr 2013 13:54:06 -0400

On Tue, Apr 09, 2013 at 10:24:45PM -0400, Bob Rogers wrote:
[...]
Because your example handles a single result, it is not clear that,
in fact, any return values MUST be collected similarly. Unless, of
course, map is used simply for side effects.

 I would hope (but don't know how to check) that Perl is smart enough to
 notice that the map in my original example is in a void context, and
 won't bother to store any return values from the block.  Is that what
 you mean?

Current versions of Perl are sufficiently smart to not build a return
list.  You can verify by creating objects with destructors inside the
map and seeing when they get destroyed.
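
Ben's destructor trick might look like this (the class name, counters, and
1..6 inputs are all invented for illustration):

```perl
use strict;
use warnings;

my @destroyed;

package Tracker;
sub new     { my ($class, $n) = @_; bless { n => $n }, $class }
sub DESTROY { push @destroyed, $_[0]{n} }

package main;

# Void context: perl can discard each block result, so every Tracker
# is free to die as soon as the map statement is done with it.
map { Tracker->new($_) } 1 .. 3;
my $after_void = scalar @destroyed;     # all three destroyed by now

# List context: the results are kept alive in @kept.
my @kept = map { Tracker->new($_) } 4 .. 6;
my $after_list = scalar @destroyed;     # still three; 4..6 live in @kept
```

Clearing @kept afterwards destroys the remaining three, which pins down
exactly when the return list keeps objects alive.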

 It's hard to think of a case where this would be better than a simple
 loop, which is both smaller and more readable.  However, this technique
 also gives you an early exit when mapping over a tree recursively.

Even then, the need to handle collected values manually would seem to
obviate any advantage in doing so.

 Over the years, I have seen many applications for mapping over trees
 (and also written a few), and most of them just produce side effects,
 either to the tree or externally.  Those that don't tend to be of the
 nature "collect (or count) all nodes that satisfy X", for which a
 non-local exit would defeat the purpose.  Those that do require
 non-local exit are of the form "find the *first* X" or "return true if
 *any* X", where the burden of handling a scalar return is minimal and
 the benefit from being able to skip the remainder of the tree is clear.

Many also are of the form, "Start where you left off, give me the next
one."  That gets you into internal versus external iterators.

In other words, I believe the "collect values" cases and the
 "nonlocal exit" cases are effectively disjoint.  I've never found the
 need for (e.g.) a "collect the first N nodes that satisfy X" routine.

Returning paginated results in a web page.
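
That pagination case can be sketched as a collect-the-first-N walk that
stops descending once the page is full (node shape and names invented,
as above):

```perl
use strict;
use warnings;

# Sketch: depth-first collect of the first $n values satisfying $pred.
sub tree_collect_first_n {
    my ($node, $pred, $n, $acc) = @_;
    $acc ||= [];
    return $acc if @$acc >= $n;           # page already full; prune here
    push @$acc, $node->{value} if $pred->($node->{value});
    for my $child (@{ $node->{children} || [] }) {
        tree_collect_first_n($child, $pred, $n, $acc);
        last if @$acc >= $n;              # stop as soon as the page fills
    }
    return $acc;
}
```

Unlike a pure nonlocal exit, this keeps the collected values, but it still
skips the untraversed remainder of the tree.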

This discussion has confirmed, for me, that writing code is, in many
ways, similar to writing prose. Though there may initially appear to
be many ways to communicate, thoughtfulness, good taste and
circumstance quickly show that there are only a few good ways to do
so.

-Gyepi

 To the extent that there are vastly more ways to write bad (code|prose)
 than good, I heartily agree.  But I submit that that still leaves an
 enormous field of possibilities.  Of course, I always try to write the
 best (code|prose) I can, subject to constraints, of course, and it often
 seems to me that there's only a few acceptable solutions, maybe only
 one.  But I have strong preferences in coding style, so I suspect that's
 just my biases showing.

I absolutely agree.  There is not just one right way.

Sorry to come back at you with so much disagreement; that was not my
 intention.

 -- Bob




Re: [Boston.pm] Perl and recursion

2013-04-10 Thread Bob Rogers
   From: Ben Tilly bti...@gmail.com
   Date: Wed, 10 Apr 2013 14:09:19 -0700

   On Wed, Apr 10, 2013 at 1:53 PM, Bob Rogers rogers-...@rgrjr.dyndns.org 
wrote:

I would hope (but don't know how to check) that Perl is smart enough to
notice that the map in my original example is in a void context, and
won't bother to store any return values from the block . . .

   Current versions of Perl are sufficiently smart to not build a return
   list.  You can verify by creating objects with destructors inside the
   map and seeing when they get destroyed.

Good idea; I doubt I would have thought of that.

Over the years, I have seen many applications for mapping over trees
(and also written a few), and most of them just produce side effects,
either to the tree or externally.  Those that don't tend to be of the
nature "collect (or count) all nodes that satisfy X", for which a
non-local exit would defeat the purpose.  Those that do require
non-local exit are of the form "find the *first* X" or "return true if
*any* X", where the burden of handling a scalar return is minimal and
the benefit from being able to skip the remainder of the tree is clear.

   Many also are of the form, "Start where you left off, give me the next
   one."  That gets you into internal versus external iterators.

That *is* an iterator, or the specification of one, but we are talking
about mapping over trees.  If you have a map_tree function, then
iterators are unnecessary, besides being harder to implement [1].

   To be fair, though, I've never used iterators per se, so that colors
my experience.  But I also think it's true that I've used trees more in
the more powerful languages (chiefly Common Lisp and Perl) because trees
are harder to work with in less-powerful languages (e.g. Pascal and
Fortran), largely because of the lack of functional programming tools.

   In other words, I believe the "collect values" cases and the
"nonlocal exit" cases are effectively disjoint.  I've never found the
need for (e.g.) a "collect the first N nodes that satisfy X" routine.

   Returning paginated results in a web page.

Personally, I have never encountered this sort of partial-results
problem where the data were stored in a tree.  The database engine for a
Web app may well have to deal with trees, if the underlying tables are
tree-structured.  But there the problem is "send to the client the
first N rows that satisfy X" -- since N could be huge, I suspect that
would have to be written in nonlocal-exit style, rather than by
collecting the rows before sending them.

-- Bob

[1]  Henry G. Baker, "Iterators: Signs of Weakness in Object-Oriented
     Languages", 1992.
     http://home.pipeline.com/~hbaker1/Iterator.html



Re: [Boston.pm] Perl and recursion

2013-04-09 Thread Bob Rogers
   From: Gyepi SAM gy...@praxis-sw.com
   Date: Mon, 8 Apr 2013 15:02:55 -0400

   On Mon, Apr 08, 2013 at 10:54:47AM -0400, Ricker, William wrote:
Have we mentioned the other solution? 
The Implicit loops of map {block} list; should produce more
optimized code and reads better too.

   map should produce better code and does read better, but it also
   iterates the entire list and you can't break out of it . . .

Not quite true, as you can goto a label outside of the block:

sub find_odd {
# Return the first odd argument.
my (@values) = @_;

my $result;
map { $result = $_ and goto LAST
  if $_ & 1;
} @values;
  LAST:
return $result;
}

It's hard to think of a case where this would be better than a simple
loop, which is both smaller and more readable.  However, this technique
also gives you an early exit when mapping over a tree recursively.

-- Bob Rogers
   http://www.rgrjr.com/



Re: [Boston.pm] Perl and recursion

2013-04-08 Thread David Cantrell
On Fri, Apr 05, 2013 at 09:20:16PM -0400, Jerrad Pierce wrote:

 Regardless, my understanding was that although perl's sub calls are
 somewhat expensive ...

I think it's *method calls* that are expensive, not subroutine calls.
The reason being that the subroutine that a method resolves to can
change at runtime.
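
David's point, that the sub a method resolves to can change at runtime, can
be sketched like this (the class and method names are invented):

```perl
use strict;
use warnings;

package Greeter;
sub new   { bless {}, shift }
sub greet { 'hello' }

package main;

my $g = Greeter->new;
my $first = $g->greet;               # method lookup happens at call time...

{
    no warnings 'redefine';
    *Greeter::greet = sub { 'hi' };  # ...because this is always legal
}
my $second = $g->greet;              # same call site, different sub
```

Since the glob can be reassigned at any moment, perl cannot bind the call
at compile time the way it can for a plain subroutine.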

-- 
David Cantrell | top google result for topless karaoke murders

There's a hole in my bucket, dear Liza, dear Liza.
WHAT MAKES YOU SAY THERE IS A HOLE IN YOUR BUCKET?



Re: [Boston.pm] Perl and recursion

2013-04-08 Thread Adam Russell
 On Fri, Apr 5, 2013 at 8:22 PM, Jerrad Pierce belg4...@pthbb.org wrote:
 at each level of recursion. What seems to be the case though is that when 
 we start going back
 up the stack that memory doesn't seem to be released at each pop. If, say, 
 at max depth
 500mb of ram has been allocated I don't see that released at any point 
 except for when
 perl exits and then of course it is all released at once. Or at least that 
 is what
 seems to happen.
 
  Perl doesn't release memory, it keeps it for reallocation.
  This is (was?) one of the issues one had to be aware of with mod_perl.
 
 Normally the memory is not given back, but Perl does make a good faith
 effort to try to give it back if it is convenient.
 http://www.perlmonks.org/?node_id=746953 has a couple of examples of
 where it can happen.
 
 However this is not an issue in practice very often.

After making sure the recursive implementation was correct, I re-did it
iteratively and did some comparisons with Benchmark.  As expected, the
recursive implementation was slower:
(output from cmpthese())
           s/iter recursive iterative
recursive    6.98        --      -42%
iterative    4.04       73%        --
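
A comparison like that is presumably produced with Benchmark's cmpthese; a
minimal sketch with stand-in subs (the poster's actual algorithm wasn't
shown, so sum-to-$n is an invented placeholder):

```perl
use strict;
use warnings;
use Benchmark qw(cmpthese);

# Invented stand-ins: sum 1..$n recursively vs. iteratively.
sub sum_rec {
    no warnings 'recursion';   # depth will exceed perl's 100-frame limit
    my $n = shift;
    return $n == 0 ? 0 : $n + sum_rec($n - 1);
}

sub sum_iter {
    my $n = shift;
    my $s = 0;
    $s += $_ for 1 .. $n;
    return $s;
}

# cmpthese prints a comparison table like the one quoted above.
cmpthese(200, {
    recursive => sub { sum_rec(4000) },
    iterative => sub { sum_iter(4000) },
});
```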
More amusing, though, was the memory usage.
Looking at the RSS values for the perl process, the iterative solution used
~9MB of RAM and the recursive solution came in at about 350MB. This was for
a recursive depth of 4000, which represents a very large, but not wholly
unreasonable, input.
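
One way to watch RSS from inside the program itself, on Linux at least (the
/proc parsing is Linux-specific, and how much memory perl hands back after
the undef varies by perl version and malloc):

```perl
use strict;
use warnings;

# Linux-only sketch: read our own resident set size in kB from /proc.
sub rss_kb {
    open my $fh, '<', '/proc/self/status' or die "no /proc/self/status: $!";
    while (<$fh>) { return $1 if /^VmRSS:\s+(\d+)/ }
    return;
}

my $before = rss_kb();
my @big    = (1) x 1_000_000;   # force a large allocation
my $peak   = rss_kb();          # noticeably higher than $before
undef @big;                     # freed to perl's allocator...
my $after  = rss_kb();          # ...but RSS usually stays near the peak
```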

  

___
Boston-pm mailing list
Boston-pm@mail.pm.org
http://mail.pm.org/mailman/listinfo/boston-pm


Re: [Boston.pm] Perl and recursion

2013-04-06 Thread David Larochelle
Are languages that have mark and sweep garbage collection better about
returning memory to the system than languages like Perl that use reference
count garbage collection?

Also if you really want to see how and why Perl is using memory, take a
look at Tim Bunce's Devel::SizeMe
(http://search.cpan.org/~timb/Devel-SizeMe-0.06/lib/Devel/SizeMe.pm).

--

David


On Sat, Apr 6, 2013 at 12:20 AM, Conor Walsh c...@adverb.ly wrote:

 On 4/6/2013 12:01 AM, Adam Russell wrote:

 Ah! Ok, so maybe I was confused about this. Even if I set the last
 reference to an object to undef perl will keep the memory until exit? The
 high water mark for memory usage never goes down? Well, that is fine I
 suppose, it isn't like this process will be really all that long lived. It
 also means that the iterative form of this algorithm will use all that much
 less ram, I think.


 Yeah, this is how the perl executable works.  Undef'd (well,
 garbage-collected) stuff becomes free to be re-used by additional stuff
 within that particular instance of perl, but never to the OS. Uri's point
 about rarely-used pages getting swapped to disk by the OS stands, though.

 -C.






Re: [Boston.pm] Perl and recursion

2013-04-06 Thread John Redford
David Larochelle writes:
 
 Are languages that have mark and sweep garbage collection better about
 returning memory to the system than languages like Perl that use reference
 count garbage collection?

Not in practice.

While that potential is there, in practice it is not utilized. Most programs
that allocate & then free a lot of memory will either terminate in the near
future or will reallocate that memory again in the near future.  One can
imagine two programs on a system which took turns allocating & then freeing
a lot of memory, such that it would make sense for each of them to return
the memory to the host for it to allocate to the other -- but since such
programs are rare, the chances of having two of them that happen to
synchronize thusly are even lower -- it's just not worth the effort to make
it happen, particularly because there's another approach which makes more
sense and permits applications to exercise fine control over memory use.

Programs that want to do this kind of thing -- alone or cooperatively --
should use mmap (UNIX) or CreateFileMapping (Windows). This provides the
application with a virtual address mapping that is backed (generally) by a
file on disk.  When the application wants to release that memory, it merely
unmaps the file. At the point where actual content is read or written to the
file, physical RAM pages are allocated & used, but they are released with
the file itself. This is generally the technique programs should use when
they want control over this kind of thing -- mmap also makes it easy for one
program to map-in what another program has mapped-out, thus instantly
sharing a complex data structure (using offsets rather than absolute
addressing for internal pointers).  If two programs both mmap a file at the
same time, they can share the physical memory and thus communicate with each
other that way (generally also using semaphores for locking).

There are a number of options which exist other than simply mapping a
regular file -- it is also possible to map directly against swap, and to
control how the mapped pages are addressed and cached and so forth. This is
essentially a user-space mechanism to tell the kernel exactly how certain
segments of memory need to be handled.





Re: [Boston.pm] Perl and recursion

2013-04-06 Thread Gyepi SAM
On Sat, Apr 06, 2013 at 09:11:55AM -0400, David Larochelle wrote:
 Are languages that have mark and sweep garbage collection better about
 returning memory to the system than languages like Perl that use reference
 count garbage collection?

Not usually.  On Unix systems, at least, memory is acquired from (and released
to) the system using sbrk. malloc(3) and free(3) are actually front ends to
sbrk. free marks memory as available to be reallocated, but only returns 
memory when there is enough of it at the high end of the heap. Depending
on usage patterns this is unlikely to happen naturally unless memory is first
compacted.

Compaction is an additional step beyond garbage collection, and not many
languages do it, AFAIK.
I know Lisp can compact, but it does so only to avoid fragmentation and
does not return the memory.

It is generally not necessary to return memory to the system because
most modern operating systems provide a virtual view of memory so every
program appears to have access to the entire memory space. Most of the time
this works fine. Programs that allocate too much get swapped out when
necessary. If you don't want that for your program, you can always use
mlock(3) and friends.

-Gyepi



Re: [Boston.pm] Perl and recursion

2013-04-06 Thread Bill Ricker
On Sat, Apr 6, 2013 at 12:20 AM, Conor Walsh c...@adverb.ly wrote:

 Ah! Ok, so maybe I was confused about this. Even if I set the last
 reference to an object to undef perl will keep the memory until exit? The
 high water mark for memory usage never goes down? Well, that is fine I
 suppose, it isn't like this process will be really all that long lived. It
 also means that the iterative form of this algorithm will use all that much
 less ram, I think.


 Yeah, this is how the perl executable works.  Undef'd (well,
 garbage-collected) stuff becomes free to be re-used by additional stuff
 within that particular instance of perl, but never to the OS. Uri's point
 about rarely-used pages getting swapped to disk by the OS stands, though.


As Uri said, this is not peculiar to perl. When C heap grows, it never
shrinks. C programs can release non-heap, non-stack special segments they
may have mapped previously, but those are peculiar techniques. The only
difference with Java (whose interpreter, like perl's, is built on C) is
there's a start flag to limit max heap growth; it still doesn't give back.

As long as the heap doesn't get too fragmented (java's GC moves things,
perl's refcounting doesn't) and there's no significant leakage, this isn't a
problem. But it *can* be a problem. To avoid fragmentation pushing
perpetual growth, perl sometimes needs a C-like pre-allocation pool pattern
(or dedicated heap) for the big objects and buffers.


-- 
Bill
@n1vux bill.n1...@gmail.com



Re: [Boston.pm] Perl and recursion

2013-04-06 Thread Ben Tilly
On Fri, Apr 5, 2013 at 8:22 PM, Jerrad Pierce belg4...@pthbb.org wrote:
at each level of recursion. What seems to be the case though is that when we 
start going back
up the stack that memory doesn't seem to be released at each pop. If, say, at 
max depth
500mb of ram has been allocated I don't see that released at any point except 
for when
perl exits and then of course it is all released at once. Or at least that is 
what
seems to happen.

 Perl doesn't release memory, it keeps it for reallocation.
 This is (was?) one of the issues one had to be aware of with mod_perl.

Normally the memory is not given back, but Perl does make a good faith
effort to try to give it back if it is convenient.
http://www.perlmonks.org/?node_id=746953 has a couple of examples of
where it can happen.

However this is not an issue in practice very often.





Re: [Boston.pm] Perl and recursion

2013-04-06 Thread Uri Guttman

On 04/06/2013 10:30 AM, John Redford wrote:

David Larochelle writes:


Are languages that have mark and sweep garbage collection better about
returning memory to the system than languages like Perl that use reference
count garbage collection?

Programs that want to do this kind of thing -- alone or cooperatively --
should use mmap (UNIX) or CreateFileMapping (Windows). This provides the
application with a virtual address mapping that is backed (generally) by a
file on disk.  When the application wants to release that memory, it merely
unmaps the file. At the point where actual content is read or written to the
file, physical RAM pages are allocated & used, but they are released with
the file itself. This is generally the technique programs should use when
they want control over this kind of thing -- mmap also makes it easy for one
program to map-in what another program has mapped-out, thus instantly
sharing a complex data structure (using offsets rather than absolute
addressing for internal pointers).  If two programs both mmap a file at the
same time, they can share the physical memory and thus communicate with each
other that way (generally also using semaphores for locking).


mmap won't help with returning ram to the system. there is still a 
malloc/free involved and that will come from the same place as the rest 
of ram. the only way to return virtual ram is by lowering the top 
virtual address with brk()/sbrk() calls. and that needs contiguous ram 
at the top of the address space. and that means freed ram must be 
collected into contiguous blocks. that is nothing to do with mark/sweep 
vs ref counting or any other type of gc. mark sweep will use a large 
contiguous area to copy active data but it needs the other buffer to do 
it again. it won't ever want to free that ram. the only way this ever 
happens is the custom code that manages the ram blocks by calling brk() 
directly. it is too tricky to do in most cases. and with virtual ram it 
doesn't really matter as i said, it will be swapped out until the 
process dies if you don't use it again.


uri



There are a number of options which exist other than simply mapping a
regular file -- it is also possible to map directly against swap, and to
control how the mapped pages are addressed and cached and so forth. This is
essentially a user-space mechanism to tell the kernel exactly how certain
segments of memory need to be handled.










Re: [Boston.pm] Perl and recursion

2013-04-06 Thread John Redford
Uri Guttman writes:
 
 mmap won't help with returning ram to the system. there is still a
 malloc/free involved and that will come from the same place as the rest
 of ram. the only way to return virtual ram is by lowering the top
 virtual address with brk()/sbrk() calls. and that needs contiguous ram
 at the top of the address space. and that means freed ram must be
 collected into contiguous blocks. that is nothing to do with mark/sweep
 vs ref counting or any other type of gc. mark sweep will use a large
 contiguous area to copy active data but it needs the other buffer to do
 it again. it won't ever want to free that ram. the only way this ever
 happens is the custom code that manages the ram blocks by calling brk()
 directly. it is too tricky to do in most cases. and with virtual ram it
 doesn't really matter as i said, it will be swapped out until the
 process dies if you don't use it again.

At the risk of again poking this prickly mailing list... this is simply
wrong. Certainly I cannot speak for every variation of every operating
system ever written, and your understanding of this may be based on some
system that actually worked this way, unlikely though that seems. But mmap
does not perform any malloc & free -- nor does it use brk/sbrk to extend the
heap boundary.

At any given point in time, every process has a _virtual_ address space
which is essentially "from 0 to MAXINT/MAXLONG/MAXLONGLONG/etc.", depending
on the specific OS -- that is, the virtual address space is every possible
location that a pointer can hold, even if nothing is at that address.

To keep things simple, processes have a region of memory called "heap",
which is basically one big... heap.  It's a block of addresses, and when a
process calls malloc, it attempts to find some unused memory in that block,
and if it cannot, it uses brk/sbrk to extend the heap -- make it bigger.

At any given point in time, if a process attempts to access a bit of memory,
it passes through one or more layers of virtual-address-translation.  If the
address is in a location the kernel says is ok, then the translation will
map to a page of physical RAM -- if this is a new mapping, the page is
faulted in when the VAT mapping is established. When a process brk's the
heap, the kernel will be prepared to facilitate faulting RAM to back the
additional address space.

However, mmap does not use the heap -- it picks other locations in the
virtual address space to use (often near the stack, far away from the heap)
-- the caller can even specify where in that address space they want the
mapping to occur, which can facilitate using non-offset internal pointers.
When the kernel services a mmap call, it arranges to handle page faults
within the range provided by mmap (base address + size) by using the VAT to
map the virtual address to a page of physical RAM which contains (by reading
it from the file the first time it's accessed) the content of the mapped
file.  (When mmap is used without a file, the content is left
uninitialized.)  If the mmap call is private/exclusive, then this page of
physical RAM will only be mapped to this process's virtual addressing; if it
is shared, then multiple processes, from distinct virtual addressed, will
VAT to the same page of physical RAM.  This can facilitate many
efficiencies.

Again, to keep this simple... in order for the VAT to become unmapped, and
return the physical RAM to the pool where it may be reused (multiple such
pools exist, but I'll not dwell on that), the kernel has to be told that the
process no longer wants that VAT to work -- that is, it arranges for
accesses to those virtual addresses to result in segmentation violation
(segv). At that point, the RAM itself is free and available for any other
process.

In order for the heap's range of virtual addresses to return RAM, the end
of the heap must be lowered, and thus -- as people have said -- the process
needs to be sure that all the addressable stuff it is using lies below some
point, such that the end can be lowered to that point.

In order for the range of addresses provided by any given mmap to return
RAM, the kernel only needs to be told to unmap the memory.  Now, it seems
obvious, but I will point out that this does require that the process be
sure that it is not still using virtual addresses that are in that range --
one cannot have one's memory and free it too.  But it does not require that
anything be shuffled around to fit into a smaller contiguous block such as
the heap.

It is worth pointing out that (in general) if one simply mmap's a 100MB file
and accesses a single byte in the middle of that file, then one will only
fault in _one_ page of RAM, and when one unmap's the file, one will only
release _one_ page of RAM.  It is possible to instruct the kernel to
pre-fault pages into RAM to enhance performance, but as a general rule, you
will only use as much RAM as is required based on the actual reads & writes
to virtual addresses mapped into a file.

Anyway, to 

[Boston.pm] Perl and recursion

2013-04-05 Thread Adam Russell
I am currently in the midst of implementing a fairly non-trivial recursive
algorithm in Perl. The depth of the recursion is quite large, so much so that I
have set "no warnings 'recursion'", which otherwise warns at a depth over 100.
This seems pretty small to me! If the default is to warn at a depth of 100,
does this imply that Perl runs into issues internally somewhere? A quick
profiling via top does indeed confirm what I consider an unusually large memory
allocation when run with inputs that force a depth of greater than a few
hundred. Googling did not provide much insight; in fact, it seems only small
trivial examples are discussed on this topic. Can anyone here comment on what
should be considered here?
Best Regards,
Adam

Sent from my iPhone
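For reference, the warning in question is lexical and can be switched off right where the deep call is made (a minimal sketch; the recursive sum is a stand-in for the real algorithm):

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Recursively sum 1..$n. At depth > 100 perl would normally warn
# "Deep recursion on subroutine"; the pragma silences it lexically.
sub deep_sum {
    no warnings 'recursion';      # applies only inside this block
    my ($n) = @_;
    return 0 if $n == 0;
    return $n + deep_sum($n - 1);
}

print deep_sum(500), "\n";        # 500 levels deep, no warning
```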

On Apr 5, 2013, at 7:46 PM, boston-pm-requ...@mail.pm.org wrote:

 Message: 1
 Date: Fri, 5 Apr 2013 15:04:21 -0400
 From: John Redford eire...@hotmail.com
 To: 'Ben Tilly' bti...@gmail.com
 Cc: 'L-boston-pm' boston-pm@mail.pm.org
 Subject: Re: [Boston.pm] Passing large complex data structures between
process
 
 Ben Tilly emitted:
 
 Pro tip.  I've seen both push based systems and pull based systems at
 work.  The
 push based systems tend to break whenever the thing that you're pushing to
 has problems.  Pull-based systems tend to be much more reliable in my
 experience.
 [...]
 
 If you disregard this tip, then learn from experience and give thought in
 advance to how you're going to monitor the things that you're pushing to,
 notice their problems, and fix them when they break.
 (Rather than 2 weeks later when someone wonders why their data stopped
 updating.)
 
 Your writing is FUD.
 
 Pro tip.  Learn to use a database.  I know that it can be fun to play with
 the latest piece of shiny technofrippery, like Redis, and to imagine that
 because it is new, it somehow is better than anything that came before and
 that it can solve problems that have never been solved before.  It's not.
 There's nothing specifically wrong with it, but it's not a silver bullet and
 parallelism is not a werewolf.
 
 
 
 
 --
 
 Message: 2
 Date: Fri, 5 Apr 2013 16:26:41 -0400
 From: Anthony Caravello t...@caravello.us
 To: John Redford eire...@hotmail.com
 Cc: L-boston-pm boston-pm@mail.pm.org
 Subject: Re: [Boston.pm] Passing large complex data structures between
process
 
Queuing systems aren't really new or 'technofrippery'.  In-memory FIFO
queues are ridiculously fast compared to transaction-safe RDBMSs for this
simple purpose.  Databases incur a lot of overhead for wonderful things
that don't aid this cause.
 
 This isn't magic, sometimes it's just the right tool for the job.
 
 On Fri, Apr 5, 2013 at 3:04 PM, John Redford eire...@hotmail.com wrote:
 
 Ben Tilly emitted:
 
 Pro tip.  I've seen both push based systems and pull based systems at
 work.  The
 push based systems tend to break whenever the thing that you're pushing
 to
 has problems.  Pull-based systems tend to be much more reliable in my
 experience.
 [...]
 
 If you disregard this tip, then learn from experience and give thought in
 advance to how you're going to monitor the things that you're pushing to,
 notice their problems, and fix them when they break.
 (Rather than 2 weeks later when someone wonders why their data stopped
 updating.)
 
 Your writing is FUD.
 
 Pro tip.  Learn to use a database.  I know that it can be fun to play with
 the latest piece of shiny technofrippery, like Redis, and to imagine that
 because it is new, it somehow is better than anything that came before and
 that it can solve problems that have never been solved before.  It's 

Re: [Boston.pm] Perl and recursion

2013-04-05 Thread Conor Walsh
On Apr 5, 2013 8:24 PM, Uri Guttman u...@stemsystems.com wrote:
 as for your ram usage, all recursions can be unrolled into plain loops by
managing your own stack. this is a classic way to save ram and sub call
overhead. with perl it would be a fairly trivial thing to do. use an array
for the stack and each element could be a hash ref containing all the
data/args for that level's call. then you just loop and decide to 'recurse'
or not. if you recurse you push a new hash ref onto the stack and loop. if
you don't recurse you pop a value from the stack and maybe modify the
previous top of the stack (like doing a return early in recursion). i leave
the details to you. this would save you a ton of ram and cpu for each call
if you go very deep and complex.

Uri, you know more perlguts than I do, so maybe this is a dumb question,
but...  why is this that much faster than actual recursion?  That speaks
poorly of lowercase-p perl.

-Conor

___
Boston-pm mailing list
Boston-pm@mail.pm.org
http://mail.pm.org/mailman/listinfo/boston-pm


Re: [Boston.pm] Perl and recursion

2013-04-05 Thread Ben Tilly
On Fri, Apr 5, 2013 at 6:03 PM, Conor Walsh c...@adverb.ly wrote:
 On Apr 5, 2013 8:24 PM, Uri Guttman u...@stemsystems.com wrote:
  as for your ram usage, all recursions can be unrolled into plain loops by
  managing your own stack. this is a classic way to save ram and sub call
  overhead. with perl it would be a fairly trivial thing to do. use an array
  for the stack and each element could be a hash ref containing all the
  data/args for that level's call. then you just loop and decide to 'recurse'
  or not. if you recurse you push a new hash ref onto the stack and loop. if
  you don't recurse you pop a value from the stack and maybe modify the
  previous top of the stack (like doing a return early in recursion). i leave
  the details to you. this would save you a ton of ram and cpu for each call
  if you go very deep and complex.

 Uri, you know more perlguts than I do, so maybe this is a dumb question,
 but...  why is this that much faster than actual recursion?  That speaks
 poorly of lowercase-p perl.

Do you like having detailed stack backtraces on program crashes?  It
takes a lot of work to maintain that information, and you're using it
whether or not you crash.



Re: [Boston.pm] Perl and recursion

2013-04-05 Thread Conor Walsh
Detailed?  What's kept beyond "a called b(arguments...)"?  That's not a
lot of bytes, unless it's complete deep copies of structures.

-C.
On Apr 5, 2013 9:08 PM, Ben Tilly bti...@gmail.com wrote:

 On Fri, Apr 5, 2013 at 6:03 PM, Conor Walsh c...@adverb.ly wrote:
  On Apr 5, 2013 8:24 PM, Uri Guttman u...@stemsystems.com wrote:
   as for your ram usage, all recursions can be unrolled into plain loops by
   managing your own stack. this is a classic way to save ram and sub call
   overhead. with perl it would be a fairly trivial thing to do. use an array
   for the stack and each element could be a hash ref containing all the
   data/args for that level's call. then you just loop and decide to 'recurse'
   or not. if you recurse you push a new hash ref onto the stack and loop. if
   you don't recurse you pop a value from the stack and maybe modify the
   previous top of the stack (like doing a return early in recursion). i leave
   the details to you. this would save you a ton of ram and cpu for each call
   if you go very deep and complex.
 
  Uri, you know more perlguts than I do, so maybe this is a dumb question,
  but...  why is this that much faster than actual recursion?  That speaks
  poorly of lowercase-p perl.

 Do you like having detailed stack backtraces on program crashes?  It
 takes a lot of work to maintain that information, and you're using it
 whether or not you crash.




Re: [Boston.pm] Perl and recursion

2013-04-05 Thread Jerrad Pierce
Detailed?  What's kept beyond a called b (arguments...) ?  That's not a
lot of bytes, unless it's complete deep copies of structures.

perldoc -f caller

package, line number, etc.

Regardless, my understanding was that although perl's sub calls are
somewhat expensive compared to some other languages, recursive algorithms,
while simple, are always unrollable to something more efficient...
just as Uri proposes.
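What each frame carries is visible from caller(), as Jerrad notes; a small sketch:

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Walk the call stack from inside a nested call: caller($depth)
# returns package, file, line, and the fully qualified sub name
# for each activation record above us.
sub inner {
    my $depth = 0;
    while (my @frame = caller($depth)) {
        my (undef, undef, $line, $sub) = @frame;
        print "frame $depth: $sub called at line $line\n";
        $depth++;
    }
}
sub outer { inner() }
outer();
```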



Re: [Boston.pm] Perl and recursion

2013-04-05 Thread Bill Ricker
THEORY

Every general computer science over-simplification has a BUT that is very
important.

Recursion is as efficient as iteration ...
... IF AND ONLY IF Tail Recursion Optimization is in effect.

When Tail Recursion is in effect, you do NOT have all that call stack,
you're only one level down the entire time (which means no overhead and no
recursion limit either).

(Whether you can be thread safe and tail recursive in any modern language i
haven't heard.)

PRACTICE

When in doubt, benchmark.

-- 
Bill
@n1vux bill.n1...@gmail.com



Re: [Boston.pm] Perl and recursion

2013-04-05 Thread Ben Tilly
On Fri, Apr 5, 2013 at 6:39 PM, Bill Ricker bill.n1...@gmail.com wrote:
 THEORY

 Ever general computer science over-simplification has a BUT that is very
 important.

 Recursion is as efficient as iteration ...
 ... IF AND ONLY IF Tail Recursion Optimization is in effect.

 When Tail Recursion is in effect, you do NOT have all that call stack,
 you're only one level down the entire time (which means no overhead and no
 recursion limit either).

 (Whether you can be thread safe and tail recursive in any modern language i
 haven't heard.)

Erlang is the best example.

 PRACTICE

 When in doubt, benchmark.

I always doubt my benchmarks.



Re: [Boston.pm] Perl and recursion

2013-04-05 Thread Bill Ricker
On Fri, Apr 5, 2013 at 9:51 PM, Ben Tilly bti...@gmail.com wrote:

  When in doubt, benchmark.

 I always doubt my benchmarks.


I also doubt benchmarks, but doubt extrapolating from theory
to reality even more.



-- 
Bill
@n1vux bill.n1...@gmail.com



Re: [Boston.pm] Perl and recursion

2013-04-05 Thread Gyepi SAM
On Fri, Apr 05, 2013 at 09:03:13PM -0400, Conor Walsh wrote:
 why is this that much faster than actual recursion?  That speaks
 poorly of lowercase-p perl.

This is not a perl specific issue (for the most part).

Most languages that support function calls need to maintain an activation
record for every call in order to keep track of, at the very least,
parameters, return values, next action, etc [1]. A recursive function creates
a chain of these records.

So, Uri's suggestion is faster because replacing recursion with iteration and
stack operations uses fewer resources than the runtime does in maintaining a
call stack. More to the point, though, one avoids the built-in depth limit, or
rather, exchanges that limit for memory limits.

I don't think this speaks poorly of perl at all.
Every operation has a cost, after all, and function calls are relatively 
expensive.
Certainly more so than iteration or array operations.

[1] There are a couple of ways to avoid this problem while still keeping the
benefits of recursion. One is to use threaded code, as in Forth; the more
common one is tail recursion (or tail call elimination), which re-uses the
current activation record for the recursive call, iff the recursive call
occurs at the end of the function, since that means the only state we need
to store is the return values. Since those are on the stack, they can, with
a bit of pointer twiddling or copying, become the parameters to the next call.

A number of languages do this, to great advantage.
Perl does not have tail call elimination; however, it is mentioned in
Porting/todo.pod

-Gyepi
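Perl cannot eliminate tail calls automatically, but `goto &sub` does reuse the current activation record, so a hand-written tail call runs at constant call depth (a sketch using a hypothetical accumulator-style sum):

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Tail-recursive sum with an accumulator. `goto &count_down` replaces
# the current frame instead of stacking a new one, so there is no
# deep-recursion warning and no growing call stack.
sub count_down {
    my ($n, $acc) = @_;
    return $acc if $n == 0;
    @_ = ($n - 1, $acc + $n);   # arguments for the tail call
    goto &count_down;           # reuse this activation record
}

print count_down(100_000, 0), "\n";
```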



Re: [Boston.pm] Perl and recursion

2013-04-05 Thread Greg London

 function calls are relatively expensive.
 Certainly more so than iteration or array operations.

Maybe we could get a new pragma:

no overhead 'subs';

;/

I've been fighting a perl script problem for a while now
and just recently figured out a potential solution
that just happens to involve a whole lot of subroutine
calls, including potential recursion.








Re: [Boston.pm] Perl and recursion

2013-04-05 Thread Adam Russell
Date: Fri, 05 Apr 2013 20:23:30 -0400
From: Uri Guttman u...@stemsystems.com
To: boston-pm@mail.pm.org
Subject: Re: [Boston.pm] Perl and recursion
 
On 04/05/2013 08:13 PM, Adam Russell wrote:
 I am currently in the midst of implementing a fairly non-trivial
 recursive algorithm in Perl. The depth of the recursion is quite
 large, so much so that I have set no warning recursion which warns
 with a depth over 100. This seems pretty small to me! If the default
 is to warn at a depth of 100 does this imply that Perl runs into
 issues internally somewhere? A quick profiling via top does indeed
 confirm what I consider an unusually large memory allocation when run
 with inputs that force a depth of greater than a few hundred.
 Googling did not provide much insight and in fact it seems only small
 trivial examples are mainly discussed on this topic. Can anyone here
 comment on what should be considered here?
 

massive delete

Ha! Sorry about that. The massive amount of quoted text was caused by
careless use of email by phone.

the depth of 100 to warn for deep recursion was sort of chosen
arbitrarily. rarely do most recursive programs need to go that deep and
then you can shut off the warning as you know. i read something on p5p 
very recently where some other thing was limited to 100 and the 
recursion depth warning was mentioned. they may be changing it to make 
it 1000. afaik, it wasn't due to ram usage but just a general warning 
that you might have infinite recursion happening.

Yes, that is what the docs say: a depth of 100 is a symptom of infinite
recursion. This is just too low in many practical cases, but, yes, it is
just a warning that can easily be turned off.

as for your ram usage, all recursions can be unrolled into plain loops 
by managing your own stack. this is a classic way to save ram and sub 
call overhead. with perl it would be a fairly trivial thing to do. use 
an array for the stack and each element could be a hash ref containing 
all the data/args for that level's call. then you just loop and decide 
to 'recurse' or not. if you recurse you push a new hash ref onto the 
stack and loop. if you don't recurse you pop a value from the stack and 
maybe modify the previous top of the stack (like doing a return early in 
recursion). i leave the details to you. this would save you a ton of ram 
and cpu for each call if you go very deep and complex.

The initial implementation was of the algorithm from the literature, as a
reference and baseline. I do plan on eventually doing an iterative solution.
I was just curious about why this had to be the case. I now understand the
interpreter is storing lots of info at each level of recursion. What seems to
be the case though is that when we start going back up the stack, that memory
doesn't seem to be released at each pop. If, say, at max depth 500MB of RAM
has been allocated, I don't see that released at any point except for when
perl exits, and then of course it is all released at once. Or at least that
is what seems to happen.
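To make Uri's unrolling concrete, here is a sketch for a hypothetical task -- summing a nested array-of-arrays -- with the would-be call frames kept in an explicit array:

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Iterative replacement for a recursive tree walk: each hash ref on
# @stack stands in for one would-be call frame.
sub sum_nested {
    my ($tree) = @_;
    my $total = 0;
    my @stack = ({ node => $tree });              # the initial "call"
    while (@stack) {
        my $frame = pop @stack;
        my $node  = $frame->{node};
        if (ref $node eq 'ARRAY') {
            push @stack, { node => $_ } for @$node;   # "recurse"
        }
        else {
            $total += $node;                      # leaf: accumulate
        }
    }
    return $total;
}

print sum_nested([1, [2, 3], [[4], 5]]), "\n";
```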

  



Re: [Boston.pm] Perl and recursion

2013-04-05 Thread Jerrad Pierce
at each level of recursion. What seems to be the case though is that when we
start going back up the stack that memory doesn't seem to be released at each
pop. If, say, at max depth 500mb of ram has been allocated I don't see that
released at any point except for when perl exits and then of course it is
all released at once. Or at least that is what seems to happen.

Perl doesn't release memory, it keeps it for reallocation.
This is (was?) one of the issues one had to be aware of with mod_perl.



Re: [Boston.pm] Perl and recursion

2013-04-05 Thread Uri Guttman

On 04/05/2013 11:22 PM, Jerrad Pierce wrote:

at each level of recursion. What seems to be the case though is that when we
start going back up the stack that memory doesn't seem to be released at each
pop. If, say, at max depth 500mb of ram has been allocated I don't see that
released at any point except for when perl exits and then of course it is
all released at once. Or at least that is what seems to happen.


Perl doesn't release memory, it keeps it for reallocation.
This is (was?) one of the issues one had to be aware of with mod_perl.


this is true for almost all programs. releasing ram back to the system is
tricky on unix flavors. with the classic heap it can only happen at the top
of the ram space, via the brk() system call (whole mmap'd blocks can also be
returned with munmap, but perl's many small allocations rarely qualify).
this means your process has to not only free its ram, it has to consolidate
it and check when it can release it to the system. this needs special and
likely much slower malloc/free calls. it can be done with careful and special
ram allocation but with the massive number of small mallocs that perl does
(and most langs do) it is a practical impossibility.


but remember, this is virtual ram and not physical. if perl stops 
accessing ram it has freed as it doesn't need it anymore, it will swap 
out naturally. it is still more resource intensive than an iterative 
solution but not as bad as you (the OP) thinks.


recursion is a great concept and i use it for template parsing in 
Template::Simple. i don't expect template structures to nest 100 deep! 
:) in that code an iterative version would be harder to code than a 
simple single call to parse the nested sub-template.


but recursion 100's deep would bring up serious speed and resource 
issues and i would explore iterative solutions from the start.


others mention tail recursion and that is only good if you make the 
recursive call last. that isn't always possible and when you can do it, 
you may need to do some entangled code to force that to happen. for 
example in my template module, i need to do work after the recursive 
call and it would be very convoluted to make that become a tail recursion.


one thing i dislike about tail recursion is how it is used just for 
simple looping. maybe it is as fast if done right but it feels like the 
everything looks like a nail when you have a hammer syndrome. loops are 
fine as they are.


uri





Re: [Boston.pm] Perl and recursion

2013-04-05 Thread Adam Russell
 at each level of recursion. What seems to be the case though is that when we 
 start going bac
 up the stack that memory doesn't seem to be released at each pop. If, say, 
 at max depth
 500mb of ram has been allocated I don't see that released at any point 
 except for when
 perl exits and then of course it is all released at once. Or at least that 
 is what
 seems to happen.
 
 Perl doesn't release memory, it keeps it for reallocation.
 This is (was?) one of the issues one had to be aware of with mod_perl.

Ah! Ok, so maybe I was confused about this. Even if I set the last reference
to an object to undef, perl will keep the memory until exit? The high-water
mark for memory usage never goes down? Well, that is fine I suppose; it isn't
like this process will be really all that long-lived. It also means that the
iterative form of this algorithm will use that much less ram, I think.

  



Re: [Boston.pm] Perl and recursion

2013-04-05 Thread Conor Walsh

On 4/6/2013 12:01 AM, Adam Russell wrote:
Ah! Ok, so maybe I was confused about this. Even if I set the last 
reference to an object to undef perl will keep the memory until exit? 
The high water mark for memory usage never goes down? Well, that is 
fine I suppose, it isn't like this process will be really all that 
long lived. It also means that the iterative form of this algorithm 
will use all that much less ram, I think.


Yeah, this is how the perl executable works.  Undef'd (well,
garbage-collected) stuff becomes free to be re-used by additional stuff
within that particular instance of perl, but is never returned to the OS.
Uri's point about rarely-used pages getting swapped to disk by the OS stands,
though.


-C.
