Re: Running out of memory when using loop/recur and destructuring

2009-11-10 Thread Christophe Grand
On Wed, Nov 4, 2009 at 6:46 AM, John Harrop  wrote:
> On Tue, Nov 3, 2009 at 1:53 AM, Alex Osborne  wrote:
>>
>> The new loop uses the outer-let to get around this:
>> (let [G__13697 s
>>       [x & xs] G__13697
>>       y xs]
>>   (loop* [G__13697 G__13697
>>           y y]
>>          (let [[x & xs] G__13697
>>                y y]
>>            ...)))
>
> Now, if that were
> (let [G__13697 (java.lang.ref.SoftReference. s)
>       [x & xs] (.get G__13697)
>       y xs]
>   (loop* [G__13697 (.get G__13697)
>           y y]
>          (let [[x & xs] G__13697
>                y y]
>            ...)))
> instead ...


Or you can rely on the existing local clearing on tail calls:
(#(let [G__13697 s
        [x & xs] G__13697
        y xs]
    ((fn [G__13697 y]
       (let [[x & xs] G__13697
             y y]
         ...))
     G__13697 y))))

It _should_ work (untested).

Christophe

-- 
You received this message because you are subscribed to the Google
Groups "Clojure" group.
To post to this group, send email to clojure@googlegroups.com
Note that posts from new members are moderated - please be patient with your 
first post.
To unsubscribe from this group, send email to
clojure+unsubscr...@googlegroups.com
For more options, visit this group at
http://groups.google.com/group/clojure?hl=en


Re: Running out of memory when using loop/recur and destructuring

2009-11-10 Thread John Harrop
On Tue, Nov 10, 2009 at 7:21 AM, Rich Hickey  wrote:

> Right - pervasive locals clearing will definitely do the trick here.
> Interestingly, when I was at Microsoft and asked them about handling
> this issue for the CLR they stated plainly it wasn't an issue at all -
> their system can fully detect that any such references are in fact
> unreachable and are subject to GC. So, in a sense, all locals clearing
> on my part is a workaround for a JVM weakness in this area.


Did you see my earlier post? One possible workaround in this specific
instance could be to use SoftReference to make a "clearable" local in the
let outside the loop. In case of any worries about the reference being
cleared in the few nanos before it can be copied into the loop variables,
you could also use a mutable store, e.g.

(let [G__13697 (atom s)
      [x & xs] @G__13697
      x (atom x)
      xs (atom xs)
      y (atom @xs)
      grab (fn [a] (let [x @a] (reset! a nil) x))]
  (loop* [G__13697 (grab G__13697)
          y (grab y)]
         (let [[x & xs] G__13697
               y y]
           ...)))

How it works:
1. When the outer let is needed due to destructuring, the nondestructuring
binds are wrapped in (atom). The destructuring binds are not, but each
destructuring-produced binding is then rebound to itself wrapped in (atom).
Any reference to one of the preceding locals is wrapped in (deref).
2. When the loop needs to initialize its loop variables, it wraps access to
the let's variables in grab, which returns the contents of the atom while
also resetting it.

It's not very CPU efficient, since it uses atoms which seem to be slow, but
it will prevent head-retention of lazy seqs IF (let [x the-seq x
something-else]) lets go of the head of the-seq when the second binding of x
in the same let is performed.

By the time the loop body begins executing, the only things the outer let's
bindings are holding onto will be tiny little empty atoms, consuming just a
few heap bytes each.
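For what it's worth, the grab helper can be exercised on its own; here is a standalone sketch (names invented for illustration, separate from the loop machinery above) showing that after grab, the atom no longer retains the value:

    ;; Standalone sketch of the `grab` helper: take the atom's value and
    ;; clear the atom in one step, so the atom stops retaining the value.
    (defn grab [a]
      (let [x @a]
        (reset! a nil)
        x))

    (def holder (atom (range 10)))   ; hypothetical example value

    (let [s (grab holder)]
      ;; `s` now holds the seq; the atom holds nil and retains nothing.
      [(first s) @holder])
    ;; => [0 nil]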


Re: Running out of memory when using loop/recur and destructuring

2009-11-10 Thread Rich Hickey
On Wed, Nov 4, 2009 at 8:47 AM, Christophe Grand  wrote:
>
> On Tue, Nov 3, 2009 at 7:27 PM, Paul  Mooser  wrote:
>>
>> Ah -- I hadn't understood that when using destructuring, that
>> subsequent bindings could refer to the destructured elements. I should
>> have, since clojure "only" has let*, and this behavior seems
>> consistent with that, for binding.
>>
>> Eeww. It seems like quite a thorny issue to solve, even if simple to
>> describe.
>
> Well, in truth, there's a way to fix the loop macro, but it is too
> ugly: the idea is to wrap the outer let in a closure and replace the
> loop with a function, so that we benefit from the locals clearing on
> tail calls.
>
> The real solution would be pervasive locals clearing but I seem to
> remember Rich saying he'd like to delay such work until clojure in
> clojure.
>

Right - pervasive locals clearing will definitely do the trick here.
Interestingly, when I was at Microsoft and asked them about handling
this issue for the CLR they stated plainly it wasn't an issue at all -
their system can fully detect that any such references are in fact
unreachable and are subject to GC. So, in a sense, all locals clearing
on my part is a workaround for a JVM weakness in this area.

>> What's the procedure for creating a ticket for this? Is it at least
>> acknowledged that this IS a bug?
>
> It's better to wait for Rich's opinion on this problem before creating a 
> ticket.
>

No ticket, please. This issue is well understood and has been
discussed at length.

Thanks,

Rich



Re: Running out of memory when using loop/recur and destructuring

2009-11-09 Thread Paul Mooser

I imagine he's just busy. At this point, I plan to create a ticket on
assembla, if that's possible - I think I just need to create a login
and then file it.

On Nov 9, 2:07 pm, John Harrop  wrote:
> On Mon, Nov 9, 2009 at 4:31 PM, Rock  wrote:
> > I've been following this thread, and I must say I'm puzzled that Rich
> > hasn't said anything at all about this issue yet. It seems important
> > enough to hear his own opinion.
>
> My observation over the past few months is that Rich has long absences away
> from the list, busy with other things. Maybe he cloisters himself sometimes
> to go on uninterrupted, focused development binges? It would explain why
> Clojure has grown and matured as rapidly as it has. :)



Re: Running out of memory when using loop/recur and destructuring

2009-11-09 Thread John Harrop
On Mon, Nov 9, 2009 at 4:31 PM, Rock  wrote:

> I've been following this thread, and I must say I'm puzzled that Rich
> hasn't said anything at all about this issue yet. It seems important
> enough to hear his own opinion.


My observation over the past few months is that Rich has long absences away
from the list, busy with other things. Maybe he cloisters himself sometimes
to go on uninterrupted, focused development binges? It would explain why
Clojure has grown and matured as rapidly as it has. :)




Re: Running out of memory when using loop/recur and destructuring

2009-11-09 Thread Rock

I've been following this thread, and I must say I'm puzzled that Rich
hasn't said anything at all about this issue yet. It seems important
enough to hear his own opinion.

On 6 Nov, 18:56, Paul  Mooser  wrote:
> So, I've been hoping that Rich (or someone?) would weigh in on this,
> and give the go-ahead to file a ticket on it. Do people feel at this
> point there is consensus that this is indeed a bug?



Re: Running out of memory when using loop/recur and destructuring

2009-11-05 Thread Paul Mooser

It does make me wonder, however, if having the lazy-seq cache things
is sort of conflating laziness and consistency, since as you point
out, not all ISeq implementations do any sort of caching.

I wonder if it would be interesting to decompose it into 'lazy-
seq' (uncached), and 'cached-seq'. I understand that this is unlikely
to ever happen, but it occurred to me last night when I was idly
thinking about this.

On Nov 4, 2:03 pm, Paul  Mooser  wrote:
> I completely understand the difference between the ISeq interface and
> the particular implementation (lazy-seq) that results in these
> problems. It would be fairly straightforward, I think, to write some
> kind of uncached-lazy-seq which doesn't exhibit these problems, but
> I've felt that it sidesteps the issue and introduces issues of its
> own.
>
> On Nov 4, 1:16 pm, Chouser  wrote:
>
>
>
> > Both those examples retain the head, but since 'incs' isn't
> > a lazy-seq, the intermediate values can be garbage-collected.
> > Note the difference between the seq abstraction and the lazy-seq
> > implementation.



Re: Running out of memory when using loop/recur and destructuring

2009-11-04 Thread Paul Mooser

I completely understand the difference between the ISeq interface and
the particular implementation (lazy-seq) that results in these
problems. It would be fairly straightforward, I think, to write some
kind of uncached-lazy-seq which doesn't exhibit these problems, but
I've felt that it sidesteps the issue and introduces issues of its
own.

On Nov 4, 1:16 pm, Chouser  wrote:
> Both those examples retain the head, but since 'incs' isn't
> a lazy-seq, the intermediate values can be garbage-collected.
> Note the difference between the seq abstraction and the lazy-seq
> implementation.




Re: Running out of memory when using loop/recur and destructuring

2009-11-04 Thread Chouser

On Tue, Nov 3, 2009 at 11:51 PM, Mark Engelberg
 wrote:
>
> Clojure's built-in "range" function (last time I looked) essentially
> produces an uncached sequence.  And that makes a lot of sense.

'range' has since changed and now produces a chunked lazy seq
(master branch post-1.0).

> Producing the next value in a range on-demand is way more efficient
> and practical than caching those values.  I think that Clojure
> programmers should have an easy way to make similarly uncached
> sequences if that's what they really want/need.

This can be done by implementing the ISeq interface, today with
proxy, in the future with newnew/reify/deftype/etc.

  (defn incs [i]
    (proxy [clojure.lang.ISeq] []
      (seq [] this)
      (first [] i)
      (next [] (incs (inc i)))))

  user=> (let [r (range 1e9)] [(first r) (last r)])
  java.lang.OutOfMemoryError: GC overhead limit exceeded (NO_SOURCE_FILE:0)

  user=> (let [r (incs 10)] [(first r) (nth r 1e9)])
  [10 1000000010]

Both those examples retain the head, but since 'incs' isn't
a lazy-seq, the intermediate values can be garbage-collected.
Note the difference between the seq abstraction and the lazy-seq
implementation.
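With the reify mentioned above, a similar uncached seq might look like this (an untested sketch; a complete ISeq implementation would also need more, cons, count, and so on):

    ;; Rough reify-based equivalent of the proxy example (untested):
    (defn incs [i]
      (reify clojure.lang.ISeq
        (seq [this] this)
        (first [_] i)
        (next [_] (incs (inc i)))
        (more [this] (.next this))))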

--Chouser




Re: Running out of memory when using loop/recur and destructuring

2009-11-04 Thread Paul Mooser

Well, I care (conceptually) more about the fix being made than about
the exact timeframe. If we had to wait until clojure-in-clojure, I
think I could live with that, since the issue can be readily avoided.
We'll see if Rich has a chance to chime in to acknowledge whether or
not he considers this a bug.

On Nov 4, 5:47 am, Christophe Grand  wrote:
> On Tue, Nov 3, 2009 at 7:27 PM, Paul  Mooser  wrote:
> The real solution would be pervasive locals clearing but I seem to
> remember Rich saying he'd like to delay such work until clojure in
> clojure.
>
> > What's the procedure for creating a ticket for this? Is it at least
> > acknowledged that this IS a bug?
>
> It's better to wait for Rich's opinion on this problem before creating a 
> ticket.



Re: Running out of memory when using loop/recur and destructuring

2009-11-04 Thread Christophe Grand

On Tue, Nov 3, 2009 at 7:27 PM, Paul  Mooser  wrote:
>
> Ah -- I hadn't understood that when using destructuring, that
> subsequent bindings could refer to the destructured elements. I should
> have, since clojure "only" has let*, and this behavior seems
> consistent with that, for binding.
>
> Eeww. It seems like quite a thorny issue to solve, even if simple to
> describe.

Well, in truth, there's a way to fix the loop macro, but it is too
ugly: the idea is to wrap the outer let in a closure and replace the
loop with a function, so that we benefit from the locals clearing on
tail calls.

The real solution would be pervasive locals clearing but I seem to
remember Rich saying he'd like to delay such work until clojure in
clojure.

> What's the procedure for creating a ticket for this? Is it at least
> acknowledged that this IS a bug?

It's better to wait for Rich's opinion on this problem before creating a ticket.




Re: Running out of memory when using loop/recur and destructuring

2009-11-03 Thread John Harrop
On Tue, Nov 3, 2009 at 1:53 AM, Alex Osborne  wrote:

> The new loop uses the outer-let to get around this:
>
> (let [G__13697 s
>       [x & xs] G__13697
>       y xs]
>   (loop* [G__13697 G__13697
>           y y]
>          (let [[x & xs] G__13697
>                y y]
>            ...)))
>

Now, if that were

(let [G__13697 (java.lang.ref.SoftReference. s)
      [x & xs] (.get G__13697)
      y xs]
  (loop* [G__13697 (.get G__13697)
          y y]
         (let [[x & xs] G__13697
               y y]
           ...)))

instead ...




Re: Running out of memory when using loop/recur and destructuring

2009-11-03 Thread Mark Engelberg

I agree that seqs carry a large degree of risk.  You have to work very
hard to avoid giving your large sequences a name, lest you
accidentally "hang on to the head".

In Clojure's early days, I complained about this and described some of
my own experiments with uncached sequences.  Rich said he was
experimenting with another model for uncached iterator-like
constructs, I think he called them streams.  As far as I know, none of
that has ever made it into Clojure.  So I still feel there's a need
here that eventually needs to be addressed.

Clojure's built-in "range" function (last time I looked) essentially
produces an uncached sequence.  And that makes a lot of sense.
Producing the next value in a range on-demand is way more efficient
and practical than caching those values.  I think that Clojure
programmers should have an easy way to make similarly uncached
sequences if that's what they really want/need.  (Well, obviously you
can drop down into Java and use some of the same tricks that range
uses, but I mean it should be easy to do this from within Clojure).




Re: Running out of memory when using loop/recur and destructuring

2009-11-03 Thread Paul Mooser

I understand the pragmatism of your approach, but it's really
unfortunate. Seqs are a really convenient abstraction, and the ability
to model arbitrarily large or infinite ones (with laziness) is really
useful. In my opinion, only using seqs when all of the data can be fit
into memory really undermines the value of the abstraction (by
narrowing its usages so severely), and also makes laziness far less
useful (except possibly as a way to amortize costs over time, rather
than as a way to model infinite things).

This path has been well-trodden, but the danger of hanging on to the
head of the list is due to the caching behavior of lazy seqs, which is
important for consistency - otherwise, walking the same seq twice
might yield different results.
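A small sketch of that consistency guarantee (a hypothetical example, using a side-effecting step function to make the caching visible):

    ;; Because lazy seqs cache realized elements, walking the same seq
    ;; twice yields the same values and runs the step function only
    ;; once per element.
    (def calls (atom 0))

    (def s (map (fn [x] (swap! calls inc) x) (range 3)))

    (doall s)   ; realizes (0 1 2), incrementing `calls` three times
    (doall s)   ; returns the cached (0 1 2); `calls` stays at 3
    @calls
    ;; => 3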

As with most engineering efforts, there are trade-offs, but I've been
willing to accept the extra caution I need to employ when dealing with
lazy seqs. I've run into a few of these kinds of bugs over time, and
I'm guessing it's generally because in my uses, I'm dealing with
millions of records, and far more data than I can fit in memory. I'm
not sure that this indicates that seqs are the wrong tool in this
instance (as you seem to say), but the answer isn't clear to me.

On Nov 3, 1:20 pm, Brian Hurt  wrote:
> We finally said "don't use a seq unless you don't mind all the elements
> being in memory!" and wrote a producer class.  The producer class is similar
> to a normal Java iterator, in that getting the next element updates the
> state of the object- however maps and filters are applied lazily, and there
> is an additional close function which says that no more elements need to be
> produced (allowing for the closing the underlying file descriptor, for
> example).
>
> I disbelieve in golden hammers.  Seqs (aka lazy lists) are incredibly useful
> in a lot of places, and I'm glad that Clojure has them.  On the other hand,
> there are times and uses where seqs are the wrong tool to use.  Of course,
> the same can be said of producers.




Re: Running out of memory when using loop/recur and destructuring

2009-11-03 Thread Paul Mooser

In the particular case given below, I'd assume that during the
invocation of print-seq, the binding to "s" (the head of the sequence)
would be retained, because my mental model for the execution
environment of a function is that it is the environment in which they
were declared, extended with the bindings for their parameters. So,
anything inside that function execution should have a reference to
that environment, and I would expect those bindings to exist until the
function has completed. I seem to recall that in clojure's
implementation, environments aren't reified as such, but I believe the
behavior is the same.

If certain JDKs are smart enough to avoid this, I consider that an
implementation detail of that JDK, and thus (as with most
optimizations), it's not something you should depend upon for
correctness.

I can't dispute your point that subtle bugs can occur with seqs, but I
think that in most cases, it isn't that bad. I'm hopeful that as we
find these kinds of bugs in core clojure forms, that we can get them
addressed. Most of the bugs of this sort that I've introduced in my
own code have been relatively straightforward to diagnose and debug.

On Nov 3, 2:47 pm, Brian Hurt  wrote:
> I agree.  I don't like having to ditch seqs.  And producers bring their own
> downsides- for example, being imperative constructs, they open the door for
> race conditions on multi-threaded code in a way that seqs don't.  If
> anything, producers have a more-limited range of applicability than seqs, or
> even iterators, do.  Also, polluting the meme-space with three constructs
> which are very similiar, but subtly different, is also a problem I'm not
> happy with.
>
> But here's an example of the sorts of problems we were hitting.  OK, we all
> know that doseq doesn't hold on to the head of the seq.  But what if I
> write:
> (defn print-seq [ s ]
>     (doseq [ x s ]
>         (println x)))
> Does this code hold on to the head of the seq (in the argument to the
> function)?  I'm honestly not sure- and strongly suspect that the answer
> depends upon (among other things) which JVM you run the code on (and which
> optimizations it will perform), and how long the code has been running (and
> thus what optimizations have been performed on the code).
>
> And even if it doesn't, then I have no doubt that with a little
> complication, I can develop code that does (or at least might) hold on the
> head of the seq unnecessarily.  Which means this is not only an issue for
> the original writer of the code, but also the maintainer.
>
> It's not clear.  If you know that millions of records are as large as you're
> going to see, then seqs are the right tool- and if you load everything into
> memory, oh well.  If the number of records might creep into the billions or
> trillions, then seqs (with their risk of wanting to keep everything in
> memory) are a bad choice IMHO.



Re: Running out of memory when using loop/recur and destructuring

2009-11-03 Thread Brian Hurt
We encountered similar problems at work trying to wrap I/O up into lazy
seqs.  The problem is that it is very easy to accidentally hold on to the
head of a seq while enumerating its elements.  In addition, we had problems
with not closing file descriptors.  A common pattern was to open a file,
produce a lazy seq of the contents of the file, closing the file when the
last element was read.  The seq would then be passed off to other parts of
the code, which would read some of the elements and then drop the seq,
leaking the open file handle (at least until the GC got around to it).

We finally said "don't use a seq unless you don't mind all the elements
being in memory!" and wrote a producer class.  The producer class is similar
to a normal Java iterator, in that getting the next element updates the
state of the object; however, maps and filters are applied lazily, and there
is an additional close function which says that no more elements need to be
produced (allowing for closing the underlying file descriptor, for
example).

I disbelieve in golden hammers.  Seqs (aka lazy lists) are incredibly useful
in a lot of places, and I'm glad that Clojure has them.  On the other hand,
there are times and uses where seqs are the wrong tool to use.  Of course,
the same can be said of producers.
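A producer along those lines might be sketched in Clojure like this (a hypothetical reconstruction with invented names, not Brian's actual Java class):

    ;; Hypothetical sketch of a closeable, stateful producer over a file:
    ;; next! yields one line at a time without retaining earlier lines,
    ;; and close! releases the file handle eagerly.
    (import '(java.io BufferedReader FileReader))

    (defn line-producer [path]
      (let [rdr (BufferedReader. (FileReader. path))]
        {:next!  (fn [] (.readLine rdr))   ; returns nil at end of file
         :close! (fn [] (.close rdr))}))

    ;; Usage: read two lines, then close without draining the file.
    (comment
      (let [p (line-producer "data.txt")]
        (try
          [((:next! p)) ((:next! p))]
          (finally ((:close! p))))))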

Brian




Re: Running out of memory when using loop/recur and destructuring

2009-11-03 Thread Brian Hurt
On Tue, Nov 3, 2009 at 5:19 PM, Paul Mooser  wrote:

>
> I understand the pragmatism of your approach, but it's really
> unfortunate. Seqs are a really convenient abstraction, and the ability
> to model arbitrarily large or infinite ones (with laziness) is really
> useful. In my opinion, only using seqs when all of the data can be fit
> into memory really undermines the value of the abstraction (by
> narrowing its usages so severely), and also makes laziness far less
> useful (except possibly as a way to amortize costs over time, rather
> than as a way to model infinite things).
>
>
I agree.  I don't like having to ditch seqs.  And producers bring their own
downsides- for example, being imperative constructs, they open the door for
race conditions in multi-threaded code in a way that seqs don't.  If
anything, producers have a more limited range of applicability than seqs, or
even iterators, do.  Also, polluting the meme-space with three constructs
which are very similar, but subtly different, is also a problem I'm not
happy with.

But here's an example of the sorts of problems we were hitting.  OK, we all
know that doseq doesn't hold on to the head of the seq.  But what if I
write:
(defn print-seq [s]
  (doseq [x s]
    (println x)))
Does this code hold on to the head of the seq (in the argument to the
function)?  I'm honestly not sure- and strongly suspect that the answer
depends upon (among other things) which JVM you run the code on (and which
optimizations it will perform), and how long the code has been running (and
thus what optimizations have been performed on the code).

And even if it doesn't, then I have no doubt that with a little
complication, I can develop code that does (or at least might) hold on the
head of the seq unnecessarily.  Which means this is not only an issue for
the original writer of the code, but also the maintainer.



> This path has been well-trodden, but the danger of hanging on to the
> head of the list is due to the caching behavior of lazy seqs, which is
> important for consistency - otherwise, walking the same seq twice
> might yield different results.
>
> As with most engineering efforts, there are trade-offs, but I've been
> willing to accept the extra caution I need to employ when dealing with
> lazy seqs. I've run into a few of these kinds of bugs over time, and
> I'm guessing it's generally because in my uses, I'm dealing with
> millions of records, and far more data than I can fit in memory. I'm
> not sure that this indicates that seqs are the wrong tool in this
> instance (as you seem to say), but the answer isn't clear to me.
>

It's not clear.  If you know that millions of records are as large as you're
going to see, then seqs are the right tool- and if you load everything into
memory, oh well.  If the number of records might creep into the billions or
trillions, then seqs (with their risk of wanting to keep everything in
memory) are a bad choice IMHO.

Brian




Re: Running out of memory when using loop/recur and destructuring

2009-11-03 Thread Paul Mooser

Ah -- I hadn't understood that when using destructuring, that
subsequent bindings could refer to the destructured elements. I should
have, since clojure "only" has let*, and this behavior seems
consistent with that, for binding.

Eeww. It seems like quite a thorny issue to solve, even if simple to
describe.

What's the procedure for creating a ticket for this? Is it at least
acknowledged that this IS a bug? I don't see it in the list of
assembla tickets for clojure.



Re: Running out of memory when using loop/recur and destructuring

2009-11-02 Thread Alex Osborne

Paul Mooser wrote:
> Good job tracking down that diff -- upon looking at it, unfortunately,
> I obviously don't understand the underlying issue being fixed (the
> inter-binding dependencies) because the "old code" basically matches
> what I would think would be the way to avoid introducing this in an
> outer let form

Let's macro-expand Christophe's example with both the old loop and the
new loop.

(loop [[x & xs] s
       y xs]
  ...)

So with the old loop we get this:

(loop*
  [G__13702 s
   G__13703 xs]
  (let [[x & xs] G__13702
        y G__13703]
    ...))

See the problem? "xs" is used before it's defined.

The new loop uses the outer-let to get around this:

(let [G__13697 s
      [x & xs] G__13697
      y xs]
  (loop* [G__13697 G__13697
          y y]
    (let [[x & xs] G__13697
          y y]
      ...)))

What initially occurs to me is to move the outer let into loop*'s vector:

(loop*
  [G__13702 s
   G__13703 (let [[x & xs] G__13702] xs)]
  (let [[x & xs] G__13702
        y G__13703]
    x))

A problem with that is we're going to have to put in a destructuring let 
of all the previous arguments for each loop* argument, so it'll blow up 
in size pretty quickly, and I guess the JVM won't optimize the unused 
bindings away, as it can't be sure the (first s) and (rest s) that 
[[x & xs] s] expands into are side-effect free.
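
To make the retention concrete, here is a sketch (untested) of the kind
of loop that suffers under the new expansion -- the hidden outer let
keeps the whole seq reachable while the loop runs:

```clojure
;; With the current loop expansion, an outer let binds the original
;; seq for the entire lifetime of the loop, so the realized portion of
;; a large lazy seq cannot be collected and may exhaust the heap.
(loop [[x & xs] (range 1e8)
       y xs]
  (when xs
    (recur xs xs)))
```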




Re: Running out of memory when using loop/recur and destructuring

2009-11-02 Thread Paul Mooser

Good job tracking down that diff -- upon looking at it, unfortunately,
I obviously don't understand the underlying issue being fixed (the
inter-binding dependencies) because the "old code" basically matches
what I would think would be the way to avoid introducing this in an
outer let form -- clearly the old code must have significant or subtle
issues of its own!

On Nov 2, 10:39 am, Christophe Grand  wrote:
> Thus this commit allows to write (loop [[x & xs] s y xs] ...) but
> introduces this head-retention behaviour.
>
> Right now I can't see how loop can be made to support both cases.
> Hopefully someone else will.




Re: Running out of memory when using loop/recur and destructuring

2009-11-02 Thread Paul Mooser

This is great advice, of course. On the other hand, I feel it's
important to be explicitly clear about which forms will hold on to
(seemingly) transient data. Certain things are explicitly clear about
this (such as the docstring for doseq), and this particular case is
unfortunate because in the common case, loop doesn't hold on to the
reference. I think the inconsistency in this case is dangerous,
especially since loop/recur is the generic iteration construct.

On Nov 2, 12:59 pm, John Harrop  wrote:
> In the meantime, remember that it's always worth trying to implement
> seq-processing in terms of map, reduce, filter, for, and friends if
> possible, or lazy-seq, before resorting to loop/recur.



Re: Running out of memory when using loop/recur and destructuring

2009-11-02 Thread John Harrop
On Mon, Nov 2, 2009 at 2:39 PM, Christophe Grand wrote:

> Right now I can't see how loop can be made to support both cases.
> Hopefully someone else will.


In the meantime, remember that it's always worth trying to implement
seq-processing in terms of map, reduce, filter, for, and friends if
possible, or lazy-seq, before resorting to loop/recur.
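
For example, a traversal that might retain the head when written as a
destructuring loop is safe as a reduce, which consumes the seq step by
step (a sketch, untested):

```clojure
;; reduce holds no reference to the head of the seq, so already-consumed
;; elements are free to be garbage-collected
(reduce + (map inc (range 1000000)))
;; => 500000500000
```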




Re: Running out of memory when using loop/recur and destructuring

2009-11-02 Thread Christophe Grand

Hi Paul,

It's indeed surprising and at first glance, it looks like a bug but
after researching the logs, this let form was introduced in the
following commit
http://github.com/richhickey/clojure/commit/288f34dbba4a9e643dd7a7f77642d0f0088f95ad
with comment "fixed loop with destructuring and inter-binding
dependencies".

Thus this commit allows to write (loop [[x & xs] s y xs] ...) but
introduces this head-retention behaviour.

Right now I can't see how loop can be made to support both cases.
Hopefully someone else will.

Christophe


On Mon, Nov 2, 2009 at 6:49 PM, Paul Mooser wrote:
>
> I'm a little surprised I haven't seen more response on this topic,
> since this class of bug (inadvertently holding onto the head of
> sequences) is pretty nasty to run into, and is sort of awful to debug.
> I'm wondering if there's a different way to write the loop macro so
> that it doesn't expand into an outer "let" form.
>
>
> >
>



-- 
Professional: http://cgrand.net/ (fr)
On Clojure: http://clj-me.cgrand.net/ (en)




Re: Running out of memory when using loop/recur and destructuring

2009-11-02 Thread Paul Mooser

I'm a little surprised I haven't seen more response on this topic,
since this class of bug (inadvertently holding onto the head of
sequences) is pretty nasty to run into, and is sort of awful to debug.
I'm wondering if there's a different way to write the loop macro so
that it doesn't expand into an outer "let" form.





Re: Running out of memory when using loop/recur and destructuring

2009-10-31 Thread Paul Mooser

From looking at the source code of the loop macro, it looks like this
might be particular to destructuring with loop, rather than being
related to destructuring in general?



Re: Running out of memory when using loop/recur and destructuring

2009-10-31 Thread Paul Mooser

A user on IRC named hiredman had the excellent idea (which should have
occurred to me, but didn't) to macroexpand my code.

A macro expansion of

(loop [[head & tail] (repeat 1)]   (recur tail))

results in:

(let* [G__10 (repeat 1)
       vec__11 G__10
       head (clojure.core/nth vec__11 0 nil)
       tail (clojure.core/nthnext vec__11 1)]
  (loop* [G__10 G__10]
    (clojure.core/let [[head & tail] G__10]
      (recur tail))))

So, if I'm interpreting this correctly, it appears if you destructure
in this way, there is going to be a reference to the seq held outside
the loop itself. Does this mean, then, that this kind of heap
explosion is inevitable using destructuring with large lazy seqs? It's
hard for me to believe that is the case, and I'm definitely not a
macro expert, so I'd be happy to be shown to be wrong.
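
One possible workaround (untested) is to avoid the destructuring sugar
and call first/rest in the loop body yourself, so no generated outer
binding retains the seq:

```clojure
;; Binding the seq directly means the only reference to it is the loop
;; binding itself, which each recur replaces.
(loop [s (range 1e7)]
  (when (seq s)
    ;; (first s) would be used here in real code
    (recur (rest s))))
```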





Re: Running out of memory when using loop/recur and destructuring

2009-10-31 Thread Paul Mooser

I actually restructured my code (not the toy example posted here) to
avoid the destructuring, and was disappointed to find it also
eventually blows up on 1.6 as well. I'm reasonably certain in that
case that I'm not holding on to any of the sequence (since I don't
refer to it outside the invocation for the initial value of the loop),
so I'm still hoping people can help me suss this out.




Re: Running out of memory when using loop/recur and destructuring

2009-10-30 Thread John Harrop
On Fri, Oct 30, 2009 at 3:15 PM, Paul Mooser  wrote:

> Is this behavior due to some artifact of destructuring I'm not aware
> of (or something else I'm missing), or is there a bug? If it sounds
> like a bug, can anyone else reproduce?
>
> Thanks!


I vaguely remember something like this coming up before, months ago. It may
have been a bug. That the behavior depends on the JVM and is nonbothersome
with a more recent JVM version suggests so.
