Re: Euler 14

2011-01-21 Thread Ken Wesson
On Fri, Jan 21, 2011 at 6:43 PM, Aaron Cohen  wrote:
>> max-key uses destructuring, which was one of the culprits for
>> unexpectedly holding onto the head of lists before locals clearing was
>> added.
>
> This part of what I said is garbage, I'm sorry. I looked at max-key
> too quickly, but there isn't any destructuring there.

I wrote my own implementation of max-key that doesn't use
destructuring before reading the later part of this thread. :)

It is here:

(defn my-max-key [f & xs]
  (reduce
(fn [v1 v2]
  (let [k1 (f v1)
k2 (f v2)]
   (if (> k1 k2)
 v1
 v2)))
xs))

If the original code had blown the heap on my machine I was going to
substitute my-max-key and see if it still did, and if it did tweak
my-max-key to see if there was a way to make the problem go away.

This my-max-key is less than perfectly efficient in that it keeps
recomputing f of the current max-key; it's a no-frills version
intended to be sure not to hold onto the head of anything, and to be
simple and easy to tweak and debug. I'd have added saving f of current
max-key later and seen if it still avoided blowing the heap on the
original problem, and, if so, submitted it as a possible replacement
for the core implementation of max-key. But that's moot now. It looks
like one of the improvements to locals clearing in 1.2 fixed the OP's
problem.

-- 
You received this message because you are subscribed to the Google
Groups "Clojure" group.
To post to this group, send email to clojure@googlegroups.com
Note that posts from new members are moderated - please be patient with your 
first post.
To unsubscribe from this group, send email to
clojure+unsubscr...@googlegroups.com
For more options, visit this group at
http://groups.google.com/group/clojure?hl=en


Re: Euler 14

2011-01-21 Thread Aaron Cohen
> max-key uses destructuring, which was one of the culprits for
> unexpectedly holding onto the head of lists before locals clearing was
> added.

This part of what I said is garbage, I'm sorry. I looked at max-key
too quickly, but there isn't any destructuring there.

That doesn't change that I think that it was probably the addition of
locals clearing in 1.2 that is causing what you guys are seeing (or
not seeing).

-- 
You received this message because you are subscribed to the Google
Groups "Clojure" group.
To post to this group, send email to clojure@googlegroups.com
Note that posts from new members are moderated - please be patient with your 
first post.
To unsubscribe from this group, send email to
clojure+unsubscr...@googlegroups.com
For more options, visit this group at
http://groups.google.com/group/clojure?hl=en


Re: Euler 14

2011-01-21 Thread Aaron Cohen
On Thu, Jan 20, 2011 at 11:57 PM, Mark Engelberg
 wrote:
> On Thu, Jan 20, 2011 at 6:51 AM, Andreas Liljeqvist  wrote:
>> I am sorry, I can't seem to reproduce the behavior at the moment :(
>> Mark, please tell me that I am not delusional...
>
> I definitely exhausted the heap running your program.  I was using
> Clojure 1.1, Java 1.6.0_21 with -server -Xmx1600M flags, running via
> Clojure Box on Windows XP.
>

"Fine grained locals clearing" was added in clojure 1.2, so it's
likely that your entire (range 1 100) list has to be processed
before it can be GC'ed.

max-key uses destructuring, which was one of the culprits for
unexpectedly holding onto the head of lists before locals clearing was
added.

See http://groups.google.com/group/clojure/msg/9b4e268b85c20cd6

-- 
You received this message because you are subscribed to the Google
Groups "Clojure" group.
To post to this group, send email to clojure@googlegroups.com
Note that posts from new members are moderated - please be patient with your 
first post.
To unsubscribe from this group, send email to
clojure+unsubscr...@googlegroups.com
For more options, visit this group at
http://groups.google.com/group/clojure?hl=en


Re: Euler 14

2011-01-20 Thread Ken Wesson
FWIW, no heap error on my system (Clojure 1.2, Java 1.6.0_13 -server).

-- 
You received this message because you are subscribed to the Google
Groups "Clojure" group.
To post to this group, send email to clojure@googlegroups.com
Note that posts from new members are moderated - please be patient with your 
first post.
To unsubscribe from this group, send email to
clojure+unsubscr...@googlegroups.com
For more options, visit this group at
http://groups.google.com/group/clojure?hl=en


Re: Euler 14

2011-01-20 Thread Mark Engelberg
On Thu, Jan 20, 2011 at 6:51 AM, Andreas Liljeqvist  wrote:
> I am sorry, I can't seem to reproduce the behavior at the moment :(
> Mark, please tell me that I am not delusional...

I definitely exhausted the heap running your program.  I was using
Clojure 1.1, Java 1.6.0_21 with -server -Xmx1600M flags, running via
Clojure Box on Windows XP.

-- 
You received this message because you are subscribed to the Google
Groups "Clojure" group.
To post to this group, send email to clojure@googlegroups.com
Note that posts from new members are moderated - please be patient with your 
first post.
To unsubscribe from this group, send email to
clojure+unsubscr...@googlegroups.com
For more options, visit this group at
http://groups.google.com/group/clojure?hl=en


Re: Euler 14

2011-01-20 Thread Andreas Liljeqvist
I am sorry, I can't seem to reproduce the behavior at the moment :(
Mark, please tell me that I am not delusional...

Will try at home also.

2011/1/20 ka 

> Andreas, How are you running that? Also what do you see in the heap
> dump and what is the jvm heap size?
>
> --
> You received this message because you are subscribed to the Google
> Groups "Clojure" group.
> To post to this group, send email to clojure@googlegroups.com
> Note that posts from new members are moderated - please be patient with
> your first post.
> To unsubscribe from this group, send email to
> clojure+unsubscr...@googlegroups.com
> For more options, visit this group at
> http://groups.google.com/group/clojure?hl=en
>

-- 
You received this message because you are subscribed to the Google
Groups "Clojure" group.
To post to this group, send email to clojure@googlegroups.com
Note that posts from new members are moderated - please be patient with your 
first post.
To unsubscribe from this group, send email to
clojure+unsubscr...@googlegroups.com
For more options, visit this group at
http://groups.google.com/group/clojure?hl=en

Re: Euler 14

2011-01-20 Thread ka
Andreas, How are you running that? Also what do you see in the heap
dump and what is the jvm heap size?

-- 
You received this message because you are subscribed to the Google
Groups "Clojure" group.
To post to this group, send email to clojure@googlegroups.com
Note that posts from new members are moderated - please be patient with your 
first post.
To unsubscribe from this group, send email to
clojure+unsubscr...@googlegroups.com
For more options, visit this group at
http://groups.google.com/group/clojure?hl=en


Re: Euler 14

2011-01-20 Thread Andreas Liljeqvist
Thank you, that explains the failure of the lazy-cseq using top-level defs.
I really hope it gets fixed, Clojure is going to be a hard sell if I have to
explain things like this to my coworkers :(

Anyhow there is still the problem with nonlazy cseq blowing the heap.

(defn cseq [n]
  (if (= 1 n)
[1]
(cons n (cseq (if (even? n)
(/ n 2)
(+ (* 3 n) 1 ))

(apply max-key count (map cseq (range 1 100)))

*BOOM*

2011/1/20 David Powell 

>
> This same problem was raised recently:
>
>
> https://groups.google.com/group/clojure/browse_thread/thread/df4ae16ab0952786?tvc=2&q=memory
>
> It isn't a GC problem, it is an issue in the Clojure compiler.
>
>
> The issue seems to only affect top-level defs.  At the top-level:
>
> (reduce + (range 1000))
>
>  - runs quickly without using lots of memory
>
>
> (def x (#(reduce + (range 1000
>
>  - also runs quickly without using lots of memory
>
>
> (def x (reduce + (range 1000)))
>
>  - uses lots of memory due to the sequence getting forced
>
>
> The problem seems to be caused by the clojure compiler forcing the sequence
> when trying to figure out how to compile it, not
> by the compiled code retaining the sequence at runtime.
>
> It probably isn't a good idea to put expensive top-level defs in your code,
> because they get run when you just try to aot
> compile your code; but if you need to, a workaround at the moment is to
> wrap your value in a function and call it as above.
>
> I suspect that it is possible for us to fix the compiler not to force these
> sequences.  I think it is happening in the
> InvokeExpr class.
>
> --
> Dave
>
> --
> You received this message because you are subscribed to the Google
> Groups "Clojure" group.
> To post to this group, send email to clojure@googlegroups.com
> Note that posts from new members are moderated - please be patient with
> your first post.
> To unsubscribe from this group, send email to
> clojure+unsubscr...@googlegroups.com
> For more options, visit this group at
> http://groups.google.com/group/clojure?hl=en
>

-- 
You received this message because you are subscribed to the Google
Groups "Clojure" group.
To post to this group, send email to clojure@googlegroups.com
Note that posts from new members are moderated - please be patient with your 
first post.
To unsubscribe from this group, send email to
clojure+unsubscr...@googlegroups.com
For more options, visit this group at
http://groups.google.com/group/clojure?hl=en

Re: Euler 14

2011-01-20 Thread Andy Fingerhut

David:

Here is a link to the exact source code of the program I was running  
for my recently published results in the sister thread titled "Problem  
with garbage collection? Was Euler 14".


https://github.com/jafingerhut/clojure-benchmarks/blob/master/collatz/collatz.clj-1.clj

Is this code also exhibiting the same problem, and if so, which top- 
level def is causing it?  The only top-level defs in my code are three  
functions defined with defn.


Thanks,
Andy

On Jan 20, 2011, at 2:10 AM, David Powell wrote:



This same problem was raised recently:

https://groups.google.com/group/clojure/browse_thread/thread/df4ae16ab0952786?tvc=2&q=memory

It isn't a GC problem, it is an issue in the Clojure compiler.


The issue seems to only affect top-level defs.  At the top-level:

(reduce + (range 1000))

 - runs quickly without using lots of memory


(def x (#(reduce + (range 1000

 - also runs quickly without using lots of memory


(def x (reduce + (range 1000)))

 - uses lots of memory due to the sequence getting forced


The problem seems to be caused by the clojure compiler forcing the  
sequence when trying to figure out how to compile it, not

by the compiled code retaining the sequence at runtime.

It probably isn't a good idea to put expensive top-level defs in  
your code, because they get run when you just try to aot
compile your code; but if you need to, a workaround at the moment is  
to wrap your value in a function and call it as above.


I suspect that it is possible for us to fix the compiler not to  
force these sequences.  I think it is happening in the

InvokeExpr class.

--
Dave

--
You received this message because you are subscribed to the Google
Groups "Clojure" group.
To post to this group, send email to clojure@googlegroups.com
Note that posts from new members are moderated - please be patient  
with your first post.

To unsubscribe from this group, send email to
clojure+unsubscr...@googlegroups.com
For more options, visit this group at
http://groups.google.com/group/clojure?hl=en


--
You received this message because you are subscribed to the Google
Groups "Clojure" group.
To post to this group, send email to clojure@googlegroups.com
Note that posts from new members are moderated - please be patient with your 
first post.
To unsubscribe from this group, send email to
clojure+unsubscr...@googlegroups.com
For more options, visit this group at
http://groups.google.com/group/clojure?hl=en


Re: Euler 14

2011-01-20 Thread David Powell

This same problem was raised recently:

https://groups.google.com/group/clojure/browse_thread/thread/df4ae16ab0952786?tvc=2&q=memory

It isn't a GC problem, it is an issue in the Clojure compiler.


The issue seems to only affect top-level defs.  At the top-level:

(reduce + (range 1000))

  - runs quickly without using lots of memory


(def x (#(reduce + (range 1000

  - also runs quickly without using lots of memory


(def x (reduce + (range 1000)))

  - uses lots of memory due to the sequence getting forced


The problem seems to be caused by the clojure compiler forcing the sequence 
when trying to figure out how to compile it, not 
by the compiled code retaining the sequence at runtime.

It probably isn't a good idea to put expensive top-level defs in your code, 
because they get run when you just try to aot 
compile your code; but if you need to, a workaround at the moment is to wrap 
your value in a function and call it as above.

I suspect that it is possible for us to fix the compiler not to force these 
sequences.  I think it is happening in the 
InvokeExpr class.

-- 
Dave

-- 
You received this message because you are subscribed to the Google
Groups "Clojure" group.
To post to this group, send email to clojure@googlegroups.com
Note that posts from new members are moderated - please be patient with your 
first post.
To unsubscribe from this group, send email to
clojure+unsubscr...@googlegroups.com
For more options, visit this group at
http://groups.google.com/group/clojure?hl=en


Re: Euler 14

2011-01-20 Thread ka
>> Running in VisualVM suggests that % of Used Heap is actually very
>> less. So the problem is that GC isn't running often enough, so the JVM
>> has to keep allocating more memory.

By problem I meant, in my case, of too much memory consumption.

> Odd.  I'd expect the JVM to run a GC immediately before reporting that
> the heap has been exhausted.

That certainly is odd, I would expect the same. But quick googling
doesn't turn up any references of the JVM which mention this.

Can you take a heap dump of the chunked and non-chunked versions (-XX:-
HeapDumpOnOutOfMemoryError)?

-- 
You received this message because you are subscribed to the Google
Groups "Clojure" group.
To post to this group, send email to clojure@googlegroups.com
Note that posts from new members are moderated - please be patient with your 
first post.
To unsubscribe from this group, send email to
clojure+unsubscr...@googlegroups.com
For more options, visit this group at
http://groups.google.com/group/clojure?hl=en


Re: Euler 14

2011-01-19 Thread Mark Engelberg
On Wed, Jan 19, 2011 at 6:48 PM, ka  wrote:
> Running in VisualVM suggests that % of Used Heap is actually very
> less. So the problem is that GC isn't running often enough, so the JVM
> has to keep allocating more memory.

Odd.  I'd expect the JVM to run a GC immediately before reporting that
the heap has been exhausted.

-- 
You received this message because you are subscribed to the Google
Groups "Clojure" group.
To post to this group, send email to clojure@googlegroups.com
Note that posts from new members are moderated - please be patient with your 
first post.
To unsubscribe from this group, send email to
clojure+unsubscr...@googlegroups.com
For more options, visit this group at
http://groups.google.com/group/clojure?hl=en


Re: Euler 14

2011-01-19 Thread ka
To me this looks totally fine, max-key should keep at most two
sequences in memory. I don't think there should be any difference
between the non-lazy and lazy versions as the depth of cseq is ~500.

The non-lazy version works for me (no heap error) for inputs 1M, 2M,
4M, but for 4M the java process did start to consume very large amount
of memory.

I did some tests with the original non-lazy cseq and (apply max-key
count (map cseq (take 400 (iterate inc 1 [Free of chunking I
think]

Running in VisualVM suggests that % of Used Heap is actually very
less. So the problem is that GC isn't running often enough, so the JVM
has to keep allocating more memory.

Running with -Xmx200M -XX:+UseParallelGC, did keep heap upto 200MB and
produces results for 4M, 10M (ans=686) w/o any problems. It will
consume much CPU and time though.



-- 
You received this message because you are subscribed to the Google
Groups "Clojure" group.
To post to this group, send email to clojure@googlegroups.com
Note that posts from new members are moderated - please be patient with your 
first post.
To unsubscribe from this group, send email to
clojure+unsubscr...@googlegroups.com
For more options, visit this group at
http://groups.google.com/group/clojure?hl=en


Re: Euler 14

2011-01-17 Thread Mark Engelberg
On Mon, Jan 17, 2011 at 11:55 AM, Andreas Liljeqvist  wrote:
> I don't see why the cseq's have to be lazy, they are at the most 525
> elements long.
> shouldn't each sequence only be produced when it is reduced in max-key and
> then discarded?

You're right, the chains aren't as long as I thought.  I can't think
of any good explanation as to why max-key blows the heap when cseq is
non-lazy and doesn't when cseq is lazy.  (I confirmed that on my
system, I get the same behavior as you, even set to a 1600MB heap
size).

Interestingly, this:
(apply max (map count (map cseq (range 1 100
works just fine with non-lazy seq.

The only real difference between this and the max-key version is that
reducing the max-key requires keeping around the longest list so far,
whereas this one just keeps around the count.  But keeping around a
525 element list shouldn't be enough to overflow the heap.

I've tried to figure out:
Could max-key be holding on to the heads of these lists?
Could this be an effect of chunked sequences?

The chunked sequences explanation seems almost plausible.  You could
in theory have 32 lists realized at once, plus the biggest one you've
seen so far.  But even in the worst-case scenario, you're talking
about 15000 elements, and I don't see how that could overflow the
heap.  Just out of curiosity, I tried this with an unchunked range,
and still got the heap overflow, so I don't see how it could be a
chunking issue.

Furthermore if either of these were the problem, you'd expect to see
the same problem with lazy cseq, because ultimately count has to
realize the lazy cseq, so if too many heads were being retained at
once, the lazy version would exhibit the same heap space problem.

This leads to the more disturbing possibility that maybe there's a
garbage collection flaw relating to non-lazy lists.

I'd love to see some more people investigate this and see if we can
come up with a good explanation as to why the original poster's code
overflows the heap space.

-- 
You received this message because you are subscribed to the Google
Groups "Clojure" group.
To post to this group, send email to clojure@googlegroups.com
Note that posts from new members are moderated - please be patient with your 
first post.
To unsubscribe from this group, send email to
clojure+unsubscr...@googlegroups.com
For more options, visit this group at
http://groups.google.com/group/clojure?hl=en


Re: Euler 14

2011-01-17 Thread Andreas Liljeqvist
I don't see why the cseq's have to be lazy, they are at the most 525
elements long.
shouldn't each sequence only be produced when it is reduced in max-key and
then discarded?

But it makes a difference:

(defn cseq [n]
  (if (= 1 n)
[1]
(lazy-seq (cons n (cseq (if (even? n)
  (/ n 2)
  (+ (* 3 n) 1 )))

(apply max-key count (map cseq (range 1 100)))

gives me results.

(def a (apply max-key count (map cseq (range 1 100
gives a heap error.

Thanks for your answer.

2011/1/17 Mark Engelberg 

> Your cseq is not lazy, and some of the sequences can be quite long, so
> it wouldn't surprise me if that's the source of your problem.
>
> You can test if this is the problem by doing something like:
> (dorun (map cseq (range 1 100))) which removes the max-key from
> the computation entirely.
>
> You'll probably need to reformulate cseq with lazy lists.  Even then,
> you will likely find that this program will be too slow without
> further optimizations (e.g., memoization or dynamic programming).
>
> That's the nature of project Euler problems; the "obvious" way to
> solve the problem is often too slow and further cleverness is
> required.
>
> On Mon, Jan 17, 2011 at 7:53 AM, Andreas Liljeqvist 
> wrote:
> > Hi.
> >
> > I am using max-key to get the longest sequence from a couple of
> sequences.
> >
> > (defn cseq [n]
> >   (if (= 1 n)
> > [1]
> > (cons n (cseq (if (even? n)
> > (/ n 2)
> > (+ (* 3 n) 1 ))
> >
> > (apply max-key count (map cseq (range 1 100)))
> >
> > This gives me a heap error.
> >
> > To my understanding max-key should keep at most two sequences in memory.
> > I could certainly solve this by working around the problem, but I want to
> > understand where my logic fails.
> > Thankful for any help.
> >
> >
> > --
> > You received this message because you are subscribed to the Google
> > Groups "Clojure" group.
> > To post to this group, send email to clojure@googlegroups.com
> > Note that posts from new members are moderated - please be patient with
> your
> > first post.
> > To unsubscribe from this group, send email to
> > clojure+unsubscr...@googlegroups.com
> > For more options, visit this group at
> > http://groups.google.com/group/clojure?hl=en
>
> --
> You received this message because you are subscribed to the Google
> Groups "Clojure" group.
> To post to this group, send email to clojure@googlegroups.com
> Note that posts from new members are moderated - please be patient with
> your first post.
> To unsubscribe from this group, send email to
> clojure+unsubscr...@googlegroups.com
> For more options, visit this group at
> http://groups.google.com/group/clojure?hl=en

-- 
You received this message because you are subscribed to the Google
Groups "Clojure" group.
To post to this group, send email to clojure@googlegroups.com
Note that posts from new members are moderated - please be patient with your 
first post.
To unsubscribe from this group, send email to
clojure+unsubscr...@googlegroups.com
For more options, visit this group at
http://groups.google.com/group/clojure?hl=en

Re: Euler 14

2011-01-17 Thread Mark Engelberg
Your cseq is not lazy, and some of the sequences can be quite long, so
it wouldn't surprise me if that's the source of your problem.

You can test if this is the problem by doing something like:
(dorun (map cseq (range 1 100))) which removes the max-key from
the computation entirely.

You'll probably need to reformulate cseq with lazy lists.  Even then,
you will likely find that this program will be too slow without
further optimizations (e.g., memoization or dynamic programming).

That's the nature of project Euler problems; the "obvious" way to
solve the problem is often too slow and further cleverness is
required.

On Mon, Jan 17, 2011 at 7:53 AM, Andreas Liljeqvist  wrote:
> Hi.
>
> I am using max-key to get the longest sequence from a couple of sequences.
>
> (defn cseq [n]
>   (if (= 1 n)
>     [1]
>     (cons n (cseq (if (even? n)
>             (/ n 2)
>             (+ (* 3 n) 1 ))
>
> (apply max-key count (map cseq (range 1 100)))
>
> This gives me a heap error.
>
> To my understanding max-key should keep at most two sequences in memory.
> I could certainly solve this by working around the problem, but I want to
> understand where my logic fails.
> Thankful for any help.
>
>
> --
> You received this message because you are subscribed to the Google
> Groups "Clojure" group.
> To post to this group, send email to clojure@googlegroups.com
> Note that posts from new members are moderated - please be patient with your
> first post.
> To unsubscribe from this group, send email to
> clojure+unsubscr...@googlegroups.com
> For more options, visit this group at
> http://groups.google.com/group/clojure?hl=en

-- 
You received this message because you are subscribed to the Google
Groups "Clojure" group.
To post to this group, send email to clojure@googlegroups.com
Note that posts from new members are moderated - please be patient with your 
first post.
To unsubscribe from this group, send email to
clojure+unsubscr...@googlegroups.com
For more options, visit this group at
http://groups.google.com/group/clojure?hl=en