Re: Difference between JVM and CLR when destructuring a lazy sequence
Thank you!

On Fri, Nov 23, 2012 at 3:04 PM, dmiller dmiller2...@gmail.com wrote:
Frank: Fixed in the master branch (which is 1.5 dev). I also created a new branch named clojure-1.4.1 that is still a 1.4 version, with the patch. Also created binary distribution zip files for the new 1.4.1 release. Several other bug fixes included in this update. -David

-- You received this message because you are subscribed to the Google Groups Clojure group. To post to this group, send email to clojure@googlegroups.com Note that posts from new members are moderated - please be patient with your first post. To unsubscribe from this group, send email to clojure+unsubscr...@googlegroups.com For more options, visit this group at http://groups.google.com/group/clojure?hl=en
Re: Difference between JVM and CLR when destructuring a lazy sequence
Frank: Fixed in the master branch (which is 1.5 dev). I also created a new branch named clojure-1.4.1 that is still a 1.4 version, with the patch. Also created binary distribution zip files for the new 1.4.1 release. Several other bug fixes included in this update. -David

On Friday, November 16, 2012 8:46:01 AM UTC-6, ffailla wrote:
Thank you David for looking into this so quickly. For now I am working around this by not destructuring, but I look forward to the patch. Thanks. -Frank
Re: Difference between JVM and CLR when destructuring a lazy sequence
Thank you David for looking into this so quickly. For now I am working around this by not destructuring, but I look forward to the patch. Thanks. -Frank

On Thursday, November 15, 2012 7:41:39 PM UTC-5, dmiller wrote:
The difference is that the JVM version is correct and the CLR implementation has a bug. I'll fix it in the current branch and try to get a patched 1.4 out as soon as I can.
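Frank does not show the workaround he used. As a minimal sketch, assuming only the first element is needed, one way to avoid the destructuring form is to call first directly. The names realized and noisy-rand are hypothetical and exist only to count how many elements get realized:

```clojure
;; Hypothetical counter, used only to observe how many elements are realized.
(def realized (atom 0))

(defn noisy-rand
  "Returns a random number and records that it was called."
  []
  (swap! realized inc)
  (rand))

;; No [f & rst] destructuring: take the head of the infinite seq directly.
(def f
  (let [foo (repeatedly noisy-rand)]
    (first foo)))

(println "return:" f)
(println "elements realized:" @realized)
```

On JVM Clojure this should realize a single element. Whether it also sidesteps the bug on the affected clojure-clr 1.4.0 build is an assumption based on Frank's report that not destructuring worked for him.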
Difference between JVM and CLR when destructuring a lazy sequence
I believe I have discovered differing behavior between the JVM and CLR implementations when running the following statement:

    user=> (let [foo (repeatedly (fn [] (let [r (rand)] (println "in-repeat:" r) r)))
                 [f & rst] foo]
             (println "return:" f))

When run on the JVM with Clojure 1.4.0, I get the following output:

    in-repeat: 0.6929552277817549
    in-repeat: 0.7005322422752974
    return: 0.6929552277817549
    nil
    user=>

When run on the CLR with clojure-clr 1.4.0, the random number is printed from in-repeat indefinitely and the expression never returns.

Is this difference between the JVM and CLR implementations when destructuring a lazy sequence known? Also, why was the random number printed twice on the JVM side? I haven't looked at the implementation, but I would guess this is due to chunking the sequence.

Thanks. -Frank Failla
Re: Difference between JVM and CLR when destructuring a lazy sequence
Binding to [& rst] must realize an element of the sequence to determine whether any are left, since it promises never to bind (), only nil.

On Thursday, November 15, 2012 7:23:05 AM UTC-8, ffailla wrote:
Also, why was the random number printed twice on the JVM side?
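The point above can be checked at a REPL. The sketch below is illustrative only and assumes JVM Clojure; calls is a hypothetical counter, not something from the thread:

```clojure
;; rest returns an empty seq when nothing is left; next returns nil.
(def rest-of-one (rest [1]))
(def next-of-one (next [1]))

;; The & binding behaves like next, so it is nil rather than ().
(def rst-of-one (let [[f & rst] [1]] rst))

;; Hypothetical counter to observe realization of an infinite lazy seq.
(def calls (atom 0))

(def head
  (let [[f & rst] (repeatedly #(swap! calls inc))]
    f))

;; One element is realized for f, and one more to decide whether rst is nil.
(println "head:" head "realized:" @calls)
```

This matches the two in-repeat lines Frank saw on the JVM: the second realization comes from the & binding, not from chunking.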
Re: Difference between JVM and CLR when destructuring a lazy sequence
The difference is that the JVM version is correct and the CLR implementation has a bug. I'll fix it in the current branch and try to get a patched 1.4 out as soon as I can.

-- Above is all you really need to know, but I find myself forced to continue. :) --

This bug has been sitting there since the first commit in the public repo. (That would be early 2009.) The line of code in question tests for the IList interface. The line has a comment that the JVM implementation changed from IList to RandomAccess, which has no equivalent in the CLR. I didn't know why the change was made, so I left it alone. (The history is lost, but I can place the JVM version change between Nov 08 and Feb 09.) Four years later, I've just discovered the reason.

The bug only surfaces in certain circumstances on infinite (lazy) sequences -- and specifically it is triggered by destructuring. LazySeq itself is not the problem -- that's used everywhere.

-David

On Thursday, November 15, 2012 9:23:05 AM UTC-6, ffailla wrote:
Is this difference between the JVM and CLR implementations when destructuring a lazy sequence known?
when to be lazy
I don't yet understand how laziness helps. Can anyone point me to a reference? I have a vague memory that one of the videos addresses this (I remember something about "these are not iterators"), but I'm having trouble finding it now.

I'm finding that lazy seqs are too slow for everything, so I expect I'm using them incorrectly. Many of the core functions return seqs, and I invariably end up wrapping them with (vec ...) to get any kind of reasonable performance. Is that what I should be doing?

Is a lazy seq mostly about algorithmic clarity, and avoiding unnecessary computation? So far I haven't run into any cases where I wouldn't realize the entire sequence, and it's always faster to do it up-front.
Re: when to be lazy
On Tue, Oct 23, 2012 at 11:38 AM, Brian Craft craft.br...@gmail.com wrote:
I don't yet understand how laziness helps. Can anyone point me to a reference?

All of Haskell? ;-)
Re: when to be lazy
On 23/10/12 19:38, Brian Craft wrote:
it's always faster to do it up-front.

It will always be faster to do it up front... no way around that! Clojure offers both worlds: be lazy when designing APIs or dealing with big data that doesn't fit in memory, and be greedy when you want pedal-to-the-metal perf. Most of the precious fns in core now have a reducer brother. Reducers make no assumptions or promises about the underlying collection... just pour the reducer into a vector and you're golden... :-)

Hope that helps...

Jim
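As a rough illustration of "pouring a reducer into a vector", here is a minimal sketch assuming a Clojure version that ships clojure.core.reducers (1.5 or later). The particular pipeline is made up for the example:

```clojure
(require '[clojure.core.reducers :as r])

;; r/filter and r/map only describe the transformation; no intermediate
;; lazy seq is built. The work happens when into reduces the result.
(def result
  (into [] (r/map inc (r/filter even? (range 10)))))

(println result)
```

The result is an ordinary vector, so the rest of the code gets eager, indexed access without an explicit (vec ...) around a lazy seq.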
Re: when to be lazy
On Tue, Oct 23, 2012 at 2:49 PM, Jim - FooBar(); jimpil1...@gmail.com wrote:
It will always be faster to do it up front... no way around that!

Unless you don't need to do it at all.

Brian
Re: when to be lazy
On 23/10/12 19:57, Brian Hurt wrote:
Unless you don't need to do it at all.

nce... ;-)

Jim
Re: when to be lazy
On Tue, Oct 23, 2012 at 2:58 PM, Jim - FooBar(); jimpil1...@gmail.com wrote:
nce... ;-)

I was actually serious. One of the advantages of lazy eval is that it lets you delay deciding whether or not to do a computation until you actually need the result. If there is a decent chance you won't, then it's a win. So it lets you play games like:

    (let [lst (map expensive_function (range 1000000000))]
      ; Note, lst is lazily evaluated, the above expression is O(1) cost!
      ...
      ; Later- nah, I've changed my mind, I only need the first 10 elements
      (take 10 lst))

Note that we haven't paid the cost of evaluating all billion calls to expensive_function, we've only paid the cost of doing it 10 times, skipping the remaining 999,999,990 calls.

Brian
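A runnable version of this idea might look like the sketch below. The body of expensive_function and the calls counter are stand-ins invented for the example. One caveat worth hedging: on chunked sources such as range, Clojure may realize a whole chunk at a time, so the number of calls can be somewhat higher than 10 (typically one chunk), while still being nowhere near a billion.

```clojure
;; Hypothetical counter so we can see how often the function really runs.
(def calls (atom 0))

(defn expensive_function
  "Stand-in for real work: squares x and records the call."
  [x]
  (swap! calls inc)
  (* x x))

;; Building the lazy seq does no work yet.
(def lst (map expensive_function (range 1000000000)))
(def calls-before @calls)

;; Realize only the first 10 results.
(def first-ten (doall (take 10 lst)))

(println "calls before:" calls-before "calls after:" @calls)
(println first-ten)
```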
Re: when to be lazy
Example: You want to find an element with a certain property in a list. In an imperative language you can do:

    function ...
      for( ... loop over list)
        if( ... current element has the property ...)
          return the element

But in Clojure we can do:

    (first (filter has-the-property? the-list))

If 'filter' was not lazy, the performance would be absolutely terrible. With laziness, however, it is only necessary to evaluate until we reach an element that has the property (that is, the same as in the imperative way). A simple runnable example would be:

    (first (filter #(> % 5) (range))) ;= 6

Here (range) is of course an infinite list, so this would not even be possible without laziness.

The point is that laziness basically makes it possible to use the functions that operate on seqs without really caring about how much we are evaluating. The above example might seem like a nifty little example that doesn't reflect the real world. I think it does. In fact, I think the advantages you get from laziness are much greater than this example shows. I often have functions where I do several map/filter operations on a list, then send it to the next function(s) that also do something along those lines, and so on.

---

"Is a lazy seq mostly about algorithmic clarity, and avoiding unnecessary computation?"

Basically, yes, I think so. You could rephrase it as "making algorithmic clarity possible, by avoiding unnecessary computation".

"So far I haven't run into any cases where I wouldn't realize the entire sequence, and it's always faster to do it up-front."

I think the cases where I have to realize the entire thing are much more common than not. However, when cases that don't need the whole sequence do crop up, it's absolutely necessary to have laziness.

Is it always faster to do it up-front? Maybe. The question is: do we really need the extra performance? In almost all cases, I would say not really. The only time that I have actually needed the extra performance was in some Euler problem(s). Otherwise, the performance difference has not been noticeable.

Jonathan
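To make the first/filter idiom concrete, here is a small sketch. find-first and the tested counter are hypothetical helpers for illustration, and the predicate is just the "greater than 5" test from the message:

```clojure
(defn find-first
  "Returns the first element of coll for which pred is truthy, or nil."
  [pred coll]
  (first (filter pred coll)))

;; Hypothetical counter showing how many elements the predicate looked at.
(def tested (atom 0))

(defn has-the-property? [x]
  (swap! tested inc)
  (> x 5))

;; (range) is infinite, yet the search stops once a match is found.
(def found (find-first has-the-property? (range)))

(println "found:" found "elements tested:" @tested)
```

Only a small prefix of the infinite sequence is ever tested. The exact count depends on whether the source is chunked, so it is best not to rely on it.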
Re: when to be lazy
Hi Brian,

Laziness (and first-class functions) can help with code modularity. I would suggest reading the paper by John Hughes on the topic: www.cs.kent.ac.uk/people/staff/dat/miranda/whyfp90.pdf

Kurman
Re: when to be lazy
Sorry- I didn't mean for my post to come off sounding like the only reason to use lazy eval is to skip computation. It's just *one* of the many reasons.

On Tue, Oct 23, 2012 at 3:08 PM, Kurman Karabukaev kur...@gmail.com wrote:
Laziness (and first class functions) can help with code modularity, I would suggest reading paper by John Hughes on the topic: www.cs.kent.ac.uk/people/staff/dat/miranda/whyfp90.pdf
Re: when to be lazy
On Tue, Oct 23, 2012 at 11:38 AM, Brian Craft craft.br...@gmail.com wrote:
Is a lazy seq mostly about algorithmic clarity, and avoiding unnecessary computation? So far I haven't run into any cases where I wouldn't realize the entire sequence, and it's always faster to do it up-front.

Here's a real world example or two from World Singles (where I work):

Search engine results

We use a search engine that returns pages of results. We provide the criteria, page number and page size, and get back that page of results from the overall result set. We have a process that looks thru search results and discards matches a member has already seen recently, and applies various other filters. It would be messy to have to write all of that paging logic into the filtering logic, so we have a lazy-search-results function that hides the paging and turns the result set into a flat, lazy sequence. That's the only place that has to deal with paging complexity. The rest of the algorithm is much, much simpler since it can now operate on a plain ol' Clojure sequence of search results. Huge win for simplicity.

Emailing matches to members daily

We have millions of members. We have a process that scours the database for members who haven't had an email from us recently, which then looks for different types of matches for them (related to the process above). After each period of 24 hours, the process restarts from the beginning. We use a lazy sequence around fetching suitable members from the database that automatically gets a sentinel inserted 24 hours after we started that period's search. As above, the process now simply processes a sequence until it hits the sentinel (it's actually interleaving about fifty sequences, and having the sentinel dynamically inserted in each sequence makes the code simpler than just hitting the 'end' of a sequence - we tried that first). The number of members processed in 24 hours depends on how many matches we find, how far thru each result set we have to look to find matches and so on. Lazy sequences make this much simpler (and much less memory intensive since we don't have to hold the entire sequence in memory in order to process it).

Updating the search engine

We also have a process that watches the database for member profile changes and transforms profile data into XML and posts it to the search engine, to keep results fresh. Again, a lazy sequence is used to allow us to continually process the 'sequence' of changes from the database and handle 'millions' of profiles in a (relatively) fixed amount of memory.

So, yes, we are constantly processing sequences that either wouldn't fit in memory fully realized or are actually infinite.

Is the processing slower than the procedural equivalent of loops and tests? Quite probably.

Is the memory usage better than realizing entire chunks of sequences? Oh yes, and not having to worry about tuning all that is a big simplification.

Is the code simpler than the procedural equivalent? Hell, yeah!

Hope that helps?

--
Sean A Corfield -- (904) 302-SEAN
An Architect's View -- http://corfield.org/
World Singles, LLC. -- http://worldsingles.com/
Perfection is the enemy of the good. -- Gustave Flaubert, French realist novelist (1821-1880)
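The paging idea in the search engine example can be sketched as follows. This is not the World Singles code: only the name lazy-search-results comes from the message, while fetch-page, fake-index, and the argument order are assumptions made up so the example runs on its own.

```clojure
;; Hypothetical stand-in for the search engine: 25 "results", served in pages.
(def fake-index (vec (range 25)))

(defn fetch-page
  "Returns one page of results (possibly empty).
   A real version would call the search engine here."
  [page page-size]
  (->> fake-index
       (drop (* page page-size))
       (take page-size)))

(defn lazy-search-results
  "Flattens paged results into one lazy seq; a page is fetched only
   when the consumer walks far enough to need it."
  ([page-size] (lazy-search-results 0 page-size))
  ([page page-size]
   (lazy-seq
     (let [results (fetch-page page page-size)]
       (when (seq results)
         (concat results (lazy-search-results (inc page) page-size)))))))

;; Filtering code sees a plain seq and knows nothing about pages.
(def unseen (take 3 (remove even? (lazy-search-results 10))))

(println unseen)
```

The design choice is the one described in the message: paging lives in exactly one function, and everything downstream composes ordinary sequence functions such as remove and take.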
Re: when to be lazy
Thanks for all the responses! This is great.

b.c.

On Tuesday, October 23, 2012 12:51:11 PM UTC-7, Sean Corfield wrote:
Hope that helps?
Re: when to be lazy
Just found this: http://www.infoq.com/presentations/Laziness-Good-Bad-Ugly Jonathan On Tue, Oct 23, 2012 at 10:09 PM, Brian Craft craft.br...@gmail.com wrote: Thanks for all the responses! This is great. b.c. On Tuesday, October 23, 2012 12:51:11 PM UTC-7, Sean Corfield wrote: On Tue, Oct 23, 2012 at 11:38 AM, Brian Craft craft...@gmail.com wrote: Is a lazy seq mostly about algorithmic clarity, and avoiding unnecessary computation? So far I haven't run into any cases where I wouldn't realize the entire sequence, and it's always faster to do it up-front. Here's a real world example or two from World Singles (where I work): Search engine results We use a search engine that returns pages of results. We provide the criteria, page number and page size, and get back that page of results from the overall result set. We have a process that looks thru search results and discards matches a member has already seen recently and various other filters. It would be messy to have to write all of that paging logic into the filtering logic so we have a lazy-search-results function that hides the paging and turns the result set into a flat, lazy sequence. That's the only place that has to deal with paging complexity. The rest of the algorithm is much, much simpler since it can now operate on a plain ol' Clojure sequence of search results. Huge win for simplicity. Emailing matches to members daily We have millions of members. We have a process that scours the database for members who haven't had an email from us recently, which then looks for different types of matches for them (related to the process above). After each period of 24 hours, the process restarts from the beginning. We use a lazy sequence around fetching suitable members from the database that automatically gets a sentinel inserted 24 hours after we started that period's search. 
As above, the process now simply processes a sequence until it hits the sentinel (it's actually interleaving about fifty sequences, and having the sentinel dynamically inserted in each sequence makes the code simpler than just hitting the 'end' of a sequence; we tried that first). The number of members processed in 24 hours depends on how many matches we find, how far thru each result set we have to look to find matches, and so on. Lazy sequences make this much simpler (and much less memory intensive, since we don't have to hold the entire sequence in memory in order to process it).

Updating the search engine

We also have a process that watches the database for member profile changes, transforms profile data into XML, and posts it to the search engine to keep results fresh. Again, a lazy sequence is used to allow us to continually process the 'sequence' of changes from the database and handle 'millions' of profiles in a (relatively) fixed amount of memory.

So, yes, we are constantly processing sequences that either wouldn't fit in memory fully realized or are actually infinite.

Is the processing slower than the procedural equivalent of loops and tests? Quite probably. Is the memory usage better than realizing entire chunks of sequences? Oh yes, and not having to worry about tuning all that is a big simplification. Is the code simpler than the procedural equivalent? Hell, yeah!

Hope that helps?
--
Sean A Corfield -- (904) 302-SEAN
An Architect's View -- http://corfield.org/
World Singles, LLC. -- http://worldsingles.com/

Perfection is the enemy of the good. -- Gustave Flaubert, French realist novelist (1821-1880)
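The paging idea Sean describes can be sketched in a few lines. This is only an illustration, not World Singles' code: the name lazy-search-results comes from his message, but its signature, the fetch-page helper, and the fake index are all assumptions made up for the example.

```clojure
;; Hypothetical stand-in for the search engine: returns one page of
;; results (a vector) for a 1-based page number, or [] past the end.
(defn fetch-page [all-results page page-size]
  (vec (take page-size (drop (* (dec page) page-size) all-results))))

;; Assumed shape of a paging wrapper: flatten the pages into one lazy
;; sequence. A page is requested only when the consumer reaches it.
(defn lazy-search-results
  ([fetch page-size] (lazy-search-results fetch page-size 1))
  ([fetch page-size page]
   (lazy-seq
     (let [results (fetch page page-size)]
       (when (seq results)
         (concat results
                 (lazy-search-results fetch page-size (inc page))))))))

;; Usage: filtering code sees a plain sequence and never mentions pages.
(def fake-index (range 1 26)) ; 25 fake "results"
(def results (lazy-search-results (partial fetch-page fake-index) 10))
(take 5 (filter even? results))
```

The filtering logic only ever calls ordinary sequence functions; all knowledge of page numbers and page sizes stays inside the one wrapper.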
Re: when to be lazy
hipster presentation is not so great in archive: can't really see what he's doing.

On Tuesday, October 23, 2012 1:55:08 PM UTC-7, Jonathan Fischer Friberg wrote:

Just found this: http://www.infoq.com/presentations/Laziness-Good-Bad-Ugly

Jonathan
Re: when to be lazy
No, that's unfortunate. :(

Jonathan

On Wed, Oct 24, 2012 at 12:27 AM, Brian Craft craft.br...@gmail.com wrote:

hipster presentation is not so great in archive: can't really see what he's doing.
Re: Memory usage when iterating over lazy sequences
On Oct 31, 4:58 pm, Mark Triggs [EMAIL PROTECTED] wrote:

On Oct 31, 1:57 pm, Mark Triggs [EMAIL PROTECTED] wrote:

When I ran my code it very quickly ran out of memory and fell over. After thinking about it for a while I've realised it must be because my 'do-something' function call is hanging on to the head of the list, so as its elements are realised and cached it gradually eats up all my memory.

Answering my own question, using the function itself as the recur target does exactly what I want:

    ;; Process the list one chunk at a time
    (defn do-something [biglist]
      (when biglist
        (doall (take 1000 biglist))
        (recur (drop 1000 biglist))))

I guess I should have tried it instead of assuming it wouldn't work. Please excuse my talking to myself :o)

Sorry for the delay. You are right: if you want to do this manually, you have to take care not to retain the head of the list. OTOH, you might want to reconsider doing it manually and leverage the seq functions that already handle this:

    (map process-a-chunk (partition 1000 biglist))

Rich
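A minimal sketch of the seq-function approach Rich suggests, written against more recent Clojure versions than the one this thread used. Two details are worth knowing: partition drops a trailing chunk that is shorter than the chunk size, whereas partition-all keeps it, and map is lazy, so nothing is processed until something consumes the result. The process-a-chunk body here is a placeholder (it just sums the chunk), and chunk-sums and process-all! are names invented for the example.

```clojure
;; Placeholder for the real per-chunk work: sum the chunk.
(defn process-a-chunk [chunk]
  (reduce + chunk))

;; Lazily map over chunks; partition-all keeps the final short chunk.
(defn chunk-sums [chunk-size coll]
  (map process-a-chunk (partition-all chunk-size coll)))

;; When the chunks are processed only for side effects, run! walks them
;; eagerly without building a result sequence, and returns nil.
(defn process-all! [chunk-size coll]
  (run! process-a-chunk (partition-all chunk-size coll)))

;; 2500 items in chunks of 1000: partition-all yields three chunks,
;; plain partition would yield only the first two.
(chunk-sums 1000 (range 2500))
(map process-a-chunk (partition 1000 (range 2500)))
```

Whether the trailing partial chunk matters depends on the data; for "process everything" jobs like the one in this thread, partition-all is usually the safer default.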
Memory usage when iterating over lazy sequences
Hi all,

I'm just getting started with Clojure, and I've got a bit of a beginner's question regarding memory usage of lazy sequences.

I have an array of data that is too big to fit in memory, so I thought I would be clever and process it in manageable-sized chunks. My instinct was to do this with loop/recur, using 'take' to bite off the next chunk and 'drop' to give me the remainder for the next recursive call. Here's a contrived example:

    ;; Process the list one chunk at a time
    (defn do-something [biglist]
      (loop [rest (drop 1000 biglist)]
        (when rest
          (doall (take 1000 rest))
          (recur (drop 1000 rest)))))

    ;; Lazily calculate our big data set and pass it along for processing
    (do (do-something (map (fn [n] (make-array java.lang.Character 10240))
                           (range 0 10)))
        nil)

When I ran my code it very quickly ran out of memory and fell over. After thinking about it for a while I've realised it must be because my 'do-something' function call is hanging on to the head of the list, so as its elements are realised and cached it gradually eats up all my memory.

Assuming my diagnosis is right, is there some sort of idiomatic way of dealing with this sort of issue? I suppose that if the JVM allowed for true tail recursion then this problem wouldn't arise?

Many thanks,

Mark

--
Mark Triggs [EMAIL PROTECTED]
Re: Memory usage when iterating over lazy sequences
On Oct 31, 1:57 pm, Mark Triggs [EMAIL PROTECTED] wrote:

When I ran my code it very quickly ran out of memory and fell over. After thinking about it for a while I've realised it must be because my 'do-something' function call is hanging on to the head of the list, so as its elements are realised and cached it gradually eats up all my memory.

Answering my own question, using the function itself as the recur target does exactly what I want:

    ;; Process the list one chunk at a time
    (defn do-something [biglist]
      (when biglist
        (doall (take 1000 biglist))
        (recur (drop 1000 biglist))))

I guess I should have tried it instead of assuming it wouldn't work. Please excuse my talking to myself :o)

Cheers,

Mark
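For anyone trying Mark's do-something on a current Clojure, here is a hedged adaptation. As far as I understand it, when this thread was written an exhausted sequence came back from drop as nil, so (when biglist ...) terminated. In today's lazy-seq model, drop returns an empty lazy sequence, which is truthy, so the test needs an explicit seq call. The switch from doall to dorun is my own change: the realised chunk is thrown away anyway, so there is no reason to return it.

```clojure
;; Process the list one chunk at a time (adapted for current Clojure).
(defn do-something [biglist]
  (when (seq biglist)                ; nil once the sequence is exhausted
    (dorun (take 1000 biglist))      ; realise this chunk, keep nothing
    (recur (drop 1000 biglist))))    ; rebinds biglist, letting go of the head

;; Walks 2500 items in three chunks and returns nil when done.
(do-something (range 2500))
```

Because recur targets the function itself, the only reference to the sequence is the parameter, and each iteration replaces it with the remainder.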