Re: when to be lazy

Sean Corfield Tue, 23 Oct 2012 12:51:05 -0700

On Tue, Oct 23, 2012 at 11:38 AM, Brian Craft <craft.br...@gmail.com> wrote:
> Is a lazy seq mostly about algorithmic clarity, and avoiding unnecessary
> computation? So far I haven't run into any cases where I wouldn't realize
> the entire sequence, and it's always faster to do it up-front.


Here are a few real-world examples from World Singles (where I work):

Search engine results

We use a search engine that returns "pages" of results. We provide the
criteria, page number and page size, and get back that "page" of
results from the overall result set. We have a process that looks thru
search results and discards matches a member has already seen recently
and various other filters. It would be messy to have to write all of
that paging logic into the filtering logic so we have a
lazy-search-results function that hides the paging and turns the
result set into a flat, lazy sequence. That's the only place that has
to deal with paging complexity. The rest of the algorithm is much,
much simpler since it can now operate on a plain ol' Clojure sequence
of search results. Huge win for simplicity.
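
To illustrate the shape of such a wrapper, here is a minimal sketch. It is
not the World Singles code: fetch-page is a hypothetical stand-in for the
search engine call (here it just pages over 25 fake results), and
fetch-count exists only to show how many pages were actually requested.

```clojure
;; Counts calls to the (fake) search engine, for demonstration only.
(def fetch-count (atom 0))

;; Hypothetical stand-in for the real search engine call: returns one
;; "page" of results as a vector, or an empty vector past the last page.
(defn fetch-page [criteria page page-size]
  (swap! fetch-count inc)
  (vec (take page-size (drop (* page page-size) (range 25)))))

(defn lazy-search-results
  "Flat, lazy sequence over every page of results for criteria.
  The next page is fetched only when the consumer reaches it."
  ([criteria page-size]
   (lazy-search-results criteria page-size 0))
  ([criteria page-size page]
   (lazy-seq
     (let [results (fetch-page criteria page page-size)]
       (when (seq results)
         (concat results
                 (lazy-search-results criteria page-size (inc page))))))))

;; The filtering code sees a plain sequence and knows nothing about pages:
(take 3 (remove odd? (lazy-search-results {:criteria "example"} 10)))
;; => (0 2 4)
```

Because only three results are consumed, only the first page is requested;
later pages are fetched if and when the consumer gets that far.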

Emailing matches to members daily

We have millions of members. We have a process that scours the
database for members who haven't had an email from us recently, which
then looks for different types of matches for them (related to the
process above). After each period of 24 hours, the process restarts
from the beginning. We use a lazy sequence around fetching suitable
members from the database that automatically gets a sentinel inserted
24 hours after we started that period's search. As above, the process
now simply processes a sequence until it hits the sentinel (it's
actually interleaving about fifty sequences and having the sentinel
dynamically inserted in each sequence makes the code simpler than just
hitting the 'end' of a sequence - we tried that first). The number of
members processed in 24 hours depends on how many matches we find, how
far thru each result set we have to look to find matches and so on.
Lazy sequences make this much simpler (and much less memory intensive
since we don't have to hold the entire sequence in memory in order to
process it).
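
A rough sketch of the sentinel idea, under stated assumptions: the names
with-deadline-sentinel, fake-clock and the ::sentinel keyword are invented
for illustration, and the clock is passed in as a function so the example
can run without waiting 24 hours. The real process interleaves many
sequences; this shows just one.

```clojure
(defn with-deadline-sentinel
  "Lazily yields items from coll. As soon as (now-fn) reaches deadline,
  yields ::sentinel and ends, even if coll has more items. If coll runs
  out before the deadline, the sequence simply ends without a sentinel."
  [coll deadline now-fn]
  (lazy-seq
    (if (>= (now-fn) deadline)
      [::sentinel]
      (when-let [s (seq coll)]
        (cons (first s)
              (with-deadline-sentinel (rest s) deadline now-fn))))))

;; Hypothetical clock for the demo: each call returns 1, 2, 3, ...
;; In real use now-fn could be #(System/currentTimeMillis) and deadline
;; the start time plus (* 24 60 60 1000).
(defn fake-clock []
  (let [t (atom 0)]
    #(swap! t inc)))

;; (range) stands in for the endless stream of members from the database.
(with-deadline-sentinel (range) 4 (fake-clock))
;; => (0 1 2 :user/sentinel) when evaluated in the user namespace

;; The consumer just processes items until it sees the sentinel:
(take-while #(not= ::sentinel %)
            (with-deadline-sentinel (range) 4 (fake-clock)))
;; => (0 1 2)
```

The clock is checked once per item as the sequence is consumed, so the
sentinel appears at whatever point the consumer has reached when the
deadline passes.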

Updating the search engine

We also have a process that watches the database for member profile
changes and transforms profile data into XML and posts it to the
search engine, to keep results fresh. Again, a lazy sequence is used
to allow us to continually process the 'sequence' of changes from the
database and handle 'millions' of profiles in a (relatively) fixed
amount of memory.
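
One way this kind of endless change feed could be sketched (all names here
are hypothetical: change-seq, fake-poller and profile->xml are not from our
code base, and the XML is a toy format with no escaping):

```clojure
(defn change-seq
  "Infinite lazy sequence of changed profiles. poll-fn is called for the
  next batch only when the previous batch has been consumed. A real
  poll-fn would probably sleep when there is nothing new, since an empty
  batch makes this poll again immediately."
  [poll-fn]
  (lazy-seq (concat (poll-fn) (change-seq poll-fn))))

;; Toy transformation; real profile data would need proper XML escaping.
(defn profile->xml [{:keys [id name]}]
  (str "<profile id=\"" id "\"><name>" name "</name></profile>"))

;; Hypothetical poller for the demo: each call returns a batch of two
;; "changed" profiles with ids 1 and 2, then 3 and 4, and so on.
(defn fake-poller []
  (let [n (atom 0)]
    (fn []
      (let [k (swap! n inc)]
        [{:id (dec (* 2 k)) :name "a"}
         {:id (* 2 k) :name "b"}]))))

;; doseq walks the sequence without holding on to its head, so profiles
;; that have already been handled can be garbage collected.
(doseq [profile (take 3 (change-seq (fake-poller)))]
  (println (profile->xml profile)))
```

Here println stands in for posting to the search engine, and take 3 is only
there so the example terminates; the real loop would simply keep going.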

So, yes, we are constantly processing sequences that either wouldn't
fit in memory fully realized or are actually infinite. Is the
processing slower than the procedural equivalent of loops and tests?
Quite probably. Is the memory usage better than realizing entire
chunks of sequences? Oh yes, and not having to worry about tuning all
that is a big simplification. Is the code simpler than the procedural
equivalent? Hell, yeah!

Hope that helps?
-- 
Sean A Corfield -- (904) 302-SEAN
An Architect's View -- http://corfield.org/
World Singles, LLC. -- http://worldsingles.com/

"Perfection is the enemy of the good."
-- Gustave Flaubert, French realist novelist (1821-1880)
