subject:"Re\: Insertion \- The clojure way"

Re: Insertion - The clojure way

2010-12-29 Thread Ken Wesson

On Wed, Dec 29, 2010 at 11:14 AM, Andreas Kostler
 wrote:
> Hi all,
> I've implemented a recursive version of insertion sort as a bit of an warm up 
> with clojure. For some reason it just doesn't "feel" right. The recursion 
> feels forced on poor old insertion sort. This might be inherent to insertion 
> sort.
> My poor clojure skills are more likely to blame, though.
> Can I please get some feedback on how to address the problem in a more 
> idiomatic way?
>
> ; Inserts element into a sorted vector
> (defn insert-into [sorted-vec element]
>        (loop [i (count sorted-vec)]
>                (if (and (> i 0) (> (nth sorted-vec (dec i)) element))
>                (recur (dec i))
>                (into (conj (subvec sorted-vec 0 i) element) (subvec 
> sorted-vec i)
>
>
> (defn insertion-sort [vector]
>  (loop [i 0
>        _vec vector]
>        (if (< i (count _vec))
>                (recur (inc i)
>                        (into (insert-into (subvec _vec 0 i) (nth _vec i)) 
> (subvec _vec (inc i
>                _vec)))
>
> Kind Regards
> Andreas

Vectors are actually rather poorly designed for that sort of thing
(insertion in the middle) and loop is generally to be resorted to only
if there isn't a higher-order function that will do the job. How
about:

(defn insert-into [s x]
  (let [[low high] (split-with #(< % x) s)]
(concat low [x] high)))

(defn insertion-sort [s]
  (reduce insert-into [] s))

Here, split-with and reduce are the HOFs that obviate the need for loop.

-- 
You received this message because you are subscribed to the Google
Groups "Clojure" group.
To post to this group, send email to clojure@googlegroups.com
Note that posts from new members are moderated - please be patient with your 
first post.
To unsubscribe from this group, send email to
clojure+unsubscr...@googlegroups.com
For more options, visit this group at
http://groups.google.com/group/clojure?hl=en

Re: Insertion - The clojure way

2010-12-29 Thread Laurent PETIT

2010/12/29 Ken Wesson 

> On Wed, Dec 29, 2010 at 11:14 AM, Andreas Kostler
>  wrote:
> > Hi all,
> > I've implemented a recursive version of insertion sort as a bit of an
> warm up with clojure. For some reason it just doesn't "feel" right. The
> recursion feels forced on poor old insertion sort. This might be inherent to
> insertion sort.
> > My poor clojure skills are more likely to blame, though.
> > Can I please get some feedback on how to address the problem in a more
> idiomatic way?
> >
> > ; Inserts element into a sorted vector
> > (defn insert-into [sorted-vec element]
> >(loop [i (count sorted-vec)]
> >(if (and (> i 0) (> (nth sorted-vec (dec i)) element))
> >(recur (dec i))
> >(into (conj (subvec sorted-vec 0 i) element) (subvec
> sorted-vec i)
> >
> >
> > (defn insertion-sort [vector]
> >  (loop [i 0
> >_vec vector]
> >(if (< i (count _vec))
> >(recur (inc i)
> >(into (insert-into (subvec _vec 0 i) (nth _vec i))
> (subvec _vec (inc i
> >_vec)))
> >
> > Kind Regards
> > Andreas
>
> Vectors are actually rather poorly designed for that sort of thing
> (insertion in the middle) and loop is generally to be resorted to only
> if there isn't a higher-order function that will do the job. How
> about:
>
> (defn insert-into [s x]
>  (let [[low high] (split-with #(< % x) s)]
>(concat low [x] high)))
>
> (defn insertion-sort [s]
>  (reduce insert-into [] s))
>

Hello, just a little 0.2€ : insert-into will return seqs, so the reduce
could read :
(reduce insert-into () s)
to make it clear that it's seqs end to end inside insertion-sort


>
> Here, split-with and reduce are the HOFs that obviate the need for loop.
>
> --
> You received this message because you are subscribed to the Google
> Groups "Clojure" group.
> To post to this group, send email to clojure@googlegroups.com
> Note that posts from new members are moderated - please be patient with
> your first post.
> To unsubscribe from this group, send email to
> clojure+unsubscr...@googlegroups.com
> For more options, visit this group at
> http://groups.google.com/group/clojure?hl=en
>

-- 
You received this message because you are subscribed to the Google
Groups "Clojure" group.
To post to this group, send email to clojure@googlegroups.com
Note that posts from new members are moderated - please be patient with your 
first post.
To unsubscribe from this group, send email to
clojure+unsubscr...@googlegroups.com
For more options, visit this group at
http://groups.google.com/group/clojure?hl=en

Re: Insertion - The clojure way

2010-12-29 Thread Ken Wesson

On Wed, Dec 29, 2010 at 11:42 AM, Laurent PETIT  wrote:
> 2010/12/29 Ken Wesson 
>> (defn insert-into [s x]
>>  (let [[low high] (split-with #(< % x) s)]
>>    (concat low [x] high)))
>>
>> (defn insertion-sort [s]
>>  (reduce insert-into [] s))
>
> Hello, just a little 0.2€ : insert-into will return seqs, so the reduce
> could read :
> (reduce insert-into () s)
> to make it clear that it's seqs end to end inside insertion-sort

It's my habit to use a vector for any "seq literal".

How about I split the difference?

(reduce insert-into nil s)

seems to work just as well. :)

-- 
You received this message because you are subscribed to the Google
Groups "Clojure" group.
To post to this group, send email to clojure@googlegroups.com
Note that posts from new members are moderated - please be patient with your 
first post.
To unsubscribe from this group, send email to
clojure+unsubscr...@googlegroups.com
For more options, visit this group at
http://groups.google.com/group/clojure?hl=en

Re: Insertion - The clojure way

2010-12-29 Thread Laurent PETIT

2010/12/29 Ken Wesson 

> On Wed, Dec 29, 2010 at 11:42 AM, Laurent PETIT 
> wrote:
> > 2010/12/29 Ken Wesson 
> >> (defn insert-into [s x]
> >>  (let [[low high] (split-with #(< % x) s)]
> >>(concat low [x] high)))
> >>
> >> (defn insertion-sort [s]
> >>  (reduce insert-into [] s))
> >
> > Hello, just a little 0.2€ : insert-into will return seqs, so the
> reduce
> > could read :
> > (reduce insert-into () s)
> > to make it clear that it's seqs end to end inside insertion-sort
>
> It's my habit to use a vector for any "seq literal".
>
> How about I split the difference?
>
> (reduce insert-into nil s)
>
> seems to work just as well. :)
>

Yeah, not a big deal, it's just that by just reading the line (reduce
insert-into [] s), I see the initial "collected value" is a vector, and I
make an assumption about insert-into to return vectors as well.

-- 
You received this message because you are subscribed to the Google
Groups "Clojure" group.
To post to this group, send email to clojure@googlegroups.com
Note that posts from new members are moderated - please be patient with your 
first post.
To unsubscribe from this group, send email to
clojure+unsubscr...@googlegroups.com
For more options, visit this group at
http://groups.google.com/group/clojure?hl=en

Re: Insertion - The clojure way

2010-12-29 Thread Ken Wesson

On Wed, Dec 29, 2010 at 11:55 AM, Laurent PETIT  wrote:
>
>
> 2010/12/29 Ken Wesson 
>>
>> On Wed, Dec 29, 2010 at 11:42 AM, Laurent PETIT 
>> wrote:
>> > 2010/12/29 Ken Wesson 
>> >> (defn insert-into [s x]
>> >>  (let [[low high] (split-with #(< % x) s)]
>> >>    (concat low [x] high)))
>> >>
>> >> (defn insertion-sort [s]
>> >>  (reduce insert-into [] s))
>> >
>> > Hello, just a little 0.2€ : insert-into will return seqs, so the
>> > reduce
>> > could read :
>> > (reduce insert-into () s)
>> > to make it clear that it's seqs end to end inside insertion-sort
>>
>> It's my habit to use a vector for any "seq literal".
>>
>> How about I split the difference?
>>
>> (reduce insert-into nil s)
>>
>> seems to work just as well. :)
>
> Yeah, not a big deal, it's just that by just reading the line (reduce
> insert-into [] s), I see the initial "collected value" is a vector, and I
> make an assumption about insert-into to return vectors as well.

Eh? I don't. I tend to only assume "it takes seqables" if I see vector
literals going into something. If it's a reduction, I presume the
function being reduced to return seqables as well.

My preference here is for the empty vector:

* The empty list () is odd. In Clojure usually parentheses wrap an
  executable expression, not just data.
* Quoting it -- '() -- is just plain ugly.
* The value nil can generally stand in for an empty seq, but has
  other unrelated meanings as well.
* The empty vector is, explicitly, a zero-length seqable piece of
  inert data.
* So is the empty set #{}, but it's commonplace to use vector
  literals as "seq literals", so an empty vector is less confusing.
  My objection to the empty set is the same as yours to the empty
  vector.
* That goes double for the empty map {}.

-- 
You received this message because you are subscribed to the Google
Groups "Clojure" group.
To post to this group, send email to clojure@googlegroups.com
Note that posts from new members are moderated - please be patient with your 
first post.
To unsubscribe from this group, send email to
clojure+unsubscr...@googlegroups.com
For more options, visit this group at
http://groups.google.com/group/clojure?hl=en

Re: Insertion - The clojure way

2010-12-29 Thread Laurent PETIT

2010/12/29 Ken Wesson 

> On Wed, Dec 29, 2010 at 11:55 AM, Laurent PETIT 
> wrote:
> >
> >
> > 2010/12/29 Ken Wesson 
> >>
> >> On Wed, Dec 29, 2010 at 11:42 AM, Laurent PETIT <
> laurent.pe...@gmail.com>
> >> wrote:
> >> > 2010/12/29 Ken Wesson 
> >> >> (defn insert-into [s x]
> >> >>  (let [[low high] (split-with #(< % x) s)]
> >> >>(concat low [x] high)))
> >> >>
> >> >> (defn insertion-sort [s]
> >> >>  (reduce insert-into [] s))
> >> >
> >> > Hello, just a little 0.2€ : insert-into will return seqs, so the
> >> > reduce
> >> > could read :
> >> > (reduce insert-into () s)
> >> > to make it clear that it's seqs end to end inside insertion-sort
> >>
> >> It's my habit to use a vector for any "seq literal".
> >>
> >> How about I split the difference?
> >>
> >> (reduce insert-into nil s)
> >>
> >> seems to work just as well. :)
> >
> > Yeah, not a big deal, it's just that by just reading the line (reduce
> > insert-into [] s), I see the initial "collected value" is a vector, and I
> > make an assumption about insert-into to return vectors as well.
>
> Eh? I don't. I tend to only assume "it takes seqables" if I see vector
> literals going into something. If it's a reduction, I presume the
> function being reduced to return seqables as well.
>
> My preference here is for the empty vector:
>
> * The empty list () is odd. In Clojure usually parentheses wrap an
>  executable expression, not just data.
> * Quoting it -- '() -- is just plain ugly.
> * The value nil can generally stand in for an empty seq, but has
>  other unrelated meanings as well.
> * The empty vector is, explicitly, a zero-length seqable piece of
>  inert data.
> * So is the empty set #{}, but it's commonplace to use vector
>  literals as "seq literals", so an empty vector is less confusing.
>  My objection to the empty set is the same as yours to the empty
>  vector.
> * That goes double for the empty map {}.
>

To the risk of repeating myself and not having totally understood your above
explanation, worded differently :

when I see (reduce insert-into [] s) with [] in the 'val' position, I use
this as a hint on what the resulting val returned by reduce will be.

If we're ok with the above reasoning, then I find that [] implies the result
will be a datastructure (and a vector datastructure, to be even more precise
IMHO).
While I see the () value as more "neutral" (though not ideal as you said,
but still better than [] in the particular case).

-- 
You received this message because you are subscribed to the Google
Groups "Clojure" group.
To post to this group, send email to clojure@googlegroups.com
Note that posts from new members are moderated - please be patient with your 
first post.
To unsubscribe from this group, send email to
clojure+unsubscr...@googlegroups.com
For more options, visit this group at
http://groups.google.com/group/clojure?hl=en

Re: Insertion - The clojure way

2010-12-29 Thread Mark Engelberg

I recommend reading:
http://htdp.org/2003-09-26/Book/curriculum-Z-H-16.html#node_sec_12.2
for a detailed discussion of how to design and write insertion sort in
a functional language, using linked lists and recursion.

One caveat:  Directly translating that code to Clojure will be
problematic because Clojure uses Java's stack, which overflows quite
easily when using recursion in anything except Clojure's loop/recur
construct.  The version of Scheme used for that textbook, on the other
hand, uses a clever technique to swap out the stack to memory when
needed, in order to simulate a stack that's essentially unbounded
except by the size of your computer's memory.  But it's well worth
learning how adjust code like that to get it to run in Clojure,
because it's something you'll need to do frequently in order to adapt
code written in other functional languages.

-- 
You received this message because you are subscribed to the Google
Groups "Clojure" group.
To post to this group, send email to clojure@googlegroups.com
Note that posts from new members are moderated - please be patient with your 
first post.
To unsubscribe from this group, send email to
clojure+unsubscr...@googlegroups.com
For more options, visit this group at
http://groups.google.com/group/clojure?hl=en

Re: Insertion - The clojure way

2010-12-29 Thread Ken Wesson

On Wed, Dec 29, 2010 at 12:39 PM, Mark Engelberg
 wrote:
> I recommend reading:
> http://htdp.org/2003-09-26/Book/curriculum-Z-H-16.html#node_sec_12.2
> for a detailed discussion of how to design and write insertion sort in
> a functional language, using linked lists and recursion.
>
> One caveat:  Directly translating that code to Clojure will be
> problematic because Clojure uses Java's stack, which overflows quite
> easily when using recursion in anything except Clojure's loop/recur
> construct.  The version of Scheme used for that textbook, on the other
> hand, uses a clever technique to swap out the stack to memory when
> needed, in order to simulate a stack that's essentially unbounded
> except by the size of your computer's memory.  But it's well worth
> learning how adjust code like that to get it to run in Clojure,
> because it's something you'll need to do frequently in order to adapt
> code written in other functional languages.

Who needs to muck about with the stack and recur when you've got laziness? :)

Actually I concur that it could be useful. But laziness is sometimes a
superior alternative to an eager, stack-consuming algorithm and
Clojure makes it fairly easy.

-- 
You received this message because you are subscribed to the Google
Groups "Clojure" group.
To post to this group, send email to clojure@googlegroups.com
Note that posts from new members are moderated - please be patient with your 
first post.
To unsubscribe from this group, send email to
clojure+unsubscr...@googlegroups.com
For more options, visit this group at
http://groups.google.com/group/clojure?hl=en

Re: Insertion - The clojure way

2010-12-29 Thread Meikel Brandmeyer

Hi,

Am 29.12.2010 um 19:36 schrieb Ken Wesson:

> Who needs to muck about with the stack and recur when you've got laziness? :)

Even then you have to take care, because stacking lazy seq on lazy seq on … 
might also result in a stackoverflow.

Sincerely
Meikel

-- 
You received this message because you are subscribed to the Google
Groups "Clojure" group.
To post to this group, send email to clojure@googlegroups.com
Note that posts from new members are moderated - please be patient with your 
first post.
To unsubscribe from this group, send email to
clojure+unsubscr...@googlegroups.com
For more options, visit this group at
http://groups.google.com/group/clojure?hl=en

Re: Insertion - The clojure way

2010-12-29 Thread Andreas Kostler


On 30/12/2010, at 2:39 AM, Mark Engelberg wrote:

> I recommend reading:
> http://htdp.org/2003-09-26/Book/curriculum-Z-H-16.html#node_sec_12.2
> for a detailed discussion of how to design and write insertion sort in
> a functional language, using linked lists and recursion.
That's what I was looking for. However, Ken's version was a real eye opener :) 
Thanks




-- 
You received this message because you are subscribed to the Google
Groups "Clojure" group.
To post to this group, send email to clojure@googlegroups.com
Note that posts from new members are moderated - please be patient with your 
first post.
To unsubscribe from this group, send email to
clojure+unsubscr...@googlegroups.com
For more options, visit this group at
http://groups.google.com/group/clojure?hl=en

Re: Insertion - The clojure way

2010-12-29 Thread Meikel Brandmeyer

Hi,

as Ken said vectors are suboptimal for insertion. You might want to look at 
finger trees, which support that better, IIRC.

Sincerely
Meikel

-- 
You received this message because you are subscribed to the Google
Groups "Clojure" group.
To post to this group, send email to clojure@googlegroups.com
Note that posts from new members are moderated - please be patient with your 
first post.
To unsubscribe from this group, send email to
clojure+unsubscr...@googlegroups.com
For more options, visit this group at
http://groups.google.com/group/clojure?hl=en

Re: Insertion - The clojure way

2010-12-29 Thread Ken Wesson

On Wed, Dec 29, 2010 at 4:28 PM, Meikel Brandmeyer  wrote:
> Hi,
>
> Am 29.12.2010 um 19:36 schrieb Ken Wesson:
>
>> Who needs to muck about with the stack and recur when you've got laziness? :)
>
> Even then you have to take care, because stacking lazy seq on lazy seq on … 
> might also result in a stackoverflow.

True, but it's far less common to have lazy operations like maps
nested 100,000 deep than it is to have seqs 100,000 items long. :)

-- 
You received this message because you are subscribed to the Google
Groups "Clojure" group.
To post to this group, send email to clojure@googlegroups.com
Note that posts from new members are moderated - please be patient with your 
first post.
To unsubscribe from this group, send email to
clojure+unsubscr...@googlegroups.com
For more options, visit this group at
http://groups.google.com/group/clojure?hl=en

Re: Insertion - The clojure way

2010-12-29 Thread David Nolen

On Wed, Dec 29, 2010 at 5:29 PM, Ken Wesson  wrote:

> On Wed, Dec 29, 2010 at 4:28 PM, Meikel Brandmeyer  wrote:
> > Hi,
> >
> > Am 29.12.2010 um 19:36 schrieb Ken Wesson:
> >
> >> Who needs to muck about with the stack and recur when you've got
> laziness? :)
> >
> > Even then you have to take care, because stacking lazy seq on lazy seq on
> … might also result in a stackoverflow.
>
> True, but it's far less common to have lazy operations like maps
> nested 100,000 deep than it is to have seqs 100,000 items long. :)


You only need to nest lazy operations 1000 levels deep before you blow the
stack w/ default JVM settings (on OS X, I imagine it's similar for other
systems). This is pretty easy to do if you're not careful.

David

-- 
You received this message because you are subscribed to the Google
Groups "Clojure" group.
To post to this group, send email to clojure@googlegroups.com
Note that posts from new members are moderated - please be patient with your 
first post.
To unsubscribe from this group, send email to
clojure+unsubscr...@googlegroups.com
For more options, visit this group at
http://groups.google.com/group/clojure?hl=en

Re: Insertion - The clojure way

2010-12-29 Thread Ken Wesson

On Wed, Dec 29, 2010 at 5:45 PM, David Nolen  wrote:
> On Wed, Dec 29, 2010 at 5:29 PM, Ken Wesson  wrote:
>>
>> On Wed, Dec 29, 2010 at 4:28 PM, Meikel Brandmeyer  wrote:
>> > Hi,
>> >
>> > Am 29.12.2010 um 19:36 schrieb Ken Wesson:
>> >
>> >> Who needs to muck about with the stack and recur when you've got
>> >> laziness? :)
>> >
>> > Even then you have to take care, because stacking lazy seq on lazy seq
>> > on … might also result in a stackoverflow.
>>
>> True, but it's far less common to have lazy operations like maps
>> nested 100,000 deep than it is to have seqs 100,000 items long. :)
>
> You only need to nest lazy operations 1000 levels deep before you blow the
> stack w/ default JVM settings (on OS X, I imagine it's similar for other
> systems).

Eww. Apple's JVM sucketh greatly. I'd be surprised if the default
maximum stack depth isn't at *least* 32,768 on Sun's Hotspot, given
how rarely I get SOEs outside of actually unbounded recursions (and
how blasted long the resulting traces are when they're dump into my
repl! So much for using backscroll afterward).

Of course, even one thousand nested (map f (map g (map h ...))) is
fairly unusual. Probably you'd have to have a seq you passed around
and sometimes transformed, repeatedly, to get even that depth. An
occasional (doall ...) thrown in would then fix it by flattening it
every so often (if the seq would fit in main memory all at once, of
course).

I take it there are also -Xfoo flags to the various JVMs (probably
different for each vendor) to tweak the stack depth, if you really
need it for, say, a production webserver. And you can probably find
some tool to package end-user applications as an executable installer
that unpacks a jar and a stub .exe that boots the system JVM (or a
bundled one) with particular options and tweaks, then hands off to a
class in the jar, so desktop developers won't have to rely on end
users knowing how to fiddle with JVM flags to use their product.

-- 
You received this message because you are subscribed to the Google
Groups "Clojure" group.
To post to this group, send email to clojure@googlegroups.com
Note that posts from new members are moderated - please be patient with your 
first post.
To unsubscribe from this group, send email to
clojure+unsubscr...@googlegroups.com
For more options, visit this group at
http://groups.google.com/group/clojure?hl=en

Re: Insertion - The clojure way

2010-12-29 Thread Mark Engelberg

On Wed, Dec 29, 2010 at 1:29 PM, Andreas Kostler
 wrote:
>
> On 30/12/2010, at 2:39 AM, Mark Engelberg wrote:
>
>> I recommend reading:
>> http://htdp.org/2003-09-26/Book/curriculum-Z-H-16.html#node_sec_12.2
>> for a detailed discussion of how to design and write insertion sort in
>> a functional language, using linked lists and recursion.
> That's what I was looking for. However, Ken's version was a real eye opener :)
> Thanks

Andreas,

Unfortunately, Ken's version doesn't work on large lists, and suffers
from the nested-laziness problem that everyone's been talking about.
It's worth studying his code, though, because it's actually a great
illustration of how you have to be super-careful when working with
laziness, and how laziness sometimes makes it very hard to determine
runtime performance characteristics.

Using his code, try (first (insertion-sort (range 1))).  I'm using
Sun's JVM, in -server mode, and this generates a stack overflow.

Aside from that problem, there are a few other details worth
mentioning about his version.  The split-with causes the first part of
the list (up to the insertion point) to be traversed twice, which is
somewhat more inefficient than it needs to be.  BTW, the use of concat
is where the nested laziness problem creeps in.  His clever use of
reduce unfortunately makes this not a "stable sort", because elements
that compare as equal will end up having their orders reversed.
(Doesn't matter if you're only sorting numbers, but could matter if
you're sorting records based on some field).

So again, I encourage you to first start by porting the Htdp version
of to Clojure.  It's always good to start with the clearest, most
elegant "classic" implementation.  Here's how it looks (adapted to
sort in increasing, rather than decreasing order):

(defn insert [n alon]
   (cond
(empty? alon) (cons n nil)
(<= n (first alon)) (cons n alon)
(> n (first alon)) (cons (first alon) (insert n (rest alon)

(defn isort [alon]
  (cond
   (empty? alon) nil
   (seq? alon) (insert (first alon) (isort (rest alon)

This is a fairly literal translation to Clojure.   It works, but in
Clojure, generates a stack overflow on large lists.
Now, the question you should ask yourself is:
What are the minimal changes I can make to this code to eliminate the
stack overflow problem?

This will be a valuable exercise that will help you with all Clojure
code you write later, so make sure you can answer this question before
trying to get all fancy :)  Ask if you need further assistance...

-- 
You received this message because you are subscribed to the Google
Groups "Clojure" group.
To post to this group, send email to clojure@googlegroups.com
Note that posts from new members are moderated - please be patient with your 
first post.
To unsubscribe from this group, send email to
clojure+unsubscr...@googlegroups.com
For more options, visit this group at
http://groups.google.com/group/clojure?hl=en

Re: Insertion - The clojure way

2010-12-29 Thread Ken Wesson

On Wed, Dec 29, 2010 at 6:41 PM, Mark Engelberg
 wrote:
> On Wed, Dec 29, 2010 at 1:29 PM, Andreas Kostler
>  wrote:
>>
>> On 30/12/2010, at 2:39 AM, Mark Engelberg wrote:
>>
>>> I recommend reading:
>>> http://htdp.org/2003-09-26/Book/curriculum-Z-H-16.html#node_sec_12.2
>>> for a detailed discussion of how to design and write insertion sort in
>>> a functional language, using linked lists and recursion.
>> That's what I was looking for. However, Ken's version was a real eye opener 
>> :)
>> Thanks
>
> Andreas,
>
> Unfortunately, Ken's version doesn't work on large lists, and suffers
> from the nested-laziness problem that everyone's been talking about.
> It's worth studying his code, though, because it's actually a great
> illustration of how you have to be super-careful when working with
> laziness, and how laziness sometimes makes it very hard to determine
> runtime performance characteristics.
>
> Using his code, try (first (insertion-sort (range 1))).  I'm using
> Sun's JVM, in -server mode, and this generates a stack overflow.
>
> Aside from that problem, there are a few other details worth
> mentioning about his version.  The split-with causes the first part of
> the list (up to the insertion point) to be traversed twice, which is
> somewhat more inefficient than it needs to be.  BTW, the use of concat
> is where the nested laziness problem creeps in.  His clever use of
> reduce unfortunately makes this not a "stable sort", because elements
> that compare as equal will end up having their orders reversed.
> (Doesn't matter if you're only sorting numbers, but could matter if
> you're sorting records based on some field).
>
> So again, I encourage you to first start by porting the Htdp version
> of to Clojure.  It's always good to start with the clearest, most
> elegant "classic" implementation.  Here's how it looks (adapted to
> sort in increasing, rather than decreasing order):
>
> (defn insert [n alon]
>   (cond
>    (empty? alon) (cons n nil)
>    (<= n (first alon)) (cons n alon)
>    (> n (first alon)) (cons (first alon) (insert n (rest alon)
>
> (defn isort [alon]
>  (cond
>   (empty? alon) nil
>   (seq? alon) (insert (first alon) (isort (rest alon)
>
> This is a fairly literal translation to Clojure.   It works, but in
> Clojure, generates a stack overflow on large lists.
> Now, the question you should ask yourself is:
> What are the minimal changes I can make to this code to eliminate the
> stack overflow problem?
>
> This will be a valuable exercise that will help you with all Clojure
> code you write later, so make sure you can answer this question before
> trying to get all fancy :)  Ask if you need further assistance...

I was demonstrating the most concise, idiomatic expression of the
algorithm, rather than the most useful for, say, large lists. As for
making it stable, changing (reduce insert-into [] s) to (reduce
insert-into [] (reverse s)) should suffice, though it will traverse s
one more time. (And if split-with traverses the input list twice,
that's an implementation issue with split-with. Perhaps it could be
rewritten to be more efficient.)

For production use I'd just use the existing sorted-foo structures,
the java.util sort, or implement quicksort or mergesort, but if I had
to use insertion sort, something like this would probably be
moderately efficient and work on large inputs:

(defn insert-into [s x]
  (loop [s (seq s) o [] xin? false]
(if s
  (let [y (first s)]
(if xin?
  (recur (next s) (conj o y) true)
  (if (< y x)
(recur (next s) (conj o y) false)
(recur (next s) (conj (conj o x) y) true
  (if xin? o (conj o x)

(defn insertion-sort [s]
  (reduce insert-into [] (reverse s)))

This is a stable sort and works for large inputs, though it's
quadratic. (Bubble sort can probably be made faster since it can take
advantage of transient.) Memory use stays bounded and fairly low.

user=> (def q (Integer. 280))
#'user/q
user=> (def r (Integer. 280))
#'user/r
user=> (identical? q r)
false   ; so, we can tell these two apart
user=> (insertion-sort [3 17 360 q 23 r 42])
[3 17 23 42 280 280 360]
user=> (identical? (nth (insertion-sort [3 17 360 q 23 r 42]) 4) q)
true; the sort is stable
user=> (nth (insertion-sort (reverse (range 10))) 77512)
77512   ; 15 whole minutes after being submitted, natch
; but, no stack overflow

-- 
You received this message because you are subscribed to the Google
Groups "Clojure" group.
To post to this group, send email to clojure@googlegroups.com
Note that posts from new members are moderated - please be patient with your 
first post.
To unsubscribe from this group, send email to
clojure+unsubscr...@googlegroups.com
For more options, visit this group at
http://groups.google.com/group/clojure?hl=en

Re: Insertion - The clojure way

2010-12-29 Thread Mark Engelberg

Ken, I agree with your analysis, and of course, you are right that
this is only interesting as an exercise, because obviously there are
better ways to sort.

I agree:

1.  reducing with the reverse of the list is one way to get a stable
sort.  Another possibility is to alter the insert method so that an
equal item gets inserted at the end of a batch of equal items.

2.  Your new insertion function is a reasonable way to get rid of the
laziness that breaks the function for large inputs.
Another, slightly clearer and faster way:
(defn insert [n alon]
  (loop [alon alon, result []]
(cond
 (empty? alon) (conj result n)
 (<= n (first alon)) (into result (cons n alon))
 :else (recur (rest alon) (conj result (first alon))

(Also, as you point out, the above could be further sped up with transients).

However, it should be pointed out that even this latest version
doesn't share all the performance characteristics of the classic
functional insertion sort given at the Htdp link.  Specifically, one
of the (few) great things about insertion sort is that it is blazingly
fast on input lists that are already sorted, or nearly sorted.  As a
challenge (to you or Andreas, or anyone who is interested in this
thread), figure out why the original insertion sort works so quickly
on sorted lists, why the above code does not, and then try to
duplicate that performance characteristic with an insertion sort
algorithm in Clojure.

One lesson here:  Sometimes Clojure/Java's stack limitations hurt, a
lot.  It's a lot harder to transform this algorithm than it ideally
should be.

-- 
You received this message because you are subscribed to the Google
Groups "Clojure" group.
To post to this group, send email to clojure@googlegroups.com
Note that posts from new members are moderated - please be patient with your 
first post.
To unsubscribe from this group, send email to
clojure+unsubscr...@googlegroups.com
For more options, visit this group at
http://groups.google.com/group/clojure?hl=en

Re: Insertion - The clojure way

2010-12-29 Thread David Nolen

On Wed, Dec 29, 2010 at 6:09 PM, Ken Wesson  wrote:

> On Wed, Dec 29, 2010 at 5:45 PM, David Nolen 
> wrote:
> > On Wed, Dec 29, 2010 at 5:29 PM, Ken Wesson  wrote:
> >>
> >> On Wed, Dec 29, 2010 at 4:28 PM, Meikel Brandmeyer  wrote:
> >> > Hi,
> >> >
> >> > Am 29.12.2010 um 19:36 schrieb Ken Wesson:
> >> >
> >> >> Who needs to muck about with the stack and recur when you've got
> >> >> laziness? :)
> >> >
> >> > Even then you have to take care, because stacking lazy seq on lazy seq
> >> > on … might also result in a stackoverflow.
> >>
> >> True, but it's far less common to have lazy operations like maps
> >> nested 100,000 deep than it is to have seqs 100,000 items long. :)
> >
> > You only need to nest lazy operations 1000 levels deep before you blow
> the
> > stack w/ default JVM settings (on OS X, I imagine it's similar for other
> > systems).
>
> Eww. Apple's JVM sucketh greatly. I'd be surprised if the default
> maximum stack depth isn't at *least* 32,768 on Sun's Hotspot, given
> how rarely I get SOEs outside of actually unbounded recursions (and
> how blasted long the resulting traces are when they're dump into my
> repl! So much for using backscroll afterward).
>

Doh, actually, I might have been messing around with my JVM stack size
recently. I see that the default stack depth on OS X is more like ~43000 :)

David

-- 
You received this message because you are subscribed to the Google
Groups "Clojure" group.
To post to this group, send email to clojure@googlegroups.com
Note that posts from new members are moderated - please be patient with your 
first post.
To unsubscribe from this group, send email to
clojure+unsubscr...@googlegroups.com
For more options, visit this group at
http://groups.google.com/group/clojure?hl=en

Re: Insertion - The clojure way

2010-12-29 Thread Ken Wesson

On Wed, Dec 29, 2010 at 8:13 PM, Mark Engelberg
 wrote:
> 2.  Your new insertion function is a reasonable way to get rid of the
> laziness that breaks the function for large inputs.
> Another, slightly clearer and faster way:
> (defn insert [n alon]
>  (loop [alon alon, result []]
>    (cond
>     (empty? alon) (conj result n)
>     (<= n (first alon)) (into result (cons n alon))
>     :else (recur (rest alon) (conj result (first alon))

I don't agree that this is necessarily clearer; shorter, yes, but
that's not always the same thing.

> (Also, as you point out, the above could be further sped up with transients).

Can conj be sped up with transients, without some hint as to the
vector's eventual size? I know repeated assoces on a vector are
significantly sped up by using transients.

> However, it should be pointed out that even this latest version
> doesn't share all the performance characteristics of the classic
> functional insertion sort given at the Htdp link.  Specifically, one
> of the (few) great things about insertion sort is that it is blazingly
> fast on input lists that are already sorted, or nearly sorted.  As a
> challenge (to you or Andreas, or anyone who is interested in this
> thread), figure out why the original insertion sort works so quickly
> on sorted lists, why the above code does not, and then try to
> duplicate that performance characteristic with an insertion sort
> algorithm in Clojure.

I'd think the key thing to making it performant for almost-sorted data
is to have a bidirectional linked list with a cursor in it; when
adding a new element you'd move from the current position left or
right until the comparison sign changed and then splice a new node in.
With sorted data it would boil down to just appending nodes at the end
of the list in O(n) and with almost-sorted data there'd be the
occasional excursion back into the list, then forwards again to the
tail.

A big difficulty with this in Clojure is that its seqs don't support
efficient backward traversal while its vectors don't support efficient
in-place insertion. Using a mutable java.util.LinkedList as a
"transient" inside the insertion-sort function makes it easy enough to
do this; the insertion operation starts with a ListIterator, sees if
the element to add is greater-equal or lesser, and then starts walking
the Iterator one way until the sign changes or one end of the list is
hit, then inserts there. Obviously the innards are not functional, but
if the LinkedList is copied into a Clojure list or vector before being
returned, the sort itself can be functional.

In theory this form of insertion sort could also be parallelized; the
concurrent operations on the linked list are all insertions. For this
you'd want to use Clojure refs holding explicit list nodes rather than
a java.util collection. I might try knocking something like this
together later. (Of course, mergesort is eminently parallelizable
after the first pivot is chosen and before the final merge. The final
merge might be, hypothetically, also parallelized with 2 threads
working toward the middle from each end, as well, given again a
bidirectional linked list as the underlying data structure. Naturally,
larger numbers of threads than 2 could only be involved after the
first several pivot selections and before the last several merges.)

-- 
You received this message because you are subscribed to the Google
Groups "Clojure" group.
To post to this group, send email to clojure@googlegroups.com
Note that posts from new members are moderated - please be patient with your 
first post.
To unsubscribe from this group, send email to
clojure+unsubscr...@googlegroups.com
For more options, visit this group at
http://groups.google.com/group/clojure?hl=en

Re: Insertion - The clojure way

2010-12-29 Thread David Nolen

On Wed, Dec 29, 2010 at 8:13 PM, Mark Engelberg wrote:

> One lesson here:  Sometimes Clojure/Java's stack limitations hurt, a
> lot.  It's a lot harder to transform this algorithm than it ideally
> should be.


To be clear, the HTDP implementation of insert sort is stack-consuming in
Scheme/Racket as well. For large enough inputs, Scheme/Racket will also
barf.

David

-- 
You received this message because you are subscribed to the Google
Groups "Clojure" group.
To post to this group, send email to clojure@googlegroups.com
Note that posts from new members are moderated - please be patient with your 
first post.
To unsubscribe from this group, send email to
clojure+unsubscr...@googlegroups.com
For more options, visit this group at
http://groups.google.com/group/clojure?hl=en

Re: Insertion - The clojure way

2010-12-29 Thread Ken Wesson

On Wed, Dec 29, 2010 at 9:12 PM, David Nolen  wrote:
> On Wed, Dec 29, 2010 at 6:09 PM, Ken Wesson  wrote:
>>
>> On Wed, Dec 29, 2010 at 5:45 PM, David Nolen 
>> wrote:
>> > On Wed, Dec 29, 2010 at 5:29 PM, Ken Wesson  wrote:
>> >>
>> >> On Wed, Dec 29, 2010 at 4:28 PM, Meikel Brandmeyer  wrote:
>> >> > Hi,
>> >> >
>> >> > Am 29.12.2010 um 19:36 schrieb Ken Wesson:
>> >> >
>> >> >> Who needs to muck about with the stack and recur when you've got
>> >> >> laziness? :)
>> >> >
>> >> > Even then you have to take care, because stacking lazy seq on lazy
>> >> > seq
>> >> > on … might also result in a stackoverflow.
>> >>
>> >> True, but it's far less common to have lazy operations like maps
>> >> nested 100,000 deep than it is to have seqs 100,000 items long. :)
>> >
>> > You only need to nest lazy operations 1000 levels deep before you blow
>> > the
>> > stack w/ default JVM settings (on OS X, I imagine it's similar for other
>> > systems).
>>
>> Eww. Apple's JVM sucketh greatly. I'd be surprised if the default
>> maximum stack depth isn't at *least* 32,768 on Sun's Hotspot, given
>> how rarely I get SOEs outside of actually unbounded recursions (and
>> how blasted long the resulting traces are when they're dump into my
>> repl! So much for using backscroll afterward).
>
> Doh, actually, I might have been messing around with my JVM stack size
> recently. I see that the default stack depth on OS X is more like ~43000 :)
> David

OK, I take back what I said about Apple's JVM. It sucketh greatly but
for completely different reasons. ;)

-- 
You received this message because you are subscribed to the Google
Groups "Clojure" group.
To post to this group, send email to clojure@googlegroups.com
Note that posts from new members are moderated - please be patient with your 
first post.
To unsubscribe from this group, send email to
clojure+unsubscr...@googlegroups.com
For more options, visit this group at
http://groups.google.com/group/clojure?hl=en

Re: Insertion - The clojure way

2010-12-29 Thread Mark Engelberg

On Wed, Dec 29, 2010 at 7:16 PM, David Nolen  wrote:
> On Wed, Dec 29, 2010 at 8:13 PM, Mark Engelberg 
> wrote:
>>
>> One lesson here:  Sometimes Clojure/Java's stack limitations hurt, a
>> lot.  It's a lot harder to transform this algorithm than it ideally
>> should be.
>
> To be clear, the HTDP implementation of insert sort is stack-consuming in
> Scheme/Racket as well. For large enough inputs, Scheme/Racket will also
> barf.
> David

But Racket's stack is only limited by your overall memory.  So if you
have enough memory to hold the list you're sorting and the sorted
list, you most likely have enough memory to hold the stack of
computations necessary to generate the list (since the memory needed
for the stack decreases as the output list is built and consumes
memory).

In other words, this kind of stack-consuming implementation would be a
perfectly practical, useful implementation in Racket (assuming you
were applying the technique to something more practical than insertion
sort:)), whereas in Clojure (and most languages, for that matter), the
stack blows up on very realistic list sizes unless you write it a
different way.

-- 
You received this message because you are subscribed to the Google
Groups "Clojure" group.
To post to this group, send email to clojure@googlegroups.com
Note that posts from new members are moderated - please be patient with your 
first post.
To unsubscribe from this group, send email to
clojure+unsubscr...@googlegroups.com
For more options, visit this group at
http://groups.google.com/group/clojure?hl=en

Re: Insertion - The clojure way

2010-12-29 Thread Mark Engelberg

> On Wed, Dec 29, 2010 at 8:13 PM, Mark Engelberg
>> However, it should be pointed out that even this latest version
>> doesn't share all the performance characteristics of the classic
>> functional insertion sort given at the Htdp link.  Specifically, one
>> of the (few) great things about insertion sort is that it is blazingly
>> fast on input lists that are already sorted, or nearly sorted.  As a
>> challenge (to you or Andreas, or anyone who is interested in this
>> thread), figure out why the original insertion sort works so quickly
>> on sorted lists, why the above code does not, and then try to
>> duplicate that performance characteristic with an insertion sort
>> algorithm in Clojure.

So here's the answer to the puzzle I posed earlier, for anyone who's
still following along.  The reason the other implementation is fast on
already sorted lists is because the insert function only needs to
traverse as far as the insertion point, and then can share the
structure of the remainder of the list.  This is handled transparently
by the recursive process, with the stack accumulating the conses that
need to be done.  To simulate this in Clojure, you need some way to
manually accumulate the list of things you traversed on the way to the
insertion point, and then non-lazily stick them on the front of the
remainder of the list.

Here's one way to do that:
; (append l1 l2) is equivalent to a non-lazy (concat (reverse l1) l2)
(defn append [l1 l2]
  (loop [l1 (seq l1) result l2]
(if l1
  (recur (next l1) (cons (first l1) result))
  result)))

(defn insert [n alon]
  (loop [alon (seq alon), traversed nil]
(cond
 (nil? alon) (append traversed (list n))
 (<= n (first alon)) (append traversed (cons n alon))
 (> n (first alon)) (recur (next alon) (conj traversed (first alon))

Then, do the actual insertion sort either using Ken's reduce technique
(on the reverse of the list if stability is desired), or manually
implementing the equivalent pattern:

(defn isort [alon]
  (loop [alon (seq (reverse alon)), sorted nil]
(if alon
  (recur (next alon) (insert (first alon) sorted))
  sorted)))

This is really fast on sorted lists and as fast as any of the other
versions we've batted around on random data.  I believe this is the
closest Clojurian equivalent to the Racket version.  Sadly, it loses
much of its simplicity.  I have students write insertion sort in
Racket after only a dozen or so lessons, but writing the equivalent in
Clojure requires some significant machinations.  (If anyone sees a
simpler way to accomplish the same thing in Clojure, please let me
know!)

Still, I hope this helps explain how to transform stack-consuming
patterns into Clojure's loop/recur construct.  It can be a bit of
work, but it's necessary in Clojure to implement many functional
algorithms.

-- 
You received this message because you are subscribed to the Google
Groups "Clojure" group.
To post to this group, send email to clojure@googlegroups.com
Note that posts from new members are moderated - please be patient with your 
first post.
To unsubscribe from this group, send email to
clojure+unsubscr...@googlegroups.com
For more options, visit this group at
http://groups.google.com/group/clojure?hl=en

Re: Insertion - The clojure way

2010-12-29 Thread David Nolen

On Wed, Dec 29, 2010 at 10:33 PM, Mark Engelberg 

> But Racket's stack is only limited by your overall memory.  So if you
> have enough memory to hold the list you're sorting and the sorted
> list, you most likely have enough memory to hold the stack of
> computations necessary to generate the list (since the memory needed
> for the stack decreases as the output list is built and consumes
> memory).
>

The original algorithm is simply wrong for very large inputs, it cannot be
TCO'ed. Pick a ridiculously large list size and the Scheme/Racket program
will barf *immediately*.

David

-- 
You received this message because you are subscribed to the Google
Groups "Clojure" group.
To post to this group, send email to clojure@googlegroups.com
Note that posts from new members are moderated - please be patient with your 
first post.
To unsubscribe from this group, send email to
clojure+unsubscr...@googlegroups.com
For more options, visit this group at
http://groups.google.com/group/clojure?hl=en

Re: Insertion - The clojure way

2010-12-30 Thread Andreas Kostler

Hi All,
It is amazing what a discussion inserstion sort (still) can kick off :)
My intention was to play with algorithms in clojure and be made aware of the 
little
traps one has to look out for coming from scheme for instance. Therefore, 
translating the htdp version to clojure 
and playing with it a little more was the intention. 
I have one more question, though. My initial implementation looked like this:
> (defn insert-into [sorted-vec element]
>(loop [i (count sorted-vec)]
>(if (and (> i 0) (> (nth sorted-vec (dec i)) element))
>(recur (dec i))
>(into (conj (subvec sorted-vec 0 i) element) (subvec 
> sorted-vec i)


Now the discussion evolved around vectors not being suitable for this task. 
Can someone explain why this holds in this context? In my understanding
subvec is a constant time operation. So maybe vector is not that unsuitable
for this task after all?


On 30/12/2010, at 2:31 PM, Mark Engelberg wrote:

>> On Wed, Dec 29, 2010 at 8:13 PM, Mark Engelberg
>>> However, it should be pointed out that even this latest version
>>> doesn't share all the performance characteristics of the classic
>>> functional insertion sort given at the Htdp link.  Specifically, one
>>> of the (few) great things about insertion sort is that it is blazingly
>>> fast on input lists that are already sorted, or nearly sorted.  As a
>>> challenge (to you or Andreas, or anyone who is interested in this
>>> thread), figure out why the original insertion sort works so quickly
>>> on sorted lists, why the above code does not, and then try to
>>> duplicate that performance characteristic with an insertion sort
>>> algorithm in Clojure.
> 
> So here's the answer to the puzzle I posed earlier, for anyone who's
> still following along.  The reason the other implementation is fast on
> already sorted lists is because the insert function only needs to
> traverse as far as the insertion point, and then can share the
> structure of the remainder of the list.  This is handled transparently
> by the recursive process, with the stack accumulating the conses that
> need to be done.  To simulate this in Clojure, you need some way to
> manually accumulate the list of things you traversed on the way to the
> insertion point, and then non-lazily stick them on the front of the
> remainder of the list.
> 
> Here's one way to do that:
> ; (append l1 l2) is equivalent to a non-lazy (concat (reverse l1) l2)
> (defn append [l1 l2]
>  (loop [l1 (seq l1) result l2]
>(if l1
>  (recur (next l1) (cons (first l1) result))
>  result)))
> 
> (defn insert [n alon]
>  (loop [alon (seq alon), traversed nil]
>(cond
> (nil? alon) (append traversed (list n))
> (<= n (first alon)) (append traversed (cons n alon))
> (> n (first alon)) (recur (next alon) (conj traversed (first alon))
> 
> Then, do the actual insertion sort either using Ken's reduce technique
> (on the reverse of the list if stability is desired), or manually
> implementing the equivalent pattern:
> 
> (defn isort [alon]
>  (loop [alon (seq (reverse alon)), sorted nil]
>(if alon
>  (recur (next alon) (insert (first alon) sorted))
>  sorted)))
> 
> This is really fast on sorted lists and as fast as any of the other
> versions we've batted around on random data.  I believe this is the
> closest Clojurian equivalent to the Racket version.  Sadly, it loses
> much of its simplicity.  I have students write insertion sort in
> Racket after only a dozen or so lessons, but writing the equivalent in
> Clojure requires some significant machinations.  (If anyone sees a
> simpler way to accomplish the same thing in Clojure, please let me
> know!)
> 
> Still, I hope this helps explain how to transform stack-consuming
> patterns into Clojure's loop/recur construct.  It can be a bit of
> work, but it's necessary in Clojure to implement many functional
> algorithms.
> 
> -- 
> You received this message because you are subscribed to the Google
> Groups "Clojure" group.
> To post to this group, send email to clojure@googlegroups.com
> Note that posts from new members are moderated - please be patient with your 
> first post.
> To unsubscribe from this group, send email to
> clojure+unsubscr...@googlegroups.com
> For more options, visit this group at
> http://groups.google.com/group/clojure?hl=en

--
 “There is a strong correlation between being smart and being a nerd, and an 
even stronger inverse correlation between being a nerd and being popular”
(Paul Graham)
-- 
**
Andreas Koestler, Software Engineer
Leica Geosystems Pty Ltd
270 Gladstone Road, Dutton Park QLD 4102
Main: +61 7 3891 9772 Direct: +61 7 3117 8808
Fax: +61 7 3891 9336
Email: andreas.koest...@leica-geosystems.com

www.leica-geosystems.com*

when it has to be right, Leica Geosystems

Please  consider the environment before printing this email.





-- 
You received

Re: Insertion - The clojure way

2010-12-30 Thread Mark Engelberg

On Thu, Dec 30, 2010 at 12:51 AM, Andreas Kostler
 wrote:
> Now the discussion evolved around vectors not being suitable for this task.
> Can someone explain why this holds in this context? In my understanding
> subvec is a constant time operation. So maybe vector is not that unsuitable
> for this task after all?

I agree.  vector seems like a perfectly reasonable choice, especially
in light of the difficulties that sequences present in implementing
this algorithm.  Your implementation of insert-into looks okay to me.
I was mainly balking at your implementation of insertion-sort, which
still seems a bit strange to me, but you can probably mix something
like Ken's reduce technique with your vector-based insert-into with
good results.

So combining the two, you get something like:
(defn insert-into [sorted-vec element]
   (loop [i (count sorted-vec)]
   (if (and (> i 0) (> (nth sorted-vec (dec i)) element))
   (recur (dec i))
   (into (conj (subvec sorted-vec 0 i) element) (subvec
sorted-vec i)

(defn insertion-sort [s]
 (reduce insert-into [] s))

which is a perfectly reasonable implementation.  By my benchmarking,
it's slightly slower than my complicated sequence-based version, but
it's in the same ballpark.  It's easy to understand and furthermore,
it doesn't blow the stack and it's quick on already sorted lists.

-- 
You received this message because you are subscribed to the Google
Groups "Clojure" group.
To post to this group, send email to clojure@googlegroups.com
Note that posts from new members are moderated - please be patient with your 
first post.
To unsubscribe from this group, send email to
clojure+unsubscr...@googlegroups.com
For more options, visit this group at
http://groups.google.com/group/clojure?hl=en

Re: Insertion - The clojure way

2010-12-30 Thread Mark Engelberg

On Wed, Dec 29, 2010 at 11:31 PM, David Nolen  wrote:
> The original algorithm is simply wrong for very large inputs, it cannot be
> TCO'ed. Pick a ridiculously large list size and the Scheme/Racket program
> will barf *immediately*.
> David

Have you tried it?  I tried this particular algorithm, in Racket, on
rather large lists.  It runs seemingly forever (it is a quadratic
algorithm, of course), but I was unable to make Racket "barf
immediately".  In my experience, Racket's more likely to fall over
from trying to create such a large input to begin with (simply because
there's not enough memory to hold a list that large) than it is to
barf from stack size while executing a non-TCO recursive function over
a list.  My understanding is that Racket's key to handling non-TCO
recursion so gracefully is that it traps errors from stack overflows,
and swaps the stack out to memory to make room on the stack to
continue.  So in essence, you get a stack that is bounded only by your
computer's memory (or whatever memory threshold you've enabled in the
settings).

What size input is barfing for you?

-- 
You received this message because you are subscribed to the Google
Groups "Clojure" group.
To post to this group, send email to clojure@googlegroups.com
Note that posts from new members are moderated - please be patient with your 
first post.
To unsubscribe from this group, send email to
clojure+unsubscr...@googlegroups.com
For more options, visit this group at
http://groups.google.com/group/clojure?hl=en

Re: Insertion - The clojure way

2010-12-30 Thread Andreas Kostler

Oh, vectors are indeed bad since vector concatenation is a linear time 
operation. 
My apologies to Meikel, I simply missed your remark about finger trees. 
I guess a version that swaps the values "in place" would be the way to go. 
On 30/12/2010, at 7:34 PM, Mark Engelberg wrote:

> On Wed, Dec 29, 2010 at 11:31 PM, David Nolen  wrote:
>> The original algorithm is simply wrong for very large inputs, it cannot be
>> TCO'ed. Pick a ridiculously large list size and the Scheme/Racket program
>> will barf *immediately*.
>> David
> 
> Have you tried it?  I tried this particular algorithm, in Racket, on
> rather large lists.  It runs seemingly forever (it is a quadratic
> algorithm, of course), but I was unable to make Racket "barf
> immediately".  In my experience, Racket's more likely to fall over
> from trying to create such a large input to begin with (simply because
> there's not enough memory to hold a list that large) than it is to
> barf from stack size while executing a non-TCO recursive function over
> a list.  My understanding is that Racket's key to handling non-TCO
> recursion so gracefully is that it traps errors from stack overflows,
> and swaps the stack out to memory to make room on the stack to
> continue.  So in essence, you get a stack that is bounded only by your
> computer's memory (or whatever memory threshold you've enabled in the
> settings).
> 
> What size input is barfing for you?
> 
> -- 
> You received this message because you are subscribed to the Google
> Groups "Clojure" group.
> To post to this group, send email to clojure@googlegroups.com
> Note that posts from new members are moderated - please be patient with your 
> first post.
> To unsubscribe from this group, send email to
> clojure+unsubscr...@googlegroups.com
> For more options, visit this group at
> http://groups.google.com/group/clojure?hl=en

--
 “There is a strong correlation between being smart and being a nerd, and an 
even stronger inverse correlation between being a nerd and being popular”
(Paul Graham)
-- 
**
Andreas Koestler, Software Engineer
Leica Geosystems Pty Ltd
270 Gladstone Road, Dutton Park QLD 4102
Main: +61 7 3891 9772 Direct: +61 7 3117 8808
Fax: +61 7 3891 9336
Email: andreas.koest...@leica-geosystems.com

www.leica-geosystems.com*

when it has to be right, Leica Geosystems

Please  consider the environment before printing this email.





-- 
You received this message because you are subscribed to the Google
Groups "Clojure" group.
To post to this group, send email to clojure@googlegroups.com
Note that posts from new members are moderated - please be patient with your 
first post.
To unsubscribe from this group, send email to
clojure+unsubscr...@googlegroups.com
For more options, visit this group at
http://groups.google.com/group/clojure?hl=en

Re: Insertion - The clojure way

2010-12-30 Thread Mark Engelberg

On Thu, Dec 30, 2010 at 3:24 AM, Andreas Kostler
 wrote:
> Oh, vectors are indeed bad since vector concatenation is a linear time 
> operation.
> My apologies to Meikel, I simply missed your remark about finger trees.
> I guess a version that swaps the values "in place" would be the way to go.

When you perform (into v1 v2), the length of the time it takes is
proportional to the length of v2.
This is similar to the amount of work done when doing a list insertion
(must rebuild the portion of the list up to the insertion point, but
not the entire list.

As for finger trees, I have not found them to be useful.  Although
they have good theoretical bounds, they are very slow in practice.
For example, if you try splitting, concatenating, and traversing lists
of, say, 10 elements, these operations on finger trees are still
many times slower than just using a naive take/drop/concat/seq on a
regular list.  How many millions of elements do you need to have for
the so-called "fast" finger tree operations to be worth the overhead?

I think it is highly unlikely that you will see any speed improvement
by converting insertion sort to finger trees, but feel free to try.

[Note: This is just based on my own personal benchmarking, and I would
love to be proven wrong about finger trees],

-- 
You received this message because you are subscribed to the Google
Groups "Clojure" group.
To post to this group, send email to clojure@googlegroups.com
Note that posts from new members are moderated - please be patient with your 
first post.
To unsubscribe from this group, send email to
clojure+unsubscr...@googlegroups.com
For more options, visit this group at
http://groups.google.com/group/clojure?hl=en

Re: Insertion - The clojure way

2010-12-30 Thread Andreas Kostler


On 30/12/2010, at 8:40 PM, Mark Engelberg wrote:

> On Thu, Dec 30, 2010 at 3:24 AM, Andreas Kostler
>  wrote:
>> Oh, vectors are indeed bad since vector concatenation is a linear time 
>> operation.
>> My apologies to Meikel, I simply missed your remark about finger trees.
>> I guess a version that swaps the values "in place" would be the way to go.
> 
> When you perform (into v1 v2), the length of the time it takes is
> proportional to the length of v2.
> This is similar to the amount of work done when doing a list insertion
> (must rebuild the portion of the list up to the insertion point, but
> not the entire list.
Isn't that the problem though? This will make the best case inefficient since
the work performed by into is proportional to the length of v2 (or v1 if you 
swap them around).
Therefore, even thought the list is sorted you still need to traverse v2. A 
simple fix would be
checking (= i 0) but that's kinda ugly. 
> 
> As for finger trees, I have not found them to be useful.  Although
> they have good theoretical bounds, they are very slow in practice.
> For example, if you try splitting, concatenating, and traversing lists
> of, say, 10 elements, these operations on finger trees are still
> many times slower than just using a naive take/drop/concat/seq on a
> regular list.  How many millions of elements do you need to have for
> the so-called "fast" finger tree operations to be worth the overhead?
I have no experience with finger trees what so ever. I only understand Meikels 
comment now :)

> 
> I think it is highly unlikely that you will see any speed improvement
> by converting insertion sort to finger trees, but feel free to try.
> 
> [Note: This is just based on my own personal benchmarking, and I would
> love to be proven wrong about finger trees],
> 
> -- 
> You received this message because you are subscribed to the Google
> Groups "Clojure" group.
> To post to this group, send email to clojure@googlegroups.com
> Note that posts from new members are moderated - please be patient with your 
> first post.
> To unsubscribe from this group, send email to
> clojure+unsubscr...@googlegroups.com
> For more options, visit this group at
> http://groups.google.com/group/clojure?hl=en

--
 “There is a strong correlation between being smart and being a nerd, and an 
even stronger inverse correlation between being a nerd and being popular”
(Paul Graham)
-- 
**
Andreas Koestler, Software Engineer
Leica Geosystems Pty Ltd
270 Gladstone Road, Dutton Park QLD 4102
Main: +61 7 3891 9772 Direct: +61 7 3117 8808
Fax: +61 7 3891 9336
Email: andreas.koest...@leica-geosystems.com

www.leica-geosystems.com*

when it has to be right, Leica Geosystems

Please  consider the environment before printing this email.





-- 
You received this message because you are subscribed to the Google
Groups "Clojure" group.
To post to this group, send email to clojure@googlegroups.com
Note that posts from new members are moderated - please be patient with your 
first post.
To unsubscribe from this group, send email to
clojure+unsubscr...@googlegroups.com
For more options, visit this group at
http://groups.google.com/group/clojure?hl=en

Re: Insertion - The clojure way

2010-12-30 Thread Mark Engelberg

On Wed, Dec 29, 2010 at 9:31 PM, Mark Engelberg
 wrote:
> So here's the answer to the puzzle I posed earlier

I just realized that the "append" function in my code was unnecessary,
since into serves the same purpose.  Thus my solution could be
rewritten as:

(defn insert [n alon]
  (loop [alon (seq alon), traversed nil]
(cond
 (nil? alon) (into (list n) traversed)
 (<= n (first alon)) (into (cons n alon) traversed)
 (> n (first alon)) (recur (next alon) (conj traversed (first alon))

(defn isort [alon]
  (loop [alon (seq (reverse alon)), sorted nil]
(if alon
  (recur (next alon) (insert (first alon) sorted))
  sorted)))

-- 
You received this message because you are subscribed to the Google
Groups "Clojure" group.
To post to this group, send email to clojure@googlegroups.com
Note that posts from new members are moderated - please be patient with your 
first post.
To unsubscribe from this group, send email to
clojure+unsubscr...@googlegroups.com
For more options, visit this group at
http://groups.google.com/group/clojure?hl=en

Re: Insertion - The clojure way

2010-12-30 Thread Mark Engelberg

On Thu, Dec 30, 2010 at 3:49 AM, Andreas Kostler
 wrote:
> Isn't that the problem though? This will make the best case inefficient since
> the work performed by into is proportional to the length of v2 (or v1 if you 
> swap them around).
> Therefore, even thought the list is sorted you still need to traverse v2. A 
> simple fix would be
> checking (= i 0) but that's kinda ugly.

No, v2 in your algorithm is only the portion you've scanned over to
find the insertion (since you scan from the back).  Your algorithm is
fast for sorted lists, I checked.

I think the only way you can beat that (using insertion sort) is if
you have a data structure that allows for both a binary search to
quickly find the insertion point, and then allows for fast insertion
at that point.  So maybe if finger trees can give you both the fast
search and the fast insertion, that might pay off, but I still
speculate it's unlikely to be any faster.

-- 
You received this message because you are subscribed to the Google
Groups "Clojure" group.
To post to this group, send email to clojure@googlegroups.com
Note that posts from new members are moderated - please be patient with your 
first post.
To unsubscribe from this group, send email to
clojure+unsubscr...@googlegroups.com
For more options, visit this group at
http://groups.google.com/group/clojure?hl=en

Re: Insertion - The clojure way

2010-12-30 Thread Andreas Kostler

On 30/12/2010, at 8:54 PM, Mark Engelberg wrote:

> On Thu, Dec 30, 2010 at 3:49 AM, Andreas Kostler
>  wrote:
>> Isn't that the problem though? This will make the best case inefficient since
>> the work performed by into is proportional to the length of v2 (or v1 if you 
>> swap them around).
>> Therefore, even thought the list is sorted you still need to traverse v2. A 
>> simple fix would be
>> checking (= i 0) but that's kinda ugly.
> 
> No, v2 in your algorithm is only the portion you've scanned over to
> find the insertion (since you scan from the back).  Your algorithm is
> fast for sorted lists, I checked.
Oh of course. I apologise. Mark do you mind to lay out how you perform your 
benchmarks.
On my system (insertion-sort (into [] (range 1))) runs for an unacceptable 
amount of time (talking ~10s)
That seems to be quite long for sorting an already sorted vector of 10k 
elements. 

> I think the only way you can beat that (using insertion sort) is if
> you have a data structure that allows for both a binary search to
> quickly find the insertion point, and then allows for fast insertion
> at that point.  So maybe if finger trees can give you both the fast
> search and the fast insertion, that might pay off, but I still
> speculate it's unlikely to be any faster.
With the initial implementation, using binary search to find the insertion 
point should still result in a speed up, right?
> 
> -- 
> You received this message because you are subscribed to the Google
> Groups "Clojure" group.
> To post to this group, send email to clojure@googlegroups.com
> Note that posts from new members are moderated - please be patient with your 
> first post.
> To unsubscribe from this group, send email to
> clojure+unsubscr...@googlegroups.com
> For more options, visit this group at
> http://groups.google.com/group/clojure?hl=en





-- 
You received this message because you are subscribed to the Google
Groups "Clojure" group.
To post to this group, send email to clojure@googlegroups.com
Note that posts from new members are moderated - please be patient with your 
first post.
To unsubscribe from this group, send email to
clojure+unsubscr...@googlegroups.com
For more options, visit this group at
http://groups.google.com/group/clojure?hl=en

Re: Insertion - The clojure way

2010-12-30 Thread Mark Engelberg

On Thu, Dec 30, 2010 at 4:01 AM, Andreas Kostler
 wrote:
> Oh of course. I apologise. Mark do you mind to lay out how you perform your
> benchmarks.
> On my system (insertion-sort (into [] (range 1))) runs for an
> unacceptable amount of time (talking ~10s)
> That seems to be quite long for sorting an already sorted vector of 10k
> elements.

Maybe you're still using your original version with the good insert,
but the bad sort?
Try this:

(defn insert-into [sorted-vec element]
  (loop [i (count sorted-vec)]
  (if (and (> i 0) (> (nth sorted-vec (dec i)) element))
  (recur (dec i))
  (into (conj (subvec sorted-vec 0 i) element) (subvec
sorted-vec i)

(defn insertion-sort [s]
 (reduce insert-into [] s))

(time (last (insertion-sort (range 1

I get 30ms.

-- 
You received this message because you are subscribed to the Google
Groups "Clojure" group.
To post to this group, send email to clojure@googlegroups.com
Note that posts from new members are moderated - please be patient with your 
first post.
To unsubscribe from this group, send email to
clojure+unsubscr...@googlegroups.com
For more options, visit this group at
http://groups.google.com/group/clojure?hl=en

Re: Insertion - The clojure way

2010-12-30 Thread Mark Engelberg

On Thu, Dec 30, 2010 at 4:01 AM, Andreas Kostler
 wrote:
> With the initial implementation, using binary search to find the insertion
> point should still result in a speed up, right?

Sure, that should help, although it doesn't change the overall
quadratic bound unless you can also do the insertion in logarithmic
time.  It might be worth trying a binary search in your vector-based
insert, and see how much of a difference it makes.

-- 
You received this message because you are subscribed to the Google
Groups "Clojure" group.
To post to this group, send email to clojure@googlegroups.com
Note that posts from new members are moderated - please be patient with your 
first post.
To unsubscribe from this group, send email to
clojure+unsubscr...@googlegroups.com
For more options, visit this group at
http://groups.google.com/group/clojure?hl=en

Re: Insertion - The clojure way

2010-12-30 Thread David Nolen

On Thu, Dec 30, 2010 at 5:34 AM, Mark Engelberg wrote:

> On Wed, Dec 29, 2010 at 11:31 PM, David Nolen 
> wrote:
> > The original algorithm is simply wrong for very large inputs, it cannot
> be
> > TCO'ed. Pick a ridiculously large list size and the Scheme/Racket program
> > will barf *immediately*.
> > David
>
> Have you tried it?  I tried this particular algorithm, in Racket, on
> rather large lists.  It runs seemingly forever (it is a quadratic
> algorithm, of course), but I was unable to make Racket "barf
> immediately".  In my experience, Racket's more likely to fall over
> from trying to create such a large input to begin with (simply because
> there's not enough memory to hold a list that large) than it is to
> barf from stack size while executing a non-TCO recursive function over
> a list.  My understanding is that Racket's key to handling non-TCO
> recursion so gracefully is that it traps errors from stack overflows,
> and swaps the stack out to memory to make room on the stack to
> continue.  So in essence, you get a stack that is bounded only by your
> computer's memory (or whatever memory threshold you've enabled in the
> settings).
>
> What size input is barfing for you?


2e6. Racket can easily create a list of that size. Calling sort on it
however fails fast.

David

-- 
You received this message because you are subscribed to the Google
Groups "Clojure" group.
To post to this group, send email to clojure@googlegroups.com
Note that posts from new members are moderated - please be patient with your 
first post.
To unsubscribe from this group, send email to
clojure+unsubscr...@googlegroups.com
For more options, visit this group at
http://groups.google.com/group/clojure?hl=en

Re: Insertion - The clojure way

2010-12-30 Thread Daniel Brotsky

@Mark: applause.  A nice pedagogical exercise, and your step-by-step
approach --- so reminiscent of HTDP --- was very effective.

Your final solution reminds me of an undergraduate data structures and
algorithms final I took (too) many years ago on which, after weeks of
doing recursive sorts and traversals, the professor asked us to walk a
tree with three pointer variables and a loop.  As an old (ancient,
really) lisp lover who's just discovered Clojure, this thread feels
like a great intro to what's the same and what's different.  Thanks!

-- 
You received this message because you are subscribed to the Google
Groups "Clojure" group.
To post to this group, send email to clojure@googlegroups.com
Note that posts from new members are moderated - please be patient with your 
first post.
To unsubscribe from this group, send email to
clojure+unsubscr...@googlegroups.com
For more options, visit this group at
http://groups.google.com/group/clojure?hl=en

Re: Insertion - The clojure way

2010-12-30 Thread Ken Wesson

On Thu, Dec 30, 2010 at 6:40 AM, Mark Engelberg
 wrote:
> As for finger trees, I have not found them to be useful.  Although
> they have good theoretical bounds, they are very slow in practice.
> For example, if you try splitting, concatenating, and traversing lists
> of, say, 10 elements, these operations on finger trees are still
> many times slower than just using a naive take/drop/concat/seq on a
> regular list.

I hope that didn't include a naive benchmarking of
take/drop/concat/seq. Because three of those are lazy, you need to
wrap the output of an algorithm using them in a (doall ...) and then
in your timing function to get an accurate read on the amount of time
it actually takes to perform the full algorithm.

-- 
You received this message because you are subscribed to the Google
Groups "Clojure" group.
To post to this group, send email to clojure@googlegroups.com
Note that posts from new members are moderated - please be patient with your 
first post.
To unsubscribe from this group, send email to
clojure+unsubscr...@googlegroups.com
For more options, visit this group at
http://groups.google.com/group/clojure?hl=en

Re: Insertion - The clojure way

2010-12-30 Thread Mark Engelberg

On Thu, Dec 30, 2010 at 11:01 AM, Ken Wesson  wrote:
> I hope that didn't include a naive benchmarking of
> take/drop/concat/seq. Because three of those are lazy, you need to
> wrap the output of an algorithm using them in a (doall ...) and then
> in your timing function to get an accurate read on the amount of time
> it actually takes to perform the full algorithm.

Yes, I compared with doall.  Try it and let me know if you get
different results, but I find that traversing a finger tree is so much
slower than traversing a sequence, it all but nullifies any advantages
you get from other operations (even assuming the other operations are
way faster, which they typically are not).

-- 
You received this message because you are subscribed to the Google
Groups "Clojure" group.
To post to this group, send email to clojure@googlegroups.com
Note that posts from new members are moderated - please be patient with your 
first post.
To unsubscribe from this group, send email to
clojure+unsubscr...@googlegroups.com
For more options, visit this group at
http://groups.google.com/group/clojure?hl=en

Re: Insertion - The clojure way

2010-12-30 Thread Ken Wesson

On Thu, Dec 30, 2010 at 4:19 PM, Mark Engelberg
 wrote:
> On Thu, Dec 30, 2010 at 11:01 AM, Ken Wesson  wrote:
>> I hope that didn't include a naive benchmarking of
>> take/drop/concat/seq. Because three of those are lazy, you need to
>> wrap the output of an algorithm using them in a (doall ...) and then
>> in your timing function to get an accurate read on the amount of time
>> it actually takes to perform the full algorithm.
>
> Yes, I compared with doall.  Try it and let me know if you get
> different results, but I find that traversing a finger tree is so much
> slower than traversing a sequence, it all but nullifies any advantages
> you get from other operations (even assuming the other operations are
> way faster, which they typically are not).

Are these searches, which should be log n? Or full (e.g. in-order) traversals?

Traversals of persistent trees can be tricky, since there won't be a
parent pointer on each node. But I'd think a judicious use of tree-seq
would be effective in traversing one decently.

I can test this when I have more time: implement a red-black finger
tree, say, and check the speeds of various operations including a
tree-seq based in-order traversal.

-- 
You received this message because you are subscribed to the Google
Groups "Clojure" group.
To post to this group, send email to clojure@googlegroups.com
Note that posts from new members are moderated - please be patient with your 
first post.
To unsubscribe from this group, send email to
clojure+unsubscr...@googlegroups.com
For more options, visit this group at
http://groups.google.com/group/clojure?hl=en

Re: Insertion - The clojure way

2010-12-30 Thread Mark Engelberg

On Thu, Dec 30, 2010 at 9:50 PM, Ken Wesson  wrote:
> Are these searches, which should be log n? Or full (e.g. in-order) traversals?

I'm talking about full traversals.  Finger trees are sequences, and
after building the sequence, splitting, concatenating, or whatever, I
find I eventually need to visit all the elements.  This is something
like a hundred times slower on a large finger tree than a basic
sequence.

I had tried a couple years back to implement finger trees in Racket,
and found it too slow to be of use.  Recently, when I saw chouser had
developed an implementation in Clojure
(https://github.com/clojure/data.finger-tree), I eagerly checked it
out to see if his implementation worked better than my own effort.
It's definitely faster than my Racket attempt, but I still found it to
be quite slow.  I hope I'm incorrect and someone can show me a good
demonstration of an application where finger trees outperform
Clojure's built-in sequences.  But my impression is that even though
they are "asymptotically fast", the constant factors are too large,
and even for lists of hundreds of thousands of elements, you'd be
better off, for example, doing an "inefficient" concatenation of
Clojure's vectors than the "efficient" concatenation of a finger tree.

Play around with chouser's implementation and let me know what you find.

-- 
You received this message because you are subscribed to the Google
Groups "Clojure" group.
To post to this group, send email to clojure@googlegroups.com
Note that posts from new members are moderated - please be patient with your 
first post.
To unsubscribe from this group, send email to
clojure+unsubscr...@googlegroups.com
For more options, visit this group at
http://groups.google.com/group/clojure?hl=en

Re: Insertion - The clojure way [HTDP]

2011-01-04 Thread Eli Barzilay

> In other words, this kind of stack-consuming implementation would be a
> perfectly practical, useful implementation in Racket

Even more than that -- in some cases the simple non-TCO version can be
faster than the usual TCO + reverse-the-accumulator thing.  Here's a
random example:

Welcome to Racket v5.0.99.6.
-> (define (append1 l1 l2) (if (null? l1) l2 (cons (car l1) (append (cdr l1) 
l2
-> (define (append2 l1 l2) (let loop ([l1 (reverse l1)] [l2 l2]) (if (null? l1) 
l2 (loop (cdr l1) (cons (car l1) l2)
-> (define l1 (make-list 100 2))
-> (define l2 (make-list 100 2))
-> ,time 11 (for ([i (in-range 20)]) (append1 l1 l2))
;; run #1... -> #
...
;; run #11... -> #
;; 11 runs, 3 best/worst removed, 5 left for average:
;; cpu time: 2529ms = 453ms + 2076ms gc; real time: 2530ms
-> ,time 11 (for ([i (in-range 20)]) (append2 l1 l2))
;; run #1... -> #
...
;; run #11... -> #
;; 11 runs, 3 best/worst removed, 5 left for average:
;; cpu time: 4026ms = 896ms + 3130ms gc; real time: 4026ms


-- 
  ((lambda (x) (x x)) (lambda (x) (x x)))  Eli Barzilay:
http://barzilay.org/   Maze is Life!

-- 
You received this message because you are subscribed to the Google
Groups "Clojure" group.
To post to this group, send email to clojure@googlegroups.com
Note that posts from new members are moderated - please be patient with your 
first post.
To unsubscribe from this group, send email to
clojure+unsubscr...@googlegroups.com
For more options, visit this group at
http://groups.google.com/group/clojure?hl=en

42 matches

Mail list logo