Andreas, Base, sure, I'll try (I wrote it yesterday by night, let's see if I
can remember what I wrote :-) ):

So I've studied the algorithm presented here in wikipedia:
http://en.wikipedia.org/wiki/Levenshtein_distance

I've also taken note of the suggested possible improvement on space :
"We can adapt the algorithm to use less space,
*O*<http://en.wikipedia.org/wiki/Big_O_notation>
(*m*) instead of *O*(*mn*), since it only requires that the previous row and
current row be stored at any one time"

(here  :
http://en.wikipedia.org/wiki/Levenshtein_distance#Possible_improvements)

So the algorithm of the clojure implementation is decomposed in two:

function new-row:
this function computes a new "row" (stricto sensu a row in the wikipedia
page examples) given arguments:
  * prev-row: sequence of distances for the previous row
  * row-elem: the value of the element of the seq represented vertically in
the wikipedia examples grid, for the current row for which we are computing.
  * t: the sequence whose values are heads of columns in the wikipedia
examples grid
At its core, this function then just computes the new row by "walking" in
parallel the previous row. It uses "reduce" and not map because for each
"cell", I need the value just computed previously. reduce gives me access
via its accumulator to the row (implemented as a vector) we're computing
(via the peek call), while map's interface does not allow me such freedom.

function levenshtein:
It walks the sequence "s" (the rows in the grid), computing a new row each
time from the previous row. Then it's just a matter of getting the last
value of the last raw to get back the levenshtein distance.
For the same reason as above, I'm not using map but reduce to get access to
the previously computed row.

Hope this is clearer,

ps: I can feel fogus resisting the urge to suggest me to rewrite my gist
using marginalia ... and he may not be wrong at all!

-- 
Laurent

2011/2/16 Base <basselh...@gmail.com>

> I'm with Andreas, and would love some help dissecting this a bit
> further.
>
> On Feb 15, 6:53 pm, Andreas Kostler <andreas.koestler.le...@gmail.com>
> wrote:
> > Laurent,
> > I've been studying your implementation for a while now and can't really
> fully grasp it.
> > Can you elaborate a bit on the algorithm?
> > Cheers
> > Andreas
> >
> > On 16/02/2011, at 9:45 AM, Stuart Sierra wrote:
> >
> >
> >
> >
> >
> > > Cool!  That's a very compact implementation.
> >
> > > Could the same technique be adapted to give you the longest common
> substring?
> > > e.g. (foo "fooba" "baab") => "ba"
> >
> > > Or better yet, the length of the longest common substring and the
> starting indices of each common substring of that length,
> > > e.g. (foo "baaboobaa" "baa") => {:length 3, :indices #{0 6}}
> >
> > > -Stuart Sierra
> > > clojure.com
> >
> > > --
> > > You received this message because you are subscribed to the Google
> > > Groups "Clojure" group.
> > > To post to this group, send email to clojure@googlegroups.com
> > > Note that posts from new members are moderated - please be patient with
> your first post.
> > > To unsubscribe from this group, send email to
> > > clojure+unsubscr...@googlegroups.com
> > > For more options, visit this group at
> > >http://groups.google.com/group/clojure?hl=en
> >
> > --
> > "Programs must be written for people to read, and only incidentally for
> machines to execute."
> > - Abelson & Sussman, SICP
> > --
> > **********************************************************
> > Andreas Koestler, Software Engineer
> > Leica Geosystems Pty Ltd
> > 270 Gladstone Road, Dutton Park QLD 4102
> > Main:+61 7 3891 9772    Direct:+61 7 3117 8808
> > Fax:+61 7 3891 9336
> > Email: andreas.koest...@leica-geosystems.com
> >
> > ************www.leica-geosystems.com*************
> >
> > when it has to be right, Leica Geosystems
> >
> > Please  consider the environment before printing this email.
>
> --
> You received this message because you are subscribed to the Google
> Groups "Clojure" group.
> To post to this group, send email to clojure@googlegroups.com
> Note that posts from new members are moderated - please be patient with
> your first post.
> To unsubscribe from this group, send email to
> clojure+unsubscr...@googlegroups.com
> For more options, visit this group at
> http://groups.google.com/group/clojure?hl=en
>

-- 
You received this message because you are subscribed to the Google
Groups "Clojure" group.
To post to this group, send email to clojure@googlegroups.com
Note that posts from new members are moderated - please be patient with your 
first post.
To unsubscribe from this group, send email to
clojure+unsubscr...@googlegroups.com
For more options, visit this group at
http://groups.google.com/group/clojure?hl=en

Reply via email to