Laurent,

When I have some time, I will take another look at my code to see if I
can get an accurate levenshtein distance without adding significant
complexity. I am optimistic.

I'll let you know what I find.

Thanks again for sharing your code.
Brenton

On Feb 16, 2:12 am, Laurent PETIT <laurent.pe...@gmail.com> wrote:
> Hi Brenton!
>
> 2011/2/16 Brenton <bashw...@gmail.com>
>
> > Laurent,
>
> > I have been doing some work on a diff library for Clojure sequences (I
> > need to get back to it and finish it up).
>
> >http://github.com/brentonashworth/clj-diff
>
> > The main goal of this library is to compute sequential diffs quickly.
>
> While mine goals was really just to first see how I could express the
> algorithm using higher order functions and clojure datastructures / sequence
> abstraction, even internally.
>
> > Whenever I see someone doing something similar I like to compare
> > performance just in case you know something that I don't.
>
> Fair enough ;)
>
> > Other algorithms usually perform well on small sequences but then
> > break down as the sizes grow. For example, I did a quick test of this
> > algorithm on two 10,000 character strings and your algorithm took 80
> > seconds while mine computed the edit distance is 120 ms.
>
> It could be interesting to plot some more intermediate results between 10
> character strings comparisons and 10,000, so that we can see if this is just
> a problem of a constant factor, or a divergence in big O complexity.
>
> Please note that my algorithm is very close in implementation (and thus
> big-O complexity, unless I've made a mistake) to the one described on the
> wikipedia page.
> It's not an "approximate" levenshtein distance, it is the levenshtein
> distance.
>
> Do you know what the performance of your version could become if you really
> implemented the computation of the levenshtein distance by really computing
> and comparing all edit paths ?
>
> (Not saying that strictly sticking to levenshtein distance/algorithm is a
> rational goal, just trying to compare apples with apples)
>
> > While my library is primarily concerned with diffs and edit distance,
> > I did add a levenshtein-distance function which attempts to compute
> > this distance from a previously computed minimum edit path. It is not
> > always accurate because there may be many minimum edit paths with
> > shorter or longer levenshtein distances. If the algorithm is modified
> > slightly so that the edit path with the minimum levenshtein distance
> > is chosen then it would be able to do both.
>
> > I can't take credit for the algorithm, I just implemented what I read
> > in a paper. But I do think this approach will get the job done as
> > quickly as possible. Of course there is a lot more code to read than
> > your very impressive ten lines.
>
> You're welcome,
>
> Cheers,
>
> --
> Laurent

-- 
You received this message because you are subscribed to the Google
Groups "Clojure" group.
To post to this group, send email to clojure@googlegroups.com
Note that posts from new members are moderated - please be patient with your 
first post.
To unsubscribe from this group, send email to
clojure+unsubscr...@googlegroups.com
For more options, visit this group at
http://groups.google.com/group/clojure?hl=en

Reply via email to