It would have come in handy for me while doing bioinformatics work in the past. But you're right, they have very cool tools out there. I was amazed at some of the stuff they can do.

On Mar 13, 2012 7:08 PM, "Thomas Neidhart" <thomas.neidh...@gmail.com> wrote:
> On 03/13/2012 08:55 AM, Luc Maisonobe wrote:
> > On 13/03/2012 00:53, James Carman wrote:
> >> A lot of bioinformaticians would love us if we added this!
>
> I picked this topic up as I find it interesting myself and it would
> be a useful addition for many other people too I guess, but from what I
> have seen so far, bioinformaticians wouldn't necessarily be impressed by
> that ;-). Afaik they have pretty good tools, and there exist special
> algorithms to compute suffix trees for really large strings in clusters
> or on disk, as they won't fit in memory anymore.
>
> > In the same spirit, I know an implementation of the Myers difference
> > algorithm that runs on any object implementing equals and also provides
> > an API for browsing the "edit script" resulting from the comparison.
> > This allows, for example, retrieving only the shared elements, or only
> > the ones in the first or the second sequence, or "running" the script,
> > or whatever.
> >
> > If you consider this could be a good addition to [lang] or another
> > component ([graph] ?) I can ask for a grant for this.
>
> This would be a perfect companion for the longest common substring
> problem; the o.a.c.l.text package looks like a good fit for these things
> imho.
>
> Thomas