[OT] Determining the similarity between a pair of texts

2005-06-15 Thread Ugo Cei
Excuse the Off-Topic, but I'm looking for a Java API for determining the degree of similarity (based on word frequency or whatever) between two text strings. I thought of posting here since I know there are some people here who are experts in semantic web technologies and could maybe help me. Thanks
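One way to read "similarity based on word frequency" is cosine similarity over term-frequency vectors. The following is only a minimal sketch of that idea, not an API anyone in this thread recommended: the class name WordFrequencySimilarity and its methods are hypothetical, the tokenizer is a naive split on non-letter/non-digit characters, and it assumes Java 8 or later.

```java
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;

// Hypothetical helper (not from any library mentioned in the thread).
final class WordFrequencySimilarity {

    // Counts how often each lower-cased word occurs in the text.
    static Map<String, Integer> termFrequencies(String text) {
        Map<String, Integer> counts = new HashMap<>();
        // Naive tokenizer: split on anything that is not a letter or digit.
        for (String token : text.toLowerCase(Locale.ROOT).split("[^\\p{L}\\p{Nd}]+")) {
            if (!token.isEmpty()) {
                counts.merge(token, 1, Integer::sum);
            }
        }
        return counts;
    }

    // Cosine of the angle between the two term-frequency vectors:
    // 0.0 when no words are shared, close to 1.0 when the word
    // distributions match. Returns 0.0 if either text has no words.
    static double cosine(String a, String b) {
        Map<String, Integer> fa = termFrequencies(a);
        Map<String, Integer> fb = termFrequencies(b);
        double dot = 0.0, normA = 0.0, normB = 0.0;
        for (Map.Entry<String, Integer> e : fa.entrySet()) {
            int x = e.getValue();
            normA += (double) x * x;
            Integer y = fb.get(e.getKey());
            if (y != null) {
                dot += (double) x * y;
            }
        }
        for (int y : fb.values()) {
            normB += (double) y * y;
        }
        if (normA == 0.0 || normB == 0.0) {
            return 0.0;
        }
        return dot / (Math.sqrt(normA) * Math.sqrt(normB));
    }

    public static void main(String[] args) {
        System.out.println(cosine("the cat sat on the mat", "the dog sat on the log"));
    }
}
```

Raw counts treat common words such as "the" the same as rare ones, so a real application would probably want stop-word removal or some term weighting on top of this.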

Re: [OT] Determining the similarity between a pair of texts

2005-06-15 Thread Tony Collen
Ugo, I think what you're looking for is the Levenshtein Distance Algorithm. http://www.google.com/search?hl=en&q=java+Levenshtein+implementation&btnG=Google+Search HTH, Tony Ugo Cei wrote: Excuse the Off-Topic, but I'm looking for a Java API for determining the degree of similarity (based on
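For reference, the Levenshtein distance mentioned above counts the minimum number of single-character insertions, deletions, and substitutions needed to turn one string into another. The sketch below is one common dynamic-programming formulation that keeps only two rows of the table; the class name LevenshteinSketch and the normalised similarity() helper are illustrative assumptions, not the implementation behind the search link.

```java
// Hypothetical helper illustrating the Levenshtein distance.
final class LevenshteinSketch {

    // Minimum number of single-character edits turning a into b.
    static int distance(String a, String b) {
        int[] prev = new int[b.length() + 1];
        int[] curr = new int[b.length() + 1];
        // Turning the empty prefix of a into the first j chars of b takes j inserts.
        for (int j = 0; j <= b.length(); j++) {
            prev[j] = j;
        }
        for (int i = 1; i <= a.length(); i++) {
            curr[0] = i; // deleting the first i chars of a
            for (int j = 1; j <= b.length(); j++) {
                int cost = a.charAt(i - 1) == b.charAt(j - 1) ? 0 : 1;
                int insert = curr[j - 1] + 1;
                int delete = prev[j] + 1;
                int substitute = prev[j - 1] + cost;
                curr[j] = Math.min(Math.min(insert, delete), substitute);
            }
            // Reuse the two rows instead of allocating a full table.
            int[] tmp = prev;
            prev = curr;
            curr = tmp;
        }
        return prev[b.length()];
    }

    // Assumed normalisation into [0, 1]: 1.0 means identical strings.
    static double similarity(String a, String b) {
        int max = Math.max(a.length(), b.length());
        return max == 0 ? 1.0 : 1.0 - (double) distance(a, b) / max;
    }

    public static void main(String[] args) {
        System.out.println(distance("kitten", "sitting")); // prints 3
    }
}
```

Note that this measures character-level edits, so it suits near-duplicate or misspelling detection more than judging whether two texts share a subject.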

Re: [OT] Determining the similarity between a pair of texts

2005-06-15 Thread Peter Hunsberger
On 6/15/05, Ugo Cei wrote: Excuse the Off-Topic, but I'm looking for a Java API for determining the degree of similarity (based on word frequency or whatever) between two text strings. I thought of posting here since I know there are some people here who are experts in semantic web

Re: [OT] Determining the similarity between a pair of texts

2005-06-15 Thread Torsten Curdt
Excuse the Off-Topic, but I'm looking for a Java API for determining the degree of similarity (based on word frequency or whatever) between two text strings. Also, Commons Codec has some algorithms... it depends on what exactly you are after: http://jakarta.apache.org/commons/codec/ Cheers --

Re: [OT] Determining the similarity between a pair of texts

2005-06-15 Thread Ugo Cei
On 15 Jun 2005, at 16:32, Tony Collen wrote: Ugo, I think what you're looking for is the Levenshtein Distance Algorithm. http://www.google.com/search?hl=en&q=java+Levenshtein+implementation&btnG=Google+Search Nice! I also found an implementation nearby:

Re: [OT] Determining the similarity between a pair of texts

2005-06-15 Thread Peter Hunsberger
On 6/15/05, Ugo Cei wrote: On 15 Jun 2005, at 16:32, Tony Collen wrote: <snip/> Actually, what I am trying to come up with is an algorithm for determining whether two texts are (more or less) about similar subjects. Eee, then you may have to jump into the NLP stuff

Re: [OT] Determining the similarity between a pair of texts

2005-06-15 Thread Stefano Mazzocchi
Peter Hunsberger wrote: On 6/15/05, Ugo Cei wrote: On 15 Jun 2005, at 16:32, Tony Collen wrote: <snip/> Actually, what I am trying to come up with is an algorithm for determining whether two texts are (more or less) about similar subjects. Eee, then you may

Re: [OT] Determining the similarity between a pair of texts

2005-06-15 Thread Ugo Cei
On 15 Jun 2005, at 18:27, Stefano Mazzocchi wrote: I've been working on this for the past few months. There is no clear-cut solution, but using LSI is probably the best approach for the above LSI == ? As for string distance, you might want to check out secondstring.sf.net.

Re: [OT] Determining the similarity between a pair of texts

2005-06-15 Thread Stefano Mazzocchi
Ugo Cei wrote: On 15 Jun 2005, at 18:27, Stefano Mazzocchi wrote: I've been working on this for the past few months. There is no clear-cut solution, but using LSI is probably the best approach for the above LSI == ? Latent semantic indexing. As for string distance, you might