Visit this page:
http://lucene.zones.apache.org:8080/hudson/job/Lucene-Nightly/javadoc/index.html?org/apache/lucene/analysis/standard/StandardAnalyzer.html
This is the Lucene implementation in Java for synonyms.
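
As a rough starting point for the conditional-probability idea raised in the quoted message, here is a minimal Python sketch. It is not taken from Lucene. It assumes that all tags on an article are pooled into one lower-cased set, and the names `articles`, `cooccurrence_counts`, and `conditional` are made up for illustration. It estimates P(a | b) as the fraction of articles tagged b that are also tagged a.

```python
from collections import defaultdict
from itertools import combinations

# Hypothetical data modelled on the example in the quoted message:
# article id -> set of every tag any user applied (lower-cased).
articles = {
    "Art1": {"winterseason", "shoes", "fashion", "opinion", "winter"},
    "Art2": {"shoes", "gloves", "accessories", "footwear"},
    "Art3": {"jackets", "winter"},
}


def cooccurrence_counts(articles):
    """Count articles per tag and articles per unordered tag pair."""
    tag_count = defaultdict(int)
    pair_count = defaultdict(int)
    for tags in articles.values():
        for tag in tags:
            tag_count[tag] += 1
        # Sorting makes every pair key (a, b) satisfy a < b.
        for a, b in combinations(sorted(tags), 2):
            pair_count[(a, b)] += 1
    return tag_count, pair_count


def conditional(a, b, tag_count, pair_count):
    """Estimate P(a | b) = (#articles with both a and b) / (#articles with b)."""
    n_b = tag_count.get(b, 0)
    if n_b == 0:
        return 0.0
    if a == b:
        return 1.0
    key = (a, b) if a < b else (b, a)
    return pair_count.get(key, 0) / n_b


tag_count, pair_count = cooccurrence_counts(articles)
p_footwear_given_shoes = conditional("footwear", "shoes", tag_count, pair_count)
p_shoes_given_footwear = conditional("shoes", "footwear", tag_count, pair_count)
```

A high value in both directions only marks a pair as a synonym candidate. Tags that merely appear together (such as "shoes" and "gloves" here) also score above zero, so with real data you would probably want a minimum article count and a threshold before expanding a search for one tag to the other.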
Check if it helps you.

Cheers,
Ashish

On Dec 7, 2007 2:01 PM, macoovacany <[EMAIL PROTECTED]> wrote:
>
> Hello All.
>
> Say I have a website that has articles on clothes, and I allow people to
> tag each article (unlimited vocabulary). I wish to recognize which
> words are being used as synonyms.
>
> For example:
> (I make the additional restriction that every person must tag the
> article with two tags.)
>
> Art1: (an opinion on shoes in winter)
> : WinterSeason, Shoes
> : Shoes, Fashion
> : opinion, Winter, etc.
>
> Art2: (Shoes and Gloves)
> : shoes, gloves
> : accessories, gloves
> : footwear, gloves, etc.
>
> Art3: (Jackets in winter)
> : Jackets, winter, etc.
>
> Now, if I were to do a search on "footwear", I would come up with
> Art2, and not Art1.
>
> Is there any algorithm that will recognise that a search of "footwear"
> and "shoes" should return the same set?
>
> I have a feeling that some kind of conditional probability calculation
> should be used, i.e. P("footwear" | "shoes") =
> P("footwear" and "shoes") / P("shoes").
>
> Thoughts, or any direction to go from here?
>
> Regards,
> Timbo
>

--
Phone: +91 9968158191
Disclaimer: The statements and opinions expressed here are my own and
do not necessarily represent those of MPS Tech.

You received this message because you are subscribed to the Google Groups "Algorithm Geeks" group.
To post to this group, send email to algogeeks@googlegroups.com
To unsubscribe from this group, send email to [EMAIL PROTECTED]
For more options, visit this group at http://groups.google.com/group/algogeeks