Hi Erik,

Apologies.
I may be a little offline from this forum, but I may be able to help you with the next version of Lucene in Action.

I was working with the Java WordNet Library (JWNL) and, while fiddling with the API, found something interesting: the code attached to this message gets more synonyms than the WordNet indexed format available from the Lucene in Action zip file.

1) It needs WordNet 2.0's dictionary installed
2) It needs jwnl.jar from SourceForge
   [ http://sourceforge.net/project/showfiles.php?group_id=33824&package_id=33975&release_id=196864 ]

After successful compilation, typing "watch" gives:

ORIGINAL  : "watch" OR "analog_watch" OR "digital_watch" OR "hunter" OR "hunting_watch" OR "pendulum_watch" OR "pocket_watch" OR "stem-winder" OR "wristwatch" OR "wrist_watch"

FORMATTED : "watch" OR "analog watch" OR "digital watch" OR "hunter" OR "hunting watch" OR "pendulum watch" OR "pocket watch"

Check this out; maybe you will come up with brilliant ideas.

with regards
Karthik

-----Original Message-----
From: Erik Hatcher [mailto:[EMAIL PROTECTED]
Sent: Monday, January 10, 2005 5:19 PM
To: Lucene Users List
Subject: Re: SYNONYM + GOOGLE

On Jan 10, 2005, at 5:33 AM, Karthik N S wrote:
> If u search Google using '~shoes', It returns hits based on the
> Synonym's
>
> [ I know there is a Synonym Wordnet based Lucene Package in the
> sandbox
>
> http://cvs.apache.org/viewcvs.cgi/jakarta-lucene-sandbox/contributions/WordNet/ ]
>
> Can this be achieved in Lucene ,If so How ???

Yes, it can be achieved. Not quite synonyms, but various forms of the same word can be found in this example, like this search for similar (see the highlighted variations):

    http://www.lucenebook.com/search?query=similar

This is accomplished using the Snowball stemmer filter found in the sandbox.

For synonyms, you have lots of options. In Lucene in Action I demonstrate custom analyzers that inject synonyms using the WordNet database (from the sandbox).
From the source code distribution of LIA:

% ant SynonymAnalyzerViewer
Buildfile: build.xml

SynonymAnalyzerViewer:
     [echo]
     [echo]       Using a custom SynonymAnalyzer, two fixed strings are
     [echo]       analyzed with the results displayed.  Synonyms, from the
     [echo]       WordNet database, are injected into the same positions
     [echo]       as the original words.
     [echo]
     [echo]       See the "Analysis" chapter for more on synonym injection and
     [echo]       position increments.  The "Tools and extensions" chapter covers
     [echo]       the WordNet feature found in the Lucene sandbox.
     [echo]
    [input] Press return to continue...
     [echo] Running lia.analysis.synonym.SynonymAnalyzerViewer...
     [java] 1: [quick] [warm] [straightaway] [spry] [speedy] [ready] [quickly] [promptly] [prompt] [nimble] [immediate] [flying] [fast] [agile]
     [java] 2: [brown] [brownness] [brownish]
     [java] 3: [fox] [trick] [throw] [slyboots] [fuddle] [fob] [dodger] [discombobulate] [confuse] [confound] [befuddle] [bedevil]
     [java] 4: [jumps]
     [java] 5: [over] [o] [across]
     [java] 6: [lazy] [faineant] [indolent] [otiose] [slothful]
     [java] 7: [dogs]
...

The phrase analyzed was "The quick brown fox jumps over the lazy dogs".

Why no synonyms for "jumps" and "dogs"? WordNet has synonyms for "jump" and "dog", but not the plural forms. Stemming would be a necessary step in achieving full synonym look-up, though this would need to be done carefully as the stem of a word is not necessarily a real word itself - so you'd probably want to stem the synonym database also to ensure accurate lookup.

Also notice the semantically incorrect synonyms that appear for the animal fox ("confuse", for example). Be careful! :)

    Erik

---------------------------------------------------------------------
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]
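
The ORIGINAL-to-FORMATTED step in Karthik's message (underscores replaced by spaces, terms quoted and joined with OR) could look roughly like the sketch below. This is not the attached code, which is not reproduced here; the class and method names are hypothetical, the WordNet/JWNL lookup is left out, and the sketch assumes the synonyms are already available as a list of strings. It also removes duplicates that appear after formatting, which is an added assumption.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

// Hypothetical helper, not part of Lucene or JWNL.
class SynonymQueryBuilder {

    // Turns WordNet-style terms ("pocket_watch") into a quoted OR query string.
    static String toOrQuery(List<String> terms) {
        // LinkedHashSet keeps the first-seen order and drops duplicates.
        Set<String> cleaned = new LinkedHashSet<>();
        for (String term : terms) {
            String t = term.replace('_', ' ').trim();
            if (!t.isEmpty()) {
                cleaned.add(t);
            }
        }
        List<String> quoted = new ArrayList<>();
        for (String t : cleaned) {
            quoted.add("\"" + t + "\"");
        }
        return String.join(" OR ", quoted);
    }

    public static void main(String[] args) {
        List<String> synonyms = Arrays.asList("watch", "analog_watch", "pocket_watch");
        // prints: "watch" OR "analog watch" OR "pocket watch"
        System.out.println(toOrQuery(synonyms));
    }
}
```

Quoting each term makes a multi-word synonym such as "pocket watch" a phrase query when the string is handed to a query parser.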
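
Erik's point about stemming both the query word and the synonym database can be sketched as follows. Everything here is illustrative: the stemmer is a toy that only strips a trailing "s" (a stand-in for the Snowball stemmer, not its algorithm), the class is hypothetical, and the synonym entries are made-up sample data rather than WordNet output.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;

// Hypothetical class: a synonym table keyed by stem instead of surface form.
class StemmedSynonymLookup {

    private final Map<String, List<String>> byStem = new HashMap<>();

    // Toy stemmer (assumption): lowercases and strips one trailing "s".
    static String stem(String word) {
        String w = word.toLowerCase(Locale.ROOT);
        if (w.length() > 3 && w.endsWith("s") && !w.endsWith("ss")) {
            return w.substring(0, w.length() - 1);
        }
        return w;
    }

    // The database side is stemmed when entries are added...
    void add(String word, List<String> synonyms) {
        byStem.computeIfAbsent(stem(word), k -> new ArrayList<>()).addAll(synonyms);
    }

    // ...and the query side is stemmed with the same function at lookup time.
    List<String> lookup(String word) {
        return byStem.getOrDefault(stem(word), Collections.<String>emptyList());
    }

    public static void main(String[] args) {
        StemmedSynonymLookup db = new StemmedSynonymLookup();
        db.add("dog", Arrays.asList("hound"));   // sample data, not from WordNet
        System.out.println(db.lookup("dogs"));   // prints: [hound]
    }
}
```

Because both sides go through the same stem function, it does not matter that a stem may not be a real word, as long as "dogs" and "dog" end up under the same key.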