Erik Hatcher wrote:
Karthik,
Thanks for that info. I knew I was behind the times with WordNet using the sandbox code, but it was good enough for my purposes at the time. I will definitely try out the latest WordNet offerings in the future
Hi...I wrote the WordNet sandbox code - but I'm not sure if I undertand this thread. Are we saying that it does not work w/ the new WordNet data, or that code in Eric's book is better/more up to date etc?
If needed I can update the sandbox code..
thx, Dave
though.
Erik
On Jan 10, 2005, at 7:37 AM, Karthik N S wrote:
Hi Erik
Apologies.......
I may be a little offline from this form,but I may help u for the next version of Luncene In Action.
I Was working on Java WordNet Library , On fiddling with the API's, found
something Interesting ,
the code attached to this get's more Synonyms then the Wordnet's Indexed
format avaliable from the LuceneinAction Zip File
1) It needs Wordnet2.0's Dictonery Installed
2) jwnl.jar from SourceForge
[
http://sourceforge.net/project/showfiles.php? group_id=33824&package_id=33975
&release_id=196864 ]
After sucess compilation
Type for watch
ORIGINAL : "watch" OR "analog_watch" OR "digital_watch" OR "hunter" OR
"hunting_watch" OR "pendulum_watch" OR
"pocket_watch" OR "stem-winder" OR "wristwatch" OR "wrist_watch"
FORMATTED : "watch" OR "analog watch" OR "digital watch" OR "hunter" OR "hunting watch" OR "pendulum watch" OR "pocket watch"
Check this Out,may be u will come up with Briliant Idea's
with regards Karthik
-----Original Message----- From: Erik Hatcher [mailto:[EMAIL PROTECTED] Sent: Monday, January 10, 2005 5:19 PM To: Lucene Users List Subject: Re: SYNONYM + GOOGLE
On Jan 10, 2005, at 5:33 AM, Karthik N S wrote:
If u search Google using '~shoes', It returns hits based on the Synonym's
[ I know there is a Synonym Wordnet based Lucene Package in the sandbox
http://cvs.apache.org/viewcvs.cgi/jakarta-lucene-sandbox/ contributions/WordN et/ ]
Can this be achieved in Lucene ,If so How ???
Yes, it can be achieved. Not quite synonyms, but various forms of the same word can be found in this example, like this search for similar (see the highlighted variations):
http://www.lucenebook.com/search?query=similar
This is accomplished using the Snowball stemmer filter found in the sandbox. For synonyms, you have lots of options. In Lucene in Action I demonstrate custom analyzers that inject synonyms using the WordNet database (from the sandbox). From the source code distribution of LIA:
% ant SynonymAnalyzerViewer Buildfile: build.xml
SynonymAnalyzerViewer:
[echo]
[echo] Using a custom SynonymAnalyzer, two fixed strings are
[echo] analyzed with the results displayed. Synonyms, from
the
[echo] WordNet database, are injected into the same positions
[echo] as the original words.
[echo]
[echo] See the "Analysis" chapter for more on synonym
injection and
[echo] position increments. The "Tools and extensions"
chapter covers
[echo] the WordNet feature found in the Lucene sandbox.
[echo]
[input] Press return to continue...
[echo] Running lia.analysis.synonym.SynonymAnalyzerViewer...
[java] 1: [quick] [warm] [straightaway] [spry] [speedy] [ready] [quickly] [promptly] [prompt] [nimble] [immediate] [flying] [fast] [agile] [java] 2: [brown] [brownness] [brownish] [java] 3: [fox] [trick] [throw] [slyboots] [fuddle] [fob] [dodger] [discombobulate] [confuse] [confound] [befuddle] [bedevil] [java] 4: [jumps] [java] 5: [over] [o] [across] [java] 6: [lazy] [faineant] [indolent] [otiose] [slothful] [java] 7: [dogs]
...
The phrase analyzed was "The quick brown fox jumps over the lazy dogs". Why no synonyms for "jumps" and "dogs"? WordNet has synonyms for "jump" and "dog", but not the plural forms. Stemming would be a necessary step in achieving full synonym look-up, though this would need to be done carefully as the stem of a word is not necessarily a real word itself - so you'd probably want to stem the synonym database also to ensure accurate lookup.
Also notice the semantically incorrect synonyms that appear for the animal fox ("confuse", for example). Be careful! :)
Erik
--------------------------------------------------------------------- To unsubscribe, e-mail: [EMAIL PROTECTED] For additional commands, e-mail: [EMAIL PROTECTED]
--------------------------------------------------------------------- To unsubscribe, e-mail: [EMAIL PROTECTED] For additional commands, e-mail: [EMAIL PROTECTED]
--------------------------------------------------------------------- To unsubscribe, e-mail: [EMAIL PROTECTED] For additional commands, e-mail: [EMAIL PROTECTED]
--------------------------------------------------------------------- To unsubscribe, e-mail: [EMAIL PROTECTED] For additional commands, e-mail: [EMAIL PROTECTED]