:   > can I get the similar wordlist as output. so that I can show the end
:   > user in the column  ---------------   do you mean "foam"?
:   > How can I get similar word list in the given content?

This is a non-trivial problem, because the definition of "similar" is
subject to interpretation.  I would look into the various dictionary
implementations and see if you can find a good Java-based dictionary that
can suggest alternatives for an input string.

Once you have that, you should be able to use IndexSearcher.docFreq
to find out how many docs contain each alternate word, and compare that
with the number of docs that contain the initial word ... if one of the
alternates has a significantly higher number of matches, then you suggest
it.
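
As a rough sketch of that comparison (not tested against a real index):
the class name, the minRatio threshold, and the docFreq lookup function
below are all made up for illustration.  With Lucene, the lookup could be
backed by something like searcher.docFreq(new Term("contents", word)),
where the "contents" field name is an assumption about your index.

```java
import java.util.List;
import java.util.Optional;
import java.util.function.ToIntFunction;

// Hypothetical helper, not part of Lucene.
class DocFreqSuggester {

    // docFreq maps a word to the number of docs that contain it.
    // minRatio is an assumed definition of "significantly higher":
    // the best alternate must match at least minRatio times as many
    // docs as the original word does.
    static Optional<String> suggest(String word, List<String> alternates,
                                    ToIntFunction<String> docFreq,
                                    double minRatio) {
        int baseline = docFreq.applyAsInt(word);

        // Find the alternate that matches the most docs.
        String best = null;
        int bestFreq = 0;
        for (String alt : alternates) {
            if (alt.equals(word)) {
                continue; // the word itself is not an alternative
            }
            int freq = docFreq.applyAsInt(alt);
            if (freq > bestFreq) {
                best = alt;
                bestFreq = freq;
            }
        }

        // Treat a baseline of 0 as 1 so that a word with no matches
        // still needs an alternate with at least minRatio matches.
        boolean significant =
            best != null && bestFreq >= minRatio * Math.max(baseline, 1);
        return significant ? Optional.of(best) : Optional.empty();
    }
}
```

If "foma" matches 1 doc and "foam" matches 40, then calling suggest with
"foma", the alternates "foam" and "form", and a minRatio of 5.0 gives you
"foam" for the "do you mean" column; calling it with "foam" as the input
gives you nothing to suggest.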


NOTE: The DICT protocol defines a client/server approach to providing
spelling correction and definitions.  Maybe you can leverage some of the
spelling correction code mentioned in the "Server Software Written in
Java" section of this page...
        http://www.dict.org/links.html
In particular, you might want to take a look at JavaDict's Database.match
function using the LevenshteinStrategy...
http://ktulu.com.ar/javadict/docs/ar/com/ktulu/dict/Database.html#match(java.lang.String,%20ar.com.ktulu.dict.strategies.Strategy)
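
I haven't looked at how JavaDict implements it, so the following is NOT
its code, just a generic sketch of what a Levenshtein (edit distance)
based match might look like.  The class name, the in-memory word list,
and the maxDistance cutoff are all assumptions for illustration.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for a dictionary "match" using edit distance.
class LevenshteinMatcher {

    // Classic dynamic-programming Levenshtein distance: the minimum
    // number of single-character insertions, deletions, and
    // substitutions needed to turn a into b.
    static int distance(String a, String b) {
        int[] prev = new int[b.length() + 1];
        int[] curr = new int[b.length() + 1];
        for (int j = 0; j <= b.length(); j++) {
            prev[j] = j; // empty prefix of a -> first j chars of b
        }
        for (int i = 1; i <= a.length(); i++) {
            curr[0] = i; // first i chars of a -> empty prefix of b
            for (int j = 1; j <= b.length(); j++) {
                int cost = a.charAt(i - 1) == b.charAt(j - 1) ? 0 : 1;
                curr[j] = Math.min(
                    Math.min(curr[j - 1] + 1,   // insertion
                             prev[j] + 1),      // deletion
                    prev[j - 1] + cost);        // substitution or match
            }
            int[] tmp = prev;
            prev = curr;
            curr = tmp;
        }
        return prev[b.length()];
    }

    // Return the dictionary words within maxDistance edits of the
    // input, in dictionary order, skipping the input itself.
    static List<String> match(String input, List<String> dictionary,
                              int maxDistance) {
        List<String> result = new ArrayList<>();
        for (String word : dictionary) {
            if (!word.equals(input)
                    && distance(input, word) <= maxDistance) {
                result.add(word);
            }
        }
        return result;
    }
}
```

Note that plain Levenshtein counts a transposition such as "foma" vs
"foam" as two edits, so a cutoff of 1 would miss it.  The words that come
back from a match like this are the alternates you would then check with
docFreq.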


