Greetings!
Being a newbie, I'm still mostly in the dark regarding where the line is
between Solr and Lucene. The following code snippet is -- I think --
all Lucene and no Solr. It is a significantly modified version of some
example code I found on the net.
    dir = FSDirectory.open(FileSystems.getDefault().getPath(
            "/localapps/dev/EventLog/solr/data", "SpellIndex"));
    speller = new SpellChecker(dir);
    fis = new FileInputStream("/usr/share/dict/words");
    analyzer = new StandardAnalyzer();
    speller.indexDictionary(new PlainTextDictionary(EventLog.fis),
            new IndexWriterConfig(analyzer), false);

    // now let's see speller in action...
    System.out.println(speller.exist("beez")); // returns false
    System.out.println(speller.exist("bees")); // returns true
    String[] suggestions = speller.suggestSimilar("beez", 10);
    for (String suggestion : suggestions)
        System.err.println(suggestion);
(Later in my code, I close the objects that need closing.) The code
above does the following:
1. Identifies whether a given word is spelled correctly or misspelled.
2. Gives alternate suggestions for a given word (whether spelled
correctly or not).
3. Presumably (I haven't tested this yet) lets me add a second or
third word list to the index, say, a site dictionary containing
names of people or places commonly found in the text.
But this code does not:
1. Parse a given text into words and test each word.
2. Provide markers showing where the misspelled/suspect words are
within the text.
So my code would have to provide that functionality itself. Or does
Solr already provide this capability, such that it would be silly to
write my own?
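
If I do have to write it myself, I imagine something like the minimal sketch below. It uses only the JDK's java.text.BreakIterator to split the text into words and records the character offsets of each word the dictionary rejects. SpellMarker, Suspect, findSuspects and the isKnownWord predicate are names I made up for illustration. In the real code the predicate would wrap speller.exist(word) (which throws IOException), and lowercasing each word before the lookup is an assumption about how the dictionary entries are stored.

```java
import java.text.BreakIterator;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Locale;
import java.util.Set;
import java.util.function.Predicate;

final class SpellMarker {

    /** A suspect word plus its [start, end) character offsets in the text. */
    static final class Suspect {
        final String word;
        final int start;
        final int end;

        Suspect(String word, int start, int end) {
            this.word = word;
            this.start = start;
            this.end = end;
        }

        @Override
        public String toString() {
            return word + "@" + start + "-" + end;
        }
    }

    /**
     * Splits the text into words and returns those the predicate rejects.
     * isKnownWord is a hypothetical stand-in for the dictionary lookup,
     * e.g. a wrapper around speller.exist(word).
     */
    static List<Suspect> findSuspects(String text, Predicate<String> isKnownWord) {
        List<Suspect> suspects = new ArrayList<>();
        BreakIterator words = BreakIterator.getWordInstance(Locale.ENGLISH);
        words.setText(text);
        int start = words.first();
        for (int end = words.next(); end != BreakIterator.DONE;
                start = end, end = words.next()) {
            String token = text.substring(start, end);
            // BreakIterator also yields whitespace and punctuation runs; skip them.
            if (!Character.isLetter(token.codePointAt(0))) {
                continue;
            }
            // Assumption: dictionary entries are lowercase, so normalize first.
            if (!isKnownWord.test(token.toLowerCase(Locale.ENGLISH))) {
                suspects.add(new Suspect(token, start, end));
            }
        }
        return suspects;
    }

    public static void main(String[] args) {
        // Toy word set standing in for the Lucene spell index.
        Set<String> known = new HashSet<>(Arrays.asList("the", "bees", "are", "busy"));
        for (Suspect s : findSuspects("The beez are busy.", known::contains)) {
            System.out.println(s);
        }
    }
}
```

With the toy word set in main, the only word flagged is "beez", at offsets 4 to 8. The offsets could then be used to place markers in the original text, and each flagged word could be passed to speller.suggestSimilar for suggestions.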
Thanks,
Mark