lucene index details

2009-02-19 Thread Seid Mohammed
I am new to Lucene and am reading the Lucene in Action book. Sometimes I understand better when someone tells me an answer than when I read it in a book. My question is: when indexing, what is Lucene actually doing? If I have a file called test.txt with contents "lucene is used to index files" and I apply Lucene indexing, wh

Re: lucene index details

2009-02-19 Thread Erick Erickson
You have to look at Analyzers a bit here because that's what controls what is in the index. The simplest case is a WhitespaceAnalyzer that breaks the input stream up into tokens on any whitespace. So, in your example and using a WhitespaceAnalyzer, you'd get the following tokens: lucene, is, used,
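
To make the whitespace case concrete, here is a rough sketch in plain Python. It does not use Lucene's API: `whitespace_tokenize` and `build_inverted_index` are hypothetical helper names made up for illustration, and the term-to-positions map is only a simplified picture of an inverted index, not Lucene's actual on-disk format. The sample text assumes the file content is spelled "lucene", as in the token list above.

```python
from collections import defaultdict


def whitespace_tokenize(text):
    """Split text into tokens on any run of whitespace.

    Like a whitespace-only analyzer, this does no lowercasing,
    stemming, or stop-word removal.
    """
    return text.split()


def build_inverted_index(docs):
    """Map each token to the documents and token positions where it occurs."""
    index = defaultdict(dict)
    for doc_id, text in docs.items():
        for position, token in enumerate(whitespace_tokenize(text)):
            index[token].setdefault(doc_id, []).append(position)
    return dict(index)


# Hypothetical one-document corpus based on the example in the question.
docs = {"test.txt": "lucene is used to index files"}

tokens = whitespace_tokenize(docs["test.txt"])
index = build_inverted_index(docs)

print(tokens)
print(index["index"])
```

A search for a term then becomes a lookup of that term in the map, which is why the choice of analyzer matters: only the tokens it emits can be found later.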

Re: lucene index details

2009-02-22 Thread Matt Ronge
On Feb 19, 2009, at 5:17 AM, Seid Mohammed wrote: I am new to Lucene and am reading the Lucene in Action book. Sometimes I understand better when someone tells me an answer than when I read it in a book. My question is: when indexing, what is Lucene actually doing? If I have a file called test.txt with contents "luce