Madhu, Analyzer is the magic word here.
Lucene's StandardAnalyzer uses a whole grammar to split text into tokens, so it does more than break on spaces. There are many more analyzers, most of which are language-specific (e.g. based on the Snowball or Porter stemmers; see the contribs or the javadoc of the core).
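To make the difference concrete, here is a minimal plain-Java sketch that contrasts splitting on whitespace only with a simplified analyzer-style pass that breaks on any non-alphanumeric character and lowercases the result. It is only an illustration: it does not use Lucene and is not how StandardAnalyzer is implemented (the real grammar handles many more cases), and the names SplitDemo, whitespaceTokens and simpleTokens are made up for this example.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical demo class, not part of Lucene.
class SplitDemo {

    // Naive approach: break only on runs of whitespace.
    // Punctuation stays attached to the words and case is preserved.
    static List<String> whitespaceTokens(String text) {
        List<String> out = new ArrayList<>();
        for (String part : text.trim().split("\\s+")) {
            if (!part.isEmpty()) {
                out.add(part);
            }
        }
        return out;
    }

    // Simplified analyzer-style approach (an assumption for illustration):
    // every character that is not a letter or digit ends the current token,
    // and tokens are lowercased.
    static List<String> simpleTokens(String text) {
        List<String> out = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        for (int i = 0; i < text.length(); i++) {
            char c = text.charAt(i);
            if (Character.isLetterOrDigit(c)) {
                current.append(Character.toLowerCase(c));
            } else if (current.length() > 0) {
                out.add(current.toString());
                current.setLength(0);
            }
        }
        if (current.length() > 0) {
            out.add(current.toString());
        }
        return out;
    }

    public static void main(String[] args) {
        String text = "Lucene's index-time splitting, explained.";
        System.out.println(whitespaceTokens(text));
        System.out.println(simpleTokens(text));
    }
}
```

The two methods give different token lists for the same input, and that choice is exactly what picking an Analyzer decides in Lucene.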
For which language do you wish to use it?

paul

On 13 Sept. 05, at 11:45, Madhu Satyanarayana Panitini wrote:
Hi all,

I want to know how Lucene splits text before indexing. Does it split wherever there is a space between words, or is there some other pattern for splitting the words of a text document? In which program can I find the code that splits the words?

Madhu

Madhu Satyanarayana Panitini
PASS GCA Solution Centre Pvt Ltd.
601 Aditya Trade Centre, Ameerpet, Hyderabad, India.
www.pass-consulting.com