I've never used the German analyzer, so I don't know which stop words it defines or uses. Someone else will have to answer that. Sorry.
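
For what it's worth, here is a minimal sketch of the distinction from the quoted thread: the stored value of a field is the original text, while the indexed terms are whatever the analyzer emits. This is plain Java, not the Lucene API. The class name, the three-word stop list, and the toy analyze() method are all hypothetical and only stand in for what GermanAnalyzer does (there is no stemming here). On the Lucene side, if I read the 1.4 Javadoc correctly, Field.Text(String, String) stores the value as well as indexing it, while Field.UnStored(String, String) indexes it without storing, so that may be why Luke shows the full text. Please check this against the version you are using.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;

class AnalyzedVsStored {
    // Hypothetical stop-word subset for illustration only;
    // this is NOT GermanAnalyzer's actual list.
    static final Set<String> STOP_WORDS =
            new HashSet<>(Arrays.asList("der", "die", "das"));

    // Toy "analysis": lower-case, split on non-letters, drop stop words.
    static List<String> analyze(String text) {
        List<String> terms = new ArrayList<>();
        for (String token : text.toLowerCase(Locale.GERMAN).split("[^\\p{L}]+")) {
            if (!token.isEmpty() && !STOP_WORDS.contains(token)) {
                terms.add(token);
            }
        }
        return terms;
    }

    // Term -> frequency map, a minimal stand-in for a term vector.
    static Map<String, Integer> termVector(String text) {
        Map<String, Integer> freq = new TreeMap<>();
        for (String term : analyze(text)) {
            freq.merge(term, 1, Integer::sum);
        }
        return freq;
    }

    public static void main(String[] args) {
        String original = "Der Hund und die Katze und das Haus";
        // "Stored" keeps the text verbatim; "indexed" keeps only the terms.
        System.out.println("stored:  " + original);
        System.out.println("indexed: " + termVector(original));
    }
}
```

The point of the sketch is that dropping stop words only affects the indexed terms. A viewer that shows the stored value will still display the articles, whichever analyzer was used.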

On Wed, 22 Dec 2004 17:45:17 +0100, DES <[EMAIL PROTECTED]> wrote:
> I actually use Field.Text(String,String) to add documents to my index. Maybe
> I do not understand the way an analyzer works, but I thought that all German
> articles (der, die, das, etc.) should be filtered out. However, if I use Luke
> to view my index, the original text is completely stored in a field. And
> what I need is a term vector that I can create from an indexed document
> field. So this field should contain terms only.
>
> > Whether or not the text is stored in the index is a different concern
> > than how it is analyzed. If you want the text to be indexed, and not
> > stored, then use the Field.Text(String, String) method or the
> > appropriate constructor when adding a field to the Document. You'll
> > also need to store a reference to the actual file (URL, path, etc.) in
> > the document so it can be retrieved from the doc returned in the Hits
> > object.
> >
> > Or did I completely misunderstand the question?
> >
> > -Mike
> >
> > On Wed, 22 Dec 2004 17:23:24 +0100, DES <[EMAIL PROTECTED]> wrote:
> >> Hi,
> >>
> >> I need to index my text so that the index contains only tokenized, stemmed
> >> words without stop words etc. The text is German, so I tried to use
> >> GermanAnalyzer, but it stores the whole text, not terms. Please give me a
> >> tip on how to index terms only. Thanks!
> >>
> >> DES

---------------------------------------------------------------------
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]