I actually use Field.Text(String,String) to add documents to my index. Maybe I do not understand how an analyzer works, but I thought that all German articles (der, die, das, etc.) would be filtered out. However, when I use Luke to view my index, the original text is stored completely in the field. What I need is a term vector that I can create from an indexed document field, so this field should contain terms only.

Whether or not the text is stored in the index is a different concern
than how it is analyzed.  If you want the text to be indexed, and not
stored, then use the Field.UnStored(String, String) method or the
appropriate constructor when adding a field to the Document.  (Note
that Field.Text(String, String) indexes the text and also stores it.)
You'll also need to store a reference to the actual file (URL, path,
etc.) in the document so it can be retrieved from the doc returned in
the Hits object.
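
To make the distinction concrete, here is a minimal toy sketch in plain
Java. It is not Lucene code and uses none of Lucene's API: the ToyField
class, its tiny stopword list, and the analyze method are all hypothetical
names made up for illustration, and it does no stemming. It only shows that
the analyzed terms and the stored original text are kept separately, so a
viewer can display the full text even though the stopwords never reached
the indexed terms.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Locale;
import java.util.Set;

// Toy model only: this is NOT Lucene and does not mirror its internals.
class ToyField {
    // Hypothetical, deliberately tiny German stopword list.
    static final Set<String> STOPWORDS =
        new HashSet<>(Arrays.asList("der", "die", "das", "und", "ein", "eine"));

    final List<String> indexedTerms; // what a search would match against
    final String storedValue;        // what a viewer displays; null if not stored

    ToyField(String text, boolean store) {
        this.indexedTerms = analyze(text);       // analysis always happens
        this.storedValue = store ? text : null;  // storage is a separate choice
    }

    // Lowercase, split on non-letter characters, drop stopwords (no stemming).
    static List<String> analyze(String text) {
        List<String> terms = new ArrayList<>();
        for (String token : text.toLowerCase(Locale.ROOT).split("[^\\p{L}]+")) {
            if (!token.isEmpty() && !STOPWORDS.contains(token)) {
                terms.add(token);
            }
        }
        return terms;
    }

    public static void main(String[] args) {
        ToyField stored = new ToyField("Der Hund und die Katze", true);
        ToyField unstored = new ToyField("Der Hund und die Katze", false);
        System.out.println(stored.indexedTerms);  // [hund, katze]
        System.out.println(stored.storedValue);   // Der Hund und die Katze
        System.out.println(unstored.storedValue); // null
    }
}
```

In both cases the indexed terms are identical; only the stored copy of the
original text differs.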

Or did I completely misunderstand the question?

-Mike

On Wed, 22 Dec 2004 17:23:24 +0100, DES <[EMAIL PROTECTED]> wrote:
hi

I need to index my text so that the index contains only tokenized, stemmed words without stopwords etc. The text is German, so I tried to use GermanAnalyzer, but it stores the whole text, not terms. Please give me a tip on how to index terms only. Thanks!

DES


--------------------------------------------------------------------- To unsubscribe, e-mail: [EMAIL PROTECTED] For additional commands, e-mail: [EMAIL PROTECTED]


