I think you meant Field.Index.NO and Field.Index.TOKENIZED, for those
two docs.
The answer is yes -- Lucene considers the field "indexed" if ever any
doc, even a single doc, had set Index.TOKENIZED or Index.UN_TOKENIZED
for that field.
However, your document A still will not have been indexed.
Lucene has no global schema; it allows individual docs to have
whatever fields, with whatever options, they want while indexing.
However, when documents are indexed together into a segment, some
properties, eg isIndexed, omitNorms, storeTermVectors, etc, are stored
globally in the FieldInfos (*.fnm files) and thus must be "merged".
EG, with omitNorms, only if every doc in the segment called omitNorms
for field "foo" will norms actually not be stored.
Mike
DimitriD wrote:
John,
Thanks for answering.
So if I have a document "A "with field "foo" and the field has
attribute
Field.NO and
I have document "B" with field "foo" and the field is Field.TOKENIZED.
Will IndexReader.getFieldNames(IndexReader.FieldOption.INDEXED)
return field
"foo"?
Thanks,
Dimitri
John Griffin-3 wrote:
Dimitri,
Field.TOKENIZED and Field.NO_NORMs send their field's contents
through a tokenizer and make their contents indexed and therefore
searchable. FIELD.UN_TOKENIZED does not send its field's contents
through
a
tokenizer but it still indexes its contents. Only Field.NO does not
index
its field's contents.
So, regardless of whether or not a particular document has a field
tokenized while another document does not has nothing to do with
whether
or
not the contents are indexed. As long as it's not Field.NO, they are.
John G.
-----Original Message-----
From: DimitriD [mailto:[EMAIL PROTECTED]
Sent: Friday, August 22, 2008 9:14 AM
To: java-user@lucene.apache.org
Subject: Field Question
I am new to lucene. Here is my question. The document has fields.
When I
add
a field to the document I can specify that field is Indexed,
Tokenized,
etc.. So the same field can be Tokenized in one document and be
not-tokenized in another document. However the is a method
IndexReader.getFieldNames(IndexReader.FieldOption.INDEXED) that
returns
all
index fields in the index. It seems like it assumes that FIELD
should have
the same attributes across all documents.
Can anyoen explain it?
--
View this message in context:
http://www.nabble.com/Field-Question-tp19108787p19108787.html
Sent from the Lucene - Java Users mailing list archive at Nabble.com.
---------------------------------------------------------------------
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]
---------------------------------------------------------------------
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]
--
View this message in context:
http://www.nabble.com/RE%3A-Field-Question-tp19116394p19136508.html
Sent from the Lucene - Java Users mailing list archive at Nabble.com.
---------------------------------------------------------------------
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]
---------------------------------------------------------------------
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]