Re: Field Question

Michael McCandless Mon, 25 Aug 2008 02:23:34 -0700

I think you meant Field.Index.NO and Field.Index.TOKENIZED, for thosetwo docs.

The answer is yes -- Lucene considers the field "indexed" if ever anydoc, even a single doc, had set Index.TOKENIZED or Index.UN_TOKENIZEDfor that field.


However, your document A still will not have been indexed.

Lucene has no global schema; it allows individual docs to havewhatever fields, with whatever options, they want while indexing.However, when documents are indexed together into a segment, someproperties, eg isIndexed, omitNorms, storeTermVectors, etc, are storedglobally in the FieldInfos (*.fnm files) and thus must be "merged".EG, with omitNorms, only if every doc in the segment called omitNormsfor field "foo" will norms actually not be stored.


Mike

DimitriD wrote:


John,

Thanks for answering.

So if I have a document "A "with field "foo" and the field hasattribute

Field.NO and
I have document "B" with field "foo" and the field is Field.TOKENIZED.

Will IndexReader.getFieldNames(IndexReader.FieldOption.INDEXED)return field

"foo"?

Thanks,

Dimitri


John Griffin-3 wrote:


Dimitri,

        Field.TOKENIZED and Field.NO_NORMs send their field's contents
through a tokenizer and make their contents indexed and therefore

searchable. FIELD.UN_TOKENIZED does not send its field's contentsthrough

tokenizer but it still indexes its contents. Only Field.NO does notindex

its field's contents.
        
        So, regardless of whether or not a particular document has a field

tokenized while another document does not has nothing to do withwhether

or
not the contents are indexed. As long as it's not Field.NO, they are.

John G.

-----Original Message-----
From: DimitriD [mailto:[EMAIL PROTECTED]
Sent: Friday, August 22, 2008 9:14 AM
To: [email protected]
Subject: Field Question

I am new to lucene. Here is my question. The document has fields.When I

add

a field to the document I can specify that field is Indexed,Tokenized,

etc.. So the same field can be Tokenized in one document and be
not-tokenized in another document. However the is a method

IndexReader.getFieldNames(IndexReader.FieldOption.INDEXED) thatreturns

all

index fields in the index. It seems like it assumes that FIELDshould have

the same attributes across all documents.
Can anyoen explain it?

--
View this message in context:
http://www.nabble.com/Field-Question-tp19108787p19108787.html
Sent from the Lucene - Java Users mailing list archive at Nabble.com.


---------------------------------------------------------------------
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]


---------------------------------------------------------------------
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]


--
View this message in context: 
http://www.nabble.com/RE%3A-Field-Question-tp19116394p19136508.html
Sent from the Lucene - Java Users mailing list archive at Nabble.com.


---------------------------------------------------------------------
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]



---------------------------------------------------------------------
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]

Re: Field Question

Reply via email to