On Nov 28, 2006, at 4:31 PM, Michael Rusch wrote:

I have documents that can be referred to by multiple identifiers (and I want
to store the identifiers separate from the main indexed content).  I'm
wondering if I should put each identifier in its own keyword field, or have
one tokenized field with all of the identifiers in it. What I'm talking
about is something like this:

"Identifier" is a keyword field
Add field Identifier="ABCD"
Add field Identifier="WXYZ"

Or

Identifiers is a tokenized, indexed, unstored field
Add field Identifiers="ABCD WXYZ"

It would seem that either would work, but I was wondering if there was a
"standard" way or if anybody had thoughts on relative advantages or
disadvantages (or is it half of one/six dozen of the other and I should just
pick one and go with it).

There is no real difference between these approaches for identifiers. Multiple instances of the same field can have a position increment gap between them, but that only affects position-sensitive queries such as phrase queries, so it isn't a consideration for TermQuery lookups on these identifiers. I recommend the multiple keyword field approach, since it avoids having to deal with analysis (in case identifiers contain special characters, etc.).
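
To make the analysis point concrete, here is a rough sketch in plain Java. It is a toy inverted index, not Lucene's API: the class IdentifierIndexSketch, its methods, and the sample identifier "WX-YZ/7" are all made up for illustration, and the stand-in analyzer (lowercase, split on anything that is not a letter or digit) is only an assumption about what an analyzer might do. A real analyzer may behave differently; one that splits on whitespace only would leave "WX-YZ/7" intact.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

// Toy stand-in for an inverted index. NOT Lucene's API; all names are hypothetical.
final class IdentifierIndexSketch {
    // term -> ids of the documents that contain that term
    private final Map<String, Set<Integer>> postings = new HashMap<>();

    // Approach 1: the identifier is indexed verbatim as a single term (keyword style).
    void addKeyword(int docId, String identifier) {
        postings.computeIfAbsent(identifier, k -> new TreeSet<>()).add(docId);
    }

    // Approach 2: one field value holding all identifiers, run through the analyzer.
    void addTokenized(int docId, String identifiers) {
        for (String token : analyze(identifiers)) {
            addKeyword(docId, token);
        }
    }

    // Assumed analyzer: lowercases, then splits on any run of non-alphanumeric characters.
    static List<String> analyze(String text) {
        List<String> tokens = new ArrayList<>();
        for (String t : text.toLowerCase(Locale.ROOT).split("[^a-z0-9]+")) {
            if (!t.isEmpty()) {
                tokens.add(t);
            }
        }
        return tokens;
    }

    // Exact-term lookup, similar in spirit to a term query: the query term is not analyzed.
    Set<Integer> lookup(String term) {
        return postings.getOrDefault(term, new TreeSet<>());
    }

    public static void main(String[] args) {
        IdentifierIndexSketch keyword = new IdentifierIndexSketch();
        keyword.addKeyword(1, "ABCD");
        keyword.addKeyword(1, "WX-YZ/7");

        IdentifierIndexSketch tokenized = new IdentifierIndexSketch();
        tokenized.addTokenized(1, "ABCD WX-YZ/7");

        System.out.println(keyword.lookup("WX-YZ/7"));   // the identifier matches as written
        System.out.println(tokenized.lookup("WX-YZ/7")); // no match: analysis split the identifier
        System.out.println(tokenized.lookup("wx"));      // only the analyzed fragments match
    }
}
```

With the keyword-style fields, a lookup has to supply the identifier exactly as it was indexed and nothing else needs to be considered. With the tokenized field, the lookup term has to match whatever the analyzer produced, which is the extra bookkeeping the recommendation above avoids.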
        
        Erik

