On Nov 28, 2006, at 4:31 PM, Michael Rusch wrote:
I have documents that can be referred to by multiple identifiers (and I
want to store the identifiers separate from the main indexed content).
I'm wondering if I should put each identifier in its own keyword field,
or have one tokenized field with all of the identifiers in it. What I'm
talking about is something like this:
"Identifier" is a keyword field
Add field Identifier="ABCD"
Add field Identifier="WXYZ"
Or
"Identifiers" is a tokenized, indexed, unstored field
Add field Identifiers="ABCD WXYZ"
It would seem that either would work, but I was wondering if there was a
"standard" way or if anybody had thoughts on relative advantages or
disadvantages (or is it half of one/six dozen of the other and I should
just pick one and go with it).
There really is no practical difference between these approaches for
identifiers. Separate field instances do get a positional gap between
them, but that only matters for position-sensitive queries (phrase or
span queries), not for TermQuery lookups on these identifiers. I
recommend the multiple keyword field approach, to avoid having to deal
with analysis (in case identifiers have special characters, etc.).
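
To illustrate the analysis concern, here is a rough sketch in plain
Java. It does not use the Lucene API: analyzedTerms is a made-up
stand-in for an analyzer that splits on punctuation and lowercases
(real analyzers differ, and a whitespace-only one would behave
differently), and the identifier "AB-12/X" is a hypothetical example.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Locale;

public class IdentifierFieldSketch {

    // Keyword-style indexing: each value becomes exactly one term, unchanged.
    static List<String> keywordTerms(String... values) {
        return new ArrayList<>(Arrays.asList(values));
    }

    // Toy stand-in for an analyzer (an assumption, not Lucene's behavior):
    // split on anything that is not a letter or digit, then lowercase.
    static List<String> analyzedTerms(String text) {
        List<String> terms = new ArrayList<>();
        for (String token : text.split("[^\\p{L}\\p{N}]+")) {
            if (!token.isEmpty()) {
                terms.add(token.toLowerCase(Locale.ROOT));
            }
        }
        return terms;
    }

    // Exact term lookup, in the spirit of a TermQuery: the query term
    // is compared as-is, with no analysis applied to it.
    static boolean termMatches(List<String> indexedTerms, String queryTerm) {
        return indexedTerms.contains(queryTerm);
    }

    public static void main(String[] args) {
        List<String> keyword = keywordTerms("ABCD", "AB-12/X");
        List<String> analyzed = analyzedTerms("ABCD AB-12/X");

        System.out.println("keyword terms:  " + keyword);
        System.out.println("analyzed terms: " + analyzed);
        System.out.println("keyword match:  " + termMatches(keyword, "AB-12/X"));
        System.out.println("analyzed match: " + termMatches(analyzed, "AB-12/X"));
    }
}
```

With the keyword approach the exact identifier is still findable; with
this toy analyzer it has been broken into pieces, so the query side
would need to apply the same analysis to match.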
Erik