The first question is: how do you want to access the data? What should your queries look like?

What is the larger context? Are these properties of larger documents? Is there more than one per document? Etc.

Why not just store the property as a tokenized field? Then you can query whether v(i) or v(j) are or are not present as keywords.
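
To make the idea concrete, here is a minimal sketch in plain Java. It is not Lucene's API; the class ToyTermIndex and its methods are hypothetical names invented for illustration. It only models what a tokenized, indexed field gives you, assuming the list is supplied as one whitespace-separated string per document: a map from each term to the documents that contain it, which makes "is v(i) present / not present" a cheap lookup.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

// Toy model only: this is NOT Lucene's API. It mimics what a tokenized,
// indexed field provides: a map from each term to the documents containing it.
class ToyTermIndex {
    private final Map<String, Set<Integer>> postings = new HashMap<>();

    // Split the property value on whitespace and record each token for docId.
    void addDocument(int docId, String propertyValue) {
        for (String term : propertyValue.trim().split("\\s+")) {
            if (!term.isEmpty()) {
                postings.computeIfAbsent(term, t -> new TreeSet<>()).add(docId);
            }
        }
    }

    // Documents whose property contains the given term (a fresh copy).
    Set<Integer> docsWith(String term) {
        Set<Integer> result = new TreeSet<>();
        Set<Integer> hits = postings.get(term);
        if (hits != null) {
            result.addAll(hits);
        }
        return result;
    }

    // Documents that contain `present` but do not contain `absent`.
    Set<Integer> docsWithButWithout(String present, String absent) {
        Set<Integer> result = docsWith(present);
        result.removeAll(docsWith(absent));
        return result;
    }

    public static void main(String[] args) {
        ToyTermIndex index = new ToyTermIndex();
        index.addDocument(1, "v1 v2 v3");
        index.addDocument(2, "v2 v4");
        System.out.println("contains v2: " + index.docsWith("v2"));
        System.out.println("v2 but not v1: " + index.docsWithButWithout("v2", "v1"));
    }
}
```

In Lucene itself the analyzer would do the splitting and a term or boolean query would do the lookup. Note that this sketch only answers presence questions; getting the original list back is a separate matter of storing it.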

-- Jack Krupansky

-----Original Message----- From: Paul Bell
Sent: Sunday, March 31, 2013 8:21 AM
To: java-user@lucene.apache.org
Subject: Indexing a long list

Hi All,

Suppose I need to index a property whose value is a long list of terms. For
example,

   someProperty = ["v1", "v2", .... , "v1000000"]

Please note that I could drop the leading "v" and index these as numbers
instead of strings.

But the question is what's the best practice in Lucene when dealing with a
case like this? I need to be able to retrieve the list. This makes me think
that I need to store it. And I suppose that the list could be stored in the
index itself or in the "content" to which the index points.

So there are really two parts to this question:

1. Lucene "best practices" for long lists
2. Where to store such a list

Thanks for your help.

-Paul

---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscr...@lucene.apache.org
For additional commands, e-mail: java-user-h...@lucene.apache.org