The use case is when some data changes frequently and some data is
static, _and_ the volatile index can be rebuilt in the same order in
which the static one was built. The indexes must be "parallel" in
terms of document order. If you delete a document, you should delete
it from both indexes, and likewise when you add one.
Erik
On Oct 10, 2005, at 4:14 PM, John Smith wrote:
Sorry to bug people on this again and again.
I might be missing something or be totally confused, but what is the
use case for a ParallelReader if it does not address the situation
where we have one index that changes frequently (meaning deletes
and reindexing) and another index that does not change but has the
same number of docs? Wouldn't people want to stick with just one
index in any case?
Any comments or responses appreciated.
JZ
John Smith <[EMAIL PROTECTED]> wrote:
A while ago I had asked a question on what would be a good solution
for a situation mentioned below and I was pointed in the direction
of Parallel Reader. Looks like that will not work.
Thank you for alerting me on this.
So other than deleting and reindexing the document in a single index,
there is no way of addressing the situation.
JZ
Eyal wrote:
Run a search on "Lucene ParallelReader" in Google. You'll find
something Doug Cutting wrote that I believe is what you're looking for.
Eyal
-----Original Message-----
From: John Smith [mailto:[EMAIL PROTECTED]
Sent: Thursday, August 11, 2005 21:12 PM
To: java-user@lucene.apache.org
Subject: Updating existing documents in index: Solutions
Hi all
This is a slightly long email. Pardon me.
As Lucene does not allow for updating an existing document in
the index, the only option is to delete and reindex the
message. When you have too many updates, this gets a little
cumbersome. In our case, the actual content of the document
being indexed does not change, but the fields around the
content, such as "LastReadBy" or the Folder associated with
it, do change. These are all fields that have been indexed as
part of the original document in the index.
I have been contemplating putting these "commonly changing
fields" into one index and allow for delete and reindex on
this index alone and keep the static data in another index.
DocumentID will be a stored field and will be stored in both
the static and dynamic index, as a way of identifying the document.
Static index: Contains content of document indexed and
documentID stored.
Dynamic index: Contains all fields about the document which
change frequently indexed and documentID stored.
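As a rough illustration of the split described above, the sketch below
divides one document's fields into a static part and a dynamic part,
both carrying the documentID. It is plain Java with documents modeled
as maps, not Lucene code; the class and method names are hypothetical,
and the field names are the ones used in this message.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper: a document is modeled as a field-name -> value map.
final class FieldSplitSketch {
    static final String ID = "documentID";
    static final String CONTENT = "Content";

    // Static index entry: the content plus the documentID.
    static Map<String, String> staticPart(Map<String, String> doc) {
        Map<String, String> out = new LinkedHashMap<>();
        out.put(ID, doc.get(ID));
        out.put(CONTENT, doc.get(CONTENT));
        return out;
    }

    // Dynamic index entry: every other field, plus the documentID.
    static Map<String, String> dynamicPart(Map<String, String> doc) {
        Map<String, String> out = new LinkedHashMap<>(doc);
        out.remove(CONTENT);
        return out;
    }

    public static void main(String[] args) {
        Map<String, String> doc = new LinkedHashMap<>();
        doc.put(ID, "42");
        doc.put(CONTENT, "Hello world");
        doc.put("Folder", "Folder1");
        doc.put("LastReadBy", "jane");
        System.out.println("static:  " + staticPart(doc));
        System.out.println("dynamic: " + dynamicPart(doc));
    }
}
```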
Questions
1. First of all, is there a better solution to having to
reindex these frequently changing fields?
2. Let's say I go with the 2 index approach.
Example query: Content:"Hello world" AND Folder:Folder1 AND
LastReadBy:jane. If we execute this query as-is on either the
static or the dynamic index, it will obviously fail to get
hits, since neither index contains all of the fields.
Let's say I have a way of splitting my queries such that
all content queries go to the static (content) index only and
queries on other fields go to the dynamic index, so that the
final result is always an AND between the dynamic index
result set and the static index result set. On each result
set, I would then have to retrieve the documentID and make
sure the same documentID appears in both result sets in order
for it to be a match.
In cases where the result sets from both queries are really
huge, then even to get the number of hits, I will have to
retrieve each and every document from the results in order to
get the documentID for comparison. Queries can get really
slow.
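The AND between the two result sets can be sketched as a set
intersection on documentID. This is plain Java, not Lucene code: the
class and method names are hypothetical, and each hit list is assumed
to have already been reduced to its documentID strings, which is
exactly the expensive step described above.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical helper: intersects two hit lists by documentID.
final class ResultIntersectionSketch {
    // Returns the documentIDs present in both lists, in static-hit order.
    static List<String> intersect(List<String> staticHits, List<String> dynamicHits) {
        Set<String> dynamicIds = new HashSet<>(dynamicHits); // constant-time lookups
        List<String> both = new ArrayList<>();
        for (String id : staticHits) {
            if (dynamicIds.contains(id)) {
                both.add(id);
            }
        }
        return both;
    }

    public static void main(String[] args) {
        List<String> staticHits = Arrays.asList("d1", "d2", "d3");  // Content query
        List<String> dynamicHits = Arrays.asList("d3", "d1", "d9"); // Folder/LastReadBy query
        System.out.println(intersect(staticHits, dynamicHits));
    }
}
```

The intersection itself is linear in the number of hits. The sketch
does nothing about the cost of loading the stored documentID for every
hit, so it only illustrates the matching step, not a fix for the
slowness.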
Has anyone faced similar problems? If so, what was your solution?
Any comments/thoughts will be appreciated.
Thank you
JS
Daniel Naber wrote:
On Monday 10 October 2005 20:24, John Smith wrote:
My understanding is ParallelReader works for situations where you
have a static index and a dynamic index.
That's not correct. Quoting the documentation:
It is up to you to make sure all indexes
are created and modified the same way. For example, if you add
documents to one index, you need to add the same documents in the
same order to the other indexes. Failure to do so will result in
undefined behavior.
Regards
Daniel
--
http://www.danielnaber.de
---------------------------------------------------------------------
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]