Re: Iterating Over All Documents On a Changing Index

Matt Davis Wed, 30 Oct 2019 16:56:48 -0700

Thanks for the clarification.  I have written my own logic tracking changes
and ignoring documents that have been written or deleted since the reindex
started.
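
A minimal sketch of that kind of tracking, for anyone who finds this thread
later. It is not the actual code: a plain in-memory map stands in for the
Lucene index, and the class and method names (ReindexSketch, begin,
reindexOne, and so on) are hypothetical. It assumes that live writes arriving
during the reindex are already indexed with the new rules, and that live
writes and reindex writes are serialized against each other (here with
synchronized methods), so the "was it touched?" check and the write cannot
interleave with a live change.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;
import java.util.function.UnaryOperator;

// Hypothetical stand-in for an index plus change tracking; not a Lucene API.
class ReindexSketch {
    private final Map<String, String> index = new HashMap<>();
    private final Set<String> touched = new HashSet<>();
    private boolean reindexing = false;

    // Start a reindex: reset tracking and return a point-in-time copy,
    // playing the role of a freshly opened reader.
    public synchronized Map<String, String> begin() {
        touched.clear();
        reindexing = true;
        return new HashMap<>(index);
    }

    // Live write path: apply the write and remember the id if a reindex is running.
    public synchronized void write(String id, String value) {
        index.put(id, value);
        if (reindexing) {
            touched.add(id);
        }
    }

    // Live delete path: apply the delete and remember the id if a reindex is running.
    public synchronized void delete(String id) {
        index.remove(id);
        if (reindexing) {
            touched.add(id);
        }
    }

    // Reindex one document from the snapshot, unless it was written or
    // deleted since begin(). Returns true if the document was rewritten.
    public synchronized boolean reindexOne(String id, String oldValue,
                                           UnaryOperator<String> newSchema) {
        if (touched.contains(id)) {
            return false; // the live change wins; the stale snapshot copy is ignored
        }
        index.put(id, newSchema.apply(oldValue));
        return true;
    }

    // Finish the reindex and stop tracking.
    public synchronized void end() {
        reindexing = false;
        touched.clear();
    }

    public synchronized String get(String id) {
        return index.get(id);
    }
}
```

The reindex loop calls begin(), iterates over the returned snapshot calling
reindexOne() for each entry, and then calls end(). Any document updated or
deleted in between is skipped, so a stale copy never overwrites a newer one.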

On Mon, Oct 21, 2019, 4:58 PM Adrien Grand <[email protected]> wrote:

> This is the right place to ask these questions indeed.
>
> This is a good way to iterate over documents. Regarding your 2nd
> question, Lucene IndexReaders are point-in-time views of the data, so
> changes won't become visible in-place. The tricky part of this kind of
> reindexing is usually dealing with documents that get indexed after
> you have pulled a new reader and while you are still in the process
> of reindexing.
>
> On Sat, Oct 19, 2019 at 1:35 AM Matt Davis <[email protected]> wrote:
> >
> > Hi All,
> >
> > I am working on implementing an in-place reindex using Lucene.  In my
> > case, I have a BSON document stored in a binary field and a set of rules
> > that pull fields out of the BSON and index them into different Lucene
> > fields with different analyzers.  I would like to be able to change these
> > rules / schema and then iterate over the documents, indexing them using
> > the new schema.
> >
> > I have come up with the following code block:
> > https://gist.github.com/mdavis95/f600e0a8233d0a1232eff77645d1dc8a
> >
> > I have two questions:
> > 1) Is this a good way to iterate over the documents?
> > 2) How can I manage documents changing while I am doing this?  New
> > documents coming in should be fine, I believe, but changes to existing
> > documents could be lost, if I understand correctly.
> >
> > I hope that this is the right place to ask this question, and I apologize
> > if this is obvious or has been asked and answered.
> >
> > Thanks,
> > Matt
>
>
>
> --
> Adrien
