Hello!
I'm having trouble with Lucene's IndexWriter. I'm benchmarking an algorithm
that might look like an NRT case, but I'm not sure I need NRT in
particular. The overall problem is writing a "join index" (a column that
holds docnums) by updating binary docvalues after a commit, i.e.:
 - update docs
 - commit
 - read docs (calling openIfChanged() first)
 - updateDocVals
 - commit

It's clunky but it works, until guess what happens... a merge. Oh my.

At some point I have these segments:
segments_ec:2090 _7c(5.0):C117/8:delGen=8:....
_7j(5.0):C1:fieldInfosGen=1:dvGen=1 _7k(5.0):C1)

I apply one update and trigger a commit. As a result I have:
segments_ee:2102 _7c(5.0):C117/9:delGen=9:..
_7k(5.0):C1:fieldInfosGen=1:dvGen=1 _7l(5.0):C1)
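
In case the notation in these listings is unfamiliar: as far as I understand
it, each entry reads name(version):C<maxDoc>, optionally followed by
/<delCount> and by generation counters such as delGen, fieldInfosGen and
dvGen (a lowercase c would mark a compound-file segment). Below is a minimal
sketch that decodes one entry under that assumption. SegmentEntry is a
hypothetical helper written for illustration, not a Lucene class, and it does
not handle the elided ("....") parts of the listings.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper (not part of Lucene): decodes one entry of the
// segment listing, e.g. "_7c(5.0):C117/9:delGen=9".
final class SegmentEntry {
    // Assumed layout: name(version):[cC]maxDoc[/delCount][:key=value]...
    private static final Pattern ENTRY = Pattern.compile(
        "(_\\w+)\\(([^)]*)\\):([cC])(\\d+)(?:/(\\d+))?((?::\\w+=\\d+)*)");

    final String name;        // segment name, e.g. "_7c"
    final String version;     // Lucene version that wrote it, e.g. "5.0"
    final boolean compound;   // true for 'c' (compound file), false for 'C'
    final int maxDoc;         // documents in the segment, deleted ones included
    final int delCount;       // deleted documents, 0 when no "/n" part is present
    final String generations; // raw ":key=value" suffix, possibly empty

    private SegmentEntry(Matcher m) {
        name = m.group(1);
        version = m.group(2);
        compound = m.group(3).equals("c");
        maxDoc = Integer.parseInt(m.group(4));
        delCount = m.group(5) == null ? 0 : Integer.parseInt(m.group(5));
        generations = m.group(6);
    }

    static SegmentEntry parse(String text) {
        Matcher m = ENTRY.matcher(text.trim());
        if (!m.matches()) {
            throw new IllegalArgumentException("Unrecognized segment entry: " + text);
        }
        return new SegmentEntry(m);
    }

    // Documents that are still live (not deleted) in this segment.
    int liveDocs() {
        return maxDoc - delCount;
    }

    public static void main(String[] args) {
        SegmentEntry e = SegmentEntry.parse("_7c(5.0):C117/9:delGen=9");
        System.out.println(e.name + ": " + e.liveDocs() + " live of " + e.maxDoc);
    }
}
```

Read this way, _7c goes from 8 to 9 deleted docs between the two commits,
which matches the one update I applied.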

However, somewhere inside this commit call, SerialMergeScheduler bakes a
single solid segment
_7m(5.0):C117
which has not been exposed via any segments file so far.

And now I get into trouble:
if I call DR.openIfChanged(segments_ec) (even after IW.waitForMerges()), I
get segments_ee, which is fairly reasonable, since it keeps things
incremental and fast. But if I then use that IndexWriter, it applies new
updates on top of the merged segment (_7m(5.0):C117), not on segments_ee,
and that breaks my assertions. I'd rather open a reader on that merged
_7m(5.0):C117, which IW keeps somewhere internally, preferably in the same
fancy, incremental way. If you can point me to how NRT can solve this, I'd
be happy to switch to it.

Thank you very much for your time!

-- 
Sincerely yours
Mikhail Khludnev
Principal Engineer,
Grid Dynamics

<http://www.griddynamics.com>
<[email protected]>
