Hello!

I'm having trouble with Lucene's IndexWriter. I'm benchmarking an algorithm that might look like an NRT case, though I'm not sure I actually need NRT. The overall task is to write a "join index" (a column that holds docnums) by updating binary doc values after a commit, i.e.:

- update docs
- commit
- read docs (calling openIfChanged() first)
- updateDocVals
- commit
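In code, one pass of that loop looks roughly like the sketch below. It is a simplified illustration rather than my actual benchmark: the class name JoinIndexLoop, the field names "id" and "join", the temporary directory and the hard-coded docnum are all placeholders, and it assumes Lucene 5.x with StandardAnalyzer on the classpath.

```java
import java.nio.ByteBuffer;
import java.nio.file.Files;

import org.apache.lucene.analysis.standard.StandardAnalyzer;
import org.apache.lucene.document.BinaryDocValuesField;
import org.apache.lucene.document.Document;
import org.apache.lucene.document.Field;
import org.apache.lucene.document.StringField;
import org.apache.lucene.index.DirectoryReader;
import org.apache.lucene.index.IndexWriter;
import org.apache.lucene.index.IndexWriterConfig;
import org.apache.lucene.index.Term;
import org.apache.lucene.store.Directory;
import org.apache.lucene.store.FSDirectory;
import org.apache.lucene.util.BytesRef;

public class JoinIndexLoop {

    /** Runs one update/commit/reopen/updateDocValue/commit pass; returns the live doc count seen by the reader. */
    public static int run() throws Exception {
        try (Directory dir = FSDirectory.open(Files.createTempDirectory("join-index"));
             IndexWriter writer = new IndexWriter(dir, new IndexWriterConfig(new StandardAnalyzer()))) {

            writer.commit(); // write an initial (empty) commit so a reader can be opened
            DirectoryReader reader = DirectoryReader.open(dir);
            try {
                // 1. update docs (the doc values field must already exist to be updatable later)
                Document doc = new Document();
                doc.add(new StringField("id", "42", Field.Store.YES));
                doc.add(new BinaryDocValuesField("join", new BytesRef(new byte[4])));
                writer.updateDocument(new Term("id", "42"), doc);

                // 2. commit
                writer.commit();

                // 3. reopen against the latest commit; openIfChanged returns null if nothing changed
                DirectoryReader changed = DirectoryReader.openIfChanged(reader);
                if (changed != null) {
                    reader.close();
                    reader = changed;
                }
                int joinedDocNum = 0; // placeholder: the real code derives this from the reader

                // 4. write the docnum into the "join" column
                byte[] encoded = ByteBuffer.allocate(4).putInt(joinedDocNum).array();
                writer.updateBinaryDocValue(new Term("id", "42"), "join", new BytesRef(encoded));

                // 5. commit
                writer.commit();

                return reader.numDocs();
            } finally {
                reader.close();
            }
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println("live docs seen by the reader: " + run());
    }
}
```

Note that step 3 reopens from the last commit, so the reader only ever sees what a commit has written to a segments file.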
It's clunky, but it works, until, guess what, a merge happens. Oh my. At one point I have these segments:

segments_ec:2090 _7c(5.0):C117/8:delGen=8:.... _7j(5.0):C1:fieldInfosGen=1:dvGen=1 _7k(5.0):C1)

I apply one update and trigger a commit. As a result I have:

segments_ee:2102 _7c(5.0):C117/9:delGen=9:.. _7k(5.0):C1:fieldInfosGen=1:dvGen=1 _7l(5.0):C1)

However, somewhere inside this commit call, SerialMergeScheduler produces a single merged segment, _7m(5.0):C117, which has not been exposed through any segments file so far.

And now I get into trouble: if I call DR.openIfChanged(segments_ec) (even after IW.waitForMerges()), I get segments_ee, which is fairly reasonable, since it keeps the reopen incremental and fast. But if I keep using that IndexWriter, it applies new updates on top of the merged segment (_7m(5.0):C117), not on segments_ee, and that breaks my assertions.

What I need instead is to open a reader on the merged _7m(5.0):C117, which the IW keeps somewhere internally, preferably in a fast and incremental way. If you can point me to how NRT can solve this, I'd be happy to switch to it.

Thank you very much for your time!

--
Sincerely yours
Mikhail Khludnev
Principal Engineer, Grid Dynamics

<http://www.griddynamics.com>
<[email protected]>
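P.S. To make the mismatch concrete, here is a toy model in plain Java. It is not Lucene code: ToyWriter and its fields are hypothetical names, and it only mimics what I observe, namely that the last commit (segments_ee) still lists _7c, _7k and _7l while the writer's own in-memory view has already replaced them with the merged _7m.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class CommitVsLiveView {

    /** Hypothetical stand-in for a writer: a live segment list plus the last committed snapshot. */
    static class ToyWriter {
        final List<String> live = new ArrayList<>();
        List<String> committed = new ArrayList<>();

        /** Snapshots the live list, like writing a new segments file. */
        void commit() {
            committed = new ArrayList<>(live);
        }

        /** Replaces the source segments with one merged segment, in the live view only. */
        void merge(List<String> sources, String merged) {
            live.removeAll(sources);
            live.add(merged);
        }
    }

    /** Replays the sequence from this mail: commit first, then the merge lands. */
    static ToyWriter scenario() {
        ToyWriter writer = new ToyWriter();
        writer.live.addAll(Arrays.asList("_7c", "_7k", "_7l"));
        writer.commit();                                         // plays the role of segments_ee
        writer.merge(Arrays.asList("_7c", "_7k", "_7l"), "_7m"); // merged segment, not yet in any commit
        return writer;
    }

    public static void main(String[] args) {
        ToyWriter writer = scenario();
        System.out.println("commit-based reader sees: " + writer.committed);
        System.out.println("writer applies updates to: " + writer.live);
    }
}
```

A reader opened from the commit corresponds to `committed`, while my next update lands on `live`; that gap is exactly what trips my assertions.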
