Hello Mike! Thanks for your attention. I pushed the mad case at
https://github.com/m-khl/lucene-merge-visibility/commit/fa2d60be5b13eb57e0527c843119cf62cfa83a7d#diff-86ebfbf440fe69ee36a52705cb92b824R120

It does the following:
- writes a pair of docs
- commits
- reopens the reader and searches for one of them
- updates this doc with its docnum (I know it's weird, but it should work if
  the reopened reader sees that update)
- commits this DV update
- searches for that doc and checks the written doc value.

It passes if hardReopenBeforeDVUpdate=true and fails otherwise. I know that
docnums naturally change, but I expect them not to change while I update
docvals.

Here is how it flips. At the commit after the doc update we have many segments:

now checkpoint "_0(6.0.0):C2/1:delGen=1:fieldInfosGen=1:dvGen=1 _1(6.0.0):C2:fieldInfosGen=1:dvGen=1 _2(6.0.0):C2:
commit: wrote segments file "segments_j"

but there is also a solid segment, which has been merged but not yet
committed/published:

after commitMerge: _a(6.0.0):c19

and after the DV update commit that solid segment becomes visible:

now checkpoint "_a(6.0.0):c19:fieldInfosGen=1:dvGen=1" [1 segments ; isCommit = true]
IFD 0 [Thu Sep 25 23:56:22 SAST 2014; TEST-TestNumDValUpdVsReaderVisibility.testSimple-seed#[6131CF35B3A45FC3]]: deleteCommits: now decRef commit "segments_j"
...
wrote segments file "segments_k"

I'm using SerialMergeScheduler, and I expect to see a single solid segment
after I commit the document updates and that commit triggers the merge. How can
I reopen a reader which sees it?

Thanks

On Wed, Sep 24, 2014 at 10:07 PM, Michael McCandless <[email protected]> wrote:

> I don't understand what's actually happening / going wrong here.
>
> Maybe you can make a test case / give more details?
>
> What assertions are broken? Why is it bad if SMS does a merge before
> you reopen? Why are you using SMS :)
>
> Mike McCandless
>
> http://blog.mikemccandless.com
>
> On Mon, Sep 22, 2014 at 6:00 PM, Mikhail Khludnev
> <[email protected]> wrote:
> > Hello!
> > I'm in trouble with Lucene Index Writer. I'm benchmarking some algorithm
> > which might seem like NRT-case, but I'm not sure that I need it
> > particularly.
> > The overall problem is to write a "join index" (a column that holds
> > docnums) by updating binary docvalues after commit, i.e.:
> > - update docs
> > - commit
> > - read docs (openIfChanged() before)
> > - updateDocVals
> > - commit
> >
> > It's clunky but it works, until guess what happens... merge. Oh my.
> >
> > At some point I have segments
> > segments_ec:2090 _7c(5.0):C117/8:delGen=8:....
> > _7j(5.0):C1:fieldInfosGen=1:dvGen=1 _7k(5.0):C1)
> >
> > I apply one update and trigger a commit; as a result I have:
> > segments_ee:2102 _7c(5.0):C117/9:delGen=9:..
> > _7k(5.0):C1:fieldInfosGen=1:dvGen=1 _7l(5.0):C1)
> >
> > However, somewhere inside this commit call, pretty
> > SerialMergeScheduler bakes the single solid segment
> > _7m(5.0):C117
> > which hasn't been exposed via any segments file so far.
> >
> > And now I get into trouble:
> > if I call DR.openIfChanged(segments_ec) (even after IW.waitMerges()), I
> > get segments_ee, which is fairly reasonable, to keep it incremental and
> > fast. But if I use that IndexWriter, it applies new updates on top of the
> > merged one (_7m(5.0):C117), not on segments_ee, and that broke my
> > assertions. I rather need to open a reader on that merged _7m(5.0):C117,
> > which IW keeps somewhere internally, and it's better if that is fancy &
> > incremental. If you can point me to how NRT can solve this, I'd be happy
> > to switch to it.
> >
> > Incredibly thank you for your time!!!
> >
> > --
> > Sincerely yours
> > Mikhail Khludnev
> > Principal Engineer,
> > Grid Dynamics

--
Sincerely yours
Mikhail Khludnev
Principal Engineer,
Grid Dynamics

<http://www.griddynamics.com>
<[email protected]>
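
P.S. To make the failure mode in this thread easier to see, here is a toy model in plain Java. It is not Lucene code and it is not the test from the repository linked above: the class DocNumDemo, its methods and the doc ids are all made up for illustration. It assumes that a merge keeps document order and drops deleted docs, so every docnum after a deleted doc shifts down. The flag reopenAfterMerge loosely plays the role of hardReopenBeforeDVUpdate.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

/** Toy model only; none of this is Lucene API. */
final class DocNumDemo {

    /** A document with an id, a deleted flag, and one numeric doc-value slot. */
    static final class Doc {
        final String id;
        final boolean deleted;
        long recordedDocNum = -1; // stands in for the updated doc value

        Doc(String id, boolean deleted) {
            this.id = id;
            this.deleted = deleted;
        }
    }

    /** Global docnum of the live doc with this id (segment base + local position), or -1. */
    static int docNumOf(List<List<Doc>> segments, String id) {
        int base = 0;
        for (List<Doc> segment : segments) {
            for (int i = 0; i < segment.size(); i++) {
                Doc d = segment.get(i);
                if (!d.deleted && d.id.equals(id)) {
                    return base + i;
                }
            }
            base += segment.size(); // deleted docs keep their slot until merged away
        }
        return -1;
    }

    /** Merges all segments into one and drops deleted docs (assumed merge behaviour). */
    static List<List<Doc>> merge(List<List<Doc>> segments) {
        List<Doc> merged = new ArrayList<>();
        for (List<Doc> segment : segments) {
            for (Doc d : segment) {
                if (!d.deleted) {
                    merged.add(d);
                }
            }
        }
        return Collections.singletonList(merged);
    }

    /** Returns true if the docnum written as a doc value matches the docnum after the merge. */
    static boolean scenario(boolean reopenAfterMerge) {
        // Last commit: one segment with a deleted doc (like C2/1) and one clean segment.
        List<List<Doc>> committed = Arrays.asList(
                Arrays.asList(new Doc("a", true), new Doc("b", false)),
                Arrays.asList(new Doc("c", false), new Doc("d", false)));
        // The merged segment exists inside the writer but is not published yet.
        List<List<Doc>> merged = merge(committed);

        // The reader either sees the merged segment or only the last commit.
        List<List<Doc>> readerView = reopenAfterMerge ? merged : committed;
        int seenDocNum = docNumOf(readerView, "c");

        // The update lands on the merged segment, whatever the reader saw.
        for (Doc d : merged.get(0)) {
            if (d.id.equals("c")) {
                d.recordedDocNum = seenDocNum;
            }
        }

        int actualDocNum = docNumOf(merged, "c");
        return merged.get(0).get(actualDocNum).recordedDocNum == actualDocNum;
    }

    public static void main(String[] args) {
        System.out.println("reader reopened on merged segment: " + scenario(true));
        System.out.println("reader on last commit only: " + scenario(false));
    }
}
```

In this model scenario(true) returns true and scenario(false) returns false: doc "c" has docnum 2 in the committed view and docnum 1 in the merged view, so a docnum taken from the older view is stale by the time it is written.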
