Hello Mike!

Thanks for your attention.
I pushed a test case for this odd scenario to
https://github.com/m-khl/lucene-merge-visibility/commit/fa2d60be5b13eb57e0527c843119cf62cfa83a7d#diff-86ebfbf440fe69ee36a52705cb92b824R120

It does the following:

- writes a pair of docs
- commits
- reopens the reader and searches for one of them
- updates this doc's docvalue with its docnum (I know it's weird, but it
should work if the reopened reader sees that update)
- commits this DV update
- searches for that doc and checks the written docvalue.
It passes if hardReopenBeforeDVUpdate=true and fails otherwise.
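
To make the docnum mismatch concrete, here is a toy model in plain Java. It is not Lucene code: the class and method names are made up for illustration, and segments are just lists of ids where null marks a deleted slot. It only assumes what the logs in this message suggest, namely that a merge drops deleted docs and so shifts the docnums of everything after them.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

// Toy model only (hypothetical names, no Lucene API involved).
final class DocnumShiftSketch {

    /** Merges segments into one, dropping deleted slots (null entries). */
    static List<String> merge(List<List<String>> segments) {
        List<String> merged = new ArrayList<>();
        for (List<String> segment : segments) {
            for (String id : segment) {
                if (id != null) {
                    merged.add(id); // deleted docs vanish, later docnums shift down
                }
            }
        }
        return merged;
    }

    /** Global docnum of an id; deleted slots still occupy a docnum. Returns -1 if absent. */
    static int docnum(List<List<String>> segments, String id) {
        int base = 0;
        for (List<String> segment : segments) {
            int local = segment.indexOf(id);
            if (local >= 0) {
                return base + local;
            }
            base += segment.size();
        }
        return -1;
    }

    public static void main(String[] args) {
        // Three small segments; the first one has one deleted doc.
        List<List<String>> committed = Arrays.asList(
                Arrays.asList("a", null),
                Arrays.asList("b", "c"),
                Arrays.asList("d", "e"));
        int seenByReader = docnum(committed, "c");
        List<String> solid = merge(committed);
        int afterMerge = docnum(Collections.singletonList(solid), "c");
        System.out.println("docnum via old reader: " + seenByReader
                + ", docnum in merged segment: " + afterMerge);
    }
}
```

In this model a docnum read through the pre-merge view and written as a docvalue no longer matches the doc's position once the merged segment becomes visible, which is the kind of mismatch the test checks for.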

I know that docnums can naturally change, but I expect them not to change
while I update docvalues.
Here is how it flips:
at the commit after the doc update we have many segments:

 now checkpoint "_0(6.0.0):C2/1:delGen=1:fieldInfosGen=1:dvGen=1
_1(6.0.0):C2:fieldInfosGen=1:dvGen=1 _2(6.0.0):C2:
commit: wrote segments file "segments_j"

But there is also a single solid segment, which has been merged but has not
been committed/published yet:
after commitMerge: _a(6.0.0):c19

After the DV update commit, that solid segment becomes visible:

now checkpoint "_a(6.0.0):c19:fieldInfosGen=1:dvGen=1" [1 segments ;
isCommit = true]
IFD 0 [Thu Sep 25 23:56:22 SAST 2014;
TEST-TestNumDValUpdVsReaderVisibility.testSimple-seed#[6131CF35B3A45FC3]]:
deleteCommits: now decRef commit "segments_j"
...
wrote segments file "segments_k"

I'm using SerialMergeScheduler, and I expect to see the single solid segment
after I commit the document updates and that commit triggers the merge.
How can I reopen a reader that sees it?
Thanks


On Wed, Sep 24, 2014 at 10:07 PM, Michael McCandless <
luc...@mikemccandless.com> wrote:

> I don't understand what's actually happening / going wrong here.
>
> Maybe you can make a test case / give more details?
>
> What assertions are broken?  Why is it bad if SMS does a merge before
> you reopen?  Why are you using SMS :)
>
> Mike McCandless
>
> http://blog.mikemccandless.com
>
> On Mon, Sep 22, 2014 at 6:00 PM, Mikhail Khludnev
> <mkhlud...@griddynamics.com> wrote:
> > Hello!
> > I'm having trouble with Lucene's IndexWriter. I'm benchmarking an algorithm
> > that might look like an NRT case, but I'm not sure I need NRT in
> > particular. The overall problem is writing a "join index" (a column that
> > holds docnums) by updating binary docvalues after a commit, i.e.:
> >  - update docs
> >  - commit
> >  - read docs (openIfChanged() before )
> >  - updateDocVals
> >  - commit
> >
> > It's clunky, but it works until, guess what, a merge happens. Oh my.
> >
> > At one point I have these segments:
> > segments_ec:2090 _7c(5.0):C117/8:delGen=8:....
> > _7j(5.0):C1:fieldInfosGen=1:dvGen=1 _7k(5.0):C1)
> >
> > I apply one update and trigger a commit; as a result I have:
> > segments_ee:2102 _7c(5.0):C117/9:delGen=9:..
> > _7k(5.0):C1:fieldInfosGen=1:dvGen=1 _7l(5.0):C1)
> >
> > However, somewhere inside this commit call, the
> > SerialMergeScheduler bakes a single solid segment
> > _7m(5.0):C117
> > which hasn't been exposed via any segments file so far.
> >
> > And now I get into trouble:
> > if I call DR.openIfChanged(segments_ec) (even after IW.waitMerges()), I
> > get segments_ee, which is fairly reasonable for keeping it incremental and
> > fast. But if I use that IndexWriter, it applies new updates on top of the
> > merged segment (_7m(5.0):C117), not on segments_ee, and that breaks my
> > assertions. I'd rather open a reader on that merged _7m(5.0):C117, which IW
> > keeps somewhere internally, preferably in a fancy, incremental way. If you
> > can point me to how NRT can solve this, I'd be happy to switch to it.
> >
> > Incredibly thank you for your time!!!
> >
> > --
> > Sincerely yours
> > Mikhail Khludnev
> > Principal Engineer,
> > Grid Dynamics
> >
> >
>
>


-- 
Sincerely yours
Mikhail Khludnev
Principal Engineer,
Grid Dynamics

<http://www.griddynamics.com>
<mkhlud...@griddynamics.com>
