[ 
https://issues.apache.org/jira/browse/LUCENE-5189?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shai Erera updated LUCENE-5189:
-------------------------------

    Attachment: LUCENE-5189.patch

Patch addresses Rob's idea:

* ReaderAndLiveDocs and SegCoreReaders set segmentSuffix to docValuesGen and 
also set SegReadState.directory accordingly (CFS or si.info.dir if dvGen != -1).

* All the changes to DVFormat were removed (including the 45Producer/Consumer). 
I had to fix a bug in PerFieldDVF, which ignored state.segmentSuffix (and also 
resolved a TODO on the way, since it now respects it).

* Removed the nocommit in ReaderAndLiveDocs regarding letting 
TrackingDirWrapper forbid createOutput on a file which is referenced by a 
commit, since Codecs are now not aware of dvGen at all. As long as they don't 
ignore segmentSuffix (which they'd better not, otherwise they're broken), they 
can be upgraded safely to support DVUpdate. We can still add that check under a 
separate issue, as another safety mechanism.
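
To illustrate why respecting segmentSuffix matters here, below is a minimal, 
self-contained sketch. It is not Lucene code: the class, the method names, the 
"<segment>[_<suffix>].<ext>" naming scheme and the "dat" extension are all 
assumptions made up for the example. It only shows that a writer which 
hard-codes its suffix produces the same file name for every generation, while 
one that honors the suffix it is given produces a distinct name per generation.

```java
import java.util.HashSet;
import java.util.Set;

// Hypothetical stand-in for codec file naming; NOT Lucene's actual API.
final class DvFileNameSketch {

    // Builds "<segment>[_<suffix>].<ext>"; an empty suffix is omitted.
    static String fileName(String segment, String suffix, String ext) {
        StringBuilder sb = new StringBuilder(segment);
        if (!suffix.isEmpty()) {
            sb.append('_').append(suffix);
        }
        return sb.append('.').append(ext).toString();
    }

    // A writer that honors the suffix handed to it (here: the dv generation).
    static String respectingWriter(String segment, String suffix) {
        return fileName(segment, suffix, "dat");
    }

    // A writer that ignores the given suffix and hard-codes "dv" instead.
    static String ignoringWriter(String segment, String suffix) {
        return fileName(segment, "dv", "dat");
    }

    public static void main(String[] args) {
        Set<String> written = new HashSet<>();
        // "" stands for the initial flush, "1" and "2" for later generations.
        for (String gen : new String[] {"", "1", "2"}) {
            String good = respectingWriter("_0", gen);
            String bad = ignoringWriter("_0", gen);
            // Set.add returns false when the name was already "written".
            boolean collision = !written.add(bad);
            System.out.println("gen='" + gen + "' respecting=" + good
                + " ignoring=" + bad + " collision=" + collision);
        }
    }
}
```

In this sketch the ignoring writer targets the same name on every generation, 
which is the kind of clash that shows up as an attempt to write an 
already-existing file.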

I wanted to get rid of the nocommit in TestNumericDocValuesUpdates which sets 
the default Codec to Lucene45 since now presumably all Codecs should support 
dv-update. But when the test runs with Lucene40 (I haven't tried other codecs, 
it's the first one that failed), I hit an exception as if trying to write to 
the same CFS file. Looking at Lucene40DVF.fieldsProducer, I see that it 
defaults to the CFS extension, and Lucene40DVWriter uses a hard-coded 
segmentSuffix="dv" and ignores state.segmentSuffix. I guess that the actual 
Codec that was used is Lucene40RWDocValuesFormat, otherwise fieldsProducer 
should have hit an exception. I didn't know our tests pick "old" codecs at 
random too :). How can I avoid picking the "old" Codecs (40, 42)? I still want 
to test other codecs, such as Asserting, maybe MemoryDVF (if it's chosen at 
random).
                
> Numeric DocValues Updates
> -------------------------
>
>                 Key: LUCENE-5189
>                 URL: https://issues.apache.org/jira/browse/LUCENE-5189
>             Project: Lucene - Core
>          Issue Type: New Feature
>          Components: core/index
>            Reporter: Shai Erera
>            Assignee: Shai Erera
>         Attachments: LUCENE-5189.patch, LUCENE-5189.patch, LUCENE-5189.patch
>
>
> In LUCENE-4258 we started to work on incremental field updates, however the 
> amount of changes is immense and hard to follow/consume. The reason is that 
> we targeted postings, stored fields, DV etc., all from the get-go.
> I'd like to start afresh here, with numeric-dv-field updates only. There are 
> a couple of reasons for that:
> * NumericDV fields should be easier to update, if e.g. we write all the 
> values of all the documents in a segment for the updated field (similar to 
> how livedocs work, and previously norms).
> * It's a fairly contained issue, attempting to handle just one data type to 
> update, yet requires many changes to core code which will also be useful for 
> updating other data types.
> * It has value in and of itself, and we don't need to allow updating all the 
> data types in Lucene at once ... we can do that gradually.
> I already have a working patch which I'll upload next, explaining the 
> changes.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
