[ https://issues.apache.org/jira/browse/LUCENE-5969?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Robert Muir updated LUCENE-5969:
--------------------------------
    Attachment: LUCENE-5969_part3.patch

Here is a patch for part 3. I think it's ready; we should close the issue after this. Other improvements can be separate issues from here. Also, after resolving this issue and backporting, we can do further cleanups in trunk: remove all the 4.x support in backwards-codecs and clean up SegmentInfos further.

The patch finishes adding all safety (docvalues, terms, postings, commit points). CodecUtil "segmentHeader" is renamed to "indexHeader", as it's used for all index files (including commit points). BlockTree no longer "backdoors" via CheckIndex to return stats; there is a dead simple API for this. Norms sparse encoding is further improved with a PATCHED strategy.

There is an API change for SegmentInfos for safety. Instead of instance methods that read into a "mutable" SIS:

{code}
SegmentInfos.read(Dir);
SegmentInfos.read(Dir, file);
{code}

these are now static methods that return a clean instance (and are named readLatestCommit and readCommit respectively, to not be fragile on upgrade).

There is more to fix here. IMO SIS "tries to take on too much": mutable state by IndexWriter, tracking of counters etc. by IndexWriter, reading/writing commits. It tries to be "low level user-friendly" and exposes too many dangers publicly. This is all for a heavily versioned, important file with conditional logic. But that's a bigger problem.

> Add Lucene50Codec
> -----------------
>
>                 Key: LUCENE-5969
>                 URL: https://issues.apache.org/jira/browse/LUCENE-5969
>             Project: Lucene - Core
>          Issue Type: Improvement
>            Reporter: Michael McCandless
>             Fix For: 5.0, Trunk
>
>         Attachments: LUCENE-5969.patch, LUCENE-5969.patch,
> LUCENE-5969_part2.patch, LUCENE-5969_part3.patch
>
>
> Spinoff from LUCENE-5952:
> * Fix .si to write Version as 3 ints, not a String that requires parsing at
> read time.
> * Lucene42TermVectorsFormat should not use the same codecName as
> Lucene41StoredFieldsFormat
>
> It would also be nice if we had a "bumpCodecVersion" script so rolling a new
> codec is not so daunting.
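
For readers following the first bullet of the issue description (writing the version in .si as 3 ints rather than a String), here is a minimal sketch of the difference. It is not code from the patch: the class VersionEncodingSketch and its methods are hypothetical, and java.nio.ByteBuffer stands in for whatever output abstraction the codec actually uses.

```java
import java.nio.ByteBuffer;
import java.util.Arrays;

// Hypothetical sketch, not Lucene code: contrasts a string-encoded version,
// which must be split and parsed on read, with three fixed-width ints.
final class VersionEncodingSketch {

  // String form: the reader has to split and parse, and can fail on malformed input.
  static int[] parseVersionString(String s) {
    String[] parts = s.split("\\.");
    if (parts.length != 3) {
      throw new IllegalArgumentException("expected major.minor.bugfix, got: " + s);
    }
    int[] v = new int[3];
    for (int i = 0; i < 3; i++) {
      v[i] = Integer.parseInt(parts[i]); // throws NumberFormatException on bad digits
    }
    return v;
  }

  // Int form: always 12 bytes, big-endian (ByteBuffer's default byte order).
  static byte[] writeVersionInts(int major, int minor, int bugfix) {
    return ByteBuffer.allocate(12).putInt(major).putInt(minor).putInt(bugfix).array();
  }

  // Reading back needs no parsing step at all, just three fixed-width reads.
  static int[] readVersionInts(byte[] bytes) {
    ByteBuffer in = ByteBuffer.wrap(bytes);
    return new int[] { in.getInt(), in.getInt(), in.getInt() };
  }

  public static void main(String[] args) {
    byte[] encoded = writeVersionInts(5, 0, 0);
    System.out.println(Arrays.toString(readVersionInts(encoded)));
    System.out.println(Arrays.toString(parseVersionString("5.0.0")));
  }
}
```

The int form has a fixed size and no read-time parsing step, which is the property the bullet asks for.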
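
The SegmentInfos API change described in the comment (static reads that return a clean instance, instead of instance methods that read into a mutable object) can be illustrated roughly as follows. This is a sketch of the pattern only, not the real SegmentInfos: the class CommitSketch, the map standing in for a Directory, and the "segments_" plus base-36 generation naming are all assumptions made for the example.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical sketch, not the real SegmentInfos: a static read returns a fresh,
// unmodifiable object, so no caller ever sees a half-populated or stale instance.
final class CommitSketch {
  private static final String PREFIX = "segments_"; // assumed naming for this sketch

  private final List<String> segments;

  private CommitSketch(List<String> segments) {
    this.segments = Collections.unmodifiableList(new ArrayList<>(segments));
  }

  List<String> segments() {
    return segments;
  }

  // Reads one named commit; the "directory" is just a map of file name to segment names.
  static CommitSketch readCommit(Map<String, List<String>> dir, String fileName) {
    List<String> contents = dir.get(fileName);
    if (contents == null) {
      throw new IllegalArgumentException("no such commit file: " + fileName);
    }
    return new CommitSketch(contents);
  }

  // Finds the commit file with the highest generation (assumed base 36) and reads it.
  static CommitSketch readLatestCommit(Map<String, List<String>> dir) {
    String latest = null;
    long latestGen = -1;
    for (String name : dir.keySet()) {
      if (name.startsWith(PREFIX)) {
        long gen = Long.parseLong(name.substring(PREFIX.length()), Character.MAX_RADIX);
        if (gen > latestGen) {
          latestGen = gen;
          latest = name;
        }
      }
    }
    if (latest == null) {
      throw new IllegalStateException("no commit found");
    }
    return readCommit(dir, latest);
  }

  public static void main(String[] args) {
    Map<String, List<String>> dir = new TreeMap<>();
    dir.put("segments_1", Arrays.asList("_0"));
    dir.put("segments_2", Arrays.asList("_0", "_1"));
    System.out.println(readLatestCommit(dir).segments());
  }
}
```

Because the constructor is private and the returned list is unmodifiable, a failed or partial read cannot leave a shared object in an inconsistent state, which is the safety argument made for the rename.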