[ https://issues.apache.org/jira/browse/LUCENE-4050?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Robert Muir resolved LUCENE-4050. --------------------------------- Resolution: Fixed Fix Version/s: (was: 4.0) 4.0-ALPHA This was never resolved (segments_N file is codec-independent now) > Make segments_NN file codec-independent > --------------------------------------- > > Key: LUCENE-4050 > URL: https://issues.apache.org/jira/browse/LUCENE-4050 > Project: Lucene - Core > Issue Type: Bug > Components: core/codecs > Reporter: Andrzej Bialecki > Assignee: Robert Muir > Fix For: 4.0-ALPHA > > > I propose to change the format of SegmentInfos file (segments_NN) to use > plain text instead of the current binary format. > SegmentInfos file represents a commit point, and it also declares what codecs > were used for writing each of the segments that the commit point consists of. > However, this is a chicken and egg situation - in theory the format of this > file is customizable via Codec.getSegmentInfosFormat, but in practice we have > to first discover what is the codec implementation that wrote this file - so > the SegmentCoreReaders assumes a certain fixed binary layout of a preamble of > this file that contains the codec name... and then the file is read again, > only this time using the right Codec. > This is ugly. Instead I propose to use a simple plain text format, either > line oriented properties or JSON, in such a way that newer versions could > easily extend it, and which wouldn't require any special Codec to read and > parse. Consequently we could remove SegmentInfosFormat altogether, and > instead add SegmentInfoFormat (notice the singular) to Codec to read single > per-segment SegmentInfo-s in a codec-specific way. E.g. for Lucene40 codec we > could either add another file or we could extend the .fnm file (FieldInfos) > to contain also this information. > Then the plain text SegmentInfos would contain just the following information: > * list of global files for this commit point (if any) > * list of segments for this commit point, and their corresponding codec class > names > * user data map -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira --------------------------------------------------------------------- To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org