[ https://issues.apache.org/jira/browse/LUCENE-4050?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13277041#comment-13277041 ]

Marvin Humphrey commented on LUCENE-4050:
-----------------------------------------

Ever considered using hard links instead of renaming?
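
To make the suggestion concrete, here is a minimal sketch of publishing a file through a hard link rather than a rename. It is not Lucene code: the class name, method name, and the "pending_" prefix are made up for illustration, and it skips fsync. The one property it relies on is that java.nio.file.Files.createLink refuses to replace an existing file, whereas a rename may silently overwrite the target on some platforms.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;

public class HardLinkPublish {
    /**
     * Writes data to a temporary file in dir, then makes it visible under
     * finalName by creating a hard link. Hypothetical helper, not a Lucene API.
     */
    static Path publish(Path dir, String finalName, byte[] data) throws IOException {
        // Write the full contents under a temporary name first.
        Path pending = Files.createTempFile(dir, "pending_", ".tmp");
        Files.write(pending, data);
        Path target = dir.resolve(finalName);
        try {
            // Throws FileAlreadyExistsException if target already exists,
            // and UnsupportedOperationException if the file system has no hard links.
            Files.createLink(target, pending);
        } finally {
            // Drop the temporary name; the data stays reachable through target.
            Files.deleteIfExists(pending);
        }
        return target;
    }
}
```

Calling publish(dir, "segments_1", bytes) twice with the same name fails the second time instead of replacing the first file, which is the behavior difference from renaming that the question is getting at.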
                
> Make segments_NN file codec-independent
> ---------------------------------------
>
>                 Key: LUCENE-4050
>                 URL: https://issues.apache.org/jira/browse/LUCENE-4050
>             Project: Lucene - Java
>          Issue Type: Bug
>          Components: core/codecs
>            Reporter: Andrzej Bialecki 
>            Assignee: Robert Muir
>             Fix For: 4.0
>
>
> I propose to change the format of SegmentInfos file (segments_NN) to use 
> plain text instead of the current binary format.
> The SegmentInfos file represents a commit point, and it also declares which 
> codecs were used to write each of the segments that make up the commit point. 
> However, this is a chicken-and-egg situation: in theory the format of this 
> file is customizable via Codec.getSegmentInfosFormat, but in practice we 
> first have to discover which codec implementation wrote the file. So 
> SegmentCoreReaders assumes a fixed binary layout for a preamble of this file 
> that contains the codec name, and then the file is read again, this time 
> using the right Codec.
> This is ugly. Instead I propose to use a simple plain text format, either 
> line oriented properties or JSON, in such a way that newer versions could 
> easily extend it, and which wouldn't require any special Codec to read and 
> parse. Consequently we could remove SegmentInfosFormat altogether, and 
> instead add SegmentInfoFormat (note the singular) to Codec to read individual 
> per-segment SegmentInfo-s in a codec-specific way. E.g. for the Lucene40 codec 
> we could either add another file or extend the .fnm file (FieldInfos) 
> to also contain this information. 
> Then the plain text SegmentInfos would contain just the following information:
> * list of global files for this commit point (if any)
> * list of segments for this commit point, and their corresponding codec class 
> names
> * user data map
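
For illustration only, the three items listed in the description could be carried by a line-oriented properties file roughly as sketched below. Nothing here comes from the issue or from Lucene: the class name and the key names ("files", "segment.", "user.") are assumptions, and the sketch assumes file names contain no commas. Note that java.util.Properties does not preserve insertion order, so the sketch sorts entries by key on read.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.StringWriter;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.Properties;
import java.util.TreeMap;

/** Hypothetical plain-text commit point; not a Lucene class. */
public class PlainTextCommit {
    static final String SEGMENT_PREFIX = "segment."; // assumed key prefix
    static final String USER_PREFIX = "user.";       // assumed key prefix

    final List<String> globalFiles = new ArrayList<>();
    final Map<String, String> segmentCodecs = new TreeMap<>(); // segment name -> codec name
    final Map<String, String> userData = new TreeMap<>();

    /** Serializes the commit point as line-oriented properties text. */
    String write() throws IOException {
        Properties p = new Properties();
        p.setProperty("files", String.join(",", globalFiles));
        for (Map.Entry<String, String> e : segmentCodecs.entrySet()) {
            p.setProperty(SEGMENT_PREFIX + e.getKey(), e.getValue());
        }
        for (Map.Entry<String, String> e : userData.entrySet()) {
            p.setProperty(USER_PREFIX + e.getKey(), e.getValue());
        }
        StringWriter out = new StringWriter();
        p.store(out, "hypothetical plain-text commit point");
        return out.toString();
    }

    /** Parses text produced by write(); unknown keys are ignored. */
    static PlainTextCommit read(String text) throws IOException {
        Properties p = new Properties();
        p.load(new StringReader(text));
        PlainTextCommit c = new PlainTextCommit();
        String files = p.getProperty("files", "");
        if (!files.isEmpty()) {
            c.globalFiles.addAll(Arrays.asList(files.split(",")));
        }
        for (String key : p.stringPropertyNames()) {
            if (key.startsWith(SEGMENT_PREFIX)) {
                c.segmentCodecs.put(key.substring(SEGMENT_PREFIX.length()), p.getProperty(key));
            } else if (key.startsWith(USER_PREFIX)) {
                c.userData.put(key.substring(USER_PREFIX.length()), p.getProperty(key));
            }
        }
        return c;
    }
}
```

Because read() skips keys it does not recognize, a newer writer could add entries without breaking an older reader, which is the kind of extensibility the description asks for.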
