[ https://issues.apache.org/jira/browse/LUCENE-4591?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13527261#comment-13527261 ]
Renaud Delbru commented on LUCENE-4591:
---------------------------------------

We followed a similar approach (see the attached files: PerFieldStoredFieldsFormat, PerFieldStoredFieldsWriter and PerFieldStoredFieldsReader). The issue is that our secondary StoredFieldsReader/Writer is, for the moment, a wrapper around an instance of CompressingStoredFieldsReader/Writer (wrapping was another way to extend CompressingStoredFieldsReader/Writer). The wrapper implements our encoding logic and uses the underlying CompressingStoredFieldsWriter to write our data as a binary block. The problem with this approach is that, since we cannot configure the segment suffix of the CompressingStoredFieldsWriter, the two StoredFieldsFormat instances try to write to files with identical names.

Since we are using a CompressingStoredFieldsReader/Writer as the underlying mechanism to write the stored fields, why not use a single instance to store both the default Lucene fields and our specific fields? There are two reasons: it was simpler for our first implementation to leverage CompressingStoredFieldsReader/Writer (as a temporary solution), and we would like to keep things (code and segment files) more isolated from each other.

As said previously, we could simply copy-paste the compressing codec on our side to solve the problem, but I thought that by raising the issue we might find a more appropriate solution.

> Make StoredFieldsFormat more configurable
> -----------------------------------------
>
>                 Key: LUCENE-4591
>                 URL: https://issues.apache.org/jira/browse/LUCENE-4591
>             Project: Lucene - Core
>          Issue Type: Improvement
>          Components: core/codecs
>    Affects Versions: 4.1
>            Reporter: Renaud Delbru
>             Fix For: 4.1
>
>         Attachments: LUCENE-4591.patch
>
>
> The current StoredFieldsFormat implementations assume that only
> one type of StoredFieldsFormat is used by the index.
> We would like to be able to configure a StoredFieldsFormat per field,
> similarly to the PostingsFormat.
> There are a few issues that need to be solved to allow that:
> 1) allow configuring a segment suffix for the StoredFieldsFormat
> 2) implement the SPI interface in StoredFieldsFormat
> 3) create a PerFieldStoredFieldsFormat
> We propose to start with 1) by modifying the signature of
> StoredFieldsFormat#fieldsReader and StoredFieldsFormat#fieldsWriter so that
> they use SegmentReadState and SegmentWriteState instead of the current set of
> parameters.
> Let us know what you think about this idea. If this is of interest, we can
> contribute a first patch for 1).
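
To make the file name clash concrete, here is a minimal, self-contained sketch. It does not use the Lucene API: the class SegmentSuffixSketch and its methods are hypothetical, the naming rule is a simplified stand-in for how Lucene derives per-segment file names, and the "fdt"/"fdx" extensions and the "custom" suffix are assumptions chosen for illustration.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical, simplified stand-in for segment file naming; not the real Lucene API.
final class SegmentSuffixSketch {

    // Assumed stored fields extensions (data and index files).
    private static final String[] EXTENSIONS = {"fdt", "fdx"};

    // Builds "<segment>.<ext>" when the suffix is empty, else "<segment>_<suffix>.<ext>".
    static String fileName(String segment, String suffix, String ext) {
        StringBuilder sb = new StringBuilder(segment);
        if (!suffix.isEmpty()) {
            sb.append('_').append(suffix);
        }
        return sb.append('.').append(ext).toString();
    }

    // Returns the file names that more than one format would try to create,
    // given one segment suffix per stored fields format.
    static Set<String> collisions(String segment, List<String> suffixes) {
        Set<String> seen = new HashSet<>();
        Set<String> clashes = new TreeSet<>();
        for (String suffix : suffixes) {
            for (String ext : EXTENSIONS) {
                String name = fileName(segment, suffix, ext);
                if (!seen.add(name)) {
                    clashes.add(name);
                }
            }
        }
        return clashes;
    }

    public static void main(String[] args) {
        // Two formats with no configurable suffix: both target the same files.
        System.out.println(collisions("_0", Arrays.asList("", "")));
        // Giving the second format its own suffix removes the clash.
        System.out.println(collisions("_0", Arrays.asList("", "custom")));
    }
}
```

With an empty suffix for both formats, every file name is produced twice; once the second format gets a distinct suffix, the two sets of files no longer overlap, which is what item 1) of the proposal is meant to allow.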