[ https://issues.apache.org/jira/browse/LUCENE-3126?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13037983#comment-13037983 ]
Shai Erera commented on LUCENE-3126:
------------------------------------

The patch does not handle all files well (a few tests fail). Apparently, the .del file should not be rolled into the .cfs. SegmentMerger.createCompoundFile does this by default; however, it's only called from code that ensures no deletions exist. It would have been nice if this method documented that :).

Also, I think *.s<num> should not be rolled into the .cfs (those are the separate norms files). I don't know how to create such files in the first place (I thought they were an old format, but 3.1 indexes have them too), and TestBackCompat fails. Is there a way to identify those files? Is it safe to check whether the file extension starts with IndexFileNames.SEPARATE_NORMS_EXTENSION? Feels hacky to me.

Another thing: to avoid shared doc stores (and any other old-format stuff), and since this is only an optimization, I think the code should copy into CFS only if the segment version is 3.1 or later (that is, StringHelper.getVersionComparator().compare(info.getVersion(), "3.1") >= 0).

I think I'm close to finishing it; I just need to figure out the separate norms thing.

> IndexWriter.addIndexes can make any incoming segment into CFS if it isn't
> already
> ---------------------------------------------------------------------------------
>
>                 Key: LUCENE-3126
>                 URL: https://issues.apache.org/jira/browse/LUCENE-3126
>             Project: Lucene - Java
>          Issue Type: Improvement
>          Components: core/index
>            Reporter: Shai Erera
>            Assignee: Shai Erera
>            Priority: Minor
>             Fix For: 3.2, 4.0
>
>         Attachments: LUCENE-3126.patch
>
>
> Today, IW.addIndexes(Directory) does not modify the CFS-mode of the incoming
> segments. However, if IndexWriter's MP wants to create CFS (in general),
> there's no reason why not turn the incoming non-CFS segments into CFS. We
> anyway copy them, and if MP is not against CFS, we should create a CFS out of
> them.
> Will need to use CFW, not sure it's ready for that w/ current API (I'll need
> to check), but luckily we're allowed to change it (@lucene.internal).
> This should be done, IMO, even if the incoming segment is large (i.e., passes
> MP.noCFSRatio) b/c like I wrote above, we anyway copy it. However, if you
> think otherwise, speak up :).
> I'll take a look at this in the next few days.
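
For illustration only, the file-selection rules discussed in the comment (skip the .del file, skip separate norms files named *.s<num>, and only build a CFS for segments at version 3.1 or later) could be sketched roughly as below. This is not the patch and does not use Lucene's classes: CfsCopyFilter, its constants, and the simple dotted-version comparison are all hypothetical stand-ins. In particular, the stricter "s followed by digits" test is an assumption about how separate norms files are named, offered as one alternative to a plain starts-with check.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper, not part of Lucene: decides which segment files
// would be rolled into a compound file (.cfs) when copying a segment.
final class CfsCopyFilter {
    // Assumed extension of the deletions file mentioned in the comment.
    static final String DELETES_EXTENSION = "del";
    // Assumed minimum segment version for which the optimization applies.
    static final String MIN_VERSION = "3.1";

    // Returns the text after the last '.', or "" if there is none.
    static String extensionOf(String fileName) {
        int dot = fileName.lastIndexOf('.');
        return dot < 0 ? "" : fileName.substring(dot + 1);
    }

    // Assumption: a separate norms extension is 's' followed by one or more digits.
    static boolean isSeparateNorms(String ext) {
        if (ext.length() < 2 || ext.charAt(0) != 's') {
            return false;
        }
        for (int i = 1; i < ext.length(); i++) {
            if (!Character.isDigit(ext.charAt(i))) {
                return false;
            }
        }
        return true;
    }

    // Non-numeric parts (such as the "x" in "2.x") are treated as 0.
    static int parsePart(String part) {
        try {
            return Integer.parseInt(part);
        } catch (NumberFormatException e) {
            return 0;
        }
    }

    // Simplified dotted-version comparison; missing parts count as 0.
    static int compareVersions(String a, String b) {
        String[] as = a.split("\\.");
        String[] bs = b.split("\\.");
        int n = Math.max(as.length, bs.length);
        for (int i = 0; i < n; i++) {
            int x = i < as.length ? parsePart(as[i]) : 0;
            int y = i < bs.length ? parsePart(bs[i]) : 0;
            if (x != y) {
                return Integer.compare(x, y);
            }
        }
        return 0;
    }

    static boolean shouldGoIntoCfs(String fileName) {
        String ext = extensionOf(fileName);
        return !ext.equals(DELETES_EXTENSION) && !isSeparateNorms(ext);
    }

    // Returns the files to roll into the CFS; empty if the segment is too old
    // (or has no recorded version), in which case it would be copied as-is.
    static List<String> filesForCfs(List<String> segmentFiles, String segmentVersion) {
        List<String> result = new ArrayList<>();
        if (segmentVersion == null || compareVersions(segmentVersion, MIN_VERSION) < 0) {
            return result;
        }
        for (String file : segmentFiles) {
            if (shouldGoIntoCfs(file)) {
                result.add(file);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        List<String> files = Arrays.asList("_0.fdt", "_0.fdx", "_0_1.del", "_0_1.s0");
        System.out.println(filesForCfs(files, "3.1"));
    }
}
```

Whether a digits-only suffix check is actually safe for every index format is exactly the open question raised in the comment, so the sketch should be read as one possible shape of the check rather than an answer.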