[ https://issues.apache.org/jira/browse/LUCENE-3126?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13038136#comment-13038136 ]
Michael McCandless commented on LUCENE-3126:
--------------------------------------------

bq. Patch does not handle all files well (few tests fail). Apparently, the .del file should not be rolled into the .cfs.

Right, .del files never appear inside a CFS.

bq. SegmentMerger.createCompoundFile does this by default, however it's only called from code that ensures no deletions exist. Would have been nice if this method documented it.

Please add comments to this! It's non-obvious ;)

bq. Also, I think *.s<num> should not be rolled into .cfs (those are the separate norms files). I don't know how to create such files in the first place (thought they're of old format, but 3.1 indexes have them also), and TestBackCompat fails.

Right, these too only live outside a CFS. You create them by opening a writable IndexReader (I know: confusing!) and calling setNorm, then closing it. They are not only for old indices... 4.0 creates them too.

bq. Is there a way to identify those files? Is it safe to check if the file extension starts w/ IndexFileNames.SEPARATE_NORMS_EXTENSION? Feels hacky to me.

Hackish though it seems (I agree) I think that's the only way? SegmentInfo.hasSeparateNorms is equally hacky...

bq. Another thing, I think in order to avoid shared doc stores (and whatever other old-format) stuff, since it's only an optimization, that the code should copy into CFS only if the segment version is on or after 3.1 (that is StringHelper.getVersionComparator().compare(info.getVersion, "3.1") >= 0).

Shared doc stores, yes, but the separate del docs / norms are produced by all versions.

More generally: does addIndexes properly refuse to import a too-old index? We should throw IndexFormatTooOldExc in this case? (And, maybe also IndexFormatTooNewExc?).
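To make the extension check discussed above concrete, here is a minimal standalone sketch in plain Java. It does not use Lucene classes: the class `CfsFileFilter` and its methods are hypothetical names for illustration, and the two constants are assumed to mirror `IndexFileNames.DELETES_EXTENSION` ("del") and `IndexFileNames.SEPARATE_NORMS_EXTENSION` ("s"). Requiring digits after the "s" is an extra guard added in this sketch; the thread itself only proposes a starts-with check.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper: decides which segment files may be rolled into a .cfs. */
final class CfsFileFilter {
    // Assumed to match IndexFileNames.DELETES_EXTENSION.
    static final String DELETES_EXTENSION = "del";
    // Assumed to match IndexFileNames.SEPARATE_NORMS_EXTENSION.
    static final String SEPARATE_NORMS_EXTENSION = "s";

    private CfsFileFilter() {}

    /** Returns the text after the last '.', or "" if the name has no dot. */
    static String extension(String fileName) {
        int dot = fileName.lastIndexOf('.');
        return dot < 0 ? "" : fileName.substring(dot + 1);
    }

    /** True for extensions of the form s<num>, e.g. "s0" or "s12". */
    static boolean isSeparateNorms(String ext) {
        int prefix = SEPARATE_NORMS_EXTENSION.length();
        if (!ext.startsWith(SEPARATE_NORMS_EXTENSION) || ext.length() == prefix) {
            return false;
        }
        for (int i = prefix; i < ext.length(); i++) {
            if (!Character.isDigit(ext.charAt(i))) {
                return false;
            }
        }
        return true;
    }

    /** Deletes files and separate norms files must stay outside the CFS. */
    static boolean belongsInCompoundFile(String fileName) {
        String ext = extension(fileName);
        return !ext.equals(DELETES_EXTENSION) && !isSeparateNorms(ext);
    }

    /** Keeps only the files that may be copied into the compound file. */
    static List<String> filesForCompoundFile(List<String> segmentFiles) {
        List<String> result = new ArrayList<String>();
        for (String file : segmentFiles) {
            if (belongsInCompoundFile(file)) {
                result.add(file);
            }
        }
        return result;
    }
}
```

In a real patch the same test would presumably be applied to the segment's file list while copying in addIndexes, with the skipped .del and .s<num> files copied alongside the new .cfs unchanged.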
> IndexWriter.addIndexes can make any incoming segment into CFS if it isn't already
> ---------------------------------------------------------------------------------
>
>                 Key: LUCENE-3126
>                 URL: https://issues.apache.org/jira/browse/LUCENE-3126
>             Project: Lucene - Java
>          Issue Type: Improvement
>          Components: core/index
>            Reporter: Shai Erera
>            Assignee: Shai Erera
>            Priority: Minor
>             Fix For: 3.2, 4.0
>
>         Attachments: LUCENE-3126.patch
>
>
> Today, IW.addIndexes(Directory) does not modify the CFS-mode of the incoming
> segments. However, if IndexWriter's MP wants to create CFS (in general),
> there's no reason why not turn the incoming non-CFS segments into CFS. We
> anyway copy them, and if MP is not against CFS, we should create a CFS out of
> them.
> Will need to use CFW, not sure it's ready for that w/ current API (I'll need
> to check), but luckily we're allowed to change it (@lucene.internal).
> This should be done, IMO, even if the incoming segment is large (i.e., passes
> MP.noCFSRatio) b/c like I wrote above, we anyway copy it. However, if you
> think otherwise, speak up :).
> I'll take a look at this in the next few days.