[ 
https://issues.apache.org/jira/browse/LUCENE-3126?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13037983#comment-13037983
 ] 

Shai Erera commented on LUCENE-3126:
------------------------------------

The patch does not handle all files well (a few tests fail). Apparently, the
.del file should not be rolled into the .cfs. SegmentMerger.createCompoundFile
does this by default, but it is only called from code that ensures no deletions
exist. It would have been nice if the method documented that :).

Also, I think *.s<num> should not be rolled into the .cfs (those are the
separate norms files). I don't know how to create such files in the first place
(I thought they were an old format, but 3.1 indexes have them too), and
TestBackCompat fails. Is there a way to identify those files? Is it safe to
check whether the file extension starts with
IndexFileNames.SEPARATE_NORMS_EXTENSION? That feels hacky to me.
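
A stricter alternative to the starts-with check could be sketched as below. This
is only an illustration, not the patch: the class and method names are
hypothetical, and the two extension constants are hard-coded on the assumption
that they match IndexFileNames ("del" for deletions, "s" for separate norms). It
requires the extension to be the norms prefix followed only by digits, so an
unrelated extension that merely starts with "s" is not misclassified.

```java
import java.util.regex.Pattern;

// Hypothetical helper, not part of Lucene: decides which segment files
// may be rolled into a .cfs.
public class CfsFileFilter {
    // Assumed to mirror the values defined in IndexFileNames.
    static final String DELETES_EXTENSION = "del";
    static final String SEPARATE_NORMS_EXTENSION = "s";

    // Matches the whole extension: the norms prefix followed by digits only.
    private static final Pattern SEPARATE_NORMS =
        Pattern.compile(Pattern.quote(SEPARATE_NORMS_EXTENSION) + "\\d+");

    // Returns the text after the last '.', or "" if there is no dot.
    public static String extension(String fileName) {
        int dot = fileName.lastIndexOf('.');
        return dot < 0 ? "" : fileName.substring(dot + 1);
    }

    // True for names such as "_0_1.s0", false for "_0.si" or "_0.fdt".
    public static boolean isSeparateNorms(String fileName) {
        return SEPARATE_NORMS.matcher(extension(fileName)).matches();
    }

    // Deletions and separate norms stay outside the compound file.
    public static boolean belongsInCfs(String fileName) {
        return !extension(fileName).equals(DELETES_EXTENSION)
            && !isSeparateNorms(fileName);
    }

    public static void main(String[] args) {
        // "_0.si" is a made-up name that a plain starts-with check would reject.
        for (String f : new String[] {"_0.fdt", "_0_1.del", "_0_1.s0", "_0.si"}) {
            System.out.println(f + " -> " + belongsInCfs(f));
        }
    }
}
```

Whether digits-only is the right rule depends on how the norms file names are
actually generated, which is the open question above.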

Another thing: to avoid shared doc stores (and whatever other old-format
stuff), and since this is only an optimization, I think the code should copy
into CFS only if the segment version is 3.1 or later (that is,
StringHelper.getVersionComparator().compare(info.getVersion(), "3.1") >= 0).
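
To make the version gate concrete, here is a minimal standalone sketch. It does
not use Lucene's comparator; VersionGate and eligibleForCfs are hypothetical
names, and the comparison rules (numeric per component, missing or non-numeric
components counted as 0, null treated as an old segment) are assumptions made
for illustration.

```java
import java.util.Comparator;

// Hypothetical stand-in for the version check; not Lucene code.
public class VersionGate {
    // Reads component i as an int; missing or non-numeric parts count as 0.
    private static int part(String[] parts, int i) {
        if (i >= parts.length) {
            return 0;
        }
        try {
            return Integer.parseInt(parts[i]);
        } catch (NumberFormatException e) {
            return 0;
        }
    }

    // Compares dotted versions numerically, so "3.10" sorts after "3.2".
    public static final Comparator<String> VERSION_COMPARATOR = (a, b) -> {
        String[] x = a.split("\\.");
        String[] y = b.split("\\.");
        int n = Math.max(x.length, y.length);
        for (int i = 0; i < n; i++) {
            int cmp = Integer.compare(part(x, i), part(y, i));
            if (cmp != 0) {
                return cmp;
            }
        }
        return 0;
    };

    // Assumption: a null version marks a segment written before 3.1.
    public static boolean eligibleForCfs(String segmentVersion) {
        return segmentVersion != null
            && VERSION_COMPARATOR.compare(segmentVersion, "3.1") >= 0;
    }

    public static void main(String[] args) {
        for (String v : new String[] {"3.0", "3.1", "3.10", "4.0"}) {
            System.out.println(v + " -> " + eligibleForCfs(v));
        }
    }
}
```

Segments that fail the gate would simply be copied as they are, which matches
the point that converting to CFS is only an optimization.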

I think I'm close to finishing it; I just need to figure out the separate norms
thing.

> IndexWriter.addIndexes can make any incoming segment into CFS if it isn't 
> already
> ---------------------------------------------------------------------------------
>
>                 Key: LUCENE-3126
>                 URL: https://issues.apache.org/jira/browse/LUCENE-3126
>             Project: Lucene - Java
>          Issue Type: Improvement
>          Components: core/index
>            Reporter: Shai Erera
>            Assignee: Shai Erera
>            Priority: Minor
>             Fix For: 3.2, 4.0
>
>         Attachments: LUCENE-3126.patch
>
>
> Today, IW.addIndexes(Directory) does not modify the CFS-mode of the incoming
> segments. However, if IndexWriter's MP wants to create CFS (in general),
> there's no reason not to turn the incoming non-CFS segments into CFS. We
> copy them anyway, and if MP is not against CFS, we should create a CFS out of
> them.
> Will need to use CFW, not sure it's ready for that w/ current API (I'll need 
> to check), but luckily we're allowed to change it (@lucene.internal).
> This should be done, IMO, even if the incoming segment is large (i.e., passes 
> MP.noCFSRatio) b/c like I wrote above, we anyway copy it. However, if you 
> think otherwise, speak up :).
> I'll take a look at this in the next few days.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org
