[ 
https://issues.apache.org/jira/browse/LUCENE-3596?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13157717#comment-13157717
 ] 

Doron Cohen commented on LUCENE-3596:
-------------------------------------

Also, there seems to be a bug in current taxonomy writer test - TestIndexClose 
- where the IndexWriterConfig's merge policy might allow to merge segments 
out-of-order. That test calls LTC.newIndexWriterConfig() and it is just by luck 
that this test have not failed so far.

This is a bad type of failure for an application (is there ever a good 
type?;)), because by the time the bug is exposed it would show as a wrong facet 
returned in faceted search, and go figure that late that this is because at an 
earlier time an index writer was created which allowed out-of-order merging...

Therefore, it would have been useful if, in addition to the javadocs about 
requiring type of merge policy, we would also throw an exception 
(IllegalArgument or IO) if the IWC's merge policy allows merging out-of-order. 
This should be checked in two locations: 
- createIWC() returns
- openIndex() returns, by examining the IWC of the index

The second check is more involved as it is done after the index was already 
opened, so it must be closed prior to throwing that exception.

However, merge-policy does not have in its "contract" anything like 
Collector.acceptsDocsOutOfOrder(), so it is not possible to verify this at all.

Adding such a method to MergePolicy seems to me an over-kill, for this 
particular case, unless there is additional interest in such a declaration?

Otherwise, it is possible to require that the merge policy must be a descendant 
of LogMergePolicy. This on the other hand would not allow to test this class 
with other order-preserving policies, such as NoMerge.

So I am not sure what is the best  way to proceed in this regard.

I think there are two options actually:
# just javadoc that fact, and fix the test to always create an order preserving 
MP.
# add that declaration to MP.

Unless there are opinions favoring the second option I'll go with the first one.

In addition, (this is true for both options) I will move the call to createIWC 
into the constructor and modify openIndex signature to accept an IWC instead of 
the open mode, as it seems wrong - API wise - that one extension point 
(createIWC) is invoked by another extension point (openIndex) - better have 
them both be invoked from the constructor, making it harder for someone to, by 
mistake, totally ignore in createIndex() the value returned by createIWC().
                
> DirectoryTaxonomyWriter extensions should be able to set internal index 
> writer config attributes such as info stream
> --------------------------------------------------------------------------------------------------------------------
>
>                 Key: LUCENE-3596
>                 URL: https://issues.apache.org/jira/browse/LUCENE-3596
>             Project: Lucene - Java
>          Issue Type: Improvement
>          Components: modules/facet
>            Reporter: Doron Cohen
>            Assignee: Doron Cohen
>            Priority: Minor
>         Attachments: LUCENE-3596.patch
>
>
> Current protected openIndexWriter(Directory directory, OpenMode openMode) 
> does not provide access to the IWC it creates.
> So extensions must reimplement this method completely in order to set e.f. 
> info stream for the internal index writer.
> This came up in [user question: Taxonomy indexer debug 
> |http://lucene.472066.n3.nabble.com/Taxonomy-indexer-debug-td3533341.html]

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Reply via email to