[ 
https://issues.apache.org/jira/browse/LUCENE-1842?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12746428#action_12746428
 ] 

Uwe Schindler commented on LUCENE-1842:
---------------------------------------

I still do not understand your proposal. You can always create all tokenizer 
chains at the beginning with exactly one tokenizer (after LUCENE-1826). You are 
then free to call incrementToken() on all sub-tokenstreams and all these calls 
will put the tokenized values in the same attributes.
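
To illustrate the idea, here is a minimal, self-contained sketch. It does not 
use the Lucene classes; TermAttr, SubStream and ConcatenatingStream are 
hypothetical stand-ins, and the whitespace splitting is only an assumption to 
keep the example short. All sub-streams are created up front and share one 
attribute instance, so the consumer reads every token from the same place.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

// Hypothetical stand-in for a term attribute; not Lucene's TermAttribute.
final class TermAttr {
    String term;
}

// Hypothetical sub-stream that writes each token into the shared attribute.
final class SubStream {
    private final TermAttr termAttr;
    private final Iterator<String> tokens;

    SubStream(TermAttr shared, String text) {
        this.termAttr = shared;
        // Assumption: non-empty, whitespace-separated input.
        this.tokens = Arrays.asList(text.split("\\s+")).iterator();
    }

    boolean incrementToken() {
        if (!tokens.hasNext()) {
            return false;
        }
        termAttr.term = tokens.next();
        return true;
    }
}

// Hypothetical concatenating stream: one attribute, many sub-streams.
final class ConcatenatingStream {
    final TermAttr termAttr = new TermAttr();
    private final List<SubStream> subs = new ArrayList<>();
    private int current = 0;

    ConcatenatingStream(String... texts) {
        // Every sub-stream is built once, against the same attribute instance.
        for (String text : texts) {
            subs.add(new SubStream(termAttr, text));
        }
    }

    boolean incrementToken() {
        // Drain the current sub-stream, then move on to the next one.
        while (current < subs.size()) {
            if (subs.get(current).incrementToken()) {
                return true;
            }
            current++;
        }
        return false;
    }

    // Collects all terms by reading the single shared attribute.
    static List<String> collect(String... texts) {
        ConcatenatingStream stream = new ConcatenatingStream(texts);
        List<String> out = new ArrayList<>();
        while (stream.incrementToken()) {
            out.add(stream.termAttr.term);
        }
        return out;
    }

    public static void main(String[] args) {
        System.out.println(collect("a b", "c")); // prints [a, b, c]
    }
}
```

Nothing is swapped or reset while the stream is consumed; the sharing is 
fixed at construction time.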

Adding a reset(AttributeSource) method would not really help, as you would have 
to do this for the whole Tokenizer chain. If you do it the wrong way, there 
may be some TokenFilters in the chain that use a different AttributeSource and 
so on. Because of all these problems and the complexity, we do not want to have 
setters for AttributeSources or changes of AttributeFactory and so on. During 
the lifetime of one TokenStream, there is in my opinion no real use case for 
changing its attribute maps that justifies the added complexity and risk of 
errors.

The cost of adding Attributes is very low if you reuse TokenStreams, which you 
could even do with your concatenating TokenStream.
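
As a rough sketch of why the cost stays low: an add-if-absent lookup only 
creates the attribute instance on the first call, and every later call on a 
reused stream is a plain map lookup. ToyAttributeSource below is a 
hypothetical simplification, not the real AttributeSource, and the Supplier 
parameter is an assumption standing in for the attribute factory.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.function.Supplier;

// Hypothetical, simplified model of an attribute source; not Lucene's class.
final class ToyAttributeSource {
    private final Map<Class<?>, Object> attributes = new LinkedHashMap<>();
    int created = 0; // counts how often an attribute instance was really built

    // Returns the existing instance for this type, creating it only once.
    <T> T addAttribute(Class<T> type, Supplier<T> factory) {
        Object existing = attributes.get(type);
        if (existing == null) {
            existing = factory.get();
            attributes.put(type, existing);
            created++;
        }
        return type.cast(existing);
    }
}
```

With a reused stream, the second and later addAttribute() calls for the same 
type return the same instance, and the created counter stays at 1.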

> Add reset(AttributeSource) method to AttributeSource
> ----------------------------------------------------
>
>                 Key: LUCENE-1842
>                 URL: https://issues.apache.org/jira/browse/LUCENE-1842
>             Project: Lucene - Java
>          Issue Type: Wish
>          Components: Analysis
>            Reporter: Tim Smith
>            Priority: Minor
>             Fix For: 2.9
>
>
> Originally proposed in LUCENE-1826
> Proposing the addition of the following method to AttributeSource
> {code}
> public void reset(AttributeSource input) {
>     if (input == null) {
>       throw new IllegalArgumentException("input AttributeSource must not be null");
>     }
>     this.attributes = input.attributes;
>     this.attributeImpls = input.attributeImpls;
>     this.factory = input.factory;
> }
> {code}
> Impacts:
> * requires all TokenStreams/TokenFilters/etc to call addAttribute() in their 
> reset() method, not in their constructor
> * requires making AttributeSource.attributes and 
> AttributeSource.attributeImpls non-final
> Advantages:
> * allows creating only a single actual AttributeSource per thread that can 
> then be used for indexing with a multitude of TokenStream/Tokenizer 
> combinations (allowing utmost reuse of TokenStream/Tokenizer instances)
> * this results in only a single pair of "attributes"/"attributeImpls" maps 
> being required per thread
> * addAttribute() calls will almost always return right away (will only be 
> "initialized" once per thread)

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

