[
https://issues.apache.org/jira/browse/LUCENE-1842?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12746452#action_12746452
]
Tim Smith commented on LUCENE-1842:
-----------------------------------
Yes, I know that fully creating the Tokenizer/TokenStream each time will do the
trick as well, but I was hoping for some way to take advantage of the
"reusableTokenStream" concept (especially for Tokenizers that take a
long time to construct, e.g. because they load resources).
I guess what I really want is this method added to Analyzer:
{code}
public TokenStream tokenStream(AttributeSource attrs, Reader reader);
{code}
But I assume this would either have to reconstruct the full TokenStream chain
every time (which could be costly), or it would require an
AttributeSource.reset(AttributeSource) method in order to reuse saved streams.
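
To make the trade-off concrete, here is a minimal, self-contained sketch of how a reset(AttributeSource)-style method could let several saved streams share one attribute map. It does not use Lucene: SketchAttributeSource, SketchTokenizer, and ResetSketchDemo are hypothetical stand-ins, and attributes are keyed by a plain String name rather than by attribute class, purely to keep the example short.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical stand-in for AttributeSource; NOT the real Lucene class.
class SketchAttributeSource {
    // Non-final so that reset(...) can repoint it, as the proposal requires.
    private Map<String, StringBuilder> attributes = new LinkedHashMap<>();

    // Returns the existing attribute for this name, creating it only on first use.
    StringBuilder addAttribute(String name) {
        return attributes.computeIfAbsent(name, k -> new StringBuilder());
    }

    // Mirrors the proposed reset(AttributeSource): adopt the other source's state.
    void reset(SketchAttributeSource input) {
        if (input == null) {
            throw new IllegalArgumentException("input AttributeSource must not be null");
        }
        this.attributes = input.attributes;
    }

    int attributeCount() {
        return attributes.size();
    }
}

// Hypothetical reusable stream: imagine it is expensive to construct.
class SketchTokenizer extends SketchAttributeSource {
    private StringBuilder term;

    // Attributes are looked up here, not in the constructor, so the stream
    // can be rebound to a different shared source on every reuse.
    @Override
    void reset(SketchAttributeSource shared) {
        super.reset(shared);
        term = addAttribute("term");
    }

    // Writes the current token into the shared "term" attribute.
    void emit(String token) {
        term.setLength(0);
        term.append(token);
    }
}

class ResetSketchDemo {
    public static void main(String[] args) {
        SketchAttributeSource shared = new SketchAttributeSource();
        SketchTokenizer a = new SketchTokenizer();
        SketchTokenizer b = new SketchTokenizer();
        a.reset(shared);
        b.reset(shared);
        a.emit("lucene");
        b.emit("index");
        // The consumer reads from the single shared source, whichever stream wrote last.
        System.out.println(shared.addAttribute("term"));
    }
}
```

In this sketch both tokenizers end up writing to the same "term" instance, so the shared source holds one attribute no matter how many saved streams are rebound to it. The cost is the one listed under "Impacts" in the issue: every stream has to fetch its attributes in reset() instead of in its constructor.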
> Add reset(AttributeSource) method to AttributeSource
> ----------------------------------------------------
>
> Key: LUCENE-1842
> URL: https://issues.apache.org/jira/browse/LUCENE-1842
> Project: Lucene - Java
> Issue Type: Wish
> Components: Analysis
> Reporter: Tim Smith
> Priority: Minor
>
> Originally proposed in LUCENE-1826
> Proposing the addition of the following method to AttributeSource
> {code}
> public void reset(AttributeSource input) {
>   if (input == null) {
>     throw new IllegalArgumentException("input AttributeSource must not be null");
>   }
>   this.attributes = input.attributes;
>   this.attributeImpls = input.attributeImpls;
>   this.factory = input.factory;
> }
> {code}
> Impacts:
> * requires all TokenStreams/TokenFilters/etc. to call addAttribute() in their
> reset() method, not in their constructor
> * requires making AttributeSource.attributes and
> AttributeSource.attributeImpls non-final
> Advantages:
> Allows creating only a single actual AttributeSource per thread, which can then
> be used for indexing with a multitude of TokenStream/Tokenizer combinations
> (allowing utmost reuse of TokenStream/Tokenizer instances).
> This results in only a single "attributes"/"attributeImpls" map being
> required per thread.
> addAttribute() calls will almost always return right away (attributes will
> only be "initialized" once per thread).