[ 
https://issues.apache.org/jira/browse/LUCENE-5170?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13738113#comment-13738113
 ] 

Uwe Schindler edited comment on LUCENE-5170 at 8/13/13 11:58 AM:
-----------------------------------------------------------------

Robert: After reviewing the code:
The fixed, unchangeable "default" in AnalyzerWrapper is PerField, which carries 
a large overhead and should only be used in classes like 
PerFieldAnalyzerWrapper (that class should call super(PerField) in its own 
ctor). But for other use cases of AnalyzerWrapper I have to use the global 
strategy or the one of the wrapped analyzer. It looks like the current impl in 
AnalyzerWrapper somehow assumes you want to wrap per field.

I would suggest making the reuse strategy a mandatory ctor argument in Lucene 
trunk, and adding the missing ctor in Lucene 4.x, too. The default ctor should 
be deprecated with a hint that relying on this default might be a bad idea.

My use case is:
I have lots of predefined Analyzers for several languages or functionality in 
my search application. I also have some AnalyzerWrappers that simply turn any 
other analyzer into a phonetic or ASCIIFolding one (so I can use it with 
another field). So my wrapper just takes one of these per-language Analyzers 
and wraps it with an additional TokenFilter. As the underlying Analyzer uses 
global reuse, I need to make the wrapper global, too, which is currently 
impossible. Per field is a waste of resources in this case.
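
To make the resource argument concrete, here is a minimal toy model of the two 
caching policies. It is not Lucene code: Components, ReuseStrategySketch, 
GlobalReuseSketch, PerFieldReuseSketch and ReuseDemo are hypothetical names 
invented for this illustration, and the sketch ignores the per-thread storage 
a real implementation would need.

```java
import java.util.HashMap;
import java.util.Map;

// Toy stand-in for a built analysis chain (hypothetical, not a Lucene class).
final class Components {
    final String builtFor;
    Components(String builtFor) { this.builtFor = builtFor; }
}

// Hypothetical strategy interface: decides how built chains are cached.
interface ReuseStrategySketch {
    Components get(String field);
    void put(String field, Components components);
}

// Global policy: one cached chain shared by all fields.
final class GlobalReuseSketch implements ReuseStrategySketch {
    private Components cached;
    public Components get(String field) { return cached; }
    public void put(String field, Components components) { cached = components; }
}

// Per-field policy: one cached chain per field name.
final class PerFieldReuseSketch implements ReuseStrategySketch {
    private final Map<String, Components> cached = new HashMap<>();
    public Components get(String field) { return cached.get(field); }
    public void put(String field, Components components) { cached.put(field, components); }
}

final class ReuseDemo {
    // Requests a chain for each field in turn and returns how many had to be built.
    static int chainsBuilt(ReuseStrategySketch strategy, String... fields) {
        int built = 0;
        for (String field : fields) {
            if (strategy.get(field) == null) {
                strategy.put(field, new Components(field));
                built++;
            }
        }
        return built;
    }
}
```

In this toy model, requesting the fields "title", "body", "author", "title" 
builds one chain under the global policy and three under the per-field policy, 
even though the wrapped analyzer would behave identically for every field.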

Only PerFieldAnalyzerWrapper should hardcode the PerField strategy (as it is 
per field), not the base class!

So I would suggest that the base class AnalyzerWrapper copy the ctor of the 
superclass Analyzer and that the default ctor be deprecated in 4.x. For my 
above example (wrapping another analyzer), I still need the reuse strategy of 
the inner analyzer, so I need the getter on Analyzer.java, too (see current 
patch).
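
A rough sketch of the shape this proposal could take, written as a 
self-contained toy rather than against the real classes. ReusePolicy, 
BaseAnalyzerSketch, WhitespaceAnalyzerSketch and LowercasingWrapperSketch are 
hypothetical names, and lowercasing merely stands in for a phonetic or 
ASCII-folding filter; the actual signatures are whatever the attached patch 
defines.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

// Hypothetical marker for the caching policy an analyzer uses.
enum ReusePolicy { GLOBAL, PER_FIELD }

// Toy base class: the policy is a mandatory ctor argument and is readable via a getter.
abstract class BaseAnalyzerSketch {
    private final ReusePolicy reusePolicy;

    protected BaseAnalyzerSketch(ReusePolicy reusePolicy) {
        this.reusePolicy = reusePolicy;
    }

    // The small getter that lets a wrapper ask which policy the inner analyzer uses.
    public final ReusePolicy getReusePolicy() {
        return reusePolicy;
    }

    // Stand-in for analysis: returns the tokens produced for the given text.
    abstract List<String> analyze(String field, String text);
}

// A predefined analyzer that chose the global policy.
final class WhitespaceAnalyzerSketch extends BaseAnalyzerSketch {
    WhitespaceAnalyzerSketch() {
        super(ReusePolicy.GLOBAL);
    }

    @Override
    List<String> analyze(String field, String text) {
        List<String> tokens = new ArrayList<>();
        for (String token : text.trim().split("\\s+")) {
            if (!token.isEmpty()) {
                tokens.add(token);
            }
        }
        return tokens;
    }
}

// Wrapper that adds one filter step and inherits the inner analyzer's policy.
final class LowercasingWrapperSketch extends BaseAnalyzerSketch {
    private final BaseAnalyzerSketch inner;

    LowercasingWrapperSketch(BaseAnalyzerSketch inner) {
        super(inner.getReusePolicy()); // only possible because the getter exists
        this.inner = inner;
    }

    @Override
    List<String> analyze(String field, String text) {
        List<String> lowered = new ArrayList<>();
        for (String token : inner.analyze(field, text)) {
            lowered.add(token.toLowerCase(Locale.ROOT));
        }
        return lowered;
    }
}
```

With this shape, a wrapper around a globally reused analyzer is itself global, 
and a per-field wrapper would simply pass ReusePolicy.PER_FIELD to the same 
ctor instead of relying on a hardcoded default.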
                
> Add getter for reuse strategy to Analyzer
> -----------------------------------------
>
>                 Key: LUCENE-5170
>                 URL: https://issues.apache.org/jira/browse/LUCENE-5170
>             Project: Lucene - Core
>          Issue Type: Bug
>            Reporter: Uwe Schindler
>            Assignee: Uwe Schindler
>             Fix For: 5.0, 4.5
>
>         Attachments: LUCENE-5170.patch
>
>
> If you write an Analyzer that wraps another one (but without using 
> AnalyzerWrapper) you may need to use the same reuse strategy in your wrapper. 
> This is not possible as there is no way to get the reuse strategy (private 
> field and no getter).
> An example is ES's NamedAnalyzer, see my comment: 
> [https://github.com/elasticsearch/elasticsearch/commit/b9a2fbd8741aa1b9beffb7d2922fc9b4525397e4#src/main/java/org/elasticsearch/index/analysis/NamedAnalyzer.java]
> This would add a getter, just a 3-liner.
