[
https://issues.apache.org/jira/browse/LUCENE-3396?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Chris Male updated LUCENE-3396:
-------------------------------
Attachment: LUCENE-3396-remaining-analyzers.patch
This patch converts the last of the Analyzers over to
ReusableAnalyzerBase (RAB). At this stage, RAB is the only direct subclass of Analyzer.
The patch includes a few notable changes:
- Adds AnalyzerWrapper, which is used by Analyzers that wrap other Analyzers.
This is necessary to allow access to the TokenStreamComponents (TSC) of the
wrapped Analyzers without making TSC public.
- IndexSchemaRuntimeFieldTest is split out of IndexSchemaTest because
testRuntimeFieldCreation makes some thread-unsafe changes to the Analyzers
stored in IndexSchema. When run in IndexSchemaTest, the test fails depending
on the execution order. When run by itself, there are no problems. I
think this is okay because the actual code being tested is documented as being
thread-unsafe and the test also notes it does some tacky things.
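
To illustrate the AnalyzerWrapper point in the first bullet, here is a minimal, self-contained sketch of the idea. It is not the code in the patch and does not use Lucene classes: SketchComponents, SketchAnalyzer, SketchAnalyzerWrapper and the method names getWrappedAnalyzer and wrapComponents are all hypothetical stand-ins. The assumption is that component creation stays non-public on the analyzer, and that a wrapper base class living in the same package is the one place allowed to reach into a wrapped analyzer.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Locale;

/** Hypothetical stand-in for TokenStreamComponents: turns text into tokens. */
interface SketchComponents {
    List<String> tokenize(String text);
}

/** Hypothetical stand-in for Analyzer; component creation is not public. */
abstract class SketchAnalyzer {
    protected abstract SketchComponents createComponents(String fieldName);

    final List<String> analyze(String fieldName, String text) {
        return createComponents(fieldName).tokenize(text);
    }
}

/**
 * Hypothetical wrapper base. Because it sits in the same package as
 * SketchAnalyzer, it may call the protected createComponents on another
 * instance. (In this single-file sketch every class shares one package, so
 * the restriction on outside callers is not actually exercised.)
 */
abstract class SketchAnalyzerWrapper extends SketchAnalyzer {
    protected abstract SketchAnalyzer getWrappedAnalyzer(String fieldName);

    protected abstract SketchComponents wrapComponents(String fieldName, SketchComponents inner);

    @Override
    protected final SketchComponents createComponents(String fieldName) {
        SketchComponents inner = getWrappedAnalyzer(fieldName).createComponents(fieldName);
        return wrapComponents(fieldName, inner);
    }
}

/** Toy analyzer that splits on whitespace. */
final class WhitespaceSketchAnalyzer extends SketchAnalyzer {
    @Override
    protected SketchComponents createComponents(String fieldName) {
        return text -> Arrays.asList(text.trim().split("\\s+"));
    }
}

/** Toy wrapper that lowercases whatever the wrapped analyzer produces. */
final class LowercasingWrapper extends SketchAnalyzerWrapper {
    private final SketchAnalyzer delegate;

    LowercasingWrapper(SketchAnalyzer delegate) {
        this.delegate = delegate;
    }

    @Override
    protected SketchAnalyzer getWrappedAnalyzer(String fieldName) {
        return delegate;
    }

    @Override
    protected SketchComponents wrapComponents(String fieldName, SketchComponents inner) {
        return text -> {
            List<String> out = new ArrayList<>();
            for (String token : inner.tokenize(text)) {
                out.add(token.toLowerCase(Locale.ROOT));
            }
            return out;
        };
    }
}
```

A concrete wrapper only says which analyzer it wraps and how to decorate the components; it never needs a public accessor on the wrapped analyzer.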
I'm not going to look to commit this just yet, as I want to collapse Analyzer
and RAB into a single class.
> Make TokenStream Reuse Mandatory for Analyzers
> ----------------------------------------------
>
> Key: LUCENE-3396
> URL: https://issues.apache.org/jira/browse/LUCENE-3396
> Project: Lucene - Java
> Issue Type: Improvement
> Components: modules/analysis
> Reporter: Chris Male
> Attachments: LUCENE-3396-forgotten.patch, LUCENE-3396-rab.patch,
> LUCENE-3396-rab.patch, LUCENE-3396-rab.patch, LUCENE-3396-rab.patch,
> LUCENE-3396-rab.patch, LUCENE-3396-rab.patch, LUCENE-3396-rab.patch,
> LUCENE-3396-remaining-analyzers.patch
>
>
> In LUCENE-2309 it became clear that we'd benefit a lot from Analyzer having
> to return reusable TokenStreams. This is a big chunk of work, but it's time
> to bite the bullet.
> I plan to attack this in the following way:
> - Collapse the logic of ReusableAnalyzerBase into Analyzer
> - Add a ReuseStrategy abstraction to Analyzer which controls whether the
> TokenStreamComponents are reused globally (as they are today) or per-field.
> - Convert all Analyzers over to using TokenStreamComponents. I've already
> seen that some of the TokenStreams created in tests need some work to be
> reusable (even if they aren't reused).
> - Remove Analyzer.reusableTokenStream and convert everything over to using
> .tokenStream (which will now be returning reusable TokenStreams).
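
The global versus per-field reuse described in the plan above can be sketched roughly as follows. This is only an illustration under stated assumptions, not the planned Lucene API: ReuseSketch, GlobalReuse, PerFieldReuse, CachedComponents and ReusingAnalyzerSketch are hypothetical names, and the use of a plain ThreadLocal for storage is an assumption of the sketch.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical stand-in for TokenStreamComponents. */
final class CachedComponents {
    final String createdForField;

    CachedComponents(String createdForField) {
        this.createdForField = createdForField;
    }
}

/** Hypothetical strategy deciding how cached components are keyed. */
interface ReuseSketch {
    CachedComponents get(String fieldName);

    void put(String fieldName, CachedComponents components);
}

/** One cached instance per thread, shared by every field. */
final class GlobalReuse implements ReuseSketch {
    private final ThreadLocal<CachedComponents> cached = new ThreadLocal<>();

    @Override
    public CachedComponents get(String fieldName) {
        return cached.get();
    }

    @Override
    public void put(String fieldName, CachedComponents components) {
        cached.set(components);
    }
}

/** One cached instance per thread and per field name. */
final class PerFieldReuse implements ReuseSketch {
    private final ThreadLocal<Map<String, CachedComponents>> cached =
            ThreadLocal.withInitial(HashMap::new);

    @Override
    public CachedComponents get(String fieldName) {
        return cached.get().get(fieldName);
    }

    @Override
    public void put(String fieldName, CachedComponents components) {
        cached.get().put(fieldName, components);
    }
}

/** Toy analyzer that creates components only when the strategy has none cached. */
final class ReusingAnalyzerSketch {
    private final ReuseSketch strategy;
    int created = 0; // number of CachedComponents instances built so far

    ReusingAnalyzerSketch(ReuseSketch strategy) {
        this.strategy = strategy;
    }

    CachedComponents componentsFor(String fieldName) {
        CachedComponents components = strategy.get(fieldName);
        if (components == null) {
            components = new CachedComponents(fieldName);
            created++;
            strategy.put(fieldName, components);
        }
        return components;
    }
}
```

With GlobalReuse, asking for "title" and then "body" hands back the same instance; with PerFieldReuse each field gets its own instance, which is then reused on later calls for that field.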