[
https://issues.apache.org/jira/browse/LUCENE-1775?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Michael Busch updated LUCENE-1775:
----------------------------------
Attachment: lucene-1775.patch
ShingleMatrixFilter is a very complicated filter. It seems to be
implemented in a very inefficient way: it does lots of cloning. While I was
able to fully convert ShingleFilter so that it is now much more
efficient, I'm not going to do that with ShingleMatrixFilter.
I don't know the code well enough to even try, and at 1000 LOC it's very
complex.
The drawback of not fully converting it is that if someone uses custom
Attributes, i.e. ones that are not in core Lucene, it is undefined what the
filter will do with those Attributes. However, I don't even know what the
behavior should be. If only core Attributes are used, everything works
fine, as the passing JUnit tests show.
I added a corresponding comment to the javadocs of that class.
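
To illustrate the caveat about custom Attributes, here is a minimal,
self-contained sketch. It does not use Lucene's classes and is not how
ShingleMatrixFilter is implemented. It models token state as a plain map and
assumes a hypothetical set of "core" attribute names and a hypothetical
custom attribute called payloadScore. Silently dropping the custom value is
only one possible outcome; the actual behavior of the filter is undefined.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Set;

public class AttributeLossSketch {
    // Hypothetical stand-ins for core attribute names; not Lucene's real classes.
    static final Set<String> CORE = Set.of("term", "startOffset", "endOffset", "type");

    // Models a filter that buffers tokens in a fixed-field object and so
    // carries over only the attributes it knows about.
    static Map<String, Object> copyCoreOnly(Map<String, Object> state) {
        Map<String, Object> out = new LinkedHashMap<>();
        for (Map.Entry<String, Object> e : state.entrySet()) {
            if (CORE.contains(e.getKey())) {
                out.put(e.getKey(), e.getValue());
            }
        }
        return out;
    }

    // Models a fully converted filter that snapshots the whole state,
    // including attributes it has never heard of.
    static Map<String, Object> copyAll(Map<String, Object> state) {
        return new LinkedHashMap<>(state);
    }

    public static void main(String[] args) {
        Map<String, Object> state = new LinkedHashMap<>();
        state.put("term", "please divide");
        state.put("startOffset", 0);
        state.put("endOffset", 13);
        state.put("payloadScore", 0.5); // hypothetical custom attribute

        System.out.println("core-only copy: " + copyCoreOnly(state));
        System.out.println("full copy:      " + copyAll(state));
    }
}
```

In this model the core-only copy no longer contains payloadScore, while the
full copy keeps it. A filter that is only partially converted sits somewhere
between the two, which is why the result for custom Attributes cannot be
promised.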
> Change org.apache.lucene.analysis.shingle to use new TokenStream API
> --------------------------------------------------------------------
>
> Key: LUCENE-1775
> URL: https://issues.apache.org/jira/browse/LUCENE-1775
> Project: Lucene - Java
> Issue Type: Task
> Components: contrib/analyzers
> Reporter: Michael Busch
> Assignee: Michael Busch
> Priority: Minor
> Fix For: 2.9
>
> Attachments: lucene-1775.patch, lucene-1775.patch, lucene-1775.patch
>
>
> All other contrib streams/filters have already been converted with
> LUCENE-1460.
> The two shingle filters are the last ones we need to convert.