[ https://issues.apache.org/jira/browse/LUCENE-2223?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12801692#action_12801692 ]
Steven Rowe edited comment on LUCENE-2223 at 1/18/10 7:13 AM:
--------------------------------------------------------------
bq. This appears to work well, the only thing I would ask for is a simple test
for the task (maybe especially testing the option that changes the wrapped
analyzer's classname from the default std. analyzer)
Done in attached patch - thanks for catching this oversight.
While constructing the test, I noticed that I had not brought over the analyzer
package abbreviation logic from NewAnalyzerTask. It is now present in
NewShingleAnalyzerTask, so an abbreviated name such as
"analyzer:WhitespaceAnalyzer" works as a param.
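For readers unfamiliar with the abbreviation logic mentioned above, here is a minimal sketch of how such an expansion could work. It is not the code from the patch: the class name AnalyzerNames, the method expand, and the exact rules (names without a package, or starting with "standard.", are resolved against org.apache.lucene.analysis) are assumptions made for illustration.

```java
// Hypothetical helper for illustration; not the actual NewShingleAnalyzerTask code.
final class AnalyzerNames {
    // Assumed default package that abbreviated analyzer names are resolved against.
    static final String DEFAULT_PACKAGE = "org.apache.lucene.analysis.";

    private AnalyzerNames() {}

    /** Expands an abbreviated analyzer name to a fully qualified class name. */
    static String expand(String name) {
        if (name == null || name.isEmpty()) {
            // Assumption: fall back to StandardAnalyzer when no name is given.
            return DEFAULT_PACKAGE + "standard.StandardAnalyzer";
        }
        // No package at all, or a package given relative to the default one.
        if (name.indexOf('.') == -1 || name.startsWith("standard.")) {
            return DEFAULT_PACKAGE + name;
        }
        // Anything else is treated as already fully qualified.
        return name;
    }
}
```

Under these assumptions, AnalyzerNames.expand("WhitespaceAnalyzer") yields "org.apache.lucene.analysis.WhitespaceAnalyzer", while a name that already carries its own package is returned unchanged.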
*Edit*: Also removed some leftover debug printing from
NewShingleAnalyzerTask.
> ShingleFilter benchmark
> -----------------------
>
> Key: LUCENE-2223
> URL: https://issues.apache.org/jira/browse/LUCENE-2223
> Project: Lucene - Java
> Issue Type: New Feature
> Components: contrib/benchmark
> Affects Versions: 3.0
> Reporter: Steven Rowe
> Priority: Minor
> Attachments: LUCENE-2223.patch, LUCENE-2223.patch
>
>
> Spawned from LUCENE-2218: a benchmark for ShingleFilter, along with a new
> task to instantiate (non-default-constructor) ShingleAnalyzerWrapper:
> NewShingleAnalyzerTask.
> The included shingle.alg runs ShingleAnalyzerWrapper, wrapping the default
> StandardAnalyzer, with 4 different configurations over 10,000 Reuters
> documents each. To allow ShingleFilter timings to be isolated from the rest
> of the pipeline, StandardAnalyzer is also run over the same set of Reuters
> documents. This set of 5 runs is then run 5 times.
> The patch includes two Perl scripts. The first outputs timing information as
> a JIRA-formatted table, with the minimum elapsed time for each of the 4
> ShingleAnalyzerWrapper runs and the StandardAnalyzer run. The second compares
> two runs' JIRA output, producing another JIRA table showing %
> improvement.
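
The arithmetic behind the two scripts described above can be sketched as follows. This is not the Perl from the patch: the class and method names are hypothetical, and the % improvement formula (time saved relative to the earlier run) is an assumption, since the description does not spell it out.

```java
import java.util.Arrays;

// Hypothetical sketch of the timing arithmetic; not the Perl scripts from the patch.
final class ShingleTimings {
    private ShingleTimings() {}

    /** Minimum elapsed time over the repeated rounds of one configuration. */
    static long minElapsed(long[] elapsedMillis) {
        return Arrays.stream(elapsedMillis).min().getAsLong();
    }

    /**
     * Assumed definition of % improvement: the time saved by the later run,
     * as a percentage of the earlier run's time.
     */
    static double percentImprovement(long beforeMillis, long afterMillis) {
        return 100.0 * (beforeMillis - afterMillis) / beforeMillis;
    }
}
```

With made-up numbers, five rounds of 1200, 1100, 1150, 1090 and 1130 ms give a minimum of 1090 ms, and going from 2000 ms to 1500 ms counts as a 25% improvement under this definition.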