[ https://issues.apache.org/jira/browse/LUCENE-2181?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12798508#action_12798508 ]
Steven Rowe commented on LUCENE-2181:
-------------------------------------
Looks good. I like the way you've integrated it into the benchmark suite, and
as you say the NewLocaleTask should prove useful elsewhere.
bq. I put the files in my apache directory, but modified your patch somewhat
One major thing you changed but didn't mention above is that rather than
applying the collation key transform only to the LineDoc body field, it's now
applied also to the title and date fields. Given the nature of the top 100k
words files -- the title is an integer representing term frequency, and the
date is essentially meaningless (the date on which I created the file) -- I
don't think this makes sense (and that's why I made analyzers that only applied
collation to the body field).
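
To make the distinction concrete, here is a minimal sketch of what "collate
the body only" means. It is not the analyzer code from either patch: it uses
the JDK's java.text.Collator directly as a stand-in, stores the raw collation
key bytes rather than an indexable encoding, and the field names and sample
values are made up for illustration.

```java
import java.text.Collator;
import java.util.LinkedHashMap;
import java.util.Locale;
import java.util.Map;

final class BodyOnlyCollation {
    // Hypothetical field name, chosen only for this illustration.
    static final String BODY = "body";

    /**
     * Returns a copy of the document in which only the body value is
     * replaced by its collation key bytes. Every other field (title,
     * date, ...) is passed through unchanged.
     */
    static Map<String, Object> transform(Map<String, String> doc, Collator collator) {
        Map<String, Object> out = new LinkedHashMap<>();
        for (Map.Entry<String, String> field : doc.entrySet()) {
            if (BODY.equals(field.getKey())) {
                out.put(field.getKey(),
                        collator.getCollationKey(field.getValue()).toByteArray());
            } else {
                out.put(field.getKey(), field.getValue());
            }
        }
        return out;
    }

    public static void main(String[] args) {
        Collator collator = Collator.getInstance(Locale.GERMAN);

        Map<String, String> doc = new LinkedHashMap<>();
        doc.put("title", "48213");      // made-up term frequency
        doc.put("date", "2009-11-30");  // made-up file creation date
        doc.put("body", "Apfel");       // the word itself

        Map<String, Object> indexed = transform(doc, collator);
        System.out.println("title stays a plain string: " + indexed.get("title"));
        System.out.println("body becomes key bytes: "
                + ((byte[]) indexed.get(BODY)).length + " bytes");
    }
}
```

Applying the transform to all three fields would also spend collation time on
a frequency count and a constant date, which is work the benchmark has no
reason to measure.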
> benchmark for collation
> -----------------------
>
> Key: LUCENE-2181
> URL: https://issues.apache.org/jira/browse/LUCENE-2181
> Project: Lucene - Java
> Issue Type: New Feature
> Components: contrib/benchmark
> Reporter: Robert Muir
> Assignee: Robert Muir
> Attachments: LUCENE-2181.patch, LUCENE-2181.patch,
> top.100k.words.de.en.fr.uk.wikipedia.2009-11.tar.bz2
>
>
> Steven Rowe attached a contrib/benchmark-based benchmark for collation (both
> jdk and icu) under LUCENE-2084, along with some instructions to run it...
> I think it would be nice if we could turn this into a committable patch and
> add it to benchmark.