[ https://issues.apache.org/jira/browse/LUCENE-2181?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Steven Rowe updated LUCENE-2181:
--------------------------------

    Attachment: top.100k.words.de.en.fr.uk.wikipedia.2009-11.tar.bz2
                LUCENE-2181.patch

Hi Robert, 

In the new version of the patch, {{ant benchmark}} from the {{contrib/icu/}} 
directory attempts to download the attached {{tar.bz2}} file from 
{{http://people.apache.org/~rmuir/wikipedia}} (*please change this to the 
location where you end up putting the file*), then unpacks the archive to the 
{{contrib/icu/src/benchmark/work/}} directory, then compiles and runs the 
benchmark.

In addition to the top 100K word lists, the {{tar.bz2}} file includes 
{{LICENSE.txt}}, which links to the Wikipedia dumps from which the lists were 
extracted and to the license that Wikipedia uses.
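
For readers unfamiliar with what a collation benchmark measures, here is a minimal sketch of timing collation key generation with the JDK's {{java.text.Collator}}. It is not the code in the patch: the class name {{CollationKeySketch}}, its helper methods, and the tiny inline word list are all hypothetical stand-ins, and a real run would presumably read the word lists unpacked from the archive instead.

```java
import java.text.CollationKey;
import java.text.Collator;
import java.util.Arrays;
import java.util.List;
import java.util.Locale;

// Hypothetical illustration only; not part of LUCENE-2181.patch.
public class CollationKeySketch {

    // Build one collation key per word using the given JDK Collator.
    static CollationKey[] keysFor(Collator collator, List<String> words) {
        CollationKey[] keys = new CollationKey[words.size()];
        for (int i = 0; i < keys.length; i++) {
            keys[i] = collator.getCollationKey(words.get(i));
        }
        return keys;
    }

    // Return the elapsed nanoseconds for `rounds` passes of key generation.
    static long timeKeyGeneration(Collator collator, List<String> words, int rounds) {
        long produced = 0;
        long start = System.nanoTime();
        for (int r = 0; r < rounds; r++) {
            produced += keysFor(collator, words).length;
        }
        long elapsed = System.nanoTime() - start;
        // Use the result so the work cannot be trivially discarded.
        if (produced != (long) rounds * words.size()) {
            throw new IllegalStateException("unexpected key count");
        }
        return elapsed;
    }

    public static void main(String[] args) {
        // Assumption: a stand-in for one of the top-100K word lists.
        List<String> words = Arrays.asList("Apfel", "Ärger", "Zebra", "zebra", "Straße");
        Collator collator = Collator.getInstance(Locale.GERMAN);
        long nanos = timeKeyGeneration(collator, words, 10_000);
        System.out.println("JDK key generation took " + nanos + " ns");
    }
}
```

An ICU variant would swap in ICU4J's own {{Collator}} class; the timing loop itself would stay the same.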

> benchmark for collation
> -----------------------
>
>                 Key: LUCENE-2181
>                 URL: https://issues.apache.org/jira/browse/LUCENE-2181
>             Project: Lucene - Java
>          Issue Type: New Feature
>          Components: contrib/benchmark
>            Reporter: Robert Muir
>            Assignee: Robert Muir
>         Attachments: LUCENE-2181.patch, 
> top.100k.words.de.en.fr.uk.wikipedia.2009-11.tar.bz2
>
>
> Steven Rowe attached a contrib/benchmark-based benchmark for collation (both 
> jdk and icu) under LUCENE-2084, along with some instructions to run it... 
> I think it would be nice if we could turn this into a committable patch and 
> add it to benchmark.
