[ http://issues.apache.org/jira/browse/LUCENE-675?page=all ]

Doron Cohen updated LUCENE-675:
-------------------------------

    Attachment: benchmark.byTask.patch

I am attaching benchmark.byTask.patch, to be applied in the contrib/benchmark 
directory. 

The root package of the byTask classes was changed to 
org.apache.lucene.benchmark.byTask, along the lines of Grant's suggestion. 
This seems better because it keeps all benchmark classes under 
lucene.benchmark.

I added a sample .alg under conf and added some documentation. 

The entry point, documentation-wise, is the package doc for 
org.apache.lucene.benchmark.byTask.

Thanks for any comments on this!

PS. Before submitting the patch file, I tried to apply it myself to a clean 
copy of the code, just to make sure that it works. But I got errors like 
this -- Could not retrieve revision 0 of "...\byTask\.." -- for every file 
under a new folder. So I am not sure whether this is just my (Windows) svn 
patch-applying utility, or whether it is really impossible to apply a patch 
that creates files in directories that do not exist yet. I searched the Lucene 
and SVN mailing lists and went through the SVN book again, but nowhere could I 
find the expected behavior for applying a patch that contains new 
directories. In fact, "svn diff" would not even show files that are new 
(again, this is the Windows svn 1.4.2 version); I used Tortoise SVN to create 
the patch. This is rather annoying, and I might be misunderstanding something 
basic about SVN, but I thought it would be better to share this experience 
here; it might save some time for others trying to apply this patch or other patches
 ...
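
For anyone who hits the same problem, one possible workaround is to create 
the missing directories before applying the patch. The following is only a 
rough sketch in Python, not something I have verified against every patch 
tool. It assumes the patch is a unified diff in which each target file is 
named on a "+++ <path>" line, followed by a tab and a revision note, directly 
after a "--- " line. The function names target_dirs and create_missing_dirs 
are made up for this illustration.

```python
import os
import re

# Assumption: each target file appears on a "+++ <path>" header line, with
# the path ending at a tab (svn appends "(revision N)" after the tab).
_TARGET = re.compile(r"^\+\+\+ ([^\t\r\n]+)")


def target_dirs(patch_text):
    """Return the parent directories of the files named in the patch headers."""
    dirs = []
    previous = ""
    for line in patch_text.splitlines():
        match = _TARGET.match(line)
        # Only treat "+++ " as a header when it directly follows a "--- "
        # line; this skips most added lines that merely start with "++ ".
        if match and previous.startswith("--- "):
            path = match.group(1).strip().replace("\\", "/")
            parent = os.path.dirname(path)
            if parent and parent not in dirs:
                dirs.append(parent)
        previous = line
    return dirs


def create_missing_dirs(patch_text, root="."):
    """Create every target directory that does not exist yet under root."""
    created = []
    for rel_dir in target_dirs(patch_text):
        full = os.path.join(root, rel_dir)
        if not os.path.isdir(full):
            os.makedirs(full)
            created.append(rel_dir)
    return created
```

The idea would be to read the patch file into a string, call 
create_missing_dirs(text, root) with root set to the directory the patch is 
applied in, and then run the usual patch tool. Whether the tool then accepts 
the new files probably still depends on the tool itself.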

> Lucene benchmark: objective performance test for Lucene
> -------------------------------------------------------
>
>                 Key: LUCENE-675
>                 URL: http://issues.apache.org/jira/browse/LUCENE-675
>             Project: Lucene - Java
>          Issue Type: Improvement
>            Reporter: Andrzej Bialecki 
>         Assigned To: Grant Ingersoll
>         Attachments: benchmark.byTask.patch, benchmark.patch, 
> BenchmarkingIndexer.pm, extract_reuters.plx, LuceneBenchmark.java, 
> LuceneIndexer.java, taskBenchmark.zip, timedata.zip, tiny.alg, tiny.properties
>
>
> We need an objective way to measure the performance of Lucene, both indexing 
> and querying, on a known corpus. This issue is intended to collect comments 
> and patches implementing a suite of such benchmarking tests.
> Regarding the corpus: one of the widely used and freely available corpora is 
> the original Reuters collection, available from 
> http://www-2.cs.cmu.edu/afs/cs.cmu.edu/project/theo-20/www/data/news20.tar.gz 
> or 
> http://people.csail.mit.edu/u/j/jrennie/public_html/20Newsgroups/20news-18828.tar.gz.
>  I propose to use this corpus as a base for benchmarks. The benchmarking 
> suite could automatically retrieve it from known locations, and cache it 
> locally.

-- 
This message is automatically generated by JIRA.
-
If you think it was sent incorrectly contact one of the administrators: 
http://issues.apache.org/jira/secure/Administrators.jspa
-
For more information on JIRA, see: http://www.atlassian.com/software/jira

        
