For now it is in
http://people.apache.org/~gsingers/wikipedia/enwiki-20070527-pages-articles.xml.bz2
Does the Ant get task work with redirects? I may eventually move this. I am
trying to find the old message and responses from Infrastructure
saying where this should go. The original suggestion was zones, but
that only has Tomcat on it and I don't want to consume those resources.
I can probably just update the patch, so no need to submit a new one
unless you want to.
-Grant
On Jun 4, 2007, at 4:18 PM, Steven Parkes (JIRA) wrote:
[ https://issues.apache.org/jira/browse/LUCENE-848?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12501344 ]
Steven Parkes commented on LUCENE-848:
--------------------------------------
It looks like the latest successful dump is
http://download.wikimedia.org/enwiki/20070527/enwiki-20070527-pages-articles.xml.bz2
If you copy it wherever, I'll fetch it from there and test it.
Add support for Wikipedia English as a corpus in the benchmarker stuff
------------------------------------------------------------------------
Key: LUCENE-848
URL: https://issues.apache.org/jira/browse/LUCENE-848
Project: Lucene - Java
Issue Type: New Feature
Components: contrib/benchmark
Reporter: Steven Parkes
Assignee: Grant Ingersoll
Priority: Minor
Attachments: LUCENE-848.txt, LUCENE-848.txt,
LUCENE-848.txt, LUCENE-848.txt, LUCENE-848.txt, LUCENE-848.txt,
LUCENE-848.txt, WikipediaHarvester.java, xerces.jar, xerces.jar,
xml-apis.jar
Add support for using Wikipedia for benchmarking.
--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
---------------------------------------------------------------------
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]