It might be worth noting that Freebase publishes a text-only extract of 
Wikipedia: http://download.freebase.com/wex/latest/  We could take a snapshot 
of that and host it somewhere as the new standard for benchmarking.


On Jan 31, 2011, at 2:20 PM, mikemcc...@apache.org wrote:

> Author: mikemccand
> Date: Mon Jan 31 19:20:34 2011
> New Revision: 1065719
> 
> URL: http://svn.apache.org/viewvc?rev=1065719&view=rev
> Log:
> LUCENE-1591: rollback to old patched xercesImpl.jar to workaround 
> XERCESJ-1257, which we hit on current Wikipedia XML export 
> (enwiki-20110115-pages-articles.xml)
> 
> Added:
>    lucene/dev/trunk/modules/benchmark/lib/xercesImpl-2.9.1-patched-XERCESJ-1257.jar   (with props)
>    lucene/dev/trunk/modules/benchmark/lib/xml-apis-2.9.0.jar   (with props)
> Removed:
>    lucene/dev/trunk/modules/benchmark/lib/xercesImpl-2.10.0.jar
>    lucene/dev/trunk/modules/benchmark/lib/xml-apis-2.10.0.jar
> 
> Added: lucene/dev/trunk/modules/benchmark/lib/xercesImpl-2.9.1-patched-XERCESJ-1257.jar
> URL: http://svn.apache.org/viewvc/lucene/dev/trunk/modules/benchmark/lib/xercesImpl-2.9.1-patched-XERCESJ-1257.jar?rev=1065719&view=auto
> ==============================================================================
> Binary file - no diff available.
> 
> Added: lucene/dev/trunk/modules/benchmark/lib/xml-apis-2.9.0.jar
> URL: http://svn.apache.org/viewvc/lucene/dev/trunk/modules/benchmark/lib/xml-apis-2.9.0.jar?rev=1065719&view=auto
> ==============================================================================
> Binary file - no diff available.
> 
> 


---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org
