Re: Does lucene support distributed indexing?

2008-04-28 Thread Vaijanath N. Rao
Hi all, how about adding Hadoop support for distributed indexing? If required, I can start working on this, provided Hadoop is the feasible option. Also, what other techniques can one think of for doing distributed indexing? Currently I am planning on extending SolrJ to keep a map of where the docum

Re: Does lucene support distributed indexing?

2008-04-28 Thread Otis Gospodnetic
That's right - most of them are about distributed searching (hence my notes about sharding being up to the app). Hadoop's contrib/index is about dist indexing: "This contrib package provides a utility to build or update an index using Map/Reduce. A distributed "index" is partitioned into "shar

Re: Does lucene support distributed indexing?

2008-04-28 Thread Chris Hostetter
: There are actually several distributed indexing or searching projects in : Lucene (the top-level ASF Lucene project, not Lucene Java), and it's : time to start thinking about the possibility of bringing them together, : finding commonalities, etc. I would actually argue that almost all of th

RE: Does lucene support distributed indexing?

2008-04-28 Thread Stu Hood
Solr does not do distributed indexing, but the development version _does_ do distributed search, in addition to replication. Currently, you can manually shard up your data to a set of Solr instances, and then query them by adding a 'shard=localhost:8080/solr_1,localhost:8080/solr_2' parameter.
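A minimal sketch of the manual sharding query described above, assuming the two local instances named in the message. Released Solr versions document this parameter as `shards`; the exact spelling in the development build discussed here is an assumption. The helper name `build_sharded_query_url` and the sample query are hypothetical, and `/select` is assumed to be the request handler path.

```python
from urllib.parse import urlencode


def build_sharded_query_url(coordinator, shards, query):
    """Build a select URL asking `coordinator` to fan `query` out to `shards`.

    `coordinator` and each shard are host:port/path strings without a scheme,
    matching the form used in the message above.
    """
    if not shards:
        raise ValueError("at least one shard is required")
    params = {
        "q": query,
        # Assumption: the parameter is named `shards`, as in released Solr.
        "shards": ",".join(shards),
    }
    # urlencode percent-escapes ':', '/' and ',' inside parameter values.
    return "http://" + coordinator + "/select?" + urlencode(params)


# Hypothetical usage: either instance can act as the coordinator.
url = build_sharded_query_url(
    "localhost:8080/solr_1",
    ["localhost:8080/solr_1", "localhost:8080/solr_2"],
    "title:lucene",
)
print(url)
```

The instance that receives the request merges the per-shard results, so the client only needs to know the shard list, not how documents were partitioned.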

search performance & caching

2008-04-28 Thread Beard, Brian
I'm using Lucene 2.2.0 and have two questions: 1) Should search times be linear with respect to the number of queries hitting a single searcher? I've run multiple search threads against a single searcher, and the search times scale almost exactly linearly: 10x slower for 10 threads vs. 1 thread, etc. I'm using a parallel multi-

Re: TrecDocMaker

2008-04-28 Thread Grant Ingersoll
Yeah, these classes are a bit weird in that they are configured via properties, and not setters. They really are designed to run inside the benchmarker, and not much attention was paid to using them elsewhere. However, one can co-opt them for what you are doing. Something like: TrecDo

RE: Does lucene support distributed indexing?

2008-04-28 Thread Fang_Li
Solr does not do distributed indexing, only index replication; all copies are identical. Lucene has some built-in support for distributed search; please take a look at RemoteSearchable. For indexing, you can add a front load balancer in a naïve way. Regards, -Original Message- From: Sam