Re: solr in distributed mode

2009-06-11 Thread Rakhi Khatwani
Hi,
I went through this document:
http://www.lucidimagination.com/Community/Hear-from-the-Experts/Articles/Scaling-Lucene-and-Solr

I have a couple of questions:

1. The document mentions that
"There will be a 'master' server for each shard and then 1-n 'slaves' that
are replicated from the master."

How is the replication process done?

Suppose I have two machines, nodeA and nodeB. I edited scripts.conf in
solr/conf on both nodeA and nodeB to point to the master (i.e. nodeA).
   i) Is this the right approach for setting up a master/slave
configuration?
   ii) To start the master/slave configuration, should I run start.jar on
both nodes, or just on the master node?
   iii) Are indexes replicated automatically when I insert or update
documents on the master, or do we have to run a script for that?
   iv) How do I know whether the replication process completed
successfully?
   v) Suppose the master goes down. How do I perform a node failover, for
example promoting one of the slaves to master without disrupting my
application?
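
For questions iii) and iv), here is a minimal sketch of how replication could
be triggered and inspected over HTTP. It assumes the Java-based
ReplicationHandler that ships with Solr 1.4, registered at /replication,
rather than the script-based setup configured through scripts.conf. The host
name nodeB, port 8983 and the helper replication_url are hypothetical.

```python
from urllib.parse import urlencode

# ReplicationHandler commands used in this sketch (Solr 1.4).
KNOWN_COMMANDS = {"details", "fetchindex", "indexversion"}


def replication_url(host, command, port=8983):
    """Build a ReplicationHandler URL for one node (hypothetical helper)."""
    if command not in KNOWN_COMMANDS:
        raise ValueError(f"unsupported command: {command}")
    query = urlencode({"command": command})
    return f"http://{host}:{port}/solr/replication?{query}"


# Ask the slave to pull the latest index from its master right away.
force_pull = replication_url("nodeB", "fetchindex")
# Ask the slave to report its replication status.
status = replication_url("nodeB", "details")

print(force_pull)
print(status)
```

The sketch only builds the URLs. Fetching them (for example with
urllib.request.urlopen) needs a running Solr instance with the handler
enabled.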


2. It has also been mentioned that:

"With distribution and replication, none of the master shards know about
each other. You index to each master, the index is replicated to each slave,
and then searches are distributed across the slaves, using one slave from
each master/slave shard."

  i) Are slaves used only for index replication? I mean, can't I have
indexes distributed across slaves so that a search runs across all of them?
  ii) Since none of the shards has any information about the others, if I
update or delete documents based on a term, how does the index get updated
across all shards? Or do we have to merge, update/delete, and then
redistribute it across the shards?
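
For question 2 ii), since the masters do not know about each other, one
option is for the client to send the same delete-by-query to every master and
let each master replicate the change to its own slaves. A minimal sketch
follows. The master URLs and the field name category are hypothetical, and
the code only builds the requests without sending them.

```python
from xml.sax.saxutils import escape

# Hypothetical master shards; each one replicates to its own slaves.
MASTERS = ["http://masterA:8983/solr", "http://masterB:8983/solr"]


def delete_by_query_requests(query, masters=MASTERS):
    """Return one (url, xml_body) pair per master. Nothing is sent."""
    body = f"<delete><query>{escape(query)}</query></delete>"
    return [(f"{master}/update", body) for master in masters]


requests = delete_by_query_requests("category:obsolete")
for url, body in requests:
    print(url, body)
```

Each master would still need a commit before the deletion becomes visible,
and the slaves only see it after their next replication.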

Regards,
Raakhi





In a distributed configuration, one server 'shard' will get a query request
and then search itself, as well as the other shards in the configuration,
and return the combined results from each shard.
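
As a rough illustration of the paragraph above, a distributed request is an
ordinary select request sent to one shard, with a shards parameter listing
every shard to search. This is a sketch only: the host names and the helper
distributed_select_url are hypothetical, and it builds the URL without
sending it.

```python
from urllib.parse import urlencode


def distributed_select_url(entry_shard, shards, query, rows=10):
    """Build a select URL that asks entry_shard to search all given shards."""
    params = urlencode(
        {"q": query, "rows": rows, "shards": ",".join(shards)},
        safe=":/,",  # keep the shard list readable instead of percent-encoded
    )
    return f"http://{entry_shard}/select?{params}"


# Shards are written as host:port/path, without the http:// prefix.
shards = ["nodeA:8983/solr", "nodeB:8983/solr"]
url = distributed_select_url(shards[0], shards, "title:lucene")
print(url)
```

The shard that receives the request queries every entry in the list,
including itself, and merges the results before responding.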



On Wed, Jun 10, 2009 at 11:23 AM, Otis Gospodnetic <
otis_gospodne...@yahoo.com> wrote:

>
> Hello,
>
> All of this is covered on the Wiki, search for: distributed search
>
>  Otis
> --
> Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch
>


Re: solr in distributed mode

2009-06-09 Thread Otis Gospodnetic

Hello,

All of this is covered on the Wiki, search for: distributed search

 Otis
--
Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch






Re: solr in distributed mode

2009-06-09 Thread Mark Miller


  
You might check out this article to get an idea of how Solr scales (lot 
of extra stuff in Lucene in there too, just skip to around)

http://www.lucidimagination.com/Community/Hear-from-the-Experts/Articles/Scaling-Lucene-and-Solr

You can also check out the wiki: 
http://wiki.apache.org/solr/DistributedSearch


Also see:

Solr 1.4 : http://wiki.apache.org/solr/SolrReplication
Solr 1.3,1.4: http://wiki.apache.org/solr/CollectionDistribution

--
- Mark

http://www.lucidimagination.com





solr in distributed mode

2009-06-09 Thread Rakhi Khatwani
Hi,
I was looking for ways in which we can use Solr in distributed mode.
Is there any way we can use Solr indexes across machines, or by using the
Hadoop Distributed File System?

It has been mentioned in the wiki that:
"When an index becomes too large to fit on a single system, or when a single
query takes too long to execute, an index can be split into multiple shards,
and Solr can query and merge results across those shards."

What I understand is that shards are partitions. Are shards on the same
machine, or can they be on different machines? Do we have to manually
split the indexes to store them in different shards?
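
To make the manual-splitting question concrete: as far as I understand, with
distributed search in Solr 1.3/1.4 the indexing client decides which shard
each document goes to, and the shards can live on different machines. A
minimal sketch of one common approach is to hash the unique key modulo the
number of shards. The shard URLs and the helpers shard_for and partition are
hypothetical.

```python
import zlib

# Hypothetical shard URLs on two different machines.
SHARDS = ["http://nodeA:8983/solr", "http://nodeB:8983/solr"]


def shard_for(doc_id, num_shards):
    """Map a document's unique key to a shard index, stable across runs."""
    # zlib.crc32 is used because Python's built-in hash() of a str
    # changes from one process to the next.
    return zlib.crc32(doc_id.encode("utf-8")) % num_shards


def partition(doc_ids, shards=SHARDS):
    """Group document ids by the shard they would be indexed to."""
    groups = {shard: [] for shard in shards}
    for doc_id in doc_ids:
        groups[shards[shard_for(doc_id, len(shards))]].append(doc_id)
    return groups


groups = partition(["doc-1", "doc-2", "doc-3", "doc-4"])
for shard, ids in groups.items():
    print(shard, ids)
```

Because the mapping depends on the number of shards, adding a shard later
changes where existing documents belong, so they would need to be reindexed.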

Do you have an example or a tutorial that demonstrates distributed index
searching/storing using shards?

Regards,
Raakhi