Re: Distributed search

2013-01-28 Thread Mingfeng Yang
In your case, since there are no concurrent queries, adding replicas won't
help much with response time. However, breaking your index into a few
shards does help query performance. I recently broke an index of 30 million
documents (30G) into 4 shards, and the boost was pretty impressive (roughly
2-5x faster for a complicated query).
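
To illustrate why sharding helps a single query while replicas do not, here
is a toy sketch in Python. It is not Solr code: make_shards, search_shard and
distributed_search are hypothetical names, the "score" is just a term count,
and Python threads only stand in for shards that would really live on
separate cores or servers. The point is the shape of the request: one query
fans out to every shard, each shard scans only its slice of the index, and
the local top-k lists are merged.

```python
import heapq
from concurrent.futures import ThreadPoolExecutor

def make_shards(docs, n_shards):
    """Partition (doc_id, text) pairs round-robin into n_shards lists."""
    shards = [[] for _ in range(n_shards)]
    for i, doc in enumerate(docs):
        shards[i % n_shards].append(doc)
    return shards

def search_shard(shard, term, k):
    """Score one shard by term count and return its local top-k hits."""
    hits = ((text.split().count(term), doc_id) for doc_id, text in shard)
    return heapq.nlargest(k, (h for h in hits if h[0] > 0))

def distributed_search(shards, term, k):
    """Fan the query out to every shard, then merge the local top-k lists."""
    with ThreadPoolExecutor(max_workers=len(shards)) as pool:
        partials = list(pool.map(lambda s: search_shard(s, term, k), shards))
    return heapq.nlargest(k, (hit for part in partials for hit in part))

# Toy corpus: document i contains "solr" (i % 5) times.
docs = [(i, "solr " * (i % 5) + "lucene") for i in range(20)]
single = distributed_search(make_shards(docs, 1), "solr", 3)
sharded = distributed_search(make_shards(docs, 4), "solr", 3)
```

Both calls return the same top 3, but with 4 shards each worker scans a
quarter of the documents. A replica, by contrast, is a second copy of the
same slice: it can serve a different query at the same time, but it does not
shrink the work any one query has to do.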

Ming


On Mon, Jan 28, 2013 at 10:54 AM, Isaac Hebsh isaac.he...@gmail.com wrote:

 Does adding replicas (on additional servers) help to improve search
 performance?

 It is known that each query goes to all the shards. It's clear that if we
 have massive load, then multiple cores serving the same shard are very
 useful.

 But what happens if I never have concurrent queries (only one query is in the
 system at any time), and I want each single query to return faster? Will a
 bigger replication factor help?

 In particular, will a complicated query (one that queries a large number of
 fields) go to multiple cores *of the same shard*? (E.g. core1 searching for
 term1 in field1, and core2 searching for term2 in field2.)

 And what about a query with a lot of terms on a single field?

 Thanks in advance..



Re: Distributed search

2013-01-28 Thread Isaac Hebsh
Well, my index is already split into 16 shards...
The behaviour I described absolutely doesn't happen, right?
Would it make sense as an improvement request?
Technically, can multiple Lucene responses be intersected this way?
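
To make the question concrete, here is a toy Python sketch of what such an
intersection would amount to. It does not describe anything Solr or Lucene
actually does: match_clause and intersect are hypothetical helpers, the
index is a plain dict, and the "score" is a simple term count.

```python
def match_clause(index, field, term):
    """Return {doc_id: partial_score} for every doc matching one clause."""
    return {doc_id: fields[field].split().count(term)
            for doc_id, fields in index.items()
            if term in fields[field].split()}

def intersect(partials):
    """AND the clauses: keep docs found in every partial result, sum scores."""
    common = set.intersection(*(set(p) for p in partials))
    return {d: sum(p[d] for p in partials) for d in common}

index = {
    1: {"title": "solr search", "body": "lucene lucene"},
    2: {"title": "solr solr", "body": "nothing here"},
    3: {"title": "other", "body": "lucene"},
}
# Hypothetical split: "core1" evaluates the title clause, "core2" the body clause.
from_core1 = match_clause(index, "title", "solr")
from_core2 = match_clause(index, "body", "lucene")
merged = intersect([from_core1, from_core2])
```

Note that for the intersection to be correct, each core has to return its
complete match set, not just its top-k: a document ranked low for one clause
may still belong in the final result. Shipping and merging those full sets
could easily cost more than the parallelism saves.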


On Mon, Jan 28, 2013 at 9:27 PM, Mingfeng Yang mfy...@wisewindow.com wrote:

 In your case, since there are no concurrent queries, adding replicas won't
 help much with response time. However, breaking your index into a few
 shards does help query performance. I recently broke an index of 30 million
 documents (30G) into 4 shards, and the boost was pretty impressive (roughly
 2-5x faster for a complicated query).

 Ming


 On Mon, Jan 28, 2013 at 10:54 AM, Isaac Hebsh isaac.he...@gmail.com
 wrote:

  Does adding replicas (on additional servers) help to improve search
  performance?
 
  It is known that each query goes to all the shards. It's clear that if we
  have massive load, then multiple cores serving the same shard are very
  useful.
 
  But what happens if I never have concurrent queries (only one query is in
  the system at any time), and I want each single query to return faster?
  Will a bigger replication factor help?
 
  In particular, will a complicated query (one that queries a large number of
  fields) go to multiple cores *of the same shard*? (E.g. core1 searching for
  term1 in field1, and core2 searching for term2 in field2.)
 
  And what about a query with a lot of terms on a single field?
 
  Thanks in advance..