Re: Distributed Search and the Stale Check

2013-02-27 Thread Ryan Zezeski

On Mon, Feb 25, 2013 at 8:26 PM, Mark Miller markrmil...@gmail.com wrote:

> Please file a JIRA issue and attach your patch. Great write up! (Saw it
> pop up on twitter, so I read it a little earlier).


Done.

https://issues.apache.org/jira/browse/SOLR-4509

RE: Distributed Search and the Stale Check

2013-02-25 Thread Michael Ryan

I don't have anything to add besides saying this is awesome. Great analysis.

-Michael

Re: Distributed Search and the Stale Check

2013-02-25 Thread Mark Miller


On Feb 25, 2013, at 8:14 PM, Ryan Zezeski rzeze...@gmail.com wrote:

> I would like to see a
> similar fix made upstream and that is why I am posting here.

Please file a JIRA issue and attach your patch. Great write up! (Saw it pop up 
on twitter, so I read it a little earlier).

- Mark

Re: Distributed Search and the Stale Check

2013-02-25 Thread Yonik Seeley

> On my particular benchmark rig, each stale check call accounted for an
> additional ~10ms.

That's insane!

It's still not even clear to me how the stale check works (reliably).
Couldn't the server still close the connection between the stale check
and the send of data by the client?

-Yonik
http://lucidworks.com

On Mon, Feb 25, 2013 at 8:14 PM, Ryan Zezeski rzeze...@gmail.com wrote:
Hello Solr Users,

I just wrote up a piece about some work I did recently to improve the
throughput of distributed search.

http://www.zinascii.com/2013/solr-distributed-search-and-the-stale-check.html

The short of it is that the stale check in Apache's HTTP Client, used by
SolrJ, can add a lot of latency to a distributed search request, especially
given that distributed search is actually made up of 2 stages, each of
which must perform its own stale check. For my particular benchmark setup,
with the check disabled, I saw a 2-4x increase in throughput and a 100ms+
drop in latency. All my work has been done in the context of a larger
project, Yokozuna [1], and thus
the patch is currently local to that project. I would like to see a
similar fix made upstream and that is why I am posting here. I was hoping
the Solr sages could offer their input. My fix is very basic, simply
disabling the check and adding a sweeper thread to prevent socket reset
errors [2]. But if I had more time I think a rewrite using the latest
Apache HTTP Components might be in order. I'm not sure. I'm happy to
answer any questions and give more details on my test setup.

-Z

[1] https://github.com/rzezeski/yokozuna

[2]
https://github.com/rzezeski/yokozuna/blob/a731748f07ee2156b5b3eb558e6b8a3efda4bfe4/solr-patches/no-stale-check.patch
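
The fix described above (turn off the per-request stale check, and have a background sweeper close idle connections before the server times them out) can be sketched in plain Java. This is not the patch linked in [2]: `IdleConnectionPool`, `IdleSweeper`, and the timeout values are hypothetical stand-ins, and a real implementation would presumably ask the HTTP client's connection manager to close sockets rather than just track timestamps in a map.

```java
import java.util.Iterator;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

/** Hypothetical stand-in for a pool of kept-alive connections. */
class IdleConnectionPool {
    // connection id -> time (ms) at which the connection was last released
    private final Map<String, Long> lastUsed = new ConcurrentHashMap<>();

    /** Record that a connection was handed back to the pool at nowMillis. */
    void release(String connId, long nowMillis) {
        lastUsed.put(connId, nowMillis);
    }

    /**
     * Drop every connection idle for longer than maxIdleMillis.
     * A real pool would also close the underlying socket here.
     * Returns the number of connections evicted.
     */
    int closeIdle(long maxIdleMillis, long nowMillis) {
        int closed = 0;
        Iterator<Map.Entry<String, Long>> it = lastUsed.entrySet().iterator();
        while (it.hasNext()) {
            if (nowMillis - it.next().getValue() > maxIdleMillis) {
                it.remove();
                closed++;
            }
        }
        return closed;
    }

    int size() {
        return lastUsed.size();
    }
}

/** Hypothetical sweeper: periodically evicts idle connections in the background. */
class IdleSweeper {
    static ScheduledExecutorService start(IdleConnectionPool pool,
                                          long maxIdleMillis,
                                          long periodMillis) {
        ScheduledExecutorService exec =
            Executors.newSingleThreadScheduledExecutor(r -> {
                Thread t = new Thread(r, "idle-sweeper");
                t.setDaemon(true); // do not keep the JVM alive just for sweeping
                return t;
            });
        exec.scheduleAtFixedRate(
            () -> pool.closeIdle(maxIdleMillis, System.currentTimeMillis()),
            periodMillis, periodMillis, TimeUnit.MILLISECONDS);
        return exec;
    }
}
```

The sweep only helps if `maxIdleMillis` is shorter than the server's keep-alive timeout; that relationship is an assumption of this sketch, not something measured in the thread.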

Re: Distributed Search and the Stale Check

2013-02-25 Thread Ryan Zezeski

On Mon, Feb 25, 2013 at 8:42 PM, Yonik Seeley yo...@lucidworks.com wrote:


> That's insane!


It is insane.  Keep in mind this was a 5-node cluster on the
same physical machine sharing the same resources.  It consists of 5 SmartOS
zones on the same global zone.  On my MacBook Pro I saw ~1.5ms per stale
check but that was not under load (I'm honestly not sure whether load
makes a difference, as it didn't seem to on my SmartOS cluster).  I could
probably get to the root of this with DTrace/BTrace, but alas I haven't
bothered.



> It's still not even clear to me how the stale check works (reliably).
> Couldn't the server still close the connection between the stale check
> and the send of data by the client?


The stale check isn't 100% reliable, but it works most of the time.  As you
say, the server could close the socket between the stale check completing and
the request data being sent.  I'm pretty sure Oleg, one of the maintainers, has
said as much but I can't find the original context.

-Z
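
Because the check and the send are separate steps, a caller that reuses pooled connections presumably still has to cope with a socket that turns out to be closed. One possible way, shown here as an assumption rather than anything proposed in this thread, is to retry once on a fresh connection. `StaleRetry` and `Attempt` are hypothetical names, and the retry is only safe for idempotent requests such as search queries.

```java
import java.io.IOException;

/** Hypothetical helper: retry once when a reused connection fails. */
class StaleRetry {
    /** One attempt at sending a request. */
    interface Attempt<T> {
        // reusePooled == true: send on a kept-alive connection from the pool
        // reusePooled == false: open a fresh connection for this request
        T run(boolean reusePooled) throws IOException;
    }

    /**
     * Try a pooled connection first. If that fails with an I/O error
     * (for example because the server closed the socket after any check
     * we made), retry exactly once on a fresh connection. A failure of
     * the second attempt propagates to the caller.
     */
    static <T> T sendWithOneRetry(Attempt<T> attempt) throws IOException {
        try {
            return attempt.run(true);
        } catch (IOException possiblyStale) {
            return attempt.run(false);
        }
    }
}
```

With this shape the common case pays no per-request check, and the cost of a closed socket is one extra attempt.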
