Re: Preventing accepting queries while custom QueryComponent starts up?

2012-11-19 Thread Chris Hostetter

: I have several custom QueryComponents that have high one-time startup costs
: (hashing things in the index, caching things from a RDBMS, etc...)

You need to provide more details about how your custom components work -- 
in particular: where in the lifecycle of your components is this 
high startup cost incurred?

: Is there a way to prevent solr from accepting connections before all
: QueryComponents are ready?

Define "ready"? Anything that happens in the init() and inform(SolrCore) 
methods will completely prevent the SolrCore from being available for 
queries until those methods return.

Likewise: if you are using firstSearcher warming queries, then the 
useColdSearcher option in solrconfig.xml can be used to control whether 
external requests will block until the searcher is available -- 
however, this doesn't prevent the servlet container from accepting 
the HTTP connection. But as mentioned, this is where things like the 
PingRequestHandler and the enable/disable commands can be used to take 
servers in and out of rotation with your load balancer -- assuming that 
your load balancer can be configured to monitor the ping URL. 
Alternatively, you can just use native features of your load balancer to 
control this independently of Solr (but the ping handler is a nice way of 
letting one set of dev/ops folks own the Solr servers and control their 
availability even if they don't have the ability to control the load 
balancer itself).
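
A minimal sketch of how the ping URLs for those enable/disable commands might be built, assuming a single-core Solr at a hypothetical localhost:8983 with the ping handler at /solr/admin/ping. The host, port, path, and the `ping_url` helper name are all illustrative assumptions. As far as I know, the enable/disable actions only work when the handler is configured with a health-check file.

```python
from urllib.parse import urlencode, urlunsplit

# Hypothetical defaults: adjust scheme, host:port, and handler path
# to match your own deployment.
DEFAULT_BASE = ("http", "localhost:8983", "/solr/admin/ping")


def ping_url(action=None, base=DEFAULT_BASE):
    """Build a PingRequestHandler URL.

    action=None is a plain health check; "enable" and "disable" toggle
    the health-check state; "status" reports that state.
    """
    allowed = {None, "enable", "disable", "status"}
    if action not in allowed:
        raise ValueError(f"unsupported ping action: {action!r}")
    scheme, netloc, path = base
    query = urlencode({"action": action}) if action else ""
    return urlunsplit((scheme, netloc, path, query, ""))
```

The load balancer would monitor `ping_url()`, while whoever owns the Solr servers would call `ping_url("disable")` and `ping_url("enable")` around maintenance.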


-Hoss


Re: Preventing accepting queries while custom QueryComponent starts up?

2012-11-11 Thread Jack Krupansky
Is the issue here that the Solr node is continuously live with the load 
balancer so that the moment during startup that Solr can respond to 
anything, the load balancer will be sending it traffic and that this can 
occur while Solr is still warming up?


First, shouldn't we be encouraging people to have an app layer between Solr 
and the outside world? If so, the app layer should simply not respond to 
traffic until the app layer can verify that Solr has stabilized. If not, 
then maybe we do need to suggest a change to Solr so that the developer can 
control exactly when Solr becomes live and responsive to incoming traffic.


At a minimum, we should document when that moment is today in terms of an 
explicit contract. It sounds like the problem is that the contract is either 
nonexistent, vague, ambiguous, non-deterministic, or whatever.


-- Jack Krupansky


Re: Preventing accepting queries while custom QueryComponent starts up?

2012-11-11 Thread Amit Nithian
Jack,

I think the issue is that the ping, which is used to determine whether or
not the server is live, returns a seemingly false positive to the load
balancer (and indirectly the client) saying that this server is ready to go
when in fact it's not. Reading this page
(http://wiki.apache.org/solr/SolrConfigXml), this behavior does seem to be
documented, but the advice to hide your Solr behind a load balancer may not
be stressed enough. I am more than happy to write up a post that, in my
opinion at least, stresses some best practices on the use of Solr based on my
experience, if others would find this useful.

What seems odd here is that the ping is a query so maybe the ping query in
the solrconfig (for Aaron and others having this) should be configured to
hit the handler that is used by the front end app so that while that
handler is warming up the ping query will be blocked.

Of course using the load balancer means that the app layer knows nothing
about servers in and out of rotation.

Cheers!
Amit



Re: Preventing accepting queries while custom QueryComponent starts up?

2012-11-10 Thread Erick Erickson
Why does it matter? The whole idea of firstSearcher queries is to warm up
your system as fast as possible. The theory is that upon restarting the
server, let's get this stuff going immediately... They were never intended
(as far as I know) to complete before any queries were handled. As an
aside, I'm not quite sure I understand why pings during the warmup are a
problem.

But anyway. firstSearcher is particularly relevant because the
autowarmCount settings on your caches are irrelevant when starting the
server: there's no history to autowarm.

But there's no good reason _not_ to let queries through while
firstSearcher is doing its tricks; they just get into the queue and are
served as quickly as they may. That might be some time since, as you say,
they may not get serviced until the expensive parts get filled. But I don't
think having them be serviced is doing any harm.

Now, newSearcher and autowarming of the caches is a completely different
beast, since having the old searchers continue serving requests until the
warmups finish _does_ directly impact the user: they don't see random slowness
because a searcher is being opened.

So I guess my real question is whether you're seeing a measurable problem
or if this is a red herring.

FWIW,
Erick





Re: Preventing accepting queries while custom QueryComponent starts up?

2012-11-10 Thread Amit Nithian
Yes, but the problem is that if user-facing queries are hitting a server
that is warming up and aren't being serviced quickly, then you could
potentially bring down your site if all the front-end threads are blocked
on Solr queries, because those queries are waiting (presumably at the container
level, since the filter hasn't finished its init() sequence) for the warming
to complete (this is especially notorious when your front end is Rails).
This is why the ping used to enable/disable a server in the load balancer has
to be accurate with regard to whether or not a server is truly ready and
warm.

I think what I am gathering from this discussion is that the server is
warming up, the ping goes through and tells the load balancer this
server is ready, and user queries then hit this server and are queued
waiting for the firstSearcher to finish. If those initial user queries are
expected to respond in 500-1000ms, that's terrible for performance.

Put another way, if you have a bunch of servers behind a load balancer, you
want to be reasonably sure that user queries sent to this one server (or block
of servers, depending on your deployment) will return in a decent time (whatever
you define decent to be), which is why this matters.

Let me know if I am missing anything.

Thanks
Amit


 



Re: Preventing accepting queries while custom QueryComponent starts up?

2012-11-10 Thread Erick Erickson
Hmmm, rather than hit the ping query, why not just send in a real query and
only let the queued ones through after the response?
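
A rough sketch of that idea, assuming a gate that keeps retrying a real query and only reports the server as warm once one comes back quickly enough. The `wait_until_warm` name, the latency threshold, and the retry counts are all hypothetical; `run_query` stands in for whatever representative query your front end would actually send.

```python
import time


def wait_until_warm(run_query, max_latency=0.5, attempts=10, pause=1.0,
                    clock=time.monotonic, sleep=time.sleep):
    """Return True once run_query() succeeds within max_latency seconds.

    run_query is any zero-argument callable that issues a real query and
    raises on failure. clock and sleep are injectable so the gate can be
    exercised without real waiting.
    """
    for _ in range(attempts):
        start = clock()
        try:
            run_query()
        except Exception:
            elapsed = None  # the query failed, so the server is not warm yet
        else:
            elapsed = clock() - start
        if elapsed is not None and elapsed <= max_latency:
            return True
        sleep(pause)
    return False
```

Only after this returns True would the queued traffic be released, or the server be put back in rotation.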

Just a random thought
Erick





Preventing accepting queries while custom QueryComponent starts up?

2012-11-08 Thread Aaron Daubman
Greetings,

I have several custom QueryComponents that have high one-time startup costs
(hashing things in the index, caching things from a RDBMS, etc...)

Is there a way to prevent solr from accepting connections before all
QueryComponents are ready?

Especially, since many of our instance are load-balanced (and
added-in/removed automatically based on admin/ping responses) preventing
ping from answering prior to all custom QueryComponents being ready would
be ideal...

Thanks,
 Aaron


Re: Preventing accepting queries while custom QueryComponent starts up?

2012-11-08 Thread Amit Nithian
I think Solr does this by default. Are you executing warming queries in
the firstSearcher so that these actions are done before Solr is ready to
accept real queries?





Re: Preventing accepting queries while custom QueryComponent starts up?

2012-11-08 Thread Aaron Daubman
Amit,

I am using warming/firstSearcher queries to ensure this happens before any
external queries are received. However, unless I am misinterpreting the
logs, Solr starts responding to admin/ping requests before firstSearcher
completes, the LB then puts the Solr instance back in the pool, and it
starts accepting connections...


 



Re: Preventing accepting queries while custom QueryComponent starts up?

2012-11-08 Thread Amit Nithian
Sorry, I misunderstood. I am having difficulty finding documentation on this;
the exact load order is never clear. It seems odd that you'd be getting
requests when the filter (DispatchFilter) hasn't 100% loaded yet.

I didn't think that the admin handler would allow requests while the
dispatch filter is still init'ing, but it sounds like it does? I'll have to
play with this to see. I'm curious what the problem is, since we have a
similar setup but not as bad of an init problem. Also, to avoid this problem,
my deploy script runs some actual simple test queries when I deploy, to ensure
they return before enabling the ping handler to return 200s.

Cheers
Amit





Re: Preventing accepting queries while custom QueryComponent starts up?

2012-11-08 Thread Aaron Daubman
  (plus when I deploy, my deploy script
 runs some actual simple test queries to ensure they return before enabling
 the ping handler to return 200s) to avoid this problem.


What are you doing to programmatically disable/enable the ping handler?
This sounds like exactly what I should be doing as well...


Re: Preventing accepting queries while custom QueryComponent starts up?

2012-11-08 Thread Amit Nithian
Hi Aaron,

Check out
http://lucene.apache.org/solr/api-4_0_0-BETA/org/apache/solr/handler/PingRequestHandler.html
You'll see the ?action=enable/disable. I have our load balancers take the
server out of rotation when the response code != 200 for some number of
times in a row, which I suspect you are doing too. When I do a rolling
release of our search code to production, the script disables the ping, sleeps
for some known number of seconds so the LB can yank the search server out of
rotation, pushes the code, executes some queries using curl to ensure a
response (the warming process should block the request until done), and then
enables the ping.
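
The sequence above can be sketched roughly as follows. This is an illustration under stated assumptions, not the actual deploy script: `rolling_release`, `http_status`, `push_code`, and the 30-second drain are hypothetical names and values, and the URLs passed in are whatever your deployment uses.

```python
import time
import urllib.request


def http_status(url, timeout=30):
    """Return the HTTP status code for a GET (urlopen raises on 4xx/5xx)."""
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return response.status


def rolling_release(ping_base, test_urls, push_code, drain_seconds=30,
                    fetch=http_status, sleep=time.sleep):
    """Disable ping, wait for the LB to drain, deploy, verify, re-enable."""
    fetch(ping_base + "?action=disable")  # LB health checks start failing
    sleep(drain_seconds)                  # give the LB time to drop the node
    push_code()                           # hypothetical deploy/restart step
    for url in test_urls:                 # real queries should block until warm
        if fetch(url) != 200:
            raise RuntimeError(f"test query failed: {url}")
    fetch(ping_base + "?action=enable")   # put the node back in rotation
```

If a test query fails, the function raises before re-enabling the ping, so the server stays out of rotation. Retrying while the container is still coming up is left out of the sketch.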

HTH!
Amit

