Re: Availability Issues

2008-10-07 Thread sunnyfr

Hi Matthew,

Can you tell me what you mean by "posting updates to every machine"?
Do you mean snapshot files or an index directory copy?

Thanks a lot Matthew,
Wish you a nice day,


Matthew Runo wrote:
 
 The way I'd do it would be to buy more servers, set up Tomcat on  
 each, and get SOLR replicating from your current machine to the  
 others. Then, throw them all behind a load balancer, and there you go.
 
 You could also post your updates to every machine. Then you don't  
 need to worry about getting replication running.
 
 ++
   | Matthew Runo
   | Zappos Development
   | [EMAIL PROTECTED]
   | 702-943-7833
 ++
 
 
 On Oct 9, 2007, at 7:12 AM, David Whalen wrote:
 
 All:

 How can I break up my install onto more than one box?  We've
 hit a learning curve here and we don't understand how best to
 proceed.  Right now we have everything crammed onto one box
 because we don't know any better.

 So, how would you build it if you could?  Here are the specs:

 a) the index needs to hold at least 25 million articles
 b) the index is constantly updated at a rate of 10,000 articles
 per minute
 c) we need to have faceted queries

 Again, real-world experience is preferred here over book knowledge.
 We've tried to read the docs and it's only made us more confused.

 TIA

 Dave W


 -Original Message-
 From: Yonik Seeley [mailto:[EMAIL PROTECTED]
 Sent: Monday, October 08, 2007 3:42 PM
 To: solr-user@lucene.apache.org
 Subject: Re: Availability Issues

 On 10/8/07, David Whalen [EMAIL PROTECTED] wrote:
 Do you see any requests that took a really long time to finish?

 The requests that take a long time to finish are just
 simple queries.
 And the same queries run at a later time come back much faster.

 Our logs contain 99% inserts and 1% queries.  We are
 constantly adding
 documents to the index at a rate of 10,000 per minute, so the logs
 show mostly that.

 Oh, so you are using the same boxes for updating and querying?
 When you insert, are you using multiple threads?  If so, how many?

 What is the full URL of those slow query requests?
 Do the slow requests start after a commit?

 Start with the thread dump.
 I bet it's multiple queries piling up around some synchronization
 points in lucene (sometimes caused by multiple threads generating
 the same big filter that isn't yet cached).

 What would be my next steps after that?  I'm not sure I'd
 understand
 enough from the dump to make heads-or-tails of it.  Can I
 share that
 here?

 Yes, post it here.  Most likely a majority of the threads
 will be blocked somewhere deep in lucene code, and you will
 probably need help from people here to figure it out.

 -Yonik



 
 
 

-- 
View this message in context: 
http://www.nabble.com/Availability-Issues-tp13102075p19852301.html
Sent from the Solr - User mailing list archive at Nabble.com.



Re: Availability Issues

2008-10-06 Thread sunnyfr

Hi Matthew,

What do you mean by "post your updates"?
Does that mean that you just scp or copy the data directory via a cron job,
without using automatic replication?
I ask because ever since I turned on autoCommit and the snapshooter,
everything has slowed down and gotten a bit messed up.

Did you have the same problem?
Thanks a lot,
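(For reference, the snapshooter sunnyfr mentions is normally wired up in solrconfig.xml as a postCommit listener, roughly as sketched below; the dir path is a placeholder assumption.)

```xml
<!-- Sketch: run the snapshooter script after every commit
     (Solr 1.x collection distribution). Paths are placeholders. -->
<listener event="postCommit" class="solr.RunExecutableListener">
  <str name="exe">snapshooter</str>
  <str name="dir">solr/bin</str>
  <bool name="wait">true</bool>
</listener>
```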


Matthew Runo wrote:
 
 The way I'd do it would be to buy more servers, set up Tomcat on  
 each, and get SOLR replicating from your current machine to the  
 others. Then, throw them all behind a load balancer, and there you go.
 
 You could also post your updates to every machine. Then you don't  
 need to worry about getting replication running.
 
 ++
   | Matthew Runo
   | Zappos Development
   | [EMAIL PROTECTED]
   | 702-943-7833
 ++
 
 
 On Oct 9, 2007, at 7:12 AM, David Whalen wrote:
 
 All:

 How can I break up my install onto more than one box?  We've
 hit a learning curve here and we don't understand how best to
 proceed.  Right now we have everything crammed onto one box
 because we don't know any better.

 So, how would you build it if you could?  Here are the specs:

 a) the index needs to hold at least 25 million articles
 b) the index is constantly updated at a rate of 10,000 articles
 per minute
 c) we need to have faceted queries

 Again, real-world experience is preferred here over book knowledge.
 We've tried to read the docs and it's only made us more confused.

 TIA

 Dave W


 -Original Message-
 From: Yonik Seeley [mailto:[EMAIL PROTECTED]
 Sent: Monday, October 08, 2007 3:42 PM
 To: solr-user@lucene.apache.org
 Subject: Re: Availability Issues

 On 10/8/07, David Whalen [EMAIL PROTECTED] wrote:
 Do you see any requests that took a really long time to finish?

 The requests that take a long time to finish are just
 simple queries.
 And the same queries run at a later time come back much faster.

 Our logs contain 99% inserts and 1% queries.  We are
 constantly adding
 documents to the index at a rate of 10,000 per minute, so the logs
 show mostly that.

 Oh, so you are using the same boxes for updating and querying?
 When you insert, are you using multiple threads?  If so, how many?

 What is the full URL of those slow query requests?
 Do the slow requests start after a commit?

 Start with the thread dump.
 I bet it's multiple queries piling up around some synchronization
 points in lucene (sometimes caused by multiple threads generating
 the same big filter that isn't yet cached).

 What would be my next steps after that?  I'm not sure I'd
 understand
 enough from the dump to make heads-or-tails of it.  Can I
 share that
 here?

 Yes, post it here.  Most likely a majority of the threads
 will be blocked somewhere deep in lucene code, and you will
 probably need help from people here to figure it out.

 -Yonik



 
 
 

-- 
View this message in context: 
http://www.nabble.com/Availability-Issues-tp13102075p19835109.html
Sent from the Solr - User mailing list archive at Nabble.com.



Re: Availability Issues

2007-10-11 Thread Norberto Meijome
On Tue, 9 Oct 2007 10:12:51 -0400
David Whalen [EMAIL PROTECTED] wrote:

 So, how would you build it if you could?  Here are the specs:
 
 a) the index needs to hold at least 25 million articles
 b) the index is constantly updated at a rate of 10,000 articles
 per minute
 c) we need to have faceted queries

Hi David,
Others with more experience than I have given you good answers, so I won't go
there.

One thing you want to consider when you have lots of ongoing updates is
how fast you want your latest changes to show up in your results.

Yes, everyone wants the latest to be live the second it hits the index, but
balancing that against having a responsive search within certain budget (and
maybe architectural?) constraints isn't always that easy.

In all seriousness, not everyone is in a situation where every one of their
users would really need (or benefit hugely from) having each of the 200 docs
posted in the last second show up the millisecond they hit Search. Can they
tell whether it was posted within the last 3, 5 or 10 minutes?

I think that tuning the values for cache warming should yield some good
results. You probably don't want to have all your searches held until your
cache fully warms... or have to warm too often.

I was thinking that you could even split your indexes: keep the latest entries
in a smaller, faster index, and the rest of your 25M docs in another index which
gets updated, say, hourly. But if your 10K/minute are updates to existing docs
(not new ones), then maybe the idea of splitting the index is not that useful...

Anyway, there are many ways to skin a cat :)

good luck,
B
_
{Beto|Norberto|Numard} Meijome

Everything is interesting if you go into it deeply enough
  Richard Feynman

I speak for myself, not my employer. Contents may be hot. Slippery when wet.
Reading disclaimers makes you go blind. Writing them is worse. You have been
Warned.


Re: Availability Issues

2007-10-10 Thread Otis Gospodnetic
Hi,

- Original Message 
From: David Whalen [EMAIL PROTECTED]

On that note -- I've read that Jetty isn't the best servlet
container to use in these situations, is that your experience?

OG: In which situations?  Jetty is great, actually! (the pretty high traffic 
site in my sig runs Jetty)

Otis 

. . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
Simpy -- http://www.simpy.com/  -  Tag  -  Search  -  Share



 -Original Message-
 From: Chris Hostetter [mailto:[EMAIL PROTECTED] 
 Sent: Monday, October 08, 2007 11:20 PM
 To: solr-user
 Subject: RE: Availability Issues
 
 
 : My logs don't look anything like that.  They look like HTTP
 : requests.  Am I looking in the wrong place?
 
 what servlet container are you using?  
 
 every servlet container handles application logs differently 
 -- it's especially tricky because even the format can be 
 changed, the examples i gave before are in the default format 
 you get if you use the jetty setup in the solr example (which 
 logs to stdout), but many servlet containers won't include 
 that much detail by default (they typically leave out the 
 classname and method name).  there's also typically a setting 
 that controls the verbosity -- so in some configurations only 
 the SEVERE messages are logged and in others the INFO 
 messages are logged ... you're going to want at least the 
 INFO level to debug stuff.
 
 grep all the log files you can find for "Solr home set to" 
 ... that's one of the first messages Solr logs.  if you can 
 find that, you'll find the other messages i was talking about.
 
 
 -Hoss
 
 
 





RE: Availability Issues

2007-10-09 Thread David Whalen
Chris:

We're using Jetty also, so I get the sense I'm looking at the
wrong log file.

On that note -- I've read that Jetty isn't the best servlet
container to use in these situations, is that your experience?

Dave


 -Original Message-
 From: Chris Hostetter [mailto:[EMAIL PROTECTED] 
 Sent: Monday, October 08, 2007 11:20 PM
 To: solr-user
 Subject: RE: Availability Issues
 
 
 : My logs don't look anything like that.  They look like HTTP
 : requests.  Am I looking in the wrong place?
 
 what servlet container are you using?  
 
 every servlet container handles application logs differently 
 -- it's especially tricky because even the format can be 
 changed, the examples i gave before are in the default format 
 you get if you use the jetty setup in the solr example (which 
 logs to stdout), but many servlet containers won't include 
 that much detail by default (they typically leave out the 
 classname and method name).  there's also typically a setting 
 that controls the verbosity -- so in some configurations only 
 the SEVERE messages are logged and in others the INFO 
 messages are logged ... you're going to want at least the 
 INFO level to debug stuff.
 
 grep all the log files you can find for "Solr home set to" 
 ... that's one of the first messages Solr logs.  if you can 
 find that, you'll find the other messages i was talking about.
 
 
 -Hoss
 
 
 


RE: Availability Issues

2007-10-09 Thread David Whalen
All:

How can I break up my install onto more than one box?  We've
hit a learning curve here and we don't understand how best to
proceed.  Right now we have everything crammed onto one box
because we don't know any better.

So, how would you build it if you could?  Here are the specs:

a) the index needs to hold at least 25 million articles
b) the index is constantly updated at a rate of 10,000 articles
per minute
c) we need to have faceted queries

Again, real-world experience is preferred here over book knowledge.
We've tried to read the docs and it's only made us more confused.

TIA

Dave W
  

 -Original Message-
 From: Yonik Seeley [mailto:[EMAIL PROTECTED] 
 Sent: Monday, October 08, 2007 3:42 PM
 To: solr-user@lucene.apache.org
 Subject: Re: Availability Issues
 
 On 10/8/07, David Whalen [EMAIL PROTECTED] wrote:
   Do you see any requests that took a really long time to finish?
 
  The requests that take a long time to finish are just 
 simple queries.  
  And the same queries run at a later time come back much faster.
 
  Our logs contain 99% inserts and 1% queries.  We are 
 constantly adding 
  documents to the index at a rate of 10,000 per minute, so the logs 
  show mostly that.
 
 Oh, so you are using the same boxes for updating and querying?
 When you insert, are you using multiple threads?  If so, how many?
 
 What is the full URL of those slow query requests?
 Do the slow requests start after a commit?
 
   Start with the thread dump.
   I bet it's multiple queries piling up around some synchronization 
   points in lucene (sometimes caused by multiple threads generating 
   the same big filter that isn't yet cached).
 
  What would be my next steps after that?  I'm not sure I'd 
 understand 
  enough from the dump to make heads-or-tails of it.  Can I 
 share that 
  here?
 
 Yes, post it here.  Most likely a majority of the threads 
 will be blocked somewhere deep in lucene code, and you will 
 probably need help from people here to figure it out.
 
 -Yonik
 
 


Re: Availability Issues

2007-10-09 Thread Matthew Runo
The way I'd do it would be to buy more servers, set up Tomcat on  
each, and get SOLR replicating from your current machine to the  
others. Then, throw them all behind a load balancer, and there you go.


You could also post your updates to every machine. Then you don't  
need to worry about getting replication running.
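For illustration, the "post your updates to every machine" alternative Matthew describes might look like the following sketch (the host names and helper functions are assumptions, not something from this thread):

```python
# Hypothetical fan-out indexer: send the same Solr XML update body to
# every machine, instead of relying on index replication. Hosts invented.
import urllib.request

SOLR_HOSTS = ["solr1:8983", "solr2:8983", "solr3:8983"]  # assumptions

def update_urls(hosts):
    """Build the /solr/update endpoint URL for each host."""
    return ["http://%s/solr/update" % h for h in hosts]

def post_to_all(hosts, xml_body):
    """POST one <add> (or <commit/>) body to every host. If any POST
    fails it must be retried, or the copies drift out of sync -- the
    bookkeeping you avoid by using replication instead."""
    for url in update_urls(hosts):
        req = urllib.request.Request(url,
                                     data=xml_body.encode("utf-8"),
                                     headers={"Content-Type": "text/xml"})
        urllib.request.urlopen(req).read()
```

The trade-off versus replication: no snapshot scripts to maintain, but the indexing client now owns consistency across all the copies.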


++
 | Matthew Runo
 | Zappos Development
 | [EMAIL PROTECTED]
 | 702-943-7833
++


On Oct 9, 2007, at 7:12 AM, David Whalen wrote:


All:

How can I break up my install onto more than one box?  We've
hit a learning curve here and we don't understand how best to
proceed.  Right now we have everything crammed onto one box
because we don't know any better.

So, how would you build it if you could?  Here are the specs:

a) the index needs to hold at least 25 million articles
b) the index is constantly updated at a rate of 10,000 articles
per minute
c) we need to have faceted queries

Again, real-world experience is preferred here over book knowledge.
We've tried to read the docs and it's only made us more confused.

TIA

Dave W



-Original Message-
From: Yonik Seeley [mailto:[EMAIL PROTECTED]
Sent: Monday, October 08, 2007 3:42 PM
To: solr-user@lucene.apache.org
Subject: Re: Availability Issues

On 10/8/07, David Whalen [EMAIL PROTECTED] wrote:

Do you see any requests that took a really long time to finish?


The requests that take a long time to finish are just

simple queries.

And the same queries run at a later time come back much faster.

Our logs contain 99% inserts and 1% queries.  We are

constantly adding

documents to the index at a rate of 10,000 per minute, so the logs
show mostly that.


Oh, so you are using the same boxes for updating and querying?
When you insert, are you using multiple threads?  If so, how many?

What is the full URL of those slow query requests?
Do the slow requests start after a commit?


Start with the thread dump.
I bet it's multiple queries piling up around some synchronization
points in lucene (sometimes caused by multiple threads generating
the same big filter that isn't yet cached).


What would be my next steps after that?  I'm not sure I'd

understand

enough from the dump to make heads-or-tails of it.  Can I

share that

here?


Yes, post it here.  Most likely a majority of the threads
will be blocked somewhere deep in lucene code, and you will
probably need help from people here to figure it out.

-Yonik








Re: Availability Issues

2007-10-09 Thread Charles Hornberger
I'm about to do a prototype deployment of Solr for a pretty
high-volume site, and I've been following this thread with some
interest.

One thing I want to confirm: It's really possible for Solr to handle a
constant stream of 10K updates/min (150 updates/sec) to a
25M-document index? I knew Solr and Lucene were good, but that seems
like a pretty tall order. From the responses I'm seeing to David
Whalen's inquiries, it seems like people think that's possible.

Thanks,
Charlie

On 10/9/07, Matthew Runo [EMAIL PROTECTED] wrote:
 The way I'd do it would be to buy more servers, set up Tomcat on
 each, and get SOLR replicating from your current machine to the
 others. Then, throw them all behind a load balancer, and there you go.

 You could also post your updates to every machine. Then you don't
 need to worry about getting replication running.

 ++
   | Matthew Runo
   | Zappos Development
   | [EMAIL PROTECTED]
   | 702-943-7833
 ++


 On Oct 9, 2007, at 7:12 AM, David Whalen wrote:

  All:
 
  How can I break up my install onto more than one box?  We've
  hit a learning curve here and we don't understand how best to
  proceed.  Right now we have everything crammed onto one box
  because we don't know any better.
 
  So, how would you build it if you could?  Here are the specs:
 
  a) the index needs to hold at least 25 million articles
  b) the index is constantly updated at a rate of 10,000 articles
  per minute
  c) we need to have faceted queries
 
  Again, real-world experience is preferred here over book knowledge.
  We've tried to read the docs and it's only made us more confused.
 
  TIA
 
  Dave W
 
 
  -Original Message-
  From: Yonik Seeley [mailto:[EMAIL PROTECTED]
  Sent: Monday, October 08, 2007 3:42 PM
  To: solr-user@lucene.apache.org
  Subject: Re: Availability Issues
 
  On 10/8/07, David Whalen [EMAIL PROTECTED] wrote:
  Do you see any requests that took a really long time to finish?
 
  The requests that take a long time to finish are just
  simple queries.
  And the same queries run at a later time come back much faster.
 
  Our logs contain 99% inserts and 1% queries.  We are
  constantly adding
  documents to the index at a rate of 10,000 per minute, so the logs
  show mostly that.
 
  Oh, so you are using the same boxes for updating and querying?
  When you insert, are you using multiple threads?  If so, how many?
 
  What is the full URL of those slow query requests?
  Do the slow requests start after a commit?
 
  Start with the thread dump.
  I bet it's multiple queries piling up around some synchronization
  points in lucene (sometimes caused by multiple threads generating
  the same big filter that isn't yet cached).
 
  What would be my next steps after that?  I'm not sure I'd
  understand
  enough from the dump to make heads-or-tails of it.  Can I
  share that
  here?
 
  Yes, post it here.  Most likely a majority of the threads
  will be blocked somewhere deep in lucene code, and you will
  probably need help from people here to figure it out.
 
  -Yonik
 
 
 




Re: Availability Issues

2007-10-09 Thread Matthew Runo
When we are doing a reindex (1x a day), we post around 150-200  
documents per second, on average. Our index is not as large though,  
about 200k docs. During this import, the search service (with faceted  
page navigation) remains available for front-end searches and  
performance does not noticeably change. You can see this install  
running at http://www.6pm.com, where SOLR is in use for every part of  
the navigation and search.


I believe that a sustained load of 150+ posts per second is very  
possible. At that load though, it does make sense to consider  
multiple machines.


++
 | Matthew Runo
 | Zappos Development
 | [EMAIL PROTECTED]
 | 702-943-7833
++


On Oct 9, 2007, at 10:16 AM, Charles Hornberger wrote:


I'm about to do a prototype deployment of Solr for a pretty
high-volume site, and I've been following this thread with some
interest.

One thing I want to confirm: It's really possible for Solr to handle a
constant stream of 10K updates/min (150 updates/sec) to a
25M-document index? I knew Solr and Lucene were good, but that seems
like a pretty tall order. From the responses I'm seeing to David
Whalen's inquiries, it seems like people think that's possible.

Thanks,
Charlie

On 10/9/07, Matthew Runo [EMAIL PROTECTED] wrote:

The way I'd do it would be to buy more servers, set up Tomcat on
each, and get SOLR replicating from your current machine to the
others. Then, throw them all behind a load balancer, and there you  
go.


You could also post your updates to every machine. Then you don't
need to worry about getting replication running.

++
  | Matthew Runo
  | Zappos Development
  | [EMAIL PROTECTED]
  | 702-943-7833
++


On Oct 9, 2007, at 7:12 AM, David Whalen wrote:


All:

How can I break up my install onto more than one box?  We've
hit a learning curve here and we don't understand how best to
proceed.  Right now we have everything crammed onto one box
because we don't know any better.

So, how would you build it if you could?  Here are the specs:

a) the index needs to hold at least 25 million articles
b) the index is constantly updated at a rate of 10,000 articles
per minute
c) we need to have faceted queries

Again, real-world experience is preferred here over book knowledge.
We've tried to read the docs and it's only made us more confused.

TIA

Dave W



-Original Message-
From: Yonik Seeley [mailto:[EMAIL PROTECTED]
Sent: Monday, October 08, 2007 3:42 PM
To: solr-user@lucene.apache.org
Subject: Re: Availability Issues

On 10/8/07, David Whalen [EMAIL PROTECTED] wrote:

Do you see any requests that took a really long time to finish?


The requests that take a long time to finish are just

simple queries.

And the same queries run at a later time come back much faster.

Our logs contain 99% inserts and 1% queries.  We are

constantly adding

documents to the index at a rate of 10,000 per minute, so the logs
show mostly that.


Oh, so you are using the same boxes for updating and querying?
When you insert, are you using multiple threads?  If so, how many?

What is the full URL of those slow query requests?
Do the slow requests start after a commit?


Start with the thread dump.
I bet it's multiple queries piling up around some synchronization
points in lucene (sometimes caused by multiple threads generating
the same big filter that isn't yet cached).


What would be my next steps after that?  I'm not sure I'd

understand

enough from the dump to make heads-or-tails of it.  Can I

share that

here?


Yes, post it here.  Most likely a majority of the threads
will be blocked somewhere deep in lucene code, and you will
probably need help from people here to figure it out.

-Yonik













RE: Availability Issues

2007-10-09 Thread Chris Hostetter

: We're using Jetty also, so I get the sense I'm looking at the
: wrong log file.

if you are using the jetty configs that come in the solr downloads, it 
writes all of the solr log messages to stdout (ie: when you run it on the 
commandline, the messages come to your terminal).  i don't know off the 
top of my head how to configure Jetty to log application log messages to a 
specific file ... there may be jetty-specific config options for 
controlling this, or jetty may expect you to explicitly set the system 
properties that tell the JVM default log manager what you want it to do...

http://java.sun.com/j2se/1.5.0/docs/guide/logging/overview.html
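As one possible concrete setup (an assumption on my part, not something Hoss prescribes), a java.util.logging properties file passed via -Djava.util.logging.config.file can route Solr's log messages to a file at INFO level:

```properties
# Hypothetical logging.properties; pass it to the JVM with:
#   java -Djava.util.logging.config.file=/path/to/logging.properties -jar start.jar
handlers = java.util.logging.FileHandler
.level = INFO
java.util.logging.FileHandler.pattern = /var/log/solr/solr_%u.log
java.util.logging.FileHandler.level = INFO
java.util.logging.FileHandler.formatter = java.util.logging.SimpleFormatter
```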

: On that note -- I've read that Jetty isn't the best servlet
: container to use in these situations, is that your experience?

i can't make any specific recommendations ... i use Resin because someone 
else at my work did some research and decided it's worth paying for.  From 
what i've seen tomcat seems easier to configure than jetty and i had an 
easier time understanding its docs, but i've never done any performance 
tests.



-Hoss



Re: Availability Issues

2007-10-08 Thread Tom Hill
Hi -

We're definitely not seeing that. What do your logs show? What do your
schema/solrconfig look like?

Tom


On 10/8/07, David Whalen [EMAIL PROTECTED] wrote:

 Hi All.

 I'm seeing all these threads about availability and I'm
 wondering why my situation is so different than others'.

 We're running SOLR 1.2 with a 2.5G heap size.  On any
 given day, the system becomes completely unresponsive.
 We can't even get /solr/admin/ to come up, much less
 any select queries.

 The only thing we can do is kill the SOLR process and
 re-start it.

 We are indexing over 25 million documents and we add
 about as much as we remove daily, so the number remains
 fairly constant.

 Again, it seems like other folks are having a much
 easier time with SOLR than we are.  Can anyone help
 by sharing how you've got it configured?  Does anyone
 have a similar experience?

 TIA.

 DW




RE: Availability Issues

2007-10-08 Thread David Whalen
Hi Tom.

The logs show nothing but regular activity.  We do a tail -f
on the logfile and we can read it during the unresponsive period
and we don't see any errors.

I've attached our schema/config files.  They are pretty much
out-of-the-box values, except for our index.

Dave


 -Original Message-
 From: Tom Hill [mailto:[EMAIL PROTECTED] 
 Sent: Monday, October 08, 2007 2:22 PM
 To: solr-user@lucene.apache.org
 Subject: Re: Availability Issues
 
 Hi -
 
 We're definitely not seeing that. What do your logs show? 
 What do your schema/solrconfig look like?
 
 Tom
 
 
 On 10/8/07, David Whalen [EMAIL PROTECTED] wrote:
 
  Hi All.
 
  I'm seeing all these threads about availability and I'm 
 wondering why 
  my situation is so different than others'.
 
  We're running SOLR 1.2 with a 2.5G heap size.  On any given 
 day, the 
  system becomes completely unresponsive.
  We can't even get /solr/admin/ to come up, much less any select 
  queries.
 
  The only thing we can do is kill the SOLR process and re-start it.
 
  We are indexing over 25 million documents and we add about 
 as much as 
  we remove daily, so the number remains fairly constant.
 
  Again, it seems like other folks are having a much easier time with 
  SOLR than we are.  Can anyone help by sharing how you've got it 
  configured?  Does anyone have a similar experience?
 
  TIA.
 
  DW
 
 
 
 


RE: Availability Issues

2007-10-08 Thread David Whalen
Hi Yonik.

 What version of Solr are you running?

We're running:
Solr Specification Version: 1.2.2007.08.24.08.06.00 
Solr Implementation Version: nightly ${svnversion} - yonik - 2007-08-24 
08:06:00 
Lucene Specification Version: 2.2.0 
Lucene Implementation Version: 2.2.0 548010 - buschmi - 2007-06-16 23:15:56 

 Is the CPU pegged at 100% when it's unresponsive?

It's a little difficult to be sure.  We have an HT (hyper-threaded) box
and the CPU % we get back is misleading.  I think it's safe to say we
may spike up to 100% but we don't necessarily stay pegged there.

 Have you taken a thread dump to see what is going on?

We can't do it b/c during the unresponsive time we can't access
the admin site (/solr/admin) at all.  I don't know how to do a
thread dump via the command line.

 Do you get into a situation where more than one searcher is 
 warming at a time? (there is configuration that can prevent 
 this one from happening).

Forgive me when I say I'm not totally clear on what this 
question means.  The index is constantly getting hit with
a myriad of queries, if that's what you mean.

Thanks,

Dave


  

 -Original Message-
 From: Yonik Seeley [mailto:[EMAIL PROTECTED] 
 Sent: Monday, October 08, 2007 2:23 PM
 To: solr-user@lucene.apache.org
 Subject: Re: Availability Issues
 
 On 10/8/07, David Whalen [EMAIL PROTECTED] wrote:
  We're running SOLR 1.2 with a 2.5G heap size.  On any given 
 day, the 
  system becomes completely unresponsive.
  We can't even get /solr/admin/ to come up, much less any select 
  queries.
 
 What version of Solr are you running?
 The first step to diagnose something like this is to figure 
 out what is going on...
 Is the CPU pegged at 100% when it's unresponsive?
 Have you taken a thread dump to see what is going on?
 Do you get into a situation where more than one searcher is 
 warming at a time? (there is configuration that can prevent 
 this one from happening).
 
 -Yonik
 
 


Re: Availability Issues

2007-10-08 Thread Yonik Seeley
On 10/8/07, David Whalen [EMAIL PROTECTED] wrote:
  Have you taken a thread dump to see what is going on?

 We can't do it b/c during the unresponsive time we can't access
 the admin site (/solr/admin) at all.  I don't know how to do a
 thread dump via the command line

kill -3 pid_of_jvm_running_solr

Start with the thread dump.
I bet it's multiple queries piling up around some synchronization
points in lucene (sometimes caused by multiple threads generating the
same big filter that isn't yet cached).

-Yonik
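[A sketch of the procedure Yonik describes, wrapped in Python for scripting; the start.jar match pattern is an assumption about how Solr was launched.]

```python
# Send SIGQUIT (the signal behind `kill -3`) to the JVM running Solr; a
# HotSpot JVM responds by printing a full thread dump to its own stdout,
# i.e. wherever Solr's console output was redirected. The JVM keeps running.
import os
import signal
import subprocess

def find_solr_pid(pattern="start.jar"):
    """Return the pid of the first process whose command line matches
    `pattern`, or None. Assumes pgrep is available (Linux/BSD)."""
    try:
        out = subprocess.run(["pgrep", "-f", pattern],
                             capture_output=True, text=True).stdout.split()
    except OSError:  # pgrep not installed
        return None
    return int(out[0]) if out else None

def request_thread_dump(pid):
    """Equivalent of `kill -3 <pid>`."""
    os.kill(pid, signal.SIGQUIT)
```

After sending the signal, look for the dump in whatever file Solr's stdout was redirected to, then post the blocked stack traces to the list as Yonik suggests.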


Re: Availability Issues

2007-10-08 Thread Yonik Seeley
On 10/8/07, David Whalen [EMAIL PROTECTED] wrote:
 The logs show nothing but regular activity.  We do a tail -f
 on the logfile and we can read it during the unresponsive period
 and we don't see any errors.

You don't see log entries for requests until after they complete.
When a server becomes unresponsive, try shutting off further traffic
to it, and let it finish whatever requests it's working on (assuming
that's the issue) so you can see them in the log.  Do you see any
requests that took a really long time to finish?

-Yonik


RE: Availability Issues

2007-10-08 Thread David Whalen
Hi Yonik.

 Do you see any requests that took a really long time to finish?

The requests that take a long time to finish are just simple
queries.  And the same queries run at a later time come back
much faster.

Our logs contain 99% inserts and 1% queries.  We are constantly
adding documents to the index at a rate of 10,000 per minute,
so the logs show mostly that.


 Start with the thread dump.
 I bet it's multiple queries piling up around some 
 synchronization points in lucene (sometimes caused by 
 multiple threads generating the same big filter that isn't 
 yet cached).

What would be my next steps after that?  I'm not sure I'd
understand enough from the dump to make heads-or-tails of
it.  Can I share that here?

Dave


 -Original Message-
 From: Yonik Seeley [mailto:[EMAIL PROTECTED] 
 Sent: Monday, October 08, 2007 3:01 PM
 To: solr-user@lucene.apache.org
 Subject: Re: Availability Issues
 
 On 10/8/07, David Whalen [EMAIL PROTECTED] wrote:
  The logs show nothing but regular activity.  We do a tail -f
  on the logfile and we can read it during the unresponsive 
 period and 
  we don't see any errors.
 
 You don't see log entries for requests until after they complete.
 When a server becomes unresponsive, try shutting off further 
 traffic to it, and let it finish whatever requests it's 
 working on (assuming that's the issue) so you can see them in 
 the log.  Do you see any requests that took a really long 
 time to finish?
 
 -Yonik
 
 


Re: Availability Issues

2007-10-08 Thread Yonik Seeley
On 10/8/07, David Whalen [EMAIL PROTECTED] wrote:
  Do you see any requests that took a really long time to finish?

 The requests that take a long time to finish are just simple
 queries.  And the same queries run at a later time come back
 much faster.

 Our logs contain 99% inserts and 1% queries.  We are constantly
 adding documents to the index at a rate of 10,000 per minute,
 so the logs show mostly that.

Oh, so you are using the same boxes for updating and querying?
When you insert, are you using multiple threads?  If so, how many?

What is the full URL of those slow query requests?
Do the slow requests start after a commit?

  Start with the thread dump.
  I bet it's multiple queries piling up around some
  synchronization points in lucene (sometimes caused by
  multiple threads generating the same big filter that isn't
  yet cached).

 What would be my next steps after that?  I'm not sure I'd
 understand enough from the dump to make heads-or-tails of
 it.  Can I share that here?

Yes, post it here.  Most likely a majority of the threads will be
blocked somewhere deep in lucene code, and you will probably need help
from people here to figure it out.

-Yonik


RE: Availability Issues

2007-10-08 Thread David Whalen
 Oh, so you are using the same boxes for updating and querying?

Yep.  We have a MySQL database on the box and we query it and
POST directly into SOLR via wget from a Perl script.  We then
also hit the box for queries.

[We'd be very interested in hearing about best practices on
how to separate out the data from the index and how to balance
them when the inserts outweigh the selects by factors of 50,000:1]

 When you insert, are you using multiple threads?  If so, how many?

We're not threading at all.  We have a Perl script that does a
select statement out of a MySQL database and runs POSTs sequentially
into SOLR, one per document.  After a batch of 10,000 POSTs, we run a
background commit (using waitFlush and waitSearcher).
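[For anyone reading later: that flow is roughly equivalent to the sketch
below, using curl instead of wget.  The host, port, and handler path are
assumptions (the stock Solr example runs on localhost:8983), and Solr 1.x
accepts add/commit XML on /solr/update:]

```shell
# Illustrative version of the sequential update loop described above.
SOLR_URL="http://localhost:8983/solr/update"

post_doc() {
  # one HTTP POST per document, exactly as the Perl script does;
  # $1 is the <add><doc>...</doc></add> XML payload
  curl -s "$SOLR_URL" -H 'Content-Type: text/xml' --data-binary "$1" >/dev/null
}

commit_batch() {
  # waitFlush=false / waitSearcher=false returns before the new searcher
  # is warmed -- the "background commit" mentioned above
  curl -s "$SOLR_URL" -H 'Content-Type: text/xml' \
    --data-binary '<commit waitFlush="false" waitSearcher="false"/>' >/dev/null
}
```

[Batching many docs into a single add request would also cut the HTTP
overhead of 10,000 one-document POSTs considerably.]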

Again, I'd be very grateful for success stories from people in terms
of good server architecture.  We are ready and willing to change versions
of linux, of the Java container, etc.  And we're ready to add more
boxes if that'll help.  We just need some guidance.

 What is the full URL of those slow query requests?

They can be anything.  For example:

[08/10/2007:18:51:55 +] GET 
/solr/select/?q=solr&version=2.2&start=0&rows=10&indent=on HTTP/1.1 200 45799

 Do the slow requests start after a commit?

Based on the way the logs read, you could argue that point.
The stream of POSTs end in the logs and then subsequent queries
take longer to run, but it's hard to be sure there's a direct
correlation.

 Yes, post it here.  Most likely a majority of the threads 
 will be blocked somewhere deep in lucene code, and you will 
 probably need help from people here to figure it out.

Next time it happens I'll shoot it over.
  
--Dave


 -Original Message-
 From: Yonik Seeley [mailto:[EMAIL PROTECTED] 
 Sent: Monday, October 08, 2007 3:42 PM
 To: solr-user@lucene.apache.org
 Subject: Re: Availability Issues
 
 On 10/8/07, David Whalen [EMAIL PROTECTED] wrote:
   Do you see any requests that took a really long time to finish?
 
  The requests that take a long time to finish are just 
 simple queries.  
  And the same queries run at a later time come back much faster.
 
  Our logs contain 99% inserts and 1% queries.  We are 
 constantly adding 
  documents to the index at a rate of 10,000 per minute, so the logs 
  show mostly that.
 
 Oh, so you are using the same boxes for updating and querying?
 When you insert, are you using multiple threads?  If so, how many?
 
 What is the full URL of those slow query requests?
 Do the slow requests start after a commit?
 
   Start with the thread dump.
   I bet it's multiple queries piling up around some synchronization 
   points in lucene (sometimes caused by multiple threads generating 
   the same big filter that isn't yet cached).
 
  What would be my next steps after that?  I'm not sure I'd 
 understand 
  enough from the dump to make heads-or-tails of it.  Can I 
 share that 
  here?
 
 Yes, post it here.  Most likely a majority of the threads 
 will be blocked somewhere deep in lucene code, and you will 
 probably need help from people here to figure it out.
 
 -Yonik
 
 


RE: Availability Issues

2007-10-08 Thread Chris Hostetter
: I've attached our schema/config files.  They are pretty much
: out-of-the-box values, except for our index.

FYI: the mailing list strips most attachments ... the best thing to do is 
just inline them in your mail.

Quick question: do you have autoCommit turned on in your solrconfig.xml?

Second question: do you have autowarming on your caches?
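[For anyone searching the archives: both of Hoss's questions map to
solrconfig.xml settings.  An illustrative fragment with example values only,
not recommendations; maxTime may not exist in older releases, so check the
example solrconfig.xml shipped with your version:]

```xml
<!-- inside <updateHandler class="solr.DirectUpdateHandler2"> :
     commit automatically after N pending docs and/or N milliseconds -->
<autoCommit>
  <maxDocs>10000</maxDocs>
  <maxTime>60000</maxTime>
</autoCommit>

<!-- inside <query> : autowarmCount > 0 pre-populates the new searcher's
     cache from the old one after each commit -->
<filterCache class="solr.LRUCache" size="512"
             initialSize="512" autowarmCount="256"/>
```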



-Hoss



RE: Availability Issues

2007-10-08 Thread Chris Hostetter

:  Do the slow requests start after a commit?
: 
: Based on the way the logs read, you could argue that point.
: The stream of POSTs end in the logs and then subsequent queries
: take longer to run, but it's hard to be sure there's a direct
: correlation.

you would know based on the INFO level messages related to a commit ... 
you'll see messages that look like this when the commit starts...

Oct 8, 2007 1:56:48 PM org.apache.solr.update.DirectUpdateHandler2 commit
INFO: start commit(optimize=false,waitFlush=false,waitSearcher=true)

...then you'll see a message like this...

Oct 8, 2007 1:56:48 PM org.apache.solr.update.DirectUpdateHandler2 commit
INFO: end_commit_flush

...if you have autowarming you'll see a bunch of logs about that, and then 
eventually you'll see a message like this...

Oct 8, 2007 1:56:48 PM org.apache.solr.update.processor.LogUpdateProcessor 
finish
INFO: {commit=} 0 299

...the important question is how many of these hangs or really long 
queries happen in the midst of all that ... how many happen very quickly 
after it (which may indicate not enough warming)

(NOTE: some of those log messages may look different in your nightly 
snapshot version, but the main gist should be the same .. i don't remember 
when exactly the LogUpdateProcessor was added).
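[A quick way to do that correlation is a grep over the log; the marker
strings come from the messages above, while the log path and the QTime
field in request lines are assumptions that depend on your container's
log format:]

```shell
# Print commit markers plus any query taking >= 1000 ms (a 4+ digit QTime)
# so the two can be eyeballed side by side.
slow_and_commits() {
  grep -E 'start commit|end_commit_flush|QTime=[0-9]{4,}' "$1"
}
```

[Usage: slow_and_commits /path/to/solr.log]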




-Hoss



RE: Availability Issues

2007-10-08 Thread David Whalen
Hi Chris.

My logs don't look anything like that.  They look like HTTP
requests.  Am I looking in the wrong place?

Dave


 -Original Message-
 From: Chris Hostetter [mailto:[EMAIL PROTECTED] 
 Sent: Monday, October 08, 2007 5:02 PM
 To: solr-user
 Subject: RE: Availability Issues
 
 
 :  Do the slow requests start after a commit?
 : 
 : Based on the way the logs read, you could argue that point.
 : The stream of POSTs end in the logs and then subsequent queries
 : take longer to run, but it's hard to be sure there's a direct
 : correlation.
 
 you would know based on the INFO level messages related to a 
 commit ... 
 you'll see messages that look like this when the commit starts...
 
 Oct 8, 2007 1:56:48 PM 
 org.apache.solr.update.DirectUpdateHandler2 commit
 INFO: start commit(optimize=false,waitFlush=false,waitSearcher=true)
 
 ...then you'll see a message like this...
 
 Oct 8, 2007 1:56:48 PM 
 org.apache.solr.update.DirectUpdateHandler2 commit
 INFO: end_commit_flush
 
 ...if you have autowarming you'll see a bunch of logs about 
 that, and then eventually you'll see a message like this...
 
 Oct 8, 2007 1:56:48 PM 
 org.apache.solr.update.processor.LogUpdateProcessor finish
 INFO: {commit=} 0 299
 
 ...the important question is how many of these hangs or 
 really long queries happen in the midst of all that ... how 
 many happen very quickly after it (which may indicate not 
 enough warming)
 
 (NOTE: some of those log messages may look different in your 
 nightly snapshot version, but the main gist should be the 
 same .. i don't remember when exactly the LogUpdateProcessor 
 was added).
 
 
 
 
 -Hoss
 
 
 


RE: Availability Issues

2007-10-08 Thread David Whalen
         for partitioning the index, independent of any user selected filtering
         that may also be desired (perhaps as a result of faceted searching).

         NOTE: there is *absolutely* nothing a client can do to prevent these
         "appends" values from being used, so don't use this mechanism
         unless you are sure you always want it.
      -->
    <lst name="appends">
      <str name="fq">inStock:true</str>
    </lst>
    <!-- "invariants" are a way of letting the Solr maintainer lock down
         the options available to Solr clients.  Any params values
         specified here are used regardless of what values may be specified
         in either the query, the "defaults", or the "appends" params.

         In this example, the facet.field and facet.query params are fixed,
         limiting the facets clients can use.  Faceting is not turned on by
         default - but if the client does specify facet=true in the request,
         these are the only facets they will be able to see counts for;
         regardless of what other facet.field or facet.query params they
         may specify.

         NOTE: there is *absolutely* nothing a client can do to prevent these
         "invariants" values from being used, so don't use this mechanism
         unless you are sure you always want it.
      -->
    <lst name="invariants">
      <str name="facet.field">cat</str>
      <str name="facet.field">manu_exact</str>
      <str name="facet.query">price:[* TO 500]</str>
      <str name="facet.query">price:[500 TO *]</str>
    </lst>
  </requestHandler>

  <requestHandler name="instock" class="solr.DisMaxRequestHandler">
    <!-- for legacy reasons, DisMaxRequestHandler will assume all init
         params are "defaults" if you don't explicitly specify any defaults.
      -->
     <str name="fq">
        inStock:true
     </str>
     <str name="qf">
        text^0.5 features^1.0 name^1.2 sku^1.5 id^10.0 manu^1.1 cat^1.4
     </str>
     <str name="mm">
        2&lt;-1 5&lt;-2 6&lt;90%
     </str>
  </requestHandler>

  <!-- queryResponseWriter plugins... query responses will be written using the
       writer specified by the 'wt' request parameter matching the name of a
       registered writer.
       The "standard" writer is the default and will be used if 'wt' is not
       specified in the request. XMLResponseWriter will be used if nothing is
       specified here.
       The json, python, and ruby writers are also available by default.

    <queryResponseWriter name="standard" class="org.apache.solr.request.XMLResponseWriter"/>
    <queryResponseWriter name="json" class="org.apache.solr.request.JSONResponseWriter"/>
    <queryResponseWriter name="python" class="org.apache.solr.request.PythonResponseWriter"/>
    <queryResponseWriter name="ruby" class="org.apache.solr.request.RubyResponseWriter"/>

    <queryResponseWriter name="custom" class="com.example.MyResponseWriter"/>
  -->

  <!-- XSLT response writer transforms the XML output by any xslt file found
       in Solr's conf/xslt directory.  Changes to xslt files are checked for
       every xsltCacheLifetimeSeconds.
    -->
  <queryResponseWriter name="xslt" class="org.apache.solr.request.XSLTResponseWriter">
    <int name="xsltCacheLifetimeSeconds">5</int>
  </queryResponseWriter>

  <!-- config for the admin interface -->
  <admin>
    <defaultQuery>solr</defaultQuery>
    <gettableFiles>solrconfig.xml schema.xml admin-extra.html</gettableFiles>
    <!-- pingQuery should be "URLish" ...
         &amp; separated key=val pairs ... but there shouldn't be any
         URL escaping of the values -->
    <pingQuery>
     qt=dismax&amp;q=solr&amp;start=3&amp;fq=id:[* TO *]&amp;fq=cat:[* TO *]
    </pingQuery>
    <!-- configure a healthcheck file for servers behind a loadbalancer
    <healthcheck type="file">server-enabled</healthcheck>
    -->
  </admin>

</config>



 END CONFIG.XML ===



 

 -Original Message-
 From: Chris Hostetter [mailto:[EMAIL PROTECTED] 
 Sent: Monday, October 08, 2007 4:56 PM
 To: solr-user
 Subject: RE: Availability Issues
 
 : I've attached our schema/config files.  They are pretty much
 : out-of-the-box values, except for our index.
 
 FYI: the mailing list strips most attachments ... the best 
 thing to do is just inline them in your mail.
 
 Quick question: do you have autoCommit turned on in your 
 solrconfig.xml?
 
 Second question: do you have autowarming on your caches?
 
 
 
 -Hoss
 
 
 


Re: Availability Issues

2007-10-08 Thread James liu
...



-- 
regards
jl