Re: SolrException: Unavailable Service

2011-04-13 Thread Phong Dais
Erick,

I was under the misconception that Solr transactions are ACID.
From what you said, I guess Solr transactions are not isolated.

Thanks,
Phong



SolrException: Unavailable Service

2011-04-12 Thread Phong Dais
Hi,

I did not want to hijack this thread
(http://www.mail-archive.com/solr-user@lucene.apache.org/msg34181.html)
but I am experiencing exactly the same problem mentioned there.

To sum up the issue, I am getting intermittent "Unavailable Service"
exceptions during the commit phase of indexing.
I know that I am calling commit very often, but I do not see any way around
this.  My situation is this: I am indexing a huge number of documents using
multiple instances of the SolrJ client running on multiple servers.  There is
no way for me to control when commit is called from these clients, so two
different clients can call commit at the same time.
I am not sure whether I can or should use auto/timed commit, because I need
to know if a commit failed so that I can roll back the batch that failed.

What kind of options do I have?
Should I try to catch the exception and keep retrying the commit until it
goes through?  I can see some potential for problems with this approach.
Do I need to write a request broker to queue up all these commits and send
them to Solr one by one in a timely manner?

Just wanted to know if anyone has a solution for this problem before I dive
off the deep end.

Thanks,
Phong
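
For readers wondering what the "request broker" idea in this message might
look like, here is a minimal sketch using only the Java standard library.
Every commit request passes through a single-threaded queue, so no two
commits can overlap. CommitBroker and the Runnable commit action are
hypothetical names invented for this illustration; the Runnable stands in
for whatever real commit call the client makes and is not part of Solr or
SolrJ.

```java
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical broker: serializes commit requests coming from many threads.
final class CommitBroker {
    // One worker thread means queued commits run strictly one at a time.
    private final ExecutorService queue = Executors.newSingleThreadExecutor(r -> {
        Thread t = new Thread(r, "commit-broker");
        t.setDaemon(true); // do not keep the JVM alive just for the broker
        return t;
    });
    private final Runnable commitAction; // assumption: stand-in for the real commit
    private final AtomicInteger active = new AtomicInteger();
    private final AtomicInteger maxActive = new AtomicInteger();
    private final AtomicInteger completed = new AtomicInteger();

    CommitBroker(Runnable commitAction) {
        this.commitAction = commitAction;
    }

    // Queues one commit and blocks until it has run. A failure inside the
    // commit action is rethrown to the caller as an unchecked exception.
    void commitAndWait() {
        try {
            queue.submit(() -> {
                int now = active.incrementAndGet();
                maxActive.accumulateAndGet(now, Math::max);
                try {
                    commitAction.run();
                    completed.incrementAndGet();
                } finally {
                    active.decrementAndGet();
                }
            }).get();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting for commit", e);
        } catch (ExecutionException e) {
            throw new IllegalStateException("commit failed", e.getCause());
        }
    }

    int completedCommits() { return completed.get(); }

    // Highest number of commits observed running at once (never above 1 here).
    int maxConcurrentCommits() { return maxActive.get(); }

    void shutdown() { queue.shutdown(); }
}
```

Each indexing client would call commitAndWait() instead of committing
directly. Note that this only removes overlapping commits; it does not tell
a client which documents a given commit covered, which is the point the
replies in this thread go on to make.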


Re: SolrException: Unavailable Service

2011-04-12 Thread Erick Erickson
If your commit from the client fails, you don't really know the
state of your index anyway. All the threads you have sending
documents to Solr are adding them to a single internal buffer.
Committing flushes that buffer.
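
The shared-buffer behaviour described here can be pictured with a toy
model. This is only a sketch of the idea as stated in this message, not
Solr's actual implementation; SharedPendingBuffer and its methods are
hypothetical names made up for the illustration.

```java
import java.util.ArrayList;
import java.util.List;

// Toy model: every client adds to ONE pending buffer, and a commit issued
// by any client flushes everything in it, whoever sent the documents.
final class SharedPendingBuffer {
    private final List<String> pending = new ArrayList<>();
    private final List<String> committed = new ArrayList<>();

    // Records a document as pending, tagged with the client that sent it.
    synchronized void add(String clientId, String docId) {
        pending.add(clientId + ":" + docId);
    }

    // Flushes the whole buffer and returns what this commit covered.
    synchronized List<String> commit() {
        List<String> flushed = new ArrayList<>(pending);
        committed.addAll(flushed);
        pending.clear();
        return flushed;
    }

    synchronized int pendingCount() { return pending.size(); }

    synchronized int committedCount() { return committed.size(); }
}
```

If thread 1 and thread 2 both add documents and thread 1 then commits, the
list returned by commit() contains thread 2's documents as well, so a
commit result cannot be attributed to one client's batch.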

So if thread 1 gets an error on commit, it will presumably
have some documents from thread 2 in the commit. But
thread 2 won't necessarily see the results. So I don't think
your statement about needing to know if a commit fails
is really


Re: SolrException: Unavailable Service

2011-04-12 Thread Erick Erickson
Sorry, fat fingers. Sent that last e-mail inadvertently.

Anyway, if I have this correct, I'd recommend going to
autocommit and NOT committing from the clients. That's
usually the recommended procedure.

This is especially true if you have a master/slave setup,
because each commit from each client will trigger
(potentially) a replication.

Best
Erick
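
To illustrate why autocommit reduces the number of commits, here is a toy
model in plain Java. It assumes a document-count trigger only (Solr's
autoCommit settings also include a time-based trigger, which is not
modelled), and AutoCommitModel is a hypothetical class for this sketch, not
Solr code.

```java
// Toy model of server-side autocommit: clients only add documents, and the
// server commits on its own once enough documents are pending.
final class AutoCommitModel {
    private final int maxDocs; // assumption: commit after this many pending docs
    private int pending;
    private int commits;

    AutoCommitModel(int maxDocs) {
        if (maxDocs <= 0) {
            throw new IllegalArgumentException("maxDocs must be positive");
        }
        this.maxDocs = maxDocs;
    }

    // Called by any client for each document; no client ever calls commit.
    synchronized void addDocument() {
        pending++;
        if (pending >= maxDocs) {
            commits++;   // one commit covers documents from all clients
            pending = 0;
        }
    }

    synchronized int commitCount() { return commits; }

    synchronized int pendingCount() { return pending; }
}
```

With maxDocs set to 10, adding 25 documents from any mix of clients yields
2 commits and leaves 5 documents pending, instead of one commit per client
batch.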


Re: SolrException: Unavailable Service

2011-04-12 Thread Phong Dais
Erick,

My setup is not quite the way you described.  I have multiple threads
indexing simultaneously, but only one thread does the commit, after all
indexing threads have finished.  I have multiple instances of this running,
each in its own Java VM.  I'm OK with throwing out all the docs indexed so
far if the commit fails.

I did not know that the recommended procedure is to use autocommit.  I will
explore this avenue.  I was not aware of the master/slave setup either.

The first thing that comes to mind is: how do I know which docs did not get
committed if the autocommit ever fails?  What is the recommended procedure
for handling failure?  Any failed docs will need to be indexed at some point
in the future.

Thanks for the valuable inputs.

Phong



Re: SolrException: Unavailable Service

2011-04-12 Thread Erick Erickson
See below:

On Tue, Apr 12, 2011 at 2:21 PM, Phong Dais phong.gd...@gmail.com wrote:

 Erick,

 My setup is not quite the way you described.  I have multiple threads
 indexing simultaneously, but only one thread does the commit, after all
 indexing threads have finished.  I have multiple instances of this running,
 each in its own Java VM.  I'm OK with throwing out all the docs indexed so
 far if the commit fails.


But this is really the same thing. On the back end, Solr is piping them all
into a common index, and that is where the autocommit happens.

The fact that it's happening in separate JVMs doesn't alter the concept; you
should let autocommit handle things. The problem here is knowing what hasn't
been indexed.


 I did not know that the recommended procedure is to use autocommit.  I will
 explore this avenue.  I was not aware of the master/slave setup either.

 The first thing that comes to mind is: how do I know which docs did not get
 committed if the autocommit ever fails?  What is the recommended procedure
 for handling failure?  Any failed docs will need to be indexed at some point
 in the future.


Assuming that you have a uniqueKey defined, you can look at the logs to
see failures. Then you can choose to re-index all the documents that have
changed around that time (backing up as far as you need to in order to be
safe). The key here is that you can re-index and the old copy (if any) will
be replaced by the re-indexed copy.

There's nothing really built into Solr that does this for you; you really
have to build this part yourself.

Best
Erick
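
The recovery approach described in this reply could be sketched roughly as
follows. This is a stdlib-only illustration under stated assumptions:
SourceDoc, ToyIndex, and the modifiedAt timestamp are hypothetical, and the
map stands in for an index with a uniqueKey, where adding a document whose
key already exists replaces the old copy.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

final class ReindexSketch {
    // Hypothetical source record: id plays the role of the uniqueKey and
    // modifiedAt is a last-modified time in epoch milliseconds.
    static final class SourceDoc {
        final String id;
        final String body;
        final long modifiedAt;

        SourceDoc(String id, String body, long modifiedAt) {
            this.id = id;
            this.body = body;
            this.modifiedAt = modifiedAt;
        }
    }

    // Stand-in for the index: adding an existing id overwrites the old copy.
    static final class ToyIndex {
        private final Map<String, String> docs = new LinkedHashMap<>();

        void add(SourceDoc doc) { docs.put(doc.id, doc.body); }

        int size() { return docs.size(); }

        String get(String id) { return docs.get(id); }
    }

    // Selects everything modified at or after (failureTime - safetyMarginMs).
    static List<SourceDoc> changedSince(List<SourceDoc> source,
                                        long failureTime, long safetyMarginMs) {
        long cutoff = failureTime - safetyMarginMs;
        List<SourceDoc> batch = new ArrayList<>();
        for (SourceDoc doc : source) {
            if (doc.modifiedAt >= cutoff) {
                batch.add(doc);
            }
        }
        return batch;
    }

    // Re-adds the selected documents and returns how many were sent.
    static int reindex(ToyIndex index, List<SourceDoc> source,
                       long failureTime, long safetyMarginMs) {
        List<SourceDoc> batch = changedSince(source, failureTime, safetyMarginMs);
        for (SourceDoc doc : batch) {
            index.add(doc);
        }
        return batch.size();
    }
}
```

Because a re-added document replaces its earlier copy, choosing a generous
safety margin costs extra indexing work but does not create duplicates.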


