Erick,

I was under the misconception that a Solr "transaction" is ACID.
From what you said, I guess Solr "transactions" are not isolated.

Thanks,
Phong

On Tue, Apr 12, 2011 at 2:54 PM, Erick Erickson <erickerick...@gmail.com> wrote:

> See below:
>
> On Tue, Apr 12, 2011 at 2:21 PM, Phong Dais <phong.gd...@gmail.com> wrote:
>
> > Erick,
> >
> > My setup is not quite the way you described.  I have multiple threads
> > indexing simultaneously, but I only have 1 thread doing the commit after
> > all indexing threads have finished.  I have multiple instances of this
> > running, each in its own Java VM.  I'm OK with throwing out all the docs
> > indexed so far if the commit fails.
> >
> >
> But this is really the same thing. On the back end, Solr is piping them all
> into a common index, and that is where the autocommit happens.
>
> The fact that it's happening in separate JVMs doesn't alter the concept; you
> should let autocommit handle things. The problem here is knowing what hasn't
> been indexed.
>
>
> > I did not know that the recommended procedure is to use autocommit.  I
> > will explore this avenue.  I was not aware of the master/slave setup
> > either.
> >
> > The first thing that comes to mind is: how do I know which docs did not
> > get committed if the autocommit ever fails?  What is the recommended
> > procedure for handling failure?  Any failed docs will need to be indexed
> > at some point in the future.
> >
>
> Assuming that you have a <uniqueKey> defined, you can look at the logs to
> see failures. Then you can choose to re-index all the documents that have
> changed around that time (backing up as far as you need to in order to be
> safe). The key here is that you can re-index and the old copy (if any) will
> be replaced by the re-indexed copy.
>
> There's nothing really built into Solr that does this for you; you really
> have to build this part yourself.
>
> Best
> Erick
>
>
> >
> > Thanks for the valuable inputs.
> >
> > Phong
> >
> >
> > On Tue, Apr 12, 2011 at 9:09 AM, Erick Erickson <erickerick...@gmail.com> wrote:
> >
> > > Sorry, fat fingers. Sent that last e-mail inadvertently.
> > >
> > > Anyway, if I have this correct, I'd recommend going to
> > > autocommit and NOT committing from the clients. That's
> > > usually the recommended procedure.
> > >
> > > This is especially true if you have a master/slave setup,
> > > because each commit from each client will trigger
> > > (potentially) a replication.
> > >
> > > Best
> > > Erick
> > >
> > > On Tue, Apr 12, 2011 at 9:07 AM, Erick Erickson <erickerick...@gmail.com> wrote:
> > >
> > > > If your commit from the client fails, you don't really know the
> > > > state of your index anyway. All the threads you have sending
> > > > documents to Solr are adding them to a single internal buffer.
> > > > Committing flushes that buffer.
> > > >
> > > > So if thread 1 gets an error on commit, it will presumably
> > > > have some documents from thread 2 in the commit. But
> > > > thread 2 won't necessarily see the results. So I don't think
> > > > your statement about needing to know if a commit fails
> > > > is really
> > > >
> > > >
> > > > On Tue, Apr 12, 2011 at 8:50 AM, Phong Dais <phong.gd...@gmail.com> wrote:
> > > >
> > > >> Hi,
> > > >>
> > > >> I did not want to hijack this thread
> > > >> (http://www.mail-archive.com/solr-user@lucene.apache.org/msg34181.html)
> > > >> but I am experiencing the same exact problem mentioned here.
> > > >>
> > > >> To sum up the issue, I am getting intermittent "Unavailable Service"
> > > >> exceptions during the commit phase of indexing.
> > > >> I know that I am calling commit "very often" but I do not see any way
> > > >> around this.  This is my situation: I am indexing a huge number of
> > > >> documents using multiple instances of the SolrJ client running on
> > > >> multiple servers.  There is no way for me to control when "commit" is
> > > >> called from these clients, so two different clients can call commit
> > > >> "at the same time".
> > > >> I am not sure if I can/should use auto/timed commit because I need to
> > > >> know if a commit failed so I can roll back the batch that failed.
> > > >>
> > > >> What kind of options do I have?
> > > >> Should I try to catch the exception and keep trying to "recommit"
> > > >> until it goes through?  I can see some potential for problems with
> > > >> this approach.
> > > >> Do I need to write a request broker to queue up all these commits and
> > > >> send them to Solr one by one in a "timely" manner?
> > > >>
> > > >> Just wanted to know if anyone has a solution for this problem before
> > > >> I dive off the deep end.
> > > >>
> > > >> Thanks,
> > > >> Phong
> > > >>
> > > >
> > > >
> > >
> >
>
