Erick,

I was under the misconception that a solr "transaction" is ACID. From what you said, I guess solr "transactions" are not Isolated.
Thanks,
Phong

On Tue, Apr 12, 2011 at 2:54 PM, Erick Erickson <erickerick...@gmail.com> wrote:

> See below:
>
> On Tue, Apr 12, 2011 at 2:21 PM, Phong Dais <phong.gd...@gmail.com> wrote:
>
> > Erick,
> >
> > My setup is not quite the way you described. I have multiple threads
> > indexing simultaneously, but I only have 1 thread doing the commit
> > after all indexing threads have finished. I have multiple instances
> > of this running, each in its own java vm. I'm ok with throwing out
> > all the docs indexed so far if the commit fails.
>
> But this is really the same thing. On the back end, Solr is piping
> them all into a common index and that is where the autocommit happens.
>
> The fact that it's happening in separate JVMs doesn't alter the
> concept, you should let autocommit handle things. The problem here is
> knowing what hasn't indexed.
>
> > I did not know that the recommended procedure is to use auto commit.
> > I will explore this avenue. I was not aware of the master slave
> > setup either.
> >
> > The first thing that comes to mind is how do I know which docs did
> > not get committed if the auto commit ever fails? What is the
> > recommended procedure for handling failure? Any failed docs will
> > need to be indexed at some point in the future.
>
> Assuming that you have a <uniqueKey> defined, you can look at the logs
> to see failures. Then you can choose to re-index all the documents
> that have changed around that time (backing up as far as you need to
> to be safe). The key here is that you can re-index and the old copy
> (if any) will be replaced by the re-indexed copy.
>
> There's nothing really built into Solr that does this for you, you
> really have to build this part yourself.
>
> Best
> Erick
>
> > Thanks for the valuable inputs.
> >
> > Phong
> >
> > On Tue, Apr 12, 2011 at 9:09 AM, Erick Erickson <erickerick...@gmail.com> wrote:
> >
> > > Sorry, fat fingers. Sent that last e-mail inadvertently.
> > >
> > > Anyway, if I have this correct, I'd recommend going to
> > > autocommit and NOT committing from the clients. That's
> > > usually the recommended procedure.
> > >
> > > This is especially true if you have a master/slave setup,
> > > because each commit from each client will trigger
> > > (potentially) a replication.
> > >
> > > Best
> > > Erick
> > >
> > > On Tue, Apr 12, 2011 at 9:07 AM, Erick Erickson <erickerick...@gmail.com> wrote:
> > >
> > > > If your commit from the client fails, you don't really know the
> > > > state of your index anyway. All the threads you have sending
> > > > documents to Solr are adding them to a single internal buffer.
> > > > Committing flushes that buffer.
> > > >
> > > > So if thread 1 gets an error on commit, it will presumably
> > > > have some documents from thread 2 in the commit. But
> > > > thread 2 won't necessarily see the results. So I don't think
> > > > your statement about needing to know if a commit fails
> > > > is really
> > > >
> > > > On Tue, Apr 12, 2011 at 8:50 AM, Phong Dais <phong.gd...@gmail.com> wrote:
> > > >
> > > >> Hi,
> > > >>
> > > >> I did not want to hijack this thread
> > > >> (http://www.mail-archive.com/solr-user@lucene.apache.org/msg34181.html)
> > > >> but I am experiencing the same exact problem mentioned here.
> > > >>
> > > >> To sum up the issue, I am getting intermittent "Unavailable
> > > >> Service" exceptions during the commit phase of indexing.
> > > >> I know that I am calling commit "very often" but I do not see
> > > >> any way around this. This is my situation: I am indexing a huge
> > > >> number of documents using multiple instances of the SolrJ client
> > > >> running on multiple servers. There is no way for me to control
> > > >> when "commit" is called from these clients, so two different
> > > >> clients can call commit "at the same time".
> > > >> I am not sure if I can/should use auto/timed commit because I
> > > >> need to know if a commit failed so I can roll back the batch
> > > >> that failed.
> > > >>
> > > >> What kind of options do I have?
> > > >> Should I try to catch the exception and keep trying to
> > > >> "recommit" until it goes through? I can see some potential for
> > > >> problems with this approach.
> > > >> Do I need to write a request broker to queue up all these
> > > >> commits and send them to solr one by one in a "timely" manner?
> > > >>
> > > >> Just wanted to know if anyone has a solution for this problem
> > > >> before I dive off the deep end.
> > > >>
> > > >> Thanks,
> > > >> Phong
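
For anyone reading this thread later, here is a rough sketch of the two points Erick makes above: every client adds to one shared buffer that any commit flushes, and re-indexing a document with the same <uniqueKey> replaces the old copy. This is a toy in-memory model, not Solr or SolrJ code. The class name SharedBufferModel, its methods, and the document ids are all made up for illustration, and real Solr internals are certainly more involved.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/**
 * Toy in-memory model (NOT Solr code) of the behaviour described in the thread:
 * every client adds to one shared pending buffer, and any commit flushes all of it.
 */
public class SharedBufferModel {
    // Documents added but not yet committed, keyed by a hypothetical uniqueKey.
    private final Map<String, String> pending = new LinkedHashMap<>();
    // Documents visible to searches, keyed by the same uniqueKey.
    private final Map<String, String> committed = new LinkedHashMap<>();

    /** Any client may add; nothing records which client a document came from. */
    public synchronized void add(String uniqueKey, String body) {
        pending.put(uniqueKey, body);
    }

    /** A commit flushes the whole buffer, whoever asked for it. */
    public synchronized int commit() {
        int flushed = pending.size();
        committed.putAll(pending); // an existing key is overwritten by the new copy
        pending.clear();
        return flushed;
    }

    /** Returns the committed body for a key, or null if it is not visible yet. */
    public synchronized String get(String uniqueKey) {
        return committed.get(uniqueKey);
    }

    /** Number of distinct committed documents. */
    public synchronized int visibleCount() {
        return committed.size();
    }

    public static void main(String[] args) {
        SharedBufferModel index = new SharedBufferModel();
        index.add("doc-1", "from client A");
        index.add("doc-2", "from client B");

        // Client A commits; client B's pending document is flushed as well.
        int flushed = index.commit();
        System.out.println("flushed=" + flushed + " visible=" + index.visibleCount());

        // Re-indexing doc-2 after a suspected failure replaces the old copy.
        index.add("doc-2", "from client B, re-indexed");
        index.commit();
        System.out.println("doc-2=" + index.get("doc-2") + " visible=" + index.visibleCount());
    }
}
```

In this model a commit issued by one client cannot be scoped to that client's batch, which is why a per-client "rollback on failed commit" has nothing to hold on to. Re-sending documents by key is safe because the second copy overwrites the first and does not create a duplicate.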