> Ales, is this error happening after a node failure?

No node failure that I'm aware of.

We did get some unexpected NPEs in the DataNucleus framework but, IMO, that shouldn't completely kill the app. We'll re-try, and then also re-try with no locking.

> Or make something clever based on JGroups views
>
> default.locking_strategy = fully.qualified.custom.Implementation

@Bela, Sanne: what would this look like? This looks like the best workaround for now -- if this is really the issue.

-Ales
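For illustration only, a minimal sketch of what such a view-based strategy could look like, assuming Hibernate Search 4.x's LockFactoryProvider SPI and the JGroups 3.x view API. The getChannel() hook is hypothetical -- the hard part would be getting hold of the same channel the backend uses -- so this is an untested sketch, not an implementation from this thread:

import java.io.File;
import java.io.IOException;
import java.util.Properties;

import org.apache.lucene.store.Lock;
import org.apache.lucene.store.LockFactory;
import org.hibernate.search.store.LockFactoryProvider;
import org.jgroups.Address;
import org.jgroups.JChannel;

public class JGroupsViewLockFactory implements LockFactoryProvider {

    @Override
    public LockFactory createLockFactory(File indexDir, Properties properties) {
        return new LockFactory() {
            @Override
            public Lock makeLock(String lockName) {
                return new Lock() {
                    @Override
                    public boolean obtain() throws IOException {
                        // Grant the lock only to the current view coordinator;
                        // JGroups views are totally ordered, so all members
                        // agree on who that is.
                        return isCoordinator();
                    }

                    @Override
                    public void release() {
                        // Nothing to release: cluster membership is the lock.
                    }

                    @Override
                    public boolean isLocked() {
                        return !isCoordinator();
                    }
                };
            }

            @Override
            public void clearLock(String lockName) {
                // No stale on-disk or in-grid state to force-clean.
            }
        };
    }

    private boolean isCoordinator() {
        JChannel channel = getChannel();
        Address self = channel.getAddress();
        Address coordinator = channel.getView().getMembers().get(0);
        return self.equals(coordinator);
    }

    // Hypothetical accessor: Hibernate Search offers no such hook here;
    // it would have to be wired to the backend's own channel somehow.
    private JChannel getChannel() {
        throw new UnsupportedOperationException("wire to the backend's channel");
    }
}

Enabling it would then just be a matter of setting default.locking_strategy to the fully qualified name of the provider, per the option quoted above.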
On Apr 8, 2013, at 1:19 PM, Sanne Grinovero <sa...@infinispan.org> wrote:

> There should be no locking contention at all; that is the whole point
> of using such a backend and forwarding changes to a single node: only
> a single node ever attempts to acquire this lock. Hence the error is a
> symptom of some previous error; I primarily suspect cluster view
> stability.
>
> I have indeed committed an experimental alternative backend in
> Infinispan Query (included in master) and another one in Hibernate
> Search (not in master, but on a branch I'm working on).
> The one in Hibernate Search is meant to supersede the others, but it's
> not working yet, as I need CAS to be working in Infinispan, and that
> is still failing my tests.
>
> The result of a failing CAS shows up in the master election: multiple
> nodes elect themselves, which results in the locking error.
>
> Ales, is this error happening after a node failure? AFAIK the missing
> feature of the JGroups-based backend is that it doesn't clean up stale
> index locks when a master fails; each master node releases the lock as
> soon as possible (as you have set exclusive_index_use=false), but if
> the node is disconnected exactly during a write operation, the lock
> will need to be cleaned up forcefully. I would normally expect this to
> be very unlikely, but it could be triggered if you have view stability
> problems.
> We could try integrating some kind of forced lock-cleanup operation,
> but it's quite tricky to make sure this happens safely... there is of
> course a purpose for this lock.
>
> You could try turning off the seatbelt by setting
>
> default.locking_strategy = none
>
> Or make something clever based on JGroups views
>
> default.locking_strategy = fully.qualified.custom.Implementation
>
> I do have a cleaner solution with proper lock-cleanup routines, but
> these are based on the CAS operation too... they are failing stress
> tests, so I won't commit them yet.
>
> Sanne
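For readers following along: the CAS-based election Sanne describes reduces to conditional writes on a shared key. A rough sketch against Infinispan's ConcurrentMap API, with the key name and String node ids purely illustrative -- if two nodes can both see their conditional write succeed, both elect themselves, which is exactly the multi-master symptom described above:

import org.infinispan.Cache;

public class CasMasterElection {

    // Illustrative key; any replicated cache entry naming the master works.
    private static final String MASTER_KEY = "index.master";

    // Try to claim mastership. Exactly one node's putIfAbsent should
    // succeed cluster-wide -- but only if the conditional write really is
    // atomic, which is the CAS guarantee said to still be failing tests.
    public static String electMaster(Cache<String, String> cache, String self) {
        String existing = cache.putIfAbsent(MASTER_KEY, self);
        return existing == null ? self : existing;
    }

    // After a master failure, any node may attempt a takeover; replace()
    // is the CAS and must succeed on at most one node.
    public static boolean takeOver(Cache<String, String> cache,
                                   String deadMaster, String self) {
        return cache.replace(MASTER_KEY, deadMaster, self);
    }
}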
> On 8 April 2013 11:38, Manik Surtani <msurt...@redhat.com> wrote:
>>
>> On 8 Apr 2013, at 11:28, Ales Justin <ales.jus...@gmail.com> wrote:
>>
>> This "jgroups" backend was there "long" ago,
>> and it was actually us - CD - who fixed it and made use of it.
>> It's no different from the static JGroups backend; the only difference
>> is that this one elects the master automatically.
>>
>> I can change to Sanne's new Infinispan-based prototype if it will help.
>>
>> But - with my limited cluster knowledge - the issue doesn't look to be
>> there. I mean, the messages do get properly routed to the indexing
>> master, which just cannot handle the locking contention.
>>
>> Any thoughts on this, Sanne?
>>
>> -Ales
>>
>> I believe this new backend is WIP in Hibernate Search. Sanne, didn't you
>> have a prototype in Infinispan's codebase though?
>>
>> On 5 Apr 2013, at 15:28, Ales Justin <ales.jus...@gmail.com> wrote:
>>
>> are you not using the JGroups backend anymore?
>>
>> I'm using that "jgroups" backend, with auto-master election.
>>
>> these Lock acquisitions are on the index lock, and make me suspect your
>> configuration is no longer applying the pattern we discussed a while back,
>> when you contributed the fix to the JGroups indexing backend.
>>
>> Or is it the "Replication timeout for mstruk/capedwarf" that is causing
>> those locking errors?
>>
>> No idea.
>>
>> btw: didn't you say you had some new backend mechanism?
>> Off Infinispan's channel.
>>
>> -Ales
>>
>> On 5 April 2013 14:56, Ales Justin <ales.jus...@gmail.com> wrote:
>>>
>>> We're running a GAE HelloOrm2 example app on 3 nodes (3 laptops).
>>>
>>> Very soon after deploy, we get a never-ending stack of timeouts,
>>> which completely kills the app:
>>> * https://gist.github.com/alesj/5319414
>>>
>>> I then need to kill the AS in order to get it shut down.
>>>
>>> How can this be tuned / fixed?
>>>
>>> -Ales
>>
>> --
>> Manik Surtani
>> ma...@jboss.org
>> twitter.com/maniksurtani
>>
>> Platform Architect, JBoss Data Grid
>> http://red.ht/data-grid

_______________________________________________
infinispan-dev mailing list
infinispan-dev@lists.jboss.org
https://lists.jboss.org/mailman/listinfo/infinispan-dev
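As for the "Replication timeout" in the original report: the timeout itself is tunable, though raising it only buys headroom and does not address the election/locking problem discussed above. A minimal sketch using Infinispan 5.x's fluent configuration API, with the value purely illustrative:

import org.infinispan.configuration.cache.CacheMode;
import org.infinispan.configuration.cache.Configuration;
import org.infinispan.configuration.cache.ConfigurationBuilder;

public class ReplTimeoutTuning {

    // Build a sync distributed-cache configuration with a longer
    // replication timeout; 30s here is illustrative, not a recommendation.
    public static Configuration higherReplTimeout() {
        return new ConfigurationBuilder()
                .clustering()
                    .cacheMode(CacheMode.DIST_SYNC)
                    .sync().replTimeout(30000L)
                .build();
    }
}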