Re: Deadlocks, editing context locking and network tasks

2016-09-05 Thread Samuel Pelletier
Mark,

I discovered some useful classes in the er.extensions.concurrency package inside 
ERExtensions.

Based on these, my current pattern for background tasks is this:

ERXApplication._startRequest();
ec = ERXEC.newEditingContext(parentObjectStore);
try {
    // do the job without worrying about locks; auto locking will handle them
}
finally {
    ec = null;
    ERXApplication._endRequest();
}

> 1. For a background thread, it is appropriate to create a new editing context 
> (ERXEC.newEditingContext(osc)) using a dedicated and new object store 
> coordinator (created using osc = new ERXObjectStoreCoordinator()).

The new editing context is mandatory; whether you need a new OSC depends on 
your case. As usual, there are pros and cons...
Pros:
- You will use a new connection to the database, and if your connection 
settings and use case allow it (do not create long locks in the database 
server), you will not block the other threads of your app.

Cons:
- You will not use the snapshot cache of the main OSC, so everything 
will be fetched again. This can mean a large duplication in memory and will 
take more time if most of your data is already cached.
- Your changes will NOT be propagated to the EOEditingContexts of other 
OSCs; they only propagate inside an OSC.

Unless you need to perform a long fetch (or update), a separate OSC is 
probably not the most efficient solution. It really depends on the type 
of database access performed by the task.

> 
> 2. For a background thread, all such editing contexts should be lock()’ed and 
> then unlock()’ed - unlocked in finally {} clause in case of uncaught 
> exceptions. Automatic locking is only for ECs used within the R-R loop?

You can; see the beginning of this message.
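
For the manual lock()/unlock() form asked about in question 2, here is a minimal sketch of the try/finally shape. It uses a plain java.util.concurrent ReentrantLock as a stand-in for the editing context lock (the thread dump later in this archive shows EOEditingContext.lock() ending up in ReentrantLock.lock()). The class LockedWork and its methods are hypothetical names for illustration only; with a real EC you would call ec.lock() and ec.unlock() in the same positions.

```java
import java.util.concurrent.locks.ReentrantLock;

// Hypothetical stand-in for an editing context: it only models the lock.
final class LockedWork {
    private final ReentrantLock lock = new ReentrantLock();
    private int completed = 0;

    /** Runs the task while holding the lock; the lock is released even if the task throws. */
    void runLocked(Runnable task) {
        lock.lock();
        try {
            task.run();
            completed++;          // only reached when the task did not throw
        } finally {
            lock.unlock();        // always balances the lock() above
        }
    }

    boolean isLocked() {
        return lock.isLocked();
    }

    int completed() {
        return completed;
    }
}
```

The point of the finally block is that an uncaught exception in the task cannot leave the lock held, which is the situation that later shows up as a hang in another thread.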

> 3. But what should one do if, either during a background thread, R-R loop 
> (direct action or component action), one locks an editing context, does some 
> processing of objects within that context, makes a network call, and then 
> does some more processing within that context. Should one simply lock() and 
> then hope for the best, or unlock, do the network process and then re-lock at 
> the end. Are there any issues running unlock() if the EC isn’t actually 
> locked? What happens if that network call never returns?

That should not be a problem if your EOEditingContext is private, but you will 
not receive changes to the EOs from other EOEditingContexts while yours is 
locked. As others said, you should have a timeout in place and handle it 
properly.
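
The alternative raised in question 3 (unlock, do the network call, then re-lock) can be sketched as two short locked phases with the slow call in between. This is only an illustration under assumptions: ReentrantLock stands in for the EC lock, and TwoPhaseTask, refresh and the stored string are hypothetical names. The sketch copies a plain value out before unlocking, on the assumption that the guarded state should not be touched while the lock is released.

```java
import java.util.concurrent.locks.ReentrantLock;
import java.util.function.Function;

// Hypothetical example: state guarded by a lock, updated around a slow remote call.
final class TwoPhaseTask {
    private final ReentrantLock lock = new ReentrantLock();
    private String storedValue;                 // guarded by lock
    private boolean lockHeldDuringCall = true;  // recorded for inspection only

    TwoPhaseTask(String initialValue) {
        this.storedValue = initialValue;
    }

    String refresh(Function<String, String> slowRemoteCall) {
        // Phase 1: read what the remote call needs, then release the lock.
        String request;
        lock.lock();
        try {
            request = storedValue;
        } finally {
            lock.unlock();
        }

        // The slow call runs without the lock, so other threads are not blocked by it.
        lockHeldDuringCall = lock.isHeldByCurrentThread();
        String response = slowRemoteCall.apply(request);

        // Phase 2: re-lock and apply the result.
        lock.lock();
        try {
            storedValue = response;
            return storedValue;
        } finally {
            lock.unlock();
        }
    }

    boolean lockWasHeldDuringCall() {
        return lockHeldDuringCall;
    }
}
```

The trade-off is that the guarded state may have changed between the two phases, so phase 2 should not assume that what phase 1 read is still current.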

I do not know what happens with too many unlock() calls. I do not expect it to 
cause problems, but I suggest you try it; it is easy to test.
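
As a starting point for that experiment, this is how a bare java.util.concurrent ReentrantLock behaves: an unlock() by a thread that does not hold the lock throws IllegalMonitorStateException, and nested lock() calls need the same number of unlock() calls. Whether EOEditingContext or ERXEC lets that exception through, or handles it some other way, is not something this sketch shows; the class UnbalancedUnlock is a hypothetical name used only here.

```java
import java.util.concurrent.locks.ReentrantLock;

final class UnbalancedUnlock {
    /** Returns true if unlocking a lock the current thread does not hold is rejected. */
    static boolean extraUnlockThrows() {
        ReentrantLock lock = new ReentrantLock();
        try {
            lock.unlock();               // never locked by this thread
            return false;
        } catch (IllegalMonitorStateException e) {
            return true;
        }
    }

    /** Locks twice, unlocks once, and reports how many holds remain. */
    static int holdsLeftAfterOneUnlock() {
        ReentrantLock lock = new ReentrantLock();
        lock.lock();
        lock.lock();                     // re-entrant: same thread may lock again
        lock.unlock();
        int remaining = lock.getHoldCount();
        lock.unlock();                   // balance the first lock() before returning
        return remaining;
    }
}
```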


> 4. Is locking an EC from a newly created OSC completely independent from all 
> other OSC ECs? If that lock isn’t released for some time, does it matter?

As with any lock, the resources in use will not be released for as long as it 
is held. This includes the snapshot cache of everything fetched in this EC.

Samuel



 ___
Do not post admin requests to the list. They will be ignored.
Webobjects-dev mailing list  (Webobjects-dev@lists.apple.com)
Help/Unsubscribe/Update your Subscription:
https://lists.apple.com/mailman/options/webobjects-dev/archive%40mail-archive.com

This email sent to arch...@mail-archive.com

Re: Deadlocks, editing context locking and network tasks

2016-09-05 Thread Mark Wardle
Dear René,

Thank you. This is really helpful. I hadn’t spotted the screencast and will 
check it out.

Mark

> On 5 Sep 2016, at 09:34, René Bock  wrote:
> 
> Hi Mark
> 
>> On 03.09.2016 at 22:36, Mark Wardle wrote:
>> 
>> Dear all,
>> 
>> I’m debugging a deadlock and realise that I probably need to re-design some 
>> of my code logic.
>> 
>> Am I right in saying…
>> 
>> 1. For a background thread, it is appropriate to create a new editing 
>> context (ERXEC.newEditingContext(osc)) using a dedicated and new object 
>> store coordinator (created using osc = new ERXObjectStoreCoordinator()).
> 
> you should do that.
> 
>> 
>> 2. For a background thread, all such editing contexts should be lock()’ed 
>> and then unlock()’ed - unlocked in finally {} clause in case of uncaught 
>> exceptions. Automatic locking is only for ECs used within the R-R loop?
> 
> yes
> 
>> 
>> 3. But what should one do if, either during a background thread, R-R loop 
>> (direct action or component action), one locks an editing context, does some 
>> processing of objects within that context, makes a network call, and then 
>> does some more processing within that context. Should one simply lock() and 
>> then hope for the best, or unlock, do the network process and then re-lock 
>> at the end. Are there any issues running unlock() if the EC isn’t actually 
>> locked? What happens if that network call never returns?
> 
> You should handle network time-outs ;-)  How long may the remote call 
> take? Seconds, minutes or hours? If you have many background tasks waiting on 
> network I/O, you may run out of OSCs or memory.
> 
> 
>> 
>> 4. Is locking an EC from a newly created OSC completely independent from all 
>> other OSC ECs?
> 
> If you lock an EC, the other OSCs (and their ECs) are not affected.
> 
>> If that lock isn’t released for some time, does it matter?
> 
> see above.
> 
>> 
>> All advice appreciated,
> 
> 
> By the way: there is a very helpful screencast on wocommunity:
> 
> http://www.wocommunity.org/podcasts/wowodc/2011/BackgroundTasks.mov
> 
> 
> Best regards
> 
> René Bock
> 
> --
> Phone: +49 69 650096 18
> 
> salient GmbH, Lindleystraße 12, 60314 Frankfurt
> Main: +49 69 65 00 96 0  |  http://www.salient-doremus.de
> 



Re: Deadlocks, editing context locking and network tasks

2016-09-05 Thread René Bock
Hi Mark

> On 03.09.2016 at 22:36, Mark Wardle wrote:
> 
> Dear all,
> 
> I’m debugging a deadlock and realise that I probably need to re-design some 
> of my code logic.
> 
> Am I right in saying…
> 
> 1. For a background thread, it is appropriate to create a new editing context 
> (ERXEC.newEditingContext(osc)) using a dedicated and new object store 
> coordinator (created using osc = new ERXObjectStoreCoordinator()).

you should do that.

> 
> 2. For a background thread, all such editing contexts should be lock()’ed and 
> then unlock()’ed - unlocked in finally {} clause in case of uncaught 
> exceptions. Automatic locking is only for ECs used within the R-R loop?

yes

> 
> 3. But what should one do if, either during a background thread, R-R loop 
> (direct action or component action), one locks an editing context, does some 
> processing of objects within that context, makes a network call, and then 
> does some more processing within that context. Should one simply lock() and 
> then hope for the best, or unlock, do the network process and then re-lock at 
> the end. Are there any issues running unlock() if the EC isn’t actually 
> locked? What happens if that network call never returns?

You should handle network time-outs ;-)  How long may the remote call take? 
Seconds, minutes or hours? If you have many background tasks waiting on network 
I/O, you may run out of OSCs or memory.
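
One way to put an upper bound on a remote call, sketched with the standard library only: run the call on a helper thread and wait for it with Future.get(timeout, unit). The names TimedRemoteCall and callWithTimeout are hypothetical, and returning a fallback value on failure is just one possible policy. Socket-level timeouts (for example URLConnection.setConnectTimeout and setReadTimeout) are another option where the client library exposes them.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

final class TimedRemoteCall {
    /** Runs the call on a helper thread and gives up after timeoutMillis. */
    static <T> T callWithTimeout(Callable<T> remoteCall, long timeoutMillis, T fallback) {
        ExecutorService executor = Executors.newSingleThreadExecutor();
        Future<T> future = executor.submit(remoteCall);
        try {
            return future.get(timeoutMillis, TimeUnit.MILLISECONDS);
        } catch (TimeoutException e) {
            future.cancel(true);                 // interrupt the helper thread
            return fallback;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();  // preserve the interrupt for the caller
            future.cancel(true);
            return fallback;
        } catch (ExecutionException e) {
            return fallback;                     // the remote call itself failed
        } finally {
            executor.shutdownNow();              // do not leave the helper thread behind
        }
    }
}
```

A call such as callWithTimeout(() -> fetchSomething(), 5000L, null), where fetchSomething is a placeholder for the real network call, then returns within roughly five seconds whether or not the remote side answers. Cancelling only interrupts the helper thread, so a call that ignores interrupts may keep running in the background.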


> 
> 4. Is locking an EC from a newly created OSC completely independent from all 
> other OSC ECs?

If you lock an EC, the other OSCs (and their ECs) are not affected.

> If that lock isn’t released for some time, does it matter?

see above.

> 
> All advice appreciated,


By the way: there is a very helpful screencast on wocommunity:

http://www.wocommunity.org/podcasts/wowodc/2011/BackgroundTasks.mov


Best regards

René Bock

--
Phone: +49 69 650096 18

salient GmbH, Lindleystraße 12, 60314 Frankfurt
Main: +49 69 65 00 96 0  |  http://www.salient-doremus.de



Deadlocks, editing context locking and network tasks

2016-09-03 Thread Mark Wardle
Dear all,

I’m debugging a deadlock and realise that I probably need to re-design some of 
my code logic.

Am I right in saying…

1. For a background thread, it is appropriate to create a new editing context 
(ERXEC.newEditingContext(osc)) using a dedicated and new object store 
coordinator (created using osc = new ERXObjectStoreCoordinator()).

2. For a background thread, all such editing contexts should be lock()’ed and 
then unlock()’ed - unlocked in finally {} clause in case of uncaught 
exceptions. Automatic locking is only for ECs used within the R-R loop?

3. But what should one do if, either during a background thread, R-R loop 
(direct action or component action), one locks an editing context, does some 
processing of objects within that context, makes a network call, and then does 
some more processing within that context. Should one simply lock() and then 
hope for the best, or unlock, do the network process and then re-lock at the 
end. Are there any issues running unlock() if the EC isn’t actually locked? 
What happens if that network call never returns?

4. Is locking an EC from a newly created OSC completely independent from all 
other OSC ECs? If that lock isn’t released for some time, does it matter?

All advice appreciated,

Mark

PS. Using Wonder, using safeLock ERXEC flag in application properties.



Re: WebObjects application instances hanging - Deadlocks occurring

2014-11-21 Thread Ralf Schuchardt
Hi Raghu,

as far as I can see, you have a log of lock/unlock operations with the 
EditingContext IDs and your original deadlock log with blocked thread names. 
But it is not easy to correlate those two logs, as the deadlock log is missing 
object IDs and the lock/unlock log has no thread names. I would expect there to 
be one more lock than unlock in the log, but it is tedious to find the right 
one.
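
Finding the unbalanced lock by hand is tedious, but the counting itself can be automated. The following is a rough sketch that tallies "locked" and "unlocked" LockLogger entries per editing context and reports the ones that do not balance. It assumes each LockLogger entry sits on a single line in the real log file (the archive wraps them), and the class name LockLogBalance and the regular expression are my own, derived from the log excerpts in this thread.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class LockLogBalance {
    // Matches e.g. "... ERXEC.LockLogger  - locked er.extensions.eof.ERXEC@13cd5b5"
    private static final Pattern LOCK_LINE =
        Pattern.compile("LockLogger\\s+-\\s+(locked|unlocked)\\s+(\\S+)");

    /** Returns, per editing context, locks minus unlocks; balanced contexts are omitted. */
    static Map<String, Integer> openLocks(List<String> logLines) {
        Map<String, Integer> balance = new LinkedHashMap<>();
        for (String line : logLines) {
            Matcher matcher = LOCK_LINE.matcher(line);
            if (!matcher.find()) {
                continue;                          // not a LockLogger entry
            }
            int delta = matcher.group(1).equals("locked") ? 1 : -1;
            balance.merge(matcher.group(2), delta, Integer::sum);
        }
        balance.values().removeIf(count -> count == 0);
        return balance;
    }
}
```

This only narrows the search to an editing context; it says nothing about which thread took the lock, so it complements the thread dump rather than replacing it.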

There is the property er.extensions.ERXEC.markOpenLocks that may help, if you 
can get it to work. When the deadlock occurs the direct action 
ERXDirectAction/showOpenEditingContextLockTraces should show you a more 
complete picture of currently open locks and where the offending editing 
context was created.

Kind regards,
Ralf


On 20.11.2014 at 15:23, Raghavender Bokka 
raghavender.bo...@prithvisolutions.com wrote:

 Hi Team,
 
 The following exceptions are generated in the log files when we enable 
 ERX logging; we do not have any code in the Session.sleep method. Some 
 of our WebObjects application instances hang under a test load of around 
 1000 users. When we look at the Java process thread dump, there are 
 deadlocks occurring.
 
 ---
 ---
 Exception
 at er.extensions.eof.ERXEC.unlock(ERXEC.java:501)
 at 
 com.webobjects.eocontrol.EOEditingContext._sendOrEnqueueNotification(EOEditingContext.java:4721)
 at 
 com.webobjects.eocontrol.EOEditingContext._objectsChangedInStore(EOEditingContext.java:3562)
 at er.extensions.eof.ERXEC._objectsChangedInStore(ERXEC.java:1285)
... skipped 7 stack elements
 at 
 com.webobjects.eocontrol.EOObjectStoreCoordinator._objectsChangedInSubStore(EOObjectStoreCoordinator.java:693)
... skipped 16 stack elements
 at 
 com.webobjects.eocontrol.EOObjectStoreCoordinator.saveChangesInEditingContext(EOObjectStoreCoordinator.java:386)
 at 
 com.webobjects.eocontrol.EOEditingContext.saveChanges(EOEditingContext.java:3192)
 at er.extensions.eof.ERXEC._saveChanges(ERXEC.java:981)
 at er.extensions.eof.ERXEC.saveChanges(ERXEC.java:903)
 at 
 TestTakingMode$StudentTestSessionMode.testSubmitted(TestTakingMode.java:648)
 at ReviewTestResponsePage.submitTest(ReviewTestResponsePage.java:99)
... skipped 4 stack elements
 at 
 KeyValueCodingProtectedAccessor.methodValue(KeyValueCodingProtectedAccessor.java:60)
... skipped 46 stack elements
 at Application.dispatchRequest(Application.java:670)
 Nov 17 22:22:01 Solar[6009] DEBUG er.extensions.appserver.ERXSession  - Will 
 terminate, sessionId is FkDsWpsOxKy1TDaligNLDg
 Nov 17 22:22:01 Solar[6009] DEBUG er.extensions.appserver.ERXBrowserFactory  
 - _incrementReferenceCounterForKey() - count = 26, key = 
 IE.7.0.4.0.Windows.{cpu = Unknown CPU; geckoRevision = No Gecko; }
 Nov 17 22:22:01 Solar[6009] DEBUG er.extensions.eof.ERXEC  - After popping: 
 [er.extensions.eof.ERXEC@dd151f]
 Nov 17 22:22:01 Solar[6009] DEBUG er.extensions.ERXEC.LockLogger  - unlocked 
 er.extensions.eof.ERXEC@13cd5b5
 Exception
 at er.extensions.eof.ERXEC.unlock(ERXEC.java:501)
 at com.webobjects.appserver.WOSession._sleepInContext(WOSession.java:849)
 at 
 com.webobjects.appserver.WOApplication.saveSessionForContext(WOApplication.java:1883)
 at 
 er.extensions.appserver.ERXApplication.saveSessionForContext(ERXApplication.java:2075)
... skipped 6 stack elements
 at Application.dispatchRequest(Application.java:670)
 ... skipped 3 stack elements
 Nov 17 22:22:01 Solar[6009] DEBUG er.extensions.ERXEC.LockLogger  - locked 
 er.extensions.eof.ERXEC@13cd5b5
 Exception
 at er.extensions.eof.ERXEC.lock(ERXEC.java:483)
 at 
 com.webobjects.eocontrol.EOEditingContext._dispose(EOEditingContext.java:1116)
 at 
 com.webobjects.eocontrol.EOEditingContext.dispose(EOEditingContext.java:)
 at er.extensions.eof.ERXEC.dispose(ERXEC.java:610)
 at com.webobjects.appserver.WOSession._sleepInContext(WOSession.java:854)
 at 
 com.webobjects.appserver.WOApplication.saveSessionForContext(WOApplication.java:1883)
 at 
 er.extensions.appserver.ERXApplication.saveSessionForContext(ERXApplication.java:2075)
... skipped 6 stack elements
 at Application.dispatchRequest(Application.java:670)
 ... skipped 3 stack elements
 Nov 17 22:22:01 Solar[6009] DEBUG er.extensions.eof.ERXEC  - After pushing: 
 [er.extensions.eof.ERXEC@dd151f, er.extensions.eof.ERXEC@13cd5b5]
 Nov 17 22:22:01 Solar[6009] DEBUG er.extensions.eof.ERXEC  - After popping: 
 [er.extensions.eof.ERXEC@dd151f]
 Nov 17 22:22:01 Solar[6009] DEBUG er.extensions.ERXEC.LockLogger  - unlocked 
 er.extensions.eof.ERXEC@13cd5b5
 Exception
 at er.extensions.eof.ERXEC.unlock(ERXEC.java:501)
 at 
 com.webobjects.eocontrol.EOEditingContext._dispose(EOEditingContext.java:1218)
 at 
 com.webobjects.eocontrol.EOEditingContext.dispose(EOEditingContext.java:)
 at er.extensions.eof.ERXEC.dispose(ERXEC.java:610)
 at com.webobjects.appserver.WOSession

WebObjects application instances hanging - Deadlocks occurring

2014-11-20 Thread Raghavender Bokka
Hi Team,

The following exceptions are generated in the log files when we enable 
ERX logging; we do not have any code in the Session.sleep method. Some 
of our WebObjects application instances hang under a test load of around 
1000 users. When we look at the Java process thread dump, there are 
deadlocks occurring.

---
---
Exception
 at er.extensions.eof.ERXEC.unlock(ERXEC.java:501)
 at 
com.webobjects.eocontrol.EOEditingContext._sendOrEnqueueNotification(EOEditingContext.java:4721)
 at 
com.webobjects.eocontrol.EOEditingContext._objectsChangedInStore(EOEditingContext.java:3562)
 at er.extensions.eof.ERXEC._objectsChangedInStore(ERXEC.java:1285)
... skipped 7 stack elements
 at 
com.webobjects.eocontrol.EOObjectStoreCoordinator._objectsChangedInSubStore(EOObjectStoreCoordinator.java:693)
... skipped 16 stack elements
 at 
com.webobjects.eocontrol.EOObjectStoreCoordinator.saveChangesInEditingContext(EOObjectStoreCoordinator.java:386)
 at 
com.webobjects.eocontrol.EOEditingContext.saveChanges(EOEditingContext.java:3192)
 at er.extensions.eof.ERXEC._saveChanges(ERXEC.java:981)
 at er.extensions.eof.ERXEC.saveChanges(ERXEC.java:903)
 at TestTakingMode$StudentTestSessionMode.testSubmitted(TestTakingMode.java:648)
 at ReviewTestResponsePage.submitTest(ReviewTestResponsePage.java:99)
... skipped 4 stack elements
 at 
KeyValueCodingProtectedAccessor.methodValue(KeyValueCodingProtectedAccessor.java:60)
... skipped 46 stack elements
 at Application.dispatchRequest(Application.java:670)
Nov 17 22:22:01 Solar[6009] DEBUG er.extensions.appserver.ERXSession  - Will 
terminate, sessionId is FkDsWpsOxKy1TDaligNLDg
Nov 17 22:22:01 Solar[6009] DEBUG er.extensions.appserver.ERXBrowserFactory  - 
_incrementReferenceCounterForKey() - count = 26, key = IE.7.0.4.0.Windows.{cpu 
= Unknown CPU; geckoRevision = No Gecko; }
Nov 17 22:22:01 Solar[6009] DEBUG er.extensions.eof.ERXEC  - After popping: 
[er.extensions.eof.ERXEC@dd151f]
Nov 17 22:22:01 Solar[6009] DEBUG er.extensions.ERXEC.LockLogger  - unlocked 
er.extensions.eof.ERXEC@13cd5b5
Exception
 at er.extensions.eof.ERXEC.unlock(ERXEC.java:501)
 at com.webobjects.appserver.WOSession._sleepInContext(WOSession.java:849)
 at 
com.webobjects.appserver.WOApplication.saveSessionForContext(WOApplication.java:1883)
 at 
er.extensions.appserver.ERXApplication.saveSessionForContext(ERXApplication.java:2075)
... skipped 6 stack elements
 at Application.dispatchRequest(Application.java:670)
 ... skipped 3 stack elements
Nov 17 22:22:01 Solar[6009] DEBUG er.extensions.ERXEC.LockLogger  - locked 
er.extensions.eof.ERXEC@13cd5b5
Exception
 at er.extensions.eof.ERXEC.lock(ERXEC.java:483)
 at 
com.webobjects.eocontrol.EOEditingContext._dispose(EOEditingContext.java:1116)
 at 
com.webobjects.eocontrol.EOEditingContext.dispose(EOEditingContext.java:)
 at er.extensions.eof.ERXEC.dispose(ERXEC.java:610)
 at com.webobjects.appserver.WOSession._sleepInContext(WOSession.java:854)
 at 
com.webobjects.appserver.WOApplication.saveSessionForContext(WOApplication.java:1883)
 at 
er.extensions.appserver.ERXApplication.saveSessionForContext(ERXApplication.java:2075)
... skipped 6 stack elements
 at Application.dispatchRequest(Application.java:670)
 ... skipped 3 stack elements
Nov 17 22:22:01 Solar[6009] DEBUG er.extensions.eof.ERXEC  - After pushing: 
[er.extensions.eof.ERXEC@dd151f, er.extensions.eof.ERXEC@13cd5b5]
Nov 17 22:22:01 Solar[6009] DEBUG er.extensions.eof.ERXEC  - After popping: 
[er.extensions.eof.ERXEC@dd151f]
Nov 17 22:22:01 Solar[6009] DEBUG er.extensions.ERXEC.LockLogger  - unlocked 
er.extensions.eof.ERXEC@13cd5b5
Exception
 at er.extensions.eof.ERXEC.unlock(ERXEC.java:501)
 at 
com.webobjects.eocontrol.EOEditingContext._dispose(EOEditingContext.java:1218)
 at 
com.webobjects.eocontrol.EOEditingContext.dispose(EOEditingContext.java:)
 at er.extensions.eof.ERXEC.dispose(ERXEC.java:610)
 at com.webobjects.appserver.WOSession._sleepInContext(WOSession.java:854)
 at 
com.webobjects.appserver.WOApplication.saveSessionForContext(WOApplication.java:1883)
 at 
er.extensions.appserver.ERXApplication.saveSessionForContext(ERXApplication.java:2075)
... skipped 6 stack elements
 at Application.dispatchRequest(Application.java:670)
 ... skipped 3 stack elements
Nov 17 22:22:01 Solar[6009] DEBUG er.extensions.eof.ERXEC  - After pushing: 
[er.extensions.eof.ERXEC@dd151f, er.extensions.eof.ERXEC@dd151f]
Nov 17 22:22:01 Solar[6009] DEBUG er.extensions.eof.ERXEC  - After popping: 
[er.extensions.eof.ERXEC@dd151f]
Nov 17 22:22:01 Solar[6009] DEBUG er.extensions.appserver.ERXBrowserFactory  - 
_decrementReferenceCounterForKey() - count = 25, key = IE.7.0.4.0.Windows.{cpu 
= Unknown CPU; geckoRevision = No Gecko; }
Nov 17 22:22:01 Solar[6009] DEBUG er.extensions.appserver.ERXBrowserFactory

Re: WebObjects application instances hanging - Deadlocks occurring

2014-11-18 Thread Raghavender Bokka
._private.WODynamicGroup.invokeChildrenAction(WODynamicGroup.java:105)
at 
com.webobjects.appserver._private.WODynamicGroup.invokeAction(WODynamicGroup.java:115)
at 
com.webobjects.appserver.WOComponent.invokeAction(WOComponent.java:1079)
at 
er.extensions.components.ERXComponent.invokeAction(ERXComponent.java:92)
at com.webobjects.appserver.WOSession.invokeAction(WOSession.java:1357)
at Session.invokeAction(Session.java:191)
at 
com.webobjects.appserver.WOApplication.invokeAction(WOApplication.java:1745)
at 
er.extensions.appserver.ajax.ERXAjaxApplication.invokeAction(ERXAjaxApplication.java:50)
at 
er.extensions.appserver.ERXApplication.invokeAction(ERXApplication.java:1687)
at 
com.webobjects.appserver._private.WOComponentRequestHandler._dispatchWithPreparedPage(WOComponentRequestHandler.java:206)
at 
com.webobjects.appserver._private.WOComponentRequestHandler._dispatchWithPreparedSession(WOComponentRequestHandler.java:298)
at 
com.webobjects.appserver._private.WOComponentRequestHandler._dispatchWithPreparedApplication(WOComponentRequestHandler.java:332)
at 
com.webobjects.appserver._private.WOComponentRequestHandler._handleRequest(WOComponentRequestHandler.java:369)
at 
com.webobjects.appserver._private.WOComponentRequestHandler.handleRequest(WOComponentRequestHandler.java:442)
at 
com.webobjects.appserver.WOApplication.dispatchRequest(WOApplication.java:1687)
at 
er.extensions.appserver.ERXApplication.dispatchRequestImmediately(ERXApplication.java:1802)
at 
er.extensions.appserver.ERXApplication.dispatchRequest(ERXApplication.java:1767)
at Application.dispatchRequest(Application.java:653)
at 
com.webobjects.appserver._private.WOWorkerThread.runOnce(WOWorkerThread.java:144)
at 
com.webobjects.appserver._private.WOWorkerThread.run(WOWorkerThread.java:226)
at java.lang.Thread.run(Thread.java:619)

Nov 17 22:20:48 Solar[6009] DEBUG er.extensions.ERXEC.LockLogger  - locked 
er.extensions.eof.ERXEC@13cd5b5
Exception
  at er.extensions.eof.ERXEC.lock(ERXEC.java:483)
  at er.extensions.eof.ERXEC$DefaultFactory._newEditingContext(ERXEC.java:1465)
  at er.extensions.eof.ERXEC$DefaultFactory._newEditingContext(ERXEC.java:1434)
  at er.extensions.eof.ERXEC.newEditingContext(ERXEC.java:1540)
  at 
er.extensions.appserver.ERXSession.defaultEditingContext(ERXSession.java:353)
  at Session.setLoginUser(Session.java:106)
  at Main.login(Main.java:185)
  at Main.login(Main.java:120)
 ... skipped 4 stack elements
  at 
KeyValueCodingProtectedAccessor.methodValue(KeyValueCodingProtectedAccessor.java:60)
 ... skipped 46 stack elements
  at Application.dispatchRequest(Application.java:653)
  ... skipped 3 stack elements
Nov 17 22:20:48 Solar[6009] DEBUG er.extensions.eof.ERXEC  - After pushing: 
[er.extensions.eof.ERXEC@5971c3, er.extensions.eof.ERXEC@13cd5b5]
Nov 17 22:20:48 Solar[6009] DEBUG er.extensions.eof.ERXEC  - After popping: 
[er.extensions.eof.ERXEC@5971c3]
Nov 17 22:20:48 Solar[6009] DEBUG er.extensions.ERXEC.LockLogger  - unlocked 
er.extensions.eof.ERXEC@13cd5b5
Exception
  at er.extensions.eof.ERXEC.unlock(ERXEC.java:501)
  at er.extensions.eof.ERXEC$DefaultFactory._newEditingContext(ERXEC.java:1467)
  at er.extensions.eof.ERXEC$DefaultFactory._newEditingContext(ERXEC.java:1434)
  at er.extensions.eof.ERXEC.newEditingContext(ERXEC.java:1540)
  at 
er.extensions.appserver.ERXSession.defaultEditingContext(ERXSession.java:353)
  at Session.setLoginUser(Session.java:106)
  at Main.login(Main.java:185)
  at Main.login(Main.java:120)
 ... skipped 4 stack elements
  at 
KeyValueCodingProtectedAccessor.methodValue(KeyValueCodingProtectedAccessor.java:60)
 ... skipped 46 stack elements
  at Application.dispatchRequest(Application.java:653)
  ... skipped 3 stack elements
---
---

Any help would be appreciated.

Regards,
Raghu.


Re: WebObjects application instances hanging - Deadlocks occurring

2014-11-17 Thread Ralf Schuchardt
Hi,

On 17.11.2014 at 13:33, Raghavender Bokka 
raghavender.bo...@prithvisolutions.com wrote:

 Hi Team,
 
 Some of our WebObjects application instances hang under a test load of 
 around 1000 users. When we look at the Java process thread dump, there are 
 deadlocks occurring. The following is the thread dump: 

[...]

 WorkerThread24 prio=3 tid=0x00e42800 nid=0x31 waiting on condition 
 [0xd49fe000]
   java.lang.Thread.State: WAITING (parking)
at sun.misc.Unsafe.park(Native Method)
- parking to wait for  0xdc3837c8 (a 
 java.util.concurrent.locks.ReentrantLock$NonfairSync)
at java.util.concurrent.locks.LockSupport.park(LockSupport.java:158)
at 
 java.util.concurrent.locks.AbstractQueuedSynchronizer.parkAndCheckInterrupt(AbstractQueuedSynchronizer.java:747)
at 
 java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireQueued(AbstractQueuedSynchronizer.java:778)
at 
 java.util.concurrent.locks.AbstractQueuedSynchronizer.acquire(AbstractQueuedSynchronizer.java:1114)
at 
 java.util.concurrent.locks.ReentrantLock$NonfairSync.lock(ReentrantLock.java:186)
at 
 java.util.concurrent.locks.ReentrantLock.lock(ReentrantLock.java:262)
at 
 com.webobjects.eocontrol.EOEditingContext.lock(EOEditingContext.java:4617)
at er.extensions.eof.ERXEC.lock(ERXEC.java:480)
at 
 com.webobjects.appserver.WOSession._awakeInContext(WOSession.java:835)
at 
 com.webobjects.appserver.WOApplication.restoreSessionWithID(WOApplication.java:1917)
at 
 er.extensions.appserver.ERXApplication.restoreSessionWithID(ERXApplication.java:2093)
at 
 com.webobjects.appserver._private.WOComponentRequestHandler._dispatchWithPreparedApplication(WOComponentRequestHandler.java:324)
at 
 com.webobjects.appserver._private.WOComponentRequestHandler._handleRequest(WOComponentRequestHandler.java:369)
at 
 com.webobjects.appserver._private.WOComponentRequestHandler.handleRequest(WOComponentRequestHandler.java:442)
- locked 0xdbc631d0 (a java.lang.Object)
at 
 com.webobjects.appserver.WOApplication.dispatchRequest(WOApplication.java:1687)
at 
 er.extensions.appserver.ERXApplication.dispatchRequestImmediately(ERXApplication.java:1802)
at 
 er.extensions.appserver.ERXApplication.dispatchRequest(ERXApplication.java:1767)
at Application.dispatchRequest(Application.java:670)
at 
 com.webobjects.appserver._private.WOWorkerThread.runOnce(WOWorkerThread.java:144)
at 
 com.webobjects.appserver._private.WOWorkerThread.run(WOWorkerThread.java:226)
at java.lang.Thread.run(Thread.java:619)

This stack trace seems to indicate that the defaultEditingContext was not 
unlocked in the previous request. Do you see an exception prior to the deadlock?
If you have code in a Session.sleep() method, make sure to catch all exceptions 
there.
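
Besides reading thread dumps, the JVM can be asked for lock cycles programmatically through ThreadMXBean.findDeadlockedThreads(). The sketch below deliberately provokes a two-thread deadlock on ReentrantLocks and then counts the threads the JVM reports; DeadlockProbe and its method names are hypothetical. One caveat: a lock that is simply never released, which is what the trace above suggests, is not a cycle, so this check would presumably not report it.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadMXBean;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.locks.ReentrantLock;

final class DeadlockProbe {
    /** Number of threads the JVM currently reports as deadlocked (0 if none). */
    static int deadlockedThreadCount() {
        ThreadMXBean threads = ManagementFactory.getThreadMXBean();
        long[] ids = threads.findDeadlockedThreads();   // null when no cycle exists
        return ids == null ? 0 : ids.length;
    }

    /** Starts two daemon threads that lock in opposite order, then polls for the cycle. */
    static int provokeAndCount(long timeoutMillis) {
        ReentrantLock first = new ReentrantLock();
        ReentrantLock second = new ReentrantLock();
        CountDownLatch bothHoldOne = new CountDownLatch(2);
        startDaemon("probe-1", first, second, bothHoldOne);
        startDaemon("probe-2", second, first, bothHoldOne);

        long deadline = System.currentTimeMillis() + timeoutMillis;
        int count = deadlockedThreadCount();
        while (count == 0 && System.currentTimeMillis() < deadline) {
            try {
                Thread.sleep(20);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                break;
            }
            count = deadlockedThreadCount();
        }
        return count;
    }

    private static void startDaemon(String name, ReentrantLock mine,
                                    ReentrantLock other, CountDownLatch bothHoldOne) {
        Thread thread = new Thread(() -> {
            mine.lock();
            bothHoldOne.countDown();
            try {
                bothHoldOne.await();       // wait until each thread holds its own lock
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                return;
            }
            other.lock();                  // blocks forever: the other thread holds it
        }, name);
        thread.setDaemon(true);            // do not keep the JVM alive
        thread.start();
    }
}
```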

Ralf


RE: WOWorkerThread deadlocks

2013-01-15 Thread Michael Gargano
Hi guys,

I know I'm a little late on this, but I'm also seeing the same 
behavior.  I don't think it's a long-running query, because I'm logging long 
queries in postgres and nothing is running over 10 seconds.  Can you explain 
why having a max of 256 worker threads is too high?  Any other things I should 
look at?  The customers are not happy!  My last problem did turn out to be a 
bunch of deadlocks, which all now seem to be resolved.  It had to do with 
setting er.extensions.ERXObjectStoreCoordinatorPool.maxCoordinators=4, which 
should be seamless (you would think) but causes issues with fetch specs that 
have EOs crossing OSCs.  I had to pull all EOs local; it seems like something 
that should be handled inside Wonder automatically (so I consider it a bug, 
though whether it is or not could be argued, I guess).  Anyway, after those all 
got fixed, I'm now running into this.  It is much harder to figure out, since I 
don't even know what the lock is held on.

BTW Chuck and Quinton, I owe you guys a beer.  Thanks for pointing me in the 
right direction on the last problem.

Thanks for any help.
-Mike

-Original Message-
From: webobjects-dev-bounces+mgargano=escholar@lists.apple.com 
[mailto:webobjects-dev-bounces+mgargano=escholar@lists.apple.com] On Behalf 
Of Chuck Hill
Sent: Monday, September 10, 2012 1:24 PM
To: Maik Musall
Cc: webobjects-dev@lists.apple.com WebObjects
Subject: Re: WOWorkerThread deadlocks

Hi Maik,

WorkerThread207: that many worker threads indicates two things to me:
1. Your app's worker thread configuration is too high.  I'd use a max of 6-10 
and a listen queue size of around 4 (adjusted to suit your specific needs).  A 
WO app is very, very unlikely to recover from a 200 worker thread backlog in any 
way that is useful to the users.

2. You have a thread that is taking a long time to return a result.  If you are 
dispatching requests concurrently, then this is most likely stuck in 
EOControl/EOAccess (e.g. waiting for a slow query result) or connecting to some 
external process.  You could also have a deadlock.  If you are not dispatching 
requests concurrently, then this delay could be in other code.
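
To illustrate the first point with plain Java rather than WO itself: a small pool with a short bounded queue refuses excess work immediately instead of building a backlog it can never clear. The sketch uses java.util.concurrent.ThreadPoolExecutor as a loose analogy for worker threads plus a listen queue; BoundedWorkers and rejectedCount are hypothetical names, and the numbers 8 and 4 are just examples in the range suggested above.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.RejectedExecutionException;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

final class BoundedWorkers {
    /** Submits `requests` tasks that all block, and returns how many were refused. */
    static int rejectedCount(int workers, int queueSize, int requests) {
        CountDownLatch release = new CountDownLatch(1);
        ThreadPoolExecutor pool = new ThreadPoolExecutor(
            workers, workers, 0L, TimeUnit.MILLISECONDS,
            new ArrayBlockingQueue<>(queueSize));
        int rejected = 0;
        try {
            for (int i = 0; i < requests; i++) {
                try {
                    pool.execute(() -> awaitQuietly(release));   // simulates a stuck request
                } catch (RejectedExecutionException e) {
                    rejected++;                                  // pool and queue are full
                }
            }
        } finally {
            release.countDown();    // let the blocked tasks finish
            pool.shutdown();
        }
        return rejected;
    }

    private static void awaitQuietly(CountDownLatch latch) {
        try {
            latch.await();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }
}
```

With 8 workers and a queue of 4, only 12 stuck requests are ever accepted; anything beyond that fails fast, which makes the underlying problem visible sooner and leaves less to recover from.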

The traces below do not show the problem.  If you want to send a full dump, I 
am willing to look at it.  It is possible that the problem had resolved by the 
time you took this dump.  What you show below is normal for a lot of worker 
threads.  WorkerThread206 is waiting for a new request, WorkerThread207 is idle 
waiting for something to do in the future.

Chuck


On 2012-09-10, at 8:03 AM, Maik Musall wrote:

 Hi,
 
 in an app with high concurrency, the app sometimes becomes unresponsive to 
 everything but DirectActions at the time of day with the most concurrency. 
 No users are seeing responses any more. In jstack I see hundreds of these:
 
 WorkerThread207 prio=5 tid=131e0a800 nid=0x151aa2000 waiting for monitor 
 entry [151aa1000]
   java.lang.Thread.State: BLOCKED (on object monitor)
  at java.net.PlainSocketImpl.accept(PlainSocketImpl.java:406)
  - waiting to lock 20d3da450 (a java.net.SocksSocketImpl)
  at java.net.ServerSocket.implAccept(ServerSocket.java:462)
  at java.net.ServerSocket.accept(ServerSocket.java:430)
  at 
 com.webobjects.appserver._private.WOWorkerThread.run(WOWorkerThread.java:210)
  at java.lang.Thread.run(Thread.java:680)
 
 all waiting on the same lock 20d3da450, and one thread holding that lock:
 
 WorkerThread206 prio=5 tid=131d79800 nid=0x15199f000 runnable [15199e000]
   java.lang.Thread.State: RUNNABLE
  at java.net.PlainSocketImpl.socketAccept(Native Method)
  at java.net.PlainSocketImpl.accept(PlainSocketImpl.java:408)
  - locked 20d3da450 (a java.net.SocksSocketImpl)
  at java.net.ServerSocket.implAccept(ServerSocket.java:462)
  at java.net.ServerSocket.accept(ServerSocket.java:430)
  at 
 com.webobjects.appserver._private.WOWorkerThread.run(WOWorkerThread.java:210)
  at java.lang.Thread.run(Thread.java:680)
 
 Anyone familiar with this problem?
 
 Maik

-- 
Chuck Hill Senior Consultant / VP Development

Practical WebObjects - for developers who want to increase their overall 
knowledge of WebObjects or who are trying to solve specific problems.
http://www.global-village.net/gvc/practical_webobjects

Re: WOWorkerThread deadlocks

2013-01-15 Thread Michael Gargano

On Jan 15, 2013, at 2:54 PM, Chuck Hill wrote:

 
 On 2013-01-15, at 10:50 AM, Michael Gargano wrote:
 
 Hi guys,
 
  I know I'm a little late on this, but I'm also seeing the same 
 behavior.  I don't think it's a long-running query, because I'm logging long 
 queries in Postgres and nothing runs over 10 seconds.  Can you explain why 
 having a max of 256 worker threads is too high?
 
 http://osdir.com/ml/web.webobjects.admin/2005-02/msg6.html
 
 Keep in mind that you have 256 threads all trying to do something that 
 usually sooner or later needs a single threaded EOF lock.  That is just not 
 going to make for happy users.

Thanks.  I'll take a look at this.

 
 
 Any other things I should look at?  The customers are not happy!
 
 Cut down the number of worker threads and the listen queue size.  It won't 
 fix the problem but at least (a) you will see it sooner and (b) the app can 
 recover.
 
 
 My last problem did turn out to be a bunch of deadlocks, which all now seem 
 to be resolved.  It had to do with setting 
 er.extensions.ERXObjectStoreCoordinatorPool.maxCoordinators=4, which should 
 be seamless (you would think) but causes issues with fetch specs that have 
 EOs crossing OSCs.  
 
 Why on earth would an EO ever cross an OSC?  They don't even cross ECs.

A page creates a new EC and gets an EO.  That EO is passed around and ends up 
on another (or the same) page where another EC is created.  When a fetchSpec is 
run against the new EC but the other EO is used as part of the fetchSpec, the 
two ECs can be associated with two different OSCs: the one behind the new EC 
just created and the one behind the EC of the EO we already hold a reference 
to.  Once I called localInstance on every EO used like that, all the deadlocks 
went away.

 
 
 I had to pull all EOs local, which seems like something that should be 
 handled inside Wonder automatically (so I consider it a bug, though whether 
 it is one could be argued, I guess).  Anyway, after those all got fixed, I'm 
 now running into this.  It is much harder to figure out since I don't even 
 know what the lock is held on.
 
 sudo jstack -F <process id>
 will show you if it is a deadlock.  Otherwise it is likely bad exception 
 handling that results in your code doing a lock() and never doing an unlock().
 

No deadlocks are being detected and I don't see any either.  I see the same 
thing Maik saw: all the worker threads are waiting on a lock held by one worker 
thread, which is in a runnable state and awaiting a socket accept.  I searched 
all the code and there is no manual locking anywhere; everything goes through 
Wonder's autolocking.
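
If attaching jstack is awkward, roughly the same check can be made from inside the JVM. The sketch below is only an illustration, assuming Java 8 or later and a JVM that supports synchronizer monitoring (HotSpot does); DeadlockProbe and its method names are made up for this example and are not part of WebObjects or Wonder. It asks the standard ThreadMXBean for deadlocked threads (monitors and java.util.concurrent locks) and counts live threads per state, which may help tell a pile of idle workers blocked in accept() from threads stuck on an application lock.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadInfo;
import java.lang.management.ThreadMXBean;
import java.util.ArrayList;
import java.util.EnumMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper for illustration; not part of WebObjects or Wonder.
class DeadlockProbe {

    /** Names of threads the JVM reports as deadlocked; empty if none. */
    static List<String> deadlockedThreadNames() {
        ThreadMXBean mx = ManagementFactory.getThreadMXBean();
        long[] ids = mx.findDeadlockedThreads(); // null when no deadlock is found
        List<String> names = new ArrayList<>();
        if (ids == null) {
            return names;
        }
        for (ThreadInfo info : mx.getThreadInfo(ids)) {
            if (info != null) { // a thread may have exited in the meantime
                names.add(info.getThreadName());
            }
        }
        return names;
    }

    /** Counts live threads per state, e.g. how many are BLOCKED vs RUNNABLE. */
    static Map<Thread.State, Integer> threadCountsByState() {
        Map<Thread.State, Integer> counts = new EnumMap<>(Thread.State.class);
        for (Thread t : Thread.getAllStackTraces().keySet()) {
            counts.merge(t.getState(), 1, Integer::sum);
        }
        return counts;
    }
}
```

An empty list from deadlockedThreadNames() together with a large BLOCKED count would be consistent with the situation described here: many waiting workers but no deadlock cycle.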


 
 Chuck
 
 
 BTW Chuck and Quinton, I owe you guys a beer.  Thanks for pointing me in the 
 right direction on the last problem.
 
 Thanks for any help.
 -Mike
 

Re: WOSessionStore deadlocks - SOLVED

2012-11-08 Thread Maik Musall
Hi Chuck,

a follow-up on this:

Am 19.10.2012 um 20:05 schrieb Chuck Hill ch...@global-village.net:

 Hi Maik,
 
 This can also indicate some other things too:
 - session did not get checked in (app threw OutOfMemory, sleep() threw an 
 exception)
 - previous request for this session is still running (deadlock, waiting, 
 infinite loop)
 - 2+ requests for the same session in rapid sequence where the first 
 terminates the session

Looks like my answer that OutOfMemory would be OutOfTheQuestion was not true. 
I have now discovered what led to my application hanging every afternoon, after 
it finally cared to log a proper message *once* before hanging:

java.util.concurrent.ExecutionException: java.lang.OutOfMemoryError: PermGen 
space

Doh, the PermGen. I totally forgot about that. I had the app at -Xmx24576m, but 
didn't adjust PermGen. Now with a PermGen limit of 512m (of which currently 
about 154m gets used max according to jvisualvm) everything is finally running 
smoothly. The app turns out to load about 12000 classes over a workday. I think 
I need to have a look at what those are sometime...

Maik
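
For keeping an eye on this before it turns into a hang, the JVM exposes class-loading and memory-pool figures through the standard java.lang.management API. What follows is a minimal sketch, assuming Java 8 or later syntax; ClassSpaceReport is a hypothetical name. Pool names are JVM-specific: HotSpot up to Java 7 uses names containing "Perm Gen", while Java 8 and later replaced PermGen with Metaspace (limited with -XX:MaxMetaspaceSize instead of -XX:MaxPermSize), so the sketch matches both and may return nothing on other JVMs.

```java
import java.lang.management.ClassLoadingMXBean;
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryPoolMXBean;
import java.lang.management.MemoryUsage;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper for illustration only.
class ClassSpaceReport {

    /** Total number of classes loaded since the JVM started. */
    static long totalLoadedClasses() {
        ClassLoadingMXBean cl = ManagementFactory.getClassLoadingMXBean();
        return cl.getTotalLoadedClassCount();
    }

    /** One line per memory pool that holds class metadata. */
    static List<String> classMetadataPools() {
        List<String> lines = new ArrayList<>();
        for (MemoryPoolMXBean pool : ManagementFactory.getMemoryPoolMXBeans()) {
            String name = pool.getName();
            // "Perm Gen" on HotSpot up to Java 7, "Metaspace" from Java 8 on.
            if (!name.contains("Perm Gen") && !name.contains("Metaspace")) {
                continue;
            }
            MemoryUsage usage = pool.getUsage();
            if (usage == null) { // pool is no longer valid
                continue;
            }
            long usedMb = usage.getUsed() / (1024 * 1024);
            // getMax() returns -1 when no limit is defined
            String max = usage.getMax() < 0
                    ? "unlimited"
                    : (usage.getMax() / (1024 * 1024)) + "m";
            lines.add(name + ": used=" + usedMb + "m max=" + max);
        }
        return lines;
    }
}
```

Logging both values periodically would show whether the loaded-class count keeps climbing over a workday, as the roughly 12000 classes mentioned above suggest.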



 
 

Re: WOSessionStore deadlocks - SOLVED

2012-11-08 Thread Farrukh Ijaz
Hi Maik,

Use -XX:MaxPermSize=<your-desired-size-in-mb>m, for example -XX:MaxPermSize=512m.

Farrukh


Re: WOSessionStore deadlocks - SOLVED

2012-11-08 Thread Maik Musall
Hi Farrukh,

Uh... I think I described exactly that, namely my success in doing so (although 
I did not mention the syntax)?

Maik



WOSessionStore deadlocks

2012-10-19 Thread Maik Musall
Hi,

I recently discovered what may be responsible for frequent deadlocks of an 
application here. In the jstack -l output, I see almost all threads waiting 
on a single ReentrantLock, and this thread is what holds that lock:


WorkerThread4 prio=5 tid=103bc9000 nid=0x132caf000 in Object.wait() [132cae000]
   java.lang.Thread.State: WAITING (on object monitor)
     at java.lang.Object.wait(Native Method)
     - waiting on 22711d098 (a com.webobjects.appserver.WOSessionStore$TimeoutEntry)
     at java.lang.Object.wait(Object.java:485)
     at com.webobjects.appserver.WOSessionStore.checkOutSessionWithID(WOSessionStore.java:191)
     - locked 22711d098 (a com.webobjects.appserver.WOSessionStore$TimeoutEntry)
     at com.webobjects.appserver.WOApplication.restoreSessionWithID(WOApplication.java:1913)
     at er.extensions.appserver.ERXApplication.restoreSessionWithID(ERXApplication.java:2440)
     at er.extensions.appserver.ERXComponentRequestHandler._dispatchWithPreparedApplication(ERXComponentRequestHandler.java:260)
     at er.extensions.appserver.ERXComponentRequestHandler._handleRequest(ERXComponentRequestHandler.java:302)
     at er.extensions.appserver.ERXComponentRequestHandler.handleRequest(ERXComponentRequestHandler.java:377)
     at com.webobjects.appserver.WOApplication.dispatchRequest(WOApplication.java:1687)
     at er.extensions.appserver.ERXApplication.dispatchRequestImmediately(ERXApplication.java:2139)
     at er.extensions.appserver.ERXApplication.dispatchRequest(ERXApplication.java:2104)
     at com.webobjects.appserver._private.WOWorkerThread.runOnce(WOWorkerThread.java:144)
     at com.webobjects.appserver._private.WOWorkerThread.run(WOWorkerThread.java:226)
     at java.lang.Thread.run(Thread.java:680)

   Locked ownable synchronizers:
     - 20ce7bbc0 (a java.util.concurrent.locks.ReentrantLock$NonfairSync)


Now, ERXApplication.restoreSessionWithID contains an interesting call to 
useSessionStoreDeadlockDetection(), but this detection only works in single 
threaded mode. I'm afraid I can't afford to switch off concurrent requests even 
for a testing period in production.

I'm looking for someone with experience regarding this problem. The doc for 
that method mentions that it could help to find cases when a session is 
checked out twice in a single RR-loop, which will lead to a session store 
lockup. Since I cannot switch on this detection, what in your experience could 
lead to that happening?

Thanks
Maik


Re: WOSessionStore deadlocks

2012-10-19 Thread Chuck Hill
Hi Maik,

This can also indicate some other things too:
- session did not get checked in (app threw OutOfMemory, sleep() threw an 
exception)
- previous request for this session is still running (deadlock, waiting, 
infinite loop)
- 2+ requests for the same session in rapid sequence where the first terminates 
the session


Chuck
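
The first cause in the list (a session that never gets checked in because something threw) can be sketched with a toy store. This is not the WOSessionStore implementation: ToySessionStore and its methods are hypothetical and only model "one request per session at a time" with a semaphore, assuming Java 8 or later. The point is that the check-in sits in a finally block, so an exception in the request does not leave the session checked out for every later request.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.Semaphore;
import java.util.concurrent.TimeUnit;

// Toy stand-in for a session store; NOT how WOSessionStore is implemented.
class ToySessionStore {
    private final ConcurrentHashMap<String, Semaphore> permits = new ConcurrentHashMap<>();

    /** Waits up to timeoutMillis for the session; false means it is still checked out. */
    boolean checkOut(String sessionId, long timeoutMillis) {
        Semaphore s = permits.computeIfAbsent(sessionId, id -> new Semaphore(1));
        try {
            return s.tryAcquire(timeoutMillis, TimeUnit.MILLISECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // preserve the interrupt for callers
            return false;
        }
    }

    /** Makes the session available again; call exactly once per successful checkOut. */
    void checkIn(String sessionId) {
        Semaphore s = permits.get(sessionId);
        if (s != null) {
            s.release();
        }
    }

    /** Runs work with the session checked out and always checks it back in. */
    boolean handleRequest(String sessionId, Runnable work) {
        if (!checkOut(sessionId, 100)) {
            return false; // a previous request still holds the session
        }
        try {
            work.run();
            return true;
        } finally {
            checkIn(sessionId); // runs even when work throws
        }
    }
}
```

Calling checkOut twice for the same session without a checkIn in between makes the second call time out, which is the toy equivalent of the worker thread waiting in checkOutSessionWithID in the dump above.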




Re: WOWorkerThread deadlocks

2012-09-14 Thread Susanne Schneider

Hi Chuck,

many thanks for your answer.

Am 13.09.2012 20:06, schrieb Chuck Hill:

Hi Susanne,


On 2012-09-13, at 8:57 AM, Susanne Schneider wrote:


Hi all,

please allow me to add one question regarding this interesting topic.

Alexis Tual (my mail client has problems with correct quoting) has suggested 
for EOF background handling:
<snip>
ec.lock();
try {
    // huge loop to compute stats
    for (int i = 0; i < 100; i++) {
        // doing stuff with ec...
        // cycling the ec
        if (i % 100 == 0) {
            ec.unlock();
            ec.dispose();
            ec = newEditingContextForMyWork();
            ec.lock();
        }
    }
} finally {
    ec.unlock();
}
</snip>

Now my question: is it correct to dispose the ec after unlock, or would it be 
better to do this beforehand, like:

ec.dispose();
ec.unlock();


It is correct to unlock it before disposing it.

Good to know, we will do it this way.
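
The unlock-then-dispose ordering from the loop above can be tried out without EOF. The following is a sketch under stated assumptions: WorkContext is a hypothetical stand-in for the three EOEditingContext calls used in the loop (lock, unlock, dispose), ContextCycler and RecordingContext are made-up names, and Java 8 or later is assumed. Each batch gets a fresh context that is locked, used, unlocked and only then disposed, with the release in a finally block so the context currently in use is the one that gets unlocked.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Supplier;

// Hypothetical stand-in for the EOEditingContext calls used in the loop.
interface WorkContext {
    void lock();
    void unlock();
    void dispose();
}

class ContextCycler {
    /**
     * Runs `steps` iterations, using a fresh context for every `batchSize` steps.
     * Each context is locked, used, unlocked, then disposed, in that order.
     */
    static void run(int steps, int batchSize, Supplier<WorkContext> factory) {
        if (batchSize <= 0) {
            throw new IllegalArgumentException("batchSize must be positive");
        }
        for (int start = 0; start < steps; start += batchSize) {
            WorkContext ctx = factory.get();
            ctx.lock();
            try {
                int end = Math.min(start + batchSize, steps);
                for (int i = start; i < end; i++) {
                    // ... do the work for step i with ctx ...
                }
            } finally {
                ctx.unlock();  // released even if a step throws
                ctx.dispose(); // only after the lock has been released
            }
        }
    }
}

// Records calls so the ordering can be checked.
class RecordingContext implements WorkContext {
    static final List<String> log = new ArrayList<>();
    public void lock()    { log.add("lock"); }
    public void unlock()  { log.add("unlock"); }
    public void dispose() { log.add("dispose"); }
}
```

Running ContextCycler.run(250, 100, RecordingContext::new) goes through three contexts, each recording lock, unlock, dispose in that order.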




If I turn on the ec-lock logging in my application, there are many remarks from the 
Finalizers like: *** EOEditingContext: access with no lock: _eoForGID()! Is 
this a real problem or can it be ignored?


I am not sure, can you send the full stack trace?

There is no real exception, just the logging message. We have turned on 
debugging with


   NSLog.debug.setAllowedDebugLevel(NSLog.DebugLevelInformational);
   NSLog.allowDebugLoggingForGroups(NSLog.DebugGroupMultithreading);
   EOObjectStore._resetAssertLock();

in the application constructor because we were experiencing sporadic 
deadlocks and hoped to get some information about any EC locking problem 
that way. Besides other information (about real unlocked EC usage), this 
results in messages like


   [120726 18:54:07] DEBUG Finalizer com.webobjects - *** EOEditingContext: access with no lock: _eoForGID()!


at random intervals (whenever garbage collection runs). There seems to 
be nothing else related to this message. Explicitly disposing any 
local EC seems to help with this particular message. But because I am 
not so familiar with the EOF internals, I was not sure whether this is a real 
problem or just overly chatty logging.


Best regards.
Susanne
--
Susanne Schneider
Coordinator secuTrial Development

iAS interActive Systems GmbH
Dieffenbachstraße 33 c, D-10967 Berlin

fon    +49(0)30 22 50 50 - 498
fax    +49(0)30 22 50 50 - 451
mail   susanne.schnei...@interactive-systems.de
web    http://www.interActive-Systems.de


Geschäftsführer: Dr. Marko Reschke, Thomas Fritzsche
Sitz der Gesellschaft: Berlin
Amtsgericht Berlin Charlottenburg, HRB 106103B


Re: WOWorkerThread deadlocks

2012-09-14 Thread Chuck Hill
I think you can safely ignore warnings from the finalizer.


On 2012-09-14, at 9:28 AM, Susanne Schneider wrote:

 Hi Chuck,
 
 many thanks for your answer.
 
 Am 13.09.2012 20:06, schrieb Chuck Hill:
 Hi Susanne,
 
 
 On 2012-09-13, at 8:57 AM, Susanne Schneider wrote:
 
 Hi all,
 
 please allow me to add one question regarding this interesting topic.
 
 Alexis Tual (my mail client has problem with correct quoting) has suggested 
 for EOF background handling:
 snip
 ec.lock();
 try {
   // huge loop to compute stats
   for (i = 0; i  100; i++) {
// doing stuff with ec...
   // cycling the ec
if (i % 100 == 0) {
   ec.unlock();
   ec.dispose();
   ec = newEditingContextForMyWork();
   ec.lock();
}
   }
 } finally {
   ec.unlock();
 }
 /snip
 
 Now my question: is it correct to dispose the ec after unlock or would it 
 be better to do this beforehand, like:
 
 ec.dipsose();
 ec.unlock();
 
 It is correct to unlock it before disposing it.
 Good to know, we will do it this way.
 
 
 If I turn on the ec-lock logging in my application, there are many remarks 
 from the Finalizers like: *** EOEditingContext: access with no lock: 
 _eoForGID()! Is this a real problem or can it be ignored?
 
 I am not sure, can you send the full stack trace?
 
 There is nor real exception, just the logging message. We have turned on 
 debugging with
 
   NSLog.debug.setAllowedDebugLevel(NSLog.DebugLevelInformational);
   NSLog.allowDebugLoggingForGroups(NSLog.DebugGroupMultithreading);
   EOObjectStore._resetAssertLock();
 
 in the application constructor because we were experiencing sporadic 
 deadlocks and hoped to get some information of any EC locking problem that 
 way. Besides other information (about real unlocked ec usage) this results in 
 messages like
 
   [120726 18:54:07] DEBUG Finalizer com.webobjects - *** EOEditingContext: 
 access with no lock: _eoForGID()!
 
 at random intervals (whenever garbage collection runs). There seems to 
 be nothing else related to this message. Explicitly disposing any local ec seems 
 to help with this particular message. But because I am not so familiar with 
 the EOF internals, I was not sure if this is a real problem or just overly 
 chatty logging.
 
 Best regards.
 Susanne
 -- 
 Susanne Schneider
 Coordinator secuTrial Development
 
 iAS interActive Systems GmbH
 Dieffenbachstraße 33 c, D-10967 Berlin
 
 fon   +49(0)30 22 50 50 - 498
 fax   +49(0)30 22 50 50 - 451
 mail  susanne.schnei...@interactive-systems.de
 web   http://www.interActive-Systems.de
 
 Managing Directors: Dr. Marko Reschke, Thomas Fritzsche
 Registered office: Berlin
 Commercial register: Amtsgericht Berlin Charlottenburg, HRB 106103B
 

-- 
Chuck Hill Senior Consultant / VP Development

Practical WebObjects - for developers who want to increase their overall 
knowledge of WebObjects or who are trying to solve specific problems.
http://www.global-village.net/gvc/practical_webobjects

 ___
Do not post admin requests to the list. They will be ignored.
Webobjects-dev mailing list  (Webobjects-dev@lists.apple.com)
Help/Unsubscribe/Update your Subscription:
https://lists.apple.com/mailman/options/webobjects-dev/archive%40mail-archive.com

This email sent to arch...@mail-archive.com

Re: WOWorkerThread deadlocks

2012-09-13 Thread Maik Musall
Hi Alex, Hi Chuck,

On 13.09.2012 at 02:28, Chuck Hill ch...@global-village.net wrote:
 Never out of memory. The app is allowed to grow up to 24 GByte, stays in 
 the 1-4 GByte range in normal use and occasionally grows up to 12 GByte 
 with the most advanced statistics that tend to suck in the whole database.
 
 That's also the reason though that I can't use separate EOF stacks for the 
 statistics, because as soon as there were more than one of them, I'd have 
 multiple 10 GByte-ish snapshot caches. The server has 48 GByte and I don't 
 really want to hit that limit... and with separate stacks, it also would be 
 difficult to keep the stats reflect current changes in the other stacks.
 
 
 I am not sure about the background threads (depends on how long OSC locks 
 are held), but using ECs sharing the same EOF stack with regular requests is 
 likely to cause problems like you are seeing.
 
 Do you mean that the application would be unresponsive while the lock was 
 held in the background thread, or that simply doing it that way will lead to 
 unrecoverable deadlocks? 
 If you do massive fetches in the background, that will block other requests 
 as the only OSC is locked.
 
 Correct.
 
 
 That said, I think (and correct me if I'm wrong) if you lock the ec but do 
 not fetch anything with this ec, other ecs can still access the db.
 
 Also correct.  The lock contention is only when fetching or saving.  It can 
 also happen if your code (or Wonder code that you are using) locks something 
 in EOControl or EOAccess.


I'm very familiar with that stuff, and my users know how it feels to wait for 
that lock :-)

 Anyway, the best practice is to use a dedicated OSC to do background work.
 Maik, you should use a dedicated OSC for your stats, and try, if possible to 
 clean memory, for example :
 
 ec.lock();
 try {
    // huge loop to compute stats
    for (i = 0; i < 100; i++) {
       // doing stuff with ec...
       // cycling the ec
       if (i % 100 == 0) {
          ec.unlock();
          ec.dispose();
          ec = newEditingContextForMyWork();
          ec.lock();
       }
    }
 } finally {
    ec.unlock();
 }
 
 If practical (I recall that it is not in Maik's case) that can be a good way 
 of limiting memory usage.

Right, not practical for me. I even rely on those statistics to fill the 
snapshot cache with data that other users will need in a minute anyway to speed 
up overall response times. Those statistics are not strictly background 
processes; they are user interactions that happen to be implemented in a worker 
thread while the user is shown a long response page.

What I've done to improve concurrent response times while those stats fetch 
their 30 EOs: I fetch them in batches of a few 1000 and release the lock in 
between. This is the method I can call on my manual-locking editing context 
between batches:

public void shortLockRelease() {
    unlock();
    try {
        Thread.sleep( 50 );
    } catch( InterruptedException e ) {
        e.printStackTrace();
    } finally {
        lock();
    }
}

This effectively gives other threads the opportunity to sneak in a few 
transactions before the stats worker resumes grabbing the OSC's resources, and 
is enough to keep response times within a reasonable limit. Users feel it when 
stats are running, but they don't have to really wait any more. I've even tuned 
those 50 ms. Less than that and you don't get the desired effect. More than that 
and you needlessly increase the stats execution time.
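
The batching idea described above can be sketched roughly as follows. This is only an illustration under stated assumptions: a plain java.util.concurrent.locks.ReentrantLock stands in for the manually locked editing context, and the names BatchedLockDemo, processInBatches, batchSize and pauseMillis are made up for the example. The sketch also restores the interrupt flag instead of printing the stack trace, which is a small deviation from the method shown above.

```java
import java.util.concurrent.locks.ReentrantLock;

// Hypothetical stand-in: a ReentrantLock plays the role of the editing
// context lock. No EOF or Wonder classes are involved.
class BatchedLockDemo {
    private final ReentrantLock lock = new ReentrantLock();
    private int releases = 0;

    // Same shape as shortLockRelease() above: give up the lock briefly so
    // that other threads can get in, then take it back.
    void shortLockRelease(long pauseMillis) {
        lock.unlock();
        try {
            Thread.sleep(pauseMillis);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // keep the interrupt visible
        } finally {
            lock.lock();
            releases++;
        }
    }

    // Processes totalItems in batches and releases the lock between batches.
    // Returns how many times the lock was released.
    int processInBatches(int totalItems, int batchSize, long pauseMillis) {
        lock.lock();
        try {
            for (int start = 0; start < totalItems; start += batchSize) {
                int end = Math.min(start + batchSize, totalItems);
                // ... fetch and process items [start, end) here ...
                if (end < totalItems) {
                    shortLockRelease(pauseMillis);
                }
            }
            return releases;
        } finally {
            lock.unlock();
        }
    }

    boolean heldByCurrentThread() {
        return lock.isHeldByCurrentThread();
    }

    public static void main(String[] args) {
        BatchedLockDemo demo = new BatchedLockDemo();
        // Four batches (3000, 3000, 3000, 1000), so three releases: prints 3
        System.out.println(demo.processInBatches(10_000, 3_000, 50));
    }
}
```

The pause length and batch size are the two knobs; the 50 ms mentioned above was found by tuning on one particular app and is not a general recommendation.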

Maik


Re: WOWorkerThread deadlocks

2012-09-13 Thread Alexis Tual
2012/9/13 Maik Musall m...@selbstdenker.ag

 Hi Alex, Hi Chuck,

 On 13.09.2012 at 02:28, Chuck Hill ch...@global-village.net wrote:
  Never out of memory. The app is allowed to grow up to 24 GByte, stays
 in the 1-4 GByte range in normal use and occasionally grows up to 12 GByte
 with the most advanced statistics that tend to suck in the whole database.
 
  That's also the reason though that I can't use separate EOF stacks for
 the statistics, because as soon as there were more than one of them, I'd
 have multiple 10 GByte-ish snapshot caches. The server has 48 GByte and I
 don't really want to hit that limit... and with separate stacks, it also
 would be difficult to keep the stats reflect current changes in the other
 stacks.
 
 
  I am not sure about the background threads (depends on how long OSC
 locks are held), but using ECs sharing the same EOF stack with regular
 requests is likely to cause problems like you are seeing.
 
  Do you mean that the application would be unresponsive while the lock
 was held in the background thread, or that simply doing it that way will
 lead to unrecoverable deadlocks?
  If you do massive fetches in the background, that will block other
 requests as the only OSC is locked.
 
  Correct.
 
 
  That said, I think (and correct me if I'm wrong) if you lock the ec but
 do not fetch anything with this ec, other ecs can still access the db.
 
  Also correct.  The lock contention is only when fetching or saving.  It
 can also happen if your code (or Wonder code that you are using) locks
 something in EOControl or EOAccess.


 I'm very familiar with that stuff, and my users know how it feels to wait
 for that lock :-)

  Anyway, the best practice is to use a dedicated OSC to do background
 work.
  Maik, you should use a dedicated OSC for your stats, and try, if
 possible to clean memory, for example :
 
  ec.lock();
  try {
     // huge loop to compute stats
     for (i = 0; i < 100; i++) {
        // doing stuff with ec...
        // cycling the ec
        if (i % 100 == 0) {
           ec.unlock();
           ec.dispose();
           ec = newEditingContextForMyWork();
           ec.lock();
        }
     }
  } finally {
     ec.unlock();
  }
 
  If practical (I recall that it is not in Maik's case) that can be a good
 way of limiting memory usage.

 Right, not practical for me. I even rely on those statistics to fill the
 snapshot cache with data that other users will need in a minute anyway to
 speed up overall response times. Those statistics are not strictly
 background processes, they are user interaction that happens to be
 implemented in a worker thread while the user is displayed a long response
 page.

 What I've done to improve concurrent response times while those stats
 fetch their 30 EOs: I fetch them in batches of a few 1000 and release
 the lock in between. This is the method I can call on my manual-locking
 editing context between batches:

 public void shortLockRelease() {
     unlock();
     try {
         Thread.sleep( 50 );
     } catch( InterruptedException e ) {
         e.printStackTrace();
     } finally {
         lock();
     }
 }

 This effectively gives other threads the opportunity to sneak in a few
 transactions before the stats worker resumes grabbing the OSC's resources,
 and is enough to keep response times within a reasonable limit. Users feel
 it when stats are running, but they don't have to really wait any more.
 I've even tuned those 50 ms. Less than that and you don't get the desired
 effect. More than that and you needlessly increase the stats execution time.


Interesting setup, thanks for sharing. It looks like one giant VM (and EOF)
can handle this number of objects!
If the DB is touched by this app only, you could fetch all the stats at
startup... but I imagine it is more complicated than that :)

Alex


Re: WOWorkerThread deadlocks

2012-09-13 Thread Susanne Schneider

Hi all,

please allow me to add one question regarding this interesting topic.

Alexis Tual (my mail client has problems with correct quoting) has 
suggested for EOF background handling:

<snip>
ec.lock();
try {
   // huge loop to compute stats
   for (i = 0; i < 100; i++) {
      // doing stuff with ec...
      // cycling the ec
      if (i % 100 == 0) {
         ec.unlock();
         ec.dispose();
         ec = newEditingContextForMyWork();
         ec.lock();
      }
   }
} finally {
   ec.unlock();
}
</snip>

Now my question: is it correct to dispose the ec after unlock or would 
it be better to do this beforehand, like:


ec.dispose();
ec.unlock();

If I turn on the ec-lock logging in my application, there are many 
remarks from the Finalizers like: *** EOEditingContext: access with no 
lock: _eoForGID()! Is this a real problem or can it be ignored?


Best regards,
Susanne


Re: WOWorkerThread deadlocks

2012-09-13 Thread Chuck Hill
Hi Susanne,


On 2012-09-13, at 8:57 AM, Susanne Schneider wrote:

 Hi all,
 
 please allow me to add one question regarding this interesting topic.
 
 Alexis Tual (my mail client has problem with correct quoting) has suggested 
 for EOF background handling:
 <snip>
 ec.lock();
 try {
    // huge loop to compute stats
    for (i = 0; i < 100; i++) {
       // doing stuff with ec...
       // cycling the ec
       if (i % 100 == 0) {
          ec.unlock();
          ec.dispose();
          ec = newEditingContextForMyWork();
          ec.lock();
       }
    }
 } finally {
    ec.unlock();
 }
 </snip>
 
 Now my question: is it correct to dispose the ec after unlock or would it be 
 better to do this beforehand, like:
 
   ec.dispose();
   ec.unlock();

It is correct to unlock it before disposing it.

 If I turn on the ec-lock logging in my application, there are many remarks 
 from the Finalizers like: *** EOEditingContext: access with no lock: 
 _eoForGID()! Is this a real problem or can it be ignored?

I am not sure, can you send the full stack trace?


Chuck




Re: WOWorkerThread deadlocks

2012-09-12 Thread John Huss

  The state the app was in when I took that jstack was that no login was
 possible and user's requests would not return, ultimately running into no
 instance responses after the timeout elapsed.
 
  Grep the app logs for OutOfMemory, that is one possibility.  They look
 ready to accept connections.  It could also be that they got so back logged
 that wotaskd gave up on them and decided they were dead.  Having the lower
 numbers above should help in this respect - the app will be able to recover
 more quickly.
 
  Never out of memory. The app is allowed to grow up to 24 GByte, stays in
 the 1-4 GByte range in normal use and occasionally grows up to 12 GByte
 with the most advanced statistics that tend to suck in the whole database.
 
  That's also the reason though that I can't use separate EOF stacks for
 the statistics, because as soon as there were more than one of them, I'd
 have multiple 10 GByte-ish snapshot caches. The server has 48 GByte and I
 don't really want to hit that limit... and with separate stacks, it also
 would be difficult to keep the stats reflect current changes in the other
 stacks.


 I am not sure about the background threads (depends on how long OSC locks
 are held), but using ECs sharing the same EOF stack with regular requests
 is likely to cause problems like you are seeing.


Do you mean that the application would be unresponsive while the lock was
held in the background thread, or that simply doing it that way will lead
to unrecoverable deadlocks?


Re: WOWorkerThread deadlocks

2012-09-12 Thread Alexis Tual
Hi,

2012/9/13 John Huss johnth...@gmail.com

  The state the app was in when I took that jstack was that no login was
 possible and user's requests would not return, ultimately running into no
 instance responses after the timeout elapsed.
 
  Grep the app logs for OutOfMemory, that is one possibility.  They look
 ready to accept connections.  It could also be that they got so back logged
 that wotaskd gave up on them and decided they were dead.  Having the lower
 numbers above should help in this respect - the app will be able to recover
 more quickly.
 
  Never out of memory. The app is allowed to grow up to 24 GByte, stays
 in the 1-4 GByte range in normal use and occasionally grows up to 12 GByte
 with the most advanced statistics that tend to suck in the whole database.
 
  That's also the reason though that I can't use separate EOF stacks for
 the statistics, because as soon as there were more than one of them, I'd
 have multiple 10 GByte-ish snapshot caches. The server has 48 GByte and I
 don't really want to hit that limit... and with separate stacks, it also
 would be difficult to keep the stats reflect current changes in the other
 stacks.


 I am not sure about the background threads (depends on how long OSC locks
 are held), but using ECs sharing the same EOF stack with regular requests
 is likely to cause problems like you are seeing.


 Do you mean that the application would be unresponsive while the lock was
 held in the background thread, or that simply doing it that way will lead
 to unrecoverable deadlocks?

If you do massive fetches in the background, that will block other requests
as the only OSC is locked.
That said, I think (and correct me if I'm wrong) if you lock the ec but do
not fetch anything with this ec, other ecs can still access the db.
Anyway, the best practice is to use a dedicated OSC to do background work.
Maik, you should use a dedicated OSC for your stats, and try, if possible,
to free memory along the way, for example:

ec.lock();
try {
   // huge loop to compute stats
   for (i = 0; i < 100; i++) {
      // doing stuff with ec...
      // cycling the ec
      if (i % 100 == 0) {
         ec.unlock();
         ec.dispose();
         ec = newEditingContextForMyWork();
         ec.lock();
      }
   }
} finally {
   ec.unlock();
}
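
A slightly more defensive variant of the loop above might look like the sketch below. It is not taken from Wonder or from anyone's production code. WorkContext is a hypothetical interface that only mirrors the lock(), unlock() and dispose() calls used above, CountingContext is a test double, and the factory parameter plays the role of newEditingContextForMyWork(). The point it tries to show: unlock before dispose, and never let the finally block unlock a context that has already been unlocked and disposed, which could happen above if creating the replacement context threw an exception.

```java
import java.util.function.Supplier;

// Hypothetical minimal view of an editing context. EOEditingContext offers
// lock(), unlock() and dispose(); everything else here is made up.
interface WorkContext {
    void lock();
    void unlock();
    void dispose();
}

// Test double that only counts calls:
// counts[0] = locks, counts[1] = unlocks, counts[2] = disposes.
class CountingContext implements WorkContext {
    private final int[] counts;
    CountingContext(int[] counts) { this.counts = counts; }
    public void lock()    { counts[0]++; }
    public void unlock()  { counts[1]++; }
    public void dispose() { counts[2]++; }
}

class CyclingContextDemo {
    // Runs `iterations` steps and swaps in a fresh context every
    // `cycleEvery` steps. Returns the number of contexts created.
    static int run(Supplier<WorkContext> factory, int iterations, int cycleEvery) {
        int created = 1;
        WorkContext ec = factory.get();
        ec.lock();
        try {
            for (int i = 1; i <= iterations; i++) {
                // ... do the work for step i with ec ...
                if (i % cycleEvery == 0 && i < iterations) {
                    WorkContext old = ec;
                    ec = null;      // nothing for finally to unlock during the swap
                    old.unlock();   // unlock first ...
                    old.dispose();  // ... then dispose
                    WorkContext fresh = factory.get();
                    fresh.lock();
                    ec = fresh;
                    created++;
                }
            }
        } finally {
            if (ec != null) {
                ec.unlock();
                ec.dispose();
            }
        }
        return created;
    }

    public static void main(String[] args) {
        int[] counts = new int[3];
        int created = run(() -> new CountingContext(counts), 250, 100);
        // prints "3 contexts, 3 locks, 3 unlocks, 3 disposes"
        System.out.println(created + " contexts, " + counts[0] + " locks, "
                + counts[1] + " unlocks, " + counts[2] + " disposes");
    }
}
```

Counting from 1 rather than 0 avoids recycling the context before any work has been done with it.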

Alex


Re: WOWorkerThread deadlocks

2012-09-12 Thread Chuck Hill
Hi John,


On 2012-09-12, at 7:13 AM, John Huss wrote:

  The state the app was in when I took that jstack was that no login was 
  possible and user's requests would not return, ultimately running into 
  no instance responses after the timeout elapsed.
 
  Grep the app logs for OutOfMemory, that is one possibility.  They look 
  ready to accept connections.  It could also be that they got so back 
  logged that wotaskd gave up on them and decided they were dead.  Having 
  the lower numbers above should help in this respect - the app will be able 
  to recover more quickly.
 
  Never out of memory. The app is allowed to grow up to 24 GByte, stays in 
  the 1-4 GByte range in normal use and occasionally grows up to 12 GByte 
  with the most advanced statistics that tend to suck in the whole database.
 
  That's also the reason though that I can't use separate EOF stacks for the 
  statistics, because as soon as there were more than one of them, I'd have 
  multiple 10 GByte-ish snapshot caches. The server has 48 GByte and I don't 
  really want to hit that limit... and with separate stacks, it also would be 
  difficult to keep the stats reflect current changes in the other stacks.
 
 
 I am not sure about the background threads (depends on how long OSC locks are 
 held), but using ECs sharing the same EOF stack with regular requests is 
 likely to cause problems like you are seeing.
 
 Do you mean that the application would be unresponsive while the lock was 
 held in the background thread, or that simply doing it that way will lead to 
 unrecoverable deadlocks?

I meant that when the EC locks the OSC (e.g. during fetches and saves), it 
blocks all other requests that also need to lock the OSC.  If the background 
thread's locks on the OSC are very short in duration (and also not happening 
constantly), they would have little effect on the other requests.  However, that 
is not what background processing is usually used for.
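
The effect can be reproduced in miniature with plain JDK classes. The sketch below is only an analogy: a ReentrantLock stands in for the lock on the shared OSC, the "background" thread holds it the way a long fetch would, and the calling thread plays a request that needs the same lock. SharedOscLockDemo and probe are made-up names, and nothing here touches EOF.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.locks.ReentrantLock;

class SharedOscLockDemo {
    // Returns { acquiredWhileBackgroundHeldTheLock, acquiredAfterRelease }.
    static boolean[] probe() {
        ReentrantLock oscLock = new ReentrantLock(); // stand-in for the OSC lock
        CountDownLatch holding = new CountDownLatch(1);
        CountDownLatch release = new CountDownLatch(1);

        Thread background = new Thread(() -> {
            oscLock.lock();
            try {
                holding.countDown();
                release.await(); // simulates a long fetch or save
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            } finally {
                oscLock.unlock();
            }
        }, "background-stats");
        background.start();

        try {
            holding.await();
            // A "request" needing the same lock cannot get it right now.
            boolean whileHeld = oscLock.tryLock();
            if (whileHeld) {
                oscLock.unlock();
            }
            release.countDown();
            background.join();
            // Once the background work lets go, the request proceeds.
            boolean afterRelease = oscLock.tryLock(1, TimeUnit.SECONDS);
            if (afterRelease) {
                oscLock.unlock();
            }
            return new boolean[] { whileHeld, afterRelease };
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        boolean[] result = probe();
        // prints "while held: false, after release: true"
        System.out.println("while held: " + result[0] + ", after release: " + result[1]);
    }
}
```

A real request thread would block in lock() rather than give up as tryLock() does here, which is what users experience as waiting.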

Chuck



Re: WOWorkerThread deadlocks

2012-09-12 Thread Chuck Hill

On 2012-09-12, at 2:58 PM, Alexis Tual wrote:

 Hi,
 
 2012/9/13 John Huss johnth...@gmail.com
  The state the app was in when I took that jstack was that no login was 
  possible and user's requests would not return, ultimately running into 
  no instance responses after the timeout elapsed.
 
  Grep the app logs for OutOfMemory, that is one possibility.  They look 
  ready to accept connections.  It could also be that they got so back 
  logged that wotaskd gave up on them and decided they were dead.  Having 
  the lower numbers above should help in this respect - the app will be able 
  to recover more quickly.
 
  Never out of memory. The app is allowed to grow up to 24 GByte, stays in 
  the 1-4 GByte range in normal use and occasionally grows up to 12 GByte 
  with the most advanced statistics that tend to suck in the whole database.
 
  That's also the reason though that I can't use separate EOF stacks for the 
  statistics, because as soon as there were more than one of them, I'd have 
  multiple 10 GByte-ish snapshot caches. The server has 48 GByte and I don't 
  really want to hit that limit... and with separate stacks, it also would be 
  difficult to keep the stats reflect current changes in the other stacks.
 
 
 I am not sure about the background threads (depends on how long OSC locks are 
 held), but using ECs sharing the same EOF stack with regular requests is 
 likely to cause problems like you are seeing.
 
 Do you mean that the application would be unresponsive while the lock was 
 held in the background thread, or that simply doing it that way will lead to 
 unrecoverable deadlocks? 
 If you do massive fetches in the background, that will block other requests 
 as the only OSC is locked.

Correct.


 That said, I think (and correct me if I'm wrong) if you lock the ec but do 
 not fetch anything with this ec, other ecs can still access the db.

Also correct.  The lock contention is only when fetching or saving.  It can 
also happen if your code (or Wonder code that you are using) locks something in 
EOControl or EOAccess.



 Anyway, the best practice is to use a dedicated OSC to do background work.
 Maik, you should use a dedicated OSC for your stats, and try, if possible to 
 clean memory, for example :
 
 ec.lock();
 try {
    // huge loop to compute stats
    for (i = 0; i < 100; i++) {
       // doing stuff with ec...
       // cycling the ec
       if (i % 100 == 0) {
          ec.unlock();
          ec.dispose();
          ec = newEditingContextForMyWork();
          ec.lock();
       }
    }
 } finally {
    ec.unlock();
 }

If practical (I recall that it is not in Maik's case) that can be a good way of 
limiting memory usage.


Chuck



Re: WOWorkerThread deadlocks

2012-09-11 Thread Maik Musall
Hi Chuck,

On 10.09.2012 at 22:30, Chuck Hill ch...@global-village.net wrote:

 The state the app was in when I took that jstack was that no login was 
 possible and user's requests would not return, ultimately running into no 
 instance responses after the timeout elapsed.
 
 Grep the app logs for OutOfMemory, that is one possibility.  They look ready 
 to accept connections.  It could also be that they got so back logged that 
 wotaskd gave up on them and decided they were dead.  Having the lower numbers 
 above should help in this respect - the app will be able to recover more 
 quickly.

Never out of memory. The app is allowed to grow up to 24 GByte, stays in the 
1-4 GByte range in normal use and occasionally grows up to 12 GByte with the 
most advanced statistics that tend to suck in the whole database.

That's also the reason though that I can't use separate EOF stacks for the 
statistics, because as soon as there were more than one of them, I'd have 
multiple 10 GByte-ish snapshot caches. The server has 48 GByte and I don't 
really want to hit that limit... and with separate stacks, it also would be 
difficult to make the stats reflect current changes in the other stacks.

Maik




Re: WOWorkerThread deadlocks

2012-09-11 Thread Maik Musall
Hi Alexis,

On 10.09.2012 at 23:19, Alexis Tual alexis.t...@gmail.com wrote:

 Note that I recently switched to Wonder for this project (using all the 
 Wonder base classes), and since I did, this problem occurred more frequently. 
 It's now almost once a day, and was about once a week before. I switched from 
 MultiECLockManager to ERXEC with autolocking in the process.
 
 I've seen you have long response pages, have you turned off autolocking for 
 these special cases ?

Good point. I just checked: those are simple WOLongResponsePages that don't 
hold anything regarding EOF, just wait for the background worker thread to 
notify when it's done. The background workers all use manual locking, but some 
of them don't explicitly use my manual locking EC factory but use an 
autolocking EC and do manual locking on top. I'll correct that, thanks.

Maik




Re: WOWorkerThread deadlocks

2012-09-11 Thread Maik Musall

On 11.09.2012 at 09:10, Maik Musall m...@selbstdenker.ag wrote:

 Hi Alexis,
 
 On 10.09.2012 at 23:19, Alexis Tual alexis.t...@gmail.com wrote:
 
 Note that I recently switched to Wonder for this project (using all the 
 Wonder base classes), and since I did, this problem occurred more 
 frequently. It's now almost once a day, and was about once a week before. I 
 switched from MultiECLockManager to ERXEC with autolocking in the process.
 
 I've seen you have long response pages, have you turned off autolocking for 
 these special cases ?
 
 Good point. I just checked: those are simple WOLongResponsePages that don't 
 hold anything regarding EOF, just wait for the background worker thread to 
 notify when it's done. The background workers all use manual locking, but 
 some of them don't explicitly use my manual locking EC factory but use an 
 autolocking EC and do manual locking on top. I'll correct that, thanks.

Hmm, it seems I have the choice between
* using manual locking only in those background worker threads
* dropping the manual locks and relying on autolocking for them.

Worker threads are all implemented like this:

public void run() {
  localEC.lock();
  try {
    // heavy duty fetches, batchfetches, filtering and stuff that can take a minute
  } finally {
    localEC.unlock();
  }
}

What would you recommend? My ERXEC-subclass-factory can give me either type.

Maik


Re: WOWorkerThread deadlocks

2012-09-11 Thread Alexis Tual
Use manual locking for your background threads. Your snippet is right; just be
sure your localEC has autolock set to false.
Check out Kerian's presentation at WOWODC 2011:
http://www.wocommunity.org/podcasts/wowodc/2011/BackgroundTasks.mov
and the examples:
https://github.com/projectwonder/wonder/tree/master/Examples/Misc/BackgroundTasks

Good luck,

Alex

2012/9/11 Maik Musall m...@selbstdenker.ag


 On 11.09.2012 at 09:10, Maik Musall m...@selbstdenker.ag wrote:

 Hi Alexis,

 On 10.09.2012 at 23:19, Alexis Tual alexis.t...@gmail.com wrote:

 Note that I recently switched to Wonder for this project (using all the
 Wonder base classes), and since I did, this problem occurred more
 frequently. It's now almost once a day, and was about once a week before. I
 switched from MultiECLockManager to ERXEC with autolocking in the process.


 I've seen you have long response pages, have you turned off autolocking
 for these special cases ?


 Good point. I just checked: those are simple WOLongResponsePages that
 don't hold anything regarding EOF, just wait for the background worker
 thread to notify when it's done. The background workers all use manual
 locking, but some of them don't explicitly use my manual locking EC factory
 but use an autolocking EC and do manual locking on top. I'll correct that,
 thanks.


 Hmm, seems I have the choice between
 * use manual locking only in those background worker threads
 * drop manual locks and rely on autolocking for them.

 Worker threads are all implemented like this:

 public void run() {
   localEC.lock();
   try {
     // heavy duty fetches, batchfetches, filtering and stuff that can take a minute
   } finally {
     localEC.unlock();
   }
 }

 What would you recommend? My ERXEC-subclass-factory can give me either
 type.

 Maik



Re: WOWorkerThread deadlocks

2012-09-11 Thread Chuck Hill

On 2012-09-10, at 11:15 PM, Maik Musall wrote:

 Hi Chuck,
 
 On 10.09.2012 at 22:30, Chuck Hill ch...@global-village.net wrote:
 
 The state the app was in when I took that jstack was that no login was 
 possible and user's requests would not return, ultimately running into no 
 instance responses after the timeout elapsed.
 
 Grep the app logs for OutOfMemory, that is one possibility.  They look ready 
 to accept connections.  It could also be that they got so back logged that 
 wotaskd gave up on them and decided they were dead.  Having the lower 
 numbers above should help in this respect - the app will be able to recover 
 more quickly.
 
 Never out of memory. The app is allowed to grow up to 24 GByte, stays in the 
 1-4 GByte range in normal use and occasionally grows up to 12 GByte with the 
 most advanced statistics that tend to suck in the whole database.
 
 That's also the reason though that I can't use separate EOF stacks for the 
 statistics, because as soon as there were more than one of them, I'd have 
 multiple 10 GByte-ish snapshot caches. The server has 48 GByte and I don't 
 really want to hit that limit... and with separate stacks, it also would be 
 difficult to keep the stats reflect current changes in the other stacks.


I am not sure about the background threads (depends on how long OSC locks are 
held), but using ECs sharing the same EOF stack with regular requests is likely 
to cause problems like you are seeing.

Chuck




WOWorkerThread deadlocks

2012-09-10 Thread Maik Musall
Hi,

in an app with high concurrency, the app sometimes becomes unresponsive to 
everything but DirectActions at the time of day with the most concurrency. No 
user sees any responses any more. In jstack I see hundreds of these:

 WorkerThread207 prio=5 tid=131e0a800 nid=0x151aa2000 waiting for monitor entry [151aa1000]
    java.lang.Thread.State: BLOCKED (on object monitor)
       at java.net.PlainSocketImpl.accept(PlainSocketImpl.java:406)
       - waiting to lock <20d3da450> (a java.net.SocksSocketImpl)
       at java.net.ServerSocket.implAccept(ServerSocket.java:462)
       at java.net.ServerSocket.accept(ServerSocket.java:430)
       at com.webobjects.appserver._private.WOWorkerThread.run(WOWorkerThread.java:210)
       at java.lang.Thread.run(Thread.java:680)

all waiting on the same lock 20d3da450, and one thread holding that lock:

 WorkerThread206 prio=5 tid=131d79800 nid=0x15199f000 runnable [15199e000]
   java.lang.Thread.State: RUNNABLE
   at java.net.PlainSocketImpl.socketAccept(Native Method)
   at java.net.PlainSocketImpl.accept(PlainSocketImpl.java:408)
   - locked 20d3da450 (a java.net.SocksSocketImpl)
   at java.net.ServerSocket.implAccept(ServerSocket.java:462)
   at java.net.ServerSocket.accept(ServerSocket.java:430)
   at com.webobjects.appserver._private.WOWorkerThread.run(WOWorkerThread.java:210)
   at java.lang.Thread.run(Thread.java:680)

Anyone familiar with this problem?

Maik


Re: WOWorkerThread deadlocks

2012-09-10 Thread Miguel Arroz
Hi,

  Isn't that normal? Only one thread can be accepting at any time; when it 
accepts a connection, it releases the lock so the next one can enter the 
accept state. I think those are not the threads you are looking for…

  Regards,

Miguel Arroz
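
The behaviour described above can be reproduced outside WebObjects with a small sketch. It assumes Java 8 or later; the class name AcceptPoolDemo and its method are hypothetical and exist only for illustration. Several threads share one ServerSocket and all call accept(). At any moment one of them sits inside accept() and the rest wait their turn, yet every incoming connection is still served. On the JDK that produced the trace in this thread the waiting threads show up as BLOCKED on the socket implementation; newer JDKs may report them differently.

```java
import java.io.IOException;
import java.net.InetAddress;
import java.net.ServerSocket;
import java.net.Socket;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical demo class, not part of WebObjects.
final class AcceptPoolDemo {

    /**
     * Starts `workers` threads that all accept on the same server socket,
     * opens `clients` connections, and returns how many were handled.
     */
    static int run(int workers, int clients) {
        AtomicInteger handled = new AtomicInteger();
        CountDownLatch done = new CountDownLatch(clients);
        try (ServerSocket server = new ServerSocket(0)) { // port 0: any free port
            for (int i = 0; i < workers; i++) {
                Thread t = new Thread(() -> {
                    while (true) {
                        // Idle workers queue up here; a thread dump taken now
                        // shows one thread accepting and the others waiting.
                        try (Socket s = server.accept()) {
                            handled.incrementAndGet();
                            done.countDown();
                        } catch (IOException e) {
                            return; // server socket closed: stop this worker
                        }
                    }
                }, "DemoWorker" + i);
                t.setDaemon(true);
                t.start();
            }
            for (int i = 0; i < clients; i++) {
                try (Socket c = new Socket(InetAddress.getLoopbackAddress(),
                                           server.getLocalPort())) {
                    // connect and close immediately
                }
            }
            done.await(5, TimeUnit.SECONDS);
        } catch (IOException | InterruptedException e) {
            throw new IllegalStateException(e);
        }
        return handled.get();
    }
}
```

Calling AcceptPoolDemo.run(4, 20) and taking a thread dump while it runs should show the same "one accepting, the rest waiting" picture, which is why such frames alone do not point to a deadlock.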





Re: WOWorkerThread deadlocks

2012-09-10 Thread Chuck Hill
Hi Maik,

WorkerThread207: that many worker threads indicates two things to me:
1. Your app's worker thread configuration is too high.  I'd use a max of 6-10 
and a listen queue size of around 4 (adjusted to suit your specific needs).  A 
WO app is very, very unlikely to recover from a 200 worker thread backlog in 
any way that is useful to the users.

2. You have a thread that is taking a long time to return a result.  If you are 
dispatching requests concurrently, then this is most likely stuck in 
EOControl/EOAccess (e.g. waiting for a slow query result) or connecting to some 
external process.  You could also have a deadlock.  If you are not dispatching 
requests concurrently, then this delay could be in other code.

The traces below do not show the problem.  If you want to send a full dump, I 
am willing to look at it.  It is possible that the problem had resolved by the 
time you took this dump.  What you show below is normal for a lot of worker 
threads.  WorkerThread206 is waiting for a new request, WorkerThread207 is idle 
waiting for something to do in the future.

Chuck
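
For the "you could also have a deadlock" case, the JVM can be asked directly. The following is a minimal sketch, assuming Java 8 or later; the class DeadlockProbe and its method names are hypothetical. It uses the standard ThreadMXBean.findDeadlockedThreads() call, which reports cycles of threads blocked on object monitors or java.util.concurrent locks. A lock that parks its waiters in Object.wait() would likely not be reported by this check, so an empty result does not prove the absence of a deadlock.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadInfo;
import java.lang.management.ThreadMXBean;
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.locks.ReentrantLock;

// Hypothetical helper, not part of WebObjects or Wonder.
final class DeadlockProbe {

    /** Names of the threads the JVM currently reports as deadlocked. */
    static List<String> deadlockedThreadNames() {
        ThreadMXBean mx = ManagementFactory.getThreadMXBean();
        long[] ids = mx.findDeadlockedThreads(); // null when no deadlock is found
        List<String> names = new ArrayList<>();
        if (ids != null) {
            for (ThreadInfo info : mx.getThreadInfo(ids)) {
                if (info != null) {
                    names.add(info.getThreadName());
                }
            }
        }
        return names;
    }

    /** Polls until a deadlock is reported or the timeout elapses. */
    static List<String> awaitDeadlock(long timeoutMillis) {
        long deadline = System.currentTimeMillis() + timeoutMillis;
        List<String> names = deadlockedThreadNames();
        while (names.isEmpty() && System.currentTimeMillis() < deadline) {
            try {
                Thread.sleep(50);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                break;
            }
            names = deadlockedThreadNames();
        }
        return names;
    }

    /** Demonstration only: two daemon threads take two locks in opposite order. */
    static void createDemoDeadlock() {
        ReentrantLock a = new ReentrantLock();
        ReentrantLock b = new ReentrantLock();
        CountDownLatch bothHoldFirst = new CountDownLatch(2);
        startDaemon("demo-deadlock-1", a, b, bothHoldFirst);
        startDaemon("demo-deadlock-2", b, a, bothHoldFirst);
    }

    private static void startDaemon(String name, ReentrantLock first,
                                    ReentrantLock second, CountDownLatch latch) {
        Thread t = new Thread(() -> {
            first.lock();
            latch.countDown();
            try {
                latch.await();  // wait until both threads hold their first lock
                second.lock();  // never succeeds: the other thread holds it
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }, name);
        t.setDaemon(true);
        t.start();
    }
}
```

In a real application only deadlockedThreadNames() would be of interest, for example logged from a monitoring thread; createDemoDeadlock() is there solely to show what a positive result looks like.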





Re: WOWorkerThread deadlocks

2012-09-10 Thread Maik Musall
Hi Chuck,

Am 10.09.2012 um 19:23 schrieb Chuck Hill ch...@global-village.net:

 Hi Maik,
 
 WorkerThread207 that many worker threads indicates two things to me:
 1. Your app configuration is too high.  I'd use a max of 6-10 and a listen 
 queue size of around 4 (adjusted to suit your specific needs).  A WO app is 
 very, very unlikely to recover from a 200 worker thread backlog in any way 
 that is useful to the users

You may be right, they were at 16/512/8/128. I just set them to 4/8/8/6 and am 
eager to watch the behaviour tomorrow.

There are up to 100 users concurrently (it's a backoffice app), although 
concurrently running requests are typically not more than 2-3, plus 1-2 
DirectActions, plus possibly 1-2 long response pages running statistics stuff.

 2. You have a thread that is taking a long time to return a result.  If you 
 are dispatching requests concurrently, then this is most likely stuck in 
 EOControl/EOAccess (e.g. waiting for a slow query result) or connecting to 
 some external process.  You could also have a deadlock.  If you are not 
 dispatching requests concurrently, then this delay could be in other code.

When that situation occurs, the app is not using CPU any more, neither is the 
database. It often doesn't respond to SIGTERM any more and needs SIGKILL to 
terminate so we can restart.

 The traces below do not show the problem.  If you want to send a full dump, I 
 am willing to look at it.  It is possible that the problem had resolved by 
 the time you took this dump.  What you show below is normal for a lot of 
 worker threads.  WorkerThread206 is waiting for a new request, 
 WorkerThread207 is idle waiting for something to do in the future.

Thanks for the offer; here is the full jstack output:
http://akaihi.selbstdenker.com/~maik/jstack_powerd_20120910.txt

Maik





Re: WOWorkerThread deadlocks

2012-09-10 Thread Chuck Hill
Hi Maik,


On 2012-09-10, at 11:04 AM, Maik Musall wrote:

 Hi Chuck,
 
 Am 10.09.2012 um 19:23 schrieb Chuck Hill ch...@global-village.net:
 
 Hi Maik,
 
 WorkerThread207 that many worker threads indicates two things to me:
 1. Your app configuration is too high.  I'd use a max of 6-10 and a listen 
 queue size of around 4 (adjusted to suit your specific needs).  A WO app is 
 very, very unlikely to recover from a 200 worker thread backlog in any way 
 that is useful to the users
 
 You may be right, they were at 16/512/8/128. I just set them to 4/8/8/6 and 
 am eager to watch the behaviour tomorrow.

You should at least know when there is a problem sooner.  Then as quickly as 
you can, get a thread dump with jstack.

 
 There are up to 100 users concurrently (it's a backoffice app), although 
 concurrently running requests are typically not more than 2-3, plus 1-2 
 DirectActions, plus possibly 1-2 long response pages running statistics stuff.

OK, the 4/8/8/6 numbers you have seem reasonable for that load.


 2. You have a thread that is taking a long time to return a result.  If you 
 are dispatching requests concurrently, then this is most likely stuck in 
 EOControl/EOAccess (e.g. waiting for a slow query result) or connecting to 
 some external process.  You could also have a deadlock.  If you are not 
 dispatching requests concurrently, then this delay could be in other code.
 
 When that situation occurs, the app is not using CPU any more, neither is the 
 database. It often doesn't respond to SIGTERM any more and needs SIGKILL to 
 terminate so we can restart.

That sounds like what a blocked non-daemon thread would cause.


 The traces below do not show the problem.  If you want to send a full dump, 
 I am willing to look at it.  It is possible that the problem had resolved by 
 the time you took this dump.  What you show below is normal for a lot of 
 worker threads.  WorkerThread206 is waiting for a new request, 
 WorkerThread207 is idle waiting for something to do in the future.
 
 Thanks for the offer; here is the full jstack output:
 http://akaihi.selbstdenker.com/~maik/jstack_powerd_20120910.txt

Other than having a large number of idle worker threads, there is nothing in 
that trace that indicates the problem.  In my experience, that means that the 
problem has resolved itself and the application recovered.  You will need to 
run jstack closer to the start of the problem to capture what is going wrong.


Chuck






Re: WOWorkerThread deadlocks

2012-09-10 Thread Maik Musall
Hi Chuck,

Am 10.09.2012 um 21:35 schrieb Chuck Hill ch...@global-village.net:
 WorkerThread207 that many worker threads indicates two things to me:
 1. Your app configuration is too high.  I'd use a max of 6-10 and a listen 
 queue size of around 4 (adjusted to suit your specific needs).  A WO app is 
 very, very unlikely to recover from a 200 worker thread backlog in any way 
 that is useful to the users
 
 You may be right, they were at 16/512/8/128. I just set them to 4/8/8/6 and 
 am eager to watch the behaviour tomorrow.
 
 You should at least know when there is a problem sooner.  Then as quickly as 
 you can, get a thread dump with jstack.
 
 
 There are up to 100 users concurrently (it's a backoffice app), although 
 concurrently running requests are typically not more than 2-3, plus 1-2 
 DirectActions, plus possibly 1-2 long response pages running statistics 
 stuff.
 
 OK, the 4/8/8/6 numbers you have seem reasonable for that load.
 
 
 2. You have a thread that is taking a long time to return a result.  If you 
 are dispatching requests concurrently, then this is most likely stuck in 
 EOControl/EOAccess (e.g. waiting for a slow query result) or connecting to 
 some external process.  You could also have a deadlock.  If you are not 
 dispatching requests concurrently, then this delay could be in other code.
 
 When that situation occurs, the app is not using CPU any more, neither is 
 the database. It often doesn't respond to SIGTERM any more and needs SIGKILL 
 to terminate so we can restart.
 
 That sounds like what a blocked non-daemon thread would cause.
 
 
 The traces below do not show the problem.  If you want to send a full dump, 
 I am willing to look at it.  It is possible that the problem had resolved 
 by the time you took this dump.  What you show below is normal for a lot of 
 worker threads.  WorkerThread206 is waiting for a new request, 
 WorkerThread207 is idle waiting for something to do in the future.
 
 Thanks for the offer; here is the full jstack output:
 http://akaihi.selbstdenker.com/~maik/jstack_powerd_20120910.txt
 
 Other than having a large number of idle worker threads, there is nothing in 
 that trace that indicates the problem.  In my experience, that means that 
 they problem has resolved itself and the application recovered.  You will 
 need to run jstack closer to the start of the problem even to capture what is 
 going wrong.

The state the app was in when I took that jstack was that no login was possible 
and users' requests would not return, ultimately running into "no instance" 
responses after the timeout elapsed.

If the problem persists, I think I'll set up a cronjob to record jstacks every 
couple of minutes or so.

Note that I recently switched to Wonder for this project (using all the Wonder 
base classes), and since I did, this problem occurred more frequently. It's now 
almost once a day, and was about once a week before. I switched from 
MultiECLockManager to ERXEC with autolocking in the process.

Maik
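
As an alternative to an external cronjob, periodic dumps can also be recorded from inside the JVM. This is only a sketch, assuming Java 8 or later; the class ThreadDumpRecorder and its methods are hypothetical, while ThreadMXBean.dumpAllThreads(boolean, boolean) is a standard JDK call. An in-process recorder may stall together with the application (for example after an OutOfMemoryError), so an external jstack remains the more robust option.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadInfo;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

// Hypothetical helper, not part of WebObjects or Wonder.
final class ThreadDumpRecorder {

    /** Renders the name, state and full stack of every live thread. */
    static String dump() {
        StringBuilder sb = new StringBuilder();
        ThreadInfo[] infos =
            ManagementFactory.getThreadMXBean().dumpAllThreads(true, true);
        for (ThreadInfo info : infos) {
            sb.append('"').append(info.getThreadName()).append("\" ")
              .append(info.getThreadState()).append('\n');
            for (StackTraceElement frame : info.getStackTrace()) {
                sb.append("    at ").append(frame).append('\n');
            }
            sb.append('\n');
        }
        return sb.toString();
    }

    /** Writes a dump to System.err every `periodSeconds` on a daemon thread. */
    static ScheduledExecutorService start(long periodSeconds) {
        ScheduledExecutorService scheduler =
            Executors.newSingleThreadScheduledExecutor(r -> {
                Thread t = new Thread(r, "thread-dump-recorder");
                t.setDaemon(true); // must not keep the JVM alive on shutdown
                return t;
            });
        scheduler.scheduleAtFixedRate(
            () -> System.err.println(dump()),
            periodSeconds, periodSeconds, TimeUnit.SECONDS);
        return scheduler;
    }
}
```

A call such as ThreadDumpRecorder.start(120) at application startup would record a dump every two minutes; the interval is an arbitrary choice here.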




Re: WOWorkerThread deadlocks

2012-09-10 Thread Chuck Hill
Hi Maik,


On 2012-09-10, at 1:07 PM, Maik Musall wrote:

 Hi Chuck,
 
 Am 10.09.2012 um 21:35 schrieb Chuck Hill ch...@global-village.net:
 WorkerThread207 that many worker threads indicates two things to me:
 1. Your app configuration is too high.  I'd use a max of 6-10 and a listen 
 queue size of around 4 (adjusted to suit your specific needs).  A WO app 
 is very, very unlikely to recover from a 200 worker thread backlog in any 
 way that is useful to the users
 
 You may be right, they were at 16/512/8/128. I just set them to 4/8/8/6 and 
 am eager to watch the behaviour tomorrow.
 
 You should at least know when there is a problem sooner.  Then as quickly as 
 you can, get a thread dump with jstack.
 
 
 There are up to 100 users concurrently (it's a backoffice app), although 
 concurrently running requests are typically not more than 2-3, plus 1-2 
 DirectActions, plus possibly 1-2 long response pages running statistics 
 stuff.
 
 OK, the 4/8/8/6 numbers you have seem reasonable for that load.
 
 
 2. You have a thread that is taking a long time to return a result.  If 
 you are dispatching requests concurrently, then this is most likely stuck 
 in EOControl/EOAccess (e.g. waiting for a slow query result) or connecting 
 to some external process.  You could also have a deadlock.  If you are not 
 dispatching requests concurrently, then this delay could be in other code.
 
 When that situation occurs, the app is not using CPU any more, neither is 
 the database. It often doesn't respond to SIGTERM any more and needs 
 SIGKILL to terminate so we can restart.
 
 That sounds like what a blocked non-daemon thread would cause.
 
 
 The traces below do not show the problem.  If you want to send a full 
 dump, I am willing to look at it.  It is possible that the problem had 
 resolved by the time you took this dump.  What you show below is normal 
 for a lot of worker threads.  WorkerThread206 is waiting for a new 
 request, WorkerThread207 is idle waiting for something to do in the future.
 
 Thanks for the offer; here is the full jstack output:
 http://akaihi.selbstdenker.com/~maik/jstack_powerd_20120910.txt
 
 Other than having a large number of idle worker threads, there is nothing in 
 that trace that indicates the problem.  In my experience, that means that 
 they problem has resolved itself and the application recovered.  You will 
 need to run jstack closer to the start of the problem even to capture what 
 is going wrong.
 
 The state the app was in when I took that jstack was that no login was 
 possible and user's requests would not return, ultimately running into no 
 instance responses after the timeout elapsed.

Grep the app logs for OutOfMemory; that is one possibility.  The worker threads 
look ready to accept connections.  It could also be that they got so backlogged 
that wotaskd gave up on them and decided they were dead.  Having the lower 
numbers above should help in this respect - the app will be able to recover 
more quickly.


 If the problem persists, I think I'll set up a cronjob to record jstacks 
 every couple of minutes or so.

That might be one way, unless you can babysit it and start grabbing them when 
the number of active worker threads goes up.


 Note that I recently switched to Wonder for this project (using all the 
 Wonder base classes), and since I did, this problem occurred more frequently. 
 It's now almost once a day, and was about once a week before. I switched from 
 MultiECLockManager to ERXEC with autolocking in the process.


I don't have any suggestions on how that change might cause this to happen more 
often.

Chuck




Re: WOWorkerThread deadlocks

2012-09-10 Thread Alexis Tual
Hi,

2012/9/11 Maik Musall m...@selbstdenker.ag



 Note that I recently switched to Wonder for this project (using all the
 Wonder base classes), and since I did, this problem occurred more
 frequently. It's now almost once a day, and was about once a week before. I
 switched from MultiECLockManager to ERXEC with autolocking in the process.


I've seen you have long response pages. Have you turned off autolocking for
these special cases?
To help diagnose, you could write a little script that polls your app every 10
seconds and, if the response contains "No instance available", runs jstack on
the process...

Alex
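
The polling idea above could look roughly like this. It is a sketch under several assumptions: Java 9 or later (for InputStream.readAllBytes), a jstack binary on the PATH, and a URL and process id supplied by the caller. The class InstanceWatchdog and its method names are hypothetical; the marker text and the 10 second interval come from the suggestion above.

```java
import java.io.File;
import java.io.IOException;
import java.io.InputStream;
import java.net.HttpURLConnection;
import java.net.URL;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.List;

// Hypothetical watchdog, not part of WebObjects or Wonder.
final class InstanceWatchdog {

    static final String MARKER = "No instance available";

    /** True when the response body looks like the adaptor's failure page. */
    static boolean indicatesOutage(String body) {
        return body != null && body.contains(MARKER);
    }

    /** Command line for a thread dump with lock details (jstack -l). */
    static List<String> jstackCommand(long pid) {
        return Arrays.asList("jstack", "-l", Long.toString(pid));
    }

    /** Fetches the page body; error pages are read from the error stream. */
    static String fetch(String url) throws IOException {
        HttpURLConnection c = (HttpURLConnection) new URL(url).openConnection();
        c.setConnectTimeout(5000);
        c.setReadTimeout(5000);
        InputStream in = c.getResponseCode() >= 400
            ? c.getErrorStream() : c.getInputStream();
        if (in == null) {
            return "";
        }
        try (InputStream s = in) {
            return new String(s.readAllBytes(), StandardCharsets.UTF_8);
        }
    }

    /** Polls every 10 seconds and writes a jstack file when the marker appears. */
    static void watch(String url, long pid) throws InterruptedException {
        while (true) {
            try {
                if (indicatesOutage(fetch(url))) {
                    File out = new File("jstack-" + System.currentTimeMillis() + ".txt");
                    new ProcessBuilder(jstackCommand(pid))
                        .redirectErrorStream(true)
                        .redirectOutput(out)
                        .start()
                        .waitFor();
                }
            } catch (IOException e) {
                System.err.println("poll failed: " + e); // keep polling
            }
            Thread.sleep(10_000);
        }
    }
}
```

Keeping the outage check and the command construction as separate small methods makes it easy to adjust the marker text or the jstack options without touching the polling loop.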


Re: WOWorkerThread deadlocks

2012-09-10 Thread Chuck Hill
That is a good point!  And also make sure that the long requests and background 
threads are not using the main EOF stack.

Chuck 

 


Re: Deadlocks in one of our apps

2010-06-04 Thread Pascal Robert

...

- It's a (non-public) online store. When people log in, we
create an order in memory and customers add order items to the
order. We don't store anything in the DB until the payment is
made with PayFlow. When we get the response from PayFlow, we
store a copy of the order (and the items) to our Oracle db.
After that, we contact our SQL Server db (actually, an accounting
system, and we send the data to a stored procedure), and we get
the invoice number produced by the accounting system and store
it in the order EO in Oracle.


So in summary :

- People log in, we create an order EO, the EO is created in the
session's editing context

- People add items to the order
- They start the order payment steps
- Long response page kicks in
- We contact PayFlow to make the payment
- If the payment is successful, we store the order in Oracle
- We create a new EO, in a different EC, for SQL Server
- We update the order EO to store the invoice number in Oracle
- We generate the invoice in PDF (FOXML, generated in a separate JVM)

- Long response page is done, pageForResult is called

Everything is done in session.defaultEditingContext EXCEPT the  
SQL Server EOs,


You are not using the session.defaultEditingContext in the long  
response page, are you?  I am pretty sure that is an excellent  
source of deadlocks.


Hmm, yes, we do use it in the long response page... But since
localInstanceOfObject won't let me have a copy in a new EC, what
are the options except not using the session EC?


Not using the session EC would be a good choice.  Make a different  
EC. Pass it into the long response page.  Be careful handing off  
locking.


You could also save the order in an unpaid state, then fetch it  
in the long response page and update it if paid, or delete it if not.


Ooh, yeah, you could do that too.


So we end up doing :

- before the long response page is called, we save the EO in an unpaid
state


- in the method called in performAction, we create a new editing  
context, we lock it and we insert a copy of the order EO into the new  
EC by using localInstanceOfObject


- the bulk of the job is done in a try {} finally { ec.unlock() }

- when the method that is called inside performAction has done its
job, the order EO is sent back


- we override the order EO that was stored in the session default  
editing context with the one that was inserted in the temporary EC :



((Session)session()).setCommande((Commande)EOUtilities.localInstanceOfObject(session().defaultEditingContext(), (Commande)copieCommande));
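
The lock / try / finally step in the list above can be sketched independently of WebObjects. The classes below are stand-ins invented for illustration: StandInContext is NOT EOEditingContext, it only mimics its lock() and unlock() pair with a java.util.concurrent lock, and LockedWork is a hypothetical helper. The point of the sketch is that the context is unlocked on every exit path, including when the payment or accounting call throws.

```java
import java.util.concurrent.locks.ReentrantLock;
import java.util.function.Supplier;

// Stand-in for an editing context; NOT the WebObjects EOEditingContext class.
final class StandInContext {
    private final ReentrantLock lock = new ReentrantLock();

    void lock() {
        lock.lock();
    }

    void unlock() {
        lock.unlock();
    }

    boolean isLockedByCurrentThread() {
        return lock.isHeldByCurrentThread();
    }
}

// Hypothetical helper showing the lock / try / finally shape.
final class LockedWork {

    /** Runs the job while holding the context lock and always releases it. */
    static <T> T inLockedContext(StandInContext ctx, Supplier<T> job) {
        ctx.lock();
        try {
            return job.get(); // the bulk of the long-running work goes here
        } finally {
            ctx.unlock();     // runs on normal return and on exceptions
        }
    }
}
```

With the real API the same shape applies to the temporary editing context created for the long response page, while the session's default editing context stays untouched until the result is handed back.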


So far, so good. We will see later today when the load gets higher if
we get any deadlocks.


Many thanks to all who helped out :-)


Pascal Robert
prob...@macti.ca

AIM: MacTICanada
Twitter : MacTICanada
LinkedIn : http://www.linkedin.com/in/macti
WO Community profile : http://wocommunity.org/page/member?name=probert



Re: Deadlocks in one of our apps

2010-06-04 Thread Chuck Hill


On Jun 4, 2010, at 4:40 AM, Pascal Robert wrote:


...

- It's a (non-public) online store. When people log in, we
create an order in memory and customers add order items to the
order. We don't store anything in the DB until the payment is
made with PayFlow. When we get the response from PayFlow, we
store a copy of the order (and the items) to our Oracle db.
After that, we contact our SQL Server db (actually, an
accounting system, and we send the data to a stored procedure),
and we get the invoice number produced by the accounting system
and store it in the order EO in Oracle.


So in summary :

- People log in, we create an order EO, the EO is created in the
session's editing context

- People add items to the order
- They start the order payment steps
- Long response page kicks in
- We contact PayFlow to make the payment
- If the payment is successful, we store the order in Oracle
- We create a new EO, in a different EC, for SQL Server
- We update the order EO to store the invoice number in Oracle
- We generate the invoice in PDF (FOXML, generated in a separate JVM)

- Long response page is done, pageForResult is called

Everything is done in session.defaultEditingContext EXCEPT the  
SQL Server EOs,


You are not using the session.defaultEditingContext in the long  
response page, are you?  I am pretty sure that is an excellent  
source of deadlocks.


Hmm, yes, we do use it in the long response page... But since
localInstanceOfObject won't let me have a copy in a new EC, what
are the options except not using the session EC?


Not using the session EC would be a good choice.  Make a different  
EC. Pass it into the long response page.  Be careful handing off  
locking.


You could also save the order in an unpaid state, then fetch it  
in the long response page and update it if paid, or delete it if  
not.


Ooh, yeah, you could do that too.


So we end up doing :

- before the long response page is called, we save the EO in an
unpaid state


- in the method called in performAction, we create a new editing  
context, we lock it and we insert a copy of the order EO into the  
new EC by using localInstanceOfObject


we fault (not insert) a copy of the order EO into the new EC...
Just so everyone is clear on what is happening.



- the bulk of the job is done in a try {} finally { ec.unlock() }

- when the method that is called inside performAction has done its
job, the order EO is sent back


- we override the order EO that was stored in the session default  
editing context with the one that was inserted in the temporary EC :


   
((Session)session()).setCommande((Commande)EOUtilities.localInstanceOfObject(session().defaultEditingContext(), (Commande)copieCommande));


I'd put that move between ECs in Session:

public void setCommande(Commande c) {
	command = (Commande)EOUtilities.localInstanceOfObject(defaultEditingContext(), c);
}

and change the line above to

((Session)session()).setCommande(copieCommande);


That way the session is protected and does not rely on the client code  
being correct.



So far, so good. We will see later today, when the load gets higher, if
we get any deadlocks.


Many thanks to all who helped out :-)


Let us know how it works out!

Chuck




Pascal Robert
prob...@macti.ca

AIM: MacTICanada
Twitter : MacTICanada
LinkedIn : http://www.linkedin.com/in/macti
WO Community profile : http://wocommunity.org/page/member?name=probert



--
Chuck Hill Senior Consultant / VP Development

Practical WebObjects - for developers who want to increase their  
overall knowledge of WebObjects or who are trying to solve specific  
problems.

http://www.global-village.net/products/practical_webobjects









Re: Deadlocks in one of our apps

2010-06-02 Thread Pascal Robert
... And going back to the physical server didn't solve anything, I got  
the same deadlock this morning.


Ok, so I will move back the DB to the physical server to see if the  
problem goes away.




On Jun 1, 2010, at 6:34 AM, Pascal Robert wrote:

Hum... And after I started using ERXWOLongResponsePage, I still  
got a deadlock, but this time, it says that it's a  
EODatabaseContext lock :


Thread t...@92163: (state = BLOCKED)
- java.lang.Object.wait(long) @bci=0 (Interpreted frame)
- java.lang.Object.wait() @bci=2, line=474 (Interpreted frame)
- com.webobjects.foundation.NSRecursiveLock.lock() @bci=54,  
line=72 (Interpreted frame)
- com.webobjects.eoaccess.EODatabaseContext.lock() @bci=56,  
line=1973 (Interpreted frame)
- com.webobjects.eocontrol.EOObjectStoreCoordinator.addCooperatingObjectStore(com.webobjects.eocontrol.EOCooperatingObjectStore) @bci=5, line=130 (Interpreted frame)
- com.webobjects.eoaccess.EODatabaseChannel.setCurrentEditingContext(com.webobjects.eocontrol.EOEditingContext) @bci=34, line=166 (Interpreted frame)

...

We don't do manual (e.g., in code) locking at the EODatabaseContext
level.


It is possible that an odd exception in EOAccess or below is  
resulting in this not getting unlocked.  Joe's reply below might be  
what is happening to you.
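
To make the "exception leaves a lock held" scenario concrete, here is a minimal sketch using java.util.concurrent's ReentrantLock as a stand-in. It is an analogy only: it says nothing about what EOAccess actually does internally, and LeakedLockDemo, failingWork and lockUsableAfterFailure are hypothetical names.

```java
import java.util.concurrent.TimeUnit;
import java.util.concurrent.locks.ReentrantLock;

class LeakedLockDemo {
    // Simulates a failure that happens while the lock is held.
    static void failingWork() {
        throw new IllegalStateException("simulated failure while holding the lock");
    }

    // A worker thread locks, fails, and unlocks either in finally or after the work.
    // Returns true if another thread can still acquire the lock afterwards.
    static boolean lockUsableAfterFailure(boolean unlockInFinally) {
        ReentrantLock lock = new ReentrantLock();
        Thread worker = new Thread(() -> {
            lock.lock();
            if (unlockInFinally) {
                try {
                    failingWork();
                } catch (IllegalStateException ignored) {
                    // swallowed to keep the demo quiet
                } finally {
                    lock.unlock(); // runs even though the work failed
                }
            } else {
                try {
                    failingWork();
                    lock.unlock(); // never reached: the exception skips it
                } catch (IllegalStateException ignored) {
                    // the lock is still held here, and stays held
                }
            }
        });
        worker.start();
        try {
            worker.join();
            boolean acquired = lock.tryLock(200, TimeUnit.MILLISECONDS);
            if (acquired) {
                lock.unlock();
            }
            return acquired;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
    }
}
```

With the unlock outside finally, every later thread that wants the lock waits on it, which is the kind of picture a thread dump full of threads blocked in lock() would show.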


Chuck


Another thing to note if this is a long request to a database  
housed in an ESX vm. We had similar problems with long requests  
timing out between two systems, with one hosted by esx 4.x. Such  
long requests were caught by some low level interface muxing issue  
and my whole EOF stack was frozen when the underlying db  
connection was lost mid-transaction. I resolved it by moving this  
application off of a vm.




On May 31, 2010, at 5:33 PM, Pascal Robert prob...@macti.ca  
wrote:


Ok, will try with ERXWOLongResponsePage since it looks like it's
locking and unlocking all ECs in the thread.


There's a bunch of stuff wrong here. First, the only actually  
locked thread is:


- com.webobjects.eocontrol.EOObjectStoreCoordinator.addCooperatingObjectStore(com.webobjects.eocontrol.EOCooperatingObjectStore) @bci=5, line=130 (Interpreted frame)
- com.webobjects.eoaccess.EODatabaseChannel.setCurrentEditingContext(com.webobjects.eocontrol.EOEditingContext) @bci=34, line=166 (Interpreted frame)
- com.webobjects.eoaccess.EODatabaseChannel._selectWithFetchSpecificationEditingContext(com.webobjects.eocontrol.EOFetchSpecification, com.webobjects.eocontrol.EOEditingContext) @bci=158, line=788 (Interpreted frame)
- com.webobjects.eoaccess.EODatabaseChannel.selectObjectsWithFetchSpecification(com.webobjects.eocontrol.EOFetchSpecification, com.webobjects.eocontrol.EOEditingContext) @bci=64, line=215 (Interpreted frame)
- com.webobjects.eoaccess.EODatabaseContext._objectsWithFetchSpecificationEditingContext(com.webobjects.eocontrol.EOFetchSpecification, com.webobjects.eocontrol.EOEditingContext) @bci=219, line=3205 (Interpreted frame)
- com.webobjects.eoaccess.EODatabaseContext.objectsWithFetchSpecification(com.webobjects.eocontrol.EOFetchSpecification, com.webobjects.eocontrol.EOEditingContext) @bci=34, line=3346 (Interpreted frame)
- com.webobjects.eocontrol.EOObjectStoreCoordinator.objectsWithFetchSpecification(com.webobjects.eocontrol.EOFetchSpecification, com.webobjects.eocontrol.EOEditingContext) @bci=97, line=539 (Interpreted frame)
- com.webobjects.eocontrol.EOEditingContext.objectsWithFetchSpecification(com.webobjects.eocontrol.EOFetchSpecification, com.webobjects.eocontrol.EOEditingContext) @bci=79, line=4114 (Interpreted frame)
- er.extensions.eof.ERXEC.objectsWithFetchSpecification(com.webobjects.eocontrol.EOFetchSpecification, com.webobjects.eocontrol.EOEditingContext) @bci=72, line=1211 (Interpreted frame)
- com.webobjects.eocontrol.EOEditingContext.objectsWithFetchSpecification(com.webobjects.eocontrol.EOFetchSpecification) @bci=3, line=4500 (Interpreted frame)
- com.acaiq.fondation.acaiqCore._Licence.fetchLicences(com.webobjects.eocontrol.EOEditingContext, com.webobjects.eocontrol.EOQualifier, com.webobjects.foundation.NSArray) @bci=19, line=1062 (Interpreted frame)
- com.acaiq.fondation.acaiqCore._Membre.licences(com.webobjects.eocontrol.EOQualifier, com.webobjects.foundation.NSArray, boolean) @bci=77, line=8920 (Interpreted frame)
- com.acaiq.fondation.acaiqCore._Membre.licences(com.webobjects.eocontrol.EOQualifier, boolean) @bci=4, line=8893 (Interpreted frame)
- com.acaiq.fondation.acaiqCore.Membre.licencesParEtats(com.acaiq.fondation.acaiqCore.EtatMembre[]) @bci=100, line=980 (Interpreted frame)
- com.acaiq.fondation.acaiqCore.Membre.licencesValides() @bci=11, line=996 (Interpreted frame)
- com.acaiq.fondation.acaiqCore.Membre.estCourtier() @bci=5, line=1035 (Interpreted frame)
-  

Re: Deadlocks in one of our apps

2010-06-02 Thread Chuck Hill
That makes your code look guilty then.  :-)  Check your long response  
page implementation again.  Are there any exceptions in the log that  
might be related?


I'd also reduce the Maximum Adaptor threads (JavaMonitor -  
Application configuration - Application settings).  6 or 8 is  
probably more than enough for this app.  That will at least reduce the  
size of the thread dumps.   I'd also trim down the listen queue size  
to 2 or 4, might as well catch this as soon as possible.


Chuck



On Jun 2, 2010, at 5:19 AM, Pascal Robert wrote:

... And going back to the physical server didn't solve anything, I  
got the same deadlock this morning.


Ok, so I will move back the DB to the physical server to see if the  
problem goes away.




On Jun 1, 2010, at 6:34 AM, Pascal Robert wrote:

Hum... And after I started using ERXWOLongResponsePage, I still  
got a deadlock, but this time, it says that it's a  
EODatabaseContext lock :


Thread t...@92163: (state = BLOCKED)
- java.lang.Object.wait(long) @bci=0 (Interpreted frame)
- java.lang.Object.wait() @bci=2, line=474 (Interpreted frame)
- com.webobjects.foundation.NSRecursiveLock.lock() @bci=54,  
line=72 (Interpreted frame)
- com.webobjects.eoaccess.EODatabaseContext.lock() @bci=56,  
line=1973 (Interpreted frame)
- com.webobjects.eocontrol.EOObjectStoreCoordinator.addCooperatingObjectStore(com.webobjects.eocontrol.EOCooperatingObjectStore) @bci=5, line=130 (Interpreted frame)
- com.webobjects.eoaccess.EODatabaseChannel.setCurrentEditingContext(com.webobjects.eocontrol.EOEditingContext) @bci=34, line=166 (Interpreted frame)

...

We don't do manual (e.g., in code) locking at the EODatabaseContext
level.


It is possible that an odd exception in EOAccess or below is  
resulting in this not getting unlocked.  Joe's reply below might  
be what is happening to you.


Chuck


Another thing to note if this is a long request to a database  
housed in an ESX vm. We had similar problems with long requests  
timing out between two systems, with one hosted by esx 4.x. Such  
long requests were caught by some low level interface muxing  
issue and my whole EOF stack was frozen when the underlying db  
connection was lost mid-transaction. I resolved it by moving this  
application off of a vm.




On May 31, 2010, at 5:33 PM, Pascal Robert prob...@macti.ca  
wrote:


Ok, will try with ERXWOLongResponsePage since it looks like it's
locking and unlocking all ECs in the thread.


There's a bunch of stuff wrong here. First, the only actually  
locked thread is:


- com.webobjects.eocontrol.EOObjectStoreCoordinator.addCooperatingObjectStore(com.webobjects.eocontrol.EOCooperatingObjectStore) @bci=5, line=130 (Interpreted frame)
- com.webobjects.eoaccess.EODatabaseChannel.setCurrentEditingContext(com.webobjects.eocontrol.EOEditingContext) @bci=34, line=166 (Interpreted frame)
- com.webobjects.eoaccess.EODatabaseChannel._selectWithFetchSpecificationEditingContext(com.webobjects.eocontrol.EOFetchSpecification, com.webobjects.eocontrol.EOEditingContext) @bci=158, line=788 (Interpreted frame)
- com.webobjects.eoaccess.EODatabaseChannel.selectObjectsWithFetchSpecification(com.webobjects.eocontrol.EOFetchSpecification, com.webobjects.eocontrol.EOEditingContext) @bci=64, line=215 (Interpreted frame)
- com.webobjects.eoaccess.EODatabaseContext._objectsWithFetchSpecificationEditingContext(com.webobjects.eocontrol.EOFetchSpecification, com.webobjects.eocontrol.EOEditingContext) @bci=219, line=3205 (Interpreted frame)
- com.webobjects.eoaccess.EODatabaseContext.objectsWithFetchSpecification(com.webobjects.eocontrol.EOFetchSpecification, com.webobjects.eocontrol.EOEditingContext) @bci=34, line=3346 (Interpreted frame)
- com.webobjects.eocontrol.EOObjectStoreCoordinator.objectsWithFetchSpecification(com.webobjects.eocontrol.EOFetchSpecification, com.webobjects.eocontrol.EOEditingContext) @bci=97, line=539 (Interpreted frame)
- com.webobjects.eocontrol.EOEditingContext.objectsWithFetchSpecification(com.webobjects.eocontrol.EOFetchSpecification, com.webobjects.eocontrol.EOEditingContext) @bci=79, line=4114 (Interpreted frame)
- er.extensions.eof.ERXEC.objectsWithFetchSpecification(com.webobjects.eocontrol.EOFetchSpecification, com.webobjects.eocontrol.EOEditingContext) @bci=72, line=1211 (Interpreted frame)
- com.webobjects.eocontrol.EOEditingContext.objectsWithFetchSpecification(com.webobjects.eocontrol.EOFetchSpecification) @bci=3, line=4500 (Interpreted frame)
- com.acaiq.fondation.acaiqCore._Licence.fetchLicences(com.webobjects.eocontrol.EOEditingContext, com.webobjects.eocontrol.EOQualifier, com.webobjects.foundation.NSArray) @bci=19, line=1062 (Interpreted frame)
- com.acaiq.fondation.acaiqCore._Membre.licences(com.webobjects.eocontrol.EOQualifier,

Re: Deadlocks in one of our apps

2010-06-02 Thread Mike Schrag
doesn't addCooperatingObjectStore have a race condition in =5.4? i don't 
recall if wonder fixed that or not ...

On Jun 2, 2010, at 8:19 AM, Pascal Robert wrote:

 ... And going back to the physical server didn't solve anything, I got the 
 same deadlock this morning.
 
 Ok, so I will move back the DB to the physical server to see if the problem 
 goes away.
 
 
 On Jun 1, 2010, at 6:34 AM, Pascal Robert wrote:
 
 Hum... And after I started using ERXWOLongResponsePage, I still got a 
 deadlock, but this time, it says that it's a EODatabaseContext lock :
 
 Thread t...@92163: (state = BLOCKED)
 - java.lang.Object.wait(long) @bci=0 (Interpreted frame)
 - java.lang.Object.wait() @bci=2, line=474 (Interpreted frame)
 - com.webobjects.foundation.NSRecursiveLock.lock() @bci=54, line=72 
 (Interpreted frame)
 - com.webobjects.eoaccess.EODatabaseContext.lock() @bci=56, line=1973 
 (Interpreted frame)
 - 
 com.webobjects.eocontrol.EOObjectStoreCoordinator.addCooperatingObjectStore(com.webobjects.eocontrol.EOCooperatingObjectStore)
  @bci=5, line=130 (Interpreted frame)
 - 
 com.webobjects.eoaccess.EODatabaseChannel.setCurrentEditingContext(com.webobjects.eocontrol.EOEditingContext)
  @bci=34, line=166 (Interpreted frame)
 ...
 
 We don't do manual (e.g., in code) locking at the EODatabaseContext level.
 
 It is possible that an odd exception in EOAccess or below is resulting in 
 this not getting unlocked.  Joe's reply below might be what is happening to 
 you.
 
 Chuck
 
 
 Another thing to note if this is a long request to a database housed in an 
 ESX vm. We had similar problems with long requests timing out between two 
 systems, with one hosted by esx 4.x. Such long requests were caught by 
 some low level interface muxing issue and my whole EOF stack was frozen 
 when the underlying db connection was lost mid-transaction. I resolved it 
 by moving this application off of a vm.
 
 
 
 On May 31, 2010, at 5:33 PM, Pascal Robert prob...@macti.ca wrote:
 
 Ok, will try with ERXWOLongResponsePage since it looks like it's locking
 and unlocking all ECs in the thread.
 
 There's a bunch of stuff wrong here. First, the only actually locked 
 thread is:
 
 - 
 com.webobjects.eocontrol.EOObjectStoreCoordinator.addCooperatingObjectStore(com.webobjects.eocontrol.EOCooperatingObjectStore)
  @bci=5, line=130 (Interpreted frame)
 - 
 com.webobjects.eoaccess.EODatabaseChannel.setCurrentEditingContext(com.webobjects.eocontrol.EOEditingContext)
  @bci=34, line=166 (Interpreted frame)
 - 
 com.webobjects.eoaccess.EODatabaseChannel._selectWithFetchSpecificationEditingContext(com.webobjects.eocontrol.EOFetchSpecification,
  com.webobjects.eocontrol.EOEditingContext) @bci=158, line=788 
 (Interpreted frame)
 - 
 com.webobjects.eoaccess.EODatabaseChannel.selectObjectsWithFetchSpecification(com.webobjects.eocontrol.EOFetchSpecification,
  com.webobjects.eocontrol.EOEditingContext) @bci=64, line=215 
 (Interpreted frame)
 - 
 com.webobjects.eoaccess.EODatabaseContext._objectsWithFetchSpecificationEditingContext(com.webobjects.eocontrol.EOFetchSpecification,
  com.webobjects.eocontrol.EOEditingContext) @bci=219, line=3205 
 (Interpreted frame)
 - 
 com.webobjects.eoaccess.EODatabaseContext.objectsWithFetchSpecification(com.webobjects.eocontrol.EOFetchSpecification,
  com.webobjects.eocontrol.EOEditingContext) @bci=34, line=3346 
 (Interpreted frame)
 - 
 com.webobjects.eocontrol.EOObjectStoreCoordinator.objectsWithFetchSpecification(com.webobjects.eocontrol.EOFetchSpecification,
  com.webobjects.eocontrol.EOEditingContext) @bci=97, line=539 
 (Interpreted frame)
 - 
 com.webobjects.eocontrol.EOEditingContext.objectsWithFetchSpecification(com.webobjects.eocontrol.EOFetchSpecification,
  com.webobjects.eocontrol.EOEditingContext) @bci=79, line=4114 
 (Interpreted frame)
 - 
 er.extensions.eof.ERXEC.objectsWithFetchSpecification(com.webobjects.eocontrol.EOFetchSpecification,
  com.webobjects.eocontrol.EOEditingContext) @bci=72, line=1211 
 (Interpreted frame)
 - 
 com.webobjects.eocontrol.EOEditingContext.objectsWithFetchSpecification(com.webobjects.eocontrol.EOFetchSpecification)
  @bci=3, line=4500 (Interpreted frame)
 - 
 com.acaiq.fondation.acaiqCore._Licence.fetchLicences(com.webobjects.eocontrol.EOEditingContext,
  com.webobjects.eocontrol.EOQualifier, 
 com.webobjects.foundation.NSArray) @bci=19, line=1062 (Interpreted 
 frame)
 - 
 com.acaiq.fondation.acaiqCore._Membre.licences(com.webobjects.eocontrol.EOQualifier,
  com.webobjects.foundation.NSArray, boolean) @bci=77, line=8920 
 (Interpreted frame)
 - 
 com.acaiq.fondation.acaiqCore._Membre.licences(com.webobjects.eocontrol.EOQualifier,
  boolean) @bci=4, line=8893 (Interpreted frame)
 - 
 com.acaiq.fondation.acaiqCore.Membre.licencesParEtats(com.acaiq.fondation.acaiqCore.EtatMembre[])
  @bci=100, line=980 (Interpreted frame)
 - com.acaiq.fondation.acaiqCore.Membre.licencesValides() @bci=11, 
 line=996 (Interpreted frame)
 - com.acaiq.fondation.acaiqCore.Membre.estCourtier() 

Re: Deadlocks in one of our apps

2010-06-02 Thread Pascal Robert


Le 10-06-02 à 10:35, Mike Schrag a écrit :

doesn't addCooperatingObjectStore have a race condition in =5.4? i  
don't recall if wonder fixed that or not ...


I guess we could try with WO 5.4.3. As for Wonder, the app extends
ERXApplication/ERXSession, using a Wonder build downloaded two
months ago.



On Jun 2, 2010, at 8:19 AM, Pascal Robert wrote:

... And going back to the physical server didn't solve anything, I  
got the same deadlock this morning.


Ok, so I will move back the DB to the physical server to see if  
the problem goes away.




On Jun 1, 2010, at 6:34 AM, Pascal Robert wrote:

Hum... And after I started using ERXWOLongResponsePage, I still  
got a deadlock, but this time, it says that it's a  
EODatabaseContext lock :


Thread t...@92163: (state = BLOCKED)
- java.lang.Object.wait(long) @bci=0 (Interpreted frame)
- java.lang.Object.wait() @bci=2, line=474 (Interpreted frame)
- com.webobjects.foundation.NSRecursiveLock.lock() @bci=54,  
line=72 (Interpreted frame)
- com.webobjects.eoaccess.EODatabaseContext.lock() @bci=56,  
line=1973 (Interpreted frame)
- com.webobjects.eocontrol.EOObjectStoreCoordinator.addCooperatingObjectStore(com.webobjects.eocontrol.EOCooperatingObjectStore) @bci=5, line=130 (Interpreted frame)
- com.webobjects.eoaccess.EODatabaseChannel.setCurrentEditingContext(com.webobjects.eocontrol.EOEditingContext) @bci=34, line=166 (Interpreted frame)

...

We don't do manual (e.g., in code) locking at the
EODatabaseContext level.


It is possible that an odd exception in EOAccess or below is  
resulting in this not getting unlocked.  Joe's reply below might  
be what is happening to you.


Chuck


Another thing to note if this is a long request to a database  
housed in an ESX vm. We had similar problems with long requests  
timing out between two systems, with one hosted by esx 4.x. Such  
long requests were caught by some low level interface muxing  
issue and my whole EOF stack was frozen when the underlying db  
connection was lost mid-transaction. I resolved it by moving  
this application off of a vm.




On May 31, 2010, at 5:33 PM, Pascal Robert prob...@macti.ca  
wrote:


Ok, will try with ERXWOLongResponsePage since it looks like
it's locking and unlocking all ECs in the thread.


There's a bunch of stuff wrong here. First, the only actually  
locked thread is:


- com.webobjects.eocontrol.EOObjectStoreCoordinator.addCooperatingObjectStore(com.webobjects.eocontrol.EOCooperatingObjectStore) @bci=5, line=130 (Interpreted frame)
- com.webobjects.eoaccess.EODatabaseChannel.setCurrentEditingContext(com.webobjects.eocontrol.EOEditingContext) @bci=34, line=166 (Interpreted frame)
- com.webobjects.eoaccess.EODatabaseChannel._selectWithFetchSpecificationEditingContext(com.webobjects.eocontrol.EOFetchSpecification, com.webobjects.eocontrol.EOEditingContext) @bci=158, line=788 (Interpreted frame)
- com.webobjects.eoaccess.EODatabaseChannel.selectObjectsWithFetchSpecification(com.webobjects.eocontrol.EOFetchSpecification, com.webobjects.eocontrol.EOEditingContext) @bci=64, line=215 (Interpreted frame)
- com.webobjects.eoaccess.EODatabaseContext._objectsWithFetchSpecificationEditingContext(com.webobjects.eocontrol.EOFetchSpecification, com.webobjects.eocontrol.EOEditingContext) @bci=219, line=3205 (Interpreted frame)
- com.webobjects.eoaccess.EODatabaseContext.objectsWithFetchSpecification(com.webobjects.eocontrol.EOFetchSpecification, com.webobjects.eocontrol.EOEditingContext) @bci=34, line=3346 (Interpreted frame)
- com.webobjects.eocontrol.EOObjectStoreCoordinator.objectsWithFetchSpecification(com.webobjects.eocontrol.EOFetchSpecification, com.webobjects.eocontrol.EOEditingContext) @bci=97, line=539 (Interpreted frame)
- com.webobjects.eocontrol.EOEditingContext.objectsWithFetchSpecification(com.webobjects.eocontrol.EOFetchSpecification, com.webobjects.eocontrol.EOEditingContext) @bci=79, line=4114 (Interpreted frame)
- er.extensions.eof.ERXEC.objectsWithFetchSpecification(com.webobjects.eocontrol.EOFetchSpecification, com.webobjects.eocontrol.EOEditingContext) @bci=72, line=1211 (Interpreted frame)
- com.webobjects.eocontrol.EOEditingContext.objectsWithFetchSpecification(com.webobjects.eocontrol.EOFetchSpecification) @bci=3, line=4500 (Interpreted frame)
- com.acaiq.fondation.acaiqCore._Licence.fetchLicences(com.webobjects.eocontrol.EOEditingContext, com.webobjects.eocontrol.EOQualifier, com.webobjects.foundation.NSArray) @bci=19, line=1062 (Interpreted frame)
- com.acaiq.fondation.acaiqCore._Membre.licences(com.webobjects.eocontrol.EOQualifier, com.webobjects.foundation.NSArray, boolean) @bci=77, line=8920 (Interpreted frame)
- com.acaiq.fondation.acaiqCore._Membre.licences(com.webobjects.eocontrol.EOQualifier, boolean) @bci=4, line=8893

Re: Deadlocks in one of our apps

2010-06-02 Thread Pascal Robert


Le 10-06-02 à 10:30, Chuck Hill a écrit :


That makes your code look guilty then.  :-)


Funny thing is that it's not really my code (i.e., I didn't write it);
this code dates from WO 5.2. It's just that this app never had that
much traffic.


And I did try stress loading this app with JMeter, but since the URL  
is changed when the long response page is called (session ID is put  
back in the URL) and I don't know how to fix this, that part was not  
stress loaded.


Check your long response page implementation again.  Are there any  
exceptions in the log that might be related?


Just to explain a bit more :

- It's a (non-public) online store. When people log in, we create an
order in memory and customers add order items to the order. We don't
store anything in the DB until the payment is made with PayFlow. When
we get the response from PayFlow, we store a copy of the order (and
the items) in our Oracle DB. After that, we contact our SQL Server DB
(actually an accounting system; we send the data to a stored
procedure), get the invoice number produced by the accounting
system, and store it in the order EO in Oracle.


So in summary :

- People log in, we create an order EO; the EO is created in the
session's editing context

- People add items to the order
- They start the order payment steps
- Long response page kicks in
- We contact PayFlow to make the payment
- If the payment is successful, we store the order in Oracle
- We create a new EO, in a different EC, for SQL Server
- We update the order EO to store the invoice number in Oracle
- We generate (FOXML, generated in a separate JVM) the invoice in PDF
- Long response page is done, pageForResult is called

Everything is done in session.defaultEditingContext EXCEPT the SQL
Server EOs, where we create a new EOObjectStoreCoordinator, create a new
EOEditingContext on top of it, and do the work under a lock:


EOObjectStore osc = new EOObjectStoreCoordinator();
EOEditingContext ec = new EOEditingContext(osc);
ec.lock();
try {
    CommandesEcom commandeEcom = CommandesEcom.creerCommandesEcom(ec);
    ...
    ec.saveChanges();
} finally {
    // Always release the lock, then dispose the EC before its object store.
    ec.unlock();
    ec.dispose();
    osc.dispose();
    ec = null;
    osc = null;
}

A co-worker suggested that we create a new editing context in the long  
response page, and call EOUtilities.localInstanceOfObject to have a  
copy of the order EO in the new EC, but the resulting EO is null, even  
if the source is not.


I'd also reduce the Maximum Adaptor threads (JavaMonitor -  
Application configuration - Application settings).  6 or 8 is  
probably more than enough for this app.  That will at least reduce  
the size of the thread dumps.   I'd also trim down the listen queue  
size to 2 or 4, might as well catch this as soon as possible.


Chuck



On Jun 2, 2010, at 5:19 AM, Pascal Robert wrote:

... And going back to the physical server didn't solve anything, I  
got the same deadlock this morning.


Ok, so I will move back the DB to the physical server to see if  
the problem goes away.




On Jun 1, 2010, at 6:34 AM, Pascal Robert wrote:

Hum... And after I started using ERXWOLongResponsePage, I still  
got a deadlock, but this time, it says that it's a  
EODatabaseContext lock :


Thread t...@92163: (state = BLOCKED)
- java.lang.Object.wait(long) @bci=0 (Interpreted frame)
- java.lang.Object.wait() @bci=2, line=474 (Interpreted frame)
- com.webobjects.foundation.NSRecursiveLock.lock() @bci=54,  
line=72 (Interpreted frame)
- com.webobjects.eoaccess.EODatabaseContext.lock() @bci=56,  
line=1973 (Interpreted frame)
- com.webobjects.eocontrol.EOObjectStoreCoordinator.addCooperatingObjectStore(com.webobjects.eocontrol.EOCooperatingObjectStore) @bci=5, line=130 (Interpreted frame)
- com.webobjects.eoaccess.EODatabaseChannel.setCurrentEditingContext(com.webobjects.eocontrol.EOEditingContext) @bci=34, line=166 (Interpreted frame)

...

We don't do manual (e.g., in code) locking at the
EODatabaseContext level.


It is possible that an odd exception in EOAccess or below is  
resulting in this not getting unlocked.  Joe's reply below might  
be what is happening to you.


Chuck


Another thing to note if this is a long request to a database  
housed in an ESX vm. We had similar problems with long requests  
timing out between two systems, with one hosted by esx 4.x. Such  
long requests were caught by some low level interface muxing  
issue and my whole EOF stack was frozen when the underlying db  
connection was lost mid-transaction. I resolved it by moving  
this application off of a vm.




On May 31, 2010, at 5:33 PM, Pascal Robert prob...@macti.ca  
wrote:


Ok, will try with 

Re: Deadlocks in one of our apps

2010-06-02 Thread Kieran Kelleher
FYI, I did a brief test of WO 5.4.3 on a mature app a week or two ago,
and running a barrage of Selenium tests (where each test generally created a
new Session with a specific user on a specific page with a master EO) would
deadlock some of the Sessions' defaultERXEC's every time. Switching back to WO
5.3.3 made the problem go away ... and yes, I built Wonder with the 54 patch
too, so that quickly killed my confidence in WO 5.4.3. Even though I am 99%
sure that this is a compatibility problem between my code, Wonder and WO 5.4.3,
I could not find the problem after 2 hours, so I had to park it, revert to WO
5.3.3 and get priority work done.

-Kieran

On Jun 2, 2010, at 10:52 AM, Pascal Robert wrote:

 
 Le 10-06-02 à 10:35, Mike Schrag a écrit :
 
 doesn't addCooperatingObjectStore have a race condition in =5.4? i don't 
 recall if wonder fixed that or not ...
 
 I guess we could try with WO 5.4.3, but about Wonder, the app is extending 
 from ERXApplication/ERXSession. Wonder download from two months ago.
 
 On Jun 2, 2010, at 8:19 AM, Pascal Robert wrote:
 
 ... And going back to the physical server didn't solve anything, I got the 
 same deadlock this morning.
 
 Ok, so I will move back the DB to the physical server to see if the 
 problem goes away.
 
 
 On Jun 1, 2010, at 6:34 AM, Pascal Robert wrote:
 
 Hum... And after I started using ERXWOLongResponsePage, I still got a 
 deadlock, but this time, it says that it's a EODatabaseContext lock :
 
 Thread t...@92163: (state = BLOCKED)
 - java.lang.Object.wait(long) @bci=0 (Interpreted frame)
 - java.lang.Object.wait() @bci=2, line=474 (Interpreted frame)
 - com.webobjects.foundation.NSRecursiveLock.lock() @bci=54, line=72 
 (Interpreted frame)
 - com.webobjects.eoaccess.EODatabaseContext.lock() @bci=56, line=1973 
 (Interpreted frame)
 - 
 com.webobjects.eocontrol.EOObjectStoreCoordinator.addCooperatingObjectStore(com.webobjects.eocontrol.EOCooperatingObjectStore)
  @bci=5, line=130 (Interpreted frame)
 - 
 com.webobjects.eoaccess.EODatabaseChannel.setCurrentEditingContext(com.webobjects.eocontrol.EOEditingContext)
  @bci=34, line=166 (Interpreted frame)
 ...
 
 We don't do manual (e.g., in code) locking at the EODatabaseContext level.
 
 It is possible that an odd exception in EOAccess or below is resulting in 
 this not getting unlocked.  Joe's reply below might be what is happening 
 to you.
 
 Chuck
 
 
 Another thing to note if this is a long request to a database housed in 
 an ESX vm. We had similar problems with long requests timing out between 
 two systems, with one hosted by esx 4.x. Such long requests were caught 
 by some low level interface muxing issue and my whole EOF stack was 
 frozen when the underlying db connection was lost mid-transaction. I 
 resolved it by moving this application off of a vm.
 
 
 
 On May 31, 2010, at 5:33 PM, Pascal Robert prob...@macti.ca wrote:
 
 Ok, will try with ERXWOLongResponsePage since it looks like it's
 locking and unlocking all ECs in the thread.
 
 There's a bunch of stuff wrong here. First, the only actually locked 
 thread is:
 
 - 
 com.webobjects.eocontrol.EOObjectStoreCoordinator.addCooperatingObjectStore(com.webobjects.eocontrol.EOCooperatingObjectStore)
  @bci=5, line=130 (Interpreted frame)
 - 
 com.webobjects.eoaccess.EODatabaseChannel.setCurrentEditingContext(com.webobjects.eocontrol.EOEditingContext)
  @bci=34, line=166 (Interpreted frame)
 - 
 com.webobjects.eoaccess.EODatabaseChannel._selectWithFetchSpecificationEditingContext(com.webobjects.eocontrol.EOFetchSpecification,
  com.webobjects.eocontrol.EOEditingContext) @bci=158, line=788 
 (Interpreted frame)
 - 
 com.webobjects.eoaccess.EODatabaseChannel.selectObjectsWithFetchSpecification(com.webobjects.eocontrol.EOFetchSpecification,
  com.webobjects.eocontrol.EOEditingContext) @bci=64, line=215 
 (Interpreted frame)
 - 
 com.webobjects.eoaccess.EODatabaseContext._objectsWithFetchSpecificationEditingContext(com.webobjects.eocontrol.EOFetchSpecification,
  com.webobjects.eocontrol.EOEditingContext) @bci=219, line=3205 
 (Interpreted frame)
 - 
 com.webobjects.eoaccess.EODatabaseContext.objectsWithFetchSpecification(com.webobjects.eocontrol.EOFetchSpecification,
  com.webobjects.eocontrol.EOEditingContext) @bci=34, line=3346 
 (Interpreted frame)
 - 
 com.webobjects.eocontrol.EOObjectStoreCoordinator.objectsWithFetchSpecification(com.webobjects.eocontrol.EOFetchSpecification,
  com.webobjects.eocontrol.EOEditingContext) @bci=97, line=539 
 (Interpreted frame)
 - 
 com.webobjects.eocontrol.EOEditingContext.objectsWithFetchSpecification(com.webobjects.eocontrol.EOFetchSpecification,
  com.webobjects.eocontrol.EOEditingContext) @bci=79, line=4114 
 (Interpreted frame)
 - 
 er.extensions.eof.ERXEC.objectsWithFetchSpecification(com.webobjects.eocontrol.EOFetchSpecification,
  com.webobjects.eocontrol.EOEditingContext) @bci=72, line=1211 
 (Interpreted frame)
 - 
 

Re: Deadlocks in one of our apps

2010-06-02 Thread Chuck Hill


On Jun 2, 2010, at 8:16 AM, Pascal Robert wrote:



Le 10-06-02 à 10:30, Chuck Hill a écrit :


That makes your code look guilty then.  :-)


Funny thing is that it's not really my code (i.e., I didn't write it);
this code dates from WO 5.2. It's just that this app never
had that much traffic.


And I did try stress loading this app with JMeter, but since the URL  
is changed when the long response page is called (session ID is put  
back in the URL) and I don't know how to fix this, that part was not  
stress loaded.


Check your long response page implementation again.  Are there any  
exceptions in the log that might be related?


Just to explain a bit more :

- It's a (non-public) online store. When people log in, we create an
order in memory and customers add order items to the order. We don't
store anything in the DB until the payment is made with PayFlow.
When we get the response from PayFlow, we store a copy of the order
(and the items) in our Oracle DB. After that, we contact our SQL
Server DB (actually an accounting system; we send the data to a
stored procedure), get the invoice number produced by the
accounting system, and store it in the order EO in Oracle.


So in summary :

- People login, we create a order EO, the EO is created in the  
session's editing context

- People add items to the order
- They start the order payment steps
- Long response page kicks in
- We contact PayFlow to make the payment
- If the payment is succesful, we store the order in Oracle
- We create a new EO, in a different EC, for SQL Server
- We update the order EO to store the invoice number in Oracle
	- We generate (FOXML, generated in a separated JVM) the invoice in  
PDF

- Long response page is done, pageForResult is called

Everything is done in session.defaultEditingContext EXCEPT the SQL  
Server EOs,


You are not using the session.defaultEditingContext in the long  
response page, are you?  I am pretty sure that is an excellent source  
of deadlocks.



Chuck

where we create a new EOObjectStore, create a new EOEditingContext  
inside the new object store, and


EOObjectStore osc = new EOObjectStoreCoordinator();
EOEditingContext ec = new EOEditingContext(osc);
ec.lock();
try {
    CommandesEcom commandeEcom = CommandesEcom.creerCommandesEcom(ec);
    ...
    ec.saveChanges();
} finally {
    // Always release the lock, even if saveChanges() throws
    ec.unlock();
    ec.dispose();
    osc.dispose();
    ec = null;
    osc = null;
}

A co-worker suggested that we create a new editing context in the  
long response page, and call EOUtilities.localInstanceOfObject to  
have a copy of the order EO in the new EC, but the resulting EO is  
null, even if the source is not.


I'd also reduce the Maximum Adaptor threads (JavaMonitor -  
Application configuration - Application settings).  6 or 8 is  
probably more than enough for this app.  That will at least reduce  
the size of the thread dumps.   I'd also trim down the listen queue  
size to 2 or 4, might as well catch this as soon as possible.


Chuck



On Jun 2, 2010, at 5:19 AM, Pascal Robert wrote:

... And going back to the physical server didn't solve anything, I  
got the same deadlock this morning.


Ok, so I will move back the DB to the physical server to see if  
the problem goes away.




On Jun 1, 2010, at 6:34 AM, Pascal Robert wrote:

Hum... And after I started using ERXWOLongResponsePage, I still
got a deadlock, but this time, it says that it's an
EODatabaseContext lock:


Thread t...@92163: (state = BLOCKED)
- java.lang.Object.wait(long) @bci=0 (Interpreted frame)
- java.lang.Object.wait() @bci=2, line=474 (Interpreted frame)
- com.webobjects.foundation.NSRecursiveLock.lock() @bci=54, line=72 (Interpreted frame)
- com.webobjects.eoaccess.EODatabaseContext.lock() @bci=56, line=1973 (Interpreted frame)
- com.webobjects.eocontrol.EOObjectStoreCoordinator.addCooperatingObjectStore(com.webobjects.eocontrol.EOCooperatingObjectStore) @bci=5, line=130 (Interpreted frame)
- com.webobjects.eoaccess.EODatabaseChannel.setCurrentEditingContext(com.webobjects.eocontrol.EOEditingContext) @bci=34, line=166 (Interpreted frame)

...

We don't do any manual (e.g., in code) locking at the
EODatabaseContext level.


It is possible that an odd exception in EOAccess or below is  
resulting in this not getting unlocked.  Joe's reply below might  
be what is happening to you.


Chuck


Another thing to note if this is a long request to a database  
housed in an ESX vm. We had similar problems with long requests  
timing out between two systems, with one hosted by esx 4.x.  
Such long requests were caught by some low level interface  
muxing issue and my whole EOF stack was frozen when

Re: Deadlocks in one of our apps

2010-06-02 Thread Pascal Robert


On 10-06-02 at 11:39, Chuck Hill wrote:



On Jun 2, 2010, at 8:16 AM, Pascal Robert wrote:



On 10-06-02 at 10:30, Chuck Hill wrote:


That makes your code look guilty then.  :-)


Funny thing is that it's not really my code (i.e., I didn't write it),
but this is code dating from WO 5.2. It's just that this app never
had that much traffic.


And I did try stress loading this app with JMeter, but since the  
URL is changed when the long response page is called (session ID is  
put back in the URL) and I don't know how to fix this, that part  
was not stress loaded.


Check your long response page implementation again.  Are there any  
exceptions in the log that might be related?


Just to explain a bit more:

- It's a (non-public) online store. When people log in, we create an
order in memory and customers add order items to the order. We
don't store anything in the DB until the payment is made with
PayFlow. When we get the response from PayFlow, we store a copy of
the order (and the items) to our Oracle db. After that, we contact
our SQL Server db (actually an accounting system; we send the
data to a stored procedure), and we get the invoice number produced
by the accounting system and store it in the order EO in Oracle.


So in summary:

- People log in, we create an order EO, the EO is created in the
session's editing context
- People add items to the order
- They start the order payment steps
- Long response page kicks in
- We contact PayFlow to make the payment
- If the payment is successful, we store the order in Oracle
- We create a new EO, in a different EC, for SQL Server
- We update the order EO to store the invoice number in Oracle
- We generate the invoice as a PDF (FOXML, generated in a separate
JVM)

- Long response page is done, pageForResult is called

Everything is done in session.defaultEditingContext EXCEPT the SQL  
Server EOs,


You are not using the session.defaultEditingContext in the long  
response page, are you?  I am pretty sure that is an excellent  
source of deadlocks.


Hum, yes, we do use it in the long response page... But since
localInstanceOfObject won't give me a copy in a new EC, what are
the options other than not using the session EC?




Chuck

where we create a new EOObjectStore, create a new EOEditingContext  
inside the new object store, and


EOObjectStore osc = new EOObjectStoreCoordinator();
EOEditingContext ec = new EOEditingContext(osc);
ec.lock();
try {
    CommandesEcom commandeEcom = CommandesEcom.creerCommandesEcom(ec);
    ...
    ec.saveChanges();
} finally {
    // Always release the lock, even if saveChanges() throws
    ec.unlock();
    ec.dispose();
    osc.dispose();
    ec = null;
    osc = null;
}

A co-worker suggested that we create a new editing context in the  
long response page, and call EOUtilities.localInstanceOfObject to  
have a copy of the order EO in the new EC, but the resulting EO is  
null, even if the source is not.


I'd also reduce the Maximum Adaptor threads (JavaMonitor -  
Application configuration - Application settings).  6 or 8 is  
probably more than enough for this app.  That will at least reduce  
the size of the thread dumps.   I'd also trim down the listen  
queue size to 2 or 4, might as well catch this as soon as possible.


Chuck



On Jun 2, 2010, at 5:19 AM, Pascal Robert wrote:

... And going back to the physical server didn't solve anything,  
I got the same deadlock this morning.


Ok, so I will move back the DB to the physical server to see if  
the problem goes away.




On Jun 1, 2010, at 6:34 AM, Pascal Robert wrote:

Hum... And after I started using ERXWOLongResponsePage, I
still got a deadlock, but this time, it says that it's an
EODatabaseContext lock:


Thread t...@92163: (state = BLOCKED)
- java.lang.Object.wait(long) @bci=0 (Interpreted frame)
- java.lang.Object.wait() @bci=2, line=474 (Interpreted frame)
- com.webobjects.foundation.NSRecursiveLock.lock() @bci=54, line=72 (Interpreted frame)
- com.webobjects.eoaccess.EODatabaseContext.lock() @bci=56, line=1973 (Interpreted frame)
- com.webobjects.eocontrol.EOObjectStoreCoordinator.addCooperatingObjectStore(com.webobjects.eocontrol.EOCooperatingObjectStore) @bci=5, line=130 (Interpreted frame)
- com.webobjects.eoaccess.EODatabaseChannel.setCurrentEditingContext(com.webobjects.eocontrol.EOEditingContext) @bci=34, line=166 (Interpreted frame)

...

We don't do any manual (e.g., in code) locking at the
EODatabaseContext level.


It is possible that an odd exception in EOAccess or below is  
resulting in this not getting unlocked.  Joe's reply below  
might be what is happening to you.


Chuck


Another thing to note if this is a long request to a database  
housed in an ESX vm

Re: Deadlocks in one of our apps

2010-06-02 Thread Chuck Hill


On Jun 2, 2010, at 8:51 AM, Pascal Robert wrote:



On 10-06-02 at 11:39, Chuck Hill wrote:



On Jun 2, 2010, at 8:16 AM, Pascal Robert wrote:



On 10-06-02 at 10:30, Chuck Hill wrote:


That makes your code look guilty then.  :-)


Funny thing is that it's not really my code (i.e., I didn't write it),
but this is code dating from WO 5.2. It's just that this app never
had that much traffic.


And I did try stress loading this app with JMeter, but since the  
URL is changed when the long response page is called (session ID  
is put back in the URL) and I don't know how to fix this, that  
part was not stress loaded.


Check your long response page implementation again.  Are there  
any exceptions in the log that might be related?


Just to explain a bit more:

- It's a (non-public) online store. When people log in, we create
an order in memory and customers add order items to the order. We
don't store anything in the DB until the payment is made with
PayFlow. When we get the response from PayFlow, we store a copy of
the order (and the items) to our Oracle db. After that, we contact
our SQL Server db (actually an accounting system; we send the
data to a stored procedure), and we get the invoice number
produced by the accounting system and store it in the order EO in
Oracle.


So in summary:

- People log in, we create an order EO, the EO is created in the
session's editing context
- People add items to the order
- They start the order payment steps
- Long response page kicks in
- We contact PayFlow to make the payment
- If the payment is successful, we store the order in Oracle
- We create a new EO, in a different EC, for SQL Server
- We update the order EO to store the invoice number in Oracle
- We generate the invoice as a PDF (FOXML, generated in a separate
JVM)

- Long response page is done, pageForResult is called

Everything is done in session.defaultEditingContext EXCEPT the SQL  
Server EOs,


You are not using the session.defaultEditingContext in the long  
response page, are you?  I am pretty sure that is an excellent  
source of deadlocks.


Hum, yes, we do use it in the long response page... But since
localInstanceOfObject won't give me a copy in a new EC, what are
the options other than not using the session EC?


Not using the session EC would be a good choice.  Make a different EC.  
Pass it into the long response page.  Be careful handing off locking.
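
To make the "be careful handing off locking" point a bit more concrete, here is a minimal plain-Java sketch. It is an analogy, not EOF code: FakeEditingContext is a hypothetical stand-in wrapping a java.util.concurrent ReentrantLock, on the assumption that the editing context lock behaves like a thread-owned recursive lock. The request thread releases its lock before the worker takes its own; if the request thread still held it, the worker would simply block.

```java
import java.util.concurrent.locks.ReentrantLock;

class LockHandOffSketch {
    // Hypothetical stand-in for an editing context: it only models the lock.
    static class FakeEditingContext {
        private final ReentrantLock lock = new ReentrantLock();
        void lock() { lock.lock(); }
        void unlock() { lock.unlock(); }
        boolean isLocked() { return lock.isLocked(); }
    }

    // Runs the "long task" on a worker thread; returns true if the work was done.
    static boolean runInBackground(FakeEditingContext ec) {
        final boolean[] done = { false };
        Thread worker = new Thread(() -> {
            ec.lock();              // the worker takes its own lock...
            try {
                done[0] = true;     // ...does the work...
            } finally {
                ec.unlock();        // ...and releases it on the same thread
            }
        });
        worker.start();
        try {
            worker.join();          // join() makes the worker's write visible here
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
        return done[0];
    }

    public static void main(String[] args) {
        FakeEditingContext ec = new FakeEditingContext();
        ec.lock();
        try {
            // prepare objects in the request thread
        } finally {
            ec.unlock();            // release BEFORE handing the context over
        }
        System.out.println("worker finished: " + runInBackground(ec));
    }
}
```

The design choice is that each thread locks and unlocks for itself, and the lock is never held across the hand-off.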


You could also save the order in an unpaid state, then fetch it in  
the long response page and update it if paid, or delete it if not.
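
A rough sketch of that "save it unpaid first" flow, in plain Java so it stands alone. Everything here is hypothetical: the ORDERS map is only a stand-in for the orders table, and Status, saveUnpaid and settle are made-up names; a real app would save and fetch through EOF in the appropriate editing contexts. The point it tries to show is that only the order id crosses over to the long task, not an EO from the session's EC.

```java
import java.util.HashMap;
import java.util.Map;

class UnpaidOrderSketch {
    enum Status { UNPAID, PAID }

    // In-memory stand-in for the orders table (assumption, not EOF).
    static final Map<Integer, Status> ORDERS = new HashMap<>();

    // Request side: persist the order as UNPAID before the long task starts.
    static void saveUnpaid(int orderId) {
        ORDERS.put(orderId, Status.UNPAID);
    }

    // Long-task side: look the order up again by id, then update or delete it.
    static void settle(int orderId, boolean paymentAccepted) {
        if (!ORDERS.containsKey(orderId)) {
            throw new IllegalStateException("unknown order " + orderId);
        }
        if (paymentAccepted) {
            ORDERS.put(orderId, Status.PAID);
        } else {
            ORDERS.remove(orderId);
        }
    }

    public static void main(String[] args) {
        saveUnpaid(1);
        settle(1, true);    // payment went through: order is kept as PAID
        saveUnpaid(2);
        settle(2, false);   // payment refused: order is deleted
        System.out.println(ORDERS);
    }
}
```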



Chuck






Chuck

where we create a new EOObjectStore, create a new EOEditingContext  
inside the new object store, and


EOObjectStore osc = new EOObjectStoreCoordinator();
EOEditingContext ec = new EOEditingContext(osc);
ec.lock();
try {
    CommandesEcom commandeEcom = CommandesEcom.creerCommandesEcom(ec);
    ...
    ec.saveChanges();
} finally {
    // Always release the lock, even if saveChanges() throws
    ec.unlock();
    ec.dispose();
    osc.dispose();
    ec = null;
    osc = null;
}

A co-worker suggested that we create a new editing context in the  
long response page, and call EOUtilities.localInstanceOfObject to  
have a copy of the order EO in the new EC, but the resulting EO is  
null, even if the source is not.


I'd also reduce the Maximum Adaptor threads (JavaMonitor -  
Application configuration - Application settings).  6 or 8 is  
probably more than enough for this app.  That will at least  
reduce the size of the thread dumps.   I'd also trim down the  
listen queue size to 2 or 4, might as well catch this as soon as  
possible.


Chuck



On Jun 2, 2010, at 5:19 AM, Pascal Robert wrote:

... And going back to the physical server didn't solve anything,  
I got the same deadlock this morning.


Ok, so I will move back the DB to the physical server to see if  
the problem goes away.




On Jun 1, 2010, at 6:34 AM, Pascal Robert wrote:

Hum... And after I started using ERXWOLongResponsePage, I
still got a deadlock, but this time, it says that it's an
EODatabaseContext lock:


Thread t...@92163: (state = BLOCKED)
- java.lang.Object.wait(long) @bci=0 (Interpreted frame)
- java.lang.Object.wait() @bci=2, line=474 (Interpreted frame)
- com.webobjects.foundation.NSRecursiveLock.lock() @bci=54, line=72 (Interpreted frame)
- com.webobjects.eoaccess.EODatabaseContext.lock() @bci=56, line=1973 (Interpreted frame)
- com.webobjects.eocontrol.EOObjectStoreCoordinator.addCooperatingObjectStore(com.webobjects.eocontrol.EOCooperatingObjectStore) @bci=5, line=130 (Interpreted frame)
- com.webobjects.eoaccess.EODatabaseChannel.setCurrentEditingContext(com.webobjects.eocontrol.EOEditingContext) @bci=34, line=166 (Interpreted frame

Re: Deadlocks in one of our apps

2010-06-02 Thread David LeBer

On 2010-06-02, at 11:51 AM, Pascal Robert wrote:

 
 On 10-06-02 at 11:39, Chuck Hill wrote:
 
 
 On Jun 2, 2010, at 8:16 AM, Pascal Robert wrote:
 
 
 On 10-06-02 at 10:30, Chuck Hill wrote:
 
 That makes your code look guilty then.  :-)
 
 Funny thing is that it's not really my code (i.e., I didn't write it), but
 this is code dating from WO 5.2. It's just that this app never had that much
 traffic.
 
 And I did try stress loading this app with JMeter, but since the URL is 
 changed when the long response page is called (session ID is put back in 
 the URL) and I don't know how to fix this, that part was not stress loaded.
 
 Check your long response page implementation again.  Are there any 
 exceptions in the log that might be related?
 
 Just to explain a bit more:
 
 - It's a (non-public) online store. When people log in, we create an order
 in memory and customers add order items to the order. We don't store
 anything in the DB until the payment is made with PayFlow. When we get the
 response from PayFlow, we store a copy of the order (and the items) to our
 Oracle db. After that, we contact our SQL Server db (actually an accounting
 system; we send the data to a stored procedure), and we get the invoice
 number produced by the accounting system and store it in the order EO in
 Oracle.
 
 So in summary:
 
 - People log in, we create an order EO, the EO is created in the session's
 editing context
 - People add items to the order
 - They start the order payment steps
 - Long response page kicks in
 - We contact PayFlow to make the payment
 - If the payment is successful, we store the order in Oracle
 - We create a new EO, in a different EC, for SQL Server
 - We update the order EO to store the invoice number in Oracle
 - We generate the invoice as a PDF (FOXML, generated in a separate JVM)
 - Long response page is done, pageForResult is called
 
 Everything is done in session.defaultEditingContext EXCEPT the SQL Server 
 EOs,
 
 You are not using the session.defaultEditingContext in the long response 
 page, are you?  I am pretty sure that is an excellent source of deadlocks.
 
 Hum, yes, we do use it in the long response page... But since
 localInstanceOfObject won't give me a copy in a new EC, what are the
 options other than not using the session EC?

Not sure about the vagaries of using the long response page, but you could 
create a new EC when you create the order - which is what I would do regardless 
of what other steps you needed to take.

Alternately, you could clone the object graph into a new EC for the long 
response page.

 
 
 Chuck
 
 where we create a new EOObjectStore, create a new EOEditingContext inside 
 the new object store, and
 
 EOObjectStore osc = new EOObjectStoreCoordinator();
 EOEditingContext ec = new EOEditingContext(osc);
 ec.lock();
 try {
     CommandesEcom commandeEcom = CommandesEcom.creerCommandesEcom(ec);
     ...
     ec.saveChanges();
 } finally {
     // Always release the lock, even if saveChanges() throws
     ec.unlock();
     ec.dispose();
     osc.dispose();
     ec = null;
     osc = null;
 }
 
 A co-worker suggested that we create a new editing context in the long 
 response page, and call EOUtilities.localInstanceOfObject to have a copy of 
 the order EO in the new EC, but the resulting EO is null, even if the 
 source is not.
 
 I'd also reduce the Maximum Adaptor threads (JavaMonitor - Application 
 configuration - Application settings).  6 or 8 is probably more than 
 enough for this app.  That will at least reduce the size of the thread 
 dumps.   I'd also trim down the listen queue size to 2 or 4, might as well 
 catch this as soon as possible.
 
 Chuck
 
 
 
 On Jun 2, 2010, at 5:19 AM, Pascal Robert wrote:
 
 ... And going back to the physical server didn't solve anything, I got 
 the same deadlock this morning.
 
 Ok, so I will move back the DB to the physical server to see if the 
 problem goes away.
 
 
 On Jun 1, 2010, at 6:34 AM, Pascal Robert wrote:
 
 Hum... And after I started using ERXWOLongResponsePage, I still got a
 deadlock, but this time, it says that it's an EODatabaseContext lock:
 
 Thread t...@92163: (state = BLOCKED)
 - java.lang.Object.wait(long) @bci=0 (Interpreted frame)
 - java.lang.Object.wait() @bci=2, line=474 (Interpreted frame)
 - com.webobjects.foundation.NSRecursiveLock.lock() @bci=54, line=72 (Interpreted frame)
 - com.webobjects.eoaccess.EODatabaseContext.lock() @bci=56, line=1973 (Interpreted frame)
 - com.webobjects.eocontrol.EOObjectStoreCoordinator.addCooperatingObjectStore(com.webobjects.eocontrol.EOCooperatingObjectStore) @bci=5, line=130 (Interpreted frame)
 - com.webobjects.eoaccess.EODatabaseChannel.setCurrentEditingContext(com.webobjects.eocontrol.EOEditingContext) @bci=34, line=166 (Interpreted frame)
 ...
 
 We don't manual

Re: Deadlocks in one of our apps

2010-06-02 Thread David LeBer

On 2010-06-02, at 11:54 AM, Chuck Hill wrote:

 
 On Jun 2, 2010, at 8:51 AM, Pascal Robert wrote:
 
 
 On 10-06-02 at 11:39, Chuck Hill wrote:
 
 
 On Jun 2, 2010, at 8:16 AM, Pascal Robert wrote:
 
 
 On 10-06-02 at 10:30, Chuck Hill wrote:
 
 That makes your code look guilty then.  :-)
 
 Funny thing is that it's not really my code (i.e., I didn't write it), but
 this is code dating from WO 5.2. It's just that this app never had that much
 traffic.
 
 And I did try stress loading this app with JMeter, but since the URL is 
 changed when the long response page is called (session ID is put back in 
 the URL) and I don't know how to fix this, that part was not stress loaded.
 
 Check your long response page implementation again.  Are there any 
 exceptions in the log that might be related?
 
 Just to explain a bit more:
 
 - It's a (non-public) online store. When people log in, we create an order
 in memory and customers add order items to the order. We don't store
 anything in the DB until the payment is made with PayFlow. When we get the
 response from PayFlow, we store a copy of the order (and the items) to our
 Oracle db. After that, we contact our SQL Server db (actually an
 accounting system; we send the data to a stored procedure), and we get
 the invoice number produced by the accounting system and store it in the
 order EO in Oracle.
 
 So in summary:
 
 - People log in, we create an order EO, the EO is created in the session's
 editing context
 - People add items to the order
 - They start the order payment steps
 - Long response page kicks in
 - We contact PayFlow to make the payment
 - If the payment is successful, we store the order in Oracle
 - We create a new EO, in a different EC, for SQL Server
 - We update the order EO to store the invoice number in Oracle
 - We generate the invoice as a PDF (FOXML, generated in a separate JVM)
 - Long response page is done, pageForResult is called
 
 Everything is done in session.defaultEditingContext EXCEPT the SQL Server 
 EOs,
 
 You are not using the session.defaultEditingContext in the long response 
 page, are you?  I am pretty sure that is an excellent source of deadlocks.
 
 Hum, yes, we do use it in the long response page... But since
 localInstanceOfObject won't give me a copy in a new EC, what are the
 options other than not using the session EC?
 
 Not using the session EC would be a good choice.  Make a different EC. Pass 
 it into the long response page.  Be careful handing off locking.
 
 You could also save the order in an unpaid state, then fetch it in the long 
 response page and update it if paid, or delete it if not.

Ooh, yeah, you could do that too.

 
 
 Chuck
 
 
 
 
 Chuck
 
 where we create a new EOObjectStore, create a new EOEditingContext inside 
 the new object store, and
 
 EOObjectStore osc = new EOObjectStoreCoordinator();
 EOEditingContext ec = new EOEditingContext(osc);
 ec.lock();
 try {
     CommandesEcom commandeEcom = CommandesEcom.creerCommandesEcom(ec);
     ...
     ec.saveChanges();
 } finally {
     // Always release the lock, even if saveChanges() throws
     ec.unlock();
     ec.dispose();
     osc.dispose();
     ec = null;
     osc = null;
 }
 
 A co-worker suggested that we create a new editing context in the long 
 response page, and call EOUtilities.localInstanceOfObject to have a copy 
 of the order EO in the new EC, but the resulting EO is null, even if the 
 source is not.
 
 I'd also reduce the Maximum Adaptor threads (JavaMonitor - Application 
 configuration - Application settings).  6 or 8 is probably more than 
 enough for this app.  That will at least reduce the size of the thread 
 dumps.   I'd also trim down the listen queue size to 2 or 4, might as 
 well catch this as soon as possible.
 
 Chuck
 
 
 
 On Jun 2, 2010, at 5:19 AM, Pascal Robert wrote:
 
 ... And going back to the physical server didn't solve anything, I got 
 the same deadlock this morning.
 
 Ok, so I will move back the DB to the physical server to see if the 
 problem goes away.
 
 
 On Jun 1, 2010, at 6:34 AM, Pascal Robert wrote:
 
 Hum... And after I started using ERXWOLongResponsePage, I still got a
 deadlock, but this time, it says that it's an EODatabaseContext lock:
 
 Thread t...@92163: (state = BLOCKED)
 - java.lang.Object.wait(long) @bci=0 (Interpreted frame)
 - java.lang.Object.wait() @bci=2, line=474 (Interpreted frame)
 - com.webobjects.foundation.NSRecursiveLock.lock() @bci=54, line=72 (Interpreted frame)
 - com.webobjects.eoaccess.EODatabaseContext.lock() @bci=56, line=1973 (Interpreted frame)
 - com.webobjects.eocontrol.EOObjectStoreCoordinator.addCooperatingObjectStore(com.webobjects.eocontrol.EOCooperatingObjectStore) @bci=5, line=130 (Interpreted frame)
 - 
 com.webobjects.eoaccess.EODatabaseChannel.setCurrentEditingContext

Re: Deadlocks in one of our apps

2010-06-01 Thread Pascal Robert
Hum... And after I started using ERXWOLongResponsePage, I still got a
deadlock, but this time, it says that it's an EODatabaseContext lock:


Thread t...@92163: (state = BLOCKED)
 - java.lang.Object.wait(long) @bci=0 (Interpreted frame)
 - java.lang.Object.wait() @bci=2, line=474 (Interpreted frame)
 - com.webobjects.foundation.NSRecursiveLock.lock() @bci=54, line=72 (Interpreted frame)
 - com.webobjects.eoaccess.EODatabaseContext.lock() @bci=56, line=1973 (Interpreted frame)
 - com.webobjects.eocontrol.EOObjectStoreCoordinator.addCooperatingObjectStore(com.webobjects.eocontrol.EOCooperatingObjectStore) @bci=5, line=130 (Interpreted frame)
 - com.webobjects.eoaccess.EODatabaseChannel.setCurrentEditingContext(com.webobjects.eocontrol.EOEditingContext) @bci=34, line=166 (Interpreted frame)

...

We don't do any manual (e.g., in code) locking at the EODatabaseContext level.


Another thing to note if this is a long request to a database housed  
in an ESX vm. We had similar problems with long requests timing out  
between two systems, with one hosted by esx 4.x. Such long requests  
were caught by some low level interface muxing issue and my whole  
EOF stack was frozen when the underlying db connection was lost mid- 
transaction. I resolved it by moving this application off of a vm.




On May 31, 2010, at 5:33 PM, Pascal Robert prob...@macti.ca wrote:

Ok, will try with ERXWOLongResponsePage since it looks like it's
locking and unlocking all ECs in the thread.


There's a bunch of stuff wrong here. First, the only actually  
locked thread is:


- com.webobjects.eocontrol.EOObjectStoreCoordinator.addCooperatingObjectStore(com.webobjects.eocontrol.EOCooperatingObjectStore) @bci=5, line=130 (Interpreted frame)
- com.webobjects.eoaccess.EODatabaseChannel.setCurrentEditingContext(com.webobjects.eocontrol.EOEditingContext) @bci=34, line=166 (Interpreted frame)
- com.webobjects.eoaccess.EODatabaseChannel._selectWithFetchSpecificationEditingContext(com.webobjects.eocontrol.EOFetchSpecification, com.webobjects.eocontrol.EOEditingContext) @bci=158, line=788 (Interpreted frame)
- com.webobjects.eoaccess.EODatabaseChannel.selectObjectsWithFetchSpecification(com.webobjects.eocontrol.EOFetchSpecification, com.webobjects.eocontrol.EOEditingContext) @bci=64, line=215 (Interpreted frame)
- com.webobjects.eoaccess.EODatabaseContext._objectsWithFetchSpecificationEditingContext(com.webobjects.eocontrol.EOFetchSpecification, com.webobjects.eocontrol.EOEditingContext) @bci=219, line=3205 (Interpreted frame)
- com.webobjects.eoaccess.EODatabaseContext.objectsWithFetchSpecification(com.webobjects.eocontrol.EOFetchSpecification, com.webobjects.eocontrol.EOEditingContext) @bci=34, line=3346 (Interpreted frame)
- com.webobjects.eocontrol.EOObjectStoreCoordinator.objectsWithFetchSpecification(com.webobjects.eocontrol.EOFetchSpecification, com.webobjects.eocontrol.EOEditingContext) @bci=97, line=539 (Interpreted frame)
- com.webobjects.eocontrol.EOEditingContext.objectsWithFetchSpecification(com.webobjects.eocontrol.EOFetchSpecification, com.webobjects.eocontrol.EOEditingContext) @bci=79, line=4114 (Interpreted frame)
- er.extensions.eof.ERXEC.objectsWithFetchSpecification(com.webobjects.eocontrol.EOFetchSpecification, com.webobjects.eocontrol.EOEditingContext) @bci=72, line=1211 (Interpreted frame)
- com.webobjects.eocontrol.EOEditingContext.objectsWithFetchSpecification(com.webobjects.eocontrol.EOFetchSpecification) @bci=3, line=4500 (Interpreted frame)
- com.acaiq.fondation.acaiqCore._Licence.fetchLicences(com.webobjects.eocontrol.EOEditingContext, com.webobjects.eocontrol.EOQualifier, com.webobjects.foundation.NSArray) @bci=19, line=1062 (Interpreted frame)
- com.acaiq.fondation.acaiqCore._Membre.licences(com.webobjects.eocontrol.EOQualifier, com.webobjects.foundation.NSArray, boolean) @bci=77, line=8920 (Interpreted frame)
- com.acaiq.fondation.acaiqCore._Membre.licences(com.webobjects.eocontrol.EOQualifier, boolean) @bci=4, line=8893 (Interpreted frame)
- com.acaiq.fondation.acaiqCore.Membre.licencesParEtats(com.acaiq.fondation.acaiqCore.EtatMembre[]) @bci=100, line=980 (Interpreted frame)
- com.acaiq.fondation.acaiqCore.Membre.licencesValides() @bci=11, line=996 (Interpreted frame)
- com.acaiq.fondation.acaiqCore.Membre.estCourtier() @bci=5, line=1035 (Interpreted frame)
- sun.reflect.GeneratedMethodAccessor87.invoke(java.lang.Object, java.lang.Object[]) @bci=40 (Interpreted frame)


Which reminds me of an unlocked EC/OSC. Second:

java.lang.IllegalArgumentException: Attribute noCommandeOracle can't receive a null parameter :
	at com.acaiq.fondation.depot.lbaArticle._CommandesEcom.setNoCommandeOracle(_CommandesEcom.java:419)


This is a *template* that throws on null?? You sure 

Re: Deadlocks in one of our apps

2010-06-01 Thread Anjo Krank
Um. Just how did you switch to ERXWOLongResponsePage? If you overrode run()
then nothing's gonna happen.

Cheers, Anjo

On 01.06.2010 at 15:34, Pascal Robert wrote:

 ERXWOLongResponsePage

 ___
Do not post admin requests to the list. They will be ignored.
Webobjects-dev mailing list  (Webobjects-dev@lists.apple.com)
Help/Unsubscribe/Update your Subscription:
http://lists.apple.com/mailman/options/webobjects-dev/archive%40mail-archive.com

This email sent to arch...@mail-archive.com


Re: Deadlocks in one of our apps

2010-06-01 Thread Pascal Robert
We have a component, OSLongResponseComponent, that used to extend
WOLongResponsePage and now extends ERXWOLongResponsePage.
The only things we are overriding are valueForKeyPath and
appendToResponse; run() is not overridden.
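
For anyone following along, here is why overriding run() would matter, as a plain-Java sketch. This is NOT the Wonder source: BasePage, GoodPage and BadPage are hypothetical names, and the "start"/"end" log entries merely stand in for whatever per-thread setup and cleanup the base class is assumed to do around the work. If a subclass replaces run() itself instead of the hook method, that bracketing is skipped.

```java
class LongResponseSketch {
    // Hypothetical base class: run() is the template that brackets the work.
    static class BasePage implements Runnable {
        final StringBuilder log = new StringBuilder();

        @Override
        public void run() {
            log.append("start;");       // stand-in for per-thread setup
            try {
                performAction();
            } finally {
                log.append("end;");     // stand-in for cleanup / unlocking
            }
        }

        // Subclasses are expected to put their work here.
        protected void performAction() {
            log.append("work;");
        }
    }

    // Overrides the hook: setup and cleanup still run around the work.
    static class GoodPage extends BasePage {
        @Override
        protected void performAction() { log.append("payment;"); }
    }

    // Overrides run() itself: the bracketing code is silently skipped.
    static class BadPage extends BasePage {
        @Override
        public void run() { log.append("payment;"); }
    }

    public static void main(String[] args) {
        GoodPage good = new GoodPage();
        good.run();
        BadPage bad = new BadPage();
        bad.run();
        System.out.println(good.log + " vs " + bad.log);
    }
}
```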


Um. Just how did you switch to ERXWOLongResponsePage? If you
overrode run() then nothing's gonna happen.


Cheers, Anjo

On 01.06.2010 at 15:34, Pascal Robert wrote:


ERXWOLongResponsePage






Pascal Robert
prob...@macti.ca

AIM: MacTICanada
Twitter : MacTICanada
LinkedIn : http://www.linkedin.com/in/macti
WO Community profile : http://wocommunity.org/page/member?name=probert



Re: Deadlocks in one of our apps

2010-06-01 Thread Chuck Hill


On Jun 1, 2010, at 6:34 AM, Pascal Robert wrote:

Hum... And after I started using ERXWOLongResponsePage, I still got
a deadlock, but this time, it says that it's an EODatabaseContext
lock:


Thread t...@92163: (state = BLOCKED)
- java.lang.Object.wait(long) @bci=0 (Interpreted frame)
- java.lang.Object.wait() @bci=2, line=474 (Interpreted frame)
- com.webobjects.foundation.NSRecursiveLock.lock() @bci=54, line=72 (Interpreted frame)
- com.webobjects.eoaccess.EODatabaseContext.lock() @bci=56, line=1973 (Interpreted frame)
- com.webobjects.eocontrol.EOObjectStoreCoordinator.addCooperatingObjectStore(com.webobjects.eocontrol.EOCooperatingObjectStore) @bci=5, line=130 (Interpreted frame)
- com.webobjects.eoaccess.EODatabaseChannel.setCurrentEditingContext(com.webobjects.eocontrol.EOEditingContext) @bci=34, line=166 (Interpreted frame)

...

We don't do any manual (e.g., in code) locking at the EODatabaseContext
level.


It is possible that an odd exception in EOAccess or below is resulting  
in this not getting unlocked.  Joe's reply below might be what is  
happening to you.
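
As an illustration of how an exception can leave a lock held, here is a small plain-Java sketch. It uses java.util.concurrent's ReentrantLock as a stand-in for the EOF locks, so it shows the general failure mode only and says nothing about what EOAccess actually does internally; the method names are made up for the example.

```java
import java.util.concurrent.locks.ReentrantLock;

class LeakedLockSketch {
    // Locks, runs the task, unlocks; the unlock is skipped if the task throws.
    static void unsafe(ReentrantLock lock, Runnable task) {
        lock.lock();
        task.run();
        lock.unlock();
    }

    // Same, but the finally block releases the lock even when the task throws.
    static void safe(ReentrantLock lock, Runnable task) {
        lock.lock();
        try {
            task.run();
        } finally {
            lock.unlock();
        }
    }

    // Runs one variant with a failing task and reports whether the lock stayed held.
    static boolean staysLockedAfterFailure(boolean useFinally) {
        ReentrantLock lock = new ReentrantLock();
        Runnable failing = () -> { throw new IllegalStateException("simulated failure"); };
        try {
            if (useFinally) {
                safe(lock, failing);
            } else {
                unsafe(lock, failing);
            }
        } catch (IllegalStateException expected) {
            // swallowed on purpose: only the lock state afterwards matters here
        }
        return lock.isLocked();
    }

    public static void main(String[] args) {
        System.out.println("without finally, still locked: " + staysLockedAfterFailure(false));
        System.out.println("with finally, still locked: " + staysLockedAfterFailure(true));
    }
}
```

Any other thread that later asks for the leaked lock would wait forever, which is what a BLOCKED thread in a dump like the one above can look like.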


Chuck


Another thing to note if this is a long request to a database housed  
in an ESX vm. We had similar problems with long requests timing out  
between two systems, with one hosted by esx 4.x. Such long requests  
were caught by some low level interface muxing issue and my whole  
EOF stack was frozen when the underlying db connection was lost mid- 
transaction. I resolved it by moving this application off of a vm.




On May 31, 2010, at 5:33 PM, Pascal Robert prob...@macti.ca wrote:

Ok, will try with ERXWOLongResponsePage since it looks like it's
locking and unlocking all ECs in the thread.


There's a bunch of stuff wrong here. First, the only actually  
locked thread is:


- com.webobjects.eocontrol.EOObjectStoreCoordinator.addCooperatingObjectStore(com.webobjects.eocontrol.EOCooperatingObjectStore) @bci=5, line=130 (Interpreted frame)
- com.webobjects.eoaccess.EODatabaseChannel.setCurrentEditingContext(com.webobjects.eocontrol.EOEditingContext) @bci=34, line=166 (Interpreted frame)
- com.webobjects.eoaccess.EODatabaseChannel._selectWithFetchSpecificationEditingContext(com.webobjects.eocontrol.EOFetchSpecification, com.webobjects.eocontrol.EOEditingContext) @bci=158, line=788 (Interpreted frame)
- com.webobjects.eoaccess.EODatabaseChannel.selectObjectsWithFetchSpecification(com.webobjects.eocontrol.EOFetchSpecification, com.webobjects.eocontrol.EOEditingContext) @bci=64, line=215 (Interpreted frame)
- com.webobjects.eoaccess.EODatabaseContext._objectsWithFetchSpecificationEditingContext(com.webobjects.eocontrol.EOFetchSpecification, com.webobjects.eocontrol.EOEditingContext) @bci=219, line=3205 (Interpreted frame)
- com.webobjects.eoaccess.EODatabaseContext.objectsWithFetchSpecification(com.webobjects.eocontrol.EOFetchSpecification, com.webobjects.eocontrol.EOEditingContext) @bci=34, line=3346 (Interpreted frame)
- com.webobjects.eocontrol.EOObjectStoreCoordinator.objectsWithFetchSpecification(com.webobjects.eocontrol.EOFetchSpecification, com.webobjects.eocontrol.EOEditingContext) @bci=97, line=539 (Interpreted frame)
- com.webobjects.eocontrol.EOEditingContext.objectsWithFetchSpecification(com.webobjects.eocontrol.EOFetchSpecification, com.webobjects.eocontrol.EOEditingContext) @bci=79, line=4114 (Interpreted frame)
- er.extensions.eof.ERXEC.objectsWithFetchSpecification(com.webobjects.eocontrol.EOFetchSpecification, com.webobjects.eocontrol.EOEditingContext) @bci=72, line=1211 (Interpreted frame)
- com.webobjects.eocontrol.EOEditingContext.objectsWithFetchSpecification(com.webobjects.eocontrol.EOFetchSpecification) @bci=3, line=4500 (Interpreted frame)
- com.acaiq.fondation.acaiqCore._Licence.fetchLicences(com.webobjects.eocontrol.EOEditingContext, com.webobjects.eocontrol.EOQualifier, com.webobjects.foundation.NSArray) @bci=19, line=1062 (Interpreted frame)
- com.acaiq.fondation.acaiqCore._Membre.licences(com.webobjects.eocontrol.EOQualifier, com.webobjects.foundation.NSArray, boolean) @bci=77, line=8920 (Interpreted frame)
- com.acaiq.fondation.acaiqCore._Membre.licences(com.webobjects.eocontrol.EOQualifier, boolean) @bci=4, line=8893 (Interpreted frame)
- com.acaiq.fondation.acaiqCore.Membre.licencesParEtats(com.acaiq.fondation.acaiqCore.EtatMembre[]) @bci=100, line=980 (Interpreted frame)
- com.acaiq.fondation.acaiqCore.Membre.licencesValides() @bci=11,  
line=996 (Interpreted frame)
- com.acaiq.fondation.acaiqCore.Membre.estCourtier() @bci=5,  
line=1035 (Interpreted frame)
- sun.reflect.GeneratedMethodAccessor87.invoke(java.lang.Object,  
java.lang.Object[]) @bci=40 (Interpreted frame)


Which reminds me of an unlocked EC/OSC. Second:

java.lang.IllegalArgumentException: Attribute 

Re: Deadlocks in one of our apps

2010-05-31 Thread Anjo Krank
There's a bunch of stuff wrong here. First, the only actually locked thread is:

 - com.webobjects.eocontrol.EOObjectStoreCoordinator.addCooperatingObjectStore(com.webobjects.eocontrol.EOCooperatingObjectStore) @bci=5, line=130 (Interpreted frame)
 - com.webobjects.eoaccess.EODatabaseChannel.setCurrentEditingContext(com.webobjects.eocontrol.EOEditingContext) @bci=34, line=166 (Interpreted frame)
 - com.webobjects.eoaccess.EODatabaseChannel._selectWithFetchSpecificationEditingContext(com.webobjects.eocontrol.EOFetchSpecification, com.webobjects.eocontrol.EOEditingContext) @bci=158, line=788 (Interpreted frame)
 - com.webobjects.eoaccess.EODatabaseChannel.selectObjectsWithFetchSpecification(com.webobjects.eocontrol.EOFetchSpecification, com.webobjects.eocontrol.EOEditingContext) @bci=64, line=215 (Interpreted frame)
 - com.webobjects.eoaccess.EODatabaseContext._objectsWithFetchSpecificationEditingContext(com.webobjects.eocontrol.EOFetchSpecification, com.webobjects.eocontrol.EOEditingContext) @bci=219, line=3205 (Interpreted frame)
 - com.webobjects.eoaccess.EODatabaseContext.objectsWithFetchSpecification(com.webobjects.eocontrol.EOFetchSpecification, com.webobjects.eocontrol.EOEditingContext) @bci=34, line=3346 (Interpreted frame)
 - com.webobjects.eocontrol.EOObjectStoreCoordinator.objectsWithFetchSpecification(com.webobjects.eocontrol.EOFetchSpecification, com.webobjects.eocontrol.EOEditingContext) @bci=97, line=539 (Interpreted frame)
 - com.webobjects.eocontrol.EOEditingContext.objectsWithFetchSpecification(com.webobjects.eocontrol.EOFetchSpecification, com.webobjects.eocontrol.EOEditingContext) @bci=79, line=4114 (Interpreted frame)
 - er.extensions.eof.ERXEC.objectsWithFetchSpecification(com.webobjects.eocontrol.EOFetchSpecification, com.webobjects.eocontrol.EOEditingContext) @bci=72, line=1211 (Interpreted frame)
 - com.webobjects.eocontrol.EOEditingContext.objectsWithFetchSpecification(com.webobjects.eocontrol.EOFetchSpecification) @bci=3, line=4500 (Interpreted frame)
 - com.acaiq.fondation.acaiqCore._Licence.fetchLicences(com.webobjects.eocontrol.EOEditingContext, com.webobjects.eocontrol.EOQualifier, com.webobjects.foundation.NSArray) @bci=19, line=1062 (Interpreted frame)
 - com.acaiq.fondation.acaiqCore._Membre.licences(com.webobjects.eocontrol.EOQualifier, com.webobjects.foundation.NSArray, boolean) @bci=77, line=8920 (Interpreted frame)
 - com.acaiq.fondation.acaiqCore._Membre.licences(com.webobjects.eocontrol.EOQualifier, boolean) @bci=4, line=8893 (Interpreted frame)
 - com.acaiq.fondation.acaiqCore.Membre.licencesParEtats(com.acaiq.fondation.acaiqCore.EtatMembre[]) @bci=100, line=980 (Interpreted frame)
 - com.acaiq.fondation.acaiqCore.Membre.licencesValides() @bci=11, line=996 (Interpreted frame)
 - com.acaiq.fondation.acaiqCore.Membre.estCourtier() @bci=5, line=1035 (Interpreted frame)
 - sun.reflect.GeneratedMethodAccessor87.invoke(java.lang.Object, java.lang.Object[]) @bci=40 (Interpreted frame)

Which reminds me of an unlocked EC/OSC. Second:

 java.lang.IllegalArgumentException: Attribute noCommandeOracle can't receive a null parameter :
   at com.acaiq.fondation.depot.lbaArticle._CommandesEcom.setNoCommandeOracle(_CommandesEcom.java:419)

This is a *template* that throws on null?? You sure that's such a bright idea? 
Isn't this what validation is for? And third:

   at com.acaiq.depot.component.TransactionAchat.performAction(TransactionAchat.java:63)
   at com.webobjects.woextensions.WOLongResponsePage.run(WOLongResponsePage.java:119)

As you're throwing from inside a normal 
com.webobjects.woextensions.WOLongResponsePage, I seriously hope you're doing 
your part of try{} finally{} and EC unlocking.
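
A minimal sketch of the try{} finally{} discipline, for anyone following along. It does not use the WebObjects classes: StubEditingContext is a hypothetical stand-in built on java.util.concurrent.locks.ReentrantLock that models only the lock()/unlock() pair, so the point (an exception inside the work must not leak the lock) can be run on a plain JVM.

```java
import java.util.concurrent.locks.ReentrantLock;

// Hypothetical stand-in for an editing context; only lock()/unlock() are modeled.
// This is NOT com.webobjects.eocontrol.EOEditingContext.
class StubEditingContext {
    private final ReentrantLock lock = new ReentrantLock();

    void lock() { lock.lock(); }
    void unlock() { lock.unlock(); }
    boolean isLocked() { return lock.isLocked(); }
}

class TryFinallyUnlockSketch {
    // Runs a job under the context lock and reports whether the context
    // is still locked once the job has returned or thrown.
    static boolean stillLockedAfter(Runnable job) {
        StubEditingContext ec = new StubEditingContext();
        ec.lock();
        try {
            job.run();
        } catch (RuntimeException e) {
            // A real task would log or report the failure here.
        } finally {
            // Always reached, whether the job returned normally or threw.
            ec.unlock();
        }
        return ec.isLocked();
    }

    public static void main(String[] args) {
        boolean locked = stillLockedAfter(() -> {
            throw new IllegalArgumentException("can't receive a null parameter");
        });
        System.out.println("still locked after failure: " + locked);
    }
}
```

Without the finally block, the exception thrown by the job would skip the unlock() call and the lock would stay held.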


Cheers, Anjo



On 31.05.2010 at 20:02, Pascal Robert wrote:

 One of our apps has deadlocked 5 times over 3 days; strangely enough, it 
 started when we moved our Oracle Database 10gR2 DB to our VMware ESX 4.0 
 cluster. We didn't re-install Oracle, I simply did a P2V (Physical to VM) 
 conversion, so it's the exact same version of Oracle DB as before.
 
 What's happening is that we store some information in our Oracle database, 
 save it, and then build a copy of some of the data as a new EO (different 
 entity) in a SQL Server 2005 DB so the accounting system can take care of billing.
 
 The exception that caused the deadlock (or at least the last thing written to 
 the log before the deadlock):
 
 java.lang.IllegalArgumentException: Attribute noCommandeOracle can't receive a null parameter :
   at com.acaiq.fondation.depot.lbaArticle._CommandesEcom.setNoCommandeOracle(_CommandesEcom.java:419)
   at com.acaiq.fondation.depot.Caissier.copiePourLBA(Caissier.java:267)
   at com.acaiq.fondation.depot.Caissier.paye(Caissier.java:137)
   at com.acaiq.depot.component.TransactionAchat.performAction(TransactionAchat.java:63)
   at com.webobjects.woextensions.WOLongResponsePage.run(WOLongResponsePage.java:119)

Re: Deadlocks in one of our apps

2010-05-31 Thread Pascal Robert
Ok, will try with ERXWOLongResponsePage since it looks like it's  
locking and unlocking all ECs in the thread.


There's a bunch of stuff wrong here. First, the only actually locked  
thread is:


- com.webobjects.eocontrol.EOObjectStoreCoordinator.addCooperatingObjectStore(com.webobjects.eocontrol.EOCooperatingObjectStore) @bci=5, line=130 (Interpreted frame)
- com.webobjects.eoaccess.EODatabaseChannel.setCurrentEditingContext(com.webobjects.eocontrol.EOEditingContext) @bci=34, line=166 (Interpreted frame)
- com.webobjects.eoaccess.EODatabaseChannel._selectWithFetchSpecificationEditingContext(com.webobjects.eocontrol.EOFetchSpecification, com.webobjects.eocontrol.EOEditingContext) @bci=158, line=788 (Interpreted frame)
- com.webobjects.eoaccess.EODatabaseChannel.selectObjectsWithFetchSpecification(com.webobjects.eocontrol.EOFetchSpecification, com.webobjects.eocontrol.EOEditingContext) @bci=64, line=215 (Interpreted frame)
- com.webobjects.eoaccess.EODatabaseContext._objectsWithFetchSpecificationEditingContext(com.webobjects.eocontrol.EOFetchSpecification, com.webobjects.eocontrol.EOEditingContext) @bci=219, line=3205 (Interpreted frame)
- com.webobjects.eoaccess.EODatabaseContext.objectsWithFetchSpecification(com.webobjects.eocontrol.EOFetchSpecification, com.webobjects.eocontrol.EOEditingContext) @bci=34, line=3346 (Interpreted frame)
- com.webobjects.eocontrol.EOObjectStoreCoordinator.objectsWithFetchSpecification(com.webobjects.eocontrol.EOFetchSpecification, com.webobjects.eocontrol.EOEditingContext) @bci=97, line=539 (Interpreted frame)
- com.webobjects.eocontrol.EOEditingContext.objectsWithFetchSpecification(com.webobjects.eocontrol.EOFetchSpecification, com.webobjects.eocontrol.EOEditingContext) @bci=79, line=4114 (Interpreted frame)
- er.extensions.eof.ERXEC.objectsWithFetchSpecification(com.webobjects.eocontrol.EOFetchSpecification, com.webobjects.eocontrol.EOEditingContext) @bci=72, line=1211 (Interpreted frame)
- com.webobjects.eocontrol.EOEditingContext.objectsWithFetchSpecification(com.webobjects.eocontrol.EOFetchSpecification) @bci=3, line=4500 (Interpreted frame)
- com.acaiq.fondation.acaiqCore._Licence.fetchLicences(com.webobjects.eocontrol.EOEditingContext, com.webobjects.eocontrol.EOQualifier, com.webobjects.foundation.NSArray) @bci=19, line=1062 (Interpreted frame)
- com.acaiq.fondation.acaiqCore._Membre.licences(com.webobjects.eocontrol.EOQualifier, com.webobjects.foundation.NSArray, boolean) @bci=77, line=8920 (Interpreted frame)
- com.acaiq.fondation.acaiqCore._Membre.licences(com.webobjects.eocontrol.EOQualifier, boolean) @bci=4, line=8893 (Interpreted frame)
- com.acaiq.fondation.acaiqCore.Membre.licencesParEtats(com.acaiq.fondation.acaiqCore.EtatMembre[]) @bci=100, line=980 (Interpreted frame)
- com.acaiq.fondation.acaiqCore.Membre.licencesValides() @bci=11, line=996 (Interpreted frame)
- com.acaiq.fondation.acaiqCore.Membre.estCourtier() @bci=5, line=1035 (Interpreted frame)
- sun.reflect.GeneratedMethodAccessor87.invoke(java.lang.Object, java.lang.Object[]) @bci=40 (Interpreted frame)


Which reminds me of an unlocked EC/OSC. Second:

java.lang.IllegalArgumentException: Attribute noCommandeOracle can't receive a null parameter :
	at com.acaiq.fondation.depot.lbaArticle._CommandesEcom.setNoCommandeOracle(_CommandesEcom.java:419)


This is a *template* that throws on null?? You sure that's such a  
bright idea? Isn't this what validation is for? And third:


	at com.acaiq.depot.component.TransactionAchat.performAction(TransactionAchat.java:63)
	at com.webobjects.woextensions.WOLongResponsePage.run(WOLongResponsePage.java:119)


As you're throwing from inside a normal  
com.webobjects.woextensions.WOLongResponsePage, I seriously hope  
you're doing your part of try{} finally{} and EC unlocking.



Cheers, Anjo



On 31.05.2010 at 20:02, Pascal Robert wrote:

One of our apps has deadlocked 5 times over 3 days; strangely  
enough, it started when we moved our Oracle Database 10gR2 DB to our  
VMware ESX 4.0 cluster. We didn't re-install Oracle, I simply did a  
P2V (Physical to VM) conversion, so it's the exact same version of  
Oracle DB as before.


What's happening is that we store some information in our Oracle  
database, save it, and then build a copy of some of the data as a new  
EO (different entity) in a SQL Server 2005 DB so the accounting  
system can take care of billing.


The exception that caused the deadlock (or at least the last thing  
written to the log before the deadlock):


java.lang.IllegalArgumentException: Attribute noCommandeOracle can't receive a null parameter :
	at com.acaiq.fondation.depot.lbaArticle._CommandesEcom.setNoCommandeOracle(_CommandesEcom.java:419)
	at

Re: Deadlocks in one of our apps

2010-05-31 Thread Joe Little
Another thing to note, if this is a long request to a database housed in an ESX 
VM: we had similar problems with long requests timing out between two systems, 
one of them hosted on ESX 4.x. Such long requests were caught by some low-level 
interface muxing issue, and my whole EOF stack froze when the underlying DB 
connection was lost mid-transaction. I resolved it by moving this application 
off of a VM.



On May 31, 2010, at 5:33 PM, Pascal Robert prob...@macti.ca wrote:

 Ok, will try with ERXWOLongResponsePage since it looks like it's locking and 
 unlocking all ECs in the thread.
 
 There's a bunch of stuff wrong here. First, the only actually locked thread 
 is:
 
 - com.webobjects.eocontrol.EOObjectStoreCoordinator.addCooperatingObjectStore(com.webobjects.eocontrol.EOCooperatingObjectStore) @bci=5, line=130 (Interpreted frame)
 - com.webobjects.eoaccess.EODatabaseChannel.setCurrentEditingContext(com.webobjects.eocontrol.EOEditingContext) @bci=34, line=166 (Interpreted frame)
 - com.webobjects.eoaccess.EODatabaseChannel._selectWithFetchSpecificationEditingContext(com.webobjects.eocontrol.EOFetchSpecification, com.webobjects.eocontrol.EOEditingContext) @bci=158, line=788 (Interpreted frame)
 - com.webobjects.eoaccess.EODatabaseChannel.selectObjectsWithFetchSpecification(com.webobjects.eocontrol.EOFetchSpecification, com.webobjects.eocontrol.EOEditingContext) @bci=64, line=215 (Interpreted frame)
 - com.webobjects.eoaccess.EODatabaseContext._objectsWithFetchSpecificationEditingContext(com.webobjects.eocontrol.EOFetchSpecification, com.webobjects.eocontrol.EOEditingContext) @bci=219, line=3205 (Interpreted frame)
 - com.webobjects.eoaccess.EODatabaseContext.objectsWithFetchSpecification(com.webobjects.eocontrol.EOFetchSpecification, com.webobjects.eocontrol.EOEditingContext) @bci=34, line=3346 (Interpreted frame)
 - com.webobjects.eocontrol.EOObjectStoreCoordinator.objectsWithFetchSpecification(com.webobjects.eocontrol.EOFetchSpecification, com.webobjects.eocontrol.EOEditingContext) @bci=97, line=539 (Interpreted frame)
 - com.webobjects.eocontrol.EOEditingContext.objectsWithFetchSpecification(com.webobjects.eocontrol.EOFetchSpecification, com.webobjects.eocontrol.EOEditingContext) @bci=79, line=4114 (Interpreted frame)
 - er.extensions.eof.ERXEC.objectsWithFetchSpecification(com.webobjects.eocontrol.EOFetchSpecification, com.webobjects.eocontrol.EOEditingContext) @bci=72, line=1211 (Interpreted frame)
 - com.webobjects.eocontrol.EOEditingContext.objectsWithFetchSpecification(com.webobjects.eocontrol.EOFetchSpecification) @bci=3, line=4500 (Interpreted frame)
 - com.acaiq.fondation.acaiqCore._Licence.fetchLicences(com.webobjects.eocontrol.EOEditingContext, com.webobjects.eocontrol.EOQualifier, com.webobjects.foundation.NSArray) @bci=19, line=1062 (Interpreted frame)
 - com.acaiq.fondation.acaiqCore._Membre.licences(com.webobjects.eocontrol.EOQualifier, com.webobjects.foundation.NSArray, boolean) @bci=77, line=8920 (Interpreted frame)
 - com.acaiq.fondation.acaiqCore._Membre.licences(com.webobjects.eocontrol.EOQualifier, boolean) @bci=4, line=8893 (Interpreted frame)
 - com.acaiq.fondation.acaiqCore.Membre.licencesParEtats(com.acaiq.fondation.acaiqCore.EtatMembre[]) @bci=100, line=980 (Interpreted frame)
 - com.acaiq.fondation.acaiqCore.Membre.licencesValides() @bci=11, line=996 (Interpreted frame)
 - com.acaiq.fondation.acaiqCore.Membre.estCourtier() @bci=5, line=1035 (Interpreted frame)
 - sun.reflect.GeneratedMethodAccessor87.invoke(java.lang.Object, java.lang.Object[]) @bci=40 (Interpreted frame)
 
 Which reminds me of an unlocked EC/OSC. Second:
 
 java.lang.IllegalArgumentException: Attribute noCommandeOracle can't receive a null parameter :
   at com.acaiq.fondation.depot.lbaArticle._CommandesEcom.setNoCommandeOracle(_CommandesEcom.java:419)
 
 This is a *template* that throws on null?? You sure that's such a bright 
 idea? Isn't this what validation is for? And third:
 
   at com.acaiq.depot.component.TransactionAchat.performAction(TransactionAchat.java:63)
   at com.webobjects.woextensions.WOLongResponsePage.run(WOLongResponsePage.java:119)
 
 As you're throwing from inside a normal 
 com.webobjects.woextensions.WOLongResponsePage, I seriously hope you're 
 doing your part of try{} finally{} and EC unlocking.
 
 
 Cheers, Anjo
 
 
 
 On 31.05.2010 at 20:02, Pascal Robert wrote:
 
 One of our apps has deadlocked 5 times over 3 days; strangely enough, it 
 started when we moved our Oracle Database 10gR2 DB to our VMware ESX 4.0 
 cluster. We didn't re-install Oracle, I simply did a P2V (Physical to VM) 
 conversion, so it's the exact same version of Oracle DB as before.
 
 What's happening is that we store some information in our Oracle database, 
 save it, and then build a copy of some of the data as a new EO (different 
 entity) in a SQL Server 2005 DB so the accounting system can take care of 
 billing.
 

Re: Deadlocks

2007-09-10 Thread Pascal Robert


On 07-09-05 at 18:14, Guido Neitzer wrote:


On 05.09.2007, at 16:01, Simon McLean wrote:

We're experiencing some pretty bad deadlock issues at the moment  
and I'm pretty convinced it's down to EC lock abuse.


Get a stacktrace of your running application to verify that:

http://tinyurl.com/3bpkkv


BTW, no need to use tinyurl.com for links to the wiki, when you go on  
the Info tab for the page in the wiki, Confluence will display a  
shorter link, in that case :


http://wiki.objectstyle.org/confluence/x/sAED 
___
Do not post admin requests to the list. They will be ignored.
Webobjects-dev mailing list  (Webobjects-dev@lists.apple.com)
Help/Unsubscribe/Update your Subscription:
http://lists.apple.com/mailman/options/webobjects-dev/archive%40mail-archive.com

This email sent to [EMAIL PROTECTED]


Re: Deadlocks

2007-09-10 Thread Simon McLean

Hi Guido -

Many thanks for that URL - and thanks to everyone else that posted  
ideas. We had to let the app fall over quite a few times and grab a  
half dozen stack traces before we figured it out, but the app is  
humming along nicely once again now.


We ended up finding 2 core issues:

1) the occasional use of new EOEditingContext() instead of  
ERXEC.newEditingContext()

2) use of the session's editing context inside a thread

Both were well buried sins that once the app scaled up became rather  
ugly :-(


Thanks again,

Simon

On 5 Sep 2007, at 23:14, Guido Neitzer wrote:


On 05.09.2007, at 16:01, Simon McLean wrote:

We're experiencing some pretty bad deadlock issues at the moment  
and I'm pretty convinced it's down to EC lock abuse.


Get a stacktrace of your running application to verify that:

http://tinyurl.com/3bpkkv


... we should never have to manually lock or unlock an EC ?


That is true, yes - but you still might run into problems if you do  
bad things.


Or put another way, when using these rules is there any situation  
that we would have to call ec.lock() or ec.unlock() in our code ?


I normally lock and unlock manually on long response pages / tasks,  
as the unlocking of editing contexts relies on the request response  
loop.


If you see problems in the stacktrace, when the session gets  
checked out from the session store, make sure you NEVER EVER touch  
something from the session's default editing context inside your  
performAction method on a long response page. This will autolock  
your session's default editing context, it will not get unlocked,  
because you are outside of the rr loop and the next checkout for  
that session will fail.


The other thing I saw with deadlocks: if you run out of space on  
your server, log4j might deadlock.


cug




Re: Deadlocks

2007-09-10 Thread Guido Neitzer
Pascal Robert [EMAIL PROTECTED] wrote:

 BTW, no need to use tinyurl.com for links to the wiki, when you go on
 the Info tab for the page in the wiki, Confluence will display a  
 shorter link, in that case :
 
   http://wiki.objectstyle.org/confluence/x/sAED

Ah, thanks for the hint.

cug


Re: Deadlocks

2007-09-06 Thread Anjo Krank


On 06.09.2007 at 00:39, Mike Schrag wrote:

If you see problems in the stacktrace, when the session gets  
checked out from the session store, make sure you NEVER EVER touch  
something from the session's default editing context inside your  
performAction method on a long response page. This will autolock  
your session's default editing context, it will not get unlocked,  
because you are outside of the rr loop and the next checkout for  
that session will fail.
This particular deadlock should be fixed as of a couple weeks ago  
after we talked, btw ... I think I rolled autolocking into long  
response, also.


How is that supposed to work? The actual processing is done in the  
extra thread, and if it has locked the supplied EC, any page coming  
in with this session will not run - so if you run with concurrent  
request handling off, the app request handling lock is never returned  
until the task finished and your app is dead in the meantime.  
Otherwise only the long response page is frozen...


Apart from that, when you use D2W it is ridiculously easy to deadlock  
your app when you don't use Wonder because of the pattern of locking/ 
unlocking ECs in awake() and sleep() which doesn't really work on  
reloads.


Cheers, Anjo


Re: Deadlocks

2007-09-06 Thread Mike Schrag

On 06.09.2007 at 00:39, Mike Schrag wrote:

If you see problems in the stacktrace, when the session gets  
checked out from the session store, make sure you NEVER EVER  
touch something from the session's default editing context inside  
your performAction method on a long response page. This will  
autolock your session's default editing context, it will not get  
unlocked, because you are outside of the rr loop and the next  
checkout for that session will fail.
This particular deadlock should be fixed as of a couple weeks ago  
after we talked, btw ... I think I rolled autolocking into long  
response, also.


How is that supposed to work? The actual processing is done in the  
extra thread, and if it has locked the supplied EC, any page coming  
in with this session will not run - so if you run with concurrent  
request handling off, the app request handling lock is never  
returned until the task finished and your app is dead in the  
meantime. Otherwise only the long response page is frozen...
There was a bug with coalesced autolocks where it would coalesce  
outside of the RR-loop, which meant that it would leave a lock open  
on purpose.  This would explode like you are describing in the long  
response thread because it would keep the lock on. There is a fix  
for this whereby it reverts back to just plain-jane autolocking on  
each call vs locking for the entire thread.  This is still wrong  
because you usually want a longer lock than just on each call.  The  
proper way to do it is to local instance the object into another EC  
and lock THAT for the long response.  This still requires manually  
locking/unlocking to get a lock span across multiple calls, but even  
if you don't, Wonder will at least autolock each individual call for  
you and prevent MOST terrible things.
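
To make the "lock span" point concrete, here is a rough sketch of per-call autolocking versus one manual lock held across several calls. AutoLockingStub is a hypothetical stand-in built on ReentrantLock, not ERXEC; it only assumes that each call takes and releases the lock on its own and that the lock is reentrant, so a manual lock() around the calls nests cleanly. The "local instance into another EC" step is not modeled here.

```java
import java.util.concurrent.locks.ReentrantLock;

// Hypothetical stand-in for an autolocking context: every call locks and
// unlocks by itself. This is NOT er.extensions.eof.ERXEC.
class AutoLockingStub {
    private final ReentrantLock lock = new ReentrantLock();
    private int value;

    void lock() { lock.lock(); }
    void unlock() { lock.unlock(); }

    int read() {
        lock.lock();
        try { return value; } finally { lock.unlock(); }
    }

    void write(int newValue) {
        lock.lock();
        try { value = newValue; } finally { lock.unlock(); }
    }
}

class LockSpanSketch {
    // Each worker does read-then-write increments. With spanLock == true the
    // manual lock covers both calls; otherwise only the per-call locks apply.
    static int runIncrements(int threads, int perThread, boolean spanLock) {
        AutoLockingStub ec = new AutoLockingStub();
        Thread[] workers = new Thread[threads];
        for (int t = 0; t < threads; t++) {
            workers[t] = new Thread(() -> {
                for (int i = 0; i < perThread; i++) {
                    if (spanLock) ec.lock();
                    try {
                        ec.write(ec.read() + 1); // two separately autolocked calls
                    } finally {
                        if (spanLock) ec.unlock();
                    }
                }
            });
            workers[t].start();
        }
        for (Thread worker : workers) {
            try {
                worker.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
        }
        return ec.read();
    }

    public static void main(String[] args) {
        System.out.println("manual span: " + runIncrements(4, 1000, true));
        System.out.println("per-call only: " + runIncrements(4, 1000, false));
    }
}
```

With the manual span the total always comes out as threads * perThread. With per-call locking only, each individual call is safe, but another thread can slip in between the read and the write, so the total may come out lower.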


Then there's the variation of long response (I'd have to look up  
exactly what this is called -- i think there is the normal long  
response and then another one that does this style) that will act  
like the long response thread is in a RR loop and support coalesced  
locking also.  I can't remember which of this stuff rolled into the  
main long response class offhand, but this also supports cleaning  
up left-open-locks just like in a RR loop.  There's also a Thread/ 
Runnable variant that does this same thing, so you can use the ERX  
version of the runnable and it will always clean up for you ... Just  
some safety nets.




Re: Deadlocks

2007-09-06 Thread Cornelius Jaeger


On 06.09.2007, at 09:04, Mike Schrag wrote:



Then there's the variation of long response (I'd have to look up  
exactly what this is called -- i think there is the normal long  
response and then another one that does this style) that will act  
like the long response thread is in a RR loop and support coalesced  
locking also.  I can't remember which of this stuff rolled into the  
main long response class offhand, but this also supports cleaning  
up left-open-locks just like in a RR loop.  There's also a Thread/ 
Runnable variant that does this same thing, so you can use the ERX  
version of the runnable and it will always clean up for you ...  
Just some safety nets.


ERX**Runnable, which class is that?
It's time for a wiki page on multithreaded EOF. I'm working through  
this now; when I get it sussed I'll try to write it up.





Re: Deadlocks

2007-09-06 Thread Chuck Hill


On Sep 5, 2007, at 6:47 PM, Matthew W. Taylor wrote:



If virtue
can't be mine alone
at least my faults
can be my own.
- Piet Hein


:-)


Deadlocks in WebObjects have always been my own fault.  I'm thankful for
Andrew Lindesay's HOWTO.  It's a super easy process -- and saves my bacon
from my own carelessness.


From: Steven Mark McCraw [EMAIL PROTECTED]
Date: Wed, 5 Sep 2007 20:22:42 -0400
To: Chuck Hill [EMAIL PROTECTED]
Cc: WebObjects (Group) webobjects-dev@lists.apple.com
Subject: Re: Deadlocks


Is it easy?  Or is that just the nature of the concurrency beast?


I consider it easy because I've had to deal with it so many times.  I
think WebObjects seems worse than a normal multithreaded app because
things you do that are totally unrelated to concurrency from the
programmer's point of view can cause you to deadlock.


Deadlocks sure are frustrating. In my opinion WO locks are only more
noticeable than in other web app environments because, in classical WO
programming, you've only got one channel to the DB per application. Lock
that up -- and the rest of the app is toast.


That is usually more of a scarce resource contention problem than  
true deadlocking - unless you take out a lock on say a DB context and  
don't unlock it.  But that is a good point and something that people  
stumble over.  I don't know if there is a practical fix for this.   
Making EOF truly multi-threaded would be a daunting task.




Other programming environments
might suffer less by having more avenues to the data.  You might be equally
guilty of poor practice in those environments but possibly not even notice
it. Instead you reboot your app when it slows to a crawl, or runs out of
descriptors, blaming it on Java.


Grin.



It's nice that so
much of the concurrency-handling misery you would ordinarily have to
think about with multithreaded applications is hidden from you, but
when it goes wrong, it is the height of confusion.


Well, when we see that copy of open-source WO, Real Soon Now (tm), we can
all say "what confusion?"


I trust you have not been holding your breath.  ;-)

Chuck


--

Practical WebObjects - for developers who want to increase their  
overall knowledge of WebObjects or who are trying to solve specific  
problems.

http://www.global-village.net/products/practical_webobjects







Deadlocks

2007-09-05 Thread Simon McLean

Hi -

We're experiencing some pretty bad deadlock issues at the moment and  
I'm pretty convinced it's down to EC lock abuse.


Can anyone confirm that if we follow these rules:

** If you do want all that Wonder magic and love:
1) extend ERXApplication
2) extend ERXSession
3) use ERXEC.newEditingContext() instead of new EOEditingContext()
4) Add to Properties:
er.extensions.ERXApplication.useEditingContextUnlocker=true
er.extensions.ERXEC.defaultAutomaticLockUnlock=true
er.extensions.ERXEC.useSharedEditingContext=false
er.extensions.ERXEC.defaultCoalesceAutoLocks=true

... we should never have to manually lock or unlock an EC ? Or put  
another way, when using these rules is there any situation that we  
would have to call ec.lock() or ec.unlock() in our code ?


Thanks, Simon

Re: Deadlocks

2007-09-05 Thread Guido Neitzer

On 05.09.2007, at 16:01, Simon McLean wrote:

We're experiencing some pretty bad deadlock issues at the moment  
and I'm pretty convinced it's down to EC lock abuse.


Get a stacktrace of your running application to verify that:

http://tinyurl.com/3bpkkv


... we should never have to manually lock or unlock an EC ?


That is true, yes - but you still might run into problems if you do  
bad things.


Or put another way, when using these rules is there any situation  
that we would have to call ec.lock() or ec.unlock() in our code ?


I normally lock and unlock manually on long response pages / tasks,  
as the unlocking of editing contexts relies on the request response  
loop.


If you see problems in the stacktrace, when the session gets checked  
out from the session store, make sure you NEVER EVER touch something  
from the session's default editing context inside your  
performAction method on a long response page. This will autolock  
your session's default editing context, it will not get unlocked,  
because you are outside of the rr loop and the next checkout for that  
session will fail.
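
For illustration only, a small sketch of that failure mode using plain java.util.concurrent rather than WebObjects. The ReentrantLock named sessionContextLock is a hypothetical stand-in for the lock on the session's default editing context, the worker thread plays the long response task, and the tryLock call plays the next request trying to check the session out.

```java
import java.util.concurrent.TimeUnit;
import java.util.concurrent.locks.ReentrantLock;

class LeakedLockSketch {
    // Returns true if the "next request" can take the lock within timeoutMillis
    // after the worker thread has finished.
    static boolean nextCheckoutSucceeds(boolean workerUnlocks, long timeoutMillis) {
        // Stand-in for the session's default editing context lock (assumption).
        ReentrantLock sessionContextLock = new ReentrantLock();

        Thread worker = new Thread(() -> {
            sessionContextLock.lock();
            if (workerUnlocks) {
                sessionContextLock.unlock();
            }
            // Otherwise the thread ends while still owning the lock,
            // and nothing will ever release it.
        });
        worker.start();
        try {
            worker.join();
            boolean acquired = sessionContextLock.tryLock(timeoutMillis, TimeUnit.MILLISECONDS);
            if (acquired) {
                sessionContextLock.unlock();
            }
            return acquired;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("worker unlocks: " + nextCheckoutSucceeds(true, 200));
        System.out.println("worker leaks lock: " + nextCheckoutSucceeds(false, 200));
    }
}
```

The second case never gets the lock, however long the timeout. A real request would typically block without a timeout, which is what shows up as a hung session in the stack trace.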


The other thing I saw with deadlocks: if you run out of space on your  
server, log4j might deadlock.


cug


Re: Deadlocks

2007-09-05 Thread Mike Schrag
> If you see problems in the stacktrace, when the session gets
> checked out from the session store, make sure you NEVER EVER touch
> something from the session's default editing context inside your
> performAction method on a long response page. This will autolock
> your session's default editing context, it will not get unlocked,
> because you are outside of the rr loop and the next checkout for
> that session will fail.

This particular deadlock should be fixed as of a couple weeks ago  
after we talked, btw ... I think I rolled autolocking into long  
response, also.  But generally speaking, if you're in another thread,  
I would lock manually to be safe.


Also, if you ever access an EODatabaseContext directly, you MUST lock  
that yourself.  It will not autolock, and that will cause terrible  
problems.


ms



Re: Deadlocks

2007-09-05 Thread Chuck Hill


On Sep 5, 2007, at 3:01 PM, Simon McLean wrote:


> Hi -
>
> We're experiencing some pretty bad deadlock issues at the moment
> and I'm pretty convinced it's down to EC lock abuse.


What makes you think that?  As Guido indicated, if you don't have  
stack traces you are just guessing.  Guessing is not an effective  
form of debugging.  :-)




> Can anyone confirm that if we follow these rules:
>
> ** If you do want all that Wonder magic and love:
> 1) extend ERXApplication
> 2) extend ERXSession
> 3) use ERXEC.newEditingContext() instead of new EOEditingContext()
> 4) Add to Properties:
> er.extensions.ERXApplication.useEditingContextUnlocker=true
> er.extensions.ERXEC.defaultAutomaticLockUnlock=true
> er.extensions.ERXEC.useSharedEditingContext=false
> er.extensions.ERXEC.defaultCoalesceAutoLocks=true
>
> ... we should never have to manually lock or unlock an EC ? Or put
> another way, when using these rules is there any situation that we
> would have to call ec.lock() or ec.unlock() in our code ?



You are probably safe for general EC usage there, but you can still  
do other bad things and end up deadlocked.


Chuck


--

Practical WebObjects - for developers who want to increase their  
overall knowledge of WebObjects or who are trying to solve specific  
problems.

http://www.global-village.net/products/practical_webobjects







Re: Deadlocks

2007-09-05 Thread Steven Mark McCraw
> You are probably safe for general EC usage there, but you can still
> do other bad things and end up deadlocked.


There are many great things in the win column for WebObjects, but I
believe one of the definite negatives of the technology is how
ridiculously easy it is to deadlock a WebObjects application.  You
have to take the bad with the good.  This is miserably scary and
nasty until you learn to dump the thread stack traces (see the URL
Guido posted earlier:  http://tinyurl.com/3bpkkv).  Learning the
tricks shown there cost me a week of sleep once, but now it's
beautifully documented, so profit from the work people have done to
write up these instructions.  Once you have the stack traces in
hand, it becomes pretty obvious where the problem is and you can fix
it.  Look for the thread which isn't stuck in a wait queue or
sleeping while waiting for requests.
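
As a rough illustration of what "dumping the thread stack traces" can look like from inside the JVM, here is a minimal sketch using the standard java.lang.management API. It is not the procedure from the linked instructions, and the class and method names (ThreadDumpSketch, dumpAllThreads, deadlockedThreadIds) are made up for this example. Note that the JVM's own detector only reports cyclic deadlocks; a lock that was simply never released will not be flagged, so reading the full dump is still the main tool.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadInfo;
import java.lang.management.ThreadMXBean;

public class ThreadDumpSketch {

    // Builds a plain-text dump of every live thread: name, state,
    // the lock it is waiting on (if any), and its stack frames.
    static String dumpAllThreads() {
        ThreadMXBean mx = ManagementFactory.getThreadMXBean();
        StringBuilder out = new StringBuilder();
        for (ThreadInfo info : mx.dumpAllThreads(false, false)) {
            out.append('"').append(info.getThreadName()).append("\" ")
               .append(info.getThreadState());
            if (info.getLockName() != null) {
                out.append(" waiting on ").append(info.getLockName());
            }
            if (info.getLockOwnerName() != null) {
                out.append(" owned by \"").append(info.getLockOwnerName()).append('"');
            }
            out.append('\n');
            for (StackTraceElement frame : info.getStackTrace()) {
                out.append("    at ").append(frame).append('\n');
            }
            out.append('\n');
        }
        return out.toString();
    }

    // Ids of threads the JVM reports as being in a deadlock cycle;
    // an empty array if it finds none.
    static long[] deadlockedThreadIds() {
        long[] ids = ManagementFactory.getThreadMXBean().findDeadlockedThreads();
        return ids == null ? new long[0] : ids;
    }

    public static void main(String[] args) {
        System.out.print(dumpAllThreads());
        System.out.println("deadlocked threads reported by JVM: "
                + deadlockedThreadIds().length);
    }
}
```

In the dump, threads in WAITING or TIMED_WAITING on a request queue are usually the uninteresting ones; the thread that is doing something else while others wait on it is the one to look at.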


Mark


Re: Deadlocks

2007-09-05 Thread Chuck Hill


On Sep 5, 2007, at 4:12 PM, Steven Mark McCraw wrote:

>> You are probably safe for general EC usage there, but you can
>> still do other bad things and end up deadlocked.


> There are many great things in the win column for WebObjects, but
> I believe one of the definite negatives of the technology is how
> ridiculously easy it is to deadlock a webobjects application.


Is it easy?  Or is that just the nature of the concurrency beast?   
The only ways to cause deadlock that I can think of are (a) improper  
exception handling related to releasing locks and (b) unbalanced lock  
usage.  I'd expect those to cause problems in any multi-threaded,  
concurrent environment.  There _were_ some issues in this area in  
prior versions.  AFAIK, these are fixed.
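
Causes (a) and (b) above can be shown in a few lines. This is a minimal sketch that uses java.util.concurrent.locks.ReentrantLock as a stand-in for an editing context lock; it does not use EOEditingContext or any WebObjects class, and all names in it (BalancedLockSketch, leaky, balanced, holdsLeftAfterFailure) are invented for illustration.

```java
import java.util.concurrent.locks.ReentrantLock;
import java.util.function.Consumer;

public class BalancedLockSketch {

    // Stand-in for an editing context's lock (assumption: a plain ReentrantLock).
    static final ReentrantLock ecLock = new ReentrantLock();

    // Unbalanced: if work throws, unlock() is skipped and the lock stays held.
    static void leaky(Runnable work) {
        ecLock.lock();
        work.run();
        ecLock.unlock();
    }

    // Balanced: unlock() runs in finally whether or not work throws.
    static void balanced(Runnable work) {
        ecLock.lock();
        try {
            work.run();
        } finally {
            ecLock.unlock();
        }
    }

    // Runs a locking strategy with work that always fails and returns how many
    // holds the current thread still has afterwards (0 means balanced).
    static int holdsLeftAfterFailure(Consumer<Runnable> strategy) {
        int before = ecLock.getHoldCount();
        try {
            strategy.accept(() -> { throw new IllegalStateException("simulated failure"); });
        } catch (IllegalStateException expected) {
            // ignored: only the lock state matters here
        }
        int leaked = ecLock.getHoldCount() - before;
        // Release any leaked holds so the sketch can be run repeatedly.
        for (int i = 0; i < leaked; i++) {
            ecLock.unlock();
        }
        return leaked;
    }

    public static void main(String[] args) {
        System.out.println("leaky:    " + holdsLeftAfterFailure(BalancedLockSketch::leaky));
        System.out.println("balanced: " + holdsLeftAfterFailure(BalancedLockSketch::balanced));
    }
}
```

With the leaky variant one hold survives the exception, so any other thread that later asks for the same lock waits forever; with the try...finally variant the count returns to zero.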



The one thing I can think of that WO could have added is some
try...catch or try...finally blocks in WOSession.  These could, if
present, cover the cases where the developer does not properly handle
the exceptions that happen in their own code.


Can you think of anything else that could be done?


Chuck


> You have to take the bad with the good.  This is miserably scary
> and nasty until you learn to dump the thread stack traces (see the
> URL Guido posted earlier:  http://tinyurl.com/3bpkkv.  Learning the
> tricks shown here cost me a week of sleep once, but now it's
> beautifully documented, so profit from the work people have done to
> write up these instructions).  Once you have the stack traces in
> hand, it becomes pretty obvious where the problem is and you can
> fix it.  Look for the thread which isn't stuck in a wait queue or
> sleeping while waiting for requests.
>
>
> Mark





Re: Deadlocks

2007-09-05 Thread Matthew W. Taylor

If virtue
can't be mine alone
at least my faults
can be my own.
- Piet Hein

Deadlocks in WebObjects have always been my own fault.  I'm thankful to
Andrew Lindesay's HOWTO.  It's a super easy process -- and saves my bacon
from my own carelessness.

> From: Steven Mark McCraw [EMAIL PROTECTED]
> Date: Wed, 5 Sep 2007 20:22:42 -0400
> To: Chuck Hill [EMAIL PROTECTED]
> Cc: WebObjects (Group) webobjects-dev@lists.apple.com
> Subject: Re: Deadlocks
>
>> Is it easy?  Or is that just the nature of the concurrency beast?
>
> I consider it easy because I've had to deal with it so many times.  I
> think WebObjects seems worse than a normal multithreaded app because
> things you do that are totally unrelated to concurrency from the
> programmer's point of view can cause you deadlock.

Deadlocks sure are frustrating. In my opinion WO locks are only more
noticeable than in other web app environments because, in classical WO
programming, you've only got one channel to the DB per application. Lock
that up -- and the rest of the app is toast.  Other programming environments
might suffer less by having more avenues to the data.  You might be equally
guilty of poor practice in those environments but possibly not even notice
it. Instead you reboot your app when it slows to a crawl, or runs out of
descriptors, blaming it on Java.

> It's nice that so
> much of the concurrency-handling misery you would ordinarily have to
> think about with multithreaded applications is hidden from you, but
> when it goes wrong, it is the height of confusion.

Well, when we see that copy of open-source WO, Real Soon Now (tm), we can
all say "what confusion?"
 
-=- matt

Matthew Taylor
Northwestern University



OPENBASE and Deadlocks

2007-01-17 Thread Andrew Lindesay

Hello;

I'm not sure if I mentioned this before, but one of my projects was  
having deadlock problems with high-volume writes out of a WOA into  
OPENBASE.  I developed a subclassed adaptor for this -- so if anybody  
is interested in this, I put something in the wiki about it...


	http://en.wikibooks.org/wiki/Programming:WebObjects/Database_Compatibility_and_Comparisons/OpenBase#Deadlocks


cheers.

___
Andrew Lindesay
www.lindesay.co.nz





Re: OPENBASE and Deadlocks

2007-01-17 Thread Scott Keith

Hi Andrew,

I think I understand now.  So I think you are saying that OpenBase is  
detecting a deadlock and aborting a transaction, right? The solution  
you have posted on this page will work fine.  You could also resave  
the transaction.
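
The "resave" option can be pictured as a small retry loop around the save. The following is only a sketch under stated assumptions: it is not the adaptor subclass from the wiki page and does not use any OpenBase or WebObjects API. The save is modeled as a Supplier, the test for "this was a deadlock abort" is a caller-supplied predicate, and the names (RetrySaveSketch, retry) are hypothetical. Whatever an editing context needs between attempts is left out.

```java
import java.util.function.Predicate;
import java.util.function.Supplier;

public class RetrySaveSketch {

    // Calls save up to maxAttempts times. An exception that the predicate
    // recognizes as a deadlock abort triggers another attempt; any other
    // exception is rethrown immediately. If every attempt is aborted, the
    // last abort exception is rethrown.
    static <T> T retry(Supplier<T> save,
                       Predicate<RuntimeException> isDeadlockAbort,
                       int maxAttempts) {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be >= 1");
        }
        RuntimeException last = null;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                return save.get();
            } catch (RuntimeException e) {
                if (!isDeadlockAbort.test(e)) {
                    throw e;
                }
                last = e;
            }
        }
        throw last;
    }

    public static void main(String[] args) {
        int[] calls = {0};
        // Simulated save: aborted twice, then succeeds.
        String result = retry(() -> {
            if (++calls[0] < 3) {
                throw new IllegalStateException("deadlock: transaction aborted");
            }
            return "saved";
        }, e -> e.getMessage() != null && e.getMessage().startsWith("deadlock"), 5);
        System.out.println(result + " after " + calls[0] + " attempts");
    }
}
```

A bounded attempt count matters here: if the abort keeps recurring, the caller should see the failure rather than loop forever.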


I've made a small edit to the wiki to make it more clear that we are  
talking about the problem of aborting transactions to avoid deadlocks  
rather than a deadlocked server.  Please let me know if I have  
misunderstood.  Thanks.


Best regards,

Scott Keith
OpenBase




On Jan 17, 2007, at 11:31 PM, Andrew Lindesay wrote:


> Hello;
>
> I'm not sure if I mentioned this before, but one of my projects was
> having deadlock problems with high-volume writes out of a WOA into
> OPENBASE.  I developed a subclassed adaptor for this -- so if
> anybody is interested in this, I put something in the wiki about it...
>
>
> 	http://en.wikibooks.org/wiki/Programming:WebObjects/Database_Compatibility_and_Comparisons/OpenBase#Deadlocks
>
>
> cheers.
>
> ___
> Andrew Lindesay
> www.lindesay.co.nz







RE: OPENBASE and Deadlocks - follow-up

2007-01-17 Thread Andrew Lindesay

Hello again;

> I'm not sure if I mentioned this before, but one of my projects was
> having deadlock problems with high-volume writes out of a WOA into
> OPENBASE.  I developed a subclassed adaptor for this -- so if
> anybody is interested in this, I put something in the wiki about it...


I just want to follow up this post by adding that the behavior
exhibited by the OPENBASE product is 100% correct -- this is not an
issue with the database server.  What I posted is a client-side
solution that integrates with WebObjects applications specifically
to handle this issue.


cheers.

___
Andrew Lindesay
www.lindesay.co.nz


