Maybe you can get the admin to dump that into a file for you before they kill 
<pid>? 
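
If nobody can run jstack before the kill, another option is to have the app write a dump into its own log. Here is a minimal sketch using the standard java.lang.management API. The class name ThreadDumpSketch is made up, and calling it from some watchdog thread that still runs while the request threads hang is an assumption about your setup. A lock that was taken and never released is not a deadlock cycle, so the deadlock count may well be 0; the per-thread "waiting on" lines are the useful part.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadInfo;
import java.lang.management.ThreadMXBean;

public class ThreadDumpSketch {

    /** Builds a text dump of all live threads: state, lock waited on, and stack. */
    public static String dumpAllThreads() {
        ThreadMXBean mx = ManagementFactory.getThreadMXBean();
        // Ask for monitor/synchronizer details only if this JVM supports them.
        boolean monitors = mx.isObjectMonitorUsageSupported();
        boolean synchronizers = mx.isSynchronizerUsageSupported();
        StringBuilder sb = new StringBuilder();
        for (ThreadInfo info : mx.dumpAllThreads(monitors, synchronizers)) {
            sb.append('"').append(info.getThreadName()).append("\" ")
              .append(info.getThreadState());
            if (info.getLockName() != null) {
                sb.append(" waiting on ").append(info.getLockName());
            }
            if (info.getLockOwnerName() != null) {
                sb.append(" held by \"").append(info.getLockOwnerName()).append('"');
            }
            sb.append('\n');
            for (StackTraceElement frame : info.getStackTrace()) {
                sb.append("    at ").append(frame).append('\n');
            }
        }
        // Cycle detection; returns null when no deadlocked threads are found.
        long[] deadlocked = synchronizers
                ? mx.findDeadlockedThreads()
                : mx.findMonitorDeadlockedThreads();
        sb.append("deadlocked threads: ")
          .append(deadlocked == null ? 0 : deadlocked.length).append('\n');
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.print(dumpAllThreads());
    }
}
```

Logging the returned string when a R/R loop has been running for too long would at least show which thread is waiting on which lock.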

EOF has a global lock on the OSC. It looks like some other thread locked the 
OSC and never unlocked it. If you are manually locking and unlocking an EC, 
you should always unlock in a finally block:

ec.lock();
try {
    // do stuff
} finally {
    ec.unlock();
}

But you probably shouldn’t be locking manually much if you are using an 
auto-locking EC like ERXEC.
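
To show why the finally matters, here is a self-contained sketch. It uses java.util.concurrent.locks.ReentrantLock purely as a stand-in for the EC lock, so none of the names below are EOF or Wonder API; they are made up for illustration. The work throws, yet the lock is free afterwards because the unlock sits in the finally block.

```java
import java.util.concurrent.locks.ReentrantLock;

public class LockPatternSketch {

    // Stand-in for the editing context's lock; NOT the EOF/Wonder API.
    static final ReentrantLock lock = new ReentrantLock();

    /** Runs the work under the lock and always releases it, even if the work throws. */
    static void withLock(Runnable work) {
        lock.lock();
        try {
            work.run();
        } finally {
            lock.unlock();
        }
    }

    /** Returns true if the lock is free again after a failing unit of work. */
    public static boolean lockReleasedAfterFailure() {
        try {
            withLock(() -> { throw new IllegalStateException("simulated failure"); });
        } catch (IllegalStateException expected) {
            // The failure propagates, but the finally block has already unlocked.
        }
        return !lock.isLocked();
    }

    public static void main(String[] args) {
        System.out.println("lock released after failure: " + lockReleasedAfterFailure());
    }
}
```

Without the try/finally, the exception would leave the lock held and every later caller of lock() would wait forever, which is the kind of hang described in this thread.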

On Jan 27, 2015, at 2:58 PM, OC <o...@ocs.cz> wrote:

> Alas, it runs on a production server to which I have no access -- I only get 
> the logs :(
> 
> Thanks,
> OC
> 
> On 27. 1. 2015, at 22:41, Ramsey Gurley <rgur...@smarthealth.com> wrote:
> 
>> If the instance is hung, I’d start with
>> 
>> jstack -F <pid>
>> 
>> That should give a stack trace on whatever deadlocked it.
>> 
>> On Jan 27, 2015, at 2:33 PM, OC <o...@ocs.cz> wrote:
>> 
>>> Hello there,
>>> 
>>> another weird case: this application is run single-instance (but 
>>> ERXObjectStoreCoordinatorPool.maxCoordinators=3, 
>>> WOAllowsConcurrentRequestHandling=true).
>>> 
>>> Yesterday after 14:10, users reported “No instance” or “Application not 
>>> found” WO reports. I log all R/R loops in Application.awake and 
>>> Application.sleep; indeed, at 14:10 a R/R loop started and never ended. 
>>> Almost nothing else happened until the app was restarted by the 
>>> administrator at 14:20. (I am told JavaMonitor was not able to stop the 
>>> instance normally, and it had to be killed/force quit.)
>>> 
>>> The single thing which did happen looks like this in my log:
>>> 
>>> ===
>>> //////////////////////////////////////////////////////////////////////////////////////////
>>> ////// R/R loop #2218 WorkerThread118 started at 14:10:05 26.1.
>>> //////////////////////////////////////////////////////////////////////////////////////////
>>> DA: reading CZ banner image for market 1000001...
>>> 14:20:39.844 WARN  Force Quit received. Exiting now...       //log:NSLog [Thread-3]
>>> APPLICATION SHUTDOWN SEQUENCE COMPLETE
>>> ===
>>> 
>>> The “reading banner” log comes from a direct action:
>>> 
>>> ===
>>>  WOActionResults bannerAction {
>>>      def mpk=request().formValueForKey('mkpk'),lang=request().formValueForKey('lang')
>>>      println "DA: reading $lang banner image for market $mpk..."
>>>      if (!mpk || !lang) return null
>>>      ERXEC ec=ERXEC.newEditingContext()
>>>      DBMarket market=EOUtilities.objectWithPrimaryKeyValue(ec,'DBMarket',mpk as Integer)
>>>      println "DA: ... $market" //*
>>>      if (!market) return null
>>>      def mime=market."marketBannerMIME$lang",data=market."marketBannerData$lang"
>>>      println "DA: ... mime '$mime' data $data.length B"
>>>      WOResponse wor=new WOResponse()
>>>      wor.setHeader(mime,"content-type")
>>>      ...
>>>      wor.setContent(data)
>>>      wor
>>>  }
>>> ===
>>> 
>>> Note that the //* log is not present, which suggests that 
>>> EOUtilities.objectWithPrimaryKeyValue blocked somehow?!? No other R/R loop 
>>> starts either, which suggests that not only did WorkerThread118 block, but 
>>> the instance stopped accepting requests altogether.
>>> 
>>> The FrontBase log (which I regret to say I do not fully understand, namely, 
>>> those “--N lines, of which I get just the timestamp”) looks like this:
>>> 
>>> ===
>>> ...
>>> --6 0x1039f41d8 0x1039f44b0 2015-01-26 14:10:05.53996 "1 0"
>>> --D 0x1039f41d8 0x1039f44b0 2015-01-26 14:10:05.55851 752 48429
>>> SELECT t0."C_AUCTION_AMOUNT_STEP", t0."C_AUCTION_AOFFALL", 
>>> t0."C_AUCTION_MAX_AMOUNT", t0."C_MAX_LENGHT", t0."C_AUCTION_MIN_AMOUNT", 
>>> t0."C_AUCTION_CHDELAY", t0."C_MIN_LENGTH", t0."C_AUCTION_MINSTART", 
>>> t0."C_AUCTION_NEXT_SEQ", t0."C_AUCTION_CONCUR", t0."C_CREATION_DATE", 
>>> t0."C_CREATOR_ID", t0."C_FIELD_IDENTIFIERS_FOR_EDITOR", 
>>> t0."C_FIELD_IDENTIFIERS_FOR_FILTER", t0."C_FIELD_IDENTIFIERS_FOR_LIST", 
>>> t0."C_FIELD_IDENTIFIERS_FOR_OFFER", t0."C_FIELD_IDENTIFIERS_FOR_PUBLIC", 
>>> t0."C_FORM_TEMPLATE_ID", t0."C_MARKET_BANNER_CZ_DATA", 
>>> t0."C_MARKET_BANNER_EN_DATA", t0."C_MARKET_BANNER_CZ_MIME", 
>>> t0."C_MARKET_BANNER_EN_MIME", t0."C_SUPPORTS_OFFERS", t0."C_SHORTCUT", 
>>> t0."C_TITLE", t0."C_UID", t0."C_WIDTHS_FOR_LIST" FROM "T_MARKET" t0 WHERE 
>>> t0."C_UID" = 1000001;
>>> --6 0x1039f41d8 0x1039f44b0 2015-01-26 14:10:05.55903 "1 0"
>>> --7 0x1039f41d8 0x1039f44b0 2015-01-26 14:10:05.55905 ""
>>> --8 0x1039f41d8 0x1039f44b0 2015-01-26 14:10:05.55907 "1"
>>> --2 0x1039f41d8 0x1039f44b0 2015-01-26 14:10:05.55976 7 48430
>>> commit;
>>> --6 0x1039f41d8 0x1039f44b0 2015-01-26 14:10:05.55984 "1 0"
>>> --3 0x1040841d8 0x1040844b0 2015-01-26 14:20:40.50016
>>> ...
>>> ===
>>> 
>>> If anybody can see what on earth might have happened, I'd be grateful for 
>>> any advice. Myself, I can only see that the app locked up _somehow_ and 
>>> stopped working entirely, but I have no idea why or how to prevent that...
>>> 
>>> Note: the direct action worked flawlessly (and is in the logs) in every 
>>> previous R/R loop of the same instance, more than 2000 times.
>>> 
>>> Thanks,
>>> OC
>>> 
>>> 
>>> _______________________________________________
>>> Do not post admin requests to the list. They will be ignored.
>>> Webobjects-dev mailing list      (Webobjects-dev@lists.apple.com)
>>> Help/Unsubscribe/Update your Subscription:
>>> https://lists.apple.com/mailman/options/webobjects-dev/rgurley%40smarthealth.com
>>> 
>>> This email sent to rgur...@smarthealth.com
>> 
> 

