Ramsey,

thanks a lot!

> Maybe you can get the admin to dump that into a file for you before they kill 
> <pid>? 

I'll try :)
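
If the admin cannot run jstack in time, the instance could also write a
comparable dump into its own log, for example from a watchdog thread that
notices an R/R loop running far too long. A minimal sketch using only JDK
management APIs; the class name ThreadDumper is hypothetical, and the JVM's
deadlock detection may well not recognise WO's own lock classes, so the
per-thread stack traces are the part to rely on:

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadInfo;
import java.lang.management.ThreadMXBean;

// Hypothetical helper, not part of WebObjects or Wonder.
final class ThreadDumper {
    private ThreadDumper() {}

    /** Returns a stack trace for every live thread, plus a count of JVM-detected deadlocks. */
    static String dump() {
        ThreadMXBean mx = ManagementFactory.getThreadMXBean();
        StringBuilder sb = new StringBuilder();
        // false, false: skip monitor/synchronizer details so this works on any JVM
        for (ThreadInfo info : mx.dumpAllThreads(false, false)) {
            sb.append('"').append(info.getThreadName()).append("\" ")
              .append(info.getThreadState()).append('\n');
            for (StackTraceElement frame : info.getStackTrace()) {
                sb.append("    at ").append(frame).append('\n');
            }
        }
        // Both calls return null when the JVM detects no deadlock.
        long[] deadlocked = mx.isSynchronizerUsageSupported()
                ? mx.findDeadlockedThreads()
                : mx.findMonitorDeadlockedThreads();
        sb.append("deadlocked threads: ")
          .append(deadlocked == null ? 0 : deadlocked.length).append('\n');
        return sb.toString();
    }
}
```

Logging the result of ThreadDumper.dump() would at least show where
WorkerThread118 was waiting, even if the instance has to be force quit
afterwards.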

> EOF has a global lock on the OSC. Looks like some other thread locked the OSC 
> and didn’t unlock it. If you are manually locking and unlocking an EC, you 
> should always...

But if the OSCs (all three) were locked, it would block _before_ the fetch and
the FrontBase log would not contain the corresponding SELECT, or am I wrong?

(Anyway, I do that in one specific place -- see e.g. the "locking with more 
coordinators causes an exception?" thread of the 24th. I log both locks and 
unlocks, too, and as far as the log goes, no OSC ever stays locked. 
Nevertheless, I did bump into one weird thing when I scanned the logs -- will 
write under another subject in a moment!)

> But you probably shouldn’t be manually locking much if you are using an auto 
> locking ec like ERXEC.

I did lock EOs, which led to problems (again, please see "locking with more 
coordinators causes an exception?"). Now I do not; I lock the OSC in just one 
specific case, where it is enormously important to store sequentially.
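
For reference, the shape such a manual lock would usually take is lock, then
try, then unlock in finally, so that an exception during the save cannot
leave the coordinator locked. A minimal sketch, assuming a plain
java.util.concurrent ReentrantLock as a stand-in for the OSC lock; the class
SequentialStore and its methods are hypothetical and not taken from the
application:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.locks.ReentrantLock;

// Hypothetical stand-in: the ReentrantLock plays the role of the OSC lock.
final class SequentialStore {
    private final ReentrantLock lock = new ReentrantLock();
    private final List<String> saved = new ArrayList<>();

    /** Saves one record while holding the lock; always releases it. */
    void save(String record) {
        lock.lock();
        try {
            if (record == null) {
                throw new IllegalArgumentException("nothing to save");
            }
            saved.add(record);
        } finally {
            lock.unlock();  // runs on normal return and on exception alike
        }
    }

    boolean isLocked() { return lock.isLocked(); }

    int savedCount() { return saved.size(); }
}
```

With this shape, a failed save still releases the lock, so later threads are
not blocked by it.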

Thanks a lot,
OC

> On Jan 27, 2015, at 2:58 PM, OC <o...@ocs.cz> wrote:
> 
>> Alas, it runs on a production server to which I have no access -- I am just 
>> getting the logs :(
>> 
>> Thanks,
>> OC
>> 
>> On 27. 1. 2015, at 22:41, Ramsey Gurley <rgur...@smarthealth.com> wrote:
>> 
>>> If the instance is hung, I’d start with
>>> 
>>> jstack -F <pid>
>>> 
>>> That should give a stack trace on whatever deadlocked it.
>>> 
>>> On Jan 27, 2015, at 2:33 PM, OC <o...@ocs.cz> wrote:
>>> 
>>>> Hello there,
>>>> 
>>>> another weird case: this application is run single-instance (but 
>>>> ERXObjectStoreCoordinatorPool.maxCoordinators=3, 
>>>> WOAllowsConcurrentRequestHandling=true).
>>>> 
>>>> Yesterday after 14:10 users reported WO's “No instance” or “Application 
>>>> not found” errors. Now, I log all R/R loops in Application.awake and 
>>>> Application.sleep; indeed, at 14:10 an R/R loop started and never ended. 
>>>> In fact, almost nothing happened until the app was restarted by the 
>>>> administrator at 14:20. (I am told JavaMonitor was not able to stop 
>>>> the instance normally, and it had to be killed/force quit.)
>>>> 
>>>> The single thing which did happen looks like this in my log:
>>>> 
>>>> ===
>>>> //////////////////////////////////////////////////////////////////////////////////////////
>>>> ////// R/R loop #2218 WorkerThread118 started at 14:10:05 26.1.
>>>> //////////////////////////////////////////////////////////////////////////////////////////
>>>> DA: reading CZ banner image for market 1000001...
>>>> 14:20:39.844 WARN  Force Quit received. Exiting now...       //log:NSLog [Thread-3]
>>>> APPLICATION SHUTDOWN SEQUENCE COMPLETE
>>>> ===
>>>> 
>>>> The “reading banner” log comes from a direct action:
>>>> 
>>>> ===
>>>> WOActionResults bannerAction {
>>>>     def mpk=request().formValueForKey('mkpk'),lang=request().formValueForKey('lang')
>>>>     println "DA: reading $lang banner image for market $mpk..."
>>>>     if (!mpk || !lang) return null
>>>>     ERXEC ec=ERXEC.newEditingContext()
>>>>     DBMarket market=EOUtilities.objectWithPrimaryKeyValue(ec,'DBMarket',mpk as Integer)
>>>>     println "DA: ... $market" //*
>>>>     if (!market) return null
>>>>     def mime=market."marketBannerMIME$lang",data=market."marketBannerData$lang"
>>>>     println "DA: ... mime '$mime' data $data.length B"
>>>>     WOResponse wor=new WOResponse()
>>>>     wor.setHeader(mime,"content-type")
>>>>     ...
>>>>     wor.setContent(data)
>>>>     wor
>>>> }
>>>> ===
>>>> 
>>>> Note the //* log is not present, which suggests that 
>>>> EOUtilities.objectWithPrimaryKeyValue blocked somehow?!? Nor does any 
>>>> other R/R loop start, so it looks like not only did WorkerThread118 
>>>> block, but the instance stopped accepting requests altogether.
>>>> 
>>>> The FrontBase log (which I regret to say I do not fully understand, 
>>>> namely those “--N” lines, of which I get just the timestamp) looks like 
>>>> this:
>>>> 
>>>> ===
>>>> ...
>>>> --6 0x1039f41d8 0x1039f44b0 2015-01-26 14:10:05.53996 "1 0"
>>>> --D 0x1039f41d8 0x1039f44b0 2015-01-26 14:10:05.55851 752 48429
>>>> SELECT t0."C_AUCTION_AMOUNT_STEP", t0."C_AUCTION_AOFFALL", 
>>>> t0."C_AUCTION_MAX_AMOUNT", t0."C_MAX_LENGHT", t0."C_AUCTION_MIN_AMOUNT", 
>>>> t0."C_AUCTION_CHDELAY", t0."C_MIN_LENGTH", t0."C_AUCTION_MINSTART", 
>>>> t0."C_AUCTION_NEXT_SEQ", t0."C_AUCTION_CONCUR", t0."C_CREATION_DATE", 
>>>> t0."C_CREATOR_ID", t0."C_FIELD_IDENTIFIERS_FOR_EDITOR", 
>>>> t0."C_FIELD_IDENTIFIERS_FOR_FILTER", t0."C_FIELD_IDENTIFIERS_FOR_LIST", 
>>>> t0."C_FIELD_IDENTIFIERS_FOR_OFFER", t0."C_FIELD_IDENTIFIERS_FOR_PUBLIC", 
>>>> t0."C_FORM_TEMPLATE_ID", t0."C_MARKET_BANNER_CZ_DATA", 
>>>> t0."C_MARKET_BANNER_EN_DATA", t0."C_MARKET_BANNER_CZ_MIME", 
>>>> t0."C_MARKET_BANNER_EN_MIME", t0."C_SUPPORTS_OFFERS", t0."C_SHORTCUT", 
>>>> t0."C_TITLE", t0."C_UID", t0."C_WIDTHS_FOR_LIST" FROM "T_MARKET" t0 WHERE 
>>>> t0."C_UID" = 1000001;
>>>> --6 0x1039f41d8 0x1039f44b0 2015-01-26 14:10:05.55903 "1 0"
>>>> --7 0x1039f41d8 0x1039f44b0 2015-01-26 14:10:05.55905 ""
>>>> --8 0x1039f41d8 0x1039f44b0 2015-01-26 14:10:05.55907 "1"
>>>> --2 0x1039f41d8 0x1039f44b0 2015-01-26 14:10:05.55976 7 48430
>>>> commit;
>>>> --6 0x1039f41d8 0x1039f44b0 2015-01-26 14:10:05.55984 "1 0"
>>>> --3 0x1040841d8 0x1040844b0 2015-01-26 14:20:40.50016
>>>> ...
>>>> ===
>>>> 
>>>> If anybody can see what on earth might have happened, I'd be grateful for 
>>>> any advice. All I can see is that the app locked up _somehow_ and stopped 
>>>> working altogether, but I have no idea why or how to prevent that...
>>>> 
>>>> Note: the direct action worked flawlessly (and is in the logs) in the same 
>>>> instance in previous R/R loops, more than 2000 times.
>>>> 
>>>> Thanks,
>>>> OC
>>>> 
>>>> 
>>>> _______________________________________________
>>>> Do not post admin requests to the list. They will be ignored.
>>>> Webobjects-dev mailing list      (Webobjects-dev@lists.apple.com)
>>>> Help/Unsubscribe/Update your Subscription:
>>>> https://lists.apple.com/mailman/options/webobjects-dev/rgurley%40smarthealth.com
>>>> 
>>>> This email sent to rgur...@smarthealth.com
>>> 
>> 
> 

