Maybe you can get the admin to dump that into a file for you before they kill <pid>?
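[Editor's note: if running jstack on the production box isn't possible, roughly the same report can be produced from inside the JVM itself via `Thread.getAllStackTraces()` — e.g. written to a file, or returned by a debug-only direct action, for the admin to grab before killing the instance. A minimal sketch in plain Java, with no WebObjects dependency; `StackDump` and `allStacks` are illustrative names, not part of any framework:]

```java
import java.util.Map;

public class StackDump {
    // Build a jstack-like report of every live thread's stack.
    // In a hung app this could be written to a file (or served by a
    // debug direct action) so the stacks survive the force-quit.
    static String allStacks() {
        StringBuilder sb = new StringBuilder();
        for (Map.Entry<Thread, StackTraceElement[]> e : Thread.getAllStackTraces().entrySet()) {
            sb.append('"').append(e.getKey().getName()).append('"')
              .append(" state=").append(e.getKey().getState()).append('\n');
            for (StackTraceElement frame : e.getValue()) {
                sb.append("\tat ").append(frame).append('\n');
            }
            sb.append('\n');
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // The current ("main") thread always appears in the report.
        System.out.println(allStacks().contains("\"main\""));  // prints "true"
    }
}
```

[Unlike `jstack -F`, this only works while the JVM still schedules the reporting thread, so it is a complement to, not a replacement for, an external dump.]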
EOF has a global lock on the OSC. It looks like some other thread locked the OSC and didn't unlock it. If you are manually locking and unlocking an EC, you should always do

	ec.lock();
	try {
		// do stuff
	} finally {
		ec.unlock();
	}

But you probably shouldn't be doing much manual locking if you are using an auto-locking EC like ERXEC.

On Jan 27, 2015, at 2:58 PM, OC <o...@ocs.cz> wrote:

> Alas, it runs on a production server where I have no access -- I just get the logs :(
>
> Thanks,
> OC
>
> On 27. 1. 2015, at 22:41, Ramsey Gurley <rgur...@smarthealth.com> wrote:
>
>> If the instance is hung, I'd start with
>>
>> jstack -F <pid>
>>
>> That should give a stack trace on whatever deadlocked it.
>>
>> On Jan 27, 2015, at 2:33 PM, OC <o...@ocs.cz> wrote:
>>
>>> Hello there,
>>>
>>> another weird case: this application runs single-instance (but with
>>> ERXObjectStoreCoordinatorPool.maxCoordinators=3 and
>>> WOAllowsConcurrentRequestHandling=true).
>>>
>>> Yesterday after 14:10 users reported "No instance" or "Application not
>>> found" WO reports. Now, I log all R/R loops in Application.awake and
>>> Application.sleep; indeed, at 14:10 an R/R loop started and never ended;
>>> in fact, almost nothing happened until the app was restarted by the
>>> administrator at 14:20. (I am told JavaMonitor was not able to stop the
>>> instance normally, and it had to be killed/force-quit.)
>>>
>>> The single thing which did happen looks like this in my log:
>>>
>>> ===
>>> //////////////////////////////////////////////////////////////////////////////////////////
>>> ////// R/R loop #2218 WorkerThread118 started at 14:10:05 26.1.
>>> //////////////////////////////////////////////////////////////////////////////////////////
>>> DA: reading CZ banner image for market 1000001...
>>> 14:20:39.844 WARN Force Quit received. Exiting now...
>>> //log:NSLog [Thread-3]
>>> APPLICATION SHUTDOWN SEQUENCE COMPLETE
>>> ===
>>>
>>> The "reading banner" log comes from a direct action:
>>>
>>> ===
>>> WOActionResults bannerAction {
>>>     def mpk=request().formValueForKey('mkpk'), lang=request().formValueForKey('lang')
>>>     println "DA: reading $lang banner image for market $mpk..."
>>>     if (!mpk || !lang) return null
>>>     ERXEC ec=ERXEC.newEditingContext()
>>>     DBMarket market=EOUtilities.objectWithPrimaryKeyValue(ec,'DBMarket',mpk as Integer)
>>>     println "DA: ... $market" //*
>>>     if (!market) return null
>>>     def mime=market."marketBannerMIME$lang", data=market."marketBannerData$lang"
>>>     println "DA: ... mime '$mime' data $data.length B"
>>>     WOResponse wor=new WOResponse()
>>>     wor.setHeader(mime,"content-type")
>>>     ...
>>>     wor.setContent(data)
>>>     wor
>>> }
>>> ===
>>>
>>> Note that the //* log is not present, which suggests that
>>> EOUtilities.objectWithPrimaryKeyValue somehow blocked?!? Nor did any
>>> other R/R loop start, which suggests that not only did WorkerThread118
>>> block, but the instance stopped accepting requests altogether.
>>>
>>> The FrontBase log (which I regret to say I do not fully understand --
>>> namely those "--N" lines, of which I get just the timestamp) looks like
>>> this:
>>>
>>> ===
>>> ...
>>> --6 0x1039f41d8 0x1039f44b0 2015-01-26 14:10:05.53996 "1 0"
>>> --D 0x1039f41d8 0x1039f44b0 2015-01-26 14:10:05.55851 752 48429
>>> SELECT t0."C_AUCTION_AMOUNT_STEP", t0."C_AUCTION_AOFFALL",
>>> t0."C_AUCTION_MAX_AMOUNT", t0."C_MAX_LENGHT", t0."C_AUCTION_MIN_AMOUNT",
>>> t0."C_AUCTION_CHDELAY", t0."C_MIN_LENGTH", t0."C_AUCTION_MINSTART",
>>> t0."C_AUCTION_NEXT_SEQ", t0."C_AUCTION_CONCUR", t0."C_CREATION_DATE",
>>> t0."C_CREATOR_ID", t0."C_FIELD_IDENTIFIERS_FOR_EDITOR",
>>> t0."C_FIELD_IDENTIFIERS_FOR_FILTER", t0."C_FIELD_IDENTIFIERS_FOR_LIST",
>>> t0."C_FIELD_IDENTIFIERS_FOR_OFFER", t0."C_FIELD_IDENTIFIERS_FOR_PUBLIC",
>>> t0."C_FORM_TEMPLATE_ID", t0."C_MARKET_BANNER_CZ_DATA",
>>> t0."C_MARKET_BANNER_EN_DATA", t0."C_MARKET_BANNER_CZ_MIME",
>>> t0."C_MARKET_BANNER_EN_MIME", t0."C_SUPPORTS_OFFERS", t0."C_SHORTCUT",
>>> t0."C_TITLE", t0."C_UID", t0."C_WIDTHS_FOR_LIST" FROM "T_MARKET" t0 WHERE
>>> t0."C_UID" = 1000001;
>>> --6 0x1039f41d8 0x1039f44b0 2015-01-26 14:10:05.55903 "1 0"
>>> --7 0x1039f41d8 0x1039f44b0 2015-01-26 14:10:05.55905 ""
>>> --8 0x1039f41d8 0x1039f44b0 2015-01-26 14:10:05.55907 "1"
>>> --2 0x1039f41d8 0x1039f44b0 2015-01-26 14:10:05.55976 7 48430
>>> commit;
>>> --6 0x1039f41d8 0x1039f44b0 2015-01-26 14:10:05.55984 "1 0"
>>> --3 0x1040841d8 0x1040844b0 2015-01-26 14:20:40.50016
>>> ...
>>> ===
>>>
>>> If anybody can see what on earth might have happened, I'd be grateful
>>> for any advice. Myself, I can just see that the app locked _somehow_ and
>>> stopped working entirely, but I have no idea why, nor how to prevent it...
>>>
>>> Note: the direct action worked flawlessly (and is in the logs) in the
>>> same instance on each previous R/R loop, more than 2000 times.
>>>
>>> Thanks,
>>> OC
>>>
>>>
>>> _______________________________________________
>>> Do not post admin requests to the list. They will be ignored.
>>> Webobjects-dev mailing list (Webobjects-dev@lists.apple.com)
>>> Help/Unsubscribe/Update your Subscription:
>>> https://lists.apple.com/mailman/options/webobjects-dev/rgurley%40smarthealth.com
>>>
>>> This email sent to rgur...@smarthealth.com
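[Editor's note: the lock-with-finally pattern recommended at the top of the thread can be sketched outside WebObjects with `java.util.concurrent.locks.ReentrantLock` standing in for the editing context's lock, since ERXEC is not available here; `LockPattern` and `doWorkSafely` are illustrative names. The point is that `finally` releases the lock even when the work throws — precisely the failure mode suspected in this thread, a thread that locked the OSC and never unlocked it:]

```java
import java.util.concurrent.locks.ReentrantLock;

public class LockPattern {
    // ReentrantLock stands in for the editing context's lock
    // (ec.lock()/ec.unlock() in EOF terms).
    static final ReentrantLock lock = new ReentrantLock();

    static void doWorkSafely(Runnable work) {
        lock.lock();
        try {
            work.run();        // may throw
        } finally {
            lock.unlock();     // always runs, even when work.run() throws
        }
    }

    public static void main(String[] args) {
        try {
            doWorkSafely(() -> { throw new IllegalStateException("boom"); });
        } catch (IllegalStateException expected) {
            // the exception propagates, but the lock was still released
        }
        System.out.println("locked after failure: " + lock.isLocked());  // prints "locked after failure: false"
    }
}
```

[Without the `finally`, an exception between `lock()` and `unlock()` leaves the lock held forever, and every later thread that tries to acquire it blocks — which would look exactly like the hung instance described above.]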