WOTaskd becomes unresponsive whenever an instance hangs. We’ve reconfigured 
WOTaskd to write its configuration to an XML file, and the Apache adaptor 
reads that file instead of communicating with wotaskd directly.
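
On the thread-dump question in the quoted message: on a Unix host, sending 
the JVM a QUIT signal (kill -QUIT <pid>) or running the JDK's jstack <pid> 
normally prints the stack traces of all threads, and neither needs any code 
changes. If the dump should go to the application's own log instead, 
something along these lines might work. This is only a minimal sketch: the 
class name ThreadDumper and its method are made up for illustration and are 
not part of WebObjects or Wonder; only Thread.getAllStackTraces() is 
standard Java.

```java
import java.util.Map;

// Hypothetical helper (not a WebObjects/Wonder class): collects the stack
// traces of all live threads in the current JVM into one string.
class ThreadDumper {

    /** Returns a formatted dump of every live thread's name, state and stack. */
    static String dumpAllThreads() {
        StringBuilder out = new StringBuilder();
        // Thread.getAllStackTraces() returns a snapshot map: thread -> stack frames.
        for (Map.Entry<Thread, StackTraceElement[]> entry
                : Thread.getAllStackTraces().entrySet()) {
            Thread thread = entry.getKey();
            out.append('"').append(thread.getName()).append('"')
               .append(" state=").append(thread.getState()).append('\n');
            for (StackTraceElement frame : entry.getValue()) {
                out.append("    at ").append(frame).append('\n');
            }
            out.append('\n');
        }
        return out.toString();
    }
}
```

The result could then be passed to whatever logger the application uses, 
e.g. log.info(ThreadDumper.dumpAllThreads()). It has to be called from a 
thread that is not itself stuck, for instance from a timer thread or some 
admin-only request handler (both assumptions about how the application 
might be set up).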

Michael Kondratov
Aspire Auctions, Inc.
216-231-5515

> On Oct 24, 2016, at 11:41 AM, OC <o...@ocs.cz> wrote:
> 
> Hello there,
> 
> there seems to be one pretty rare, ugly and hard-to-find lock in my 
> application (I shall get back to it at the end, in hope it might ring a 
> bell), but what's most weird: it seems that when it happens, it's _wotaskd_ 
> that primarily goes down?!?
> 
> Alas, the information is sparse: it happens at the deployment site, to which 
> the programming team has no access (and so far we have not been able to 
> reproduce the problem at the test site, whatever we try), but according to 
> the site admin and the logs, it looks like this:
> 
> (a) first, one of the worker threads hangs somehow, so far inexplicably (EC 
> locking problem possible but improbable, explained below)
> (b) for some time, other threads run without a glitch, new requests are 
> served, new R/R loop worker threads are spawned and logged (I log all R/R 
> loops)
> (c) shortly (within minutes), though, the adaptor begins to redirect 
> requests to the “Redirection URL”
> (d) now, the site admin is alerted; he runs JavaMonitor **which reports 
> “Failed to contact 127.0.0.1-1085”**!
> (e) he finds which process belongs to *the application instance* (*not* the 
> wotaskd!), and kills it from Terminal
> (f) which magically cures wotaskd: JavaMonitor starts working again, stops 
> showing the 1085 failure and allows the instance to be relaunched; all is 
> well and swell.
> 
> Does this perhaps ring a bell? To me this behaviour does not make any sense :/
> 
> As for the hang itself, it's rather weird too. There is a loop which goes 
> through a list of EOs; each of them is logged. Something like this:
> 
> ===
>        for (DBTimeChunk tch in session().currentMarket.orderedTimeChunks()) {
>            log.info(""+tch)
>            // the condition below happens to be true in our case
>            if (tch.someTimestamp>fixedTimestamp) continue
>            // ... therefore some irrelevant code here (it would log if it
>            // happened; it does not) ...
>        }
> ===
> 
> The problem is that
> 
> - this goes through some of the TimeChunks, and _then_ it hangs -- not at the 
> start of R/R loop, where EC locking problems could be expected
> - in the same session, with the same EC, even in the same thread (for the 
> method which contains the loop happens to be used twice in the page template) 
> the loop has already run through all the TimeChunks, tested their 
> someTimestamp and ended without a glitch (so, no fault is fired when it hangs)
> 
> So far it has happened about thrice; each time on a different TimeChunk.
> 
> About the only thing I guess _might_ cause the hang of the thread is the "log 
> tch". TimeChunk's toString() is comparatively complex; it might call, among 
> more mundane things, also
> - this.changesFromCommittedSnapshot()
> - this.attributeKeys()
> - this.primaryKey() (of ERXGenericRecord which it inherits)
> 
> Might one of them hang the thread if another thread does the same thing, or 
> something else, at the wrong moment? (Presumably all of them had already 
> been called for the same EO in the same thread without trouble shortly 
> before.)
> 
> If it happens again, it would help if the site admin could, before killing 
> the application, force it somehow to log the stack traces of all its 
> threads. Is there some trick for that?
> 
> And of course, for any other advice on how to hunt for this bloody kind of 
> bug I'll be extremely grateful.
> 
> Thanks a lot,
> OC
> 
> 
> _______________________________________________
> Do not post admin requests to the list. They will be ignored.
> Webobjects-dev mailing list      (Webobjects-dev@lists.apple.com)
> Help/Unsubscribe/Update your Subscription:
> https://lists.apple.com/mailman/options/webobjects-dev/michael%40aspireauctions.com
> 
> This email sent to mich...@aspireauctions.com


