On Tue, Mar 18, 2014 at 5:20 AM, David Precious <[email protected]>wrote:
> On Tue, 18 Mar 2014 01:42:42 -0500 > Mike South <[email protected]> wrote: > > I recently had a problem come up with my D1 app. I have the app > > writing the PID to the database at start and stop of a particular > > route, and I have two runs that never completed. > > > > I did an strace on the worker and it looked like this: > > > > [root@host ~]# strace -p 26508 > > Process 26508 attached - interrupt to quit > > flock(10, LOCK_EX > > > > lsof on that process gave me > > > > starman 26508 apps 10wW REG 8,22 0 > > 24 /tmp/3j6mXZXwWM (deleted) > > > > (not sure, but I think the 10 in the flock call and the 10 in the > > 10wW mean that we're talking about the same file?). > > yeah, that'll be the same file descriptor - and yeah, looks like it's > waiting to get an exclusive lock on it, but presumably another process > already has that lock and hasn't released it - possibly as it's waiting > on a lock on something the process you looked at has locked, leading to > a deadlock situation? > > Do you use File::Temp within your process to acquire temporary files? > Not directly. I don't know of it specifically being used by anything, but it is installed in my perlbrew so it definitely could be in use. I could put some debugging in there and see if it produces any clues. > > > > After a little more digging I found that every starman worker had > > that file open (except with 10w instead of 10wW for all but the one > > above and stracing a few of the other workers showed the same output > > ( "flock(10, LOCK_EX" ) as above. > > That same exact filename? If so, odd. > yes: [root@host ~]# lsof |grep /tmp/3j6mXZXwWM starman 26502 apps 10w REG 8,22 0 24 /tmp/3j6mXZXwWM (deleted) starman 26503 apps 10w REG 8,22 0 24 /tmp/3j6mXZXwWM (deleted) starman 26504 apps 10w REG 8,22 0 24 /tmp/3j6mXZXwWM (deleted) starman 26505 apps 10w REG 8,22 0 24 /tmp/3j6mXZXwWM (deleted) starman 26506 apps 10w REG 8,22 0 24 /tmp/3j6mXZXwWM (deleted) starman 26507 apps 10w REG 8,22 0 24 /tmp/3j6mXZXwWM (deleted) starman 26508 apps 10wW REG 8,22 0 24 /tmp/3j6mXZXwWM (deleted) starman 26509 apps 10w REG 8,22 0 24 /tmp/3j6mXZXwWM (deleted) starman 26510 apps 10w REG 8,22 0 24 /tmp/3j6mXZXwWM (deleted) starman 26511 apps 10w REG 8,22 0 24 /tmp/3j6mXZXwWM (deleted) starman 26512 apps 10w REG 8,22 0 24 /tmp/3j6mXZXwWM (deleted) ... According to the docs, this output means all the processes have it open for writing and 26508 has a write lock on the whole file. I'll see if I can get anything out of File::Temp--maybe adding debugging output there will give me an idea of what's so interesting to everybody. Thanks for looking at this with me. mike
_______________________________________________ dancer-users mailing list [email protected] http://lists.preshweb.co.uk/mailman/listinfo/dancer-users
