thanks for the quick response eric,
On Wed, Jun 1, 2011 at 9:48 AM, Eric Wong wrote:
> Bharanee Rathna wrote:
>> I'm encountering a weird error where the unicorn workers are stuck in
>> a loop after hitting a 500 on the backend sinatra app.
>
> Also, what extensions are you using in your app?
heaps of them: yajl, swift, rmagick, fastcaptcha, flock, nokogiri &
curb. apart from swift and curb, none of them should be touching the
network.
>> strace at the point where it starts to go into a loop of death
>
>> select(7, [4 5], NULL, [3 6], {30, 0}) = 1 (in [5], left {27, 274382})
>> fchmod(8, 01) = 0
>> fcntl(5, F_GETFL) = 0x802 (flags O_RDWR|O_NONBLOCK)
>> accept4(5, {sa_family=AF_INET, sin_port=htons(56728),
>> sin_addr=inet_addr("10.1.1.4")}, [16], SOCK_CLOEXEC) = 12
>> recvfrom(12, 0x1c99fb0, 16384, 64, 0, 0) = -1 EAGAIN (Resource
>> temporarily unavailable)
>
> (I'm somewhat more awake, now, haven't been sleeping much)
>
> Two things look off in the line above:
>
> 1) recvfrom() isn't using the MSG_DONTWAIT flag. I know you're using
> Linux, so kgio should be using MSG_DONTWAIT to do non-blocking
> recv... Which versions of unicorn/kgio are you using?
using kgio 2.3.2, i'll upgrade it and give it another try
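
For reference, a minimal sketch of the non-blocking recv that point 1
describes. It uses Python's socket module rather than kgio, and the helper
name nonblocking_recv is made up for illustration. On Linux, MSG_DONTWAIT
has the value 0x40 (64), so the bare 64 in the fourth argument of the
recvfrom() line above may be that flag printed undecoded by strace; that is
a guess from the numbers, not something confirmed in this thread.

```python
import socket


def nonblocking_recv(sock, size=16384):
    """Attempt a single recv() that never blocks.

    Returns the bytes read, or None when the kernel reports
    EAGAIN/EWOULDBLOCK (no data queued yet).
    """
    try:
        # MSG_DONTWAIT makes this one call non-blocking, whether or not
        # O_NONBLOCK is set on the descriptor itself.
        return sock.recv(size, socket.MSG_DONTWAIT)
    except BlockingIOError:
        return None


# Hypothetical stand-in for an accepted client connection.
a, b = socket.socketpair()

empty = nonblocking_recv(a)            # nothing has been sent yet
b.sendall(b"GET / HTTP/1.0\r\n\r\n")
data = nonblocking_recv(a)             # the request is now queued

a.close()
b.close()
```

The first call corresponds to the EAGAIN seen in the trace: the connection
was accepted, but no request bytes had arrived when recv was attempted.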
>
> 2) TCP_DEFER_ACCEPT should prevent recvfrom() from hitting EAGAIN
> in the common case under Linux.
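
A rough sketch of what point 2 refers to, assuming a plain TCP listener set
up by hand (this is not unicorn's own code, and listen_with_defer_accept is
a hypothetical helper). With TCP_DEFER_ACCEPT set on a Linux listener, the
kernel normally holds back a new connection until data has arrived or the
timeout passes, so the first read after accept() usually finds data.

```python
import socket


def listen_with_defer_accept(host="127.0.0.1", port=0, timeout_s=1):
    """Create a listening TCP socket, enabling TCP_DEFER_ACCEPT if available.

    Returns (listener, enabled), where enabled says whether the option
    was applied. The option is Linux-specific, so it is looked up
    defensively instead of being assumed to exist.
    """
    srv = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    srv.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)

    opt = getattr(socket, "TCP_DEFER_ACCEPT", None)
    if opt is not None:
        # The value is a timeout in seconds; the kernel may round it.
        srv.setsockopt(socket.IPPROTO_TCP, opt, timeout_s)

    srv.bind((host, port))
    srv.listen(16)
    return srv, opt is not None


srv, enabled = listen_with_defer_accept()
bound_port = srv.getsockname()[1]
srv.close()
```

If the option is in effect and recv still hits EAGAIN right after accept,
one possibility is that the client connected but sent nothing before the
defer timeout ran out.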
>
>> select(13, [12], NULL, NULL, NULL) = ? ERESTARTNOHAND (To be restarted)
>> --- SIGINT (Interrupt) @ 0 (0) ---
>> rt_sigreturn(0x2) = -1 EINTR (Interrupted system call)
>
> What triggered SIGINT?
not sure
>
> Actually, after many lines of sched_yield() in your gist, I can see it
> does actually exit the process. Did you kill it with SIGINT? If so, I
> see nothing wrong...
yes i killed it after the worker looked stuck and wasn't responding for 30s