On Thu, Jul 23, 2015 at 9:34 PM, Vedran Furač <vedran.fu...@gmail.com> wrote:
> On 07/23/2015 06:47 PM, Ilya Dryomov wrote:
>>
>> To me this looks like a writev() interrupted by a SIGALRM.  I think
>> nginx guys read your original email the same way I did, which is "write
>> syscall *returned* ERESTARTSYS", but I'm pretty sure that is not the
>> case here.
>>
>> ERESTARTSYS shows up in strace output but it is handled by the kernel,
>> userspace doesn't see it (but strace has to be able to see it, otherwise
>> you wouldn't know if your system call has been restarted or not).
>>
>> You cut the output short - I asked for the entire output for a reason,
>> please paste it somewhere.
>
> Might be, however I don't know why nginx would be interrupting it, all
> writes are done pretty fast and timeouts are set to 10 minutes. Here are
> 2 examples on 2 servers with slightly different configs (timestamps
> included):
>
> http://pastebin.com/wUAAcdT7
>
> http://pastebin.com/wHyWc9U5

I don't know.  It looks like nginx isn't setting SA_RESTART, so an
interrupted write()/writev() would come back with EINTR and nginx should
be repeating the call itself.  That said, if it happens only on cephfs,
we need to track it down.

Try enabling nginx debug logging?  I seem to have a vague recollection
that it records return values.  That should give us something to start
with.

Thanks,

                Ilya
_______________________________________________
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
