On 2012.08.09 11:31, sebasti...@gmx-topmail.de wrote:
> Maybe we got the jackpot (at least I hope so)?! My test computer finally 
> showed the segmentation fault today morning.

Good. The more data we get, the better we'll be able to fix the issue.

> You'll find all information in the attached file! (I limited the debug 
> information of libusbx to the last 12 seconds before the crash happened.)
>
> The last lines in the libusbx output file:
> Aug  9 06:11:02 logger: [144215.086624] [00000393] libusbx: debug 
> [usbi_handle_transfer_completion] transfer 0xb640049c has callback 0xfd71fb
> Aug  9 06:11:02 logger: [144215.086653] [00000393] libusbx: debug 
> [bulk_transfer_cb] actual_length=17
> Aug  9 06:11:02 logger: [144215.086750] [00000393] libusbx: debug 
> [add_to_flying_list] arm timerfd for timeout in 500ms (first in line)
> Aug  9 06:11:02 logger: [144215.086785] [00000393] libusbx: warning 
> [add_to_flying_list] failed to arm first timerfd (error -1)

OK, first thing I notice is that the timerfd report is output as a 
warning rather than debug, so if your test machine is behaving the same 
way as your other server, it means that you have silenced warnings 
there, which I'd strongly recommend against.

To confirm that the timerfd_settime() is the call that's causing us an 
issue, can you please try to set your production servers to at least 
output warning messages? The expectation is that, unlike debug, warnings 
aren't going to clog your log, and as you can see above, they can give 
precious clues as to where libusbx is not behaving nominally, and why.

I'd strongly recommend having libusbx logging set to at least warning 
level (or info) if you can. You don't even have to recompile libusbx: 
just make sure that the environment variable LIBUSB_DEBUG is set to 2 
(warning) or 3 (info) before starting pcscd.

Now, unfortunately, we don't display errno on error (we'll need to patch 
that), so we just get an indication that timerfd_settime() failed.
I'll try to push a patch to make sure we get errno displayed. In the 
meantime, I'd encourage you to change line 1207 in io.c from:
   usbi_warn(ctx, "failed to arm first timerfd (error %d)", r);
to
   usbi_warn(ctx, "failed to arm first timerfd (errno %d)", errno);
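
As a rough illustration of why the change matters, here is a minimal 
standalone sketch (Linux only, not libusbx code). The helper name 
settime_on_bad_fd() is made up for this example. It calls 
timerfd_settime() on a deliberately invalid descriptor, so the return 
value is just -1 while errno carries the actual reason:

```c
#include <assert.h>
#include <errno.h>
#include <stdio.h>
#include <string.h>
#include <sys/timerfd.h>

/* Hypothetical helper, not part of libusbx: calls timerfd_settime() on a
 * deliberately invalid descriptor and reports both the return value and
 * errno, to show which of the two carries the useful information. */
int settime_on_bad_fd(int *saved_errno)
{
    const struct itimerspec it = { {0, 0}, {1, 0} };
    int r;

    assert(saved_errno != NULL);
    errno = 0;
    r = timerfd_settime(-1, TFD_TIMER_ABSTIME, &it, NULL);
    *saved_errno = errno;   /* save it before another call can clobber it */
    if (r < 0)
        fprintf(stderr, "timerfd_settime: r=%d errno=%d (%s)\n",
                r, *saved_errno, strerror(*saved_errno));
    return r;
}
```

Calling settime_on_bad_fd(&e) from a small main() should log a line 
where r only says that the call failed and errno says why, which is the 
distinction the modified usbi_warn() is meant to expose.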


Now, to the possible causes, according to the man page [1]:

   timerfd_settime() can fail with the following errors:
   o EBADF  fd is not a valid file descriptor.
   o EINVAL fd is not a valid timerfd file descriptor.

Doubtful, as it would mean we've messed up our ctx structure.

   o EINVAL flags is invalid.

We pass TFD_TIMER_ABSTIME as a constant for flags, so no.

   o EFAULT new_value or old_value is not a valid pointer.

Well, that's an interesting one, considering we always use NULL for 
old_value in our call [2]. Maybe this needs to be a valid pointer after 
all? But then it's kind of puzzling it wouldn't fail every time...

   o EINVAL new_value is not properly initialized (one of the tv_nsec 
     falls outside the range zero to 999,999,999).

I don't see that as likely, but we may want to print these values just 
in case, when timerfd_settime() fails. To do that, you can add the 
following line after the one I mentioned above:

    usbi_warn(ctx, "it.it_value = %ld:%ld", (long)it.it_value.tv_sec,
        (long)it.it_value.tv_nsec);

Now, I'm hoping that you can apply these changes to your production 
server, as well as set your logging to WARN there. Hopefully, we'll 
catch a "failed to arm first timerfd" that will give us more clues. If 
we get an EFAULT, then we'll know how we're supposed to patch this issue.

Regards,

/Pete

[1] 
http://www.kernel.org/doc/man-pages/online/pages/man2/timerfd_create.2.html
[2] https://github.com/libusbx/libusbx/blob/master/libusb/io.c#L1205

_______________________________________________
libusbx-devel mailing list
libusbx-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/libusbx-devel
