Hi there,

I have a fairly complex issue with uclibc-ng (master branch). Here is a summary.

Tested with qemu (buildroot for q800) on the m68k architecture, rsyslog consumes 100% CPU during its pthread_cond_wait() calls.

This happens in the doIdleProcessing() function [1].

Using glibc+m68k does not reveal this issue, and uclibc-ng on arm doesn't either.

Strace output shows that the futex_time64() syscall repeatedly returns EINVAL, so the wait loops instead of blocking:
# strace -f rsyslogd -dn
<snip>
[pid 110] write(1, "9139.913444393:main Q:Reg/w0 : "..., 859139.913444393:main Q:Reg/w0 : wti.c: main Q:Reg/w0: worker IDLE, waiting for work.
) = 85
[pid   110] get_thread_area()           = 0xc0b73970
[pid   110] get_thread_area()           = 0xc0b73970
[pid   110] get_thread_area()           = 0xc0b73970
[pid 110] futex_time64(0x8009aee2, FUTEX_WAIT_PRIVATE, 1, NULL) = -1 EINVAL (Invalid argument)
[pid   110] get_thread_area()           = 0xc0b73970
[pid   110] get_thread_area()           = 0xc0b73970
[pid 110] futex_time64(0x8009aee2, FUTEX_WAIT_PRIVATE, 1, NULL) = -1 EINVAL (Invalid argument)
[pid   110] get_thread_area()           = 0xc0b73970
[pid   110] get_thread_area()           = 0xc0b73970
[pid 110] futex_time64(0x8009aee2, FUTEX_WAIT_PRIVATE, 1, NULL) = -1 EINVAL (Invalid argument)
[pid   110] get_thread_area()           = 0xc0b73970
[pid   110] get_thread_area()           = 0xc0b73970
[pid 110] futex_time64(0x8009aee2, FUTEX_WAIT_PRIVATE, 1, NULL) = -1 EINVAL (Invalid argument)
[pid   110] get_thread_area()           = 0xc0b73970
[pid   110] get_thread_area()           = 0xc0b73970
[pid 110] futex_time64(0x8009aee2, FUTEX_WAIT_PRIVATE, 1, NULL) = -1 EINVAL (Invalid argument)
[pid   110] get_thread_area()           = 0xc0b73970
[pid   110] get_thread_area()           = 0xc0b73970
[pid 110] futex_time64(0x8009aee2, FUTEX_WAIT_PRIVATE, 1, NULL) = -1 EINVAL (Invalid argument)
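One detail in the trace above may be worth checking: the futex address 0x8009aee2 is not a multiple of four, and futex(2) documents EINVAL for a uaddr that is not four-byte aligned. As far as I know, the m68k ABI only guarantees 2-byte alignment for 32-bit integers, so this could be related, but that is an assumption and not a confirmed diagnosis. The sketch below only shows how the kernel reacts to an aligned versus a misaligned futex word; the names futex_wait_errno and probe_words are made up for illustration, and it assumes the C library exposes SYS_futex through syscall().

```c
#define _GNU_SOURCE
#include <errno.h>
#include <linux/futex.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>
#include <sys/syscall.h>
#include <unistd.h>

/* Hypothetical probe buffer, forced to 4-byte alignment so that
 * probe_words is aligned and probe_words + 2 bytes is not. */
uint32_t probe_words[2] __attribute__((aligned(4))) = { 0, 0 };

/* Hypothetical helper: issue FUTEX_WAIT on addr with an expected value.
 * The buffer holds 0, so passing expected = 1 makes a valid call return
 * EAGAIN immediately instead of blocking. Returns the resulting errno,
 * or 0 if the call unexpectedly succeeded. */
int futex_wait_errno(void *addr, uint32_t expected)
{
    long rc = syscall(SYS_futex, addr, FUTEX_WAIT_PRIVATE, expected,
                      NULL, NULL, 0);
    return rc == -1 ? errno : 0;
}

#ifdef FUTEX_PROBE_MAIN
int main(void)
{
    int aligned = futex_wait_errno(probe_words, 1);
    int misaligned = futex_wait_errno((char *)probe_words + 2, 1);

    printf("aligned:    %s\n", strerror(aligned));
    printf("misaligned: %s\n", strerror(misaligned));
    return 0;
}
#endif
```

Built with something like `cc -DFUTEX_PROBE_MAIN probe.c`, the aligned call should report EAGAIN (value mismatch) and the misaligned one EINVAL, which would match the errno seen in the trace.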

Interestingly, a simple test program [2] works fine under similar conditions.

I tried to dig in and debug this, but I could not find the root cause, and I suspect the same problem can show up in other programs as well...

Any help or suggestions would be appreciated!

[1]: https://github.com/rsyslog/rsyslog/blob/master/runtime/wti.c#L362
[2]: https://www.geeksforgeeks.org/condition-wait-signal-multi-threading/

Thanks!
JM