The important backtrace in there is the one from thread 11:

#0  0x00007fb288428474 in read () from /lib/x86_64-linux-gnu/libpthread.so.0
No symbol table info available.
#1  0x00007fb2890c4518 in ?? () from /usr/lib/x86_64-linux-gnu/liblber-2.4.so.2
No symbol table info available.
#2  0x00007fb287895848 in ?? () from /usr/lib/x86_64-linux-gnu/libgnutls.so.30
No symbol table info available.
#3  0x00007fb28788f96a in ?? () from /usr/lib/x86_64-linux-gnu/libgnutls.so.30
No symbol table info available.
#4  0x00007fb287896d03 in ?? () from /usr/lib/x86_64-linux-gnu/libgnutls.so.30
No symbol table info available.
#5  0x00007fb28789991c in ?? () from /usr/lib/x86_64-linux-gnu/libgnutls.so.30
No symbol table info available.
#6  0x00007fb2878a10cb in ?? () from /usr/lib/x86_64-linux-gnu/libgnutls.so.30
No symbol table info available.
#7  0x00007fb28789d572 in gnutls_handshake () from /usr/lib/x86_64-linux-gnu/libgnutls.so.30
No symbol table info available.
#8  0x00007fb289304199 in ?? () from /usr/lib/x86_64-linux-gnu/libldap_r-2.4.so.2
No symbol table info available.
#9  0x00007fb289301abb in ldap_pvt_tls_accept () from /usr/lib/x86_64-linux-gnu/libldap_r-2.4.so.2
No symbol table info available.
#10 0x0000556b843e6f69 in connection_read (cri=<synthetic pointer>, s=430) at ../../../../servers/slapd/connection.c:1375

Debug symbols are missing there, but I hit the exact same problem and,
with symbols available, get:

#0  0x00007f2a01101474 in __libc_read (fd=40, buf=0x7f29dc142ecb, nbytes=5) at ../sysdeps/unix/sysv/linux/read.c:27
#1  0x00007f2a01db0518 in sb_debug_read (sbiod=0x7f29dc10e940, buf=0x7f29dc142ecb, len=5) at ../../../../libraries/liblber/sockbuf.c:829
#2  0x00007f2a00558848 in _gnutls_stream_read (ms=0x7f29e8ffb41c, pull_func=0x7f2a01ff1da0 <tlsg_recv>, size=5, bufel=<synthetic pointer>, session=0x7f29dc008060) at buffers.c:344
#3  _gnutls_read (ms=0x7f29e8ffb41c, pull_func=0x7f2a01ff1da0 <tlsg_recv>, size=5, bufel=<synthetic pointer>, session=0x7f29dc008060) at buffers.c:424
#4  _gnutls_io_read_buffered (session=session@entry=0x7f29dc008060, total=5, recv_type=recv_type@entry=4294967295, ms=0x7f29e8ffb41c) at buffers.c:579
#5  0x00007f2a0055296a in recv_headers (ms=<optimized out>, record=0x7f29e8ffb470, htype=GNUTLS_HANDSHAKE_CLIENT_KEY_EXCHANGE, type=GNUTLS_HANDSHAKE, record_params=0x7f29dc1279f0, session=0x7f29dc008060) at record.c:1045
#6  _gnutls_recv_in_buffers (session=session@entry=0x7f29dc008060, type=type@entry=GNUTLS_HANDSHAKE, htype=htype@entry=GNUTLS_HANDSHAKE_CLIENT_KEY_EXCHANGE, ms=<optimized out>, ms@entry=0) at record.c:1173
#7  0x00007f2a00559d03 in _gnutls_handshake_io_recv_int (session=session@entry=0x7f29dc008060, htype=htype@entry=GNUTLS_HANDSHAKE_CLIENT_KEY_EXCHANGE, hsk=hsk@entry=0x7f29e8ffb580, optional=optional@entry=0) at buffers.c:1412
#8  0x00007f2a0055c91c in _gnutls_recv_handshake (session=session@entry=0x7f29dc008060, type=type@entry=GNUTLS_HANDSHAKE_CLIENT_KEY_EXCHANGE, optional=optional@entry=0, buf=buf@entry=0x7f29e8ffb830) at handshake.c:1465
#9  0x00007f2a005640cb in _gnutls_recv_client_kx_message (session=session@entry=0x7f29dc008060) at kx.c:563
#10 0x00007f2a00560572 in handshake_server (session=0x7f29dc008060) at handshake.c:3327
#11 gnutls_handshake (session=0x7f29dc008060) at handshake.c:2629
#12 0x00007f2a01ff2199 in tlsg_session_accept (session=0x7f29dc1133f0) at tls_g.c:363
#13 0x00007f2a01fefabb in ldap_pvt_tls_accept (sb=0x7f299c0051a0, ctx_arg=0x55d92cbca560) at tls2.c:425


and I've tracked it down to:

https://bugs.openldap.org/show_bug.cgi?id=8650#c12

Basically, what we see is one thread stuck in a busy loop doing read()s
on the TCP socket which all return immediately with EAGAIN as the fd is
in non-blocking mode.
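
To illustrate the mechanism (this is not OpenLDAP or GnuTLS code, just a
minimal Python sketch under my own assumptions): on a non-blocking fd with
no data pending, every read fails at once with EAGAIN, so a caller that
simply retries burns CPU, whereas a caller that waits in select() sleeps
until the peer sends something. The names spin_reads, wait_readable and
demo are made up for the example, and a socketpair stands in for the TCP
connection to the stalled client. The 5-byte read size mirrors nbytes=5 in
the backtrace above.

```python
import select
import socket


def spin_reads(sock, attempts, nbytes=5):
    """Call recv() `attempts` times in a row; return how many hit EAGAIN."""
    would_block = 0
    for _ in range(attempts):
        try:
            sock.recv(nbytes)
        except BlockingIOError:
            # Non-blocking fd with nothing to read: recv() fails at once
            # with EAGAIN/EWOULDBLOCK instead of sleeping in the kernel.
            would_block += 1
    return would_block


def wait_readable(sock, timeout):
    """Sleep in select() until data arrives or `timeout` seconds pass."""
    readable, _, _ = select.select([sock], [], [], timeout)
    return bool(readable)


def demo():
    # socketpair() stands in for the TCP connection to a stalled client.
    server_side, client_side = socket.socketpair()
    server_side.setblocking(False)
    try:
        spins = spin_reads(server_side, 1000)      # every attempt is wasted
        idle = wait_readable(server_side, 0.05)    # nothing to read yet
        client_side.send(b"12345")                 # the peer finally talks
        ready = wait_readable(server_side, 1.0)    # now the fd is readable
        return spins, idle, ready
    finally:
        server_side.close()
        client_side.close()
```

Calling demo() returns the number of wasted reads, then whether the fd
was readable before and after the peer sent data. The retry loop in
spin_reads is the pattern that shows up as 100% CPU on the server.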

In my case, the client goes offline just after sending the TLS client
hello.  The busy loop then lasts for about 15 minutes, probably until some
timeout after which the TCP connection is finally considered dead.

It can be reproduced by running on a client:

gdb --args ldapsearch -H ldaps://ldap.example.com -x

Then in gdb:

break write
run
continue

Then the client is paused after sending the TLS "client hello".
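
For a reproduction without gdb, something along these lines might also
work. This is an untested sketch that I have not run against an affected
slapd: it uses Python's ssl.MemoryBIO to produce a client hello, sends only
that over a plain TCP socket, and then stays silent. The function names
and the hold_seconds parameter are made up for the example; 636 is the
standard ldaps port. Note that a silent client with an open socket is not
quite the same as a client that drops off the network, although in both
cases the server receives nothing after the client hello.

```python
import socket
import ssl
import time


def build_client_hello(server_hostname):
    """Return the raw bytes of a TLS ClientHello without touching the network."""
    ctx = ssl.create_default_context()
    incoming = ssl.MemoryBIO()   # bytes "from the server" (stays empty)
    outgoing = ssl.MemoryBIO()   # bytes the TLS stack wants to send
    tls = ctx.wrap_bio(incoming, outgoing, server_hostname=server_hostname)
    try:
        tls.do_handshake()
    except ssl.SSLWantReadError:
        # Expected: the ClientHello has been queued and the stack now
        # waits for a ServerHello that never arrives.
        pass
    return outgoing.read()


def stall_after_client_hello(host, port=636, hold_seconds=60):
    """Send only the ClientHello, then stay silent with the socket open."""
    hello = build_client_hello(host)
    with socket.create_connection((host, port)) as sock:
        sock.sendall(hello)
        # Never answer the server's reply; just keep the connection open.
        time.sleep(hold_seconds)
```

Running stall_after_client_hello("ldap.example.com") on the client while
watching slapd in top on the server should show whether a thread starts
spinning.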

https://bugs.openldap.org/show_bug.cgi?id=8650#c12 explains that it's
https://github.com/openldap/openldap/commit/7b5181da8cdd47a13041f9ee36fa9590a0fa6e48
that is responsible for the issue.

https://github.com/openldap/openldap/commit/4c1ab16ade18a253dd81df7e6eced4d920ac6a8e
reverted that commit, but the revert did not make it into bionic.

So cherry picking
https://github.com/openldap/openldap/commit/4c1ab16ade18a253dd81df7e6eced4d920ac6a8e
should fix it.

** Bug watch added: bugs.openldap.org/ #8650
   https://bugs.openldap.org/show_bug.cgi?id=8650

https://bugs.launchpad.net/bugs/1926265

Title:
  slapd enter in infinite loop on sched_yield syscall

Status in openldap package in Ubuntu:
  New

Bug description:
  On a production server, slapd sometimes becomes unresponsive: some threads
loop in the sched_yield syscall and consume all CPU.
  To recover, slapd needs to be restarted.
  No related information is reported in the log file.
  All similar issues in the upstream OpenLDAP project are old and fixed,
  so maybe this issue affects only the Ubuntu package.
  It occurs randomly, so I have no steps to reproduce.

  
  OS : Bionic

  Openldap version:

  libldap-2.4-2:amd64    2.4.45+dfsg-1ubuntu1.10
  libldap-common         2.4.45+dfsg-1ubuntu1.10
  slapd                  2.4.45+dfsg-1ubuntu1.10

  Modules loaded:

  olcModuleLoad: {0}back_bdb
  olcModuleLoad: {1}syncprov
  olcModuleLoad: {2}back_monitor
  olcModuleLoad: {3}memberof.la
  olcModuleLoad: {4}refint.la
  olcModuleLoad: {5}rwm
  olcModuleload: {6}back_ldap

  
  Backend is BDB. slapd runs in (single) master - (multi) slave mode.
