[Touch-packages] [Bug 997217] Re: salsauthd maxes cpu

2018-11-12 Thread Andreas Hasenack
Roberto, are you able to test this update? If not, I can do it, but it
would be better if someone not involved with the fix could do it.

-- 
You received this bug notification because you are a member of Ubuntu
Touch seeded packages, which is subscribed to cyrus-sasl2 in Ubuntu.
https://bugs.launchpad.net/bugs/997217

Title:
  salsauthd maxes cpu

Status in cyrus-sasl2 package in Ubuntu:
  Fix Released
Status in cyrus-sasl2 source package in Precise:
  Won't Fix
Status in cyrus-sasl2 source package in Trusty:
  Invalid
Status in cyrus-sasl2 source package in Xenial:
  Fix Committed

Bug description:
  [Impact]

  The rimap authentication mechanism in saslauthd can hit a condition
  where it will start spinning and using all available CPU. This
  condition can be easily encountered when an authentication is
  happening and the imap service is being restarted.

  Furthermore, the saslauthd child process that picked up that
  authentication request and that is spinning now won't be reaped nor
  can it service further requests. If all children are left in this
  state, the authentication service as a whole won't be working anymore.

  [Test Case]

  This test can be performed in a LXD or VM.

  * install the needed packages. mail-stack-delivery is used to have an
  imap server available on localhost that needs no further
  configuration. Accept the defaults for all debconf prompts:

  sudo apt update
  sudo apt install sasl2-bin mail-stack-delivery

  * set the password "ubuntu" for the ubuntu user
  echo ubuntu:ubuntu | sudo chpasswd

  * start saslauthd like this, with just one child:
  sudo /usr/sbin/saslauthd -a rimap -O localhost -r -n 1

  * restart dovecot
  sudo service dovecot restart

  * test saslauthd authentication:
  $ sudo testsaslauthd -u ubuntu -p ubuntu
  0: OK "Success."

  * Now let's break it. In one terminal watch the output of top:
  top

  * in another terminal, run the following:
  sudo testsaslauthd -u ubuntu -p ubuntu & sleep 1; sudo service dovecot stop

  * observe in the "top" terminal that saslauthd is consuming a lot of
  cpu. If that's not happening, try starting dovecot again and adjusting
  the sleep value in the previous test command, but 1s was enough in all
  my runs.

  * start dovecot and repeat the authentication request. Since the only 
saslauthd child is now spinning, this will block:
  sudo service dovecot start
  $ sudo testsaslauthd -u ubuntu -p ubuntu
  

  [Regression Potential]
  This fix relies on read(2) returning zero bytes when the connection is 
dropped, and that is clearly documented in its manpage:

  "On  success,  the  number  of  bytes read is returned (zero indicates
  end of file),"

  The select manpage also documents such a case being a valid case to
  indicate that a socket is ready to be read from, and that it won't
  block:

  "The file descriptors listed in readfds will be watched to see if
  characters become available for reading (more precisely, to see  if  a
  read will  not  block;  in  particular, a file descriptor is also
  ready on end-of-file)"

  This patch is what was used upstream, and is also present in bionic.

  I can't think of regressions specific to this change, other than the
  usual risk of rebuilding a widely used library (sasl2) in an
  environment different from the one from when xenial was released,
  i.e., different libraries available, maybe different system-wide build
  options, etc.

  [Other Info]
  Trusty is still not accounting for read() returning zero being an end-of-file 
case, but the loop there has a counter and it eventually exits, not leading to 
a perpetual spin or high cpu usage (see comment #17 for a brief history on how 
this fix was dropped in the xenial package).

  The fix is simple and could be applied there as well, if the SRU team
  prefers.

  
  [Original Description]

  sasl2-bin version 2.1.24~rc1.dfsg1+cvs2011-05-23-4ubuntu contains a
  bug that causes heavy cpu utilization, impacting normal operation of
  one of our mail servers following an upgrade to Ubuntu 12.04.

  We are running the daemon with the following options:

  /usr/sbin/saslauthd -a rimap -O our.imap.server -r -m
  /var/spool/postfix/var/run/saslauthd -n 5

  We noticed that users were unable to send mail and that the saslauthd
  processes were using approximately 100% of each cpu core. An strace of
  one of the runaway process showed that it was stuck in the following
  behaviour:

  select(9, [8], NULL, NULL, {0, 0})  = 1 (in [8], left {0, 0})
  read(8, "", 940)= 0
  select(9, [8], NULL, NULL, {0, 0})  = 1 (in [8], left {0, 0})
  read(8, "", 940)= 0
  select(9, [8], NULL, NULL, {0, 0})  = 1 (in [8], left {0, 0})
  read(8, "", 940)= 0
  select(9, [8], NULL, NULL, {0, 0})  = 1 (in [8], left {0, 0})
  read(8, "", 940)= 0
  select(9, [8], NULL, NULL, {0, 0})  = 1 (in [8], left {0, 0})
  read(8, "", 940)   

[Touch-packages] [Bug 997217] Re: salsauthd maxes cpu

2018-11-14 Thread Andreas Hasenack
xenial verification

Confirming bug:

ubuntu@xenial-saslauthd:~$ apt-cache policy sasl2-bin
sasl2-bin:
  Installed: 2.1.26.dfsg1-14build1
  Candidate: 2.1.26.dfsg1-14build1
  Version table:
 *** 2.1.26.dfsg1-14build1 500
500 http://br.archive.ubuntu.com/ubuntu xenial/main amd64 Packages


Confirmed saslauthd at 100% CPU after the test command:
  PID USER  PR  NIVIRTRESSHR S  %CPU %MEM TIME+ COMMAND 

   
  root  20   0   90860972  0 R 100.0  0.0   0:05.72 saslauthd   
   


subsequent testsaslauthd commands just block, as expected due to the bug.


Repeating the test with the package from proposed:
ubuntu@xenial-saslauthd:~$ apt-cache policy sasl2-bin
sasl2-bin:
  Installed: 2.1.26.dfsg1-14ubuntu0.1
  Candidate: 2.1.26.dfsg1-14ubuntu0.1
  Version table:
 *** 2.1.26.dfsg1-14ubuntu0.1 500
500 http://br.archive.ubuntu.com/ubuntu xenial-proposed/main amd64 
Packages

Not only does saslauthd not start eating cpu, it keeps working:
with dovecot stopped:
ubuntu@xenial-saslauthd:~$ sudo testsaslauthd -u ubuntu -p ubuntu
0: NO "authentication failed"

Starting up dovecot and trying again:
ubuntu@xenial-saslauthd:~$ sudo service dovecot start
ubuntu@xenial-saslauthd:~$ sudo testsaslauthd -u ubuntu -p ubuntu
0: OK "Success."
ubuntu@xenial-saslauthd:~$ 

Xenial verification completed successfully.

** Tags removed: verification-needed-xenial
** Tags added: verification-done-xenial

-- 
You received this bug notification because you are a member of Ubuntu
Touch seeded packages, which is subscribed to cyrus-sasl2 in Ubuntu.
https://bugs.launchpad.net/bugs/997217

Title:
  salsauthd maxes cpu

Status in cyrus-sasl2 package in Ubuntu:
  Fix Released
Status in cyrus-sasl2 source package in Precise:
  Won't Fix
Status in cyrus-sasl2 source package in Trusty:
  Invalid
Status in cyrus-sasl2 source package in Xenial:
  Fix Committed

Bug description:
  [Impact]

  The rimap authentication mechanism in saslauthd can hit a condition
  where it will start spinning and using all available CPU. This
  condition can be easily encountered when an authentication is
  happening and the imap service is being restarted.

  Furthermore, the saslauthd child process that picked up that
  authentication request and that is spinning now won't be reaped nor
  can it service further requests. If all children are left in this
  state, the authentication service as a whole won't be working anymore.

  [Test Case]

  This test can be performed in a LXD or VM.

  * install the needed packages. mail-stack-delivery is used to have an
  imap server available on localhost that needs no further
  configuration. Accept the defaults for all debconf prompts:

  sudo apt update
  sudo apt install sasl2-bin mail-stack-delivery

  * set the password "ubuntu" for the ubuntu user
  echo ubuntu:ubuntu | sudo chpasswd

  * start saslauthd like this, with just one child:
  sudo /usr/sbin/saslauthd -a rimap -O localhost -r -n 1

  * restart dovecot
  sudo service dovecot restart

  * test saslauthd authentication:
  $ sudo testsaslauthd -u ubuntu -p ubuntu
  0: OK "Success."

  * Now let's break it. In one terminal watch the output of top:
  top

  * in another terminal, run the following:
  sudo testsaslauthd -u ubuntu -p ubuntu & sleep 1; sudo service dovecot stop

  * observe in the "top" terminal that saslauthd is consuming a lot of
  cpu. If that's not happening, try starting dovecot again and adjusting
  the sleep value in the previous test command, but 1s was enough in all
  my runs.

  * start dovecot and repeat the authentication request. Since the only 
saslauthd child is now spinning, this will block:
  sudo service dovecot start
  $ sudo testsaslauthd -u ubuntu -p ubuntu
  

  [Regression Potential]
  This fix relies on read(2) returning zero bytes when the connection is 
dropped, and that is clearly documented in its manpage:

  "On  success,  the  number  of  bytes read is returned (zero indicates
  end of file),"

  The select manpage also documents such a case being a valid case to
  indicate that a socket is ready to be read from, and that it won't
  block:

  "The file descriptors listed in readfds will be watched to see if
  characters become available for reading (more precisely, to see  if  a
  read will  not  block;  in  particular, a file descriptor is also
  ready on end-of-file)"

  This patch is what was used upstream, and is also present in bionic.

  I can't think of regressions specific to this change, other than the
  usual risk of rebuilding a widely used library (sasl2) in an
  environment different from the one from when xenial was released,
  i.e., different libraries available, maybe different system-wide build
  options, etc.

  [Other Info]
  Trusty is still not accounting for read() returning zero being an end-of-file 
case, but the loop 

[Touch-packages] [Bug 997217] Re: salsauthd maxes cpu

2018-11-19 Thread Launchpad Bug Tracker
This bug was fixed in the package cyrus-sasl2 - 2.1.26.dfsg1-14ubuntu0.1

---
cyrus-sasl2 (2.1.26.dfsg1-14ubuntu0.1) xenial; urgency=medium

  * d/p/dont_hang_when_imap_closes.patch: Don't hang when IMAP
server closes connection. (LP: #997217)

 -- Andreas Hasenack   Wed, 24 Oct 2018 14:51:00
-0300

** Changed in: cyrus-sasl2 (Ubuntu Xenial)
   Status: Fix Committed => Fix Released

-- 
You received this bug notification because you are a member of Ubuntu
Touch seeded packages, which is subscribed to cyrus-sasl2 in Ubuntu.
https://bugs.launchpad.net/bugs/997217

Title:
  salsauthd maxes cpu

Status in cyrus-sasl2 package in Ubuntu:
  Fix Released
Status in cyrus-sasl2 source package in Precise:
  Won't Fix
Status in cyrus-sasl2 source package in Trusty:
  Invalid
Status in cyrus-sasl2 source package in Xenial:
  Fix Released

Bug description:
  [Impact]

  The rimap authentication mechanism in saslauthd can hit a condition
  where it will start spinning and using all available CPU. This
  condition can be easily encountered when an authentication is
  happening and the imap service is being restarted.

  Furthermore, the saslauthd child process that picked up that
  authentication request and that is spinning now won't be reaped nor
  can it service further requests. If all children are left in this
  state, the authentication service as a whole won't be working anymore.

  [Test Case]

  This test can be performed in a LXD or VM.

  * install the needed packages. mail-stack-delivery is used to have an
  imap server available on localhost that needs no further
  configuration. Accept the defaults for all debconf prompts:

  sudo apt update
  sudo apt install sasl2-bin mail-stack-delivery

  * set the password "ubuntu" for the ubuntu user
  echo ubuntu:ubuntu | sudo chpasswd

  * start saslauthd like this, with just one child:
  sudo /usr/sbin/saslauthd -a rimap -O localhost -r -n 1

  * restart dovecot
  sudo service dovecot restart

  * test saslauthd authentication:
  $ sudo testsaslauthd -u ubuntu -p ubuntu
  0: OK "Success."

  * Now let's break it. In one terminal watch the output of top:
  top

  * in another terminal, run the following:
  sudo testsaslauthd -u ubuntu -p ubuntu & sleep 1; sudo service dovecot stop

  * observe in the "top" terminal that saslauthd is consuming a lot of
  cpu. If that's not happening, try starting dovecot again and adjusting
  the sleep value in the previous test command, but 1s was enough in all
  my runs.

  * start dovecot and repeat the authentication request. Since the only 
saslauthd child is now spinning, this will block:
  sudo service dovecot start
  $ sudo testsaslauthd -u ubuntu -p ubuntu
  

  [Regression Potential]
  This fix relies on read(2) returning zero bytes when the connection is 
dropped, and that is clearly documented in its manpage:

  "On  success,  the  number  of  bytes read is returned (zero indicates
  end of file),"

  The select manpage also documents such a case being a valid case to
  indicate that a socket is ready to be read from, and that it won't
  block:

  "The file descriptors listed in readfds will be watched to see if
  characters become available for reading (more precisely, to see  if  a
  read will  not  block;  in  particular, a file descriptor is also
  ready on end-of-file)"

  This patch is what was used upstream, and is also present in bionic.

  I can't think of regressions specific to this change, other than the
  usual risk of rebuilding a widely used library (sasl2) in an
  environment different from the one from when xenial was released,
  i.e., different libraries available, maybe different system-wide build
  options, etc.

  [Other Info]
  Trusty is still not accounting for read() returning zero being an end-of-file 
case, but the loop there has a counter and it eventually exits, not leading to 
a perpetual spin or high cpu usage (see comment #17 for a brief history on how 
this fix was dropped in the xenial package).

  The fix is simple and could be applied there as well, if the SRU team
  prefers.

  
  [Original Description]

  sasl2-bin version 2.1.24~rc1.dfsg1+cvs2011-05-23-4ubuntu contains a
  bug that causes heavy cpu utilization, impacting normal operation of
  one of our mail servers following an upgrade to Ubuntu 12.04.

  We are running the daemon with the following options:

  /usr/sbin/saslauthd -a rimap -O our.imap.server -r -m
  /var/spool/postfix/var/run/saslauthd -n 5

  We noticed that users were unable to send mail and that the saslauthd
  processes were using approximately 100% of each cpu core. An strace of
  one of the runaway process showed that it was stuck in the following
  behaviour:

  select(9, [8], NULL, NULL, {0, 0})  = 1 (in [8], left {0, 0})
  read(8, "", 940)= 0
  select(9, [8], NULL, NULL, {0, 0})  = 1 (in [8], left {0, 0})
  read(8, "", 940)= 0
  select(9, [8], NULL, NULL, {0, 0})  = 1 (in [8], lef

[Touch-packages] [Bug 997217] Re: salsauthd maxes cpu

2018-10-22 Thread Roberto Suarez
I'm seeing this exact same problem with saslauthd in sasl2-bin
2.1.26.dfsg1-14, using Ubuntu 16.04.4. Is there any other possible
cause?

-- 
You received this bug notification because you are a member of Ubuntu
Touch seeded packages, which is subscribed to cyrus-sasl2 in Ubuntu.
https://bugs.launchpad.net/bugs/997217

Title:
  salsauthd maxes cpu

Status in cyrus-sasl2 package in Ubuntu:
  Fix Released
Status in cyrus-sasl2 source package in Precise:
  Triaged

Bug description:
  sasl2-bin version 2.1.24~rc1.dfsg1+cvs2011-05-23-4ubuntu contains a
  bug that causes heavy cpu utilization, impacting normal operation of
  one of our mail servers following an upgrade to Ubuntu 12.04.

  We are running the daemon with the following options:

  /usr/sbin/saslauthd -a rimap -O our.imap.server -r -m
  /var/spool/postfix/var/run/saslauthd -n 5

  
  We noticed that users were unable to send mail and that the saslauthd 
processes were using approximately 100% of each cpu core. An strace of one of 
the runaway process showed that it was stuck in the following behaviour:

  select(9, [8], NULL, NULL, {0, 0})  = 1 (in [8], left {0, 0})
  read(8, "", 940)= 0
  select(9, [8], NULL, NULL, {0, 0})  = 1 (in [8], left {0, 0})
  read(8, "", 940)= 0
  select(9, [8], NULL, NULL, {0, 0})  = 1 (in [8], left {0, 0})
  read(8, "", 940)= 0
  select(9, [8], NULL, NULL, {0, 0})  = 1 (in [8], left {0, 0})
  read(8, "", 940)= 0
  select(9, [8], NULL, NULL, {0, 0})  = 1 (in [8], left {0, 0})
  read(8, "", 940)= 0
  select(9, [8], NULL, NULL, {0, 0})  = 1 (in [8], left {0, 0})
  read(8, "", 940)= 0
  select(9, [8], NULL, NULL, {0, 0})  = 1 (in [8], left {0, 0})
  read(8, "", 940)= 0
  .

  
  with further inspection showing that the file descriptor in question was a 
socket connected to our imap server in CLOSE_WAIT.

  Browsing saslauthd/auth_rimap.c in the source package for sasl2-bin,
  we came across the following code, repeated in two locations:

  while( select (fds, &perm, NULL, NULL, &timeout ) >0 ) {
 if ( FD_ISSET(s, &perm) ) {
ret = read(s, rbuf+rc, sizeof(rbuf)-rc);
if ( ret<0 ) {
   rc = ret;
   break;
} else {
   rc += ret;
}
 }
  }

  
  It looks like this loop is expected to run until a read error is encountered 
or the timeout of 1 second is reached. There is no test to check that 0 bytes 
were read, indicating that the connection was closed by the remote peer. Since 
select() will immediately return the size of the set of the partially closed 
descriptor (1, which is >0), and calls to read() will always yield 0 bytes, 
there's the potential for execution to get stuck in this non blocking loop and 
I'm presuming that that's what's happening here.

  We've not performed any further analysis to prove that this is really
  what's happening but if my intuition is correct then our IMAP server
  (an nginx imap proxy) most liklely closes the connection at an
  unexpected time under as yet undetermined conditions.

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/cyrus-sasl2/+bug/997217/+subscriptions

-- 
Mailing list: https://launchpad.net/~touch-packages
Post to : touch-packages@lists.launchpad.net
Unsubscribe : https://launchpad.net/~touch-packages
More help   : https://help.launchpad.net/ListHelp


[Touch-packages] [Bug 997217] Re: salsauthd maxes cpu

2018-10-22 Thread Roberto Suarez
I reply to myself: while the version in Xenial is *still* affected with
this bug, any of the later versions is fixed. The version
2.1.26.dfsg1-14 is bugged, but the next one, 2.1.26.dfsg1-15 is not. I
backported the one in Bionic easily and it's working now without
problems.

-- 
You received this bug notification because you are a member of Ubuntu
Touch seeded packages, which is subscribed to cyrus-sasl2 in Ubuntu.
https://bugs.launchpad.net/bugs/997217

Title:
  salsauthd maxes cpu

Status in cyrus-sasl2 package in Ubuntu:
  Fix Released
Status in cyrus-sasl2 source package in Precise:
  Triaged

Bug description:
  sasl2-bin version 2.1.24~rc1.dfsg1+cvs2011-05-23-4ubuntu contains a
  bug that causes heavy cpu utilization, impacting normal operation of
  one of our mail servers following an upgrade to Ubuntu 12.04.

  We are running the daemon with the following options:

  /usr/sbin/saslauthd -a rimap -O our.imap.server -r -m
  /var/spool/postfix/var/run/saslauthd -n 5

  
  We noticed that users were unable to send mail and that the saslauthd 
processes were using approximately 100% of each cpu core. An strace of one of 
the runaway process showed that it was stuck in the following behaviour:

  select(9, [8], NULL, NULL, {0, 0})  = 1 (in [8], left {0, 0})
  read(8, "", 940)= 0
  select(9, [8], NULL, NULL, {0, 0})  = 1 (in [8], left {0, 0})
  read(8, "", 940)= 0
  select(9, [8], NULL, NULL, {0, 0})  = 1 (in [8], left {0, 0})
  read(8, "", 940)= 0
  select(9, [8], NULL, NULL, {0, 0})  = 1 (in [8], left {0, 0})
  read(8, "", 940)= 0
  select(9, [8], NULL, NULL, {0, 0})  = 1 (in [8], left {0, 0})
  read(8, "", 940)= 0
  select(9, [8], NULL, NULL, {0, 0})  = 1 (in [8], left {0, 0})
  read(8, "", 940)= 0
  select(9, [8], NULL, NULL, {0, 0})  = 1 (in [8], left {0, 0})
  read(8, "", 940)= 0
  .

  
  with further inspection showing that the file descriptor in question was a 
socket connected to our imap server in CLOSE_WAIT.

  Browsing saslauthd/auth_rimap.c in the source package for sasl2-bin,
  we came across the following code, repeated in two locations:

  while( select (fds, &perm, NULL, NULL, &timeout ) >0 ) {
 if ( FD_ISSET(s, &perm) ) {
ret = read(s, rbuf+rc, sizeof(rbuf)-rc);
if ( ret<0 ) {
   rc = ret;
   break;
} else {
   rc += ret;
}
 }
  }

  
  It looks like this loop is expected to run until a read error is encountered 
or the timeout of 1 second is reached. There is no test to check that 0 bytes 
were read, indicating that the connection was closed by the remote peer. Since 
select() will immediately return the size of the set of the partially closed 
descriptor (1, which is >0), and calls to read() will always yield 0 bytes, 
there's the potential for execution to get stuck in this non blocking loop and 
I'm presuming that that's what's happening here.

  We've not performed any further analysis to prove that this is really
  what's happening but if my intuition is correct then our IMAP server
  (an nginx imap proxy) most liklely closes the connection at an
  unexpected time under as yet undetermined conditions.

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/cyrus-sasl2/+bug/997217/+subscriptions

-- 
Mailing list: https://launchpad.net/~touch-packages
Post to : touch-packages@lists.launchpad.net
Unsubscribe : https://launchpad.net/~touch-packages
More help   : https://help.launchpad.net/ListHelp


[Touch-packages] [Bug 997217] Re: salsauthd maxes cpu

2018-10-23 Thread Andreas Hasenack
The patch 0034 mentioned in comment #10 is applied in the xenial package
2.1.26.dfsg1-14build1, so what Roberto hit could be a different issue
requiring a different fix.

Might have been this:
cyrus-sasl2 (2.1.26.dfsg1-15) unstable; urgency=medium

  * Add fix for auth_rimap infinite loop (hang) when IMAP server closes
connection (Closes: #815208)

Patch is 
https://bugs.debian.org/cgi-bin/bugreport.cgi?att=1;bug=815208;filename=auth_rimap_socket_closed.patch;msg=5:
--- a/saslauthd/auth_rimap.c
+++ b/saslauthd/auth_rimap.c
@@ -494,7 +494,7 @@
 while( select (fds, &perm, NULL, NULL, &timeout ) >0 ) {
if ( FD_ISSET(s, &perm) ) {
   ret = read(s, rbuf+rc, sizeof(rbuf)-rc);
-  if ( ret<0 ) {
+  if ( ret<=0 ) {
  rc = ret;
  break;
   } else {
@@ -607,7 +607,7 @@
 while( select (fds, &perm, NULL, NULL, &timeout ) >0 ) {
if ( FD_ISSET(s, &perm) ) {
   ret = read(s, rbuf+rc, sizeof(rbuf)-rc);
-  if ( ret<0 ) {
+  if ( ret<=0 ) {
  rc = ret;
  break;
   } else {


** Also affects: cyrus-sasl2 (Ubuntu Xenial)
   Importance: Undecided
   Status: New

** Changed in: cyrus-sasl2 (Ubuntu Precise)
   Status: Triaged => Won't Fix

-- 
You received this bug notification because you are a member of Ubuntu
Touch seeded packages, which is subscribed to cyrus-sasl2 in Ubuntu.
https://bugs.launchpad.net/bugs/997217

Title:
  salsauthd maxes cpu

Status in cyrus-sasl2 package in Ubuntu:
  Fix Released
Status in cyrus-sasl2 source package in Precise:
  Won't Fix
Status in cyrus-sasl2 source package in Trusty:
  Triaged
Status in cyrus-sasl2 source package in Xenial:
  Triaged

Bug description:
  sasl2-bin version 2.1.24~rc1.dfsg1+cvs2011-05-23-4ubuntu contains a
  bug that causes heavy cpu utilization, impacting normal operation of
  one of our mail servers following an upgrade to Ubuntu 12.04.

  We are running the daemon with the following options:

  /usr/sbin/saslauthd -a rimap -O our.imap.server -r -m
  /var/spool/postfix/var/run/saslauthd -n 5

  
  We noticed that users were unable to send mail and that the saslauthd 
processes were using approximately 100% of each cpu core. An strace of one of 
the runaway process showed that it was stuck in the following behaviour:

  select(9, [8], NULL, NULL, {0, 0})  = 1 (in [8], left {0, 0})
  read(8, "", 940)= 0
  select(9, [8], NULL, NULL, {0, 0})  = 1 (in [8], left {0, 0})
  read(8, "", 940)= 0
  select(9, [8], NULL, NULL, {0, 0})  = 1 (in [8], left {0, 0})
  read(8, "", 940)= 0
  select(9, [8], NULL, NULL, {0, 0})  = 1 (in [8], left {0, 0})
  read(8, "", 940)= 0
  select(9, [8], NULL, NULL, {0, 0})  = 1 (in [8], left {0, 0})
  read(8, "", 940)= 0
  select(9, [8], NULL, NULL, {0, 0})  = 1 (in [8], left {0, 0})
  read(8, "", 940)= 0
  select(9, [8], NULL, NULL, {0, 0})  = 1 (in [8], left {0, 0})
  read(8, "", 940)= 0
  .

  
  with further inspection showing that the file descriptor in question was a 
socket connected to our imap server in CLOSE_WAIT.

  Browsing saslauthd/auth_rimap.c in the source package for sasl2-bin,
  we came across the following code, repeated in two locations:

  while( select (fds, &perm, NULL, NULL, &timeout ) >0 ) {
 if ( FD_ISSET(s, &perm) ) {
ret = read(s, rbuf+rc, sizeof(rbuf)-rc);
if ( ret<0 ) {
   rc = ret;
   break;
} else {
   rc += ret;
}
 }
  }

  
  It looks like this loop is expected to run until a read error is encountered 
or the timeout of 1 second is reached. There is no test to check that 0 bytes 
were read, indicating that the connection was closed by the remote peer. Since 
select() will immediately return the size of the set of the partially closed 
descriptor (1, which is >0), and calls to read() will always yield 0 bytes, 
there's the potential for execution to get stuck in this non blocking loop and 
I'm presuming that that's what's happening here.

  We've not performed any further analysis to prove that this is really
  what's happening but if my intuition is correct then our IMAP server
  (an nginx imap proxy) most liklely closes the connection at an
  unexpected time under as yet undetermined conditions.

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/cyrus-sasl2/+bug/997217/+subscriptions

-- 
Mailing list: https://launchpad.net/~touch-packages
Post to : touch-packages@lists.launchpad.net
Unsubscribe : https://launchpad.net/~touch-packages
More help   : https://help.launchpad.net/ListHelp


[Touch-packages] [Bug 997217] Re: salsauthd maxes cpu

2018-10-23 Thread Andreas Hasenack
Trusty has 0034-fix_dovecot_authentication.patch, but it lacks the above
fix as well.

** Also affects: cyrus-sasl2 (Ubuntu Trusty)
   Importance: Undecided
   Status: New

** Tags removed: precise
** Tags added: server-next

** Changed in: cyrus-sasl2 (Ubuntu Trusty)
   Status: New => Triaged

** Changed in: cyrus-sasl2 (Ubuntu Xenial)
   Status: New => Triaged

-- 
You received this bug notification because you are a member of Ubuntu
Touch seeded packages, which is subscribed to cyrus-sasl2 in Ubuntu.
https://bugs.launchpad.net/bugs/997217

Title:
  salsauthd maxes cpu

Status in cyrus-sasl2 package in Ubuntu:
  Fix Released
Status in cyrus-sasl2 source package in Precise:
  Won't Fix
Status in cyrus-sasl2 source package in Trusty:
  Triaged
Status in cyrus-sasl2 source package in Xenial:
  Triaged

Bug description:
  sasl2-bin version 2.1.24~rc1.dfsg1+cvs2011-05-23-4ubuntu contains a
  bug that causes heavy cpu utilization, impacting normal operation of
  one of our mail servers following an upgrade to Ubuntu 12.04.

  We are running the daemon with the following options:

  /usr/sbin/saslauthd -a rimap -O our.imap.server -r -m
  /var/spool/postfix/var/run/saslauthd -n 5

  
  We noticed that users were unable to send mail and that the saslauthd 
processes were using approximately 100% of each cpu core. An strace of one of 
the runaway process showed that it was stuck in the following behaviour:

  select(9, [8], NULL, NULL, {0, 0})  = 1 (in [8], left {0, 0})
  read(8, "", 940)= 0
  select(9, [8], NULL, NULL, {0, 0})  = 1 (in [8], left {0, 0})
  read(8, "", 940)= 0
  select(9, [8], NULL, NULL, {0, 0})  = 1 (in [8], left {0, 0})
  read(8, "", 940)= 0
  select(9, [8], NULL, NULL, {0, 0})  = 1 (in [8], left {0, 0})
  read(8, "", 940)= 0
  select(9, [8], NULL, NULL, {0, 0})  = 1 (in [8], left {0, 0})
  read(8, "", 940)= 0
  select(9, [8], NULL, NULL, {0, 0})  = 1 (in [8], left {0, 0})
  read(8, "", 940)= 0
  select(9, [8], NULL, NULL, {0, 0})  = 1 (in [8], left {0, 0})
  read(8, "", 940)= 0
  .

  
  with further inspection showing that the file descriptor in question was a 
socket connected to our imap server in CLOSE_WAIT.

  Browsing saslauthd/auth_rimap.c in the source package for sasl2-bin,
  we came across the following code, repeated in two locations:

  while( select (fds, &perm, NULL, NULL, &timeout ) >0 ) {
 if ( FD_ISSET(s, &perm) ) {
ret = read(s, rbuf+rc, sizeof(rbuf)-rc);
if ( ret<0 ) {
   rc = ret;
   break;
} else {
   rc += ret;
}
 }
  }

  
  It looks like this loop is expected to run until a read error is encountered 
or the timeout of 1 second is reached. There is no test to check that 0 bytes 
were read, indicating that the connection was closed by the remote peer. Since 
select() will immediately return the size of the set of the partially closed 
descriptor (1, which is >0), and calls to read() will always yield 0 bytes, 
there's the potential for execution to get stuck in this non blocking loop and 
I'm presuming that that's what's happening here.

  We've not performed any further analysis to prove that this is really
  what's happening but if my intuition is correct then our IMAP server
  (an nginx imap proxy) most liklely closes the connection at an
  unexpected time under as yet undetermined conditions.

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/cyrus-sasl2/+bug/997217/+subscriptions

-- 
Mailing list: https://launchpad.net/~touch-packages
Post to : touch-packages@lists.launchpad.net
Unsubscribe : https://launchpad.net/~touch-packages
More help   : https://help.launchpad.net/ListHelp


[Touch-packages] [Bug 997217] Re: salsauthd maxes cpu

2018-10-23 Thread Andreas Hasenack
I reproduced the problem in xenial by timing a "testsaslauthd" call with
a "service dovecot stop" one (dovecot being my test imap server on
localhost).

This is good enough for an SRU test case, thanks all.

-- 
You received this bug notification because you are a member of Ubuntu
Touch seeded packages, which is subscribed to cyrus-sasl2 in Ubuntu.
https://bugs.launchpad.net/bugs/997217

Title:
  salsauthd maxes cpu

Status in cyrus-sasl2 package in Ubuntu:
  Fix Released
Status in cyrus-sasl2 source package in Precise:
  Won't Fix
Status in cyrus-sasl2 source package in Trusty:
  Triaged
Status in cyrus-sasl2 source package in Xenial:
  Triaged

Bug description:
  sasl2-bin version 2.1.24~rc1.dfsg1+cvs2011-05-23-4ubuntu contains a
  bug that causes heavy cpu utilization, impacting normal operation of
  one of our mail servers following an upgrade to Ubuntu 12.04.

  We are running the daemon with the following options:

  /usr/sbin/saslauthd -a rimap -O our.imap.server -r -m
  /var/spool/postfix/var/run/saslauthd -n 5

  
  We noticed that users were unable to send mail and that the saslauthd 
processes were using approximately 100% of each cpu core. An strace of one of 
the runaway process showed that it was stuck in the following behaviour:

  select(9, [8], NULL, NULL, {0, 0})  = 1 (in [8], left {0, 0})
  read(8, "", 940)= 0
  select(9, [8], NULL, NULL, {0, 0})  = 1 (in [8], left {0, 0})
  read(8, "", 940)= 0
  select(9, [8], NULL, NULL, {0, 0})  = 1 (in [8], left {0, 0})
  read(8, "", 940)= 0
  select(9, [8], NULL, NULL, {0, 0})  = 1 (in [8], left {0, 0})
  read(8, "", 940)= 0
  select(9, [8], NULL, NULL, {0, 0})  = 1 (in [8], left {0, 0})
  read(8, "", 940)= 0
  select(9, [8], NULL, NULL, {0, 0})  = 1 (in [8], left {0, 0})
  read(8, "", 940)= 0
  select(9, [8], NULL, NULL, {0, 0})  = 1 (in [8], left {0, 0})
  read(8, "", 940)= 0
  .

  
  with further inspection showing that the file descriptor in question was a 
socket connected to our imap server in CLOSE_WAIT.

  Browsing saslauthd/auth_rimap.c in the source package for sasl2-bin,
  we came across the following code, repeated in two locations:

  while( select (fds, &perm, NULL, NULL, &timeout ) >0 ) {
 if ( FD_ISSET(s, &perm) ) {
ret = read(s, rbuf+rc, sizeof(rbuf)-rc);
if ( ret<0 ) {
   rc = ret;
   break;
} else {
   rc += ret;
}
 }
  }

  
  It looks like this loop is expected to run until a read error is encountered 
or the timeout of 1 second is reached. There is no test to check that 0 bytes 
were read, indicating that the connection was closed by the remote peer. Since 
select() will immediately return the size of the set of the partially closed 
descriptor (1, which is >0), and calls to read() will always yield 0 bytes, 
there's the potential for execution to get stuck in this non blocking loop and 
I'm presuming that that's what's happening here.

  We've not performed any further analysis to prove that this is really
  what's happening but if my intuition is correct then our IMAP server
  (an nginx imap proxy) most liklely closes the connection at an
  unexpected time under as yet undetermined conditions.

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/cyrus-sasl2/+bug/997217/+subscriptions

-- 
Mailing list: https://launchpad.net/~touch-packages
Post to : touch-packages@lists.launchpad.net
Unsubscribe : https://launchpad.net/~touch-packages
More help   : https://help.launchpad.net/ListHelp


[Touch-packages] [Bug 997217] Re: salsauthd maxes cpu

2018-10-23 Thread Andreas Hasenack
** Changed in: cyrus-sasl2 (Ubuntu Xenial)
   Importance: Undecided => High

** Changed in: cyrus-sasl2 (Ubuntu Xenial)
 Assignee: (unassigned) => Andreas Hasenack (ahasenack)

** Changed in: cyrus-sasl2 (Ubuntu Trusty)
   Importance: Undecided => High

** Changed in: cyrus-sasl2 (Ubuntu Trusty)
 Assignee: (unassigned) => Andreas Hasenack (ahasenack)

-- 
You received this bug notification because you are a member of Ubuntu
Touch seeded packages, which is subscribed to cyrus-sasl2 in Ubuntu.
https://bugs.launchpad.net/bugs/997217

Title:
  salsauthd maxes cpu

Status in cyrus-sasl2 package in Ubuntu:
  Fix Released
Status in cyrus-sasl2 source package in Precise:
  Won't Fix
Status in cyrus-sasl2 source package in Trusty:
  Triaged
Status in cyrus-sasl2 source package in Xenial:
  Triaged

Bug description:
  sasl2-bin version 2.1.24~rc1.dfsg1+cvs2011-05-23-4ubuntu contains a
  bug that causes heavy cpu utilization, impacting normal operation of
  one of our mail servers following an upgrade to Ubuntu 12.04.

  We are running the daemon with the following options:

  /usr/sbin/saslauthd -a rimap -O our.imap.server -r -m
  /var/spool/postfix/var/run/saslauthd -n 5

  
  We noticed that users were unable to send mail and that the saslauthd 
processes were using approximately 100% of each cpu core. An strace of one of 
the runaway process showed that it was stuck in the following behaviour:

  select(9, [8], NULL, NULL, {0, 0})  = 1 (in [8], left {0, 0})
  read(8, "", 940)= 0
  select(9, [8], NULL, NULL, {0, 0})  = 1 (in [8], left {0, 0})
  read(8, "", 940)= 0
  select(9, [8], NULL, NULL, {0, 0})  = 1 (in [8], left {0, 0})
  read(8, "", 940)= 0
  select(9, [8], NULL, NULL, {0, 0})  = 1 (in [8], left {0, 0})
  read(8, "", 940)= 0
  select(9, [8], NULL, NULL, {0, 0})  = 1 (in [8], left {0, 0})
  read(8, "", 940)= 0
  select(9, [8], NULL, NULL, {0, 0})  = 1 (in [8], left {0, 0})
  read(8, "", 940)= 0
  select(9, [8], NULL, NULL, {0, 0})  = 1 (in [8], left {0, 0})
  read(8, "", 940)= 0
  .

  
  with further inspection showing that the file descriptor in question was a 
socket connected to our imap server in CLOSE_WAIT.

  Browsing saslauthd/auth_rimap.c in the source package for sasl2-bin,
  we came across the following code, repeated in two locations:

  while( select (fds, &perm, NULL, NULL, &timeout ) >0 ) {
 if ( FD_ISSET(s, &perm) ) {
ret = read(s, rbuf+rc, sizeof(rbuf)-rc);
if ( ret<0 ) {
   rc = ret;
   break;
} else {
   rc += ret;
}
 }
  }

  
  It looks like this loop is expected to run until a read error is encountered 
or the timeout of 1 second is reached. There is no test to check that 0 bytes 
were read, indicating that the connection was closed by the remote peer. Since 
select() will immediately return the size of the set of the partially closed 
descriptor (1, which is >0), and calls to read() will always yield 0 bytes, 
there's the potential for execution to get stuck in this non blocking loop and 
I'm presuming that that's what's happening here.

  We've not performed any further analysis to prove that this is really
  what's happening but if my intuition is correct then our IMAP server
  (an nginx imap proxy) most liklely closes the connection at an
  unexpected time under as yet undetermined conditions.

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/cyrus-sasl2/+bug/997217/+subscriptions

-- 
Mailing list: https://launchpad.net/~touch-packages
Post to : touch-packages@lists.launchpad.net
Unsubscribe : https://launchpad.net/~touch-packages
More help   : https://help.launchpad.net/ListHelp


[Touch-packages] [Bug 997217] Re: salsauthd maxes cpu

2018-10-24 Thread Andreas Hasenack
** Description changed:

+ [Impact]
+ 
+  * An explanation of the effects of the bug on users and
+ 
+  * justification for backporting the fix to the stable release.
+ 
+  * In addition, it is helpful, but not required, to include an
+explanation of how the upload fixes this bug.
+ 
+ [Test Case]
+ 
+  * detailed instructions how to reproduce the bug
+ 
+  * these should allow someone who is not familiar with the affected
+package to reproduce the bug and verify that the updated package fixes
+the problem.
+ 
+ [Regression Potential]
+ 
+  * discussion of how regressions are most likely to manifest as a result
+ of this change.
+ 
+  * It is assumed that any SRU candidate patch is well-tested before
+upload and has a low overall risk of regression, but it's important
+to make the effort to think about what ''could'' happen in the
+event of a regression.
+ 
+  * This both shows the SRU team that the risks have been considered,
+and provides guidance to testers in regression-testing the SRU.
+ 
+ [Other Info]
+  
+  * Anything else you think is useful to include
+  * Anticipate questions from users, SRU, +1 maintenance, security teams and 
the Technical Board
+  * and address these questions in advance
+ 
+ 
+ [Original Description]
+ 
  sasl2-bin version 2.1.24~rc1.dfsg1+cvs2011-05-23-4ubuntu contains a bug
  that causes heavy cpu utilization, impacting normal operation of one of
  our mail servers following an upgrade to Ubuntu 12.04.
  
  We are running the daemon with the following options:
  
  /usr/sbin/saslauthd -a rimap -O our.imap.server -r -m
  /var/spool/postfix/var/run/saslauthd -n 5
  
- 
- We noticed that users were unable to send mail and that the saslauthd 
processes were using approximately 100% of each cpu core. An strace of one of 
the runaway process showed that it was stuck in the following behaviour:
+ We noticed that users were unable to send mail and that the saslauthd
+ processes were using approximately 100% of each cpu core. An strace of
+ one of the runaway process showed that it was stuck in the following
+ behaviour:
  
  select(9, [8], NULL, NULL, {0, 0})  = 1 (in [8], left {0, 0})
  read(8, "", 940)= 0
  select(9, [8], NULL, NULL, {0, 0})  = 1 (in [8], left {0, 0})
  read(8, "", 940)= 0
  select(9, [8], NULL, NULL, {0, 0})  = 1 (in [8], left {0, 0})
  read(8, "", 940)= 0
  select(9, [8], NULL, NULL, {0, 0})  = 1 (in [8], left {0, 0})
  read(8, "", 940)= 0
  select(9, [8], NULL, NULL, {0, 0})  = 1 (in [8], left {0, 0})
  read(8, "", 940)= 0
  select(9, [8], NULL, NULL, {0, 0})  = 1 (in [8], left {0, 0})
  read(8, "", 940)= 0
  select(9, [8], NULL, NULL, {0, 0})  = 1 (in [8], left {0, 0})
  read(8, "", 940)= 0
  .
  
- 
- with further inspection showing that the file descriptor in question was a 
socket connected to our imap server in CLOSE_WAIT.
+ with further inspection showing that the file descriptor in question was
+ a socket connected to our imap server in CLOSE_WAIT.
  
  Browsing saslauthd/auth_rimap.c in the source package for sasl2-bin, we
  came across the following code, repeated in two locations:
  
  while( select (fds, &perm, NULL, NULL, &timeout ) >0 ) {
-if ( FD_ISSET(s, &perm) ) {
-   ret = read(s, rbuf+rc, sizeof(rbuf)-rc);
-   if ( ret<0 ) {
-  rc = ret;
-  break;
-   } else {
-  rc += ret;
-   }
-}
- }
+    if ( FD_ISSET(s, &perm) ) {
+   ret = read(s, rbuf+rc, sizeof(rbuf)-rc);
+   if ( ret<0 ) {
+  rc = ret;
+  break;
+   } else {
+  rc += ret;
+   }
+    }
+ }
  
- 
- It looks like this loop is expected to run until a read error is encountered 
or the timeout of 1 second is reached. There is no test to check that 0 bytes 
were read, indicating that the connection was closed by the remote peer. Since 
select() will immediately return the size of the set of the partially closed 
descriptor (1, which is >0), and calls to read() will always yield 0 bytes, 
there's the potential for execution to get stuck in this non blocking loop and 
I'm presuming that that's what's happening here.
+ It looks like this loop is expected to run until a read error is
+ encountered or the timeout of 1 second is reached. There is no test to
+ check that 0 bytes were read, indicating that the connection was closed
+ by the remote peer. Since select() will immediately return the size of
+ the set of the partially closed descriptor (1, which is >0), and calls
+ to read() will always yield 0 bytes, there's the potential for execution
+ to get stuck in this non blocking loop and I'm presuming that that's
+ what's hap

[Touch-packages] [Bug 997217] Re: salsauthd maxes cpu

2018-10-24 Thread Andreas Hasenack
** Description changed:

  [Impact]
  
-  * An explanation of the effects of the bug on users and
+ The rimap authentication mechanism in saslauthd can hit a condition
+ where it will start spinning and using all available CPU. This condition
+ can be easily encountered when an authentication is happening and the
+ imap service is being restarted.
  
-  * justification for backporting the fix to the stable release.
- 
-  * In addition, it is helpful, but not required, to include an
-explanation of how the upload fixes this bug.
+ Once the saslauthd daemon enters this loop, it stops responding to
+ authentication requests.
  
  [Test Case]
  
-  * detailed instructions how to reproduce the bug
+ This test can be performed in a LXD or VM.
  
-  * these should allow someone who is not familiar with the affected
-package to reproduce the bug and verify that the updated package fixes
-the problem.
+ * install the needed packages. mail-stack-delivery is used to have an
+ imap server available on localhost that needs no further configuration.
+ Accept the defaults for all debconf prompts:
+ 
+ sudo apt update
+ sudo apt install sasl2-bin mail-stack-delivery
+ 
+ * set the password "ubuntu" for the ubuntu user
+ echo ubuntu:ubuntu | sudo chpasswd
+ 
+ * start saslauthd like this:
+ sudo /usr/sbin/saslauthd -a rimap -O localhost -r -n 5
+ 
+ * restart dovecot
+ sudo service dovecot restart
+ 
+ * test saslauthd authentication:
+ $ sudo testsaslauthd -u ubuntu -p ubuntu
+ 0: OK "Success."
+ 
+ * Now let's break it. In one terminal watch the output of top:
+ top
+ 
+ * in another terminal, run the following:
+ sudo testsaslauthd -u ubuntu -p ubuntu & sleep 1; sudo service dovecot stop
+ 
+ * observe in the "top" terminal that saslauthd is consuming a lot of
+ cpu. If that's not happening, try starting dovecot again and adjusting
+ the sleep value in the previous test command.
+ 
+ * Try authenticating again, while saslauthd is spinning. It should fail:
+ $ sudo testsaslauthd -u ubuntu -p ubuntu  
+ 0: NO "authentication failed"
+ 
  
  [Regression Potential]
  
-  * discussion of how regressions are most likely to manifest as a result
+  * discussion of how regressions are most likely to manifest as a result
  of this change.
  
-  * It is assumed that any SRU candidate patch is well-tested before
-upload and has a low overall risk of regression, but it's important
-to make the effort to think about what ''could'' happen in the
-event of a regression.
+  * It is assumed that any SRU candidate patch is well-tested before
+    upload and has a low overall risk of regression, but it's important
+    to make the effort to think about what ''could'' happen in the
+    event of a regression.
  
-  * This both shows the SRU team that the risks have been considered,
-and provides guidance to testers in regression-testing the SRU.
+  * This both shows the SRU team that the risks have been considered,
+    and provides guidance to testers in regression-testing the SRU.
  
  [Other Info]
-  
-  * Anything else you think is useful to include
-  * Anticipate questions from users, SRU, +1 maintenance, security teams and 
the Technical Board
-  * and address these questions in advance
  
+  * Anything else you think is useful to include
+  * Anticipate questions from users, SRU, +1 maintenance, security teams and 
the Technical Board
+  * and address these questions in advance
  
  [Original Description]
  
  sasl2-bin version 2.1.24~rc1.dfsg1+cvs2011-05-23-4ubuntu contains a bug
  that causes heavy cpu utilization, impacting normal operation of one of
  our mail servers following an upgrade to Ubuntu 12.04.
  
  We are running the daemon with the following options:
  
  /usr/sbin/saslauthd -a rimap -O our.imap.server -r -m
  /var/spool/postfix/var/run/saslauthd -n 5
  
  We noticed that users were unable to send mail and that the saslauthd
  processes were using approximately 100% of each cpu core. An strace of
  one of the runaway process showed that it was stuck in the following
  behaviour:
  
  select(9, [8], NULL, NULL, {0, 0})  = 1 (in [8], left {0, 0})
  read(8, "", 940)= 0
  select(9, [8], NULL, NULL, {0, 0})  = 1 (in [8], left {0, 0})
  read(8, "", 940)= 0
  select(9, [8], NULL, NULL, {0, 0})  = 1 (in [8], left {0, 0})
  read(8, "", 940)= 0
  select(9, [8], NULL, NULL, {0, 0})  = 1 (in [8], left {0, 0})
  read(8, "", 940)= 0
  select(9, [8], NULL, NULL, {0, 0})  = 1 (in [8], left {0, 0})
  read(8, "", 940)= 0
  select(9, [8], NULL, NULL, {0, 0})  = 1 (in [8], left {0, 0})
  read(8, "", 940)= 0
  select(9, [8], NULL, NULL, {0, 0})  = 1 (in [8], left {0, 0})
  read(8, "", 940)= 0
  .
  
  with further inspection showing that the file descriptor in question was
  a socket connected to our imap server 

[Touch-packages] [Bug 997217] Re: salsauthd maxes cpu

2018-10-24 Thread Andreas Hasenack
** Description changed:

  [Impact]
  
  The rimap authentication mechanism in saslauthd can hit a condition
  where it will start spinning and using all available CPU. This condition
  can be easily encountered when an authentication is happening and the
  imap service is being restarted.
- 
- Once the saslauthd daemon enters this loop, it stops responding to
- authentication requests.
  
  [Test Case]
  
  This test can be performed in a LXD or VM.
  
  * install the needed packages. mail-stack-delivery is used to have an
  imap server available on localhost that needs no further configuration.
  Accept the defaults for all debconf prompts:
  
  sudo apt update
  sudo apt install sasl2-bin mail-stack-delivery
  
  * set the password "ubuntu" for the ubuntu user
  echo ubuntu:ubuntu | sudo chpasswd
  
  * start saslauthd like this:
  sudo /usr/sbin/saslauthd -a rimap -O localhost -r -n 5
  
  * restart dovecot
  sudo service dovecot restart
  
  * test saslauthd authentication:
  $ sudo testsaslauthd -u ubuntu -p ubuntu
  0: OK "Success."
  
  * Now let's break it. In one terminal watch the output of top:
  top
  
  * in another terminal, run the following:
  sudo testsaslauthd -u ubuntu -p ubuntu & sleep 1; sudo service dovecot stop
  
  * observe in the "top" terminal that saslauthd is consuming a lot of
  cpu. If that's not happening, try starting dovecot again and adjusting
  the sleep value in the previous test command.
- 
- * Try authenticating again, while saslauthd is spinning. It should fail:
- $ sudo testsaslauthd -u ubuntu -p ubuntu  
- 0: NO "authentication failed"
  
  
  [Regression Potential]
  
   * discussion of how regressions are most likely to manifest as a result
  of this change.
  
   * It is assumed that any SRU candidate patch is well-tested before
     upload and has a low overall risk of regression, but it's important
     to make the effort to think about what ''could'' happen in the
     event of a regression.
  
   * This both shows the SRU team that the risks have been considered,
     and provides guidance to testers in regression-testing the SRU.
  
  [Other Info]
  
   * Anything else you think is useful to include
   * Anticipate questions from users, SRU, +1 maintenance, security teams and 
the Technical Board
   * and address these questions in advance
  
  [Original Description]
  
  sasl2-bin version 2.1.24~rc1.dfsg1+cvs2011-05-23-4ubuntu contains a bug
  that causes heavy cpu utilization, impacting normal operation of one of
  our mail servers following an upgrade to Ubuntu 12.04.
  
  We are running the daemon with the following options:
  
  /usr/sbin/saslauthd -a rimap -O our.imap.server -r -m
  /var/spool/postfix/var/run/saslauthd -n 5
  
  We noticed that users were unable to send mail and that the saslauthd
  processes were using approximately 100% of each cpu core. An strace of
  one of the runaway process showed that it was stuck in the following
  behaviour:
  
  select(9, [8], NULL, NULL, {0, 0})  = 1 (in [8], left {0, 0})
  read(8, "", 940)= 0
  select(9, [8], NULL, NULL, {0, 0})  = 1 (in [8], left {0, 0})
  read(8, "", 940)= 0
  select(9, [8], NULL, NULL, {0, 0})  = 1 (in [8], left {0, 0})
  read(8, "", 940)= 0
  select(9, [8], NULL, NULL, {0, 0})  = 1 (in [8], left {0, 0})
  read(8, "", 940)= 0
  select(9, [8], NULL, NULL, {0, 0})  = 1 (in [8], left {0, 0})
  read(8, "", 940)= 0
  select(9, [8], NULL, NULL, {0, 0})  = 1 (in [8], left {0, 0})
  read(8, "", 940)= 0
  select(9, [8], NULL, NULL, {0, 0})  = 1 (in [8], left {0, 0})
  read(8, "", 940)= 0
  .
  
  with further inspection showing that the file descriptor in question was
  a socket connected to our imap server in CLOSE_WAIT.
  
  Browsing saslauthd/auth_rimap.c in the source package for sasl2-bin, we
  came across the following code, repeated in two locations:
  
  while( select (fds, &perm, NULL, NULL, &timeout ) >0 ) {
     if ( FD_ISSET(s, &perm) ) {
    ret = read(s, rbuf+rc, sizeof(rbuf)-rc);
    if ( ret<0 ) {
   rc = ret;
   break;
    } else {
   rc += ret;
    }
     }
  }
  
  It looks like this loop is expected to run until a read error is
  encountered or the timeout of 1 second is reached. There is no test to
  check that 0 bytes were read, indicating that the connection was closed
  by the remote peer. Since select() will immediately return the size of
  the set of the partially closed descriptor (1, which is >0), and calls
  to read() will always yield 0 bytes, there's the potential for execution
  to get stuck in this non blocking loop and I'm presuming that that's
  what's happening here.
  
  We've not performed any further analysis to prove that this is really
  w

[Touch-packages] [Bug 997217] Re: salsauthd maxes cpu

2018-10-24 Thread Andreas Hasenack
I'm unable to trigger this bug when running the test case on trusty,
even though the same code is there. I also tried adjusting the sleep.

-- 
You received this bug notification because you are a member of Ubuntu
Touch seeded packages, which is subscribed to cyrus-sasl2 in Ubuntu.
https://bugs.launchpad.net/bugs/997217

Title:
  salsauthd maxes cpu

Status in cyrus-sasl2 package in Ubuntu:
  Fix Released
Status in cyrus-sasl2 source package in Precise:
  Won't Fix
Status in cyrus-sasl2 source package in Trusty:
  Triaged
Status in cyrus-sasl2 source package in Xenial:
  Triaged

Bug description:
  [Impact]

  The rimap authentication mechanism in saslauthd can hit a condition
  where it will start spinning and using all available CPU. This
  condition can be easily encountered when an authentication is
  happening and the imap service is being restarted.

  Furthermore, the saslauthd child process that picked up that
  authentication request and that is spinning now won't be reaped nor
  can it service further requests. If all children are left in this
  state, the authentication service as a whole won't be working anymore.

  [Test Case]

  This test can be performed in a LXD or VM.

  * install the needed packages. mail-stack-delivery is used to have an
  imap server available on localhost that needs no further
  configuration. Accept the defaults for all debconf prompts:

  sudo apt update
  sudo apt install sasl2-bin mail-stack-delivery

  * set the password "ubuntu" for the ubuntu user
  echo ubuntu:ubuntu | sudo chpasswd

  * start saslauthd like this, with just one child:
  sudo /usr/sbin/saslauthd -a rimap -O localhost -r -n 1

  * restart dovecot
  sudo service dovecot restart

  * test saslauthd authentication:
  $ sudo testsaslauthd -u ubuntu -p ubuntu
  0: OK "Success."

  * Now let's break it. In one terminal watch the output of top:
  top

  * in another terminal, run the following:
  sudo testsaslauthd -u ubuntu -p ubuntu & sleep 1; sudo service dovecot stop

  * observe in the "top" terminal that saslauthd is consuming a lot of
  cpu. If that's not happening, try starting dovecot again and adjusting
  the sleep value in the previous test command.

  * repeat the authentication request. Since the only saslauthd child is now 
spinning, this will block:
  $ sudo testsaslauthd -u ubuntu -p ubuntu
  

  
  [Regression Potential]

   * discussion of how regressions are most likely to manifest as a
  result of this change.

   * It is assumed that any SRU candidate patch is well-tested before
     upload and has a low overall risk of regression, but it's important
     to make the effort to think about what ''could'' happen in the
     event of a regression.

   * This both shows the SRU team that the risks have been considered,
     and provides guidance to testers in regression-testing the SRU.

  [Other Info]

   * Anything else you think is useful to include
   * Anticipate questions from users, SRU, +1 maintenance, security teams and 
the Technical Board
   * and address these questions in advance

  [Original Description]

  sasl2-bin version 2.1.24~rc1.dfsg1+cvs2011-05-23-4ubuntu contains a
  bug that causes heavy cpu utilization, impacting normal operation of
  one of our mail servers following an upgrade to Ubuntu 12.04.

  We are running the daemon with the following options:

  /usr/sbin/saslauthd -a rimap -O our.imap.server -r -m
  /var/spool/postfix/var/run/saslauthd -n 5

  We noticed that users were unable to send mail and that the saslauthd
  processes were using approximately 100% of each cpu core. An strace of
  one of the runaway process showed that it was stuck in the following
  behaviour:

  select(9, [8], NULL, NULL, {0, 0})  = 1 (in [8], left {0, 0})
  read(8, "", 940)= 0
  select(9, [8], NULL, NULL, {0, 0})  = 1 (in [8], left {0, 0})
  read(8, "", 940)= 0
  select(9, [8], NULL, NULL, {0, 0})  = 1 (in [8], left {0, 0})
  read(8, "", 940)= 0
  select(9, [8], NULL, NULL, {0, 0})  = 1 (in [8], left {0, 0})
  read(8, "", 940)= 0
  select(9, [8], NULL, NULL, {0, 0})  = 1 (in [8], left {0, 0})
  read(8, "", 940)= 0
  select(9, [8], NULL, NULL, {0, 0})  = 1 (in [8], left {0, 0})
  read(8, "", 940)= 0
  select(9, [8], NULL, NULL, {0, 0})  = 1 (in [8], left {0, 0})
  read(8, "", 940)= 0
  .

  with further inspection showing that the file descriptor in question
  was a socket connected to our imap server in CLOSE_WAIT.

  Browsing saslauthd/auth_rimap.c in the source package for sasl2-bin,
  we came across the following code, repeated in two locations:

  while( select (fds, &perm, NULL, NULL, &timeout ) >0 ) {
     if ( FD_ISSET(s, &perm) ) {
    ret = read(s, rbuf+rc, sizeof(rbuf)-rc);
    if ( ret<0 ) {
   rc = ret;
   

[Touch-packages] [Bug 997217] Re: salsauthd maxes cpu

2018-10-24 Thread Andreas Hasenack
Ok, trusty isn't affected because the loop there has an exit clause:
+  ret = read(s, rbuf+rc, sizeof(rbuf)-rc);
+  if ( ret<0 ) {
+ rc = ret;
+ break;
+  } else {
+ if (ret == 0) {
+   loopc += 1;
+ } else {
+   loopc = 0;
+ }
+ if (loopc > sizeof(rbuf)) { // arbitrary chosen value
+   break;
+ }


That comes from trusty's patch named 0034-fix_dovecot_authentication.patch. So 
trusty does loop for a bit (sizeof(rbuf) is 1000), but won't get stuck. Someone 
added the loop counter as a safety net, but didn't change the "ret<0" check 
into "ret<=0" which would also have fixed this.

In precise, that same patch (0034) adds the loop, but *without* an exit
clause, hence this bug.

Xenial is interesting. Upstream at some point adopted the patch that
does *NOT* exit the loop, but the code is so similar that someone
decided the patch from the package was already applied and dropped it
from the package, reintroducing the bug.

To add to the confusion, in xenial that patch file was super slightly renamed 
from 0034_fix_dovecot_authentication.patch to 
0034-fix_dovecot_authentication.patch (can you spot what changed?) and got 
totally different contents:
--- cyrus-sasl2.orig/lib/checkpw.c
+++ cyrus-sasl2/lib/checkpw.c
@@ -587,16 +587,14 @@ static int read_wait(int fd, unsigned de
/* Timeout. */
errno = ETIMEDOUT;
return -1;
-   case +1:
-   if (FD_ISSET(fd, &rfds)) {
-   /* Success, file descriptor is readable. */
-   return 0;
-   }
-   return -1;
case -1:
if (errno == EINTR || errno == EAGAIN)
continue;
default:
+   if (FD_ISSET(fd, &rfds)) {
+   /* Success, file descriptor is readable. */
+   return 0;
+   }
/* Error catch-all. */
return -1;
}

>From bionic onwards, the upstream version has the loop with no loop
counter, but it checks read()'s result for <= 0, not just 0, so it's
fixed there.

Bottom line, only xenial is currently affected (and precise, but precise
is EOL).

** Changed in: cyrus-sasl2 (Ubuntu Trusty)
   Status: Triaged => Invalid

-- 
You received this bug notification because you are a member of Ubuntu
Touch seeded packages, which is subscribed to cyrus-sasl2 in Ubuntu.
https://bugs.launchpad.net/bugs/997217

Title:
  salsauthd maxes cpu

Status in cyrus-sasl2 package in Ubuntu:
  Fix Released
Status in cyrus-sasl2 source package in Precise:
  Won't Fix
Status in cyrus-sasl2 source package in Trusty:
  Invalid
Status in cyrus-sasl2 source package in Xenial:
  Triaged

Bug description:
  [Impact]

  The rimap authentication mechanism in saslauthd can hit a condition
  where it will start spinning and using all available CPU. This
  condition can be easily encountered when an authentication is
  happening and the imap service is being restarted.

  Furthermore, the saslauthd child process that picked up that
  authentication request and that is spinning now won't be reaped nor
  can it service further requests. If all children are left in this
  state, the authentication service as a whole won't be working anymore.

  [Test Case]

  This test can be performed in a LXD or VM.

  * install the needed packages. mail-stack-delivery is used to have an
  imap server available on localhost that needs no further
  configuration. Accept the defaults for all debconf prompts:

  sudo apt update
  sudo apt install sasl2-bin mail-stack-delivery

  * set the password "ubuntu" for the ubuntu user
  echo ubuntu:ubuntu | sudo chpasswd

  * start saslauthd like this, with just one child:
  sudo /usr/sbin/saslauthd -a rimap -O localhost -r -n 1

  * restart dovecot
  sudo service dovecot restart

  * test saslauthd authentication:
  $ sudo testsaslauthd -u ubuntu -p ubuntu
  0: OK "Success."

  * Now let's break it. In one terminal watch the output of top:
  top

  * in another terminal, run the following:
  sudo testsaslauthd -u ubuntu -p ubuntu & sleep 1; sudo service dovecot stop

  * observe in the "top" terminal that saslauthd is consuming a lot of
  cpu. If that's not happening, try starting dovecot again and adjusting
  the sleep value in the previous test command, but 1s was enough in all
  my runs.

  * start dovecot and repeat the authentication request. Since the only 
saslauthd child is now spinning, this will block:
  sudo service dovecot start
  $ sudo testsaslauthd -u ubuntu -p ubuntu
  

  [Regression Potential]

   * discussion of how regressions are most likely to manifest as a
  result of this change.

   * It is assumed that any SRU candidate patch is well-tested before
     upload and has a low overall risk of regression, but it's important
     to make the effort to think about what ''could'' happen in the
     event of a regression.

   * This both shows the SRU team that the risks have been cons

[Touch-packages] [Bug 997217] Re: salsauthd maxes cpu

2018-10-24 Thread Launchpad Bug Tracker
** Merge proposal linked:
   
https://code.launchpad.net/~ahasenack/ubuntu/+source/cyrus-sasl2/+git/cyrus-sasl2/+merge/357779

-- 
You received this bug notification because you are a member of Ubuntu
Touch seeded packages, which is subscribed to cyrus-sasl2 in Ubuntu.
https://bugs.launchpad.net/bugs/997217

Title:
  salsauthd maxes cpu

Status in cyrus-sasl2 package in Ubuntu:
  Fix Released
Status in cyrus-sasl2 source package in Precise:
  Won't Fix
Status in cyrus-sasl2 source package in Trusty:
  Invalid
Status in cyrus-sasl2 source package in Xenial:
  In Progress

Bug description:
  [Impact]

  The rimap authentication mechanism in saslauthd can hit a condition
  where it will start spinning and using all available CPU. This
  condition can be easily encountered when an authentication is
  happening and the imap service is being restarted.

  Furthermore, the saslauthd child process that picked up that
  authentication request and that is spinning now won't be reaped nor
  can it service further requests. If all children are left in this
  state, the authentication service as a whole won't be working anymore.

  [Test Case]

  This test can be performed in a LXD or VM.

  * install the needed packages. mail-stack-delivery is used to have an
  imap server available on localhost that needs no further
  configuration. Accept the defaults for all debconf prompts:

  sudo apt update
  sudo apt install sasl2-bin mail-stack-delivery

  * set the password "ubuntu" for the ubuntu user
  echo ubuntu:ubuntu | sudo chpasswd

  * start saslauthd like this, with just one child:
  sudo /usr/sbin/saslauthd -a rimap -O localhost -r -n 1

  * restart dovecot
  sudo service dovecot restart

  * test saslauthd authentication:
  $ sudo testsaslauthd -u ubuntu -p ubuntu
  0: OK "Success."

  * Now let's break it. In one terminal watch the output of top:
  top

  * in another terminal, run the following:
  sudo testsaslauthd -u ubuntu -p ubuntu & sleep 1; sudo service dovecot stop

  * observe in the "top" terminal that saslauthd is consuming a lot of
  cpu. If that's not happening, try starting dovecot again and adjusting
  the sleep value in the previous test command, but 1s was enough in all
  my runs.

  * start dovecot and repeat the authentication request. Since the only 
saslauthd child is now spinning, this will block:
  sudo service dovecot start
  $ sudo testsaslauthd -u ubuntu -p ubuntu
  

  [Regression Potential]
  This fix relies on read(2) returning zero bytes when the connection is 
dropped, and that is clearly documented in its manpage:

  "On  success,  the  number  of  bytes read is returned (zero indicates
  end of file),"

  The select manpage also documents such a case being a valid case to
  indicate that a socket is ready to be read from, and that it won't
  block:

  "The file descriptors listed in readfds will be watched to see if
  characters become available for reading (more precisely, to see  if  a
  read will  not  block;  in  particular, a file descriptor is also
  ready on end-of-file)"

  This patch is what was used upstream, and is also present in bionic.

  I can't think of regressions specific to this change, other than the
  usual risk of rebuilding a widely used library (sasl2) in an
  environment different from the one from when xenial was released,
  i.e., different libraries available, maybe different system-wide build
  options, etc.

  [Other Info]
  Trusty is still not accounting for read() returning zero being an end-of-file 
case, but the loop there has a counter and it eventually exits, not leading to 
a perpetual spin or high cpu usage (see comment #17 for a brief history on how 
this fix was dropped in the xenial package).

  The fix is simple and could be applied there as well, if the SRU team
  prefers.

  
  [Original Description]

  sasl2-bin version 2.1.24~rc1.dfsg1+cvs2011-05-23-4ubuntu contains a
  bug that causes heavy cpu utilization, impacting normal operation of
  one of our mail servers following an upgrade to Ubuntu 12.04.

  We are running the daemon with the following options:

  /usr/sbin/saslauthd -a rimap -O our.imap.server -r -m
  /var/spool/postfix/var/run/saslauthd -n 5

  We noticed that users were unable to send mail and that the saslauthd
  processes were using approximately 100% of each cpu core. An strace of
  one of the runaway process showed that it was stuck in the following
  behaviour:

  select(9, [8], NULL, NULL, {0, 0})  = 1 (in [8], left {0, 0})
  read(8, "", 940)= 0
  select(9, [8], NULL, NULL, {0, 0})  = 1 (in [8], left {0, 0})
  read(8, "", 940)= 0
  select(9, [8], NULL, NULL, {0, 0})  = 1 (in [8], left {0, 0})
  read(8, "", 940)= 0
  select(9, [8], NULL, NULL, {0, 0})  = 1 (in [8], left {0, 0})
  read(8, "", 940)= 0
  select(9, [8], NULL, NULL, {0, 0})  = 1 (in [8], left {0, 0})
  read(8, "", 940)

[Touch-packages] [Bug 997217] Re: salsauthd maxes cpu

2018-10-24 Thread Andreas Hasenack
** Description changed:

  [Impact]
  
  The rimap authentication mechanism in saslauthd can hit a condition
  where it will start spinning and using all available CPU. This condition
  can be easily encountered when an authentication is happening and the
  imap service is being restarted.
  
  Furthermore, the saslauthd child process that picked up that
  authentication request and that is spinning now won't be reaped nor can
  it service further requests. If all children are left in this state, the
  authentication service as a whole won't be working anymore.
  
  [Test Case]
  
  This test can be performed in a LXD or VM.
  
  * install the needed packages. mail-stack-delivery is used to have an
  imap server available on localhost that needs no further configuration.
  Accept the defaults for all debconf prompts:
  
  sudo apt update
  sudo apt install sasl2-bin mail-stack-delivery
  
  * set the password "ubuntu" for the ubuntu user
  echo ubuntu:ubuntu | sudo chpasswd
  
  * start saslauthd like this, with just one child:
  sudo /usr/sbin/saslauthd -a rimap -O localhost -r -n 1
  
  * restart dovecot
  sudo service dovecot restart
  
  * test saslauthd authentication:
  $ sudo testsaslauthd -u ubuntu -p ubuntu
  0: OK "Success."
  
  * Now let's break it. In one terminal watch the output of top:
  top
  
  * in another terminal, run the following:
  sudo testsaslauthd -u ubuntu -p ubuntu & sleep 1; sudo service dovecot stop
  
  * observe in the "top" terminal that saslauthd is consuming a lot of
  cpu. If that's not happening, try starting dovecot again and adjusting
- the sleep value in the previous test command.
+ the sleep value in the previous test command, but 1s was enough in all
+ my runs.
  
- * repeat the authentication request. Since the only saslauthd child is now 
spinning, this will block:
+ * start dovecot and repeat the authentication request. Since the only 
saslauthd child is now spinning, this will block:
+ sudo service dovecot start
  $ sudo testsaslauthd -u ubuntu -p ubuntu
  
- 
  
  [Regression Potential]
  
   * discussion of how regressions are most likely to manifest as a result
  of this change.
  
   * It is assumed that any SRU candidate patch is well-tested before
     upload and has a low overall risk of regression, but it's important
     to make the effort to think about what ''could'' happen in the
     event of a regression.
  
   * This both shows the SRU team that the risks have been considered,
     and provides guidance to testers in regression-testing the SRU.
  
  [Other Info]
  
   * Anything else you think is useful to include
   * Anticipate questions from users, SRU, +1 maintenance, security teams and 
the Technical Board
   * and address these questions in advance
  
  [Original Description]
  
  sasl2-bin version 2.1.24~rc1.dfsg1+cvs2011-05-23-4ubuntu contains a bug
  that causes heavy cpu utilization, impacting normal operation of one of
  our mail servers following an upgrade to Ubuntu 12.04.
  
  We are running the daemon with the following options:
  
  /usr/sbin/saslauthd -a rimap -O our.imap.server -r -m
  /var/spool/postfix/var/run/saslauthd -n 5
  
  We noticed that users were unable to send mail and that the saslauthd
  processes were using approximately 100% of each cpu core. An strace of
  one of the runaway process showed that it was stuck in the following
  behaviour:
  
  select(9, [8], NULL, NULL, {0, 0})  = 1 (in [8], left {0, 0})
  read(8, "", 940)= 0
  select(9, [8], NULL, NULL, {0, 0})  = 1 (in [8], left {0, 0})
  read(8, "", 940)= 0
  select(9, [8], NULL, NULL, {0, 0})  = 1 (in [8], left {0, 0})
  read(8, "", 940)= 0
  select(9, [8], NULL, NULL, {0, 0})  = 1 (in [8], left {0, 0})
  read(8, "", 940)= 0
  select(9, [8], NULL, NULL, {0, 0})  = 1 (in [8], left {0, 0})
  read(8, "", 940)= 0
  select(9, [8], NULL, NULL, {0, 0})  = 1 (in [8], left {0, 0})
  read(8, "", 940)= 0
  select(9, [8], NULL, NULL, {0, 0})  = 1 (in [8], left {0, 0})
  read(8, "", 940)= 0
  .
  
  with further inspection showing that the file descriptor in question was
  a socket connected to our imap server in CLOSE_WAIT.
  
  Browsing saslauthd/auth_rimap.c in the source package for sasl2-bin, we
  came across the following code, repeated in two locations:
  
  while( select (fds, &perm, NULL, NULL, &timeout ) >0 ) {
     if ( FD_ISSET(s, &perm) ) {
    ret = read(s, rbuf+rc, sizeof(rbuf)-rc);
    if ( ret<0 ) {
   rc = ret;
   break;
    } else {
   rc += ret;
    }
     }
  }
  
  It looks like this loop is expected to run until a read error is
  encountered or the timeout of 1 second is reached. There is no test to
  check that 0 bytes were read, ind

[Touch-packages] [Bug 997217] Re: salsauthd maxes cpu

2018-10-24 Thread Andreas Hasenack
** Description changed:

  [Impact]
  
  The rimap authentication mechanism in saslauthd can hit a condition
  where it will start spinning and using all available CPU. This condition
  can be easily encountered when an authentication is happening and the
  imap service is being restarted.
  
  Furthermore, the saslauthd child process that picked up that
  authentication request and that is spinning now won't be reaped nor can
  it service further requests. If all children are left in this state, the
  authentication service as a whole won't be working anymore.
  
  [Test Case]
  
  This test can be performed in a LXD or VM.
  
  * install the needed packages. mail-stack-delivery is used to have an
  imap server available on localhost that needs no further configuration.
  Accept the defaults for all debconf prompts:
  
  sudo apt update
  sudo apt install sasl2-bin mail-stack-delivery
  
  * set the password "ubuntu" for the ubuntu user
  echo ubuntu:ubuntu | sudo chpasswd
  
  * start saslauthd like this, with just one child:
  sudo /usr/sbin/saslauthd -a rimap -O localhost -r -n 1
  
  * restart dovecot
  sudo service dovecot restart
  
  * test saslauthd authentication:
  $ sudo testsaslauthd -u ubuntu -p ubuntu
  0: OK "Success."
  
  * Now let's break it. In one terminal watch the output of top:
  top
  
  * in another terminal, run the following:
  sudo testsaslauthd -u ubuntu -p ubuntu & sleep 1; sudo service dovecot stop
  
  * observe in the "top" terminal that saslauthd is consuming a lot of
  cpu. If that's not happening, try starting dovecot again and adjusting
  the sleep value in the previous test command, but 1s was enough in all
  my runs.
  
  * start dovecot and repeat the authentication request. Since the only 
saslauthd child is now spinning, this will block:
  sudo service dovecot start
  $ sudo testsaslauthd -u ubuntu -p ubuntu
  
  
  [Regression Potential]
+ This fix relies on read(2) returning zero bytes when the connection is 
dropped, and that is clearly documented in its manpage:
  
-  * discussion of how regressions are most likely to manifest as a result
- of this change.
+ "On  success,  the  number  of  bytes read is returned (zero indicates
+ end of file),"
  
-  * It is assumed that any SRU candidate patch is well-tested before
-    upload and has a low overall risk of regression, but it's important
-    to make the effort to think about what ''could'' happen in the
-    event of a regression.
+ The select manpage also documents such a case being a valid case to
+ indicate that a socket is ready to be read from, and that it won't
+ block:
  
-  * This both shows the SRU team that the risks have been considered,
-    and provides guidance to testers in regression-testing the SRU.
+ "The file descriptors listed in readfds will be watched to see if
+ characters become available for reading (more precisely, to see  if  a
+ read will  not  block;  in  particular, a file descriptor is also ready
+ on end-of-file)"
+ 
+ This patch is what was used upstream, and is also present in bionic.
+ 
+ I can't think of regressions specific to this change, other than the
+ usual risk of rebuilding a widely used library (sasl2) in an environment
+ different from the one from when xenial was released, i.e., different
+ libraries available, maybe different system-wide build options, etc.
  
  [Other Info]
+ Trusty is still not accounting for read() returning zero being an end-of-file 
case, but the loop there has a counter and it eventually exits, not leading to 
a perpetual spin or high cpu usage (see comment #17 for a brief history on how 
this fix was dropped in the xenial package).
  
-  * Anything else you think is useful to include
-  * Anticipate questions from users, SRU, +1 maintenance, security teams and 
the Technical Board
-  * and address these questions in advance
+ The fix is simple and could be applied there as well, if the SRU team
+ prefers.
+ 
  
  [Original Description]
  
  sasl2-bin version 2.1.24~rc1.dfsg1+cvs2011-05-23-4ubuntu contains a bug
  that causes heavy cpu utilization, impacting normal operation of one of
  our mail servers following an upgrade to Ubuntu 12.04.
  
  We are running the daemon with the following options:
  
  /usr/sbin/saslauthd -a rimap -O our.imap.server -r -m
  /var/spool/postfix/var/run/saslauthd -n 5
  
  We noticed that users were unable to send mail and that the saslauthd
  processes were using approximately 100% of each cpu core. An strace of
  one of the runaway process showed that it was stuck in the following
  behaviour:
  
  select(9, [8], NULL, NULL, {0, 0})  = 1 (in [8], left {0, 0})
  read(8, "", 940)= 0
  select(9, [8], NULL, NULL, {0, 0})  = 1 (in [8], left {0, 0})
  read(8, "", 940)= 0
  select(9, [8], NULL, NULL, {0, 0})  = 1 (in [8], left {0, 0})
  read(8, "", 940)= 0
  select(9, [8], NULL, NULL, {0, 0})  = 1 (in [8], left {0, 0})
  read(

[Touch-packages] [Bug 997217] Re: salsauthd maxes cpu

2018-10-29 Thread Brian Murray
Hello David, or anyone else affected,

Accepted cyrus-sasl2 into xenial-proposed. The package will build now
and be available at https://launchpad.net/ubuntu/+source/cyrus-
sasl2/2.1.26.dfsg1-14ubuntu0.1 in a few hours, and then in the -proposed
repository.

Please help us by testing this new package.  See
https://wiki.ubuntu.com/Testing/EnableProposed for documentation on how
to enable and use -proposed.  Your feedback will aid us getting this
update out to other Ubuntu users.

If this package fixes the bug for you, please add a comment to this bug,
mentioning the version of the package you tested and change the tag from
verification-needed-xenial to verification-done-xenial. If it does not
fix the bug for you, please add a comment stating that, and change the
tag to verification-failed-xenial. In either case, without details of
your testing we will not be able to proceed.

Further information regarding the verification process can be found at
https://wiki.ubuntu.com/QATeam/PerformingSRUVerification .  Thank you in
advance for helping!

N.B. The updated package will be released to -updates after the bug(s)
fixed by this package have been verified and the package has been in
-proposed for a minimum of 7 days.

** Changed in: cyrus-sasl2 (Ubuntu Xenial)
   Status: In Progress => Fix Committed

** Tags added: verification-needed verification-needed-xenial

-- 
You received this bug notification because you are a member of Ubuntu
Touch seeded packages, which is subscribed to cyrus-sasl2 in Ubuntu.
https://bugs.launchpad.net/bugs/997217

Title:
  salsauthd maxes cpu

Status in cyrus-sasl2 package in Ubuntu:
  Fix Released
Status in cyrus-sasl2 source package in Precise:
  Won't Fix
Status in cyrus-sasl2 source package in Trusty:
  Invalid
Status in cyrus-sasl2 source package in Xenial:
  Fix Committed

Bug description:
  [Impact]

  The rimap authentication mechanism in saslauthd can hit a condition
  where it will start spinning and using all available CPU. This
  condition can be easily encountered when an authentication is
  happening and the imap service is being restarted.

  Furthermore, the saslauthd child process that picked up that
  authentication request and that is spinning now won't be reaped nor
  can it service further requests. If all children are left in this
  state, the authentication service as a whole won't be working anymore.

  [Test Case]

  This test can be performed in a LXD or VM.

  * install the needed packages. mail-stack-delivery is used to have an
  imap server available on localhost that needs no further
  configuration. Accept the defaults for all debconf prompts:

  sudo apt update
  sudo apt install sasl2-bin mail-stack-delivery

  * set the password "ubuntu" for the ubuntu user
  echo ubuntu:ubuntu | sudo chpasswd

  * start saslauthd like this, with just one child:
  sudo /usr/sbin/saslauthd -a rimap -O localhost -r -n 1

  * restart dovecot
  sudo service dovecot restart

  * test saslauthd authentication:
  $ sudo testsaslauthd -u ubuntu -p ubuntu
  0: OK "Success."

  * Now let's break it. In one terminal watch the output of top:
  top

  * in another terminal, run the following:
  sudo testsaslauthd -u ubuntu -p ubuntu & sleep 1; sudo service dovecot stop

  * observe in the "top" terminal that saslauthd is consuming a lot of
  cpu. If that's not happening, try starting dovecot again and adjusting
  the sleep value in the previous test command, but 1s was enough in all
  my runs.

  * start dovecot and repeat the authentication request. Since the only 
saslauthd child is now spinning, this will block:
  sudo service dovecot start
  $ sudo testsaslauthd -u ubuntu -p ubuntu
  

  [Regression Potential]
  This fix relies on read(2) returning zero bytes when the connection is 
dropped, and that is clearly documented in its manpage:

  "On  success,  the  number  of  bytes read is returned (zero indicates
  end of file),"

  The select manpage also documents such a case being a valid case to
  indicate that a socket is ready to be read from, and that it won't
  block:

  "The file descriptors listed in readfds will be watched to see if
  characters become available for reading (more precisely, to see  if  a
  read will  not  block;  in  particular, a file descriptor is also
  ready on end-of-file)"

  This patch is what was used upstream, and is also present in bionic.

  I can't think of regressions specific to this change, other than the
  usual risk of rebuilding a widely used library (sasl2) in an
  environment different from the one from when xenial was released,
  i.e., different libraries available, maybe different system-wide build
  options, etc.

  [Other Info]
  Trusty is still not accounting for read() returning zero being an end-of-file 
case, but the loop there has a counter and it eventually exits, not leading to 
a perpetual spin or high cpu usage (see comment #17 for a brief history on how 
this fix was dropped in the xenial package).

  The fix is simp