** Description changed:

  [Impact]
  
-  * An explanation of the effects of the bug on users and
+ The rimap authentication mechanism in saslauthd can hit a condition
+ where it will start spinning and using all available CPU. This condition
+ can be easily encountered when an authentication is happening and the
+ imap service is being restarted.
  
-  * justification for backporting the fix to the stable release.
- 
-  * In addition, it is helpful, but not required, to include an
-    explanation of how the upload fixes this bug.
+ Once the saslauthd daemon enters this loop, it stops responding to
+ authentication requests.
  
  [Test Case]
  
-  * detailed instructions how to reproduce the bug
+ This test can be performed in a LXD or VM.
  
-  * these should allow someone who is not familiar with the affected
-    package to reproduce the bug and verify that the updated package fixes
-    the problem.
+ * install the needed packages. mail-stack-delivery is used to have an
+ imap server available on localhost that needs no further configuration.
+ Accept the defaults for all debconf prompts:
+ 
+ sudo apt update
+ sudo apt install sasl2-bin mail-stack-delivery
+ 
+ * set the password "ubuntu" for the ubuntu user
+ echo ubuntu:ubuntu | sudo chpasswd
+ 
+ * start saslauthd like this:
+ sudo /usr/sbin/saslauthd -a rimap -O localhost -r -n 5
+ 
+ * restart dovecot
+ sudo service dovecot restart
+ 
+ * test saslauthd authentication:
+ $ sudo testsaslauthd -u ubuntu -p ubuntu
+ 0: OK "Success."
+ 
+ * Now let's break it. In one terminal watch the output of top:
+ top
+ 
+ * in another terminal, run the following:
+ sudo testsaslauthd -u ubuntu -p ubuntu & sleep 1; sudo service dovecot stop
+ 
+ * observe in the "top" terminal that saslauthd is consuming a lot of
+ cpu. If that's not happening, try starting dovecot again and adjusting
+ the sleep value in the previous test command.
+ 
+ * Try authenticating again, while saslauthd is spinning. It should fail:
+ $ sudo testsaslauthd -u ubuntu -p ubuntu  
+ 0: NO "authentication failed"
+ 
  
  [Regression Potential]
  
-  * discussion of how regressions are most likely to manifest as a result
+  * discussion of how regressions are most likely to manifest as a result
  of this change.
  
-  * It is assumed that any SRU candidate patch is well-tested before
-    upload and has a low overall risk of regression, but it's important
-    to make the effort to think about what ''could'' happen in the
-    event of a regression.
+  * It is assumed that any SRU candidate patch is well-tested before
+    upload and has a low overall risk of regression, but it's important
+    to make the effort to think about what ''could'' happen in the
+    event of a regression.
  
-  * This both shows the SRU team that the risks have been considered,
-    and provides guidance to testers in regression-testing the SRU.
+  * This both shows the SRU team that the risks have been considered,
+    and provides guidance to testers in regression-testing the SRU.
  
  [Other Info]
-  
-  * Anything else you think is useful to include
-  * Anticipate questions from users, SRU, +1 maintenance, security teams and 
the Technical Board
-  * and address these questions in advance
  
+  * Anything else you think is useful to include
+  * Anticipate questions from users, SRU, +1 maintenance, security teams and 
the Technical Board
+  * and address these questions in advance
  
  [Original Description]
  
  sasl2-bin version 2.1.24~rc1.dfsg1+cvs2011-05-23-4ubuntu contains a bug
  that causes heavy cpu utilization, impacting normal operation of one of
  our mail servers following an upgrade to Ubuntu 12.04.
  
  We are running the daemon with the following options:
  
  /usr/sbin/saslauthd -a rimap -O our.imap.server -r -m
  /var/spool/postfix/var/run/saslauthd -n 5
  
  We noticed that users were unable to send mail and that the saslauthd
  processes were using approximately 100% of each cpu core. An strace of
  one of the runaway process showed that it was stuck in the following
  behaviour:
  
  select(9, [8], NULL, NULL, {0, 0})      = 1 (in [8], left {0, 0})
  read(8, "", 940)                        = 0
  select(9, [8], NULL, NULL, {0, 0})      = 1 (in [8], left {0, 0})
  read(8, "", 940)                        = 0
  select(9, [8], NULL, NULL, {0, 0})      = 1 (in [8], left {0, 0})
  read(8, "", 940)                        = 0
  select(9, [8], NULL, NULL, {0, 0})      = 1 (in [8], left {0, 0})
  read(8, "", 940)                        = 0
  select(9, [8], NULL, NULL, {0, 0})      = 1 (in [8], left {0, 0})
  read(8, "", 940)                        = 0
  select(9, [8], NULL, NULL, {0, 0})      = 1 (in [8], left {0, 0})
  read(8, "", 940)                        = 0
  select(9, [8], NULL, NULL, {0, 0})      = 1 (in [8], left {0, 0})
  read(8, "", 940)                        = 0
  .....
  
  with further inspection showing that the file descriptor in question was
  a socket connected to our imap server in CLOSE_WAIT.
  
  Browsing saslauthd/auth_rimap.c in the source package for sasl2-bin, we
  came across the following code, repeated in two locations:
  
  while( select (fds, &perm, NULL, NULL, &timeout ) >0 ) {
             if ( FD_ISSET(s, &perm) ) {
                ret = read(s, rbuf+rc, sizeof(rbuf)-rc);
                if ( ret<0 ) {
                   rc = ret;
                   break;
                } else {
                   rc += ret;
                }
             }
          }
  
  It looks like this loop is expected to run until a read error is
  encountered or the timeout of 1 second is reached. There is no test to
  check that 0 bytes were read, indicating that the connection was closed
  by the remote peer. Since select() will immediately return the size of
  the set of the partially closed descriptor (1, which is >0), and calls
  to read() will always yield 0 bytes, there's the potential for execution
  to get stuck in this non blocking loop and I'm presuming that that's
  what's happening here.
  
  We've not performed any further analysis to prove that this is really
  what's happening but if my intuition is correct then our IMAP server (an
  nginx imap proxy) most liklely closes the connection at an unexpected
  time under as yet undetermined conditions.

-- 
You received this bug notification because you are a member of Ubuntu
Touch seeded packages, which is subscribed to cyrus-sasl2 in Ubuntu.
https://bugs.launchpad.net/bugs/997217

Title:
  salsauthd maxes cpu

Status in cyrus-sasl2 package in Ubuntu:
  Fix Released
Status in cyrus-sasl2 source package in Precise:
  Won't Fix
Status in cyrus-sasl2 source package in Trusty:
  Triaged
Status in cyrus-sasl2 source package in Xenial:
  Triaged

Bug description:
  [Impact]

  The rimap authentication mechanism in saslauthd can hit a condition
  where it will start spinning and using all available CPU. This
  condition can be easily encountered when an authentication is
  happening and the imap service is being restarted.

  Furthermore, the saslauthd child process that picked up that
  authentication request and that is spinning now won't be reaped nor
  can it service further requests. If all children are left in this
  state, the authentication service as a whole won't be working anymore.

  [Test Case]

  This test can be performed in a LXD or VM.

  * install the needed packages. mail-stack-delivery is used to have an
  imap server available on localhost that needs no further
  configuration. Accept the defaults for all debconf prompts:

  sudo apt update
  sudo apt install sasl2-bin mail-stack-delivery

  * set the password "ubuntu" for the ubuntu user
  echo ubuntu:ubuntu | sudo chpasswd

  * start saslauthd like this, with just one child:
  sudo /usr/sbin/saslauthd -a rimap -O localhost -r -n 1

  * restart dovecot
  sudo service dovecot restart

  * test saslauthd authentication:
  $ sudo testsaslauthd -u ubuntu -p ubuntu
  0: OK "Success."

  * Now let's break it. In one terminal watch the output of top:
  top

  * in another terminal, run the following:
  sudo testsaslauthd -u ubuntu -p ubuntu & sleep 1; sudo service dovecot stop

  * observe in the "top" terminal that saslauthd is consuming a lot of
  cpu. If that's not happening, try starting dovecot again and adjusting
  the sleep value in the previous test command.

  * repeat the authentication request. Since the only saslauthd child is now 
spinning, this will block:
  $ sudo testsaslauthd -u ubuntu -p ubuntu
  <blocks>

  
  [Regression Potential]

   * discussion of how regressions are most likely to manifest as a
  result of this change.

   * It is assumed that any SRU candidate patch is well-tested before
     upload and has a low overall risk of regression, but it's important
     to make the effort to think about what ''could'' happen in the
     event of a regression.

   * This both shows the SRU team that the risks have been considered,
     and provides guidance to testers in regression-testing the SRU.

  [Other Info]

   * Anything else you think is useful to include
   * Anticipate questions from users, SRU, +1 maintenance, security teams and 
the Technical Board
   * and address these questions in advance

  [Original Description]

  sasl2-bin version 2.1.24~rc1.dfsg1+cvs2011-05-23-4ubuntu contains a
  bug that causes heavy cpu utilization, impacting normal operation of
  one of our mail servers following an upgrade to Ubuntu 12.04.

  We are running the daemon with the following options:

  /usr/sbin/saslauthd -a rimap -O our.imap.server -r -m
  /var/spool/postfix/var/run/saslauthd -n 5

  We noticed that users were unable to send mail and that the saslauthd
  processes were using approximately 100% of each cpu core. An strace of
  one of the runaway process showed that it was stuck in the following
  behaviour:

  select(9, [8], NULL, NULL, {0, 0})      = 1 (in [8], left {0, 0})
  read(8, "", 940)                        = 0
  select(9, [8], NULL, NULL, {0, 0})      = 1 (in [8], left {0, 0})
  read(8, "", 940)                        = 0
  select(9, [8], NULL, NULL, {0, 0})      = 1 (in [8], left {0, 0})
  read(8, "", 940)                        = 0
  select(9, [8], NULL, NULL, {0, 0})      = 1 (in [8], left {0, 0})
  read(8, "", 940)                        = 0
  select(9, [8], NULL, NULL, {0, 0})      = 1 (in [8], left {0, 0})
  read(8, "", 940)                        = 0
  select(9, [8], NULL, NULL, {0, 0})      = 1 (in [8], left {0, 0})
  read(8, "", 940)                        = 0
  select(9, [8], NULL, NULL, {0, 0})      = 1 (in [8], left {0, 0})
  read(8, "", 940)                        = 0
  .....

  with further inspection showing that the file descriptor in question
  was a socket connected to our imap server in CLOSE_WAIT.

  Browsing saslauthd/auth_rimap.c in the source package for sasl2-bin,
  we came across the following code, repeated in two locations:

  while( select (fds, &perm, NULL, NULL, &timeout ) >0 ) {
             if ( FD_ISSET(s, &perm) ) {
                ret = read(s, rbuf+rc, sizeof(rbuf)-rc);
                if ( ret<0 ) {
                   rc = ret;
                   break;
                } else {
                   rc += ret;
                }
             }
          }

  It looks like this loop is expected to run until a read error is
  encountered or the timeout of 1 second is reached. There is no test to
  check that 0 bytes were read, indicating that the connection was
  closed by the remote peer. Since select() will immediately return the
  size of the set of the partially closed descriptor (1, which is >0),
  and calls to read() will always yield 0 bytes, there's the potential
  for execution to get stuck in this non blocking loop and I'm presuming
  that that's what's happening here.

  We've not performed any further analysis to prove that this is really
  what's happening but if my intuition is correct then our IMAP server
  (an nginx imap proxy) most liklely closes the connection at an
  unexpected time under as yet undetermined conditions.

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/cyrus-sasl2/+bug/997217/+subscriptions

-- 
Mailing list: https://launchpad.net/~touch-packages
Post to     : touch-packages@lists.launchpad.net
Unsubscribe : https://launchpad.net/~touch-packages
More help   : https://help.launchpad.net/ListHelp

Reply via email to