Mysterious hang with openssl and asan on ubuntu 18.04

2020-02-23 Thread Dan Kegel
Hi folks.

The project I'm working on exhibits a hang in one test case when
dealing with openssl connections, usually on 8 core machines,
when built with address sanitizer enabled.  This is mature,
theoretically well-debugged production code.  With help from c-reduce,
I minimized the rather complex test case to 108 lines of C
linked to openssl.  The hang occurs only on Ubuntu 18.04, not
on Ubuntu 19.10 or 20.04 beta.  Before I go trying to run c-reduce
on openssl, I thought I'd run the so-far minimal reproducer by folks
here and see if anyone can think of a change between openssl 1.1.1 and
1.1.1d (well, to be exact, Ubuntu's 1.1.1-1ubuntu2.1~18.04.5 and
1.1.1d-2ubuntu3) that might account for this.

The following script reproduces the hang reliably for me in under a minute
(sometimes under a second):

#!/bin/sh
set -ex
gcc -g -O2 -fsanitize=address -pthread bug.i -lssl -o bug
export ASAN_OPTIONS=detect_stack_use_after_return=1
export LSAN_OPTIONS=verbosity=1
for iter in $(seq 1 1000)
do
  ./bug
done
echo "No hang found."

where bug.i contains:

-- snip --
static int readers = 40;
static int once_control = 0;
static int test_secs = 1;

typedef int pid_t;
struct timeval {
  long tv_sec;
  long tv_usec;
};
struct timespec {
  long tv_sec;
  long tv_nsec;
};
typedef unsigned long int pthread_t;
enum __itimer_which {
  ITIMER_REAL,
};
struct itimerval {
  struct timeval it_interval;
  struct timeval it_value;
};
typedef int sig_atomic_t;
typedef void(*__sighandler_t);
struct sigaction {
  struct {
    __sighandler_t sa_handler;
  } __sigaction_handler;
};

typedef struct ssl_ctx_st SSL_CTX;
typedef struct ssl_method_st SSL_METHOD;
static SSL_CTX *context;

static struct timespec start_time;
static struct timespec goal_end_time;
static volatile sig_atomic_t keep_on_chugging = 1;

void do_once() {
  const struct ssl_method_st *p = 0;
  SSL_CTX *context = SSL_CTX_new(p);
}

void thread_main(void *v) { }

void do_work() {
  pthread_t net_5;
  pthread_once(&once_control, do_once);
  pthread_create(&net_5, 0, thread_main, 0);
}

static _Bool is_time_to_quit(void) {
  struct timespec now_time;
  clock_gettime(0, &now_time);
  long long remaining_nsecs =
  (goal_end_time.tv_sec - now_time.tv_sec) * 10ULL;
  remaining_nsecs += goal_end_time.tv_nsec - now_time.tv_nsec;
  if (remaining_nsecs < 0) return 1;
  return 0;
}

static void set_flag_for_exit(int signo) {
  if (is_time_to_quit()) keep_on_chugging = 0;
}

static void set_timer(void) {
  struct itimerval iv = {{0}, {0}};
  iv.it_value.tv_usec = 3000;
  iv.it_interval.tv_usec = 3000;
  if (setitimer(ITIMER_REAL, &iv, ((void *)0)) < 0)
    abort();
  if (is_time_to_quit()) exit(0);
}

static void create_children(void) {
  int i;
  struct sigaction act;
  memset(&act, 0, sizeof(act));
  act.__sigaction_handler.sa_handler = set_flag_for_exit;
  sigaction(14, &act, ((void *)0));
  act.__sigaction_handler.sa_handler = ((__sighandler_t)1);
  pid_t pid;
  for (i = 0; i < readers; i++) {
    if (i < readers) {
      if ((pid = fork()) < 0)
        abort();
      if (pid == 0) {
        set_timer();
        do_work();
        _exit(0);
      }
    }
  }
}

static int reap_children(void) {
  int i;
  int status;
  for (i = 0; i < readers; i++) wait(&status);
}

int main(int argc, char *argv[]) {
  clock_gettime(0, &start_time);
  goal_end_time = start_time;
  goal_end_time.tv_sec += test_secs;

  create_children();
  reap_children();
}

-- snip --
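
A side note on is_time_to_quit() in the reduced test: it scales the seconds difference by 10ULL rather than by the number of nanoseconds in a second, which looks like a c-reduce leftover rather than intent. The reproducer above is left exactly as posted. For comparison only, a conventional version of that calculation might look like the sketch below; the helper name nsecs_until is hypothetical and does not appear in the original code.

```c
#include <time.h>

/*
 * Hypothetical helper (not in the posted reproducer): nanoseconds
 * from *now until *end.  The result is negative once *end has passed.
 */
static long long nsecs_until(const struct timespec *end,
                             const struct timespec *now)
{
    /* Widen to long long before scaling so the product cannot
     * overflow on platforms where time_t is 32 bits. */
    long long nsecs = (long long)(end->tv_sec - now->tv_sec) * 1000000000LL;

    /* The tv_nsec difference may be negative; adding it corrects
     * the whole-second estimate computed above. */
    nsecs += end->tv_nsec - now->tv_nsec;
    return nsecs;
}
```

With a helper like this, the quit check reduces to testing whether nsecs_until(&goal_end_time, &now_time) is below zero.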


Re: Query regarding SSL_ERROR_SSL during SSL handshake

2020-02-23 Thread Matt Caswell



On 24/02/2020 03:49, Mahendra SP wrote:
> Hi Matt,
> 
> Thank you for the inputs. 
> I have one more query. Is it appropriate to check for the errno in this
> case and take action based on the errno values ?

No, errno should not be checked unless SSL_get_error returns
SSL_ERROR_SYSCALL.

Matt


> 
> Thanks
> Mahendra
> 
> On Wed, Feb 19, 2020 at 3:09 PM Matt Caswell wrote:
> 
> 
> 
> On 19/02/2020 05:16, Mahendra SP wrote:
> > Hi All,
> >
> > We are using Openssl version 1.0.2h. When we call SSL_do_handshake,
> > sometimes we notice that handshake fails with error SSL_ERROR_SSL.
> > As per the documentation for this error, it is non recoverable and fatal
> > error.  Documentation also mentions to check the error queue for further
> > details. Does it mean, calling SSL_get_error after SSL_ERROR_SSL will
> > give exact reason for this failure?
> 
> OpenSSL has its own error stack. SSL_ERROR_SSL means that you should
> look at that error stack for further details about what caused the
> problem. For example you can use ERR_print_errors_fp() to print all the
> error descriptions to stdout/stderr:
> 
> https://www.openssl.org/docs/man1.1.1/man3/ERR_print_errors_fp.html
> 
> You can get more fine grained control of the error stack using the
> various ERR_* functions available. See:
> 
> https://www.openssl.org/docs/man1.1.1/man3/
> 
> Matt
> 


[no subject]

2020-02-23 Thread hamed salini



Re: Query regarding SSL_ERROR_SSL during SSL handshake

2020-02-23 Thread Mahendra SP
Hi Matt,

Thank you for the inputs.
I have one more query. Is it appropriate to check for the errno in this
case and take action based on the errno values ?

Thanks
Mahendra

On Wed, Feb 19, 2020 at 3:09 PM Matt Caswell wrote:

>
>
> On 19/02/2020 05:16, Mahendra SP wrote:
> > Hi All,
> >
> > We are using Openssl version 1.0.2h. When we call SSL_do_handshake,
> > sometimes we notice that handshake fails with error SSL_ERROR_SSL.
> > As per the documentation for this error, it is non recoverable and fatal
> > error.  Documentation also mentions to check the error queue for further
> > details. Does it mean, calling SSL_get_error after SSL_ERROR_SSL will
> > give exact reason for this failure?
>
> OpenSSL has its own error stack. SSL_ERROR_SSL means that you should
> look at that error stack for further details about what caused the
> problem. For example you can use ERR_print_errors_fp() to print all the
> error descriptions to stdout/stderr:
>
> https://www.openssl.org/docs/man1.1.1/man3/ERR_print_errors_fp.html
>
> You can get more fine grained control of the error stack using the
> various ERR_* functions available. See:
>
> https://www.openssl.org/docs/man1.1.1/man3/
>
> Matt
>