https://bugs.linaro.org/show_bug.cgi?id=1940

--- Comment #3 from Maxim Uvarov <maxim.uva...@linaro.org> ---


As I understand it, the sequence is as follows:

1. timer_create(CLOCK_MONOTONIC, &sigev, &tp->timerid)
2.
sigev.sigev_notify          = SIGEV_THREAD;
sigev.sigev_notify_function = timer_notify;
sigev.sigev_value.sival_ptr = tp;
timer_create(CLOCK_MONOTONIC, &sigev, &tp->timerid);

3. timer_settime(tp->timerid, 0, &ispec, NULL);

4. timer_delete(tp->timerid)

5. odp_shm_free(tp->shm)

But a timer event can occur just before step 4. That is, the timer has already
been removed from the timer list, but the timer handler has started creating a
thread (because of SIGEV_THREAD) to call the notifier. The timer_delete() man
page has a similar note:

"""
timer_delete() deletes the timer whose ID is given in timerid.  If the
timer was armed at the time of this call, it is disarmed before being
deleted.  The treatment of any pending signal generated by the deleted
timer is unspecified.
"""
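
The sequence above can be sketched as a small C fragment. This is only an
illustration, not the ODP code: struct demo_pool, run_sequence() and the tick
counter are hypothetical names, and the caller owns the pool memory in place of
odp_shm_free().

```c
#define _POSIX_C_SOURCE 200809L
#include <signal.h>
#include <stdatomic.h>
#include <string.h>
#include <time.h>

/* Hypothetical stand-in for the ODP timer pool (not the real ODP type). */
struct demo_pool {
    timer_t    timerid;
    atomic_int ticks;   /* bumped by every notifier invocation */
};

/* Called by the C library in a separate thread for each expiration. */
static void timer_notify(union sigval sv)
{
    struct demo_pool *tp = sv.sival_ptr;

    atomic_fetch_add(&tp->ticks, 1);   /* touches pool memory */
}

/* Steps 1-4 of the sequence. Returns the number of ticks seen, or -1. */
static int run_sequence(struct demo_pool *tp, long period_ns, long run_ms)
{
    struct sigevent   sigev;
    struct itimerspec ispec;
    struct timespec   run = { run_ms / 1000, (run_ms % 1000) * 1000000L };

    atomic_store(&tp->ticks, 0);

    memset(&sigev, 0, sizeof(sigev));
    sigev.sigev_notify          = SIGEV_THREAD;
    sigev.sigev_notify_function = timer_notify;
    sigev.sigev_value.sival_ptr = tp;
    if (timer_create(CLOCK_MONOTONIC, &sigev, &tp->timerid) != 0)
        return -1;

    memset(&ispec, 0, sizeof(ispec));
    ispec.it_interval.tv_nsec = period_ns;   /* period_ns must be < 1 s */
    ispec.it_value.tv_nsec    = period_ns;
    if (timer_settime(tp->timerid, 0, &ispec, NULL) != 0) {
        timer_delete(tp->timerid);
        return -1;
    }

    nanosleep(&run, NULL);                   /* let the timer fire */

    if (timer_delete(tp->timerid) != 0)
        return -1;

    /* Step 5 (freeing tp) is left to the caller; a notifier thread
     * may still be running at this point. */
    return atomic_load(&tp->ticks);
}
```

A caller would pass a pool that outlives the call, e.g.
run_sequence(&pool, 1000000L, 100) for a 1 ms period over 100 ms. Freeing the
pool right after the call returns is the point where a late notifier could
touch released memory.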

Let's consider whether that can happen and how timer deletion works:

glibc code:
https://github.com/lattera/glibc/blob/master/nptl/sysdeps/unix/sysv/linux/timer_delete.c
-----------------------------------------------------
int
timer_delete (timerid)
     timer_t timerid;
{
# undef timer_delete
# ifndef __ASSUME_POSIX_TIMERS
  if (__no_posix_timers >= 0)
# endif
    {
      struct timer *kt = (struct timer *) timerid;

      /* Delete the kernel timer object.  */
      int res = INLINE_SYSCALL (timer_delete, 1, kt->ktimerid);

      if (res == 0)
        {
          if (kt->sigev_notify == SIGEV_THREAD)
            {
              /* Remove the timer from the list.  */
              pthread_mutex_lock (&__active_timer_sigev_thread_lock);
              if (__active_timer_sigev_thread == kt)
                __active_timer_sigev_thread = kt->next;
              else
                {
                  struct timer *prevp = __active_timer_sigev_thread;
                  while (prevp->next != NULL)
                    if (prevp->next == kt)
                      {
                        prevp->next = kt->next;
                        break;
                      }
                    else
                      prevp = prevp->next;
                }
              pthread_mutex_unlock (&__active_timer_sigev_thread_lock);
            }

# ifndef __ASSUME_POSIX_TIMERS
          /* We know the syscall support is available.  */
          __no_posix_timers = 1;
# endif

          /* Free the memory.  */
          (void) free (kt);

          return 0;
        }

      /* The kernel timer is not known or something else bad happened.
         Return the error.  */
# ifndef __ASSUME_POSIX_TIMERS
      if (errno != ENOSYS)
        {
          __no_posix_timers = 1;
# endif
          return -1;
# ifndef __ASSUME_POSIX_TIMERS
        }

------------------------------------
Because we did not see an abort in ODP:
        if (timer_delete(tp->timerid) != 0)
                ODP_ABORT("timer_delete() returned error %s\n",

that means that:
 int res = INLINE_SYSCALL (timer_delete, 1, kt->ktimerid);
returned 0, i.e. the kernel reported that the timer was deleted.
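
A zero return from timer_delete() says nothing about a notifier thread that has
already been started. The following sketch tries to make that visible from user
space, under assumptions: slow_notify(), delete_while_in_flight() and the two
flags are hypothetical names, and the 200 ms delay is artificial. It only shows
the user-visible behaviour, not what the kernel does internally.

```c
#define _POSIX_C_SOURCE 200809L
#include <signal.h>
#include <stdatomic.h>
#include <string.h>
#include <time.h>

/* Hypothetical flags used only by this demonstration. */
static atomic_int notifier_started;
static atomic_int notifier_finished;

static void sleep_ms(long ms)
{
    struct timespec ts = { ms / 1000, (ms % 1000) * 1000000L };

    nanosleep(&ts, NULL);
}

/* Deliberately slow notifier: stays "in flight" for about 200 ms. */
static void slow_notify(union sigval sv)
{
    (void)sv;
    atomic_store(&notifier_started, 1);
    sleep_ms(200);
    atomic_store(&notifier_finished, 1);
}

/*
 * Arm a one-shot timer, wait until its notifier thread is running, then
 * delete the timer. Returns the result of timer_delete(), or -1 if the
 * setup failed or the notifier never started. *in_flight is set to 1 if
 * the notifier had not finished when timer_delete() returned.
 */
static int delete_while_in_flight(int *in_flight)
{
    struct sigevent   sigev;
    struct itimerspec ispec;
    timer_t           timerid;
    int               i, res;

    atomic_store(&notifier_started, 0);
    atomic_store(&notifier_finished, 0);

    memset(&sigev, 0, sizeof(sigev));
    sigev.sigev_notify          = SIGEV_THREAD;
    sigev.sigev_notify_function = slow_notify;
    if (timer_create(CLOCK_MONOTONIC, &sigev, &timerid) != 0)
        return -1;

    memset(&ispec, 0, sizeof(ispec));
    ispec.it_value.tv_nsec = 1000000L;       /* fire once after 1 ms */
    if (timer_settime(timerid, 0, &ispec, NULL) != 0) {
        timer_delete(timerid);
        return -1;
    }

    /* Wait (up to about 1 s) for the notifier thread to start. */
    for (i = 0; i < 1000 && !atomic_load(&notifier_started); i++)
        sleep_ms(1);
    if (!atomic_load(&notifier_started)) {
        timer_delete(timerid);
        return -1;
    }

    res = timer_delete(timerid);
    *in_flight = !atomic_load(&notifier_finished);
    return res;
}
```

If the call returns 0 with *in_flight set to 1, the notifier was still
executing after timer_delete() had reported success, which is the situation
suspected in this bug.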


Looking at the kernel code for CLOCK_MONOTONIC:


kernel/time/posix-timers.c:
struct k_clock clock_monotonic = {
...
.timer_del      = common_timer_del,
}

static int common_timer_del(struct k_itimer *timer)
{
        timer->it.real.interval.tv64 = 0;

        if (hrtimer_try_to_cancel(&timer->it.real.timer) < 0)
                return TIMER_RETRY;
        return 0;
}


also kernel/time/hrtimer.c:
int hrtimer_try_to_cancel(struct hrtimer *timer)
{
        struct hrtimer_clock_base *base;
        unsigned long flags;
        int ret = -1;

        /*
         * Check lockless first. If the timer is not active (neither
         * enqueued nor running the callback, nothing to do here.  The
         * base lock does not serialize against a concurrent enqueue,
         * so we can avoid taking it.
         */
        if (!hrtimer_active(timer))
                return 0;

        base = lock_hrtimer_base(timer, &flags);

        if (!hrtimer_callback_running(timer))
                ret = remove_hrtimer(timer, base, false);

        unlock_hrtimer_base(timer, &flags);

        return ret;

}

OK, it looks like hrtimer_active(timer) should detect whether a callback is in
flight. This commit added that check:

Author: Peter Zijlstra <pet...@infradead.org>  2015-06-11 15:46:48
Committer: Thomas Gleixner <t...@linutronix.de>  2015-06-19 01:09:56
Parent: 8edfb0362e8e52dec2de08fa163af01c9da2c9d0 (hrtimer: Fix
hrtimer_is_queued() hole)
Child:  bc7a34b8b9ebfb0f4b8a35a72a0b134fd6c5ef50 (timer: Reduce timer migration
overhead if disabled)
Branches: master, remotes/origin/master
Follows: v4.1-rc4
Precedes: v4.2-rc1

    hrtimer: Allow hrtimer::function() to free the timer

    Currently an hrtimer callback function cannot free its own timer
    because __run_hrtimer() still needs to clear HRTIMER_STATE_CALLBACK
    after it. Freeing the timer would result in a clear use-after-free.

    Solve this by using a scheme similar to regular timers; track the
    current running timer in hrtimer_clock_base::running.


So we need to check whether it is reproducible with the latest kernel.

If it really is an "in flight" problem with the threaded timer handler, then in
ODP we can add a sleep() before freeing the timer pool memory on kernels older
than 4.2.
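
A minimal sketch of that workaround, assuming the kernel version is taken from
uname(). The helper names release_older_than(), kernel_needs_grace_period() and
grace_before_free() are hypothetical, and the 4.2 cutoff is only the hypothesis
from this comment, not a confirmed boundary.

```c
#define _POSIX_C_SOURCE 200809L
#include <stdio.h>
#include <time.h>
#include <sys/utsname.h>

/*
 * Returns 1 if a release string such as "4.1.15-generic" is older than
 * major.minor. Unparseable strings are treated as old (conservative).
 */
static int release_older_than(const char *release, int major, int minor)
{
    int maj = 0, min = 0;

    if (sscanf(release, "%d.%d", &maj, &min) != 2)
        return 1;
    return maj < major || (maj == major && min < minor);
}

/* Returns 1 if the running kernel is older than 4.2 or cannot be queried. */
static int kernel_needs_grace_period(void)
{
    struct utsname u;

    if (uname(&u) != 0)
        return 1;
    return release_older_than(u.release, 4, 2);
}

/*
 * Intended to be called between timer_delete() and freeing the timer
 * pool memory: sleeps for grace_ms on kernels older than 4.2.
 */
static void grace_before_free(long grace_ms)
{
    struct timespec ts = { grace_ms / 1000, (grace_ms % 1000) * 1000000L };

    if (kernel_needs_grace_period())
        nanosleep(&ts, NULL);
}
```

Note that a fixed sleep only narrows the window; it does not guarantee that a
notifier thread which was already started has finished.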
