On 27-Apr-18 5:17 PM, Tan, Jianfeng wrote:


On 4/27/2018 11:46 PM, Tan, Jianfeng wrote:
Hi Olivier,

After this patch, I find that the two IPC threads block at pthread_barrier_wait() and never wake up. Please see below for more information. The system is Ubuntu 16.04.

On 4/24/2018 10:46 PM, Olivier Matz wrote:
To avoid code duplication, add a parameter to rte_ctrl_thread_create()
to specify the name of the thread.

This requires adding a wrapper for the thread start routine,
rte_thread_init(), which first waits until the thread is configured.

Signed-off-by: Olivier Matz <olivier.m...@6wind.com>
---
  drivers/net/kni/rte_eth_kni.c                |  3 +-
  lib/librte_eal/common/eal_common_proc.c      | 15 +++-----
  lib/librte_eal/common/eal_common_thread.c    | 52 +++++++++++++++++++++++++---
  lib/librte_eal/common/include/rte_lcore.h    |  7 ++--
  lib/librte_eal/linuxapp/eal/eal_interrupts.c | 13 ++-----
  lib/librte_eal/linuxapp/eal/eal_timer.c      | 12 +------
  lib/librte_vhost/socket.c                    | 25 +++----------
  7 files changed, 66 insertions(+), 61 deletions(-)
[...]
diff --git a/lib/librte_eal/common/eal_common_thread.c b/lib/librte_eal/common/eal_common_thread.c
index efbccddbc..94d2a6e42 100644
--- a/lib/librte_eal/common/eal_common_thread.c
+++ b/lib/librte_eal/common/eal_common_thread.c
@@ -7,6 +7,7 @@
  #include <stdint.h>
  #include <unistd.h>
  #include <pthread.h>
+#include <signal.h>
  #include <sched.h>
  #include <assert.h>
  #include <string.h>
@@ -141,10 +142,53 @@ eal_thread_dump_affinity(char *str, unsigned size)
      return ret;
  }
+
+struct rte_thread_ctrl_params {
+    void *(*start_routine)(void *);
+    void *arg;
+    pthread_barrier_t configured;
+};
+
+static void *rte_thread_init(void *arg)
+{
+    struct rte_thread_ctrl_params *params = arg;
+    void *(*start_routine)(void *) = params->start_routine;
+    void *routine_arg = params->arg;
+
+    pthread_barrier_wait(&params->configured);

This thread never wakes up. The call trace is as follows:

#0  0x00007ffff72a8154 in futex_wait (private=0, expected=0, futex_word=0x7fffffffcff4)
    at ../sysdeps/unix/sysv/linux/futex-internal.h:61
#1  futex_wait_simple (private=0, expected=0, futex_word=0x7fffffffcff4) at ../sysdeps/nptl/futex-internal.h:135
#2  __pthread_barrier_wait (barrier=0x7fffffffcff0) at pthread_barrier_wait.c:184
#3  0x000000000055216a in rte_thread_init (arg=0x7fffffffcfe0) at /home/tan/git/dpdk/lib/librte_eal/common/eal_common_thread.c:160
#4  0x00007ffff72a16ba in start_thread (arg=0x7ffff6ecf700) at pthread_create.c:333
#5  0x00007ffff6fd741d in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:109

+
+    return start_routine(routine_arg);
+}
+
  __rte_experimental int
-rte_ctrl_thread_create(pthread_t *thread,
-            const pthread_attr_t *attr,
-            void *(*start_routine)(void *), void *arg)
+rte_ctrl_thread_create(pthread_t *thread, const char *name,
+        const pthread_attr_t *attr,
+        void *(*start_routine)(void *), void *arg)
  {
-    return pthread_create(thread, attr, start_routine, arg);
+    struct rte_thread_ctrl_params params = {
+        .start_routine = start_routine,
+        .arg = arg,
+    };

Update:

I suspect this happens because we defined this variable, params, on the stack, and its value seems to be overwritten by the code that follows. I will send a patch to fix it.

I'm not sure I follow you, but I'm looking forward to the fix :)

As far as I can tell, even if the variable is on the stack, we copy the values out of it before it is destroyed, so even if params somehow got destroyed before the thread had a chance to start, we would already have all the data we needed from it. I can't see how allocating that value on the stack makes a difference.

Just about the only thing I can see that's slightly wrong here is the lack of pthread_barrier_destroy(). Perhaps add that as well? :)


Thanks,
Jianfeng


+    int ret;
+
+    pthread_barrier_init(&params.configured, NULL, 2);
+
+    ret = pthread_create(thread, attr, rte_thread_init, (void *)&params);
+    if (ret != 0)
+        return ret;
+
+    if (name != NULL) {
+        ret = rte_thread_setname(*thread, name);
+        if (ret < 0)
+            goto fail;
+    }
+
+    pthread_barrier_wait(&params.configured);

Here, the thread wakes up normally, and continues.

Any idea on what's going on?

Thanks,
Jianfeng

+
+    return 0;
+
+fail:
+    pthread_cancel(*thread);
+    pthread_join(*thread, NULL);
+    return ret;
  }





--
Thanks,
Anatoly
