Hi All,

Recently we found a problem, as follows:
1. Problem description:

PVECTOR_FOR_EACH uses ovsrcu_get() to fetch the pvector's current impl pointer, and the memory_order_consume in ovsrcu_get() ensures that *impl is read after that load, so we get the correct ptr value from impl->vector[0]. But it seems we cannot be sure that the *ptr value is also correct (see the minimal ordering sketch after the test code below).

2. Verification through testing:

Copy the test code below into lib/dpif-netdev.c, and modify the function pmd_thread_main() by inserting the following line into its for (;;) loop:

    do_atomic_test(pmd);

If do_atomic_consumer() is implemented without the mutex lock, we can easily get log lines like this in ovs-vswitchd.log:

    ----my_itrator:143, my_value:144

(We may get other values, but "my_itrator == my_value - 1" stays true.) So the consumer thread can read dirty memory with memory_order_consume, and if we change ovsrcu_get() to load with memory_order_acquire instead, we still get the error message (my_itrator == my_value - 1). The problem goes away if the consumer thread accesses the data under g_pvector_lock's protection, but pvector is designed to be used without locks.

Where did it go wrong? Can anyone here give some comments?

--------------------------------------------- An error case that actually happened while using pvector ---------------------------------------------

Suppose the following scenario:

1) A handler thread inserts a dpcls_subtable, whose address is p, into the datapath classifier's (struct dpcls) pvector;
2) a pmd thread calls fast_path_processing() to find the subtable and accesses address p;
3) the subtable above is destroyed by someone;
4) the handler thread inserts another subtable, whose address is also p, into the dpcls's pvector;
5) the pmd thread calls fast_path_processing() -> dp_netdev_pmd_lookup_flow() -> dpcls_lookup() to access the subtable from the pvector.

We use PVECTOR_FOR_EACH to iterate over the pvector in step 5); no locks are used, and the dirty memory access can make the ovs-vswitchd process core dump.

--------------------------------------------- Testing code is as follows ---------------------------------------------

/* Note: g_atomic_test must be initialized with pvector_init() before
 * first use (e.g. once at the top of pmd_thread_main()); a zeroed
 * struct pvector has a NULL impl. */
static struct pvector g_atomic_test;
static unsigned char g_my_data[65536];
static unsigned char g_value = 0;
static int g_index = 0;
static struct ovs_mutex g_pvector_lock = OVS_MUTEX_INITIALIZER;

static void
do_atomic_producer(void)
{
    ovs_mutex_lock(&g_pvector_lock);
    if (g_index < 65536) {
        /* Fill phase: append entries that all carry the current value. */
        g_my_data[g_index] = g_value;
        pvector_insert(&g_atomic_test, (void *) &g_my_data[g_index], 0);
        g_index++;
    } else {
        /* Drain phase: remove everything, then start over with the next
         * value.  The same addresses are reused immediately. */
        for (int i = 0; i < g_index; i++) {
            pvector_remove(&g_atomic_test, (void *) &g_my_data[i]);
        }
        g_index = 0;
        g_value++;  /* The value may wrap around to zero. */
    }
    pvector_publish(&g_atomic_test);
    ovs_mutex_unlock(&g_pvector_lock);
}

static void
do_atomic_consumer(void)
{
    unsigned char *my_itr;
    unsigned char my_value = 0;
    bool first = true;

    /* ovs_mutex_lock(&g_pvector_lock); */
    PVECTOR_FOR_EACH (my_itr, &g_atomic_test) {
        if (first) {
            my_value = *my_itr;
            first = false;
        } else if (*my_itr != my_value) {
            /* All entries were inserted with the same value, so any
             * mismatch means we read stale or torn data. */
            VLOG_ERR("----my_itrator:%u, my_value:%u\n", *my_itr, my_value);
            break;
        }
    }
    /* ovs_mutex_unlock(&g_pvector_lock); */
}

static void
do_atomic_test(struct dp_netdev_pmd_thread *pmd)
{
    if (pmd->core_id == 1) {
        do_atomic_producer();
    } else {
        do_atomic_consumer();
    }
}
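To make the ordering point in 1. concrete, here is a minimal sketch of the race, with plain C11 atomics standing in for ovsrcu_get()/ovsrcu_set() (the producer()/consumer() names and the single-pointer layout are simplifications for illustration, not the real pvector code):

#include <stdatomic.h>
#include <stddef.h>

static unsigned char data;               /* payload whose address is reused */
static _Atomic(unsigned char *) g_ptr;   /* stands in for impl->vector[0]   */

static void
producer(void)
{
    data = 143;                                             /* write (1) */
    atomic_store_explicit(&g_ptr, &data, memory_order_release);
    /* ...later, the entry is "removed" and the same address is reused
     * without waiting for readers to be done with it: */
    data = 144;                                             /* write (2) */
}

static unsigned char
consumer(void)
{
    unsigned char *p = atomic_load_explicit(&g_ptr, memory_order_consume);

    if (!p) {
        return 0;   /* Not published yet. */
    }
    /* consume (or acquire) guarantees this read observes at least
     * write (1), because (1) happened before the release store.  But
     * write (2) happened after the store, so nothing orders it against
     * this read: if the producer has already reused the address, this
     * is a plain data race. */
    return *p;
}

In other words, switching from consume to acquire only changes how reads made through the snapshot are ordered against the publish itself; neither can order them against writes the producer performs after publishing, and reusing an address (g_my_data[i] in the test, or the subtable address p in the scenario) is exactly such a write.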
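If I understand pvector's design correctly, its lockless readers rely on the OVS RCU rule that memory reachable from a published pvector must not be freed, modified, or reused until all threads have quiesced; the test's producer (and step 3) of the scenario) break that rule by reusing addresses immediately. Below is a hedged sketch of what RCU-safe removal could look like, using the real ovsrcu_postpone() but with hypothetical dpcls_remove_subtable()/subtable_destroy_cb() helpers and the struct dpcls layout assumed from lib/dpif-netdev.c:

/* Hypothetical callback: runs only after every registered thread has
 * quiesced at least once, i.e. after no pmd thread can still hold
 * "subtable" from an old pvector snapshot. */
static void
subtable_destroy_cb(struct dpcls_subtable *subtable)
{
    free(subtable);
}

/* Hypothetical removal helper for the scenario above. */
static void
dpcls_remove_subtable(struct dpcls *cls, struct dpcls_subtable *subtable)
{
    pvector_remove(&cls->subtables, subtable);
    pvector_publish(&cls->subtables);

    /* Do not free or reuse the memory right away: postpone destruction
     * until a grace period has passed, so the address reuse in step 4)
     * cannot race with a reader from step 2). */
    ovsrcu_postpone(subtable_destroy_cb, subtable);
}

For the grace period to end, readers must keep quiescing (the pmd main loop already quiesces periodically via ovsrcu_quiesce()), so a reader can never span a removal plus reuse of the same address.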