There are several fixes for upcall handling on branch-2.7 after 2.7.0.

commit 634e1d0f3a5b9b40b665fca1bcf6e63a07bda2d2
Author: Joe Stringer <j...@ovn.org>
Date:   Mon Jul 31 16:54:22 2017 -0700

    ofproto-dpif-upcall: Fix key attr iteration.
    
    This call is operating on messages generated by the datapath. If a
    datapath implementation sends improperly formatted netlink attributes,
    then it's possible for a revalidator thread to end up trapped in an
    infinite loop iterating across these attributes. Rather than using the
    UNSAFE variation of this iterator, use the regular version.
    
    From master commit f2d3fef3d90253dda3e03822df2e921ec853192d.
    
    Fixes: 994fcc5a15d3 ("upcall: Check for recirc_id in ukey_create_from_dpif_flow()")
    Signed-off-by: Joe Stringer <j...@ovn.org>
    Reviewed-by: Greg Rose <gvrose8...@gmail.com>
    Acked-by: Ben Pfaff <b...@ovn.org>

commit 8e4e55f17e07c45143338c543239c556c2214df1
Author: Joe Stringer <j...@ovn.org>
Date:   Mon Jul 31 16:54:21 2017 -0700

    ofproto-dpif-upcall: Fix action attr iteration.
    
    This call is operating on messages generated by the datapath. If a
    datapath implementation sends improperly formatted netlink attributes,
    then it's possible for a revalidator thread to end up trapped in an
    infinite loop iterating across the actions attributes. Rather than using
    the UNSAFE variation of this iterator, use the regular version.
    
    From master commit 55f854b9d51edcbccf4ae1655855dddd1d9ec1fe.
    
    Fixes: e672ff9b4d22 ("ofproto-dpif: Restore metadata and registers on recirculation.")
    Signed-off-by: Joe Stringer <j...@ovn.org>
    Acked-by: Ben Pfaff <b...@ovn.org>
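
To illustrate the hazard that the two iterator fixes above describe, here is a
minimal standalone sketch. It is not OVS code: struct attr_hdr, ATTR_ALIGN and
count_attrs_checked() are hypothetical names, and the layout only loosely
imitates a netlink attribute header. The idea is that a loop which trusts the
length field cannot make progress when that field is zero, so a checked walk
validates the length against the remaining bytes before advancing.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for a netlink-style attribute header (not OVS). */
struct attr_hdr {
    uint16_t len;   /* Total attribute length, header included. */
    uint16_t type;
};

/* Attributes are padded to a 4-byte boundary in this sketch. */
#define ATTR_ALIGN(n) (((n) + 3u) & ~(size_t) 3u)

/* Counts the well-formed attributes in 'buf' and stops at the first
 * malformed one.  An unchecked loop that simply advanced by 'len' would
 * never move past an attribute whose 'len' is 0. */
static size_t
count_attrs_checked(const uint8_t *buf, size_t size)
{
    size_t n = 0;

    assert(buf != NULL || size == 0);
    while (size >= sizeof(struct attr_hdr)) {
        struct attr_hdr hdr;
        size_t step;

        memcpy(&hdr, buf, sizeof hdr);
        if (hdr.len < sizeof hdr || hdr.len > size) {
            break;              /* Too short or runs past the buffer. */
        }
        n++;

        step = ATTR_ALIGN(hdr.len);
        if (step > size) {
            step = size;        /* Final attribute without padding. */
        }
        buf += step;
        size -= step;
    }
    return n;
}
```

A caller would pass the raw attribute bytes and their length; a buffer of
zeros yields a count of 0 rather than a hang.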

commit 58e06d7ffeb9e3625a8156632c1139f4cd93d66e
Author: Joe Stringer <j...@ovn.org>
Date:   Mon May 1 12:58:07 2017 -0700

    revalidator: Fix logging of xlate_key() failure.
    
    This was being logged using xlate_strerror(), but the return code is
    actually an errno code. Use ovs_strerror() instead.
    
    Fixes: dd0dc9eda0e0 ("revalidator: Reuse xlate_ukey from deletion.")
    Signed-off-by: Joe Stringer <j...@ovn.org>
    Acked-by: Jarno Rajahalme <ja...@ovn.org>

commit 928d21efb8e6b25e06131dd9b2a58ca7ffe7adb8
Author: Joe Stringer <j...@ovn.org>
Date:   Mon May 1 12:58:06 2017 -0700

    revalidator: Revalidate ukeys created from flows.
    
    If there is no active ukey for a particular datapath flow, and it is
    dumped from the datapath, then the revalidator threads will assemble a
    ukey based on the datapath flow. This will allow tracking of the stats
    for proper attribution, and future validation of the flow.
    
    However, until now when creating the ukey in this context, the ukey's
    'reval_seq' has been set to the current udpif's reval_seq. This implies
    that the flow has been validated against the current flow table.
    However, this is not true - The flow appeared in the datapath without
    any prior knowledge in this OVS instance so we should set up the
    reval_seq of the ukey to ensure that the flow will be validated during
    the current dump/revalidation cycle.
    
    Refer also revalidate_ukey().
    
    Fixes: 23597df05226 ("upcall: Create ukeys in handler threads.")
    Signed-off-by: Joe Stringer <j...@ovn.org>
    Acked-by: Jarno Rajahalme <ja...@ovn.org>

commit f0320ddadeec012487807cb2c71dd8da4dc9dad9
Author: Joe Stringer <j...@ovn.org>
Date:   Wed Apr 26 18:03:12 2017 -0700

    revalidator: Improve logging for transition_ukey().
    
    There are a few cases where more introspection into ukey transitions
    would be relevant for logging or assertion. Track the SOURCE_LOCATOR and
    thread id when states are transitioned and use these for logging.
    
    Suggested-by: Jarno Rajahalme <ja...@ovn.org>
    Signed-off-by: Joe Stringer <j...@ovn.org>
    Acked-by: Ben Pfaff <b...@ovn.org>
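
As a rough picture of what tracking the source location and thread id on each
transition can look like, here is a small standalone sketch. The names
(LOC_HERE, struct loc_rec, loc_set_state) are hypothetical and are not the
OVS macros; the thread id is simply passed in by the caller. It produces the
kind of "file:line" string that appears in the "last transitioned from thread
... at ..." messages in the report quoted further down.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Two-step stringification so that __LINE__ expands to its number. */
#define LOC_STR2(x) #x
#define LOC_STR(x) LOC_STR2(x)
#define LOC_HERE __FILE__ ":" LOC_STR(__LINE__)

/* Hypothetical record of a state plus who changed it last, and where. */
struct loc_rec {
    int state;
    const char *where;     /* "file:line" of the last state change. */
    unsigned int thread;   /* Caller-supplied id of the changing thread. */
};

static void
loc_set_state_at(struct loc_rec *rec, int dst, unsigned int thread,
                 const char *where)
{
    assert(rec != NULL && where != NULL && strlen(where) > 0);
    rec->state = dst;
    rec->thread = thread;
    rec->where = where;
}

/* Callers use the macro so the call site is captured automatically. */
#define loc_set_state(rec, dst, thread) \
    loc_set_state_at(rec, dst, thread, LOC_HERE)
```

A later log or assertion message can then print rec->thread and rec->where to
show which call site performed the previous transition.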

commit 1c13aa391bb1404e763919b22f64ab2b3a350ada
Author: Joe Stringer <j...@ovn.org>
Date:   Wed Apr 26 18:03:11 2017 -0700

    revalidator: Avoid assert in transition_ukey().
    
    There is a case where a flow is dumped from the kernel after the ukey is
    already transitioned into an EVICTING/EVICTED/DELETED state, and the
    revalidator thread attempts to shift that into UKEY_OPERATIONAL because
    it was able to dump the flow from the datapath. This resulted in
    triggering the assert in transition_ukey(). Detect this condition and
    skip handling the flow (as it's already on its way out).
    
    Users report:
    > Program terminated with signal SIGABRT, Aborted.
    > raise () from /lib/x86_64-linux-gnu/libc.so.6
    > raise () from /lib/x86_64-linux-gnu/libc.so.6
    > abort () from /lib/x86_64-linux-gnu/libc.so.6
    > ovs_abort_valist
    > vlog_abort_valist
    > vlog_abort
    > ovs_assert_failure
    > transition_ukey (ukey=<optimized out>, dst=<optimized out>)
    >     at ofproto/ofproto-dpif-upcall.c:1674
    > revalidate (revalidator=0x1cb36c8) at ofproto/ofproto-dpif-upcall.c:2324
    > udpif_revalidator (arg=0x1cb36c8) at ofproto/ofproto-dpif-upcall.c:901
    > ovsthread_wrapper (aux_=<optimized out>) at lib/ovs-thread.c:348
    > start_thread () from /lib/x86_64-linux-gnu/libpthread.so.0
    > clone () from /lib/x86_64-linux-gnu/libc.so.6
    
    VMware-BZ: #1857694
    Signed-off-by: Joe Stringer <j...@ovn.org>
    Acked-by: Ben Pfaff <b...@ovn.org>

commit bb494eca7e21d976c617c7983312ae230f847811
Author: Joe Stringer <j...@ovn.org>
Date:   Mon Mar 20 14:08:19 2017 -0700

    ofproto-dpif-upcall: Fix flow setup/delete race.
    
    If a handler thread takes a long time to set up a set of flows, it is
    possible for one of the installed flows to be dumped and scheduled
    for deletion by a revalidator thread before the handler is able to
    transition the ukey into an operational state---Between the
    dpif_operate() above this function and the ukey lock / transition logic
    modified by this patch.
    
    Only transition the ukey for the flow if it wasn't already transitioned
    to a later state by a revalidator thread.
    
    Fixes: 54ebeff4c03d ("upcall: Track ukey states.")
    Reported-by: Paul Blakey <pa...@mellanox.com>
    Signed-off-by: Joe Stringer <j...@ovn.org>
    Tested-by: Paul Blakey <pa...@mellanox.com>
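
The guard described in the commit above ("only transition the ukey ... if it
wasn't already transitioned to a later state") can be modeled with a small
forward-only state machine. This is a sketch, not the OVS code: the enum,
struct fsm_ukey and both functions are hypothetical, and the state list is
reduced. Whether the error branch also needs such a guard, as the quoted
report asks, is not settled by the commits listed here; the sketch includes
one purely as an assumption to show the shape of the check.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Reduced, hypothetical state list; states only ever move forward. */
enum fsm_state {
    FSM_CREATED,
    FSM_OPERATIONAL,
    FSM_EVICTING,
    FSM_EVICTED,
    FSM_DELETED
};

struct fsm_ukey {
    enum fsm_state state;
};

/* Applies a transition only if it does not move backwards.  Returns
 * false where a strict implementation would assert and abort. */
static bool
fsm_transition(struct fsm_ukey *ukey, enum fsm_state dst)
{
    assert(ukey != NULL);
    if (dst < ukey->state) {
        return false;
    }
    ukey->state = dst;
    return true;
}

/* Handler-side step after a flow install attempt.  The caller is assumed
 * to hold the ukey lock.  Another thread may have advanced the state
 * between the install and this call, so both branches check first. */
static void
fsm_handler_finish(struct fsm_ukey *ukey, int install_error)
{
    if (install_error) {
        /* Assumption: guard suggested in the quoted report. */
        if (ukey->state < FSM_EVICTED) {
            fsm_transition(ukey, FSM_EVICTED);
        }
    } else if (ukey->state < FSM_OPERATIONAL) {
        fsm_transition(ukey, FSM_OPERATIONAL);
    }
}
```

In this model a ukey that a revalidator has already moved to FSM_EVICTING or
later is left alone by the handler instead of being pushed backwards.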


On Mon, Nov 26, 2018 at 06:29:31AM +0000, Zhengjingzhou wrote:
> Any ideas? It seems there were several similar issues in earlier versions,
> and several patches fixed them.
> 
> From: Zhengjingzhou
> Sent: November 21, 2018 19:02
> To: 'ovs-discuss@openvswitch.org' <ovs-discuss@openvswitch.org>
> Subject: A coredump caused by the race condition between handler, upcall and revalidation threads
> 
> 
> We've found a coredump in a daily test case (OVS 2.7.0 + DPDK).
> After a deep analysis we found that it dumped core after an OVS service
> restart (with bridges and ports), and that the revalidator and upcall threads
> race on the ukey state. However, the coredump seems hard to reproduce. We
> concluded that one possible sequence is:
> 1. handlerA handles packet1 and prepares to put a flow, then waits (lock).
> 2. handlerB handles another packet2 (same MAC as packet1), finds that the
> device has been deleted, and generates a flow.
> 3. handlerA hits an error (EEXIST, in function flow_put_on_pmd) and prepares
> to transition the ukey state to evicted (in function handle_upcalls), waiting
> to acquire the lock:
>     for (i = 0; i < n_ops; i++) {
>         struct udpif_key *ukey = ops[i].ukey;
> 
>         if (ukey) {
>             ovs_mutex_lock(&ukey->mutex);
>             if (ops[i].dop.error) {
>                 /* Doubt: should a state condition be added here,
>                  * as in the branch below? */
>                 transition_ukey(ukey, UKEY_EVICTED);
>             } else if (ukey->state < UKEY_OPERATIONAL) {
>                 transition_ukey(ukey, UKEY_OPERATIONAL);
>             }
> 
> 4. Revalidator thread C transitions the ukey state to deleted (by expiration
> or other handling) and releases the ukey lock.
> 5. handlerA acquires the ukey lock (step 3), finds that the previous state is
> deleted, and aborts.
> 
> The stack is:
> #0  0x00007fba3737f417 in raise () from /usr/lib64/libc.so.6
> #1  0x00007fba37380b08 in abort () from /usr/lib64/libc.so.6
> #2  0x00000000005746be in ovs_abort_valist (err_no=err_no@entry=0, format=format@entry=0x6b0028 "Invalid ukey transition %d->%d (last transitioned from thread %u at %s)", args=args@entry=0x7fb995628ea0)
>     at lib/util.c:341
> #3  0x000000000057b8b0 in vlog_abort_valist (function=<optimized out>, line=<optimized out>, module_=<optimized out>,
>     message=0x6b0028 "Invalid ukey transition %d->%d (last transitioned from thread %u at %s)", args=args@entry=0x7fb995628ea0) at lib/vlog.c:1229
> #4  0x000000000057b93a in vlog_abort (function=function@entry=0x6b0d90 <__func__.28159> "transition_ukey_at", line=line@entry=1741, module=module@entry=0x9ac220 <this_module>,
>     message=message@entry=0x6b0028 "Invalid ukey transition %d->%d (last transitioned from thread %u at %s)") at lib/vlog.c:1243
> #5  0x000000000049a85d in transition_ukey_at (ukey=ukey@entry=0x7fb97c005120, dst=dst@entry=UKEY_EVICTED, where=where@entry=0x6b0648 "ofproto/ofproto_dpif_upcall.c:1467")
>     at ofproto/ofproto_dpif_upcall.c:1739
> #6  0x000000000049da45 in handle_upcalls (n_upcalls=64, upcalls=0x7fb99564dc70, udpif=<optimized out>) at ofproto/ofproto_dpif_upcall.c:1467
> #7  recv_upcalls (handler=0x3339f90, handler=0x3339f90) at ofproto/ofproto_dpif_upcall.c:887
> #8  0x000000000049dc7a in udpif_upcall_handler (arg=0x3339f90) at ofproto/ofproto_dpif_upcall.c:783
> #9  0x0000000000546e48 in ovsthread_wrapper (aux_=<optimized out>) at lib/ovs_thread.c:682
> #10 0x00007fba38f97e45 in start_thread () from /usr/lib64/libpthread.so.0
> #11 0x00007fba37442afd in clone () from /usr/lib64/libc.so.6
> 
> and the relevant log:
> 2018-11-10T16:03:10.796061+08:00|warning|ovs-vswitchd[18219]|xlate_report_error[647]|00001|ofproto_dpif_xlate(handler16)|: received packet on unknown port 1 while processing udp,in_port=1,vlan_tci=0x0000,dl_src=38:4c:4f:cb:62:5f,dl_dst=38:4c:4f:cb:62:53,nw_src=199.168.1.106,nw_dst=199.168.1.32,nw_tos=0,nw_ecn=0,nw_ttl=64,tp_src=53744,tp_dst=4789 on bridge br-1
> 2018-11-10T16:03:10.796497+08:00|warning|ovs-vswitchd[18219]|xlate_report_error[647]|00002|ofproto_dpif_xlate(handler16)|: received packet on unknown port 1 while processing udp,in_port=1,vlan_tci=0x0000,dl_src=38:4c:4f:cb:62:5f,dl_dst=38:4c:4f:cb:62:53,nw_src=199.168.1.106,nw_dst=199.168.1.32,nw_tos=0,nw_ecn=0,nw_ttl=64,tp_src=53744,tp_dst=4789 on bridge br-1
> 2018-11-10T16:03:10.796871+08:00|warning|ovs-vswitchd[18219]|xlate_report_error[647]|00001|ofproto_dpif_xlate(handler19)|: received packet on unknown port 1 while processing udp,in_port=1,vlan_tci=0x0000,dl_src=38:4c:4f:cb:62:5f,dl_dst=38:4c:4f:cb:62:53,nw_src=199.168.1.106,nw_dst=199.168.1.32,nw_tos=0,nw_ecn=0,nw_ttl=64,tp_src=41088,tp_dst=4789 on bridge br-1
> 2018-11-10T16:03:10.797261+08:00|info|ovs-vswitchd[18219]|recv_upcalls[848]|00002|ofproto_dpif_upcall(handler19)|: received packet on unassociated datapath port 5
> 2018-11-10T16:03:10.797616+08:00|info|ovs-vswitchd[18219]|recv_upcalls[848]|00003|ofproto_dpif_upcall(handler19)|: received packet on unassociated datapath port 5
> 2018-11-10T16:03:10.797833+08:00|info|ovs-vswitchd[18219]|recv_upcalls[848]|00004|ofproto_dpif_upcall(handler19)|: received packet on unassociated datapath port 5
> 2018-11-10T16:03:10.798202+08:00|info|ovs-vswitchd[18219]|recv_upcalls[848]|00005|ofproto_dpif_upcall(handler19)|: received packet on unassociated datapath port 5
> 2018-11-10T16:03:10.798434+08:00|info|ovs-vswitchd[18219]|recv_upcalls[848]|00006|ofproto_dpif_upcall(handler19)|: received packet on unassociated datapath port 5
> 2018-11-10T16:03:10.801329+08:00|info|ovs-vswitchd[18219]|revalidate[2473]|00001|ofproto_dpif_upcall(revalidator20)|: Unexpected ukey transition from state 4 (last transitioned from thread 19 at ofproto/ofproto_dpif_upcall.c:1467)
> 2018-11-10T16:03:10.813474+08:00|alert|ovs-vswitchd[18219]|transition_ukey_at[1741]|00003|ofproto_dpif_upcall(handler16)|: Invalid ukey transition 5->4 (last transitioned from thread 21 at ofproto/ofproto_dpif_upcall.c:1870)
> 
> 

> _______________________________________________
> discuss mailing list
> disc...@openvswitch.org
> https://mail.openvswitch.org/mailman/listinfo/ovs-discuss
