Hi Ben,

I’ve included the traces for an ARP request, and a ping. ovs-vswitchd also logs 
errors for the return traffic, but I didn’t include that as it seems redundant.

root@ip-172-31-2-45:/# ovs-appctl ofproto/trace kelda-int 
'arp,in_port=1,vlan_tci=0x0000,dl_src=02:00:0a:c5
:34:1e,dl_dst=ff:ff:ff:ff:ff:ff,arp_spa=10.197.52.30,arp_tpa=10.203.4.66,arp_op=1,arp_sha=02:00:0a:c5:34:1e
,arp_tha=ff:ff:ff:ff:ff:ff'
Flow: 
arp,in_port=1,vlan_tci=0x0000,dl_src=02:00:0a:c5:34:1e,dl_dst=ff:ff:ff:ff:ff:ff,arp_spa=10.197.52.30,arp_tpa=10.203.4.66,arp_op=1,arp_sha=02:00:0a:c5:34:1e,arp_tha=ff:ff:ff:ff:ff:ff

bridge("kelda-int")
-------------------
 0. in_port=1,dl_src=02:00:0a:c5:34:1e, priority 32768
    load:0x2->NXM_NX_REG0[]
    resubmit(,1)
 1. arp,dl_dst=ff:ff:ff:ff:ff:ff, priority 1000
    LOCAL
    output:NXM_NX_REG0[]
     -> output port is 2
     >>>> Cannot truncate output to patch port <<<<

bridge("br-int")
----------------
 0. in_port=1, priority 100
    set_field:0x1->reg13
    set_field:0x3->reg11
    set_field:0x2->reg12
    set_field:0x1->metadata
    set_field:0x1->reg14
    resubmit(,8)
 8. reg14=0x1,metadata=0x1, priority 50, cookie 0x1d48851d
    resubmit(,9)
 9. metadata=0x1, priority 0, cookie 0xb38c4d37
    resubmit(,10)
10. metadata=0x1, priority 0, cookie 0xa1f73f3a
    resubmit(,11)
11. metadata=0x1, priority 0, cookie 0x8515cdd5
    resubmit(,12)
12. metadata=0x1, priority 0, cookie 0x1163ca42
    resubmit(,13)
13. metadata=0x1, priority 0, cookie 0xbbb6e5e5
    resubmit(,14)
14. metadata=0x1, priority 0, cookie 0x98e3172d
    resubmit(,15)
15. metadata=0x1, priority 0, cookie 0x10496832
    resubmit(,16)
16. metadata=0x1, priority 0, cookie 0xe88c85a
    resubmit(,17)
17. metadata=0x1, priority 0, cookie 0x6516996b
    resubmit(,18)
18. arp,metadata=0x1,arp_tpa=10.203.4.66,arp_op=1, priority 50, cookie 
0xb33c89ed
    move:NXM_OF_ETH_SRC[]->NXM_OF_ETH_DST[]
     -> NXM_OF_ETH_DST[] is now 02:00:0a:c5:34:1e
    set_field:02:00:0a:cb:04:42->eth_src
    set_field:2->arp_op
    move:NXM_NX_ARP_SHA[]->NXM_NX_ARP_THA[]
     -> NXM_NX_ARP_THA[] is now 02:00:0a:c5:34:1e
    set_field:02:00:0a:cb:04:42->arp_sha
    move:NXM_OF_ARP_SPA[]->NXM_OF_ARP_TPA[]
     -> NXM_OF_ARP_TPA[] is now 10.197.52.30
    set_field:10.203.4.66->arp_spa
    move:NXM_NX_REG14[]->NXM_NX_REG15[]
     -> NXM_NX_REG15[] is now 0x1
    load:0x1->NXM_NX_REG10[0]
    resubmit(,32)
32. priority 0
    resubmit(,33)
33. reg15=0x1,metadata=0x1, priority 100
    set_field:0x1->reg13
    set_field:0x3->reg11
    set_field:0x2->reg12
    resubmit(,34)
34. priority 0
    set_field:0->reg0
    set_field:0->reg1
    set_field:0->reg2
    set_field:0->reg3
    set_field:0->reg4
    set_field:0->reg5
    set_field:0->reg6
    set_field:0->reg7
    set_field:0->reg8
    set_field:0->reg9
    resubmit(,40)
40. metadata=0x1, priority 0, cookie 0x69ea0511
    resubmit(,41)
41. metadata=0x1, priority 0, cookie 0xe98a052d
    resubmit(,42)
42. metadata=0x1, priority 0, cookie 0xb9ce92f6
    resubmit(,43)
43. metadata=0x1, priority 0, cookie 0x95c9e574
    resubmit(,44)
44. metadata=0x1, priority 0, cookie 0xd46c593f
    resubmit(,45)
45. metadata=0x1, priority 0, cookie 0x2b14b7f8
    resubmit(,46)
46. metadata=0x1, priority 0, cookie 0xacf215be
    resubmit(,47)
47. metadata=0x1, priority 0, cookie 0x390a9bed
    resubmit(,48)
48. reg15=0x1,metadata=0x1, priority 50, cookie 0x9c81921d
    resubmit(,64)
64. reg10=0x1/0x1,reg15=0x1,metadata=0x1, priority 100
    push:NXM_OF_IN_PORT[]
    set_field:0->in_port
    resubmit(,65)
    65. reg15=0x1,metadata=0x1, priority 100
            output:1

        bridge("kelda-int")
        -------------------
         0. in_port=2, priority 32768
            output:1
    pop:NXM_OF_IN_PORT[]
     -> NXM_OF_IN_PORT[] is now 1

Final flow: 
arp,reg0=0x2,in_port=1,vlan_tci=0x0000,dl_src=02:00:0a:c5:34:1e,dl_dst=ff:ff:ff:ff:ff:ff,arp_spa=10.197.52.30,arp_tpa=10.203.4.66,arp_op=1,arp_sha=02:00:0a:c5:34:1e,arp_tha=ff:ff:ff:ff:ff:ff
Megaflow: 
recirc_id=0,ct_state=-new-est-rel-rpl-inv-trk,ct_label=0/0x1,eth,arp,in_port=1,vlan_tci=0x0000/0x1000,dl_src=02:00:0a:c5:34:1e,dl_dst=ff:ff:ff:ff:ff:ff,arp_spa=10.197.52.30,arp_tpa=10.203.4.66,arp_op=1,arp_sha=02:00:0a:c5:34:1e,arp_tha=ff:ff:ff:ff:ff:ff
Datapath actions: 
1,set(eth(src=02:00:0a:cb:04:42,dst=02:00:0a:c5:34:1e)),set(arp(sip=10.203.4.66,tip=10.197.52.30,op=2/0xff,sha=02:00:0a:cb:04:42,tha=02:00:0a:c5:34:1e)),3
This flow is handled by the userspace slow path because it:
        - Uses action(s) not supported by datapath.









root@ip-172-31-2-45:/# ovs-appctl ofproto/trace kelda-int 
'icmp,in_port=1,vlan_tci=0x0000,dl_src=02:00:0a:c5:34:1e,dl_dst=02:00:0a:cb:04:42,nw_src=10.197.52.30,nw_dst=10.203.4.66,nw_tos=0,nw_ecn=0,nw_ttl=64,icmp_type=8,icmp_code=0'
Flow: 
icmp,in_port=1,vlan_tci=0x0000,dl_src=02:00:0a:c5:34:1e,dl_dst=02:00:0a:cb:04:42,nw_src=10.197.52.30,nw_dst=10.203.4.66,nw_tos=0,nw_ecn=0,nw_ttl=64,icmp_type=8,icmp_code=0

bridge("kelda-int")
-------------------
 0. in_port=1,dl_src=02:00:0a:c5:34:1e, priority 32768
    load:0x2->NXM_NX_REG0[]
    resubmit(,1)
 1. priority 800
    output:NXM_NX_REG0[]
     -> output port is 2
     >>>> Cannot truncate output to patch port <<<<

bridge("br-int")
----------------
 0. in_port=1, priority 100
    set_field:0x1->reg13
    set_field:0x3->reg11
    set_field:0x2->reg12
    set_field:0x1->metadata
    set_field:0x1->reg14
    resubmit(,8)
 8. reg14=0x1,metadata=0x1, priority 50, cookie 0x1d48851d
    resubmit(,9)
 9. metadata=0x1, priority 0, cookie 0xb38c4d37
    resubmit(,10)
10. metadata=0x1, priority 0, cookie 0xa1f73f3a
    resubmit(,11)
11. ip,metadata=0x1, priority 100, cookie 0xca11a09b
    load:0x1->NXM_NX_XXREG0[96]
    resubmit(,12)
12. metadata=0x1, priority 0, cookie 0x1163ca42
    resubmit(,13)
13. ip,reg0=0x1/0x1,metadata=0x1, priority 100, cookie 0x3790c287
    ct(table=14,zone=NXM_NX_REG13[0..15])
    drop
     -> A clone of the packet is forked to recirculate. The forked pipeline 
will be resumed at table 14.

Final flow: 
icmp,reg0=0x2,in_port=1,vlan_tci=0x0000,dl_src=02:00:0a:c5:34:1e,dl_dst=02:00:0a:cb:04:42,nw_src=10.197.52.30,nw_dst=10.203.4.66,nw_tos=0,nw_ecn=0,nw_ttl=64,icmp_type=8,icmp_code=0
Megaflow: 
recirc_id=0,eth,ip,in_port=1,vlan_tci=0x0000/0x1000,dl_src=02:00:0a:c5:34:1e,dl_dst=02:00:0a:cb:04:42,nw_frag=no
Datapath actions: ct(zone=1),recirc(0xd)

===============================================================================
recirc(0xd) - resume conntrack with default ct_state=trk|new (use --ct-next to 
customize)
===============================================================================

Flow: 
recirc_id=0xd,ct_state=new|trk,ct_zone=1,eth,icmp,reg0=0x1,reg11=0x3,reg12=0x2,reg13=0x1,reg14=0x1,metadata=0x1,in_port=1,vlan_tci=0x0000,dl_src=02:00:0a:c5:34:1e,dl_dst=02:00:0a:cb:04:42,nw_src=10.197.52.30,nw_dst=10.203.4.66,nw_tos=0,nw_ecn=0,nw_ttl=64,icmp_type=8,icmp_code=0

bridge("kelda-int")
-------------------
    thaw
        Resuming from table 14
14. 
ct_state=+new-est+trk,icmp,metadata=0x1,nw_src=10.197.52.30,nw_dst=10.203.4.66, 
priority 1001, cookie 0x43c716ed
    load:0x1->NXM_NX_XXREG0[97]
    resubmit(,15)
15. metadata=0x1, priority 0, cookie 0x10496832
    resubmit(,16)
16. metadata=0x1, priority 0, cookie 0xe88c85a
    resubmit(,17)
17. ip,reg0=0x2/0x2,metadata=0x1, priority 100, cookie 0x1294b35
    ct(commit,zone=NXM_NX_REG13[0..15],exec(load:0->NXM_NX_CT_LABEL[0]))
    load:0->NXM_NX_CT_LABEL[0]
    resubmit(,18)
18. metadata=0x1, priority 0, cookie 0xd7f52407
    resubmit(,19)
19. metadata=0x1, priority 0, cookie 0x81041aca
    resubmit(,20)
20. metadata=0x1, priority 0, cookie 0x2ba72f51
    resubmit(,21)
21. metadata=0x1, priority 0, cookie 0x3acfedb2
    resubmit(,22)
22. metadata=0x1, priority 0, cookie 0xf4a7aa32
    resubmit(,23)
23. metadata=0x1,dl_dst=02:00:0a:cb:04:42, priority 50, cookie 0x5d36571c
    set_field:0x3->reg15
    resubmit(,32)
32. priority 0
    resubmit(,33)
33. reg15=0x3,metadata=0x1, priority 100
    set_field:0x6->reg13
    set_field:0x3->reg11
    set_field:0x2->reg12
    resubmit(,34)
34. priority 0
    set_field:0->reg0
    set_field:0->reg1
    set_field:0->reg2
    set_field:0->reg3
    set_field:0->reg4
    set_field:0->reg5
    set_field:0->reg6
    set_field:0->reg7
    set_field:0->reg8
    set_field:0->reg9
    resubmit(,40)
40. metadata=0x1, priority 0, cookie 0x69ea0511
    resubmit(,41)
41. ip,metadata=0x1, priority 100, cookie 0x6f45d6f5
    load:0x1->NXM_NX_XXREG0[96]
    resubmit(,42)
42. ip,reg0=0x1/0x1,metadata=0x1, priority 100, cookie 0x5da27571
    ct(table=43,zone=NXM_NX_REG13[0..15])
    drop
     -> A clone of the packet is forked to recirculate. The forked pipeline 
will be resumed at table 43.

Final flow: 
recirc_id=0xd,eth,icmp,reg0=0x1,reg11=0x3,reg12=0x2,reg13=0x6,reg14=0x1,reg15=0x3,metadata=0x1,in_port=1,vlan_tci=0x0000,dl_src=02:00:0a:c5:34:1e,dl_dst=02:00:0a:cb:04:42,nw_src=10.197.52.30,nw_dst=10.203.4.66,nw_tos=0,nw_ecn=0,nw_ttl=64,icmp_type=8,icmp_code=0
Megaflow: 
recirc_id=0xd,ct_state=+new-est-rel-rpl-inv+trk,ct_label=0/0x1,eth,icmp,in_port=1,dl_dst=02:00:0a:cb:04:42,nw_src=10.197.52.30,nw_dst=10.203.4.66,nw_frag=no
Datapath actions: ct(commit,zone=1,label=0/0x1),ct(zone=6),recirc(0xe)

===============================================================================
recirc(0xe) - resume conntrack with default ct_state=trk|new (use --ct-next to 
customize)
===============================================================================

Flow: 
recirc_id=0xe,ct_state=new|trk,ct_zone=6,eth,icmp,reg0=0x1,reg11=0x3,reg12=0x2,reg13=0x6,reg14=0x1,reg15=0x3,metadata=0x1,in_port=1,vlan_tci=0x0000,dl_src=02:00:0a:c5:34:1e,dl_dst=02:00:0a:cb:04:42,nw_src=10.197.52.30,nw_dst=10.203.4.66,nw_tos=0,nw_ecn=0,nw_ttl=64,icmp_type=8,icmp_code=0

bridge("kelda-int")
-------------------
    thaw
        Resuming from table 43
43. metadata=0x1, priority 0, cookie 0x95c9e574
    resubmit(,44)
44. 
ct_state=+new-est+trk,icmp,metadata=0x1,nw_src=10.197.52.30,nw_dst=10.203.4.66, 
priority 1001, cookie 0x6468c276
    load:0x1->NXM_NX_XXREG0[97]
    resubmit(,45)
45. metadata=0x1, priority 0, cookie 0x2b14b7f8
    resubmit(,46)
46. ip,reg0=0x2/0x2,metadata=0x1, priority 100, cookie 0xac371eb4
    ct(commit,zone=NXM_NX_REG13[0..15],exec(load:0->NXM_NX_CT_LABEL[0]))
    load:0->NXM_NX_CT_LABEL[0]
    resubmit(,47)
47. metadata=0x1, priority 0, cookie 0x390a9bed
    resubmit(,48)
48. reg15=0x3,metadata=0x1, priority 50, cookie 0x8ebb8a35
    resubmit(,64)
64. priority 0
    resubmit(,65)
65. reg15=0x3,metadata=0x1, priority 100
    output:2

bridge("kelda-int")
-------------------
 0. in_port=4, priority 32768
    output:3

Final flow: 
recirc_id=0xe,eth,icmp,reg0=0x3,reg11=0x3,reg12=0x2,reg13=0x6,reg14=0x1,reg15=0x3,metadata=0x1,in_port=1,vlan_tci=0x0000,dl_src=02:00:0a:c5:34:1e,dl_dst=02:00:0a:cb:04:42,nw_src=10.197.52.30,nw_dst=10.203.4.66,nw_tos=0,nw_ecn=0,nw_ttl=64,icmp_type=8,icmp_code=0
Megaflow: 
recirc_id=0xe,ct_state=+new-est-rel-rpl-inv+trk,ct_label=0/0x1,eth,icmp,in_port=1,dl_dst=00:00:00:00:00:00/01:00:00:00:00:00,nw_src=10.197.52.30,nw_dst=10.203.4.66,nw_frag=no
Datapath actions: ct(commit,zone=6,label=0/0x1),4

—Kevin
> On Dec 7, 2017, at 1:03 PM, Ben Pfaff <b...@ovn.org> wrote:
> 
> On Thu, Dec 07, 2017 at 09:26:14AM -0800, Kevin Lin wrote:
>> Hi,
>> 
>> I work on Kelda (kelda.io <http://kelda.io/>) with Ethan Jackson. We run a 
>> containerized, distributed version of OVN. The master branch of 
>> openvswitch/ovs (commit 07754b23ee5027508d64804d445e617b017cc2d1) fails with 
>> the following assertion in ovs-vswitchd:
>> 
>> ovs-vswitchd(handler2): ofproto/ofproto-dpif-xlate.c:3704: assertion 
>> !truncate failed in compose_output_action__()
>> 
>> whenever we try to use the OVN network. A little background on our setup:
>> We’re a container orchestrator that uses OVN for the container network.
>> One machine in our cluster runs ovn-northd and ovsdb-server. The network is 
>> mostly configured from here (creating the logical ports, creating ACLs etc).
>> Another machine runs ovn-controller, ovs-vswitchd, and ovsdb-server. We 
>> install some container-specific OpenFlow rules by connecting directly to 
>> ovs-vswitchd, and of course ovs-vswitchd also receives rules from OVN.
>> ovs-vswitchd does not crash immediately after the rules are installed. But 
>> it crashes as soon as the network is used (e.g. a ping from one container to 
>> another).
>> 
>> The commit before the commit that introduced the assertion works for us 
>> (https://github.com/openvswitch/ovs/commit/48f704f4768d13f85252bac4f93c8d45d8ab3eea
>>  
>> <https://github.com/openvswitch/ovs/commit/48f704f4768d13f85252bac4f93c8d45d8ab3eea>).
>> 
>> I’ve attached the ovs-vswitchd logs. I’m not sure how helpful the output of 
>> ovs-bugtool will be given our containerized setup, but I’ve also attached 
>> the output of running that from within the ovs-vswitchd container from 
>> before and after the crash. Note, because the ovs-vswitchd container 
>> crashed, the “after” tarball was generated after restarting the container, 
>> so I’m not sure if any of the commands it ran actually succeeded.
>> 
>> The crash is trivial for me to reproduce, so please let me know if there’s 
>> anymore information I can give you.
> 
> Thank you for the report.
> 
> I don't see a good reason that this should be a condition that kills
> ovs-vswitchd.  I think that it will be both easier to debug and less
> inherently harmful if we change it to an error message.  I sent out a
> patch that does that:
>        https://patchwork.ozlabs.org/patch/845845/
> 
> It would be great if you could apply the patch and then try to track
> down the activity that triggers the error.  "ofproto/trace" is the best
> way to do that, if you can find the right packet or flow, because it
> will give us all the details on how the problem gets triggered.
> 
> Thanks,
> 
> Ben.

_______________________________________________
discuss mailing list
disc...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-discuss

Reply via email to