Re: [ovs-dev] OVS will hit an assert if encap(nsh) is done in bucket of group

2018-03-26 Thread Jan Scheurich
Thanks for the confirmation Yi. I will post the fix straight away.
The other fix for double encap() is also ready.

BR, Jan

> -Original Message-
> From: Yang, Yi [mailto:yi.y.y...@intel.com]
> Sent: Monday, 26 March, 2018 03:42
> To: Jan Scheurich 
> Cc: d...@openvswitch.org; Zoltán Balogh 
> Subject: Re: [ovs-dev] OVS will hit an assert if encap(nsh) is done in bucket 
> of group
> 
> I tried the below fix patch you mentioned, it did fix this issue.
> 
> diff --git a/lib/ofp-actions.c b/lib/ofp-actions.c
> index db85716..87797bc 100644
> --- a/lib/ofp-actions.c
> +++ b/lib/ofp-actions.c
> @@ -6985,9 +6985,6 @@ ofpact_is_set_or_move_action(const struct ofpact *a)
>  case OFPACT_SET_TUNNEL:
>  case OFPACT_SET_VLAN_PCP:
>  case OFPACT_SET_VLAN_VID:
> -case OFPACT_ENCAP:
> -case OFPACT_DECAP:
> -case OFPACT_DEC_NSH_TTL:
>  return true;
>  case OFPACT_BUNDLE:
>  case OFPACT_CLEAR_ACTIONS:
> @@ -7025,6 +7022,9 @@ ofpact_is_set_or_move_action(const struct ofpact *a)
>  case OFPACT_WRITE_METADATA:
>  case OFPACT_DEBUG_RECIRC:
>  case OFPACT_DEBUG_SLOW:
> +case OFPACT_ENCAP:
> +case OFPACT_DECAP:
> +case OFPACT_DEC_NSH_TTL:
>  return false;
>  default:
>  OVS_NOT_REACHED();
> 
> On Mon, Mar 26, 2018 at 12:45:46AM +, Yang, Yi Y wrote:
> > Jan, thank you so much, very exhaustive analysis :), I'll double check your 
> > fix patch.
> >
> > From: Jan Scheurich [mailto:jan.scheur...@ericsson.com]
> > Sent: Sunday, March 25, 2018 9:09 AM
> > To: Yang, Yi Y 
> > Cc: d...@openvswitch.org; Zoltán Balogh 
> > Subject: RE: OVS will hit an assert if encap(nsh) is done in bucket of group
> >
> >
> > Hi Yi,
> >
> >
> >
> > Part of the seemingly strange behavior of the encap(nsh) action in a group 
> > is caused by the (often forgotten) fact that group buckets
> do not contain action *lists* but action *sets*. I have no idea why it was 
> defined like this when groups were first introduced in
> OpenFlow 1.1. In my view it was a bad decision and causes a lot of limitation 
> for using groups. But that's the way it is.
> >
> >
> >
> > In action sets there can only be one action of a kind (except for 
> > set_field, where there can be one action per target field). If there
> are multiple actions of the same kind specified, only the last one taken, the 
> earlier  ones ignored.
> >
> >
> >
> > Furthermore, the order of execution of the actions in the action set is not 
> > given by the order in which they are listed but defined by
> the OpenFlow standard (see chapter 5.6 of OF spec 1.5.1). Of course the 
> generic encap() and decap() actions are not standardized yet,
> so the OF spec doesn't specify where to put them in the sequence. We had to 
> implement something that follows the spirit of the
> specification, knowing that whatever we chose may fit some but won't fit many 
> other legitimate use cases.
> >
> >
> >
> > OVS's order is defined in ofpacts_execute_action_set() in ofp-actions.c:
> >
> > OFPACT_STRIP_VLAN
> >
> > OFPACT_POP_MPLS
> >
> > OFPACT_DECAP
> >
> > OFPACT_ENCAP
> >
> > OFPACT_PUSH_MPLS
> >
> > OFPACT_PUSH_VLAN
> >
> > OFPACT_DEC_TTL
> >
> > OFPACT_DEC_MPLS_TTL
> >
> > OFPACT_DEC_NSH_TTL
> >
> > All OFP_ACT SET_FIELD and OFP_ACT_MOVE (target)
> >
> > OFPACT_SET_QUEUE
> >
> >
> >
> > Now, your specific group bucket use case:
> >
> >
> >
> >encap(nsh),set_field:->nsh_xxx,output:vxlan_gpe_port
> >
> >
> >
> > should be a lucky fit and execute as expected, whereas the analogous use 
> > case
> >
> >
> >
> >encap(nsh),set_field:->nsh_xxx,encap(ethernet), output:ethernet_port
> >
> >
> >
> > fails with the error
> >
> >
> >
> >Dropping packet as encap(ethernet) is not supported for packet type 
> > ethernet.
> >
> >
> >
> > because the second encap(ethernet) action replaces the encap(nsh) in the 
> > action set and is executed first on the original received
> Ethernet packet. Boom!
> >
> >
> >
> > So, why does your valid use case cause an assertion failure? It's a 
> > consequence of two faults:
> >
> >
> > 1.  In the conversion of the group bucket's action list to the bucket 
> > action set in ofpacts_execute_action_set() the action list is
> filtered with ofpact_is_set

Re: [ovs-dev] OVS will hit an assert if encap(nsh) is done in bucket of group

2018-03-25 Thread Yang, Yi
I tried the below fix patch you mentioned, it did fix this issue.

diff --git a/lib/ofp-actions.c b/lib/ofp-actions.c
index db85716..87797bc 100644
--- a/lib/ofp-actions.c
+++ b/lib/ofp-actions.c
@@ -6985,9 +6985,6 @@ ofpact_is_set_or_move_action(const struct ofpact *a)
 case OFPACT_SET_TUNNEL:
 case OFPACT_SET_VLAN_PCP:
 case OFPACT_SET_VLAN_VID:
-case OFPACT_ENCAP:
-case OFPACT_DECAP:
-case OFPACT_DEC_NSH_TTL:
 return true;
 case OFPACT_BUNDLE:
 case OFPACT_CLEAR_ACTIONS:
@@ -7025,6 +7022,9 @@ ofpact_is_set_or_move_action(const struct ofpact *a)
 case OFPACT_WRITE_METADATA:
 case OFPACT_DEBUG_RECIRC:
 case OFPACT_DEBUG_SLOW:
+case OFPACT_ENCAP:
+case OFPACT_DECAP:
+case OFPACT_DEC_NSH_TTL:
 return false;
 default:
 OVS_NOT_REACHED();

On Mon, Mar 26, 2018 at 12:45:46AM +, Yang, Yi Y wrote:
> Jan, thank you so much, very exhaustive analysis :), I'll double check your 
> fix patch.
> 
> From: Jan Scheurich [mailto:jan.scheur...@ericsson.com]
> Sent: Sunday, March 25, 2018 9:09 AM
> To: Yang, Yi Y 
> Cc: d...@openvswitch.org; Zoltán Balogh 
> Subject: RE: OVS will hit an assert if encap(nsh) is done in bucket of group
> 
> 
> Hi Yi,
> 
> 
> 
> Part of the seemingly strange behavior of the encap(nsh) action in a group is 
> caused by the (often forgotten) fact that group buckets do not contain action 
> *lists* but action *sets*. I have no idea why it was defined like this when 
> groups were first introduced in OpenFlow 1.1. In my view it was a bad 
> decision and causes a lot of limitation for using groups. But that's the way 
> it is.
> 
> 
> 
> In action sets there can only be one action of a kind (except for set_field, 
> where there can be one action per target field). If there are multiple 
> actions of the same kind specified, only the last one taken, the earlier  
> ones ignored.
> 
> 
> 
> Furthermore, the order of execution of the actions in the action set is not 
> given by the order in which they are listed but defined by the OpenFlow 
> standard (see chapter 5.6 of OF spec 1.5.1). Of course the generic encap() 
> and decap() actions are not standardized yet, so the OF spec doesn't specify 
> where to put them in the sequence. We had to implement something that follows 
> the spirit of the specification, knowing that whatever we chose may fit some 
> but won't fit many other legitimate use cases.
> 
> 
> 
> OVS's order is defined in ofpacts_execute_action_set() in ofp-actions.c:
> 
> OFPACT_STRIP_VLAN
> 
> OFPACT_POP_MPLS
> 
> OFPACT_DECAP
> 
> OFPACT_ENCAP
> 
> OFPACT_PUSH_MPLS
> 
> OFPACT_PUSH_VLAN
> 
> OFPACT_DEC_TTL
> 
> OFPACT_DEC_MPLS_TTL
> 
> OFPACT_DEC_NSH_TTL
> 
> All OFP_ACT SET_FIELD and OFP_ACT_MOVE (target)
> 
> OFPACT_SET_QUEUE
> 
> 
> 
> Now, your specific group bucket use case:
> 
> 
> 
>encap(nsh),set_field:->nsh_xxx,output:vxlan_gpe_port
> 
> 
> 
> should be a lucky fit and execute as expected, whereas the analogous use case
> 
> 
> 
>encap(nsh),set_field:->nsh_xxx,encap(ethernet), output:ethernet_port
> 
> 
> 
> fails with the error
> 
> 
> 
>Dropping packet as encap(ethernet) is not supported for packet type 
> ethernet.
> 
> 
> 
> because the second encap(ethernet) action replaces the encap(nsh) in the 
> action set and is executed first on the original received Ethernet packet. 
> Boom!
> 
> 
> 
> So, why does your valid use case cause an assertion failure? It's a 
> consequence of two faults:
> 
> 
> 1.  In the conversion of the group bucket's action list to the bucket 
> action set in ofpacts_execute_action_set() the action list is filtered with 
> ofpact_is_set_or_move_action() to select the set_field actions. This function 
> incorrectly flagged OFPACT_ENCAP, OFPACT_DECAP and OFPACT_DEC_NSH_TTL as 
> set_field actions. That's why the encap(nsh) action is wrongly copied twice 
> to the action set.
> 
> 2.  The translation of the second encap(nsh) action in the action set 
> doesn't change the packet_type as it is already (1,0x894f). Hence, the 
> commit_packet_type_change() triggered at output to vxlan_gpe port misses to 
> generate a second encap_nsh datapath action. The logic here is obviously not 
> complete to cover the NSH in NSH use case that we intended to support and 
> must be enhanced.
> 
> 
> The commit of the changes to the NSH header in commit_set_nsh_action() then 
> triggers assertion failure because the translation of the second encap(nsh) 
> action did overwrite the original nsh_np (0x3 for Ethernet in NSH) in the 
> flow with 0x4 (for NSH in NSH). Since it is not allowed to modify the nsh_np 
> with set_field this is what triggers the assertion.
> 
> 
> I believe this assertion to be correct. It did detect the combination of the 
> above two faults.
> 
> 
> 
> The solution to 1 is trivial. I'll post a bug fix straight away. That should 
> suffice for your problem.
> 
> The solution to 2 requires a bit more thinking. I will send a fix when I have 

Re: [ovs-dev] OVS will hit an assert if encap(nsh) is done in bucket of group

2018-03-25 Thread Yang, Yi Y
Jan, thank you so much, very exhaustive analysis :), I'll double check your fix 
patch.

From: Jan Scheurich [mailto:jan.scheur...@ericsson.com]
Sent: Sunday, March 25, 2018 9:09 AM
To: Yang, Yi Y 
Cc: d...@openvswitch.org; Zoltán Balogh 
Subject: RE: OVS will hit an assert if encap(nsh) is done in bucket of group


Hi Yi,



Part of the seemingly strange behavior of the encap(nsh) action in a group is 
caused by the (often forgotten) fact that group buckets do not contain action 
*lists* but action *sets*. I have no idea why it was defined like this when 
groups were first introduced in OpenFlow 1.1. In my view it was a bad decision 
and causes a lot of limitation for using groups. But that's the way it is.



In action sets there can only be one action of a kind (except for set_field, 
where there can be one action per target field). If there are multiple actions 
of the same kind specified, only the last one taken, the earlier  ones ignored.



Furthermore, the order of execution of the actions in the action set is not 
given by the order in which they are listed but defined by the OpenFlow 
standard (see chapter 5.6 of OF spec 1.5.1). Of course the generic encap() and 
decap() actions are not standardized yet, so the OF spec doesn't specify where 
to put them in the sequence. We had to implement something that follows the 
spirit of the specification, knowing that whatever we chose may fit some but 
won't fit many other legitimate use cases.



OVS's order is defined in ofpacts_execute_action_set() in ofp-actions.c:

OFPACT_STRIP_VLAN

OFPACT_POP_MPLS

OFPACT_DECAP

OFPACT_ENCAP

OFPACT_PUSH_MPLS

OFPACT_PUSH_VLAN

OFPACT_DEC_TTL

OFPACT_DEC_MPLS_TTL

OFPACT_DEC_NSH_TTL

All OFP_ACT SET_FIELD and OFP_ACT_MOVE (target)

OFPACT_SET_QUEUE



Now, your specific group bucket use case:



   encap(nsh),set_field:->nsh_xxx,output:vxlan_gpe_port



should be a lucky fit and execute as expected, whereas the analogous use case



   encap(nsh),set_field:->nsh_xxx,encap(ethernet), output:ethernet_port



fails with the error



   Dropping packet as encap(ethernet) is not supported for packet type ethernet.



because the second encap(ethernet) action replaces the encap(nsh) in the action 
set and is executed first on the original received Ethernet packet. Boom!



So, why does your valid use case cause an assertion failure? It's a consequence 
of two faults:


1.  In the conversion of the group bucket's action list to the bucket 
action set in ofpacts_execute_action_set() the action list is filtered with 
ofpact_is_set_or_move_action() to select the set_field actions. This function 
incorrectly flagged OFPACT_ENCAP, OFPACT_DECAP and OFPACT_DEC_NSH_TTL as 
set_field actions. That's why the encap(nsh) action is wrongly copied twice to 
the action set.

2.  The translation of the second encap(nsh) action in the action set 
doesn't change the packet_type as it is already (1,0x894f). Hence, the 
commit_packet_type_change() triggered at output to vxlan_gpe port misses to 
generate a second encap_nsh datapath action. The logic here is obviously not 
complete to cover the NSH in NSH use case that we intended to support and must 
be enhanced.


The commit of the changes to the NSH header in commit_set_nsh_action() then 
triggers assertion failure because the translation of the second encap(nsh) 
action did overwrite the original nsh_np (0x3 for Ethernet in NSH) in the flow 
with 0x4 (for NSH in NSH). Since it is not allowed to modify the nsh_np with 
set_field this is what triggers the assertion.


I believe this assertion to be correct. It did detect the combination of the 
above two faults.



The solution to 1 is trivial. I'll post a bug fix straight away. That should 
suffice for your problem.

The solution to 2 requires a bit more thinking. I will send a fix when I have 
found it.



BR, Jan



> -Original Message-

> From: Yang, Yi [mailto:yi.y.y...@intel.com]

> Sent: Friday, 23 March, 2018 08:55

> To: Jan Scheurich 
> mailto:jan.scheur...@ericsson.com>>

> Cc: d...@openvswitch.org; Zoltán Balogh 
> mailto:zoltan.bal...@ericsson.com>>

> Subject: Re: OVS will hit an assert if encap(nsh) is done in bucket of group

>

> On Fri, Mar 23, 2018 at 07:51:45AM +, Jan Scheurich wrote:

> > Hi Yi,

> >

> > Could you please provide the OF pipeline (flows and groups) and an 
> > ofproto/trace command that triggers that fault?

> >

> > Thanks, Jan

>

> Hi, Jan

>

> my br-int has the below ports:

>

>  1(dpdk0): addr:08:00:27:c6:9f:ff

>  config: 0

>  state:  LIVE

>  current:1GB-FD AUTO_NEG

>  speed: 1000 Mbps now, 0 Mbps max

>  2(vxlangpe1): addr:16:04:0c:e5:f1:2c

>  config: 0

>  state:  LIVE

>  speed: 0 Mbps now, 0 Mbps max

>  3(vxlan1): addr:da:1e:fb:2b:c8:63

>  config: 0

>  state:  LIVE

>  speed: 0 Mbps now, 0 Mbps max

>  4(veth-br): addr:92:3d:e0:ab:c2:85

>  config: 0

>  stat

Re: [ovs-dev] OVS will hit an assert if encap(nsh) is done in bucket of group

2018-03-24 Thread Jan Scheurich
Hi Yi,



Part of the seemingly strange behavior of the encap(nsh) action in a group is 
caused by the (often forgotten) fact that group buckets do not contain action 
*lists* but action *sets*. I have no idea why it was defined like this when 
groups were first introduced in OpenFlow 1.1. In my view it was a bad decision 
and causes a lot of limitation for using groups. But that's the way it is.



In action sets there can only be one action of a kind (except for set_field, 
where there can be one action per target field). If there are multiple actions 
of the same kind specified, only the last one taken, the earlier  ones ignored.



Furthermore, the order of execution of the actions in the action set is not 
given by the order in which they are listed but defined by the OpenFlow 
standard (see chapter 5.6 of OF spec 1.5.1). Of course the generic encap() and 
decap() actions are not standardized yet, so the OF spec doesn't specify where 
to put them in the sequence. We had to implement something that follows the 
spirit of the specification, knowing that whatever we chose may fit some but 
won't fit many other legitimate use cases.



OVS's order is defined in ofpacts_execute_action_set() in ofp-actions.c:

OFPACT_STRIP_VLAN

OFPACT_POP_MPLS

OFPACT_DECAP

OFPACT_ENCAP

OFPACT_PUSH_MPLS

OFPACT_PUSH_VLAN

OFPACT_DEC_TTL

OFPACT_DEC_MPLS_TTL

OFPACT_DEC_NSH_TTL

All OFP_ACT SET_FIELD and OFP_ACT_MOVE (target)

OFPACT_SET_QUEUE



Now, your specific group bucket use case:



   encap(nsh),set_field:->nsh_xxx,output:vxlan_gpe_port



should be a lucky fit and execute as expected, whereas the analogous use case



   encap(nsh),set_field:->nsh_xxx,encap(ethernet), output:ethernet_port



fails with the error



   Dropping packet as encap(ethernet) is not supported for packet type ethernet.



because the second encap(ethernet) action replaces the encap(nsh) in the action 
set and is executed first on the original received Ethernet packet. Boom!



So, why does your valid use case cause an assertion failure? It's a consequence 
of two faults:



  1.  In the conversion of the group bucket's action list to the bucket action 
set in ofpacts_execute_action_set() the action list is filtered with 
ofpact_is_set_or_move_action() to select the set_field actions. This function 
incorrectly flagged OFPACT_ENCAP, OFPACT_DECAP and OFPACT_DEC_NSH_TTL as 
set_field actions. That's why the encap(nsh) action is wrongly copied twice to 
the action set.

  2.  The translation of the second encap(nsh) action in the action set doesn't 
change the packet_type as it is already (1,0x894f). Hence, the 
commit_packet_type_change() triggered at output to vxlan_gpe port misses to 
generate a second encap_nsh datapath action. The logic here is obviously not 
complete to cover the NSH in NSH use case that we intended to support and must 
be enhanced.


The commit of the changes to the NSH header in commit_set_nsh_action() then 
triggers assertion failure because the translation of the second encap(nsh) 
action did overwrite the original nsh_np (0x3 for Ethernet in NSH) in the flow 
with 0x4 (for NSH in NSH). Since it is not allowed to modify the nsh_np with 
set_field this is what triggers the assertion.


I believe this assertion to be correct. It did detect the combination of the 
above two faults.



The solution to 1 is trivial. I'll post a bug fix straight away. That should 
suffice for your problem.

The solution to 2 requires a bit more thinking. I will send a fix when I have 
found it.



BR, Jan



> -Original Message-

> From: Yang, Yi [mailto:yi.y.y...@intel.com]

> Sent: Friday, 23 March, 2018 08:55

> To: Jan Scheurich 

> Cc: d...@openvswitch.org; Zoltán Balogh 

> Subject: Re: OVS will hit an assert if encap(nsh) is done in bucket of group

>

> On Fri, Mar 23, 2018 at 07:51:45AM +, Jan Scheurich wrote:

> > Hi Yi,

> >

> > Could you please provide the OF pipeline (flows and groups) and an 
> > ofproto/trace command that triggers that fault?

> >

> > Thanks, Jan

>

> Hi, Jan

>

> my br-int has the below ports:

>

>  1(dpdk0): addr:08:00:27:c6:9f:ff

>  config: 0

>  state:  LIVE

>  current:1GB-FD AUTO_NEG

>  speed: 1000 Mbps now, 0 Mbps max

>  2(vxlangpe1): addr:16:04:0c:e5:f1:2c

>  config: 0

>  state:  LIVE

>  speed: 0 Mbps now, 0 Mbps max

>  3(vxlan1): addr:da:1e:fb:2b:c8:63

>  config: 0

>  state:  LIVE

>  speed: 0 Mbps now, 0 Mbps max

>  4(veth-br): addr:92:3d:e0:ab:c2:85

>  config: 0

>  state:  LIVE

>  current:10GB-FD COPPER

>  speed: 1 Mbps now, 0 Mbps max

>  LOCAL(br-int): addr:08:00:27:c6:9f:ff

>  config: 0

>  state:  LIVE

>  current:10MB-FD COPPER

>  speed: 10 Mbps now, 0 Mbps max

>

> Here are group and flow

>

> ${OFCTL} -Oopenflow13 add-group br-int

> group_id=111,type=all,bucket="load:0xc0a83249->NXM_NX_TUN_IPV4_DST[],load:0x9-

> >NXM_NX_TUN_I

Re: [ovs-dev] OVS will hit an assert if encap(nsh) is done in bucket of group

2018-03-23 Thread Jan Scheurich
Hi Yi,

OK, this is an ALL group with two buckets. xlate_generic_encap_action is 
supposed to be called once for each bucket. 

In theory the translation in the second bucket should start from the same fresh 
base_flow as in the first bucket, as each bucket should process a clone of the 
original packet. Apparently that is not working correctly and the base_flow is 
not properly reset when starting with the second bucket.

I will need to have a closer look at the code to check.

BR, Jan

> -Original Message-
> From: Yang, Yi [mailto:yi.y.y...@intel.com]
> Sent: Friday, 23 March, 2018 08:55
> To: Jan Scheurich 
> Cc: d...@openvswitch.org; Zoltán Balogh 
> Subject: Re: OVS will hit an assert if encap(nsh) is done in bucket of group
> 
> On Fri, Mar 23, 2018 at 07:51:45AM +, Jan Scheurich wrote:
> > Hi Yi,
> >
> > Could you please provide the OF pipeline (flows and groups) and an 
> > ofproto/trace command that triggers that fault?
> >
> > Thanks, Jan
> 
> Hi, Jan
> 
> my br-int has the below ports:
> 
>  1(dpdk0): addr:08:00:27:c6:9f:ff
>  config: 0
>  state:  LIVE
>  current:1GB-FD AUTO_NEG
>  speed: 1000 Mbps now, 0 Mbps max
>  2(vxlangpe1): addr:16:04:0c:e5:f1:2c
>  config: 0
>  state:  LIVE
>  speed: 0 Mbps now, 0 Mbps max
>  3(vxlan1): addr:da:1e:fb:2b:c8:63
>  config: 0
>  state:  LIVE
>  speed: 0 Mbps now, 0 Mbps max
>  4(veth-br): addr:92:3d:e0:ab:c2:85
>  config: 0
>  state:  LIVE
>  current:10GB-FD COPPER
>  speed: 1 Mbps now, 0 Mbps max
>  LOCAL(br-int): addr:08:00:27:c6:9f:ff
>  config: 0
>  state:  LIVE
>  current:10MB-FD COPPER
>  speed: 10 Mbps now, 0 Mbps max
> 
> Here are group and flow
> 
> ${OFCTL} -Oopenflow13 add-group br-int
> group_id=111,type=all,bucket="load:0xc0a83249->NXM_NX_TUN_IPV4_DST[],load:0x9-
> >NXM_NX_TUN_ID[0..31],output:3",bucket="encap(nsh(md_type=1)),set_field:0x80->nsh_flags,set_field:0x3456-
> >nsh_spi,set_field:253->nsh_si,set_field:0x->nsh_c1,set_field:0x->nsh_c2,set_field:0x-
> >nsh_c3,set_field:0x->nsh_c4,load:0xc0a83249->NXM_NX_TUN_IPV4_DST[],load:0x9->NXM_NX_TUN_ID[0..31],output:2"
> ${OFCTL} -Oopenflow13 add-flow br-int in_port=4,icmp,actions=group:111
> 
> Then I run ping by this cmd "sudo ip netns exec app ping -c 1 192.168.2.2", 
> it will hit flow "in_port=4,icmp,actions=group:111", then
> result in this assert.
> 
> >
> > > -Original Message-
> > > From: Yang, Yi [mailto:yi.y.y...@intel.com]
> > > Sent: Friday, 23 March, 2018 04:53
> > > To: d...@openvswitch.org
> > > Cc: Jan Scheurich ; Zoltán Balogh 
> > > 
> > > Subject: OVS will hit an assert if encap(nsh) is done in bucket of group
> > >
> > > Hi, guys
> > >
> > > A NSH user found OVS will hit the below assert in function
> > > commit_set_nsh_action in file lib/odp-util.c if encap(nsh) is done in
> > > bucket of group
> > >
> > > ovs_assert(flow->nsh.mdtype == base_flow->nsh.mdtype &&
> > >flow->nsh.np == base_flow->nsh.np);
> > >
> > > But it isn't an issue if encap(nsh) in actions=.
> > >
> > > I debugged this issue but can't find the root cause, basically
> > > xlate_generic_encap_action is called twice for a packet in different
> > > code path in group bucket use case. one is upcall process, another one
> > > is normal process in two code paths, gdb call stack dump is as followed.
> > >
> > > The second call is obviously based on the result of the first call which
> > > has been committed by xlate_commit_actions in first call
> > > xlate_generic_encap_action, so flow->nsh.np will be set to NSH_P_NSH,
> > > this is wrong, but it can work normally if I comment out the above
> > > assert. I really don't know why xlate_generic_encap_action is called
> > > twice in group bucket use case, so look forward to your insights,
> > > appriciate your feedback heartfeltly.
> > >
> > > (gdb) bt
> > > #0  xlate_generic_encap_action (encap=0x7ffd9458f460,
> > > ctx=0x7ffd94590f30)
> > > at ofproto/ofproto-dpif-xlate.c:5913
> > > #1  do_xlate_actions (ofpacts=, ofpacts_len= > > out>,
> > > ctx=ctx@entry=0x7ffd94590f30,
> > > is_last_action=is_last_action@entry=false)
> > > at ofproto/ofproto-dpif-xlate.c:6499
> > > #2  0x0074add0 in xlate_group_bucket
> > > (ctx=ctx@entry=0x7ffd94590f30,
> > > is_last_action=, bucket=0x2214f90, bucket=0x2214f90)
> > > at ofproto/ofproto-dpif-xlate.c:4090
> > > #3  0x00749ee4 in xlate_all_group (is_last_action=false,
> > > group=0x2214bc0, ctx=0x7ffd94590f30) at
> > > ofproto/ofproto-dpif-xlate.c:4150
> > > #4  xlate_group_action__ (is_last_action=,
> > > group=0x2214bc0,
> > > ctx=0x7ffd94590f30) at ofproto/ofproto-dpif-xlate.c:4304
> > > #5  xlate_group_action (is_last_action=,
> > > group_id=, ctx=0x7ffd94590f30)
> > > at ofproto/ofproto-dpif-xlate.c:4335
> > > #6  do_xlate_actions (ofpacts=ofpacts@entry=0x2212558,
> > > ofpacts

Re: [ovs-dev] OVS will hit an assert if encap(nsh) is done in bucket of group

2018-03-23 Thread Yang, Yi
On Fri, Mar 23, 2018 at 07:51:45AM +, Jan Scheurich wrote:
> Hi Yi,
> 
> Could you please provide the OF pipeline (flows and groups) and an 
> ofproto/trace command that triggers that fault? 
> 
> Thanks, Jan

Hi, Jan

my br-int has the below ports:

 1(dpdk0): addr:08:00:27:c6:9f:ff
 config: 0
 state:  LIVE
 current:1GB-FD AUTO_NEG
 speed: 1000 Mbps now, 0 Mbps max
 2(vxlangpe1): addr:16:04:0c:e5:f1:2c
 config: 0
 state:  LIVE
 speed: 0 Mbps now, 0 Mbps max
 3(vxlan1): addr:da:1e:fb:2b:c8:63
 config: 0
 state:  LIVE
 speed: 0 Mbps now, 0 Mbps max
 4(veth-br): addr:92:3d:e0:ab:c2:85
 config: 0
 state:  LIVE
 current:10GB-FD COPPER
 speed: 1 Mbps now, 0 Mbps max
 LOCAL(br-int): addr:08:00:27:c6:9f:ff
 config: 0
 state:  LIVE
 current:10MB-FD COPPER
 speed: 10 Mbps now, 0 Mbps max

Here are group and flow

${OFCTL} -Oopenflow13 add-group br-int
group_id=111,type=all,bucket="load:0xc0a83249->NXM_NX_TUN_IPV4_DST[],load:0x9->NXM_NX_TUN_ID[0..31],output:3",bucket="encap(nsh(md_type=1)),set_field:0x80->nsh_flags,set_field:0x3456->nsh_spi,set_field:253->nsh_si,set_field:0x->nsh_c1,set_field:0x->nsh_c2,set_field:0x->nsh_c3,set_field:0x->nsh_c4,load:0xc0a83249->NXM_NX_TUN_IPV4_DST[],load:0x9->NXM_NX_TUN_ID[0..31],output:2"
${OFCTL} -Oopenflow13 add-flow br-int in_port=4,icmp,actions=group:111

Then I run ping by this cmd "sudo ip netns exec app ping -c 1 192.168.2.2", it 
will hit flow "in_port=4,icmp,actions=group:111", then result in this assert.

> 
> > -Original Message-
> > From: Yang, Yi [mailto:yi.y.y...@intel.com]
> > Sent: Friday, 23 March, 2018 04:53
> > To: d...@openvswitch.org
> > Cc: Jan Scheurich ; Zoltán Balogh 
> > 
> > Subject: OVS will hit an assert if encap(nsh) is done in bucket of group
> > 
> > Hi, guys
> > 
> > A NSH user found OVS will hit the below assert in function
> > commit_set_nsh_action in file lib/odp-util.c if encap(nsh) is done in
> > bucket of group
> > 
> > ovs_assert(flow->nsh.mdtype == base_flow->nsh.mdtype &&
> >flow->nsh.np == base_flow->nsh.np);
> > 
> > But it isn't an issue if encap(nsh) in actions=.
> > 
> > I debugged this issue but can't find the root cause, basically
> > xlate_generic_encap_action is called twice for a packet in different
> > code path in group bucket use case. one is upcall process, another one
> > is normal process in two code paths, gdb call stack dump is as followed.
> > 
> > The second call is obviously based on the result of the first call which
> > has been committed by xlate_commit_actions in first call
> > xlate_generic_encap_action, so flow->nsh.np will be set to NSH_P_NSH,
> > this is wrong, but it can work normally if I comment out the above
> > assert. I really don't know why xlate_generic_encap_action is called
> > twice in group bucket use case, so look forward to your insights,
> > appriciate your feedback heartfeltly.
> > 
> > (gdb) bt
> > #0  xlate_generic_encap_action (encap=0x7ffd9458f460,
> > ctx=0x7ffd94590f30)
> > at ofproto/ofproto-dpif-xlate.c:5913
> > #1  do_xlate_actions (ofpacts=, ofpacts_len= > out>,
> > ctx=ctx@entry=0x7ffd94590f30,
> > is_last_action=is_last_action@entry=false)
> > at ofproto/ofproto-dpif-xlate.c:6499
> > #2  0x0074add0 in xlate_group_bucket
> > (ctx=ctx@entry=0x7ffd94590f30,
> > is_last_action=, bucket=0x2214f90, bucket=0x2214f90)
> > at ofproto/ofproto-dpif-xlate.c:4090
> > #3  0x00749ee4 in xlate_all_group (is_last_action=false,
> > group=0x2214bc0, ctx=0x7ffd94590f30) at
> > ofproto/ofproto-dpif-xlate.c:4150
> > #4  xlate_group_action__ (is_last_action=,
> > group=0x2214bc0,
> > ctx=0x7ffd94590f30) at ofproto/ofproto-dpif-xlate.c:4304
> > #5  xlate_group_action (is_last_action=,
> > group_id=, ctx=0x7ffd94590f30)
> > at ofproto/ofproto-dpif-xlate.c:4335
> > #6  do_xlate_actions (ofpacts=ofpacts@entry=0x2212558,
> > ofpacts_len=ofpacts_len@entry=8, ctx=ctx@entry=0x7ffd94590f30,
> > is_last_action=is_last_action@entry=true)
> > at ofproto/ofproto-dpif-xlate.c:6177
> > #7  0x00750041 in xlate_actions (xin=xin@entry=0x7ffd945919d0,
> > xout=xout@entry=0x7ffd94591dd0) at ofproto/ofproto-dpif-xlate.c:7090
> > #8  0x00741f00 in upcall_xlate (wc=0x7ffd94592ff8,
> > ---Type  to continue, or q  to quit---
> > odp_actions=0x7ffd945927d0, upcall=0x7ffd94591d70, udpif=0x1805440)
> > at ofproto/ofproto-dpif-upcall.c:1162
> > #9  process_upcall (udpif=udpif@entry=0x1805440,
> > upcall=upcall@entry=0x7ffd94591d70,
> > odp_actions=odp_actions@entry=0x7ffd945927d0,
> > wc=wc@entry=0x7ffd94592ff8)
> > at ofproto/ofproto-dpif-upcall.c:1361
> > #10 0x0074244b in upcall_cb (packet=,
> > flow=0x7ffd94592d60, ufid=, pmd_id=,
> > type=, userdata=,
> > actions=0x7ffd945927d0,
> > wc=0x7ffd94592ff8, put_actions=0x7ffd9

Re: [ovs-dev] OVS will hit an assert if encap(nsh) is done in bucket of group

2018-03-23 Thread Jan Scheurich
Hi Yi,

Could you please provide the OF pipeline (flows and groups) and an 
ofproto/trace command that triggers that fault? 

Thanks, Jan

> -Original Message-
> From: Yang, Yi [mailto:yi.y.y...@intel.com]
> Sent: Friday, 23 March, 2018 04:53
> To: d...@openvswitch.org
> Cc: Jan Scheurich ; Zoltán Balogh 
> 
> Subject: OVS will hit an assert if encap(nsh) is done in bucket of group
> 
> Hi, guys
> 
> A NSH user found OVS will hit the below assert in function
> commit_set_nsh_action in file lib/odp-util.c if encap(nsh) is done in
> bucket of group
> 
> ovs_assert(flow->nsh.mdtype == base_flow->nsh.mdtype &&
>flow->nsh.np == base_flow->nsh.np);
> 
> But it isn't an issue if encap(nsh) in actions=.
> 
> I debugged this issue but can't find the root cause, basically
> xlate_generic_encap_action is called twice for a packet in different
> code path in group bucket use case. one is upcall process, another one
> is normal process in two code paths, gdb call stack dump is as followed.
> 
> The second call is obviously based on the result of the first call which
> has been committed by xlate_commit_actions in first call
> xlate_generic_encap_action, so flow->nsh.np will be set to NSH_P_NSH,
> this is wrong, but it can work normally if I comment out the above
> assert. I really don't know why xlate_generic_encap_action is called
> twice in group bucket use case, so look forward to your insights,
> appriciate your feedback heartfeltly.
> 
> (gdb) bt
> #0  xlate_generic_encap_action (encap=0x7ffd9458f460,
> ctx=0x7ffd94590f30)
> at ofproto/ofproto-dpif-xlate.c:5913
> #1  do_xlate_actions (ofpacts=, ofpacts_len= out>,
> ctx=ctx@entry=0x7ffd94590f30,
> is_last_action=is_last_action@entry=false)
> at ofproto/ofproto-dpif-xlate.c:6499
> #2  0x0074add0 in xlate_group_bucket
> (ctx=ctx@entry=0x7ffd94590f30,
> is_last_action=, bucket=0x2214f90, bucket=0x2214f90)
> at ofproto/ofproto-dpif-xlate.c:4090
> #3  0x00749ee4 in xlate_all_group (is_last_action=false,
> group=0x2214bc0, ctx=0x7ffd94590f30) at
> ofproto/ofproto-dpif-xlate.c:4150
> #4  xlate_group_action__ (is_last_action=,
> group=0x2214bc0,
> ctx=0x7ffd94590f30) at ofproto/ofproto-dpif-xlate.c:4304
> #5  xlate_group_action (is_last_action=,
> group_id=, ctx=0x7ffd94590f30)
> at ofproto/ofproto-dpif-xlate.c:4335
> #6  do_xlate_actions (ofpacts=ofpacts@entry=0x2212558,
> ofpacts_len=ofpacts_len@entry=8, ctx=ctx@entry=0x7ffd94590f30,
> is_last_action=is_last_action@entry=true)
> at ofproto/ofproto-dpif-xlate.c:6177
> #7  0x00750041 in xlate_actions (xin=xin@entry=0x7ffd945919d0,
> xout=xout@entry=0x7ffd94591dd0) at ofproto/ofproto-dpif-xlate.c:7090
> #8  0x00741f00 in upcall_xlate (wc=0x7ffd94592ff8,
> ---Type  to continue, or q  to quit---
> odp_actions=0x7ffd945927d0, upcall=0x7ffd94591d70, udpif=0x1805440)
> at ofproto/ofproto-dpif-upcall.c:1162
> #9  process_upcall (udpif=udpif@entry=0x1805440,
> upcall=upcall@entry=0x7ffd94591d70,
> odp_actions=odp_actions@entry=0x7ffd945927d0,
> wc=wc@entry=0x7ffd94592ff8)
> at ofproto/ofproto-dpif-upcall.c:1361
> #10 0x0074244b in upcall_cb (packet=,
> flow=0x7ffd94592d60, ufid=, pmd_id=,
> type=, userdata=,
> actions=0x7ffd945927d0,
> wc=0x7ffd94592ff8, put_actions=0x7ffd94592810, aux=0x1805440)
> at ofproto/ofproto-dpif-upcall.c:1263
> #11 0x0076b2d6 in dp_netdev_upcall
> (packet_=packet_@entry=0x2211680,
> flow=flow@entry=0x7ffd94592d60, wc=wc@entry=0x7ffd94592ff8,
> ufid=ufid@entry=0x7ffd945927b0, type=type@entry=DPIF_UC_MISS,
> userdata=userdata@entry=0x0, actions=actions@entry=0x7ffd945927d0,
> put_actions=put_actions@entry=0x7ffd94592810, pmd=,
> pmd=) at lib/dpif-netdev.c:4868
> #12 0x007725fd in handle_packet_upcall
> (put_actions=0x7ffd94592810,
> actions=0x7ffd945927d0, key=0x7ffd94593c40, packet=0x2211680,
> pmd=0x18a3e90) at lib/dpif-netdev.c:5079
> #13 fast_path_processing (pmd=pmd@entry=0x18a3e90,
> packets_=packets_@entry=0x7ffd94593fa0,
> keys=keys@entry=0x7ffd94593c40,
> ---Type  to continue, or q  to quit---
> batches=batches@entry=0x7ffd94593ae0,
> n_batches=n_batches@entry=0x7ffd94593f28, in_port=)
> at lib/dpif-netdev.c:5187
> #14 0x00772ea8 in dp_netdev_input__ (pmd=pmd@entry=0x18a3e90,
> packets=packets@entry=0x7ffd94593fa0,
> md_is_valid=md_is_valid@entry=false, port_no=port_no@entry=5)
> at lib/dpif-netdev.c:5259
> #15 0x0077310d in dp_netdev_input (port_no=5,
> packets=0x7ffd94593fa0,
> pmd=0x18a3e90) at lib/dpif-netdev.c:5287
> #16 dp_netdev_process_rxq_port (pmd=pmd@entry=0x18a3e90, rxq=0x1d4f2a0,
> port_no=5) at lib/dpif-netdev.c:3286
> #17 0x00773b02 in dpif_netdev_run (dpif=)
> at lib/dpif-netdev.c:3940
> #18 0x00733a18 in type_run (type=)
> at ofproto/ofproto-dpif.c:342
> #19 0x0071f9cf in ofproto_type_run (datapath_typ

[ovs-dev] OVS will hit an assert if encap(nsh) is done in bucket of group

2018-03-22 Thread Yang, Yi
Hi, guys

A NSH user found OVS will hit the below assert in function
commit_set_nsh_action in file lib/odp-util.c if encap(nsh) is done in
bucket of group

ovs_assert(flow->nsh.mdtype == base_flow->nsh.mdtype &&
   flow->nsh.np == base_flow->nsh.np);

But it isn't an issue if encap(nsh) in actions=.

I debugged this issue but can't find the root cause, basically
xlate_generic_encap_action is called twice for a packet in different
code path in group bucket use case. one is upcall process, another one
is normal process in two code paths, gdb call stack dump is as followed.

The second call is obviously based on the result of the first call which
has been committed by xlate_commit_actions in first call
xlate_generic_encap_action, so flow->nsh.np will be set to NSH_P_NSH,
this is wrong, but it can work normally if I comment out the above
assert. I really don't know why xlate_generic_encap_action is called
twice in group bucket use case, so look forward to your insights,
appriciate your feedback heartfeltly.

(gdb) bt
#0  xlate_generic_encap_action (encap=0x7ffd9458f460,
ctx=0x7ffd94590f30)
at ofproto/ofproto-dpif-xlate.c:5913
#1  do_xlate_actions (ofpacts=, ofpacts_len=,
ctx=ctx@entry=0x7ffd94590f30,
is_last_action=is_last_action@entry=false)
at ofproto/ofproto-dpif-xlate.c:6499
#2  0x0074add0 in xlate_group_bucket
(ctx=ctx@entry=0x7ffd94590f30,
is_last_action=, bucket=0x2214f90, bucket=0x2214f90)
at ofproto/ofproto-dpif-xlate.c:4090
#3  0x00749ee4 in xlate_all_group (is_last_action=false,
group=0x2214bc0, ctx=0x7ffd94590f30) at
ofproto/ofproto-dpif-xlate.c:4150
#4  xlate_group_action__ (is_last_action=,
group=0x2214bc0,
ctx=0x7ffd94590f30) at ofproto/ofproto-dpif-xlate.c:4304
#5  xlate_group_action (is_last_action=,
group_id=, ctx=0x7ffd94590f30)
at ofproto/ofproto-dpif-xlate.c:4335
#6  do_xlate_actions (ofpacts=ofpacts@entry=0x2212558,
ofpacts_len=ofpacts_len@entry=8, ctx=ctx@entry=0x7ffd94590f30,
is_last_action=is_last_action@entry=true)
at ofproto/ofproto-dpif-xlate.c:6177
#7  0x00750041 in xlate_actions (xin=xin@entry=0x7ffd945919d0,
xout=xout@entry=0x7ffd94591dd0) at ofproto/ofproto-dpif-xlate.c:7090
#8  0x00741f00 in upcall_xlate (wc=0x7ffd94592ff8,
---Type  to continue, or q  to quit---
odp_actions=0x7ffd945927d0, upcall=0x7ffd94591d70, udpif=0x1805440)
at ofproto/ofproto-dpif-upcall.c:1162
#9  process_upcall (udpif=udpif@entry=0x1805440,
upcall=upcall@entry=0x7ffd94591d70,
odp_actions=odp_actions@entry=0x7ffd945927d0,
wc=wc@entry=0x7ffd94592ff8)
at ofproto/ofproto-dpif-upcall.c:1361
#10 0x0074244b in upcall_cb (packet=,
flow=0x7ffd94592d60, ufid=, pmd_id=,
type=, userdata=,
actions=0x7ffd945927d0,
wc=0x7ffd94592ff8, put_actions=0x7ffd94592810, aux=0x1805440)
at ofproto/ofproto-dpif-upcall.c:1263
#11 0x0076b2d6 in dp_netdev_upcall
(packet_=packet_@entry=0x2211680,
flow=flow@entry=0x7ffd94592d60, wc=wc@entry=0x7ffd94592ff8,
ufid=ufid@entry=0x7ffd945927b0, type=type@entry=DPIF_UC_MISS,
userdata=userdata@entry=0x0, actions=actions@entry=0x7ffd945927d0,
put_actions=put_actions@entry=0x7ffd94592810, pmd=,
pmd=) at lib/dpif-netdev.c:4868
#12 0x007725fd in handle_packet_upcall
(put_actions=0x7ffd94592810,
actions=0x7ffd945927d0, key=0x7ffd94593c40, packet=0x2211680,
pmd=0x18a3e90) at lib/dpif-netdev.c:5079
#13 fast_path_processing (pmd=pmd@entry=0x18a3e90,
packets_=packets_@entry=0x7ffd94593fa0,
keys=keys@entry=0x7ffd94593c40,
---Type  to continue, or q  to quit---
batches=batches@entry=0x7ffd94593ae0,
n_batches=n_batches@entry=0x7ffd94593f28, in_port=)
at lib/dpif-netdev.c:5187
#14 0x00772ea8 in dp_netdev_input__ (pmd=pmd@entry=0x18a3e90,
packets=packets@entry=0x7ffd94593fa0,
md_is_valid=md_is_valid@entry=false, port_no=port_no@entry=5)
at lib/dpif-netdev.c:5259
#15 0x0077310d in dp_netdev_input (port_no=5,
packets=0x7ffd94593fa0,
pmd=0x18a3e90) at lib/dpif-netdev.c:5287
#16 dp_netdev_process_rxq_port (pmd=pmd@entry=0x18a3e90, rxq=0x1d4f2a0,
port_no=5) at lib/dpif-netdev.c:3286
#17 0x00773b02 in dpif_netdev_run (dpif=)
at lib/dpif-netdev.c:3940
#18 0x00733a18 in type_run (type=)
at ofproto/ofproto-dpif.c:342
#19 0x0071f9cf in ofproto_type_run (datapath_type=,
datapath_type@entry=0x1d50ab0 "netdev") at ofproto/ofproto.c:1707
#20 0x0070f955 in bridge_run__ () at vswitchd/bridge.c:2931
#21 0x007153c8 in bridge_run () at vswitchd/bridge.c:2995
#22 0x00415485 in main (argc=5, argv=0x7ffd94594568)
at vswitchd/ovs-vswitchd.c:120
(gdb) bt
#0  xlate_generic_encap_action (encap=0x7ffd9458f478,
ctx=0x7ffd94590f30)
at ofproto/ofproto-dpif-xlate.c:5913
#1  do_xlate_actions (ofpacts=, ofpacts_len=,
ctx=ctx@entry=0x7ffd94590f30,
is_last_action=is_last_action@entry=false)
at ofproto/ofproto-dpif-xlate.c:6499
#2  0