Re: [ovs-dev] [PATCH] netdev: Padding runt packet on VXLAN and DPDK ports.

2024-05-22 Thread Ilya Maximets
On 5/21/24 13:29, Kumar, Rohit wrote:
> Hi Ilya,
> 
> Thanks for your comments.
> 
> Yes, it's the hardware switch that doesn't do zero padding. Unfortunately,
> the switch vendor claims that the padding is added after the FCS of the
> original packet, and there is no way around it. And if it's a requirement
> to zero pad, then RFC 1042 should use the word "must".  As for the "valid
> reason in special circumstances", the reason is that generating zero padding
> would have greatly complicated their switch architecture, resulting in
> additional power consumption and latency. So OVS has to do it.

This is not a good argument for OVS to do that.  We can't work around every
possible thing hardware vendors implement differently.  Adding extra padding
to tunnel packets, besides the performance impact, will break communication with
older Linux kernels, so we can't do that.  Also, as I already said, this will
introduce a difference between the userspace and kernel datapaths, i.e. your
setup will still not work if you use Linux kernel tunnels for encapsulation.
The same is likely true for other operating systems.

> 
> For the other part, where zero padding is to be added to the DPDK interface,
> no PMD except a few (Atomic Rules, Broadcom, HiSilicon PMDs) do this in the
> driver, and we use Intel, so that's the reason for doing it in OVS.

Most modern NICs support padding on the hardware level.  Intel supports it
by setting the IXGBE_HLREG0_TXPADEN bit on ixgbe cards, for example.  So, even
if the padding is not performed in software, it doesn't mean it's not performed.
Doing this in software would be a waste of CPU resources in most cases.

Best regards, Ilya Maximets.

> 
> Regards,
> Rohit Kumar
> 
> *From:* Ilya Maximets 
> *Sent:* Monday, May 20, 2024 11:30 PM
> *To:* Kumar, Rohit ; d...@openvswitch.org 
> 
> *Cc:* i.maxim...@ovn.org 
> *Subject:* Re: [ovs-dev] [PATCH] netdev: Padding runt packet on VXLAN and 
> DPDK ports.
>  
> 
> On 5/18/24 00:55, Kumar, Rohit via dev wrote:
>> The basic idea behind this patch is to pad the runt packet before vxlan
>> encapsulation and on a DPDK port.
>>
>> topology:
>> IXIA <> SWITCH <--vxlan--> OVS <--dpdk--> HOST
>>
>> The host sends a runt packet (54 bytes) with a size less than 64 bytes
>> to the OVS. The OVS receives this runt packet and sends it further down
>> the VXLAN tunnel (original packet 54 + 50 VXLAN = 104 B total packet size)
>> to a physical switch. At the switch, after decapsulation, the original
>> packet size is 54B and it then sends the packet on to the ixia.  However,
>> the switch adds 2B of padding with random content (not zero as RFC 1042
>> mentions) and due to this random byte padding, ixia drops the packet.
>>
>> Switch vendors claim to be compliant with both RFC 1042 and RFC 2119 to
>> fix it in the switch.
>>
>> RFC 1042 defines:
>> “IEEE 802 packets may have a minimum size restriction.  When
>> necessary, the data field should be padded (with octets of zero)
>> to meet the IEEE 802 minimum frame size requirements.  This
>> padding is not part of the IP datagram and is not included in the
>> total length field of the IP header.”
>>
>> RFC 2119 defines:
>> "SHOULD   This word, or the adjective "RECOMMENDED", mean that there
>>    may exist valid reasons in particular circumstances to ignore a
>>    particular item, but the full implications must be understood and
>>    carefully weighed before choosing a different course."
>>
>> So, a fix is needed in the OVS. The OVS is connected to the switch on a
>> VXLAN port and to the host on a DPDK port. The padding fix is applied to
>> both netdev types. I tested the fix in the same setup and below are the
>> captures after padding on both VXLAN and DPDK ports from OVS UT.
>>
>> VXLAN (46B zero pad):
>>    aa 55 aa 66 00 00 aa 55 aa 55 00 00 08 00 45 00   .U.f...U.UE.
>> 0010   00 60 00 00 40 00 40 11 26 81 0a 00 00 02 0a 00   .`..@.@.&...
>> 0020   00 0b 9b 77 12 b5 00 4c 00 00 08 00 00 00 00 00   ...w...L
>> 0030   7b 00 50 54 00 00 00 0a 50 54 00 00 00 09 12 34   {.PTPT.4
>> 0040   00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00   
>> 0050   00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00   
>> 0060   00 00 00 00 00 00 00 00 00 00 00 00 00 00 ..
>>
>> DPDK (16B zero pad):
>>    ff ff ff ff ff ff 5e fb 90 96 7d b7 08 06 00 01   ..^...}.

Re: [ovs-dev] [PATCH] netdev: Padding runt packet on VXLAN and DPDK ports.

2024-05-21 Thread Kumar, Rohit via dev
Hi Ilya,

Thanks for your comments.

Yes, it's the hardware switch that doesn't do zero padding. Unfortunately, the 
switch vendor claims that the padding is added after the FCS of the original 
packet, and there is no way around it. And if it's a requirement to zero pad, 
then RFC 1042 should use the word "must".  As for the "valid reason in special 
circumstances", the reason is that generating zero padding would have greatly 
complicated their switch architecture, resulting in additional power 
consumption and latency. So OVS has to do it.

For the other part, where zero padding is to be added to the DPDK interface, no 
PMD except a few (Atomic Rules, Broadcom, HiSilicon PMDs) do this in the 
driver, and we use Intel, so that's the reason for doing it in OVS.

Regards,
Rohit Kumar

From: Ilya Maximets 
Sent: Monday, May 20, 2024 11:30 PM
To: Kumar, Rohit ; d...@openvswitch.org 

Cc: i.maxim...@ovn.org 
Subject: Re: [ovs-dev] [PATCH] netdev: Padding runt packet on VXLAN and DPDK 
ports.


On 5/18/24 00:55, Kumar, Rohit via dev wrote:
> The basic idea behind this patch is to pad the runt packet before vxlan
> encapsulation and on a DPDK port.
>
> topology:
> IXIA <> SWITCH <--vxlan--> OVS <--dpdk--> HOST
>
> The host sends a runt packet (54 bytes) with a size less than 64 bytes
> to the OVS. The OVS receives this runt packet and sends it further down
> the VXLAN tunnel (original packet 54 + 50 VXLAN = 104 B total packet size)
> to a physical switch. At the switch, after decapsulation, the original
> packet size is 54B and it then sends the packet on to the ixia.  However,
> the switch adds 2B of padding with random content (not zero as RFC 1042
> mentions) and due to this random byte padding, ixia drops the packet.
>
> Switch vendors claim to be compliant with both RFC 1042 and RFC 2119 to
> fix it in the switch.
>
> RFC 1042 defines:
> “IEEE 802 packets may have a minimum size restriction.  When
> necessary, the data field should be padded (with octets of zero)
> to meet the IEEE 802 minimum frame size requirements.  This
> padding is not part of the IP datagram and is not included in the
> total length field of the IP header.”
>
> RFC 2119 defines:
> "SHOULD   This word, or the adjective "RECOMMENDED", mean that there
>may exist valid reasons in particular circumstances to ignore a
>particular item, but the full implications must be understood and
>carefully weighed before choosing a different course."
>
> So, a fix is needed in the OVS. The OVS is connected to the switch on a
> VXLAN port and to the host on a DPDK port. The padding fix is applied to
> both netdev types. I tested the fix in the same setup and below are the
> captures after padding on both VXLAN and DPDK ports from OVS UT.
>
> VXLAN (46B zero pad):
>    aa 55 aa 66 00 00 aa 55 aa 55 00 00 08 00 45 00   .U.f...U.UE.
> 0010   00 60 00 00 40 00 40 11 26 81 0a 00 00 02 0a 00   .`..@.@.&...
> 0020   00 0b 9b 77 12 b5 00 4c 00 00 08 00 00 00 00 00   ...w...L
> 0030   7b 00 50 54 00 00 00 0a 50 54 00 00 00 09 12 34   {.PTPT.4
> 0040   00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00   
> 0050   00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00   
> 0060   00 00 00 00 00 00 00 00 00 00 00 00 00 00 ..
>
> DPDK (16B zero pad):
>    ff ff ff ff ff ff 5e fb 90 96 7d b7 08 06 00 01   ..^...}.
> 0010   08 00 06 04 00 01 5e fb 90 96 7d b7 0a 01 01 01   ..^...}.
> 0020   00 00 00 00 00 00 0a 01 01 02 00 00 00 00 00 00   
> 0030   00 00 00 00 00 00 00 00 00 00 00 00   
>
> Signed-off-by: Rohit Kumar 

Hi, Rohit.  Thanks for the patch!

Though I don't think OVS should add padding to packets.  RFC 1042 is talking
about transmission over ethernet networks.  In your example above, OVS sends a
104-byte packet to the ethernet link, so no padding is needed.  If the
'SWITCH' in your setup sends an incorrectly padded packet after decapsulation,
that sounds like an issue with that switch.  Is this patch a workaround for
a hardware switch issue?

Also, we even had a request in the past to strip all the existing padding before
encapsulation, because older Linux kernel versions had issues processing padded
inner packets after decapsulation.

Adding the padding will also introduce a difference in behavior between the
Linux kernel datapath and the userspace datapath, since the Linux kernel
doesn't pad packets before encapsulation.  Having a difference between
datapath implementations is not good.

For the part where we add padding before sending to a DPDK interface, this
also doesn't seem right.  Normally it's the job of the driver to properly pad
packets before transmitting them.  OVS doesn't know the requirements of the
particular network it is connected to, or even whether it is an ethernet
interface or some other type of interface.  So, it can't make a decision on
padding; only the driver can do that.

Re: [ovs-dev] [PATCH] netdev: Padding runt packet on VXLAN and DPDK ports.

2024-05-20 Thread Ilya Maximets
On 5/18/24 00:55, Kumar, Rohit via dev wrote:
> The basic idea behind this patch is to pad the runt packet before vxlan
> encapsulation and on a DPDK port.
> 
> topology:
> IXIA <> SWITCH <--vxlan--> OVS <--dpdk--> HOST
> 
> The host sends a runt packet (54 bytes) with a size less than 64 bytes
> to the OVS. The OVS receives this runt packet and sends it further down
> the VXLAN tunnel (original packet 54 + 50 VXLAN = 104 B total packet size)
> to a physical switch. At the switch, after decapsulation, the original
> packet size is 54B and it then sends the packet on to the ixia.  However,
> the switch adds 2B of padding with random content (not zero as RFC 1042
> mentions) and due to this random byte padding, ixia drops the packet.
> 
> Switch vendors claim to be compliant with both RFC 1042 and RFC 2119 to
> fix it in the switch.
> 
> RFC 1042 defines:
> “IEEE 802 packets may have a minimum size restriction.  When
> necessary, the data field should be padded (with octets of zero)
> to meet the IEEE 802 minimum frame size requirements.  This
> padding is not part of the IP datagram and is not included in the
> total length field of the IP header.”
> 
> RFC 2119 defines:
> "SHOULD   This word, or the adjective "RECOMMENDED", mean that there
>may exist valid reasons in particular circumstances to ignore a
>particular item, but the full implications must be understood and
>carefully weighed before choosing a different course."
> 
> So, a fix is needed in the OVS. The OVS is connected to the switch on a
> VXLAN port and to the host on a DPDK port. The padding fix is applied to
> both netdev types. I tested the fix in the same setup and below are the
> captures after padding on both VXLAN and DPDK ports from OVS UT.
> 
> VXLAN (46B zero pad):
>    aa 55 aa 66 00 00 aa 55 aa 55 00 00 08 00 45 00   .U.f...U.UE.
> 0010   00 60 00 00 40 00 40 11 26 81 0a 00 00 02 0a 00   .`..@.@.&...
> 0020   00 0b 9b 77 12 b5 00 4c 00 00 08 00 00 00 00 00   ...w...L
> 0030   7b 00 50 54 00 00 00 0a 50 54 00 00 00 09 12 34   {.PTPT.4
> 0040   00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00   
> 0050   00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00   
> 0060   00 00 00 00 00 00 00 00 00 00 00 00 00 00 ..
> 
> DPDK (16B zero pad):
>    ff ff ff ff ff ff 5e fb 90 96 7d b7 08 06 00 01   ..^...}.
> 0010   08 00 06 04 00 01 5e fb 90 96 7d b7 0a 01 01 01   ..^...}.
> 0020   00 00 00 00 00 00 0a 01 01 02 00 00 00 00 00 00   
> 0030   00 00 00 00 00 00 00 00 00 00 00 00   
> 
> Signed-off-by: Rohit Kumar 

Hi, Rohit.  Thanks for the patch!

Though I don't think OVS should add padding to packets.  RFC 1042 is talking
about transmission over ethernet networks.  In your example above, OVS sends a
104-byte packet to the ethernet link, so no padding is needed.  If the
'SWITCH' in your setup sends an incorrectly padded packet after decapsulation,
that sounds like an issue with that switch.  Is this patch a workaround for
a hardware switch issue?

Also, we even had a request in the past to strip all the existing padding before
encapsulation, because older Linux kernel versions had issues processing padded
inner packets after decapsulation.

Adding the padding will also introduce a difference in behavior between the
Linux kernel datapath and the userspace datapath, since the Linux kernel
doesn't pad packets before encapsulation.  Having a difference between
datapath implementations is not good.

For the part where we add padding before sending to a DPDK interface, this
also doesn't seem right.  Normally it's the job of the driver to properly pad
packets before transmitting them.  OVS doesn't know the requirements of the
particular network it is connected to, or even whether it is an ethernet
interface or some other type of interface.  So, it can't make a decision on
padding; only the driver can do that.

Best regards, Ilya Maximets.
___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev