Re: [PATCH 1/1] hw/net: Added basic IPv6 fragmentation. Fixed IPv6 payload length. Fixed CSO for IPv6.

2020-06-01 Thread Andrew Melnichenko
>
> Please introduce a separate patch to do this.

Ok, I'll split the patch.

> Did you mean the headroom might not be enough?
>
Technically, yes - extension headers increase the L3 header size. If there
are a lot of them, the total packet length may be greater than the MTU
after the fragmentation extension is added. Qemu calculates the payload
length of a fragment only for L4.

> Need to be verbose on this.
>
Ok. Technically, the OS expects the device to do the segmentation, so each
packet may have a different size. For backends with a vheader, we need
just one packet with the proper payload size. For fragmentation, the size
is recalculated.

> What did the "unsafe" suffix mean here?
>
We add the fragmentation extension for the IPv6 L3 header at the end of
the buffer, so there should be enough memory for that - we checked it
before calling the function.

> Let's just implement the allocation here.
>
This is a 'rare' case: the L3 header buffer is allocated with the NetTxPkt
structure and equals ETH_MAX_IP_DGRAM_LEN = 0x, which is more than enough.
To 'reallocate' the memory, we would need to refactor the NetTxPkt
structure and its routines, which may be done in the future. For now, I
think we need basic IPv6 fragmentation.

> Please introduce a macro with comment for this magic number.
>
Ok. 0x71656d75 is the string 'qemu' in ASCII. We just need some number for
packet identification; it's unique for every fragmented packet series.

On Fri, May 29, 2020 at 12:11 PM Jason Wang  wrote:

>
> On 2020/5/8 at 3:25 AM, and...@daynix.com wrote:
> > From: Andrew Melnychenko 
> >
> > Buglink: https://bugzilla.redhat.com/show_bug.cgi?id=1708065
> > Overall, there was an issue where big IPv6 frames were not sent.
> > With a network backend that uses a 'virtual header', there was
> > an issue with the 'plen' field: during TSO, 'plen' would be changed,
> > but with a 'vheader' this field should be set to the size of the
> > payload itself instead of '0'.
> > For software offload, basic IPv6 fragmentation has been added.
>
>
> Please introduce a separate patch to do this.
>
>
> > Also fixed checksum offload for IPv6.
>
>
> And another patch for this.
>
>
> > Basic IPv6 fragmentation adds a 'frag' extension header to
> > the packet and shares some logic with IPv4. It works, but
> > there are still issues with combinations of extension headers;
> > in the future, refactoring work will be required to implement
> > a proper workflow for IPv6 with extensions.
>
>
> Did you mean the headroom might not be enough?
>
>
> > The e1000e driver doesn't set the 'plen' field for IPv6 for big packets
> > if TSO is enabled. The "Jumbo option" isn't added yet; it can wait until
> > qemu supports packets greater than 64K.
> >
> > Signed-off-by: Andrew Melnychenko 
> > ---
> >   hw/net/net_tx_pkt.c | 54 ---
> >   hw/net/net_tx_pkt.h |  7 
> >   include/net/eth.h   | 15 ++--
> >   net/eth.c   | 89 ++---
> >   4 files changed, 151 insertions(+), 14 deletions(-)
> >
> > diff --git a/hw/net/net_tx_pkt.c b/hw/net/net_tx_pkt.c
> > index 162f802dd7..895effecb9 100644
> > --- a/hw/net/net_tx_pkt.c
> > +++ b/hw/net/net_tx_pkt.c
> > @@ -468,8 +468,8 @@ static void net_tx_pkt_do_sw_csum(struct NetTxPkt *pkt)
> >   /* num of iovec without vhdr */
> >   uint32_t iov_len = pkt->payload_frags + NET_TX_PKT_PL_START_FRAG - 1;
> >   uint16_t csl;
> > -struct ip_header *iphdr;
> >   size_t csum_offset = pkt->virt_hdr.csum_start + pkt->virt_hdr.csum_offset;
> > +uint16_t l3_proto = eth_get_l3_proto(iov, 1, iov->iov_len);
> >
> >   /* Put zero to checksum field */
> >   iov_from_buf(iov, iov_len, csum_offset, &csum, sizeof csum);
> > @@ -477,9 +477,18 @@ static void net_tx_pkt_do_sw_csum(struct NetTxPkt *pkt)
> >   /* Calculate L4 TCP/UDP checksum */
> >   csl = pkt->payload_len;
> >
> > +csum_cntr = 0;
> > +cso = 0;
> >   /* add pseudo header to csum */
> > -iphdr = pkt->vec[NET_TX_PKT_L3HDR_FRAG].iov_base;
> > -csum_cntr = eth_calc_ip4_pseudo_hdr_csum(iphdr, csl, &cso);
> > +if (l3_proto == ETH_P_IP) {
> > +csum_cntr = eth_calc_ip4_pseudo_hdr_csum(
> > +pkt->vec[NET_TX_PKT_L3HDR_FRAG].iov_base,
> > +csl, &cso);
> > +} else if (l3_proto == ETH_P_IPV6) {
> > +csum_cntr = eth_calc_ip6_pseudo_hdr_csum(
> > +pkt->vec[NET_TX_PKT_L3HDR_FRAG].iov_base,
> > +csl, pkt->l4proto, &cso);
> > +}
> >
> >   /* data checksum */
> >   csum_cntr +=
> > @@ -580,10 +589,11 @@ static bool net_tx_pkt_do_sw_fragmentation(struct NetTxPkt *pkt,
> >
> >   more_frags = (fragment_offset + fragment_len < pkt->payload_len);
> >
> > -eth_setup_ip4_fragmentation(l2_iov_base, l2_iov_len, l3_iov_base,
> > -l3_iov_len, fragment_len, fragment_offset, more_frags);
> > +eth_setup_ip_fragmentation(l2_iov_base, l2_iov_len, l3_iov_base,
> > +&l3_iov_len, ETH_MAX_IP_DGRAM_LEN,
> > +fragment_len, fragment_offset, more_frags);

Re: [PATCH 1/1] hw/net: Added basic IPv6 fragmentation. Fixed IPv6 payload length. Fixed CSO for IPv6.

2020-05-29 Thread Jason Wang



On 2020/5/8 at 3:25 AM, and...@daynix.com wrote:

From: Andrew Melnychenko 

Buglink: https://bugzilla.redhat.com/show_bug.cgi?id=1708065
Overall, there was an issue where big IPv6 frames were not sent.
With a network backend that uses a 'virtual header', there was
an issue with the 'plen' field: during TSO, 'plen' would be changed,
but with a 'vheader' this field should be set to the size of the
payload itself instead of '0'.
For software offload, basic IPv6 fragmentation has been added.



Please introduce a separate patch to do this.



Also fixed checksum offload for IPv6.



And another patch for this.



Basic IPv6 fragmentation adds a 'frag' extension header to
the packet and shares some logic with IPv4. It works, but
there are still issues with combinations of extension headers;
in the future, refactoring work will be required to implement
a proper workflow for IPv6 with extensions.



Did you mean the headroom might not be enough?



The e1000e driver doesn't set the 'plen' field for IPv6 for big packets
if TSO is enabled. The "Jumbo option" isn't added yet; it can wait until
qemu supports packets greater than 64K.

Signed-off-by: Andrew Melnychenko 
---
  hw/net/net_tx_pkt.c | 54 ---
  hw/net/net_tx_pkt.h |  7 
  include/net/eth.h   | 15 ++--
  net/eth.c   | 89 ++---
  4 files changed, 151 insertions(+), 14 deletions(-)

diff --git a/hw/net/net_tx_pkt.c b/hw/net/net_tx_pkt.c
index 162f802dd7..895effecb9 100644
--- a/hw/net/net_tx_pkt.c
+++ b/hw/net/net_tx_pkt.c
@@ -468,8 +468,8 @@ static void net_tx_pkt_do_sw_csum(struct NetTxPkt *pkt)
  /* num of iovec without vhdr */
  uint32_t iov_len = pkt->payload_frags + NET_TX_PKT_PL_START_FRAG - 1;
  uint16_t csl;
-struct ip_header *iphdr;
  size_t csum_offset = pkt->virt_hdr.csum_start + pkt->virt_hdr.csum_offset;
+uint16_t l3_proto = eth_get_l3_proto(iov, 1, iov->iov_len);
  
  /* Put zero to checksum field */
  iov_from_buf(iov, iov_len, csum_offset, &csum, sizeof csum);
@@ -477,9 +477,18 @@ static void net_tx_pkt_do_sw_csum(struct NetTxPkt *pkt)
  /* Calculate L4 TCP/UDP checksum */
  csl = pkt->payload_len;
  
+csum_cntr = 0;
+cso = 0;
  /* add pseudo header to csum */
-iphdr = pkt->vec[NET_TX_PKT_L3HDR_FRAG].iov_base;
-csum_cntr = eth_calc_ip4_pseudo_hdr_csum(iphdr, csl, &cso);
+if (l3_proto == ETH_P_IP) {
+csum_cntr = eth_calc_ip4_pseudo_hdr_csum(
+pkt->vec[NET_TX_PKT_L3HDR_FRAG].iov_base,
+csl, &cso);
+} else if (l3_proto == ETH_P_IPV6) {
+csum_cntr = eth_calc_ip6_pseudo_hdr_csum(
+pkt->vec[NET_TX_PKT_L3HDR_FRAG].iov_base,
+csl, pkt->l4proto, &cso);
+}
  
  /* data checksum */
  csum_cntr +=
@@ -580,10 +589,11 @@ static bool net_tx_pkt_do_sw_fragmentation(struct NetTxPkt *pkt,
  
  more_frags = (fragment_offset + fragment_len < pkt->payload_len);
  
-eth_setup_ip4_fragmentation(l2_iov_base, l2_iov_len, l3_iov_base,
-l3_iov_len, fragment_len, fragment_offset, more_frags);
+eth_setup_ip_fragmentation(l2_iov_base, l2_iov_len, l3_iov_base,
+&l3_iov_len, ETH_MAX_IP_DGRAM_LEN,
+fragment_len, fragment_offset, more_frags);
  
-eth_fix_ip4_checksum(l3_iov_base, l3_iov_len);
+fragment[NET_TX_PKT_FRAGMENT_L3_HDR_POS].iov_len = l3_iov_len;
  
  net_tx_pkt_sendv(pkt, nc, fragment, dst_idx);
  
@@ -617,6 +627,7 @@ bool net_tx_pkt_send(struct NetTxPkt *pkt, NetClientState *nc)
  
  if (pkt->has_virt_hdr ||
  pkt->virt_hdr.gso_type == VIRTIO_NET_HDR_GSO_NONE) {
+net_tx_pkt_fix_ip6_payload_len(pkt);
  net_tx_pkt_sendv(pkt, nc, pkt->vec,
  pkt->payload_frags + NET_TX_PKT_PL_START_FRAG);
  return true;
@@ -635,3 +646,34 @@ bool net_tx_pkt_send_loopback(struct NetTxPkt *pkt, NetClientState *nc)
  
  return res;
  }
+
+void net_tx_pkt_fix_ip6_payload_len(struct NetTxPkt *pkt)
+{
+/*
+ * If ipv6 payload length field is 0 - then there should be Hop-by-Hop
+ * option for packets greater than 65,535.
+ * For packets with payload less than 65,535: fix 'plen' field.
+ * For now, qemu drops every packet with size greater 64K
+ * (see net_tx_pkt_send()) so, there is no reason to add jumbo option to ip6
+ * hop-by-hop extension if it's missed
+ */
+
+struct iovec *l2 = &pkt->vec[NET_TX_PKT_L2HDR_FRAG];
+if (eth_get_l3_proto(l2, 1, l2->iov_len) == ETH_P_IPV6) {
+struct ip6_header *ip6 = (struct ip6_header *) pkt->l3_hdr;
+/*
+ * TODO: if qemu would support >64K packets - add jumbo option check
+ * something like that:
+ * 'if (ip6->ip6_plen == 0 && !has_jumbo_option(ip6)) {'
+ */
+if (ip6->ip6_plen == 0) {
+if (pkt->payload_len <= ETH_MAX_IP_DGRAM_LEN) {
+ip6->ip6_plen = htons(pkt->payload_len);
+}
+/*
+ * TODO: if qemu would support >64K packets
+ * add jumbo option for packets greater than 65,535 bytes
+ */
[PATCH 1/1] hw/net: Added basic IPv6 fragmentation. Fixed IPv6 payload length. Fixed CSO for IPv6.

2020-05-07 Thread andrew
From: Andrew Melnychenko 

Buglink: https://bugzilla.redhat.com/show_bug.cgi?id=1708065
Overall, there was an issue where big IPv6 frames were not sent.
With a network backend that uses a 'virtual header', there was
an issue with the 'plen' field: during TSO, 'plen' would be changed,
but with a 'vheader' this field should be set to the size of the
payload itself instead of '0'.
For software offload, basic IPv6 fragmentation has been added.
Also fixed checksum offload for IPv6.
Basic IPv6 fragmentation adds a 'frag' extension header to
the packet and shares some logic with IPv4. It works, but
there are still issues with combinations of extension headers;
in the future, refactoring work will be required to implement
a proper workflow for IPv6 with extensions.
The e1000e driver doesn't set the 'plen' field for IPv6 for big packets
if TSO is enabled. The "Jumbo option" isn't added yet; it can wait until
qemu supports packets greater than 64K.

Signed-off-by: Andrew Melnychenko 
---
 hw/net/net_tx_pkt.c | 54 ---
 hw/net/net_tx_pkt.h |  7 
 include/net/eth.h   | 15 ++--
 net/eth.c   | 89 ++---
 4 files changed, 151 insertions(+), 14 deletions(-)

diff --git a/hw/net/net_tx_pkt.c b/hw/net/net_tx_pkt.c
index 162f802dd7..895effecb9 100644
--- a/hw/net/net_tx_pkt.c
+++ b/hw/net/net_tx_pkt.c
@@ -468,8 +468,8 @@ static void net_tx_pkt_do_sw_csum(struct NetTxPkt *pkt)
 /* num of iovec without vhdr */
 uint32_t iov_len = pkt->payload_frags + NET_TX_PKT_PL_START_FRAG - 1;
 uint16_t csl;
-struct ip_header *iphdr;
 size_t csum_offset = pkt->virt_hdr.csum_start + pkt->virt_hdr.csum_offset;
+uint16_t l3_proto = eth_get_l3_proto(iov, 1, iov->iov_len);
 
 /* Put zero to checksum field */
 iov_from_buf(iov, iov_len, csum_offset, &csum, sizeof csum);
@@ -477,9 +477,18 @@ static void net_tx_pkt_do_sw_csum(struct NetTxPkt *pkt)
 /* Calculate L4 TCP/UDP checksum */
 csl = pkt->payload_len;
 
+csum_cntr = 0;
+cso = 0;
 /* add pseudo header to csum */
-iphdr = pkt->vec[NET_TX_PKT_L3HDR_FRAG].iov_base;
-csum_cntr = eth_calc_ip4_pseudo_hdr_csum(iphdr, csl, &cso);
+if (l3_proto == ETH_P_IP) {
+csum_cntr = eth_calc_ip4_pseudo_hdr_csum(
+pkt->vec[NET_TX_PKT_L3HDR_FRAG].iov_base,
+csl, &cso);
+} else if (l3_proto == ETH_P_IPV6) {
+csum_cntr = eth_calc_ip6_pseudo_hdr_csum(
+pkt->vec[NET_TX_PKT_L3HDR_FRAG].iov_base,
+csl, pkt->l4proto, &cso);
+}
 
 /* data checksum */
 csum_cntr +=
@@ -580,10 +589,11 @@ static bool net_tx_pkt_do_sw_fragmentation(struct NetTxPkt *pkt,
 
 more_frags = (fragment_offset + fragment_len < pkt->payload_len);
 
-eth_setup_ip4_fragmentation(l2_iov_base, l2_iov_len, l3_iov_base,
-l3_iov_len, fragment_len, fragment_offset, more_frags);
+eth_setup_ip_fragmentation(l2_iov_base, l2_iov_len, l3_iov_base,
+&l3_iov_len, ETH_MAX_IP_DGRAM_LEN,
+fragment_len, fragment_offset, more_frags);
 
-eth_fix_ip4_checksum(l3_iov_base, l3_iov_len);
+fragment[NET_TX_PKT_FRAGMENT_L3_HDR_POS].iov_len = l3_iov_len;
 
 net_tx_pkt_sendv(pkt, nc, fragment, dst_idx);
 
@@ -617,6 +627,7 @@ bool net_tx_pkt_send(struct NetTxPkt *pkt, NetClientState *nc)
 
 if (pkt->has_virt_hdr ||
 pkt->virt_hdr.gso_type == VIRTIO_NET_HDR_GSO_NONE) {
+net_tx_pkt_fix_ip6_payload_len(pkt);
 net_tx_pkt_sendv(pkt, nc, pkt->vec,
 pkt->payload_frags + NET_TX_PKT_PL_START_FRAG);
 return true;
@@ -635,3 +646,34 @@ bool net_tx_pkt_send_loopback(struct NetTxPkt *pkt, NetClientState *nc)
 
 return res;
 }
+
+void net_tx_pkt_fix_ip6_payload_len(struct NetTxPkt *pkt)
+{
+/*
+ * If ipv6 payload length field is 0 - then there should be Hop-by-Hop
+ * option for packets greater than 65,535.
+ * For packets with payload less than 65,535: fix 'plen' field.
+ * For now, qemu drops every packet with size greater 64K
+ * (see net_tx_pkt_send()) so, there is no reason to add jumbo option to ip6
+ * hop-by-hop extension if it's missed
+ */
+
+struct iovec *l2 = &pkt->vec[NET_TX_PKT_L2HDR_FRAG];
+if (eth_get_l3_proto(l2, 1, l2->iov_len) == ETH_P_IPV6) {
+struct ip6_header *ip6 = (struct ip6_header *) pkt->l3_hdr;
+/*
+ * TODO: if qemu would support >64K packets - add jumbo option check
+ * something like that:
+ * 'if (ip6->ip6_plen == 0 && !has_jumbo_option(ip6)) {'
+ */
+if (ip6->ip6_plen == 0) {
+if (pkt->payload_len <= ETH_MAX_IP_DGRAM_LEN) {
+ip6->ip6_plen = htons(pkt->payload_len);
+}
+/*
+ * TODO: if qemu would support >64K packets
+ * add jumbo option for packets greater than 65,535 bytes
+ */
+}
+}
+}
diff --git a/hw/net/net_tx_pkt.h b/hw/net/net_tx_pkt.h
index