from:"Singhai, Anjali"

Re: [PATCH v4 1/5] geneve: Add geneve udp port offload for ethernet devices

2015-12-14 Thread Singhai, Anjali


Resending the series with a  cover-letter that got missed.

Thanks
Anjali

On 12/14/2015 11:57 AM, Anjali Singhai Jain wrote:

Add ndo_ops to add/del UDP ports to a device that supports geneve
offload.

v2: Comment fix.

Signed-off-by: Anjali Singhai Jain 
Signed-off-by: Kiran Patil 
---
  drivers/net/geneve.c  | 23 +++
  include/linux/netdevice.h | 20 +++-
  2 files changed, 42 insertions(+), 1 deletion(-)




--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [PATCH v3 1/4] geneve: Add geneve udp port offload for ethernet devices

2015-12-14 Thread Singhai, Anjali




On 12/11/2015 7:11 PM, Tom Herbert wrote:

On Tue, Dec 8, 2015 at 10:12 AM, Anjali Singhai Jain
 wrote:

Add ndo_ops to add/del UDP ports to a device that supports geneve
offload.

v3: Add some more comments about the use of the new ndo ops.

Signed-off-by: Anjali Singhai Jain 
Signed-off-by: Kiran Patil 
---
  drivers/net/geneve.c  | 23 +++
  include/linux/netdevice.h | 21 -
  2 files changed, 43 insertions(+), 1 deletion(-)

diff --git a/drivers/net/geneve.c b/drivers/net/geneve.c
index de5c30c..b43fd56 100644
--- a/drivers/net/geneve.c
+++ b/drivers/net/geneve.c
@@ -371,8 +371,11 @@ static struct socket *geneve_create_sock(struct net *net, 
bool ipv6,

  static void geneve_notify_add_rx_port(struct geneve_sock *gs)
  {
+   struct net_device *dev;
 struct sock *sk = gs->sock->sk;
+   struct net *net = sock_net(sk);
 sa_family_t sa_family = sk->sk_family;
+   __be16 port = inet_sk(sk)->inet_sport;
 int err;

 if (sa_family == AF_INET) {
@@ -381,6 +384,14 @@ static void geneve_notify_add_rx_port(struct geneve_sock 
*gs)
 pr_warn("geneve: udp_add_offload failed with status 
%d\n",
 err);
 }
+
+   rcu_read_lock();
+   for_each_netdev_rcu(net, dev) {
+   if (dev->netdev_ops->ndo_add_geneve_port)
+   dev->netdev_ops->ndo_add_geneve_port(dev, sa_family,
+port);
+   }
+   rcu_read_unlock();

What about IPv6 case?


The driver still gets add port calls for IPv6 and can decide to offload 
L4 RX checksum if the HW is capable.


--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [PATCH v3 2/4] i40e: geneve tunnel offload support

2015-12-08 Thread Singhai, Anjali




On 12/8/2015 10:36 AM, Jesse Gross wrote:

On Tue, Dec 8, 2015 at 10:20 AM, Alexei Starovoitov
 wrote:

On Tue, Dec 08, 2015 at 10:12:12AM -0800, Anjali Singhai Jain wrote:

+/**
+ * i40e_add_geneve_port - Get notifications about GENEVE ports that come up
+ * @netdev: This physical port's netdev
+ * @sa_family: Socket Family that GENEVE is notifying us about
+ * @port: New UDP port number that GENEVE started listening to
+ **/
+static void i40e_add_geneve_port(struct net_device *netdev,
+  sa_family_t sa_family, __be16 port)
+{
+#if IS_ENABLED(CONFIG_GENEVE)

...

+ /* New port: add it and mark its index in the bitmap */
+ pf->udp_ports[next_idx].index = port;
+ pf->udp_ports[next_idx].type = I40E_AQC_TUNNEL_TYPE_NGE;

the function suppose to deal with geneve but tunnel type is NGE ?!

NGE is an old name for Geneve: "Next Generation Encapsulation"

Yes and that is why the old name in the SW/FW header files.

-#define I40E_MAX_TUNNEL_HDR_LEN 80
+/* Hardware supports L4 tunnel length of 128B (=2^7) which includes
+ * inner mac plus all inner ethertypes.
+ */
+#define I40E_MAX_TUNNEL_HDR_LEN 128

so the driver lied about actual hw capabilities earlier
or it needs firmware update to work this way?

I'm pretty sure that this is just making the calculation match the
hardware more accurately. If you look at the code below it, it is now
calculating the length from a different place.
It is making the code match the HW more accurately, which did support 
the 128 size
all along but till we enabled geneve support in the SW and FW,  the 
earlier setting was the right value.



- if (!(tx_flags & I40E_TX_FLAGS_VXLAN_TUNNEL)) {
+ if (!(tx_flags & I40E_TX_FLAGS_UDP_TUNNEL)) {

...

@@ -163,7 +163,7 @@ enum i40e_dyn_idx_t {
  #define I40E_TX_FLAGS_FSOBIT(7)
  #define I40E_TX_FLAGS_TSYN   BIT(8)
  #define I40E_TX_FLAGS_FD_SB  BIT(9)
-#define I40E_TX_FLAGS_VXLAN_TUNNEL   BIT(10)
+#define I40E_TX_FLAGS_UDP_TUNNEL BIT(10)

these changes implying that HW actually doesn't have special 'geneve or vxlan'
hard coded logic and it's generic enough to understand most of udp tunnels.
Then why you cannot generalize this whole things as generic udp tunnel offload
and do not add any protocol specific hooks and ndos.

These are transmit flags but the issue of specialization relates to receive.

Second what Jesse said.
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [PATCH v1 1/6] net: Generalize udp based tunnel offload

2015-12-01 Thread Singhai, Anjali




On 12/1/2015 8:08 AM, John W. Linville wrote:

On Tue, Dec 01, 2015 at 04:49:28PM +0100, Hannes Frederic Sowa wrote:

On Tue, Dec 1, 2015, at 16:44, John W. Linville wrote:

On Mon, Nov 30, 2015 at 09:26:51PM -0800, Tom Herbert wrote:

On Mon, Nov 30, 2015 at 5:28 PM, Jesse Gross  wrote:

Based on what we can do today, I see only two real choices: do some
refactoring to clean up the stack a bit or remove the existing VXLAN
offloading altogether. I think this series is trying to do the former
and the result is that the stack is cleaner after than before. That
seems like a good thing.

There is a third choice which is to do nothing. Creating an
infrastructure that claims to "Generalize udp based tunnel offload"
but actually doesn't generalize the mechanism is nothing more than
window dressing-- this does nothing to help with the VXLAN to
VXLAN-GPE transition for instance. If geneve specific offload is
really needed now then that can be should with another ndo function,
or alternatively ntuple filter with a device specific action would at
least get the stack out of needing to be concerned with that.
Regardless, we will work optimize the rest of the stack for devices
that implement protocol agnostic mechanisms.

Is there no concern about NDO proliferation? Does the size of the
netdev_ops structure matter? Beyond that, I can see how a single
entry point with an enum specifying the offload type isn't really any
different in the grand scheme of things than having multiple NDOs,
one per offload.

Given the need to live with existing hardware offloads, I would lean
toward a consolidated NDO. But if a different NDO per tunnel type is
preferred, I can be satisified with that.

Having per-offloading NDOs helps the stack to gather further information
what kind of offloads the driver has even maybe without trying to call
down into the layer (just by comparing to NULL). Checking this inside
the driver offload function clearly does not have this feature. So we
finally can have "ip tunnel please-recommend-type" feature. :)

That is a valuable insight! Maybe the per-offload NDO isn't such a
bad idea afterall... :-)

John
This helps me understand why having a separate ndo op might still be ok. 
Thanks for the feedback. I will go back to that model.  Also  I think I 
did finally understand the discussion on using a single 2's compliment 
checksum method

for future silicon.
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

RE: [PATCH v1 1/6] net: Generalize udp based tunnel offload

2015-11-30 Thread Singhai, Anjali



-Original Message-
From: Tom Herbert [mailto:t...@herbertland.com] 
Sent: Monday, November 30, 2015 8:36 AM
To: Singhai, Anjali 
Cc: Linux Kernel Network Developers ; Jesse Gross 
; Patil, Kiran 
Subject: Re: [PATCH v1 1/6] net: Generalize udp based tunnel offload

On Mon, Nov 23, 2015 at 1:02 PM, Anjali Singhai Jain  
wrote:
> Replace add/del ndo ops for vxlan_port with tunnel_port so that all 
> UDP based tunnels can use the same ndo op. Add a parameter to pass 
> tunnel type to the ndo_op.
>
Please consider using RX ntuple filters for this instead of a new ndo op. The 
vxlan ndo op essentailly implements a limited filter with a rule to match a 
destination UDP port and the the action of processing the packet as vxlan. 
ntuple filters generalizes that so that the filtering becomes arbitrary. We'll 
need the ability to filter on 4-tuple when we implement tunnels to go through 
firewalls or for offloading other UDP protocols such SPUD or QUIC.

Tom

- Tom I am not sure I agree with this suggestion. The easiest way to let the 
hardware know about port to protocol mapping in case of udp-based tunnels is 
when we add udp offloads for the ports aka gro etc in the stack. This way the 
user gets benefit of tunnel offloads from the HWs that support it without 
having to do any extra filter setups from ethtool. Just like ip/tcp/udp 
checksum and TSO support, the user does not have to turn this ON specifically 
if they plan to use those protocols (of course they can turn it off). Besides 
these are not true filters in that sense, they are not used to guide packets to 
any particular destination in this case, rather used to identify packets for 
checksum and TSO purpose.
And I agree with your patch series that reduces protocol ossification of the 
stack and driver interface. My point is this set of patches help with that goal 
and not really hurt because any new tunnel support would mean no change in the 
interface and just a new type in the enum and then the drivers can decide to do 
the magic setup in the HW in their driver based on this new type without ever 
having to touch the interface. So try to explain to me why this is causing 
protocol ossification because I don't believe so. And I think the ntupe 
interface should remain for the purpose of filters which are used to route 
packet or drop them. Not for packet identification and checksum offload support.

> Change all drivers to use the generalized udp tunnel offload
>
> Patch was compile tested with x86_64_defconfig.
>
> Signed-off-by: Kiran Patil 
> Signed-off-by: Anjali Singhai Jain 
> ---
>  drivers/net/ethernet/broadcom/bnx2x/bnx2x_main.c | 15 ++---
>  drivers/net/ethernet/broadcom/bnxt/bnxt.c| 13 +---
>  drivers/net/ethernet/emulex/benet/be_main.c  | 14 +---
>  drivers/net/ethernet/intel/fm10k/fm10k_netdev.c  | 27 
>  drivers/net/ethernet/intel/i40e/i40e_main.c  | 41 
> +---
>  drivers/net/ethernet/intel/ixgbe/ixgbe_main.c| 17 +++---
>  drivers/net/ethernet/mellanox/mlx4/en_netdev.c   | 21 
>  drivers/net/ethernet/qlogic/qlcnic/qlcnic_main.c | 17 +++---
>  drivers/net/vxlan.c  | 23 +++--
>  include/linux/netdevice.h| 34 ++--
>  include/net/udp_tunnel.h |  6 
>  11 files changed, 157 insertions(+), 71 deletions(-)
>
> diff --git a/drivers/net/ethernet/broadcom/bnx2x/bnx2x_main.c 
> b/drivers/net/ethernet/broadcom/bnx2x/bnx2x_main.c
> index 2273576..ad2782f 100644
> --- a/drivers/net/ethernet/broadcom/bnx2x/bnx2x_main.c
> +++ b/drivers/net/ethernet/broadcom/bnx2x/bnx2x_main.c
> @@ -47,6 +47,7 @@
>  #include 
>  #include 
>  #include 
> +#include 
>  #include 
>  #include 
>  #include 
> @@ -10124,11 +10125,14 @@ static void __bnx2x_add_vxlan_port(struct 
> bnx2x *bp, u16 port)  }
>
>  static void bnx2x_add_vxlan_port(struct net_device *netdev,
> -sa_family_t sa_family, __be16 port)
> +sa_family_t sa_family, __be16 port,
> +u32 type)
>  {
> struct bnx2x *bp = netdev_priv(netdev);
> u16 t_port = ntohs(port);
>
> +   if (type != UDP_TUNNEL_VXLAN)
> +   return;
> __bnx2x_add_vxlan_port(bp, t_port);  }
>
> @@ -10152,11 +10156,14 @@ static void __bnx2x_del_vxlan_port(struct 
> bnx2x *bp, u16 port)  }
>
>  static void bnx2x_del_vxlan_port(struct net_device *netdev,
> -sa_family_t sa_family, __be16 port)
> +sa_family_t sa_family, __be16 port,
> +u32 type)
>  {
> struct bnx2x *bp = netdev_priv(netdev);
> u16 t_port =

RE: [PATCH v1 1/6] net: Generalize udp based tunnel offload

2015-11-30 Thread Singhai, Anjali

-Original Message-
From: David Miller [mailto:da...@davemloft.net] 
Sent: Sunday, November 29, 2015 7:23 PM
To: t...@herbertland.com
Cc: Brandeburg, Jesse ; Singhai, Anjali 
; je...@kernel.org; netdev@vger.kernel.org; Patil, 
Kiran 
Subject: Re: [PATCH v1 1/6] net: Generalize udp based tunnel offload

From: Tom Herbert 
Date: Tue, 24 Nov 2015 09:32:11 -0800

>>
>> FWIW, I've brought the issue to the attention of the architects here, 
>> and we will likely be able to make changes in this space.  Intel 
>> hardware (as demonstrated by your patches) already is able to deal 
>> with this de-ossification on transmit.  Receive is a whole different beast.
>>
> Please provide the specifics on why "Receive is a whole different 
> beast.". Generic receive checksum is already a subset of the 
> functionality that you must have implement to support the protocol 
> specific offloads. All the hardware needs to do is calculate the 1's 
> complement checksum of the packet and return the value on the to the 
> host with that packet. That's it. No parsing of headers, no worrying 
> about the pseudo header, no dealing with any encapsulation. Just do 
> the calculation, return the result to the host and the driver converts 
> this to CHECKSUM_COMPLETE. I find it very hard to believe that this is 
> any harder than specific support the next protocol du jour.

The reason for receive being different than transmit is, on TX side driver can 
provide the meta data for where the checksum field is and what is the length 
that needs to be check summed to the HW on a per packet basis. On Rx the HW 
parser has to parse the packet to identify the tunnel type and based on that 
figure out the checksum locations and length in the packet, so definitely HW 
has to parse the packet and it can parse only based on next header type 
information or in case of udp tunnels based on udp port mapping to a particular 
protocol. I am not sure why you say it doesn't need to parse the packet, maybe 
I am miss- understanding something.  Although it's not difficult to reduce 
protocol ossification on the RX side but it is certainly different and 
particularly in case of udp-tunnels it needs the port to protocol mapping.

+1
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

RE: [PATCH v1 1/6] net: Generalize udp based tunnel offload

2015-11-30 Thread Singhai, Anjali



-Original Message-
From: David Miller [mailto:da...@davemloft.net] 
Sent: Sunday, November 29, 2015 7:22 PM
To: t...@herbertland.com
Cc: Singhai, Anjali ; netdev@vger.kernel.org; 
je...@kernel.org; Patil, Kiran 
Subject: Re: [PATCH v1 1/6] net: Generalize udp based tunnel offload

From: Tom Herbert 
Date: Mon, 23 Nov 2015 13:53:44 -0800

> The bad effect of this model is that it is encourages HW vendors to 
> continue implement HW protocol specific support for encapsulations, we 
> get so much more benefit if they implement protocol generic 
> mechanisms.
Dave, at least Intel parts have a protocol generic model for tunneled packet 
offloads and hence we are able to extend our support to newer tunnel types. We 
do  not have protocol specific support in the HW, but since the udp based 
tunnels do not have a packet type for the tunnel header, the HW needs to know 
which udp port should be mapped to which specific encapsulation. Otherwise 
encapsulated types like NVGRE we can identify through packet type and program 
the HW to account for the header. The newer patches for sure reduce the 
protocol ossification since in communalizes all the different tunnels into one 
interface so that any further support to a newer udp tunnel type requires just 
a type definition and if the driver/HW can support it, minor driver changes to 
set the right bits for HW. No interface change for sure. And I think that is 
definitely a step in the right direction.

+1
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

RE: [PATCH v1 2/6] net: Add a generic udp_offload_get_port function

2015-11-24 Thread Singhai, Anjali



> -Original Message-
> From: Alexander Duyck [mailto:alexander.du...@gmail.com]
> Sent: Monday, November 23, 2015 10:38 PM
> To: Singhai, Anjali; netdev@vger.kernel.org
> Cc: je...@kernel.org; Patil, Kiran
> Subject: Re: [PATCH v1 2/6] net: Add a generic udp_offload_get_port
> function
> 
> On 11/23/2015 01:02 PM, Anjali Singhai Jain wrote:
> > The new function udp_offload_get_port replaces vxlan_get_rx_port().
> > This is a generic function that will help replay all udp tunnel ports
> > irrespective of tunnel type.
> > This way when new udp tunnels get added this function need not change.
> >
> > Note: Drivers besides i40e are compile tested with this change.
> >
> > Signed-off-by: Anjali Singhai Jain 
> > Signed-off-by: Kiran Patil 
> > ---
> 
> [...]
> 
> > diff --git a/net/ipv4/udp_offload.c b/net/ipv4/udp_offload.c index
> > f938616..8597020 100644
> > --- a/net/ipv4/udp_offload.c
> > +++ b/net/ipv4/udp_offload.c
> > @@ -290,6 +290,33 @@ unlock:
> >   }
> >   EXPORT_SYMBOL(udp_del_offload);
> >
> > +void udp_offload_get_port(struct net_device *dev) {
> > +   struct udp_offload_priv __rcu **head;
> > +   struct udp_offload_priv *uo_priv;
> > +   struct udp_offload *uo;
> > +
> > +   if (udp_offload_base)
> > +   head = &udp_offload_base;
> > +   else
> > +   return;
> > +
> > +   spin_lock(&udp_offload_lock);
> > +   uo_priv = udp_deref_protected(*head);
> > +   for (; uo_priv != NULL; uo_priv = udp_deref_protected(*head)) {
> > +   /* call the right add port */
> > +   uo = uo_priv->offload;
> > +   if (uo && dev->netdev_ops->ndo_add_udp_tunnel_port)
> > +   dev->netdev_ops->ndo_add_udp_tunnel_port(dev,
> > +   uo->family,
> > +   uo->port,
> > +   uo->tunnel_type);
> > +   head = &uo_priv->next;
> > +   }
> > +   spin_unlock(&udp_offload_lock);
> > +}
> > +EXPORT_SYMBOL(udp_offload_get_port);
> > +
> >   struct sk_buff **udp_gro_receive(struct sk_buff **head, struct sk_buff
> *skb,
> >  struct udphdr *uh)
> >   {
> >
> 
> So when I got to patch 5 I realized this approach is horribly broken for
> IPv6 tunnels.  The udp_offload_base is only populated if the family is
> AF_INET.  What do you guys plan to do to get support for AF_INET6?
> 
> You probably ought to look at something like what ended up being done for
> the IOAT stuff.  What you end up needing is to support the drivers querying
> for what ports are active, and receiving notifications of tunnel updates, and
> the tunnel side that will register some functionality allowing the active 
> ports
> for a given tunnel type to be queried.
> 
> - Alex

Alex you are right about the IPv6 handling and I will update the patch series 
to help fix that.
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

RE: [PATCH v1 1/6] net: Generalize udp based tunnel offload

2015-11-23 Thread Singhai, Anjali



> -Original Message-
> From: Jesse Gross [mailto:je...@kernel.org]
> Sent: Monday, November 23, 2015 2:50 PM
> To: Tom Herbert
> Cc: Singhai, Anjali; Linux Kernel Network Developers; Patil, Kiran
> Subject: Re: [PATCH v1 1/6] net: Generalize udp based tunnel offload
> 
> On Mon, Nov 23, 2015 at 1:53 PM, Tom Herbert 
> wrote:
> >> diff --git a/include/net/udp_tunnel.h b/include/net/udp_tunnel.h
> >> index cb2f89f..72415aa 100644
> >> --- a/include/net/udp_tunnel.h
> >> +++ b/include/net/udp_tunnel.h
> >> @@ -9,6 +9,12 @@
> >>  #include 
> >>  #endif
> >>
> >> +enum udp_tunnel_type {
> >> +   UDP_TUNNEL_UNSPEC,
> >> +   UDP_TUNNEL_VXLAN,
> >> +   UDP_TUNNEL_GENEVE,
> >> +};
> >> +
> >
> > Sorry, I still don't like this. Grant it least it gets rid of of VXLAN
> > specific ops, but the problem is there no such things as a common set
> > of encapsulations in the kernel (e.g. foo-over-udp adds a bunch of
> > encapsulations not represented here), no defined common set of device
> > functionality that needs this, and this precludes the use of the RX
> > accelerations to be available from a userpsace  implementation.
> 
> Regardless, I think this is at least a good cleanup of what is already
> there compared to having VXLAN-specific NDOs. We can always add
> additional things in the future.

Agreed with Jesse that this will help not hurt,  when we are ready to cross the 
bridge for removing RX side Protocol ossification.

Re: [PATCH v4 1/5] geneve: Add geneve udp port offload for ethernet devices

Re: [PATCH v3 1/4] geneve: Add geneve udp port offload for ethernet devices

Re: [PATCH v3 2/4] i40e: geneve tunnel offload support

Re: [PATCH v1 1/6] net: Generalize udp based tunnel offload

RE: [PATCH v1 1/6] net: Generalize udp based tunnel offload

RE: [PATCH v1 1/6] net: Generalize udp based tunnel offload

RE: [PATCH v1 1/6] net: Generalize udp based tunnel offload

RE: [PATCH v1 2/6] net: Add a generic udp_offload_get_port function

RE: [PATCH v1 1/6] net: Generalize udp based tunnel offload

9 matches

Site Navigation

Mail list logo

Footer information