RE: [PATCH 2.6.16-rc1] S2io: Large Receive Offload (LRO) feature(v2) for Neterion (s2io) 10GbE Xframe PCI-X and PCI-E NICs

2006-02-10 Thread Leonid Grossman
 

> -----Original Message-----
> From: Jeff Garzik [mailto:[EMAIL PROTECTED]
>
> It's been merged in the 'lro' branch of netdev-2.6.git for a
> little while now.  Once it gets additional review (and
> hopefully testing), I am OK with it going upstream.
>
>   Jeff

Hi Jeff, 
I agree: the more testing the LRO code gets over time, the better :-)
In addition to our own fairly long test cycle, the IBM Linux team ran some
tests on LRO as well; they are sending you a note to that effect.
Also, if someone is interested in testing the feature but doesn't have
10GbE Xframe cards, we can connect a couple of Xeon boxes back-to-back and
open remote access for 2-3 days.

Leonid
-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


RE: [PATCH 2.6.16-rc1] S2io: Large Receive Offload (LRO) feature(v2) for Neterion (s2io) 10GbE Xframe PCI-X and PCI-E NICs

2006-02-08 Thread Ravinandan Arakali
Hi,
Just wondering whether anybody has had a chance to review the patch below.
This version (as per Rick's comment on the v1 patch) includes support
for TCP timestamps.

Thanks,
Ravi

-----Original Message-----
From: Ravinandan Arakali [mailto:[EMAIL PROTECTED]
Sent: Wednesday, January 25, 2006 11:53 AM
To: [EMAIL PROTECTED]; netdev@vger.kernel.org
Cc: [EMAIL PROTECTED]; [EMAIL PROTECTED];
[EMAIL PROTECTED]; [EMAIL PROTECTED];
[EMAIL PROTECTED]; [EMAIL PROTECTED]
Subject: [PATCH 2.6.16-rc1] S2io: Large Receive Offload (LRO)
feature(v2) for Neterion (s2io) 10GbE Xframe PCI-X and PCI-E NICs


Hi,
Below is a patch for the Large Receive Offload feature.
Please review and let us know your comments.

The LRO algorithm was described in an OLS 2005 presentation, located at
ftp.s2io.com
user: linuxdocs
password: HALdocs

The same FTP site has the Programming Manual for the Xframe-I ASIC.
The LRO feature is supported on Neterion Xframe-I, Xframe-II and
Xframe-Express 10GbE NICs.

Brief description:
The Large Receive Offload (LRO) feature is a stateless offload
that complements the TSO feature, but on the receive path.
The idea is to combine, in the driver, in-sequence TCP packets
belonging to the same session into one large packet (up to 64K
maximum). It is mainly designed to improve 1500 MTU receive
performance, since jumbo frame performance is already close to
10GbE line rate. Some performance numbers are included below.
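
To give a feel for the 64K limit mentioned above, here is a minimal
sketch of how many full-sized segments fit into one aggregate. The
constants and the function name lro_max_segments are illustrative
assumptions, not taken from the patch: it assumes IPv4 with a 20-byte
header (no options), a 20-byte TCP header, and a 12-byte padded
timestamp option when timestamps are in use.

```c
/* Assumed header sizes; these names are hypothetical, not from the patch. */
#define IP_MAX_TOTAL_LEN 65535u /* limit of the 16-bit IPv4 total-length field */
#define IPV4_HDR_LEN     20u    /* no IP options */
#define TCP_HDR_LEN      20u    /* base TCP header */
#define TCP_TS_OPT_LEN   12u    /* timestamp option, padded to 12 bytes */

/*
 * Number of full-sized segments whose payload fits into one aggregated
 * packet that keeps a single IP + TCP header.
 */
static unsigned int lro_max_segments(unsigned int mtu, int timestamps)
{
    unsigned int hdr = IPV4_HDR_LEN + TCP_HDR_LEN +
                       (timestamps ? TCP_TS_OPT_LEN : 0u);
    unsigned int payload = mtu - hdr; /* TCP payload per segment */

    return (IP_MAX_TOTAL_LEN - hdr) / payload;
}
```

Under these assumptions, lro_max_segments(1500, 0) gives 44 and
lro_max_segments(1500, 1) gives 45, while a 9000-byte MTU allows only 7,
which is consistent with the feature mattering most at 1500 MTU.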

Implementation details:
1. Handle packet chains from multiple sessions (current default
   MAX_LRO_SESSIONS=32).
2. Examine each packet for eligibility to aggregate. A packet is
   considered eligible if it meets all of the criteria below.
  a. It is a TCP/IP packet and the L2 type is not LLC or SNAP.
  b. The packet has no checksum errors (L3 and L4).
  c. There are no IP options. The only TCP option supported is timestamps.
  d. The LRO object corresponding to this socket is located and
     the packet is in TCP sequence.
  e. It's not a special packet (SYN, FIN, RST, URG, PSH etc. flags are
     not set).
  f. TCP payload is non-zero (it's not a pure ACK).
  g. It's not an IP-fragmented packet.
3. If a packet is found eligible, the LRO object is updated with
   information such as the next sequence number expected, the current
   length of the aggregated packet and so on. If the packet is not
   eligible or the maximum number of packets is reached, update the IP
   and TCP headers of the first packet in the chain and pass it up to
   the stack.
4. The frag_list in the skb structure is used to chain packets into one
   large packet.
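
The eligibility check and session update in steps 2 and 3 can be
sketched roughly as follows. This is a simplified illustration, not the
code from the patch: struct pkt_info, struct lro_session, the action
names and all function names are hypothetical, and header parsing,
session lookup and the frag_list chaining are left out.

```c
#include <stdbool.h>
#include <stdint.h>

#define IP_MAX_TOTAL_LEN 65535u
#define TCP_FLAG_FIN 0x01
#define TCP_FLAG_SYN 0x02
#define TCP_FLAG_RST 0x04
#define TCP_FLAG_PSH 0x08
#define TCP_FLAG_ACK 0x10
#define TCP_FLAG_URG 0x20
#define LRO_SPECIAL_FLAGS (TCP_FLAG_FIN | TCP_FLAG_SYN | TCP_FLAG_RST | \
                           TCP_FLAG_PSH | TCP_FLAG_URG)

/* Hypothetical summary of an already-parsed received packet. */
struct pkt_info {
    bool is_tcp_ip;             /* TCP/IP, L2 type not LLC or SNAP */
    bool csum_ok;               /* L3 and L4 checksums verified */
    bool has_ip_options;
    bool has_other_tcp_options; /* any TCP option except timestamps */
    bool is_ip_fragment;
    uint8_t tcp_flags;
    uint32_t seq;               /* TCP sequence number */
    uint16_t payload_len;       /* TCP payload bytes */
    uint16_t hdr_len;           /* IP + TCP header bytes */
};

/* Hypothetical per-session LRO state. */
struct lro_session {
    uint32_t next_seq;  /* next sequence number expected */
    uint32_t total_len; /* IP total length of the aggregate so far */
    unsigned int pkts;  /* packets aggregated so far */
};

enum lro_action {
    LRO_AGGREGATED, /* packet appended, keep collecting */
    LRO_FLUSH_MAX,  /* packet appended, limit reached: pass aggregate up */
    LRO_FLUSH_BOTH  /* not eligible: pass aggregate and this packet up */
};

/* Criteria a through g from the list above. */
static bool lro_eligible(const struct lro_session *s, const struct pkt_info *p)
{
    if (!p->is_tcp_ip || !p->csum_ok)
        return false;                        /* a, b */
    if (p->has_ip_options || p->has_other_tcp_options)
        return false;                        /* c */
    if (p->seq != s->next_seq)
        return false;                        /* d */
    if (p->tcp_flags & LRO_SPECIAL_FLAGS)
        return false;                        /* e */
    if (p->payload_len == 0)
        return false;                        /* f */
    return !p->is_ip_fragment;               /* g */
}

/* Start a new aggregate with the first packet of a session. */
static void lro_begin(struct lro_session *s, const struct pkt_info *p)
{
    s->next_seq = p->seq + p->payload_len;
    s->total_len = (uint32_t)p->hdr_len + p->payload_len;
    s->pkts = 1;
}

/* Step 3: update the session, or tell the caller to flush. */
static enum lro_action lro_receive(struct lro_session *s,
                                   const struct pkt_info *p,
                                   unsigned int max_pkts)
{
    if (!lro_eligible(s, p) ||
        s->total_len + p->payload_len > IP_MAX_TOTAL_LEN)
        return LRO_FLUSH_BOTH;

    s->next_seq += p->payload_len;
    s->total_len += p->payload_len;
    s->pkts++;
    return s->pkts >= max_pkts ? LRO_FLUSH_MAX : LRO_AGGREGATED;
}
```

On a flush, a real driver would also rewrite the IP total length and
checksum and the TCP header fields of the first packet before handing
the chain to the stack, as step 3 describes; that part is omitted here.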

Kernel changes required: None

Performance results:
The main focus of the initial testing was the 1500 MTU receive path,
since this is a bottleneck not covered by the existing stateless offloads.

There are a couple of disclaimers about the performance results below:
1. Your mileage will vary. We initially concentrated on a couple of PCI-X
2.0 platforms that are powerful enough to push a 10GbE NIC and do not
have bottlenecks other than CPU%; testing on other platforms is still
in progress. On some lower-end systems we are seeing lower gains.

2. The current LRO implementation is still (for the most part) software-based,
and therefore the performance potential of the feature is far from being
realized.
A full hardware implementation of LRO is expected in the next version of
the Xframe ASIC.

Performance delta (with MTU=1500) going from LRO disabled to enabled:
IBM 2-way Xeon (x366) : 3.5 to 7.1 Gbps
2-way Opteron : 4.5 to 6.1 Gbps
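
For comparison across the two platforms, the relative gain can be
worked out from the figures above. The helper name lro_speedup is a
hypothetical one used only for this illustration.

```c
/* Ratio of throughput with LRO enabled to throughput with LRO disabled. */
static double lro_speedup(double gbps_lro_off, double gbps_lro_on)
{
    return gbps_lro_on / gbps_lro_off;
}
```

With the reported numbers, lro_speedup(3.5, 7.1) is about 2.03 for the
Xeon system and lro_speedup(4.5, 6.1) is about 1.36 for the Opteron
system.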

Signed-off-by: Ravinandan Arakali [EMAIL PROTECTED]
---

diff -urpN old/drivers/net/s2io.c new_ts/drivers/net/s2io.c
--- old/drivers/net/s2io.c  2006-01-19 04:31:05.0 -0800
+++ new_ts/drivers/net/s2io.c   2006-01-24 08:56:25.0 -0800
@@ -57,6 +57,9 @@
 #include <linux/ethtool.h>
 #include <linux/workqueue.h>
 #include <linux/if_vlan.h>
+#include <linux/ip.h>
+#include <linux/tcp.h>
+#include <net/tcp.h>

 #include <asm/system.h>
 #include <asm/uaccess.h>
@@ -66,7 +69,7 @@
 #include "s2io.h"
 #include "s2io-regs.h"

-#define DRV_VERSION "Version 2.0.9.4"
+#define DRV_VERSION "2.0.11.2"

 /* S2io Driver name & version. */
 static char s2io_driver_name[] = "Neterion";
@@ -168,6 +171,11 @@ static char ethtool_stats_keys[][ETH_GST
    {"\n DRIVER STATISTICS"},
    {"single_bit_ecc_errs"},
    {"double_bit_ecc_errs"},
+   ("lro_aggregated_pkts"),
+   ("lro_flush_both_count"),
+   ("lro_out_of_sequence_pkts"),
+   ("lro_flush_due_to_max_pkts"),
+   ("lro_avg_aggr_pkts"),
 };

 #define S2IO_STAT_LEN sizeof(ethtool_stats_keys)/ ETH_GSTRING_LEN
@@ -317,6 +325,12 @@ static unsigned int indicate_max_pkts;
 static unsigned int rxsync_frequency = 3;
 /* Interrupt type. Values can be 0(INTA), 1(MSI), 2(MSI_X) */
 static unsigned int intr_type = 0;
+/* Large receive offload feature */
+static unsigned int lro = 0;
+/* Max pkts to be aggregated by LRO at one time. If not specified,
+ * aggregation happens until we hit max IP pkt size(64K)
+ */
+static unsigned int lro_max_pkts = 0x;

 /*
  * S2IO device table.
@@ -1476,6 +1490,19 @@ static int init_nic(struct s2io_nic *nic
writel((u32) 

Re: [PATCH 2.6.16-rc1] S2io: Large Receive Offload (LRO) feature(v2) for Neterion (s2io) 10GbE Xframe PCI-X and PCI-E NICs

2006-02-08 Thread Jeff Garzik

Ravinandan Arakali wrote:

> Hi,
> Just wondering whether anybody has had a chance to review the patch below.
> This version (as per Rick's comment on the v1 patch) includes support
> for TCP timestamps.


It's been merged in the 'lro' branch of netdev-2.6.git for a little 
while now.  Once it gets additional review (and hopefully testing), I am 
OK with it going upstream.


Jeff





Re: [PATCH 2.6.16-rc1] S2io: Large Receive Offload (LRO) feature(v2) for Neterion (s2io) 10GbE Xframe PCI-X and PCI-E NICs

2006-02-08 Thread David S. Miller
From: Jeff Garzik [EMAIL PROTECTED]
Date: Wed, 08 Feb 2006 13:02:15 -0500

> Ravinandan Arakali wrote:
> > Hi,
> > Just wondering whether anybody has had a chance to review the patch below.
> > This version (as per Rick's comment on the v1 patch) includes support
> > for TCP timestamps.
>
> It's been merged in the 'lro' branch of netdev-2.6.git for a little
> while now.  Once it gets additional review (and hopefully testing), I am
> OK with it going upstream.

Now that the timestamp support is in there I'm fine with these
changes.