Re: [PATCH] ehea: Optional TX/RX path optimized for SMP

2007-03-03 Thread Benjamin Herrenschmidt
On Sat, 2007-03-03 at 04:06 +0100, Andi Kleen wrote:
> Jan-Bernd Themann [EMAIL PROTECTED] writes:
>
> > Are there any concerns about this approach?
>
> Yes. You should fix the NAPI code instead of trying to work
> around it.

NAPI is being fixed, but the fix will take time to get in. In the
meantime, the workaround is the way to get something working.

Ben.




Re: [PATCH] ehea: Optional TX/RX path optimized for SMP

2007-03-02 Thread Andi Kleen
Jan-Bernd Themann [EMAIL PROTECTED] writes:
>
> Are there any concerns about this approach?

Yes. You should fix the NAPI code instead of trying to work
around it.

-Andi


Re: [PATCH] ehea: Optional TX/RX path optimized for SMP

2007-02-26 Thread Jan-Bernd Themann
Hi

> Also, as far as the approach of using tasklets, I think it would be
> better to use the fake netdev approach to continue to use NAPI.
> Basically you create a pseudo-netdev for each receive queue and have
> NAPI handle the polling for you -- you could look at
> drivers/net/cxgb3 for an example of this.

Thanks for pointing us to this solution. We are now building a NAPI version
that makes use of these pseudo-netdevs. Fairness amongst other netdevices
should be better this way.

> Why make this a module option that the user has to set?  Are there any
> circumstances when someone wouldn't want significant performance
> improvements?  If this approach is just better, then it should just
> replace the old code.

We'll change the default behaviour to multi-queue, but we'd like to keep
the option to run in single-queue mode for debugging and backward
compatibility.

Thanks,

Jan-Bernd & Christoph R.
 


[PATCH] ehea: Optional TX/RX path optimized for SMP

2007-02-23 Thread Jan-Bernd Themann
Hi!

This patch introduces optional alternative receive processing
functionality (enabled via a module load parameter). The ehea adapter
can sort TCP traffic onto multiple receive queues to be processed by
the driver in parallel on multiple CPUs. The hardware always puts
packets for an individual TCP stream on the same queue. As the
current NAPI interface does not allow handling parallel receive
threads for a single adapter (processing on multiple CPUs in parallel),
this patch uses tasklets with a simple fairness algorithm instead.
On the send side we also take advantage of ehea's multiple send queue
capabilities. A simple hash function in combination with the LL_TX
attribute allows TX packets to be processed on multiple CPUs on
different queues. The hash function is needed to guarantee proper TCP
packet ordering. This alternative packet processing functionality leads
to significant performance improvements with ehea.
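
For illustration, here is a minimal sketch of the kind of flow hash we
mean (ehea_select_txq and num_tx_qps are illustrative names only, not
necessarily what the patch uses; 2.6.20-era sk_buff field names).
Because the queue index is a pure function of the stream, all segments
of one TCP connection land on the same send queue and cannot be
reordered against each other:

static inline int ehea_select_txq(struct sk_buff *skb, int num_tx_qps)
{
	/* Hash TCP/IPv4 by source port; everything else takes queue 0.
	 * Any deterministic per-stream function preserves ordering. */
	if (skb->protocol == htons(ETH_P_IP) &&
	    skb->nh.iph->protocol == IPPROTO_TCP)
		return ntohs(skb->h.th->source) % num_tx_qps;

	return 0;
}

hard_start_xmit then uses the returned index to pick a send queue and
its per-queue lock, which is why the LL_TX attribute matters here: the
core does not serialize transmits for us, so different queues can be
driven from different CPUs concurrently.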

Are there any concerns about this approach?

Regards,
Jan-Bernd

Signed-off-by: Jan-Bernd Themann [EMAIL PROTECTED]
---



diff -Nurp -X dontdiff linux-2.6.21-rc1/drivers/net/ehea/ehea.h patched_kernel/drivers/net/ehea/ehea.h
--- linux-2.6.21-rc1/drivers/net/ehea/ehea.h	2007-02-23 15:40:42.000000000 +0100
+++ patched_kernel/drivers/net/ehea/ehea.h	2007-02-23 16:17:37.000000000 +0100
@@ -39,7 +39,7 @@
 #include <asm/io.h>
 
 #define DRV_NAME	"ehea"
-#define DRV_VERSION	"EHEA_0048"
+#define DRV_VERSION	"EHEA_0049"
 
 #define EHEA_MSG_DEFAULT (NETIF_MSG_LINK | NETIF_MSG_TIMER \
 	| NETIF_MSG_RX_ERR | NETIF_MSG_TX_ERR)
@@ -375,8 +375,12 @@ struct ehea_port_res {
struct tasklet_struct send_comp_task;
spinlock_t recv_lock;
struct port_state p_state;
+   struct tasklet_struct recv_task;
u64 rx_packets;
u32 poll_counter;
+   u32 spoll_counter;
+   u32 rtcount;
+   u32 stcount;
 };
 
 
@@ -416,7 +420,9 @@ struct ehea_port {
char int_aff_name[EHEA_IRQ_NAME_SIZE];
int allmulti;/* Indicates IFF_ALLMULTI state */
int promisc; /* Indicates IFF_PROMISC state */
+   int num_tx_qps;
int num_add_tx_qps;
+   int num_mcs;
int resets;
u64 mac_addr;
u32 logical_port_id;
diff -Nurp -X dontdiff linux-2.6.21-rc1/drivers/net/ehea/ehea_main.c patched_kernel/drivers/net/ehea/ehea_main.c
--- linux-2.6.21-rc1/drivers/net/ehea/ehea_main.c	2007-02-23 15:40:42.000000000 +0100
+++ patched_kernel/drivers/net/ehea/ehea_main.c	2007-02-23 16:17:42.000000000 +0100
@@ -51,12 +51,14 @@ static int rq1_entries = EHEA_DEF_ENTRIE
 static int rq2_entries = EHEA_DEF_ENTRIES_RQ2;
 static int rq3_entries = EHEA_DEF_ENTRIES_RQ3;
 static int sq_entries = EHEA_DEF_ENTRIES_SQ;
+static int use_mcs = 0;
 
 module_param(msg_level, int, 0);
 module_param(rq1_entries, int, 0);
 module_param(rq2_entries, int, 0);
 module_param(rq3_entries, int, 0);
 module_param(sq_entries, int, 0);
+module_param(use_mcs, int, 0);
 
 MODULE_PARM_DESC(msg_level, "msg_level");
 MODULE_PARM_DESC(rq3_entries, "Number of entries for Receive Queue 3 "
@@ -71,6 +73,8 @@ MODULE_PARM_DESC(rq1_entries, "Number of
 MODULE_PARM_DESC(sq_entries, " Number of entries for the Send Queue  "
		 "[2^x - 1], x = [6..14]. Default = "
		 __MODULE_STRING(EHEA_DEF_ENTRIES_SQ) ")");
+MODULE_PARM_DESC(use_mcs, " 0:NAPI, 1:MCS, Default = 0 ");
+
 
 void ehea_dump(void *adr, int len, char *msg) {
int x;
@@ -345,10 +349,12 @@ static int ehea_treat_poll_error(struct 
return 0;
 }
 
-static int ehea_poll(struct net_device *dev, int *budget)
+static struct ehea_cqe *ehea_proc_rwqes(struct net_device *dev,
+   struct ehea_port_res *pr,
+   int max_packets,
+   int *packet_cnt)
 {
-	struct ehea_port *port = netdev_priv(dev);
-	struct ehea_port_res *pr = port->port_res[0];
+	struct ehea_port *port = pr->port;
 	struct ehea_qp *qp = pr->qp;
 	struct ehea_cqe *cqe;
 	struct sk_buff *skb;
@@ -359,14 +365,12 @@ static int ehea_poll(struct net_device *
 	int skb_arr_rq2_len = pr->rq2_skba.len;
 	int skb_arr_rq3_len = pr->rq3_skba.len;
 	int processed, processed_rq1, processed_rq2, processed_rq3;
-	int wqe_index, last_wqe_index, rq, intreq, my_quota, port_reset;
+	int wqe_index, last_wqe_index, rq, my_quota, port_reset;
 
 	processed = processed_rq1 = processed_rq2 = processed_rq3 = 0;
 	last_wqe_index = 0;
-	my_quota = min(*budget, dev->quota);
-	my_quota = min(my_quota, EHEA_POLL_MAX_RWQE);
+	my_quota = max_packets;
 
-	/* rq0 is low latency RQ */
 	cqe = ehea_poll_rq1(qp, &wqe_index);
 	while ((my_quota > 0) && cqe) {
 		ehea_inc_rq1(qp);
@@ -386,6 +390,7 @@ static int ehea_poll(struct net_device *
 		if (unlikely(!skb)) {
  

Re: [PATCH] ehea: Optional TX/RX path optimized for SMP

2007-02-23 Thread Roland Dreier
> This patch introduces optional alternative receive processing
> functionality (enabled via a module load parameter). The ehea adapter
> can sort TCP traffic onto multiple receive queues to be processed by
> the driver in parallel on multiple CPUs. The hardware always puts
> packets for an individual TCP stream on the same queue. As the
> current NAPI interface does not allow handling parallel receive
> threads for a single adapter (processing on multiple CPUs in parallel),
> this patch uses tasklets with a simple fairness algorithm instead.
> On the send side we also take advantage of ehea's multiple send queue
> capabilities. A simple hash function in combination with the LL_TX
> attribute allows TX packets to be processed on multiple CPUs on
> different queues. The hash function is needed to guarantee proper TCP
> packet ordering. This alternative packet processing functionality leads
> to significant performance improvements with ehea.

Why make this a module option that the user has to set?  Are there any
circumstances when someone wouldn't want significant performance
improvements?  If this approach is just better, then it should just
replace the old code.

Also, as far as the approach of using tasklets, I think it would be
better to use the fake netdev approach to continue to use NAPI.
Basically you create a pseudo-netdev for each receive queue and have
NAPI handle the polling for you -- you could look at
drivers/net/cxgb3 for an example of this.
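
Roughly, the scheme looks like this against the current NAPI interface
(just a sketch, not the actual cxgb3 code; ehea_rx_process,
ehea_enable_rx_irq, ehea_disable_rx_irq and the pr->poll_dev member are
made-up names). Each receive queue owns a dummy net_device that is
never registered, so the core schedules and polls every queue
independently:

static int ehea_queue_poll(struct net_device *poll_dev, int *budget)
{
	/* dev->priv points back at the receive queue this dummy
	 * netdev stands for */
	struct ehea_port_res *pr = poll_dev->priv;
	int quota = min(*budget, poll_dev->quota);
	int done = ehea_rx_process(pr, quota);	/* hypothetical per-queue RX */

	*budget -= done;
	poll_dev->quota -= done;

	if (done < quota) {			/* queue drained */
		netif_rx_complete(poll_dev);
		ehea_enable_rx_irq(pr);		/* hypothetical helper */
		return 0;			/* done, leave poll list */
	}
	return 1;				/* more work, stay scheduled */
}

static irqreturn_t ehea_rx_irq(int irq, void *param)
{
	struct ehea_port_res *pr = param;

	ehea_disable_rx_irq(pr);		/* hypothetical helper */
	netif_rx_schedule(pr->poll_dev);	/* per-queue NAPI kick */
	return IRQ_HANDLED;
}

A real version also has to initialize poll_dev->poll and
poll_dev->weight and set the __LINK_STATE_START bit, since
netif_rx_schedule ignores devices that don't look running.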

 - R.