Re: [PATCH] ehea: Optional TX/RX path optimized for SMP
On Sat, Mar 03, 2007 at 09:28:28AM +0100, Benjamin Herrenschmidt wrote: > On Sat, 2007-03-03 at 04:06 +0100, Andi Kleen wrote: > > Jan-Bernd Themann <[EMAIL PROTECTED]> writes: > > > > > > Are there any concerns about this approach? > > > > Yes. You should fix the NAPI code instead of trying to work > > around it. > > NAPI is being fixed but the fix will take time to get in. In the > meantime, the solution to get something working is the workaround If it works right now with just a little less efficiency there is no pressing need to do workarounds until the real solution. Besides I doubt that patch would have made .21 anyways and in .22 you might already have multiqueue NAPI. -Andi - To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH] ehea: Optional TX/RX path optimized for SMP
On Sat, 2007-03-03 at 04:06 +0100, Andi Kleen wrote: > Jan-Bernd Themann <[EMAIL PROTECTED]> writes: > > > > Are there any concerns about this approach? > > Yes. You should fix the NAPI code instead of trying to work > around it. NAPI is being fixed but the fix will take time to get in. In the meantime, the solution to get something working is the workaround. Ben. - To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH] ehea: Optional TX/RX path optimized for SMP
Jan-Bernd Themann <[EMAIL PROTECTED]> writes: > > Are there any concerns about this approach? Yes. You should fix the NAPI code instead of trying to work around it. -Andi - To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH] ehea: Optional TX/RX path optimized for SMP
Hi > Also, as far as the approach of using tasklets, I think it would be > better to use the "fake netdev" approach to continue to use NAPI. > Basically you create a pseudo-netdev for each receive queue and have > NAPI handle the polling for you -- you could look for > drivers/net/cxgb3 for an example of this. > Thanks for pointing us to this solution. We are now building a NAPI version that makes use of these pseudo-netdev. The fairness amongst other netdevices should be better this way. > > Why make this a module option that the user has to set? Are there any > circumstances when someone wouldn't want "significant performance > improvements?" If this approach is just better, then it should just > replace the old code. > We'll change the default behaviour to multi queue, but we'd like to keep the option to run in a single queue mode for debug and backward compabilty. Thanks, Jan-Bernd & Christoph R. - To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH] ehea: Optional TX/RX path optimized for SMP
> This patch introduces an optional alternative receive processing > functionality (enabled via module load parameter). The ehea adapter > can sort TCP traffic to multiple receive queues to be processed by > the driver in parallel on multiple CPUs. The hardware always puts > packets for an individual tcp stream on the same queue. As the > current NAPI interface does not allow to handle parallel receive > threads for a single adapter (processing on multiple CPUs in parallel) > this patch uses tasklets with a simple fairness algorithm instead. > On the send side we also take advantage of ehea's multiple send queue > capabilites. A simple hash function in combination with the LL_TX > attribute allows to process tx-packets on multiple CPUs on different > queues. The hash function is needed to guarantee proper TCP packet > ordering. This alternative packet processing functionality leads to > significant performance improvements with ehea. Why make this a module option that the user has to set? Are there any circumstances when someone wouldn't want "significant performance improvements?" If this approach is just better, then it should just replace the old code. Also, as far as the approach of using tasklets, I think it would be better to use the "fake netdev" approach to continue to use NAPI. Basically you create a pseudo-netdev for each receive queue and have NAPI handle the polling for you -- you could look for drivers/net/cxgb3 for an example of this. - R. - To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH] ehea: Optional TX/RX path optimized for SMP
Hi! This patch introduces an optional alternative receive processing functionality (enabled via module load parameter). The ehea adapter can sort TCP traffic to multiple receive queues to be processed by the driver in parallel on multiple CPUs. The hardware always puts packets for an individual tcp stream on the same queue. As the current NAPI interface does not allow to handle parallel receive threads for a single adapter (processing on multiple CPUs in parallel) this patch uses tasklets with a simple fairness algorithm instead. On the send side we also take advantage of ehea's multiple send queue capabilites. A simple hash function in combination with the LL_TX attribute allows to process tx-packets on multiple CPUs on different queues. The hash function is needed to guarantee proper TCP packet ordering. This alternative packet processing functionality leads to significant performance improvements with ehea. Are there any concerns about this approach? Regards, Jan-Bernd Signed-off-by: Jan-Bernd Themann <[EMAIL PROTECTED]> --- diff -Nurp -X dontdiff linux-2.6.21-rc1/drivers/net/ehea/ehea.h patched_kernel/drivers/net/ehea/ehea.h --- linux-2.6.21-rc1/drivers/net/ehea/ehea.h2007-02-23 15:40:42.0 +0100 +++ patched_kernel/drivers/net/ehea/ehea.h 2007-02-23 16:17:37.0 +0100 @@ -39,7 +39,7 @@ #include #define DRV_NAME "ehea" -#define DRV_VERSION"EHEA_0048" +#define DRV_VERSION"EHEA_0049" #define EHEA_MSG_DEFAULT (NETIF_MSG_LINK | NETIF_MSG_TIMER \ | NETIF_MSG_RX_ERR | NETIF_MSG_TX_ERR) @@ -375,8 +375,12 @@ struct ehea_port_res { struct tasklet_struct send_comp_task; spinlock_t recv_lock; struct port_state p_state; + struct tasklet_struct recv_task; u64 rx_packets; u32 poll_counter; + u32 spoll_counter; + u32 rtcount; + u32 stcount; }; @@ -416,7 +420,9 @@ struct ehea_port { char int_aff_name[EHEA_IRQ_NAME_SIZE]; int allmulti;/* Indicates IFF_ALLMULTI state */ int promisc; /* Indicates IFF_PROMISC state */ + int num_tx_qps; int num_add_tx_qps; + int num_mcs; int resets; u64 mac_addr; u32 logical_port_id; diff -Nurp -X dontdiff linux-2.6.21-rc1/drivers/net/ehea/ehea_main.c patched_kernel/drivers/net/ehea/ehea_main.c --- linux-2.6.21-rc1/drivers/net/ehea/ehea_main.c 2007-02-23 15:40:42.0 +0100 +++ patched_kernel/drivers/net/ehea/ehea_main.c 2007-02-23 16:17:42.0 +0100 @@ -51,12 +51,14 @@ static int rq1_entries = EHEA_DEF_ENTRIE static int rq2_entries = EHEA_DEF_ENTRIES_RQ2; static int rq3_entries = EHEA_DEF_ENTRIES_RQ3; static int sq_entries = EHEA_DEF_ENTRIES_SQ; +static int use_mcs = 0; module_param(msg_level, int, 0); module_param(rq1_entries, int, 0); module_param(rq2_entries, int, 0); module_param(rq3_entries, int, 0); module_param(sq_entries, int, 0); +module_param(use_mcs, int, 0); MODULE_PARM_DESC(msg_level, "msg_level"); MODULE_PARM_DESC(rq3_entries, "Number of entries for Receive Queue 3 " @@ -71,6 +73,8 @@ MODULE_PARM_DESC(rq1_entries, "Number of MODULE_PARM_DESC(sq_entries, " Number of entries for the Send Queue " "[2^x - 1], x = [6..14]. Default = " __MODULE_STRING(EHEA_DEF_ENTRIES_SQ) ")"); +MODULE_PARM_DESC(use_mcs, " 0:NAPI, 1:MCS, Default = 0 "); + void ehea_dump(void *adr, int len, char *msg) { int x; @@ -345,10 +349,12 @@ static int ehea_treat_poll_error(struct return 0; } -static int ehea_poll(struct net_device *dev, int *budget) +static struct ehea_cqe *ehea_proc_rwqes(struct net_device *dev, + struct ehea_port_res *pr, + int max_packets, + int *packet_cnt) { - struct ehea_port *port = netdev_priv(dev); - struct ehea_port_res *pr = &port->port_res[0]; + struct ehea_port *port = pr->port; struct ehea_qp *qp = pr->qp; struct ehea_cqe *cqe; struct sk_buff *skb; @@ -359,14 +365,12 @@ static int ehea_poll(struct net_device * int skb_arr_rq2_len = pr->rq2_skba.len; int skb_arr_rq3_len = pr->rq3_skba.len; int processed, processed_rq1, processed_rq2, processed_rq3; - int wqe_index, last_wqe_index, rq, intreq, my_quota, port_reset; + int wqe_index, last_wqe_index, rq, my_quota, port_reset; processed = processed_rq1 = processed_rq2 = processed_rq3 = 0; last_wqe_index = 0; - my_quota = min(*budget, dev->quota); - my_quota = min(my_quota, EHEA_POLL_MAX_RWQE); + my_quota = max_packets; - /* rq0 is low latency RQ */ cqe = ehea_poll_rq1(qp, &wqe_index); while ((my_quota > 0) && cqe) { ehea_inc_rq1(qp); @@ -386,6 +390,7 @@ static int ehea_poll(struct net_device * if (unlikely(!sk