Hi Amitkumar, On Thu, Oct 23, 2014 at 2:40 PM, Amitkumar Karwar <akar...@marvell.com> wrote: > Hi Marek, > >>> I tried to capture logs but when enable DYNAMIC_DEBUG I cannot >>> reproduce issue (running test > 30 minutes without allocation >>> failure). > > Thanks for the testing. Yes. Sometimes timing issues won't get reproduced > with debug messages enabled. > >>Any update on this? Should I provide some other logs? > > What's the size of Rx data packets? Is the Rx data AMSDU aggregated?(You can > check if "if (rx_pkt_type == PKT_TYPE_AMSDU)" check is passed in mwifiex > code) If so, disable AMSDU option in AP and try to reproduce the issue. Size of Rx data packets are : pkt type == 2, size - 1574bytes + BAR pkt type, size - 34bytes. AMSDU isn't enabled on AP (verified by adding debug message in mwifiex_process_sta_rx_packet()). > > As you suspected earlier, we might have missed to free skbs allocated for Rx > data which leads to SKB allocation failure. There is very less probability > for this. But we can try below experiment. > > 1) I observed that debug variable "adapter->rx_pending" doesn;t get > decremented when packet is submitted to kernel. Add below code change(valid > only for AMSDU disabled case. Because multiple packets are submitted to > kernel when AMSDU is enabled) > > ---------- > --- a/drivers/net/wireless/mwifiex/util.c > +++ b/drivers/net/wireless/mwifiex/util.c > @@ -218,6 +218,7 @@ int mwifiex_recv_packet(struct mwifiex_private *priv, > struct sk_buff *skb) > > priv->stats.rx_bytes += skb->len; > priv->stats.rx_packets++; > + atomic_dec(&priv->adapter->rx_pending); > if (in_interrupt()) > netif_rx(skb); > ---------- OK, patch applied. > > 2) Add BUG_ON when first time SKB allocation is failed and print > "rx_pending". If it's a huge number, we have missed freeing allocated SKB. [ 167.624452] usb 1-1: mwifiex_usb_submit_rx_urb: dev_alloc_skb failed, rx_pending:26893 [ 167.632885] ------------[ cut here ]------------ [ 167.637743] Kernel BUG at bf8a22ae [verbose debug info unavailable] so number seems to be huge and we seems miss free allocated skb.
I did some hacks to code which shows how many packets are received between 2 BAR packets + I print every 500ms rx_pending packets (when packet is received) and also when packet is send to kernel. I also update counter how many packets after reordering are sent to kernel. Log: [ 71.973800] usb 1-1: rx_pending:11 [ 72.077308] usb 1-1: rx_pending kernel:10 [ 72.477546] usb 1-1: rx_pending:868 [ 72.587877] usb 1-1: rx_pending kernel:1096 [ 72.818041] usb 1-1: Received between 2 BAR:6275 [ 72.823127] usb 1-1: Networking send size:6271 [ 72.987375] usb 1-1: rx_pending:1940 [ 73.097504] usb 1-1: rx_pending kernel:2159 [ 73.431973] usb 1-1: Received between 2 BAR:1602 [ 73.437106] usb 1-1: Networking send size:1608 [ 73.497381] usb 1-1: rx_pending:2983 [ 73.608315] usb 1-1: rx_pending kernel:3091 [ 74.007379] usb 1-1: rx_pending:3767 [ 74.117879] usb 1-1: rx_pending kernel:3998 [ 74.517375] usb 1-1: rx_pending:4854 [ 74.543168] usb 1-1: Received between 2 BAR:3152 [ 74.548371] usb 1-1: Networking send size:3152 [ 74.627509] usb 1-1: rx_pending kernel:5062 [ 74.743362] usb 1-1: Received between 2 BAR:495 [ 74.748483] usb 1-1: Networking send size:494 [ 75.027523] usb 1-1: rx_pending:5872 [ 75.137961] usb 1-1: rx_pending kernel:6106 [ 75.537485] usb 1-1: rx_pending:6959 [ 75.647934] usb 1-1: rx_pending kernel:7188 [ 75.656273] usb 1-1: Received between 2 BAR:2383 [ 75.661528] usb 1-1: Networking send size:2382 [ 76.047441] usb 1-1: rx_pending:8004 [ 76.157712] usb 1-1: rx_pending kernel:8240 [ 76.557547] usb 1-1: rx_pending:9095 [ 76.667991] usb 1-1: rx_pending kernel:9326 [ 76.769662] usb 1-1: Received between 2 BAR:2918 [ 76.775047] usb 1-1: Networking send size:2914 [ 77.067491] usb 1-1: rx_pending:10155 [ 77.177524] usb 1-1: rx_pending kernel:10383 [ 77.577547] usb 1-1: rx_pending:11237 [ 77.687859] usb 1-1: rx_pending kernel:11461 [ 78.087401] usb 1-1: rx_pending:12304 [ 78.197992] usb 1-1: rx_pending kernel:12539 [ 78.597442] usb 1-1: rx_pending:13391 [ 78.707786] usb 1-1: rx_pending kernel:13628 [ 79.107487] usb 1-1: rx_pending:14469 [ 79.217812] usb 1-1: rx_pending kernel:14704 [ 79.617528] usb 1-1: rx_pending:15555 [ 79.727734] usb 1-1: rx_pending kernel:15781 [ 80.127486] usb 1-1: rx_pending:16624 [ 80.237756] usb 1-1: rx_pending kernel:16855 [ 80.637423] usb 1-1: rx_pending:17699 [ 80.747868] usb 1-1: rx_pending kernel:17926 [ 81.147446] usb 1-1: rx_pending:18769 [ 81.257800] usb 1-1: rx_pending kernel:18995 [ 81.657399] usb 1-1: rx_pending:19851 [ 81.767785] usb 1-1: rx_pending kernel:20083 [ 82.167341] usb 1-1: rx_pending:20938 [ 82.278005] usb 1-1: rx_pending kernel:21174 [ 82.677527] usb 1-1: rx_pending:22025 [ 82.787508] usb 1-1: rx_pending kernel:22246 [ 83.187390] usb 1-1: rx_pending:23103 [ 83.297690] usb 1-1: rx_pending kernel:23326 [ 83.697376] usb 1-1: rx_pending:24182 [ 83.807711] usb 1-1: rx_pending kernel:24404 [ 84.207487] usb 1-1: rx_pending:25236 [ 84.317747] usb 1-1: rx_pending kernel:25466 [ 84.717378] usb 1-1: rx_pending:26308 [ 84.827806] usb 1-1: rx_pending kernel:26522 [ 85.070931] usb 1-1: mwifiex_usb_submit_rx_urb: dev_alloc_skb failed - rx_pending:27017 [ 85.079788] usb 1-1: mwifiex_usb_submit_rx_urb: dev_alloc_skb failed - rx_pending:27019 [ 85.088424] usb 1-1: mwifiex_usb_submit_rx_urb: dev_alloc_skb failed - rx_pending:27020 [ 85.097029] usb 1-1: mwifiex_usb_submit_rx_urb: dev_alloc_skb failed - rx_pending:27020 [ 85.105845] usb 1-1: mwifiex_usb_submit_rx_urb: dev_alloc_skb failed - rx_pending:27022 [ 85.114468] usb 1-1: mwifiex_usb_submit_rx_urb: dev_alloc_skb failed - rx_pending:27023 [ 85.337391] usb 1-1: rx_pending kernel:22327 [ 85.400153] usb 1-1: Received between 2 BAR:27875 [ 85.405135] usb 1-1: Networking send size:27881 [ 85.766194] usb 1-1: Received between 2 BAR:7781 [ 85.771152] usb 1-1: Networking send size:7781 [ 85.847374] usb 1-1: rx_pending kernel:11681 [ 86.112809] usb 1-1: Received between 2 BAR:7263 [ 86.117755] usb 1-1: Networking send size:7262 [ 86.357378] usb 1-1: rx_pending kernel:827 According log it seems rx_pending is slowly increasing until allocation fails. Code hacks are attached. > > 3) Also, get rx_pending info when Rx traffic works fine with 40M bandwidth > option. when -b40 is set then IMO 11n reordering isn't applied. I added some debug code to mwifiex_11n_rx_reorder_pkt() .. snip .. tbl = mwifiex_11n_get_rx_reorder_tbl(priv, tid, ta); if (!tbl) { .. snip .. is always true (!tbl) so rx_pending after session is 0. > > Btw, could you move to the firmware image (14.68.29.p38) > shared recently? Tested with 14.68.29.p38 FW version. Issue still reproducible. > > By any means(redirecting the log to serial console etc.), could you please > capture and share kernel trace logs when system becomes unresponsive. It may > give some clue and help us debug further. As I wrote earlier with DYNAMIC_DEBUG I cannot reproduce issue. Maybe when we drop some debug code (received packet over USB) issue can be reproducible again, but it's just guess. > > Best Regards, > Amit BR, marek -- as simple and primitive as possible ------------------------------------------------- Marek Belisko - OPEN-NANDRA Freelance Developer Ruska Nova Ves 219 | Presov, 08005 Slovak Republic Tel: +421 915 052 184 skype: marekwhite twitter: #opennandra web: http://open-nandra.com
diff --git a/drivers/net/wireless/mwifiex/11n_rxreorder.c b/drivers/net/wireless/mwifiex/11n_rxreorder.c index 5e796f8..f6a8d0d 100644 --- a/drivers/net/wireless/mwifiex/11n_rxreorder.c +++ b/drivers/net/wireless/mwifiex/11n_rxreorder.c @@ -26,6 +26,7 @@ #include "11n.h" #include "11n_rxreorder.h" +static int pkt_send = 0; /* * This function dispatches all packets in the Rx reorder table until the * start window. @@ -36,7 +37,7 @@ */ static void mwifiex_11n_dispatch_pkt(struct mwifiex_private *priv, - struct mwifiex_rx_reorder_tbl *tbl, int start_win) + struct mwifiex_rx_reorder_tbl *tbl, int start_win, int *send_size) { int pkt_to_send, i; void *rx_tmp_ptr; @@ -45,6 +46,7 @@ mwifiex_11n_dispatch_pkt(struct mwifiex_private *priv, pkt_to_send = (start_win > tbl->start_win) ? min((start_win - tbl->start_win), tbl->win_size) : tbl->win_size; +// pr_info("%s - pkt_to_send:%d\n", __func__, pkt_to_send); for (i = 0; i < pkt_to_send; ++i) { spin_lock_irqsave(&priv->rx_pkt_lock, flags); @@ -57,8 +59,11 @@ mwifiex_11n_dispatch_pkt(struct mwifiex_private *priv, if (rx_tmp_ptr) { if (priv->bss_role == MWIFIEX_BSS_ROLE_UAP) mwifiex_handle_uap_rx_forward(priv, rx_tmp_ptr); - else + else { mwifiex_process_rx_packet(priv, rx_tmp_ptr); + if (send_size) + *send_size += 1; + } } } @@ -86,7 +91,7 @@ mwifiex_11n_dispatch_pkt(struct mwifiex_private *priv, */ static void mwifiex_11n_scan_and_dispatch(struct mwifiex_private *priv, - struct mwifiex_rx_reorder_tbl *tbl) + struct mwifiex_rx_reorder_tbl *tbl, int *send_size) { int i, j, xchg; void *rx_tmp_ptr; @@ -104,8 +109,10 @@ mwifiex_11n_scan_and_dispatch(struct mwifiex_private *priv, if (priv->bss_role == MWIFIEX_BSS_ROLE_UAP) mwifiex_handle_uap_rx_forward(priv, rx_tmp_ptr); - else + else { mwifiex_process_rx_packet(priv, rx_tmp_ptr); + *send_size += 1; + } } spin_lock_irqsave(&priv->rx_pkt_lock, flags); @@ -140,7 +147,7 @@ mwifiex_del_rx_reorder_entry(struct mwifiex_private *priv, return; mwifiex_11n_dispatch_pkt(priv, tbl, (tbl->start_win + tbl->win_size) & - (MAX_TID_VALUE - 1)); + (MAX_TID_VALUE - 1), NULL); del_timer(&tbl->timer_context.timer); @@ -238,7 +245,7 @@ mwifiex_flush_data(unsigned long context) dev_dbg(ctx->priv->adapter->dev, "info: flush data %d\n", start_win); mwifiex_11n_dispatch_pkt(ctx->priv, ctx->ptr, (ctx->ptr->start_win + start_win + 1) & - (MAX_TID_VALUE - 1)); + (MAX_TID_VALUE - 1), NULL); } /* @@ -267,7 +274,7 @@ mwifiex_11n_create_rx_reorder_tbl(struct mwifiex_private *priv, u8 *ta, */ tbl = mwifiex_11n_get_rx_reorder_tbl(priv, tid, ta); if (tbl) { - mwifiex_11n_dispatch_pkt(priv, tbl, seq_num); + mwifiex_11n_dispatch_pkt(priv, tbl, seq_num, NULL); return; } /* if !tbl then create one */ @@ -434,11 +441,15 @@ int mwifiex_11n_rx_reorder_pkt(struct mwifiex_private *priv, tbl = mwifiex_11n_get_rx_reorder_tbl(priv, tid, ta); if (!tbl) { + pr_info("!tbl\n"); if (pkt_type != PKT_TYPE_BAR) { if (priv->bss_role == MWIFIEX_BSS_ROLE_UAP) mwifiex_handle_uap_rx_forward(priv, payload); else mwifiex_process_rx_packet(priv, payload); + } else { + // missing free (maybe also rx_pending) + dev_kfree_skb_any(payload); } return 0; } @@ -485,7 +496,7 @@ int mwifiex_11n_rx_reorder_pkt(struct mwifiex_private *priv, start_win = (end_win - win_size) + 1; else start_win = (MAX_TID_VALUE - (win_size - seq_num)) + 1; - mwifiex_11n_dispatch_pkt(priv, tbl, start_win); + mwifiex_11n_dispatch_pkt(priv, tbl, start_win, &pkt_send); } if (pkt_type != PKT_TYPE_BAR) { @@ -498,13 +509,16 @@ int mwifiex_11n_rx_reorder_pkt(struct mwifiex_private *priv, return -1; tbl->rx_reorder_ptr[pkt_index] = payload; + } else if (pkt_type == PKT_TYPE_BAR) { + dev_info(priv->adapter->dev, "Networking send size:%d\n", pkt_send); + pkt_send = 0; } /* * Dispatch all packets sequentially from start_win until a * hole is found and adjust the start_win appropriately */ - mwifiex_11n_scan_and_dispatch(priv, tbl); + mwifiex_11n_scan_and_dispatch(priv, tbl, &pkt_send); return 0; } diff --git a/drivers/net/wireless/mwifiex/sta_rx.c b/drivers/net/wireless/mwifiex/sta_rx.c index b5c1095..bc1c0d7 100644 --- a/drivers/net/wireless/mwifiex/sta_rx.c +++ b/drivers/net/wireless/mwifiex/sta_rx.c @@ -122,6 +122,7 @@ int mwifiex_process_sta_rx_packet(struct mwifiex_private *priv, struct rx_packet_hdr *rx_pkt_hdr; u8 ta[ETH_ALEN]; u16 rx_pkt_type, rx_pkt_offset, rx_pkt_length, seq_num; + static int pkts_received = 0; local_rx_pd = (struct rxpd *) (skb->data); rx_pkt_type = le16_to_cpu(local_rx_pd->rx_pkt_type); @@ -146,6 +147,7 @@ int mwifiex_process_sta_rx_packet(struct mwifiex_private *priv, } if (rx_pkt_type == PKT_TYPE_AMSDU) { + pr_info("AMSDU\n"); struct sk_buff_head list; struct sk_buff *rx_skb; @@ -168,6 +170,7 @@ int mwifiex_process_sta_rx_packet(struct mwifiex_private *priv, ret = mwifiex_process_mgmt_packet(priv, skb); if (ret) dev_err(adapter->dev, "Rx of mgmt packet failed"); + atomic_dec(&priv->adapter->rx_pending); dev_kfree_skb_any(skb); return ret; } @@ -191,10 +194,19 @@ int mwifiex_process_sta_rx_packet(struct mwifiex_private *priv, ETH_ALEN); } + if (rx_pkt_type != PKT_TYPE_BAR && skb->len > 1500) { + pkts_received++; + } + + if (rx_pkt_type == PKT_TYPE_BAR) { + dev_info(priv->adapter->dev, "Received between 2 BAR:%d\n", pkts_received); + pkts_received = 0; + } /* Reorder and send to OS */ ret = mwifiex_11n_rx_reorder_pkt(priv, seq_num, local_rx_pd->priority, ta, (u8) rx_pkt_type, skb); + if (ret || (rx_pkt_type == PKT_TYPE_BAR)) { if (adapter->if_ops.data_complete) adapter->if_ops.data_complete(adapter, skb); diff --git a/drivers/net/wireless/mwifiex/usb.c b/drivers/net/wireless/mwifiex/usb.c index 7d9d10b..959daed 100644 --- a/drivers/net/wireless/mwifiex/usb.c +++ b/drivers/net/wireless/mwifiex/usb.c @@ -38,6 +38,7 @@ static struct usb_device_id mwifiex_usb_table[] = { MODULE_DEVICE_TABLE(usb, mwifiex_usb_table); +unsigned long timeout = 0; /* Indicate if load FW in MFG (testing) mode */ static int mfg_mode = 0; @@ -178,8 +179,14 @@ static void mwifiex_usb_rx_complete(struct urb *urb) atomic_inc(&adapter->rx_pending); status = mwifiex_usb_recv(adapter, skb, context->ep); + if (time_after(jiffies, timeout)) { + dev_info(adapter->dev, "rx_pending:%d\n", atomic_read(&adapter->rx_pending)); + timeout = jiffies + msecs_to_jiffies(500); + } + dev_dbg(adapter->dev, "info: recv_length=%d, status=%d\n", recv_length, status); + if (status == -EINPROGRESS) { queue_work(adapter->workqueue, &adapter->main_work); @@ -259,7 +266,7 @@ static int mwifiex_usb_submit_rx_urb(struct urb_context *ctx, int size) ctx->skb = dev_alloc_skb(size); if (!ctx->skb) { dev_err(adapter->dev, - "%s: dev_alloc_skb failed\n", __func__); + "%s: dev_alloc_skb failed - rx_pending:%d\n", __func__, atomic_read(&adapter->rx_pending)); return -ENOMEM; } } @@ -344,6 +351,7 @@ static int mwifiex_usb_probe(struct usb_interface *intf, bcd_usb = le16_to_cpu(udev->descriptor.bcdUSB); pr_debug("info: VID/PID = %X/%X, Boot2 version = %X\n", id_vendor, id_product, bcd_device); + timeout = jiffies + msecs_to_jiffies(1000); /* PID_1 is used for firmware downloading only */ if (id_product == USB8797_PID_1) diff --git a/drivers/net/wireless/mwifiex/util.c b/drivers/net/wireless/mwifiex/util.c index 2155397..b18ee13 100644 --- a/drivers/net/wireless/mwifiex/util.c +++ b/drivers/net/wireless/mwifiex/util.c @@ -25,6 +25,7 @@ #include "wmm.h" #include "11n.h" +unsigned long timeout = 0; /* * Firmware initialization complete callback handler. * @@ -195,6 +196,9 @@ int mwifiex_recv_packet(struct mwifiex_private *priv, struct sk_buff *skb) skb->protocol = eth_type_trans(skb, priv->netdev); skb->ip_summed = CHECKSUM_NONE; + if (timeout == 0) + timeout = jiffies; + /* This is required only in case of 11n and USB as we alloc * a buffer of 4K only if its 11N (to be able to receive 4K * AMSDU packets). In case of SD we allocate buffers based @@ -218,6 +222,12 @@ int mwifiex_recv_packet(struct mwifiex_private *priv, struct sk_buff *skb) priv->stats.rx_bytes += skb->len; priv->stats.rx_packets++; + atomic_dec(&priv->adapter->rx_pending); + if (time_after(jiffies, timeout)) { + dev_info(priv->adapter->dev, "rx_pending kernel:%d\n", atomic_read(&priv->adapter->rx_pending)); + timeout = jiffies + msecs_to_jiffies(500); + } + if (in_interrupt()) netif_rx(skb); else