[PATCH] libertas: call into generic suspend code before turning off power
When powering down a SDIO connected card during suspend, make sure to call into the generic lbs_suspend() function before pulling the plug. This will make sure the card is successfully deregistered from the system to avoid communication to the card starving out. Fixes: 7444a8092906 ("libertas: fix suspend and resume for SDIO connected cards") Signed-off-by: Daniel Mack --- If possible, this should go in for 4.19. drivers/net/wireless/marvell/libertas/if_sdio.c | 4 1 file changed, 4 insertions(+) diff --git a/drivers/net/wireless/marvell/libertas/if_sdio.c b/drivers/net/wireless/marvell/libertas/if_sdio.c index 43743c26c071..39bf85d0ade0 100644 --- a/drivers/net/wireless/marvell/libertas/if_sdio.c +++ b/drivers/net/wireless/marvell/libertas/if_sdio.c @@ -1317,6 +1317,10 @@ static int if_sdio_suspend(struct device *dev) if (priv->wol_criteria == EHS_REMOVE_WAKEUP) { dev_info(dev, "Suspend without wake params -- powering down card\n"); if (priv->fw_ready) { + ret = lbs_suspend(priv); + if (ret) + return ret; + priv->power_up_on_resume = true; if_sdio_power_off(card); } -- 2.17.1
Re: [PATCH v3 00/11] NFC: A bunch of cleanups for st95hf
Hi Samuel, On 6/8/2018 12:38 PM, Daniel Mack wrote: On Tuesday, July 24, 2018 11:59 AM, Daniel Mack wrote: This v3 of a series of patches for the ST95HF driver. Patch #1 reverts a change that I have submitted earlier and which is sitting in nfc-next already. Given that the tree hasn't been sent out for being merged yet, it could also still be removed with rebasing, in which case #1 is not necessary of course. The changes are all rather simple and are explained in their individual commit logs. Note that this series builds upon the patch titled "nfc: st95hf: remove redundant pointers 'dev' and 'nfcddev'" that Colin posted the other day. Could you still apply this set for 4.19? Hmm it seems it has missed the merge window again. May I ask whether you still consider applying these to your tree and get it merged? Thanks, Daniel
Re: [PATCH v3 00/11] NFC: A bunch of cleanups for st95hf
Hi Samuel, On Tuesday, July 24, 2018 11:59 AM, Daniel Mack wrote: This v3 of a series of patches for the ST95HF driver. Patch #1 reverts a change that I have submitted earlier and which is sitting in nfc-next already. Given that the tree hasn't been sent out for being merged yet, it could also still be removed with rebasing, in which case #1 is not necessary of course. The changes are all rather simple and are explained in their individual commit logs. Note that this series builds upon the patch titled "nfc: st95hf: remove redundant pointers 'dev' and 'nfcddev'" that Colin posted the other day. Could you still apply this set for 4.19? At least #1 should go in, or commit c99f996b2ba49 ("NFC: st95hf: drop illegal kfree_skb()") which is already in your tree should be removed with a rebase. For the rest, I'm in no hurry. I just want to prevent a regression in 4.19. Thanks, Daniel Thanks, Daniel Changelog: v1 → v2: * Improved commit logs, identical patch content. v2 → v3: * Added another patch titled "NFC: st95hf: ignore spurious interrupts" Daniel Mack (11): Revert "NFC: st95hf: drop illegal kfree_skb()" NFC: st95hf: drop nfcdev_free NFC: st95hf: drop illegal kfree_skb() in IRQ handler NFC: st95hf: remove logging from spi functions NFC: st95hf: remove exchange_lock NFC: st95hf: move skb allocation to ISR NFC: st95hf: ignore spurious interrupts NFC: st95hf: re-order command defines NFC: st95hf: unify sync/async flags NFC: st95hf: two small style nits NFC: st95hf: add of match table drivers/nfc/st95hf/core.c | 154 ++ drivers/nfc/st95hf/spi.c | 31 +++- drivers/nfc/st95hf/spi.h | 8 +- 3 files changed, 66 insertions(+), 127 deletions(-)
[PATCH v3 11/11] NFC: st95hf: add of match table
Add a match table for device tree compatible strings. Interestingly, a document describing the bindings already exists since a while, but users currently rely on the implicit matching in the drivers core. Let's be explicit and add a real match table. Signed-off-by: Daniel Mack Cc: devicet...@vger.kernel.org --- drivers/nfc/st95hf/core.c | 7 +++ 1 file changed, 7 insertions(+) diff --git a/drivers/nfc/st95hf/core.c b/drivers/nfc/st95hf/core.c index ccb1f1456871..6161838fcf08 100644 --- a/drivers/nfc/st95hf/core.c +++ b/drivers/nfc/st95hf/core.c @@ -1027,6 +1027,12 @@ static const struct spi_device_id st95hf_id[] = { }; MODULE_DEVICE_TABLE(spi, st95hf_id); +static const struct of_device_id st95hf_of_match[] = { + { .compatible = "st,st95hf", }, + {} +}; +MODULE_DEVICE_TABLE(of, st95hf_of_match); + static int st95hf_probe(struct spi_device *nfc_spi_dev) { int ret; @@ -1204,6 +1210,7 @@ static struct spi_driver st95hf_driver = { .driver = { .name = "st95hf", .owner = THIS_MODULE, + .of_match_table = st95hf_of_match, }, .id_table = st95hf_id, .probe = st95hf_probe, -- 2.17.1
[PATCH v3 10/11] NFC: st95hf: two small style nits
Address two minor style issues that I came across while working on the driver. Signed-off-by: Daniel Mack --- drivers/nfc/st95hf/core.c | 4 +++- 1 file changed, 3 insertions(+), 1 deletion(-) diff --git a/drivers/nfc/st95hf/core.c b/drivers/nfc/st95hf/core.c index 10321dba15dc..ccb1f1456871 100644 --- a/drivers/nfc/st95hf/core.c +++ b/drivers/nfc/st95hf/core.c @@ -662,7 +662,8 @@ static int st95hf_error_handling(struct st95hf_context *stcontext, result = -ETIMEDOUT; else result = -EIO; - return result; + + return result; } /* Check for CRC err only if CRC is present in the tag response */ @@ -859,6 +860,7 @@ static irqreturn_t st95hf_irq_thread_handler(int irq, void *st95hfcontext) skb_resp); mutex_unlock(>rm_lock); + return IRQ_HANDLED; } -- 2.17.1
[PATCH v3 07/11] NFC: st95hf: ignore spurious interrupts
When an interrupt occurs before st95hf_in_send_cmd() was called, the ISR will currently dereference a NULL pointer. Fix this by checking whether `cb_arg->complete_cb' is set, and bail out early if that's not the case. Again spurious interrupts are likely to occur with EMI noise through the antenna, and need to be handled gracefully. Signed-off-by: Daniel Mack --- drivers/nfc/st95hf/core.c | 19 +-- 1 file changed, 17 insertions(+), 2 deletions(-) diff --git a/drivers/nfc/st95hf/core.c b/drivers/nfc/st95hf/core.c index 99f84ddfdfef..7fdad67b1a4d 100644 --- a/drivers/nfc/st95hf/core.c +++ b/drivers/nfc/st95hf/core.c @@ -796,6 +796,13 @@ static irqreturn_t st95hf_irq_thread_handler(int irq, void *st95hfcontext) goto end; } + /* +* If the completion callback is not set, no command is currently +* active. Ignore the spurious interrupt. +*/ + if (unlikely(!cb_arg->complete_cb)) + goto end; + /* if stcontext->ddev is %NULL, it means remove already ran */ if (!stcontext->ddev) { result = -ENODEV; @@ -844,8 +851,16 @@ static irqreturn_t st95hf_irq_thread_handler(int irq, void *st95hfcontext) wtx = false; cb_arg->rats = false; skb_resp = ERR_PTR(result); - /* call of callback with error */ - cb_arg->complete_cb(stcontext->ddev, cb_arg->cb_usrarg, skb_resp); + + /* +* Report an error to the core. If cb_arg->complete_cb is unset, +* we're handling a spurious interrupt that can be ignored. +*/ + if (cb_arg->complete_cb) + cb_arg->complete_cb(stcontext->ddev, + cb_arg->cb_usrarg, + skb_resp); + mutex_unlock(>rm_lock); return IRQ_HANDLED; } -- 2.17.1
[PATCH v3 08/11] NFC: st95hf: re-order command defines
Just a small cleanup to bring the command defines in a numerical order. Signed-off-by: Daniel Mack --- drivers/nfc/st95hf/core.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/drivers/nfc/st95hf/core.c b/drivers/nfc/st95hf/core.c index 7fdad67b1a4d..53447394666b 100644 --- a/drivers/nfc/st95hf/core.c +++ b/drivers/nfc/st95hf/core.c @@ -45,10 +45,10 @@ /* Command Send Interface */ /* ST95HF_COMMAND_SEND CMD Ids */ -#define ECHO_CMD 0x55 -#define WRITE_REGISTER_CMD 0x9 #define PROTOCOL_SELECT_CMD0x2 #define SEND_RECEIVE_CMD 0x4 +#define WRITE_REGISTER_CMD 0x9 +#define ECHO_CMD 0x55 /* Select protocol codes */ #define ISO15693_PROTOCOL_CODE 0x1 -- 2.17.1
[PATCH v3 09/11] NFC: st95hf: unify sync/async flags
Keep the information whether a command is asynchronous in a boolean flag like it is done elsewhere in the code. This way, the enum can go away. No function change intended. Signed-off-by: Daniel Mack --- drivers/nfc/st95hf/core.c | 38 -- drivers/nfc/st95hf/spi.c | 12 +--- drivers/nfc/st95hf/spi.h | 8 +--- 3 files changed, 22 insertions(+), 36 deletions(-) diff --git a/drivers/nfc/st95hf/core.c b/drivers/nfc/st95hf/core.c index 53447394666b..10321dba15dc 100644 --- a/drivers/nfc/st95hf/core.c +++ b/drivers/nfc/st95hf/core.c @@ -101,7 +101,7 @@ struct cmd { unsigned char cmd_id; unsigned char no_cmd_params; unsigned char cmd_params[MAX_CMD_PARAMS]; - enum req_type req; + bool is_sync; }; struct param_list { @@ -134,63 +134,62 @@ static const struct cmd cmd_array[] = { .cmd_len = 0x2, .cmd_id = ECHO_CMD, .no_cmd_params = 0, - .req = SYNC, + .is_sync = true, }, [CMD_ISO14443A_CONFIG] = { .cmd_len = 0x7, .cmd_id = WRITE_REGISTER_CMD, .no_cmd_params = 0x4, .cmd_params = {0x3A, 0x00, 0x5A, 0x04}, - .req = SYNC, + .is_sync = true, }, [CMD_ISO14443A_DEMOGAIN] = { .cmd_len = 0x7, .cmd_id = WRITE_REGISTER_CMD, .no_cmd_params = 0x4, .cmd_params = {0x68, 0x01, 0x01, 0xDF}, - .req = SYNC, + .is_sync = true, }, [CMD_ISO14443B_DEMOGAIN] = { .cmd_len = 0x7, .cmd_id = WRITE_REGISTER_CMD, .no_cmd_params = 0x4, .cmd_params = {0x68, 0x01, 0x01, 0x51}, - .req = SYNC, + .is_sync = true, }, [CMD_ISO14443A_PROTOCOL_SELECT] = { .cmd_len = 0x7, .cmd_id = PROTOCOL_SELECT_CMD, .no_cmd_params = 0x4, .cmd_params = {ISO14443A_PROTOCOL_CODE, 0x00, 0x01, 0xA0}, - .req = SYNC, + .is_sync = true, }, [CMD_ISO14443B_PROTOCOL_SELECT] = { .cmd_len = 0x7, .cmd_id = PROTOCOL_SELECT_CMD, .no_cmd_params = 0x4, .cmd_params = {ISO14443B_PROTOCOL_CODE, 0x01, 0x03, 0xFF}, - .req = SYNC, + .is_sync = true, }, [CMD_WTX_RESPONSE] = { .cmd_len = 0x6, .cmd_id = SEND_RECEIVE_CMD, .no_cmd_params = 0x3, .cmd_params = {0xF2, 0x00, TRFLAG_NFCA_STD_FRAME_CRC}, - .req = ASYNC, }, [CMD_FIELD_OFF] = { .cmd_len = 0x5, .cmd_id = PROTOCOL_SELECT_CMD, .no_cmd_params = 0x2, .cmd_params = {0x0, 0x0}, - .req = SYNC, + .is_sync = true, }, [CMD_ISO15693_PROTOCOL_SELECT] = { .cmd_len = 0x5, .cmd_id = PROTOCOL_SELECT_CMD, .no_cmd_params = 0x2, .cmd_params = {ISO15693_PROTOCOL_CODE, 0x0D}, - .req = SYNC, + .is_sync = true, }, }; @@ -279,13 +278,13 @@ static int st95hf_send_recv_cmd(struct st95hf_context *st95context, ret = st95hf_spi_send(>spicontext, spi_cmd_buffer, cmd_array[cmd].cmd_len, - cmd_array[cmd].req); + cmd_array[cmd].is_sync); if (ret) { dev_err(dev, "st95hf_spi_send failed with error %d\n", ret); return ret; } - if (cmd_array[cmd].req == SYNC && recv_res) { + if (cmd_array[cmd].is_sync && recv_res) { unsigned char st95hf_response_arr[2]; ret = st95hf_spi_recv_response(>spicontext, @@ -480,10 +479,8 @@ static int st95hf_send_spi_reset_sequence(struct st95hf_context *st95context) int result = 0; unsigned char reset_cmd = ST95HF_COMMAND_RESET; - result = st95hf_spi_send(>spicontext, -_cmd, -ST95HF_RESET_CMD_LEN, -ASYNC); + result = st95hf_spi_send(>spicontext, _cmd, +ST95HF_RESET_CMD_LEN, false); if (result) { dev_err(>spicontext.spidev->dev, "spi reset sequence cmd error = %d", result); @@ -947,8 +944,7 @@ static int st95hf_in_send_cmd(struct nfc_digital_dev *ddev, stcontext->complete_cb_arg.rats = true; rc = st95hf_spi_send(>spicontext, skb->data, -skb->len, -ASY
[PATCH v3 06/11] NFC: st95hf: move skb allocation to ISR
The driver currently assumes that interrupts can only occur between the time when when a command has been sent to the device and the response to it. This assumption, however, is wrong. The antenna of the chip is likely to catch a lot of environmental noise, and once in a while, the device will latch the interrupt to signal a protocol error, and keep it latched until the response bytes are read from the chip. As the code currently stands, skbs for responses are only prepared when a command is sent, and the ISR bails out early if that wasn't the case. Hence, the interrupt remains latched, and no further communication with device is possible. To fix this, move the call to nfc_alloc_recv_skb() from st95hf_in_send_cmd() to st95hf_irq_thread_handler() and always read back the interrupt reason. Signed-off-by: Daniel Mack --- drivers/nfc/st95hf/core.c | 36 ++-- 1 file changed, 6 insertions(+), 30 deletions(-) diff --git a/drivers/nfc/st95hf/core.c b/drivers/nfc/st95hf/core.c index 6761ab90f68d..99f84ddfdfef 100644 --- a/drivers/nfc/st95hf/core.c +++ b/drivers/nfc/st95hf/core.c @@ -196,7 +196,6 @@ static const struct cmd cmd_array[] = { /* st95_digital_cmd_complete_arg stores client context */ struct st95_digital_cmd_complete_arg { - struct sk_buff *skb_resp; nfc_digital_cmd_complete_t complete_cb; void *cb_usrarg; bool rats; @@ -783,12 +782,10 @@ static irqreturn_t st95hf_irq_thread_handler(int irq, void *st95hfcontext) spidevice = >spicontext.spidev->dev; cb_arg = >complete_cb_arg; - skb_resp = cb_arg->skb_resp; - if (unlikely(!skb_resp)) { - WARN(1, "unknown context in ST95HF ISR"); + skb_resp = nfc_alloc_recv_skb(MAX_RESPONSE_BUFFER_SIZE, GFP_KERNEL); + if (WARN_ON(!skb_resp)) return IRQ_NONE; - } mutex_lock(>rm_lock); res_len = st95hf_spi_recv_response(>spicontext, @@ -838,12 +835,6 @@ static irqreturn_t st95hf_irq_thread_handler(int irq, void *st95hfcontext) /* call digital layer callback */ cb_arg->complete_cb(stcontext->ddev, cb_arg->cb_usrarg, skb_resp); - /* -* This skb is now owned by the core layer. -* Make sure not to use it again. -*/ - cb_arg->skb_resp = NULL; - /* up the semaphore before returning */ mutex_unlock(>rm_lock); @@ -913,15 +904,7 @@ static int st95hf_in_send_cmd(struct nfc_digital_dev *ddev, void *arg) { struct st95hf_context *stcontext = nfc_digital_get_drvdata(ddev); - int rc; - struct sk_buff *skb_resp; - int len_data_to_tag = 0; - - skb_resp = nfc_alloc_recv_skb(MAX_RESPONSE_BUFFER_SIZE, GFP_KERNEL); - if (!skb_resp) { - rc = -ENOMEM; - goto error; - } + int rc, len_data_to_tag = 0; switch (stcontext->current_rf_tech) { case NFC_DIGITAL_RF_TECH_106A: @@ -933,8 +916,7 @@ static int st95hf_in_send_cmd(struct nfc_digital_dev *ddev, len_data_to_tag = skb->len; break; default: - rc = -EINVAL; - goto free_skb_resp; + return -EINVAL; } skb_push(skb, 3); @@ -942,7 +924,6 @@ static int st95hf_in_send_cmd(struct nfc_digital_dev *ddev, skb->data[1] = SEND_RECEIVE_CMD; skb->data[2] = len_data_to_tag; - stcontext->complete_cb_arg.skb_resp = skb_resp; stcontext->complete_cb_arg.cb_usrarg = arg; stcontext->complete_cb_arg.complete_cb = cb; @@ -956,17 +937,12 @@ static int st95hf_in_send_cmd(struct nfc_digital_dev *ddev, if (rc) { dev_err(>nfcdev->dev, "Error %d trying to perform data_exchange", rc); - goto free_skb_resp; + return rc; } kfree_skb(skb); - return rc; - -free_skb_resp: - kfree_skb(skb_resp); -error: - return rc; + return 0; } /* p2p will be supported in a later release ! */ -- 2.17.1
[PATCH v3 05/11] NFC: st95hf: remove exchange_lock
This patch removes the exchange_lock sempahore. Its intended function was two-fold: a) Lock the remove() callback of the driver against the ISR, so that the resources only go away after the ISR has finished. This is unnecessary though, because `rm_lock' does that already, in combination with the nullification of `scontext->ddev'. b) Indicate whether a command was sent previously. If the semaphore is found unused in the threaded ISR, an error is reported. This case can be handled much nicer by checking whether `skb_resp' is present in the context. For this, nullify the `skb_resp' pointer in the callback context after it was sent back to the NFC core. Signed-off-by: Daniel Mack --- drivers/nfc/st95hf/core.c | 52 +++ 1 file changed, 9 insertions(+), 43 deletions(-) diff --git a/drivers/nfc/st95hf/core.c b/drivers/nfc/st95hf/core.c index d857197ec7b2..6761ab90f68d 100644 --- a/drivers/nfc/st95hf/core.c +++ b/drivers/nfc/st95hf/core.c @@ -214,8 +214,6 @@ struct st95_digital_cmd_complete_arg { * @st95hf_supply: regulator "consumer" for NFC device. * @sendrcv_trflag: last byte of frame send by sendrecv command * of st95hf. This byte contains transmission flag info. - * @exchange_lock: semaphore used for signaling the st95hf_remove - * function that the last outstanding async nfc request is finished. * @rm_lock: mutex for ensuring safe access of nfc digital object * from threaded ISR. Usage of this mutex avoids any race between * deletion of the object from st95hf_remove() and its access from @@ -233,7 +231,6 @@ struct st95hf_context { struct st95_digital_cmd_complete_arg complete_cb_arg; struct regulator *st95hf_supply; unsigned char sendrcv_trflag; - struct semaphore exchange_lock; struct mutex rm_lock; u8 current_protocol; u8 current_rf_tech; @@ -785,29 +782,14 @@ static irqreturn_t st95hf_irq_thread_handler(int irq, void *st95hfcontext) struct st95_digital_cmd_complete_arg *cb_arg; spidevice = >spicontext.spidev->dev; + cb_arg = >complete_cb_arg; + skb_resp = cb_arg->skb_resp; - /* -* check semaphore, if not down() already, then we don't -* know in which context the ISR is called and surely it -* will be a bug. Note that down() of the semaphore is done -* in the corresponding st95hf_in_send_cmd() and then -* only this ISR should be called. ISR will up() the -* semaphore before leaving. Hence when the ISR is called -* the correct behaviour is down_trylock() should always -* return 1 (indicating semaphore cant be taken and hence no -* change in semaphore count). -* If not, then we up() the semaphore and crash on -* a BUG() ! -*/ - if (!down_trylock(>exchange_lock)) { - up(>exchange_lock); + if (unlikely(!skb_resp)) { WARN(1, "unknown context in ST95HF ISR"); return IRQ_NONE; } - cb_arg = >complete_cb_arg; - skb_resp = cb_arg->skb_resp; - mutex_lock(>rm_lock); res_len = st95hf_spi_recv_response(>spicontext, skb_resp->data); @@ -856,8 +838,13 @@ static irqreturn_t st95hf_irq_thread_handler(int irq, void *st95hfcontext) /* call digital layer callback */ cb_arg->complete_cb(stcontext->ddev, cb_arg->cb_usrarg, skb_resp); + /* +* This skb is now owned by the core layer. +* Make sure not to use it again. +*/ + cb_arg->skb_resp = NULL; + /* up the semaphore before returning */ - up(>exchange_lock); mutex_unlock(>rm_lock); return IRQ_HANDLED; @@ -868,8 +855,6 @@ static irqreturn_t st95hf_irq_thread_handler(int irq, void *st95hfcontext) skb_resp = ERR_PTR(result); /* call of callback with error */ cb_arg->complete_cb(stcontext->ddev, cb_arg->cb_usrarg, skb_resp); - /* up the semaphore before returning */ - up(>exchange_lock); mutex_unlock(>rm_lock); return IRQ_HANDLED; } @@ -965,25 +950,12 @@ static int st95hf_in_send_cmd(struct nfc_digital_dev *ddev, ddev->curr_protocol == NFC_PROTO_ISO14443) stcontext->complete_cb_arg.rats = true; - /* -* down the semaphore to indicate to remove func that an -* ISR is pending, note that it will not block here in any case. -* If found blocked, it is a BUG! -*/ - rc = down_killable(>exchange_lock); - if (rc) { - WARN(1, "Semaphore is not found up in st95hf_in_send_cmd\n"); - return rc; - } - rc = st95hf_spi_send(>spicontext, skb->data, skb->len,
[PATCH v3 04/11] NFC: st95hf: remove logging from spi functions
The callers of these functions already log on errors, and there's no need to do it from two places. Signed-off-by: Daniel Mack --- drivers/nfc/st95hf/spi.c | 19 --- 1 file changed, 4 insertions(+), 15 deletions(-) diff --git a/drivers/nfc/st95hf/spi.c b/drivers/nfc/st95hf/spi.c index e2d3bbcc8c34..d5894d4546b1 100644 --- a/drivers/nfc/st95hf/spi.c +++ b/drivers/nfc/st95hf/spi.c @@ -47,8 +47,6 @@ int st95hf_spi_send(struct st95hf_spi_context *spicontext, result = spi_sync(spidev, ); if (result) { - dev_err(>dev, "error: sending cmd to st95hf using SPI = %d\n", - result); mutex_unlock(>spi_lock); return result; } @@ -62,12 +60,10 @@ int st95hf_spi_send(struct st95hf_spi_context *spicontext, result = wait_for_completion_timeout(>done, msecs_to_jiffies(1000)); /* check for timeout or success */ - if (!result) { - dev_err(>dev, "error: response not ready timeout\n"); + if (!result) result = -ETIMEDOUT; - } else { + else result = 0; - } mutex_unlock(>spi_lock); @@ -79,7 +75,7 @@ EXPORT_SYMBOL_GPL(st95hf_spi_send); int st95hf_spi_recv_response(struct st95hf_spi_context *spicontext, unsigned char *receivebuff) { - int len = 0; + int ret, len; struct spi_transfer tx_takedata; struct spi_message m; struct spi_device *spidev = spicontext->spidev; @@ -89,8 +85,6 @@ int st95hf_spi_recv_response(struct st95hf_spi_context *spicontext, {.rx_buf = receivebuff, .len = 2, .cs_change = 1,}, }; - int ret = 0; - memset(_takedata, 0x0, sizeof(struct spi_transfer)); mutex_lock(>spi_lock); @@ -102,8 +96,6 @@ int st95hf_spi_recv_response(struct st95hf_spi_context *spicontext, ret = spi_sync(spidev, ); if (ret) { - dev_err(>dev, "spi_recv_resp, data length error = %d\n", - ret); mutex_unlock(>spi_lock); return ret; } @@ -127,11 +119,8 @@ int st95hf_spi_recv_response(struct st95hf_spi_context *spicontext, ret = spi_sync(spidev, ); mutex_unlock(>spi_lock); - if (ret) { - dev_err(>dev, "spi_recv_resp, data read error = %d\n", - ret); + if (ret) return ret; - } return len; } -- 2.17.1
[PATCH v3 02/11] NFC: st95hf: drop nfcdev_free
This flag is unneccesary. We can just nullify `ddev' instead after we freed it and check for that in the ISR. Signed-off-by: Daniel Mack --- drivers/nfc/st95hf/core.c | 9 +++-- 1 file changed, 3 insertions(+), 6 deletions(-) diff --git a/drivers/nfc/st95hf/core.c b/drivers/nfc/st95hf/core.c index bc1a2070f9bb..d58424ab5c48 100644 --- a/drivers/nfc/st95hf/core.c +++ b/drivers/nfc/st95hf/core.c @@ -220,8 +220,6 @@ struct st95_digital_cmd_complete_arg { * from threaded ISR. Usage of this mutex avoids any race between * deletion of the object from st95hf_remove() and its access from * the threaded ISR. - * @nfcdev_free: flag to have the state of nfc device object. - * [alive | died] * @current_protocol: current nfc protocol. * @current_rf_tech: current rf technology. * @fwi: frame waiting index, received in reply of RATS according to @@ -237,7 +235,6 @@ struct st95hf_context { unsigned char sendrcv_trflag; struct semaphore exchange_lock; struct mutex rm_lock; - bool nfcdev_free; u8 current_protocol; u8 current_rf_tech; int fwi; @@ -820,8 +817,8 @@ static irqreturn_t st95hf_irq_thread_handler(int irq, void *st95hfcontext) goto end; } - /* if stcontext->nfcdev_free is true, it means remove already ran */ - if (stcontext->nfcdev_free) { + /* if stcontext->ddev is %NULL, it means remove already ran */ + if (!stcontext->ddev) { result = -ENODEV; goto end; } @@ -1220,7 +1217,7 @@ static int st95hf_remove(struct spi_device *nfc_spi_dev) nfc_digital_unregister_device(stcontext->ddev); nfc_digital_free_device(stcontext->ddev); - stcontext->nfcdev_free = true; + stcontext->ddev = NULL; mutex_unlock(>rm_lock); -- 2.17.1
[PATCH v3 03/11] NFC: st95hf: drop illegal kfree_skb() in IRQ handler
In the error path of the IRQ handler, don't free the skb in flight. The callback in the digital core will do that for us. Doing it from both places causes a memory corruption. Signed-off-by: Daniel Mack --- drivers/nfc/st95hf/core.c | 1 - 1 file changed, 1 deletion(-) diff --git a/drivers/nfc/st95hf/core.c b/drivers/nfc/st95hf/core.c index d58424ab5c48..d857197ec7b2 100644 --- a/drivers/nfc/st95hf/core.c +++ b/drivers/nfc/st95hf/core.c @@ -863,7 +863,6 @@ static irqreturn_t st95hf_irq_thread_handler(int irq, void *st95hfcontext) return IRQ_HANDLED; end: - kfree_skb(skb_resp); wtx = false; cb_arg->rats = false; skb_resp = ERR_PTR(result); -- 2.17.1
[PATCH v3 01/11] Revert "NFC: st95hf: drop illegal kfree_skb()"
This reverts commit c99f996b2ba49 ("NFC: st95hf: drop illegal kfree_skb()"). It turns out that the st95hf_in_send_cmd() is in fact the sole owner of this skb, and by not freeing it here, we not only causing a memory leak but also mess up the refcount of the socket that holds it. This will in turn lead to activated targets not being cleaned up, even after stopping userspace processes. The memory corruption that I was hunting was caused by another kfree_skb(). This will be fixed in a later commit. Signed-off-by: Daniel Mack Fixes: c99f996b2ba49 ("NFC: st95hf: drop illegal kfree_skb()") --- drivers/nfc/st95hf/core.c | 2 ++ 1 file changed, 2 insertions(+) diff --git a/drivers/nfc/st95hf/core.c b/drivers/nfc/st95hf/core.c index 36ef0e905ba3..bc1a2070f9bb 100644 --- a/drivers/nfc/st95hf/core.c +++ b/drivers/nfc/st95hf/core.c @@ -991,6 +991,8 @@ static int st95hf_in_send_cmd(struct nfc_digital_dev *ddev, goto free_skb_resp; } + kfree_skb(skb); + return rc; free_skb_resp: -- 2.17.1
[PATCH v3 00/11] NFC: A bunch of cleanups for st95hf
This v3 of a series of patches for the ST95HF driver. Patch #1 reverts a change that I have submitted earlier and which is sitting in nfc-next already. Given that the tree hasn't been sent out for being merged yet, it could also still be removed with rebasing, in which case #1 is not necessary of course. The changes are all rather simple and are explained in their individual commit logs. Note that this series builds upon the patch titled "nfc: st95hf: remove redundant pointers 'dev' and 'nfcddev'" that Colin posted the other day. Thanks, Daniel Changelog: v1 → v2: * Improved commit logs, identical patch content. v2 → v3: * Added another patch titled "NFC: st95hf: ignore spurious interrupts" Daniel Mack (11): Revert "NFC: st95hf: drop illegal kfree_skb()" NFC: st95hf: drop nfcdev_free NFC: st95hf: drop illegal kfree_skb() in IRQ handler NFC: st95hf: remove logging from spi functions NFC: st95hf: remove exchange_lock NFC: st95hf: move skb allocation to ISR NFC: st95hf: ignore spurious interrupts NFC: st95hf: re-order command defines NFC: st95hf: unify sync/async flags NFC: st95hf: two small style nits NFC: st95hf: add of match table drivers/nfc/st95hf/core.c | 154 ++ drivers/nfc/st95hf/spi.c | 31 +++- drivers/nfc/st95hf/spi.h | 8 +- 3 files changed, 66 insertions(+), 127 deletions(-) -- 2.17.1
[PATCH v2 06/10] NFC: st95hf: move skb allocation to ISR
The driver currently assumes that interrupts can only occur between the time when when a command has been sent to the device and the response to it. This assumption, however, is wrong. The antenna of the chip is likely to catch a lot of environmental noise, and once in a while, the device will latch the interrupt to signal a protocol error, and keep it latched until the response bytes are read from the chip. As the code currently stands, skbs for responses are only prepared when a command is sent, and the ISR bails out early if that wasn't the case. Hence, the interrupt remains latched, and no further communication with device is possible. To fix this, move the call to nfc_alloc_recv_skb() from st95hf_in_send_cmd() to st95hf_irq_thread_handler() and always read back the interrupt reason. Signed-off-by: Daniel Mack --- drivers/nfc/st95hf/core.c | 36 ++-- 1 file changed, 6 insertions(+), 30 deletions(-) diff --git a/drivers/nfc/st95hf/core.c b/drivers/nfc/st95hf/core.c index 6761ab90f68d..99f84ddfdfef 100644 --- a/drivers/nfc/st95hf/core.c +++ b/drivers/nfc/st95hf/core.c @@ -196,7 +196,6 @@ static const struct cmd cmd_array[] = { /* st95_digital_cmd_complete_arg stores client context */ struct st95_digital_cmd_complete_arg { - struct sk_buff *skb_resp; nfc_digital_cmd_complete_t complete_cb; void *cb_usrarg; bool rats; @@ -783,12 +782,10 @@ static irqreturn_t st95hf_irq_thread_handler(int irq, void *st95hfcontext) spidevice = >spicontext.spidev->dev; cb_arg = >complete_cb_arg; - skb_resp = cb_arg->skb_resp; - if (unlikely(!skb_resp)) { - WARN(1, "unknown context in ST95HF ISR"); + skb_resp = nfc_alloc_recv_skb(MAX_RESPONSE_BUFFER_SIZE, GFP_KERNEL); + if (WARN_ON(!skb_resp)) return IRQ_NONE; - } mutex_lock(>rm_lock); res_len = st95hf_spi_recv_response(>spicontext, @@ -838,12 +835,6 @@ static irqreturn_t st95hf_irq_thread_handler(int irq, void *st95hfcontext) /* call digital layer callback */ cb_arg->complete_cb(stcontext->ddev, cb_arg->cb_usrarg, skb_resp); - /* -* This skb is now owned by the core layer. -* Make sure not to use it again. -*/ - cb_arg->skb_resp = NULL; - /* up the semaphore before returning */ mutex_unlock(>rm_lock); @@ -913,15 +904,7 @@ static int st95hf_in_send_cmd(struct nfc_digital_dev *ddev, void *arg) { struct st95hf_context *stcontext = nfc_digital_get_drvdata(ddev); - int rc; - struct sk_buff *skb_resp; - int len_data_to_tag = 0; - - skb_resp = nfc_alloc_recv_skb(MAX_RESPONSE_BUFFER_SIZE, GFP_KERNEL); - if (!skb_resp) { - rc = -ENOMEM; - goto error; - } + int rc, len_data_to_tag = 0; switch (stcontext->current_rf_tech) { case NFC_DIGITAL_RF_TECH_106A: @@ -933,8 +916,7 @@ static int st95hf_in_send_cmd(struct nfc_digital_dev *ddev, len_data_to_tag = skb->len; break; default: - rc = -EINVAL; - goto free_skb_resp; + return -EINVAL; } skb_push(skb, 3); @@ -942,7 +924,6 @@ static int st95hf_in_send_cmd(struct nfc_digital_dev *ddev, skb->data[1] = SEND_RECEIVE_CMD; skb->data[2] = len_data_to_tag; - stcontext->complete_cb_arg.skb_resp = skb_resp; stcontext->complete_cb_arg.cb_usrarg = arg; stcontext->complete_cb_arg.complete_cb = cb; @@ -956,17 +937,12 @@ static int st95hf_in_send_cmd(struct nfc_digital_dev *ddev, if (rc) { dev_err(>nfcdev->dev, "Error %d trying to perform data_exchange", rc); - goto free_skb_resp; + return rc; } kfree_skb(skb); - return rc; - -free_skb_resp: - kfree_skb(skb_resp); -error: - return rc; + return 0; } /* p2p will be supported in a later release ! */ -- 2.17.1
[PATCH v2 08/10] NFC: st95hf: unify sync/async flags
Keep the information whether a command is asynchronous in a boolean flag like it is done elsewhere in the code. This way, the enum can go away. No function change intended. Signed-off-by: Daniel Mack --- drivers/nfc/st95hf/core.c | 38 -- drivers/nfc/st95hf/spi.c | 12 +--- drivers/nfc/st95hf/spi.h | 8 +--- 3 files changed, 22 insertions(+), 36 deletions(-) diff --git a/drivers/nfc/st95hf/core.c b/drivers/nfc/st95hf/core.c index e7ecc57dde8f..835a1e61c817 100644 --- a/drivers/nfc/st95hf/core.c +++ b/drivers/nfc/st95hf/core.c @@ -101,7 +101,7 @@ struct cmd { unsigned char cmd_id; unsigned char no_cmd_params; unsigned char cmd_params[MAX_CMD_PARAMS]; - enum req_type req; + bool is_sync; }; struct param_list { @@ -134,63 +134,62 @@ static const struct cmd cmd_array[] = { .cmd_len = 0x2, .cmd_id = ECHO_CMD, .no_cmd_params = 0, - .req = SYNC, + .is_sync = true, }, [CMD_ISO14443A_CONFIG] = { .cmd_len = 0x7, .cmd_id = WRITE_REGISTER_CMD, .no_cmd_params = 0x4, .cmd_params = {0x3A, 0x00, 0x5A, 0x04}, - .req = SYNC, + .is_sync = true, }, [CMD_ISO14443A_DEMOGAIN] = { .cmd_len = 0x7, .cmd_id = WRITE_REGISTER_CMD, .no_cmd_params = 0x4, .cmd_params = {0x68, 0x01, 0x01, 0xDF}, - .req = SYNC, + .is_sync = true, }, [CMD_ISO14443B_DEMOGAIN] = { .cmd_len = 0x7, .cmd_id = WRITE_REGISTER_CMD, .no_cmd_params = 0x4, .cmd_params = {0x68, 0x01, 0x01, 0x51}, - .req = SYNC, + .is_sync = true, }, [CMD_ISO14443A_PROTOCOL_SELECT] = { .cmd_len = 0x7, .cmd_id = PROTOCOL_SELECT_CMD, .no_cmd_params = 0x4, .cmd_params = {ISO14443A_PROTOCOL_CODE, 0x00, 0x01, 0xA0}, - .req = SYNC, + .is_sync = true, }, [CMD_ISO14443B_PROTOCOL_SELECT] = { .cmd_len = 0x7, .cmd_id = PROTOCOL_SELECT_CMD, .no_cmd_params = 0x4, .cmd_params = {ISO14443B_PROTOCOL_CODE, 0x01, 0x03, 0xFF}, - .req = SYNC, + .is_sync = true, }, [CMD_WTX_RESPONSE] = { .cmd_len = 0x6, .cmd_id = SEND_RECEIVE_CMD, .no_cmd_params = 0x3, .cmd_params = {0xF2, 0x00, TRFLAG_NFCA_STD_FRAME_CRC}, - .req = ASYNC, }, [CMD_FIELD_OFF] = { .cmd_len = 0x5, .cmd_id = PROTOCOL_SELECT_CMD, .no_cmd_params = 0x2, .cmd_params = {0x0, 0x0}, - .req = SYNC, + .is_sync = true, }, [CMD_ISO15693_PROTOCOL_SELECT] = { .cmd_len = 0x5, .cmd_id = PROTOCOL_SELECT_CMD, .no_cmd_params = 0x2, .cmd_params = {ISO15693_PROTOCOL_CODE, 0x0D}, - .req = SYNC, + .is_sync = true, }, }; @@ -279,13 +278,13 @@ static int st95hf_send_recv_cmd(struct st95hf_context *st95context, ret = st95hf_spi_send(>spicontext, spi_cmd_buffer, cmd_array[cmd].cmd_len, - cmd_array[cmd].req); + cmd_array[cmd].is_sync); if (ret) { dev_err(dev, "st95hf_spi_send failed with error %d\n", ret); return ret; } - if (cmd_array[cmd].req == SYNC && recv_res) { + if (cmd_array[cmd].is_sync && recv_res) { unsigned char st95hf_response_arr[2]; ret = st95hf_spi_recv_response(>spicontext, @@ -480,10 +479,8 @@ static int st95hf_send_spi_reset_sequence(struct st95hf_context *st95context) int result = 0; unsigned char reset_cmd = ST95HF_COMMAND_RESET; - result = st95hf_spi_send(>spicontext, -_cmd, -ST95HF_RESET_CMD_LEN, -ASYNC); + result = st95hf_spi_send(>spicontext, _cmd, +ST95HF_RESET_CMD_LEN, false); if (result) { dev_err(>spicontext.spidev->dev, "spi reset sequence cmd error = %d", result); @@ -932,8 +929,7 @@ static int st95hf_in_send_cmd(struct nfc_digital_dev *ddev, stcontext->complete_cb_arg.rats = true; rc = st95hf_spi_send(>spicontext, skb->data, -skb->len, -ASY
[PATCH v2 05/10] NFC: st95hf: remove exchange_lock
This patch removes the exchange_lock sempahore. Its intended function was two-fold: a) Lock the remove() callback of the driver against the ISR, so that the resources only go away after the ISR has finished. This is unnecessary though, because `rm_lock' does that already, in combination with the nullification of `scontext->ddev'. b) Indicate whether a command was sent previously. If the semaphore is found unused in the threaded ISR, an error is reported. This case can be handled much nicer by checking whether `skb_resp' is present in the context. For this, nullify the `skb_resp' pointer in the callback context after it was sent back to the NFC core. Signed-off-by: Daniel Mack --- drivers/nfc/st95hf/core.c | 52 +++ 1 file changed, 9 insertions(+), 43 deletions(-) diff --git a/drivers/nfc/st95hf/core.c b/drivers/nfc/st95hf/core.c index d857197ec7b2..6761ab90f68d 100644 --- a/drivers/nfc/st95hf/core.c +++ b/drivers/nfc/st95hf/core.c @@ -214,8 +214,6 @@ struct st95_digital_cmd_complete_arg { * @st95hf_supply: regulator "consumer" for NFC device. * @sendrcv_trflag: last byte of frame send by sendrecv command * of st95hf. This byte contains transmission flag info. - * @exchange_lock: semaphore used for signaling the st95hf_remove - * function that the last outstanding async nfc request is finished. * @rm_lock: mutex for ensuring safe access of nfc digital object * from threaded ISR. Usage of this mutex avoids any race between * deletion of the object from st95hf_remove() and its access from @@ -233,7 +231,6 @@ struct st95hf_context { struct st95_digital_cmd_complete_arg complete_cb_arg; struct regulator *st95hf_supply; unsigned char sendrcv_trflag; - struct semaphore exchange_lock; struct mutex rm_lock; u8 current_protocol; u8 current_rf_tech; @@ -785,29 +782,14 @@ static irqreturn_t st95hf_irq_thread_handler(int irq, void *st95hfcontext) struct st95_digital_cmd_complete_arg *cb_arg; spidevice = >spicontext.spidev->dev; + cb_arg = >complete_cb_arg; + skb_resp = cb_arg->skb_resp; - /* -* check semaphore, if not down() already, then we don't -* know in which context the ISR is called and surely it -* will be a bug. Note that down() of the semaphore is done -* in the corresponding st95hf_in_send_cmd() and then -* only this ISR should be called. ISR will up() the -* semaphore before leaving. Hence when the ISR is called -* the correct behaviour is down_trylock() should always -* return 1 (indicating semaphore cant be taken and hence no -* change in semaphore count). -* If not, then we up() the semaphore and crash on -* a BUG() ! -*/ - if (!down_trylock(>exchange_lock)) { - up(>exchange_lock); + if (unlikely(!skb_resp)) { WARN(1, "unknown context in ST95HF ISR"); return IRQ_NONE; } - cb_arg = >complete_cb_arg; - skb_resp = cb_arg->skb_resp; - mutex_lock(>rm_lock); res_len = st95hf_spi_recv_response(>spicontext, skb_resp->data); @@ -856,8 +838,13 @@ static irqreturn_t st95hf_irq_thread_handler(int irq, void *st95hfcontext) /* call digital layer callback */ cb_arg->complete_cb(stcontext->ddev, cb_arg->cb_usrarg, skb_resp); + /* +* This skb is now owned by the core layer. +* Make sure not to use it again. +*/ + cb_arg->skb_resp = NULL; + /* up the semaphore before returning */ - up(>exchange_lock); mutex_unlock(>rm_lock); return IRQ_HANDLED; @@ -868,8 +855,6 @@ static irqreturn_t st95hf_irq_thread_handler(int irq, void *st95hfcontext) skb_resp = ERR_PTR(result); /* call of callback with error */ cb_arg->complete_cb(stcontext->ddev, cb_arg->cb_usrarg, skb_resp); - /* up the semaphore before returning */ - up(>exchange_lock); mutex_unlock(>rm_lock); return IRQ_HANDLED; } @@ -965,25 +950,12 @@ static int st95hf_in_send_cmd(struct nfc_digital_dev *ddev, ddev->curr_protocol == NFC_PROTO_ISO14443) stcontext->complete_cb_arg.rats = true; - /* -* down the semaphore to indicate to remove func that an -* ISR is pending, note that it will not block here in any case. -* If found blocked, it is a BUG! -*/ - rc = down_killable(>exchange_lock); - if (rc) { - WARN(1, "Semaphore is not found up in st95hf_in_send_cmd\n"); - return rc; - } - rc = st95hf_spi_send(>spicontext, skb->data, skb->len,
[PATCH v2 03/10] NFC: st95hf: drop illegal kfree_skb() in IRQ handler
In the error path of the IRQ handler, don't free the skb in flight. The callback in the digital core will do that for us. Doing it from both places causes a memory corruption. Signed-off-by: Daniel Mack --- drivers/nfc/st95hf/core.c | 1 - 1 file changed, 1 deletion(-) diff --git a/drivers/nfc/st95hf/core.c b/drivers/nfc/st95hf/core.c index d58424ab5c48..d857197ec7b2 100644 --- a/drivers/nfc/st95hf/core.c +++ b/drivers/nfc/st95hf/core.c @@ -863,7 +863,6 @@ static irqreturn_t st95hf_irq_thread_handler(int irq, void *st95hfcontext) return IRQ_HANDLED; end: - kfree_skb(skb_resp); wtx = false; cb_arg->rats = false; skb_resp = ERR_PTR(result); -- 2.17.1
[PATCH v2 04/10] NFC: st95hf: remove logging from spi functions
The callers of these functions already log on errors, and there's no need to do it from two places. Signed-off-by: Daniel Mack --- drivers/nfc/st95hf/spi.c | 19 --- 1 file changed, 4 insertions(+), 15 deletions(-) diff --git a/drivers/nfc/st95hf/spi.c b/drivers/nfc/st95hf/spi.c index e2d3bbcc8c34..d5894d4546b1 100644 --- a/drivers/nfc/st95hf/spi.c +++ b/drivers/nfc/st95hf/spi.c @@ -47,8 +47,6 @@ int st95hf_spi_send(struct st95hf_spi_context *spicontext, result = spi_sync(spidev, ); if (result) { - dev_err(>dev, "error: sending cmd to st95hf using SPI = %d\n", - result); mutex_unlock(>spi_lock); return result; } @@ -62,12 +60,10 @@ int st95hf_spi_send(struct st95hf_spi_context *spicontext, result = wait_for_completion_timeout(>done, msecs_to_jiffies(1000)); /* check for timeout or success */ - if (!result) { - dev_err(>dev, "error: response not ready timeout\n"); + if (!result) result = -ETIMEDOUT; - } else { + else result = 0; - } mutex_unlock(>spi_lock); @@ -79,7 +75,7 @@ EXPORT_SYMBOL_GPL(st95hf_spi_send); int st95hf_spi_recv_response(struct st95hf_spi_context *spicontext, unsigned char *receivebuff) { - int len = 0; + int ret, len; struct spi_transfer tx_takedata; struct spi_message m; struct spi_device *spidev = spicontext->spidev; @@ -89,8 +85,6 @@ int st95hf_spi_recv_response(struct st95hf_spi_context *spicontext, {.rx_buf = receivebuff, .len = 2, .cs_change = 1,}, }; - int ret = 0; - memset(_takedata, 0x0, sizeof(struct spi_transfer)); mutex_lock(>spi_lock); @@ -102,8 +96,6 @@ int st95hf_spi_recv_response(struct st95hf_spi_context *spicontext, ret = spi_sync(spidev, ); if (ret) { - dev_err(>dev, "spi_recv_resp, data length error = %d\n", - ret); mutex_unlock(>spi_lock); return ret; } @@ -127,11 +119,8 @@ int st95hf_spi_recv_response(struct st95hf_spi_context *spicontext, ret = spi_sync(spidev, ); mutex_unlock(>spi_lock); - if (ret) { - dev_err(>dev, "spi_recv_resp, data read error = %d\n", - ret); + if (ret) return ret; - } return len; } -- 2.17.1
[PATCH v2 01/10] Revert "NFC: st95hf: drop illegal kfree_skb()"
This reverts commit c99f996b2ba49 ("NFC: st95hf: drop illegal kfree_skb()"). It turns out that the st95hf_in_send_cmd() is in fact the sole owner of this skb, and by not freeing it here, we not only causing a memory leak but also mess up the refcount of the socket that holds it. This will in turn lead to activated targets not being cleaned up, even after stopping userspace processes. The memory corruption that I was hunting was caused by another kfree_skb(). This will be fixed in a later commit. Signed-off-by: Daniel Mack Fixes: c99f996b2ba49 ("NFC: st95hf: drop illegal kfree_skb()") --- drivers/nfc/st95hf/core.c | 2 ++ 1 file changed, 2 insertions(+) diff --git a/drivers/nfc/st95hf/core.c b/drivers/nfc/st95hf/core.c index 36ef0e905ba3..bc1a2070f9bb 100644 --- a/drivers/nfc/st95hf/core.c +++ b/drivers/nfc/st95hf/core.c @@ -991,6 +991,8 @@ static int st95hf_in_send_cmd(struct nfc_digital_dev *ddev, goto free_skb_resp; } + kfree_skb(skb); + return rc; free_skb_resp: -- 2.17.1
[PATCH v2 09/10] NFC: st95hf: two small style nits
Address two minor style issues that I came across while working on the driver. Signed-off-by: Daniel Mack --- drivers/nfc/st95hf/core.c | 4 +++- 1 file changed, 3 insertions(+), 1 deletion(-) diff --git a/drivers/nfc/st95hf/core.c b/drivers/nfc/st95hf/core.c index 835a1e61c817..4db3c020921c 100644 --- a/drivers/nfc/st95hf/core.c +++ b/drivers/nfc/st95hf/core.c @@ -662,7 +662,8 @@ static int st95hf_error_handling(struct st95hf_context *stcontext, result = -ETIMEDOUT; else result = -EIO; - return result; + + return result; } /* Check for CRC err only if CRC is present in the tag response */ @@ -844,6 +845,7 @@ static irqreturn_t st95hf_irq_thread_handler(int irq, void *st95hfcontext) /* call of callback with error */ cb_arg->complete_cb(stcontext->ddev, cb_arg->cb_usrarg, skb_resp); mutex_unlock(>rm_lock); + return IRQ_HANDLED; } -- 2.17.1
[PATCH v2 07/10] NFC: st95hf: re-order command defines
Just a small cleanup to bring the command defines in a numerical order. Signed-off-by: Daniel Mack --- drivers/nfc/st95hf/core.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/drivers/nfc/st95hf/core.c b/drivers/nfc/st95hf/core.c index 99f84ddfdfef..e7ecc57dde8f 100644 --- a/drivers/nfc/st95hf/core.c +++ b/drivers/nfc/st95hf/core.c @@ -45,10 +45,10 @@ /* Command Send Interface */ /* ST95HF_COMMAND_SEND CMD Ids */ -#define ECHO_CMD 0x55 -#define WRITE_REGISTER_CMD 0x9 #define PROTOCOL_SELECT_CMD0x2 #define SEND_RECEIVE_CMD 0x4 +#define WRITE_REGISTER_CMD 0x9 +#define ECHO_CMD 0x55 /* Select protocol codes */ #define ISO15693_PROTOCOL_CODE 0x1 -- 2.17.1
[PATCH v2 02/10] NFC: st95hf: drop nfcdev_free
This flag is unneccesary. We can just nullify `ddev' instead after we freed it and check for that in the ISR. Signed-off-by: Daniel Mack --- drivers/nfc/st95hf/core.c | 9 +++-- 1 file changed, 3 insertions(+), 6 deletions(-) diff --git a/drivers/nfc/st95hf/core.c b/drivers/nfc/st95hf/core.c index bc1a2070f9bb..d58424ab5c48 100644 --- a/drivers/nfc/st95hf/core.c +++ b/drivers/nfc/st95hf/core.c @@ -220,8 +220,6 @@ struct st95_digital_cmd_complete_arg { * from threaded ISR. Usage of this mutex avoids any race between * deletion of the object from st95hf_remove() and its access from * the threaded ISR. - * @nfcdev_free: flag to have the state of nfc device object. - * [alive | died] * @current_protocol: current nfc protocol. * @current_rf_tech: current rf technology. * @fwi: frame waiting index, received in reply of RATS according to @@ -237,7 +235,6 @@ struct st95hf_context { unsigned char sendrcv_trflag; struct semaphore exchange_lock; struct mutex rm_lock; - bool nfcdev_free; u8 current_protocol; u8 current_rf_tech; int fwi; @@ -820,8 +817,8 @@ static irqreturn_t st95hf_irq_thread_handler(int irq, void *st95hfcontext) goto end; } - /* if stcontext->nfcdev_free is true, it means remove already ran */ - if (stcontext->nfcdev_free) { + /* if stcontext->ddev is %NULL, it means remove already ran */ + if (!stcontext->ddev) { result = -ENODEV; goto end; } @@ -1220,7 +1217,7 @@ static int st95hf_remove(struct spi_device *nfc_spi_dev) nfc_digital_unregister_device(stcontext->ddev); nfc_digital_free_device(stcontext->ddev); - stcontext->nfcdev_free = true; + stcontext->ddev = NULL; mutex_unlock(>rm_lock); -- 2.17.1
[PATCH v2 10/10] NFC: st95hf: add of match table
Add a match table for device tree compatible strings. Interestingly, a document describing the bindings already exists since a while, but users currently rely on the implicit matching in the drivers core. Let's be explicit and add a real match table. Signed-off-by: Daniel Mack Cc: devicet...@vger.kernel.org --- drivers/nfc/st95hf/core.c | 7 +++ 1 file changed, 7 insertions(+) diff --git a/drivers/nfc/st95hf/core.c b/drivers/nfc/st95hf/core.c index 4db3c020921c..5a01b454fbc4 100644 --- a/drivers/nfc/st95hf/core.c +++ b/drivers/nfc/st95hf/core.c @@ -1012,6 +1012,12 @@ static const struct spi_device_id st95hf_id[] = { }; MODULE_DEVICE_TABLE(spi, st95hf_id); +static const struct of_device_id st95hf_of_match[] = { + { .compatible = "st,st95hf", }, + {} +}; +MODULE_DEVICE_TABLE(of, st95hf_of_match); + static int st95hf_probe(struct spi_device *nfc_spi_dev) { int ret; @@ -1189,6 +1195,7 @@ static struct spi_driver st95hf_driver = { .driver = { .name = "st95hf", .owner = THIS_MODULE, + .of_match_table = st95hf_of_match, }, .id_table = st95hf_id, .probe = st95hf_probe, -- 2.17.1
[PATCH v2 00/10] NFC: A bunch of cleanups for st95hf
This is a series of patches for the ST95HF driver. Patch #1 reverts a change that I have submitted earlier and which is sitting in nfc-next already. Given that the tree hasn't been sent out for being merged yet, it could also still be removed with rebasing, in which case #1 is not necessary of course. The changes are all rather simple and are explained in their individual commit logs. Note that this series builds upon the patch titled "nfc: st95hf: remove redundant pointers 'dev' and 'nfcddev'" that Colin posted the other day. Thanks, Daniel Changelog: v1 → v2: * Improved commit logs, identical patch content. Daniel Mack (10): Revert "NFC: st95hf: drop illegal kfree_skb()" NFC: st95hf: drop nfcdev_free NFC: st95hf: drop illegal kfree_skb() in IRQ handler NFC: st95hf: remove logging from spi functions NFC: st95hf: remove exchange_lock NFC: st95hf: move skb allocation to ISR NFC: st95hf: re-order command defines NFC: st95hf: unify sync/async flags NFC: st95hf: two small style nits NFC: st95hf: add of match table drivers/nfc/st95hf/core.c | 135 +++--- drivers/nfc/st95hf/spi.c | 31 +++-- drivers/nfc/st95hf/spi.h | 8 +-- 3 files changed, 49 insertions(+), 125 deletions(-) -- 2.17.1
[PATCH 04/10] NFC: st95hf: remove logging from spi functions
The callers of these functions already log errors, so there's no need to do it from two places. Signed-off-by: Daniel Mack --- drivers/nfc/st95hf/spi.c | 19 --- 1 file changed, 4 insertions(+), 15 deletions(-) diff --git a/drivers/nfc/st95hf/spi.c b/drivers/nfc/st95hf/spi.c index e2d3bbcc8c34..d5894d4546b1 100644 --- a/drivers/nfc/st95hf/spi.c +++ b/drivers/nfc/st95hf/spi.c @@ -47,8 +47,6 @@ int st95hf_spi_send(struct st95hf_spi_context *spicontext, result = spi_sync(spidev, ); if (result) { - dev_err(>dev, "error: sending cmd to st95hf using SPI = %d\n", - result); mutex_unlock(>spi_lock); return result; } @@ -62,12 +60,10 @@ int st95hf_spi_send(struct st95hf_spi_context *spicontext, result = wait_for_completion_timeout(>done, msecs_to_jiffies(1000)); /* check for timeout or success */ - if (!result) { - dev_err(>dev, "error: response not ready timeout\n"); + if (!result) result = -ETIMEDOUT; - } else { + else result = 0; - } mutex_unlock(>spi_lock); @@ -79,7 +75,7 @@ EXPORT_SYMBOL_GPL(st95hf_spi_send); int st95hf_spi_recv_response(struct st95hf_spi_context *spicontext, unsigned char *receivebuff) { - int len = 0; + int ret, len; struct spi_transfer tx_takedata; struct spi_message m; struct spi_device *spidev = spicontext->spidev; @@ -89,8 +85,6 @@ int st95hf_spi_recv_response(struct st95hf_spi_context *spicontext, {.rx_buf = receivebuff, .len = 2, .cs_change = 1,}, }; - int ret = 0; - memset(_takedata, 0x0, sizeof(struct spi_transfer)); mutex_lock(>spi_lock); @@ -102,8 +96,6 @@ int st95hf_spi_recv_response(struct st95hf_spi_context *spicontext, ret = spi_sync(spidev, ); if (ret) { - dev_err(>dev, "spi_recv_resp, data length error = %d\n", - ret); mutex_unlock(>spi_lock); return ret; } @@ -127,11 +119,8 @@ int st95hf_spi_recv_response(struct st95hf_spi_context *spicontext, ret = spi_sync(spidev, ); mutex_unlock(>spi_lock); - if (ret) { - dev_err(>dev, "spi_recv_resp, data read error = %d\n", - ret); + if (ret) return ret; - } return len; } -- 2.17.1
[PATCH 03/10] NFC: st95hf: drop illegal kfree_skb() in IRQ handler
In the error path of the IRQ handler, don't free the skb in flight. The callback in the digital core will do that for us. Doing it from both places causes memory corruptions. Signed-off-by: Daniel Mack --- drivers/nfc/st95hf/core.c | 1 - 1 file changed, 1 deletion(-) diff --git a/drivers/nfc/st95hf/core.c b/drivers/nfc/st95hf/core.c index d58424ab5c48..d857197ec7b2 100644 --- a/drivers/nfc/st95hf/core.c +++ b/drivers/nfc/st95hf/core.c @@ -863,7 +863,6 @@ static irqreturn_t st95hf_irq_thread_handler(int irq, void *st95hfcontext) return IRQ_HANDLED; end: - kfree_skb(skb_resp); wtx = false; cb_arg->rats = false; skb_resp = ERR_PTR(result); -- 2.17.1
[PATCH 02/10] NFC: st95hf: drop nfcdev_free
This flag is unneccesary. We can just nullify `ddev' instead after we freed it. Signed-off-by: Daniel Mack --- drivers/nfc/st95hf/core.c | 9 +++-- 1 file changed, 3 insertions(+), 6 deletions(-) diff --git a/drivers/nfc/st95hf/core.c b/drivers/nfc/st95hf/core.c index bc1a2070f9bb..d58424ab5c48 100644 --- a/drivers/nfc/st95hf/core.c +++ b/drivers/nfc/st95hf/core.c @@ -220,8 +220,6 @@ struct st95_digital_cmd_complete_arg { * from threaded ISR. Usage of this mutex avoids any race between * deletion of the object from st95hf_remove() and its access from * the threaded ISR. - * @nfcdev_free: flag to have the state of nfc device object. - * [alive | died] * @current_protocol: current nfc protocol. * @current_rf_tech: current rf technology. * @fwi: frame waiting index, received in reply of RATS according to @@ -237,7 +235,6 @@ struct st95hf_context { unsigned char sendrcv_trflag; struct semaphore exchange_lock; struct mutex rm_lock; - bool nfcdev_free; u8 current_protocol; u8 current_rf_tech; int fwi; @@ -820,8 +817,8 @@ static irqreturn_t st95hf_irq_thread_handler(int irq, void *st95hfcontext) goto end; } - /* if stcontext->nfcdev_free is true, it means remove already ran */ - if (stcontext->nfcdev_free) { + /* if stcontext->ddev is %NULL, it means remove already ran */ + if (!stcontext->ddev) { result = -ENODEV; goto end; } @@ -1220,7 +1217,7 @@ static int st95hf_remove(struct spi_device *nfc_spi_dev) nfc_digital_unregister_device(stcontext->ddev); nfc_digital_free_device(stcontext->ddev); - stcontext->nfcdev_free = true; + stcontext->ddev = NULL; mutex_unlock(>rm_lock); -- 2.17.1
[PATCH 09/10] NFC: st95hf: two small style nits
Address two minor style issues that I came across while working on the driver. Signed-off-by: Daniel Mack --- drivers/nfc/st95hf/core.c | 4 +++- 1 file changed, 3 insertions(+), 1 deletion(-) diff --git a/drivers/nfc/st95hf/core.c b/drivers/nfc/st95hf/core.c index 835a1e61c817..4db3c020921c 100644 --- a/drivers/nfc/st95hf/core.c +++ b/drivers/nfc/st95hf/core.c @@ -662,7 +662,8 @@ static int st95hf_error_handling(struct st95hf_context *stcontext, result = -ETIMEDOUT; else result = -EIO; - return result; + + return result; } /* Check for CRC err only if CRC is present in the tag response */ @@ -844,6 +845,7 @@ static irqreturn_t st95hf_irq_thread_handler(int irq, void *st95hfcontext) /* call of callback with error */ cb_arg->complete_cb(stcontext->ddev, cb_arg->cb_usrarg, skb_resp); mutex_unlock(>rm_lock); + return IRQ_HANDLED; } -- 2.17.1
[PATCH 08/10] NFC: st95hf: unify sync/async flags
Keep the information whether a command is asynchronous in a boolean flag everywhere in the code. This way, the enum can go away. Signed-off-by: Daniel Mack --- drivers/nfc/st95hf/core.c | 38 -- drivers/nfc/st95hf/spi.c | 12 +--- drivers/nfc/st95hf/spi.h | 8 +--- 3 files changed, 22 insertions(+), 36 deletions(-) diff --git a/drivers/nfc/st95hf/core.c b/drivers/nfc/st95hf/core.c index e7ecc57dde8f..835a1e61c817 100644 --- a/drivers/nfc/st95hf/core.c +++ b/drivers/nfc/st95hf/core.c @@ -101,7 +101,7 @@ struct cmd { unsigned char cmd_id; unsigned char no_cmd_params; unsigned char cmd_params[MAX_CMD_PARAMS]; - enum req_type req; + bool is_sync; }; struct param_list { @@ -134,63 +134,62 @@ static const struct cmd cmd_array[] = { .cmd_len = 0x2, .cmd_id = ECHO_CMD, .no_cmd_params = 0, - .req = SYNC, + .is_sync = true, }, [CMD_ISO14443A_CONFIG] = { .cmd_len = 0x7, .cmd_id = WRITE_REGISTER_CMD, .no_cmd_params = 0x4, .cmd_params = {0x3A, 0x00, 0x5A, 0x04}, - .req = SYNC, + .is_sync = true, }, [CMD_ISO14443A_DEMOGAIN] = { .cmd_len = 0x7, .cmd_id = WRITE_REGISTER_CMD, .no_cmd_params = 0x4, .cmd_params = {0x68, 0x01, 0x01, 0xDF}, - .req = SYNC, + .is_sync = true, }, [CMD_ISO14443B_DEMOGAIN] = { .cmd_len = 0x7, .cmd_id = WRITE_REGISTER_CMD, .no_cmd_params = 0x4, .cmd_params = {0x68, 0x01, 0x01, 0x51}, - .req = SYNC, + .is_sync = true, }, [CMD_ISO14443A_PROTOCOL_SELECT] = { .cmd_len = 0x7, .cmd_id = PROTOCOL_SELECT_CMD, .no_cmd_params = 0x4, .cmd_params = {ISO14443A_PROTOCOL_CODE, 0x00, 0x01, 0xA0}, - .req = SYNC, + .is_sync = true, }, [CMD_ISO14443B_PROTOCOL_SELECT] = { .cmd_len = 0x7, .cmd_id = PROTOCOL_SELECT_CMD, .no_cmd_params = 0x4, .cmd_params = {ISO14443B_PROTOCOL_CODE, 0x01, 0x03, 0xFF}, - .req = SYNC, + .is_sync = true, }, [CMD_WTX_RESPONSE] = { .cmd_len = 0x6, .cmd_id = SEND_RECEIVE_CMD, .no_cmd_params = 0x3, .cmd_params = {0xF2, 0x00, TRFLAG_NFCA_STD_FRAME_CRC}, - .req = ASYNC, }, [CMD_FIELD_OFF] = { .cmd_len = 0x5, .cmd_id = PROTOCOL_SELECT_CMD, .no_cmd_params = 0x2, .cmd_params = {0x0, 0x0}, - .req = SYNC, + .is_sync = true, }, [CMD_ISO15693_PROTOCOL_SELECT] = { .cmd_len = 0x5, .cmd_id = PROTOCOL_SELECT_CMD, .no_cmd_params = 0x2, .cmd_params = {ISO15693_PROTOCOL_CODE, 0x0D}, - .req = SYNC, + .is_sync = true, }, }; @@ -279,13 +278,13 @@ static int st95hf_send_recv_cmd(struct st95hf_context *st95context, ret = st95hf_spi_send(>spicontext, spi_cmd_buffer, cmd_array[cmd].cmd_len, - cmd_array[cmd].req); + cmd_array[cmd].is_sync); if (ret) { dev_err(dev, "st95hf_spi_send failed with error %d\n", ret); return ret; } - if (cmd_array[cmd].req == SYNC && recv_res) { + if (cmd_array[cmd].is_sync && recv_res) { unsigned char st95hf_response_arr[2]; ret = st95hf_spi_recv_response(>spicontext, @@ -480,10 +479,8 @@ static int st95hf_send_spi_reset_sequence(struct st95hf_context *st95context) int result = 0; unsigned char reset_cmd = ST95HF_COMMAND_RESET; - result = st95hf_spi_send(>spicontext, -_cmd, -ST95HF_RESET_CMD_LEN, -ASYNC); + result = st95hf_spi_send(>spicontext, _cmd, +ST95HF_RESET_CMD_LEN, false); if (result) { dev_err(>spicontext.spidev->dev, "spi reset sequence cmd error = %d", result); @@ -932,8 +929,7 @@ static int st95hf_in_send_cmd(struct nfc_digital_dev *ddev, stcontext->complete_cb_arg.rats = true; rc = st95hf_spi_send(>spicontext, skb->data, -skb->len, -ASYNC); +skb-&g
[PATCH 10/10] NFC: st95hf: add of match table
Add a match table for device tree compatible strings. Interestingly, a document describing the bindings already exists since a while, but users currently reply on the implicit matching in the drivers core. Signed-off-by: Daniel Mack Cc: devicet...@vger.kernel.org --- drivers/nfc/st95hf/core.c | 7 +++ 1 file changed, 7 insertions(+) diff --git a/drivers/nfc/st95hf/core.c b/drivers/nfc/st95hf/core.c index 4db3c020921c..5a01b454fbc4 100644 --- a/drivers/nfc/st95hf/core.c +++ b/drivers/nfc/st95hf/core.c @@ -1012,6 +1012,12 @@ static const struct spi_device_id st95hf_id[] = { }; MODULE_DEVICE_TABLE(spi, st95hf_id); +static const struct of_device_id st95hf_of_match[] = { + { .compatible = "st,st95hf", }, + {} +}; +MODULE_DEVICE_TABLE(of, st95hf_of_match); + static int st95hf_probe(struct spi_device *nfc_spi_dev) { int ret; @@ -1189,6 +1195,7 @@ static struct spi_driver st95hf_driver = { .driver = { .name = "st95hf", .owner = THIS_MODULE, + .of_match_table = st95hf_of_match, }, .id_table = st95hf_id, .probe = st95hf_probe, -- 2.17.1
[PATCH 07/10] NFC: st95hf: re-order command defines
Just a small cleanup to bring the command defines in a numerical order. Signed-off-by: Daniel Mack --- drivers/nfc/st95hf/core.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/drivers/nfc/st95hf/core.c b/drivers/nfc/st95hf/core.c index 99f84ddfdfef..e7ecc57dde8f 100644 --- a/drivers/nfc/st95hf/core.c +++ b/drivers/nfc/st95hf/core.c @@ -45,10 +45,10 @@ /* Command Send Interface */ /* ST95HF_COMMAND_SEND CMD Ids */ -#define ECHO_CMD 0x55 -#define WRITE_REGISTER_CMD 0x9 #define PROTOCOL_SELECT_CMD0x2 #define SEND_RECEIVE_CMD 0x4 +#define WRITE_REGISTER_CMD 0x9 +#define ECHO_CMD 0x55 /* Select protocol codes */ #define ISO15693_PROTOCOL_CODE 0x1 -- 2.17.1
[PATCH 06/10] NFC: st95hf: move skb allocation to ISR
The driver currently assumes that interrupts can only occur between the time when when a command has been sent to the device and the response to it. This assumption, however, is wrong. The antenna of the chip is likely to catch a lot of environmental noise, and once in a while, the device will latch the interrupt to signal a protocol error, and keep it latched until the response bytes are read from the chip. As the code currently stands, skbs for responses are only prepared when a command is sent, and the ISR bails out early if that wasn't the case. Hence, the interrupt remains latched, and no further communication with device is possible. To fix this, move the call to nfc_alloc_recv_skb() from st95hf_in_send_cmd() to st95hf_irq_thread_handler() so we can always read back the interrupt reason. Signed-off-by: Daniel Mack --- drivers/nfc/st95hf/core.c | 36 ++-- 1 file changed, 6 insertions(+), 30 deletions(-) diff --git a/drivers/nfc/st95hf/core.c b/drivers/nfc/st95hf/core.c index 6761ab90f68d..99f84ddfdfef 100644 --- a/drivers/nfc/st95hf/core.c +++ b/drivers/nfc/st95hf/core.c @@ -196,7 +196,6 @@ static const struct cmd cmd_array[] = { /* st95_digital_cmd_complete_arg stores client context */ struct st95_digital_cmd_complete_arg { - struct sk_buff *skb_resp; nfc_digital_cmd_complete_t complete_cb; void *cb_usrarg; bool rats; @@ -783,12 +782,10 @@ static irqreturn_t st95hf_irq_thread_handler(int irq, void *st95hfcontext) spidevice = >spicontext.spidev->dev; cb_arg = >complete_cb_arg; - skb_resp = cb_arg->skb_resp; - if (unlikely(!skb_resp)) { - WARN(1, "unknown context in ST95HF ISR"); + skb_resp = nfc_alloc_recv_skb(MAX_RESPONSE_BUFFER_SIZE, GFP_KERNEL); + if (WARN_ON(!skb_resp)) return IRQ_NONE; - } mutex_lock(>rm_lock); res_len = st95hf_spi_recv_response(>spicontext, @@ -838,12 +835,6 @@ static irqreturn_t st95hf_irq_thread_handler(int irq, void *st95hfcontext) /* call digital layer callback */ cb_arg->complete_cb(stcontext->ddev, cb_arg->cb_usrarg, skb_resp); - /* -* This skb is now owned by the core layer. -* Make sure not to use it again. -*/ - cb_arg->skb_resp = NULL; - /* up the semaphore before returning */ mutex_unlock(>rm_lock); @@ -913,15 +904,7 @@ static int st95hf_in_send_cmd(struct nfc_digital_dev *ddev, void *arg) { struct st95hf_context *stcontext = nfc_digital_get_drvdata(ddev); - int rc; - struct sk_buff *skb_resp; - int len_data_to_tag = 0; - - skb_resp = nfc_alloc_recv_skb(MAX_RESPONSE_BUFFER_SIZE, GFP_KERNEL); - if (!skb_resp) { - rc = -ENOMEM; - goto error; - } + int rc, len_data_to_tag = 0; switch (stcontext->current_rf_tech) { case NFC_DIGITAL_RF_TECH_106A: @@ -933,8 +916,7 @@ static int st95hf_in_send_cmd(struct nfc_digital_dev *ddev, len_data_to_tag = skb->len; break; default: - rc = -EINVAL; - goto free_skb_resp; + return -EINVAL; } skb_push(skb, 3); @@ -942,7 +924,6 @@ static int st95hf_in_send_cmd(struct nfc_digital_dev *ddev, skb->data[1] = SEND_RECEIVE_CMD; skb->data[2] = len_data_to_tag; - stcontext->complete_cb_arg.skb_resp = skb_resp; stcontext->complete_cb_arg.cb_usrarg = arg; stcontext->complete_cb_arg.complete_cb = cb; @@ -956,17 +937,12 @@ static int st95hf_in_send_cmd(struct nfc_digital_dev *ddev, if (rc) { dev_err(>nfcdev->dev, "Error %d trying to perform data_exchange", rc); - goto free_skb_resp; + return rc; } kfree_skb(skb); - return rc; - -free_skb_resp: - kfree_skb(skb_resp); -error: - return rc; + return 0; } /* p2p will be supported in a later release ! */ -- 2.17.1
[PATCH 05/10] NFC: st95hf: remove exchange_lock
This patch removes the exchange_lock sempahore. Its intended function was two-fold: a) Lock the remove() callback of the driver against the ISR, so that the resources only go away after the ISR has finished. This is unnecessary though, because `rm_lock' does that already, in combination with the nullification of `scontext->ddev'. b) Indicate whether a command was sent previously. If the semaphore is found unused in the threaded ISR, an error is reported. This case can be handled much nicer by checking whether `skb_resp' is present in the context. For this, nullify the `skb_resp' pointer in the callback context. Signed-off-by: Daniel Mack --- drivers/nfc/st95hf/core.c | 52 +++ 1 file changed, 9 insertions(+), 43 deletions(-) diff --git a/drivers/nfc/st95hf/core.c b/drivers/nfc/st95hf/core.c index d857197ec7b2..6761ab90f68d 100644 --- a/drivers/nfc/st95hf/core.c +++ b/drivers/nfc/st95hf/core.c @@ -214,8 +214,6 @@ struct st95_digital_cmd_complete_arg { * @st95hf_supply: regulator "consumer" for NFC device. * @sendrcv_trflag: last byte of frame send by sendrecv command * of st95hf. This byte contains transmission flag info. - * @exchange_lock: semaphore used for signaling the st95hf_remove - * function that the last outstanding async nfc request is finished. * @rm_lock: mutex for ensuring safe access of nfc digital object * from threaded ISR. Usage of this mutex avoids any race between * deletion of the object from st95hf_remove() and its access from @@ -233,7 +231,6 @@ struct st95hf_context { struct st95_digital_cmd_complete_arg complete_cb_arg; struct regulator *st95hf_supply; unsigned char sendrcv_trflag; - struct semaphore exchange_lock; struct mutex rm_lock; u8 current_protocol; u8 current_rf_tech; @@ -785,29 +782,14 @@ static irqreturn_t st95hf_irq_thread_handler(int irq, void *st95hfcontext) struct st95_digital_cmd_complete_arg *cb_arg; spidevice = >spicontext.spidev->dev; + cb_arg = >complete_cb_arg; + skb_resp = cb_arg->skb_resp; - /* -* check semaphore, if not down() already, then we don't -* know in which context the ISR is called and surely it -* will be a bug. Note that down() of the semaphore is done -* in the corresponding st95hf_in_send_cmd() and then -* only this ISR should be called. ISR will up() the -* semaphore before leaving. Hence when the ISR is called -* the correct behaviour is down_trylock() should always -* return 1 (indicating semaphore cant be taken and hence no -* change in semaphore count). -* If not, then we up() the semaphore and crash on -* a BUG() ! -*/ - if (!down_trylock(>exchange_lock)) { - up(>exchange_lock); + if (unlikely(!skb_resp)) { WARN(1, "unknown context in ST95HF ISR"); return IRQ_NONE; } - cb_arg = >complete_cb_arg; - skb_resp = cb_arg->skb_resp; - mutex_lock(>rm_lock); res_len = st95hf_spi_recv_response(>spicontext, skb_resp->data); @@ -856,8 +838,13 @@ static irqreturn_t st95hf_irq_thread_handler(int irq, void *st95hfcontext) /* call digital layer callback */ cb_arg->complete_cb(stcontext->ddev, cb_arg->cb_usrarg, skb_resp); + /* +* This skb is now owned by the core layer. +* Make sure not to use it again. +*/ + cb_arg->skb_resp = NULL; + /* up the semaphore before returning */ - up(>exchange_lock); mutex_unlock(>rm_lock); return IRQ_HANDLED; @@ -868,8 +855,6 @@ static irqreturn_t st95hf_irq_thread_handler(int irq, void *st95hfcontext) skb_resp = ERR_PTR(result); /* call of callback with error */ cb_arg->complete_cb(stcontext->ddev, cb_arg->cb_usrarg, skb_resp); - /* up the semaphore before returning */ - up(>exchange_lock); mutex_unlock(>rm_lock); return IRQ_HANDLED; } @@ -965,25 +950,12 @@ static int st95hf_in_send_cmd(struct nfc_digital_dev *ddev, ddev->curr_protocol == NFC_PROTO_ISO14443) stcontext->complete_cb_arg.rats = true; - /* -* down the semaphore to indicate to remove func that an -* ISR is pending, note that it will not block here in any case. -* If found blocked, it is a BUG! -*/ - rc = down_killable(>exchange_lock); - if (rc) { - WARN(1, "Semaphore is not found up in st95hf_in_send_cmd\n"); - return rc; - } - rc = st95hf_spi_send(>spicontext, skb->data, skb->len, ASYNC); i
[PATCH 01/10] Revert "NFC: st95hf: drop illegal kfree_skb()"
This reverts commit c99f996b2ba49 ("NFC: st95hf: drop illegal kfree_skb()"). It turns out that the st95hf_in_send_cmd() is in fact the sole owner of this skb, and by not freeing it here, we not only causing a memory leak but also mess up the refcount of the socket that holds it. This will in turn lead to activated targets not being cleaned up, even after stopping userspace processes. The memory corruption that I was hunting was caused by another kfree_skb(). This will be fixed in a later commit. Signed-off-by: Daniel Mack Fixes: c99f996b2ba49 ("NFC: st95hf: drop illegal kfree_skb()") --- drivers/nfc/st95hf/core.c | 2 ++ 1 file changed, 2 insertions(+) diff --git a/drivers/nfc/st95hf/core.c b/drivers/nfc/st95hf/core.c index 36ef0e905ba3..bc1a2070f9bb 100644 --- a/drivers/nfc/st95hf/core.c +++ b/drivers/nfc/st95hf/core.c @@ -991,6 +991,8 @@ static int st95hf_in_send_cmd(struct nfc_digital_dev *ddev, goto free_skb_resp; } + kfree_skb(skb); + return rc; free_skb_resp: -- 2.17.1
[PATCH 00/10] NFC: A bunch of cleanups for st95hf
This is a series of patches for the ST95HF driver. Patch #1 reverts a change that I have submitted earlier and which is sitting in nfc-next already. Given that the tree hasn't been sent out for being merged yet, it could also still be removed with rebasing, in which case #1 is not necessary of course. The changes are all rather simple and are explained in their individual commit logs. Note that this series builds upon the patch titled "nfc: st95hf: remove redundant pointers 'dev' and 'nfcddev'" that Colin posted the other day. Thanks, Daniel Daniel Mack (10): Revert "NFC: st95hf: drop illegal kfree_skb()" NFC: st95hf: drop nfcdev_free NFC: st95hf: drop illegal kfree_skb() in IRQ handler NFC: st95hf: remove logging from spi functions NFC: st95hf: remove exchange_lock NFC: st95hf: move skb allocation to ISR NFC: st95hf: re-order command defines NFC: st95hf: unify sync/async flags NFC: st95hf: two small style nits NFC: st95hf: add of match table drivers/nfc/st95hf/core.c | 135 +++--- drivers/nfc/st95hf/spi.c | 31 +++-- drivers/nfc/st95hf/spi.h | 8 +-- 3 files changed, 49 insertions(+), 125 deletions(-) -- 2.17.1
Re: [PATCH 2/2] nfc: st95hf: drop another illegal kfree_skb()
Hi, I'll resend the two patches in this series as part of a bigger series soon, please ignore them for now. Thanks, Daniel On Friday, June 29, 2018 02:47 PM, Daniel Mack wrote: In the error path of the IRQ handler, don't free the skb in flight. The callback in the digital core will do that for us, so this is another double-free that leads to memory corruptions. The assignment of 'wtx' doesn't make sense as the variable is not read after it is written. Drop it. Signed-off-by: Daniel Mack --- drivers/nfc/st95hf/core.c | 2 -- 1 file changed, 2 deletions(-) diff --git a/drivers/nfc/st95hf/core.c b/drivers/nfc/st95hf/core.c index ef91ca8b53a4..e651e1aae5a3 100644 --- a/drivers/nfc/st95hf/core.c +++ b/drivers/nfc/st95hf/core.c @@ -868,8 +868,6 @@ static irqreturn_t st95hf_irq_thread_handler(int irq, void *st95hfcontext) return IRQ_HANDLED; end: - kfree_skb(skb_resp); - wtx = false; cb_arg->rats = false; skb_resp = ERR_PTR(result); /* call of callback with error */
Re: [PATCH v2] libertas: fix suspend and resume for SDIO connected cards
On Friday, June 29, 2018 05:39 PM, Ulf Hansson wrote: On 27 June 2018 at 20:58, Daniel Mack wrote: Prior to commit 573185cc7e64 ("mmc: core: Invoke sdio func driver's PM callbacks from the sdio bus"), the MMC core used to call into the power management functions of SDIO clients itself and removed the card if the return code was non-zero. IOW, the mmc handled errors gracefully and didn't upchain them to the pm core. Since this change, the mmc core relies on generic power management functions which treat all errors as a reason to cancel the suspend immediately. This causes suspend attempts to fail when the libertas driver is loaded. To fix this, power down the card explicitly in if_sdio_suspend() when we know we're about to lose power and return success. Also set a flag in these cases, and power up the card again in if_sdio_resume(). Signed-off-by: Daniel Mack Cc: Ulf Hansson Cc: Chris Ball Looks good to me! Should be a candidate for stable as well!? Thanks! Yeah, it should probably get a Fixes: 573185cc7e64 ("mmc: core: Invoke sdio func driver's PM callbacks from the sdio bus") as well. I have some additional related changes in mind for the libertas SDIO driver, however let me post patches for us to discuss around instead. I currently have access to such hardware, so I can test patches :) Thanks, Daniel
[PATCH 1/2] nfc: st95hf: drop nfcdev_free
This flag is unneccesary. We can just nullify `ddev' instead after we freed it. Signed-off-by: Daniel Mack --- drivers/nfc/st95hf/core.c | 8 +++- 1 file changed, 3 insertions(+), 5 deletions(-) diff --git a/drivers/nfc/st95hf/core.c b/drivers/nfc/st95hf/core.c index a50a95cfcfd8..ef91ca8b53a4 100644 --- a/drivers/nfc/st95hf/core.c +++ b/drivers/nfc/st95hf/core.c @@ -220,7 +220,6 @@ struct st95_digital_cmd_complete_arg { * from threaded ISR. Usage of this mutex avoids any race between * deletion of the object from st95hf_remove() and its access from * the threaded ISR. - * @nfcdev_free: flag to have the state of nfc device object. * [alive | died] * @current_protocol: current nfc protocol. * @current_rf_tech: current rf technology. @@ -237,7 +236,6 @@ struct st95hf_context { unsigned char sendrcv_trflag; struct semaphore exchange_lock; struct mutex rm_lock; - bool nfcdev_free; u8 current_protocol; u8 current_rf_tech; int fwi; @@ -822,8 +820,8 @@ static irqreturn_t st95hf_irq_thread_handler(int irq, void *st95hfcontext) goto end; } - /* if stcontext->nfcdev_free is true, it means remove already ran */ - if (stcontext->nfcdev_free) { + /* if stcontext->ddev is %NULL, it means remove already ran */ + if (!stcontext->ddev) { result = -ENODEV; goto end; } @@ -1222,7 +1220,7 @@ static int st95hf_remove(struct spi_device *nfc_spi_dev) nfc_digital_unregister_device(stcontext->ddev); nfc_digital_free_device(stcontext->ddev); - stcontext->nfcdev_free = true; + stcontext->ddev = NULL; mutex_unlock(>rm_lock); -- 2.17.1
[PATCH 2/2] nfc: st95hf: drop another illegal kfree_skb()
In the error path of the IRQ handler, don't free the skb in flight. The callback in the digital core will do that for us, so this is another double-free that leads to memory corruptions. The assignment of 'wtx' doesn't make sense as the variable is not read after it is written. Drop it. Signed-off-by: Daniel Mack --- drivers/nfc/st95hf/core.c | 2 -- 1 file changed, 2 deletions(-) diff --git a/drivers/nfc/st95hf/core.c b/drivers/nfc/st95hf/core.c index ef91ca8b53a4..e651e1aae5a3 100644 --- a/drivers/nfc/st95hf/core.c +++ b/drivers/nfc/st95hf/core.c @@ -868,8 +868,6 @@ static irqreturn_t st95hf_irq_thread_handler(int irq, void *st95hfcontext) return IRQ_HANDLED; end: - kfree_skb(skb_resp); - wtx = false; cb_arg->rats = false; skb_resp = ERR_PTR(result); /* call of callback with error */ -- 2.17.1
[PATCH] wcn36xx: drop unnecessary initialization of variables
Initialization is unneccessary when the variable is written before it is read. There were some occasions in which the driver would initialize `ret' during declaration without need. Purely a cosmetic change with no functional impact. Signed-off-by: Daniel Mack --- drivers/net/wireless/ath/wcn36xx/main.c | 4 +- drivers/net/wireless/ath/wcn36xx/smd.c | 75 - 2 files changed, 38 insertions(+), 41 deletions(-) diff --git a/drivers/net/wireless/ath/wcn36xx/main.c b/drivers/net/wireless/ath/wcn36xx/main.c index e38443ecaab4..79998a3ddb7a 100644 --- a/drivers/net/wireless/ath/wcn36xx/main.c +++ b/drivers/net/wireless/ath/wcn36xx/main.c @@ -1161,8 +1161,6 @@ static const struct ieee80211_ops wcn36xx_ops = { static int wcn36xx_init_ieee80211(struct wcn36xx *wcn) { - int ret = 0; - static const u32 cipher_suites[] = { WLAN_CIPHER_SUITE_WEP40, WLAN_CIPHER_SUITE_WEP104, @@ -1209,7 +1207,7 @@ static int wcn36xx_init_ieee80211(struct wcn36xx *wcn) wiphy_ext_feature_set(wcn->hw->wiphy, NL80211_EXT_FEATURE_CQM_RSSI_LIST); - return ret; + return 0; } static int wcn36xx_platform_get_resources(struct wcn36xx *wcn, diff --git a/drivers/net/wireless/ath/wcn36xx/smd.c b/drivers/net/wireless/ath/wcn36xx/smd.c index 304a86c7dcd2..00098f24116d 100644 --- a/drivers/net/wireless/ath/wcn36xx/smd.c +++ b/drivers/net/wireless/ath/wcn36xx/smd.c @@ -250,7 +250,7 @@ static void wcn36xx_smd_set_sta_params(struct wcn36xx *wcn, static int wcn36xx_smd_send_and_wait(struct wcn36xx *wcn, size_t len) { - int ret = 0; + int ret; unsigned long start; struct wcn36xx_hal_msg_header *hdr = (struct wcn36xx_hal_msg_header *)wcn->hal_buf; @@ -446,7 +446,7 @@ static int wcn36xx_smd_start_rsp(struct wcn36xx *wcn, void *buf, size_t len) int wcn36xx_smd_start(struct wcn36xx *wcn) { struct wcn36xx_hal_mac_start_req_msg msg_body, *body; - int ret = 0; + int ret; int i; size_t len; @@ -493,7 +493,7 @@ int wcn36xx_smd_start(struct wcn36xx *wcn) int wcn36xx_smd_stop(struct wcn36xx *wcn) { struct wcn36xx_hal_mac_stop_req_msg msg_body; - int ret = 0; + int ret; mutex_lock(>hal_mutex); INIT_HAL_MSG(msg_body, WCN36XX_HAL_STOP_REQ); @@ -520,7 +520,7 @@ int wcn36xx_smd_stop(struct wcn36xx *wcn) int wcn36xx_smd_init_scan(struct wcn36xx *wcn, enum wcn36xx_hal_sys_mode mode) { struct wcn36xx_hal_init_scan_req_msg msg_body; - int ret = 0; + int ret; mutex_lock(>hal_mutex); INIT_HAL_MSG(msg_body, WCN36XX_HAL_INIT_SCAN_REQ); @@ -549,7 +549,7 @@ int wcn36xx_smd_init_scan(struct wcn36xx *wcn, enum wcn36xx_hal_sys_mode mode) int wcn36xx_smd_start_scan(struct wcn36xx *wcn, u8 scan_channel) { struct wcn36xx_hal_start_scan_req_msg msg_body; - int ret = 0; + int ret; mutex_lock(>hal_mutex); INIT_HAL_MSG(msg_body, WCN36XX_HAL_START_SCAN_REQ); @@ -579,7 +579,7 @@ int wcn36xx_smd_start_scan(struct wcn36xx *wcn, u8 scan_channel) int wcn36xx_smd_end_scan(struct wcn36xx *wcn, u8 scan_channel) { struct wcn36xx_hal_end_scan_req_msg msg_body; - int ret = 0; + int ret; mutex_lock(>hal_mutex); INIT_HAL_MSG(msg_body, WCN36XX_HAL_END_SCAN_REQ); @@ -610,7 +610,7 @@ int wcn36xx_smd_finish_scan(struct wcn36xx *wcn, enum wcn36xx_hal_sys_mode mode) { struct wcn36xx_hal_finish_scan_req_msg msg_body; - int ret = 0; + int ret; mutex_lock(>hal_mutex); INIT_HAL_MSG(msg_body, WCN36XX_HAL_FINISH_SCAN_REQ); @@ -732,7 +732,7 @@ int wcn36xx_smd_stop_hw_scan(struct wcn36xx *wcn) static int wcn36xx_smd_switch_channel_rsp(void *buf, size_t len) { struct wcn36xx_hal_switch_channel_rsp_msg *rsp; - int ret = 0; + int ret; ret = wcn36xx_smd_rsp_status_check(buf, len); if (ret) @@ -747,7 +747,7 @@ int wcn36xx_smd_switch_channel(struct wcn36xx *wcn, struct ieee80211_vif *vif, int ch) { struct wcn36xx_hal_switch_channel_req_msg msg_body; - int ret = 0; + int ret; mutex_lock(>hal_mutex); INIT_HAL_MSG(msg_body, WCN36XX_HAL_CH_SWITCH_REQ); @@ -860,7 +860,7 @@ int wcn36xx_smd_update_scan_params(struct wcn36xx *wcn, u8 *channels, size_t channel_count) { struct wcn36xx_hal_update_scan_params_req_ex msg_body; - int ret = 0; + int ret; mutex_lock(>hal_mutex); INIT_HAL_MSG(msg_body, WCN36XX_HAL_UPDATE_SCAN_PARAM_REQ); @@ -931,7 +931,7 @@ static int wcn36xx_smd_add_sta_self_rsp(struct wcn36xx *wcn, int wcn36xx_smd_add_sta_self(struct wcn36xx *wcn, struct ieee80211_vif *vif) { struct wcn36xx_hal_add_sta_self_req msg_body; - int ret = 0; +
[PATCH v2] libertas: fix suspend and resume for SDIO connected cards
Prior to commit 573185cc7e64 ("mmc: core: Invoke sdio func driver's PM callbacks from the sdio bus"), the MMC core used to call into the power management functions of SDIO clients itself and removed the card if the return code was non-zero. IOW, the mmc handled errors gracefully and didn't upchain them to the pm core. Since this change, the mmc core relies on generic power management functions which treat all errors as a reason to cancel the suspend immediately. This causes suspend attempts to fail when the libertas driver is loaded. To fix this, power down the card explicitly in if_sdio_suspend() when we know we're about to lose power and return success. Also set a flag in these cases, and power up the card again in if_sdio_resume(). Signed-off-by: Daniel Mack Cc: Ulf Hansson Cc: Chris Ball --- v1 → v2: * Reworded patch subject * Added wait_event(card->pwron_waitq, card->priv->fw_ready) drivers/net/wireless/marvell/libertas/dev.h | 1 + .../net/wireless/marvell/libertas/if_sdio.c | 30 +++ 2 files changed, 25 insertions(+), 6 deletions(-) diff --git a/drivers/net/wireless/marvell/libertas/dev.h b/drivers/net/wireless/marvell/libertas/dev.h index dd1ee1f0af48..469134930026 100644 --- a/drivers/net/wireless/marvell/libertas/dev.h +++ b/drivers/net/wireless/marvell/libertas/dev.h @@ -104,6 +104,7 @@ struct lbs_private { u8 fw_ready; u8 surpriseremoved; u8 setup_fw_on_resume; + u8 power_up_on_resume; int (*hw_host_to_card) (struct lbs_private *priv, u8 type, u8 *payload, u16 nb); void (*reset_card) (struct lbs_private *priv); int (*power_save) (struct lbs_private *priv); diff --git a/drivers/net/wireless/marvell/libertas/if_sdio.c b/drivers/net/wireless/marvell/libertas/if_sdio.c index 2300e796c6ab..43743c26c071 100644 --- a/drivers/net/wireless/marvell/libertas/if_sdio.c +++ b/drivers/net/wireless/marvell/libertas/if_sdio.c @@ -1290,15 +1290,23 @@ static void if_sdio_remove(struct sdio_func *func) static int if_sdio_suspend(struct device *dev) { struct sdio_func *func = dev_to_sdio_func(dev); - int ret; struct if_sdio_card *card = sdio_get_drvdata(func); + struct lbs_private *priv = card->priv; + int ret; mmc_pm_flag_t flags = sdio_get_host_pm_caps(func); + priv->power_up_on_resume = false; /* If we're powered off anyway, just let the mmc layer remove the * card. */ - if (!lbs_iface_active(card->priv)) - return -ENOSYS; + if (!lbs_iface_active(priv)) { + if (priv->fw_ready) { + priv->power_up_on_resume = true; + if_sdio_power_off(card); + } + + return 0; + } dev_info(dev, "%s: suspend: PM flags = 0x%x\n", sdio_func_id(func), flags); @@ -1306,9 +1314,14 @@ static int if_sdio_suspend(struct device *dev) /* If we aren't being asked to wake on anything, we should bail out * and let the SD stack power down the card. */ - if (card->priv->wol_criteria == EHS_REMOVE_WAKEUP) { + if (priv->wol_criteria == EHS_REMOVE_WAKEUP) { dev_info(dev, "Suspend without wake params -- powering down card\n"); - return -ENOSYS; + if (priv->fw_ready) { + priv->power_up_on_resume = true; + if_sdio_power_off(card); + } + + return 0; } if (!(flags & MMC_PM_KEEP_POWER)) { @@ -1321,7 +1334,7 @@ static int if_sdio_suspend(struct device *dev) if (ret) return ret; - ret = lbs_suspend(card->priv); + ret = lbs_suspend(priv); if (ret) return ret; @@ -1336,6 +1349,11 @@ static int if_sdio_resume(struct device *dev) dev_info(dev, "%s: resume: we're back\n", sdio_func_id(func)); + if (card->priv->power_up_on_resume) { + if_sdio_power_on(card); + wait_event(card->pwron_waitq, card->priv->fw_ready); + } + ret = lbs_resume(card->priv); return ret; -- 2.17.1
[PATCH] libertas: fix return codes in if_sdio_suspend()
Prior to commit 573185cc7e64 ("mmc: core: Invoke sdio func driver's PM callbacks from the sdio bus"), the MMC core used to call into the power management functions of SDIO clients itself and removed the card if the return code was non-zero. IOW, the mmc handled errors gracefully and didn't upchain them to the pm core. Since this change, the mmc core relies on generic power management functions which treat all errors as a reason to cancel the suspend immediately. This causes suspend attempts to fail when the libertas driver is loaded. To fix this, power down the card explicitly in if_sdio_suspend() when we know we're about to lose power and return success. Also set a flag in these cases, and power up the card again in if_sdio_resume(). Signed-off-by: Daniel Mack Cc: Ulf Hansson Cc: Chris Ball --- drivers/net/wireless/marvell/libertas/dev.h | 1 + .../net/wireless/marvell/libertas/if_sdio.c | 28 +++ 2 files changed, 23 insertions(+), 6 deletions(-) diff --git a/drivers/net/wireless/marvell/libertas/dev.h b/drivers/net/wireless/marvell/libertas/dev.h index dd1ee1f0af48..469134930026 100644 --- a/drivers/net/wireless/marvell/libertas/dev.h +++ b/drivers/net/wireless/marvell/libertas/dev.h @@ -104,6 +104,7 @@ struct lbs_private { u8 fw_ready; u8 surpriseremoved; u8 setup_fw_on_resume; + u8 power_up_on_resume; int (*hw_host_to_card) (struct lbs_private *priv, u8 type, u8 *payload, u16 nb); void (*reset_card) (struct lbs_private *priv); int (*power_save) (struct lbs_private *priv); diff --git a/drivers/net/wireless/marvell/libertas/if_sdio.c b/drivers/net/wireless/marvell/libertas/if_sdio.c index 2300e796c6ab..f43807663b1b 100644 --- a/drivers/net/wireless/marvell/libertas/if_sdio.c +++ b/drivers/net/wireless/marvell/libertas/if_sdio.c @@ -1290,15 +1290,23 @@ static void if_sdio_remove(struct sdio_func *func) static int if_sdio_suspend(struct device *dev) { struct sdio_func *func = dev_to_sdio_func(dev); - int ret; struct if_sdio_card *card = sdio_get_drvdata(func); + struct lbs_private *priv = card->priv; + int ret; mmc_pm_flag_t flags = sdio_get_host_pm_caps(func); + priv->power_up_on_resume = false; /* If we're powered off anyway, just let the mmc layer remove the * card. */ - if (!lbs_iface_active(card->priv)) - return -ENOSYS; + if (!lbs_iface_active(priv)) { + if (priv->fw_ready) { + priv->power_up_on_resume = true; + if_sdio_power_off(card); + } + + return 0; + } dev_info(dev, "%s: suspend: PM flags = 0x%x\n", sdio_func_id(func), flags); @@ -1306,9 +1314,14 @@ static int if_sdio_suspend(struct device *dev) /* If we aren't being asked to wake on anything, we should bail out * and let the SD stack power down the card. */ - if (card->priv->wol_criteria == EHS_REMOVE_WAKEUP) { + if (priv->wol_criteria == EHS_REMOVE_WAKEUP) { dev_info(dev, "Suspend without wake params -- powering down card\n"); - return -ENOSYS; + if (priv->fw_ready) { + priv->power_up_on_resume = true; + if_sdio_power_off(card); + } + + return 0; } if (!(flags & MMC_PM_KEEP_POWER)) { @@ -1321,7 +1334,7 @@ static int if_sdio_suspend(struct device *dev) if (ret) return ret; - ret = lbs_suspend(card->priv); + ret = lbs_suspend(priv); if (ret) return ret; @@ -1336,6 +1349,9 @@ static int if_sdio_resume(struct device *dev) dev_info(dev, "%s: resume: we're back\n", sdio_func_id(func)); + if (card->priv->power_up_on_resume) + if_sdio_power_on(card); + ret = lbs_resume(card->priv); return ret; -- 2.17.1
Re: [PATCH 1/2] NFC: st95hf: initialize semaphore and mutex earlier
On Monday, May 28, 2018 04:50 PM, Samuel Ortiz wrote: Hi Daniel, On Mon, May 28, 2018 at 04:35:15PM +0200, Daniel Mack wrote: On Wednesday, May 16, 2018 03:32 PM, Daniel Mack wrote: 'rm_lock' and 'exchange_lock' need to be ready before the IRQ handler has a chance to fire. This fixes the oops below. Nobody seems to be interested in these. Davem, can you take them through your tree or is there anyone else I can ping? I'm going to gather all pending NFC patches this week, including this one. They will land in either the nfc-next or nfc-fixes tree. Ah, perfect. Sorry for nagging. I just wanted to be sure they're not forgotten :) Thanks, Daniel
Re: [PATCH 1/2] NFC: st95hf: initialize semaphore and mutex earlier
On Wednesday, May 16, 2018 03:32 PM, Daniel Mack wrote: 'rm_lock' and 'exchange_lock' need to be ready before the IRQ handler has a chance to fire. This fixes the oops below. Nobody seems to be interested in these. Davem, can you take them through your tree or is there anyone else I can ping? Thanks, Daniel [1.040255] Internal error: Oops: 9604 [#1] PREEMPT SMP [...] [1.181564] Call trace: [1.188591] Exception stack(0x0a473c40 to 0x0a473d80) [1.190943] 3c40: 80003673b118 800036374380 [1.197542] 3c60: 044b2ac5 0001 [1.205354] 3c80: 800036374d60 0a473d70 0980 [1.213166] 3ca0: 0001 004c 0033 [1.217590] st95hf spi2.0: err: por seq failed for st95hf [1.228788] 3cc0: 0019 0001 0007 80003673b118 [1.234175] 3ce0: 89f27000 80003673b1c8 80003673b1b0 [1.241986] 3d00: 080f 89f716a4 0894bb40 [1.249800] 3d20: 0a473d80 084268c0 0a473d80 [1.257611] 3d40: 084268c4 4005 0a473d60 080e5688 [1.265424] 3d60: 084268a4 0a473d80 084268c4 [1.273239] [] st95hf_irq_thread_handler+0x44/0x3a0 [1.281048] [] irq_thread_fn+0x28/0x68 [1.287468] [] irq_thread+0x10c/0x1a0 [1.292850] [] kthread+0x12c/0x130 [1.297799] [] ret_from_fork+0x10/0x18 [1.303008] Code: aa1603e0 f9403675 940d010f aa1303e0 (f94066a1) [1.308307] ---[ end trace d058c1b88aad74d8 ]--- Signed-off-by: Daniel Mack <dan...@zonque.org> --- drivers/nfc/st95hf/core.c | 5 ++--- 1 file changed, 2 insertions(+), 3 deletions(-) diff --git a/drivers/nfc/st95hf/core.c b/drivers/nfc/st95hf/core.c index 2b26f762fbc3..394bdc7b0cf2 100644 --- a/drivers/nfc/st95hf/core.c +++ b/drivers/nfc/st95hf/core.c @@ -1112,8 +1112,10 @@ static int st95hf_probe(struct spi_device *nfc_spi_dev) } } + sema_init(>exchange_lock, 1); init_completion(>done); mutex_init(>spi_lock); + mutex_init(>rm_lock); /* * Store spicontext in spi device object for using it in @@ -1197,9 +1199,6 @@ static int st95hf_probe(struct spi_device *nfc_spi_dev) /* store st95context in nfc device object */ nfc_digital_set_drvdata(st95context->ddev, st95context); - sema_init(>exchange_lock, 1); - mutex_init(>rm_lock); - return ret; err_free_digital_device:
Re: [PATCH 00/10] Some more patches for wcn36xx
Hi Kalle, On Thursday, May 24, 2018 10:44 AM, Kalle Valo wrote: Daniel Mack <dan...@zonque.org> writes: On Friday, May 18, 2018 01:28 PM, Kalle Valo wrote: Also I would recommend to file a bug to bugzilla.kernel.org so that all the information is one place and it can be easily updated. Now it's pretty difficult to get the big picture from various emails on the list. Yes, I agree it's a bit convoluted. However, there's already the bug report on 96board.org that Bjorn opened some time back, and I considered that sufficient. IMO, it has all the information needed, plus a link to a tool to reproduce the issue. https://bugs.96boards.org/show_bug.cgi?id=538 Yeah, bugs.96boards.org is fine. As long as there's one place which collects all the information about the bug. But IMHO the bug report is not telling much, all I get is that TX frames get stuck but not even that is confirmed. After reading it I have at least these questions: * Is it really confirmed that the issue is that TX frames are stuck? For example, using a wireless sniffer would confirm that. Yes, that's confirmed. I have a 2nd machine tuned to the same channel than the network I use for testing, and once the timeouts happen, I cannot see any frame anymore from the MAC of the wcn36xx. No probe requests for scans, no authentication attempts, nothing. As my test constantly connects and disconnects, the last thing I see in wireshark is a deauthentication frame. * Are only management frames stuck or does it also involve data frames? It seems that once a network is successfully joined, the network stability is fine. I haven't seen any starvation of streams lately, at least not with the the patches in this series which I'm running since a while. That is, until a disconnect/reconnect attempt is made, and at this point, only management frames are involved. * Based on the bug report the TX stuck issue seems to happen during authentication, but what happens before that? Does wcn36xx get disconnected from AP or what? As I said, my test setup includes repeated disconnections to make the bug appear. It sometimes happens at the first attempt after a fresh boot, however, so the stress test only makes debugging a bit easier by increasing the likeliness. * Any wcn36xx logs about the issue (with or without debug logs)? Also matching wpasupplicant logs would help. The problem with this is that it's not exactly clear what kind of effect we're looking at. With all the debug flags of the driver enabled, it produces so much log output that wpa_supplicant gives up due to timeouts. The other weird issue is that with WCN36XX_DBG_MAC and/or WCN36XX_DBG_SMD enabled, the effect is _much_ harder to trigger. * Does this only happen with encryption or also in open mode? That's a good question. I'll go check with an open network. * How long does it take with qconnman-stress to reproduce the issue? Usually less than 10 minutes. * Does the radio environment make any difference on reproducibility? For example, clear enviroment vs lots of traffic/interference? It seems it does, yes. Tests at night seem to take a bit more time to make the effect happen. But then again, it could also be unrelated. I can't be certain at this point. I'll submit some more information to the bug report. What would help me is if other people tried to reproduce the effect using the stress tool and confirm my findings. Chances are I've been staring at this for too long :) Thanks again, Daniel
Re: wcn36xx: bug #538: stuck tx management frames
On Thursday, May 24, 2018 01:48 PM, Kalle Valo wrote: Daniel Mack <dan...@zonque.org> writes: On Thursday, May 24, 2018 10:44 AM, Kalle Valo wrote: Daniel Mack <dan...@zonque.org> writes: It seems that once a network is successfully joined, the network stability is fine. I haven't seen any starvation of streams lately, at least not with the the patches in this series which I'm running since a while. That is, until a disconnect/reconnect attempt is made, and at this point, only management frames are involved. Ah, maybe originally you were seeing different issues with similar symptoms? But now you have fixed the other bugsand now the stuck transmitted management frame issue is left? Just guessing... Yeah, I wish I had a clearer picture on all this myself :( My patches definitely address some of the issues I have seen before, also the fixes for potential race conditions are likely to have a positive effect. But as you guessed yourself, I'm afraid that there's a multitude of possible sources for bugs, so it's hard to tell. It would be great to have wcn36xx logging via tracing, just like ath10k and iwlwifi does. This way logging shouldn't slow down the system too much and with wpasupplicant's -T switch you can even get wpasupplicant's debug messages to the same log with proper timestamps! And almost forgot, you can also include mac80211 tracing logs as well: https://wireless.wiki.kernel.org/en/developers/documentation/mac80211/tracing https://wireless.wiki.kernel.org/en/users/drivers/ath10k/debug#tracing See ath10k_dbg() and trace_ath10k_log_dbg() for ideas how to implement this, and you can also take a look at iwlwifi. Should be pretty easy. Patches more than welcome :) Okay, I'll see if I can find some time to look into this. The reason why I didn't focus the logging possibilities is that I looked at the SMD messages that are flying around which result from ieee80211 API calls into the driver, and I can't seem to find anything that's wrong, especially not before the timeouts occur. Hence, I don't actually expect any oddness on the ieee80211 layer. But I agree that in general, better logging is certainly helpful. It seems it does, yes. Tests at night seem to take a bit more time to make the effect happen. But then again, it could also be unrelated. I can't be certain at this point. Can you describe what kind of radio environment you have, is it a busy office complex? How many APs around etc? I've tried different environments. In the office with 15-20 laptops/mobiles in the room I see about 10-15 APs. At home, where I did long-term nightly test, there's maybe a higher number of APs, but fewer devices on the channel of the AP that I used for tests. I haven't used any more sophisticated environments like a sealed reverberation chamber yet though. Thanks, Daniel
Re: [PATCH 00/10] Some more patches for wcn36xx
Hi Kalle, On Friday, May 18, 2018 01:28 PM, Kalle Valo wrote: Daniel Mack <dan...@zonque.org> writes: On Wednesday, May 16, 2018 04:08 PM, Daniel Mack wrote: Hence I believe that some sort of firmware internal buffer is overrun if too many SMD requests fly in in a short amount of time. The firmware does, however, still ack all packets just fine on the SMD channels, and also the DXE communication flows are all healthy. No errors are reported anywhere, but nothing is being put on the ether anymore. And FTR, there is a commit in the prima repository that caught my attention a while back: https://source.codeaurora.org/external/wlan/prima/commit/?id=93cd8f3c What this does (through an remarkable number of indirection layers) is sending the DUMP_COMMAND_REQ command with args = (274, 0, 0, 0, 0) when management frames get stuck, which smells pretty much like the issue I'm seeing. Doing the same with the mainline driver and the debugfs interface it exposes doesn't have any effect though. But even if it did work, I wouldn't see a way to detect the situation in which this is needed reliably. The firmware version might make a difference so I recommend always mentioning the firmware version as well. For example, what if your firmware does not support that command or parameter? Sure, that could be the case. FTR - the firmware I'm using is the one that came out of the Qualcomm r1034.2.1 BSP. It is recognized by the driver as 'WCN v2.0 RadioPhy vRhea_GF_1.12 with 19.2MHz XO'. Also I would recommend to file a bug to bugzilla.kernel.org so that all the information is one place and it can be easily updated. Now it's pretty difficult to get the big picture from various emails on the list. Yes, I agree it's a bit convoluted. However, there's already the bug report on 96board.org that Bjorn opened some time back, and I considered that sufficient. IMO, it has all the information needed, plus a link to a tool to reproduce the issue. https://bugs.96boards.org/show_bug.cgi?id=538 The patches in this series can be considered as general cleanups that are not necessarily related to this particular issue. Thanks, Daniel
Re: [PATCH 00/10] Some more patches for wcn36xx
On Wednesday, May 16, 2018 04:08 PM, Daniel Mack wrote: Hence I believe that some sort of firmware internal buffer is overrun if too many SMD requests fly in in a short amount of time. The firmware does, however, still ack all packets just fine on the SMD channels, and also the DXE communication flows are all healthy. No errors are reported anywhere, but nothing is being put on the ether anymore. And FTR, there is a commit in the prima repository that caught my attention a while back: https://source.codeaurora.org/external/wlan/prima/commit/?id=93cd8f3c What this does (through an remarkable number of indirection layers) is sending the DUMP_COMMAND_REQ command with args = (274, 0, 0, 0, 0) when management frames get stuck, which smells pretty much like the issue I'm seeing. Doing the same with the mainline driver and the debugfs interface it exposes doesn't have any effect though. But even if it did work, I wouldn't see a way to detect the situation in which this is needed reliably. Daniel
[PATCH 02/10] wcn36xx: set DMA mask explicitly
The device takes 32-bit addresses only, so inform the DMA API about it. This is the default on msm8016, so that doesn't change anything, but it's best practice to be explicit. Signed-off-by: Daniel Mack <dan...@zonque.org> --- drivers/net/wireless/ath/wcn36xx/main.c | 6 ++ 1 file changed, 6 insertions(+) diff --git a/drivers/net/wireless/ath/wcn36xx/main.c b/drivers/net/wireless/ath/wcn36xx/main.c index e3b91b3b38ef..e6330ad372b3 100644 --- a/drivers/net/wireless/ath/wcn36xx/main.c +++ b/drivers/net/wireless/ath/wcn36xx/main.c @@ -1309,6 +1309,12 @@ static int wcn36xx_probe(struct platform_device *pdev) mutex_init(>hal_mutex); mutex_init(>scan_lock); + ret = dma_set_mask_and_coherent(wcn->dev, DMA_BIT_MASK(32)); + if (ret < 0) { + wcn36xx_err("failed to set DMA mask: %d\n", ret); + goto out_wq; + } + INIT_WORK(>scan_work, wcn36xx_hw_scan_worker); wcn->smd_channel = qcom_wcnss_open_channel(wcnss, "WLAN_CTRL", wcn36xx_smd_rsp_process, hw); -- 2.14.3
[PATCH 09/10] wcn36xx: simplify wcn36xx_smd_open()
Drop the extra warning about failed allocations, both the core and the only caller of this function will warn loud enough in such cases. Signed-off-by: Daniel Mack <dan...@zonque.org> --- drivers/net/wireless/ath/wcn36xx/smd.c | 12 +++- 1 file changed, 3 insertions(+), 9 deletions(-) diff --git a/drivers/net/wireless/ath/wcn36xx/smd.c b/drivers/net/wireless/ath/wcn36xx/smd.c index 0a505b5e038b..43c8aa79fad4 100644 --- a/drivers/net/wireless/ath/wcn36xx/smd.c +++ b/drivers/net/wireless/ath/wcn36xx/smd.c @@ -2494,21 +2494,15 @@ static void wcn36xx_ind_smd_work(struct work_struct *work) } int wcn36xx_smd_open(struct wcn36xx *wcn) { - int ret = 0; wcn->hal_ind_wq = create_freezable_workqueue("wcn36xx_smd_ind"); - if (!wcn->hal_ind_wq) { - wcn36xx_err("failed to allocate wq\n"); - ret = -ENOMEM; - goto out; - } + if (!wcn->hal_ind_wq) + return -ENOMEM; + INIT_WORK(>hal_ind_work, wcn36xx_ind_smd_work); INIT_LIST_HEAD(>hal_ind_queue); spin_lock_init(>hal_ind_lock); return 0; - -out: - return ret; } void wcn36xx_smd_close(struct wcn36xx *wcn) -- 2.14.3
[PATCH 10/10] wcn36xx: improve debug and error messages for SMD
Add a missing newline in wcn36xx_smd_send_and_wait() and also log the command request and response type that was processed. Signed-off-by: Daniel Mack <dan...@zonque.org> --- drivers/net/wireless/ath/wcn36xx/smd.c | 14 ++ 1 file changed, 10 insertions(+), 4 deletions(-) diff --git a/drivers/net/wireless/ath/wcn36xx/smd.c b/drivers/net/wireless/ath/wcn36xx/smd.c index 43c8aa79fad4..b855a58d5aac 100644 --- a/drivers/net/wireless/ath/wcn36xx/smd.c +++ b/drivers/net/wireless/ath/wcn36xx/smd.c @@ -252,23 +252,29 @@ static int wcn36xx_smd_send_and_wait(struct wcn36xx *wcn, size_t len) { int ret = 0; unsigned long start; + struct wcn36xx_hal_msg_header *hdr = + (struct wcn36xx_hal_msg_header *)wcn->hal_buf; + u16 req_type = hdr->msg_type; + wcn36xx_dbg_dump(WCN36XX_DBG_SMD_DUMP, "HAL >>> ", wcn->hal_buf, len); init_completion(>hal_rsp_compl); start = jiffies; ret = rpmsg_send(wcn->smd_channel, wcn->hal_buf, len); if (ret) { - wcn36xx_err("HAL TX failed\n"); + wcn36xx_err("HAL TX failed for req %d\n", req_type); goto out; } if (wait_for_completion_timeout(>hal_rsp_compl, msecs_to_jiffies(HAL_MSG_TIMEOUT)) <= 0) { - wcn36xx_err("Timeout! No SMD response in %dms\n", - HAL_MSG_TIMEOUT); + wcn36xx_err("Timeout! No SMD response to req %d in %dms\n", + req_type, HAL_MSG_TIMEOUT); ret = -ETIME; goto out; } - wcn36xx_dbg(WCN36XX_DBG_SMD, "SMD command completed in %dms", + wcn36xx_dbg(WCN36XX_DBG_SMD, + "SMD command (req %d, rsp %d) completed in %dms\n", + req_type, hdr->msg_type, jiffies_to_msecs(jiffies - start)); out: return ret; -- 2.14.3
[PATCH 08/10] wcn36xx: drain pending indicator messages on shutdown
When the interface is shut down, wcn36xx_smd_close() potentially races against the queue worker. Make sure to cancel the work, and then free all the remnants in hal_ind_queue manually. This is again just a theoretical issue, not something that was triggered in the wild. Signed-off-by: Daniel Mack <dan...@zonque.org> --- drivers/net/wireless/ath/wcn36xx/smd.c | 6 ++ 1 file changed, 6 insertions(+) diff --git a/drivers/net/wireless/ath/wcn36xx/smd.c b/drivers/net/wireless/ath/wcn36xx/smd.c index ea74f2b92df5..0a505b5e038b 100644 --- a/drivers/net/wireless/ath/wcn36xx/smd.c +++ b/drivers/net/wireless/ath/wcn36xx/smd.c @@ -2513,5 +2513,11 @@ int wcn36xx_smd_open(struct wcn36xx *wcn) void wcn36xx_smd_close(struct wcn36xx *wcn) { + struct wcn36xx_hal_ind_msg *msg, *tmp; + + cancel_work_sync(>hal_ind_work); destroy_workqueue(wcn->hal_ind_wq); + + list_for_each_entry_safe(msg, tmp, >hal_ind_queue, list) + kfree(msg); } -- 2.14.3
[PATCH 04/10] wcn36xx: clear all masks in RX interrupt
Like on the TX side, check for the interrupt reason when the RX interrupt is latched and clear the ERR, DONE and ED masks. This seems to help with connection timeouts and network stream starvatations. And FWIW, the downstream driver does the same thing. Note that in analogy to the TX side, WCN36XX_DXE_0_INT_CLR should be set to WCN36XX_INT_MASK_CHAN_RX_{L,H} rather than WCN36XX_DXE_INT_CH{1,3}_MASK. It did the right thing however, as the defines happen to have identical values. Also, instead of determining register addresses and values inside wcn36xx_rx_handle_packets(), pass them as arguments. Signed-off-by: Daniel Mack <dan...@zonque.org> --- drivers/net/wireless/ath/wcn36xx/dxe.c | 62 ++ 1 file changed, 40 insertions(+), 22 deletions(-) diff --git a/drivers/net/wireless/ath/wcn36xx/dxe.c b/drivers/net/wireless/ath/wcn36xx/dxe.c index d6dd47b211ba..d11c9c536627 100644 --- a/drivers/net/wireless/ath/wcn36xx/dxe.c +++ b/drivers/net/wireless/ath/wcn36xx/dxe.c @@ -511,23 +511,40 @@ static int wcn36xx_dxe_request_irqs(struct wcn36xx *wcn) } static int wcn36xx_rx_handle_packets(struct wcn36xx *wcn, -struct wcn36xx_dxe_ch *ch) +struct wcn36xx_dxe_ch *ch, +u32 ctrl, +u32 en_mask, +u32 int_mask, +u32 status_reg) { struct wcn36xx_dxe_desc *dxe; struct wcn36xx_dxe_ctl *ctl; dma_addr_t dma_addr; struct sk_buff *skb; - int ret = 0, int_mask; - u32 value; + u32 int_reason; + int ret; - if (ch->ch_type == WCN36XX_DXE_CH_RX_L) { - value = WCN36XX_DXE_CTRL_RX_L; - int_mask = WCN36XX_DXE_INT_CH1_MASK; - } else { - value = WCN36XX_DXE_CTRL_RX_H; - int_mask = WCN36XX_DXE_INT_CH3_MASK; + wcn36xx_dxe_read_register(wcn, status_reg, _reason); + wcn36xx_dxe_write_register(wcn, WCN36XX_DXE_0_INT_CLR, int_mask); + + if (int_reason & WCN36XX_CH_STAT_INT_ERR_MASK) { + wcn36xx_dxe_write_register(wcn, + WCN36XX_DXE_0_INT_ERR_CLR, + int_mask); + + wcn36xx_err("DXE IRQ reported error on RX channel\n"); } + if (int_reason & WCN36XX_CH_STAT_INT_DONE_MASK) + wcn36xx_dxe_write_register(wcn, + WCN36XX_DXE_0_INT_DONE_CLR, + int_mask); + + if (int_reason & WCN36XX_CH_STAT_INT_ED_MASK) + wcn36xx_dxe_write_register(wcn, + WCN36XX_DXE_0_INT_ED_CLR, + int_mask); + spin_lock(>lock); ctl = ch->head_blk_ctl; @@ -546,11 +563,11 @@ static int wcn36xx_rx_handle_packets(struct wcn36xx *wcn, wcn36xx_rx_skb(wcn, skb); } /* else keep old skb not submitted and use it for rx DMA */ - dxe->ctrl = value; + dxe->ctrl = ctrl; ctl = ctl->next; dxe = ctl->desc; } - wcn36xx_dxe_write_register(wcn, WCN36XX_DXE_ENCH_ADDR, int_mask); + wcn36xx_dxe_write_register(wcn, WCN36XX_DXE_ENCH_ADDR, en_mask); ch->head_blk_ctl = ctl; @@ -566,19 +583,20 @@ void wcn36xx_dxe_rx_frame(struct wcn36xx *wcn) wcn36xx_dxe_read_register(wcn, WCN36XX_DXE_INT_SRC_RAW_REG, _src); /* RX_LOW_PRI */ - if (int_src & WCN36XX_DXE_INT_CH1_MASK) { - wcn36xx_dxe_write_register(wcn, WCN36XX_DXE_0_INT_CLR, - WCN36XX_DXE_INT_CH1_MASK); - wcn36xx_rx_handle_packets(wcn, &(wcn->dxe_rx_l_ch)); - } + if (int_src & WCN36XX_DXE_INT_CH1_MASK) + wcn36xx_rx_handle_packets(wcn, >dxe_rx_l_ch, + WCN36XX_DXE_CTRL_RX_L, + WCN36XX_DXE_INT_CH1_MASK, + WCN36XX_INT_MASK_CHAN_RX_L, + WCN36XX_DXE_CH_STATUS_REG_ADDR_RX_L); /* RX_HIGH_PRI */ - if (int_src & WCN36XX_DXE_INT_CH3_MASK) { - /* Clean up all the INT within this channel */ - wcn36xx_dxe_write_register(wcn, WCN36XX_DXE_0_INT_CLR, - WCN36XX_DXE_INT_CH3_MASK); - wcn36xx_rx_handle_packets(wcn, &(wcn->dxe_rx_h_ch)); - } + if (int_src & WCN36XX_DXE_INT_CH3_MASK) + wcn36xx_rx_handle_packets(wcn, >dxe_rx_h_ch, + WCN36XX_DXE_CTRL_RX_H, + WCN36X
[PATCH 07/10] wcn36xx: set PREASSOC and IDLE stated when BSS info changes
When a BSSID is joined, set the link status to 'preassoc', and set it to 'idle' when the BSS is deleted. This is what the downstream driver is doing, and it seems to improve the reliability during connect/disconnect stress tests. Signed-off-by: Daniel Mack <dan...@zonque.org> --- drivers/net/wireless/ath/wcn36xx/main.c | 4 1 file changed, 4 insertions(+) diff --git a/drivers/net/wireless/ath/wcn36xx/main.c b/drivers/net/wireless/ath/wcn36xx/main.c index e6330ad372b3..662e50540b07 100644 --- a/drivers/net/wireless/ath/wcn36xx/main.c +++ b/drivers/net/wireless/ath/wcn36xx/main.c @@ -798,6 +798,8 @@ static void wcn36xx_bss_info_changed(struct ieee80211_hw *hw, if (!is_zero_ether_addr(bss_conf->bssid)) { vif_priv->is_joining = true; vif_priv->bss_index = WCN36XX_HAL_BSS_INVALID_IDX; + wcn36xx_smd_set_link_st(wcn, bss_conf->bssid, vif->addr, + WCN36XX_HAL_LINK_PREASSOC_STATE); wcn36xx_smd_join(wcn, bss_conf->bssid, vif->addr, WCN36XX_HW_CHANNEL(wcn)); wcn36xx_smd_config_bss(wcn, vif, NULL, @@ -805,6 +807,8 @@ static void wcn36xx_bss_info_changed(struct ieee80211_hw *hw, } else { vif_priv->is_joining = false; wcn36xx_smd_delete_bss(wcn, vif); + wcn36xx_smd_set_link_st(wcn, bss_conf->bssid, vif->addr, + WCN36XX_HAL_LINK_IDLE_STATE); vif_priv->encrypt_type = WCN36XX_HAL_ED_NONE; } } -- 2.14.3
[PATCH 05/10] wcn36xx: only handle packets when ED or DONE bit is set
On RX and TX interrupts, check for the WCN36XX_CH_STAT_INT_ED_MASK or WCN36XX_CH_STAT_INT_DONE_MASK in the interrupt reason register, and only handle packets when it is set. This way, reap_tx_dxes() is only invoked when needed. This brings the dequeing logic in line with what the prima downstream driver is doing. While at it, also log the interrupt reason. Signed-off-by: Daniel Mack <dan...@zonque.org> --- drivers/net/wireless/ath/wcn36xx/dxe.c | 20 1 file changed, 16 insertions(+), 4 deletions(-) diff --git a/drivers/net/wireless/ath/wcn36xx/dxe.c b/drivers/net/wireless/ath/wcn36xx/dxe.c index d11c9c536627..8c64e05ca3b7 100644 --- a/drivers/net/wireless/ath/wcn36xx/dxe.c +++ b/drivers/net/wireless/ath/wcn36xx/dxe.c @@ -430,8 +430,12 @@ static irqreturn_t wcn36xx_irq_tx_complete(int irq, void *dev) WCN36XX_INT_MASK_CHAN_TX_H); } - wcn36xx_dbg(WCN36XX_DBG_DXE, "dxe tx ready high\n"); - reap_tx_dxes(wcn, >dxe_tx_h_ch); + wcn36xx_dbg(WCN36XX_DBG_DXE, "dxe tx ready high, reason %08x\n", + int_reason); + + if (int_reason & (WCN36XX_CH_STAT_INT_DONE_MASK | + WCN36XX_CH_STAT_INT_ED_MASK)) + reap_tx_dxes(wcn, >dxe_tx_h_ch); } if (int_src & WCN36XX_INT_MASK_CHAN_TX_L) { @@ -465,8 +469,12 @@ static irqreturn_t wcn36xx_irq_tx_complete(int irq, void *dev) WCN36XX_INT_MASK_CHAN_TX_L); } - wcn36xx_dbg(WCN36XX_DBG_DXE, "dxe tx ready low\n"); - reap_tx_dxes(wcn, >dxe_tx_l_ch); + wcn36xx_dbg(WCN36XX_DBG_DXE, "dxe tx ready low, reason %08x\n", + int_reason); + + if (int_reason & (WCN36XX_CH_STAT_INT_DONE_MASK | + WCN36XX_CH_STAT_INT_ED_MASK)) + reap_tx_dxes(wcn, >dxe_tx_l_ch); } return IRQ_HANDLED; @@ -545,6 +553,10 @@ static int wcn36xx_rx_handle_packets(struct wcn36xx *wcn, WCN36XX_DXE_0_INT_ED_CLR, int_mask); + if (!(int_reason & (WCN36XX_CH_STAT_INT_DONE_MASK | + WCN36XX_CH_STAT_INT_ED_MASK))) + return 0; + spin_lock(>lock); ctl = ch->head_blk_ctl; -- 2.14.3
[PATCH 06/10] wcn36xx: consider CTRL_EOP bit when looking for valid descriptors
In reap_tx_dxes(), when we iterate over the linked descriptors, only consider such valid that have WCN36xx_DXE_CTRL_EOP set. This is what the prima downstream driver is doing as well. Signed-off-by: Daniel Mack <dan...@zonque.org> --- drivers/net/wireless/ath/wcn36xx/dxe.c | 4 +++- 1 file changed, 3 insertions(+), 1 deletion(-) diff --git a/drivers/net/wireless/ath/wcn36xx/dxe.c b/drivers/net/wireless/ath/wcn36xx/dxe.c index 8c64e05ca3b7..06cfe8d311f3 100644 --- a/drivers/net/wireless/ath/wcn36xx/dxe.c +++ b/drivers/net/wireless/ath/wcn36xx/dxe.c @@ -370,7 +370,9 @@ static void reap_tx_dxes(struct wcn36xx *wcn, struct wcn36xx_dxe_ch *ch) do { if (READ_ONCE(ctl->desc->ctrl) & WCN36xx_DXE_CTRL_VLD) break; - if (ctl->skb) { + + if (ctl->skb && + READ_ONCE(ctl->desc->ctrl) & WCN36xx_DXE_CTRL_EOP) { dma_unmap_single(wcn->dev, ctl->desc->src_addr_l, ctl->skb->len, DMA_TO_DEVICE); info = IEEE80211_SKB_CB(ctl->skb); -- 2.14.3
[PATCH 00/10] Some more patches for wcn36xx
Here are some more patches for the wcn36xx driver that emerged from reviews and my attempts to fix the connection timeout issues. Some of them merely bring the driver in sync with what the "prima" downstream driver is doing, and they seemingly make the driver more stable. Others just apply some best practice patterns or do some cleanup. The first patch in the series is one that I've sent earlier as RFC, and Loic expressed his support in a follow-up. I still believe it does the right thing. However, the connection timeout problems are still there unfortunately, but they seemingly appear to happen less often now. To recap, when the driver gets stuck, not a single packet is being sent out anymore. I have a 2nd machine tuned to the same channel to capture all packets from the MAC of the device under test, and it shows a flat line once the connection timeouts happen. Some more context is given in this bug report: https://bugs.96boards.org/show_bug.cgi?id=538 One interesting find is that setting the debug_mask module parameter to 0x4 or 0x400 (or both), the issue is much less likely to happen. All this does is adding printk()s before each message is sent out, and that affects the timings of course. I tried adding mdelay() there instead, and they too make the effect less likely, but they don't make it go away. Hence I believe that some sort of firmware internal buffer is overrun if too many SMD requests fly in in a short amount of time. The firmware does, however, still ack all packets just fine on the SMD channels, and also the DXE communication flows are all healthy. No errors are reported anywhere, but nothing is being put on the ether anymore. I'm running out of ideas at this point. I guess without access to the firmware sources, it's virtually impossible to grok what's going on and how to work around whatever is causing it. If anyone has an idea on how to debug this frustrating issue any further, please let me know. A reproducer is described in the bug reported linked to above. Thanks, Daniel Daniel Mack (10): wcn36xx: fix buffer commit logic on TX path wcn36xx: set DMA mask explicitly wcn36xx: don't disable RX IRQ from handler wcn36xx: clear all masks in RX interrupt wcn36xx: only handle packets when ED or DONE bit is set wcn36xx: consider CTRL_EOP bit when looking for valid descriptors wcn36xx: set PREASSOC and IDLE stated when BSS info changes wcn36xx: drain pending indicator messages on shutdown wcn36xx: simplify wcn36xx_smd_open() wcn36xx: improve debug and error messages for SMD drivers/net/wireless/ath/wcn36xx/dxe.c | 176 drivers/net/wireless/ath/wcn36xx/main.c | 10 ++ drivers/net/wireless/ath/wcn36xx/smd.c | 32 +++--- 3 files changed, 137 insertions(+), 81 deletions(-) -- 2.14.3
[PATCH 03/10] wcn36xx: don't disable RX IRQ from handler
There's no need to disable the IRQ from inside its handler. Instead just grab the spinlock of the channel that is being processed. Signed-off-by: Daniel Mack <dan...@zonque.org> --- drivers/net/wireless/ath/wcn36xx/dxe.c | 15 +++ 1 file changed, 11 insertions(+), 4 deletions(-) diff --git a/drivers/net/wireless/ath/wcn36xx/dxe.c b/drivers/net/wireless/ath/wcn36xx/dxe.c index 3d1cf7dd1f8e..d6dd47b211ba 100644 --- a/drivers/net/wireless/ath/wcn36xx/dxe.c +++ b/drivers/net/wireless/ath/wcn36xx/dxe.c @@ -476,9 +476,8 @@ static irqreturn_t wcn36xx_irq_rx_ready(int irq, void *dev) { struct wcn36xx *wcn = (struct wcn36xx *)dev; - disable_irq_nosync(wcn->rx_irq); wcn36xx_dxe_rx_frame(wcn); - enable_irq(wcn->rx_irq); + return IRQ_HANDLED; } @@ -514,8 +513,8 @@ static int wcn36xx_dxe_request_irqs(struct wcn36xx *wcn) static int wcn36xx_rx_handle_packets(struct wcn36xx *wcn, struct wcn36xx_dxe_ch *ch) { - struct wcn36xx_dxe_ctl *ctl = ch->head_blk_ctl; - struct wcn36xx_dxe_desc *dxe = ctl->desc; + struct wcn36xx_dxe_desc *dxe; + struct wcn36xx_dxe_ctl *ctl; dma_addr_t dma_addr; struct sk_buff *skb; int ret = 0, int_mask; @@ -529,6 +528,11 @@ static int wcn36xx_rx_handle_packets(struct wcn36xx *wcn, int_mask = WCN36XX_DXE_INT_CH3_MASK; } + spin_lock(>lock); + + ctl = ch->head_blk_ctl; + dxe = ctl->desc; + while (!(READ_ONCE(dxe->ctrl) & WCN36xx_DXE_CTRL_VLD)) { skb = ctl->skb; dma_addr = dxe->dst_addr_l; @@ -549,6 +553,9 @@ static int wcn36xx_rx_handle_packets(struct wcn36xx *wcn, wcn36xx_dxe_write_register(wcn, WCN36XX_DXE_ENCH_ADDR, int_mask); ch->head_blk_ctl = ctl; + + spin_unlock(>lock); + return 0; } -- 2.14.3
[PATCH 01/10] wcn36xx: fix buffer commit logic on TX path
When wcn36xx_dxe_tx_frame() is entered while the device is still processing the queue asyncronously, we are racing against the firmware code with updates to the buffer descriptors. Presumably, the firmware scans the ring buffer that holds the descriptors and scans for a valid control descriptor, and then assumes that the next descriptor contains the payload. If, however, the control descriptor is marked valid, but the payload descriptor isn't, the packet is not sent out. Another issue with the current code is that is lacks memory barriers before descriptors are marked valid. This is important because the CPU may reorder writes to memory, even if it is allocated as coherent DMA area, and hence the device may see incompletely written data. To fix this, the code in wcn36xx_dxe_tx_frame() was restructured a bit so that the payload descriptor is made valid before the control descriptor. Memory barriers are added to ensure coherency of shared memory areas. Signed-off-by: Daniel Mack <dan...@zonque.org> --- drivers/net/wireless/ath/wcn36xx/dxe.c | 75 +- 1 file changed, 38 insertions(+), 37 deletions(-) diff --git a/drivers/net/wireless/ath/wcn36xx/dxe.c b/drivers/net/wireless/ath/wcn36xx/dxe.c index bd2b946a65c9..3d1cf7dd1f8e 100644 --- a/drivers/net/wireless/ath/wcn36xx/dxe.c +++ b/drivers/net/wireless/ath/wcn36xx/dxe.c @@ -642,8 +642,8 @@ int wcn36xx_dxe_tx_frame(struct wcn36xx *wcn, struct sk_buff *skb, bool is_low) { - struct wcn36xx_dxe_ctl *ctl = NULL; - struct wcn36xx_dxe_desc *desc = NULL; + struct wcn36xx_dxe_desc *desc_bd, *desc_skb; + struct wcn36xx_dxe_ctl *ctl_bd, *ctl_skb; struct wcn36xx_dxe_ch *ch = NULL; unsigned long flags; int ret; @@ -651,74 +651,75 @@ int wcn36xx_dxe_tx_frame(struct wcn36xx *wcn, ch = is_low ? >dxe_tx_l_ch : >dxe_tx_h_ch; spin_lock_irqsave(>lock, flags); - ctl = ch->head_blk_ctl; + ctl_bd = ch->head_blk_ctl; + ctl_skb = ctl_bd->next; /* * If skb is not null that means that we reached the tail of the ring * hence ring is full. Stop queues to let mac80211 back off until ring * has an empty slot again. */ - if (NULL != ctl->next->skb) { + if (NULL != ctl_skb->skb) { ieee80211_stop_queues(wcn->hw); wcn->queues_stopped = true; spin_unlock_irqrestore(>lock, flags); return -EBUSY; } - ctl->skb = NULL; - desc = ctl->desc; + if (unlikely(ctl_skb->bd_cpu_addr)) { + wcn36xx_err("bd_cpu_addr cannot be NULL for skb DXE\n"); + ret = -EINVAL; + goto unlock; + } + + desc_bd = ctl_bd->desc; + desc_skb = ctl_skb->desc; + + ctl_bd->skb = NULL; /* write buffer descriptor */ - memcpy(ctl->bd_cpu_addr, bd, sizeof(*bd)); + memcpy(ctl_bd->bd_cpu_addr, bd, sizeof(*bd)); /* Set source address of the BD we send */ - desc->src_addr_l = ctl->bd_phy_addr; - - desc->dst_addr_l = ch->dxe_wq; - desc->fr_len = sizeof(struct wcn36xx_tx_bd); - desc->ctrl = ch->ctrl_bd; + desc_bd->src_addr_l = ctl_bd->bd_phy_addr; + desc_bd->dst_addr_l = ch->dxe_wq; + desc_bd->fr_len = sizeof(struct wcn36xx_tx_bd); wcn36xx_dbg(WCN36XX_DBG_DXE, "DXE TX\n"); wcn36xx_dbg_dump(WCN36XX_DBG_DXE_DUMP, "DESC1 >>> ", -(char *)desc, sizeof(*desc)); +(char *)desc_bd, sizeof(*desc_bd)); wcn36xx_dbg_dump(WCN36XX_DBG_DXE_DUMP, -"BD >>> ", (char *)ctl->bd_cpu_addr, +"BD >>> ", (char *)ctl_bd->bd_cpu_addr, sizeof(struct wcn36xx_tx_bd)); - /* Set source address of the SKB we send */ - ctl = ctl->next; - desc = ctl->desc; - if (ctl->bd_cpu_addr) { - wcn36xx_err("bd_cpu_addr cannot be NULL for skb DXE\n"); - ret = -EINVAL; - goto unlock; - } - - desc->src_addr_l = dma_map_single(wcn->dev, - skb->data, - skb->len, - DMA_TO_DEVICE); - if (dma_mapping_error(wcn->dev, desc->src_addr_l)) { + desc_skb->src_addr_l = dma_map_single(wcn->dev, + skb->data, + skb->len, + DMA_TO_DEVICE); + if (dma_mapping_error(wcn->dev, desc_skb->src_addr_l)) { dev_er
[PATCH 1/2] NFC: st95hf: initialize semaphore and mutex earlier
'rm_lock' and 'exchange_lock' need to be ready before the IRQ handler has a chance to fire. This fixes the oops below. [1.040255] Internal error: Oops: 9604 [#1] PREEMPT SMP [...] [1.181564] Call trace: [1.188591] Exception stack(0x0a473c40 to 0x0a473d80) [1.190943] 3c40: 80003673b118 800036374380 [1.197542] 3c60: 044b2ac5 0001 [1.205354] 3c80: 800036374d60 0a473d70 0980 [1.213166] 3ca0: 0001 004c 0033 [1.217590] st95hf spi2.0: err: por seq failed for st95hf [1.228788] 3cc0: 0019 0001 0007 80003673b118 [1.234175] 3ce0: 89f27000 80003673b1c8 80003673b1b0 [1.241986] 3d00: 080f 89f716a4 0894bb40 [1.249800] 3d20: 0a473d80 084268c0 0a473d80 [1.257611] 3d40: 084268c4 4005 0a473d60 080e5688 [1.265424] 3d60: 084268a4 0a473d80 084268c4 [1.273239] [] st95hf_irq_thread_handler+0x44/0x3a0 [1.281048] [] irq_thread_fn+0x28/0x68 [1.287468] [] irq_thread+0x10c/0x1a0 [1.292850] [] kthread+0x12c/0x130 [1.297799] [] ret_from_fork+0x10/0x18 [1.303008] Code: aa1603e0 f9403675 940d010f aa1303e0 (f94066a1) [1.308307] ---[ end trace d058c1b88aad74d8 ]--- Signed-off-by: Daniel Mack <dan...@zonque.org> --- drivers/nfc/st95hf/core.c | 5 ++--- 1 file changed, 2 insertions(+), 3 deletions(-) diff --git a/drivers/nfc/st95hf/core.c b/drivers/nfc/st95hf/core.c index 2b26f762fbc3..394bdc7b0cf2 100644 --- a/drivers/nfc/st95hf/core.c +++ b/drivers/nfc/st95hf/core.c @@ -1112,8 +1112,10 @@ static int st95hf_probe(struct spi_device *nfc_spi_dev) } } + sema_init(>exchange_lock, 1); init_completion(>done); mutex_init(>spi_lock); + mutex_init(>rm_lock); /* * Store spicontext in spi device object for using it in @@ -1197,9 +1199,6 @@ static int st95hf_probe(struct spi_device *nfc_spi_dev) /* store st95context in nfc device object */ nfc_digital_set_drvdata(st95context->ddev, st95context); - sema_init(>exchange_lock, 1); - mutex_init(>rm_lock); - return ret; err_free_digital_device: -- 2.14.3
Re: [PATCH 1/1 RFC] wcn36xx: fix buffer commit logic on TX path
Hi Loic, On Wednesday, April 11, 2018 03:30 PM, Loic Poulain wrote: >> /* Move the head of the ring to the next empty descriptor */ >> -ch->head_blk_ctl = ctl->next; >> +ch->head_blk_ctl = ctl_skb->next; >> + >> + /* Commit all previous writes and set descriptors to VALID */ >> + wmb(); > > Is this first memory barrier really needed? from what I understand, we > only need to ensure that the control descriptor is marked valid at the > end of the procedure and we don't really care about the paylod one. > >> + desc_skb->ctrl = ch->ctrl_skb; >> + wmb(); >> + desc_bd->ctrl = ch->ctrl_bd; >> >> /* >> * When connected and trying to send data frame chip can be in sleep > > Otherwise, patch makes sense. This patch has somehow vanished from patchwork. I am running it since I posted it and I can't see it causing any regressions, so I'd like to repost. Given that you seem to be in favor of it, can I have your Reviewed-by? Thanks, Daniel
[PATCH v2 4/5] wcn36xx: send bss_type in scan requests
Pass the bss_type of the currently configured BSS in the message for the scan request. Therefore, that setting needs to be kept in struct wcn36xx_vif. This seems to be only interesting when scanning for a specific SSID and doesn't matter for regular wildcard scans. Signed-off-by: Daniel Mack <dan...@zonque.org> --- drivers/net/wireless/ath/wcn36xx/smd.c | 5 - drivers/net/wireless/ath/wcn36xx/wcn36xx.h | 1 + 2 files changed, 5 insertions(+), 1 deletion(-) diff --git a/drivers/net/wireless/ath/wcn36xx/smd.c b/drivers/net/wireless/ath/wcn36xx/smd.c index b7964f79d837..9c8ce5e0454f 100644 --- a/drivers/net/wireless/ath/wcn36xx/smd.c +++ b/drivers/net/wireless/ath/wcn36xx/smd.c @@ -620,6 +620,7 @@ int wcn36xx_smd_finish_scan(struct wcn36xx *wcn, int wcn36xx_smd_start_hw_scan(struct wcn36xx *wcn, struct ieee80211_vif *vif, struct cfg80211_scan_request *req) { + struct wcn36xx_vif *vif_priv = wcn36xx_vif_to_priv(vif); struct wcn36xx_hal_start_scan_offload_req_msg msg_body; int ret, i; @@ -631,6 +632,7 @@ int wcn36xx_smd_start_hw_scan(struct wcn36xx *wcn, struct ieee80211_vif *vif, msg_body.max_ch_time = 100; msg_body.scan_hidden = 1; memcpy(msg_body.mac, vif->addr, ETH_ALEN); + msg_body.bss_type = vif_priv->bss_type; msg_body.p2p_search = vif->p2p; msg_body.num_ssid = min_t(u8, req->n_ssids, ARRAY_SIZE(msg_body.ssids)); @@ -1399,9 +1401,10 @@ int wcn36xx_smd_config_bss(struct wcn36xx *wcn, struct ieee80211_vif *vif, bss->spectrum_mgt_enable = 0; bss->tx_mgmt_power = 0; bss->max_tx_power = WCN36XX_MAX_POWER(wcn); - bss->action = update; + vif_priv->bss_type = bss->bss_type; + wcn36xx_dbg(WCN36XX_DBG_HAL, "hal config bss bssid %pM self_mac_addr %pM bss_type %d oper_mode %d nw_type %d\n", bss->bssid, bss->self_mac_addr, bss->bss_type, diff --git a/drivers/net/wireless/ath/wcn36xx/wcn36xx.h b/drivers/net/wireless/ath/wcn36xx/wcn36xx.h index 5854adf43f3a..22e05cb4eb98 100644 --- a/drivers/net/wireless/ath/wcn36xx/wcn36xx.h +++ b/drivers/net/wireless/ath/wcn36xx/wcn36xx.h @@ -123,6 +123,7 @@ struct wcn36xx_vif { bool is_joining; bool sta_assoc; struct wcn36xx_hal_mac_ssid ssid; + enum wcn36xx_hal_bss_type bss_type; /* Power management */ enum wcn36xx_power_state pw_state; -- 2.14.3
[PATCH v2 5/5] wcn36xx: pass information elements in scan requests
When the wifi driver core passes IE elements in the scan request, append them to the firmware message. The driver currently tells the core that it is capable of attaching up to WCN36XX_MAX_SCAN_IE_LEN octets, but doesn't actually pass them to the the hardware. Note that this patch doesn't fix a bug that was observed. The change is merely done for the sake of completeness as the hardware supports appending IEs in scans. Tests show that network scans work fine with this patch applied. Some defines were moved around to avoid cyclic include dependencies. Signed-off-by: Daniel Mack <dan...@zonque.org> --- drivers/net/wireless/ath/wcn36xx/hal.h | 8 +++- drivers/net/wireless/ath/wcn36xx/smd.c | 11 +++ drivers/net/wireless/ath/wcn36xx/wcn36xx.h | 6 -- 3 files changed, 18 insertions(+), 7 deletions(-) diff --git a/drivers/net/wireless/ath/wcn36xx/hal.h b/drivers/net/wireless/ath/wcn36xx/hal.h index 182963522941..2aed6c233508 100644 --- a/drivers/net/wireless/ath/wcn36xx/hal.h +++ b/drivers/net/wireless/ath/wcn36xx/hal.h @@ -88,6 +88,12 @@ /* version string max length (including NULL) */ #define WCN36XX_HAL_VERSION_LENGTH 64 +/* How many frames until we start a-mpdu TX session */ +#define WCN36XX_AMPDU_START_THRESH 20 + +#define WCN36XX_MAX_SCAN_SSIDS 9 +#define WCN36XX_MAX_SCAN_IE_LEN500 + /* message types for messages exchanged between WDI and HAL */ enum wcn36xx_hal_host_msg_type { /* Init/De-Init */ @@ -1170,7 +1176,7 @@ struct wcn36xx_hal_start_scan_offload_req_msg { /* IE field */ u16 ie_len; - u8 ie[0]; + u8 ie[WCN36XX_MAX_SCAN_IE_LEN]; } __packed; struct wcn36xx_hal_start_scan_offload_rsp_msg { diff --git a/drivers/net/wireless/ath/wcn36xx/smd.c b/drivers/net/wireless/ath/wcn36xx/smd.c index 9c8ce5e0454f..ea74f2b92df5 100644 --- a/drivers/net/wireless/ath/wcn36xx/smd.c +++ b/drivers/net/wireless/ath/wcn36xx/smd.c @@ -624,6 +624,9 @@ int wcn36xx_smd_start_hw_scan(struct wcn36xx *wcn, struct ieee80211_vif *vif, struct wcn36xx_hal_start_scan_offload_req_msg msg_body; int ret, i; + if (req->ie_len > WCN36XX_MAX_SCAN_IE_LEN) + return -EINVAL; + mutex_lock(>hal_mutex); INIT_HAL_MSG(msg_body, WCN36XX_HAL_START_SCAN_OFFLOAD_REQ); @@ -648,6 +651,14 @@ int wcn36xx_smd_start_hw_scan(struct wcn36xx *wcn, struct ieee80211_vif *vif, for (i = 0; i < msg_body.num_channel; i++) msg_body.channels[i] = req->channels[i]->hw_value; + msg_body.header.len -= WCN36XX_MAX_SCAN_IE_LEN; + + if (req->ie_len > 0) { + msg_body.ie_len = req->ie_len; + msg_body.header.len += req->ie_len; + memcpy(msg_body.ie, req->ie, req->ie_len); + } + PREPARE_HAL_BUF(wcn->hal_buf, msg_body); wcn36xx_dbg(WCN36XX_DBG_HAL, diff --git a/drivers/net/wireless/ath/wcn36xx/wcn36xx.h b/drivers/net/wireless/ath/wcn36xx/wcn36xx.h index 22e05cb4eb98..9343989d1169 100644 --- a/drivers/net/wireless/ath/wcn36xx/wcn36xx.h +++ b/drivers/net/wireless/ath/wcn36xx/wcn36xx.h @@ -32,12 +32,6 @@ #define WLAN_NV_FILE "wlan/prima/WCNSS_qcom_wlan_nv.bin" #define WCN36XX_AGGR_BUFFER_SIZE 64 -/* How many frames until we start a-mpdu TX session */ -#define WCN36XX_AMPDU_START_THRESH 20 - -#define WCN36XX_MAX_SCAN_SSIDS 9 -#define WCN36XX_MAX_SCAN_IE_LEN500 - extern unsigned int wcn36xx_dbg_mask; enum wcn36xx_debug_mask { -- 2.14.3
[PATCH v2 3/5] wcn36xx: handle scan cancellation when firmware support is missing
For firmwares that don't have the SCAN_OFFLOAD feature bit set, do not call into wcn36xx_smd_stop_hw_scan(). Instead, stop the asynchronous work and call into ieee80211_scan_completed() immediately. Signed-off-by: Daniel Mack <dan...@zonque.org> --- drivers/net/wireless/ath/wcn36xx/main.c | 14 +++--- 1 file changed, 11 insertions(+), 3 deletions(-) diff --git a/drivers/net/wireless/ath/wcn36xx/main.c b/drivers/net/wireless/ath/wcn36xx/main.c index 08b6939d3f57..e3b91b3b38ef 100644 --- a/drivers/net/wireless/ath/wcn36xx/main.c +++ b/drivers/net/wireless/ath/wcn36xx/main.c @@ -687,10 +687,18 @@ static void wcn36xx_cancel_hw_scan(struct ieee80211_hw *hw, wcn->scan_aborted = true; mutex_unlock(>scan_lock); - /* ieee80211_scan_completed will be called on FW scan indication */ - wcn36xx_smd_stop_hw_scan(wcn); + if (get_feat_caps(wcn->fw_feat_caps, SCAN_OFFLOAD)) { + /* ieee80211_scan_completed will be called on FW scan +* indication */ + wcn36xx_smd_stop_hw_scan(wcn); + } else { + struct cfg80211_scan_info scan_info = { + .aborted = true, + }; - cancel_work_sync(>scan_work); + cancel_work_sync(>scan_work); + ieee80211_scan_completed(wcn->hw, _info); + } } static void wcn36xx_update_allowed_rates(struct ieee80211_sta *sta, -- 2.14.3
[PATCH v2 0/5] wcn36xx: scan related patches
Here's another series of 5 patches for wcn36xx that address some issues with scanning. The first one is the most important one. Daniel v1 → v2: * Moved one hunk from 5/5 to 4/5 so the series becomes bisectable * Add more commit log context to 5/5 Daniel Mack (5): wcn36xx: abort scan request when 'dequeued' indicator is sent wcn36xx: cancel pending scan request when interface goes down wcn36xx: handle scan cancellation when firmware support is missing wcn36xx: send bss_type in scan requests wcn36xx: pass information elements in scan requests drivers/net/wireless/ath/wcn36xx/hal.h | 8 +++- drivers/net/wireless/ath/wcn36xx/main.c| 29 + drivers/net/wireless/ath/wcn36xx/smd.c | 20 +--- drivers/net/wireless/ath/wcn36xx/wcn36xx.h | 7 +-- 4 files changed, 50 insertions(+), 14 deletions(-) -- 2.14.3
[PATCH v2 1/5] wcn36xx: abort scan request when 'dequeued' indicator is sent
When the firmware sends a WCN36XX_HAL_SCAN_IND_DEQUEUED indication, the request is apparently no longer valid. Attempts to stop the hardware scan request subsequently will lead to the following error message, and the hardware is no longer able to communicate with any AP: [ 57.917186] wcn36xx: ERROR hal_stop_scan_offload response failed err=5 Interpreting this indicator message as scan abortion fixes this. While at it, add a newline to a debug print. Signed-off-by: Daniel Mack <dan...@zonque.org> --- drivers/net/wireless/ath/wcn36xx/smd.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/drivers/net/wireless/ath/wcn36xx/smd.c b/drivers/net/wireless/ath/wcn36xx/smd.c index 297a785d0785..b7964f79d837 100644 --- a/drivers/net/wireless/ath/wcn36xx/smd.c +++ b/drivers/net/wireless/ath/wcn36xx/smd.c @@ -2140,10 +2140,11 @@ static int wcn36xx_smd_hw_scan_ind(struct wcn36xx *wcn, void *buf, size_t len) return -EIO; } - wcn36xx_dbg(WCN36XX_DBG_HAL, "scan indication (type %x)", rsp->type); + wcn36xx_dbg(WCN36XX_DBG_HAL, "scan indication (type %x)\n", rsp->type); switch (rsp->type) { case WCN36XX_HAL_SCAN_IND_FAILED: + case WCN36XX_HAL_SCAN_IND_DEQUEUED: scan_info.aborted = true; /* fall through */ case WCN36XX_HAL_SCAN_IND_COMPLETED: @@ -2156,7 +2157,6 @@ static int wcn36xx_smd_hw_scan_ind(struct wcn36xx *wcn, void *buf, size_t len) break; case WCN36XX_HAL_SCAN_IND_STARTED: case WCN36XX_HAL_SCAN_IND_FOREIGN_CHANNEL: - case WCN36XX_HAL_SCAN_IND_DEQUEUED: case WCN36XX_HAL_SCAN_IND_PREEMPTED: case WCN36XX_HAL_SCAN_IND_RESTARTED: break; -- 2.14.3
[PATCH v2 2/5] wcn36xx: cancel pending scan request when interface goes down
When the network interface goes down while a scan request is still pending that can't be stopped due to firmware hickups, wcn->scan_req remains set, even though the hardware is deinitialized. This results in -EBUSY for all scan requests after the interface was brought up again. Fix this by explicitly completing pending scan requests in wcn36xx_stop(). Signed-off-by: Daniel Mack <dan...@zonque.org> --- drivers/net/wireless/ath/wcn36xx/main.c | 13 + 1 file changed, 13 insertions(+) diff --git a/drivers/net/wireless/ath/wcn36xx/main.c b/drivers/net/wireless/ath/wcn36xx/main.c index 749aef3e2b85..08b6939d3f57 100644 --- a/drivers/net/wireless/ath/wcn36xx/main.c +++ b/drivers/net/wireless/ath/wcn36xx/main.c @@ -353,6 +353,19 @@ static void wcn36xx_stop(struct ieee80211_hw *hw) wcn36xx_dbg(WCN36XX_DBG_MAC, "mac stop\n"); + cancel_work_sync(>scan_work); + + mutex_lock(>scan_lock); + if (wcn->scan_req) { + struct cfg80211_scan_info scan_info = { + .aborted = true, + }; + + ieee80211_scan_completed(wcn->hw, _info); + } + wcn->scan_req = NULL; + mutex_unlock(>scan_lock); + wcn36xx_debugfs_exit(wcn); wcn36xx_smd_stop(wcn); wcn36xx_dxe_deinit(wcn); -- 2.14.3
Re: [PATCH 5/5] wcn36xx: pass information elements in scan requests
On Monday, April 16, 2018 04:41 PM, Kalle Valo wrote: > Daniel Mack <dan...@zonque.org> writes: > >> On Monday, April 16, 2018 04:13 PM, Kalle Valo wrote: >>> Daniel Mack <dan...@zonque.org> writes: >>>> On Monday, April 16, 2018 04:03 PM, Kalle Valo wrote: >>>>> Daniel Mack <dan...@zonque.org> writes: >> >>>>>> them to the firmware message. The driver currently tells the core that >>>>>> it is capable of attaching up to WCN36XX_MAX_SCAN_IE_LEN octets, but >>>>>> doesn't actually pass them to the the hardware. >>>>>> >>>>>> Some defines were moved around to avoid cyclic include dependencies. >>>>> >>>>> Does this fix anything or change functionality somehow? You should >>>>> document that also in the commit log. >>>> >>>> I don't have a test case for this, no. But as the change was pretty much >>>> straight forward, I sent it anyway. >>> >>> Ok, but you should still explain that in the commit log. In other words, >>> you should always answer the question "Why?" in the commit log. >>> >>>> I can resend with some more information on this if you like. >>> >>> No need to resend because of this, I can edit the commit log. >> >> Hmm, given that I can't even be sure the firmware does the right thing >> when instructed this way, we should probably just drop this patch from >> the series. The others are more important anyway, as they address bugs >> that hit me on actual hardware. > > IMHO it's useful and if you don't see anything breaking I would prefer > to take it anyway. I can't immeadiately say in what use cases this helps > or fixes, though. But maybe you (or someone else) could verify that it > works correctly by comparing before and after probe requests with a > sniffer? It's certainly not a regression, no. I'm running extensive stress-tests with this driver. Then I guess adding the following to the commit log should suffice? "Note that this patch doesn't fix a bug that was observed. The change is merely done for the sake of completeness as the hardware supports appending IEs in scans. Tests show that network scans work fine with this patch applied." Thanks, Daniel
Re: [PATCH 5/5] wcn36xx: pass information elements in scan requests
On Monday, April 16, 2018 04:13 PM, Kalle Valo wrote: > Daniel Mack <dan...@zonque.org> writes: >> On Monday, April 16, 2018 04:03 PM, Kalle Valo wrote: >>> Daniel Mack <dan...@zonque.org> writes: >>>> them to the firmware message. The driver currently tells the core that >>>> it is capable of attaching up to WCN36XX_MAX_SCAN_IE_LEN octets, but >>>> doesn't actually pass them to the the hardware. >>>> >>>> Some defines were moved around to avoid cyclic include dependencies. >>> >>> Does this fix anything or change functionality somehow? You should >>> document that also in the commit log. >> >> I don't have a test case for this, no. But as the change was pretty much >> straight forward, I sent it anyway. > > Ok, but you should still explain that in the commit log. In other words, > you should always answer the question "Why?" in the commit log. > >> I can resend with some more information on this if you like. > > No need to resend because of this, I can edit the commit log. > Hmm, given that I can't even be sure the firmware does the right thing when instructed this way, we should probably just drop this patch from the series. The others are more important anyway, as they address bugs that hit me on actual hardware. Thanks, Daniel
Re: [PATCH 5/5] wcn36xx: pass information elements in scan requests
On Monday, April 16, 2018 04:03 PM, Kalle Valo wrote: > Daniel Mack <dan...@zonque.org> writes: > >> When the ieee8022 core passes IE elements in the scan request, append > > You mean mac80211? > > And yeah, the ieee80211_ prefix is confusing. Many many years I > started to change that to mac80211_ but gave up :( Ah, yeah. I was just referring to the wifi core driver stack. >> them to the firmware message. The driver currently tells the core that >> it is capable of attaching up to WCN36XX_MAX_SCAN_IE_LEN octets, but >> doesn't actually pass them to the the hardware. >> >> Some defines were moved around to avoid cyclic include dependencies. > > Does this fix anything or change functionality somehow? You should > document that also in the commit log. I don't have a test case for this, no. But as the change was pretty much straight forward, I sent it anyway. I can resend with some more information on this if you like. Thanks, Daniel
[PATCH 2/5] wcn36xx: cancel pending scan request when interface goes down
When the network interface goes down while a scan request is still pending that can't be stopped due to firmware hickups, wcn->scan_req remains set, even though the hardware is deinitialized. This results in -EBUSY for all scan requests after the interface was brought up again. Fix this by explicitly completing pending scan requests in wcn36xx_stop(). Signed-off-by: Daniel Mack <dan...@zonque.org> --- drivers/net/wireless/ath/wcn36xx/main.c | 13 + 1 file changed, 13 insertions(+) diff --git a/drivers/net/wireless/ath/wcn36xx/main.c b/drivers/net/wireless/ath/wcn36xx/main.c index 1b17c35a7944..e5ef2cbb622f 100644 --- a/drivers/net/wireless/ath/wcn36xx/main.c +++ b/drivers/net/wireless/ath/wcn36xx/main.c @@ -353,6 +353,19 @@ static void wcn36xx_stop(struct ieee80211_hw *hw) wcn36xx_dbg(WCN36XX_DBG_MAC, "mac stop\n"); + cancel_work_sync(>scan_work); + + mutex_lock(>scan_lock); + if (wcn->scan_req) { + struct cfg80211_scan_info scan_info = { + .aborted = true, + }; + + ieee80211_scan_completed(wcn->hw, _info); + } + wcn->scan_req = NULL; + mutex_unlock(>scan_lock); + wcn36xx_debugfs_exit(wcn); wcn36xx_smd_stop(wcn); wcn36xx_dxe_deinit(wcn); -- 2.14.3
[PATCH 3/5] wcn36xx: handle scan cancellation when firmware support is missing
For firmwares that don't have the SCAN_OFFLOAD feature bit set, do not call into wcn36xx_smd_stop_hw_scan(). Instead, stop the asynchronous work and call into ieee80211_scan_completed() immediately. Signed-off-by: Daniel Mack <dan...@zonque.org> --- drivers/net/wireless/ath/wcn36xx/main.c | 14 +++--- 1 file changed, 11 insertions(+), 3 deletions(-) diff --git a/drivers/net/wireless/ath/wcn36xx/main.c b/drivers/net/wireless/ath/wcn36xx/main.c index e5ef2cbb622f..28c0286ae47b 100644 --- a/drivers/net/wireless/ath/wcn36xx/main.c +++ b/drivers/net/wireless/ath/wcn36xx/main.c @@ -685,10 +685,18 @@ static void wcn36xx_cancel_hw_scan(struct ieee80211_hw *hw, wcn->scan_aborted = true; mutex_unlock(>scan_lock); - /* ieee80211_scan_completed will be called on FW scan indication */ - wcn36xx_smd_stop_hw_scan(wcn); + if (get_feat_caps(wcn->fw_feat_caps, SCAN_OFFLOAD)) { + /* ieee80211_scan_completed will be called on FW scan +* indication */ + wcn36xx_smd_stop_hw_scan(wcn); + } else { + struct cfg80211_scan_info scan_info = { + .aborted = true, + }; - cancel_work_sync(>scan_work); + cancel_work_sync(>scan_work); + ieee80211_scan_completed(wcn->hw, _info); + } } static void wcn36xx_update_allowed_rates(struct ieee80211_sta *sta, -- 2.14.3
[PATCH 4/5] wcn36xx: send bss_type in scan requests
Pass the bss_type of the currently configured BSS in the message for the scan request. Therefore, that setting needs to be kept in struct wcn36xx_vif. This seems to be only interesting when scanning for a specific SSID and doesn't matter for regular wildcard scans. Signed-off-by: Daniel Mack <dan...@zonque.org> --- drivers/net/wireless/ath/wcn36xx/smd.c | 4 +++- drivers/net/wireless/ath/wcn36xx/wcn36xx.h | 1 + 2 files changed, 4 insertions(+), 1 deletion(-) diff --git a/drivers/net/wireless/ath/wcn36xx/smd.c b/drivers/net/wireless/ath/wcn36xx/smd.c index b7964f79d837..09320c3f6bd0 100644 --- a/drivers/net/wireless/ath/wcn36xx/smd.c +++ b/drivers/net/wireless/ath/wcn36xx/smd.c @@ -631,6 +631,7 @@ int wcn36xx_smd_start_hw_scan(struct wcn36xx *wcn, struct ieee80211_vif *vif, msg_body.max_ch_time = 100; msg_body.scan_hidden = 1; memcpy(msg_body.mac, vif->addr, ETH_ALEN); + msg_body.bss_type = vif_priv->bss_type; msg_body.p2p_search = vif->p2p; msg_body.num_ssid = min_t(u8, req->n_ssids, ARRAY_SIZE(msg_body.ssids)); @@ -1399,9 +1400,10 @@ int wcn36xx_smd_config_bss(struct wcn36xx *wcn, struct ieee80211_vif *vif, bss->spectrum_mgt_enable = 0; bss->tx_mgmt_power = 0; bss->max_tx_power = WCN36XX_MAX_POWER(wcn); - bss->action = update; + vif_priv->bss_type = bss->bss_type; + wcn36xx_dbg(WCN36XX_DBG_HAL, "hal config bss bssid %pM self_mac_addr %pM bss_type %d oper_mode %d nw_type %d\n", bss->bssid, bss->self_mac_addr, bss->bss_type, diff --git a/drivers/net/wireless/ath/wcn36xx/wcn36xx.h b/drivers/net/wireless/ath/wcn36xx/wcn36xx.h index 5854adf43f3a..22e05cb4eb98 100644 --- a/drivers/net/wireless/ath/wcn36xx/wcn36xx.h +++ b/drivers/net/wireless/ath/wcn36xx/wcn36xx.h @@ -123,6 +123,7 @@ struct wcn36xx_vif { bool is_joining; bool sta_assoc; struct wcn36xx_hal_mac_ssid ssid; + enum wcn36xx_hal_bss_type bss_type; /* Power management */ enum wcn36xx_power_state pw_state; -- 2.14.3
[PATCH 1/5] wcn36xx: abort scan request when 'dequeued' indicator is sent
When the firmware sends a WCN36XX_HAL_SCAN_IND_DEQUEUED indication, the request is apparently no longer valid. Attempts to stop the hardware scan request subsequently will lead to the following error message, and the hardware is no longer able to communicate with any AP: [ 57.917186] wcn36xx: ERROR hal_stop_scan_offload response failed err=5 Interpreting this indicator message as scan abortion fixes this. While at it, add a newline to a debug print. Signed-off-by: Daniel Mack <dan...@zonque.org> --- drivers/net/wireless/ath/wcn36xx/smd.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/drivers/net/wireless/ath/wcn36xx/smd.c b/drivers/net/wireless/ath/wcn36xx/smd.c index 297a785d0785..b7964f79d837 100644 --- a/drivers/net/wireless/ath/wcn36xx/smd.c +++ b/drivers/net/wireless/ath/wcn36xx/smd.c @@ -2140,10 +2140,11 @@ static int wcn36xx_smd_hw_scan_ind(struct wcn36xx *wcn, void *buf, size_t len) return -EIO; } - wcn36xx_dbg(WCN36XX_DBG_HAL, "scan indication (type %x)", rsp->type); + wcn36xx_dbg(WCN36XX_DBG_HAL, "scan indication (type %x)\n", rsp->type); switch (rsp->type) { case WCN36XX_HAL_SCAN_IND_FAILED: + case WCN36XX_HAL_SCAN_IND_DEQUEUED: scan_info.aborted = true; /* fall through */ case WCN36XX_HAL_SCAN_IND_COMPLETED: @@ -2156,7 +2157,6 @@ static int wcn36xx_smd_hw_scan_ind(struct wcn36xx *wcn, void *buf, size_t len) break; case WCN36XX_HAL_SCAN_IND_STARTED: case WCN36XX_HAL_SCAN_IND_FOREIGN_CHANNEL: - case WCN36XX_HAL_SCAN_IND_DEQUEUED: case WCN36XX_HAL_SCAN_IND_PREEMPTED: case WCN36XX_HAL_SCAN_IND_RESTARTED: break; -- 2.14.3
[PATCH 5/5] wcn36xx: pass information elements in scan requests
When the ieee8022 core passes IE elements in the scan request, append them to the firmware message. The driver currently tells the core that it is capable of attaching up to WCN36XX_MAX_SCAN_IE_LEN octets, but doesn't actually pass them to the the hardware. Some defines were moved around to avoid cyclic include dependencies. Signed-off-by: Daniel Mack <dan...@zonque.org> --- drivers/net/wireless/ath/wcn36xx/hal.h | 8 +++- drivers/net/wireless/ath/wcn36xx/smd.c | 12 drivers/net/wireless/ath/wcn36xx/wcn36xx.h | 6 -- 3 files changed, 19 insertions(+), 7 deletions(-) diff --git a/drivers/net/wireless/ath/wcn36xx/hal.h b/drivers/net/wireless/ath/wcn36xx/hal.h index 182963522941..2aed6c233508 100644 --- a/drivers/net/wireless/ath/wcn36xx/hal.h +++ b/drivers/net/wireless/ath/wcn36xx/hal.h @@ -88,6 +88,12 @@ /* version string max length (including NULL) */ #define WCN36XX_HAL_VERSION_LENGTH 64 +/* How many frames until we start a-mpdu TX session */ +#define WCN36XX_AMPDU_START_THRESH 20 + +#define WCN36XX_MAX_SCAN_SSIDS 9 +#define WCN36XX_MAX_SCAN_IE_LEN500 + /* message types for messages exchanged between WDI and HAL */ enum wcn36xx_hal_host_msg_type { /* Init/De-Init */ @@ -1170,7 +1176,7 @@ struct wcn36xx_hal_start_scan_offload_req_msg { /* IE field */ u16 ie_len; - u8 ie[0]; + u8 ie[WCN36XX_MAX_SCAN_IE_LEN]; } __packed; struct wcn36xx_hal_start_scan_offload_rsp_msg { diff --git a/drivers/net/wireless/ath/wcn36xx/smd.c b/drivers/net/wireless/ath/wcn36xx/smd.c index 09320c3f6bd0..ea74f2b92df5 100644 --- a/drivers/net/wireless/ath/wcn36xx/smd.c +++ b/drivers/net/wireless/ath/wcn36xx/smd.c @@ -620,9 +620,13 @@ int wcn36xx_smd_finish_scan(struct wcn36xx *wcn, int wcn36xx_smd_start_hw_scan(struct wcn36xx *wcn, struct ieee80211_vif *vif, struct cfg80211_scan_request *req) { + struct wcn36xx_vif *vif_priv = wcn36xx_vif_to_priv(vif); struct wcn36xx_hal_start_scan_offload_req_msg msg_body; int ret, i; + if (req->ie_len > WCN36XX_MAX_SCAN_IE_LEN) + return -EINVAL; + mutex_lock(>hal_mutex); INIT_HAL_MSG(msg_body, WCN36XX_HAL_START_SCAN_OFFLOAD_REQ); @@ -647,6 +651,14 @@ int wcn36xx_smd_start_hw_scan(struct wcn36xx *wcn, struct ieee80211_vif *vif, for (i = 0; i < msg_body.num_channel; i++) msg_body.channels[i] = req->channels[i]->hw_value; + msg_body.header.len -= WCN36XX_MAX_SCAN_IE_LEN; + + if (req->ie_len > 0) { + msg_body.ie_len = req->ie_len; + msg_body.header.len += req->ie_len; + memcpy(msg_body.ie, req->ie, req->ie_len); + } + PREPARE_HAL_BUF(wcn->hal_buf, msg_body); wcn36xx_dbg(WCN36XX_DBG_HAL, diff --git a/drivers/net/wireless/ath/wcn36xx/wcn36xx.h b/drivers/net/wireless/ath/wcn36xx/wcn36xx.h index 22e05cb4eb98..9343989d1169 100644 --- a/drivers/net/wireless/ath/wcn36xx/wcn36xx.h +++ b/drivers/net/wireless/ath/wcn36xx/wcn36xx.h @@ -32,12 +32,6 @@ #define WLAN_NV_FILE "wlan/prima/WCNSS_qcom_wlan_nv.bin" #define WCN36XX_AGGR_BUFFER_SIZE 64 -/* How many frames until we start a-mpdu TX session */ -#define WCN36XX_AMPDU_START_THRESH 20 - -#define WCN36XX_MAX_SCAN_SSIDS 9 -#define WCN36XX_MAX_SCAN_IE_LEN500 - extern unsigned int wcn36xx_dbg_mask; enum wcn36xx_debug_mask { -- 2.14.3
[PATCH 0/5] wcn36xx: scan related patches
Here's another series of 5 patches for wcn36xx that address some issues with scanning. The first one is the most important one. Daniel Daniel Mack (5): wcn36xx: abort scan request when 'dequeued' indicator is sent wcn36xx: cancel pending scan request when interface goes down wcn36xx: handle scan cancellation when firmware support is missing wcn36xx: send bss_type in scan requests wcn36xx: pass information elements in scan requests drivers/net/wireless/ath/wcn36xx/hal.h | 8 +++- drivers/net/wireless/ath/wcn36xx/main.c| 29 + drivers/net/wireless/ath/wcn36xx/smd.c | 20 +--- drivers/net/wireless/ath/wcn36xx/wcn36xx.h | 7 +-- 4 files changed, 50 insertions(+), 14 deletions(-) -- 2.14.3
[PATCH v2] wcn36xx: pass correct BSS index when deleting BSS keys
The firmware message to delete BSS keys expects a BSS index to be passed. This field is currently hard-coded to 0. Fix this by passing in the index we received from the firmware when the BSS was configured. The encryption type in that message also needs to be set to what was used when the key was set, so the assignment of vif_priv->encrypt_type is now done after the firmware command was sent. This reportedly fixes the following error in AP mode: wcn36xx: ERROR hal_remove_bsskey response failed err=6 Also, AFAIU, when a BSS is deleted, the firmware apparently drops all the keys associated with it. Trying to remove the key explicitly afterwards will hence lead to the following message: wcn36xx: ERROR hal_remove_bsskey response failed err=16 This is now suppressed with an extra check for the BSS index validity. Signed-off-by: Daniel Mack <dan...@zonque.org> --- v2: mention the vif_priv->encrypt_type move in the commit log. drivers/net/wireless/ath/wcn36xx/main.c | 10 +++--- drivers/net/wireless/ath/wcn36xx/smd.c | 6 -- drivers/net/wireless/ath/wcn36xx/smd.h | 2 ++ 3 files changed, 13 insertions(+), 5 deletions(-) diff --git a/drivers/net/wireless/ath/wcn36xx/main.c b/drivers/net/wireless/ath/wcn36xx/main.c index 0bc9283e7d8d..1b17c35a7944 100644 --- a/drivers/net/wireless/ath/wcn36xx/main.c +++ b/drivers/net/wireless/ath/wcn36xx/main.c @@ -547,6 +547,7 @@ static int wcn36xx_set_key(struct ieee80211_hw *hw, enum set_key_cmd cmd, } else { wcn36xx_smd_set_bsskey(wcn, vif_priv->encrypt_type, + vif_priv->bss_index, key_conf->keyidx, key_conf->keylen, key); @@ -564,10 +565,13 @@ static int wcn36xx_set_key(struct ieee80211_hw *hw, enum set_key_cmd cmd, break; case DISABLE_KEY: if (!(IEEE80211_KEY_FLAG_PAIRWISE & key_conf->flags)) { + if (vif_priv->bss_index != WCN36XX_HAL_BSS_INVALID_IDX) + wcn36xx_smd_remove_bsskey(wcn, + vif_priv->encrypt_type, + vif_priv->bss_index, + key_conf->keyidx); + vif_priv->encrypt_type = WCN36XX_HAL_ED_NONE; - wcn36xx_smd_remove_bsskey(wcn, - vif_priv->encrypt_type, - key_conf->keyidx); } else { sta_priv->is_data_encrypted = false; /* do not remove key if disassociated */ diff --git a/drivers/net/wireless/ath/wcn36xx/smd.c b/drivers/net/wireless/ath/wcn36xx/smd.c index 9827a1e1124b..297a785d0785 100644 --- a/drivers/net/wireless/ath/wcn36xx/smd.c +++ b/drivers/net/wireless/ath/wcn36xx/smd.c @@ -1636,6 +1636,7 @@ int wcn36xx_smd_set_stakey(struct wcn36xx *wcn, int wcn36xx_smd_set_bsskey(struct wcn36xx *wcn, enum ani_ed_type enc_type, + u8 bssidx, u8 keyidx, u8 keylen, u8 *key) @@ -1645,7 +1646,7 @@ int wcn36xx_smd_set_bsskey(struct wcn36xx *wcn, mutex_lock(>hal_mutex); INIT_HAL_MSG(msg_body, WCN36XX_HAL_SET_BSSKEY_REQ); - msg_body.bss_idx = 0; + msg_body.bss_idx = bssidx; msg_body.enc_type = enc_type; msg_body.num_keys = 1; msg_body.keys[0].id = keyidx; @@ -1706,6 +1707,7 @@ int wcn36xx_smd_remove_stakey(struct wcn36xx *wcn, int wcn36xx_smd_remove_bsskey(struct wcn36xx *wcn, enum ani_ed_type enc_type, + u8 bssidx, u8 keyidx) { struct wcn36xx_hal_remove_bss_key_req_msg msg_body; @@ -1713,7 +1715,7 @@ int wcn36xx_smd_remove_bsskey(struct wcn36xx *wcn, mutex_lock(>hal_mutex); INIT_HAL_MSG(msg_body, WCN36XX_HAL_RMV_BSSKEY_REQ); - msg_body.bss_idx = 0; + msg_body.bss_idx = bssidx; msg_body.enc_type = enc_type; msg_body.key_id = keyidx; diff --git a/drivers/net/wireless/ath/wcn36xx/smd.h b/drivers/net/wireless/ath/wcn36xx/smd.h index 8076edf40ac8..61bb8d43138c 100644 --- a/drivers/net/wireless/ath/wcn36xx/smd.h +++ b/drivers/net/wireless/ath/wcn36xx/smd.h @@ -97,6 +97,7 @@ int wcn36xx_smd_set_stakey(struct wcn36xx *wcn, u8 sta_index); int wcn36xx_smd_set_bsskey(struct wcn36xx *wcn, enum ani_ed_type enc_type, + u8 bssidx, u8 keyidx, u8 keylen, u8 *key); @@ -106,6 +107,7 @@ int wcn36xx_smd_remove_stakey(struct wcn36xx *wcn,
Re: [PATCH] wcn36xx: pass correct BSS index when deleting BSS keys
On Thursday, April 12, 2018 02:14 PM, Kalle Valo wrote: > Daniel Mack <dan...@zonque.org> writes: > >> On Thursday, April 12, 2018 01:46 PM, Loic Poulain wrote: >>> Hi Daniel, >>> >>>> @@ -564,10 +565,13 @@ static int wcn36xx_set_key(struct ieee80211_hw *hw, >>>> enum set_key_cmd cmd, >>>> break; >>>> case DISABLE_KEY: >>>> if (!(IEEE80211_KEY_FLAG_PAIRWISE & key_conf->flags)) { >>>> + if (vif_priv->bss_index != >>>> WCN36XX_HAL_BSS_INVALID_IDX) >>>> + wcn36xx_smd_remove_bsskey(wcn, >>>> + vif_priv->encrypt_type, >>>> + vif_priv->bss_index, >>>> + key_conf->keyidx); >>>> + >>>> vif_priv->encrypt_type = WCN36XX_HAL_ED_NONE; >>>> - wcn36xx_smd_remove_bsskey(wcn, >>>> - vif_priv->encrypt_type, >>>> - key_conf->keyidx); >>> >>> Note that moving vif_priv->encrypt_type = WCN36XX_HAL_ED_NONE after >>> key removal also fixes an issue I observed in AP mode: >>> wcn36xx: ERROR hal_remove_bsskey response failed err=6 >> >> Yeah, sorry. I did that intentionally, but missed to mention it in the >> commit log. > > I can add that to the commit log, just tell me what to add. I'll resend, hang on :) Daniel
Re: [PATCH] wcn36xx: pass correct BSS index when deleting BSS keys
On Thursday, April 12, 2018 01:46 PM, Loic Poulain wrote: > Hi Daniel, > >> @@ -564,10 +565,13 @@ static int wcn36xx_set_key(struct ieee80211_hw *hw, >> enum set_key_cmd cmd, >> break; >> case DISABLE_KEY: >> if (!(IEEE80211_KEY_FLAG_PAIRWISE & key_conf->flags)) { >> + if (vif_priv->bss_index != >> WCN36XX_HAL_BSS_INVALID_IDX) >> + wcn36xx_smd_remove_bsskey(wcn, >> + vif_priv->encrypt_type, >> + vif_priv->bss_index, >> + key_conf->keyidx); >> + >> vif_priv->encrypt_type = WCN36XX_HAL_ED_NONE; >> - wcn36xx_smd_remove_bsskey(wcn, >> - vif_priv->encrypt_type, >> - key_conf->keyidx); > > Note that moving vif_priv->encrypt_type = WCN36XX_HAL_ED_NONE after > key removal also fixes an issue I observed in AP mode: > wcn36xx: ERROR hal_remove_bsskey response failed err=6 Yeah, sorry. I did that intentionally, but missed to mention it in the commit log. Thanks, Daniel
[PATCH] wcn36xx: pass correct BSS index when deleting BSS keys
The firmware message to delete BSS keys expects a BSS index to be passed. This field is currently hard-coded to 0. Fix this by passing in the index we received from the firmware when the BSS was configured. Also, AFAIU, when a BSS is deleted, the firmware apparently drops all the keys associated with it. Trying to remove the key explicitly afterwards will hence lead to the following message: wcn36xx: ERROR hal_remove_bsskey response failed err=16 This is now suppressed with an extra check for the BSS index validity. Signed-off-by: Daniel Mack <dan...@zonque.org> --- drivers/net/wireless/ath/wcn36xx/main.c | 10 +++--- drivers/net/wireless/ath/wcn36xx/smd.c | 6 -- drivers/net/wireless/ath/wcn36xx/smd.h | 2 ++ 3 files changed, 13 insertions(+), 5 deletions(-) diff --git a/drivers/net/wireless/ath/wcn36xx/main.c b/drivers/net/wireless/ath/wcn36xx/main.c index 0bc9283e7d8d..1b17c35a7944 100644 --- a/drivers/net/wireless/ath/wcn36xx/main.c +++ b/drivers/net/wireless/ath/wcn36xx/main.c @@ -547,6 +547,7 @@ static int wcn36xx_set_key(struct ieee80211_hw *hw, enum set_key_cmd cmd, } else { wcn36xx_smd_set_bsskey(wcn, vif_priv->encrypt_type, + vif_priv->bss_index, key_conf->keyidx, key_conf->keylen, key); @@ -564,10 +565,13 @@ static int wcn36xx_set_key(struct ieee80211_hw *hw, enum set_key_cmd cmd, break; case DISABLE_KEY: if (!(IEEE80211_KEY_FLAG_PAIRWISE & key_conf->flags)) { + if (vif_priv->bss_index != WCN36XX_HAL_BSS_INVALID_IDX) + wcn36xx_smd_remove_bsskey(wcn, + vif_priv->encrypt_type, + vif_priv->bss_index, + key_conf->keyidx); + vif_priv->encrypt_type = WCN36XX_HAL_ED_NONE; - wcn36xx_smd_remove_bsskey(wcn, - vif_priv->encrypt_type, - key_conf->keyidx); } else { sta_priv->is_data_encrypted = false; /* do not remove key if disassociated */ diff --git a/drivers/net/wireless/ath/wcn36xx/smd.c b/drivers/net/wireless/ath/wcn36xx/smd.c index 9827a1e1124b..297a785d0785 100644 --- a/drivers/net/wireless/ath/wcn36xx/smd.c +++ b/drivers/net/wireless/ath/wcn36xx/smd.c @@ -1636,6 +1636,7 @@ int wcn36xx_smd_set_stakey(struct wcn36xx *wcn, int wcn36xx_smd_set_bsskey(struct wcn36xx *wcn, enum ani_ed_type enc_type, + u8 bssidx, u8 keyidx, u8 keylen, u8 *key) @@ -1645,7 +1646,7 @@ int wcn36xx_smd_set_bsskey(struct wcn36xx *wcn, mutex_lock(>hal_mutex); INIT_HAL_MSG(msg_body, WCN36XX_HAL_SET_BSSKEY_REQ); - msg_body.bss_idx = 0; + msg_body.bss_idx = bssidx; msg_body.enc_type = enc_type; msg_body.num_keys = 1; msg_body.keys[0].id = keyidx; @@ -1706,6 +1707,7 @@ int wcn36xx_smd_remove_stakey(struct wcn36xx *wcn, int wcn36xx_smd_remove_bsskey(struct wcn36xx *wcn, enum ani_ed_type enc_type, + u8 bssidx, u8 keyidx) { struct wcn36xx_hal_remove_bss_key_req_msg msg_body; @@ -1713,7 +1715,7 @@ int wcn36xx_smd_remove_bsskey(struct wcn36xx *wcn, mutex_lock(>hal_mutex); INIT_HAL_MSG(msg_body, WCN36XX_HAL_RMV_BSSKEY_REQ); - msg_body.bss_idx = 0; + msg_body.bss_idx = bssidx; msg_body.enc_type = enc_type; msg_body.key_id = keyidx; diff --git a/drivers/net/wireless/ath/wcn36xx/smd.h b/drivers/net/wireless/ath/wcn36xx/smd.h index 8076edf40ac8..61bb8d43138c 100644 --- a/drivers/net/wireless/ath/wcn36xx/smd.h +++ b/drivers/net/wireless/ath/wcn36xx/smd.h @@ -97,6 +97,7 @@ int wcn36xx_smd_set_stakey(struct wcn36xx *wcn, u8 sta_index); int wcn36xx_smd_set_bsskey(struct wcn36xx *wcn, enum ani_ed_type enc_type, + u8 bssidx, u8 keyidx, u8 keylen, u8 *key); @@ -106,6 +107,7 @@ int wcn36xx_smd_remove_stakey(struct wcn36xx *wcn, u8 sta_index); int wcn36xx_smd_remove_bsskey(struct wcn36xx *wcn, enum ani_ed_type enc_type, + u8 bssidx, u8 keyidx); int wcn36xx_smd_enter_bmps(struct wcn36xx *wcn, struct ieee80211_vif *vif); int wcn36xx_smd_exit_bmps(struct wcn36xx *wcn, struct ieee80211_vif *vif); -- 2.14.3
Re: [PATCH 1/1 RFC] wcn36xx: fix buffer commit logic on TX path
Hi Loic, On Wednesday, April 11, 2018 03:30 PM, Loic Poulain wrote: >> /* Move the head of the ring to the next empty descriptor */ >> -ch->head_blk_ctl = ctl->next; >> +ch->head_blk_ctl = ctl_skb->next; >> + >> + /* Commit all previous writes and set descriptors to VALID */ >> + wmb(); > > Is this first memory barrier really needed? from what I understand, we > only need to ensure that the control descriptor is marked valid at the > end of the procedure and we don't really care about the paylod one. Well, without documentation or the firmware sources, that's just guesswork at this point. My assumption is only based on the weird comments and workarounds in the downstream driver. I added the second barrier to ensure that no descriptor is ever marked valid unless all other bits are definitely in sync. Thanks, Daniel
Re: [PATCH] wcn36xx: use READ_ONCE() to access desc->ctrl
On Tuesday, April 10, 2018 08:17 PM, Ramon Fried wrote: > On 10 April 2018 at 20:35, Daniel Mack <dan...@zonque.org> wrote: >> When accessing shared memory to check for the stat of submitted >> descriptors, make sure to use READ_ONCE(). This will guarantee the >> compiler treats these memory locations as volatile and doesn't apply >> any caching. > > The structure that is tested is not shared memory. It's accessed only > by the apps processor. Hmm? ctl->desc is of type struct wcn36xx_dxe_desc, which is packed and shared with the firmware. The WCN36xx_DXE_CTRL_VLD bit in ctrl bitfield is set by the firmware when a frame is valid, and before asserting the RX interrupt. So the host CPU must treat it as volatile and expect it to change. Am I reading this wrong? Thanks, Daniel
Re: [PATCH 0/1] RFC: memory coherency and data races on TX path
Hi Ramon, On Tuesday, April 10, 2018 08:11 PM, Ramon Fried wrote: > On 10 April 2018 at 20:42, Daniel Mack <dan...@zonque.org> wrote: >> I found something that I believe might be an issue, and I have an >> idea on how to correct this, but unfortunately, this doesn't solve >> the issues I occasionally see with this driver. I'd still like to >> share it, because I might be totally mistaken in my understanding. > > Thanks for sharing you idea. Are you aware of the recent patch I sent > that Loic Poulain > wrote that fixes a race issue in access to wcn36xx_dxe_tx_frame() ? > Kalle just recently applied it to ath-next, I don't think it's > available yet upstream. Yes, my patch builds upon yours. > The patch was fixing something similar, perhaps it will solve the > issue you're experiencing. I'm not even sure what kind of effect I'm hunting, so it's hard to tell. Your patch definitely addresses a data race too though. >> There's a problem with the latter presumption however which looks like >> this in the driver code: >> >> desc->fr_len = ctl->skb->len; >> /* set dxe descriptor to VALID */ >> desc->ctrl = ch->ctrl_skb; > > The firmware won't start reading the descriptors until it receives an > interrupt from driver. > which in turn happen later in the function using: wcn36xx_dxe_write_register() > so I don't think reordering in this case make any problem. I understand, but as I said, there's definitely the problem that the channel is already running when wcn36xx_dxe_tx_frame() is entered. Try adding this at the beginning of the the function and then run iperf: int reg; wcn36xx_dxe_read_register(wcn, ch->reg_ctrl, ); WARN_ON(reg & WCN36xx_DXE_CH_CTRL_EN); I fail to see how the firmware would determine which descriptors to look at without any type of synchronization mechanism. >> Does that make sense? As I said, I can't really say this improves >> anything, sadly, so I might be mistaken entirely. But I'll leave this >> here for further discussion. Ideally, somebody with access to the >> firmware sources could give an assessment whether this is an issue at >> all or not. > > When I'm not sure regarding some implementation I consult with the > downstream pronto driver. > https://github.com/sultanqasim/qcom_wlan_prima > > Did you look there if they actually placed wmb() ? Yes, I've read some of that as well. They use wmb() before writing to and rmb() after reading firmware registers. The equivalent upstream is wcn36xx_dxe_[read,write}_register(), and they use writel() and readl() which have the same barriers on arm64. So that's fine. What's interesting though is that the downstream drivers sets the VLD bit of the first descriptor of the chain *after* they set all the others. There are also some confusing comments about that (/* First frame not set VAL bit, why ??? */). Can you make sense of that? Thanks, Daniel
[PATCH 0/1] RFC: memory coherency and data races on TX path
I found something that I believe might be an issue, and I have an idea on how to correct this, but unfortunately, this doesn't solve the issues I occasionally see with this driver. I'd still like to share it, because I might be totally mistaken in my understanding. With no firmware code or documentation at hand, it's not exactly clear which assumption the firmware makes, but obviously, the driver and the firmware share memory to exchange descriptors that either contain control information or payload. The driver puts control descriptors and payload descriptors in a ring buffer in an interleaved fashion. When the driver wants to send an skb, it looks for a currently unused control descriptor, and then fills it, together with its directly chained payload descriptor. The descriptors are both marked valid and then the firmware is instructed to start processing the ringbuffer. In case the firmware is idle when wcn36xx_dxe_tx_frame() is entered, this is all fine. However, when sending many packets in a short time frame, there are cases when the firmware is still processing the ring buffer (iow, ch->reg_ctrl & WCN36xx_DXE_CH_CTRL_EN != 0), and in this case, writes to the shared memory area depict a data race. The local spinlock of course doesn't help to prevent that. OTOH, it should be completely fine to modify the descriptors while firmware is still reading them, as the firmware should only pay attention to such that are marked valid. There's a problem with the latter presumption however which looks like this in the driver code: desc->fr_len = ctl->skb->len; /* set dxe descriptor to VALID */ desc->ctrl = ch->ctrl_skb; The CPU may very well reorder the two writes, even though the memory is allocated as coherent DMA. When that happens, the firmware may see a wrong length for a valid descriptor. A simple memory write barrier would suffice to solve this, but then again, there are two descriptors involved. My attempt to fix that restructures the code a bit and makes the payload descriptor valid first and then the control descriptor, both strongly ordered through memory fences. This however assumes that the firmware will ignore valid payload descriptors unless they have a valid control descriptor preceding them, but that's really just guessing. Does that make sense? As I said, I can't really say this improves anything, sadly, so I might be mistaken entirely. But I'll leave this here for further discussion. Ideally, somebody with access to the firmware sources could give an assessment whether this is an issue at all or not. Thanks, Daniel Daniel Mack (1): wcn36xx: fix buffer commit logic on TX path drivers/net/wireless/ath/wcn36xx/dxe.c | 75 +- 1 file changed, 38 insertions(+), 37 deletions(-) -- 2.14.3
[PATCH 1/1 RFC] wcn36xx: fix buffer commit logic on TX path
When wcn36xx_dxe_tx_frame() is entered while the device is still processing the queue asyncronously, we are racing against the firmware code with updates to the buffer descriptors. Presumably, the firmware scans the ring buffer that holds the descriptors and scans for a valid control descriptor, and then assumes that the next descriptor contains the payload. If, however, the control descriptor is marked valid, but the payload descriptor isn't, the packet is not sent out. Another issue with the current code is that is lacks memory barriers before descriptors are marked valid. This is important because the CPU may reorder writes to memory, even if it is allocated as coherent DMA area, and hence the device may see incompletely written data. To fix this, the code in wcn36xx_dxe_tx_frame() was restructured a bit so that the payload descriptor is made valid before the control descriptor. Memory barriers are added to ensure coherency of shared memory areas. Signed-off-by: Daniel Mack <dan...@zonque.org> --- drivers/net/wireless/ath/wcn36xx/dxe.c | 75 +- 1 file changed, 38 insertions(+), 37 deletions(-) diff --git a/drivers/net/wireless/ath/wcn36xx/dxe.c b/drivers/net/wireless/ath/wcn36xx/dxe.c index 26e6c42f886a..cb93545a42ce 100644 --- a/drivers/net/wireless/ath/wcn36xx/dxe.c +++ b/drivers/net/wireless/ath/wcn36xx/dxe.c @@ -638,8 +638,8 @@ int wcn36xx_dxe_tx_frame(struct wcn36xx *wcn, struct sk_buff *skb, bool is_low) { - struct wcn36xx_dxe_ctl *ctl = NULL; - struct wcn36xx_dxe_desc *desc = NULL; + struct wcn36xx_dxe_desc *desc_bd, *desc_skb; + struct wcn36xx_dxe_ctl *ctl_bd, *ctl_skb; struct wcn36xx_dxe_ch *ch = NULL; unsigned long flags; int ret; @@ -647,74 +647,75 @@ int wcn36xx_dxe_tx_frame(struct wcn36xx *wcn, ch = is_low ? >dxe_tx_l_ch : >dxe_tx_h_ch; spin_lock_irqsave(>lock, flags); - ctl = ch->head_blk_ctl; + ctl_bd = ch->head_blk_ctl; + ctl_skb = ctl_bd->next; /* * If skb is not null that means that we reached the tail of the ring * hence ring is full. Stop queues to let mac80211 back off until ring * has an empty slot again. */ - if (NULL != ctl->next->skb) { + if (NULL != ctl_skb->skb) { ieee80211_stop_queues(wcn->hw); wcn->queues_stopped = true; spin_unlock_irqrestore(>lock, flags); return -EBUSY; } - ctl->skb = NULL; - desc = ctl->desc; + if (unlikely(ctl_skb->bd_cpu_addr)) { + wcn36xx_err("bd_cpu_addr cannot be NULL for skb DXE\n"); + ret = -EINVAL; + goto unlock; + } + + desc_bd = ctl_bd->desc; + desc_skb = ctl_skb->desc; + + ctl_bd->skb = NULL; /* write buffer descriptor */ - memcpy(ctl->bd_cpu_addr, bd, sizeof(*bd)); + memcpy(ctl_bd->bd_cpu_addr, bd, sizeof(*bd)); /* Set source address of the BD we send */ - desc->src_addr_l = ctl->bd_phy_addr; - - desc->dst_addr_l = ch->dxe_wq; - desc->fr_len = sizeof(struct wcn36xx_tx_bd); - desc->ctrl = ch->ctrl_bd; + desc_bd->src_addr_l = ctl_bd->bd_phy_addr; + desc_bd->dst_addr_l = ch->dxe_wq; + desc_bd->fr_len = sizeof(struct wcn36xx_tx_bd); wcn36xx_dbg(WCN36XX_DBG_DXE, "DXE TX\n"); wcn36xx_dbg_dump(WCN36XX_DBG_DXE_DUMP, "DESC1 >>> ", -(char *)desc, sizeof(*desc)); +(char *)desc_bd, sizeof(*desc_bd)); wcn36xx_dbg_dump(WCN36XX_DBG_DXE_DUMP, -"BD >>> ", (char *)ctl->bd_cpu_addr, +"BD >>> ", (char *)ctl_bd->bd_cpu_addr, sizeof(struct wcn36xx_tx_bd)); - /* Set source address of the SKB we send */ - ctl = ctl->next; - desc = ctl->desc; - if (ctl->bd_cpu_addr) { - wcn36xx_err("bd_cpu_addr cannot be NULL for skb DXE\n"); - ret = -EINVAL; - goto unlock; - } - - desc->src_addr_l = dma_map_single(wcn->dev, - skb->data, - skb->len, - DMA_TO_DEVICE); - if (dma_mapping_error(wcn->dev, desc->src_addr_l)) { + desc_skb->src_addr_l = dma_map_single(wcn->dev, + skb->data, + skb->len, + DMA_TO_DEVICE); + if (dma_mapping_error(wcn->dev, desc_skb->src_addr_l)) { dev_er
[PATCH] wcn36xx: use READ_ONCE() to access desc->ctrl
When accessing shared memory to check for the stat of submitted descriptors, make sure to use READ_ONCE(). This will guarantee the compiler treats these memory locations as volatile and doesn't apply any caching. While this doesn't fix any particular problem I ran into, it's best practice to do it this way. Note that this patch also removes the superflous extra condition check in the do-while loop in reap_tx_dxes(), as the loop will break instantly anyway in that case. Signed-off-by: Daniel Mack <dan...@zonque.org> --- drivers/net/wireless/ath/wcn36xx/dxe.c | 7 +++ 1 file changed, 3 insertions(+), 4 deletions(-) diff --git a/drivers/net/wireless/ath/wcn36xx/dxe.c b/drivers/net/wireless/ath/wcn36xx/dxe.c index 26e6c42f886a..c52f1c21ece9 100644 --- a/drivers/net/wireless/ath/wcn36xx/dxe.c +++ b/drivers/net/wireless/ath/wcn36xx/dxe.c @@ -363,7 +363,7 @@ static void reap_tx_dxes(struct wcn36xx *wcn, struct wcn36xx_dxe_ch *ch) spin_lock_irqsave(>lock, flags); ctl = ch->tail_blk_ctl; do { - if (ctl->desc->ctrl & WCN36xx_DXE_CTRL_VLD) + if (READ_ONCE(ctl->desc->ctrl) & WCN36xx_DXE_CTRL_VLD) break; if (ctl->skb) { dma_unmap_single(wcn->dev, ctl->desc->src_addr_l, @@ -382,8 +382,7 @@ static void reap_tx_dxes(struct wcn36xx *wcn, struct wcn36xx_dxe_ch *ch) ctl->skb = NULL; } ctl = ctl->next; - } while (ctl != ch->head_blk_ctl && - !(ctl->desc->ctrl & WCN36xx_DXE_CTRL_VLD)); + } while (ctl != ch->head_blk_ctl); ch->tail_blk_ctl = ctl; spin_unlock_irqrestore(>lock, flags); @@ -525,7 +524,7 @@ static int wcn36xx_rx_handle_packets(struct wcn36xx *wcn, int_mask = WCN36XX_DXE_INT_CH3_MASK; } - while (!(dxe->ctrl & WCN36xx_DXE_CTRL_VLD)) { + while (!(READ_ONCE(dxe->ctrl) & WCN36xx_DXE_CTRL_VLD)) { skb = ctl->skb; dma_addr = dxe->dst_addr_l; ret = wcn36xx_dxe_fill_skb(wcn->dev, ctl, GFP_ATOMIC); -- 2.14.3
Re: [PATCH 3/3] wcn36xx: don't delete invalid bss indices
On Wednesday, April 04, 2018 07:40 AM, Ramon Fried wrote: > On 4/3/2018 7:51 PM, Daniel Mack wrote: >> The firmware code cannot cope with requests to remove BSS indices that have >> not previously been added. This primarily happens when the device is >> suspended and then resumed. ieee80211_reconfig() then calls into >> wcn36xx_bss_info_changed() with an empty bssid and BSS_CHANGED_BSSID set, >> which subsequently leads to a firmware crash: >> >> [ 43.647928] qcom-wcnss-pil a204000.wcnss: fatal error received: >> halMsg.c:4964:halMsg_DelBss: Invalid BSSIndex 0 >> [ 43.647959] remoteproc remoteproc0: crash detected in a204000.wcnss: type >> fatal error >> >> To fix this, set bss_index to WCN36XX_HAL_BSS_INVALID_IDX for all bss >> that have not been configured in the firmware, and don't call into the >> firmware with invalid indices. >> >> Signed-off-by: Daniel Mack <dan...@zonque.org> > Interesting. I have never seen this bug before. > Do you have a way of recreating it so I can test it on my side ? I tested this by putting the machine to suspend with # echo freeze >/sys/power/state right after boot, without connecting to a network before. Resume will then fail without this patch. I haven't see it in any other cases either though. Thanks, Daniel
[PATCH 1/3] wcn36xx: check for DMA mapping errors in wcn36xx_dxe_tx_frame()
Bail out if the mapping fails. Even though this hasn't occured during tests, this unlikely case should still be handled. Signed-off-by: Daniel Mack <dan...@zonque.org> --- drivers/net/wireless/ath/wcn36xx/dxe.c | 5 + 1 file changed, 5 insertions(+) diff --git a/drivers/net/wireless/ath/wcn36xx/dxe.c b/drivers/net/wireless/ath/wcn36xx/dxe.c index 6e9a3583c447..e8ad8f989ccd 100644 --- a/drivers/net/wireless/ath/wcn36xx/dxe.c +++ b/drivers/net/wireless/ath/wcn36xx/dxe.c @@ -707,6 +707,11 @@ int wcn36xx_dxe_tx_frame(struct wcn36xx *wcn, ctl->skb->data, ctl->skb->len, DMA_TO_DEVICE); + if (dma_mapping_error(wcn->dev, desc->src_addr_l)) { + dev_err(wcn->dev, "unable to DMA map src_addr_l\n"); + ret = -ENOMEM; + goto unlock; + } desc->dst_addr_l = ch->dxe_wq; desc->fr_len = ctl->skb->len; -- 2.14.3
[PATCH 2/3] wcn36xx: don't keep reference to skb if transmission failed
When wcn36xx_dxe_tx_frame() fails to transmit the TX frame, the driver will call into ieee80211_free_txskb() for the skb in flight, so it'll no longer be valid. Hence, we shouldn't keep a reference to it in ctl->skb. Also, if the skb has IEEE80211_TX_CTL_REQ_TX_STATUS set, a pointer to it will currently remain in wcn->tx_ack_skb, which will potentially lead to a crash if accessed later. Fix this by checking the return value of wcn36xx_dxe_tx_frame(), and nullify wcn->tx_ack_skb again in case of errors. Move the assignment of ctl->skb in wcn36xx_dxe_tx_frame() so it only happens when the transmission is successful. Signed-off-by: Daniel Mack <dan...@zonque.org> --- drivers/net/wireless/ath/wcn36xx/dxe.c | 6 +++--- drivers/net/wireless/ath/wcn36xx/txrx.c | 15 ++- 2 files changed, 17 insertions(+), 4 deletions(-) diff --git a/drivers/net/wireless/ath/wcn36xx/dxe.c b/drivers/net/wireless/ath/wcn36xx/dxe.c index e8ad8f989ccd..abf7b051e1ff 100644 --- a/drivers/net/wireless/ath/wcn36xx/dxe.c +++ b/drivers/net/wireless/ath/wcn36xx/dxe.c @@ -695,7 +695,6 @@ int wcn36xx_dxe_tx_frame(struct wcn36xx *wcn, /* Set source address of the SKB we send */ ctl = ctl->next; - ctl->skb = skb; desc = ctl->desc; if (ctl->bd_cpu_addr) { wcn36xx_err("bd_cpu_addr cannot be NULL for skb DXE\n"); @@ -704,8 +703,8 @@ int wcn36xx_dxe_tx_frame(struct wcn36xx *wcn, } desc->src_addr_l = dma_map_single(wcn->dev, - ctl->skb->data, - ctl->skb->len, + skb->data, + skb->len, DMA_TO_DEVICE); if (dma_mapping_error(wcn->dev, desc->src_addr_l)) { dev_err(wcn->dev, "unable to DMA map src_addr_l\n"); @@ -713,6 +712,7 @@ int wcn36xx_dxe_tx_frame(struct wcn36xx *wcn, goto unlock; } + ctl->skb = skb; desc->dst_addr_l = ch->dxe_wq; desc->fr_len = ctl->skb->len; diff --git a/drivers/net/wireless/ath/wcn36xx/txrx.c b/drivers/net/wireless/ath/wcn36xx/txrx.c index b1768ed6b0be..a6902371e89c 100644 --- a/drivers/net/wireless/ath/wcn36xx/txrx.c +++ b/drivers/net/wireless/ath/wcn36xx/txrx.c @@ -273,6 +273,7 @@ int wcn36xx_start_tx(struct wcn36xx *wcn, bool bcast = is_broadcast_ether_addr(hdr->addr1) || is_multicast_ether_addr(hdr->addr1); struct wcn36xx_tx_bd bd; + int ret; memset(, 0, sizeof(bd)); @@ -317,5 +318,17 @@ int wcn36xx_start_tx(struct wcn36xx *wcn, buff_to_be((u32 *), sizeof(bd)/sizeof(u32)); bd.tx_bd_sign = 0xbdbdbdbd; - return wcn36xx_dxe_tx_frame(wcn, vif_priv, , skb, is_low); + ret = wcn36xx_dxe_tx_frame(wcn, vif_priv, , skb, is_low); + if (ret && bd.tx_comp) { + /* If the skb has not been transmitted, +* don't keep a reference to it. +*/ + spin_lock_irqsave(>dxe_lock, flags); + wcn->tx_ack_skb = NULL; + spin_unlock_irqrestore(>dxe_lock, flags); + + ieee80211_wake_queues(wcn->hw); + } + + return ret; } -- 2.14.3
[PATCH 3/3] wcn36xx: don't delete invalid bss indices
The firmware code cannot cope with requests to remove BSS indices that have not previously been added. This primarily happens when the device is suspended and then resumed. ieee80211_reconfig() then calls into wcn36xx_bss_info_changed() with an empty bssid and BSS_CHANGED_BSSID set, which subsequently leads to a firmware crash: [ 43.647928] qcom-wcnss-pil a204000.wcnss: fatal error received: halMsg.c:4964:halMsg_DelBss: Invalid BSSIndex 0 [ 43.647959] remoteproc remoteproc0: crash detected in a204000.wcnss: type fatal error To fix this, set bss_index to WCN36XX_HAL_BSS_INVALID_IDX for all bss that have not been configured in the firmware, and don't call into the firmware with invalid indices. Signed-off-by: Daniel Mack <dan...@zonque.org> --- drivers/net/wireless/ath/wcn36xx/main.c | 1 + drivers/net/wireless/ath/wcn36xx/smd.c | 6 ++ 2 files changed, 7 insertions(+) diff --git a/drivers/net/wireless/ath/wcn36xx/main.c b/drivers/net/wireless/ath/wcn36xx/main.c index 69d6be59d97f..32bbd6e2fd09 100644 --- a/drivers/net/wireless/ath/wcn36xx/main.c +++ b/drivers/net/wireless/ath/wcn36xx/main.c @@ -953,6 +953,7 @@ static int wcn36xx_add_interface(struct ieee80211_hw *hw, mutex_lock(>conf_mutex); + vif_priv->bss_index = WCN36XX_HAL_BSS_INVALID_IDX; list_add(_priv->list, >vif_list); wcn36xx_smd_add_sta_self(wcn, vif); diff --git a/drivers/net/wireless/ath/wcn36xx/smd.c b/drivers/net/wireless/ath/wcn36xx/smd.c index 8932af5e4d8d..5be07e40a86d 100644 --- a/drivers/net/wireless/ath/wcn36xx/smd.c +++ b/drivers/net/wireless/ath/wcn36xx/smd.c @@ -1446,6 +1446,10 @@ int wcn36xx_smd_delete_bss(struct wcn36xx *wcn, struct ieee80211_vif *vif) int ret = 0; mutex_lock(>hal_mutex); + + if (vif_priv->bss_index == WCN36XX_HAL_BSS_INVALID_IDX) + goto out; + INIT_HAL_MSG(msg_body, WCN36XX_HAL_DELETE_BSS_REQ); msg_body.bss_index = vif_priv->bss_index; @@ -1464,6 +1468,8 @@ int wcn36xx_smd_delete_bss(struct wcn36xx *wcn, struct ieee80211_vif *vif) wcn36xx_err("hal_delete_bss response failed err=%d\n", ret); goto out; } + + vif_priv->bss_index = WCN36XX_HAL_BSS_INVALID_IDX; out: mutex_unlock(>hal_mutex); return ret; -- 2.14.3
Re: [PATCH] wcn36xx: allocate skbs with GFP_KERNEL during init
On Monday, March 19, 2018 07:30 AM, Daniel Mack wrote: > GFP_ATOMIC should only be used when the allocation is done from atomic > context. Introduce a new flag to wcn36xx_dxe_fill_skb() and use GFP_KERNEL > when pre-allocating buffers during init. > > This doesn't fix an issue that was observed in the wild, but it reduces > the chance of failed allocations under memory pressure. > > Signed-off-by: Daniel Mack <dan...@zonque.org> Any opinion about this one? Thanks, Daniel > --- > drivers/net/wireless/ath/wcn36xx/dxe.c | 10 ++ > 1 file changed, 6 insertions(+), 4 deletions(-) > > diff --git a/drivers/net/wireless/ath/wcn36xx/dxe.c > b/drivers/net/wireless/ath/wcn36xx/dxe.c > index 5672154948c3..3e180828fbfa 100644 > --- a/drivers/net/wireless/ath/wcn36xx/dxe.c > +++ b/drivers/net/wireless/ath/wcn36xx/dxe.c > @@ -275,12 +275,14 @@ static int wcn36xx_dxe_enable_ch_int(struct wcn36xx > *wcn, u16 wcn_ch) > return 0; > } > > -static int wcn36xx_dxe_fill_skb(struct device *dev, struct wcn36xx_dxe_ctl > *ctl) > +static int wcn36xx_dxe_fill_skb(struct device *dev, > + struct wcn36xx_dxe_ctl *ctl, > + gfp_t gfp) > { > struct wcn36xx_dxe_desc *dxe = ctl->desc; > struct sk_buff *skb; > > - skb = alloc_skb(WCN36XX_PKT_SIZE, GFP_ATOMIC); > + skb = alloc_skb(WCN36XX_PKT_SIZE, gfp); > if (skb == NULL) > return -ENOMEM; > > @@ -307,7 +309,7 @@ static int wcn36xx_dxe_ch_alloc_skb(struct wcn36xx *wcn, > cur_ctl = wcn_ch->head_blk_ctl; > > for (i = 0; i < wcn_ch->desc_num; i++) { > - wcn36xx_dxe_fill_skb(wcn->dev, cur_ctl); > + wcn36xx_dxe_fill_skb(wcn->dev, cur_ctl, GFP_KERNEL); > cur_ctl = cur_ctl->next; > } > > @@ -533,7 +535,7 @@ static int wcn36xx_rx_handle_packets(struct wcn36xx *wcn, > while (!(dxe->ctrl & WCN36XX_DXE_CTRL_VALID_MASK)) { > skb = ctl->skb; > dma_addr = dxe->dst_addr_l; > - ret = wcn36xx_dxe_fill_skb(wcn->dev, ctl); > + ret = wcn36xx_dxe_fill_skb(wcn->dev, ctl, GFP_ATOMIC); > if (0 == ret) { > /* new skb allocation ok. Use the new one and queue >* the old one to network system. >
[PATCH] wcn36xx: allocate skbs with GFP_KERNEL during init
GFP_ATOMIC should only be used when the allocation is done from atomic context. Introduce a new flag to wcn36xx_dxe_fill_skb() and use GFP_KERNEL when pre-allocating buffers during init. This doesn't fix an issue that was observed in the wild, but it reduces the chance of failed allocations under memory pressure. Signed-off-by: Daniel Mack <dan...@zonque.org> --- drivers/net/wireless/ath/wcn36xx/dxe.c | 10 ++ 1 file changed, 6 insertions(+), 4 deletions(-) diff --git a/drivers/net/wireless/ath/wcn36xx/dxe.c b/drivers/net/wireless/ath/wcn36xx/dxe.c index 5672154948c3..3e180828fbfa 100644 --- a/drivers/net/wireless/ath/wcn36xx/dxe.c +++ b/drivers/net/wireless/ath/wcn36xx/dxe.c @@ -275,12 +275,14 @@ static int wcn36xx_dxe_enable_ch_int(struct wcn36xx *wcn, u16 wcn_ch) return 0; } -static int wcn36xx_dxe_fill_skb(struct device *dev, struct wcn36xx_dxe_ctl *ctl) +static int wcn36xx_dxe_fill_skb(struct device *dev, + struct wcn36xx_dxe_ctl *ctl, + gfp_t gfp) { struct wcn36xx_dxe_desc *dxe = ctl->desc; struct sk_buff *skb; - skb = alloc_skb(WCN36XX_PKT_SIZE, GFP_ATOMIC); + skb = alloc_skb(WCN36XX_PKT_SIZE, gfp); if (skb == NULL) return -ENOMEM; @@ -307,7 +309,7 @@ static int wcn36xx_dxe_ch_alloc_skb(struct wcn36xx *wcn, cur_ctl = wcn_ch->head_blk_ctl; for (i = 0; i < wcn_ch->desc_num; i++) { - wcn36xx_dxe_fill_skb(wcn->dev, cur_ctl); + wcn36xx_dxe_fill_skb(wcn->dev, cur_ctl, GFP_KERNEL); cur_ctl = cur_ctl->next; } @@ -533,7 +535,7 @@ static int wcn36xx_rx_handle_packets(struct wcn36xx *wcn, while (!(dxe->ctrl & WCN36XX_DXE_CTRL_VALID_MASK)) { skb = ctl->skb; dma_addr = dxe->dst_addr_l; - ret = wcn36xx_dxe_fill_skb(wcn->dev, ctl); + ret = wcn36xx_dxe_fill_skb(wcn->dev, ctl, GFP_ATOMIC); if (0 == ret) { /* new skb allocation ok. Use the new one and queue * the old one to network system. -- 2.14.3
[PATCH v2] wcn36xx: dequeue all pending indicator messages
In case wcn36xx_smd_rsp_process() is called more than once before hal_ind_work was dispatched, the messages will end up in hal_ind_queue, but wcn36xx_ind_smd_work() will only look at the first message in that list. Fix this by dequeing the messages from the list in a loop, and only stop when it's empty. This issue was found during a review of the driver. In my tests, that race never actually occured. Signed-off-by: Daniel Mack <dan...@zonque.org> Reviewed-by: Bjorn Andersson <bjorn.anders...@linaro.org> --- v2: amended the commit log slightly to state that this issue hasn't occured in the wild. drivers/net/wireless/ath/wcn36xx/smd.c | 95 +++--- 1 file changed, 52 insertions(+), 43 deletions(-) diff --git a/drivers/net/wireless/ath/wcn36xx/smd.c b/drivers/net/wireless/ath/wcn36xx/smd.c index 7cc29285e052..a6b5352f59e9 100644 --- a/drivers/net/wireless/ath/wcn36xx/smd.c +++ b/drivers/net/wireless/ath/wcn36xx/smd.c @@ -2409,54 +2409,63 @@ static void wcn36xx_ind_smd_work(struct work_struct *work) { struct wcn36xx *wcn = container_of(work, struct wcn36xx, hal_ind_work); - struct wcn36xx_hal_msg_header *msg_header; - struct wcn36xx_hal_ind_msg *hal_ind_msg; - unsigned long flags; - spin_lock_irqsave(>hal_ind_lock, flags); + for (;;) { + struct wcn36xx_hal_msg_header *msg_header; + struct wcn36xx_hal_ind_msg *hal_ind_msg; + unsigned long flags; - hal_ind_msg = list_first_entry(>hal_ind_queue, - struct wcn36xx_hal_ind_msg, - list); - list_del(wcn->hal_ind_queue.next); - spin_unlock_irqrestore(>hal_ind_lock, flags); + spin_lock_irqsave(>hal_ind_lock, flags); - msg_header = (struct wcn36xx_hal_msg_header *)hal_ind_msg->msg; + if (list_empty(>hal_ind_queue)) { + spin_unlock_irqrestore(>hal_ind_lock, flags); + return; + } - switch (msg_header->msg_type) { - case WCN36XX_HAL_COEX_IND: - case WCN36XX_HAL_DEL_BA_IND: - case WCN36XX_HAL_AVOID_FREQ_RANGE_IND: - break; - case WCN36XX_HAL_OTA_TX_COMPL_IND: - wcn36xx_smd_tx_compl_ind(wcn, -hal_ind_msg->msg, -hal_ind_msg->msg_len); - break; - case WCN36XX_HAL_MISSED_BEACON_IND: - wcn36xx_smd_missed_beacon_ind(wcn, - hal_ind_msg->msg, - hal_ind_msg->msg_len); - break; - case WCN36XX_HAL_DELETE_STA_CONTEXT_IND: - wcn36xx_smd_delete_sta_context_ind(wcn, - hal_ind_msg->msg, - hal_ind_msg->msg_len); - break; - case WCN36XX_HAL_PRINT_REG_INFO_IND: - wcn36xx_smd_print_reg_info_ind(wcn, - hal_ind_msg->msg, - hal_ind_msg->msg_len); - break; - case WCN36XX_HAL_SCAN_OFFLOAD_IND: - wcn36xx_smd_hw_scan_ind(wcn, hal_ind_msg->msg, - hal_ind_msg->msg_len); - break; - default: - wcn36xx_err("SMD_EVENT (%d) not supported\n", - msg_header->msg_type); + hal_ind_msg = list_first_entry(>hal_ind_queue, + struct wcn36xx_hal_ind_msg, + list); + list_del(_ind_msg->list); + spin_unlock_irqrestore(>hal_ind_lock, flags); + + msg_header = (struct wcn36xx_hal_msg_header *)hal_ind_msg->msg; + + switch (msg_header->msg_type) { + case WCN36XX_HAL_COEX_IND: + case WCN36XX_HAL_DEL_BA_IND: + case WCN36XX_HAL_AVOID_FREQ_RANGE_IND: + break; + case WCN36XX_HAL_OTA_TX_COMPL_IND: + wcn36xx_smd_tx_compl_ind(wcn, +hal_ind_msg->msg, +hal_ind_msg->msg_len); + break; + case WCN36XX_HAL_MISSED_BEACON_IND: + wcn36xx_smd_missed_beacon_ind(wcn, + hal_ind_msg->msg, + hal_ind_msg->msg_len); + break; + case WCN36XX_HAL_DELETE_STA_CONTEXT_IND: + wcn36x
Re: [PATCH] wcn36xx: dequeue all pending indicator messages
On Friday, March 16, 2018 11:22 AM, Kalle Valo wrote: > Daniel Mack <dan...@zonque.org> writes: > >> Hi, >> >> On Friday, March 16, 2018 10:09 AM, Ramon Fried wrote: >>> On 3/16/2018 12:37 AM, Daniel Mack wrote: >>>> In case wcn36xx_smd_rsp_process() is called more than once before >>>> hal_ind_work was dispatched, the messages will end up in hal_ind_queue, >>>> but wcn36xx_ind_smd_work() will only look at the first message in that >>>> list. >>>> >>>> Fix this by dequeing the messages from the list in a loop, and only stop >>>> when it's empty. >>> Interesting. does it solve a specific bug ? can you elaborate ? >> >> I'm poking around in the driver to hopefully find issues that cause >> instability and failures in joining networks, which I am seeing a lot. >> There are a number of bug reports regarding this, for instance >> >> https://bugs.96boards.org/show_bug.cgi?id=538 >> https://bugs.96boards.org/show_bug.cgi?id=319 >> >> I'm following your patches and also started to look into the driver >> myself, and during review, I noticed that list handling issue. I have a >> big fat warning locally that would tell me if the list ever contains >> more than one entry, but that never happens, as the indicator messages >> are way too infrequent to trigger the race. So this isn't a real-world >> issue as far as I can tell, but it is still quite obviously a bug. Hence >> I considered posting a patch. > > It's a good idea to mention in the commit log if the fix is for a > theoretical issue and does not necessarily fix anything visible. Helps > to understand the background, prioritise which release the fix should go > etc. > Well, it's a race which can (theoretically) happen anytime. But sure, I will add another sentence to the commit log and resend. Thanks, Daniel
Re: [PATCH] wcn36xx: dequeue all pending indicator messages
Hi, On Friday, March 16, 2018 10:09 AM, Ramon Fried wrote: > On 3/16/2018 12:37 AM, Daniel Mack wrote: >> In case wcn36xx_smd_rsp_process() is called more than once before >> hal_ind_work was dispatched, the messages will end up in hal_ind_queue, >> but wcn36xx_ind_smd_work() will only look at the first message in that >> list. >> >> Fix this by dequeing the messages from the list in a loop, and only stop >> when it's empty. > Interesting. does it solve a specific bug ? can you elaborate ? I'm poking around in the driver to hopefully find issues that cause instability and failures in joining networks, which I am seeing a lot. There are a number of bug reports regarding this, for instance https://bugs.96boards.org/show_bug.cgi?id=538 https://bugs.96boards.org/show_bug.cgi?id=319 I'm following your patches and also started to look into the driver myself, and during review, I noticed that list handling issue. I have a big fat warning locally that would tell me if the list ever contains more than one entry, but that never happens, as the indicator messages are way too infrequent to trigger the race. So this isn't a real-world issue as far as I can tell, but it is still quite obviously a bug. Hence I considered posting a patch. I have another one that handles missed TX ACKs from the firmware (TODO comment in wcn36xx_start_tx()), but that case doesn't happen either. So I'll keep looking further, and will send things if I spot something. In the meantime, as you seem to be familiar with the firmware internals, I'd appreciate any pointer in how to narrow this down. It's extremely important to us, as we want to release a product featuring the wcn36xx. Thanks a lot for all your efforts! Daniel >> >> Signed-off-by: Daniel Mack <dan...@zonque.org> >> --- >> drivers/net/wireless/ath/wcn36xx/smd.c | 95 >> +++--- >> 1 file changed, 52 insertions(+), 43 deletions(-) >> >> diff --git a/drivers/net/wireless/ath/wcn36xx/smd.c >> b/drivers/net/wireless/ath/wcn36xx/smd.c >> index 7cc29285e052..a6b5352f59e9 100644 >> --- a/drivers/net/wireless/ath/wcn36xx/smd.c >> +++ b/drivers/net/wireless/ath/wcn36xx/smd.c >> @@ -2409,54 +2409,63 @@ static void wcn36xx_ind_smd_work(struct work_struct >> *work) >> { >> struct wcn36xx *wcn = >> container_of(work, struct wcn36xx, hal_ind_work); >> -struct wcn36xx_hal_msg_header *msg_header; >> -struct wcn36xx_hal_ind_msg *hal_ind_msg; >> -unsigned long flags; >> >> -spin_lock_irqsave(>hal_ind_lock, flags); >> +for (;;) { >> +struct wcn36xx_hal_msg_header *msg_header; >> +struct wcn36xx_hal_ind_msg *hal_ind_msg; >> +unsigned long flags; >> >> -hal_ind_msg = list_first_entry(>hal_ind_queue, >> - struct wcn36xx_hal_ind_msg, >> - list); >> -list_del(wcn->hal_ind_queue.next); >> -spin_unlock_irqrestore(>hal_ind_lock, flags); >> +spin_lock_irqsave(>hal_ind_lock, flags); >> >> -msg_header = (struct wcn36xx_hal_msg_header *)hal_ind_msg->msg; >> +if (list_empty(>hal_ind_queue)) { >> +spin_unlock_irqrestore(>hal_ind_lock, flags); >> +return; >> +} >> >> -switch (msg_header->msg_type) { >> -case WCN36XX_HAL_COEX_IND: >> -case WCN36XX_HAL_DEL_BA_IND: >> -case WCN36XX_HAL_AVOID_FREQ_RANGE_IND: >> -break; >> -case WCN36XX_HAL_OTA_TX_COMPL_IND: >> -wcn36xx_smd_tx_compl_ind(wcn, >> - hal_ind_msg->msg, >> - hal_ind_msg->msg_len); >> -break; >> -case WCN36XX_HAL_MISSED_BEACON_IND: >> -wcn36xx_smd_missed_beacon_ind(wcn, >> - hal_ind_msg->msg, >> - hal_ind_msg->msg_len); >> -break; >> -case WCN36XX_HAL_DELETE_STA_CONTEXT_IND: >> -wcn36xx_smd_delete_sta_context_ind(wcn, >> - hal_ind_msg->msg, >> - hal_ind_msg->msg_len); >> -break; >> -case WCN36XX_HAL_PRINT_REG_INFO_IND: >> -wcn36xx_smd_print_reg_info_ind(wcn, >> - hal_ind_msg->msg, >> - hal_ind_msg->msg_
[PATCH] wcn36xx: dequeue all pending indicator messages
In case wcn36xx_smd_rsp_process() is called more than once before hal_ind_work was dispatched, the messages will end up in hal_ind_queue, but wcn36xx_ind_smd_work() will only look at the first message in that list. Fix this by dequeing the messages from the list in a loop, and only stop when it's empty. Signed-off-by: Daniel Mack <dan...@zonque.org> --- drivers/net/wireless/ath/wcn36xx/smd.c | 95 +++--- 1 file changed, 52 insertions(+), 43 deletions(-) diff --git a/drivers/net/wireless/ath/wcn36xx/smd.c b/drivers/net/wireless/ath/wcn36xx/smd.c index 7cc29285e052..a6b5352f59e9 100644 --- a/drivers/net/wireless/ath/wcn36xx/smd.c +++ b/drivers/net/wireless/ath/wcn36xx/smd.c @@ -2409,54 +2409,63 @@ static void wcn36xx_ind_smd_work(struct work_struct *work) { struct wcn36xx *wcn = container_of(work, struct wcn36xx, hal_ind_work); - struct wcn36xx_hal_msg_header *msg_header; - struct wcn36xx_hal_ind_msg *hal_ind_msg; - unsigned long flags; - spin_lock_irqsave(>hal_ind_lock, flags); + for (;;) { + struct wcn36xx_hal_msg_header *msg_header; + struct wcn36xx_hal_ind_msg *hal_ind_msg; + unsigned long flags; - hal_ind_msg = list_first_entry(>hal_ind_queue, - struct wcn36xx_hal_ind_msg, - list); - list_del(wcn->hal_ind_queue.next); - spin_unlock_irqrestore(>hal_ind_lock, flags); + spin_lock_irqsave(>hal_ind_lock, flags); - msg_header = (struct wcn36xx_hal_msg_header *)hal_ind_msg->msg; + if (list_empty(>hal_ind_queue)) { + spin_unlock_irqrestore(>hal_ind_lock, flags); + return; + } - switch (msg_header->msg_type) { - case WCN36XX_HAL_COEX_IND: - case WCN36XX_HAL_DEL_BA_IND: - case WCN36XX_HAL_AVOID_FREQ_RANGE_IND: - break; - case WCN36XX_HAL_OTA_TX_COMPL_IND: - wcn36xx_smd_tx_compl_ind(wcn, -hal_ind_msg->msg, -hal_ind_msg->msg_len); - break; - case WCN36XX_HAL_MISSED_BEACON_IND: - wcn36xx_smd_missed_beacon_ind(wcn, - hal_ind_msg->msg, - hal_ind_msg->msg_len); - break; - case WCN36XX_HAL_DELETE_STA_CONTEXT_IND: - wcn36xx_smd_delete_sta_context_ind(wcn, - hal_ind_msg->msg, - hal_ind_msg->msg_len); - break; - case WCN36XX_HAL_PRINT_REG_INFO_IND: - wcn36xx_smd_print_reg_info_ind(wcn, - hal_ind_msg->msg, - hal_ind_msg->msg_len); - break; - case WCN36XX_HAL_SCAN_OFFLOAD_IND: - wcn36xx_smd_hw_scan_ind(wcn, hal_ind_msg->msg, - hal_ind_msg->msg_len); - break; - default: - wcn36xx_err("SMD_EVENT (%d) not supported\n", - msg_header->msg_type); + hal_ind_msg = list_first_entry(>hal_ind_queue, + struct wcn36xx_hal_ind_msg, + list); + list_del(_ind_msg->list); + spin_unlock_irqrestore(>hal_ind_lock, flags); + + msg_header = (struct wcn36xx_hal_msg_header *)hal_ind_msg->msg; + + switch (msg_header->msg_type) { + case WCN36XX_HAL_COEX_IND: + case WCN36XX_HAL_DEL_BA_IND: + case WCN36XX_HAL_AVOID_FREQ_RANGE_IND: + break; + case WCN36XX_HAL_OTA_TX_COMPL_IND: + wcn36xx_smd_tx_compl_ind(wcn, +hal_ind_msg->msg, +hal_ind_msg->msg_len); + break; + case WCN36XX_HAL_MISSED_BEACON_IND: + wcn36xx_smd_missed_beacon_ind(wcn, + hal_ind_msg->msg, + hal_ind_msg->msg_len); + break; + case WCN36XX_HAL_DELETE_STA_CONTEXT_IND: + wcn36xx_smd_delete_sta_context_ind(wcn, + hal_ind_msg->msg, + hal_ind_msg->msg_len); + break; + case WCN36XX_HAL_PRINT_REG_INFO_IND: +