Re: [PATCH net-next] ibmvnic: Use list_for_each_entry() to simplify code in ibmvnic.c

2021-06-10 Thread Dany Madden

On 2021-06-10 07:52, Lijun Pan wrote:

On Thu, Jun 10, 2021 at 7:56 AM Wang Hai  wrote:


Convert list_for_each() to list_for_each_entry() where
applicable. This simplifies the code.

Reported-by: Hulk Robot 
Signed-off-by: Wang Hai 
---


Acked-by: Lijun Pan 

Reviewed-by: Dany Madden 


Re: [PATCH net v3] ibmvnic: Continue with reset if set link down failed

2021-05-04 Thread Dany Madden

On 2021-05-04 12:31, Lijun Pan wrote:

On Tue, May 4, 2021 at 2:27 PM Lijun Pan  wrote:


On Tue, May 4, 2021 at 2:14 PM Dany Madden  wrote:
>
> When ibmvnic gets a FATAL error message from the vnicserver, it marks
> the Command Respond Queue (CRQ) inactive and resets the adapter. If this
> FATAL reset fails and a transmission timeout reset follows, the CRQ is
> still inactive, ibmvnic's attempt to set link down will also fail. If
> ibmvnic abandons the reset because of this failed set link down and this
> is the last reset in the workqueue, then this adapter will be left in an
> inoperable state.
>
> Instead, make the driver ignore this link down failure and continue to
> free and re-register CRQ so that the adapter has an opportunity to
> recover.
>
> Fixes: ed651a10875f ("ibmvnic: Updated reset handling")
> Signed-off-by: Dany Madden 
> Reviewed-by: Rick Lindsley 
> Reviewed-by: Sukadev Bhattiprolu 
> ---
> Changes in V2:
> - Update description to clarify background for the patch
> - Include Reviewed-by tags
> Changes in V3:
> - Add comment above the code change
> ---
>  drivers/net/ethernet/ibm/ibmvnic.c | 11 +--
>  1 file changed, 9 insertions(+), 2 deletions(-)
>
> diff --git a/drivers/net/ethernet/ibm/ibmvnic.c 
b/drivers/net/ethernet/ibm/ibmvnic.c
> index 5788bb956d73..9e005a08d43b 100644
> --- a/drivers/net/ethernet/ibm/ibmvnic.c
> +++ b/drivers/net/ethernet/ibm/ibmvnic.c
> @@ -2017,8 +2017,15 @@ static int do_reset(struct ibmvnic_adapter *adapter,
> rtnl_unlock();
> rc = set_link_state(adapter, IBMVNIC_LOGICAL_LNK_DN);
> rtnl_lock();
> -   if (rc)
> -   goto out;
> +
> +   /* Attempted to set the link down. It could fail if 
the
> +* vnicserver has already torn down the CRQ. We will
> +* note it and continue with reset to reinit the CRQ.
> +*/
> +   if (rc) {
> +   netdev_dbg(netdev,
> +  "Setting link down failed rc=%d. 
Continue anyway\n", rc);
> +   }

There are other places which check and rely on the return value of
this function. Your change makes that inconsistent. Can you stop


To be more specific, __ibmvnic_close, __ibmvnic_open both call this
set_link_state.
Inconsistent would have been not checking for the rc at all. Here we 
checked and noted it that there are times that it's ok to continue.





posting new versions and soliciting the maintainer to accept it before
there is material change? There are many ways to make reset
successful. I think this is the worst approach of all.


Can you show me a patch that is better than this one, that has gone thru 
a 30+ hours of testing?





>
> if (adapter->state == VNIC_OPEN) {
> /* When we dropped rtnl, ibmvnic_open() got
> --
> 2.18.2
>


[PATCH net v3] ibmvnic: Continue with reset if set link down failed

2021-05-04 Thread Dany Madden
When ibmvnic gets a FATAL error message from the vnicserver, it marks
the Command Respond Queue (CRQ) inactive and resets the adapter. If this
FATAL reset fails and a transmission timeout reset follows, the CRQ is
still inactive, ibmvnic's attempt to set link down will also fail. If
ibmvnic abandons the reset because of this failed set link down and this
is the last reset in the workqueue, then this adapter will be left in an
inoperable state.

Instead, make the driver ignore this link down failure and continue to
free and re-register CRQ so that the adapter has an opportunity to
recover.

Fixes: ed651a10875f ("ibmvnic: Updated reset handling")
Signed-off-by: Dany Madden 
Reviewed-by: Rick Lindsley 
Reviewed-by: Sukadev Bhattiprolu 
---
Changes in V2:
- Update description to clarify background for the patch
- Include Reviewed-by tags
Changes in V3:
- Add comment above the code change
---
 drivers/net/ethernet/ibm/ibmvnic.c | 11 +--
 1 file changed, 9 insertions(+), 2 deletions(-)

diff --git a/drivers/net/ethernet/ibm/ibmvnic.c 
b/drivers/net/ethernet/ibm/ibmvnic.c
index 5788bb956d73..9e005a08d43b 100644
--- a/drivers/net/ethernet/ibm/ibmvnic.c
+++ b/drivers/net/ethernet/ibm/ibmvnic.c
@@ -2017,8 +2017,15 @@ static int do_reset(struct ibmvnic_adapter *adapter,
rtnl_unlock();
rc = set_link_state(adapter, IBMVNIC_LOGICAL_LNK_DN);
rtnl_lock();
-   if (rc)
-   goto out;
+
+   /* Attempted to set the link down. It could fail if the
+* vnicserver has already torn down the CRQ. We will
+* note it and continue with reset to reinit the CRQ.
+*/
+   if (rc) {
+   netdev_dbg(netdev,
+  "Setting link down failed rc=%d. 
Continue anyway\n", rc);
+   }
 
if (adapter->state == VNIC_OPEN) {
/* When we dropped rtnl, ibmvnic_open() got
-- 
2.18.2



Re: [PATCH V2 net] ibmvnic: Continue with reset if set link down failed

2021-05-04 Thread Dany Madden

On 2021-04-22 19:26, Rick Lindsley wrote:

On 4/22/21 10:21 AM, Michal Suchánek wrote:

Merging do_reset and do_hard_reset might be a good code cleanup which 
is

out of the scope of this fix.


I agree, on both points. Accepting the patch, and improving the reset
path are not in conflict with each other.

My position is that improving the reset path is certainly on the table,
but not with this bug or this fix.  Within the context of this 
discovered
problem, the issue is well understood, the fix is small and addresses 
the

immediate problem, and it has not generated new bugs under subsequent
testing.  For those reasons, the submitted patch has my support.

Rick


Thanks for all the feedback on the patch. Refactoring the ibmvnic reset 
functions is something we plan to evaluate, as this would potentially 
simplify the reset code. In the mean time, the proposed patch is simple, 
and has been validated in our test environment to solve an issue 
resulting in vnic interfaces getting stuck in an offline state following 
a vnic timeout reset. Given that, I suggest we merge this patch while we 
look at further optimizations in future. I will submit a V3 patch 
shortly.


Dany


[PATCH V2 net] ibmvnic: Continue with reset if set link down failed

2021-04-20 Thread Dany Madden
When ibmvnic gets a FATAL error message from the vnicserver, it marks
the Command Respond Queue (CRQ) inactive and resets the adapter. If this
FATAL reset fails and a transmission timeout reset follows, the CRQ is
still inactive, ibmvnic's attempt to set link down will also fail. If
ibmvnic abandons the reset because of this failed set link down and this
is the last reset in the workqueue, then this adapter will be left in an
inoperable state.

Instead, make the driver ignore this link down failure and continue to
free and re-register CRQ so that the adapter has an opportunity to
recover.

Fixes: ed651a10875f ("ibmvnic: Updated reset handling")
Signed-off-by: Dany Madden 
Reviewed-by: Rick Lindsley 
Reviewed-by: Sukadev Bhattiprolu 
---
Changes in V2:
- Update description to clarify background for the patch
- Include Reviewed-by tags
---
 drivers/net/ethernet/ibm/ibmvnic.c | 6 --
 1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/drivers/net/ethernet/ibm/ibmvnic.c 
b/drivers/net/ethernet/ibm/ibmvnic.c
index ffb2a91750c7..4bd8c5d1a275 100644
--- a/drivers/net/ethernet/ibm/ibmvnic.c
+++ b/drivers/net/ethernet/ibm/ibmvnic.c
@@ -1970,8 +1970,10 @@ static int do_reset(struct ibmvnic_adapter *adapter,
rtnl_unlock();
rc = set_link_state(adapter, IBMVNIC_LOGICAL_LNK_DN);
rtnl_lock();
-   if (rc)
-   goto out;
+   if (rc) {
+   netdev_dbg(netdev,
+  "Setting link down failed rc=%d. 
Continue anyway\n", rc);
+   }
 
if (adapter->state == VNIC_OPEN) {
/* When we dropped rtnl, ibmvnic_open() got
-- 
2.26.2



Re: [PATCH net] ibmvnic: device remove has higher precedence over reset

2021-01-21 Thread Dany Madden

On 2021-01-20 22:20, Lijun Pan wrote:

Returning -EBUSY in ibmvnic_remove() does not actually hold the
removal procedure since driver core doesn't care for the return
value (see __device_release_driver() in drivers/base/dd.c
calling dev->bus->remove()) though vio_bus_remove
(in arch/powerpc/platforms/pseries/vio.c) records the
return value and passes it on. [1]

During the device removal precedure, we should not schedule
any new reset (ibmvnic_reset check for REMOVING and exit),
and should rely on the flush_work and flush_delayed_work
to complete the pending resets, specifically we need to
let __ibmvnic_reset() keep running while in REMOVING state since
flush_work and flush_delayed_work shall call __ibmvnic_reset finally.
So we skip the checking for REMOVING in __ibmvnic_reset.

[1]
https://lore.kernel.org/linuxppc-dev/20210117101242.dpwayq6wdgfdz...@pengutronix.de/T/#m48f5befd96bc9842ece2a3ad14f4c27747206a53
Reported-by: Uwe Kleine-König 
Fixes: 7d7195a026ba ("ibmvnic: Do not process device remove during
device reset")
Signed-off-by: Lijun Pan 
---
v1 versus RFC:
  1/ articulate why remove the REMOVING checking in __ibmvnic_reset
  and why keep the current checking for REMOVING in ibmvnic_reset.
  2/ The locking issue mentioned by Uwe are being addressed separately
 by https://lists.openwall.net/netdev/2021/01/08/89
  3/ This patch does not have merge conflict with 2/

 drivers/net/ethernet/ibm/ibmvnic.c | 8 +---
 1 file changed, 1 insertion(+), 7 deletions(-)

diff --git a/drivers/net/ethernet/ibm/ibmvnic.c
b/drivers/net/ethernet/ibm/ibmvnic.c
index aed985e08e8a..11f28fd03057 100644
--- a/drivers/net/ethernet/ibm/ibmvnic.c
+++ b/drivers/net/ethernet/ibm/ibmvnic.c
@@ -2235,8 +2235,7 @@ static void __ibmvnic_reset(struct work_struct 
*work)

while (rwi) {
spin_lock_irqsave(&adapter->state_lock, flags);

-   if (adapter->state == VNIC_REMOVING ||
-   adapter->state == VNIC_REMOVED) {
+   if (adapter->state == VNIC_REMOVED) {


If we do get here, we would crash because ibmvnic_remove() happened. It 
frees the adapter struct already.



spin_unlock_irqrestore(&adapter->state_lock, flags);
kfree(rwi);
rc = EBUSY;
@@ -5372,11 +5371,6 @@ static int ibmvnic_remove(struct vio_dev *dev)
unsigned long flags;

spin_lock_irqsave(&adapter->state_lock, flags);
-   if (test_bit(0, &adapter->resetting)) {
-   spin_unlock_irqrestore(&adapter->state_lock, flags);
-   return -EBUSY;
-   }
-
adapter->state = VNIC_REMOVING;
spin_unlock_irqrestore(&adapter->state_lock, flags);


Re: ibmvnic: Race condition in remove callback

2021-01-19 Thread Dany Madden

On 2021-01-17 02:12, Uwe Kleine-König wrote:

Hello,

while working on some cleanup I stumbled over a problem in the 
ibmvnic's

remove callback. Since commit

7d7195a026ba ("ibmvnic: Do not process device remove during
device reset")

there is the following code in the remove callback:

static int ibmvnic_remove(struct vio_dev *dev)
{
...
spin_lock_irqsave(&adapter->state_lock, flags);
if (test_bit(0, &adapter->resetting)) {
spin_unlock_irqrestore(&adapter->state_lock, 
flags);

return -EBUSY;
}

adapter->state = VNIC_REMOVING;
spin_unlock_irqrestore(&adapter->state_lock, flags);

flush_work(&adapter->ibmvnic_reset);
flush_delayed_work(&adapter->ibmvnic_delayed_reset);
...
}

Unfortunately returning -EBUSY doesn't work as intended. That's because
the return value of this function is ignored[1] and the device is
considered unbound by the device core (shortly) after ibmvnic_remove()
returns.


Oh! I was not aware of this. In our code review, a question on whether 
or not device reset should have a higher precedence over device remove 
was raised before. So, now it is clear that this driver has to take care 
of remove over reset.




While looking into fixing that I noticed a worse problem:

If ibmvnic_reset() (e.g. called by the tx_timeout callback) calls
schedule_work(&adapter->ibmvnic_reset); just after the work queue is
flushed above the problem that 7d7195a026ba intends to fix will trigger
resulting in a use-after-free.


It was proposed that when coming into ibmvnic_remove() we lock down the 
workqueue to prevent future access, flush, cleanup, then unregister the 
device. Your thought on this?




Also ibmvnic_reset() checks for adapter->state without holding the lock
which might be racy, too.


Suka started addressing consistent locking with this patch series:
https://lists.openwall.net/netdev/2021/01/08/89

He is reworking this.


Best regards
Uwe


Thank you for taking the time to review this driver, Uwe. This is very 
helpful for us.


Best Regards,
Dany



[1] vio_bus_remove (in arch/powerpc/platforms/pseries/vio.c) records 
the

return value and passes it on. But the driver core doesn't care for
the return value (see __device_release_driver() in 
drivers/base/dd.c

calling dev->bus->remove()).


[PATCH net v3 1/9] ibmvnic: handle inconsistent login with reset

2020-11-25 Thread Dany Madden
Inconsistent login with the vnicserver is causing the device to be
removed. This does not give the device a chance to recover from error
state. This patch schedules a FATAL reset instead to bring the adapter
up.

Fixes: 032c5e82847a2 ("Driver for IBM System i/p VNIC protocol")
Signed-off-by: Dany Madden 
Signed-off-by: Lijun Pan 
---
 drivers/net/ethernet/ibm/ibmvnic.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/net/ethernet/ibm/ibmvnic.c 
b/drivers/net/ethernet/ibm/ibmvnic.c
index 2aa40b2f225c..dcb23015b6b4 100644
--- a/drivers/net/ethernet/ibm/ibmvnic.c
+++ b/drivers/net/ethernet/ibm/ibmvnic.c
@@ -4412,7 +4412,7 @@ static int handle_login_rsp(union ibmvnic_crq 
*login_rsp_crq,
 adapter->req_rx_add_queues !=
 be32_to_cpu(login_rsp->num_rxadd_subcrqs))) {
dev_err(dev, "FATAL: Inconsistent login and login rsp\n");
-   ibmvnic_remove(adapter->vdev);
+   ibmvnic_reset(adapter, VNIC_RESET_FATAL);
return -EIO;
}
size_array = (u64 *)((u8 *)(adapter->login_rsp_buf) +
-- 
2.26.2



[PATCH net v3 9/9] ibmvnic: reduce wait for completion time

2020-11-25 Thread Dany Madden
Reduce the wait time for Command Response Queue response from 30 seconds
to 20 seconds, as recommended by VIOS and Power Hypervisor teams.

Fixes: bd0b672313941 ("ibmvnic: Move login and queue negotiation into 
ibmvnic_open")
Fixes: 53da09e92910f ("ibmvnic: Add set_link_state routine for setting adapter 
link state")
Signed-off-by: Dany Madden 
---
 drivers/net/ethernet/ibm/ibmvnic.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/drivers/net/ethernet/ibm/ibmvnic.c 
b/drivers/net/ethernet/ibm/ibmvnic.c
index a17856be2828..d6b2686aed0f 100644
--- a/drivers/net/ethernet/ibm/ibmvnic.c
+++ b/drivers/net/ethernet/ibm/ibmvnic.c
@@ -834,7 +834,7 @@ static void release_napi(struct ibmvnic_adapter *adapter)
 static int ibmvnic_login(struct net_device *netdev)
 {
struct ibmvnic_adapter *adapter = netdev_priv(netdev);
-   unsigned long timeout = msecs_to_jiffies(3);
+   unsigned long timeout = msecs_to_jiffies(2);
int retry_count = 0;
int retries = 10;
bool retry;
@@ -938,7 +938,7 @@ static void release_resources(struct ibmvnic_adapter 
*adapter)
 static int set_link_state(struct ibmvnic_adapter *adapter, u8 link_state)
 {
struct net_device *netdev = adapter->netdev;
-   unsigned long timeout = msecs_to_jiffies(3);
+   unsigned long timeout = msecs_to_jiffies(2);
union ibmvnic_crq crq;
bool resend;
int rc;
@@ -5125,7 +5125,7 @@ static int init_crq_queue(struct ibmvnic_adapter *adapter)
 static int ibmvnic_reset_init(struct ibmvnic_adapter *adapter, bool reset)
 {
struct device *dev = &adapter->vdev->dev;
-   unsigned long timeout = msecs_to_jiffies(3);
+   unsigned long timeout = msecs_to_jiffies(2);
u64 old_num_rx_queues, old_num_tx_queues;
int rc;
 
-- 
2.26.2



[PATCH net v3 7/9] ibmvnic: send_login should check for crq errors

2020-11-25 Thread Dany Madden
send_login() does not check for the result of ibmvnic_send_crq() of the
login request. This results in the driver needlessly retrying the login
10 times even when CRQ is no longer active. Check the return code and
give up in case of errors in sending the CRQ.

The only time we want to retry is if we get a PARITALSUCCESS response
from the partner.

Fixes: 032c5e82847a2 ("Driver for IBM System i/p VNIC protocol")
Signed-off-by: Dany Madden 
Signed-off-by: Sukadev Bhattiprolu 
---
 drivers/net/ethernet/ibm/ibmvnic.c | 18 --
 1 file changed, 12 insertions(+), 6 deletions(-)

diff --git a/drivers/net/ethernet/ibm/ibmvnic.c 
b/drivers/net/ethernet/ibm/ibmvnic.c
index 55b07bd4c741..9005fab09e15 100644
--- a/drivers/net/ethernet/ibm/ibmvnic.c
+++ b/drivers/net/ethernet/ibm/ibmvnic.c
@@ -850,10 +850,8 @@ static int ibmvnic_login(struct net_device *netdev)
adapter->init_done_rc = 0;
reinit_completion(&adapter->init_done);
rc = send_login(adapter);
-   if (rc) {
-   netdev_warn(netdev, "Unable to login\n");
+   if (rc)
return rc;
-   }
 
if (!wait_for_completion_timeout(&adapter->init_done,
 timeout)) {
@@ -3727,15 +3725,16 @@ static int send_login(struct ibmvnic_adapter *adapter)
struct ibmvnic_login_rsp_buffer *login_rsp_buffer;
struct ibmvnic_login_buffer *login_buffer;
struct device *dev = &adapter->vdev->dev;
+   struct vnic_login_client_data *vlcd;
dma_addr_t rsp_buffer_token;
dma_addr_t buffer_token;
size_t rsp_buffer_size;
union ibmvnic_crq crq;
+   int client_data_len;
size_t buffer_size;
__be64 *tx_list_p;
__be64 *rx_list_p;
-   int client_data_len;
-   struct vnic_login_client_data *vlcd;
+   int rc;
int i;
 
if (!adapter->tx_scrq || !adapter->rx_scrq) {
@@ -3841,16 +3840,23 @@ static int send_login(struct ibmvnic_adapter *adapter)
crq.login.len = cpu_to_be32(buffer_size);
 
adapter->login_pending = true;
-   ibmvnic_send_crq(adapter, &crq);
+   rc = ibmvnic_send_crq(adapter, &crq);
+   if (rc) {
+   adapter->login_pending = false;
+   netdev_err(adapter->netdev, "Failed to send login, rc=%d\n", 
rc);
+   goto buf_rsp_map_failed;
+   }
 
return 0;
 
 buf_rsp_map_failed:
kfree(login_rsp_buffer);
+   adapter->login_rsp_buf = NULL;
 buf_rsp_alloc_failed:
dma_unmap_single(dev, buffer_token, buffer_size, DMA_TO_DEVICE);
 buf_map_failed:
kfree(login_buffer);
+   adapter->login_buf = NULL;
 buf_alloc_failed:
return -1;
 }
-- 
2.26.2



[PATCH net v3 8/9] ibmvnic: no reset timeout for 5 seconds after reset

2020-11-25 Thread Dany Madden
Reset timeout is going off right after adapter reset. This patch ensures
that timeout is scheduled if it has been 5 seconds since the last reset.
5 seconds is the default watchdog timeout.

Fixes: ed651a10875f1 ("ibmvnic: Updated reset handling")
Signed-off-by: Dany Madden 
---
 drivers/net/ethernet/ibm/ibmvnic.c | 11 +--
 drivers/net/ethernet/ibm/ibmvnic.h |  2 ++
 2 files changed, 11 insertions(+), 2 deletions(-)

diff --git a/drivers/net/ethernet/ibm/ibmvnic.c 
b/drivers/net/ethernet/ibm/ibmvnic.c
index 9005fab09e15..a17856be2828 100644
--- a/drivers/net/ethernet/ibm/ibmvnic.c
+++ b/drivers/net/ethernet/ibm/ibmvnic.c
@@ -2253,6 +2253,7 @@ static void __ibmvnic_reset(struct work_struct *work)
rc = do_reset(adapter, rwi, reset_state);
}
kfree(rwi);
+   adapter->last_reset_time = jiffies;
 
if (rc)
netdev_dbg(adapter->netdev, "Reset failed, rc=%d\n", 
rc);
@@ -2356,7 +2357,13 @@ static void ibmvnic_tx_timeout(struct net_device *dev, 
unsigned int txqueue)
   "Adapter is resetting, skip timeout reset\n");
return;
}
-
+   /* No queuing up reset until at least 5 seconds (default watchdog val)
+* after last reset
+*/
+   if (time_before(jiffies, (adapter->last_reset_time + 
dev->watchdog_timeo))) {
+   netdev_dbg(dev, "Not yet time to tx timeout.\n");
+   return;
+   }
ibmvnic_reset(adapter, VNIC_RESET_TIMEOUT);
 }
 
@@ -5277,7 +5284,7 @@ static int ibmvnic_probe(struct vio_dev *dev, const 
struct vio_device_id *id)
adapter->state = VNIC_PROBED;
 
adapter->wait_for_reset = false;
-
+   adapter->last_reset_time = jiffies;
return 0;
 
 ibmvnic_register_fail:
diff --git a/drivers/net/ethernet/ibm/ibmvnic.h 
b/drivers/net/ethernet/ibm/ibmvnic.h
index 6f0a701c4a38..b21092f5f9c1 100644
--- a/drivers/net/ethernet/ibm/ibmvnic.h
+++ b/drivers/net/ethernet/ibm/ibmvnic.h
@@ -1088,6 +1088,8 @@ struct ibmvnic_adapter {
unsigned long resetting;
bool napi_enabled, from_passive_init;
bool login_pending;
+   /* last device reset time */
+   unsigned long last_reset_time;
 
bool failover_pending;
bool force_reset_recovery;
-- 
2.26.2



[PATCH net v3 6/9] ibmvnic: track pending login

2020-11-25 Thread Dany Madden
From: Sukadev Bhattiprolu 

If after ibmvnic sends a LOGIN it gets a FAILOVER, it is possible that
the worker thread will start reset process and free the login response
buffer before it gets a (now stale) LOGIN_RSP. The ibmvnic tasklet will
then try to access the login response buffer and crash.

Have ibmvnic track pending logins and discard any stale login responses.

Fixes: 032c5e82847a ("Driver for IBM System i/p VNIC protocol")
Signed-off-by: Sukadev Bhattiprolu 
---
 drivers/net/ethernet/ibm/ibmvnic.c | 17 +
 drivers/net/ethernet/ibm/ibmvnic.h |  1 +
 2 files changed, 18 insertions(+)

diff --git a/drivers/net/ethernet/ibm/ibmvnic.c 
b/drivers/net/ethernet/ibm/ibmvnic.c
index e2f9b0e9dea8..55b07bd4c741 100644
--- a/drivers/net/ethernet/ibm/ibmvnic.c
+++ b/drivers/net/ethernet/ibm/ibmvnic.c
@@ -3839,6 +3839,8 @@ static int send_login(struct ibmvnic_adapter *adapter)
crq.login.cmd = LOGIN;
crq.login.ioba = cpu_to_be32(buffer_token);
crq.login.len = cpu_to_be32(buffer_size);
+
+   adapter->login_pending = true;
ibmvnic_send_crq(adapter, &crq);
 
return 0;
@@ -4391,6 +4393,15 @@ static int handle_login_rsp(union ibmvnic_crq 
*login_rsp_crq,
u64 *size_array;
int i;
 
+   /* CHECK: Test/set of login_pending does not need to be atomic
+* because only ibmvnic_tasklet tests/clears this.
+*/
+   if (!adapter->login_pending) {
+   netdev_warn(netdev, "Ignoring unexpected login response\n");
+   return 0;
+   }
+   adapter->login_pending = false;
+
dma_unmap_single(dev, adapter->login_buf_token, adapter->login_buf_sz,
 DMA_TO_DEVICE);
dma_unmap_single(dev, adapter->login_rsp_buf_token,
@@ -4762,6 +4773,11 @@ static void ibmvnic_handle_crq(union ibmvnic_crq *crq,
case IBMVNIC_CRQ_INIT:
dev_info(dev, "Partner initialized\n");
adapter->from_passive_init = true;
+   /* Discard any stale login responses from prev reset.
+* CHECK: should we clear even on INIT_COMPLETE?
+*/
+   adapter->login_pending = false;
+
if (!completion_done(&adapter->init_done)) {
complete(&adapter->init_done);
adapter->init_done_rc = -EIO;
@@ -5191,6 +5207,7 @@ static int ibmvnic_probe(struct vio_dev *dev, const 
struct vio_device_id *id)
dev_set_drvdata(&dev->dev, netdev);
adapter->vdev = dev;
adapter->netdev = netdev;
+   adapter->login_pending = false;
 
ether_addr_copy(adapter->mac_addr, mac_addr_p);
ether_addr_copy(netdev->dev_addr, adapter->mac_addr);
diff --git a/drivers/net/ethernet/ibm/ibmvnic.h 
b/drivers/net/ethernet/ibm/ibmvnic.h
index 217dcc7ded70..6f0a701c4a38 100644
--- a/drivers/net/ethernet/ibm/ibmvnic.h
+++ b/drivers/net/ethernet/ibm/ibmvnic.h
@@ -1087,6 +1087,7 @@ struct ibmvnic_adapter {
struct delayed_work ibmvnic_delayed_reset;
unsigned long resetting;
bool napi_enabled, from_passive_init;
+   bool login_pending;
 
bool failover_pending;
bool force_reset_recovery;
-- 
2.26.2



[PATCH net v3 2/9] ibmvnic: stop free_all_rwi on failed reset

2020-11-25 Thread Dany Madden
When ibmvnic fails to reset, it breaks out of the reset loop and frees
all of the remaining resets from the workqueue. Doing so prevents the
adapter from recovering if no reset is scheduled after that. Instead,
have the driver continue to process resets on the workqueue.

Remove the no longer need free_all_rwi().

Fixes: ed651a10875f1 ("ibmvnic: Updated reset handling")
Signed-off-by: Dany Madden 
---
 drivers/net/ethernet/ibm/ibmvnic.c | 22 +++---
 1 file changed, 3 insertions(+), 19 deletions(-)

diff --git a/drivers/net/ethernet/ibm/ibmvnic.c 
b/drivers/net/ethernet/ibm/ibmvnic.c
index dcb23015b6b4..d5a927bb4954 100644
--- a/drivers/net/ethernet/ibm/ibmvnic.c
+++ b/drivers/net/ethernet/ibm/ibmvnic.c
@@ -2173,17 +2173,6 @@ static struct ibmvnic_rwi *get_next_rwi(struct 
ibmvnic_adapter *adapter)
return rwi;
 }
 
-static void free_all_rwi(struct ibmvnic_adapter *adapter)
-{
-   struct ibmvnic_rwi *rwi;
-
-   rwi = get_next_rwi(adapter);
-   while (rwi) {
-   kfree(rwi);
-   rwi = get_next_rwi(adapter);
-   }
-}
-
 static void __ibmvnic_reset(struct work_struct *work)
 {
struct ibmvnic_rwi *rwi;
@@ -2253,9 +2242,9 @@ static void __ibmvnic_reset(struct work_struct *work)
else
adapter->state = reset_state;
rc = 0;
-   } else if (rc && rc != IBMVNIC_INIT_FAILED &&
-   !adapter->force_reset_recovery)
-   break;
+   }
+   if (rc)
+   netdev_dbg(adapter->netdev, "Reset failed, rc=%d\n", 
rc);
 
rwi = get_next_rwi(adapter);
 
@@ -2269,11 +2258,6 @@ static void __ibmvnic_reset(struct work_struct *work)
complete(&adapter->reset_done);
}
 
-   if (rc) {
-   netdev_dbg(adapter->netdev, "Reset failed\n");
-   free_all_rwi(adapter);
-   }
-
clear_bit_unlock(0, &adapter->resetting);
 }
 
-- 
2.26.2



[PATCH net v3 3/9] ibmvnic: avoid memset null scrq msgs

2020-11-25 Thread Dany Madden
scrq->msgs could be NULL during device reset, causing Linux to crash.
So, check before memset scrq->msgs.

Fixes: c8b2ad0a4a901 ("ibmvnic: Sanitize entire SCRQ buffer on reset")
Signed-off-by: Dany Madden 
Signed-off-by: Lijun Pan 
---
 drivers/net/ethernet/ibm/ibmvnic.c | 19 +++
 1 file changed, 15 insertions(+), 4 deletions(-)

diff --git a/drivers/net/ethernet/ibm/ibmvnic.c 
b/drivers/net/ethernet/ibm/ibmvnic.c
index d5a927bb4954..b08f95017825 100644
--- a/drivers/net/ethernet/ibm/ibmvnic.c
+++ b/drivers/net/ethernet/ibm/ibmvnic.c
@@ -2845,15 +2845,26 @@ static int reset_one_sub_crq_queue(struct 
ibmvnic_adapter *adapter,
 {
int rc;
 
+   if (!scrq) {
+   netdev_dbg(adapter->netdev,
+  "Invalid scrq reset. irq (%d) or msgs (%p).\n",
+  scrq->irq, scrq->msgs);
+   return -EINVAL;
+   }
+
if (scrq->irq) {
free_irq(scrq->irq, scrq);
irq_dispose_mapping(scrq->irq);
scrq->irq = 0;
}
-
-   memset(scrq->msgs, 0, 4 * PAGE_SIZE);
-   atomic_set(&scrq->used, 0);
-   scrq->cur = 0;
+   if (scrq->msgs) {
+   memset(scrq->msgs, 0, 4 * PAGE_SIZE);
+   atomic_set(&scrq->used, 0);
+   scrq->cur = 0;
+   } else {
+   netdev_dbg(adapter->netdev, "Invalid scrq reset\n");
+   return -EINVAL;
+   }
 
rc = h_reg_sub_crq(adapter->vdev->unit_address, scrq->msg_token,
   4 * PAGE_SIZE, &scrq->crq_num, &scrq->hw_irq);
-- 
2.26.2



[PATCH net v3 5/9] ibmvnic: delay next reset if hard reset fails

2020-11-25 Thread Dany Madden
From: Sukadev Bhattiprolu 

If auto-priority failover is enabled, the backing device needs time
to settle if hard resetting fails for any reason. Add a delay of 60
seconds before retrying the hard-reset.

Fixes: 2770a7984db5 ("ibmvnic: Introduce hard reset recovery")
Signed-off-by: Sukadev Bhattiprolu 
---
 drivers/net/ethernet/ibm/ibmvnic.c | 8 
 1 file changed, 8 insertions(+)

diff --git a/drivers/net/ethernet/ibm/ibmvnic.c 
b/drivers/net/ethernet/ibm/ibmvnic.c
index ff474a790181..e2f9b0e9dea8 100644
--- a/drivers/net/ethernet/ibm/ibmvnic.c
+++ b/drivers/net/ethernet/ibm/ibmvnic.c
@@ -2242,6 +2242,14 @@ static void __ibmvnic_reset(struct work_struct *work)
rc = do_hard_reset(adapter, rwi, reset_state);
rtnl_unlock();
}
+   if (rc) {
+   /* give backing device time to settle down */
+   netdev_dbg(adapter->netdev,
+  "[S:%d] Hard reset failed, waiting 
60 secs\n",
+  adapter->state);
+   set_current_state(TASK_UNINTERRUPTIBLE);
+   schedule_timeout(60 * HZ);
+   }
} else if (!(rwi->reset_reason == VNIC_RESET_FATAL &&
adapter->from_passive_init)) {
rc = do_reset(adapter, rwi, reset_state);
-- 
2.26.2



[PATCH net v3 4/9] ibmvnic: restore adapter state on failed reset

2020-11-25 Thread Dany Madden
In a failed reset, driver could end up in VNIC_PROBED or VNIC_CLOSED
state and cannot recover in subsequent resets, leaving it offline.
This patch restores the adapter state to reset_state, the original
state when reset was called.

Fixes: b27507bb59ed5 ("net/ibmvnic: unlock rtnl_lock in reset so 
linkwatch_event can run")
Fixes: 2770a7984db58 ("ibmvnic: Introduce hard reset recovery")
Signed-off-by: Dany Madden 
---
 drivers/net/ethernet/ibm/ibmvnic.c | 67 --
 1 file changed, 36 insertions(+), 31 deletions(-)

diff --git a/drivers/net/ethernet/ibm/ibmvnic.c 
b/drivers/net/ethernet/ibm/ibmvnic.c
index b08f95017825..ff474a790181 100644
--- a/drivers/net/ethernet/ibm/ibmvnic.c
+++ b/drivers/net/ethernet/ibm/ibmvnic.c
@@ -1857,7 +1857,7 @@ static int do_change_param_reset(struct ibmvnic_adapter 
*adapter,
if (reset_state == VNIC_OPEN) {
rc = __ibmvnic_close(netdev);
if (rc)
-   return rc;
+   goto out;
}
 
release_resources(adapter);
@@ -1875,24 +1875,25 @@ static int do_change_param_reset(struct ibmvnic_adapter 
*adapter,
}
 
rc = ibmvnic_reset_init(adapter, true);
-   if (rc)
-   return IBMVNIC_INIT_FAILED;
+   if (rc) {
+   rc = IBMVNIC_INIT_FAILED;
+   goto out;
+   }
 
/* If the adapter was in PROBE state prior to the reset,
 * exit here.
 */
if (reset_state == VNIC_PROBED)
-   return 0;
+   goto out;
 
rc = ibmvnic_login(netdev);
if (rc) {
-   adapter->state = reset_state;
-   return rc;
+   goto out;
}
 
rc = init_resources(adapter);
if (rc)
-   return rc;
+   goto out;
 
ibmvnic_disable_irqs(adapter);
 
@@ -1902,8 +1903,10 @@ static int do_change_param_reset(struct ibmvnic_adapter 
*adapter,
return 0;
 
rc = __ibmvnic_open(netdev);
-   if (rc)
-   return IBMVNIC_OPEN_FAILED;
+   if (rc) {
+   rc = IBMVNIC_OPEN_FAILED;
+   goto out;
+   }
 
/* refresh device's multicast list */
ibmvnic_set_multi(netdev);
@@ -1912,7 +1915,10 @@ static int do_change_param_reset(struct ibmvnic_adapter 
*adapter,
for (i = 0; i < adapter->req_rx_queues; i++)
napi_schedule(&adapter->napi[i]);
 
-   return 0;
+out:
+   if (rc)
+   adapter->state = reset_state;
+   return rc;
 }
 
 /**
@@ -2015,7 +2021,6 @@ static int do_reset(struct ibmvnic_adapter *adapter,
 
rc = ibmvnic_login(netdev);
if (rc) {
-   adapter->state = reset_state;
goto out;
}
 
@@ -2083,6 +2088,9 @@ static int do_reset(struct ibmvnic_adapter *adapter,
rc = 0;
 
 out:
+   /* restore the adapter state if reset failed */
+   if (rc)
+   adapter->state = reset_state;
rtnl_unlock();
 
return rc;
@@ -2115,43 +2123,46 @@ static int do_hard_reset(struct ibmvnic_adapter 
*adapter,
if (rc) {
netdev_err(adapter->netdev,
   "Couldn't initialize crq. rc=%d\n", rc);
-   return rc;
+   goto out;
}
 
rc = ibmvnic_reset_init(adapter, false);
if (rc)
-   return rc;
+   goto out;
 
/* If the adapter was in PROBE state prior to the reset,
 * exit here.
 */
if (reset_state == VNIC_PROBED)
-   return 0;
+   goto out;
 
rc = ibmvnic_login(netdev);
-   if (rc) {
-   adapter->state = VNIC_PROBED;
-   return 0;
-   }
+   if (rc)
+   goto out;
 
rc = init_resources(adapter);
if (rc)
-   return rc;
+   goto out;
 
ibmvnic_disable_irqs(adapter);
adapter->state = VNIC_CLOSED;
 
if (reset_state == VNIC_CLOSED)
-   return 0;
+   goto out;
 
rc = __ibmvnic_open(netdev);
-   if (rc)
-   return IBMVNIC_OPEN_FAILED;
+   if (rc) {
+   rc = IBMVNIC_OPEN_FAILED;
+   goto out;
+   }
 
call_netdevice_notifiers(NETDEV_NOTIFY_PEERS, netdev);
call_netdevice_notifiers(NETDEV_RESEND_IGMP, netdev);
-
-   return 0;
+out:
+   /* restore adapter state if reset failed */
+   if (rc)
+   adapter->state = reset_state;
+   return rc;
 }
 
 static struct ibmvnic_rwi *get_next_rwi(struct ibmvnic_adapter *adapter)
@@ -2236,13 +2247,7 @@ static void __ibmvnic_reset(struct work_struct *work)
rc = do_reset(adapter, rwi, reset_state);
}
kfree(rwi);
- 

[PATCH net v3 0/9] ibmvnic: assorted bug fixes

2020-11-25 Thread Dany Madden
Assorted fixes for ibmvnic originated from "[PATCH net 00/15] ibmvnic:
assorted bug fixes" sent by Lijun Pan.

v3 Changes as suggested by Jakub Kicinski:
- Add a space between variable declaration and code in patch 3/9. Checkpatch
  does not catch this.
- Unwrapped FIXES lines in patch 9/9.
- Removed all extra line between Fixes and Signed-off-by lines in all patches.

V2 Changes as suggested by Jakub Kicinski:
- Added "Fixes" to each patch.
- Remove "ibmvnic: process HMC disable command" from the series. Submitting it
  separately to net-next.
- Squash V1 "ibmvnic: remove free_all_rwi function" into
  ibmvnic: stop free_all_rwi on failed reset.


Dany Madden (7):
  ibmvnic: handle inconsistent login with reset
  ibmvnic: stop free_all_rwi on failed reset
  ibmvnic: avoid memset null scrq msgs
  ibmvnic: restore adapter state on failed reset
  ibmvnic: send_login should check for crq errors
  ibmvnic: no reset timeout for 5 seconds after reset
  ibmvnic: reduce wait for completion time

Sukadev Bhattiprolu (2):
  ibmvnic: delay next reset if hard reset fails
  ibmvnic: track pending login

 drivers/net/ethernet/ibm/ibmvnic.c | 168 ++---
 drivers/net/ethernet/ibm/ibmvnic.h |   3 +
 2 files changed, 106 insertions(+), 65 deletions(-)

-- 
2.26.2



[PATCH net-next] Revert ibmvnic merge do_change_param_reset into do_reset

2020-11-06 Thread Dany Madden
This reverts commit 16b5f5ce351f8709a6b518cc3cbf240c378305bf
where it restructures do_reset. There are patches being tested that
would require major rework if this is committed first. 

We will resend this after the other patches have been applied.

Signed-off-by: Dany Madden 
---
 drivers/net/ethernet/ibm/ibmvnic.c | 147 -
 1 file changed, 104 insertions(+), 43 deletions(-)

diff --git a/drivers/net/ethernet/ibm/ibmvnic.c 
b/drivers/net/ethernet/ibm/ibmvnic.c
index f4167de30461..af4dfbe28d56 100644
--- a/drivers/net/ethernet/ibm/ibmvnic.c
+++ b/drivers/net/ethernet/ibm/ibmvnic.c
@@ -1826,6 +1826,86 @@ static int ibmvnic_set_mac(struct net_device *netdev, 
void *p)
return rc;
 }
 
+/**
+ * do_change_param_reset returns zero if we are able to keep processing reset
+ * events, or non-zero if we hit a fatal error and must halt.
+ */
+static int do_change_param_reset(struct ibmvnic_adapter *adapter,
+struct ibmvnic_rwi *rwi,
+u32 reset_state)
+{
+   struct net_device *netdev = adapter->netdev;
+   int i, rc;
+
+   netdev_dbg(adapter->netdev, "Change param resetting driver (%d)\n",
+  rwi->reset_reason);
+
+   netif_carrier_off(netdev);
+   adapter->reset_reason = rwi->reset_reason;
+
+   ibmvnic_cleanup(netdev);
+
+   if (reset_state == VNIC_OPEN) {
+   rc = __ibmvnic_close(netdev);
+   if (rc)
+   return rc;
+   }
+
+   release_resources(adapter);
+   release_sub_crqs(adapter, 1);
+   release_crq_queue(adapter);
+
+   adapter->state = VNIC_PROBED;
+
+   rc = init_crq_queue(adapter);
+
+   if (rc) {
+   netdev_err(adapter->netdev,
+  "Couldn't initialize crq. rc=%d\n", rc);
+   return rc;
+   }
+
+   rc = ibmvnic_reset_init(adapter, true);
+   if (rc)
+   return IBMVNIC_INIT_FAILED;
+
+   /* If the adapter was in PROBE state prior to the reset,
+* exit here.
+*/
+   if (reset_state == VNIC_PROBED)
+   return 0;
+
+   rc = ibmvnic_login(netdev);
+   if (rc) {
+   adapter->state = reset_state;
+   return rc;
+   }
+
+   rc = init_resources(adapter);
+   if (rc)
+   return rc;
+
+   ibmvnic_disable_irqs(adapter);
+
+   adapter->state = VNIC_CLOSED;
+
+   if (reset_state == VNIC_CLOSED)
+   return 0;
+
+   rc = __ibmvnic_open(netdev);
+   if (rc)
+   return IBMVNIC_OPEN_FAILED;
+
+   /* refresh device's multicast list */
+   ibmvnic_set_multi(netdev);
+
+   /* kick napi */
+   for (i = 0; i < adapter->req_rx_queues; i++)
+   napi_schedule(&adapter->napi[i]);
+
+   return 0;
+}
+
 /**
  * do_reset returns zero if we are able to keep processing reset events, or
  * non-zero if we hit a fatal error and must halt.
@@ -1841,12 +1921,10 @@ static int do_reset(struct ibmvnic_adapter *adapter,
netdev_dbg(adapter->netdev, "Re-setting driver (%d)\n",
   rwi->reset_reason);
 
-   adapter->reset_reason = rwi->reset_reason;
-   /* requestor of VNIC_RESET_CHANGE_PARAM already has the rtnl lock */
-   if (!(adapter->reset_reason == VNIC_RESET_CHANGE_PARAM))
-   rtnl_lock();
+   rtnl_lock();
 
netif_carrier_off(netdev);
+   adapter->reset_reason = rwi->reset_reason;
 
old_num_rx_queues = adapter->req_rx_queues;
old_num_tx_queues = adapter->req_tx_queues;
@@ -1858,37 +1936,25 @@ static int do_reset(struct ibmvnic_adapter *adapter,
if (reset_state == VNIC_OPEN &&
adapter->reset_reason != VNIC_RESET_MOBILITY &&
adapter->reset_reason != VNIC_RESET_FAILOVER) {
-   if (adapter->reset_reason == VNIC_RESET_CHANGE_PARAM) {
-   rc = __ibmvnic_close(netdev);
-   if (rc)
-   goto out;
-   } else {
-   adapter->state = VNIC_CLOSING;
-
-   /* Release the RTNL lock before link state change and
-* re-acquire after the link state change to allow
-* linkwatch_event to grab the RTNL lock and run during
-* a reset.
-*/
-   rtnl_unlock();
-   rc = set_link_state(adapter, IBMVNIC_LOGICAL_LNK_DN);
-   rtnl_lock();
-   if (rc)
-   goto out;
+   adapter->state = VNIC_CLOSING;
 
-   if (adapter->state != VNIC_CLOSING) {
-   rc = -1;
-   goto

[PATCH net] ibmvnic: update MAINTAINERS

2020-09-14 Thread Dany Madden
Update supporters for IBM Power SRIOV Virtual NIC Device Driver. 
Thomas Falcon is moving on to other works. Dany Madden, Lijun Pan 
and Sukadev Bhattiprolu are the current supporters.

Signed-off-by: Dany Madden 
---
 MAINTAINERS | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/MAINTAINERS b/MAINTAINERS
index 2783a5f68d2c..923c69ad4eec 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -8315,7 +8315,9 @@ S:Supported
 F: drivers/pci/hotplug/rpaphp*
 
 IBM Power SRIOV Virtual NIC Device Driver
-M: Thomas Falcon 
+M: Dany Madden 
+M: Lijun Pan 
+M: Sukadev Bhattiprolu 
 L: net...@vger.kernel.org
 S: Supported
 F: drivers/net/ethernet/ibm/ibmvnic.*
-- 
2.18.2



[PATCH net v3] ibmvnic fix NULL tx_pools and rx_tools issue at do_reset

2020-08-25 Thread Dany Madden
From: Mingming Cao 

At the time of do_rest, ibmvnic tries to re-initalize the tx_pools
and rx_pools to avoid re-allocating the long term buffer. However
there is a window inside do_reset that the tx_pools and
rx_pools were freed before re-initialized making it possible to deference
null pointers.

This patch fix this issue by always check the tx_pool
and rx_pool are not NULL after ibmvnic_login. If so, re-allocating
the pools. This will avoid getting into calling reset_tx/rx_pools with
NULL adapter tx_pools/rx_pools pointer. Also add null pointer check in
reset_tx_pools and reset_rx_pools to safe handle NULL pointer case.

Signed-off-by: Mingming Cao 
Signed-off-by: Dany Madden 
---
 drivers/net/ethernet/ibm/ibmvnic.c | 15 ++-
 1 file changed, 14 insertions(+), 1 deletion(-)

diff --git a/drivers/net/ethernet/ibm/ibmvnic.c 
b/drivers/net/ethernet/ibm/ibmvnic.c
index 5afb3c9c52d2..d3a774331afc 100644
--- a/drivers/net/ethernet/ibm/ibmvnic.c
+++ b/drivers/net/ethernet/ibm/ibmvnic.c
@@ -479,6 +479,9 @@ static int reset_rx_pools(struct ibmvnic_adapter *adapter)
int i, j, rc;
u64 *size_array;
 
+   if (!adapter->rx_pool)
+   return -1;
+
size_array = (u64 *)((u8 *)(adapter->login_rsp_buf) +
be32_to_cpu(adapter->login_rsp_buf->off_rxadd_buff_size));
 
@@ -649,6 +652,9 @@ static int reset_tx_pools(struct ibmvnic_adapter *adapter)
int tx_scrqs;
int i, rc;
 
+   if (!adapter->tx_pool)
+   return -1;
+
tx_scrqs = be32_to_cpu(adapter->login_rsp_buf->num_txsubm_subcrqs);
for (i = 0; i < tx_scrqs; i++) {
rc = reset_one_tx_pool(adapter, &adapter->tso_pool[i]);
@@ -2011,7 +2017,10 @@ static int do_reset(struct ibmvnic_adapter *adapter,
adapter->req_rx_add_entries_per_subcrq !=
old_num_rx_slots ||
adapter->req_tx_entries_per_subcrq !=
-   old_num_tx_slots) {
+   old_num_tx_slots ||
+   !adapter->rx_pool ||
+   !adapter->tso_pool ||
+   !adapter->tx_pool) {
release_rx_pools(adapter);
release_tx_pools(adapter);
release_napi(adapter);
@@ -2024,10 +2033,14 @@ static int do_reset(struct ibmvnic_adapter *adapter,
} else {
rc = reset_tx_pools(adapter);
if (rc)
+   netdev_dbg(adapter->netdev, "reset tx pools 
failed (%d)\n",
+   rc);
goto out;
 
rc = reset_rx_pools(adapter);
if (rc)
+   netdev_dbg(adapter->netdev, "reset rx pools 
failed (%d)\n",
+   rc);
goto out;
}
ibmvnic_disable_irqs(adapter);
-- 
2.18.2



[PATCH net v2] ibmvnic fix NULL tx_pools and rx_tools issue at do_reset

2020-08-25 Thread Dany Madden
From: Mingming Cao 

At the time of do_rest, ibmvnic tries to re-initalize the tx_pools
and rx_pools to avoid re-allocating the long term buffer. However
there is a window inside do_reset that the tx_pools and
rx_pools were freed before re-initialized making it possible to deference
null pointers.

This patch fix this issue by always check the tx_pool
and rx_pool are not NULL after ibmvnic_login. If so, re-allocating
the pools. This will avoid getting into calling reset_tx/rx_pools with
NULL adapter tx_pools/rx_pools pointer. Also add null pointer check in
reset_tx_pools and reset_rx_pools to safe handle NULL pointer case.

Signed-off-by: Mingming Cao 
Signed-off-by: Dany Madden 
---
 drivers/net/ethernet/ibm/ibmvnic.c | 15 ++-
 1 file changed, 14 insertions(+), 1 deletion(-)

diff --git a/drivers/net/ethernet/ibm/ibmvnic.c 
b/drivers/net/ethernet/ibm/ibmvnic.c
index 5afb3c9c52d2..52feee97821e 100644
--- a/drivers/net/ethernet/ibm/ibmvnic.c
+++ b/drivers/net/ethernet/ibm/ibmvnic.c
@@ -479,6 +479,9 @@ static int reset_rx_pools(struct ibmvnic_adapter *adapter)
int i, j, rc;
u64 *size_array;
 
+   if (!adapter->rx_pool)
+   return -1;
+
size_array = (u64 *)((u8 *)(adapter->login_rsp_buf) +
be32_to_cpu(adapter->login_rsp_buf->off_rxadd_buff_size));
 
@@ -649,6 +652,9 @@ static int reset_tx_pools(struct ibmvnic_adapter *adapter)
int tx_scrqs;
int i, rc;
 
+   if (!adapter->tx_pool)
+   return -1;
+
tx_scrqs = be32_to_cpu(adapter->login_rsp_buf->num_txsubm_subcrqs);
for (i = 0; i < tx_scrqs; i++) {
rc = reset_one_tx_pool(adapter, &adapter->tso_pool[i]);
@@ -2011,7 +2017,10 @@ static int do_reset(struct ibmvnic_adapter *adapter,
adapter->req_rx_add_entries_per_subcrq !=
old_num_rx_slots ||
adapter->req_tx_entries_per_subcrq !=
-   old_num_tx_slots) {
+   old_num_tx_slots ||
+   !adapter->rx_pool ||
+   !adapter->tso_pool ||
+   !adapter->tx_pool) {
release_rx_pools(adapter);
release_tx_pools(adapter);
release_napi(adapter);
@@ -2024,10 +2033,14 @@ static int do_reset(struct ibmvnic_adapter *adapter,
} else {
rc = reset_tx_pools(adapter);
if (rc)
+   netdev_dbg(adapter->netdev, "reset tx pools 
failed (%d)\n",
+   rc);
goto out;
 
rc = reset_rx_pools(adapter);
if (rc)
+   netdev_dbg(adapter->netdev, "reset rx pools 
failed (%d)\n",
+   rc);
goto out;
}
ibmvnic_disable_irqs(adapter);
-- 
2.18.2



[PATCH net] ibmvnic fix NULL tx_pools and rx_tools issue at do_reset

2020-08-24 Thread Dany Madden
From: Mingming Cao 

At the time of do_reset, ibmvnic tries to re-initalize the tx_pools
and rx_pools to avoid re-allocating the long term buffer. However
there is a window inside do_reset that the tx_pools and
rx_pools were freed before re-initialized making it possible to deference
null pointers.

This patch fixes this issue by checking that the tx_pool
and rx_pool are not NULL after ibmvnic_login. If so, re-allocating
the pools. This will avoid getting into calling reset_tx/rx_pools with
NULL adapter tx_pools/rx_pools pointer. Also add null pointer check in
reset_tx_pools and reset_rx_pools to safe handle NULL pointer case.

Signed-off-by: Mingming Cao 
Signed-off-by: Dany Madden 
---
 drivers/net/ethernet/ibm/ibmvnic.c | 15 ++-
 1 file changed, 14 insertions(+), 1 deletion(-)

diff --git a/drivers/net/ethernet/ibm/ibmvnic.c 
b/drivers/net/ethernet/ibm/ibmvnic.c
index 5afb3c9c52d2..5ff48e55308b 100644
--- a/drivers/net/ethernet/ibm/ibmvnic.c
+++ b/drivers/net/ethernet/ibm/ibmvnic.c
@@ -479,6 +479,9 @@ static int reset_rx_pools(struct ibmvnic_adapter *adapter)
int i, j, rc;
u64 *size_array;
 
+   if (!adapter->tx_pool)
+   return -1;
+
size_array = (u64 *)((u8 *)(adapter->login_rsp_buf) +
be32_to_cpu(adapter->login_rsp_buf->off_rxadd_buff_size));
 
@@ -649,6 +652,9 @@ static int reset_tx_pools(struct ibmvnic_adapter *adapter)
int tx_scrqs;
int i, rc;
 
+   if (!adapter->tx_pool)
+   return -1;
+
tx_scrqs = be32_to_cpu(adapter->login_rsp_buf->num_txsubm_subcrqs);
for (i = 0; i < tx_scrqs; i++) {
rc = reset_one_tx_pool(adapter, &adapter->tso_pool[i]);
@@ -2011,7 +2017,10 @@ static int do_reset(struct ibmvnic_adapter *adapter,
adapter->req_rx_add_entries_per_subcrq !=
old_num_rx_slots ||
adapter->req_tx_entries_per_subcrq !=
-   old_num_tx_slots) {
+   old_num_tx_slots ||
+   !adapter->rx_pool ||
+   !adapter->tso_pool ||
+   !adapter->tx_pool) {
release_rx_pools(adapter);
release_tx_pools(adapter);
release_napi(adapter);
@@ -2024,10 +2033,14 @@ static int do_reset(struct ibmvnic_adapter *adapter,
} else {
rc = reset_tx_pools(adapter);
if (rc)
+   netdev_dbg(adapter->netdev, "reset tx pools 
failed (%d)\n",
+   rc);
goto out;
 
rc = reset_rx_pools(adapter);
if (rc)
+   netdev_dbg(adapter->netdev, "reset rx pools 
failed (%d)\n",
+   rc);
goto out;
}
ibmvnic_disable_irqs(adapter);
-- 
2.18.2



[PATCH net] ibmvnic: continue to init in CRQ reset returns H_CLOSED

2020-06-18 Thread Dany Madden
Continue the reset path when partner adapter is not ready or H_CLOSED is
returned from reset crq. This patch allows the CRQ init to proceed to
establish a valid CRQ for traffic to flow after reset.

Signed-off-by: Dany Madden 
---
 drivers/net/ethernet/ibm/ibmvnic.c | 9 +++--
 1 file changed, 7 insertions(+), 2 deletions(-)

diff --git a/drivers/net/ethernet/ibm/ibmvnic.c 
b/drivers/net/ethernet/ibm/ibmvnic.c
index 2baf7b3ff4cb..4b7cb483c47f 100644
--- a/drivers/net/ethernet/ibm/ibmvnic.c
+++ b/drivers/net/ethernet/ibm/ibmvnic.c
@@ -1971,13 +1971,18 @@ static int do_reset(struct ibmvnic_adapter *adapter,
release_sub_crqs(adapter, 1);
} else {
rc = ibmvnic_reset_crq(adapter);
-   if (!rc)
+   if (rc == H_CLOSED || rc == H_SUCCESS) {
rc = vio_enable_interrupts(adapter->vdev);
+   if (rc)
+   netdev_err(adapter->netdev,
+  "Reset failed to enable 
interrupts. rc=%d\n",
+  rc);
+   }
}
 
if (rc) {
netdev_err(adapter->netdev,
-  "Couldn't initialize crq. rc=%d\n", rc);
+  "Reset couldn't initialize crq. rc=%d\n", 
rc);
goto out;
}
 
-- 
2.18.2