Re: [PATCH 1/5] scsi: bnx2i: convert to workqueue

2017-07-07 Thread Sebastian Andrzej Siewior
On 2017-07-07 09:20:02 [-0400], Chad Dupuis wrote:
> What was the question?  My observation is that the patch I proposed fixed 
> the issue we saw on testing the patch set.  With that small change 
> (essentially modulo by the number of active CPUs vs. the total number) 
> your patch set worked ok.

That mail at the bottom of this mail where I said why I think your patch
is a nop in this context.

Sebastian

On 2017-05-17 17:07:34 [+0200], To Chad Dupuis wrote:
> > > Sebastian, can you add this change to your patch set?
> >
> > Are sure that you can reliably reproduce the issue and fix it with the
> > patch above? Because this patch:
>
> oh. Okay. Now it clicked. It can fix the issue but it is still possible,
> that CPU0 goes down between your check for it and schedule_work_on()
> returning. Let my think of something…

Oh wait. I already thought about this: it may take bnx2fc_percpu from
CPU7 and run the worker on CPU3. The job isn't lost, because the worker
does:

| static void bnx2fc_percpu_io_work(struct work_struct *work_s)
| {
| struct bnx2fc_percpu_s *p;
 …
| p = container_of(work_s, struct bnx2fc_percpu_s, work);
|
| spin_lock_bh(>fp_work_lock);

and so will access bnx2fc_percpu of CPU7 running on CPU3. So I *think*
that your patch should make no difference and there should be no leak if
schedule_work_on() is invoked on an offline CPU.


Re: [PATCH 1/5] scsi: bnx2i: convert to workqueue

2017-07-07 Thread Chad Dupuis


On Fri, 7 Jul 2017, 9:14am, Sebastian Andrzej Siewior wrote:

> On 2017-06-29 15:57:56 [+0200], Johannes Thumshirn wrote:
> > So here we are again,
> > Tested-by: Johannes Thumshirn 
> > 
> > FCoE will follow as soon as my setup can speak FCoE again.
> 
> So it all looks good, doesn't it? Chad never responded to my question
> on his patch. I still doubt that it fixes the problem he observed.
> 
> Sebastian
> 

What was the question?  My observation is that the patch I proposed fixed 
the issue we saw on testing the patch set.  With that small change 
(essentially modulo by the number of active CPUs vs. the total number) 
your patch set worked ok.


Re: [PATCH 1/5] scsi: bnx2i: convert to workqueue

2017-07-07 Thread Sebastian Andrzej Siewior
On 2017-06-29 15:57:56 [+0200], Johannes Thumshirn wrote:
> So here we are again,
> Tested-by: Johannes Thumshirn 
> 
> FCoE will follow as soon as my setup can speak FCoE again.

So it all looks good, doesn't it? Chad never responded to my question
on his patch. I still doubt that it fixes the problem he observed.

Sebastian


Re: [PATCH 1/5] scsi: bnx2i: convert to workqueue

2017-06-29 Thread Johannes Thumshirn
So here we are again,
Tested-by: Johannes Thumshirn 

FCoE will follow as soon as my setup can speak FCoE again.
-- 
Johannes Thumshirn  Storage
jthumsh...@suse.de+49 911 74053 689
SUSE LINUX GmbH, Maxfeldstr. 5, 90409 Nürnberg
GF: Felix Imendörffer, Jane Smithard, Graham Norton
HRB 21284 (AG Nürnberg)
Key fingerprint = EC38 9CAB C2C4 F25D 8600 D0D0 0393 969D 2D76 0850


Re: [PATCH 1/5] scsi: bnx2i: convert to workqueue

2017-05-09 Thread Rangankar, Manish

On 10/04/17 10:42 PM, "Sebastian Andrzej Siewior" 
wrote:

>The driver creates its own per-CPU threads which are updated based on CPU
>hotplug events. It is also possible to use kworkers and remove some of the
>infrastructure get the same job done while saving a few lines of code.
>
>The DECLARE_PER_CPU() definition is moved into the header file where it
>belongs. bnx2i_percpu_io_thread() becomes bnx2i_percpu_io_work() which is
>mostly the same code. The outer loop (kthread_should_stop()) gets removed
>and
>the remaining code is shifted to the left.
>bnx2i_queue_scsi_cmd_resp() is mostly the same. The code checked
>->iothread to
>decide if there is an active per-CPU thread. With the kworkers this is no
>longer possible nor required.
>The allocation of struct bnx2i_work does not happen with ->p_work_lock
>held
>which is not required. I am unsure about the call-stack so I can't say
>if this qualifies it for the allocation with GFP_KERNEL instead of
>GFP_ATOMIC (it is not _bh lock but as I said, I don't know the context).
>The allocation case has been reversed so the inner if case is called on
>!bnx2i_work and is just the invocation one function since the lock is not
>held during allocation. The init of the new bnx2i_work struct is now
>done also without the ->p_work_lock held: it is a new object, nobody
>knows about it yet. It should be enough to hold the lock while adding
>this item to the list. I am unsure about that atomic_inc() so I keep
>things as they were.
>
>The remaining part is the removal CPU hotplug notifier since it is taken
>care by the workqueue code.
>
>This patch was only compile-tested due to -ENODEV.
>
>Cc: qlogic-storage-upstr...@qlogic.com
>Cc: Christoph Hellwig 
>Signed-off-by: Sebastian Andrzej Siewior 
>---

Didn't seen any issue with regression testing. Thanks Sebastian.

Acked-by: Manish Rangankar 



Re: [PATCH 1/5] scsi: bnx2i: convert to workqueue

2017-05-05 Thread Johannes Thumshirn

On 04/10/2017 07:12 PM, Sebastian Andrzej Siewior wrote:
> The driver creates its own per-CPU threads which are updated based on CPU
> hotplug events. It is also possible to use kworkers and remove some of the
> infrastructure get the same job done while saving a few lines of code.
>
> The DECLARE_PER_CPU() definition is moved into the header file where it
> belongs. bnx2i_percpu_io_thread() becomes bnx2i_percpu_io_work() which is
> mostly the same code. The outer loop (kthread_should_stop()) gets removed and
> the remaining code is shifted to the left.
> bnx2i_queue_scsi_cmd_resp() is mostly the same. The code checked ->iothread to
> decide if there is an active per-CPU thread. With the kworkers this is no
> longer possible nor required.
> The allocation of struct bnx2i_work does not happen with ->p_work_lock held
> which is not required. I am unsure about the call-stack so I can't say
> if this qualifies it for the allocation with GFP_KERNEL instead of
> GFP_ATOMIC (it is not _bh lock but as I said, I don't know the context).
> The allocation case has been reversed so the inner if case is called on
> !bnx2i_work and is just the invocation one function since the lock is not
> held during allocation. The init of the new bnx2i_work struct is now
> done also without the ->p_work_lock held: it is a new object, nobody
> knows about it yet. It should be enough to hold the lock while adding
> this item to the list. I am unsure about that atomic_inc() so I keep
> things as they were.
>
> The remaining part is the removal CPU hotplug notifier since it is taken
> care by the workqueue code.
>
> This patch was only compile-tested due to -ENODEV.
>
> Cc: qlogic-storage-upstr...@qlogic.com
> Cc: Christoph Hellwig 
> Signed-off-by: Sebastian Andrzej Siewior 
> ---
Reviewed-by: Johannes Thumshirn 



Re: [PATCH 1/5] scsi: bnx2i: convert to workqueue

2017-05-05 Thread Christoph Hellwig
Looks fine,

Reviewed-by: Christoph Hellwig 


[PATCH 1/5] scsi: bnx2i: convert to workqueue

2017-04-10 Thread Sebastian Andrzej Siewior
The driver creates its own per-CPU threads which are updated based on CPU
hotplug events. It is also possible to use kworkers and remove some of the
infrastructure get the same job done while saving a few lines of code.

The DECLARE_PER_CPU() definition is moved into the header file where it
belongs. bnx2i_percpu_io_thread() becomes bnx2i_percpu_io_work() which is
mostly the same code. The outer loop (kthread_should_stop()) gets removed and
the remaining code is shifted to the left.
bnx2i_queue_scsi_cmd_resp() is mostly the same. The code checked ->iothread to
decide if there is an active per-CPU thread. With the kworkers this is no
longer possible nor required.
The allocation of struct bnx2i_work does not happen with ->p_work_lock held
which is not required. I am unsure about the call-stack so I can't say
if this qualifies it for the allocation with GFP_KERNEL instead of
GFP_ATOMIC (it is not _bh lock but as I said, I don't know the context).
The allocation case has been reversed so the inner if case is called on
!bnx2i_work and is just the invocation one function since the lock is not
held during allocation. The init of the new bnx2i_work struct is now
done also without the ->p_work_lock held: it is a new object, nobody
knows about it yet. It should be enough to hold the lock while adding
this item to the list. I am unsure about that atomic_inc() so I keep
things as they were.

The remaining part is the removal CPU hotplug notifier since it is taken
care by the workqueue code.

This patch was only compile-tested due to -ENODEV.

Cc: qlogic-storage-upstr...@qlogic.com
Cc: Christoph Hellwig 
Signed-off-by: Sebastian Andrzej Siewior 
---
 drivers/scsi/bnx2i/bnx2i.h  |  11 ++---
 drivers/scsi/bnx2i/bnx2i_hwi.c  | 101 +-
 drivers/scsi/bnx2i/bnx2i_init.c | 104 +++-
 3 files changed, 53 insertions(+), 163 deletions(-)

diff --git a/drivers/scsi/bnx2i/bnx2i.h b/drivers/scsi/bnx2i/bnx2i.h
index 89ef1a1678d1..78f67542cbd3 100644
--- a/drivers/scsi/bnx2i/bnx2i.h
+++ b/drivers/scsi/bnx2i/bnx2i.h
@@ -31,7 +31,6 @@
 #include 
 #include 
 #include 
-#include 
 
 #include 
 #include 
@@ -775,12 +774,11 @@ struct bnx2i_work {
 };
 
 struct bnx2i_percpu_s {
-   struct task_struct *iothread;
+   struct work_struct work;
struct list_head work_list;
spinlock_t p_work_lock;
 };
 
-
 /* Global variables */
 extern unsigned int error_mask1, error_mask2;
 extern u64 iscsi_error_mask;
@@ -797,7 +795,7 @@ extern unsigned int rq_size;
 
 extern struct device_attribute *bnx2i_dev_attributes[];
 
-
+DECLARE_PER_CPU(struct bnx2i_percpu_s, bnx2i_percpu);
 
 /*
  * Function Prototypes
@@ -875,8 +873,5 @@ extern void bnx2i_print_active_cmd_queue(struct bnx2i_conn 
*conn);
 extern void bnx2i_print_xmit_pdu_queue(struct bnx2i_conn *conn);
 extern void bnx2i_print_recv_state(struct bnx2i_conn *conn);
 
-extern int bnx2i_percpu_io_thread(void *arg);
-extern int bnx2i_process_scsi_cmd_resp(struct iscsi_session *session,
-  struct bnx2i_conn *bnx2i_conn,
-  struct cqe *cqe);
+extern void bnx2i_percpu_io_work(struct work_struct *work);
 #endif
diff --git a/drivers/scsi/bnx2i/bnx2i_hwi.c b/drivers/scsi/bnx2i/bnx2i_hwi.c
index 42921dbba927..9be58f6523b3 100644
--- a/drivers/scsi/bnx2i/bnx2i_hwi.c
+++ b/drivers/scsi/bnx2i/bnx2i_hwi.c
@@ -19,8 +19,6 @@
 #include 
 #include "bnx2i.h"
 
-DECLARE_PER_CPU(struct bnx2i_percpu_s, bnx2i_percpu);
-
 /**
  * bnx2i_get_cid_num - get cid from ep
  * @ep:endpoint pointer
@@ -1350,9 +1348,9 @@ int bnx2i_send_fw_iscsi_init_msg(struct bnx2i_hba *hba)
  *
  * process SCSI CMD Response CQE & complete the request to SCSI-ML
  */
-int bnx2i_process_scsi_cmd_resp(struct iscsi_session *session,
-   struct bnx2i_conn *bnx2i_conn,
-   struct cqe *cqe)
+static int bnx2i_process_scsi_cmd_resp(struct iscsi_session *session,
+  struct bnx2i_conn *bnx2i_conn,
+  struct cqe *cqe)
 {
struct iscsi_conn *conn = bnx2i_conn->cls_conn->dd_data;
struct bnx2i_hba *hba = bnx2i_conn->hba;
@@ -1862,45 +1860,37 @@ static void bnx2i_process_cmd_cleanup_resp(struct 
iscsi_session *session,
 
 
 /**
- * bnx2i_percpu_io_thread - thread per cpu for ios
+ * bnx2i_percpu_io_work - thread per cpu for ios
  *
- * @arg:   ptr to bnx2i_percpu_info structure
+ * @work_s:The work struct
  */
-int bnx2i_percpu_io_thread(void *arg)
+void bnx2i_percpu_io_work(struct work_struct *work_s)
 {
-   struct bnx2i_percpu_s *p = arg;
+   struct bnx2i_percpu_s *p;
struct bnx2i_work *work, *tmp;
LIST_HEAD(work_list);
 
-   set_user_nice(current, MIN_NICE);
+   p = container_of(work_s, struct bnx2i_percpu_s, work);
 
-   while (!kthread_should_stop()) {
-   

[PATCH 1/5] scsi: bnx2i: convert to workqueue

2016-11-22 Thread Sebastian Andrzej Siewior
The driver creates its own per-CPU threads which are updated based on CPU
hotplug events. It is also possible to use kworkers and remove some of the
infrastructure get the same job done while saving a few lines of code.

The DECLARE_PER_CPU() definition is moved into the header file where it
belongs. bnx2i_percpu_io_thread() becomes bnx2i_percpu_io_work() which is
mostly the same code. The outer loop (kthread_should_stop()) gets removed and
the remaining code is shifted to the left.
bnx2i_queue_scsi_cmd_resp() is mostly the same. The code checked ->iothread to
decide if there is an active per-CPU thread. With the kworkers this is no
longer possible nor required.
The allocation of struct bnx2i_work does not happen with ->p_work_lock held
which is not required. I am unsure about the call-stack so I can't say
if this qualifies it for the allocation with GFP_KERNEL instead of
GFP_ATOMIC (it is not _bh lock but as I said, I don't know the context).
The allocation case has been reversed so the inner if case is called on
!bnx2i_work and is just the invocation one function since the lock is not
held during allocation. The init of the new bnx2i_work struct is now
done also without the ->p_work_lock held: it is a new object, nobody
knows about it yet. It should be enough to hold the lock while adding
this item to the list. I am unsure about that atomic_inc() so I keep
things as they were.

The remaining part is the removal CPU hotplug notifier since it is taken
care by the workqueue code.

This patch was only compile-tested due to -ENODEV.

Cc: qlogic-storage-upstr...@qlogic.com
Cc: Christoph Hellwig 
Signed-off-by: Sebastian Andrzej Siewior 
---
 drivers/scsi/bnx2i/bnx2i.h  |  11 +---
 drivers/scsi/bnx2i/bnx2i_hwi.c  | 101 ++---
 drivers/scsi/bnx2i/bnx2i_init.c | 121 +++-
 3 files changed, 53 insertions(+), 180 deletions(-)

diff --git a/drivers/scsi/bnx2i/bnx2i.h b/drivers/scsi/bnx2i/bnx2i.h
index ed7f3228e234..78cdc493bab5 100644
--- a/drivers/scsi/bnx2i/bnx2i.h
+++ b/drivers/scsi/bnx2i/bnx2i.h
@@ -31,7 +31,6 @@
 #include 
 #include 
 #include 
-#include 
 
 #include 
 #include 
@@ -775,12 +774,11 @@ struct bnx2i_work {
 };
 
 struct bnx2i_percpu_s {
-   struct task_struct *iothread;
+   struct work_struct work;
struct list_head work_list;
spinlock_t p_work_lock;
 };
 
-
 /* Global variables */
 extern unsigned int error_mask1, error_mask2;
 extern u64 iscsi_error_mask;
@@ -797,7 +795,7 @@ extern unsigned int rq_size;
 
 extern struct device_attribute *bnx2i_dev_attributes[];
 
-
+DECLARE_PER_CPU(struct bnx2i_percpu_s, bnx2i_percpu);
 
 /*
  * Function Prototypes
@@ -875,8 +873,5 @@ extern void bnx2i_print_active_cmd_queue(struct bnx2i_conn 
*conn);
 extern void bnx2i_print_xmit_pdu_queue(struct bnx2i_conn *conn);
 extern void bnx2i_print_recv_state(struct bnx2i_conn *conn);
 
-extern int bnx2i_percpu_io_thread(void *arg);
-extern int bnx2i_process_scsi_cmd_resp(struct iscsi_session *session,
-  struct bnx2i_conn *bnx2i_conn,
-  struct cqe *cqe);
+extern void bnx2i_percpu_io_work(struct work_struct *work);
 #endif
diff --git a/drivers/scsi/bnx2i/bnx2i_hwi.c b/drivers/scsi/bnx2i/bnx2i_hwi.c
index 42921dbba927..9be58f6523b3 100644
--- a/drivers/scsi/bnx2i/bnx2i_hwi.c
+++ b/drivers/scsi/bnx2i/bnx2i_hwi.c
@@ -19,8 +19,6 @@
 #include 
 #include "bnx2i.h"
 
-DECLARE_PER_CPU(struct bnx2i_percpu_s, bnx2i_percpu);
-
 /**
  * bnx2i_get_cid_num - get cid from ep
  * @ep:endpoint pointer
@@ -1350,9 +1348,9 @@ int bnx2i_send_fw_iscsi_init_msg(struct bnx2i_hba *hba)
  *
  * process SCSI CMD Response CQE & complete the request to SCSI-ML
  */
-int bnx2i_process_scsi_cmd_resp(struct iscsi_session *session,
-   struct bnx2i_conn *bnx2i_conn,
-   struct cqe *cqe)
+static int bnx2i_process_scsi_cmd_resp(struct iscsi_session *session,
+  struct bnx2i_conn *bnx2i_conn,
+  struct cqe *cqe)
 {
struct iscsi_conn *conn = bnx2i_conn->cls_conn->dd_data;
struct bnx2i_hba *hba = bnx2i_conn->hba;
@@ -1862,45 +1860,37 @@ static void bnx2i_process_cmd_cleanup_resp(struct 
iscsi_session *session,
 
 
 /**
- * bnx2i_percpu_io_thread - thread per cpu for ios
+ * bnx2i_percpu_io_work - thread per cpu for ios
  *
- * @arg:   ptr to bnx2i_percpu_info structure
+ * @work_s:The work struct
  */
-int bnx2i_percpu_io_thread(void *arg)
+void bnx2i_percpu_io_work(struct work_struct *work_s)
 {
-   struct bnx2i_percpu_s *p = arg;
+   struct bnx2i_percpu_s *p;
struct bnx2i_work *work, *tmp;
LIST_HEAD(work_list);
 
-   set_user_nice(current, MIN_NICE);
+   p = container_of(work_s, struct bnx2i_percpu_s, work);
 
-   while (!kthread_should_stop()) {
-