Re: [PATCH 1/2] powerpc/workqueue: update list of possible CPUs

2017-08-21 Thread Michael Ellerman
Tejun Heo  writes:

> On Mon, Aug 21, 2017 at 03:49:50PM +0200, Laurent Vivier wrote:
>> In wq_numa_init() a list of NUMA nodes with their list of possible CPUs
>> is built.
>> 
>> Unfortunately, on powerpc, the Firmware is only able to provide the
>> node of a CPU if the CPU is present. So, in our case (possible CPU)
>> CPU ids are known, but as the CPU is not present, the node id is
>> unknown and all the unplugged CPUs are attached to node 0.
>
> This is something powerpc needs to fix.

There is no way for us to fix it.

At boot, for possible but not present CPUs, we have no way of knowing
the CPU <-> node mapping, firmware simply doesn't tell us.

> Workqueue isn't the only one making this assumption. mm as a whole
> assumes that CPU <-> node mapping is stable regardless of hotplug
> events.

At least in this case I don't think the mapping changes, it's just we
don't know the mapping at boot.

Currently we have to report possible but not present CPUs as belonging
to node 0, because otherwise we trip this helpful piece of code:

for_each_possible_cpu(cpu) {
node = cpu_to_node(cpu);
if (WARN_ON(node == NUMA_NO_NODE)) {
pr_warn("workqueue: NUMA node mapping not available for 
cpu%d, disabling NUMA support\n", cpu);
/* happens iff arch is bonkers, let's just proceed */
return;
}

But if we remove that, we could then accurately report NUMA_NO_NODE at
boot, and then update the mapping when the CPU is hotplugged.

cheers


Re: [PATCH] loop: Fix freeze if configured block size is not supported

2017-08-21 Thread Omar Sandoval
On Mon, Aug 21, 2017 at 09:11:32PM +0200, Milan Broz wrote:
> On 08/21/2017 08:47 PM, Omar Sandoval wrote:
> > On Fri, Aug 18, 2017 at 03:07:33PM +0200, Milan Broz wrote:
> >> The commit f2c6df7dbf9a60e1cd9941f9fb376d4d9ad1e8dd
> >>
> >> loop: support 4k physical blocksize
> >>
> >> adds support for loop block size with only specific block sizes.
> >>
> >> If the size is not supported, the code returns -EINVAL keeping
> >> the loop queue frozen. This causes that device could be locked
> >> for a long time by processes trying to scan device (udev).
> >> (And also causing subsequent LOOP_CLR_FD operations noop.)
> >>
> >> Fix it by using goto to proper exit location with queue unfreeze.
> >>
> >> (The same bug is for setting crypt attribute but this code is
> >> probably no more used. Patch fixes it as well though.)
> >>
> >> Signed-off-by: Milan Broz 
> > 
> > Heh, I sent the same patch [1] only hours before :) v2 is here [2] if
> > you want to give it a reviewed-by.
> 
> Yes, and I noticed it 10 seconds after I sent my patch :)
> You can add reviewed by, if it helps anything...

Thanks!

> Actually you fixed another problems there with following patches
> (physical blocks sizes), we discussed this with Karel
> that it need some changes because the original patch caused
> that reported blocks differs between old and new kernel (lsblk -t),
> even if block size was not used.

Yes there was definitely some funkiness there, let me know if I covered
everything or if I need to make more changes.


Re: [PATCH] loop: Fix freeze if configured block size is not supported

2017-08-21 Thread Milan Broz
On 08/21/2017 08:47 PM, Omar Sandoval wrote:
> On Fri, Aug 18, 2017 at 03:07:33PM +0200, Milan Broz wrote:
>> The commit f2c6df7dbf9a60e1cd9941f9fb376d4d9ad1e8dd
>>
>> loop: support 4k physical blocksize
>>
>> adds support for loop block size with only specific block sizes.
>>
>> If the size is not supported, the code returns -EINVAL keeping
>> the loop queue frozen. This causes that device could be locked
>> for a long time by processes trying to scan device (udev).
>> (And also causing subsequent LOOP_CLR_FD operations noop.)
>>
>> Fix it by using goto to proper exit location with queue unfreeze.
>>
>> (The same bug is for setting crypt attribute but this code is
>> probably no more used. Patch fixes it as well though.)
>>
>> Signed-off-by: Milan Broz 
> 
> Heh, I sent the same patch [1] only hours before :) v2 is here [2] if
> you want to give it a reviewed-by.

Yes, and I noticed it 10 seconds after I sent my patch :)
You can add reviewed by, if it helps anything...

Actually you fixed another problems there with following patches
(physical blocks sizes), we discussed this with Karel
that it need some changes because the original patch caused
that reported blocks differs between old and new kernel (lsblk -t),
even if block size was not used.

I will test them as well (we have code in cryptsetup that allocates
loop device automatically if the image is in file and I would like
to add block size setting there).

Thanks,
Milan

> 
> 1: https://marc.info/?l=linux-block&m=150303694018659&w=2
> 2: https://marc.info/?l=linux-block&m=150308446732748&w=2
> 


Re: [PATCH] loop: Fix freeze if configured block size is not supported

2017-08-21 Thread Omar Sandoval
On Fri, Aug 18, 2017 at 03:07:33PM +0200, Milan Broz wrote:
> The commit f2c6df7dbf9a60e1cd9941f9fb376d4d9ad1e8dd
> 
> loop: support 4k physical blocksize
> 
> adds support for loop block size with only specific block sizes.
> 
> If the size is not supported, the code returns -EINVAL keeping
> the loop queue frozen. This causes that device could be locked
> for a long time by processes trying to scan device (udev).
> (And also causing subsequent LOOP_CLR_FD operations noop.)
> 
> Fix it by using goto to proper exit location with queue unfreeze.
> 
> (The same bug is for setting crypt attribute but this code is
> probably no more used. Patch fixes it as well though.)
> 
> Signed-off-by: Milan Broz 

Heh, I sent the same patch [1] only hours before :) v2 is here [2] if
you want to give it a reviewed-by.

1: https://marc.info/?l=linux-block&m=150303694018659&w=2
2: https://marc.info/?l=linux-block&m=150308446732748&w=2


Re: [PATCH 4/6] qlogic: make device_attribute const

2017-08-21 Thread Bhumika Goyal
On Mon, Aug 21, 2017 at 10:55 PM, David Miller  wrote:
> From: Bhumika Goyal 
> Date: Mon, 21 Aug 2017 17:13:10 +0530
>
>> Make these const as they are only passed as an argument to the
>> function device_create_file and device_remove_file and the corresponding
>> arguments are of type const.
>> Done using Coccinelle
>>
>> Signed-off-by: Bhumika Goyal 
>
> Applied.
>
> But I would seriously suggest that when you have to cross subsystems
> like this, just send the patches individually to the respective
> maintainers rather than trying to make a "series" out of it.
>

Yes, noted. Thanks for the pointer.

Thanks,
Bhumika

> Thanks.


Re: [PATCH 4/6] qlogic: make device_attribute const

2017-08-21 Thread David Miller
From: Bhumika Goyal 
Date: Mon, 21 Aug 2017 17:13:10 +0530

> Make these const as they are only passed as an argument to the
> function device_create_file and device_remove_file and the corresponding
> arguments are of type const.
> Done using Coccinelle
> 
> Signed-off-by: Bhumika Goyal 

Applied.

But I would seriously suggest that when you have to cross subsystems
like this, just send the patches individually to the respective
maintainers rather than trying to make a "series" out of it.

Thanks.


Re: [PATCH v2 2/3] loop: use queue limit instead of private lo_logical_blocksize

2017-08-21 Thread Omar Sandoval
On Mon, Aug 21, 2017 at 07:58:52AM +0200, Hannes Reinecke wrote:
> On 08/18/2017 09:27 PM, Omar Sandoval wrote:
> > From: Omar Sandoval 
> > 
> > There's no reason to track this separately; just use the
> > logical_block_size queue limit.
> > 
> > Signed-off-by: Omar Sandoval 
> > ---
> >  drivers/block/loop.c | 44 ++--
> >  drivers/block/loop.h |  1 -
> >  2 files changed, 18 insertions(+), 27 deletions(-)
> > 
> 
> Curiously enough, this is what I attempted initially.
> But that got shut down due to incompability.
> In the hope that you'll succeed:

Is that thread archived anywhere? I don't see how this would be a
compatability problem, it doesn't change the behavior if you're not
using the new option, and if you are using the new option then there's
no compatability to speak of.

> Reviewed-by: Hannes Reinecke 

Thanks!


[PATCH] blkcg: add a cleanup when active policy fail

2017-08-21 Thread weiping zhang
if alloc memory fail, all cpds which were allocated before should be cleaned up.

Signed-off-by: weiping zhang 
---
 block/blk-cgroup.c | 11 +++
 1 file changed, 11 insertions(+)

diff --git a/block/blk-cgroup.c b/block/blk-cgroup.c
index 0480892..7e21c4b 100644
--- a/block/blk-cgroup.c
+++ b/block/blk-cgroup.c
@@ -1336,6 +1336,17 @@ int blkcg_activate_policy(struct request_queue *q,
 
spin_unlock_irq(q->queue_lock);
 out_bypass_end:
+   if (ret) {
+   spin_lock_irq(q->queue_lock);
+   list_for_each_entry(blkg, &q->blkg_list, q_node) {
+   if (blkg->pd[pol->plid]) {
+   if (pol->pd_free_fn)
+   pol->pd_free_fn(blkg->pd[pol->plid]);
+   blkg->pd[pol->plid] = NULL;
+   }
+   }
+   spin_unlock_irq(q->queue_lock);
+   }
if (q->mq_ops)
blk_mq_unfreeze_queue(q);
else
-- 
2.9.4



Re: [PATCH 1/2] powerpc/workqueue: update list of possible CPUs

2017-08-21 Thread Tejun Heo
On Mon, Aug 21, 2017 at 03:49:50PM +0200, Laurent Vivier wrote:
> In wq_numa_init() a list of NUMA nodes with their list of possible CPUs
> is built.
> 
> Unfortunately, on powerpc, the Firmware is only able to provide the
> node of a CPU if the CPU is present. So, in our case (possible CPU)
> CPU ids are known, but as the CPU is not present, the node id is
> unknown and all the unplugged CPUs are attached to node 0.

This is something powerpc needs to fix.  Workqueue isn't the only one
making this assumption.  mm as a whole assumes that CPU <-> node
mapping is stable regardless of hotplug events.  Powerpc people know
about the issue and AFAIK are working on it.

Thanks.

-- 
tejun


Re: [PATCH 2/2] blk-mq: don't use WORK_CPU_UNBOUND

2017-08-21 Thread Tejun Heo
On Mon, Aug 21, 2017 at 03:49:51PM +0200, Laurent Vivier wrote:
> cpumask is the list of CPUs present when the queue is built.
> If a new CPU is hotplugged, this list is not updated,
> and when the scheduler asks for a CPU id, blk_mq_hctx_next_cpu()
> can return WORK_CPU_UNBOUND.
> And in this case _blk_mq_run_hw_queue() can be executed by the new CPU
> (that is not present in cpumask) and raises the following warning:
> 
> WARN_ON(!cpumask_test_cpu(raw_smp_processor_id(), hctx->cpumask) &&
> cpu_online(hctx->next_cpu));
> 
> To fix this problem, this patch modifies blk_mq_hctx_next_cpu()
> to only return a CPU id present in cpumask.

So, this one isn't needed either.

Thanks.

-- 
tejun


[PATCH] blk-mq: Improvements to the hybrid polling sleep time calculation

2017-08-21 Thread sbates
From: Stephen Bates 

Hybrid polling currently uses half the average completion time as an
estimate of how long to poll for. We can improve upon this by noting
that polling before the minimum completion time makes no sense. Add a
sysfs entry to use this fact to improve CPU utilization in certain
cases.

At the same time the minimum is a bit too long to sleep for since we
must factor in OS wake time for the thread. For now allow the user to
set this via a second sysfs entry (in nanoseconds).

Testing this patch on Intel Optane SSDs showed that using the minimum
rather than half reduced CPU utilization from 59% to 38%. Tuning
this via the wake time adjustment allowed us to trade CPU load for
latency. For example

io_poll  delay  hyb_use_min adjust  latency CPU load
1-1 N/A N/A 8.4 100%
10  0   N/A 8.4 57%
10  1   0   10.334%
19  1   10009.9 37%
10  1   20008.4 47%
10  1   1   8.4 100%

Ideally we will extend this to auto-calculate the wake time rather
than have it set by the user.

Signed-off-by: Stephen Bates 
---
 block/blk-mq.c | 10 +
 block/blk-sysfs.c  | 58 ++
 include/linux/blkdev.h |  3 +++
 3 files changed, 71 insertions(+)

diff --git a/block/blk-mq.c b/block/blk-mq.c
index f84d145..f453a35 100644
--- a/block/blk-mq.c
+++ b/block/blk-mq.c
@@ -2739,6 +2739,16 @@ static unsigned long blk_mq_poll_nsecs(struct 
request_queue *q,
if (q->poll_stat[bucket].nr_samples)
ret = (q->poll_stat[bucket].mean + 1) / 2;
 
+   if (q->poll_hyb_use_min)
+   ret = max(ret, (unsigned long)q->poll_stat[bucket].min);
+
+   if (q->poll_hyb_adjust) {
+   if (ret >= q->poll_hyb_adjust)
+   ret -= q->poll_hyb_adjust;
+   else
+   return 0;
+   }
+
return ret;
 }
 
diff --git a/block/blk-sysfs.c b/block/blk-sysfs.c
index 27aceab..51e5853 100644
--- a/block/blk-sysfs.c
+++ b/block/blk-sysfs.c
@@ -395,6 +395,50 @@ static ssize_t queue_poll_delay_store(struct request_queue 
*q, const char *page,
return count;
 }
 
+static ssize_t queue_poll_hyb_use_min_show(struct request_queue *q, char *page)
+{
+   return sprintf(page, "%d\n", q->poll_hyb_use_min);
+}
+
+static ssize_t queue_poll_hyb_use_min_store(struct request_queue *q,
+   const char *page, size_t count)
+{
+   int err, val;
+
+   if (!q->mq_ops || !q->mq_ops->poll)
+   return -EINVAL;
+
+   err = kstrtoint(page, 10, &val);
+   if (err < 0)
+   return err;
+
+   q->poll_hyb_use_min = val;
+
+   return count;
+}
+
+static ssize_t queue_poll_hyb_adjust_show(struct request_queue *q, char *page)
+{
+   return sprintf(page, "%d\n", q->poll_hyb_adjust);
+}
+
+static ssize_t queue_poll_hyb_adjust_store(struct request_queue *q,
+  const char *page, size_t count)
+{
+   int err, val;
+
+   if (!q->mq_ops || !q->mq_ops->poll)
+   return -EINVAL;
+
+   err = kstrtoint(page, 10, &val);
+   if (err < 0)
+   return err;
+
+   q->poll_hyb_adjust = val;
+
+   return count;
+}
+
 static ssize_t queue_poll_show(struct request_queue *q, char *page)
 {
return queue_var_show(test_bit(QUEUE_FLAG_POLL, &q->queue_flags), page);
@@ -661,6 +705,18 @@ static ssize_t queue_dax_show(struct request_queue *q, 
char *page)
.store = queue_poll_delay_store,
 };
 
+static struct queue_sysfs_entry queue_poll_hyb_use_min_entry = {
+   .attr = {.name = "io_poll_hyb_use_min", .mode = S_IRUGO | S_IWUSR },
+   .show = queue_poll_hyb_use_min_show,
+   .store = queue_poll_hyb_use_min_store,
+};
+
+static struct queue_sysfs_entry queue_poll_hyb_adjust_entry = {
+   .attr = {.name = "io_poll_hyb_adjust", .mode = S_IRUGO | S_IWUSR },
+   .show = queue_poll_hyb_adjust_show,
+   .store = queue_poll_hyb_adjust_store,
+};
+
 static struct queue_sysfs_entry queue_wc_entry = {
.attr = {.name = "write_cache", .mode = S_IRUGO | S_IWUSR },
.show = queue_wc_show,
@@ -719,6 +775,8 @@ static ssize_t queue_dax_show(struct request_queue *q, char 
*page)
&queue_dax_entry.attr,
&queue_wb_lat_entry.attr,
&queue_poll_delay_entry.attr,
+   &queue_poll_hyb_use_min_entry.attr,
+   &queue_poll_hyb_adjust_entry.attr,
 #ifdef CONFIG_BLK_DEV_THROTTLING_LOW
&throtl_sample_time_entry.attr,
 #endif
diff --git a/include/linux/blkdev.h b/include/linux/blkdev.h
index f45f157..97b46ce 100644
--- a/include/linux/blkdev.h
+++ b/include/linux/blkdev.h
@@ -527,6 +527,9 @@ struct request_queue {
unsigned intrq_timeout;
int poll_nsec;
 
+   int

[PATCH 2/2] blk-mq: don't use WORK_CPU_UNBOUND

2017-08-21 Thread Laurent Vivier
cpumask is the list of CPUs present when the queue is built.
If a new CPU is hotplugged, this list is not updated,
and when the scheduler asks for a CPU id, blk_mq_hctx_next_cpu()
can return WORK_CPU_UNBOUND.
And in this case _blk_mq_run_hw_queue() can be executed by the new CPU
(that is not present in cpumask) and raises the following warning:

WARN_ON(!cpumask_test_cpu(raw_smp_processor_id(), hctx->cpumask) &&
cpu_online(hctx->next_cpu));

To fix this problem, this patch modifies blk_mq_hctx_next_cpu()
to only return a CPU id present in cpumask.

Signed-off-by: Laurent Vivier 
---
 block/blk-mq.c | 3 ---
 1 file changed, 3 deletions(-)

diff --git a/block/blk-mq.c b/block/blk-mq.c
index 4603b115e234..bdac1e654814 100644
--- a/block/blk-mq.c
+++ b/block/blk-mq.c
@@ -1126,9 +1126,6 @@ static void __blk_mq_run_hw_queue(struct blk_mq_hw_ctx 
*hctx)
  */
 static int blk_mq_hctx_next_cpu(struct blk_mq_hw_ctx *hctx)
 {
-   if (hctx->queue->nr_hw_queues == 1)
-   return WORK_CPU_UNBOUND;
-
if (--hctx->next_cpu_batch <= 0) {
int next_cpu;
 
-- 
2.13.5



[PATCH 1/2] powerpc/workqueue: update list of possible CPUs

2017-08-21 Thread Laurent Vivier
In wq_numa_init() a list of NUMA nodes with their list of possible CPUs
is built.

Unfortunately, on powerpc, the Firmware is only able to provide the
node of a CPU if the CPU is present. So, in our case (possible CPU)
CPU ids are known, but as the CPU is not present, the node id is
unknown and all the unplugged CPUs are attached to node 0.

When we hotplugged CPU, there can be an inconsistency between
the node id known by the workqueue, and the real node id of
the CPU.

This patch adds a hotplug handler to add the hotplugged CPU to
update the node entry in wq_numa_possible_cpumask array.

Signed-off-by: Laurent Vivier 
---
 arch/powerpc/kernel/smp.c |  1 +
 include/linux/workqueue.h |  1 +
 kernel/workqueue.c| 21 +
 3 files changed, 23 insertions(+)

diff --git a/arch/powerpc/kernel/smp.c b/arch/powerpc/kernel/smp.c
index 8d3320562c70..abc533146514 100644
--- a/arch/powerpc/kernel/smp.c
+++ b/arch/powerpc/kernel/smp.c
@@ -964,6 +964,7 @@ void start_secondary(void *unused)
 
set_numa_node(numa_cpu_lookup_table[cpu]);
set_numa_mem(local_memory_node(numa_cpu_lookup_table[cpu]));
+   wq_numa_add_possible_cpu(cpu);
 
smp_wmb();
notify_cpu_starting(cpu);
diff --git a/include/linux/workqueue.h b/include/linux/workqueue.h
index db6dc9dc0482..4ef030dae22c 100644
--- a/include/linux/workqueue.h
+++ b/include/linux/workqueue.h
@@ -644,6 +644,7 @@ int workqueue_online_cpu(unsigned int cpu);
 int workqueue_offline_cpu(unsigned int cpu);
 #endif
 
+void wq_numa_add_possible_cpu(unsigned int cpu);
 int __init workqueue_init_early(void);
 int __init workqueue_init(void);
 
diff --git a/kernel/workqueue.c b/kernel/workqueue.c
index ca937b0c3a96..d1a99e25d5da 100644
--- a/kernel/workqueue.c
+++ b/kernel/workqueue.c
@@ -5486,6 +5486,27 @@ static inline void wq_watchdog_init(void) { }
 
 #endif /* CONFIG_WQ_WATCHDOG */
 
+void wq_numa_add_possible_cpu(unsigned int cpu)
+{
+   int node;
+
+   if (num_possible_nodes() <= 1)
+   return;
+
+   if (wq_disable_numa)
+   return;
+
+   if (!wq_numa_enabled)
+   return;
+
+   node = cpu_to_node(cpu);
+   if (node == NUMA_NO_NODE)
+   return;
+
+   cpumask_set_cpu(cpu, wq_numa_possible_cpumask[node]);
+}
+EXPORT_SYMBOL_GPL(wq_numa_add_possible_cpu);
+
 static void __init wq_numa_init(void)
 {
cpumask_var_t *tbl;
-- 
2.13.5



Re: [PATCH 5/6] platform/x86: make device_attribute const

2017-08-21 Thread Thadeu Lima de Souza Cascardo
For classmate-laptop.c

Acked-by: Thadeu Lima de Souza Cascardo 



Re: [PATCH 0/6] drivers: make device_attribute const

2017-08-21 Thread Bhumika Goyal
On Mon, Aug 21, 2017 at 5:58 PM, Rafael J. Wysocki  wrote:
> On Mon, Aug 21, 2017 at 1:43 PM, Bhumika Goyal  wrote:
>> Make these const. Done using Coccinelle.
>>
>> @match disable optional_qualifier@
>> identifier s;
>> @@
>> static struct device_attribute s = {...};
>>
>> @ref@
>> position p;
>> identifier match.s;
>> @@
>> s@p
>>
>> @good1@
>> identifier match.s;
>> expression e1;
>> position ref.p;
>> @@
>> device_remove_file(e1,&s@p,...)
>>
>> @good2@
>> identifier match.s;
>> expression e1;
>> position ref.p;
>> @@
>> device_create_file(e1,&s@p,...)
>>
>>
>> @bad depends on  !good1 && !good2@
>> position ref.p;
>> identifier match.s;
>> @@
>> s@p
>>
>> @depends on forall !bad disable optional_qualifier@
>> identifier match.s;
>> @@
>> static
>> + const
>> struct device_attribute s;
>>
>> Bhumika Goyal (6):
>>   ACPI: make device_attribute const
>>   nbd: make device_attribute const
>>   hid: make device_attribute const
>>   qlogic:  make device_attribute const
>>   platform/x86: make device_attribute const
>>   power: supply: make device_attribute const
>
> It would be better to send these patches separately, because they
> touch code maintained by different people and I guess no one will take
> the whole series.
>
> I'll take care of the ACPI one, but the rest needs to go in via their
> proper trees.
>

Thanks for the note. From now onwards, I will send it separately
depending on the maintainers. But is possible please consider it this
time.

Thanks,
Bhumika

> Thanks,
> Rafael


Re: [PATCH 0/6] drivers: make device_attribute const

2017-08-21 Thread Bhumika Goyal
On Mon, Aug 21, 2017 at 5:13 PM, Bhumika Goyal  wrote:
> Make these const. Done using Coccinelle.
>
> @match disable optional_qualifier@
> identifier s;
> @@
> static struct device_attribute s = {...};
>
> @ref@
> position p;
> identifier match.s;
> @@
> s@p
>
> @good1@
> identifier match.s;
> expression e1;
> position ref.p;
> @@
> device_remove_file(e1,&s@p,...)
>
> @good2@
> identifier match.s;
> expression e1;
> position ref.p;
> @@
> device_create_file(e1,&s@p,...)
>
>
> @bad depends on  !good1 && !good2@
> position ref.p;
> identifier match.s;
> @@
> s@p
>
> @depends on forall !bad disable optional_qualifier@
> identifier match.s;
> @@
> static
> + const
> struct device_attribute s;
>
> Bhumika Goyal (6):
>   ACPI: make device_attribute const
>   nbd: make device_attribute const
>   hid: make device_attribute const
>   qlogic:  make device_attribute const
>   platform/x86: make device_attribute const
>   power: supply: make device_attribute const
>

Hello all,

The patches are all independent, so please take what seems relevant.

Thanks,
Bhumika

>  drivers/acpi/battery.c   | 2 +-
>  drivers/acpi/sbs.c   | 2 +-
>  drivers/block/nbd.c  | 2 +-
>  drivers/hid/hid-core.c   | 2 +-
>  drivers/net/ethernet/qlogic/netxen/netxen_nic_main.c | 4 ++--
>  drivers/net/ethernet/qlogic/qlcnic/qlcnic_sysfs.c| 6 +++---
>  drivers/platform/x86/classmate-laptop.c  | 6 +++---
>  drivers/platform/x86/intel-rst.c | 4 ++--
>  drivers/power/supply/olpc_battery.c  | 2 +-
>  9 files changed, 15 insertions(+), 15 deletions(-)
>
> --
> 1.9.1
>


Re: [PATCH 0/6] drivers: make device_attribute const

2017-08-21 Thread Rafael J. Wysocki
On Mon, Aug 21, 2017 at 1:43 PM, Bhumika Goyal  wrote:
> Make these const. Done using Coccinelle.
>
> @match disable optional_qualifier@
> identifier s;
> @@
> static struct device_attribute s = {...};
>
> @ref@
> position p;
> identifier match.s;
> @@
> s@p
>
> @good1@
> identifier match.s;
> expression e1;
> position ref.p;
> @@
> device_remove_file(e1,&s@p,...)
>
> @good2@
> identifier match.s;
> expression e1;
> position ref.p;
> @@
> device_create_file(e1,&s@p,...)
>
>
> @bad depends on  !good1 && !good2@
> position ref.p;
> identifier match.s;
> @@
> s@p
>
> @depends on forall !bad disable optional_qualifier@
> identifier match.s;
> @@
> static
> + const
> struct device_attribute s;
>
> Bhumika Goyal (6):
>   ACPI: make device_attribute const
>   nbd: make device_attribute const
>   hid: make device_attribute const
>   qlogic:  make device_attribute const
>   platform/x86: make device_attribute const
>   power: supply: make device_attribute const

It would be better to send these patches separately, because they
touch code maintained by different people and I guess no one will take
the whole series.

I'll take care of the ACPI one, but the rest needs to go in via their
proper trees.

Thanks,
Rafael


[PATCH 1/6] ACPI: make device_attribute const

2017-08-21 Thread Bhumika Goyal
Make these const as they are only passed as an argument to the function
device_create_file and device_remove_file and the corresponding
arguments are of type const.
Done using Coccinelle

Signed-off-by: Bhumika Goyal 
---
 drivers/acpi/battery.c | 2 +-
 drivers/acpi/sbs.c | 2 +-
 2 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/acpi/battery.c b/drivers/acpi/battery.c
index 1cbb88d..13e7b56 100644
--- a/drivers/acpi/battery.c
+++ b/drivers/acpi/battery.c
@@ -620,7 +620,7 @@ static ssize_t acpi_battery_alarm_store(struct device *dev,
return count;
 }
 
-static struct device_attribute alarm_attr = {
+static const struct device_attribute alarm_attr = {
.attr = {.name = "alarm", .mode = 0644},
.show = acpi_battery_alarm_show,
.store = acpi_battery_alarm_store,
diff --git a/drivers/acpi/sbs.c b/drivers/acpi/sbs.c
index a184637..a2428e9 100644
--- a/drivers/acpi/sbs.c
+++ b/drivers/acpi/sbs.c
@@ -474,7 +474,7 @@ static ssize_t acpi_battery_alarm_store(struct device *dev,
return count;
 }
 
-static struct device_attribute alarm_attr = {
+static const struct device_attribute alarm_attr = {
.attr = {.name = "alarm", .mode = 0644},
.show = acpi_battery_alarm_show,
.store = acpi_battery_alarm_store,
-- 
1.9.1



[PATCH 2/6] nbd: make device_attribute const

2017-08-21 Thread Bhumika Goyal
Make this const as is is only passed as an argument to the
function device_create_file and device_remove_file and the corresponding
arguments are of type const.
Done using Coccinelle

Signed-off-by: Bhumika Goyal 
---
 drivers/block/nbd.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/block/nbd.c b/drivers/block/nbd.c
index 5bdf923..49d7763 100644
--- a/drivers/block/nbd.c
+++ b/drivers/block/nbd.c
@@ -165,7 +165,7 @@ static ssize_t pid_show(struct device *dev,
return sprintf(buf, "%d\n", task_pid_nr(nbd->task_recv));
 }
 
-static struct device_attribute pid_attr = {
+static const struct device_attribute pid_attr = {
.attr = { .name = "pid", .mode = S_IRUGO},
.show = pid_show,
 };
-- 
1.9.1



[PATCH 3/6] hid: make device_attribute const

2017-08-21 Thread Bhumika Goyal
Make this const as it is only passed as an argument to the
function device_create_file and device_remove_file and the corresponding
arguments are of type const.
Done using Coccinelle

Signed-off-by: Bhumika Goyal 
---
 drivers/hid/hid-core.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/hid/hid-core.c b/drivers/hid/hid-core.c
index 9bc9116..24e929c 100644
--- a/drivers/hid/hid-core.c
+++ b/drivers/hid/hid-core.c
@@ -1662,7 +1662,7 @@ static bool hid_hiddev(struct hid_device *hdev)
.size = HID_MAX_DESCRIPTOR_SIZE,
 };
 
-static struct device_attribute dev_attr_country = {
+static const struct device_attribute dev_attr_country = {
.attr = { .name = "country", .mode = 0444 },
.show = show_country,
 };
-- 
1.9.1



[PATCH 6/6] power: supply: make device_attribute const

2017-08-21 Thread Bhumika Goyal
Make these const as they are only passed as an argument to the
function device_create_file and device_remove_file and the corresponding
arguments are of type const.
Done using Coccinelle.

Signed-off-by: Bhumika Goyal 
---
 drivers/power/supply/olpc_battery.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/power/supply/olpc_battery.c 
b/drivers/power/supply/olpc_battery.c
index fc20ca3..3bc2eea 100644
--- a/drivers/power/supply/olpc_battery.c
+++ b/drivers/power/supply/olpc_battery.c
@@ -559,7 +559,7 @@ static ssize_t olpc_bat_error_read(struct device *dev,
return sprintf(buf, "%d\n", ec_byte);
 }
 
-static struct device_attribute olpc_bat_error = {
+static const struct device_attribute olpc_bat_error = {
.attr = {
.name = "error",
.mode = S_IRUGO,
-- 
1.9.1



[PATCH 4/6] qlogic: make device_attribute const

2017-08-21 Thread Bhumika Goyal
Make these const as they are only passed as an argument to the
function device_create_file and device_remove_file and the corresponding
arguments are of type const.
Done using Coccinelle

Signed-off-by: Bhumika Goyal 
---
 drivers/net/ethernet/qlogic/netxen/netxen_nic_main.c | 4 ++--
 drivers/net/ethernet/qlogic/qlcnic/qlcnic_sysfs.c| 6 +++---
 2 files changed, 5 insertions(+), 5 deletions(-)

diff --git a/drivers/net/ethernet/qlogic/netxen/netxen_nic_main.c 
b/drivers/net/ethernet/qlogic/netxen/netxen_nic_main.c
index 827de83..f2e8de6 100644
--- a/drivers/net/ethernet/qlogic/netxen/netxen_nic_main.c
+++ b/drivers/net/ethernet/qlogic/netxen/netxen_nic_main.c
@@ -2828,7 +2828,7 @@ static void netxen_nic_poll_controller(struct net_device 
*netdev)
return sprintf(buf, "%d\n", bridged_mode);
 }
 
-static struct device_attribute dev_attr_bridged_mode = {
+static const struct device_attribute dev_attr_bridged_mode = {
.attr = {.name = "bridged_mode", .mode = (S_IRUGO | S_IWUSR)},
.show = netxen_show_bridged_mode,
.store = netxen_store_bridged_mode,
@@ -2860,7 +2860,7 @@ static void netxen_nic_poll_controller(struct net_device 
*netdev)
!!(adapter->flags & NETXEN_NIC_DIAG_ENABLED));
 }
 
-static struct device_attribute dev_attr_diag_mode = {
+static const struct device_attribute dev_attr_diag_mode = {
.attr = {.name = "diag_mode", .mode = (S_IRUGO | S_IWUSR)},
.show = netxen_show_diag_mode,
.store = netxen_store_diag_mode,
diff --git a/drivers/net/ethernet/qlogic/qlcnic/qlcnic_sysfs.c 
b/drivers/net/ethernet/qlogic/qlcnic/qlcnic_sysfs.c
index 82fcb83..287d89d 100644
--- a/drivers/net/ethernet/qlogic/qlcnic/qlcnic_sysfs.c
+++ b/drivers/net/ethernet/qlogic/qlcnic/qlcnic_sysfs.c
@@ -1174,19 +1174,19 @@ static ssize_t 
qlcnic_83xx_sysfs_flash_write_handler(struct file *filp,
return size;
 }
 
-static struct device_attribute dev_attr_bridged_mode = {
+static const struct device_attribute dev_attr_bridged_mode = {
.attr = {.name = "bridged_mode", .mode = (S_IRUGO | S_IWUSR)},
.show = qlcnic_show_bridged_mode,
.store = qlcnic_store_bridged_mode,
 };
 
-static struct device_attribute dev_attr_diag_mode = {
+static const struct device_attribute dev_attr_diag_mode = {
.attr = {.name = "diag_mode", .mode = (S_IRUGO | S_IWUSR)},
.show = qlcnic_show_diag_mode,
.store = qlcnic_store_diag_mode,
 };
 
-static struct device_attribute dev_attr_beacon = {
+static const struct device_attribute dev_attr_beacon = {
.attr = {.name = "beacon", .mode = (S_IRUGO | S_IWUSR)},
.show = qlcnic_show_beacon,
.store = qlcnic_store_beacon,
-- 
1.9.1



[PATCH 5/6] platform/x86: make device_attribute const

2017-08-21 Thread Bhumika Goyal
Make these const as they are only passed as an argument to the
function device_create_file and device_remove_file and the corresponding
arguments are of type const.
Done using Coccinelle

Signed-off-by: Bhumika Goyal 
---
 drivers/platform/x86/classmate-laptop.c | 6 +++---
 drivers/platform/x86/intel-rst.c| 4 ++--
 2 files changed, 5 insertions(+), 5 deletions(-)

diff --git a/drivers/platform/x86/classmate-laptop.c 
b/drivers/platform/x86/classmate-laptop.c
index 55cf10b..d3715e2 100644
--- a/drivers/platform/x86/classmate-laptop.c
+++ b/drivers/platform/x86/classmate-laptop.c
@@ -254,7 +254,7 @@ static ssize_t cmpc_accel_sensitivity_store_v4(struct 
device *dev,
return strnlen(buf, count);
 }
 
-static struct device_attribute cmpc_accel_sensitivity_attr_v4 = {
+static const struct device_attribute cmpc_accel_sensitivity_attr_v4 = {
.attr = { .name = "sensitivity", .mode = 0660 },
.show = cmpc_accel_sensitivity_show_v4,
.store = cmpc_accel_sensitivity_store_v4
@@ -303,7 +303,7 @@ static ssize_t cmpc_accel_g_select_store_v4(struct device 
*dev,
return strnlen(buf, count);
 }
 
-static struct device_attribute cmpc_accel_g_select_attr_v4 = {
+static const struct device_attribute cmpc_accel_g_select_attr_v4 = {
.attr = { .name = "g_select", .mode = 0660 },
.show = cmpc_accel_g_select_show_v4,
.store = cmpc_accel_g_select_store_v4
@@ -599,7 +599,7 @@ static ssize_t cmpc_accel_sensitivity_store(struct device 
*dev,
return strnlen(buf, count);
 }
 
-static struct device_attribute cmpc_accel_sensitivity_attr = {
+static const struct device_attribute cmpc_accel_sensitivity_attr = {
.attr = { .name = "sensitivity", .mode = 0660 },
.show = cmpc_accel_sensitivity_show,
.store = cmpc_accel_sensitivity_store
diff --git a/drivers/platform/x86/intel-rst.c b/drivers/platform/x86/intel-rst.c
index 7344d84..760a9bf 100644
--- a/drivers/platform/x86/intel-rst.c
+++ b/drivers/platform/x86/intel-rst.c
@@ -65,7 +65,7 @@ static ssize_t irst_store_wakeup_events(struct device *dev,
return count;
 }
 
-static struct device_attribute irst_wakeup_attr = {
+static const struct device_attribute irst_wakeup_attr = {
.attr = { .name = "wakeup_events", .mode = 0600 },
.show = irst_show_wakeup_events,
.store = irst_store_wakeup_events
@@ -111,7 +111,7 @@ static ssize_t irst_store_wakeup_time(struct device *dev,
return count;
 }
 
-static struct device_attribute irst_timeout_attr = {
+static const struct device_attribute irst_timeout_attr = {
.attr = { .name = "wakeup_time", .mode = 0600 },
.show = irst_show_wakeup_time,
.store = irst_store_wakeup_time
-- 
1.9.1



[PATCH 0/6] drivers: make device_attribute const

2017-08-21 Thread Bhumika Goyal
Make these const. Done using Coccinelle.

@match disable optional_qualifier@
identifier s;
@@
static struct device_attribute s = {...};

@ref@
position p;
identifier match.s;
@@
s@p

@good1@
identifier match.s;
expression e1;
position ref.p;
@@
device_remove_file(e1,&s@p,...)

@good2@
identifier match.s;
expression e1;
position ref.p;
@@
device_create_file(e1,&s@p,...)


@bad depends on  !good1 && !good2@
position ref.p;
identifier match.s;
@@
s@p

@depends on forall !bad disable optional_qualifier@
identifier match.s;
@@
static
+ const
struct device_attribute s;

Bhumika Goyal (6):
  ACPI: make device_attribute const
  nbd: make device_attribute const
  hid: make device_attribute const
  qlogic:  make device_attribute const
  platform/x86: make device_attribute const
  power: supply: make device_attribute const

 drivers/acpi/battery.c   | 2 +-
 drivers/acpi/sbs.c   | 2 +-
 drivers/block/nbd.c  | 2 +-
 drivers/hid/hid-core.c   | 2 +-
 drivers/net/ethernet/qlogic/netxen/netxen_nic_main.c | 4 ++--
 drivers/net/ethernet/qlogic/qlcnic/qlcnic_sysfs.c| 6 +++---
 drivers/platform/x86/classmate-laptop.c  | 6 +++---
 drivers/platform/x86/intel-rst.c | 4 ++--
 drivers/power/supply/olpc_battery.c  | 2 +-
 9 files changed, 15 insertions(+), 15 deletions(-)

-- 
1.9.1



Re: [PATCH V5 11/13] mmc: block: Add CQE support

2017-08-21 Thread Adrian Hunter
On 20/08/17 15:13, Linus Walleij wrote:
> On Thu, Aug 10, 2017 at 2:08 PM, Adrian Hunter  
> wrote:
> 
>> Add CQE support to the block driver, including:
>> - optionally using DCMD for flush requests
>> - manually issuing discard requests
> 
> "Manually" has no place in computer code IMO since there is no
> man there. But I may be especially grumpy. But I don't understand the
> comment anyway.

CQE automatically sends the commands to complete requests.  However it only
supports reads / writes and so-called "direct commands" (DCMD).  Furthermore
DCMD is limited to one command at a time, but discards require 3 commands.
That makes issuing discards through CQE very awkward, but some CQE's don't
support DCMD anyway.  So for discards, the existing non-CQE approach is
taken, where the mmc core code issues the 3 commands one at a time i.e.
mmc_erase().

> 
>> - issuing read / write requests to the CQE
>> - supporting block-layer timeouts
>> - handling recovery
>> - supporting re-tuning
>>
>> CQE offers 25% - 50% better random multi-threaded I/O.  There is a slight
>> (e.g. 2%) drop in sequential read speed but no observable change to 
>> sequential
>> write.
>>
>> Signed-off-by: Adrian Hunter 
> 
> This commit message seriously needs to detail the design of the
> request handling thread/engine.
> 
> So adding a new (in effect) invasive block driver needs to at least
> be CC:ed to the block maintainers so we don't sneak anything like
> that in under the radar.
> 
> This duplicates the "thread that sucks out requests" from the
> MMC core, and doubles the problem of replacing it with
> something like blk-mq.

We need to start with a legacy API because people want to backport CQ to
earlier kernels (we really need to get features upstream more quickly), but
blk-mq has been evolving a lot (e.g. elevator support), so backporters face
having either something quite different from upstream or trying to backport
great chunks of the block layer.

We also don't know how blk-mq will perform so it would be prudent to start
with support for both the legacy API and blk-mq (as scsi does) so that we
can find out first.

A loop to remove requests from the queue is normal e.g. scsi_request_fn()
and taking a consistent approach with the existing mmc thread does not
double the the problem of blk-mq implementation, since the same approach can
be used for both.

> 
> Especially these two snippets I would personally NOT merge
> without the explicit consent of a block layer maintainer:
> 
>> +static void mmc_cqe_request_fn(struct request_queue *q)
>> +{
>> +   struct mmc_queue *mq = q->queuedata;
>> +   struct request *req;
>> +
>> +   if (!mq) {
>> +   while ((req = blk_fetch_request(q)) != NULL) {
>> +   req->rq_flags |= RQF_QUIET;
>> +   __blk_end_request_all(req, BLK_STS_IOERR);
>> +   }
>> +   return;
>> +   }
>> +
>> +   if (mq->asleep && !mq->cqe_busy)
>> +   wake_up_process(mq->thread);
>> +}
> (...)
>> +static int mmc_cqe_thread(void *d)
>> +{
>> +   struct mmc_queue *mq = d;
>> +   struct request_queue *q = mq->queue;
>> +   struct mmc_card *card = mq->card;
>> +   struct mmc_host *host = card->host;
>> +   unsigned long flags;
>> +   int get_put = 0;
>> +
>> +   current->flags |= PF_MEMALLOC;
>> +
>> +   down(&mq->thread_sem);
>> +   spin_lock_irqsave(q->queue_lock, flags);
>> +   while (1) {
>> +   struct request *req = NULL;
>> +   enum mmc_issue_type issue_type;
>> +   bool retune_ok = false;
>> +
>> +   if (mq->cqe_recovery_needed) {
>> +   spin_unlock_irqrestore(q->queue_lock, flags);
>> +   mmc_blk_cqe_recovery(mq);
>> +   spin_lock_irqsave(q->queue_lock, flags);
>> +   mq->cqe_recovery_needed = false;
>> +   }
>> +
>> +   set_current_state(TASK_INTERRUPTIBLE);
>> +
>> +   if (!kthread_should_stop())
>> +   req = blk_peek_request(q);
> 
> Why are you using blk_peek_request() instead of blk_fetch_request()
> when the other thread just goes with fetch?

Because we are not always able to start the request.

> Am I right in assuming that also this request queue replies on the
> implicit semantic that blk_peek_request(q) shall return NULL
> if there are no requests in the queue, and that it is pulling out a
> few NULLs to flush the request-handling machinery? Or did it
> fix this? (Put that in the commit message in that case.)

CQ is different.  Because read / write requests are processed asynchronously
we can keep preparing and issuing more requests until the hardware queue is
full.  That is, we don't have to wait for the first request to complete
before preparing and issuing the second request, and so on.

However, the existing thread's need to issue a 

Re: [PATCH V5 02/13] mmc: core: Add members to mmc_request and mmc_data for CQE's

2017-08-21 Thread Adrian Hunter
On 20/08/17 14:29, Linus Walleij wrote:
> On Thu, Aug 10, 2017 at 2:08 PM, Adrian Hunter  
> wrote:
> 
>> Most of the information needed to issue requests to a CQE is already in
>> struct mmc_request and struct mmc_data. Add data block address, some flags,
>> and the task id (tag), and allow for cmd being NULL which it is for CQE
>> tasks.
>>
>> Signed-off-by: Adrian Hunter 
> 
>> +   int tag;
> 
> Is this consistent with the block layer idea of "tag"?

It is named "tag" because it is the block layer tag.

> 
> I am asking because I get confused.
> 
> I thought the block layers idea of a "tag" was some metadata
> associated with a request. Not that I am a block layer expert.

The block layer tag is a unique number to identify the task to the hardware
queue.  It is typically a number from 0 up to queue depth - 1.

> Why can't we just name this "task_id" if that is what it is in
> Linux terms? Does the specification call it "tag"?

The eMMC specification calls it "task id" but the block layer calls it
"tag".  I went with "tag" to be easier for block layer people to understand.


Re: [PATCH V2 01/20] blk-mq-sched: fix scheduler bad performance

2017-08-21 Thread Ming Lei
On Wed, Aug 09, 2017 at 12:11:18AM -0700, Omar Sandoval wrote:
> On Wed, Aug 09, 2017 at 10:32:52AM +0800, Ming Lei wrote:
> > On Wed, Aug 9, 2017 at 8:11 AM, Omar Sandoval  wrote:
> > > On Sat, Aug 05, 2017 at 02:56:46PM +0800, Ming Lei wrote:
> > >> When hw queue is busy, we shouldn't take requests from
> > >> scheduler queue any more, otherwise IO merge will be
> > >> difficult to do.
> > >>
> > >> This patch fixes the awful IO performance on some
> > >> SCSI devices(lpfc, qla2xxx, ...) when mq-deadline/kyber
> > >> is used by not taking requests if hw queue is busy.
> > >
> > > Jens added this behavior in 64765a75ef25 ("blk-mq-sched: ask scheduler
> > > for work, if we failed dispatching leftovers"). That change was a big
> > > performance improvement, but we didn't figure out why. We'll need to dig
> > > up whatever test Jens was doing to make sure it doesn't regress.
> > 
> > Not found info about Jen's test case on this commit from google.
> > 
> > Maybe Jens could provide some input about your test case?
> 
> Okay I found my previous discussion with Jens (it was an off-list
> discussion). The test case was xfs/297 from xfstests: after
> 64765a75ef25, the test went from taking ~300 seconds to ~200 seconds on
> his SCSI device.
> 
> > In theory, if hw queue is busy and requests are left in ->dispatch,
> > we should not have continued to dequeue requests from sw/scheduler queue
> > any more. Otherwise, IO merge can be hurt much. At least on SCSI devices,
> > this improved much on sequential I/O,  at least 3X of sequential
> > read is increased on lpfc with this patch, in case of mq-deadline.
> 
> Right, your patch definitely makes more sense intuitively.
> 
> > Or are there other special cases in which we still need
> > to push requests hard into a busy hardware?
> 
> xfs/297 does a lot of fsyncs and hence a lot of flushes, that could be
> the special case.

OK, thanks a lot for this input, and I will run xfs/297 with this
patchset to see if performance is degraded.

-- 
Ming