Re: [PATCH IMPROVEMENT] block, bfq: limit sectors served with interactive weight raising

2018-01-12 Thread Jens Axboe
On 1/12/18 3:20 AM, Paolo Valente wrote:
> 
> 
>> On 12 Jan 2018, at 11:15, Holger Hoffstätte wrote:
>>
>> On 01/12/18 06:58, Paolo Valente wrote:
>>>
>>>
>>>> On 28 Dec 2017, at 15:00, Holger Hoffstätte wrote:
>>>>
>>>>
>>>> On 12/28/17 12:19, Paolo Valente wrote:
>>>> (snip half a tech report ;)
>>>>
>>>> So either this or the previous patch ("limit tags for writes and async I/O")
>>>> can lead to a hard, unrecoverable hang with heavy writes. Since I couldn't
>>>> log into the affected system anymore I couldn't get any stack traces, blk-mq
>>>> debug output etc. but there was nothing in dmesg/on the console, so it
>>>> wasn't a BUG/OOPS.
>>>>
>>>> -h
>>>
>>> Hi Holger,
>>> if, as I guess, this problem hasn't gone away for you, I have two
>>> requests:
>>> 1) could you share your exact test
>>> 2) if nothing happens in my systems with your test, would you be
>>> willing to retry with the dev version of bfq?  It should be able to
>>> tell us what leads to your hang.  If you are willing to do this test,
>>> I'll prepare a branch with everything already configured for you.
>>
>> Hi,
>>
>> thanks for following up but there's no need for any of that; it turned out
>> to be something else since I got the same hang without those patches at
>> least once (during a btrfs balance, even though it didn't look like btrfs'
>> fault directly; more like block/mm/helpers).
>>
>> So on January 7 I posted to linux-block et al. where I said
>> "So this turned out to be something else, sorry for the false alarm."
>> but apparently that didn't make it through since it's not in the
>> archives either. Sorry.
>>
>> Long story short, the good news is that I've been running with both patches
>> since then without any issue. :)
>>
> 
> Wow, what a relief! :)
> 
> So, Jens, since the only reported issue is gone, can you please consider
> queueing this patch and the other pending one [1]?  They are both
> critical for bfq performance.

Please just resend them.

-- 
Jens Axboe



Re: [PATCH IMPROVEMENT] block, bfq: limit sectors served with interactive weight raising

2018-01-12 Thread Paolo Valente


> On 12 Jan 2018, at 11:15, Holger Hoffstätte wrote:
> 
> On 01/12/18 06:58, Paolo Valente wrote:
>> 
>> 
>>> On 28 Dec 2017, at 15:00, Holger Hoffstätte wrote:
>>> 
>>> 
>>> On 12/28/17 12:19, Paolo Valente wrote:
>>> (snip half a tech report ;)
>>> 
>>> So either this or the previous patch ("limit tags for writes and async I/O")
>>> can lead to a hard, unrecoverable hang with heavy writes. Since I couldn't
>>> log into the affected system anymore I couldn't get any stack traces, blk-mq
>>> debug output etc. but there was nothing in dmesg/on the console, so it
>>> wasn't a BUG/OOPS.
>>> 
>>> -h
>> 
>> Hi Holger,
>> if, as I guess, this problem hasn't gone away for you, I have two
>> requests:
>> 1) could you share your exact test
>> 2) if nothing happens in my systems with your test, would you be
>> willing to retry with the dev version of bfq?  It should be able to
>> tell us what leads to your hang.  If you are willing to do this test,
>> I'll prepare a branch with everything already configured for you.
> 
> Hi,
> 
> thanks for following up but there's no need for any of that; it turned out
> to be something else since I got the same hang without those patches at
> least once (during a btrfs balance, even though it didn't look like btrfs'
> fault directly; more like block/mm/helpers).
> 
> So on January 7 I posted to linux-block et al. where I said
> "So this turned out to be something else, sorry for the false alarm."
> but apparently that didn't make it through since it's not in the
> archives either. Sorry.
> 
> Long story short, the good news is that I've been running with both patches
> since then without any issue. :)
> 

Wow, what a relief! :)

So, Jens, since the only reported issue is gone, can you please consider
queueing this patch and the other pending one [1]?  They are both
critical for bfq performance.

Thanks,
Paolo

[1] https://www.spinics.net/lists/kernel/msg2684463.html

> cheers
> Holger



Re: [PATCH IMPROVEMENT] block, bfq: limit sectors served with interactive weight raising

2018-01-12 Thread Holger Hoffstätte
On 01/12/18 06:58, Paolo Valente wrote:
> 
> 
>> On 28 Dec 2017, at 15:00, Holger Hoffstätte wrote:
>>
>>
>> On 12/28/17 12:19, Paolo Valente wrote:
>> (snip half a tech report ;)
>>
>> So either this or the previous patch ("limit tags for writes and async I/O")
>> can lead to a hard, unrecoverable hang with heavy writes. Since I couldn't
>> log into the affected system anymore I couldn't get any stack traces, blk-mq
>> debug output etc. but there was nothing in dmesg/on the console, so it
>> wasn't a BUG/OOPS.
>>
>> -h
> 
> Hi Holger,
> if, as I guess, this problem hasn't gone away for you, I have two
> requests:
> 1) could you share your exact test
> 2) if nothing happens in my systems with your test, would you be
> willing to retry with the dev version of bfq?  It should be able to
> tell us what leads to your hang.  If you are willing to do this test,
> I'll prepare a branch with everything already configured for you.

Hi,

thanks for following up but there's no need for any of that; it turned out
to be something else since I got the same hang without those patches at
least once (during a btrfs balance, even though it didn't look like btrfs'
fault directly; more like block/mm/helpers).

So on January 7 I posted to linux-block et al. where I said
"So this turned out to be something else, sorry for the false alarm."
but apparently that didn't make it through since it's not in the
archives either. Sorry.

Long story short, the good news is that I've been running with both patches
since then without any issue. :)

cheers
Holger


Re: [PATCH IMPROVEMENT] block, bfq: limit sectors served with interactive weight raising

2018-01-11 Thread Paolo Valente


> On 28 Dec 2017, at 15:00, Holger Hoffstätte wrote:
> 
> 
> On 12/28/17 12:19, Paolo Valente wrote:
> (snip half a tech report ;)
> 
> So either this or the previous patch ("limit tags for writes and async I/O")
> can lead to a hard, unrecoverable hang with heavy writes. Since I couldn't
> log into the affected system anymore I couldn't get any stack traces, blk-mq
> debug output etc. but there was nothing in dmesg/on the console, so it
> wasn't a BUG/OOPS.
> 
> -h

Hi Holger,
if, as I guess, this problem hasn't gone away for you, I have two
requests:
1) could you share your exact test
2) if nothing happens in my systems with your test, would you be
willing to retry with the dev version of bfq?  It should be able to
tell us what leads to your hang.  If you are willing to do this test,
I'll prepare a branch with everything already configured for you.

Thanks,
Paolo

Re: [PATCH IMPROVEMENT] block, bfq: limit sectors served with interactive weight raising

2017-12-29 Thread Oleksandr Natalenko
Hi.

On Thursday, 28 December 2017 12:19:17 CET Paolo Valente wrote:
> To maximise responsiveness, BFQ raises the weight, and performs device
> idling, for bfq_queues associated with processes deemed as
> interactive. In particular, weight raising has a maximum duration,
> equal to the time needed to start a large application. If a
> weight-raised process goes on doing I/O beyond this maximum duration,
> it loses weight-raising.
> 
> This mechanism is evidently vulnerable to the following false
> positives: I/O-bound applications that will go on doing I/O for much
> longer than the duration of weight-raising. These applications have
> basically no benefit from being weight-raised at the beginning of
> their I/O. On the opposite end, while being weight-raised, these
> applications
> a) unjustly steal throughput to applications that may truly need
> low latency;
> b) make BFQ uselessly perform device idling; device idling results
> in loss of device throughput with most flash-based storage, and may
> increase latencies when used purposelessly.
> 
> This commit adds a countermeasure to reduce both the above
> problems. To introduce this countermeasure, we provide the following
> extra piece of information (full details in the comments added by this
> commit). During the start-up of the large application used as a
> reference to set the duration of weight-raising, involved processes
> transfer at most ~110K sectors each. Accordingly, a process initially
> deemed as interactive has no right to be weight-raised any longer,
> once transferred 110K sectors or more.
> 
> Basing on this consideration, this commit early-ends weight-raising
> for a bfq_queue if the latter happens to have received an amount of
> service at least equal to 110K sectors (actually, a little bit more,
> to keep a safety margin). I/O-bound applications that reach a high
> throughput, such as file copy, get to this threshold much before the
> allowed weight-raising period finishes. Thus this early ending of
> weight-raising reduces the amount of time during which these
> applications cause the problems described above.
> 
> Signed-off-by: Paolo Valente 
> ---
>  block/bfq-iosched.c | 81 +++--
>  block/bfq-iosched.h |  5 
>  block/bfq-wf2q.c|  3 ++
>  3 files changed, 80 insertions(+), 9 deletions(-)
> 
> diff --git a/block/bfq-iosched.c b/block/bfq-iosched.c
> index 6f75015d18c0..ea48b5c8f088 100644
> --- a/block/bfq-iosched.c
> +++ b/block/bfq-iosched.c
> @@ -209,15 +209,17 @@ static struct kmem_cache *bfq_pool;
>   * interactive applications automatically, using the following formula:
>   * duration = (R / r) * T, where r is the peak rate of the device, and
>   * R and T are two reference parameters.
> - * In particular, R is the peak rate of the reference device (see below),
> - * and T is a reference time: given the systems that are likely to be
> - * installed on the reference device according to its speed class, T is
> - * about the maximum time needed, under BFQ and while reading two files in
> - * parallel, to load typical large applications on these systems.
> - * In practice, the slower/faster the device at hand is, the more/less it
> - * takes to load applications with respect to the reference device.
> - * Accordingly, the longer/shorter BFQ grants weight raising to interactive
> - * applications.
> + * In particular, R is the peak rate of the reference device (see
> + * below), and T is a reference time: given the systems that are
> + * likely to be installed on the reference device according to its
> + * speed class, T is about the maximum time needed, under BFQ and
> + * while reading two files in parallel, to load typical large
> + * applications on these systems (see the comments on
> + * max_service_from_wr below, for more details on how T is obtained).
> + * In practice, the slower/faster the device at hand is, the more/less
> + * it takes to load applications with respect to the reference device.
> + * Accordingly, the longer/shorter BFQ grants weight raising to
> + * interactive applications.
>   *
>   * BFQ uses four different reference pairs (R, T), depending on:
>   * . whether the device is rotational or non-rotational;
> @@ -254,6 +256,60 @@ static int T_slow[2];
>  static int T_fast[2];
>  static int device_speed_thresh[2];
> 
> +/*
> + * BFQ uses the above-detailed, time-based weight-raising mechanism to
> + * privilege interactive tasks. This mechanism is vulnerable to the
> + * following false positives: I/O-bound applications that will go on
> + * doing I/O for much longer than the duration of weight
> + * raising. These applications have basically no benefit from being
> + * weight-raised at the beginning of their I/O. On the opposite end,
> + * while being weight-raised, these applications
> + * a) unjustly steal throughput to applications that may actually need
> + * low latency;
> + * b) make BFQ uselessly perform device idling; device idling results
> 

Re: [PATCH IMPROVEMENT] block, bfq: limit sectors served with interactive weight raising

2017-12-28 Thread Holger Hoffstätte

On 12/28/17 12:19, Paolo Valente wrote:
(snip half a tech report ;)

So either this or the previous patch ("limit tags for writes and async I/O")
can lead to a hard, unrecoverable hang with heavy writes. Since I couldn't
log into the affected system anymore I couldn't get any stack traces, blk-mq
debug output etc. but there was nothing in dmesg/on the console, so it
wasn't a BUG/OOPS.

-h


[PATCH IMPROVEMENT] block, bfq: limit sectors served with interactive weight raising

2017-12-28 Thread Paolo Valente
To maximise responsiveness, BFQ raises the weight, and performs device
idling, for bfq_queues associated with processes deemed as
interactive. In particular, weight raising has a maximum duration,
equal to the time needed to start a large application. If a
weight-raised process goes on doing I/O beyond this maximum duration,
it loses weight-raising.

This mechanism is evidently vulnerable to the following false
positives: I/O-bound applications that will go on doing I/O for much
longer than the duration of weight-raising. These applications have
basically no benefit from being weight-raised at the beginning of
their I/O. On the opposite end, while being weight-raised, these
applications
a) unjustly steal throughput to applications that may truly need
low latency;
b) make BFQ uselessly perform device idling; device idling results
in loss of device throughput with most flash-based storage, and may
increase latencies when used purposelessly.

This commit adds a countermeasure to reduce both the above
problems. To introduce this countermeasure, we provide the following
extra piece of information (full details in the comments added by this
commit). During the start-up of the large application used as a
reference to set the duration of weight-raising, involved processes
transfer at most ~110K sectors each. Accordingly, a process initially
deemed as interactive has no right to be weight-raised any longer,
once transferred 110K sectors or more.

Basing on this consideration, this commit early-ends weight-raising
for a bfq_queue if the latter happens to have received an amount of
service at least equal to 110K sectors (actually, a little bit more,
to keep a safety margin). I/O-bound applications that reach a high
throughput, such as file copy, get to this threshold much before the
allowed weight-raising period finishes. Thus this early ending of
weight-raising reduces the amount of time during which these
applications cause the problems described above.
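
As a rough illustration of the mechanism described above (this is not the
code in the patch), the early ending can be sketched as a second exit
condition next to the time-based one. Everything named here is hypothetical:
struct wr_queue, wr_duration() and wr_account() are simplified stand-ins, and
the threshold of 120000 sectors is only an assumed value standing for "a
little more than 110K".

```c
#include <stdbool.h>

/* Assumed threshold: a little above the ~110K sectors mentioned above. */
#define MAX_SERVICE_FROM_WR 120000UL

/* Hypothetical, heavily simplified stand-in for a bfq_queue. */
struct wr_queue {
	unsigned long wr_start_ms;     /* when weight raising began */
	unsigned long wr_duration_ms;  /* maximum weight-raising duration */
	unsigned long service_from_wr; /* sectors served since raising began */
	bool raised;                   /* is the queue still weight-raised? */
};

/*
 * duration = (R / r) * T, where R is the reference peak rate, r the peak
 * rate of the device at hand and T the reference time. Computed as
 * R * T / r so that integer division loses less precision.
 */
static inline unsigned long wr_duration(unsigned long R, unsigned long r,
					unsigned long T_ms)
{
	return r ? (R * T_ms) / r : T_ms;
}

/*
 * Account for newly served sectors and end weight raising when either
 * the time budget or the service budget is exhausted.
 */
static inline void wr_account(struct wr_queue *q, unsigned long sectors,
			      unsigned long now_ms)
{
	if (!q->raised)
		return;

	q->service_from_wr += sectors;

	if (now_ms - q->wr_start_ms >= q->wr_duration_ms ||
	    q->service_from_wr > MAX_SERVICE_FROM_WR)
		q->raised = false;
}
```

With these assumptions, a file copy that pushes 130000 sectors through in the
first second and a half loses weight raising right away, while a process that
stays under the service threshold keeps it until the time budget runs out.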

Signed-off-by: Paolo Valente 
---
 block/bfq-iosched.c | 81 +++--
 block/bfq-iosched.h |  5 
 block/bfq-wf2q.c|  3 ++
 3 files changed, 80 insertions(+), 9 deletions(-)

diff --git a/block/bfq-iosched.c b/block/bfq-iosched.c
index 6f75015d18c0..ea48b5c8f088 100644
--- a/block/bfq-iosched.c
+++ b/block/bfq-iosched.c
@@ -209,15 +209,17 @@ static struct kmem_cache *bfq_pool;
  * interactive applications automatically, using the following formula:
  * duration = (R / r) * T, where r is the peak rate of the device, and
  * R and T are two reference parameters.
- * In particular, R is the peak rate of the reference device (see below),
- * and T is a reference time: given the systems that are likely to be
- * installed on the reference device according to its speed class, T is
- * about the maximum time needed, under BFQ and while reading two files in
- * parallel, to load typical large applications on these systems.
- * In practice, the slower/faster the device at hand is, the more/less it
- * takes to load applications with respect to the reference device.
- * Accordingly, the longer/shorter BFQ grants weight raising to interactive
- * applications.
+ * In particular, R is the peak rate of the reference device (see
+ * below), and T is a reference time: given the systems that are
+ * likely to be installed on the reference device according to its
+ * speed class, T is about the maximum time needed, under BFQ and
+ * while reading two files in parallel, to load typical large
+ * applications on these systems (see the comments on
+ * max_service_from_wr below, for more details on how T is obtained).
+ * In practice, the slower/faster the device at hand is, the more/less
+ * it takes to load applications with respect to the reference device.
+ * Accordingly, the longer/shorter BFQ grants weight raising to
+ * interactive applications.
  *
  * BFQ uses four different reference pairs (R, T), depending on:
  * . whether the device is rotational or non-rotational;
@@ -254,6 +256,60 @@ static int T_slow[2];
 static int T_fast[2];
 static int device_speed_thresh[2];
 
+/*
+ * BFQ uses the above-detailed, time-based weight-raising mechanism to
+ * privilege interactive tasks. This mechanism is vulnerable to the
+ * following false positives: I/O-bound applications that will go on
+ * doing I/O for much longer than the duration of weight
+ * raising. These applications have basically no benefit from being
+ * weight-raised at the beginning of their I/O. On the opposite end,
+ * while being weight-raised, these applications
+ * a) unjustly steal throughput to applications that may actually need
+ * low latency;
+ * b) make BFQ uselessly perform device idling; device idling results
+ * in loss of device throughput with most flash-based storage, and may
+ * increase latencies when used purposelessly.
+ *
+ * BFQ tries to reduce these problems, by adopting the following
+ * countermeasure. To introduce this countermeasure, we nee