Re: [PATCH] bcache: add cond_resched() in __bch_cache_cmp()
On 2019/8/14 10:23 下午, Heitor Alves de Siqueira wrote: > Hi Coly, > > We've had users impacted by system stalls and were able to trace it back to > the > bcache priority_stats query. After investigating a bit further, it seems that > the sorting step in the quantiles calculation can cause heavy CPU > contention. This has a severe performance impact on any task that is running > in > the same CPU as the sysfs query, and caused issues even for non-bcache > workloads. > > We did some test runs with fio to get a better picture of the impact on > read/write workloads while a priority_stats query is running, and came up with > some interesting results. The bucket locking doesn't seem to have that > much performance impact even in full-write workloads, but the 'sort' can > affect > bcache device throughput and latency pretty heavily (and any other tasks that > are "unlucky" to be scheduled together with it). In some of our tests, there > was > a performance reduction of almost 90% in random read IOPS to the bcache device > (refer to the comparison graph at [0]). There's a few more details on the > Launchpad bug [1] we've created to track this, together with the complete fio > results + comparison graphs. > > The cond_resched() patch suggested by Shile Zhang actually improved > performance > a lot, and eliminated the stalls we've observed during the priority_stats > query. Even though it may cause the sysfs query to take a bit longer, it seems > like a decent tradeoff for general performance when running that query on a > system under heavy load. It's also a cheap short-term solution until we can > look > into a more complex re-write of the priority_stats calculation (perhaps one > that > doesn't require the locking?). Could we revisit Shile's patch, and consider > merging it? > > Thanks! > Heitor > > [0] > https://people.canonical.com/~halves/priority_stats/read/4k-iops-2Dsmooth.png > [1] https://bugs.launchpad.net/bugs/1840043 > Hi Heitor, With your very detailed explanation I come to understand why people cares about performance impact of pririty_stats. In the case of system monitoring, how long priority_stats returns is less important than overall system throughput. Yes I agree with your opinion and the benchmark chart makes me confident with the original patch. I will add this patch to v5.4 window with tested-by: Heitor Alves de Siqueira Thanks for your detailed information. And thanks for Shile Zhang originally composing this patch. -- Coly Li
Re: [PATCH] bcache: add cond_resched() in __bch_cache_cmp()
Hi Coly, We've had users impacted by system stalls and were able to trace it back to the bcache priority_stats query. After investigating a bit further, it seems that the sorting step in the quantiles calculation can cause heavy CPU contention. This has a severe performance impact on any task that is running in the same CPU as the sysfs query, and caused issues even for non-bcache workloads. We did some test runs with fio to get a better picture of the impact on read/write workloads while a priority_stats query is running, and came up with some interesting results. The bucket locking doesn't seem to have that much performance impact even in full-write workloads, but the 'sort' can affect bcache device throughput and latency pretty heavily (and any other tasks that are "unlucky" to be scheduled together with it). In some of our tests, there was a performance reduction of almost 90% in random read IOPS to the bcache device (refer to the comparison graph at [0]). There's a few more details on the Launchpad bug [1] we've created to track this, together with the complete fio results + comparison graphs. The cond_resched() patch suggested by Shile Zhang actually improved performance a lot, and eliminated the stalls we've observed during the priority_stats query. Even though it may cause the sysfs query to take a bit longer, it seems like a decent tradeoff for general performance when running that query on a system under heavy load. It's also a cheap short-term solution until we can look into a more complex re-write of the priority_stats calculation (perhaps one that doesn't require the locking?). Could we revisit Shile's patch, and consider merging it? Thanks! Heitor [0] https://people.canonical.com/~halves/priority_stats/read/4k-iops-2Dsmooth.png [1] https://bugs.launchpad.net/bugs/1840043
Re: [PATCH] bcache: add cond_resched() in __bch_cache_cmp()
On 2019/3/8 8:35 上午, Shile Zhang wrote: > > On 2019/3/7 23:44, Vojtech Pavlik wrote: >> On Thu, Mar 07, 2019 at 11:36:18PM +0800, Coly Li wrote: >>> On 2019/3/7 11:06 下午, Shile Zhang wrote: On 2019/3/7 18:34, Coly Li wrote: > On 2019/3/7 1:15 下午, shile.zh...@linux.alibaba.com wrote: >> From: Shile Zhang >> >> Read /sys/fs/bcache//cacheN/priority_stats can take very long >> time with huge cache after long run. >> >> Signed-off-by: Shile Zhang > Hi Shile, > > Do you test your change ? It will be helpful with more performance > data > (what problem that you improved). In case of 960GB SSD cache device, once read of the 'priority_stats' costs about 600ms in our test environment. >>> After the fix, how much time it takes ? >>> >>> The perf tool shown that near 50% CPU time consumed by 'sort()', this means once sort will hold the CPU near 300ms. In our case, the statistics collector reads the 'priority_stats' periodically, it will trigger the schedule latency jitters of the task which shared same CPU core. >>> Hmm, it seems you just make the sort slower, and nothing more changes. >>> Am I right ? >> Well, it has to make the sort slower, but it'll also avoid hogging the >> CPU (on a non-preemptible kernel), avoiding a potential soft lockup >> warning and allowing other tasks to run. >> > Yes, there is a risk that other tasks have no chance to run due to sort > hogging the CPU, it is harmful to some schedule-latency sensitive tasks. > This change just try to reduce the impact of sort, but not a performance > improvement of it. I'm not sure if a better way can handle it more > efficiency. I know the above concept, since I would expect when people talking about performance improvement, it would be better to provide real performance number. Under some conditions it might be hard, but in this exact example, it won't. Could you please provide some data that on your configuration, how slow 'sort' becomes ? AND, reading priority_stats in period for performance monitoring is not good idea. The problem is not from 'sort', it is from mutex_lock(&ca->set->bucket_lock) lines above the sort in sysfs.c, when you have a quite large or very busy cache device, you may see normal I/O performance drops due to too much time holding bucket_lock here. Anyway, this is just my suggestion. Back to this patch, please provide performance number. Thanks. Coly Li
Re: [PATCH] bcache: add cond_resched() in __bch_cache_cmp()
On 2019/3/7 23:44, Vojtech Pavlik wrote: On Thu, Mar 07, 2019 at 11:36:18PM +0800, Coly Li wrote: On 2019/3/7 11:06 下午, Shile Zhang wrote: On 2019/3/7 18:34, Coly Li wrote: On 2019/3/7 1:15 下午, shile.zh...@linux.alibaba.com wrote: From: Shile Zhang Read /sys/fs/bcache//cacheN/priority_stats can take very long time with huge cache after long run. Signed-off-by: Shile Zhang Hi Shile, Do you test your change ? It will be helpful with more performance data (what problem that you improved). In case of 960GB SSD cache device, once read of the 'priority_stats' costs about 600ms in our test environment. After the fix, how much time it takes ? The perf tool shown that near 50% CPU time consumed by 'sort()', this means once sort will hold the CPU near 300ms. In our case, the statistics collector reads the 'priority_stats' periodically, it will trigger the schedule latency jitters of the task which shared same CPU core. Hmm, it seems you just make the sort slower, and nothing more changes. Am I right ? Well, it has to make the sort slower, but it'll also avoid hogging the CPU (on a non-preemptible kernel), avoiding a potential soft lockup warning and allowing other tasks to run. Yes, there is a risk that other tasks have no chance to run due to sort hogging the CPU, it is harmful to some schedule-latency sensitive tasks. This change just try to reduce the impact of sort, but not a performance improvement of it. I'm not sure if a better way can handle it more efficiency. Thanks, Shile
Re: [PATCH] bcache: add cond_resched() in __bch_cache_cmp()
On Thu, Mar 07, 2019 at 11:36:18PM +0800, Coly Li wrote: > On 2019/3/7 11:06 下午, Shile Zhang wrote: > > > > On 2019/3/7 18:34, Coly Li wrote: > >> On 2019/3/7 1:15 下午, shile.zh...@linux.alibaba.com wrote: > >>> From: Shile Zhang > >>> > >>> Read /sys/fs/bcache//cacheN/priority_stats can take very long > >>> time with huge cache after long run. > >>> > >>> Signed-off-by: Shile Zhang > >> Hi Shile, > >> > >> Do you test your change ? It will be helpful with more performance data > >> (what problem that you improved). > > > > In case of 960GB SSD cache device, once read of the 'priority_stats' > > costs about 600ms in our test environment. > > > > After the fix, how much time it takes ? > > > > The perf tool shown that near 50% CPU time consumed by 'sort()', this > > means once sort will hold the CPU near 300ms. > > > > In our case, the statistics collector reads the 'priority_stats' > > periodically, it will trigger the schedule latency jitters of the > > > > task which shared same CPU core. > > > > Hmm, it seems you just make the sort slower, and nothing more changes. > Am I right ? Well, it has to make the sort slower, but it'll also avoid hogging the CPU (on a non-preemptible kernel), avoiding a potential soft lockup warning and allowing other tasks to run. -- Vojtech Pavlik VP SUSE Labs
Re: [PATCH] bcache: add cond_resched() in __bch_cache_cmp()
On 2019/3/7 11:06 下午, Shile Zhang wrote: > > On 2019/3/7 18:34, Coly Li wrote: >> On 2019/3/7 1:15 下午, shile.zh...@linux.alibaba.com wrote: >>> From: Shile Zhang >>> >>> Read /sys/fs/bcache//cacheN/priority_stats can take very long >>> time with huge cache after long run. >>> >>> Signed-off-by: Shile Zhang >> Hi Shile, >> >> Do you test your change ? It will be helpful with more performance data >> (what problem that you improved). > > In case of 960GB SSD cache device, once read of the 'priority_stats' > costs about 600ms in our test environment. > After the fix, how much time it takes ? > The perf tool shown that near 50% CPU time consumed by 'sort()', this > means once sort will hold the CPU near 300ms. > > In our case, the statistics collector reads the 'priority_stats' > periodically, it will trigger the schedule latency jitters of the > > task which shared same CPU core. > Hmm, it seems you just make the sort slower, and nothing more changes. Am I right ? Coly Li >> >> Thanks. >> >> Coly Li >> >>> --- >>> drivers/md/bcache/sysfs.c | 1 + >>> 1 file changed, 1 insertion(+) >>> >>> diff --git a/drivers/md/bcache/sysfs.c b/drivers/md/bcache/sysfs.c >>> index 557a8a3..028fea1 100644 >>> --- a/drivers/md/bcache/sysfs.c >>> +++ b/drivers/md/bcache/sysfs.c >>> @@ -897,6 +897,7 @@ static void bch_cache_set_internal_release(struct >>> kobject *k) >>> static int __bch_cache_cmp(const void *l, const void *r) >>> { >>> + cond_resched(); >>> return *((uint16_t *)r) - *((uint16_t *)l); >>> } >>> >> -- Coly Li
Re: [PATCH] bcache: add cond_resched() in __bch_cache_cmp()
On 2019/3/7 18:34, Coly Li wrote: On 2019/3/7 1:15 下午, shile.zh...@linux.alibaba.com wrote: From: Shile Zhang Read /sys/fs/bcache//cacheN/priority_stats can take very long time with huge cache after long run. Signed-off-by: Shile Zhang Hi Shile, Do you test your change ? It will be helpful with more performance data (what problem that you improved). In case of 960GB SSD cache device, once read of the 'priority_stats' costs about 600ms in our test environment. The perf tool shown that near 50% CPU time consumed by 'sort()', this means once sort will hold the CPU near 300ms. In our case, the statistics collector reads the 'priority_stats' periodically, it will trigger the schedule latency jitters of the task which shared same CPU core. Thanks. Coly Li --- drivers/md/bcache/sysfs.c | 1 + 1 file changed, 1 insertion(+) diff --git a/drivers/md/bcache/sysfs.c b/drivers/md/bcache/sysfs.c index 557a8a3..028fea1 100644 --- a/drivers/md/bcache/sysfs.c +++ b/drivers/md/bcache/sysfs.c @@ -897,6 +897,7 @@ static void bch_cache_set_internal_release(struct kobject *k) static int __bch_cache_cmp(const void *l, const void *r) { + cond_resched(); return *((uint16_t *)r) - *((uint16_t *)l); }
Re: [PATCH] bcache: add cond_resched() in __bch_cache_cmp()
On 2019/3/7 1:15 下午, shile.zh...@linux.alibaba.com wrote: > From: Shile Zhang > > Read /sys/fs/bcache//cacheN/priority_stats can take very long > time with huge cache after long run. > > Signed-off-by: Shile Zhang Hi Shile, Do you test your change ? It will be helpful with more performance data (what problem that you improved). Thanks. Coly Li > --- > drivers/md/bcache/sysfs.c | 1 + > 1 file changed, 1 insertion(+) > > diff --git a/drivers/md/bcache/sysfs.c b/drivers/md/bcache/sysfs.c > index 557a8a3..028fea1 100644 > --- a/drivers/md/bcache/sysfs.c > +++ b/drivers/md/bcache/sysfs.c > @@ -897,6 +897,7 @@ static void bch_cache_set_internal_release(struct kobject > *k) > > static int __bch_cache_cmp(const void *l, const void *r) > { > + cond_resched(); > return *((uint16_t *)r) - *((uint16_t *)l); > } > > -- Coly Li
[PATCH] bcache: add cond_resched() in __bch_cache_cmp()
From: Shile Zhang Read /sys/fs/bcache//cacheN/priority_stats can take very long time with huge cache after long run. Signed-off-by: Shile Zhang --- drivers/md/bcache/sysfs.c | 1 + 1 file changed, 1 insertion(+) diff --git a/drivers/md/bcache/sysfs.c b/drivers/md/bcache/sysfs.c index 557a8a3..028fea1 100644 --- a/drivers/md/bcache/sysfs.c +++ b/drivers/md/bcache/sysfs.c @@ -897,6 +897,7 @@ static void bch_cache_set_internal_release(struct kobject *k) static int __bch_cache_cmp(const void *l, const void *r) { + cond_resched(); return *((uint16_t *)r) - *((uint16_t *)l); } -- 1.8.3.1