Bug#972709: Wishlist/RFC: Change to CONFIG_PREEMPT_NONE in linux-image-cloud-*

2020-11-25 Thread Flavio Veloso Soares
Thank you for contributing to the discussion. I agree that the page I 
cited lacks a lot (I mentioned it), but unfortunately that's the best 
one I could find that touches the subject. Naturally a benchmark would 
be ideal, but again unfortunately I don't have the resources to do one 
myself now. From a purely technical POV, I think that the current choice 
-- a kernel tuned up for desktop use -- is the one that should require a 
benchmark to be adopted, but I get the message.


On 2020-11-22 5:49 p.m., Noah Meyerhans wrote:

On Sun, Nov 22, 2020 at 03:53:32PM -0800, Flavio Veloso Soares wrote:

  Unfortunately, I couldn't find many comprehensive benchmarks of kernel
  CONFIG_PREEMPT* options. The one at
  
https://www.codeblueprint.co.uk/2019/12/23/linux-preemption-latency-throughput.html
  seems to be very thorough,

  [...]

  Not particularly.  I'm used to latency benchmarks showing e.g. average,
  90th percentile, 99th percentile, as well as worst.

I don't think Ben was talking about specific benchmarks.  The web page
you cite lacks basic measurements one would expect to see from *any*
meaningful performance benchmark.  Comparing maximum latency is fine,
but it's not really relevant by itself.  If a configuration change
improves the worst case (100th percentile) but negatively impacts the
50th percentile, is that a change worth making?  Maybe.  But without
having that data at all, the benchmark really isn't worth much at all.

It's totally reasonable for us to consider making this change, but we
should have comprehensive data about the impact of doing so.  What
impact does the change have on different classes of workloads?  e.g.
high tps, CPU-bound, IO-bound, etc.  It's entirely possible that the
proposed change improves performance under certain workloads, but
negatively impacts others.  Without knowing the impact in more
detail, which would allow us to evaluate the tradeoffs, I don't think
there's a compelling reason to make a change.

noah


--
FVS



Bug#972709: Wishlist/RFC: Change to CONFIG_PREEMPT_NONE in linux-image-cloud-*

2020-11-22 Thread Noah Meyerhans
On Sun, Nov 22, 2020 at 03:53:32PM -0800, Flavio Veloso Soares wrote:
>  Unfortunately, I couldn't find many comprehensive benchmarks of kernel
>  CONFIG_PREEMPT* options. The one at
>  
> https://www.codeblueprint.co.uk/2019/12/23/linux-preemption-latency-throughput.html
>  seems to be very thorough,
> 
>  [...]
> 
>  Not particularly.  I'm used to latency benchmarks showing e.g. average,
>  90th percentile, 99th percentile, as well as worst.

I don't think Ben was talking about specific benchmarks.  The web page
you cite lacks basic measurements one would expect to see from *any*
meaningful performance benchmark.  Comparing maximum latency is fine,
but it's not really relevant by itself.  If a configuration change
improves the worst case (100th percentile) but negatively impacts the
50th percentile, is that a change worth making?  Maybe.  But without
having that data at all, the benchmark really isn't worth much at all.
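
As a rough illustration of the summary described above (average, 50th,
90th and 99th percentile, and worst case), here is a minimal Python
sketch. The function names and the nearest-rank percentile definition
are assumptions made for this example, and the two sample lists are
made-up numbers, not measurements from any kernel.

```python
import math


def percentile(samples, p):
    """Nearest-rank percentile: the smallest sample such that at least
    p percent of all samples are less than or equal to it."""
    if not samples:
        raise ValueError("need at least one sample")
    if not 0 < p <= 100:
        raise ValueError("p must be in the range (0, 100]")
    ordered = sorted(samples)
    # Multiply before dividing so whole-number inputs stay exact.
    rank = math.ceil(p * len(ordered) / 100)
    return ordered[rank - 1]


def summarize(samples):
    """Return the figures a latency benchmark would normally report."""
    return {
        "mean": sum(samples) / len(samples),
        "p50": percentile(samples, 50),
        "p90": percentile(samples, 90),
        "p99": percentile(samples, 99),
        "max": max(samples),
    }


if __name__ == "__main__":
    # Hypothetical latencies in milliseconds, invented for illustration:
    # config_b has a better worst case but a worse median than config_a.
    config_a = [1] * 98 + [2, 50]
    config_b = [3] * 99 + [10]
    print("a:", summarize(config_a))
    print("b:", summarize(config_b))
```

Comparing only the "max" entries of the two summaries would hide the
difference in the median, which is the tradeoff in question.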

It's totally reasonable for us to consider making this change, but we
should have comprehensive data about the impact of doing so.  What
impact does the change have on different classes of workloads?  e.g.
high tps, CPU-bound, IO-bound, etc.  It's entirely possible that the
proposed change improves performance under certain workloads, but
negatively impacts others.  Without knowing the impact in more
detail, which would allow us to evaluate the tradeoffs, I don't think
there's a compelling reason to make a change.

noah



Bug#972709: Wishlist/RFC: Change to CONFIG_PREEMPT_NONE in linux-image-cloud-*

2020-11-22 Thread Flavio Veloso Soares


On 2020-11-22 2:28 p.m., Ben Hutchings wrote:

On Sun, 2020-11-22 at 13:45 -0800, Flavio Veloso Soares wrote:

[Resending: just noticed that the reply I sent on Oct 23 didn't include
b.d.o]

I don't think the article is about the same thing we're talking about here.
CONFIG_PREEMPT* options control the compromise between latency and
throughput of *system calls* and *scheduling of CPU cycles spent in
kernel mode*, not network traffic.

The latency of requests to services on a server is affected by both
scheduler and network latency.

[...]


"Services" is too broad a term. Which kind of service are you talking about?

For the record, I'm talking about latency of kernel system calls 
specifically, which happens to be what CONFIG_PREEMPT* controls.




Unfortunately, I couldn't find many comprehensive benchmarks of kernel
CONFIG_PREEMPT* options. The one at
https://www.codeblueprint.co.uk/2019/12/23/linux-preemption-latency-throughput.html
seems to be very thorough,

[...]

Not particularly.  I'm used to latency benchmarks showing e.g. average,
90th percentile, 99th percentile, as well as worst.

Ben.


Are those benchmarks public? Can you provide links to them?


--
FVS



Bug#972709: Wishlist/RFC: Change to CONFIG_PREEMPT_NONE in linux-image-cloud-*

2020-11-22 Thread Ben Hutchings
On Sun, 2020-11-22 at 13:45 -0800, Flavio Veloso Soares wrote:
> [Resending: just noticed that the reply I sent on Oct 23 didn't include 
> b.d.o]
> 
> I don't think the article is about the same thing we're talking about here. 
> CONFIG_PREEMPT* options control the compromise between latency and 
> throughput of *system calls* and *scheduling of CPU cycles spent in 
> kernel mode*, not network traffic.

The latency of requests to services on a server is affected by both
scheduler and network latency.

[...]
> Unfortunately, I couldn't find many comprehensive benchmarks of kernel 
> CONFIG_PREEMPT* options. The one at 
> https://www.codeblueprint.co.uk/2019/12/23/linux-preemption-latency-throughput.html
>  
> seems to be very thorough,
[...]

Not particularly.  I'm used to latency benchmarks showing e.g. average,
90th percentile, 99th percentile, as well as worst.

Ben.

-- 
Ben Hutchings
If at first you don't succeed, you're doing about average.





Bug#972709: Wishlist/RFC: Change to CONFIG_PREEMPT_NONE in linux-image-cloud-*

2020-11-22 Thread Flavio Veloso Soares
[Resending: just noticed that the reply I sent on Oct 23 didn't include 
b.d.o]


I don't think the article is about the same thing we're talking about here. 
CONFIG_PREEMPT* options control the compromise between latency and 
throughput of *system calls* and *scheduling of CPU cycles spent in 
kernel mode*, not network traffic. Granted, networking is affected by 
the setting too, but intuition tells me that a nonpreemptible system 
call -- meaning one that runs until it finishes or blocks on I/O -- 
could even *decrease* network latency, not increase it.


Unfortunately, I couldn't find many comprehensive benchmarks of kernel 
CONFIG_PREEMPT* options. The one at 
https://www.codeblueprint.co.uk/2019/12/23/linux-preemption-latency-throughput.html 
seems to be very thorough, and shows that the latency difference 
between CONFIG_PREEMPT_VOLUNTARY and CONFIG_PREEMPT_NONE is actually 
nonexistent, while no preemption provides noticeably more throughput.


This unsurprising conclusion alone tells us that CONFIG_PREEMPT_NONE is 
a better choice for servers.


However, there's more. No benchmark touches on context-switch overhead 
and the burstable CPU "credit" system used in many (most?) cloud 
environments, which happen to be the target of *-cloud kernels. With 
voluntary preemption, the cycles spent on extra context switches are not 
only wasted, but they still count against the instance's CPU "credits", 
which reduces the overall computing power available to the instance. 
This is like double-paying for something you don't need.



On 2020-10-23 6:04 p.m., Ben Hutchings wrote:

On Thu, 2020-10-22 at 13:43 -0700, Flavio Veloso wrote:

Package: linux-image-cloud-amd64
Version: 4.19+105+deb10u7
Severity: wishlist

Since cloud images are mostly run for server workloads in headless
environments accessed via network only, it would be better if
"linux-image-cloud-*" kernels were compiled with CONFIG_PREEMPT_NONE=y
("No Forced Preemption (Server)").

Currently those packages use CONFIG_PREEMPT_VOLUNTARY=y ("Voluntary
Kernel Preemption (Desktop)")

CONFIG_PREEMPT_NONE description from kernel help:

[...]

I know what it says, but I think the notion that latency is less
important on servers is outdated.

It's well known that people give up quickly on web pages that are slow
to load:
.
And a web page can depend on (indirectly) very many servers, which
means that e.g. high latency that only occurs 1% of the time on any
single server actually affects a large fraction of requests.

Ben.


--
FVS



Bug#972709: Wishlist/RFC: Change to CONFIG_PREEMPT_NONE in linux-image-cloud-*

2020-10-23 Thread Ben Hutchings
On Thu, 2020-10-22 at 13:43 -0700, Flavio Veloso wrote:
> Package: linux-image-cloud-amd64
> Version: 4.19+105+deb10u7
> Severity: wishlist
> 
> Since cloud images are mostly run for server workloads in headless 
> environments accessed via network only, it would be better if 
> "linux-image-cloud-*" kernels were compiled with CONFIG_PREEMPT_NONE=y 
> ("No Forced Preemption (Server)").
> 
> Currently those packages use CONFIG_PREEMPT_VOLUNTARY=y ("Voluntary 
> Kernel Preemption (Desktop)")
> 
> CONFIG_PREEMPT_NONE description from kernel help:
[...]

I know what it says, but I think the notion that latency is less
important on servers is outdated.

It's well known that people give up quickly on web pages that are slow
to load:
.
And a web page can depend on (indirectly) very many servers, which
means that e.g. high latency that only occurs 1% of the time on any
single server actually affects a large fraction of requests.
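
The arithmetic behind this point can be sketched in a few lines of
Python. This is only an illustration under a strong assumption that is
not stated above: each server is slow independently of the others, with
the same probability. The function name is made up for the example.

```python
def fraction_affected(per_server_slow_probability, servers_involved):
    """Probability that at least one of the servers a request touches is
    slow, assuming every server is slow independently (an assumption)."""
    if not 0.0 <= per_server_slow_probability <= 1.0:
        raise ValueError("probability must be between 0 and 1")
    if servers_involved < 0:
        raise ValueError("servers_involved must not be negative")
    # P(at least one slow) = 1 - P(none slow)
    return 1.0 - (1.0 - per_server_slow_probability) ** servers_involved


if __name__ == "__main__":
    for n in (1, 10, 50, 100):
        print(n, "servers:", round(fraction_affected(0.01, n), 3))
```

With a 1% chance per server, a request that touches 100 servers hits at
least one slow server roughly 63% of the time under this assumption.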

Ben.

-- 
Ben Hutchings
The world is coming to an end.  Please log off.



Bug#972709: Wishlist/RFC: Change to CONFIG_PREEMPT_NONE in linux-image-cloud-*

2020-10-22 Thread Flavio Veloso

Package: linux-image-cloud-amd64
Version: 4.19+105+deb10u7
Severity: wishlist

Since cloud images are mostly run for server workloads in headless 
environments accessed via network only, it would be better if 
"linux-image-cloud-*" kernels were compiled with CONFIG_PREEMPT_NONE=y 
("No Forced Preemption (Server)").


Currently those packages use CONFIG_PREEMPT_VOLUNTARY=y ("Voluntary 
Kernel Preemption (Desktop)")
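
For anyone who wants to check which of these options a given kernel was
built with, here is a minimal Python sketch. It assumes the build
configuration is installed as /boot/config-<release>, which is where
Debian kernel packages normally put it; the function names are made up
for this example, and kernels with additional preemption-related options
may need more careful handling than this.

```python
import platform
from pathlib import Path

# The preemption model options discussed in this report, plus full
# preemption (CONFIG_PREEMPT) for completeness.
PREEMPT_OPTIONS = (
    "CONFIG_PREEMPT_NONE",
    "CONFIG_PREEMPT_VOLUNTARY",
    "CONFIG_PREEMPT",
)


def preemption_model(config_text):
    """Return the first preemption option set to 'y' in a kernel config,
    or None if none of the known options is enabled."""
    for line in config_text.splitlines():
        line = line.strip()
        # Disabled options appear as "# CONFIG_FOO is not set" comments.
        if line.startswith("#") or "=" not in line:
            continue
        name, _, value = line.partition("=")
        if name in PREEMPT_OPTIONS and value == "y":
            return name
    return None


def running_kernel_preemption_model():
    """Read the config of the running kernel (path is an assumption)."""
    path = Path("/boot") / f"config-{platform.release()}"
    return preemption_model(path.read_text())
```

Calling running_kernel_preemption_model() on a system using the current
cloud kernel should report CONFIG_PREEMPT_VOLUNTARY, per the description
above.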


CONFIG_PREEMPT_NONE description from kernel help:

"This is the traditional Linux preemption model, geared towards
throughput. It will still provide good latencies most of the time,
but there are no guarantees and occasional longer delays are
possible.

Select this option if you are building a kernel for a server
or scientific/computation system, or if you want to maximize the
raw processing power of the kernel, irrespective of scheduling
latencies."

Help on CONFIG_PREEMPT_VOLUNTARY:

"This option reduces the latency of the kernel by adding more
"explicit preemption points" to the kernel code. These new
preemption points have been selected to reduce the maximum latency
of rescheduling, providing faster application reactions, at the cost
of slightly lower throughput.

This allows reaction to interactive events by allowing a low
priority process to voluntarily preempt itself even if it is in
kernel mode executing a system call. This allows applications to run
more 'smoothly' even when the system is under load.

Select this if you are building a kernel for a desktop system."

In other words, choosing CONFIG_PREEMPT_NONE would favour throughput 
over latency, the latter being more important in GUI environments (where 
a non-responsive mouse is bad, for example) than on servers.


A second benefit of CONFIG_PREEMPT_NONE is that it reduces context 
switch overhead, which means more CPU cycles are available for doing 
useful computing. This is especially important in virtualized 
environments and/or where guest scheduling is based on CPU "credits" 
(for example, AWS).


--
FV