Re: [PATCH V4 00/15] blk-throttle: add .high limit

2016-11-15 Thread Shaohua Li
On Tue, Nov 15, 2016 at 11:53:39AM -0800, Bart Van Assche wrote:
> On 11/14/2016 05:28 PM, Shaohua Li wrote:
> > On Mon, Nov 14, 2016 at 05:18:28PM -0800, Bart Van Assche wrote:
> > > Unless someone can convince me of the opposite I think that coming up with
> > > an algorithm for estimating I/O cost is essential to guarantee I/O fairness
> > > without requesting users to perform complicated parameter configurations.
> > 
> > That's what I tried before:
> > http://marc.info/?l=linux-kernel&m=145617863208940&w=2
> 
> That URL refers to v2 of this patch series. Sorry but I have not found the
> I/O cost estimation algorithm I was referring to in v2 of this patch series.

It's a v2, but not the v2 of this series. There is no algorithm to estimate I/O
cost in it; that series just uses IOPS or bandwidth to determine I/O cost.

Thanks,
Shaohua
--
To unsubscribe from this list: send the line "unsubscribe linux-block" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH V4 00/15] blk-throttle: add .high limit

2016-11-15 Thread Bart Van Assche

On 11/14/2016 05:28 PM, Shaohua Li wrote:
> On Mon, Nov 14, 2016 at 05:18:28PM -0800, Bart Van Assche wrote:
> > Unless someone can convince me of the opposite I think that coming up with
> > an algorithm for estimating I/O cost is essential to guarantee I/O fairness
> > without requesting users to perform complicated parameter configurations.
>
> That's what I tried before:
> http://marc.info/?l=linux-kernel&m=145617863208940&w=2

That URL refers to v2 of this patch series. Sorry but I have not found
the I/O cost estimation algorithm I was referring to in v2 of this patch
series.


Bart.


Re: [PATCH V4 00/15] blk-throttle: add .high limit

2016-11-14 Thread Shaohua Li
On Mon, Nov 14, 2016 at 05:18:28PM -0800, Bart Van Assche wrote:
> On 11/14/2016 04:49 PM, Shaohua Li wrote:
> > On Mon, Nov 14, 2016 at 04:41:33PM -0800, Bart Van Assche wrote:
> > > Thank you for pointing me to the discussion thread about v3 of this patch
> > > series. Did I see correctly that one of the conclusions was that for users
> > > this mechanism is hard to configure? Are we providing a good service to
> > > Linux users by providing a mechanism that is hard to configure?
> > 
> > Yes, this is a kind of low level knob and is expected to be configured by
> > experienced users. This sucks, but we really don't have good solutions. If
> > anybody has better ideas, I'm happy to try.
> 
> Hello Shaohua,
> 
> An approach I have been considering and would like to analyze further is as
> follows:
> * For rotational media use an algorithm like BFQ to preserve sequentiality
> of workloads and to guarantee fairness. This means that one application
> submits I/O per time slot.
> * For SSDs, multiplex I/O from multiple applications during a single time
> slot to keep the queue depth high. Throttle I/O if needed to realize
> fairness.
> 
> Implementing this approach requires a method for estimating I/O cost
> based on the request characteristics (offset and size) and the device type
> (rotational or SSD). This may require measuring the time that was needed to
> process past requests and using that information in a learning algorithm.
> 
> Unless someone can convince me of the opposite I think that coming up with
> an algorithm for estimating I/O cost is essential to guarantee I/O fairness
> without requesting users to perform complicated parameter configurations.

That's what I tried before:
http://marc.info/?l=linux-kernel&m=145617863208940&w=2

Unfortunately, estimating I/O cost and disk capability is very hard if not
impossible. People objected to using bandwidth or IOPS to estimate I/O cost.

Thanks,
Shaohua


Re: [PATCH V4 00/15] blk-throttle: add .high limit

2016-11-14 Thread Bart Van Assche

On 11/14/2016 04:49 PM, Shaohua Li wrote:
> On Mon, Nov 14, 2016 at 04:41:33PM -0800, Bart Van Assche wrote:
> > Thank you for pointing me to the discussion thread about v3 of this patch
> > series. Did I see correctly that one of the conclusions was that for users
> > this mechanism is hard to configure? Are we providing a good service to
> > Linux users by providing a mechanism that is hard to configure?
>
> Yes, this is a kind of low level knob and is expected to be configured by
> experienced users. This sucks, but we really don't have good solutions. If
> anybody has better ideas, I'm happy to try.


Hello Shaohua,

An approach I have been considering and would like to analyze further is as
follows:
* For rotational media use an algorithm like BFQ to preserve 
sequentiality of workloads and to guarantee fairness. This means that 
one application submits I/O per time slot.
* For SSDs, multiplex I/O from multiple applications during a single 
time slot to keep the queue depth high. Throttle I/O if needed to 
realize fairness.


Implementing this approach requires a method for estimating I/O cost
based on the request characteristics (offset and size) and the device
type (rotational or SSD). This may require measuring the time that was
needed to process past requests and using that information in a
learning algorithm.
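
To make the idea concrete, here is a minimal sketch in Python of what such an
estimator could look like. It is only an illustration under assumptions, not
code from this patch series or from the kernel: the class IoCostEstimator, its
4 KiB-based size classes, the "starts where the previous request ended"
sequentiality test and the EWMA weight alpha are all hypothetical choices.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional, Tuple


@dataclass
class IoCostEstimator:
    """Hypothetical estimator: learns service time per request class."""

    rotational: bool
    alpha: float = 0.2  # assumed EWMA weight of the newest sample
    last_end: int = -1  # end offset of the most recently completed request
    avg_us: Dict[Tuple[bool, int], float] = field(default_factory=dict)

    def _bucket(self, offset: int, size: int) -> Tuple[bool, int]:
        # On rotational media a request that does not start where the
        # previous one ended is assumed to pay a seek; SSDs are treated as
        # having no seek penalty in this sketch.
        no_seek = (offset == self.last_end) if self.rotational else True
        # Size class 0 covers requests up to 4 KiB, then one class per doubling.
        size_class = max(0, (max(size, 1) - 1).bit_length() - 12)
        return (no_seek, size_class)

    def estimate(self, offset: int, size: int) -> Optional[float]:
        """Return the learned cost in microseconds, or None if unknown."""
        return self.avg_us.get(self._bucket(offset, size))

    def record(self, offset: int, size: int, service_us: float) -> None:
        """Feed back the measured service time of a completed request."""
        key = self._bucket(offset, size)
        old = self.avg_us.get(key)
        if old is None:
            self.avg_us[key] = service_us
        else:
            self.avg_us[key] = (1 - self.alpha) * old + self.alpha * service_us
        self.last_end = offset + size


disk = IoCostEstimator(rotational=True)
disk.record(0, 4096, 100.0)      # first request, counted as a seek
disk.record(4096, 4096, 50.0)    # sequential follow-up
disk.record(8192, 4096, 100.0)   # sequential again, blended into the average
sequential_cost = disk.estimate(12288, 4096)
random_cost = disk.estimate(0, 4096)
```

A throttler could then charge each application the estimated cost of its
requests rather than raw bytes or request counts. Whether estimates of this
kind are accurate enough on real devices is not something this sketch answers.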


Unless someone can convince me of the opposite I think that coming up 
with an algorithm for estimating I/O cost is essential to guarantee I/O 
fairness without requesting users to perform complicated parameter 
configurations.


Bart.




Re: [PATCH V4 00/15] blk-throttle: add .high limit

2016-11-14 Thread Shaohua Li
On Mon, Nov 14, 2016 at 04:41:33PM -0800, Bart Van Assche wrote:
> On 11/14/2016 04:05 PM, Shaohua Li wrote:
> > On Mon, Nov 14, 2016 at 02:46:22PM -0800, Bart Van Assche wrote:
> > > On 11/14/2016 02:22 PM, Shaohua Li wrote:
> > > > The background is we don't have an ioscheduler for blk-mq yet, so we can't
> > > > prioritize processes/cgroups. This patch set tries to add basic arbitration
> > > > between cgroups with blk-throttle. It adds a new limit io.high for
> > > > blk-throttle. It's only for cgroup2.
> > > 
> > > My understanding of this work is that a significant part of it will have to
> > > be reverted once blk-mq supports I/O scheduling, e.g. the code for detecting
> > > whether the I/O submitter is idle. Shouldn't this kind of infrastructure be
> > > added after support has been added in blk-mq for I/O scheduling?
> > 
> > Sure, if we have a CFQ-like io scheduler for blk-mq, this is largely not
> > required. But we don't have one yet and nothing is floating around either. The
> > conservative throttling is relatively easy to implement and achieves a similar
> > goal. The throttling could still be useful even with an ioscheduler, as
> > throttling is faster if we are talking about a CFQ-like scheduler. I don't
> > think this should be blocked to wait for I/O scheduling. There was a long
> > discussion in the last post, and we agreed the throttling and io scheduler
> > aren't mutually exclusive.
> > http://marc.info/?l=linux-kernel&m=147552964708965&w=2
> 
> Hello Shaohua,
> 
> Thank you for pointing me to the discussion thread about v3 of this patch
> series. Did I see correctly that one of the conclusions was that for users
> this mechanism is hard to configure? Are we providing a good service to
> Linux users by providing a mechanism that is hard to configure?

Yes, this is a kind of low level knob and is expected to be configured by
experienced users. This sucks, but we really don't have good solutions. If
anybody has better ideas, I'm happy to try.

Thanks,
Shaohua


Re: [PATCH V4 00/15] blk-throttle: add .high limit

2016-11-14 Thread Bart Van Assche

On 11/14/2016 04:05 PM, Shaohua Li wrote:
> On Mon, Nov 14, 2016 at 02:46:22PM -0800, Bart Van Assche wrote:
> > On 11/14/2016 02:22 PM, Shaohua Li wrote:
> > > The background is we don't have an ioscheduler for blk-mq yet, so we can't
> > > prioritize processes/cgroups. This patch set tries to add basic arbitration
> > > between cgroups with blk-throttle. It adds a new limit io.high for
> > > blk-throttle. It's only for cgroup2.
> >
> > My understanding of this work is that a significant part of it will have to
> > be reverted once blk-mq supports I/O scheduling, e.g. the code for detecting
> > whether the I/O submitter is idle. Shouldn't this kind of infrastructure be
> > added after support has been added in blk-mq for I/O scheduling?
>
> Sure, if we have a CFQ-like io scheduler for blk-mq, this is largely not
> required. But we don't have one yet and nothing is floating around either. The
> conservative throttling is relatively easy to implement and achieves a similar
> goal. The throttling could still be useful even with an ioscheduler, as
> throttling is faster if we are talking about a CFQ-like scheduler. I don't
> think this should be blocked to wait for I/O scheduling. There was a long
> discussion in the last post, and we agreed the throttling and io scheduler
> aren't mutually exclusive.
> http://marc.info/?l=linux-kernel&m=147552964708965&w=2


Hello Shaohua,

Thank you for pointing me to the discussion thread about v3 of this 
patch series. Did I see correctly that one of the conclusions was that 
for users this mechanism is hard to configure? Are we providing a good 
service to Linux users by providing a mechanism that is hard to configure?


Thanks,

Bart.



Re: [PATCH V4 00/15] blk-throttle: add .high limit

2016-11-14 Thread Shaohua Li
On Mon, Nov 14, 2016 at 02:46:22PM -0800, Bart Van Assche wrote:
> On 11/14/2016 02:22 PM, Shaohua Li wrote:
> > The background is we don't have an ioscheduler for blk-mq yet, so we can't
> > prioritize processes/cgroups. This patch set tries to add basic arbitration
> > between cgroups with blk-throttle. It adds a new limit io.high for
> > blk-throttle. It's only for cgroup2.
> 
> Hello Shaohua,
> 
> My understanding of this work is that a significant part of it will have to
> be reverted once blk-mq supports I/O scheduling, e.g. the code for detecting
> whether the I/O submitter is idle. Shouldn't this kind of infrastructure be
> added after support has been added in blk-mq for I/O scheduling?

Sure, if we have a CFQ-like io scheduler for blk-mq, this is largely not
required. But we don't have one yet and nothing is floating around either. The
conservative throttling is relatively easy to implement and achieves a similar
goal. The throttling could still be useful even with an ioscheduler, as
throttling is faster if we are talking about a CFQ-like scheduler. I don't
think this should be blocked to wait for I/O scheduling. There was a long
discussion in the last post, and we agreed the throttling and io scheduler
aren't mutually exclusive.
http://marc.info/?l=linux-kernel&m=147552964708965&w=2

Thanks,
Shaohua



Re: [PATCH V4 00/15] blk-throttle: add .high limit

2016-11-14 Thread Bart Van Assche

On 11/14/2016 02:22 PM, Shaohua Li wrote:
> The background is we don't have an ioscheduler for blk-mq yet, so we can't
> prioritize processes/cgroups. This patch set tries to add basic arbitration
> between cgroups with blk-throttle. It adds a new limit io.high for
> blk-throttle. It's only for cgroup2.


Hello Shaohua,

My understanding of this work is that a significant part of it will have 
to be reverted once blk-mq supports I/O scheduling, e.g. the code for 
detecting whether the I/O submitter is idle. Shouldn't this kind of 
infrastructure be added after support has been added in blk-mq for I/O 
scheduling?


Thanks,

Bart.