On Fri, Mar 08, 2019 at 12:22:20PM -0500, Josef Bacik wrote:
> On Thu, Mar 07, 2019 at 07:08:31PM +0100, Andrea Righi wrote:
> > = Problem =
> > 
> > When sync() is executed from a high-priority cgroup, the process is forced 
> > to
> > wait the completion of the entire outstanding writeback I/O, even the I/O 
> > that
> > was originally generated by low-priority cgroups potentially.
> > 
> > This may cause massive latencies to random processes (even those running in 
> > the
> > root cgroup) that shouldn't be I/O-throttled at all, similarly to a classic
> > priority inversion problem.
> > 
> > This topic has been previously discussed here:
> > https://patchwork.kernel.org/patch/10804489/
> > 
> 
> Sorry to move the goal posts on you again Andrea, but Tejun and I talked about
> this some more offline.
> 
> We don't want cgroup to become the arbiter of correctness/behavior here.  We
> just want it to be isolating things.
> 
> For you that means you can drop the per-cgroup flag stuff, and only do the
> priority boosting for multiple sync(2) waiters.  That is a real priority
> inversion that needs to be fixed.  io.latency and io.max are capable of 
> noticing
> that a low priority group is going above their configured limits and putting
> pressure elsewhere accordingly.

Alright, so IIUC that means we just need patch 1/3 for now (with the
per-bdi lock instead of the global lock). If that's the case I'll focus
at that patch then.

> 
> Tejun said he'd rather see the sync(2) isolation be done at the namespace 
> level.
> That way if you have fs namespacing you are already isolated to your 
> namespace.
> If you feel like tackling that then hooray, but that's a separate dragon to 
> slay
> so don't feel like you have to right now.

Makes sense. I can take a look and see what I can do after posting the
new patch with the priority inversion fix only.

> 
> This way we keep cgroup doing its job, controlling resources.  Then we allow
> namespacing to do its thing, isolating resources.  Thanks,
> 
> Josef

Looks like a good plan to me. Thanks for the update.

-Andrea

Reply via email to