On Thu, Jul 14, 2016 at 08:36:09PM +0200, Peter Zijlstra wrote:
> On Thu, Jul 14, 2016 at 08:09:52PM +0200, Oleg Nesterov wrote:
> > On 07/14, Peter Zijlstra wrote:
> > >
> > > The below is a compile tested only first draft so far. I'll go give it
> > > some runtime next.
> >
> > So I will wait fo
On Thu, Jul 14, 2016 at 08:09:52PM +0200, Oleg Nesterov wrote:
> On 07/14, Peter Zijlstra wrote:
> >
> > The below is a compile tested only first draft so far. I'll go give it
> > some runtime next.
>
> So I will wait for the new version, but at first glance this matches the
> code I already revie
On 07/14, Peter Zijlstra wrote:
>
> The below is a compile tested only first draft so far. I'll go give it
> some runtime next.
So I will wait for the new version, but at first glance this matches the
code I already reviewed in the past (at least, tried hard to review ;)
and it looks correct.
Jus
On Thu, Jul 14, 2016 at 10:41 AM, Oleg Nesterov wrote:
> On 07/14, John Stultz wrote:
>>
>> I'm not supposed to be applying this on top of
>> Paul's change, right?
>
> Right, unless I am totally confused,
>
>> > Just in case, could you try the patch below? Of course, without other
>> > optimizatio
On 07/14, John Stultz wrote:
>
> I'm not supposed to be applying this on top of
> Paul's change, right?
Right, unless I am totally confused,
> > Just in case, could you try the patch below? Of course, without other
> > optimizations from Peter, this change makes cgroup_threadgroup_rwsem
> > much
On 07/14, Peter Zijlstra wrote:
>
> But I really think that this Android usecase invalidates the premise of
> cgroups using a global lock.
Perhaps... but it would be nice to have a global lock for cgroups (and in
fact probably unify it with dup_mmap_sem). And we can't simply revert that
change now
On Thu, Jul 14, 2016 at 10:13 AM, Oleg Nesterov wrote:
> On 07/14, John Stultz wrote:
>>
>> So I am seeing synchronize_sched called, and it's taking the
>> !rcu_gp_is_expedited path when I see the particularly bad latencies.
>>
>> I wonder if I just mucked up applying the patch?
>
> Probably yes...
On Thu, Jul 14, 2016 at 06:45:47PM +0200, Peter Zijlstra wrote:
> On Thu, Jul 14, 2016 at 09:23:55AM -0700, Paul E. McKenney wrote:
> > Hmmm... How does this handle the following sequence of events for
> > the case where we are not biased towards the reader?
> >
> > o The per-CPU rwsem is set u
On 07/14, John Stultz wrote:
>
> So I am seeing synchronize_sched called, and it's taking the
> !rcu_gp_is_expedited path when I see the particularly bad latencies.
>
> I wonder if I just mucked up applying the patch?
Probably yes...
Just in case, could you try the patch below? Of course, without
On 07/14, Peter Zijlstra wrote:
>
> On Thu, Jul 14, 2016 at 04:58:44PM +0200, Oleg Nesterov wrote:
> >
> > But note that we do not need RCU_NONE. All we need is the trivial change
> > below.
>
> Hurm, maybe. So having that unbalanced keeps us in GP_PASSED state and
> since we'll never drop gp_count
On Thu, Jul 14, 2016 at 9:49 AM, Peter Zijlstra wrote:
> On Thu, Jul 14, 2016 at 09:43:40AM -0700, John Stultz wrote:
>> On Thu, Jul 14, 2016 at 6:18 AM, Peter Zijlstra wrote:
>> > On Wed, Jul 13, 2016 at 10:51:02PM +0200, Peter Zijlstra wrote:
>> >> So, IIRC, the trade-off is a full memory barri
On Wed, Jul 13, 2016 at 4:02 PM, Paul E. McKenney
wrote:
> On Wed, Jul 13, 2016 at 03:39:37PM -0700, John Stultz wrote:
>>
>> But otherwise both patches look great and are working well!
>>
>> Do you mind marking them both for stable 4.4+?
>
> OK, looks like it does qualify in the "fix a notable pe
On Thu, Jul 14, 2016 at 09:43:40AM -0700, John Stultz wrote:
> On Thu, Jul 14, 2016 at 6:18 AM, Peter Zijlstra wrote:
> > On Wed, Jul 13, 2016 at 10:51:02PM +0200, Peter Zijlstra wrote:
> >> So, IIRC, the trade-off is a full memory barrier in read_lock and
> >> read_unlock() vs sync_sched() in wri
On Thu, Jul 14, 2016 at 09:23:55AM -0700, Paul E. McKenney wrote:
> Hmmm... How does this handle the following sequence of events for
> the case where we are not biased towards the reader?
>
> o The per-CPU rwsem is set up with RCU_NONE and readers_slow
> (as opposed to readers_block).
On Thu, Jul 14, 2016 at 6:18 AM, Peter Zijlstra wrote:
> On Wed, Jul 13, 2016 at 10:51:02PM +0200, Peter Zijlstra wrote:
>> So, IIRC, the trade-off is a full memory barrier in read_lock and
>> read_unlock() vs sync_sched() in write.
>>
>> Full memory barriers are expensive and while the combined c
On Thu, Jul 14, 2016 at 04:58:44PM +0200, Oleg Nesterov wrote:
>
> But note that we do not need RCU_NONE. All we need is the trivial change
> below.
Hurm, maybe. So having that unbalanced keeps us in GP_PASSED state and
since we'll never drop gp_count back to 0 nothing will ever happen.
Cute, ye
On Thu, Jul 14, 2016 at 11:07:15AM -0400, Tejun Heo wrote:
> How? While write lock is pending, no new reader is allowed.
Look at the new percpu_down_write (the old one is similar in concept):
+ void percpu_down_write(struct percpu_rw_semaphore *sem)
+ {
+	down_write(&sem->rw_sem);
+
+
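The quoted hunk is cut off above; as a rough userspace sketch, the ordering percpu_down_write follows is: serialize writers, flip readers into the blocked (slow) mode, then wait for in-flight readers to drain. All field names, helpers, and the fixed NCPUS below are illustrative, not the kernel's; per-CPU counters are modeled as a plain array of atomics:

```c
#include <pthread.h>
#include <sched.h>
#include <stdatomic.h>
#include <stdbool.h>

#define NCPUS 4

struct pcpu_rwsem {
	atomic_int read_count[NCPUS];	/* stand-in for per-CPU counters */
	atomic_bool readers_block;	/* set while a writer is active  */
	pthread_mutex_t rw_sem;		/* serializes writers            */
};

void pcpu_rwsem_init(struct pcpu_rwsem *sem)
{
	for (int i = 0; i < NCPUS; i++)
		atomic_init(&sem->read_count[i], 0);
	atomic_init(&sem->readers_block, false);
	pthread_mutex_init(&sem->rw_sem, NULL);
}

void pcpu_down_write(struct pcpu_rwsem *sem)
{
	pthread_mutex_lock(&sem->rw_sem);		/* 1) one writer at a time   */
	atomic_store(&sem->readers_block, true);	/* 2) block new readers      */
	for (int i = 0; i < NCPUS; i++)			/* 3) drain active readers   */
		while (atomic_load(&sem->read_count[i]) != 0)
			sched_yield();
}

void pcpu_up_write(struct pcpu_rwsem *sem)
{
	atomic_store(&sem->readers_block, false);
	pthread_mutex_unlock(&sem->rw_sem);
}
```

The point Tejun is asking about is visible in step 2: once readers_block is set, no new reader gets in until the writer finishes.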
On Thu, Jul 14, 2016 at 03:18:09PM +0200, Peter Zijlstra wrote:
> On Wed, Jul 13, 2016 at 10:51:02PM +0200, Peter Zijlstra wrote:
> > So, IIRC, the trade-off is a full memory barrier in read_lock and
> > read_unlock() vs sync_sched() in write.
> >
> > Full memory barriers are expensive and while t
On Thu, Jul 14, 2016 at 04:58:44PM +0200, Oleg Nesterov wrote:
>
> Of course, this leads to another question: do we really need rcu-sync at
> all, or should we change percpu-rwsem to always work in the "slow" mode
> which is not that slow with your change... I'd like to keep it ;)
>
> What do you
On Thu, Jul 14, 2016 at 11:07:15AM -0400, Tejun Heo wrote:
> On Thu, Jul 14, 2016 at 02:20:49PM +0200, Peter Zijlstra wrote:
> > > If that's the case, we have the wrong implementation
> > > for percpu-rwsem where very long delays for writers induce the same
> > > level of delays to all readers. If e
On Thu, Jul 14, 2016 at 02:11:01PM +0200, Peter Zijlstra wrote:
> > How so? As the number of cores increases, it'll get proportionally
> > more expensive as the same operation is performed on more CPUs;
> > however, the latency is dependent on the slowest one and it'll get
> > higher more often wi
On Thu, Jul 14, 2016 at 02:20:49PM +0200, Peter Zijlstra wrote:
> > If that's the case, we have the wrong implementation
> > for percpu-rwsem where very long delays for writers induce the same
> > level of delays to all readers. If expedited by default isn't
> > workable, we should move away from rc
On 07/14, Peter Zijlstra wrote:
>
> OK, not too horrible if I say so myself :-)
>
> The below is a compile tested only first draft so far. I'll go give it
> some runtime next.
Yes, thanks.
But note that we do not need RCU_NONE. All we need is the trivial change
below. Damn, I am trying to find my
On Thu, Jul 14, 2016 at 03:18:09PM +0200, Peter Zijlstra wrote:
> On Wed, Jul 13, 2016 at 10:51:02PM +0200, Peter Zijlstra wrote:
> > So, IIRC, the trade-off is a full memory barrier in read_lock and
> > read_unlock() vs sync_sched() in write.
> >
> > Full memory barriers are expensive and while t
On Wed, Jul 13, 2016 at 10:51:02PM +0200, Peter Zijlstra wrote:
> So, IIRC, the trade-off is a full memory barrier in read_lock and
> read_unlock() vs sync_sched() in write.
>
> Full memory barriers are expensive and while the combined cost might
> well exceed the cost of the sync_sched() it doesn
On Thu, Jul 14, 2016 at 08:08:45AM -0400, Tejun Heo wrote:
> On Thu, Jul 14, 2016 at 02:04:28PM +0200, Peter Zijlstra wrote:
> > > I think it probably makes sense to make this the default on !RT at
> > > least with a separate patch w/o stable cc'd. While most use cases
> > > will be fine with the
On Thu, Jul 14, 2016 at 07:20:46AM -0400, Tejun Heo wrote:
> On Thu, Jul 14, 2016 at 08:49:56AM +0200, Peter Zijlstra wrote:
> > So the immediate problem with lg style locks is that the 'local' lock
> > will not stay local since these are preemptible locks; we can get
> > migrations, etc.
> >
> >
On Thu, Jul 14, 2016 at 02:04:28PM +0200, Peter Zijlstra wrote:
> > I think it probably makes sense to make this the default on !RT at
> > least with a separate patch w/o stable cc'd. While most use cases
> > will be fine with the latency on write path, it also means that the
> > reader side is bl
On Thu, Jul 14, 2016 at 07:35:05AM -0400, Tejun Heo wrote:
> Hello,
>
> On Wed, Jul 13, 2016 at 04:04:04PM -0700, Paul E. McKenney wrote:
> > commit b4edebb8f5664a3a51be1e3ff3d7f1cb2d3d5c88
> > Author: Paul E. McKenney
> > Date: Wed Jul 13 15:13:31 2016 -0700
> >
> > rcu: Provide RCUSYNC_E
Hello,
On Wed, Jul 13, 2016 at 04:04:04PM -0700, Paul E. McKenney wrote:
> commit b4edebb8f5664a3a51be1e3ff3d7f1cb2d3d5c88
> Author: Paul E. McKenney
> Date: Wed Jul 13 15:13:31 2016 -0700
>
> rcu: Provide RCUSYNC_EXPEDITE option for rcusync.expedited default
>
> This commit provi
On Thu, Jul 14, 2016 at 08:49:56AM +0200, Peter Zijlstra wrote:
> On Wed, Jul 13, 2016 at 06:01:28PM -0400, Tejun Heo wrote:
>
> > Technically, I think the lglock approach would be better here given
> > the combination of requirements; however, it's quite a bit more code
> > which would likely req
On Wed, Jul 13, 2016 at 06:01:28PM -0400, Tejun Heo wrote:
> Technically, I think the lglock approach would be better here given
> the combination of requirements; however, it's quite a bit more code
> which would likely require some sophistications down the line (like
> blocking new readers first
On Wed, Jul 13, 2016 at 04:02:38PM -0700, Paul E. McKenney wrote:
> On Wed, Jul 13, 2016 at 03:39:37PM -0700, John Stultz wrote:
> > On Wed, Jul 13, 2016 at 3:17 PM, Paul E. McKenney
> > wrote:
> > > On Wed, Jul 13, 2016 at 02:46:37PM -0700, John Stultz wrote:
> > >> On Wed, Jul 13, 2016 at 2:42 P
On Wed, Jul 13, 2016 at 03:39:37PM -0700, John Stultz wrote:
> On Wed, Jul 13, 2016 at 3:17 PM, Paul E. McKenney
> wrote:
> > On Wed, Jul 13, 2016 at 02:46:37PM -0700, John Stultz wrote:
> >> On Wed, Jul 13, 2016 at 2:42 PM, Paul E. McKenney
> >> wrote:
> >> > On Wed, Jul 13, 2016 at 02:18:41PM -
On Wed, Jul 13, 2016 at 3:17 PM, Paul E. McKenney
wrote:
> On Wed, Jul 13, 2016 at 02:46:37PM -0700, John Stultz wrote:
>> On Wed, Jul 13, 2016 at 2:42 PM, Paul E. McKenney
>> wrote:
>> > On Wed, Jul 13, 2016 at 02:18:41PM -0700, Paul E. McKenney wrote:
>> >> On Wed, Jul 13, 2016 at 05:05:26PM -0
On Wed, Jul 13, 2016 at 06:01:28PM -0400, Tejun Heo wrote:
> Hello, Paul.
>
> On Wed, Jul 13, 2016 at 02:18:41PM -0700, Paul E. McKenney wrote:
> > On Wed, Jul 13, 2016 at 05:05:26PM -0400, Tejun Heo wrote:
> > > On Wed, Jul 13, 2016 at 02:03:15PM -0700, Paul E. McKenney wrote:
> > > > Take the pa
On Wed, Jul 13, 2016 at 2:42 PM, Paul E. McKenney
wrote:
> On Wed, Jul 13, 2016 at 02:18:41PM -0700, Paul E. McKenney wrote:
>> On Wed, Jul 13, 2016 at 05:05:26PM -0400, Tejun Heo wrote:
>> > On Wed, Jul 13, 2016 at 02:03:15PM -0700, Paul E. McKenney wrote:
>> > > Take the patch that I just sent o
On Wed, Jul 13, 2016 at 02:46:37PM -0700, John Stultz wrote:
> On Wed, Jul 13, 2016 at 2:42 PM, Paul E. McKenney
> wrote:
> > On Wed, Jul 13, 2016 at 02:18:41PM -0700, Paul E. McKenney wrote:
> >> On Wed, Jul 13, 2016 at 05:05:26PM -0400, Tejun Heo wrote:
> >> > On Wed, Jul 13, 2016 at 02:03:15PM
Hello, Paul.
On Wed, Jul 13, 2016 at 02:18:41PM -0700, Paul E. McKenney wrote:
> On Wed, Jul 13, 2016 at 05:05:26PM -0400, Tejun Heo wrote:
> > On Wed, Jul 13, 2016 at 02:03:15PM -0700, Paul E. McKenney wrote:
> > > Take the patch that I just sent out and make the choice of normal
> > > vs. expedi
On Wed, Jul 13, 2016 at 2:42 PM, Paul E. McKenney
wrote:
> On Wed, Jul 13, 2016 at 02:18:41PM -0700, Paul E. McKenney wrote:
>> On Wed, Jul 13, 2016 at 05:05:26PM -0400, Tejun Heo wrote:
>> > On Wed, Jul 13, 2016 at 02:03:15PM -0700, Paul E. McKenney wrote:
>> > > Take the patch that I just sent o
On Wed, Jul 13, 2016 at 02:18:41PM -0700, Paul E. McKenney wrote:
> On Wed, Jul 13, 2016 at 05:05:26PM -0400, Tejun Heo wrote:
> > On Wed, Jul 13, 2016 at 02:03:15PM -0700, Paul E. McKenney wrote:
> > > Take the patch that I just sent out and make the choice of normal
> > > vs. expedited depend on
On Wed, Jul 13, 2016 at 05:05:26PM -0400, Tejun Heo wrote:
> On Wed, Jul 13, 2016 at 02:03:15PM -0700, Paul E. McKenney wrote:
> > Take the patch that I just sent out and make the choice of normal
> > vs. expedited depend on CONFIG_PREEMPT_RT or whatever the -rt guys are
> > calling it these days.
On Wed, Jul 13, 2016 at 10:57:12PM +0200, Peter Zijlstra wrote:
> On Wed, Jul 13, 2016 at 01:52:11PM -0700, Paul E. McKenney wrote:
> > Not a particularly real-time-friendly fix, but certainly a good check
> > on our various assumptions.
>
> Not only RT, but also HPC and all the other RDMA userspa
On Wed, Jul 13, 2016 at 02:01:27PM -0700, Dmitry Shmidt wrote:
> On Wed, Jul 13, 2016 at 1:52 PM, Paul E. McKenney
> wrote:
> > On Wed, Jul 13, 2016 at 10:26:57PM +0200, Peter Zijlstra wrote:
> >> On Wed, Jul 13, 2016 at 04:18:23PM -0400, Tejun Heo wrote:
> >> > Hello, John.
> >> >
> >> > On Wed,
On Wed, Jul 13, 2016 at 02:03:15PM -0700, Paul E. McKenney wrote:
> Take the patch that I just sent out and make the choice of normal
> vs. expedited depend on CONFIG_PREEMPT_RT or whatever the -rt guys are
> calling it these days. Is there a low-latency Kconfig option other
> than CONFIG_NO_HZ_FU
On Wed, Jul 13, 2016 at 10:51:02PM +0200, Peter Zijlstra wrote:
> On Wed, Jul 13, 2016 at 04:39:44PM -0400, Tejun Heo wrote:
>
> > > There is a synchronize_sched() in there, so sorta. That thing is heavily
> > > geared towards readers, as is the only 'sane' choice for global locks.
> >
> > It use
On Wed, Jul 13, 2016 at 2:01 PM, Dmitry Shmidt wrote:
> On Wed, Jul 13, 2016 at 1:52 PM, Paul E. McKenney
> wrote:
>> On Wed, Jul 13, 2016 at 10:26:57PM +0200, Peter Zijlstra wrote:
>>> On Wed, Jul 13, 2016 at 04:18:23PM -0400, Tejun Heo wrote:
>>> > Hello, John.
>>> >
>>> > On Wed, Jul 13, 2016
On Wed, Jul 13, 2016 at 1:52 PM, Paul E. McKenney
wrote:
> On Wed, Jul 13, 2016 at 10:26:57PM +0200, Peter Zijlstra wrote:
>> On Wed, Jul 13, 2016 at 04:18:23PM -0400, Tejun Heo wrote:
>> > Hello, John.
>> >
>> > On Wed, Jul 13, 2016 at 01:13:11PM -0700, John Stultz wrote:
>> > > On Wed, Jul 13, 2
On Wed, Jul 13, 2016 at 10:51:02PM +0200, Peter Zijlstra wrote:
> On Wed, Jul 13, 2016 at 04:39:44PM -0400, Tejun Heo wrote:
> So, IIRC, the trade-off is a full memory barrier in read_lock and
> read_unlock() vs sync_sched() in write.
>
> Full memory barriers are expensive and while the combined co
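To make that trade-off concrete, here is what the barrier-based reader fast path looks like as a userspace sketch. C11 seq_cst atomics stand in for the smp_mb() pair under discussion; the names and the fixed NCPUS are hypothetical:

```c
#include <sched.h>
#include <stdatomic.h>
#include <stdbool.h>

#define NCPUS 4

struct pcpu_rwsem {
	atomic_int read_count[NCPUS];	/* modeled per-CPU counters   */
	atomic_bool readers_block;	/* writer pending/active flag */
};

/*
 * Barrier-based variant: the seq_cst increment is the full memory
 * barrier paid in read_lock(), ordering the counter update against the
 * readers_block load. The alternative discussed here drops that barrier
 * from the hot path and has the writer pay with a synchronize_sched()
 * instead.
 */
void pcpu_down_read(struct pcpu_rwsem *sem, int cpu)
{
	for (;;) {
		atomic_fetch_add(&sem->read_count[cpu], 1);
		if (!atomic_load(&sem->readers_block))
			return;		/* fast path: no writer pending */
		/* slow path: undo, wait for the writer, then retry */
		atomic_fetch_sub(&sem->read_count[cpu], 1);
		while (atomic_load(&sem->readers_block))
			sched_yield();
	}
}

void pcpu_up_read(struct pcpu_rwsem *sem, int cpu)
{
	atomic_fetch_sub(&sem->read_count[cpu], 1);
}
```

Every down/up pair pays two full barriers in this variant; the sync_sched() variant makes both reader operations barrier-free but stalls the writer for a grace period, which is exactly the latency this thread is chasing.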
On Wed, Jul 13, 2016 at 01:52:11PM -0700, Paul E. McKenney wrote:
> Not a particularly real-time-friendly fix, but certainly a good check
> on our various assumptions.
Not only RT, but also HPC and all the other RDMA userspace polling
freaks ;-)
But yes, it would confirm that this is indeed the i
On Wed, Jul 13, 2016 at 1:39 PM, Tejun Heo wrote:
> Hello,
>
> On Wed, Jul 13, 2016 at 10:26:57PM +0200, Peter Zijlstra wrote:
>> > So, it's a percpu rwsem issue then. I haven't really followed the
>> > percpu rwsem changes closely. Oleg, are multi-millisecond delays expected
>> > on down write expe
Hello,
On Wed, Jul 13, 2016 at 01:44:50PM -0700, Colin Cross wrote:
> > Switching between foreground and background isn't a hot path. It's
> > human initiated operations after all. It taking 80 msecs sure is
> > problematic but I'm skeptical that this is from actual contention
> > given that the
On Wed, Jul 13, 2016 at 10:26:57PM +0200, Peter Zijlstra wrote:
> On Wed, Jul 13, 2016 at 04:18:23PM -0400, Tejun Heo wrote:
> > Hello, John.
> >
> > On Wed, Jul 13, 2016 at 01:13:11PM -0700, John Stultz wrote:
> > > On Wed, Jul 13, 2016 at 11:33 AM, Tejun Heo wrote:
> > > > On Wed, Jul 13, 2016
On Wed, Jul 13, 2016 at 04:39:44PM -0400, Tejun Heo wrote:
> > There is a synchronize_sched() in there, so sorta. That thing is heavily
> > geared towards readers, as is the only 'sane' choice for global locks.
>
> It used to use the expedited variant until 001dac627ff3
> ("locking/percpu-rwsem:
On Wed, Jul 13, 2016 at 11:21 AM, Tejun Heo wrote:
> (cc'ing Oleg)
>
> Hello,
>
> On Tue, Jul 12, 2016 at 05:00:04PM -0700, John Stultz wrote:
>> So Dmitry Shmidt recently noticed that with 4.4 based systems we're
>> seeing quite a bit of performance overhead from
>> __cgroup_procs_write().
>>
>
Hello,
On Wed, Jul 13, 2016 at 10:26:57PM +0200, Peter Zijlstra wrote:
> > So, it's a percpu rwsem issue then. I haven't really followed the
> > percpu rwsem changes closely. Oleg, are multi-millisecond delays
> > on down_write expected with the current implementation of
> > percpu_rwsem?
>
On Wed, Jul 13, 2016 at 11:33 AM, Tejun Heo wrote:
> On Wed, Jul 13, 2016 at 02:21:02PM -0400, Tejun Heo wrote:
>> One interesting thing to try would be replacing it with a regular
>> non-percpu rwsem and see how it behaves. That should easily tell us
>> whether this is from actual contention or
On Wed, Jul 13, 2016 at 04:18:23PM -0400, Tejun Heo wrote:
> Hello, John.
>
> On Wed, Jul 13, 2016 at 01:13:11PM -0700, John Stultz wrote:
> > On Wed, Jul 13, 2016 at 11:33 AM, Tejun Heo wrote:
> > > On Wed, Jul 13, 2016 at 02:21:02PM -0400, Tejun Heo wrote:
> > >> One interesting thing to try wo
Hello, John.
On Wed, Jul 13, 2016 at 01:13:11PM -0700, John Stultz wrote:
> On Wed, Jul 13, 2016 at 11:33 AM, Tejun Heo wrote:
> > On Wed, Jul 13, 2016 at 02:21:02PM -0400, Tejun Heo wrote:
> >> One interesting thing to try would be replacing it with a regular
> >> non-percpu rwsem and see how it
On Wed, Jul 13, 2016 at 11:33 AM, Tejun Heo wrote:
> On Wed, Jul 13, 2016 at 02:21:02PM -0400, Tejun Heo wrote:
>> One interesting thing to try would be replacing it with a regular
>> non-percpu rwsem and see how it behaves. That should easily tell us
>> whether this is from actual contention or
On Wed, Jul 13, 2016 at 11:13:26AM -0700, Dmitry Shmidt wrote:
> On Wed, Jul 13, 2016 at 7:42 AM, Paul E. McKenney
> wrote:
> > On Wed, Jul 13, 2016 at 10:21:12AM +0200, Peter Zijlstra wrote:
> >> On Tue, Jul 12, 2016 at 05:00:04PM -0700, John Stultz wrote:
> >> > Hey Tejun,
> >> >
> >> > So Dmi
On Wed, Jul 13, 2016 at 02:21:02PM -0400, Tejun Heo wrote:
> One interesting thing to try would be replacing it with a regular
> non-percpu rwsem and see how it behaves. That should easily tell us
> whether this is from actual contention or artifacts from percpu_rwsem
> implementation.
So, someth
(cc'ing Oleg)
Hello,
On Tue, Jul 12, 2016 at 05:00:04PM -0700, John Stultz wrote:
> So Dmitry Shmidt recently noticed that with 4.4 based systems we're
> seeing quite a bit of performance overhead from
> __cgroup_procs_write().
>
> With 4.4 tree as it stands, we're seeing __cgroup_procs_write(
On Wed, Jul 13, 2016 at 10:21:12AM +0200, Peter Zijlstra wrote:
> On Tue, Jul 12, 2016 at 05:00:04PM -0700, John Stultz wrote:
> > Hey Tejun,
> >
> > So Dmitry Shmidt recently noticed that with 4.4 based systems we're
> > seeing quite a bit of performance overhead from
> > __cgroup_procs_write()
On Tue, Jul 12, 2016 at 05:00:04PM -0700, John Stultz wrote:
> Hey Tejun,
>
> So Dmitry Shmidt recently noticed that with 4.4 based systems we're
> seeing quite a bit of performance overhead from
> __cgroup_procs_write().
>
> With 4.4 tree as it stands, we're seeing __cgroup_procs_write() quite
Hey Tejun,
So Dmitry Shmidt recently noticed that with 4.4 based systems we're
seeing quite a bit of performance overhead from
__cgroup_procs_write().
With 4.4 tree as it stands, we're seeing __cgroup_procs_write() quite
often take tens of milliseconds to execute (with max times up in the
80ms ra