On 16/05/17 15:52, Byungchul Park wrote:
> On Fri, May 12, 2017 at 10:25:30AM -0400, Steven Rostedt wrote:
> > On Fri, 12 May 2017 14:48:45 +0900
> > Byungchul Park <byungchul.p...@lge.com> wrote:
> >
> > > cpudl.elements is an instance that should be protected with a spin lock.
> > > Without it, the code would be insane.
> >
> > And how much contention will this add? Spin locks in the scheduler code
> > that are shared among a domain can cause huge latency. This was why I
> > worked hard not to add any in the cpupri code.
>
> Yes. That's also why I hesitated to post this patch..
>
> > > Current cpudl_find() has problems like,
> > >
> > > 1. cpudl.elements[0].cpu might not match with cpudl.elements[0].dl.
> > > 2. cpudl.elements[0].dl (u64) might not be read atomically.
> > > 3. Two cpudl_maximum()s might return different values.
> > > 4. It's just insane.
> >
> > And lockless algorithms usually are insane. But locks come with a huge
> > cost. What happens when we have 32 core domains? This can cause
> > tremendous contention and makes the entire cpu priority for deadlines
> > useless. Might as well rip out the code.
>
> I think it would be better if we, even keeping it lockless,
>
> 1. make cp->elements[].cpu referred to once rather than twice,
> 2. add retry logic in order to match elements[].cpu with its dl,
> 3. add retry logic so that the u64 variable is read atomically,
>
> So what do you think about the suggestions? Of course it does not solve
> the problems perfectly though.. Or do you think it's not worth it?
Not sure, but if we are going to retry a lot we might be better off putting
proper locking in place instead? We could also simply bail out when we notice
that something has changed under our feet.

I'd say (again :) we might first want to understand (with some numbers) how
bad the problem is, and then fix it. I guess numbers might also help us
better understand what the best fix is?

Thanks,

- Juri
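For illustration only, the retry-and-bail-out idea discussed above could look
roughly like the sketch below. It assumes a hypothetical seqcount_t field
(called "seq" here) added to struct cpudl and bumped by the writers while they
already hold cp->lock; neither that field nor the cpudl_maximum_pair() helper
exists in the current code, and the struct layout is only reproduced as far as
the sketch needs it.

#include <linux/types.h>
#include <linux/spinlock.h>
#include <linux/seqlock.h>
#include <linux/compiler.h>

/*
 * Hypothetical layout: the current struct cpudl plus one seqcount_t that
 * cpudl_set()/cpudl_clear() would bump around every heap update while
 * holding cp->lock.  Only the fields used below are shown.
 */
struct cpudl_item {
	u64 dl;
	int cpu;
	int idx;
};

struct cpudl {
	raw_spinlock_t lock;
	int size;
	seqcount_t seq;			/* hypothetical, not in mainline */
	struct cpudl_item *elements;	/* elements[0] is the max-dl entry */
};

/*
 * Hypothetical reader: fetch the top (cpu, dl) pair so that both values
 * come from the same heap state.  It retries a few times (Byungchul's
 * suggestion) and then bails out (Juri's suggestion) instead of spinning.
 * Returns the cpu on success, or -1 if the heap kept changing.
 */
static int cpudl_maximum_pair(struct cpudl *cp, u64 *dl_ret)
{
	int tries = 3;

	do {
		unsigned int seq = read_seqcount_begin(&cp->seq);
		int cpu = READ_ONCE(cp->elements[0].cpu);
		u64 dl = READ_ONCE(cp->elements[0].dl);

		if (!read_seqcount_retry(&cp->seq, seq)) {
			/* cpu and dl are consistent, and dl was not torn */
			*dl_ret = dl;
			return cpu;
		}
	} while (--tries);

	return -1;	/* too much churn; caller falls back to a full scan */
}

On the write side, cpudl_set()/cpudl_clear() would wrap their elements[]
updates in write_seqcount_begin(&cp->seq)/write_seqcount_end(&cp->seq) under
cp->lock, keeping the reader's retry window short. Whether the extra
cache-line bouncing on the seqcount ends up cheaper than simply taking
cp->lock in the reader is exactly the kind of thing the numbers asked for
above would have to show.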