On Tue, Sep 20, 2016 at 02:11:04AM +0200, Dario Faggioli wrote:
>On Mon, 2016-09-19 at 21:33 +0800, Peng Fan wrote:
>> On Mon, Sep 19, 2016 at 11:33:58AM +0100, George Dunlap wrote:
>> >
>> > No, I think it would be a lot simpler to just teach the scheduler
>> > about different classes of cpus.  credit1 would probably need to be
>> > modified so that its credit algorithm would be per-class rather than
>> > pool-wide; but credit2 shouldn't need much modification at all, other
>> > than to make sure that a given runqueue doesn't include more than one
>> > class; and to do load-balancing only with runqueues of the same class.
>>
>> I try to follow.
>>  - scheduler needs to be aware of different classes of cpus. ARM
>> big.LITTLE cpus.
>>
>Yes, I think this is essential.
>
>>  - scheduler schedules vcpus on different physical cpus in one
>> cpupool.
>>
>Yep, that's what the scheduler does. And personally, I'd start
>implementing big.LITTLE support for a situation where both big and
>LITTLE cpus coexist in the same pool.
It's great if you plan to work on the scheduler part.

>
>>  - different cpu classes needs to be in different runqueue.
>>
>Yes. So, basically, imagine to use vcpu pinning to support big.LITTLE.
>I've spoken briefly about this in my reply to Juergen. You probably can
>even get something like this up-&-running by writing very few or zero
>code (you'll need --for now-- dom0_max_vcpus, dom0_vcpus_pin, and then,
>in domain config files, "cpus='...'").
>
>Then, the real goal, would be to achieve the same behavior
>automatically, by acting on runqueues' arrangement and load balancing
>logic in the scheduler(s).
>
>Anyway, sorry for my ignorance on big.LITTLE, but there's something I'm
>missing: _when_ is it that it is (or needs to be) decided whether a
>vcpu will run on a big or LITTLE core?

Big cores are more powerful than little cores, but consume more power.
In the Linux kernel, Linaro is working on the EAS scheduler to take
advantage of big.LITTLE.
http://www.linaro.org/blog/core-dump/energy-aware-scheduling-eas-project/

As discussed, for a big.LITTLE guest OS that has big vcpus and little
vcpus, we only need to make sure that big vcpus are scheduled on big
physical cpus and little vcpus are scheduled on little physical cpus.
So a vcpu is never moved between big and little physical cpus.

>
>Thinking to a bare metal system, I think that cpu X is, for instance, big, and
>will always be like that; similarly, cpu Y is LITTLE.
>
>This makes me think that, for a virtual machine, it is ok to choose/specify at
>_domain_creation_ time, which vcpus are big and which vcpus are LITTLE, is
>this correct?
>If yes, this also means that --whatever way we find to make this happen,
>cpupools, scheduler, etc-- the vcpus that we decided they are big, must only
>be scheduled on actual big pcpus, and vcpus that we decided they are LITTLE,
>must only be scheduled on actual LITTLE pcpus, correct again?
>
>> Then for implementation.
>>  - When create a guest, specify physical cpus that the guest will be
>> run on.
>>
>I'd actually do that the other way round. I'd ask the user to specify
>how many --and, if that's important, which-- vcpus are big and how
>many/which are LITTLE.
>
>Knowing that, we also know whether the domain is a big only, LITTLE
>only or big.LITTLE one. And we also know on which set of pcpus each set
>of vcpus should be restricted to.
>
>So, basically (but it's just an example) something like this, in the xl
>config file of a guest:
>
>1) big.LITTLE guest, with 2 big and 2 LITTLE vcpus. User doesn't care
>   which is which, so a default could be 0,1 big and 2,3 LITTLE:
>
> vcpus = 4
> vcpus.big = 2
>
>2) big.LITTLE guest, with 8 vcpus, of which 0,2,4 and 6 are big:
>
>vcpus = 8
>vcpus.big = [0, 2, 4, 6]
>
>Which would be the same as
>
>vcpus = 8
>vcpus.little = [1, 3, 5, 7]
>
>3) guest with 4 vcpus, all big:
>
>vcpus = 4
>vcpus.big = "all"
>
>Which would be the same as:
>
>vcpus = 4
>vcpus.little = "none"
>
>And also the same as just:
>
>vcpus = 4
>
>
>Or something like this
>
>>  - If the physical cpus are different cpus, indicate the guest would
>> like to be a big.little guest.
>>    And have big vcpus and little vcpus.
>>
>Not liking this as _the_ way of specifying the guest topology, wrt
>big.LITTLE-ness (see alternative proposal right above. :-))
>
>However, right now we support pinning/affinity already. We certainly
>need to decide what to do if, e.g., no vcpus.big or vcpus.little are
>present, but the vcpus have hard or soft affinity to some specific
>pcpus.
>
>So, right now, this, in the xl config file:
>
>cpus = [2, 8, 12, 13, 15, 17]
>
>means that we want to pin 1-to-1 vcpu 0 to pcpu 2, vcpu 1 to pcpu 8,
>vcpu 2 to pcpu 12, vcpu 3 to pcpu 13, vcpu 4 to pcpu 15 and vcpu 5 to
>pcpu 17.
>Now, if cores 2, 8 and 12 are big, and no vcpus.big or
>vcpus.little is specified, I'd put forward the assumption that the user
>wants vcpus 0, 1 and 2 to be big, and vcpus 3, 4, and 5 to be LITTLE.
>
>If, instead, there are vcpus.big or vcpus.little specified, and there's
>disagreement, I'd either error out or decide which overrides the other
>(and print a WARNING about that happening).
>
>Still right now, this:
>
>cpus = "2-12"
>
>means that all the vcpus of the domain have hard affinity (i.e., are
>pinned) to pcpus 2-12. And in this case I'd conclude that the user
>wants for all the vcpus to be big.
>
>I'm less sure what to do if _only_ soft-affinity is specified (via
>"cpus_soft="), or if hard-affinity contains both big and LITTLE pcpus,
>like, e.g.:
>
>cpus = "2-15"
>
>>  - If no physical cpus specified, then the guest may run on big
>> cpus or on little cpus. But not both.
>>
>Yes. if nothing (or something contradictory) is specified, we "just"
>have to decide what's the sanest default.
>
>>    How to decide runs on big or little physical cpus?
>>
>I'd default to big.
>
>>  - For Dom0, I am still not sure, default big.little or else?
>>
>Again, if nothing is specified, I'd probably default to:
> - give dom0 as many vcpus as there are big cores
> - restrict them to big cores
>
>But, of course, I think we should add boot time parameters like these
>ones:
>
> dom0_vcpus_big = 4
> dom0_vcpus_little = 2
>
>which would mean the user wants dom0 to have 4 big and 2 LITTLE
>cores... and then we act accordingly, as described above, and in other
>emails.
>
>> If use scheduler to handle the different classes cpu, we do not need
>> to use cpupool
>> to block vcpus be scheduled onto different physical cpus. And using
>> scheduler to handle this
>> gives an opportunity to support big.little guest.
>>
>Exactly, this is one strong point in favour of this solution, IMO!

In the long run, I agree this is a good solution.

Thanks,
Peng.
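
For reference, the affinity-based defaults discussed above could be sketched roughly as follows. This is only an illustration in Python, not Xen or xl code: the function names, the BIG set (pcpus 2-12 assumed big, everything else LITTLE) and the choice to raise an error on disagreement are all assumptions made for the example. The thread leaves the mixed big/LITTLE mask and the soft-affinity-only case open, so the sketch reports the mixed case as undecided and does not model soft affinity.

```python
from typing import Iterable, Optional, Sequence, Set


def classify_pinned(pin_list: Sequence[int], big_pcpus: Set[int],
                    vcpus_big: Optional[Iterable[int]] = None) -> Set[int]:
    """1-to-1 pinning: vcpu i is pinned to pin_list[i].

    Returns the vcpu indices treated as big; all others are LITTLE.
    """
    inferred = {vcpu for vcpu, pcpu in enumerate(pin_list) if pcpu in big_pcpus}
    if vcpus_big is not None and set(vcpus_big) != inferred:
        # The thread leaves open whether to error out or to let one setting
        # override the other with a warning; this sketch errors out.
        raise ValueError("vcpus.big disagrees with the pinning in cpus")
    return inferred


def classify_shared(n_vcpus: int, affinity: Optional[Iterable[int]],
                    big_pcpus: Set[int]) -> Optional[Set[int]]:
    """All vcpus share one hard-affinity mask (e.g. cpus = "2-12").

    Returns the set of big vcpus, or None when the mask mixes big and
    LITTLE pcpus (a case the thread leaves undecided).
    """
    mask = set(affinity) if affinity is not None else set()
    if not mask or mask <= big_pcpus:
        return set(range(n_vcpus))  # nothing specified, or big only: all big
    if mask.isdisjoint(big_pcpus):
        return set()                # LITTLE only: no big vcpus
    return None                     # mixed mask: undecided


# Assumption for the example: pcpus 2-12 are big, every other pcpu is LITTLE.
BIG = set(range(2, 13))

print(classify_pinned([2, 8, 12, 13, 15, 17], BIG))  # cpus = [2, 8, 12, 13, 15, 17]
print(classify_shared(4, range(2, 13), BIG))         # cpus = "2-12"
print(classify_shared(4, range(2, 16), BIG))         # cpus = "2-15"
```

With these assumptions, the 1-to-1 pin list makes vcpus 0, 1 and 2 big, cpus = "2-12" makes every vcpu big, and cpus = "2-15" is left undecided.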
>
>Regards,
>Dario
>--
><<This happens because I choose it to happen!>> (Raistlin Majere)
>-----------------------------------------------------------------
>Dario Faggioli, Ph.D, http://about.me/dario.faggioli
>Senior Software Engineer, Citrix Systems R&D Ltd., Cambridge (UK)

--

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel