Hi Yuyang,

On Fri, Jun 19, 2015 at 11:11:16AM +0800, Yuyang Du wrote:
> On Fri, Jun 19, 2015 at 03:57:24PM +0800, Boqun Feng wrote:
> > > 
> > > This rewrite patch does not NEED to aggregate the entities' load to the
> > > cfs_rq, but rather directly updates the cfs_rq's load (both runnable and
> > > blocked), so there is NO NEED to iterate all of the cfs_rqs.
> > 
> > Actually, I'm not sure whether we NEED to aggregate or NOT.
> > 
> > > 
> > > So simply updating the top cfs_rq is already equivalent to the stock.
> > > 
> 
> Ok. As for aggregation, the rewrite patch does not need it, because the
> cfs_rq's load is calculated at once with all its runnable and blocked tasks
> counted, assuming all the children's weights are up-to-date, of course.
> Please refer to the changelog to get an idea.
> 
> > The stock does have a bottom-up update, so simply updating the top
> > cfs_rq is not equivalent to it. Simply updating the top cfs_rq is
> > equivalent to the rewrite patch, because the rewrite patch lacks the
> > aggregation.
> 
> It is not that the rewrite patch "lacks" aggregation; it is needless. The
> stock has to do a bottom-up update and aggregate, because 1) it updates the
> load at an entity granularity, and 2) the blocked load is separate.
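(Before I answer: to make sure I read the above correctly, here is the
toy model I have of the rewrite's per-cfs_rq update, in plain,
compilable C.  This is only how I picture it -- made-up names, floating
point instead of the kernel's fixed-point geometric series, milliseconds
instead of 1024us periods -- not the actual kernel code.)

/*
 * Toy model of update_cfs_rq_load_avg() in the rewrite, as I understand
 * it.  NOT the kernel code: floating point instead of the fixed-point
 * geometric series, milliseconds instead of 1024us periods, and the
 * struct and names below are made up.
 */
#include <stdio.h>
#include <stdint.h>
#include <math.h>

/* y^32 = 0.5, the same half-life the kernel uses for decay */
static const double Y = 0.978572062;

struct toy_cfs_rq {
        double load_avg;           /* runnable + blocked, decayed together */
        double load_weight;        /* \Sum se->load.weight of queued entities */
        uint64_t last_update_time; /* in ms for this toy */
};

static void toy_update_cfs_rq_load_avg(struct toy_cfs_rq *cfs_rq, uint64_t now)
{
        double decay = pow(Y, (double)(now - cfs_rq->last_update_time));

        /*
         * 1) Load accrued before last_update_time is already in
         *    ->load_avg, whether its owners are still runnable or have
         *    blocked since; it only needs to be decayed.
         */
        cfs_rq->load_avg *= decay;

        /*
         * 2) The window [last_update_time, now] is charged with the
         *    weight the cfs_rq had at last_update_time (assuming it was
         *    runnable for the whole window, to keep the toy simple).
         */
        cfs_rq->load_avg += cfs_rq->load_weight * (1.0 - decay);

        cfs_rq->last_update_time = now;
}

int main(void)
{
        struct toy_cfs_rq rq = { .load_avg = 0.0, .load_weight = 1024.0 };

        for (uint64_t now = 1; now <= 64; now++)
                toy_update_cfs_rq_load_avg(&rq, now);

        /* converges towards ->load_weight; ~768.0 after two half-lives */
        printf("load_avg = %.1f\n", rq.load_avg);
        return 0;
}

With that picture in mind: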
Yep, you are right, the aggregation is not necessary.

Let me see if I understand you: in the rewrite, when we call
update_cfs_rq_load_avg() we need neither to aggregate the children's
load_avg, nor to update cfs_rq->load.weight, because:

1) For the load before cfs_rq->last_update_time, it's already in
   ->load_avg, and decay will do the job.

2) For the load from cfs_rq->last_update_time to now, we calculate with
   cfs_rq->load.weight, and the weight should be the weight at
   ->last_update_time rather than now.

Right?

> > > It is better if we iterate the cfs_rq to update the actual weight
> > > (update_cfs_shares), because the weight may have already changed, which
> > > would in turn change the load. But update_cfs_shares is not cheap.
> > > 
> > > Right?
> > 
> > You get me right for the most part ;-)
> > 
> > My points are:
> > 
> > 1. We *may not* need to aggregate the entities' load to the cfs_rq in
> > update_blocked_averages(), simply updating the top cfs_rq may be just
> > fine, but I'm not sure, so scheduler experts' insights are needed here.
> 
> Then I don't need to say anything about this.
> 
> > 2. Whether we need to aggregate or not, the update_blocked_averages() in
> > the rewrite patch could be improved. If we need to aggregate, we have to
> > add something like update_cfs_shares(). If we don't need to, we can just
> > replace the loop with one update_cfs_rq_load_avg() on the root cfs_rq.
> 
> If update_cfs_shares() is done here, it is good, but it is probably not
> necessary. However, we do need to update_tg_load_avg() here, because if a
> cfs_rq's load changes, the parent tg's load_avg should change too. I will
> upload a next version soon.

We may have another problem even if we update_tg_load_avg(): after the
loop, for each cfs_rq, ->load.weight is not up-to-date, right? So next
time, before we update_cfs_rq_load_avg(), we need to guarantee that the
cfs_rq->load.weight is already updated, right?

And IMO, we don't have that guarantee yet, do we? (I tried to sketch the
loop I mean at the bottom of this mail.)

> 
> In addition, an update to the stress + dbench test case:
> 
> I have a Core i7, not a Xeon Nehalem, and I have a patch that may not impact
> the result. Then, dbench runs at very low CPU utilization, ~1%. Boqun said
> this may result from cgroup control, since the dbench I/O is low.
> 
> Anyway, I can't reproduce the results: CPU0's util is 92+%, and the other
> CPUs have ~100% util.

Thank you for looking into that problem, and I will test with your new
version of the patch ;-)

Thanks,
Boqun

> 
> Thanks,
> Yuyang
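P.S. To make the ->load.weight question above concrete, here is the
update_blocked_averages() loop as I picture it, with the
update_tg_load_avg()-like step you say you will add.  This is only a toy
in plain C, not the kernel code: the names are made up, locking and the
leaf-cfs_rq list handling are omitted, and the decay is the same
floating-point stand-in as in the sketch earlier in this mail.

/* Toy sketch only -- made-up names, not the kernel's functions. */
#include <math.h>
#include <stdint.h>
#include <stddef.h>

static const double Y = 0.978572062;    /* y^32 = 0.5 */

struct toy_task_group {
        double load_avg;                /* group-wide load sum */
};

struct toy_cfs_rq {
        struct toy_task_group *tg;
        double load_avg;
        double load_weight;             /* is this up to date here? */
        double tg_load_avg_contrib;     /* last value folded into tg */
        uint64_t last_update_time;
};

static void toy_update_cfs_rq_load_avg(struct toy_cfs_rq *cfs_rq, uint64_t now)
{
        double decay = pow(Y, (double)(now - cfs_rq->last_update_time));

        /*
         * The window since last_update_time is charged with
         * ->load_weight, so a stale weight gives a stale load_avg.
         */
        cfs_rq->load_avg = cfs_rq->load_avg * decay +
                           cfs_rq->load_weight * (1.0 - decay);
        cfs_rq->last_update_time = now;
}

static void toy_update_tg_load_avg(struct toy_cfs_rq *cfs_rq)
{
        /* fold the cfs_rq's new load_avg into its task group's sum */
        cfs_rq->tg->load_avg += cfs_rq->load_avg - cfs_rq->tg_load_avg_contrib;
        cfs_rq->tg_load_avg_contrib = cfs_rq->load_avg;
}

static void toy_update_blocked_averages(struct toy_cfs_rq **leaf_cfs_rqs,
                                        size_t nr, uint64_t now)
{
        for (size_t i = 0; i < nr; i++) {
                struct toy_cfs_rq *cfs_rq = leaf_cfs_rqs[i];

                /*
                 * Nothing in this loop refreshes cfs_rq->load_weight
                 * (there is no update_cfs_shares()-like step), which is
                 * the guarantee I am asking about above.
                 */
                toy_update_cfs_rq_load_avg(cfs_rq, now);
                toy_update_tg_load_avg(cfs_rq);
        }
}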