On Mon, 2013-01-21 at 17:22 +0800, Michael Wang wrote: 
> On 01/21/2013 05:09 PM, Mike Galbraith wrote:
> > On Mon, 2013-01-21 at 15:45 +0800, Michael Wang wrote: 
> >> On 01/21/2013 03:09 PM, Mike Galbraith wrote:
> >>> On Mon, 2013-01-21 at 07:42 +0100, Mike Galbraith wrote: 
> >>>> On Mon, 2013-01-21 at 13:07 +0800, Michael Wang wrote:
> >>>
> >>>>> Maybe we could try changing this back to the old way later, after
> >>>>> the aim7 test on my server.
> >>>>
> >>>> Yeah, something funny is going on.
> >>>
> >>> Never entering balance path kills the collapse.  Asking wake_affine()
> >>> wrt the pull as before, but allowing us to continue should no idle cpu
> >>> be found, still collapsed.  So the source of funny behavior is indeed in
> >>> balance_path.
> >>
> >> The patch below, based on the patch set, could help avoid entering the
> >> balance path when an affine_sd can be found, just like the old logic.
> >> Would you like to give it a try and see whether it helps fix the collapse?
> > 
> > No, it does not.
> 
> Hmm... what has changed now compared to the old logic?

What I did earlier to confirm that the collapse originates in balance_path is
below.  I just retested to verify.

Tasks    jobs/min  jti  jobs/min/task      real       cpu
    1      435.34  100       435.3448     13.92      3.76   Mon Jan 21 10:24:00 2013
    1      440.09  100       440.0871     13.77      3.76   Mon Jan 21 10:24:22 2013
    1      440.41  100       440.4070     13.76      3.75   Mon Jan 21 10:24:45 2013
    5     2467.43   99       493.4853     12.28     10.71   Mon Jan 21 10:24:59 2013
    5     2445.52   99       489.1041     12.39     10.98   Mon Jan 21 10:25:14 2013
    5     2475.49   99       495.0980     12.24     10.59   Mon Jan 21 10:25:27 2013
   10     4963.14   99       496.3145     12.21     20.64   Mon Jan 21 10:25:41 2013
   10     4959.08   99       495.9083     12.22     21.26   Mon Jan 21 10:25:54 2013
   10     5415.55   99       541.5550     11.19     11.54   Mon Jan 21 10:26:06 2013
   20     9934.43   96       496.7213     12.20     33.52   Mon Jan 21 10:26:18 2013
   20     9950.74   98       497.5369     12.18     36.52   Mon Jan 21 10:26:31 2013
   20     9893.88   96       494.6939     12.25     34.39   Mon Jan 21 10:26:43 2013
   40    18937.50   98       473.4375     12.80     84.74   Mon Jan 21 10:26:56 2013
   40    18996.87   98       474.9216     12.76     88.64   Mon Jan 21 10:27:09 2013
   40    19146.92   98       478.6730     12.66     89.98   Mon Jan 21 10:27:22 2013
   80    37610.55   98       470.1319     12.89    112.01   Mon Jan 21 10:27:35 2013
   80    37321.02   98       466.5127     12.99    114.21   Mon Jan 21 10:27:48 2013
   80    37610.55   98       470.1319     12.89    111.77   Mon Jan 21 10:28:01 2013
  160    69109.05   98       431.9316     14.03    156.81   Mon Jan 21 10:28:15 2013
  160    69505.38   98       434.4086     13.95    155.33   Mon Jan 21 10:28:29 2013
  160    69207.71   98       432.5482     14.01    155.79   Mon Jan 21 10:28:43 2013
  320   108033.43   98       337.6045     17.95    314.01   Mon Jan 21 10:29:01 2013
  320   108577.83   98       339.3057     17.86    311.79   Mon Jan 21 10:29:19 2013
  320   108395.75   98       338.7367     17.89    312.55   Mon Jan 21 10:29:37 2013
  640   151440.84   98       236.6263     25.61    620.37   Mon Jan 21 10:30:03 2013
  640   151440.84   97       236.6263     25.61    621.23   Mon Jan 21 10:30:29 2013
  640   151145.75   98       236.1652     25.66    622.35   Mon Jan 21 10:30:55 2013
 1280   190117.65   98       148.5294     40.80   1228.40   Mon Jan 21 10:31:36 2013
 1280   189977.96   98       148.4203     40.83   1229.91   Mon Jan 21 10:32:17 2013
 1280   189560.12   98       148.0938     40.92   1231.71   Mon Jan 21 10:32:58 2013
 2560   217857.04   98        85.1004     71.21   2441.61   Mon Jan 21 10:34:09 2013
 2560   217338.19   98        84.8977     71.38   2448.76   Mon Jan 21 10:35:21 2013
 2560   217795.87   97        85.0765     71.23   2443.12   Mon Jan 21 10:36:32 2013

That was with your change backed out, and the quick/dirty patch below applied.

---
 kernel/sched/fair.c |   27 ++++++---------------------
 1 file changed, 6 insertions(+), 21 deletions(-)

--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -3337,6 +3337,8 @@ select_task_rq_fair(struct task_struct *
                goto unlock;
 
        if (sd_flag & SD_BALANCE_WAKE) {
+               new_cpu = prev_cpu;
+
                /*
                 * Tasks to be waked is special, memory it relied on
                 * may has already been cached on prev_cpu, and usually
@@ -3348,33 +3350,16 @@ select_task_rq_fair(struct task_struct *
                 * from top to bottom, which help to reduce the chance in
                 * some cases.
                 */
-               new_cpu = select_idle_sibling(p, prev_cpu);
+               new_cpu = select_idle_sibling(p, new_cpu);
                if (idle_cpu(new_cpu))
                        goto unlock;
 
-               /*
-                * No idle cpu could be found in the topology of prev_cpu,
-                * before jump into the slow balance_path, try search again
-                * in the topology of current cpu if it is the affine of
-                * prev_cpu.
-                */
-               if (!sbm->affine_map[prev_cpu] ||
-                               !cpumask_test_cpu(cpu, tsk_cpus_allowed(p)))
-                       goto balance_path;
-
-               new_cpu = select_idle_sibling(p, cpu);
-               if (!idle_cpu(new_cpu))
-                       goto balance_path;
+               if (wake_affine(sbm->affine_map[cpu], p, sync))
+                       new_cpu = select_idle_sibling(p, cpu);
 
-               /*
-                * Invoke wake_affine() finally since it is no doubt a
-                * performance killer.
-                */
-               if (wake_affine(sbm->affine_map[prev_cpu], p, sync))
-                       goto unlock;
+               goto unlock;
        }
 
-balance_path:
        new_cpu = (sd_flag & SD_BALANCE_WAKE) ? prev_cpu : cpu;
        sd = sbm->sd[type][sbm->top_level[type]];
 
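For reference, here is roughly how the SD_BALANCE_WAKE block reads once the
quick/dirty patch above is applied.  This is only a sketch reconstructed from
the hunks; the surrounding select_task_rq_fair() context (sbm, affine_map, the
unlock label) comes from your patch set and is assumed rather than shown:

        if (sd_flag & SD_BALANCE_WAKE) {
                new_cpu = prev_cpu;

                /* Look for an idle cpu in prev_cpu's topology first. */
                new_cpu = select_idle_sibling(p, new_cpu);
                if (idle_cpu(new_cpu))
                        goto unlock;

                /*
                 * Nothing idle near prev_cpu: ask wake_affine() whether a
                 * pull to the waking cpu is acceptable, and if so search
                 * that topology instead.  Either way we are done, so the
                 * wakeup never falls through into balance_path.
                 */
                if (wake_affine(sbm->affine_map[cpu], p, sync))
                        new_cpu = select_idle_sibling(p, cpu);

                goto unlock;
        }

The unconditional goto unlock is the point: a wakeup either stays on prev_cpu,
moves to an idle cpu found near prev_cpu, or, when wake_affine() agrees, goes
to whatever select_idle_sibling() picks near the waking cpu.  balance_path is
never entered on the wake path, which is what kills the collapse here.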



