Re: [HMP tunables v2][PATCH 2/7] sched: SD_SHARE_POWERLINE buddy selection fix

2012-11-19 Thread Vincent Guittot
Hi,

I would prefer that you use the branch in the git tree below instead;
it contains the final version of the fix:
http://git.linaro.org/gitweb?p=people/vingu/kernel.git;a=shortlog;h=refs/heads/sched-pack-small-task-v1-fixed

Regards
Vincent

On 16 November 2012 19:32, Liviu Dudau  wrote:
> From: Morten Rasmussen 
>
> Fixes update_packing_domain() to behave better for topologies where
> SD_SHARE_POWERLINE is disabled at highest sched domain level.
>
> Signed-off-by: Morten Rasmussen 
> ---
>  kernel/sched/fair.c |   15 ++-
>  1 file changed, 14 insertions(+), 1 deletion(-)
>
> diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
> index 0ee9834..d758086 100644
> --- a/kernel/sched/fair.c
> +++ b/kernel/sched/fair.c
> @@ -184,7 +184,8 @@ void update_packing_domain(int cpu)
> if (!sd)
> sd = rcu_dereference_check_sched_domain(cpu_rq(cpu)->sd);
> else
> -   sd = sd->parent;
> +   if (cpumask_first(sched_domain_span(sd)) == cpu || !sd->parent)
> +   sd = sd->parent;
>
> while (sd) {
> struct sched_group *sg = sd->groups;
> @@ -195,6 +196,18 @@ void update_packing_domain(int cpu)
> if (id == -1)
> id = cpumask_first(sched_domain_span(sd));
>
> +   /* Find sched group of candidate */
> +   tmp = sd->groups;
> +   do {
> +   if (cpumask_test_cpu(id, sched_group_cpus(tmp))) {
> +   sg = tmp;
> +   break;
> +   }
> +   } while (tmp = tmp->next, tmp != sd->groups);
> +
> +   pack = sg;
> +   tmp = sg->next;
> +
> /* loop the sched groups to find the best one */
> while (tmp != sg) {
> if (tmp->sgp->power * sg->group_weight <
> --
> 1.7.9.5
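
The group lookup added by the quoted patch walks the circular list of
sched groups exactly once, using a comma expression in the loop condition
to advance and test in a single step. Below is a minimal user-space sketch
of that idiom. It assumes a hypothetical struct group with a plain bitmask
standing in for struct sched_group and its cpumask; the names are
illustrative only and are not kernel API.

```c
#include <stddef.h>

/* Hypothetical stand-in for struct sched_group: one node of a ring. */
struct group {
	unsigned long cpus;	/* bitmask of the CPUs in this group */
	struct group *next;	/* next group; the last points back to the first */
};

/*
 * Return the group that contains 'cpu', starting the walk at 'head',
 * or NULL if no group in the ring contains it.
 */
static struct group *find_group_of(struct group *head, int cpu)
{
	struct group *tmp = head;

	if (!head)
		return NULL;

	do {
		if (tmp->cpus & (1UL << cpu))
			return tmp;
		/* advance first, then stop once we are back at the start */
	} while (tmp = tmp->next, tmp != head);

	return NULL;
}
```

Unlike the patch, which leaves sg at sd->groups when no group matches,
this sketch returns NULL in that case so that a caller can tell the two
situations apart.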

___
linaro-dev mailing list
linaro-dev@lists.linaro.org
http://lists.linaro.org/mailman/listinfo/linaro-dev


Re: [HMP tunables v2][PATCH 2/7] sched: SD_SHARE_POWERLINE buddy selection fix

2012-11-19 Thread Viresh Kumar
On 19 November 2012 14:33, Vincent Guittot  wrote:
> I would prefer that you use the branch in the git tree below instead
> which is the final correction
> http://git.linaro.org/gitweb?p=people/vingu/kernel.git;a=shortlog;h=refs/heads/sched-pack-small-task-v1-fixed

Hi Vingu,

I have applied 3 patches on top of your branch in my current PULL
request. Can you please check them and let me know which ones I should
keep out of the 5 patches (2 from you, 3 from ARM)?

--
viresh



Re: [HMP tunables v2][PATCH 3/7] ARM: TC2: Re-enable SD_SHARE_POWERLINE

2012-11-19 Thread Vincent Guittot
Hi,

On 16 November 2012 19:32, Liviu Dudau  wrote:
> From: Morten Rasmussen 
>
> Re-enable SD_SHARE_POWERLINE to reflect the power domains of TC2.
> ---
>  arch/arm/kernel/topology.c |2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/arch/arm/kernel/topology.c b/arch/arm/kernel/topology.c
> index 317dac6..4d34e0e 100644
> --- a/arch/arm/kernel/topology.c
> +++ b/arch/arm/kernel/topology.c
> @@ -228,7 +228,7 @@ struct cputopo_arm cpu_topology[NR_CPUS];
>
>  int arch_sd_share_power_line(void)
>  {
> -   return 0*SD_SHARE_POWERLINE;
> +   return 1*SD_SHARE_POWERLINE;

I'm not sure I understand your goal. With this modification, the power
line (or power domain) is shared at all levels, which should disable the
packing mechanism. But in a previous patch you fix the update packing
loop, so I assume that you want to use it. Which of the configurations
below would you like to have?

cpu          : CPU0 | CPU1 | CPU2 | CPU3 | CPU4
buddy conf 1 : CPU2 | CPU0 | CPU2 | CPU2 | CPU2
buddy conf 2 : CPU2 | CPU2 | CPU2 | CPU2 | CPU2
buddy conf 3 :   -1 |   -1 |   -1 |   -1 |   -1

Looking at git://git.linaro.org/arm/big.LITTLE/mp.git
big-LITTLE-MP-master-v12, we can see that you have defined a custom
sched_domain which hasn't been updated with the SD_SHARE_POWERLINE flag,
so the flag is cleared at CPU level. Based on this, I would say that you
want buddy conf 2, but I would expect buddy conf 1 to give better
results. Have you tried both?

Regards,
Vincent

>  }
>
>  const struct cpumask *cpu_coregroup_mask(int cpu)
> --
> 1.7.9.5


Re: [HMP tunables v2][PATCH 2/7] sched: SD_SHARE_POWERLINE buddy selection fix

2012-11-19 Thread Vincent Guittot
On 19 November 2012 10:10, Viresh Kumar  wrote:
> On 19 November 2012 14:33, Vincent Guittot  wrote:
>> I would prefer that you use the branch in the git tree below instead
>> which is the final correction
>> http://git.linaro.org/gitweb?p=people/vingu/kernel.git;a=shortlog;h=refs/heads/sched-pack-small-task-v1-fixed
>
> Hi Vingu,
>
> I have applied 3 patches on top of your branch in my current PULL
> request. Can you please check them and let me know which ones should
> i keep out of 5 patches (2 from you, 3 from ARM) ?

sched: pack small tasks: fix update packing domain
 (in sched-pack-small-task-v1-fixed)
 - fixes the buddy selection loop; it will be part of the next version
sched: pack small tasks: fix printk formating
 - fixes a display issue that was reported by Tixy but is not related
   to the current discussion; it will be part of the next version
ARM: TC2: Re-enable SD_SHARE_POWERLINE
 - depends on which configuration we want for TC2
sched: SD_SHARE_POWERLINE buddy selection fix
 - remove it and use the fix above
Revert "sched: secure access to other CPU statistics"
 - this reverts a patch that doesn't fulfill its goal but doesn't
   introduce any regression either, so reverting it doesn't change
   anything in the behaviour. You should keep the original patch and
   drop the revert as long as there is no regression

Regards,
Vincent
>
> --
> viresh



Re: [HMP tunables v2][PATCH 3/7] ARM: TC2: Re-enable SD_SHARE_POWERLINE

2012-11-19 Thread Morten Rasmussen

Hi Vincent,

On 19/11/12 09:20, Vincent Guittot wrote:

> Hi,
>
> On 16 November 2012 19:32, Liviu Dudau  wrote:
>> From: Morten Rasmussen 
>>
>> Re-enable SD_SHARE_POWERLINE to reflect the power domains of TC2.
>> ---
>>   arch/arm/kernel/topology.c |2 +-
>>   1 file changed, 1 insertion(+), 1 deletion(-)
>>
>> diff --git a/arch/arm/kernel/topology.c b/arch/arm/kernel/topology.c
>> index 317dac6..4d34e0e 100644
>> --- a/arch/arm/kernel/topology.c
>> +++ b/arch/arm/kernel/topology.c
>> @@ -228,7 +228,7 @@ struct cputopo_arm cpu_topology[NR_CPUS];
>>
>>   int arch_sd_share_power_line(void)
>>   {
>> -   return 0*SD_SHARE_POWERLINE;
>> +   return 1*SD_SHARE_POWERLINE;
>
> I'm not sure to catch your goal. With this modification, the power
> line (or power domain) is shared at all level which should disable the
> packing mechanism. But in a previous patch you fix the update packing
> loop so I assume that you want to use it. Which kind of configuration
> you would like to have among the proposal below ?
>
> cpu   : CPU0 | CPU1 | CPU2 | CPU3 | CPU4
> buddy conf 1 : CPU2 | CPU0 | CPU2 | CPU2 | CPU2
> buddy conf 2 : CPU2 | CPU2 | CPU2 | CPU2 | CPU2
> buddy conf 3 :   -1 |   -1 |   -1 |   -1 |   -1
>
> When we look at the  git://git.linaro.org/arm/big.LITTLE/mp.git
> big-LITTLE-MP-master-v12, we can see that you have defined a custom
> sched_domain which hasn't been updated with SD_SHARE_POWERLINE flag so
> the flag is cleared at CPU level. Based on this, I would say that you
> want buddy conf 2 ? but I would say that buddy conf 1 should give
> better result. Have you tried both ?



My goal with this fix is to set up the SD_SHARE_POWERLINE flags as they
really are on TC2. It could have been done more elegantly. Since the HMP
patches override the sched_domain flags at CPU level,
SD_SHARE_POWERLINE is not being set by arch_sd_share_power_line(). With
this fix we will get SD_SHARE_POWERLINE at MC level and no
SD_SHARE_POWERLINE at CPU level, which I believe is the correct setup
for TC2.


For the buddy configuration the goal is to get configuration 1 in your
list above. You should get that when using the other patch to fix the
buddy selection algorithm.
I'm not sure whether conf 1 or 2 is best. I think it depends on the
power/performance trade-off of the specific platform. conf 1 may lead to
CPU1->CPU0->CPU2 migrations, which may be undesirable. If your cpus are
very leaky it might make sense to not do packing at all inside a high
performance cluster and always pack directly onto another low
power cluster, as in conf 2. I think this needs further investigation.


I have only tested with conf 1 on TC2.

Regards,
Morten




Re: [HMP tunables v2][PATCH 3/7] ARM: TC2: Re-enable SD_SHARE_POWERLINE

2012-11-19 Thread Vincent Guittot
On 19 November 2012 13:08, Morten Rasmussen  wrote:
> Hi Vincent,
>
>
> On 19/11/12 09:20, Vincent Guittot wrote:
>>
>> Hi,
>>
>> On 16 November 2012 19:32, Liviu Dudau  wrote:
>>>
>>> From: Morten Rasmussen 
>>>
>>> Re-enable SD_SHARE_POWERLINE to reflect the power domains of TC2.
>>> ---
>>>   arch/arm/kernel/topology.c |2 +-
>>>   1 file changed, 1 insertion(+), 1 deletion(-)
>>>
>>> diff --git a/arch/arm/kernel/topology.c b/arch/arm/kernel/topology.c
>>> index 317dac6..4d34e0e 100644
>>> --- a/arch/arm/kernel/topology.c
>>> +++ b/arch/arm/kernel/topology.c
>>> @@ -228,7 +228,7 @@ struct cputopo_arm cpu_topology[NR_CPUS];
>>>
>>>   int arch_sd_share_power_line(void)
>>>   {
>>> -   return 0*SD_SHARE_POWERLINE;
>>> +   return 1*SD_SHARE_POWERLINE;
>>
>>
>> I'm not sure to catch your goal. With this modification, the power
>> line (or power domain) is shared at all level which should disable the
>> packing mechanism. But in a previous patch you fix the update packing
>> loop so I assume that you want to use it. Which kind of configuration
>> you would like to have among the proposal below ?
>>
>> cpu   : CPU0 | CPU1 | CPU2 | CPU3 | CPU4
>> buddy conf 1 : CPU2 | CPU0 | CPU2 | CPU2 | CPU2
>> buddy conf 2 : CPU2 | CPU2 | CPU2 | CPU2 | CPU2
>> buddy conf 3 :   -1 |   -1 |   -1 |   -1 |   -1
>>
>> When we look at the  git://git.linaro.org/arm/big.LITTLE/mp.git
>> big-LITTLE-MP-master-v12, we can see that you have defined a custom
>> sched_domain which hasn't been updated with SD_SHARE_POWERLINE flag so
>> the flag is cleared at CPU level. Based on this, I would say that you
>> want buddy conf 2 ? but I would say that buddy conf 1 should give
>> better result. Have you tried both ?
>>
>
> My goal with this fix is to set up the SD_SHARE_POWERLINE flags as they
> really are on TC2. It could have been done more elegantly. Since the HMP
> patches overrides the sched_domain flags at CPU level the SD_SHARE_POWERLINE
> is not being set by arch_sd_share_power_line(). With this fix we will get
> SD_SHARE_POWERLINE at MC level and no SD_SHARE_POWERLINE at CPU level, which
> I believe is the correct set up for TC2.
>
> For the buddy configuration the goal is to get configuration 1 in your list
> above. You should get that when using the other patch to fix the buddy
> selection algorithm.
> I'm not sure if conf 1 or 2 is best. I think it depends on the
> power/performance trade-off of the specific platform. conf 1 may lead to
> CPU1->CPU0->CPU2 migrations which may be undesirable. If your cpus are very
> leaky it might make sense to not do packing at all inside a high performance
> cluster and always do packing directly on a another low power cluster like
> conf 2. I think this needs further investigation.
>
> I have only tested with conf 1 on TC2.

Hi Morten,

Conf1 is the default configuration for ARM platforms because
SD_SHARE_POWERLINE is cleared at all levels for this architecture.

Conf2 should be used if you can't powergate the cores independently, but
several tests have demonstrated that even if you can't powergate each
core independently, it is worth packing small tasks on a few CPUs in a
core, so it's worth using conf1 on TC2 as well.

Based on your explanation, we should use the original configuration of
SD_SHARE_POWERLINE (cleared at all levels for ARM platforms).

Regards
Vincent
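
The three buddy configurations discussed in this thread can be reproduced
with a toy model. The sketch below is not the kernel's
update_packing_domain() algorithm; it only encodes the mapping described
above (flag cleared at all levels gives conf 1, flag set at MC level only
gives conf 2, flag set at all levels gives conf 3). The cluster layout
(CPU0-1 in one cluster, CPU2-4 in the other), the choice of CPU2 as
packing target, and all names are assumptions taken from the table, not
kernel API.

```c
#define TOY_NR_CPUS 5

/* Where SD_SHARE_POWERLINE is assumed to be set in this toy model. */
enum share_level {
	SHARE_NONE,		/* cleared at all levels          -> conf 1 */
	SHARE_MC_ONLY,		/* set at MC, cleared at CPU level -> conf 2 */
	SHARE_ALL_LEVELS,	/* set at all levels               -> conf 3 */
};

/* Assumed layout from the table: cluster 0 = CPU0-1, cluster 1 = CPU2-4. */
static const int toy_cluster_of[TOY_NR_CPUS] = { 0, 0, 1, 1, 1 };
static const int toy_first_cpu[2] = { 0, 2 };

/* Assumption: packing always ends on the first CPU of cluster 1. */
#define TOY_PACK_TARGET 2

/* Return the buddy of 'cpu' (0..4), or -1 when packing is disabled. */
static int toy_buddy(int cpu, enum share_level level)
{
	int leader = toy_first_cpu[toy_cluster_of[cpu]];

	switch (level) {
	case SHARE_ALL_LEVELS:
		return -1;			/* no packing at all */
	case SHARE_MC_ONLY:
		return TOY_PACK_TARGET;		/* skip packing inside a cluster */
	case SHARE_NONE:
	default:
		/* pack inside the cluster first; leaders pack on the target */
		return cpu == leader ? TOY_PACK_TARGET : leader;
	}
}
```

With SHARE_NONE, CPU1 gets CPU0 as its buddy and CPU0 gets CPU2, which is
the CPU1->CPU0->CPU2 chain mentioned earlier in the thread.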




Re: [HMP tunables v2][PATCH 3/7] ARM: TC2: Re-enable SD_SHARE_POWERLINE

2012-11-19 Thread Morten Rasmussen

On 19/11/12 12:23, Vincent Guittot wrote:

> Hi Morten,
>
> Conf1 is the default configuration for ARM platform because
> SD_SHARE_POWERLINE is cleared at all levels for this architecture.
>
> Conf2 should be used if you can't powergate the core independently but
> several tests have demonstrated that even if you can't powergate each
> core independently, it worth packing small task on few CPUs in a core
> so it's worth using conf1 on TC2 as well.
>
> Based on your explanation, we should use the original configuration of
> SD_SHARE_POWERLINE (cleared at all level for ARM platform)


I agree that the result is the same, but I don't like disabling
SD_SHARE_POWERLINE for all levels when the cpus in each cluster actually
are in the same power domain, as is the case on TC2. The name
SHARE_POWERLINE implies a clear relation to the actual hardware design,
so setting the flags differently from the actual hardware design is
misleading in my opinion. If the buddy selection algorithm doesn't
select appropriate buddies when the flags are set to reflect the actual
hardware design, I would suggest changing the buddy selection algorithm
instead of changing the sched_domain flags.


If it is decided not to have a direct relation between the flags and the
hardware design, I think that the flag should be renamed so it doesn't
give the wrong impression.


Morten





Re: [HMP tunables v2][PATCH 3/7] ARM: TC2: Re-enable SD_SHARE_POWERLINE

2012-11-19 Thread Vincent Guittot
On 19 November 2012 14:36, Morten Rasmussen  wrote:
> On 19/11/12 12:23, Vincent Guittot wrote:
>>
>> On 19 November 2012 13:08, Morten Rasmussen 
>> wrote:
>>>
>>> Hi Vincent,
>>>
>>>
>>> On 19/11/12 09:20, Vincent Guittot wrote:


>>>> Hi,
>>>>
>>>> On 16 November 2012 19:32, Liviu Dudau  wrote:
>>>>>
>>>>>
>>>>> From: Morten Rasmussen 
>>>>>
>>>>> Re-enable SD_SHARE_POWERLINE to reflect the power domains of TC2.
>>>>> ---
>>>>>   arch/arm/kernel/topology.c |2 +-
>>>>>   1 file changed, 1 insertion(+), 1 deletion(-)
>>>>>
>>>>> diff --git a/arch/arm/kernel/topology.c b/arch/arm/kernel/topology.c
>>>>> index 317dac6..4d34e0e 100644
>>>>> --- a/arch/arm/kernel/topology.c
>>>>> +++ b/arch/arm/kernel/topology.c
>>>>> @@ -228,7 +228,7 @@ struct cputopo_arm cpu_topology[NR_CPUS];
>>>>>
>>>>>   int arch_sd_share_power_line(void)
>>>>>   {
>>>>> -   return 0*SD_SHARE_POWERLINE;
>>>>> +   return 1*SD_SHARE_POWERLINE;
>>>>
>>>>
>>>>
>>>> I'm not sure to catch your goal. With this modification, the power
>>>> line (or power domain) is shared at all level which should disable the
>>>> packing mechanism. But in a previous patch you fix the update packing
>>>> loop so I assume that you want to use it. Which kind of configuration
>>>> you would like to have among the proposal below ?
>>>>
>>>> cpu   : CPU0 | CPU1 | CPU2 | CPU3 | CPU4
>>>> buddy conf 1 : CPU2 | CPU0 | CPU2 | CPU2 | CPU2
>>>> buddy conf 2 : CPU2 | CPU2 | CPU2 | CPU2 | CPU2
>>>> buddy conf 3 :   -1 |   -1 |   -1 |   -1 |   -1
>>>>
>>>> When we look at the  git://git.linaro.org/arm/big.LITTLE/mp.git
>>>> big-LITTLE-MP-master-v12, we can see that you have defined a custom
>>>> sched_domain which hasn't been updated with SD_SHARE_POWERLINE flag so
>>>> the flag is cleared at CPU level. Based on this, I would say that you
>>>> want buddy conf 2 ? but I would say that buddy conf 1 should give
>>>> better result. Have you tried both ?

>>>
>>> My goal with this fix is to set up the SD_SHARE_POWERLINE flags as they
>>> really are on TC2. It could have been done more elegantly. Since the HMP
>>> patches overrides the sched_domain flags at CPU level the
>>> SD_SHARE_POWERLINE
>>> is not being set by arch_sd_share_power_line(). With this fix we will get
>>> SD_SHARE_POWERLINE at MC level and no SD_SHARE_POWERLINE at CPU level,
>>> which
>>> I believe is the correct set up for TC2.
>>>
>>> For the buddy configuration the goal is to get configuration 1 in your
>>> list
>>> above. You should get that when using the other patch to fix the buddy
>>> selection algorithm.
>>> I'm not sure if conf 1 or 2 is best. I think it depends on the
>>> power/performance trade-off of the specific platform. conf 1 may lead to
>>> CPU1->CPU0->CPU2 migrations which may be undesirable. If your cpus are
>>> very
>>> leaky it might make sense to not do packing at all inside a high
>>> performance
>>> cluster and always do packing directly on a another low power cluster
>>> like
>>> conf 2. I think this needs further investigation.
>>>
>>> I have only tested with conf 1 on TC2.
>>
>>
>> Hi Morten,
>>
>> Conf1 is the default configuration for ARM platform because
>> SD_SHARE_POWERLINE is cleared at all levels for this architecture.
>>
>> Conf2 should be used if you can't powergate the core independently but
>> several tests have demonstrated that even if you can't powergate each
>> core independently, it worth packing small task on few CPUs in a core
>> so it's worth using conf1 on TC2 as well.
>>
>> Based on your explanation, we should use the original configuration of
>> SD_SHARE_POWERLINE (cleared at all level for ARM platform)
>
>
> I agree that the result is the same, but I don't like disabling
> SD_SHARE_POWERLINE for all level when the cpus in each cluster actually are
> in the same power domain as it is the case on TC2. The name SHARE_POWERLINE
> implies a clear relation to the actual hardware design, thus setting the
> flags differently than the actual hardware design is misleading in my
> opinion. If the buddy selection algorithm doesn't select appropriate buddies
> when flags are set to reflect the actual hardware design I would suggest
> changing the buddy selection algorithm instead of changing the sched_domain
> flags.
>
> If it is chosen to not have a direct relation between the flags and the
> hardware design, I think that the flag should be renamed so it doesn't give
> the wrong impression.

There is a direct link between powergating and SHARE_POWERLINE,
and if you want the buddy selection to strictly reflect your HW
configuration, you must use conf2 and not conf1.
Now, besides the packing small task patch and the TC2 configuration, it
has been proven that packing small tasks on an ARM platform (dual
Cortex-A9) which can only powergate the cluster improves the power
consumption of some low cpu load use cases like MP3 playback (we
used cpu hotplug at that time). This assumption has been proven
only for ARM platforms and that's why SHARE_POWERLINE is cleared at
all levels for ARM platforms

Re: [HMP tunables v2][PATCH 3/7] ARM: TC2: Re-enable SD_SHARE_POWERLINE

2012-11-19 Thread Morten Rasmussen

On 19/11/12 14:09, Vincent Guittot wrote:


> There is a direct link between the powergating and the SHARE_POWERLINE
> and if you want that the buddy selection strictly reflects your HW
> configuration, you must use conf2 and not conf1.


I just want the buddy selection to be reasonable when the
SHARE_POWERLINE flags reflect the true hardware configuration. I
haven't tested whether conf 1 or 2 is best yet. As long as I am getting
one of them it is definitely an improvement over not having task packing
at all :)



> Now, beside the packing small task patch and the TC2 configuration, it
> has been proven that packing small tasks on an ARM platform (dual
> cortex-A9) which can only powergate the cluster, improves the power
> consumption of some low cpu load use cases like the MP3 playback (we
> had used cpu hotplug at that time). This assumption has been proven
> only for ARM platform and that's why the SHARE_POWERLINE is cleared at
> all level for ARM platform 

Re: [PATCH] genirq: Add default affinity mask command line option

2012-11-19 Thread Punit Agrawal

Hi Francesco,

Thanks for your comments on the patch.

On 18/11/12 10:41, Francesco Lavra wrote:

Hi,

On 11/12/2012 05:57 PM, Punit Agrawal wrote:

I am attaching a patch by Thomas Gleixner which adds a kernel
command line parameter to set the default IRQ affinity mask. Could
you please integrate this in your tree for the next Linaro release?

I've been using this patch for some time now and it doesn't introduce
any regressions. There is a possibility that this patch will make it
upstream via the RT patches in the near future but in the meanwhile,
we'd like to carry this patch as well.

Cheers,
Punit

 From 52a7d44f58a262e166575abc57aa0bd3bfc8cfbb Mon Sep 17 00:00:00 2001
From: Thomas Gleixner 
Date: Fri, 25 May 2012 16:59:47 +0200
Subject: [PATCH] genirq: Add default affinity mask command line option

If we isolate CPUs, then we don't want random device interrupts on
them. Even w/o the user space irq balancer enabled we can end up with
irqs on non boot cpus.

Allow to restrict the default irq affinity mask.

Signed-off-by: Thomas Gleixner 
---
  Documentation/kernel-parameters.txt |9 +
  kernel/irq/irqdesc.c|   21 +++--
  2 files changed, 28 insertions(+), 2 deletions(-)

diff --git a/Documentation/kernel-parameters.txt b/Documentation/kernel-parameters.txt
index 7d82468..00fedab 100644
--- a/Documentation/kernel-parameters.txt
+++ b/Documentation/kernel-parameters.txt
@@ -1164,6 +1164,15 @@ bytes respectively. Such letter suffixes can also be entirely omitted.
 See comment before ip2_setup() in
 drivers/char/ip2/ip2base.c.

+irqaffinity= [SMP] Set the default irq affinity mask
+Format:
+<cpu number>,...,<cpu number>
+or
+<cpu number>-<cpu number>
+(must be a positive range in ascending order)
+or a mixture
+<cpu number>,...,<cpu number>-<cpu number>
+
 irqfixup[HW]
 When an interrupt is not handled search all handlers
 for it. Intended to get systems with badly broken
diff --git a/kernel/irq/irqdesc.c b/kernel/irq/irqdesc.c
index 192a302..473b2b6 100644
--- a/kernel/irq/irqdesc.c
+++ b/kernel/irq/irqdesc.c
@@ -23,10 +23,27 @@
  static struct lock_class_key irq_desc_lock_class;

  #if defined(CONFIG_SMP)
+static int __init irq_affinity_setup(char *str)
+{
+zalloc_cpumask_var(&irq_default_affinity, GFP_NOWAIT);
+cpulist_parse(str, irq_default_affinity);

Since cpulist_parse() sets all the bits of its cpumask argument, it's
not necessary to initialize them to zero during allocation, so I would
use alloc_cpumask_var() instead of zalloc_cpumask_var().


+/*
+ * Set at least the boot cpu. We don't want to end up with
+ * bugreports caused by random comandline masks
+ */
+cpumask_set_cpu(smp_processor_id(), irq_default_affinity);
+return 1;
+}
+__setup("irqaffinity=", irq_affinity_setup);
+
  static void __init init_irq_default_affinity(void)
  {
-alloc_cpumask_var(&irq_default_affinity, GFP_NOWAIT);
-cpumask_setall(irq_default_affinity);
+#ifdef CONFIG_CPUMASK_OFFSTACK
+if (!irq_default_affinity)
+zalloc_cpumask_var(&irq_default_affinity, GFP_NOWAIT);
+#endif

The #ifdefery is not necessary here, because if CONFIG_CPUMASK_OFFSTACK
is not defined irq_default_affinity cannot be NULL.
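
To make the irqaffinity= format in the patch above concrete, here is a
minimal user-space sketch of parsing such a CPU list into a bitmask and
then forcing the boot CPU into it, as the patch does with
cpumask_set_cpu(). It is not the kernel's cpulist_parse(); the helper
names, the unsigned long mask (which assumes max_cpus is no larger than
the number of bits in an unsigned long), and the fallback to an empty mask
on a malformed list are all assumptions made for illustration.

```c
#include <stdlib.h>

/*
 * Parse a list such as "0-3,5" into *mask.
 * Returns 0 on success, -1 on a malformed list, a descending range,
 * or a CPU number >= max_cpus.
 */
static int parse_cpu_list(const char *s, int max_cpus, unsigned long *mask)
{
	unsigned long m = 0;

	if (!*s)
		return -1;

	while (*s) {
		char *end;
		long first, last;

		first = strtol(s, &end, 10);
		if (end == s || first < 0)
			return -1;
		last = first;
		s = end;

		if (*s == '-') {		/* range: <cpu>-<cpu> */
			s++;
			last = strtol(s, &end, 10);
			if (end == s || last < first)
				return -1;	/* must be ascending */
			s = end;
		}
		if (last >= max_cpus)
			return -1;

		for (long cpu = first; cpu <= last; cpu++)
			m |= 1UL << cpu;

		if (*s == ',') {
			s++;
			if (!*s)
				return -1;	/* trailing comma */
		} else if (*s) {
			return -1;		/* unexpected character */
		}
	}

	*mask = m;
	return 0;
}

/* Build the default mask and always include the boot CPU. */
static unsigned long default_irq_mask(const char *arg, int max_cpus,
				      int boot_cpu)
{
	unsigned long mask = 0;

	if (parse_cpu_list(arg, max_cpus, &mask) != 0)
		mask = 0;	/* assumption: ignore a bad list entirely */

	return mask | (1UL << boot_cpu);
}
```

For example, with boot CPU 0, an argument of "2,3" yields a mask with
CPUs 0, 2 and 3 set, so a mistyped command line can never leave the
default affinity empty.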


The patch was picked up as-is from the RT patches as it implements a
functionality that we wanted to better control IRQ affinity. Being part
of the RT patches, I hope that it'll merge into mainline via that route
and I am not going to try to mainline it. So your comments will be best
addressed to the original patch postings on the lkml (Search for RT
patches).

If you think it is really important to address your comments for the
patch that goes into Linaro kernel, I could address them and send an
updated patch. Though in that case, I am not quite sure how to attribute
the original author who wrote the patch.


Thanks,
Punit

--
Francesco








Re: [GIT PULL]; big LITTLE MP master v12

2012-11-19 Thread Jon Medhurst (Tixy)
On Sun, 2012-11-18 at 10:40 +0530, Viresh Kumar wrote:
> Hi Andrey,
> 
> Please pull big-LITTLE-MP-master-v12 with following updates:
> 
> - Based on v3.7-rc5
> - Stats:
>  - Total Patches: 62
>  - New Patches: 1
>- genirq: Add default affinity mask command line option in
> misc-patches branch
>- top 3 patches in: sched-pack-small-tasks-v1
>- top 2 patches in: task-placement-v2
>- additional patch in: config-fragments
>  - Dropped patches/branches (as they are managed in experimental
> merge branch): 20
>- patches in per-entity-load-tracking-with-core-sched-v1: 15
>  - Updated Patches: 0
> 

This version increases Android boot time by nearly a factor of 3, from 91
seconds to 257 seconds, compared with the version of the
master-v12 branch created on Nov 15th. Looking at the differences in the
code, one obvious thing that stands out is that big-LITTLE-MP.conf now has:

CONFIG_HMP_VARIABLE_SCALE=y
CONFIG_HMP_FREQUENCY_INVARIANT_SCALE=y

If I remove these then boot time goes back to 90 seconds.

Also, if I build without big-LITTLE-MP.conf then I get a build error:

kernel/sched/fair.c: In function 'update_entity_load_avg':
kernel/sched/fair.c:1469:26: error: 'struct sched_entity' has no member
named 'cfs_rq'

-- 
Tixy

> -x--x---
> 
> The following changes since commit 77b67063bb6bce6d475e910d3b886a606d0d91f7:
> 
>   Linux 3.7-rc5 (2012-11-11 13:44:33 +0100)
> 
> are available in the git repository at:
> 
>   git://git.linaro.org/arm/big.LITTLE/mp.git big-LITTLE-MP-master-v12
> 
> for you to fetch changes up to f942092bd1008de7379b4a52d38dc03de5949fc8:
> 
>   Merge branches 'arm-multi_pmu_v2', 'hw-bkp-v7.1-debug-v1',
> 'task-placement-v2', 'misc-patches', 'config-fragments' and
> 'sched-pack-small-tasks-v1' into big-LITTLE-MP-master-v12-v2
> (2012-11-17 09:29:41 +0530)
> 
> 
> 
> Ben Segall (1):
>   sched: Maintain per-rq runnable averages
> 
> Chris Redpath (1):
>   ARM: Experimental Frequency-Invariant Load Scaling Patch
> 
> Dietmar Eggemann (1):
>   ARM: hw_breakpoint: v7.1 self-hosted debug powerdown support
> 
> Jon Medhurst (1):
>   ARM: sched: Avoid empty 'slow' HMP domain
> 
> Liviu Dudau (2):
>   Revert "sched: secure access to other CPU statistics"
>   linaro/configs: big-LITTLE-MP: Enable the new tunable sysfs
> interface by default.
> 
> Lorenzo Pieralisi (1):
>   ARM: kernel: provide cluster to logical cpu mask mapping API
> 
> Marc Zyngier (1):
>   ARM: perf: add guest vs host discrimination
> 
> Mark Rutland (1):
>   ARM: perf: register cpu_notifier at driver init
> 
> Morten Rasmussen (15):
>   sched: entity load-tracking load_avg_ratio
>   sched: Task placement for heterogeneous systems based on task
> load-tracking
>   sched: Forced task migration on heterogeneous systems
>   sched: Introduce priority-based task migration filter
>   ARM: Add HMP scheduling support for ARM architecture
>   ARM: sched: Use device-tree to provide fast/slow CPU list for HMP
>   ARM: sched: Setup SCHED_HMP domains
>   sched: Add ftrace events for entity load-tracking
>   sched: Add HMP task migration ftrace event
>   sched: SCHED_HMP multi-domain task migration control
>   sched: Enable HMP priority filter by default
>   sched: Only down migrate low priority tasks if allowed by affinity mask
>   linaro/configs: Enable HMP priority filter by default
>   sched: SD_SHARE_POWERLINE buddy selection fix
>   ARM: TC2: Re-enable SD_SHARE_POWERLINE
> 
> Olivier Cozette (1):
>   ARM: Change load tracking scale using sysfs
> 
> Paul Turner (15):
>   sched: Track the runnable average on a per-task entity basis
>   sched: Aggregate load contributed by task entities on parenting cfs_rq
>   sched: Maintain the load contribution of blocked entities
>   sched: Add an rq migration call-back to sched_class
>   sched: Account for blocked load waking back up
>   sched: Aggregate total task_group load
>   sched: Compute load contribution by a group entity
>   sched: Normalize tg load contributions against runnable time
>   sched: Maintain runnable averages across throttled periods
>   sched: Replace update_shares weight distribution with per-entity
> computation
>   sched: Refactor update_shares_cpu() -> update_blocked_avgs()
>   sched: Update_cfs_shares at period edge
>   sched: Make __update_entity_runnable_avg() fast
>   sched: Introduce temporary FAIR_GROUP_SCHED dependency for load-tracking
>   sched: implement usage tracking
> 
> Peter Zijlstra (1):
>   sched: Describe CFS load-balancer
> 
> Sudeep KarkadaNagesha (9):
>   ARM: perf: allocate CPU PMU dynamically at probe time
>   ARM: perf: consistently use struct perf_event in arm_pmu functions
>   ARM: perf: check ARM

RE: [GIT PULL]; big LITTLE MP master v12

2012-11-19 Thread Chris Redpath
Hi Tixy,

These patches are mine and Olivier's. Can you tell me what config you are
using to see even a 90s boot? I haven't noticed any boot-time extension on my
system, but I'm booting from the A15 cluster. I'd like to reproduce your
system here.

The build error is my mistake. I need to know which CPU a task is on and it
looks like I missed a dependency when I changed the calling code. I
will sort out a patch for that asap.

Best Regards,
Chris


Re: [GIT PULL]; big LITTLE MP master v12

2012-11-19 Thread Jon Medhurst (Tixy)
On Mon, 2012-11-19 at 15:57 +, Chris Redpath wrote:
> These patches are mine and Olivier's, can you tell me what your config is to 
> see even a 90s boot? I haven't noticed any boot-time extension on my system, 
> but I'm booting from the A15 cluster. I'd like to reproduce your system here.

The config Linaro uses for Android on vexpress is generated by the
command:

ARCH=arm scripts/kconfig/merge_config.sh \
   linaro/configs/linaro-base.conf \
   linaro/configs/android.conf \
   linaro/configs/big-LITTLE-MP.conf \
   linaro/configs/vexpress.conf

I've pushed the kernel tree I had to my personal git area...
http://git.linaro.org/gitweb?p=people/tixy/kernel.git;a=shortlog;h=refs/heads/integration-android-vexpress

For the Android userside I'm using the latest daily build:
https://android-build.linaro.org/builds/~linaro-android/vexpress-jb-gcc47-armlt-tracking-open/#build=104

An Android image always takes a lot longer to boot the first time, so
before doing any timing, boot a fresh image once and wait for it to settle
down (the power/freq status LEDs on the TC2 coretile are a good indicator
of when the system finally goes mostly idle).

> The build error is my mistake, I need to know which CPU a task is on and it 
> looks like I have missed a dependency when I've changed the calling code. I 
> will sort out a patch for that asap.

Even the 90 second boot seemed very long from what I remember; that's
why I was also trying to build without any big.LITTLE MP configured.
I was going to try different configs to see if I can narrow the slowness
down.

-- 
Tixy





Re: [PATCH Resend V2] dt: add helper function to read u8 & u16 variables & arrays

2012-11-19 Thread Stephen Warren
On 11/18/2012 11:41 PM, Viresh Kumar wrote:
> On 19 November 2012 12:05, Rajanikanth HV  wrote:
>> On 19 November 2012 12:00, Viresh Kumar  wrote:
>>> Firstly you tried square braces [ ], I am not sure if that is allowed.
>>> Can you point me to the specification?
>> http://www.devicetree.org/Device_Tree_Usage
>> "
>> a-byte-data-property = [0x01 0x23 0x34 0x56];
>> "
> 
> Ok, but what about 16 bit then {} :)

Support for byte- and word- properties is relatively recent I believe
(or at least, the /bits/ syntax is). Which dtc version are you using?
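For reference, a minimal sketch of what reading 16-bit cells from a property involves. It assumes a property written as `/bits/ 16 <0x1234 0x5678>`, which dtc stores as the big-endian byte sequence 12 34 56 78. The function name `read_u16_cells` is hypothetical and is not the helper proposed in the patch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper: decode n big-endian 16-bit cells from the raw
 * bytes of a property. Returns 0 on success, -1 if the property is
 * too short to hold n cells.
 */
static int read_u16_cells(const uint8_t *prop, size_t len,
			  uint16_t *out, size_t n)
{
	size_t i;

	if (len < n * 2)
		return -1;

	for (i = 0; i < n; i++)
		out[i] = (uint16_t)((prop[2 * i] << 8) | prop[2 * i + 1]);

	return 0;
}

/* Call from main() to exercise the helper. */
static void self_check(void)
{
	/* Bytes as stored for: /bits/ 16 <0x1234 0x5678> */
	const uint8_t raw[] = { 0x12, 0x34, 0x56, 0x78 };
	uint16_t cells[2];

	assert(read_u16_cells(raw, sizeof(raw), cells, 2) == 0);
	assert(cells[0] == 0x1234 && cells[1] == 0x5678);

	/* A three-byte property cannot hold two 16-bit cells. */
	assert(read_u16_cells(raw, 3, cells, 2) == -1);
}
```

A byte-string property needs no such conversion, since each element is already a single byte.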



[RFC 1/3] sched: fix nr_busy_cpus with coupled cpuidle

2012-11-19 Thread Vincent Guittot
With the coupled cpuidle driver (but probably also with other drivers),
a CPU loops in a temporary safe state while waiting for the other CPUs of its
cluster to be ready to enter the coupled C-state. If an IRQ or a softirq
occurs, the CPU stays in this internal loop when there is no need
to resched. The SCHED softirq clears the NOHZ_IDLE flag and increases
nr_busy_cpus. If there is no need to resched, set_cpu_sd_state_idle is
never called because of this internal loop in a cpuidle state.
We have to call set_cpu_sd_state_idle in tick_nohz_irq_exit, which is used
to handle such situations.

Signed-off-by: Vincent Guittot 
---
 kernel/time/tick-sched.c |2 ++
 1 file changed, 2 insertions(+)

diff --git a/kernel/time/tick-sched.c b/kernel/time/tick-sched.c
index a402608..e19bbc9 100644
--- a/kernel/time/tick-sched.c
+++ b/kernel/time/tick-sched.c
@@ -526,6 +526,8 @@ void tick_nohz_irq_exit(void)
 	if (!ts->inidle)
 		return;
 
+	set_cpu_sd_state_idle();
+
 	__tick_nohz_idle_enter(ts);
 }
 
-- 
1.7.10




[RFC 3/3] sched: fix update NOHZ_IDLE flag

2012-11-19 Thread Vincent Guittot
The function nohz_kick_needed modifies the NOHZ_IDLE flag that is used to update
the nr_busy_cpus of the sched_group.
When the sched_domains are updated (because of the unplug of a CPU, for
example), a null_domain is attached to the CPUs. We have to test
likely(!on_null_domain(cpu)) first in order to detect such an initialization step
and to not modify the NOHZ_IDLE flag.

Signed-off-by: Vincent Guittot 
---
 kernel/sched/fair.c |2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index 3d0686c..1bf7c87 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -5490,7 +5490,7 @@ void trigger_load_balance(struct rq *rq, int cpu)
 	    likely(!on_null_domain(cpu)))
 		raise_softirq(SCHED_SOFTIRQ);
 #ifdef CONFIG_NO_HZ
-	if (nohz_kick_needed(rq, cpu) && likely(!on_null_domain(cpu)))
+	if (likely(!on_null_domain(cpu)) && nohz_kick_needed(rq, cpu))
 		nohz_balancer_kick(cpu);
 #endif
 }
-- 
1.7.10
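
The reordering in this patch relies on the short-circuit evaluation of &&. The sketch below is a simplified model under stated assumptions, not the scheduler code: `kick_needed_model`, `on_null_domain_model` and `busy_after_tick` are hypothetical names, and the only behaviour carried over from the description is that the kick check clears the idle flag and bumps the busy count as a side effect.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins for kernel state; not the real scheduler data. */
static bool nohz_idle_flag;	/* models the per-CPU NOHZ_IDLE bit */
static int nr_busy_cpus;	/* models the group's nr_busy_cpus counter */
static bool null_domain;	/* models the result of on_null_domain(cpu) */

/* Like nohz_kick_needed(): updates state before deciding anything. */
static bool kick_needed_model(void)
{
	if (nohz_idle_flag) {
		nohz_idle_flag = false;
		nr_busy_cpus++;
	}
	return false;
}

static bool on_null_domain_model(void)
{
	return null_domain;
}

/* One tick on an idle CPU that is attached to a null domain. */
static int busy_after_tick(bool use_new_order)
{
	nohz_idle_flag = true;
	nr_busy_cpus = 0;
	null_domain = true;

	if (use_new_order) {
		/* New order: && stops early, the side effect never runs. */
		if (!on_null_domain_model() && kick_needed_model())
			return -1;	/* not reached in this model */
	} else {
		/* Old order: the side effect runs before the domain test. */
		if (kick_needed_model() && !on_null_domain_model())
			return -1;	/* not reached in this model */
	}
	return nr_busy_cpus;
}

/* Call from main() to compare both orderings. */
static void self_check(void)
{
	assert(busy_after_tick(false) == 1);	/* counter disturbed */
	assert(busy_after_tick(true) == 0);	/* counter left alone */
}
```

In this model the old ordering leaves a stale busy count behind on a null domain, while the new ordering leaves the counter untouched.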




[RFC 0/3] sched: fix nr_busy_cpus

2012-11-19 Thread Vincent Guittot
The nr_busy_cpus field of the sched_group_power is sometimes different from 0
even though the platform is fully idle. This series fixes 3 use cases:
 - when the SCHED softirq is raised on an idle core for idle load balance but
   the platform doesn't go out of the cpuidle state
 - when some CPUs enter idle state while booting all CPUs
 - when a CPU is unplugged and/or replugged

Vincent Guittot (3):
  sched: fix nr_busy_cpus with coupled cpuidle
  sched: fix init NOHZ_IDLE flag
  sched: fix update NOHZ_IDLE flag

 kernel/sched/core.c  |1 +
 kernel/sched/fair.c  |2 +-
 kernel/time/tick-sched.c |2 ++
 3 files changed, 4 insertions(+), 1 deletion(-)

-- 
1.7.10




[RFC 2/3] sched: fix init NOHZ_IDLE flag

2012-11-19 Thread Vincent Guittot
On my SMP platform, which is made of 5 cores in 2 clusters, the
nr_busy_cpus field of the sched_group_power struct is not zero when the
platform is fully idle. The root cause seems to be:
During the boot sequence, some CPUs reach the idle loop and set their
NOHZ_IDLE flag while waiting for other CPUs to boot. But the nr_busy_cpus
field is initialized later with the assumption that all CPUs are in the busy
state whereas some CPUs have already set their NOHZ_IDLE flag.
We clear the NOHZ_IDLE flag when nr_busy_cpus is initialized in order to
have a coherent configuration.

Signed-off-by: Vincent Guittot 
---
 kernel/sched/core.c |1 +
 1 file changed, 1 insertion(+)

diff --git a/kernel/sched/core.c b/kernel/sched/core.c
index 5dae0d2..05058e8 100644
--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -5817,6 +5817,7 @@ static void init_sched_groups_power(int cpu, struct sched_domain *sd)
 
 	update_group_power(sd, cpu);
 	atomic_set(&sg->sgp->nr_busy_cpus, sg->group_weight);
+	clear_bit(NOHZ_IDLE, nohz_flags(cpu));
 }
 
 int __weak arch_sd_sibling_asym_packing(void)
-- 
1.7.10
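
The boot-time sequence described in this patch can be reproduced with a small model. This is a sketch under assumptions, not the kernel implementation: a single group of five CPUs is assumed, and `cpu_enter_idle`, `init_group` and `busy_when_fully_idle` are hypothetical names. The idle path only decrements the counter when the flag was not already set, which is the property that makes the stale flag matter.

```c
#include <assert.h>
#include <stdbool.h>

#define NR_CPUS_MODEL 5		/* assumed group size, as in the report */

static bool nohz_idle[NR_CPUS_MODEL];	/* models the NOHZ_IDLE bits */
static int nr_busy;			/* models nr_busy_cpus */

/* Models the idle entry path: account the CPU as idle only once. */
static void cpu_enter_idle(int cpu)
{
	if (nohz_idle[cpu])
		return;		/* flag already set, counter untouched */
	nohz_idle[cpu] = true;
	nr_busy--;
}

/* Models the group initialisation, with or without the proposed fix. */
static void init_group(bool clear_flags)
{
	int cpu;

	nr_busy = NR_CPUS_MODEL;	/* assumes that every CPU is busy */
	if (clear_flags)
		for (cpu = 0; cpu < NR_CPUS_MODEL; cpu++)
			nohz_idle[cpu] = false;
}

/* Returns the busy count once every CPU has gone idle. */
static int busy_when_fully_idle(bool with_fix)
{
	int cpu;

	for (cpu = 0; cpu < NR_CPUS_MODEL; cpu++)
		nohz_idle[cpu] = false;
	nr_busy = 0;

	cpu_enter_idle(1);	/* CPU 1 idles before the group is set up */
	init_group(with_fix);	/* counter is (re)initialised later */

	for (cpu = 0; cpu < NR_CPUS_MODEL; cpu++)
		cpu_enter_idle(cpu);

	return nr_busy;
}

/* Call from main() to compare both behaviours. */
static void self_check(void)
{
	assert(busy_when_fully_idle(false) == 1);	/* one phantom busy CPU */
	assert(busy_when_fully_idle(true) == 0);	/* coherent after the fix */
}
```

Without clearing the flag at initialisation, the early-idle CPU is never subtracted from the counter, so the model reports one busy CPU on a fully idle system.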




Re: [GIT PULL]; big LITTLE MP master v12

2012-11-19 Thread Andrey Konovalov

Viresh,

I won't pull the big-LITTLE-MP-master-v12 into the 
linux-linaro-core-tracking tree today due to the issues found by Tixy.


Tomorrow evening I am going to pull this topic anyway - whether these 
issues are resolved, or not. If the build error is not fixed by Thursday 
morning UTC, I'll move llct back to v11. Would it work for the Landing 
Teams? Tixy?


Thanks,
Andrey


Re: [PATCH Resend V2] dt: add helper function to read u8 & u16 variables & arrays

2012-11-19 Thread Viresh Kumar
On 19 November 2012 21:58, Stephen Warren  wrote:
> Support for byte- and word- properties is relatively recent I believe
> (or at least, the /bits/ syntax is). Which dtc version are you using?

OK, I was on an older version. I just saw this patch now:

commit cd296721a9645f9f28800a072490fa15458d1fb7
Author: Stephen Warren 
Date:   Fri Sep 28 21:25:59 2012 +

dtc: import latest upstream dtc

This updates scripts/dtc to commit 317a5d9 "dtc: zero out new label
objects" from git://git.jdl.com/software/dtc.git.

This adds features such as:
* /bits/ syntax for cell data.
* Math expressions within cell data.
* The ability to delete properties or nodes.
* Support for #line directives in the input file, which allows the use of
  cpp on *.dts.
* -i command-line option (/include/ path)
* -W/-E command-line options for error/warning control.
* Removal of spew to STDOUT containing the filename being compiled.
* Many additions to the libfdt API.

Signed-off-by: Stephen Warren 
Acked-by: Jon Loeliger 
Signed-off-by: Rob Herring 

Will try it tomorrow

--
viresh



Re: [GIT PULL]; big LITTLE MP master v12

2012-11-19 Thread Jon Medhurst (Tixy)
On Mon, 2012-11-19 at 21:14 +0400, Andrey Konovalov wrote:
> Viresh,
> 
> I won't pull the big-LITTLE-MP-master-v12 into the 
> linux-linaro-core-tracking tree today due to the issues found by Tixy.
> 
> Tomorrow evening I am going to pull this topic anyway - whether these 
> issues are resolved, or not. If the build error is not fixed by Thursday 
> morning UTC, I'll move llct back to v11. Would it work for the Landing 
> Teams? Tixy?

The timescales seem a bit wrong for that, working backwards...

- The monthly release is made from linux-linaro's state at end of
Thursday.

- You need to merge Landing Team's topics in before then.

- Landing Teams need to prepare their topics based on a given llct.

- To prepare their topics, Landing Teams need to be able to compile
their kernels.

So I would say that LTs need a final working llct tomorrow really. I
could manage OK getting this on Wednesday, don't know about other teams.

That would then mean the monthly release candidate build comes from a
tree whose contents have never been built together before that day, so
it's trusting to luck somewhat.

Last month llct was created by the Monday morning so LTs could base
their branches on that and have them merged into linux-linaro by the end
of Tuesday. We then had two days to fix problems before Thursday's code
cutoff.

-- 
Tixy





RE: [GIT PULL]; big LITTLE MP master v12

2012-11-19 Thread Chris Redpath
Hi Tixy,

I've got your kernel and the JB filesystem modified for USB booting here. There 
is no boot delay for me - it takes just over 70s from boot monitor to mouse 
pointer.

Catch you in the morning :)

Chris


Re: [GIT PULL]; big LITTLE MP master v12

2012-11-19 Thread Andrey Konovalov

On 11/19/2012 10:17 PM, Jon Medhurst (Tixy) wrote:
> On Mon, 2012-11-19 at 21:14 +0400, Andrey Konovalov wrote:
>> Viresh,
>>
>> I won't pull the big-LITTLE-MP-master-v12 into the
>> linux-linaro-core-tracking tree today due to the issues found by Tixy.
>>
>> Tomorrow evening I am going to pull this topic anyway - whether these
>> issues are resolved, or not. If the build error is not fixed by Thursday
>> morning UTC, I'll move llct back to v11. Would it work for the Landing
>> Teams? Tixy?
>
> The timescales seem a bit wrong for that, working backwards...

That's correct..

> - The monthly release is made from linux-linaro's state at end of
> Thursday.
>
> - You need to merge Landing Team's topics in before then.
>
> - Landing Teams need to prepare their topics based on a given llct.
>
> - To prepare their topics, Landing Teams need to be able to compile
> their kernels.
>
> So I would say that LT's need a final working llct tomorrow really. I
> could manage OK getting this on Wednesday, don't know about other teams.
>
> That's would then mean the monthly release candidate build comes from a
> tree who's contents have never been built together before that day, so
> it's trusting to luck somewhat.
>
> Last month llct was created by the Monday morning so LT's could base
> their branches on that and have them merged into linux-linaro by the end
> of Tuesday. We then had two days to fix problems before Thursday's code
> cutoff.


I'll push updated llct tree with v11 big-LITTLE-MP topic tonight.
If there is the build error fix by tomorrow, I can push one more llct 
update using the updated master v12 version tomorrow. Is the boot delay 
issue a show-stopper? If yes, we could just stick to the v11 for this cycle.


Thanks,
Andrey




Re: [GIT PULL]; big LITTLE MP master v12

2012-11-19 Thread Jon Medhurst (Tixy)
On Mon, 2012-11-19 at 22:24 +0400, Andrey Konovalov wrote:
> On 11/19/2012 10:17 PM, Jon Medhurst (Tixy) wrote:
> > On Mon, 2012-11-19 at 21:14 +0400, Andrey Konovalov wrote:
> >> Viresh,
> >>
> >> I won't pull the big-LITTLE-MP-master-v12 into the
> >> linux-linaro-core-tracking tree today due to the issues found by Tixy.
> >>
> >> Tomorrow evening I am going to pull this topic anyway - whether these
> >> issues are resolved, or not. If the build error is not fixed by Thursday
> >> morning UTC, I'll move llct back to v11. Would it work for the Landing
> >> Teams? Tixy?
> >
> > The timescales seem a bit wrong for that, working backwards...
> 
> That's correct..
> 
> > - The monthly release is made from linux-linaro's state at end of
> > Thursday.
> >
> > - You need to merge Landing Team's topics in before then.
> >
> > - Landing Teams need to prepare their topics based on a given llct.
> >
> > - To prepare their topics, Landing Teams need to be able to compile
> > their kernels.
> >
> > So I would say that LTs need a final working llct tomorrow really. I
> > could manage OK getting this on Wednesday, don't know about other teams.
> >
> > That would then mean the monthly release candidate build comes from a
> > tree whose contents have never been built together before that day, so
> > it's trusting to luck somewhat.
> >
> > Last month llct was created by the Monday morning so LTs could base
> > their branches on that and have them merged into linux-linaro by the end
> > of Tuesday. We then had two days to fix problems before Thursday's code
> > cutoff.
> 
> I'll push an updated llct tree with the v11 big-LITTLE-MP topic tonight.
> If there is a fix for the build error by tomorrow, I can push one more llct
> update using the updated master v12 version tomorrow.

That sounds like a sensible plan :-)

>  Is the boot delay 
> issue a show-stopper?

It's not a show-stopper. It only manifests with a new config option in
the big-LITTLE-MP config, so it doesn't impact any board other than
vexpress, and if required I could override it in my tree or we could add
a simple patch to linux-linaro later.

Someone else can't reproduce the problem so the slowness could be user
error on my part. (The build failure problem is definitely real
though :-)

-- 
Tixy




Re: [GIT PULL]; big LITTLE MP master v12

2012-11-19 Thread Andrey Konovalov

On 11/19/2012 10:42 PM, Jon Medhurst (Tixy) wrote:
> On Mon, 2012-11-19 at 22:24 +0400, Andrey Konovalov wrote:
>> On 11/19/2012 10:17 PM, Jon Medhurst (Tixy) wrote:
>>> On Mon, 2012-11-19 at 21:14 +0400, Andrey Konovalov wrote:
>>>> Viresh,
>>>>
>>>> I won't pull the big-LITTLE-MP-master-v12 into the
>>>> linux-linaro-core-tracking tree today due to the issues found by Tixy.
>>>>
>>>> Tomorrow evening I am going to pull this topic anyway - whether these
>>>> issues are resolved, or not. If the build error is not fixed by Thursday
>>>> morning UTC, I'll move llct back to v11. Would it work for the Landing
>>>> Teams? Tixy?
>>>
>>> The timescales seem a bit wrong for that, working backwards...
>>
>> That's correct..
>>
>>> - The monthly release is made from linux-linaro's state at end of
>>> Thursday.
>>>
>>> - You need to merge Landing Team's topics in before then.
>>>
>>> - Landing Teams need to prepare their topics based on a given llct.
>>>
>>> - To prepare their topics, Landing Teams need to be able to compile
>>> their kernels.
>>>
>>> So I would say that LTs need a final working llct tomorrow really. I
>>> could manage OK getting this on Wednesday, don't know about other teams.
>>>
>>> That would then mean the monthly release candidate build comes from a
>>> tree whose contents have never been built together before that day, so
>>> it's trusting to luck somewhat.
>>>
>>> Last month llct was created by the Monday morning so LTs could base
>>> their branches on that and have them merged into linux-linaro by the end
>>> of Tuesday. We then had two days to fix problems before Thursday's code
>>> cutoff.
>>
>> I'll push an updated llct tree with the v11 big-LITTLE-MP topic tonight.
>> If there is a fix for the build error by tomorrow, I can push one more llct
>> update using the updated master v12 version tomorrow.
>
> That sounds like a sensible plan :-)

llct-20121120.0 has been pushed to g.l.o:
- v3.7-rc6 based
- the same v11 big-LITTLE-MP topic
- configs topic renamed to core-configs
- basic-board-configs topic added
- devfreq topic added
- "KBuild: Allow scripts/* to be cross compiled" patch added to
  llct-v3.7-misc-fixes topic

>> Is the boot delay
>> issue a show-stopper?
>
> It's not a show-stopper. It only manifests with a new config option in
> the big-LITTLE-MP config, so it doesn't impact any board other than
> vexpress, and if required I could override it in my tree or we could add
> a simple patch to linux-linaro later.
>
> Someone else can't reproduce the problem so the slowness could be user
> error on my part. (The build failure problem is definitely real
> though :-)

OK :)

Thanks,
Andrey





[PATCH] sched: Explicit division calls on 64-bit integers

2012-11-19 Thread Preeti U Murthy
Certain gcc toolchains convert a division with a 64-bit dividend into a
__aeabi_uldivmod call, which performs an unnecessary 64-bit by 64-bit divide
even though the divisor is 32-bit. This 64-bit by 64-bit division is not
implemented in the kernel for reasons of efficiency, which results in
undefined reference errors at link time. Hence perform the division on
64-bit dividends using the do_div() function.

The use case below is the integration of the per-entity load tracking
metric with the load balancer, where cfs_rq->runnable_load_avg,
a 64-bit unsigned integer, is used as the base metric for load balancing.
Signed-off-by: Preeti U Murthy
---
 kernel/sched/fair.c |   51 +++
 1 file changed, 31 insertions(+), 20 deletions(-)

diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index f8f3a29..7cd3096 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -2982,9 +2982,13 @@ static u64 cpu_avg_load_per_task(int cpu)
 {
 	struct rq *rq = cpu_rq(cpu);
 	unsigned long nr_running = ACCESS_ONCE(rq->nr_running);
+	u64 cfs_avg_load_per_task;
 
-	if (nr_running)
-		return rq->cfs.runnable_load_avg / nr_running;
+	if (nr_running) {
+		cfs_avg_load_per_task = rq->cfs.runnable_load_avg;
+		do_div(cfs_avg_load_per_task, nr_running);
+		return cfs_avg_load_per_task;
+	}
 
 	return 0;
 }
@@ -3249,7 +3253,8 @@ find_idlest_group(struct sched_domain *sd, struct task_struct *p,
}
 
/* Adjust by relative CPU power of the group */
-   avg_load = (avg_load * SCHED_POWER_SCALE) / group->sgp->power;
+   avg_load = (avg_load * SCHED_POWER_SCALE);
+   do_div(avg_load, group->sgp->power);
 
if (local_group) {
this_load = avg_load;
@@ -4756,7 +4761,8 @@ static inline void update_sg_lb_stats(struct lb_env *env,
}
 
/* Adjust by relative CPU power of the group */
-   sgs->avg_load = (sgs->group_load*SCHED_POWER_SCALE) / group->sgp->power;
+   sgs->avg_load = (sgs->group_load*SCHED_POWER_SCALE);
+   do_div(sgs->avg_load, group->sgp->power);
 
/*
 * Consider the group unbalanced when the imbalance is larger
@@ -4767,8 +4773,10 @@ static inline void update_sg_lb_stats(struct lb_env *env,
 *  normalized nr_running number somewhere that negates
 *  the hierarchy?
 */
-   if (sgs->sum_nr_running)
-   avg_load_per_task = sgs->sum_weighted_load / sgs->sum_nr_running;
+   if (sgs->sum_nr_running) {
+   avg_load_per_task = sgs->sum_weighted_load;
+   do_div(avg_load_per_task, sgs->sum_nr_running);
+   }
 
if ((max_cpu_load - min_cpu_load) >= avg_load_per_task &&
(max_nr_running - min_nr_running) > 1)
@@ -4953,7 +4961,7 @@ void fix_small_imbalance(struct lb_env *env, struct sd_lb_stats *sds)
u64 scaled_busy_load_per_task;
 
if (sds->this_nr_running) {
-   sds->this_load_per_task /= sds->this_nr_running;
+   do_div(sds->this_load_per_task, sds->this_nr_running);
if (sds->busiest_load_per_task >
sds->this_load_per_task)
imbn = 1;
@@ -4964,7 +4972,7 @@ void fix_small_imbalance(struct lb_env *env, struct sd_lb_stats *sds)
 
scaled_busy_load_per_task = sds->busiest_load_per_task
 * SCHED_POWER_SCALE;
-   scaled_busy_load_per_task /= sds->busiest->sgp->power;
+   do_div(scaled_busy_load_per_task, sds->busiest->sgp->power);
 
if (sds->max_load - sds->this_load + scaled_busy_load_per_task >=
(scaled_busy_load_per_task * imbn)) {
@@ -4985,20 +4993,21 @@ void fix_small_imbalance(struct lb_env *env, struct sd_lb_stats *sds)
pwr_now /= SCHED_POWER_SCALE;
 
/* Amount of load we'd subtract */
-   tmp = (sds->busiest_load_per_task * SCHED_POWER_SCALE) /
-   sds->busiest->sgp->power;
+   tmp = (sds->busiest_load_per_task * SCHED_POWER_SCALE);
+   do_div(tmp, sds->busiest->sgp->power);
if (sds->max_load > tmp)
pwr_move += sds->busiest->sgp->power *
min(sds->busiest_load_per_task, sds->max_load - tmp);
 
/* Amount of load we'd add */
if (sds->max_load * sds->busiest->sgp->power <
-   sds->busiest_load_per_task * SCHED_POWER_SCALE)
-   tmp = (sds->max_load * sds->busiest->sgp->power) /
-   sds->this->sgp->power;
-   else
-   tmp = (sds->busiest_load_per_task * SCHED_POWER_SCALE) /
-   sds->this->sgp->power;
+   sds->busiest_load_per_task * SCHED_POWER_SCALE) {
+   tmp = (sds->max_load * sds->busiest->sgp->power);
+   do_div(tmp, sds->this->sgp->power);
+   } else {
+

[PATCH] dt: add helper function to read u8 & u16 variables & arrays

2012-11-19 Thread Viresh Kumar
This adds the following helper routines:
- of_property_read_u8_array()
- of_property_read_u16_array()
- of_property_read_u8()
- of_property_read_u16()

This expects arrays from DT to be passed as:
- u8 array:
property = /bits/ 8 <0x50 0x60 0x70>;
- u16 array:
property = /bits/ 16 <0x5000 0x6000 0x7000>;
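
To make the expected layout concrete, here is a minimal userspace sketch,
not the kernel implementation. It assumes that a "/bits/ 16" property is
stored as consecutive big-endian 16-bit cells, and it mirrors the error
checks of the u16 helper added by this patch. struct sketch_prop and
sketch_read_u16_array() are hypothetical names used only for this
illustration.

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for struct property: raw bytes plus byte length. */
struct sketch_prop {
	const uint8_t *value;
	size_t length;
};

/*
 * Decode sz big-endian 16-bit cells from prop into out.
 * Returns 0 on success, -EINVAL if there is no property, -ENODATA if it
 * has no value, and -EOVERFLOW if it holds fewer than sz elements.
 */
static int sketch_read_u16_array(const struct sketch_prop *prop,
				 uint16_t *out, size_t sz)
{
	const uint8_t *p;

	if (!prop)
		return -EINVAL;
	if (!prop->value)
		return -ENODATA;
	if (sz * sizeof(*out) > prop->length)
		return -EOVERFLOW;

	p = prop->value;
	while (sz--) {
		/* Most significant byte comes first in the property data. */
		*out++ = (uint16_t)((p[0] << 8) | p[1]);
		p += 2;
	}
	return 0;
}
```

With the example property above, the data would be the six bytes
50 00 60 00 70 00, and reading three elements yields 0x5000, 0x6000 and
0x7000 regardless of host endianness. Asking for a fourth element fails
with -EOVERFLOW and leaves the output buffer untouched.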

Signed-off-by: Viresh Kumar 
---
V2->V3:
- Expect u8 & u16 arrays to be passed using: /bits/ 8 or 16
- remove common macro, as not much common now :(
- Tested on ARM platform.

 drivers/of/base.c  | 77 ++
 include/linux/of.h | 30 +
 2 files changed, 107 insertions(+)

diff --git a/drivers/of/base.c b/drivers/of/base.c
index af3b22a..f564e31 100644
--- a/drivers/of/base.c
+++ b/drivers/of/base.c
@@ -671,12 +671,89 @@ struct device_node *of_find_node_by_phandle(phandle handle)
 EXPORT_SYMBOL(of_find_node_by_phandle);
 
 /**
+ * of_property_read_u8_array - Find and read an array of u8 from a property.
+ *
+ * @np:		device node from which the property value is to be read.
+ * @propname:	name of the property to be searched.
+ * @out_values:	pointer to return value, modified only if return value is 0.
+ * @sz:		number of array elements to read
+ *
+ * Search for a property in a device node and read 8-bit value(s) from
+ * it. Returns 0 on success, -EINVAL if the property does not exist,
+ * -ENODATA if property does not have a value, and -EOVERFLOW if the
+ * property data isn't large enough.
+ *
+ * dts entry of array should be like:
+ *	property = /bits/ 8 <0x50 0x60 0x70>;
+ *
+ * The out_values is modified only if a valid u8 value can be decoded.
+ */
+int of_property_read_u8_array(const struct device_node *np,
+			const char *propname, u8 *out_values, size_t sz)
+{
+	struct property *prop = of_find_property(np, propname, NULL);
+	const u8 *val;
+
+	if (!prop)
+		return -EINVAL;
+	if (!prop->value)
+		return -ENODATA;
+	if ((sz * sizeof(*out_values)) > prop->length)
+		return -EOVERFLOW;
+
+	val = prop->value;
+	while (sz--)
+		*out_values++ = *val++;
+	return 0;
+}
+EXPORT_SYMBOL_GPL(of_property_read_u8_array);
+
+/**
+ * of_property_read_u16_array - Find and read an array of u16 from a property.
+ *
+ * @np:		device node from which the property value is to be read.
+ * @propname:	name of the property to be searched.
+ * @out_values:	pointer to return value, modified only if return value is 0.
+ * @sz:		number of array elements to read
+ *
+ * Search for a property in a device node and read 16-bit value(s) from
+ * it. Returns 0 on success, -EINVAL if the property does not exist,
+ * -ENODATA if property does not have a value, and -EOVERFLOW if the
+ * property data isn't large enough.
+ *
+ * dts entry of array should be like:
+ *	property = /bits/ 16 <0x5000 0x6000 0x7000>;
+ *
+ * The out_values is modified only if a valid u16 value can be decoded.
+ */
+int of_property_read_u16_array(const struct device_node *np,
+			const char *propname, u16 *out_values, size_t sz)
+{
+	struct property *prop = of_find_property(np, propname, NULL);
+	const __be16 *val;
+
+	if (!prop)
+		return -EINVAL;
+	if (!prop->value)
+		return -ENODATA;
+	if ((sz * sizeof(*out_values)) > prop->length)
+		return -EOVERFLOW;
+
+	val = prop->value;
+	while (sz--)
+		*out_values++ = be16_to_cpup(val++);
+	return 0;
+}
+EXPORT_SYMBOL_GPL(of_property_read_u16_array);
+
+/**
  * of_property_read_u32_array - Find and read an array of 32 bit integers
  * from a property.
  *
  * @np:		device node from which the property value is to be read.
  * @propname:	name of the property to be searched.
  * @out_value:	pointer to return value, modified only if return value is 0.
+ * @sz:		number of array elements to read
  *
  * Search for a property in a device node and read 32-bit value(s) from
  * it. Returns 0 on success, -EINVAL if the property does not exist,
diff --git a/include/linux/of.h b/include/linux/of.h
index b4e50d5..bfdc130 100644
--- a/include/linux/of.h
+++ b/include/linux/of.h
@@ -223,6 +223,10 @@ extern struct device_node *of_find_node_with_property(
 extern struct property *of_find_property(const struct device_node *np,
 const char *name,
 int *lenp);
+extern int of_property_read_u8_array(const struct device_node *np,
+   const char *propname, u8 *out_values, size_t sz);
+extern int of_property_read_u16_array(const struct device_node *np,
+   const char *propname, u16 *out_values, size_t sz);
 extern int of_property_read_u32_array(const struct device_node *np,
  const ch

Re: [GIT PULL]; big LITTLE MP master v12

2012-11-19 Thread Viresh Kumar
On 19 November 2012 22:44, Andrey Konovalov  wrote:
> I won't pull the big-LITTLE-MP-master-v12 into the
> linux-linaro-core-tracking tree today due to the issues found by Tixy.
>
> Tomorrow evening I am going to pull this topic anyway - whether these issues
> are resolved, or not. If the build error is not fixed by Thursday morning
> UTC, I'll move llct back to v11. Would it work for the Landing Teams? Tixy?

Hi Andrey,

I have updated the master-v12 branch with fixes from Tixy and ARM.
You can PULL it now :)

--
viresh



Re: [RFC 0/3] sched: fix nr_busy_cpus

2012-11-19 Thread Viresh Kumar
On 19 November 2012 22:08, Vincent Guittot  wrote:
> The nr_busy_cpus field of the sched_group_power is sometimes different from 0
> even though the platform is fully idle. This series fixes 3 use cases:
>  - when the SCHED softirq is raised on an idle core for idle load balance but
>    the platform doesn't go out of the cpuidle state
>  - when some CPUs enter idle state while booting all CPUs
>  - when a CPU is unplugged and/or replugged

Applied to an independent branch in the big LITTLE MP tree: fix-nr-busy-cpus-v1
It isn't merged with any other branch.

--
viresh
