from:"Alexey Klimov"

Re: [PATCH v3] cpu/hotplug: wait for cpuset_hotplug_work to finish on cpu onlining

2021-04-14 Thread Alexey Klimov

On Sun, Apr 4, 2021 at 3:32 AM Alexey Klimov  wrote:
>
> On Sat, Mar 27, 2021 at 9:01 PM Thomas Gleixner  wrote:

[...]

Now, the patch:

>> Subject: cpu/hotplug: Cure the cpusets trainwreck
>> From: Thomas Gleixner 
>> Date: Sat, 27 Mar 2021 15:57:29 +0100
>>
>> Alexey and Joshua tried to solve a cpusets related hotplug problem which is
>> user space visible and results in unexpected behaviour for some time after
>> a CPU has been plugged in and the corresponding uevent was delivered.
>>
>> cpusets delegate the hotplug work (rebuilding cpumasks etc.) to a
>> workqueue. This is done because the cpusets code has already a lock
>> nesting of cgroups_mutex -> cpu_hotplug_lock. A synchronous callback or
>> waiting for the work to finish with cpu_hotplug_lock held can and will
>> deadlock because that results in the reverse lock order.
>>
>> As a consequence the uevent can be delivered before cpusets have consistent
>> state which means that a user space invocation of sched_setaffinity() to
>> move a task to the plugged CPU fails up to the point where the scheduled
>> work has been processed.
>>
>> The same is true for CPU unplug, but that does not create user observable
>> failure (yet).
>>
>> It's still inconsistent to claim that an operation is finished before it
>> actually is and that's the real issue at hand. uevents just make it
>> reliably observable.
>>
>> Obviously the problem should be fixed in cpusets/cgroups, but untangling
>> that is pretty much impossible because according to the changelog of the
>> commit which introduced this 8 years ago:
>>
>>  3a5a6d0c2b03("cpuset: don't nest cgroup_mutex inside get_online_cpus()")
>>
>> the lock order cgroups_mutex -> cpu_hotplug_lock is a design decision and
>> the whole code is built around that.
>>
>> So bite the bullet and invoke the relevant cpuset function, which waits for
>> the work to finish, in _cpu_up/down() after dropping cpu_hotplug_lock and
>> only when tasks are not frozen by suspend/hibernate because that would
>> obviously wait forever.
>>
>> Waiting there with cpu_add_remove_lock, which is protecting the present
>> and possible CPU maps, held is not a problem at all because neither work
>> queues nor cpusets/cgroups have any lockchains related to that lock.
>>
>> Waiting in the hotplug machinery is not problematic either because there
>> are already state callbacks which wait for hardware queues to drain. It
>> makes the operations slightly slower, but hotplug is slow anyway.
>>
>> This ensures that state is consistent before returning from a hotplug
>> up/down operation. It's still inconsistent during the operation, but that's
>> a different story.
>>
>> Add a large comment which explains why this is done and why this is not a
>> dump ground for the hack of the day to work around half thought out locking
>> schemes. Document also the implications vs. hotplug operations and
>> serialization or the lack of it.
>>
>> Thanks to Alexey and Joshua for analyzing why this temporary
>> sched_setaffinity() failure happened.
>>
>> Reported-by: Alexey Klimov 
>> Reported-by: Joshua Baker 
>> Signed-off-by: Thomas Gleixner 
>> Cc: Daniel Jordan 
>> Cc: Qais Yousef 

Feel free to use:
Tested-by: Alexey Klimov 

The bug doesn't reproduce with this change; I had the testcase running
for ~25 hrs without failing under different workloads.

Are you going to submit the patch? Or I can do it on your behalf if you like.

[...]

Best regards,
Alexey
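
For readers following the lock-order argument in the quoted changelog, here
is a minimal userspace sketch (not kernel code) of why waiting for the cpuset
work with cpu_hotplug_lock held is a problem. It assumes the deferred cpuset
work needs cgroup_mutex, as the changelog implies. The names lock_graph and
record_nesting() are hypothetical; only the two lock names come from the
thread.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical lock identifiers; only the names mirror the thread. */
enum lock_id { CGROUP_MUTEX, CPU_HOTPLUG_LOCK, NR_LOCKS };

/* order[a][b] is true once some path has taken b while holding a. */
struct lock_graph {
	bool order[NR_LOCKS][NR_LOCKS];
};

/*
 * Record "inner acquired while outer is held". Returns false if the
 * reverse nesting was recorded earlier, i.e. an AB-BA inversion that
 * can deadlock.
 */
bool record_nesting(struct lock_graph *g, enum lock_id outer,
		    enum lock_id inner)
{
	assert(outer != inner);
	if (g->order[inner][outer])
		return false;
	g->order[outer][inner] = true;
	return true;
}
```

Recording the existing cpuset nesting (cgroup_mutex, then cpu_hotplug_lock)
succeeds. Recording the nesting that a synchronous wait under
cpu_hotplug_lock would create is then rejected, which is the situation the
changelog avoids by waiting only after cpu_hotplug_lock has been dropped.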

Re: [PATCH v3] cpu/hotplug: wait for cpuset_hotplug_work to finish on cpu onlining

2021-04-03 Thread Alexey Klimov

On Sat, Mar 27, 2021 at 9:01 PM Thomas Gleixner  wrote:

Lovely that you eventually found time to take a look at this since
the first RFC patch was sent.

> Alexey,
>
> On Wed, Mar 17 2021 at 00:36, Alexey Klimov wrote:
> > When a CPU is offlined and onlined via device_offline() and
> > device_online(), userspace gets a uevent notification. If, after
> > receiving the "online" uevent, userspace executes sched_setaffinity() on
> > some task, trying to move it to the recently onlined CPU, the call
> > sometimes fails with -EINVAL. Userspace needs to wait around 5..30 ms
> > after receiving the uevent before sched_setaffinity() succeeds for the
> > recently onlined CPU.
> >
> > The cpusets used in guarantee_online_cpus() are updated from a workqueue
> > by cpuset_update_active_cpus(), which in turn is called from the CPU
> > hotplug callback sched_cpu_activate(). Hence the update may not yet be
> > visible to sched_setaffinity() if it is called immediately after the
> > uevent.
>
> And because cpusets are using a workqueue just to deal with their
> backwards lock order we need to cure the symptom in the CPU hotplug
> code, right?

Feel free to suggest a better place.

> > The out-of-line uevent can be avoided if we ensure that
> > cpuset_hotplug_work has run to completion, using
> > cpuset_wait_for_hotplug(), after onlining the CPU in cpu_device_up() and
> > in cpuhp_smt_enable().
>
> It can also be avoided by fixing the root cause which is _NOT_ in the
> CPU hotplug code at all.
>
> The fundamental assumption of CPU hotplug is that if the state machine
> reaches a given state, which might have user space visible effects or
> even just kernel visible effects, the overall state of the system has to
> be consistent.
>
> cpusets violate this assumption. And they do so since 2013 due to commit
> 3a5a6d0c2b03("cpuset: don't nest cgroup_mutex inside get_online_cpus()").
>
> If that cannot be fixed in cgroups/cpusets without rewriting the whole
> cpusets/cgroups muck, then this wants to be explained and justified in the
> changelog.
>
> Looking at the changelog of 3a5a6d0c2b03 it's entirely clear that this
> is non trivial because that changelog clearly states that the lock order
> is a design decision and that design decision required that workqueue
> workaround.
>
> See? Now we suddenly have a proper root cause and not just a description
> of the symptom with some hidden hint about workqueues. And we have an
> argument why fixing the root cause is close to impossible.

Thank you for this educational scolding here and below. I see that the
problem here is more fundamental than I thought before and that my commit
message standards are too low for you.
Good to see that a bug that may have existed since 2013 could finally be
fixed.

> >  int cpu_device_up(struct device *dev)
> >  {
> > - return cpu_up(dev->id, CPUHP_ONLINE);
> > + int err;
> > +
> > + err = cpu_up(dev->id, CPUHP_ONLINE);
> > + /*
> > +  * Wait for cpuset updates to cpumasks to finish.  Later on this path
> > +  * may generate uevents whose consumers rely on the updates.
> > +  */
> > + if (!err)
> > + cpuset_wait_for_hotplug();
>
> No. Quite some people wasted^Wspent a considerable amount of time to get
> the hotplug trainwreck cleaned up and we are not sprinkling random
> workarounds all over the place again.
>
> >  int cpuhp_smt_disable(enum cpuhp_smt_control ctrlval)
> >  {
> > - int cpu, ret = 0;
> > + cpumask_var_t mask;
> > + int cpu, ret;
> >
> > + if (!zalloc_cpumask_var(&mask, GFP_KERNEL))
> > + return -ENOMEM;
> > +
> > + ret = 0;
> >   cpu_maps_update_begin();
> >   for_each_online_cpu(cpu) {
> >   if (topology_is_primary_thread(cpu))
> > @@ -2093,31 +2109,42 @@ int cpuhp_smt_disable(enum cpuhp_smt_control ctrlval)
> >   ret = cpu_down_maps_locked(cpu, CPUHP_OFFLINE);
> >   if (ret)
> >   break;
> > - /*
> > -  * As this needs to hold the cpu maps lock it's impossible
> > -  * to call device_offline() because that ends up calling
> > -  * cpu_down() which takes cpu maps lock. cpu maps lock
> > -  * needs to be held as this might race against in kernel
> > -  * abusers of the hotplug machinery (thermal management).
> > -  *
> > -  * So nothing would update device:offline state. That would
> > -  * leave the sysfs entry stale and prevent onlining after
> > -  * smt control ha

[PATCH v3] cpu/hotplug: wait for cpuset_hotplug_work to finish on cpu onlining

2021-03-16 Thread Alexey Klimov

When a CPU is offlined and onlined via device_offline() and device_online(),
userspace gets a uevent notification. If, after receiving the "online"
uevent, userspace executes sched_setaffinity() on some task, trying to move
it to the recently onlined CPU, the call sometimes fails with -EINVAL.
Userspace needs to wait around 5..30 ms after receiving the uevent before
sched_setaffinity() succeeds for the recently onlined CPU.

If the in_mask argument of sched_setaffinity() contains only the recently
onlined CPU, the call can fail along this path:

  sched_setaffinity()
    cpuset_cpus_allowed()
      guarantee_online_cpus()   <-- cs->effective_cpus mask does not
                                    contain recently onlined cpu
    cpumask_and()               <-- final new_mask is empty
    __set_cpus_allowed_ptr()
      cpumask_any_and_distribute() <-- returns dest_cpu equal to nr_cpu_ids
      returns -EINVAL

The cpusets used in guarantee_online_cpus() are updated from a workqueue by
cpuset_update_active_cpus(), which in turn is called from the CPU hotplug
callback sched_cpu_activate(). Hence the update may not yet be visible to
sched_setaffinity() if it is called immediately after the uevent.
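
The failure described above can be modelled in a few lines of plain C. This
is only a sketch with bitmasks standing in for struct cpumask, not the
kernel's implementation: model_setaffinity() and its parameters are
hypothetical names, and the cpuset hierarchy walk in guarantee_online_cpus()
is deliberately left out.

```c
#include <errno.h>
#include <stdint.h>

/*
 * Toy model of the path above, one bit per CPU (up to 64 CPUs).
 *   in_mask        - mask passed by userspace
 *   effective_cpus - the cpuset's view; stale until the work item has run
 *   active_mask    - CPUs the scheduler already considers usable
 */
int model_setaffinity(uint64_t in_mask, uint64_t effective_cpus,
		      uint64_t active_mask)
{
	uint64_t cpus_allowed = effective_cpus;       /* cpuset_cpus_allowed() */
	uint64_t new_mask = in_mask & cpus_allowed;   /* cpumask_and() */
	uint64_t candidates = new_mask & active_mask; /* destination search */

	if (candidates == 0)
		return -EINVAL;	/* no valid destination CPU */
	return 0;
}
```

With CPU 3 freshly onlined (active_mask 0xF) but the cpuset still at 0x7, a
request for CPU 3 alone (0x8) is rejected, while the same request passes once
the cpuset mask has caught up to 0xF.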

The out-of-line uevent can be avoided if we ensure that cpuset_hotplug_work
has run to completion, using cpuset_wait_for_hotplug(), after onlining the
CPU in cpu_device_up() and in cpuhp_smt_enable().

Cc: Daniel Jordan 
Reviewed-by: Qais Yousef 
Co-analyzed-by: Joshua Baker 
Signed-off-by: Alexey Klimov 
---

Changes since v2:
- restore cpuhp_{online,offline}_cpu_device back and move it out
of cpu maps lock;
- use Reviewed-by from Qais;
- minor corrections in commit message and in comment in code.

Changes since v1:
- cpuset_wait_for_hotplug() moved to cpu_device_up();
- corrections in comments;
- removed cpuhp_{online,offline}_cpu_device.

Changes since RFC:
- cpuset_wait_for_hotplug() used in cpuhp_smt_enable().

Previous patches and discussion are:
RFC patch: 
https://lore.kernel.org/lkml/20201203171431.256675-1-akli...@redhat.com/
v1 patch:  
https://lore.kernel.org/lkml/20210204010157.1823669-1-akli...@redhat.com/
v2 patch: 
https://lore.kernel.org/lkml/20210212003032.2037750-1-akli...@redhat.com/

The commit a49e4629b5ed "cpuset: Make cpuset hotplug synchronous"
would also get rid of the early uevent but it was reverted (deadlocks).

The nature of this bug is also described here (with different consequences):
https://lore.kernel.org/lkml/20200211141554.24181-1-qais.you...@arm.com/

Reproducer: https://gitlab.com/0xeafe/xlam

Currently, with these changes applied, the reproducer code continues to work
without issues.
The idea is to avoid the situation where userspace receives an event about
an onlined CPU that is not ready to take tasks for a while after the uevent.

 kernel/cpu.c | 74 +++-
 1 file changed, 56 insertions(+), 18 deletions(-)

diff --git a/kernel/cpu.c b/kernel/cpu.c
index 1b6302ecbabe..9b091d8a8811 100644
--- a/kernel/cpu.c
+++ b/kernel/cpu.c
@@ -15,6 +15,7 @@
 #include 
 #include 
 #include 
+#include 
 #include 
 #include 
 #include 
@@ -1301,7 +1302,17 @@ static int cpu_up(unsigned int cpu, enum cpuhp_state target)
  */
 int cpu_device_up(struct device *dev)
 {
-	return cpu_up(dev->id, CPUHP_ONLINE);
+	int err;
+
+	err = cpu_up(dev->id, CPUHP_ONLINE);
+	/*
+	 * Wait for cpuset updates to cpumasks to finish.  Later on this path
+	 * may generate uevents whose consumers rely on the updates.
+	 */
+	if (!err)
+		cpuset_wait_for_hotplug();
+
+	return err;
 }
 
 int add_cpu(unsigned int cpu)
@@ -2084,8 +2095,13 @@ static void cpuhp_online_cpu_device(unsigned int cpu)
 
 int cpuhp_smt_disable(enum cpuhp_smt_control ctrlval)
 {
-	int cpu, ret = 0;
+	cpumask_var_t mask;
+	int cpu, ret;
 
+	if (!zalloc_cpumask_var(&mask, GFP_KERNEL))
+		return -ENOMEM;
+
+	ret = 0;
 	cpu_maps_update_begin();
 	for_each_online_cpu(cpu) {
 		if (topology_is_primary_thread(cpu))
@@ -2093,31 +2109,42 @@ int cpuhp_smt_disable(enum cpuhp_smt_control ctrlval)
ret = cpu_down_maps_locked(cpu, CPUHP_OFFLINE);
if (ret)
break;
-   /*
-* As this needs to hold the cpu maps lock it's impossible
-* to call device_offline() because that ends up calling
-* cpu_down() which takes cpu maps lock. cpu maps lock
-* needs to be held as this might race against in kernel
-* abusers of the hotplug machinery (thermal management).
-*
-* So nothing would update device:offline state. That would
-* leave the sysfs entry stale and prevent onlining after
-* smt control has been changed to 'off' again. This is
-

Re: [PATCH v2] cpu/hotplug: wait for cpuset_hotplug_work to finish on cpu onlining

2021-03-15 Thread Alexey Klimov

On Fri, Feb 12, 2021 at 7:42 PM Daniel Jordan
 wrote:
>
> Alexey Klimov  writes:
> > int cpu_device_up(struct device *dev)
>
> Yeah, definitely better to do the wait here.
>
> >  int cpuhp_smt_disable(enum cpuhp_smt_control ctrlval)
> >  {
> > - int cpu, ret = 0;
> > + struct device *dev;
> > + cpumask_var_t mask;
> > + int cpu, ret;
> > +
> > + if (!zalloc_cpumask_var(&mask, GFP_KERNEL))
> > + return -ENOMEM;
> >
> > + ret = 0;
> >   cpu_maps_update_begin();
> >   for_each_online_cpu(cpu) {
> >   if (topology_is_primary_thread(cpu))
> > @@ -2099,18 +2098,35 @@ int cpuhp_smt_disable(enum cpuhp_smt_control ctrlval)
> >* called under the sysfs hotplug lock, so it is properly
> >* serialized against the regular offline usage.
> >*/
> > - cpuhp_offline_cpu_device(cpu);
> > + dev = get_cpu_device(cpu);
> > + dev->offline = true;
> > +
> > + cpumask_set_cpu(cpu, mask);
> >   }
> >   if (!ret)
> >   cpu_smt_control = ctrlval;
> >   cpu_maps_update_done();
> > +
> > + /* Tell user space about the state changes */
> > + for_each_cpu(cpu, mask) {
> > + dev = get_cpu_device(cpu);
> > + kobject_uevent(&dev->kobj, KOBJ_OFFLINE);
> > + }
> > +
> > + free_cpumask_var(mask);
> >   return ret;
> >  }
>
> Hrm, should the dev manipulation be kept in one place, something like
> this?

The first part of the comment seems problematic to me with regard to such a
move:

 * As this needs to hold the cpu maps lock it's impossible
 * to call device_offline() because that ends up calling
 * cpu_down() which takes cpu maps lock. cpu maps lock
 * needs to be held as this might race against in kernel
 * abusers of the hotplug machinery (thermal management).

The cpu maps lock is released in cpu_maps_update_done(), hence we would move
the dev->offline update out from under the cpu maps lock. Maybe I
misunderstood the comment and it relates to calling cpu_down_maps_locked()
under the lock to avoid a race?
I failed to find the abusers of the hotplug machinery in drivers/thermal/*
to track down the logic of the potential race, but I might have overlooked
them.
Anyway, if we move the update of dev->offline out, then it makes sense
to restore cpuhp_{offline,online}_cpu_device and just use them.

I guess I'll update and re-send the patch and see how it goes.

> diff --git a/kernel/cpu.c b/kernel/cpu.c
> index 8817ccdc8e112..aa21219a7b7c4 100644
> --- a/kernel/cpu.c
> +++ b/kernel/cpu.c
> @@ -2085,11 +2085,20 @@ int cpuhp_smt_disable(enum cpuhp_smt_control ctrlval)
> ret = cpu_down_maps_locked(cpu, CPUHP_OFFLINE);
> if (ret)
> break;
> +
> +   cpumask_set_cpu(cpu, mask);
> +   }
> +   if (!ret)
> +   cpu_smt_control = ctrlval;
> +   cpu_maps_update_done();
> +
> +   /* Tell user space about the state changes */
> +   for_each_cpu(cpu, mask) {
> /*
> -* As this needs to hold the cpu maps lock it's impossible
> +* When the cpu maps lock was taken above it was impossible
>  * to call device_offline() because that ends up calling
>  * cpu_down() which takes cpu maps lock. cpu maps lock
> -* needs to be held as this might race against in kernel
> +* needed to be held as this might race against in kernel
>  * abusers of the hotplug machinery (thermal management).
>  *
>  * So nothing would update device:offline state. That would

Yeah, reading how you re-phrased it, this seems to be about
cpu_down_maps_locked()/device_offline() locks and race rather than
updating stale dev->offline.

Thank you,
Alexey
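
The pattern discussed in this exchange is to remember the affected CPUs in a
mask while the cpu maps lock is held, and to send the uevents only after the
lock is dropped. It can be sketched in plain userspace C. This is a toy model
under stated assumptions: notify_model, model_uevent() and
model_smt_disable() are made-up names, a 64-bit word stands in for
cpumask_var_t, and a boolean stands in for the lock.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

struct notify_model {
	bool maps_locked;	/* stands in for the cpu maps lock */
	uint64_t offline_mask;	/* CPUs taken down in this pass */
	unsigned int uevents;	/* notifications delivered so far */
};

/* In this model a notification must never be sent under the lock. */
void model_uevent(struct notify_model *m, unsigned int cpu)
{
	assert(!m->maps_locked);
	(void)cpu;
	m->uevents++;
}

/* Offline every CPU in @targets under the lock, notify afterwards. */
void model_smt_disable(struct notify_model *m, uint64_t targets)
{
	unsigned int cpu;

	m->maps_locked = true;		/* cpu_maps_update_begin() */
	for (cpu = 0; cpu < 64; cpu++) {
		if (!(targets & (UINT64_C(1) << cpu)))
			continue;
		/* the real code would take the CPU down here */
		m->offline_mask |= UINT64_C(1) << cpu; /* cpumask_set_cpu() */
	}
	m->maps_locked = false;		/* cpu_maps_update_done() */

	/* Tell "user space" about the state changes, lock already dropped. */
	for (cpu = 0; cpu < 64; cpu++)
		if (m->offline_mask & (UINT64_C(1) << cpu))
			model_uevent(m, cpu);
}
```

The point of the two loops is that the second one runs entirely outside the
locked region, so anything reacting to the notification cannot contend with
the lock held during the offline pass.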

[PATCH v2] cpu/hotplug: wait for cpuset_hotplug_work to finish on cpu onlining

2021-02-11 Thread Alexey Klimov

When a CPU is offlined and onlined via device_offline() and device_online(),
userspace gets a uevent notification. If, after receiving the "online"
uevent, userspace executes sched_setaffinity() on some task, trying to move
it to the recently onlined CPU, the call often fails with -EINVAL. Userspace
needs to wait around 5..30 ms after receiving the uevent before
sched_setaffinity() succeeds for the recently onlined CPU.

If the in_mask argument of sched_setaffinity() contains only the recently
onlined CPU, the call often fails along this path:

  sched_setaffinity()
    cpuset_cpus_allowed()
      guarantee_online_cpus()   <-- cs->effective_cpus mask does not
                                    contain recently onlined cpu
    cpumask_and()               <-- final new_mask is empty
    __set_cpus_allowed_ptr()
      cpumask_any_and_distribute() <-- returns dest_cpu equal to nr_cpu_ids
      returns -EINVAL

The cpusets used in guarantee_online_cpus() are updated from a workqueue by
cpuset_update_active_cpus(), which in turn is called from the CPU hotplug
callback sched_cpu_activate(). Hence the update may not yet be visible to
sched_setaffinity() if it is called immediately after the uevent.
The out-of-line uevent can be avoided if we ensure that cpuset_hotplug_work
has run to completion, using cpuset_wait_for_hotplug(), after onlining the
CPU in cpu_device_up() and in cpuhp_smt_enable().

Co-analyzed-by: Joshua Baker 
Signed-off-by: Alexey Klimov 
---

Previous patches and discussion are:
RFC patch: 
https://lore.kernel.org/lkml/20201203171431.256675-1-akli...@redhat.com/
v1 patch:  
https://lore.kernel.org/lkml/20210204010157.1823669-1-akli...@redhat.com/

The commit a49e4629b5ed "cpuset: Make cpuset hotplug synchronous"
would also get rid of the early uevent but it was reverted (deadlocks).

The nature of this bug is also described here (with different consequences):
https://lore.kernel.org/lkml/20200211141554.24181-1-qais.you...@arm.com/

Reproducer: https://gitlab.com/0xeafe/xlam

Currently, with these changes applied, the reproducer code continues to work
without issues.
The idea is to avoid the situation where userspace receives an event about
an onlined CPU that is not ready to take tasks for a while after the uevent.

 kernel/cpu.c | 79 +---
 1 file changed, 56 insertions(+), 23 deletions(-)

diff --git a/kernel/cpu.c b/kernel/cpu.c
index 4e11e91010e1..8817ccdc8e11 100644
--- a/kernel/cpu.c
+++ b/kernel/cpu.c
@@ -15,6 +15,7 @@
 #include 
 #include 
 #include 
+#include 
 #include 
 #include 
 #include 
@@ -1294,7 +1295,17 @@ static int cpu_up(unsigned int cpu, enum cpuhp_state target)
  */
 int cpu_device_up(struct device *dev)
 {
-	return cpu_up(dev->id, CPUHP_ONLINE);
+	int err;
+
+	err = cpu_up(dev->id, CPUHP_ONLINE);
+	/*
+	 * Wait for cpuset updates to cpumasks to finish.  Later on this path
+	 * may generate uevents whose consumers rely on the updates.
+	 */
+	if (!err)
+		cpuset_wait_for_hotplug();
+
+	return err;
 }
 
 int add_cpu(unsigned int cpu)
@@ -2057,28 +2068,16 @@ void __cpuhp_remove_state(enum cpuhp_state state, bool invoke)
 EXPORT_SYMBOL(__cpuhp_remove_state);
 
 #ifdef CONFIG_HOTPLUG_SMT
-static void cpuhp_offline_cpu_device(unsigned int cpu)
-{
-	struct device *dev = get_cpu_device(cpu);
-
-	dev->offline = true;
-	/* Tell user space about the state change */
-	kobject_uevent(&dev->kobj, KOBJ_OFFLINE);
-}
-
-static void cpuhp_online_cpu_device(unsigned int cpu)
-{
-	struct device *dev = get_cpu_device(cpu);
-
-	dev->offline = false;
-	/* Tell user space about the state change */
-	kobject_uevent(&dev->kobj, KOBJ_ONLINE);
-}
-
 int cpuhp_smt_disable(enum cpuhp_smt_control ctrlval)
 {
-	int cpu, ret = 0;
+	struct device *dev;
+	cpumask_var_t mask;
+	int cpu, ret;
+
+	if (!zalloc_cpumask_var(&mask, GFP_KERNEL))
+		return -ENOMEM;
 
+	ret = 0;
 	cpu_maps_update_begin();
 	for_each_online_cpu(cpu) {
 		if (topology_is_primary_thread(cpu))
@@ -2099,18 +2098,35 @@ int cpuhp_smt_disable(enum cpuhp_smt_control ctrlval)
 		 * called under the sysfs hotplug lock, so it is properly
 		 * serialized against the regular offline usage.
 		 */
-		cpuhp_offline_cpu_device(cpu);
+		dev = get_cpu_device(cpu);
+		dev->offline = true;
+
+		cpumask_set_cpu(cpu, mask);
 	}
 	if (!ret)
 		cpu_smt_control = ctrlval;
 	cpu_maps_update_done();
+
+	/* Tell user space about the state changes */
+	for_each_cpu(cpu, mask) {
+		dev = get_cpu_device(cpu);
+		kobject_uevent(&dev->kobj, KOBJ_OFFLINE);
+	}
+
+	free_cpumask_var(mask);
 	return ret;
 }
 
 int cpuhp_smt_enable(void)
 {
-   int cpu, ret = 0;
+   struct

Re: [PATCH] cpu/hotplug: wait for cpuset_hotplug_work to finish on cpu onlining

2021-02-11 Thread Alexey Klimov

On Fri, Feb 5, 2021 at 12:41 AM Daniel Jordan
 wrote:
>
> Peter Zijlstra  writes:

[...]

> >> > One consequence of this is that you'll now get a bunch of notifications
> >> > across things like suspend/hibernate.
> >>
> >> The patch doesn't change the number of kobject_uevent()s. The
> >> userspace will get the same number of uevents as before the patch (at
> >> least if I can rely on my eyes).
> >
> > bringup_hibernate_cpu() didn't used to generate an event, it does now.
> > Same for bringup_nonboot_cpus().
>
> Both of those call cpu_up(), which only gets a cpuset_wait_for_hotplug()
> in this patch.  No new events generated from that, right, it's just a
> wrapper for a flush_work()?
>
> > Also, looking again, you don't seem to be reinstating the OFFLINE event
> > you took out.
>
> It seems to be reinstated in cpuhp_smt_disable()?

Peter, what Daniel said.
cpuset_wait_for_hotplug() doesn't generate an event.

The offline event was moved below in the same function:

+
+ /* Tell user space about the state changes */
+ for_each_cpu(cpu, mask) {
+ dev = get_cpu_device(cpu);
+ kobject_uevent(&dev->kobj, KOBJ_OFFLINE);
+ }
+
+ free_cpumask_var(mask);

Daniel,
thanks for your comments. I'll update the patch and resend.

Best regards,
Alexey
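
The point made in this reply is that the wait only flushes already queued
work and emits nothing itself. A single-threaded toy model shows it. This is
a sketch under assumptions, not the kernel's workqueue code: hotplug_model
and the model_*() helpers are hypothetical names, and the "work" is reduced
to copying one mask.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

struct hotplug_model {
	uint64_t online_mask;	/* updated immediately on online */
	uint64_t cpuset_mask;	/* updated only when the work runs */
	bool work_pending;	/* stands in for queued cpuset_hotplug_work */
	unsigned int uevents;	/* notifications sent so far */
};

/* Onlining updates the CPU side at once but only queues the cpuset work. */
void model_cpu_online(struct hotplug_model *m, unsigned int cpu)
{
	assert(cpu < 64);
	m->online_mask |= UINT64_C(1) << cpu;
	m->work_pending = true;
}

/* Analogue of flushing the work: run it if pending; sends no event. */
void model_wait_for_hotplug(struct hotplug_model *m)
{
	if (m->work_pending) {
		m->cpuset_mask = m->online_mask;
		m->work_pending = false;
	}
}

/* True when an event sent now would describe a consistent state. */
bool model_state_consistent(const struct hotplug_model *m)
{
	return m->online_mask == m->cpuset_mask;
}
```

In this model the event counter is untouched by the wait, and the state is
consistent only after it has returned, which matches the reasoning in the
quoted reply.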

Re: [PATCH] cpu/hotplug: wait for cpuset_hotplug_work to finish on cpu onlining

2021-02-11 Thread Alexey Klimov

On Fri, Feb 5, 2021 at 11:22 AM Qais Yousef  wrote:
>
> On 02/04/21 10:46, Peter Zijlstra wrote:
> > On Thu, Feb 04, 2021 at 01:01:57AM +0000, Alexey Klimov wrote:
> > > @@ -1281,6 +1282,11 @@ static int cpu_up(unsigned int cpu, enum cpuhp_state target)
> > > err = _cpu_up(cpu, 0, target);
> > >  out:
> > > cpu_maps_update_done();
> > > +
> > > +   /* To avoid out of line uevent */
> > > +   if (!err)
> > > +   cpuset_wait_for_hotplug();
> > > +
> > > return err;
> > >  }
> > >
> >
> > > @@ -2071,14 +2075,18 @@ static void cpuhp_online_cpu_device(unsigned int cpu)
> > > struct device *dev = get_cpu_device(cpu);
> > >
> > > dev->offline = false;
> > > -   /* Tell user space about the state change */
> > > -   kobject_uevent(&dev->kobj, KOBJ_ONLINE);
> > >  }
> > >
> >
> > One consequence of this is that you'll now get a bunch of notifications
> > across things like suspend/hibernate.
>
> And the resume latency will incur 5-30ms * nr_cpu_ids.
>
> Since you just care about device_online(), isn't cpu_device_up() a better
> place for the wait? This function is a special helper for device_online(),
> leaving suspend/resume and kexec paths free from having to do this
> unnecessary wait.

Yup, the same idea here once Peter mentioned bringup_nonboot_cpus()
and bringup_hibernate_cpu().

Best regards,
Alexey

Re: [PATCH] cpu/hotplug: wait for cpuset_hotplug_work to finish on cpu onlining

2021-02-04 Thread Alexey Klimov

On Thu, Feb 4, 2021 at 9:46 AM Peter Zijlstra  wrote:
>
> On Thu, Feb 04, 2021 at 01:01:57AM +0000, Alexey Klimov wrote:
> > @@ -1281,6 +1282,11 @@ static int cpu_up(unsigned int cpu, enum cpuhp_state target)
> >   err = _cpu_up(cpu, 0, target);
> >  out:
> >   cpu_maps_update_done();
> > +
> > + /* To avoid out of line uevent */
> > + if (!err)
> > + cpuset_wait_for_hotplug();
> > +
> >   return err;
> >  }
> >
>
> > @@ -2071,14 +2075,18 @@ static void cpuhp_online_cpu_device(unsigned int cpu)
> >   struct device *dev = get_cpu_device(cpu);
> >
> >   dev->offline = false;
> > - /* Tell user space about the state change */
> > - kobject_uevent(&dev->kobj, KOBJ_ONLINE);
> >  }
> >
>
> One consequence of this is that you'll now get a bunch of notifications
> across things like suspend/hibernate.

The patch doesn't change the number of kobject_uevent()s. The
userspace will get the same number of uevents as before the patch (at
least if I can rely on my eyes).
Or is there a concern that now the uevents are sent in a row
sequentially which might abuse userspace uevents handling machinery?

Best regards,
Alexey

[PATCH] cpu/hotplug: wait for cpuset_hotplug_work to finish on cpu onlining

2021-02-03 Thread Alexey Klimov

When a CPU is offlined and onlined via device_offline() and device_online(),
userspace gets a uevent notification. If, after receiving the "online"
uevent, userspace executes sched_setaffinity() on some task, trying to move
it to the recently onlined CPU, the call often fails with -EINVAL. Userspace
needs to wait around 5..30 ms after receiving the uevent before
sched_setaffinity() succeeds for the recently onlined CPU.
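
Until the kernel side is fixed, the symptom above is what userspace has to
work around. A minimal sketch of such a workaround, assuming a bounded retry
is acceptable, could look like this. pin_to_cpu_retry() and its attempts and
delay_ms parameters are hypothetical and not taken from the reproducer; only
sched_setaffinity() and the CPU_* macros are the real glibc interfaces.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE	/* for sched_setaffinity() and the CPU_* macros */
#endif
#include <assert.h>
#include <errno.h>
#include <sched.h>
#include <stddef.h>
#include <sys/types.h>
#include <time.h>

/*
 * Try to pin @pid to @cpu, retrying while the kernel answers EINVAL.
 * Returns 0 on success, or -1 with errno set. With the 5..30 ms window
 * described above, something like 10 attempts of 5 ms each would cover it.
 */
int pin_to_cpu_retry(pid_t pid, int cpu, int attempts, long delay_ms)
{
	struct timespec delay;
	cpu_set_t set;
	int i;

	assert(cpu >= 0 && cpu < CPU_SETSIZE);
	delay.tv_sec = delay_ms / 1000;
	delay.tv_nsec = (delay_ms % 1000) * 1000000L;

	CPU_ZERO(&set);
	CPU_SET(cpu, &set);

	for (i = 0; i < attempts; i++) {
		if (sched_setaffinity(pid, sizeof(set), &set) == 0)
			return 0;
		if (errno != EINVAL)
			return -1;	/* unrelated failure, do not retry */
		nanosleep(&delay, NULL);
	}
	errno = EINVAL;
	return -1;
}
```

Note that -EINVAL is also what a genuinely unusable CPU produces, so the
retry has to stay bounded. Needing such a loop at all is the user-visible
problem this patch is trying to remove.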

If the in_mask argument of sched_setaffinity() contains only the recently
onlined CPU, the call often fails along this path:

  sched_setaffinity()
    cpuset_cpus_allowed()
      guarantee_online_cpus()   <-- cs->effective_cpus mask does not
                                    contain recently onlined cpu
    cpumask_and()               <-- final new_mask is empty
    __set_cpus_allowed_ptr()
      cpumask_any_and_distribute() <-- returns dest_cpu equal to nr_cpu_ids
      returns -EINVAL

The cpusets used in guarantee_online_cpus() are updated from a workqueue by
cpuset_update_active_cpus(), which in turn is called from the CPU hotplug
callback sched_cpu_activate(). Hence the update may not yet be visible to
sched_setaffinity() if it is called immediately after the uevent.
The out-of-line uevent can be avoided if we ensure that cpuset_hotplug_work
has run to completion, using cpuset_wait_for_hotplug(), after onlining the
CPU in cpu_up() and in cpuhp_smt_enable().

Co-analyzed-by: Joshua Baker 
Signed-off-by: Alexey Klimov 
---

Previous RFC patch and discussion is here:
https://lore.kernel.org/lkml/20201203171431.256675-1-akli...@redhat.com/

The commit a49e4629b5ed "cpuset: Make cpuset hotplug synchronous"
would also get rid of the early uevent but it was reverted (deadlocks).

The nature of this bug is also described here (with different consequences):
https://lore.kernel.org/lkml/20200211141554.24181-1-qais.you...@arm.com/

Reproducer: https://gitlab.com/0xeafe/xlam

Currently, with these changes applied, the reproducer code continues to work
without issues.
The idea is to avoid the situation where userspace receives an event about
an onlined CPU that is not ready to take tasks for a while after the uevent.


 kernel/cpu.c | 47 +--
 1 file changed, 41 insertions(+), 6 deletions(-)

diff --git a/kernel/cpu.c b/kernel/cpu.c
index 4e11e91010e1..ea728e75a74d 100644
--- a/kernel/cpu.c
+++ b/kernel/cpu.c
@@ -15,6 +15,7 @@
 #include 
 #include 
 #include 
+#include 
 #include 
 #include 
 #include 
@@ -1281,6 +1282,11 @@ static int cpu_up(unsigned int cpu, enum cpuhp_state target)
 	err = _cpu_up(cpu, 0, target);
 out:
 	cpu_maps_update_done();
+
+	/* To avoid out of line uevent */
+	if (!err)
+		cpuset_wait_for_hotplug();
+
 	return err;
 }
 
@@ -2062,8 +2068,6 @@ static void cpuhp_offline_cpu_device(unsigned int cpu)
 	struct device *dev = get_cpu_device(cpu);
 
 	dev->offline = true;
-	/* Tell user space about the state change */
-	kobject_uevent(&dev->kobj, KOBJ_OFFLINE);
 }
 
 static void cpuhp_online_cpu_device(unsigned int cpu)
@@ -2071,14 +2075,18 @@ static void cpuhp_online_cpu_device(unsigned int cpu)
 	struct device *dev = get_cpu_device(cpu);
 
 	dev->offline = false;
-	/* Tell user space about the state change */
-	kobject_uevent(&dev->kobj, KOBJ_ONLINE);
 }
 
 int cpuhp_smt_disable(enum cpuhp_smt_control ctrlval)
 {
-	int cpu, ret = 0;
+	struct device *dev;
+	cpumask_var_t mask;
+	int cpu, ret;
+
+	if (!zalloc_cpumask_var(&mask, GFP_KERNEL))
+		return -ENOMEM;
 
+	ret = 0;
 	cpu_maps_update_begin();
 	for_each_online_cpu(cpu) {
 		if (topology_is_primary_thread(cpu))
@@ -2100,17 +2108,32 @@ int cpuhp_smt_disable(enum cpuhp_smt_control ctrlval)
 		 * serialized against the regular offline usage.
 		 */
 		cpuhp_offline_cpu_device(cpu);
+		cpumask_set_cpu(cpu, mask);
 	}
 	if (!ret)
 		cpu_smt_control = ctrlval;
 	cpu_maps_update_done();
+
+	/* Tell user space about the state changes */
+	for_each_cpu(cpu, mask) {
+		dev = get_cpu_device(cpu);
+		kobject_uevent(&dev->kobj, KOBJ_OFFLINE);
+	}
+
+	free_cpumask_var(mask);
 	return ret;
 }
 
 int cpuhp_smt_enable(void)
 {
-	int cpu, ret = 0;
+	struct device *dev;
+	cpumask_var_t mask;
+	int cpu, ret;
 
+	if (!zalloc_cpumask_var(&mask, GFP_KERNEL))
+		return -ENOMEM;
+
+	ret = 0;
 	cpu_maps_update_begin();
 	cpu_smt_control = CPU_SMT_ENABLED;
 	for_each_present_cpu(cpu) {
@@ -2122,8 +2145,20 @@ int cpuhp_smt_enable(void)
break;
/* See comment in cpuhp_smt_disable() */
cpuhp_online_cpu_device(cpu);
+   cpumask_set_cpu(cpu, mask);
}
cpu_maps_update_do

Re: [RFC][PATCH] cpu/hotplug: wait for cpuset_hotplug_work to finish on cpu online

2021-01-19 Thread Alexey Klimov

On Fri, Jan 15, 2021 at 6:54 AM Daniel Jordan
 wrote:
>
> Daniel Jordan  writes:
> > Peter Zijlstra  writes:
> >>> The nature of this bug is also described here (with different consequences):
> >>> https://lore.kernel.org/lkml/20200211141554.24181-1-qais.you...@arm.com/
> >>
> >> Yeah, pesky deadlocks.. someone was going to try again.
> >
> > I dug up the synchronous patch
> >
> > 
> > https://lore.kernel.org/lkml/1579878449-10164-1-git-send-email-prs...@codeaurora.org/
> >
> > but surprisingly wasn't able to reproduce the lockdep splat from
> >
> > 
> > https://lore.kernel.org/lkml/f0388d99-84d7-453b-9b6b-eeff0e7be...@lca.pw/
> >
> > even though I could hit it a few weeks ago.
>
> oh okay, you need to mount a legacy cpuset hierarchy.
>
> So as the above splat shows, making cpuset_hotplug_workfn() synchronous
> means cpu_hotplug_lock (and "cpuhp_state-down") can be acquired before
> cgroup_mutex.
>
> But there are at least four cgroup paths that take the locks in the
> opposite order.  They're all the same, they take cgroup_mutex and then
> cpu_hotplug_lock later on to modify one or more static keys.
>
> cpu_hotplug_lock should probably be ahead of cgroup_mutex because the
> latter is taken in a hotplug callback, and we should keep the static
> branches in cgroup, so the only way out I can think of is moving
> cpu_hotplug_lock to just before cgroup_mutex is taken and switching to
> _cpuslocked flavors of the static key calls.
>
> lockdep quiets down with that change everywhere, but it puts another big
> lock around a lot of cgroup paths.  Seems less heavyhanded to go with
> this RFC.  What do you all think?

Daniel, thank you for taking a look. I don't mind reviewing+testing
another approach that you described.

> Absent further discussion, Alexey, do you plan to post another version?

I plan to update this patch and re-send in the next couple of days. It
looks like it might be a series of two patches. Sorry for delays.

Best regards,
Alexey

[RFC][PATCH] cpu/hotplug: wait for cpuset_hotplug_work to finish on cpu online

2020-12-03 Thread Alexey Klimov

When a CPU is offlined and onlined via device_offline() and device_online(),
userspace gets a uevent notification. If, after receiving the uevent,
userspace executes sched_setaffinity() on some task, trying to move it
to the recently onlined CPU, the call will fail with -EINVAL. Userspace needs
to wait around 5..30 ms after receiving the uevent before
sched_setaffinity() succeeds for the recently onlined CPU.

If in_mask for sched_setaffinity() has only recently onlined CPU, it
quickly fails with such flow:

  sched_setaffinity()
    cpuset_cpus_allowed()
      guarantee_online_cpus()      <-- cs->effective_cpus mask does not
                                       contain recently onlined cpu
    cpumask_and()                  <-- final new_mask is empty
    __set_cpus_allowed_ptr()
      cpumask_any_and_distribute() <-- returns dest_cpu equal to nr_cpu_ids
      returns -EINVAL
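
The failure path above can be illustrated with a toy userspace model. This is only a sketch and not kernel code: the 64-bit mask type and the toy_setaffinity() name are assumptions made for illustration, and the real code works on struct cpumask.

```c
#include <errno.h>
#include <stdint.h>

/*
 * Toy userspace model of the mask check, NOT kernel code.
 * A CPU mask is a 64-bit word; bit N set means CPU N is present.
 */
typedef uint64_t toy_cpumask;

/* Returns 0 if at least one requested CPU is usable, -EINVAL otherwise. */
static int toy_setaffinity(toy_cpumask in_mask, toy_cpumask effective_cpus)
{
	/* Models cpumask_and(): keep only CPUs the cpuset currently allows. */
	toy_cpumask new_mask = in_mask & effective_cpus;

	/* Empty intersection: no destination CPU can be picked. */
	if (new_mask == 0)
		return -EINVAL;

	return 0;
}
```

With effective_cpus still missing the bit of the freshly onlined CPU, a request naming only that CPU gives an empty intersection and fails, while a request that also names an older CPU succeeds.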

Cpusets are updated from a workqueue by cpuset_update_active_cpus(), which
in turn is called from the cpu hotplug callback sched_cpu_activate(), hence
the delay observable by sched_setaffinity().
The early uevent can be avoided by ensuring that cpuset_hotplug_work
has run to completion, using cpuset_wait_for_hotplug() after onlining the
cpu in cpu_up(). Unfortunately, the execution time of
echo 1 > /sys/devices/system/cpu/cpuX/online roughly doubled with this
change (on my test machine).

Co-analyzed-by: Joshua Baker 
Signed-off-by: Alexey Klimov 
---

The commit "cpuset: Make cpuset hotplug synchronous" would also get rid of the
early uevent but it was reverted.

The nature of this bug is also described here (with different consequences):
https://lore.kernel.org/lkml/20200211141554.24181-1-qais.you...@arm.com/

Reproducer: https://gitlab.com/0xeafe/xlam

It could be that I missed the correct place for cpuset synchronisation and it
should be done in cpu_device_up() instead.
I am also in doubt whether we need cpuset_wait_for_hotplug() in
cpuhp_online_cpu_device(), since an online uevent is sent there too.
Currently, with this change, the reproducer code continues to work without issues.
The idea is to avoid the situation where userspace receives the event about
an onlined CPU which is not ready to take tasks for a while after the uevent.
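
For kernels without such a fix, a userspace consumer of the uevent could work around the window by retrying. The following is a minimal sketch under stated assumptions: retry_while_einval() and fake_setaffinity() are hypothetical names, the callback is expected to wrap the real sched_setaffinity() call, and the retry count and delay are left to the caller (the 5..30 ms figure above is the only measured value).

```c
#include <errno.h>
#include <time.h>

/* Callback convention mirrors libc wrappers: 0 on success, -1 with errno set. */
typedef int (*try_fn)(void *ctx);

/* Hypothetical helper: call fn() until it stops failing with EINVAL. */
static int retry_while_einval(try_fn fn, void *ctx, int max_tries, long delay_ms)
{
	struct timespec delay = {
		.tv_sec = delay_ms / 1000,
		.tv_nsec = (delay_ms % 1000) * 1000000L,
	};
	int i;

	for (i = 0; i < max_tries; i++) {
		if (fn(ctx) == 0)
			return 0;
		if (errno != EINVAL)
			return -1;	/* different failure: retrying will not help */
		nanosleep(&delay, NULL);
	}

	errno = EINVAL;
	return -1;
}

/* Stand-in for a sched_setaffinity() wrapper: fails with EINVAL *ctx times. */
static int fake_setaffinity(void *ctx)
{
	int *failures_left = ctx;

	if (*failures_left > 0) {
		(*failures_left)--;
		errno = EINVAL;
		return -1;
	}
	return 0;
}
```

This only hides the symptom. The point of the patch is that userspace should not need such a loop at all.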


 kernel/cpu.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/kernel/cpu.c b/kernel/cpu.c
index 6ff2578ecf17..f39a27a7f24b 100644
--- a/kernel/cpu.c
+++ b/kernel/cpu.c
@@ -15,6 +15,7 @@
 #include 
 #include 
 #include 
+#include 
 #include 
 #include 
 #include 
@@ -1275,6 +1276,8 @@ static int cpu_up(unsigned int cpu, enum cpuhp_state target)
 	}
 
 	err = _cpu_up(cpu, 0, target);
+	if (!err)
+		cpuset_wait_for_hotplug();
 out:
 	cpu_maps_update_done();
 	return err;
-- 
2.26.2

Re: [PATCH v2] [media] Delete unnecessary variable initialisations in seven functions

2018-02-22 Thread Alexey Klimov

On Thu, Feb 22, 2018 at 9:22 PM, SF Markus Elfring
<elfr...@users.sourceforge.net> wrote:
> From: Markus Elfring <elfr...@users.sourceforge.net>
> Date: Thu, 22 Feb 2018 21:45:47 +0100
>
> Some local variables will be set to an appropriate value before usage.
> Thus omit explicit initialisations at the beginning of these functions.
>
> Signed-off-by: Markus Elfring <elfr...@users.sourceforge.net>
> ---
>
> v2:
> Hans Verkuil insisted on patch squashing. Thus some changes
> were recombined based on source files from Linux next-20180216.
>
>  drivers/media/radio/radio-mr800.c     | 2 +-

For radio-mr800:

Acked-by: Alexey Klimov <klimov.li...@gmail.com>

>  drivers/media/radio/radio-wl1273.c| 2 +-
>  drivers/media/radio/si470x/radio-si470x-usb.c | 2 +-
>  drivers/media/usb/cx231xx/cx231xx-cards.c | 2 +-
>  drivers/media/usb/cx231xx/cx231xx-dvb.c   | 2 +-
>  drivers/media/usb/go7007/snd-go7007.c | 2 +-
>  drivers/media/usb/tm6000/tm6000-cards.c   | 2 +-
>  7 files changed, 7 insertions(+), 7 deletions(-)
>
> diff --git a/drivers/media/radio/radio-mr800.c b/drivers/media/radio/radio-mr800.c
> index dc6c4f985911..0f292c6ba338 100644
> --- a/drivers/media/radio/radio-mr800.c
> +++ b/drivers/media/radio/radio-mr800.c
> @@ -511,5 +511,5 @@ static int usb_amradio_probe(struct usb_interface *intf,
> const struct usb_device_id *id)
>  {
> struct amradio_device *radio;
> -   int retval = 0;
> +   int retval;
>
> diff --git a/drivers/media/radio/radio-wl1273.c b/drivers/media/radio/radio-wl1273.c

[..]

Thanks!
Alexey

Re: [PATCH v5 00/20][RESEND] firmware: ARM System Control and Management Interface(SCMI) support

2018-02-14 Thread Alexey Klimov

On Mon, Feb 12, 2018 at 6:45 PM, Sudeep Holla  wrote:
> Hi all,
>
> ARM System Control and Management Interface(SCMI) is more flexible and
> easily extensible than any of the existing interfaces. Many vendors were
> involved in the making of this formal specification and is now published[1].
>
> There is a strong trend in the industry to provide micro-controllers in
> systems to abstract various power, or other system management tasks.
> These controllers usually have similar interfaces, both in terms of the
> functions that are provided by them, and in terms of how requests are
> communicated to them.
>
> This specification is to standardise and avoid (any further)
> fragmentation in the design of such interface by various vendors.
>
> This patch set is intended to get feedback on the design and structure
> of the code. This is not complete and not fully tested due to
> non-availability of firmware with full feature set at this time.

If it's not fully tested and not complete (I read this as: the patch set
is not ready to be merged), then maybe it's better to mark it as RFC?


> It currently doesn't support notification, asynchronous/delayed response,
> perf/power statistics region and sensor register region to name a few.
> I have borrowed some of the ideas of message allocation/management from
> TI SCI.
>
> Changes:
>
> v4[6]->v5:
> - Rebased to v4.16-rc1
> - Updated all the gathered Ack/Reviewed-by tags(which includes
>   all the drivers using SCMI protocol)

You still haven't commented on all the questions raised on the previous patch set.

For example,
https://www.spinics.net/lists/arm-kernel/msg626719.html


Best regards,
Alexey

Re: [PATCH v5 06/20] firmware: arm_scmi: add initial support for performance protocol

2018-01-12 Thread Alexey Klimov

dev, "No. of OPPs exceeded MAX_OPPS");
> +   break;
> +   }
> +
> +   opp = &perf_dom->opp[tot_opp_cnt];
> +   for (cnt = 0; cnt < num_returned; cnt++, opp++) {
> +   opp->perf = le32_to_cpu(level_info->opp[cnt].perf_val);
> +   opp->power = le32_to_cpu(level_info->opp[cnt].power);
> +   opp->trans_latency_us = le16_to_cpu(
> +   level_info->opp[cnt].transition_latency_us);
> +
> +   dev_dbg(handle->dev, "Level %d Power %d Latency %dus\n",
> +   opp->perf, opp->power, opp->trans_latency_us);
> +   }
> +
> +   tot_opp_cnt += num_returned;
> +   /*
> +* check for both returned and remaining to avoid infinite
> +* loop due to buggy firmware
> +*/
> +   } while (num_returned && num_remaining);
> +
> +   perf_dom->opp_count = tot_opp_cnt;
> +   scmi_one_xfer_put(handle, t);
> +
> +   sort(perf_dom->opp, tot_opp_cnt, sizeof(*opp), opp_cmp_func, NULL);
> +   return ret;
> +}
> +
> +static int scmi_perf_limits_set(const struct scmi_handle *handle, u32 domain,
> +   u32 max_perf, u32 min_perf)
> +{
> +   int ret;
> +   struct scmi_xfer *t;
> +   struct scmi_perf_set_limits *limits;
> +
> +   ret = scmi_one_xfer_init(handle, PERF_LIMITS_SET, SCMI_PROTOCOL_PERF,
> +sizeof(*limits), 0, &t);
> +   if (ret)
> +   return ret;
> +
> +   limits = t->tx.buf;
> +   limits->domain = cpu_to_le32(domain);
> +   limits->max_level = cpu_to_le32(max_perf);
> +   limits->min_level = cpu_to_le32(min_perf);
> +
> +   ret = scmi_do_xfer(handle, t);
> +
> +   scmi_one_xfer_put(handle, t);
> +   return ret;
> +}
> +
> +static int scmi_perf_limits_get(const struct scmi_handle *handle, u32 domain,
> +   u32 *max_perf, u32 *min_perf)
> +{
> +   int ret;
> +   struct scmi_xfer *t;
> +   struct scmi_perf_get_limits *limits;
> +
> +   ret = scmi_one_xfer_init(handle, PERF_LIMITS_GET, SCMI_PROTOCOL_PERF,
> +sizeof(__le32), 0, &t);
> +   if (ret)
> +   return ret;
> +
> +   *(__le32 *)t->tx.buf = cpu_to_le32(domain);
> +
> +   ret = scmi_do_xfer(handle, t);
> +   if (!ret) {
> +   limits = t->rx.buf;
> +
> +   *max_perf = le32_to_cpu(limits->max_level);
> +   *min_perf = le32_to_cpu(limits->min_level);
> +   }
> +
> +   scmi_one_xfer_put(handle, t);
> +   return ret;
> +}
> +
> +static int
> +scmi_perf_level_set(const struct scmi_handle *handle, u32 domain, u32 level)
> +{
> +   int ret;
> +   struct scmi_xfer *t;
> +   struct scmi_perf_set_level *lvl;
> +
> +   ret = scmi_one_xfer_init(handle, PERF_LEVEL_SET, SCMI_PROTOCOL_PERF,
> +sizeof(*lvl), 0, &t);
> +   if (ret)
> +   return ret;
> +
> +   lvl = t->tx.buf;
> +   lvl->domain = cpu_to_le32(domain);
> +   lvl->level = cpu_to_le32(level);
> +
> +   ret = scmi_do_xfer(handle, t);
> +
> +   scmi_one_xfer_put(handle, t);
> +   return ret;
> +}
> +
> +static int
> +scmi_perf_level_get(const struct scmi_handle *handle, u32 domain, u32 *level)
> +{
> +   int ret;
> +   struct scmi_xfer *t;
> +
> +   ret = scmi_one_xfer_init(handle, PERF_LEVEL_GET, SCMI_PROTOCOL_PERF,
> +sizeof(u32), sizeof(u32), &t);
> +   if (ret)
> +   return ret;
> +
> +   *(__le32 *)t->tx.buf = cpu_to_le32(domain);
> +
> +   ret = scmi_do_xfer(handle, t);
> +   if (!ret)
> +   *level = le32_to_cpu(*(__le32 *)t->rx.buf);
> +
> +   scmi_one_xfer_put(handle, t);
> +   return ret;
> +}
> +
> +static int __scmi_perf_notify_enable(const struct scmi_handle *handle, u32 cmd,
> +u32 domain, bool enable)
> +{
> +   int ret;
> +   struct scmi_xfer *t;
> +   struct scmi_perf_notify_level_or_limits *notify;
> +
> +   ret = scmi_one_xfer_init(handle, cmd, SCMI_PROTOCOL_PERF,
> +sizeof(*notify), 0, &t);
> +   if (ret)
> +   return ret;
> +
> +   notify = t->tx.buf;
> +   notify->domain = cpu_to_le32(domain);
> +   notify->notify_enable = cpu_to_le32(enable & BIT(0));
> +
> +   ret = scmi_do_xfer(handle, t);
> +
> +   scmi_one_xfer_put(handle, t);
> +   return ret;
> +}
> +
> +static int scmi_perf_limits_notify_enable(const struct scmi_handle *handle,
> + u32 domain, bool enable)
> +{
> +   return __scmi_perf_notify_enable(handle, PERF_NOTIFY_LIMITS,
> +domain, enable);
> +}
> +
> +static int scmi_perf_level_notify_enable(const struct scmi_handle *handle,
> +u32 domain, bool enable)
> +{
> +   return __scmi_perf_notify_enable(handle, PERF_NOTIFY_LEVEL,
> +domain, enable);
> +}
> +

Do you have any support for handling notifications correctly, without
errors/warnings?
It looks like these two functions are accessible to users through
perf_ops. But are you sure that notifications will be handled correctly
by the transport, the mailbox framework and the scmi protocol?

The reason I ask is that it looks better to return -EOPNOTSUPP or
-ENODEV, maybe -EINVAL, here.
When you add notification support you can allow these operations once
it's safe to do so.
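
One way to read the suggestion above is to gate the notify-enable entry points until the receive path exists. The following is a userspace sketch only: struct toy_handle, its notifications_supported flag and toy_perf_notify_enable() are hypothetical and do not come from the patch.

```c
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the protocol handle; not the real scmi_handle. */
struct toy_handle {
	bool notifications_supported;	/* assumed flag, set once an RX path exists */
	unsigned int notify_requests;	/* counts requests that were let through */
};

static int toy_perf_notify_enable(struct toy_handle *h, uint32_t domain,
				  bool enable)
{
	(void)domain;
	(void)enable;

	/* Refuse early instead of asking firmware for events nobody handles. */
	if (!h->notifications_supported)
		return -EOPNOTSUPP;

	h->notify_requests++;	/* a real driver would send the command here */
	return 0;
}
```

Callers then see a clear error code rather than enabling firmware notifications that no code on the kernel side is ready to consume.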

[..]

Best regards,
Alexey Klimov

Re: [PATCH v5 20/20] cpufreq: scmi: add support for fast frequency switching

2018-01-04 Thread Alexey Klimov

Hi Sudeep,

On Tue, Jan 2, 2018 at 2:42 PM, Sudeep Holla  wrote:
> The cpufreq core provides option for drivers to implement fast_switch
> callback which is invoked for frequency switching from interrupt context.
>
> This patch adds support for fast_switch callback in SCMI cpufreq driver
> by making use of polling based SCMI transfer. It also sets the flag
> fast_switch_possible.
>
> Cc: linux...@vger.kernel.org
> Acked-by: Rafael J. Wysocki 
> Acked-by: Viresh Kumar 
> Signed-off-by: Sudeep Holla 
> ---
>  drivers/cpufreq/scmi-cpufreq.c | 15 +++
>  1 file changed, 15 insertions(+)
>
> diff --git a/drivers/cpufreq/scmi-cpufreq.c b/drivers/cpufreq/scmi-cpufreq.c
> index 0ee9335d0063..d0a82d7c6fd4 100644
> --- a/drivers/cpufreq/scmi-cpufreq.c
> +++ b/drivers/cpufreq/scmi-cpufreq.c
> @@ -64,6 +64,19 @@ scmi_cpufreq_set_target(struct cpufreq_policy *policy, unsigned int index)
> return perf_ops->freq_set(handle, priv->domain_id, freq, false);
>  }
>
> +static unsigned int scmi_cpufreq_fast_switch(struct cpufreq_policy *policy,
> +unsigned int target_freq)
> +{
> +   struct scmi_data *priv = policy->driver_data;
> +   struct scmi_perf_ops *perf_ops = handle->perf_ops;
> +
> +   if (!perf_ops->freq_set(handle, priv->domain_id,
> +   target_freq * 1000, true))
> +   return target_freq;
> +
> +   return 0;
> +}

Could you please explain how this is supposed to work for the purpose of
fast frequency switching?

I am trying to track down ->freq_set.
It looks like this will fire an scmi perf level set command and
poll for the command to complete, without asking the firmware to
send a command completion irq.

scmi_perf_level_set() will call the following functions:

scmi_one_xfer_init();
scmi_do_xfer(handle, t);
scmi_one_xfer_put(handle, t);

The first function in the list calls scmi_one_xfer_get() which has
this in the description (I guess because of down_timeout()):
"This function can sleep depending on pending requests already in the system
for the SCMI entity. Further, this also holds a spinlock to maintain
integrity of internal data structures."

So it can sleep.

As far as I can see from the description of fast frequency switching,
fast_switch is required not to sleep:
(file Documentation/cpu-freq/cpu-drivers.txt)

"This function is used for frequency switching from scheduler's context.
Not all drivers are expected to implement it, as sleeping from within
this callback isn't allowed. This callback must be highly optimized to
do switching as fast as possible."

The other questions about this implementation of fast switching:

1) The fast switching callback must be highly optimized. Is it now? I see
a few spinlocks (in the scmi mbox client and in the mailbox framework) there
and polling functionality with udelay(5) inside that will time out (if
my calculations are correct) after 0.5 ms.
2) Is it highly dependent on the transport? If the mailbox transport's
->send_data() may sleep, or hrtimer-based polling in the mailbox framework
is used, then this fast switch won't work, right?
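
To make the cost in 1) concrete: if the 0.5 ms estimate is right, a 5 us step gives 500 / 5 = 100 polls in the worst case. The sketch below is a userspace model of such a bounded busy-poll, not the driver's code. The macro and function names are hypothetical, and instead of really delaying it accumulates the time that udelay(5) would have spent.

```c
#include <errno.h>
#include <stdbool.h>

#define POLL_STEP_US	5	/* matches the udelay(5) mentioned above */
#define POLL_BUDGET_US	500	/* the estimated 0.5 ms timeout */

typedef bool (*done_fn)(void *ctx);

/*
 * Userspace model of a bounded busy-poll. Instead of really delaying, it adds
 * the step to *waited_us so the worst-case cost can be inspected.
 */
static int toy_poll_done(done_fn done, void *ctx, unsigned int *waited_us)
{
	*waited_us = 0;

	while (!done(ctx)) {
		if (*waited_us >= POLL_BUDGET_US)
			return -ETIMEDOUT;
		*waited_us += POLL_STEP_US;	/* stands in for udelay(5) */
	}
	return 0;
}

/* Completion stub: reports done once *ctx polls have been made. */
static bool done_after(void *ctx)
{
	unsigned int *polls_left = ctx;

	if (*polls_left == 0)
		return true;
	(*polls_left)--;
	return false;
}
```

In this model a firmware that never answers costs the full 500 us of spinning in a context that is supposed to switch frequency as fast as possible.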

I am still looking into this: I may be wrong and am just trying to
understand whether it is all okay.

[..]

Thanks,
Alexey

Re: [PATCH v5 03/20] firmware: arm_scmi: add basic driver infrastructure for SCMI

2018-01-04 Thread Alexey Klimov

Hi Sudeep,

thank you for working on this.

On Tue, Jan 2, 2018 at 2:42 PM, Sudeep Holla  wrote:

[...]

> diff --git a/drivers/firmware/arm_scmi/driver.c b/drivers/firmware/arm_scmi/driver.c
> new file mode 100644
> index ..58d8f88893e6
> --- /dev/null
> +++ b/drivers/firmware/arm_scmi/driver.c

[..]

> + * Return: 0 is successfully released
> + * if null was passed, it returns -EINVAL;
> + */
> +int scmi_handle_put(const struct scmi_handle *handle)
> +{
> +   struct scmi_info *info;
> +
> +   if (!handle)
> +   return -EINVAL;
> +
> +   info = handle_to_scmi_info(handle);
> +   mutex_lock(&scmi_list_mutex);
> +   if (!WARN_ON(!info->users))
> +   info->users--;
> +   mutex_unlock(&scmi_list_mutex);
> +
> +   return 0;
> +}
> +
> +static const struct scmi_desc scmi_generic_desc = {
> +   .max_rx_timeout_ms = 30,/* we may increase this if required */

What are your thoughts about making it a module parameter?

IIRC, this may need to be increased when someone uses a debugging
version of the firmware, for example.
In such a case someone might need to recompile the kernel in order to
boot with scmi enabled and initialized.

Also, there is a chance that another transport will be used that
requires a delay larger than 5 * 30 ms (such a transport may be
kinda useless, I know, but it can help with development).

With a module parameter you can still boot by passing a larger timeout
to the module from the cmdline.
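
The behaviour being asked for is a default of 30 ms with an optional boot-time override. In the kernel that would normally be done with module_param(), which takes care of the parsing. The sketch below is only a userspace model of the fallback logic: the parameter name scmi.max_rx_timeout_ms and the helper are hypothetical.

```c
#include <stdio.h>
#include <string.h>

#define DEFAULT_RX_TIMEOUT_MS 30	/* the value hard-coded in the patch */

/* Returns the override found in cmdline, or the default if absent/invalid. */
static unsigned int rx_timeout_from_cmdline(const char *cmdline)
{
	const char *key = "scmi.max_rx_timeout_ms=";	/* hypothetical name */
	const char *p = strstr(cmdline, key);
	unsigned int val;

	/* Accept only a parsable, non-zero number right after the key. */
	if (p && sscanf(p + strlen(key), "%u", &val) == 1 && val > 0)
		return val;

	return DEFAULT_RX_TIMEOUT_MS;
}
```

A debug firmware build that needs, say, 500 ms could then be accommodated from the bootloader prompt instead of by rebuilding the kernel.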

> +   .max_msg = 20,  /* Limited by MBOX_TX_QUEUE_LEN */
> +   .max_msg_size = 128,
> +};

Best regards,
Alexey

Re: [PATCH 3/3] [media] radio: constify usb_device_id

2017-08-21 Thread Alexey Klimov

Hi Arvind,

thanks for the patch!

On Sun, Aug 13, 2017 at 9:54 AM, Arvind Yadav <arvind.yadav...@gmail.com> wrote:
> usb_device_id are not supposed to change at runtime. All functions
> working with usb_device_id provided by  work with
> const usb_device_id. So mark the non-const structs as const.
>
> Signed-off-by: Arvind Yadav <arvind.yadav...@gmail.com>

For dsbr100, radio-mr800 and radio-ma901 please feel free to use:

Acked-by: Alexey Klimov <klimov.li...@gmail.com>


> ---
>  drivers/media/radio/dsbr100.c | 2 +-
>  drivers/media/radio/radio-keene.c | 2 +-
>  drivers/media/radio/radio-ma901.c | 2 +-
>  drivers/media/radio/radio-mr800.c | 2 +-
>  drivers/media/radio/radio-raremono.c  | 2 +-
>  drivers/media/radio/radio-shark.c | 2 +-
>  drivers/media/radio/radio-shark2.c| 2 +-
>  drivers/media/radio/si470x/radio-si470x-usb.c | 2 +-
>  drivers/media/radio/si4713/radio-usb-si4713.c | 2 +-
>  9 files changed, 9 insertions(+), 9 deletions(-)
>
> diff --git a/drivers/media/radio/dsbr100.c b/drivers/media/radio/dsbr100.c
> index 53bc8c0..8521bb2 100644


[...]

Best regards,
Alexey

Re: [PATCH 2/4] mailbox: pcc: Drop uninformative output during boot

2017-07-31 Thread Alexey Klimov

(keeping Rafael in c/c per Jassi suggestion)

On Thu, Jul 20, 2017 at 12:04 PM, Punit Agrawal <punit.agra...@arm.com> wrote:
> When booting on an ACPI enabled system that does not provide the
> Platform Communications Channel Table (PCCT), the pcc mailbox driver
> prints -
> 
> [0.484261] PCCT header not found.

Ah. I thought this was already removed ages ago. Thanks for removing this.
 
> during probe before returning -ENODEV.
> 
> This message clutters the bootlog and doesn't provide any useful
> information. Drop this message.
> 
> Signed-off-by: Punit Agrawal <punit.agra...@arm.com>
> Cc: Jassi Brar <jassisinghb...@gmail.com>

Acked-by: Alexey Klimov <alexey.kli...@arm.com>

Re: [PATCH v2 2/3] mailbox: introduce ARM SMC based mailbox

2017-07-31 Thread Alexey Klimov

Hi Andre,

On Mon, Jul 24, 2017 at 12:23 AM, Andre Przywara  wrote:
> 
> This mailbox driver implements a mailbox which signals transmitted data
> via an ARM smc (secure monitor call) instruction. The mailbox receiver

As far as I can see, this driver also supports transmission via hvc.
However, almost everywhere here only the smc instruction is mentioned.
Is it okay from your point of view?

> is implemented in firmware and can synchronously return data when it
> returns execution to the non-secure world again.
> An asynchronous receive path is not implemented.
> This allows the usage of a mailbox to trigger firmware actions on SoCs
> which either don't have a separate management processor or on which such
> a core is not available. A user of this mailbox could be the SCP
> interface.
> 
> Signed-off-by: Andre Przywara 
> ---
>  drivers/mailbox/Kconfig   |   8 ++
>  drivers/mailbox/Makefile  |   2 +
>  drivers/mailbox/arm-smc-mailbox.c | 155 
> ++
>  3 files changed, 165 insertions(+)
>  create mode 100644 drivers/mailbox/arm-smc-mailbox.c
> 
> diff --git a/drivers/mailbox/Kconfig b/drivers/mailbox/Kconfig
> index c5731e5..5664b7f 100644
> --- a/drivers/mailbox/Kconfig
> +++ b/drivers/mailbox/Kconfig
> @@ -170,4 +170,12 @@ config BCM_FLEXRM_MBOX
>   Mailbox implementation of the Broadcom FlexRM ring manager,
>   which provides access to various offload engines on Broadcom
>   SoCs. Say Y here if you want to use the Broadcom FlexRM.
> +
> +config ARM_SMC_MBOX
> +   tristate "Generic ARM smc mailbox"
> +   depends on OF && HAVE_ARM_SMCCC
> +   help
> + Generic mailbox driver which uses ARM smc calls to call into
> + firmware for triggering mailboxes.
> +
>  endif
> diff --git a/drivers/mailbox/Makefile b/drivers/mailbox/Makefile
> index d54e412..8ec6869 100644
> --- a/drivers/mailbox/Makefile
> +++ b/drivers/mailbox/Makefile
> @@ -35,3 +35,5 @@ obj-$(CONFIG_BCM_FLEXRM_MBOX) += bcm-flexrm-mailbox.o
>  obj-$(CONFIG_QCOM_APCS_IPC)+= qcom-apcs-ipc-mailbox.o
> 
>  obj-$(CONFIG_TEGRA_HSP_MBOX)   += tegra-hsp.o
> +
> +obj-$(CONFIG_ARM_SMC_MBOX) += arm-smc-mailbox.o
> diff --git a/drivers/mailbox/arm-smc-mailbox.c
> b/drivers/mailbox/arm-smc-mailbox.c
> new file mode 100644
> index 0000000..d7b61a7
> --- /dev/null
> +++ b/drivers/mailbox/arm-smc-mailbox.c
> @@ -0,0 +1,155 @@
> +/*
> + *  Copyright (C) 2016,2017 ARM Ltd.
> + *
> + * This program is free software; you can redistribute it and/or modify
> + * it under the terms of the GNU General Public License version 2 as
> + * published by the Free Software Foundation.
> + *
> + * This device provides a mechanism for emulating a mailbox by using
> + * smc calls, allowing a "mailbox" consumer to sit in firmware running
> + * on the same core.
> + */
> +
> +#include 
> +#include 
> +#include 
> +#include 
> +#include 
> +#include 
> +
> +#define ARM_SMC_MBOX_USE_HVC   BIT(0)
> +
> +struct arm_smc_chan_data {
> +   u32 function_id;
> +   u32 flags;
> +};
> +
> +static int arm_smc_send_data(struct mbox_chan *link, void *data)
> +{
> +   struct arm_smc_chan_data *chan_data = link->con_priv;
> +   u32 function_id = chan_data->function_id;
> +   struct arm_smccc_res res;
> +   u32 msg = *(u32 *)data;
> +
> +   if (chan_data->flags & ARM_SMC_MBOX_USE_HVC)
> +           arm_smccc_hvc(function_id, msg, 0, 0, 0, 0, 0, 0, &res);
> +   else
> +           arm_smccc_smc(function_id, msg, 0, 0, 0, 0, 0, 0, &res);
> +
> +   mbox_chan_received_data(link, (void *)res.a0);
> +
> +   return 0;
> +}
> +
> +/* This mailbox is synchronous, so we are always done. */
> +static bool arm_smc_last_tx_done(struct mbox_chan *link)
> +{
> +   return true;
> +}
> +
> +static const struct mbox_chan_ops arm_smc_mbox_chan_ops = {
> +   .send_data  = arm_smc_send_data,
> +   .last_tx_done   = arm_smc_last_tx_done
> +};

How is the use of the timer-based polling tx_done method justified (since it
always returns 'true')?

At first glance, wouldn't it be more efficient to use TXDONE_BY_ACK here?
For instance, a controller will say:

mbox->txdone_poll = false;
mbox->txdone_irq = false;

and a client will say:

cl->tx_block = true;
cl->knows_txdone = true;

and the client will tick the tx machinery with its mbox_client_txdone()
immediately after sending a message (since 'This mailbox is synchronous').
Otherwise, why should the framework and client wait >=1 ms before sending the
next message?
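
To make the question above concrete, here is a minimal user-space sketch of
how a completion method might be derived from the two controller flags, and
what extra wait each method implies. It is only a model under stated
assumptions: the struct, the function names and the flag values are
hypothetical stand-ins, not the mailbox framework's actual code.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins for the three completion methods discussed above. */
enum {
	TXDONE_BY_IRQ  = 1 << 0,
	TXDONE_BY_POLL = 1 << 1,
	TXDONE_BY_ACK  = 1 << 2,
};

/* Simplified model of the two controller flags named in the review. */
struct model_controller {
	bool txdone_irq;
	bool txdone_poll;
};

/* With both flags false, the client is expected to acknowledge completion. */
int model_txdone_method(const struct model_controller *mbox)
{
	if (mbox->txdone_irq)
		return TXDONE_BY_IRQ;
	if (mbox->txdone_poll)
		return TXDONE_BY_POLL;
	return TXDONE_BY_ACK;
}

/* Extra delay before the next message can go out, in ms (model only). */
unsigned int model_extra_wait_ms(int method, unsigned int poll_period_ms)
{
	return method == TXDONE_BY_POLL ? poll_period_ms : 0;
}
```

In this model a synchronous controller that leaves both flags false pays no
polling delay, which is the point the review comment is asking about.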



> +static int arm_smc_mbox_probe(struct platform_device *pdev)
> +{
> +   struct device *dev = &pdev->dev;
> +   struct mbox_controller *mbox;
> +   struct arm_smc_chan_data *chan_data;
> +   const char *method;
> +   bool use_hvc = false;
> +   int ret, i;
> +
> +   ret = of_property_count_elems_of_size(dev->of_node, "arm,func-ids",
> + sizeof(u32));
> +   if (ret <

[PATCH 1/3] rtc: sun6i: Remove double init of spinlock in sun6i_rtc_clk_init()

2017-07-12 Thread Alexey Klimov

Fixes: 847b8bf62eb4 ("rtc: sun6i: Expose the 32kHz oscillator")
Cc: Maxime Ripard <maxime.rip...@free-electrons.com>
Cc: Rob Herring <r...@kernel.org>
Signed-off-by: Alexey Klimov <alexey.kli...@arm.com>
---
 drivers/rtc/rtc-sun6i.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/rtc/rtc-sun6i.c b/drivers/rtc/rtc-sun6i.c
index 39cbc12..7e7da60 100644
--- a/drivers/rtc/rtc-sun6i.c
+++ b/drivers/rtc/rtc-sun6i.c
@@ -193,12 +193,12 @@ static void __init sun6i_rtc_clk_init(struct device_node *node)
 	rtc = kzalloc(sizeof(*rtc), GFP_KERNEL);
 	if (!rtc)
 		return;
-	spin_lock_init(&rtc->lock);
 
 	clk_data = kzalloc(sizeof(*clk_data) + sizeof(*clk_data->hws),
 			   GFP_KERNEL);
 	if (!clk_data)
 		return;
+
 	spin_lock_init(&rtc->lock);
 
 	rtc->base = of_io_request_and_map(node, 0, of_node_full_name(node));
-- 
1.9.1

[PATCH 3/3] rtc: sun6i: Remove unneeded initalization of ret in sun6i_rtc_setalarm()

2017-07-12 Thread Alexey Klimov

Signed-off-by: Alexey Klimov <alexey.kli...@arm.com>
---
 drivers/rtc/rtc-sun6i.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/rtc/rtc-sun6i.c b/drivers/rtc/rtc-sun6i.c
index 77bc4d3..1886b85 100644
--- a/drivers/rtc/rtc-sun6i.c
+++ b/drivers/rtc/rtc-sun6i.c
@@ -362,7 +362,7 @@ static int sun6i_rtc_setalarm(struct device *dev, struct rtc_wkalrm *wkalrm)
 	unsigned long time_now = 0;
 	unsigned long time_set = 0;
 	unsigned long time_gap = 0;
-	int ret = 0;
+	int ret;
 
 	ret = sun6i_rtc_gettime(dev, &tm_now);
 	if (ret < 0) {
-- 
1.9.1

[PATCH 2/3] rtc: sun6i: fix memleaks and add error-path in sun6i_rtc_clk_init()

2017-07-12 Thread Alexey Klimov

The memory allocated for rtc and clk_data will never be freed in
sun6i_rtc_clk_init() in case of error and return. This patch adds
required error path with memory freeing.

Fixes: 847b8bf62eb4 ("rtc: sun6i: Expose the 32kHz oscillator")
Cc: Maxime Ripard <maxime.rip...@free-electrons.com>
Cc: Rob Herring <r...@kernel.org>
Signed-off-by: Alexey Klimov <alexey.kli...@arm.com>
---
 drivers/rtc/rtc-sun6i.c | 17 ++++++++++++-----
 1 file changed, 12 insertions(+), 5 deletions(-)

diff --git a/drivers/rtc/rtc-sun6i.c b/drivers/rtc/rtc-sun6i.c
index 7e7da60..77bc4d3 100644
--- a/drivers/rtc/rtc-sun6i.c
+++ b/drivers/rtc/rtc-sun6i.c
@@ -197,14 +197,14 @@ static void __init sun6i_rtc_clk_init(struct device_node *node)
 	clk_data = kzalloc(sizeof(*clk_data) + sizeof(*clk_data->hws),
 			   GFP_KERNEL);
 	if (!clk_data)
-		return;
+		goto out_rtc_free;
 
 	spin_lock_init(&rtc->lock);
 
 	rtc->base = of_io_request_and_map(node, 0, of_node_full_name(node));
 	if (IS_ERR(rtc->base)) {
 		pr_crit("Can't map RTC registers");
-		return;
+		goto out_clk_data_free;
 	}
 
 	/* Switch to the external, more precise, oscillator */
@@ -216,7 +216,7 @@ static void __init sun6i_rtc_clk_init(struct device_node *node)
 
 	/* Deal with old DTs */
 	if (!of_get_property(node, "clocks", NULL))
-		return;
+		goto out_clk_data_free;
 
 	rtc->int_osc = clk_hw_register_fixed_rate_with_accuracy(NULL,
 								"rtc-int-osc",
@@ -225,7 +225,7 @@ static void __init sun6i_rtc_clk_init(struct device_node *node)
 								3);
 	if (IS_ERR(rtc->int_osc)) {
 		pr_crit("Couldn't register the internal oscillator\n");
-		return;
+		goto out_clk_data_free;
 	}
 
 	parents[0] = clk_hw_get_name(rtc->int_osc);
@@ -240,12 +240,19 @@ static void __init sun6i_rtc_clk_init(struct device_node *node)
 	rtc->losc = clk_register(NULL, &rtc->hw);
 	if (IS_ERR(rtc->losc)) {
 		pr_crit("Couldn't register the LOSC clock\n");
-		return;
+		goto out_clk_data_free;
 	}
 
 	clk_data->num = 1;
 	clk_data->hws[0] = &rtc->hw;
 	of_clk_add_hw_provider(node, of_clk_hw_onecell_get, clk_data);
+
+	return;
+
+out_clk_data_free:
+	kfree(clk_data);
+out_rtc_free:
+	kfree(rtc);
 }
 CLK_OF_DECLARE_DRIVER(sun6i_rtc_clk, "allwinner,sun6i-a31-rtc",
 		      sun6i_rtc_clk_init);
-- 
1.9.1
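
The error path added by this patch follows the usual goto-unwind pattern: each
label frees what was allocated before the failing step, in reverse order. The
following is a user-space sketch of that pattern only, not the driver code.
All demo_* names, the fail_step parameter and the live-allocation counter are
hypothetical and exist purely to make the unwinding observable.

```c
#include <assert.h>
#include <stdlib.h>

int demo_live;	/* number of allocations not yet freed (demo bookkeeping) */

/* Allocate zeroed memory, or simulate an allocation failure when fail != 0. */
void *demo_alloc(size_t size, int fail)
{
	void *p = fail ? NULL : calloc(1, size);

	if (p)
		demo_live++;
	return p;
}

void demo_free(void *p)
{
	if (p)
		demo_live--;
	free(p);
}

struct demo_rtc { int base_mapped; };
struct demo_clk_data { int num; };

/*
 * fail_step: 0 = success, 1 = first allocation fails,
 * 2 = second allocation fails, 3 = a later setup step fails.
 * On success the caller owns both objects; on failure nothing is left behind.
 */
int demo_clk_init(int fail_step, struct demo_rtc **rtc_out,
		  struct demo_clk_data **clk_out)
{
	struct demo_rtc *rtc;
	struct demo_clk_data *clk_data;

	rtc = demo_alloc(sizeof(*rtc), fail_step == 1);
	if (!rtc)
		return -1;

	clk_data = demo_alloc(sizeof(*clk_data), fail_step == 2);
	if (!clk_data)
		goto out_rtc_free;

	if (fail_step == 3)	/* stands in for a failed register mapping */
		goto out_clk_data_free;

	clk_data->num = 1;
	*rtc_out = rtc;
	*clk_out = clk_data;
	return 0;

out_clk_data_free:
	demo_free(clk_data);
out_rtc_free:
	demo_free(rtc);
	return -1;
}
```

Because the labels fall through, a failure at a later step releases everything
allocated before it without duplicating the cleanup calls at each return site.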

Re: [PATCH RFC] mailbox: move controller timer to per-channel timers

2017-05-30 Thread Alexey Klimov

On Thu, May 25, 2017 at 6:43 PM, Alexey Klimov <alexey.kli...@arm.com> wrote:
> Hi Jassi,
> 
> sorry for delay again.
> 
> On Tue, Apr 11, 2017 at 06:30:08PM +0530, Jassi Brar wrote:
> > On 11 April 2017 at 18:04, Alexey Klimov <alexey.kli...@arm.com> wrote:
> > > On Fri, Apr 07, 2017 at 08:39:35PM +0530, Jassi Brar wrote:
> > >> On Thu, Apr 6, 2017 at 11:01 PM, Alexey Klimov <alexey.kli...@arm.com> 
> > >> wrote:
> > >> > When mailbox controller provides two or more channels and
> > >> > they are actively used by mailbox client(s) it's very easy
> > >> > to trigger the warning in hrtimer_forward():
> > >> >
> > >> > [  247.853060] WARNING: CPU: 6 PID: 0 at kernel/time/hrtimer.c:805 
> > >> > hrtimer_forward+0x88/0xd8
> > >> > [  247.853549] Modules linked in:
> > >> > [  247.853907] CPU: 6 PID: 0 Comm: swapper/6 Tainted: GW   
> > >> > 4.11.0-rc2-00362-g93afaa4513bb-dirty #13
> > >> > [  247.854472] Hardware name: linux,dummy-virt (DT)
> > >> > [  247.854699] task: 80001d89d780 task.stack: 80001d8c4000
> > >> > [  247.854999] PC is at hrtimer_forward+0x88/0xd8
> > >> > [  247.855280] LR is at txdone_hrtimer+0xd4/0xf8
> > >> > [  247.81] pc : [] lr : [] 
> > >> > pstate: 21c5
> > >> > [  247.855857] sp : 80001efbdeb0
> > >> > [  247.856072] x29: 80001efbdeb0 x28: 80001efc3140
> > >> > [  247.856358] x27: 0881b7a0 x26: 0039ac93e8b6
> > >> > [  247.856604] x25: 08e756be x24: 80001c4a1348
> > >> > [  247.856882] x23: 0001 x22: 00f8
> > >> > [  247.857189] x21: 80001c4a1318 x20: 80001d327110
> > >> > [  247.857509] x19: 000f4240 x18: 0030
> > >> > [  247.857808] x17: aecdf370 x16: 081ccc80
> > >> > [  247.858000] x15: 0010 x14: fff0
> > >> > [  247.858186] x13: 08f488e0 x12: 0002e3eb
> > >> > [  247.858381] x11: 08979690 x10: 
> > >> > [  247.858573] x9 : 0001 x8 : 80001efc66e0
> > >> > [  247.858758] x7 : 80001efc6708 x6 : 0005be7732f2
> > >> > [  247.858943] x5 : 0001 x4 : 80001c4a1348
> > >> > [  247.859130] x3 : 0039ac94952a x2 : 000f4240
> > >> > [  247.859315] x1 : 0039ac98243c x0 : 00038f12
> > >> > [  247.859582] ---[ end trace d61812426ec3c30b ]---
> > >> >
> > >> > To fix this current patch migrates hr timers to be per-channel
> > >> > instead of using only one timer per-controller.
> > >> >
> > >> I think we can do by just checking if hrtimer_active() returns false
> > >> before we do hrtimer_start() in msg_submit() ?
> > >
> > > It looks like it can be easily broken:
> > >
> > > 1) let's say first thread executes timer callback and already checked
> > > last_tx_done on channel 0;
> > > 2) second thread submits a message to the controller, say, on channel 0
> > > and with help of hrtimer_active() observes that the timer is active
> > > (because timer callback is running) and decides not to (re-)start timer;
> > >
> > > After this first thread decides not to restart the timer and finishes
> > > callback. The thing that first thread executes tx_tick isn't helpful: for
> > > example first thread may have no messages to submit on any channel and
> > > therefore is not going to deal with timer.
> > >
> > > Finally, mailbox state machine is stalled. Second thread thinks that
> > > timer is active while it's not.
> > >
> > ... you mean race :)  and we have locks for that. You want me to send
> > in a patch?
> 
> We don't have separate lock for timer.
> 
> > > One of the main questions is that there is only one timer per few channels
> > > in current code.
> > >
> > I see that as a good thing because
> > a) Polling anyway doesn't provide 'hard' guarantee even if we have one
> > timer per channel
> > b) The poll period remains same for every channel, so functionality
> > wise you only increase timer callbacks.
> 
> Do you mean something like this below?
> 
> The patch isn't r

Re: [PATCH RFC] mailbox: move controller timer to per-channel timers

2017-05-25 Thread Alexey Klimov

Hi Jassi,

sorry for delay again.

On Tue, Apr 11, 2017 at 06:30:08PM +0530, Jassi Brar wrote:
> On 11 April 2017 at 18:04, Alexey Klimov <alexey.kli...@arm.com> wrote:
> > On Fri, Apr 07, 2017 at 08:39:35PM +0530, Jassi Brar wrote:
> >> On Thu, Apr 6, 2017 at 11:01 PM, Alexey Klimov <alexey.kli...@arm.com> 
> >> wrote:
> >> > When mailbox controller provides two or more channels and
> >> > they are actively used by mailbox client(s) it's very easy
> >> > to trigger the warning in hrtimer_forward():
> >> >
> >> > [  247.853060] WARNING: CPU: 6 PID: 0 at kernel/time/hrtimer.c:805 
> >> > hrtimer_forward+0x88/0xd8
> >> > [  247.853549] Modules linked in:
> >> > [  247.853907] CPU: 6 PID: 0 Comm: swapper/6 Tainted: GW   
> >> > 4.11.0-rc2-00362-g93afaa4513bb-dirty #13
> >> > [  247.854472] Hardware name: linux,dummy-virt (DT)
> >> > [  247.854699] task: 80001d89d780 task.stack: 80001d8c4000
> >> > [  247.854999] PC is at hrtimer_forward+0x88/0xd8
> >> > [  247.855280] LR is at txdone_hrtimer+0xd4/0xf8
> >> > [  247.81] pc : [] lr : [] 
> >> > pstate: 21c5
> >> > [  247.855857] sp : 80001efbdeb0
> >> > [  247.856072] x29: 80001efbdeb0 x28: 80001efc3140
> >> > [  247.856358] x27: 0881b7a0 x26: 0039ac93e8b6
> >> > [  247.856604] x25: 08e756be x24: 80001c4a1348
> >> > [  247.856882] x23: 0001 x22: 00f8
> >> > [  247.857189] x21: 80001c4a1318 x20: 80001d327110
> >> > [  247.857509] x19: 000f4240 x18: 0030
> >> > [  247.857808] x17: aecdf370 x16: 081ccc80
> >> > [  247.858000] x15: 0010 x14: fff0
> >> > [  247.858186] x13: 08f488e0 x12: 0002e3eb
> >> > [  247.858381] x11: 08979690 x10: 
> >> > [  247.858573] x9 : 0001 x8 : 80001efc66e0
> >> > [  247.858758] x7 : 80001efc6708 x6 : 0005be7732f2
> >> > [  247.858943] x5 : 0001 x4 : 80001c4a1348
> >> > [  247.859130] x3 : 0039ac94952a x2 : 000f4240
> >> > [  247.859315] x1 : 0039ac98243c x0 : 00038f12
> >> > [  247.859582] ---[ end trace d61812426ec3c30b ]---
> >> >
> >> > To fix this current patch migrates hr timers to be per-channel
> >> > instead of using only one timer per-controller.
> >> >
> >> I think we can do by just checking if hrtimer_active() returns false
> >> before we do hrtimer_start() in msg_submit() ?
> >
> > It looks like it can be easily broken:
> >
> > 1) let's say first thread executes timer callback and already checked
> > last_tx_done on channel 0;
> > 2) second thread submits a message to the controller, say, on channel 0
> > and with help of hrtimer_active() observes that the timer is active
> > (because timer callback is running) and decides not to (re-)start timer;
> >
> > After this first thread decides not to restart the timer and finishes
> > callback. The thing that first thread executes tx_tick isn't helpful: for
> > example first thread may have no messages to submit on any channel and
> > therefore is not going to deal with timer.
> >
> > Finally, mailbox state machine is stalled. Second thread thinks that timer
> > is active while it's not.
> >
> ... you mean race :)  and we have locks for that. You want me to send
> in a patch?

We don't have a separate lock for the timer.
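
For reference, the interleaving quoted above can be replayed deterministically
in user space. This is a single-threaded sketch under simplifying assumptions,
not the mailbox code: the model_* types and functions are hypothetical, and
the only property borrowed from the discussion is that a timer still looks
"active" while its callback is running.

```c
#include <assert.h>
#include <stdbool.h>

struct model_timer {
	bool queued;		/* armed and waiting to expire */
	bool callback_running;	/* expiry handler currently executing */
};

struct model_chan {
	bool active_req;	/* a message is in flight and needs polling */
};

/* A timer counts as active while queued or while its callback runs. */
bool model_timer_active(const struct model_timer *t)
{
	return t->queued || t->callback_running;
}

/* Step 1: the timer expires; the callback scans the channel for work. */
bool model_callback_scan(struct model_timer *t, const struct model_chan *c)
{
	t->queued = false;
	t->callback_running = true;
	return c->active_req;	/* restart decision, taken from this snapshot */
}

/* Step 2: the submitter arms the timer only if it does not look active. */
void model_submit(struct model_timer *t, struct model_chan *c)
{
	c->active_req = true;
	if (!model_timer_active(t))
		t->queued = true;
}

/* Step 3: the callback returns and re-arms only if its scan asked for it. */
void model_callback_finish(struct model_timer *t, bool restart)
{
	t->callback_running = false;
	t->queued = restart;
}

/* Replays steps 1-3; returns true if a message is left with no timer. */
bool model_replay_stalls(void)
{
	struct model_timer t = { .queued = true, .callback_running = false };
	struct model_chan c = { .active_req = false };
	bool restart;

	restart = model_callback_scan(&t, &c);
	model_submit(&t, &c);
	model_callback_finish(&t, restart);

	return c.active_req && !model_timer_active(&t);
}
```

In this model the submitted message ends up pending with no timer armed, which
is the stall described in the quoted steps.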
 
> > One of the main questions is that there is only one timer per few channels
> > in current code.
> >
> I see that as a good thing because
> a) Polling anyway doesn't provide 'hard' guarantee even if we have one
> timer per channel
> b) The poll period remains same for every channel, so functionality
> wise you only increase timer callbacks.

Do you mean something like this below?

The patch hasn't really been tested in a multi-channel environment yet, but
I will test it. I just want to know if I am on the right track here.

I know there are some adjustments that can be done in the loop in the hrtimer
callback. What I don't like here is the number of spin_lock/unlock calls
in the timer callback.

Thanks,
Alexey


From 2a7653a27be60d3e81719b16382d13963ab828e0 Mon Sep 17 00:00:00 2001
From: Alexey Klimov <alexey.kli...@arm.com>
Date: Thu, 25 May 2017 18:30:13 +0100
Subject: [PATCH RFC] mailbox: fix hrtimer_forward() warning

Re: [PATCH RFC] mailbox: move controller timer to per-channel timers

2017-05-25 Thread Alexey Klimov

Hi Jassi,

sorry for delay again.

On Tue, Apr 11, 2017 at 06:30:08PM +0530, Jassi Brar wrote:
> On 11 April 2017 at 18:04, Alexey Klimov  wrote:
> > On Fri, Apr 07, 2017 at 08:39:35PM +0530, Jassi Brar wrote:
> >> On Thu, Apr 6, 2017 at 11:01 PM, Alexey Klimov  
> >> wrote:
> >> > When mailbox controller provides two or more channels and
> >> > they are actively used by mailbox client(s) it's very easy
> >> > to trigger the warning in hrtimer_forward():
> >> >
> >> > [  247.853060] WARNING: CPU: 6 PID: 0 at kernel/time/hrtimer.c:805 
> >> > hrtimer_forward+0x88/0xd8
> >> > [  247.853549] Modules linked in:
> >> > [  247.853907] CPU: 6 PID: 0 Comm: swapper/6 Tainted: GW   
> >> > 4.11.0-rc2-00362-g93afaa4513bb-dirty #13
> >> > [  247.854472] Hardware name: linux,dummy-virt (DT)
> >> > [  247.854699] task: 80001d89d780 task.stack: 80001d8c4000
> >> > [  247.854999] PC is at hrtimer_forward+0x88/0xd8
> >> > [  247.855280] LR is at txdone_hrtimer+0xd4/0xf8
> >> > [  247.81] pc : [] lr : [] 
> >> > pstate: 21c5
> >> > [  247.855857] sp : 80001efbdeb0
> >> > [  247.856072] x29: 80001efbdeb0 x28: 80001efc3140
> >> > [  247.856358] x27: 0881b7a0 x26: 0039ac93e8b6
> >> > [  247.856604] x25: 08e756be x24: 80001c4a1348
> >> > [  247.856882] x23: 0001 x22: 00f8
> >> > [  247.857189] x21: 80001c4a1318 x20: 80001d327110
> >> > [  247.857509] x19: 000f4240 x18: 0030
> >> > [  247.857808] x17: aecdf370 x16: 081ccc80
> >> > [  247.858000] x15: 0010 x14: fff0
> >> > [  247.858186] x13: 08f488e0 x12: 0002e3eb
> >> > [  247.858381] x11: 08979690 x10: 
> >> > [  247.858573] x9 : 0001 x8 : 80001efc66e0
> >> > [  247.858758] x7 : 80001efc6708 x6 : 0005be7732f2
> >> > [  247.858943] x5 : 0001 x4 : 80001c4a1348
> >> > [  247.859130] x3 : 0039ac94952a x2 : 000f4240
> >> > [  247.859315] x1 : 0039ac98243c x0 : 00038f12
> >> > [  247.859582] ---[ end trace d61812426ec3c30b ]---
> >> >
> >> > To fix this current patch migrates hr timers to be per-channel
> >> > instead of using only one timer per-controller.
> >> >
> >> I think we can do by just checking if hrtimer_active() returns false
> >> before we do hrtimer_start() in msg_submit() ?
> >
> > It looks like it can be easily broken:
> >
> > 1) let's say first thread executes timer callback and already checked 
> > last_tx_done
> > on channel 0;
> > 2) second thread submits a message to the controller, say, on channel 0 and 
> > with
> > help of hrtimer_active() observes that the timer is active (because timer 
> > callback
> > is running) and decides not to (re-)start timer;
> >
> > After this first thread decides not to restart the timer and finishes 
> > callback.
> > The thing that first thread executes tx_tick isn't helpful: for example 
> > first
> > thread may have no messages to submit on any channel and therefore is not 
> > going
> > to deal with timer.
> >
> > Finally, mailbox state machine is stalled. Second thread thinks that timer 
> > is
> > active while it's not.
> >
> ... you mean race :)  and we have locks for that. You want me to send
> in a patch?

We don't have separate lock for timer.
 
> > One of the main questions is that there is only one timer per few channels
> > in current code.
> >
> I see that as a good thing because
> a) Polling anyway doesn't provide 'hard' guarantee even if we have one
> timer per channel
> b) The poll period remains same for every channel, so functionality
> wise you only increase timer callbacks.

Do you mean something like this below?

The patch isn't really tested in a multi-channel environment yet, but
I will test it. I just want to know if I am on the right track here.

I know there are some adjustments that can be made to the loop in the hrtimer
callback. The thing I don't like here is the number of spin_lock/unlock calls
in the timer callback.

Thanks,
Alexey


From 2a7653a27be60d3e81719b16382d13963ab828e0 Mon Sep 17 00:00:00 2001
From: Alexey Klimov 
Date: Thu, 25 May 2017 18:30:13 +0100
Subject: [PATCH RFC] mailbox: fix hrtimer_forward() warning

When mailbox controller provides two or more channels and
they are actively used

Re: [PATCH] mailbox: fix completion order for blocking requests

2017-05-25 Thread Alexey Klimov

Hi Jassi,

Sorry for delay -- this is not my main activity so please be patient.

On Sun, Apr 23, 2017 at 03:33:39PM +0530, Jassi Brar wrote:
> On Tue, Apr 11, 2017 at 6:15 PM, Alexey Klimov <alexey.kli...@arm.com> wrote:
> > On Thu, Apr 06, 2017 at 10:45:26PM +0530, Jassi Brar wrote:
> >> On 6 April 2017 at 22:28, Alexey Klimov <alexey.kli...@arm.com> wrote:
> >> > Hi Jassi/Sudeep,
> >> >
> >> > On Wed, Mar 29, 2017 at 07:01:09PM +0100, Sudeep Holla wrote:
> >> >>
> >> >>
> >> >> On 29/03/17 18:43, Jassi Brar wrote:
> >> ...
> >>
> >> >> > diff --git a/drivers/mailbox/mailbox.c b/drivers/mailbox/mailbox.c
> >> >> > index 9dfbf7e..e06c50c 100644
> >> >> > --- a/drivers/mailbox/mailbox.c
> >> >> > +++ b/drivers/mailbox/mailbox.c
> >> >> > @@ -41,6 +41,7 @@ static int add_to_rbuf(struct mbox_chan *chan, void *mssg)
> >> >> >
> >> >> >     idx = chan->msg_free;
> >> >> >     chan->msg_data[idx] = mssg;
> >> >> > +   init_completion(&chan->tx_cmpl[idx]);
> >> >>
> >> >> reinit would be better.
> >> >
> >> Of course.
> >>
> >> 
> >> > From: Alexey Klimov <alexey.kli...@arm.com>
> >> > Date: Thu, 6 Apr 2017 13:57:02 +0100
> >> > Subject: [RFC][PATCH] mailbox: per-channel arrays with msg data and completion structures
> >> >
> >> > When a mailbox client doesn't serialize sending of the message itself,
> >> > and asks mailbox framework to block on mbox_send_message(), one
> >> > completion structure per channel is not enough. Client can make a few
> >> > mbox_send_message() calls at the same time, and there is no guaranteed
> >> > order of going to sleep on completion.
> >> >
> >> > If mailbox controller acks a message transfer, then tx_tick() wakes up
> >> > the first thread that waits on completion.
> >> > If mailbox controller doesn't ack the transfer and timeout happens, then
> >> > tx_tick() calls complete, and the next caller trying to sleep on
> >> > completion wakes up immediately.
> >> >
> >> > This patch fixes this by changing completion structures to be inserted
> >> > into an array that contains a) pointer to data provided by client and
> >> > b) the completion structure. Thus active_req field tracks the index of
> >> > the current running request that was submitted to mailbox controller.
> >> >
> >> > Signed-off-by: Alexey Klimov <alexey.kli...@arm.com>
> >> > ---
> >> >  drivers/mailbox/mailbox.c          | 40 +++---
> >> >  drivers/mailbox/pcc.c              | 10 +++---
> >> >  include/linux/mailbox_controller.h | 24 +--
> >> ...
> >> >  3 files changed, 49 insertions(+), 25 deletions(-)
> >> >
> >>  Versus   4 files changed, 17 insertions(+), 8 deletions(-)
> >>
> >> I think we should just keep it simpler if it works just as fine.
> >
> > Along with this patch you still need at least one patch from Sudeep with
> > subject:
> > "[PATCH 1/3] mailbox: always wait in mbox_send_message for blocking Tx mode"
> >
> Yes. Just so we are on same page, can you please redo your tests and
> see if this and Sudeep's patch-1/3 does the trick?

Yes, to make this kind of behaviour work I need to apply both Sudeep's patches
and your (or my) patch that changes the completion behaviour.
Or am I missing something and you have already applied it?

Thanks!

Alexey.

Re: [PATCH 1/3] mailbox: always wait in mbox_send_message for blocking Tx mode

2017-04-11 Thread Alexey Klimov

On Tue, Mar 21, 2017 at 11:30:14AM +0000, Sudeep Holla wrote:
> There exists a race when msg_submit() returns immediately because there was
> an active request being processed, which may have completed just before
> active_req is checked again in mbox_send_message(). This results in a return
> to the caller without waiting in mbox_send_message() even when Tx is blocking.
> 
> This patch fixes the issue by waiting for the completion always if Tx
> is in blocking mode.
> 
> Fixes: 2b6d83e2b8b7 ("mailbox: Introduce framework for mailbox")
> Cc: Jassi Brar <jassisinghb...@gmail.com>
> Reported-by: Alexey Klimov <alexey.kli...@arm.com>
> Signed-off-by: Sudeep Holla <sudeep.ho...@arm.com>


Reviewed-by: Alexey Klimov <alexey.kli...@arm.com>



> ---
>  drivers/mailbox/mailbox.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> Hi Jassi,
> 
> Here are fixes for few issues we encountered when dealing with multiple
> requests on multiple channels simultaneously.
> 
> Regards,
> Sudeep
> 
> diff --git a/drivers/mailbox/mailbox.c b/drivers/mailbox/mailbox.c
> index 4671f8a12872..160d6640425a 100644
> --- a/drivers/mailbox/mailbox.c
> +++ b/drivers/mailbox/mailbox.c
> @@ -260,7 +260,7 @@ int mbox_send_message(struct mbox_chan *chan, void *mssg)
> 
>   msg_submit(chan);
> 
> - if (chan->cl->tx_block && chan->active_req) {
> + if (chan->cl->tx_block) {
>   unsigned long wait;
>   int ret;
> 
> --
> 2.7.4
>
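
For readers following the thread, here is a minimal userspace sketch of the
race that the one-line change above closes. It is not kernel code:
struct chan_model, should_wait_old(), should_wait_new() and replay_race() are
hypothetical names used only for illustration, and the model assumes that
tx_tick() clears active_req before the next queued message is submitted.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified stand-in for struct mbox_chan. */
struct chan_model {
    bool tx_block;     /* client asked for blocking Tx */
    void *active_req;  /* request currently owned by the controller */
};

/* Old condition: wait only if some request still looks active. */
bool should_wait_old(const struct chan_model *c)
{
    return c->tx_block && c->active_req != NULL;
}

/* Condition after the patch: always wait in blocking mode. */
bool should_wait_new(const struct chan_model *c)
{
    return c->tx_block;
}

/*
 * Replays one interleaving and returns whether the sender of M2 would wait.
 * M1 is active when M2 is queued, so the sender's msg_submit() returns at
 * once. tx_tick() for M1 then clears active_req, and the sender evaluates
 * its wait condition before M2 has been submitted.
 */
bool replay_race(bool (*should_wait)(const struct chan_model *))
{
    static char m1;  /* dummy message already in flight */
    struct chan_model c = { .tx_block = true, .active_req = &m1 };

    c.active_req = NULL;    /* tx_tick() for M1, before the next submit */
    return should_wait(&c); /* sender's decision for M2 */
}
```

In this model replay_race(should_wait_old) is false, so the blocking sender
returns without waiting for M2, while replay_race(should_wait_new) is true.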

Re: [PATCH] mailbox: fix completion order for blocking requests

2017-04-11 Thread Alexey Klimov

On Thu, Apr 06, 2017 at 10:45:26PM +0530, Jassi Brar wrote:
> On 6 April 2017 at 22:28, Alexey Klimov <alexey.kli...@arm.com> wrote:
> > Hi Jassi/Sudeep,
> >
> > On Wed, Mar 29, 2017 at 07:01:09PM +0100, Sudeep Holla wrote:
> >>
> >>
> >> On 29/03/17 18:43, Jassi Brar wrote:
> ...
> 
> >> > diff --git a/drivers/mailbox/mailbox.c b/drivers/mailbox/mailbox.c
> >> > index 9dfbf7e..e06c50c 100644
> >> > --- a/drivers/mailbox/mailbox.c
> >> > +++ b/drivers/mailbox/mailbox.c
> >> > @@ -41,6 +41,7 @@ static int add_to_rbuf(struct mbox_chan *chan, void *mssg)
> >> >
> >> >     idx = chan->msg_free;
> >> >     chan->msg_data[idx] = mssg;
> >> > +   init_completion(&chan->tx_cmpl[idx]);
> >>
> >> reinit would be better.
> >
> Of course.
> 
> 
> > From: Alexey Klimov <alexey.kli...@arm.com>
> > Date: Thu, 6 Apr 2017 13:57:02 +0100
> > Subject: [RFC][PATCH] mailbox: per-channel arrays with msg data and completion structures
> >
> > When a mailbox client doesn't serialize sending of the message itself,
> > and asks mailbox framework to block on mbox_send_message(), one
> > completion structure per channel is not enough. Client can make a few
> > mbox_send_message() calls at the same time, and there is no guaranteed
> > order of going to sleep on completion.
> >
> > If mailbox controller acks a message transfer, then tx_tick() wakes up
> > the first thread that waits on completion.
> > If mailbox controller doesn't ack the transfer and timeout happens, then
> > tx_tick() calls complete, and the next caller trying to sleep on
> > completion wakes up immediately.
> >
> > This patch fixes this by changing completion structures to be inserted
> > into an array that contains a) pointer to data provided by client and
> > b) the completion structure. Thus active_req field tracks the index of
> > the current running request that was submitted to mailbox controller.
> >
> > Signed-off-by: Alexey Klimov <alexey.kli...@arm.com>
> > ---
> >  drivers/mailbox/mailbox.c          | 40 +++---
> >  drivers/mailbox/pcc.c              | 10 +++---
> >  include/linux/mailbox_controller.h | 24 +--
> ...
> >  3 files changed, 49 insertions(+), 25 deletions(-)
> >
>  Versus   4 files changed, 17 insertions(+), 8 deletions(-)
> 
> I think we should just keep it simpler if it works just as fine.

Along with this patch you still need at least one patch from Sudeep with
subject:
"[PATCH 1/3] mailbox: always wait in mbox_send_message for blocking Tx mode"

The decision whether to block shouldn't depend on a racy read of the
active_req field.

Best regards,
Alexey Klimov.

Re: [PATCH RFC] mailbox: move controller timer to per-channel timers

2017-04-11 Thread Alexey Klimov

On Fri, Apr 07, 2017 at 08:39:35PM +0530, Jassi Brar wrote:
> On Thu, Apr 6, 2017 at 11:01 PM, Alexey Klimov <alexey.kli...@arm.com> wrote:
> > When mailbox controller provides two or more channels and
> > they are actively used by mailbox client(s) it's very easy
> > to trigger the warning in hrtimer_forward():
> >
> > [  247.853060] WARNING: CPU: 6 PID: 0 at kernel/time/hrtimer.c:805 hrtimer_forward+0x88/0xd8
> > [  247.853549] Modules linked in:
> > [  247.853907] CPU: 6 PID: 0 Comm: swapper/6 Tainted: GW   4.11.0-rc2-00362-g93afaa4513bb-dirty #13
> > [  247.854472] Hardware name: linux,dummy-virt (DT)
> > [  247.854699] task: 80001d89d780 task.stack: 80001d8c4000
> > [  247.854999] PC is at hrtimer_forward+0x88/0xd8
> > [  247.855280] LR is at txdone_hrtimer+0xd4/0xf8
> > [  247.81] pc : [] lr : [] pstate: 21c5
> > [  247.855857] sp : 80001efbdeb0
> > [  247.856072] x29: 80001efbdeb0 x28: 80001efc3140
> > [  247.856358] x27: 0881b7a0 x26: 0039ac93e8b6
> > [  247.856604] x25: 08e756be x24: 80001c4a1348
> > [  247.856882] x23: 0001 x22: 00f8
> > [  247.857189] x21: 80001c4a1318 x20: 80001d327110
> > [  247.857509] x19: 000f4240 x18: 0030
> > [  247.857808] x17: aecdf370 x16: 081ccc80
> > [  247.858000] x15: 0010 x14: fff0
> > [  247.858186] x13: 08f488e0 x12: 0002e3eb
> > [  247.858381] x11: 08979690 x10: 
> > [  247.858573] x9 : 0001 x8 : 80001efc66e0
> > [  247.858758] x7 : 80001efc6708 x6 : 0005be7732f2
> > [  247.858943] x5 : 0001 x4 : 80001c4a1348
> > [  247.859130] x3 : 0039ac94952a x2 : 000f4240
> > [  247.859315] x1 : 0039ac98243c x0 : 00038f12
> > [  247.859582] ---[ end trace d61812426ec3c30b ]---
> >
> > To fix this current patch migrates hr timers to be per-channel
> > instead of using only one timer per-controller.
> >
> I think we can do by just checking if hrtimer_active() returns false
> before we do hrtimer_start() in msg_submit() ?

It looks like it can be easily broken:

1) let's say first thread executes timer callback and already checked
last_tx_done on channel 0;
2) second thread submits a message to the controller, say, on channel 0 and
with help of hrtimer_active() observes that the timer is active (because
timer callback is running) and decides not to (re-)start timer;

After this first thread decides not to restart the timer and finishes callback.
The thing that first thread executes tx_tick isn't helpful: for example first
thread may have no messages to submit on any channel and therefore is not going
to deal with timer.

Finally, mailbox state machine is stalled. Second thread thinks that timer is
active while it's not.
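
The interleaving above can be replayed in a small userspace model. This is
only a sketch under stated assumptions, not the kernel's hrtimer code:
struct timer_model, struct chan_state, timer_active(), submit() and
replay_stall() are hypothetical names, and timer_active() models just the
documented property that hrtimer_active() also reports true while the
callback is running.

```c
#include <stdbool.h>

/* Hypothetical model of the single shared poll timer. */
struct timer_model {
    bool enqueued;          /* timer is armed */
    bool callback_running;  /* callback is executing right now */
};

/* Hypothetical per-channel state. */
struct chan_state {
    bool tx_pending;  /* message handed to the controller, not yet acked */
};

/* Models hrtimer_active(): true while armed or while the callback runs. */
bool timer_active(const struct timer_model *t)
{
    return t->enqueued || t->callback_running;
}

/* Proposed submit path: start the timer only if it is not active. */
void submit(struct timer_model *t, struct chan_state *ch)
{
    ch->tx_pending = true;
    if (!timer_active(t))
        t->enqueued = true;
}

/* Replays steps 1) and 2); returns true if channel 0 is left stalled. */
bool replay_stall(void)
{
    struct timer_model t = { .enqueued = false, .callback_running = true };
    struct chan_state ch0 = { .tx_pending = false };
    bool resched;

    /* 1) the callback has already polled channel 0: nothing pending */
    resched = ch0.tx_pending;

    /* 2) second thread submits on channel 0 while the callback runs */
    submit(&t, &ch0);

    /* the callback finishes and does not restart the timer */
    t.callback_running = false;
    t.enqueued = resched;

    /* stalled: a message is pending but nothing will poll for it */
    return ch0.tx_pending && !timer_active(&t);
}
```

In this model replay_stall() returns true: the message on channel 0 stays
pending and the timer is neither armed nor running.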

One of the main issues is that the current code has only one timer shared
by several channels.

Thanks,
Alexey.

[PATCH RFC] mailbox: move controller timer to per-channel timers

2017-04-06 Thread Alexey Klimov

When mailbox controller provides two or more channels and
they are actively used by mailbox client(s) it's very easy
to trigger the warning in hrtimer_forward():

[  247.853060] WARNING: CPU: 6 PID: 0 at kernel/time/hrtimer.c:805 hrtimer_forward+0x88/0xd8
[  247.853549] Modules linked in:
[  247.853907] CPU: 6 PID: 0 Comm: swapper/6 Tainted: GW   4.11.0-rc2-00362-g93afaa4513bb-dirty #13
[  247.854472] Hardware name: linux,dummy-virt (DT)
[  247.854699] task: 80001d89d780 task.stack: 80001d8c4000
[  247.854999] PC is at hrtimer_forward+0x88/0xd8
[  247.855280] LR is at txdone_hrtimer+0xd4/0xf8
[  247.81] pc : [] lr : [] pstate: 21c5
[  247.855857] sp : 80001efbdeb0
[  247.856072] x29: 80001efbdeb0 x28: 80001efc3140
[  247.856358] x27: 0881b7a0 x26: 0039ac93e8b6
[  247.856604] x25: 08e756be x24: 80001c4a1348
[  247.856882] x23: 0001 x22: 00f8
[  247.857189] x21: 80001c4a1318 x20: 80001d327110
[  247.857509] x19: 000f4240 x18: 0030
[  247.857808] x17: aecdf370 x16: 081ccc80
[  247.858000] x15: 0010 x14: fff0
[  247.858186] x13: 08f488e0 x12: 0002e3eb
[  247.858381] x11: 08979690 x10: 
[  247.858573] x9 : 0001 x8 : 80001efc66e0
[  247.858758] x7 : 80001efc6708 x6 : 0005be7732f2
[  247.858943] x5 : 0001 x4 : 80001c4a1348
[  247.859130] x3 : 0039ac94952a x2 : 000f4240
[  247.859315] x1 : 0039ac98243c x0 : 00038f12
[  247.859582] ---[ end trace d61812426ec3c30b ]---

To fix this, the patch moves the hrtimers to be per-channel
instead of using a single timer per controller.

The racy read of chan->active_req is removed from the timer callback
since it is not done under the spinlock, and the timer-based polling
logic shouldn't rely on it. The timer is started on the channel when a
new message is submitted to the controller, and it keeps rescheduling
itself until ->last_tx_done() reports that the controller acked the
message. After the acknowledgement, the timer callback triggers the
mailbox tx state machine.

Signed-off-by: Alexey Klimov <alexey.kli...@arm.com>
---

Hi Jassi and Sudeep,

could you please take a look at this?

The only thing I don't know how to fix here is this: if the controller
reports timer-based polling and the client supports acknowledgement of
message transfer, it looks theoretically possible to call tx_tick() from
two places, the timer callback and the client. This is set up in
mbox_request_channel() by these lines:

    if (chan->txdone_method == TXDONE_BY_POLL && cl->knows_txdone)
        chan->txdone_method |= TXDONE_BY_ACK;
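
A small userspace sketch of that flag combination follows. The bit values
are assumed to mirror drivers/mailbox/mailbox.h and should be checked against
the tree in use; effective_txdone() and tx_tick_callers() are hypothetical
helpers written only for this illustration.

```c
#include <stdbool.h>

/* Assumed to mirror drivers/mailbox/mailbox.h; verify against your tree. */
#define TXDONE_BY_IRQ   (1u << 0)
#define TXDONE_BY_POLL  (1u << 1)
#define TXDONE_BY_ACK   (1u << 2)

/* Models the two lines from mbox_request_channel() quoted above. */
unsigned int effective_txdone(unsigned int controller_method,
                              bool knows_txdone)
{
    unsigned int method = controller_method;

    if (method == TXDONE_BY_POLL && knows_txdone)
        method |= TXDONE_BY_ACK;
    return method;
}

/* Counts the distinct contexts that may end up calling tx_tick(). */
int tx_tick_callers(unsigned int method)
{
    int n = 0;

    if (method & TXDONE_BY_IRQ)
        n++;  /* controller driver, via mbox_chan_txdone() */
    if (method & TXDONE_BY_POLL)
        n++;  /* poll timer callback */
    if (method & TXDONE_BY_ACK)
        n++;  /* client, via mbox_client_txdone() */
    return n;
}
```

With a polling controller and a client that sets knows_txdone,
tx_tick_callers(effective_txdone(TXDONE_BY_POLL, true)) is 2, which is the
double entry point described above.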


Thanks,
Alexey



 drivers/mailbox/mailbox.c  | 45 ++
 include/linux/mailbox_controller.h |  2 +-
 2 files changed, 22 insertions(+), 25 deletions(-)

diff --git a/drivers/mailbox/mailbox.c b/drivers/mailbox/mailbox.c
index 4671f8a12872..124d2a64de83 100644
--- a/drivers/mailbox/mailbox.c
+++ b/drivers/mailbox/mailbox.c
@@ -87,7 +87,7 @@ static void msg_submit(struct mbox_chan *chan)
 
    if (!err && (chan->txdone_method & TXDONE_BY_POLL))
        /* kick start the timer immediately to avoid delays */
-       hrtimer_start(&chan->mbox->poll_hrt, 0, HRTIMER_MODE_REL);
+       hrtimer_start(&chan->poll_hrt, 0, HRTIMER_MODE_REL);
 }
 
 static void tx_tick(struct mbox_chan *chan, int r)
@@ -113,25 +113,21 @@ static void tx_tick(struct mbox_chan *chan, int r)
 
 static enum hrtimer_restart txdone_hrtimer(struct hrtimer *hrtimer)
 {
-   struct mbox_controller *mbox =
-       container_of(hrtimer, struct mbox_controller, poll_hrt);
+   struct mbox_chan *chan =
+       container_of(hrtimer, struct mbox_chan, poll_hrt);
    bool txdone, resched = false;
-   int i;
-
-   for (i = 0; i < mbox->num_chans; i++) {
-       struct mbox_chan *chan = &mbox->chans[i];
 
-       if (chan->active_req && chan->cl) {
-           txdone = chan->mbox->ops->last_tx_done(chan);
-           if (txdone)
-               tx_tick(chan, 0);
-           else
-               resched = true;
-       }
+   if (chan->cl) {
+       txdone = chan->mbox->ops->last_tx_done(chan);
+       if (txdone)
+           tx_tick(chan, 0);
+       else
+           resched = true;
    }
 
    if (resched) {
-       hrtimer_forward_now(hrtimer, ms_to_ktime(mbox->txpoll_period));
+       hrtimer_forward_now(hrtimer,
+                   ms_to_ktime(chan->mbox->txpoll_period));
        return HRTIMER_RESTART;
    }
    return HRTIMER_NORESTART;

Re: [PATCH] mailbox: fix completion order for blocking requests

2017-04-06 Thread Alexey Klimov

Hi Jassi/Sudeep,

On Wed, Mar 29, 2017 at 07:01:09PM +0100, Sudeep Holla wrote:
> 
> 
> On 29/03/17 18:43, Jassi Brar wrote:
> > Currently two threads, wait on blocking requests, could wake up for
> > completion of request of each other as ...
> > 
> > Thread#1(T1)                   Thread#2(T2)
> >  mbox_send_message              mbox_send_message
> >         |                              |
> >         V                              |
> >  add_to_rbuf(M1)                       V
> >         |                       add_to_rbuf(M2)
> >         |                              |
> >         |                              V
> >         V                       msg_submit(picks M1)
> >  msg_submit                            |
> >         |                              V
> >         V                       wait_for_completion(on M2)
> >  wait_for_completion(on M1)            |  (1st in waitQ)
> >         |   (2nd in waitQ)             V
> >         V                       wake_up(on completion of M1)<--incorrect
> > 
> >  Fix this situation by assigning completion structures to each queued
> > request, so that the threads could wait on their own completions.
> > 
> 
> Alexey came up with exact similar solution. I didn't like:

Sorry for the delay.

Let me attach it, just in case. It's inserted at the end of the email at [1].
It may have some issues with the naming of the structure, plus the thing that
Sudeep pointed out.
Using -1 for the active request field doesn't look good either.
 
> 1. the static array just bloats the structure with equal no. of
>completion which may be useless for !TXDONE_BY_POLL
> 
> 2. We have client drivers already doing something similar. I wanted
>    to fix/move those along with this fix. Or at least see the feasibility
> 
> > Reported-by: Alexey Klimov <alexey.kli...@arm.com>
> > Signed-off-by: Jassi Brar <jaswinder.si...@linaro.org>
> > ---
> >  drivers/mailbox/mailbox.c  | 15 +++
> >  drivers/mailbox/omap-mailbox.c |  2 +-
> >  drivers/mailbox/pcc.c  |  2 +-
> >  include/linux/mailbox_controller.h |  6 --
> >  4 files changed, 17 insertions(+), 8 deletions(-)
> > 
> > diff --git a/drivers/mailbox/mailbox.c b/drivers/mailbox/mailbox.c
> > index 9dfbf7e..e06c50c 100644
> > --- a/drivers/mailbox/mailbox.c
> > +++ b/drivers/mailbox/mailbox.c
> > @@ -41,6 +41,7 @@ static int add_to_rbuf(struct mbox_chan *chan, void *mssg)
> >  
> >     idx = chan->msg_free;
> >     chan->msg_data[idx] = mssg;
> > +   init_completion(&chan->tx_cmpl[idx]);
> 
> reinit would be better.

Agreed.
Also, reinit_completion() can be moved into mbox_send_message(), under the
"if" that checks whether it's a blocking request.
 
> > chan->msg_count++;
> >  
> > if (idx == MBOX_TX_QUEUE_LEN - 1)
> > @@ -73,6 +74,7 @@ static void msg_submit(struct mbox_chan *chan)
> > idx += MBOX_TX_QUEUE_LEN - count;
> >  
> > data = chan->msg_data[idx];
> > +   chan->tx_complete = &chan->tx_cmpl[idx];
> >  
> > if (chan->cl->tx_prepare)
> > chan->cl->tx_prepare(chan->cl, data);
> > @@ -81,7 +83,8 @@ static void msg_submit(struct mbox_chan *chan)
> > if (!err) {
> > chan->active_req = data;
> > chan->msg_count--;
> > -   }
> > +   } else
> > +   chan->tx_complete = NULL;
> >  exit:
> > spin_unlock_irqrestore(&chan->lock, flags);
> >  
> > @@ -92,12 +95,15 @@ static void msg_submit(struct mbox_chan *chan)
> >  
> >  static void tx_tick(struct mbox_chan *chan, int r)
> >  {
> > +   struct completion *tx_complete;
> > unsigned long flags;
> > void *mssg;
> >  
> > spin_lock_irqsave(&chan->lock, flags);
> > mssg = chan->active_req;
> > +   tx_complete = chan->tx_complete;
> > chan->active_req = NULL;
> > +   chan->tx_complete = NULL;
> > spin_unlock_irqrestore(&chan->lock, flags);
> >  
> > /* Submit next message */
> > @@ -111,7 +117,7 @@ static void tx_tick(struct mbox_chan *chan, int r)
> > chan->cl->tx_done(chan->cl, mssg, r);
> >  
> > if (r != -ETIME && chan->cl->tx_block)
> > -   complete(&chan->tx_complete);
> > +   complete(tx_complete);
> >  }
> >  
> >  static enum hrtimer_restart txdone_hrtimer(struct hrtimer *hrtimer)
> > @@ -272,7 +278,7 @@ int mbox_send_message(struct mbox_chan *chan, void 
> > *mssg

Re: [PATCH 2/2] ACPI / CPPC: Make cppc acpi driver aware of pcc subspace ids

2017-04-04 Thread Alexey Klimov

On Tue, 4 Apr 2017 16:21:20 +0530 George Cherian <gcher...@caviumnetworks.com> 
wrote:

>
> Hi Alexey,
>
> On 04/03/2017 11:07 PM, Alexey Klimov wrote:
> > (adding Prashanth to c/c)
> >
> > Hi George,
> >
> > On Fri, Mar 31, 2017 at 06:24:02AM +, George Cherian wrote:
> >> Based on Section 14.1 of ACPI specification, it is possible to
> >> have a maximum of 256 PCC subspace ids. Add support of multiple
> >> PCC subspace id instead of using a single global pcc_data
> >> structure.
> >>
> >> While at that fix the time_delta check in send_pcc_cmd() so that
> >> last_mpar_reset and mpar_count is initialized properly. Also
> >> maintain a global total_mpar_count which is a sum of per subspace
> >> id mpar value.
> >
> > Could you please provide clarification on why sum of
> > total_mpar_count is required? Do you assume that there always will
> > be only one single firmware CPU that handles PCC commands on
> > another side?
>
> Yes you are right the total_mpar_count  should be removed and should
> be handled per subspace id. Moreover the current logic of not sending
> the command to PCC and returning with -EIO is also flawed. It should
> actually have a retry mechanism instead of returning -EIO even
> without submitting the request to the channel.

That sounds interesting.
How many times should the code try to resend before giving up (let's
say that timing constraints allow the caller to resend the command)?

Regarding error codes, the code can differentiate between a timeout, a
platform error, timing constraints (-EBUSY?), and maybe some other errors.
In some of these cases, since the mailbox framework can't resend PCC
commands by itself, it's the client's responsibility to re-queue a command.
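
A bounded client-side retry of the kind asked about here could look roughly
like the sketch below. It is a user-space model under stated assumptions:
send_with_retry(), send_fn, struct fake_channel and fake_send() are
hypothetical names, and treating -EBUSY as the only retryable error is just
one possible policy, not something the thread settled on.

```c
#include <errno.h>

/* Hypothetical callback type: returns 0 on success or a negative errno. */
typedef int (*send_fn)(void *ctx);

/*
 * Retry only while the channel reports a timing constraint (-EBUSY here).
 * Any other result (success, timeout, platform error) is returned at once.
 */
static int send_with_retry(send_fn send, void *ctx, int max_tries)
{
	int ret = -EINVAL;	/* returned if max_tries <= 0 */

	for (int attempt = 0; attempt < max_tries; attempt++) {
		ret = send(ctx);
		if (ret != -EBUSY)
			return ret;
	}
	return ret;
}

/* Fake channel for illustration: busy for the first busy_left calls. */
struct fake_channel {
	int busy_left;
	int calls;
};

static int fake_send(void *ctx)
{
	struct fake_channel *ch = ctx;

	ch->calls++;
	if (ch->busy_left > 0) {
		ch->busy_left--;
		return -EBUSY;
	}
	return 0;
}
```

The retry limit is passed in by the caller because the right number of
attempts depends on the timing constraints of the particular channel.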


> > Theoretically different PCC channels can be connected to different
> > platform CPUs on other end (imagine NUMA systems in case of CPPC)
> > so it's not clear why transport layer of PCC should use that global
> > count. Also, ACPI spec 6.1 (page 701) in the description of MPAR
> > states "The maximum number of periodic requests that the subspace
> > channel can support".
> >
> >
> >
> >> Signed-off-by: George Cherian <george.cher...@cavium.com>
> >> ---
> >>   drivers/acpi/cppc_acpi.c | 189 ++-
> >>   1 file changed, 105 insertions(+), 84 deletions(-)
> >>
> >> diff --git a/drivers/acpi/cppc_acpi.c b/drivers/acpi/cppc_acpi.c
> >> index 3ca0729..7ba05ac 100644
> >> --- a/drivers/acpi/cppc_acpi.c
> >> +++ b/drivers/acpi/cppc_acpi.c
> >> @@ -77,12 +77,16 @@ struct cppc_pcc_data {
> >>wait_queue_head_t pcc_write_wait_q;
> >>   };
> >>
> >> -/* Structure to represent the single PCC channel */
> >> -static struct cppc_pcc_data pcc_data = {
> >> -  .pcc_subspace_idx = -1,
> >> -  .platform_owns_pcc = true,
> >> -};
> >> +/* Array  to represent the PCC channel per subspace id */
> >> +static struct cppc_pcc_data pcc_data[MAX_PCC_SUBSPACES];
> >> +/*
> >> + * It is quiet possible that multiple CPU's can share
> >> + * same subspace ids. The cpu_pcc_subspace_idx maintains
> >> + * the cpu to pcc subspace id map.
> >> + */
> >> +static DEFINE_PER_CPU(int, cpu_pcc_subspace_idx);
> >>
> >> +static int total_mpar_count;
> >>   /*
> >>* The cpc_desc structure contains the ACPI register details
> >>* as described in the per CPU _CPC tables. The details
> >> @@ -93,7 +97,8 @@ static struct cppc_pcc_data pcc_data = {
> >>   static DEFINE_PER_CPU(struct cpc_desc *, cpc_desc_ptr);
> >>
> >>   /* pcc mapped address + header size + offset within PCC subspace */
> >> -#define GET_PCC_VADDR(offs) (pcc_data.pcc_comm_addr + 0x8 + (offs))
> >> +#define GET_PCC_VADDR(offs, pcc_ss_id) (pcc_data[pcc_ss_id].pcc_comm_addr + \
> >> +  0x8 + (offs))
> >>
> >>   /* Check if a CPC regsiter is in PCC */
> >>   #define CPC_IN_PCC(cpc) ((cpc)->type == ACPI_TYPE_BUFFER &&  \
> >> @@ -183,13 +188,17 @@ static struct kobj_type cppc_ktype = {
> >>    .default_attrs = cppc_attrs,
> >>   };
> >>
> >> -static int check_pcc_chan(bool chk_err_bit)
> >> +static int check_pcc_chan(int cpunum, bool chk_err_bit)
> >>   {
> >>int ret = -EIO, status = 0;
> >> -  struct acpi_pcct_shared_memory __iomem *generic_comm_base
> >> = pcc_d

Re: [PATCH 2/2] ACPI / CPPC: Make cppc acpi driver aware of pcc subspace ids

2017-04-03 Thread Alexey Klimov

(adding Prashanth to c/c)

Hi George,

On Fri, Mar 31, 2017 at 06:24:02AM +, George Cherian wrote:
> Based on Section 14.1 of ACPI specification, it is possible to have a
> maximum of 256 PCC subspace ids. Add support of multiple PCC subspace id
> instead of using a single global pcc_data structure.
> 
> While at that fix the time_delta check in send_pcc_cmd() so that
> last_mpar_reset and mpar_count is initialized properly. Also maintain a
> global total_mpar_count which is a sum of per subspace id mpar value.

Could you please clarify why the total_mpar_count sum is required? Do you
assume that there will always be only a single firmware CPU that handles PCC
commands on the other side?

Theoretically, different PCC channels can be connected to different platform
CPUs on the other end (imagine NUMA systems in the case of CPPC), so it's not
clear why the transport layer of PCC should use that global count. Also, ACPI
spec 6.1 (page 701), in the description of MPAR, states "The maximum number of
periodic requests that the subspace channel can support".
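
To illustrate the per-subspace accounting argued for here, the following is a
minimal user-space sketch in which every subspace keeps its own request budget
and no global sum exists. struct ss_budget, ss_take() and WINDOW_MS are
hypothetical names; the 60-second window and the -EBUSY return value are
assumptions made for the example, not taken from the patch.

```c
#include <errno.h>
#include <stdint.h>

#define WINDOW_MS 60000u /* assumed accounting window: one minute */

/* Per-subspace budget; one instance per PCC subspace id. */
struct ss_budget {
	uint32_t mpar;		/* allowed requests per window, 0 = unlimited */
	uint32_t left;		/* requests still allowed in this window */
	uint64_t last_reset_ms;	/* when the current window started */
};

/* Take one request from this subspace's budget; 0 if allowed. */
static int ss_take(struct ss_budget *b, uint64_t now_ms)
{
	if (b->mpar == 0)
		return 0;

	if (b->left == 0) {
		/* Budget exhausted: refill only once the window has passed. */
		if (now_ms - b->last_reset_ms < WINDOW_MS)
			return -EBUSY;
		b->last_reset_ms = now_ms;
		b->left = b->mpar;
	}
	b->left--;
	return 0;
}
```

With this layout, exhausting the budget of one subspace has no effect on
requests sent through another one.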



> Signed-off-by: George Cherian <george.cher...@cavium.com>
> ---
>  drivers/acpi/cppc_acpi.c | 189 ++-
>  1 file changed, 105 insertions(+), 84 deletions(-)
> 
> diff --git a/drivers/acpi/cppc_acpi.c b/drivers/acpi/cppc_acpi.c
> index 3ca0729..7ba05ac 100644
> --- a/drivers/acpi/cppc_acpi.c
> +++ b/drivers/acpi/cppc_acpi.c
> @@ -77,12 +77,16 @@ struct cppc_pcc_data {
>   wait_queue_head_t pcc_write_wait_q;
>  };
>  
> -/* Structure to represent the single PCC channel */
> -static struct cppc_pcc_data pcc_data = {
> - .pcc_subspace_idx = -1,
> - .platform_owns_pcc = true,
> -};
> +/* Array  to represent the PCC channel per subspace id */
> +static struct cppc_pcc_data pcc_data[MAX_PCC_SUBSPACES];
> +/*
> + * It is quiet possible that multiple CPU's can share
> + * same subspace ids. The cpu_pcc_subspace_idx maintains
> + * the cpu to pcc subspace id map.
> + */
> +static DEFINE_PER_CPU(int, cpu_pcc_subspace_idx);
>  
> +static int total_mpar_count;
>  /*
>   * The cpc_desc structure contains the ACPI register details
>   * as described in the per CPU _CPC tables. The details
> @@ -93,7 +97,8 @@ static struct cppc_pcc_data pcc_data = {
>  static DEFINE_PER_CPU(struct cpc_desc *, cpc_desc_ptr);
>  
>  /* pcc mapped address + header size + offset within PCC subspace */
> -#define GET_PCC_VADDR(offs) (pcc_data.pcc_comm_addr + 0x8 + (offs))
> +#define GET_PCC_VADDR(offs, pcc_ss_id) (pcc_data[pcc_ss_id].pcc_comm_addr + \
> + 0x8 + (offs))
>  
>  /* Check if a CPC regsiter is in PCC */
>  #define CPC_IN_PCC(cpc) ((cpc)->type == ACPI_TYPE_BUFFER &&  \
> @@ -183,13 +188,17 @@ static struct kobj_type cppc_ktype = {
>   .default_attrs = cppc_attrs,
>  };
>  
> -static int check_pcc_chan(bool chk_err_bit)
> +static int check_pcc_chan(int cpunum, bool chk_err_bit)
>  {
>   int ret = -EIO, status = 0;
> - struct acpi_pcct_shared_memory __iomem *generic_comm_base = pcc_data.pcc_comm_addr;
> - ktime_t next_deadline = ktime_add(ktime_get(), pcc_data.deadline);
> -
> - if (!pcc_data.platform_owns_pcc)
> + int pcc_ss_id = per_cpu(cpu_pcc_subspace_idx, cpunum);
> + struct cppc_pcc_data *pcc_ss_data = &pcc_data[pcc_ss_id];
> + struct acpi_pcct_shared_memory __iomem *generic_comm_base =
> + pcc_ss_data->pcc_comm_addr;
> + ktime_t next_deadline = ktime_add(ktime_get(),
> +   pcc_ss_data->deadline);
> +
> + if (!pcc_ss_data->platform_owns_pcc)
>   return 0;
>  
>   /* Retry in case the remote processor was too slow to catch up. */
> @@ -214,7 +223,7 @@ static int check_pcc_chan(bool chk_err_bit)
>   }
>  
>   if (likely(!ret))
> - pcc_data.platform_owns_pcc = false;
> + pcc_ss_data->platform_owns_pcc = false;
>   else
>   pr_err("PCC check channel failed. Status=%x\n", status);
>  
> @@ -225,11 +234,13 @@ static int check_pcc_chan(bool chk_err_bit)
>   * This function transfers the ownership of the PCC to the platform
>   * So it must be called while holding write_lock(pcc_lock)
>   */
> -static int send_pcc_cmd(u16 cmd)
> +static int send_pcc_cmd(int cpunum, u16 cmd)


I don't like the direction this is going in.

To send commands through a PCC channel you don't need to know the CPU number.
Ideally, send_pcc_cmd() shouldn't care much about the software entity that uses
it (CPPC, RASF, MPST, etc.). By passing a CPU number to this function you
bind it to the CPPC interfaces, which it shouldn't depend on.
Maybe you can pass the subspace id here instead.

BTW, is it possible to make a separate mailbox PCC client and move it out of
the CPPC code?
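
The layering suggested here can be sketched as follows: the transport function
takes only a subspace id, and the CPPC-side caller resolves the CPU number
first. This is a user-space model, not the driver's code. MAX_PCC_SUBSPACES
comes from the quoted patch; everything else (struct pcc_ss_model,
send_pcc_cmd_model(), cppc_send_for_cpu(), the cpu_to_ss table and the error
codes chosen) is a hypothetical illustration.

```c
#include <errno.h>
#include <stdint.h>

#define MAX_PCC_SUBSPACES 256
#define NR_CPUS_MODEL 4 /* hypothetical CPU count for the example */

/* Minimal per-subspace state for the model. */
struct pcc_ss_model {
	int in_use;
	uint16_t last_cmd;
};

static struct pcc_ss_model pcc_ss[MAX_PCC_SUBSPACES];

/* CPPC-side map from CPU number to subspace id (made-up values). */
static const int cpu_to_ss[NR_CPUS_MODEL] = { 0, 0, 1, 1 };

/* Transport layer: knows about subspace ids only, never about CPUs. */
static int send_pcc_cmd_model(int pcc_ss_id, uint16_t cmd)
{
	if (pcc_ss_id < 0 || pcc_ss_id >= MAX_PCC_SUBSPACES)
		return -EINVAL;
	if (!pcc_ss[pcc_ss_id].in_use)
		return -ENODEV;

	pcc_ss[pcc_ss_id].last_cmd = cmd;
	return 0;
}

/* CPPC layer: translates the CPU number before calling the transport. */
static int cppc_send_for_cpu(int cpu, uint16_t cmd)
{
	if (cpu < 0 || cpu >= NR_CPUS_MODEL)
		return -EINVAL;
	return send_pcc_cmd_model(cpu_to_ss[cpu], cmd);
}
```

Another PCC user could then call the transport function directly with its own
subspace id, without going through any CPU mapping.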


[...]


Best regards,
Alexey Klimov.

[PATCH] firmware: arm_scpi: reinit completion instead of full init_completion()

2017-03-29 Thread Alexey Klimov

Instead of performing full initialization of the completion structure
on each transfer in scpi_send_message(), we initialize it at boot time
(more specifically, in the relevant probe() function) and use
reinit_completion() to reset ->done counter on each message transfer
thus saving a little bit of cpu time.

Signed-off-by: Alexey Klimov <alexey.kli...@arm.com>
---
 drivers/firmware/arm_scpi.c | 7 +--
 1 file changed, 5 insertions(+), 2 deletions(-)

diff --git a/drivers/firmware/arm_scpi.c b/drivers/firmware/arm_scpi.c
index 9ad0b19..f6cfc31 100644
--- a/drivers/firmware/arm_scpi.c
+++ b/drivers/firmware/arm_scpi.c
@@ -538,7 +538,7 @@ static int scpi_send_message(u8 idx, void *tx_buf, unsigned int tx_len,
msg->tx_len = tx_len;
msg->rx_buf = rx_buf;
msg->rx_len = rx_len;
-   init_completion(&msg->done);
+   reinit_completion(&msg->done);
 
ret = mbox_send_message(scpi_chan->chan, msg);
if (ret < 0 || !rx_buf)
@@ -872,8 +872,11 @@ static int scpi_alloc_xfer_list(struct device *dev, struct scpi_chan *ch)
return -ENOMEM;
 
ch->xfers = xfers;
-   for (i = 0; i < MAX_SCPI_XFERS; i++, xfers++)
+   for (i = 0; i < MAX_SCPI_XFERS; i++, xfers++) {
+   init_completion(&xfers->done);
list_add_tail(&xfers->node, &ch->xfers_list);
+   }
+
return 0;
 }
 
-- 
1.9.1
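
The init-once, reinit-per-transfer pattern from this patch can be modelled in
user space as below. It is only a sketch: struct completion_model and the
*_model functions are hypothetical stand-ins for the kernel's completion API,
and the waitq_ready flag merely represents the wait-queue setup that a full
initialization performs.

```c
/* Simplified stand-in for a kernel completion. */
struct completion_model {
	unsigned int done;	/* pending completion events */
	int waitq_ready;	/* models the one-time wait-queue setup */
};

/* Full initialization: done once, e.g. at probe time. */
static void init_completion_model(struct completion_model *c)
{
	c->done = 0;
	c->waitq_ready = 1;
}

/* Cheap reset before reuse: only clears the done counter. */
static void reinit_completion_model(struct completion_model *c)
{
	c->done = 0;
}

/* Signal one completion event. */
static void complete_model(struct completion_model *c)
{
	c->done++;
}

/* Non-blocking wait: consume one event if present, return 1 on success. */
static int try_wait_model(struct completion_model *c)
{
	if (!c->done)
		return 0;
	c->done--;
	return 1;
}
```

Resetting before each transfer also discards any stale event left over from an
earlier transfer, so a new waiter cannot be satisfied by an old completion.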

[PATCH] mailbox: check ->last_tx_done for NULL in case of timer-based polling

2017-03-21 Thread Alexey Klimov

The code allows registering a mailbox controller that sets the txdone_poll
flag to request timer-based polling but does not provide the ->last_tx_done()
method. If that happens, since the presence of last_tx_done() is not checked,
the hrtimer callback txdone_hrtimer() will fail when the first message is
transmitted.

This patch adds a check for this method and logs an error on registration
of a mailbox controller that requested timer-based polling.

Signed-off-by: Alexey Klimov <alexey.kli...@arm.com>
---
 drivers/mailbox/mailbox.c | 6 ++
 1 file changed, 6 insertions(+)

diff --git a/drivers/mailbox/mailbox.c b/drivers/mailbox/mailbox.c
index 4671f8a..59b7221 100644
--- a/drivers/mailbox/mailbox.c
+++ b/drivers/mailbox/mailbox.c
@@ -453,6 +453,12 @@ int mbox_controller_register(struct mbox_controller *mbox)
txdone = TXDONE_BY_ACK;
 
if (txdone == TXDONE_BY_POLL) {
+
+   if (!mbox->ops->last_tx_done) {
+   dev_err(mbox->dev, "last_tx_done method is absent\n");
+   return -EINVAL;
+   }
+
hrtimer_init(&mbox->poll_hrt, CLOCK_MONOTONIC,
 HRTIMER_MODE_REL);
mbox->poll_hrt.function = txdone_hrtimer;
-- 
1.9.1
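
The registration-time check added by this patch can be illustrated with a
small user-space model. struct ctrl_model, struct ctrl_ops_model,
ctrl_register_model() and always_done() are hypothetical names; the model
assumes, as the patch context suggests, that IRQ-based completion takes
precedence over polling when both flags are set.

```c
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Model of the controller callbacks; only the polled one matters here. */
struct ctrl_ops_model {
	bool (*last_tx_done)(void *chan);
};

/* Model of a controller being registered. */
struct ctrl_model {
	const struct ctrl_ops_model *ops;
	bool txdone_irq;
	bool txdone_poll;
};

/* Reject a polling controller that cannot be polled. */
static int ctrl_register_model(const struct ctrl_model *m)
{
	if (!m || !m->ops)
		return -EINVAL;

	/* Polling is only used when IRQ-based completion is not. */
	if (!m->txdone_irq && m->txdone_poll && !m->ops->last_tx_done)
		return -EINVAL;

	return 0;
}

/* Trivial callback used for the example. */
static bool always_done(void *chan)
{
	(void)chan;
	return true;
}
```

Failing at registration reports the problem when the controller is set up,
rather than later in the timer callback when the first message is sent.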

Re: [PATCH] mailbox: always wait in mbox_send_message for blocking tx mode

2017-03-20 Thread Alexey Klimov

Hi Sudeep,

thanks for sending this patch.

On Mon, Mar 20, 2017 at 03:40:10PM +, Sudeep Holla wrote:
> There exists a race when msg_submit return immediately as there was an
> active request being processed which may have completed just before it's
> checked again in mbox_send_message. This will result in return to the
> caller without waiting in mbox_send_message even when it's blocking Tx.
> 
> This patch fixes the issue by making use of non-negative token returned
> by add_to_rbuf to check if the request was queued and block always if
> so in blocking Tx mode.
> 
> Fixes: 2b6d83e2b8b7 ("mailbox: Introduce framework for mailbox")
> Cc: Jassi Brar <jassisinghb...@gmail.com>
> Reported-by: Alexey Klimov <alexey.kli...@arm.com>
> Signed-off-by: Sudeep Holla <sudeep.ho...@arm.com>
> ---
>  drivers/mailbox/mailbox.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/drivers/mailbox/mailbox.c b/drivers/mailbox/mailbox.c
> index 4671f8a12872..d5895791ab5d 100644
> --- a/drivers/mailbox/mailbox.c
> +++ b/drivers/mailbox/mailbox.c
> @@ -260,7 +260,7 @@ int mbox_send_message(struct mbox_chan *chan, void *mssg)
>  
>   msg_submit(chan);
>  
> - if (chan->cl->tx_block && chan->active_req) {
> + if (chan->cl->tx_block && t >= 0) {

What do you think about removing the t >= 0 check altogether?
If add_to_rbuf() above returns a negative number, we won't reach this point
in the code at all and will quit this function with an error. If execution
reaches this line, we can say that t is definitely >= 0, so maybe it
shouldn't be checked.
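
The control flow behind this remark can be modelled as below. It is a
user-space sketch with hypothetical names (send_model() and its parameters);
it assumes, as described above, that a negative token makes the function
return before the blocking check is reached.

```c
#include <stdbool.h>

/*
 * Model of the send path: "token" is what queueing the message returned,
 * "*waited" records whether the caller would have blocked.
 */
static int send_model(int token, bool tx_block, bool *waited)
{
	int t = token;

	*waited = false;

	if (t < 0)
		return t;	/* queueing failed: leave early */

	/* ... the message would be submitted here ... */

	/* t >= 0 always holds at this point, so tx_block alone decides. */
	if (tx_block)
		*waited = true;

	return t;
}
```

In this model, adding "&& t >= 0" to the last condition would not change the
result for any input.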


Best regards,
Alexey

[PATCH v3] watchdog: add driver for StreamLabs USB watchdog device

2017-02-17 Thread Alexey Klimov

This patch adds a new driver that supports the StreamLabs USB watchdog
device. The device plugs into a 9-pin USB header and connects to the
reset pin and reset button on a common PC.

The USB commands used to communicate with the device were reverse
engineered using usbmon.

Signed-off-by: Alexey Klimov <klimov.li...@gmail.com>
---

Changes in v3:
 -- coding style cleanups and rebase;
 -- buffer is allocated with separate allocation;
 -- adding comments about max/min limits;
 -- rework start/stop commands implementation;
 -- fix first if-check in probe() function;

Previous version: https://www.spinics.net/lists/linux-watchdog/msg09092.html

 drivers/watchdog/Kconfig  |  16 ++
 drivers/watchdog/Makefile |   1 +
 drivers/watchdog/streamlabs_wdt.c | 321 ++
 3 files changed, 338 insertions(+)
 create mode 100644 drivers/watchdog/streamlabs_wdt.c

diff --git a/drivers/watchdog/Kconfig b/drivers/watchdog/Kconfig
index acb00b53a520..6a2195d8cc5c 100644
--- a/drivers/watchdog/Kconfig
+++ b/drivers/watchdog/Kconfig
@@ -1852,6 +1852,22 @@ config USBPCWATCHDOG
 
  Most people will say N.
 
+config USB_STREAMLABS_WATCHDOG
+   tristate "StreamLabs USB watchdog driver"
+   depends on USB
+   ---help---
+ This is the driver for the USB Watchdog dongle from StreamLabs.
+ If you correctly connect reset pins to motherboard Reset pin and
+ to Reset button then this device will simply watch your kernel to make
+ sure it doesn't freeze, and if it does, it reboots your computer
+ after a certain amount of time.
+
+
+ To compile this driver as a module, choose M here: the
+ module will be called streamlabs_wdt.
+
+ Most people will say N. Say yes or M if you want to use such usb device.
+
 comment "Watchdog Pretimeout Governors"
 
 config WATCHDOG_PRETIMEOUT_GOV
diff --git a/drivers/watchdog/Makefile b/drivers/watchdog/Makefile
index 0c3d35e3c334..d4a61222ccd2 100644
--- a/drivers/watchdog/Makefile
+++ b/drivers/watchdog/Makefile
@@ -31,6 +31,7 @@ obj-$(CONFIG_WDTPCI) += wdt_pci.o
 
 # USB-based Watchdog Cards
 obj-$(CONFIG_USBPCWATCHDOG) += pcwd_usb.o
+obj-$(CONFIG_USB_STREAMLABS_WATCHDOG) += streamlabs_wdt.o
 
 # ALPHA Architecture
 
diff --git a/drivers/watchdog/streamlabs_wdt.c b/drivers/watchdog/streamlabs_wdt.c
new file mode 100644
index 000000000000..4442d053d9f7
--- /dev/null
+++ b/drivers/watchdog/streamlabs_wdt.c
@@ -0,0 +1,321 @@
+/*
+ * StreamLabs USB Watchdog driver
+ *
+ * Copyright (c) 2016-2017 Alexey Klimov <klimov.li...@gmail.com>
+ *
+ * This program is free software; you may redistribute it and/or modify
+ * it under the terms of the GNU General Public License as published by
+ * the Free Software Foundation; either version 2 of the License, or
+ * (at your option) any later version.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ */
+
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+
+/*
+ * USB Watchdog device from Streamlabs:
+ * https://www.stream-labs.com/en/catalog/?cat_id=1203_id=323
+ *
+ * USB commands have been reverse engineered using usbmon.
+ */
+
+#define DRIVER_AUTHOR "Alexey Klimov <klimov.li...@gmail.com>"
+#define DRIVER_DESC "StreamLabs USB watchdog driver"
+#define DRIVER_NAME "usb_streamlabs_wdt"
+
+MODULE_AUTHOR(DRIVER_AUTHOR);
+MODULE_DESCRIPTION(DRIVER_DESC);
+MODULE_LICENSE("GPL");
+
+#define USB_STREAMLABS_WATCHDOG_VENDOR  0x13c0
+#define USB_STREAMLABS_WATCHDOG_PRODUCT 0x0011
+
+/*
+ * one buffer is used for communication, however transmitted message is only
+ * 32 bytes long
+ */
+#define BUFFER_TRANSFER_LENGTH 32
+#define BUFFER_LENGTH  64
+#define USB_TIMEOUT    350
+
+#define STREAMLABS_CMD_START   0xaacc
+#define STREAMLABS_CMD_STOP    0xbbff
+
+/* timeout values are taken from the Windows program */
+#define STREAMLABS_WDT_MIN_TIMEOUT 1
+#define STREAMLABS_WDT_MAX_TIMEOUT 46
+
+struct streamlabs_wdt {
+   struct watchdog_device wdt_dev;
+   struct usb_interface *intf;
+
+   struct mutex lock;
+   u8 *buffer;
+};
+
+static bool nowayout = WATCHDOG_NOWAYOUT;
+
+/*
+ * This function is used to check if watchdog actually changed
+ * its state to disabled that is reported in first two bytes of response
+ * message.
+ */
+static int usb_streamlabs_wdt_check_stop(u16 *buf)
+{
+   if (buf[0] != cpu_to_le16(STREAMLABS_CMD_STOP))
+   return -EINVAL;
+
+   return 0;
+}
+
+static int usb_streamlabs_wdt_validate_response(u8 *buf)
+{
+   /*
+* If watchdog device understood the command it will acknowledge
+* with values 1,2,3,4 at indexes 10, 11, 12, 13 in response message
+* when response treated as 8bit message.
+*/
+   if (
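
The stop check in usb_streamlabs_wdt_check_stop() above compares the first 16-bit word of the response with cpu_to_le16(STREAMLABS_CMD_STOP), which means the device reports the command value in little-endian byte order. A minimal userspace sketch of the same test, working on raw bytes so that it behaves identically on any host endianness, could look like this. The helper name response_reports_stop() is hypothetical and is not part of the driver.

```c
#include <stdint.h>

#define STREAMLABS_CMD_STOP 0xbbff	/* value taken from the patch */

/*
 * Hypothetical helper: returns 1 when the first two response bytes
 * hold STREAMLABS_CMD_STOP in little-endian order, 0 otherwise.
 */
static int response_reports_stop(const uint8_t *buf)
{
	/* assemble the 16-bit value from the low byte first */
	uint16_t first = (uint16_t)(buf[0] | ((uint16_t)buf[1] << 8));

	return first == STREAMLABS_CMD_STOP;
}
```

A response starting with the bytes 0xff, 0xbb is treated as "stopped"; the same two bytes in the opposite order are not.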

[PATCH RESEND] elevator: remove second argument in elevator_init()

2016-10-10 Thread Alexey Klimov

The last user of elevator_init() with a non-NULL name as the second
argument, which was supposedly the s390 dasd driver, went away a few
releases ago. Drivers now rely on elevator_change(), elevator_switch()
and friends instead. Right now elevator_init() is always called as
elevator_init(q, NULL).

This patch removes the second (name) argument and its usage.

While we're at it, fix the if-check that follows the removed lines. We
know that elevator_type e is initialized to NULL, so only
chosen_elevator needs to be checked.

Signed-off-by: Alexey Klimov <klimov.li...@gmail.com>
Reviewed-by: Jeff Moyer <jmo...@redhat.com>
---
 block/blk-core.c |  2 +-
 block/elevator.c | 10 ++
 include/linux/elevator.h |  2 +-
 3 files changed, 4 insertions(+), 10 deletions(-)

diff --git a/block/blk-core.c b/block/blk-core.c
index 36c7ac3..6e36d0b 100644
--- a/block/blk-core.c
+++ b/block/blk-core.c
@@ -871,7 +871,7 @@ blk_init_allocated_queue(struct request_queue *q, request_fn_proc *rfn,
mutex_lock(&q->sysfs_lock);
 
/* init elevator */
-   if (elevator_init(q, NULL)) {
+   if (elevator_init(q)) {
mutex_unlock(&q->sysfs_lock);
goto fail;
}
diff --git a/block/elevator.c b/block/elevator.c
index f7d973a..e810938 100644
--- a/block/elevator.c
+++ b/block/elevator.c
@@ -177,7 +177,7 @@ static void elevator_release(struct kobject *kobj)
kfree(e);
 }
 
-int elevator_init(struct request_queue *q, char *name)
+int elevator_init(struct request_queue *q)
 {
struct elevator_type *e = NULL;
int err;
@@ -196,18 +196,12 @@ int elevator_init(struct request_queue *q, char *name)
q->end_sector = 0;
q->boundary_rq = NULL;
 
-   if (name) {
-   e = elevator_get(name, true);
-   if (!e)
-   return -EINVAL;
-   }
-
/*
 * Use the default elevator specified by config boot param or
 * config option.  Don't try to load modules as we could be running
 * off async and request_module() isn't allowed from async.
 */
-   if (!e && *chosen_elevator) {
+   if (*chosen_elevator) {
e = elevator_get(chosen_elevator, false);
if (!e)
printk(KERN_ERR "I/O scheduler %s not found\n",
diff --git a/include/linux/elevator.h b/include/linux/elevator.h
index e7f358d..ab6963e 100644
--- a/include/linux/elevator.h
+++ b/include/linux/elevator.h
@@ -159,7 +159,7 @@ extern void elv_unregister(struct elevator_type *);
 extern ssize_t elv_iosched_show(struct request_queue *, char *);
 extern ssize_t elv_iosched_store(struct request_queue *, const char *, size_t);
 
-extern int elevator_init(struct request_queue *, char *);
+extern int elevator_init(struct request_queue *);
 extern void elevator_exit(struct elevator_queue *);
 extern int elevator_change(struct request_queue *, const char *);
 extern bool elv_bio_merge_ok(struct request *, struct bio *);
-- 
2.9.3
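
The simplification of the if-check can be illustrated with a small userspace sketch. Once the "if (name)" block is gone, e is still NULL when the check runs, so "!e && *chosen_elevator" and "*chosen_elevator" always agree. The struct and function names below are hypothetical stand-ins, not kernel symbols.

```c
#include <stddef.h>

/* Hypothetical stand-in for struct elevator_type. */
struct elevator_type_model {
	const char *name;
};

/* Condition before the patch; e is known to be NULL when it runs. */
static int check_before(const struct elevator_type_model *e,
			const char *chosen)
{
	return !e && *chosen;
}

/* Condition after the patch: only the chosen name is inspected. */
static int check_after(const char *chosen)
{
	return *chosen != '\0';
}
```

For e == NULL both functions return the same result for an empty and for a non-empty chosen name.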

Re: [PATCH] elevator: remove second argument in elevator_init()

2016-09-26 Thread Alexey Klimov

On Thu, Mar 31, 2016 at 1:34 AM, Jens Axboe <ax...@kernel.dk> wrote:
> On 03/30/2016 05:31 PM, Alexey Klimov wrote:
>>
>> Hi all,
>>
>> On Wed, Jan 27, 2016 at 9:01 PM, Jeff Moyer <jmo...@redhat.com> wrote:
>>>
>>> Alexey Klimov <klimov.li...@gmail.com> writes:
>>>
>>>> Last user of elevator_init() with non-NULL name as second argument
>>>> that supposed to be s390 dasd driver has gone few releases ago.
>>>> Drivers rely on elevator_change(), elevator_switch() and friends
>>>> for example. Right now elevator_init() is always called as
>>>> elevator_init(q, NULL).
>>>>
>>>> Patch removes passing of second name argument and its usage.
>>>>
>>>> While we're at it fix following if-check after removed lines. We know
>>>> that elevator_type e is initialized by NULL and need to check only
>>>> chosen_elevator.
>>>>
>>>> Signed-off-by: Alexey Klimov <klimov.li...@gmail.com>
>>>
>>>
>>> Reviewed-by: Jeff Moyer <jmo...@redhat.com>
>>
>>
>>
>> what is the status of this patch? Is it that wrong and are there some
>> concerns or do I need to resend it?
>
>
> It looks fine, I'll pick it up for 4.7.
>
> --
> Jens Axboe


So, I guess this one was lost: I can't find it in the tree.
Looks like the easiest way will be to rebase (and check that it's
still fine) and resend. Right?

Best regards,
Alexey.

Re: [PATCH] mm: mlock: check if vma is locked using & instead of && operator

2016-09-09 Thread Alexey Klimov

Hi Colin,

On Fri, Sep 9, 2016 at 11:46 AM, Colin King  wrote:
> From: Colin Ian King 
>
> The check to see if a vma is locked is using the operator && and
> should be using the bitwise operator & to see if the VM_LOCKED bit
> is set. Fix this to use & instead.
>
> Fixes: ae38c3be005ee ("mm: mlock: check against vma for actual mlock() size")
> Signed-off-by: Colin Ian King 
> ---
>  mm/mlock.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/mm/mlock.c b/mm/mlock.c
> index fafbb78..f5b1d07 100644
> --- a/mm/mlock.c
> +++ b/mm/mlock.c
> @@ -643,7 +643,7 @@ static int count_mm_mlocked_page_nr(struct mm_struct *mm,
> for (; vma ; vma = vma->vm_next) {
> if (start + len <=  vma->vm_start)
> break;
> -   if (vma->vm_flags && VM_LOCKED) {
> +   if (vma->vm_flags & VM_LOCKED) {
> if (start > vma->vm_start)
> count -= (start - vma->vm_start);
> if (start + len < vma->vm_end) {
> --

I think it was already addressed in [1] by Simon Guo.

[1] http://www.spinics.net/lists/linux-mm/msg113228.html

-- 
Best regards,
Alexey
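
The difference between the two operators in the quoted patch can be shown with a short userspace sketch. The function names are hypothetical, and the mask is passed as a parameter so that no particular VM_LOCKED value has to be assumed.

```c
/*
 * Logical AND: true whenever both operands are non-zero, so any vma
 * with at least one flag set looks "locked".
 */
static int locked_logical(unsigned long flags, unsigned long mask)
{
	return flags && mask;
}

/* Bitwise AND: true only when the mask bit itself is set in flags. */
static int locked_bitwise(unsigned long flags, unsigned long mask)
{
	return (flags & mask) != 0;
}
```

With an illustrative mask of 0x2000 and flags of 0x1, the logical form reports the vma as locked while the bitwise form does not.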

[PATCH] USB: serial: fix memleak on error path in usb-serial

2016-08-07 Thread Alexey Klimov

The udriver struct allocated by kzalloc() is not freed if
usb_register() or the subsequent calls fail. This patch fixes this
by adding one more step with kfree(udriver) to the error path.

Cc: Alan Stern <st...@rowland.harvard.edu>
Signed-off-by: Alexey Klimov <klimov.li...@gmail.com>
---
 drivers/usb/serial/usb-serial.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/drivers/usb/serial/usb-serial.c b/drivers/usb/serial/usb-serial.c
index b1b9bac..d213cf4 100644
--- a/drivers/usb/serial/usb-serial.c
+++ b/drivers/usb/serial/usb-serial.c
@@ -1433,7 +1433,7 @@ int usb_serial_register_drivers(struct usb_serial_driver *const serial_drivers[]
 
rc = usb_register(udriver);
if (rc)
-   return rc;
+   goto failed_usb_register;
 
for (sd = serial_drivers; *sd; ++sd) {
(*sd)->usb_driver = udriver;
@@ -1451,6 +1451,8 @@ int usb_serial_register_drivers(struct usb_serial_driver *const serial_drivers[]
while (sd-- > serial_drivers)
usb_serial_deregister(*sd);
usb_deregister(udriver);
+failed_usb_register:
+   kfree(udriver);
return rc;
 }
 EXPORT_SYMBOL_GPL(usb_serial_register_drivers);
-- 
2.5.0
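
The error-path layout used by the patch can be sketched as a userspace model. Labels are stacked so that a later failure falls through the earlier cleanup steps and every failure ends by freeing the allocation. Everything here is hypothetical: register_model(), struct udriver_model and the live_allocations counter exist only for this illustration, and calloc()/free() stand in for kzalloc()/kfree().

```c
#include <stdlib.h>

/* Hypothetical stand-in for struct usb_driver. */
struct udriver_model {
	int registered;
};

static int live_allocations;	/* outstanding allocations, for checking */

/*
 * fail_register models usb_register() failing; fail_later models a
 * failure in one of the subsequent registration steps.
 */
static int register_model(int fail_register, int fail_later)
{
	struct udriver_model *udriver;
	int rc;

	udriver = calloc(1, sizeof(*udriver));
	if (!udriver)
		return -1;
	live_allocations++;

	rc = fail_register ? -1 : 0;
	if (rc)
		goto failed_usb_register;	/* nothing to deregister yet */
	udriver->registered = 1;

	rc = fail_later ? -2 : 0;
	if (rc)
		goto failed;

	return 0;	/* success: the driver stays allocated */

failed:
	udriver->registered = 0;	/* stands in for usb_deregister() */
failed_usb_register:
	free(udriver);
	live_allocations--;
	return rc;
}
```

Both failure modes leave live_allocations at zero, while a successful call leaves the driver allocated.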

Re: [PATCH v4] Force cppc_cpufreq to report values in KHz to fix user space reporting

2016-07-14 Thread Alexey Klimov

On Thu, Jul 14, 2016 at 10:15:39AM -0600, Al Stone wrote:
> On 07/14/2016 04:03 AM, Alexey Klimov wrote:
> > Hi Al,
> > 
> > On Tue, Jul 12, 2016 at 11:16:11AM -0600, Al Stone wrote:
> >> When CPPC is being used by ACPI on arm64, user space tools such as
> >> cpupower report CPU frequency values from sysfs that are incorrect.
> >>
> >> What the driver was doing was reporting the values given by ACPI tables
> >> in whatever scale was used to provide them.  However, the ACPI spec
> >> defines the CPPC values as unitless abstract numbers.  Internal kernel
> >> structures such as struct perf_cap, in contrast, expect these values
> >> to be in KHz.  When these struct values get reported via sysfs, the
> >> user space tools also assume they are in KHz, causing them to report
> >> incorrect values (for example, reporting a CPU frequency of 1MHz when
> >> it should be 1.8GHz).
> >>
> >> While the investigation for a long term fix proceeds (several options
> >> are being explored, some of which may require spec changes or other
> >> much more invasive fixes), this patch forces the values read by CPPC
> >> to be read in KHz, regardless of what they actually represent.
> >>
> >> The downside is that this approach has some assumptions:
> >>
> >>(1) It relies on SMBIOS3 being used, *and* that the Max Frequency
> >>value for a processor is set to a non-zero value.
> >>
> >>(2) It assumes that all processors run at the same speed, or that
> >>the CPPC values have all been scaled to reflect relative speed.
> >>This patch retrieves the largest CPU Max Frequency from a type 4 DMI
> >>record that it can find.  This may not be an issue, however, as a
> >>sampling of DMI data on x86 and arm64 indicates there is often only
> >>one such record regardless.  Since CPPC is relatively new, it is
> >>unclear if the ACPI ASL will always be written to reflect any sort
> >>of relative performance of processors of differing speeds.
> >>
> >>(3) It assumes that performance and frequency both scale linearly.
> >>
> >> For arm64 servers, this may be sufficient, but it does rely on
> >> firmware values being set correctly.  Hence, other approaches are
> >> also being considered.
> >>
> >> This has been tested on three arm64 servers, with and without DMI, with
> >> and without CPPC support.
> >>
> >> Changes for v4:
> >> -- Replaced magic constants with #defines (Rafael Wysocki)
> >> -- Renamed cppc_unitless_to_khz() to cppc_to_khz() (Rafael Wysocki)
> >> -- Replaced hidden initialization with a clearer form (Rafael Wysocki)
> >> -- Instead of picking up the first Max Speed value from DMI, we will
> >>now get the largest Max Speed; still an approximation, but slightly
> >>    less subject to error (Rafael Wysocki)
> >> -- Kconfig for cppc_cpufreq now depends on DMI, instead of selecting
> >>it, in order to make sure DMI is set up properly (Rafael Wysocki)
> >>
> >> Changes for v3:
> >> -- Added clarifying commentary re short-term vs long-term fix (Alexey
> >>Klimov)
> >> -- Added range checking code to ensure proper arithmetic occurs,
> >>especially no division by zero (Alexey Klimov)
> >>
> >> Changes for v2:
> >> -- Corrected thinko: needed to have DEPENDS on DMI in Kconfig.arm,
> >>not SELECT DMI (found by build daemon)
> >>
> >> Signed-off-by: Al Stone <a...@redhat.com>
> >> ---
> >>  drivers/acpi/cppc_acpi.c| 106 
> >> +---
> >>  drivers/cpufreq/Kconfig.arm |   2 +-
> >>  2 files changed, 102 insertions(+), 6 deletions(-)
> >>
> >> diff --git a/drivers/acpi/cppc_acpi.c b/drivers/acpi/cppc_acpi.c
> >> index 8adac69..6e6df9c 100644
> >> --- a/drivers/acpi/cppc_acpi.c
> >> +++ b/drivers/acpi/cppc_acpi.c
> >> @@ -40,8 +40,18 @@
> >>  #include 
> >>  #include 
> >>  #include 
> >> +#include 
> >> +
> >> +#include 
> >>  
> >>  #include 
> >> +
> >> +/* Minimum struct length needed for the DMI processor entry we want */
> >> +#define DMI_ENTRY_PROCESSOR_MIN_LENGTH 48
> >> +
> >> +/* Offest in the DMI processor structure for the max frequency */
> >> +#define DMI_PROCESSOR_MAX_SPEED  0x14
>
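
The linear-scaling assumption and the division-by-zero range check mentioned in the changelog can be sketched as follows. This is not the patch's cppc_to_khz(), whose body is not shown in this message. The helper perf_to_khz(), its parameters, and the choice to return the unscaled value when no reference is available are all assumptions made for illustration. The only fact taken from outside the thread is that SMBIOS reports processor Max Speed in MHz.

```c
#include <stdint.h>

/*
 * Hypothetical helper: map an abstract CPPC performance value to kHz,
 * assuming performance and frequency scale linearly and that
 * highest_perf corresponds to the DMI Max Speed (given in MHz).
 */
static uint64_t perf_to_khz(uint64_t perf, uint64_t highest_perf,
			    uint64_t max_mhz)
{
	/* range check: without a usable reference, skip the scaling */
	if (highest_perf == 0 || max_mhz == 0)
		return perf;

	return perf * max_mhz * 1000 / highest_perf;
}
```

With a highest performance value of 100 and a Max Speed of 1800 MHz, a value of 100 maps to 1800000 kHz and a value of 50 maps to 900000 kHz.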

Re: [PATCH v4] Force cppc_cpufreq to report values in KHz to fix user space reporting

2016-07-14 Thread Alexey Klimov

Hi Al,

On Tue, Jul 12, 2016 at 11:16:11AM -0600, Al Stone wrote:
> When CPPC is being used by ACPI on arm64, user space tools such as
> cpupower report CPU frequency values from sysfs that are incorrect.
> 
> What the driver was doing was reporting the values given by ACPI tables
> in whatever scale was used to provide them.  However, the ACPI spec
> defines the CPPC values as unitless abstract numbers.  Internal kernel
> structures such as struct perf_cap, in contrast, expect these values
> to be in KHz.  When these struct values get reported via sysfs, the
> user space tools also assume they are in KHz, causing them to report
> incorrect values (for example, reporting a CPU frequency of 1MHz when
> it should be 1.8GHz).
> 
> While the investigation for a long term fix proceeds (several options
> are being explored, some of which may require spec changes or other
> much more invasive fixes), this patch forces the values read by CPPC
> to be read in KHz, regardless of what they actually represent.
> 
> The downside is that this approach has some assumptions:
> 
>(1) It relies on SMBIOS3 being used, *and* that the Max Frequency
>value for a processor is set to a non-zero value.
> 
>(2) It assumes that all processors run at the same speed, or that
>the CPPC values have all been scaled to reflect relative speed.
>This patch retrieves the largest CPU Max Frequency from a type 4 DMI
>record that it can find.  This may not be an issue, however, as a
>sampling of DMI data on x86 and arm64 indicates there is often only
>one such record regardless.  Since CPPC is relatively new, it is
>unclear if the ACPI ASL will always be written to reflect any sort
>of relative performance of processors of differing speeds.
> 
>(3) It assumes that performance and frequency both scale linearly.
> 
> For arm64 servers, this may be sufficient, but it does rely on
> firmware values being set correctly.  Hence, other approaches are
> also being considered.
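
To make the approach concrete, here is a minimal userspace sketch of the linear scaling the description implies, assuming the DMI Max Speed (in MHz) corresponds to the highest CPPC performance value. The helper name scale_perf_to_khz() and its parameters are hypothetical and are not taken from the patch; the zero checks only illustrate the "no division by zero" point from the v3 changelog.

```c
#include <stdint.h>

/*
 * Hypothetical helper (not from the patch): map a unitless CPPC
 * performance value to kHz, assuming performance and frequency scale
 * linearly and that highest_perf corresponds to the DMI Max Speed.
 */
static uint64_t scale_perf_to_khz(uint64_t perf, uint64_t highest_perf,
				  uint16_t dmi_max_mhz)
{
	/* Without usable firmware data, leave the value untouched. */
	if (highest_perf == 0 || dmi_max_mhz == 0)
		return perf;

	/* MHz -> kHz, then scale by perf / highest_perf. */
	return perf * ((uint64_t)dmi_max_mhz * 1000) / highest_perf;
}
```

With a DMI Max Speed of 1800 MHz and a highest performance value of 100, a performance value of 50 would be reported as 900000 kHz under these assumptions.
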
> 
> This has been tested on three arm64 servers, with and without DMI, with
> and without CPPC support.
> 
> Changes for v4:
> -- Replaced magic constants with #defines (Rafael Wysocki)
> -- Renamed cppc_unitless_to_khz() to cppc_to_khz() (Rafael Wysocki)
> -- Replaced hidden initialization with a clearer form (Rafael Wysocki)
> -- Instead of picking up the first Max Speed value from DMI, we will
>now get the largest Max Speed; still an approximation, but slightly
>less subject to error (Rafael Wysocki)
> -- Kconfig for cppc_cpufreq now depends on DMI, instead of selecting
>it, in order to make sure DMI is set up properly (Rafael Wysocki)
> 
> Changes for v3:
> -- Added clarifying commentary re short-term vs long-term fix (Alexey
>Klimov)
> -- Added range checking code to ensure proper arithmetic occurs,
>especially no division by zero (Alexey Klimov)
> 
> Changes for v2:
> -- Corrected thinko: needed to have DEPENDS on DMI in Kconfig.arm,
>not SELECT DMI (found by build daemon)
> 
> Signed-off-by: Al Stone <a...@redhat.com>
> ---
>  drivers/acpi/cppc_acpi.c    | 106 +---
>  drivers/cpufreq/Kconfig.arm |   2 +-
>  2 files changed, 102 insertions(+), 6 deletions(-)
> 
> diff --git a/drivers/acpi/cppc_acpi.c b/drivers/acpi/cppc_acpi.c
> index 8adac69..6e6df9c 100644
> --- a/drivers/acpi/cppc_acpi.c
> +++ b/drivers/acpi/cppc_acpi.c
> @@ -40,8 +40,18 @@
>  #include 
>  #include 
>  #include 
> +#include 
> +
> +#include 
>  
>  #include 
> +
> +/* Minimum struct length needed for the DMI processor entry we want */
> +#define DMI_ENTRY_PROCESSOR_MIN_LENGTH   48
> +
> +/* Offset in the DMI processor structure for the max frequency */
> +#define DMI_PROCESSOR_MAX_SPEED  0x14
> +
>  /*
>   * Lock to provide mutually exclusive access to the PCC
>   * channel. e.g. When the remote updates the shared region
> @@ -709,6 +719,56 @@ static int cpc_write(struct cpc_reg *reg, u64 val)
>   return ret_val;
>  }
>  
> +static u64 cppc_dmi_khz;
> +
> +static void cppc_find_dmi_mhz(const struct dmi_header *dm, void *private)
> +{
> + const u8 *dmi_data = (const u8 *)dm;
> + u16 *mhz = (u16 *)private;
> +
> + if (dm->type == DMI_ENTRY_PROCESSOR &&
> + dm->length >= DMI_ENTRY_PROCESSOR_MIN_LENGTH) {
> + u16 val = (u16)get_unaligned((const u16 *)
> + (dmi_data + DMI_PROCESSOR_MAX_SPEED));
> + *mhz = val > *mhz ? val : *mhz;
> + }
> +}
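
For readers who want to try the parsing logic outside the kernel, the callback above can be approximated in userspace roughly as follows. This is only a sketch: the function name max_speed_mhz() and the macro names are hypothetical, the record is assumed to be a raw SMBIOS type 4 structure (type in byte 0, length in byte 1), and the 16-bit field is assumed to be little-endian.

```c
#include <stdint.h>

#define PROC_RECORD_TYPE	4	/* SMBIOS processor information */
#define PROC_RECORD_MIN_LEN	48	/* same minimum length as the patch */
#define PROC_MAX_SPEED_OFF	0x14	/* same offset as the patch */

/*
 * Hypothetical userspace equivalent of the callback: return the larger
 * of 'best' and the Max Speed (MHz) found in one raw DMI record.
 */
static uint16_t max_speed_mhz(const uint8_t *rec, uint16_t best)
{
	uint16_t val;

	/* Skip records that are not processor entries or are too short. */
	if (rec[0] != PROC_RECORD_TYPE || rec[1] < PROC_RECORD_MIN_LEN)
		return best;

	/* Assemble the 16-bit value byte by byte to avoid unaligned loads. */
	val = (uint16_t)(rec[PROC_MAX_SPEED_OFF] |
			 (rec[PROC_MAX_SPEED_OFF + 1] << 8));

	return val > best ? val : best;
}
```

Calling it once per record and feeding the result back in as 'best' yields the largest Max Speed over all processor records, which matches the v4 changelog entry.
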
> +
> +
> +static u64 cppc_get_dmi_khz(void)
> +{
> + u16 m

Re: [PATCH v5 0/1] ARM64: ACPI: Update documentation for latest specification version

2016-05-16 Thread Alexey Klimov

On Mon, May 2, 2016 at 09:19 PM, Al Stone wrote:
> On 04/25/2016 03:21 PM, Al Stone wrote:
> > The ACPI 6.1 specification was recently released at the end of January
> > 2016, but the arm64 kernel documentation for the use of ACPI was written
> > for the 5.1 version of the spec.  There were significant additions to the
> > spec that had not yet been mentioned -- for example, the 6.0 mechanisms
> > added to make it easier to define processors and low power idle states,
> > as well as the 6.1 addition allowing regular interrupts (not just from
> > GPIO) be used to signal ACPI general purpose events.
> >
> > This patch reflects going back through and examining the specs in detail
> > and updating content appropriately.  Whilst there, a few odds and ends of
> > typos were caught as well.  This brings the documentation up to date with
> > ACPI 6.1 for arm64.
> >
> > Changes for v5:
> >-- Miscellaneous typos and corrections (Lorenzo Pieralisi)
> >-- Add linux-acpi@ ML to the distribution list (Alexey Klimov)
> >-- Corrections to CPPC information (Alexey Klimov)
> >-- ACK from Lorenzo Pieralisi
> >-- Updated bibliographic info (Al Stone)
> >
> > Changes for v4:
> >-- Clarify that IORT can sometimes be optional (Jon Masters).
> >-- Remove "Use as needed" descriptions of ACPI objects; they provide
> >   no substantive information and doing so simplifies maintenance of
> >   this document over time.  These have been replaced with a simpler
> >   notice that states that unless otherwise noted, do what the ACPI
> >   specification says is needed.
> >-- Corrected the _OSI object usage recommendation; it described kernel
> >   behavior that does not exist (Al Stone).
> >
> > Changes for v3:
> >-- Clarify use of _LPI/_RDI (Vikas Sajjan)
> >-- Whitespace cleanup as pointed out by checkpatch
> >
> > Changes for v2:
> > >-- Clean up white space (Harb Abdulhamid)
> >-- Clarification on _CCA usage (Harb Abdulhamid)
> >-- IORT moved to required from recommended (Hanjun Guo)
> >-- Clarify IORT description (Hanjun Guo)
> >
> >
> > Al Stone (1):
> >   ARM64: ACPI: Update documentation for latest specification version
> >
> >  Documentation/arm64/acpi_object_usage.txt | 343 
> > --
> >  Documentation/arm64/arm-acpi.txt  |  40 ++--
> >  2 files changed, 213 insertions(+), 170 deletions(-)
> >
> 
> Ping?  If there are no further comments, can this be pulled in through
> either the documentation or arm64 tree?
> 
> Thanks.

Hi Al,
sorry for the delay.

The CPPC and PCC corrections look fine. Thanks.

This comment is not meant to block your patch (maybe a to-do item):
I grepped the sources and your patch and I don't see a description of the
_PSD object. This P-state dependency object is optional, but its presence
and correct data are extremely useful for CPPC and can potentially decrease
the number of performance-changing requests.

The ACPI spec, in the section about CPPC, says that it may use _PSD (page 503,
if I remember correctly) to specify which domains the CPUs belong to.

You may consider adding a description of the _PSD object later.

Best regards,
Alexey.

Re: [PATCH v9] mm: kasan: Initial memory quarantine implementation

2016-05-16 Thread Alexey Klimov

Hi Alexander,

On Wed, May 11, 2016 at 6:18 PM, Alexander Potapenko  wrote:
> Quarantine isolates freed objects in a separate queue. The objects are
> returned to the allocator later, which helps to detect use-after-free
> errors.
>
> Freed objects are first added to per-cpu quarantine queues.
> When a cache is destroyed or memory shrinking is requested, the objects
> are moved into the global quarantine queue. Whenever a kmalloc call
> allows memory reclaiming, the oldest objects are popped out of the
> global queue until the total size of objects in quarantine is less than
> 3/4 of the maximum quarantine size (which is a fraction of installed
> physical memory).
>
> As long as an object remains in the quarantine, KASAN is able to report
> accesses to it, so the chance of reporting a use-after-free is increased.
> Once the object leaves quarantine, the allocator may reuse it, in which
> case the object is unpoisoned and KASAN can't detect incorrect accesses
> to it.
>
> Right now quarantine support is only enabled in SLAB allocator.
> Unification of KASAN features in SLAB and SLUB will be done later.
>
> This patch is based on the "mm: kasan: quarantine" patch originally
> prepared by Dmitry Chernenkov. A number of improvements have been
> suggested by Andrey Ryabinin.
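
The queueing policy described above can be sketched in plain userspace C as follows. This is not the code from the patch: the names qnode, qlist, quarantine_add() and quarantine_reduce() are made up for illustration, malloc()/free() stand in for the slab allocator, and it is my assumption that trimming only starts once the queue has grown past the maximum size.

```c
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical node and queue types, loosely modelled on the description. */
struct qnode {
	struct qnode *next;
	size_t size;
};

struct qlist {
	struct qnode *head;
	struct qnode *tail;
	size_t bytes;		/* total size of queued objects */
};

/* Append a "freed" object of the given size to the tail of the queue. */
static int quarantine_add(struct qlist *q, size_t size)
{
	struct qnode *n = malloc(sizeof(*n));

	if (!n)
		return -1;
	n->next = NULL;
	n->size = size;
	if (!q->head)
		q->head = n;
	else
		q->tail->next = n;
	q->tail = n;
	q->bytes += size;
	return 0;
}

/*
 * Once the queue exceeds max_bytes (assumption), release the oldest
 * objects until less than 3/4 of max_bytes remains queued.
 * Returns the number of objects released.
 */
static size_t quarantine_reduce(struct qlist *q, size_t max_bytes)
{
	size_t low_mark = max_bytes - max_bytes / 4;
	size_t released = 0;

	if (q->bytes <= max_bytes)
		return 0;

	while (q->head && q->bytes >= low_mark) {
		struct qnode *n = q->head;

		q->head = n->next;
		if (!q->head)
			q->tail = NULL;
		q->bytes -= n->size;
		free(n);	/* object is handed back to the allocator */
		released++;
	}
	return released;
}
```

Objects stay detectable only while they sit in the queue, so a larger max_bytes trades memory for a longer window in which a use-after-free can be reported.
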
>
> Signed-off-by: Alexander Potapenko 
> ---
> v2: - added copyright comments
> - per request from Joonsoo Kim made __cache_free() more straightforward
> - added comments for smp_load_acquire()/smp_store_release()
>
> v3: - incorporate changes introduced by the "mm, kasan: SLAB support" patch
>
> v4: - fix kbuild compile-time error (missing ___cache_free() declaration)
>   and a warning (wrong format specifier)
>
> v6: - extended the patch description
> - dropped the unused qlist_remove() function
>
> v9: - incorporate the fixes by Andrey Ryabinin:
>   * Fix comment styles,
>   * Get rid of some ifdefs
>   * Revert needless functions renames in quarantine patch
>   * Remove needless local_irq_save()/restore() in
> per_cpu_remove_cache()
>   * Add new 'struct qlist_node' instead of 'void **' types. This makes
> code a bit more readable.
> - remove the non-deterministic quarantine test
> - dropped smp_load_acquire()/smp_store_release()
> ---
>  include/linux/kasan.h |  13 ++-
>  mm/kasan/Makefile |   1 +
>  mm/kasan/kasan.c  |  57 --
>  mm/kasan/kasan.h  |  21 +++-
>  mm/kasan/quarantine.c | 291 
> ++
>  mm/kasan/report.c |   1 +
>  mm/mempool.c  |   2 +-
>  mm/slab.c |  12 ++-
>  mm/slab.h |   2 +
>  mm/slab_common.c  |   2 +
>  10 files changed, 387 insertions(+), 15 deletions(-)
>  create mode 100644 mm/kasan/quarantine.c
>
> diff --git a/include/linux/kasan.h b/include/linux/kasan.h
> index 737371b..611927f 100644
> --- a/include/linux/kasan.h
> +++ b/include/linux/kasan.h
> @@ -50,6 +50,8 @@ void kasan_free_pages(struct page *page, unsigned int order);
>

[...]

> diff --git a/mm/kasan/quarantine.c b/mm/kasan/quarantine.c
> new file mode 100644
> index 000..4973505
> --- /dev/null
> +++ b/mm/kasan/quarantine.c
> @@ -0,0 +1,291 @@
> +/*
> + * KASAN quarantine.
> + *
> + * Author: Alexander Potapenko 
> + * Copyright (C) 2016 Google, Inc.
> + *
> + * Based on code by Dmitry Chernenkov.
> + *
> + * This program is free software; you can redistribute it and/or
> + * modify it under the terms of the GNU General Public License
> + * version 2 as published by the Free Software Foundation.
> + *
> + * This program is distributed in the hope that it will be useful, but
> + * WITHOUT ANY WARRANTY; without even the implied warranty of
> + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
> + * General Public License for more details.
> + *
> + */
> +
> +#include 
> +#include 
> +#include 
> +#include 
> +#include 
> +#include 
> +#include 
> +#include 
> +#include 
> +#include 
> +
> +#include "../slab.h"
> +#include "kasan.h"
> +
> +/* Data structure and operations for quarantine queues. */
> +
> +/*
> + * Each queue is a singly-linked list, which also stores the total size of
> + * objects inside of it.
> + */
> +struct qlist_head {
> +   struct qlist_node *head;
> +   struct qlist_node *tail;
> +   size_t bytes;
> +};
> +
> +#define QLIST_INIT { NULL, NULL, 0 }
> +
> +static bool qlist_empty(struct qlist_head *q)
> +{
> +   return !q->head;
> +}
> +
> +static void qlist_init(struct qlist_head *q)
> +{
> +   q->head = q->tail = NULL;
> +   q->bytes = 0;
> +}
> +
> +static void qlist_put(struct qlist_head *q, struct qlist_node *qlink,
> +   size_t size)
> +{
> +   if (unlikely(qlist_empty(q)))
> +   q->head = qlink;
> +   else
> +   q->tail->next = qlink;
> +   q->tail = qlink;
> +   qlink->next = NULL;
> +   q->bytes += size;
> +}
> +
> +static void qlist_move_all(struct

Re: [PATCH v2] mailbox: pcc: Support HW-Reduced Communication Subspace Type 2

2016-05-10 Thread Alexey Klimov

On Mon, May 09, 2016 at 10:38:24AM -0700, Hoan Tran wrote:
> Hi Alexey,
> 
> On Mon, May 9, 2016 at 2:43 AM, Alexey Klimov <alexey.kli...@arm.com> wrote:
> > Hi Hoan,
> >
> > On Fri, May 06, 2016 at 11:38:34AM -0700, Hoan Tran wrote:
> >> From: hotran <hot...@apm.com>
> >>
> >> ACPI 6.1 has a PCC HW-Reduced Communication Subspace Type 2 intended for
> >> use on HW-Reduce ACPI Platform, which requires read-modify-write sequence
> >> to acknowledge doorbell interrupt. This patch provides the implementation
> >> for the Communication Subspace Type 2.
> >>
> >> This patch depends on patch [1] which supports PCC subspace type 2 header
> >> [1] https://lkml.org/lkml/2016/5/5/14
> >>  - [PATCH v2 03/13] ACPICA: ACPI 6.1: Support for new PCCT subtable
> >
> > So you finally decided to use separate structure declaration for type 2. Good.
> >
> >> v2
> >>  * Remove changes inside "actbl3.h". This file is taken care by ACPICA.
> >>  * Parse both subspace type 1 and subspace type 2
> >>  * Remove unnecessary variable initialization
> >>  * ISR returns IRQ_NONE in case of error
> >>
> >> v1
> >>  * Initial
> >>
> >> Signed-off-by: Hoan Tran <hot...@apm.com>
> >> ---
> >>  drivers/mailbox/pcc.c | 395 +-
> >>  1 file changed, 296 insertions(+), 99 deletions(-)
> >>
> >> diff --git a/drivers/mailbox/pcc.c b/drivers/mailbox/pcc.c
> >> index 043828d..58c9a67 100644
> >> --- a/drivers/mailbox/pcc.c
> >> +++ b/drivers/mailbox/pcc.c
> >> @@ -59,6 +59,7 @@
> >>  #include 
> >>  #include 
> >>  #include 
> >
> > [...]
> >
> >> @@ -307,6 +440,43 @@ static int parse_pcc_subspace(struct acpi_subtable_header *header,
> >>  }
> >>
> >>  /**
> >> + * pcc_parse_subspace_irq - Parse the PCC IRQ and PCC ACK register
> >> + *   There should be one entry per PCC client.
> >> + * @mbox_chans: Pointer to the PCC mailbox channel data
> >> + * @pcct_ss: Pointer to the ACPI subtable header under the PCCT.
> >> + *
> >> + * Return: 0 for Success, else errno.
> >> + *
> >> + * This gets called for each entry in the PCC table.
> >> + */
> >> +static int pcc_parse_subspace_irq(struct pcc_mbox_chan *mbox_chans,
> >> + struct acpi_pcct_hw_reduced *pcct_ss)
> >> +{
> >> + mbox_chans->irq = pcc_map_interrupt(pcct_ss->doorbell_interrupt,
> >> + (u32)pcct_ss->flags);
> >> + if (mbox_chans->irq <= 0) {
> >> + pr_err("PCC GSI %d not registered\n",
> >> +pcct_ss->doorbell_interrupt);
> >> + return -EINVAL;
> >> + }
> >> +
> >> + if (pcct_ss->header.type
> >> + == ACPI_PCCT_TYPE_HW_REDUCED_SUBSPACE_TYPE2) {
> >> + struct acpi_pcct_hw_reduced_type2 *pcct2_ss = (void *)pcct_ss;
> >> +
> >> + mbox_chans->pcc_doorbell_ack_vaddr = acpi_os_ioremap(
> >> + pcct2_ss->doorbell_ack_register.address,
> >> + pcct2_ss->doorbell_ack_register.bit_width / 8);
> >> + if (!mbox_chans->pcc_doorbell_ack_vaddr) {
> >> + pr_err("Failed to ioremap PCC ACK register\n");
> >> + return -ENOMEM;
> >> + }
> >> + }
> >> +
> >> + return 0;
> >> +}
> >> +
> >> +/**
> >>   * acpi_pcc_probe - Parse the ACPI tree for the PCCT.
> >>   *
> >>   * Return: 0 for Success, else errno.
> >> @@ -316,7 +486,8 @@ static int __init acpi_pcc_probe(void)
> >>   acpi_size pcct_tbl_header_size;
> >>   struct acpi_table_header *pcct_tbl;
> >>   struct acpi_subtable_header *pcct_entry;
> >> - int count, i;
> >> + struct acpi_table_pcct *acpi_pcct_tbl;
> >> + int count, i, rc;
> >>   acpi_status status = AE_OK;
> >>
> >>   /* Search for PCCT */
> >> @@ -334,22 +505,28 @@ static int __init acpi_pcc_probe(void)
> >>   ACPI_PCCT_TYPE_HW_REDUCED_SUBSPACE,
> >>   parse_pcc_subspace, MAX_PCC_SUBSPACES);
> >

Re: [PATCH v2] mailbox: pcc: Support HW-Reduced Communication Subspace Type 2

2016-05-09 Thread Alexey Klimov

Hi Hoan,

On Fri, May 06, 2016 at 11:38:34AM -0700, Hoan Tran wrote:
> From: hotran <hot...@apm.com>
> 
> ACPI 6.1 has a PCC HW-Reduced Communication Subspace Type 2 intended for
> use on HW-Reduce ACPI Platform, which requires read-modify-write sequence
> to acknowledge doorbell interrupt. This patch provides the implementation
> for the Communication Subspace Type 2.
> 
> This patch depends on patch [1] which supports PCC subspace type 2 header
> [1] https://lkml.org/lkml/2016/5/5/14
>  - [PATCH v2 03/13] ACPICA: ACPI 6.1: Support for new PCCT subtable

So you finally decided to use a separate structure declaration for type 2. Good.
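
For reference, the read-modify-write acknowledgement mentioned in the description can be sketched like this. It is a userspace illustration only: struct ack_reg, ack_value() and ack_doorbell() are hypothetical names, and I am assuming the acknowledgement uses a preserve mask and a write mask in the same way as other PCC doorbell registers.

```c
#include <stdint.h>

/* Hypothetical stand-in for the ack register data of a type 2 subspace. */
struct ack_reg {
	volatile uint64_t *vaddr;	/* mapped doorbell ack register */
	uint64_t preserve_mask;		/* bits to keep from the current value */
	uint64_t write_mask;		/* bits to set when acknowledging */
};

/* Compute the value to write back: keep preserved bits, set ack bits. */
static uint64_t ack_value(uint64_t cur, uint64_t preserve, uint64_t set)
{
	return (cur & preserve) | set;
}

/* Read the register, modify it, and write it back in one sequence. */
static void ack_doorbell(struct ack_reg *r)
{
	uint64_t val = *r->vaddr;

	*r->vaddr = ack_value(val, r->preserve_mask, r->write_mask);
}
```

In the driver the plain pointer accesses would of course be replaced by the usual MMIO accessors on the ioremapped address.
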

> v2
>  * Remove changes inside "actbl3.h". This file is taken care by ACPICA.
>  * Parse both subspace type 1 and subspace type 2
>  * Remove unnecessary variable initialization
>  * ISR returns IRQ_NONE in case of error
> 
> v1
>  * Initial
> 
> Signed-off-by: Hoan Tran <hot...@apm.com>
> ---
>  drivers/mailbox/pcc.c | 395 +-
>  1 file changed, 296 insertions(+), 99 deletions(-)
> 
> diff --git a/drivers/mailbox/pcc.c b/drivers/mailbox/pcc.c
> index 043828d..58c9a67 100644
> --- a/drivers/mailbox/pcc.c
> +++ b/drivers/mailbox/pcc.c
> @@ -59,6 +59,7 @@
>  #include 
>  #include 
>  #include 

[...]

> @@ -307,6 +440,43 @@ static int parse_pcc_subspace(struct 
> acpi_subtable_header *header,
>  }
>  
>  /**
> + * pcc_parse_subspace_irq - Parse the PCC IRQ and PCC ACK register
> + *   There should be one entry per PCC client.
> + * @mbox_chans: Pointer to the PCC mailbox channel data
> + * @pcct_ss: Pointer to the ACPI subtable header under the PCCT.
> + *
> + * Return: 0 for Success, else errno.
> + *
> + * This gets called for each entry in the PCC table.
> + */
> +static int pcc_parse_subspace_irq(struct pcc_mbox_chan *mbox_chans,
> + struct acpi_pcct_hw_reduced *pcct_ss)
> +{
> + mbox_chans->irq = pcc_map_interrupt(pcct_ss->doorbell_interrupt,
> + (u32)pcct_ss->flags);
> + if (mbox_chans->irq <= 0) {
> + pr_err("PCC GSI %d not registered\n",
> +pcct_ss->doorbell_interrupt);
> + return -EINVAL;
> + }
> +
> + if (pcct_ss->header.type
> + == ACPI_PCCT_TYPE_HW_REDUCED_SUBSPACE_TYPE2) {
> + struct acpi_pcct_hw_reduced_type2 *pcct2_ss = (void *)pcct_ss;
> +
> + mbox_chans->pcc_doorbell_ack_vaddr = acpi_os_ioremap(
> + pcct2_ss->doorbell_ack_register.address,
> + pcct2_ss->doorbell_ack_register.bit_width / 8);
> + if (!mbox_chans->pcc_doorbell_ack_vaddr) {
> + pr_err("Failed to ioremap PCC ACK register\n");
> + return -ENOMEM;
> + }
> + }
> +
> + return 0;
> +}
> +
> +/**
>   * acpi_pcc_probe - Parse the ACPI tree for the PCCT.
>   *
>   * Return: 0 for Success, else errno.
> @@ -316,7 +486,8 @@ static int __init acpi_pcc_probe(void)
>   acpi_size pcct_tbl_header_size;
>   struct acpi_table_header *pcct_tbl;
>   struct acpi_subtable_header *pcct_entry;
> - int count, i;
> + struct acpi_table_pcct *acpi_pcct_tbl;
> + int count, i, rc;
>   acpi_status status = AE_OK;
>  
>   /* Search for PCCT */
> @@ -334,22 +505,28 @@ static int __init acpi_pcc_probe(void)
>   ACPI_PCCT_TYPE_HW_REDUCED_SUBSPACE,
>   parse_pcc_subspace, MAX_PCC_SUBSPACES);
>  
> + count += acpi_table_parse_entries(ACPI_SIG_PCCT,
> + sizeof(struct acpi_table_pcct),
> + ACPI_PCCT_TYPE_HW_REDUCED_SUBSPACE_TYPE2,
> + parse_pcc_subspace, MAX_PCC_SUBSPACES);
> +
>   if (count <= 0) {
>   pr_err("Error parsing PCC subspaces from PCCT\n");
>   return -EINVAL;
>   }

It looks like after the first call to acpi_table_parse_entries() you may have a
negative number in count, and then you add the number of type 2 subtables to
that variable.

I am not sure how pedantic this all should be, but with this approach you may
end up with more than MAX_PCC_SUBSPACES subspaces, or not probe any subspaces
at all, or hit other side effects.
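
To illustrate the concern, here is a minimal userspace sketch of checking
the two parsing results separately before summing them. It assumes each
call returns either a non-negative entry count or a negative errno.
pcc_total_subspaces() is a hypothetical helper made up for this example;
it is not part of the patch or of the kernel.

```c
#include <errno.h>

#define MAX_PCC_SUBSPACES 256

/*
 * Hypothetical helper, not part of the patch: combine the results of
 * the two parsing passes. Each argument is what one parsing call
 * returned: a non-negative entry count, or a negative errno.
 */
static int pcc_total_subspaces(int type1_rc, int type2_rc)
{
	/* Report the first error instead of folding it into the sum. */
	if (type1_rc < 0)
		return type1_rc;
	if (type2_rc < 0)
		return type2_rc;

	/* Reject totals above the limit; written to avoid int overflow. */
	if (type1_rc > MAX_PCC_SUBSPACES - type2_rc)
		return -EINVAL;

	/* Nothing to probe at all. */
	if (type1_rc + type2_rc == 0)
		return -EINVAL;

	return type1_rc + type2_rc;
}
```

With a plain "count += ..." an error of -2 from the first pass plus three
type 2 entries gives count == 1, which passes the "count <= 0" check even
though the first pass failed.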


Best regards,
Alexey Klimov

Re: [PATCH v2] Force cppc_cpufreq to report values in KHz to fix user space reporting

2016-04-21 Thread Alexey Klimov


On Tue, Apr 19, 2016 at 1:11 AM, Al Stone  wrote:
> 
> When CPPC is being used by ACPI on arm64, user space tools such as
> cpupower report CPU frequency values from sysfs that are incorrect.
> 
> What the driver was doing was reporting the values given by ACPI tables
> in whatever scale was used to provide them.  However, the ACPI spec
> defines the CPPC values as unitless abstract numbers.  Internal kernel
> structures such as struct perf_cap, in contrast, expect these values
> to be in KHz.  When these struct values get reported via sysfs, the
> user space tools also assume they are in KHz, causing them to report
> incorrect values (for example, reporting a CPU frequency of 1MHz when
> it should be 1.8GHz).
> 
> While the investigation for a long term fix proceeds (several options
> are being explored, some of which may require spec changes or other
> much more invasive fixes), this patch forces the values read by CPPC
> to be read in KHz, regardless of what they actually represent.
> 
> The downside is that this approach has some assumptions:
> 
>(1) It relies on SMBIOS3 being used, *and* that the Max Frequency
>value for a processor is set to a non-zero value.
> 
>(2) It assumes that all processors run at the same speed.  This

Sometimes a short-term solution becomes a long-term one. It is worth placing
a comment in the code about this assumption.

>patch retrieves the first CPU Max Frequency from a type 4 DMI
>record that it can find.  This may not be an issue, however, as a
>sampling of DMI data on x86 and arm64 indicates there is often only
>one such record regardless.
> 
> For arm64 servers, this may be sufficient, but it does rely on
> firmware values being set correctly.  Hence, other approaches are
> also being considered.
> 
> This has been tested on three arm64 servers, with and without DMI, with
> and without CPPC support.
> 
> Changes for v2:
> -- Corrected thinko: needed to have DEPENDS on DMI in Kconfig.arm,
>not SELECT DMI (found by build daemon)
> 
> Signed-off-by: Al Stone 
> ---
>  drivers/acpi/cppc_acpi.c| 61 
> +
>  drivers/cpufreq/Kconfig.arm |  1 +
>  2 files changed, 57 insertions(+), 5 deletions(-)
> 
> diff --git a/drivers/acpi/cppc_acpi.c b/drivers/acpi/cppc_acpi.c
> index 8adac69..d61ced6 100644
> --- a/drivers/acpi/cppc_acpi.c
> +++ b/drivers/acpi/cppc_acpi.c
> @@ -40,6 +40,9 @@
>  #include 
>  #include 
>  #include 
> +#include 
> +
> +#include 
> 
>  #include 
>  /*
> @@ -709,6 +712,47 @@ static int cpc_write(struct cpc_reg *reg, u64 val)
> return ret_val;
>  }
> 
> +static u64 cppc_dmi_khz;
> +
> +static void cppc_find_dmi_mhz(const struct dmi_header *dm, void *private)
> +{
> +   u16 *mhz = (u16 *)private;
> +   const u8 *dmi_data = (const u8 *)dm;
> +
> +   if (dm->type == DMI_ENTRY_PROCESSOR && dm->length >= 48)
> +   *mhz = (u16)get_unaligned((const u16 *)(dmi_data + 0x14));
> +}
> +
> +
> +static u64 cppc_get_dmi_khz(void)
> +{
> +   u16 mhz;
> +
> +   dmi_walk(cppc_find_dmi_mhz, &mhz);
> +
> +   /*
> +* Real stupid fallback value, just in case there is no
> +* actual value set.
> +*/
> +   mhz = mhz ? mhz : 1;
> +
> +   return (1000 * mhz);
> +}
> +
> +static u64 cppc_unitless_to_khz(u64 min, u64 max, u64 val)
> +{
> +   /*
> +* The incoming val should be min <= val <= max.  Our
> +* job is to convert that to KHz so it can be properly
> +* reported to user space via cpufreq_policy.
> +*/
> +
> +   if (!cppc_dmi_khz)
> +   cppc_dmi_khz = cppc_get_dmi_khz();
> +
> +   return ((val - min) * cppc_dmi_khz) / (max - min);

How pedantic should the kernel be when dealing with these values?

This 1) can potentially divide by zero when max == min (extra care is
required to perform this in the Solar System) and 2) can return 0 when
val == min.

I am not sure there is any benefit for firmware in exporting such
values.
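
A minimal userspace sketch of a guarded version of the conversion, to show
what I mean. The function name, the error-code return and the max_khz
parameter (standing in for the DMI-derived value) are assumptions made for
this example, not the patch's interface. It also assumes that
(val - min) * max_khz fits in 64 bits.

```c
#include <errno.h>
#include <stdint.h>

/*
 * Hypothetical guarded variant of the quoted conversion. max_khz
 * stands in for the DMI-derived value; all names are illustrative.
 */
static int unitless_to_khz(uint64_t min, uint64_t max, uint64_t val,
			   uint64_t max_khz, uint64_t *khz)
{
	/* max == min would divide by zero; max < min would wrap around. */
	if (max <= min)
		return -EINVAL;

	/* Clamp so that val - min cannot wrap around. */
	if (val < min)
		val = min;
	if (val > max)
		val = max;

	/* Note: val == min still maps to 0 kHz with this formula. */
	*khz = (val - min) * max_khz / (max - min);
	return 0;
}
```

This only removes the division by zero and the unsigned wrap-around. The
second point stays: the lowest performance level is still reported as 0,
so the formula itself may need another look.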

[..]

Best regards,
Alexey

Re: [PATCH v4] ARM64: ACPI: Update documentation for latest specification version

2016-04-21 Thread Alexey Klimov

Hi Al,

I hope you don't mind if I put a few minor questions here.

On Mon, Apr 18, 2016 at 8:32 PM, Al Stone  wrote:
> The ACPI 6.1 specification was recently released at the end of January
> 2016, but the arm64 kernel documentation for the use of ACPI was written
> for the 5.1 version of the spec.  There were significant additions to the
> spec that had not yet been mentioned -- for example, the 6.0 mechanisms
> added to make it easier to define processors and low power idle states,
> as well as the 6.1 addition allowing regular interrupts (not just from
> GPIO) be used to signal ACPI general purpose events.
> 
> This patch reflects going back through and examining the specs in detail
> and updating content appropriately.  Whilst there, a few odds and ends of
> typos were caught as well.  This brings the documentation up to date with
> ACPI 6.1 for arm64.

Why is linux-acpi not in the recipient list?
 
> Changes for v4:
>-- Clarify that IORT can sometimes be optional (Jon Masters).
>-- Remove "Use as needed" descriptions of ACPI objects; they provide
>   no substantive information and doing so simplifies maintenance of
>   this document over time.  These have been replaced with a simpler
>   notice that states that unless otherwise noted, do what the APCI
>   specification says is needed.
>-- Corrected the _OSI object usage recommendation; it described kernel
>   behavior that does not exist (Al Stone).
> 
> Changes for v3:
>-- Clarify use of _LPI/_RDI (Vikas Sajjan)
>-- Whitespace cleanup as pointed out by checkpatch
> 
> Changes for v2:
>-- Clean up white space (Harb Abdulhahmid)
>-- Clarification on _CCA usage (Harb Abdulhamid)
>-- IORT moved to required from recommended (Hanjun Guo)
>-- Clarify IORT description (Hanjun Guo)
> 
> Signed-off-by: Al Stone 
> Cc: Catalin Marinas 
> Cc: Will Deacon 
> Cc: Jonathan Corbet 
> ---
>  Documentation/arm64/acpi_object_usage.txt | 347 
> --
>  Documentation/arm64/arm-acpi.txt  |  28 ++-
>  2 files changed, 212 insertions(+), 163 deletions(-)
> 
> diff --git a/Documentation/arm64/acpi_object_usage.txt
> b/Documentation/arm64/acpi_object_usage.txt
> index a6e1a18..3891750 100644
> --- a/Documentation/arm64/acpi_object_usage.txt
> +++ b/Documentation/arm64/acpi_object_usage.txt
> @@ -13,14 +13,18 @@ For ACPI on arm64, tables also fall into the
> following categories:

[..]

> == Memory-mapped ConFiGuration space ==
> @@ -176,14 +192,38 @@ MPST   Section 5.2.21 (signature == "MPST")
> == Memory Power State Table ==
> Optional, not currently supported.
> 
> +MSCT   Section 5.2.19 (signature == "MSCT")
> +   == Maximum System Characteristic Table ==
> +   Optional, not currently supported.
> +
>  MSDM   Signature Reserved (signature == "MSDM")
> == Microsoft Data Management table ==
> Microsoft only table, will not be supported.
> 
> -MSCT   Section 5.2.19 (signature == "MSCT")
> -   == Maximum System Characteristic Table ==
> +NFIT   Section 5.2.25 (signature == "NFIT")
> +   == NVDIMM Firmware Interface Table ==
> Optional, not currently supported.
> 
> +OEMx   Signature of "OEMx" only
> +   == OEM Specific Tables ==
> +   All tables starting with a signature of "OEM" are reserved for OEM
> +   use.  Since these are not meant to be of general use but are limited
> +   to very specific end users, they are not recommended for use and are
> +   not supported by the kernel for arm64.
> +
> +PCCT   Section 14.1 (signature == "PCCT)
> +   == Platform Communications Channel Table ==
> +   Recommend for use on arm64, and required when using CPPC to control
> +   power on the platform.

Could you please check the correctness of this sentence?

If I remember correctly, CPPC may operate via the PCC interface, but there is
no strict requirement to implement the control mechanism via PCC.

> using CPPC to control power on the platform

Sorry, I think I need to disagree.
The main description of CPPC says that CPPC defines a mechanism to manage the
performance of a logical processor.

What do you think about "to control performance on the platform"?
(or maybe "to control performance and power on the platform")

Thanks,
Alexey

Re: [PATCH] mailbox: pcc: Support HW-Reduced Communication Subspace Type 2

2016-04-19 Thread Alexey Klimov

Hi Hoan,

On Tue, Apr 5, 2016 at 11:14 PM, hotran  wrote:
> ACPI 6.1 has a HW-Reduced Communication Subspace Type 2 intended for
> use on HW-Reduce ACPI Platform, which requires read-modify-write sequence
> to acknowledge doorbell interrupt. This patch provides the implementation
> for the Communication Subspace Type 2.
>
> Signed-off-by: Hoan Tran 
> ---
>  drivers/mailbox/pcc.c | 384 
> +-
>  include/acpi/actbl3.h |   8 +-
>  2 files changed, 294 insertions(+), 98 deletions(-)
>
> diff --git a/drivers/mailbox/pcc.c b/drivers/mailbox/pcc.c
> index 0ddf638..4ed8153 100644
> --- a/drivers/mailbox/pcc.c
> +++ b/drivers/mailbox/pcc.c
> @@ -59,6 +59,7 @@
>  #include 
>  #include 
>  #include 
> +#include 
>  #include 
>  #include 
>  #include 
> @@ -68,27 +69,178 @@
>  #include "mailbox.h"
>
>  #define MAX_PCC_SUBSPACES  256
> +#define MBOX_IRQ_NAME  "pcc-mbox"
>
> -static struct mbox_chan *pcc_mbox_channels;
> +/**
> + * PCC mailbox channel information
> + *
> + * @chan:  Pointer to mailbox communication channel
> + * @pcc_doorbell_vaddr: PCC doorbell register address
> + * @pcc_doorbell_ack_vaddr: PCC doorbell ack register address
> + * @irq:   Interrupt number of the channel
> + */
> +struct pcc_mbox_chan {
> +   struct mbox_chan*chan;
> +   void __iomem*pcc_doorbell_vaddr;
> +   void __iomem*pcc_doorbell_ack_vaddr;
> +   int irq;
> +};
>
> -/* Array of cached virtual address for doorbell registers */
> -static void __iomem **pcc_doorbell_vaddr;
> +/**
> + * PCC mailbox controller data
> + *
> + * @mb_ctrl:   Representation of the communication channel controller
> + * @mbox_chan: Array of PCC mailbox channels of the controller
> + * @chans: Array of mailbox communication channels
> + */
> +struct pcc_mbox {
> +   struct mbox_controller  mbox_ctrl;
> +   struct pcc_mbox_chan*mbox_chans;
> +   struct mbox_chan*chans;
> +};
> +
> +static struct pcc_mbox pcc_mbox_ctx = {};
>
> -static struct mbox_controller pcc_mbox_ctrl = {};
>  /**
>   * get_pcc_channel - Given a PCC subspace idx, get
> - * the respective mbox_channel.
> + * the respective pcc mbox_channel.
>   * @id: PCC subspace index.
>   *
>   * Return: ERR_PTR(errno) if error, else pointer
> - * to mbox channel.
> + * to pcc mbox channel.
>   */
> -static struct mbox_chan *get_pcc_channel(int id)
> +static struct pcc_mbox_chan *get_pcc_channel(int id)
>  {
> -   if (id < 0 || id > pcc_mbox_ctrl.num_chans)
> +   if (id < 0 || id > pcc_mbox_ctx.mbox_ctrl.num_chans)
> return ERR_PTR(-ENOENT);
>
> -   return &pcc_mbox_channels[id];
> +   return &pcc_mbox_ctx.mbox_chans[id];
> +}
> +
> +/*
> + * PCC can be used with perf critical drivers such as CPPC
> + * So it makes sense to locally cache the virtual address and
> + * use it to read/write to PCC registers such as doorbell register
> + *
> + * The below read_register and write_registers are used to read and
> + * write from perf critical registers such as PCC doorbell register
> + */
> +static int read_register(void __iomem *vaddr, u64 *val, unsigned int 
> bit_width)
> +{
> +   int ret_val = 0;
> +
> +   switch (bit_width) {
> +   case 8:
> +   *val = readb(vaddr);
> +   break;
> +   case 16:
> +   *val = readw(vaddr);
> +   break;
> +   case 32:
> +   *val = readl(vaddr);
> +   break;
> +   case 64:
> +   *val = readq(vaddr);
> +   break;
> +   default:
> +   pr_debug("Error: Cannot read register of %u bit width",
> +   bit_width);
> +   ret_val = -EFAULT;
> +   break;
> +   }
> +   return ret_val;
> +}
> +
> +static int write_register(void __iomem *vaddr, u64 val, unsigned int 
> bit_width)
> +{
> +   int ret_val = 0;
> +
> +   switch (bit_width) {
> +   case 8:
> +   writeb(val, vaddr);
> +   break;
> +   case 16:
> +   writew(val, vaddr);
> +   break;
> +   case 32:
> +   writel(val, vaddr);
> +   break;
> +   case 64:
> +   writeq(val, vaddr);
> +   break;
> +   default:
> +   pr_debug("Error: Cannot write register of %u bit width",
> +   bit_width);
> +   ret_val = -EFAULT;
> +   break;
> +   }
> +   return ret_val;
> +}
> +
> +/**
> + * pcc_map_interrupt - Map a PCC subspace GSI to a linux IRQ number
> + * @interrupt: GSI number.
> + * @flags: interrupt flags
> + *
> + * Returns: a valid linux IRQ number on success
> + * 0 or -EINVAL on failure
> + */
> +static int pcc_map_interrupt(u32 interrupt, u32 flags)
> +{
> +   int trigger, polarity;
> +
> +   if (!interrupt)
> +   return 0;
> +
> +

[PATCH v2] watchdog: add driver for StreamLabs USB watchdog device

2016-04-17 Thread Alexey Klimov

This patch adds a new driver that supports the StreamLabs USB watchdog
device. The device plugs into a 9-pin USB header and connects to the
reset pin and reset button on a common PC.

The USB commands used to communicate with the device were reverse
engineered using usbmon.

Signed-off-by: Alexey Klimov <klimov.li...@gmail.com>
---
Changes in v2:
 -- coding style cleanups
 -- turn some dev_err messages to dev_dbg
 -- reimplemented usb_streamlabs_wdt_command() to use loop
 -- re-worked disconnect routine
 -- rebased to 4.6-rc2, removed set_timeout method
 -- removed braces in .options field in streamlabs_wdt_indent
 -- mem allocation migrated to devm_kzalloc
 -- buffer for device struct moved inside main struct
    to avoid additional memory allocation
 -- removed watchdog_init_timeout()
 -- re-worked usb_streamlabs_wdt_{resume,suspend}
 -- removed struct usb_device pointer from main driver struct
 -- buffer preparation for communication migrated to cpu_to_le16()
    functions, also buffer is filled in as u16 elements to
    make this byteorder usable
 -- added stop command in usb_streamlabs_wdt_disconnect()
 
 drivers/watchdog/Kconfig  |  15 ++
 drivers/watchdog/Makefile |   1 +
 drivers/watchdog/streamlabs_wdt.c | 313 ++
 3 files changed, 329 insertions(+)
 create mode 100644 drivers/watchdog/streamlabs_wdt.c

diff --git a/drivers/watchdog/Kconfig b/drivers/watchdog/Kconfig
index fb94765..130cf54 100644
--- a/drivers/watchdog/Kconfig
+++ b/drivers/watchdog/Kconfig
@@ -1766,4 +1766,19 @@ config USBPCWATCHDOG
 
  Most people will say N.
 
+config USB_STREAMLABS_WATCHDOG
+   tristate "StreamLabs USB watchdog driver"
+   depends on USB
+   ---help---
+ This is the driver for the USB Watchdog dongle from StreamLabs.
+ If you correctly connect reset pins to motherboard Reset pin and
+ to Reset button then this device will simply watch your kernel to make
+ sure it doesn't freeze, and if it does, it reboots your computer
+ after a certain amount of time.
+
+
+ To compile this driver as a module, choose M here: the
+ module will be called streamlabs_wdt.
+
+ Most people will say N. Say yes or M if you want to use such usb device.
 endif # WATCHDOG
diff --git a/drivers/watchdog/Makefile b/drivers/watchdog/Makefile
index feb6270..9d36929 100644
--- a/drivers/watchdog/Makefile
+++ b/drivers/watchdog/Makefile
@@ -25,6 +25,7 @@ obj-$(CONFIG_WDTPCI) += wdt_pci.o
 
 # USB-based Watchdog Cards
 obj-$(CONFIG_USBPCWATCHDOG) += pcwd_usb.o
+obj-$(CONFIG_USB_STREAMLABS_WATCHDOG) += streamlabs_wdt.o
 
 # ALPHA Architecture
 
diff --git a/drivers/watchdog/streamlabs_wdt.c b/drivers/watchdog/streamlabs_wdt.c
new file mode 100644
index 0000000..3e34cd8
--- /dev/null
+++ b/drivers/watchdog/streamlabs_wdt.c
@@ -0,0 +1,313 @@
+/*
+ * StreamLabs USB Watchdog driver
+ *
+ * Copyright (c) 2016 Alexey Klimov <klimov.li...@gmail.com>
+ *
+ * This program is free software; you may redistribute it and/or modify
+ * it under the terms of the GNU General Public License as published by
+ * the Free Software Foundation; either version 2 of the License, or
+ * (at your option) any later version.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ */
+
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+
+/*
+ * USB Watchdog device from Streamlabs
+ * http://www.stream-labs.com/products/devices/watchdog/
+ *
+ * USB commands have been reverse engineered using usbmon.
+ */
+
+#define DRIVER_AUTHOR "Alexey Klimov <klimov.li...@gmail.com>"
+#define DRIVER_DESC "StreamLabs USB watchdog driver"
+#define DRIVER_NAME "usb_streamlabs_wdt"
+
+MODULE_AUTHOR(DRIVER_AUTHOR);
+MODULE_DESCRIPTION(DRIVER_DESC);
+MODULE_LICENSE("GPL");
+
+#define USB_STREAMLABS_WATCHDOG_VENDOR 0x13c0
+#define USB_STREAMLABS_WATCHDOG_PRODUCT 0x0011
+
+/*
+ * one buffer is used for communication, however transmitted message is only
+ * 32 bytes long
+ */
+#define BUFFER_TRANSFER_LENGTH 32
+#define BUFFER_LENGTH          64
+#define USB_TIMEOUT            350
+
+#define STREAMLABS_CMD_START   0xaacc
+#define STREAMLABS_CMD_STOP    0xbbff
+
+#define STREAMLABS_WDT_MIN_TIMEOUT 1
+#define STREAMLABS_WDT_MAX_TIMEOUT 46
+
+struct streamlabs_wdt {
+   struct watchdog_device wdt_dev;
+   struct usb_interface *intf;
+
+   struct mutex lock;
+   u8 buffer[BUFFER_LENGTH];
+};
+
+static bool nowayout = WATCHDOG_NOWAYOUT;
+
+/*
+ * This function is used to check if watchdog actually changed
+ * its state to disabled that is reported in first two bytes of response
+ * message.
+ */
+static int usb_streamlabs_wdt_check_stop(u16 *buf)
+{
+   if (buf[0] != cpu_to_le16(STREAMLABS_CMD_STOP))
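
A note on the byte order mentioned in the v2 changelog: the command word is
stored little-endian, so STREAMLABS_CMD_START (0xaacc) goes on the wire as
the bytes 0xcc, 0xaa, and the stop check compares the first 16-bit word of
the response with the stop command. The sketch below shows that packing in
portable user-space C. It is a sketch only: the demo_* helpers are
hypothetical, the explicit byte stores stand in for cpu_to_le16(), and the
rest of the 32-byte message layout is not modelled here.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define DEMO_CMD_START 0xaacc	/* values as defined in the v2 patch */
#define DEMO_CMD_STOP  0xbbff
#define DEMO_MSG_LEN   32	/* transmitted message length from the patch */

/* Store a 16-bit value least-significant byte first, on any host. */
static void demo_put_le16(uint8_t *dst, uint16_t v)
{
	dst[0] = (uint8_t)(v & 0xff);
	dst[1] = (uint8_t)(v >> 8);
}

/* Load a 16-bit little-endian value, on any host. */
static uint16_t demo_get_le16(const uint8_t *src)
{
	return (uint16_t)(src[0] | (src[1] << 8));
}

/* Zero the message and place the command word in its first two bytes. */
static void demo_prepare_msg(uint8_t *msg, uint16_t cmd)
{
	assert(msg != NULL);
	memset(msg, 0, DEMO_MSG_LEN);
	demo_put_le16(msg, cmd);
}

/* Does the response start with the stop command word? */
static int demo_reply_is_stop(const uint8_t *reply)
{
	assert(reply != NULL);
	return demo_get_le16(reply) == DEMO_CMD_STOP;
}
```

Because the helpers work byte by byte, the result is the same on little- and
big-endian hosts, which is the property the changelog entry is after.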

Re: [PATCH] elevator: remove second argument in elevator_init()

2016-03-30 Thread Alexey Klimov

Hi all,

On Wed, Jan 27, 2016 at 9:01 PM, Jeff Moyer <jmo...@redhat.com> wrote:
> Alexey Klimov <klimov.li...@gmail.com> writes:
>
>> Last user of elevator_init() with non-NULL name as second argument
>> that supposed to be s390 dasd driver has gone few releases ago.
>> Drivers rely on elevator_change(), elevator_switch() and friends
>> for example. Right now elevator_init() is always called as
>> elevator_init(q, NULL).
>>
>> Patch removes passing of second name argument and its usage.
>>
>> While we're at it fix following if-check after removed lines. We know
>> that elevator_type e is initialized by NULL and need to check only
>> chosen_elevator.
>>
>> Signed-off-by: Alexey Klimov <klimov.li...@gmail.com>
>
> Reviewed-by: Jeff Moyer <jmo...@redhat.com>


What is the status of this patch? Is something wrong with it, are there any
concerns, or do I need to resend it?


Best regards,
Alexey


>> ---
>>  block/blk-core.c |  2 +-
>>  block/elevator.c | 10 ++
>>  include/linux/elevator.h |  2 +-
>>  3 files changed, 4 insertions(+), 10 deletions(-)
>>
>> diff --git a/block/blk-core.c b/block/blk-core.c
>> index 33e2f62..f742ef4 100644
>> --- a/block/blk-core.c
>> +++ b/block/blk-core.c
>> @@ -861,7 +861,7 @@ blk_init_allocated_queue(struct request_queue *q, request_fn_proc *rfn,
>>   mutex_lock(&q->sysfs_lock);
>>
>>   /* init elevator */
>> - if (elevator_init(q, NULL)) {
>> + if (elevator_init(q)) {
>>           mutex_unlock(&q->sysfs_lock);
>>           goto fail;
>>   }
>> diff --git a/block/elevator.c b/block/elevator.c
>> index c3555c9..ff5c830 100644
>> --- a/block/elevator.c
>> +++ b/block/elevator.c
>> @@ -177,7 +177,7 @@ static void elevator_release(struct kobject *kobj)
>>   kfree(e);
>>  }
>>
>> -int elevator_init(struct request_queue *q, char *name)
>> +int elevator_init(struct request_queue *q)
>>  {
>>   struct elevator_type *e = NULL;
>>   int err;
>> @@ -196,18 +196,12 @@ int elevator_init(struct request_queue *q, char *name)
>>   q->end_sector = 0;
>>   q->boundary_rq = NULL;
>>
>> - if (name) {
>> - e = elevator_get(name, true);
>> - if (!e)
>> - return -EINVAL;
>> - }
>> -
>>   /*
>>* Use the default elevator specified by config boot param or
>>* config option.  Don't try to load modules as we could be running
>>* off async and request_module() isn't allowed from async.
>>*/
>> - if (!e && *chosen_elevator) {
>> + if (*chosen_elevator) {
>>   e = elevator_get(chosen_elevator, false);
>>   if (!e)
>>   printk(KERN_ERR "I/O scheduler %s not found\n",
>> diff --git a/include/linux/elevator.h b/include/linux/elevator.h
>> index 638b324..0ae0efd 100644
>> --- a/include/linux/elevator.h
>> +++ b/include/linux/elevator.h
>> @@ -154,7 +154,7 @@ extern void elv_unregister(struct elevator_type *);
>>  extern ssize_t elv_iosched_show(struct request_queue *, char *);
>>  extern ssize_t elv_iosched_store(struct request_queue *, const char *, size_t);
>>
>> -extern int elevator_init(struct request_queue *, char *);
>> +extern int elevator_init(struct request_queue *);
>>  extern void elevator_exit(struct elevator_queue *);
>>  extern int elevator_change(struct request_queue *, const char *);
>>  extern bool elv_rq_merge_ok(struct request *, struct bio *);
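
A note on the simplified if-check in the quoted patch: once the "if (name)"
block is gone, e is still NULL when the default-elevator check is reached, so
"!e && *chosen_elevator" and "*chosen_elevator" always agree. The sketch
below spells that out in user space; the demo_* functions are hypothetical
stand-ins and do not call any block-layer code.

```c
#include <assert.h>
#include <stddef.h>

/* Condition before the patch; 'e' has not been assigned yet, so it is NULL. */
static int demo_old_check(const void *e, const char *chosen)
{
	assert(chosen != NULL);
	return !e && *chosen != '\0';
}

/* Condition after the patch: only the chosen name is inspected. */
static int demo_new_check(const char *chosen)
{
	assert(chosen != NULL);
	return *chosen != '\0';
}
```

With e fixed at NULL, both functions return 1 for a non-empty name and 0 for
an empty one, which is why dropping "!e" does not change behaviour.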

Re: [PATCH] watchdog: add driver for StreamLabs USB watchdog device

2016-03-15 Thread Alexey Klimov

Hi Guenter,

On Tue, Mar 15, 2016 at 2:24 AM, Guenter Roeck <li...@roeck-us.net> wrote:
> Hi Alexey,
>
>
> On 03/14/2016 06:02 PM, Alexey Klimov wrote:
>>
>> Hi Guenter,
>>
>> On Thu, Mar 10, 2016 at 3:54 AM, Guenter Roeck <li...@roeck-us.net> wrote:
>>>
>>> On 03/09/2016 06:29 PM, Alexey Klimov wrote:
>>>>
>>>>
>>>> This patch creates new driver that supports StreamLabs usb watchdog
>>>> device. This device plugs into 9-pin usb header and connects to
>>>> reset pin and reset button on common PC.
>>>>
>>>> USB commands used to communicate with device were reverse
>>>> engineered using usbmon.
>>>>
>>>> Signed-off-by: Alexey Klimov <klimov.li...@gmail.com>
>>>> ---
>>>>drivers/watchdog/Kconfig  |  15 ++
>>>>drivers/watchdog/Makefile |   1 +
>>>>drivers/watchdog/streamlabs_wdt.c | 370
>>>> ++
>>>>3 files changed, 386 insertions(+)

[...]

>>>> +static int usb_streamlabs_wdt_command(struct watchdog_device *wdt_dev,
>>>> int cmd)
>>>> +{
>>>> +   struct streamlabs_wdt *streamlabs_wdt =
>>>> watchdog_get_drvdata(wdt_dev);
>>>> +   int retval;
>>>> +   int size;
>>>> +   unsigned long timeout_msec;
>>>> +   int retry_counter = 10; /* how many times to re-send
>>>> stop
>>>> cmd */
>>>> +
>>>> +   mutex_lock(&streamlabs_wdt->lock);
>>>> +
>>>> +   timeout_msec = wdt_dev->timeout * MSEC_PER_SEC;
>>>> +
>>>> +   /* Prepare message that will be sent to device.
>>>> +* This buffer is allocated by kzalloc(). Only initialize
>>>> required
>>>> +* fields.
>>>
>>>
>>>
>>> But only once, and overwritten by the response. So the comment is quite
>>> pointless
>>> and misleading.
>>
>>
>> Ok, I will do something with this comment during re-work and rebase.
>>
>>>> +*/
>>>> +   if (cmd == STREAMLABS_CMD_START) {
>>>> +   streamlabs_wdt->buffer[0] = 0xcc;
>>>> +   streamlabs_wdt->buffer[1] = 0xaa;
>>>> +   } else {/* assume stop command if it's not start */
>>>> +   streamlabs_wdt->buffer[0] = 0xff;
>>>> +   streamlabs_wdt->buffer[1] = 0xbb;
>>>> +   }
>>>> +
>>>> +   streamlabs_wdt->buffer[3] = 0x80;
>>>> +
>>>> +   streamlabs_wdt->buffer[6] = (timeout_msec & 0xff) << 8;
>>>> +   streamlabs_wdt->buffer[7] = (timeout_msec & 0xff00) >> 8;
>>>> +retry:
>>>> +   streamlabs_wdt->buffer[10] = 0x00;
>>>> +   streamlabs_wdt->buffer[11] = 0x00;
>>>> +   streamlabs_wdt->buffer[12] = 0x00;
>>>> +   streamlabs_wdt->buffer[13] = 0x00;
>>>> +
>>>> +   /* send command to watchdog */
>>>> +   retval = usb_interrupt_msg(streamlabs_wdt->usbdev,
>>>> +   usb_sndintpipe(streamlabs_wdt->usbdev, 0x02),
>>>> +   streamlabs_wdt->buffer, BUFFER_TRANSFER_LENGTH,
>>>> +   &size, USB_TIMEOUT);
>>>> +
>>>> +   if (retval || size != BUFFER_TRANSFER_LENGTH) {
>>>> +   dev_err(&streamlabs_wdt->intf->dev,
>>>> +   "error %i when submitting interrupt msg\n", retval);
>>>
>>>
>>>
>>> Please no error messages if something goes wrong. We don't want to
>>> fill the kernel log with those messages.
>>
>>
>> Ok, will remove them. Or is it fine to convert them to dev_dbg?
>>
>
> If you think the messages might be useful for debugging, sure.

Well, they definitely help me now.

>>>> +   retval = -EIO;
>>>> +   goto out;
>>>> +   }
>>>> +
>>>> +   /* and read response from watchdog */
>>>> +   retval = usb_interrupt_msg(streamlabs_wdt->usbdev,
>>>> +   usb_rcvintpipe(streamlabs_wdt->usbdev, 0x81),
>>>> +   streamlabs_wdt->buffer, BUFFER_LENGTH,
>>>> +   &size, USB_TIMEOUT)
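
A note on the retry logic quoted above: the v1 code re-sends the command via
a "retry:" label with retry_counter set to 10. A bounded loop expresses the
same idea without a backward goto. The sketch below is a user-space
illustration only: demo_xfer_fn and demo_flaky_xfer are hypothetical
stand-ins for the USB transfer, and returning -EIO after the last attempt is
an assumption modelled on the error path in the quoted code.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

#define DEMO_MAX_RETRIES 10	/* same bound as retry_counter in the quoted code */

/* Hypothetical transport callback: returns 0 once the device accepts the command. */
typedef int (*demo_xfer_fn)(void *ctx);

/* Try the transfer up to DEMO_MAX_RETRIES times, stopping at the first success. */
static int demo_send_with_retries(demo_xfer_fn xfer, void *ctx)
{
	int attempt;

	assert(xfer != NULL);

	for (attempt = 0; attempt < DEMO_MAX_RETRIES; attempt++) {
		if (xfer(ctx) == 0)
			return 0;
	}
	return -EIO;
}

/* Fake device: fails as many times as the counter behind 'ctx' says, then succeeds. */
static int demo_flaky_xfer(void *ctx)
{
	int *failures_left = ctx;

	if (*failures_left > 0) {
		(*failures_left)--;
		return -EIO;
	}
	return 0;
}
```

A device that fails three times is accepted on the fourth attempt; one that
fails ten times in a row exhausts the loop and the caller gets -EIO.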

Re: [PATCH] watchdog: add driver for StreamLabs USB watchdog device

2016-03-15 Thread Alexey Klimov

Hi Oliver,

On Thu, Mar 10, 2016 at 9:23 AM, Oliver Neukum <oneu...@suse.com> wrote:
> On Thu, 2016-03-10 at 02:29 +0000, Alexey Klimov wrote:
>> This patch creates new driver that supports StreamLabs usb watchdog
>> device. This device plugs into 9-pin usb header and connects to
>> reset pin and reset button on common PC.
>
> Hi,
>
> a few remarks.
>
> Regards
> Oliver
>
>>
>> USB commands used to communicate with device were reverse
>> engineered using usbmon.
>>
>> Signed-off-by: Alexey Klimov <klimov.li...@gmail.com>
>> ---
>>  drivers/watchdog/Kconfig  |  15 ++
>>  drivers/watchdog/Makefile |   1 +
>>  drivers/watchdog/streamlabs_wdt.c | 370 
>> ++
>>  3 files changed, 386 insertions(+)
>>  create mode 100644 drivers/watchdog/streamlabs_wdt.c
>>
>> diff --git a/drivers/watchdog/Kconfig b/drivers/watchdog/Kconfig
>> index 80825a7..95d8f72 100644
>> --- a/drivers/watchdog/Kconfig
>> +++ b/drivers/watchdog/Kconfig
>> @@ -1705,4 +1705,19 @@ config USBPCWATCHDOG
>>
>> Most people will say N.
>>
>> +config USB_STREAMLABS_WATCHDOG
>> + tristate "StreamLabs USB watchdog driver"
>> + depends on USB
>> + ---help---
>> +   This is the driver for the USB Watchdog dongle from StreamLabs.
>> +   If you correctly connect reset pins to motherboard Reset pin and
>> +   to Reset button then this device will simply watch your kernel to 
>> make
>> +   sure it doesn't freeze, and if it does, it reboots your computer
>> +   after a certain amount of time.
>> +
>> +
>> +   To compile this driver as a module, choose M here: the
>> +   module will be called streamlabs_wdt.
>> +
>> +   Most people will say N. Say yes or M if you want to use such usb 
>> device.
>>  endif # WATCHDOG
>> diff --git a/drivers/watchdog/Makefile b/drivers/watchdog/Makefile
>> index f6a6a38..d54fd31 100644
>> --- a/drivers/watchdog/Makefile
>> +++ b/drivers/watchdog/Makefile
>> @@ -25,6 +25,7 @@ obj-$(CONFIG_WDTPCI) += wdt_pci.o
>>
>>  # USB-based Watchdog Cards
>>  obj-$(CONFIG_USBPCWATCHDOG) += pcwd_usb.o
>> +obj-$(CONFIG_USB_STREAMLABS_WATCHDOG) += streamlabs_wdt.o
>>
>>  # ALPHA Architecture
>>
>> diff --git a/drivers/watchdog/streamlabs_wdt.c b/drivers/watchdog/streamlabs_wdt.c
>> new file mode 100644
>> index 0000000..031dbc35
>> --- /dev/null
>> +++ b/drivers/watchdog/streamlabs_wdt.c
>> @@ -0,0 +1,370 @@
>> +/*
>> + * StreamLabs USB Watchdog driver
>> + *
>> + * Copyright (c) 2016 Alexey Klimov <klimov.li...@gmail.com>
>> + *
>> + * This program is free software; you may redistribute it and/or modify
>> + * it under the terms of the GNU General Public License as published by
>> + * the Free Software Foundation; either version 2 of the License, or
>> + * (at your option) any later version.
>> + *
>> + * This program is distributed in the hope that it will be useful,
>> + * but WITHOUT ANY WARRANTY; without even the implied warranty of
>> + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
>> + * GNU General Public License for more details.
>> + */
>> +
>> +#include 
>> +#include 
>> +#include 
>> +#include 
>> +#include 
>> +#include 
>> +#include 
>> +
>> +/*
>> + * USB Watchdog device from Streamlabs
>> + * http://www.stream-labs.com/products/devices/watchdog/
>> + *
>> + * USB commands have been reverse engineered using usbmon.
>> + */
>> +
>> +#define DRIVER_AUTHOR "Alexey Klimov <klimov.li...@gmail.com>"
>> +#define DRIVER_DESC "StreamLabs USB watchdog driver"
>> +#define DRIVER_NAME "usb_streamlabs_wdt"
>> +
>> +MODULE_AUTHOR(DRIVER_AUTHOR);
>> +MODULE_DESCRIPTION(DRIVER_DESC);
>> +MODULE_LICENSE("GPL");
>> +
>> +#define USB_STREAMLABS_WATCHDOG_VENDOR   0x13c0
>> +#define USB_STREAMLABS_WATCHDOG_PRODUCT  0x0011
>> +
>> +/* one buffer is used for communication, however transmitted message is only
>> + * 32 bytes long */
>> +#define BUFFER_TRANSFER_LENGTH   32
>> +#define BUFFER_LENGTH            64
>> +#define USB_TIMEOUT              350
>> +
>> +#define STREAMLABS_CMD_START 0
>> +#define STREAMLABS_CMD_STOP  1
>> +
>> +#define STREAMLABS_WDT_MIN_TIMEOUT   1
>> +#define STREAMLABS_WDT_MAX_TIMEOUT   46
>> +
>> +struct streamlabs_wdt {
>> + struct watchdog_device wdt_dev;
>> +

Re: [PATCH] watchdog: add driver for StreamLabs USB watchdog device

2016-03-14 Thread Alexey Klimov

Hi Guenter,

On Thu, Mar 10, 2016 at 3:54 AM, Guenter Roeck <li...@roeck-us.net> wrote:
> On 03/09/2016 06:29 PM, Alexey Klimov wrote:
>>
>> This patch creates new driver that supports StreamLabs usb watchdog
>> device. This device plugs into 9-pin usb header and connects to
>> reset pin and reset button on common PC.
>>
>> USB commands used to communicate with device were reverse
>> engineered using usbmon.
>>
>> Signed-off-by: Alexey Klimov <klimov.li...@gmail.com>
>> ---
>>   drivers/watchdog/Kconfig  |  15 ++
>>   drivers/watchdog/Makefile |   1 +
>>   drivers/watchdog/streamlabs_wdt.c | 370
>> ++
>>   3 files changed, 386 insertions(+)
>>   create mode 100644 drivers/watchdog/streamlabs_wdt.c
>>
>> diff --git a/drivers/watchdog/Kconfig b/drivers/watchdog/Kconfig
>> index 80825a7..95d8f72 100644
>> --- a/drivers/watchdog/Kconfig
>> +++ b/drivers/watchdog/Kconfig
>> @@ -1705,4 +1705,19 @@ config USBPCWATCHDOG
>>
>>   Most people will say N.
>>
>> +config USB_STREAMLABS_WATCHDOG
>> +   tristate "StreamLabs USB watchdog driver"
>> +   depends on USB
>> +   ---help---
>> + This is the driver for the USB Watchdog dongle from StreamLabs.
>> + If you correctly connect reset pins to motherboard Reset pin and
>> + to Reset button then this device will simply watch your kernel
>> to make
>> + sure it doesn't freeze, and if it does, it reboots your computer
>> + after a certain amount of time.
>> +
>> +
>> + To compile this driver as a module, choose M here: the
>> + module will be called streamlabs_wdt.
>> +
>> + Most people will say N. Say yes or M if you want to use such usb
>> device.
>>   endif # WATCHDOG
>> diff --git a/drivers/watchdog/Makefile b/drivers/watchdog/Makefile
>> index f6a6a38..d54fd31 100644
>> --- a/drivers/watchdog/Makefile
>> +++ b/drivers/watchdog/Makefile
>> @@ -25,6 +25,7 @@ obj-$(CONFIG_WDTPCI) += wdt_pci.o
>>
>>   # USB-based Watchdog Cards
>>   obj-$(CONFIG_USBPCWATCHDOG) += pcwd_usb.o
>> +obj-$(CONFIG_USB_STREAMLABS_WATCHDOG) += streamlabs_wdt.o
>>
>>   # ALPHA Architecture
>>
>> diff --git a/drivers/watchdog/streamlabs_wdt.c
>> b/drivers/watchdog/streamlabs_wdt.c
>> new file mode 100644
>> index 000..031dbc35
>> --- /dev/null
>> +++ b/drivers/watchdog/streamlabs_wdt.c
>> @@ -0,0 +1,370 @@
>> +/*
>> + * StreamLabs USB Watchdog driver
>> + *
>> + * Copyright (c) 2016 Alexey Klimov <klimov.li...@gmail.com>
>> + *
>> + * This program is free software; you may redistribute it and/or modify
>> + * it under the terms of the GNU General Public License as published by
>> + * the Free Software Foundation; either version 2 of the License, or
>> + * (at your option) any later version.
>> + *
>> + * This program is distributed in the hope that it will be useful,
>> + * but WITHOUT ANY WARRANTY; without even the implied warranty of
>> + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
>> + * GNU General Public License for more details.
>> + */
>> +
>> +#include 
>> +#include 
>> +#include 
>> +#include 
>> +#include 
>> +#include 
>> +#include 
>> +
>> +/*
>> + * USB Watchdog device from Streamlabs
>> + * http://www.stream-labs.com/products/devices/watchdog/
>> + *
>> + * USB commands have been reverse engineered using usbmon.
>> + */
>> +
>> +#define DRIVER_AUTHOR "Alexey Klimov <klimov.li...@gmail.com>"
>> +#define DRIVER_DESC "StreamLabs USB watchdog driver"
>> +#define DRIVER_NAME "usb_streamlabs_wdt"
>> +
>> +MODULE_AUTHOR(DRIVER_AUTHOR);
>> +MODULE_DESCRIPTION(DRIVER_DESC);
>> +MODULE_LICENSE("GPL");
>> +
>> +#define USB_STREAMLABS_WATCHDOG_VENDOR 0x13c0
>> +#define USB_STREAMLABS_WATCHDOG_PRODUCT 0x0011
>> +
>> +/* one buffer is used for communication, however transmitted message is
>> only
>> + * 32 bytes long */
>
>
> /*
>  * Please use proper multi-line comments throughout.
>
>  */

Ok, will fix them all.


>> +#define BUFFER_TRANSFER_LENGTH 32
>> +#define BUFFER_LENGTH  64
>> +#define USB_TIMEOUT 350
>> +
>> +#define STREAMLABS_CMD_START   0
>> +#define STREAMLABS_CMD_STOP 1
>> +
>> +#define STREAMLABS_WDT_MIN_TIMEOUT 1
>> +#define STREAMLABS_WDT_MAX_TIMEOUT 46
>> +
>> +struct streamlabs_wdt {
>> +   struct

[PATCH] watchdog: add driver for StreamLabs USB watchdog device

2016-03-09 Thread Alexey Klimov

This patch adds a new driver that supports the StreamLabs USB watchdog
device. The device plugs into a 9-pin USB header and connects to the
reset pin and reset button on a common PC.

The USB commands used to communicate with the device were reverse
engineered using usbmon.

Signed-off-by: Alexey Klimov <klimov.li...@gmail.com>
---
 drivers/watchdog/Kconfig  |  15 ++
 drivers/watchdog/Makefile |   1 +
 drivers/watchdog/streamlabs_wdt.c | 370 ++
 3 files changed, 386 insertions(+)
 create mode 100644 drivers/watchdog/streamlabs_wdt.c

diff --git a/drivers/watchdog/Kconfig b/drivers/watchdog/Kconfig
index 80825a7..95d8f72 100644
--- a/drivers/watchdog/Kconfig
+++ b/drivers/watchdog/Kconfig
@@ -1705,4 +1705,19 @@ config USBPCWATCHDOG
 
  Most people will say N.
 
+config USB_STREAMLABS_WATCHDOG
+   tristate "StreamLabs USB watchdog driver"
+   depends on USB
+   ---help---
+ This is the driver for the USB Watchdog dongle from StreamLabs.
+ If you correctly connect reset pins to motherboard Reset pin and
+ to Reset button then this device will simply watch your kernel to make
+ sure it doesn't freeze, and if it does, it reboots your computer
+ after a certain amount of time.
+
+
+ To compile this driver as a module, choose M here: the
+ module will be called streamlabs_wdt.
+
+ Most people will say N. Say yes or M if you want to use such usb 
device.
 endif # WATCHDOG
diff --git a/drivers/watchdog/Makefile b/drivers/watchdog/Makefile
index f6a6a38..d54fd31 100644
--- a/drivers/watchdog/Makefile
+++ b/drivers/watchdog/Makefile
@@ -25,6 +25,7 @@ obj-$(CONFIG_WDTPCI) += wdt_pci.o
 
 # USB-based Watchdog Cards
 obj-$(CONFIG_USBPCWATCHDOG) += pcwd_usb.o
+obj-$(CONFIG_USB_STREAMLABS_WATCHDOG) += streamlabs_wdt.o
 
 # ALPHA Architecture
 
diff --git a/drivers/watchdog/streamlabs_wdt.c 
b/drivers/watchdog/streamlabs_wdt.c
new file mode 100644
index 000..031dbc35
--- /dev/null
+++ b/drivers/watchdog/streamlabs_wdt.c
@@ -0,0 +1,370 @@
+/*
+ * StreamLabs USB Watchdog driver
+ *
+ * Copyright (c) 2016 Alexey Klimov <klimov.li...@gmail.com>
+ *
+ * This program is free software; you may redistribute it and/or modify
+ * it under the terms of the GNU General Public License as published by
+ * the Free Software Foundation; either version 2 of the License, or
+ * (at your option) any later version.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ */
+
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+
+/*
+ * USB Watchdog device from Streamlabs
+ * http://www.stream-labs.com/products/devices/watchdog/
+ *
+ * USB commands have been reverse engineered using usbmon.
+ */
+
+#define DRIVER_AUTHOR "Alexey Klimov <klimov.li...@gmail.com>"
+#define DRIVER_DESC "StreamLabs USB watchdog driver"
+#define DRIVER_NAME "usb_streamlabs_wdt"
+
+MODULE_AUTHOR(DRIVER_AUTHOR);
+MODULE_DESCRIPTION(DRIVER_DESC);
+MODULE_LICENSE("GPL");
+
+#define USB_STREAMLABS_WATCHDOG_VENDOR 0x13c0
+#define USB_STREAMLABS_WATCHDOG_PRODUCT 0x0011
+
+/* one buffer is used for communication, however transmitted message is only
+ * 32 bytes long */
+#define BUFFER_TRANSFER_LENGTH 32
+#define BUFFER_LENGTH  64
+#define USB_TIMEOUT 350
+
+#define STREAMLABS_CMD_START   0
+#define STREAMLABS_CMD_STOP 1
+
+#define STREAMLABS_WDT_MIN_TIMEOUT 1
+#define STREAMLABS_WDT_MAX_TIMEOUT 46
+
+struct streamlabs_wdt {
+   struct watchdog_device wdt_dev;
+   struct usb_device *usbdev;
+   struct usb_interface *intf;
+
+   struct kref kref;
+   struct mutex lock;
+   u8 *buffer;
+};
+
+static bool nowayout = WATCHDOG_NOWAYOUT;
+
+static int usb_streamlabs_wdt_validate_response(u8 *buf)
+{
+   /* If watchdog device understood the command it will acknowledge
+* with values 1,2,3,4 at indexes 10, 11, 12, 13 in response message.
+*/
+   if (buf[10] != 1 || buf[11] != 2 || buf[12] != 3 || buf[13] != 4)
+   return -EINVAL;
+
+   return 0;
+}
+
+static int usb_streamlabs_wdt_command(struct watchdog_device *wdt_dev, int cmd)
+{
+   struct streamlabs_wdt *streamlabs_wdt = watchdog_get_drvdata(wdt_dev);
+   int retval;
+   int size;
+   unsigned long timeout_msec;
+   int retry_counter = 10; /* how many times to re-send stop cmd */
+
+   mutex_lock(&streamlabs_wdt->lock);
+
+   timeout_msec = wdt_dev->timeout * MSEC_PER_SEC;
+
+   /* Prepare message that will be sent to device.
+* This buffer is allocated by kzalloc(). Only initialize required
+* fields.
+*/
+   if (cmd == STREAMLABS_CMD_START) {
+   streamlabs_wdt->buffer[0] = 0xcc;
+   streamlabs_wdt->buffer[1]
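
The acknowledgement check described in the patch (bytes 1, 2, 3, 4 at indexes 10 to 13 of the response) can be tried out in user space. The following is only a minimal sketch, not the driver's code: the name validate_response(), the ACK_* macros and the explicit length argument are assumptions made for illustration, since the driver itself always passes its own fixed-size buffer.

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

#define ACK_OFFSET 10	/* index of the first acknowledgement byte */
#define ACK_LEN    4	/* bytes 10..13 must read 1, 2, 3, 4 */

/*
 * User-space sketch of the acknowledgement check.
 * The len parameter is a hypothetical addition; it guards against
 * reading past a response shorter than 14 bytes.
 */
static int validate_response(const uint8_t *buf, size_t len)
{
	static const uint8_t ack[ACK_LEN] = { 1, 2, 3, 4 };
	size_t i;

	if (len < ACK_OFFSET + ACK_LEN)
		return -EINVAL;

	for (i = 0; i < ACK_LEN; i++)
		if (buf[ACK_OFFSET + i] != ack[i])
			return -EINVAL;

	return 0;
}
```

A zero-filled 32-byte response is rejected with -EINVAL, and the same buffer is accepted once indexes 10 to 13 hold 1, 2, 3, 4.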

[PATCH RESEND] mailbox: pcc: fix channel calculation in get_pcc_channel()

2016-02-02 Thread Alexey Klimov

This patch fixes the calculation of pcc_chan for non-zero id.
A cast binds tighter than '+', so pcc_mbox_channels is first cast
to (unsigned long) and then to (struct mbox_chan *), and only then
is the byte offset added. That offset is therefore scaled by
sizeof(struct mbox_chan) a second time, which yields an address
outside of the array and crashes the kernel.

We could add parentheses and make it:

pcc_chan = (struct mbox_chan *)
((unsigned long) pcc_mbox_channels +
(id * sizeof(*pcc_chan)));

but let's go with the array approach here and use id as the index.

Tested on Juno board.

Acked-by: Sudeep Holla 
Acked-by: Ashwin Chaugule 
Signed-off-by: Alexey Klimov 
---
 drivers/mailbox/pcc.c | 8 +---
 1 file changed, 1 insertion(+), 7 deletions(-)

diff --git a/drivers/mailbox/pcc.c b/drivers/mailbox/pcc.c
index 45d85ae..8f779a1 100644
--- a/drivers/mailbox/pcc.c
+++ b/drivers/mailbox/pcc.c
@@ -81,16 +81,10 @@ static struct mbox_controller pcc_mbox_ctrl = {};
  */
 static struct mbox_chan *get_pcc_channel(int id)
 {
-   struct mbox_chan *pcc_chan;
-
if (id < 0 || id > pcc_mbox_ctrl.num_chans)
return ERR_PTR(-ENOENT);
 
-   pcc_chan = (struct mbox_chan *)
-   (unsigned long) pcc_mbox_channels +
-   (id * sizeof(*pcc_chan));
-
-   return pcc_chan;
+   return &pcc_mbox_channels[id];
 }
 
 /**
-- 
1.9.1
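
The precedence problem fixed by this patch can be reproduced outside the kernel. The sketch below is an illustration under stated assumptions: struct chan is a hypothetical stand-in for struct mbox_chan, channels[] stands in for pcc_mbox_channels, and uintptr_t replaces the kernel's unsigned long so the cast is portable. The array is deliberately oversized so that the miscomputed pointer still lands inside it and can be inspected safely.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for struct mbox_chan; only its size matters. */
struct chan {
	int id;
	void *priv;
};

/* Oversized on purpose so the buggy result stays inside the array. */
static struct chan channels[4 * sizeof(struct chan)];

/*
 * Same shape as the removed code: both casts apply to 'channels'
 * before the '+', so the byte offset is added to a struct chan *
 * and scaled by sizeof(struct chan) once more.
 */
static struct chan *buggy_channel(int id)
{
	struct chan *c;

	c = (struct chan *)
		(uintptr_t) channels +
		(id * sizeof(*c));

	return c;
}

/* Array approach taken by the patch: id is used as the index. */
static struct chan *fixed_channel(int id)
{
	return &channels[id];
}
```

For id 0 both functions return the same element, which is why the bug only shows up for non-zero id. For id 1, fixed_channel() returns element 1 while buggy_channel() returns element sizeof(struct chan).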
