Re: [PATCH] x86/x2apic: fix x2apic_cluster_probe null pointer

2016-08-11 Thread Wanpeng Li
2016-08-12 13:30 GMT+08:00 Huaitong Han :
> percpu cpus_in_cluster is a pointer with CONFIG_CPUMASK_OFFSTACK, 
> cpu_in_cluster
> is not alloced memory before x2apic_prepare_cpu is invoked, so cpumask_set_cpu
> would get a NULL pointer bug:
>  BUG: unable to handle kernel NULL pointer dereference at (null)
>  IP: [] x2apic_cluster_probe+0x35/0x70

Could you try this? https://lkml.org/lkml/2016/8/4/345

Regards,
Wanpeng Li


Re: [PATCH v1] x86/power/64: Restore processor state before using per-cpu variables

2016-08-11 Thread Borislav Petkov
On Thu, Aug 11, 2016 at 02:49:29PM -0700, Thomas Garnier wrote:
> Restore the processor state before calling any other function to ensure
> per-cpu variables can be used with KASLR memory randomization.
> 
> Tracing functions use per-cpu variables (gs based) and one was called
> just before restoring the processor state fully. It resulted in a double
> fault when both the tracing & the exception handler functions tried to
> use a per-cpu variable.
> 
> Signed-off-by: Thomas Garnier 
> ---
> Based on next-20160808

Ok, I believe before I test this, I need to apply another patch from
Rafael. I think it is the "Always create temporary identity mapping
correctly" thing.

Yes, no?

Rafael, can you please apply everything on a test branch for us to run?

Thanks.

-- 
Regards/Gruss,
Boris.

ECO tip #101: Trim your mails when you reply.

SUSE Linux GmbH, GF: Felix Imendörffer, Jane Smithard, Graham Norton, HRB 21284 
(AG Nürnberg)
--


Re: [PATCH] x86/x2apic: fix x2apic_cluster_probe null pointer

2016-08-11 Thread Wanpeng Li
2016-08-12 13:30 GMT+08:00 Huaitong Han :
> percpu cpus_in_cluster is a pointer with CONFIG_CPUMASK_OFFSTACK, 
> cpu_in_cluster
> is not alloced memory before x2apic_prepare_cpu is invoked, so cpumask_set_cpu
> would get a NULL pointer bug:
>  BUG: unable to handle kernel NULL pointer dereference at (null)
>  IP: [] x2apic_cluster_probe+0x35/0x70

Could you try this? https://lkml.org/lkml/2016/8/4/345

Regards,
Wanpeng Li


Re: [PATCH v1] x86/power/64: Restore processor state before using per-cpu variables

2016-08-11 Thread Borislav Petkov
On Thu, Aug 11, 2016 at 02:49:29PM -0700, Thomas Garnier wrote:
> Restore the processor state before calling any other function to ensure
> per-cpu variables can be used with KASLR memory randomization.
> 
> Tracing functions use per-cpu variables (gs based) and one was called
> just before restoring the processor state fully. It resulted in a double
> fault when both the tracing & the exception handler functions tried to
> use a per-cpu variable.
> 
> Signed-off-by: Thomas Garnier 
> ---
> Based on next-20160808

Ok, I believe before I test this, I need to apply another patch from
Rafael. I think it is the "Always create temporary identity mapping
correctly" thing.

Yes, no?

Rafael, can you please apply everything on a test branch for us to run?

Thanks.

-- 
Regards/Gruss,
Boris.

ECO tip #101: Trim your mails when you reply.

SUSE Linux GmbH, GF: Felix Imendörffer, Jane Smithard, Graham Norton, HRB 21284 
(AG Nürnberg)
--


Re: [PATCH] uprobes/x86: fix rip-relative handling of EVEX-encoded insns.

2016-08-11 Thread Srikar Dronamraju
* Denys Vlasenko  [2016-08-11 17:45:21]:

> Since instruction decoder now supports EVEX-encoded insns, two fixes are 
> needed
> to correctly handle them in uprobes.

Could you please add the commit that helps support the insns in the
Changelog.

Rest all looks okay.


Looks good to me.

Acked-by: Srikar Dronamraju 

-- 
Thanks and Regards
Srikar Dronamraju



Re: [PATCH] uprobes/x86: fix rip-relative handling of EVEX-encoded insns.

2016-08-11 Thread Srikar Dronamraju
* Denys Vlasenko  [2016-08-11 17:45:21]:

> Since instruction decoder now supports EVEX-encoded insns, two fixes are 
> needed
> to correctly handle them in uprobes.

Could you please add the commit that helps support the insns in the
Changelog.

Rest all looks okay.


Looks good to me.

Acked-by: Srikar Dronamraju 

-- 
Thanks and Regards
Srikar Dronamraju



[PATCH] x86/x2apic: fix x2apic_cluster_probe null pointer

2016-08-11 Thread Huaitong Han
percpu cpus_in_cluster is a pointer with CONFIG_CPUMASK_OFFSTACK, cpu_in_cluster
is not alloced memory before x2apic_prepare_cpu is invoked, so cpumask_set_cpu
would get a NULL pointer bug:
 BUG: unable to handle kernel NULL pointer dereference at (null)
 IP: [] x2apic_cluster_probe+0x35/0x70

Signed-off-by: Huaitong Han 
---
 arch/x86/kernel/apic/x2apic_cluster.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/arch/x86/kernel/apic/x2apic_cluster.c 
b/arch/x86/kernel/apic/x2apic_cluster.c
index 6368fa6..6acb055 100644
--- a/arch/x86/kernel/apic/x2apic_cluster.c
+++ b/arch/x86/kernel/apic/x2apic_cluster.c
@@ -190,9 +190,10 @@ static int x2apic_cluster_probe(void)
if (!x2apic_mode)
return 0;
 
-   cpumask_set_cpu(cpu, per_cpu(cpus_in_cluster, cpu));
cpuhp_setup_state(CPUHP_X2APIC_PREPARE, "X2APIC_PREPARE",
  x2apic_prepare_cpu, x2apic_dead_cpu);
+   cpumask_set_cpu(cpu, per_cpu(cpus_in_cluster, cpu));
+
return 1;
 }
 
-- 
1.8.3.1



[PATCH] x86/x2apic: fix x2apic_cluster_probe null pointer

2016-08-11 Thread Huaitong Han
percpu cpus_in_cluster is a pointer with CONFIG_CPUMASK_OFFSTACK, cpu_in_cluster
is not alloced memory before x2apic_prepare_cpu is invoked, so cpumask_set_cpu
would get a NULL pointer bug:
 BUG: unable to handle kernel NULL pointer dereference at (null)
 IP: [] x2apic_cluster_probe+0x35/0x70

Signed-off-by: Huaitong Han 
---
 arch/x86/kernel/apic/x2apic_cluster.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/arch/x86/kernel/apic/x2apic_cluster.c 
b/arch/x86/kernel/apic/x2apic_cluster.c
index 6368fa6..6acb055 100644
--- a/arch/x86/kernel/apic/x2apic_cluster.c
+++ b/arch/x86/kernel/apic/x2apic_cluster.c
@@ -190,9 +190,10 @@ static int x2apic_cluster_probe(void)
if (!x2apic_mode)
return 0;
 
-   cpumask_set_cpu(cpu, per_cpu(cpus_in_cluster, cpu));
cpuhp_setup_state(CPUHP_X2APIC_PREPARE, "X2APIC_PREPARE",
  x2apic_prepare_cpu, x2apic_dead_cpu);
+   cpumask_set_cpu(cpu, per_cpu(cpus_in_cluster, cpu));
+
return 1;
 }
 
-- 
1.8.3.1



Re: [RFC PATCH v7 7/7] Restartable sequences: self-tests

2016-08-11 Thread Boqun Feng
On Fri, Aug 12, 2016 at 03:10:38AM +, Mathieu Desnoyers wrote:
> - On Aug 11, 2016, at 9:28 PM, Boqun Feng boqun.f...@gmail.com wrote:
> 
> > On Thu, Aug 11, 2016 at 11:26:30PM +, Mathieu Desnoyers wrote:
> >> - On Jul 24, 2016, at 2:01 PM, Dave Watson davejwat...@fb.com wrote:
> >> 
> >> >>> +static inline __attribute__((always_inline))
> >> >>> +bool rseq_finish(struct rseq_lock *rlock,
> >> >>> + intptr_t *p, intptr_t to_write,
> >> >>> + struct rseq_state start_value)
> >> > 
> >> >>> This ABI looks like it will work fine for our use case. I don't think 
> >> >>> it
> >> >>> has been mentioned yet, but we may still need multiple asm blocks
> >> >>> for differing numbers of writes. For example, an array-based freelist 
> >> >>> push:
> >> > 
> >> >>> void push(void *obj) {
> >> >>> if (index < maxlen) {
> >> >>> freelist[index++] = obj;
> >> >>> }
> >> >>> }
> >> > 
> >> >>> would be more efficiently implemented with a two-write rseq_finish:
> >> > 
> >> >>> rseq_finish2([index], obj, // first write
> >> >>> , index + 1, // second write
> >> >>> ...);
> >> > 
> >> >> Would pairing one rseq_start with two rseq_finish do the trick
> >> >> there ?
> >> > 
> >> > Yes, two rseq_finish works, as long as the extra rseq management overhead
> >> > is not substantial.
> >> 
> >> I've added a commit implementing rseq_finish2() in my rseq volatile
> >> dev branch. You can fetch it at:
> >> 
> >> https://github.com/compudj/linux-percpu-dev/tree/rseq-fallback
> >> 
> >> I also have a separate test and benchmark tree in addition to the
> >> kernel selftests here:
> >> 
> >> https://github.com/compudj/rseq-test
> >> 
> >> I named the first write a "speculative" write, and the second write
> >> the "final" write.
> >> 
> > 
> > Maybe I miss something subtle, but if the first write is only a
> > "speculative" write, why can't we put it in the rseq critical section
> > rather than asm block? Like this:
> > 
> > do_rseq(..., result, targetptr, newval
> > {
> > newval = index;
> > targetptr = 
> > if (newval < maxlen)
> > freelist[newval++] = obj;
> > else
> > result = false;
> > }
> > 
> > No extra rseq_finish() is needed here, but maybe a little more
> > "speculative" writes?
> 
> This won't work unfortunately. The speculative stores need to be
> between the rseq_event_counter comparison instruction in the rseq_finish
> asm sequence and the final store. The ip fixup is really needed for
> correctness of speculative stores. The sequence number scheme only works
> for loads.
> 
> Putting it in the C code between rseq_start and rseq_finish would lead
> to races such as:
> 
> thread Athread B
> rseq_start
> 
> 
> rseq_start
> freelist[offset + 1] = obj
> rseq_finish
>offset++
> 
> 
> freelist[newval + 1] = obj  <--- corrupts the list content.
> 

Ah, right!

We couldn't do any "global"(real global or percpu) update in the rseq
critical section(code between rseq_start and rseq_finish), because
without an ip fixup, we cannot abort the critical section immediately,
we have to compare the event_counter in rseq_finish, but that's too late
for speculates stores.

> 
> 
> > Besides, do we allow userspace programs do read-only access to the
> > memory objects modified by do_rseq(). If so, we have a problem when
> > there are two writes in a do_rseq()(either in the rseq critical section
> > or in the asm block), because in current implemetation, these two writes
> > are unordered, which makes the readers outside a do_rseq() could observe
> > the ordering of writes differently.
> > 
> > For rseq_finish2(), a simple solution would be making the "final" write
> > a RELEASE.
> 
> Indeed, we would need a release semantic for the final store here if this
> is the common use. Or we could duplicate the "flavors" of rseq_finish2 and
> add a rseq_finish2_release. We should find a way to eliminate code duplication

I'm in favor of a separate rseq_finish2_release().

> there. I suspect we'll end up doing macros.
> 

Me too. Lemme have a try ;-)

Regards,
Boqun

> Thanks,
> 
> Mathieu
> 
> > 
> > Regards,
> > Boqun
> > 
> >> Thanks,
> >> 
> >> Mathieu
> >> 
> >> --
> >> Mathieu Desnoyers
> >> EfficiOS Inc.
> > > http://www.efficios.com
> 
> -- 
> Mathieu Desnoyers
> EfficiOS Inc.
> http://www.efficios.com


signature.asc
Description: PGP signature


Re: [RFC PATCH v7 7/7] Restartable sequences: self-tests

2016-08-11 Thread Boqun Feng
On Fri, Aug 12, 2016 at 03:10:38AM +, Mathieu Desnoyers wrote:
> - On Aug 11, 2016, at 9:28 PM, Boqun Feng boqun.f...@gmail.com wrote:
> 
> > On Thu, Aug 11, 2016 at 11:26:30PM +, Mathieu Desnoyers wrote:
> >> - On Jul 24, 2016, at 2:01 PM, Dave Watson davejwat...@fb.com wrote:
> >> 
> >> >>> +static inline __attribute__((always_inline))
> >> >>> +bool rseq_finish(struct rseq_lock *rlock,
> >> >>> + intptr_t *p, intptr_t to_write,
> >> >>> + struct rseq_state start_value)
> >> > 
> >> >>> This ABI looks like it will work fine for our use case. I don't think 
> >> >>> it
> >> >>> has been mentioned yet, but we may still need multiple asm blocks
> >> >>> for differing numbers of writes. For example, an array-based freelist 
> >> >>> push:
> >> > 
> >> >>> void push(void *obj) {
> >> >>> if (index < maxlen) {
> >> >>> freelist[index++] = obj;
> >> >>> }
> >> >>> }
> >> > 
> >> >>> would be more efficiently implemented with a two-write rseq_finish:
> >> > 
> >> >>> rseq_finish2([index], obj, // first write
> >> >>> , index + 1, // second write
> >> >>> ...);
> >> > 
> >> >> Would pairing one rseq_start with two rseq_finish do the trick
> >> >> there ?
> >> > 
> >> > Yes, two rseq_finish works, as long as the extra rseq management overhead
> >> > is not substantial.
> >> 
> >> I've added a commit implementing rseq_finish2() in my rseq volatile
> >> dev branch. You can fetch it at:
> >> 
> >> https://github.com/compudj/linux-percpu-dev/tree/rseq-fallback
> >> 
> >> I also have a separate test and benchmark tree in addition to the
> >> kernel selftests here:
> >> 
> >> https://github.com/compudj/rseq-test
> >> 
> >> I named the first write a "speculative" write, and the second write
> >> the "final" write.
> >> 
> > 
> > Maybe I miss something subtle, but if the first write is only a
> > "speculative" write, why can't we put it in the rseq critical section
> > rather than asm block? Like this:
> > 
> > do_rseq(..., result, targetptr, newval
> > {
> > newval = index;
> > targetptr = 
> > if (newval < maxlen)
> > freelist[newval++] = obj;
> > else
> > result = false;
> > }
> > 
> > No extra rseq_finish() is needed here, but maybe a little more
> > "speculative" writes?
> 
> This won't work unfortunately. The speculative stores need to be
> between the rseq_event_counter comparison instruction in the rseq_finish
> asm sequence and the final store. The ip fixup is really needed for
> correctness of speculative stores. The sequence number scheme only works
> for loads.
> 
> Putting it in the C code between rseq_start and rseq_finish would lead
> to races such as:
> 
> thread Athread B
> rseq_start
> 
> 
> rseq_start
> freelist[offset + 1] = obj
> rseq_finish
>offset++
> 
> 
> freelist[newval + 1] = obj  <--- corrupts the list content.
> 

Ah, right!

We couldn't do any "global"(real global or percpu) update in the rseq
critical section(code between rseq_start and rseq_finish), because
without an ip fixup, we cannot abort the critical section immediately,
we have to compare the event_counter in rseq_finish, but that's too late
for speculates stores.

> 
> 
> > Besides, do we allow userspace programs do read-only access to the
> > memory objects modified by do_rseq(). If so, we have a problem when
> > there are two writes in a do_rseq()(either in the rseq critical section
> > or in the asm block), because in current implemetation, these two writes
> > are unordered, which makes the readers outside a do_rseq() could observe
> > the ordering of writes differently.
> > 
> > For rseq_finish2(), a simple solution would be making the "final" write
> > a RELEASE.
> 
> Indeed, we would need a release semantic for the final store here if this
> is the common use. Or we could duplicate the "flavors" of rseq_finish2 and
> add a rseq_finish2_release. We should find a way to eliminate code duplication

I'm in favor of a separate rseq_finish2_release().

> there. I suspect we'll end up doing macros.
> 

Me too. Lemme have a try ;-)

Regards,
Boqun

> Thanks,
> 
> Mathieu
> 
> > 
> > Regards,
> > Boqun
> > 
> >> Thanks,
> >> 
> >> Mathieu
> >> 
> >> --
> >> Mathieu Desnoyers
> >> EfficiOS Inc.
> > > http://www.efficios.com
> 
> -- 
> Mathieu Desnoyers
> EfficiOS Inc.
> http://www.efficios.com


signature.asc
Description: PGP signature


Re: [PATCH 1/5] arm/defconfigs: Remove CONFIG_IPV6_PRIVACY

2016-08-11 Thread Borislav Petkov
On Thu, Aug 11, 2016 at 11:51:38PM +0200, Arnd Bergmann wrote:
> That patch looks good, sorry I missed it. I see that Vladimir already
> did the lpc32xx portion of it, can you send the part for s3c24xx
> to the exynos maintainers?

I see you've added them to CC, thanks Arnd. Here it is:

---
From: Borislav Petkov 
Date: Fri, 12 Aug 2016 07:18:20 +0200
Subject: [PATCH] arm/s3c2410_defconfig: Remove CONFIG_IPV6_PRIVACY

Option is long gone, see

  5d9efa7ee99e ("ipv6: Remove privacy config option.")

Signed-off-by: Borislav Petkov 
Cc: Kukjin Kim 
Cc: Krzysztof Kozlowski 
---
 arch/arm/configs/s3c2410_defconfig | 1 -
 1 file changed, 1 deletion(-)

diff --git a/arch/arm/configs/s3c2410_defconfig 
b/arch/arm/configs/s3c2410_defconfig
index b3ade552a2a5..bc4bfe02e611 100644
--- a/arch/arm/configs/s3c2410_defconfig
+++ b/arch/arm/configs/s3c2410_defconfig
@@ -76,7 +76,6 @@ CONFIG_TCP_CONG_LP=m
 CONFIG_TCP_CONG_VENO=m
 CONFIG_TCP_CONG_YEAH=m
 CONFIG_TCP_CONG_ILLINOIS=m
-CONFIG_IPV6_PRIVACY=y
 CONFIG_IPV6_ROUTER_PREF=y
 CONFIG_INET6_AH=m
 CONFIG_INET6_ESP=m
-- 
2.8.4

> If you have larger chunks like CONFIG_INET_LRO removal, we can take
> that directly through arm-soc as a single patch for all defconfigs.

I'll leave that to someone who wants to get their feet wet with patching
the kernel - should be a simple task. :-)

Thanks!

-- 
Regards/Gruss,
Boris.

ECO tip #101: Trim your mails when you reply.
--


Re: [PATCH 1/5] arm/defconfigs: Remove CONFIG_IPV6_PRIVACY

2016-08-11 Thread Borislav Petkov
On Thu, Aug 11, 2016 at 11:51:38PM +0200, Arnd Bergmann wrote:
> That patch looks good, sorry I missed it. I see that Vladimir already
> did the lpc32xx portion of it, can you send the part for s3c24xx
> to the exynos maintainers?

I see you've added them to CC, thanks Arnd. Here it is:

---
From: Borislav Petkov 
Date: Fri, 12 Aug 2016 07:18:20 +0200
Subject: [PATCH] arm/s3c2410_defconfig: Remove CONFIG_IPV6_PRIVACY

Option is long gone, see

  5d9efa7ee99e ("ipv6: Remove privacy config option.")

Signed-off-by: Borislav Petkov 
Cc: Kukjin Kim 
Cc: Krzysztof Kozlowski 
---
 arch/arm/configs/s3c2410_defconfig | 1 -
 1 file changed, 1 deletion(-)

diff --git a/arch/arm/configs/s3c2410_defconfig 
b/arch/arm/configs/s3c2410_defconfig
index b3ade552a2a5..bc4bfe02e611 100644
--- a/arch/arm/configs/s3c2410_defconfig
+++ b/arch/arm/configs/s3c2410_defconfig
@@ -76,7 +76,6 @@ CONFIG_TCP_CONG_LP=m
 CONFIG_TCP_CONG_VENO=m
 CONFIG_TCP_CONG_YEAH=m
 CONFIG_TCP_CONG_ILLINOIS=m
-CONFIG_IPV6_PRIVACY=y
 CONFIG_IPV6_ROUTER_PREF=y
 CONFIG_INET6_AH=m
 CONFIG_INET6_ESP=m
-- 
2.8.4

> If you have larger chunks like CONFIG_INET_LRO removal, we can take
> that directly through arm-soc as a single patch for all defconfigs.

I'll leave that to someone who wants to get their feet wet with patching
the kernel - should be a simple task. :-)

Thanks!

-- 
Regards/Gruss,
Boris.

ECO tip #101: Trim your mails when you reply.
--


Re: [PATCHv2 3/4] pci: Determine actual VPD size on first access

2016-08-11 Thread Benjamin Herrenschmidt
On Thu, 2016-08-11 at 14:17 -0600, Alex Williamson wrote:
> 
> vfio isn't playing nanny here for the fun of it, part of the reason we
> have vpd access functions is because folks have discovered that vpd
> registers between PCI functions on multi-function devices may be
> shared.  So pounding on vpd registers for function 1 may adversely
> > affect someone else reading from a different function.

A lot more than that may be shared... the bus for example. lots of
devices have backdoors into their own BARs, one partition can make its
BARs overlap the neighbour function, ... dang !

In the end, PCI assignment at a granularity smaller than a bus cannot
be done in a fully air tight way and shouldn't be relied upon in
environemnt where the isolation between partitions matters.

The only exception here is SR-IOV of course since it's designed
specifically so that the VFs can't be messed with.

>   This is a case
> where I feel vfio needs to step in because if that's a user competing
> with the host or two users stepping on each other, well that's exactly
> what vfio tries to prevent.  A driver in userspace or a VM driver can't
> very well determine these sorts of interactions when it only has
> visibility to a subset of the functions and users and hardware folks
> would throw a fit if I extended iommu groups to encompass all the
> related devices rather than take the relatively simple step of
> virtualizing these accesses and occasionally quirking devices that are
> extra broken, as seems to be required here.  Thanks,

We may want some kind of "strict" vs. "relaxed" model here to
differenciate the desktop user wanting to give a function to his/her
windows partition and doesn't care about strict isolation vs. the cloud
data center.

Cheers,
Ben.



Re: [PATCHv2 3/4] pci: Determine actual VPD size on first access

2016-08-11 Thread Benjamin Herrenschmidt
On Thu, 2016-08-11 at 14:17 -0600, Alex Williamson wrote:
> 
> vfio isn't playing nanny here for the fun of it, part of the reason we
> have vpd access functions is because folks have discovered that vpd
> registers between PCI functions on multi-function devices may be
> shared.  So pounding on vpd registers for function 1 may adversely
> > affect someone else reading from a different function.

A lot more than that may be shared... the bus for example. lots of
devices have backdoors into their own BARs, one partition can make its
BARs overlap the neighbour function, ... dang !

In the end, PCI assignment at a granularity smaller than a bus cannot
be done in a fully air tight way and shouldn't be relied upon in
environemnt where the isolation between partitions matters.

The only exception here is SR-IOV of course since it's designed
specifically so that the VFs can't be messed with.

>   This is a case
> where I feel vfio needs to step in because if that's a user competing
> with the host or two users stepping on each other, well that's exactly
> what vfio tries to prevent.  A driver in userspace or a VM driver can't
> very well determine these sorts of interactions when it only has
> visibility to a subset of the functions and users and hardware folks
> would throw a fit if I extended iommu groups to encompass all the
> related devices rather than take the relatively simple step of
> virtualizing these accesses and occasionally quirking devices that are
> extra broken, as seems to be required here.  Thanks,

We may want some kind of "strict" vs. "relaxed" model here to
differenciate the desktop user wanting to give a function to his/her
windows partition and doesn't care about strict isolation vs. the cloud
data center.

Cheers,
Ben.



Re: [LKP] [lkp] [xfs] 68a9f5e700: aim7.jobs-per-min -13.6% regression

2016-08-11 Thread Linus Torvalds
On Thu, Aug 11, 2016 at 9:16 PM, Dave Chinner  wrote:
>
> That's why running aim7 as your "does the filesystem scale"
> benchmark is somewhat irrelevant to scaling applications on high
> performance systems these days

Yes, don't get me wrong - I'm not at all trying to say that AIM7 is a
good benchmark. It's just that I think what it happens to test is
still meaningful, even if it's not necessarily in any way some kind of
"high performance IO" thing.

There are probably lots of other more important loads, I just reacted
to Christoph seeming to argue that the AIM7 behavior was _so_ broken
that we shouldn't even care. It's not _that_ broken, it's just not
about high-performance IO streaming, it happens to test something else
entirely.

We've actually had AIM7 occasionally find other issues just because
some of the things it does is so odd.

Iirc it has a fork test that doesn't execve (very unusual - you'd
generally use threads if you care about performance), and that has
shown issues with our anon_vma scaling before anything else did.

I also seem to remember some odd pty open/close/ioctl subtest that
showed problems with some of the last remnants of the old BKL (the
test probably actually tested something else, but ended up choking on
the odd tty things).

So in general, I'm not a fan of AIM as a benchmark, but it actually
_has_ found lots of real issues because it tends to do things that
kernel developers think are insane.

And let's face it, user programs doing odd and not very efficient
things should be considered par for the course. We're never going to
get rid of insane user programs, so we might as well fix the
performance problems even when we say "that's just stupid".

Linus


Re: [LKP] [lkp] [xfs] 68a9f5e700: aim7.jobs-per-min -13.6% regression

2016-08-11 Thread Linus Torvalds
On Thu, Aug 11, 2016 at 9:16 PM, Dave Chinner  wrote:
>
> That's why running aim7 as your "does the filesystem scale"
> benchmark is somewhat irrelevant to scaling applications on high
> performance systems these days

Yes, don't get me wrong - I'm not at all trying to say that AIM7 is a
good benchmark. It's just that I think what it happens to test is
still meaningful, even if it's not necessarily in any way some kind of
"high performance IO" thing.

There are probably lots of other more important loads, I just reacted
to Christoph seeming to argue that the AIM7 behavior was _so_ broken
that we shouldn't even care. It's not _that_ broken, it's just not
about high-performance IO streaming, it happens to test something else
entirely.

We've actually had AIM7 occasionally find other issues just because
some of the things it does is so odd.

Iirc it has a fork test that doesn't execve (very unusual - you'd
generally use threads if you care about performance), and that has
shown issues with our anon_vma scaling before anything else did.

I also seem to remember some odd pty open/close/ioctl subtest that
showed problems with some of the last remnants of the old BKL (the
test probably actually tested something else, but ended up choking on
the odd tty things).

So in general, I'm not a fan of AIM as a benchmark, but it actually
_has_ found lots of real issues because it tends to do things that
kernel developers think are insane.

And let's face it, user programs doing odd and not very efficient
things should be considered par for the course. We're never going to
get rid of insane user programs, so we might as well fix the
performance problems even when we say "that's just stupid".

Linus


[PATCH v8 3/3] regulator: lp873x: Change the MFD config option as per latest naming

2016-08-11 Thread Keerthy
Change the MFD config option as per latest naming

Acked-by: Mark Brown 
Signed-off-by: Keerthy 
---
 drivers/regulator/Kconfig | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/regulator/Kconfig b/drivers/regulator/Kconfig
index 6c88e31..97dc4cc 100644
--- a/drivers/regulator/Kconfig
+++ b/drivers/regulator/Kconfig
@@ -323,7 +323,7 @@ config REGULATOR_LP872X
 
 config REGULATOR_LP873X
tristate "TI LP873X Power regulators"
-   depends on MFD_LP873X && OF
+   depends on MFD_TI_LP873X && OF
help
  This driver supports LP873X voltage regulator chips. LP873X
  provides two step-down converters and two general-purpose LDO
-- 
1.9.1



[PATCH v8 0/3] mfd: lp873x: Add lp873x PMIC support

2016-08-11 Thread Keerthy
The LP873X chip is a power management IC for Portable Navigation Systems
and Tablet Computing devices. It contains the following components:

 - Regulators.
 - Configurable General Purpose Output Signals(GPO).

PMIC interacts with the main processor through i2c. PMIC has
couple of LDOs(Linear Regulators), couple of BUCKs (Step-Down DC-DC
Converter Cores) and GPOs(General Purpose Output Signals).

The regulator patch is already queued:
http://marc.info/?l=linux-kernel=146298767110771=2

DT patch also queued.

Hence posting the remaining patches of the series with
the comments fixed on mfd driver.

Changes in v8:

Defined Macros for hardcoded bit definitions in gpio driver.
Added Mark's ack for the regulator config change patch.

Chages in v7:

Reverted the dependency on 
http://www.gossamer-threads.com/lists/linux/kernel/2457552.

Changes in v6:

Series depends on: http://www.gossamer-threads.com/lists/linux/kernel/2457552
probe_new used.

Chnages in v4:

Added GPO driver support. 

Keerthy (3):
  mfd: lp873x: Add lp873x PMIC support
  gpio: lp873x: Add support for General Purpose Outputs
  regulator: lp873x: Change the MFD config option as per latest naming

 drivers/gpio/Kconfig   |  10 ++
 drivers/gpio/Makefile  |   1 +
 drivers/gpio/gpio-lp873x.c | 193 +
 drivers/mfd/Kconfig|  14 +++
 drivers/mfd/Makefile   |   2 +
 drivers/mfd/lp873x.c   |  97 +
 drivers/regulator/Kconfig  |   2 +-
 include/linux/mfd/lp873x.h | 264 +
 8 files changed, 582 insertions(+), 1 deletion(-)
 create mode 100644 drivers/gpio/gpio-lp873x.c
 create mode 100644 drivers/mfd/lp873x.c
 create mode 100644 include/linux/mfd/lp873x.h

-- 
1.9.1



[PATCH v8 2/3] gpio: lp873x: Add support for General Purpose Outputs

2016-08-11 Thread Keerthy
Add driver for lp873x PMIC family GPOs. Two GPOs are supported
and can be configured in Open-drain output or Push-pull output.

Acked-by: Linus Walleij 
Signed-off-by: Keerthy 
---

Changes in v8:

  * Replaced hardcoded bit fields with macros.

 drivers/gpio/Kconfig   |  10 +++
 drivers/gpio/Makefile  |   1 +
 drivers/gpio/gpio-lp873x.c | 193 +
 3 files changed, 204 insertions(+)
 create mode 100644 drivers/gpio/gpio-lp873x.c

diff --git a/drivers/gpio/Kconfig b/drivers/gpio/Kconfig
index 6de9062..ff68a80 100644
--- a/drivers/gpio/Kconfig
+++ b/drivers/gpio/Kconfig
@@ -853,6 +853,16 @@ config GPIO_LP3943
  LP3943 can be used as a GPIO expander which provides up to 16 GPIOs.
  Open drain outputs are required for this usage.
 
+config GPIO_LP873X
+   tristate "TI LP873X GPO"
+   depends on MFD_TI_LP873X
+   help
+ This driver supports the GPO on TI Lp873x PMICs. 2 GPOs are present
+ on LP873X PMICs.
+
+ This driver can also be built as a module. If so, the module will be
+  called gpio-lp873x.
+
 config GPIO_MAX77620
tristate "GPIO support for PMIC MAX77620 and MAX20024"
depends on MFD_MAX77620
diff --git a/drivers/gpio/Makefile b/drivers/gpio/Makefile
index 2a035ed..d60432f 100644
--- a/drivers/gpio/Makefile
+++ b/drivers/gpio/Makefile
@@ -56,6 +56,7 @@ obj-$(CONFIG_GPIO_LOONGSON)   += gpio-loongson.o
 obj-$(CONFIG_GPIO_LP3943)  += gpio-lp3943.o
 obj-$(CONFIG_GPIO_LPC18XX) += gpio-lpc18xx.o
 obj-$(CONFIG_ARCH_LPC32XX) += gpio-lpc32xx.o
+obj-$(CONFIG_GPIO_LP873X)  += gpio-lp873x.o
 obj-$(CONFIG_GPIO_LYNXPOINT)   += gpio-lynxpoint.o
 obj-$(CONFIG_GPIO_MAX730X) += gpio-max730x.o
 obj-$(CONFIG_GPIO_MAX7300) += gpio-max7300.o
diff --git a/drivers/gpio/gpio-lp873x.c b/drivers/gpio/gpio-lp873x.c
new file mode 100644
index 000..79c255d
--- /dev/null
+++ b/drivers/gpio/gpio-lp873x.c
@@ -0,0 +1,193 @@
+/*
+ * Copyright (C) 2016 Texas Instruments Incorporated - http://www.ti.com/
+ * Keerthy 
+ *
+ * This program is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ *
+ * This program is distributed "as is" WITHOUT ANY WARRANTY of any
+ * kind, whether expressed or implied; without even the implied warranty
+ * of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License version 2 for more details.
+ *
+ * Based on the TPS65218 driver
+ */
+
+#include 
+#include 
+#include 
+#include 
+
+#include 
+
+#define BITS_PER_GPO   0x4
+#define LP873X_GPO_CTRL_OD 0x2
+
+struct lp873x_gpio {
+   struct gpio_chip chip;
+   struct lp873x *lp873;
+};
+
+static int lp873x_gpio_get_direction(struct gpio_chip *chip,
+unsigned int offset)
+{
+   /* This device is output only */
+   return 0;
+}
+
+static int lp873x_gpio_direction_input(struct gpio_chip *chip,
+  unsigned int offset)
+{
+   /* This device is output only */
+   return -EINVAL;
+}
+
+static int lp873x_gpio_direction_output(struct gpio_chip *chip,
+   unsigned int offset, int value)
+{
+   struct lp873x_gpio *gpio = gpiochip_get_data(chip);
+
+   /* Set the initial value */
+   return regmap_update_bits(gpio->lp873->regmap, LP873X_REG_GPO_CTRL,
+ BIT(offset * BITS_PER_GPO),
+ value ? BIT(offset * BITS_PER_GPO) : 0);
+}
+
+static int lp873x_gpio_get(struct gpio_chip *chip, unsigned int offset)
+{
+   struct lp873x_gpio *gpio = gpiochip_get_data(chip);
+   int ret, val;
+
+   ret = regmap_read(gpio->lp873->regmap, LP873X_REG_GPO_CTRL, );
+   if (ret < 0)
+   return ret;
+
+   return val & BIT(offset * BITS_PER_GPO);
+}
+
+static void lp873x_gpio_set(struct gpio_chip *chip, unsigned int offset,
+   int value)
+{
+   struct lp873x_gpio *gpio = gpiochip_get_data(chip);
+
+   regmap_update_bits(gpio->lp873->regmap, LP873X_REG_GPO_CTRL,
+  BIT(offset * BITS_PER_GPO),
+  value ? BIT(offset * BITS_PER_GPO) : 0);
+}
+
+static int lp873x_gpio_request(struct gpio_chip *gc, unsigned int offset)
+{
+   struct lp873x_gpio *gpio = gpiochip_get_data(gc);
+   int ret;
+
+   switch (offset) {
+   case 0:
+   /* No MUX Set up Needed for GPO */
+   break;
+   case 1:
+   /* Setup the CLKIN_PIN_SEL MUX to GPO2 */
+   ret = regmap_update_bits(gpio->lp873->regmap, LP873X_REG_CONFIG,
+LP873X_CONFIG_CLKIN_PIN_SEL, 0);
+   if (ret)
+   return ret;
+
+ 

[PATCH v8 3/3] regulator: lp873x: Change the MFD config option as per latest naming

2016-08-11 Thread Keerthy
Change the MFD config option as per latest naming

Acked-by: Mark Brown 
Signed-off-by: Keerthy 
---
 drivers/regulator/Kconfig | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/regulator/Kconfig b/drivers/regulator/Kconfig
index 6c88e31..97dc4cc 100644
--- a/drivers/regulator/Kconfig
+++ b/drivers/regulator/Kconfig
@@ -323,7 +323,7 @@ config REGULATOR_LP872X
 
 config REGULATOR_LP873X
tristate "TI LP873X Power regulators"
-   depends on MFD_LP873X && OF
+   depends on MFD_TI_LP873X && OF
help
  This driver supports LP873X voltage regulator chips. LP873X
  provides two step-down converters and two general-purpose LDO
-- 
1.9.1



[PATCH v8 0/3] mfd: lp873x: Add lp873x PMIC support

2016-08-11 Thread Keerthy
The LP873X chip is a power management IC for Portable Navigation Systems
and Tablet Computing devices. It contains the following components:

 - Regulators.
 - Configurable General Purpose Output Signals(GPO).

PMIC interacts with the main processor through i2c. PMIC has
couple of LDOs(Linear Regulators), couple of BUCKs (Step-Down DC-DC
Converter Cores) and GPOs(General Purpose Output Signals).

The regulator patch is already queued:
http://marc.info/?l=linux-kernel=146298767110771=2

DT patch also queued.

Hence posting the remaining patches of the series with
the comments fixed on mfd driver.

Changes in v8:

Defined Macros for hardcoded bit definitions in gpio driver.
Added Mark's ack for the regulator config change patch.

Chages in v7:

Reverted the dependency on 
http://www.gossamer-threads.com/lists/linux/kernel/2457552.

Changes in v6:

Series depends on: http://www.gossamer-threads.com/lists/linux/kernel/2457552
probe_new used.

Chnages in v4:

Added GPO driver support. 

Keerthy (3):
  mfd: lp873x: Add lp873x PMIC support
  gpio: lp873x: Add support for General Purpose Outputs
  regulator: lp873x: Change the MFD config option as per latest naming

 drivers/gpio/Kconfig   |  10 ++
 drivers/gpio/Makefile  |   1 +
 drivers/gpio/gpio-lp873x.c | 193 +
 drivers/mfd/Kconfig|  14 +++
 drivers/mfd/Makefile   |   2 +
 drivers/mfd/lp873x.c   |  97 +
 drivers/regulator/Kconfig  |   2 +-
 include/linux/mfd/lp873x.h | 264 +
 8 files changed, 582 insertions(+), 1 deletion(-)
 create mode 100644 drivers/gpio/gpio-lp873x.c
 create mode 100644 drivers/mfd/lp873x.c
 create mode 100644 include/linux/mfd/lp873x.h

-- 
1.9.1



[PATCH v8 2/3] gpio: lp873x: Add support for General Purpose Outputs

2016-08-11 Thread Keerthy
Add driver for lp873x PMIC family GPOs. Two GPOs are supported
and can be configured in Open-drain output or Push-pull output.

Acked-by: Linus Walleij 
Signed-off-by: Keerthy 
---

Changes in v8:

  * Replaced hardcoded bit fields with macros.

 drivers/gpio/Kconfig   |  10 +++
 drivers/gpio/Makefile  |   1 +
 drivers/gpio/gpio-lp873x.c | 193 +
 3 files changed, 204 insertions(+)
 create mode 100644 drivers/gpio/gpio-lp873x.c

diff --git a/drivers/gpio/Kconfig b/drivers/gpio/Kconfig
index 6de9062..ff68a80 100644
--- a/drivers/gpio/Kconfig
+++ b/drivers/gpio/Kconfig
@@ -853,6 +853,16 @@ config GPIO_LP3943
  LP3943 can be used as a GPIO expander which provides up to 16 GPIOs.
  Open drain outputs are required for this usage.
 
+config GPIO_LP873X
+   tristate "TI LP873X GPO"
+   depends on MFD_TI_LP873X
+   help
+ This driver supports the GPO on TI Lp873x PMICs. 2 GPOs are present
+ on LP873X PMICs.
+
+ This driver can also be built as a module. If so, the module will be
+  called gpio-lp873x.
+
 config GPIO_MAX77620
tristate "GPIO support for PMIC MAX77620 and MAX20024"
depends on MFD_MAX77620
diff --git a/drivers/gpio/Makefile b/drivers/gpio/Makefile
index 2a035ed..d60432f 100644
--- a/drivers/gpio/Makefile
+++ b/drivers/gpio/Makefile
@@ -56,6 +56,7 @@ obj-$(CONFIG_GPIO_LOONGSON)   += gpio-loongson.o
 obj-$(CONFIG_GPIO_LP3943)  += gpio-lp3943.o
 obj-$(CONFIG_GPIO_LPC18XX) += gpio-lpc18xx.o
 obj-$(CONFIG_ARCH_LPC32XX) += gpio-lpc32xx.o
+obj-$(CONFIG_GPIO_LP873X)  += gpio-lp873x.o
 obj-$(CONFIG_GPIO_LYNXPOINT)   += gpio-lynxpoint.o
 obj-$(CONFIG_GPIO_MAX730X) += gpio-max730x.o
 obj-$(CONFIG_GPIO_MAX7300) += gpio-max7300.o
diff --git a/drivers/gpio/gpio-lp873x.c b/drivers/gpio/gpio-lp873x.c
new file mode 100644
index 000..79c255d
--- /dev/null
+++ b/drivers/gpio/gpio-lp873x.c
@@ -0,0 +1,193 @@
+/*
+ * Copyright (C) 2016 Texas Instruments Incorporated - http://www.ti.com/
+ * Keerthy 
+ *
+ * This program is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ *
+ * This program is distributed "as is" WITHOUT ANY WARRANTY of any
+ * kind, whether expressed or implied; without even the implied warranty
+ * of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License version 2 for more details.
+ *
+ * Based on the TPS65218 driver
+ */
+
+#include 
+#include 
+#include 
+#include 
+
+#include 
+
+#define BITS_PER_GPO   0x4
+#define LP873X_GPO_CTRL_OD 0x2
+
+struct lp873x_gpio {
+   struct gpio_chip chip;
+   struct lp873x *lp873;
+};
+
+static int lp873x_gpio_get_direction(struct gpio_chip *chip,
+unsigned int offset)
+{
+   /* This device is output only */
+   return 0;
+}
+
+static int lp873x_gpio_direction_input(struct gpio_chip *chip,
+  unsigned int offset)
+{
+   /* This device is output only */
+   return -EINVAL;
+}
+
+static int lp873x_gpio_direction_output(struct gpio_chip *chip,
+   unsigned int offset, int value)
+{
+   struct lp873x_gpio *gpio = gpiochip_get_data(chip);
+
+   /* Set the initial value */
+   return regmap_update_bits(gpio->lp873->regmap, LP873X_REG_GPO_CTRL,
+ BIT(offset * BITS_PER_GPO),
+ value ? BIT(offset * BITS_PER_GPO) : 0);
+}
+
+static int lp873x_gpio_get(struct gpio_chip *chip, unsigned int offset)
+{
+   struct lp873x_gpio *gpio = gpiochip_get_data(chip);
+   int ret, val;
+
+   ret = regmap_read(gpio->lp873->regmap, LP873X_REG_GPO_CTRL, );
+   if (ret < 0)
+   return ret;
+
+   return val & BIT(offset * BITS_PER_GPO);
+}
+
+static void lp873x_gpio_set(struct gpio_chip *chip, unsigned int offset,
+   int value)
+{
+   struct lp873x_gpio *gpio = gpiochip_get_data(chip);
+
+   regmap_update_bits(gpio->lp873->regmap, LP873X_REG_GPO_CTRL,
+  BIT(offset * BITS_PER_GPO),
+  value ? BIT(offset * BITS_PER_GPO) : 0);
+}
+
+static int lp873x_gpio_request(struct gpio_chip *gc, unsigned int offset)
+{
+   struct lp873x_gpio *gpio = gpiochip_get_data(gc);
+   int ret;
+
+   switch (offset) {
+   case 0:
+   /* No MUX Set up Needed for GPO */
+   break;
+   case 1:
+   /* Setup the CLKIN_PIN_SEL MUX to GPO2 */
+   ret = regmap_update_bits(gpio->lp873->regmap, LP873X_REG_CONFIG,
+LP873X_CONFIG_CLKIN_PIN_SEL, 0);
+   if (ret)
+   return ret;
+
+   break;
+   default:
+   return -EINVAL;
+   

[PATCH v8 1/3] mfd: lp873x: Add lp873x PMIC support

2016-08-11 Thread Keerthy
The LP873X chip is a power management IC for Portable Navigation Systems
and Tablet Computing devices. It contains the following components:

 - Regulators.
 - Configurable General Purpose Output Signals(GPO).

PMIC interacts with the main processor through i2c. PMIC has
couple of LDOs(Linear Regulators), couple of BUCKs (Step-Down DC-DC
Converter Cores) and GPOs(General Purpose Output Signals).

Signed-off-by: Keerthy 
---
 drivers/mfd/Kconfig|  14 +++
 drivers/mfd/Makefile   |   2 +
 drivers/mfd/lp873x.c   |  97 +
 include/linux/mfd/lp873x.h | 264 +
 4 files changed, 377 insertions(+)
 create mode 100644 drivers/mfd/lp873x.c
 create mode 100644 include/linux/mfd/lp873x.h

diff --git a/drivers/mfd/Kconfig b/drivers/mfd/Kconfig
index 2d1fb64..45fe00a 100644
--- a/drivers/mfd/Kconfig
+++ b/drivers/mfd/Kconfig
@@ -1224,6 +1224,20 @@ config MFD_TPS65217
  This driver can also be built as a module.  If so, the module
  will be called tps65217.
 
+config MFD_TI_LP873X
+   tristate "TI LP873X Power Management IC"
+   depends on I2C
+   select MFD_CORE
+   select REGMAP_I2C
+   help
+ If you say yes here then you get support for the LP873X series of
+ Power Management Integrated Circuits(PMIC).
+ These include voltage regulators, Thermal protection, Configurable
+ General Purpose Outputs(GPO) that are used in portable devices.
+
+ This driver can also be built as a module.  If so, the module
+ will be called lp873x.
+
 config MFD_TPS65218
tristate "TI TPS65218 Power Management chips"
depends on I2C
diff --git a/drivers/mfd/Makefile b/drivers/mfd/Makefile
index 2ba3ba3..42acbcd 100644
--- a/drivers/mfd/Makefile
+++ b/drivers/mfd/Makefile
@@ -22,6 +22,8 @@ obj-$(CONFIG_HTC_EGPIO)   += htc-egpio.o
 obj-$(CONFIG_HTC_PASIC3)   += htc-pasic3.o
 obj-$(CONFIG_HTC_I2CPLD)   += htc-i2cpld.o
 
+obj-$(CONFIG_MFD_TI_LP873X)+= lp873x.o
+
 obj-$(CONFIG_MFD_DAVINCI_VOICECODEC)   += davinci_voicecodec.o
 obj-$(CONFIG_MFD_DM355EVM_MSP) += dm355evm_msp.o
 obj-$(CONFIG_MFD_TI_AM335X_TSCADC) += ti_am335x_tscadc.o
diff --git a/drivers/mfd/lp873x.c b/drivers/mfd/lp873x.c
new file mode 100644
index 000..3ca0462
--- /dev/null
+++ b/drivers/mfd/lp873x.c
@@ -0,0 +1,97 @@
+/*
+ * Copyright (C) 2016 Texas Instruments Incorporated - http://www.ti.com/
+ *
+ * Author: Keerthy 
+ *
+ * This program is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU General Public License as
+ * published by the Free Software Foundation version 2.
+ *
+ * This program is distributed "as is" WITHOUT ANY WARRANTY of any
+ * kind, whether express or implied; without even the implied warranty
+ * of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ */
+
+#include 
+#include 
+#include 
+#include 
+#include 
+
+#include 
+
+static const struct regmap_config lp873x_regmap_config = {
+   .reg_bits = 8,
+   .val_bits = 8,
+   .max_register = LP873X_REG_MAX,
+};
+
+static const struct mfd_cell lp873x_cells[] = {
+   { .name = "lp873x-regulator", },
+   { .name = "lp873x-gpio", },
+};
+
+static int lp873x_probe(struct i2c_client *client,
+   const struct i2c_device_id *ids)
+{
+   struct lp873x *lp873;
+   int ret;
+   unsigned int otpid;
+
+   lp873 = devm_kzalloc(>dev, sizeof(*lp873), GFP_KERNEL);
+   if (!lp873)
+   return -ENOMEM;
+
+   lp873->dev = >dev;
+
+   lp873->regmap = devm_regmap_init_i2c(client, _regmap_config);
+   if (IS_ERR(lp873->regmap)) {
+   ret = PTR_ERR(lp873->regmap);
+   dev_err(lp873->dev,
+   "Failed to initialize register map: %d\n", ret);
+   return ret;
+   }
+
+   mutex_init(>lock);
+
+   ret = regmap_read(lp873->regmap, LP873X_REG_OTP_REV, );
+   if (ret) {
+   dev_err(lp873->dev, "Failed to read OTP ID\n");
+   return ret;
+   }
+
+   lp873->rev = otpid & LP873X_OTP_REV_OTP_ID;
+   i2c_set_clientdata(client, lp873);
+   ret = mfd_add_devices(lp873->dev, PLATFORM_DEVID_AUTO, lp873x_cells,
+ ARRAY_SIZE(lp873x_cells), NULL, 0, NULL);
+
+   return ret;
+}
+
+static const struct of_device_id of_lp873x_match_table[] = {
+   { .compatible = "ti,lp8733", },
+   { .compatible = "ti,lp8732", },
+   {}
+};
+MODULE_DEVICE_TABLE(of, of_lp873x_match_table);
+
+static const struct i2c_device_id lp873x_id_table[] = {
+   { "lp873x", LP873X },
+   { },
+};
+MODULE_DEVICE_TABLE(i2c, lp873x_id_table);
+
+static struct i2c_driver lp873x_driver = {
+   .driver = {
+   .name   = "lp873x",
+   .of_match_table = 

[PATCH v8 1/3] mfd: lp873x: Add lp873x PMIC support

2016-08-11 Thread Keerthy
The LP873X chip is a power management IC for Portable Navigation Systems
and Tablet Computing devices. It contains the following components:

 - Regulators.
 - Configurable General Purpose Output Signals(GPO).

PMIC interacts with the main processor through i2c. PMIC has
couple of LDOs(Linear Regulators), couple of BUCKs (Step-Down DC-DC
Converter Cores) and GPOs(General Purpose Output Signals).

Signed-off-by: Keerthy 
---
 drivers/mfd/Kconfig|  14 +++
 drivers/mfd/Makefile   |   2 +
 drivers/mfd/lp873x.c   |  97 +
 include/linux/mfd/lp873x.h | 264 +
 4 files changed, 377 insertions(+)
 create mode 100644 drivers/mfd/lp873x.c
 create mode 100644 include/linux/mfd/lp873x.h

diff --git a/drivers/mfd/Kconfig b/drivers/mfd/Kconfig
index 2d1fb64..45fe00a 100644
--- a/drivers/mfd/Kconfig
+++ b/drivers/mfd/Kconfig
@@ -1224,6 +1224,20 @@ config MFD_TPS65217
  This driver can also be built as a module.  If so, the module
  will be called tps65217.
 
+config MFD_TI_LP873X
+   tristate "TI LP873X Power Management IC"
+   depends on I2C
+   select MFD_CORE
+   select REGMAP_I2C
+   help
+ If you say yes here then you get support for the LP873X series of
+ Power Management Integrated Circuits(PMIC).
+ These include voltage regulators, Thermal protection, Configurable
+ General Purpose Outputs(GPO) that are used in portable devices.
+
+ This driver can also be built as a module.  If so, the module
+ will be called lp873x.
+
 config MFD_TPS65218
tristate "TI TPS65218 Power Management chips"
depends on I2C
diff --git a/drivers/mfd/Makefile b/drivers/mfd/Makefile
index 2ba3ba3..42acbcd 100644
--- a/drivers/mfd/Makefile
+++ b/drivers/mfd/Makefile
@@ -22,6 +22,8 @@ obj-$(CONFIG_HTC_EGPIO)   += htc-egpio.o
 obj-$(CONFIG_HTC_PASIC3)   += htc-pasic3.o
 obj-$(CONFIG_HTC_I2CPLD)   += htc-i2cpld.o
 
+obj-$(CONFIG_MFD_TI_LP873X)+= lp873x.o
+
 obj-$(CONFIG_MFD_DAVINCI_VOICECODEC)   += davinci_voicecodec.o
 obj-$(CONFIG_MFD_DM355EVM_MSP) += dm355evm_msp.o
 obj-$(CONFIG_MFD_TI_AM335X_TSCADC) += ti_am335x_tscadc.o
diff --git a/drivers/mfd/lp873x.c b/drivers/mfd/lp873x.c
new file mode 100644
index 000..3ca0462
--- /dev/null
+++ b/drivers/mfd/lp873x.c
@@ -0,0 +1,97 @@
+/*
+ * Copyright (C) 2016 Texas Instruments Incorporated - http://www.ti.com/
+ *
+ * Author: Keerthy 
+ *
+ * This program is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU General Public License as
+ * published by the Free Software Foundation version 2.
+ *
+ * This program is distributed "as is" WITHOUT ANY WARRANTY of any
+ * kind, whether express or implied; without even the implied warranty
+ * of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ */
+
+#include 
+#include 
+#include 
+#include 
+#include 
+
+#include 
+
+static const struct regmap_config lp873x_regmap_config = {
+   .reg_bits = 8,
+   .val_bits = 8,
+   .max_register = LP873X_REG_MAX,
+};
+
+static const struct mfd_cell lp873x_cells[] = {
+   { .name = "lp873x-regulator", },
+   { .name = "lp873x-gpio", },
+};
+
+static int lp873x_probe(struct i2c_client *client,
+   const struct i2c_device_id *ids)
+{
+   struct lp873x *lp873;
+   int ret;
+   unsigned int otpid;
+
+   lp873 = devm_kzalloc(>dev, sizeof(*lp873), GFP_KERNEL);
+   if (!lp873)
+   return -ENOMEM;
+
+   lp873->dev = >dev;
+
+   lp873->regmap = devm_regmap_init_i2c(client, _regmap_config);
+   if (IS_ERR(lp873->regmap)) {
+   ret = PTR_ERR(lp873->regmap);
+   dev_err(lp873->dev,
+   "Failed to initialize register map: %d\n", ret);
+   return ret;
+   }
+
+   mutex_init(>lock);
+
+   ret = regmap_read(lp873->regmap, LP873X_REG_OTP_REV, );
+   if (ret) {
+   dev_err(lp873->dev, "Failed to read OTP ID\n");
+   return ret;
+   }
+
+   lp873->rev = otpid & LP873X_OTP_REV_OTP_ID;
+   i2c_set_clientdata(client, lp873);
+   ret = mfd_add_devices(lp873->dev, PLATFORM_DEVID_AUTO, lp873x_cells,
+ ARRAY_SIZE(lp873x_cells), NULL, 0, NULL);
+
+   return ret;
+}
+
+static const struct of_device_id of_lp873x_match_table[] = {
+   { .compatible = "ti,lp8733", },
+   { .compatible = "ti,lp8732", },
+   {}
+};
+MODULE_DEVICE_TABLE(of, of_lp873x_match_table);
+
+static const struct i2c_device_id lp873x_id_table[] = {
+   { "lp873x", LP873X },
+   { },
+};
+MODULE_DEVICE_TABLE(i2c, lp873x_id_table);
+
+static struct i2c_driver lp873x_driver = {
+   .driver = {
+   .name   = "lp873x",
+   .of_match_table = of_lp873x_match_table,
+   },
+   .probe 

[PATCH 0/2] AM335x-ICE: Add support for rotary switch

2016-08-11 Thread Vignesh R
This series adds support for rotary-switch on AM335x-ICE that is
connected to TI PCA9536 I2C GPIO expander.
First patch adds new generic driver to read status of group of GPIO
lines and report the value as an input event. The second patch adds DT
entries for the same.


Vignesh R (2):
  input: misc: Add generic input driver to read encoded GPIO lines
  ARM: dts: am335x-icev2: Add nodes for gpio-decoder

 .../devicetree/bindings/input/gpio-decoder.txt |  23 
 arch/arm/boot/dts/am335x-icev2.dts |   9 ++
 drivers/input/misc/Kconfig |  12 ++
 drivers/input/misc/Makefile|   1 +
 drivers/input/misc/gpio_decoder.c  | 128 +
 5 files changed, 173 insertions(+)
 create mode 100644 Documentation/devicetree/bindings/input/gpio-decoder.txt
 create mode 100644 drivers/input/misc/gpio_decoder.c

-- 
2.9.2



[PATCH 0/2] AM335x-ICE: Add support for rotary switch

2016-08-11 Thread Vignesh R
This series adds support for rotary-switch on AM335x-ICE that is
connected to TI PCA9536 I2C GPIO expander.
First patch adds new generic driver to read status of group of GPIO
lines and report the value as an input event. The second patch adds DT
entries for the same.


Vignesh R (2):
  input: misc: Add generic input driver to read encoded GPIO lines
  ARM: dts: am335x-icev2: Add nodes for gpio-decoder

 .../devicetree/bindings/input/gpio-decoder.txt |  23 
 arch/arm/boot/dts/am335x-icev2.dts |   9 ++
 drivers/input/misc/Kconfig |  12 ++
 drivers/input/misc/Makefile|   1 +
 drivers/input/misc/gpio_decoder.c  | 128 +
 5 files changed, 173 insertions(+)
 create mode 100644 Documentation/devicetree/bindings/input/gpio-decoder.txt
 create mode 100644 drivers/input/misc/gpio_decoder.c

-- 
2.9.2



[PATCH 1/2] input: misc: Add generic input driver to read encoded GPIO lines

2016-08-11 Thread Vignesh R
Add a driver to read group of GPIO lines and provide its status as a
numerical value as input event to the system. This will help in
intefacing devices, that can be connected over GPIOs, that provide input
to the system by driving GPIO lines connected to them like a rotary dial
or a switch.

For example, a rotary switch can be connected to four GPIO lines. The
status of the GPIO lines reflect the actual position of the rotary
switch dial. For example, if dial points to 9, then the four GPIO lines
connected to the switch will read HLLH(0b'1001 = 9). This value
can be reported as an ABS_* event to the input subsystem.

Signed-off-by: Vignesh R 
---
 .../devicetree/bindings/input/gpio-decoder.txt |  23 
 drivers/input/misc/Kconfig |  12 ++
 drivers/input/misc/Makefile|   1 +
 drivers/input/misc/gpio_decoder.c  | 128 +
 4 files changed, 164 insertions(+)
 create mode 100644 Documentation/devicetree/bindings/input/gpio-decoder.txt
 create mode 100644 drivers/input/misc/gpio_decoder.c

diff --git a/Documentation/devicetree/bindings/input/gpio-decoder.txt 
b/Documentation/devicetree/bindings/input/gpio-decoder.txt
new file mode 100644
index ..efd947f52cb3
--- /dev/null
+++ b/Documentation/devicetree/bindings/input/gpio-decoder.txt
@@ -0,0 +1,23 @@
+* GPIO Decoder DT bindings
+
+Required Properties:
+- compatible: should be "gpio-decoder"
+- gpios: a spec of gpios (atleast two) to be decoded to a number with
+  first entry representing the MSB.
+
+Optional Properties:
+- gpio-decoder,max-value: Maximum possible value that can be reported by
+  the gpios.
+- linux,axis: the input subsystem axis to map to (ABS_X/ABS_Y).
+  Defaults to 0 (ABS_X).
+
+Example:
+   gpio-decoder0 {
+   compatible = "gpio-decoder";
+   gpios = < 3 GPIO_ACTIVE_HIGH>,
+   < 2 GPIO_ACTIVE_HIGH>,
+   < 1 GPIO_ACTIVE_HIGH>,
+   < 0 GPIO_ACTIVE_HIGH>;
+   linux,axis = <0>; /* ABS_X */
+   gpio-decoder,max-value = <9>;
+   };
diff --git a/drivers/input/misc/Kconfig b/drivers/input/misc/Kconfig
index efb0ca871327..c44513cac3b7 100644
--- a/drivers/input/misc/Kconfig
+++ b/drivers/input/misc/Kconfig
@@ -292,6 +292,18 @@ config INPUT_GPIO_TILT_POLLED
  To compile this driver as a module, choose M here: the
  module will be called gpio_tilt_polled.
 
+config INPUT_GPIO_DECODER
+   tristate "Polled GPIO Decoder Input driver"
+   depends on GPIOLIB && OF
+   select INPUT_POLLDEV
+   help
+Say Y here if you want driver to read status of multiple GPIO
+lines and report the encoded value as an absolute integer to
+input subsystem.
+
+To compile this driver as a module, choose M here: the module
+will will be called gpio_decoder.
+
 config INPUT_IXP4XX_BEEPER
tristate "IXP4XX Beeper support"
depends on ARCH_IXP4XX
diff --git a/drivers/input/misc/Makefile b/drivers/input/misc/Makefile
index 6a1e5e20fc1c..0b6d025f0487 100644
--- a/drivers/input/misc/Makefile
+++ b/drivers/input/misc/Makefile
@@ -35,6 +35,7 @@ obj-$(CONFIG_INPUT_DRV2667_HAPTICS)   += drv2667.o
 obj-$(CONFIG_INPUT_GP2A)   += gp2ap002a00f.o
 obj-$(CONFIG_INPUT_GPIO_BEEPER)+= gpio-beeper.o
 obj-$(CONFIG_INPUT_GPIO_TILT_POLLED)   += gpio_tilt_polled.o
+obj-$(CONFIG_INPUT_GPIO_DECODER)   += gpio_decoder.o
 obj-$(CONFIG_INPUT_HISI_POWERKEY)  += hisi_powerkey.o
 obj-$(CONFIG_HP_SDC_RTC)   += hp_sdc_rtc.o
 obj-$(CONFIG_INPUT_IMS_PCU)+= ims-pcu.o
diff --git a/drivers/input/misc/gpio_decoder.c 
b/drivers/input/misc/gpio_decoder.c
new file mode 100644
index ..e3bed35fc8db
--- /dev/null
+++ b/drivers/input/misc/gpio_decoder.c
@@ -0,0 +1,128 @@
+/*
+ * gpio-decoder.c
+ *
+ * A generic driver to reads multiple gpio lines and translate the
+ * encoded numeric value into an input event.
+ *
+ * Copyright (C) 2016 Texas Instruments Incorporated - http://www.ti.com/
+ *
+ * This program is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU General Public License as
+ * published by the Free Software Foundation version 2.
+ *
+ * This program is distributed "as is" WITHOUT ANY WARRANTY of any
+ * kind, whether express or implied; without even the implied warranty
+ * of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ */
+
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+
+struct gpio_decoder {
+   struct input_polled_dev *poll_dev;
+   struct gpio_descs *input_gpios;
+   struct device *dev;
+   u32 axis;
+   u32 last_stable;
+};
+
+static unsigned int gpio_decoder_get_gpios_state(struct gpio_decoder
+*decoder)
+{
+ 

[PATCH 2/2] ARM: dts: am335x-icev2: Add nodes for gpio-decoder

2016-08-11 Thread Vignesh R
AM335x ICE board has a rotary-switch connected to PCA9536 I2C GPIO
expander. The position of the rotary-switch is reflected by status of
GPIO lines. Add gpio-decoder node to read these GPIO line status via
gpio-decoder driver and report it as an input event to the system.

Signed-off-by: Vignesh R 
---
 arch/arm/boot/dts/am335x-icev2.dts | 9 +
 1 file changed, 9 insertions(+)

diff --git a/arch/arm/boot/dts/am335x-icev2.dts 
b/arch/arm/boot/dts/am335x-icev2.dts
index 7d8b8fefdf08..f2de594343f2 100644
--- a/arch/arm/boot/dts/am335x-icev2.dts
+++ b/arch/arm/boot/dts/am335x-icev2.dts
@@ -139,6 +139,15 @@
default-state = "off";
};
};
+   gpio-decoder0 {
+   compatible = "gpio-decoder";
+   gpios = < 3 GPIO_ACTIVE_HIGH>,
+   < 2 GPIO_ACTIVE_HIGH>,
+   < 1 GPIO_ACTIVE_HIGH>,
+   < 0 GPIO_ACTIVE_HIGH>;
+   linux,axis = <0>; /* ABS_X */
+   gpio-decoder,max-value = <9>;
+   };
 };
 
 _pinmux {
-- 
2.9.2



[PATCH 2/2] ARM: dts: am335x-icev2: Add nodes for gpio-decoder

2016-08-11 Thread Vignesh R
AM335x ICE board has a rotary-switch connected to PCA9536 I2C GPIO
expander. The position of the rotary-switch is reflected by status of
GPIO lines. Add gpio-decoder node to read these GPIO line status via
gpio-decoder driver and report it as an input event to the system.

Signed-off-by: Vignesh R 
---
 arch/arm/boot/dts/am335x-icev2.dts | 9 +
 1 file changed, 9 insertions(+)

diff --git a/arch/arm/boot/dts/am335x-icev2.dts 
b/arch/arm/boot/dts/am335x-icev2.dts
index 7d8b8fefdf08..f2de594343f2 100644
--- a/arch/arm/boot/dts/am335x-icev2.dts
+++ b/arch/arm/boot/dts/am335x-icev2.dts
@@ -139,6 +139,15 @@
default-state = "off";
};
};
+   gpio-decoder0 {
+   compatible = "gpio-decoder";
+   gpios = < 3 GPIO_ACTIVE_HIGH>,
+   < 2 GPIO_ACTIVE_HIGH>,
+   < 1 GPIO_ACTIVE_HIGH>,
+   < 0 GPIO_ACTIVE_HIGH>;
+   linux,axis = <0>; /* ABS_X */
+   gpio-decoder,max-value = <9>;
+   };
 };
 
 _pinmux {
-- 
2.9.2



[PATCH 1/2] input: misc: Add generic input driver to read encoded GPIO lines

2016-08-11 Thread Vignesh R
Add a driver to read group of GPIO lines and provide its status as a
numerical value as input event to the system. This will help in
intefacing devices, that can be connected over GPIOs, that provide input
to the system by driving GPIO lines connected to them like a rotary dial
or a switch.

For example, a rotary switch can be connected to four GPIO lines. The
status of the GPIO lines reflect the actual position of the rotary
switch dial. For example, if dial points to 9, then the four GPIO lines
connected to the switch will read HLLH(0b'1001 = 9). This value
can be reported as an ABS_* event to the input subsystem.

Signed-off-by: Vignesh R 
---
 .../devicetree/bindings/input/gpio-decoder.txt |  23 
 drivers/input/misc/Kconfig |  12 ++
 drivers/input/misc/Makefile|   1 +
 drivers/input/misc/gpio_decoder.c  | 128 +
 4 files changed, 164 insertions(+)
 create mode 100644 Documentation/devicetree/bindings/input/gpio-decoder.txt
 create mode 100644 drivers/input/misc/gpio_decoder.c

diff --git a/Documentation/devicetree/bindings/input/gpio-decoder.txt 
b/Documentation/devicetree/bindings/input/gpio-decoder.txt
new file mode 100644
index ..efd947f52cb3
--- /dev/null
+++ b/Documentation/devicetree/bindings/input/gpio-decoder.txt
@@ -0,0 +1,23 @@
+* GPIO Decoder DT bindings
+
+Required Properties:
+- compatible: should be "gpio-decoder"
+- gpios: a spec of gpios (atleast two) to be decoded to a number with
+  first entry representing the MSB.
+
+Optional Properties:
+- gpio-decoder,max-value: Maximum possible value that can be reported by
+  the gpios.
+- linux,axis: the input subsystem axis to map to (ABS_X/ABS_Y).
+  Defaults to 0 (ABS_X).
+
+Example:
+   gpio-decoder0 {
+   compatible = "gpio-decoder";
+   gpios = < 3 GPIO_ACTIVE_HIGH>,
+   < 2 GPIO_ACTIVE_HIGH>,
+   < 1 GPIO_ACTIVE_HIGH>,
+   < 0 GPIO_ACTIVE_HIGH>;
+   linux,axis = <0>; /* ABS_X */
+   gpio-decoder,max-value = <9>;
+   };
diff --git a/drivers/input/misc/Kconfig b/drivers/input/misc/Kconfig
index efb0ca871327..c44513cac3b7 100644
--- a/drivers/input/misc/Kconfig
+++ b/drivers/input/misc/Kconfig
@@ -292,6 +292,18 @@ config INPUT_GPIO_TILT_POLLED
  To compile this driver as a module, choose M here: the
  module will be called gpio_tilt_polled.
 
+config INPUT_GPIO_DECODER
+   tristate "Polled GPIO Decoder Input driver"
+   depends on GPIOLIB && OF
+   select INPUT_POLLDEV
+   help
+Say Y here if you want driver to read status of multiple GPIO
+lines and report the encoded value as an absolute integer to
+input subsystem.
+
+To compile this driver as a module, choose M here: the module
+will will be called gpio_decoder.
+
 config INPUT_IXP4XX_BEEPER
tristate "IXP4XX Beeper support"
depends on ARCH_IXP4XX
diff --git a/drivers/input/misc/Makefile b/drivers/input/misc/Makefile
index 6a1e5e20fc1c..0b6d025f0487 100644
--- a/drivers/input/misc/Makefile
+++ b/drivers/input/misc/Makefile
@@ -35,6 +35,7 @@ obj-$(CONFIG_INPUT_DRV2667_HAPTICS)   += drv2667.o
 obj-$(CONFIG_INPUT_GP2A)   += gp2ap002a00f.o
 obj-$(CONFIG_INPUT_GPIO_BEEPER)+= gpio-beeper.o
 obj-$(CONFIG_INPUT_GPIO_TILT_POLLED)   += gpio_tilt_polled.o
+obj-$(CONFIG_INPUT_GPIO_DECODER)   += gpio_decoder.o
 obj-$(CONFIG_INPUT_HISI_POWERKEY)  += hisi_powerkey.o
 obj-$(CONFIG_HP_SDC_RTC)   += hp_sdc_rtc.o
 obj-$(CONFIG_INPUT_IMS_PCU)+= ims-pcu.o
diff --git a/drivers/input/misc/gpio_decoder.c 
b/drivers/input/misc/gpio_decoder.c
new file mode 100644
index ..e3bed35fc8db
--- /dev/null
+++ b/drivers/input/misc/gpio_decoder.c
@@ -0,0 +1,128 @@
+/*
+ * gpio-decoder.c
+ *
+ * A generic driver to reads multiple gpio lines and translate the
+ * encoded numeric value into an input event.
+ *
+ * Copyright (C) 2016 Texas Instruments Incorporated - http://www.ti.com/
+ *
+ * This program is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU General Public License as
+ * published by the Free Software Foundation version 2.
+ *
+ * This program is distributed "as is" WITHOUT ANY WARRANTY of any
+ * kind, whether express or implied; without even the implied warranty
+ * of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ */
+
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+
+struct gpio_decoder {
+   struct input_polled_dev *poll_dev;
+   struct gpio_descs *input_gpios;
+   struct device *dev;
+   u32 axis;
+   u32 last_stable;
+};
+
+static unsigned int gpio_decoder_get_gpios_state(struct gpio_decoder
+*decoder)
+{
+   struct 

Re: [LKP] [lkp] [xfs] 68a9f5e700: aim7.jobs-per-min -13.6% regression

2016-08-11 Thread Dave Chinner
On Thu, Aug 11, 2016 at 08:20:53PM -0700, Linus Torvalds wrote:
> On Thu, Aug 11, 2016 at 7:52 PM, Christoph Hellwig  wrote:
> >
> > I can look at that, but indeed optimizing this patch seems a bit
> > stupid.
> 
> The "write less than a full block to the end of the file" is actually
> a reasonably common case.
> 
> It may not make for a great filesystem benchmark, but it also isn't
> actually insane. People who do logging in user space do this all the
> time, for example. And it is *not* stupid in that context. Not at all.
> 
> It's never going to be the *main* thing you do (unless you're AIM),
> but I do think it's worth fixing.
> 
> And AIM7 remains one of those odd benchmarks that people use. I'm not
> quite sure why, but I really do think that the normal "append smaller
> chunks to the end of the file" should absolutely not be dismissed as
> stupid.

Yes, I agree that there are reasons for making sub-block IO work
well (which is why I'm looking to try to fix it), but that does't
mean the benchmark is sane. aim7 is, technically, a "scalability
benchmark". As such, expecting tiny writes to scale to moving large
amounts of data is the "stupid" thing it does. If you scale up the
amount of data you need to move, tehn you ned to scale up the
efficiency of moving that data. Case in point - writing 1GB of data
in 1kb chunks to XFs on a local /dev/pmem1 runs at ~600MB/s, whilst
moving it it in 1MB chunks runs at 1.9GB/s. aim7 doesn't actually
stress the scalability of the hardware, because inefficiencies in
it's implementation prevent it from getting to those limits.

That's what aim7 misses - as speeds and capabilities go up, the way
code needs to be written to make efficient use of the hardware also
changes. e.g. High throughput logging solutions don't write every
incoming log event immediately - they aggregate them into larger
buffers and then write those, knowing that they can support much
higher logging rates by doing this

That's why running aim7 as your "does the filesystem scale"
benchmark is somewhat irrelevant to scaling applications on high
performance systems these days - users with fast storage will be
expecting to see that 1.9GB/s throughput from their app, not
600MB/s

Cheers,

Dave.
-- 
Dave Chinner
da...@fromorbit.com


Re: [LKP] [lkp] [xfs] 68a9f5e700: aim7.jobs-per-min -13.6% regression

2016-08-11 Thread Dave Chinner
On Thu, Aug 11, 2016 at 08:20:53PM -0700, Linus Torvalds wrote:
> On Thu, Aug 11, 2016 at 7:52 PM, Christoph Hellwig  wrote:
> >
> > I can look at that, but indeed optimizing this patch seems a bit
> > stupid.
> 
> The "write less than a full block to the end of the file" is actually
> a reasonably common case.
> 
> It may not make for a great filesystem benchmark, but it also isn't
> actually insane. People who do logging in user space do this all the
> time, for example. And it is *not* stupid in that context. Not at all.
> 
> It's never going to be the *main* thing you do (unless you're AIM),
> but I do think it's worth fixing.
> 
> And AIM7 remains one of those odd benchmarks that people use. I'm not
> quite sure why, but I really do think that the normal "append smaller
> chunks to the end of the file" should absolutely not be dismissed as
> stupid.

Yes, I agree that there are reasons for making sub-block IO work
well (which is why I'm looking to try to fix it), but that does't
mean the benchmark is sane. aim7 is, technically, a "scalability
benchmark". As such, expecting tiny writes to scale to moving large
amounts of data is the "stupid" thing it does. If you scale up the
amount of data you need to move, tehn you ned to scale up the
efficiency of moving that data. Case in point - writing 1GB of data
in 1kb chunks to XFs on a local /dev/pmem1 runs at ~600MB/s, whilst
moving it it in 1MB chunks runs at 1.9GB/s. aim7 doesn't actually
stress the scalability of the hardware, because inefficiencies in
it's implementation prevent it from getting to those limits.

That's what aim7 misses - as speeds and capabilities go up, the way
code needs to be written to make efficient use of the hardware also
changes. e.g. High throughput logging solutions don't write every
incoming log event immediately - they aggregate them into larger
buffers and then write those, knowing that they can support much
higher logging rates by doing this

That's why running aim7 as your "does the filesystem scale"
benchmark is somewhat irrelevant to scaling applications on high
performance systems these days - users with fast storage will be
expecting to see that 1.9GB/s throughput from their app, not
600MB/s

Cheers,

Dave.
-- 
Dave Chinner
da...@fromorbit.com


Re: kmemleak: Cannot insert 0xff7f1000 into the object search tree (overlaps existing)

2016-08-11 Thread Vignesh R
Hi Catalin,

Thanks for the response!

On Thursday 11 August 2016 10:38 PM, Catalin Marinas wrote:
> On Thu, Aug 11, 2016 at 07:48:12PM +0300, Grygorii Strashko wrote:
>> On 08/11/2016 06:54 PM, Catalin Marinas wrote:
>>> On Thu, Aug 11, 2016 at 05:20:51PM +0530, Vignesh R wrote:
 I see the below message from kmemleak when booting linux-next on AM335x
 GP EVM and DRA7 EVM
>>>
>>> Can you also reproduce it with 4.8-rc1?

Yes, I can reproduce this on 4.8.0-rc1-g4b9eaf33d83d

[...]
> 
> diff --git a/mm/memblock.c b/mm/memblock.c
> index 483197ef613f..7d3361d53ac2 100644
> --- a/mm/memblock.c
> +++ b/mm/memblock.c
> @@ -723,7 +723,8 @@ int __init_memblock memblock_free(phys_addr_t base, 
> phys_addr_t size)
>(unsigned long long)base + size - 1,
>(void *)_RET_IP_);
>  
> - kmemleak_free_part(__va(base), size);
> + if (base < __pa(high_memory))
> + kmemleak_free_part(__va(base), size);
>   return memblock_remove_range(, base, size);
>  }
>  
> @@ -1152,7 +1153,8 @@ static phys_addr_t __init 
> memblock_alloc_range_nid(phys_addr_t size,
>* The min_count is set to 0 so that memblock allocations are
>* never reported as leaks.
>*/
> - kmemleak_alloc(__va(found), size, 0, 0);
> + if (found < __pa(high_memory))
> + kmemleak_alloc(__va(found), size, 0, 0);
>   return found;
>   }
>   return 0;
> @@ -1399,7 +1401,8 @@ void __init __memblock_free_early(phys_addr_t base, 
> phys_addr_t size)
>   memblock_dbg("%s: [%#016llx-%#016llx] %pF\n",
>__func__, (u64)base, (u64)base + size - 1,
>(void *)_RET_IP_);
> - kmemleak_free_part(__va(base), size);
> + if (base < __pa(high_memory))
> + kmemleak_free_part(__va(base), size);
>   memblock_remove_range(, base, size);
>  }
>  
> @@ -1419,7 +1422,8 @@ void __init __memblock_free_late(phys_addr_t base, 
> phys_addr_t size)
>   memblock_dbg("%s: [%#016llx-%#016llx] %pF\n",
>__func__, (u64)base, (u64)base + size - 1,
>(void *)_RET_IP_);
> - kmemleak_free_part(__va(base), size);
> + if (base < __pa(high_memory))
> + kmemleak_free_part(__va(base), size);
>   cursor = PFN_UP(base);
>   end = PFN_DOWN(base + size);
>  
> 

With above change on 4.8-rc1, I see a different warning from kmemleak:

[0.002918] kmemleak: Trying to color unknown object at 0xfe80 as
Black
[0.002943] CPU: 0 PID: 0 Comm: swapper/0 Not tainted
4.8.0-rc1-00121-g4b9eaf33d83d-dirty #59
[0.002955] Hardware name: Generic AM33XX (Flattened Device Tree)
[0.003000] [] (unwind_backtrace) from []
(show_stack+0x10/0x14)
[0.003027] [] (show_stack) from []
(dump_stack+0xac/0xe0)
[0.003052] [] (dump_stack) from []
(paint_ptr+0x78/0x9c)
[0.003074] [] (paint_ptr) from []
(kmemleak_init+0x1cc/0x284)
[0.003104] [] (kmemleak_init) from []
(start_kernel+0x2d8/0x3b4)
[0.003122] [] (start_kernel) from [<8000807c>] (0x8000807c)
[0.003133] kmemleak: Early log backtrace:
[0.003146][] dma_contiguous_reserve+0x80/0x94
[0.003170][] arm_memblock_init+0x130/0x184
[0.003191][] setup_arch+0x58c/0xc00
[0.003208][] start_kernel+0x58/0x3b4
[0.003224][<8000807c>] 0x8000807c
[0.003239][] 0x

Full boot log: http://pastebin.ubuntu.com/23048180/

-- 
Thanks
Vignesh


Re: kmemleak: Cannot insert 0xff7f1000 into the object search tree (overlaps existing)

2016-08-11 Thread Vignesh R
Hi Catalin,

Thanks for the response!

On Thursday 11 August 2016 10:38 PM, Catalin Marinas wrote:
> On Thu, Aug 11, 2016 at 07:48:12PM +0300, Grygorii Strashko wrote:
>> On 08/11/2016 06:54 PM, Catalin Marinas wrote:
>>> On Thu, Aug 11, 2016 at 05:20:51PM +0530, Vignesh R wrote:
 I see the below message from kmemleak when booting linux-next on AM335x
 GP EVM and DRA7 EVM
>>>
>>> Can you also reproduce it with 4.8-rc1?

Yes, I can reproduce this on 4.8.0-rc1-g4b9eaf33d83d

[...]
> 
> diff --git a/mm/memblock.c b/mm/memblock.c
> index 483197ef613f..7d3361d53ac2 100644
> --- a/mm/memblock.c
> +++ b/mm/memblock.c
> @@ -723,7 +723,8 @@ int __init_memblock memblock_free(phys_addr_t base, 
> phys_addr_t size)
>(unsigned long long)base + size - 1,
>(void *)_RET_IP_);
>  
> - kmemleak_free_part(__va(base), size);
> + if (base < __pa(high_memory))
> + kmemleak_free_part(__va(base), size);
>   return memblock_remove_range(, base, size);
>  }
>  
> @@ -1152,7 +1153,8 @@ static phys_addr_t __init 
> memblock_alloc_range_nid(phys_addr_t size,
>* The min_count is set to 0 so that memblock allocations are
>* never reported as leaks.
>*/
> - kmemleak_alloc(__va(found), size, 0, 0);
> + if (found < __pa(high_memory))
> + kmemleak_alloc(__va(found), size, 0, 0);
>   return found;
>   }
>   return 0;
> @@ -1399,7 +1401,8 @@ void __init __memblock_free_early(phys_addr_t base, 
> phys_addr_t size)
>   memblock_dbg("%s: [%#016llx-%#016llx] %pF\n",
>__func__, (u64)base, (u64)base + size - 1,
>(void *)_RET_IP_);
> - kmemleak_free_part(__va(base), size);
> + if (base < __pa(high_memory))
> + kmemleak_free_part(__va(base), size);
>   memblock_remove_range(, base, size);
>  }
>  
> @@ -1419,7 +1422,8 @@ void __init __memblock_free_late(phys_addr_t base, 
> phys_addr_t size)
>   memblock_dbg("%s: [%#016llx-%#016llx] %pF\n",
>__func__, (u64)base, (u64)base + size - 1,
>(void *)_RET_IP_);
> - kmemleak_free_part(__va(base), size);
> + if (base < __pa(high_memory))
> + kmemleak_free_part(__va(base), size);
>   cursor = PFN_UP(base);
>   end = PFN_DOWN(base + size);
>  
> 

With above change on 4.8-rc1, I see a different warning from kmemleak:

[0.002918] kmemleak: Trying to color unknown object at 0xfe80 as
Black
[0.002943] CPU: 0 PID: 0 Comm: swapper/0 Not tainted
4.8.0-rc1-00121-g4b9eaf33d83d-dirty #59
[0.002955] Hardware name: Generic AM33XX (Flattened Device Tree)
[0.003000] [] (unwind_backtrace) from []
(show_stack+0x10/0x14)
[0.003027] [] (show_stack) from []
(dump_stack+0xac/0xe0)
[0.003052] [] (dump_stack) from []
(paint_ptr+0x78/0x9c)
[0.003074] [] (paint_ptr) from []
(kmemleak_init+0x1cc/0x284)
[0.003104] [] (kmemleak_init) from []
(start_kernel+0x2d8/0x3b4)
[0.003122] [] (start_kernel) from [<8000807c>] (0x8000807c)
[0.003133] kmemleak: Early log backtrace:
[0.003146][] dma_contiguous_reserve+0x80/0x94
[0.003170][] arm_memblock_init+0x130/0x184
[0.003191][] setup_arch+0x58c/0xc00
[0.003208][] start_kernel+0x58/0x3b4
[0.003224][<8000807c>] 0x8000807c
[0.003239][] 0x

Full boot log: http://pastebin.ubuntu.com/23048180/

-- 
Thanks
Vignesh


Re: [PART2 PATCH v5 00/12] iommu/AMD: Introduce IOMMU AVIC support

2016-08-11 Thread Suravee Suthikulpanit

On 08/09/2016 09:58 PM, Joerg Roedel wrote:

On Mon, Aug 08, 2016 at 09:42:36PM +0700, Suthikulpanit, Suravee wrote:

Hi Joerg/Radim/Paolo,

Are there any other concerns about this series?


Okay, looked again through the iommu-patches and sent you my feedback.
Overall I think we are on track to merge this for v4.9 when the
remaining small issues are addressed and Radim and Paolo give their
Acked-bys and/or Reviewed-bys too.



Joerg




Thanks Joerg. I'll update the patch and send out V6.

Suravee


Re: [PART2 PATCH v5 00/12] iommu/AMD: Introduce IOMMU AVIC support

2016-08-11 Thread Suravee Suthikulpanit

On 08/09/2016 09:58 PM, Joerg Roedel wrote:

On Mon, Aug 08, 2016 at 09:42:36PM +0700, Suthikulpanit, Suravee wrote:

Hi Joerg/Radim/Paolo,

Are there any other concerns about this series?


Okay, looked again through the iommu-patches and sent you my feedback.
Overall I think we are on track to merge this for v4.9 when the
remaining small issues are addressed and Radim and Paolo give their
Acked-bys and/or Reviewed-bys too.



Joerg




Thanks Joerg. I'll update the patch and send out V6.

Suravee


linux-next: Tree for Aug 12

2016-08-11 Thread Stephen Rothwell
Hi all,

Changes since 20160811:

Non-merge commits (relative to Linus' tree): 1492
 1513 files changed, 39934 insertions(+), 15095 deletions(-)



I have created today's linux-next tree at
git://git.kernel.org/pub/scm/linux/kernel/git/next/linux-next.git
(patches at http://www.kernel.org/pub/linux/kernel/next/ ).  If you
are tracking the linux-next tree using git, you should not use "git pull"
to do so as that will try to merge the new linux-next release with the
old one.  You should use "git fetch" and checkout or reset to the new
master.

You can see which trees have been included by looking in the Next/Trees
file in the source.  There are also quilt-import.log and merge.log
files in the Next directory.  Between each merge, the tree was built
with a ppc64_defconfig for powerpc and an allmodconfig (with
CONFIG_BUILD_DOCSRC=n) for x86_64, a multi_v7_defconfig for arm and a
native build of tools/perf. After the final fixups (if any), I do an
x86_64 modules_install followed by builds for x86_64 allnoconfig,
powerpc allnoconfig (32 and 64 bit), ppc44x_defconfig, allyesconfig
(this fails its final link) and pseries_le_defconfig and i386, sparc
and sparc64 defconfig.

Below is a summary of the state of the merge.

I am currently merging 241 trees (counting Linus' and 35 trees of patches
pending for Linus' tree).

Stats about the size of the tree over time can be seen at
http://neuling.org/linux-next-size.html .

Status of my local build tests will be at
http://kisskb.ellerman.id.au/linux-next .  If maintainers want to give
advice about cross compilers/configs that work, we are always open to add
more builds.

Thanks to Randy Dunlap for doing many randconfig builds.  And to Paul
Gortmaker for triage and bug fixes.

-- 
Cheers,
Stephen Rothwell

$ git checkout master
$ git reset --hard stable
Merging origin/master (d3396e1e4ec4 Merge tag 'fixes-for-linus' of 
git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc)
Merging fixes/master (d3396e1e4ec4 Merge tag 'fixes-for-linus' of 
git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc)
Merging kbuild-current/rc-fixes (b36fad65d61f kbuild: Initialize exported 
variables)
Merging arc-current/for-curr (45c3b08a117e ARC: Elide redundant setup of DMA 
callbacks)
Merging arm-current/fixes (87eed3c74d7c ARM: fix address limit restoration for 
undefined instructions)
Merging m68k-current/for-linus (6bd80f372371 m68k/defconfig: Update defconfigs 
for v4.7-rc2)
Merging metag-fixes/fixes (97b1d23f7bcb metag: Drop show_mem() from mem_init())
Merging powerpc-fixes/fixes (ca49e64f0cb1 selftests/powerpc: Specify we expect 
to build with std=gnu99)
Merging powerpc-merge-mpe/fixes (bc0195aad0da Linux 4.2-rc2)
Merging sparc/master (4620a06e4b3c shmem: Fix link error if huge pages support 
is disabled)
Merging net/master (bbe11fab0b6c macsec: use after free when deleting the 
underlying device)
Merging ipsec/master (6916fb3b10b3 xfrm: Ignore socket policies when rebuilding 
hash tables)
Merging netfilter/master (4b5b9ba553f9 openvswitch: do not ignore netdev errors 
when creating tunnel vports)
Merging ipvs/master (ea43f860d984 Merge branch 'ethoc-fixes')
Merging wireless-drivers/master (034fdd4a17ff Merge ath-current from ath.git)
Merging mac80211/master (4d0bd46a4d55 Revert "wext: Fix 32 bit iwpriv 
compatibility issue with 64 bit Kernel")
Merging sound-current/for-linus (a52ff34e5ec6 ALSA: hda - Manage power well 
properly for resume)
Merging pci-current/for-linus (8b078c603249 PCI: Update 
"pci=resource_alignment" documentation)
Merging driver-core.current/driver-core-linus (29b4817d4018 Linux 4.8-rc1)
Merging tty.current/tty-linus (29b4817d4018 Linux 4.8-rc1)
Merging usb.current/usb-linus (539587511835 usb: misc: usbtest: add fix for 
driver hang)
Merging usb-gadget-fixes/fixes (a0ad85ae866f usb: dwc3: gadget: stop processing 
on HWO set)
Merging usb-serial-fixes/usb-linus (647024a7df36 USB: serial: fix memleak in 
driver-registration error path)
Merging usb-chipidea-fixes/ci-for-usb-stable (ea1d39a31d3b usb: common: 
otg-fsm: add license to usb-otg-fsm)
Merging staging.current/staging-linus (29b4817d4018 Linux 4.8-rc1)
Merging char-misc.current/char-misc-linus (29b4817d4018 Linux 4.8-rc1)
Merging input-current/for-linus (22fe874f3803 Input: silead - remove some dead 
code)
Merging crypto-current/master (a0118c8b2be9 crypto: caam - fix non-hmac hashes)
Merging ide/master (797cee982eef Merge branch 'stable-4.8' of 
git://git.infradead.org/users/pcmoore/audit)
Merging rr-fixes/fixes (8244062ef1e5 modules: fix longstanding /proc/kallsyms 
vs module insertion race.)
Merging vfio-fixes/for-linus (c8952a707556 vfio/pci: Fix NULL pointer oops in 
error interrupt setup handling)
Merging kselftest-fixes/fixes (29b4817d4018 Linux 4.8-rc1)
Merging backlight-fixes/for-backlight-fixes (68feaca0b13e backlight: pwm: 
Handle EPROBE_DEFER while requesting the P

linux-next: Tree for Aug 12

2016-08-11 Thread Stephen Rothwell
Hi all,

Changes since 20160811:

Non-merge commits (relative to Linus' tree): 1492
 1513 files changed, 39934 insertions(+), 15095 deletions(-)



I have created today's linux-next tree at
git://git.kernel.org/pub/scm/linux/kernel/git/next/linux-next.git
(patches at http://www.kernel.org/pub/linux/kernel/next/ ).  If you
are tracking the linux-next tree using git, you should not use "git pull"
to do so as that will try to merge the new linux-next release with the
old one.  You should use "git fetch" and checkout or reset to the new
master.

You can see which trees have been included by looking in the Next/Trees
file in the source.  There are also quilt-import.log and merge.log
files in the Next directory.  Between each merge, the tree was built
with a ppc64_defconfig for powerpc and an allmodconfig (with
CONFIG_BUILD_DOCSRC=n) for x86_64, a multi_v7_defconfig for arm and a
native build of tools/perf. After the final fixups (if any), I do an
x86_64 modules_install followed by builds for x86_64 allnoconfig,
powerpc allnoconfig (32 and 64 bit), ppc44x_defconfig, allyesconfig
(this fails its final link) and pseries_le_defconfig and i386, sparc
and sparc64 defconfig.

Below is a summary of the state of the merge.

I am currently merging 241 trees (counting Linus' and 35 trees of patches
pending for Linus' tree).

Stats about the size of the tree over time can be seen at
http://neuling.org/linux-next-size.html .

Status of my local build tests will be at
http://kisskb.ellerman.id.au/linux-next .  If maintainers want to give
advice about cross compilers/configs that work, we are always open to add
more builds.

Thanks to Randy Dunlap for doing many randconfig builds.  And to Paul
Gortmaker for triage and bug fixes.

-- 
Cheers,
Stephen Rothwell

$ git checkout master
$ git reset --hard stable
Merging origin/master (d3396e1e4ec4 Merge tag 'fixes-for-linus' of 
git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc)
Merging fixes/master (d3396e1e4ec4 Merge tag 'fixes-for-linus' of 
git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc)
Merging kbuild-current/rc-fixes (b36fad65d61f kbuild: Initialize exported 
variables)
Merging arc-current/for-curr (45c3b08a117e ARC: Elide redundant setup of DMA 
callbacks)
Merging arm-current/fixes (87eed3c74d7c ARM: fix address limit restoration for 
undefined instructions)
Merging m68k-current/for-linus (6bd80f372371 m68k/defconfig: Update defconfigs 
for v4.7-rc2)
Merging metag-fixes/fixes (97b1d23f7bcb metag: Drop show_mem() from mem_init())
Merging powerpc-fixes/fixes (ca49e64f0cb1 selftests/powerpc: Specify we expect 
to build with std=gnu99)
Merging powerpc-merge-mpe/fixes (bc0195aad0da Linux 4.2-rc2)
Merging sparc/master (4620a06e4b3c shmem: Fix link error if huge pages support 
is disabled)
Merging net/master (bbe11fab0b6c macsec: use after free when deleting the 
underlying device)
Merging ipsec/master (6916fb3b10b3 xfrm: Ignore socket policies when rebuilding 
hash tables)
Merging netfilter/master (4b5b9ba553f9 openvswitch: do not ignore netdev errors 
when creating tunnel vports)
Merging ipvs/master (ea43f860d984 Merge branch 'ethoc-fixes')
Merging wireless-drivers/master (034fdd4a17ff Merge ath-current from ath.git)
Merging mac80211/master (4d0bd46a4d55 Revert "wext: Fix 32 bit iwpriv 
compatibility issue with 64 bit Kernel")
Merging sound-current/for-linus (a52ff34e5ec6 ALSA: hda - Manage power well 
properly for resume)
Merging pci-current/for-linus (8b078c603249 PCI: Update 
"pci=resource_alignment" documentation)
Merging driver-core.current/driver-core-linus (29b4817d4018 Linux 4.8-rc1)
Merging tty.current/tty-linus (29b4817d4018 Linux 4.8-rc1)
Merging usb.current/usb-linus (539587511835 usb: misc: usbtest: add fix for 
driver hang)
Merging usb-gadget-fixes/fixes (a0ad85ae866f usb: dwc3: gadget: stop processing 
on HWO set)
Merging usb-serial-fixes/usb-linus (647024a7df36 USB: serial: fix memleak in 
driver-registration error path)
Merging usb-chipidea-fixes/ci-for-usb-stable (ea1d39a31d3b usb: common: 
otg-fsm: add license to usb-otg-fsm)
Merging staging.current/staging-linus (29b4817d4018 Linux 4.8-rc1)
Merging char-misc.current/char-misc-linus (29b4817d4018 Linux 4.8-rc1)
Merging input-current/for-linus (22fe874f3803 Input: silead - remove some dead 
code)
Merging crypto-current/master (a0118c8b2be9 crypto: caam - fix non-hmac hashes)
Merging ide/master (797cee982eef Merge branch 'stable-4.8' of 
git://git.infradead.org/users/pcmoore/audit)
Merging rr-fixes/fixes (8244062ef1e5 modules: fix longstanding /proc/kallsyms 
vs module insertion race.)
Merging vfio-fixes/for-linus (c8952a707556 vfio/pci: Fix NULL pointer oops in 
error interrupt setup handling)
Merging kselftest-fixes/fixes (29b4817d4018 Linux 4.8-rc1)
Merging backlight-fixes/for-backlight-fixes (68feaca0b13e backlight: pwm: 
Handle EPROBE_DEFER while requesting the P

Re: [LKP] [lkp] [xfs] 68a9f5e700: aim7.jobs-per-min -13.6% regression

2016-08-11 Thread Dave Chinner
On Thu, Aug 11, 2016 at 07:27:52PM -0700, Linus Torvalds wrote:
> On Thu, Aug 11, 2016 at 5:54 PM, Dave Chinner  wrote:
> >
> > So, removing mark_page_accessed() made the spinlock contention
> > *worse*.
> >
> >   36.51%  [kernel]  [k] _raw_spin_unlock_irqrestore
> >6.27%  [kernel]  [k] copy_user_generic_string
> >3.73%  [kernel]  [k] _raw_spin_unlock_irq
> >3.55%  [kernel]  [k] get_page_from_freelist
> >1.97%  [kernel]  [k] do_raw_spin_lock
> >1.72%  [kernel]  [k] __block_commit_write.isra.30
> 
> I don't recall having ever seen the mapping tree_lock as a contention
> point before, but it's not like I've tried that load either. So it
> might be a regression (going back long, I suspect), or just an unusual
> load that nobody has traditionally tested much.
> 
> Single-threaded big file write one page at a time, was it?

Yup. On a 4 node NUMA system.

So when memory reclaim kicks in, there's a write process, a
writeback kworker and 4 kswapd kthreads all banging on the
mapping->tree_lock. There's an awful lot of concurrency happening
behind the scenes of that single user process writing to a file...

> The mapping tree lock has been around forever (it used to be a rw-lock
> long long ago), but I wonder if we might have moved more stuff into it
> (memory accounting comes to mind) causing much worse contention or
> something.

Yeah, there is now a crapton of accounting updated in
account_page_dirtied under the tree lock - memcg, writeback, node,
zone, task, etc. And there's a *lot* of code that
__delete_from_page_cache() can execute under the tree lock.

> Hmm. Just for fun, I googled "tree_lock contention". It's shown up
> before - back in 2006, and it was you hitting it back then too.

Of course! That, however, would have been when I was playing with
real big SGI machines, not a tiddly little 16p VM :P

Cheers,

Dave.
-- 
Dave Chinner
da...@fromorbit.com


Re: [LKP] [lkp] [xfs] 68a9f5e700: aim7.jobs-per-min -13.6% regression

2016-08-11 Thread Dave Chinner
On Thu, Aug 11, 2016 at 07:27:52PM -0700, Linus Torvalds wrote:
> On Thu, Aug 11, 2016 at 5:54 PM, Dave Chinner  wrote:
> >
> > So, removing mark_page_accessed() made the spinlock contention
> > *worse*.
> >
> >   36.51%  [kernel]  [k] _raw_spin_unlock_irqrestore
> >6.27%  [kernel]  [k] copy_user_generic_string
> >3.73%  [kernel]  [k] _raw_spin_unlock_irq
> >3.55%  [kernel]  [k] get_page_from_freelist
> >1.97%  [kernel]  [k] do_raw_spin_lock
> >1.72%  [kernel]  [k] __block_commit_write.isra.30
> 
> I don't recall having ever seen the mapping tree_lock as a contention
> point before, but it's not like I've tried that load either. So it
> might be a regression (going back long, I suspect), or just an unusual
> load that nobody has traditionally tested much.
> 
> Single-threaded big file write one page at a time, was it?

Yup. On a 4 node NUMA system.

So when memory reclaim kicks in, there's a write process, a
writeback kworker and 4 kswapd kthreads all banging on the
mapping->tree_lock. There's an awful lot of concurrency happening
behind the scenes of that single user process writing to a file...

> The mapping tree lock has been around forever (it used to be a rw-lock
> long long ago), but I wonder if we might have moved more stuff into it
> (memory accounting comes to mind) causing much worse contention or
> something.

Yeah, there is now a crapton of accounting updated in
account_page_dirtied under the tree lock - memcg, writeback, node,
zone, task, etc. And there's a *lot* of code that
__delete_from_page_cache() can execute under the tree lock.

> Hmm. Just for fun, I googled "tree_lock contention". It's shown up
> before - back in 2006, and it was you hitting it back then too.

Of course! That, however, would have been when I was playing with
real big SGI machines, not a tiddly little 16p VM :P

Cheers,

Dave.
-- 
Dave Chinner
da...@fromorbit.com


Re: [PATCH] mm: Add the ram_latent_entropy kernel parameter

2016-08-11 Thread Balbir Singh


On 11/08/16 08:28, Kees Cook wrote:
> From: Emese Revfy 
> 
> When "ram_latent_entropy" is passed on the kernel command line, entropy
> will be extracted from up to the first 4GB of RAM while the runtime memory
> allocator is being initialized. This entropy isn't cryptographically
> secure, but does help provide additional unpredictability on otherwise
> low-entropy systems.
> 
> Based on work created by the PaX Team.
> 
> Signed-off-by: Emese Revfy 
> [kees: renamed parameter, dropped relationship with plugin, updated log]
> Signed-off-by: Kees Cook 
> ---
> This patch has been extracted from the latent_entropy gcc plugin, as
> suggested by Linus: https://lkml.org/lkml/2016/8/8/840
> ---
>  Documentation/kernel-parameters.txt |  5 +
>  mm/page_alloc.c | 21 +
>  2 files changed, 26 insertions(+)
> 
> diff --git a/Documentation/kernel-parameters.txt 
> b/Documentation/kernel-parameters.txt
> index 46c030a49186..9d054984370f 100644
> --- a/Documentation/kernel-parameters.txt
> +++ b/Documentation/kernel-parameters.txt
> @@ -3245,6 +3245,11 @@ bytes respectively. Such letter suffixes can also be 
> entirely omitted.
>   raid=   [HW,RAID]
>   See Documentation/md.txt.
>  
> + ram_latent_entropy
> + Enable a very simple form of latent entropy extraction
> + from the first 4GB of memory as the bootmem allocator
> + passes the memory pages to the buddy allocator.
> +
>   ramdisk_size=   [RAM] Sizes of RAM disks in kilobytes
>   See Documentation/blockdev/ramdisk.txt.
>  
> diff --git a/mm/page_alloc.c b/mm/page_alloc.c
> index fb975cec3518..1de94f0ff29d 100644
> --- a/mm/page_alloc.c
> +++ b/mm/page_alloc.c
> @@ -64,6 +64,7 @@
>  #include 
>  #include 
>  #include 
> +#include 
>  
>  #include 
>  #include 
> @@ -1236,6 +1237,15 @@ static void __free_pages_ok(struct page *page, 
> unsigned int order)
>   local_irq_restore(flags);
>  }
>  
> +bool __meminitdata ram_latent_entropy;
> +
> +static int __init setup_ram_latent_entropy(char *str)
> +{
> + ram_latent_entropy = true;
> + return 0;
> +}
> +early_param("ram_latent_entropy", setup_ram_latent_entropy);
> +
>  static void __init __free_pages_boot_core(struct page *page, unsigned int 
> order)
>  {
>   unsigned int nr_pages = 1 << order;
> @@ -1251,6 +1261,17 @@ static void __init __free_pages_boot_core(struct page 
> *page, unsigned int order)
>   __ClearPageReserved(p);
>   set_page_count(p, 0);
>  
> + if (ram_latent_entropy && !PageHighMem(page) &&
> + page_to_pfn(page) < 0x10) {
> + u64 hash = 0;
> + size_t index, end = PAGE_SIZE * nr_pages / sizeof(hash);
> + const u64 *data = lowmem_page_address(page);
> +
> + for (index = 0; index < end; index++)
> + hash ^= hash + data[index];

Won't the hash be the same across boots? Is this entropy addition for
KASLR, since it is so early in boot?q

> + add_device_randomness((const void *), sizeof(hash));
> + }
> +
>   page_zone(page)->managed_pages += nr_pages;
>   set_page_refcounted(page);
>   __free_pages(page, order);
> 


Balbir Singh


Re: [PATCH] mm: Add the ram_latent_entropy kernel parameter

2016-08-11 Thread Balbir Singh


On 11/08/16 08:28, Kees Cook wrote:
> From: Emese Revfy 
> 
> When "ram_latent_entropy" is passed on the kernel command line, entropy
> will be extracted from up to the first 4GB of RAM while the runtime memory
> allocator is being initialized. This entropy isn't cryptographically
> secure, but does help provide additional unpredictability on otherwise
> low-entropy systems.
> 
> Based on work created by the PaX Team.
> 
> Signed-off-by: Emese Revfy 
> [kees: renamed parameter, dropped relationship with plugin, updated log]
> Signed-off-by: Kees Cook 
> ---
> This patch has been extracted from the latent_entropy gcc plugin, as
> suggested by Linus: https://lkml.org/lkml/2016/8/8/840
> ---
>  Documentation/kernel-parameters.txt |  5 +
>  mm/page_alloc.c | 21 +
>  2 files changed, 26 insertions(+)
> 
> diff --git a/Documentation/kernel-parameters.txt 
> b/Documentation/kernel-parameters.txt
> index 46c030a49186..9d054984370f 100644
> --- a/Documentation/kernel-parameters.txt
> +++ b/Documentation/kernel-parameters.txt
> @@ -3245,6 +3245,11 @@ bytes respectively. Such letter suffixes can also be 
> entirely omitted.
>   raid=   [HW,RAID]
>   See Documentation/md.txt.
>  
> + ram_latent_entropy
> + Enable a very simple form of latent entropy extraction
> + from the first 4GB of memory as the bootmem allocator
> + passes the memory pages to the buddy allocator.
> +
>   ramdisk_size=   [RAM] Sizes of RAM disks in kilobytes
>   See Documentation/blockdev/ramdisk.txt.
>  
> diff --git a/mm/page_alloc.c b/mm/page_alloc.c
> index fb975cec3518..1de94f0ff29d 100644
> --- a/mm/page_alloc.c
> +++ b/mm/page_alloc.c
> @@ -64,6 +64,7 @@
>  #include 
>  #include 
>  #include 
> +#include 
>  
>  #include 
>  #include 
> @@ -1236,6 +1237,15 @@ static void __free_pages_ok(struct page *page, 
> unsigned int order)
>   local_irq_restore(flags);
>  }
>  
> +bool __meminitdata ram_latent_entropy;
> +
> +static int __init setup_ram_latent_entropy(char *str)
> +{
> + ram_latent_entropy = true;
> + return 0;
> +}
> +early_param("ram_latent_entropy", setup_ram_latent_entropy);
> +
>  static void __init __free_pages_boot_core(struct page *page, unsigned int 
> order)
>  {
>   unsigned int nr_pages = 1 << order;
> @@ -1251,6 +1261,17 @@ static void __init __free_pages_boot_core(struct page 
> *page, unsigned int order)
>   __ClearPageReserved(p);
>   set_page_count(p, 0);
>  
> + if (ram_latent_entropy && !PageHighMem(page) &&
> + page_to_pfn(page) < 0x10) {
> + u64 hash = 0;
> + size_t index, end = PAGE_SIZE * nr_pages / sizeof(hash);
> + const u64 *data = lowmem_page_address(page);
> +
> + for (index = 0; index < end; index++)
> + hash ^= hash + data[index];

Won't the hash be the same across boots? Is this entropy addition for
KASLR, since it is so early in boot?q

> + add_device_randomness((const void *), sizeof(hash));
> + }
> +
>   page_zone(page)->managed_pages += nr_pages;
>   set_page_refcounted(page);
>   __free_pages(page, order);
> 


Balbir Singh


Re: debug tip after earlycon is closed?

2016-08-11 Thread Masahiro Yamada
Hi.




2016-07-28 16:44 GMT+09:00 Arnd Bergmann :
>
> I think the problem is that you have three consoles:
>
> - the boot console that stays active until a real console comes up
> - the framebuffer console that is initialized early and goes on to
>   disable the bootconsole
> - the serial console that you are looking at but which doesn't get
>   initialized until much later
>
> Clearly something is wrong in this setup and we don't want to
> disable the boot console before the serial console is up.
>
> I guess you could work around it by disabling the framebuffer console
> at compile time, or by having the serial console initialized earlier
> than the framebuffer, but I wonder if this is something that could
> use a more general solution.


Looks like this comes from:

#elif defined(CONFIG_DUMMY_CONSOLE)
conswitchp = _con;
#endif


console= in the kernel-parameter is no problem,
but stdout-path in the chose node goes wrong with it.






2016-07-28 21:20 GMT+09:00 Russell King - ARM Linux :
> I think this is down to how the linux,stdout property is handled.
>
> Normally, with command line specified consoles, if you don't specify
> anything, you get the framebuffer console (actually, it's the first
> registered console, but practically this is the framebuffer if enabled)
> by default.
>
> If you specify a console (or consoles) on the command line, you get
> all of those you specified, (iirc) with /dev/console's input connected
> to the first specified console.
>
> This happens because the command line is parsed for consoles, and
> add_preferred_console() is called for each that are found, whether or
> not drivers are discovered for them.  __add_preferred_console()
> prevents the first registered console being automatically initialised
> by setting selected_console to a non-negative number.
>
> However, the parsing of DT specified consoles occurs at driver
> initialisation time - in uart_add_one_port() via of_console_check().
> Only at this time is add_preferred_console() called, which means that
> the first console registered prior to _any_ selected UART console
> driver mentioned in DT will become active.
>
> When the first console becomes active, the earlycon is disabled, which
> means that in the DT case, if we have a framebuffer enabled which
> registers prior to any selected UART console, the framebuffer will
> stop the earlycon output immediately.
>
> To me, what this means is that the DT parsing of linux,stdout is
> broken - while it may look nice from a design point of view, the
> design is wrong and fails to take account of non-UART consoles in
> the system.


Thanks for clarification.
I agree that stdout-path is something wrong.



Since I switched
from

chosen {

 bootargs = "console=ttyS0,115200";
};

to

chosen {
stdout-path = "serial0:115200n8";
};


I have had bad experiences.



The combination of stdout-path and earlycon gives doubled log.
I reported this in  https://lkml.org/lkml/2015/11/27/170
but not fixed yet.



I could fix the problem by changing


chosen {
 stdout-path = "serial0:115200n8";
 bootargs = "earlycon";
};

to

chosen {
 bootargs = "console=ttyS0,115200
earlycon=uniphier,mmio32,0x54006800,115200";
};






-- 
Best Regards
Masahiro Yamada


Re: debug tip after earlycon is closed?

2016-08-11 Thread Masahiro Yamada
Hi.




2016-07-28 16:44 GMT+09:00 Arnd Bergmann :
>
> I think the problem is that you have three consoles:
>
> - the boot console that stays active until a real console comes up
> - the framebuffer console that is initialized early and goes on to
>   disable the bootconsole
> - the serial console that you are looking at but which doesn't get
>   initialized until much later
>
> Clearly something is wrong in this setup and we don't want to
> disable the boot console before the serial console is up.
>
> I guess you could work around it by disabling the framebuffer console
> at compile time, or by having the serial console initialized earlier
> than the framebuffer, but I wonder if this is something that could
> use a more general solution.


Looks like this comes from:

#elif defined(CONFIG_DUMMY_CONSOLE)
conswitchp = _con;
#endif


console= in the kernel-parameter is no problem,
but stdout-path in the chose node goes wrong with it.






2016-07-28 21:20 GMT+09:00 Russell King - ARM Linux :
> I think this is down to how the linux,stdout property is handled.
>
> Normally, with command line specified consoles, if you don't specify
> anything, you get the framebuffer console (actually, it's the first
> registered console, but practically this is the framebuffer if enabled)
> by default.
>
> If you specify a console (or consoles) on the command line, you get
> all of those you specified, (iirc) with /dev/console's input connected
> to the first specified console.
>
> This happens because the command line is parsed for consoles, and
> add_preferred_console() is called for each that are found, whether or
> not drivers are discovered for them.  __add_preferred_console()
> prevents the first registered console being automatically initialised
> by setting selected_console to a non-negative number.
>
> However, the parsing of DT specified consoles occurs at driver
> initialisation time - in uart_add_one_port() via of_console_check().
> Only at this time is add_preferred_console() called, which means that
> the first console registered prior to _any_ selected UART console
> driver mentioned in DT will become active.
>
> When the first console becomes active, the earlycon is disabled, which
> means that in the DT case, if we have a framebuffer enabled which
> registers prior to any selected UART console, the framebuffer will
> stop the earlycon output immediately.
>
> To me, what this means is that the DT parsing of linux,stdout is
> broken - while it may look nice from a design point of view, the
> design is wrong and fails to take account of non-UART consoles in
> the system.


Thanks for clarification.
I agree that stdout-path is something wrong.



Since I switched
from

chosen {

 bootargs = "console=ttyS0,115200";
};

to

chosen {
stdout-path = "serial0:115200n8";
};


I have had bad experiences.



The combination of stdout-path and earlycon gives doubled log.
I reported this in  https://lkml.org/lkml/2015/11/27/170
but not fixed yet.



I could fix the problem by changing


chosen {
 stdout-path = "serial0:115200n8";
 bootargs = "earlycon";
};

to

chosen {
 bootargs = "console=ttyS0,115200
earlycon=uniphier,mmio32,0x54006800,115200";
};






-- 
Best Regards
Masahiro Yamada


Re: [PATCH] seccomp: Fix tracer exit notifications during fatal signals

2016-08-11 Thread Robert O'Callahan
Thanks!

On Fri, Aug 12, 2016 at 3:12 AM, Oleg Nesterov  wrote:
> > The bug happens because when __seccomp_filter() detects
> > fatal_signal_pending(), it calls do_exit() without dequeuing the fatal
> > signal. When do_exit() sends the PTRACE_EVENT_EXIT
>
> I _never_ understood what PTRACE_EVENT_EXIT should actually do. I mean,
> when it should actually stop. This was never defined.
>
> > notification and
> > that task is descheduled, __schedule() notices that there is a fatal
> > signal pending and changes its state from TASK_TRACED to TASK_RUNNING.
>
> And this can happen anyway, with or without this change, with or without
> seccomp. Because another fatal signal can be pending. So PTRACE_EVENT_EXIT
> actually depends on /dev/random.

True. But at least (as Kees alluded to later) this patch ensures
PTRACE_EVENT_EXIT delivery when exit is due to exit_group() and no
genuine fatal signals are involved.

> Perhaps we should finally define what it should do. Say, it should only
> stop if SIGKILL was sent "implicitely" by exit/exec. But as for exec,
> there are more (off-topic) complications, not sure we actually want this...

The ptrace man page currently says:
> A SIGKILL signal may still cause a PTRACE_EVENT_EXIT stop before actual 
> signal death.  This may be changed in the future; SIGKILL is meant to always 
> immediately kill tasks even under ptrace.  Last confirmed on Linux 3.13.

In practice, a PTRACE_EVENT_EXIT is almost always observed after
SIGKILL. That's nice for rr, because it lets us observe the process's
final state. But it allows a process to stay alive indefinitely after
receiving SIGKILL, so I can see why you might want to change it.

Rob
-- 
lbir ye,ea yer.tnietoehr  rdn rdsme,anea lurpr  edna e hnysnenh hhe uresyf toD
selthor  stor  edna  siewaoeodm  or v sstvr  esBa  kbvted,t rdsme,aoreseoouoto
o l euetiuruewFa  kbn e hnystoivateweh uresyf tulsa rehr  rdm  or rnea lurpr
.a war hsrer holsa rodvted,t  nenh hneireseoouot.tniesiewaoeivatewt sstvr  esn


Re: [PATCH 3/3] ARM: dts: imx6q-evi: Fix onboard hub reset line

2016-08-11 Thread Peter Chen
On Thu, Aug 11, 2016 at 09:40:32AM -0700, Joshua Clayton wrote:
> Previously the onboard hub was made to work by treating its
> reset gpio as a regulator enable.
> Get rid of that kludge now that pwseq has added reset gpio support
> Move pin muxing the hub reset pin into the usbh1 group
> 
> Signed-off-by: Joshua Clayton 
> ---
>  arch/arm/boot/dts/imx6q-evi.dts | 25 +++--
>  1 file changed, 7 insertions(+), 18 deletions(-)
> 
> diff --git a/arch/arm/boot/dts/imx6q-evi.dts b/arch/arm/boot/dts/imx6q-evi.dts
> index 4fa5601..49c6f61 100644
> --- a/arch/arm/boot/dts/imx6q-evi.dts
> +++ b/arch/arm/boot/dts/imx6q-evi.dts
> @@ -54,18 +54,6 @@
>   reg = <0x1000 0x4000>;
>   };
>  
> - reg_usbh1_vbus: regulator-usbhubreset {
> - compatible = "regulator-fixed";
> - regulator-name = "usbh1_vbus";
> - regulator-min-microvolt = <500>;
> - regulator-max-microvolt = <500>;
> - enable-active-high;
> - startup-delay-us = <2>;
> - pinctrl-names = "default";
> - pinctrl-0 = <_usbh1_hubreset>;
> - gpio = < 12 GPIO_ACTIVE_HIGH>;
> - };
> -
>   reg_usb_otg_vbus: regulator-usbotgvbus {
>   compatible = "regulator-fixed";
>   regulator-name = "usb_otg_vbus";
> @@ -204,12 +192,18 @@
>  };
>  
>   {
> - vbus-supply = <_usbh1_vbus>;
>   pinctrl-names = "default";
>   pinctrl-0 = <_usbh1>;
>   dr_mode = "host";
>   disable-over-current;
>   status = "okay";
> +
> + usb2415host: hub@1 {
> + compatible = "usb424,2513";
> + reg = <1>;
> + reset-gpios = < 12 GPIO_ACTIVE_LOW>;
> + reset-duration-us = <3000>;
> + };
>  };
>  
>   {
> @@ -467,11 +461,6 @@
>   MX6QDL_PAD_GPIO_3__USB_H1_OC 0x1b0b0
>   /* usbh1_b OC */

This comment should be for above pin, do you mind I change it in
this patch?

>   MX6QDL_PAD_GPIO_0__GPIO1_IO00 0x1b0b0
> - >;
> - };
> -
> - pinctrl_usbh1_hubreset: usbh1hubresetgrp {
> - fsl,pins = <
>   MX6QDL_PAD_GPIO_17__GPIO7_IO12 0x1b0b0
>   >;
>   };

This changes remind me that I need to change the same thing for udoo
board.

-- 

Best Regards,
Peter Chen


Re: [PATCH] seccomp: Fix tracer exit notifications during fatal signals

2016-08-11 Thread Robert O'Callahan
Thanks!

On Fri, Aug 12, 2016 at 3:12 AM, Oleg Nesterov  wrote:
> > The bug happens because when __seccomp_filter() detects
> > fatal_signal_pending(), it calls do_exit() without dequeuing the fatal
> > signal. When do_exit() sends the PTRACE_EVENT_EXIT
>
> I _never_ understood what PTRACE_EVENT_EXIT should actually do. I mean,
> when it should actually stop. This was never defined.
>
> > notification and
> > that task is descheduled, __schedule() notices that there is a fatal
> > signal pending and changes its state from TASK_TRACED to TASK_RUNNING.
>
> And this can happen anyway, with or without this change, with or without
> seccomp. Because another fatal signal can be pending. So PTRACE_EVENT_EXIT
> actually depends on /dev/random.

True. But at least (as Kees alluded to later) this patch ensures
PTRACE_EVENT_EXIT delivery when exit is due to exit_group() and no
genuine fatal signals are involved.

> Perhaps we should finally define what it should do. Say, it should only
> stop if SIGKILL was sent "implicitely" by exit/exec. But as for exec,
> there are more (off-topic) complications, not sure we actually want this...

The ptrace man page currently says:
> A SIGKILL signal may still cause a PTRACE_EVENT_EXIT stop before actual 
> signal death.  This may be changed in the future; SIGKILL is meant to always 
> immediately kill tasks even under ptrace.  Last confirmed on Linux 3.13.

In practice, a PTRACE_EVENT_EXIT is almost always observed after
SIGKILL. That's nice for rr, because it lets us observe the process's
final state. But it allows a process to stay alive indefinitely after
receiving SIGKILL, so I can see why you might want to change it.

Rob
-- 
lbir ye,ea yer.tnietoehr  rdn rdsme,anea lurpr  edna e hnysnenh hhe uresyf toD
selthor  stor  edna  siewaoeodm  or v sstvr  esBa  kbvted,t rdsme,aoreseoouoto
o l euetiuruewFa  kbn e hnystoivateweh uresyf tulsa rehr  rdm  or rnea lurpr
.a war hsrer holsa rodvted,t  nenh hneireseoouot.tniesiewaoeivatewt sstvr  esn


Re: [PATCH 3/3] ARM: dts: imx6q-evi: Fix onboard hub reset line

2016-08-11 Thread Peter Chen
On Thu, Aug 11, 2016 at 09:40:32AM -0700, Joshua Clayton wrote:
> Previously the onboard hub was made to work by treating its
> reset gpio as a regulator enable.
> Get rid of that kludge now that pwseq has added reset gpio support
> Move pin muxing the hub reset pin into the usbh1 group
> 
> Signed-off-by: Joshua Clayton 
> ---
>  arch/arm/boot/dts/imx6q-evi.dts | 25 +++--
>  1 file changed, 7 insertions(+), 18 deletions(-)
> 
> diff --git a/arch/arm/boot/dts/imx6q-evi.dts b/arch/arm/boot/dts/imx6q-evi.dts
> index 4fa5601..49c6f61 100644
> --- a/arch/arm/boot/dts/imx6q-evi.dts
> +++ b/arch/arm/boot/dts/imx6q-evi.dts
> @@ -54,18 +54,6 @@
>   reg = <0x1000 0x4000>;
>   };
>  
> - reg_usbh1_vbus: regulator-usbhubreset {
> - compatible = "regulator-fixed";
> - regulator-name = "usbh1_vbus";
> - regulator-min-microvolt = <500>;
> - regulator-max-microvolt = <500>;
> - enable-active-high;
> - startup-delay-us = <2>;
> - pinctrl-names = "default";
> - pinctrl-0 = <_usbh1_hubreset>;
> - gpio = < 12 GPIO_ACTIVE_HIGH>;
> - };
> -
>   reg_usb_otg_vbus: regulator-usbotgvbus {
>   compatible = "regulator-fixed";
>   regulator-name = "usb_otg_vbus";
> @@ -204,12 +192,18 @@
>  };
>  
>   {
> - vbus-supply = <_usbh1_vbus>;
>   pinctrl-names = "default";
>   pinctrl-0 = <_usbh1>;
>   dr_mode = "host";
>   disable-over-current;
>   status = "okay";
> +
> + usb2415host: hub@1 {
> + compatible = "usb424,2513";
> + reg = <1>;
> + reset-gpios = < 12 GPIO_ACTIVE_LOW>;
> + reset-duration-us = <3000>;
> + };
>  };
>  
>   {
> @@ -467,11 +461,6 @@
>   MX6QDL_PAD_GPIO_3__USB_H1_OC 0x1b0b0
>   /* usbh1_b OC */

This comment should be for above pin, do you mind I change it in
this patch?

>   MX6QDL_PAD_GPIO_0__GPIO1_IO00 0x1b0b0
> - >;
> - };
> -
> - pinctrl_usbh1_hubreset: usbh1hubresetgrp {
> - fsl,pins = <
>   MX6QDL_PAD_GPIO_17__GPIO7_IO12 0x1b0b0
>   >;
>   };

This changes remind me that I need to change the same thing for udoo
board.

-- 

Best Regards,
Peter Chen


Re: [LKP] [lkp] [xfs] 68a9f5e700: aim7.jobs-per-min -13.6% regression

2016-08-11 Thread Linus Torvalds
On Thu, Aug 11, 2016 at 7:52 PM, Christoph Hellwig  wrote:
>
> I can look at that, but indeed optimizing this patch seems a bit
> stupid.

The "write less than a full block to the end of the file" is actually
a reasonably common case.

It may not make for a great filesystem benchmark, but it also isn't
actually insane. People who do logging in user space do this all the
time, for example. And it is *not* stupid in that context. Not at all.

It's never going to be the *main* thing you do (unless you're AIM),
but I do think it's worth fixing.

And AIM7 remains one of those odd benchmarks that people use. I'm not
quite sure why, but I really do think that the normal "append smaller
chunks to the end of the file" should absolutely not be dismissed as
stupid.

Linus


Re: [LKP] [lkp] [xfs] 68a9f5e700: aim7.jobs-per-min -13.6% regression

2016-08-11 Thread Linus Torvalds
On Thu, Aug 11, 2016 at 7:52 PM, Christoph Hellwig  wrote:
>
> I can look at that, but indeed optimizing this patch seems a bit
> stupid.

The "write less than a full block to the end of the file" is actually
a reasonably common case.

It may not make for a great filesystem benchmark, but it also isn't
actually insane. People who do logging in user space do this all the
time, for example. And it is *not* stupid in that context. Not at all.

It's never going to be the *main* thing you do (unless you're AIM),
but I do think it's worth fixing.

And AIM7 remains one of those odd benchmarks that people use. I'm not
quite sure why, but I really do think that the normal "append smaller
chunks to the end of the file" should absolutely not be dismissed as
stupid.

Linus


Re: [PATCHES] ARM: dts: update udoo and evi hubs to use pwrseq

2016-08-11 Thread Peter Chen
On Thu, Aug 11, 2016 at 09:40:29AM -0700, Joshua Clayton wrote:
> 
> Hi Peter,
> 
> Consider this my attempt to squeeze my own dts into your patch series.
> These patches repalce patch 6 in V5. of the usb pwrseq series
> I adding a #address and #size to imx6qdl.dtsi in patch 1
> and remove those lines from imx6qdl-udoo.dtsi in patch 2 (formerly patch 6)
> Finally I make the change in imx6q-evi.dts and hope is can be slipped into
> the series with the others.
> 

Thanks, I will send them at next revision

-- 

Best Regards,
Peter Chen


Re: [PATCHES] ARM: dts: update udoo and evi hubs to use pwrseq

2016-08-11 Thread Peter Chen
On Thu, Aug 11, 2016 at 09:40:29AM -0700, Joshua Clayton wrote:
> 
> Hi Peter,
> 
> Consider this my attempt to squeeze my own dts into your patch series.
> These patches repalce patch 6 in V5. of the usb pwrseq series
> I adding a #address and #size to imx6qdl.dtsi in patch 1
> and remove those lines from imx6qdl-udoo.dtsi in patch 2 (formerly patch 6)
> Finally I make the change in imx6q-evi.dts and hope is can be slipped into
> the series with the others.
> 

Thanks, I will send them at next revision

-- 

Best Regards,
Peter Chen


Re: [RFC PATCH v7 7/7] Restartable sequences: self-tests

2016-08-11 Thread Mathieu Desnoyers
- On Aug 11, 2016, at 9:28 PM, Boqun Feng boqun.f...@gmail.com wrote:

> On Thu, Aug 11, 2016 at 11:26:30PM +, Mathieu Desnoyers wrote:
>> - On Jul 24, 2016, at 2:01 PM, Dave Watson davejwat...@fb.com wrote:
>> 
>> >>> +static inline __attribute__((always_inline))
>> >>> +bool rseq_finish(struct rseq_lock *rlock,
>> >>> + intptr_t *p, intptr_t to_write,
>> >>> + struct rseq_state start_value)
>> > 
>> >>> This ABI looks like it will work fine for our use case. I don't think it
>> >>> has been mentioned yet, but we may still need multiple asm blocks
>> >>> for differing numbers of writes. For example, an array-based freelist 
>> >>> push:
>> > 
>> >>> void push(void *obj) {
>> >>> if (index < maxlen) {
>> >>> freelist[index++] = obj;
>> >>> }
>> >>> }
>> > 
>> >>> would be more efficiently implemented with a two-write rseq_finish:
>> > 
>> >>> rseq_finish2([index], obj, // first write
>> >>> , index + 1, // second write
>> >>> ...);
>> > 
>> >> Would pairing one rseq_start with two rseq_finish do the trick
>> >> there ?
>> > 
>> > Yes, two rseq_finish works, as long as the extra rseq management overhead
>> > is not substantial.
>> 
>> I've added a commit implementing rseq_finish2() in my rseq volatile
>> dev branch. You can fetch it at:
>> 
>> https://github.com/compudj/linux-percpu-dev/tree/rseq-fallback
>> 
>> I also have a separate test and benchmark tree in addition to the
>> kernel selftests here:
>> 
>> https://github.com/compudj/rseq-test
>> 
>> I named the first write a "speculative" write, and the second write
>> the "final" write.
>> 
> 
> Maybe I miss something subtle, but if the first write is only a
> "speculative" write, why can't we put it in the rseq critical section
> rather than asm block? Like this:
> 
>   do_rseq(..., result, targetptr, newval
>   {
>   newval = index;
>   targetptr = 
>   if (newval < maxlen)
>   freelist[newval++] = obj;
>   else
>   result = false;
>   }
> 
> No extra rseq_finish() is needed here, but maybe a little more
> "speculative" writes?

This won't work unfortunately. The speculative stores need to be
between the rseq_event_counter comparison instruction in the rseq_finish
asm sequence and the final store. The ip fixup is really needed for
correctness of speculative stores. The sequence number scheme only works
for loads.

Putting it in the C code between rseq_start and rseq_finish would lead
to races such as:

thread Athread B
rseq_start


rseq_start
freelist[offset + 1] = obj
rseq_finish
   offset++


freelist[newval + 1] = obj  <--- corrupts the list content.



> Besides, do we allow userspace programs do read-only access to the
> memory objects modified by do_rseq(). If so, we have a problem when
> there are two writes in a do_rseq()(either in the rseq critical section
> or in the asm block), because in current implemetation, these two writes
> are unordered, which makes the readers outside a do_rseq() could observe
> the ordering of writes differently.
> 
> For rseq_finish2(), a simple solution would be making the "final" write
> a RELEASE.

Indeed, we would need a release semantic for the final store here if this
is the common use. Or we could duplicate the "flavors" of rseq_finish2 and
add a rseq_finish2_release. We should find a way to eliminate code duplication
there. I suspect we'll end up doing macros.

Thanks,

Mathieu

> 
> Regards,
> Boqun
> 
>> Thanks,
>> 
>> Mathieu
>> 
>> --
>> Mathieu Desnoyers
>> EfficiOS Inc.
> > http://www.efficios.com

-- 
Mathieu Desnoyers
EfficiOS Inc.
http://www.efficios.com


Re: [V4 PATCH 2/2] mips/panic: Replace smp_send_stop() with kdump friendly version in panic path

2016-08-11 Thread Dave Young
On 08/10/16 at 05:09pm, Hidehiro Kawai wrote:
> Daniel Walker reported problems which happens when
> crash_kexec_post_notifiers kernel option is enabled
> (https://lkml.org/lkml/2015/6/24/44).
> 
> In that case, smp_send_stop() is called before entering kdump routines
> which assume other CPUs are still online.  As the result, kdump
> routines fail to save other CPUs' registers.  Additionally for MIPS
> OCTEON, it misses to stop the watchdog timer.
> 
> To fix this problem, call a new kdump friendly function,
> crash_smp_send_stop(), instead of the smp_send_stop() when
> crash_kexec_post_notifiers is enabled.  crash_smp_send_stop() is a
> weak function, and it just call smp_send_stop().  Architecture
> codes should override it so that kdump can work appropriately.
> This patch provides MIPS version.
> 
> Reported-by: Daniel Walker 
> Fixes: f06e5153f4ae (kernel/panic.c: add "crash_kexec_post_notifiers" option)
> Signed-off-by: Hidehiro Kawai 
> Cc: Ralf Baechle 
> Cc: David Daney 
> Cc: Aaro Koskinen 
> Cc: "Steven J. Hill" 
> Cc: Corey Minyard 
> 
> ---
> I'm not familiar with MIPS, and I don't have a test environment and
> just did build tests only.  Please don't apply this patch until
> someone does enough tests, otherwise simply drop this patch.
> ---
>  arch/mips/cavium-octeon/setup.c  |   14 ++
>  arch/mips/include/asm/kexec.h|1 +
>  arch/mips/kernel/crash.c |   18 +-
>  arch/mips/kernel/machine_kexec.c |1 +
>  4 files changed, 33 insertions(+), 1 deletion(-)
> 
> diff --git a/arch/mips/cavium-octeon/setup.c b/arch/mips/cavium-octeon/setup.c
> index cb16fcc..5537f95 100644
> --- a/arch/mips/cavium-octeon/setup.c
> +++ b/arch/mips/cavium-octeon/setup.c
> @@ -267,6 +267,17 @@ static void octeon_crash_shutdown(struct pt_regs *regs)
>   default_machine_crash_shutdown(regs);
>  }
>  
> +#ifdef CONFIG_SMP
> +void octeon_crash_smp_send_stop(void)
> +{
> + int cpu;
> +
> + /* disable watchdogs */
> + for_each_online_cpu(cpu)
> + cvmx_write_csr(CVMX_CIU_WDOGX(cpu_logical_map(cpu)), 0);
> +}
> +#endif
> +
>  #endif /* CONFIG_KEXEC */
>  
>  #ifdef CONFIG_CAVIUM_RESERVE32
> @@ -911,6 +922,9 @@ void __init prom_init(void)
>   _machine_kexec_shutdown = octeon_shutdown;
>   _machine_crash_shutdown = octeon_crash_shutdown;
>   _machine_kexec_prepare = octeon_kexec_prepare;
> +#ifdef CONFIG_SMP
> + _crash_smp_send_stop = octeon_crash_smp_send_stop;
> +#endif
>  #endif
>  
>   octeon_user_io_init();
> diff --git a/arch/mips/include/asm/kexec.h b/arch/mips/include/asm/kexec.h
> index ee25ebb..493a3cc 100644
> --- a/arch/mips/include/asm/kexec.h
> +++ b/arch/mips/include/asm/kexec.h
> @@ -45,6 +45,7 @@ extern const unsigned char kexec_smp_wait[];
>  extern unsigned long secondary_kexec_args[4];
>  extern void (*relocated_kexec_smp_wait) (void *);
>  extern atomic_t kexec_ready_to_reboot;
> +extern void (*_crash_smp_send_stop)(void);
>  #endif
>  #endif
>  
> diff --git a/arch/mips/kernel/crash.c b/arch/mips/kernel/crash.c
> index 610f0f3..1723b17 100644
> --- a/arch/mips/kernel/crash.c
> +++ b/arch/mips/kernel/crash.c
> @@ -47,9 +47,14 @@ static void crash_shutdown_secondary(void *passed_regs)
>  
>  static void crash_kexec_prepare_cpus(void)
>  {
> + static int cpus_stopped;
>   unsigned int msecs;
> + unsigned int ncpus;
>  
> - unsigned int ncpus = num_online_cpus() - 1;/* Excluding the panic cpu */
> + if (cpus_stopped)
> + return;
> +
> + ncpus = num_online_cpus() - 1;/* Excluding the panic cpu */
>  
>   dump_send_ipi(crash_shutdown_secondary);
>   smp_wmb();
> @@ -64,6 +69,17 @@ static void crash_kexec_prepare_cpus(void)
>   cpu_relax();
>   mdelay(1);
>   }
> +
> + cpus_stopped = 1;
> +}
> +
> +/* Override the weak function in kernel/panic.c */
> +void crash_smp_send_stop(void)
> +{
> + if (_crash_smp_send_stop)
> + _crash_smp_send_stop();
> +
> + crash_kexec_prepare_cpus();
>  }
>  
>  #else /* !defined(CONFIG_SMP)  */
> diff --git a/arch/mips/kernel/machine_kexec.c 
> b/arch/mips/kernel/machine_kexec.c
> index 50980bf3..5972520 100644
> --- a/arch/mips/kernel/machine_kexec.c
> +++ b/arch/mips/kernel/machine_kexec.c
> @@ -25,6 +25,7 @@ void (*_machine_crash_shutdown)(struct pt_regs *regs) = 
> NULL;
>  #ifdef CONFIG_SMP
>  void (*relocated_kexec_smp_wait) (void *);
>  atomic_t kexec_ready_to_reboot = ATOMIC_INIT(0);
> +void (*_crash_smp_send_stop)(void) = NULL;
>  #endif
>  
>  int
> 
> 

Can any mips people review this patch and have a test?

Thanks
Dave



Re: [RFC PATCH v7 7/7] Restartable sequences: self-tests

2016-08-11 Thread Mathieu Desnoyers
- On Aug 11, 2016, at 9:28 PM, Boqun Feng boqun.f...@gmail.com wrote:

> On Thu, Aug 11, 2016 at 11:26:30PM +, Mathieu Desnoyers wrote:
>> - On Jul 24, 2016, at 2:01 PM, Dave Watson davejwat...@fb.com wrote:
>> 
>> >>> +static inline __attribute__((always_inline))
>> >>> +bool rseq_finish(struct rseq_lock *rlock,
>> >>> + intptr_t *p, intptr_t to_write,
>> >>> + struct rseq_state start_value)
>> > 
>> >>> This ABI looks like it will work fine for our use case. I don't think it
>> >>> has been mentioned yet, but we may still need multiple asm blocks
>> >>> for differing numbers of writes. For example, an array-based freelist 
>> >>> push:
>> > 
>> >>> void push(void *obj) {
>> >>> if (index < maxlen) {
>> >>> freelist[index++] = obj;
>> >>> }
>> >>> }
>> > 
>> >>> would be more efficiently implemented with a two-write rseq_finish:
>> > 
>> >>> rseq_finish2([index], obj, // first write
>> >>> , index + 1, // second write
>> >>> ...);
>> > 
>> >> Would pairing one rseq_start with two rseq_finish do the trick
>> >> there ?
>> > 
>> > Yes, two rseq_finish works, as long as the extra rseq management overhead
>> > is not substantial.
>> 
>> I've added a commit implementing rseq_finish2() in my rseq volatile
>> dev branch. You can fetch it at:
>> 
>> https://github.com/compudj/linux-percpu-dev/tree/rseq-fallback
>> 
>> I also have a separate test and benchmark tree in addition to the
>> kernel selftests here:
>> 
>> https://github.com/compudj/rseq-test
>> 
>> I named the first write a "speculative" write, and the second write
>> the "final" write.
>> 
> 
> Maybe I miss something subtle, but if the first write is only a
> "speculative" write, why can't we put it in the rseq critical section
> rather than asm block? Like this:
> 
>   do_rseq(..., result, targetptr, newval
>   {
>   newval = index;
>   targetptr = 
>   if (newval < maxlen)
>   freelist[newval++] = obj;
>   else
>   result = false;
>   }
> 
> No extra rseq_finish() is needed here, but maybe a little more
> "speculative" writes?

This won't work unfortunately. The speculative stores need to be
between the rseq_event_counter comparison instruction in the rseq_finish
asm sequence and the final store. The ip fixup is really needed for
correctness of speculative stores. The sequence number scheme only works
for loads.

Putting it in the C code between rseq_start and rseq_finish would lead
to races such as:

thread Athread B
rseq_start


rseq_start
freelist[offset + 1] = obj
rseq_finish
   offset++


freelist[newval + 1] = obj  <--- corrupts the list content.



> Besides, do we allow userspace programs do read-only access to the
> memory objects modified by do_rseq(). If so, we have a problem when
> there are two writes in a do_rseq()(either in the rseq critical section
> or in the asm block), because in current implemetation, these two writes
> are unordered, which makes the readers outside a do_rseq() could observe
> the ordering of writes differently.
> 
> For rseq_finish2(), a simple solution would be making the "final" write
> a RELEASE.

Indeed, we would need a release semantic for the final store here if this
is the common use. Or we could duplicate the "flavors" of rseq_finish2 and
add a rseq_finish2_release. We should find a way to eliminate code duplication
there. I suspect we'll end up doing macros.

Thanks,

Mathieu

> 
> Regards,
> Boqun
> 
>> Thanks,
>> 
>> Mathieu
>> 
>> --
>> Mathieu Desnoyers
>> EfficiOS Inc.
> > http://www.efficios.com

-- 
Mathieu Desnoyers
EfficiOS Inc.
http://www.efficios.com


Re: [V4 PATCH 2/2] mips/panic: Replace smp_send_stop() with kdump friendly version in panic path

2016-08-11 Thread Dave Young
On 08/10/16 at 05:09pm, Hidehiro Kawai wrote:
> Daniel Walker reported problems which happens when
> crash_kexec_post_notifiers kernel option is enabled
> (https://lkml.org/lkml/2015/6/24/44).
> 
> In that case, smp_send_stop() is called before entering kdump routines
> which assume other CPUs are still online.  As the result, kdump
> routines fail to save other CPUs' registers.  Additionally for MIPS
> OCTEON, it misses to stop the watchdog timer.
> 
> To fix this problem, call a new kdump friendly function,
> crash_smp_send_stop(), instead of the smp_send_stop() when
> crash_kexec_post_notifiers is enabled.  crash_smp_send_stop() is a
> weak function, and it just call smp_send_stop().  Architecture
> codes should override it so that kdump can work appropriately.
> This patch provides MIPS version.
> 
> Reported-by: Daniel Walker 
> Fixes: f06e5153f4ae (kernel/panic.c: add "crash_kexec_post_notifiers" option)
> Signed-off-by: Hidehiro Kawai 
> Cc: Ralf Baechle 
> Cc: David Daney 
> Cc: Aaro Koskinen 
> Cc: "Steven J. Hill" 
> Cc: Corey Minyard 
> 
> ---
> I'm not familiar with MIPS, and I don't have a test environment and
> just did build tests only.  Please don't apply this patch until
> someone does enough tests, otherwise simply drop this patch.
> ---
>  arch/mips/cavium-octeon/setup.c  |   14 ++
>  arch/mips/include/asm/kexec.h|1 +
>  arch/mips/kernel/crash.c |   18 +-
>  arch/mips/kernel/machine_kexec.c |1 +
>  4 files changed, 33 insertions(+), 1 deletion(-)
> 
> diff --git a/arch/mips/cavium-octeon/setup.c b/arch/mips/cavium-octeon/setup.c
> index cb16fcc..5537f95 100644
> --- a/arch/mips/cavium-octeon/setup.c
> +++ b/arch/mips/cavium-octeon/setup.c
> @@ -267,6 +267,17 @@ static void octeon_crash_shutdown(struct pt_regs *regs)
>   default_machine_crash_shutdown(regs);
>  }
>  
> +#ifdef CONFIG_SMP
> +void octeon_crash_smp_send_stop(void)
> +{
> + int cpu;
> +
> + /* disable watchdogs */
> + for_each_online_cpu(cpu)
> + cvmx_write_csr(CVMX_CIU_WDOGX(cpu_logical_map(cpu)), 0);
> +}
> +#endif
> +
>  #endif /* CONFIG_KEXEC */
>  
>  #ifdef CONFIG_CAVIUM_RESERVE32
> @@ -911,6 +922,9 @@ void __init prom_init(void)
>   _machine_kexec_shutdown = octeon_shutdown;
>   _machine_crash_shutdown = octeon_crash_shutdown;
>   _machine_kexec_prepare = octeon_kexec_prepare;
> +#ifdef CONFIG_SMP
> + _crash_smp_send_stop = octeon_crash_smp_send_stop;
> +#endif
>  #endif
>  
>   octeon_user_io_init();
> diff --git a/arch/mips/include/asm/kexec.h b/arch/mips/include/asm/kexec.h
> index ee25ebb..493a3cc 100644
> --- a/arch/mips/include/asm/kexec.h
> +++ b/arch/mips/include/asm/kexec.h
> @@ -45,6 +45,7 @@ extern const unsigned char kexec_smp_wait[];
>  extern unsigned long secondary_kexec_args[4];
>  extern void (*relocated_kexec_smp_wait) (void *);
>  extern atomic_t kexec_ready_to_reboot;
> +extern void (*_crash_smp_send_stop)(void);
>  #endif
>  #endif
>  
> diff --git a/arch/mips/kernel/crash.c b/arch/mips/kernel/crash.c
> index 610f0f3..1723b17 100644
> --- a/arch/mips/kernel/crash.c
> +++ b/arch/mips/kernel/crash.c
> @@ -47,9 +47,14 @@ static void crash_shutdown_secondary(void *passed_regs)
>  
>  static void crash_kexec_prepare_cpus(void)
>  {
> + static int cpus_stopped;
>   unsigned int msecs;
> + unsigned int ncpus;
>  
> - unsigned int ncpus = num_online_cpus() - 1;/* Excluding the panic cpu */
> + if (cpus_stopped)
> + return;
> +
> + ncpus = num_online_cpus() - 1;/* Excluding the panic cpu */
>  
>   dump_send_ipi(crash_shutdown_secondary);
>   smp_wmb();
> @@ -64,6 +69,17 @@ static void crash_kexec_prepare_cpus(void)
>   cpu_relax();
>   mdelay(1);
>   }
> +
> + cpus_stopped = 1;
> +}
> +
> +/* Override the weak function in kernel/panic.c */
> +void crash_smp_send_stop(void)
> +{
> + if (_crash_smp_send_stop)
> + _crash_smp_send_stop();
> +
> + crash_kexec_prepare_cpus();
>  }
>  
>  #else /* !defined(CONFIG_SMP)  */
> diff --git a/arch/mips/kernel/machine_kexec.c 
> b/arch/mips/kernel/machine_kexec.c
> index 50980bf3..5972520 100644
> --- a/arch/mips/kernel/machine_kexec.c
> +++ b/arch/mips/kernel/machine_kexec.c
> @@ -25,6 +25,7 @@ void (*_machine_crash_shutdown)(struct pt_regs *regs) = 
> NULL;
>  #ifdef CONFIG_SMP
>  void (*relocated_kexec_smp_wait) (void *);
>  atomic_t kexec_ready_to_reboot = ATOMIC_INIT(0);
> +void (*_crash_smp_send_stop)(void) = NULL;
>  #endif
>  
>  int
> 
> 

Can any mips people review this patch and have a test?

Thanks
Dave



Re: [V4 PATCH 1/2] x86/panic: Replace smp_send_stop() with kdump friendly version in panic path

2016-08-11 Thread Dave Young
Hi Hidehiro

Thanks for the update.
On 08/10/16 at 05:09pm, Hidehiro Kawai wrote:
> Daniel Walker reported problems which happens when
> crash_kexec_post_notifiers kernel option is enabled
> (https://lkml.org/lkml/2015/6/24/44).
> 
> In that case, smp_send_stop() is called before entering kdump routines
> which assume other CPUs are still online.  As the result, for x86,
> kdump routines fail to save other CPUs' registers  and disable
> virtualization extensions.

Seems you simplified the changelog, but I think a little more details
will be helpful to understand the patch. You know sometimes lkml.org
does not work well.

> 
> To fix this problem, call a new kdump friendly function,
> crash_smp_send_stop(), instead of the smp_send_stop() when
> crash_kexec_post_notifiers is enabled.  crash_smp_send_stop() is a
> weak function, and it just call smp_send_stop().  Architecture
> codes should override it so that kdump can work appropriately.
> This patch only provides x86-specific version.
> 
> For Xen's PV kernel, just keep the current behavior.

Could you explain a bit about above Xen PV kernel behavior?

BTW, this version looks better,  I think I'm fine with this version
besides of the questions about changelog.

> 
> Changes in V4:
> - Keep to use smp_send_stop if crash_kexec_post_notifiers is not set
> - Rename panic_smp_send_stop to crash_smp_send_stop
> - Don't change the behavior for Xen's PV kernel
> 
> Changes in V3:
> - Revise comments, description, and symbol names
> 
> Changes in V2:
> - Replace smp_send_stop() call with crash_kexec version which
>   saves cpu states and cleans up VMX/SVM
> - Drop a fix for Problem 1 at this moment
> 
> Reported-by: Daniel Walker 
> Fixes: f06e5153f4ae (kernel/panic.c: add "crash_kexec_post_notifiers" option)
> Signed-off-by: Hidehiro Kawai 
> Cc: Dave Young 
> Cc: Baoquan He 
> Cc: Vivek Goyal 
> Cc: Eric Biederman 
> Cc: Masami Hiramatsu 
> Cc: Daniel Walker 
> Cc: Xunlei Pang 
> Cc: Thomas Gleixner 
> Cc: Ingo Molnar 
> Cc: "H. Peter Anvin" 
> Cc: Borislav Petkov 
> Cc: David Vrabel 
> Cc: Toshi Kani 
> Cc: Andrew Morton 
> ---
>  arch/x86/include/asm/kexec.h |1 +
>  arch/x86/include/asm/smp.h   |1 +
>  arch/x86/kernel/crash.c  |   22 +---
>  arch/x86/kernel/smp.c|5 
>  kernel/panic.c   |   47 
> --
>  5 files changed, 66 insertions(+), 10 deletions(-)
> 
> diff --git a/arch/x86/include/asm/kexec.h b/arch/x86/include/asm/kexec.h
> index d2434c1..282630e 100644
> --- a/arch/x86/include/asm/kexec.h
> +++ b/arch/x86/include/asm/kexec.h
> @@ -210,6 +210,7 @@ struct kexec_entry64_regs {
>  
>  typedef void crash_vmclear_fn(void);
>  extern crash_vmclear_fn __rcu *crash_vmclear_loaded_vmcss;
> +extern void kdump_nmi_shootdown_cpus(void);
>  
>  #endif /* __ASSEMBLY__ */
>  
> diff --git a/arch/x86/include/asm/smp.h b/arch/x86/include/asm/smp.h
> index ebd0c16..f70989c 100644
> --- a/arch/x86/include/asm/smp.h
> +++ b/arch/x86/include/asm/smp.h
> @@ -50,6 +50,7 @@ struct smp_ops {
>   void (*smp_cpus_done)(unsigned max_cpus);
>  
>   void (*stop_other_cpus)(int wait);
> + void (*crash_stop_other_cpus)(void);
>   void (*smp_send_reschedule)(int cpu);
>  
>   int (*cpu_up)(unsigned cpu, struct task_struct *tidle);
> diff --git a/arch/x86/kernel/crash.c b/arch/x86/kernel/crash.c
> index 9616cf7..650830e 100644
> --- a/arch/x86/kernel/crash.c
> +++ b/arch/x86/kernel/crash.c
> @@ -133,15 +133,31 @@ static void kdump_nmi_callback(int cpu, struct pt_regs 
> *regs)
>   disable_local_APIC();
>  }
>  
> -static void kdump_nmi_shootdown_cpus(void)
> +void kdump_nmi_shootdown_cpus(void)
>  {
>   nmi_shootdown_cpus(kdump_nmi_callback);
>  
>   disable_local_APIC();
>  }
>  
> +/* Override the weak function in kernel/panic.c */
> +void crash_smp_send_stop(void)
> +{
> + static int cpus_stopped;
> +
> + if (cpus_stopped)
> + return;
> +
> + if (smp_ops.crash_stop_other_cpus)
> + smp_ops.crash_stop_other_cpus();
> + else
> + smp_send_stop();
> +
> + cpus_stopped = 1;
> +}
> +
>  #else
> -static void kdump_nmi_shootdown_cpus(void)
> +void crash_smp_send_stop(void)
>  {
>   /* There are no cpus to shootdown */
>  }
> @@ -160,7 +176,7 @@ void native_machine_crash_shutdown(struct pt_regs *regs)
>   /* The kernel is broken so disable interrupts */
>   local_irq_disable();
>  
> - kdump_nmi_shootdown_cpus();
> + crash_smp_send_stop();
>  
>   /*
>* VMCLEAR VMCSs loaded on this cpu if needed.
> diff --git a/arch/x86/kernel/smp.c b/arch/x86/kernel/smp.c
> 

Re: [V4 PATCH 1/2] x86/panic: Replace smp_send_stop() with kdump friendly version in panic path

2016-08-11 Thread Dave Young
Hi Hidehiro

Thanks for the update.
On 08/10/16 at 05:09pm, Hidehiro Kawai wrote:
> Daniel Walker reported problems which happens when
> crash_kexec_post_notifiers kernel option is enabled
> (https://lkml.org/lkml/2015/6/24/44).
> 
> In that case, smp_send_stop() is called before entering kdump routines
> which assume other CPUs are still online.  As the result, for x86,
> kdump routines fail to save other CPUs' registers  and disable
> virtualization extensions.

Seems you simplified the changelog, but I think a little more details
will be helpful to understand the patch. You know sometimes lkml.org
does not work well.

> 
> To fix this problem, call a new kdump friendly function,
> crash_smp_send_stop(), instead of the smp_send_stop() when
> crash_kexec_post_notifiers is enabled.  crash_smp_send_stop() is a
> weak function, and it just call smp_send_stop().  Architecture
> codes should override it so that kdump can work appropriately.
> This patch only provides x86-specific version.
> 
> For Xen's PV kernel, just keep the current behavior.

Could you explain a bit about above Xen PV kernel behavior?

BTW, this version looks better,  I think I'm fine with this version
besides of the questions about changelog.

> 
> Changes in V4:
> - Keep to use smp_send_stop if crash_kexec_post_notifiers is not set
> - Rename panic_smp_send_stop to crash_smp_send_stop
> - Don't change the behavior for Xen's PV kernel
> 
> Changes in V3:
> - Revise comments, description, and symbol names
> 
> Changes in V2:
> - Replace smp_send_stop() call with crash_kexec version which
>   saves cpu states and cleans up VMX/SVM
> - Drop a fix for Problem 1 at this moment
> 
> Reported-by: Daniel Walker 
> Fixes: f06e5153f4ae (kernel/panic.c: add "crash_kexec_post_notifiers" option)
> Signed-off-by: Hidehiro Kawai 
> Cc: Dave Young 
> Cc: Baoquan He 
> Cc: Vivek Goyal 
> Cc: Eric Biederman 
> Cc: Masami Hiramatsu 
> Cc: Daniel Walker 
> Cc: Xunlei Pang 
> Cc: Thomas Gleixner 
> Cc: Ingo Molnar 
> Cc: "H. Peter Anvin" 
> Cc: Borislav Petkov 
> Cc: David Vrabel 
> Cc: Toshi Kani 
> Cc: Andrew Morton 
> ---
>  arch/x86/include/asm/kexec.h |1 +
>  arch/x86/include/asm/smp.h   |1 +
>  arch/x86/kernel/crash.c  |   22 +---
>  arch/x86/kernel/smp.c|5 
>  kernel/panic.c   |   47 
> --
>  5 files changed, 66 insertions(+), 10 deletions(-)
> 
> diff --git a/arch/x86/include/asm/kexec.h b/arch/x86/include/asm/kexec.h
> index d2434c1..282630e 100644
> --- a/arch/x86/include/asm/kexec.h
> +++ b/arch/x86/include/asm/kexec.h
> @@ -210,6 +210,7 @@ struct kexec_entry64_regs {
>  
>  typedef void crash_vmclear_fn(void);
>  extern crash_vmclear_fn __rcu *crash_vmclear_loaded_vmcss;
> +extern void kdump_nmi_shootdown_cpus(void);
>  
>  #endif /* __ASSEMBLY__ */
>  
> diff --git a/arch/x86/include/asm/smp.h b/arch/x86/include/asm/smp.h
> index ebd0c16..f70989c 100644
> --- a/arch/x86/include/asm/smp.h
> +++ b/arch/x86/include/asm/smp.h
> @@ -50,6 +50,7 @@ struct smp_ops {
>   void (*smp_cpus_done)(unsigned max_cpus);
>  
>   void (*stop_other_cpus)(int wait);
> + void (*crash_stop_other_cpus)(void);
>   void (*smp_send_reschedule)(int cpu);
>  
>   int (*cpu_up)(unsigned cpu, struct task_struct *tidle);
> diff --git a/arch/x86/kernel/crash.c b/arch/x86/kernel/crash.c
> index 9616cf7..650830e 100644
> --- a/arch/x86/kernel/crash.c
> +++ b/arch/x86/kernel/crash.c
> @@ -133,15 +133,31 @@ static void kdump_nmi_callback(int cpu, struct pt_regs 
> *regs)
>   disable_local_APIC();
>  }
>  
> -static void kdump_nmi_shootdown_cpus(void)
> +void kdump_nmi_shootdown_cpus(void)
>  {
>   nmi_shootdown_cpus(kdump_nmi_callback);
>  
>   disable_local_APIC();
>  }
>  
> +/* Override the weak function in kernel/panic.c */
> +void crash_smp_send_stop(void)
> +{
> + static int cpus_stopped;
> +
> + if (cpus_stopped)
> + return;
> +
> + if (smp_ops.crash_stop_other_cpus)
> + smp_ops.crash_stop_other_cpus();
> + else
> + smp_send_stop();
> +
> + cpus_stopped = 1;
> +}
> +
>  #else
> -static void kdump_nmi_shootdown_cpus(void)
> +void crash_smp_send_stop(void)
>  {
>   /* There are no cpus to shootdown */
>  }
> @@ -160,7 +176,7 @@ void native_machine_crash_shutdown(struct pt_regs *regs)
>   /* The kernel is broken so disable interrupts */
>   local_irq_disable();
>  
> - kdump_nmi_shootdown_cpus();
> + crash_smp_send_stop();
>  
>   /*
>* VMCLEAR VMCSs loaded on this cpu if needed.
> diff --git a/arch/x86/kernel/smp.c b/arch/x86/kernel/smp.c
> index 658777c..68f8cc2 100644
> --- a/arch/x86/kernel/smp.c
> +++ b/arch/x86/kernel/smp.c
> @@ -32,6 +32,8 @@
>  #include 
>  #include 
>  #include 
> +#include 
> +
>  /*
>   *   Some notes on x86 processor bugs affecting SMP operation:
>   *
> @@ -342,6 +344,9 @@ struct smp_ops smp_ops = {
>   .smp_cpus_done  = 

Re: [RFC PATCH v7 7/7] Restartable sequences: self-tests

2016-08-11 Thread Mathieu Desnoyers
- On Aug 11, 2016, at 11:10 PM, Mathieu Desnoyers 
mathieu.desnoy...@efficios.com wrote:

> - On Aug 11, 2016, at 9:28 PM, Boqun Feng boqun.f...@gmail.com wrote:
> 
>> On Thu, Aug 11, 2016 at 11:26:30PM +, Mathieu Desnoyers wrote:
>>> - On Jul 24, 2016, at 2:01 PM, Dave Watson davejwat...@fb.com wrote:
>>> 
>>> >>> +static inline __attribute__((always_inline))
>>> >>> +bool rseq_finish(struct rseq_lock *rlock,
>>> >>> + intptr_t *p, intptr_t to_write,
>>> >>> + struct rseq_state start_value)
>>> > 
>>> >>> This ABI looks like it will work fine for our use case. I don't think it
>>> >>> has been mentioned yet, but we may still need multiple asm blocks
>>> >>> for differing numbers of writes. For example, an array-based freelist 
>>> >>> push:
>>> > 
>>> >>> void push(void *obj) {
>>> >>> if (index < maxlen) {
>>> >>> freelist[index++] = obj;
>>> >>> }
>>> >>> }
>>> > 
>>> >>> would be more efficiently implemented with a two-write rseq_finish:
>>> > 
>>> >>> rseq_finish2([index], obj, // first write
>>> >>> , index + 1, // second write
>>> >>> ...);
>>> > 
>>> >> Would pairing one rseq_start with two rseq_finish do the trick
>>> >> there ?
>>> > 
>>> > Yes, two rseq_finish works, as long as the extra rseq management overhead
>>> > is not substantial.
>>> 
>>> I've added a commit implementing rseq_finish2() in my rseq volatile
>>> dev branch. You can fetch it at:
>>> 
>>> https://github.com/compudj/linux-percpu-dev/tree/rseq-fallback
>>> 
>>> I also have a separate test and benchmark tree in addition to the
>>> kernel selftests here:
>>> 
>>> https://github.com/compudj/rseq-test
>>> 
>>> I named the first write a "speculative" write, and the second write
>>> the "final" write.
>>> 
>> 
>> Maybe I miss something subtle, but if the first write is only a
>> "speculative" write, why can't we put it in the rseq critical section
>> rather than asm block? Like this:
>> 
>>  do_rseq(..., result, targetptr, newval
>>  {
>>  newval = index;
>>  targetptr = 
>>  if (newval < maxlen)
>>  freelist[newval++] = obj;
>>  else
>>  result = false;
>>  }
>> 
>> No extra rseq_finish() is needed here, but maybe a little more
>> "speculative" writes?
> 
> This won't work unfortunately. The speculative stores need to be
> between the rseq_event_counter comparison instruction in the rseq_finish
> asm sequence and the final store. The ip fixup is really needed for
> correctness of speculative stores. The sequence number scheme only works
> for loads.
> 
> Putting it in the C code between rseq_start and rseq_finish would lead
> to races such as:
> 
> thread Athread B
> rseq_start
> 
>
>rseq_start
>freelist[offset + 1] = obj
>rseq_finish
>   offset++
>
> 
> freelist[newval + 1] = obj  <--- corrupts the list content.
> 

Small clarification to the scenario:

thread Athread B
rseq_start
load offset into (register 1)

   
   rseq_start
   freelist[offset + 1] = obj
   rseq_finish
  offset++
   

freelist[(register 1) + 1] = obj  <--- corrupts the list content.

Thanks,

Mathieu


> 
> 
>> Besides, do we allow userspace programs do read-only access to the
>> memory objects modified by do_rseq(). If so, we have a problem when
>> there are two writes in a do_rseq()(either in the rseq critical section
>> or in the asm block), because in current implemetation, these two writes
>> are unordered, which makes the readers outside a do_rseq() could observe
>> the ordering of writes differently.
>> 
>> For rseq_finish2(), a simple solution would be making the "final" write
>> a RELEASE.
> 
> Indeed, we would need a release semantic for the final store here if this
> is the common use. Or we could duplicate the "flavors" of rseq_finish2 and
> add a rseq_finish2_release. We should find a way to eliminate code duplication
> there. I suspect we'll end up doing macros.
> 
> Thanks,
> 
> Mathieu
> 
>> 
>> Regards,
>> Boqun
>> 
>>> Thanks,
>>> 
>>> Mathieu
>>> 
>>> --
>>> Mathieu Desnoyers
>>> EfficiOS Inc.
>> > http://www.efficios.com
> 
> --
> Mathieu Desnoyers
> EfficiOS Inc.
> http://www.efficios.com

-- 
Mathieu Desnoyers
EfficiOS Inc.
http://www.efficios.com


[v10.2 PATCH 5/5] drm/rockchip: cdn-dp: add cdn DP support for rk3399

2016-08-11 Thread Chris Zhong
Add support for cdn DP controller which is embedded in the rk3399
SoCs. The DP is compliant with DisplayPort Specification,
Version 1.3, This IP is compatible with the rockchip type-c PHY IP.
There is a uCPU in DP controller, it need a firmware to work,
please put the firmware file to /lib/firmware/rockchip/dptx.bin. The
uCPU in charge of aux communication and link training, the host use
mailbox to communicate with the ucpu.
The dclk pin_pol of vop must not be invert for DP.

Signed-off-by: Chris Zhong 
Reviewed-by: Sean Paul 
Acked-by: Mark Yao 

---

Changes in v10.2:
- remove best_encoder ops

Changes in v10.1:
- support read sink count from DPCD

Changes in v10:
- control the grf_clk in DP

Changes in v9:
- do not need reset the phy before power_on
- add a orientation information for set_capability
- retry to read dpcd in 10 seconds

Changes in v8:
- optimization the err log

Changes in v7:
- support firmware standby when no dptx connection
- optimization the calculation of tu size and valid symbol

Changes in v6:
- add a port struct
- select SND_SOC_HDMI_CODEC
- force reset the phy when hpd detected

Changes in v5:
- alphabetical order
- do not use long, use u32 or u64
- return MODE_CLOCK_HIGH when requested > actual
- Optimized Coding Style
- add a formula to get better tu size and symbol value.
- modify according to Sean Paul's comments
- fixed the fw_wait always 0

Changes in v4:
- use phy framework to control DP phy
- support 2 phys

Changes in v3:
- use EXTCON_DISP_DP and EXTCON_DISP_DP_ALT cable to get dp port state.
- reset spdif before config it
- modify the firmware clk to 100Mhz
- retry load firmware if fw file is requested too early

Changes in v2:
- Alphabetic order
- remove excess error message
- use define clk_rate
- check all return value
- remove dev_set_name(dp->dev, "cdn-dp");
- use schedule_delayed_work
- remove never-called functions
- remove some unnecessary ()

Changes in v1:
- use extcon API
- use hdmi-codec for the DP Asoc
- do not initialize the "ret"
- printk a err log when drm_of_encoder_active_endpoint_id
- modify the dclk pin_pol to a single line

 drivers/gpu/drm/rockchip/Kconfig|  10 +
 drivers/gpu/drm/rockchip/Makefile   |   1 +
 drivers/gpu/drm/rockchip/cdn-dp-core.c  | 927 +++
 drivers/gpu/drm/rockchip/cdn-dp-core.h  | 104 +++
 drivers/gpu/drm/rockchip/cdn-dp-reg.c   | 959 
 drivers/gpu/drm/rockchip/cdn-dp-reg.h   | 482 ++
 drivers/gpu/drm/rockchip/rockchip_drm_vop.c |  13 +-
 drivers/gpu/drm/rockchip/rockchip_drm_vop.h |   9 +
 drivers/gpu/drm/rockchip/rockchip_vop_reg.c |   2 +
 9 files changed, 2504 insertions(+), 3 deletions(-)
 create mode 100644 drivers/gpu/drm/rockchip/cdn-dp-core.c
 create mode 100644 drivers/gpu/drm/rockchip/cdn-dp-core.h
 create mode 100644 drivers/gpu/drm/rockchip/cdn-dp-reg.c
 create mode 100644 drivers/gpu/drm/rockchip/cdn-dp-reg.h

diff --git a/drivers/gpu/drm/rockchip/Kconfig b/drivers/gpu/drm/rockchip/Kconfig
index d30bdc3..20aaafe 100644
--- a/drivers/gpu/drm/rockchip/Kconfig
+++ b/drivers/gpu/drm/rockchip/Kconfig
@@ -25,6 +25,16 @@ config ROCKCHIP_ANALOGIX_DP
  for the Analogix Core DP driver. If you want to enable DP
  on RK3288 based SoC, you should selet this option.
 
+config ROCKCHIP_CDN_DP
+tristate "Rockchip cdn DP"
+depends on DRM_ROCKCHIP
+   select SND_SOC_HDMI_CODEC if SND_SOC
+help
+ This selects support for Rockchip SoC specific extensions
+ for the cdn DP driver. If you want to enable Dp on
+ RK3399 based SoC, you should select this
+ option.
+
 config ROCKCHIP_DW_HDMI
 tristate "Rockchip specific extensions for Synopsys DW HDMI"
 depends on DRM_ROCKCHIP
diff --git a/drivers/gpu/drm/rockchip/Makefile 
b/drivers/gpu/drm/rockchip/Makefile
index 05d0713..abdecd5 100644
--- a/drivers/gpu/drm/rockchip/Makefile
+++ b/drivers/gpu/drm/rockchip/Makefile
@@ -7,6 +7,7 @@ rockchipdrm-y := rockchip_drm_drv.o rockchip_drm_fb.o \
 rockchipdrm-$(CONFIG_DRM_FBDEV_EMULATION) += rockchip_drm_fbdev.o
 
 obj-$(CONFIG_ROCKCHIP_ANALOGIX_DP) += analogix_dp-rockchip.o
+obj-$(CONFIG_ROCKCHIP_CDN_DP) += cdn-dp-core.o cdn-dp-reg.o
 obj-$(CONFIG_ROCKCHIP_DW_HDMI) += dw_hdmi-rockchip.o
 obj-$(CONFIG_ROCKCHIP_DW_MIPI_DSI) += dw-mipi-dsi.o
 obj-$(CONFIG_ROCKCHIP_INNO_HDMI) += inno_hdmi.o
diff --git a/drivers/gpu/drm/rockchip/cdn-dp-core.c 
b/drivers/gpu/drm/rockchip/cdn-dp-core.c
new file mode 100644
index 000..2f068b0
--- /dev/null
+++ b/drivers/gpu/drm/rockchip/cdn-dp-core.c
@@ -0,0 +1,927 @@
+/*
+ * Copyright (C) Fuzhou Rockchip Electronics Co.Ltd
+ * Author: Chris Zhong 
+ *
+ * This software is licensed under the terms of the GNU General Public
+ * License version 2, as published by the Free Software Foundation, and
+ * may be copied, distributed, and modified 

Re: [RFC PATCH v7 7/7] Restartable sequences: self-tests

2016-08-11 Thread Mathieu Desnoyers
- On Aug 11, 2016, at 11:10 PM, Mathieu Desnoyers 
mathieu.desnoy...@efficios.com wrote:

> - On Aug 11, 2016, at 9:28 PM, Boqun Feng boqun.f...@gmail.com wrote:
> 
>> On Thu, Aug 11, 2016 at 11:26:30PM +, Mathieu Desnoyers wrote:
>>> - On Jul 24, 2016, at 2:01 PM, Dave Watson davejwat...@fb.com wrote:
>>> 
>>> >>> +static inline __attribute__((always_inline))
>>> >>> +bool rseq_finish(struct rseq_lock *rlock,
>>> >>> + intptr_t *p, intptr_t to_write,
>>> >>> + struct rseq_state start_value)
>>> > 
>>> >>> This ABI looks like it will work fine for our use case. I don't think it
>>> >>> has been mentioned yet, but we may still need multiple asm blocks
>>> >>> for differing numbers of writes. For example, an array-based freelist 
>>> >>> push:
>>> > 
>>> >>> void push(void *obj) {
>>> >>> if (index < maxlen) {
>>> >>> freelist[index++] = obj;
>>> >>> }
>>> >>> }
>>> > 
>>> >>> would be more efficiently implemented with a two-write rseq_finish:
>>> > 
>>> >>> rseq_finish2([index], obj, // first write
>>> >>> , index + 1, // second write
>>> >>> ...);
>>> > 
>>> >> Would pairing one rseq_start with two rseq_finish do the trick
>>> >> there ?
>>> > 
>>> > Yes, two rseq_finish works, as long as the extra rseq management overhead
>>> > is not substantial.
>>> 
>>> I've added a commit implementing rseq_finish2() in my rseq volatile
>>> dev branch. You can fetch it at:
>>> 
>>> https://github.com/compudj/linux-percpu-dev/tree/rseq-fallback
>>> 
>>> I also have a separate test and benchmark tree in addition to the
>>> kernel selftests here:
>>> 
>>> https://github.com/compudj/rseq-test
>>> 
>>> I named the first write a "speculative" write, and the second write
>>> the "final" write.
>>> 
>> 
>> Maybe I miss something subtle, but if the first write is only a
>> "speculative" write, why can't we put it in the rseq critical section
>> rather than asm block? Like this:
>> 
>>  do_rseq(..., result, targetptr, newval
>>  {
>>  newval = index;
>>  targetptr = 
>>  if (newval < maxlen)
>>  freelist[newval++] = obj;
>>  else
>>  result = false;
>>  }
>> 
>> No extra rseq_finish() is needed here, but maybe a little more
>> "speculative" writes?
> 
> This won't work unfortunately. The speculative stores need to be
> between the rseq_event_counter comparison instruction in the rseq_finish
> asm sequence and the final store. The ip fixup is really needed for
> correctness of speculative stores. The sequence number scheme only works
> for loads.
> 
> Putting it in the C code between rseq_start and rseq_finish would lead
> to races such as:
> 
> thread Athread B
> rseq_start
> 
>
>rseq_start
>freelist[offset + 1] = obj
>rseq_finish
>   offset++
>
> 
> freelist[newval + 1] = obj  <--- corrupts the list content.
> 

Small clarification to the scenario:

thread Athread B
rseq_start
load offset into (register 1)

   
   rseq_start
   freelist[offset + 1] = obj
   rseq_finish
  offset++
   

freelist[(register 1) + 1] = obj  <--- corrupts the list content.

Thanks,

Mathieu


> 
> 
>> Besides, do we allow userspace programs do read-only access to the
>> memory objects modified by do_rseq(). If so, we have a problem when
>> there are two writes in a do_rseq()(either in the rseq critical section
>> or in the asm block), because in current implemetation, these two writes
>> are unordered, which makes the readers outside a do_rseq() could observe
>> the ordering of writes differently.
>> 
>> For rseq_finish2(), a simple solution would be making the "final" write
>> a RELEASE.
> 
> Indeed, we would need a release semantic for the final store here if this
> is the common use. Or we could duplicate the "flavors" of rseq_finish2 and
> add a rseq_finish2_release. We should find a way to eliminate code duplication
> there. I suspect we'll end up doing macros.
> 
> Thanks,
> 
> Mathieu
> 
>> 
>> Regards,
>> Boqun
>> 
>>> Thanks,
>>> 
>>> Mathieu
>>> 
>>> --
>>> Mathieu Desnoyers
>>> EfficiOS Inc.
>> > http://www.efficios.com
> 
> --
> Mathieu Desnoyers
> EfficiOS Inc.
> http://www.efficios.com

-- 
Mathieu Desnoyers
EfficiOS Inc.
http://www.efficios.com


[v10.2 PATCH 5/5] drm/rockchip: cdn-dp: add cdn DP support for rk3399

2016-08-11 Thread Chris Zhong
Add support for cdn DP controller which is embedded in the rk3399
SoCs. The DP is compliant with DisplayPort Specification,
Version 1.3, This IP is compatible with the rockchip type-c PHY IP.
There is a uCPU in DP controller, it need a firmware to work,
please put the firmware file to /lib/firmware/rockchip/dptx.bin. The
uCPU in charge of aux communication and link training, the host use
mailbox to communicate with the ucpu.
The dclk pin_pol of vop must not be invert for DP.

Signed-off-by: Chris Zhong 
Reviewed-by: Sean Paul 
Acked-by: Mark Yao 

---

Changes in v10.2:
- remove best_encoder ops

Changes in v10.1:
- support read sink count from DPCD

Changes in v10:
- control the grf_clk in DP

Changes in v9:
- do not need reset the phy before power_on
- add a orientation information for set_capability
- retry to read dpcd in 10 seconds

Changes in v8:
- optimization the err log

Changes in v7:
- support firmware standby when no dptx connection
- optimization the calculation of tu size and valid symbol

Changes in v6:
- add a port struct
- select SND_SOC_HDMI_CODEC
- force reset the phy when hpd detected

Changes in v5:
- alphabetical order
- do not use long, use u32 or u64
- return MODE_CLOCK_HIGH when requested > actual
- Optimized Coding Style
- add a formula to get better tu size and symbol value.
- modify according to Sean Paul's comments
- fixed the fw_wait always 0

Changes in v4:
- use phy framework to control DP phy
- support 2 phys

Changes in v3:
- use EXTCON_DISP_DP and EXTCON_DISP_DP_ALT cable to get dp port state.
- reset spdif before config it
- modify the firmware clk to 100Mhz
- retry load firmware if fw file is requested too early

Changes in v2:
- Alphabetic order
- remove excess error message
- use define clk_rate
- check all return value
- remove dev_set_name(dp->dev, "cdn-dp");
- use schedule_delayed_work
- remove never-called functions
- remove some unnecessary ()

Changes in v1:
- use extcon API
- use hdmi-codec for the DP Asoc
- do not initialize the "ret"
- printk a err log when drm_of_encoder_active_endpoint_id
- modify the dclk pin_pol to a single line

 drivers/gpu/drm/rockchip/Kconfig|  10 +
 drivers/gpu/drm/rockchip/Makefile   |   1 +
 drivers/gpu/drm/rockchip/cdn-dp-core.c  | 927 +++
 drivers/gpu/drm/rockchip/cdn-dp-core.h  | 104 +++
 drivers/gpu/drm/rockchip/cdn-dp-reg.c   | 959 
 drivers/gpu/drm/rockchip/cdn-dp-reg.h   | 482 ++
 drivers/gpu/drm/rockchip/rockchip_drm_vop.c |  13 +-
 drivers/gpu/drm/rockchip/rockchip_drm_vop.h |   9 +
 drivers/gpu/drm/rockchip/rockchip_vop_reg.c |   2 +
 9 files changed, 2504 insertions(+), 3 deletions(-)
 create mode 100644 drivers/gpu/drm/rockchip/cdn-dp-core.c
 create mode 100644 drivers/gpu/drm/rockchip/cdn-dp-core.h
 create mode 100644 drivers/gpu/drm/rockchip/cdn-dp-reg.c
 create mode 100644 drivers/gpu/drm/rockchip/cdn-dp-reg.h

diff --git a/drivers/gpu/drm/rockchip/Kconfig b/drivers/gpu/drm/rockchip/Kconfig
index d30bdc3..20aaafe 100644
--- a/drivers/gpu/drm/rockchip/Kconfig
+++ b/drivers/gpu/drm/rockchip/Kconfig
@@ -25,6 +25,16 @@ config ROCKCHIP_ANALOGIX_DP
  for the Analogix Core DP driver. If you want to enable DP
  on RK3288 based SoC, you should selet this option.
 
+config ROCKCHIP_CDN_DP
+tristate "Rockchip cdn DP"
+depends on DRM_ROCKCHIP
+   select SND_SOC_HDMI_CODEC if SND_SOC
+help
+ This selects support for Rockchip SoC specific extensions
+ for the cdn DP driver. If you want to enable Dp on
+ RK3399 based SoC, you should select this
+ option.
+
 config ROCKCHIP_DW_HDMI
 tristate "Rockchip specific extensions for Synopsys DW HDMI"
 depends on DRM_ROCKCHIP
diff --git a/drivers/gpu/drm/rockchip/Makefile 
b/drivers/gpu/drm/rockchip/Makefile
index 05d0713..abdecd5 100644
--- a/drivers/gpu/drm/rockchip/Makefile
+++ b/drivers/gpu/drm/rockchip/Makefile
@@ -7,6 +7,7 @@ rockchipdrm-y := rockchip_drm_drv.o rockchip_drm_fb.o \
 rockchipdrm-$(CONFIG_DRM_FBDEV_EMULATION) += rockchip_drm_fbdev.o
 
 obj-$(CONFIG_ROCKCHIP_ANALOGIX_DP) += analogix_dp-rockchip.o
+obj-$(CONFIG_ROCKCHIP_CDN_DP) += cdn-dp-core.o cdn-dp-reg.o
 obj-$(CONFIG_ROCKCHIP_DW_HDMI) += dw_hdmi-rockchip.o
 obj-$(CONFIG_ROCKCHIP_DW_MIPI_DSI) += dw-mipi-dsi.o
 obj-$(CONFIG_ROCKCHIP_INNO_HDMI) += inno_hdmi.o
diff --git a/drivers/gpu/drm/rockchip/cdn-dp-core.c 
b/drivers/gpu/drm/rockchip/cdn-dp-core.c
new file mode 100644
index 000..2f068b0
--- /dev/null
+++ b/drivers/gpu/drm/rockchip/cdn-dp-core.c
@@ -0,0 +1,927 @@
+/*
+ * Copyright (C) Fuzhou Rockchip Electronics Co.Ltd
+ * Author: Chris Zhong 
+ *
+ * This software is licensed under the terms of the GNU General Public
+ * License version 2, as published by the Free Software Foundation, and
+ * may be copied, distributed, and modified under those terms.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * 

Re: [PATCH] timekeeping: Prints the amounts of time spent during suspend

2016-08-11 Thread kbuild test robot
Hi Ruchi,

[auto build test WARNING on tip/timers/core]
[also build test WARNING on v4.8-rc1 next-20160811]
[if your patch is applied to the wrong git tree, please drop us a note to help 
improve the system]

url:
https://github.com/0day-ci/linux/commits/Ruchi-Kandoi/timekeeping-Prints-the-amounts-of-time-spent-during-suspend/20160812-034700
config: openrisc-allmodconfig (attached as .config)
compiler: or32-linux-gcc (GCC) 4.5.1-or32-1.0rc1
reproduce:
wget 
https://git.kernel.org/cgit/linux/kernel/git/wfg/lkp-tests.git/plain/sbin/make.cross
 -O ~/bin/make.cross
chmod +x ~/bin/make.cross
# save the attached .config to linux build tree
make.cross ARCH=openrisc 

All warnings (new ones prefixed by >>):

   kernel/time/timekeeping_debug.c: In function 'tk_debug_account_sleep_time':
>> kernel/time/timekeeping_debug.c:73:2: warning: format '%lu' expects type 
>> 'long unsigned int', but argument 2 has type 'time64_t'

vim +73 kernel/time/timekeeping_debug.c

57  struct dentry *d;
58  
59  d = debugfs_create_file("sleep_time", 0444, NULL, NULL,
60  _debug_sleep_time_fops);
61  if (!d) {
62  pr_err("Failed to create sleep_time debug file\n");
63  return -ENOMEM;
64  }
65  
66  return 0;
67  }
68  late_initcall(tk_debug_sleep_time_init);
69  
70  void tk_debug_account_sleep_time(struct timespec64 *t)
71  {
72  sleep_time_bin[fls(t->tv_sec)]++;
  > 73  pr_info("Suspended for %lu.%03lu seconds\n", t->tv_sec,
74  t->tv_nsec / NSEC_PER_MSEC);
75  }
76  

---
0-DAY kernel test infrastructureOpen Source Technology Center
https://lists.01.org/pipermail/kbuild-all   Intel Corporation


.config.gz
Description: Binary data


Re: [PATCH] timekeeping: Prints the amounts of time spent during suspend

2016-08-11 Thread kbuild test robot
Hi Ruchi,

[auto build test WARNING on tip/timers/core]
[also build test WARNING on v4.8-rc1 next-20160811]
[if your patch is applied to the wrong git tree, please drop us a note to help 
improve the system]

url:
https://github.com/0day-ci/linux/commits/Ruchi-Kandoi/timekeeping-Prints-the-amounts-of-time-spent-during-suspend/20160812-034700
config: openrisc-allmodconfig (attached as .config)
compiler: or32-linux-gcc (GCC) 4.5.1-or32-1.0rc1
reproduce:
wget 
https://git.kernel.org/cgit/linux/kernel/git/wfg/lkp-tests.git/plain/sbin/make.cross
 -O ~/bin/make.cross
chmod +x ~/bin/make.cross
# save the attached .config to linux build tree
make.cross ARCH=openrisc 

All warnings (new ones prefixed by >>):

   kernel/time/timekeeping_debug.c: In function 'tk_debug_account_sleep_time':
>> kernel/time/timekeeping_debug.c:73:2: warning: format '%lu' expects type 
>> 'long unsigned int', but argument 2 has type 'time64_t'

vim +73 kernel/time/timekeeping_debug.c

57  struct dentry *d;
58  
59  d = debugfs_create_file("sleep_time", 0444, NULL, NULL,
60  _debug_sleep_time_fops);
61  if (!d) {
62  pr_err("Failed to create sleep_time debug file\n");
63  return -ENOMEM;
64  }
65  
66  return 0;
67  }
68  late_initcall(tk_debug_sleep_time_init);
69  
70  void tk_debug_account_sleep_time(struct timespec64 *t)
71  {
72  sleep_time_bin[fls(t->tv_sec)]++;
  > 73  pr_info("Suspended for %lu.%03lu seconds\n", t->tv_sec,
74  t->tv_nsec / NSEC_PER_MSEC);
75  }
76  

---
0-DAY kernel test infrastructureOpen Source Technology Center
https://lists.01.org/pipermail/kbuild-all   Intel Corporation


.config.gz
Description: Binary data


Re: [PATCH 0/2] ibmvfc: FC-TAPE Support

2016-08-11 Thread Martin K. Petersen
> "Tyrel" == Tyrel Datwyler  writes:

Tyrel> On 08/03/2016 02:36 PM, Tyrel Datwyler wrote:
>> This patchset introduces optional FC-TAPE/FC Class 3 Error Recovery
>> to the ibmvfc client driver.
>> 
>> Tyrel Datwyler (2): ibmvfc: Set READ FCP_XFER_READY DISABLED bit in
>> PRLI ibmvfc: add FC Class 3 Error Recovery support
>> 
>> drivers/scsi/ibmvscsi/ibmvfc.c | 11 +++
>> drivers/scsi/ibmvscsi/ibmvfc.h | 1 + 2 files changed, 12
>> insertions(+)
>> 

Tyrel> ping?

-ENOREVIEWS

-- 
Martin K. Petersen  Oracle Linux Engineering


Re: [PATCH 0/2] ibmvfc: FC-TAPE Support

2016-08-11 Thread Martin K. Petersen
> "Tyrel" == Tyrel Datwyler  writes:

Tyrel> On 08/03/2016 02:36 PM, Tyrel Datwyler wrote:
>> This patchset introduces optional FC-TAPE/FC Class 3 Error Recovery
>> to the ibmvfc client driver.
>> 
>> Tyrel Datwyler (2): ibmvfc: Set READ FCP_XFER_READY DISABLED bit in
>> PRLI ibmvfc: add FC Class 3 Error Recovery support
>> 
>> drivers/scsi/ibmvscsi/ibmvfc.c | 11 +++
>> drivers/scsi/ibmvscsi/ibmvfc.h | 1 + 2 files changed, 12
>> insertions(+)
>> 

Tyrel> ping?

-ENOREVIEWS

-- 
Martin K. Petersen  Oracle Linux Engineering


Re: [PATCH 1/3] Add a new field to struct shrinker

2016-08-11 Thread Tony Jones
On 07/29/2016 06:00 AM, Mel Gorman wrote:
> On Fri, Jul 29, 2016 at 10:13:40AM +1000, Dave Chinner wrote:
>> On Thu, Jul 28, 2016 at 11:25:13AM +0100, Mel Gorman wrote:
>>> On Thu, Jul 28, 2016 at 03:49:47PM +1000, Dave Chinner wrote:
 Seems you're all missing the obvious.

 Add a tracepoint for a shrinker callback that includes a "name"
 field, have the shrinker callback fill it out appropriately. e.g
 in the superblock shrinker:

trace_shrinker_callback(shrinker, shrink_control, sb->s_type->name);

>>>
>>> That misses capturing the latency of the call unless there is a begin/end
>>> tracepoint.
>>
>> Sure, but I didn't see that in the email talking about how to add a
>> name. Even if it is a requirement, it's not necessary as we've
>> already got shrinker runtime measurements from the
>> trace_mm_shrink_slab_start and trace_mm_shrink_slab_end trace
>> points. With the above callback event, shrinker call runtime is
>> simply the time between the calls to the same shrinker within
>> mm_shrink_slab start/end trace points.
>>
> 
> Fair point. It's not that hard to correlate them.

True but the scan_objects callback is only called if we have >batch_size 
objects.

It's possible to accumulate quite some time without calling the callback and 
being able to obtain 
the s_type->name.   So this time all gets associated with just super_cache_scan.



Re: [PATCH 1/3] Add a new field to struct shrinker

2016-08-11 Thread Tony Jones
On 07/29/2016 06:00 AM, Mel Gorman wrote:
> On Fri, Jul 29, 2016 at 10:13:40AM +1000, Dave Chinner wrote:
>> On Thu, Jul 28, 2016 at 11:25:13AM +0100, Mel Gorman wrote:
>>> On Thu, Jul 28, 2016 at 03:49:47PM +1000, Dave Chinner wrote:
 Seems you're all missing the obvious.

 Add a tracepoint for a shrinker callback that includes a "name"
 field, have the shrinker callback fill it out appropriately. e.g
 in the superblock shrinker:

trace_shrinker_callback(shrinker, shrink_control, sb->s_type->name);

>>>
>>> That misses capturing the latency of the call unless there is a begin/end
>>> tracepoint.
>>
>> Sure, but I didn't see that in the email talking about how to add a
>> name. Even if it is a requirement, it's not necessary as we've
>> already got shrinker runtime measurements from the
>> trace_mm_shrink_slab_start and trace_mm_shrink_slab_end trace
>> points. With the above callback event, shrinker call runtime is
>> simply the time between the calls to the same shrinker within
>> mm_shrink_slab start/end trace points.
>>
> 
> Fair point. It's not that hard to correlate them.

True but the scan_objects callback is only called if we have >batch_size 
objects.

It's possible to accumulate quite some time without calling the callback and 
being able to obtain 
the s_type->name.   So this time all gets associated with just super_cache_scan.



[PATCH] MIPS: ath79: Modify error handling

2016-08-11 Thread Amitoj Kaur Chawla
clk_register_fixed_factor returns an ERR_PTR in case of an error and
should have an IS_ERR check instead of a null check.

The Coccinelle semantic patch used to find this issue is as follows:
@@
expression e;
statement S;
@@

*e = clk_register_fixed_factor(...);
if (!e) S

Signed-off-by: Amitoj Kaur Chawla 
---
 arch/mips/ath79/clock.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/mips/ath79/clock.c b/arch/mips/ath79/clock.c
index 2e73784..cc3a1e3 100644
--- a/arch/mips/ath79/clock.c
+++ b/arch/mips/ath79/clock.c
@@ -96,7 +96,7 @@ static struct clk * __init ath79_reg_ffclk(const char *name,
struct clk *clk;
 
clk = clk_register_fixed_factor(NULL, name, parent_name, 0, mult, div);
-   if (!clk)
+   if (IS_ERR(clk))
panic("failed to allocate %s clock structure", name);
 
return clk;
-- 
1.9.1



[PATCH] MIPS: ath79: Modify error handling

2016-08-11 Thread Amitoj Kaur Chawla
clk_register_fixed_factor returns an ERR_PTR in case of an error and
should have an IS_ERR check instead of a null check.

The Coccinelle semantic patch used to find this issue is as follows:
@@
expression e;
statement S;
@@

*e = clk_register_fixed_factor(...);
if (!e) S

Signed-off-by: Amitoj Kaur Chawla 
---
 arch/mips/ath79/clock.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/mips/ath79/clock.c b/arch/mips/ath79/clock.c
index 2e73784..cc3a1e3 100644
--- a/arch/mips/ath79/clock.c
+++ b/arch/mips/ath79/clock.c
@@ -96,7 +96,7 @@ static struct clk * __init ath79_reg_ffclk(const char *name,
struct clk *clk;
 
clk = clk_register_fixed_factor(NULL, name, parent_name, 0, mult, div);
-   if (!clk)
+   if (IS_ERR(clk))
panic("failed to allocate %s clock structure", name);
 
return clk;
-- 
1.9.1



Re: spin_lock implicit/explicit memory barrier

2016-08-11 Thread Boqun Feng
On Thu, Aug 11, 2016 at 11:31:06AM -0700, Davidlohr Bueso wrote:
> On Thu, 11 Aug 2016, Peter Zijlstra wrote:
> 
> > On Wed, Aug 10, 2016 at 04:29:22PM -0700, Davidlohr Bueso wrote:
> > 
> > > (1) As Manfred suggested, have a patch 1 that fixes the race against 
> > > mainline
> > > with the redundant smp_rmb, then apply a second patch that gets rid of it
> > > for mainline, but only backport the original patch 1 down to 3.12.
> > 
> > I have not followed the thread closely, but this seems like the best
> > option. Esp. since 726328d92a42 ("locking/spinlock, arch: Update and fix
> > spin_unlock_wait() implementations") is incomplete, it relies on at
> > least 6262db7c088b ("powerpc/spinlock: Fix spin_unlock_wait()") to sort
> > PPC.
> 
> Yeah, and we'd also need the arm bits; which reminds me, aren't alpha
> ldl_l/stl_c sequences also exposed to this delaying of the publishing
> when a non-owner peeks at the lock? Right now sysv sem's would be busted
> when doing either is_locked or unlock_wait, shouldn't these be pimped up
> to full smp_mb()s?
> 

You are talking about a similar problem as this one:

http://www.mail-archive.com/linux-kernel@vger.kernel.org/msg1018307.html

right?

The trick of this problem is whether the barrier or operation in
spin_lock() could order the STORE part of the lock-acquire with memory
operations in critical sections.

On PPC, we use lwsync, which doesn't order STORE->LOAD, so there is
problem. On ARM64 and qspinlock in x86, there are similiar reasons.

But if an arch implements its spin_lock() with a full barrier, even
though the atomic is implemented by ll/sc, the STORE part of which can't
be reordered with memory operations in the critcal sections. I think
maybe that's the case for alpha(and also for ARM32).

Regards,
Boqun

> Thanks,
> Davidlohr
> 


signature.asc
Description: PGP signature


Re: spin_lock implicit/explicit memory barrier

2016-08-11 Thread Boqun Feng
On Thu, Aug 11, 2016 at 11:31:06AM -0700, Davidlohr Bueso wrote:
> On Thu, 11 Aug 2016, Peter Zijlstra wrote:
> 
> > On Wed, Aug 10, 2016 at 04:29:22PM -0700, Davidlohr Bueso wrote:
> > 
> > > (1) As Manfred suggested, have a patch 1 that fixes the race against 
> > > mainline
> > > with the redundant smp_rmb, then apply a second patch that gets rid of it
> > > for mainline, but only backport the original patch 1 down to 3.12.
> > 
> > I have not followed the thread closely, but this seems like the best
> > option. Esp. since 726328d92a42 ("locking/spinlock, arch: Update and fix
> > spin_unlock_wait() implementations") is incomplete, it relies on at
> > least 6262db7c088b ("powerpc/spinlock: Fix spin_unlock_wait()") to sort
> > PPC.
> 
> Yeah, and we'd also need the arm bits; which reminds me, aren't alpha
> ldl_l/stl_c sequences also exposed to this delaying of the publishing
> when a non-owner peeks at the lock? Right now sysv sem's would be busted
> when doing either is_locked or unlock_wait, shouldn't these be pimped up
> to full smp_mb()s?
> 

You are talking about a similar problem as this one:

http://www.mail-archive.com/linux-kernel@vger.kernel.org/msg1018307.html

right?

The trick of this problem is whether the barrier or operation in
spin_lock() could order the STORE part of the lock-acquire with memory
operations in critical sections.

On PPC, we use lwsync, which doesn't order STORE->LOAD, so there is
problem. On ARM64 and qspinlock in x86, there are similiar reasons.

But if an arch implements its spin_lock() with a full barrier, even
though the atomic is implemented by ll/sc, the STORE part of which can't
be reordered with memory operations in the critcal sections. I think
maybe that's the case for alpha(and also for ARM32).

Regards,
Boqun

> Thanks,
> Davidlohr
> 


signature.asc
Description: PGP signature


Re: [PATCH v2 0/6] Add support of inverting power control and some minor cleanup

2016-08-11 Thread Shawn Lin

On 2016/8/11 18:08, Jaehoon Chung wrote:

Hi Shawn,

On 08/09/2016 10:49 AM, Shawn Lin wrote:

Hi,

On 2016/8/8 18:24, Jaehoon Chung wrote:

Hi Shawn,

On 08/07/2016 10:33 AM, Shawn Lin wrote:

By default, dw_mmc outputs high level voltage to indicate powering
up the card and outputs low level vcltage to indicate powering
off the card. But that is not always correct. The power io should
be able to control different kind of hw components to supply or
cutoff power to the card. We have boards that need this patchset
to make the power control correct. Meanwhile let's expose it to
DT for board-specific usage.


I have a question for this patch-set. Does DWMMC IP support to invert ON/OFF at 
Power Enable register?


No.


Hmm..Well, if use the DW_MMC_CARD_PWR_INVERT, it should also be the similar 
behavior with Quirks.


yup, it makes the power control more complicated than before. :(



Other flags are related with dwmmc IP. But this flag (DW_MMC_CARD_PWR_INVERT) 
is not related with IP side.
I understood why you needs to add this flag..Is rockchip designed to invert the 
power controlling?


We don't invert the power controlling but our customers do.
The HW componet looks like some discrete LDOs which enable the related
power supply when outputing low voltage from pwren..


Is it related with PWREN register?
I'm checking the TRM for PWREN register..

"Optional feature; ports can be used as general-purpose outputs."

Does it use the ports as GPIO?





But it's not general case. We can discuss about this.


I have a solution which is to add gpio power control for slot-gpio of
mmc core. once finished, we could add pwr_cap_invert just like what we
did for cd/wp invert control..


I'm not sure that it's right that add the gpio power control into mmc core.
slot-gpio? i think it can use the regulator framework with gpio.


yup, I'm looking this and seems it's fine to register a regulator-gpio
and stuff it for vmmc-supply of sdmmc. I will test it, really helpful.

thanks!


I don't have a lot knowledge of power, but it might be controlled whether 
active is low or high.
So i think it doesn't need to add for entire mmc controller.

Best Regards,
Jaehoon Chung



Then we could remove PWREN from the default state of
pinctrl inside the sdmmc dt node, and let dwmmc request gpio power
control stuff after paring the property for pwr_cap_invert..

More over, it well fit for all mmc host's requirement of gpio
power control and inverted control if they want it. :)


That should be legit for us?
If it sounds ok to you and Ulf, I will come up with a RFC one for
community to comment it.:)



Best Regards,
Jaehoon Chung




Changes in v2:
- fix copy-paste err and typo

Shawn Lin (6):
  dt-bindings: rockchip-dw-mshc: add description of
rockchip,power-invert
  mmc: dw_mmc: cleanup power setting of set_ios callback
  mmc: dw_mmc: split out dw_mci_set_power
  mmc: dw_mmc: split out dw_mci_set_power_reg
  mmc: dw_mmc: support inverted power control
  mmc: dw_mmc-rockchip: add parsing of power control from DT

 .../devicetree/bindings/mmc/rockchip-dw-mshc.txt   |   6 +
 drivers/mmc/host/dw_mmc-rockchip.c |   8 ++
 drivers/mmc/host/dw_mmc.c  | 134 -
 drivers/mmc/host/dw_mmc.h  |   1 +
 4 files changed, 90 insertions(+), 59 deletions(-)











--
To unsubscribe from this list: send the line "unsubscribe linux-mmc" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html




--
Best Regards
Shawn Lin



Re: [PATCH v2 0/6] Add support of inverting power control and some minor cleanup

2016-08-11 Thread Shawn Lin

On 2016/8/11 18:08, Jaehoon Chung wrote:

Hi Shawn,

On 08/09/2016 10:49 AM, Shawn Lin wrote:

Hi,

On 2016/8/8 18:24, Jaehoon Chung wrote:

Hi Shawn,

On 08/07/2016 10:33 AM, Shawn Lin wrote:

By default, dw_mmc outputs high level voltage to indicate powering
up the card and outputs low level vcltage to indicate powering
off the card. But that is not always correct. The power io should
be able to control different kind of hw components to supply or
cutoff power to the card. We have boards that need this patchset
to make the power control correct. Meanwhile let's expose it to
DT for board-specific usage.


I have a question for this patch-set. Does DWMMC IP support to invert ON/OFF at 
Power Enable register?


No.


Hmm..Well, if use the DW_MMC_CARD_PWR_INVERT, it should also be the similar 
behavior with Quirks.


yup, it makes the power control more complicated than before. :(



Other flags are related with dwmmc IP. But this flag (DW_MMC_CARD_PWR_INVERT) 
is not related with IP side.
I understood why you needs to add this flag..Is rockchip designed to invert the 
power controlling?


We don't invert the power controlling but our customers do.
The HW componet looks like some discrete LDOs which enable the related
power supply when outputing low voltage from pwren..


Is it related with PWREN register?
I'm checking the TRM for PWREN register..

"Optional feature; ports can be used as general-purpose outputs."

Does it use the ports as GPIO?





But it's not general case. We can discuss about this.


I have a solution which is to add gpio power control for slot-gpio of
mmc core. once finished, we could add pwr_cap_invert just like what we
did for cd/wp invert control..


I'm not sure that it's right that add the gpio power control into mmc core.
slot-gpio? i think it can use the regulator framework with gpio.


yup, I'm looking this and seems it's fine to register a regulator-gpio
and stuff it for vmmc-supply of sdmmc. I will test it, really helpful.

thanks!


I don't have a lot knowledge of power, but it might be controlled whether 
active is low or high.
So i think it doesn't need to add for entire mmc controller.

Best Regards,
Jaehoon Chung



Then we could remove PWREN from the default state of
pinctrl inside the sdmmc dt node, and let dwmmc request gpio power
control stuff after paring the property for pwr_cap_invert..

More over, it well fit for all mmc host's requirement of gpio
power control and inverted control if they want it. :)


That should be legit for us?
If it sounds ok to you and Ulf, I will come up with a RFC one for
community to comment it.:)



Best Regards,
Jaehoon Chung




Changes in v2:
- fix copy-paste err and typo

Shawn Lin (6):
  dt-bindings: rockchip-dw-mshc: add description of
rockchip,power-invert
  mmc: dw_mmc: cleanup power setting of set_ios callback
  mmc: dw_mmc: split out dw_mci_set_power
  mmc: dw_mmc: split out dw_mci_set_power_reg
  mmc: dw_mmc: support inverted power control
  mmc: dw_mmc-rockchip: add parsing of power control from DT

 .../devicetree/bindings/mmc/rockchip-dw-mshc.txt   |   6 +
 drivers/mmc/host/dw_mmc-rockchip.c |   8 ++
 drivers/mmc/host/dw_mmc.c  | 134 -
 drivers/mmc/host/dw_mmc.h  |   1 +
 4 files changed, 90 insertions(+), 59 deletions(-)











--
To unsubscribe from this list: send the line "unsubscribe linux-mmc" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html




--
Best Regards
Shawn Lin



[PATCH 02/18] autofs: Drop unnecessary extern in autofs_i.h

2016-08-11 Thread Ian Kent
From: Tomohiro Kusumi 

autofs4_kill_sb() doesn't need to be declared as extern,
and no other functions in .h are explicitly declared as extern.

Signed-off-by: Tomohiro Kusumi 
Signed-off-by: Ian Kent 
---
 fs/autofs4/autofs_i.h |2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/fs/autofs4/autofs_i.h b/fs/autofs4/autofs_i.h
index a439548..eb87055 100644
--- a/fs/autofs4/autofs_i.h
+++ b/fs/autofs4/autofs_i.h
@@ -271,4 +271,4 @@ static inline void autofs4_del_expiring(struct dentry 
*dentry)
}
 }
 
-extern void autofs4_kill_sb(struct super_block *);
+void autofs4_kill_sb(struct super_block *);



Re: [LKP] [lkp] [xfs] 68a9f5e700: aim7.jobs-per-min -13.6% regression

2016-08-11 Thread Christoph Hellwig
On Fri, Aug 12, 2016 at 12:23:29PM +1000, Dave Chinner wrote:
> Christoph, maybe there's something we can do to only trigger
> speculative prealloc growth checks if the new file size crosses the end of
> the currently allocated block at the EOF. That would chop out a fair
> chunk of the xfs_bmapi_read calls being done in this workload. I'm
> not sure how much effort we should spend optimising this slow path,
> though

I can look at that, but indeed optimizing this patch seems a bit
stupid.  The other thing we could do is to optimize xfs_bmapi_read - even
if it shouldn't be called this often it seems like it should waste a whole
lot less CPU cycles.


[PATCH 02/18] autofs: Drop unnecessary extern in autofs_i.h

2016-08-11 Thread Ian Kent
From: Tomohiro Kusumi 

autofs4_kill_sb() doesn't need to be declared as extern,
and no other functions in .h are explicitly declared as extern.

Signed-off-by: Tomohiro Kusumi 
Signed-off-by: Ian Kent 
---
 fs/autofs4/autofs_i.h |2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/fs/autofs4/autofs_i.h b/fs/autofs4/autofs_i.h
index a439548..eb87055 100644
--- a/fs/autofs4/autofs_i.h
+++ b/fs/autofs4/autofs_i.h
@@ -271,4 +271,4 @@ static inline void autofs4_del_expiring(struct dentry 
*dentry)
}
 }
 
-extern void autofs4_kill_sb(struct super_block *);
+void autofs4_kill_sb(struct super_block *);



Re: [LKP] [lkp] [xfs] 68a9f5e700: aim7.jobs-per-min -13.6% regression

2016-08-11 Thread Christoph Hellwig
On Fri, Aug 12, 2016 at 12:23:29PM +1000, Dave Chinner wrote:
> Christoph, maybe there's something we can do to only trigger
> speculative prealloc growth checks if the new file size crosses the end of
> the currently allocated block at the EOF. That would chop out a fair
> chunk of the xfs_bmapi_read calls being done in this workload. I'm
> not sure how much effort we should spend optimising this slow path,
> though

I can look at that, but indeed optimizing this patch seems a bit
stupid.  The other thing we could do is to optimize xfs_bmapi_read - even
if it shouldn't be called this often it seems like it should waste a whole
lot less CPU cycles.


[PATCH 12/18] autofs: Update struct autofs_dev_ioctl in Documentation

2016-08-11 Thread Ian Kent
From: Tomohiro Kusumi 

Sync with changes made by 730c9eec which introduced an union for
various ioctl commands instead of having statically named arg1,2.

This commit simply replaces arg1,2 with the corresponding fields
without changing semantics.

Signed-off-by: Tomohiro Kusumi 
Signed-off-by: Ian Kent 
---
 .../filesystems/autofs4-mount-control.txt  |   70 
 1 file changed, 42 insertions(+), 28 deletions(-)

diff --git a/Documentation/filesystems/autofs4-mount-control.txt 
b/Documentation/filesystems/autofs4-mount-control.txt
index 540d9a7..50a3e01 100644
--- a/Documentation/filesystems/autofs4-mount-control.txt
+++ b/Documentation/filesystems/autofs4-mount-control.txt
@@ -179,8 +179,19 @@ struct autofs_dev_ioctl {
 * including this struct */
__s32 ioctlfd;  /* automount command fd */
 
-   __u32 arg1; /* Command parameters */
-   __u32 arg2;
+   union {
+   struct args_protoverprotover;
+   struct args_protosubver protosubver;
+   struct args_openmount   openmount;
+   struct args_ready   ready;
+   struct args_failfail;
+   struct args_setpipefd   setpipefd;
+   struct args_timeout timeout;
+   struct args_requester   requester;
+   struct args_expire  expire;
+   struct args_askumount   askumount;
+   struct args_ismountpointismountpoint;
+   };
 
char path[0];
 };
@@ -192,8 +203,8 @@ optionally be used to check a specific mount corresponding 
to a given
 mount point file descriptor, and when requesting the uid and gid of the
 last successful mount on a directory within the autofs file system.
 
-The fields arg1 and arg2 are used to communicate parameters and results of
-calls made as described below.
+The union is used to communicate parameters and results of calls made
+as described below.
 
 The path field is used to pass a path where it is needed and the size field
 is used account for the increased structure length when translating the
@@ -245,9 +256,9 @@ AUTOFS_DEV_IOCTL_PROTOVER_CMD and 
AUTOFS_DEV_IOCTL_PROTOSUBVER_CMD
 Get the major and minor version of the autofs4 protocol version understood
 by loaded module. This call requires an initialized struct autofs_dev_ioctl
 with the ioctlfd field set to a valid autofs mount point descriptor
-and sets the requested version number in structure field arg1. These
-commands return 0 on success or one of the negative error codes if
-validation fails.
+and sets the requested version number in version field of struct args_protover
+or sub_version field of struct args_protosubver. These commands return
+0 on success or one of the negative error codes if validation fails.
 
 
 AUTOFS_DEV_IOCTL_OPENMOUNT and AUTOFS_DEV_IOCTL_CLOSEMOUNT
@@ -256,9 +267,9 @@ AUTOFS_DEV_IOCTL_OPENMOUNT and AUTOFS_DEV_IOCTL_CLOSEMOUNT
 Obtain and release a file descriptor for an autofs managed mount point
 path. The open call requires an initialized struct autofs_dev_ioctl with
 the path field set and the size field adjusted appropriately as well
-as the arg1 field set to the device number of the autofs mount. The
-device number can be obtained from the mount options shown in
-/proc/mounts. The close call requires an initialized struct
+as the devid field of struct args_openmount set to the device number of
+the autofs mount. The device number can be obtained from the mount options
+shown in /proc/mounts. The close call requires an initialized struct
 autofs_dev_ioct with the ioctlfd field set to the descriptor obtained
 from the open call. The release of the file descriptor can also be done
 with close(2) so any open descriptors will also be closed at process exit.
@@ -272,10 +283,10 @@ AUTOFS_DEV_IOCTL_READY_CMD and AUTOFS_DEV_IOCTL_FAIL_CMD
 Return mount and expire result status from user space to the kernel.
 Both of these calls require an initialized struct autofs_dev_ioctl
 with the ioctlfd field set to the descriptor obtained from the open
-call and the arg1 field set to the wait queue token number, received
-by user space in the foregoing mount or expire request. The arg2 field
-is set to the status to be returned. For the ready call this is always
-0 and for the fail call it is set to the errno of the operation.
+call and the token field of struct args_ready or struct args_fail set
+to the wait queue token number, received by user space in the foregoing
+mount or expire request. The status field of struct args_fail is set to
+the errno of the operation. It is set to 0 on success.
 
 
 AUTOFS_DEV_IOCTL_SETPIPEFD_CMD
@@ -290,9 +301,10 @@ mount be catatonic (see next call).
 
 The call requires an initialized struct autofs_dev_ioctl with the
 ioctlfd 

[PATCH 09/18] autofs: Don't fail to free_dev_ioctl(param)

2016-08-11 Thread Ian Kent
From: Tomohiro Kusumi 

Returning -ENOTTY here fails to free dynamically allocated param.

Signed-off-by: Tomohiro Kusumi 
Signed-off-by: Ian Kent 
---
 fs/autofs4/dev-ioctl.c |3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/fs/autofs4/dev-ioctl.c b/fs/autofs4/dev-ioctl.c
index c7fcc74..d47b35a 100644
--- a/fs/autofs4/dev-ioctl.c
+++ b/fs/autofs4/dev-ioctl.c
@@ -662,7 +662,8 @@ static int _autofs_dev_ioctl(unsigned int command,
fn = lookup_dev_ioctl(cmd);
if (!fn) {
pr_warn("unknown command 0x%08x\n", command);
-   return -ENOTTY;
+   err = -ENOTTY;
+   goto out;
}
 
fp = NULL;



[PATCH 12/18] autofs: Update struct autofs_dev_ioctl in Documentation

2016-08-11 Thread Ian Kent
From: Tomohiro Kusumi 

Sync with changes made by 730c9eec which introduced an union for
various ioctl commands instead of having statically named arg1,2.

This commit simply replaces arg1,2 with the corresponding fields
without changing semantics.

Signed-off-by: Tomohiro Kusumi 
Signed-off-by: Ian Kent 
---
 .../filesystems/autofs4-mount-control.txt  |   70 
 1 file changed, 42 insertions(+), 28 deletions(-)

diff --git a/Documentation/filesystems/autofs4-mount-control.txt 
b/Documentation/filesystems/autofs4-mount-control.txt
index 540d9a7..50a3e01 100644
--- a/Documentation/filesystems/autofs4-mount-control.txt
+++ b/Documentation/filesystems/autofs4-mount-control.txt
@@ -179,8 +179,19 @@ struct autofs_dev_ioctl {
 * including this struct */
__s32 ioctlfd;  /* automount command fd */
 
-   __u32 arg1; /* Command parameters */
-   __u32 arg2;
+   union {
+   struct args_protoverprotover;
+   struct args_protosubver protosubver;
+   struct args_openmount   openmount;
+   struct args_ready   ready;
+   struct args_failfail;
+   struct args_setpipefd   setpipefd;
+   struct args_timeout timeout;
+   struct args_requester   requester;
+   struct args_expire  expire;
+   struct args_askumount   askumount;
+   struct args_ismountpointismountpoint;
+   };
 
char path[0];
 };
@@ -192,8 +203,8 @@ optionally be used to check a specific mount corresponding 
to a given
 mount point file descriptor, and when requesting the uid and gid of the
 last successful mount on a directory within the autofs file system.
 
-The fields arg1 and arg2 are used to communicate parameters and results of
-calls made as described below.
+The union is used to communicate parameters and results of calls made
+as described below.
 
 The path field is used to pass a path where it is needed and the size field
 is used account for the increased structure length when translating the
@@ -245,9 +256,9 @@ AUTOFS_DEV_IOCTL_PROTOVER_CMD and 
AUTOFS_DEV_IOCTL_PROTOSUBVER_CMD
 Get the major and minor version of the autofs4 protocol version understood
 by loaded module. This call requires an initialized struct autofs_dev_ioctl
 with the ioctlfd field set to a valid autofs mount point descriptor
-and sets the requested version number in structure field arg1. These
-commands return 0 on success or one of the negative error codes if
-validation fails.
+and sets the requested version number in version field of struct args_protover
+or sub_version field of struct args_protosubver. These commands return
+0 on success or one of the negative error codes if validation fails.
 
 
 AUTOFS_DEV_IOCTL_OPENMOUNT and AUTOFS_DEV_IOCTL_CLOSEMOUNT
@@ -256,9 +267,9 @@ AUTOFS_DEV_IOCTL_OPENMOUNT and AUTOFS_DEV_IOCTL_CLOSEMOUNT
 Obtain and release a file descriptor for an autofs managed mount point
 path. The open call requires an initialized struct autofs_dev_ioctl with
 the path field set and the size field adjusted appropriately as well
-as the arg1 field set to the device number of the autofs mount. The
-device number can be obtained from the mount options shown in
-/proc/mounts. The close call requires an initialized struct
+as the devid field of struct args_openmount set to the device number of
+the autofs mount. The device number can be obtained from the mount options
+shown in /proc/mounts. The close call requires an initialized struct
 autofs_dev_ioct with the ioctlfd field set to the descriptor obtained
 from the open call. The release of the file descriptor can also be done
 with close(2) so any open descriptors will also be closed at process exit.
@@ -272,10 +283,10 @@ AUTOFS_DEV_IOCTL_READY_CMD and AUTOFS_DEV_IOCTL_FAIL_CMD
 Return mount and expire result status from user space to the kernel.
 Both of these calls require an initialized struct autofs_dev_ioctl
 with the ioctlfd field set to the descriptor obtained from the open
-call and the arg1 field set to the wait queue token number, received
-by user space in the foregoing mount or expire request. The arg2 field
-is set to the status to be returned. For the ready call this is always
-0 and for the fail call it is set to the errno of the operation.
+call and the token field of struct args_ready or struct args_fail set
+to the wait queue token number, received by user space in the foregoing
+mount or expire request. The status field of struct args_fail is set to
+the errno of the operation. It is set to 0 on success.
 
 
 AUTOFS_DEV_IOCTL_SETPIPEFD_CMD
@@ -290,9 +301,10 @@ mount be catatonic (see next call).
 
 The call requires an initialized struct autofs_dev_ioctl with the
 ioctlfd field set to the descriptor obtained from the open call and
-the arg1 

[PATCH 09/18] autofs: Don't fail to free_dev_ioctl(param)

2016-08-11 Thread Ian Kent
From: Tomohiro Kusumi 

Returning -ENOTTY here fails to free dynamically allocated param.

Signed-off-by: Tomohiro Kusumi 
Signed-off-by: Ian Kent 
---
 fs/autofs4/dev-ioctl.c |3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/fs/autofs4/dev-ioctl.c b/fs/autofs4/dev-ioctl.c
index c7fcc74..d47b35a 100644
--- a/fs/autofs4/dev-ioctl.c
+++ b/fs/autofs4/dev-ioctl.c
@@ -662,7 +662,8 @@ static int _autofs_dev_ioctl(unsigned int command,
fn = lookup_dev_ioctl(cmd);
if (!fn) {
pr_warn("unknown command 0x%08x\n", command);
-   return -ENOTTY;
+   err = -ENOTTY;
+   goto out;
}
 
fp = NULL;



[PATCH 11/18] autofs: Fix Documentation regarding devid on ioctl

2016-08-11 Thread Ian Kent
From: Tomohiro Kusumi 

The explanation on how ioctl handles devid seems incorrect.
Userspace who calls this ioctl has no input regarding devid,
and ioctl implementation retrieves devid via superblock.

Signed-off-by: Tomohiro Kusumi 
Signed-off-by: Ian Kent 
---
 .../filesystems/autofs4-mount-control.txt  |5 ++---
 1 file changed, 2 insertions(+), 3 deletions(-)

diff --git a/Documentation/filesystems/autofs4-mount-control.txt 
b/Documentation/filesystems/autofs4-mount-control.txt
index aff2211..540d9a7 100644
--- a/Documentation/filesystems/autofs4-mount-control.txt
+++ b/Documentation/filesystems/autofs4-mount-control.txt
@@ -323,9 +323,8 @@ mount on the given path dentry.
 
 The call requires an initialized struct autofs_dev_ioctl with the path
 field set to the mount point in question and the size field adjusted
-appropriately as well as the arg1 field set to the device number of the
-containing autofs mount. Upon return the struct field arg1 contains the
-uid and arg2 the gid.
+appropriately. Upon return the struct field arg1 contains the uid and
+arg2 the gid.
 
 When reconstructing an autofs mount tree with active mounts we need to
 re-connect to mounts that may have used the original process uid and



[PATCH 15/18] autofs: Add autofs_dev_ioctl_version() for AUTOFS_DEV_IOCTL_VERSION_CMD

2016-08-11 Thread Ian Kent
From: Tomohiro Kusumi 

No functional changes, based on the following justification.

1. Make the code more consistent using the ioctl vector _ioctls[],
   rather than assigning NULL only for this ioctl command.
2. Remove goto done; for better maintainability in the long run.
3. The existing code is based on the fact that validate_dev_ioctl()
   sets ioctl version for any command, but AUTOFS_DEV_IOCTL_VERSION_CMD
   should explicitly set it regardless of the default behavior.

Signed-off-by: Tomohiro Kusumi 
Signed-off-by: Ian Kent 
---
 fs/autofs4/dev-ioctl.c |   25 +
 1 file changed, 17 insertions(+), 8 deletions(-)

diff --git a/fs/autofs4/dev-ioctl.c b/fs/autofs4/dev-ioctl.c
index 13e7517..7289216 100644
--- a/fs/autofs4/dev-ioctl.c
+++ b/fs/autofs4/dev-ioctl.c
@@ -172,6 +172,17 @@ static struct autofs_sb_info *autofs_dev_ioctl_sbi(struct 
file *f)
return sbi;
 }
 
+/* Return autofs dev ioctl version */
+static int autofs_dev_ioctl_version(struct file *fp,
+   struct autofs_sb_info *sbi,
+   struct autofs_dev_ioctl *param)
+{
+   /* This should have already been set. */
+   param->ver_major = AUTOFS_DEV_IOCTL_VERSION_MAJOR;
+   param->ver_minor = AUTOFS_DEV_IOCTL_VERSION_MINOR;
+   return 0;
+}
+
 /* Return autofs module protocol version */
 static int autofs_dev_ioctl_protover(struct file *fp,
 struct autofs_sb_info *sbi,
@@ -590,7 +601,8 @@ static ioctl_fn lookup_dev_ioctl(unsigned int cmd)
int cmd;
ioctl_fn fn;
} _ioctls[] = {
-   {cmd_idx(AUTOFS_DEV_IOCTL_VERSION_CMD), NULL},
+   {cmd_idx(AUTOFS_DEV_IOCTL_VERSION_CMD),
+autofs_dev_ioctl_version},
{cmd_idx(AUTOFS_DEV_IOCTL_PROTOVER_CMD),
 autofs_dev_ioctl_protover},
{cmd_idx(AUTOFS_DEV_IOCTL_PROTOSUBVER_CMD),
@@ -655,10 +667,6 @@ static int _autofs_dev_ioctl(unsigned int command,
if (err)
goto out;
 
-   /* The validate routine above always sets the version */
-   if (cmd == AUTOFS_DEV_IOCTL_VERSION_CMD)
-   goto done;
-
fn = lookup_dev_ioctl(cmd);
if (!fn) {
pr_warn("unknown command 0x%08x\n", command);
@@ -672,9 +680,11 @@ static int _autofs_dev_ioctl(unsigned int command,
/*
 * For obvious reasons the openmount can't have a file
 * descriptor yet. We don't take a reference to the
-* file during close to allow for immediate release.
+* file during close to allow for immediate release,
+* and the same for retrieving ioctl version.
 */
-   if (cmd != AUTOFS_DEV_IOCTL_OPENMOUNT_CMD &&
+   if (cmd != AUTOFS_DEV_IOCTL_VERSION_CMD &&
+   cmd != AUTOFS_DEV_IOCTL_OPENMOUNT_CMD &&
cmd != AUTOFS_DEV_IOCTL_CLOSEMOUNT_CMD) {
fp = fget(param->ioctlfd);
if (!fp) {
@@ -707,7 +717,6 @@ cont:
 
if (fp)
fput(fp);
-done:
if (err >= 0 && copy_to_user(user, param, AUTOFS_DEV_IOCTL_SIZE))
err = -EFAULT;
 out:



[PATCH 10/18] autofs: Remove AUTOFS_DEVID_LEN

2016-08-11 Thread Ian Kent
From: Tomohiro Kusumi 

This macro was never used by neither kernel nor userspace,
and also doesn't represent "devid length" in bytes.
(unless it was added to mean something else).

Signed-off-by: Tomohiro Kusumi 
Signed-off-by: Ian Kent 
---
 include/linux/auto_dev-ioctl.h |2 --
 1 file changed, 2 deletions(-)

diff --git a/include/linux/auto_dev-ioctl.h b/include/linux/auto_dev-ioctl.h
index 7caaf29..bf82e3a 100644
--- a/include/linux/auto_dev-ioctl.h
+++ b/include/linux/auto_dev-ioctl.h
@@ -18,8 +18,6 @@
 #define AUTOFS_DEV_IOCTL_VERSION_MAJOR 1
 #define AUTOFS_DEV_IOCTL_VERSION_MINOR 0
 
-#define AUTOFS_DEVID_LEN   16
-
 #define AUTOFS_DEV_IOCTL_SIZE  sizeof(struct autofs_dev_ioctl)
 
 /*



[PATCH 13/18] autofs: Fix pr_debug() message

2016-08-11 Thread Ian Kent
From: Tomohiro Kusumi 

This isn't a return value, so change the message to indicate
the status is the result of may_umount().

(or locate pr_debug() after put_user() with the same message)

Signed-off-by: Tomohiro Kusumi 
Signed-off-by: Ian Kent 
---
 fs/autofs4/root.c |2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/fs/autofs4/root.c b/fs/autofs4/root.c
index 1b0495a..d25c55f 100644
--- a/fs/autofs4/root.c
+++ b/fs/autofs4/root.c
@@ -840,7 +840,7 @@ static inline int autofs4_ask_umount(struct vfsmount *mnt, 
int __user *p)
if (may_umount(mnt))
status = 1;
 
-   pr_debug("returning %d\n", status);
+   pr_debug("may umount %d\n", status);
 
status = put_user(status, p);
 



[PATCH 18/18] autofs4 - move linux/auto_dev-ioctl.h to uapi/linux

2016-08-11 Thread Ian Kent
Since linux/auto_dev-ioctl.h wasn't included in include/linux/Kbuild
it wasn't moved to uapi/linux as part of the uapi series.

Signed-off-by: Ian Kent 
---
 include/linux/auto_dev-ioctl.h  |  209 -
 include/uapi/linux/auto_dev-ioctl.h |  221 +++
 2 files changed, 222 insertions(+), 208 deletions(-)
 create mode 100644 include/uapi/linux/auto_dev-ioctl.h

diff --git a/include/linux/auto_dev-ioctl.h b/include/linux/auto_dev-ioctl.h
index bf82e3a..28c1505 100644
--- a/include/linux/auto_dev-ioctl.h
+++ b/include/linux/auto_dev-ioctl.h
@@ -10,212 +10,5 @@
 #ifndef _LINUX_AUTO_DEV_IOCTL_H
 #define _LINUX_AUTO_DEV_IOCTL_H
 
-#include 
-#include 
-
-#define AUTOFS_DEVICE_NAME "autofs"
-
-#define AUTOFS_DEV_IOCTL_VERSION_MAJOR 1
-#define AUTOFS_DEV_IOCTL_VERSION_MINOR 0
-
-#define AUTOFS_DEV_IOCTL_SIZE  sizeof(struct autofs_dev_ioctl)
-
-/*
- * An ioctl interface for autofs mount point control.
- */
-
-struct args_protover {
-   __u32   version;
-};
-
-struct args_protosubver {
-   __u32   sub_version;
-};
-
-struct args_openmount {
-   __u32   devid;
-};
-
-struct args_ready {
-   __u32   token;
-};
-
-struct args_fail {
-   __u32   token;
-   __s32   status;
-};
-
-struct args_setpipefd {
-   __s32   pipefd;
-};
-
-struct args_timeout {
-   __u64   timeout;
-};
-
-struct args_requester {
-   __u32   uid;
-   __u32   gid;
-};
-
-struct args_expire {
-   __u32   how;
-};
-
-struct args_askumount {
-   __u32   may_umount;
-};
-
-struct args_ismountpoint {
-   union {
-   struct args_in {
-   __u32   type;
-   } in;
-   struct args_out {
-   __u32   devid;
-   __u32   magic;
-   } out;
-   };
-};
-
-/*
- * All the ioctls use this structure.
- * When sending a path size must account for the total length
- * of the chunk of memory otherwise is is the size of the
- * structure.
- */
-
-struct autofs_dev_ioctl {
-   __u32 ver_major;
-   __u32 ver_minor;
-   __u32 size; /* total size of data passed in
-* including this struct */
-   __s32 ioctlfd;  /* automount command fd */
-
-   /* Command parameters */
-
-   union {
-   struct args_protoverprotover;
-   struct args_protosubver protosubver;
-   struct args_openmount   openmount;
-   struct args_ready   ready;
-   struct args_failfail;
-   struct args_setpipefd   setpipefd;
-   struct args_timeout timeout;
-   struct args_requester   requester;
-   struct args_expire  expire;
-   struct args_askumount   askumount;
-   struct args_ismountpointismountpoint;
-   };
-
-   char path[0];
-};
-
-static inline void init_autofs_dev_ioctl(struct autofs_dev_ioctl *in)
-{
-   memset(in, 0, sizeof(struct autofs_dev_ioctl));
-   in->ver_major = AUTOFS_DEV_IOCTL_VERSION_MAJOR;
-   in->ver_minor = AUTOFS_DEV_IOCTL_VERSION_MINOR;
-   in->size = sizeof(struct autofs_dev_ioctl);
-   in->ioctlfd = -1;
-}
-
-/*
- * If you change this make sure you make the corresponding change
- * to autofs-dev-ioctl.c:lookup_ioctl()
- */
-enum {
-   /* Get various version info */
-   AUTOFS_DEV_IOCTL_VERSION_CMD = 0x71,
-   AUTOFS_DEV_IOCTL_PROTOVER_CMD,
-   AUTOFS_DEV_IOCTL_PROTOSUBVER_CMD,
-
-   /* Open mount ioctl fd */
-   AUTOFS_DEV_IOCTL_OPENMOUNT_CMD,
-
-   /* Close mount ioctl fd */
-   AUTOFS_DEV_IOCTL_CLOSEMOUNT_CMD,
-
-   /* Mount/expire status returns */
-   AUTOFS_DEV_IOCTL_READY_CMD,
-   AUTOFS_DEV_IOCTL_FAIL_CMD,
-
-   /* Activate/deactivate autofs mount */
-   AUTOFS_DEV_IOCTL_SETPIPEFD_CMD,
-   AUTOFS_DEV_IOCTL_CATATONIC_CMD,
-
-   /* Expiry timeout */
-   AUTOFS_DEV_IOCTL_TIMEOUT_CMD,
-
-   /* Get mount last requesting uid and gid */
-   AUTOFS_DEV_IOCTL_REQUESTER_CMD,
-
-   /* Check for eligible expire candidates */
-   AUTOFS_DEV_IOCTL_EXPIRE_CMD,
-
-   /* Request busy status */
-   AUTOFS_DEV_IOCTL_ASKUMOUNT_CMD,
-
-   /* Check if path is a mountpoint */
-   AUTOFS_DEV_IOCTL_ISMOUNTPOINT_CMD,
-};
-
-#define AUTOFS_IOCTL 0x93
-
-#define AUTOFS_DEV_IOCTL_VERSION \
-   _IOWR(AUTOFS_IOCTL, \
- AUTOFS_DEV_IOCTL_VERSION_CMD, struct autofs_dev_ioctl)
-
-#define AUTOFS_DEV_IOCTL_PROTOVER \
-   _IOWR(AUTOFS_IOCTL, \
- AUTOFS_DEV_IOCTL_PROTOVER_CMD, struct autofs_dev_ioctl)
-
-#define AUTOFS_DEV_IOCTL_PROTOSUBVER \
-   _IOWR(AUTOFS_IOCTL, \
- AUTOFS_DEV_IOCTL_PROTOSUBVER_CMD, struct autofs_dev_ioctl)
-
-#define AUTOFS_DEV_IOCTL_OPENMOUNT \
-  

[PATCH 11/18] autofs: Fix Documentation regarding devid on ioctl

2016-08-11 Thread Ian Kent
From: Tomohiro Kusumi 

The explanation on how ioctl handles devid seems incorrect.
Userspace who calls this ioctl has no input regarding devid,
and ioctl implementation retrieves devid via superblock.

Signed-off-by: Tomohiro Kusumi 
Signed-off-by: Ian Kent 
---
 .../filesystems/autofs4-mount-control.txt  |5 ++---
 1 file changed, 2 insertions(+), 3 deletions(-)

diff --git a/Documentation/filesystems/autofs4-mount-control.txt 
b/Documentation/filesystems/autofs4-mount-control.txt
index aff2211..540d9a7 100644
--- a/Documentation/filesystems/autofs4-mount-control.txt
+++ b/Documentation/filesystems/autofs4-mount-control.txt
@@ -323,9 +323,8 @@ mount on the given path dentry.
 
 The call requires an initialized struct autofs_dev_ioctl with the path
 field set to the mount point in question and the size field adjusted
-appropriately as well as the arg1 field set to the device number of the
-containing autofs mount. Upon return the struct field arg1 contains the
-uid and arg2 the gid.
+appropriately. Upon return the struct field arg1 contains the uid and
+arg2 the gid.
 
 When reconstructing an autofs mount tree with active mounts we need to
 re-connect to mounts that may have used the original process uid and



[PATCH 15/18] autofs: Add autofs_dev_ioctl_version() for AUTOFS_DEV_IOCTL_VERSION_CMD

2016-08-11 Thread Ian Kent
From: Tomohiro Kusumi 

No functional changes, based on the following justification.

1. Make the code more consistent using the ioctl vector _ioctls[],
   rather than assigning NULL only for this ioctl command.
2. Remove goto done; for better maintainability in the long run.
3. The existing code is based on the fact that validate_dev_ioctl()
   sets ioctl version for any command, but AUTOFS_DEV_IOCTL_VERSION_CMD
   should explicitly set it regardless of the default behavior.

Signed-off-by: Tomohiro Kusumi 
Signed-off-by: Ian Kent 
---
 fs/autofs4/dev-ioctl.c |   25 +
 1 file changed, 17 insertions(+), 8 deletions(-)

diff --git a/fs/autofs4/dev-ioctl.c b/fs/autofs4/dev-ioctl.c
index 13e7517..7289216 100644
--- a/fs/autofs4/dev-ioctl.c
+++ b/fs/autofs4/dev-ioctl.c
@@ -172,6 +172,17 @@ static struct autofs_sb_info *autofs_dev_ioctl_sbi(struct 
file *f)
return sbi;
 }
 
+/* Return autofs dev ioctl version */
+static int autofs_dev_ioctl_version(struct file *fp,
+   struct autofs_sb_info *sbi,
+   struct autofs_dev_ioctl *param)
+{
+   /* This should have already been set. */
+   param->ver_major = AUTOFS_DEV_IOCTL_VERSION_MAJOR;
+   param->ver_minor = AUTOFS_DEV_IOCTL_VERSION_MINOR;
+   return 0;
+}
+
 /* Return autofs module protocol version */
 static int autofs_dev_ioctl_protover(struct file *fp,
 struct autofs_sb_info *sbi,
@@ -590,7 +601,8 @@ static ioctl_fn lookup_dev_ioctl(unsigned int cmd)
int cmd;
ioctl_fn fn;
} _ioctls[] = {
-   {cmd_idx(AUTOFS_DEV_IOCTL_VERSION_CMD), NULL},
+   {cmd_idx(AUTOFS_DEV_IOCTL_VERSION_CMD),
+autofs_dev_ioctl_version},
{cmd_idx(AUTOFS_DEV_IOCTL_PROTOVER_CMD),
 autofs_dev_ioctl_protover},
{cmd_idx(AUTOFS_DEV_IOCTL_PROTOSUBVER_CMD),
@@ -655,10 +667,6 @@ static int _autofs_dev_ioctl(unsigned int command,
if (err)
goto out;
 
-   /* The validate routine above always sets the version */
-   if (cmd == AUTOFS_DEV_IOCTL_VERSION_CMD)
-   goto done;
-
fn = lookup_dev_ioctl(cmd);
if (!fn) {
pr_warn("unknown command 0x%08x\n", command);
@@ -672,9 +680,11 @@ static int _autofs_dev_ioctl(unsigned int command,
/*
 * For obvious reasons the openmount can't have a file
 * descriptor yet. We don't take a reference to the
-* file during close to allow for immediate release.
+* file during close to allow for immediate release,
+* and the same for retrieving ioctl version.
 */
-   if (cmd != AUTOFS_DEV_IOCTL_OPENMOUNT_CMD &&
+   if (cmd != AUTOFS_DEV_IOCTL_VERSION_CMD &&
+   cmd != AUTOFS_DEV_IOCTL_OPENMOUNT_CMD &&
cmd != AUTOFS_DEV_IOCTL_CLOSEMOUNT_CMD) {
fp = fget(param->ioctlfd);
if (!fp) {
@@ -707,7 +717,6 @@ cont:
 
if (fp)
fput(fp);
-done:
if (err >= 0 && copy_to_user(user, param, AUTOFS_DEV_IOCTL_SIZE))
err = -EFAULT;
 out:



[PATCH 10/18] autofs: Remove AUTOFS_DEVID_LEN

2016-08-11 Thread Ian Kent
From: Tomohiro Kusumi 

This macro was never used by neither kernel nor userspace,
and also doesn't represent "devid length" in bytes.
(unless it was added to mean something else).

Signed-off-by: Tomohiro Kusumi 
Signed-off-by: Ian Kent 
---
 include/linux/auto_dev-ioctl.h |2 --
 1 file changed, 2 deletions(-)

diff --git a/include/linux/auto_dev-ioctl.h b/include/linux/auto_dev-ioctl.h
index 7caaf29..bf82e3a 100644
--- a/include/linux/auto_dev-ioctl.h
+++ b/include/linux/auto_dev-ioctl.h
@@ -18,8 +18,6 @@
 #define AUTOFS_DEV_IOCTL_VERSION_MAJOR 1
 #define AUTOFS_DEV_IOCTL_VERSION_MINOR 0
 
-#define AUTOFS_DEVID_LEN   16
-
 #define AUTOFS_DEV_IOCTL_SIZE  sizeof(struct autofs_dev_ioctl)
 
 /*



[PATCH 13/18] autofs: Fix pr_debug() message

2016-08-11 Thread Ian Kent
From: Tomohiro Kusumi 

This isn't a return value, so change the message to indicate
the status is the result of may_umount().

(or locate pr_debug() after put_user() with the same message)

Signed-off-by: Tomohiro Kusumi 
Signed-off-by: Ian Kent 
---
 fs/autofs4/root.c |2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/fs/autofs4/root.c b/fs/autofs4/root.c
index 1b0495a..d25c55f 100644
--- a/fs/autofs4/root.c
+++ b/fs/autofs4/root.c
@@ -840,7 +840,7 @@ static inline int autofs4_ask_umount(struct vfsmount *mnt, 
int __user *p)
if (may_umount(mnt))
status = 1;
 
-   pr_debug("returning %d\n", status);
+   pr_debug("may umount %d\n", status);
 
status = put_user(status, p);
 



[PATCH 18/18] autofs4 - move linux/auto_dev-ioctl.h to uapi/linux

2016-08-11 Thread Ian Kent
Since linux/auto_dev-ioctl.h wasn't included in include/linux/Kbuild
it wasn't moved to uapi/linux as part of the uapi series.

Signed-off-by: Ian Kent 
---
 include/linux/auto_dev-ioctl.h  |  209 -
 include/uapi/linux/auto_dev-ioctl.h |  221 +++
 2 files changed, 222 insertions(+), 208 deletions(-)
 create mode 100644 include/uapi/linux/auto_dev-ioctl.h

diff --git a/include/linux/auto_dev-ioctl.h b/include/linux/auto_dev-ioctl.h
index bf82e3a..28c1505 100644
--- a/include/linux/auto_dev-ioctl.h
+++ b/include/linux/auto_dev-ioctl.h
@@ -10,212 +10,5 @@
 #ifndef _LINUX_AUTO_DEV_IOCTL_H
 #define _LINUX_AUTO_DEV_IOCTL_H
 
-#include 
-#include 
-
-#define AUTOFS_DEVICE_NAME "autofs"
-
-#define AUTOFS_DEV_IOCTL_VERSION_MAJOR 1
-#define AUTOFS_DEV_IOCTL_VERSION_MINOR 0
-
-#define AUTOFS_DEV_IOCTL_SIZE  sizeof(struct autofs_dev_ioctl)
-
-/*
- * An ioctl interface for autofs mount point control.
- */
-
-struct args_protover {
-   __u32   version;
-};
-
-struct args_protosubver {
-   __u32   sub_version;
-};
-
-struct args_openmount {
-   __u32   devid;
-};
-
-struct args_ready {
-   __u32   token;
-};
-
-struct args_fail {
-   __u32   token;
-   __s32   status;
-};
-
-struct args_setpipefd {
-   __s32   pipefd;
-};
-
-struct args_timeout {
-   __u64   timeout;
-};
-
-struct args_requester {
-   __u32   uid;
-   __u32   gid;
-};
-
-struct args_expire {
-   __u32   how;
-};
-
-struct args_askumount {
-   __u32   may_umount;
-};
-
-struct args_ismountpoint {
-   union {
-   struct args_in {
-   __u32   type;
-   } in;
-   struct args_out {
-   __u32   devid;
-   __u32   magic;
-   } out;
-   };
-};
-
-/*
- * All the ioctls use this structure.
- * When sending a path size must account for the total length
- * of the chunk of memory otherwise is is the size of the
- * structure.
- */
-
-struct autofs_dev_ioctl {
-   __u32 ver_major;
-   __u32 ver_minor;
-   __u32 size; /* total size of data passed in
-* including this struct */
-   __s32 ioctlfd;  /* automount command fd */
-
-   /* Command parameters */
-
-   union {
-   struct args_protoverprotover;
-   struct args_protosubver protosubver;
-   struct args_openmount   openmount;
-   struct args_ready   ready;
-   struct args_failfail;
-   struct args_setpipefd   setpipefd;
-   struct args_timeout timeout;
-   struct args_requester   requester;
-   struct args_expire  expire;
-   struct args_askumount   askumount;
-   struct args_ismountpointismountpoint;
-   };
-
-   char path[0];
-};
-
-static inline void init_autofs_dev_ioctl(struct autofs_dev_ioctl *in)
-{
-   memset(in, 0, sizeof(struct autofs_dev_ioctl));
-   in->ver_major = AUTOFS_DEV_IOCTL_VERSION_MAJOR;
-   in->ver_minor = AUTOFS_DEV_IOCTL_VERSION_MINOR;
-   in->size = sizeof(struct autofs_dev_ioctl);
-   in->ioctlfd = -1;
-}
-
-/*
- * If you change this make sure you make the corresponding change
- * to autofs-dev-ioctl.c:lookup_ioctl()
- */
-enum {
-   /* Get various version info */
-   AUTOFS_DEV_IOCTL_VERSION_CMD = 0x71,
-   AUTOFS_DEV_IOCTL_PROTOVER_CMD,
-   AUTOFS_DEV_IOCTL_PROTOSUBVER_CMD,
-
-   /* Open mount ioctl fd */
-   AUTOFS_DEV_IOCTL_OPENMOUNT_CMD,
-
-   /* Close mount ioctl fd */
-   AUTOFS_DEV_IOCTL_CLOSEMOUNT_CMD,
-
-   /* Mount/expire status returns */
-   AUTOFS_DEV_IOCTL_READY_CMD,
-   AUTOFS_DEV_IOCTL_FAIL_CMD,
-
-   /* Activate/deactivate autofs mount */
-   AUTOFS_DEV_IOCTL_SETPIPEFD_CMD,
-   AUTOFS_DEV_IOCTL_CATATONIC_CMD,
-
-   /* Expiry timeout */
-   AUTOFS_DEV_IOCTL_TIMEOUT_CMD,
-
-   /* Get mount last requesting uid and gid */
-   AUTOFS_DEV_IOCTL_REQUESTER_CMD,
-
-   /* Check for eligible expire candidates */
-   AUTOFS_DEV_IOCTL_EXPIRE_CMD,
-
-   /* Request busy status */
-   AUTOFS_DEV_IOCTL_ASKUMOUNT_CMD,
-
-   /* Check if path is a mountpoint */
-   AUTOFS_DEV_IOCTL_ISMOUNTPOINT_CMD,
-};
-
-#define AUTOFS_IOCTL 0x93
-
-#define AUTOFS_DEV_IOCTL_VERSION \
-   _IOWR(AUTOFS_IOCTL, \
- AUTOFS_DEV_IOCTL_VERSION_CMD, struct autofs_dev_ioctl)
-
-#define AUTOFS_DEV_IOCTL_PROTOVER \
-   _IOWR(AUTOFS_IOCTL, \
- AUTOFS_DEV_IOCTL_PROTOVER_CMD, struct autofs_dev_ioctl)
-
-#define AUTOFS_DEV_IOCTL_PROTOSUBVER \
-   _IOWR(AUTOFS_IOCTL, \
- AUTOFS_DEV_IOCTL_PROTOSUBVER_CMD, struct autofs_dev_ioctl)
-
-#define AUTOFS_DEV_IOCTL_OPENMOUNT \
-   

[PATCH 17/18] autofs: Move inclusion of linux/limits.h to uapi

2016-08-11 Thread Ian Kent
From: Tomohiro Kusumi 

linux/limits.h should be included by uapi instead of linux/auto_fs.h
so as not to cause compile error in userspace.

 # cat << EOF > ./test1.c
 > #include 
 > #include 
 > int main(void) {
 > return 0;
 > }
 > EOF
 # gcc -Wall -g ./test1.c
 In file included from ./test1.c:2:0:
 /usr/include/linux/auto_fs.h:54:12: error: 'NAME_MAX' undeclared here (not in 
a function)
   char name[NAME_MAX+1];
 ^

Signed-off-by: Tomohiro Kusumi 
Signed-off-by: Ian Kent 
---
 include/linux/auto_fs.h  |1 -
 include/uapi/linux/auto_fs.h |1 +
 2 files changed, 1 insertion(+), 1 deletion(-)

diff --git a/include/linux/auto_fs.h b/include/linux/auto_fs.h
index b4066bb..b8f814c 100644
--- a/include/linux/auto_fs.h
+++ b/include/linux/auto_fs.h
@@ -10,7 +10,6 @@
 #define _LINUX_AUTO_FS_H
 
 #include 
-#include 
 #include 
 #include 
 #endif /* _LINUX_AUTO_FS_H */
diff --git a/include/uapi/linux/auto_fs.h b/include/uapi/linux/auto_fs.h
index 9175a1b..1bfc3ed 100644
--- a/include/uapi/linux/auto_fs.h
+++ b/include/uapi/linux/auto_fs.h
@@ -12,6 +12,7 @@
 #define _UAPI_LINUX_AUTO_FS_H
 
 #include 
+#include 
 #ifndef __KERNEL__
 #include 
 #endif /* __KERNEL__ */



[PATCH 16/18] autofs: Fix print format for ioctl warning message

2016-08-11 Thread Ian Kent
From: Tomohiro Kusumi 

All other warnings use "cmd(0x%08x)" and this is the only one with "cmd(%d)".
(below comes from my userspace debug program, but not automount daemon)

[ 1139.905676] autofs4:pid:1640:check_dev_ioctl_version: ioctl control 
interface version mismatch: kernel(1.0), user(0.0), cmd(-1072131215)

Signed-off-by: Tomohiro Kusumi 
Signed-off-by: Ian Kent 
---
 fs/autofs4/dev-ioctl.c |2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/fs/autofs4/dev-ioctl.c b/fs/autofs4/dev-ioctl.c
index 7289216..e89d6bb 100644
--- a/fs/autofs4/dev-ioctl.c
+++ b/fs/autofs4/dev-ioctl.c
@@ -75,7 +75,7 @@ static int check_dev_ioctl_version(int cmd, struct 
autofs_dev_ioctl *param)
if ((param->ver_major != AUTOFS_DEV_IOCTL_VERSION_MAJOR) ||
(param->ver_minor > AUTOFS_DEV_IOCTL_VERSION_MINOR)) {
pr_warn("ioctl control interface version mismatch: "
-   "kernel(%u.%u), user(%u.%u), cmd(%d)\n",
+   "kernel(%u.%u), user(%u.%u), cmd(0x%08x)\n",
AUTOFS_DEV_IOCTL_VERSION_MAJOR,
AUTOFS_DEV_IOCTL_VERSION_MINOR,
param->ver_major, param->ver_minor, cmd);



[PATCH 14/18] autofs - fix dev ioctl number range check

2016-08-11 Thread Ian Kent
The count of miscellaneous device ioctls in fs/autofs4/autofs_i.h is wrong.

The number of ioctls is the difference between AUTOFS_DEV_IOCTL_VERSION_CMD
and AUTOFS_DEV_IOCTL_ISMOUNTPOINT_CMD (14) not the difference between
AUTOFS_IOC_COUNT and 11 (21).

Signed-off-by: Ian Kent 
Cc: Tomohiro Kusumi 
---
 fs/autofs4/autofs_i.h  |3 ++-
 fs/autofs4/dev-ioctl.c |2 +-
 2 files changed, 3 insertions(+), 2 deletions(-)

diff --git a/fs/autofs4/autofs_i.h b/fs/autofs4/autofs_i.h
index 73e3d38..dd71bd4 100644
--- a/fs/autofs4/autofs_i.h
+++ b/fs/autofs4/autofs_i.h
@@ -20,7 +20,8 @@
 #define AUTOFS_IOC_COUNT 32
 
 #define AUTOFS_DEV_IOCTL_IOC_FIRST (AUTOFS_DEV_IOCTL_VERSION)
-#define AUTOFS_DEV_IOCTL_IOC_COUNT (AUTOFS_IOC_COUNT - 11)
+#define AUTOFS_DEV_IOCTL_IOC_COUNT \
+   (AUTOFS_DEV_IOCTL_VERSION_CMD - AUTOFS_DEV_IOCTL_ISMOUNTPOINT_CMD)
 
 #include 
 #include 
diff --git a/fs/autofs4/dev-ioctl.c b/fs/autofs4/dev-ioctl.c
index d47b35a..13e7517 100644
--- a/fs/autofs4/dev-ioctl.c
+++ b/fs/autofs4/dev-ioctl.c
@@ -642,7 +642,7 @@ static int _autofs_dev_ioctl(unsigned int command,
cmd = _IOC_NR(command);
 
if (_IOC_TYPE(command) != _IOC_TYPE(AUTOFS_DEV_IOCTL_IOC_FIRST) ||
-   cmd - cmd_first >= AUTOFS_DEV_IOCTL_IOC_COUNT) {
+   cmd - cmd_first > AUTOFS_DEV_IOCTL_IOC_COUNT) {
return -ENOTTY;
}
 



[PATCH 17/18] autofs: Move inclusion of linux/limits.h to uapi

2016-08-11 Thread Ian Kent
From: Tomohiro Kusumi 

linux/limits.h should be included by uapi instead of linux/auto_fs.h
so as not to cause compile error in userspace.

 # cat << EOF > ./test1.c
 > #include 
 > #include 
 > int main(void) {
 > return 0;
 > }
 > EOF
 # gcc -Wall -g ./test1.c
 In file included from ./test1.c:2:0:
 /usr/include/linux/auto_fs.h:54:12: error: 'NAME_MAX' undeclared here (not in 
a function)
   char name[NAME_MAX+1];
 ^

Signed-off-by: Tomohiro Kusumi 
Signed-off-by: Ian Kent 
---
 include/linux/auto_fs.h  |1 -
 include/uapi/linux/auto_fs.h |1 +
 2 files changed, 1 insertion(+), 1 deletion(-)

diff --git a/include/linux/auto_fs.h b/include/linux/auto_fs.h
index b4066bb..b8f814c 100644
--- a/include/linux/auto_fs.h
+++ b/include/linux/auto_fs.h
@@ -10,7 +10,6 @@
 #define _LINUX_AUTO_FS_H
 
 #include 
-#include 
 #include 
 #include 
 #endif /* _LINUX_AUTO_FS_H */
diff --git a/include/uapi/linux/auto_fs.h b/include/uapi/linux/auto_fs.h
index 9175a1b..1bfc3ed 100644
--- a/include/uapi/linux/auto_fs.h
+++ b/include/uapi/linux/auto_fs.h
@@ -12,6 +12,7 @@
 #define _UAPI_LINUX_AUTO_FS_H
 
 #include 
+#include 
 #ifndef __KERNEL__
 #include 
 #endif /* __KERNEL__ */



[PATCH 16/18] autofs: Fix print format for ioctl warning message

2016-08-11 Thread Ian Kent
From: Tomohiro Kusumi 

All other warnings use "cmd(0x%08x)" and this is the only one with "cmd(%d)".
(below comes from my userspace debug program, but not automount daemon)

[ 1139.905676] autofs4:pid:1640:check_dev_ioctl_version: ioctl control 
interface version mismatch: kernel(1.0), user(0.0), cmd(-1072131215)

Signed-off-by: Tomohiro Kusumi 
Signed-off-by: Ian Kent 
---
 fs/autofs4/dev-ioctl.c |2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/fs/autofs4/dev-ioctl.c b/fs/autofs4/dev-ioctl.c
index 7289216..e89d6bb 100644
--- a/fs/autofs4/dev-ioctl.c
+++ b/fs/autofs4/dev-ioctl.c
@@ -75,7 +75,7 @@ static int check_dev_ioctl_version(int cmd, struct 
autofs_dev_ioctl *param)
if ((param->ver_major != AUTOFS_DEV_IOCTL_VERSION_MAJOR) ||
(param->ver_minor > AUTOFS_DEV_IOCTL_VERSION_MINOR)) {
pr_warn("ioctl control interface version mismatch: "
-   "kernel(%u.%u), user(%u.%u), cmd(%d)\n",
+   "kernel(%u.%u), user(%u.%u), cmd(0x%08x)\n",
AUTOFS_DEV_IOCTL_VERSION_MAJOR,
AUTOFS_DEV_IOCTL_VERSION_MINOR,
param->ver_major, param->ver_minor, cmd);



[PATCH 14/18] autofs - fix dev ioctl number range check

2016-08-11 Thread Ian Kent
The count of miscellaneous device ioctls in fs/autofs4/autofs_i.h is wrong.

The number of ioctls is the difference between AUTOFS_DEV_IOCTL_VERSION_CMD
and AUTOFS_DEV_IOCTL_ISMOUNTPOINT_CMD (14) not the difference between
AUTOFS_IOC_COUNT and 11 (21).

Signed-off-by: Ian Kent 
Cc: Tomohiro Kusumi 
---
 fs/autofs4/autofs_i.h  |3 ++-
 fs/autofs4/dev-ioctl.c |2 +-
 2 files changed, 3 insertions(+), 2 deletions(-)

diff --git a/fs/autofs4/autofs_i.h b/fs/autofs4/autofs_i.h
index 73e3d38..dd71bd4 100644
--- a/fs/autofs4/autofs_i.h
+++ b/fs/autofs4/autofs_i.h
@@ -20,7 +20,8 @@
 #define AUTOFS_IOC_COUNT 32
 
 #define AUTOFS_DEV_IOCTL_IOC_FIRST (AUTOFS_DEV_IOCTL_VERSION)
-#define AUTOFS_DEV_IOCTL_IOC_COUNT (AUTOFS_IOC_COUNT - 11)
+#define AUTOFS_DEV_IOCTL_IOC_COUNT \
+   (AUTOFS_DEV_IOCTL_VERSION_CMD - AUTOFS_DEV_IOCTL_ISMOUNTPOINT_CMD)
 
 #include 
 #include 
diff --git a/fs/autofs4/dev-ioctl.c b/fs/autofs4/dev-ioctl.c
index d47b35a..13e7517 100644
--- a/fs/autofs4/dev-ioctl.c
+++ b/fs/autofs4/dev-ioctl.c
@@ -642,7 +642,7 @@ static int _autofs_dev_ioctl(unsigned int command,
cmd = _IOC_NR(command);
 
if (_IOC_TYPE(command) != _IOC_TYPE(AUTOFS_DEV_IOCTL_IOC_FIRST) ||
-   cmd - cmd_first >= AUTOFS_DEV_IOCTL_IOC_COUNT) {
+   cmd - cmd_first > AUTOFS_DEV_IOCTL_IOC_COUNT) {
return -ENOTTY;
}
 



[PATCH 01/18] autofs: Fix typos in Documentation/filesystems/autofs4.txt

2016-08-11 Thread Ian Kent
From: Tomohiro Kusumi 

plus minor whitespace fixes.

Signed-off-by: Tomohiro Kusumi 
Signed-off-by: Ian Kent 
---
 Documentation/filesystems/autofs4.txt |8 
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/Documentation/filesystems/autofs4.txt 
b/Documentation/filesystems/autofs4.txt
index 39d02e1..8fac3fe 100644
--- a/Documentation/filesystems/autofs4.txt
+++ b/Documentation/filesystems/autofs4.txt
@@ -203,9 +203,9 @@ initiated or is being considered, otherwise it returns 0.
 Mountpoint expiry
 -
 
-The VFS has a mechansim for automatically expiring unused mounts,
+The VFS has a mechanism for automatically expiring unused mounts,
 much as it can expire any unused dentry information from the dcache.
-This is guided by the MNT_SHRINKABLE flag.  This  only applies to
+This is guided by the MNT_SHRINKABLE flag.  This only applies to
 mounts that were created by `d_automount()` returning a filesystem to be
 mounted.  As autofs doesn't return such a filesystem but leaves the
 mounting to the automount daemon, it must involve the automount daemon
@@ -298,7 +298,7 @@ remove directories and symlinks using normal filesystem 
operations.
 autofs knows whether a process requesting some operation is the daemon
 or not based on its process-group id number (see getpgid(1)).
 
-When an autofs filesystem it mounted the pgid of the mounting
+When an autofs filesystem is mounted the pgid of the mounting
 processes is recorded unless the "pgrp=" option is given, in which
 case that number is recorded instead.  Any request arriving from a
 process in that process group is considered to come from the daemon.
@@ -450,7 +450,7 @@ Commands are:
 numbers for existing filesystems can be found in
 `/proc/self/mountinfo`.
 - **AUTOFS_DEV_IOCTL_CLOSEMOUNT_CMD**: same as `close(ioctlfd)`.
-- **AUTOFS_DEV_IOCTL_SETPIPEFD_CMD**: if the  filesystem is in
+- **AUTOFS_DEV_IOCTL_SETPIPEFD_CMD**: if the filesystem is in
 catatonic mode, this can provide the write end of a new pipe
 in `arg1` to re-establish communication with a daemon.  The
 process group of the calling process is used to identify the



[PATCH 06/18] autofs - remove ino free in autofs4_dir_symlink()

2016-08-11 Thread Ian Kent
The inode allocation failure case in autofs4_dir_symlink() frees
the autofs dentry info of the dentry without setting ->d_fsdata to
NULL.

That could lead to a double free so just get rid of the free and
leave it to ->d_release().

Signed-off-by: Ian Kent 
Cc: Tomohiro Kusumi 
---
 fs/autofs4/root.c |2 --
 1 file changed, 2 deletions(-)

diff --git a/fs/autofs4/root.c b/fs/autofs4/root.c
index fa84bb8..1b0495a 100644
--- a/fs/autofs4/root.c
+++ b/fs/autofs4/root.c
@@ -577,8 +577,6 @@ static int autofs4_dir_symlink(struct inode *dir,
inode = autofs4_get_inode(dir->i_sb, S_IFLNK | 0555);
if (!inode) {
kfree(cp);
-   if (!dentry->d_fsdata)
-   kfree(ino);
return -ENOMEM;
}
inode->i_private = cp;



  1   2   3   4   5   6   7   8   9   10   >