Re: [PATCH] x86/x2apic: fix x2apic_cluster_probe null pointer
2016-08-12 13:30 GMT+08:00 Huaitong Han: > percpu cpus_in_cluster is a pointer with CONFIG_CPUMASK_OFFSTACK, > cpu_in_cluster > is not alloced memory before x2apic_prepare_cpu is invoked, so cpumask_set_cpu > would get a NULL pointer bug: > BUG: unable to handle kernel NULL pointer dereference at (null) > IP: [] x2apic_cluster_probe+0x35/0x70 Could you try this? https://lkml.org/lkml/2016/8/4/345 Regards, Wanpeng Li
Re: [PATCH v1] x86/power/64: Restore processor state before using per-cpu variables
On Thu, Aug 11, 2016 at 02:49:29PM -0700, Thomas Garnier wrote: > Restore the processor state before calling any other function to ensure > per-cpu variables can be used with KASLR memory randomization. > > Tracing functions use per-cpu variables (gs based) and one was called > just before restoring the processor state fully. It resulted in a double > fault when both the tracing & the exception handler functions tried to > use a per-cpu variable. > > Signed-off-by: Thomas Garnier> --- > Based on next-20160808 Ok, I believe before I test this, I need to apply another patch from Rafael. I think it is the "Always create temporary identity mapping correctly" thing. Yes, no? Rafael, can you please apply everything on a test branch for us to run? Thanks. -- Regards/Gruss, Boris. ECO tip #101: Trim your mails when you reply. SUSE Linux GmbH, GF: Felix Imendörffer, Jane Smithard, Graham Norton, HRB 21284 (AG Nürnberg) --
Re: [PATCH] x86/x2apic: fix x2apic_cluster_probe null pointer
2016-08-12 13:30 GMT+08:00 Huaitong Han : > percpu cpus_in_cluster is a pointer with CONFIG_CPUMASK_OFFSTACK, > cpu_in_cluster > is not alloced memory before x2apic_prepare_cpu is invoked, so cpumask_set_cpu > would get a NULL pointer bug: > BUG: unable to handle kernel NULL pointer dereference at (null) > IP: [] x2apic_cluster_probe+0x35/0x70 Could you try this? https://lkml.org/lkml/2016/8/4/345 Regards, Wanpeng Li
Re: [PATCH v1] x86/power/64: Restore processor state before using per-cpu variables
On Thu, Aug 11, 2016 at 02:49:29PM -0700, Thomas Garnier wrote: > Restore the processor state before calling any other function to ensure > per-cpu variables can be used with KASLR memory randomization. > > Tracing functions use per-cpu variables (gs based) and one was called > just before restoring the processor state fully. It resulted in a double > fault when both the tracing & the exception handler functions tried to > use a per-cpu variable. > > Signed-off-by: Thomas Garnier > --- > Based on next-20160808 Ok, I believe before I test this, I need to apply another patch from Rafael. I think it is the "Always create temporary identity mapping correctly" thing. Yes, no? Rafael, can you please apply everything on a test branch for us to run? Thanks. -- Regards/Gruss, Boris. ECO tip #101: Trim your mails when you reply. SUSE Linux GmbH, GF: Felix Imendörffer, Jane Smithard, Graham Norton, HRB 21284 (AG Nürnberg) --
Re: [PATCH] uprobes/x86: fix rip-relative handling of EVEX-encoded insns.
* Denys Vlasenko[2016-08-11 17:45:21]: > Since instruction decoder now supports EVEX-encoded insns, two fixes are > needed > to correctly handle them in uprobes. Could you please add the commit that helps support the insns in the Changelog. Rest all looks okay. Looks good to me. Acked-by: Srikar Dronamraju -- Thanks and Regards Srikar Dronamraju
Re: [PATCH] uprobes/x86: fix rip-relative handling of EVEX-encoded insns.
* Denys Vlasenko [2016-08-11 17:45:21]: > Since instruction decoder now supports EVEX-encoded insns, two fixes are > needed > to correctly handle them in uprobes. Could you please add the commit that helps support the insns in the Changelog. Rest all looks okay. Looks good to me. Acked-by: Srikar Dronamraju -- Thanks and Regards Srikar Dronamraju
[PATCH] x86/x2apic: fix x2apic_cluster_probe null pointer
percpu cpus_in_cluster is a pointer with CONFIG_CPUMASK_OFFSTACK, cpu_in_cluster is not alloced memory before x2apic_prepare_cpu is invoked, so cpumask_set_cpu would get a NULL pointer bug: BUG: unable to handle kernel NULL pointer dereference at (null) IP: [] x2apic_cluster_probe+0x35/0x70 Signed-off-by: Huaitong Han--- arch/x86/kernel/apic/x2apic_cluster.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/arch/x86/kernel/apic/x2apic_cluster.c b/arch/x86/kernel/apic/x2apic_cluster.c index 6368fa6..6acb055 100644 --- a/arch/x86/kernel/apic/x2apic_cluster.c +++ b/arch/x86/kernel/apic/x2apic_cluster.c @@ -190,9 +190,10 @@ static int x2apic_cluster_probe(void) if (!x2apic_mode) return 0; - cpumask_set_cpu(cpu, per_cpu(cpus_in_cluster, cpu)); cpuhp_setup_state(CPUHP_X2APIC_PREPARE, "X2APIC_PREPARE", x2apic_prepare_cpu, x2apic_dead_cpu); + cpumask_set_cpu(cpu, per_cpu(cpus_in_cluster, cpu)); + return 1; } -- 1.8.3.1
[PATCH] x86/x2apic: fix x2apic_cluster_probe null pointer
percpu cpus_in_cluster is a pointer with CONFIG_CPUMASK_OFFSTACK, cpu_in_cluster is not alloced memory before x2apic_prepare_cpu is invoked, so cpumask_set_cpu would get a NULL pointer bug: BUG: unable to handle kernel NULL pointer dereference at (null) IP: [] x2apic_cluster_probe+0x35/0x70 Signed-off-by: Huaitong Han --- arch/x86/kernel/apic/x2apic_cluster.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/arch/x86/kernel/apic/x2apic_cluster.c b/arch/x86/kernel/apic/x2apic_cluster.c index 6368fa6..6acb055 100644 --- a/arch/x86/kernel/apic/x2apic_cluster.c +++ b/arch/x86/kernel/apic/x2apic_cluster.c @@ -190,9 +190,10 @@ static int x2apic_cluster_probe(void) if (!x2apic_mode) return 0; - cpumask_set_cpu(cpu, per_cpu(cpus_in_cluster, cpu)); cpuhp_setup_state(CPUHP_X2APIC_PREPARE, "X2APIC_PREPARE", x2apic_prepare_cpu, x2apic_dead_cpu); + cpumask_set_cpu(cpu, per_cpu(cpus_in_cluster, cpu)); + return 1; } -- 1.8.3.1
Re: [RFC PATCH v7 7/7] Restartable sequences: self-tests
On Fri, Aug 12, 2016 at 03:10:38AM +, Mathieu Desnoyers wrote: > - On Aug 11, 2016, at 9:28 PM, Boqun Feng boqun.f...@gmail.com wrote: > > > On Thu, Aug 11, 2016 at 11:26:30PM +, Mathieu Desnoyers wrote: > >> - On Jul 24, 2016, at 2:01 PM, Dave Watson davejwat...@fb.com wrote: > >> > >> >>> +static inline __attribute__((always_inline)) > >> >>> +bool rseq_finish(struct rseq_lock *rlock, > >> >>> + intptr_t *p, intptr_t to_write, > >> >>> + struct rseq_state start_value) > >> > > >> >>> This ABI looks like it will work fine for our use case. I don't think > >> >>> it > >> >>> has been mentioned yet, but we may still need multiple asm blocks > >> >>> for differing numbers of writes. For example, an array-based freelist > >> >>> push: > >> > > >> >>> void push(void *obj) { > >> >>> if (index < maxlen) { > >> >>> freelist[index++] = obj; > >> >>> } > >> >>> } > >> > > >> >>> would be more efficiently implemented with a two-write rseq_finish: > >> > > >> >>> rseq_finish2([index], obj, // first write > >> >>> , index + 1, // second write > >> >>> ...); > >> > > >> >> Would pairing one rseq_start with two rseq_finish do the trick > >> >> there ? > >> > > >> > Yes, two rseq_finish works, as long as the extra rseq management overhead > >> > is not substantial. > >> > >> I've added a commit implementing rseq_finish2() in my rseq volatile > >> dev branch. You can fetch it at: > >> > >> https://github.com/compudj/linux-percpu-dev/tree/rseq-fallback > >> > >> I also have a separate test and benchmark tree in addition to the > >> kernel selftests here: > >> > >> https://github.com/compudj/rseq-test > >> > >> I named the first write a "speculative" write, and the second write > >> the "final" write. > >> > > > > Maybe I miss something subtle, but if the first write is only a > > "speculative" write, why can't we put it in the rseq critical section > > rather than asm block? Like this: > > > > do_rseq(..., result, targetptr, newval > > { > > newval = index; > > targetptr = > > if (newval < maxlen) > > freelist[newval++] = obj; > > else > > result = false; > > } > > > > No extra rseq_finish() is needed here, but maybe a little more > > "speculative" writes? > > This won't work unfortunately. The speculative stores need to be > between the rseq_event_counter comparison instruction in the rseq_finish > asm sequence and the final store. The ip fixup is really needed for > correctness of speculative stores. The sequence number scheme only works > for loads. > > Putting it in the C code between rseq_start and rseq_finish would lead > to races such as: > > thread Athread B > rseq_start > > > rseq_start > freelist[offset + 1] = obj > rseq_finish >offset++ > > > freelist[newval + 1] = obj <--- corrupts the list content. > Ah, right! We couldn't do any "global"(real global or percpu) update in the rseq critical section(code between rseq_start and rseq_finish), because without an ip fixup, we cannot abort the critical section immediately, we have to compare the event_counter in rseq_finish, but that's too late for speculates stores. > > > > Besides, do we allow userspace programs do read-only access to the > > memory objects modified by do_rseq(). If so, we have a problem when > > there are two writes in a do_rseq()(either in the rseq critical section > > or in the asm block), because in current implemetation, these two writes > > are unordered, which makes the readers outside a do_rseq() could observe > > the ordering of writes differently. > > > > For rseq_finish2(), a simple solution would be making the "final" write > > a RELEASE. > > Indeed, we would need a release semantic for the final store here if this > is the common use. Or we could duplicate the "flavors" of rseq_finish2 and > add a rseq_finish2_release. We should find a way to eliminate code duplication I'm in favor of a separate rseq_finish2_release(). > there. I suspect we'll end up doing macros. > Me too. Lemme have a try ;-) Regards, Boqun > Thanks, > > Mathieu > > > > > Regards, > > Boqun > > > >> Thanks, > >> > >> Mathieu > >> > >> -- > >> Mathieu Desnoyers > >> EfficiOS Inc. > > > http://www.efficios.com > > -- > Mathieu Desnoyers > EfficiOS Inc. > http://www.efficios.com signature.asc Description: PGP signature
Re: [RFC PATCH v7 7/7] Restartable sequences: self-tests
On Fri, Aug 12, 2016 at 03:10:38AM +, Mathieu Desnoyers wrote: > - On Aug 11, 2016, at 9:28 PM, Boqun Feng boqun.f...@gmail.com wrote: > > > On Thu, Aug 11, 2016 at 11:26:30PM +, Mathieu Desnoyers wrote: > >> - On Jul 24, 2016, at 2:01 PM, Dave Watson davejwat...@fb.com wrote: > >> > >> >>> +static inline __attribute__((always_inline)) > >> >>> +bool rseq_finish(struct rseq_lock *rlock, > >> >>> + intptr_t *p, intptr_t to_write, > >> >>> + struct rseq_state start_value) > >> > > >> >>> This ABI looks like it will work fine for our use case. I don't think > >> >>> it > >> >>> has been mentioned yet, but we may still need multiple asm blocks > >> >>> for differing numbers of writes. For example, an array-based freelist > >> >>> push: > >> > > >> >>> void push(void *obj) { > >> >>> if (index < maxlen) { > >> >>> freelist[index++] = obj; > >> >>> } > >> >>> } > >> > > >> >>> would be more efficiently implemented with a two-write rseq_finish: > >> > > >> >>> rseq_finish2([index], obj, // first write > >> >>> , index + 1, // second write > >> >>> ...); > >> > > >> >> Would pairing one rseq_start with two rseq_finish do the trick > >> >> there ? > >> > > >> > Yes, two rseq_finish works, as long as the extra rseq management overhead > >> > is not substantial. > >> > >> I've added a commit implementing rseq_finish2() in my rseq volatile > >> dev branch. You can fetch it at: > >> > >> https://github.com/compudj/linux-percpu-dev/tree/rseq-fallback > >> > >> I also have a separate test and benchmark tree in addition to the > >> kernel selftests here: > >> > >> https://github.com/compudj/rseq-test > >> > >> I named the first write a "speculative" write, and the second write > >> the "final" write. > >> > > > > Maybe I miss something subtle, but if the first write is only a > > "speculative" write, why can't we put it in the rseq critical section > > rather than asm block? Like this: > > > > do_rseq(..., result, targetptr, newval > > { > > newval = index; > > targetptr = > > if (newval < maxlen) > > freelist[newval++] = obj; > > else > > result = false; > > } > > > > No extra rseq_finish() is needed here, but maybe a little more > > "speculative" writes? > > This won't work unfortunately. The speculative stores need to be > between the rseq_event_counter comparison instruction in the rseq_finish > asm sequence and the final store. The ip fixup is really needed for > correctness of speculative stores. The sequence number scheme only works > for loads. > > Putting it in the C code between rseq_start and rseq_finish would lead > to races such as: > > thread Athread B > rseq_start > > > rseq_start > freelist[offset + 1] = obj > rseq_finish >offset++ > > > freelist[newval + 1] = obj <--- corrupts the list content. > Ah, right! We couldn't do any "global"(real global or percpu) update in the rseq critical section(code between rseq_start and rseq_finish), because without an ip fixup, we cannot abort the critical section immediately, we have to compare the event_counter in rseq_finish, but that's too late for speculates stores. > > > > Besides, do we allow userspace programs do read-only access to the > > memory objects modified by do_rseq(). If so, we have a problem when > > there are two writes in a do_rseq()(either in the rseq critical section > > or in the asm block), because in current implemetation, these two writes > > are unordered, which makes the readers outside a do_rseq() could observe > > the ordering of writes differently. > > > > For rseq_finish2(), a simple solution would be making the "final" write > > a RELEASE. > > Indeed, we would need a release semantic for the final store here if this > is the common use. Or we could duplicate the "flavors" of rseq_finish2 and > add a rseq_finish2_release. We should find a way to eliminate code duplication I'm in favor of a separate rseq_finish2_release(). > there. I suspect we'll end up doing macros. > Me too. Lemme have a try ;-) Regards, Boqun > Thanks, > > Mathieu > > > > > Regards, > > Boqun > > > >> Thanks, > >> > >> Mathieu > >> > >> -- > >> Mathieu Desnoyers > >> EfficiOS Inc. > > > http://www.efficios.com > > -- > Mathieu Desnoyers > EfficiOS Inc. > http://www.efficios.com signature.asc Description: PGP signature
Re: [PATCH 1/5] arm/defconfigs: Remove CONFIG_IPV6_PRIVACY
On Thu, Aug 11, 2016 at 11:51:38PM +0200, Arnd Bergmann wrote: > That patch looks good, sorry I missed it. I see that Vladimir already > did the lpc32xx portion of it, can you send the part for s3c24xx > to the exynos maintainers? I see you've added them to CC, thanks Arnd. Here it is: --- From: Borislav PetkovDate: Fri, 12 Aug 2016 07:18:20 +0200 Subject: [PATCH] arm/s3c2410_defconfig: Remove CONFIG_IPV6_PRIVACY Option is long gone, see 5d9efa7ee99e ("ipv6: Remove privacy config option.") Signed-off-by: Borislav Petkov Cc: Kukjin Kim Cc: Krzysztof Kozlowski --- arch/arm/configs/s3c2410_defconfig | 1 - 1 file changed, 1 deletion(-) diff --git a/arch/arm/configs/s3c2410_defconfig b/arch/arm/configs/s3c2410_defconfig index b3ade552a2a5..bc4bfe02e611 100644 --- a/arch/arm/configs/s3c2410_defconfig +++ b/arch/arm/configs/s3c2410_defconfig @@ -76,7 +76,6 @@ CONFIG_TCP_CONG_LP=m CONFIG_TCP_CONG_VENO=m CONFIG_TCP_CONG_YEAH=m CONFIG_TCP_CONG_ILLINOIS=m -CONFIG_IPV6_PRIVACY=y CONFIG_IPV6_ROUTER_PREF=y CONFIG_INET6_AH=m CONFIG_INET6_ESP=m -- 2.8.4 > If you have larger chunks like CONFIG_INET_LRO removal, we can take > that directly through arm-soc as a single patch for all defconfigs. I'll leave that to someone who wants to get their feet wet with patching the kernel - should be a simple task. :-) Thanks! -- Regards/Gruss, Boris. ECO tip #101: Trim your mails when you reply. --
Re: [PATCH 1/5] arm/defconfigs: Remove CONFIG_IPV6_PRIVACY
On Thu, Aug 11, 2016 at 11:51:38PM +0200, Arnd Bergmann wrote: > That patch looks good, sorry I missed it. I see that Vladimir already > did the lpc32xx portion of it, can you send the part for s3c24xx > to the exynos maintainers? I see you've added them to CC, thanks Arnd. Here it is: --- From: Borislav Petkov Date: Fri, 12 Aug 2016 07:18:20 +0200 Subject: [PATCH] arm/s3c2410_defconfig: Remove CONFIG_IPV6_PRIVACY Option is long gone, see 5d9efa7ee99e ("ipv6: Remove privacy config option.") Signed-off-by: Borislav Petkov Cc: Kukjin Kim Cc: Krzysztof Kozlowski --- arch/arm/configs/s3c2410_defconfig | 1 - 1 file changed, 1 deletion(-) diff --git a/arch/arm/configs/s3c2410_defconfig b/arch/arm/configs/s3c2410_defconfig index b3ade552a2a5..bc4bfe02e611 100644 --- a/arch/arm/configs/s3c2410_defconfig +++ b/arch/arm/configs/s3c2410_defconfig @@ -76,7 +76,6 @@ CONFIG_TCP_CONG_LP=m CONFIG_TCP_CONG_VENO=m CONFIG_TCP_CONG_YEAH=m CONFIG_TCP_CONG_ILLINOIS=m -CONFIG_IPV6_PRIVACY=y CONFIG_IPV6_ROUTER_PREF=y CONFIG_INET6_AH=m CONFIG_INET6_ESP=m -- 2.8.4 > If you have larger chunks like CONFIG_INET_LRO removal, we can take > that directly through arm-soc as a single patch for all defconfigs. I'll leave that to someone who wants to get their feet wet with patching the kernel - should be a simple task. :-) Thanks! -- Regards/Gruss, Boris. ECO tip #101: Trim your mails when you reply. --
Re: [PATCHv2 3/4] pci: Determine actual VPD size on first access
On Thu, 2016-08-11 at 14:17 -0600, Alex Williamson wrote: > > vfio isn't playing nanny here for the fun of it, part of the reason we > have vpd access functions is because folks have discovered that vpd > registers between PCI functions on multi-function devices may be > shared. So pounding on vpd registers for function 1 may adversely > > affect someone else reading from a different function. A lot more than that may be shared... the bus for example. lots of devices have backdoors into their own BARs, one partition can make its BARs overlap the neighbour function, ... dang ! In the end, PCI assignment at a granularity smaller than a bus cannot be done in a fully air tight way and shouldn't be relied upon in environemnt where the isolation between partitions matters. The only exception here is SR-IOV of course since it's designed specifically so that the VFs can't be messed with. > This is a case > where I feel vfio needs to step in because if that's a user competing > with the host or two users stepping on each other, well that's exactly > what vfio tries to prevent. A driver in userspace or a VM driver can't > very well determine these sorts of interactions when it only has > visibility to a subset of the functions and users and hardware folks > would throw a fit if I extended iommu groups to encompass all the > related devices rather than take the relatively simple step of > virtualizing these accesses and occasionally quirking devices that are > extra broken, as seems to be required here. Thanks, We may want some kind of "strict" vs. "relaxed" model here to differenciate the desktop user wanting to give a function to his/her windows partition and doesn't care about strict isolation vs. the cloud data center. Cheers, Ben.
Re: [PATCHv2 3/4] pci: Determine actual VPD size on first access
On Thu, 2016-08-11 at 14:17 -0600, Alex Williamson wrote: > > vfio isn't playing nanny here for the fun of it, part of the reason we > have vpd access functions is because folks have discovered that vpd > registers between PCI functions on multi-function devices may be > shared. So pounding on vpd registers for function 1 may adversely > > affect someone else reading from a different function. A lot more than that may be shared... the bus for example. lots of devices have backdoors into their own BARs, one partition can make its BARs overlap the neighbour function, ... dang ! In the end, PCI assignment at a granularity smaller than a bus cannot be done in a fully air tight way and shouldn't be relied upon in environemnt where the isolation between partitions matters. The only exception here is SR-IOV of course since it's designed specifically so that the VFs can't be messed with. > This is a case > where I feel vfio needs to step in because if that's a user competing > with the host or two users stepping on each other, well that's exactly > what vfio tries to prevent. A driver in userspace or a VM driver can't > very well determine these sorts of interactions when it only has > visibility to a subset of the functions and users and hardware folks > would throw a fit if I extended iommu groups to encompass all the > related devices rather than take the relatively simple step of > virtualizing these accesses and occasionally quirking devices that are > extra broken, as seems to be required here. Thanks, We may want some kind of "strict" vs. "relaxed" model here to differenciate the desktop user wanting to give a function to his/her windows partition and doesn't care about strict isolation vs. the cloud data center. Cheers, Ben.
Re: [LKP] [lkp] [xfs] 68a9f5e700: aim7.jobs-per-min -13.6% regression
On Thu, Aug 11, 2016 at 9:16 PM, Dave Chinnerwrote: > > That's why running aim7 as your "does the filesystem scale" > benchmark is somewhat irrelevant to scaling applications on high > performance systems these days Yes, don't get me wrong - I'm not at all trying to say that AIM7 is a good benchmark. It's just that I think what it happens to test is still meaningful, even if it's not necessarily in any way some kind of "high performance IO" thing. There are probably lots of other more important loads, I just reacted to Christoph seeming to argue that the AIM7 behavior was _so_ broken that we shouldn't even care. It's not _that_ broken, it's just not about high-performance IO streaming, it happens to test something else entirely. We've actually had AIM7 occasionally find other issues just because some of the things it does is so odd. Iirc it has a fork test that doesn't execve (very unusual - you'd generally use threads if you care about performance), and that has shown issues with our anon_vma scaling before anything else did. I also seem to remember some odd pty open/close/ioctl subtest that showed problems with some of the last remnants of the old BKL (the test probably actually tested something else, but ended up choking on the odd tty things). So in general, I'm not a fan of AIM as a benchmark, but it actually _has_ found lots of real issues because it tends to do things that kernel developers think are insane. And let's face it, user programs doing odd and not very efficient things should be considered par for the course. We're never going to get rid of insane user programs, so we might as well fix the performance problems even when we say "that's just stupid". Linus
Re: [LKP] [lkp] [xfs] 68a9f5e700: aim7.jobs-per-min -13.6% regression
On Thu, Aug 11, 2016 at 9:16 PM, Dave Chinner wrote: > > That's why running aim7 as your "does the filesystem scale" > benchmark is somewhat irrelevant to scaling applications on high > performance systems these days Yes, don't get me wrong - I'm not at all trying to say that AIM7 is a good benchmark. It's just that I think what it happens to test is still meaningful, even if it's not necessarily in any way some kind of "high performance IO" thing. There are probably lots of other more important loads, I just reacted to Christoph seeming to argue that the AIM7 behavior was _so_ broken that we shouldn't even care. It's not _that_ broken, it's just not about high-performance IO streaming, it happens to test something else entirely. We've actually had AIM7 occasionally find other issues just because some of the things it does is so odd. Iirc it has a fork test that doesn't execve (very unusual - you'd generally use threads if you care about performance), and that has shown issues with our anon_vma scaling before anything else did. I also seem to remember some odd pty open/close/ioctl subtest that showed problems with some of the last remnants of the old BKL (the test probably actually tested something else, but ended up choking on the odd tty things). So in general, I'm not a fan of AIM as a benchmark, but it actually _has_ found lots of real issues because it tends to do things that kernel developers think are insane. And let's face it, user programs doing odd and not very efficient things should be considered par for the course. We're never going to get rid of insane user programs, so we might as well fix the performance problems even when we say "that's just stupid". Linus
[PATCH v8 3/3] regulator: lp873x: Change the MFD config option as per latest naming
Change the MFD config option as per latest naming Acked-by: Mark BrownSigned-off-by: Keerthy --- drivers/regulator/Kconfig | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/drivers/regulator/Kconfig b/drivers/regulator/Kconfig index 6c88e31..97dc4cc 100644 --- a/drivers/regulator/Kconfig +++ b/drivers/regulator/Kconfig @@ -323,7 +323,7 @@ config REGULATOR_LP872X config REGULATOR_LP873X tristate "TI LP873X Power regulators" - depends on MFD_LP873X && OF + depends on MFD_TI_LP873X && OF help This driver supports LP873X voltage regulator chips. LP873X provides two step-down converters and two general-purpose LDO -- 1.9.1
[PATCH v8 0/3] mfd: lp873x: Add lp873x PMIC support
The LP873X chip is a power management IC for Portable Navigation Systems and Tablet Computing devices. It contains the following components: - Regulators. - Configurable General Purpose Output Signals(GPO). PMIC interacts with the main processor through i2c. PMIC has couple of LDOs(Linear Regulators), couple of BUCKs (Step-Down DC-DC Converter Cores) and GPOs(General Purpose Output Signals). The regulator patch is already queued: http://marc.info/?l=linux-kernel=146298767110771=2 DT patch also queued. Hence posting the remaining patches of the series with the comments fixed on mfd driver. Changes in v8: Defined Macros for hardcoded bit definitions in gpio driver. Added Mark's ack for the regulator config change patch. Chages in v7: Reverted the dependency on http://www.gossamer-threads.com/lists/linux/kernel/2457552. Changes in v6: Series depends on: http://www.gossamer-threads.com/lists/linux/kernel/2457552 probe_new used. Chnages in v4: Added GPO driver support. Keerthy (3): mfd: lp873x: Add lp873x PMIC support gpio: lp873x: Add support for General Purpose Outputs regulator: lp873x: Change the MFD config option as per latest naming drivers/gpio/Kconfig | 10 ++ drivers/gpio/Makefile | 1 + drivers/gpio/gpio-lp873x.c | 193 + drivers/mfd/Kconfig| 14 +++ drivers/mfd/Makefile | 2 + drivers/mfd/lp873x.c | 97 + drivers/regulator/Kconfig | 2 +- include/linux/mfd/lp873x.h | 264 + 8 files changed, 582 insertions(+), 1 deletion(-) create mode 100644 drivers/gpio/gpio-lp873x.c create mode 100644 drivers/mfd/lp873x.c create mode 100644 include/linux/mfd/lp873x.h -- 1.9.1
[PATCH v8 2/3] gpio: lp873x: Add support for General Purpose Outputs
Add driver for lp873x PMIC family GPOs. Two GPOs are supported and can be configured in Open-drain output or Push-pull output. Acked-by: Linus WalleijSigned-off-by: Keerthy --- Changes in v8: * Replaced hardcoded bit fields with macros. drivers/gpio/Kconfig | 10 +++ drivers/gpio/Makefile | 1 + drivers/gpio/gpio-lp873x.c | 193 + 3 files changed, 204 insertions(+) create mode 100644 drivers/gpio/gpio-lp873x.c diff --git a/drivers/gpio/Kconfig b/drivers/gpio/Kconfig index 6de9062..ff68a80 100644 --- a/drivers/gpio/Kconfig +++ b/drivers/gpio/Kconfig @@ -853,6 +853,16 @@ config GPIO_LP3943 LP3943 can be used as a GPIO expander which provides up to 16 GPIOs. Open drain outputs are required for this usage. +config GPIO_LP873X + tristate "TI LP873X GPO" + depends on MFD_TI_LP873X + help + This driver supports the GPO on TI Lp873x PMICs. 2 GPOs are present + on LP873X PMICs. + + This driver can also be built as a module. If so, the module will be + called gpio-lp873x. + config GPIO_MAX77620 tristate "GPIO support for PMIC MAX77620 and MAX20024" depends on MFD_MAX77620 diff --git a/drivers/gpio/Makefile b/drivers/gpio/Makefile index 2a035ed..d60432f 100644 --- a/drivers/gpio/Makefile +++ b/drivers/gpio/Makefile @@ -56,6 +56,7 @@ obj-$(CONFIG_GPIO_LOONGSON) += gpio-loongson.o obj-$(CONFIG_GPIO_LP3943) += gpio-lp3943.o obj-$(CONFIG_GPIO_LPC18XX) += gpio-lpc18xx.o obj-$(CONFIG_ARCH_LPC32XX) += gpio-lpc32xx.o +obj-$(CONFIG_GPIO_LP873X) += gpio-lp873x.o obj-$(CONFIG_GPIO_LYNXPOINT) += gpio-lynxpoint.o obj-$(CONFIG_GPIO_MAX730X) += gpio-max730x.o obj-$(CONFIG_GPIO_MAX7300) += gpio-max7300.o diff --git a/drivers/gpio/gpio-lp873x.c b/drivers/gpio/gpio-lp873x.c new file mode 100644 index 000..79c255d --- /dev/null +++ b/drivers/gpio/gpio-lp873x.c @@ -0,0 +1,193 @@ +/* + * Copyright (C) 2016 Texas Instruments Incorporated - http://www.ti.com/ + * Keerthy + * + * This program is free software; you can redistribute it and/or + * modify it under the terms of the GNU General Public License version 2 as + * published by the Free Software Foundation. + * + * This program is distributed "as is" WITHOUT ANY WARRANTY of any + * kind, whether expressed or implied; without even the implied warranty + * of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + * GNU General Public License version 2 for more details. + * + * Based on the TPS65218 driver + */ + +#include +#include +#include +#include + +#include + +#define BITS_PER_GPO 0x4 +#define LP873X_GPO_CTRL_OD 0x2 + +struct lp873x_gpio { + struct gpio_chip chip; + struct lp873x *lp873; +}; + +static int lp873x_gpio_get_direction(struct gpio_chip *chip, +unsigned int offset) +{ + /* This device is output only */ + return 0; +} + +static int lp873x_gpio_direction_input(struct gpio_chip *chip, + unsigned int offset) +{ + /* This device is output only */ + return -EINVAL; +} + +static int lp873x_gpio_direction_output(struct gpio_chip *chip, + unsigned int offset, int value) +{ + struct lp873x_gpio *gpio = gpiochip_get_data(chip); + + /* Set the initial value */ + return regmap_update_bits(gpio->lp873->regmap, LP873X_REG_GPO_CTRL, + BIT(offset * BITS_PER_GPO), + value ? BIT(offset * BITS_PER_GPO) : 0); +} + +static int lp873x_gpio_get(struct gpio_chip *chip, unsigned int offset) +{ + struct lp873x_gpio *gpio = gpiochip_get_data(chip); + int ret, val; + + ret = regmap_read(gpio->lp873->regmap, LP873X_REG_GPO_CTRL, ); + if (ret < 0) + return ret; + + return val & BIT(offset * BITS_PER_GPO); +} + +static void lp873x_gpio_set(struct gpio_chip *chip, unsigned int offset, + int value) +{ + struct lp873x_gpio *gpio = gpiochip_get_data(chip); + + regmap_update_bits(gpio->lp873->regmap, LP873X_REG_GPO_CTRL, + BIT(offset * BITS_PER_GPO), + value ? BIT(offset * BITS_PER_GPO) : 0); +} + +static int lp873x_gpio_request(struct gpio_chip *gc, unsigned int offset) +{ + struct lp873x_gpio *gpio = gpiochip_get_data(gc); + int ret; + + switch (offset) { + case 0: + /* No MUX Set up Needed for GPO */ + break; + case 1: + /* Setup the CLKIN_PIN_SEL MUX to GPO2 */ + ret = regmap_update_bits(gpio->lp873->regmap, LP873X_REG_CONFIG, +LP873X_CONFIG_CLKIN_PIN_SEL, 0); + if (ret) + return ret; + +
[PATCH v8 3/3] regulator: lp873x: Change the MFD config option as per latest naming
Change the MFD config option as per latest naming Acked-by: Mark Brown Signed-off-by: Keerthy --- drivers/regulator/Kconfig | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/drivers/regulator/Kconfig b/drivers/regulator/Kconfig index 6c88e31..97dc4cc 100644 --- a/drivers/regulator/Kconfig +++ b/drivers/regulator/Kconfig @@ -323,7 +323,7 @@ config REGULATOR_LP872X config REGULATOR_LP873X tristate "TI LP873X Power regulators" - depends on MFD_LP873X && OF + depends on MFD_TI_LP873X && OF help This driver supports LP873X voltage regulator chips. LP873X provides two step-down converters and two general-purpose LDO -- 1.9.1
[PATCH v8 0/3] mfd: lp873x: Add lp873x PMIC support
The LP873X chip is a power management IC for Portable Navigation Systems and Tablet Computing devices. It contains the following components: - Regulators. - Configurable General Purpose Output Signals(GPO). PMIC interacts with the main processor through i2c. PMIC has couple of LDOs(Linear Regulators), couple of BUCKs (Step-Down DC-DC Converter Cores) and GPOs(General Purpose Output Signals). The regulator patch is already queued: http://marc.info/?l=linux-kernel=146298767110771=2 DT patch also queued. Hence posting the remaining patches of the series with the comments fixed on mfd driver. Changes in v8: Defined Macros for hardcoded bit definitions in gpio driver. Added Mark's ack for the regulator config change patch. Chages in v7: Reverted the dependency on http://www.gossamer-threads.com/lists/linux/kernel/2457552. Changes in v6: Series depends on: http://www.gossamer-threads.com/lists/linux/kernel/2457552 probe_new used. Chnages in v4: Added GPO driver support. Keerthy (3): mfd: lp873x: Add lp873x PMIC support gpio: lp873x: Add support for General Purpose Outputs regulator: lp873x: Change the MFD config option as per latest naming drivers/gpio/Kconfig | 10 ++ drivers/gpio/Makefile | 1 + drivers/gpio/gpio-lp873x.c | 193 + drivers/mfd/Kconfig| 14 +++ drivers/mfd/Makefile | 2 + drivers/mfd/lp873x.c | 97 + drivers/regulator/Kconfig | 2 +- include/linux/mfd/lp873x.h | 264 + 8 files changed, 582 insertions(+), 1 deletion(-) create mode 100644 drivers/gpio/gpio-lp873x.c create mode 100644 drivers/mfd/lp873x.c create mode 100644 include/linux/mfd/lp873x.h -- 1.9.1
[PATCH v8 2/3] gpio: lp873x: Add support for General Purpose Outputs
Add driver for lp873x PMIC family GPOs. Two GPOs are supported and can be configured in Open-drain output or Push-pull output. Acked-by: Linus Walleij Signed-off-by: Keerthy --- Changes in v8: * Replaced hardcoded bit fields with macros. drivers/gpio/Kconfig | 10 +++ drivers/gpio/Makefile | 1 + drivers/gpio/gpio-lp873x.c | 193 + 3 files changed, 204 insertions(+) create mode 100644 drivers/gpio/gpio-lp873x.c diff --git a/drivers/gpio/Kconfig b/drivers/gpio/Kconfig index 6de9062..ff68a80 100644 --- a/drivers/gpio/Kconfig +++ b/drivers/gpio/Kconfig @@ -853,6 +853,16 @@ config GPIO_LP3943 LP3943 can be used as a GPIO expander which provides up to 16 GPIOs. Open drain outputs are required for this usage. +config GPIO_LP873X + tristate "TI LP873X GPO" + depends on MFD_TI_LP873X + help + This driver supports the GPO on TI Lp873x PMICs. 2 GPOs are present + on LP873X PMICs. + + This driver can also be built as a module. If so, the module will be + called gpio-lp873x. + config GPIO_MAX77620 tristate "GPIO support for PMIC MAX77620 and MAX20024" depends on MFD_MAX77620 diff --git a/drivers/gpio/Makefile b/drivers/gpio/Makefile index 2a035ed..d60432f 100644 --- a/drivers/gpio/Makefile +++ b/drivers/gpio/Makefile @@ -56,6 +56,7 @@ obj-$(CONFIG_GPIO_LOONGSON) += gpio-loongson.o obj-$(CONFIG_GPIO_LP3943) += gpio-lp3943.o obj-$(CONFIG_GPIO_LPC18XX) += gpio-lpc18xx.o obj-$(CONFIG_ARCH_LPC32XX) += gpio-lpc32xx.o +obj-$(CONFIG_GPIO_LP873X) += gpio-lp873x.o obj-$(CONFIG_GPIO_LYNXPOINT) += gpio-lynxpoint.o obj-$(CONFIG_GPIO_MAX730X) += gpio-max730x.o obj-$(CONFIG_GPIO_MAX7300) += gpio-max7300.o diff --git a/drivers/gpio/gpio-lp873x.c b/drivers/gpio/gpio-lp873x.c new file mode 100644 index 000..79c255d --- /dev/null +++ b/drivers/gpio/gpio-lp873x.c @@ -0,0 +1,193 @@ +/* + * Copyright (C) 2016 Texas Instruments Incorporated - http://www.ti.com/ + * Keerthy + * + * This program is free software; you can redistribute it and/or + * modify it under the terms of the GNU General Public License version 2 as + * published by the Free Software Foundation. + * + * This program is distributed "as is" WITHOUT ANY WARRANTY of any + * kind, whether expressed or implied; without even the implied warranty + * of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + * GNU General Public License version 2 for more details. + * + * Based on the TPS65218 driver + */ + +#include +#include +#include +#include + +#include + +#define BITS_PER_GPO 0x4 +#define LP873X_GPO_CTRL_OD 0x2 + +struct lp873x_gpio { + struct gpio_chip chip; + struct lp873x *lp873; +}; + +static int lp873x_gpio_get_direction(struct gpio_chip *chip, +unsigned int offset) +{ + /* This device is output only */ + return 0; +} + +static int lp873x_gpio_direction_input(struct gpio_chip *chip, + unsigned int offset) +{ + /* This device is output only */ + return -EINVAL; +} + +static int lp873x_gpio_direction_output(struct gpio_chip *chip, + unsigned int offset, int value) +{ + struct lp873x_gpio *gpio = gpiochip_get_data(chip); + + /* Set the initial value */ + return regmap_update_bits(gpio->lp873->regmap, LP873X_REG_GPO_CTRL, + BIT(offset * BITS_PER_GPO), + value ? BIT(offset * BITS_PER_GPO) : 0); +} + +static int lp873x_gpio_get(struct gpio_chip *chip, unsigned int offset) +{ + struct lp873x_gpio *gpio = gpiochip_get_data(chip); + int ret, val; + + ret = regmap_read(gpio->lp873->regmap, LP873X_REG_GPO_CTRL, ); + if (ret < 0) + return ret; + + return val & BIT(offset * BITS_PER_GPO); +} + +static void lp873x_gpio_set(struct gpio_chip *chip, unsigned int offset, + int value) +{ + struct lp873x_gpio *gpio = gpiochip_get_data(chip); + + regmap_update_bits(gpio->lp873->regmap, LP873X_REG_GPO_CTRL, + BIT(offset * BITS_PER_GPO), + value ? BIT(offset * BITS_PER_GPO) : 0); +} + +static int lp873x_gpio_request(struct gpio_chip *gc, unsigned int offset) +{ + struct lp873x_gpio *gpio = gpiochip_get_data(gc); + int ret; + + switch (offset) { + case 0: + /* No MUX Set up Needed for GPO */ + break; + case 1: + /* Setup the CLKIN_PIN_SEL MUX to GPO2 */ + ret = regmap_update_bits(gpio->lp873->regmap, LP873X_REG_CONFIG, +LP873X_CONFIG_CLKIN_PIN_SEL, 0); + if (ret) + return ret; + + break; + default: + return -EINVAL; +
[PATCH v8 1/3] mfd: lp873x: Add lp873x PMIC support
The LP873X chip is a power management IC for Portable Navigation Systems and Tablet Computing devices. It contains the following components: - Regulators. - Configurable General Purpose Output Signals(GPO). PMIC interacts with the main processor through i2c. PMIC has couple of LDOs(Linear Regulators), couple of BUCKs (Step-Down DC-DC Converter Cores) and GPOs(General Purpose Output Signals). Signed-off-by: Keerthy--- drivers/mfd/Kconfig| 14 +++ drivers/mfd/Makefile | 2 + drivers/mfd/lp873x.c | 97 + include/linux/mfd/lp873x.h | 264 + 4 files changed, 377 insertions(+) create mode 100644 drivers/mfd/lp873x.c create mode 100644 include/linux/mfd/lp873x.h diff --git a/drivers/mfd/Kconfig b/drivers/mfd/Kconfig index 2d1fb64..45fe00a 100644 --- a/drivers/mfd/Kconfig +++ b/drivers/mfd/Kconfig @@ -1224,6 +1224,20 @@ config MFD_TPS65217 This driver can also be built as a module. If so, the module will be called tps65217. +config MFD_TI_LP873X + tristate "TI LP873X Power Management IC" + depends on I2C + select MFD_CORE + select REGMAP_I2C + help + If you say yes here then you get support for the LP873X series of + Power Management Integrated Circuits(PMIC). + These include voltage regulators, Thermal protection, Configurable + General Purpose Outputs(GPO) that are used in portable devices. + + This driver can also be built as a module. If so, the module + will be called lp873x. + config MFD_TPS65218 tristate "TI TPS65218 Power Management chips" depends on I2C diff --git a/drivers/mfd/Makefile b/drivers/mfd/Makefile index 2ba3ba3..42acbcd 100644 --- a/drivers/mfd/Makefile +++ b/drivers/mfd/Makefile @@ -22,6 +22,8 @@ obj-$(CONFIG_HTC_EGPIO) += htc-egpio.o obj-$(CONFIG_HTC_PASIC3) += htc-pasic3.o obj-$(CONFIG_HTC_I2CPLD) += htc-i2cpld.o +obj-$(CONFIG_MFD_TI_LP873X)+= lp873x.o + obj-$(CONFIG_MFD_DAVINCI_VOICECODEC) += davinci_voicecodec.o obj-$(CONFIG_MFD_DM355EVM_MSP) += dm355evm_msp.o obj-$(CONFIG_MFD_TI_AM335X_TSCADC) += ti_am335x_tscadc.o diff --git a/drivers/mfd/lp873x.c b/drivers/mfd/lp873x.c new file mode 100644 index 000..3ca0462 --- /dev/null +++ b/drivers/mfd/lp873x.c @@ -0,0 +1,97 @@ +/* + * Copyright (C) 2016 Texas Instruments Incorporated - http://www.ti.com/ + * + * Author: Keerthy + * + * This program is free software; you can redistribute it and/or + * modify it under the terms of the GNU General Public License as + * published by the Free Software Foundation version 2. + * + * This program is distributed "as is" WITHOUT ANY WARRANTY of any + * kind, whether express or implied; without even the implied warranty + * of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + * GNU General Public License for more details. + */ + +#include +#include +#include +#include +#include + +#include + +static const struct regmap_config lp873x_regmap_config = { + .reg_bits = 8, + .val_bits = 8, + .max_register = LP873X_REG_MAX, +}; + +static const struct mfd_cell lp873x_cells[] = { + { .name = "lp873x-regulator", }, + { .name = "lp873x-gpio", }, +}; + +static int lp873x_probe(struct i2c_client *client, + const struct i2c_device_id *ids) +{ + struct lp873x *lp873; + int ret; + unsigned int otpid; + + lp873 = devm_kzalloc(>dev, sizeof(*lp873), GFP_KERNEL); + if (!lp873) + return -ENOMEM; + + lp873->dev = >dev; + + lp873->regmap = devm_regmap_init_i2c(client, _regmap_config); + if (IS_ERR(lp873->regmap)) { + ret = PTR_ERR(lp873->regmap); + dev_err(lp873->dev, + "Failed to initialize register map: %d\n", ret); + return ret; + } + + mutex_init(>lock); + + ret = regmap_read(lp873->regmap, LP873X_REG_OTP_REV, ); + if (ret) { + dev_err(lp873->dev, "Failed to read OTP ID\n"); + return ret; + } + + lp873->rev = otpid & LP873X_OTP_REV_OTP_ID; + i2c_set_clientdata(client, lp873); + ret = mfd_add_devices(lp873->dev, PLATFORM_DEVID_AUTO, lp873x_cells, + ARRAY_SIZE(lp873x_cells), NULL, 0, NULL); + + return ret; +} + +static const struct of_device_id of_lp873x_match_table[] = { + { .compatible = "ti,lp8733", }, + { .compatible = "ti,lp8732", }, + {} +}; +MODULE_DEVICE_TABLE(of, of_lp873x_match_table); + +static const struct i2c_device_id lp873x_id_table[] = { + { "lp873x", LP873X }, + { }, +}; +MODULE_DEVICE_TABLE(i2c, lp873x_id_table); + +static struct i2c_driver lp873x_driver = { + .driver = { + .name = "lp873x", + .of_match_table =
[PATCH v8 1/3] mfd: lp873x: Add lp873x PMIC support
The LP873X chip is a power management IC for Portable Navigation Systems and Tablet Computing devices. It contains the following components: - Regulators. - Configurable General Purpose Output Signals(GPO). PMIC interacts with the main processor through i2c. PMIC has couple of LDOs(Linear Regulators), couple of BUCKs (Step-Down DC-DC Converter Cores) and GPOs(General Purpose Output Signals). Signed-off-by: Keerthy --- drivers/mfd/Kconfig| 14 +++ drivers/mfd/Makefile | 2 + drivers/mfd/lp873x.c | 97 + include/linux/mfd/lp873x.h | 264 + 4 files changed, 377 insertions(+) create mode 100644 drivers/mfd/lp873x.c create mode 100644 include/linux/mfd/lp873x.h diff --git a/drivers/mfd/Kconfig b/drivers/mfd/Kconfig index 2d1fb64..45fe00a 100644 --- a/drivers/mfd/Kconfig +++ b/drivers/mfd/Kconfig @@ -1224,6 +1224,20 @@ config MFD_TPS65217 This driver can also be built as a module. If so, the module will be called tps65217. +config MFD_TI_LP873X + tristate "TI LP873X Power Management IC" + depends on I2C + select MFD_CORE + select REGMAP_I2C + help + If you say yes here then you get support for the LP873X series of + Power Management Integrated Circuits(PMIC). + These include voltage regulators, Thermal protection, Configurable + General Purpose Outputs(GPO) that are used in portable devices. + + This driver can also be built as a module. If so, the module + will be called lp873x. + config MFD_TPS65218 tristate "TI TPS65218 Power Management chips" depends on I2C diff --git a/drivers/mfd/Makefile b/drivers/mfd/Makefile index 2ba3ba3..42acbcd 100644 --- a/drivers/mfd/Makefile +++ b/drivers/mfd/Makefile @@ -22,6 +22,8 @@ obj-$(CONFIG_HTC_EGPIO) += htc-egpio.o obj-$(CONFIG_HTC_PASIC3) += htc-pasic3.o obj-$(CONFIG_HTC_I2CPLD) += htc-i2cpld.o +obj-$(CONFIG_MFD_TI_LP873X)+= lp873x.o + obj-$(CONFIG_MFD_DAVINCI_VOICECODEC) += davinci_voicecodec.o obj-$(CONFIG_MFD_DM355EVM_MSP) += dm355evm_msp.o obj-$(CONFIG_MFD_TI_AM335X_TSCADC) += ti_am335x_tscadc.o diff --git a/drivers/mfd/lp873x.c b/drivers/mfd/lp873x.c new file mode 100644 index 000..3ca0462 --- /dev/null +++ b/drivers/mfd/lp873x.c @@ -0,0 +1,97 @@ +/* + * Copyright (C) 2016 Texas Instruments Incorporated - http://www.ti.com/ + * + * Author: Keerthy + * + * This program is free software; you can redistribute it and/or + * modify it under the terms of the GNU General Public License as + * published by the Free Software Foundation version 2. + * + * This program is distributed "as is" WITHOUT ANY WARRANTY of any + * kind, whether express or implied; without even the implied warranty + * of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + * GNU General Public License for more details. + */ + +#include +#include +#include +#include +#include + +#include + +static const struct regmap_config lp873x_regmap_config = { + .reg_bits = 8, + .val_bits = 8, + .max_register = LP873X_REG_MAX, +}; + +static const struct mfd_cell lp873x_cells[] = { + { .name = "lp873x-regulator", }, + { .name = "lp873x-gpio", }, +}; + +static int lp873x_probe(struct i2c_client *client, + const struct i2c_device_id *ids) +{ + struct lp873x *lp873; + int ret; + unsigned int otpid; + + lp873 = devm_kzalloc(>dev, sizeof(*lp873), GFP_KERNEL); + if (!lp873) + return -ENOMEM; + + lp873->dev = >dev; + + lp873->regmap = devm_regmap_init_i2c(client, _regmap_config); + if (IS_ERR(lp873->regmap)) { + ret = PTR_ERR(lp873->regmap); + dev_err(lp873->dev, + "Failed to initialize register map: %d\n", ret); + return ret; + } + + mutex_init(>lock); + + ret = regmap_read(lp873->regmap, LP873X_REG_OTP_REV, ); + if (ret) { + dev_err(lp873->dev, "Failed to read OTP ID\n"); + return ret; + } + + lp873->rev = otpid & LP873X_OTP_REV_OTP_ID; + i2c_set_clientdata(client, lp873); + ret = mfd_add_devices(lp873->dev, PLATFORM_DEVID_AUTO, lp873x_cells, + ARRAY_SIZE(lp873x_cells), NULL, 0, NULL); + + return ret; +} + +static const struct of_device_id of_lp873x_match_table[] = { + { .compatible = "ti,lp8733", }, + { .compatible = "ti,lp8732", }, + {} +}; +MODULE_DEVICE_TABLE(of, of_lp873x_match_table); + +static const struct i2c_device_id lp873x_id_table[] = { + { "lp873x", LP873X }, + { }, +}; +MODULE_DEVICE_TABLE(i2c, lp873x_id_table); + +static struct i2c_driver lp873x_driver = { + .driver = { + .name = "lp873x", + .of_match_table = of_lp873x_match_table, + }, + .probe
[PATCH 0/2] AM335x-ICE: Add support for rotary switch
This series adds support for rotary-switch on AM335x-ICE that is connected to TI PCA9536 I2C GPIO expander. First patch adds new generic driver to read status of group of GPIO lines and report the value as an input event. The second patch adds DT entries for the same. Vignesh R (2): input: misc: Add generic input driver to read encoded GPIO lines ARM: dts: am335x-icev2: Add nodes for gpio-decoder .../devicetree/bindings/input/gpio-decoder.txt | 23 arch/arm/boot/dts/am335x-icev2.dts | 9 ++ drivers/input/misc/Kconfig | 12 ++ drivers/input/misc/Makefile| 1 + drivers/input/misc/gpio_decoder.c | 128 + 5 files changed, 173 insertions(+) create mode 100644 Documentation/devicetree/bindings/input/gpio-decoder.txt create mode 100644 drivers/input/misc/gpio_decoder.c -- 2.9.2
[PATCH 0/2] AM335x-ICE: Add support for rotary switch
This series adds support for rotary-switch on AM335x-ICE that is connected to TI PCA9536 I2C GPIO expander. First patch adds new generic driver to read status of group of GPIO lines and report the value as an input event. The second patch adds DT entries for the same. Vignesh R (2): input: misc: Add generic input driver to read encoded GPIO lines ARM: dts: am335x-icev2: Add nodes for gpio-decoder .../devicetree/bindings/input/gpio-decoder.txt | 23 arch/arm/boot/dts/am335x-icev2.dts | 9 ++ drivers/input/misc/Kconfig | 12 ++ drivers/input/misc/Makefile| 1 + drivers/input/misc/gpio_decoder.c | 128 + 5 files changed, 173 insertions(+) create mode 100644 Documentation/devicetree/bindings/input/gpio-decoder.txt create mode 100644 drivers/input/misc/gpio_decoder.c -- 2.9.2
[PATCH 1/2] input: misc: Add generic input driver to read encoded GPIO lines
Add a driver to read group of GPIO lines and provide its status as a numerical value as input event to the system. This will help in intefacing devices, that can be connected over GPIOs, that provide input to the system by driving GPIO lines connected to them like a rotary dial or a switch. For example, a rotary switch can be connected to four GPIO lines. The status of the GPIO lines reflect the actual position of the rotary switch dial. For example, if dial points to 9, then the four GPIO lines connected to the switch will read HLLH(0b'1001 = 9). This value can be reported as an ABS_* event to the input subsystem. Signed-off-by: Vignesh R--- .../devicetree/bindings/input/gpio-decoder.txt | 23 drivers/input/misc/Kconfig | 12 ++ drivers/input/misc/Makefile| 1 + drivers/input/misc/gpio_decoder.c | 128 + 4 files changed, 164 insertions(+) create mode 100644 Documentation/devicetree/bindings/input/gpio-decoder.txt create mode 100644 drivers/input/misc/gpio_decoder.c diff --git a/Documentation/devicetree/bindings/input/gpio-decoder.txt b/Documentation/devicetree/bindings/input/gpio-decoder.txt new file mode 100644 index ..efd947f52cb3 --- /dev/null +++ b/Documentation/devicetree/bindings/input/gpio-decoder.txt @@ -0,0 +1,23 @@ +* GPIO Decoder DT bindings + +Required Properties: +- compatible: should be "gpio-decoder" +- gpios: a spec of gpios (atleast two) to be decoded to a number with + first entry representing the MSB. + +Optional Properties: +- gpio-decoder,max-value: Maximum possible value that can be reported by + the gpios. +- linux,axis: the input subsystem axis to map to (ABS_X/ABS_Y). + Defaults to 0 (ABS_X). + +Example: + gpio-decoder0 { + compatible = "gpio-decoder"; + gpios = < 3 GPIO_ACTIVE_HIGH>, + < 2 GPIO_ACTIVE_HIGH>, + < 1 GPIO_ACTIVE_HIGH>, + < 0 GPIO_ACTIVE_HIGH>; + linux,axis = <0>; /* ABS_X */ + gpio-decoder,max-value = <9>; + }; diff --git a/drivers/input/misc/Kconfig b/drivers/input/misc/Kconfig index efb0ca871327..c44513cac3b7 100644 --- a/drivers/input/misc/Kconfig +++ b/drivers/input/misc/Kconfig @@ -292,6 +292,18 @@ config INPUT_GPIO_TILT_POLLED To compile this driver as a module, choose M here: the module will be called gpio_tilt_polled. +config INPUT_GPIO_DECODER + tristate "Polled GPIO Decoder Input driver" + depends on GPIOLIB && OF + select INPUT_POLLDEV + help +Say Y here if you want driver to read status of multiple GPIO +lines and report the encoded value as an absolute integer to +input subsystem. + +To compile this driver as a module, choose M here: the module +will will be called gpio_decoder. + config INPUT_IXP4XX_BEEPER tristate "IXP4XX Beeper support" depends on ARCH_IXP4XX diff --git a/drivers/input/misc/Makefile b/drivers/input/misc/Makefile index 6a1e5e20fc1c..0b6d025f0487 100644 --- a/drivers/input/misc/Makefile +++ b/drivers/input/misc/Makefile @@ -35,6 +35,7 @@ obj-$(CONFIG_INPUT_DRV2667_HAPTICS) += drv2667.o obj-$(CONFIG_INPUT_GP2A) += gp2ap002a00f.o obj-$(CONFIG_INPUT_GPIO_BEEPER)+= gpio-beeper.o obj-$(CONFIG_INPUT_GPIO_TILT_POLLED) += gpio_tilt_polled.o +obj-$(CONFIG_INPUT_GPIO_DECODER) += gpio_decoder.o obj-$(CONFIG_INPUT_HISI_POWERKEY) += hisi_powerkey.o obj-$(CONFIG_HP_SDC_RTC) += hp_sdc_rtc.o obj-$(CONFIG_INPUT_IMS_PCU)+= ims-pcu.o diff --git a/drivers/input/misc/gpio_decoder.c b/drivers/input/misc/gpio_decoder.c new file mode 100644 index ..e3bed35fc8db --- /dev/null +++ b/drivers/input/misc/gpio_decoder.c @@ -0,0 +1,128 @@ +/* + * gpio-decoder.c + * + * A generic driver to reads multiple gpio lines and translate the + * encoded numeric value into an input event. + * + * Copyright (C) 2016 Texas Instruments Incorporated - http://www.ti.com/ + * + * This program is free software; you can redistribute it and/or + * modify it under the terms of the GNU General Public License as + * published by the Free Software Foundation version 2. + * + * This program is distributed "as is" WITHOUT ANY WARRANTY of any + * kind, whether express or implied; without even the implied warranty + * of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + * GNU General Public License for more details. + */ + +#include +#include +#include +#include +#include +#include +#include +#include + +struct gpio_decoder { + struct input_polled_dev *poll_dev; + struct gpio_descs *input_gpios; + struct device *dev; + u32 axis; + u32 last_stable; +}; + +static unsigned int gpio_decoder_get_gpios_state(struct gpio_decoder +*decoder) +{ +
[PATCH 2/2] ARM: dts: am335x-icev2: Add nodes for gpio-decoder
AM335x ICE board has a rotary-switch connected to PCA9536 I2C GPIO expander. The position of the rotary-switch is reflected by status of GPIO lines. Add gpio-decoder node to read these GPIO line status via gpio-decoder driver and report it as an input event to the system. Signed-off-by: Vignesh R--- arch/arm/boot/dts/am335x-icev2.dts | 9 + 1 file changed, 9 insertions(+) diff --git a/arch/arm/boot/dts/am335x-icev2.dts b/arch/arm/boot/dts/am335x-icev2.dts index 7d8b8fefdf08..f2de594343f2 100644 --- a/arch/arm/boot/dts/am335x-icev2.dts +++ b/arch/arm/boot/dts/am335x-icev2.dts @@ -139,6 +139,15 @@ default-state = "off"; }; }; + gpio-decoder0 { + compatible = "gpio-decoder"; + gpios = < 3 GPIO_ACTIVE_HIGH>, + < 2 GPIO_ACTIVE_HIGH>, + < 1 GPIO_ACTIVE_HIGH>, + < 0 GPIO_ACTIVE_HIGH>; + linux,axis = <0>; /* ABS_X */ + gpio-decoder,max-value = <9>; + }; }; _pinmux { -- 2.9.2
[PATCH 2/2] ARM: dts: am335x-icev2: Add nodes for gpio-decoder
AM335x ICE board has a rotary-switch connected to PCA9536 I2C GPIO expander. The position of the rotary-switch is reflected by status of GPIO lines. Add gpio-decoder node to read these GPIO line status via gpio-decoder driver and report it as an input event to the system. Signed-off-by: Vignesh R --- arch/arm/boot/dts/am335x-icev2.dts | 9 + 1 file changed, 9 insertions(+) diff --git a/arch/arm/boot/dts/am335x-icev2.dts b/arch/arm/boot/dts/am335x-icev2.dts index 7d8b8fefdf08..f2de594343f2 100644 --- a/arch/arm/boot/dts/am335x-icev2.dts +++ b/arch/arm/boot/dts/am335x-icev2.dts @@ -139,6 +139,15 @@ default-state = "off"; }; }; + gpio-decoder0 { + compatible = "gpio-decoder"; + gpios = < 3 GPIO_ACTIVE_HIGH>, + < 2 GPIO_ACTIVE_HIGH>, + < 1 GPIO_ACTIVE_HIGH>, + < 0 GPIO_ACTIVE_HIGH>; + linux,axis = <0>; /* ABS_X */ + gpio-decoder,max-value = <9>; + }; }; _pinmux { -- 2.9.2
[PATCH 1/2] input: misc: Add generic input driver to read encoded GPIO lines
Add a driver to read group of GPIO lines and provide its status as a numerical value as input event to the system. This will help in intefacing devices, that can be connected over GPIOs, that provide input to the system by driving GPIO lines connected to them like a rotary dial or a switch. For example, a rotary switch can be connected to four GPIO lines. The status of the GPIO lines reflect the actual position of the rotary switch dial. For example, if dial points to 9, then the four GPIO lines connected to the switch will read HLLH(0b'1001 = 9). This value can be reported as an ABS_* event to the input subsystem. Signed-off-by: Vignesh R --- .../devicetree/bindings/input/gpio-decoder.txt | 23 drivers/input/misc/Kconfig | 12 ++ drivers/input/misc/Makefile| 1 + drivers/input/misc/gpio_decoder.c | 128 + 4 files changed, 164 insertions(+) create mode 100644 Documentation/devicetree/bindings/input/gpio-decoder.txt create mode 100644 drivers/input/misc/gpio_decoder.c diff --git a/Documentation/devicetree/bindings/input/gpio-decoder.txt b/Documentation/devicetree/bindings/input/gpio-decoder.txt new file mode 100644 index ..efd947f52cb3 --- /dev/null +++ b/Documentation/devicetree/bindings/input/gpio-decoder.txt @@ -0,0 +1,23 @@ +* GPIO Decoder DT bindings + +Required Properties: +- compatible: should be "gpio-decoder" +- gpios: a spec of gpios (atleast two) to be decoded to a number with + first entry representing the MSB. + +Optional Properties: +- gpio-decoder,max-value: Maximum possible value that can be reported by + the gpios. +- linux,axis: the input subsystem axis to map to (ABS_X/ABS_Y). + Defaults to 0 (ABS_X). + +Example: + gpio-decoder0 { + compatible = "gpio-decoder"; + gpios = < 3 GPIO_ACTIVE_HIGH>, + < 2 GPIO_ACTIVE_HIGH>, + < 1 GPIO_ACTIVE_HIGH>, + < 0 GPIO_ACTIVE_HIGH>; + linux,axis = <0>; /* ABS_X */ + gpio-decoder,max-value = <9>; + }; diff --git a/drivers/input/misc/Kconfig b/drivers/input/misc/Kconfig index efb0ca871327..c44513cac3b7 100644 --- a/drivers/input/misc/Kconfig +++ b/drivers/input/misc/Kconfig @@ -292,6 +292,18 @@ config INPUT_GPIO_TILT_POLLED To compile this driver as a module, choose M here: the module will be called gpio_tilt_polled. +config INPUT_GPIO_DECODER + tristate "Polled GPIO Decoder Input driver" + depends on GPIOLIB && OF + select INPUT_POLLDEV + help +Say Y here if you want driver to read status of multiple GPIO +lines and report the encoded value as an absolute integer to +input subsystem. + +To compile this driver as a module, choose M here: the module +will will be called gpio_decoder. + config INPUT_IXP4XX_BEEPER tristate "IXP4XX Beeper support" depends on ARCH_IXP4XX diff --git a/drivers/input/misc/Makefile b/drivers/input/misc/Makefile index 6a1e5e20fc1c..0b6d025f0487 100644 --- a/drivers/input/misc/Makefile +++ b/drivers/input/misc/Makefile @@ -35,6 +35,7 @@ obj-$(CONFIG_INPUT_DRV2667_HAPTICS) += drv2667.o obj-$(CONFIG_INPUT_GP2A) += gp2ap002a00f.o obj-$(CONFIG_INPUT_GPIO_BEEPER)+= gpio-beeper.o obj-$(CONFIG_INPUT_GPIO_TILT_POLLED) += gpio_tilt_polled.o +obj-$(CONFIG_INPUT_GPIO_DECODER) += gpio_decoder.o obj-$(CONFIG_INPUT_HISI_POWERKEY) += hisi_powerkey.o obj-$(CONFIG_HP_SDC_RTC) += hp_sdc_rtc.o obj-$(CONFIG_INPUT_IMS_PCU)+= ims-pcu.o diff --git a/drivers/input/misc/gpio_decoder.c b/drivers/input/misc/gpio_decoder.c new file mode 100644 index ..e3bed35fc8db --- /dev/null +++ b/drivers/input/misc/gpio_decoder.c @@ -0,0 +1,128 @@ +/* + * gpio-decoder.c + * + * A generic driver to reads multiple gpio lines and translate the + * encoded numeric value into an input event. + * + * Copyright (C) 2016 Texas Instruments Incorporated - http://www.ti.com/ + * + * This program is free software; you can redistribute it and/or + * modify it under the terms of the GNU General Public License as + * published by the Free Software Foundation version 2. + * + * This program is distributed "as is" WITHOUT ANY WARRANTY of any + * kind, whether express or implied; without even the implied warranty + * of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + * GNU General Public License for more details. + */ + +#include +#include +#include +#include +#include +#include +#include +#include + +struct gpio_decoder { + struct input_polled_dev *poll_dev; + struct gpio_descs *input_gpios; + struct device *dev; + u32 axis; + u32 last_stable; +}; + +static unsigned int gpio_decoder_get_gpios_state(struct gpio_decoder +*decoder) +{ + struct
Re: [LKP] [lkp] [xfs] 68a9f5e700: aim7.jobs-per-min -13.6% regression
On Thu, Aug 11, 2016 at 08:20:53PM -0700, Linus Torvalds wrote: > On Thu, Aug 11, 2016 at 7:52 PM, Christoph Hellwigwrote: > > > > I can look at that, but indeed optimizing this patch seems a bit > > stupid. > > The "write less than a full block to the end of the file" is actually > a reasonably common case. > > It may not make for a great filesystem benchmark, but it also isn't > actually insane. People who do logging in user space do this all the > time, for example. And it is *not* stupid in that context. Not at all. > > It's never going to be the *main* thing you do (unless you're AIM), > but I do think it's worth fixing. > > And AIM7 remains one of those odd benchmarks that people use. I'm not > quite sure why, but I really do think that the normal "append smaller > chunks to the end of the file" should absolutely not be dismissed as > stupid. Yes, I agree that there are reasons for making sub-block IO work well (which is why I'm looking to try to fix it), but that does't mean the benchmark is sane. aim7 is, technically, a "scalability benchmark". As such, expecting tiny writes to scale to moving large amounts of data is the "stupid" thing it does. If you scale up the amount of data you need to move, tehn you ned to scale up the efficiency of moving that data. Case in point - writing 1GB of data in 1kb chunks to XFs on a local /dev/pmem1 runs at ~600MB/s, whilst moving it it in 1MB chunks runs at 1.9GB/s. aim7 doesn't actually stress the scalability of the hardware, because inefficiencies in it's implementation prevent it from getting to those limits. That's what aim7 misses - as speeds and capabilities go up, the way code needs to be written to make efficient use of the hardware also changes. e.g. High throughput logging solutions don't write every incoming log event immediately - they aggregate them into larger buffers and then write those, knowing that they can support much higher logging rates by doing this That's why running aim7 as your "does the filesystem scale" benchmark is somewhat irrelevant to scaling applications on high performance systems these days - users with fast storage will be expecting to see that 1.9GB/s throughput from their app, not 600MB/s Cheers, Dave. -- Dave Chinner da...@fromorbit.com
Re: [LKP] [lkp] [xfs] 68a9f5e700: aim7.jobs-per-min -13.6% regression
On Thu, Aug 11, 2016 at 08:20:53PM -0700, Linus Torvalds wrote: > On Thu, Aug 11, 2016 at 7:52 PM, Christoph Hellwig wrote: > > > > I can look at that, but indeed optimizing this patch seems a bit > > stupid. > > The "write less than a full block to the end of the file" is actually > a reasonably common case. > > It may not make for a great filesystem benchmark, but it also isn't > actually insane. People who do logging in user space do this all the > time, for example. And it is *not* stupid in that context. Not at all. > > It's never going to be the *main* thing you do (unless you're AIM), > but I do think it's worth fixing. > > And AIM7 remains one of those odd benchmarks that people use. I'm not > quite sure why, but I really do think that the normal "append smaller > chunks to the end of the file" should absolutely not be dismissed as > stupid. Yes, I agree that there are reasons for making sub-block IO work well (which is why I'm looking to try to fix it), but that does't mean the benchmark is sane. aim7 is, technically, a "scalability benchmark". As such, expecting tiny writes to scale to moving large amounts of data is the "stupid" thing it does. If you scale up the amount of data you need to move, tehn you ned to scale up the efficiency of moving that data. Case in point - writing 1GB of data in 1kb chunks to XFs on a local /dev/pmem1 runs at ~600MB/s, whilst moving it it in 1MB chunks runs at 1.9GB/s. aim7 doesn't actually stress the scalability of the hardware, because inefficiencies in it's implementation prevent it from getting to those limits. That's what aim7 misses - as speeds and capabilities go up, the way code needs to be written to make efficient use of the hardware also changes. e.g. High throughput logging solutions don't write every incoming log event immediately - they aggregate them into larger buffers and then write those, knowing that they can support much higher logging rates by doing this That's why running aim7 as your "does the filesystem scale" benchmark is somewhat irrelevant to scaling applications on high performance systems these days - users with fast storage will be expecting to see that 1.9GB/s throughput from their app, not 600MB/s Cheers, Dave. -- Dave Chinner da...@fromorbit.com
Re: kmemleak: Cannot insert 0xff7f1000 into the object search tree (overlaps existing)
Hi Catalin, Thanks for the response! On Thursday 11 August 2016 10:38 PM, Catalin Marinas wrote: > On Thu, Aug 11, 2016 at 07:48:12PM +0300, Grygorii Strashko wrote: >> On 08/11/2016 06:54 PM, Catalin Marinas wrote: >>> On Thu, Aug 11, 2016 at 05:20:51PM +0530, Vignesh R wrote: I see the below message from kmemleak when booting linux-next on AM335x GP EVM and DRA7 EVM >>> >>> Can you also reproduce it with 4.8-rc1? Yes, I can reproduce this on 4.8.0-rc1-g4b9eaf33d83d [...] > > diff --git a/mm/memblock.c b/mm/memblock.c > index 483197ef613f..7d3361d53ac2 100644 > --- a/mm/memblock.c > +++ b/mm/memblock.c > @@ -723,7 +723,8 @@ int __init_memblock memblock_free(phys_addr_t base, > phys_addr_t size) >(unsigned long long)base + size - 1, >(void *)_RET_IP_); > > - kmemleak_free_part(__va(base), size); > + if (base < __pa(high_memory)) > + kmemleak_free_part(__va(base), size); > return memblock_remove_range(, base, size); > } > > @@ -1152,7 +1153,8 @@ static phys_addr_t __init > memblock_alloc_range_nid(phys_addr_t size, >* The min_count is set to 0 so that memblock allocations are >* never reported as leaks. >*/ > - kmemleak_alloc(__va(found), size, 0, 0); > + if (found < __pa(high_memory)) > + kmemleak_alloc(__va(found), size, 0, 0); > return found; > } > return 0; > @@ -1399,7 +1401,8 @@ void __init __memblock_free_early(phys_addr_t base, > phys_addr_t size) > memblock_dbg("%s: [%#016llx-%#016llx] %pF\n", >__func__, (u64)base, (u64)base + size - 1, >(void *)_RET_IP_); > - kmemleak_free_part(__va(base), size); > + if (base < __pa(high_memory)) > + kmemleak_free_part(__va(base), size); > memblock_remove_range(, base, size); > } > > @@ -1419,7 +1422,8 @@ void __init __memblock_free_late(phys_addr_t base, > phys_addr_t size) > memblock_dbg("%s: [%#016llx-%#016llx] %pF\n", >__func__, (u64)base, (u64)base + size - 1, >(void *)_RET_IP_); > - kmemleak_free_part(__va(base), size); > + if (base < __pa(high_memory)) > + kmemleak_free_part(__va(base), size); > cursor = PFN_UP(base); > end = PFN_DOWN(base + size); > > With above change on 4.8-rc1, I see a different warning from kmemleak: [0.002918] kmemleak: Trying to color unknown object at 0xfe80 as Black [0.002943] CPU: 0 PID: 0 Comm: swapper/0 Not tainted 4.8.0-rc1-00121-g4b9eaf33d83d-dirty #59 [0.002955] Hardware name: Generic AM33XX (Flattened Device Tree) [0.003000] [] (unwind_backtrace) from [] (show_stack+0x10/0x14) [0.003027] [] (show_stack) from [] (dump_stack+0xac/0xe0) [0.003052] [] (dump_stack) from [] (paint_ptr+0x78/0x9c) [0.003074] [] (paint_ptr) from [] (kmemleak_init+0x1cc/0x284) [0.003104] [] (kmemleak_init) from [] (start_kernel+0x2d8/0x3b4) [0.003122] [] (start_kernel) from [<8000807c>] (0x8000807c) [0.003133] kmemleak: Early log backtrace: [0.003146][] dma_contiguous_reserve+0x80/0x94 [0.003170][] arm_memblock_init+0x130/0x184 [0.003191][] setup_arch+0x58c/0xc00 [0.003208][] start_kernel+0x58/0x3b4 [0.003224][<8000807c>] 0x8000807c [0.003239][] 0x Full boot log: http://pastebin.ubuntu.com/23048180/ -- Thanks Vignesh
Re: kmemleak: Cannot insert 0xff7f1000 into the object search tree (overlaps existing)
Hi Catalin, Thanks for the response! On Thursday 11 August 2016 10:38 PM, Catalin Marinas wrote: > On Thu, Aug 11, 2016 at 07:48:12PM +0300, Grygorii Strashko wrote: >> On 08/11/2016 06:54 PM, Catalin Marinas wrote: >>> On Thu, Aug 11, 2016 at 05:20:51PM +0530, Vignesh R wrote: I see the below message from kmemleak when booting linux-next on AM335x GP EVM and DRA7 EVM >>> >>> Can you also reproduce it with 4.8-rc1? Yes, I can reproduce this on 4.8.0-rc1-g4b9eaf33d83d [...] > > diff --git a/mm/memblock.c b/mm/memblock.c > index 483197ef613f..7d3361d53ac2 100644 > --- a/mm/memblock.c > +++ b/mm/memblock.c > @@ -723,7 +723,8 @@ int __init_memblock memblock_free(phys_addr_t base, > phys_addr_t size) >(unsigned long long)base + size - 1, >(void *)_RET_IP_); > > - kmemleak_free_part(__va(base), size); > + if (base < __pa(high_memory)) > + kmemleak_free_part(__va(base), size); > return memblock_remove_range(, base, size); > } > > @@ -1152,7 +1153,8 @@ static phys_addr_t __init > memblock_alloc_range_nid(phys_addr_t size, >* The min_count is set to 0 so that memblock allocations are >* never reported as leaks. >*/ > - kmemleak_alloc(__va(found), size, 0, 0); > + if (found < __pa(high_memory)) > + kmemleak_alloc(__va(found), size, 0, 0); > return found; > } > return 0; > @@ -1399,7 +1401,8 @@ void __init __memblock_free_early(phys_addr_t base, > phys_addr_t size) > memblock_dbg("%s: [%#016llx-%#016llx] %pF\n", >__func__, (u64)base, (u64)base + size - 1, >(void *)_RET_IP_); > - kmemleak_free_part(__va(base), size); > + if (base < __pa(high_memory)) > + kmemleak_free_part(__va(base), size); > memblock_remove_range(, base, size); > } > > @@ -1419,7 +1422,8 @@ void __init __memblock_free_late(phys_addr_t base, > phys_addr_t size) > memblock_dbg("%s: [%#016llx-%#016llx] %pF\n", >__func__, (u64)base, (u64)base + size - 1, >(void *)_RET_IP_); > - kmemleak_free_part(__va(base), size); > + if (base < __pa(high_memory)) > + kmemleak_free_part(__va(base), size); > cursor = PFN_UP(base); > end = PFN_DOWN(base + size); > > With above change on 4.8-rc1, I see a different warning from kmemleak: [0.002918] kmemleak: Trying to color unknown object at 0xfe80 as Black [0.002943] CPU: 0 PID: 0 Comm: swapper/0 Not tainted 4.8.0-rc1-00121-g4b9eaf33d83d-dirty #59 [0.002955] Hardware name: Generic AM33XX (Flattened Device Tree) [0.003000] [] (unwind_backtrace) from [] (show_stack+0x10/0x14) [0.003027] [] (show_stack) from [] (dump_stack+0xac/0xe0) [0.003052] [] (dump_stack) from [] (paint_ptr+0x78/0x9c) [0.003074] [] (paint_ptr) from [] (kmemleak_init+0x1cc/0x284) [0.003104] [] (kmemleak_init) from [] (start_kernel+0x2d8/0x3b4) [0.003122] [] (start_kernel) from [<8000807c>] (0x8000807c) [0.003133] kmemleak: Early log backtrace: [0.003146][] dma_contiguous_reserve+0x80/0x94 [0.003170][] arm_memblock_init+0x130/0x184 [0.003191][] setup_arch+0x58c/0xc00 [0.003208][] start_kernel+0x58/0x3b4 [0.003224][<8000807c>] 0x8000807c [0.003239][] 0x Full boot log: http://pastebin.ubuntu.com/23048180/ -- Thanks Vignesh
Re: [PART2 PATCH v5 00/12] iommu/AMD: Introduce IOMMU AVIC support
On 08/09/2016 09:58 PM, Joerg Roedel wrote: On Mon, Aug 08, 2016 at 09:42:36PM +0700, Suthikulpanit, Suravee wrote: Hi Joerg/Radim/Paolo, Are there any other concerns about this series? Okay, looked again through the iommu-patches and sent you my feedback. Overall I think we are on track to merge this for v4.9 when the remaining small issues are addressed and Radim and Paolo give their Acked-bys and/or Reviewed-bys too. Joerg Thanks Joerg. I'll update the patch and send out V6. Suravee
Re: [PART2 PATCH v5 00/12] iommu/AMD: Introduce IOMMU AVIC support
On 08/09/2016 09:58 PM, Joerg Roedel wrote: On Mon, Aug 08, 2016 at 09:42:36PM +0700, Suthikulpanit, Suravee wrote: Hi Joerg/Radim/Paolo, Are there any other concerns about this series? Okay, looked again through the iommu-patches and sent you my feedback. Overall I think we are on track to merge this for v4.9 when the remaining small issues are addressed and Radim and Paolo give their Acked-bys and/or Reviewed-bys too. Joerg Thanks Joerg. I'll update the patch and send out V6. Suravee
linux-next: Tree for Aug 12
Hi all, Changes since 20160811: Non-merge commits (relative to Linus' tree): 1492 1513 files changed, 39934 insertions(+), 15095 deletions(-) I have created today's linux-next tree at git://git.kernel.org/pub/scm/linux/kernel/git/next/linux-next.git (patches at http://www.kernel.org/pub/linux/kernel/next/ ). If you are tracking the linux-next tree using git, you should not use "git pull" to do so as that will try to merge the new linux-next release with the old one. You should use "git fetch" and checkout or reset to the new master. You can see which trees have been included by looking in the Next/Trees file in the source. There are also quilt-import.log and merge.log files in the Next directory. Between each merge, the tree was built with a ppc64_defconfig for powerpc and an allmodconfig (with CONFIG_BUILD_DOCSRC=n) for x86_64, a multi_v7_defconfig for arm and a native build of tools/perf. After the final fixups (if any), I do an x86_64 modules_install followed by builds for x86_64 allnoconfig, powerpc allnoconfig (32 and 64 bit), ppc44x_defconfig, allyesconfig (this fails its final link) and pseries_le_defconfig and i386, sparc and sparc64 defconfig. Below is a summary of the state of the merge. I am currently merging 241 trees (counting Linus' and 35 trees of patches pending for Linus' tree). Stats about the size of the tree over time can be seen at http://neuling.org/linux-next-size.html . Status of my local build tests will be at http://kisskb.ellerman.id.au/linux-next . If maintainers want to give advice about cross compilers/configs that work, we are always open to add more builds. Thanks to Randy Dunlap for doing many randconfig builds. And to Paul Gortmaker for triage and bug fixes. -- Cheers, Stephen Rothwell $ git checkout master $ git reset --hard stable Merging origin/master (d3396e1e4ec4 Merge tag 'fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc) Merging fixes/master (d3396e1e4ec4 Merge tag 'fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc) Merging kbuild-current/rc-fixes (b36fad65d61f kbuild: Initialize exported variables) Merging arc-current/for-curr (45c3b08a117e ARC: Elide redundant setup of DMA callbacks) Merging arm-current/fixes (87eed3c74d7c ARM: fix address limit restoration for undefined instructions) Merging m68k-current/for-linus (6bd80f372371 m68k/defconfig: Update defconfigs for v4.7-rc2) Merging metag-fixes/fixes (97b1d23f7bcb metag: Drop show_mem() from mem_init()) Merging powerpc-fixes/fixes (ca49e64f0cb1 selftests/powerpc: Specify we expect to build with std=gnu99) Merging powerpc-merge-mpe/fixes (bc0195aad0da Linux 4.2-rc2) Merging sparc/master (4620a06e4b3c shmem: Fix link error if huge pages support is disabled) Merging net/master (bbe11fab0b6c macsec: use after free when deleting the underlying device) Merging ipsec/master (6916fb3b10b3 xfrm: Ignore socket policies when rebuilding hash tables) Merging netfilter/master (4b5b9ba553f9 openvswitch: do not ignore netdev errors when creating tunnel vports) Merging ipvs/master (ea43f860d984 Merge branch 'ethoc-fixes') Merging wireless-drivers/master (034fdd4a17ff Merge ath-current from ath.git) Merging mac80211/master (4d0bd46a4d55 Revert "wext: Fix 32 bit iwpriv compatibility issue with 64 bit Kernel") Merging sound-current/for-linus (a52ff34e5ec6 ALSA: hda - Manage power well properly for resume) Merging pci-current/for-linus (8b078c603249 PCI: Update "pci=resource_alignment" documentation) Merging driver-core.current/driver-core-linus (29b4817d4018 Linux 4.8-rc1) Merging tty.current/tty-linus (29b4817d4018 Linux 4.8-rc1) Merging usb.current/usb-linus (539587511835 usb: misc: usbtest: add fix for driver hang) Merging usb-gadget-fixes/fixes (a0ad85ae866f usb: dwc3: gadget: stop processing on HWO set) Merging usb-serial-fixes/usb-linus (647024a7df36 USB: serial: fix memleak in driver-registration error path) Merging usb-chipidea-fixes/ci-for-usb-stable (ea1d39a31d3b usb: common: otg-fsm: add license to usb-otg-fsm) Merging staging.current/staging-linus (29b4817d4018 Linux 4.8-rc1) Merging char-misc.current/char-misc-linus (29b4817d4018 Linux 4.8-rc1) Merging input-current/for-linus (22fe874f3803 Input: silead - remove some dead code) Merging crypto-current/master (a0118c8b2be9 crypto: caam - fix non-hmac hashes) Merging ide/master (797cee982eef Merge branch 'stable-4.8' of git://git.infradead.org/users/pcmoore/audit) Merging rr-fixes/fixes (8244062ef1e5 modules: fix longstanding /proc/kallsyms vs module insertion race.) Merging vfio-fixes/for-linus (c8952a707556 vfio/pci: Fix NULL pointer oops in error interrupt setup handling) Merging kselftest-fixes/fixes (29b4817d4018 Linux 4.8-rc1) Merging backlight-fixes/for-backlight-fixes (68feaca0b13e backlight: pwm: Handle EPROBE_DEFER while requesting the P
linux-next: Tree for Aug 12
Hi all, Changes since 20160811: Non-merge commits (relative to Linus' tree): 1492 1513 files changed, 39934 insertions(+), 15095 deletions(-) I have created today's linux-next tree at git://git.kernel.org/pub/scm/linux/kernel/git/next/linux-next.git (patches at http://www.kernel.org/pub/linux/kernel/next/ ). If you are tracking the linux-next tree using git, you should not use "git pull" to do so as that will try to merge the new linux-next release with the old one. You should use "git fetch" and checkout or reset to the new master. You can see which trees have been included by looking in the Next/Trees file in the source. There are also quilt-import.log and merge.log files in the Next directory. Between each merge, the tree was built with a ppc64_defconfig for powerpc and an allmodconfig (with CONFIG_BUILD_DOCSRC=n) for x86_64, a multi_v7_defconfig for arm and a native build of tools/perf. After the final fixups (if any), I do an x86_64 modules_install followed by builds for x86_64 allnoconfig, powerpc allnoconfig (32 and 64 bit), ppc44x_defconfig, allyesconfig (this fails its final link) and pseries_le_defconfig and i386, sparc and sparc64 defconfig. Below is a summary of the state of the merge. I am currently merging 241 trees (counting Linus' and 35 trees of patches pending for Linus' tree). Stats about the size of the tree over time can be seen at http://neuling.org/linux-next-size.html . Status of my local build tests will be at http://kisskb.ellerman.id.au/linux-next . If maintainers want to give advice about cross compilers/configs that work, we are always open to add more builds. Thanks to Randy Dunlap for doing many randconfig builds. And to Paul Gortmaker for triage and bug fixes. -- Cheers, Stephen Rothwell $ git checkout master $ git reset --hard stable Merging origin/master (d3396e1e4ec4 Merge tag 'fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc) Merging fixes/master (d3396e1e4ec4 Merge tag 'fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc) Merging kbuild-current/rc-fixes (b36fad65d61f kbuild: Initialize exported variables) Merging arc-current/for-curr (45c3b08a117e ARC: Elide redundant setup of DMA callbacks) Merging arm-current/fixes (87eed3c74d7c ARM: fix address limit restoration for undefined instructions) Merging m68k-current/for-linus (6bd80f372371 m68k/defconfig: Update defconfigs for v4.7-rc2) Merging metag-fixes/fixes (97b1d23f7bcb metag: Drop show_mem() from mem_init()) Merging powerpc-fixes/fixes (ca49e64f0cb1 selftests/powerpc: Specify we expect to build with std=gnu99) Merging powerpc-merge-mpe/fixes (bc0195aad0da Linux 4.2-rc2) Merging sparc/master (4620a06e4b3c shmem: Fix link error if huge pages support is disabled) Merging net/master (bbe11fab0b6c macsec: use after free when deleting the underlying device) Merging ipsec/master (6916fb3b10b3 xfrm: Ignore socket policies when rebuilding hash tables) Merging netfilter/master (4b5b9ba553f9 openvswitch: do not ignore netdev errors when creating tunnel vports) Merging ipvs/master (ea43f860d984 Merge branch 'ethoc-fixes') Merging wireless-drivers/master (034fdd4a17ff Merge ath-current from ath.git) Merging mac80211/master (4d0bd46a4d55 Revert "wext: Fix 32 bit iwpriv compatibility issue with 64 bit Kernel") Merging sound-current/for-linus (a52ff34e5ec6 ALSA: hda - Manage power well properly for resume) Merging pci-current/for-linus (8b078c603249 PCI: Update "pci=resource_alignment" documentation) Merging driver-core.current/driver-core-linus (29b4817d4018 Linux 4.8-rc1) Merging tty.current/tty-linus (29b4817d4018 Linux 4.8-rc1) Merging usb.current/usb-linus (539587511835 usb: misc: usbtest: add fix for driver hang) Merging usb-gadget-fixes/fixes (a0ad85ae866f usb: dwc3: gadget: stop processing on HWO set) Merging usb-serial-fixes/usb-linus (647024a7df36 USB: serial: fix memleak in driver-registration error path) Merging usb-chipidea-fixes/ci-for-usb-stable (ea1d39a31d3b usb: common: otg-fsm: add license to usb-otg-fsm) Merging staging.current/staging-linus (29b4817d4018 Linux 4.8-rc1) Merging char-misc.current/char-misc-linus (29b4817d4018 Linux 4.8-rc1) Merging input-current/for-linus (22fe874f3803 Input: silead - remove some dead code) Merging crypto-current/master (a0118c8b2be9 crypto: caam - fix non-hmac hashes) Merging ide/master (797cee982eef Merge branch 'stable-4.8' of git://git.infradead.org/users/pcmoore/audit) Merging rr-fixes/fixes (8244062ef1e5 modules: fix longstanding /proc/kallsyms vs module insertion race.) Merging vfio-fixes/for-linus (c8952a707556 vfio/pci: Fix NULL pointer oops in error interrupt setup handling) Merging kselftest-fixes/fixes (29b4817d4018 Linux 4.8-rc1) Merging backlight-fixes/for-backlight-fixes (68feaca0b13e backlight: pwm: Handle EPROBE_DEFER while requesting the P
Re: [LKP] [lkp] [xfs] 68a9f5e700: aim7.jobs-per-min -13.6% regression
On Thu, Aug 11, 2016 at 07:27:52PM -0700, Linus Torvalds wrote: > On Thu, Aug 11, 2016 at 5:54 PM, Dave Chinnerwrote: > > > > So, removing mark_page_accessed() made the spinlock contention > > *worse*. > > > > 36.51% [kernel] [k] _raw_spin_unlock_irqrestore > >6.27% [kernel] [k] copy_user_generic_string > >3.73% [kernel] [k] _raw_spin_unlock_irq > >3.55% [kernel] [k] get_page_from_freelist > >1.97% [kernel] [k] do_raw_spin_lock > >1.72% [kernel] [k] __block_commit_write.isra.30 > > I don't recall having ever seen the mapping tree_lock as a contention > point before, but it's not like I've tried that load either. So it > might be a regression (going back long, I suspect), or just an unusual > load that nobody has traditionally tested much. > > Single-threaded big file write one page at a time, was it? Yup. On a 4 node NUMA system. So when memory reclaim kicks in, there's a write process, a writeback kworker and 4 kswapd kthreads all banging on the mapping->tree_lock. There's an awful lot of concurrency happening behind the scenes of that single user process writing to a file... > The mapping tree lock has been around forever (it used to be a rw-lock > long long ago), but I wonder if we might have moved more stuff into it > (memory accounting comes to mind) causing much worse contention or > something. Yeah, there is now a crapton of accounting updated in account_page_dirtied under the tree lock - memcg, writeback, node, zone, task, etc. And there's a *lot* of code that __delete_from_page_cache() can execute under the tree lock. > Hmm. Just for fun, I googled "tree_lock contention". It's shown up > before - back in 2006, and it was you hitting it back then too. Of course! That, however, would have been when I was playing with real big SGI machines, not a tiddly little 16p VM :P Cheers, Dave. -- Dave Chinner da...@fromorbit.com
Re: [LKP] [lkp] [xfs] 68a9f5e700: aim7.jobs-per-min -13.6% regression
On Thu, Aug 11, 2016 at 07:27:52PM -0700, Linus Torvalds wrote: > On Thu, Aug 11, 2016 at 5:54 PM, Dave Chinner wrote: > > > > So, removing mark_page_accessed() made the spinlock contention > > *worse*. > > > > 36.51% [kernel] [k] _raw_spin_unlock_irqrestore > >6.27% [kernel] [k] copy_user_generic_string > >3.73% [kernel] [k] _raw_spin_unlock_irq > >3.55% [kernel] [k] get_page_from_freelist > >1.97% [kernel] [k] do_raw_spin_lock > >1.72% [kernel] [k] __block_commit_write.isra.30 > > I don't recall having ever seen the mapping tree_lock as a contention > point before, but it's not like I've tried that load either. So it > might be a regression (going back long, I suspect), or just an unusual > load that nobody has traditionally tested much. > > Single-threaded big file write one page at a time, was it? Yup. On a 4 node NUMA system. So when memory reclaim kicks in, there's a write process, a writeback kworker and 4 kswapd kthreads all banging on the mapping->tree_lock. There's an awful lot of concurrency happening behind the scenes of that single user process writing to a file... > The mapping tree lock has been around forever (it used to be a rw-lock > long long ago), but I wonder if we might have moved more stuff into it > (memory accounting comes to mind) causing much worse contention or > something. Yeah, there is now a crapton of accounting updated in account_page_dirtied under the tree lock - memcg, writeback, node, zone, task, etc. And there's a *lot* of code that __delete_from_page_cache() can execute under the tree lock. > Hmm. Just for fun, I googled "tree_lock contention". It's shown up > before - back in 2006, and it was you hitting it back then too. Of course! That, however, would have been when I was playing with real big SGI machines, not a tiddly little 16p VM :P Cheers, Dave. -- Dave Chinner da...@fromorbit.com
Re: [PATCH] mm: Add the ram_latent_entropy kernel parameter
On 11/08/16 08:28, Kees Cook wrote: > From: Emese Revfy> > When "ram_latent_entropy" is passed on the kernel command line, entropy > will be extracted from up to the first 4GB of RAM while the runtime memory > allocator is being initialized. This entropy isn't cryptographically > secure, but does help provide additional unpredictability on otherwise > low-entropy systems. > > Based on work created by the PaX Team. > > Signed-off-by: Emese Revfy > [kees: renamed parameter, dropped relationship with plugin, updated log] > Signed-off-by: Kees Cook > --- > This patch has been extracted from the latent_entropy gcc plugin, as > suggested by Linus: https://lkml.org/lkml/2016/8/8/840 > --- > Documentation/kernel-parameters.txt | 5 + > mm/page_alloc.c | 21 + > 2 files changed, 26 insertions(+) > > diff --git a/Documentation/kernel-parameters.txt > b/Documentation/kernel-parameters.txt > index 46c030a49186..9d054984370f 100644 > --- a/Documentation/kernel-parameters.txt > +++ b/Documentation/kernel-parameters.txt > @@ -3245,6 +3245,11 @@ bytes respectively. Such letter suffixes can also be > entirely omitted. > raid= [HW,RAID] > See Documentation/md.txt. > > + ram_latent_entropy > + Enable a very simple form of latent entropy extraction > + from the first 4GB of memory as the bootmem allocator > + passes the memory pages to the buddy allocator. > + > ramdisk_size= [RAM] Sizes of RAM disks in kilobytes > See Documentation/blockdev/ramdisk.txt. > > diff --git a/mm/page_alloc.c b/mm/page_alloc.c > index fb975cec3518..1de94f0ff29d 100644 > --- a/mm/page_alloc.c > +++ b/mm/page_alloc.c > @@ -64,6 +64,7 @@ > #include > #include > #include > +#include > > #include > #include > @@ -1236,6 +1237,15 @@ static void __free_pages_ok(struct page *page, > unsigned int order) > local_irq_restore(flags); > } > > +bool __meminitdata ram_latent_entropy; > + > +static int __init setup_ram_latent_entropy(char *str) > +{ > + ram_latent_entropy = true; > + return 0; > +} > +early_param("ram_latent_entropy", setup_ram_latent_entropy); > + > static void __init __free_pages_boot_core(struct page *page, unsigned int > order) > { > unsigned int nr_pages = 1 << order; > @@ -1251,6 +1261,17 @@ static void __init __free_pages_boot_core(struct page > *page, unsigned int order) > __ClearPageReserved(p); > set_page_count(p, 0); > > + if (ram_latent_entropy && !PageHighMem(page) && > + page_to_pfn(page) < 0x10) { > + u64 hash = 0; > + size_t index, end = PAGE_SIZE * nr_pages / sizeof(hash); > + const u64 *data = lowmem_page_address(page); > + > + for (index = 0; index < end; index++) > + hash ^= hash + data[index]; Won't the hash be the same across boots? Is this entropy addition for KASLR, since it is so early in boot?q > + add_device_randomness((const void *), sizeof(hash)); > + } > + > page_zone(page)->managed_pages += nr_pages; > set_page_refcounted(page); > __free_pages(page, order); > Balbir Singh
Re: [PATCH] mm: Add the ram_latent_entropy kernel parameter
On 11/08/16 08:28, Kees Cook wrote: > From: Emese Revfy > > When "ram_latent_entropy" is passed on the kernel command line, entropy > will be extracted from up to the first 4GB of RAM while the runtime memory > allocator is being initialized. This entropy isn't cryptographically > secure, but does help provide additional unpredictability on otherwise > low-entropy systems. > > Based on work created by the PaX Team. > > Signed-off-by: Emese Revfy > [kees: renamed parameter, dropped relationship with plugin, updated log] > Signed-off-by: Kees Cook > --- > This patch has been extracted from the latent_entropy gcc plugin, as > suggested by Linus: https://lkml.org/lkml/2016/8/8/840 > --- > Documentation/kernel-parameters.txt | 5 + > mm/page_alloc.c | 21 + > 2 files changed, 26 insertions(+) > > diff --git a/Documentation/kernel-parameters.txt > b/Documentation/kernel-parameters.txt > index 46c030a49186..9d054984370f 100644 > --- a/Documentation/kernel-parameters.txt > +++ b/Documentation/kernel-parameters.txt > @@ -3245,6 +3245,11 @@ bytes respectively. Such letter suffixes can also be > entirely omitted. > raid= [HW,RAID] > See Documentation/md.txt. > > + ram_latent_entropy > + Enable a very simple form of latent entropy extraction > + from the first 4GB of memory as the bootmem allocator > + passes the memory pages to the buddy allocator. > + > ramdisk_size= [RAM] Sizes of RAM disks in kilobytes > See Documentation/blockdev/ramdisk.txt. > > diff --git a/mm/page_alloc.c b/mm/page_alloc.c > index fb975cec3518..1de94f0ff29d 100644 > --- a/mm/page_alloc.c > +++ b/mm/page_alloc.c > @@ -64,6 +64,7 @@ > #include > #include > #include > +#include > > #include > #include > @@ -1236,6 +1237,15 @@ static void __free_pages_ok(struct page *page, > unsigned int order) > local_irq_restore(flags); > } > > +bool __meminitdata ram_latent_entropy; > + > +static int __init setup_ram_latent_entropy(char *str) > +{ > + ram_latent_entropy = true; > + return 0; > +} > +early_param("ram_latent_entropy", setup_ram_latent_entropy); > + > static void __init __free_pages_boot_core(struct page *page, unsigned int > order) > { > unsigned int nr_pages = 1 << order; > @@ -1251,6 +1261,17 @@ static void __init __free_pages_boot_core(struct page > *page, unsigned int order) > __ClearPageReserved(p); > set_page_count(p, 0); > > + if (ram_latent_entropy && !PageHighMem(page) && > + page_to_pfn(page) < 0x10) { > + u64 hash = 0; > + size_t index, end = PAGE_SIZE * nr_pages / sizeof(hash); > + const u64 *data = lowmem_page_address(page); > + > + for (index = 0; index < end; index++) > + hash ^= hash + data[index]; Won't the hash be the same across boots? Is this entropy addition for KASLR, since it is so early in boot?q > + add_device_randomness((const void *), sizeof(hash)); > + } > + > page_zone(page)->managed_pages += nr_pages; > set_page_refcounted(page); > __free_pages(page, order); > Balbir Singh
Re: debug tip after earlycon is closed?
Hi. 2016-07-28 16:44 GMT+09:00 Arnd Bergmann: > > I think the problem is that you have three consoles: > > - the boot console that stays active until a real console comes up > - the framebuffer console that is initialized early and goes on to > disable the bootconsole > - the serial console that you are looking at but which doesn't get > initialized until much later > > Clearly something is wrong in this setup and we don't want to > disable the boot console before the serial console is up. > > I guess you could work around it by disabling the framebuffer console > at compile time, or by having the serial console initialized earlier > than the framebuffer, but I wonder if this is something that could > use a more general solution. Looks like this comes from: #elif defined(CONFIG_DUMMY_CONSOLE) conswitchp = _con; #endif console= in the kernel-parameter is no problem, but stdout-path in the chose node goes wrong with it. 2016-07-28 21:20 GMT+09:00 Russell King - ARM Linux : > I think this is down to how the linux,stdout property is handled. > > Normally, with command line specified consoles, if you don't specify > anything, you get the framebuffer console (actually, it's the first > registered console, but practically this is the framebuffer if enabled) > by default. > > If you specify a console (or consoles) on the command line, you get > all of those you specified, (iirc) with /dev/console's input connected > to the first specified console. > > This happens because the command line is parsed for consoles, and > add_preferred_console() is called for each that are found, whether or > not drivers are discovered for them. __add_preferred_console() > prevents the first registered console being automatically initialised > by setting selected_console to a non-negative number. > > However, the parsing of DT specified consoles occurs at driver > initialisation time - in uart_add_one_port() via of_console_check(). > Only at this time is add_preferred_console() called, which means that > the first console registered prior to _any_ selected UART console > driver mentioned in DT will become active. > > When the first console becomes active, the earlycon is disabled, which > means that in the DT case, if we have a framebuffer enabled which > registers prior to any selected UART console, the framebuffer will > stop the earlycon output immediately. > > To me, what this means is that the DT parsing of linux,stdout is > broken - while it may look nice from a design point of view, the > design is wrong and fails to take account of non-UART consoles in > the system. Thanks for clarification. I agree that stdout-path is something wrong. Since I switched from chosen { bootargs = "console=ttyS0,115200"; }; to chosen { stdout-path = "serial0:115200n8"; }; I have had bad experiences. The combination of stdout-path and earlycon gives doubled log. I reported this in https://lkml.org/lkml/2015/11/27/170 but not fixed yet. I could fix the problem by changing chosen { stdout-path = "serial0:115200n8"; bootargs = "earlycon"; }; to chosen { bootargs = "console=ttyS0,115200 earlycon=uniphier,mmio32,0x54006800,115200"; }; -- Best Regards Masahiro Yamada
Re: debug tip after earlycon is closed?
Hi. 2016-07-28 16:44 GMT+09:00 Arnd Bergmann : > > I think the problem is that you have three consoles: > > - the boot console that stays active until a real console comes up > - the framebuffer console that is initialized early and goes on to > disable the bootconsole > - the serial console that you are looking at but which doesn't get > initialized until much later > > Clearly something is wrong in this setup and we don't want to > disable the boot console before the serial console is up. > > I guess you could work around it by disabling the framebuffer console > at compile time, or by having the serial console initialized earlier > than the framebuffer, but I wonder if this is something that could > use a more general solution. Looks like this comes from: #elif defined(CONFIG_DUMMY_CONSOLE) conswitchp = _con; #endif console= in the kernel-parameter is no problem, but stdout-path in the chose node goes wrong with it. 2016-07-28 21:20 GMT+09:00 Russell King - ARM Linux : > I think this is down to how the linux,stdout property is handled. > > Normally, with command line specified consoles, if you don't specify > anything, you get the framebuffer console (actually, it's the first > registered console, but practically this is the framebuffer if enabled) > by default. > > If you specify a console (or consoles) on the command line, you get > all of those you specified, (iirc) with /dev/console's input connected > to the first specified console. > > This happens because the command line is parsed for consoles, and > add_preferred_console() is called for each that are found, whether or > not drivers are discovered for them. __add_preferred_console() > prevents the first registered console being automatically initialised > by setting selected_console to a non-negative number. > > However, the parsing of DT specified consoles occurs at driver > initialisation time - in uart_add_one_port() via of_console_check(). > Only at this time is add_preferred_console() called, which means that > the first console registered prior to _any_ selected UART console > driver mentioned in DT will become active. > > When the first console becomes active, the earlycon is disabled, which > means that in the DT case, if we have a framebuffer enabled which > registers prior to any selected UART console, the framebuffer will > stop the earlycon output immediately. > > To me, what this means is that the DT parsing of linux,stdout is > broken - while it may look nice from a design point of view, the > design is wrong and fails to take account of non-UART consoles in > the system. Thanks for clarification. I agree that stdout-path is something wrong. Since I switched from chosen { bootargs = "console=ttyS0,115200"; }; to chosen { stdout-path = "serial0:115200n8"; }; I have had bad experiences. The combination of stdout-path and earlycon gives doubled log. I reported this in https://lkml.org/lkml/2015/11/27/170 but not fixed yet. I could fix the problem by changing chosen { stdout-path = "serial0:115200n8"; bootargs = "earlycon"; }; to chosen { bootargs = "console=ttyS0,115200 earlycon=uniphier,mmio32,0x54006800,115200"; }; -- Best Regards Masahiro Yamada
Re: [PATCH] seccomp: Fix tracer exit notifications during fatal signals
Thanks! On Fri, Aug 12, 2016 at 3:12 AM, Oleg Nesterovwrote: > > The bug happens because when __seccomp_filter() detects > > fatal_signal_pending(), it calls do_exit() without dequeuing the fatal > > signal. When do_exit() sends the PTRACE_EVENT_EXIT > > I _never_ understood what PTRACE_EVENT_EXIT should actually do. I mean, > when it should actually stop. This was never defined. > > > notification and > > that task is descheduled, __schedule() notices that there is a fatal > > signal pending and changes its state from TASK_TRACED to TASK_RUNNING. > > And this can happen anyway, with or without this change, with or without > seccomp. Because another fatal signal can be pending. So PTRACE_EVENT_EXIT > actually depends on /dev/random. True. But at least (as Kees alluded to later) this patch ensures PTRACE_EVENT_EXIT delivery when exit is due to exit_group() and no genuine fatal signals are involved. > Perhaps we should finally define what it should do. Say, it should only > stop if SIGKILL was sent "implicitely" by exit/exec. But as for exec, > there are more (off-topic) complications, not sure we actually want this... The ptrace man page currently says: > A SIGKILL signal may still cause a PTRACE_EVENT_EXIT stop before actual > signal death. This may be changed in the future; SIGKILL is meant to always > immediately kill tasks even under ptrace. Last confirmed on Linux 3.13. In practice, a PTRACE_EVENT_EXIT is almost always observed after SIGKILL. That's nice for rr, because it lets us observe the process's final state. But it allows a process to stay alive indefinitely after receiving SIGKILL, so I can see why you might want to change it. Rob -- lbir ye,ea yer.tnietoehr rdn rdsme,anea lurpr edna e hnysnenh hhe uresyf toD selthor stor edna siewaoeodm or v sstvr esBa kbvted,t rdsme,aoreseoouoto o l euetiuruewFa kbn e hnystoivateweh uresyf tulsa rehr rdm or rnea lurpr .a war hsrer holsa rodvted,t nenh hneireseoouot.tniesiewaoeivatewt sstvr esn
Re: [PATCH 3/3] ARM: dts: imx6q-evi: Fix onboard hub reset line
On Thu, Aug 11, 2016 at 09:40:32AM -0700, Joshua Clayton wrote: > Previously the onboard hub was made to work by treating its > reset gpio as a regulator enable. > Get rid of that kludge now that pwseq has added reset gpio support > Move pin muxing the hub reset pin into the usbh1 group > > Signed-off-by: Joshua Clayton> --- > arch/arm/boot/dts/imx6q-evi.dts | 25 +++-- > 1 file changed, 7 insertions(+), 18 deletions(-) > > diff --git a/arch/arm/boot/dts/imx6q-evi.dts b/arch/arm/boot/dts/imx6q-evi.dts > index 4fa5601..49c6f61 100644 > --- a/arch/arm/boot/dts/imx6q-evi.dts > +++ b/arch/arm/boot/dts/imx6q-evi.dts > @@ -54,18 +54,6 @@ > reg = <0x1000 0x4000>; > }; > > - reg_usbh1_vbus: regulator-usbhubreset { > - compatible = "regulator-fixed"; > - regulator-name = "usbh1_vbus"; > - regulator-min-microvolt = <500>; > - regulator-max-microvolt = <500>; > - enable-active-high; > - startup-delay-us = <2>; > - pinctrl-names = "default"; > - pinctrl-0 = <_usbh1_hubreset>; > - gpio = < 12 GPIO_ACTIVE_HIGH>; > - }; > - > reg_usb_otg_vbus: regulator-usbotgvbus { > compatible = "regulator-fixed"; > regulator-name = "usb_otg_vbus"; > @@ -204,12 +192,18 @@ > }; > > { > - vbus-supply = <_usbh1_vbus>; > pinctrl-names = "default"; > pinctrl-0 = <_usbh1>; > dr_mode = "host"; > disable-over-current; > status = "okay"; > + > + usb2415host: hub@1 { > + compatible = "usb424,2513"; > + reg = <1>; > + reset-gpios = < 12 GPIO_ACTIVE_LOW>; > + reset-duration-us = <3000>; > + }; > }; > > { > @@ -467,11 +461,6 @@ > MX6QDL_PAD_GPIO_3__USB_H1_OC 0x1b0b0 > /* usbh1_b OC */ This comment should be for above pin, do you mind I change it in this patch? > MX6QDL_PAD_GPIO_0__GPIO1_IO00 0x1b0b0 > - >; > - }; > - > - pinctrl_usbh1_hubreset: usbh1hubresetgrp { > - fsl,pins = < > MX6QDL_PAD_GPIO_17__GPIO7_IO12 0x1b0b0 > >; > }; This changes remind me that I need to change the same thing for udoo board. -- Best Regards, Peter Chen
Re: [PATCH] seccomp: Fix tracer exit notifications during fatal signals
Thanks! On Fri, Aug 12, 2016 at 3:12 AM, Oleg Nesterov wrote: > > The bug happens because when __seccomp_filter() detects > > fatal_signal_pending(), it calls do_exit() without dequeuing the fatal > > signal. When do_exit() sends the PTRACE_EVENT_EXIT > > I _never_ understood what PTRACE_EVENT_EXIT should actually do. I mean, > when it should actually stop. This was never defined. > > > notification and > > that task is descheduled, __schedule() notices that there is a fatal > > signal pending and changes its state from TASK_TRACED to TASK_RUNNING. > > And this can happen anyway, with or without this change, with or without > seccomp. Because another fatal signal can be pending. So PTRACE_EVENT_EXIT > actually depends on /dev/random. True. But at least (as Kees alluded to later) this patch ensures PTRACE_EVENT_EXIT delivery when exit is due to exit_group() and no genuine fatal signals are involved. > Perhaps we should finally define what it should do. Say, it should only > stop if SIGKILL was sent "implicitely" by exit/exec. But as for exec, > there are more (off-topic) complications, not sure we actually want this... The ptrace man page currently says: > A SIGKILL signal may still cause a PTRACE_EVENT_EXIT stop before actual > signal death. This may be changed in the future; SIGKILL is meant to always > immediately kill tasks even under ptrace. Last confirmed on Linux 3.13. In practice, a PTRACE_EVENT_EXIT is almost always observed after SIGKILL. That's nice for rr, because it lets us observe the process's final state. But it allows a process to stay alive indefinitely after receiving SIGKILL, so I can see why you might want to change it. Rob -- lbir ye,ea yer.tnietoehr rdn rdsme,anea lurpr edna e hnysnenh hhe uresyf toD selthor stor edna siewaoeodm or v sstvr esBa kbvted,t rdsme,aoreseoouoto o l euetiuruewFa kbn e hnystoivateweh uresyf tulsa rehr rdm or rnea lurpr .a war hsrer holsa rodvted,t nenh hneireseoouot.tniesiewaoeivatewt sstvr esn
Re: [PATCH 3/3] ARM: dts: imx6q-evi: Fix onboard hub reset line
On Thu, Aug 11, 2016 at 09:40:32AM -0700, Joshua Clayton wrote: > Previously the onboard hub was made to work by treating its > reset gpio as a regulator enable. > Get rid of that kludge now that pwseq has added reset gpio support > Move pin muxing the hub reset pin into the usbh1 group > > Signed-off-by: Joshua Clayton > --- > arch/arm/boot/dts/imx6q-evi.dts | 25 +++-- > 1 file changed, 7 insertions(+), 18 deletions(-) > > diff --git a/arch/arm/boot/dts/imx6q-evi.dts b/arch/arm/boot/dts/imx6q-evi.dts > index 4fa5601..49c6f61 100644 > --- a/arch/arm/boot/dts/imx6q-evi.dts > +++ b/arch/arm/boot/dts/imx6q-evi.dts > @@ -54,18 +54,6 @@ > reg = <0x1000 0x4000>; > }; > > - reg_usbh1_vbus: regulator-usbhubreset { > - compatible = "regulator-fixed"; > - regulator-name = "usbh1_vbus"; > - regulator-min-microvolt = <500>; > - regulator-max-microvolt = <500>; > - enable-active-high; > - startup-delay-us = <2>; > - pinctrl-names = "default"; > - pinctrl-0 = <_usbh1_hubreset>; > - gpio = < 12 GPIO_ACTIVE_HIGH>; > - }; > - > reg_usb_otg_vbus: regulator-usbotgvbus { > compatible = "regulator-fixed"; > regulator-name = "usb_otg_vbus"; > @@ -204,12 +192,18 @@ > }; > > { > - vbus-supply = <_usbh1_vbus>; > pinctrl-names = "default"; > pinctrl-0 = <_usbh1>; > dr_mode = "host"; > disable-over-current; > status = "okay"; > + > + usb2415host: hub@1 { > + compatible = "usb424,2513"; > + reg = <1>; > + reset-gpios = < 12 GPIO_ACTIVE_LOW>; > + reset-duration-us = <3000>; > + }; > }; > > { > @@ -467,11 +461,6 @@ > MX6QDL_PAD_GPIO_3__USB_H1_OC 0x1b0b0 > /* usbh1_b OC */ This comment should be for above pin, do you mind I change it in this patch? > MX6QDL_PAD_GPIO_0__GPIO1_IO00 0x1b0b0 > - >; > - }; > - > - pinctrl_usbh1_hubreset: usbh1hubresetgrp { > - fsl,pins = < > MX6QDL_PAD_GPIO_17__GPIO7_IO12 0x1b0b0 > >; > }; This changes remind me that I need to change the same thing for udoo board. -- Best Regards, Peter Chen
Re: [LKP] [lkp] [xfs] 68a9f5e700: aim7.jobs-per-min -13.6% regression
On Thu, Aug 11, 2016 at 7:52 PM, Christoph Hellwigwrote: > > I can look at that, but indeed optimizing this patch seems a bit > stupid. The "write less than a full block to the end of the file" is actually a reasonably common case. It may not make for a great filesystem benchmark, but it also isn't actually insane. People who do logging in user space do this all the time, for example. And it is *not* stupid in that context. Not at all. It's never going to be the *main* thing you do (unless you're AIM), but I do think it's worth fixing. And AIM7 remains one of those odd benchmarks that people use. I'm not quite sure why, but I really do think that the normal "append smaller chunks to the end of the file" should absolutely not be dismissed as stupid. Linus
Re: [LKP] [lkp] [xfs] 68a9f5e700: aim7.jobs-per-min -13.6% regression
On Thu, Aug 11, 2016 at 7:52 PM, Christoph Hellwig wrote: > > I can look at that, but indeed optimizing this patch seems a bit > stupid. The "write less than a full block to the end of the file" is actually a reasonably common case. It may not make for a great filesystem benchmark, but it also isn't actually insane. People who do logging in user space do this all the time, for example. And it is *not* stupid in that context. Not at all. It's never going to be the *main* thing you do (unless you're AIM), but I do think it's worth fixing. And AIM7 remains one of those odd benchmarks that people use. I'm not quite sure why, but I really do think that the normal "append smaller chunks to the end of the file" should absolutely not be dismissed as stupid. Linus
Re: [PATCHES] ARM: dts: update udoo and evi hubs to use pwrseq
On Thu, Aug 11, 2016 at 09:40:29AM -0700, Joshua Clayton wrote: > > Hi Peter, > > Consider this my attempt to squeeze my own dts into your patch series. > These patches repalce patch 6 in V5. of the usb pwrseq series > I adding a #address and #size to imx6qdl.dtsi in patch 1 > and remove those lines from imx6qdl-udoo.dtsi in patch 2 (formerly patch 6) > Finally I make the change in imx6q-evi.dts and hope is can be slipped into > the series with the others. > Thanks, I will send them at next revision -- Best Regards, Peter Chen
Re: [PATCHES] ARM: dts: update udoo and evi hubs to use pwrseq
On Thu, Aug 11, 2016 at 09:40:29AM -0700, Joshua Clayton wrote: > > Hi Peter, > > Consider this my attempt to squeeze my own dts into your patch series. > These patches repalce patch 6 in V5. of the usb pwrseq series > I adding a #address and #size to imx6qdl.dtsi in patch 1 > and remove those lines from imx6qdl-udoo.dtsi in patch 2 (formerly patch 6) > Finally I make the change in imx6q-evi.dts and hope is can be slipped into > the series with the others. > Thanks, I will send them at next revision -- Best Regards, Peter Chen
Re: [RFC PATCH v7 7/7] Restartable sequences: self-tests
- On Aug 11, 2016, at 9:28 PM, Boqun Feng boqun.f...@gmail.com wrote: > On Thu, Aug 11, 2016 at 11:26:30PM +, Mathieu Desnoyers wrote: >> - On Jul 24, 2016, at 2:01 PM, Dave Watson davejwat...@fb.com wrote: >> >> >>> +static inline __attribute__((always_inline)) >> >>> +bool rseq_finish(struct rseq_lock *rlock, >> >>> + intptr_t *p, intptr_t to_write, >> >>> + struct rseq_state start_value) >> > >> >>> This ABI looks like it will work fine for our use case. I don't think it >> >>> has been mentioned yet, but we may still need multiple asm blocks >> >>> for differing numbers of writes. For example, an array-based freelist >> >>> push: >> > >> >>> void push(void *obj) { >> >>> if (index < maxlen) { >> >>> freelist[index++] = obj; >> >>> } >> >>> } >> > >> >>> would be more efficiently implemented with a two-write rseq_finish: >> > >> >>> rseq_finish2([index], obj, // first write >> >>> , index + 1, // second write >> >>> ...); >> > >> >> Would pairing one rseq_start with two rseq_finish do the trick >> >> there ? >> > >> > Yes, two rseq_finish works, as long as the extra rseq management overhead >> > is not substantial. >> >> I've added a commit implementing rseq_finish2() in my rseq volatile >> dev branch. You can fetch it at: >> >> https://github.com/compudj/linux-percpu-dev/tree/rseq-fallback >> >> I also have a separate test and benchmark tree in addition to the >> kernel selftests here: >> >> https://github.com/compudj/rseq-test >> >> I named the first write a "speculative" write, and the second write >> the "final" write. >> > > Maybe I miss something subtle, but if the first write is only a > "speculative" write, why can't we put it in the rseq critical section > rather than asm block? Like this: > > do_rseq(..., result, targetptr, newval > { > newval = index; > targetptr = > if (newval < maxlen) > freelist[newval++] = obj; > else > result = false; > } > > No extra rseq_finish() is needed here, but maybe a little more > "speculative" writes? This won't work unfortunately. The speculative stores need to be between the rseq_event_counter comparison instruction in the rseq_finish asm sequence and the final store. The ip fixup is really needed for correctness of speculative stores. The sequence number scheme only works for loads. Putting it in the C code between rseq_start and rseq_finish would lead to races such as: thread Athread B rseq_start rseq_start freelist[offset + 1] = obj rseq_finish offset++ freelist[newval + 1] = obj <--- corrupts the list content. > Besides, do we allow userspace programs do read-only access to the > memory objects modified by do_rseq(). If so, we have a problem when > there are two writes in a do_rseq()(either in the rseq critical section > or in the asm block), because in current implemetation, these two writes > are unordered, which makes the readers outside a do_rseq() could observe > the ordering of writes differently. > > For rseq_finish2(), a simple solution would be making the "final" write > a RELEASE. Indeed, we would need a release semantic for the final store here if this is the common use. Or we could duplicate the "flavors" of rseq_finish2 and add a rseq_finish2_release. We should find a way to eliminate code duplication there. I suspect we'll end up doing macros. Thanks, Mathieu > > Regards, > Boqun > >> Thanks, >> >> Mathieu >> >> -- >> Mathieu Desnoyers >> EfficiOS Inc. > > http://www.efficios.com -- Mathieu Desnoyers EfficiOS Inc. http://www.efficios.com
Re: [V4 PATCH 2/2] mips/panic: Replace smp_send_stop() with kdump friendly version in panic path
On 08/10/16 at 05:09pm, Hidehiro Kawai wrote: > Daniel Walker reported problems which happens when > crash_kexec_post_notifiers kernel option is enabled > (https://lkml.org/lkml/2015/6/24/44). > > In that case, smp_send_stop() is called before entering kdump routines > which assume other CPUs are still online. As the result, kdump > routines fail to save other CPUs' registers. Additionally for MIPS > OCTEON, it misses to stop the watchdog timer. > > To fix this problem, call a new kdump friendly function, > crash_smp_send_stop(), instead of the smp_send_stop() when > crash_kexec_post_notifiers is enabled. crash_smp_send_stop() is a > weak function, and it just call smp_send_stop(). Architecture > codes should override it so that kdump can work appropriately. > This patch provides MIPS version. > > Reported-by: Daniel Walker> Fixes: f06e5153f4ae (kernel/panic.c: add "crash_kexec_post_notifiers" option) > Signed-off-by: Hidehiro Kawai > Cc: Ralf Baechle > Cc: David Daney > Cc: Aaro Koskinen > Cc: "Steven J. Hill" > Cc: Corey Minyard > > --- > I'm not familiar with MIPS, and I don't have a test environment and > just did build tests only. Please don't apply this patch until > someone does enough tests, otherwise simply drop this patch. > --- > arch/mips/cavium-octeon/setup.c | 14 ++ > arch/mips/include/asm/kexec.h|1 + > arch/mips/kernel/crash.c | 18 +- > arch/mips/kernel/machine_kexec.c |1 + > 4 files changed, 33 insertions(+), 1 deletion(-) > > diff --git a/arch/mips/cavium-octeon/setup.c b/arch/mips/cavium-octeon/setup.c > index cb16fcc..5537f95 100644 > --- a/arch/mips/cavium-octeon/setup.c > +++ b/arch/mips/cavium-octeon/setup.c > @@ -267,6 +267,17 @@ static void octeon_crash_shutdown(struct pt_regs *regs) > default_machine_crash_shutdown(regs); > } > > +#ifdef CONFIG_SMP > +void octeon_crash_smp_send_stop(void) > +{ > + int cpu; > + > + /* disable watchdogs */ > + for_each_online_cpu(cpu) > + cvmx_write_csr(CVMX_CIU_WDOGX(cpu_logical_map(cpu)), 0); > +} > +#endif > + > #endif /* CONFIG_KEXEC */ > > #ifdef CONFIG_CAVIUM_RESERVE32 > @@ -911,6 +922,9 @@ void __init prom_init(void) > _machine_kexec_shutdown = octeon_shutdown; > _machine_crash_shutdown = octeon_crash_shutdown; > _machine_kexec_prepare = octeon_kexec_prepare; > +#ifdef CONFIG_SMP > + _crash_smp_send_stop = octeon_crash_smp_send_stop; > +#endif > #endif > > octeon_user_io_init(); > diff --git a/arch/mips/include/asm/kexec.h b/arch/mips/include/asm/kexec.h > index ee25ebb..493a3cc 100644 > --- a/arch/mips/include/asm/kexec.h > +++ b/arch/mips/include/asm/kexec.h > @@ -45,6 +45,7 @@ extern const unsigned char kexec_smp_wait[]; > extern unsigned long secondary_kexec_args[4]; > extern void (*relocated_kexec_smp_wait) (void *); > extern atomic_t kexec_ready_to_reboot; > +extern void (*_crash_smp_send_stop)(void); > #endif > #endif > > diff --git a/arch/mips/kernel/crash.c b/arch/mips/kernel/crash.c > index 610f0f3..1723b17 100644 > --- a/arch/mips/kernel/crash.c > +++ b/arch/mips/kernel/crash.c > @@ -47,9 +47,14 @@ static void crash_shutdown_secondary(void *passed_regs) > > static void crash_kexec_prepare_cpus(void) > { > + static int cpus_stopped; > unsigned int msecs; > + unsigned int ncpus; > > - unsigned int ncpus = num_online_cpus() - 1;/* Excluding the panic cpu */ > + if (cpus_stopped) > + return; > + > + ncpus = num_online_cpus() - 1;/* Excluding the panic cpu */ > > dump_send_ipi(crash_shutdown_secondary); > smp_wmb(); > @@ -64,6 +69,17 @@ static void crash_kexec_prepare_cpus(void) > cpu_relax(); > mdelay(1); > } > + > + cpus_stopped = 1; > +} > + > +/* Override the weak function in kernel/panic.c */ > +void crash_smp_send_stop(void) > +{ > + if (_crash_smp_send_stop) > + _crash_smp_send_stop(); > + > + crash_kexec_prepare_cpus(); > } > > #else /* !defined(CONFIG_SMP) */ > diff --git a/arch/mips/kernel/machine_kexec.c > b/arch/mips/kernel/machine_kexec.c > index 50980bf3..5972520 100644 > --- a/arch/mips/kernel/machine_kexec.c > +++ b/arch/mips/kernel/machine_kexec.c > @@ -25,6 +25,7 @@ void (*_machine_crash_shutdown)(struct pt_regs *regs) = > NULL; > #ifdef CONFIG_SMP > void (*relocated_kexec_smp_wait) (void *); > atomic_t kexec_ready_to_reboot = ATOMIC_INIT(0); > +void (*_crash_smp_send_stop)(void) = NULL; > #endif > > int > > Can any mips people review this patch and have a test? Thanks Dave
Re: [RFC PATCH v7 7/7] Restartable sequences: self-tests
- On Aug 11, 2016, at 9:28 PM, Boqun Feng boqun.f...@gmail.com wrote: > On Thu, Aug 11, 2016 at 11:26:30PM +, Mathieu Desnoyers wrote: >> - On Jul 24, 2016, at 2:01 PM, Dave Watson davejwat...@fb.com wrote: >> >> >>> +static inline __attribute__((always_inline)) >> >>> +bool rseq_finish(struct rseq_lock *rlock, >> >>> + intptr_t *p, intptr_t to_write, >> >>> + struct rseq_state start_value) >> > >> >>> This ABI looks like it will work fine for our use case. I don't think it >> >>> has been mentioned yet, but we may still need multiple asm blocks >> >>> for differing numbers of writes. For example, an array-based freelist >> >>> push: >> > >> >>> void push(void *obj) { >> >>> if (index < maxlen) { >> >>> freelist[index++] = obj; >> >>> } >> >>> } >> > >> >>> would be more efficiently implemented with a two-write rseq_finish: >> > >> >>> rseq_finish2([index], obj, // first write >> >>> , index + 1, // second write >> >>> ...); >> > >> >> Would pairing one rseq_start with two rseq_finish do the trick >> >> there ? >> > >> > Yes, two rseq_finish works, as long as the extra rseq management overhead >> > is not substantial. >> >> I've added a commit implementing rseq_finish2() in my rseq volatile >> dev branch. You can fetch it at: >> >> https://github.com/compudj/linux-percpu-dev/tree/rseq-fallback >> >> I also have a separate test and benchmark tree in addition to the >> kernel selftests here: >> >> https://github.com/compudj/rseq-test >> >> I named the first write a "speculative" write, and the second write >> the "final" write. >> > > Maybe I miss something subtle, but if the first write is only a > "speculative" write, why can't we put it in the rseq critical section > rather than asm block? Like this: > > do_rseq(..., result, targetptr, newval > { > newval = index; > targetptr = > if (newval < maxlen) > freelist[newval++] = obj; > else > result = false; > } > > No extra rseq_finish() is needed here, but maybe a little more > "speculative" writes? This won't work unfortunately. The speculative stores need to be between the rseq_event_counter comparison instruction in the rseq_finish asm sequence and the final store. The ip fixup is really needed for correctness of speculative stores. The sequence number scheme only works for loads. Putting it in the C code between rseq_start and rseq_finish would lead to races such as: thread Athread B rseq_start rseq_start freelist[offset + 1] = obj rseq_finish offset++ freelist[newval + 1] = obj <--- corrupts the list content. > Besides, do we allow userspace programs do read-only access to the > memory objects modified by do_rseq(). If so, we have a problem when > there are two writes in a do_rseq()(either in the rseq critical section > or in the asm block), because in current implemetation, these two writes > are unordered, which makes the readers outside a do_rseq() could observe > the ordering of writes differently. > > For rseq_finish2(), a simple solution would be making the "final" write > a RELEASE. Indeed, we would need a release semantic for the final store here if this is the common use. Or we could duplicate the "flavors" of rseq_finish2 and add a rseq_finish2_release. We should find a way to eliminate code duplication there. I suspect we'll end up doing macros. Thanks, Mathieu > > Regards, > Boqun > >> Thanks, >> >> Mathieu >> >> -- >> Mathieu Desnoyers >> EfficiOS Inc. > > http://www.efficios.com -- Mathieu Desnoyers EfficiOS Inc. http://www.efficios.com
Re: [V4 PATCH 2/2] mips/panic: Replace smp_send_stop() with kdump friendly version in panic path
On 08/10/16 at 05:09pm, Hidehiro Kawai wrote: > Daniel Walker reported problems which happens when > crash_kexec_post_notifiers kernel option is enabled > (https://lkml.org/lkml/2015/6/24/44). > > In that case, smp_send_stop() is called before entering kdump routines > which assume other CPUs are still online. As the result, kdump > routines fail to save other CPUs' registers. Additionally for MIPS > OCTEON, it misses to stop the watchdog timer. > > To fix this problem, call a new kdump friendly function, > crash_smp_send_stop(), instead of the smp_send_stop() when > crash_kexec_post_notifiers is enabled. crash_smp_send_stop() is a > weak function, and it just call smp_send_stop(). Architecture > codes should override it so that kdump can work appropriately. > This patch provides MIPS version. > > Reported-by: Daniel Walker > Fixes: f06e5153f4ae (kernel/panic.c: add "crash_kexec_post_notifiers" option) > Signed-off-by: Hidehiro Kawai > Cc: Ralf Baechle > Cc: David Daney > Cc: Aaro Koskinen > Cc: "Steven J. Hill" > Cc: Corey Minyard > > --- > I'm not familiar with MIPS, and I don't have a test environment and > just did build tests only. Please don't apply this patch until > someone does enough tests, otherwise simply drop this patch. > --- > arch/mips/cavium-octeon/setup.c | 14 ++ > arch/mips/include/asm/kexec.h|1 + > arch/mips/kernel/crash.c | 18 +- > arch/mips/kernel/machine_kexec.c |1 + > 4 files changed, 33 insertions(+), 1 deletion(-) > > diff --git a/arch/mips/cavium-octeon/setup.c b/arch/mips/cavium-octeon/setup.c > index cb16fcc..5537f95 100644 > --- a/arch/mips/cavium-octeon/setup.c > +++ b/arch/mips/cavium-octeon/setup.c > @@ -267,6 +267,17 @@ static void octeon_crash_shutdown(struct pt_regs *regs) > default_machine_crash_shutdown(regs); > } > > +#ifdef CONFIG_SMP > +void octeon_crash_smp_send_stop(void) > +{ > + int cpu; > + > + /* disable watchdogs */ > + for_each_online_cpu(cpu) > + cvmx_write_csr(CVMX_CIU_WDOGX(cpu_logical_map(cpu)), 0); > +} > +#endif > + > #endif /* CONFIG_KEXEC */ > > #ifdef CONFIG_CAVIUM_RESERVE32 > @@ -911,6 +922,9 @@ void __init prom_init(void) > _machine_kexec_shutdown = octeon_shutdown; > _machine_crash_shutdown = octeon_crash_shutdown; > _machine_kexec_prepare = octeon_kexec_prepare; > +#ifdef CONFIG_SMP > + _crash_smp_send_stop = octeon_crash_smp_send_stop; > +#endif > #endif > > octeon_user_io_init(); > diff --git a/arch/mips/include/asm/kexec.h b/arch/mips/include/asm/kexec.h > index ee25ebb..493a3cc 100644 > --- a/arch/mips/include/asm/kexec.h > +++ b/arch/mips/include/asm/kexec.h > @@ -45,6 +45,7 @@ extern const unsigned char kexec_smp_wait[]; > extern unsigned long secondary_kexec_args[4]; > extern void (*relocated_kexec_smp_wait) (void *); > extern atomic_t kexec_ready_to_reboot; > +extern void (*_crash_smp_send_stop)(void); > #endif > #endif > > diff --git a/arch/mips/kernel/crash.c b/arch/mips/kernel/crash.c > index 610f0f3..1723b17 100644 > --- a/arch/mips/kernel/crash.c > +++ b/arch/mips/kernel/crash.c > @@ -47,9 +47,14 @@ static void crash_shutdown_secondary(void *passed_regs) > > static void crash_kexec_prepare_cpus(void) > { > + static int cpus_stopped; > unsigned int msecs; > + unsigned int ncpus; > > - unsigned int ncpus = num_online_cpus() - 1;/* Excluding the panic cpu */ > + if (cpus_stopped) > + return; > + > + ncpus = num_online_cpus() - 1;/* Excluding the panic cpu */ > > dump_send_ipi(crash_shutdown_secondary); > smp_wmb(); > @@ -64,6 +69,17 @@ static void crash_kexec_prepare_cpus(void) > cpu_relax(); > mdelay(1); > } > + > + cpus_stopped = 1; > +} > + > +/* Override the weak function in kernel/panic.c */ > +void crash_smp_send_stop(void) > +{ > + if (_crash_smp_send_stop) > + _crash_smp_send_stop(); > + > + crash_kexec_prepare_cpus(); > } > > #else /* !defined(CONFIG_SMP) */ > diff --git a/arch/mips/kernel/machine_kexec.c > b/arch/mips/kernel/machine_kexec.c > index 50980bf3..5972520 100644 > --- a/arch/mips/kernel/machine_kexec.c > +++ b/arch/mips/kernel/machine_kexec.c > @@ -25,6 +25,7 @@ void (*_machine_crash_shutdown)(struct pt_regs *regs) = > NULL; > #ifdef CONFIG_SMP > void (*relocated_kexec_smp_wait) (void *); > atomic_t kexec_ready_to_reboot = ATOMIC_INIT(0); > +void (*_crash_smp_send_stop)(void) = NULL; > #endif > > int > > Can any mips people review this patch and have a test? Thanks Dave
Re: [V4 PATCH 1/2] x86/panic: Replace smp_send_stop() with kdump friendly version in panic path
Hi Hidehiro Thanks for the update. On 08/10/16 at 05:09pm, Hidehiro Kawai wrote: > Daniel Walker reported problems which happens when > crash_kexec_post_notifiers kernel option is enabled > (https://lkml.org/lkml/2015/6/24/44). > > In that case, smp_send_stop() is called before entering kdump routines > which assume other CPUs are still online. As the result, for x86, > kdump routines fail to save other CPUs' registers and disable > virtualization extensions. Seems you simplified the changelog, but I think a little more details will be helpful to understand the patch. You know sometimes lkml.org does not work well. > > To fix this problem, call a new kdump friendly function, > crash_smp_send_stop(), instead of the smp_send_stop() when > crash_kexec_post_notifiers is enabled. crash_smp_send_stop() is a > weak function, and it just call smp_send_stop(). Architecture > codes should override it so that kdump can work appropriately. > This patch only provides x86-specific version. > > For Xen's PV kernel, just keep the current behavior. Could you explain a bit about above Xen PV kernel behavior? BTW, this version looks better, I think I'm fine with this version besides of the questions about changelog. > > Changes in V4: > - Keep to use smp_send_stop if crash_kexec_post_notifiers is not set > - Rename panic_smp_send_stop to crash_smp_send_stop > - Don't change the behavior for Xen's PV kernel > > Changes in V3: > - Revise comments, description, and symbol names > > Changes in V2: > - Replace smp_send_stop() call with crash_kexec version which > saves cpu states and cleans up VMX/SVM > - Drop a fix for Problem 1 at this moment > > Reported-by: Daniel Walker> Fixes: f06e5153f4ae (kernel/panic.c: add "crash_kexec_post_notifiers" option) > Signed-off-by: Hidehiro Kawai > Cc: Dave Young > Cc: Baoquan He > Cc: Vivek Goyal > Cc: Eric Biederman > Cc: Masami Hiramatsu > Cc: Daniel Walker > Cc: Xunlei Pang > Cc: Thomas Gleixner > Cc: Ingo Molnar > Cc: "H. Peter Anvin" > Cc: Borislav Petkov > Cc: David Vrabel > Cc: Toshi Kani > Cc: Andrew Morton > --- > arch/x86/include/asm/kexec.h |1 + > arch/x86/include/asm/smp.h |1 + > arch/x86/kernel/crash.c | 22 +--- > arch/x86/kernel/smp.c|5 > kernel/panic.c | 47 > -- > 5 files changed, 66 insertions(+), 10 deletions(-) > > diff --git a/arch/x86/include/asm/kexec.h b/arch/x86/include/asm/kexec.h > index d2434c1..282630e 100644 > --- a/arch/x86/include/asm/kexec.h > +++ b/arch/x86/include/asm/kexec.h > @@ -210,6 +210,7 @@ struct kexec_entry64_regs { > > typedef void crash_vmclear_fn(void); > extern crash_vmclear_fn __rcu *crash_vmclear_loaded_vmcss; > +extern void kdump_nmi_shootdown_cpus(void); > > #endif /* __ASSEMBLY__ */ > > diff --git a/arch/x86/include/asm/smp.h b/arch/x86/include/asm/smp.h > index ebd0c16..f70989c 100644 > --- a/arch/x86/include/asm/smp.h > +++ b/arch/x86/include/asm/smp.h > @@ -50,6 +50,7 @@ struct smp_ops { > void (*smp_cpus_done)(unsigned max_cpus); > > void (*stop_other_cpus)(int wait); > + void (*crash_stop_other_cpus)(void); > void (*smp_send_reschedule)(int cpu); > > int (*cpu_up)(unsigned cpu, struct task_struct *tidle); > diff --git a/arch/x86/kernel/crash.c b/arch/x86/kernel/crash.c > index 9616cf7..650830e 100644 > --- a/arch/x86/kernel/crash.c > +++ b/arch/x86/kernel/crash.c > @@ -133,15 +133,31 @@ static void kdump_nmi_callback(int cpu, struct pt_regs > *regs) > disable_local_APIC(); > } > > -static void kdump_nmi_shootdown_cpus(void) > +void kdump_nmi_shootdown_cpus(void) > { > nmi_shootdown_cpus(kdump_nmi_callback); > > disable_local_APIC(); > } > > +/* Override the weak function in kernel/panic.c */ > +void crash_smp_send_stop(void) > +{ > + static int cpus_stopped; > + > + if (cpus_stopped) > + return; > + > + if (smp_ops.crash_stop_other_cpus) > + smp_ops.crash_stop_other_cpus(); > + else > + smp_send_stop(); > + > + cpus_stopped = 1; > +} > + > #else > -static void kdump_nmi_shootdown_cpus(void) > +void crash_smp_send_stop(void) > { > /* There are no cpus to shootdown */ > } > @@ -160,7 +176,7 @@ void native_machine_crash_shutdown(struct pt_regs *regs) > /* The kernel is broken so disable interrupts */ > local_irq_disable(); > > - kdump_nmi_shootdown_cpus(); > + crash_smp_send_stop(); > > /* >* VMCLEAR VMCSs loaded on this cpu if needed. > diff --git a/arch/x86/kernel/smp.c b/arch/x86/kernel/smp.c >
Re: [V4 PATCH 1/2] x86/panic: Replace smp_send_stop() with kdump friendly version in panic path
Hi Hidehiro Thanks for the update. On 08/10/16 at 05:09pm, Hidehiro Kawai wrote: > Daniel Walker reported problems which happens when > crash_kexec_post_notifiers kernel option is enabled > (https://lkml.org/lkml/2015/6/24/44). > > In that case, smp_send_stop() is called before entering kdump routines > which assume other CPUs are still online. As the result, for x86, > kdump routines fail to save other CPUs' registers and disable > virtualization extensions. Seems you simplified the changelog, but I think a little more details will be helpful to understand the patch. You know sometimes lkml.org does not work well. > > To fix this problem, call a new kdump friendly function, > crash_smp_send_stop(), instead of the smp_send_stop() when > crash_kexec_post_notifiers is enabled. crash_smp_send_stop() is a > weak function, and it just call smp_send_stop(). Architecture > codes should override it so that kdump can work appropriately. > This patch only provides x86-specific version. > > For Xen's PV kernel, just keep the current behavior. Could you explain a bit about above Xen PV kernel behavior? BTW, this version looks better, I think I'm fine with this version besides of the questions about changelog. > > Changes in V4: > - Keep to use smp_send_stop if crash_kexec_post_notifiers is not set > - Rename panic_smp_send_stop to crash_smp_send_stop > - Don't change the behavior for Xen's PV kernel > > Changes in V3: > - Revise comments, description, and symbol names > > Changes in V2: > - Replace smp_send_stop() call with crash_kexec version which > saves cpu states and cleans up VMX/SVM > - Drop a fix for Problem 1 at this moment > > Reported-by: Daniel Walker > Fixes: f06e5153f4ae (kernel/panic.c: add "crash_kexec_post_notifiers" option) > Signed-off-by: Hidehiro Kawai > Cc: Dave Young > Cc: Baoquan He > Cc: Vivek Goyal > Cc: Eric Biederman > Cc: Masami Hiramatsu > Cc: Daniel Walker > Cc: Xunlei Pang > Cc: Thomas Gleixner > Cc: Ingo Molnar > Cc: "H. Peter Anvin" > Cc: Borislav Petkov > Cc: David Vrabel > Cc: Toshi Kani > Cc: Andrew Morton > --- > arch/x86/include/asm/kexec.h |1 + > arch/x86/include/asm/smp.h |1 + > arch/x86/kernel/crash.c | 22 +--- > arch/x86/kernel/smp.c|5 > kernel/panic.c | 47 > -- > 5 files changed, 66 insertions(+), 10 deletions(-) > > diff --git a/arch/x86/include/asm/kexec.h b/arch/x86/include/asm/kexec.h > index d2434c1..282630e 100644 > --- a/arch/x86/include/asm/kexec.h > +++ b/arch/x86/include/asm/kexec.h > @@ -210,6 +210,7 @@ struct kexec_entry64_regs { > > typedef void crash_vmclear_fn(void); > extern crash_vmclear_fn __rcu *crash_vmclear_loaded_vmcss; > +extern void kdump_nmi_shootdown_cpus(void); > > #endif /* __ASSEMBLY__ */ > > diff --git a/arch/x86/include/asm/smp.h b/arch/x86/include/asm/smp.h > index ebd0c16..f70989c 100644 > --- a/arch/x86/include/asm/smp.h > +++ b/arch/x86/include/asm/smp.h > @@ -50,6 +50,7 @@ struct smp_ops { > void (*smp_cpus_done)(unsigned max_cpus); > > void (*stop_other_cpus)(int wait); > + void (*crash_stop_other_cpus)(void); > void (*smp_send_reschedule)(int cpu); > > int (*cpu_up)(unsigned cpu, struct task_struct *tidle); > diff --git a/arch/x86/kernel/crash.c b/arch/x86/kernel/crash.c > index 9616cf7..650830e 100644 > --- a/arch/x86/kernel/crash.c > +++ b/arch/x86/kernel/crash.c > @@ -133,15 +133,31 @@ static void kdump_nmi_callback(int cpu, struct pt_regs > *regs) > disable_local_APIC(); > } > > -static void kdump_nmi_shootdown_cpus(void) > +void kdump_nmi_shootdown_cpus(void) > { > nmi_shootdown_cpus(kdump_nmi_callback); > > disable_local_APIC(); > } > > +/* Override the weak function in kernel/panic.c */ > +void crash_smp_send_stop(void) > +{ > + static int cpus_stopped; > + > + if (cpus_stopped) > + return; > + > + if (smp_ops.crash_stop_other_cpus) > + smp_ops.crash_stop_other_cpus(); > + else > + smp_send_stop(); > + > + cpus_stopped = 1; > +} > + > #else > -static void kdump_nmi_shootdown_cpus(void) > +void crash_smp_send_stop(void) > { > /* There are no cpus to shootdown */ > } > @@ -160,7 +176,7 @@ void native_machine_crash_shutdown(struct pt_regs *regs) > /* The kernel is broken so disable interrupts */ > local_irq_disable(); > > - kdump_nmi_shootdown_cpus(); > + crash_smp_send_stop(); > > /* >* VMCLEAR VMCSs loaded on this cpu if needed. > diff --git a/arch/x86/kernel/smp.c b/arch/x86/kernel/smp.c > index 658777c..68f8cc2 100644 > --- a/arch/x86/kernel/smp.c > +++ b/arch/x86/kernel/smp.c > @@ -32,6 +32,8 @@ > #include > #include > #include > +#include > + > /* > * Some notes on x86 processor bugs affecting SMP operation: > * > @@ -342,6 +344,9 @@ struct smp_ops smp_ops = { > .smp_cpus_done =
Re: [RFC PATCH v7 7/7] Restartable sequences: self-tests
- On Aug 11, 2016, at 11:10 PM, Mathieu Desnoyers mathieu.desnoy...@efficios.com wrote: > - On Aug 11, 2016, at 9:28 PM, Boqun Feng boqun.f...@gmail.com wrote: > >> On Thu, Aug 11, 2016 at 11:26:30PM +, Mathieu Desnoyers wrote: >>> - On Jul 24, 2016, at 2:01 PM, Dave Watson davejwat...@fb.com wrote: >>> >>> >>> +static inline __attribute__((always_inline)) >>> >>> +bool rseq_finish(struct rseq_lock *rlock, >>> >>> + intptr_t *p, intptr_t to_write, >>> >>> + struct rseq_state start_value) >>> > >>> >>> This ABI looks like it will work fine for our use case. I don't think it >>> >>> has been mentioned yet, but we may still need multiple asm blocks >>> >>> for differing numbers of writes. For example, an array-based freelist >>> >>> push: >>> > >>> >>> void push(void *obj) { >>> >>> if (index < maxlen) { >>> >>> freelist[index++] = obj; >>> >>> } >>> >>> } >>> > >>> >>> would be more efficiently implemented with a two-write rseq_finish: >>> > >>> >>> rseq_finish2([index], obj, // first write >>> >>> , index + 1, // second write >>> >>> ...); >>> > >>> >> Would pairing one rseq_start with two rseq_finish do the trick >>> >> there ? >>> > >>> > Yes, two rseq_finish works, as long as the extra rseq management overhead >>> > is not substantial. >>> >>> I've added a commit implementing rseq_finish2() in my rseq volatile >>> dev branch. You can fetch it at: >>> >>> https://github.com/compudj/linux-percpu-dev/tree/rseq-fallback >>> >>> I also have a separate test and benchmark tree in addition to the >>> kernel selftests here: >>> >>> https://github.com/compudj/rseq-test >>> >>> I named the first write a "speculative" write, and the second write >>> the "final" write. >>> >> >> Maybe I miss something subtle, but if the first write is only a >> "speculative" write, why can't we put it in the rseq critical section >> rather than asm block? Like this: >> >> do_rseq(..., result, targetptr, newval >> { >> newval = index; >> targetptr = >> if (newval < maxlen) >> freelist[newval++] = obj; >> else >> result = false; >> } >> >> No extra rseq_finish() is needed here, but maybe a little more >> "speculative" writes? > > This won't work unfortunately. The speculative stores need to be > between the rseq_event_counter comparison instruction in the rseq_finish > asm sequence and the final store. The ip fixup is really needed for > correctness of speculative stores. The sequence number scheme only works > for loads. > > Putting it in the C code between rseq_start and rseq_finish would lead > to races such as: > > thread Athread B > rseq_start > > >rseq_start >freelist[offset + 1] = obj >rseq_finish > offset++ > > > freelist[newval + 1] = obj <--- corrupts the list content. > Small clarification to the scenario: thread Athread B rseq_start load offset into (register 1) rseq_start freelist[offset + 1] = obj rseq_finish offset++ freelist[(register 1) + 1] = obj <--- corrupts the list content. Thanks, Mathieu > > >> Besides, do we allow userspace programs do read-only access to the >> memory objects modified by do_rseq(). If so, we have a problem when >> there are two writes in a do_rseq()(either in the rseq critical section >> or in the asm block), because in current implemetation, these two writes >> are unordered, which makes the readers outside a do_rseq() could observe >> the ordering of writes differently. >> >> For rseq_finish2(), a simple solution would be making the "final" write >> a RELEASE. > > Indeed, we would need a release semantic for the final store here if this > is the common use. Or we could duplicate the "flavors" of rseq_finish2 and > add a rseq_finish2_release. We should find a way to eliminate code duplication > there. I suspect we'll end up doing macros. > > Thanks, > > Mathieu > >> >> Regards, >> Boqun >> >>> Thanks, >>> >>> Mathieu >>> >>> -- >>> Mathieu Desnoyers >>> EfficiOS Inc. >> > http://www.efficios.com > > -- > Mathieu Desnoyers > EfficiOS Inc. > http://www.efficios.com -- Mathieu Desnoyers EfficiOS Inc. http://www.efficios.com
[v10.2 PATCH 5/5] drm/rockchip: cdn-dp: add cdn DP support for rk3399
Add support for cdn DP controller which is embedded in the rk3399 SoCs. The DP is compliant with DisplayPort Specification, Version 1.3, This IP is compatible with the rockchip type-c PHY IP. There is a uCPU in DP controller, it need a firmware to work, please put the firmware file to /lib/firmware/rockchip/dptx.bin. The uCPU in charge of aux communication and link training, the host use mailbox to communicate with the ucpu. The dclk pin_pol of vop must not be invert for DP. Signed-off-by: Chris ZhongReviewed-by: Sean Paul Acked-by: Mark Yao --- Changes in v10.2: - remove best_encoder ops Changes in v10.1: - support read sink count from DPCD Changes in v10: - control the grf_clk in DP Changes in v9: - do not need reset the phy before power_on - add a orientation information for set_capability - retry to read dpcd in 10 seconds Changes in v8: - optimization the err log Changes in v7: - support firmware standby when no dptx connection - optimization the calculation of tu size and valid symbol Changes in v6: - add a port struct - select SND_SOC_HDMI_CODEC - force reset the phy when hpd detected Changes in v5: - alphabetical order - do not use long, use u32 or u64 - return MODE_CLOCK_HIGH when requested > actual - Optimized Coding Style - add a formula to get better tu size and symbol value. - modify according to Sean Paul's comments - fixed the fw_wait always 0 Changes in v4: - use phy framework to control DP phy - support 2 phys Changes in v3: - use EXTCON_DISP_DP and EXTCON_DISP_DP_ALT cable to get dp port state. - reset spdif before config it - modify the firmware clk to 100Mhz - retry load firmware if fw file is requested too early Changes in v2: - Alphabetic order - remove excess error message - use define clk_rate - check all return value - remove dev_set_name(dp->dev, "cdn-dp"); - use schedule_delayed_work - remove never-called functions - remove some unnecessary () Changes in v1: - use extcon API - use hdmi-codec for the DP Asoc - do not initialize the "ret" - printk a err log when drm_of_encoder_active_endpoint_id - modify the dclk pin_pol to a single line drivers/gpu/drm/rockchip/Kconfig| 10 + drivers/gpu/drm/rockchip/Makefile | 1 + drivers/gpu/drm/rockchip/cdn-dp-core.c | 927 +++ drivers/gpu/drm/rockchip/cdn-dp-core.h | 104 +++ drivers/gpu/drm/rockchip/cdn-dp-reg.c | 959 drivers/gpu/drm/rockchip/cdn-dp-reg.h | 482 ++ drivers/gpu/drm/rockchip/rockchip_drm_vop.c | 13 +- drivers/gpu/drm/rockchip/rockchip_drm_vop.h | 9 + drivers/gpu/drm/rockchip/rockchip_vop_reg.c | 2 + 9 files changed, 2504 insertions(+), 3 deletions(-) create mode 100644 drivers/gpu/drm/rockchip/cdn-dp-core.c create mode 100644 drivers/gpu/drm/rockchip/cdn-dp-core.h create mode 100644 drivers/gpu/drm/rockchip/cdn-dp-reg.c create mode 100644 drivers/gpu/drm/rockchip/cdn-dp-reg.h diff --git a/drivers/gpu/drm/rockchip/Kconfig b/drivers/gpu/drm/rockchip/Kconfig index d30bdc3..20aaafe 100644 --- a/drivers/gpu/drm/rockchip/Kconfig +++ b/drivers/gpu/drm/rockchip/Kconfig @@ -25,6 +25,16 @@ config ROCKCHIP_ANALOGIX_DP for the Analogix Core DP driver. If you want to enable DP on RK3288 based SoC, you should selet this option. +config ROCKCHIP_CDN_DP +tristate "Rockchip cdn DP" +depends on DRM_ROCKCHIP + select SND_SOC_HDMI_CODEC if SND_SOC +help + This selects support for Rockchip SoC specific extensions + for the cdn DP driver. If you want to enable Dp on + RK3399 based SoC, you should select this + option. + config ROCKCHIP_DW_HDMI tristate "Rockchip specific extensions for Synopsys DW HDMI" depends on DRM_ROCKCHIP diff --git a/drivers/gpu/drm/rockchip/Makefile b/drivers/gpu/drm/rockchip/Makefile index 05d0713..abdecd5 100644 --- a/drivers/gpu/drm/rockchip/Makefile +++ b/drivers/gpu/drm/rockchip/Makefile @@ -7,6 +7,7 @@ rockchipdrm-y := rockchip_drm_drv.o rockchip_drm_fb.o \ rockchipdrm-$(CONFIG_DRM_FBDEV_EMULATION) += rockchip_drm_fbdev.o obj-$(CONFIG_ROCKCHIP_ANALOGIX_DP) += analogix_dp-rockchip.o +obj-$(CONFIG_ROCKCHIP_CDN_DP) += cdn-dp-core.o cdn-dp-reg.o obj-$(CONFIG_ROCKCHIP_DW_HDMI) += dw_hdmi-rockchip.o obj-$(CONFIG_ROCKCHIP_DW_MIPI_DSI) += dw-mipi-dsi.o obj-$(CONFIG_ROCKCHIP_INNO_HDMI) += inno_hdmi.o diff --git a/drivers/gpu/drm/rockchip/cdn-dp-core.c b/drivers/gpu/drm/rockchip/cdn-dp-core.c new file mode 100644 index 000..2f068b0 --- /dev/null +++ b/drivers/gpu/drm/rockchip/cdn-dp-core.c @@ -0,0 +1,927 @@ +/* + * Copyright (C) Fuzhou Rockchip Electronics Co.Ltd + * Author: Chris Zhong + * + * This software is licensed under the terms of the GNU General Public + * License version 2, as published by the Free Software Foundation, and + * may be copied, distributed, and modified
Re: [RFC PATCH v7 7/7] Restartable sequences: self-tests
- On Aug 11, 2016, at 11:10 PM, Mathieu Desnoyers mathieu.desnoy...@efficios.com wrote: > - On Aug 11, 2016, at 9:28 PM, Boqun Feng boqun.f...@gmail.com wrote: > >> On Thu, Aug 11, 2016 at 11:26:30PM +, Mathieu Desnoyers wrote: >>> - On Jul 24, 2016, at 2:01 PM, Dave Watson davejwat...@fb.com wrote: >>> >>> >>> +static inline __attribute__((always_inline)) >>> >>> +bool rseq_finish(struct rseq_lock *rlock, >>> >>> + intptr_t *p, intptr_t to_write, >>> >>> + struct rseq_state start_value) >>> > >>> >>> This ABI looks like it will work fine for our use case. I don't think it >>> >>> has been mentioned yet, but we may still need multiple asm blocks >>> >>> for differing numbers of writes. For example, an array-based freelist >>> >>> push: >>> > >>> >>> void push(void *obj) { >>> >>> if (index < maxlen) { >>> >>> freelist[index++] = obj; >>> >>> } >>> >>> } >>> > >>> >>> would be more efficiently implemented with a two-write rseq_finish: >>> > >>> >>> rseq_finish2([index], obj, // first write >>> >>> , index + 1, // second write >>> >>> ...); >>> > >>> >> Would pairing one rseq_start with two rseq_finish do the trick >>> >> there ? >>> > >>> > Yes, two rseq_finish works, as long as the extra rseq management overhead >>> > is not substantial. >>> >>> I've added a commit implementing rseq_finish2() in my rseq volatile >>> dev branch. You can fetch it at: >>> >>> https://github.com/compudj/linux-percpu-dev/tree/rseq-fallback >>> >>> I also have a separate test and benchmark tree in addition to the >>> kernel selftests here: >>> >>> https://github.com/compudj/rseq-test >>> >>> I named the first write a "speculative" write, and the second write >>> the "final" write. >>> >> >> Maybe I miss something subtle, but if the first write is only a >> "speculative" write, why can't we put it in the rseq critical section >> rather than asm block? Like this: >> >> do_rseq(..., result, targetptr, newval >> { >> newval = index; >> targetptr = >> if (newval < maxlen) >> freelist[newval++] = obj; >> else >> result = false; >> } >> >> No extra rseq_finish() is needed here, but maybe a little more >> "speculative" writes? > > This won't work unfortunately. The speculative stores need to be > between the rseq_event_counter comparison instruction in the rseq_finish > asm sequence and the final store. The ip fixup is really needed for > correctness of speculative stores. The sequence number scheme only works > for loads. > > Putting it in the C code between rseq_start and rseq_finish would lead > to races such as: > > thread Athread B > rseq_start > > >rseq_start >freelist[offset + 1] = obj >rseq_finish > offset++ > > > freelist[newval + 1] = obj <--- corrupts the list content. > Small clarification to the scenario: thread Athread B rseq_start load offset into (register 1) rseq_start freelist[offset + 1] = obj rseq_finish offset++ freelist[(register 1) + 1] = obj <--- corrupts the list content. Thanks, Mathieu > > >> Besides, do we allow userspace programs do read-only access to the >> memory objects modified by do_rseq(). If so, we have a problem when >> there are two writes in a do_rseq()(either in the rseq critical section >> or in the asm block), because in current implemetation, these two writes >> are unordered, which makes the readers outside a do_rseq() could observe >> the ordering of writes differently. >> >> For rseq_finish2(), a simple solution would be making the "final" write >> a RELEASE. > > Indeed, we would need a release semantic for the final store here if this > is the common use. Or we could duplicate the "flavors" of rseq_finish2 and > add a rseq_finish2_release. We should find a way to eliminate code duplication > there. I suspect we'll end up doing macros. > > Thanks, > > Mathieu > >> >> Regards, >> Boqun >> >>> Thanks, >>> >>> Mathieu >>> >>> -- >>> Mathieu Desnoyers >>> EfficiOS Inc. >> > http://www.efficios.com > > -- > Mathieu Desnoyers > EfficiOS Inc. > http://www.efficios.com -- Mathieu Desnoyers EfficiOS Inc. http://www.efficios.com
[v10.2 PATCH 5/5] drm/rockchip: cdn-dp: add cdn DP support for rk3399
Add support for cdn DP controller which is embedded in the rk3399 SoCs. The DP is compliant with DisplayPort Specification, Version 1.3, This IP is compatible with the rockchip type-c PHY IP. There is a uCPU in DP controller, it need a firmware to work, please put the firmware file to /lib/firmware/rockchip/dptx.bin. The uCPU in charge of aux communication and link training, the host use mailbox to communicate with the ucpu. The dclk pin_pol of vop must not be invert for DP. Signed-off-by: Chris Zhong Reviewed-by: Sean Paul Acked-by: Mark Yao --- Changes in v10.2: - remove best_encoder ops Changes in v10.1: - support read sink count from DPCD Changes in v10: - control the grf_clk in DP Changes in v9: - do not need reset the phy before power_on - add a orientation information for set_capability - retry to read dpcd in 10 seconds Changes in v8: - optimization the err log Changes in v7: - support firmware standby when no dptx connection - optimization the calculation of tu size and valid symbol Changes in v6: - add a port struct - select SND_SOC_HDMI_CODEC - force reset the phy when hpd detected Changes in v5: - alphabetical order - do not use long, use u32 or u64 - return MODE_CLOCK_HIGH when requested > actual - Optimized Coding Style - add a formula to get better tu size and symbol value. - modify according to Sean Paul's comments - fixed the fw_wait always 0 Changes in v4: - use phy framework to control DP phy - support 2 phys Changes in v3: - use EXTCON_DISP_DP and EXTCON_DISP_DP_ALT cable to get dp port state. - reset spdif before config it - modify the firmware clk to 100Mhz - retry load firmware if fw file is requested too early Changes in v2: - Alphabetic order - remove excess error message - use define clk_rate - check all return value - remove dev_set_name(dp->dev, "cdn-dp"); - use schedule_delayed_work - remove never-called functions - remove some unnecessary () Changes in v1: - use extcon API - use hdmi-codec for the DP Asoc - do not initialize the "ret" - printk a err log when drm_of_encoder_active_endpoint_id - modify the dclk pin_pol to a single line drivers/gpu/drm/rockchip/Kconfig| 10 + drivers/gpu/drm/rockchip/Makefile | 1 + drivers/gpu/drm/rockchip/cdn-dp-core.c | 927 +++ drivers/gpu/drm/rockchip/cdn-dp-core.h | 104 +++ drivers/gpu/drm/rockchip/cdn-dp-reg.c | 959 drivers/gpu/drm/rockchip/cdn-dp-reg.h | 482 ++ drivers/gpu/drm/rockchip/rockchip_drm_vop.c | 13 +- drivers/gpu/drm/rockchip/rockchip_drm_vop.h | 9 + drivers/gpu/drm/rockchip/rockchip_vop_reg.c | 2 + 9 files changed, 2504 insertions(+), 3 deletions(-) create mode 100644 drivers/gpu/drm/rockchip/cdn-dp-core.c create mode 100644 drivers/gpu/drm/rockchip/cdn-dp-core.h create mode 100644 drivers/gpu/drm/rockchip/cdn-dp-reg.c create mode 100644 drivers/gpu/drm/rockchip/cdn-dp-reg.h diff --git a/drivers/gpu/drm/rockchip/Kconfig b/drivers/gpu/drm/rockchip/Kconfig index d30bdc3..20aaafe 100644 --- a/drivers/gpu/drm/rockchip/Kconfig +++ b/drivers/gpu/drm/rockchip/Kconfig @@ -25,6 +25,16 @@ config ROCKCHIP_ANALOGIX_DP for the Analogix Core DP driver. If you want to enable DP on RK3288 based SoC, you should selet this option. +config ROCKCHIP_CDN_DP +tristate "Rockchip cdn DP" +depends on DRM_ROCKCHIP + select SND_SOC_HDMI_CODEC if SND_SOC +help + This selects support for Rockchip SoC specific extensions + for the cdn DP driver. If you want to enable Dp on + RK3399 based SoC, you should select this + option. + config ROCKCHIP_DW_HDMI tristate "Rockchip specific extensions for Synopsys DW HDMI" depends on DRM_ROCKCHIP diff --git a/drivers/gpu/drm/rockchip/Makefile b/drivers/gpu/drm/rockchip/Makefile index 05d0713..abdecd5 100644 --- a/drivers/gpu/drm/rockchip/Makefile +++ b/drivers/gpu/drm/rockchip/Makefile @@ -7,6 +7,7 @@ rockchipdrm-y := rockchip_drm_drv.o rockchip_drm_fb.o \ rockchipdrm-$(CONFIG_DRM_FBDEV_EMULATION) += rockchip_drm_fbdev.o obj-$(CONFIG_ROCKCHIP_ANALOGIX_DP) += analogix_dp-rockchip.o +obj-$(CONFIG_ROCKCHIP_CDN_DP) += cdn-dp-core.o cdn-dp-reg.o obj-$(CONFIG_ROCKCHIP_DW_HDMI) += dw_hdmi-rockchip.o obj-$(CONFIG_ROCKCHIP_DW_MIPI_DSI) += dw-mipi-dsi.o obj-$(CONFIG_ROCKCHIP_INNO_HDMI) += inno_hdmi.o diff --git a/drivers/gpu/drm/rockchip/cdn-dp-core.c b/drivers/gpu/drm/rockchip/cdn-dp-core.c new file mode 100644 index 000..2f068b0 --- /dev/null +++ b/drivers/gpu/drm/rockchip/cdn-dp-core.c @@ -0,0 +1,927 @@ +/* + * Copyright (C) Fuzhou Rockchip Electronics Co.Ltd + * Author: Chris Zhong + * + * This software is licensed under the terms of the GNU General Public + * License version 2, as published by the Free Software Foundation, and + * may be copied, distributed, and modified under those terms. + * + * This program is distributed in the hope that it will be useful, + *
Re: [PATCH] timekeeping: Prints the amounts of time spent during suspend
Hi Ruchi, [auto build test WARNING on tip/timers/core] [also build test WARNING on v4.8-rc1 next-20160811] [if your patch is applied to the wrong git tree, please drop us a note to help improve the system] url: https://github.com/0day-ci/linux/commits/Ruchi-Kandoi/timekeeping-Prints-the-amounts-of-time-spent-during-suspend/20160812-034700 config: openrisc-allmodconfig (attached as .config) compiler: or32-linux-gcc (GCC) 4.5.1-or32-1.0rc1 reproduce: wget https://git.kernel.org/cgit/linux/kernel/git/wfg/lkp-tests.git/plain/sbin/make.cross -O ~/bin/make.cross chmod +x ~/bin/make.cross # save the attached .config to linux build tree make.cross ARCH=openrisc All warnings (new ones prefixed by >>): kernel/time/timekeeping_debug.c: In function 'tk_debug_account_sleep_time': >> kernel/time/timekeeping_debug.c:73:2: warning: format '%lu' expects type >> 'long unsigned int', but argument 2 has type 'time64_t' vim +73 kernel/time/timekeeping_debug.c 57 struct dentry *d; 58 59 d = debugfs_create_file("sleep_time", 0444, NULL, NULL, 60 _debug_sleep_time_fops); 61 if (!d) { 62 pr_err("Failed to create sleep_time debug file\n"); 63 return -ENOMEM; 64 } 65 66 return 0; 67 } 68 late_initcall(tk_debug_sleep_time_init); 69 70 void tk_debug_account_sleep_time(struct timespec64 *t) 71 { 72 sleep_time_bin[fls(t->tv_sec)]++; > 73 pr_info("Suspended for %lu.%03lu seconds\n", t->tv_sec, 74 t->tv_nsec / NSEC_PER_MSEC); 75 } 76 --- 0-DAY kernel test infrastructureOpen Source Technology Center https://lists.01.org/pipermail/kbuild-all Intel Corporation .config.gz Description: Binary data
Re: [PATCH] timekeeping: Prints the amounts of time spent during suspend
Hi Ruchi, [auto build test WARNING on tip/timers/core] [also build test WARNING on v4.8-rc1 next-20160811] [if your patch is applied to the wrong git tree, please drop us a note to help improve the system] url: https://github.com/0day-ci/linux/commits/Ruchi-Kandoi/timekeeping-Prints-the-amounts-of-time-spent-during-suspend/20160812-034700 config: openrisc-allmodconfig (attached as .config) compiler: or32-linux-gcc (GCC) 4.5.1-or32-1.0rc1 reproduce: wget https://git.kernel.org/cgit/linux/kernel/git/wfg/lkp-tests.git/plain/sbin/make.cross -O ~/bin/make.cross chmod +x ~/bin/make.cross # save the attached .config to linux build tree make.cross ARCH=openrisc All warnings (new ones prefixed by >>): kernel/time/timekeeping_debug.c: In function 'tk_debug_account_sleep_time': >> kernel/time/timekeeping_debug.c:73:2: warning: format '%lu' expects type >> 'long unsigned int', but argument 2 has type 'time64_t' vim +73 kernel/time/timekeeping_debug.c 57 struct dentry *d; 58 59 d = debugfs_create_file("sleep_time", 0444, NULL, NULL, 60 _debug_sleep_time_fops); 61 if (!d) { 62 pr_err("Failed to create sleep_time debug file\n"); 63 return -ENOMEM; 64 } 65 66 return 0; 67 } 68 late_initcall(tk_debug_sleep_time_init); 69 70 void tk_debug_account_sleep_time(struct timespec64 *t) 71 { 72 sleep_time_bin[fls(t->tv_sec)]++; > 73 pr_info("Suspended for %lu.%03lu seconds\n", t->tv_sec, 74 t->tv_nsec / NSEC_PER_MSEC); 75 } 76 --- 0-DAY kernel test infrastructureOpen Source Technology Center https://lists.01.org/pipermail/kbuild-all Intel Corporation .config.gz Description: Binary data
Re: [PATCH 0/2] ibmvfc: FC-TAPE Support
> "Tyrel" == Tyrel Datwylerwrites: Tyrel> On 08/03/2016 02:36 PM, Tyrel Datwyler wrote: >> This patchset introduces optional FC-TAPE/FC Class 3 Error Recovery >> to the ibmvfc client driver. >> >> Tyrel Datwyler (2): ibmvfc: Set READ FCP_XFER_READY DISABLED bit in >> PRLI ibmvfc: add FC Class 3 Error Recovery support >> >> drivers/scsi/ibmvscsi/ibmvfc.c | 11 +++ >> drivers/scsi/ibmvscsi/ibmvfc.h | 1 + 2 files changed, 12 >> insertions(+) >> Tyrel> ping? -ENOREVIEWS -- Martin K. Petersen Oracle Linux Engineering
Re: [PATCH 0/2] ibmvfc: FC-TAPE Support
> "Tyrel" == Tyrel Datwyler writes: Tyrel> On 08/03/2016 02:36 PM, Tyrel Datwyler wrote: >> This patchset introduces optional FC-TAPE/FC Class 3 Error Recovery >> to the ibmvfc client driver. >> >> Tyrel Datwyler (2): ibmvfc: Set READ FCP_XFER_READY DISABLED bit in >> PRLI ibmvfc: add FC Class 3 Error Recovery support >> >> drivers/scsi/ibmvscsi/ibmvfc.c | 11 +++ >> drivers/scsi/ibmvscsi/ibmvfc.h | 1 + 2 files changed, 12 >> insertions(+) >> Tyrel> ping? -ENOREVIEWS -- Martin K. Petersen Oracle Linux Engineering
Re: [PATCH 1/3] Add a new field to struct shrinker
On 07/29/2016 06:00 AM, Mel Gorman wrote: > On Fri, Jul 29, 2016 at 10:13:40AM +1000, Dave Chinner wrote: >> On Thu, Jul 28, 2016 at 11:25:13AM +0100, Mel Gorman wrote: >>> On Thu, Jul 28, 2016 at 03:49:47PM +1000, Dave Chinner wrote: Seems you're all missing the obvious. Add a tracepoint for a shrinker callback that includes a "name" field, have the shrinker callback fill it out appropriately. e.g in the superblock shrinker: trace_shrinker_callback(shrinker, shrink_control, sb->s_type->name); >>> >>> That misses capturing the latency of the call unless there is a begin/end >>> tracepoint. >> >> Sure, but I didn't see that in the email talking about how to add a >> name. Even if it is a requirement, it's not necessary as we've >> already got shrinker runtime measurements from the >> trace_mm_shrink_slab_start and trace_mm_shrink_slab_end trace >> points. With the above callback event, shrinker call runtime is >> simply the time between the calls to the same shrinker within >> mm_shrink_slab start/end trace points. >> > > Fair point. It's not that hard to correlate them. True but the scan_objects callback is only called if we have >batch_size objects. It's possible to accumulate quite some time without calling the callback and being able to obtain the s_type->name. So this time all gets associated with just super_cache_scan.
Re: [PATCH 1/3] Add a new field to struct shrinker
On 07/29/2016 06:00 AM, Mel Gorman wrote: > On Fri, Jul 29, 2016 at 10:13:40AM +1000, Dave Chinner wrote: >> On Thu, Jul 28, 2016 at 11:25:13AM +0100, Mel Gorman wrote: >>> On Thu, Jul 28, 2016 at 03:49:47PM +1000, Dave Chinner wrote: Seems you're all missing the obvious. Add a tracepoint for a shrinker callback that includes a "name" field, have the shrinker callback fill it out appropriately. e.g in the superblock shrinker: trace_shrinker_callback(shrinker, shrink_control, sb->s_type->name); >>> >>> That misses capturing the latency of the call unless there is a begin/end >>> tracepoint. >> >> Sure, but I didn't see that in the email talking about how to add a >> name. Even if it is a requirement, it's not necessary as we've >> already got shrinker runtime measurements from the >> trace_mm_shrink_slab_start and trace_mm_shrink_slab_end trace >> points. With the above callback event, shrinker call runtime is >> simply the time between the calls to the same shrinker within >> mm_shrink_slab start/end trace points. >> > > Fair point. It's not that hard to correlate them. True but the scan_objects callback is only called if we have >batch_size objects. It's possible to accumulate quite some time without calling the callback and being able to obtain the s_type->name. So this time all gets associated with just super_cache_scan.
[PATCH] MIPS: ath79: Modify error handling
clk_register_fixed_factor returns an ERR_PTR in case of an error and should have an IS_ERR check instead of a null check. The Coccinelle semantic patch used to find this issue is as follows: @@ expression e; statement S; @@ *e = clk_register_fixed_factor(...); if (!e) S Signed-off-by: Amitoj Kaur Chawla--- arch/mips/ath79/clock.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/arch/mips/ath79/clock.c b/arch/mips/ath79/clock.c index 2e73784..cc3a1e3 100644 --- a/arch/mips/ath79/clock.c +++ b/arch/mips/ath79/clock.c @@ -96,7 +96,7 @@ static struct clk * __init ath79_reg_ffclk(const char *name, struct clk *clk; clk = clk_register_fixed_factor(NULL, name, parent_name, 0, mult, div); - if (!clk) + if (IS_ERR(clk)) panic("failed to allocate %s clock structure", name); return clk; -- 1.9.1
[PATCH] MIPS: ath79: Modify error handling
clk_register_fixed_factor returns an ERR_PTR in case of an error and should have an IS_ERR check instead of a null check. The Coccinelle semantic patch used to find this issue is as follows: @@ expression e; statement S; @@ *e = clk_register_fixed_factor(...); if (!e) S Signed-off-by: Amitoj Kaur Chawla --- arch/mips/ath79/clock.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/arch/mips/ath79/clock.c b/arch/mips/ath79/clock.c index 2e73784..cc3a1e3 100644 --- a/arch/mips/ath79/clock.c +++ b/arch/mips/ath79/clock.c @@ -96,7 +96,7 @@ static struct clk * __init ath79_reg_ffclk(const char *name, struct clk *clk; clk = clk_register_fixed_factor(NULL, name, parent_name, 0, mult, div); - if (!clk) + if (IS_ERR(clk)) panic("failed to allocate %s clock structure", name); return clk; -- 1.9.1
Re: spin_lock implicit/explicit memory barrier
On Thu, Aug 11, 2016 at 11:31:06AM -0700, Davidlohr Bueso wrote: > On Thu, 11 Aug 2016, Peter Zijlstra wrote: > > > On Wed, Aug 10, 2016 at 04:29:22PM -0700, Davidlohr Bueso wrote: > > > > > (1) As Manfred suggested, have a patch 1 that fixes the race against > > > mainline > > > with the redundant smp_rmb, then apply a second patch that gets rid of it > > > for mainline, but only backport the original patch 1 down to 3.12. > > > > I have not followed the thread closely, but this seems like the best > > option. Esp. since 726328d92a42 ("locking/spinlock, arch: Update and fix > > spin_unlock_wait() implementations") is incomplete, it relies on at > > least 6262db7c088b ("powerpc/spinlock: Fix spin_unlock_wait()") to sort > > PPC. > > Yeah, and we'd also need the arm bits; which reminds me, aren't alpha > ldl_l/stl_c sequences also exposed to this delaying of the publishing > when a non-owner peeks at the lock? Right now sysv sem's would be busted > when doing either is_locked or unlock_wait, shouldn't these be pimped up > to full smp_mb()s? > You are talking about a similar problem as this one: http://www.mail-archive.com/linux-kernel@vger.kernel.org/msg1018307.html right? The trick of this problem is whether the barrier or operation in spin_lock() could order the STORE part of the lock-acquire with memory operations in critical sections. On PPC, we use lwsync, which doesn't order STORE->LOAD, so there is problem. On ARM64 and qspinlock in x86, there are similiar reasons. But if an arch implements its spin_lock() with a full barrier, even though the atomic is implemented by ll/sc, the STORE part of which can't be reordered with memory operations in the critcal sections. I think maybe that's the case for alpha(and also for ARM32). Regards, Boqun > Thanks, > Davidlohr > signature.asc Description: PGP signature
Re: spin_lock implicit/explicit memory barrier
On Thu, Aug 11, 2016 at 11:31:06AM -0700, Davidlohr Bueso wrote: > On Thu, 11 Aug 2016, Peter Zijlstra wrote: > > > On Wed, Aug 10, 2016 at 04:29:22PM -0700, Davidlohr Bueso wrote: > > > > > (1) As Manfred suggested, have a patch 1 that fixes the race against > > > mainline > > > with the redundant smp_rmb, then apply a second patch that gets rid of it > > > for mainline, but only backport the original patch 1 down to 3.12. > > > > I have not followed the thread closely, but this seems like the best > > option. Esp. since 726328d92a42 ("locking/spinlock, arch: Update and fix > > spin_unlock_wait() implementations") is incomplete, it relies on at > > least 6262db7c088b ("powerpc/spinlock: Fix spin_unlock_wait()") to sort > > PPC. > > Yeah, and we'd also need the arm bits; which reminds me, aren't alpha > ldl_l/stl_c sequences also exposed to this delaying of the publishing > when a non-owner peeks at the lock? Right now sysv sem's would be busted > when doing either is_locked or unlock_wait, shouldn't these be pimped up > to full smp_mb()s? > You are talking about a similar problem as this one: http://www.mail-archive.com/linux-kernel@vger.kernel.org/msg1018307.html right? The trick of this problem is whether the barrier or operation in spin_lock() could order the STORE part of the lock-acquire with memory operations in critical sections. On PPC, we use lwsync, which doesn't order STORE->LOAD, so there is problem. On ARM64 and qspinlock in x86, there are similiar reasons. But if an arch implements its spin_lock() with a full barrier, even though the atomic is implemented by ll/sc, the STORE part of which can't be reordered with memory operations in the critcal sections. I think maybe that's the case for alpha(and also for ARM32). Regards, Boqun > Thanks, > Davidlohr > signature.asc Description: PGP signature
Re: [PATCH v2 0/6] Add support of inverting power control and some minor cleanup
On 2016/8/11 18:08, Jaehoon Chung wrote: Hi Shawn, On 08/09/2016 10:49 AM, Shawn Lin wrote: Hi, On 2016/8/8 18:24, Jaehoon Chung wrote: Hi Shawn, On 08/07/2016 10:33 AM, Shawn Lin wrote: By default, dw_mmc outputs high level voltage to indicate powering up the card and outputs low level vcltage to indicate powering off the card. But that is not always correct. The power io should be able to control different kind of hw components to supply or cutoff power to the card. We have boards that need this patchset to make the power control correct. Meanwhile let's expose it to DT for board-specific usage. I have a question for this patch-set. Does DWMMC IP support to invert ON/OFF at Power Enable register? No. Hmm..Well, if use the DW_MMC_CARD_PWR_INVERT, it should also be the similar behavior with Quirks. yup, it makes the power control more complicated than before. :( Other flags are related with dwmmc IP. But this flag (DW_MMC_CARD_PWR_INVERT) is not related with IP side. I understood why you needs to add this flag..Is rockchip designed to invert the power controlling? We don't invert the power controlling but our customers do. The HW componet looks like some discrete LDOs which enable the related power supply when outputing low voltage from pwren.. Is it related with PWREN register? I'm checking the TRM for PWREN register.. "Optional feature; ports can be used as general-purpose outputs." Does it use the ports as GPIO? But it's not general case. We can discuss about this. I have a solution which is to add gpio power control for slot-gpio of mmc core. once finished, we could add pwr_cap_invert just like what we did for cd/wp invert control.. I'm not sure that it's right that add the gpio power control into mmc core. slot-gpio? i think it can use the regulator framework with gpio. yup, I'm looking this and seems it's fine to register a regulator-gpio and stuff it for vmmc-supply of sdmmc. I will test it, really helpful. thanks! I don't have a lot knowledge of power, but it might be controlled whether active is low or high. So i think it doesn't need to add for entire mmc controller. Best Regards, Jaehoon Chung Then we could remove PWREN from the default state of pinctrl inside the sdmmc dt node, and let dwmmc request gpio power control stuff after paring the property for pwr_cap_invert.. More over, it well fit for all mmc host's requirement of gpio power control and inverted control if they want it. :) That should be legit for us? If it sounds ok to you and Ulf, I will come up with a RFC one for community to comment it.:) Best Regards, Jaehoon Chung Changes in v2: - fix copy-paste err and typo Shawn Lin (6): dt-bindings: rockchip-dw-mshc: add description of rockchip,power-invert mmc: dw_mmc: cleanup power setting of set_ios callback mmc: dw_mmc: split out dw_mci_set_power mmc: dw_mmc: split out dw_mci_set_power_reg mmc: dw_mmc: support inverted power control mmc: dw_mmc-rockchip: add parsing of power control from DT .../devicetree/bindings/mmc/rockchip-dw-mshc.txt | 6 + drivers/mmc/host/dw_mmc-rockchip.c | 8 ++ drivers/mmc/host/dw_mmc.c | 134 - drivers/mmc/host/dw_mmc.h | 1 + 4 files changed, 90 insertions(+), 59 deletions(-) -- To unsubscribe from this list: send the line "unsubscribe linux-mmc" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html -- Best Regards Shawn Lin
Re: [PATCH v2 0/6] Add support of inverting power control and some minor cleanup
On 2016/8/11 18:08, Jaehoon Chung wrote: Hi Shawn, On 08/09/2016 10:49 AM, Shawn Lin wrote: Hi, On 2016/8/8 18:24, Jaehoon Chung wrote: Hi Shawn, On 08/07/2016 10:33 AM, Shawn Lin wrote: By default, dw_mmc outputs high level voltage to indicate powering up the card and outputs low level vcltage to indicate powering off the card. But that is not always correct. The power io should be able to control different kind of hw components to supply or cutoff power to the card. We have boards that need this patchset to make the power control correct. Meanwhile let's expose it to DT for board-specific usage. I have a question for this patch-set. Does DWMMC IP support to invert ON/OFF at Power Enable register? No. Hmm..Well, if use the DW_MMC_CARD_PWR_INVERT, it should also be the similar behavior with Quirks. yup, it makes the power control more complicated than before. :( Other flags are related with dwmmc IP. But this flag (DW_MMC_CARD_PWR_INVERT) is not related with IP side. I understood why you needs to add this flag..Is rockchip designed to invert the power controlling? We don't invert the power controlling but our customers do. The HW componet looks like some discrete LDOs which enable the related power supply when outputing low voltage from pwren.. Is it related with PWREN register? I'm checking the TRM for PWREN register.. "Optional feature; ports can be used as general-purpose outputs." Does it use the ports as GPIO? But it's not general case. We can discuss about this. I have a solution which is to add gpio power control for slot-gpio of mmc core. once finished, we could add pwr_cap_invert just like what we did for cd/wp invert control.. I'm not sure that it's right that add the gpio power control into mmc core. slot-gpio? i think it can use the regulator framework with gpio. yup, I'm looking this and seems it's fine to register a regulator-gpio and stuff it for vmmc-supply of sdmmc. I will test it, really helpful. thanks! I don't have a lot knowledge of power, but it might be controlled whether active is low or high. So i think it doesn't need to add for entire mmc controller. Best Regards, Jaehoon Chung Then we could remove PWREN from the default state of pinctrl inside the sdmmc dt node, and let dwmmc request gpio power control stuff after paring the property for pwr_cap_invert.. More over, it well fit for all mmc host's requirement of gpio power control and inverted control if they want it. :) That should be legit for us? If it sounds ok to you and Ulf, I will come up with a RFC one for community to comment it.:) Best Regards, Jaehoon Chung Changes in v2: - fix copy-paste err and typo Shawn Lin (6): dt-bindings: rockchip-dw-mshc: add description of rockchip,power-invert mmc: dw_mmc: cleanup power setting of set_ios callback mmc: dw_mmc: split out dw_mci_set_power mmc: dw_mmc: split out dw_mci_set_power_reg mmc: dw_mmc: support inverted power control mmc: dw_mmc-rockchip: add parsing of power control from DT .../devicetree/bindings/mmc/rockchip-dw-mshc.txt | 6 + drivers/mmc/host/dw_mmc-rockchip.c | 8 ++ drivers/mmc/host/dw_mmc.c | 134 - drivers/mmc/host/dw_mmc.h | 1 + 4 files changed, 90 insertions(+), 59 deletions(-) -- To unsubscribe from this list: send the line "unsubscribe linux-mmc" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html -- Best Regards Shawn Lin
[PATCH 02/18] autofs: Drop unnecessary extern in autofs_i.h
From: Tomohiro Kusumiautofs4_kill_sb() doesn't need to be declared as extern, and no other functions in .h are explicitly declared as extern. Signed-off-by: Tomohiro Kusumi Signed-off-by: Ian Kent --- fs/autofs4/autofs_i.h |2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/fs/autofs4/autofs_i.h b/fs/autofs4/autofs_i.h index a439548..eb87055 100644 --- a/fs/autofs4/autofs_i.h +++ b/fs/autofs4/autofs_i.h @@ -271,4 +271,4 @@ static inline void autofs4_del_expiring(struct dentry *dentry) } } -extern void autofs4_kill_sb(struct super_block *); +void autofs4_kill_sb(struct super_block *);
Re: [LKP] [lkp] [xfs] 68a9f5e700: aim7.jobs-per-min -13.6% regression
On Fri, Aug 12, 2016 at 12:23:29PM +1000, Dave Chinner wrote: > Christoph, maybe there's something we can do to only trigger > speculative prealloc growth checks if the new file size crosses the end of > the currently allocated block at the EOF. That would chop out a fair > chunk of the xfs_bmapi_read calls being done in this workload. I'm > not sure how much effort we should spend optimising this slow path, > though I can look at that, but indeed optimizing this patch seems a bit stupid. The other thing we could do is to optimize xfs_bmapi_read - even if it shouldn't be called this often it seems like it should waste a whole lot less CPU cycles.
[PATCH 02/18] autofs: Drop unnecessary extern in autofs_i.h
From: Tomohiro Kusumi autofs4_kill_sb() doesn't need to be declared as extern, and no other functions in .h are explicitly declared as extern. Signed-off-by: Tomohiro Kusumi Signed-off-by: Ian Kent --- fs/autofs4/autofs_i.h |2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/fs/autofs4/autofs_i.h b/fs/autofs4/autofs_i.h index a439548..eb87055 100644 --- a/fs/autofs4/autofs_i.h +++ b/fs/autofs4/autofs_i.h @@ -271,4 +271,4 @@ static inline void autofs4_del_expiring(struct dentry *dentry) } } -extern void autofs4_kill_sb(struct super_block *); +void autofs4_kill_sb(struct super_block *);
Re: [LKP] [lkp] [xfs] 68a9f5e700: aim7.jobs-per-min -13.6% regression
On Fri, Aug 12, 2016 at 12:23:29PM +1000, Dave Chinner wrote: > Christoph, maybe there's something we can do to only trigger > speculative prealloc growth checks if the new file size crosses the end of > the currently allocated block at the EOF. That would chop out a fair > chunk of the xfs_bmapi_read calls being done in this workload. I'm > not sure how much effort we should spend optimising this slow path, > though I can look at that, but indeed optimizing this patch seems a bit stupid. The other thing we could do is to optimize xfs_bmapi_read - even if it shouldn't be called this often it seems like it should waste a whole lot less CPU cycles.
[PATCH 12/18] autofs: Update struct autofs_dev_ioctl in Documentation
From: Tomohiro KusumiSync with changes made by 730c9eec which introduced an union for various ioctl commands instead of having statically named arg1,2. This commit simply replaces arg1,2 with the corresponding fields without changing semantics. Signed-off-by: Tomohiro Kusumi Signed-off-by: Ian Kent --- .../filesystems/autofs4-mount-control.txt | 70 1 file changed, 42 insertions(+), 28 deletions(-) diff --git a/Documentation/filesystems/autofs4-mount-control.txt b/Documentation/filesystems/autofs4-mount-control.txt index 540d9a7..50a3e01 100644 --- a/Documentation/filesystems/autofs4-mount-control.txt +++ b/Documentation/filesystems/autofs4-mount-control.txt @@ -179,8 +179,19 @@ struct autofs_dev_ioctl { * including this struct */ __s32 ioctlfd; /* automount command fd */ - __u32 arg1; /* Command parameters */ - __u32 arg2; + union { + struct args_protoverprotover; + struct args_protosubver protosubver; + struct args_openmount openmount; + struct args_ready ready; + struct args_failfail; + struct args_setpipefd setpipefd; + struct args_timeout timeout; + struct args_requester requester; + struct args_expire expire; + struct args_askumount askumount; + struct args_ismountpointismountpoint; + }; char path[0]; }; @@ -192,8 +203,8 @@ optionally be used to check a specific mount corresponding to a given mount point file descriptor, and when requesting the uid and gid of the last successful mount on a directory within the autofs file system. -The fields arg1 and arg2 are used to communicate parameters and results of -calls made as described below. +The union is used to communicate parameters and results of calls made +as described below. The path field is used to pass a path where it is needed and the size field is used account for the increased structure length when translating the @@ -245,9 +256,9 @@ AUTOFS_DEV_IOCTL_PROTOVER_CMD and AUTOFS_DEV_IOCTL_PROTOSUBVER_CMD Get the major and minor version of the autofs4 protocol version understood by loaded module. This call requires an initialized struct autofs_dev_ioctl with the ioctlfd field set to a valid autofs mount point descriptor -and sets the requested version number in structure field arg1. These -commands return 0 on success or one of the negative error codes if -validation fails. +and sets the requested version number in version field of struct args_protover +or sub_version field of struct args_protosubver. These commands return +0 on success or one of the negative error codes if validation fails. AUTOFS_DEV_IOCTL_OPENMOUNT and AUTOFS_DEV_IOCTL_CLOSEMOUNT @@ -256,9 +267,9 @@ AUTOFS_DEV_IOCTL_OPENMOUNT and AUTOFS_DEV_IOCTL_CLOSEMOUNT Obtain and release a file descriptor for an autofs managed mount point path. The open call requires an initialized struct autofs_dev_ioctl with the path field set and the size field adjusted appropriately as well -as the arg1 field set to the device number of the autofs mount. The -device number can be obtained from the mount options shown in -/proc/mounts. The close call requires an initialized struct +as the devid field of struct args_openmount set to the device number of +the autofs mount. The device number can be obtained from the mount options +shown in /proc/mounts. The close call requires an initialized struct autofs_dev_ioct with the ioctlfd field set to the descriptor obtained from the open call. The release of the file descriptor can also be done with close(2) so any open descriptors will also be closed at process exit. @@ -272,10 +283,10 @@ AUTOFS_DEV_IOCTL_READY_CMD and AUTOFS_DEV_IOCTL_FAIL_CMD Return mount and expire result status from user space to the kernel. Both of these calls require an initialized struct autofs_dev_ioctl with the ioctlfd field set to the descriptor obtained from the open -call and the arg1 field set to the wait queue token number, received -by user space in the foregoing mount or expire request. The arg2 field -is set to the status to be returned. For the ready call this is always -0 and for the fail call it is set to the errno of the operation. +call and the token field of struct args_ready or struct args_fail set +to the wait queue token number, received by user space in the foregoing +mount or expire request. The status field of struct args_fail is set to +the errno of the operation. It is set to 0 on success. AUTOFS_DEV_IOCTL_SETPIPEFD_CMD @@ -290,9 +301,10 @@ mount be catatonic (see next call). The call requires an initialized struct autofs_dev_ioctl with the ioctlfd
[PATCH 09/18] autofs: Don't fail to free_dev_ioctl(param)
From: Tomohiro KusumiReturning -ENOTTY here fails to free dynamically allocated param. Signed-off-by: Tomohiro Kusumi Signed-off-by: Ian Kent --- fs/autofs4/dev-ioctl.c |3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/fs/autofs4/dev-ioctl.c b/fs/autofs4/dev-ioctl.c index c7fcc74..d47b35a 100644 --- a/fs/autofs4/dev-ioctl.c +++ b/fs/autofs4/dev-ioctl.c @@ -662,7 +662,8 @@ static int _autofs_dev_ioctl(unsigned int command, fn = lookup_dev_ioctl(cmd); if (!fn) { pr_warn("unknown command 0x%08x\n", command); - return -ENOTTY; + err = -ENOTTY; + goto out; } fp = NULL;
[PATCH 12/18] autofs: Update struct autofs_dev_ioctl in Documentation
From: Tomohiro Kusumi Sync with changes made by 730c9eec which introduced an union for various ioctl commands instead of having statically named arg1,2. This commit simply replaces arg1,2 with the corresponding fields without changing semantics. Signed-off-by: Tomohiro Kusumi Signed-off-by: Ian Kent --- .../filesystems/autofs4-mount-control.txt | 70 1 file changed, 42 insertions(+), 28 deletions(-) diff --git a/Documentation/filesystems/autofs4-mount-control.txt b/Documentation/filesystems/autofs4-mount-control.txt index 540d9a7..50a3e01 100644 --- a/Documentation/filesystems/autofs4-mount-control.txt +++ b/Documentation/filesystems/autofs4-mount-control.txt @@ -179,8 +179,19 @@ struct autofs_dev_ioctl { * including this struct */ __s32 ioctlfd; /* automount command fd */ - __u32 arg1; /* Command parameters */ - __u32 arg2; + union { + struct args_protoverprotover; + struct args_protosubver protosubver; + struct args_openmount openmount; + struct args_ready ready; + struct args_failfail; + struct args_setpipefd setpipefd; + struct args_timeout timeout; + struct args_requester requester; + struct args_expire expire; + struct args_askumount askumount; + struct args_ismountpointismountpoint; + }; char path[0]; }; @@ -192,8 +203,8 @@ optionally be used to check a specific mount corresponding to a given mount point file descriptor, and when requesting the uid and gid of the last successful mount on a directory within the autofs file system. -The fields arg1 and arg2 are used to communicate parameters and results of -calls made as described below. +The union is used to communicate parameters and results of calls made +as described below. The path field is used to pass a path where it is needed and the size field is used account for the increased structure length when translating the @@ -245,9 +256,9 @@ AUTOFS_DEV_IOCTL_PROTOVER_CMD and AUTOFS_DEV_IOCTL_PROTOSUBVER_CMD Get the major and minor version of the autofs4 protocol version understood by loaded module. This call requires an initialized struct autofs_dev_ioctl with the ioctlfd field set to a valid autofs mount point descriptor -and sets the requested version number in structure field arg1. These -commands return 0 on success or one of the negative error codes if -validation fails. +and sets the requested version number in version field of struct args_protover +or sub_version field of struct args_protosubver. These commands return +0 on success or one of the negative error codes if validation fails. AUTOFS_DEV_IOCTL_OPENMOUNT and AUTOFS_DEV_IOCTL_CLOSEMOUNT @@ -256,9 +267,9 @@ AUTOFS_DEV_IOCTL_OPENMOUNT and AUTOFS_DEV_IOCTL_CLOSEMOUNT Obtain and release a file descriptor for an autofs managed mount point path. The open call requires an initialized struct autofs_dev_ioctl with the path field set and the size field adjusted appropriately as well -as the arg1 field set to the device number of the autofs mount. The -device number can be obtained from the mount options shown in -/proc/mounts. The close call requires an initialized struct +as the devid field of struct args_openmount set to the device number of +the autofs mount. The device number can be obtained from the mount options +shown in /proc/mounts. The close call requires an initialized struct autofs_dev_ioct with the ioctlfd field set to the descriptor obtained from the open call. The release of the file descriptor can also be done with close(2) so any open descriptors will also be closed at process exit. @@ -272,10 +283,10 @@ AUTOFS_DEV_IOCTL_READY_CMD and AUTOFS_DEV_IOCTL_FAIL_CMD Return mount and expire result status from user space to the kernel. Both of these calls require an initialized struct autofs_dev_ioctl with the ioctlfd field set to the descriptor obtained from the open -call and the arg1 field set to the wait queue token number, received -by user space in the foregoing mount or expire request. The arg2 field -is set to the status to be returned. For the ready call this is always -0 and for the fail call it is set to the errno of the operation. +call and the token field of struct args_ready or struct args_fail set +to the wait queue token number, received by user space in the foregoing +mount or expire request. The status field of struct args_fail is set to +the errno of the operation. It is set to 0 on success. AUTOFS_DEV_IOCTL_SETPIPEFD_CMD @@ -290,9 +301,10 @@ mount be catatonic (see next call). The call requires an initialized struct autofs_dev_ioctl with the ioctlfd field set to the descriptor obtained from the open call and -the arg1
[PATCH 09/18] autofs: Don't fail to free_dev_ioctl(param)
From: Tomohiro Kusumi Returning -ENOTTY here fails to free dynamically allocated param. Signed-off-by: Tomohiro Kusumi Signed-off-by: Ian Kent --- fs/autofs4/dev-ioctl.c |3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/fs/autofs4/dev-ioctl.c b/fs/autofs4/dev-ioctl.c index c7fcc74..d47b35a 100644 --- a/fs/autofs4/dev-ioctl.c +++ b/fs/autofs4/dev-ioctl.c @@ -662,7 +662,8 @@ static int _autofs_dev_ioctl(unsigned int command, fn = lookup_dev_ioctl(cmd); if (!fn) { pr_warn("unknown command 0x%08x\n", command); - return -ENOTTY; + err = -ENOTTY; + goto out; } fp = NULL;
[PATCH 11/18] autofs: Fix Documentation regarding devid on ioctl
From: Tomohiro KusumiThe explanation on how ioctl handles devid seems incorrect. Userspace who calls this ioctl has no input regarding devid, and ioctl implementation retrieves devid via superblock. Signed-off-by: Tomohiro Kusumi Signed-off-by: Ian Kent --- .../filesystems/autofs4-mount-control.txt |5 ++--- 1 file changed, 2 insertions(+), 3 deletions(-) diff --git a/Documentation/filesystems/autofs4-mount-control.txt b/Documentation/filesystems/autofs4-mount-control.txt index aff2211..540d9a7 100644 --- a/Documentation/filesystems/autofs4-mount-control.txt +++ b/Documentation/filesystems/autofs4-mount-control.txt @@ -323,9 +323,8 @@ mount on the given path dentry. The call requires an initialized struct autofs_dev_ioctl with the path field set to the mount point in question and the size field adjusted -appropriately as well as the arg1 field set to the device number of the -containing autofs mount. Upon return the struct field arg1 contains the -uid and arg2 the gid. +appropriately. Upon return the struct field arg1 contains the uid and +arg2 the gid. When reconstructing an autofs mount tree with active mounts we need to re-connect to mounts that may have used the original process uid and
[PATCH 15/18] autofs: Add autofs_dev_ioctl_version() for AUTOFS_DEV_IOCTL_VERSION_CMD
From: Tomohiro KusumiNo functional changes, based on the following justification. 1. Make the code more consistent using the ioctl vector _ioctls[], rather than assigning NULL only for this ioctl command. 2. Remove goto done; for better maintainability in the long run. 3. The existing code is based on the fact that validate_dev_ioctl() sets ioctl version for any command, but AUTOFS_DEV_IOCTL_VERSION_CMD should explicitly set it regardless of the default behavior. Signed-off-by: Tomohiro Kusumi Signed-off-by: Ian Kent --- fs/autofs4/dev-ioctl.c | 25 + 1 file changed, 17 insertions(+), 8 deletions(-) diff --git a/fs/autofs4/dev-ioctl.c b/fs/autofs4/dev-ioctl.c index 13e7517..7289216 100644 --- a/fs/autofs4/dev-ioctl.c +++ b/fs/autofs4/dev-ioctl.c @@ -172,6 +172,17 @@ static struct autofs_sb_info *autofs_dev_ioctl_sbi(struct file *f) return sbi; } +/* Return autofs dev ioctl version */ +static int autofs_dev_ioctl_version(struct file *fp, + struct autofs_sb_info *sbi, + struct autofs_dev_ioctl *param) +{ + /* This should have already been set. */ + param->ver_major = AUTOFS_DEV_IOCTL_VERSION_MAJOR; + param->ver_minor = AUTOFS_DEV_IOCTL_VERSION_MINOR; + return 0; +} + /* Return autofs module protocol version */ static int autofs_dev_ioctl_protover(struct file *fp, struct autofs_sb_info *sbi, @@ -590,7 +601,8 @@ static ioctl_fn lookup_dev_ioctl(unsigned int cmd) int cmd; ioctl_fn fn; } _ioctls[] = { - {cmd_idx(AUTOFS_DEV_IOCTL_VERSION_CMD), NULL}, + {cmd_idx(AUTOFS_DEV_IOCTL_VERSION_CMD), +autofs_dev_ioctl_version}, {cmd_idx(AUTOFS_DEV_IOCTL_PROTOVER_CMD), autofs_dev_ioctl_protover}, {cmd_idx(AUTOFS_DEV_IOCTL_PROTOSUBVER_CMD), @@ -655,10 +667,6 @@ static int _autofs_dev_ioctl(unsigned int command, if (err) goto out; - /* The validate routine above always sets the version */ - if (cmd == AUTOFS_DEV_IOCTL_VERSION_CMD) - goto done; - fn = lookup_dev_ioctl(cmd); if (!fn) { pr_warn("unknown command 0x%08x\n", command); @@ -672,9 +680,11 @@ static int _autofs_dev_ioctl(unsigned int command, /* * For obvious reasons the openmount can't have a file * descriptor yet. We don't take a reference to the -* file during close to allow for immediate release. +* file during close to allow for immediate release, +* and the same for retrieving ioctl version. */ - if (cmd != AUTOFS_DEV_IOCTL_OPENMOUNT_CMD && + if (cmd != AUTOFS_DEV_IOCTL_VERSION_CMD && + cmd != AUTOFS_DEV_IOCTL_OPENMOUNT_CMD && cmd != AUTOFS_DEV_IOCTL_CLOSEMOUNT_CMD) { fp = fget(param->ioctlfd); if (!fp) { @@ -707,7 +717,6 @@ cont: if (fp) fput(fp); -done: if (err >= 0 && copy_to_user(user, param, AUTOFS_DEV_IOCTL_SIZE)) err = -EFAULT; out:
[PATCH 10/18] autofs: Remove AUTOFS_DEVID_LEN
From: Tomohiro KusumiThis macro was never used by neither kernel nor userspace, and also doesn't represent "devid length" in bytes. (unless it was added to mean something else). Signed-off-by: Tomohiro Kusumi Signed-off-by: Ian Kent --- include/linux/auto_dev-ioctl.h |2 -- 1 file changed, 2 deletions(-) diff --git a/include/linux/auto_dev-ioctl.h b/include/linux/auto_dev-ioctl.h index 7caaf29..bf82e3a 100644 --- a/include/linux/auto_dev-ioctl.h +++ b/include/linux/auto_dev-ioctl.h @@ -18,8 +18,6 @@ #define AUTOFS_DEV_IOCTL_VERSION_MAJOR 1 #define AUTOFS_DEV_IOCTL_VERSION_MINOR 0 -#define AUTOFS_DEVID_LEN 16 - #define AUTOFS_DEV_IOCTL_SIZE sizeof(struct autofs_dev_ioctl) /*
[PATCH 13/18] autofs: Fix pr_debug() message
From: Tomohiro KusumiThis isn't a return value, so change the message to indicate the status is the result of may_umount(). (or locate pr_debug() after put_user() with the same message) Signed-off-by: Tomohiro Kusumi Signed-off-by: Ian Kent --- fs/autofs4/root.c |2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/fs/autofs4/root.c b/fs/autofs4/root.c index 1b0495a..d25c55f 100644 --- a/fs/autofs4/root.c +++ b/fs/autofs4/root.c @@ -840,7 +840,7 @@ static inline int autofs4_ask_umount(struct vfsmount *mnt, int __user *p) if (may_umount(mnt)) status = 1; - pr_debug("returning %d\n", status); + pr_debug("may umount %d\n", status); status = put_user(status, p);
[PATCH 18/18] autofs4 - move linux/auto_dev-ioctl.h to uapi/linux
Since linux/auto_dev-ioctl.h wasn't included in include/linux/Kbuild it wasn't moved to uapi/linux as part of the uapi series. Signed-off-by: Ian Kent--- include/linux/auto_dev-ioctl.h | 209 - include/uapi/linux/auto_dev-ioctl.h | 221 +++ 2 files changed, 222 insertions(+), 208 deletions(-) create mode 100644 include/uapi/linux/auto_dev-ioctl.h diff --git a/include/linux/auto_dev-ioctl.h b/include/linux/auto_dev-ioctl.h index bf82e3a..28c1505 100644 --- a/include/linux/auto_dev-ioctl.h +++ b/include/linux/auto_dev-ioctl.h @@ -10,212 +10,5 @@ #ifndef _LINUX_AUTO_DEV_IOCTL_H #define _LINUX_AUTO_DEV_IOCTL_H -#include -#include - -#define AUTOFS_DEVICE_NAME "autofs" - -#define AUTOFS_DEV_IOCTL_VERSION_MAJOR 1 -#define AUTOFS_DEV_IOCTL_VERSION_MINOR 0 - -#define AUTOFS_DEV_IOCTL_SIZE sizeof(struct autofs_dev_ioctl) - -/* - * An ioctl interface for autofs mount point control. - */ - -struct args_protover { - __u32 version; -}; - -struct args_protosubver { - __u32 sub_version; -}; - -struct args_openmount { - __u32 devid; -}; - -struct args_ready { - __u32 token; -}; - -struct args_fail { - __u32 token; - __s32 status; -}; - -struct args_setpipefd { - __s32 pipefd; -}; - -struct args_timeout { - __u64 timeout; -}; - -struct args_requester { - __u32 uid; - __u32 gid; -}; - -struct args_expire { - __u32 how; -}; - -struct args_askumount { - __u32 may_umount; -}; - -struct args_ismountpoint { - union { - struct args_in { - __u32 type; - } in; - struct args_out { - __u32 devid; - __u32 magic; - } out; - }; -}; - -/* - * All the ioctls use this structure. - * When sending a path size must account for the total length - * of the chunk of memory otherwise is is the size of the - * structure. - */ - -struct autofs_dev_ioctl { - __u32 ver_major; - __u32 ver_minor; - __u32 size; /* total size of data passed in -* including this struct */ - __s32 ioctlfd; /* automount command fd */ - - /* Command parameters */ - - union { - struct args_protoverprotover; - struct args_protosubver protosubver; - struct args_openmount openmount; - struct args_ready ready; - struct args_failfail; - struct args_setpipefd setpipefd; - struct args_timeout timeout; - struct args_requester requester; - struct args_expire expire; - struct args_askumount askumount; - struct args_ismountpointismountpoint; - }; - - char path[0]; -}; - -static inline void init_autofs_dev_ioctl(struct autofs_dev_ioctl *in) -{ - memset(in, 0, sizeof(struct autofs_dev_ioctl)); - in->ver_major = AUTOFS_DEV_IOCTL_VERSION_MAJOR; - in->ver_minor = AUTOFS_DEV_IOCTL_VERSION_MINOR; - in->size = sizeof(struct autofs_dev_ioctl); - in->ioctlfd = -1; -} - -/* - * If you change this make sure you make the corresponding change - * to autofs-dev-ioctl.c:lookup_ioctl() - */ -enum { - /* Get various version info */ - AUTOFS_DEV_IOCTL_VERSION_CMD = 0x71, - AUTOFS_DEV_IOCTL_PROTOVER_CMD, - AUTOFS_DEV_IOCTL_PROTOSUBVER_CMD, - - /* Open mount ioctl fd */ - AUTOFS_DEV_IOCTL_OPENMOUNT_CMD, - - /* Close mount ioctl fd */ - AUTOFS_DEV_IOCTL_CLOSEMOUNT_CMD, - - /* Mount/expire status returns */ - AUTOFS_DEV_IOCTL_READY_CMD, - AUTOFS_DEV_IOCTL_FAIL_CMD, - - /* Activate/deactivate autofs mount */ - AUTOFS_DEV_IOCTL_SETPIPEFD_CMD, - AUTOFS_DEV_IOCTL_CATATONIC_CMD, - - /* Expiry timeout */ - AUTOFS_DEV_IOCTL_TIMEOUT_CMD, - - /* Get mount last requesting uid and gid */ - AUTOFS_DEV_IOCTL_REQUESTER_CMD, - - /* Check for eligible expire candidates */ - AUTOFS_DEV_IOCTL_EXPIRE_CMD, - - /* Request busy status */ - AUTOFS_DEV_IOCTL_ASKUMOUNT_CMD, - - /* Check if path is a mountpoint */ - AUTOFS_DEV_IOCTL_ISMOUNTPOINT_CMD, -}; - -#define AUTOFS_IOCTL 0x93 - -#define AUTOFS_DEV_IOCTL_VERSION \ - _IOWR(AUTOFS_IOCTL, \ - AUTOFS_DEV_IOCTL_VERSION_CMD, struct autofs_dev_ioctl) - -#define AUTOFS_DEV_IOCTL_PROTOVER \ - _IOWR(AUTOFS_IOCTL, \ - AUTOFS_DEV_IOCTL_PROTOVER_CMD, struct autofs_dev_ioctl) - -#define AUTOFS_DEV_IOCTL_PROTOSUBVER \ - _IOWR(AUTOFS_IOCTL, \ - AUTOFS_DEV_IOCTL_PROTOSUBVER_CMD, struct autofs_dev_ioctl) - -#define AUTOFS_DEV_IOCTL_OPENMOUNT \ -
[PATCH 11/18] autofs: Fix Documentation regarding devid on ioctl
From: Tomohiro Kusumi The explanation on how ioctl handles devid seems incorrect. Userspace who calls this ioctl has no input regarding devid, and ioctl implementation retrieves devid via superblock. Signed-off-by: Tomohiro Kusumi Signed-off-by: Ian Kent --- .../filesystems/autofs4-mount-control.txt |5 ++--- 1 file changed, 2 insertions(+), 3 deletions(-) diff --git a/Documentation/filesystems/autofs4-mount-control.txt b/Documentation/filesystems/autofs4-mount-control.txt index aff2211..540d9a7 100644 --- a/Documentation/filesystems/autofs4-mount-control.txt +++ b/Documentation/filesystems/autofs4-mount-control.txt @@ -323,9 +323,8 @@ mount on the given path dentry. The call requires an initialized struct autofs_dev_ioctl with the path field set to the mount point in question and the size field adjusted -appropriately as well as the arg1 field set to the device number of the -containing autofs mount. Upon return the struct field arg1 contains the -uid and arg2 the gid. +appropriately. Upon return the struct field arg1 contains the uid and +arg2 the gid. When reconstructing an autofs mount tree with active mounts we need to re-connect to mounts that may have used the original process uid and
[PATCH 15/18] autofs: Add autofs_dev_ioctl_version() for AUTOFS_DEV_IOCTL_VERSION_CMD
From: Tomohiro Kusumi No functional changes, based on the following justification. 1. Make the code more consistent using the ioctl vector _ioctls[], rather than assigning NULL only for this ioctl command. 2. Remove goto done; for better maintainability in the long run. 3. The existing code is based on the fact that validate_dev_ioctl() sets ioctl version for any command, but AUTOFS_DEV_IOCTL_VERSION_CMD should explicitly set it regardless of the default behavior. Signed-off-by: Tomohiro Kusumi Signed-off-by: Ian Kent --- fs/autofs4/dev-ioctl.c | 25 + 1 file changed, 17 insertions(+), 8 deletions(-) diff --git a/fs/autofs4/dev-ioctl.c b/fs/autofs4/dev-ioctl.c index 13e7517..7289216 100644 --- a/fs/autofs4/dev-ioctl.c +++ b/fs/autofs4/dev-ioctl.c @@ -172,6 +172,17 @@ static struct autofs_sb_info *autofs_dev_ioctl_sbi(struct file *f) return sbi; } +/* Return autofs dev ioctl version */ +static int autofs_dev_ioctl_version(struct file *fp, + struct autofs_sb_info *sbi, + struct autofs_dev_ioctl *param) +{ + /* This should have already been set. */ + param->ver_major = AUTOFS_DEV_IOCTL_VERSION_MAJOR; + param->ver_minor = AUTOFS_DEV_IOCTL_VERSION_MINOR; + return 0; +} + /* Return autofs module protocol version */ static int autofs_dev_ioctl_protover(struct file *fp, struct autofs_sb_info *sbi, @@ -590,7 +601,8 @@ static ioctl_fn lookup_dev_ioctl(unsigned int cmd) int cmd; ioctl_fn fn; } _ioctls[] = { - {cmd_idx(AUTOFS_DEV_IOCTL_VERSION_CMD), NULL}, + {cmd_idx(AUTOFS_DEV_IOCTL_VERSION_CMD), +autofs_dev_ioctl_version}, {cmd_idx(AUTOFS_DEV_IOCTL_PROTOVER_CMD), autofs_dev_ioctl_protover}, {cmd_idx(AUTOFS_DEV_IOCTL_PROTOSUBVER_CMD), @@ -655,10 +667,6 @@ static int _autofs_dev_ioctl(unsigned int command, if (err) goto out; - /* The validate routine above always sets the version */ - if (cmd == AUTOFS_DEV_IOCTL_VERSION_CMD) - goto done; - fn = lookup_dev_ioctl(cmd); if (!fn) { pr_warn("unknown command 0x%08x\n", command); @@ -672,9 +680,11 @@ static int _autofs_dev_ioctl(unsigned int command, /* * For obvious reasons the openmount can't have a file * descriptor yet. We don't take a reference to the -* file during close to allow for immediate release. +* file during close to allow for immediate release, +* and the same for retrieving ioctl version. */ - if (cmd != AUTOFS_DEV_IOCTL_OPENMOUNT_CMD && + if (cmd != AUTOFS_DEV_IOCTL_VERSION_CMD && + cmd != AUTOFS_DEV_IOCTL_OPENMOUNT_CMD && cmd != AUTOFS_DEV_IOCTL_CLOSEMOUNT_CMD) { fp = fget(param->ioctlfd); if (!fp) { @@ -707,7 +717,6 @@ cont: if (fp) fput(fp); -done: if (err >= 0 && copy_to_user(user, param, AUTOFS_DEV_IOCTL_SIZE)) err = -EFAULT; out:
[PATCH 10/18] autofs: Remove AUTOFS_DEVID_LEN
From: Tomohiro Kusumi This macro was never used by neither kernel nor userspace, and also doesn't represent "devid length" in bytes. (unless it was added to mean something else). Signed-off-by: Tomohiro Kusumi Signed-off-by: Ian Kent --- include/linux/auto_dev-ioctl.h |2 -- 1 file changed, 2 deletions(-) diff --git a/include/linux/auto_dev-ioctl.h b/include/linux/auto_dev-ioctl.h index 7caaf29..bf82e3a 100644 --- a/include/linux/auto_dev-ioctl.h +++ b/include/linux/auto_dev-ioctl.h @@ -18,8 +18,6 @@ #define AUTOFS_DEV_IOCTL_VERSION_MAJOR 1 #define AUTOFS_DEV_IOCTL_VERSION_MINOR 0 -#define AUTOFS_DEVID_LEN 16 - #define AUTOFS_DEV_IOCTL_SIZE sizeof(struct autofs_dev_ioctl) /*
[PATCH 13/18] autofs: Fix pr_debug() message
From: Tomohiro Kusumi This isn't a return value, so change the message to indicate the status is the result of may_umount(). (or locate pr_debug() after put_user() with the same message) Signed-off-by: Tomohiro Kusumi Signed-off-by: Ian Kent --- fs/autofs4/root.c |2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/fs/autofs4/root.c b/fs/autofs4/root.c index 1b0495a..d25c55f 100644 --- a/fs/autofs4/root.c +++ b/fs/autofs4/root.c @@ -840,7 +840,7 @@ static inline int autofs4_ask_umount(struct vfsmount *mnt, int __user *p) if (may_umount(mnt)) status = 1; - pr_debug("returning %d\n", status); + pr_debug("may umount %d\n", status); status = put_user(status, p);
[PATCH 18/18] autofs4 - move linux/auto_dev-ioctl.h to uapi/linux
Since linux/auto_dev-ioctl.h wasn't included in include/linux/Kbuild it wasn't moved to uapi/linux as part of the uapi series. Signed-off-by: Ian Kent --- include/linux/auto_dev-ioctl.h | 209 - include/uapi/linux/auto_dev-ioctl.h | 221 +++ 2 files changed, 222 insertions(+), 208 deletions(-) create mode 100644 include/uapi/linux/auto_dev-ioctl.h diff --git a/include/linux/auto_dev-ioctl.h b/include/linux/auto_dev-ioctl.h index bf82e3a..28c1505 100644 --- a/include/linux/auto_dev-ioctl.h +++ b/include/linux/auto_dev-ioctl.h @@ -10,212 +10,5 @@ #ifndef _LINUX_AUTO_DEV_IOCTL_H #define _LINUX_AUTO_DEV_IOCTL_H -#include -#include - -#define AUTOFS_DEVICE_NAME "autofs" - -#define AUTOFS_DEV_IOCTL_VERSION_MAJOR 1 -#define AUTOFS_DEV_IOCTL_VERSION_MINOR 0 - -#define AUTOFS_DEV_IOCTL_SIZE sizeof(struct autofs_dev_ioctl) - -/* - * An ioctl interface for autofs mount point control. - */ - -struct args_protover { - __u32 version; -}; - -struct args_protosubver { - __u32 sub_version; -}; - -struct args_openmount { - __u32 devid; -}; - -struct args_ready { - __u32 token; -}; - -struct args_fail { - __u32 token; - __s32 status; -}; - -struct args_setpipefd { - __s32 pipefd; -}; - -struct args_timeout { - __u64 timeout; -}; - -struct args_requester { - __u32 uid; - __u32 gid; -}; - -struct args_expire { - __u32 how; -}; - -struct args_askumount { - __u32 may_umount; -}; - -struct args_ismountpoint { - union { - struct args_in { - __u32 type; - } in; - struct args_out { - __u32 devid; - __u32 magic; - } out; - }; -}; - -/* - * All the ioctls use this structure. - * When sending a path size must account for the total length - * of the chunk of memory otherwise is is the size of the - * structure. - */ - -struct autofs_dev_ioctl { - __u32 ver_major; - __u32 ver_minor; - __u32 size; /* total size of data passed in -* including this struct */ - __s32 ioctlfd; /* automount command fd */ - - /* Command parameters */ - - union { - struct args_protoverprotover; - struct args_protosubver protosubver; - struct args_openmount openmount; - struct args_ready ready; - struct args_failfail; - struct args_setpipefd setpipefd; - struct args_timeout timeout; - struct args_requester requester; - struct args_expire expire; - struct args_askumount askumount; - struct args_ismountpointismountpoint; - }; - - char path[0]; -}; - -static inline void init_autofs_dev_ioctl(struct autofs_dev_ioctl *in) -{ - memset(in, 0, sizeof(struct autofs_dev_ioctl)); - in->ver_major = AUTOFS_DEV_IOCTL_VERSION_MAJOR; - in->ver_minor = AUTOFS_DEV_IOCTL_VERSION_MINOR; - in->size = sizeof(struct autofs_dev_ioctl); - in->ioctlfd = -1; -} - -/* - * If you change this make sure you make the corresponding change - * to autofs-dev-ioctl.c:lookup_ioctl() - */ -enum { - /* Get various version info */ - AUTOFS_DEV_IOCTL_VERSION_CMD = 0x71, - AUTOFS_DEV_IOCTL_PROTOVER_CMD, - AUTOFS_DEV_IOCTL_PROTOSUBVER_CMD, - - /* Open mount ioctl fd */ - AUTOFS_DEV_IOCTL_OPENMOUNT_CMD, - - /* Close mount ioctl fd */ - AUTOFS_DEV_IOCTL_CLOSEMOUNT_CMD, - - /* Mount/expire status returns */ - AUTOFS_DEV_IOCTL_READY_CMD, - AUTOFS_DEV_IOCTL_FAIL_CMD, - - /* Activate/deactivate autofs mount */ - AUTOFS_DEV_IOCTL_SETPIPEFD_CMD, - AUTOFS_DEV_IOCTL_CATATONIC_CMD, - - /* Expiry timeout */ - AUTOFS_DEV_IOCTL_TIMEOUT_CMD, - - /* Get mount last requesting uid and gid */ - AUTOFS_DEV_IOCTL_REQUESTER_CMD, - - /* Check for eligible expire candidates */ - AUTOFS_DEV_IOCTL_EXPIRE_CMD, - - /* Request busy status */ - AUTOFS_DEV_IOCTL_ASKUMOUNT_CMD, - - /* Check if path is a mountpoint */ - AUTOFS_DEV_IOCTL_ISMOUNTPOINT_CMD, -}; - -#define AUTOFS_IOCTL 0x93 - -#define AUTOFS_DEV_IOCTL_VERSION \ - _IOWR(AUTOFS_IOCTL, \ - AUTOFS_DEV_IOCTL_VERSION_CMD, struct autofs_dev_ioctl) - -#define AUTOFS_DEV_IOCTL_PROTOVER \ - _IOWR(AUTOFS_IOCTL, \ - AUTOFS_DEV_IOCTL_PROTOVER_CMD, struct autofs_dev_ioctl) - -#define AUTOFS_DEV_IOCTL_PROTOSUBVER \ - _IOWR(AUTOFS_IOCTL, \ - AUTOFS_DEV_IOCTL_PROTOSUBVER_CMD, struct autofs_dev_ioctl) - -#define AUTOFS_DEV_IOCTL_OPENMOUNT \ -
[PATCH 17/18] autofs: Move inclusion of linux/limits.h to uapi
From: Tomohiro Kusumilinux/limits.h should be included by uapi instead of linux/auto_fs.h so as not to cause compile error in userspace. # cat << EOF > ./test1.c > #include > #include > int main(void) { > return 0; > } > EOF # gcc -Wall -g ./test1.c In file included from ./test1.c:2:0: /usr/include/linux/auto_fs.h:54:12: error: 'NAME_MAX' undeclared here (not in a function) char name[NAME_MAX+1]; ^ Signed-off-by: Tomohiro Kusumi Signed-off-by: Ian Kent --- include/linux/auto_fs.h |1 - include/uapi/linux/auto_fs.h |1 + 2 files changed, 1 insertion(+), 1 deletion(-) diff --git a/include/linux/auto_fs.h b/include/linux/auto_fs.h index b4066bb..b8f814c 100644 --- a/include/linux/auto_fs.h +++ b/include/linux/auto_fs.h @@ -10,7 +10,6 @@ #define _LINUX_AUTO_FS_H #include -#include #include #include #endif /* _LINUX_AUTO_FS_H */ diff --git a/include/uapi/linux/auto_fs.h b/include/uapi/linux/auto_fs.h index 9175a1b..1bfc3ed 100644 --- a/include/uapi/linux/auto_fs.h +++ b/include/uapi/linux/auto_fs.h @@ -12,6 +12,7 @@ #define _UAPI_LINUX_AUTO_FS_H #include +#include #ifndef __KERNEL__ #include #endif /* __KERNEL__ */
[PATCH 16/18] autofs: Fix print format for ioctl warning message
From: Tomohiro KusumiAll other warnings use "cmd(0x%08x)" and this is the only one with "cmd(%d)". (below comes from my userspace debug program, but not automount daemon) [ 1139.905676] autofs4:pid:1640:check_dev_ioctl_version: ioctl control interface version mismatch: kernel(1.0), user(0.0), cmd(-1072131215) Signed-off-by: Tomohiro Kusumi Signed-off-by: Ian Kent --- fs/autofs4/dev-ioctl.c |2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/fs/autofs4/dev-ioctl.c b/fs/autofs4/dev-ioctl.c index 7289216..e89d6bb 100644 --- a/fs/autofs4/dev-ioctl.c +++ b/fs/autofs4/dev-ioctl.c @@ -75,7 +75,7 @@ static int check_dev_ioctl_version(int cmd, struct autofs_dev_ioctl *param) if ((param->ver_major != AUTOFS_DEV_IOCTL_VERSION_MAJOR) || (param->ver_minor > AUTOFS_DEV_IOCTL_VERSION_MINOR)) { pr_warn("ioctl control interface version mismatch: " - "kernel(%u.%u), user(%u.%u), cmd(%d)\n", + "kernel(%u.%u), user(%u.%u), cmd(0x%08x)\n", AUTOFS_DEV_IOCTL_VERSION_MAJOR, AUTOFS_DEV_IOCTL_VERSION_MINOR, param->ver_major, param->ver_minor, cmd);
[PATCH 14/18] autofs - fix dev ioctl number range check
The count of miscellaneous device ioctls in fs/autofs4/autofs_i.h is wrong. The number of ioctls is the difference between AUTOFS_DEV_IOCTL_VERSION_CMD and AUTOFS_DEV_IOCTL_ISMOUNTPOINT_CMD (14) not the difference between AUTOFS_IOC_COUNT and 11 (21). Signed-off-by: Ian KentCc: Tomohiro Kusumi --- fs/autofs4/autofs_i.h |3 ++- fs/autofs4/dev-ioctl.c |2 +- 2 files changed, 3 insertions(+), 2 deletions(-) diff --git a/fs/autofs4/autofs_i.h b/fs/autofs4/autofs_i.h index 73e3d38..dd71bd4 100644 --- a/fs/autofs4/autofs_i.h +++ b/fs/autofs4/autofs_i.h @@ -20,7 +20,8 @@ #define AUTOFS_IOC_COUNT 32 #define AUTOFS_DEV_IOCTL_IOC_FIRST (AUTOFS_DEV_IOCTL_VERSION) -#define AUTOFS_DEV_IOCTL_IOC_COUNT (AUTOFS_IOC_COUNT - 11) +#define AUTOFS_DEV_IOCTL_IOC_COUNT \ + (AUTOFS_DEV_IOCTL_VERSION_CMD - AUTOFS_DEV_IOCTL_ISMOUNTPOINT_CMD) #include #include diff --git a/fs/autofs4/dev-ioctl.c b/fs/autofs4/dev-ioctl.c index d47b35a..13e7517 100644 --- a/fs/autofs4/dev-ioctl.c +++ b/fs/autofs4/dev-ioctl.c @@ -642,7 +642,7 @@ static int _autofs_dev_ioctl(unsigned int command, cmd = _IOC_NR(command); if (_IOC_TYPE(command) != _IOC_TYPE(AUTOFS_DEV_IOCTL_IOC_FIRST) || - cmd - cmd_first >= AUTOFS_DEV_IOCTL_IOC_COUNT) { + cmd - cmd_first > AUTOFS_DEV_IOCTL_IOC_COUNT) { return -ENOTTY; }
[PATCH 17/18] autofs: Move inclusion of linux/limits.h to uapi
From: Tomohiro Kusumi linux/limits.h should be included by uapi instead of linux/auto_fs.h so as not to cause compile error in userspace. # cat << EOF > ./test1.c > #include > #include > int main(void) { > return 0; > } > EOF # gcc -Wall -g ./test1.c In file included from ./test1.c:2:0: /usr/include/linux/auto_fs.h:54:12: error: 'NAME_MAX' undeclared here (not in a function) char name[NAME_MAX+1]; ^ Signed-off-by: Tomohiro Kusumi Signed-off-by: Ian Kent --- include/linux/auto_fs.h |1 - include/uapi/linux/auto_fs.h |1 + 2 files changed, 1 insertion(+), 1 deletion(-) diff --git a/include/linux/auto_fs.h b/include/linux/auto_fs.h index b4066bb..b8f814c 100644 --- a/include/linux/auto_fs.h +++ b/include/linux/auto_fs.h @@ -10,7 +10,6 @@ #define _LINUX_AUTO_FS_H #include -#include #include #include #endif /* _LINUX_AUTO_FS_H */ diff --git a/include/uapi/linux/auto_fs.h b/include/uapi/linux/auto_fs.h index 9175a1b..1bfc3ed 100644 --- a/include/uapi/linux/auto_fs.h +++ b/include/uapi/linux/auto_fs.h @@ -12,6 +12,7 @@ #define _UAPI_LINUX_AUTO_FS_H #include +#include #ifndef __KERNEL__ #include #endif /* __KERNEL__ */
[PATCH 16/18] autofs: Fix print format for ioctl warning message
From: Tomohiro Kusumi All other warnings use "cmd(0x%08x)" and this is the only one with "cmd(%d)". (below comes from my userspace debug program, but not automount daemon) [ 1139.905676] autofs4:pid:1640:check_dev_ioctl_version: ioctl control interface version mismatch: kernel(1.0), user(0.0), cmd(-1072131215) Signed-off-by: Tomohiro Kusumi Signed-off-by: Ian Kent --- fs/autofs4/dev-ioctl.c |2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/fs/autofs4/dev-ioctl.c b/fs/autofs4/dev-ioctl.c index 7289216..e89d6bb 100644 --- a/fs/autofs4/dev-ioctl.c +++ b/fs/autofs4/dev-ioctl.c @@ -75,7 +75,7 @@ static int check_dev_ioctl_version(int cmd, struct autofs_dev_ioctl *param) if ((param->ver_major != AUTOFS_DEV_IOCTL_VERSION_MAJOR) || (param->ver_minor > AUTOFS_DEV_IOCTL_VERSION_MINOR)) { pr_warn("ioctl control interface version mismatch: " - "kernel(%u.%u), user(%u.%u), cmd(%d)\n", + "kernel(%u.%u), user(%u.%u), cmd(0x%08x)\n", AUTOFS_DEV_IOCTL_VERSION_MAJOR, AUTOFS_DEV_IOCTL_VERSION_MINOR, param->ver_major, param->ver_minor, cmd);
[PATCH 14/18] autofs - fix dev ioctl number range check
The count of miscellaneous device ioctls in fs/autofs4/autofs_i.h is wrong. The number of ioctls is the difference between AUTOFS_DEV_IOCTL_VERSION_CMD and AUTOFS_DEV_IOCTL_ISMOUNTPOINT_CMD (14) not the difference between AUTOFS_IOC_COUNT and 11 (21). Signed-off-by: Ian Kent Cc: Tomohiro Kusumi --- fs/autofs4/autofs_i.h |3 ++- fs/autofs4/dev-ioctl.c |2 +- 2 files changed, 3 insertions(+), 2 deletions(-) diff --git a/fs/autofs4/autofs_i.h b/fs/autofs4/autofs_i.h index 73e3d38..dd71bd4 100644 --- a/fs/autofs4/autofs_i.h +++ b/fs/autofs4/autofs_i.h @@ -20,7 +20,8 @@ #define AUTOFS_IOC_COUNT 32 #define AUTOFS_DEV_IOCTL_IOC_FIRST (AUTOFS_DEV_IOCTL_VERSION) -#define AUTOFS_DEV_IOCTL_IOC_COUNT (AUTOFS_IOC_COUNT - 11) +#define AUTOFS_DEV_IOCTL_IOC_COUNT \ + (AUTOFS_DEV_IOCTL_VERSION_CMD - AUTOFS_DEV_IOCTL_ISMOUNTPOINT_CMD) #include #include diff --git a/fs/autofs4/dev-ioctl.c b/fs/autofs4/dev-ioctl.c index d47b35a..13e7517 100644 --- a/fs/autofs4/dev-ioctl.c +++ b/fs/autofs4/dev-ioctl.c @@ -642,7 +642,7 @@ static int _autofs_dev_ioctl(unsigned int command, cmd = _IOC_NR(command); if (_IOC_TYPE(command) != _IOC_TYPE(AUTOFS_DEV_IOCTL_IOC_FIRST) || - cmd - cmd_first >= AUTOFS_DEV_IOCTL_IOC_COUNT) { + cmd - cmd_first > AUTOFS_DEV_IOCTL_IOC_COUNT) { return -ENOTTY; }
[PATCH 01/18] autofs: Fix typos in Documentation/filesystems/autofs4.txt
From: Tomohiro Kusumiplus minor whitespace fixes. Signed-off-by: Tomohiro Kusumi Signed-off-by: Ian Kent --- Documentation/filesystems/autofs4.txt |8 1 file changed, 4 insertions(+), 4 deletions(-) diff --git a/Documentation/filesystems/autofs4.txt b/Documentation/filesystems/autofs4.txt index 39d02e1..8fac3fe 100644 --- a/Documentation/filesystems/autofs4.txt +++ b/Documentation/filesystems/autofs4.txt @@ -203,9 +203,9 @@ initiated or is being considered, otherwise it returns 0. Mountpoint expiry - -The VFS has a mechansim for automatically expiring unused mounts, +The VFS has a mechanism for automatically expiring unused mounts, much as it can expire any unused dentry information from the dcache. -This is guided by the MNT_SHRINKABLE flag. This only applies to +This is guided by the MNT_SHRINKABLE flag. This only applies to mounts that were created by `d_automount()` returning a filesystem to be mounted. As autofs doesn't return such a filesystem but leaves the mounting to the automount daemon, it must involve the automount daemon @@ -298,7 +298,7 @@ remove directories and symlinks using normal filesystem operations. autofs knows whether a process requesting some operation is the daemon or not based on its process-group id number (see getpgid(1)). -When an autofs filesystem it mounted the pgid of the mounting +When an autofs filesystem is mounted the pgid of the mounting processes is recorded unless the "pgrp=" option is given, in which case that number is recorded instead. Any request arriving from a process in that process group is considered to come from the daemon. @@ -450,7 +450,7 @@ Commands are: numbers for existing filesystems can be found in `/proc/self/mountinfo`. - **AUTOFS_DEV_IOCTL_CLOSEMOUNT_CMD**: same as `close(ioctlfd)`. -- **AUTOFS_DEV_IOCTL_SETPIPEFD_CMD**: if the filesystem is in +- **AUTOFS_DEV_IOCTL_SETPIPEFD_CMD**: if the filesystem is in catatonic mode, this can provide the write end of a new pipe in `arg1` to re-establish communication with a daemon. The process group of the calling process is used to identify the
[PATCH 06/18] autofs - remove ino free in autofs4_dir_symlink()
The inode allocation failure case in autofs4_dir_symlink() frees the autofs dentry info of the dentry without setting ->d_fsdata to NULL. That could lead to a double free so just get rid of the free and leave it to ->d_release(). Signed-off-by: Ian KentCc: Tomohiro Kusumi --- fs/autofs4/root.c |2 -- 1 file changed, 2 deletions(-) diff --git a/fs/autofs4/root.c b/fs/autofs4/root.c index fa84bb8..1b0495a 100644 --- a/fs/autofs4/root.c +++ b/fs/autofs4/root.c @@ -577,8 +577,6 @@ static int autofs4_dir_symlink(struct inode *dir, inode = autofs4_get_inode(dir->i_sb, S_IFLNK | 0555); if (!inode) { kfree(cp); - if (!dentry->d_fsdata) - kfree(ino); return -ENOMEM; } inode->i_private = cp;