Re: [RFC PATCH 7/9] powerpc: Add support to mask perf interrupts
On Tuesday 26 July 2016 12:00 PM, Nicholas Piggin wrote: On Tue, 26 Jul 2016 11:55:51 +0530 Madhavan Srinivasan wrote: On Tuesday 26 July 2016 11:16 AM, Nicholas Piggin wrote: On Mon, 25 Jul 2016 20:22:20 +0530 Madhavan Srinivasan wrote:

To support masking of the PMI interrupts, a couple of new interrupt handler macros are added: MASKABLE_EXCEPTION_PSERIES_OOL and MASKABLE_RELON_EXCEPTION_PSERIES_OOL. These are needed to include the SOFTEN_TEST and implement the support in both host and guest kernels.

A couple of new irq #defines, "PACA_IRQ_PMI" and "SOFTEN_VALUE_0xf0*", are added for use in the exception code to check for PMI interrupts.

The __SOFTEN_TEST macro is modified to support the PMI interrupt. The present __SOFTEN_TEST code loads soft_enabled from the paca and checks whether to call the masked_interrupt handler code. To support both the current behaviour and PMI masking, these changes are added:

1) The current LR register content is saved in R11
2) The "bge" branch operation is changed to "bgel"
3) R11 is restored to LR

Reason: to retain PMI-as-NMI behaviour for a flag state of 1, we save the LR register value in R11 and branch to the "masked_interrupt" handler with LR updated. In the "masked_interrupt" handler, we check for the "SOFTEN_VALUE_*" value in R10 for PMI and branch back with "blr" if PMI.

To mask PMI for a flag value >1, masked_interrupt avoids the above check and continues to execute the masked_interrupt code: it disables MSR[EE] and updates irq_happened with the PMI info.

Finally, the save of R11 is moved before the call to SOFTEN_TEST in the __EXCEPTION_PROLOG_1 macro to support saving of the LR value in SOFTEN_TEST.
Signed-off-by: Madhavan Srinivasan
---
 arch/powerpc/include/asm/exception-64s.h | 22 --
 arch/powerpc/include/asm/hw_irq.h        |  1 +
 arch/powerpc/kernel/exceptions-64s.S     | 27 ---
 3 files changed, 45 insertions(+), 5 deletions(-)

diff --git a/arch/powerpc/include/asm/exception-64s.h b/arch/powerpc/include/asm/exception-64s.h
index 44d3f539d8a5..c951b7ab5108 100644
--- a/arch/powerpc/include/asm/exception-64s.h
+++ b/arch/powerpc/include/asm/exception-64s.h
@@ -166,8 +166,8 @@ END_FTR_SECTION_NESTED(ftr,ftr,943)
 	OPT_SAVE_REG_TO_PACA(area+EX_CFAR, r10, CPU_FTR_CFAR);	\
 	SAVE_CTR(r10, area);					\
 	mfcr	r9;						\
-	extra(vec);						\
 	std	r11,area+EX_R11(r13);				\
+	extra(vec);						\
 	std	r12,area+EX_R12(r13);				\
 	GET_SCRATCH0(r10);					\
 	std	r10,area+EX_R13(r13)
@@ -403,12 +403,17 @@ label##_relon_hv:				\
 #define SOFTEN_VALUE_0xe82	PACA_IRQ_DBELL
 #define SOFTEN_VALUE_0xe60	PACA_IRQ_HMI
 #define SOFTEN_VALUE_0xe62	PACA_IRQ_HMI
+#define SOFTEN_VALUE_0xf01	PACA_IRQ_PMI
+#define SOFTEN_VALUE_0xf00	PACA_IRQ_PMI
 #define __SOFTEN_TEST(h, vec)					\
 	lbz	r10,PACASOFTIRQEN(r13);				\
 	cmpwi	r10,LAZY_INTERRUPT_DISABLED;			\
 	li	r10,SOFTEN_VALUE_##vec;				\
-	bge	masked_##h##interrupt

At which point, can't we pass in the interrupt level we want to mask for to SOFTEN_TEST, and avoid all these extra code changes?

IIUC, we do pass the interrupt info to SOFTEN_TEST. In case of a PMU interrupt we will have the value as PACA_IRQ_PMI.

PMU masked interrupt will compare with SOFTEN_LEVEL_PMU, existing interrupts will compare with SOFTEN_LEVEL_EE (or whatever suitable names there are).

+	mflr	r11;						\
+	bgel	masked_##h##interrupt;				\
+	mtlr	r11;

This might corrupt return prediction when masked_interrupt does not return.

Hmm this is a valid point.

I guess that's uncommon case though.

No, it is. The kernel mostly uses irq_disable with (1) today, and only in specific cases do we disable all the interrupts. So we are going to return almost always when irqs are soft disabled.
Since we need to support the PMIs as NMI when the irq disable level is 1, we need to skip masked_interrupt.

As you mentioned, if we have a separate macro (SOFTEN_TEST_PMU), these can be avoided, but then it is code replication and we may need to change some more macros. But this is interesting; let me work on this.

I would really prefer to do that, even if it means a little more code.

Another option is to give an additional parameter to the MASKABLE variants of the exception handlers, which you can pass the "mask level" into. I think it's not a bad idea to make it explicit even for the existing ones so it's clear which level they are masked at.

The issue here is that the masked_interrupt function is not part of the interrupt vector code (__EXCEPTION_PROLOG_1). So in case of PMI, if we enter the masked_interrupt function, we need to know where to return to continue in case of NMI.

Maddy

Thanks,
Re: [RFC PATCH 9/9] powerpc: rewrite local_t using soft_irq
On Tuesday 26 July 2016 11:23 AM, Nicholas Piggin wrote: On Mon, 25 Jul 2016 20:22:22 +0530 Madhavan Srinivasan wrote: https://lkml.org/lkml/2008/12/16/450

Modifications to Rusty's benchmark code:
- Executed only local_t test

Here are the values with the patch.

Time in ns per iteration

Local_t       Without Patch   With Patch
_inc               28             8
_add               28             8
_read               3             3
_add_return        28             7

Tested the patch in a
- pSeries LPAR (with perf record)

Very nice. I'd like to see these patches get in. We can probably use the feature in other places too.

Thanks for the review. Maddy

Thanks, Nick
___
Linuxppc-dev mailing list
Linuxppc-dev@lists.ozlabs.org
https://lists.ozlabs.org/listinfo/linuxppc-dev
Re: [RFC PATCH 8/9] powerpc: Support to replay PMIs
On Tuesday 26 July 2016 11:20 AM, Nicholas Piggin wrote: On Mon, 25 Jul 2016 20:22:21 +0530 Madhavan Srinivasan wrote:

Code to replay the Performance Monitoring Interrupts (PMI). In the masked_interrupt handler, for PMIs we reset MSR[EE] and return. This is due to the fact that PMIs are level triggered. In __check_irq_replay(), we enable MSR[EE], which will fire the interrupt for us.

The patch also adds a new arch_local_irq_disable_var() variant. The new variant takes an input value to write to paca->soft_enabled. This will be used in a following patch to implement the tri-state value for soft_enabled.

Same comment also applies about patches being standalone transformations that work before and after. Some of these can be squashed together I think.

Sure.

Signed-off-by: Madhavan Srinivasan
---
 arch/powerpc/include/asm/hw_irq.h | 14 ++
 arch/powerpc/kernel/irq.c         |  9 -
 2 files changed, 22 insertions(+), 1 deletion(-)

diff --git a/arch/powerpc/include/asm/hw_irq.h b/arch/powerpc/include/asm/hw_irq.h
index cc69dde6eb84..863179654452 100644
--- a/arch/powerpc/include/asm/hw_irq.h
+++ b/arch/powerpc/include/asm/hw_irq.h
@@ -81,6 +81,20 @@ static inline unsigned long arch_local_irq_disable(void)
 	return flags;
 }
 
+static inline unsigned long arch_local_irq_disable_var(int value)
+{
+	unsigned long flags, zero;
+
+	asm volatile(
+		"li %1,%3; lbz %0,%2(13); stb %1,%2(13)"
+		: "=r" (flags), "=&r" (zero)
+		: "i" (offsetof(struct paca_struct, soft_enabled)),\
+		  "i" (value)
+		: "memory");
+
+	return flags;
+}

arch_ function suggests it is arch implementation of a generic kernel function or something. I think our soft interrupt levels are just used in powerpc specific code. The name could also be a little more descriptive. I would have our internal function be something like soft_irq_set_level(), and then the arch disable just sets to the appropriate level as it does today.
The PMU disable level could be implemented in powerpc specific header with local_irq_and_pmu_disable() or something like that. Yes. will do. Thanks, Nick
Re: [PATCH 2/2] powerpc/64: Do load of PACAKBASE in LOAD_HANDLER
On Tue, 26 Jul 2016 15:29:30 +1000 Michael Ellerman wrote: > The LOAD_HANDLER macro requires that you have previously loaded "reg" > with PACAKBASE. Although that gives callers flexibility to get > PACAKBASE in some interesting way, none of the callers actually do > that. So fold the load of PACAKBASE into the macro, making it simpler > for callers to use correctly. > > Signed-off-by: Michael Ellerman I don't see any problem with this. Reviewed-by: Nick Piggin
Re: [RFC PATCH 7/9] powerpc: Add support to mask perf interrupts
On Tue, 26 Jul 2016 11:55:51 +0530 Madhavan Srinivasan wrote:
> On Tuesday 26 July 2016 11:16 AM, Nicholas Piggin wrote:
> > On Mon, 25 Jul 2016 20:22:20 +0530 Madhavan Srinivasan wrote:
> >
> >> To support masking of the PMI interrupts, a couple of new interrupt
> >> handler macros are added: MASKABLE_EXCEPTION_PSERIES_OOL and
> >> MASKABLE_RELON_EXCEPTION_PSERIES_OOL. These are needed to include
> >> the SOFTEN_TEST and implement the support in both host and guest
> >> kernels.
> >>
> >> A couple of new irq #defines, "PACA_IRQ_PMI" and "SOFTEN_VALUE_0xf0*",
> >> are added for use in the exception code to check for PMI interrupts.
> >>
> >> The __SOFTEN_TEST macro is modified to support the PMI interrupt.
> >> The present __SOFTEN_TEST code loads soft_enabled from the paca and
> >> checks whether to call the masked_interrupt handler code. To support
> >> both the current behaviour and PMI masking, these changes are added:
> >>
> >> 1) The current LR register content is saved in R11
> >> 2) The "bge" branch operation is changed to "bgel"
> >> 3) R11 is restored to LR
> >>
> >> Reason:
> >>
> >> To retain PMI-as-NMI behaviour for a flag state of 1, we save the LR
> >> register value in R11 and branch to the "masked_interrupt" handler
> >> with LR updated. In the "masked_interrupt" handler, we check for the
> >> "SOFTEN_VALUE_*" value in R10 for PMI and branch back with "blr" if
> >> PMI.
> >>
> >> To mask PMI for a flag value >1, masked_interrupt avoids the above
> >> check and continues to execute the masked_interrupt code: it
> >> disables MSR[EE] and updates irq_happened with the PMI info.
> >>
> >> Finally, the save of R11 is moved before the call to SOFTEN_TEST in
> >> the __EXCEPTION_PROLOG_1 macro to support saving of the LR value in
> >> SOFTEN_TEST.
> >>
> >> Signed-off-by: Madhavan Srinivasan
> >> ---
> >>  arch/powerpc/include/asm/exception-64s.h | 22 --
> >>  arch/powerpc/include/asm/hw_irq.h        |  1 +
> >>  arch/powerpc/kernel/exceptions-64s.S     | 27 ---
> >>  3 files changed, 45 insertions(+), 5 deletions(-)
> >>
> >> diff --git a/arch/powerpc/include/asm/exception-64s.h b/arch/powerpc/include/asm/exception-64s.h
> >> index 44d3f539d8a5..c951b7ab5108 100644
> >> --- a/arch/powerpc/include/asm/exception-64s.h
> >> +++ b/arch/powerpc/include/asm/exception-64s.h
> >> @@ -166,8 +166,8 @@ END_FTR_SECTION_NESTED(ftr,ftr,943)
> >>  	OPT_SAVE_REG_TO_PACA(area+EX_CFAR, r10, CPU_FTR_CFAR);	\
> >>  	SAVE_CTR(r10, area);					\
> >>  	mfcr	r9;						\
> >> -	extra(vec);						\
> >>  	std	r11,area+EX_R11(r13);				\
> >> +	extra(vec);						\
> >>  	std	r12,area+EX_R12(r13);				\
> >>  	GET_SCRATCH0(r10);					\
> >>  	std	r10,area+EX_R13(r13)
> >> @@ -403,12 +403,17 @@ label##_relon_hv:				\
> >>  #define SOFTEN_VALUE_0xe82	PACA_IRQ_DBELL
> >>  #define SOFTEN_VALUE_0xe60	PACA_IRQ_HMI
> >>  #define SOFTEN_VALUE_0xe62	PACA_IRQ_HMI
> >> +#define SOFTEN_VALUE_0xf01	PACA_IRQ_PMI
> >> +#define SOFTEN_VALUE_0xf00	PACA_IRQ_PMI
> >>  #define __SOFTEN_TEST(h, vec)					\
> >>  	lbz	r10,PACASOFTIRQEN(r13);				\
> >>  	cmpwi	r10,LAZY_INTERRUPT_DISABLED;			\
> >>  	li	r10,SOFTEN_VALUE_##vec;				\
> >> -	bge	masked_##h##interrupt
> > At which point, can't we pass in the interrupt level we want to mask
> > for to SOFTEN_TEST, and avoid all these extra code changes?
> IIUC, we do pass the interrupt info to SOFTEN_TEST. In case of a PMU
> interrupt we will have the value as PACA_IRQ_PMI.
> > PMU masked interrupt will compare with SOFTEN_LEVEL_PMU, existing
> > interrupts will compare with SOFTEN_LEVEL_EE (or whatever suitable
> > names there are).
> >> +	mflr	r11;						\
> >> +	bgel	masked_##h##interrupt;				\
> >> +	mtlr	r11;
> > This might corrupt return prediction when masked_interrupt does not
> Hmm this is a valid point.
> > return.
> > I guess that's uncommon case though.
>
> No, it is. The kernel mostly uses irq_disable with (1) today, and only
> in specific cases do we disable all the interrupts. So we are going to
> return almost always when irqs are soft disabled.
>
> Since we need to support the PMIs as NMI when the irq disable level is
> 1, we need to skip masked_interrupt.
>
> As you mentioned, if we have a separate macro (SOFTEN_TEST_PMU), these
> can be avoided, but then it is code replication and we may need to
> change some more macros. But this is interesting; let me work on this.

I would really prefer to do that, even if it means a little more code.

Another option is to give an additional parameter to the MASKABLE variants of the exception handlers, which you can pass the "mask level" into.
Re: [RFC PATCH 7/9] powerpc: Add support to mask perf interrupts
On Tuesday 26 July 2016 11:16 AM, Nicholas Piggin wrote: On Mon, 25 Jul 2016 20:22:20 +0530 Madhavan Srinivasan wrote:

To support masking of the PMI interrupts, a couple of new interrupt handler macros are added: MASKABLE_EXCEPTION_PSERIES_OOL and MASKABLE_RELON_EXCEPTION_PSERIES_OOL. These are needed to include the SOFTEN_TEST and implement the support in both host and guest kernels.

A couple of new irq #defines, "PACA_IRQ_PMI" and "SOFTEN_VALUE_0xf0*", are added for use in the exception code to check for PMI interrupts.

The __SOFTEN_TEST macro is modified to support the PMI interrupt. The present __SOFTEN_TEST code loads soft_enabled from the paca and checks whether to call the masked_interrupt handler code. To support both the current behaviour and PMI masking, these changes are added:

1) The current LR register content is saved in R11
2) The "bge" branch operation is changed to "bgel"
3) R11 is restored to LR

Reason: to retain PMI-as-NMI behaviour for a flag state of 1, we save the LR register value in R11 and branch to the "masked_interrupt" handler with LR updated. In the "masked_interrupt" handler, we check for the "SOFTEN_VALUE_*" value in R10 for PMI and branch back with "blr" if PMI.

To mask PMI for a flag value >1, masked_interrupt avoids the above check and continues to execute the masked_interrupt code: it disables MSR[EE] and updates irq_happened with the PMI info.

Finally, the save of R11 is moved before the call to SOFTEN_TEST in the __EXCEPTION_PROLOG_1 macro to support saving of the LR value in SOFTEN_TEST.
Signed-off-by: Madhavan Srinivasan
---
 arch/powerpc/include/asm/exception-64s.h | 22 --
 arch/powerpc/include/asm/hw_irq.h        |  1 +
 arch/powerpc/kernel/exceptions-64s.S     | 27 ---
 3 files changed, 45 insertions(+), 5 deletions(-)

diff --git a/arch/powerpc/include/asm/exception-64s.h b/arch/powerpc/include/asm/exception-64s.h
index 44d3f539d8a5..c951b7ab5108 100644
--- a/arch/powerpc/include/asm/exception-64s.h
+++ b/arch/powerpc/include/asm/exception-64s.h
@@ -166,8 +166,8 @@ END_FTR_SECTION_NESTED(ftr,ftr,943)
 	OPT_SAVE_REG_TO_PACA(area+EX_CFAR, r10, CPU_FTR_CFAR);	\
 	SAVE_CTR(r10, area);					\
 	mfcr	r9;						\
-	extra(vec);						\
 	std	r11,area+EX_R11(r13);				\
+	extra(vec);						\
 	std	r12,area+EX_R12(r13);				\
 	GET_SCRATCH0(r10);					\
 	std	r10,area+EX_R13(r13)
@@ -403,12 +403,17 @@ label##_relon_hv:				\
 #define SOFTEN_VALUE_0xe82	PACA_IRQ_DBELL
 #define SOFTEN_VALUE_0xe60	PACA_IRQ_HMI
 #define SOFTEN_VALUE_0xe62	PACA_IRQ_HMI
+#define SOFTEN_VALUE_0xf01	PACA_IRQ_PMI
+#define SOFTEN_VALUE_0xf00	PACA_IRQ_PMI
 #define __SOFTEN_TEST(h, vec)					\
 	lbz	r10,PACASOFTIRQEN(r13);				\
 	cmpwi	r10,LAZY_INTERRUPT_DISABLED;			\
 	li	r10,SOFTEN_VALUE_##vec;				\
-	bge	masked_##h##interrupt

At which point, can't we pass in the interrupt level we want to mask for to SOFTEN_TEST, and avoid all these extra code changes?

IIUC, we do pass the interrupt info to SOFTEN_TEST. In case of a PMU interrupt we will have the value as PACA_IRQ_PMI.

PMU masked interrupt will compare with SOFTEN_LEVEL_PMU, existing interrupts will compare with SOFTEN_LEVEL_EE (or whatever suitable names there are).

+	mflr	r11;						\
+	bgel	masked_##h##interrupt;				\
+	mtlr	r11;

This might corrupt return prediction when masked_interrupt does not return.

Hmm this is a valid point.

I guess that's uncommon case though.

No, it is. The kernel mostly uses irq_disable with (1) today, and only in specific cases do we disable all the interrupts. So we are going to return almost always when irqs are soft disabled.
Since we need to support the PMIs as NMI when the irq disable level is 1, we need to skip masked_interrupt.

As you mentioned, if we have a separate macro (SOFTEN_TEST_PMU), these can be avoided, but then it is code replication and we may need to change some more macros. But this is interesting; let me work on this.

Maddy

But I think we can avoid this if we do the above, no?

Thanks,
Nick
Re: [PATCH 1/2] powerpc/64: Correct comment on LOAD_HANDLER()
On Tue, 26 Jul 2016 15:29:29 +1000 Michael Ellerman wrote:
> The comment for LOAD_HANDLER() was wrong. The part about kdump has not
> been true since 1f6a93e4c35e ("powerpc: Make it possible to move the
> interrupt handlers away from the kernel").
>
> Describe how it currently works, and combine the two separate comments
> into one.
>
> Signed-off-by: Michael Ellerman

Reviewed-by: Nick Piggin

> ---
>  arch/powerpc/include/asm/exception-64s.h | 8
>  1 file changed, 4 insertions(+), 4 deletions(-)
>
> diff --git a/arch/powerpc/include/asm/exception-64s.h b/arch/powerpc/include/asm/exception-64s.h
> index 93ae809fe5ea..4ff3e2f16b5d 100644
> --- a/arch/powerpc/include/asm/exception-64s.h
> +++ b/arch/powerpc/include/asm/exception-64s.h
> @@ -84,12 +84,12 @@
>
>  /*
>   * We're short on space and time in the exception prolog, so we can't
> - * use the normal SET_REG_IMMEDIATE macro. Normally we just need the
> - * low halfword of the address, but for Kdump we need the whole low
> - * word.
> + * use the normal LOAD_REG_IMMEDIATE macro to load the address of label.
> + * Instead we get the base of the kernel from paca->kernelbase and or in the low
> + * part of label. This requires that the label be within 64KB of kernelbase, and
> + * that kernelbase be 64K aligned.
>   */
>  #define LOAD_HANDLER(reg, label)					\
> -	/* Handlers must be within 64K of kbase, which must be 64k aligned */ \
>  	ori	reg,reg,(label)-_stext;	/* virt addr of handler ... */
>
>  /* Exception register prefixes */
Re: [RFC PATCH 1/9] Add #defs for paca->soft_enabled flags
On Tue, 26 Jul 2016 11:35:16 +0530 Madhavan Srinivasan wrote:
> On Tuesday 26 July 2016 10:57 AM, Nicholas Piggin wrote:
> > On Mon, 25 Jul 2016 20:22:14 +0530 Madhavan Srinivasan wrote:
> >
> >> Two #defines, LAZY_INTERRUPT_ENABLED and LAZY_INTERRUPT_DISABLED,
> >> are added to be used when updating paca->soft_enabled.
> > This is a very nice patchset, but can this not be a new name?
> Thanks, but idea is from ben :)
> Regarding the name, I looked at the initial patchset posted by
> paul and took the name from it :).
>
> But will work on that, any suggestion for the name?

I don't have a strong preference. LAZY_* is not horrible itself, it's just that the softe variant is used elsewhere. I don't mind if you rename softe to something else completely (although Ben might).

Allow me to apply the first coat of paint to the bikeshed:

irq_disable_level
IRQ_DISABLE_LEVEL_NONE
IRQ_DISABLE_LEVEL_LINUX
IRQ_DISABLE_LEVEL_PMU
Re: [RFC PATCH 6/9] powerpc: modify __SOFTEN_TEST to support tri-state soft_enabled flag
On Tuesday 26 July 2016 11:11 AM, Nicholas Piggin wrote: On Mon, 25 Jul 2016 20:22:19 +0530 Madhavan Srinivasan wrote: Foundation patch to support checking of the new flag for "paca->soft_enabled". Modify the condition checking for "soft_enabled" from "equal" to "greater than or equal to".

Rather than a "tri-state" and the mystery "2" state, can you make a #define for that guy, and use levels.

Yes. Will do. Will wait for any feedback on the macro name for patch 1 of this series. Maddy

0 -> all enabled
1 -> "linux" interrupts disabled
2 -> PMU also disabled
etc.

Thanks, Nick
Re: [RFC PATCH 5/9] powerpc: reverse the soft_enable logic
On Tuesday 26 July 2016 11:01 AM, Nicholas Piggin wrote: On Mon, 25 Jul 2016 20:22:18 +0530 Madhavan Srinivasan wrote:

"paca->soft_enabled" is used as a flag to mask some of the interrupts. Currently supported flag values and their details:

soft_enabled    MSR[EE]
0               0          Disabled (PMI and HMI not masked)
1               1          Enabled

"paca->soft_enabled" is initialized to 1 to mark the interrupts as enabled. arch_local_irq_disable() will toggle the value when interrupts need to be disabled. At this point, the interrupts are not actually disabled; instead, the interrupt vector has code to check for the flag and mask the interrupt when it occurs. By "mask it", it updates paca->irq_happened and returns. arch_local_irq_restore() is called to re-enable interrupts, which checks and replays interrupts if any occurred.

Now, as mentioned, the current logic does not mask "performance monitoring interrupts", and PMIs are implemented as NMIs. But this patchset depends on local_irq_* for a successful local_* update. Meaning, mask all possible interrupts during the local_* update and replay them after the update.

So the idea here is to reverse the "paca->soft_enabled" logic. New values and details:

soft_enabled    MSR[EE]
1               0          Disabled (PMI and HMI not masked)
0               1          Enabled

The reason for this change is to create the foundation for a third flag value, "2", for "soft_enabled", to add support to mask PMIs. When arch_irq_disable_* is called with a value of "2", PMI interrupts are masked; when called with a value of "1", PMIs are not masked.

With the new flag value for "soft_enabled", the states look like:

soft_enabled    MSR[EE]
2               0          Disabled, PMIs also
1               0          Disabled (PMI and HMI not masked)
0               1          Enabled

And the interrupt handler code has been modified to check for a "greater than or equal to 1" condition instead.

This bit of the patch seems to have been moved into another part of the series. Ideally (unless there is a good reason), it is nice to have each individual patch result in a working kernel before and after. Agreed.
But I needed to reason out the change and hence added all the info here; will edit the info in the next version. Maddy

Nice way to avoid adding more branches though.

Thanks, Nick
Re: [RFC PATCH 1/9] Add #defs for paca->soft_enabled flags
On Tuesday 26 July 2016 10:57 AM, Nicholas Piggin wrote: On Mon, 25 Jul 2016 20:22:14 +0530 Madhavan Srinivasan wrote: Two #defines, LAZY_INTERRUPT_ENABLED and LAZY_INTERRUPT_DISABLED, are added to be used when updating paca->soft_enabled. This is a very nice patchset, but can this not be a new name? Thanks, but idea is from ben :) Regarding the name, I looked at the initial patchset posted by paul and took the name from it :). But will work on that, any suggestion for the name? Maddy

We use "soft enabled/disabled" everywhere for it. I think lazy is an implementation detail anyway because some interrupts don't cause a hard disable at all.

Thanks, Nick
Re: [RFC PATCH 9/9] powerpc: rewrite local_t using soft_irq
On Mon, 25 Jul 2016 20:22:22 +0530 Madhavan Srinivasan wrote:
> https://lkml.org/lkml/2008/12/16/450
>
> Modifications to Rusty's benchmark code:
> - Executed only local_t test
>
> Here are the values with the patch.
>
> Time in ns per iteration
>
> Local_t       Without Patch   With Patch
>
> _inc               28             8
> _add               28             8
> _read               3             3
> _add_return        28             7
>
> Tested the patch in a
> - pSeries LPAR (with perf record)

Very nice. I'd like to see these patches get in. We can probably use the feature in other places too.

Thanks,
Nick
Re: [RFC PATCH 8/9] powerpc: Support to replay PMIs
On Mon, 25 Jul 2016 20:22:21 +0530 Madhavan Srinivasan wrote:
> Code to replay the Performance Monitoring Interrupts (PMI).
> In the masked_interrupt handler, for PMIs we reset MSR[EE]
> and return. This is due to the fact that PMIs are level triggered.
> In __check_irq_replay(), we enable MSR[EE], which will
> fire the interrupt for us.
>
> The patch also adds a new arch_local_irq_disable_var() variant. The
> new variant takes an input value to write to paca->soft_enabled.
> This will be used in a following patch to implement the tri-state
> value for soft_enabled.

Same comment also applies about patches being standalone transformations that work before and after. Some of these can be squashed together I think.

> Signed-off-by: Madhavan Srinivasan
> ---
>  arch/powerpc/include/asm/hw_irq.h | 14 ++
>  arch/powerpc/kernel/irq.c         |  9 -
>  2 files changed, 22 insertions(+), 1 deletion(-)
>
> diff --git a/arch/powerpc/include/asm/hw_irq.h b/arch/powerpc/include/asm/hw_irq.h
> index cc69dde6eb84..863179654452 100644
> --- a/arch/powerpc/include/asm/hw_irq.h
> +++ b/arch/powerpc/include/asm/hw_irq.h
> @@ -81,6 +81,20 @@ static inline unsigned long arch_local_irq_disable(void)
>  	return flags;
>  }
>
> +static inline unsigned long arch_local_irq_disable_var(int value)
> +{
> +	unsigned long flags, zero;
> +
> +	asm volatile(
> +		"li %1,%3; lbz %0,%2(13); stb %1,%2(13)"
> +		: "=r" (flags), "=&r" (zero)
> +		: "i" (offsetof(struct paca_struct, soft_enabled)),\
> +		  "i" (value)
> +		: "memory");
> +
> +	return flags;
> +}

arch_ function suggests it is arch implementation of a generic kernel function or something. I think our soft interrupt levels are just used in powerpc specific code. The name could also be a little more descriptive. I would have our internal function be something like soft_irq_set_level(), and then the arch disable just sets to the appropriate level as it does today.
The PMU disable level could be implemented in powerpc specific header with local_irq_and_pmu_disable() or something like that.

Thanks,
Nick
Re: [RFC PATCH 7/9] powerpc: Add support to mask perf interrupts
On Mon, 25 Jul 2016 20:22:20 +0530 Madhavan Srinivasan wrote:
> To support masking of the PMI interrupts, a couple of new interrupt
> handler macros are added: MASKABLE_EXCEPTION_PSERIES_OOL and
> MASKABLE_RELON_EXCEPTION_PSERIES_OOL. These are needed to include the
> SOFTEN_TEST and implement the support in both host and guest kernels.
>
> A couple of new irq #defines, "PACA_IRQ_PMI" and "SOFTEN_VALUE_0xf0*",
> are added for use in the exception code to check for PMI interrupts.
>
> The __SOFTEN_TEST macro is modified to support the PMI interrupt.
> The present __SOFTEN_TEST code loads soft_enabled from the paca and
> checks whether to call the masked_interrupt handler code. To support
> both the current behaviour and PMI masking, these changes are added:
>
> 1) The current LR register content is saved in R11
> 2) The "bge" branch operation is changed to "bgel"
> 3) R11 is restored to LR
>
> Reason:
>
> To retain PMI-as-NMI behaviour for a flag state of 1, we save the LR
> register value in R11 and branch to the "masked_interrupt" handler with
> LR updated. In the "masked_interrupt" handler, we check for the
> "SOFTEN_VALUE_*" value in R10 for PMI and branch back with "blr" if
> PMI.
>
> To mask PMI for a flag value >1, masked_interrupt avoids the above
> check and continues to execute the masked_interrupt code: it disables
> MSR[EE] and updates irq_happened with the PMI info.
>
> Finally, the save of R11 is moved before the call to SOFTEN_TEST in the
> __EXCEPTION_PROLOG_1 macro to support saving of the LR value in
> SOFTEN_TEST.
>
> Signed-off-by: Madhavan Srinivasan
> ---
>  arch/powerpc/include/asm/exception-64s.h | 22 --
>  arch/powerpc/include/asm/hw_irq.h        |  1 +
>  arch/powerpc/kernel/exceptions-64s.S     | 27 ---
>  3 files changed, 45 insertions(+), 5 deletions(-)
>
> diff --git a/arch/powerpc/include/asm/exception-64s.h b/arch/powerpc/include/asm/exception-64s.h
> index 44d3f539d8a5..c951b7ab5108 100644
> --- a/arch/powerpc/include/asm/exception-64s.h
> +++ b/arch/powerpc/include/asm/exception-64s.h
> @@ -166,8 +166,8 @@ END_FTR_SECTION_NESTED(ftr,ftr,943)
>  	OPT_SAVE_REG_TO_PACA(area+EX_CFAR, r10, CPU_FTR_CFAR);	\
>  	SAVE_CTR(r10, area);					\
>  	mfcr	r9;						\
> -	extra(vec);						\
>  	std	r11,area+EX_R11(r13);				\
> +	extra(vec);						\
>  	std	r12,area+EX_R12(r13);				\
>  	GET_SCRATCH0(r10);					\
>  	std	r10,area+EX_R13(r13)
> @@ -403,12 +403,17 @@ label##_relon_hv:				\
>  #define SOFTEN_VALUE_0xe82	PACA_IRQ_DBELL
>  #define SOFTEN_VALUE_0xe60	PACA_IRQ_HMI
>  #define SOFTEN_VALUE_0xe62	PACA_IRQ_HMI
> +#define SOFTEN_VALUE_0xf01	PACA_IRQ_PMI
> +#define SOFTEN_VALUE_0xf00	PACA_IRQ_PMI
>  #define __SOFTEN_TEST(h, vec)					\
>  	lbz	r10,PACASOFTIRQEN(r13);				\
>  	cmpwi	r10,LAZY_INTERRUPT_DISABLED;			\
>  	li	r10,SOFTEN_VALUE_##vec;				\
> -	bge	masked_##h##interrupt

At which point, can't we pass in the interrupt level we want to mask for to SOFTEN_TEST, and avoid all these extra code changes?

PMU masked interrupt will compare with SOFTEN_LEVEL_PMU, existing interrupts will compare with SOFTEN_LEVEL_EE (or whatever suitable names there are).

> +	mflr	r11;						\
> +	bgel	masked_##h##interrupt;				\
> +	mtlr	r11;

This might corrupt return prediction when masked_interrupt does not return. I guess that's uncommon case though. But I think we can avoid this if we do the above, no?

Thanks,
Nick
Re: [RFC PATCH 6/9] powerpc: modify __SOFTEN_TEST to support tri-state soft_enabled flag
On Mon, 25 Jul 2016 20:22:19 +0530 Madhavan Srinivasan wrote:
> Foundation patch to support checking of the new flag for
> "paca->soft_enabled". Modify the condition checking for
> "soft_enabled" from "equal" to "greater than or equal to".

Rather than a "tri-state" and the mystery "2" state, can you make a #define for that guy, and use levels.

0 -> all enabled
1 -> "linux" interrupts disabled
2 -> PMU also disabled
etc.

Thanks,
Nick
Re: [PATCH] powernv/pci: Add PHB register dump debugfs handle
On Tue, 2016-07-26 at 11:45 +1000, Michael Ellerman wrote:
> Quoting Russell Currey (2016-07-22 15:23:36)
> >
> > On EEH events the kernel will print a dump of relevant registers.
> > If EEH is unavailable (i.e. CONFIG_EEH is disabled, a new platform
> > doesn't have EEH support, etc) this information isn't readily available.
> >
> > Add a new debugfs handler to trigger a PHB register dump, so that this
> > information can be made available on demand.
>
> This is a bit weird.
>
> It's a debugfs file, but when you read from it you get nothing (I think,
> you have no read() defined).
>
> When you write to it, regardless of what you write, the kernel spits
> some stuff out to dmesg and throws away whatever you wrote.
>
> Ideally pnv_pci_dump_phb_diag_data() would write its output to a buffer,
> which we could then either send to dmesg, or give to debugfs. But that
> might be more work than we want to do for this.
>
> If we just want a trigger file, then I think it'd be preferable to just
> use a simple attribute, with a set and no show, eg. something like:
>
> static int foo_set(void *data, u64 val)
> {
> 	if (val != 1)
> 		return -EINVAL;
>
> 	...
>
> 	return 0;
> }
>
> DEFINE_SIMPLE_ATTRIBUTE(fops_foo, NULL, foo_set, "%llu\n");
>
> That requires that you write "1" to the file to trigger the reg dump.

I don't think I can use this here. The diag dump is triggered on the given PHB (these files are in /sys/kernel/debug/powerpc/PCI), and that PHB is retrieved from the file handler. It looks like I have no access to the file struct if using a simple getter/setter.
> >
> > diff --git a/arch/powerpc/platforms/powernv/pci-ioda.c b/arch/powerpc/platforms/powernv/pci-ioda.c
> > index 891fc4a..ada2f3c 100644
> > --- a/arch/powerpc/platforms/powernv/pci-ioda.c
> > +++ b/arch/powerpc/platforms/powernv/pci-ioda.c
> > @@ -3036,6 +3068,9 @@ static void pnv_pci_ioda_create_dbgfs(void)
> >  	if (!phb->dbgfs)
> >  		pr_warning("%s: Error on creating debugfs on PHB#%x\n",
> >  			__func__, hose->global_number);
> > +
> > +	debugfs_create_file("regdump", 0200, phb->dbgfs, hose,
> > +			&pnv_pci_debug_ops);
> >  }

> You shouldn't be trying to create the file if the directory create failed. So
> the check for (!phb->dbgfs) should probably print and then continue.

Good catch.

> And a better name would be "dump-regs", because it indicates that the file does
> something, rather than is something.

That is indeed better.

> cheers
Re: [RFC PATCH 5/9] powerpc: reverse the soft_enable logic
On Mon, 25 Jul 2016 20:22:18 +0530 Madhavan Srinivasan wrote: > "paca->soft_enabled" is used as a flag to mask some of the interrupts. > Currently supported flag values and their details: > > soft_enabled MSR[EE] > > 0 0 Disabled (PMI and HMI not masked) > 1 1 Enabled > > "paca->soft_enabled" is initialized to 1 to mark the interrupts as > enabled. arch_local_irq_disable() will toggle the value when > interrupts need to be disabled. At this point, the interrupts are not > actually disabled; instead, the interrupt vector has code to check for > the flag and mask the interrupt when it occurs. By "mask it", the handler > updates paca->irq_happened and returns. arch_local_irq_restore() is > called to re-enable interrupts, which checks and replays interrupts > if any occurred. > > Now, as mentioned, the current logic does not mask "performance monitoring > interrupts", and PMIs are implemented as NMIs. But this patchset > depends on local_irq_* for a successful local_* update. Meaning, mask > all possible interrupts during a local_* update and replay them after > the update. > > So the idea here is to reverse the "paca->soft_enabled" logic. New > values and details: > > soft_enabled MSR[EE] > > 1 0 Disabled (PMI and HMI not masked) > 0 1 Enabled > > The reason for this change is to create a foundation for a third flag > value "2" for "soft_enabled" to add support to mask PMIs. When > arch_irq_disable_* is called with a value "2", PMI interrupts are > masked. But when called with a value of "1", PMIs are not masked. > > With the new flag value for "soft_enabled", the states look like: > > soft_enabled MSR[EE] > > 2 0 Disabled, PMIs also masked > 1 0 Disabled (PMI and HMI not masked) > 0 1 Enabled > > And the interrupt handler code for checking has been modified to check for > a "greater than or equal to 1" condition instead. This bit of the patch seems to have been moved into another part of the series. Ideally (unless there is a good reason), it is nice to have each individual patch result in a working kernel both before and after. 
Nice way to avoid adding more branches though. Thanks, Nick
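The three masking levels being discussed can be modeled in plain C. This is only an illustrative userspace sketch of the proposed semantics (the enum names are invented here, not the kernel's): level 0 leaves everything enabled, level 1 soft-disables ordinary interrupts while PMIs still behave as NMIs, and level 2 masks PMIs too, recording them in an irq_happened-style flag for replay:

```c
#include <assert.h>
#include <stdbool.h>

/* Illustrative model of the three proposed soft_enabled levels:
 *   0 -> fully enabled
 *   1 -> ordinary interrupts soft-disabled, PMIs still taken as NMIs
 *   2 -> PMIs masked as well
 * Masked interrupts are only recorded, mimicking paca->irq_happened,
 * to be replayed later by arch_local_irq_restore(). */
enum { IRQS_ENABLED = 0, IRQS_DISABLED = 1, IRQS_PMI_DISABLED = 2 };

static int soft_enabled = IRQS_ENABLED;
static bool happened_ee, happened_pmi;

/* The ">= disabled" comparison is the single test __SOFTEN_TEST needs. */
static bool deliver_ordinary_irq(void)
{
    if (soft_enabled >= IRQS_DISABLED) {
        happened_ee = true;     /* masked: remember it for replay */
        return false;
    }
    return true;                /* delivered immediately */
}

static bool deliver_pmi(void)
{
    if (soft_enabled >= IRQS_PMI_DISABLED) {
        happened_pmi = true;    /* only masked at level 2 */
        return false;
    }
    return true;                /* at levels 0 and 1 the PMI is taken */
}
```

Because the levels are ordered, one comparison covers both cases, which is why the reversed encoding avoids extra branches in the masked-interrupt path.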
[PATCH 2/2] powerpc/64: Do load of PACAKBASE in LOAD_HANDLER
The LOAD_HANDLER macro requires that you have previously loaded "reg" with PACAKBASE. Although that gives callers flexibility to get PACAKBASE in some interesting way, none of the callers actually do that. So fold the load of PACAKBASE into the macro, making it simpler for callers to use correctly. Signed-off-by: Michael Ellerman --- arch/powerpc/include/asm/exception-64s.h | 3 +-- arch/powerpc/kernel/exceptions-64s.S | 10 -- 2 files changed, 1 insertion(+), 12 deletions(-) diff --git a/arch/powerpc/include/asm/exception-64s.h b/arch/powerpc/include/asm/exception-64s.h index 4ff3e2f16b5d..887867ac4bfa 100644 --- a/arch/powerpc/include/asm/exception-64s.h +++ b/arch/powerpc/include/asm/exception-64s.h @@ -52,7 +52,6 @@ #ifdef CONFIG_RELOCATABLE #define __EXCEPTION_RELON_PROLOG_PSERIES_1(label, h) \ - ld r12,PACAKBASE(r13); /* get high part of &label */ \ mfspr r11,SPRN_##h##SRR0; /* save SRR0 */ \ LOAD_HANDLER(r12,label);\ mtctr r12;\ @@ -90,6 +89,7 @@ * that kernelbase be 64K aligned. */ #define LOAD_HANDLER(reg, label) \ + ld reg,PACAKBASE(r13); /* get high part of &label */ \ ori reg,reg,(label)-_stext; /* virt addr of handler ... 
*/ /* Exception register prefixes */ @@ -175,7 +175,6 @@ END_FTR_SECTION_NESTED(ftr,ftr,943) __EXCEPTION_PROLOG_1(area, extra, vec) #define __EXCEPTION_PROLOG_PSERIES_1(label, h) \ - ld r12,PACAKBASE(r13); /* get high part of &label */ \ ld r10,PACAKMSR(r13); /* get MSR value for kernel */ \ mfspr r11,SPRN_##h##SRR0; /* save SRR0 */ \ LOAD_HANDLER(r12,label) \ diff --git a/arch/powerpc/kernel/exceptions-64s.S b/arch/powerpc/kernel/exceptions-64s.S index 8bcc1b457115..af30f26c35d8 100644 --- a/arch/powerpc/kernel/exceptions-64s.S +++ b/arch/powerpc/kernel/exceptions-64s.S @@ -41,7 +41,6 @@ END_FTR_SECTION_IFSET(CPU_FTR_REAL_LE) \ #define SYSCALL_PSERIES_2_RFID \ mfspr r12,SPRN_SRR1 ; \ - ld r10,PACAKBASE(r13) ;\ LOAD_HANDLER(r10, system_call_entry) ; \ mtspr SPRN_SRR0,r10 ; \ ld r10,PACAKMSR(r13) ; \ @@ -64,7 +63,6 @@ END_FTR_SECTION_IFSET(CPU_FTR_REAL_LE) \ */ #define SYSCALL_PSERIES_2_DIRECT \ mflrr10 ; \ - ld r12,PACAKBASE(r13) ;\ LOAD_HANDLER(r12, system_call_entry) ; \ mtctr r12 ; \ mfspr r12,SPRN_SRR1 ; \ @@ -219,7 +217,6 @@ data_access_slb_pSeries: * the kernel ends up being put. */ mfctr r11 - ld r10,PACAKBASE(r13) LOAD_HANDLER(r10, slb_miss_realmode) mtctr r10 bctr @@ -240,7 +237,6 @@ instruction_access_slb_pSeries: b slb_miss_realmode #else mfctr r11 - ld r10,PACAKBASE(r13) LOAD_HANDLER(r10, slb_miss_realmode) mtctr r10 bctr @@ -486,7 +482,6 @@ BEGIN_FTR_SECTION mfmsr r11 /* get MSR value */ ori r11,r11,MSR_ME /* turn on ME bit */ ori r11,r11,MSR_RI /* turn on RI bit */ - ld r12,PACAKBASE(r13) /* get high part of &label */ LOAD_HANDLER(r12, machine_check_handle_early) 1: mtspr SPRN_SRR0,r12 mtspr SPRN_SRR1,r11 @@ -499,7 +494,6 @@ BEGIN_FTR_SECTION */ addir1,r1,INT_FRAME_SIZE/* go back to previous stack frame */ ld r11,PACAKMSR(r13) - ld r12,PACAKBASE(r13) LOAD_HANDLER(r12, unrecover_mce) li r10,MSR_ME andcr11,r11,r10 /* Turn off MSR_ME */ @@ -802,7 +796,6 @@ data_access_slb_relon_pSeries: * the kernel ends up being put. 
*/ mfctr r11 - ld r10,PACAKBASE(r13) LOAD_HANDLER(r10, slb_miss_realmode) mtctr r10 bctr @@ -822,7 +815,6 @@ instruction_access_slb_relon_pSeries: b slb_miss_realmode #else mfctr r11 - ld r10,PACAKBASE(r13) LOAD_HANDLER(r10, slb_miss_realmode) mtctr r10 bctr @@ -1321,7 +1313,6 @@ machine_check_handle_early: andi. r11,r12,MSR_RI bne 2f 1: mfspr r11,SPRN_SRR0 - ld r10,PACAKBASE(r13) LOAD_HANDLER(r10,unr
[PATCH 1/2] powerpc/64: Correct comment on LOAD_HANDLER()
The comment for LOAD_HANDLER() was wrong. The part about kdump has not been true since 1f6a93e4c35e ("powerpc: Make it possible to move the interrupt handlers away from the kernel"). Describe how it currently works, and combine the two separate comments into one. Signed-off-by: Michael Ellerman --- arch/powerpc/include/asm/exception-64s.h | 8 1 file changed, 4 insertions(+), 4 deletions(-) diff --git a/arch/powerpc/include/asm/exception-64s.h b/arch/powerpc/include/asm/exception-64s.h index 93ae809fe5ea..4ff3e2f16b5d 100644 --- a/arch/powerpc/include/asm/exception-64s.h +++ b/arch/powerpc/include/asm/exception-64s.h @@ -84,12 +84,12 @@ /* * We're short on space and time in the exception prolog, so we can't - * use the normal SET_REG_IMMEDIATE macro. Normally we just need the - * low halfword of the address, but for Kdump we need the whole low - * word. + * use the normal LOAD_REG_IMMEDIATE macro to load the address of label. + * Instead we get the base of the kernel from paca->kernelbase and or in the low + * part of label. This requires that the label be within 64KB of kernelbase, and + * that kernelbase be 64K aligned. */ #define LOAD_HANDLER(reg, label) \ - /* Handlers must be within 64K of kbase, which must be 64k aligned */ \ ori reg,reg,(label)-_stext; /* virt addr of handler ... */ /* Exception register prefixes */ -- 2.7.4
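The constraints described in the new comment can be checked with a little arithmetic. Below is a hedged userspace model of what the single `ori` instruction leaves in the register once `reg` holds paca->kernelbase (the function name and addresses are made up for illustration): `ori` takes only a 16-bit immediate, so `(label)-_stext` must fit in 16 bits and kernelbase must have its low 16 bits clear, at which point OR degenerates into addition:

```c
#include <assert.h>
#include <stdint.h>

/* Sketch of LOAD_HANDLER's `ori reg,reg,(label)-_stext` with reg
 * preloaded with paca->kernelbase. The two asserts are exactly the
 * constraints from the comment: handler within 64KB of _stext, and
 * kernelbase 64KB-aligned, so OR-ing the offset equals adding it. */
static uint64_t load_handler(uint64_t kernelbase, uint64_t label,
                             uint64_t stext)
{
    uint64_t off = label - stext;

    assert(off < 0x10000);               /* fits in ori's 16-bit immediate */
    assert((kernelbase & 0xffff) == 0);  /* low halfword of base is clear */
    return kernelbase | off;             /* what ends up in reg */
}
```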
Re: [RFC PATCH 1/9] Add #defs for paca->soft_enabled flags
On Mon, 25 Jul 2016 20:22:14 +0530 Madhavan Srinivasan wrote: > Two #defs LAZY_INTERRUPT_ENABLED and > LAZY_INTERRUPT_DISABLED are added to be used > when updating paca->soft_enabled. This is a very nice patchset, but can this not be a new name? We use "soft enabled/disabled" everywhere for it. I think lazy is an implementation detail anyway because some interrupts don't cause a hard disable at all. Thanks, Nick
Re: [PATCH v3 02/11] mm: Hardened usercopy
On Mon, Jul 25, 2016 at 7:03 PM, Michael Ellerman wrote: > Josh Poimboeuf writes: > >> On Thu, Jul 21, 2016 at 11:34:25AM -0700, Kees Cook wrote: >>> On Wed, Jul 20, 2016 at 11:52 PM, Michael Ellerman >>> wrote: >>> > Kees Cook writes: >>> > >>> >> diff --git a/mm/usercopy.c b/mm/usercopy.c >>> >> new file mode 100644 >>> >> index ..e4bf4e7ccdf6 >>> >> --- /dev/null >>> >> +++ b/mm/usercopy.c >>> >> @@ -0,0 +1,234 @@ >>> > ... >>> >> + >>> >> +/* >>> >> + * Checks if a given pointer and length is contained by the current >>> >> + * stack frame (if possible). >>> >> + * >>> >> + * 0: not at all on the stack >>> >> + * 1: fully within a valid stack frame >>> >> + * 2: fully on the stack (when can't do frame-checking) >>> >> + * -1: error condition (invalid stack position or bad stack frame) >>> >> + */ >>> >> +static noinline int check_stack_object(const void *obj, unsigned long >>> >> len) >>> >> +{ >>> >> + const void * const stack = task_stack_page(current); >>> >> + const void * const stackend = stack + THREAD_SIZE; >>> > >>> > That allows access to the entire stack, including the struct thread_info, >>> > is that what we want - it seems dangerous? Or did I miss a check >>> > somewhere else? >>> >>> That seems like a nice improvement to make, yeah. >>> >>> > We have end_of_stack() which computes the end of the stack taking >>> > thread_info into account (end being the opposite of your end above). >>> >>> Amusingly, the object_is_on_stack() check in sched.h doesn't take >>> thread_info into account either. :P Regardless, I think using >>> end_of_stack() may not be best. To tighten the check, I think we could >>> add this after checking that the object is on the stack: >>> >>> #ifdef CONFIG_STACK_GROWSUP >>> stackend -= sizeof(struct thread_info); >>> #else >>> stack += sizeof(struct thread_info); >>> #endif >>> >>> e.g. then if the pointer was in the thread_info, the second test would >>> fail, triggering the protection. 
>> >> FWIW, this won't work right on x86 after Andy's >> CONFIG_THREAD_INFO_IN_TASK patches get merged. > > Yeah. I wonder if it's better for the arch helper to just take the obj and > len, > and work out its own bounds for the stack using current and whatever makes > sense on that arch. > > It would avoid too much ifdefery in the generic code, and also avoid any > confusion about whether stackend is the high or low address. > > eg. on powerpc we could do: > > int noinline arch_within_stack_frames(const void *obj, unsigned long len) > { > void *stack_low = end_of_stack(current); > void *stack_high = task_stack_page(current) + THREAD_SIZE; > > > Whereas arches with STACK_GROWSUP=y could do roughly the reverse, and x86 can > do > whatever it needs to depending on whether the thread_info is on or off stack. > > cheers Yeah, I agree: this should be in the arch code. If the arch can actually do frame checking, the thread_info (if it exists on the stack) would already be excluded. But it'd be a nice tightening of the check. -Kees -- Kees Cook Chrome OS & Brillo Security
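To make the tightened check being discussed concrete, here is a hedged userspace sketch for the common case (stack grows down, thread_info at the low end). The return-code convention is borrowed from the patch's comment, but this sketch collapses the "can't frame-check" cases; the sizes and everything else are invented stand-ins:

```c
#include <assert.h>
#include <stddef.h>

#define THREAD_SIZE 16384
#define THREAD_INFO_SIZE 64  /* stand-in for sizeof(struct thread_info) */

/* Padding after THREAD_SIZE keeps the "off the stack" test in bounds. */
static char fake_stack[THREAD_SIZE + 64];

/* 0: not on this stack; 1: fully within the usable stack region;
 * -1: overlaps thread_info or straddles a stack boundary. */
static int check_stack_object(const char *stack_base, const void *obj,
                              unsigned long len)
{
    const char *stack = stack_base;
    const char *stackend = stack_base + THREAD_SIZE;
    const char *p = obj;

    if (p + len <= stack || p >= stackend)
        return 0;                 /* not on this stack at all */
    /* Exclude the thread_info region at the low end of the stack. */
    if (p >= stack + THREAD_INFO_SIZE && p + len <= stackend)
        return 1;
    return -1;                    /* reject: thread_info or boundary */
}
```

An object overlapping the thread_info area now fails the second test and triggers the protection, which is the improvement suggested above.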
[PATCH] powerpc: sgy_cts1000: Fix gpio_halt_cb()'s signature
The halt callback in struct machdep_calls is declared with the __noreturn attribute, so omitting that attribute from gpio_halt_cb()'s signature results in a compilation error. Change the signature to address the problem, and change the code of the function so it never returns. Signed-off-by: Andrey Smirnov --- arch/powerpc/platforms/85xx/sgy_cts1000.c | 8 +--- 1 file changed, 5 insertions(+), 3 deletions(-) diff --git a/arch/powerpc/platforms/85xx/sgy_cts1000.c b/arch/powerpc/platforms/85xx/sgy_cts1000.c index 79fd0df..21d6aaa 100644 --- a/arch/powerpc/platforms/85xx/sgy_cts1000.c +++ b/arch/powerpc/platforms/85xx/sgy_cts1000.c @@ -38,18 +38,18 @@ static void gpio_halt_wfn(struct work_struct *work) } static DECLARE_WORK(gpio_halt_wq, gpio_halt_wfn); -static void gpio_halt_cb(void) +static void __noreturn gpio_halt_cb(void) { enum of_gpio_flags flags; int trigger, gpio; if (!halt_node) - return; + panic("No reset GPIO information was provided in DT\n"); gpio = of_get_gpio_flags(halt_node, 0, &flags); if (!gpio_is_valid(gpio)) - return; + panic("Provided GPIO is invalid\n"); trigger = (flags == OF_GPIO_ACTIVE_LOW); @@ -57,6 +57,8 @@ static void gpio_halt_cb(void) /* Probably wont return */ gpio_set_value(gpio, trigger); + + panic("Halt failed\n"); } /* This IRQ means someone pressed the power button and it is waiting for us -- 2.5.5
[PATCH 2/2] powerpc: e8248e: Select PHYLIB only if NETDEVICES is enabled
Select PHYLIB only if NETDEVICES is enabled and MDIO_BITBANG only if PHYLIB is present to avoid warnings from Kconfig. To prevent undefined references during linking register MDIO driver only if CONFIG_MDIO_BITBANG is enabled. Signed-off-by: Andrey Smirnov --- arch/powerpc/platforms/82xx/Kconfig | 4 ++-- arch/powerpc/platforms/82xx/ep8248e.c | 4 +++- 2 files changed, 5 insertions(+), 3 deletions(-) diff --git a/arch/powerpc/platforms/82xx/Kconfig b/arch/powerpc/platforms/82xx/Kconfig index 7c7df400..994d1a9 100644 --- a/arch/powerpc/platforms/82xx/Kconfig +++ b/arch/powerpc/platforms/82xx/Kconfig @@ -30,8 +30,8 @@ config EP8248E select 8272 select 8260 select FSL_SOC - select PHYLIB - select MDIO_BITBANG + select PHYLIB if NETDEVICES + select MDIO_BITBANG if PHYLIB help This enables support for the Embedded Planet EP8248E board. diff --git a/arch/powerpc/platforms/82xx/ep8248e.c b/arch/powerpc/platforms/82xx/ep8248e.c index cdab847..8fec050 100644 --- a/arch/powerpc/platforms/82xx/ep8248e.c +++ b/arch/powerpc/platforms/82xx/ep8248e.c @@ -298,7 +298,9 @@ static const struct of_device_id of_bus_ids[] __initconst = { static int __init declare_of_platform_devices(void) { of_platform_bus_probe(NULL, of_bus_ids, NULL); - platform_driver_register(&ep8248e_mdio_driver); + + if (IS_ENABLED(CONFIG_MDIO_BITBANG)) + platform_driver_register(&ep8248e_mdio_driver); return 0; } -- 2.5.5 ___ Linuxppc-dev mailing list Linuxppc-dev@lists.ozlabs.org https://lists.ozlabs.org/listinfo/linuxppc-dev
[PATCH 1/2] powerpc: mpc85xx_mds: Select PHYLIB only if NETDEVICES is enabled
PHYLIB depends on NETDEVICES, so to avoid unmet dependencies warning from Kconfig it needs to be selected conditionally. Also add checks if PHYLIB is built-in to avoid undefined references to PHYLIB's symbols. Signed-off-by: Andrey Smirnov --- arch/powerpc/platforms/85xx/Kconfig | 2 +- arch/powerpc/platforms/85xx/mpc85xx_mds.c | 9 - 2 files changed, 9 insertions(+), 2 deletions(-) diff --git a/arch/powerpc/platforms/85xx/Kconfig b/arch/powerpc/platforms/85xx/Kconfig index e626461..3da35bc 100644 --- a/arch/powerpc/platforms/85xx/Kconfig +++ b/arch/powerpc/platforms/85xx/Kconfig @@ -72,7 +72,7 @@ config MPC85xx_CDS config MPC85xx_MDS bool "Freescale MPC85xx MDS" select DEFAULT_UIMAGE - select PHYLIB + select PHYLIB if NETDEVICES select HAS_RAPIDIO select SWIOTLB help diff --git a/arch/powerpc/platforms/85xx/mpc85xx_mds.c b/arch/powerpc/platforms/85xx/mpc85xx_mds.c index dbcb467..71aff5e 100644 --- a/arch/powerpc/platforms/85xx/mpc85xx_mds.c +++ b/arch/powerpc/platforms/85xx/mpc85xx_mds.c @@ -63,6 +63,8 @@ #define DBG(fmt...) #endif +#if IS_BUILTIN(CONFIG_PHYLIB) + #define MV88E_SCR 0x10 #define MV88E_SCR_125CLK 0x0010 static int mpc8568_fixup_125_clock(struct phy_device *phydev) @@ -152,6 +154,8 @@ static int mpc8568_mds_phy_fixups(struct phy_device *phydev) return err; } +#endif + /* * * Setup the architecture @@ -313,6 +317,7 @@ static void __init mpc85xx_mds_setup_arch(void) swiotlb_detect_4g(); } +#if IS_BUILTIN(CONFIG_PHYLIB) static int __init board_fixups(void) { @@ -342,9 +347,12 @@ static int __init board_fixups(void) return 0; } + machine_arch_initcall(mpc8568_mds, board_fixups); machine_arch_initcall(mpc8569_mds, board_fixups); +#endif + static int __init mpc85xx_publish_devices(void) { if (machine_is(mpc8568_mds)) @@ -435,4 +443,3 @@ define_machine(p1021_mds) { .pcibios_fixup_phb = fsl_pcibios_fixup_phb, #endif }; - -- 2.5.5 ___ Linuxppc-dev mailing list Linuxppc-dev@lists.ozlabs.org https://lists.ozlabs.org/listinfo/linuxppc-dev
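The `select ... if ...` plus `IS_BUILTIN()`/`IS_ENABLED()` pattern in these two patches leans on a preprocessor trick from the kernel's include/linux/kconfig.h. A self-contained re-creation of the mechanism (simplified: this sketch only covers the built-in `=y` case, effectively IS_BUILTIN(); the real IS_ENABLED() also ORs in the `CONFIG_FOO_MODULE` variant). The point is that the test collapses to a compile-time 0 or 1, so the dead branch is still parsed and type-checked but generates no code and no undefined references:

```c
#include <assert.h>

/* Re-creation of the kconfig.h trick: a config macro defined as 1
 * turns into the placeholder "0," whose extra comma shifts the argument
 * list so the second argument (1) is selected; anything else yields 0. */
#define __ARG_PLACEHOLDER_1 0,
#define __take_second_arg(__ignored, val, ...) val
#define ____is_defined(arg1_or_junk) __take_second_arg(arg1_or_junk 1, 0)
#define ___is_defined(val) ____is_defined(__ARG_PLACEHOLDER_##val)
#define __is_defined(x) ___is_defined(x)
#define IS_ENABLED(option) __is_defined(option)

#define CONFIG_MDIO_BITBANG 1   /* pretend this driver is built in */
/* CONFIG_PHYLIB deliberately left undefined for the test below */
```

This is why `if (IS_ENABLED(CONFIG_MDIO_BITBANG))` around platform_driver_register() avoids the link-time undefined references without needing an #ifdef.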
[PATCH 3/3] powerpc: Convert fsl_rstcr_restart to a reset handler
Convert fsl_rstcr_restart into a function to be registered with register_reset_handler() API and introduce fls_rstcr_restart_register() function that can be added as an initcall that would do aforementioned registration. Signed-off-by: Andrey Smirnov --- arch/powerpc/platforms/85xx/bsc913x_qds.c | 2 +- arch/powerpc/platforms/85xx/bsc913x_rdb.c | 2 +- arch/powerpc/platforms/85xx/c293pcie.c| 2 +- arch/powerpc/platforms/85xx/corenet_generic.c | 2 +- arch/powerpc/platforms/85xx/ge_imp3a.c| 2 +- arch/powerpc/platforms/85xx/mpc8536_ds.c | 2 +- arch/powerpc/platforms/85xx/mpc85xx_ads.c | 2 +- arch/powerpc/platforms/85xx/mpc85xx_cds.c | 26 +++--- arch/powerpc/platforms/85xx/mpc85xx_ds.c | 7 --- arch/powerpc/platforms/85xx/mpc85xx_mds.c | 7 --- arch/powerpc/platforms/85xx/mpc85xx_rdb.c | 21 +++-- arch/powerpc/platforms/85xx/mvme2500.c| 2 +- arch/powerpc/platforms/85xx/p1010rdb.c| 2 +- arch/powerpc/platforms/85xx/p1022_ds.c| 2 +- arch/powerpc/platforms/85xx/p1022_rdk.c | 3 ++- arch/powerpc/platforms/85xx/p1023_rdb.c | 2 +- arch/powerpc/platforms/85xx/ppa8548.c | 2 +- arch/powerpc/platforms/85xx/qemu_e500.c | 2 +- arch/powerpc/platforms/85xx/sbc8548.c | 2 +- arch/powerpc/platforms/85xx/socrates.c| 2 +- arch/powerpc/platforms/85xx/stx_gp3.c | 2 +- arch/powerpc/platforms/85xx/tqm85xx.c | 2 +- arch/powerpc/platforms/85xx/twr_p102x.c | 2 +- arch/powerpc/platforms/85xx/xes_mpc85xx.c | 7 --- arch/powerpc/platforms/86xx/gef_ppc9a.c | 2 +- arch/powerpc/platforms/86xx/gef_sbc310.c | 2 +- arch/powerpc/platforms/86xx/gef_sbc610.c | 2 +- arch/powerpc/platforms/86xx/mpc8610_hpcd.c| 2 +- arch/powerpc/platforms/86xx/mpc86xx_hpcn.c| 2 +- arch/powerpc/platforms/86xx/sbc8641d.c| 2 +- arch/powerpc/sysdev/fsl_soc.c | 22 +- arch/powerpc/sysdev/fsl_soc.h | 2 +- 32 files changed, 86 insertions(+), 57 deletions(-) diff --git a/arch/powerpc/platforms/85xx/bsc913x_qds.c b/arch/powerpc/platforms/85xx/bsc913x_qds.c index 07dd6ae..14ea7a0 100644 --- a/arch/powerpc/platforms/85xx/bsc913x_qds.c +++ 
b/arch/powerpc/platforms/85xx/bsc913x_qds.c @@ -53,6 +53,7 @@ static void __init bsc913x_qds_setup_arch(void) } machine_arch_initcall(bsc9132_qds, mpc85xx_common_publish_devices); +machine_arch_initcall(bsc9133_qds, fsl_rstcr_restart_register); /* * Called very early, device-tree isn't unflattened @@ -72,7 +73,6 @@ define_machine(bsc9132_qds) { .pcibios_fixup_bus = fsl_pcibios_fixup_bus, #endif .get_irq= mpic_get_irq, - .restart= fsl_rstcr_restart, .calibrate_decr = generic_calibrate_decr, .progress = udbg_progress, }; diff --git a/arch/powerpc/platforms/85xx/bsc913x_rdb.c b/arch/powerpc/platforms/85xx/bsc913x_rdb.c index e48f671..cd4e717 100644 --- a/arch/powerpc/platforms/85xx/bsc913x_rdb.c +++ b/arch/powerpc/platforms/85xx/bsc913x_rdb.c @@ -43,6 +43,7 @@ static void __init bsc913x_rdb_setup_arch(void) } machine_device_initcall(bsc9131_rdb, mpc85xx_common_publish_devices); +machine_arch_initcall(bsc9131_rdb, fsl_rstcr_restart_register); /* * Called very early, device-tree isn't unflattened @@ -59,7 +60,6 @@ define_machine(bsc9131_rdb) { .setup_arch = bsc913x_rdb_setup_arch, .init_IRQ = bsc913x_rdb_pic_init, .get_irq= mpic_get_irq, - .restart= fsl_rstcr_restart, .calibrate_decr = generic_calibrate_decr, .progress = udbg_progress, }; diff --git a/arch/powerpc/platforms/85xx/c293pcie.c b/arch/powerpc/platforms/85xx/c293pcie.c index 3b9e3f0..fbd63f9 100644 --- a/arch/powerpc/platforms/85xx/c293pcie.c +++ b/arch/powerpc/platforms/85xx/c293pcie.c @@ -48,6 +48,7 @@ static void __init c293_pcie_setup_arch(void) } machine_arch_initcall(c293_pcie, mpc85xx_common_publish_devices); +machine_arch_initcall(c293_pcie, fsl_rstcr_restart_register); /* * Called very early, device-tree isn't unflattened @@ -65,7 +66,6 @@ define_machine(c293_pcie) { .setup_arch = c293_pcie_setup_arch, .init_IRQ = c293_pcie_pic_init, .get_irq= mpic_get_irq, - .restart= fsl_rstcr_restart, .calibrate_decr = generic_calibrate_decr, .progress = udbg_progress, }; diff --git 
a/arch/powerpc/platforms/85xx/corenet_generic.c b/arch/powerpc/platforms/85xx/corenet_generic.c index 3a6a84f..297379b 100644 --- a/arch/powerpc/platforms/85xx/corenet_generic.c +++ b/arch/powerpc/platforms/85xx/corenet_generic.c @@ -225,7 +225,6 @@ define_machine(corenet_generic) { #else .get_irq
[PATCH 2/3] powerpc: Call chained reset handlers during reset
Call out to all restart handlers that were added via the register_restart_handler() API when restarting the machine. Signed-off-by: Andrey Smirnov --- arch/powerpc/kernel/setup-common.c | 4 1 file changed, 4 insertions(+) diff --git a/arch/powerpc/kernel/setup-common.c b/arch/powerpc/kernel/setup-common.c index 5cd3283..205d073 100644 --- a/arch/powerpc/kernel/setup-common.c +++ b/arch/powerpc/kernel/setup-common.c @@ -145,6 +145,10 @@ void machine_restart(char *cmd) ppc_md.restart(cmd); smp_send_stop(); + + do_kernel_restart(cmd); + mdelay(1000); + machine_hang(); } -- 2.5.5
[PATCH 1/3] powerpc: Factor out common code in setup-common.c
Factor out a small bit of common code in machine_restart(), machine_power_off() and machine_halt(). Signed-off-by: Andrey Smirnov --- arch/powerpc/kernel/setup-common.c | 23 ++- 1 file changed, 14 insertions(+), 9 deletions(-) diff --git a/arch/powerpc/kernel/setup-common.c b/arch/powerpc/kernel/setup-common.c index 714b4ba..5cd3283 100644 --- a/arch/powerpc/kernel/setup-common.c +++ b/arch/powerpc/kernel/setup-common.c @@ -130,15 +130,22 @@ void machine_shutdown(void) ppc_md.machine_shutdown(); } +static void machine_hang(void) +{ + pr_emerg("System Halted, OK to turn off power\n"); + local_irq_disable(); + while (1) + ; +} + void machine_restart(char *cmd) { machine_shutdown(); if (ppc_md.restart) ppc_md.restart(cmd); + smp_send_stop(); - printk(KERN_EMERG "System Halted, OK to turn off power\n"); - local_irq_disable(); - while (1) ; + machine_hang(); } void machine_power_off(void) @@ -146,10 +153,9 @@ void machine_power_off(void) machine_shutdown(); if (pm_power_off) pm_power_off(); + smp_send_stop(); - printk(KERN_EMERG "System Halted, OK to turn off power\n"); - local_irq_disable(); - while (1) ; + machine_hang(); } /* Used by the G5 thermal driver */ EXPORT_SYMBOL_GPL(machine_power_off); @@ -162,10 +168,9 @@ void machine_halt(void) machine_shutdown(); if (ppc_md.halt) ppc_md.halt(); + smp_send_stop(); - printk(KERN_EMERG "System Halted, OK to turn off power\n"); - local_irq_disable(); - while (1) ; + machine_hang(); } -- 2.5.5
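The mechanism this series builds toward can be sketched in a few lines. Below is a minimal userspace model of the restart-handler chain: platform code registers handlers, and the restart path walks the chain (the kernel's real do_kernel_restart() uses a notifier chain, not a fixed array; all names and sizes here are illustrative only):

```c
#include <assert.h>
#include <stddef.h>

typedef void (*restart_fn)(const char *cmd);

#define MAX_HANDLERS 8
static restart_fn handlers[MAX_HANDLERS];
static int nr_handlers;

/* Stand-in for register_restart_handler(): remember the handler. */
static int register_restart_handler(restart_fn fn)
{
    if (nr_handlers >= MAX_HANDLERS)
        return -1;
    handlers[nr_handlers++] = fn;
    return 0;
}

/* Stand-in for do_kernel_restart(): each handler gets a chance to
 * reset the machine, after ppc_md.restart has had first go. */
static void do_kernel_restart(const char *cmd)
{
    for (int i = 0; i < nr_handlers; i++)
        handlers[i](cmd);
}

static int rstcr_pokes;
static void fake_rstcr_restart(const char *cmd)
{
    (void)cmd;
    rstcr_pokes++;   /* the real handler would write the RSTCR register */
}
```

Converting fsl_rstcr_restart into such a registered handler is what lets patch 3/3 drop the per-board `.restart = fsl_rstcr_restart` assignments.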
Re: [PATCH V2 1/2] tty/hvc: Use IRQF_SHARED for OPAL hvc consoles
On Tue, Jul 26, 2016 at 02:11:11PM +1000, Michael Ellerman wrote: > Quoting Michael Ellerman (2016-07-11 16:29:20) > > Samuel Mendoza-Jonas writes: > > > > > Commit 2def86a7200c > > > ("hvc: Convert to using interrupts instead of opal events") > > > enabled the use of interrupts in the hvc_driver for OPAL platforms. > > > However on machines with more than one hvc console, any console after > > > the first will fail to register an interrupt handler in > > > notifier_add_irq() since all consoles share the same IRQ number but do > > > not set the IRQF_SHARED flag: > > > > > > [ 51.179907] genirq: Flags mismatch irq 31. (hvc_console) vs. > > > (hvc_console) > > > [ 51.180010] hvc_open: request_irq failed with rc -16. > > > > > > This error propagates up to hvc_open() and the console is closed, but > > > OPAL will still generate interrupts that are not handled, leading to > > > rcu_sched stall warnings. > > > > > > Set IRQF_SHARED when calling request_irq, allowing additional consoles > > > to start properly. This is only set for consoles handled by > > > hvc_opal_probe(), leaving other types unaffected. > > > > > > Signed-off-by: Samuel Mendoza-Jonas > > > Cc: # 4.1.x- > > > --- > > > drivers/tty/hvc/hvc_console.h | 1 + > > > drivers/tty/hvc/hvc_irq.c | 7 +-- > > > drivers/tty/hvc/hvc_opal.c| 3 +++ > > > 3 files changed, 9 insertions(+), 2 deletions(-) > > > > Acked-by: Michael Ellerman > > > > Greg are you happy to take these two? > > Hi Greg, > > I don't see this series anywhere, do you mind if I take them via the > powerpc tree for 4.8 ? Or do you want to pick them up. You can take them, I'm not touching patches now until 4.8-rc1 is out, sorry. thanks, greg k-h
Re: [PATCH V2 1/2] tty/hvc: Use IRQF_SHARED for OPAL hvc consoles
Quoting Michael Ellerman (2016-07-11 16:29:20) > Samuel Mendoza-Jonas writes: > > > Commit 2def86a7200c > > ("hvc: Convert to using interrupts instead of opal events") > > enabled the use of interrupts in the hvc_driver for OPAL platforms. > > However on machines with more than one hvc console, any console after > > the first will fail to register an interrupt handler in > > notifier_add_irq() since all consoles share the same IRQ number but do > > not set the IRQF_SHARED flag: > > > > [ 51.179907] genirq: Flags mismatch irq 31. (hvc_console) vs. > > (hvc_console) > > [ 51.180010] hvc_open: request_irq failed with rc -16. > > > > This error propagates up to hvc_open() and the console is closed, but > > OPAL will still generate interrupts that are not handled, leading to > > rcu_sched stall warnings. > > > > Set IRQF_SHARED when calling request_irq, allowing additional consoles > > to start properly. This is only set for consoles handled by > > hvc_opal_probe(), leaving other types unaffected. > > > > Signed-off-by: Samuel Mendoza-Jonas > > Cc: # 4.1.x- > > --- > > drivers/tty/hvc/hvc_console.h | 1 + > > drivers/tty/hvc/hvc_irq.c | 7 +-- > > drivers/tty/hvc/hvc_opal.c| 3 +++ > > 3 files changed, 9 insertions(+), 2 deletions(-) > > Acked-by: Michael Ellerman > > Greg are you happy to take these two? Hi Greg, I don't see this series anywhere, do you mind if I take them via the powerpc tree for 4.8 ? Or do you want to pick them up. cheers
Re: [PATCH v2 1/2] powerpc/mm: Fix build break when PPC_NATIVE=n
Hi Michael, On Tue, 26 Jul 2016 13:38:37 +1000 Michael Ellerman wrote: > > The recent commit to rework the hash MMU setup broke the build when > CONFIG_PPC_NATIVE=n. Fix it by adding an IS_ENABLED() check before > calling hpte_init_native(). > > Removing the else clause opens the possibility that we don't set any > ops, which would probably lead to a strange crash later. So add a check > that we correctly initialised at least one member of the struct. > > Fixes: 166dd7d3fbf2 ("powerpc/64: Move MMU backend selection out of platform > code") > Reported-by: Stephen Rothwell > Signed-off-by: Michael Ellerman Acked-by: Stephen Rothwell -- Cheers, Stephen Rothwell
Re: [PATCH v2 2/2] powerpc/mm: Rename hpte_init_lpar() and move the fallback to a header
Hi Michael, On Tue, 26 Jul 2016 13:38:38 +1000 Michael Ellerman wrote: > > hpte_init_lpar() is part of the pseries platform, so name it as such. > > Move the fallback implementation for when PSERIES=n into the header, > dropping the weak implementation. The panic() is now handled by the > calling code. Of course, this could have been handled the same way as the native one. > else if (firmware_has_feature(FW_FEATURE_LPAR)) else if (IS_ENABLED(CONFIG_PPC_PSERIES) && firmware_has_feature(FW_FEATURE_LPAR)) > - hpte_init_lpar(); > + hpte_init_pseries(); and no need to modify the header file. -- Cheers, Stephen Rothwell
[PATCH v2 2/2] powerpc/mm: Rename hpte_init_lpar() and move the fallback to a header
hpte_init_lpar() is part of the pseries platform, so name it as such. Move the fallback implementation for when PSERIES=n into the header, dropping the weak implementation. The panic() is now handled by the calling code. Signed-off-by: Michael Ellerman --- arch/powerpc/include/asm/book3s/64/mmu-hash.h | 7 ++- arch/powerpc/mm/hash_utils_64.c | 7 +-- arch/powerpc/platforms/pseries/lpar.c | 2 +- 3 files changed, 8 insertions(+), 8 deletions(-) diff --git a/arch/powerpc/include/asm/book3s/64/mmu-hash.h b/arch/powerpc/include/asm/book3s/64/mmu-hash.h index b0f4dffe12ae..450b017fdc19 100644 --- a/arch/powerpc/include/asm/book3s/64/mmu-hash.h +++ b/arch/powerpc/include/asm/book3s/64/mmu-hash.h @@ -391,8 +391,13 @@ int htab_remove_mapping(unsigned long vstart, unsigned long vend, extern void add_gpage(u64 addr, u64 page_size, unsigned long number_of_pages); extern void demote_segment_4k(struct mm_struct *mm, unsigned long addr); +#ifdef CONFIG_PPC_PSERIES +void hpte_init_pseries(void); +#else +static inline void hpte_init_pseries(void) { } +#endif + extern void hpte_init_native(void); -extern void hpte_init_lpar(void); extern void hpte_init_beat(void); extern void hpte_init_beat_v3(void); diff --git a/arch/powerpc/mm/hash_utils_64.c b/arch/powerpc/mm/hash_utils_64.c index 381b5894cc99..1ff11c1bb182 100644 --- a/arch/powerpc/mm/hash_utils_64.c +++ b/arch/powerpc/mm/hash_utils_64.c @@ -885,11 +885,6 @@ static void __init htab_initialize(void) #undef KB #undef MB -void __init __weak hpte_init_lpar(void) -{ - panic("FW_FEATURE_LPAR set but no LPAR support compiled\n"); -} - void __init hash__early_init_mmu(void) { /* @@ -930,7 +925,7 @@ void __init hash__early_init_mmu(void) if (firmware_has_feature(FW_FEATURE_PS3_LV1)) ps3_early_mm_init(); else if (firmware_has_feature(FW_FEATURE_LPAR)) - hpte_init_lpar(); + hpte_init_pseries(); else if IS_ENABLED(CONFIG_PPC_NATIVE) hpte_init_native(); diff --git a/arch/powerpc/platforms/pseries/lpar.c b/arch/powerpc/platforms/pseries/lpar.c 
index 0e91388d0af9..86707e67843f 100644 --- a/arch/powerpc/platforms/pseries/lpar.c +++ b/arch/powerpc/platforms/pseries/lpar.c @@ -589,7 +589,7 @@ static int __init disable_bulk_remove(char *str) __setup("bulk_remove=", disable_bulk_remove); -void __init hpte_init_lpar(void) +void __init hpte_init_pseries(void) { mmu_hash_ops.hpte_invalidate = pSeries_lpar_hpte_invalidate; mmu_hash_ops.hpte_updatepp = pSeries_lpar_hpte_updatepp; -- 2.7.4
[PATCH v2 1/2] powerpc/mm: Fix build break when PPC_NATIVE=n
The recent commit to rework the hash MMU setup broke the build when CONFIG_PPC_NATIVE=n. Fix it by adding an IS_ENABLED() check before calling hpte_init_native(). Removing the else clause opens the possibility that we don't set any ops, which would probably lead to a strange crash later. So add a check that we correctly initialised at least one member of the struct. Fixes: 166dd7d3fbf2 ("powerpc/64: Move MMU backend selection out of platform code") Reported-by: Stephen Rothwell Signed-off-by: Michael Ellerman --- arch/powerpc/mm/hash_utils_64.c | 5 - 1 file changed, 4 insertions(+), 1 deletion(-) diff --git a/arch/powerpc/mm/hash_utils_64.c b/arch/powerpc/mm/hash_utils_64.c index 341632471b9d..381b5894cc99 100644 --- a/arch/powerpc/mm/hash_utils_64.c +++ b/arch/powerpc/mm/hash_utils_64.c @@ -931,9 +931,12 @@ void __init hash__early_init_mmu(void) ps3_early_mm_init(); else if (firmware_has_feature(FW_FEATURE_LPAR)) hpte_init_lpar(); - else + else if (IS_ENABLED(CONFIG_PPC_NATIVE)) hpte_init_native(); + if (!mmu_hash_ops.hpte_insert) + panic("hash__early_init_mmu: No MMU hash ops defined!\n"); + /* Initialize the MMU Hash table and create the linear mapping * of memory. Has to be done before SLB initialization as this is * currently where the page size encoding is obtained. -- 2.7.4
RE: [PATCH v3 02/11] mm: Hardened usercopy
David Laight writes: > From: Josh Poimboeuf >> Sent: 22 July 2016 18:46 >> > >> > e.g. then if the pointer was in the thread_info, the second test would >> > fail, triggering the protection. >> >> FWIW, this won't work right on x86 after Andy's >> CONFIG_THREAD_INFO_IN_TASK patches get merged. > > What ends up in the 'thread_info' area? It depends on the arch. > If it contains the fp save area then programs like gdb may end up requesting > copy_in/out directly from that area. On the arches I've seen thread_info doesn't usually contain register save areas, but if it did then it would be up to the arch helper to allow that copy to go through. However given thread_info generally contains lots of low level flags that would be a good target for an attacker, the best way to cope with ptrace wanting to copy to/from it would be to use a temporary, and prohibit copying directly to/from thread_info - IMHO. cheers
Re: [PATCH v3 02/11] mm: Hardened usercopy
Josh Poimboeuf writes: > On Thu, Jul 21, 2016 at 11:34:25AM -0700, Kees Cook wrote: >> On Wed, Jul 20, 2016 at 11:52 PM, Michael Ellerman >> wrote: >> > Kees Cook writes: >> > >> >> diff --git a/mm/usercopy.c b/mm/usercopy.c >> >> new file mode 100644 >> >> index ..e4bf4e7ccdf6 >> >> --- /dev/null >> >> +++ b/mm/usercopy.c >> >> @@ -0,0 +1,234 @@ >> > ... >> >> + >> >> +/* >> >> + * Checks if a given pointer and length is contained by the current >> >> + * stack frame (if possible). >> >> + * >> >> + * 0: not at all on the stack >> >> + * 1: fully within a valid stack frame >> >> + * 2: fully on the stack (when can't do frame-checking) >> >> + * -1: error condition (invalid stack position or bad stack frame) >> >> + */ >> >> +static noinline int check_stack_object(const void *obj, unsigned long >> >> len) >> >> +{ >> >> + const void * const stack = task_stack_page(current); >> >> + const void * const stackend = stack + THREAD_SIZE; >> > >> > That allows access to the entire stack, including the struct thread_info, >> > is that what we want - it seems dangerous? Or did I miss a check >> > somewhere else? >> >> That seems like a nice improvement to make, yeah. >> >> > We have end_of_stack() which computes the end of the stack taking >> > thread_info into account (end being the opposite of your end above). >> >> Amusingly, the object_is_on_stack() check in sched.h doesn't take >> thread_info into account either. :P Regardless, I think using >> end_of_stack() may not be best. To tighten the check, I think we could >> add this after checking that the object is on the stack: >> >> #ifdef CONFIG_STACK_GROWSUP >> stackend -= sizeof(struct thread_info); >> #else >> stack += sizeof(struct thread_info); >> #endif >> >> e.g. then if the pointer was in the thread_info, the second test would >> fail, triggering the protection. > > FWIW, this won't work right on x86 after Andy's > CONFIG_THREAD_INFO_IN_TASK patches get merged. Yeah. 
I wonder if it's better for the arch helper to just take the obj and len, and work out its own bounds for the stack using current and whatever makes sense on that arch. It would avoid too much ifdefery in the generic code, and also avoid any confusion about whether stackend is the high or low address. eg. on powerpc we could do: int noinline arch_within_stack_frames(const void *obj, unsigned long len) { void *stack_low = end_of_stack(current); void *stack_high = task_stack_page(current) + THREAD_SIZE; Whereas arches with STACK_GROWSUP=y could do roughly the reverse, and x86 can do whatever it needs to depending on whether the thread_info is on or off stack. cheers
Re: [PATCH] powernv/pci: Add PHB register dump debugfs handle
Quoting Russell Currey (2016-07-22 15:23:36) > On EEH events the kernel will print a dump of relevant registers. > If EEH is unavailable (i.e. CONFIG_EEH is disabled, a new platform > doesn't have EEH support, etc) this information isn't readily available. > > Add a new debugfs handler to trigger a PHB register dump, so that this > information can be made available on demand. This is a bit weird. It's a debugfs file, but when you read from it you get nothing (I think, you have no read() defined). When you write to it, regardless of what you write, the kernel spits some stuff out to dmesg and throws away whatever you wrote. Ideally pnv_pci_dump_phb_diag_data() would write its output to a buffer, which we could then either send to dmesg, or give to debugfs. But that might be more work than we want to do for this. If we just want a trigger file, then I think it'd be preferable to just use a simple attribute, with a set and no show, eg. something like: static int foo_set(void *data, u64 val) { if (val != 1) return -EINVAL; ... return 0; } DEFINE_SIMPLE_ATTRIBUTE(fops_foo, NULL, foo_set, "%llu\n"); That requires that you write "1" to the file to trigger the reg dump. > diff --git a/arch/powerpc/platforms/powernv/pci-ioda.c > b/arch/powerpc/platforms/powernv/pci-ioda.c > index 891fc4a..ada2f3c 100644 > --- a/arch/powerpc/platforms/powernv/pci-ioda.c > +++ b/arch/powerpc/platforms/powernv/pci-ioda.c > @@ -3036,6 +3068,9 @@ static void pnv_pci_ioda_create_dbgfs(void) > if (!phb->dbgfs) > pr_warning("%s: Error on creating debugfs on > PHB#%x\n", > __func__, hose->global_number); > + > + debugfs_create_file("regdump", 0200, phb->dbgfs, hose, > + &pnv_pci_debug_ops); > } You shouldn't be trying to create the file if the directory create failed. So the check for (!phb->dbgfs) should probably print and then continue. And a better name would be "dump-regs", because it indicates that the file does something, rather than is something. 
cheers
Re: [PATCH v2] include: mman: Use bool instead of int for the return value of arch_validate_prot
Andrew Morton writes: > On Mon, 25 Jul 2016 15:10:06 +1000 Michael Ellerman > wrote: >> cheng...@emindsoft.com.cn writes: >> > From: Chen Gang >> > >> > For pure bool function's return value, bool is a little better more or >> > less than int. >> > >> > Signed-off-by: Chen Gang >> >> LGTM. >> >> Acked-by: Michael Ellerman >> >> Andrew do you want to take this or should I? > > I grabbed it, thanks. Thanks.
Re: [PATCH] powernv/pci: Add PHB register dump debugfs handle
Tyrel Datwyler writes: > On 07/21/2016 11:36 PM, Gavin Shan wrote: >> On Fri, Jul 22, 2016 at 03:23:36PM +1000, Russell Currey wrote: >>> diff --git a/arch/powerpc/platforms/powernv/pci-ioda.c >>> b/arch/powerpc/platforms/powernv/pci-ioda.c >>> index 891fc4a..ada2f3c 100644 >>> --- a/arch/powerpc/platforms/powernv/pci-ioda.c >>> +++ b/arch/powerpc/platforms/powernv/pci-ioda.c >>> @@ -3018,6 +3018,38 @@ static void pnv_ioda_setup_pe_seg(struct pnv_ioda_pe >>> *pe) >>> } >>> } >>> >>> +#ifdef CONFIG_DEBUG_FS >>> +static ssize_t pnv_pci_debug_write(struct file *filp, >>> + const char __user *user_buf, >>> + size_t count, loff_t *ppos) >>> +{ >>> + struct pci_controller *hose = filp->private_data; >>> + struct pnv_phb *phb; >>> + int ret = 0; >> >> Needn't initialize @ret in advance. The code might be simpler, but it's >> only a personal preference: > > I believe it's actually preferred that it not be initialized in advance > so that the tooling can warn you about conditional code paths where you > may have forgotten to set a value. Yeah that's right, it's preferable not to initialise it. It helps for complex if/else/switch cases, where you might accidentally have a path where you return without giving ret the right value. The other case is when someone modifies your code. For example if you have: int ret; if (foo) ret = do_foo(); else ret = 1; return ret; And then you add a case to the if: if (foo) ret = do_foo(); else if (bar) do_bar(); else ret = 1; The compiler will warn you that in the bar case you forgot to initialise ret. Whereas if you initialised ret at the start then the compiler can't help you. There are times when it's cleaner to initialise the value at the start, eg. if you have many error cases and only one success case. But that should be a deliberate choice. cheers
Re: [PATCH v4 12/12] mm: SLUB hardened usercopy support
On 07/25/2016 01:45 PM, Kees Cook wrote: On Mon, Jul 25, 2016 at 12:16 PM, Laura Abbott wrote: On 07/20/2016 01:27 PM, Kees Cook wrote: Under CONFIG_HARDENED_USERCOPY, this adds object size checking to the SLUB allocator to catch any copies that may span objects. Includes a redzone handling fix discovered by Michael Ellerman. Based on code from PaX and grsecurity. Signed-off-by: Kees Cook Tested-by: Michael Ellerman --- init/Kconfig | 1 + mm/slub.c| 36 2 files changed, 37 insertions(+) diff --git a/init/Kconfig b/init/Kconfig index 798c2020ee7c..1c4711819dfd 100644 --- a/init/Kconfig +++ b/init/Kconfig @@ -1765,6 +1765,7 @@ config SLAB config SLUB bool "SLUB (Unqueued Allocator)" + select HAVE_HARDENED_USERCOPY_ALLOCATOR help SLUB is a slab allocator that minimizes cache line usage instead of managing queues of cached objects (SLAB approach). diff --git a/mm/slub.c b/mm/slub.c index 825ff4505336..7dee3d9a5843 100644 --- a/mm/slub.c +++ b/mm/slub.c @@ -3614,6 +3614,42 @@ void *__kmalloc_node(size_t size, gfp_t flags, int node) EXPORT_SYMBOL(__kmalloc_node); #endif +#ifdef CONFIG_HARDENED_USERCOPY +/* + * Rejects objects that are incorrectly sized. + * + * Returns NULL if check passes, otherwise const char * to name of cache + * to indicate an error. + */ +const char *__check_heap_object(const void *ptr, unsigned long n, + struct page *page) +{ + struct kmem_cache *s; + unsigned long offset; + size_t object_size; + + /* Find object and usable object size. */ + s = page->slab_cache; + object_size = slab_ksize(s); + + /* Find offset within object. */ + offset = (ptr - page_address(page)) % s->size; + + /* Adjust for redzone and reject if within the redzone. */ + if (kmem_cache_debug(s) && s->flags & SLAB_RED_ZONE) { + if (offset < s->red_left_pad) + return s->name; + offset -= s->red_left_pad; + } + + /* Allow address range falling entirely within object size. 
*/ + if (offset <= object_size && n <= object_size - offset) + return NULL; + + return s->name; +} +#endif /* CONFIG_HARDENED_USERCOPY */ + I compared this against what check_valid_pointer does for SLUB_DEBUG checking. I was hoping we could utilize that function to avoid duplication but a) __check_heap_object needs to allow accesses anywhere in the object, not just the beginning b) accessing page->objects is racy without the addition of locking in SLUB_DEBUG. Still, the ptr < page_address(page) check from __check_heap_object would be good to add to avoid generating garbage large offsets and trying to infer C math. diff --git a/mm/slub.c b/mm/slub.c index 7dee3d9..5370e4f 100644 --- a/mm/slub.c +++ b/mm/slub.c @@ -3632,6 +3632,9 @@ const char *__check_heap_object(const void *ptr, unsigned long n, s = page->slab_cache; object_size = slab_ksize(s); + if (ptr < page_address(page)) + return s->name; + /* Find offset within object. */ offset = (ptr - page_address(page)) % s->size; With that, you can add Reviwed-by: Laura Abbott Cool, I'll add that. Should I add your reviewed-by for this patch only or for the whole series? Thanks! -Kees Just this patch for now, I'm working through a couple of others static size_t __ksize(const void *object) { struct page *page; Thanks, Laura ___ Linuxppc-dev mailing list Linuxppc-dev@lists.ozlabs.org https://lists.ozlabs.org/listinfo/linuxppc-dev
Re: [v3] UCC_GETH/UCC_FAST: Use IS_ERR_VALUE_U32 API to avoid IS_ERR_VALUE abuses.
From: Arvind Yadav Date: Sat, 23 Jul 2016 23:35:51 +0530 > However, anything that passes an 'unsigned short' or 'unsigned int' > argument into IS_ERR_VALUE() is guaranteed to be broken, as are > 8-bit integers and types that are wider than 'unsigned long'. ... > Passing value in IS_ERR_VALUE() is wrong, as they pass an > 'unsigned int' into a function that takes an 'unsigned long' > argument.This happens to work because the type is sign-extended > on 64-bit architectures before it gets converted into an > unsigned type. This commit log message is a complete mess, you're saying exactly the same thing over and over again. Also your Subject line is not formatted correctly, do not list the subsystem prefix in ALL CAPS. Just plain "ucc_geth/ucc_fast: " would be fine.
Re: [PATCH] powernv/pci: Add PHB register dump debugfs handle
On Fri, 2016-07-22 at 16:36 +1000, Gavin Shan wrote: > On Fri, Jul 22, 2016 at 03:23:36PM +1000, Russell Currey wrote: > > > > On EEH events the kernel will print a dump of relevant registers. > > If EEH is unavailable (i.e. CONFIG_EEH is disabled, a new platform > > doesn't have EEH support, etc) this information isn't readily available. > > > > Add a new debugfs handler to trigger a PHB register dump, so that this > > information can be made available on demand. > > > > Signed-off-by: Russell Currey > > Reviewed-by: Gavin Shan Hi Gavin, thanks for the review. > > > > > --- > > arch/powerpc/platforms/powernv/pci-ioda.c | 35 > > +++ > > 1 file changed, 35 insertions(+) > > > > diff --git a/arch/powerpc/platforms/powernv/pci-ioda.c > > b/arch/powerpc/platforms/powernv/pci-ioda.c > > index 891fc4a..ada2f3c 100644 > > --- a/arch/powerpc/platforms/powernv/pci-ioda.c > > +++ b/arch/powerpc/platforms/powernv/pci-ioda.c > > @@ -3018,6 +3018,38 @@ static void pnv_ioda_setup_pe_seg(struct pnv_ioda_pe > > *pe) > > } > > } > > > > +#ifdef CONFIG_DEBUG_FS > > +static ssize_t pnv_pci_debug_write(struct file *filp, > > + const char __user *user_buf, > > + size_t count, loff_t *ppos) > > +{ > > + struct pci_controller *hose = filp->private_data; > > + struct pnv_phb *phb; > > + int ret = 0; > > Needn't initialize @ret in advance. The code might be simpler, but it's > only a personal preference: > > struct pci_controller *hose = filp->private_data; > struct pnv_phb *phb = hose ? hose->private_data : NULL; > > if (!phb) > return -ENODEV; > > > > > + > > + if (!hose) > > + return -EFAULT; > > + > > + phb = hose->private_data; > > + if (!phb) > > + return -EFAULT; > > + > > + ret = opal_pci_get_phb_diag_data2(phb->opal_id, phb->diag.blob, > > + PNV_PCI_DIAG_BUF_SIZE); > > + > > + if (!ret) > > + pnv_pci_dump_phb_diag_data(phb->hose, phb->diag.blob); > > + > > + return ret < 0 ? ret : count; > > return ret == OPAL_SUCCESS ? count : -EIO; Yeah, that's much better. 
> > > > > +} > > + > > +static const struct file_operations pnv_pci_debug_ops = { > > + .open = simple_open, > > + .llseek = no_llseek, > > + .write = pnv_pci_debug_write, > > It might be reasonable to dump the diag-data on read if it is trying > to do it on write. I'm not sure about this one. I went with write since (at least, in my mind) writing to a file feels like triggering an action, although we're not actually reading any input. It also means that it works the same way as the other PHB debugfs entries (i.e. errinjct). I could rework it into a read that said something like "PHB#%x diag data dumped, check the kernel log", what do you think? > > > > > +}; > > +#endif /* CONFIG_DEBUG_FS */ > > + > > static void pnv_pci_ioda_create_dbgfs(void) > > { > > #ifdef CONFIG_DEBUG_FS > > @@ -3036,6 +3068,9 @@ static void pnv_pci_ioda_create_dbgfs(void) > > if (!phb->dbgfs) > > pr_warning("%s: Error on creating debugfs on PHB#%x\n", > > __func__, hose->global_number); > > + > > + debugfs_create_file("regdump", 0200, phb->dbgfs, hose, > > + &pnv_pci_debug_ops); > > "diag-data" might be indicating or a better one you can name :) > > Thanks, > Gavin > > > > > } > > #endif /* CONFIG_DEBUG_FS */ > > } > > -- > > 2.9.0 > > ___ Linuxppc-dev mailing list Linuxppc-dev@lists.ozlabs.org https://lists.ozlabs.org/listinfo/linuxppc-dev
Re: [PATCH 7/8] powerpc: Check arch.vec earlier during boot for memory features
Hi, [auto build test ERROR on pci/next] [cannot apply to powerpc/next next-20160725] [if your patch is applied to the wrong git tree, please drop us a note to help improve the system] url: https://github.com/0day-ci/linux/commits/Michael-Bringmann/powerpc-devtree-Add-support-for-2-new-DRC-properties/20160726-063623 base: https://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci.git next config: powerpc-allnoconfig (attached as .config) compiler: powerpc-linux-gnu-gcc (Debian 5.4.0-6) 5.4.0 20160609 reproduce: wget https://git.kernel.org/cgit/linux/kernel/git/wfg/lkp-tests.git/plain/sbin/make.cross -O ~/bin/make.cross chmod +x ~/bin/make.cross # save the attached .config to linux build tree make.cross ARCH=powerpc All errors (new ones prefixed by >>): arch/powerpc/kernel/built-in.o: In function `early_init_devtree': >> (.init.text+0x1072): undefined reference to `pseries_probe_fw_features' arch/powerpc/kernel/built-in.o: In function `early_init_devtree': (.init.text+0x107a): undefined reference to `pseries_probe_fw_features' --- 0-DAY kernel test infrastructure Open Source Technology Center https://lists.01.org/pipermail/kbuild-all Intel Corporation
Re: [PATCH v4 12/12] mm: SLUB hardened usercopy support
On Mon, 2016-07-25 at 16:29 -0700, Laura Abbott wrote: > On 07/25/2016 02:42 PM, Rik van Riel wrote: > > On Mon, 2016-07-25 at 12:16 -0700, Laura Abbott wrote: > > > On 07/20/2016 01:27 PM, Kees Cook wrote: > > > > Under CONFIG_HARDENED_USERCOPY, this adds object size checking > > > > to > > > > the > > > > SLUB allocator to catch any copies that may span objects. > > > > Includes > > > > a > > > > redzone handling fix discovered by Michael Ellerman. > > > > > > > > Based on code from PaX and grsecurity. > > > > > > > > Signed-off-by: Kees Cook > > > > Tested-by: Michael Ellerman > > > > --- > > > > init/Kconfig | 1 + > > > > mm/slub.c| 36 > > > > 2 files changed, 37 insertions(+) > > > > > > > > diff --git a/init/Kconfig b/init/Kconfig > > > > index 798c2020ee7c..1c4711819dfd 100644 > > > > --- a/init/Kconfig > > > > +++ b/init/Kconfig > > > > @@ -1765,6 +1765,7 @@ config SLAB > > > > > > > > config SLUB > > > > bool "SLUB (Unqueued Allocator)" > > > > + select HAVE_HARDENED_USERCOPY_ALLOCATOR > > > > help > > > > SLUB is a slab allocator that minimizes cache line > > > > usage > > > > instead of managing queues of cached objects (SLAB > > > > approach). > > > > diff --git a/mm/slub.c b/mm/slub.c > > > > index 825ff4505336..7dee3d9a5843 100644 > > > > --- a/mm/slub.c > > > > +++ b/mm/slub.c > > > > @@ -3614,6 +3614,42 @@ void *__kmalloc_node(size_t size, gfp_t > > > > flags, int node) > > > > EXPORT_SYMBOL(__kmalloc_node); > > > > #endif > > > > > > > > +#ifdef CONFIG_HARDENED_USERCOPY > > > > +/* > > > > + * Rejects objects that are incorrectly sized. > > > > + * > > > > + * Returns NULL if check passes, otherwise const char * to > > > > name of > > > > cache > > > > + * to indicate an error. 
> > > > + */ > > > > +const char *__check_heap_object(const void *ptr, unsigned long > > > > n, > > > > + struct page *page) > > > > +{ > > > > + struct kmem_cache *s; > > > > + unsigned long offset; > > > > + size_t object_size; > > > > + > > > > + /* Find object and usable object size. */ > > > > + s = page->slab_cache; > > > > + object_size = slab_ksize(s); > > > > + > > > > + /* Find offset within object. */ > > > > + offset = (ptr - page_address(page)) % s->size; > > > > + > > > > + /* Adjust for redzone and reject if within the > > > > redzone. */ > > > > + if (kmem_cache_debug(s) && s->flags & SLAB_RED_ZONE) { > > > > + if (offset < s->red_left_pad) > > > > + return s->name; > > > > + offset -= s->red_left_pad; > > > > + } > > > > + > > > > + /* Allow address range falling entirely within object > > > > size. */ > > > > + if (offset <= object_size && n <= object_size - > > > > offset) > > > > + return NULL; > > > > + > > > > + return s->name; > > > > +} > > > > +#endif /* CONFIG_HARDENED_USERCOPY */ > > > > + > > > > > > I compared this against what check_valid_pointer does for > > > SLUB_DEBUG > > > checking. I was hoping we could utilize that function to avoid > > > duplication but a) __check_heap_object needs to allow accesses > > > anywhere > > > in the object, not just the beginning b) accessing page->objects > > > is racy without the addition of locking in SLUB_DEBUG. > > > > > > Still, the ptr < page_address(page) check from > > > __check_heap_object > > > would > > > be good to add to avoid generating garbage large offsets and > > > trying > > > to > > > infer C math. 
> > > > > > diff --git a/mm/slub.c b/mm/slub.c > > > index 7dee3d9..5370e4f 100644 > > > --- a/mm/slub.c > > > +++ b/mm/slub.c > > > @@ -3632,6 +3632,9 @@ const char *__check_heap_object(const void > > > *ptr, unsigned long n, > > > s = page->slab_cache; > > > object_size = slab_ksize(s); > > > > > > + if (ptr < page_address(page)) > > > + return s->name; > > > + > > > /* Find offset within object. */ > > > offset = (ptr - page_address(page)) % s->size; > > > > > > > I don't get it, isn't that already guaranteed because we > > look for the page that ptr is in, before __check_heap_object > > is called? > > > > Specifically, in patch 3/12: > > > > + page = virt_to_head_page(ptr); > > + > > + /* Check slab allocator for flags and size. */ > > + if (PageSlab(page)) > > + return __check_heap_object(ptr, n, page); > > > > How can that generate a ptr that is not inside the page? > > > > What am I overlooking? And, should it be in the changelog or > > a comment? :) > > > > > I ran into the subtraction issue when the vmalloc detection wasn't > working on ARM64, somehow virt_to_head_page turned into a page > that happened to have PageSlab set. I agree if everything is working > properly this is redundant but given the type of feature this is, a > little bit of redundancy agai
Re: [PATCH v2] include: mman: Use bool instead of int for the return value of arch_validate_prot
On Mon, 25 Jul 2016 15:10:06 +1000 Michael Ellerman wrote: > cheng...@emindsoft.com.cn writes: > > > From: Chen Gang > > > > For pure bool function's return value, bool is a little better more or > > less than int. > > > > Signed-off-by: Chen Gang > > --- > > arch/powerpc/include/asm/mman.h | 8 > > include/linux/mman.h| 2 +- > > 2 files changed, 5 insertions(+), 5 deletions(-) > > LGTM. > > Acked-by: Michael Ellerman > > Andrew do you want to take this or should I? I grabbed it, thanks.
Re: [PATCH 1/8] powerpc/firmware: Add definitions for new firmware features.
On 07/25/2016 03:21 PM, Michael Bringmann wrote: > Firmware Features: Define new bit flags representing the presence of > new device tree properties "ibm,drc-info", and "ibm,dynamic-memory-v2". > These flags are used to tell the front end processor when the Linux > kernel supports the new properties, and by the front end processor to > tell the Linux kernel that the new properties are present in the device > tree. > > Signed-off-by: Michael Bringmann > --- > diff --git a/arch/powerpc/include/asm/firmware.h > b/arch/powerpc/include/asm/firmware.h > index b062924..a9d66d5 100644 > --- a/arch/powerpc/include/asm/firmware.h > +++ b/arch/powerpc/include/asm/firmware.h > @@ -51,6 +51,8 @@ > #define FW_FEATURE_BEST_ENERGY ASM_CONST(0x8000) > #define FW_FEATURE_TYPE1_AFFINITY ASM_CONST(0x0001) > #define FW_FEATURE_PRRN ASM_CONST(0x0002) > +#define FW_FEATURE_RPS_DM2 ASM_CONST(0x0004) > +#define FW_FEATURE_RPS_DRC_INFO ASM_CONST(0x0008) I can't say that these names are my favorite. Especially _RPS_DM2. I haven't actually seen the PAPR updates that define these things, but I would hope that these had more self-explanatory names. I'm not really sure what _RPS_ means. Like I said I haven't seen the PAPR update so maybe that is a new acronym defined there.
-Tyrel > > #ifndef __ASSEMBLY__ > > @@ -66,7 +68,8 @@ enum { > FW_FEATURE_MULTITCE | FW_FEATURE_SPLPAR | FW_FEATURE_LPAR | > FW_FEATURE_CMO | FW_FEATURE_VPHN | FW_FEATURE_XCMO | > FW_FEATURE_SET_MODE | FW_FEATURE_BEST_ENERGY | > - FW_FEATURE_TYPE1_AFFINITY | FW_FEATURE_PRRN, > + FW_FEATURE_TYPE1_AFFINITY | FW_FEATURE_PRRN | > + FW_FEATURE_RPS_DM2 | FW_FEATURE_RPS_DRC_INFO, > FW_FEATURE_PSERIES_ALWAYS = 0, > FW_FEATURE_POWERNV_POSSIBLE = FW_FEATURE_OPAL, > FW_FEATURE_POWERNV_ALWAYS = 0, > diff --git a/arch/powerpc/include/asm/prom.h b/arch/powerpc/include/asm/prom.h > index 7f436ba..b9a1534 100644 > --- a/arch/powerpc/include/asm/prom.h > +++ b/arch/powerpc/include/asm/prom.h > @@ -155,6 +203,8 @@ struct of_drconf_cell { > #define OV5_PFO_HW_842 0x0E40 /* PFO Compression Accelerator > */ > #define OV5_PFO_HW_ENCR 0x0E20 /* PFO Encryption Accelerator */ > #define OV5_SUB_PROCESSORS 0x0F01 /* 1,2,or 4 Sub-Processors supported */ > +#define OV5_RPS_DM2 0x1680 /* Redef Prop Structures: dyn-mem-v2 */ > +#define OV5_RPS_DRC_INFO 0x1640 /* Redef Prop Structures: drc-info */ > > /* Option Vector 6: IBM PAPR hints */ > #define OV6_LINUX0x02/* Linux is our OS */ > diff --git a/arch/powerpc/platforms/pseries/firmware.c > b/arch/powerpc/platforms/pseries/firmware.c > index 8c80588..00243ee 100644 > --- a/arch/powerpc/platforms/pseries/firmware.c > +++ b/arch/powerpc/platforms/pseries/firmware.c > @@ -111,6 +111,8 @@ static __initdata struct vec5_fw_feature > vec5_fw_features_table[] = { > {FW_FEATURE_TYPE1_AFFINITY, OV5_TYPE1_AFFINITY}, > {FW_FEATURE_PRRN, OV5_PRRN}, > + {FW_FEATURE_RPS_DM2,OV5_RPS_DM2}, > + {FW_FEATURE_RPS_DRC_INFO, OV5_RPS_DRC_INFO}, > }; > > void __init fw_vec5_feature_init(const char *vec5, unsigned long len) > > ___ > Linuxppc-dev mailing list > Linuxppc-dev@lists.ozlabs.org > https://lists.ozlabs.org/listinfo/linuxppc-dev > ___ Linuxppc-dev mailing list Linuxppc-dev@lists.ozlabs.org https://lists.ozlabs.org/listinfo/linuxppc-dev
Re: [PATCH 2/8] powerpc/memory: Parse new memory property to register blocks.
On 07/25/2016 03:21 PM, Michael Bringmann wrote: > powerpc/memory: Add parallel routines to parse the new property > "ibm,dynamic-memory-v2" property when it is present, and then to > register the relevant memory blocks with the operating system. > This property format is intended to provide a more compact > representation of memory when communicating with the front end > processor, especially when describing vast amounts of RAM. > > Signed-off-by: Michael Bringmann > --- > diff --git a/arch/powerpc/include/asm/prom.h b/arch/powerpc/include/asm/prom.h > index 7f436ba..b9a1534 100644 > --- a/arch/powerpc/include/asm/prom.h > +++ b/arch/powerpc/include/asm/prom.h > @@ -69,6 +69,8 @@ struct boot_param_header { > * OF address retreival & translation > */ > > +extern int n_mem_addr_cells; > + > /* Parse the ibm,dma-window property of an OF node into the busno, phys and > * size parameters. > */ > @@ -81,8 +83,9 @@ extern void of_instantiate_rtc(void); > extern int of_get_ibm_chip_id(struct device_node *np); > > /* The of_drconf_cell struct defines the layout of the LMB array > - * specified in the device tree property > - * ibm,dynamic-reconfiguration-memory/ibm,dynamic-memory > + * specified in the device tree properties, > + * ibm,dynamic-reconfiguration-memory/ibm,dynamic-memory > + * ibm,dynamic-reconfiguration-memory/ibm,dynamic-memory-v2 > */ > struct of_drconf_cell { > u64 base_addr; > @@ -92,9 +95,39 @@ struct of_drconf_cell { > u32 flags; > }; > > -#define DRCONF_MEM_ASSIGNED 0x0008 > -#define DRCONF_MEM_AI_INVALID0x0040 > -#define DRCONF_MEM_RESERVED 0x0080 > +#define DRCONF_MEM_ASSIGNED 0x0008 > +#define DRCONF_MEM_AI_INVALID0x0040 > +#define DRCONF_MEM_RESERVED 0x0080 > + > + /* It is important to note that this structure can not > + * be safely mapped onto the memory containing the > + * 'ibm,dynamic-memory-v2'. 
This structure represents > + * the order of the fields stored, but compiler alignment > + * may insert extra bytes of padding between the fields > + * 'num_seq_lmbs' and 'base_addr'. > + */ The "packed" attribute should prevent the struct from being padded. struct of_drconf_cell_v2 { ... } __attribute__((packed)); or, simply struct of_drconf_cell_v2 { ... } __packed; -Tyrel > +struct of_drconf_cell_v2 { > + u32 num_seq_lmbs; > + u64 base_addr; > + u32 drc_index; > + u32 aa_index; > + u32 flags; > +}; > + > + > +static inline int dyn_mem_v2_len(int entries) > +{ > + int drconf_v2_cells = (n_mem_addr_cells + 4); > + int drconf_v2_cells_len = (drconf_v2_cells * sizeof(unsigned int)); > + return (((entries) * drconf_v2_cells_len) + > +(1 * sizeof(unsigned int))); > +} > + > +extern void read_drconf_cell_v2(struct of_drconf_cell_v2 *drmem, > + const __be32 **cellp); > +extern void read_one_drc_info(int **info, char **drc_type, char **drc_name, > + unsigned long int *fdi_p, unsigned long int *nsl_p, > + unsigned long int *si_p, unsigned long int *ldi_p); > > /* > * There are two methods for telling firmware what our capabilities are. > diff --git a/arch/powerpc/mm/numa.c b/arch/powerpc/mm/numa.c > index 669a15e..ad294ce 100644 > --- a/arch/powerpc/mm/numa.c > +++ b/arch/powerpc/mm/numa.c > @@ -405,6 +405,24 @@ static void read_drconf_cell(struct of_drconf_cell > *drmem, const __be32 **cellp) > > *cellp = cp + 4; > } > + > + /* > + * Retrieve and validate the ibm,dynamic-memory property of the device tree. > + * Read the next memory block set entry from the ibm,dynamic-memory-v2 > property > + * and return the information in the provided of_drconf_cell_v2 structure. 
> + */ > +void read_drconf_cell_v2(struct of_drconf_cell_v2 *drmem, const __be32 > **cellp) > +{ > + const __be32 *cp = (const __be32 *)*cellp; > + drmem->num_seq_lmbs = be32_to_cpu(*cp++); > + drmem->base_addr = read_n_cells(n_mem_addr_cells, &cp); > + drmem->drc_index = be32_to_cpu(*cp++); > + drmem->aa_index = be32_to_cpu(*cp++); > + drmem->flags = be32_to_cpu(*cp++); > + > + *cellp = cp; > +} > +EXPORT_SYMBOL(read_drconf_cell_v2); > > /* > * Retrieve and validate the ibm,dynamic-memory property of the device tree. > diff --git a/arch/powerpc/kernel/prom.c b/arch/powerpc/kernel/prom.c > index 946e34f..a55bc1e 100644 > --- a/arch/powerpc/kernel/prom.c > +++ b/arch/powerpc/kernel/prom.c > @@ -56,6 +56,7 @@ > #include > #include > #include > +#include > > #include > > @@ -441,12 +442,12 @@ static int __init > early_init_dt_scan_chosen_ppc(unsigned long node, > > #ifdef CONFIG_PPC_PSERIES > /* > - * Interpret the ibm,dynamic-memory property in the > - * /ibm,dynamic-reconfiguration-memory node. > + * Interpret the ibm,dynamic-memory property/ibm,dynamic-memory-v2 > + * in the /
Re: [PATCH] powernv/pci: Add PHB register dump debugfs handle
On Mon, Jul 25, 2016 at 10:53:49AM -0700, Tyrel Datwyler wrote: >On 07/21/2016 11:36 PM, Gavin Shan wrote: >> On Fri, Jul 22, 2016 at 03:23:36PM +1000, Russell Currey wrote: >>> On EEH events the kernel will print a dump of relevant registers. >>> If EEH is unavailable (i.e. CONFIG_EEH is disabled, a new platform >>> doesn't have EEH support, etc) this information isn't readily available. >>> >>> Add a new debugfs handler to trigger a PHB register dump, so that this >>> information can be made available on demand. >>> >>> Signed-off-by: Russell Currey >> >> Reviewed-by: Gavin Shan >> >>> --- >>> arch/powerpc/platforms/powernv/pci-ioda.c | 35 >>> +++ >>> 1 file changed, 35 insertions(+) >>> >>> diff --git a/arch/powerpc/platforms/powernv/pci-ioda.c >>> b/arch/powerpc/platforms/powernv/pci-ioda.c >>> index 891fc4a..ada2f3c 100644 >>> --- a/arch/powerpc/platforms/powernv/pci-ioda.c >>> +++ b/arch/powerpc/platforms/powernv/pci-ioda.c >>> @@ -3018,6 +3018,38 @@ static void pnv_ioda_setup_pe_seg(struct pnv_ioda_pe >>> *pe) >>> } >>> } >>> >>> +#ifdef CONFIG_DEBUG_FS >>> +static ssize_t pnv_pci_debug_write(struct file *filp, >>> + const char __user *user_buf, >>> + size_t count, loff_t *ppos) >>> +{ >>> + struct pci_controller *hose = filp->private_data; >>> + struct pnv_phb *phb; >>> + int ret = 0; >> >> Needn't initialize @ret in advance. The code might be simpler, but it's >> only a personal preference: > >I believe its actually preferred that it not be initialized in advance >so that the tooling can warn you about conditional code paths where you >may have forgotten to set a value. Or as Gavin suggests to explicitly >use error values in the return statements. > Yeah, the data type should be int64_t as well. Thanks, Gavin
Re: [PATCH v4 12/12] mm: SLUB hardened usercopy support
On 07/25/2016 02:42 PM, Rik van Riel wrote: On Mon, 2016-07-25 at 12:16 -0700, Laura Abbott wrote: On 07/20/2016 01:27 PM, Kees Cook wrote: Under CONFIG_HARDENED_USERCOPY, this adds object size checking to the SLUB allocator to catch any copies that may span objects. Includes a redzone handling fix discovered by Michael Ellerman. Based on code from PaX and grsecurity. Signed-off-by: Kees Cook Tested-by: Michael Ellerman --- init/Kconfig | 1 + mm/slub.c| 36 2 files changed, 37 insertions(+) diff --git a/init/Kconfig b/init/Kconfig index 798c2020ee7c..1c4711819dfd 100644 --- a/init/Kconfig +++ b/init/Kconfig @@ -1765,6 +1765,7 @@ config SLAB config SLUB bool "SLUB (Unqueued Allocator)" + select HAVE_HARDENED_USERCOPY_ALLOCATOR help SLUB is a slab allocator that minimizes cache line usage instead of managing queues of cached objects (SLAB approach). diff --git a/mm/slub.c b/mm/slub.c index 825ff4505336..7dee3d9a5843 100644 --- a/mm/slub.c +++ b/mm/slub.c @@ -3614,6 +3614,42 @@ void *__kmalloc_node(size_t size, gfp_t flags, int node) EXPORT_SYMBOL(__kmalloc_node); #endif +#ifdef CONFIG_HARDENED_USERCOPY +/* + * Rejects objects that are incorrectly sized. + * + * Returns NULL if check passes, otherwise const char * to name of cache + * to indicate an error. + */ +const char *__check_heap_object(const void *ptr, unsigned long n, + struct page *page) +{ + struct kmem_cache *s; + unsigned long offset; + size_t object_size; + + /* Find object and usable object size. */ + s = page->slab_cache; + object_size = slab_ksize(s); + + /* Find offset within object. */ + offset = (ptr - page_address(page)) % s->size; + + /* Adjust for redzone and reject if within the redzone. */ + if (kmem_cache_debug(s) && s->flags & SLAB_RED_ZONE) { + if (offset < s->red_left_pad) + return s->name; + offset -= s->red_left_pad; + } + + /* Allow address range falling entirely within object size. 
*/ + if (offset <= object_size && n <= object_size - offset) + return NULL; + + return s->name; +} +#endif /* CONFIG_HARDENED_USERCOPY */ + I compared this against what check_valid_pointer does for SLUB_DEBUG checking. I was hoping we could utilize that function to avoid duplication but a) __check_heap_object needs to allow accesses anywhere in the object, not just the beginning b) accessing page->objects is racy without the addition of locking in SLUB_DEBUG. Still, the ptr < page_address(page) check from __check_heap_object would be good to add to avoid generating garbage large offsets and trying to infer C math. diff --git a/mm/slub.c b/mm/slub.c index 7dee3d9..5370e4f 100644 --- a/mm/slub.c +++ b/mm/slub.c @@ -3632,6 +3632,9 @@ const char *__check_heap_object(const void *ptr, unsigned long n, s = page->slab_cache; object_size = slab_ksize(s); + if (ptr < page_address(page)) + return s->name; + /* Find offset within object. */ offset = (ptr - page_address(page)) % s->size; I don't get it, isn't that already guaranteed because we look for the page that ptr is in, before __check_heap_object is called? Specifically, in patch 3/12: + page = virt_to_head_page(ptr); + + /* Check slab allocator for flags and size. */ + if (PageSlab(page)) + return __check_heap_object(ptr, n, page); How can that generate a ptr that is not inside the page? What am I overlooking? And, should it be in the changelog or a comment? :) I ran into the subtraction issue when the vmalloc detection wasn't working on ARM64, somehow virt_to_head_page turned into a page that happened to have PageSlab set. I agree if everything is working properly this is redundant but given the type of feature this is, a little bit of redundancy against a system running off into the weeds or bad patches might be warranted. I'm not super attached to the check if other maintainers think it is redundant. 
Updating the __check_heap_object header comment with a note of what we are assuming could work Thanks, Laura
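The offset arithmetic being debated in this thread is easy to get wrong, so a small Python model of the check may help (all values here are hypothetical; the authoritative code is the C in mm/slub.c, and this sketch just mirrors its steps, including the extra ptr < page_address() guard proposed above):

```python
def check_heap_object(ptr, n, page_addr, slab_size, object_size,
                      red_left_pad=0):
    """Return None if copying n bytes at ptr is allowed, else a reason.

    Mirrors the __check_heap_object() logic discussed above: find the
    offset of ptr within its slab object, strip the left redzone when
    the cache is debugged, then require [ptr, ptr + n) to fall
    entirely inside the usable object size.
    """
    if ptr < page_addr:                      # the extra sanity check proposed above
        return "below page"
    offset = (ptr - page_addr) % slab_size   # offset within the object
    if red_left_pad:                         # models SLAB_RED_ZONE caches
        if offset < red_left_pad:            # landing inside the left redzone
            return "redzone"
        offset -= red_left_pad
    if offset <= object_size and n <= object_size - offset:
        return None                          # copy falls entirely within the object
    return "overflow"
```

With slab_size 128 and object_size 100, a 10-byte copy at object offset 2 passes, while a 20-byte copy starting at offset 90 is rejected — exactly the object-spanning case the patch is meant to catch.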
[PATCH 8/8] powerpc: Enable support for new DRC devtree properties
prom_init.c: Enable support for new DRC device tree properties "ibm,drc-info" and "ibm,dynamic-memory-v2" in initial handshake between the Linux kernel and the front end processor. Signed-off-by: Michael Bringmann --- diff -Naur linux-rhel/arch/powerpc/kernel/prom_init.c linux-rhel-patch/arch/powerpc/kernel/prom_init.c --- linux-rhel/arch/powerpc/kernel/prom_init.c 2016-03-03 07:36:25.0 -0600 +++ linux-rhel-patch/arch/powerpc/kernel/prom_init.c2016-06-20 15:59:58.016373676 -0500 @@ -695,7 +695,7 @@ unsigned char ibm_architecture_vec[] = { OV4_MIN_ENT_CAP,/* minimum VP entitled capacity */ /* option vector 5: PAPR/OF options */ - VECTOR_LENGTH(18), /* length */ + VECTOR_LENGTH(22), /* length */ 0, /* don't ignore, don't halt */ OV5_FEAT(OV5_LPAR) | OV5_FEAT(OV5_SPLPAR) | OV5_FEAT(OV5_LARGE_PAGES) | OV5_FEAT(OV5_DRCONF_MEMORY) | OV5_FEAT(OV5_DONATE_DEDICATE_CPU) | @@ -728,6 +728,10 @@ unsigned char ibm_architecture_vec[] = { OV5_FEAT(OV5_PFO_HW_RNG) | OV5_FEAT(OV5_PFO_HW_ENCR) | OV5_FEAT(OV5_PFO_HW_842), OV5_FEAT(OV5_SUB_PROCESSORS), + 0, + 0, + 0, + OV5_FEAT(OV5_RPS_DM2) | OV5_FEAT(OV5_RPS_DRC_INFO), /* option vector 6: IBM PAPR hints */ VECTOR_LENGTH(3), /* length */
[PATCH 7/8] powerpc: Check arch.vec earlier during boot for memory features
architecture.vec5 features: The boot-time memory management needs to know the form of the "ibm,dynamic-memory-v2" property early during scanning of the flattened device tree. This patch moves execution of the function pseries_probe_fw_features() early enough to be before the scanning of the memory properties in the device tree to allow recognition of the supported properties. Signed-off-by: Michael Bringmann --- diff --git a/arch/powerpc/include/asm/opal.h b/arch/powerpc/include/asm/opal.h index 9d86c66..e4c5076 100644 --- a/arch/powerpc/include/asm/opal.h +++ b/arch/powerpc/include/asm/opal.h @@ -215,6 +215,8 @@ extern int early_init_dt_scan_opal(unsigned long node, const char *uname, int depth, void *data); extern int early_init_dt_scan_recoverable_ranges(unsigned long node, const char *uname, int depth, void *data); +extern int pseries_probe_fw_features(unsigned long node, +const char *uname, int depth, void *data); extern int opal_get_chars(uint32_t vtermno, char *buf, int count); extern int opal_put_chars(uint32_t vtermno, const char *buf, int total_len); diff --git a/arch/powerpc/kernel/prom.c b/arch/powerpc/kernel/prom.c index 946e34f..2034edc 100644 --- a/arch/powerpc/kernel/prom.c +++ b/arch/powerpc/kernel/prom.c @@ -777,6 +777,7 @@ void __init early_init_devtree(void *params) of_scan_flat_dt(early_init_dt_scan_chosen_ppc, boot_command_line); /* Scan memory nodes and rebuild MEMBLOCKs */ + of_scan_flat_dt(pseries_probe_fw_features, NULL); of_scan_flat_dt(early_init_dt_scan_root, NULL); of_scan_flat_dt(early_init_dt_scan_memory_ppc, NULL); diff --git a/arch/powerpc/platforms/pseries/setup.c b/arch/powerpc/platforms/pseries/setup.c index 9883bc7..f554205 100644 --- a/arch/powerpc/platforms/pseries/setup.c +++ b/arch/powerpc/platforms/pseries/setup.c @@ -736,7 +736,7 @@ static void pseries_power_off(void) * Called very early, MMU is off, device-tree isn't unflattened */ -static int __init pseries_probe_fw_features(unsigned long node, +int __init 
pseries_probe_fw_features(unsigned long node, const char *uname, int depth, void *data) { @@ -770,6 +770,7 @@ static int __init pseries_probe_fw_features(unsigned long node, return hypertas_found && vec5_found; } +EXPORT_SYMBOL(pseries_probe_fw_features); static int __init pSeries_probe(void) {
[PATCH 6/8] hotplug/drc-info: Add code to search new devtree properties
rpadlpar_core.c: Provide parallel routines to search the older device- tree properties ("ibm,drc-indexes", "ibm,drc-names", "ibm,drc-types" and "ibm,drc-power-domains"), or the new property "ibm,drc-info". The code searches for PHP PCI Slots, gets the DRC properties within the current node (using my-drc-index as correlation), and performs searches by name or type of DRC node. Signed-off-by: Michael Bringmann --- diff --git a/drivers/pci/hotplug/rpadlpar_core.c b/drivers/pci/hotplug/rpadlpar_core.c index dc67f39..bea9723 100644 --- a/drivers/pci/hotplug/rpadlpar_core.c +++ b/drivers/pci/hotplug/rpadlpar_core.c @@ -27,6 +27,7 @@ #include #include #include +#include #include "../pci.h" #include "rpaphp.h" @@ -44,15 +45,14 @@ static struct device_node *find_vio_slot_node(char *drc_name) { struct device_node *parent = of_find_node_by_name(NULL, "vdevice"); struct device_node *dn = NULL; - char *name; int rc; if (!parent) return NULL; while ((dn = of_get_next_child(parent, dn))) { - rc = rpaphp_get_drc_props(dn, NULL, &name, NULL, NULL); - if ((rc == 0) && (!strcmp(drc_name, name))) + rc = rpaphp_check_drc_props(dn, drc_name, NULL); + if (rc == 0) break; } @@ -64,15 +64,12 @@ static struct device_node *find_php_slot_pci_node(char *drc_name, char *drc_type) { struct device_node *np = NULL; - char *name; - char *type; int rc; while ((np = of_find_node_by_name(np, "pci"))) { - rc = rpaphp_get_drc_props(np, NULL, &name, &type, NULL); + rc = rpaphp_check_drc_props(np, drc_name, drc_type); if (rc == 0) - if (!strcmp(drc_name, name) && !strcmp(drc_type, type)) - break; + break; } return np; diff --git a/drivers/pci/hotplug/rpaphp.h b/drivers/pci/hotplug/rpaphp.h index 7db024e..8db5f2e 100644 --- a/drivers/pci/hotplug/rpaphp.h +++ b/drivers/pci/hotplug/rpaphp.h @@ -91,8 +91,8 @@ int rpaphp_get_sensor_state(struct slot *slot, int *state); /* rpaphp_core.c */ int rpaphp_add_slot(struct device_node *dn); -int rpaphp_get_drc_props(struct device_node *dn, int *drc_index, - char 
**drc_name, char **drc_type, int *drc_power_domain); +int rpaphp_check_drc_props(struct device_node *dn, char *drc_name, + char *drc_type); /* rpaphp_slot.c */ void dealloc_slot_struct(struct slot *slot); diff --git a/drivers/pci/hotplug/rpaphp_core.c b/drivers/pci/hotplug/rpaphp_core.c index 8d13202..0cfdbd9 100644 --- a/drivers/pci/hotplug/rpaphp_core.c +++ b/drivers/pci/hotplug/rpaphp_core.c @@ -30,6 +30,7 @@ #include #include #include +#include #include/* for eeh_add_device() */ #include /* rtas_call */ #include /* for pci_controller */ @@ -142,15 +143,6 @@ static enum pci_bus_speed get_max_bus_speed(struct slot *slot) case 5: case 6: speed = PCI_SPEED_33MHz;/* speed for case 1-6 */ - break; - case 7: - case 8: - speed = PCI_SPEED_66MHz; - break; - case 11: - case 14: - speed = PCI_SPEED_66MHz_PCIX; - break; case 12: case 15: speed = PCI_SPEED_100MHz_PCIX; @@ -196,25 +188,21 @@ static int get_children_props(struct device_node *dn, const int **drc_indexes, return 0; } -/* To get the DRC props describing the current node, first obtain it's - * my-drc-index property. Next obtain the DRC list from it's parent. Use - * the my-drc-index for correlation, and obtain the requested properties. + +/* Verify the existence of 'drc_name' and/or 'drc_type' within the + * current node. First obtain it's my-drc-index property. Next, + * obtain the DRC info from it's parent. Use the my-drc-index for + * correlation, and obtain/validate the requested properties. 
*/ -int rpaphp_get_drc_props(struct device_node *dn, int *drc_index, - char **drc_name, char **drc_type, int *drc_power_domain) + +static int rpaphp_check_drc_props_v1(struct device_node *dn, char *drc_name, + char *drc_type, unsigned int my_index) { + char *name_tmp, *type_tmp; const int *indexes, *names; const int *types, *domains; - const unsigned int *my_index; - char *name_tmp, *type_tmp; int i, rc; - my_index = of_get_property(dn, "ibm,my-drc-index", NULL); - if (!my_index) { - /* Node isn't DLPAR/hotplug capable */ - return -EINVAL; - } - rc = get_children_props(dn->parent, &indexes, &names, &types, &domains); if (rc < 0) { return -EINVAL; @@ -225,24 +213,83 @@ int rpaphp_get_drc_props(struct dev
[PATCH 5/8] pseries/drc-info: Search new DRC properties for CPU indexes
pseries/drc-info: Provide parallel routines to convert between drc_index and CPU numbers at runtime, using the older device-tree properties ("ibm,drc-indexes", "ibm,drc-names", "ibm,drc-types" and "ibm,drc-power-domains"), or the new property "ibm,drc-info". Signed-off-by: Michael Bringmann --- diff --git a/arch/powerpc/platforms/pseries/pseries_energy.c b/arch/powerpc/platforms/pseries/pseries_energy.c index 9276779..10c4200 100644 --- a/arch/powerpc/platforms/pseries/pseries_energy.c +++ b/arch/powerpc/platforms/pseries/pseries_energy.c @@ -35,10 +35,68 @@ static int sysfs_entries; /* Helper Routines to convert between drc_index to cpu numbers */ +void read_one_drc_info(int **info, char **dtype, char **dname, + unsigned long int *fdi_p, unsigned long int *nsl_p, + unsigned long int *si_p, unsigned long int *ldi_p) +{ + char *drc_type, *drc_name, *pc; + u32 fdi, nsl, si, ldi; + + fdi = nsl = si = ldi = 0; + + /* Get drc-type:encode-string */ + pc = (char *)info; + drc_type = pc; + pc += (strlen(drc_type) + 1); + + /* Get drc-name-prefix:encode-string */ + drc_name = (char *)pc; + pc += (strlen(drc_name) + 1); + + /* Get drc-index-start:encode-int */ + memcpy(&fdi, pc, 4); + fdi = be32_to_cpu(fdi); + pc += 4; + + /* Get/skip drc-name-suffix-start:encode-int */ + pc += 4; + + /* Get number-sequential-elements:encode-int */ + memcpy(&nsl, pc, 4); + nsl = be32_to_cpu(nsl); + pc += 4; + + /* Get sequential-increment:encode-int */ + memcpy(&si, pc, 4); + si = be32_to_cpu(si); + pc += 4; + + /* Get/skip drc-power-domain:encode-int */ + pc += 4; + + /* Should now know end of current entry */ + ldi = fdi + ((nsl-1)*si); + + (*info) = (int *)pc; + + if (dtype) + *dtype = drc_type; + if (dname) + *dname = drc_name; + if (fdi_p) + *fdi_p = fdi; + if (nsl_p) + *nsl_p = nsl; + if (si_p) + *si_p = si; + if (ldi_p) + *ldi_p = ldi; +} +EXPORT_SYMBOL(read_one_drc_info); + static u32 cpu_to_drc_index(int cpu) { struct device_node *dn = NULL; - const int *indexes; int i; int rc = 1; 
u32 ret = 0; @@ -46,18 +104,54 @@ static u32 cpu_to_drc_index(int cpu) dn = of_find_node_by_path("/cpus"); if (dn == NULL) goto err; - indexes = of_get_property(dn, "ibm,drc-indexes", NULL); - if (indexes == NULL) - goto err_of_node_put; + /* Convert logical cpu number to core number */ i = cpu_core_index_of_thread(cpu); - /* -* The first element indexes[0] is the number of drc_indexes -* returned in the list. Hence i+1 will get the drc_index -* corresponding to core number i. -*/ - WARN_ON(i > indexes[0]); - ret = indexes[i + 1]; + + if (firmware_has_feature(FW_FEATURE_RPS_DRC_INFO)) { + int *info = (int *)4; + unsigned long int num_set_entries, j, iw = i, fdi = 0; + unsigned long int ldi = 0, nsl = 0, si = 0; + char *dtype; + char *dname; + + info = (int *)of_get_property(dn, "ibm,drc-info", NULL); + if (info == NULL) + goto err_of_node_put; + + num_set_entries = be32_to_cpu(*info++); + + for (j = 0; j < num_set_entries; j++) { + + read_one_drc_info(&info, &dtype, &dname, &fdi, + &nsl, &si, &ldi); + if (strcmp(dtype, "CPU")) + goto err; + + if (iw < ldi) + break; + + WARN_ON(((iw-fdi)%si) != 0); + } + WARN_ON((nsl == 0) | (si == 0)); + + ret = ldi + (iw*si); + } else { + const int *indexes; + + indexes = of_get_property(dn, "ibm,drc-indexes", NULL); + if (indexes == NULL) + goto err_of_node_put; + + /* +* The first element indexes[0] is the number of drc_indexes +* returned in the list. Hence i+1 will get the drc_index +* corresponding to core number i. +*/ + WARN_ON(i > indexes[0]); + ret = indexes[i + 1]; + } + rc = 0; err_of_node_put: @@ -78,21 +172,51 @@ static int drc_index_to_cpu(u32 drc_index) dn = of_find_node_by_path("/cpus"); if (dn == NULL) goto err; - indexes = of_get_property(dn, "ibm,drc-indexes", NULL); - if (indexes == NULL) - goto err_of_node_put; - /* -* First element in the array is the nu
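The wire format that read_one_drc_info() steps through (two strings followed by five big-endian words) can be prototyped in a few lines of Python. The buffer in the test is synthetic, but the field order and the last-index computation ldi = fdi + (nsl - 1) * si follow the patch:

```python
import struct

def read_one_drc_info(buf, off):
    """Parse one "ibm,drc-info" entry from bytes buf at offset off.

    Layout per the patch above: drc-type and drc-name-prefix as
    NUL-terminated strings, then drc-index-start, name-suffix-start,
    number-sequential-elements, sequential-increment and
    power-domain as big-endian u32s.  Returns the decoded fields
    and the offset of the next entry.
    """
    end = buf.index(b"\0", off)
    drc_type = buf[off:end].decode()
    off = end + 1
    end = buf.index(b"\0", off)
    drc_name = buf[off:end].decode()
    off = end + 1
    fdi, _suffix, nsl, si, _domain = struct.unpack_from(">5I", buf, off)
    off += 5 * 4
    ldi = fdi + (nsl - 1) * si        # last drc-index in this sequence
    return {"type": drc_type, "name": drc_name, "first": fdi,
            "count": nsl, "step": si, "last": ldi}, off
```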
[PATCH 4/8] pseries/hotplug init: Convert new DRC memory property for hotplug runtime
hotplug_init: Simplify the code needed for runtime memory hotplug and maintenance with a conversion routine that transforms the compressed property "ibm,dynamic-memory-v2" to the form of "ibm,dynamic-memory" within the "ibm,dynamic-reconfiguration-memory" property. Thus only a single set of routines should be required at runtime to parse, edit, and manipulate the memory representation in the device tree. Similarly, any userspace applications that need this information will only need to recognize the older format to be able to continue to operate. Signed-off-by: Michael Bringmann --- diff --git a/arch/powerpc/platforms/pseries/hotplug-memory.c b/arch/powerpc/platforms/pseries/hotplug-memory.c index 2ce1385..f422dcb 100644 --- a/arch/powerpc/platforms/pseries/hotplug-memory.c +++ b/arch/powerpc/platforms/pseries/hotplug-memory.c @@ -839,6 +839,95 @@ static int pseries_update_drconf_memory(struct of_reconfig_data *pr) return rc; } +static int pseries_rewrite_dynamic_memory_v2(void) +{ + unsigned long memblock_size; + struct device_node *dn; + struct property *prop, *prop_v2; + __be32 *p; + struct of_drconf_cell *lmbs; + u32 num_lmb_desc_sets, num_lmbs; + int i; + + dn = of_find_node_by_path("/ibm,dynamic-reconfiguration-memory"); + if (!dn) + return -EINVAL; + + prop_v2 = of_find_property(dn, "ibm,dynamic-memory-v2", NULL); + if (!prop_v2) + return -EINVAL; + + memblock_size = pseries_memory_block_size(); + if (!memblock_size) + return -EINVAL; + + /* The first int of the property is the number of lmb sets +* described by the property. 
+*/ + p = (__be32 *)prop_v2->value; + num_lmb_desc_sets = be32_to_cpu(*p++); + + /* Count the number of LMBs for generating the alternate format +*/ + for (i = 0, num_lmbs = 0; i < num_lmb_desc_sets; i++) { + struct of_drconf_cell_v2 drmem; + + read_drconf_cell_v2(&drmem, (const __be32 **)&p); + num_lmbs += drmem.num_seq_lmbs; + } + + /* Create an empty copy of the new 'ibm,dynamic-memory' property +*/ + { + prop = kzalloc(sizeof(*prop), GFP_KERNEL); + if (!prop) + return -ENOMEM; + prop->name = kstrdup("ibm,dynamic-memory", GFP_KERNEL); + prop->length = dyn_mem_v2_len(num_lmbs); + prop->value = kzalloc(prop->length, GFP_KERNEL); + } + + /* Copy/expand the ibm,dynamic-memory-v2 format to produce the +* ibm,dynamic-memory format. +*/ + p = (__be32 *)prop->value; + *p = cpu_to_be32(num_lmbs); + p++; + lmbs = (struct of_drconf_cell *)p; + + p = (__be32 *)prop_v2->value; + p++; + + for (i = 0; i < num_lmb_desc_sets; i++) { + struct of_drconf_cell_v2 drmem; + int j, k = 0; + + read_drconf_cell_v2(&drmem, (const __be32 **)&p); + + for (j = 0; j < drmem.num_seq_lmbs; j++) { + lmbs[k+j].base_addr = be64_to_cpu(drmem.base_addr); + lmbs[k+j].drc_index = be32_to_cpu(drmem.drc_index); + lmbs[k+j].reserved = 0; + lmbs[k+j].aa_index = be32_to_cpu(drmem.aa_index); + lmbs[k+i].flags = be32_to_cpu(drmem.flags); + + drmem.base_addr += memblock_size; + drmem.drc_index++; + } + + k += drmem.num_seq_lmbs; + } + + of_remove_property(dn, prop_v2); + + of_add_property(dn, prop); + + /* And disable feature flag since the property has gone away */ + powerpc_firmware_features &= ~FW_FEATURE_RPS_DM2; + + return 0; +} + static int pseries_memory_notifier(struct notifier_block *nb, unsigned long action, void *data) { @@ -866,6 +952,8 @@ static struct notifier_block pseries_mem_nb = { static int __init pseries_memory_hotplug_init(void) { + if (firmware_has_feature(FW_FEATURE_RPS_DM2)) + pseries_rewrite_dynamic_memory_v2(); if (firmware_has_feature(FW_FEATURE_LPAR)) 
of_reconfig_notifier_register(&pseries_mem_nb);
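The v2-to-v1 expansion performed by pseries_rewrite_dynamic_memory_v2() boils down to the following loop, sketched here in Python with dicts standing in for the of_drconf_cell structures (the real code also fixes up endianness and property lengths):

```python
def expand_dynamic_memory_v2(lmb_sets, memblock_size):
    """Expand compressed ibm,dynamic-memory-v2 set entries into the
    flat per-LMB list used by ibm,dynamic-memory.

    Each set describes num_seq_lmbs consecutive blocks starting at
    base_addr: successive blocks advance base_addr by memblock_size
    and drc_index by one, while aa_index and flags are shared.
    """
    lmbs = []
    for s in lmb_sets:
        addr, drc = s["base_addr"], s["drc_index"]
        for _ in range(s["num_seq_lmbs"]):
            lmbs.append({"base_addr": addr, "drc_index": drc,
                         "aa_index": s["aa_index"], "flags": s["flags"]})
            addr += memblock_size
            drc += 1
    return lmbs
```

This is also why the v2 property is so much smaller on large configurations: a long run of uniform 256MB LMBs collapses into a single set entry.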
[PATCH 3/8] powerpc/memory: Parse new memory property to initialize structures.
powerpc/memory: Add parallel routines to parse the new property "ibm,dynamic-memory-v2" property when it is present, and then to finish initialization of the relevant memory structures with the operating system. This code is shared between the boot-time initialization functions and the runtime functions for memory hotplug, so it needs to be able to handle both formats. Signed-off-by: Michael Bringmann --- diff --git a/arch/powerpc/mm/numa.c b/arch/powerpc/mm/numa.c index 669a15e..18b4ee7 100644 --- a/arch/powerpc/mm/numa.c +++ b/arch/powerpc/mm/numa.c @@ -57,8 +57,10 @@ EXPORT_SYMBOL(node_data); static int min_common_depth; -static int n_mem_addr_cells, n_mem_size_cells; +int n_mem_addr_cells; +static int n_mem_size_cells; static int form1_affinity; +EXPORT_SYMBOL(n_mem_addr_cells); #define MAX_DISTANCE_REF_POINTS 4 static int distance_ref_points_depth; @@ -405,9 +407,8 @@ *cellp = cp + 4; } - - /* - * Retrieve and validate the ibm,dynamic-memory property of the device tree. + +/* * Read the next memory block set entry from the ibm,dynamic-memory-v2 property * and return the information in the provided of_drconf_cell_v2 structure. */ @@ -425,30 +426,55 @@ EXPORT_SYMBOL(read_drconf_cell_v2); /* - * Retrieve and validate the ibm,dynamic-memory property of the device tree. + * Retrieve and validate the ibm,dynamic-memory[-v2] property of the + * device tree. + * + * The layout of the ibm,dynamic-memory property is a number N of memory + * block description list entries followed by N memory block description + * list entries. Each memory block description list entry contains + * information as laid out in the of_drconf_cell struct above. * - * The layout of the ibm,dynamic-memory property is a number N of memblock - * list entries followed by N memblock list entries. Each memblock list entry - * contains information as laid out in the of_drconf_cell struct above. 
+ * The layout of the ibm,dynamic-memory-v2 property is a number N of memory + * block set description list entries, followed by N memory block set + * description set entries. */ static int of_get_drconf_memory(struct device_node *memory, const __be32 **dm) { const __be32 *prop; u32 len, entries; - prop = of_get_property(memory, "ibm,dynamic-memory", &len); - if (!prop || len < sizeof(unsigned int)) - return 0; + if (firmware_has_feature(FW_FEATURE_RPS_DM2)) { - entries = of_read_number(prop++, 1); + prop = of_get_property(memory, "ibm,dynamic-memory-v2", &len); + if (!prop || len < sizeof(unsigned int)) + return 0; - /* Now that we know the number of entries, revalidate the size -* of the property read in to ensure we have everything -*/ - if (len < (entries * (n_mem_addr_cells + 4) + 1) * sizeof(unsigned int)) - return 0; + entries = of_read_number(prop++, 1); + + /* Now that we know the number of set entries, revalidate the +* size of the property read in to ensure we have everything. +*/ + if (len < dyn_mem_v2_len(entries)) + return 0; + + *dm = prop; + } else { + prop = of_get_property(memory, "ibm,dynamic-memory", &len); + if (!prop || len < sizeof(unsigned int)) + return 0; + + entries = of_read_number(prop++, 1); + + /* Now that we know the number of entries, revalidate the size +* of the property read in to ensure we have everything +*/ + if (len < (entries * (n_mem_addr_cells + 4) + 1) * + sizeof(unsigned int)) + return 0; + + *dm = prop; + } - *dm = prop; return entries; } @@ -511,7 +537,7 @@ * This is like of_node_to_nid_single() for memory represented in the * ibm,dynamic-reconfiguration-memory node. 
*/ -static int of_drconf_to_nid_single(struct of_drconf_cell *drmem, +static int of_drconf_to_nid_single(u32 drmem_flags, u32 drmem_aa_index, struct assoc_arrays *aa) { int default_nid = 0; @@ -519,16 +545,16 @@ int index; if (min_common_depth > 0 && min_common_depth <= aa->array_sz && - !(drmem->flags & DRCONF_MEM_AI_INVALID) && - drmem->aa_index < aa->n_arrays) { - index = drmem->aa_index * aa->array_sz + min_common_depth - 1; + !(drmem_flags & DRCONF_MEM_AI_INVALID) && + drmem_aa_index < aa->n_arrays) { + index = drmem_aa_index * aa->array_sz + min_common_depth - 1; nid = of_read_number(&aa->arrays[index], 1); if (nid == 0x || nid >= MAX_NUMNODES) nid = default_nid; if (nid > 0) { - index = drme
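The aa_index lookup that of_drconf_to_nid_single() performs on the flattened associativity arrays is compact enough to model directly. Below is a Python sketch with made-up arrays; the real cells are big-endian and come from the ibm,associativity-lookup-arrays property, and MAX_NUMNODES here is only illustrative:

```python
DRCONF_MEM_AI_INVALID = 0x40
MAX_NUMNODES = 256   # illustrative limit, not the kernel's configured value

def drconf_to_nid(flags, aa_index, arrays, array_sz, min_common_depth):
    """Pick the associativity array selected by aa_index and read the
    NUMA domain at depth min_common_depth, falling back to node 0 for
    invalid or out-of-range values (mirrors of_drconf_to_nid_single)."""
    default_nid = 0
    n_arrays = len(arrays) // array_sz
    if (0 < min_common_depth <= array_sz
            and not (flags & DRCONF_MEM_AI_INVALID)
            and aa_index < n_arrays):
        index = aa_index * array_sz + min_common_depth - 1
        nid = arrays[index]
        if nid == 0xffff or nid >= MAX_NUMNODES:
            nid = default_nid
        return nid
    return default_nid
```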
[PATCH 2/8] powerpc/memory: Parse new memory property to register blocks.
powerpc/memory: Add parallel routines to parse the new property "ibm,dynamic-memory-v2" property when it is present, and then to register the relevant memory blocks with the operating system. This property format is intended to provide a more compact representation of memory when communicating with the front end processor, especially when describing vast amounts of RAM. Signed-off-by: Michael Bringmann --- diff --git a/arch/powerpc/include/asm/prom.h b/arch/powerpc/include/asm/prom.h index 7f436ba..b9a1534 100644 --- a/arch/powerpc/include/asm/prom.h +++ b/arch/powerpc/include/asm/prom.h @@ -69,6 +69,8 @@ struct boot_param_header { * OF address retreival & translation */ +extern int n_mem_addr_cells; + /* Parse the ibm,dma-window property of an OF node into the busno, phys and * size parameters. */ @@ -81,8 +83,9 @@ extern void of_instantiate_rtc(void); extern int of_get_ibm_chip_id(struct device_node *np); /* The of_drconf_cell struct defines the layout of the LMB array - * specified in the device tree property - * ibm,dynamic-reconfiguration-memory/ibm,dynamic-memory + * specified in the device tree properties, + * ibm,dynamic-reconfiguration-memory/ibm,dynamic-memory + * ibm,dynamic-reconfiguration-memory/ibm,dynamic-memory-v2 */ struct of_drconf_cell { u64 base_addr; @@ -92,9 +95,39 @@ struct of_drconf_cell { u32 flags; }; -#define DRCONF_MEM_ASSIGNED0x0008 -#define DRCONF_MEM_AI_INVALID 0x0040 -#define DRCONF_MEM_RESERVED0x0080 +#define DRCONF_MEM_ASSIGNED0x0008 +#define DRCONF_MEM_AI_INVALID 0x0040 +#define DRCONF_MEM_RESERVED0x0080 + + /* It is important to note that this structure can not +* be safely mapped onto the memory containing the +* 'ibm,dynamic-memory-v2'. This structure represents +* the order of the fields stored, but compiler alignment +* may insert extra bytes of padding between the fields +* 'num_seq_lmbs' and 'base_addr'. 
+*/ +struct of_drconf_cell_v2 { + u32 num_seq_lmbs; + u64 base_addr; + u32 drc_index; + u32 aa_index; + u32 flags; +}; + + +static inline int dyn_mem_v2_len(int entries) +{ + int drconf_v2_cells = (n_mem_addr_cells + 4); + int drconf_v2_cells_len = (drconf_v2_cells * sizeof(unsigned int)); + return (((entries) * drconf_v2_cells_len) + +(1 * sizeof(unsigned int))); +} + +extern void read_drconf_cell_v2(struct of_drconf_cell_v2 *drmem, + const __be32 **cellp); +extern void read_one_drc_info(int **info, char **drc_type, char **drc_name, + unsigned long int *fdi_p, unsigned long int *nsl_p, + unsigned long int *si_p, unsigned long int *ldi_p); /* * There are two methods for telling firmware what our capabilities are. diff --git a/arch/powerpc/mm/numa.c b/arch/powerpc/mm/numa.c index 669a15e..ad294ce 100644 --- a/arch/powerpc/mm/numa.c +++ b/arch/powerpc/mm/numa.c @@ -405,6 +405,24 @@ static void read_drconf_cell(struct of_drconf_cell *drmem, const __be32 **cellp) *cellp = cp + 4; } + + /* + * Retrieve and validate the ibm,dynamic-memory property of the device tree. + * Read the next memory block set entry from the ibm,dynamic-memory-v2 property + * and return the information in the provided of_drconf_cell_v2 structure. + */ +void read_drconf_cell_v2(struct of_drconf_cell_v2 *drmem, const __be32 **cellp) +{ + const __be32 *cp = (const __be32 *)*cellp; + drmem->num_seq_lmbs = be32_to_cpu(*cp++); + drmem->base_addr = read_n_cells(n_mem_addr_cells, &cp); + drmem->drc_index = be32_to_cpu(*cp++); + drmem->aa_index = be32_to_cpu(*cp++); + drmem->flags = be32_to_cpu(*cp++); + + *cellp = cp; +} +EXPORT_SYMBOL(read_drconf_cell_v2); /* * Retrieve and validate the ibm,dynamic-memory property of the device tree. 
diff --git a/arch/powerpc/kernel/prom.c b/arch/powerpc/kernel/prom.c index 946e34f..a55bc1e 100644 --- a/arch/powerpc/kernel/prom.c +++ b/arch/powerpc/kernel/prom.c @@ -56,6 +56,7 @@ #include #include #include +#include #include @@ -441,12 +442,12 @@ static int __init early_init_dt_scan_chosen_ppc(unsigned long node, #ifdef CONFIG_PPC_PSERIES /* - * Interpret the ibm,dynamic-memory property in the - * /ibm,dynamic-reconfiguration-memory node. + * Interpret the ibm,dynamic-memory property/ibm,dynamic-memory-v2 + * in the /ibm,dynamic-reconfiguration-memory node. * This contains a list of memory blocks along with NUMA affinity * information. */ -static int __init early_init_dt_scan_drconf_memory(unsigned long node) +static int __init early_init_dt_scan_drconf_memory_v1(unsigned long node) { const __be32 *dm, *ls, *usm; int l; @@ -516,6 +517,105 @@ static int __init early_init_dt_scan_drconf_memory(unsigned long node) memblock_dump_all(
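The warning in the struct comment about compiler padding is easy to demonstrate. The ctypes sketch below (assuming a typical 64-bit ABI) shows that a naturally aligned of_drconf_cell_v2 places base_addr at offset 8, not at the offset 4 where the property stores it — which is exactly why read_drconf_cell_v2() decodes field by field instead of casting the buffer:

```python
import ctypes

class OfDrconfCellV2(ctypes.Structure):
    """Natural-alignment layout of struct of_drconf_cell_v2.

    This is NOT the on-property byte layout: the device tree packs
    the u32 num_seq_lmbs directly against the u64 base_addr, while
    the compiler inserts padding to align the u64.
    """
    _fields_ = [("num_seq_lmbs", ctypes.c_uint32),
                ("base_addr",    ctypes.c_uint64),
                ("drc_index",    ctypes.c_uint32),
                ("aa_index",     ctypes.c_uint32),
                ("flags",        ctypes.c_uint32)]

def dyn_mem_v2_len(entries, n_mem_addr_cells=2):
    """Bytes in an ibm,dynamic-memory-v2 property with 'entries' sets:
    each set is n_mem_addr_cells + 4 cells of 4 bytes, plus one
    leading cell for the set count (mirrors dyn_mem_v2_len above)."""
    return entries * (n_mem_addr_cells + 4) * 4 + 4
```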
[PATCH 1/8] powerpc/firmware: Add definitions for new firmware features.
Firmware Features: Define new bit flags representing the presence of new device tree properties "ibm,drc-info", and "ibm,dynamic-memory-v2". These flags are used to tell the front end processor when the Linux kernel supports the new properties, and by the front end processor to tell the Linux kernel that the new properties are present in the device tree. Signed-off-by: Michael Bringmann --- diff --git a/arch/powerpc/include/asm/firmware.h b/arch/powerpc/include/asm/firmware.h index b062924..a9d66d5 100644 --- a/arch/powerpc/include/asm/firmware.h +++ b/arch/powerpc/include/asm/firmware.h @@ -51,6 +51,8 @@ #define FW_FEATURE_BEST_ENERGY ASM_CONST(0x8000) #define FW_FEATURE_TYPE1_AFFINITY ASM_CONST(0x0001) #define FW_FEATURE_PRRN ASM_CONST(0x0002) +#define FW_FEATURE_RPS_DM2 ASM_CONST(0x0004) +#define FW_FEATURE_RPS_DRC_INFO ASM_CONST(0x0008) #ifndef __ASSEMBLY__ @@ -66,7 +68,8 @@ enum { FW_FEATURE_MULTITCE | FW_FEATURE_SPLPAR | FW_FEATURE_LPAR | FW_FEATURE_CMO | FW_FEATURE_VPHN | FW_FEATURE_XCMO | FW_FEATURE_SET_MODE | FW_FEATURE_BEST_ENERGY | - FW_FEATURE_TYPE1_AFFINITY | FW_FEATURE_PRRN, + FW_FEATURE_TYPE1_AFFINITY | FW_FEATURE_PRRN | + FW_FEATURE_RPS_DM2 | FW_FEATURE_RPS_DRC_INFO, FW_FEATURE_PSERIES_ALWAYS = 0, FW_FEATURE_POWERNV_POSSIBLE = FW_FEATURE_OPAL, FW_FEATURE_POWERNV_ALWAYS = 0, diff --git a/arch/powerpc/include/asm/prom.h b/arch/powerpc/include/asm/prom.h index 7f436ba..b9a1534 100644 --- a/arch/powerpc/include/asm/prom.h +++ b/arch/powerpc/include/asm/prom.h @@ -155,6 +203,8 @@ struct of_drconf_cell { #define OV5_PFO_HW_842 0x0E40 /* PFO Compression Accelerator */ #define OV5_PFO_HW_ENCR 0x0E20 /* PFO Encryption Accelerator */ #define OV5_SUB_PROCESSORS 0x0F01 /* 1,2,or 4 Sub-Processors supported */ +#define OV5_RPS_DM2 0x1680 /* Redef Prop Structures: dyn-mem-v2 */ +#define OV5_RPS_DRC_INFO 0x1640 /* Redef Prop Structures: drc-info */ /* Option Vector 6: IBM PAPR hints */ #define OV6_LINUX 0x02 /* Linux is our OS */ diff --git
a/arch/powerpc/platforms/pseries/firmware.c b/arch/powerpc/platforms/pseries/firmware.c index 8c80588..00243ee 100644 --- a/arch/powerpc/platforms/pseries/firmware.c +++ b/arch/powerpc/platforms/pseries/firmware.c @@ -111,6 +111,8 @@ static __initdata struct vec5_fw_feature vec5_fw_features_table[] = { {FW_FEATURE_TYPE1_AFFINITY, OV5_TYPE1_AFFINITY}, {FW_FEATURE_PRRN, OV5_PRRN}, + {FW_FEATURE_RPS_DM2,OV5_RPS_DM2}, + {FW_FEATURE_RPS_DRC_INFO, OV5_RPS_DRC_INFO}, }; void __init fw_vec5_feature_init(const char *vec5, unsigned long len)
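The OV5_RPS_* constants in this patch use the kernel's usual option-vector encoding (the OV5_INDX()/OV5_FEAT() convention in prom.h): the high bits select the byte within option vector 5, and the low byte is the bit mask within that byte. A two-line Python helper makes the new values concrete:

```python
def ov5_decode(val):
    """Split an OV5_* constant into (byte index, bit mask), following
    the OV5_INDX()/OV5_FEAT() convention: high bits select the byte
    within option vector 5, the low byte is the mask within it."""
    return val >> 8, val & 0xFF

# The two features added by this patch (values from the hunk above):
OV5_RPS_DM2 = 0x1680       # dynamic-memory-v2 -> byte 0x16 (22), bit 0x80
OV5_RPS_DRC_INFO = 0x1640  # drc-info          -> byte 0x16 (22), bit 0x40
```

Both land in byte 22, which matches patch 8/8 in this series growing option vector 5 from VECTOR_LENGTH(18) to VECTOR_LENGTH(22) and or-ing both OV5_FEAT() masks into the final added byte.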
[PATCH 0/8] powerpc/devtree: Add support for 2 new DRC properties
Several properties in the DRC device tree format are replaced by more compact representations to allow, for example, the encoding of vast amounts of memory, and/or reduced duplication of information in related data structures. "ibm,drc-info": This property, when present, replaces the following four properties: "ibm,drc-indexes", "ibm,drc-names", "ibm,drc-types" and "ibm,drc-power-domains". This property is defined for all dynamically reconfigurable platform nodes. The "ibm,drc-info" elements are intended to provide a more compact representation, and reduce some search overhead. "ibm,dynamic-memory-v2": This property replaces the "ibm,dynamic-memory" node representation within the "ibm,dynamic-reconfiguration-memory" property provided by the BMC. This element format is intended to provide a more compact representation of memory, especially for systems with massive amounts of RAM. To simplify portability, this property is converted to the "ibm,dynamic-memory" property during system boot. "ibm,architecture.vec": Bit flags are added to this data structure by the front end processor to inform the kernel as to whether to expect the changes to one or both of the device tree structures "ibm,drc-info" and "ibm,dynamic-memory-v2". Signed-off-by: Michael Bringmann Michael Bringmann (8): powerpc/firmware: Add definitions for new firmware features. powerpc/memory: Parse new memory property to register blocks. powerpc/memory: Parse new memory property to initialize structures. pseries/hotplug init: Convert new DRC memory property for hotplug runtime pseries/drc-info: Search new DRC properties for CPU indexes hotplug/drc-info: Add code to search new devtree properties powerpc: Check arch.vec earlier during boot for memory features powerpc: Enable support for new DRC devtree properties
Re: [PATCH v4 12/12] mm: SLUB hardened usercopy support
On Mon, 2016-07-25 at 12:16 -0700, Laura Abbott wrote: > On 07/20/2016 01:27 PM, Kees Cook wrote: > > Under CONFIG_HARDENED_USERCOPY, this adds object size checking to > > the > > SLUB allocator to catch any copies that may span objects. Includes > > a > > redzone handling fix discovered by Michael Ellerman. > > > > Based on code from PaX and grsecurity. > > > > Signed-off-by: Kees Cook > > Tested-by: Michael Ellerman > > --- > > init/Kconfig | 1 + > > mm/slub.c| 36 > > 2 files changed, 37 insertions(+) > > > > diff --git a/init/Kconfig b/init/Kconfig > > index 798c2020ee7c..1c4711819dfd 100644 > > --- a/init/Kconfig > > +++ b/init/Kconfig > > @@ -1765,6 +1765,7 @@ config SLAB > > > > config SLUB > > bool "SLUB (Unqueued Allocator)" > > + select HAVE_HARDENED_USERCOPY_ALLOCATOR > > help > > SLUB is a slab allocator that minimizes cache line > > usage > > instead of managing queues of cached objects (SLAB > > approach). > > diff --git a/mm/slub.c b/mm/slub.c > > index 825ff4505336..7dee3d9a5843 100644 > > --- a/mm/slub.c > > +++ b/mm/slub.c > > @@ -3614,6 +3614,42 @@ void *__kmalloc_node(size_t size, gfp_t > > flags, int node) > > EXPORT_SYMBOL(__kmalloc_node); > > #endif > > > > +#ifdef CONFIG_HARDENED_USERCOPY > > +/* > > + * Rejects objects that are incorrectly sized. > > + * > > + * Returns NULL if check passes, otherwise const char * to name of > > cache > > + * to indicate an error. > > + */ > > +const char *__check_heap_object(const void *ptr, unsigned long n, > > + struct page *page) > > +{ > > + struct kmem_cache *s; > > + unsigned long offset; > > + size_t object_size; > > + > > + /* Find object and usable object size. */ > > + s = page->slab_cache; > > + object_size = slab_ksize(s); > > + > > + /* Find offset within object. */ > > + offset = (ptr - page_address(page)) % s->size; > > + > > + /* Adjust for redzone and reject if within the redzone. 
*/ > > + if (kmem_cache_debug(s) && s->flags & SLAB_RED_ZONE) { > > + if (offset < s->red_left_pad) > > + return s->name; > > + offset -= s->red_left_pad; > > + } > > + > > + /* Allow address range falling entirely within object > > size. */ > > + if (offset <= object_size && n <= object_size - offset) > > + return NULL; > > + > > + return s->name; > > +} > > +#endif /* CONFIG_HARDENED_USERCOPY */ > > + > > I compared this against what check_valid_pointer does for SLUB_DEBUG > checking. I was hoping we could utilize that function to avoid > duplication but a) __check_heap_object needs to allow accesses > anywhere > in the object, not just the beginning b) accessing page->objects > is racy without the addition of locking in SLUB_DEBUG. > > Still, the ptr < page_address(page) check from __check_heap_object > would > be good to add to avoid generating garbage large offsets and trying > to > infer C math. > > diff --git a/mm/slub.c b/mm/slub.c > index 7dee3d9..5370e4f 100644 > --- a/mm/slub.c > +++ b/mm/slub.c > @@ -3632,6 +3632,9 @@ const char *__check_heap_object(const void > *ptr, unsigned long n, > s = page->slab_cache; > object_size = slab_ksize(s); > > + if (ptr < page_address(page)) > + return s->name; > + > /* Find offset within object. */ > offset = (ptr - page_address(page)) % s->size; > I don't get it, isn't that already guaranteed because we look for the page that ptr is in, before __check_heap_object is called? Specifically, in patch 3/12: + page = virt_to_head_page(ptr); + + /* Check slab allocator for flags and size. */ + if (PageSlab(page)) + return __check_heap_object(ptr, n, page); How can that generate a ptr that is not inside the page? What am I overlooking? And, should it be in the changelog or a comment? :) -- All Rights Reversed. signature.asc Description: This is a digitally signed message part ___ Linuxppc-dev mailing list Linuxppc-dev@lists.ozlabs.org https://lists.ozlabs.org/listinfo/linuxppc-dev
Re: [PATCH v4 12/12] mm: SLUB hardened usercopy support
On Mon, Jul 25, 2016 at 12:16 PM, Laura Abbott wrote: > On 07/20/2016 01:27 PM, Kees Cook wrote: >> >> Under CONFIG_HARDENED_USERCOPY, this adds object size checking to the >> SLUB allocator to catch any copies that may span objects. Includes a >> redzone handling fix discovered by Michael Ellerman. >> >> Based on code from PaX and grsecurity. >> >> Signed-off-by: Kees Cook >> Tested-by: Michael Ellerman >> --- >> init/Kconfig | 1 + >> mm/slub.c| 36 >> 2 files changed, 37 insertions(+) >> >> diff --git a/init/Kconfig b/init/Kconfig >> index 798c2020ee7c..1c4711819dfd 100644 >> --- a/init/Kconfig >> +++ b/init/Kconfig >> @@ -1765,6 +1765,7 @@ config SLAB >> >> config SLUB >> bool "SLUB (Unqueued Allocator)" >> + select HAVE_HARDENED_USERCOPY_ALLOCATOR >> help >>SLUB is a slab allocator that minimizes cache line usage >>instead of managing queues of cached objects (SLAB approach). >> diff --git a/mm/slub.c b/mm/slub.c >> index 825ff4505336..7dee3d9a5843 100644 >> --- a/mm/slub.c >> +++ b/mm/slub.c >> @@ -3614,6 +3614,42 @@ void *__kmalloc_node(size_t size, gfp_t flags, int >> node) >> EXPORT_SYMBOL(__kmalloc_node); >> #endif >> >> +#ifdef CONFIG_HARDENED_USERCOPY >> +/* >> + * Rejects objects that are incorrectly sized. >> + * >> + * Returns NULL if check passes, otherwise const char * to name of cache >> + * to indicate an error. >> + */ >> +const char *__check_heap_object(const void *ptr, unsigned long n, >> + struct page *page) >> +{ >> + struct kmem_cache *s; >> + unsigned long offset; >> + size_t object_size; >> + >> + /* Find object and usable object size. */ >> + s = page->slab_cache; >> + object_size = slab_ksize(s); >> + >> + /* Find offset within object. */ >> + offset = (ptr - page_address(page)) % s->size; >> + >> + /* Adjust for redzone and reject if within the redzone. 
*/ >> + if (kmem_cache_debug(s) && s->flags & SLAB_RED_ZONE) { >> + if (offset < s->red_left_pad) >> + return s->name; >> + offset -= s->red_left_pad; >> + } >> + >> + /* Allow address range falling entirely within object size. */ >> + if (offset <= object_size && n <= object_size - offset) >> + return NULL; >> + >> + return s->name; >> +} >> +#endif /* CONFIG_HARDENED_USERCOPY */ >> + > > > I compared this against what check_valid_pointer does for SLUB_DEBUG > checking. I was hoping we could utilize that function to avoid > duplication but a) __check_heap_object needs to allow accesses anywhere > in the object, not just the beginning b) accessing page->objects > is racy without the addition of locking in SLUB_DEBUG. > > Still, the ptr < page_address(page) check from __check_heap_object would > be good to add to avoid generating garbage large offsets and trying to > infer C math. > > diff --git a/mm/slub.c b/mm/slub.c > index 7dee3d9..5370e4f 100644 > --- a/mm/slub.c > +++ b/mm/slub.c > @@ -3632,6 +3632,9 @@ const char *__check_heap_object(const void *ptr, > unsigned long n, > s = page->slab_cache; > object_size = slab_ksize(s); > + if (ptr < page_address(page)) > + return s->name; > + > /* Find offset within object. */ > offset = (ptr - page_address(page)) % s->size; > > With that, you can add > > Reviwed-by: Laura Abbott Cool, I'll add that. Should I add your reviewed-by for this patch only or for the whole series? Thanks! -Kees > >> static size_t __ksize(const void *object) >> { >> struct page *page; >> > > Thanks, > Laura -- Kees Cook Chrome OS & Brillo Security ___ Linuxppc-dev mailing list Linuxppc-dev@lists.ozlabs.org https://lists.ozlabs.org/listinfo/linuxppc-dev
Re: [Patch v3 1/3] irqchip/qeic: move qeic driver from drivers/soc/fsl/qe
Hi Zhao Qiang, On Mon, Jul 25, 2016 at 04:59:54PM +0800, Zhao Qiang wrote: > move the driver from drivers/soc/fsl/qe to drivers/irqchip, > merge qe_ic.h and qe_ic.c into irq-qeic.c. > > Signed-off-by: Zhao Qiang > --- > Changes for v2: > - modify the subject and commit msg > Changes for v3: > - merge .h file to .c, rename it with irq-qeic.c > > drivers/irqchip/Makefile | 1 + > drivers/{soc/fsl/qe/qe_ic.c => irqchip/irq-qeic.c} | 82 +++- > drivers/soc/fsl/qe/Makefile| 2 +- > drivers/soc/fsl/qe/qe_ic.h | 103 > - > 4 files changed, 83 insertions(+), 105 deletions(-) > rename drivers/{soc/fsl/qe/qe_ic.c => irqchip/irq-qeic.c} (85%) > delete mode 100644 drivers/soc/fsl/qe/qe_ic.h > > diff --git a/drivers/irqchip/Makefile b/drivers/irqchip/Makefile > index 38853a1..cef999d 100644 > --- a/drivers/irqchip/Makefile > +++ b/drivers/irqchip/Makefile > @@ -69,3 +69,4 @@ obj-$(CONFIG_PIC32_EVIC)+= irq-pic32-evic.o > obj-$(CONFIG_MVEBU_ODMI) += irq-mvebu-odmi.o > obj-$(CONFIG_LS_SCFG_MSI)+= irq-ls-scfg-msi.o > obj-$(CONFIG_EZNPS_GIC) += irq-eznps.o > +obj-$(CONFIG_QUICC_ENGINE) += qe_ic.o Did you test this? 
;-) > diff --git a/drivers/soc/fsl/qe/qe_ic.c b/drivers/irqchip/irq-qeic.c > similarity index 85% > rename from drivers/soc/fsl/qe/qe_ic.c > rename to drivers/irqchip/irq-qeic.c > index ec2ca86..1f91225 100644 > --- a/drivers/soc/fsl/qe/qe_ic.c > +++ b/drivers/irqchip/irq-qeic.c > @@ -30,7 +30,87 @@ > #include > #include > > -#include "qe_ic.h" > +#define NR_QE_IC_INTS64 > + > +/* QE IC registers offset */ > +#define QEIC_CICR0x00 > +#define QEIC_CIVEC 0x04 > +#define QEIC_CRIPNR 0x08 > +#define QEIC_CIPNR 0x0c > +#define QEIC_CIPXCC 0x10 > +#define QEIC_CIPYCC 0x14 > +#define QEIC_CIPWCC 0x18 > +#define QEIC_CIPZCC 0x1c > +#define QEIC_CIMR0x20 > +#define QEIC_CRIMR 0x24 > +#define QEIC_CICNR 0x28 > +#define QEIC_CIPRTA 0x30 > +#define QEIC_CIPRTB 0x34 > +#define QEIC_CRICR 0x3c > +#define QEIC_CHIVEC 0x60 > + > +/* Interrupt priority registers */ > +#define CIPCC_SHIFT_PRI0 29 > +#define CIPCC_SHIFT_PRI1 26 > +#define CIPCC_SHIFT_PRI2 23 > +#define CIPCC_SHIFT_PRI3 20 > +#define CIPCC_SHIFT_PRI4 13 > +#define CIPCC_SHIFT_PRI5 10 > +#define CIPCC_SHIFT_PRI6 7 > +#define CIPCC_SHIFT_PRI7 4 > + > +/* CICR priority modes */ > +#define CICR_GWCC0x0004 > +#define CICR_GXCC0x0002 > +#define CICR_GYCC0x0001 > +#define CICR_GZCC0x0008 > +#define CICR_GRTA0x0020 > +#define CICR_GRTB0x0040 > +#define CICR_HPIT_SHIFT 8 > +#define CICR_HPIT_MASK 0x0300 > +#define CICR_HP_SHIFT24 > +#define CICR_HP_MASK 0x3f00 > + > +/* CICNR */ > +#define CICNR_WCC1T_SHIFT20 > +#define CICNR_ZCC1T_SHIFT28 > +#define CICNR_YCC1T_SHIFT12 > +#define CICNR_XCC1T_SHIFT4 > + > +/* CRICR */ > +#define CRICR_RTA1T_SHIFT20 > +#define CRICR_RTB1T_SHIFT28 > + > +/* Signal indicator */ > +#define SIGNAL_MASK 3 > +#define SIGNAL_HIGH 2 > +#define SIGNAL_LOW 0 > + > +struct qe_ic { > + /* Control registers offset */ > + volatile u32 __iomem *regs; > + > + /* The remapper for this QEIC */ > + struct irq_domain *irqhost; > + > + /* The "linux" controller struct */ > + struct irq_chip hc_irq; > + > + /* VIRQ 
numbers of QE high/low irqs */ > + unsigned int virq_high; > + unsigned int virq_low; > +}; > + > +/* > + * QE interrupt controller internal structure > + */ > +struct qe_ic_info { > + u32 mask; /* location of this source at the QIMR register. */ > + u32 mask_reg; /* Mask register offset */ > + u8 pri_code; /* for grouped interrupts sources - the interrupt > + code as appears at the group priority register */ > + u32 pri_reg; /* Group priority register offset */ > +}; Please, no tail comments. Refer to KernelDoc. > > static DEFINE_RAW_SPINLOCK(qe_ic_lock); > > diff --git a/drivers/soc/fsl/qe/Makefile b/drivers/soc/fsl/qe/Makefile > index 2031d38..51e4726 100644 > --- a/drivers/soc/fsl/qe/Makefile > +++ b/drivers/soc/fsl/qe/Makefile > @@ -1,7 +1,7 @@ > # > # Makefile for the linux ppc-specific parts of QE > # > -obj-$(CONFIG_QUICC_ENGINE)+= qe.o qe_common.o qe_ic.o qe_io.o > +obj-$(CONFIG_QUICC_ENGINE)+= qe.o qe_common.o qe_io.o > obj-$(CONFIG_CPM)+= qe_common.o > obj-$(CONFIG_UCC)+= ucc.o > obj-$(CONFIG_UCC_SLOW) += ucc_slow.o > diff --git a/drivers/soc/fsl/qe/qe_ic.h b/drivers/soc/fsl/qe/qe_ic.h > deleted file mode 100644 > inde
Re: [PATCH v4 12/12] mm: SLUB hardened usercopy support
On 07/20/2016 01:27 PM, Kees Cook wrote: Under CONFIG_HARDENED_USERCOPY, this adds object size checking to the SLUB allocator to catch any copies that may span objects. Includes a redzone handling fix discovered by Michael Ellerman. Based on code from PaX and grsecurity. Signed-off-by: Kees Cook Tested-by: Michael Ellerman --- init/Kconfig | 1 + mm/slub.c| 36 2 files changed, 37 insertions(+) diff --git a/init/Kconfig b/init/Kconfig index 798c2020ee7c..1c4711819dfd 100644 --- a/init/Kconfig +++ b/init/Kconfig @@ -1765,6 +1765,7 @@ config SLAB config SLUB bool "SLUB (Unqueued Allocator)" + select HAVE_HARDENED_USERCOPY_ALLOCATOR help SLUB is a slab allocator that minimizes cache line usage instead of managing queues of cached objects (SLAB approach). diff --git a/mm/slub.c b/mm/slub.c index 825ff4505336..7dee3d9a5843 100644 --- a/mm/slub.c +++ b/mm/slub.c @@ -3614,6 +3614,42 @@ void *__kmalloc_node(size_t size, gfp_t flags, int node) EXPORT_SYMBOL(__kmalloc_node); #endif +#ifdef CONFIG_HARDENED_USERCOPY +/* + * Rejects objects that are incorrectly sized. + * + * Returns NULL if check passes, otherwise const char * to name of cache + * to indicate an error. + */ +const char *__check_heap_object(const void *ptr, unsigned long n, + struct page *page) +{ + struct kmem_cache *s; + unsigned long offset; + size_t object_size; + + /* Find object and usable object size. */ + s = page->slab_cache; + object_size = slab_ksize(s); + + /* Find offset within object. */ + offset = (ptr - page_address(page)) % s->size; + + /* Adjust for redzone and reject if within the redzone. */ + if (kmem_cache_debug(s) && s->flags & SLAB_RED_ZONE) { + if (offset < s->red_left_pad) + return s->name; + offset -= s->red_left_pad; + } + + /* Allow address range falling entirely within object size. 
*/ + if (offset <= object_size && n <= object_size - offset) + return NULL; + + return s->name; +} +#endif /* CONFIG_HARDENED_USERCOPY */ + I compared this against what check_valid_pointer does for SLUB_DEBUG checking. I was hoping we could utilize that function to avoid duplication but a) __check_heap_object needs to allow accesses anywhere in the object, not just the beginning b) accessing page->objects is racy without the addition of locking in SLUB_DEBUG. Still, the ptr < page_address(page) check from __check_heap_object would be good to add to avoid generating garbage large offsets and trying to infer C math. diff --git a/mm/slub.c b/mm/slub.c index 7dee3d9..5370e4f 100644 --- a/mm/slub.c +++ b/mm/slub.c @@ -3632,6 +3632,9 @@ const char *__check_heap_object(const void *ptr, unsigned long n, s = page->slab_cache; object_size = slab_ksize(s); + if (ptr < page_address(page)) + return s->name; + /* Find offset within object. */ offset = (ptr - page_address(page)) % s->size; With that, you can add Reviwed-by: Laura Abbott static size_t __ksize(const void *object) { struct page *page; Thanks, Laura ___ Linuxppc-dev mailing list Linuxppc-dev@lists.ozlabs.org https://lists.ozlabs.org/listinfo/linuxppc-dev
RE: [PATCH v2] irqchip/qeic: move qeic driver from drivers/soc/fsl/qe
On Thu, Jul 07, 2016 at 10:25 PM, Jason Cooper wrote:
> From: Jason Cooper [mailto:ja...@lakedaemon.net]
> Sent: Thursday, July 07, 2016 10:25 PM
> To: Qiang Zhao
> Cc: o...@buserror.net; t...@linutronix.de; marc.zyng...@arm.com; linuxppc-d...@lists.ozlabs.org; linux-ker...@vger.kernel.org; Xiaobo Xie
> Subject: Re: [PATCH v2] irqchip/qeic: move qeic driver from drivers/soc/fsl/qe
>
> Hi Zhao Qiang,
>
> On Thu, Jul 07, 2016 at 09:23:55AM +0800, Zhao Qiang wrote:
> > The driver stays the same.
> >
> > Signed-off-by: Zhao Qiang
> > ---
> > Changes for v2:
> > - modify the subject and commit msg
> >
> > drivers/irqchip/Makefile                | 1 +
> > drivers/{soc/fsl/qe => irqchip}/qe_ic.c | 0
> > drivers/{soc/fsl/qe => irqchip}/qe_ic.h | 0
> > drivers/soc/fsl/qe/Makefile             | 2 +-
> > 4 files changed, 2 insertions(+), 1 deletion(-)
> > rename drivers/{soc/fsl/qe => irqchip}/qe_ic.c (100%)
> > rename drivers/{soc/fsl/qe => irqchip}/qe_ic.h (100%)
>
> Please merge the include file into the C file and rename to follow the naming
> convention in drivers/irqchip/. e.g. irq-qeic.c or irq-qe_ic.c.
>
> Once you have that, please resend the entire series with this as the first
> patch.

Sorry, I have no idea about "Include file", could you explain which file you meant?

Thank you!
-Zhao Qiang
Re: [PATCH] powernv/pci: Add PHB register dump debugfs handle
On 07/21/2016 11:36 PM, Gavin Shan wrote: > On Fri, Jul 22, 2016 at 03:23:36PM +1000, Russell Currey wrote: >> On EEH events the kernel will print a dump of relevant registers. >> If EEH is unavailable (i.e. CONFIG_EEH is disabled, a new platform >> doesn't have EEH support, etc) this information isn't readily available. >> >> Add a new debugfs handler to trigger a PHB register dump, so that this >> information can be made available on demand. >> >> Signed-off-by: Russell Currey > > Reviewed-by: Gavin Shan > >> --- >> arch/powerpc/platforms/powernv/pci-ioda.c | 35 >> +++ >> 1 file changed, 35 insertions(+) >> >> diff --git a/arch/powerpc/platforms/powernv/pci-ioda.c >> b/arch/powerpc/platforms/powernv/pci-ioda.c >> index 891fc4a..ada2f3c 100644 >> --- a/arch/powerpc/platforms/powernv/pci-ioda.c >> +++ b/arch/powerpc/platforms/powernv/pci-ioda.c >> @@ -3018,6 +3018,38 @@ static void pnv_ioda_setup_pe_seg(struct pnv_ioda_pe >> *pe) >> } >> } >> >> +#ifdef CONFIG_DEBUG_FS >> +static ssize_t pnv_pci_debug_write(struct file *filp, >> + const char __user *user_buf, >> + size_t count, loff_t *ppos) >> +{ >> +struct pci_controller *hose = filp->private_data; >> +struct pnv_phb *phb; >> +int ret = 0; > > Needn't initialize @ret in advance. The code might be simpler, but it's > only a personal preference: I believe its actually preferred that it not be initialized in advance so that the tooling can warn you about conditional code paths where you may have forgotten to set a value. Or as Gavin suggests to explicitly use error values in the return statements. -Tyrel > > struct pci_controller *hose = filp->private_data; > struct pnv_phb *phb = hose ? 
hose->private_data : NULL; > > if (!phb) > return -ENODEV; > >> + >> +if (!hose) >> +return -EFAULT; >> + >> +phb = hose->private_data; >> +if (!phb) >> +return -EFAULT; >> + >> +ret = opal_pci_get_phb_diag_data2(phb->opal_id, phb->diag.blob, >> + PNV_PCI_DIAG_BUF_SIZE); >> + >> +if (!ret) >> +pnv_pci_dump_phb_diag_data(phb->hose, phb->diag.blob); >> + >> +return ret < 0 ? ret : count; > > return ret == OPAL_SUCCESS ? count : -EIO; > >> +} >> + >> +static const struct file_operations pnv_pci_debug_ops = { >> +.open = simple_open, >> +.llseek = no_llseek, >> +.write = pnv_pci_debug_write, > > It might be reasonable to dump the diag-data on read if it is trying > to do it on write. > >> +}; >> +#endif /* CONFIG_DEBUG_FS */ >> + >> static void pnv_pci_ioda_create_dbgfs(void) >> { >> #ifdef CONFIG_DEBUG_FS >> @@ -3036,6 +3068,9 @@ static void pnv_pci_ioda_create_dbgfs(void) >> if (!phb->dbgfs) >> pr_warning("%s: Error on creating debugfs on PHB#%x\n", >> __func__, hose->global_number); >> + >> +debugfs_create_file("regdump", 0200, phb->dbgfs, hose, >> +&pnv_pci_debug_ops); > > "diag-data" might be indicating or a better one you can name :) > > Thanks, > Gavin > >> } >> #endif /* CONFIG_DEBUG_FS */ >> } >> -- >> 2.9.0 >> > > ___ > Linuxppc-dev mailing list > Linuxppc-dev@lists.ozlabs.org > https://lists.ozlabs.org/listinfo/linuxppc-dev > ___ Linuxppc-dev mailing list Linuxppc-dev@lists.ozlabs.org https://lists.ozlabs.org/listinfo/linuxppc-dev
Re: [PATCH v4 00/12] mm: Hardened usercopy
On Fri, Jul 22, 2016 at 5:36 PM, Laura Abbott wrote: > On 07/20/2016 01:26 PM, Kees Cook wrote: >> >> Hi, >> >> [This is now in my kspp -next tree, though I'd really love to add some >> additional explicit Tested-bys, Reviewed-bys, or Acked-bys. If you've >> looked through any part of this or have done any testing, please consider >> sending an email with your "*-by:" line. :)] >> >> This is a start of the mainline port of PAX_USERCOPY[1]. After writing >> tests (now in lkdtm in -next) for Casey's earlier port[2], I kept tweaking >> things further and further until I ended up with a whole new patch series. >> To that end, I took Rik, Laura, and other people's feedback along with >> additional changes and clean-ups. >> >> Based on my understanding, PAX_USERCOPY was designed to catch a >> few classes of flaws (mainly bad bounds checking) around the use of >> copy_to_user()/copy_from_user(). These changes don't touch get_user() and >> put_user(), since these operate on constant sized lengths, and tend to be >> much less vulnerable. There are effectively three distinct protections in >> the whole series, each of which I've given a separate CONFIG, though this >> patch set is only the first of the three intended protections. (Generally >> speaking, PAX_USERCOPY covers what I'm calling CONFIG_HARDENED_USERCOPY >> (this) and CONFIG_HARDENED_USERCOPY_WHITELIST (future), and >> PAX_USERCOPY_SLABS covers CONFIG_HARDENED_USERCOPY_SPLIT_KMALLOC >> (future).) >> >> This series, which adds CONFIG_HARDENED_USERCOPY, checks that objects >> being copied to/from userspace meet certain criteria: >> - if address is a heap object, the size must not exceed the object's >> allocated size. (This will catch all kinds of heap overflow flaws.) >> - if address range is in the current process stack, it must be within the >> a valid stack frame (if such checking is possible) or at least entirely >> within the current process's stack. 
(This could catch large lengths that >> would have extended beyond the current process stack, or overflows if >> their length extends back into the original stack.) >> - if the address range is part of kernel data, rodata, or bss, allow it. >> - if address range is page-allocated, that it doesn't span multiple >> allocations (excepting Reserved and CMA pages). >> - if address is within the kernel text, reject it. >> - everything else is accepted >> >> The patches in the series are: >> - Support for examination of CMA page types: >> 1- mm: Add is_migrate_cma_page >> - Support for arch-specific stack frame checking (which will likely be >> replaced in the future by Josh's more comprehensive unwinder): >> 2- mm: Implement stack frame object validation >> - The core copy_to/from_user() checks, without the slab object checks: >> 3- mm: Hardened usercopy >> - Per-arch enablement of the protection: >> 4- x86/uaccess: Enable hardened usercopy >> 5- ARM: uaccess: Enable hardened usercopy >> 6- arm64/uaccess: Enable hardened usercopy >> 7- ia64/uaccess: Enable hardened usercopy >> 8- powerpc/uaccess: Enable hardened usercopy >> 9- sparc/uaccess: Enable hardened usercopy >>10- s390/uaccess: Enable hardened usercopy >> - The heap allocator implementation of object size checking: >>11- mm: SLAB hardened usercopy support >>12- mm: SLUB hardened usercopy support >> >> Some notes: >> >> - This is expected to apply on top of -next which contains fixes for the >> position of _etext on both arm and arm64, though it has some conflicts >> with KASAN that should be trivial to fix up. Also in -next are the >> tests for this protection (in lkdtm), prefixed with USERCOPY_. >> >> - I couldn't detect a measurable performance change with these features >> enabled. Kernel build times were unchanged, hackbench was unchanged, >> etc. I think we could flip this to "on by default" at some point, but >> for now, I'm leaving it off until I can get some more definitive >> measurements. 
I would love if someone with greater familiarity with >> perf could give this a spin and report results. >> >> - The SLOB support extracted from grsecurity seems entirely broken. I >> have no idea what's going on there, I spent my time testing SLAB and >> SLUB. Having someone else look at SLOB would be nice, but this series >> doesn't depend on it. >> >> Additional features that would be nice, but aren't blocking this series: >> >> - Needs more architecture support for stack frame checking (only x86 now, >> but it seems Josh will have a good solution for this soon). >> >> >> Thanks! >> >> -Kees >> >> [1] https://grsecurity.net/download.php "grsecurity - test kernel patch" >> [2] http://www.openwall.com/lists/kernel-hardening/2016/05/19/5 >> >> v4: >> - handle CMA pages, labbott >> - update stack checker comments, labbott >> - check for vmalloc addresses, labbott >> - deal with KASAN in -next changing arm64 copy*user calls >> -
[PATCH v3 2/2] powerpc/pseries: Implement indexed-count hotplug memory remove
Indexed-count remove for memory hotplug guarantees that a contiguous block of lmbs beginning at a specified will be unassigned (NOT that lmbs will be removed). Because of Qemu's per-DIMM memory management, the removal of a contiguous block of memory currently requires a series of individual calls. Indexed-count remove reduces this series into a single call. Signed-off-by: Sahil Mehta --- v2: -use u32s drc_index and count instead of u32 ic[] in dlpar_memory v3: -add logic to handle invalid drc_index input arch/powerpc/platforms/pseries/hotplug-memory.c | 90 +++ 1 file changed, 90 insertions(+) diff --git a/arch/powerpc/platforms/pseries/hotplug-memory.c b/arch/powerpc/platforms/pseries/hotplug-memory.c index 2d4ceb3..dd5eb38 100644 --- a/arch/powerpc/platforms/pseries/hotplug-memory.c +++ b/arch/powerpc/platforms/pseries/hotplug-memory.c @@ -503,6 +503,92 @@ static int dlpar_memory_remove_by_index(u32 drc_index, struct property *prop) return rc; } +static int dlpar_memory_remove_by_ic(u32 lmbs_to_remove, u32 drc_index, +struct property *prop) +{ + struct of_drconf_cell *lmbs; + u32 num_lmbs, *p; + int i, rc, start_lmb_found; + int lmbs_available = 0, start_index = 0, end_index; + + pr_info("Attempting to hot-remove %u LMB(s) at %x\n", + lmbs_to_remove, drc_index); + + if (lmbs_to_remove == 0) + return -EINVAL; + + p = prop->value; + num_lmbs = *p++; + lmbs = (struct of_drconf_cell *)p; + start_lmb_found = 0; + + /* Navigate to drc_index */ + while (start_index < num_lmbs) { + if (lmbs[start_index].drc_index == drc_index) { + start_lmb_found = 1; + break; + } + + start_index++; + } + + if (!start_lmb_found) + return -EINVAL; + + end_index = start_index + lmbs_to_remove; + + /* Validate that there are enough LMBs to satisfy the request */ + for (i = start_index; i < end_index; i++) { + if (lmbs[i].flags & DRCONF_MEM_RESERVED) + break; + + lmbs_available++; + } + + if (lmbs_available < lmbs_to_remove) + return -EINVAL; + + for (i = 0; i < end_index; i++) { + if 
(!(lmbs[i].flags & DRCONF_MEM_ASSIGNED)) + continue; + + rc = dlpar_remove_lmb(&lmbs[i]); + if (rc) + break; + + lmbs[i].reserved = 1; + } + + if (rc) { + pr_err("Memory indexed-count-remove failed, adding any removed LMBs\n"); + + for (i = start_index; i < end_index; i++) { + if (!lmbs[i].reserved) + continue; + + rc = dlpar_add_lmb(&lmbs[i]); + if (rc) + pr_err("Failed to add LMB, drc index %x\n", + be32_to_cpu(lmbs[i].drc_index)); + + lmbs[i].reserved = 0; + } + rc = -EINVAL; + } else { + for (i = start_index; i < end_index; i++) { + if (!lmbs[i].reserved) + continue; + + pr_info("Memory at %llx (drc index %x) was hot-removed\n", + lmbs[i].base_addr, lmbs[i].drc_index); + + lmbs[i].reserved = 0; + } + } + + return rc; +} + #else static inline int pseries_remove_memblock(unsigned long base, unsigned int memblock_size) @@ -829,6 +915,10 @@ int dlpar_memory(struct pseries_hp_errorlog *hp_elog) } else if (hp_elog->id_type == PSERIES_HP_ELOG_ID_DRC_INDEX) { drc_index = hp_elog->_drc_u.drc_index; rc = dlpar_memory_remove_by_index(drc_index, prop); + } else if (hp_elog->id_type == PSERIES_HP_ELOG_ID_DRC_IC) { + count = hp_elog->_drc_u.indexed_count[0]; + drc_index = hp_elog->_drc_u.indexed_count[1]; + rc = dlpar_memory_remove_by_ic(count, drc_index, prop); } else { rc = -EINVAL; } ___ Linuxppc-dev mailing list Linuxppc-dev@lists.ozlabs.org https://lists.ozlabs.org/listinfo/linuxppc-dev
[PATCH v3 1/2] powerpc/pseries: Implement indexed-count hotplug memory add
Indexed-count add for memory hotplug guarantees that a contiguous block of lmbs beginning at a specified will be assigned (NOT that lmbs will be added). Because of Qemu's per-DIMM memory management, the addition of a contiguous block of memory currently requires a series of individual calls. Indexed-count add reduces this series into a single call. Signed-off-by: Sahil Mehta --- v2: -remove potential memory leak when parsing command -use u32s drc_index and count instead of u32 ic[] in dlpar_memory v3: -add logic to handle invalid drc_index input -update indexed-count trigger to follow naming convention -update dlpar_memory to follow kernel if-else style arch/powerpc/include/asm/rtas.h |2 arch/powerpc/platforms/pseries/dlpar.c | 34 ++- arch/powerpc/platforms/pseries/hotplug-memory.c | 110 +-- 3 files changed, 134 insertions(+), 12 deletions(-) diff --git a/arch/powerpc/include/asm/rtas.h b/arch/powerpc/include/asm/rtas.h index 51400ba..088ea75 100644 --- a/arch/powerpc/include/asm/rtas.h +++ b/arch/powerpc/include/asm/rtas.h @@ -307,6 +307,7 @@ struct pseries_hp_errorlog { union { __be32 drc_index; __be32 drc_count; + __be32 indexed_count[2]; chardrc_name[1]; } _drc_u; }; @@ -322,6 +323,7 @@ struct pseries_hp_errorlog { #define PSERIES_HP_ELOG_ID_DRC_NAME1 #define PSERIES_HP_ELOG_ID_DRC_INDEX 2 #define PSERIES_HP_ELOG_ID_DRC_COUNT 3 +#define PSERIES_HP_ELOG_ID_DRC_IC 4 struct pseries_errorlog *get_pseries_errorlog(struct rtas_error_log *log, uint16_t section_id); diff --git a/arch/powerpc/platforms/pseries/dlpar.c b/arch/powerpc/platforms/pseries/dlpar.c index 2b93ae8..6dbd13c 100644 --- a/arch/powerpc/platforms/pseries/dlpar.c +++ b/arch/powerpc/platforms/pseries/dlpar.c @@ -345,11 +345,17 @@ static int handle_dlpar_errorlog(struct pseries_hp_errorlog *hp_elog) switch (hp_elog->id_type) { case PSERIES_HP_ELOG_ID_DRC_COUNT: hp_elog->_drc_u.drc_count = - be32_to_cpu(hp_elog->_drc_u.drc_count); + be32_to_cpu(hp_elog->_drc_u.drc_count); break; case 
PSERIES_HP_ELOG_ID_DRC_INDEX: hp_elog->_drc_u.drc_index = - be32_to_cpu(hp_elog->_drc_u.drc_index); + be32_to_cpu(hp_elog->_drc_u.drc_index); + break; + case PSERIES_HP_ELOG_ID_DRC_IC: + hp_elog->_drc_u.indexed_count[0] = + be32_to_cpu(hp_elog->_drc_u.indexed_count[0]); + hp_elog->_drc_u.indexed_count[1] = + be32_to_cpu(hp_elog->_drc_u.indexed_count[1]); } switch (hp_elog->resource) { @@ -409,7 +415,29 @@ static ssize_t dlpar_store(struct class *class, struct class_attribute *attr, goto dlpar_store_out; } - if (!strncmp(arg, "index", 5)) { + if (!strncmp(arg, "indexed-count", 13)) { + u32 index, count; + char *cstr, *istr; + + hp_elog->id_type = PSERIES_HP_ELOG_ID_DRC_IC; + arg += strlen("indexed-count "); + + cstr = kstrdup(arg, GFP_KERNEL); + istr = strchr(cstr, ' '); + *istr++ = '\0'; + + if (kstrtou32(cstr, 0, &count) || kstrtou32(istr, 0, &index)) { + rc = -EINVAL; + pr_err("Invalid index or count : \"%s\"\n", buf); + kfree(cstr); + goto dlpar_store_out; + } + + kfree(cstr); + + hp_elog->_drc_u.indexed_count[0] = cpu_to_be32(count); + hp_elog->_drc_u.indexed_count[1] = cpu_to_be32(index); + } else if (!strncmp(arg, "index", 5)) { u32 index; hp_elog->id_type = PSERIES_HP_ELOG_ID_DRC_INDEX; diff --git a/arch/powerpc/platforms/pseries/hotplug-memory.c b/arch/powerpc/platforms/pseries/hotplug-memory.c index 2ce1385..2d4ceb3 100644 --- a/arch/powerpc/platforms/pseries/hotplug-memory.c +++ b/arch/powerpc/platforms/pseries/hotplug-memory.c @@ -701,6 +701,89 @@ static int dlpar_memory_add_by_index(u32 drc_index, struct property *prop) return rc; } +static int dlpar_memory_add_by_ic(u32 lmbs_to_add, u32 drc_index, + struct property *prop) +{ + struct of_drconf_cell *lmbs; + u32 num_lmbs, *p; + int i, rc, start_lmb_found; + int lmbs_available = 0, start_index = 0, end_index; + + pr_info("Attempting to hot-add %u LMB(s) at index %x\n", + lmbs_to_add, drc_index); + + if (lmbs_to_add == 0) + return -EINVAL; + + p = prop->value; + num_lmbs = *p++; +
[PATCH v3 0/2] powerpc/pseries: Implement indexed-count hotplug memory management
Indexed-count memory management allows addition and removal of contiguous lmb blocks with a single command. When compared to the series of calls previously required to manage contiguous blocks, indexed-count decreases command frequency and reduces the risk of buffer overflow.

Changes in v2:
--------------
[PATCH 1/2]:
 - remove potential memory leak when parsing command
 - use u32s drc_index and count instead of u32 ic[] in dlpar_memory
[PATCH 2/2]:
 - use u32s drc_index and count instead of u32 ic[] in dlpar_memory

Changes in v3:
--------------
[PATCH 1/2]:
 - add logic to handle invalid drc_index input
 - update indexed-count trigger to follow naming convention
 - update dlpar_memory to follow kernel if-else style
[PATCH 2/2]:
 - add logic to handle invalid drc_index input

Sahil Mehta

---
 include/asm/rtas.h                 |   2
 platforms/pseries/dlpar.c          |  34 +-
 platforms/pseries/hotplug-memory.c | 200 +++--
 3 files changed, 224 insertions(+), 12 deletions(-)

___ Linuxppc-dev mailing list Linuxppc-dev@lists.ozlabs.org https://lists.ozlabs.org/listinfo/linuxppc-dev
[RFC PATCH 9/9] powerpc: rewrite local_t using soft_irq
Local atomic operations are fast and highly reentrant per-CPU counters, used for percpu variable updates. Local atomic operations only guarantee variable modification atomicity with respect to the CPU which owns the data, and they need to be executed in a preemption-safe way.

Here is the design of this patch. Since local_* operations only need to be atomic with respect to interrupts (IIUC), we have two options: either replay the "op" if interrupted, or replay the interrupt after the "op". The initial patchset posted was based on implementing the local_* operations using CR5, which replays the "op". That patchset had issues when rewinding an address pointer into an array, which made the slow path really slow. Since the CR5-based implementation proposed using __ex_table to find the rewind address, it also raised concerns about the size of __ex_table and vmlinux.

https://lists.ozlabs.org/pipermail/linuxppc-dev/2014-December/123115.html

This patch instead uses Benjamin Herrenschmidt's suggestion of arch_local_irq_disable_var() to soft-disable interrupts (including PMIs). After finishing the "op", arch_local_irq_restore() is called, and any interrupts that occurred in the meantime are replayed. The patch rewrites the current local_* functions to use arch_local_irq_disable. The base flow for each function is:

{
	arch_local_irq_disable_var(2)
	load
	..
	store
	arch_local_irq_restore()
}

Currently only asm/local.h has been rewritten, and the entire change has been tested only on PPC64 (pseries guest).

The reason for this approach is that the l[w/d]arx/st[w/d]cx. instruction pair currently used for local_* operations is heavy on cycle count, and these instructions do not support a local variant. To see whether the new implementation helps, a modified version of Rusty's benchmark code for local_t was used:

https://lkml.org/lkml/2008/12/16/450

Modifications to Rusty's benchmark code:
 - Executed only the local_t test

Here are the values with the patch.
Time in ns per iteration

Local_t      Without Patch  With Patch
_inc         28             8
_add         28             8
_read        3              3
_add_return  28             7

Tested the patch in a - pSeries LPAR (with perf record)

Signed-off-by: Madhavan Srinivasan --- arch/powerpc/include/asm/local.h | 91 +++- 1 file changed, 63 insertions(+), 28 deletions(-)

diff --git a/arch/powerpc/include/asm/local.h b/arch/powerpc/include/asm/local.h index b8da91363864..afd3dabd92cb 100644 --- a/arch/powerpc/include/asm/local.h +++ b/arch/powerpc/include/asm/local.h @@ -14,24 +14,50 @@ typedef struct #define local_read(l) atomic_long_read(&(l)->a) #define local_set(l,i) atomic_long_set(&(l)->a, (i)) -#define local_add(i,l) atomic_long_add((i),(&(l)->a)) -#define local_sub(i,l) atomic_long_sub((i),(&(l)->a)) -#define local_inc(l) atomic_long_inc(&(l)->a) -#define local_dec(l) atomic_long_dec(&(l)->a) +static __inline__ void local_add(long i, local_t *l) +{ + long t; + unsigned long flags; + + flags = arch_local_irq_disable_var(2); + __asm__ __volatile__( + PPC_LL" %0,0(%2)\n\ + add %0,%1,%0\n" + PPC_STL" %0,0(%2)\n" + : "=&r" (t) + : "r" (i), "r" (&(l->a.counter))); + arch_local_irq_restore(flags); +} + +static __inline__ void local_sub(long i, local_t *l) +{ + long t; + unsigned long flags; + + flags = arch_local_irq_disable_var(2); + __asm__ __volatile__( + PPC_LL" %0,0(%2)\n\ + subf%0,%1,%0\n" + PPC_STL" %0,0(%2)\n" + : "=&r" (t) + : "r" (i), "r" (&(l->a.counter))); + arch_local_irq_restore(flags); +} static __inline__ long local_add_return(long a, local_t *l) { long t; + unsigned long flags; + + flags = arch_local_irq_disable_var(2); __asm__ __volatile__( -"1:" PPC_LLARX(%0,0,%2,0) " # local_add_return\n\ + PPC_LL" %0,0(%2)\n\ add %0,%1,%0\n" - PPC405_ERR77(0,%2) - PPC_STLCX "%0,0,%2 \n\ - bne-1b" + PPC_STL "%0,0(%2)\n" : "=&r" (t) : "r" (a), "r" (&(l->a.counter)) : "cc", "memory"); + arch_local_irq_restore(flags); return t; } @@ -41,16 +67,18 @@ static __inline__ long local_add_return(long a, local_t *l) static __inline__ long
local_sub_return(long a, local_t *l) { long t; + unsigned long flags; + + flags = arch_local_irq_disable_var(2); __asm__ __volatile__( -"1:" PPC_LLARX(%0,0,%2,0) " # local_sub_return\n\ +"1:" PPC_LL" %0,0(%2)\n\ subf%0,%1,%0\n" - PPC405_ERR77(0,%2) - PPC_STLCX "%0,0,%2 \n\ - bne-1b" + PPC_STL "%0,0(%2)\n" : "=&r" (t) : "r" (a), "r" (&(l->a.counter)) : "cc", "memory"); + arch_local_irq_restore(flags); return t; } @@ -58,16 +86,17 @@ static __inline__ long local
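For readers skimming the flattened diff above, the soft-disable pattern it implements can be sketched in plain C. This is an illustrative model only: the two stub functions below stand in for the real paca->soft_enabled store and the interrupt replay done by arch_local_irq_restore(), and the kernel versions use inline assembly rather than plain C arithmetic.

```c
#include <assert.h>

typedef struct { long counter; } local_t;

/* Stand-ins for the kernel primitives: in the patch these write
 * paca->soft_enabled and replay any pending interrupts on restore.
 * Here a plain variable models the mask level. */
static unsigned long fake_soft_enabled;

static unsigned long arch_local_irq_disable_var(unsigned long level)
{
	unsigned long old = fake_soft_enabled;

	fake_soft_enabled = level;	/* level 2 also masks PMIs */
	return old;
}

static void arch_local_irq_restore(unsigned long flags)
{
	fake_soft_enabled = flags;	/* replay of interrupts happens here */
}

/* The plain load/modify/store is interrupt-atomic because every
 * interrupt, including PMIs at level 2, is deferred until restore. */
static void local_add(long i, local_t *l)
{
	unsigned long flags = arch_local_irq_disable_var(2);

	l->counter += i;
	arch_local_irq_restore(flags);
}

static long local_add_return(long a, local_t *l)
{
	unsigned long flags = arch_local_irq_disable_var(2);
	long t = (l->counter += a);

	arch_local_irq_restore(flags);
	return t;
}
```

The sketch shows why the benchmark numbers drop: the larx/stcx. retry loop is replaced by a byte store, a plain load/store pair, and a byte store on restore.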
[RFC PATCH 8/9] powerpc: Support to replay PMIs
Code to replay the Performance Monitoring Interrupts (PMI). In the masked_interrupt handler, for PMIs we reset MSR[EE] and return. This is due to the fact that PMIs are level triggered. In __check_irq_replay(), we enable MSR[EE], which will fire the interrupt for us.

The patch also adds a new arch_local_irq_disable_var() variant. The new variant takes an input value to write to paca->soft_enabled. This will be used in a following patch to implement the tri-state value for soft_enabled.

Signed-off-by: Madhavan Srinivasan --- arch/powerpc/include/asm/hw_irq.h | 14 ++ arch/powerpc/kernel/irq.c | 9 - 2 files changed, 22 insertions(+), 1 deletion(-) diff --git a/arch/powerpc/include/asm/hw_irq.h b/arch/powerpc/include/asm/hw_irq.h index cc69dde6eb84..863179654452 100644 --- a/arch/powerpc/include/asm/hw_irq.h +++ b/arch/powerpc/include/asm/hw_irq.h @@ -81,6 +81,20 @@ static inline unsigned long arch_local_irq_disable(void) return flags; } +static inline unsigned long arch_local_irq_disable_var(int value) +{ + unsigned long flags, zero; + + asm volatile( + "li %1,%3; lbz %0,%2(13); stb %1,%2(13)" + : "=r" (flags), "=&r" (zero) + : "i" (offsetof(struct paca_struct, soft_enabled)),\ + "i" (value) + : "memory"); + + return flags; +} + extern void arch_local_irq_restore(unsigned long); static inline void arch_local_irq_enable(void) diff --git a/arch/powerpc/kernel/irq.c b/arch/powerpc/kernel/irq.c index 597c20d1814c..81fe0da1f86d 100644 --- a/arch/powerpc/kernel/irq.c +++ b/arch/powerpc/kernel/irq.c @@ -158,9 +158,16 @@ notrace unsigned int __check_irq_replay(void) if ((happened & PACA_IRQ_DEC) || decrementer_check_overflow()) return 0x900; + /* +* In masked_handler() for PMI, we disable MSR[EE] and return. +* When replaying it, just enabling the MSR[EE] will do +* trick, since the PMI are "level" triggered
+*/ + local_paca->irq_happened &= ~PACA_IRQ_PMI; + /* Finally check if an external interrupt happened */ local_paca->irq_happened &= ~PACA_IRQ_EE; - if (happened & PACA_IRQ_EE) + if ((happened & PACA_IRQ_EE) || (happened & PACA_IRQ_PMI)) return 0x500; #ifdef CONFIG_PPC_BOOK3E -- 2.7.4 ___ Linuxppc-dev mailing list Linuxppc-dev@lists.ozlabs.org https://lists.ozlabs.org/listinfo/linuxppc-dev
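The replay decision in the patch above can be modelled in ordinary C. The constants and the helper below are illustrative stand-ins (the real bit values live in arch/powerpc/include/asm/hw_irq.h and the real function is __check_irq_replay()); the point is only that a pending PMI, being level triggered, is handed back through the 0x500 external-interrupt path once MSR[EE] is re-enabled, rather than needing its own replay vector.

```c
#include <assert.h>

/* Illustrative bit values, not the kernel's actual definitions. */
#define PACA_IRQ_EE	0x04
#define PACA_IRQ_PMI	0x40

/* Sketch of the tail of __check_irq_replay(): clear the pending bits,
 * then return the external-interrupt vector if either an EE or a PMI
 * was caught while soft-disabled. Returning 0x500 with MSR[EE] on is
 * enough for the level-triggered PMI to re-fire. */
static unsigned int check_irq_replay(unsigned int *irq_happened)
{
	unsigned int happened = *irq_happened;

	*irq_happened &= ~(PACA_IRQ_PMI | PACA_IRQ_EE);
	if (happened & (PACA_IRQ_EE | PACA_IRQ_PMI))
		return 0x500;
	return 0;
}
```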
[RFC PATCH 7/9] powerpc: Add support to mask perf interrupts
To support masking of the PMI interrupts, a couple of new interrupt handler macros are added: MASKABLE_EXCEPTION_PSERIES_OOL and MASKABLE_RELON_EXCEPTION_PSERIES_OOL. These are needed to include the SOFTEN_TEST and implement the support in both the host and guest kernels.

A couple of new irq #defines, "PACA_IRQ_PMI" and "SOFTEN_VALUE_0xf0*", are added for use in the exception code to check for PMI interrupts.

The __SOFTEN_TEST macro is modified to support the PMI interrupt. The present __SOFTEN_TEST code loads soft_enabled from the paca and checks it to decide whether to call the masked_interrupt handler code. To support both the current behaviour and PMI masking, these changes are added:

1) The current LR register content is saved in R11
2) The "bge" branch operation is changed to "bgel"
3) R11 is restored to LR

Reason: to retain PMI-as-NMI behaviour for a flag state of 1, we save the LR register value in R11 and branch to the "masked_interrupt" handler with an LR update. In the "masked_interrupt" handler, we check the "SOFTEN_VALUE_*" value in R10 and branch back with "blr" if it is a PMI. To mask PMIs for a flag value > 1, masked_interrupt avoids the above check, continues to execute the masked_interrupt code, disables MSR[EE], and updates irq_happened with the PMI info.

Finally, the saving of R11 is moved before the call to SOFTEN_TEST in the __EXCEPTION_PROLOG_1 macro, to support saving of LR values in SOFTEN_TEST.
Signed-off-by: Madhavan Srinivasan --- arch/powerpc/include/asm/exception-64s.h | 22 -- arch/powerpc/include/asm/hw_irq.h| 1 + arch/powerpc/kernel/exceptions-64s.S | 27 --- 3 files changed, 45 insertions(+), 5 deletions(-) diff --git a/arch/powerpc/include/asm/exception-64s.h b/arch/powerpc/include/asm/exception-64s.h index 44d3f539d8a5..c951b7ab5108 100644 --- a/arch/powerpc/include/asm/exception-64s.h +++ b/arch/powerpc/include/asm/exception-64s.h @@ -166,8 +166,8 @@ END_FTR_SECTION_NESTED(ftr,ftr,943) OPT_SAVE_REG_TO_PACA(area+EX_CFAR, r10, CPU_FTR_CFAR); \ SAVE_CTR(r10, area);\ mfcrr9; \ - extra(vec); \ std r11,area+EX_R11(r13); \ + extra(vec); \ std r12,area+EX_R12(r13); \ GET_SCRATCH0(r10); \ std r10,area+EX_R13(r13) @@ -403,12 +403,17 @@ label##_relon_hv: \ #define SOFTEN_VALUE_0xe82 PACA_IRQ_DBELL #define SOFTEN_VALUE_0xe60 PACA_IRQ_HMI #define SOFTEN_VALUE_0xe62 PACA_IRQ_HMI +#define SOFTEN_VALUE_0xf01 PACA_IRQ_PMI +#define SOFTEN_VALUE_0xf00 PACA_IRQ_PMI #define __SOFTEN_TEST(h, vec) \ lbz r10,PACASOFTIRQEN(r13); \ cmpwi r10,LAZY_INTERRUPT_DISABLED;\ li r10,SOFTEN_VALUE_##vec; \ - bge masked_##h##interrupt + mflrr11;\ + bgelmasked_##h##interrupt; \ + mtlrr11; + #define _SOFTEN_TEST(h, vec) __SOFTEN_TEST(h, vec) #define SOFTEN_TEST_PR(vec)\ @@ -438,6 +443,12 @@ label##_pSeries: \ _MASKABLE_EXCEPTION_PSERIES(vec, label, \ EXC_STD, SOFTEN_TEST_PR) +#define MASKABLE_EXCEPTION_PSERIES_OOL(vec, label) \ + .globl label##_pSeries; \ +label##_pSeries: \ + EXCEPTION_PROLOG_1(PACA_EXGEN, SOFTEN_TEST_PR, vec);\ + EXCEPTION_PROLOG_PSERIES_1(label##_common, EXC_STD); + #define MASKABLE_EXCEPTION_HV(loc, vec, label) \ . 
= loc;\ .globl label##_hv; \ @@ -466,6 +477,13 @@ label##_relon_pSeries: \ _MASKABLE_RELON_EXCEPTION_PSERIES(vec, label, \ EXC_STD, SOFTEN_NOTEST_PR) +#define MASKABLE_RELON_EXCEPTION_PSERIES_OOL(vec, label) \ + .globl label##_relon_pSeries; \ +label##_relon_pSeries: \ + EXCEPTION_PROLOG_1(PACA_EXGEN, SOFTEN_NOTEST_PR, vec); \ + EXCEPTION_PROLOG_PSERIES_1(label##_common, EXC_STD); + + #define MASKABLE_RELON_EXCEPTION_HV(loc, vec, label)
[RFC PATCH 6/9] powerpc: modify __SOFTEN_TEST to support tri-state soft_enabled flag
Foundation patch to support checking of new flag for "paca->soft_enabled". Modify the condition checking for the "soft_enabled" from "equal" to "greater than or equal to". Signed-off-by: Madhavan Srinivasan --- arch/powerpc/include/asm/exception-64s.h | 2 +- arch/powerpc/include/asm/hw_irq.h| 4 ++-- arch/powerpc/include/asm/irqflags.h | 2 +- arch/powerpc/kernel/entry_64.S | 4 ++-- arch/powerpc/kernel/irq.c| 4 ++-- 5 files changed, 8 insertions(+), 8 deletions(-) diff --git a/arch/powerpc/include/asm/exception-64s.h b/arch/powerpc/include/asm/exception-64s.h index e24e63d216c4..44d3f539d8a5 100644 --- a/arch/powerpc/include/asm/exception-64s.h +++ b/arch/powerpc/include/asm/exception-64s.h @@ -408,7 +408,7 @@ label##_relon_hv: \ lbz r10,PACASOFTIRQEN(r13); \ cmpwi r10,LAZY_INTERRUPT_DISABLED;\ li r10,SOFTEN_VALUE_##vec; \ - beq masked_##h##interrupt + bge masked_##h##interrupt #define _SOFTEN_TEST(h, vec) __SOFTEN_TEST(h, vec) #define SOFTEN_TEST_PR(vec)\ diff --git a/arch/powerpc/include/asm/hw_irq.h b/arch/powerpc/include/asm/hw_irq.h index 2b87930e0e82..b7c7f1c6706f 100644 --- a/arch/powerpc/include/asm/hw_irq.h +++ b/arch/powerpc/include/asm/hw_irq.h @@ -94,7 +94,7 @@ static inline unsigned long arch_local_irq_save(void) static inline bool arch_irqs_disabled_flags(unsigned long flags) { - return flags == LAZY_INTERRUPT_DISABLED; + return flags >= LAZY_INTERRUPT_DISABLED; } static inline bool arch_irqs_disabled(void) @@ -139,7 +139,7 @@ static inline void may_hard_irq_enable(void) static inline bool arch_irq_disabled_regs(struct pt_regs *regs) { - return (regs->softe == LAZY_INTERRUPT_DISABLED); + return (regs->softe >= LAZY_INTERRUPT_DISABLED); } extern bool prep_irq_for_idle(void); diff --git a/arch/powerpc/include/asm/irqflags.h b/arch/powerpc/include/asm/irqflags.h index 6091e46f2455..235055fabf65 100644 --- a/arch/powerpc/include/asm/irqflags.h +++ b/arch/powerpc/include/asm/irqflags.h @@ -52,7 +52,7 @@ li __rA,LAZY_INTERRUPT_DISABLED; \ ori 
__rB,__rB,PACA_IRQ_HARD_DIS;\ stb __rB,PACAIRQHAPPENED(r13); \ - beq 44f;\ + bge 44f;\ stb __rA,PACASOFTIRQEN(r13);\ TRACE_DISABLE_INTS; \ 44: diff --git a/arch/powerpc/kernel/entry_64.S b/arch/powerpc/kernel/entry_64.S index cade169a7517..7ab6bfff653e 100644 --- a/arch/powerpc/kernel/entry_64.S +++ b/arch/powerpc/kernel/entry_64.S @@ -766,7 +766,7 @@ restore: ld r5,SOFTE(r1) lbz r6,PACASOFTIRQEN(r13) cmpwi cr0,r5,LAZY_INTERRUPT_DISABLED - beq restore_irq_off + bge restore_irq_off /* We are enabling, were we already enabled ? Yes, just return */ cmpwi cr0,r6,LAZY_INTERRUPT_ENABLED @@ -1012,7 +1012,7 @@ _GLOBAL(enter_rtas) * check it with the asm equivalent of WARN_ON */ lbz r0,PACASOFTIRQEN(r13) -1: tdnei r0,LAZY_INTERRUPT_DISABLED +1: tdeqi r0,LAZY_INTERRUPT_ENABLED EMIT_BUG_ENTRY 1b,__FILE__,__LINE__,BUGFLAG_WARNING #endif diff --git a/arch/powerpc/kernel/irq.c b/arch/powerpc/kernel/irq.c index 9b9b6df8d83d..597c20d1814c 100644 --- a/arch/powerpc/kernel/irq.c +++ b/arch/powerpc/kernel/irq.c @@ -202,7 +202,7 @@ notrace void arch_local_irq_restore(unsigned long en) /* Write the new soft-enabled value */ set_soft_enabled(en); - if (en == LAZY_INTERRUPT_DISABLED) + if (en >= LAZY_INTERRUPT_DISABLED) return; /* * From this point onward, we can take interrupts, preempt, @@ -247,7 +247,7 @@ notrace void arch_local_irq_restore(unsigned long en) } #endif /* CONFIG_TRACE_IRQFLAG */ - set_soft_enabled(LAZY_INTERRUPT_DISABLED); + set_soft_enabled(en); /* * Check if anything needs to be re-emitted. We haven't -- 2.7.4 ___ Linuxppc-dev mailing list Linuxppc-dev@lists.ozlabs.org https://lists.ozlabs.org/listinfo/linuxppc-dev
[RFC PATCH 5/9] powerpc: reverse the soft_enable logic
"paca->soft_enabled" is used as a flag to mask some of the interrupts. The currently supported flag values and their details:

soft_enabled   MSR[EE]
0              0        Disabled (PMI and HMI not masked)
1              1        Enabled

"paca->soft_enabled" is initialized to 1 to mark the interrupts as enabled. arch_local_irq_disable() will toggle the value when interrupts need to be disabled. At this point, the interrupts are not actually disabled; instead, the interrupt vector has code to check for the flag and mask the interrupt when it occurs. By "mask it", the code updates paca->irq_happened and returns. arch_local_irq_restore() is called to re-enable interrupts, which checks for and replays interrupts if any occurred.

Now, as mentioned, the current logic does not mask "performance monitoring interrupts", and PMIs are implemented as NMIs. But this patchset depends on local_irq_* for a successful local_* update; meaning, all possible interrupts are masked during the local_* update and replayed after the update.

So the idea here is to reverse the "paca->soft_enabled" logic. New values and details:

soft_enabled   MSR[EE]
1              0        Disabled (PMI and HMI not masked)
0              1        Enabled

The reason for this change is to create the foundation for a third flag value, "2", for "soft_enabled", to add support for masking PMIs. When arch_irq_disable_* is called with a value of "2", PMI interrupts are masked; when called with a value of "1", PMIs are not masked.

With the new flag value for "soft_enabled", the states look like:

soft_enabled   MSR[EE]
2              0        Disabled, PMIs also masked
1              0        Disabled (PMI and HMI not masked)
0              1        Enabled

And the interrupt handler code has been modified to check for a "greater than or equal to 1" condition instead.

The commit message here explains the logic changes that are implemented in the following patches, but this patch primarily only reverses the logic. The following patches make the corresponding changes.
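The reversed, tri-state masking rules described above boil down to two comparisons. The sketch below uses a hypothetical name for level 2 (the series itself mostly uses the literal value 2); the constants for 0 and 1 mirror the LAZY_INTERRUPT_* names introduced earlier in the series.

```c
#include <assert.h>

/* Reversed logic: 0 = enabled, higher values = more masking.
 * LAZY_INTERRUPT_DISABLE_ALL is a hypothetical name for illustration. */
#define LAZY_INTERRUPT_ENABLED		0	/* MSR[EE] = 1 */
#define LAZY_INTERRUPT_DISABLED		1	/* ordinary irqs soft-masked */
#define LAZY_INTERRUPT_DISABLE_ALL	2	/* PMIs soft-masked as well */

/* "Is anything masked?" becomes a >= test instead of the old == test,
 * which is exactly why patch 6/9 flips beq/tdnei to bge/tdeqi. */
static int lazy_irqs_disabled(unsigned long soft_enabled)
{
	return soft_enabled >= LAZY_INTERRUPT_DISABLED;
}

/* PMIs are only masked at the deepest level. */
static int lazy_pmis_disabled(unsigned long soft_enabled)
{
	return soft_enabled >= LAZY_INTERRUPT_DISABLE_ALL;
}
```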
Signed-off-by: Madhavan Srinivasan --- arch/powerpc/include/asm/hw_irq.h | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/arch/powerpc/include/asm/hw_irq.h b/arch/powerpc/include/asm/hw_irq.h index 09491417fbf7..2b87930e0e82 100644 --- a/arch/powerpc/include/asm/hw_irq.h +++ b/arch/powerpc/include/asm/hw_irq.h @@ -30,8 +30,8 @@ /* * flags for paca->soft_enabled */ -#define LAZY_INTERRUPT_ENABLED 1 -#define LAZY_INTERRUPT_DISABLED0 +#define LAZY_INTERRUPT_ENABLED 0 +#define LAZY_INTERRUPT_DISABLED1 #endif /* CONFIG_PPC64 */ -- 2.7.4 ___ Linuxppc-dev mailing list Linuxppc-dev@lists.ozlabs.org https://lists.ozlabs.org/listinfo/linuxppc-dev
[RFC PATCH 4/9] powerpc: Use set_soft_enabled api to update paca->soft_enabled
Signed-off-by: Madhavan Srinivasan --- arch/powerpc/include/asm/kvm_ppc.h | 2 +- arch/powerpc/kernel/irq.c | 4 ++-- arch/powerpc/kernel/setup_64.c | 3 ++- arch/powerpc/kernel/time.c | 4 ++-- 4 files changed, 7 insertions(+), 6 deletions(-) diff --git a/arch/powerpc/include/asm/kvm_ppc.h b/arch/powerpc/include/asm/kvm_ppc.h index e790b8a6bf0b..68c2275c3674 100644 --- a/arch/powerpc/include/asm/kvm_ppc.h +++ b/arch/powerpc/include/asm/kvm_ppc.h @@ -707,7 +707,7 @@ static inline void kvmppc_fix_ee_before_entry(void) /* Only need to enable IRQs by hard enabling them after this */ local_paca->irq_happened = 0; - local_paca->soft_enabled = LAZY_INTERRUPT_ENABLED; + set_soft_enabled(LAZY_INTERRUPT_ENABLED); #endif } diff --git a/arch/powerpc/kernel/irq.c b/arch/powerpc/kernel/irq.c index 88e541daf7b0..9b9b6df8d83d 100644 --- a/arch/powerpc/kernel/irq.c +++ b/arch/powerpc/kernel/irq.c @@ -202,7 +202,7 @@ notrace void arch_local_irq_restore(unsigned long en) /* Write the new soft-enabled value */ set_soft_enabled(en); - if (en == LAZY_INTERRUPT_DIABLED) + if (en == LAZY_INTERRUPT_DISABLED) return; /* * From this point onward, we can take interrupts, preempt, @@ -331,7 +331,7 @@ bool prep_irq_for_idle(void) * of entering the low power state. 
*/ local_paca->irq_happened &= ~PACA_IRQ_HARD_DIS; - local_paca->soft_enabled = LAZY_INTERRUPT_ENABLED; + set_soft_enabled(LAZY_INTERRUPT_ENABLED); /* Tell the caller to enter the low power state */ return true; diff --git a/arch/powerpc/kernel/setup_64.c b/arch/powerpc/kernel/setup_64.c index 0ca504839550..2c7f4b23359a 100644 --- a/arch/powerpc/kernel/setup_64.c +++ b/arch/powerpc/kernel/setup_64.c @@ -206,7 +206,7 @@ static void fixup_boot_paca(void) /* Allow percpu accesses to work until we setup percpu data */ get_paca()->data_offset = 0; /* Mark interrupts disabled in PACA */ - get_paca()->soft_enabled = LAZY_INTERRUPT_DISABLED; + set_soft_enabled(LAZY_INTERRUPT_DISABLED); } static void cpu_ready_for_interrupts(void) @@ -326,6 +326,7 @@ void early_setup_secondary(void) { /* Mark interrupts enabled in PACA */ get_paca()->soft_enabled = 0; + set_soft_enabled(LAZY_INTERRUPT_DISABLED); /* Initialize the hash table or TLB handling */ early_init_mmu_secondary(); diff --git a/arch/powerpc/kernel/time.c b/arch/powerpc/kernel/time.c index e46f7ab6cbde..0a1669708a0d 100644 --- a/arch/powerpc/kernel/time.c +++ b/arch/powerpc/kernel/time.c @@ -258,7 +258,7 @@ void accumulate_stolen_time(void) * needs to reflect that so various debug stuff doesn't * complain */ - local_paca->soft_enabled = LAZY_INTERRUPT_DISABLED; + set_soft_enabled(LAZY_INTERRUPT_DISABLED); sst = scan_dispatch_log(local_paca->starttime_user); ust = scan_dispatch_log(local_paca->starttime); @@ -266,7 +266,7 @@ void accumulate_stolen_time(void) local_paca->user_time -= ust; local_paca->stolen_time += ust + sst; - local_paca->soft_enabled = save_soft_enabled; + set_soft_enabled(save_soft_enabled); } static inline u64 calculate_stolen_time(u64 stop_tb) -- 2.7.4 ___ Linuxppc-dev mailing list Linuxppc-dev@lists.ozlabs.org https://lists.ozlabs.org/listinfo/linuxppc-dev
[RFC PATCH 3/9] powerpc: move set_soft_enabled()
Move set_soft_enabled() from arch/powerpc/kernel/irq.c to asm/hw_irq.h. This way, updates of paca->soft_enabled can be forced through it wherever possible.

Signed-off-by: Madhavan Srinivasan --- arch/powerpc/include/asm/hw_irq.h | 6 ++ arch/powerpc/kernel/irq.c | 6 -- 2 files changed, 6 insertions(+), 6 deletions(-) diff --git a/arch/powerpc/include/asm/hw_irq.h b/arch/powerpc/include/asm/hw_irq.h index 433fe60cf428..09491417fbf7 100644 --- a/arch/powerpc/include/asm/hw_irq.h +++ b/arch/powerpc/include/asm/hw_irq.h @@ -48,6 +48,12 @@ extern void unknown_exception(struct pt_regs *regs); #ifdef CONFIG_PPC64 #include +static inline notrace void set_soft_enabled(unsigned long enable) +{ + __asm__ __volatile__("stb %0,%1(13)" + : : "r" (enable), "i" (offsetof(struct paca_struct, soft_enabled))); +} + static inline unsigned long arch_local_save_flags(void) { unsigned long flags; diff --git a/arch/powerpc/kernel/irq.c b/arch/powerpc/kernel/irq.c index 06dff620fcdc..88e541daf7b0 100644 --- a/arch/powerpc/kernel/irq.c +++ b/arch/powerpc/kernel/irq.c @@ -106,12 +106,6 @@ static inline notrace unsigned long get_irq_happened(void) return happened; } -static inline notrace void set_soft_enabled(unsigned long enable) -{ - __asm__ __volatile__("stb %0,%1(13)" - : : "r" (enable), "i" (offsetof(struct paca_struct, soft_enabled))); -} - static inline notrace int decrementer_check_overflow(void) { u64 now = get_tb_or_rtc(); -- 2.7.4 ___ Linuxppc-dev mailing list Linuxppc-dev@lists.ozlabs.org https://lists.ozlabs.org/listinfo/linuxppc-dev
[RFC PATCH 1/9] Add #defs for paca->soft_enabled flags
Two #defines, LAZY_INTERRUPT_ENABLED and LAZY_INTERRUPT_DISABLED, are added to be used when updating paca->soft_enabled.

Signed-off-by: Madhavan Srinivasan ---
If the macro names don't look right, kindly suggest alternatives.

arch/powerpc/include/asm/hw_irq.h | 7 +++ 1 file changed, 7 insertions(+) diff --git a/arch/powerpc/include/asm/hw_irq.h b/arch/powerpc/include/asm/hw_irq.h index b59ac27a6b7d..e58c9d95050a 100644 --- a/arch/powerpc/include/asm/hw_irq.h +++ b/arch/powerpc/include/asm/hw_irq.h @@ -27,6 +27,13 @@ #define PACA_IRQ_EE_EDGE 0x10 /* BookE only */ #define PACA_IRQ_HMI 0x20 +/* + * flags for paca->soft_enabled + */ +#define LAZY_INTERRUPT_ENABLED 1 +#define LAZY_INTERRUPT_DISABLED0 + + #endif /* CONFIG_PPC64 */ #ifndef __ASSEMBLY__ -- 2.7.4 ___ Linuxppc-dev mailing list Linuxppc-dev@lists.ozlabs.org https://lists.ozlabs.org/listinfo/linuxppc-dev
[RFC PATCH 2/9] Cleanup to use LAZY_INTERRUPT_* macros for paca->soft_enabled update
Replace the hardcoded values used when updating paca->soft_enabled with the LAZY_INTERRUPT_* #defines. No logic change. Signed-off-by: Madhavan Srinivasan --- arch/powerpc/include/asm/exception-64s.h | 2 +- arch/powerpc/include/asm/hw_irq.h| 15 --- arch/powerpc/include/asm/irqflags.h | 6 +++--- arch/powerpc/include/asm/kvm_ppc.h | 2 +- arch/powerpc/kernel/entry_64.S | 14 +++--- arch/powerpc/kernel/head_64.S| 3 ++- arch/powerpc/kernel/idle_power4.S| 3 ++- arch/powerpc/kernel/irq.c| 9 + arch/powerpc/kernel/process.c| 3 ++- arch/powerpc/kernel/setup_64.c | 3 +++ arch/powerpc/kernel/time.c | 2 +- arch/powerpc/mm/hugetlbpage.c| 2 +- arch/powerpc/perf/core-book3s.c | 2 +- 13 files changed, 37 insertions(+), 29 deletions(-) diff --git a/arch/powerpc/include/asm/exception-64s.h b/arch/powerpc/include/asm/exception-64s.h index 93ae809fe5ea..e24e63d216c4 100644 --- a/arch/powerpc/include/asm/exception-64s.h +++ b/arch/powerpc/include/asm/exception-64s.h @@ -406,7 +406,7 @@ label##_relon_hv: \ #define __SOFTEN_TEST(h, vec) \ lbz r10,PACASOFTIRQEN(r13); \ - cmpwi r10,0; \ + cmpwi r10,LAZY_INTERRUPT_DISABLED;\ li r10,SOFTEN_VALUE_##vec; \ beq masked_##h##interrupt #define _SOFTEN_TEST(h, vec) __SOFTEN_TEST(h, vec) diff --git a/arch/powerpc/include/asm/hw_irq.h b/arch/powerpc/include/asm/hw_irq.h index e58c9d95050a..433fe60cf428 100644 --- a/arch/powerpc/include/asm/hw_irq.h +++ b/arch/powerpc/include/asm/hw_irq.h @@ -65,9 +65,10 @@ static inline unsigned long arch_local_irq_disable(void) unsigned long flags, zero; asm volatile( - "li %1,0; lbz %0,%2(13); stb %1,%2(13)" + "li %1,%3; lbz %0,%2(13); stb %1,%2(13)" : "=r" (flags), "=&r" (zero) - : "i" (offsetof(struct paca_struct, soft_enabled)) + : "i" (offsetof(struct paca_struct, soft_enabled)),\ + "i" (LAZY_INTERRUPT_DISABLED) : "memory"); return flags; @@ -77,7 +78,7 @@ extern void arch_local_irq_restore(unsigned long); static inline void arch_local_irq_enable(void) { - arch_local_irq_restore(1); +
arch_local_irq_restore(LAZY_INTERRUPT_ENABLED); } static inline unsigned long arch_local_irq_save(void) @@ -87,7 +88,7 @@ static inline unsigned long arch_local_irq_save(void) static inline bool arch_irqs_disabled_flags(unsigned long flags) { - return flags == 0; + return flags == LAZY_INTERRUPT_DISABLED; } static inline bool arch_irqs_disabled(void) @@ -107,9 +108,9 @@ static inline bool arch_irqs_disabled(void) u8 _was_enabled;\ __hard_irq_disable(); \ _was_enabled = local_paca->soft_enabled;\ - local_paca->soft_enabled = 0; \ + local_paca->soft_enabled = LAZY_INTERRUPT_DISABLED;\ local_paca->irq_happened |= PACA_IRQ_HARD_DIS; \ - if (_was_enabled) \ + if (_was_enabled == LAZY_INTERRUPT_ENABLED) \ trace_hardirqs_off(); \ } while(0) @@ -132,7 +133,7 @@ static inline void may_hard_irq_enable(void) static inline bool arch_irq_disabled_regs(struct pt_regs *regs) { - return !regs->softe; + return (regs->softe == LAZY_INTERRUPT_DISABLED); } extern bool prep_irq_for_idle(void); diff --git a/arch/powerpc/include/asm/irqflags.h b/arch/powerpc/include/asm/irqflags.h index f2149066fe5d..6091e46f2455 100644 --- a/arch/powerpc/include/asm/irqflags.h +++ b/arch/powerpc/include/asm/irqflags.h @@ -48,8 +48,8 @@ #define RECONCILE_IRQ_STATE(__rA, __rB)\ lbz __rA,PACASOFTIRQEN(r13);\ lbz __rB,PACAIRQHAPPENED(r13); \ - cmpwi cr0,__rA,0; \ - li __rA,0; \ + cmpwi cr0,__rA,LAZY_INTERRUPT_DISABLED;\ + li __rA,LAZY_INTERRUPT_DISABLED; \ ori __rB,__rB,PACA_IRQ_HARD_DIS;\ stb __rB,PACAIRQHAPPENED(r13); \ beq 44f;\ @@ -63,7 +63,7 @@ #define RECONCILE_IRQ_STATE(__rA, __rB)\ lbz __rA,PACAIRQHAPPENED(r13); \ - li __rB,0; \ + li __rB,LAZY_INTERRUPT_DISABLED; \ ori __rA,__rA,PACA_IRQ_HARD_DIS;\ stb __rB,PACASOFTIRQEN(r13);\ stb __rA,PACAIRQHAPPENED(r13) diff --git a/arch/powerpc/include/asm/kvm_ppc.h b/arch/powerpc/include
[RFC PATCH 0/9]powerpc: "paca->soft_enabled" based local atomic operation implementation
Local atomic operations are fast and highly reentrant per-CPU counters, used for percpu variable updates. Local atomic operations only guarantee variable modification atomicity with respect to the CPU which owns the data, and they need to be executed in a preemption-safe way.

Here is the design of the patchset. Since local_* operations only need to be atomic with respect to interrupts (IIUC), we have two options: either replay the "op" if interrupted, or replay the interrupt after the "op". The initial patchset posted was based on implementing the local_* operations using CR5, which replays the "op". That patchset had issues when rewinding an address pointer into an array, which made the slow path really slow. Since the CR5-based implementation proposed using __ex_table to find the rewind address, it also raised concerns about the size of __ex_table and vmlinux.

https://lists.ozlabs.org/pipermail/linuxppc-dev/2014-December/123115.html

This patchset instead uses Benjamin Herrenschmidt's suggestion of using arch_local_irq_disable_var() to soft-disable interrupts (including PMIs). After finishing the "op", arch_local_irq_restore() is called, and any interrupts that occurred in the meantime are replayed. The patchset rewrites the current local_* functions to use arch_local_irq_disable. The base flow for each function is:

{
	arch_local_irq_disable_var(2)
	load
	..
	store
	arch_local_irq_restore()
}

Currently only asm/local.h has been rewritten, and the entire change has been tested only on PPC64 (pseries guest).

The reason for this approach is that the l[w/d]arx/st[w/d]cx. instruction pair currently used for local_* operations is heavy on cycle count, and these instructions do not support a local variant. To see whether the new implementation helps, a modified version of Rusty's benchmark code for local_t was used:

https://lkml.org/lkml/2008/12/16/450

Modifications to Rusty's benchmark code:
 - Executed only the local_t test

Here are the values with the patch.
Time in ns per iteration

Local_t      Without Patch  With Patch
_inc         28             8
_add         28             8
_read        3              3
_add_return  28             7

The first four are cleanup patches which lay the foundation to make things easier. The fifth patch in the patchset reverses the current soft_enabled logic, and its commit message details the reason for and need of this change. The sixth patch holds the changes needed for the reversed logic. The rest of the patches add support for maskable PMIs and implement local_t using arch_local_disable_*().

Since the patchset is experimental, the changes made are focused on the pseries and powernv platforms only. Would really like to know comments on this approach before extending it to other powerpc platforms.

Tested the patchset in:
 - a pSeries LPAR (with perf record)
 - Ran kernbench with perf record for 24 hours
 - More testing needed

Signed-off-by: Madhavan Srinivasan

Madhavan Srinivasan (9):
  Add #defs for paca->soft_enabled flags
  Cleanup to use LAZY_INTERRUPT_* macros for paca->soft_enabled update
  powerpc: move set_soft_enabled()
  powerpc: Use set_soft_enabled api to update paca->soft_enabled
  powerpc: reverse the soft_enable logic
  powerpc: modify __SOFTEN_TEST to support tri-state soft_enabled flag
  powerpc: Add support to mask perf interrupts
  powerpc: Support to replay PMIs
  powerpc: rewrite local_t using soft_irq

 arch/powerpc/include/asm/exception-64s.h | 24 +++--
 arch/powerpc/include/asm/hw_irq.h        | 43 ---
 arch/powerpc/include/asm/irqflags.h      |  8 +--
 arch/powerpc/include/asm/kvm_ppc.h       |  2 +-
 arch/powerpc/include/asm/local.h         | 91 ++--
 arch/powerpc/kernel/entry_64.S           | 16 +++---
 arch/powerpc/kernel/exceptions-64s.S     | 27 --
 arch/powerpc/kernel/head_64.S            |  3 +-
 arch/powerpc/kernel/idle_power4.S        |  3 +-
 arch/powerpc/kernel/irq.c                | 24 +
 arch/powerpc/kernel/process.c            |  3 +-
 arch/powerpc/kernel/setup_64.c           |  4 ++
 arch/powerpc/kernel/time.c               |  4 +-
 arch/powerpc/mm/hugetlbpage.c            |  2 +-
 arch/powerpc/perf/core-book3s.c          |  2 +-
 15 files changed, 184 insertions(+), 72 deletions(-)

-- 2.7.4

___ Linuxppc-dev mailing list Linuxppc-dev@lists.ozlabs.org https://lists.ozlabs.org/listinfo/linuxppc-dev
RE: [PATCH v11 4/5] powerpc/fsl: move mpc85xx.h to include/linux/fsl
Hi Scott, > -Original Message- > From: Scott Wood [mailto:o...@buserror.net] > Sent: Friday, July 22, 2016 12:45 AM > To: Michael Ellerman; Arnd Bergmann > Cc: linux-...@vger.kernel.org; devicet...@vger.kernel.org; linuxppc- > d...@lists.ozlabs.org; linux-ker...@vger.kernel.org; Yangbo Lu > Subject: Re: [PATCH v11 4/5] powerpc/fsl: move mpc85xx.h to > include/linux/fsl > > On Thu, 2016-07-21 at 20:26 +1000, Michael Ellerman wrote: > > Quoting Scott Wood (2016-07-21 04:31:48) > > > > > > On Wed, 2016-07-20 at 13:24 +0200, Arnd Bergmann wrote: > > > > > > > > On Saturday, July 16, 2016 9:50:21 PM CEST Scott Wood wrote: > > > > > > > > > > > > > > > From: yangbo lu > > > > > > > > > > Move mpc85xx.h to include/linux/fsl and rename it to svr.h as a > > > > > common header file. This SVR numberspace is used on some ARM > > > > > chips as well as PPC, and even to check for a PPC SVR multi-arch > > > > > drivers would otherwise need to ifdef the header inclusion and > > > > > all references to the SVR symbols. > > > > > > > > > > Signed-off-by: Yangbo Lu > > > > > Acked-by: Wolfram Sang > > > > > Acked-by: Stephen Boyd > > > > > Acked-by: Joerg Roedel > > > > > [scottwood: update description] > > > > > Signed-off-by: Scott Wood > > > > > > > > > As discussed before, please don't introduce yet another vendor > > > > specific way to match a SoC ID from a device driver. > > > > > > > > I've posted a patch for an extension to the soc_device > > > > infrastructure to allow comparing the running SoC to a table of > > > > devices, use that instead. > > > As I asked before, in which relevant maintainership capacity are you > > > NACKing this? > > I'll nack the powerpc part until you guys can agree. > > OK, I've pulled these patches out. > > For the MMC issue I suggest using ifdef CONFIG_PPC and mfspr(SPRN_SVR) > like the clock driver does[1] and we can revisit the issue if/when we > need to do something similar on an ARM chip. 
[Lu Yangbo-B47093] I remember that Uffe initially opposed introducing non-generic header files (like '#include ') in the mmc driver. So I don't think using ifdef CONFIG_PPC and mfspr(SPRN_SVR) will be accepted... And that method still can't get the SVR of an ARM chip. Any other suggestion here? Thank you very much. - Yangbo Lu > > -Scott > > [1] One of the issues with Arnd's approach is that it wouldn't have > worked for early things like the clock driver, and he didn't seem to mind > using ifdef and > mfspr() there. ___ Linuxppc-dev mailing list Linuxppc-dev@lists.ozlabs.org https://lists.ozlabs.org/listinfo/linuxppc-dev
Re: [PATCH 2/3] powerpc/mm: Rename hpte_init_lpar() & put fallback in a header
On Mon, 2016-07-25 at 20:36 +1000, Michael Ellerman wrote: > That would be nice, but these look fishy at least: > > arch/powerpc/platforms/cell/spu_manage.c: if > (!firmware_has_feature(FW_FEATURE_LPAR)) > arch/powerpc/platforms/cell/spu_manage.c: if > (!firmware_has_feature(FW_FEATURE_LPAR)) { > > arch/powerpc/platforms/cell/spu_manage.c: if > > (!firmware_has_feature(FW_FEATURE_LPAR)) Those can just be checks for LV1, I think .. > > arch/powerpc/platforms/pasemi/iommu.c: > > !firmware_has_feature(FW_FEATURE_LPAR)) { > drivers/net/ethernet/pasemi/pasemi_mac.c: return > firmware_has_feature(FW_FEATURE_LPAR); And that was some experimental PAPR'ish thing, wasn't it? Cheers, Ben.
Re: [PATCH for-4.8 V2 08/10] powerpc: use the jump label for cpu_has_feature
On Mon, Jul 25, 2016 at 04:28:49PM +1000, Nicholas Piggin wrote: > On Sat, 23 Jul 2016 14:42:41 +0530 > "Aneesh Kumar K.V" wrote: > > > From: Kevin Hao > > > > The cpu features are fixed once the probe of cpu features is done. > > And the function cpu_has_feature() is used in some hot paths. > > Checking the cpu features on every invocation of > > cpu_has_feature() seems suboptimal. This tries to reduce the > > overhead of this check by using a jump label. > > > > The generated assembly code of the following C program: > > if (cpu_has_feature(CPU_FTR_XXX)) > > xxx() > > > > Before: > > lis r9,-16230 > > lwz r9,12324(r9) > > lwz r9,12(r9) > > andi. r10,r9,512 > > beqlr- > > > > After: > > nop if CPU_FTR_XXX is enabled > > b xxx if CPU_FTR_XXX is not enabled > > > > Signed-off-by: Kevin Hao > > Signed-off-by: Aneesh Kumar K.V > > --- > > arch/powerpc/include/asm/cpufeatures.h | 21 + > > arch/powerpc/include/asm/cputable.h | 8 > > arch/powerpc/kernel/cputable.c | 20 > > arch/powerpc/lib/feature-fixups.c | 1 + > > 4 files changed, 50 insertions(+) > > > > diff --git a/arch/powerpc/include/asm/cpufeatures.h > > b/arch/powerpc/include/asm/cpufeatures.h index > > bfa6cb8f5629..4a4a0b898463 100644 --- > > a/arch/powerpc/include/asm/cpufeatures.h +++ > > b/arch/powerpc/include/asm/cpufeatures.h @@ -13,10 +13,31 @@ static > > inline bool __cpu_has_feature(unsigned long feature) > > return !!(CPU_FTRS_POSSIBLE & cur_cpu_spec->cpu_features & feature); } > > > > +#ifdef CONFIG_JUMP_LABEL > > +#include > > + > > +extern struct static_key_true cpu_feat_keys[MAX_CPU_FEATURES]; > > + > > +static __always_inline bool cpu_has_feature(unsigned long feature) > > +{ > > + int i; > > + > > + if (CPU_FTRS_ALWAYS & feature) > > + return true; > > + > > + if (!(CPU_FTRS_POSSIBLE & feature)) > > + return false; > > + > > + i = __builtin_ctzl(feature); > > + return static_branch_likely(&cpu_feat_keys[i]); > > +} > > Is feature ever not-constant, or could it ever be, I wonder?
We could > do a build time check to ensure it is always constant? In the current code, all uses of this function pass a constant argument. But yes, due to the implementation of jump label, we should add a check here to ensure that a constant is passed to this function. Something like this: if (!__builtin_constant_p(feature)) return __cpu_has_feature(feature); We need the same change for mmu_has_feature(). Thanks, Kevin
Re: [PATCH 1/3] powerpc/mm: Fix build break when PPC_NATIVE=n
Quoting Michael Ellerman (2016-07-25 16:17:52) > Stephen Rothwell writes: > > > Hi Michael, > > > > On Mon, 25 Jul 2016 12:57:49 +1000 Michael Ellerman > > wrote: > >> > >> The recent commit to rework the hash MMU setup broke the build when > >> CONFIG_PPC_NATIVE=n. Fix it by providing a fallback implementation of > >> hpte_init_native(). > > > > Alternatively, you could make the call site dependent on > > IS_ENABLED(CONFIG_PPC_NATIVE) and not need the fallback. > > > > so: > > > > else if (IS_ENABLED(CONFIG_PPC_NATIVE)) > > hpte_init_native(); > > > > in arch/powerpc/mm/hash_utils_64.c and let the compiler elide the call. > > That would mean we might fall through and not assign any ops, so I think > it's preferable to have a fallback that explicitly panics. Actually I think this works and is smaller all round. Will test and resend. cheers diff --git a/arch/powerpc/mm/hash_utils_64.c b/arch/powerpc/mm/hash_utils_64.c index 341632471b9d..e44f2d759055 100644 --- a/arch/powerpc/mm/hash_utils_64.c +++ b/arch/powerpc/mm/hash_utils_64.c @@ -885,11 +885,6 @@ static void __init htab_initialize(void) #undef KB #undef MB -void __init __weak hpte_init_lpar(void) -{ - panic("FW_FEATURE_LPAR set but no LPAR support compiled\n"); -} - void __init hash__early_init_mmu(void) { /* @@ -931,9 +926,12 @@ void __init hash__early_init_mmu(void) ps3_early_mm_init(); else if (firmware_has_feature(FW_FEATURE_LPAR)) hpte_init_lpar(); - else + else if (IS_ENABLED(CONFIG_PPC_NATIVE)) hpte_init_native(); + if (!mmu_hash_ops.hpte_insert) + panic("hash__early_init_mmu: No MMU hash ops defined!\n"); + /* Initialize the MMU Hash table and create the linear mapping * of memory. Has to be done before SLB initialization as this is * currently where the page size encoding is obtained.
Re: [PATCH 2/3] powerpc/mm: Rename hpte_init_lpar() & put fallback in a header
Benjamin Herrenschmidt writes: > On Mon, 2016-07-25 at 15:33 +1000, Michael Ellerman wrote: >> When we detect a PS3 we set both PS3_LV1 and LPAR at the same time, >> so >> there should be no way they can get out of sync, other than due to a >> bug in the code. > > I thought I had changed PS3 to no longer set LPAR ? Nope: FW_FEATURE_PS3_POSSIBLE = FW_FEATURE_LPAR | FW_FEATURE_PS3_LV1, ... #ifdef CONFIG_PPC_PS3 /* Identify PS3 firmware */ if (of_flat_dt_is_compatible(of_get_flat_dt_root(), "sony,ps3")) powerpc_firmware_features |= FW_FEATURE_PS3_POSSIBLE; #endif > I like having a flag that basically says PAPR and that's pretty much > what LPAR is, in fact I think I've been using it elsewhere with that > meaning That would be nice, but these look fishy at least: arch/powerpc/platforms/cell/spu_manage.c: if (!firmware_has_feature(FW_FEATURE_LPAR)) arch/powerpc/platforms/cell/spu_manage.c: if (!firmware_has_feature(FW_FEATURE_LPAR)) { arch/powerpc/platforms/cell/spu_manage.c: if (!firmware_has_feature(FW_FEATURE_LPAR)) arch/powerpc/platforms/pasemi/iommu.c: !firmware_has_feature(FW_FEATURE_LPAR)) { drivers/net/ethernet/pasemi/pasemi_mac.c: return firmware_has_feature(FW_FEATURE_LPAR); cheers
Re: [PATCH 2/3] powerpc/mm: Rename hpte_init_lpar() & put fallback in a header
On Mon, 2016-07-25 at 15:33 +1000, Michael Ellerman wrote: > When we detect a PS3 we set both PS3_LV1 and LPAR at the same time, > so > there should be no way they can get out of sync, other than due to a > bug in the code. I thought I had changed PS3 to no longer set LPAR ? I like having a flag that basically says PAPR and that's pretty much what LPAR is, in fact I think I've been using it elsewhere with that meaning Cheers, Ben.
RE: [PATCH v3 02/11] mm: Hardened usercopy
From: Josh Poimboeuf > Sent: 22 July 2016 18:46 .. > > >> +/* > > >> + * Checks if a given pointer and length is contained by the current > > >> + * stack frame (if possible). > > >> + * > > >> + * 0: not at all on the stack > > >> + * 1: fully within a valid stack frame > > >> + * 2: fully on the stack (when can't do frame-checking) > > >> + * -1: error condition (invalid stack position or bad stack frame) > > >> + */ > > >> +static noinline int check_stack_object(const void *obj, unsigned long > > >> len) > > >> +{ > > >> + const void * const stack = task_stack_page(current); > > >> + const void * const stackend = stack + THREAD_SIZE; > > > > > > That allows access to the entire stack, including the struct thread_info, > > > is that what we want - it seems dangerous? Or did I miss a check > > > somewhere else? > > > > That seems like a nice improvement to make, yeah. > > > > > We have end_of_stack() which computes the end of the stack taking > > > thread_info into account (end being the opposite of your end above). > > > > Amusingly, the object_is_on_stack() check in sched.h doesn't take > > thread_info into account either. :P Regardless, I think using > > end_of_stack() may not be best. To tighten the check, I think we could > > add this after checking that the object is on the stack: > > > > #ifdef CONFIG_STACK_GROWSUP > > stackend -= sizeof(struct thread_info); > > #else > > stack += sizeof(struct thread_info); > > #endif > > > > e.g. then if the pointer was in the thread_info, the second test would > > fail, triggering the protection. > > FWIW, this won't work right on x86 after Andy's > CONFIG_THREAD_INFO_IN_TASK patches get merged. What ends up in the 'thread_info' area? If it contains the fp save area then programs like gdb may end up requesting copy_in/out directly from that area. Interestingly the avx registers don't need saving on a normal system call entry (they are all caller-saved) so the kernel stack can safely overwrite that area. 
Syscall entry probably ought to execute the 'zero all avx registers' instruction. They do need saving on interrupt entry - but the stack used will be less. David
[Patch v3 1/3] irqchip/qeic: move qeic driver from drivers/soc/fsl/qe
Move the driver from drivers/soc/fsl/qe to drivers/irqchip, merge qe_ic.h and qe_ic.c into irq-qeic.c. Signed-off-by: Zhao Qiang --- Changes for v2: - modify the subject and commit msg Changes for v3: - merge .h file to .c, rename it to irq-qeic.c drivers/irqchip/Makefile | 1 + drivers/{soc/fsl/qe/qe_ic.c => irqchip/irq-qeic.c} | 82 +++- drivers/soc/fsl/qe/Makefile | 2 +- drivers/soc/fsl/qe/qe_ic.h | 103 - 4 files changed, 83 insertions(+), 105 deletions(-) rename drivers/{soc/fsl/qe/qe_ic.c => irqchip/irq-qeic.c} (85%) delete mode 100644 drivers/soc/fsl/qe/qe_ic.h diff --git a/drivers/irqchip/Makefile b/drivers/irqchip/Makefile index 38853a1..cef999d 100644 --- a/drivers/irqchip/Makefile +++ b/drivers/irqchip/Makefile @@ -69,3 +69,4 @@ obj-$(CONFIG_PIC32_EVIC) += irq-pic32-evic.o obj-$(CONFIG_MVEBU_ODMI) += irq-mvebu-odmi.o obj-$(CONFIG_LS_SCFG_MSI) += irq-ls-scfg-msi.o obj-$(CONFIG_EZNPS_GIC) += irq-eznps.o +obj-$(CONFIG_QUICC_ENGINE) += irq-qeic.o diff --git a/drivers/soc/fsl/qe/qe_ic.c b/drivers/irqchip/irq-qeic.c similarity index 85% rename from drivers/soc/fsl/qe/qe_ic.c rename to drivers/irqchip/irq-qeic.c index ec2ca86..1f91225 100644 --- a/drivers/soc/fsl/qe/qe_ic.c +++ b/drivers/irqchip/irq-qeic.c @@ -30,7 +30,87 @@ #include #include -#include "qe_ic.h" +#define NR_QE_IC_INTS 64 + +/* QE IC registers offset */ +#define QEIC_CICR 0x00 +#define QEIC_CIVEC 0x04 +#define QEIC_CRIPNR 0x08 +#define QEIC_CIPNR 0x0c +#define QEIC_CIPXCC 0x10 +#define QEIC_CIPYCC 0x14 +#define QEIC_CIPWCC 0x18 +#define QEIC_CIPZCC 0x1c +#define QEIC_CIMR 0x20 +#define QEIC_CRIMR 0x24 +#define QEIC_CICNR 0x28 +#define QEIC_CIPRTA 0x30 +#define QEIC_CIPRTB 0x34 +#define QEIC_CRICR 0x3c +#define QEIC_CHIVEC 0x60 + +/* Interrupt priority registers */ +#define CIPCC_SHIFT_PRI0 29 +#define CIPCC_SHIFT_PRI1 26 +#define CIPCC_SHIFT_PRI2 23 +#define CIPCC_SHIFT_PRI3 20 +#define CIPCC_SHIFT_PRI4 13 +#define CIPCC_SHIFT_PRI5 10 +#define CIPCC_SHIFT_PRI6 7 +#define CIPCC_SHIFT_PRI7 4 + +/* CICR priority
modes */ +#define CICR_GWCC 0x0004 +#define CICR_GXCC 0x0002 +#define CICR_GYCC 0x0001 +#define CICR_GZCC 0x0008 +#define CICR_GRTA 0x0020 +#define CICR_GRTB 0x0040 +#define CICR_HPIT_SHIFT8 +#define CICR_HPIT_MASK 0x0300 +#define CICR_HP_SHIFT 24 +#define CICR_HP_MASK 0x3f00 + +/* CICNR */ +#define CICNR_WCC1T_SHIFT 20 +#define CICNR_ZCC1T_SHIFT 28 +#define CICNR_YCC1T_SHIFT 12 +#define CICNR_XCC1T_SHIFT 4 + +/* CRICR */ +#define CRICR_RTA1T_SHIFT 20 +#define CRICR_RTB1T_SHIFT 28 + +/* Signal indicator */ +#define SIGNAL_MASK3 +#define SIGNAL_HIGH2 +#define SIGNAL_LOW 0 + +struct qe_ic { + /* Control registers offset */ + volatile u32 __iomem *regs; + + /* The remapper for this QEIC */ + struct irq_domain *irqhost; + + /* The "linux" controller struct */ + struct irq_chip hc_irq; + + /* VIRQ numbers of QE high/low irqs */ + unsigned int virq_high; + unsigned int virq_low; +}; + +/* + * QE interrupt controller internal structure + */ +struct qe_ic_info { + u32 mask; /* location of this source at the QIMR register. 
*/ + u32 mask_reg; /* Mask register offset */ + u8 pri_code; /* for grouped interrupts sources - the interrupt +code as appears at the group priority register */ + u32 pri_reg; /* Group priority register offset */ +}; static DEFINE_RAW_SPINLOCK(qe_ic_lock); diff --git a/drivers/soc/fsl/qe/Makefile b/drivers/soc/fsl/qe/Makefile index 2031d38..51e4726 100644 --- a/drivers/soc/fsl/qe/Makefile +++ b/drivers/soc/fsl/qe/Makefile @@ -1,7 +1,7 @@ # # Makefile for the linux ppc-specific parts of QE # -obj-$(CONFIG_QUICC_ENGINE)+= qe.o qe_common.o qe_ic.o qe_io.o +obj-$(CONFIG_QUICC_ENGINE)+= qe.o qe_common.o qe_io.o obj-$(CONFIG_CPM) += qe_common.o obj-$(CONFIG_UCC) += ucc.o obj-$(CONFIG_UCC_SLOW) += ucc_slow.o diff --git a/drivers/soc/fsl/qe/qe_ic.h b/drivers/soc/fsl/qe/qe_ic.h deleted file mode 100644 index 926a2ed..000 --- a/drivers/soc/fsl/qe/qe_ic.h +++ /dev/null @@ -1,103 +0,0 @@ -/* - * drivers/soc/fsl/qe/qe_ic.h - * - * QUICC ENGINE Interrupt Controller Header - * - * Copyright (C) 2006 Freescale Semiconductor, Inc. All rights reserved. - * - * Author: Li Yang - * Based on code from Shlomi Gridish - * - * This progra
[Patch v3 2/3] irqchip/qeic: merge qeic init code from platforms to a common function
The qe_ic init code in the various platforms is redundant; merge it into a common function and put it in irqchip/irq-qeic.c. For non-p1021_mds mpc85xx_mds boards, use "qe_ic_init(np, 0, qe_ic_cascade_low_mpic, qe_ic_cascade_high_mpic);" instead of "qe_ic_init(np, 0, qe_ic_cascade_muxed_mpic, NULL);". qe_ic_cascade_muxed_mpic was used for boards that have the same interrupt number for the low interrupt and the high interrupt; qe_ic_init checks whether "low interrupt == high interrupt". Signed-off-by: Zhao Qiang --- Changes for v2: - modify subject and commit msg - add check for qeic by type Changes for v3: - na arch/powerpc/platforms/83xx/misc.c | 15 --- arch/powerpc/platforms/85xx/corenet_generic.c | 9 - arch/powerpc/platforms/85xx/mpc85xx_mds.c | 14 -- arch/powerpc/platforms/85xx/mpc85xx_rdb.c | 16 arch/powerpc/platforms/85xx/twr_p102x.c | 14 -- drivers/irqchip/irq-qeic.c | 16 6 files changed, 16 insertions(+), 68 deletions(-) diff --git a/arch/powerpc/platforms/83xx/misc.c b/arch/powerpc/platforms/83xx/misc.c index 7e923ca..9431fc7 100644 --- a/arch/powerpc/platforms/83xx/misc.c +++ b/arch/powerpc/platforms/83xx/misc.c @@ -93,24 +93,9 @@ void __init mpc83xx_ipic_init_IRQ(void) } #ifdef CONFIG_QUICC_ENGINE -void __init mpc83xx_qe_init_IRQ(void) -{ - struct device_node *np; - - np = of_find_compatible_node(NULL, NULL, "fsl,qe-ic"); - if (!np) { - np = of_find_node_by_type(NULL, "qeic"); - if (!np) - return; - } - qe_ic_init(np, 0, qe_ic_cascade_low_ipic, qe_ic_cascade_high_ipic); - of_node_put(np); -} - void __init mpc83xx_ipic_and_qe_init_IRQ(void) { mpc83xx_ipic_init_IRQ(); - mpc83xx_qe_init_IRQ(); } #endif /* CONFIG_QUICC_ENGINE */ diff --git a/arch/powerpc/platforms/85xx/corenet_generic.c b/arch/powerpc/platforms/85xx/corenet_generic.c index a2b0bc8..526fc2b 100644 --- a/arch/powerpc/platforms/85xx/corenet_generic.c +++ b/arch/powerpc/platforms/85xx/corenet_generic.c @@ -41,8 +41,6 @@ void __init corenet_gen_pic_init(void) unsigned int flags = MPIC_BIG_ENDIAN | 
MPIC_SINGLE_DEST_CPU | MPIC_NO_RESET; - struct device_node *np; - if (ppc_md.get_irq == mpic_get_coreint_irq) flags |= MPIC_ENABLE_COREINT; @@ -50,13 +48,6 @@ void __init corenet_gen_pic_init(void) BUG_ON(mpic == NULL); mpic_init(mpic); - - np = of_find_compatible_node(NULL, NULL, "fsl,qe-ic"); - if (np) { - qe_ic_init(np, 0, qe_ic_cascade_low_mpic, - qe_ic_cascade_high_mpic); - of_node_put(np); - } } /* diff --git a/arch/powerpc/platforms/85xx/mpc85xx_mds.c b/arch/powerpc/platforms/85xx/mpc85xx_mds.c index f61cbe2..7ae4901 100644 --- a/arch/powerpc/platforms/85xx/mpc85xx_mds.c +++ b/arch/powerpc/platforms/85xx/mpc85xx_mds.c @@ -279,20 +279,6 @@ static void __init mpc85xx_mds_qeic_init(void) of_node_put(np); return; } - - np = of_find_compatible_node(NULL, NULL, "fsl,qe-ic"); - if (!np) { - np = of_find_node_by_type(NULL, "qeic"); - if (!np) - return; - } - - if (machine_is(p1021_mds)) - qe_ic_init(np, 0, qe_ic_cascade_low_mpic, - qe_ic_cascade_high_mpic); - else - qe_ic_init(np, 0, qe_ic_cascade_muxed_mpic, NULL); - of_node_put(np); } #else static void __init mpc85xx_mds_qe_init(void) { } diff --git a/arch/powerpc/platforms/85xx/mpc85xx_rdb.c b/arch/powerpc/platforms/85xx/mpc85xx_rdb.c index 3f4dad1..779f54f 100644 --- a/arch/powerpc/platforms/85xx/mpc85xx_rdb.c +++ b/arch/powerpc/platforms/85xx/mpc85xx_rdb.c @@ -49,10 +49,6 @@ void __init mpc85xx_rdb_pic_init(void) struct mpic *mpic; unsigned long root = of_get_flat_dt_root(); -#ifdef CONFIG_QUICC_ENGINE - struct device_node *np; -#endif - if (of_flat_dt_is_compatible(root, "fsl,MPC85XXRDB-CAMP")) { mpic = mpic_alloc(NULL, 0, MPIC_NO_RESET | MPIC_BIG_ENDIAN | @@ -67,18 +63,6 @@ void __init mpc85xx_rdb_pic_init(void) BUG_ON(mpic == NULL); mpic_init(mpic); - -#ifdef CONFIG_QUICC_ENGINE - np = of_find_compatible_node(NULL, NULL, "fsl,qe-ic"); - if (np) { - qe_ic_init(np, 0, qe_ic_cascade_low_mpic, - qe_ic_cascade_high_mpic); - of_node_put(np); - - } else - pr_err("%s: Could not find qe-ic node\n", __func__); -#endif 
- } /* diff --git a/arch/powerpc/platforms/85xx/twr_p102x.c b/arch/powerpc/platforms/85xx/twr_p102x.c index 71bc255..603e244 100644 --- a/arch/powerpc/platforms/85xx/twr_p102x.c +++ b/arch/powerpc/platform
[Patch v3 3/3] irqchip/qeic: merge qeic_of_init into qe_ic_init
qeic_of_init just gets the qeic device_node from the dtb and calls qe_ic_init, passing the device_node to it. So merge qeic_of_init into qe_ic_init, so that the qeic node is looked up in qe_ic_init itself. Signed-off-by: Zhao Qiang --- Changes for v2: - modify subject and commit msg - return 0 and add put node when return in qe_ic_init Changes for v3: - na drivers/irqchip/irq-qeic.c | 91 +- include/soc/fsl/qe/qe_ic.h | 7 2 files changed, 50 insertions(+), 48 deletions(-) diff --git a/drivers/irqchip/irq-qeic.c b/drivers/irqchip/irq-qeic.c index 1853fda..a0bf871 100644 --- a/drivers/irqchip/irq-qeic.c +++ b/drivers/irqchip/irq-qeic.c @@ -397,27 +397,38 @@ unsigned int qe_ic_get_high_irq(struct qe_ic *qe_ic) return irq_linear_revmap(qe_ic->irqhost, irq); } -void __init qe_ic_init(struct device_node *node, unsigned int flags, - void (*low_handler)(struct irq_desc *desc), - void (*high_handler)(struct irq_desc *desc)) +static int __init qe_ic_init(unsigned int flags) { + struct device_node *node; struct qe_ic *qe_ic; struct resource res; - u32 temp = 0, ret, high_active = 0; + u32 temp = 0, high_active = 0; + int ret = 0; + + node = of_find_compatible_node(NULL, NULL, "fsl,qe-ic"); + if (!node) { + node = of_find_node_by_type(NULL, "qeic"); + if (!node) + return -ENODEV; + } ret = of_address_to_resource(node, 0, &res); - if (ret) - return; + if (ret) { + ret = -ENODEV; + goto err_put_node; + } qe_ic = kzalloc(sizeof(*qe_ic), GFP_KERNEL); - if (qe_ic == NULL) - return; + if (qe_ic == NULL) { + ret = -ENOMEM; + goto err_put_node; + } qe_ic->irqhost = irq_domain_add_linear(node, NR_QE_IC_INTS, &qe_ic_host_ops, qe_ic); if (qe_ic->irqhost == NULL) { - kfree(qe_ic); - return; + ret = -ENOMEM; + goto err_free_qe_ic; } qe_ic->regs = ioremap(res.start, resource_size(&res)); @@ -428,9 +439,9 @@ void __init qe_ic_init(struct device_node *node, unsigned int flags, qe_ic->virq_low = irq_of_parse_and_map(node, 1); if (qe_ic->virq_low == NO_IRQ) { - printk(KERN_ERR "Failed to map QE_IC low IRQ\n"); - 
kfree(qe_ic); - return; + pr_err("Failed to map QE_IC low IRQ\n"); + ret = -ENOMEM; + goto err_domain_remove; } /* default priority scheme is grouped. If spread mode is*/ @@ -457,13 +468,24 @@ void __init qe_ic_init(struct device_node *node, unsigned int flags, qe_ic_write(qe_ic->regs, QEIC_CICR, temp); irq_set_handler_data(qe_ic->virq_low, qe_ic); - irq_set_chained_handler(qe_ic->virq_low, low_handler); + irq_set_chained_handler(qe_ic->virq_low, qe_ic_cascade_low_mpic); if (qe_ic->virq_high != NO_IRQ && qe_ic->virq_high != qe_ic->virq_low) { irq_set_handler_data(qe_ic->virq_high, qe_ic); - irq_set_chained_handler(qe_ic->virq_high, high_handler); + irq_set_chained_handler(qe_ic->virq_high, + qe_ic_cascade_high_mpic); } + of_node_put(node); + return 0; + +err_domain_remove: + irq_domain_remove(qe_ic->irqhost); +err_free_qe_ic: + kfree(qe_ic); +err_put_node: + of_node_put(node); + return ret; } void qe_ic_set_highest_priority(unsigned int virq, int high) @@ -570,39 +592,26 @@ static struct device device_qe_ic = { .bus = &qe_ic_subsys, }; -static int __init init_qe_ic_sysfs(void) +static int __init init_qe_ic(void) { - int rc; + int ret; - printk(KERN_DEBUG "Registering qe_ic with sysfs...\n"); + ret = qe_ic_init(0); + if (ret) + return ret; - rc = subsys_system_register(&qe_ic_subsys, NULL); - if (rc) { - printk(KERN_ERR "Failed registering qe_ic sys class\n"); + ret = subsys_system_register(&qe_ic_subsys, NULL); + if (ret) { + pr_err("Failed registering qe_ic sys class\n"); return -ENODEV; } - rc = device_register(&device_qe_ic); - if (rc) { - printk(KERN_ERR "Failed registering qe_ic sys device\n"); + ret = device_register(&device_qe_ic); + if (ret) { + pr_err("Failed registering qe_ic sys device\n"); return -ENODEV; } - return 0; -} - -static int __init qeic_of_init(void) -{ - struct device_node *np; - np = of_find_compatible_node(NULL, NULL, "fsl,qe-ic"); - if (!np) { - np = of
Re: [PATCH for-4.8 V2 03/10] powerpc/mm/radix: Add radix_set_pte to use in early init
On Mon, 25 Jul 2016 18:36:09 +1000 Michael Ellerman wrote: > "Aneesh Kumar K.V" writes: > > > We want to use the static key based feature check in set_pte_at. > > Since we call radix__map_kernel_page early in boot before jump > > label is initialized we can't call set_pte_at there. Add > > radix__set_pte for the same. > > Although this is an OK solution to this problem, I think it > highlights a bigger problem, which is that we're still doing the > feature patching too late. > > If we can move the feature patching prior to MMU init, then all (or > more of) these problems with pre vs post patching go away. > > I'll see if I can come up with something tomorrow. Agreed, that would be much nicer if you can make it work. Thanks, Nick
Re: [v4] powerpc: Export thread_struct.used_vr/used_vsr to user space
On Thu, Jul 21, 2016 at 08:57:29PM +1000, Michael Ellerman wrote: > Can one of you send a properly formatted and signed-off patch. I will work on that. Thanks, Simon
Re: [PATCH for-4.8 V2 03/10] powerpc/mm/radix: Add radix_set_pte to use in early init
"Aneesh Kumar K.V" writes: > We want to use the static key based feature check in set_pte_at. Since > we call radix__map_kernel_page early in boot before jump label is > initialized we can't call set_pte_at there. Add radix__set_pte for the > same. Although this is an OK solution to this problem, I think it highlights a bigger problem, which is that we're still doing the feature patching too late. If we can move the feature patching prior to MMU init, then all (or more of) these problems with pre vs post patching go away. I'll see if I can come up with something tomorrow. cheers
Re: [PATCH for-4.8 V2 03/10] powerpc/mm/radix: Add radix_set_pte to use in early init
Nicholas Piggin writes: > On Sat, 23 Jul 2016 14:42:36 +0530 > "Aneesh Kumar K.V" wrote: >> @@ -102,7 +123,7 @@ int radix__map_kernel_page(unsigned long ea, >> unsigned long pa, } >> >> set_the_pte: >> -set_pte_at(&init_mm, ea, ptep, pfn_pte(pa >> PAGE_SHIFT, >> flags)); >> +radix__set_pte(&init_mm, ea, ptep, pfn_pte(pa >> PAGE_SHIFT, >> flags)); smp_wmb(); > > What we have in existing code is set_pte_at() function that adds > the _PAGE_PTE bit, then calls __set_pte_at(), which calls radix or hash > version of __set_pte_at(). > > Now we also have radix__set_pte(), which has the function of the > set_pte_at(), which is starting to confuse the naming convention. > The new function is a radix-only set_pte_at(), rather than the > radix implementation that gets called via set_pte(). > > set_pte_at_radix()? That kind of sucks too, though. It might be better > if the radix/hash variants were called __radix__set_pte_at(), and this > new function was called radix__set_pte_at(). I think Aneesh originally used set_pte_at_r() or maybe rset_pte_at()? It was my idea to use radix__ and hash__ as prefixes for all the radix/hash functions. That was 1) to make it clear that it's not part of the name as such, ie. it's a prefix, and 2) because it's ugly as hell and hopefully that would motivate us to consolidate as many of them as possible. I balked at adding __radix__set_pte_at(), and just went with radix__set_pte_at(). But it does complicate things now. In fact I think we need to rethink this whole series, and not actually do it this way at all, meaning this naming problem will go away. cheers
Re: [PATCH v2] irqchip/qeic: move qeic driver from drivers/soc/fsl/qe
On Mon, 2016-07-25 at 06:15 +, Qiang Zhao wrote: > On Thu, Jul 07, 2016 at 10:25PM , Jason Cooper wrote: > > > > -Original Message- > > From: Jason Cooper [mailto:ja...@lakedaemon.net] > > Sent: Thursday, July 07, 2016 10:25 PM > > To: Qiang Zhao > > Cc: o...@buserror.net; t...@linutronix.de; marc.zyng...@arm.com; linuxppc- > > d...@lists.ozlabs.org; linux-ker...@vger.kernel.org; Xiaobo Xie > > > > Subject: Re: [PATCH v2] irqchip/qeic: move qeic driver from > > drivers/soc/fsl/qe > > > > Hi Zhao Qiang, > > > > On Thu, Jul 07, 2016 at 09:23:55AM +0800, Zhao Qiang wrote: > > > > > > The driver stays the same. > > > > > > Signed-off-by: Zhao Qiang > > > --- > > > Changes for v2: > > > - modify the subject and commit msg > > > > > > drivers/irqchip/Makefile| 1 + > > > drivers/{soc/fsl/qe => irqchip}/qe_ic.c | 0 drivers/{soc/fsl/qe => > > > irqchip}/qe_ic.h | 0 > > > drivers/soc/fsl/qe/Makefile | 2 +- > > > 4 files changed, 2 insertions(+), 1 deletion(-) rename > > > drivers/{soc/fsl/qe => irqchip}/qe_ic.c (100%) rename > > > drivers/{soc/fsl/qe => irqchip}/qe_ic.h (100%) > > Please merge the include file into the C file and rename to follow the > > naming > > convention in drivers/irqchip/. e.g. irq-qeic.c or irq-qe_ic.c. > > > > Once you have that, please resend the entire series with this as the first > > patch. > Sorry, I have no idea about "Include file", could you explain which file you > meant? qe_ic.h If nothing else is going to include that, then the contents can go directly into qe_ic.c. -Scott