Re: linux-next: manual merge of the f2fs tree with the ext4 tree
On Mon, May 16, 2016 at 03:22:41PM -0700, Jaegeuk Kim wrote: > > Sorry, I just ran out of time to try to verify that the patch wouldn't > > break anything, and given that we're going to need to wait for > > "fscrypto/f2fs: allow fs-specific key prefix for fs encryption" to go > > upstream. > > Agreed. IIUC, let me push the fscrypto/f2fs patch to v4.7 first? Right --- that's in linux-next already, right? And currently it's a combined fscrypto/f2fs patch, which is why I suspect letting it go into v4.7 first makes sense. I'll make sure the ext4 move to fs/crypto will be one of the first development patches for 4.8 (modulo any urgent bug fixes that need to go into 4.7 final first). > I have no planned patch right now, and of course, it must have no problem for > you to treat with further patches. > Also, let me take a look at any missing part again, regarding to your > concerns. I'm sure there may be some missing pieces around using file system level crypto for the desktop / server use case. Some of them are in how we handle removable thumb drives, for example. There are definitely some missing pieces about how to handle removable SD cards for Android, as well, including some kernel-side patches that are currently living in the unstable portion of the ext4 patch queue. We never got the design, implementation and kernel<->userspace API's fully baked, so that's not going upstream any time soon, but all of this means that we will need to figure out what's the best way to develop, test, and push fs/crypto changes in the long term. This may mean a new git tree with shared maintenance, as one way that we could do things. - Ted
Re: [PATCH v2 0/3] input: rmi4: Regulator supply support
On 05/13/2016 03:29 PM, Bjorn Andersson wrote: On Thu 12 May 17:52 PDT 2016, Andrew Duggan wrote: On 05/11/2016 08:05 PM, Bjorn Andersson wrote: On Wed 11 May 16:30 PDT 2016, Andrew Duggan wrote: Hi Bjorn, On 05/10/2016 08:49 AM, Bjorn Andersson wrote: [..] So either we duplicate the regulator support in spi/i2c or we make them optional in the core driver. Sounds like you prefer the prior, i.e. v1 of my patch. Yes, after all this I think it makes sense to put regulator support in the spi/i2c transports like in your v1 patch. I essentially duplicated the irq handling code in both transports so I would be ok with duplicating regulator support too. It doesn't seem like that much code. But, if this is too much duplication we could create some sort of common file and put the common irq and regulator support functions which could be called in the transports. Similar to how rmi_2d_sensor.c defines some common functions shared between rmi_f11 and rmi_f12. Sounds reasonable, I'm okay with this. Did you have any comments on the implementation I had in v1? I tested on a device which has an always on regulators so I didn't add anything to device tree for the device. But, it returned 0 when it didn't find anything which seems to be the correct behavior. Is there an easy way to avoid sleeping for 10ms when there are no regulators? Maybe check if both the supplies .consumer pointer is null? I did look at this as well, but unfortunately the regulators does not come back as NULL, but rather as dummy regulators. The delay matches Tpowerup (iirc) from the data sheet, which I assume is firmware/hardware dependant. Should we provide a knob for that and default the sleep to 0ms? Making the default 0 and then setting an appropriate time out for devices which need it sounds like a good idea to me. Thanks, Andrew Regards, Bjorn
Re: [PATCH v2 0/3] input: rmi4: Regulator supply support
On 05/13/2016 03:29 PM, Bjorn Andersson wrote: On Thu 12 May 17:52 PDT 2016, Andrew Duggan wrote: On 05/11/2016 08:05 PM, Bjorn Andersson wrote: On Wed 11 May 16:30 PDT 2016, Andrew Duggan wrote: Hi Bjorn, On 05/10/2016 08:49 AM, Bjorn Andersson wrote: [..] So either we duplicate the regulator support in spi/i2c or we make them optional in the core driver. Sounds like you prefer the prior, i.e. v1 of my patch. Yes, after all this I think it makes sense to put regulator support in the spi/i2c transports like in your v1 patch. I essentially duplicated the irq handling code in both transports so I would be ok with duplicating regulator support too. It doesn't seem like that much code. But, if this is too much duplication we could create some sort of common file and put the common irq and regulator support functions which could be called in the transports. Similar to how rmi_2d_sensor.c defines some common functions shared between rmi_f11 and rmi_f12. Sounds reasonable, I'm okay with this. Did you have any comments on the implementation I had in v1? I tested on a device which has an always on regulators so I didn't add anything to device tree for the device. But, it returned 0 when it didn't find anything which seems to be the correct behavior. Is there an easy way to avoid sleeping for 10ms when there are no regulators? Maybe check if both the supplies .consumer pointer is null? I did look at this as well, but unfortunately the regulators does not come back as NULL, but rather as dummy regulators. The delay matches Tpowerup (iirc) from the data sheet, which I assume is firmware/hardware dependant. Should we provide a knob for that and default the sleep to 0ms? Making the default 0 and then setting an appropriate time out for devices which need it sounds like a good idea to me. Thanks, Andrew Regards, Bjorn
Re: linux-next: manual merge of the f2fs tree with the ext4 tree
On Mon, May 16, 2016 at 03:22:41PM -0700, Jaegeuk Kim wrote: > > Sorry, I just ran out of time to try to verify that the patch wouldn't > > break anything, and given that we're going to need to wait for > > "fscrypto/f2fs: allow fs-specific key prefix for fs encryption" to go > > upstream. > > Agreed. IIUC, let me push the fscrypto/f2fs patch to v4.7 first? Right --- that's in linux-next already, right? And currently it's a combined fscrypto/f2fs patch, which is why I suspect letting it go into v4.7 first makes sense. I'll make sure the ext4 move to fs/crypto will be one of the first development patches for 4.8 (modulo any urgent bug fixes that need to go into 4.7 final first). > I have no planned patch right now, and of course, it must have no problem for > you to treat with further patches. > Also, let me take a look at any missing part again, regarding to your > concerns. I'm sure there may be some missing pieces around using file system level crypto for the desktop / server use case. Some of them are in how we handle removable thumb drives, for example. There are definitely some missing pieces about how to handle removable SD cards for Android, as well, including some kernel-side patches that are currently living in the unstable portion of the ext4 patch queue. We never got the design, implementation and kernel<->userspace API's fully baked, so that's not going upstream any time soon, but all of this means that we will need to figure out what's the best way to develop, test, and push fs/crypto changes in the long term. This may mean a new git tree with shared maintenance, as one way that we could do things. - Ted
Re: [PATCH net-next] bpf: arm64: remove callee-save registers use for tmp registers
On 5/16/2016 4:45 PM, Z Lim wrote: Hi Yang, On Mon, May 16, 2016 at 4:09 PM, Yang Shiwrote: In the current implementation of ARM64 eBPF JIT, R23 and R24 are used for tmp registers, which are callee-saved registers. This leads to variable size of JIT prologue and epilogue. The latest blinding constant change prefers to constant size of prologue and epilogue. AAPCS reserves R9 ~ R15 for temp registers which not need to be saved/restored during function call. So, replace R23 and R24 to R10 and R11, and remove tmp_used flag. CC: Zi Shen Lim CC: Daniel Borkmann Signed-off-by: Yang Shi --- Couple suggestions, but otherwise: Acked-by: Zi Shen Lim 1. Update the diagram. I think it should now be: -* BPF fp register => -80:+-+ <= (BPF_FP) +* BPF fp register => -64:+-+ <= (BPF_FP) Nice catch. I forgot the stack diagram. 2. Add a comment in commit log along the lines of: this is an optimization saving 2 instructions per jited BPF program. Sure, will address in V2. Thanks, Yang Thanks :) z Apply on top of Daniel's blinding constant patchset. arch/arm64/net/bpf_jit_comp.c | 32 1 file changed, 4 insertions(+), 28 deletions(-)
Re: [PATCH net-next] bpf: arm64: remove callee-save registers use for tmp registers
On 5/16/2016 4:45 PM, Z Lim wrote: Hi Yang, On Mon, May 16, 2016 at 4:09 PM, Yang Shi wrote: In the current implementation of ARM64 eBPF JIT, R23 and R24 are used for tmp registers, which are callee-saved registers. This leads to variable size of JIT prologue and epilogue. The latest blinding constant change prefers to constant size of prologue and epilogue. AAPCS reserves R9 ~ R15 for temp registers which not need to be saved/restored during function call. So, replace R23 and R24 to R10 and R11, and remove tmp_used flag. CC: Zi Shen Lim CC: Daniel Borkmann Signed-off-by: Yang Shi --- Couple suggestions, but otherwise: Acked-by: Zi Shen Lim 1. Update the diagram. I think it should now be: -* BPF fp register => -80:+-+ <= (BPF_FP) +* BPF fp register => -64:+-+ <= (BPF_FP) Nice catch. I forgot the stack diagram. 2. Add a comment in commit log along the lines of: this is an optimization saving 2 instructions per jited BPF program. Sure, will address in V2. Thanks, Yang Thanks :) z Apply on top of Daniel's blinding constant patchset. arch/arm64/net/bpf_jit_comp.c | 32 1 file changed, 4 insertions(+), 28 deletions(-)
Re: [PATCH net-next] bpf: arm64: remove callee-save registers use for tmp registers
Hi Yang, On Mon, May 16, 2016 at 4:09 PM, Yang Shiwrote: > In the current implementation of ARM64 eBPF JIT, R23 and R24 are used for > tmp registers, which are callee-saved registers. This leads to variable size > of JIT prologue and epilogue. The latest blinding constant change prefers to > constant size of prologue and epilogue. AAPCS reserves R9 ~ R15 for temp > registers which not need to be saved/restored during function call. So, > replace > R23 and R24 to R10 and R11, and remove tmp_used flag. > > CC: Zi Shen Lim > CC: Daniel Borkmann > Signed-off-by: Yang Shi > --- Couple suggestions, but otherwise: Acked-by: Zi Shen Lim 1. Update the diagram. I think it should now be: -* BPF fp register => -80:+-+ <= (BPF_FP) +* BPF fp register => -64:+-+ <= (BPF_FP) 2. Add a comment in commit log along the lines of: this is an optimization saving 2 instructions per jited BPF program. Thanks :) z > Apply on top of Daniel's blinding constant patchset. > > arch/arm64/net/bpf_jit_comp.c | 32 > 1 file changed, 4 insertions(+), 28 deletions(-) >
Re: [PATCH net-next] bpf: arm64: remove callee-save registers use for tmp registers
Hi Yang, On Mon, May 16, 2016 at 4:09 PM, Yang Shi wrote: > In the current implementation of ARM64 eBPF JIT, R23 and R24 are used for > tmp registers, which are callee-saved registers. This leads to variable size > of JIT prologue and epilogue. The latest blinding constant change prefers to > constant size of prologue and epilogue. AAPCS reserves R9 ~ R15 for temp > registers which not need to be saved/restored during function call. So, > replace > R23 and R24 to R10 and R11, and remove tmp_used flag. > > CC: Zi Shen Lim > CC: Daniel Borkmann > Signed-off-by: Yang Shi > --- Couple suggestions, but otherwise: Acked-by: Zi Shen Lim 1. Update the diagram. I think it should now be: -* BPF fp register => -80:+-+ <= (BPF_FP) +* BPF fp register => -64:+-+ <= (BPF_FP) 2. Add a comment in commit log along the lines of: this is an optimization saving 2 instructions per jited BPF program. Thanks :) z > Apply on top of Daniel's blinding constant patchset. > > arch/arm64/net/bpf_jit_comp.c | 32 > 1 file changed, 4 insertions(+), 28 deletions(-) >
Re: [PATCH v5 0/1] ARM64: ACPI: Update documentation for latest specification version
On Mon, May 2, 2016 at 09:19 PM, Al Stone wrote: > On 04/25/2016 03:21 PM, Al Stone wrote: > > The ACPI 6.1 specification was recently released at the end of January > > 2016, but the arm64 kernel documentation for the use of ACPI was written > > for the 5.1 version of the spec. There were significant additions to the > > spec that had not yet been mentioned -- for example, the 6.0 mechanisms > > added to make it easier to define processors and low power idle states, > > as well as the 6.1 addition allowing regular interrupts (not just from > > GPIO) be used to signal ACPI general purpose events. > > > > This patch reflects going back through and examining the specs in detail > > and updating content appropriately. Whilst there, a few odds and ends of > > typos were caught as well. This brings the documentation up to date with > > ACPI 6.1 for arm64. > > > > Changes for v5: > >-- Miscellaneous typos and corrections (Lorenzo Pieralisi) > >-- Add linux-acpi@ ML to the distribution list (Alexey Klimov) > >-- Corrections to CPPC information (Alexey Klimov) > >-- ACK from Lorenzo Pieralisi > >-- Updated bibliographic info (Al Stone) > > > > Changes for v4: > >-- Clarify that IORT can sometimes be optional (Jon Masters). > >-- Remove "Use as needed" descriptions of ACPI objects; they provide > > no substantive information and doing so simplifies maintenance of > > this document over time. These have been replaced with a simpler > > notice that states that unless otherwise noted, do what the ACPI > > specification says is needed. > >-- Corrected the _OSI object usage recommendation; it described kernel > > behavior that does not exist (Al Stone). > > > > Changes for v3: > >-- Clarify use of _LPI/_RDI (Vikas Sajjan) > >-- Whitespace cleanup as pointed out by checkpatch > > > > Changes for v2: > >-- Clean up white space (Harb Abdulhahmid) > >-- Clarification on _CCA usage (Harb Abdulhamid) > >-- IORT moved to required from recommended (Hanjun Guo) > >-- Clarify IORT description (Hanjun Guo) > > > > > > Al Stone (1): > > ARM64: ACPI: Update documentation for latest specification version > > > > Documentation/arm64/acpi_object_usage.txt | 343 > > -- > > Documentation/arm64/arm-acpi.txt | 40 ++-- > > 2 files changed, 213 insertions(+), 170 deletions(-) > > > > Ping? If there are no further comments, can this be pulled in through > either the documentation or arm64 tree? > > Thanks. Hi Al, sorry for delay. CPPC and PCC corrections look fine. Thanks. This comment is not to block your patch (maybe some to-do): I greped sources and your patch and I don't see description of _PSD object. This P-state dependancy object is optional but it's presense and correct data are extremely useful for CPPC and can potentially descrease number of performance changing requests. ACPI spec in section about CPPC tells that it may use _PSD (page 503 if I remember correctly) to specify domain belongings of CPUs. You may consider to add description of _PSD object later. Best regards, Alexey.
Re: [PATCH v5 0/1] ARM64: ACPI: Update documentation for latest specification version
On Mon, May 2, 2016 at 09:19 PM, Al Stone wrote: > On 04/25/2016 03:21 PM, Al Stone wrote: > > The ACPI 6.1 specification was recently released at the end of January > > 2016, but the arm64 kernel documentation for the use of ACPI was written > > for the 5.1 version of the spec. There were significant additions to the > > spec that had not yet been mentioned -- for example, the 6.0 mechanisms > > added to make it easier to define processors and low power idle states, > > as well as the 6.1 addition allowing regular interrupts (not just from > > GPIO) be used to signal ACPI general purpose events. > > > > This patch reflects going back through and examining the specs in detail > > and updating content appropriately. Whilst there, a few odds and ends of > > typos were caught as well. This brings the documentation up to date with > > ACPI 6.1 for arm64. > > > > Changes for v5: > >-- Miscellaneous typos and corrections (Lorenzo Pieralisi) > >-- Add linux-acpi@ ML to the distribution list (Alexey Klimov) > >-- Corrections to CPPC information (Alexey Klimov) > >-- ACK from Lorenzo Pieralisi > >-- Updated bibliographic info (Al Stone) > > > > Changes for v4: > >-- Clarify that IORT can sometimes be optional (Jon Masters). > >-- Remove "Use as needed" descriptions of ACPI objects; they provide > > no substantive information and doing so simplifies maintenance of > > this document over time. These have been replaced with a simpler > > notice that states that unless otherwise noted, do what the ACPI > > specification says is needed. > >-- Corrected the _OSI object usage recommendation; it described kernel > > behavior that does not exist (Al Stone). > > > > Changes for v3: > >-- Clarify use of _LPI/_RDI (Vikas Sajjan) > >-- Whitespace cleanup as pointed out by checkpatch > > > > Changes for v2: > >-- Clean up white space (Harb Abdulhahmid) > >-- Clarification on _CCA usage (Harb Abdulhamid) > >-- IORT moved to required from recommended (Hanjun Guo) > >-- Clarify IORT description (Hanjun Guo) > > > > > > Al Stone (1): > > ARM64: ACPI: Update documentation for latest specification version > > > > Documentation/arm64/acpi_object_usage.txt | 343 > > -- > > Documentation/arm64/arm-acpi.txt | 40 ++-- > > 2 files changed, 213 insertions(+), 170 deletions(-) > > > > Ping? If there are no further comments, can this be pulled in through > either the documentation or arm64 tree? > > Thanks. Hi Al, sorry for delay. CPPC and PCC corrections look fine. Thanks. This comment is not to block your patch (maybe some to-do): I greped sources and your patch and I don't see description of _PSD object. This P-state dependancy object is optional but it's presense and correct data are extremely useful for CPPC and can potentially descrease number of performance changing requests. ACPI spec in section about CPPC tells that it may use _PSD (page 503 if I remember correctly) to specify domain belongings of CPUs. You may consider to add description of _PSD object later. Best regards, Alexey.
Re: [v4,1/1] powerpc/86xx: Add support for Emerson/Artesyn MVME7100
On Wed, Apr 27, 2016 at 10:35:25AM +0200, Alessio Igor Bogani wrote: > + bcsr@4,0 { > + compatible = "artesyn,mvme7100-bcsr"; > + reg = <4 0 0x1>; > + }; > + > +serial@5,1000 { > + cell-index = <2>; > + device_type = "serial"; > + compatible = "ns16550"; > + reg = <5 0x1000 0x100>; > + clock-frequency = <1843200>; > + interrupts = <11 1 0 0>; > + }; The "serial@5,1000" line has spaces where there should be tabs. There are several other instances of this in the patch. Where did these cell-index values come from? Why are they needed? > + }; > + > +}; No blank line here. > + platform_ops.fixups = mvme7100_fixups; > + > +} No blank line here. > diff --git a/arch/powerpc/boot/ppcboot.h b/arch/powerpc/boot/ppcboot.h > index 6ae6f90..7b758be 100644 > --- a/arch/powerpc/boot/ppcboot.h > +++ b/arch/powerpc/boot/ppcboot.h > @@ -43,7 +43,7 @@ typedef struct bd_info { > unsigned long bi_sramstart; /* start of SRAM memory */ > unsigned long bi_sramsize;/* size of SRAM memory */ > #if defined(TARGET_8xx) || defined(TARGET_CPM2) || defined(TARGET_85xx) ||\ > - defined(TARGET_83xx) > + defined(TARGET_83xx) || defined(TARGET_MVME7100) > unsigned long bi_immr_base; /* base of IMMR register */ > #endif Again, please use TARGET_86xx here rather than TARGET_MVME7100. > @@ -69,7 +74,7 @@ config MPC8641 > select FSL_PCI if PCI > select PPC_UDBG_16550 > select MPIC > - default y if MPC8641_HPCN || SBC8641D || GEF_SBC610 || GEF_SBC310 || > GEF_PPC9A > + default y if MPC8641_HPCN || SBC8641D || GEF_SBC610 || GEF_SBC310 || > GEF_PPC9A || MVME7100 Please wrap this long line. -Scott
Re: [v4,1/1] powerpc/86xx: Add support for Emerson/Artesyn MVME7100
On Wed, Apr 27, 2016 at 10:35:25AM +0200, Alessio Igor Bogani wrote: > + bcsr@4,0 { > + compatible = "artesyn,mvme7100-bcsr"; > + reg = <4 0 0x1>; > + }; > + > +serial@5,1000 { > + cell-index = <2>; > + device_type = "serial"; > + compatible = "ns16550"; > + reg = <5 0x1000 0x100>; > + clock-frequency = <1843200>; > + interrupts = <11 1 0 0>; > + }; The "serial@5,1000" line has spaces where there should be tabs. There are several other instances of this in the patch. Where did these cell-index values come from? Why are they needed? > + }; > + > +}; No blank line here. > + platform_ops.fixups = mvme7100_fixups; > + > +} No blank line here. > diff --git a/arch/powerpc/boot/ppcboot.h b/arch/powerpc/boot/ppcboot.h > index 6ae6f90..7b758be 100644 > --- a/arch/powerpc/boot/ppcboot.h > +++ b/arch/powerpc/boot/ppcboot.h > @@ -43,7 +43,7 @@ typedef struct bd_info { > unsigned long bi_sramstart; /* start of SRAM memory */ > unsigned long bi_sramsize;/* size of SRAM memory */ > #if defined(TARGET_8xx) || defined(TARGET_CPM2) || defined(TARGET_85xx) ||\ > - defined(TARGET_83xx) > + defined(TARGET_83xx) || defined(TARGET_MVME7100) > unsigned long bi_immr_base; /* base of IMMR register */ > #endif Again, please use TARGET_86xx here rather than TARGET_MVME7100. > @@ -69,7 +74,7 @@ config MPC8641 > select FSL_PCI if PCI > select PPC_UDBG_16550 > select MPIC > - default y if MPC8641_HPCN || SBC8641D || GEF_SBC610 || GEF_SBC310 || > GEF_PPC9A > + default y if MPC8641_HPCN || SBC8641D || GEF_SBC610 || GEF_SBC310 || > GEF_PPC9A || MVME7100 Please wrap this long line. -Scott
Re: [PATCH] ideapad-laptop: add a new WMI string for ESC key
On Thu, May 12, 2016 at 12:34:05AM +0300, Denis Gordienko wrote: > Hi, Darren. This patch already tested on my laptop and I can confirm - it's > work fine. Awesome, thanks! Added your tested-by. > Denis > 11 Май 2016 г. 23:09 пользователь "Darren Hart"> написал: > > On Mon, May 09, 2016 at 11:49:21PM +0200, Arnd Bergmann wrote: > > My patch to the ideapad-laptop driver to get the ESC key working on the > > Yoga 1170 (Yoga 3) failed to do the same for the following model, the > > Lenovo Yoga 700. > > > > Denis Gordienko managed to get it working by adding another GUID for the > > new WMI interface. I have adapted his patch to normal coding style > > and simplified it a bit for inclusion, but this patch is currently > > untested. > > > > Link: > https://forums.lenovo.com/t5/Lenovo-Yoga-Series-Notebooks/YOGA-3-14-How-to-reclaim-my-Esc-key-and-permanently-disable/m-p/3317499 > > Signed-off-by: Arnd Bergmann > > Cc: Denis Gordienko > > Looks good to me, but of course I would like to see some testing. Denis, I > presume you have a Yoga 700 you could verify this version of your patch on? > Would you be willing to do so? > > > > --- > > drivers/platform/x86/ideapad-laptop.c | 19 --- > > 1 file changed, 16 insertions(+), 3 deletions(-) > > > > diff --git a/drivers/platform/x86/ideapad-laptop.c > b/drivers/platform/x86/ideapad-laptop.c > > index 1d49db124753..531ecca9dbe3 100644 > > --- a/drivers/platform/x86/ideapad-laptop.c > > +++ b/drivers/platform/x86/ideapad-laptop.c > > @@ -48,7 +48,10 @@ > > #define CFG_CAMERA_BIT (19) > > > > #if IS_ENABLED(CONFIG_ACPI_WMI) > > -static const char ideapad_wmi_fnesc_event[] = > "26CAB2E5-5CF1-46AE-AAC3-4A12B6BA50E6"; > > +static const char* ideapad_wmi_fnesc_events[] = { Arnd, should this be: static const char *const ideapad_wmi_fnesc_events[] = { ? Checkpatch complains, I see both usages throughout the kernel. I believe we do want a const array of const char arrays, so I think this is a better declaration. Any objection to this change? -- Darren Hart Intel Open Source Technology Center
Re: [PATCH] ideapad-laptop: add a new WMI string for ESC key
On Thu, May 12, 2016 at 12:34:05AM +0300, Denis Gordienko wrote: > Hi, Darren. This patch already tested on my laptop and I can confirm - it's > work fine. Awesome, thanks! Added your tested-by. > Denis > 11 Май 2016 г. 23:09 пользователь "Darren Hart" > написал: > > On Mon, May 09, 2016 at 11:49:21PM +0200, Arnd Bergmann wrote: > > My patch to the ideapad-laptop driver to get the ESC key working on the > > Yoga 1170 (Yoga 3) failed to do the same for the following model, the > > Lenovo Yoga 700. > > > > Denis Gordienko managed to get it working by adding another GUID for the > > new WMI interface. I have adapted his patch to normal coding style > > and simplified it a bit for inclusion, but this patch is currently > > untested. > > > > Link: > https://forums.lenovo.com/t5/Lenovo-Yoga-Series-Notebooks/YOGA-3-14-How-to-reclaim-my-Esc-key-and-permanently-disable/m-p/3317499 > > Signed-off-by: Arnd Bergmann > > Cc: Denis Gordienko > > Looks good to me, but of course I would like to see some testing. Denis, I > presume you have a Yoga 700 you could verify this version of your patch on? > Would you be willing to do so? > > > > --- > > drivers/platform/x86/ideapad-laptop.c | 19 --- > > 1 file changed, 16 insertions(+), 3 deletions(-) > > > > diff --git a/drivers/platform/x86/ideapad-laptop.c > b/drivers/platform/x86/ideapad-laptop.c > > index 1d49db124753..531ecca9dbe3 100644 > > --- a/drivers/platform/x86/ideapad-laptop.c > > +++ b/drivers/platform/x86/ideapad-laptop.c > > @@ -48,7 +48,10 @@ > > #define CFG_CAMERA_BIT (19) > > > > #if IS_ENABLED(CONFIG_ACPI_WMI) > > -static const char ideapad_wmi_fnesc_event[] = > "26CAB2E5-5CF1-46AE-AAC3-4A12B6BA50E6"; > > +static const char* ideapad_wmi_fnesc_events[] = { Arnd, should this be: static const char *const ideapad_wmi_fnesc_events[] = { ? Checkpatch complains, I see both usages throughout the kernel. I believe we do want a const array of const char arrays, so I think this is a better declaration. Any objection to this change? -- Darren Hart Intel Open Source Technology Center
Re: tty crash in Linux 4.6
Hi Mikulas, On 05/16/2016 01:12 PM, Mikulas Patocka wrote: > Hi > > In the kernel 4.6 I get crashes in the tty layer. I can reproduce the > crash by logging into the machine with ssh and typing before the prompt > appears. Thanks for the report. I tried to reproduce this a number of times on different machines with no luck. > The crash is caused by the pointer tty->disc_data being NULL in the > function n_tty_receive_buf_common. The crash happens on the statement > smp_load_acquire(>read_tail). > > Bisecting shows that the crashes are caused by the patch > 892d1fa7eaaed9d3c04954cb140c34ebc3393932 ("tty: Destroy ldisc instance on > hangup"). Can you try the test patch below? Regards, Peter Hurley > Kernel Fault: Code=15 regs=7d9e0720 (Addr=2260) > CPU: 0 PID: 3319 Comm: kworker/u8:0 Not tainted 4.6.0 #1 > Workqueue: events_unbound flush_to_ldisc > task: 7c25ea80 ti: 7d9e task.ti: 7d9e > > YZrvWESTHLNXBCVMcbcbcbcbOGFRQPDI > PSW: 1100 Not tainted > r00-03 0804000f 4076cd10 40475fb4 7f761800 > r04-07 40749510 0001 7f761800 7d9e0490 > r08-11 7e722890 7da4ec00 7f763823 > r12-15 7fc08ea8 7fc08c78 4080e080 > r16-19 7fc08c00 0001 2260 > r20-23 7f7618b0 7c25ea80 0001 0001 > r24-27 080f 7f7618ac 40749510 > r28-31 0001 7d9e0840 7d9e0720 0001 > sr00-03 086c8800 086c8800 > sr04-07 > > IASQ: IAOQ: 40475fd4 > 40475fd8 > IIR: 0e6c00d5ISR: IOR: 2260 > CPU:0 CR30: 7d9e CR31: ff87e7ffbc9e > ORIG_R28: 4080a180 > IAOQ[0]: n_tty_receive_buf_common+0xb4/0xbe0 > IAOQ[1]: n_tty_receive_buf_common+0xb8/0xbe0 > RP(r2): n_tty_receive_buf_common+0x94/0xbe0 > Backtrace: > [<40476b14>] n_tty_receive_buf2+0x14/0x20 > [<4047a208>] tty_ldisc_receive_buf+0x30/0x90 > [<4047a544>] flush_to_ldisc+0x144/0x1c8 > [<402556bc>] process_one_work+0x1b4/0x460 > [<40255bbc>] worker_thread+0x1e4/0x5e0 > [<4025d454>] kthread+0x134/0x168 --- >% --- diff --git a/drivers/tty/tty_ldisc.c b/drivers/tty/tty_ldisc.c index 68947f6..f271832 100644 --- a/drivers/tty/tty_ldisc.c +++ b/drivers/tty/tty_ldisc.c @@ -653,7 +653,7 @@ static void tty_reset_termios(struct tty_struct *tty) * Returns 0 if successful, otherwise error code < 0 */ -int tty_ldisc_reinit(struct tty_struct *tty, int disc) +static int __tty_ldisc_reinit(struct tty_struct *tty, int disc) { struct tty_ldisc *ld; int retval; @@ -682,6 +682,16 @@ int tty_ldisc_reinit(struct tty_struct *tty, int disc) return retval; } +int tty_ldisc_reinit(struct tty_struct *tty, int disc) +{ + int retval; + + tty_ldisc_lock(tty, MAX_SCHEDULE_TIMEOUT); + retval = __tty_ldisc_reinit(tty, disc); + tty_ldisc_unlock(tty); + return retval; +} + /** * tty_ldisc_hangup- hangup ldisc reset * @tty: tty being hung up @@ -732,8 +742,8 @@ void tty_ldisc_hangup(struct tty_struct *tty, bool reinit) if (tty->ldisc) { if (reinit) { - if (tty_ldisc_reinit(tty, tty->termios.c_line) < 0) - tty_ldisc_reinit(tty, N_TTY); + if (__tty_ldisc_reinit(tty, tty->termios.c_line) < 0) + __tty_ldisc_reinit(tty, N_TTY); } else tty_ldisc_kill(tty); }
[PATCH net-next] bpf: arm64: remove callee-save registers use for tmp registers
In the current implementation of ARM64 eBPF JIT, R23 and R24 are used for tmp registers, which are callee-saved registers. This leads to variable size of JIT prologue and epilogue. The latest blinding constant change prefers to constant size of prologue and epilogue. AAPCS reserves R9 ~ R15 for temp registers which not need to be saved/restored during function call. So, replace R23 and R24 to R10 and R11, and remove tmp_used flag. CC: Zi Shen LimCC: Daniel Borkmann Signed-off-by: Yang Shi --- Apply on top of Daniel's blinding constant patchset. arch/arm64/net/bpf_jit_comp.c | 32 1 file changed, 4 insertions(+), 28 deletions(-) diff --git a/arch/arm64/net/bpf_jit_comp.c b/arch/arm64/net/bpf_jit_comp.c index d0d5190..ef3055a 100644 --- a/arch/arm64/net/bpf_jit_comp.c +++ b/arch/arm64/net/bpf_jit_comp.c @@ -51,9 +51,9 @@ static const int bpf2a64[] = { [BPF_REG_9] = A64_R(22), /* read-only frame pointer to access stack */ [BPF_REG_FP] = A64_R(25), - /* temporary register for internal BPF JIT */ - [TMP_REG_1] = A64_R(23), - [TMP_REG_2] = A64_R(24), + /* temporary registers for internal BPF JIT */ + [TMP_REG_1] = A64_R(10), + [TMP_REG_2] = A64_R(11), /* temporary register for blinding constants */ [BPF_REG_AX] = A64_R(9), }; @@ -61,7 +61,6 @@ static const int bpf2a64[] = { struct jit_ctx { const struct bpf_prog *prog; int idx; - int tmp_used; int epilogue_offset; int *offset; u32 *image; @@ -154,8 +153,6 @@ static void build_prologue(struct jit_ctx *ctx) const u8 r8 = bpf2a64[BPF_REG_8]; const u8 r9 = bpf2a64[BPF_REG_9]; const u8 fp = bpf2a64[BPF_REG_FP]; - const u8 tmp1 = bpf2a64[TMP_REG_1]; - const u8 tmp2 = bpf2a64[TMP_REG_2]; /* * BPF prog stack layout @@ -189,8 +186,6 @@ static void build_prologue(struct jit_ctx *ctx) /* Save callee-saved register */ emit(A64_PUSH(r6, r7, A64_SP), ctx); emit(A64_PUSH(r8, r9, A64_SP), ctx); - if (ctx->tmp_used) - emit(A64_PUSH(tmp1, tmp2, A64_SP), ctx); /* Save fp (x25) and x26. SP requires 16 bytes alignment */ emit(A64_PUSH(fp, A64_R(26), A64_SP), ctx); @@ -210,8 +205,6 @@ static void build_epilogue(struct jit_ctx *ctx) const u8 r8 = bpf2a64[BPF_REG_8]; const u8 r9 = bpf2a64[BPF_REG_9]; const u8 fp = bpf2a64[BPF_REG_FP]; - const u8 tmp1 = bpf2a64[TMP_REG_1]; - const u8 tmp2 = bpf2a64[TMP_REG_2]; /* We're done with BPF stack */ emit(A64_ADD_I(1, A64_SP, A64_SP, STACK_SIZE), ctx); @@ -220,8 +213,6 @@ static void build_epilogue(struct jit_ctx *ctx) emit(A64_POP(fp, A64_R(26), A64_SP), ctx); /* Restore callee-saved register */ - if (ctx->tmp_used) - emit(A64_POP(tmp1, tmp2, A64_SP), ctx); emit(A64_POP(r8, r9, A64_SP), ctx); emit(A64_POP(r6, r7, A64_SP), ctx); @@ -317,7 +308,6 @@ static int build_insn(const struct bpf_insn *insn, struct jit_ctx *ctx) emit(A64_UDIV(is64, dst, dst, src), ctx); break; case BPF_MOD: - ctx->tmp_used = 1; emit(A64_UDIV(is64, tmp, dst, src), ctx); emit(A64_MUL(is64, tmp, tmp, src), ctx); emit(A64_SUB(is64, dst, dst, tmp), ctx); @@ -390,49 +380,41 @@ emit_bswap_uxt: /* dst = dst OP imm */ case BPF_ALU | BPF_ADD | BPF_K: case BPF_ALU64 | BPF_ADD | BPF_K: - ctx->tmp_used = 1; emit_a64_mov_i(is64, tmp, imm, ctx); emit(A64_ADD(is64, dst, dst, tmp), ctx); break; case BPF_ALU | BPF_SUB | BPF_K: case BPF_ALU64 | BPF_SUB | BPF_K: - ctx->tmp_used = 1; emit_a64_mov_i(is64, tmp, imm, ctx); emit(A64_SUB(is64, dst, dst, tmp), ctx); break; case BPF_ALU | BPF_AND | BPF_K: case BPF_ALU64 | BPF_AND | BPF_K: - ctx->tmp_used = 1; emit_a64_mov_i(is64, tmp, imm, ctx); emit(A64_AND(is64, dst, dst, tmp), ctx); break; case BPF_ALU | BPF_OR | BPF_K: case BPF_ALU64 | BPF_OR | BPF_K: - ctx->tmp_used = 1; emit_a64_mov_i(is64, tmp, imm, ctx); emit(A64_ORR(is64, dst, dst, tmp), ctx); break; case BPF_ALU | BPF_XOR | BPF_K: case BPF_ALU64 | BPF_XOR | BPF_K: - ctx->tmp_used = 1; emit_a64_mov_i(is64, tmp, imm, ctx); emit(A64_EOR(is64, dst, dst, tmp), ctx); break; case BPF_ALU | BPF_MUL | BPF_K: case BPF_ALU64 | BPF_MUL | BPF_K: - ctx->tmp_used = 1;
[PATCH net-next] bpf: arm64: remove callee-save registers use for tmp registers
In the current implementation of ARM64 eBPF JIT, R23 and R24 are used for tmp registers, which are callee-saved registers. This leads to variable size of JIT prologue and epilogue. The latest blinding constant change prefers to constant size of prologue and epilogue. AAPCS reserves R9 ~ R15 for temp registers which not need to be saved/restored during function call. So, replace R23 and R24 to R10 and R11, and remove tmp_used flag. CC: Zi Shen Lim CC: Daniel Borkmann Signed-off-by: Yang Shi --- Apply on top of Daniel's blinding constant patchset. arch/arm64/net/bpf_jit_comp.c | 32 1 file changed, 4 insertions(+), 28 deletions(-) diff --git a/arch/arm64/net/bpf_jit_comp.c b/arch/arm64/net/bpf_jit_comp.c index d0d5190..ef3055a 100644 --- a/arch/arm64/net/bpf_jit_comp.c +++ b/arch/arm64/net/bpf_jit_comp.c @@ -51,9 +51,9 @@ static const int bpf2a64[] = { [BPF_REG_9] = A64_R(22), /* read-only frame pointer to access stack */ [BPF_REG_FP] = A64_R(25), - /* temporary register for internal BPF JIT */ - [TMP_REG_1] = A64_R(23), - [TMP_REG_2] = A64_R(24), + /* temporary registers for internal BPF JIT */ + [TMP_REG_1] = A64_R(10), + [TMP_REG_2] = A64_R(11), /* temporary register for blinding constants */ [BPF_REG_AX] = A64_R(9), }; @@ -61,7 +61,6 @@ static const int bpf2a64[] = { struct jit_ctx { const struct bpf_prog *prog; int idx; - int tmp_used; int epilogue_offset; int *offset; u32 *image; @@ -154,8 +153,6 @@ static void build_prologue(struct jit_ctx *ctx) const u8 r8 = bpf2a64[BPF_REG_8]; const u8 r9 = bpf2a64[BPF_REG_9]; const u8 fp = bpf2a64[BPF_REG_FP]; - const u8 tmp1 = bpf2a64[TMP_REG_1]; - const u8 tmp2 = bpf2a64[TMP_REG_2]; /* * BPF prog stack layout @@ -189,8 +186,6 @@ static void build_prologue(struct jit_ctx *ctx) /* Save callee-saved register */ emit(A64_PUSH(r6, r7, A64_SP), ctx); emit(A64_PUSH(r8, r9, A64_SP), ctx); - if (ctx->tmp_used) - emit(A64_PUSH(tmp1, tmp2, A64_SP), ctx); /* Save fp (x25) and x26. SP requires 16 bytes alignment */ emit(A64_PUSH(fp, A64_R(26), A64_SP), ctx); @@ -210,8 +205,6 @@ static void build_epilogue(struct jit_ctx *ctx) const u8 r8 = bpf2a64[BPF_REG_8]; const u8 r9 = bpf2a64[BPF_REG_9]; const u8 fp = bpf2a64[BPF_REG_FP]; - const u8 tmp1 = bpf2a64[TMP_REG_1]; - const u8 tmp2 = bpf2a64[TMP_REG_2]; /* We're done with BPF stack */ emit(A64_ADD_I(1, A64_SP, A64_SP, STACK_SIZE), ctx); @@ -220,8 +213,6 @@ static void build_epilogue(struct jit_ctx *ctx) emit(A64_POP(fp, A64_R(26), A64_SP), ctx); /* Restore callee-saved register */ - if (ctx->tmp_used) - emit(A64_POP(tmp1, tmp2, A64_SP), ctx); emit(A64_POP(r8, r9, A64_SP), ctx); emit(A64_POP(r6, r7, A64_SP), ctx); @@ -317,7 +308,6 @@ static int build_insn(const struct bpf_insn *insn, struct jit_ctx *ctx) emit(A64_UDIV(is64, dst, dst, src), ctx); break; case BPF_MOD: - ctx->tmp_used = 1; emit(A64_UDIV(is64, tmp, dst, src), ctx); emit(A64_MUL(is64, tmp, tmp, src), ctx); emit(A64_SUB(is64, dst, dst, tmp), ctx); @@ -390,49 +380,41 @@ emit_bswap_uxt: /* dst = dst OP imm */ case BPF_ALU | BPF_ADD | BPF_K: case BPF_ALU64 | BPF_ADD | BPF_K: - ctx->tmp_used = 1; emit_a64_mov_i(is64, tmp, imm, ctx); emit(A64_ADD(is64, dst, dst, tmp), ctx); break; case BPF_ALU | BPF_SUB | BPF_K: case BPF_ALU64 | BPF_SUB | BPF_K: - ctx->tmp_used = 1; emit_a64_mov_i(is64, tmp, imm, ctx); emit(A64_SUB(is64, dst, dst, tmp), ctx); break; case BPF_ALU | BPF_AND | BPF_K: case BPF_ALU64 | BPF_AND | BPF_K: - ctx->tmp_used = 1; emit_a64_mov_i(is64, tmp, imm, ctx); emit(A64_AND(is64, dst, dst, tmp), ctx); break; case BPF_ALU | BPF_OR | BPF_K: case BPF_ALU64 | BPF_OR | BPF_K: - ctx->tmp_used = 1; emit_a64_mov_i(is64, tmp, imm, ctx); emit(A64_ORR(is64, dst, dst, tmp), ctx); break; case BPF_ALU | BPF_XOR | BPF_K: case BPF_ALU64 | BPF_XOR | BPF_K: - ctx->tmp_used = 1; emit_a64_mov_i(is64, tmp, imm, ctx); emit(A64_EOR(is64, dst, dst, tmp), ctx); break; case BPF_ALU | BPF_MUL | BPF_K: case BPF_ALU64 | BPF_MUL | BPF_K: - ctx->tmp_used = 1; emit_a64_mov_i(is64, tmp, imm, ctx);
Re: tty crash in Linux 4.6
Hi Mikulas, On 05/16/2016 01:12 PM, Mikulas Patocka wrote: > Hi > > In the kernel 4.6 I get crashes in the tty layer. I can reproduce the > crash by logging into the machine with ssh and typing before the prompt > appears. Thanks for the report. I tried to reproduce this a number of times on different machines with no luck. > The crash is caused by the pointer tty->disc_data being NULL in the > function n_tty_receive_buf_common. The crash happens on the statement > smp_load_acquire(>read_tail). > > Bisecting shows that the crashes are caused by the patch > 892d1fa7eaaed9d3c04954cb140c34ebc3393932 ("tty: Destroy ldisc instance on > hangup"). Can you try the test patch below? Regards, Peter Hurley > Kernel Fault: Code=15 regs=7d9e0720 (Addr=2260) > CPU: 0 PID: 3319 Comm: kworker/u8:0 Not tainted 4.6.0 #1 > Workqueue: events_unbound flush_to_ldisc > task: 7c25ea80 ti: 7d9e task.ti: 7d9e > > YZrvWESTHLNXBCVMcbcbcbcbOGFRQPDI > PSW: 1100 Not tainted > r00-03 0804000f 4076cd10 40475fb4 7f761800 > r04-07 40749510 0001 7f761800 7d9e0490 > r08-11 7e722890 7da4ec00 7f763823 > r12-15 7fc08ea8 7fc08c78 4080e080 > r16-19 7fc08c00 0001 2260 > r20-23 7f7618b0 7c25ea80 0001 0001 > r24-27 080f 7f7618ac 40749510 > r28-31 0001 7d9e0840 7d9e0720 0001 > sr00-03 086c8800 086c8800 > sr04-07 > > IASQ: IAOQ: 40475fd4 > 40475fd8 > IIR: 0e6c00d5ISR: IOR: 2260 > CPU:0 CR30: 7d9e CR31: ff87e7ffbc9e > ORIG_R28: 4080a180 > IAOQ[0]: n_tty_receive_buf_common+0xb4/0xbe0 > IAOQ[1]: n_tty_receive_buf_common+0xb8/0xbe0 > RP(r2): n_tty_receive_buf_common+0x94/0xbe0 > Backtrace: > [<40476b14>] n_tty_receive_buf2+0x14/0x20 > [<4047a208>] tty_ldisc_receive_buf+0x30/0x90 > [<4047a544>] flush_to_ldisc+0x144/0x1c8 > [<402556bc>] process_one_work+0x1b4/0x460 > [<40255bbc>] worker_thread+0x1e4/0x5e0 > [<4025d454>] kthread+0x134/0x168 --- >% --- diff --git a/drivers/tty/tty_ldisc.c b/drivers/tty/tty_ldisc.c index 68947f6..f271832 100644 --- a/drivers/tty/tty_ldisc.c +++ b/drivers/tty/tty_ldisc.c @@ -653,7 +653,7 @@ static void tty_reset_termios(struct tty_struct *tty) * Returns 0 if successful, otherwise error code < 0 */ -int tty_ldisc_reinit(struct tty_struct *tty, int disc) +static int __tty_ldisc_reinit(struct tty_struct *tty, int disc) { struct tty_ldisc *ld; int retval; @@ -682,6 +682,16 @@ int tty_ldisc_reinit(struct tty_struct *tty, int disc) return retval; } +int tty_ldisc_reinit(struct tty_struct *tty, int disc) +{ + int retval; + + tty_ldisc_lock(tty, MAX_SCHEDULE_TIMEOUT); + retval = __tty_ldisc_reinit(tty, disc); + tty_ldisc_unlock(tty); + return retval; +} + /** * tty_ldisc_hangup- hangup ldisc reset * @tty: tty being hung up @@ -732,8 +742,8 @@ void tty_ldisc_hangup(struct tty_struct *tty, bool reinit) if (tty->ldisc) { if (reinit) { - if (tty_ldisc_reinit(tty, tty->termios.c_line) < 0) - tty_ldisc_reinit(tty, N_TTY); + if (__tty_ldisc_reinit(tty, tty->termios.c_line) < 0) + __tty_ldisc_reinit(tty, N_TTY); } else tty_ldisc_kill(tty); }
Re: 4.6 compilation error when making signing key
CONFIG_MODULE_SIG_KEY="signing_key.pem" This should be 'certs/signing_key.pem', right? I'm not sure how it ended up like that -- perhaps something happened in my next branch (which shares the same .config generally). -- James Morris
Re: 4.6 compilation error when making signing key
CONFIG_MODULE_SIG_KEY="signing_key.pem" This should be 'certs/signing_key.pem', right? I'm not sure how it ended up like that -- perhaps something happened in my next branch (which shares the same .config generally). -- James Morris
Re: 4.6 compilation error when making signing key
On Mon, 16 May 2016, David Howells wrote: > James Morriswrote: > > > I'm seeing this with the 4.6 kernel build: > > > > CHK include/generated/compile.h > > make[1]: *** No rule to make target `signing_key.pem', needed by > > `certs/signing_key.x509'. Stop. > > make: *** [certs] Error 2 > > Can I have a look at your .config just in case? I presume you don't have a > signing_key.pem anywhere? See below. No .pem file. # # Automatically generated file; DO NOT EDIT. # Linux/x86 4.6.0 Kernel Configuration # CONFIG_64BIT=y CONFIG_X86_64=y CONFIG_X86=y CONFIG_INSTRUCTION_DECODER=y CONFIG_PERF_EVENTS_INTEL_UNCORE=y CONFIG_OUTPUT_FORMAT="elf64-x86-64" CONFIG_ARCH_DEFCONFIG="arch/x86/configs/x86_64_defconfig" CONFIG_LOCKDEP_SUPPORT=y CONFIG_STACKTRACE_SUPPORT=y CONFIG_MMU=y CONFIG_ARCH_MMAP_RND_BITS_MIN=28 CONFIG_ARCH_MMAP_RND_BITS_MAX=32 CONFIG_ARCH_MMAP_RND_COMPAT_BITS_MIN=8 CONFIG_ARCH_MMAP_RND_COMPAT_BITS_MAX=16 CONFIG_NEED_DMA_MAP_STATE=y CONFIG_NEED_SG_DMA_LENGTH=y CONFIG_GENERIC_ISA_DMA=y CONFIG_GENERIC_BUG=y CONFIG_GENERIC_BUG_RELATIVE_POINTERS=y CONFIG_GENERIC_HWEIGHT=y CONFIG_ARCH_MAY_HAVE_PC_FDC=y CONFIG_RWSEM_XCHGADD_ALGORITHM=y CONFIG_GENERIC_CALIBRATE_DELAY=y CONFIG_ARCH_HAS_CPU_RELAX=y CONFIG_ARCH_HAS_CACHE_LINE_SIZE=y CONFIG_HAVE_SETUP_PER_CPU_AREA=y CONFIG_NEED_PER_CPU_EMBED_FIRST_CHUNK=y CONFIG_NEED_PER_CPU_PAGE_FIRST_CHUNK=y CONFIG_ARCH_HIBERNATION_POSSIBLE=y CONFIG_ARCH_SUSPEND_POSSIBLE=y CONFIG_ARCH_WANT_HUGE_PMD_SHARE=y CONFIG_ARCH_WANT_GENERAL_HUGETLB=y CONFIG_ZONE_DMA32=y CONFIG_AUDIT_ARCH=y CONFIG_ARCH_SUPPORTS_OPTIMIZED_INLINING=y CONFIG_ARCH_SUPPORTS_DEBUG_PAGEALLOC=y CONFIG_X86_64_SMP=y CONFIG_ARCH_HWEIGHT_CFLAGS="-fcall-saved-rdi -fcall-saved-rsi -fcall-saved-rdx -fcall-saved-rcx -fcall-saved-r8 -fcall-saved-r9 -fcall-saved-r10 -fcall-saved-r11" CONFIG_ARCH_SUPPORTS_UPROBES=y CONFIG_FIX_EARLYCON_MEM=y CONFIG_DEBUG_RODATA=y CONFIG_PGTABLE_LEVELS=4 CONFIG_DEFCONFIG_LIST="/lib/modules/$UNAME_RELEASE/.config" CONFIG_IRQ_WORK=y CONFIG_BUILDTIME_EXTABLE_SORT=y # # General setup # CONFIG_INIT_ENV_ARG_LIMIT=32 CONFIG_CROSS_COMPILE="" # CONFIG_COMPILE_TEST is not set CONFIG_LOCALVERSION="" # CONFIG_LOCALVERSION_AUTO is not set CONFIG_HAVE_KERNEL_GZIP=y CONFIG_HAVE_KERNEL_BZIP2=y CONFIG_HAVE_KERNEL_LZMA=y CONFIG_HAVE_KERNEL_XZ=y CONFIG_HAVE_KERNEL_LZO=y CONFIG_HAVE_KERNEL_LZ4=y CONFIG_KERNEL_GZIP=y # CONFIG_KERNEL_BZIP2 is not set # CONFIG_KERNEL_LZMA is not set # CONFIG_KERNEL_XZ is not set # CONFIG_KERNEL_LZO is not set # CONFIG_KERNEL_LZ4 is not set CONFIG_DEFAULT_HOSTNAME="(none)" CONFIG_SWAP=y CONFIG_SYSVIPC=y CONFIG_SYSVIPC_SYSCTL=y CONFIG_POSIX_MQUEUE=y CONFIG_POSIX_MQUEUE_SYSCTL=y CONFIG_CROSS_MEMORY_ATTACH=y CONFIG_FHANDLE=y CONFIG_USELIB=y CONFIG_AUDIT=y CONFIG_HAVE_ARCH_AUDITSYSCALL=y CONFIG_AUDITSYSCALL=y CONFIG_AUDIT_WATCH=y CONFIG_AUDIT_TREE=y # # IRQ subsystem # CONFIG_GENERIC_IRQ_PROBE=y CONFIG_GENERIC_IRQ_SHOW=y CONFIG_GENERIC_PENDING_IRQ=y CONFIG_IRQ_DOMAIN=y CONFIG_IRQ_DOMAIN_HIERARCHY=y CONFIG_GENERIC_MSI_IRQ=y CONFIG_GENERIC_MSI_IRQ_DOMAIN=y # CONFIG_IRQ_DOMAIN_DEBUG is not set CONFIG_IRQ_FORCED_THREADING=y CONFIG_SPARSE_IRQ=y CONFIG_CLOCKSOURCE_WATCHDOG=y CONFIG_ARCH_CLOCKSOURCE_DATA=y CONFIG_CLOCKSOURCE_VALIDATE_LAST_CYCLE=y CONFIG_GENERIC_TIME_VSYSCALL=y CONFIG_GENERIC_CLOCKEVENTS=y CONFIG_GENERIC_CLOCKEVENTS_BROADCAST=y CONFIG_GENERIC_CLOCKEVENTS_MIN_ADJUST=y CONFIG_GENERIC_CMOS_UPDATE=y # # Timers subsystem # CONFIG_TICK_ONESHOT=y CONFIG_NO_HZ_COMMON=y # CONFIG_HZ_PERIODIC is not set CONFIG_NO_HZ_IDLE=y # CONFIG_NO_HZ_FULL is not set CONFIG_NO_HZ=y CONFIG_HIGH_RES_TIMERS=y # # CPU/Task time and stats accounting # CONFIG_TICK_CPU_ACCOUNTING=y # CONFIG_VIRT_CPU_ACCOUNTING_GEN is not set # CONFIG_IRQ_TIME_ACCOUNTING is not set CONFIG_BSD_PROCESS_ACCT=y CONFIG_BSD_PROCESS_ACCT_V3=y CONFIG_TASKSTATS=y CONFIG_TASK_DELAY_ACCT=y CONFIG_TASK_XACCT=y CONFIG_TASK_IO_ACCOUNTING=y # # RCU Subsystem # CONFIG_PREEMPT_RCU=y # CONFIG_RCU_EXPERT is not set CONFIG_SRCU=y # CONFIG_TASKS_RCU is not set CONFIG_RCU_STALL_COMMON=y CONFIG_TREE_RCU_TRACE=y # CONFIG_RCU_EXPEDITE_BOOT is not set CONFIG_BUILD_BIN2C=y # CONFIG_IKCONFIG is not set CONFIG_LOG_BUF_SHIFT=17 CONFIG_LOG_CPU_MAX_BUF_SHIFT=12 CONFIG_HAVE_UNSTABLE_SCHED_CLOCK=y CONFIG_ARCH_SUPPORTS_NUMA_BALANCING=y CONFIG_ARCH_WANT_BATCHED_UNMAP_TLB_FLUSH=y CONFIG_ARCH_SUPPORTS_INT128=y # CONFIG_NUMA_BALANCING is not set CONFIG_CGROUPS=y # CONFIG_MEMCG is not set # CONFIG_BLK_CGROUP is not set CONFIG_CGROUP_SCHED=y CONFIG_FAIR_GROUP_SCHED=y # CONFIG_CFS_BANDWIDTH is not set # CONFIG_RT_GROUP_SCHED is not set # CONFIG_CGROUP_PIDS is not set # CONFIG_CGROUP_FREEZER is not set # CONFIG_CGROUP_HUGETLB is not set CONFIG_CPUSETS=y CONFIG_PROC_PID_CPUSET=y # CONFIG_CGROUP_DEVICE is not set CONFIG_CGROUP_CPUACCT=y # CONFIG_CGROUP_PERF is not set # CONFIG_CGROUP_DEBUG is not set # CONFIG_CHECKPOINT_RESTORE is not set CONFIG_NAMESPACES=y CONFIG_UTS_NS=y CONFIG_IPC_NS=y CONFIG_USER_NS=y CONFIG_PID_NS=y
Re: 4.6 compilation error when making signing key
On Mon, 16 May 2016, David Howells wrote: > James Morris wrote: > > > I'm seeing this with the 4.6 kernel build: > > > > CHK include/generated/compile.h > > make[1]: *** No rule to make target `signing_key.pem', needed by > > `certs/signing_key.x509'. Stop. > > make: *** [certs] Error 2 > > Can I have a look at your .config just in case? I presume you don't have a > signing_key.pem anywhere? See below. No .pem file. # # Automatically generated file; DO NOT EDIT. # Linux/x86 4.6.0 Kernel Configuration # CONFIG_64BIT=y CONFIG_X86_64=y CONFIG_X86=y CONFIG_INSTRUCTION_DECODER=y CONFIG_PERF_EVENTS_INTEL_UNCORE=y CONFIG_OUTPUT_FORMAT="elf64-x86-64" CONFIG_ARCH_DEFCONFIG="arch/x86/configs/x86_64_defconfig" CONFIG_LOCKDEP_SUPPORT=y CONFIG_STACKTRACE_SUPPORT=y CONFIG_MMU=y CONFIG_ARCH_MMAP_RND_BITS_MIN=28 CONFIG_ARCH_MMAP_RND_BITS_MAX=32 CONFIG_ARCH_MMAP_RND_COMPAT_BITS_MIN=8 CONFIG_ARCH_MMAP_RND_COMPAT_BITS_MAX=16 CONFIG_NEED_DMA_MAP_STATE=y CONFIG_NEED_SG_DMA_LENGTH=y CONFIG_GENERIC_ISA_DMA=y CONFIG_GENERIC_BUG=y CONFIG_GENERIC_BUG_RELATIVE_POINTERS=y CONFIG_GENERIC_HWEIGHT=y CONFIG_ARCH_MAY_HAVE_PC_FDC=y CONFIG_RWSEM_XCHGADD_ALGORITHM=y CONFIG_GENERIC_CALIBRATE_DELAY=y CONFIG_ARCH_HAS_CPU_RELAX=y CONFIG_ARCH_HAS_CACHE_LINE_SIZE=y CONFIG_HAVE_SETUP_PER_CPU_AREA=y CONFIG_NEED_PER_CPU_EMBED_FIRST_CHUNK=y CONFIG_NEED_PER_CPU_PAGE_FIRST_CHUNK=y CONFIG_ARCH_HIBERNATION_POSSIBLE=y CONFIG_ARCH_SUSPEND_POSSIBLE=y CONFIG_ARCH_WANT_HUGE_PMD_SHARE=y CONFIG_ARCH_WANT_GENERAL_HUGETLB=y CONFIG_ZONE_DMA32=y CONFIG_AUDIT_ARCH=y CONFIG_ARCH_SUPPORTS_OPTIMIZED_INLINING=y CONFIG_ARCH_SUPPORTS_DEBUG_PAGEALLOC=y CONFIG_X86_64_SMP=y CONFIG_ARCH_HWEIGHT_CFLAGS="-fcall-saved-rdi -fcall-saved-rsi -fcall-saved-rdx -fcall-saved-rcx -fcall-saved-r8 -fcall-saved-r9 -fcall-saved-r10 -fcall-saved-r11" CONFIG_ARCH_SUPPORTS_UPROBES=y CONFIG_FIX_EARLYCON_MEM=y CONFIG_DEBUG_RODATA=y CONFIG_PGTABLE_LEVELS=4 CONFIG_DEFCONFIG_LIST="/lib/modules/$UNAME_RELEASE/.config" CONFIG_IRQ_WORK=y CONFIG_BUILDTIME_EXTABLE_SORT=y # # General setup # CONFIG_INIT_ENV_ARG_LIMIT=32 CONFIG_CROSS_COMPILE="" # CONFIG_COMPILE_TEST is not set CONFIG_LOCALVERSION="" # CONFIG_LOCALVERSION_AUTO is not set CONFIG_HAVE_KERNEL_GZIP=y CONFIG_HAVE_KERNEL_BZIP2=y CONFIG_HAVE_KERNEL_LZMA=y CONFIG_HAVE_KERNEL_XZ=y CONFIG_HAVE_KERNEL_LZO=y CONFIG_HAVE_KERNEL_LZ4=y CONFIG_KERNEL_GZIP=y # CONFIG_KERNEL_BZIP2 is not set # CONFIG_KERNEL_LZMA is not set # CONFIG_KERNEL_XZ is not set # CONFIG_KERNEL_LZO is not set # CONFIG_KERNEL_LZ4 is not set CONFIG_DEFAULT_HOSTNAME="(none)" CONFIG_SWAP=y CONFIG_SYSVIPC=y CONFIG_SYSVIPC_SYSCTL=y CONFIG_POSIX_MQUEUE=y CONFIG_POSIX_MQUEUE_SYSCTL=y CONFIG_CROSS_MEMORY_ATTACH=y CONFIG_FHANDLE=y CONFIG_USELIB=y CONFIG_AUDIT=y CONFIG_HAVE_ARCH_AUDITSYSCALL=y CONFIG_AUDITSYSCALL=y CONFIG_AUDIT_WATCH=y CONFIG_AUDIT_TREE=y # # IRQ subsystem # CONFIG_GENERIC_IRQ_PROBE=y CONFIG_GENERIC_IRQ_SHOW=y CONFIG_GENERIC_PENDING_IRQ=y CONFIG_IRQ_DOMAIN=y CONFIG_IRQ_DOMAIN_HIERARCHY=y CONFIG_GENERIC_MSI_IRQ=y CONFIG_GENERIC_MSI_IRQ_DOMAIN=y # CONFIG_IRQ_DOMAIN_DEBUG is not set CONFIG_IRQ_FORCED_THREADING=y CONFIG_SPARSE_IRQ=y CONFIG_CLOCKSOURCE_WATCHDOG=y CONFIG_ARCH_CLOCKSOURCE_DATA=y CONFIG_CLOCKSOURCE_VALIDATE_LAST_CYCLE=y CONFIG_GENERIC_TIME_VSYSCALL=y CONFIG_GENERIC_CLOCKEVENTS=y CONFIG_GENERIC_CLOCKEVENTS_BROADCAST=y CONFIG_GENERIC_CLOCKEVENTS_MIN_ADJUST=y CONFIG_GENERIC_CMOS_UPDATE=y # # Timers subsystem # CONFIG_TICK_ONESHOT=y CONFIG_NO_HZ_COMMON=y # CONFIG_HZ_PERIODIC is not set CONFIG_NO_HZ_IDLE=y # CONFIG_NO_HZ_FULL is not set CONFIG_NO_HZ=y CONFIG_HIGH_RES_TIMERS=y # # CPU/Task time and stats accounting # CONFIG_TICK_CPU_ACCOUNTING=y # CONFIG_VIRT_CPU_ACCOUNTING_GEN is not set # CONFIG_IRQ_TIME_ACCOUNTING is not set CONFIG_BSD_PROCESS_ACCT=y CONFIG_BSD_PROCESS_ACCT_V3=y CONFIG_TASKSTATS=y CONFIG_TASK_DELAY_ACCT=y CONFIG_TASK_XACCT=y CONFIG_TASK_IO_ACCOUNTING=y # # RCU Subsystem # CONFIG_PREEMPT_RCU=y # CONFIG_RCU_EXPERT is not set CONFIG_SRCU=y # CONFIG_TASKS_RCU is not set CONFIG_RCU_STALL_COMMON=y CONFIG_TREE_RCU_TRACE=y # CONFIG_RCU_EXPEDITE_BOOT is not set CONFIG_BUILD_BIN2C=y # CONFIG_IKCONFIG is not set CONFIG_LOG_BUF_SHIFT=17 CONFIG_LOG_CPU_MAX_BUF_SHIFT=12 CONFIG_HAVE_UNSTABLE_SCHED_CLOCK=y CONFIG_ARCH_SUPPORTS_NUMA_BALANCING=y CONFIG_ARCH_WANT_BATCHED_UNMAP_TLB_FLUSH=y CONFIG_ARCH_SUPPORTS_INT128=y # CONFIG_NUMA_BALANCING is not set CONFIG_CGROUPS=y # CONFIG_MEMCG is not set # CONFIG_BLK_CGROUP is not set CONFIG_CGROUP_SCHED=y CONFIG_FAIR_GROUP_SCHED=y # CONFIG_CFS_BANDWIDTH is not set # CONFIG_RT_GROUP_SCHED is not set # CONFIG_CGROUP_PIDS is not set # CONFIG_CGROUP_FREEZER is not set # CONFIG_CGROUP_HUGETLB is not set CONFIG_CPUSETS=y CONFIG_PROC_PID_CPUSET=y # CONFIG_CGROUP_DEVICE is not set CONFIG_CGROUP_CPUACCT=y # CONFIG_CGROUP_PERF is not set # CONFIG_CGROUP_DEBUG is not set # CONFIG_CHECKPOINT_RESTORE is not set CONFIG_NAMESPACES=y CONFIG_UTS_NS=y CONFIG_IPC_NS=y CONFIG_USER_NS=y CONFIG_PID_NS=y CONFIG_NET_NS=y
Re: [v5,1/7] QE: Add IC, SI and SIRAM document to device tree bindings.
On Wed, Mar 09, 2016 at 09:21:28AM +0800, Zhao Qiang wrote: > Add IC, SI and SIRAM document of QE to > Documentation/devicetree/bindings/powerpc/fsl/cpm_qe/qe.txt > > Signed-off-by: Zhao Qiang> Acked-by: Rob Herring > --- > changes for v2 > - Add interrupt-controller in Required properties > - delete address-cells and size-cells for qe-si and qe-siram > Changes for v3 > - Add SoC specific caompatible strings to qe-si and qe-siram > Changes for v4 > - NA > Changes for v5 > - NA > > .../devicetree/bindings/powerpc/fsl/cpm_qe/qe.txt | 50 > ++ > 1 file changed, 50 insertions(+) > > diff --git a/Documentation/devicetree/bindings/powerpc/fsl/cpm_qe/qe.txt > b/Documentation/devicetree/bindings/powerpc/fsl/cpm_qe/qe.txt > index 4f89302..7ab21cb 100644 > --- a/Documentation/devicetree/bindings/powerpc/fsl/cpm_qe/qe.txt > +++ b/Documentation/devicetree/bindings/powerpc/fsl/cpm_qe/qe.txt > @@ -69,6 +69,56 @@ Example: > }; > }; > > +* Interrupt Controller (IC) > + > +Required properties: > +- compatible : should be "fsl,qe-ic". > +- reg : Address range of IC register set. > +- interrupts : interrupts generated by the device. > +- interrupt-controller : this device is a interrupt controller. > + > +Example: > + > + qeic: interrupt-controller@80 { > + interrupt-controller; > + compatible = "fsl,qe-ic"; > + #address-cells = <0>; > + #interrupt-cells = <1>; > + reg = <0x80 0x80>; > + interrupts = <95 2 0 0 94 2 0 0>; //high:79 low:78 > + }; Why is the information about which interrupt is "high" and which is "low" in a comment in the example, rather than in the definition of the interrupts property? > +* Serial Interface Block (SI) > + > +The SI manages the routing of eight TDM lines to the QE block serial drivers > +, the MCC and the UCCs, for receive and transmit. > + > +Required properties: > +- compatible : should be "fsl,t1040-qe-si". > +- reg : Address range of SI register set. Is t1040 the only chip that has or will ever have this? -Scott
Re: [v5,1/7] QE: Add IC, SI and SIRAM document to device tree bindings.
On Wed, Mar 09, 2016 at 09:21:28AM +0800, Zhao Qiang wrote: > Add IC, SI and SIRAM document of QE to > Documentation/devicetree/bindings/powerpc/fsl/cpm_qe/qe.txt > > Signed-off-by: Zhao Qiang > Acked-by: Rob Herring > --- > changes for v2 > - Add interrupt-controller in Required properties > - delete address-cells and size-cells for qe-si and qe-siram > Changes for v3 > - Add SoC specific caompatible strings to qe-si and qe-siram > Changes for v4 > - NA > Changes for v5 > - NA > > .../devicetree/bindings/powerpc/fsl/cpm_qe/qe.txt | 50 > ++ > 1 file changed, 50 insertions(+) > > diff --git a/Documentation/devicetree/bindings/powerpc/fsl/cpm_qe/qe.txt > b/Documentation/devicetree/bindings/powerpc/fsl/cpm_qe/qe.txt > index 4f89302..7ab21cb 100644 > --- a/Documentation/devicetree/bindings/powerpc/fsl/cpm_qe/qe.txt > +++ b/Documentation/devicetree/bindings/powerpc/fsl/cpm_qe/qe.txt > @@ -69,6 +69,56 @@ Example: > }; > }; > > +* Interrupt Controller (IC) > + > +Required properties: > +- compatible : should be "fsl,qe-ic". > +- reg : Address range of IC register set. > +- interrupts : interrupts generated by the device. > +- interrupt-controller : this device is a interrupt controller. > + > +Example: > + > + qeic: interrupt-controller@80 { > + interrupt-controller; > + compatible = "fsl,qe-ic"; > + #address-cells = <0>; > + #interrupt-cells = <1>; > + reg = <0x80 0x80>; > + interrupts = <95 2 0 0 94 2 0 0>; //high:79 low:78 > + }; Why is the information about which interrupt is "high" and which is "low" in a comment in the example, rather than in the definition of the interrupts property? > +* Serial Interface Block (SI) > + > +The SI manages the routing of eight TDM lines to the QE block serial drivers > +, the MCC and the UCCs, for receive and transmit. > + > +Required properties: > +- compatible : should be "fsl,t1040-qe-si". > +- reg : Address range of SI register set. Is t1040 the only chip that has or will ever have this? -Scott
Re: [PATCH v2 1/7] powerpc/8xx: Fix vaddr for IMMR early remap
On Fri, 2016-05-13 at 11:25 +0200, Christophe Leroy wrote: > Le 11/05/2016 à 22:38, Scott Wood a écrit : > > On Wed, 2016-05-11 at 17:03 +0200, Christophe Leroy wrote: > > > Memory: 124428K/131072K available (3748K kernel code, 188K rwdata, > > > 648K rodata, 508K init, 290K bss, 6644K reserved) > > > Kernel virtual memory layout: > > >* 0xfffdf000..0xf000 : fixmap > > >* 0xfde0..0xfe00 : consistent mem > > >* 0xfddf6000..0xfde0 : early ioremap > > >* 0xc900..0xfddf6000 : vmalloc & ioremap > > > SLUB: HWalign=16, Order=0-3, MinObjects=0, CPUs=1, Nodes=1 > > > > > > Today, IMMR is mapped 1:1 at startup > > > > > > Mapping IMMR 1:1 is just wrong because it may overlap with another > > > area. On most mpc8xx boards it is OK as IMMR is set to 0xff00 > > > but for instance on EP88xC board, IMMR is at 0xfa20 which > > > overlaps with VM ioremap area > > > > > > This patch fixes the virtual address for remapping IMMR with the fixmap > > > regardless of the value of IMMR. > > > > > > The size of IMMR area is 256kbytes (CPM at offset 0, security engine > > > at offset 128k) so a 512k page is enough > > > > > > Signed-off-by: Christophe Leroy> > > --- > > > v2: No change > > > > > > arch/powerpc/include/asm/fixmap.h | 7 +++ > > > arch/powerpc/kernel/asm-offsets.c | 8 > > > arch/powerpc/kernel/head_8xx.S| 11 ++- > > > arch/powerpc/mm/mmu_decl.h| 7 +++ > > > arch/powerpc/sysdev/cpm_common.c | 15 --- > > > 5 files changed, 40 insertions(+), 8 deletions(-) > > > > > > diff --git a/arch/powerpc/include/asm/fixmap.h > > > b/arch/powerpc/include/asm/fixmap.h > > > index 90f604b..4508b32 100644 > > > --- a/arch/powerpc/include/asm/fixmap.h > > > +++ b/arch/powerpc/include/asm/fixmap.h > > > @@ -51,6 +51,13 @@ enum fixed_addresses { > > > FIX_KMAP_BEGIN, /* reserved pte's for temporary kernel > > > mappings */ > > > FIX_KMAP_END = FIX_KMAP_BEGIN+(KM_TYPE_NR*NR_CPUS)-1, > > > #endif > > > +#ifdef CONFIG_PPC_8xx > > > + /* For IMMR we need an aligned 512K area */ > > > +#define FIX_IMMR_SIZE(512 * 1024 / PAGE_SIZE) > > > + FIX_IMMR_START, > > > + FIX_IMMR_BASE = __ALIGN_MASK(FIX_IMMR_START, FIX_IMMR_SIZE - 1) > > > - 1 + > > > + > > > +FIX_IMMR_SIZE, > > > +#endif > > What happens if FIX_IMMR_START is, for example, 0x100? Then > > "__ALIGN_MASK(FIX_IMMR_START, FIX_IMMR_SIZE - 1) - 1" would be 0xff -- > > you've > > gone backwards. FIX_IMMR_BASE would be 0x17f, translating to an address > > of > > 0xffe8. IMMR would end at 0xfff0. The kmap range would begin at > > 0xffeff000 and you'd have an overlap. > > > > I think what you want is: > > FIX_IMMR_BASE = (FIX_IMMR_START & ~(FIX_IMMR_SIZE - 1)) + > > FIX_IMMR_SIZE - 1, > > > Why would the kmap range begin at 0xffeff000 ? > If FIX_IMMR_START is 0x100, this means FIX_KMAP_END is 0x0ff, so the > kmap range should begin at 0xfff0, shouldn't it ? Yeah, I'm not sure what I was thinking in that reply in general. -Scott
Re: [PATCH v2 1/7] powerpc/8xx: Fix vaddr for IMMR early remap
On Fri, 2016-05-13 at 11:25 +0200, Christophe Leroy wrote: > Le 11/05/2016 à 22:38, Scott Wood a écrit : > > On Wed, 2016-05-11 at 17:03 +0200, Christophe Leroy wrote: > > > Memory: 124428K/131072K available (3748K kernel code, 188K rwdata, > > > 648K rodata, 508K init, 290K bss, 6644K reserved) > > > Kernel virtual memory layout: > > >* 0xfffdf000..0xf000 : fixmap > > >* 0xfde0..0xfe00 : consistent mem > > >* 0xfddf6000..0xfde0 : early ioremap > > >* 0xc900..0xfddf6000 : vmalloc & ioremap > > > SLUB: HWalign=16, Order=0-3, MinObjects=0, CPUs=1, Nodes=1 > > > > > > Today, IMMR is mapped 1:1 at startup > > > > > > Mapping IMMR 1:1 is just wrong because it may overlap with another > > > area. On most mpc8xx boards it is OK as IMMR is set to 0xff00 > > > but for instance on EP88xC board, IMMR is at 0xfa20 which > > > overlaps with VM ioremap area > > > > > > This patch fixes the virtual address for remapping IMMR with the fixmap > > > regardless of the value of IMMR. > > > > > > The size of IMMR area is 256kbytes (CPM at offset 0, security engine > > > at offset 128k) so a 512k page is enough > > > > > > Signed-off-by: Christophe Leroy > > > --- > > > v2: No change > > > > > > arch/powerpc/include/asm/fixmap.h | 7 +++ > > > arch/powerpc/kernel/asm-offsets.c | 8 > > > arch/powerpc/kernel/head_8xx.S| 11 ++- > > > arch/powerpc/mm/mmu_decl.h| 7 +++ > > > arch/powerpc/sysdev/cpm_common.c | 15 --- > > > 5 files changed, 40 insertions(+), 8 deletions(-) > > > > > > diff --git a/arch/powerpc/include/asm/fixmap.h > > > b/arch/powerpc/include/asm/fixmap.h > > > index 90f604b..4508b32 100644 > > > --- a/arch/powerpc/include/asm/fixmap.h > > > +++ b/arch/powerpc/include/asm/fixmap.h > > > @@ -51,6 +51,13 @@ enum fixed_addresses { > > > FIX_KMAP_BEGIN, /* reserved pte's for temporary kernel > > > mappings */ > > > FIX_KMAP_END = FIX_KMAP_BEGIN+(KM_TYPE_NR*NR_CPUS)-1, > > > #endif > > > +#ifdef CONFIG_PPC_8xx > > > + /* For IMMR we need an aligned 512K area */ > > > +#define FIX_IMMR_SIZE(512 * 1024 / PAGE_SIZE) > > > + FIX_IMMR_START, > > > + FIX_IMMR_BASE = __ALIGN_MASK(FIX_IMMR_START, FIX_IMMR_SIZE - 1) > > > - 1 + > > > + > > > +FIX_IMMR_SIZE, > > > +#endif > > What happens if FIX_IMMR_START is, for example, 0x100? Then > > "__ALIGN_MASK(FIX_IMMR_START, FIX_IMMR_SIZE - 1) - 1" would be 0xff -- > > you've > > gone backwards. FIX_IMMR_BASE would be 0x17f, translating to an address > > of > > 0xffe8. IMMR would end at 0xfff0. The kmap range would begin at > > 0xffeff000 and you'd have an overlap. > > > > I think what you want is: > > FIX_IMMR_BASE = (FIX_IMMR_START & ~(FIX_IMMR_SIZE - 1)) + > > FIX_IMMR_SIZE - 1, > > > Why would the kmap range begin at 0xffeff000 ? > If FIX_IMMR_START is 0x100, this means FIX_KMAP_END is 0x0ff, so the > kmap range should begin at 0xfff0, shouldn't it ? Yeah, I'm not sure what I was thinking in that reply in general. -Scott
Re: [PATCH v7 3/3] SMAF: add fake secure module
Hi Benjamin, On 9 May 2016 at 16:07, Benjamin Gaignardwrote: > This module is allow testing secure calls of SMAF. > "Add fake secure module" does sound like something not (m)any people want to hear ;-) Have you considered calling it 'dummy', 'test' or similar ? > --- /dev/null > +++ b/drivers/smaf/smaf-fakesecure.c > @@ -0,0 +1,85 @@ > +/* > + * smaf-fakesecure.c > + * > + * Copyright (C) Linaro SA 2015 > + * Author: Benjamin Gaignard for Linaro. > + * License terms: GNU General Public License (GPL), version 2 > + */ > +#include > +#include > +#include > + > +#define MAGIC 0xDEADBEEF > + > +struct fake_private { > + int magic; > +}; > + > +static void *smaf_fakesecure_create(void) > +{ > + struct fake_private *priv; > + > + priv = kzalloc(sizeof(*priv), GFP_KERNEL); Missing ENOMEM handling ? > + priv->magic = MAGIC; > + > + return priv; > +} > + > +static int smaf_fakesecure_destroy(void *ctx) > +{ > + struct fake_private *priv = (struct fake_private *)ctx; You might want to flesh this cast into a (inline) helper and use it throughout ? ... and that is all. Hope these were useful, or at the very least not utterly wrong, suggestions :-) Regards, Emil P.S. From a quick look userspace has some subtle bugs/odd practises. Let me know if you're interested in my input.
Re: [PATCH v7 3/3] SMAF: add fake secure module
Hi Benjamin, On 9 May 2016 at 16:07, Benjamin Gaignard wrote: > This module is allow testing secure calls of SMAF. > "Add fake secure module" does sound like something not (m)any people want to hear ;-) Have you considered calling it 'dummy', 'test' or similar ? > --- /dev/null > +++ b/drivers/smaf/smaf-fakesecure.c > @@ -0,0 +1,85 @@ > +/* > + * smaf-fakesecure.c > + * > + * Copyright (C) Linaro SA 2015 > + * Author: Benjamin Gaignard for Linaro. > + * License terms: GNU General Public License (GPL), version 2 > + */ > +#include > +#include > +#include > + > +#define MAGIC 0xDEADBEEF > + > +struct fake_private { > + int magic; > +}; > + > +static void *smaf_fakesecure_create(void) > +{ > + struct fake_private *priv; > + > + priv = kzalloc(sizeof(*priv), GFP_KERNEL); Missing ENOMEM handling ? > + priv->magic = MAGIC; > + > + return priv; > +} > + > +static int smaf_fakesecure_destroy(void *ctx) > +{ > + struct fake_private *priv = (struct fake_private *)ctx; You might want to flesh this cast into a (inline) helper and use it throughout ? ... and that is all. Hope these were useful, or at the very least not utterly wrong, suggestions :-) Regards, Emil P.S. From a quick look userspace has some subtle bugs/odd practises. Let me know if you're interested in my input.
Re: [PATCH v7 2/3] SMAF: add CMA allocator
Hi Benjamin, On 9 May 2016 at 16:07, Benjamin Gaignardwrote: > SMAF CMA allocator implement helpers functions to allow SMAF > to allocate contiguous memory. > > match() each if at least one of the attached devices have coherent_dma_mask > set to DMA_BIT_MASK(32). > What is the idea behind the hardcoded 32. Wouldn't it be better to avoid that ? > +static void smaf_cma_release(struct dma_buf *dmabuf) > +{ > + struct smaf_cma_buffer_info *info = dmabuf->priv; > + DEFINE_DMA_ATTRS(attrs); > + > + dma_set_attr(DMA_ATTR_WRITE_COMBINE, ); > + Imho it's worth storing the dma_attrs within smaf_cma_buffer_info. This way it's less likely for things to go wrong, if one forgets to update one of the three in the future. > +static void smaf_cma_unmap(struct dma_buf_attachment *attachment, > + struct sg_table *sgt, > + enum dma_data_direction direction) > +{ > + /* do nothing */ There could/should really be a comment explaining why we "do nothing" here, right ? > +} > + > +static int smaf_cma_mmap(struct dma_buf *dmabuf, struct vm_area_struct *vma) > +{ > + struct smaf_cma_buffer_info *info = dmabuf->priv; > + int ret; > + DEFINE_DMA_ATTRS(attrs); > + > + dma_set_attr(DMA_ATTR_WRITE_COMBINE, ); > + > + if (info->size < vma->vm_end - vma->vm_start) > + return -EINVAL; > + > + vma->vm_flags |= VM_IO | VM_PFNMAP | VM_DONTEXPAND | VM_DONTDUMP; > + ret = dma_mmap_attrs(info->dev, vma, info->vaddr, info->paddr, > +info->size, ); > + > + return ret; Kill the temporary variable 'ret' ? > +static struct dma_buf_ops smaf_cma_ops = { const ? Afaict the compiler would/should warn you about discarding it as the ops are defined const. > +static struct dma_buf *smaf_cma_allocate(struct dma_buf *dmabuf, > +size_t length, unsigned int flags) > +{ > + struct dma_buf_attachment *attach_obj; > + struct smaf_cma_buffer_info *info; > + struct dma_buf *cma_dmabuf; > + int ret; > + > + DEFINE_DMA_BUF_EXPORT_INFO(export); > + DEFINE_DMA_ATTRS(attrs); > + > + dma_set_attr(DMA_ATTR_WRITE_COMBINE, ); > + > + info = kzalloc(sizeof(*info), GFP_KERNEL); > + if (!info) > + return NULL; > + > + info->dev = find_matching_device(dmabuf); find_matching_device() can return NULL. We should handle that imho. > + info->size = length; > + info->vaddr = dma_alloc_attrs(info->dev, info->size, >paddr, > + GFP_KERNEL | __GFP_NOWARN, ); > + if (!info->vaddr) { > + ret = -ENOMEM; set-but-unused-variable 'ret' ? > + goto error; > + } > + > + export.ops = _cma_ops; > + export.size = info->size; > + export.flags = flags; > + export.priv = info; > + > + cma_dmabuf = dma_buf_export(); > + if (IS_ERR(cma_dmabuf)) Missing dma_free_attrs() ? I'd add another label in the error path and handle it there. > + goto error; > + > + list_for_each_entry(attach_obj, >attachments, node) { > + dma_buf_attach(cma_dmabuf, attach_obj->dev); Imho one should error out if attach fails. Or warn at the very least ? > +static int __init smaf_cma_init(void) > +{ > + INIT_LIST_HEAD(_cma.list_node); Isn't this something that smaf_register_allocator() should be doing ? Regards, Emil
Re: [PATCH v7 2/3] SMAF: add CMA allocator
Hi Benjamin, On 9 May 2016 at 16:07, Benjamin Gaignard wrote: > SMAF CMA allocator implement helpers functions to allow SMAF > to allocate contiguous memory. > > match() each if at least one of the attached devices have coherent_dma_mask > set to DMA_BIT_MASK(32). > What is the idea behind the hardcoded 32. Wouldn't it be better to avoid that ? > +static void smaf_cma_release(struct dma_buf *dmabuf) > +{ > + struct smaf_cma_buffer_info *info = dmabuf->priv; > + DEFINE_DMA_ATTRS(attrs); > + > + dma_set_attr(DMA_ATTR_WRITE_COMBINE, ); > + Imho it's worth storing the dma_attrs within smaf_cma_buffer_info. This way it's less likely for things to go wrong, if one forgets to update one of the three in the future. > +static void smaf_cma_unmap(struct dma_buf_attachment *attachment, > + struct sg_table *sgt, > + enum dma_data_direction direction) > +{ > + /* do nothing */ There could/should really be a comment explaining why we "do nothing" here, right ? > +} > + > +static int smaf_cma_mmap(struct dma_buf *dmabuf, struct vm_area_struct *vma) > +{ > + struct smaf_cma_buffer_info *info = dmabuf->priv; > + int ret; > + DEFINE_DMA_ATTRS(attrs); > + > + dma_set_attr(DMA_ATTR_WRITE_COMBINE, ); > + > + if (info->size < vma->vm_end - vma->vm_start) > + return -EINVAL; > + > + vma->vm_flags |= VM_IO | VM_PFNMAP | VM_DONTEXPAND | VM_DONTDUMP; > + ret = dma_mmap_attrs(info->dev, vma, info->vaddr, info->paddr, > +info->size, ); > + > + return ret; Kill the temporary variable 'ret' ? > +static struct dma_buf_ops smaf_cma_ops = { const ? Afaict the compiler would/should warn you about discarding it as the ops are defined const. > +static struct dma_buf *smaf_cma_allocate(struct dma_buf *dmabuf, > +size_t length, unsigned int flags) > +{ > + struct dma_buf_attachment *attach_obj; > + struct smaf_cma_buffer_info *info; > + struct dma_buf *cma_dmabuf; > + int ret; > + > + DEFINE_DMA_BUF_EXPORT_INFO(export); > + DEFINE_DMA_ATTRS(attrs); > + > + dma_set_attr(DMA_ATTR_WRITE_COMBINE, ); > + > + info = kzalloc(sizeof(*info), GFP_KERNEL); > + if (!info) > + return NULL; > + > + info->dev = find_matching_device(dmabuf); find_matching_device() can return NULL. We should handle that imho. > + info->size = length; > + info->vaddr = dma_alloc_attrs(info->dev, info->size, >paddr, > + GFP_KERNEL | __GFP_NOWARN, ); > + if (!info->vaddr) { > + ret = -ENOMEM; set-but-unused-variable 'ret' ? > + goto error; > + } > + > + export.ops = _cma_ops; > + export.size = info->size; > + export.flags = flags; > + export.priv = info; > + > + cma_dmabuf = dma_buf_export(); > + if (IS_ERR(cma_dmabuf)) Missing dma_free_attrs() ? I'd add another label in the error path and handle it there. > + goto error; > + > + list_for_each_entry(attach_obj, >attachments, node) { > + dma_buf_attach(cma_dmabuf, attach_obj->dev); Imho one should error out if attach fails. Or warn at the very least ? > +static int __init smaf_cma_init(void) > +{ > + INIT_LIST_HEAD(_cma.list_node); Isn't this something that smaf_register_allocator() should be doing ? Regards, Emil
Re: [PATCH] crypto/sha1-mb: make sha1_x8_avx2() conform to C function ABI
On Mon, 2016-05-16 at 16:46 -0500, Josh Poimboeuf wrote: > On Mon, May 16, 2016 at 02:39:06PM -0700, Megha Dey wrote: > > On Mon, 2016-05-16 at 15:16 -0500, Josh Poimboeuf wrote: > > > On Mon, May 16, 2016 at 11:31:12AM -0700, Megha Dey wrote: > > > > On Mon, 2016-05-16 at 09:44 -0500, Josh Poimboeuf wrote: > > > > > On Fri, May 13, 2016 at 10:32:26AM -0700, Megha Dey wrote: > > > > > > On Fri, 2016-05-13 at 07:51 +0200, Ingo Molnar wrote: > > > > > > > * Herbert Xuwrote: > > > > > > > > > > > > > > > On Thu, May 12, 2016 at 04:31:06PM -0700, Megha Dey wrote: > > > > > > > > > Hi, > > > > > > > > > > > > > > > > > > When booting latest kernel with the CONFIG_CRYPTO_SHA1_MB > > > > > > > > > enabled, I > > > > > > > > > observe a panic. > > > > > > > > > > > > > > > > > > After having a quick look, on reverting the following > > > > > > > > > patches, I am able > > > > > > > > > to complete the booting process. > > > > > > > > > aec4d0e301f17bb143341c82cc44685b8af0b945 > > > > > > > > > 8691ccd764f9ecc69a6812dfe76214c86ac9ba06 > > > > > > > > > 68874ac3304ade7ed5ebb12af00d6b9bbbca0a16 > > > > > > > > > > > > > > > > > > Of the 3 patches, aec4d0e301f17bb143341c82cc44685b8af0b945 > > > > > > > > > seems wrong. > > > > > > > > > The r10 to r15 registers are used in sha1_x8_avx2.S, which is > > > > > > > > > called > > > > > > > > > from sha1_mb_mgr_flush_avx2.S. > > > > > > > > > > > > > > > > > > I do not think the functionality of the SHA1-MB crypto > > > > > > > > > algorithm has > > > > > > > > > been tested after applying these changes. (I am not sure if > > > > > > > > > any of the > > > > > > > > > other crypto algorithms have been affected by these changes). > > > > > > > > > > > > > > > > Josh, Ingo: > > > > > > > > > > > > > > > > Any ideas on this? Should we revert? > > > > > > > > > > > > > > Yeah, I think so - although another option would be to > > > > > > > standardize sha1_x8_avx2() > > > > > > > - the problem is that it is a function that clobbers registers > > > > > > > without > > > > > > > saving/restoring them, breaking the C function ABI. I realize > > > > > > > it's written in > > > > > > > assembly, but unless there are strong performance reasons to > > > > > > > deviate from the > > > > > > > regular calling convention it might make sense to fix that. > > > > > > > > > > > > > > Do any warnings get generated after the revert, if you enable > > > > > > > CONFIG_STACK_VALIDATION=y? > > > > > > > > > > > > After the revert and enabling CONFIG_STACK_VALIDATION: > > > > > > arch/x86/crypto/sha1-mb/sha1_mb_mgr_flush_avx2.o: warning: objtool: > > > > > > sha1_mb_mgr_flush_avx2()+0x20d: call without frame pointer > > > > > > save/setup > > > > > > > > > > > > arch/x86/crypto/sha1-mb/sha1_mb_mgr_submit_avx2.o: warning: objtool: > > > > > > sha1_mb_mgr_submit_avx2()+0x115: call without frame pointer > > > > > > save/setup > > > > > > > > > > Megha, > > > > > > > > > > Sorry for breaking it. I completely missed the fact that the function > > > > > calls sha1_x8_avx2() which clobbers registers. > > > > > > > > > > If the performance penalty isn't too bad, I'll submit a patch to > > > > > standardize sha1_x8_avx2() to follow the C ABI. > > > > > > > > > > Do you have any tips for testing this code? I've tried using the > > > > > tcrypt > > > > > module, but no luck. > > > > > > > > > Josh, > > > > Build the kernel with the following configs: > > > > CONFIG_CRYPTO_SHA1_MB=y > > > > CONFIG_CRYPTO_TEST=m > > > > CONFIG_CRYPTO_MANAGER_DISABLE_TESTS=n > > > > There was a kernel panic while booting. > > > > So if after applying your new patch, we are able to get complete the > > > > boot, then we are good. > > > > > > > > Could you please send a copy of the patch, I could test it on my end > > > > too. > > > > > > Thanks. I was able to run the tests, though I didn't see a panic. Can > > > you test with this patch? > > > > > > > > > > > > From: Josh Poimboeuf > > > Subject: [PATCH] crypto/sha1-mb: make sha1_x8_avx2() conform to C > > > function ABI > > > > > > Megha Day reported a kernel panic in crypto code. The problem is that > > > sha1_x8_avx2() clobbers registers r12-r15 without saving and restoring > > > them. > > > > > > Before commit aec4d0e301f1 ("x86/asm/crypto: Simplify stack usage in > > > sha-mb functions"), those registers were saved and restored by the > > > callers of the function. I removed them with that commit because I > > > didn't realize sha1_x8_avx2() clobbered them. > > > > > > Fix the potential undefined behavior associated with clobbering the > > > registers and make the behavior less surprising by changing the > > > registers to be callee saved/restored to conform with the C function > > > call ABI. > > > > > > Also, rdx (aka RSP_SAVE) doesn't need to be saved: I verified that none > > > of the callers rely on it being saved, and it's not a callee-saved > > > register
Re: [PATCH] crypto/sha1-mb: make sha1_x8_avx2() conform to C function ABI
On Mon, 2016-05-16 at 16:46 -0500, Josh Poimboeuf wrote: > On Mon, May 16, 2016 at 02:39:06PM -0700, Megha Dey wrote: > > On Mon, 2016-05-16 at 15:16 -0500, Josh Poimboeuf wrote: > > > On Mon, May 16, 2016 at 11:31:12AM -0700, Megha Dey wrote: > > > > On Mon, 2016-05-16 at 09:44 -0500, Josh Poimboeuf wrote: > > > > > On Fri, May 13, 2016 at 10:32:26AM -0700, Megha Dey wrote: > > > > > > On Fri, 2016-05-13 at 07:51 +0200, Ingo Molnar wrote: > > > > > > > * Herbert Xu wrote: > > > > > > > > > > > > > > > On Thu, May 12, 2016 at 04:31:06PM -0700, Megha Dey wrote: > > > > > > > > > Hi, > > > > > > > > > > > > > > > > > > When booting latest kernel with the CONFIG_CRYPTO_SHA1_MB > > > > > > > > > enabled, I > > > > > > > > > observe a panic. > > > > > > > > > > > > > > > > > > After having a quick look, on reverting the following > > > > > > > > > patches, I am able > > > > > > > > > to complete the booting process. > > > > > > > > > aec4d0e301f17bb143341c82cc44685b8af0b945 > > > > > > > > > 8691ccd764f9ecc69a6812dfe76214c86ac9ba06 > > > > > > > > > 68874ac3304ade7ed5ebb12af00d6b9bbbca0a16 > > > > > > > > > > > > > > > > > > Of the 3 patches, aec4d0e301f17bb143341c82cc44685b8af0b945 > > > > > > > > > seems wrong. > > > > > > > > > The r10 to r15 registers are used in sha1_x8_avx2.S, which is > > > > > > > > > called > > > > > > > > > from sha1_mb_mgr_flush_avx2.S. > > > > > > > > > > > > > > > > > > I do not think the functionality of the SHA1-MB crypto > > > > > > > > > algorithm has > > > > > > > > > been tested after applying these changes. (I am not sure if > > > > > > > > > any of the > > > > > > > > > other crypto algorithms have been affected by these changes). > > > > > > > > > > > > > > > > Josh, Ingo: > > > > > > > > > > > > > > > > Any ideas on this? Should we revert? > > > > > > > > > > > > > > Yeah, I think so - although another option would be to > > > > > > > standardize sha1_x8_avx2() > > > > > > > - the problem is that it is a function that clobbers registers > > > > > > > without > > > > > > > saving/restoring them, breaking the C function ABI. I realize > > > > > > > it's written in > > > > > > > assembly, but unless there are strong performance reasons to > > > > > > > deviate from the > > > > > > > regular calling convention it might make sense to fix that. > > > > > > > > > > > > > > Do any warnings get generated after the revert, if you enable > > > > > > > CONFIG_STACK_VALIDATION=y? > > > > > > > > > > > > After the revert and enabling CONFIG_STACK_VALIDATION: > > > > > > arch/x86/crypto/sha1-mb/sha1_mb_mgr_flush_avx2.o: warning: objtool: > > > > > > sha1_mb_mgr_flush_avx2()+0x20d: call without frame pointer > > > > > > save/setup > > > > > > > > > > > > arch/x86/crypto/sha1-mb/sha1_mb_mgr_submit_avx2.o: warning: objtool: > > > > > > sha1_mb_mgr_submit_avx2()+0x115: call without frame pointer > > > > > > save/setup > > > > > > > > > > Megha, > > > > > > > > > > Sorry for breaking it. I completely missed the fact that the function > > > > > calls sha1_x8_avx2() which clobbers registers. > > > > > > > > > > If the performance penalty isn't too bad, I'll submit a patch to > > > > > standardize sha1_x8_avx2() to follow the C ABI. > > > > > > > > > > Do you have any tips for testing this code? I've tried using the > > > > > tcrypt > > > > > module, but no luck. > > > > > > > > > Josh, > > > > Build the kernel with the following configs: > > > > CONFIG_CRYPTO_SHA1_MB=y > > > > CONFIG_CRYPTO_TEST=m > > > > CONFIG_CRYPTO_MANAGER_DISABLE_TESTS=n > > > > There was a kernel panic while booting. > > > > So if after applying your new patch, we are able to get complete the > > > > boot, then we are good. > > > > > > > > Could you please send a copy of the patch, I could test it on my end > > > > too. > > > > > > Thanks. I was able to run the tests, though I didn't see a panic. Can > > > you test with this patch? > > > > > > > > > > > > From: Josh Poimboeuf > > > Subject: [PATCH] crypto/sha1-mb: make sha1_x8_avx2() conform to C > > > function ABI > > > > > > Megha Day reported a kernel panic in crypto code. The problem is that > > > sha1_x8_avx2() clobbers registers r12-r15 without saving and restoring > > > them. > > > > > > Before commit aec4d0e301f1 ("x86/asm/crypto: Simplify stack usage in > > > sha-mb functions"), those registers were saved and restored by the > > > callers of the function. I removed them with that commit because I > > > didn't realize sha1_x8_avx2() clobbered them. > > > > > > Fix the potential undefined behavior associated with clobbering the > > > registers and make the behavior less surprising by changing the > > > registers to be callee saved/restored to conform with the C function > > > call ABI. > > > > > > Also, rdx (aka RSP_SAVE) doesn't need to be saved: I verified that none > > > of the callers rely on it being saved, and it's not a callee-saved > > > register in the C ABI. > > > > > > Fixes: aec4d0e301f1
Re: [PATCH 1/2] Create UV efi_call macros
On Thu, May 12, 2016 at 10:17:39AM +0200, Ingo Molnar wrote: > > Fine by me, although having a newline after arch_efi_call_virt_setup() > > but not before arch_efi_call_virt_teardown() seems rather arbitrary > > It's an oversight! :-) > > #define efi_call_virt(f, args...) \ > ({\ > efi_status_t __s; \ > unsigned long flags;\ > \ > arch_efi_call_virt_setup(); \ > \ > local_save_flags(flags);\ > __s = arch_efi_call_virt(f, args); \ > efi_call_virt_check_flags(flags, __stringify(f)); \ > \ > arch_efi_call_virt_teardown(); \ > \ > __s;\ > }) > > But if it's too segmented this is fine too: > > #define efi_call_virt(f, args...) \ > ({\ > efi_status_t __s; \ > unsigned long flags;\ > \ > arch_efi_call_virt_setup(); \ > local_save_flags(flags);\ > __s = arch_efi_call_virt(f, args); \ > efi_call_virt_check_flags(flags, __stringify(f)); \ > arch_efi_call_virt_teardown(); \ > \ > __s;\ > }) This makes sense to me. I'll make sure to include something like this in my next version of the patch. Thanks, guys! - Alex
Re: [PATCH 1/2] Create UV efi_call macros
On Thu, May 12, 2016 at 10:17:39AM +0200, Ingo Molnar wrote: > > Fine by me, although having a newline after arch_efi_call_virt_setup() > > but not before arch_efi_call_virt_teardown() seems rather arbitrary > > It's an oversight! :-) > > #define efi_call_virt(f, args...) \ > ({\ > efi_status_t __s; \ > unsigned long flags;\ > \ > arch_efi_call_virt_setup(); \ > \ > local_save_flags(flags);\ > __s = arch_efi_call_virt(f, args); \ > efi_call_virt_check_flags(flags, __stringify(f)); \ > \ > arch_efi_call_virt_teardown(); \ > \ > __s;\ > }) > > But if it's too segmented this is fine too: > > #define efi_call_virt(f, args...) \ > ({\ > efi_status_t __s; \ > unsigned long flags;\ > \ > arch_efi_call_virt_setup(); \ > local_save_flags(flags);\ > __s = arch_efi_call_virt(f, args); \ > efi_call_virt_check_flags(flags, __stringify(f)); \ > arch_efi_call_virt_teardown(); \ > \ > __s;\ > }) This makes sense to me. I'll make sure to include something like this in my next version of the patch. Thanks, guys! - Alex
[PATCH 0/2] DEC frame buffer modular driver cleanups
Hi, This is a pair of small cleanups for `__init' annotation issues in PMAG-B and PMAGB-B frame buffer drivers discovered in a modular build, which is a seldom used, but supported configuration. Please apply; I think these are obvious enough to be fast-tracked to 4.7 right away. Maciej
[PATCH 0/2] DEC frame buffer modular driver cleanups
Hi, This is a pair of small cleanups for `__init' annotation issues in PMAG-B and PMAGB-B frame buffer drivers discovered in a modular build, which is a seldom used, but supported configuration. Please apply; I think these are obvious enough to be fast-tracked to 4.7 right away. Maciej
[PATCH 2/2] video: fbdev: pmagb-b-fb: Remove bad `__init' annotation
Fix: WARNING: drivers/video/fbdev/pmagb-b-fb.o(.text+0x6dc): Section mismatch in reference from the function pmagbbfb_probe() to the function .init.text:pmagbbfb_erase_cursor() The function pmagbbfb_probe() references the function __init pmagbbfb_erase_cursor(). This is often because pmagbbfb_probe lacks a __init annotation or the annotation of pmagbbfb_erase_cursor is wrong. -- a fallout from a missed update from commit 5b1638d94080 ("VIDEO: PMAGB-B: Fix section mismatch") and then commit 48c68c4f1b54 ("Drivers: video: remove __dev* attributes.") Signed-off-by: Maciej W. Rozycki--- linux-pmagb-b-init.diff Index: linux-20160515-declance-module/drivers/video/fbdev/pmagb-b-fb.c === --- linux-20160515-declance-module.orig/drivers/video/fbdev/pmagb-b-fb.c +++ linux-20160515-declance-module/drivers/video/fbdev/pmagb-b-fb.c @@ -133,7 +133,7 @@ static struct fb_ops pmagbbfb_ops = { /* * Turn the hardware cursor off. */ -static void __init pmagbbfb_erase_cursor(struct fb_info *info) +static void pmagbbfb_erase_cursor(struct fb_info *info) { struct pmagbbfb_par *par = info->par;
[PATCH 2/2] video: fbdev: pmagb-b-fb: Remove bad `__init' annotation
Fix: WARNING: drivers/video/fbdev/pmagb-b-fb.o(.text+0x6dc): Section mismatch in reference from the function pmagbbfb_probe() to the function .init.text:pmagbbfb_erase_cursor() The function pmagbbfb_probe() references the function __init pmagbbfb_erase_cursor(). This is often because pmagbbfb_probe lacks a __init annotation or the annotation of pmagbbfb_erase_cursor is wrong. -- a fallout from a missed update from commit 5b1638d94080 ("VIDEO: PMAGB-B: Fix section mismatch") and then commit 48c68c4f1b54 ("Drivers: video: remove __dev* attributes.") Signed-off-by: Maciej W. Rozycki --- linux-pmagb-b-init.diff Index: linux-20160515-declance-module/drivers/video/fbdev/pmagb-b-fb.c === --- linux-20160515-declance-module.orig/drivers/video/fbdev/pmagb-b-fb.c +++ linux-20160515-declance-module/drivers/video/fbdev/pmagb-b-fb.c @@ -133,7 +133,7 @@ static struct fb_ops pmagbbfb_ops = { /* * Turn the hardware cursor off. */ -static void __init pmagbbfb_erase_cursor(struct fb_info *info) +static void pmagbbfb_erase_cursor(struct fb_info *info) { struct pmagbbfb_par *par = info->par;
[PATCH 1/2] video: fbdev: pmag-ba-fb: Remove bad `__init' annotation
Fix: WARNING: drivers/video/fbdev/pmag-ba-fb.o(.text+0x308): Section mismatch in reference from the function pmagbafb_probe() to the function .init.text:pmagbafb_erase_cursor() The function pmagbafb_probe() references the function __init pmagbafb_erase_cursor(). This is often because pmagbafb_probe lacks a __init annotation or the annotation of pmagbafb_erase_cursor is wrong. -- a fallout from a missed update from commit 9625b51350cc ("VIDEO: PMAG-BA: Fix section mismatch") and then commit 48c68c4f1b54 ("Drivers: video: remove __dev* attributes.") Signed-off-by: Maciej W. Rozycki--- linux-pmag-ba-init.diff Index: linux-20160515-declance-module/drivers/video/fbdev/pmag-ba-fb.c === --- linux-20160515-declance-module.orig/drivers/video/fbdev/pmag-ba-fb.c +++ linux-20160515-declance-module/drivers/video/fbdev/pmag-ba-fb.c @@ -129,7 +129,7 @@ static struct fb_ops pmagbafb_ops = { /* * Turn the hardware cursor off. */ -static void __init pmagbafb_erase_cursor(struct fb_info *info) +static void pmagbafb_erase_cursor(struct fb_info *info) { struct pmagbafb_par *par = info->par;
[PATCH 1/2] video: fbdev: pmag-ba-fb: Remove bad `__init' annotation
Fix: WARNING: drivers/video/fbdev/pmag-ba-fb.o(.text+0x308): Section mismatch in reference from the function pmagbafb_probe() to the function .init.text:pmagbafb_erase_cursor() The function pmagbafb_probe() references the function __init pmagbafb_erase_cursor(). This is often because pmagbafb_probe lacks a __init annotation or the annotation of pmagbafb_erase_cursor is wrong. -- a fallout from a missed update from commit 9625b51350cc ("VIDEO: PMAG-BA: Fix section mismatch") and then commit 48c68c4f1b54 ("Drivers: video: remove __dev* attributes.") Signed-off-by: Maciej W. Rozycki --- linux-pmag-ba-init.diff Index: linux-20160515-declance-module/drivers/video/fbdev/pmag-ba-fb.c === --- linux-20160515-declance-module.orig/drivers/video/fbdev/pmag-ba-fb.c +++ linux-20160515-declance-module/drivers/video/fbdev/pmag-ba-fb.c @@ -129,7 +129,7 @@ static struct fb_ops pmagbafb_ops = { /* * Turn the hardware cursor off. */ -static void __init pmagbafb_erase_cursor(struct fb_info *info) +static void pmagbafb_erase_cursor(struct fb_info *info) { struct pmagbafb_par *par = info->par;
Re: [PATCH v4] platform:x86: Add PMC Driver for Intel Core SOC
On Mon, May 16, 2016 at 03:45:50PM +0530, Rajneesh Bhardwaj wrote: > On Thu, May 12, 2016 at 06:02:45PM +0300, Andy Shevchenko wrote: > > On Thu, May 12, 2016 at 4:50 PM, Rajneesh Bhardwaj > >wrote: > > > > Sorry for a bit late review, but I think there are still issues need > > to be addressed. > > > > Thanks for the detailed review and the feedback. :) > > > > This patch adds the Power Management Controller driver as a pci driver > > > for Intel Core SOC architecture. > > > > SOC -> SoC > > > > Sure, will fix this throughout the code. > > > > > > > This driver can utilize debugging capabilities and supported features > > > as exposed by the Power Management Controller. > > > > > > Please refer to the below specification for more details on PMC features. > > > http://www.intel.in/content/www/in/en/chipsets/100-series-chipset-datasheet-vol-2.html > > > > > > The current version of this driver exposes slp_s0_residency counter. > > > This counter can be used for detecting fragile SLP_S0 signal related > > > failures and take corrective actions when PCH SLP_S0 signal is not > > > asserted after kernel freeze as part of suspend to idle flow > > > (echo freeze > /sys/power/state). > > > > > > Intel Platform Controller Hub (PCH) asserts SLP_S0 signal when it > > > detects favorable conditions to enter its low power mode. As a > > > pre-requisite the SOC should be in deepest possible Package C-State > > > and devices should be in low power mode. For example, on Skylake SOC > > > > Ditto. > > Check entire code for this. > > > > Same as above. > > > > the deepest Package C-State is Package C10 or PC10. Suspend to idle > > > flow generally leads to PC10 state but PC10 state may not be sufficient > > > for realizing the platform wide power potential which SLP_S0 signal > > > assertion can provide. > > > > > > SLP_S0 signal is often connected to the Embedded Controller (EC) and the > > > Power Management IC (PMIC) for other platform power management related > > > optimizations. > > > > > > In general, SLP_S0 assertion == PC10 + PCH low power mode + ModPhy Lanes > > > power gated + PLL Idle. > > > > > > As part of this driver, a mechanism to read the slp_s0 residency is > > > exposed > > > as an api and also debugfs features are added to indicate slp_s0 signal > > > > api -> API > > > > Sure, will take care of this change. > > > > assertion residency in microseconds. > > > > > > echo freeze > /sys/power/state > > > wake the system > > > cat /sys/kernel/debug/pmc_core/slp_s0_residency_usec > > > > > > Signed-off-by: Rajneesh Bhardwaj > > > Signed-off-by: Vishwanath Somayaji > > > > Two people (1). > > > > > > Ok. > > > > +INTEL PMC CORE DRIVER > > > +M: Rajneesh Bhardwaj > > > +M: Vishwanath Somayaji > > > +L: platform-driver-...@vger.kernel.org > > > +S: Maintained > > > +F: drivers/platform/x86/intel_pmc_core.h > > > +F: drivers/platform/x86/intel_pmc_core.c > > > > So, we have arch/x86/platform/atom/pmc_atom.c > > > > This driver looks very similar (by what functional is), so, we have to > > become clear what our layout is. > > Either we go with drivers/platform/x86 or with arch/x86/platform. > > > > IMHO, the functianality provided by this driver is platform specific and > not architecture specific. > > There are few similar drivers present in this location already i.e. > intel_telemetry_core.c, intel_pmc_ipc.c etc. > > We had initially put this driver along with pmc_atom.c but later we thought > that driver/platform/x86 would be a more apt place for it because of the > above mentioned reasons. > > Please advise for the right location for this driver if its not placed in the > expected location. We're experiencing some repeat discussion due to some of the reviews having taken place off list. Sorry Andy, I've been trying to steer this back to list, so you're missing some context to no fault of your own. It's possible some of the existing drivers in arch really shouldn't be there. It's not particularly well defined, however, if it isn't used outside the kernel or by other subsystems within the kernel, it seems that drivers/platform/x86 would be the appropriate location. Thomas, Peter - do you have a criteria for what you prefer to have in arch/x86/platform versus drivers/platform/x86? > > > > --- a/drivers/platform/x86/Kconfig > > > +++ b/drivers/platform/x86/Kconfig > > > @@ -821,6 +821,18 @@ config INTEL_IPS > > > functionality. If in doubt, say Y here; it will only load on > > > supported platforms. > > > > > > +config INTEL_PMC_CORE > > > > Better to locate this in alphabetical order. > > > > Agreed, i tried to put it in alphabetical order but there are quite a few > entries > in Kconfig that are skewed e.g. INTEL_MFLD is placed above INTEL_IPS. > > Placing INTEL_PMC_CORE
Re: [PATCH v4] platform:x86: Add PMC Driver for Intel Core SOC
On Mon, May 16, 2016 at 03:45:50PM +0530, Rajneesh Bhardwaj wrote: > On Thu, May 12, 2016 at 06:02:45PM +0300, Andy Shevchenko wrote: > > On Thu, May 12, 2016 at 4:50 PM, Rajneesh Bhardwaj > > wrote: > > > > Sorry for a bit late review, but I think there are still issues need > > to be addressed. > > > > Thanks for the detailed review and the feedback. :) > > > > This patch adds the Power Management Controller driver as a pci driver > > > for Intel Core SOC architecture. > > > > SOC -> SoC > > > > Sure, will fix this throughout the code. > > > > > > > This driver can utilize debugging capabilities and supported features > > > as exposed by the Power Management Controller. > > > > > > Please refer to the below specification for more details on PMC features. > > > http://www.intel.in/content/www/in/en/chipsets/100-series-chipset-datasheet-vol-2.html > > > > > > The current version of this driver exposes slp_s0_residency counter. > > > This counter can be used for detecting fragile SLP_S0 signal related > > > failures and take corrective actions when PCH SLP_S0 signal is not > > > asserted after kernel freeze as part of suspend to idle flow > > > (echo freeze > /sys/power/state). > > > > > > Intel Platform Controller Hub (PCH) asserts SLP_S0 signal when it > > > detects favorable conditions to enter its low power mode. As a > > > pre-requisite the SOC should be in deepest possible Package C-State > > > and devices should be in low power mode. For example, on Skylake SOC > > > > Ditto. > > Check entire code for this. > > > > Same as above. > > > > the deepest Package C-State is Package C10 or PC10. Suspend to idle > > > flow generally leads to PC10 state but PC10 state may not be sufficient > > > for realizing the platform wide power potential which SLP_S0 signal > > > assertion can provide. > > > > > > SLP_S0 signal is often connected to the Embedded Controller (EC) and the > > > Power Management IC (PMIC) for other platform power management related > > > optimizations. > > > > > > In general, SLP_S0 assertion == PC10 + PCH low power mode + ModPhy Lanes > > > power gated + PLL Idle. > > > > > > As part of this driver, a mechanism to read the slp_s0 residency is > > > exposed > > > as an api and also debugfs features are added to indicate slp_s0 signal > > > > api -> API > > > > Sure, will take care of this change. > > > > assertion residency in microseconds. > > > > > > echo freeze > /sys/power/state > > > wake the system > > > cat /sys/kernel/debug/pmc_core/slp_s0_residency_usec > > > > > > Signed-off-by: Rajneesh Bhardwaj > > > Signed-off-by: Vishwanath Somayaji > > > > Two people (1). > > > > > > Ok. > > > > +INTEL PMC CORE DRIVER > > > +M: Rajneesh Bhardwaj > > > +M: Vishwanath Somayaji > > > +L: platform-driver-...@vger.kernel.org > > > +S: Maintained > > > +F: drivers/platform/x86/intel_pmc_core.h > > > +F: drivers/platform/x86/intel_pmc_core.c > > > > So, we have arch/x86/platform/atom/pmc_atom.c > > > > This driver looks very similar (by what functional is), so, we have to > > become clear what our layout is. > > Either we go with drivers/platform/x86 or with arch/x86/platform. > > > > IMHO, the functianality provided by this driver is platform specific and > not architecture specific. > > There are few similar drivers present in this location already i.e. > intel_telemetry_core.c, intel_pmc_ipc.c etc. > > We had initially put this driver along with pmc_atom.c but later we thought > that driver/platform/x86 would be a more apt place for it because of the > above mentioned reasons. > > Please advise for the right location for this driver if its not placed in the > expected location. We're experiencing some repeat discussion due to some of the reviews having taken place off list. Sorry Andy, I've been trying to steer this back to list, so you're missing some context to no fault of your own. It's possible some of the existing drivers in arch really shouldn't be there. It's not particularly well defined, however, if it isn't used outside the kernel or by other subsystems within the kernel, it seems that drivers/platform/x86 would be the appropriate location. Thomas, Peter - do you have a criteria for what you prefer to have in arch/x86/platform versus drivers/platform/x86? > > > > --- a/drivers/platform/x86/Kconfig > > > +++ b/drivers/platform/x86/Kconfig > > > @@ -821,6 +821,18 @@ config INTEL_IPS > > > functionality. If in doubt, say Y here; it will only load on > > > supported platforms. > > > > > > +config INTEL_PMC_CORE > > > > Better to locate this in alphabetical order. > > > > Agreed, i tried to put it in alphabetical order but there are quite a few > entries > in Kconfig that are skewed e.g. INTEL_MFLD is placed above INTEL_IPS. > > Placing INTEL_PMC_CORE after INTEL_IMR would be ok? > > > > + bool "Intel PMC Core driver" > > > + depends on X86 && PCI > > > > > + ---help--- > > > +
Re: [PATCH v7 1/3] create SMAF module
Hi Benjamin, I'd suspect you're interested in some feedback on these, so here is a few :-) Sadly (ideally?) nothing serious, but a bunch minor suggestions, plus the odd bug. On 9 May 2016 at 16:07, Benjamin Gaignardwrote: > --- /dev/null > +++ b/drivers/smaf/smaf-core.c > @@ -0,0 +1,794 @@ > +/* > + * smaf.c The comment does not match the actual file name. You could give a brief summary of the file(s), if you're feeling gracious ;-) > + > +/** > + * smaf_grant_access - return true if the specified device can get access > + * to the memory area > + * Reading this makes me wonder if {request,allow}_access won't be better name ? > +static int smaf_secure_handle(struct smaf_handle *handle) > +{ > + if (atomic_read(>is_secure)) > + return 0; > + > + if (!have_secure_module()) > + return -EINVAL; > + > + handle->secure_ctx = smaf_dev.secure->create_ctx(); > + Should one use a temporary variable so that the caller provided storage is unchanged in case of an error ? > + if (!handle->secure_ctx) > + return -EINVAL; > + > + atomic_set(>is_secure, 1); > + return 0; > +} > + > +int smaf_register_secure(struct smaf_secure *s) > +{ > + /* make sure that secure module have all required functions > +* to avoid test them each time later > +*/ > + WARN_ON(!s || !s->create_ctx || !s->destroy_ctx || > + !s->grant_access || !s->revoke_access); > + Is something like below reasonable thing to do in the kernel ? Same question goes for smaf_register_allocator() further down. if (!s || ) return -ESHOULDNEVERHAPPEN; > +static struct vm_operations_struct smaf_vma_ops = { Ops/vfucs normally are const data. Is there something preventing us here ? > + .close = smaf_vm_close, > +}; > + > +static int smaf_mmap(struct dma_buf *dmabuf, struct vm_area_struct *vma) > +{ > + struct smaf_handle *handle = dmabuf->priv; > + bool ret; > + enum dma_data_direction dir; > + > + /* if no allocator attached, get the first allocator */ > + if (!handle->allocator) { > + struct smaf_allocator *alloc; > + > + mutex_lock(_dev.lock); > + alloc = smaf_get_first_allocator(dmabuf); > + mutex_unlock(_dev.lock); > + > + /* still no allocator ? */ > + if (!alloc) > + return -EINVAL; > + > + handle->allocator = alloc; > + } > + > + if (!handle->db_alloc) { > + struct dma_buf *db_alloc; > + > + db_alloc = handle->allocator->allocate(dmabuf, > + handle->length, > + handle->flags); > + if (!db_alloc) > + return -EINVAL; > + > + handle->db_alloc = db_alloc; > + } > + The above half of the function looks identical to smaf_map_dma_buf(). Worth factoring it out to a helper function ? > +static int smaf_attach(struct dma_buf *dmabuf, struct device *dev, > + struct dma_buf_attachment *attach) > +{ > + struct smaf_handle *handle = dmabuf->priv; > + struct dma_buf_attachment *db_attach; > + > + if (!handle->db_alloc) > + return 0; > + Shouldn't one return an error (-EINVAL or similar) here ? > +static struct dma_buf_ops smaf_dma_buf_ops = { const ? From a very quick look the compiler should warn us about it - "smaf_dma_buf_ops discards const qualifier" or similar. > +struct smaf_handle *smaf_create_handle(size_t length, unsigned int flags) > +{ > + struct smaf_handle *handle; > + > + DEFINE_DMA_BUF_EXPORT_INFO(info); > + > + handle = kzalloc(sizeof(*handle), GFP_KERNEL); > + if (!handle) > + return ERR_PTR(-ENOMEM); > + Err this should be return NULL; correct ? > + info.ops = _dma_buf_ops; > + info.size = round_up(length, PAGE_SIZE); > + info.flags = flags; > + info.priv = handle; > + > + handle->dmabuf = dma_buf_export(); > + if (IS_ERR(handle->dmabuf)) { > + kfree(handle); > + return NULL; > + } > + > + handle->length = info.size; > + handle->flags = flags; > + > + return handle; > +} > +EXPORT_SYMBOL(smaf_create_handle); > + > +static long smaf_ioctl(struct file *file, unsigned int cmd, unsigned long > arg) > +{ > + switch (cmd) { > + case SMAF_IOC_CREATE: > + { > + struct smaf_create_data data; > + struct smaf_handle *handle; > + > + if (copy_from_user(, (void __user *)arg, _IOC_SIZE(cmd))) > + return -EFAULT; > + > + handle = smaf_create_handle(data.length, data.flags); We want to sanitise the input data.{length,flags} before sending it deeper in the kernel. > +
Re: [PATCH v7 1/3] create SMAF module
Hi Benjamin, I'd suspect you're interested in some feedback on these, so here is a few :-) Sadly (ideally?) nothing serious, but a bunch minor suggestions, plus the odd bug. On 9 May 2016 at 16:07, Benjamin Gaignard wrote: > --- /dev/null > +++ b/drivers/smaf/smaf-core.c > @@ -0,0 +1,794 @@ > +/* > + * smaf.c The comment does not match the actual file name. You could give a brief summary of the file(s), if you're feeling gracious ;-) > + > +/** > + * smaf_grant_access - return true if the specified device can get access > + * to the memory area > + * Reading this makes me wonder if {request,allow}_access won't be better name ? > +static int smaf_secure_handle(struct smaf_handle *handle) > +{ > + if (atomic_read(>is_secure)) > + return 0; > + > + if (!have_secure_module()) > + return -EINVAL; > + > + handle->secure_ctx = smaf_dev.secure->create_ctx(); > + Should one use a temporary variable so that the caller provided storage is unchanged in case of an error ? > + if (!handle->secure_ctx) > + return -EINVAL; > + > + atomic_set(>is_secure, 1); > + return 0; > +} > + > +int smaf_register_secure(struct smaf_secure *s) > +{ > + /* make sure that secure module have all required functions > +* to avoid test them each time later > +*/ > + WARN_ON(!s || !s->create_ctx || !s->destroy_ctx || > + !s->grant_access || !s->revoke_access); > + Is something like below reasonable thing to do in the kernel ? Same question goes for smaf_register_allocator() further down. if (!s || ) return -ESHOULDNEVERHAPPEN; > +static struct vm_operations_struct smaf_vma_ops = { Ops/vfucs normally are const data. Is there something preventing us here ? > + .close = smaf_vm_close, > +}; > + > +static int smaf_mmap(struct dma_buf *dmabuf, struct vm_area_struct *vma) > +{ > + struct smaf_handle *handle = dmabuf->priv; > + bool ret; > + enum dma_data_direction dir; > + > + /* if no allocator attached, get the first allocator */ > + if (!handle->allocator) { > + struct smaf_allocator *alloc; > + > + mutex_lock(_dev.lock); > + alloc = smaf_get_first_allocator(dmabuf); > + mutex_unlock(_dev.lock); > + > + /* still no allocator ? */ > + if (!alloc) > + return -EINVAL; > + > + handle->allocator = alloc; > + } > + > + if (!handle->db_alloc) { > + struct dma_buf *db_alloc; > + > + db_alloc = handle->allocator->allocate(dmabuf, > + handle->length, > + handle->flags); > + if (!db_alloc) > + return -EINVAL; > + > + handle->db_alloc = db_alloc; > + } > + The above half of the function looks identical to smaf_map_dma_buf(). Worth factoring it out to a helper function ? > +static int smaf_attach(struct dma_buf *dmabuf, struct device *dev, > + struct dma_buf_attachment *attach) > +{ > + struct smaf_handle *handle = dmabuf->priv; > + struct dma_buf_attachment *db_attach; > + > + if (!handle->db_alloc) > + return 0; > + Shouldn't one return an error (-EINVAL or similar) here ? > +static struct dma_buf_ops smaf_dma_buf_ops = { const ? From a very quick look the compiler should warn us about it - "smaf_dma_buf_ops discards const qualifier" or similar. > +struct smaf_handle *smaf_create_handle(size_t length, unsigned int flags) > +{ > + struct smaf_handle *handle; > + > + DEFINE_DMA_BUF_EXPORT_INFO(info); > + > + handle = kzalloc(sizeof(*handle), GFP_KERNEL); > + if (!handle) > + return ERR_PTR(-ENOMEM); > + Err this should be return NULL; correct ? > + info.ops = _dma_buf_ops; > + info.size = round_up(length, PAGE_SIZE); > + info.flags = flags; > + info.priv = handle; > + > + handle->dmabuf = dma_buf_export(); > + if (IS_ERR(handle->dmabuf)) { > + kfree(handle); > + return NULL; > + } > + > + handle->length = info.size; > + handle->flags = flags; > + > + return handle; > +} > +EXPORT_SYMBOL(smaf_create_handle); > + > +static long smaf_ioctl(struct file *file, unsigned int cmd, unsigned long > arg) > +{ > + switch (cmd) { > + case SMAF_IOC_CREATE: > + { > + struct smaf_create_data data; > + struct smaf_handle *handle; > + > + if (copy_from_user(, (void __user *)arg, _IOC_SIZE(cmd))) > + return -EFAULT; > + > + handle = smaf_create_handle(data.length, data.flags); We want to sanitise the input data.{length,flags} before sending it deeper in the kernel. > + if (!handle) > +
Re: [PATCH 1/2] Create UV efi_call macros
On Thu, May 12, 2016 at 01:06:00PM +0100, Matt Fleming wrote: > (Adding author of arch_efi_call code) > > On Wed, 11 May, at 02:55:44PM, Alex Thorlton wrote: > > We need a slightly different macro than the standard efi_call_virt, > > since those macros all assume that the function pointer, f, that gets > > passed in will live somewhere in efi.systab->runtime. Our EFI function > > pointer lives in efi.uv_systab, so we can't use the standard macros out > > of the box. > > Is that true of all EFI services pointers? From reading the SGI/UV > code I got the impression that it's possible to call the standard > services via efi.systab->runtime but you also need the ability to call > these UV bios functions, which are not accessible via efi.systab. No, sorry. I wasn't very clear there. All of the standard EFI services (get_time, get_variable, etc.) are still called through efi.systab->runtime on UV. It's only the special UV-specific function pointer that is accessed through uv_systab, but, either way, we will still need a slightly different macro to make that happen. > Do you actually want the same environment setup and teardown to happen > when calling efi.uv_systab ? Or are you simply trying to reuse the > efi_call() implementation? I was simply re-using the efi_call implementation. Boris suggested that I re-write this using the efi_call_virt macro, so I just went with that. It all seems to work just fine, so I don't see much reason to stray away from that implementation. That being said, I'm obviously not a huge fun of the code duplication across the macros. I think there's probably a way to minimize this, though I haven't quite worked out the best method yet (ideas are welcome :) > > This commit creates some new uv_* macros that are functionally > > equivalent to the standard ones, with the exception of allowing us to > > use a different function pointer. I figure that we won't want to call > > these uv_* in the end (we'll probably want something more generic), but > > I thought I would get everyone's thoughts on how we might best reach > > that goal instead of just trying to come up with a new implementation on > > my own. > > > > By itself, this commit does get our machines booting, but it needs the > > small fix to the efi_call assembly code for our modules to work. > > Could you provide some more details in the changelog on why your > machine doesn't boot without this change? Absolutely. I wasn't sure exactly how much detail was necessary. I'll put a brief write-up of the original problem in the commit message for the next version. - Alex
Re: [PATCH 1/2] Create UV efi_call macros
On Thu, May 12, 2016 at 01:06:00PM +0100, Matt Fleming wrote: > (Adding author of arch_efi_call code) > > On Wed, 11 May, at 02:55:44PM, Alex Thorlton wrote: > > We need a slightly different macro than the standard efi_call_virt, > > since those macros all assume that the function pointer, f, that gets > > passed in will live somewhere in efi.systab->runtime. Our EFI function > > pointer lives in efi.uv_systab, so we can't use the standard macros out > > of the box. > > Is that true of all EFI services pointers? From reading the SGI/UV > code I got the impression that it's possible to call the standard > services via efi.systab->runtime but you also need the ability to call > these UV bios functions, which are not accessible via efi.systab. No, sorry. I wasn't very clear there. All of the standard EFI services (get_time, get_variable, etc.) are still called through efi.systab->runtime on UV. It's only the special UV-specific function pointer that is accessed through uv_systab, but, either way, we will still need a slightly different macro to make that happen. > Do you actually want the same environment setup and teardown to happen > when calling efi.uv_systab ? Or are you simply trying to reuse the > efi_call() implementation? I was simply re-using the efi_call implementation. Boris suggested that I re-write this using the efi_call_virt macro, so I just went with that. It all seems to work just fine, so I don't see much reason to stray away from that implementation. That being said, I'm obviously not a huge fun of the code duplication across the macros. I think there's probably a way to minimize this, though I haven't quite worked out the best method yet (ideas are welcome :) > > This commit creates some new uv_* macros that are functionally > > equivalent to the standard ones, with the exception of allowing us to > > use a different function pointer. I figure that we won't want to call > > these uv_* in the end (we'll probably want something more generic), but > > I thought I would get everyone's thoughts on how we might best reach > > that goal instead of just trying to come up with a new implementation on > > my own. > > > > By itself, this commit does get our machines booting, but it needs the > > small fix to the efi_call assembly code for our modules to work. > > Could you provide some more details in the changelog on why your > machine doesn't boot without this change? Absolutely. I wasn't sure exactly how much detail was necessary. I'll put a brief write-up of the original problem in the commit message for the next version. - Alex
Re: [GIT PULL] EFI fix
On Mon, May 16, 2016 at 03:23:51PM -0500, Alex Thorlton wrote: > Everything discussed above makes sense to me, and the patch looks sane. > I will apply and test it today and let you know how it works. I applied this to the latest -tip kernel and tested on both real hardware and in our simulator. Everything appears to be behaving as expected. I still need to work on the other patch in the patchset to get our callbacks working with the new EFI page tables, but I believe we have a clear path forward there, and that is (mostly) a separate issue. Thanks again for catching this mistake, Linus! - Alex
Re: [GIT PULL] EFI fix
On Mon, May 16, 2016 at 03:23:51PM -0500, Alex Thorlton wrote: > Everything discussed above makes sense to me, and the patch looks sane. > I will apply and test it today and let you know how it works. I applied this to the latest -tip kernel and tested on both real hardware and in our simulator. Everything appears to be behaving as expected. I still need to work on the other patch in the patchset to get our callbacks working with the new EFI page tables, but I believe we have a clear path forward there, and that is (mostly) a separate issue. Thanks again for catching this mistake, Linus! - Alex
Re: [PATCH] mm: unhide vmstat_text definition for CONFIG_SMP
On Mon, 16 May 2016 16:23:33 +0200 Michal Hockowrote: > Andrew, I think that the following is more straightforward fix and > should be folded in to the patch which has introduced vmstat_refresh. > --- > >From b8dd18fb7df040e1bfe61aadde1d903589de15e4 Mon Sep 17 00:00:00 2001 > From: Michal Hocko > Date: Mon, 16 May 2016 16:19:53 +0200 > Subject: [PATCH] mmotm: mm-proc-sys-vm-stat_refresh-to-force-vmstat-update-fix > > Arnd has reported: > In randconfig builds with sysfs, procfs and numa all disabled, > but SMP enabled, we now get a link error in the newly introduced > vmstat_refresh function: > > mm/built-in.o: In function `vmstat_refresh': > :(.text+0x15c78): undefined reference to `vmstat_text' > > vmstat_refresh is proc_fs specific so there is no reason to define it > when !CONFIG_PROC_FS. I already had this: From: Christoph Lameter Subject: Do not build vmstat_refresh if there is no procfs support It makes no sense to build functionality into the kernel that cannot be used and causes build issues. Link: http://lkml.kernel.org/r/alpine.deb.2.20.1605111011260.9...@east.gentwo.org Signed-off-by: Christoph Lameter Reported-by: Arnd Bergmann Signed-off-by: Andrew Morton --- mm/vmstat.c |4 +++- 1 file changed, 3 insertions(+), 1 deletion(-) diff -puN mm/vmstat.c~mm-proc-sys-vm-stat_refresh-to-force-vmstat-update-fix mm/vmstat.c --- a/mm/vmstat.c~mm-proc-sys-vm-stat_refresh-to-force-vmstat-update-fix +++ a/mm/vmstat.c @@ -1371,7 +1371,6 @@ static const struct file_operations proc .llseek = seq_lseek, .release= seq_release, }; -#endif /* CONFIG_PROC_FS */ #ifdef CONFIG_SMP static struct workqueue_struct *vmstat_wq; @@ -1436,7 +1435,10 @@ int vmstat_refresh(struct ctl_table *tab *lenp = 0; return 0; } +#endif /* CONFIG_SMP */ +#endif /* CONFIG_PROC_FS */ +#ifdef CONFIG_SMP static void vmstat_update(struct work_struct *w) { if (refresh_cpu_vm_stats(true)) { _
Re: [PATCH] mm: unhide vmstat_text definition for CONFIG_SMP
On Mon, 16 May 2016 16:23:33 +0200 Michal Hocko wrote: > Andrew, I think that the following is more straightforward fix and > should be folded in to the patch which has introduced vmstat_refresh. > --- > >From b8dd18fb7df040e1bfe61aadde1d903589de15e4 Mon Sep 17 00:00:00 2001 > From: Michal Hocko > Date: Mon, 16 May 2016 16:19:53 +0200 > Subject: [PATCH] mmotm: mm-proc-sys-vm-stat_refresh-to-force-vmstat-update-fix > > Arnd has reported: > In randconfig builds with sysfs, procfs and numa all disabled, > but SMP enabled, we now get a link error in the newly introduced > vmstat_refresh function: > > mm/built-in.o: In function `vmstat_refresh': > :(.text+0x15c78): undefined reference to `vmstat_text' > > vmstat_refresh is proc_fs specific so there is no reason to define it > when !CONFIG_PROC_FS. I already had this: From: Christoph Lameter Subject: Do not build vmstat_refresh if there is no procfs support It makes no sense to build functionality into the kernel that cannot be used and causes build issues. Link: http://lkml.kernel.org/r/alpine.deb.2.20.1605111011260.9...@east.gentwo.org Signed-off-by: Christoph Lameter Reported-by: Arnd Bergmann Signed-off-by: Andrew Morton --- mm/vmstat.c |4 +++- 1 file changed, 3 insertions(+), 1 deletion(-) diff -puN mm/vmstat.c~mm-proc-sys-vm-stat_refresh-to-force-vmstat-update-fix mm/vmstat.c --- a/mm/vmstat.c~mm-proc-sys-vm-stat_refresh-to-force-vmstat-update-fix +++ a/mm/vmstat.c @@ -1371,7 +1371,6 @@ static const struct file_operations proc .llseek = seq_lseek, .release= seq_release, }; -#endif /* CONFIG_PROC_FS */ #ifdef CONFIG_SMP static struct workqueue_struct *vmstat_wq; @@ -1436,7 +1435,10 @@ int vmstat_refresh(struct ctl_table *tab *lenp = 0; return 0; } +#endif /* CONFIG_SMP */ +#endif /* CONFIG_PROC_FS */ +#ifdef CONFIG_SMP static void vmstat_update(struct work_struct *w) { if (refresh_cpu_vm_stats(true)) { _
[PATCH 2/2] net: ethernet: fec-mpc52xx: use phy_ethtool_{get|set}_link_ksettings
There are two generics functions phy_ethtool_{get|set}_link_ksettings, so we can use them instead of defining the same code in the driver. Signed-off-by: Philippe Reynes--- drivers/net/ethernet/freescale/fec_mpc52xx.c | 26 ++ 1 files changed, 2 insertions(+), 24 deletions(-) diff --git a/drivers/net/ethernet/freescale/fec_mpc52xx.c b/drivers/net/ethernet/freescale/fec_mpc52xx.c index bcf0600..446ae9d 100644 --- a/drivers/net/ethernet/freescale/fec_mpc52xx.c +++ b/drivers/net/ethernet/freescale/fec_mpc52xx.c @@ -762,28 +762,6 @@ static void mpc52xx_fec_reset(struct net_device *dev) /* ethtool interface */ -static int mpc52xx_fec_get_ksettings(struct net_device *dev, -struct ethtool_link_ksettings *cmd) -{ - struct phy_device *phydev = dev->phydev; - - if (!phydev) - return -ENODEV; - - return phy_ethtool_ksettings_get(phydev, cmd); -} - -static int mpc52xx_fec_set_ksettings(struct net_device *dev, -const struct ethtool_link_ksettings *cmd) -{ - struct phy_device *phydev = dev->phydev; - - if (!phydev) - return -ENODEV; - - return phy_ethtool_ksettings_set(phydev, cmd); -} - static u32 mpc52xx_fec_get_msglevel(struct net_device *dev) { struct mpc52xx_fec_priv *priv = netdev_priv(dev); @@ -801,8 +779,8 @@ static const struct ethtool_ops mpc52xx_fec_ethtool_ops = { .get_msglevel = mpc52xx_fec_get_msglevel, .set_msglevel = mpc52xx_fec_set_msglevel, .get_ts_info = ethtool_op_get_ts_info, - .get_link_ksettings = mpc52xx_fec_get_ksettings, - .set_link_ksettings = mpc52xx_fec_set_ksettings, + .get_link_ksettings = phy_ethtool_get_link_ksettings, + .set_link_ksettings = phy_ethtool_set_link_ksettings, }; -- 1.7.4.4
[PATCH 2/2] net: ethernet: fec-mpc52xx: use phy_ethtool_{get|set}_link_ksettings
There are two generics functions phy_ethtool_{get|set}_link_ksettings, so we can use them instead of defining the same code in the driver. Signed-off-by: Philippe Reynes --- drivers/net/ethernet/freescale/fec_mpc52xx.c | 26 ++ 1 files changed, 2 insertions(+), 24 deletions(-) diff --git a/drivers/net/ethernet/freescale/fec_mpc52xx.c b/drivers/net/ethernet/freescale/fec_mpc52xx.c index bcf0600..446ae9d 100644 --- a/drivers/net/ethernet/freescale/fec_mpc52xx.c +++ b/drivers/net/ethernet/freescale/fec_mpc52xx.c @@ -762,28 +762,6 @@ static void mpc52xx_fec_reset(struct net_device *dev) /* ethtool interface */ -static int mpc52xx_fec_get_ksettings(struct net_device *dev, -struct ethtool_link_ksettings *cmd) -{ - struct phy_device *phydev = dev->phydev; - - if (!phydev) - return -ENODEV; - - return phy_ethtool_ksettings_get(phydev, cmd); -} - -static int mpc52xx_fec_set_ksettings(struct net_device *dev, -const struct ethtool_link_ksettings *cmd) -{ - struct phy_device *phydev = dev->phydev; - - if (!phydev) - return -ENODEV; - - return phy_ethtool_ksettings_set(phydev, cmd); -} - static u32 mpc52xx_fec_get_msglevel(struct net_device *dev) { struct mpc52xx_fec_priv *priv = netdev_priv(dev); @@ -801,8 +779,8 @@ static const struct ethtool_ops mpc52xx_fec_ethtool_ops = { .get_msglevel = mpc52xx_fec_get_msglevel, .set_msglevel = mpc52xx_fec_set_msglevel, .get_ts_info = ethtool_op_get_ts_info, - .get_link_ksettings = mpc52xx_fec_get_ksettings, - .set_link_ksettings = mpc52xx_fec_set_ksettings, + .get_link_ksettings = phy_ethtool_get_link_ksettings, + .set_link_ksettings = phy_ethtool_set_link_ksettings, }; -- 1.7.4.4
RE: [Intel-wired-lan] [PATCH] e1000e: prevent division by zero if TIMINCA is zero
> From: Intel-wired-lan [mailto:intel-wired-lan-boun...@lists.osuosl.org] On > Behalf Of Denys Vlasenko > Sent: Friday, May 6, 2016 12:42 PM > To: Kirsher, Jeffrey T> Cc: intel-wired-...@lists.osuosl.org; Denys Vlasenko > ; LKML ; > net...@vger.kernel.org > Subject: [Intel-wired-lan] [PATCH] e1000e: prevent division by zero if > TIMINCA is zero > > Users report that under VMWare, er32(TIMINCA) returns zero. > This causes division by zero at init time as follows: > > ==>incvalue = er32(TIMINCA) & E1000_TIMINCA_INCVALUE_MASK; > for (i = 0; i < E1000_MAX_82574_SYSTIM_REREADS; i++) { > /* latch SYSTIMH on read of SYSTIML */ > systim_next = (cycle_t)er32(SYSTIML); > systim_next |= (cycle_t)er32(SYSTIMH) << 32; > > time_delta = systim_next - systim; > temp = time_delta; > > rem = do_div(temp, incvalue); > > This change makes kernel survive this, and users report that > NIC does work after this change. > > Since on real hardware incvalue is never zero, this should not affect > real hardware use case. > > Signed-off-by: Denys Vlasenko > CC: Jeff Kirsher > CC: "Ruinskiy, Dima" > CC: intel-wired-...@lists.osuosl.org > CC: net...@vger.kernel.org > CC: LKML > --- > drivers/net/ethernet/intel/e1000e/netdev.c | 3 ++- > 1 file changed, 2 insertions(+), 1 deletion(-) As Mark Rustad pointed out I recall this was earlier rejected as something that is a VMWare error and it should be fixed there so that existing VMs will start working without installing a new driver. Having said that, it does not seem to be causing any harm in my testing, so... Tested-by: Aaron Brown
[PATCH 1/2] net: ethernet: fec-mpc52xx: use phydev from struct net_device
The private structure contain a pointer to phydev, but the structure net_device already contain such pointer. So we can remove the pointer phydev in the private structure, and update the driver to use the one contained in struct net_device. Signed-off-by: Philippe Reynes--- drivers/net/ethernet/freescale/fec_mpc52xx.c | 43 -- 1 files changed, 20 insertions(+), 23 deletions(-) diff --git a/drivers/net/ethernet/freescale/fec_mpc52xx.c b/drivers/net/ethernet/freescale/fec_mpc52xx.c index f444714..bcf0600 100644 --- a/drivers/net/ethernet/freescale/fec_mpc52xx.c +++ b/drivers/net/ethernet/freescale/fec_mpc52xx.c @@ -66,7 +66,6 @@ struct mpc52xx_fec_priv { /* MDIO link details */ unsigned int mdio_speed; struct device_node *phy_node; - struct phy_device *phydev; enum phy_state link; int seven_wire_mode; }; @@ -165,7 +164,7 @@ static int mpc52xx_fec_alloc_rx_buffers(struct net_device *dev, struct bcom_task static void mpc52xx_fec_adjust_link(struct net_device *dev) { struct mpc52xx_fec_priv *priv = netdev_priv(dev); - struct phy_device *phydev = priv->phydev; + struct phy_device *phydev = dev->phydev; int new_state = 0; if (phydev->link != PHY_DOWN) { @@ -215,16 +214,17 @@ static void mpc52xx_fec_adjust_link(struct net_device *dev) static int mpc52xx_fec_open(struct net_device *dev) { struct mpc52xx_fec_priv *priv = netdev_priv(dev); + struct phy_device *phydev = NULL; int err = -EBUSY; if (priv->phy_node) { - priv->phydev = of_phy_connect(priv->ndev, priv->phy_node, - mpc52xx_fec_adjust_link, 0, 0); - if (!priv->phydev) { + phydev = of_phy_connect(priv->ndev, priv->phy_node, + mpc52xx_fec_adjust_link, 0, 0); + if (!phydev) { dev_err(>dev, "of_phy_connect failed\n"); return -ENODEV; } - phy_start(priv->phydev); + phy_start(phydev); } if (request_irq(dev->irq, mpc52xx_fec_interrupt, IRQF_SHARED, @@ -268,10 +268,9 @@ static int mpc52xx_fec_open(struct net_device *dev) free_ctrl_irq: free_irq(dev->irq, dev); free_phy: - if (priv->phydev) { - phy_stop(priv->phydev); - phy_disconnect(priv->phydev); - priv->phydev = NULL; + if (phydev) { + phy_stop(phydev); + phy_disconnect(phydev); } return err; @@ -280,6 +279,7 @@ static int mpc52xx_fec_open(struct net_device *dev) static int mpc52xx_fec_close(struct net_device *dev) { struct mpc52xx_fec_priv *priv = netdev_priv(dev); + struct phy_device *phydev = dev->phydev; netif_stop_queue(dev); @@ -291,11 +291,10 @@ static int mpc52xx_fec_close(struct net_device *dev) free_irq(priv->r_irq, dev); free_irq(priv->t_irq, dev); - if (priv->phydev) { + if (phydev) { /* power down phy */ - phy_stop(priv->phydev); - phy_disconnect(priv->phydev); - priv->phydev = NULL; + phy_stop(phydev); + phy_disconnect(phydev); } return 0; @@ -766,10 +765,9 @@ static void mpc52xx_fec_reset(struct net_device *dev) static int mpc52xx_fec_get_ksettings(struct net_device *dev, struct ethtool_link_ksettings *cmd) { - struct mpc52xx_fec_priv *priv = netdev_priv(dev); - struct phy_device *phydev = priv->phydev; + struct phy_device *phydev = dev->phydev; - if (!priv->phydev) + if (!phydev) return -ENODEV; return phy_ethtool_ksettings_get(phydev, cmd); @@ -778,10 +776,9 @@ static int mpc52xx_fec_get_ksettings(struct net_device *dev, static int mpc52xx_fec_set_ksettings(struct net_device *dev, const struct ethtool_link_ksettings *cmd) { - struct mpc52xx_fec_priv *priv = netdev_priv(dev); - struct phy_device *phydev = priv->phydev; + struct phy_device *phydev = dev->phydev; - if (!priv->phydev) + if (!phydev) return -ENODEV; return phy_ethtool_ksettings_set(phydev, cmd); @@ -811,12 +808,12 @@ static const struct ethtool_ops mpc52xx_fec_ethtool_ops = { static int mpc52xx_fec_ioctl(struct net_device *dev, struct ifreq *rq, int cmd) { - struct mpc52xx_fec_priv *priv = netdev_priv(dev); + struct phy_device *phydev = dev->phydev; - if (!priv->phydev) + if (!phydev) return -ENOTSUPP; - return phy_mii_ioctl(priv->phydev, rq, cmd); + return phy_mii_ioctl(phydev, rq, cmd); } static const struct net_device_ops mpc52xx_fec_netdev_ops = { -- 1.7.4.4
[PATCH 2/5] f2fs: use percpu_counter for page counters
This patch substitutes percpu_counter for atomic_counter when counting various types of pages. Signed-off-by: Jaegeuk Kim--- fs/f2fs/debug.c | 1 + fs/f2fs/f2fs.h | 12 +++- fs/f2fs/super.c | 31 +++ 3 files changed, 35 insertions(+), 9 deletions(-) diff --git a/fs/f2fs/debug.c b/fs/f2fs/debug.c index 37615b2..2210c7c 100644 --- a/fs/f2fs/debug.c +++ b/fs/f2fs/debug.c @@ -144,6 +144,7 @@ static void update_mem_info(struct f2fs_sb_info *sbi) si->base_mem = sizeof(struct f2fs_sb_info) + sbi->sb->s_blocksize; si->base_mem += 2 * sizeof(struct f2fs_inode_info); si->base_mem += sizeof(*sbi->ckpt); + si->base_mem += sizeof(struct percpu_counter) * NR_COUNT_TYPE; /* build sm */ si->base_mem += sizeof(struct f2fs_sm_info); diff --git a/fs/f2fs/f2fs.h b/fs/f2fs/f2fs.h index 24f8170..2851078 100644 --- a/fs/f2fs/f2fs.h +++ b/fs/f2fs/f2fs.h @@ -833,7 +833,9 @@ struct f2fs_sb_info { block_t discard_blks; /* discard command candidats */ block_t last_valid_block_count; /* for recovery */ u32 s_next_generation; /* for NFS support */ - atomic_t nr_pages[NR_COUNT_TYPE]; /* # of pages, see count_type */ + + /* # of pages, see count_type */ + struct percpu_counter nr_pages[NR_COUNT_TYPE]; struct f2fs_mount_info mount_opt; /* mount options */ @@ -1178,7 +1180,7 @@ static inline void dec_valid_block_count(struct f2fs_sb_info *sbi, static inline void inc_page_count(struct f2fs_sb_info *sbi, int count_type) { - atomic_inc(>nr_pages[count_type]); + percpu_counter_inc(>nr_pages[count_type]); set_sbi_flag(sbi, SBI_IS_DIRTY); } @@ -1191,7 +1193,7 @@ static inline void inode_inc_dirty_pages(struct inode *inode) static inline void dec_page_count(struct f2fs_sb_info *sbi, int count_type) { - atomic_dec(>nr_pages[count_type]); + percpu_counter_dec(>nr_pages[count_type]); } static inline void inode_dec_dirty_pages(struct inode *inode) @@ -1205,9 +1207,9 @@ static inline void inode_dec_dirty_pages(struct inode *inode) F2FS_DIRTY_DENTS : F2FS_DIRTY_DATA); } -static inline int get_pages(struct f2fs_sb_info *sbi, int count_type) +static inline s64 get_pages(struct f2fs_sb_info *sbi, int count_type) { - return atomic_read(>nr_pages[count_type]); + return percpu_counter_sum_positive(>nr_pages[count_type]); } static inline int get_dirty_pages(struct inode *inode) diff --git a/fs/f2fs/super.c b/fs/f2fs/super.c index 9df6d72..b3030ed 100644 --- a/fs/f2fs/super.c +++ b/fs/f2fs/super.c @@ -606,6 +606,14 @@ static void f2fs_destroy_inode(struct inode *inode) call_rcu(>i_rcu, f2fs_i_callback); } +static void destroy_percpu_info(struct f2fs_sb_info *sbi) +{ + int i; + + for (i = 0; i < NR_COUNT_TYPE; i++) + percpu_counter_destroy(>nr_pages[i]); +} + static void f2fs_put_super(struct super_block *sb) { struct f2fs_sb_info *sbi = F2FS_SB(sb); @@ -667,6 +675,8 @@ static void f2fs_put_super(struct super_block *sb) if (sbi->s_chksum_driver) crypto_free_shash(sbi->s_chksum_driver); kfree(sbi->raw_super); + + destroy_percpu_info(sbi); kfree(sbi); } @@ -1325,7 +1335,6 @@ int sanity_check_ckpt(struct f2fs_sb_info *sbi) static void init_sb_info(struct f2fs_sb_info *sbi) { struct f2fs_super_block *raw_super = sbi->raw_super; - int i; sbi->log_sectors_per_block = le32_to_cpu(raw_super->log_sectors_per_block); @@ -1345,9 +1354,6 @@ static void init_sb_info(struct f2fs_sb_info *sbi) sbi->cur_victim_sec = NULL_SECNO; sbi->max_victim_search = DEF_MAX_VICTIM_SEARCH; - for (i = 0; i < NR_COUNT_TYPE; i++) - atomic_set(>nr_pages[i], 0); - sbi->dir_level = DEF_DIR_LEVEL; sbi->interval_time[CP_TIME] = DEF_CP_INTERVAL; sbi->interval_time[REQ_TIME] = DEF_IDLE_INTERVAL; @@ -1363,6 +1369,18 @@ static void init_sb_info(struct f2fs_sb_info *sbi) #endif } +static int init_percpu_info(struct f2fs_sb_info *sbi) +{ + int i, err; + + for (i = 0; i < NR_COUNT_TYPE; i++) { + err = percpu_counter_init(>nr_pages[i], 0, GFP_KERNEL); + if (err) + return err; + } + return 0; +} + /* * Read f2fs raw super block. * Because we have two copies of super block, so read both of them @@ -1553,6 +1571,10 @@ try_onemore: init_waitqueue_head(>cp_wait); init_sb_info(sbi); + err = init_percpu_info(sbi); + if (err) + goto free_options; + /* get an inode for meta space */ sbi->meta_inode = f2fs_iget(sb, F2FS_META_INO(sbi)); if (IS_ERR(sbi->meta_inode)) { @@ -1755,6 +1777,7 @@ free_meta_inode: make_bad_inode(sbi->meta_inode);
[PATCH 2/5] f2fs: use percpu_counter for page counters
This patch substitutes percpu_counter for atomic_counter when counting various types of pages. Signed-off-by: Jaegeuk Kim --- fs/f2fs/debug.c | 1 + fs/f2fs/f2fs.h | 12 +++- fs/f2fs/super.c | 31 +++ 3 files changed, 35 insertions(+), 9 deletions(-) diff --git a/fs/f2fs/debug.c b/fs/f2fs/debug.c index 37615b2..2210c7c 100644 --- a/fs/f2fs/debug.c +++ b/fs/f2fs/debug.c @@ -144,6 +144,7 @@ static void update_mem_info(struct f2fs_sb_info *sbi) si->base_mem = sizeof(struct f2fs_sb_info) + sbi->sb->s_blocksize; si->base_mem += 2 * sizeof(struct f2fs_inode_info); si->base_mem += sizeof(*sbi->ckpt); + si->base_mem += sizeof(struct percpu_counter) * NR_COUNT_TYPE; /* build sm */ si->base_mem += sizeof(struct f2fs_sm_info); diff --git a/fs/f2fs/f2fs.h b/fs/f2fs/f2fs.h index 24f8170..2851078 100644 --- a/fs/f2fs/f2fs.h +++ b/fs/f2fs/f2fs.h @@ -833,7 +833,9 @@ struct f2fs_sb_info { block_t discard_blks; /* discard command candidats */ block_t last_valid_block_count; /* for recovery */ u32 s_next_generation; /* for NFS support */ - atomic_t nr_pages[NR_COUNT_TYPE]; /* # of pages, see count_type */ + + /* # of pages, see count_type */ + struct percpu_counter nr_pages[NR_COUNT_TYPE]; struct f2fs_mount_info mount_opt; /* mount options */ @@ -1178,7 +1180,7 @@ static inline void dec_valid_block_count(struct f2fs_sb_info *sbi, static inline void inc_page_count(struct f2fs_sb_info *sbi, int count_type) { - atomic_inc(>nr_pages[count_type]); + percpu_counter_inc(>nr_pages[count_type]); set_sbi_flag(sbi, SBI_IS_DIRTY); } @@ -1191,7 +1193,7 @@ static inline void inode_inc_dirty_pages(struct inode *inode) static inline void dec_page_count(struct f2fs_sb_info *sbi, int count_type) { - atomic_dec(>nr_pages[count_type]); + percpu_counter_dec(>nr_pages[count_type]); } static inline void inode_dec_dirty_pages(struct inode *inode) @@ -1205,9 +1207,9 @@ static inline void inode_dec_dirty_pages(struct inode *inode) F2FS_DIRTY_DENTS : F2FS_DIRTY_DATA); } -static inline int get_pages(struct f2fs_sb_info *sbi, int count_type) +static inline s64 get_pages(struct f2fs_sb_info *sbi, int count_type) { - return atomic_read(>nr_pages[count_type]); + return percpu_counter_sum_positive(>nr_pages[count_type]); } static inline int get_dirty_pages(struct inode *inode) diff --git a/fs/f2fs/super.c b/fs/f2fs/super.c index 9df6d72..b3030ed 100644 --- a/fs/f2fs/super.c +++ b/fs/f2fs/super.c @@ -606,6 +606,14 @@ static void f2fs_destroy_inode(struct inode *inode) call_rcu(>i_rcu, f2fs_i_callback); } +static void destroy_percpu_info(struct f2fs_sb_info *sbi) +{ + int i; + + for (i = 0; i < NR_COUNT_TYPE; i++) + percpu_counter_destroy(>nr_pages[i]); +} + static void f2fs_put_super(struct super_block *sb) { struct f2fs_sb_info *sbi = F2FS_SB(sb); @@ -667,6 +675,8 @@ static void f2fs_put_super(struct super_block *sb) if (sbi->s_chksum_driver) crypto_free_shash(sbi->s_chksum_driver); kfree(sbi->raw_super); + + destroy_percpu_info(sbi); kfree(sbi); } @@ -1325,7 +1335,6 @@ int sanity_check_ckpt(struct f2fs_sb_info *sbi) static void init_sb_info(struct f2fs_sb_info *sbi) { struct f2fs_super_block *raw_super = sbi->raw_super; - int i; sbi->log_sectors_per_block = le32_to_cpu(raw_super->log_sectors_per_block); @@ -1345,9 +1354,6 @@ static void init_sb_info(struct f2fs_sb_info *sbi) sbi->cur_victim_sec = NULL_SECNO; sbi->max_victim_search = DEF_MAX_VICTIM_SEARCH; - for (i = 0; i < NR_COUNT_TYPE; i++) - atomic_set(>nr_pages[i], 0); - sbi->dir_level = DEF_DIR_LEVEL; sbi->interval_time[CP_TIME] = DEF_CP_INTERVAL; sbi->interval_time[REQ_TIME] = DEF_IDLE_INTERVAL; @@ -1363,6 +1369,18 @@ static void init_sb_info(struct f2fs_sb_info *sbi) #endif } +static int init_percpu_info(struct f2fs_sb_info *sbi) +{ + int i, err; + + for (i = 0; i < NR_COUNT_TYPE; i++) { + err = percpu_counter_init(>nr_pages[i], 0, GFP_KERNEL); + if (err) + return err; + } + return 0; +} + /* * Read f2fs raw super block. * Because we have two copies of super block, so read both of them @@ -1553,6 +1571,10 @@ try_onemore: init_waitqueue_head(>cp_wait); init_sb_info(sbi); + err = init_percpu_info(sbi); + if (err) + goto free_options; + /* get an inode for meta space */ sbi->meta_inode = f2fs_iget(sb, F2FS_META_INO(sbi)); if (IS_ERR(sbi->meta_inode)) { @@ -1755,6 +1777,7 @@ free_meta_inode: make_bad_inode(sbi->meta_inode); iput(sbi->meta_inode);
RE: [Intel-wired-lan] [PATCH] e1000e: prevent division by zero if TIMINCA is zero
> From: Intel-wired-lan [mailto:intel-wired-lan-boun...@lists.osuosl.org] On > Behalf Of Denys Vlasenko > Sent: Friday, May 6, 2016 12:42 PM > To: Kirsher, Jeffrey T > Cc: intel-wired-...@lists.osuosl.org; Denys Vlasenko > ; LKML ; > net...@vger.kernel.org > Subject: [Intel-wired-lan] [PATCH] e1000e: prevent division by zero if > TIMINCA is zero > > Users report that under VMWare, er32(TIMINCA) returns zero. > This causes division by zero at init time as follows: > > ==>incvalue = er32(TIMINCA) & E1000_TIMINCA_INCVALUE_MASK; > for (i = 0; i < E1000_MAX_82574_SYSTIM_REREADS; i++) { > /* latch SYSTIMH on read of SYSTIML */ > systim_next = (cycle_t)er32(SYSTIML); > systim_next |= (cycle_t)er32(SYSTIMH) << 32; > > time_delta = systim_next - systim; > temp = time_delta; > > rem = do_div(temp, incvalue); > > This change makes kernel survive this, and users report that > NIC does work after this change. > > Since on real hardware incvalue is never zero, this should not affect > real hardware use case. > > Signed-off-by: Denys Vlasenko > CC: Jeff Kirsher > CC: "Ruinskiy, Dima" > CC: intel-wired-...@lists.osuosl.org > CC: net...@vger.kernel.org > CC: LKML > --- > drivers/net/ethernet/intel/e1000e/netdev.c | 3 ++- > 1 file changed, 2 insertions(+), 1 deletion(-) As Mark Rustad pointed out I recall this was earlier rejected as something that is a VMWare error and it should be fixed there so that existing VMs will start working without installing a new driver. Having said that, it does not seem to be causing any harm in my testing, so... Tested-by: Aaron Brown
[PATCH 1/2] net: ethernet: fec-mpc52xx: use phydev from struct net_device
The private structure contain a pointer to phydev, but the structure net_device already contain such pointer. So we can remove the pointer phydev in the private structure, and update the driver to use the one contained in struct net_device. Signed-off-by: Philippe Reynes --- drivers/net/ethernet/freescale/fec_mpc52xx.c | 43 -- 1 files changed, 20 insertions(+), 23 deletions(-) diff --git a/drivers/net/ethernet/freescale/fec_mpc52xx.c b/drivers/net/ethernet/freescale/fec_mpc52xx.c index f444714..bcf0600 100644 --- a/drivers/net/ethernet/freescale/fec_mpc52xx.c +++ b/drivers/net/ethernet/freescale/fec_mpc52xx.c @@ -66,7 +66,6 @@ struct mpc52xx_fec_priv { /* MDIO link details */ unsigned int mdio_speed; struct device_node *phy_node; - struct phy_device *phydev; enum phy_state link; int seven_wire_mode; }; @@ -165,7 +164,7 @@ static int mpc52xx_fec_alloc_rx_buffers(struct net_device *dev, struct bcom_task static void mpc52xx_fec_adjust_link(struct net_device *dev) { struct mpc52xx_fec_priv *priv = netdev_priv(dev); - struct phy_device *phydev = priv->phydev; + struct phy_device *phydev = dev->phydev; int new_state = 0; if (phydev->link != PHY_DOWN) { @@ -215,16 +214,17 @@ static void mpc52xx_fec_adjust_link(struct net_device *dev) static int mpc52xx_fec_open(struct net_device *dev) { struct mpc52xx_fec_priv *priv = netdev_priv(dev); + struct phy_device *phydev = NULL; int err = -EBUSY; if (priv->phy_node) { - priv->phydev = of_phy_connect(priv->ndev, priv->phy_node, - mpc52xx_fec_adjust_link, 0, 0); - if (!priv->phydev) { + phydev = of_phy_connect(priv->ndev, priv->phy_node, + mpc52xx_fec_adjust_link, 0, 0); + if (!phydev) { dev_err(>dev, "of_phy_connect failed\n"); return -ENODEV; } - phy_start(priv->phydev); + phy_start(phydev); } if (request_irq(dev->irq, mpc52xx_fec_interrupt, IRQF_SHARED, @@ -268,10 +268,9 @@ static int mpc52xx_fec_open(struct net_device *dev) free_ctrl_irq: free_irq(dev->irq, dev); free_phy: - if (priv->phydev) { - phy_stop(priv->phydev); - phy_disconnect(priv->phydev); - priv->phydev = NULL; + if (phydev) { + phy_stop(phydev); + phy_disconnect(phydev); } return err; @@ -280,6 +279,7 @@ static int mpc52xx_fec_open(struct net_device *dev) static int mpc52xx_fec_close(struct net_device *dev) { struct mpc52xx_fec_priv *priv = netdev_priv(dev); + struct phy_device *phydev = dev->phydev; netif_stop_queue(dev); @@ -291,11 +291,10 @@ static int mpc52xx_fec_close(struct net_device *dev) free_irq(priv->r_irq, dev); free_irq(priv->t_irq, dev); - if (priv->phydev) { + if (phydev) { /* power down phy */ - phy_stop(priv->phydev); - phy_disconnect(priv->phydev); - priv->phydev = NULL; + phy_stop(phydev); + phy_disconnect(phydev); } return 0; @@ -766,10 +765,9 @@ static void mpc52xx_fec_reset(struct net_device *dev) static int mpc52xx_fec_get_ksettings(struct net_device *dev, struct ethtool_link_ksettings *cmd) { - struct mpc52xx_fec_priv *priv = netdev_priv(dev); - struct phy_device *phydev = priv->phydev; + struct phy_device *phydev = dev->phydev; - if (!priv->phydev) + if (!phydev) return -ENODEV; return phy_ethtool_ksettings_get(phydev, cmd); @@ -778,10 +776,9 @@ static int mpc52xx_fec_get_ksettings(struct net_device *dev, static int mpc52xx_fec_set_ksettings(struct net_device *dev, const struct ethtool_link_ksettings *cmd) { - struct mpc52xx_fec_priv *priv = netdev_priv(dev); - struct phy_device *phydev = priv->phydev; + struct phy_device *phydev = dev->phydev; - if (!priv->phydev) + if (!phydev) return -ENODEV; return phy_ethtool_ksettings_set(phydev, cmd); @@ -811,12 +808,12 @@ static const struct ethtool_ops mpc52xx_fec_ethtool_ops = { static int mpc52xx_fec_ioctl(struct net_device *dev, struct ifreq *rq, int cmd) { - struct mpc52xx_fec_priv *priv = netdev_priv(dev); + struct phy_device *phydev = dev->phydev; - if (!priv->phydev) + if (!phydev) return -ENOTSUPP; - return phy_mii_ioctl(priv->phydev, rq, cmd); + return phy_mii_ioctl(phydev, rq, cmd); } static const struct net_device_ops mpc52xx_fec_netdev_ops = { -- 1.7.4.4
[PATCH 1/5] f2fs: manipulate dirty file inodes when DATA_FLUSH is set
It needs to maintain dirty file inodes only if DATA_FLUSH is set. Otherwise, let's avoid its overhead. Signed-off-by: Jaegeuk Kim--- fs/f2fs/checkpoint.c | 11 --- 1 file changed, 8 insertions(+), 3 deletions(-) diff --git a/fs/f2fs/checkpoint.c b/fs/f2fs/checkpoint.c index 3da6499..d04113b 100644 --- a/fs/f2fs/checkpoint.c +++ b/fs/f2fs/checkpoint.c @@ -785,9 +785,11 @@ void update_dirty_page(struct inode *inode, struct page *page) !S_ISLNK(inode->i_mode)) return; - spin_lock(>inode_lock[type]); - __add_dirty_inode(inode, type); - spin_unlock(>inode_lock[type]); + if (type != FILE_INODE || test_opt(sbi, DATA_FLUSH)) { + spin_lock(>inode_lock[type]); + __add_dirty_inode(inode, type); + spin_unlock(>inode_lock[type]); + } inode_inc_dirty_pages(inode); SetPagePrivate(page); @@ -803,6 +805,9 @@ void remove_dirty_inode(struct inode *inode) !S_ISLNK(inode->i_mode)) return; + if (type == FILE_INODE && !test_opt(sbi, DATA_FLUSH)) + return; + spin_lock(>inode_lock[type]); __remove_dirty_inode(inode, type); spin_unlock(>inode_lock[type]); -- 2.6.3
[PATCH 1/5] f2fs: manipulate dirty file inodes when DATA_FLUSH is set
It needs to maintain dirty file inodes only if DATA_FLUSH is set. Otherwise, let's avoid its overhead. Signed-off-by: Jaegeuk Kim --- fs/f2fs/checkpoint.c | 11 --- 1 file changed, 8 insertions(+), 3 deletions(-) diff --git a/fs/f2fs/checkpoint.c b/fs/f2fs/checkpoint.c index 3da6499..d04113b 100644 --- a/fs/f2fs/checkpoint.c +++ b/fs/f2fs/checkpoint.c @@ -785,9 +785,11 @@ void update_dirty_page(struct inode *inode, struct page *page) !S_ISLNK(inode->i_mode)) return; - spin_lock(>inode_lock[type]); - __add_dirty_inode(inode, type); - spin_unlock(>inode_lock[type]); + if (type != FILE_INODE || test_opt(sbi, DATA_FLUSH)) { + spin_lock(>inode_lock[type]); + __add_dirty_inode(inode, type); + spin_unlock(>inode_lock[type]); + } inode_inc_dirty_pages(inode); SetPagePrivate(page); @@ -803,6 +805,9 @@ void remove_dirty_inode(struct inode *inode) !S_ISLNK(inode->i_mode)) return; + if (type == FILE_INODE && !test_opt(sbi, DATA_FLUSH)) + return; + spin_lock(>inode_lock[type]); __remove_dirty_inode(inode, type); spin_unlock(>inode_lock[type]); -- 2.6.3
[PATCH 4/5] f2fs: use percpu_counter for alloc_valid_block_count
This patch uses percpu_count for sbi->alloc_valid_block_count. Signed-off-by: Jaegeuk Kim--- fs/f2fs/checkpoint.c | 2 +- fs/f2fs/f2fs.h | 8 +--- fs/f2fs/recovery.c | 5 +++-- fs/f2fs/super.c | 7 +-- 4 files changed, 14 insertions(+), 8 deletions(-) diff --git a/fs/f2fs/checkpoint.c b/fs/f2fs/checkpoint.c index d04113b..140643d 100644 --- a/fs/f2fs/checkpoint.c +++ b/fs/f2fs/checkpoint.c @@ -1079,7 +1079,7 @@ static int do_checkpoint(struct f2fs_sb_info *sbi, struct cp_control *cpc) /* update user_block_counts */ sbi->last_valid_block_count = sbi->total_valid_block_count; - sbi->alloc_valid_block_count = 0; + percpu_counter_set(>alloc_valid_block_count, 0); /* Here, we only have one bio having CP pack */ sync_meta_pages(sbi, META_FLUSH, LONG_MAX); diff --git a/fs/f2fs/f2fs.h b/fs/f2fs/f2fs.h index ae978a1..43aa692 100644 --- a/fs/f2fs/f2fs.h +++ b/fs/f2fs/f2fs.h @@ -829,13 +829,14 @@ struct f2fs_sb_info { block_t user_block_count; /* # of user blocks */ block_t total_valid_block_count;/* # of valid blocks */ - block_t alloc_valid_block_count;/* # of allocated blocks */ block_t discard_blks; /* discard command candidats */ block_t last_valid_block_count; /* for recovery */ u32 s_next_generation; /* for NFS support */ /* # of pages, see count_type */ struct percpu_counter nr_pages[NR_COUNT_TYPE]; + /* # of allocated blocks */ + struct percpu_counter alloc_valid_block_count; struct f2fs_mount_info mount_opt; /* mount options */ @@ -1161,8 +1162,9 @@ static inline bool inc_valid_block_count(struct f2fs_sb_info *sbi, inode->i_blocks += *count; sbi->total_valid_block_count = sbi->total_valid_block_count + (block_t)(*count); - sbi->alloc_valid_block_count += (block_t)(*count); spin_unlock(>stat_lock); + + percpu_counter_add(>alloc_valid_block_count, (*count)); return true; } @@ -1310,11 +1312,11 @@ static inline bool inc_valid_node_count(struct f2fs_sb_info *sbi, if (inode) inode->i_blocks++; - sbi->alloc_valid_block_count++; sbi->total_valid_node_count++; sbi->total_valid_block_count++; spin_unlock(>stat_lock); + percpu_counter_inc(>alloc_valid_block_count); return true; } diff --git a/fs/f2fs/recovery.c b/fs/f2fs/recovery.c index 6303b2a..f89b70e 100644 --- a/fs/f2fs/recovery.c +++ b/fs/f2fs/recovery.c @@ -49,8 +49,9 @@ static struct kmem_cache *fsync_entry_slab; bool space_for_roll_forward(struct f2fs_sb_info *sbi) { - if (sbi->last_valid_block_count + sbi->alloc_valid_block_count - > sbi->user_block_count) + s64 nalloc = percpu_counter_sum_positive(>alloc_valid_block_count); + + if (sbi->last_valid_block_count + nalloc > sbi->user_block_count) return false; return true; } diff --git a/fs/f2fs/super.c b/fs/f2fs/super.c index 56f177b..c0fd075 100644 --- a/fs/f2fs/super.c +++ b/fs/f2fs/super.c @@ -618,6 +618,7 @@ static void destroy_percpu_info(struct f2fs_sb_info *sbi) for (i = 0; i < NR_COUNT_TYPE; i++) percpu_counter_destroy(>nr_pages[i]); + percpu_counter_destroy(>alloc_valid_block_count); } static void f2fs_put_super(struct super_block *sb) @@ -1384,7 +1385,9 @@ static int init_percpu_info(struct f2fs_sb_info *sbi) if (err) return err; } - return 0; + + return percpu_counter_init(>alloc_valid_block_count, 0, + GFP_KERNEL); } /* @@ -1603,7 +1606,7 @@ try_onemore: sbi->total_valid_block_count = le64_to_cpu(sbi->ckpt->valid_block_count); sbi->last_valid_block_count = sbi->total_valid_block_count; - sbi->alloc_valid_block_count = 0; + for (i = 0; i < NR_INODE_TYPE; i++) { INIT_LIST_HEAD(>inode_list[i]); spin_lock_init(>inode_lock[i]); -- 2.6.3
[PATCH 5/5] f2fs: use percpu_counter for total_valid_inode_count
This patch uses percpu_counter to avoid stat_lock. Signed-off-by: Jaegeuk Kim--- fs/f2fs/f2fs.h | 18 +++--- fs/f2fs/super.c | 11 --- 2 files changed, 15 insertions(+), 14 deletions(-) diff --git a/fs/f2fs/f2fs.h b/fs/f2fs/f2fs.h index 43aa692..c965897 100644 --- a/fs/f2fs/f2fs.h +++ b/fs/f2fs/f2fs.h @@ -822,7 +822,6 @@ struct f2fs_sb_info { unsigned int total_sections;/* total section count */ unsigned int total_node_count; /* total node block count */ unsigned int total_valid_node_count;/* valid node block count */ - unsigned int total_valid_inode_count; /* valid inode count */ loff_t max_file_blocks; /* max block index of file */ int active_logs;/* # of active logs */ int dir_level; /* directory level */ @@ -838,6 +837,9 @@ struct f2fs_sb_info { /* # of allocated blocks */ struct percpu_counter alloc_valid_block_count; + /* valid inode count */ + struct percpu_counter total_valid_inode_count; + struct f2fs_mount_info mount_opt; /* mount options */ /* for cleaning operations */ @@ -1343,23 +1345,17 @@ static inline unsigned int valid_node_count(struct f2fs_sb_info *sbi) static inline void inc_valid_inode_count(struct f2fs_sb_info *sbi) { - spin_lock(>stat_lock); - f2fs_bug_on(sbi, sbi->total_valid_inode_count == sbi->total_node_count); - sbi->total_valid_inode_count++; - spin_unlock(>stat_lock); + percpu_counter_inc(>total_valid_inode_count); } static inline void dec_valid_inode_count(struct f2fs_sb_info *sbi) { - spin_lock(>stat_lock); - f2fs_bug_on(sbi, !sbi->total_valid_inode_count); - sbi->total_valid_inode_count--; - spin_unlock(>stat_lock); + percpu_counter_dec(>total_valid_inode_count); } -static inline unsigned int valid_inode_count(struct f2fs_sb_info *sbi) +static inline s64 valid_inode_count(struct f2fs_sb_info *sbi) { - return sbi->total_valid_inode_count; + return percpu_counter_sum_positive(>total_valid_inode_count); } static inline struct page *f2fs_grab_cache_page(struct address_space *mapping, diff --git a/fs/f2fs/super.c b/fs/f2fs/super.c index c0fd075..bbfeb6a 100644 --- a/fs/f2fs/super.c +++ b/fs/f2fs/super.c @@ -619,6 +619,7 @@ static void destroy_percpu_info(struct f2fs_sb_info *sbi) for (i = 0; i < NR_COUNT_TYPE; i++) percpu_counter_destroy(>nr_pages[i]); percpu_counter_destroy(>alloc_valid_block_count); + percpu_counter_destroy(>total_valid_inode_count); } static void f2fs_put_super(struct super_block *sb) @@ -1386,7 +1387,11 @@ static int init_percpu_info(struct f2fs_sb_info *sbi) return err; } - return percpu_counter_init(>alloc_valid_block_count, 0, + err = percpu_counter_init(>alloc_valid_block_count, 0, GFP_KERNEL); + if (err) + return err; + + return percpu_counter_init(>total_valid_inode_count, 0, GFP_KERNEL); } @@ -1600,8 +1605,8 @@ try_onemore: sbi->total_valid_node_count = le32_to_cpu(sbi->ckpt->valid_node_count); - sbi->total_valid_inode_count = - le32_to_cpu(sbi->ckpt->valid_inode_count); + percpu_counter_set(>total_valid_inode_count, + le32_to_cpu(sbi->ckpt->valid_inode_count)); sbi->user_block_count = le64_to_cpu(sbi->ckpt->user_block_count); sbi->total_valid_block_count = le64_to_cpu(sbi->ckpt->valid_block_count); -- 2.6.3
[PATCH 3/5] f2fs: use percpu_counter for # of dirty pages in inode
This patch adds percpu_counter for # of dirty pages in inode. Signed-off-by: Jaegeuk Kim--- fs/f2fs/f2fs.h | 10 +- fs/f2fs/file.c | 2 +- fs/f2fs/super.c | 8 +++- 3 files changed, 13 insertions(+), 7 deletions(-) diff --git a/fs/f2fs/f2fs.h b/fs/f2fs/f2fs.h index 2851078..ae978a1 100644 --- a/fs/f2fs/f2fs.h +++ b/fs/f2fs/f2fs.h @@ -456,7 +456,7 @@ struct f2fs_inode_info { /* Use below internally in f2fs*/ unsigned long flags;/* use to pass per-file flags */ struct rw_semaphore i_sem; /* protect fi info */ - atomic_t dirty_pages; /* # of dirty pages */ + struct percpu_counter dirty_pages; /* # of dirty pages */ f2fs_hash_t chash; /* hash value of given file name */ unsigned int clevel;/* maximum level of given file name */ nid_t i_xattr_nid; /* node id that contains xattrs */ @@ -1186,7 +1186,7 @@ static inline void inc_page_count(struct f2fs_sb_info *sbi, int count_type) static inline void inode_inc_dirty_pages(struct inode *inode) { - atomic_inc(_I(inode)->dirty_pages); + percpu_counter_inc(_I(inode)->dirty_pages); inc_page_count(F2FS_I_SB(inode), S_ISDIR(inode->i_mode) ? F2FS_DIRTY_DENTS : F2FS_DIRTY_DATA); } @@ -1202,7 +1202,7 @@ static inline void inode_dec_dirty_pages(struct inode *inode) !S_ISLNK(inode->i_mode)) return; - atomic_dec(_I(inode)->dirty_pages); + percpu_counter_dec(_I(inode)->dirty_pages); dec_page_count(F2FS_I_SB(inode), S_ISDIR(inode->i_mode) ? F2FS_DIRTY_DENTS : F2FS_DIRTY_DATA); } @@ -1212,9 +1212,9 @@ static inline s64 get_pages(struct f2fs_sb_info *sbi, int count_type) return percpu_counter_sum_positive(>nr_pages[count_type]); } -static inline int get_dirty_pages(struct inode *inode) +static inline s64 get_dirty_pages(struct inode *inode) { - return atomic_read(_I(inode)->dirty_pages); + return percpu_counter_sum_positive(_I(inode)->dirty_pages); } static inline int get_blocktype_secs(struct f2fs_sb_info *sbi, int block_type) diff --git a/fs/f2fs/file.c b/fs/f2fs/file.c index b5e3999..69451b83 100644 --- a/fs/f2fs/file.c +++ b/fs/f2fs/file.c @@ -1457,7 +1457,7 @@ static int f2fs_ioc_start_atomic_write(struct file *filp) goto out; f2fs_msg(F2FS_I_SB(inode)->sb, KERN_WARNING, - "Unexpected flush for atomic writes: ino=%lu, npages=%u", + "Unexpected flush for atomic writes: ino=%lu, npages=%lld", inode->i_ino, get_dirty_pages(inode)); ret = filemap_write_and_wait_range(inode->i_mapping, 0, LLONG_MAX); if (ret) diff --git a/fs/f2fs/super.c b/fs/f2fs/super.c index b3030ed..56f177b 100644 --- a/fs/f2fs/super.c +++ b/fs/f2fs/super.c @@ -526,9 +526,13 @@ static struct inode *f2fs_alloc_inode(struct super_block *sb) init_once((void *) fi); + if (percpu_counter_init(>dirty_pages, 0, GFP_NOFS)) { + kmem_cache_free(f2fs_inode_cachep, fi); + return NULL; + } + /* Initialize f2fs-specific inode info */ fi->vfs_inode.i_version = 1; - atomic_set(>dirty_pages, 0); fi->i_current_depth = 1; fi->i_advise = 0; init_rwsem(>i_sem); @@ -598,6 +602,8 @@ static void f2fs_dirty_inode(struct inode *inode, int flags) static void f2fs_i_callback(struct rcu_head *head) { struct inode *inode = container_of(head, struct inode, i_rcu); + + percpu_counter_destroy(_I(inode)->dirty_pages); kmem_cache_free(f2fs_inode_cachep, F2FS_I(inode)); } -- 2.6.3
[PATCH 4/5] f2fs: use percpu_counter for alloc_valid_block_count
This patch uses percpu_count for sbi->alloc_valid_block_count. Signed-off-by: Jaegeuk Kim --- fs/f2fs/checkpoint.c | 2 +- fs/f2fs/f2fs.h | 8 +--- fs/f2fs/recovery.c | 5 +++-- fs/f2fs/super.c | 7 +-- 4 files changed, 14 insertions(+), 8 deletions(-) diff --git a/fs/f2fs/checkpoint.c b/fs/f2fs/checkpoint.c index d04113b..140643d 100644 --- a/fs/f2fs/checkpoint.c +++ b/fs/f2fs/checkpoint.c @@ -1079,7 +1079,7 @@ static int do_checkpoint(struct f2fs_sb_info *sbi, struct cp_control *cpc) /* update user_block_counts */ sbi->last_valid_block_count = sbi->total_valid_block_count; - sbi->alloc_valid_block_count = 0; + percpu_counter_set(>alloc_valid_block_count, 0); /* Here, we only have one bio having CP pack */ sync_meta_pages(sbi, META_FLUSH, LONG_MAX); diff --git a/fs/f2fs/f2fs.h b/fs/f2fs/f2fs.h index ae978a1..43aa692 100644 --- a/fs/f2fs/f2fs.h +++ b/fs/f2fs/f2fs.h @@ -829,13 +829,14 @@ struct f2fs_sb_info { block_t user_block_count; /* # of user blocks */ block_t total_valid_block_count;/* # of valid blocks */ - block_t alloc_valid_block_count;/* # of allocated blocks */ block_t discard_blks; /* discard command candidats */ block_t last_valid_block_count; /* for recovery */ u32 s_next_generation; /* for NFS support */ /* # of pages, see count_type */ struct percpu_counter nr_pages[NR_COUNT_TYPE]; + /* # of allocated blocks */ + struct percpu_counter alloc_valid_block_count; struct f2fs_mount_info mount_opt; /* mount options */ @@ -1161,8 +1162,9 @@ static inline bool inc_valid_block_count(struct f2fs_sb_info *sbi, inode->i_blocks += *count; sbi->total_valid_block_count = sbi->total_valid_block_count + (block_t)(*count); - sbi->alloc_valid_block_count += (block_t)(*count); spin_unlock(>stat_lock); + + percpu_counter_add(>alloc_valid_block_count, (*count)); return true; } @@ -1310,11 +1312,11 @@ static inline bool inc_valid_node_count(struct f2fs_sb_info *sbi, if (inode) inode->i_blocks++; - sbi->alloc_valid_block_count++; sbi->total_valid_node_count++; sbi->total_valid_block_count++; spin_unlock(>stat_lock); + percpu_counter_inc(>alloc_valid_block_count); return true; } diff --git a/fs/f2fs/recovery.c b/fs/f2fs/recovery.c index 6303b2a..f89b70e 100644 --- a/fs/f2fs/recovery.c +++ b/fs/f2fs/recovery.c @@ -49,8 +49,9 @@ static struct kmem_cache *fsync_entry_slab; bool space_for_roll_forward(struct f2fs_sb_info *sbi) { - if (sbi->last_valid_block_count + sbi->alloc_valid_block_count - > sbi->user_block_count) + s64 nalloc = percpu_counter_sum_positive(>alloc_valid_block_count); + + if (sbi->last_valid_block_count + nalloc > sbi->user_block_count) return false; return true; } diff --git a/fs/f2fs/super.c b/fs/f2fs/super.c index 56f177b..c0fd075 100644 --- a/fs/f2fs/super.c +++ b/fs/f2fs/super.c @@ -618,6 +618,7 @@ static void destroy_percpu_info(struct f2fs_sb_info *sbi) for (i = 0; i < NR_COUNT_TYPE; i++) percpu_counter_destroy(>nr_pages[i]); + percpu_counter_destroy(>alloc_valid_block_count); } static void f2fs_put_super(struct super_block *sb) @@ -1384,7 +1385,9 @@ static int init_percpu_info(struct f2fs_sb_info *sbi) if (err) return err; } - return 0; + + return percpu_counter_init(>alloc_valid_block_count, 0, + GFP_KERNEL); } /* @@ -1603,7 +1606,7 @@ try_onemore: sbi->total_valid_block_count = le64_to_cpu(sbi->ckpt->valid_block_count); sbi->last_valid_block_count = sbi->total_valid_block_count; - sbi->alloc_valid_block_count = 0; + for (i = 0; i < NR_INODE_TYPE; i++) { INIT_LIST_HEAD(>inode_list[i]); spin_lock_init(>inode_lock[i]); -- 2.6.3
[PATCH 5/5] f2fs: use percpu_counter for total_valid_inode_count
This patch uses percpu_counter to avoid stat_lock. Signed-off-by: Jaegeuk Kim --- fs/f2fs/f2fs.h | 18 +++--- fs/f2fs/super.c | 11 --- 2 files changed, 15 insertions(+), 14 deletions(-) diff --git a/fs/f2fs/f2fs.h b/fs/f2fs/f2fs.h index 43aa692..c965897 100644 --- a/fs/f2fs/f2fs.h +++ b/fs/f2fs/f2fs.h @@ -822,7 +822,6 @@ struct f2fs_sb_info { unsigned int total_sections;/* total section count */ unsigned int total_node_count; /* total node block count */ unsigned int total_valid_node_count;/* valid node block count */ - unsigned int total_valid_inode_count; /* valid inode count */ loff_t max_file_blocks; /* max block index of file */ int active_logs;/* # of active logs */ int dir_level; /* directory level */ @@ -838,6 +837,9 @@ struct f2fs_sb_info { /* # of allocated blocks */ struct percpu_counter alloc_valid_block_count; + /* valid inode count */ + struct percpu_counter total_valid_inode_count; + struct f2fs_mount_info mount_opt; /* mount options */ /* for cleaning operations */ @@ -1343,23 +1345,17 @@ static inline unsigned int valid_node_count(struct f2fs_sb_info *sbi) static inline void inc_valid_inode_count(struct f2fs_sb_info *sbi) { - spin_lock(>stat_lock); - f2fs_bug_on(sbi, sbi->total_valid_inode_count == sbi->total_node_count); - sbi->total_valid_inode_count++; - spin_unlock(>stat_lock); + percpu_counter_inc(>total_valid_inode_count); } static inline void dec_valid_inode_count(struct f2fs_sb_info *sbi) { - spin_lock(>stat_lock); - f2fs_bug_on(sbi, !sbi->total_valid_inode_count); - sbi->total_valid_inode_count--; - spin_unlock(>stat_lock); + percpu_counter_dec(>total_valid_inode_count); } -static inline unsigned int valid_inode_count(struct f2fs_sb_info *sbi) +static inline s64 valid_inode_count(struct f2fs_sb_info *sbi) { - return sbi->total_valid_inode_count; + return percpu_counter_sum_positive(>total_valid_inode_count); } static inline struct page *f2fs_grab_cache_page(struct address_space *mapping, diff --git a/fs/f2fs/super.c b/fs/f2fs/super.c index c0fd075..bbfeb6a 100644 --- a/fs/f2fs/super.c +++ b/fs/f2fs/super.c @@ -619,6 +619,7 @@ static void destroy_percpu_info(struct f2fs_sb_info *sbi) for (i = 0; i < NR_COUNT_TYPE; i++) percpu_counter_destroy(>nr_pages[i]); percpu_counter_destroy(>alloc_valid_block_count); + percpu_counter_destroy(>total_valid_inode_count); } static void f2fs_put_super(struct super_block *sb) @@ -1386,7 +1387,11 @@ static int init_percpu_info(struct f2fs_sb_info *sbi) return err; } - return percpu_counter_init(>alloc_valid_block_count, 0, + err = percpu_counter_init(>alloc_valid_block_count, 0, GFP_KERNEL); + if (err) + return err; + + return percpu_counter_init(>total_valid_inode_count, 0, GFP_KERNEL); } @@ -1600,8 +1605,8 @@ try_onemore: sbi->total_valid_node_count = le32_to_cpu(sbi->ckpt->valid_node_count); - sbi->total_valid_inode_count = - le32_to_cpu(sbi->ckpt->valid_inode_count); + percpu_counter_set(>total_valid_inode_count, + le32_to_cpu(sbi->ckpt->valid_inode_count)); sbi->user_block_count = le64_to_cpu(sbi->ckpt->user_block_count); sbi->total_valid_block_count = le64_to_cpu(sbi->ckpt->valid_block_count); -- 2.6.3
[PATCH 3/5] f2fs: use percpu_counter for # of dirty pages in inode
This patch adds percpu_counter for # of dirty pages in inode. Signed-off-by: Jaegeuk Kim --- fs/f2fs/f2fs.h | 10 +- fs/f2fs/file.c | 2 +- fs/f2fs/super.c | 8 +++- 3 files changed, 13 insertions(+), 7 deletions(-) diff --git a/fs/f2fs/f2fs.h b/fs/f2fs/f2fs.h index 2851078..ae978a1 100644 --- a/fs/f2fs/f2fs.h +++ b/fs/f2fs/f2fs.h @@ -456,7 +456,7 @@ struct f2fs_inode_info { /* Use below internally in f2fs*/ unsigned long flags;/* use to pass per-file flags */ struct rw_semaphore i_sem; /* protect fi info */ - atomic_t dirty_pages; /* # of dirty pages */ + struct percpu_counter dirty_pages; /* # of dirty pages */ f2fs_hash_t chash; /* hash value of given file name */ unsigned int clevel;/* maximum level of given file name */ nid_t i_xattr_nid; /* node id that contains xattrs */ @@ -1186,7 +1186,7 @@ static inline void inc_page_count(struct f2fs_sb_info *sbi, int count_type) static inline void inode_inc_dirty_pages(struct inode *inode) { - atomic_inc(_I(inode)->dirty_pages); + percpu_counter_inc(_I(inode)->dirty_pages); inc_page_count(F2FS_I_SB(inode), S_ISDIR(inode->i_mode) ? F2FS_DIRTY_DENTS : F2FS_DIRTY_DATA); } @@ -1202,7 +1202,7 @@ static inline void inode_dec_dirty_pages(struct inode *inode) !S_ISLNK(inode->i_mode)) return; - atomic_dec(_I(inode)->dirty_pages); + percpu_counter_dec(_I(inode)->dirty_pages); dec_page_count(F2FS_I_SB(inode), S_ISDIR(inode->i_mode) ? F2FS_DIRTY_DENTS : F2FS_DIRTY_DATA); } @@ -1212,9 +1212,9 @@ static inline s64 get_pages(struct f2fs_sb_info *sbi, int count_type) return percpu_counter_sum_positive(>nr_pages[count_type]); } -static inline int get_dirty_pages(struct inode *inode) +static inline s64 get_dirty_pages(struct inode *inode) { - return atomic_read(_I(inode)->dirty_pages); + return percpu_counter_sum_positive(_I(inode)->dirty_pages); } static inline int get_blocktype_secs(struct f2fs_sb_info *sbi, int block_type) diff --git a/fs/f2fs/file.c b/fs/f2fs/file.c index b5e3999..69451b83 100644 --- a/fs/f2fs/file.c +++ b/fs/f2fs/file.c @@ -1457,7 +1457,7 @@ static int f2fs_ioc_start_atomic_write(struct file *filp) goto out; f2fs_msg(F2FS_I_SB(inode)->sb, KERN_WARNING, - "Unexpected flush for atomic writes: ino=%lu, npages=%u", + "Unexpected flush for atomic writes: ino=%lu, npages=%lld", inode->i_ino, get_dirty_pages(inode)); ret = filemap_write_and_wait_range(inode->i_mapping, 0, LLONG_MAX); if (ret) diff --git a/fs/f2fs/super.c b/fs/f2fs/super.c index b3030ed..56f177b 100644 --- a/fs/f2fs/super.c +++ b/fs/f2fs/super.c @@ -526,9 +526,13 @@ static struct inode *f2fs_alloc_inode(struct super_block *sb) init_once((void *) fi); + if (percpu_counter_init(>dirty_pages, 0, GFP_NOFS)) { + kmem_cache_free(f2fs_inode_cachep, fi); + return NULL; + } + /* Initialize f2fs-specific inode info */ fi->vfs_inode.i_version = 1; - atomic_set(>dirty_pages, 0); fi->i_current_depth = 1; fi->i_advise = 0; init_rwsem(>i_sem); @@ -598,6 +602,8 @@ static void f2fs_dirty_inode(struct inode *inode, int flags) static void f2fs_i_callback(struct rcu_head *head) { struct inode *inode = container_of(head, struct inode, i_rcu); + + percpu_counter_destroy(_I(inode)->dirty_pages); kmem_cache_free(f2fs_inode_cachep, F2FS_I(inode)); } -- 2.6.3
Re: [PATCH] Use MICRO UINT_MAX instead of actual value
On 04/30/2016 07:13 AM, Minfei Huang wrote: Ping. Any comment is appreciate. Hi Minfei, I guess a good idea would be to resend the patch with the typo fixed, as a v2 patch. What do you think? Cheers, Guilherme Thanks Minfei On 04/25/16 at 11:13P, Minfei Huang wrote: It's more elegant to use MICRO UINT_MAX to represent the max value of type unsigned int. So replace the actual value by using this MICRO. Signed-off-by: Minfei Huang--- drivers/nvme/host/core.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/drivers/nvme/host/core.c b/drivers/nvme/host/core.c index 643f457..2c0bb13 100644 --- a/drivers/nvme/host/core.c +++ b/drivers/nvme/host/core.c @@ -597,7 +597,7 @@ static void nvme_config_discard(struct nvme_ns *ns) ns->queue->limits.discard_alignment = logical_block_size; ns->queue->limits.discard_granularity = logical_block_size; - blk_queue_max_discard_sectors(ns->queue, 0x); + blk_queue_max_discard_sectors(ns->queue, UINT_MAX); queue_flag_set_unlocked(QUEUE_FLAG_DISCARD, ns->queue); } -- 2.6.3 ___ Linux-nvme mailing list linux-n...@lists.infradead.org http://lists.infradead.org/mailman/listinfo/linux-nvme
Re: [GIT PULL] extcon next for 4.7
Dear Greg, On 2016년 05월 16일 21:35, Greg KH wrote: > On Tue, May 10, 2016 at 10:47:23AM +0900, Chanwoo Choi wrote: >> Dear Greg, >> >> I'm sorry for late pull request. This is extcon-next pull request for v4.7. >> I add detailed description of this pull request on below. >> Please pull extcon with following updates. >> >> Best Regards, >> Chanwoo Choi >> >> The following changes since commit c3b46c73264b03000d1e18b22f5caf63332547c9: >> >> Linux 4.6-rc4 (2016-04-17 19:13:32 -0700) > > Ugh, I missed this in time for 4.7-rc1. Can you resend this after > 4.7-rc1 is out if it is fixes that are needed for that release? Sure. I'll resend this pull request for 4.7-rc2 after releasing 4.7-rc1. Best Regards, Chanwoo CHoi
Re: [PATCH] Use MICRO UINT_MAX instead of actual value
On 04/30/2016 07:13 AM, Minfei Huang wrote: Ping. Any comment is appreciate. Hi Minfei, I guess a good idea would be to resend the patch with the typo fixed, as a v2 patch. What do you think? Cheers, Guilherme Thanks Minfei On 04/25/16 at 11:13P, Minfei Huang wrote: It's more elegant to use MICRO UINT_MAX to represent the max value of type unsigned int. So replace the actual value by using this MICRO. Signed-off-by: Minfei Huang --- drivers/nvme/host/core.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/drivers/nvme/host/core.c b/drivers/nvme/host/core.c index 643f457..2c0bb13 100644 --- a/drivers/nvme/host/core.c +++ b/drivers/nvme/host/core.c @@ -597,7 +597,7 @@ static void nvme_config_discard(struct nvme_ns *ns) ns->queue->limits.discard_alignment = logical_block_size; ns->queue->limits.discard_granularity = logical_block_size; - blk_queue_max_discard_sectors(ns->queue, 0x); + blk_queue_max_discard_sectors(ns->queue, UINT_MAX); queue_flag_set_unlocked(QUEUE_FLAG_DISCARD, ns->queue); } -- 2.6.3 ___ Linux-nvme mailing list linux-n...@lists.infradead.org http://lists.infradead.org/mailman/listinfo/linux-nvme
Re: [GIT PULL] extcon next for 4.7
Dear Greg, On 2016년 05월 16일 21:35, Greg KH wrote: > On Tue, May 10, 2016 at 10:47:23AM +0900, Chanwoo Choi wrote: >> Dear Greg, >> >> I'm sorry for late pull request. This is extcon-next pull request for v4.7. >> I add detailed description of this pull request on below. >> Please pull extcon with following updates. >> >> Best Regards, >> Chanwoo Choi >> >> The following changes since commit c3b46c73264b03000d1e18b22f5caf63332547c9: >> >> Linux 4.6-rc4 (2016-04-17 19:13:32 -0700) > > Ugh, I missed this in time for 4.7-rc1. Can you resend this after > 4.7-rc1 is out if it is fixes that are needed for that release? Sure. I'll resend this pull request for 4.7-rc2 after releasing 4.7-rc1. Best Regards, Chanwoo CHoi
Re: linux-next: manual merge of the f2fs tree with the ext4 tree
On Mon, May 16, 2016 at 05:30:28PM -0400, Theodore Ts'o wrote: > On Mon, May 09, 2016 at 10:15:03AM -0700, Jaegeuk Kim wrote: > > Hi Stephen, > > > > Thank you for the notice. > > I've been waiting for a comment about the below patch targeted to v4.7 from > > Ted. > > Meanwhile, I intended to prepare -next for that patch in advance. > > Surely, once I get a sense that I need to consider v4.8, I'll drop this > > patch > > for -next right away. > > Yeah, I think it would be better for me to take the ext4 "migrate into > vfs's crypto engine" patch, and at this point, for 4.8. Fair enough. > Sorry, I just ran out of time to try to verify that the patch wouldn't > break anything, and given that we're going to need to wait for > "fscrypto/f2fs: allow fs-specific key prefix for fs encryption" to go > upstream. Agreed. IIUC, let me push the fscrypto/f2fs patch to v4.7 first? > Do you have any other planned changes for the fscrypto tree planned > for 4.8. If not, then perhaps it will be easier if I take the pen for > any changes needed for fs/crypto, and moving forward, we probably need > to find ways to make changes where commits specific for fs/crypto > should be isolated from ext4 or f2fs changes, and as much as possible > to be backwards compatible so that as we add new features to > fs/crypto, we don't need to synchronize changes across multiple file > systems. I have no planned patch right now, and of course, it must have no problem for you to treat with further patches. Also, let me take a look at any missing part again, regarding to your concerns. Thanks,
Re: linux-next: manual merge of the f2fs tree with the ext4 tree
On Mon, May 16, 2016 at 05:30:28PM -0400, Theodore Ts'o wrote: > On Mon, May 09, 2016 at 10:15:03AM -0700, Jaegeuk Kim wrote: > > Hi Stephen, > > > > Thank you for the notice. > > I've been waiting for a comment about the below patch targeted to v4.7 from > > Ted. > > Meanwhile, I intended to prepare -next for that patch in advance. > > Surely, once I get a sense that I need to consider v4.8, I'll drop this > > patch > > for -next right away. > > Yeah, I think it would be better for me to take the ext4 "migrate into > vfs's crypto engine" patch, and at this point, for 4.8. Fair enough. > Sorry, I just ran out of time to try to verify that the patch wouldn't > break anything, and given that we're going to need to wait for > "fscrypto/f2fs: allow fs-specific key prefix for fs encryption" to go > upstream. Agreed. IIUC, let me push the fscrypto/f2fs patch to v4.7 first? > Do you have any other planned changes for the fscrypto tree planned > for 4.8. If not, then perhaps it will be easier if I take the pen for > any changes needed for fs/crypto, and moving forward, we probably need > to find ways to make changes where commits specific for fs/crypto > should be isolated from ext4 or f2fs changes, and as much as possible > to be backwards compatible so that as we add new features to > fs/crypto, we don't need to synchronize changes across multiple file > systems. I have no planned patch right now, and of course, it must have no problem for you to treat with further patches. Also, let me take a look at any missing part again, regarding to your concerns. Thanks,
Re: [PATCH] ftrace/x86: Fix function graph tracer reset path
On Mon, 16 May 2016 09:58:33 -0400 Steven Rostedtwrote: > > Nice work Masami! > > On Mon, 16 May 2016 21:32:50 +0900 > Namhyung Kim wrote: > > > > > -/* This is global to keep gas from relaxing the jumps */ > > > -ENTRY(early_idt_handler) > > > +/* This is weak to keep gas from relaxing the jumps */ > > > +WEAK(early_idt_handler) > > > cld > > > So IIUC, this -mshared option disables the optimization on branch > > instructions and generates slightly bigger code. Not sure how much > > affected by this though. > > > > Steve, do you think it's better to use this option? > > Can we solve this by doing the same thing it did for the kernel? Looks good to me :) Reviewed-by: Masami Hiramatsu Thanks! > > -- Steve > > diff --git a/arch/x86/kernel/mcount_64.S b/arch/x86/kernel/mcount_64.S > index ed48a9f465f8..e13a695c3084 100644 > --- a/arch/x86/kernel/mcount_64.S > +++ b/arch/x86/kernel/mcount_64.S > @@ -182,7 +182,8 @@ GLOBAL(ftrace_graph_call) > jmp ftrace_stub > #endif > > -GLOBAL(ftrace_stub) > +/* This is weak to keep gas from relaxing the jumps */ > +WEAK(ftrace_stub) > retq > END(ftrace_caller) > -- Masami Hiramatsu
Re: [PATCH] ftrace/x86: Fix function graph tracer reset path
On Mon, 16 May 2016 09:58:33 -0400 Steven Rostedt wrote: > > Nice work Masami! > > On Mon, 16 May 2016 21:32:50 +0900 > Namhyung Kim wrote: > > > > > -/* This is global to keep gas from relaxing the jumps */ > > > -ENTRY(early_idt_handler) > > > +/* This is weak to keep gas from relaxing the jumps */ > > > +WEAK(early_idt_handler) > > > cld > > > So IIUC, this -mshared option disables the optimization on branch > > instructions and generates slightly bigger code. Not sure how much > > affected by this though. > > > > Steve, do you think it's better to use this option? > > Can we solve this by doing the same thing it did for the kernel? Looks good to me :) Reviewed-by: Masami Hiramatsu Thanks! > > -- Steve > > diff --git a/arch/x86/kernel/mcount_64.S b/arch/x86/kernel/mcount_64.S > index ed48a9f465f8..e13a695c3084 100644 > --- a/arch/x86/kernel/mcount_64.S > +++ b/arch/x86/kernel/mcount_64.S > @@ -182,7 +182,8 @@ GLOBAL(ftrace_graph_call) > jmp ftrace_stub > #endif > > -GLOBAL(ftrace_stub) > +/* This is weak to keep gas from relaxing the jumps */ > +WEAK(ftrace_stub) > retq > END(ftrace_caller) > -- Masami Hiramatsu
Re: [RFC][PATCH] printk: Add option to append kernel version to the dict
On Sunday 05/15 at 15:36 +0900, Sergey Senozhatsky wrote: > Hello, > > On (05/13/16 13:58), Calvin Owens wrote: > [..] > > +#if defined(CONFIG_PRINTK_APPEND_UNAME) > > +static ssize_t msg_print_ext_uname(char *buf, size_t size) > > +{ > > + return scnprintf(buf, size, " UNAME=%s\n", init_utsname()->release); > > +} > > +#else > > +static ssize_t msg_print_ext_uname(char *buf, size_t size) > > +{ > > + return 0; > > +} > > +#endif > > + > > /* /dev/kmsg - userspace message inject/listen interface */ > > struct devkmsg_user { > > u64 seq; > > @@ -2305,6 +2317,8 @@ skip: > > sizeof(ext_text) - ext_len, > > log_dict(msg), msg->dict_len, > > log_text(msg), msg->text_len); > > + ext_len += msg_print_ext_uname(ext_text + ext_len, > > + sizeof(ext_text) - ext_len); > > } > > what if there is no place left for init_utsname() after > msg_print_ext_header() + msg_print_ext_body()? It ends up being truncated, like either of the preceeding calls would. > do you need init_utsname in every message? or just in WARN/ERR ones? I need it added to anything which netconsole actually emits: so I could filter based on loglevel here if you like. Thanks, Calvin > > -ss
Re: [RFC][PATCH] printk: Add option to append kernel version to the dict
On Sunday 05/15 at 15:36 +0900, Sergey Senozhatsky wrote: > Hello, > > On (05/13/16 13:58), Calvin Owens wrote: > [..] > > +#if defined(CONFIG_PRINTK_APPEND_UNAME) > > +static ssize_t msg_print_ext_uname(char *buf, size_t size) > > +{ > > + return scnprintf(buf, size, " UNAME=%s\n", init_utsname()->release); > > +} > > +#else > > +static ssize_t msg_print_ext_uname(char *buf, size_t size) > > +{ > > + return 0; > > +} > > +#endif > > + > > /* /dev/kmsg - userspace message inject/listen interface */ > > struct devkmsg_user { > > u64 seq; > > @@ -2305,6 +2317,8 @@ skip: > > sizeof(ext_text) - ext_len, > > log_dict(msg), msg->dict_len, > > log_text(msg), msg->text_len); > > + ext_len += msg_print_ext_uname(ext_text + ext_len, > > + sizeof(ext_text) - ext_len); > > } > > what if there is no place left for init_utsname() after > msg_print_ext_header() + msg_print_ext_body()? It ends up being truncated, like either of the preceeding calls would. > do you need init_utsname in every message? or just in WARN/ERR ones? I need it added to anything which netconsole actually emits: so I could filter based on loglevel here if you like. Thanks, Calvin > > -ss
Re: [PATCH v9] mm: kasan: Initial memory quarantine implementation
Hi Alexander, On Wed, May 11, 2016 at 6:18 PM, Alexander Potapenkowrote: > Quarantine isolates freed objects in a separate queue. The objects are > returned to the allocator later, which helps to detect use-after-free > errors. > > Freed objects are first added to per-cpu quarantine queues. > When a cache is destroyed or memory shrinking is requested, the objects > are moved into the global quarantine queue. Whenever a kmalloc call > allows memory reclaiming, the oldest objects are popped out of the > global queue until the total size of objects in quarantine is less than > 3/4 of the maximum quarantine size (which is a fraction of installed > physical memory). > > As long as an object remains in the quarantine, KASAN is able to report > accesses to it, so the chance of reporting a use-after-free is increased. > Once the object leaves quarantine, the allocator may reuse it, in which > case the object is unpoisoned and KASAN can't detect incorrect accesses > to it. > > Right now quarantine support is only enabled in SLAB allocator. > Unification of KASAN features in SLAB and SLUB will be done later. > > This patch is based on the "mm: kasan: quarantine" patch originally > prepared by Dmitry Chernenkov. A number of improvements have been > suggested by Andrey Ryabinin. > > Signed-off-by: Alexander Potapenko > --- > v2: - added copyright comments > - per request from Joonsoo Kim made __cache_free() more straightforward > - added comments for smp_load_acquire()/smp_store_release() > > v3: - incorporate changes introduced by the "mm, kasan: SLAB support" patch > > v4: - fix kbuild compile-time error (missing ___cache_free() declaration) > and a warning (wrong format specifier) > > v6: - extended the patch description > - dropped the unused qlist_remove() function > > v9: - incorporate the fixes by Andrey Ryabinin: > * Fix comment styles, > * Get rid of some ifdefs > * Revert needless functions renames in quarantine patch > * Remove needless local_irq_save()/restore() in > per_cpu_remove_cache() > * Add new 'struct qlist_node' instead of 'void **' types. This makes > code a bit more redable. > - remove the non-deterministic quarantine test > - dropped smp_load_acquire()/smp_store_release() > --- > include/linux/kasan.h | 13 ++- > mm/kasan/Makefile | 1 + > mm/kasan/kasan.c | 57 -- > mm/kasan/kasan.h | 21 +++- > mm/kasan/quarantine.c | 291 > ++ > mm/kasan/report.c | 1 + > mm/mempool.c | 2 +- > mm/slab.c | 12 ++- > mm/slab.h | 2 + > mm/slab_common.c | 2 + > 10 files changed, 387 insertions(+), 15 deletions(-) > create mode 100644 mm/kasan/quarantine.c > > diff --git a/include/linux/kasan.h b/include/linux/kasan.h > index 737371b..611927f 100644 > --- a/include/linux/kasan.h > +++ b/include/linux/kasan.h > @@ -50,6 +50,8 @@ void kasan_free_pages(struct page *page, unsigned int > order); > [...] > diff --git a/mm/kasan/quarantine.c b/mm/kasan/quarantine.c > new file mode 100644 > index 000..4973505 > --- /dev/null > +++ b/mm/kasan/quarantine.c > @@ -0,0 +1,291 @@ > +/* > + * KASAN quarantine. > + * > + * Author: Alexander Potapenko > + * Copyright (C) 2016 Google, Inc. > + * > + * Based on code by Dmitry Chernenkov. > + * > + * This program is free software; you can redistribute it and/or > + * modify it under the terms of the GNU General Public License > + * version 2 as published by the Free Software Foundation. > + * > + * This program is distributed in the hope that it will be useful, but > + * WITHOUT ANY WARRANTY; without even the implied warranty of > + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU > + * General Public License for more details. > + * > + */ > + > +#include > +#include > +#include > +#include > +#include > +#include > +#include > +#include > +#include > +#include > + > +#include "../slab.h" > +#include "kasan.h" > + > +/* Data structure and operations for quarantine queues. */ > + > +/* > + * Each queue is a signle-linked list, which also stores the total size of > + * objects inside of it. > + */ > +struct qlist_head { > + struct qlist_node *head; > + struct qlist_node *tail; > + size_t bytes; > +}; > + > +#define QLIST_INIT { NULL, NULL, 0 } > + > +static bool qlist_empty(struct qlist_head *q) > +{ > + return !q->head; > +} > + > +static void qlist_init(struct qlist_head *q) > +{ > + q->head = q->tail = NULL; > + q->bytes = 0; > +} > + > +static void qlist_put(struct qlist_head *q, struct qlist_node *qlink, > + size_t size) > +{ > + if (unlikely(qlist_empty(q))) > + q->head = qlink; > + else > + q->tail->next = qlink; > + q->tail = qlink; > + qlink->next = NULL; > + q->bytes +=
Re: [PATCH v9] mm: kasan: Initial memory quarantine implementation
Hi Alexander, On Wed, May 11, 2016 at 6:18 PM, Alexander Potapenko wrote: > Quarantine isolates freed objects in a separate queue. The objects are > returned to the allocator later, which helps to detect use-after-free > errors. > > Freed objects are first added to per-cpu quarantine queues. > When a cache is destroyed or memory shrinking is requested, the objects > are moved into the global quarantine queue. Whenever a kmalloc call > allows memory reclaiming, the oldest objects are popped out of the > global queue until the total size of objects in quarantine is less than > 3/4 of the maximum quarantine size (which is a fraction of installed > physical memory). > > As long as an object remains in the quarantine, KASAN is able to report > accesses to it, so the chance of reporting a use-after-free is increased. > Once the object leaves quarantine, the allocator may reuse it, in which > case the object is unpoisoned and KASAN can't detect incorrect accesses > to it. > > Right now quarantine support is only enabled in SLAB allocator. > Unification of KASAN features in SLAB and SLUB will be done later. > > This patch is based on the "mm: kasan: quarantine" patch originally > prepared by Dmitry Chernenkov. A number of improvements have been > suggested by Andrey Ryabinin. > > Signed-off-by: Alexander Potapenko > --- > v2: - added copyright comments > - per request from Joonsoo Kim made __cache_free() more straightforward > - added comments for smp_load_acquire()/smp_store_release() > > v3: - incorporate changes introduced by the "mm, kasan: SLAB support" patch > > v4: - fix kbuild compile-time error (missing ___cache_free() declaration) > and a warning (wrong format specifier) > > v6: - extended the patch description > - dropped the unused qlist_remove() function > > v9: - incorporate the fixes by Andrey Ryabinin: > * Fix comment styles, > * Get rid of some ifdefs > * Revert needless functions renames in quarantine patch > * Remove needless local_irq_save()/restore() in > per_cpu_remove_cache() > * Add new 'struct qlist_node' instead of 'void **' types. This makes > code a bit more redable. > - remove the non-deterministic quarantine test > - dropped smp_load_acquire()/smp_store_release() > --- > include/linux/kasan.h | 13 ++- > mm/kasan/Makefile | 1 + > mm/kasan/kasan.c | 57 -- > mm/kasan/kasan.h | 21 +++- > mm/kasan/quarantine.c | 291 > ++ > mm/kasan/report.c | 1 + > mm/mempool.c | 2 +- > mm/slab.c | 12 ++- > mm/slab.h | 2 + > mm/slab_common.c | 2 + > 10 files changed, 387 insertions(+), 15 deletions(-) > create mode 100644 mm/kasan/quarantine.c > > diff --git a/include/linux/kasan.h b/include/linux/kasan.h > index 737371b..611927f 100644 > --- a/include/linux/kasan.h > +++ b/include/linux/kasan.h > @@ -50,6 +50,8 @@ void kasan_free_pages(struct page *page, unsigned int > order); > [...] > diff --git a/mm/kasan/quarantine.c b/mm/kasan/quarantine.c > new file mode 100644 > index 000..4973505 > --- /dev/null > +++ b/mm/kasan/quarantine.c > @@ -0,0 +1,291 @@ > +/* > + * KASAN quarantine. > + * > + * Author: Alexander Potapenko > + * Copyright (C) 2016 Google, Inc. > + * > + * Based on code by Dmitry Chernenkov. > + * > + * This program is free software; you can redistribute it and/or > + * modify it under the terms of the GNU General Public License > + * version 2 as published by the Free Software Foundation. > + * > + * This program is distributed in the hope that it will be useful, but > + * WITHOUT ANY WARRANTY; without even the implied warranty of > + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU > + * General Public License for more details. > + * > + */ > + > +#include > +#include > +#include > +#include > +#include > +#include > +#include > +#include > +#include > +#include > + > +#include "../slab.h" > +#include "kasan.h" > + > +/* Data structure and operations for quarantine queues. */ > + > +/* > + * Each queue is a signle-linked list, which also stores the total size of > + * objects inside of it. > + */ > +struct qlist_head { > + struct qlist_node *head; > + struct qlist_node *tail; > + size_t bytes; > +}; > + > +#define QLIST_INIT { NULL, NULL, 0 } > + > +static bool qlist_empty(struct qlist_head *q) > +{ > + return !q->head; > +} > + > +static void qlist_init(struct qlist_head *q) > +{ > + q->head = q->tail = NULL; > + q->bytes = 0; > +} > + > +static void qlist_put(struct qlist_head *q, struct qlist_node *qlink, > + size_t size) > +{ > + if (unlikely(qlist_empty(q))) > + q->head = qlink; > + else > + q->tail->next = qlink; > + q->tail = qlink; > + qlink->next = NULL; > + q->bytes += size; > +} > + > +static void qlist_move_all(struct
RE: [PATCH] mpt3sas: Do scsi_remove_host() before deleting SAS PHY objects
Our understanding is the relationship between the SCSI host and SAS end devices is a parent-child and before ripping of SCSI host we need to rip of all the children. Why the enclosure ends up trying to re-parent the SCSI device from the enclosure to the SAS PHY even after we remove the SAS Phy?. Doesn't this need to be taken care in enclosure_removal? -Original Message- From: Calvin Owens [mailto:calvinow...@fb.com] Sent: Monday, May 16, 2016 2:25 PM To: Elliott, Robert (Persistent Memory) Cc: Sathya Prakash; Chaitra P B; Suganath Prabu Subramani; James E.J. Bottomley; Martin K. Petersen; mpt-fusionlinux@broadcom.com; linux-s...@vger.kernel.org; linux-kernel@vger.kernel.org; kernel-t...@fb.com; calvinow...@fb.com Subject: Re: [PATCH] mpt3sas: Do scsi_remove_host() before deleting SAS PHY objects On Friday 05/13 at 21:17 +, Elliott, Robert (Persistent Memory) wrote: > > > > -Original Message- > > From: linux-kernel-ow...@vger.kernel.org [mailto:linux-kernel- > > ow...@vger.kernel.org] On Behalf Of Calvin Owens > > Sent: Friday, May 13, 2016 3:28 PM > ... > > Subject: [PATCH] mpt3sas: Do scsi_remove_host() before deleting SAS > > PHY objects > > > ... > > The issue is that enclosure_remove_device() expects to be able to > > re-add the device that was previously enclosured: so with SES > > running, the order we unwind things matters in a way it otherwise > > wouldn't. > > > > Since mpt3sas deletes the SAS objects before the SCSI hosts, > > enclosure ends up trying to re-parent the SCSI device from the > > enclosure to the SAS PHY which has already been deleted. This obviously > > ends in sadness. > > > > The fix appears to be simple: just call scsi_remove_host() before we > > call > > sas_port_delete() and/or sas_remove_host(). > > > ... > > diff --git a/drivers/scsi/mpt3sas/mpt3sas_scsih.c > > @@ -8149,6 +8149,8 @@ void scsih_remove(struct pci_dev *pdev) > > _scsih_raid_device_remove(ioc, raid_device); > > } > > > > + scsi_remove_host(shost); > > + > > /* free ports attached to the sas_host */ > > list_for_each_entry_safe(mpt3sas_port, next_port, > >>sas_hba.sas_port_list, port_list) { @@ -8172,7 +8174,6 @@ > > void scsih_remove(struct pci_dev *pdev) > > } > > > > sas_remove_host(shost); > > - scsi_remove_host(shost); > > mpt3sas_base_detach(ioc); > > spin_lock(_lock); > > list_del(>list); > > That change matches the pattern of all other drivers calling > sas_remove_host, except for one: > drivers/message/fusion/mptsas.c > > That consensus means this comment in drivers/scsi/scsi_transport_sas.c > is wrong: > > /** > * sas_remove_host - tear down a Scsi_Host's SAS data structures > * @shost: Scsi Host that is torn down > * > * Removes all SAS PHYs and remote PHYs for a given Scsi_Host. > * Must be called just before scsi_remove_host for SAS HBAs. > */ Yes, exactly: that comment appears to be backwards, and as you say most drivers appear to do the opposite (I looked at HPSA at least, which calls sas_port_delete() before scsi_remove_host()). I suppose I should send a patch to fix the comment as well? It'd be nice to get some testing to be certain I'm not breaking somebody else's setup by fixing mine though... Thanks, Calvin
RE: [PATCH] mpt3sas: Do scsi_remove_host() before deleting SAS PHY objects
Our understanding is the relationship between the SCSI host and SAS end devices is a parent-child and before ripping of SCSI host we need to rip of all the children. Why the enclosure ends up trying to re-parent the SCSI device from the enclosure to the SAS PHY even after we remove the SAS Phy?. Doesn't this need to be taken care in enclosure_removal? -Original Message- From: Calvin Owens [mailto:calvinow...@fb.com] Sent: Monday, May 16, 2016 2:25 PM To: Elliott, Robert (Persistent Memory) Cc: Sathya Prakash; Chaitra P B; Suganath Prabu Subramani; James E.J. Bottomley; Martin K. Petersen; mpt-fusionlinux@broadcom.com; linux-s...@vger.kernel.org; linux-kernel@vger.kernel.org; kernel-t...@fb.com; calvinow...@fb.com Subject: Re: [PATCH] mpt3sas: Do scsi_remove_host() before deleting SAS PHY objects On Friday 05/13 at 21:17 +, Elliott, Robert (Persistent Memory) wrote: > > > > -Original Message- > > From: linux-kernel-ow...@vger.kernel.org [mailto:linux-kernel- > > ow...@vger.kernel.org] On Behalf Of Calvin Owens > > Sent: Friday, May 13, 2016 3:28 PM > ... > > Subject: [PATCH] mpt3sas: Do scsi_remove_host() before deleting SAS > > PHY objects > > > ... > > The issue is that enclosure_remove_device() expects to be able to > > re-add the device that was previously enclosured: so with SES > > running, the order we unwind things matters in a way it otherwise > > wouldn't. > > > > Since mpt3sas deletes the SAS objects before the SCSI hosts, > > enclosure ends up trying to re-parent the SCSI device from the > > enclosure to the SAS PHY which has already been deleted. This obviously > > ends in sadness. > > > > The fix appears to be simple: just call scsi_remove_host() before we > > call > > sas_port_delete() and/or sas_remove_host(). > > > ... > > diff --git a/drivers/scsi/mpt3sas/mpt3sas_scsih.c > > @@ -8149,6 +8149,8 @@ void scsih_remove(struct pci_dev *pdev) > > _scsih_raid_device_remove(ioc, raid_device); > > } > > > > + scsi_remove_host(shost); > > + > > /* free ports attached to the sas_host */ > > list_for_each_entry_safe(mpt3sas_port, next_port, > >>sas_hba.sas_port_list, port_list) { @@ -8172,7 +8174,6 @@ > > void scsih_remove(struct pci_dev *pdev) > > } > > > > sas_remove_host(shost); > > - scsi_remove_host(shost); > > mpt3sas_base_detach(ioc); > > spin_lock(_lock); > > list_del(>list); > > That change matches the pattern of all other drivers calling > sas_remove_host, except for one: > drivers/message/fusion/mptsas.c > > That consensus means this comment in drivers/scsi/scsi_transport_sas.c > is wrong: > > /** > * sas_remove_host - tear down a Scsi_Host's SAS data structures > * @shost: Scsi Host that is torn down > * > * Removes all SAS PHYs and remote PHYs for a given Scsi_Host. > * Must be called just before scsi_remove_host for SAS HBAs. > */ Yes, exactly: that comment appears to be backwards, and as you say most drivers appear to do the opposite (I looked at HPSA at least, which calls sas_port_delete() before scsi_remove_host()). I suppose I should send a patch to fix the comment as well? It'd be nice to get some testing to be certain I'm not breaking somebody else's setup by fixing mine though... Thanks, Calvin
Re: [PATCH 1/1] simplified security.nscapability xattr
On Mon, May 16, 2016 at 04:15:23PM -0500, Serge E. Hallyn wrote: > Quoting Serge E. Hallyn (se...@hallyn.com): > ... > > There's a problem though. The above suffices to prevent an unprivileged > > user > > in a user_ns from unsharing a user_ns to write a file capability and exploit > > that capability in the ns where he is unprivileged. With one exception, > > which > > is the case where the unprivileged user is mapped to the same kuid which > > created the namespace. So if uid 1000 on the host creates a namespace > > where uid 1000 maps to 1000 in the namespace, then 1000 in the namespace > > can create a new user_ns, write the xattr, and exploit it from the > > parent namespace. This is not an uncommon case. I'm not sure what to do > > about > > it. > > Ok I think I've convinced myself that requiring a kuid 0 in the container > and storing that in the security.nscapability is best solution. The DAC > objection is imo not really valid - we don't have to give uid 0 in the > container any special privilege, we just require that the ns have a uid 0 > mapping. I have not been able to think of any other reliable way to verify > that the writer of the capability is authorized to grant privilege to the > file when executed by current. > > I'm going to proceed with another POC based on the following design: > > 1. no new syscalls at the moment. You can choose to set/query > security.nscapability, but can also just set security.capability from > a user_ns and have the kernel transparently set a security.nscapability > entry for you. > > 2. For now just a single security.nscapability entry, but in a format > that turning it into an array will be a trivial change > > 3. When running file foo which has a security.nscapability for kuid 10, > then any namespace where kuid 10 is root - or which has an ancestor ns > where > that is the case - will run the file with the listed capabilities. > > 4. When doing getxattr of security.capability from a user_ns, if there is a > security.capability entry, that will be returned; else if there is a valid > security.nscapability for your ns, that will be returned. > > 5. when doing a setxattr of security.capability from a user_ns, if there is > a security.nscapability entry, you get EBUSY; else a security.nscapability > with your root kuid will be written provided that (a) you are privileged > over your namespace, (b) you are privileged over your root uid, (c) the > file owner maps into your namespace. Stéphane pointed out this isn't quite right. The EBUSY will happen if a security.nscapability is defined with a kuid over which the writer is not privileged - else it will overwrite. It will also happen if security.capbility is set. > 6. when doing a getxattr of security.nscapability, the entry will be shown > with kuid mapped into your namespace or -1 if the uid does not map into > your ns. > > 7. when doing a setxattr of security.nscapability, if an entry exists, you > get -EBUSY; if you are not privileged over your ns, your root uid, and > the file owner, then you get -EPERM; the xattr includes a uid field, which > must be either 0 or a value valid in your ns. The value will be converted > to a kuid and stored on disk. (Seth, I'm not sure offhand how that should > mesh with your patches, we can talk about it after I send the next patch, > which I'm quite certain will handle it wrongly) > > 8. If a security.capability exists, it will override any security.nscapability > at execve() (so, inverse of my previous two patches). > > -serge
Re: [PATCH 1/1] simplified security.nscapability xattr
On Mon, May 16, 2016 at 04:15:23PM -0500, Serge E. Hallyn wrote: > Quoting Serge E. Hallyn (se...@hallyn.com): > ... > > There's a problem though. The above suffices to prevent an unprivileged > > user > > in a user_ns from unsharing a user_ns to write a file capability and exploit > > that capability in the ns where he is unprivileged. With one exception, > > which > > is the case where the unprivileged user is mapped to the same kuid which > > created the namespace. So if uid 1000 on the host creates a namespace > > where uid 1000 maps to 1000 in the namespace, then 1000 in the namespace > > can create a new user_ns, write the xattr, and exploit it from the > > parent namespace. This is not an uncommon case. I'm not sure what to do > > about > > it. > > Ok I think I've convinced myself that requiring a kuid 0 in the container > and storing that in the security.nscapability is best solution. The DAC > objection is imo not really valid - we don't have to give uid 0 in the > container any special privilege, we just require that the ns have a uid 0 > mapping. I have not been able to think of any other reliable way to verify > that the writer of the capability is authorized to grant privilege to the > file when executed by current. > > I'm going to proceed with another POC based on the following design: > > 1. no new syscalls at the moment. You can choose to set/query > security.nscapability, but can also just set security.capability from > a user_ns and have the kernel transparently set a security.nscapability > entry for you. > > 2. For now just a single security.nscapability entry, but in a format > that turning it into an array will be a trivial change > > 3. When running file foo which has a security.nscapability for kuid 10, > then any namespace where kuid 10 is root - or which has an ancestor ns > where > that is the case - will run the file with the listed capabilities. > > 4. When doing getxattr of security.capability from a user_ns, if there is a > security.capability entry, that will be returned; else if there is a valid > security.nscapability for your ns, that will be returned. > > 5. when doing a setxattr of security.capability from a user_ns, if there is > a security.nscapability entry, you get EBUSY; else a security.nscapability > with your root kuid will be written provided that (a) you are privileged > over your namespace, (b) you are privileged over your root uid, (c) the > file owner maps into your namespace. Stéphane pointed out this isn't quite right. The EBUSY will happen if a security.nscapability is defined with a kuid over which the writer is not privileged - else it will overwrite. It will also happen if security.capbility is set. > 6. when doing a getxattr of security.nscapability, the entry will be shown > with kuid mapped into your namespace or -1 if the uid does not map into > your ns. > > 7. when doing a setxattr of security.nscapability, if an entry exists, you > get -EBUSY; if you are not privileged over your ns, your root uid, and > the file owner, then you get -EPERM; the xattr includes a uid field, which > must be either 0 or a value valid in your ns. The value will be converted > to a kuid and stored on disk. (Seth, I'm not sure offhand how that should > mesh with your patches, we can talk about it after I send the next patch, > which I'm quite certain will handle it wrongly) > > 8. If a security.capability exists, it will override any security.nscapability > at execve() (so, inverse of my previous two patches). > > -serge
Re: [PATCH] crypto/sha1-mb: make sha1_x8_avx2() conform to C function ABI
On Mon, May 16, 2016 at 02:39:06PM -0700, Megha Dey wrote: > On Mon, 2016-05-16 at 15:16 -0500, Josh Poimboeuf wrote: > > On Mon, May 16, 2016 at 11:31:12AM -0700, Megha Dey wrote: > > > On Mon, 2016-05-16 at 09:44 -0500, Josh Poimboeuf wrote: > > > > On Fri, May 13, 2016 at 10:32:26AM -0700, Megha Dey wrote: > > > > > On Fri, 2016-05-13 at 07:51 +0200, Ingo Molnar wrote: > > > > > > * Herbert Xuwrote: > > > > > > > > > > > > > On Thu, May 12, 2016 at 04:31:06PM -0700, Megha Dey wrote: > > > > > > > > Hi, > > > > > > > > > > > > > > > > When booting latest kernel with the CONFIG_CRYPTO_SHA1_MB > > > > > > > > enabled, I > > > > > > > > observe a panic. > > > > > > > > > > > > > > > > After having a quick look, on reverting the following patches, > > > > > > > > I am able > > > > > > > > to complete the booting process. > > > > > > > > aec4d0e301f17bb143341c82cc44685b8af0b945 > > > > > > > > 8691ccd764f9ecc69a6812dfe76214c86ac9ba06 > > > > > > > > 68874ac3304ade7ed5ebb12af00d6b9bbbca0a16 > > > > > > > > > > > > > > > > Of the 3 patches, aec4d0e301f17bb143341c82cc44685b8af0b945 > > > > > > > > seems wrong. > > > > > > > > The r10 to r15 registers are used in sha1_x8_avx2.S, which is > > > > > > > > called > > > > > > > > from sha1_mb_mgr_flush_avx2.S. > > > > > > > > > > > > > > > > I do not think the functionality of the SHA1-MB crypto > > > > > > > > algorithm has > > > > > > > > been tested after applying these changes. (I am not sure if any > > > > > > > > of the > > > > > > > > other crypto algorithms have been affected by these changes). > > > > > > > > > > > > > > Josh, Ingo: > > > > > > > > > > > > > > Any ideas on this? Should we revert? > > > > > > > > > > > > Yeah, I think so - although another option would be to standardize > > > > > > sha1_x8_avx2() > > > > > > - the problem is that it is a function that clobbers registers > > > > > > without > > > > > > saving/restoring them, breaking the C function ABI. I realize it's > > > > > > written in > > > > > > assembly, but unless there are strong performance reasons to > > > > > > deviate from the > > > > > > regular calling convention it might make sense to fix that. > > > > > > > > > > > > Do any warnings get generated after the revert, if you enable > > > > > > CONFIG_STACK_VALIDATION=y? > > > > > > > > > > After the revert and enabling CONFIG_STACK_VALIDATION: > > > > > arch/x86/crypto/sha1-mb/sha1_mb_mgr_flush_avx2.o: warning: objtool: > > > > > sha1_mb_mgr_flush_avx2()+0x20d: call without frame pointer save/setup > > > > > > > > > > arch/x86/crypto/sha1-mb/sha1_mb_mgr_submit_avx2.o: warning: objtool: > > > > > sha1_mb_mgr_submit_avx2()+0x115: call without frame pointer save/setup > > > > > > > > Megha, > > > > > > > > Sorry for breaking it. I completely missed the fact that the function > > > > calls sha1_x8_avx2() which clobbers registers. > > > > > > > > If the performance penalty isn't too bad, I'll submit a patch to > > > > standardize sha1_x8_avx2() to follow the C ABI. > > > > > > > > Do you have any tips for testing this code? I've tried using the tcrypt > > > > module, but no luck. > > > > > > > Josh, > > > Build the kernel with the following configs: > > > CONFIG_CRYPTO_SHA1_MB=y > > > CONFIG_CRYPTO_TEST=m > > > CONFIG_CRYPTO_MANAGER_DISABLE_TESTS=n > > > There was a kernel panic while booting. > > > So if after applying your new patch, we are able to get complete the > > > boot, then we are good. > > > > > > Could you please send a copy of the patch, I could test it on my end > > > too. > > > > Thanks. I was able to run the tests, though I didn't see a panic. Can > > you test with this patch? > > > > > > > > From: Josh Poimboeuf > > Subject: [PATCH] crypto/sha1-mb: make sha1_x8_avx2() conform to C function > > ABI > > > > Megha Day reported a kernel panic in crypto code. The problem is that > > sha1_x8_avx2() clobbers registers r12-r15 without saving and restoring > > them. > > > > Before commit aec4d0e301f1 ("x86/asm/crypto: Simplify stack usage in > > sha-mb functions"), those registers were saved and restored by the > > callers of the function. I removed them with that commit because I > > didn't realize sha1_x8_avx2() clobbered them. > > > > Fix the potential undefined behavior associated with clobbering the > > registers and make the behavior less surprising by changing the > > registers to be callee saved/restored to conform with the C function > > call ABI. > > > > Also, rdx (aka RSP_SAVE) doesn't need to be saved: I verified that none > > of the callers rely on it being saved, and it's not a callee-saved > > register in the C ABI. > > > > Fixes: aec4d0e301f1 ("x86/asm/crypto: Simplify stack usage in sha-mb > > functions") > > Cc: sta...@vger.kernel.org # 4.6 > > Reported-by: Megha Dey > > Signed-off-by: Josh Poimboeuf > > --- > > arch/x86/crypto/sha-mb/sha1_x8_avx2.S
Re: [PATCH] crypto/sha1-mb: make sha1_x8_avx2() conform to C function ABI
On Mon, May 16, 2016 at 02:39:06PM -0700, Megha Dey wrote: > On Mon, 2016-05-16 at 15:16 -0500, Josh Poimboeuf wrote: > > On Mon, May 16, 2016 at 11:31:12AM -0700, Megha Dey wrote: > > > On Mon, 2016-05-16 at 09:44 -0500, Josh Poimboeuf wrote: > > > > On Fri, May 13, 2016 at 10:32:26AM -0700, Megha Dey wrote: > > > > > On Fri, 2016-05-13 at 07:51 +0200, Ingo Molnar wrote: > > > > > > * Herbert Xu wrote: > > > > > > > > > > > > > On Thu, May 12, 2016 at 04:31:06PM -0700, Megha Dey wrote: > > > > > > > > Hi, > > > > > > > > > > > > > > > > When booting latest kernel with the CONFIG_CRYPTO_SHA1_MB > > > > > > > > enabled, I > > > > > > > > observe a panic. > > > > > > > > > > > > > > > > After having a quick look, on reverting the following patches, > > > > > > > > I am able > > > > > > > > to complete the booting process. > > > > > > > > aec4d0e301f17bb143341c82cc44685b8af0b945 > > > > > > > > 8691ccd764f9ecc69a6812dfe76214c86ac9ba06 > > > > > > > > 68874ac3304ade7ed5ebb12af00d6b9bbbca0a16 > > > > > > > > > > > > > > > > Of the 3 patches, aec4d0e301f17bb143341c82cc44685b8af0b945 > > > > > > > > seems wrong. > > > > > > > > The r10 to r15 registers are used in sha1_x8_avx2.S, which is > > > > > > > > called > > > > > > > > from sha1_mb_mgr_flush_avx2.S. > > > > > > > > > > > > > > > > I do not think the functionality of the SHA1-MB crypto > > > > > > > > algorithm has > > > > > > > > been tested after applying these changes. (I am not sure if any > > > > > > > > of the > > > > > > > > other crypto algorithms have been affected by these changes). > > > > > > > > > > > > > > Josh, Ingo: > > > > > > > > > > > > > > Any ideas on this? Should we revert? > > > > > > > > > > > > Yeah, I think so - although another option would be to standardize > > > > > > sha1_x8_avx2() > > > > > > - the problem is that it is a function that clobbers registers > > > > > > without > > > > > > saving/restoring them, breaking the C function ABI. I realize it's > > > > > > written in > > > > > > assembly, but unless there are strong performance reasons to > > > > > > deviate from the > > > > > > regular calling convention it might make sense to fix that. > > > > > > > > > > > > Do any warnings get generated after the revert, if you enable > > > > > > CONFIG_STACK_VALIDATION=y? > > > > > > > > > > After the revert and enabling CONFIG_STACK_VALIDATION: > > > > > arch/x86/crypto/sha1-mb/sha1_mb_mgr_flush_avx2.o: warning: objtool: > > > > > sha1_mb_mgr_flush_avx2()+0x20d: call without frame pointer save/setup > > > > > > > > > > arch/x86/crypto/sha1-mb/sha1_mb_mgr_submit_avx2.o: warning: objtool: > > > > > sha1_mb_mgr_submit_avx2()+0x115: call without frame pointer save/setup > > > > > > > > Megha, > > > > > > > > Sorry for breaking it. I completely missed the fact that the function > > > > calls sha1_x8_avx2() which clobbers registers. > > > > > > > > If the performance penalty isn't too bad, I'll submit a patch to > > > > standardize sha1_x8_avx2() to follow the C ABI. > > > > > > > > Do you have any tips for testing this code? I've tried using the tcrypt > > > > module, but no luck. > > > > > > > Josh, > > > Build the kernel with the following configs: > > > CONFIG_CRYPTO_SHA1_MB=y > > > CONFIG_CRYPTO_TEST=m > > > CONFIG_CRYPTO_MANAGER_DISABLE_TESTS=n > > > There was a kernel panic while booting. > > > So if after applying your new patch, we are able to get complete the > > > boot, then we are good. > > > > > > Could you please send a copy of the patch, I could test it on my end > > > too. > > > > Thanks. I was able to run the tests, though I didn't see a panic. Can > > you test with this patch? > > > > > > > > From: Josh Poimboeuf > > Subject: [PATCH] crypto/sha1-mb: make sha1_x8_avx2() conform to C function > > ABI > > > > Megha Day reported a kernel panic in crypto code. The problem is that > > sha1_x8_avx2() clobbers registers r12-r15 without saving and restoring > > them. > > > > Before commit aec4d0e301f1 ("x86/asm/crypto: Simplify stack usage in > > sha-mb functions"), those registers were saved and restored by the > > callers of the function. I removed them with that commit because I > > didn't realize sha1_x8_avx2() clobbered them. > > > > Fix the potential undefined behavior associated with clobbering the > > registers and make the behavior less surprising by changing the > > registers to be callee saved/restored to conform with the C function > > call ABI. > > > > Also, rdx (aka RSP_SAVE) doesn't need to be saved: I verified that none > > of the callers rely on it being saved, and it's not a callee-saved > > register in the C ABI. > > > > Fixes: aec4d0e301f1 ("x86/asm/crypto: Simplify stack usage in sha-mb > > functions") > > Cc: sta...@vger.kernel.org # 4.6 > > Reported-by: Megha Dey > > Signed-off-by: Josh Poimboeuf > > --- > > arch/x86/crypto/sha-mb/sha1_x8_avx2.S | 13 +++-- > > 1 file changed, 11 insertions(+), 2 deletions(-) > > > > diff --git
Re: [RFC][PATCH] printk: Add option to append kernel version to the dict
On Sunday 05/15 at 00:19 +0200, Richard Weinberger wrote: > On Fri, May 13, 2016 at 10:58 PM, Calvin Owenswrote: > > We use netconsole to collect kernel logs from all the servers at > > Facebook. We use this patch internally so each logline has a record of > > which kernel version emitted it. > > > > At first glance, this might seem lazy: as you would expect, we have a > > database which records which kernel version a host is currently running. > > But there are a lot of situations where that database cannot be current: > > early-ish boot crashes are probably the best example, but even beyond > > that there are lots of varieties of kernel brokenness that can prevent > > the database from being updated. Doing it explicitly this way ensures > > that we always know exactly what version emitted a given message. > > > > Doing it in printk() itself rather than extended netconsole ends up > > being much simpler, and has the advantage that future extended console > > implementations will be able to benefit from this as well. > > > > Signed-off-by: Calvin Owens > > --- > > init/Kconfig | 8 > > kernel/printk/printk.c | 14 ++ > > 2 files changed, 22 insertions(+) > > I don't think adding a new config option is appropriate. > How about adding a log format string tunable to netconsole? Like a generic way to append arbitrary key/val to the dict? It would still need to be configured via the kernel cmdline or at compile time: I can't rely getting to userspace. I had steered away from something more general like that because it didn't seem worth the trobule, but I can certainly go that route: what sort of format specifiers would we want? I only care about UTS_RELEASE, but I suppose UTS_NODENAME and UTS_VERSION might be handy in some cases? Thanks, Calvin > -- > Thanks, > //richard
Re: [RFC][PATCH] printk: Add option to append kernel version to the dict
On Sunday 05/15 at 00:19 +0200, Richard Weinberger wrote: > On Fri, May 13, 2016 at 10:58 PM, Calvin Owens wrote: > > We use netconsole to collect kernel logs from all the servers at > > Facebook. We use this patch internally so each logline has a record of > > which kernel version emitted it. > > > > At first glance, this might seem lazy: as you would expect, we have a > > database which records which kernel version a host is currently running. > > But there are a lot of situations where that database cannot be current: > > early-ish boot crashes are probably the best example, but even beyond > > that there are lots of varieties of kernel brokenness that can prevent > > the database from being updated. Doing it explicitly this way ensures > > that we always know exactly what version emitted a given message. > > > > Doing it in printk() itself rather than extended netconsole ends up > > being much simpler, and has the advantage that future extended console > > implementations will be able to benefit from this as well. > > > > Signed-off-by: Calvin Owens > > --- > > init/Kconfig | 8 > > kernel/printk/printk.c | 14 ++ > > 2 files changed, 22 insertions(+) > > I don't think adding a new config option is appropriate. > How about adding a log format string tunable to netconsole? Like a generic way to append arbitrary key/val to the dict? It would still need to be configured via the kernel cmdline or at compile time: I can't rely getting to userspace. I had steered away from something more general like that because it didn't seem worth the trobule, but I can certainly go that route: what sort of format specifiers would we want? I only care about UTS_RELEASE, but I suppose UTS_NODENAME and UTS_VERSION might be handy in some cases? Thanks, Calvin > -- > Thanks, > //richard
Re: [PATCH] objtool: cope with pre-4.5 gcc (and non-gcc)
On Fri, May 13, 2016 at 12:27:06AM -0600, Jan Beulich wrote: > The kernel's unreachable() translates to __builtin_unreachable() only > for gcc 4.5 and newer, and else expands to an infinite loop. Avoid > "function has unreachable instruction" warnings for this case by > inspecting the instructions immediately following the UD2. This cuts > down the number of files getting such warnings by about 99% for me. > > Signed-off-by: Jan BeulichThanks for the patch. I'm wondering if we can detect this situation more precisely, maybe by adding some kind of annotation to the unreachable() macro. Will need to think about it a little more... -- Josh
Re: [PATCH] objtool: cope with pre-4.5 gcc (and non-gcc)
On Fri, May 13, 2016 at 12:27:06AM -0600, Jan Beulich wrote: > The kernel's unreachable() translates to __builtin_unreachable() only > for gcc 4.5 and newer, and else expands to an infinite loop. Avoid > "function has unreachable instruction" warnings for this case by > inspecting the instructions immediately following the UD2. This cuts > down the number of files getting such warnings by about 99% for me. > > Signed-off-by: Jan Beulich Thanks for the patch. I'm wondering if we can detect this situation more precisely, maybe by adding some kind of annotation to the unreachable() macro. Will need to think about it a little more... -- Josh
Re: [PATCH] crypto/sha1-mb: make sha1_x8_avx2() conform to C function ABI
On Mon, 2016-05-16 at 15:16 -0500, Josh Poimboeuf wrote: > On Mon, May 16, 2016 at 11:31:12AM -0700, Megha Dey wrote: > > On Mon, 2016-05-16 at 09:44 -0500, Josh Poimboeuf wrote: > > > On Fri, May 13, 2016 at 10:32:26AM -0700, Megha Dey wrote: > > > > On Fri, 2016-05-13 at 07:51 +0200, Ingo Molnar wrote: > > > > > * Herbert Xuwrote: > > > > > > > > > > > On Thu, May 12, 2016 at 04:31:06PM -0700, Megha Dey wrote: > > > > > > > Hi, > > > > > > > > > > > > > > When booting latest kernel with the CONFIG_CRYPTO_SHA1_MB > > > > > > > enabled, I > > > > > > > observe a panic. > > > > > > > > > > > > > > After having a quick look, on reverting the following patches, I > > > > > > > am able > > > > > > > to complete the booting process. > > > > > > > aec4d0e301f17bb143341c82cc44685b8af0b945 > > > > > > > 8691ccd764f9ecc69a6812dfe76214c86ac9ba06 > > > > > > > 68874ac3304ade7ed5ebb12af00d6b9bbbca0a16 > > > > > > > > > > > > > > Of the 3 patches, aec4d0e301f17bb143341c82cc44685b8af0b945 seems > > > > > > > wrong. > > > > > > > The r10 to r15 registers are used in sha1_x8_avx2.S, which is > > > > > > > called > > > > > > > from sha1_mb_mgr_flush_avx2.S. > > > > > > > > > > > > > > I do not think the functionality of the SHA1-MB crypto algorithm > > > > > > > has > > > > > > > been tested after applying these changes. (I am not sure if any > > > > > > > of the > > > > > > > other crypto algorithms have been affected by these changes). > > > > > > > > > > > > Josh, Ingo: > > > > > > > > > > > > Any ideas on this? Should we revert? > > > > > > > > > > Yeah, I think so - although another option would be to standardize > > > > > sha1_x8_avx2() > > > > > - the problem is that it is a function that clobbers registers > > > > > without > > > > > saving/restoring them, breaking the C function ABI. I realize it's > > > > > written in > > > > > assembly, but unless there are strong performance reasons to deviate > > > > > from the > > > > > regular calling convention it might make sense to fix that. > > > > > > > > > > Do any warnings get generated after the revert, if you enable > > > > > CONFIG_STACK_VALIDATION=y? > > > > > > > > After the revert and enabling CONFIG_STACK_VALIDATION: > > > > arch/x86/crypto/sha1-mb/sha1_mb_mgr_flush_avx2.o: warning: objtool: > > > > sha1_mb_mgr_flush_avx2()+0x20d: call without frame pointer save/setup > > > > > > > > arch/x86/crypto/sha1-mb/sha1_mb_mgr_submit_avx2.o: warning: objtool: > > > > sha1_mb_mgr_submit_avx2()+0x115: call without frame pointer save/setup > > > > > > Megha, > > > > > > Sorry for breaking it. I completely missed the fact that the function > > > calls sha1_x8_avx2() which clobbers registers. > > > > > > If the performance penalty isn't too bad, I'll submit a patch to > > > standardize sha1_x8_avx2() to follow the C ABI. > > > > > > Do you have any tips for testing this code? I've tried using the tcrypt > > > module, but no luck. > > > > > Josh, > > Build the kernel with the following configs: > > CONFIG_CRYPTO_SHA1_MB=y > > CONFIG_CRYPTO_TEST=m > > CONFIG_CRYPTO_MANAGER_DISABLE_TESTS=n > > There was a kernel panic while booting. > > So if after applying your new patch, we are able to get complete the > > boot, then we are good. > > > > Could you please send a copy of the patch, I could test it on my end > > too. > > Thanks. I was able to run the tests, though I didn't see a panic. Can > you test with this patch? > > > > From: Josh Poimboeuf > Subject: [PATCH] crypto/sha1-mb: make sha1_x8_avx2() conform to C function ABI > > Megha Day reported a kernel panic in crypto code. The problem is that > sha1_x8_avx2() clobbers registers r12-r15 without saving and restoring > them. > > Before commit aec4d0e301f1 ("x86/asm/crypto: Simplify stack usage in > sha-mb functions"), those registers were saved and restored by the > callers of the function. I removed them with that commit because I > didn't realize sha1_x8_avx2() clobbered them. > > Fix the potential undefined behavior associated with clobbering the > registers and make the behavior less surprising by changing the > registers to be callee saved/restored to conform with the C function > call ABI. > > Also, rdx (aka RSP_SAVE) doesn't need to be saved: I verified that none > of the callers rely on it being saved, and it's not a callee-saved > register in the C ABI. > > Fixes: aec4d0e301f1 ("x86/asm/crypto: Simplify stack usage in sha-mb > functions") > Cc: sta...@vger.kernel.org # 4.6 > Reported-by: Megha Dey > Signed-off-by: Josh Poimboeuf > --- > arch/x86/crypto/sha-mb/sha1_x8_avx2.S | 13 +++-- > 1 file changed, 11 insertions(+), 2 deletions(-) > > diff --git a/arch/x86/crypto/sha-mb/sha1_x8_avx2.S > b/arch/x86/crypto/sha-mb/sha1_x8_avx2.S > index 8e1b477..c9dae1c 100644 > --- a/arch/x86/crypto/sha-mb/sha1_x8_avx2.S > +++
Re: [PATCH] crypto/sha1-mb: make sha1_x8_avx2() conform to C function ABI
On Mon, 2016-05-16 at 15:16 -0500, Josh Poimboeuf wrote: > On Mon, May 16, 2016 at 11:31:12AM -0700, Megha Dey wrote: > > On Mon, 2016-05-16 at 09:44 -0500, Josh Poimboeuf wrote: > > > On Fri, May 13, 2016 at 10:32:26AM -0700, Megha Dey wrote: > > > > On Fri, 2016-05-13 at 07:51 +0200, Ingo Molnar wrote: > > > > > * Herbert Xu wrote: > > > > > > > > > > > On Thu, May 12, 2016 at 04:31:06PM -0700, Megha Dey wrote: > > > > > > > Hi, > > > > > > > > > > > > > > When booting latest kernel with the CONFIG_CRYPTO_SHA1_MB > > > > > > > enabled, I > > > > > > > observe a panic. > > > > > > > > > > > > > > After having a quick look, on reverting the following patches, I > > > > > > > am able > > > > > > > to complete the booting process. > > > > > > > aec4d0e301f17bb143341c82cc44685b8af0b945 > > > > > > > 8691ccd764f9ecc69a6812dfe76214c86ac9ba06 > > > > > > > 68874ac3304ade7ed5ebb12af00d6b9bbbca0a16 > > > > > > > > > > > > > > Of the 3 patches, aec4d0e301f17bb143341c82cc44685b8af0b945 seems > > > > > > > wrong. > > > > > > > The r10 to r15 registers are used in sha1_x8_avx2.S, which is > > > > > > > called > > > > > > > from sha1_mb_mgr_flush_avx2.S. > > > > > > > > > > > > > > I do not think the functionality of the SHA1-MB crypto algorithm > > > > > > > has > > > > > > > been tested after applying these changes. (I am not sure if any > > > > > > > of the > > > > > > > other crypto algorithms have been affected by these changes). > > > > > > > > > > > > Josh, Ingo: > > > > > > > > > > > > Any ideas on this? Should we revert? > > > > > > > > > > Yeah, I think so - although another option would be to standardize > > > > > sha1_x8_avx2() > > > > > - the problem is that it is a function that clobbers registers > > > > > without > > > > > saving/restoring them, breaking the C function ABI. I realize it's > > > > > written in > > > > > assembly, but unless there are strong performance reasons to deviate > > > > > from the > > > > > regular calling convention it might make sense to fix that. > > > > > > > > > > Do any warnings get generated after the revert, if you enable > > > > > CONFIG_STACK_VALIDATION=y? > > > > > > > > After the revert and enabling CONFIG_STACK_VALIDATION: > > > > arch/x86/crypto/sha1-mb/sha1_mb_mgr_flush_avx2.o: warning: objtool: > > > > sha1_mb_mgr_flush_avx2()+0x20d: call without frame pointer save/setup > > > > > > > > arch/x86/crypto/sha1-mb/sha1_mb_mgr_submit_avx2.o: warning: objtool: > > > > sha1_mb_mgr_submit_avx2()+0x115: call without frame pointer save/setup > > > > > > Megha, > > > > > > Sorry for breaking it. I completely missed the fact that the function > > > calls sha1_x8_avx2() which clobbers registers. > > > > > > If the performance penalty isn't too bad, I'll submit a patch to > > > standardize sha1_x8_avx2() to follow the C ABI. > > > > > > Do you have any tips for testing this code? I've tried using the tcrypt > > > module, but no luck. > > > > > Josh, > > Build the kernel with the following configs: > > CONFIG_CRYPTO_SHA1_MB=y > > CONFIG_CRYPTO_TEST=m > > CONFIG_CRYPTO_MANAGER_DISABLE_TESTS=n > > There was a kernel panic while booting. > > So if after applying your new patch, we are able to get complete the > > boot, then we are good. > > > > Could you please send a copy of the patch, I could test it on my end > > too. > > Thanks. I was able to run the tests, though I didn't see a panic. Can > you test with this patch? > > > > From: Josh Poimboeuf > Subject: [PATCH] crypto/sha1-mb: make sha1_x8_avx2() conform to C function ABI > > Megha Day reported a kernel panic in crypto code. The problem is that > sha1_x8_avx2() clobbers registers r12-r15 without saving and restoring > them. > > Before commit aec4d0e301f1 ("x86/asm/crypto: Simplify stack usage in > sha-mb functions"), those registers were saved and restored by the > callers of the function. I removed them with that commit because I > didn't realize sha1_x8_avx2() clobbered them. > > Fix the potential undefined behavior associated with clobbering the > registers and make the behavior less surprising by changing the > registers to be callee saved/restored to conform with the C function > call ABI. > > Also, rdx (aka RSP_SAVE) doesn't need to be saved: I verified that none > of the callers rely on it being saved, and it's not a callee-saved > register in the C ABI. > > Fixes: aec4d0e301f1 ("x86/asm/crypto: Simplify stack usage in sha-mb > functions") > Cc: sta...@vger.kernel.org # 4.6 > Reported-by: Megha Dey > Signed-off-by: Josh Poimboeuf > --- > arch/x86/crypto/sha-mb/sha1_x8_avx2.S | 13 +++-- > 1 file changed, 11 insertions(+), 2 deletions(-) > > diff --git a/arch/x86/crypto/sha-mb/sha1_x8_avx2.S > b/arch/x86/crypto/sha-mb/sha1_x8_avx2.S > index 8e1b477..c9dae1c 100644 > --- a/arch/x86/crypto/sha-mb/sha1_x8_avx2.S > +++ b/arch/x86/crypto/sha-mb/sha1_x8_avx2.S > @@ -296,7 +296,11 @@ W14 = TMP_ > # > ENTRY(sha1_x8_avx2) > > - pushRSP_SAVE >
Re: linux-next: manual merge of the f2fs tree with the ext4 tree
On Mon, May 09, 2016 at 10:15:03AM -0700, Jaegeuk Kim wrote: > Hi Stephen, > > Thank you for the notice. > I've been waiting for a comment about the below patch targeted to v4.7 from > Ted. > Meanwhile, I intended to prepare -next for that patch in advance. > Surely, once I get a sense that I need to consider v4.8, I'll drop this patch > for -next right away. Yeah, I think it would be better for me to take the ext4 "migrate into vfs's crypto engine" patch, and at this point, for 4.8. Sorry, I just ran out of time to try to verify that the patch wouldn't break anything, and given that we're going to need to wait for "fscrypto/f2fs: allow fs-specific key prefix for fs encryption" to go upstream. Do you have any other planned changes for the fscrypto tree planned for 4.8. If not, then perhaps it will be easier if I take the pen for any changes needed for fs/crypto, and moving forward, we probably need to find ways to make changes where commits specific for fs/crypto should be isolated from ext4 or f2fs changes, and as much as possible to be backwards compatible so that as we add new features to fs/crypto, we don't need to synchronize changes across multiple file systems. Cheers, - Ted
Re: linux-next: manual merge of the f2fs tree with the ext4 tree
On Mon, May 09, 2016 at 10:15:03AM -0700, Jaegeuk Kim wrote: > Hi Stephen, > > Thank you for the notice. > I've been waiting for a comment about the below patch targeted to v4.7 from > Ted. > Meanwhile, I intended to prepare -next for that patch in advance. > Surely, once I get a sense that I need to consider v4.8, I'll drop this patch > for -next right away. Yeah, I think it would be better for me to take the ext4 "migrate into vfs's crypto engine" patch, and at this point, for 4.8. Sorry, I just ran out of time to try to verify that the patch wouldn't break anything, and given that we're going to need to wait for "fscrypto/f2fs: allow fs-specific key prefix for fs encryption" to go upstream. Do you have any other planned changes for the fscrypto tree planned for 4.8. If not, then perhaps it will be easier if I take the pen for any changes needed for fs/crypto, and moving forward, we probably need to find ways to make changes where commits specific for fs/crypto should be isolated from ext4 or f2fs changes, and as much as possible to be backwards compatible so that as we add new features to fs/crypto, we don't need to synchronize changes across multiple file systems. Cheers, - Ted
[GIT PULL] Generic device properties framework update for v4.7-rc1
Hi Linus, Please pull from git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm.git \ device-properties-4.7-rc1 to receive an update of the generic device properties framework for v4.7-rc1 with top-most commit dab2e29402f40431d2199e6faff48174204d2d06 Merge back new device properties material for v4.7. on top of commit 0224a4a30b57385a60065aa598181868881d8fc6 device property: Avoid potential dereferences of invalid pointers Just one commit reworking the handling of built-in properties initialization and updating a few drivers in accordance with the core framework changes. Thanks! --- Heikki Krogerus (1): device property: don't bother the drivers with struct property_set --- arch/arm/mach-pxa/raumfeld.c | 12 arch/arm/mach-tegra/board-paz00.c | 6 +- drivers/base/platform.c | 19 ++- drivers/base/property.c | 34 +- drivers/mfd/intel-lpss-acpi.c | 12 ++-- drivers/mfd/intel-lpss-pci.c | 20 drivers/mfd/intel-lpss.c | 2 +- drivers/mfd/intel-lpss.h | 4 ++-- drivers/mfd/mfd-core.c| 4 ++-- include/linux/mfd/core.h | 4 ++-- include/linux/platform_device.h | 6 +++--- include/linux/property.h | 15 +++ 12 files changed, 55 insertions(+), 83 deletions(-)
[GIT PULL] Generic device properties framework update for v4.7-rc1
Hi Linus, Please pull from git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm.git \ device-properties-4.7-rc1 to receive an update of the generic device properties framework for v4.7-rc1 with top-most commit dab2e29402f40431d2199e6faff48174204d2d06 Merge back new device properties material for v4.7. on top of commit 0224a4a30b57385a60065aa598181868881d8fc6 device property: Avoid potential dereferences of invalid pointers Just one commit reworking the handling of built-in properties initialization and updating a few drivers in accordance with the core framework changes. Thanks! --- Heikki Krogerus (1): device property: don't bother the drivers with struct property_set --- arch/arm/mach-pxa/raumfeld.c | 12 arch/arm/mach-tegra/board-paz00.c | 6 +- drivers/base/platform.c | 19 ++- drivers/base/property.c | 34 +- drivers/mfd/intel-lpss-acpi.c | 12 ++-- drivers/mfd/intel-lpss-pci.c | 20 drivers/mfd/intel-lpss.c | 2 +- drivers/mfd/intel-lpss.h | 4 ++-- drivers/mfd/mfd-core.c| 4 ++-- include/linux/mfd/core.h | 4 ++-- include/linux/platform_device.h | 6 +++--- include/linux/property.h | 15 +++ 12 files changed, 55 insertions(+), 83 deletions(-)
[GIT PULL] ACPI updates for v4.7-rc1
Hi Linus, Please pull from git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm.git \ acpi-4.7-rc1 to receive core ACPI code updates for v4.7-rc1 with top-most commit fc723957801465c4a911d0a509709f0f8b91aa8a Merge branches 'acpi-pci', 'acpi-misc' and 'acpi-tools' on top of commit 44549e8f5eea4e0a41b487b63e616cb089922b99 Linux 4.6-rc7 The new features here are ACPI 6.1 support (and some previously missing bits of ACPI 6.0 support) in ACPICA and two new drivers, a driver for the ACPI Generic Event Device (GED) feature introduced by ACPI 6.1 and the INT3406 thermal driver for display thermal management. Also the value returned by the _HRV (hardware revision) ACPI object will be exported to user space via sysfs now. In addition to that, ACPI on ARM64 will not depend on EXPERT any more. The rest is mostly fixes and cleanups and some code reorganization. Specifics: - In-kernel ACPICA code update to the upstream release 20160422 adding support for ACPI 6.1 along with some previously missing bits of ACPI 6.0 support, making a fair amount of fixes and cleanups and reducing divergences between the upstream ACPICA and the in-kernel code (Bob Moore, Lv Zheng, Al Stone, Aleksey Makarov, Will Miles). - ACPI Generic Event Device (GED) support and a fix for it (Sinan Kaya, Paul Gortmaker). - INT3406 thermal driver for display thermal management and ACPI backlight support code reorganization related to it (Aaron Lu, Arnd Bergmann). - Support for exporting the value returned by the _HRV (hardware revision) ACPI object via sysfs (Betty Dall). - Removal of the EXPERT dependency for ACPI on ARM64 (Mark Brown). - Rework of the handling of ACPI _OSI mechanism allowing the _OSI("Darwin") support to be overridden from the kernel command line among other things (Lv Zheng, Chen Yu). - Rework of the ACPI tables override mechanism to prepare it for the introduction of overlays support going forward (Lv Zheng, Rafael Wysocki). - Fixes related to the ECDT support and module-level execution of AML (Lv Zheng). - ACPI PCI interrupts management update to make it work better on ARM64 mostly (Sinan Kaya). - ACPI SRAT handling update to make the code process all entires in the table order regardless of the entry type (Lukasz Anaczkowski). - EFI power off support for full-hardware ACPI platforms that don't support ACPI S5 (Chen Yu). - Fixes and cleanups related to the ACPI core's sysfs interface (Dan Carpenter, Betty Dall). - acpi_dev_present() API rework to reduce possible confusion related to it (Lukas Wunner). - Removal of CLK_IS_ROOT from two ACPI drivers (Stephen Boyd). Thanks! --- Aaron Lu (4): video / backlight: add two APIs for drivers to use video / backlight: remove the backlight_device_registered API ACPI/video: export acpi_video_get_levels Thermal / ACPI / video: add INT3406 thermal driver Al Stone (1): ACPICA: IORT: Add in support for the SMMUv3 subtable Aleksey Makarov (1): ACPICA: Headers: Add new constants for the DBG2 ACPI table Arnd Bergmann (1): ACPI / video: mark acpi_video_get_levels() inline Betty Dall (3): ACPI / device_sysfs: Add sysfs support for _HRV hardware revision ACPI / device_sysfs: Change _SUN and _STA show functions error return to EIO ACPI / device_sysfs: Clean up checkpatch errors Bob Moore (23): ACPICA: Headers: Minor update for SPCR ACPI table ACPICA: ACPI 6.1: Updates for the HEST ACPI table ACPICA: ACPI 6.1: Update NFIT table for additional new fields ACPICA: Headers: Update DMAR table for October 2014 I/O spec ACPICA: Tables: Update FADT handling ACPICA: ACPI 6.1: Add full support for this version of ACPI spec ACPICA: iASL/Headers: Fix incorrect definition of FPDT table ACPICA: Intepreter: Add object extensions to Concatenate operand ACPICA: Interpreter: Update some function headers, no functional change ACPICA: iASL: Cleanup/optimization for ToPLD macro support ACPICA: Cleanup some invocation indentations, no functional change ACPICA: Headers: Update generation of the ACPICA library ACPICA: Utilities: Update for strtoul64 merger ACPICA: All: const keyword changes across the ACPICA source ACPICA: iASL/Disassembler: Improve handling of unresolved methods ACPICA: Update version to 20160318 ACPICA: Refactor evaluate_object to reduce nesting ACPICA: ACPI 6.1: Support for new PCCT subtable ACPICA: ACPI 6.0: Update _BIX support for new package element ACPICA: ACPI 6.0, tools/iasl: Add support for new resource descriptors ACPICA: Renamed some #defined flag constants for clarity ACPICA: Move all ASCII utilities to a common file ACPICA: Update version to 20160422 Chen Yu (2): ACPI / PM: Introduce efi poweroff for HW-full platforms without _S5 ACPI / osi: Fix default _OSI(Darwin)
[GIT PULL] ACPI updates for v4.7-rc1
Hi Linus, Please pull from git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm.git \ acpi-4.7-rc1 to receive core ACPI code updates for v4.7-rc1 with top-most commit fc723957801465c4a911d0a509709f0f8b91aa8a Merge branches 'acpi-pci', 'acpi-misc' and 'acpi-tools' on top of commit 44549e8f5eea4e0a41b487b63e616cb089922b99 Linux 4.6-rc7 The new features here are ACPI 6.1 support (and some previously missing bits of ACPI 6.0 support) in ACPICA and two new drivers, a driver for the ACPI Generic Event Device (GED) feature introduced by ACPI 6.1 and the INT3406 thermal driver for display thermal management. Also the value returned by the _HRV (hardware revision) ACPI object will be exported to user space via sysfs now. In addition to that, ACPI on ARM64 will not depend on EXPERT any more. The rest is mostly fixes and cleanups and some code reorganization. Specifics: - In-kernel ACPICA code update to the upstream release 20160422 adding support for ACPI 6.1 along with some previously missing bits of ACPI 6.0 support, making a fair amount of fixes and cleanups and reducing divergences between the upstream ACPICA and the in-kernel code (Bob Moore, Lv Zheng, Al Stone, Aleksey Makarov, Will Miles). - ACPI Generic Event Device (GED) support and a fix for it (Sinan Kaya, Paul Gortmaker). - INT3406 thermal driver for display thermal management and ACPI backlight support code reorganization related to it (Aaron Lu, Arnd Bergmann). - Support for exporting the value returned by the _HRV (hardware revision) ACPI object via sysfs (Betty Dall). - Removal of the EXPERT dependency for ACPI on ARM64 (Mark Brown). - Rework of the handling of ACPI _OSI mechanism allowing the _OSI("Darwin") support to be overridden from the kernel command line among other things (Lv Zheng, Chen Yu). - Rework of the ACPI tables override mechanism to prepare it for the introduction of overlays support going forward (Lv Zheng, Rafael Wysocki). - Fixes related to the ECDT support and module-level execution of AML (Lv Zheng). - ACPI PCI interrupts management update to make it work better on ARM64 mostly (Sinan Kaya). - ACPI SRAT handling update to make the code process all entires in the table order regardless of the entry type (Lukasz Anaczkowski). - EFI power off support for full-hardware ACPI platforms that don't support ACPI S5 (Chen Yu). - Fixes and cleanups related to the ACPI core's sysfs interface (Dan Carpenter, Betty Dall). - acpi_dev_present() API rework to reduce possible confusion related to it (Lukas Wunner). - Removal of CLK_IS_ROOT from two ACPI drivers (Stephen Boyd). Thanks! --- Aaron Lu (4): video / backlight: add two APIs for drivers to use video / backlight: remove the backlight_device_registered API ACPI/video: export acpi_video_get_levels Thermal / ACPI / video: add INT3406 thermal driver Al Stone (1): ACPICA: IORT: Add in support for the SMMUv3 subtable Aleksey Makarov (1): ACPICA: Headers: Add new constants for the DBG2 ACPI table Arnd Bergmann (1): ACPI / video: mark acpi_video_get_levels() inline Betty Dall (3): ACPI / device_sysfs: Add sysfs support for _HRV hardware revision ACPI / device_sysfs: Change _SUN and _STA show functions error return to EIO ACPI / device_sysfs: Clean up checkpatch errors Bob Moore (23): ACPICA: Headers: Minor update for SPCR ACPI table ACPICA: ACPI 6.1: Updates for the HEST ACPI table ACPICA: ACPI 6.1: Update NFIT table for additional new fields ACPICA: Headers: Update DMAR table for October 2014 I/O spec ACPICA: Tables: Update FADT handling ACPICA: ACPI 6.1: Add full support for this version of ACPI spec ACPICA: iASL/Headers: Fix incorrect definition of FPDT table ACPICA: Intepreter: Add object extensions to Concatenate operand ACPICA: Interpreter: Update some function headers, no functional change ACPICA: iASL: Cleanup/optimization for ToPLD macro support ACPICA: Cleanup some invocation indentations, no functional change ACPICA: Headers: Update generation of the ACPICA library ACPICA: Utilities: Update for strtoul64 merger ACPICA: All: const keyword changes across the ACPICA source ACPICA: iASL/Disassembler: Improve handling of unresolved methods ACPICA: Update version to 20160318 ACPICA: Refactor evaluate_object to reduce nesting ACPICA: ACPI 6.1: Support for new PCCT subtable ACPICA: ACPI 6.0: Update _BIX support for new package element ACPICA: ACPI 6.0, tools/iasl: Add support for new resource descriptors ACPICA: Renamed some #defined flag constants for clarity ACPICA: Move all ASCII utilities to a common file ACPICA: Update version to 20160422 Chen Yu (2): ACPI / PM: Introduce efi poweroff for HW-full platforms without _S5 ACPI / osi: Fix default _OSI(Darwin)
[GIT PULL] Power management updates for v4.7-rc1
Hi Linus, Please pull from git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm.git \ pm-4.7-rc1 to receive the main part of power management material for v4.7-rc1 with top-most commit 27c4a1c5ef61b6d4a9aeae68b24419b4319b97ed Merge branches 'pm-avs', 'pm-clk', 'powercap' and 'pm-tools' on top of commit 44549e8f5eea4e0a41b487b63e616cb089922b99 Linux 4.6-rc7 The majority of changes go into the cpufreq subsystem this time. To me, quite obviously, the biggest ticket item is the new "schedutil" governor. Interestingly enough, it's the first new cpufreq governor since the beginning of the git era (except for some out-of-the-tree ones). There are two main differences between it and the existing governors. First, it uses the information provided by the scheduler directly for making its decisions, so it doesn't have to track anything by itself. Second, it can invoke drivers (supporting that feature) to adjust CPU performance right away without having to spawn work items to be executed in process context or similar. Currently, the acpi-cpufreq driver is the only one supporting that mode of operation, but then it is used on a large number of systems. The "schedutil" governor as included here is very simple and mostly regarded as a foundation for future work on the integration of the scheduler with CPU power management (in fact, there is work in progress on top of it already). Nevertheless it works and the preliminary results obtained with it are encouraging. There also is some consolidation of CPU frequency management for ARM platforms that can add their machine IDs the the new stub dt-platdev driver now and that will take care of creating the requisite platform device for cpufreq-dt, so it is not necessary to do that in platform code any more. Several ARM platforms are switched over to using this generic mechanism. In addition to that, the intel_pstate driver is now going to respect CPU frequency limits set by the platform firmware (or a BMC) and provided via the ACPI _PPC object. The devfreq subsystem is getting a new "passive" governor for SoCs subsystems that will depend on somebody else to manage their voltage rails and its support for Samsung Exynos SoCs is consolidated. The rest is support for new hardware (Intel Broxton support in intel_idle for one example), bug fixes, optimizations and cleanups in a number of places. Specifics: - New cpufreq "schedutil" governor (making decisions based on CPU utilization information provided by the scheduler and capable of switching CPU frequencies right away if the underlying driver supports that) and support for fast frequency switching in the acpi-cpufreq driver (Rafael Wysocki). - Consolidation of CPU frequency management on ARM platforms allowing them to get rid of some platform-specific boilerplate code if they are going to use the cpufreq-dt driver (Viresh Kumar, Finley Xiao, Marc Gonzalez). - Support for ACPI _PPC and CPU frequency limits in the intel_pstate driver (Srinivas Pandruvada). - Fixes and cleanups in the cpufreq core and generic governor code (Rafael Wysocki, Sai Gurrappadi). - intel_pstate driver optimizations and cleanups (Rafael Wysocki, Philippe Longepe, Chen Yu, Joe Perches). - cpufreq powernv driver fixes and cleanups (Akshay Adiga, Shilpasri Bhat). - cpufreq qoriq driver fixes and cleanups (Jia Hongtao). - ACPI cpufreq driver cleanups (Viresh Kumar). - Assorted cpufreq driver updates (Ashwin Chaugule, Geliang Tang, Javier Martinez Canillas, Paul Gortmaker, Sudeep Holla). - Assorted cpufreq fixes and cleanups (Joe Perches, Arnd Bergmann). - Fixes and cleanups in the OPP (Operating Performance Points) framework, mostly related to OPP sharing, and reorganization of OF-dependent code in it (Viresh Kumar, Arnd Bergmann, Sudeep Holla). - New "passive" governor for devfreq (for SoC subsystems that will rely on someone else for the management of their power resources) and consolidation of devfreq support for Exynos platforms, coding style and typo fixes for devfreq (Chanwoo Choi, MyungJoo Ham). - PM core fixes and cleanups, mostly to make it work better with the generic power domains (genpd) framework, and updates for that framework (Ulf Hansson, Thierry Reding, Colin Ian King). - Intel Broxton support for the intel_idle driver (Len Brown). - cpuidle core optimization and fix (Daniel Lezcano, Dave Gerlach). - ARM cpuidle cleanups (Jisheng Zhang). - Intel Kabylake support for the RAPL power capping driver (Jacob Pan). - AVS (Adaptive Voltage Switching) rockchip-io driver update (Heiko Stuebner). - Updates for the cpupower tool (Arjun Sreedharan, Colin Ian King, Mattia Dongili, Thomas Renninger). Thanks! --- Akshay Adiga (3): cpufreq: powernv: Ramp-down global pstate slower than local-pstate cpufreq: powernv: Move smp_call_function_any() out of irq safe block cpufreq: powernv: del_timer_sync
[GIT PULL] Power management updates for v4.7-rc1
Hi Linus, Please pull from git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm.git \ pm-4.7-rc1 to receive the main part of power management material for v4.7-rc1 with top-most commit 27c4a1c5ef61b6d4a9aeae68b24419b4319b97ed Merge branches 'pm-avs', 'pm-clk', 'powercap' and 'pm-tools' on top of commit 44549e8f5eea4e0a41b487b63e616cb089922b99 Linux 4.6-rc7 The majority of changes go into the cpufreq subsystem this time. To me, quite obviously, the biggest ticket item is the new "schedutil" governor. Interestingly enough, it's the first new cpufreq governor since the beginning of the git era (except for some out-of-the-tree ones). There are two main differences between it and the existing governors. First, it uses the information provided by the scheduler directly for making its decisions, so it doesn't have to track anything by itself. Second, it can invoke drivers (supporting that feature) to adjust CPU performance right away without having to spawn work items to be executed in process context or similar. Currently, the acpi-cpufreq driver is the only one supporting that mode of operation, but then it is used on a large number of systems. The "schedutil" governor as included here is very simple and mostly regarded as a foundation for future work on the integration of the scheduler with CPU power management (in fact, there is work in progress on top of it already). Nevertheless it works and the preliminary results obtained with it are encouraging. There also is some consolidation of CPU frequency management for ARM platforms that can add their machine IDs the the new stub dt-platdev driver now and that will take care of creating the requisite platform device for cpufreq-dt, so it is not necessary to do that in platform code any more. Several ARM platforms are switched over to using this generic mechanism. In addition to that, the intel_pstate driver is now going to respect CPU frequency limits set by the platform firmware (or a BMC) and provided via the ACPI _PPC object. The devfreq subsystem is getting a new "passive" governor for SoCs subsystems that will depend on somebody else to manage their voltage rails and its support for Samsung Exynos SoCs is consolidated. The rest is support for new hardware (Intel Broxton support in intel_idle for one example), bug fixes, optimizations and cleanups in a number of places. Specifics: - New cpufreq "schedutil" governor (making decisions based on CPU utilization information provided by the scheduler and capable of switching CPU frequencies right away if the underlying driver supports that) and support for fast frequency switching in the acpi-cpufreq driver (Rafael Wysocki). - Consolidation of CPU frequency management on ARM platforms allowing them to get rid of some platform-specific boilerplate code if they are going to use the cpufreq-dt driver (Viresh Kumar, Finley Xiao, Marc Gonzalez). - Support for ACPI _PPC and CPU frequency limits in the intel_pstate driver (Srinivas Pandruvada). - Fixes and cleanups in the cpufreq core and generic governor code (Rafael Wysocki, Sai Gurrappadi). - intel_pstate driver optimizations and cleanups (Rafael Wysocki, Philippe Longepe, Chen Yu, Joe Perches). - cpufreq powernv driver fixes and cleanups (Akshay Adiga, Shilpasri Bhat). - cpufreq qoriq driver fixes and cleanups (Jia Hongtao). - ACPI cpufreq driver cleanups (Viresh Kumar). - Assorted cpufreq driver updates (Ashwin Chaugule, Geliang Tang, Javier Martinez Canillas, Paul Gortmaker, Sudeep Holla). - Assorted cpufreq fixes and cleanups (Joe Perches, Arnd Bergmann). - Fixes and cleanups in the OPP (Operating Performance Points) framework, mostly related to OPP sharing, and reorganization of OF-dependent code in it (Viresh Kumar, Arnd Bergmann, Sudeep Holla). - New "passive" governor for devfreq (for SoC subsystems that will rely on someone else for the management of their power resources) and consolidation of devfreq support for Exynos platforms, coding style and typo fixes for devfreq (Chanwoo Choi, MyungJoo Ham). - PM core fixes and cleanups, mostly to make it work better with the generic power domains (genpd) framework, and updates for that framework (Ulf Hansson, Thierry Reding, Colin Ian King). - Intel Broxton support for the intel_idle driver (Len Brown). - cpuidle core optimization and fix (Daniel Lezcano, Dave Gerlach). - ARM cpuidle cleanups (Jisheng Zhang). - Intel Kabylake support for the RAPL power capping driver (Jacob Pan). - AVS (Adaptive Voltage Switching) rockchip-io driver update (Heiko Stuebner). - Updates for the cpupower tool (Arjun Sreedharan, Colin Ian King, Mattia Dongili, Thomas Renninger). Thanks! --- Akshay Adiga (3): cpufreq: powernv: Ramp-down global pstate slower than local-pstate cpufreq: powernv: Move smp_call_function_any() out of irq safe block cpufreq: powernv: del_timer_sync
Re: [PATCH 1/1] simplified security.nscapability xattr
Quoting Serge E. Hallyn (se...@hallyn.com): ... > There's a problem though. The above suffices to prevent an unprivileged user > in a user_ns from unsharing a user_ns to write a file capability and exploit > that capability in the ns where he is unprivileged. With one exception, which > is the case where the unprivileged user is mapped to the same kuid which > created the namespace. So if uid 1000 on the host creates a namespace > where uid 1000 maps to 1000 in the namespace, then 1000 in the namespace > can create a new user_ns, write the xattr, and exploit it from the > parent namespace. This is not an uncommon case. I'm not sure what to do > about > it. Ok I think I've convinced myself that requiring a kuid 0 in the container and storing that in the security.nscapability is best solution. The DAC objection is imo not really valid - we don't have to give uid 0 in the container any special privilege, we just require that the ns have a uid 0 mapping. I have not been able to think of any other reliable way to verify that the writer of the capability is authorized to grant privilege to the file when executed by current. I'm going to proceed with another POC based on the following design: 1. no new syscalls at the moment. You can choose to set/query security.nscapability, but can also just set security.capability from a user_ns and have the kernel transparently set a security.nscapability entry for you. 2. For now just a single security.nscapability entry, but in a format that turning it into an array will be a trivial change 3. When running file foo which has a security.nscapability for kuid 10, then any namespace where kuid 10 is root - or which has an ancestor ns where that is the case - will run the file with the listed capabilities. 4. When doing getxattr of security.capability from a user_ns, if there is a security.capability entry, that will be returned; else if there is a valid security.nscapability for your ns, that will be returned. 5. when doing a setxattr of security.capability from a user_ns, if there is a security.nscapability entry, you get EBUSY; else a security.nscapability with your root kuid will be written provided that (a) you are privileged over your namespace, (b) you are privileged over your root uid, (c) the file owner maps into your namespace. 6. when doing a getxattr of security.nscapability, the entry will be shown with kuid mapped into your namespace or -1 if the uid does not map into your ns. 7. when doing a setxattr of security.nscapability, if an entry exists, you get -EBUSY; if you are not privileged over your ns, your root uid, and the file owner, then you get -EPERM; the xattr includes a uid field, which must be either 0 or a value valid in your ns. The value will be converted to a kuid and stored on disk. (Seth, I'm not sure offhand how that should mesh with your patches, we can talk about it after I send the next patch, which I'm quite certain will handle it wrongly) 8. If a security.capability exists, it will override any security.nscapability at execve() (so, inverse of my previous two patches). -serge
Re: [PATCH 1/1] simplified security.nscapability xattr
Quoting Serge E. Hallyn (se...@hallyn.com): ... > There's a problem though. The above suffices to prevent an unprivileged user > in a user_ns from unsharing a user_ns to write a file capability and exploit > that capability in the ns where he is unprivileged. With one exception, which > is the case where the unprivileged user is mapped to the same kuid which > created the namespace. So if uid 1000 on the host creates a namespace > where uid 1000 maps to 1000 in the namespace, then 1000 in the namespace > can create a new user_ns, write the xattr, and exploit it from the > parent namespace. This is not an uncommon case. I'm not sure what to do > about > it. Ok I think I've convinced myself that requiring a kuid 0 in the container and storing that in the security.nscapability is best solution. The DAC objection is imo not really valid - we don't have to give uid 0 in the container any special privilege, we just require that the ns have a uid 0 mapping. I have not been able to think of any other reliable way to verify that the writer of the capability is authorized to grant privilege to the file when executed by current. I'm going to proceed with another POC based on the following design: 1. no new syscalls at the moment. You can choose to set/query security.nscapability, but can also just set security.capability from a user_ns and have the kernel transparently set a security.nscapability entry for you. 2. For now just a single security.nscapability entry, but in a format that turning it into an array will be a trivial change 3. When running file foo which has a security.nscapability for kuid 10, then any namespace where kuid 10 is root - or which has an ancestor ns where that is the case - will run the file with the listed capabilities. 4. When doing getxattr of security.capability from a user_ns, if there is a security.capability entry, that will be returned; else if there is a valid security.nscapability for your ns, that will be returned. 5. when doing a setxattr of security.capability from a user_ns, if there is a security.nscapability entry, you get EBUSY; else a security.nscapability with your root kuid will be written provided that (a) you are privileged over your namespace, (b) you are privileged over your root uid, (c) the file owner maps into your namespace. 6. when doing a getxattr of security.nscapability, the entry will be shown with kuid mapped into your namespace or -1 if the uid does not map into your ns. 7. when doing a setxattr of security.nscapability, if an entry exists, you get -EBUSY; if you are not privileged over your ns, your root uid, and the file owner, then you get -EPERM; the xattr includes a uid field, which must be either 0 or a value valid in your ns. The value will be converted to a kuid and stored on disk. (Seth, I'm not sure offhand how that should mesh with your patches, we can talk about it after I send the next patch, which I'm quite certain will handle it wrongly) 8. If a security.capability exists, it will override any security.nscapability at execve() (so, inverse of my previous two patches). -serge