Re: [PATCH v2 06/14] ARM: sun8i: clk: Add clk-factor rate application method
On Thu, Jul 21, 2016 at 11:52:15AM +0200, Ondřej Jirman wrote: > >>> If so, then yes, trying to switch to the 24MHz oscillator before > >>> applying the factors, and then switching back when the PLL is stable > >>> would be a nice solution. > >>> > >>> I just checked, and all the SoCs we've had so far have that > >>> possibility, so if it works, for now, I'd like to stick to that. > >> > >> It would need to be tested. U-boot does the change only once, while the > >> kernel would be doing it all the time and between various frequencies > >> and PLL settings. So the issues may show up with this solution too. > > > > That would have the benefit of being quite easy to document, not be a > > huge amount of code and it would work on all the CPUs PLLs we have so > > far, so still, a pretty big win. If it doesn't, of course, we don't > > really have the choice. > > It's probably more code though. It has to access different register from > the one that is already defined in dts, which would add a lot of code > and require dts changes. The original patch I sent is simpler than that. Why? You can use container_of to retrieve the parent structure of the clock notifier, and then you get a ccu_common structure pointer, with the CCU base address, the clock register, its lock, etc. Look at what is done in drivers/clk/meson/clk-cpu.c. It's like 20 LoC. I don't really get why anything should be changed in the DT, or why it would add a lot of code. Or maybe we're not talking about the same thing? Maxime -- Maxime Ripard, Free Electrons Embedded Linux and Kernel engineering http://free-electrons.com
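To illustrate the container_of approach mentioned in the reply above, here is a minimal userspace sketch. The structure names (fake_notifier, fake_ccu_common) and their fields are hypothetical stand-ins, not the real sunxi or clk framework types; only the container_of pattern itself mirrors what the kernel does.

```c
#include <assert.h>
#include <stddef.h>

/* Userspace re-creation of the container_of() idea: step back from a
 * pointer to an embedded member to the structure that contains it. */
#define container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

/* Hypothetical stand-in for a notifier block. */
struct fake_notifier {
	int priority;
};

/* Hypothetical stand-in for a CCU-like parent structure. */
struct fake_ccu_common {
	unsigned long base;		/* pretend register base address */
	unsigned int reg;		/* pretend clock register offset */
	struct fake_notifier nb;	/* embedded notifier */
};

/* The callback only receives the embedded member, yet it can reach
 * the parent structure and compute the register address from it. */
static unsigned long fake_notifier_cb(struct fake_notifier *nb)
{
	struct fake_ccu_common *common;

	assert(nb != NULL);
	common = container_of(nb, struct fake_ccu_common, nb);
	return common->base + common->reg;
}
```

In this sketch nothing outside the parent structure has to be looked up, which is the point made in the reply: the register information travels with the notifier.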
Re: [PATCH] staging: ks7010: declare private functions static
On Tue, Jul 26, 2016 at 08:51:14AM +0200, Wolfram Sang wrote: > On Tue, Jul 26, 2016 at 06:48:00AM +, Nicholas Mc Guire wrote: > > On Mon, Jul 25, 2016 at 11:04:18PM +0200, Wolfram Sang wrote: > > > On Mon, Jul 25, 2016 at 09:22:27PM +0200, Nicholas Mc Guire wrote: > > > > Private functions in ks_hostif.c can be declared static. > > > > > > > > Fixes: 13a9930d15b4 ("staging: ks7010: add driver from Nanonote > > > > extra-repository") > > > > > > > > Signed-off-by: Nicholas Mc Guire > > > > > > Reviewed-by: Wolfram Sang > > > > > > drivers/staging/ks7010/ks7010_sdio.c and > > > drivers/staging/ks7010/ks_wlan_net.c have similar warnings in case you'd > > > like to fix those, too.) > > > > > the cases found regarding completion were: > > ./drivers/staging/ks7010/ks_hostif.c:80 treating signal case as success > > ./drivers/staging/ks7010/ks_wlan_net.c:109 treating signal case as success > > ./drivers/staging/ks7010/ks7010_sdio.c:901 treating signal case as success > > ./drivers/staging/ks7010/ks7010_sdio.c:929 treating signal case as success > > ./drivers/video/fbdev/exynos/exynos_mipi_dsi_common.c:383 treating signal > > case as success > > ./drivers/video/fbdev/exynos/exynos_mipi_dsi_common.c:247 treating signal > > case as success > > > > will be going through all of them in the next days. > > Awesome, thanks! > > I meant the "should it be static?" sparse warnings here, though :) > well I do run sparse on all the cleanups and if that triggers and it is sufficiently clear from context, patches will follow. thx! hofrat
Re: [PATCH] s390/perf: fix 'start' address of module's map
On Fri, Jul 22, 2016 at 10:47:34AM +0800, Songshan Gong wrote: > Has the patch been accepted by upstream? > > On 7/21/2016 11:10 AM, Song Shan Gong wrote: > > At present, when creating module's map, perf gets 'start' address by parsing > > '/proc/modules', but that is the module base address, not the start address of > > '.text' section. In most archs, it's OK. But for s390, it places 'GOT' and > > 'PLT' relocations before '.text' section. So there exists an offset between > > module base address and '.text' section, which will incur wrong symbol > > resolution for modules. > > > > Fix this bug by getting 'start' address of module's map from parsing > > '/sys/module/[module name]/sections/.text', not from '/proc/modules'. > > > > Signed-off-by: Song Shan Gong > > Acked-by: Jiri Olsa I think it's good to go, Arnaldo, could you please take this one? thanks, jirka
Re: [PATCH] staging: ks7010: declare private functions static
On Tue, Jul 26, 2016 at 06:48:00AM +, Nicholas Mc Guire wrote: > On Mon, Jul 25, 2016 at 11:04:18PM +0200, Wolfram Sang wrote: > > On Mon, Jul 25, 2016 at 09:22:27PM +0200, Nicholas Mc Guire wrote: > > > Private functions in ks_hostif.c can be declared static. > > > > > > Fixes: 13a9930d15b4 ("staging: ks7010: add driver from Nanonote > > > extra-repository") > > > > > > Signed-off-by: Nicholas Mc Guire > > > > Reviewed-by: Wolfram Sang > > > > drivers/staging/ks7010/ks7010_sdio.c and > > drivers/staging/ks7010/ks_wlan_net.c have similar warnings in case you'd > > like to fix those, too.) > > > the cases found regarding completion were: > ./drivers/staging/ks7010/ks_hostif.c:80 treating signal case as success > ./drivers/staging/ks7010/ks_wlan_net.c:109 treating signal case as success > ./drivers/staging/ks7010/ks7010_sdio.c:901 treating signal case as success > ./drivers/staging/ks7010/ks7010_sdio.c:929 treating signal case as success > ./drivers/video/fbdev/exynos/exynos_mipi_dsi_common.c:383 treating signal > case as success > ./drivers/video/fbdev/exynos/exynos_mipi_dsi_common.c:247 treating signal > case as success > > will be going through all of them in the next days. Awesome, thanks! I meant the "should it be static?" sparse warnings here, though :)
Re: [PATCH] i2c: i801: use IS_ENABLED() instead of checking for built-in or module
On Thu, Jul 21, 2016 at 12:11:01PM -0400, Javier Martinez Canillas wrote: > The IS_ENABLED() macro checks if a Kconfig symbol has been enabled either > built-in or as a module, use that macro instead of open coding the same. > > Using the macro makes the code more readable by helping abstract away some > of the Kconfig built-in and module enable details. > > Signed-off-by: Javier Martinez Canillas Applied to for-next, thanks!
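The difference between the open-coded check and the macro can be shown with a small userspace sketch. The helper macros below only approximate the preprocessor trick behind the kernel's IS_ENABLED() and are not a copy of the kernel header; CONFIG_DEMO_BUILTIN, CONFIG_DEMO_MOD and CONFIG_DEMO_OFF are made-up symbols used purely for illustration.

```c
#include <assert.h>

/* A symbol defined as 1 turns into "0," and shifts the argument list;
 * anything else is junk that stays glued to the first argument. */
#define ARG_PLACEHOLDER_1 0,
#define TAKE_SECOND_ARG(ignored, val, ...) val
#define IS_DEFINED(x) IS_DEFINED_1(x)
#define IS_DEFINED_1(val) IS_DEFINED_2(ARG_PLACEHOLDER_##val)
#define IS_DEFINED_2(arg1_or_junk) TAKE_SECOND_ARG(arg1_or_junk 1, 0, 0)
#define OR(x, y) OR_1(x, y)
#define OR_1(x, y) OR_2(ARG_PLACEHOLDER_##x, y)
#define OR_2(arg1_or_junk, y) TAKE_SECOND_ARG(arg1_or_junk 1, y, 0)

#define IS_BUILTIN(option) IS_DEFINED(option)
#define IS_MODULE(option) IS_DEFINED(option##_MODULE)
#define IS_ENABLED(option) OR(IS_BUILTIN(option), IS_MODULE(option))

#define CONFIG_DEMO_BUILTIN 1		/* pretend "=y" */
#define CONFIG_DEMO_MOD_MODULE 1	/* pretend "=m" */
/* CONFIG_DEMO_OFF is intentionally left undefined. */

static int demo_builtin_enabled(void) { return IS_ENABLED(CONFIG_DEMO_BUILTIN); }
static int demo_mod_enabled(void) { return IS_ENABLED(CONFIG_DEMO_MOD); }
static int demo_off_enabled(void) { return IS_ENABLED(CONFIG_DEMO_OFF); }

/* The open-coded form that the patch replaces. */
static int demo_mod_open_coded(void)
{
#if defined(CONFIG_DEMO_MOD) || defined(CONFIG_DEMO_MOD_MODULE)
	return 1;
#else
	return 0;
#endif
}
```

Both forms give the same answer; the macro simply hides the "_MODULE" suffix handling behind one name.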
Re: [PATCH] iio: adc: rockchip_saradc: Explicitly disable ADC on probe
On 2016-07-26 11:22, Guenter Roeck wrote: On 07/25/2016 07:51 PM, Caesar Wang wrote: Hi Guenter, Thanks for fixing it. On 2016-07-26 03:39, Guenter Roeck wrote: If the ADC is read for the first time, the caller gets a timeout error, and the kernel log shows read channel() error: -110 The ADC may be enabled on boot, and needs to be explicitly disabled for a read sequence to work (otherwise there is no completion interrupt). Disable it explicitly in the probe function. Fixes: 44d6f2ef94f9 ("iio: adc: add driver for Rockchip saradc") Signed-off-by: Guenter Roeck --- drivers/iio/adc/rockchip_saradc.c | 3 +++ 1 file changed, 3 insertions(+) diff --git a/drivers/iio/adc/rockchip_saradc.c b/drivers/iio/adc/rockchip_saradc.c index f9ad6c2d6821..6aa3271d86b5 100644 --- a/drivers/iio/adc/rockchip_saradc.c +++ b/drivers/iio/adc/rockchip_saradc.c @@ -280,6 +280,9 @@ static int rockchip_saradc_probe(struct platform_device *pdev) goto err_pclk; } +/* Make sure ADC is disabled */ +writel_relaxed(0, info->regs + SARADC_CTRL); I think we should reset the saradc controller. Since make sure the reset value is 0 and loader-->kernel may even cause harm, as my experience on tsadc. (drivers/thermal/rockchip_thermal.c) e.g.: /** * Reset SARADC Controller, reset all saradc registers. */ static void rockchip_saradc_reset_controller(struct reset_control *reset) { reset_control_assert(reset); usleep_range(10, 20); reset_control_deassert(reset); } ..probe() { ... rockchip_saradc_reset_controller(); ... } Ok, I'll give it a try. I posted it on https://patchwork.kernel.org/patch/9247661/ Guenter - Caesar + platform_set_drvdata(pdev, indio_dev); indio_dev->name = dev_name(&pdev->dev); -- caesar wang | software engineer | w...@rock-chip.com
Re: [PATCH] staging: ks7010: declare private functions static
On Mon, Jul 25, 2016 at 11:04:18PM +0200, Wolfram Sang wrote: > On Mon, Jul 25, 2016 at 09:22:27PM +0200, Nicholas Mc Guire wrote: > > Private functions in ks_hostif.c can be declared static. > > > > Fixes: 13a9930d15b4 ("staging: ks7010: add driver from Nanonote > > extra-repository") > > > > Signed-off-by: Nicholas Mc Guire > > Reviewed-by: Wolfram Sang > > drivers/staging/ks7010/ks7010_sdio.c and > drivers/staging/ks7010/ks_wlan_net.c have similar warnings in case you'd > like to fix those, too.) > the cases found regarding completion were: ./drivers/staging/ks7010/ks_hostif.c:80 treating signal case as success ./drivers/staging/ks7010/ks_wlan_net.c:109 treating signal case as success ./drivers/staging/ks7010/ks7010_sdio.c:901 treating signal case as success ./drivers/staging/ks7010/ks7010_sdio.c:929 treating signal case as success ./drivers/video/fbdev/exynos/exynos_mipi_dsi_common.c:383 treating signal case as success ./drivers/video/fbdev/exynos/exynos_mipi_dsi_common.c:247 treating signal case as success will be going through all of them in the next days. thx! hofrat
Re: [PATCH] ceph: Correctly return NXIO errors from ceph_llseek.
> On Jul 22, 2016, at 01:43, Phil Turnbull wrote: > > ceph_llseek does not correctly return NXIO errors because the 'out' path > always returns 'offset'. > > Fixes: 06222e491e66 ("fs: handle SEEK_HOLE/SEEK_DATA properly in all fs's > that define their own llseek") > Signed-off-by: Phil Turnbull > --- > fs/ceph/file.c | 12 +--- > 1 file changed, 5 insertions(+), 7 deletions(-) > > diff --git a/fs/ceph/file.c b/fs/ceph/file.c > index ce2f5795e44b..13adb5b2ef29 100644 > --- a/fs/ceph/file.c > +++ b/fs/ceph/file.c > @@ -1448,16 +1448,14 @@ static loff_t ceph_llseek(struct file *file, loff_t > offset, int whence) > { > struct inode *inode = file->f_mapping->host; > loff_t i_size; > - int ret; > + loff_t ret; > > inode_lock(inode); > > if (whence == SEEK_END || whence == SEEK_DATA || whence == SEEK_HOLE) { > ret = ceph_do_getattr(inode, CEPH_STAT_CAP_SIZE, false); > - if (ret < 0) { > - offset = ret; > + if (ret < 0) > goto out; > - } > } > > i_size = i_size_read(inode); > @@ -1473,7 +1471,7 @@ static loff_t ceph_llseek(struct file *file, loff_t > offset, int whence) >* write() or lseek() might have altered it >*/ > if (offset == 0) { > - offset = file->f_pos; > + ret = file->f_pos; > goto out; > } > offset += file->f_pos; > @@ -1493,11 +1491,11 @@ static loff_t ceph_llseek(struct file *file, loff_t > offset, int whence) > break; > } > > - offset = vfs_setpos(file, offset, inode->i_sb->s_maxbytes); > + ret = vfs_setpos(file, offset, inode->i_sb->s_maxbytes); > > out: > inode_unlock(inode); > - return offset; > + return ret; > } > > static inline void ceph_zero_partial_page( applied, thanks Yan, Zheng > -- > 2.9.0.rc2 >
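The effect described in the commit message can be modelled in a few lines of userspace C. This is only a sketch: the function names and the demo_loff_t type are invented here, and the branch that sets -ENXIO is an assumption about the part of ceph_llseek() that the diff context does not show.

```c
#include <assert.h>
#include <errno.h>

typedef long long demo_loff_t;

/* Buggy shape: the error is stored in 'ret', but the 'out' path
 * returns 'offset', so the caller never sees -ENXIO. */
static demo_loff_t demo_seek_data_buggy(demo_loff_t offset, demo_loff_t i_size)
{
	int ret = 0;

	if (offset >= i_size) {
		ret = -ENXIO;
		goto out;
	}
out:
	(void)ret;	/* the error value is silently dropped */
	return offset;
}

/* Fixed shape: a single 'ret' of the wide type carries either the
 * resulting position or the negative error code. */
static demo_loff_t demo_seek_data_fixed(demo_loff_t offset, demo_loff_t i_size)
{
	demo_loff_t ret;

	if (offset >= i_size) {
		ret = -ENXIO;
		goto out;
	}
	ret = offset;
out:
	return ret;
}
```

Widening 'ret' to the offset type, as the patch does with loff_t, is what lets one variable serve both purposes.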
Re: [GIT PULL] perf changes for v4.8
* Stephen Rothwell wrote: > > > That is why I sent this without mentioning the conflict. Is there any > > > other > > > complication that I missed? > > > > Actually, the perf tree on its own was enough to trigger the build problem, > > the luto-next tree was just what initially triggered the build failure in > > linux-next (I guess there is some missing dependency). After the build > > failed, > > I started including the perf tree directly before the tip tree and the > > build > > would fail when I merged that ... > > Now that this is fixed and merged into the tip tree, I have removed the perf > tree from linux-next. Ok, thanks - and sorry about this - I'll get the tooling fixes to Linus ASAP. Thanks, Ingo
Re: [PATCH 1/1 linux-next] kbuild: add make force=1 for testing
Andrew Morton writes: > On Sun, 24 Jul 2016 15:28:18 +0200 Fabian Frederick wrote: > >> Commit 51193b76bfff >> ("kbuild: forbid kernel directory to contain spaces and colons") >> >> makes it impossible to build kernel on default SD labels like >> "SD Card" for instance. >> >> Makefile:133: *** main directory cannot contain spaces nor colons. Stop. >> >> User could rename directories but volume name is not always writable. >> >> This patch adds ability to do make force=1 for people >> not interested in modules_install in this case but only testing. >> >> (Note that other options could go under ifndef force) > > That's a bit of a hack on a hack. > > 51193b76bfff said: > > :When the kernel path contains a space or a colon somewhere in the path > :name, the modules_install target doesn't work anymore, as the path names > :are not enclosed in double quotes. It is also supposed that and O= build > :will suffer from the same weakness as modules_install. > : > :Instead of checking and improving kbuild to resist to directories > :including these characters, error out early to prevent any build if the > :kernel's main directory contains a space. > > What's involved in fixing this properly? Make the whole kbuild > system operate correctly when there are spaces/colons in the > pathname? I was thinking originally fixing it by : http://www.spinics.net/lists/linux-kbuild/msg12036.html This fixed "properly" the make modules_install I think. And Marek pointed out that there were other cases, such as O=/my dir/ but not limited to, where it would also break, hence this patch. I'm not a kbuild expert so I'd like someone else (Marek) to enumerate the remaining cases not covered by the original patch. Cheers. -- Robert
Re: [PATCH] staging: ks7010: fix wait_for_completion_interruptible_timeout return handling
On Mon, Jul 25, 2016 at 10:54:03PM +0200, Wolfram Sang wrote: > On Mon, Jul 25, 2016 at 09:21:50PM +0200, Nicholas Mc Guire wrote: > > wait_for_completion_interruptible_timeout returns 0 on timeout and > > -ERESTARTSYS if interrupted. The check for > > !wait_for_completion_interruptible_timeout() would report an interrupt > > as timeout. Further, while HZ/50 will work most of the time it could > > Wouldn't it interpret -ERESTARTSYS as *no timeout*? > yup - actually the current code just treats the -ERESTARTSYS case as success. > Anyway, the plain !0 comparison for me clearly shows that > 'interruptible' was more copy&pasted than really planned or supported. > If it was, it would need to cancel something. Also, 20ms is pretty hard > to cancel for a user ;) Given all that and the troubles we had with > 'interruptible' in the I2C subsystem, I'd much vote for dropping > interruptible here. > > > fail for HZ < 50, so this is switched to msecs_to_jiffies(20). > > Rest looks good, thanks! > thx! hofrat
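The two points discussed in this thread (the three-way return value and the HZ/50 truncation) can be sketched in userspace C. Everything named demo_* or DEMO_* is made up for illustration; DEMO_ERESTARTSYS stands in for the kernel-internal error code, and demo_ms_to_ticks() is a simplified round-up model, not the kernel's msecs_to_jiffies() implementation.

```c
#include <assert.h>

#define DEMO_ERESTARTSYS 512	/* stand-in for the kernel-internal code */

enum demo_wait_result {
	DEMO_WAIT_DONE,
	DEMO_WAIT_TIMEOUT,
	DEMO_WAIT_INTERRUPTED,
};

/* Three-way handling: 0 is a timeout, negative is an interruption,
 * positive is the remaining time after a successful completion. */
static enum demo_wait_result demo_classify_wait(long ret)
{
	if (ret == 0)
		return DEMO_WAIT_TIMEOUT;
	if (ret < 0)
		return DEMO_WAIT_INTERRUPTED;
	return DEMO_WAIT_DONE;
}

/* Old check shape: "if (!ret) timeout; else success". */
static int demo_old_check_is_success(long ret)
{
	return ret != 0;
}

/* HZ/50 is integer division, so it truncates to 0 for HZ < 50. */
static unsigned long demo_hz_div_50(unsigned long hz)
{
	return hz / 50;
}

/* Simplified round-up conversion from milliseconds to ticks. */
static unsigned long demo_ms_to_ticks(unsigned long ms, unsigned long hz)
{
	assert(hz > 0);
	return (ms * hz + 999) / 1000;
}
```

With a hypothetical tick rate of 24, the division yields a zero-length timeout while the rounding conversion still yields one tick.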
Re: [PATCH] clocksource: sun4i: Clear interrupts after stopping timer in probe function
On Tue, Jul 26, 2016 at 1:49 PM, Maxime Ripard wrote: > On Tue, Jul 26, 2016 at 11:01:59AM +0800, Chen-Yu Tsai wrote: >> The bootloader (U-boot) sometimes uses this timer for various delays. >> It uses it as an ongoing counter, and does comparisons on the current >> counter value. The timer counter is never stopped. >> >> In some cases when the user interacts with the bootloader, or lets >> it idle for some time before loading Linux, the timer may expire, >> and an interrupt will be pending. This results in an unexpected >> interrupt when the timer interrupt is enabled by the kernel, at >> which point the event_handler isn't set yet. This results in a NULL >> pointer dereference exception, panic, and no way to reboot. >> >> Clear any pending interrupts after we stop the timer in the probe >> function to avoid this. >> >> Signed-off-by: Chen-Yu Tsai > > Awesome, thanks! > > You should put stable in Cc though for this kind of patch. AFAIK some maintainers prefer to add it themselves. Not sure about clocksource so I left it out. ChenYu
[PATCH 1/2] ARM: dts: imx7d: move ARM platform peripherals inside soc node
Since we have a SoC level node we should make use of it and have all nodes which are within the SoC, inside that node. This also saves an extra interrupt-parent properties. While at it, also order the Coresight nodes according to register addresses. Signed-off-by: Stefan Agner --- Hi Shawn, Not sure if there was a reasoning behind having all these nodes not within the soc subnode, but it seems to me somewhat uncommon in the i.MX world... If possible this patchset should go into v4.8 since 2/2 is a fix, however, I understand that 1/2 is not really post rc1 material... What do you think? -- Stefan arch/arm/boot/dts/imx7d.dtsi | 32 ++--- arch/arm/boot/dts/imx7s.dtsi | 301 +-- 2 files changed, 167 insertions(+), 166 deletions(-) diff --git a/arch/arm/boot/dts/imx7d.dtsi b/arch/arm/boot/dts/imx7d.dtsi index 51c13cb..3d77d95 100644 --- a/arch/arm/boot/dts/imx7d.dtsi +++ b/arch/arm/boot/dts/imx7d.dtsi @@ -52,23 +52,25 @@ }; }; - etm@3007d000 { - compatible = "arm,coresight-etm3x", "arm,primecell"; - reg = <0x3007d000 0x1000>; + soc { + etm@3007d000 { + compatible = "arm,coresight-etm3x", "arm,primecell"; + reg = <0x3007d000 0x1000>; - /* -* System will hang if added nosmp in kernel command line -* without arm,primecell-periphid because amba bus try to -* read id and core1 power off at this time. -*/ - arm,primecell-periphid = <0xbb956>; - cpu = <&cpu1>; - clocks = <&clks IMX7D_MAIN_AXI_ROOT_CLK>; - clock-names = "apb_pclk"; + /* +* System will hang if added nosmp in kernel command line +* without arm,primecell-periphid because amba bus try to +* read id and core1 power off at this time. 
+*/ + arm,primecell-periphid = <0xbb956>; + cpu = <&cpu1>; + clocks = <&clks IMX7D_MAIN_AXI_ROOT_CLK>; + clock-names = "apb_pclk"; - port { - etm1_out_port: endpoint { - remote-endpoint = <&ca_funnel_in_port1>; + port { + etm1_out_port: endpoint { + remote-endpoint = <&ca_funnel_in_port1>; + }; }; }; }; diff --git a/arch/arm/boot/dts/imx7s.dtsi b/arch/arm/boot/dts/imx7s.dtsi index 1e90bdb..d89587a 100644 --- a/arch/arm/boot/dts/imx7s.dtsi +++ b/arch/arm/boot/dts/imx7s.dtsi @@ -95,16 +95,6 @@ }; }; - intc: interrupt-controller@31001000 { - compatible = "arm,cortex-a7-gic"; - #interrupt-cells = <3>; - interrupt-controller; - reg = <0x31001000 0x1000>, - <0x31002000 0x1000>, - <0x31004000 0x2000>, - <0x31006000 0x2000>; - }; - ckil: clock-cki { compatible = "fixed-clock"; #clock-cells = <0>; @@ -119,195 +109,204 @@ clock-output-names = "osc"; }; - timer { - compatible = "arm,armv7-timer"; - interrupts = , -, -, -; + soc { + #address-cells = <1>; + #size-cells = <1>; + compatible = "simple-bus"; interrupt-parent = <&intc>; - }; + ranges; - etr@30086000 { - compatible = "arm,coresight-tmc", "arm,primecell"; - reg = <0x30086000 0x1000>; - clocks = <&clks IMX7D_MAIN_AXI_ROOT_CLK>; - clock-names = "apb_pclk"; + funnel@30041000 { + compatible = "arm,coresight-funnel", "arm,primecell"; + reg = <0x30041000 0x1000>; + clocks = <&clks IMX7D_MAIN_AXI_ROOT_CLK>; + clock-names = "apb_pclk"; + + ca_funnel_ports: ports { + #address-cells = <1>; + #size-cells = <0>; + + /* funnel input ports */ + port@0 { + reg = <0>; + ca_funnel_in_port0: endpoint { + slave-mode; + remote-endpoint = <&etm0_out_port>; + }; + }; + + /* funnel output
[PATCH 2/2] ARM: dts: imx7d: fix GIC nodes interrupt and register specification
The i.MX 7 has a GICv2, hence its CPU interface register map (the second register region) is 8kB long. Add the VGIC maintenance interrupt, which allows use of the new VGIC driver. Signed-off-by: Stefan Agner Suggested-by: Marc Zyngier --- arch/arm/boot/dts/imx7s.dtsi | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/arch/arm/boot/dts/imx7s.dtsi b/arch/arm/boot/dts/imx7s.dtsi index d89587a..c63591c 100644 --- a/arch/arm/boot/dts/imx7s.dtsi +++ b/arch/arm/boot/dts/imx7s.dtsi @@ -292,10 +292,11 @@ intc: interrupt-controller@31001000 { compatible = "arm,cortex-a7-gic"; + interrupts = ; #interrupt-cells = <3>; interrupt-controller; reg = <0x31001000 0x1000>, - <0x31002000 0x1000>, + <0x31002000 0x2000>, <0x31004000 0x2000>, <0x31006000 0x2000>; }; -- 2.9.0
Re: [GIT PULL] perf changes for v4.8
* Stephen Rothwell wrote: > Hi Linus, > > On Mon, 25 Jul 2016 14:45:53 -0700 Linus Torvalds > wrote: > > > > On Mon, Jul 25, 2016 at 2:21 PM, Stephen Rothwell > > wrote: > > > > > > Actually, the perf tree on its own was enough to trigger the build > > > problem, the luto-next tree was just what initially triggered the build > > > failure in linux-next (I guess there is some missing dependency). > > > After the build failed, I started including the perf tree directly > > > before the tip tree and the build would fail when I merged that ... > > > > Ugh. It's merged in my tree now, because I thought it was ok. Can > > somebody point me to the fix? > > It only affects cross building of the objtool and vdso2c tools (which is > how I work). The latest version of the perf/core branch in the tip > tree now has all the fixes, so I assume that Ingo will send another > pull request. Yes, I'll send this ASAP. > Unfortunately, that means that your tree is broken for me this > morning ... but I will cope, I guess. That's weird, I pushed out the fix from Arnaldo yesterday (about 8 hours ago) which should merge fine with Linus's tree and make your tooling combination work. Thanks, Ingo
Re: staging: wilc1000: Reduce scope for a few variables in mac_ioctl()
>> -if (strncasecmp(buff, "RSSI", length) == 0) { >> +if (strncasecmp(buff, "RSSI", 0) == 0) { >> +s8 rssi; >> + > > Um, please think a second about if it makes any sense at all to compare > zero chars of two strings. Under which circumstances should the variable "length" contain another value than zero? How can this open issue be fixed better? Regards, Markus
RE: [PATCH 4.6 143/203] memory: omap-gpmc: Fix omap gpmc EXTRADELAY timing
-----Original Message----- From: Greg Kroah-Hartman [mailto:gre...@linuxfoundation.org] Sent: Monday, July 25, 2016 22:56 To: linux-kernel@vger.kernel.org Cc: Greg Kroah-Hartman; sta...@vger.kernel.org; Ocquidant, Sebastien; Roger Quadros Subject: [PATCH 4.6 143/203] memory: omap-gpmc: Fix omap gpmc EXTRADELAY timing 4.6-stable review patch. If anyone has any objections, please let me know. -- From: Ocquidant, Sebastien commit 8f50b8e57442d28e41bb736c173d8a2490549a82 upstream. In the omap gpmc driver it can be noticed that GPMC_CONFIG4_OEEXTRADELAY is overwritten by the WEEXTRADELAY value from the device tree and GPMC_CONFIG4_WEEXTRADELAY is not updated by the value from the device tree. As a consequence, the memory accesses cannot be configured properly when extra delays are needed for OE and WE. Fix the update of GPMC_CONFIG4_WEEXTRADELAY with the value from the device tree file and prevent GPMC_CONFIG4_OEEXTRADELAY from being overwritten by the WEEXTRADELAY value from the device tree. Signed-off-by: Ocquidant, Sebastien Signed-off-by: Roger Quadros Signed-off-by: Greg Kroah-Hartman --- drivers/memory/omap-gpmc.c |2 +- 1 file changed, 1 insertion(+), 1 deletion(-) --- a/drivers/memory/omap-gpmc.c +++ b/drivers/memory/omap-gpmc.c @@ -394,7 +394,7 @@ static void gpmc_cs_bool_timings(int cs, gpmc_cs_modify_reg(cs, GPMC_CS_CONFIG4, GPMC_CONFIG4_OEEXTRADELAY, p->oe_extra_delay); gpmc_cs_modify_reg(cs, GPMC_CS_CONFIG4, - GPMC_CONFIG4_OEEXTRADELAY, p->we_extra_delay); + GPMC_CONFIG4_WEEXTRADELAY, p->we_extra_delay); gpmc_cs_modify_reg(cs, GPMC_CS_CONFIG6, GPMC_CONFIG6_CYCLE2CYCLESAMECSEN, p->cycle2cyclesamecsen); Hi Greg, OK for me Sébastien Ocquidant
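The one-line fix is easier to see in a small userspace model of the read-modify-write sequence. This is a sketch only: the demo_* helpers are invented, and the bit positions are illustrative placeholders rather than values taken from the GPMC register documentation.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define DEMO_OEEXTRADELAY (1u << 7)	/* illustrative bit position */
#define DEMO_WEEXTRADELAY (1u << 23)	/* illustrative bit position */

/* Set or clear the bits selected by 'mask' in a register value. */
static uint32_t demo_modify_reg(uint32_t reg, uint32_t mask, bool value)
{
	assert(mask != 0);
	return value ? (reg | mask) : (reg & ~mask);
}

/* Buggy sequence: the second call reuses the OE mask, so the WE
 * setting overwrites the OE bit and the WE bit is never touched. */
static uint32_t demo_apply_buggy(uint32_t reg, bool oe, bool we)
{
	reg = demo_modify_reg(reg, DEMO_OEEXTRADELAY, oe);
	reg = demo_modify_reg(reg, DEMO_OEEXTRADELAY, we);
	return reg;
}

/* Fixed sequence: each setting goes to its own bit. */
static uint32_t demo_apply_fixed(uint32_t reg, bool oe, bool we)
{
	reg = demo_modify_reg(reg, DEMO_OEEXTRADELAY, oe);
	reg = demo_modify_reg(reg, DEMO_WEEXTRADELAY, we);
	return reg;
}
```

With OE requested and WE not requested, the buggy sequence ends with neither bit set, which matches the symptom described in the commit message.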
[PATCH v2] net: neigh: disallow transition to NUD_STALE if lladdr is unchanged in neigh_update()
NUD_STALE is used when the caller (e.g. arp_process()) can't guarantee neighbour reachability. If the entry was NUD_VALID and lladdr is unchanged, the entry state should not be changed. Currently the code adds an extra "NUD_CONNECTED" condition. So if old state was NUD_DELAY or NUD_PROBE (they are NUD_VALID but not NUD_CONNECTED), the state can be changed to NUD_STALE. This may cause a problem. Because NUD_STALE lladdr doesn't guarantee reachability, when we send traffic, the state will be changed to NUD_DELAY. In the normal case, if we get no confirmation (by dst_confirm()), we will change the state to NUD_PROBE and send probe traffic. But now the state may be reset to NUD_STALE again (e.g. by broadcast ARP packets), so the probe traffic will not be sent. This situation may happen again and again, and packets will be sent to a non-reachable lladdr forever. The fix is to remove the "NUD_CONNECTED" condition. After that the "NEIGH_UPDATE_F_WEAK_OVERRIDE" condition (used by IPv6) in that branch will be redundant, so remove it. This change may increase probe traffic, but it's essential since NUD_STALE lladdr is unreliable. To ensure correctness, we prefer to resolve lladdr, when we can't get confirmation, even while remote packets try to set NUD_STALE state. Signed-off-by: Chunhui He --- v2: - change title from "net: neigh: disallow state transition DELAY->STALE in neigh_update()" - remove "NUD_CONNECTED" condition instead of "NUD_CONNECTED | NUD_DELAY" - remove "NEIGH_UPDATE_F_WEAK_OVERRIDE" condition --- net/core/neighbour.c | 7 +-- 1 file changed, 1 insertion(+), 6 deletions(-) diff --git a/net/core/neighbour.c b/net/core/neighbour.c index 510cd62..ed8c317e 100644 --- a/net/core/neighbour.c +++ b/net/core/neighbour.c @@ -1060,8 +1060,6 @@ static void neigh_update_hhs(struct neighbour *neigh) NEIGH_UPDATE_F_WEAK_OVERRIDE will suspect existing "connected" lladdr instead of overriding it if it is different. - It also allows to retain current state - if lladdr is unchanged. 
NEIGH_UPDATE_F_ADMIN means that the change is administrative. NEIGH_UPDATE_F_OVERRIDE_ISROUTER allows to override existing @@ -1150,10 +1148,7 @@ int neigh_update(struct neighbour *neigh, const u8 *lladdr, u8 new, } else goto out; } else { - if (lladdr == neigh->ha && new == NUD_STALE && - ((flags & NEIGH_UPDATE_F_WEAK_OVERRIDE) || -(old & NUD_CONNECTED)) - ) + if (lladdr == neigh->ha && new == NUD_STALE) new = old; } } -- 2.1.4
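The state decision changed by this patch can be modelled in userspace as follows. This is a sketch under stated assumptions: the DEMO_NUD_* values and the demo_* function names are illustrative stand-ins for the kernel's neighbour state flags, and only the single branch touched by the diff is modelled, not the rest of neigh_update().

```c
#include <assert.h>

/* Illustrative stand-ins for the neighbour state flags. */
#define DEMO_NUD_REACHABLE	0x02u
#define DEMO_NUD_STALE		0x04u
#define DEMO_NUD_DELAY		0x08u
#define DEMO_NUD_PROBE		0x10u
#define DEMO_NUD_NOARP		0x40u
#define DEMO_NUD_PERMANENT	0x80u
#define DEMO_NUD_CONNECTED \
	(DEMO_NUD_PERMANENT | DEMO_NUD_NOARP | DEMO_NUD_REACHABLE)

/* Old branch: the current state is kept only for "connected" entries
 * (or with the weak-override flag), so DELAY and PROBE fall to STALE. */
static unsigned int demo_next_state_old(unsigned int old, unsigned int requested,
					int lladdr_unchanged, int weak_override)
{
	if (lladdr_unchanged && requested == DEMO_NUD_STALE &&
	    (weak_override || (old & DEMO_NUD_CONNECTED)))
		return old;
	return requested;
}

/* New branch: an unchanged lladdr never moves the entry to STALE. */
static unsigned int demo_next_state_new(unsigned int old, unsigned int requested,
					int lladdr_unchanged)
{
	if (lladdr_unchanged && requested == DEMO_NUD_STALE)
		return old;
	return requested;
}
```

In this model an entry in the DELAY state stays in DELAY under the new branch, so the later transition to PROBE described in the commit message can still take place.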
[PATCH 1/4] iio: adc: rockchip_saradc: reset saradc controller before programming it
SARADC controller needs to be reset before programming it, otherwise it will not function properly. Signed-off-by: Caesar Wang Cc: Jonathan Cameron Cc: Heiko Stuebner Cc: Rob Herring Cc: linux-...@vger.kernel.org Cc: linux-rockc...@lists.infradead.org --- .../bindings/iio/adc/rockchip-saradc.txt | 5 + drivers/iio/adc/Kconfig| 1 + drivers/iio/adc/rockchip_saradc.c | 22 ++ 3 files changed, 28 insertions(+) diff --git a/Documentation/devicetree/bindings/iio/adc/rockchip-saradc.txt b/Documentation/devicetree/bindings/iio/adc/rockchip-saradc.txt index bf99e2f..d2258be 100644 --- a/Documentation/devicetree/bindings/iio/adc/rockchip-saradc.txt +++ b/Documentation/devicetree/bindings/iio/adc/rockchip-saradc.txt @@ -13,6 +13,9 @@ Required properties: - clocks: Must contain an entry for each entry in clock-names. - clock-names: Shall be "saradc" for the converter-clock, and "apb_pclk" for the peripheral clock. +- resets: Must contain an entry for each entry in reset-names. + See ../reset/reset.txt for details. +- reset-names: Must include the name "saradc-apb". - vref-supply: The regulator supply ADC reference voltage. - #io-channel-cells: Should be 1, see ../iio-bindings.txt @@ -23,6 +26,8 @@ Example: interrupts = ; clocks = <&cru SCLK_SARADC>, <&cru PCLK_SARADC>; clock-names = "saradc", "apb_pclk"; + resets = <&cru SRST_SARADC>; + reset-names = "saradc-apb"; #io-channel-cells = <1>; vref-supply = <&vcc18>; }; diff --git a/drivers/iio/adc/Kconfig b/drivers/iio/adc/Kconfig index 1de31bd..7675772 100644 --- a/drivers/iio/adc/Kconfig +++ b/drivers/iio/adc/Kconfig @@ -389,6 +389,7 @@ config QCOM_SPMI_VADC config ROCKCHIP_SARADC tristate "Rockchip SARADC driver" depends on ARCH_ROCKCHIP || (ARM && COMPILE_TEST) + depends on RESET_CONTROLLER help Say yes here to build support for the SARADC found in SoCs from Rockchip. 
diff --git a/drivers/iio/adc/rockchip_saradc.c b/drivers/iio/adc/rockchip_saradc.c index f9ad6c2..2f0e110 100644 --- a/drivers/iio/adc/rockchip_saradc.c +++ b/drivers/iio/adc/rockchip_saradc.c @@ -21,6 +21,8 @@ #include #include #include +#include +#include #include #include @@ -53,6 +55,7 @@ struct rockchip_saradc { struct clk *clk; struct completion completion; struct regulator*vref; + struct reset_control*reset; const struct rockchip_saradc_data *data; u16 last_val; }; @@ -190,6 +193,16 @@ static const struct of_device_id rockchip_saradc_match[] = { }; MODULE_DEVICE_TABLE(of, rockchip_saradc_match); +/** + * Reset SARADC Controller. + */ +static void rockchip_saradc_reset_controller(struct reset_control *reset) +{ + reset_control_assert(reset); + usleep_range(10, 20); + reset_control_deassert(reset); +} + static int rockchip_saradc_probe(struct platform_device *pdev) { struct rockchip_saradc *info = NULL; @@ -218,6 +231,13 @@ static int rockchip_saradc_probe(struct platform_device *pdev) if (IS_ERR(info->regs)) return PTR_ERR(info->regs); + info->reset = devm_reset_control_get(&pdev->dev, "saradc-apb"); + if (IS_ERR(info->reset)) { + ret = PTR_ERR(info->reset); + dev_err(&pdev->dev, "failed to get saradc reset: %d\n", ret); + return ret; + } + init_completion(&info->completion); irq = platform_get_irq(pdev, 0); @@ -252,6 +272,8 @@ static int rockchip_saradc_probe(struct platform_device *pdev) return PTR_ERR(info->vref); } + rockchip_saradc_reset_controller(info->reset); + /* * Use a default value for the converter clock. * This may become user-configurable in the future. -- 1.9.1
[PATCH v10 2/7] x86, acpi, cpu-hotplug: Enable acpi to register all possible cpus at boot time.
From: Gu Zheng [Problem] cpuid <-> nodeid mapping is firstly established at boot time. And workqueue caches the mapping in wq_numa_possible_cpumask in wq_numa_init() at boot time. When doing node online/offline, cpuid <-> nodeid mapping is established/destroyed, which means, cpuid <-> nodeid mapping will change if node hotplug happens. But workqueue does not update wq_numa_possible_cpumask. So here is the problem: Assume we have the following cpuid <-> nodeid in the beginning: Node | CPU node 0 | 0-14, 60-74 node 1 | 15-29, 75-89 node 2 | 30-44, 90-104 node 3 | 45-59, 105-119 and we hot-remove node2 and node3, it becomes: Node | CPU node 0 | 0-14, 60-74 node 1 | 15-29, 75-89 and we hot-add node4 and node5, it becomes: Node | CPU node 0 | 0-14, 60-74 node 1 | 15-29, 75-89 node 4 | 30-59 node 5 | 90-119 But in wq_numa_possible_cpumask, cpu30 is still mapped to node2, and the like. When a pool workqueue is initialized, if its cpumask belongs to a node, its pool->node will be mapped to that node. And memory used by this workqueue will also be allocated on that node. static struct worker_pool *get_unbound_pool(const struct workqueue_attrs *attrs){ ... 
/* if cpumask is contained inside a NUMA node, we belong to that node */ if (wq_numa_enabled) { for_each_node(node) { if (cpumask_subset(pool->attrs->cpumask, wq_numa_possible_cpumask[node])) { pool->node = node; break; } } } Since wq_numa_possible_cpumask is not updated, it could be mapped to an offline node, which will lead to memory allocation failure: SLUB: Unable to allocate memory on node 2 (gfp=0x80d0) cache: kmalloc-192, object size: 192, buffer size: 192, default order: 1, min order: 0 node 0: slabs: 6172, objs: 259224, free: 245741 node 1: slabs: 3261, objs: 136962, free: 127656 It happens here: create_worker(struct worker_pool *pool) |--> worker = alloc_worker(pool->node); static struct worker *alloc_worker(int node) { struct worker *worker; worker = kzalloc_node(sizeof(*worker), GFP_KERNEL, node); --> Here, using the wrong node. .. return worker; } [Solution] There are four mappings in the kernel: 1. nodeid (logical node id) <-> pxm 2. apicid (physical cpu id) <-> nodeid 3. cpuid (logical cpu id) <-> apicid 4. cpuid (logical cpu id) <-> nodeid 1. pxm (proximity domain) is provided by ACPI firmware in SRAT, and nodeid <-> pxm mapping is set up at boot time. This mapping is persistent and won't change. 2. apicid <-> nodeid mapping is set up using info in 1. The mapping is set up at boot time and CPU hotadd time, and cleared at CPU hotremove time. This mapping is also persistent. 3. cpuid <-> apicid mapping is set up at boot time and CPU hotadd time. cpuid is allocated, lower ids first, and released at CPU hotremove time, reused for other hotadded CPUs. So this mapping is not persistent. 4. cpuid <-> nodeid mapping is also set up at boot time and CPU hotadd time, and cleared at CPU hotremove time. As a result of 3, this mapping is not persistent. To fix this problem, we establish cpuid <-> nodeid mapping for all the possible cpus at boot time, and make it persistent. 
And according to init_cpu_to_node(), cpuid <-> nodeid mapping is based on apicid <-> nodeid mapping and cpuid <-> apicid mapping. So the key point is obtaining all cpus' apicid. apicid can be obtained by _MAT (Multiple APIC Table Entry) method or found in MADT (Multiple APIC Description Table). So we finish the job in the following steps: 1. Enable apic registeration flow to handle both enabled and disabled cpus. This is done by introducing an extra parameter to generic_processor_info to let the caller control if disabled cpus are ignored. 2. Introduce a new array storing all possible cpuid <-> apicid mapping. And also modify the way cpuid is calculated. Establish all possible cpuid <-> apicid mapping when registering local apic. Store the mapping in this array. 3. Enable _MAT and MADT relative apis to return non-presnet or disabled cpus' apicid. This is also done by introducing an extra parameter to these apis to let the caller control if disabled cpus are ignored. 4. Establish all possible cpuid <-> nodeid mapping. This is done via an additional acpi namespace walk for processors. This patch finished step 1. Signed-off-by: Gu Zheng Signed-off-by: Tang Chen Signed-off-by: Zhu Guihua Signed-off-by: Dou Liyang --- arch/x86/kernel/apic/apic.c | 26 +++--- 1 file changed, 19 insertions(+), 7 deletions(-) diff --git a/arch/x86/kernel/apic/apic.c b/arch/x86/kernel/apic/apic.c index 60078a6..8e3c377 100644 --- a/arch/x86/kernel/apic/api
[PATCH resent] w1:omap_hdq: fix regression
commit ("w1: masters: omap_hdq: add support for 1-wire mode") added a
statement that clears the hdq_irqstatus flags in hdq_read_byte().

If the hdq reading process is scheduled slowly, or interrupts are disabled
for a while, the hardware read activity might already be finished on entry
to hdq_read_byte(). In that case hdq_isr() has already set hdq_irqstatus to
0x6 (this can be seen in debug mode), denoting that both the TXCOMPLETE and
RXCOMPLETE interrupts have occurred.

This means there is no need to wait, and hdq_read_byte() can just read the
byte from the hdq controller.

By resetting hdq_irqstatus to 0, the read process is forced to always wait
again (because the if statement always succeeds), but the hardware will not
issue another RXCOMPLETE interrupt. This results in a false timeout.

After such a situation the hdq bus hangs.

Signed-off-by: H. Nikolaus Schaller
Cc: sta...@vger.kernel.org
---
 drivers/w1/masters/omap_hdq.c | 2 --
 1 file changed, 2 deletions(-)

diff --git a/drivers/w1/masters/omap_hdq.c b/drivers/w1/masters/omap_hdq.c
index a2eec97..bb09de6 100644
--- a/drivers/w1/masters/omap_hdq.c
+++ b/drivers/w1/masters/omap_hdq.c
@@ -390,8 +390,6 @@ static int hdq_read_byte(struct hdq_data *hdq_data, u8 *val)
 		goto out;
 	}
 
-	hdq_data->hdq_irqstatus = 0;
-
 	if (!(hdq_data->hdq_irqstatus & OMAP_HDQ_INT_STATUS_RXCOMPLETE)) {
 		hdq_reg_merge(hdq_data, OMAP_HDQ_CTRL_STATUS,
 			OMAP_HDQ_CTRL_STATUS_DIR | OMAP_HDQ_CTRL_STATUS_GO,
-- 
2.7.3
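
The false timeout described in this commit message can be modelled outside the kernel. The following is only a sketch under stated assumptions, not the driver code: `struct hdq_model`, `model_read_byte()`, and the `clear_first` switch are hypothetical names, and the bit values are assumed so that 0x6 means TXCOMPLETE | RXCOMPLETE as in the text.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumed bit layout: 0x6 == TXCOMPLETE | RXCOMPLETE, as in the commit text. */
#define MODEL_INT_RXCOMPLETE 0x2u
#define MODEL_INT_TXCOMPLETE 0x4u

/* Hypothetical stand-in for the driver state; not the real struct hdq_data. */
struct hdq_model {
	uint8_t irqstatus;	/* flags latched by the modelled interrupt handler */
	bool hw_will_irq;	/* true if the hardware still has an RXCOMPLETE to raise */
};

/*
 * Returns 0 if the byte can be read, -1 on a timeout.
 * clear_first == true models the regressed path that zeroes the flags
 * before testing them.
 */
static int model_read_byte(struct hdq_model *m, bool clear_first)
{
	if (clear_first)
		m->irqstatus = 0;

	if (!(m->irqstatus & MODEL_INT_RXCOMPLETE)) {
		/* The reader would now wait for an interrupt. */
		if (!m->hw_will_irq)
			return -1;	/* nothing left to arrive: timeout */
		m->irqstatus |= MODEL_INT_RXCOMPLETE;
		m->hw_will_irq = false;
	}
	return 0;
}

/* The hardware finished before the reader ran: both flags are already set. */
static void model_hdq_demo(void)
{
	struct hdq_model regressed = { MODEL_INT_TXCOMPLETE | MODEL_INT_RXCOMPLETE, false };
	struct hdq_model fixed = { MODEL_INT_TXCOMPLETE | MODEL_INT_RXCOMPLETE, false };

	assert(model_read_byte(&regressed, true) == -1);	/* false timeout */
	assert(model_read_byte(&fixed, false) == 0);		/* byte is read at once */
}
```

In this model, the only difference between the two calls is whether the flags are cleared on entry, which is the statement the patch removes.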
[PATCH v10 4/7] x86, acpi, cpu-hotplug: Enable MADT APIs to return disabled apicid.
From: Gu Zheng

The whole patch-set aims at making the cpuid <-> nodeid mapping persistent,
so that when node online/offline happens, caches based on the
cpuid <-> nodeid mapping, such as wq_numa_possible_cpumask, will not cause
any problem.
It contains 4 steps:
1. Enable the apic registration flow to handle both enabled and disabled
   cpus.
2. Introduce a new array storing all possible cpuid <-> apicid mappings.
3. Enable the _MAT and MADT related APIs to return the apicid of non-present
   or disabled cpus.
4. Establish all possible cpuid <-> nodeid mappings.

This patch finishes step 3.

There are four mappings in the kernel:
1. nodeid (logical node id)   <->   pxm        (persistent)
2. apicid (physical cpu id)   <->   nodeid     (persistent)
3. cpuid (logical cpu id)     <->   apicid     (not persistent, now persistent by step 2)
4. cpuid (logical cpu id)     <->   nodeid     (not persistent)

So, in order to set up a persistent cpuid <-> nodeid mapping for all
possible CPUs, we should:
1. Set up the cpuid <-> apicid mapping for all possible CPUs, which has been
   done in steps 1 and 2.
2. Set up the cpuid <-> nodeid mapping for all possible CPUs. But before
   that, we should obtain all apicids from MADT.

All processors' apicids can be obtained by the _MAT method or from MADT in
ACPI. The current code ignores disabled processors and returns -ENODEV.

After this patch, a new parameter is added to the MADT APIs so that the
caller is able to control if disabled processors are ignored.

Signed-off-by: Gu Zheng
Signed-off-by: Tang Chen
Signed-off-by: Zhu Guihua
Signed-off-by: Dou Liyang
---
 drivers/acpi/acpi_processor.c |  5 +++-
 drivers/acpi/processor_core.c | 57 +++
 2 files changed, 40 insertions(+), 22 deletions(-)

diff --git a/drivers/acpi/acpi_processor.c b/drivers/acpi/acpi_processor.c
index c7ba948..e85b19a 100644
--- a/drivers/acpi/acpi_processor.c
+++ b/drivers/acpi/acpi_processor.c
@@ -300,8 +300,11 @@ static int acpi_processor_get_info(struct acpi_device *device)
 	 *  Extra Processor objects may be enumerated on MP systems with
 	 *  less than the max # of CPUs. They should be ignored _iff
 	 *  they are physically not present.
+	 *
+	 *  NOTE: Even if the processor has a cpuid, it may not present because
+	 *  cpuid <-> apicid mapping is persistent now.
 	 */
-	if (invalid_logical_cpuid(pr->id)) {
+	if (invalid_logical_cpuid(pr->id) || !cpu_present(pr->id)) {
 		int ret = acpi_processor_hotadd_init(pr);
 		if (ret)
 			return ret;
diff --git a/drivers/acpi/processor_core.c b/drivers/acpi/processor_core.c
index 33a38d6..824b98b 100644
--- a/drivers/acpi/processor_core.c
+++ b/drivers/acpi/processor_core.c
@@ -32,12 +32,12 @@ static struct acpi_table_madt *get_madt_table(void)
 }
 
 static int map_lapic_id(struct acpi_subtable_header *entry,
-		 u32 acpi_id, phys_cpuid_t *apic_id)
+		 u32 acpi_id, phys_cpuid_t *apic_id, bool ignore_disabled)
 {
 	struct acpi_madt_local_apic *lapic =
 		container_of(entry, struct acpi_madt_local_apic, header);
 
-	if (!(lapic->lapic_flags & ACPI_MADT_ENABLED))
+	if (ignore_disabled && !(lapic->lapic_flags & ACPI_MADT_ENABLED))
 		return -ENODEV;
 
 	if (lapic->processor_id != acpi_id)
@@ -48,12 +48,13 @@ static int map_lapic_id(struct acpi_subtable_header *entry,
 }
 
 static int map_x2apic_id(struct acpi_subtable_header *entry,
-		int device_declaration, u32 acpi_id, phys_cpuid_t *apic_id)
+		int device_declaration, u32 acpi_id, phys_cpuid_t *apic_id,
+		bool ignore_disabled)
 {
 	struct acpi_madt_local_x2apic *apic =
 		container_of(entry, struct acpi_madt_local_x2apic, header);
 
-	if (!(apic->lapic_flags & ACPI_MADT_ENABLED))
+	if (ignore_disabled && !(apic->lapic_flags & ACPI_MADT_ENABLED))
 		return -ENODEV;
 
 	if (device_declaration && (apic->uid == acpi_id)) {
@@ -65,12 +66,13 @@ static int map_x2apic_id(struct acpi_subtable_header *entry,
 }
 
 static int map_lsapic_id(struct acpi_subtable_header *entry,
-		int device_declaration, u32 acpi_id, phys_cpuid_t *apic_id)
+		int device_declaration, u32 acpi_id, phys_cpuid_t *apic_id,
+		bool ignore_disabled)
 {
 	struct acpi_madt_local_sapic *lsapic =
 		container_of(entry, struct acpi_madt_local_sapic, header);
 
-	if (!(lsapic->lapic_flags & ACPI_MADT_ENABLED))
+	if (ignore_disabled && !(lsapic->lapic_flags & ACPI_MADT_ENABLED))
 		return -ENODEV;
 
 	if (device_declaration) {
@@ -87,12 +89,13 @@ static int map_lsapic_id(struct acpi_subtable_header *entry,
  * Retrieve the ARM CPU physical identifier (MPIDR)
  */
 static int map_gicc_mpidr(struct acpi_subtable_header *entry,
-		int device_declaration, u32 acpi_id, phys_cpuid_t *mpidr)
+
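
The effect of the new ignore_disabled parameter can be shown with a small userspace model. This is a sketch under assumptions, not the ACPI code: `struct model_lapic` and `model_map_lapic_id()` are hypothetical names, the enabled flag is assumed to be bit 0, and the -EINVAL return for a non-matching entry is an assumption of the sketch.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>

#define MODEL_MADT_ENABLED 0x1u	/* assumed: bit 0 marks an enabled entry */

/* Hypothetical, simplified MADT local APIC entry. */
struct model_lapic {
	uint8_t processor_id;	/* ACPI processor id */
	uint8_t apic_id;	/* physical APIC id */
	uint32_t flags;
};

/*
 * Mirrors the check order of the patched map_lapic_id(): a disabled entry
 * is rejected only when the caller asks for disabled entries to be ignored.
 */
static int model_map_lapic_id(const struct model_lapic *e, uint32_t acpi_id,
			      uint32_t *apic_id, bool ignore_disabled)
{
	if (ignore_disabled && !(e->flags & MODEL_MADT_ENABLED))
		return -ENODEV;

	if (e->processor_id != acpi_id)
		return -EINVAL;	/* assumed error code for "not this entry" */

	*apic_id = e->apic_id;
	return 0;
}

static void model_map_demo(void)
{
	const struct model_lapic disabled = { 7, 0x2e, 0 };
	uint32_t apic_id = 0;

	/* Old behaviour: the disabled processor is invisible. */
	assert(model_map_lapic_id(&disabled, 7, &apic_id, true) == -ENODEV);
	/* New option: its apicid can still be retrieved. */
	assert(model_map_lapic_id(&disabled, 7, &apic_id, false) == 0);
	assert(apic_id == 0x2e);
}
```

Passing true keeps the previous behaviour for existing callers, so only the boot-time mapping walk needs to pass false.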
[PATCH 3/4] arm64: dts: rockchip: add reset saradc node for rk3368 SoCs
SARADC controller needs to be reset before programming it, otherwise
it will not function properly.

Signed-off-by: Caesar Wang
---
 arch/arm64/boot/dts/rockchip/rk3368.dtsi | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/arch/arm64/boot/dts/rockchip/rk3368.dtsi b/arch/arm64/boot/dts/rockchip/rk3368.dtsi
index d02a9003..4f44d11 100644
--- a/arch/arm64/boot/dts/rockchip/rk3368.dtsi
+++ b/arch/arm64/boot/dts/rockchip/rk3368.dtsi
@@ -270,6 +270,8 @@
 		#io-channel-cells = <1>;
 		clocks = <&cru SCLK_SARADC>, <&cru PCLK_SARADC>;
 		clock-names = "saradc", "apb_pclk";
+		resets = <&cru SRST_SARADC>;
+		reset-names = "saradc-apb";
 		status = "disabled";
 	};
 
-- 
1.9.1
[PATCH v10 6/7] acpi: Provide the mechanism to validate processors in the ACPI tables
[Problem]

When we set the cpuid <-> nodeid mapping to be persistent, it will use the
DSDT. As we know, the ACPI tables are just like user's input in that
respect, and we don't crash if user's input is unreasonable.

For example, the mapping of proc_id to pxm in some machine's ACPI table
looks like this:

   proc_id   |    pxm
        0   <->     0
        1   <->     0
        2   <->     1
        3   <->     1
       89   <->     0
       89   <->     0
       89   <->     0
       89   <->     1
       89   <->     1
       89   <->     2
       89   <->     3
       .

We can't be sure which entry is correct for proc_id 89, so we may map a cpu
to the wrong node. When pages are allocated, this may cause a kernel panic.

So we should provide a mechanism to validate the ACPI tables, just like we
validate user's input in a web project.

The mechanism is that processor objects which have duplicate IDs are not
valid.

[Solution]

We add a validation function, like this:

foreach Processor in DSDT
	proc_id = get_ACPI_Processor_number(Processor)
	if (the proc_id already exists)
		mark both of them as being unreasonable;

The function records the unique and the duplicate processor IDs.

Duplicate processor IDs such as 89 are regarded as unreasonable IDs, which
means that the processor objects in question are not valid.

Signed-off-by: Dou Liyang
---
 drivers/acpi/acpi_processor.c | 79 +++
 1 file changed, 79 insertions(+)

diff --git a/drivers/acpi/acpi_processor.c b/drivers/acpi/acpi_processor.c
index 0c15828..346fbfc 100644
--- a/drivers/acpi/acpi_processor.c
+++ b/drivers/acpi/acpi_processor.c
@@ -581,8 +581,87 @@ static struct acpi_scan_handler processor_container_handler = {
 	.attach = acpi_processor_container_attach,
 };
 
+/* The number of the unique processor IDs */
+static int nr_unique_ids;
+
+/* The number of the duplicate processor IDs */
+static int nr_duplicate_ids;
+
+/* Used to store the unique processor IDs */
+static int unique_processor_ids[] = {
+	[0 ... NR_CPUS - 1] = -1,
+};
+
+/* Used to store the duplicate processor IDs */
+static int duplicate_processor_ids[] = {
+	[0 ... NR_CPUS - 1] = -1,
+};
+
+static void processor_validated_ids_update(int proc_id)
+{
+	int i;
+
+	if (nr_unique_ids == NR_CPUS||nr_duplicate_ids == NR_CPUS)
+		return;
+
+	/*
+	 * Firstly, compare the proc_id with duplicate IDs, if the proc_id is
+	 * already in the IDs, do nothing.
+	 */
+	for (i = 0; i < nr_duplicate_ids; i++) {
+		if (duplicate_processor_ids[i] == proc_id)
+			return;
+	}
+
+	/*
+	 * Secondly, compare the proc_id with unique IDs, if the proc_id is in
+	 * the IDs, put it in the duplicate IDs.
+	 */
+	for (i = 0; i < nr_unique_ids; i++) {
+		if (unique_processor_ids[i] == proc_id) {
+			duplicate_processor_ids[nr_duplicate_ids] = proc_id;
+			nr_duplicate_ids++;
+			return;
+		}
+	}
+
+	/*
+	 * Lastly, the proc_id is a unique ID, put it in the unique IDs.
+	 */
+	unique_processor_ids[nr_unique_ids] = proc_id;
+	nr_unique_ids++;
+}
+
+static acpi_status acpi_processor_ids_walk(acpi_handle handle,
+					   u32 lvl,
+					   void *context,
+					   void **rv)
+{
+	acpi_status status;
+	union acpi_object object = { 0 };
+	struct acpi_buffer buffer = { sizeof(union acpi_object), &object };
+
+	status = acpi_evaluate_object(handle, NULL, NULL, &buffer);
+	if (ACPI_FAILURE(status))
+		acpi_handle_info(handle, "Not get the processor object\n");
+	else
+		processor_validated_ids_update(object.processor.proc_id);
+
+	return AE_OK;
+}
+
+static void acpi_processor_duplication_valiate(void)
+{
+	/* Search all processor nodes in ACPI namespace */
+	acpi_walk_namespace(ACPI_TYPE_PROCESSOR, ACPI_ROOT_OBJECT,
+						ACPI_UINT32_MAX,
+						acpi_processor_ids_walk,
+						NULL, NULL, NULL);
+}
+
 void __init acpi_processor_init(void)
 {
+	acpi_processor_duplication_valiate();
 	acpi_scan_add_handler_with_hotplug(&processor_handler, "processor");
 	acpi_scan_add_handler(&processor_container_handler);
 }
-- 
2.5.5
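
The bookkeeping in processor_validated_ids_update() can be tried in userspace on the proc_id column of the example table from the commit message. This is a minimal sketch: the `model_` names, `MODEL_NR_CPUS`, and the `model_is_duplicate()` lookup helper are hypothetical and only mirror the logic of the patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MODEL_NR_CPUS 128	/* assumed capacity for the sketch */

static int model_nr_unique;
static int model_nr_dup;
static int model_unique[MODEL_NR_CPUS];
static int model_dup[MODEL_NR_CPUS];

/* Same three steps as the patch: known duplicate, new duplicate, new unique. */
static void model_ids_update(int proc_id)
{
	int i;

	if (model_nr_unique == MODEL_NR_CPUS || model_nr_dup == MODEL_NR_CPUS)
		return;

	for (i = 0; i < model_nr_dup; i++) {
		if (model_dup[i] == proc_id)
			return;		/* already recorded as a duplicate */
	}

	for (i = 0; i < model_nr_unique; i++) {
		if (model_unique[i] == proc_id) {
			model_dup[model_nr_dup++] = proc_id;
			return;		/* second sighting: now a duplicate */
		}
	}

	model_unique[model_nr_unique++] = proc_id;
}

/* Hypothetical lookup: true means the ID must not be trusted. */
static bool model_is_duplicate(int proc_id)
{
	int i;

	for (i = 0; i < model_nr_dup; i++) {
		if (model_dup[i] == proc_id)
			return true;
	}
	return false;
}

/* Feed the proc_id column of the example table. */
static void model_ids_demo(void)
{
	static const int table[] = { 0, 1, 2, 3, 89, 89, 89, 89, 89, 89, 89 };
	size_t i;

	model_nr_unique = 0;
	model_nr_dup = 0;
	for (i = 0; i < sizeof(table) / sizeof(table[0]); i++)
		model_ids_update(table[i]);

	assert(model_nr_dup == 1);
	assert(model_is_duplicate(89));
	assert(!model_is_duplicate(2));
}
```

Note that, as in the patch, an ID that turns out to be a duplicate stays in the unique list as well, so after the demo that list holds five entries (0, 1, 2, 3, 89); only the duplicate list is meant to be consulted when validating.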
[PATCH 2/4] arm64: dts: rockchip: add the saradc for rk3399
This patch adds the information needed for the saradc on rk3399 SoCs.

Signed-off-by: Caesar Wang
---
 arch/arm64/boot/dts/rockchip/rk3399.dtsi | 12
 1 file changed, 12 insertions(+)

diff --git a/arch/arm64/boot/dts/rockchip/rk3399.dtsi b/arch/arm64/boot/dts/rockchip/rk3399.dtsi
index 4c84229..b81f84b 100644
--- a/arch/arm64/boot/dts/rockchip/rk3399.dtsi
+++ b/arch/arm64/boot/dts/rockchip/rk3399.dtsi
@@ -299,6 +299,18 @@
 		};
 	};
 
+	saradc: saradc@ff10 {
+		compatible = "rockchip,rk3399-saradc";
+		reg = <0x0 0xff10 0x0 0x100>;
+		interrupts = ;
+		#io-channel-cells = <1>;
+		clocks = <&cru SCLK_SARADC>, <&cru PCLK_SARADC>;
+		clock-names = "saradc", "apb_pclk";
+		resets = <&cru SRST_P_SARADC>;
+		reset-names = "saradc-apb";
+		status = "disabled";
+	};
+
 	i2c1: i2c@ff11 {
 		compatible = "rockchip,rk3399-i2c";
 		reg = <0x0 0xff11 0x0 0x1000>;
-- 
1.9.1
[PATCH v10 5/7] x86, acpi, cpu-hotplug: Set persistent cpuid <-> nodeid mapping when booting.
From: Gu Zheng

The whole patch-set aims at making the cpuid <-> nodeid mapping persistent,
so that when node online/offline happens, caches based on the
cpuid <-> nodeid mapping, such as wq_numa_possible_cpumask, will not cause
any problem.
It contains 4 steps:
1. Enable the apic registration flow to handle both enabled and disabled
   cpus.
2. Introduce a new array storing all possible cpuid <-> apicid mappings.
3. Enable the _MAT and MADT related APIs to return the apicid of non-present
   or disabled cpus.
4. Establish all possible cpuid <-> nodeid mappings.

This patch finishes step 4.

This patch sets the persistent cpuid <-> nodeid mapping for all enabled and
disabled processors at boot time via an additional acpi namespace walk for
processors.

Signed-off-by: Gu Zheng
Signed-off-by: Tang Chen
Signed-off-by: Zhu Guihua
Signed-off-by: Dou Liyang
---
 arch/ia64/kernel/acpi.c       |  3 +-
 arch/x86/kernel/acpi/boot.c   |  4 ++-
 drivers/acpi/acpi_processor.c |  5
 drivers/acpi/bus.c            |  1 +
 drivers/acpi/processor_core.c | 67 +++
 include/linux/acpi.h          |  3 ++
 6 files changed, 81 insertions(+), 2 deletions(-)

diff --git a/arch/ia64/kernel/acpi.c b/arch/ia64/kernel/acpi.c
index b1698bc..bb36515 100644
--- a/arch/ia64/kernel/acpi.c
+++ b/arch/ia64/kernel/acpi.c
@@ -796,7 +796,7 @@ int acpi_isa_irq_to_gsi(unsigned isa_irq, u32 *gsi)
  *  ACPI based hotplug CPU support
  */
 #ifdef CONFIG_ACPI_HOTPLUG_CPU
-static int acpi_map_cpu2node(acpi_handle handle, int cpu, int physid)
+int acpi_map_cpu2node(acpi_handle handle, int cpu, int physid)
 {
 #ifdef CONFIG_ACPI_NUMA
 	/*
@@ -811,6 +811,7 @@ static int acpi_map_cpu2node(acpi_handle handle, int cpu, int physid)
 #endif
 	return 0;
 }
+EXPORT_SYMBOL(acpi_map_cpu2node);
 
 int additional_cpus __initdata = -1;
 
diff --git a/arch/x86/kernel/acpi/boot.c b/arch/x86/kernel/acpi/boot.c
index 37248c3..0900264f 100644
--- a/arch/x86/kernel/acpi/boot.c
+++ b/arch/x86/kernel/acpi/boot.c
@@ -695,7 +695,7 @@ static void __init acpi_set_irq_model_ioapic(void)
 #ifdef CONFIG_ACPI_HOTPLUG_CPU
 #include
 
-static void acpi_map_cpu2node(acpi_handle handle, int cpu, int physid)
+int acpi_map_cpu2node(acpi_handle handle, int cpu, int physid)
 {
 #ifdef CONFIG_ACPI_NUMA
 	int nid;
@@ -706,7 +706,9 @@ static void acpi_map_cpu2node(acpi_handle handle, int cpu, int physid)
 		numa_set_node(cpu, nid);
 	}
 #endif
+	return 0;
 }
+EXPORT_SYMBOL(acpi_map_cpu2node);
 
 int acpi_map_cpu(acpi_handle handle, phys_cpuid_t physid, int *pcpu)
 {
diff --git a/drivers/acpi/acpi_processor.c b/drivers/acpi/acpi_processor.c
index e85b19a..0c15828 100644
--- a/drivers/acpi/acpi_processor.c
+++ b/drivers/acpi/acpi_processor.c
@@ -182,6 +182,11 @@ int __weak arch_register_cpu(int cpu)
 
 void __weak arch_unregister_cpu(int cpu) {}
 
+int __weak acpi_map_cpu2node(acpi_handle handle, int cpu, int physid)
+{
+	return -ENODEV;
+}
+
 static int acpi_processor_hotadd_init(struct acpi_processor *pr)
 {
 	unsigned long long sta;
diff --git a/drivers/acpi/bus.c b/drivers/acpi/bus.c
index 262ca31..0fe5f54 100644
--- a/drivers/acpi/bus.c
+++ b/drivers/acpi/bus.c
@@ -1124,6 +1124,7 @@ static int __init acpi_init(void)
 	acpi_sleep_proc_init();
 	acpi_wakeup_device_init();
 	acpi_debugger_init();
+	acpi_set_processor_mapping();
 	return 0;
 }
 
diff --git a/drivers/acpi/processor_core.c b/drivers/acpi/processor_core.c
index 824b98b..e814cd4 100644
--- a/drivers/acpi/processor_core.c
+++ b/drivers/acpi/processor_core.c
@@ -261,6 +261,73 @@ int acpi_get_cpuid(acpi_handle handle, int type, u32 acpi_id)
 }
 EXPORT_SYMBOL_GPL(acpi_get_cpuid);
 
+#ifdef CONFIG_ACPI_HOTPLUG_CPU
+static bool map_processor(acpi_handle handle, phys_cpuid_t *phys_id, int *cpuid)
+{
+	int type;
+	u32 acpi_id;
+	acpi_status status;
+	acpi_object_type acpi_type;
+	unsigned long long tmp;
+	union acpi_object object = { 0 };
+	struct acpi_buffer buffer = { sizeof(union acpi_object), &object };
+
+	status = acpi_get_type(handle, &acpi_type);
+	if (ACPI_FAILURE(status))
+		return false;
+
+	switch (acpi_type) {
+	case ACPI_TYPE_PROCESSOR:
+		status = acpi_evaluate_object(handle, NULL, NULL, &buffer);
+		if (ACPI_FAILURE(status))
+			return false;
+		acpi_id = object.processor.proc_id;
+		break;
+	case ACPI_TYPE_DEVICE:
+		status = acpi_evaluate_integer(handle, "_UID", NULL, &tmp);
+		if (ACPI_FAILURE(status))
+			return false;
+		acpi_id = tmp;
+		break;
+	default:
+		return false;
+	}
+
+	type = (acpi_type == ACPI_TYPE_DEVICE) ? 1 : 0;
+
+	*phys_id = __acpi_get_phys_id(handle, type, acpi_id, false);
+	*cpuid = acpi_map_cpuid(*phys_id, acpi_id);
+	if
[PATCH 4/4] arm: dts: rockchip: add reset node for the exist saradc SoCs
SARADC controller needs to be reset before programming it, otherwise
it will not function properly.

Signed-off-by: Caesar Wang
---
 arch/arm/boot/dts/rk3066a.dtsi | 2 ++
 arch/arm/boot/dts/rk3288.dtsi  | 2 ++
 arch/arm/boot/dts/rk3xxx.dtsi  | 2 ++
 3 files changed, 6 insertions(+)

diff --git a/arch/arm/boot/dts/rk3066a.dtsi b/arch/arm/boot/dts/rk3066a.dtsi
index c0ba86c..0d0dae3 100644
--- a/arch/arm/boot/dts/rk3066a.dtsi
+++ b/arch/arm/boot/dts/rk3066a.dtsi
@@ -197,6 +197,8 @@
 		clock-names = "saradc", "apb_pclk";
 		interrupts = ;
 		#io-channel-cells = <1>;
+		resets = <&cru SRST_SARADC>;
+		reset-names = "saradc-apb";
 		status = "disabled";
 	};
 
diff --git a/arch/arm/boot/dts/rk3288.dtsi b/arch/arm/boot/dts/rk3288.dtsi
index cd33f01..91c4b3c 100644
--- a/arch/arm/boot/dts/rk3288.dtsi
+++ b/arch/arm/boot/dts/rk3288.dtsi
@@ -279,6 +279,8 @@
 		#io-channel-cells = <1>;
 		clocks = <&cru SCLK_SARADC>, <&cru PCLK_SARADC>;
 		clock-names = "saradc", "apb_pclk";
+		resets = <&cru SRST_SARADC>;
+		reset-names = "saradc-apb";
 		status = "disabled";
 	};
 
diff --git a/arch/arm/boot/dts/rk3xxx.dtsi b/arch/arm/boot/dts/rk3xxx.dtsi
index 99bbcc2..e2cd683 100644
--- a/arch/arm/boot/dts/rk3xxx.dtsi
+++ b/arch/arm/boot/dts/rk3xxx.dtsi
@@ -399,6 +399,8 @@
 		#io-channel-cells = <1>;
 		clocks = <&cru SCLK_SARADC>, <&cru PCLK_SARADC>;
 		clock-names = "saradc", "apb_pclk";
+		resets = <&cru SRST_SARADC>;
+		reset-names = "saradc-apb";
 		status = "disabled";
 	};
 
-- 
1.9.1
[PATCH v10 7/7] acpi: Provide the interface to validate the proc_id
When we want to identify whether a proc_id is unreasonable or not, we can
call the "acpi_processor_validate_proc_id" function. It searches the
duplicate IDs. If the proc_id is found there, it returns true to the
caller. Conversely, false means that the proc_id is usable.

When we establish all possible cpuid <-> nodeid mappings to handle cpu
hotplug, we use the proc_id from the ACPI table.

We do the validation when we get the proc_id. If the result is true, we
stop the mapping.

Signed-off-by: Dou Liyang
---
 drivers/acpi/acpi_processor.c | 16
 drivers/acpi/processor_core.c |  4
 include/linux/acpi.h          |  3 +++
 3 files changed, 23 insertions(+)

diff --git a/drivers/acpi/acpi_processor.c b/drivers/acpi/acpi_processor.c
index 346fbfc..ae6dae9 100644
--- a/drivers/acpi/acpi_processor.c
+++ b/drivers/acpi/acpi_processor.c
@@ -659,6 +659,22 @@ static void acpi_processor_duplication_valiate(void)
 						NULL, NULL, NULL);
 }
 
+bool acpi_processor_validate_proc_id(int proc_id)
+{
+	int i;
+
+	/*
+	 * compare the proc_id with duplicate IDs, if the proc_id is already
+	 * in the duplicate IDs, return true, otherwise, return false.
+	 */
+	for (i = 0; i < nr_duplicate_ids; i++) {
+		if (duplicate_processor_ids[i] == proc_id)
+			return true;
+	}
+
+	return false;
+}
+
 void __init acpi_processor_init(void)
 {
 	acpi_processor_duplication_valiate();
diff --git a/drivers/acpi/processor_core.c b/drivers/acpi/processor_core.c
index e814cd4..830c7ac 100644
--- a/drivers/acpi/processor_core.c
+++ b/drivers/acpi/processor_core.c
@@ -282,6 +282,10 @@ static bool map_processor(acpi_handle handle, phys_cpuid_t *phys_id, int *cpuid)
 		if (ACPI_FAILURE(status))
 			return false;
 		acpi_id = object.processor.proc_id;
+
+		/* validate the acpi_id */
+		if(acpi_processor_validate_proc_id(acpi_id))
+			return false;
 		break;
 	case ACPI_TYPE_DEVICE:
 		status = acpi_evaluate_integer(handle, "_UID", NULL, &tmp);
diff --git a/include/linux/acpi.h b/include/linux/acpi.h
index 30df63c..11bc794 100644
--- a/include/linux/acpi.h
+++ b/include/linux/acpi.h
@@ -254,6 +254,9 @@ static inline bool invalid_phys_cpuid(phys_cpuid_t phys_id)
 	return phys_id == PHYS_CPUID_INVALID;
 }
 
+/* Validate the processor object's proc_id */
+bool acpi_processor_validate_proc_id(int proc_id);
+
 #ifdef CONFIG_ACPI_HOTPLUG_CPU
 /* Arch dependent functions for cpu hotplug support */
 int acpi_map_cpu(acpi_handle handle, phys_cpuid_t physid, int *pcpu);
-- 
2.5.5
[PATCH v10 0/7] Make cpuid <-> nodeid mapping persistent
[Problem]

cpuid <-> nodeid mapping is first established at boot time, and workqueue
caches the mapping in wq_numa_possible_cpumask in wq_numa_init() at boot
time.

When a node goes online or offline, the cpuid <-> nodeid mapping is
established or destroyed, which means the mapping changes whenever node
hotplug happens. But workqueue does not update wq_numa_possible_cpumask.

So here is the problem:

Assume we have the following cpuid <-> nodeid mapping in the beginning:

  Node   | CPU
  node 0 | 0-14, 60-74
  node 1 | 15-29, 75-89
  node 2 | 30-44, 90-104
  node 3 | 45-59, 105-119

and we hot-remove node2 and node3, it becomes:

  Node   | CPU
  node 0 | 0-14, 60-74
  node 1 | 15-29, 75-89

and we hot-add node4 and node5, it becomes:

  Node   | CPU
  node 0 | 0-14, 60-74
  node 1 | 15-29, 75-89
  node 4 | 30-59
  node 5 | 90-119

But in wq_numa_possible_cpumask, cpu30 is still mapped to node2, and so on.

When a pool workqueue is initialized, if its cpumask belongs to a node, its
pool->node will be mapped to that node, and memory used by this workqueue
will also be allocated on that node.

 static struct worker_pool *get_unbound_pool(const struct workqueue_attrs *attrs)
 {
 	...
 	/* if cpumask is contained inside a NUMA node, we belong to that node */
 	if (wq_numa_enabled) {
 		for_each_node(node) {
 			if (cpumask_subset(pool->attrs->cpumask,
 					   wq_numa_possible_cpumask[node])) {
 				pool->node = node;
 				break;
 			}
 		}
 	}

Since wq_numa_possible_cpumask is not updated, pool->node could be mapped to
an offline node, which will lead to memory allocation failure:

 SLUB: Unable to allocate memory on node 2 (gfp=0x80d0)
  cache: kmalloc-192, object size: 192, buffer size: 192, default order: 1, min order: 0
  node 0: slabs: 6172, objs: 259224, free: 245741
  node 1: slabs: 3261, objs: 136962, free: 127656

It happens here:

 create_worker(struct worker_pool *pool)
  |--> worker = alloc_worker(pool->node);

 static struct worker *alloc_worker(int node)
 {
 	struct worker *worker;

 	worker = kzalloc_node(sizeof(*worker), GFP_KERNEL, node); --> Here, using the wrong node.

 	..

 	return worker;
 }

[Solution]

There are four mappings in the kernel:
1. nodeid (logical node id)   <->   pxm
2. apicid (physical cpu id)   <->   nodeid
3. cpuid (logical cpu id)     <->   apicid
4. cpuid (logical cpu id)     <->   nodeid

1. pxm (proximity domain) is provided by ACPI firmware in SRAT, and the
   nodeid <-> pxm mapping is set up at boot time. This mapping is
   persistent and won't change.

2. apicid <-> nodeid mapping is set up using the info in 1. The mapping is
   set up at boot time and CPU hotadd time, and cleared at CPU hotremove
   time. This mapping is also persistent.

3. cpuid <-> apicid mapping is set up at boot time and CPU hotadd time.
   cpuid is allocated, lower ids first, released at CPU hotremove time, and
   reused for other hotadded CPUs. So this mapping is not persistent.

4. cpuid <-> nodeid mapping is also set up at boot time and CPU hotadd
   time, and cleared at CPU hotremove time. As a result of 3, this mapping
   is not persistent.

To fix this problem, we establish the cpuid <-> nodeid mapping for all the
possible cpus at boot time, and make it persistent.

According to init_cpu_to_node(), the cpuid <-> nodeid mapping is based on
the apicid <-> nodeid mapping and the cpuid <-> apicid mapping. So the key
point is obtaining the apicid of every cpu.

apicid can be obtained by the _MAT (Multiple APIC Table Entry) method or
found in MADT (Multiple APIC Description Table). So we finish the job in the
following steps:

1. Enable the apic registration flow to handle both enabled and disabled
   cpus. This is done by introducing an extra parameter to
   generic_processor_info to let the caller control if disabled cpus are
   ignored.

2. Introduce a new array storing all possible cpuid <-> apicid mappings, and
   modify the way cpuid is calculated. Establish all possible
   cpuid <-> apicid mappings when registering the local apic, and store them
   in this array.

3. Enable the _MAT and MADT related APIs to return the apicid of non-present
   or disabled cpus. This is also done by introducing an extra parameter to
   these APIs to let the caller control if disabled cpus are ignored.

4. Establish all possible cpuid <-> nodeid mappings. This is done via an
   additional acpi namespace walk for processors.

For previous discussion, please refer to:
https://lkml.org/lkml/2015/2/27/145
https://lkml.org/lkml/2015/3/25/989
https://lkml.org/lkml/2015/5/14/244
https://lkml.org/lkml/2015/7/7/200
https://lkml.org/lkml/2015/9/27/209
https://lkml.org/lkml/2016/5/19/212
https://lkml.org/lkml/2016/7/19/181
https://lkml.org/lkml/2016/7/25/99

Change log v9 -> v10:
1. Providing an empty definition of acpi_set_
[PATCH v10 3/7] x86, acpi, cpu-hotplug: Introduce cpuid_to_apicid[] array to store persistent cpuid <-> apicid mapping.
From: Gu Zheng

The whole patch-set aims at making the cpuid <-> nodeid mapping persistent,
so that when node online/offline happens, caches based on the
cpuid <-> nodeid mapping, such as wq_numa_possible_cpumask, will not cause
any problem.
It contains 4 steps:
1. Enable the apic registration flow to handle both enabled and disabled
   cpus.
2. Introduce a new array storing all possible cpuid <-> apicid mappings.
3. Enable the _MAT and MADT related APIs to return the apicid of non-present
   or disabled cpus.
4. Establish all possible cpuid <-> nodeid mappings.

This patch finishes step 2.

In this patch, we introduce a new static array named cpuid_to_apicid[],
which is large enough to store info for all possible cpus.

Then we modify the cpuid calculation. Currently, generic_processor_info()
simply finds the next unused cpuid, which is also why the cpuid <-> nodeid
mapping changes with node hotplug.

After this patch, we find the next unused cpuid, map it to an apicid, and
store the mapping in cpuid_to_apicid[], so that the cpuid <-> apicid mapping
is persistent.

Finally, we will use this array to make cpuid <-> nodeid persistent.

The cpuid <-> apicid mapping is established at local apic registration time,
but non-present or disabled cpus are currently ignored.

In this patch, we establish all possible cpuid <-> apicid mappings when
registering the local apic.

Signed-off-by: Gu Zheng
Signed-off-by: Tang Chen
Signed-off-by: Zhu Guihua
Signed-off-by: Dou Liyang
---
 arch/x86/include/asm/mpspec.h |  1 +
 arch/x86/kernel/acpi/boot.c   |  6 ++---
 arch/x86/kernel/apic/apic.c   | 61 ---
 3 files changed, 61 insertions(+), 7 deletions(-)

diff --git a/arch/x86/include/asm/mpspec.h b/arch/x86/include/asm/mpspec.h
index b07233b..db902d8 100644
--- a/arch/x86/include/asm/mpspec.h
+++ b/arch/x86/include/asm/mpspec.h
@@ -86,6 +86,7 @@ static inline void early_reserve_e820_mpc_new(void) { }
 #endif
 
 int generic_processor_info(int apicid, int version);
+int __generic_processor_info(int apicid, int version, bool enabled);
 
 #define PHYSID_ARRAY_SIZE	BITS_TO_LONGS(MAX_LOCAL_APIC)
 
diff --git a/arch/x86/kernel/acpi/boot.c b/arch/x86/kernel/acpi/boot.c
index 9414f84..37248c3 100644
--- a/arch/x86/kernel/acpi/boot.c
+++ b/arch/x86/kernel/acpi/boot.c
@@ -174,15 +174,13 @@ static int acpi_register_lapic(int id, u8 enabled)
 		return -EINVAL;
 	}
 
-	if (!enabled) {
+	if (!enabled)
 		++disabled_cpus;
-		return -EINVAL;
-	}
 
 	if (boot_cpu_physical_apicid != -1U)
 		ver = apic_version[boot_cpu_physical_apicid];
 
-	return generic_processor_info(id, ver);
+	return __generic_processor_info(id, ver, enabled);
 }
 
 static int __init
diff --git a/arch/x86/kernel/apic/apic.c b/arch/x86/kernel/apic/apic.c
index 8e3c377..366fbbc 100644
--- a/arch/x86/kernel/apic/apic.c
+++ b/arch/x86/kernel/apic/apic.c
@@ -1998,7 +1998,53 @@ void disconnect_bsp_APIC(int virt_wire_setup)
 	apic_write(APIC_LVT1, value);
 }
 
-static int __generic_processor_info(int apicid, int version, bool enabled)
+/*
+ * The number of allocated logical CPU IDs. Since logical CPU IDs are allocated
+ * contiguously, it equals to current allocated max logical CPU ID plus 1.
+ * All allocated CPU ID should be in [0, nr_logical_cpuidi), so the maximum of
+ * nr_logical_cpuids is nr_cpu_ids.
+ *
+ * NOTE: Reserve 0 for BSP.
+ */
+static int nr_logical_cpuids = 1;
+
+/*
+ * Used to store mapping between logical CPU IDs and APIC IDs.
+ */
+static int cpuid_to_apicid[] = {
+	[0 ... NR_CPUS - 1] = -1,
+};
+
+/*
+ * Should use this API to allocate logical CPU IDs to keep nr_logical_cpuids
+ * and cpuid_to_apicid[] synchronized.
+ */
+static int allocate_logical_cpuid(int apicid)
+{
+	int i;
+
+	/*
+	 * cpuid <-> apicid mapping is persistent, so when a cpu is up,
+	 * check if the kernel has allocated a cpuid for it.
+	 */
+	for (i = 0; i < nr_logical_cpuids; i++) {
+		if (cpuid_to_apicid[i] == apicid)
+			return i;
+	}
+
+	/* Allocate a new cpuid. */
+	if (nr_logical_cpuids >= nr_cpu_ids) {
+		WARN_ONCE(1, "Only %d processors supported."
+			     "Processor %d/0x%x and the rest are ignored.\n",
+			     nr_cpu_ids - 1, nr_logical_cpuids, apicid);
+		return -1;
+	}
+
+	cpuid_to_apicid[nr_logical_cpuids] = apicid;
+	return nr_logical_cpuids++;
+}
+
+int __generic_processor_info(int apicid, int version, bool enabled)
 {
 	int cpu, max = nr_cpu_ids;
 	bool boot_cpu_detected = physid_isset(boot_cpu_physical_apicid,
@@ -2079,8 +2125,17 @@ static int __generic_processor_info(int apicid, int version, bool enabled)
 		 * for BSP.
 		 */
 		cpu = 0;
-	} else
-		cpu = cpumask_next_zero(-1, cpu_present_mask);
+
+		/
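
The persistence property of allocate_logical_cpuid() can be checked with a userspace model. This is a sketch under assumptions: the `model_` names are hypothetical, the capacity of 4 ids stands in for nr_cpu_ids, and WARN_ONCE is left out.

```c
#include <assert.h>

#define MODEL_NR_CPU_IDS 4	/* assumed stand-in for nr_cpu_ids */

/* Slot 0 is reserved for the boot CPU, as in the patch. */
static int model_nr_logical = 1;
static int model_cpuid_to_apicid[MODEL_NR_CPU_IDS] = { -1, -1, -1, -1 };

/* Look up an existing cpuid for this apicid first; allocate only if new. */
static int model_allocate_cpuid(int apicid)
{
	int i;

	for (i = 0; i < model_nr_logical; i++) {
		if (model_cpuid_to_apicid[i] == apicid)
			return i;
	}

	if (model_nr_logical >= MODEL_NR_CPU_IDS)
		return -1;	/* table full: the processor is ignored */

	model_cpuid_to_apicid[model_nr_logical] = apicid;
	return model_nr_logical++;
}

static void model_cpuid_demo(void)
{
	int a = model_allocate_cpuid(0x10);

	model_allocate_cpuid(0x20);
	/* Hot-removing 0x10 releases nothing in this scheme. */
	model_allocate_cpuid(0x30);

	/* Hot-adding 0x10 again yields the cpuid it had before. */
	assert(model_allocate_cpuid(0x10) == a);
	/* A never-seen apicid no longer fits. */
	assert(model_allocate_cpuid(0x40) == -1);
}
```

The trade-off visible in the model is that ids are never recycled, so the table has to be sized for all possible cpus rather than for the cpus present at any one time.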
[PATCH v10 1/7] x86, memhp, numa: Online memory-less nodes at boot time.
From: Tang Chen

For now, x86 does not support memory-less nodes. A node without memory is
not onlined, and the cpus on it are mapped to other online nodes with
memory in init_cpu_to_node(). The reason for doing this is to ensure that
each cpu is mapped to a node with memory, so that local memory can be
allocated for that cpu.

But we don't have to do it this way.

In this series of patches, we are going to construct the cpu <-> node
mapping for all possible cpus at boot time, which is a persistent mapping.
It means that a cpu is mapped to the node it belongs to, and the mapping
never changes. If a node has only cpus but no memory, the cpus on it are
mapped to a memory-less node, and the memory-less node should be onlined.

This patch allocates pgdats for all memory-less nodes and onlines them at
boot time. Then it builds zonelists for these nodes. As a result, when cpus
on these memory-less nodes try to allocate memory from the local node, the
allocation automatically falls back to the proper zones in the zonelists.

Signed-off-by: Zhu Guihua
Signed-off-by: Dou Liyang
---
 arch/x86/mm/numa.c | 27 +--
 1 file changed, 13 insertions(+), 14 deletions(-)

diff --git a/arch/x86/mm/numa.c b/arch/x86/mm/numa.c
index 9c086c5..2a87a28 100644
--- a/arch/x86/mm/numa.c
+++ b/arch/x86/mm/numa.c
@@ -723,22 +723,19 @@ void __init x86_numa_init(void)
 	numa_init(dummy_numa_init);
 }
 
-static __init int find_near_online_node(int node)
+static void __init init_memory_less_node(int nid)
 {
-	int n, val;
-	int min_val = INT_MAX;
-	int best_node = -1;
+	unsigned long zones_size[MAX_NR_ZONES] = {0};
+	unsigned long zholes_size[MAX_NR_ZONES] = {0};
 
-	for_each_online_node(n) {
-		val = node_distance(node, n);
+	/* Allocate and initialize node data. Memory-less node is now online.*/
+	alloc_node_data(nid);
+	free_area_init_node(nid, zones_size, 0, zholes_size);
 
-		if (val < min_val) {
-			min_val = val;
-			best_node = n;
-		}
-	}
-
-	return best_node;
+	/*
+	 * All zonelists will be built later in start_kernel() after per cpu
+	 * areas are initialized.
+	 */
 }
 
 /*
@@ -767,8 +764,10 @@ void __init init_cpu_to_node(void)
 
 		if (node == NUMA_NO_NODE)
 			continue;
+
 		if (!node_online(node))
-			node = find_near_online_node(node);
+			init_memory_less_node(node);
+
 		numa_set_node(cpu, node);
 	}
 }
-- 
2.5.5
Re: [PATCH] power:bq27xxx: 27000/10 read FLAGS register as single
ping

> On 18.07.2016 at 18:12, H. Nikolaus Schaller wrote:
>
> The bq27000 and bq27010 have a single byte FLAGS register.
> Other gauges have 16 bit FLAGS registers.
>
> For reading the FLAGS register it is sufficient to read the single
> register instead of reading RSOC at the next higher address as
> well and then ignore the high byte.
>
> This does not change functionality but optimizes i2c and hdq
> traffic.
>
> Signed-off-by: H. Nikolaus Schaller
> ---
>  drivers/power/bq27xxx_battery.c | 5 +++--
>  1 file changed, 3 insertions(+), 2 deletions(-)
>
> diff --git a/drivers/power/bq27xxx_battery.c b/drivers/power/bq27xxx_battery.c
> index 45f6ebf..56712b2 100644
> --- a/drivers/power/bq27xxx_battery.c
> +++ b/drivers/power/bq27xxx_battery.c
> @@ -656,8 +656,9 @@ static bool bq27xxx_battery_dead(struct bq27xxx_device_info *di, u16 flags)
>  static int bq27xxx_battery_read_health(struct bq27xxx_device_info *di)
>  {
>  	int flags;
> +	bool has_singe_flag = di->chip == BQ27000 || di->chip == BQ27010;
>
> -	flags = bq27xxx_read(di, BQ27XXX_REG_FLAGS, false);
> +	flags = bq27xxx_read(di, BQ27XXX_REG_FLAGS, has_singe_flag);
>  	if (flags < 0) {
>  		dev_err(di->dev, "error reading flag register:%d\n", flags);
>  		return flags;
> @@ -760,7 +761,7 @@ static int bq27xxx_battery_current(struct bq27xxx_device_info *di,
>  	}
>
>  	if (di->chip == BQ27000 || di->chip == BQ27010) {
> -		flags = bq27xxx_read(di, BQ27XXX_REG_FLAGS, false);
> +		flags = bq27xxx_read(di, BQ27XXX_REG_FLAGS, true);
>  		if (flags & BQ27000_FLAG_CHGS) {
>  			dev_dbg(di->dev, "negative current!\n");
>  			curr = -curr;
> --
> 2.7.3
>
Re: [PATCH v2 09/10] netns: Add a limit on the number of net namespaces
On Thu, Jul 21, 2016 at 9:40 AM, Eric W. Biederman wrote:
> Signed-off-by: "Eric W. Biederman"
> ---
>  include/linux/user_namespace.h |  1 +
>  kernel/user_namespace.c        |  1 +
>  net/core/net_namespace.c       | 15 +++
>  3 files changed, 17 insertions(+)
>
> diff --git a/include/linux/user_namespace.h b/include/linux/user_namespace.h
> index 1a3a9cb93277..f86afa536baf 100644
> --- a/include/linux/user_namespace.h
> +++ b/include/linux/user_namespace.h
> @@ -27,6 +27,7 @@ enum ucounts {
>  	UCOUNT_PID_NAMESPACES,
>  	UCOUNT_UTS_NAMESPACES,
>  	UCOUNT_IPC_NAMESPACES,
> +	UCOUNT_NET_NAMESPACES,
>  	UCOUNT_CGROUP_NAMESPACES,
>  	UCOUNT_COUNTS,
>  };
> diff --git a/kernel/user_namespace.c b/kernel/user_namespace.c
> index 1cf074cb47e2..e326ca722ae0 100644
> --- a/kernel/user_namespace.c
> +++ b/kernel/user_namespace.c
> @@ -80,6 +80,7 @@ static struct ctl_table userns_table[] = {
>  	UCOUNT_ENTRY("max_pid_namespaces"),
>  	UCOUNT_ENTRY("max_uts_namespaces"),
>  	UCOUNT_ENTRY("max_ipc_namespaces"),
> +	UCOUNT_ENTRY("max_net_namespaces"),
>  	UCOUNT_ENTRY("max_cgroup_namespaces"),
>  	{ }
>  };
> diff --git a/net/core/net_namespace.c b/net/core/net_namespace.c
> index 2c2eb1b629b1..a489f192d619 100644
> --- a/net/core/net_namespace.c
> +++ b/net/core/net_namespace.c
> @@ -266,6 +266,16 @@ struct net *get_net_ns_by_id(struct net *net, int id)
>  	return peer;
>  }
>
> +static bool inc_net_namespaces(struct user_namespace *ns)
> +{
> +	return inc_ucount(ns, UCOUNT_NET_NAMESPACES);
> +}
> +
> +static void dec_net_namespaces(struct user_namespace *ns)
> +{
> +	dec_ucount(ns, UCOUNT_NET_NAMESPACES);
> +}
> +
>  /*
>   * setup_net runs the initializers for the network namespace object.
>   */
> @@ -276,6 +286,9 @@ static __net_init int setup_net(struct net *net, struct user_namespace *user_ns)
>  	int error = 0;
>  	LIST_HEAD(net_exit_list);
>
> +	if (!inc_net_namespaces(user_ns))
> +		return -ENFILE;

I think you need to move this check after initializing net->passive.
When setup_net returns an error, net_drop_ns is called:

void net_drop_ns(void *p)
{
	struct net *ns = p;
	if (ns && atomic_dec_and_test(&ns->passive))
		net_free(ns);
}

Actually, I think it would be better to make this check before net_alloc().

> +
>  	atomic_set(&net->count, 1);
>  	atomic_set(&net->passive, 1);
>  	net->dev_base_seq = 1;
> @@ -372,6 +385,7 @@ struct net *copy_net_ns(unsigned long flags,
>  	}
>  	mutex_unlock(&net_mutex);
>  	if (rv < 0) {
> +		dec_net_namespaces(user_ns);
>  		put_user_ns(user_ns);
>  		net_drop_ns(net);
>  		return ERR_PTR(rv);
> @@ -444,6 +458,7 @@ static void cleanup_net(struct work_struct *work)
>  	/* Finally it is safe to free my network namespace structure */
>  	list_for_each_entry_safe(net, tmp, &net_exit_list, exit_list) {
>  		list_del_init(&net->exit_list);
> +		dec_net_namespaces(net->user_ns);
>  		put_user_ns(net->user_ns);
>  		net_drop_ns(net);
>  	}
> --
> 2.8.3
>
> ___
> Containers mailing list
> contain...@lists.linux-foundation.org
> https://lists.linuxfoundation.org/mailman/listinfo/containers
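To make the review comment above easier to follow, here is a minimal userspace sketch of the ordering problem. It is not kernel code: `fake_net`, `fake_setup_net()` and `fake_net_drop_ns()` are hypothetical stand-ins, a plain int replaces the atomic counter, and the sketch assumes the object is zero-filled when allocated (an assumption, not something stated in the thread).

```c
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical stand-in for struct net: only the passive refcount. */
struct fake_net {
	int passive;
};

static int freed_count;	/* how many objects fake_net_drop_ns() released */

/* Mirrors the quoted net_drop_ns(): free only when the count reaches zero. */
static void fake_net_drop_ns(struct fake_net *ns)
{
	if (ns && --ns->passive == 0) {
		free(ns);
		freed_count++;
	}
}

/*
 * Hypothetical setup step. With check_first set, the limit check fails
 * before passive is initialized, as in the patch under review.
 */
static int fake_setup_net(struct fake_net *net, bool limit_hit, bool check_first)
{
	if (check_first && limit_hit)
		return -1;
	net->passive = 1;
	if (!check_first && limit_hit)
		return -1;
	return 0;
}

/* Returns true if the error path released the object via the drop helper. */
static bool error_path_frees(bool check_first)
{
	struct fake_net *net = calloc(1, sizeof(*net));	/* assumed zero-filled */
	int before = freed_count;

	if (!net)
		return false;
	if (fake_setup_net(net, true, check_first) < 0)
		fake_net_drop_ns(net);
	if (freed_count == before) {
		free(net);	/* clean up so the demo itself does not leak */
		return false;
	}
	return true;
}
```

Under these assumptions, `error_path_frees(true)` reports that the drop helper never frees the object (the counter goes from 0 to -1), while `error_path_frees(false)` reports a normal release. Doing the limit check before allocation, as suggested above, sidesteps the question entirely.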
linux-next: Tree for Jul 26
Hi all, Please do not add material destined for v4.9 to your linux-next included branches until after v4.8-rc1 has been released. Changes since 20160725: New tree: random Removed tree: perf (problem solved and merged) My fixes tree contains: 22065b8b8dc5 Merge branch 'perf/core' of ../../tip 70ca58970f4a staging: emxx_udc: allow modular build The powerpc tree still had its build failure for which I applied a fix patch. The xen-tip tree gained conflicts against the tip tree. The random tree gained a conflict against the kspp tree. Non-merge commits (relative to Linus' tree): 9990 9049 files changed, 523915 insertions(+), 181557 deletions(-) I have created today's linux-next tree at git://git.kernel.org/pub/scm/linux/kernel/git/next/linux-next.git (patches at http://www.kernel.org/pub/linux/kernel/next/ ). If you are tracking the linux-next tree using git, you should not use "git pull" to do so as that will try to merge the new linux-next release with the old one. You should use "git fetch" and checkout or reset to the new master. You can see which trees have been included by looking in the Next/Trees file in the source. There are also quilt-import.log and merge.log files in the Next directory. Between each merge, the tree was built with a ppc64_defconfig for powerpc and an allmodconfig (with CONFIG_BUILD_DOCSRC=n) for x86_64, a multi_v7_defconfig for arm and a native build of tools/perf. After the final fixups (if any), I do an x86_64 modules_install followed by builds for x86_64 allnoconfig, powerpc allnoconfig (32 and 64 bit), ppc44x_defconfig, allyesconfig (this fails its final link) and pseries_le_defconfig and i386, sparc and sparc64 defconfig. Below is a summary of the state of the merge. I am currently merging 240 trees (counting Linus' and 35 trees of patches pending for Linus' tree). Stats about the size of the tree over time can be seen at http://neuling.org/linux-next-size.html . 
Status of my local build tests will be at http://kisskb.ellerman.id.au/linux-next . If maintainers want to give advice about cross compilers/configs that work, we are always open to add more builds. Thanks to Randy Dunlap for doing many randconfig builds. And to Paul Gortmaker for triage and bug fixes. -- Cheers, Stephen Rothwell $ git checkout master $ git reset --hard stable Merging origin/master (766fd5f6cdaf Merge branch 'timers-nohz-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip) Merging fixes/master (22065b8b8dc5 Merge branch 'perf/core' of ../../tip) Merging kbuild-current/rc-fixes (b36fad65d61f kbuild: Initialize exported variables) Merging arc-current/for-curr (9bd54517ee86 arc: unwind: warn only once if DW2_UNWIND is disabled) Merging arm-current/fixes (f6492164ecb1 ARM: 8577/1: Fix Cortex-A15 798181 errata initialization) Merging m68k-current/for-linus (6bd80f372371 m68k/defconfig: Update defconfigs for v4.7-rc2) Merging metag-fixes/fixes (0164a711c97b metag: Fix ioremap_wc/ioremap_cached build errors) Merging powerpc-fixes/fixes (bfa37087aa04 powerpc: Initialise pci_io_base as early as possible) Merging powerpc-merge-mpe/fixes (bc0195aad0da Linux 4.2-rc2) Merging sparc/master (6b15d6650c53 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net) Merging net/master (107df03203bb Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net) Merging ipsec/master (1ba5bf993c6a xfrm: fix crash in XFRM_MSG_GETSA netlink handler) Merging netfilter/master (ea43f860d984 Merge branch 'ethoc-fixes') Merging ipvs/master (ea43f860d984 Merge branch 'ethoc-fixes') Merging wireless-drivers/master (034fdd4a17ff Merge ath-current from ath.git) Merging mac80211/master (16a910a6722b cfg80211: handle failed skb allocation) Merging sound-current/for-linus (cf81d6b58344 Merge branch 'for-next' into for-linus) Merging pci-current/for-linus (ef0dab4aae14 PCI: Fix unaligned accesses in VC code) Merging driver-core.current/driver-core-linus 
(523d939ef98f Linux 4.7) Merging tty.current/tty-linus (a99cde438de0 Linux 4.7-rc6) Merging usb.current/usb-linus (b7545b79a169 Merge tag 'usb-4.8-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb) Merging usb-gadget-fixes/fixes (50c763f8c1ba usb: dwc3: Set the ClearPendIN bit on Clear Stall EP command) Merging usb-serial-fixes/usb-linus (4c2e07c6a29e Linux 4.7-rc5) Merging usb-chipidea-fixes/ci-for-usb-stable (ea1d39a31d3b usb: common: otg-fsm: add license to usb-otg-fsm) Merging staging.current/staging-linus (a99cde438de0 Linux 4.7-rc6) Merging char-misc.current/char-misc-linus (dd9506954539 Merge tag 'hwmon-for-linus-v4.8' of git://git.kernel.org/pub/scm/linux/kernel/git/groeck/linux-staging) Merging input-current/for-linus (e9003c9cfaa1 Input: tsc200x - report proper input_dev name) Merging crypto-
Re: [PATCH] clocksource: sun4i: Clear interrupts after stopping timer in probe function
On Tue, Jul 26, 2016 at 11:01:59AM +0800, Chen-Yu Tsai wrote:
> The bootloader (U-boot) sometimes uses this timer for various delays.
> It uses it as an ongoing counter, and does comparisons on the current
> counter value. The timer counter is never stopped.
>
> In some cases when the user interacts with the bootloader, or lets
> it idle for some time before loading Linux, the timer may expire,
> and an interrupt will be pending. This results in an unexpected
> interrupt when the timer interrupt is enabled by the kernel, at
> which point the event_handler isn't set yet. This results in a NULL
> pointer dereference exception, panic, and no way to reboot.
>
> Clear any pending interrupts after we stop the timer in the probe
> function to avoid this.
>
> Signed-off-by: Chen-Yu Tsai

Awesome, thanks!

You should put stable in Cc for this kind of patch, though.

Maxime

--
Maxime Ripard, Free Electrons
Embedded Linux and Kernel engineering
http://free-electrons.com
[PATCH] USB: appledisplay: Remove deprecated create_singlethread_workqueue
The workqueue "wq" is involved in controlling the brightness of an Apple Cinema Display over USB. It has a single work item(&pdata->work) per appledisplay and hence doesn't require ordering. Also, it is not being used on a memory reclaim path. Hence, the singlethreaded workqueue has been replaced with the use of system_wq. System workqueues have been able to handle high level of concurrency for a long time now and hence it's not required to have a singlethreaded workqueue just to gain concurrency. Unlike a dedicated per-cpu workqueue created with create_singlethread_workqueue(), system_wq allows multiple work items to overlap executions even on the same CPU; however, a per-cpu workqueue doesn't have any CPU locality or global ordering guarantee unless the target CPU is explicitly specified and thus the increase of local concurrency shouldn't make any difference. The work item is self-requeueing and needs to wait for the in-flight work item to finish before proceeding with destruction. Hence, it has been sync cancelled in appledisplay_disconnect(). This also ensures that there are no pending tasks while disconnecting the driver. 
Signed-off-by: Bhaktipriya Shridhar --- drivers/usb/misc/appledisplay.c | 15 +++ 1 file changed, 3 insertions(+), 12 deletions(-) diff --git a/drivers/usb/misc/appledisplay.c b/drivers/usb/misc/appledisplay.c index a0a3827..c760455 100644 --- a/drivers/usb/misc/appledisplay.c +++ b/drivers/usb/misc/appledisplay.c @@ -85,7 +85,6 @@ struct appledisplay { }; static atomic_t count_displays = ATOMIC_INIT(0); -static struct workqueue_struct *wq; static void appledisplay_complete(struct urb *urb) { @@ -122,7 +121,7 @@ static void appledisplay_complete(struct urb *urb) case ACD_BTN_BRIGHT_UP: case ACD_BTN_BRIGHT_DOWN: pdata->button_pressed = 1; - queue_delayed_work(wq, &pdata->work, 0); + schedule_delayed_work(&pdata->work, 0); break; case ACD_BTN_NONE: default: @@ -159,7 +158,7 @@ static int appledisplay_bl_update_status(struct backlight_device *bd) pdata->msgdata, 2, ACD_USB_TIMEOUT); mutex_unlock(&pdata->sysfslock); - + return retval; } @@ -344,7 +343,7 @@ static void appledisplay_disconnect(struct usb_interface *iface) if (pdata) { usb_kill_urb(pdata->urb); - cancel_delayed_work(&pdata->work); + cancel_delayed_work_sync(&pdata->work); backlight_device_unregister(pdata->bd); usb_free_coherent(pdata->udev, ACD_URB_BUFFER_LEN, pdata->urbdata, pdata->urb->transfer_dma); @@ -365,19 +364,11 @@ static struct usb_driver appledisplay_driver = { static int __init appledisplay_init(void) { - wq = create_singlethread_workqueue("appledisplay"); - if (!wq) { - printk(KERN_ERR "appledisplay: Could not create work queue\n"); - return -ENOMEM; - } - return usb_register(&appledisplay_driver); } static void __exit appledisplay_exit(void) { - flush_workqueue(wq); - destroy_workqueue(wq); usb_deregister(&appledisplay_driver); } -- 2.1.4
Re: [PATCH] xen/x86: Define stubs for xen_smp_intr_init/xen_smp_intr_free
On 25/07/16 23:14, Boris Ostrovsky wrote:
> xen_smp_intr_init() and xen_smp_intr_free() are now called from
> enlighten.c and therefore not guaranteed to have CONFIG_SMP.
>
> Instead of adding multiple ifdefs there provide stubs in smp.h
>
> Signed-off-by: Boris Ostrovsky

Reviewed-by: Juergen Gross

Juergen
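For readers unfamiliar with the stub approach described in the patch above, the general pattern looks roughly like the sketch below. This is an illustration, not the actual patch: `DEMO_CONFIG_SMP` and the `demo_` names are hypothetical stand-ins for the real config option and functions.

```c
/*
 * DEMO_CONFIG_SMP is a hypothetical stand-in for a config option; leave it
 * undefined to build the stub variant.
 */
#ifdef DEMO_CONFIG_SMP
int demo_smp_intr_init(unsigned int cpu);
void demo_smp_intr_free(unsigned int cpu);
#else
/* Stubs: callers compile unchanged and the calls become no-ops. */
static inline int demo_smp_intr_init(unsigned int cpu)
{
	(void)cpu;
	return 0;
}

static inline void demo_smp_intr_free(unsigned int cpu)
{
	(void)cpu;
}
#endif

/* A caller that needs no #ifdef of its own. */
static int demo_cpu_up(unsigned int cpu)
{
	int rc = demo_smp_intr_init(cpu);

	if (rc)
		demo_smp_intr_free(cpu);
	return rc;
}
```

The point of the design is that the conditional lives in one header, so every call site stays free of preprocessor checks.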
Re: [PATCH 0/5 RFC] Add an interface to discover relationships between namespaces
On Mon, Jul 25, 2016 at 09:59:43AM -0500, Eric W. Biederman wrote:
> "Michael Kerrisk (man-pages)" writes:
[snip]
> [snip]
> >>> So, from my point of view, the important piece that was missing from
> >>> your commit message was the note to use readlink("/proc/self/fd/%d")
> >>> on the returned FDs. I think that detail needs to be part of the
> >>> commit message (and also the man page text). I think it would even be
> >>> helpful to include the above program as part of the commit message:
> >>> it helps people more quickly grasp the API.
> >>
> >> Please, please make the standard way to compare these things fstat.
> >> That is much less magic than a symlink, and a little more future proof.
> >> Possibly even kcmp.

I like the idea of using kcmp to compare namespaces. I am going to add this
functionality to kcmp and describe all of this in the man page.

> >
> > As in fstat() to get the st_ino field, right?
>
> Both the st_ino and st_dev fields.
>
> The most likely change to support checkpoint/restart in the future is to
> preserve st_ino across migrations and instantiate a different instance
> of nsfs to hold the inode numbers from the previous machine.

It sounds tricky. BTW, this is not the only place where we have this sort
of problem. For example, mount IDs are currently not preserved when a
container is migrated. The same problem applies to tmpfs, where inode
numbers are not preserved for files.

>
> We would need to handle the preservation carefully or else there is
> a chance that two namespace file descriptors (collected from different
> sources) with different st_dev and st_ino fields may actually refer to
> the same object.
>
> Which is a long way of saying we have the st_dev field please use it,
> it may matter at some point.
>
> Eric
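As a concrete illustration of the fstat()-based comparison discussed above, here is a minimal userspace sketch. The helper names `same_object()` and `same_path_object()` are made up for this example; only open(), fstat() and close() are real interfaces.

```c
#include <fcntl.h>
#include <sys/stat.h>
#include <unistd.h>

/*
 * Return 1 if both descriptors refer to the same object, 0 if not,
 * and -1 if fstat() fails. Both st_dev and st_ino must match.
 */
static int same_object(int fd_a, int fd_b)
{
	struct stat a, b;

	if (fstat(fd_a, &a) < 0 || fstat(fd_b, &b) < 0)
		return -1;
	return a.st_dev == b.st_dev && a.st_ino == b.st_ino;
}

/* Convenience wrapper: open two paths read-only and compare them. */
static int same_path_object(const char *path_a, const char *path_b)
{
	int fd_a = open(path_a, O_RDONLY);
	int fd_b = open(path_b, O_RDONLY);
	int ret = -1;

	if (fd_a >= 0 && fd_b >= 0)
		ret = same_object(fd_a, fd_b);
	if (fd_a >= 0)
		close(fd_a);
	if (fd_b >= 0)
		close(fd_b);
	return ret;
}
```

For namespaces, one would presumably pass two namespace file paths such as /proc/self/ns/net and the corresponding file of another process (given sufficient permissions), or call same_object() directly on descriptors returned by the interface proposed in this series.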
RE: [PATCH v18 net-next 1/1] hv_sock: introduce Hyper-V Sockets
> From: David Miller [mailto:da...@davemloft.net] > ... > From: Dexuan Cui > Date: Tue, 26 Jul 2016 03:09:16 + > > > BTW, during the past month, at least 7 other people also reviewed > > the patch and gave me quite a few good comments, which have > > been addressed. > > Correction: Several people gave coding style and simple corrections > to your patch. > > Very few gave any review of the _SUBSTANCE_ of your changes. > > And the one of the few who did, and suggested you build your > facilities using the existing S390 hypervisor socket infrastructure, > you brushed off _IMMEDIATELY_. > > That drives me crazy. The one person who gave you real feedback > you basically didn't consider seriously at all. Hi David, I'm very sorry -- I guess I must have missed something here -- I don't remember somebody replied with S390 hypervisor socket infrastructure... I'm re-reading all the replies, trying to locate the reply and I'll find out why I didn't take it seriously. Sorry in advance. > I know why you don't want to consider alternative implementations, > and it's because you guys have so much invested in what you've > implemented already. This is not true. I'm absolutely open to any possibility to have an alternative better implementation. Please allow me to find the "S390 hypervisor socket infrastructure" reply first and I'll report back ASAP. > But that's tough and not our problem. > > And until this changes, yes, this submission will be stuck in the > mud and continue slogging on like this. I definitely agree and understand. Thanks, -- Dexuan
[PATCH v2 2/3] xen-blkfront: introduce blkif_set_queue_limits()
blk_mq_update_nr_hw_queues() resets all queue limits to their defaults, which
is not what xen-blkfront expects. Introduce blkif_set_queue_limits() to
restore the limits to their correct initial values.

Signed-off-by: Bob Liu
---
v2: Move blkif_set_queue_limits() after blkfront_gather_backend_features.
---
 drivers/block/xen-blkfront.c | 87 +++---
 1 file changed, 48 insertions(+), 39 deletions(-)

diff --git a/drivers/block/xen-blkfront.c b/drivers/block/xen-blkfront.c
index 032fc94..1b4c380 100644
--- a/drivers/block/xen-blkfront.c
+++ b/drivers/block/xen-blkfront.c
@@ -189,6 +189,8 @@ struct blkfront_info
 	struct mutex mutex;
 	struct xenbus_device *xbdev;
 	struct gendisk *gd;
+	u16 sector_size;
+	unsigned int physical_sector_size;
 	int vdevice;
 	blkif_vdev_t handle;
 	enum blkif_state connected;
@@ -913,9 +915,45 @@ static struct blk_mq_ops blkfront_mq_ops = {
 	.map_queue = blk_mq_map_queue,
 };

+static void blkif_set_queue_limits(struct blkfront_info *info)
+{
+	struct request_queue *rq = info->rq;
+	struct gendisk *gd = info->gd;
+	unsigned int segments = info->max_indirect_segments ? :
+				BLKIF_MAX_SEGMENTS_PER_REQUEST;
+
+	queue_flag_set_unlocked(QUEUE_FLAG_VIRT, rq);
+
+	if (info->feature_discard) {
+		queue_flag_set_unlocked(QUEUE_FLAG_DISCARD, rq);
+		blk_queue_max_discard_sectors(rq, get_capacity(gd));
+		rq->limits.discard_granularity = info->discard_granularity;
+		rq->limits.discard_alignment = info->discard_alignment;
+		if (info->feature_secdiscard)
+			queue_flag_set_unlocked(QUEUE_FLAG_SECDISCARD, rq);
+	}
+
+	/* Hard sector size and max sectors impersonate the equiv. hardware. */
+	blk_queue_logical_block_size(rq, info->sector_size);
+	blk_queue_physical_block_size(rq, info->physical_sector_size);
+	blk_queue_max_hw_sectors(rq, (segments * XEN_PAGE_SIZE) / 512);
+
+	/* Each segment in a request is up to an aligned page in size.
*/ + blk_queue_segment_boundary(rq, PAGE_SIZE - 1); + blk_queue_max_segment_size(rq, PAGE_SIZE); + + /* Ensure a merged request will fit in a single I/O ring slot. */ + blk_queue_max_segments(rq, segments / GRANTS_PER_PSEG); + + /* Make sure buffer addresses are sector-aligned. */ + blk_queue_dma_alignment(rq, 511); + + /* Make sure we don't use bounce buffers. */ + blk_queue_bounce_limit(rq, BLK_BOUNCE_ANY); +} + static int xlvbd_init_blk_queue(struct gendisk *gd, u16 sector_size, - unsigned int physical_sector_size, - unsigned int segments) + unsigned int physical_sector_size) { struct request_queue *rq; struct blkfront_info *info = gd->private_data; @@ -947,37 +985,11 @@ static int xlvbd_init_blk_queue(struct gendisk *gd, u16 sector_size, } rq->queuedata = info; - queue_flag_set_unlocked(QUEUE_FLAG_VIRT, rq); - - if (info->feature_discard) { - queue_flag_set_unlocked(QUEUE_FLAG_DISCARD, rq); - blk_queue_max_discard_sectors(rq, get_capacity(gd)); - rq->limits.discard_granularity = info->discard_granularity; - rq->limits.discard_alignment = info->discard_alignment; - if (info->feature_secdiscard) - queue_flag_set_unlocked(QUEUE_FLAG_SECDISCARD, rq); - } - - /* Hard sector size and max sectors impersonate the equiv. hardware. */ - blk_queue_logical_block_size(rq, sector_size); - blk_queue_physical_block_size(rq, physical_sector_size); - blk_queue_max_hw_sectors(rq, (segments * XEN_PAGE_SIZE) / 512); - - /* Each segment in a request is up to an aligned page in size. */ - blk_queue_segment_boundary(rq, PAGE_SIZE - 1); - blk_queue_max_segment_size(rq, PAGE_SIZE); - - /* Ensure a merged request will fit in a single I/O ring slot. */ - blk_queue_max_segments(rq, segments / GRANTS_PER_PSEG); - - /* Make sure buffer addresses are sector-aligned. */ - blk_queue_dma_alignment(rq, 511); - - /* Make sure we don't use bounce buffers. 
*/ - blk_queue_bounce_limit(rq, BLK_BOUNCE_ANY); - - gd->queue = rq; - + info->rq = gd->queue = rq; + info->gd = gd; + info->sector_size = sector_size; + info->physical_sector_size = physical_sector_size; + blkif_set_queue_limits(info); return 0; } @@ -1142,16 +1154,11 @@ static int xlvbd_alloc_gendisk(blkif_sector_t capacity, gd->driverfs_dev = &(info->xbdev->dev); set_capacity(gd, capacity); - if (xlvbd_init_blk_queue(gd, sector_size, physical_sector_size, -info->max_indirect_segments ? : -BLKIF_
[PATCH v2 3/3] xen-blkfront: dynamic configuration of per-vbd resources
The current VBD layer reserves buffer space for each attached device based on
three statically configured settings which are read at boot time.
 * max_indirect_segs: Maximum number of segments.
 * max_ring_page_order: Maximum order of pages to be used for the shared ring.
 * max_queues: Maximum number of queues (rings) to be used.

But the storage backend, workload, and guest memory result in very different
tuning requirements. It is impossible to centrally predict application
characteristics, so it is best to allow the settings to be adjusted
dynamically, based on the workload inside the guest.

Usage:
Show current values:
cat /sys/devices/vbd-xxx/max_indirect_segs
cat /sys/devices/vbd-xxx/max_ring_page_order
cat /sys/devices/vbd-xxx/max_queues

Write new values:
echo > /sys/devices/vbd-xxx/max_indirect_segs
echo > /sys/devices/vbd-xxx/max_ring_page_order
echo > /sys/devices/vbd-xxx/max_queues

Signed-off-by: Bob Liu
--
v2: Rename to max_ring_page_order and rm the waiting code suggested by Roger.
---
 drivers/block/xen-blkfront.c | 275 +-
 1 file changed, 269 insertions(+), 6 deletions(-)

diff --git a/drivers/block/xen-blkfront.c b/drivers/block/xen-blkfront.c
index 1b4c380..ff5ebe5 100644
--- a/drivers/block/xen-blkfront.c
+++ b/drivers/block/xen-blkfront.c
@@ -212,6 +212,11 @@ struct blkfront_info
 	/* Save uncomplete reqs and bios for migration. */
 	struct list_head requests;
 	struct bio_list bio_list;
+	/* For dynamic configuration. */
+	unsigned int reconfiguring:1;
+	int new_max_indirect_segments;
+	int max_ring_page_order;
+	int max_queues;
 };

 static unsigned int nr_minors;
@@ -1350,6 +1355,31 @@ static void blkif_free(struct blkfront_info *info, int suspend)

 	for (i = 0; i < info->nr_rings; i++)
 		blkif_free_ring(&info->rinfo[i]);
+	/* Remove old xenstore nodes.
*/ + if (info->nr_ring_pages > 1) + xenbus_rm(XBT_NIL, info->xbdev->nodename, "ring-page-order"); + + if (info->nr_rings == 1) { + if (info->nr_ring_pages == 1) { + xenbus_rm(XBT_NIL, info->xbdev->nodename, "ring-ref"); + } else { + for (i = 0; i < info->nr_ring_pages; i++) { + char ring_ref_name[RINGREF_NAME_LEN]; + + snprintf(ring_ref_name, RINGREF_NAME_LEN, "ring-ref%u", i); + xenbus_rm(XBT_NIL, info->xbdev->nodename, ring_ref_name); + } + } + } else { + xenbus_rm(XBT_NIL, info->xbdev->nodename, "multi-queue-num-queues"); + + for (i = 0; i < info->nr_rings; i++) { + char queuename[QUEUE_NAME_LEN]; + + snprintf(queuename, QUEUE_NAME_LEN, "queue-%u", i); + xenbus_rm(XBT_NIL, info->xbdev->nodename, queuename); + } + } kfree(info->rinfo); info->rinfo = NULL; info->nr_rings = 0; @@ -1763,15 +1793,21 @@ static int talk_to_blkback(struct xenbus_device *dev, const char *message = NULL; struct xenbus_transaction xbt; int err; - unsigned int i, max_page_order = 0; + unsigned int i, backend_max_order = 0; unsigned int ring_page_order = 0; err = xenbus_scanf(XBT_NIL, info->xbdev->otherend, - "max-ring-page-order", "%u", &max_page_order); + "max-ring-page-order", "%u", &backend_max_order); if (err != 1) info->nr_ring_pages = 1; else { - ring_page_order = min(xen_blkif_max_ring_order, max_page_order); + if (info->max_ring_page_order) { + /* Dynamic configured through /sys. */ + BUG_ON(info->max_ring_page_order > backend_max_order); + ring_page_order = info->max_ring_page_order; + } else + /* Default. */ + ring_page_order = min(xen_blkif_max_ring_order, backend_max_order); info->nr_ring_pages = 1 << ring_page_order; } @@ -1894,7 +1930,14 @@ static int negotiate_mq(struct blkfront_info *info) if (err < 0) backend_max_queues = 1; - info->nr_rings = min(backend_max_queues, xen_blkif_max_queues); + if (info->max_queues) { + /* Dynamic configured through /sys */ + BUG_ON(info->max_queues > backend_max_queues); + info->nr_rings = info->max_queues; + } else + /* Default. 
*/ + info->nr_rings = min(backend_max_queues, xen_blkif_max_queues); + /* We need at least one ring. */ if (!info->nr_rings) info->nr_rings = 1; @@ -2352,11 +2395,197 @@ static void blkfront_gather_backend_features(struct blkfront_info *info)
[PATCH 1/3] xen-blkfront: fix places not updated after introducing 64KB page granularity
Two places were not updated when 64KB page granularity was introduced; this
patch fixes them.

Signed-off-by: Bob Liu
Acked-by: Roger Pau Monné
---
 drivers/block/xen-blkfront.c |4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/block/xen-blkfront.c b/drivers/block/xen-blkfront.c
index fcc5b4e..032fc94 100644
--- a/drivers/block/xen-blkfront.c
+++ b/drivers/block/xen-blkfront.c
@@ -1321,7 +1321,7 @@ free_shadow:
 			rinfo->ring_ref[i] = GRANT_INVALID_REF;
 		}
 	}
-	free_pages((unsigned long)rinfo->ring.sring, get_order(info->nr_ring_pages * PAGE_SIZE));
+	free_pages((unsigned long)rinfo->ring.sring, get_order(info->nr_ring_pages * XEN_PAGE_SIZE));
 	rinfo->ring.sring = NULL;

 	if (rinfo->irq)
@@ -2013,7 +2013,7 @@ static int blkif_recover(struct blkfront_info *info)

 	blkfront_gather_backend_features(info);
 	segs = info->max_indirect_segments ? : BLKIF_MAX_SEGMENTS_PER_REQUEST;
-	blk_queue_max_segments(info->rq, segs);
+	blk_queue_max_segments(info->rq, segs / GRANTS_PER_PSEG);

 	for (r_index = 0; r_index < info->nr_rings; r_index++) {
 		struct blkfront_ring_info *rinfo = &info->rinfo[r_index];
--
1.7.10.4
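The arithmetic behind the two fixes above can be sketched in plain C. The numbers here are assumptions made for illustration: a Xen page of 4 KiB, and GRANTS_PER_PSEG taken to mean the guest page size divided by the Xen page size. The `DEMO_` constant and the helper names are hypothetical, not kernel symbols.

```c
/* Illustrative value only; the real one comes from the kernel/Xen headers. */
#define DEMO_XEN_PAGE_SIZE	4096u

/* Number of Xen-sized grants that fit in one guest page. */
static unsigned int grants_per_pseg(unsigned int page_size)
{
	return page_size / DEMO_XEN_PAGE_SIZE;
}

/* Segment limit handed to the block layer, in guest-page-sized segments. */
static unsigned int max_queue_segments(unsigned int segs, unsigned int page_size)
{
	return segs / grants_per_pseg(page_size);
}

/* Size in bytes of a shared ring made of nr_ring_pages Xen pages. */
static unsigned long ring_bytes(unsigned int nr_ring_pages)
{
	return (unsigned long)nr_ring_pages * DEMO_XEN_PAGE_SIZE;
}
```

With 4 KiB guest pages the divisor is 1 and nothing changes, which is presumably why the omissions went unnoticed; with 64 KiB guest pages, 32 grant-sized segments correspond to only 2 guest-page segments, and a 4-page ring occupies 16 KiB rather than 256 KiB.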
[PATCH] usb: ftdi-elan: Remove deprecated create_singlethread_workqueue
The status workqueue is involved in initializing the Uxxx and polling the
Uxxx until a supported PCMCIA CardBus device is detected. It then starts the
command and respond workqueues and then loads the module that handles the
device, after which it just polls the Uxxx looking for card ejects.

The command and respond workqueues are involved in implementing a command
sequencer for communicating with the firmware on the other side of the FTDI
chip in the Uxxx.

These workqueues have only a single work item each and hence they do not
require ordering. Also, none of the above workqueues are being used on a
memory reclaim path. Hence, the singlethreaded workqueues have been replaced
with the use of system_wq.

System workqueues have been able to handle a high level of concurrency for a
long time now and hence it's not required to have a singlethreaded workqueue
just to gain concurrency. Unlike a dedicated per-cpu workqueue created with
create_singlethread_workqueue(), system_wq allows multiple work items to
overlap executions even on the same CPU; however, a per-cpu workqueue doesn't
have any CPU locality or global ordering guarantee unless the target CPU is
explicitly specified and thus the increase of local concurrency shouldn't
make any difference.

The work items are self-requeueing and need to wait for the in-flight work
item to finish before proceeding with destruction. Hence, they have been sync
cancelled in ftdi_status_cancel_work(), ftdi_command_cancel_work() and
ftdi_response_cancel_work(). These functions are called in ftdi_elan_exit()
to ensure that there are no pending work items while disconnecting the
driver.
Signed-off-by: Bhaktipriya Shridhar --- drivers/usb/misc/ftdi-elan.c | 53 +--- 1 file changed, 10 insertions(+), 43 deletions(-) diff --git a/drivers/usb/misc/ftdi-elan.c b/drivers/usb/misc/ftdi-elan.c index 52c27ca..59031dc 100644 --- a/drivers/usb/misc/ftdi-elan.c +++ b/drivers/usb/misc/ftdi-elan.c @@ -61,9 +61,6 @@ module_param(distrust_firmware, bool, 0); MODULE_PARM_DESC(distrust_firmware, "true to distrust firmware power/overcurrent setup"); extern struct platform_driver u132_platform_driver; -static struct workqueue_struct *status_queue; -static struct workqueue_struct *command_queue; -static struct workqueue_struct *respond_queue; /* * ftdi_module_lock exists to protect access to global variables * @@ -228,56 +225,56 @@ static void ftdi_elan_init_kref(struct usb_ftdi *ftdi) static void ftdi_status_requeue_work(struct usb_ftdi *ftdi, unsigned int delta) { - if (!queue_delayed_work(status_queue, &ftdi->status_work, delta)) + if (!schedule_delayed_work(&ftdi->status_work, delta)) kref_put(&ftdi->kref, ftdi_elan_delete); } static void ftdi_status_queue_work(struct usb_ftdi *ftdi, unsigned int delta) { - if (queue_delayed_work(status_queue, &ftdi->status_work, delta)) + if (schedule_delayed_work(&ftdi->status_work, delta)) kref_get(&ftdi->kref); } static void ftdi_status_cancel_work(struct usb_ftdi *ftdi) { - if (cancel_delayed_work(&ftdi->status_work)) + if (cancel_delayed_work_sync(&ftdi->status_work)) kref_put(&ftdi->kref, ftdi_elan_delete); } static void ftdi_command_requeue_work(struct usb_ftdi *ftdi, unsigned int delta) { - if (!queue_delayed_work(command_queue, &ftdi->command_work, delta)) + if (!schedule_delayed_work(&ftdi->command_work, delta)) kref_put(&ftdi->kref, ftdi_elan_delete); } static void ftdi_command_queue_work(struct usb_ftdi *ftdi, unsigned int delta) { - if (queue_delayed_work(command_queue, &ftdi->command_work, delta)) + if (schedule_delayed_work(&ftdi->command_work, delta)) kref_get(&ftdi->kref); } static void 
ftdi_command_cancel_work(struct usb_ftdi *ftdi) { - if (cancel_delayed_work(&ftdi->command_work)) + if (cancel_delayed_work_sync(&ftdi->command_work)) kref_put(&ftdi->kref, ftdi_elan_delete); } static void ftdi_response_requeue_work(struct usb_ftdi *ftdi, unsigned int delta) { - if (!queue_delayed_work(respond_queue, &ftdi->respond_work, delta)) + if (!schedule_delayed_work(&ftdi->respond_work, delta)) kref_put(&ftdi->kref, ftdi_elan_delete); } static void ftdi_respond_queue_work(struct usb_ftdi *ftdi, unsigned int delta) { - if (queue_delayed_work(respond_queue, &ftdi->respond_work, delta)) + if (schedule_delayed_work(&ftdi->respond_work, delta)) kref_get(&ftdi->kref); } static void ftdi_response_cancel_work(struct usb_ftdi *ftdi) { - if (cancel_delayed_work(&ftdi->respond_work)) + if (cancel_delayed_work_sync(&ftdi->respond_work)) kref_put(&ftdi->kref, ftdi_elan_delete); } @@ -2823,9 +2820,6 @@ static void ftdi_elan_disconnect(stru
[PATCH 1/1] socfpga: defconfig: Enable Altera GPIO driver as module
From: Tien Hock Loh This patch enables Altera GPIO driver as module in socfpga_defconfig Signed-off-by: Tien Hock Loh --- arch/arm/configs/socfpga_defconfig | 1 + 1 file changed, 1 insertion(+) diff --git a/arch/arm/configs/socfpga_defconfig b/arch/arm/configs/socfpga_defconfig index 753f1a5..241ce4ca 100644 --- a/arch/arm/configs/socfpga_defconfig +++ b/arch/arm/configs/socfpga_defconfig @@ -108,3 +108,4 @@ CONFIG_DETECT_HUNG_TASK=y # CONFIG_SCHED_DEBUG is not set CONFIG_ENABLE_DEFAULT_TRACERS=y CONFIG_DEBUG_USER=y +CONFIG_GPIO_ALTERA=m -- 1.7.11.GIT
Re: PROBLEM: network data corruption (bisected to e5a4b0bb803b)
Al Viro wrote: > On Sun, Jul 24, 2016 at 07:45:13PM +0200, Christian Lamparter wrote: > > > > The symptom is that downloaded files (http, ftp, and probably other > > > protocols) have small corrupted segments (about 1-2 kilobytes long) in > > > random locations. Only downloads that sustain a high speed for at least a > > > few seconds are corrupted. Anything small enough to be received in less > > > than about 5 seconds is not affected. > > Can that sucker be reproduced with netcat? That would eliminate all issues > with multi-iovec recvmsg(2), narrowing the things down quite bit. netcat seems to be immune. Comparing strace results, I didn't see any recvmsg() calls in the other programs that have had the problem, but there is an interesting difference: netcat calls select() to wait for the socket to be ready for reading, where my other test programs just call read() and let it block until ready. So I wrote a small test program to isolate that difference. It downloads a file using only read() and write() and a hardcoded HTTP request. It has a select mode (main loop alternates read() and select() on the TCP socket) and a noselect mode (main loop just read()s the TCP socket). The program is included at the bottom of this message. I ran it several times in both modes and got corruption if and only if the noselect mode was used. > > Another thing (and if that works, it's *NOT* a proper fix - it would be > papering over the problem, but at least it would show where to look for > it) - try (on top of mainline) the following delta: > > diff --git a/net/core/datagram.c b/net/core/datagram.c Will try that patch soon. Meanwhile, here's my test: /* Demonstration program "dlbug". Usage: dlbug select > outfile or dlbug noselect > outfile outfile will contain the full HTTP response. Edit out the HTTP headers and what's left should be a valid gzip if the download worked. 
*/
#include <stdio.h>
#include <string.h>
#include <unistd.h>
#include <netdb.h>
#include <sys/types.h>
#include <sys/socket.h>
#include <sys/select.h>
#include <netinet/in.h>

int main(int argc, char **argv)
{
  const char *request =
    "GET /debian/dists/stable/main/Contents-amd64.gz HTTP/1.0\r\n"
    "Host: ftp.us.debian.org\r\n"
    "\r\n";
  ssize_t request_len = strlen(request), w, r, copied;
  struct addrinfo hints, *host;
  int sock, err, doselect;
  char buf[10240];

  /* Exactly one argument is accepted: "select" or "noselect". */
  if(argc!=2 || (strcmp(argv[1], "select") && strcmp(argv[1], "noselect"))) {
    fprintf(stderr, "Usage: %s {select|noselect}\n", argv[0]);
    return 1;
  }
  doselect = !strcmp(argv[1], "select");

  memset(&hints, 0, sizeof hints);
  hints.ai_family = AF_INET;
  hints.ai_socktype = SOCK_STREAM;
  err = getaddrinfo("ftp.us.debian.org", 0, &hints, &host);
  if(err) {
    fprintf(stderr, "getaddrinfo: %s\n", gai_strerror(err));
    return 1;
  }

  sock = socket(host->ai_family, host->ai_socktype, host->ai_protocol);
  if(sock < 0) {
    perror("socket");
    return 1;
  }

  ((struct sockaddr_in *)host->ai_addr)->sin_port = htons(80);
  if(connect(sock, host->ai_addr, host->ai_addrlen) < 0) {
    perror("connect");
    return 1;
  }

  /* Send the whole request, allowing for short writes. */
  while(request_len) {
    w = write(sock, request, request_len);
    if(w < 0) {
      perror("write to socket");
      return 1;
    }
    request += w;
    request_len -= w;
  }

  /* Copy the response to stdout until EOF. */
  while((r = read(sock, buf, sizeof buf))) {
    if(r < 0) {
      perror("read from socket");
      return 1;
    }
    copied = 0;
    while(copied < r) {
      w = write(1, buf+copied, r-copied);
      if(w < 0) {
        perror("write to stdout");
        return 1;
      }
      copied += w;
    }
    /* In select mode, wait for readability before the next read(). */
    if(doselect) {
      fd_set rfds;
      FD_ZERO(&rfds);
      FD_SET(sock, &rfds);
      select(sock+1, &rfds, 0, 0, 0);
    }
  }
  return 0;
}

-- Alan Curry
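The usage note above says to edit out the HTTP headers by hand before checking the gzip. As a rough sketch of automating that step (not part of the original test program), the body starts right after the first blank line, i.e. the first CRLF CRLF sequence. The helper name http_body_offset is hypothetical.

```c
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper: return the offset of the first body byte in an
 * HTTP response held in buf, or -1 if the header terminator (CRLF CRLF)
 * does not appear within the first len bytes.
 */
static long http_body_offset(const char *buf, size_t len)
{
    size_t i;

    for (i = 0; i + 4 <= len; i++) {
        if (memcmp(buf + i, "\r\n\r\n", 4) == 0)
            return (long)(i + 4);
    }
    return -1;
}
```

Everything from that offset onward in outfile would then be the payload to hand to gzip for an integrity check.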
Re: linux-next: manual merge of the xen-tip tree with the block tree
Hi Boris, On Mon, 25 Jul 2016 18:25:00 -0400 Boris Ostrovsky wrote: > > > Jeremy Fitzhardinge > > Jeremy is no longer involved with Xen. However, > > Juergen Gross > > is also Linux Xen/x86 maintainer. I have replaced Jeremy with Juergen. -- Cheers, Stephen Rothwell
linux-next: manual merge of the random tree with the kspp tree
Hi Theodore, Today's linux-next merge of the random tree got a conflict in: drivers/char/random.c between commit: 8c6a68e9eaa5 ("latent_entropy: Mark functions with __latent_entropy") from the kspp tree and commit: e192be9d9a30 ("random: replace non-blocking pool with a Chacha20-based CRNG") from the random tree. I fixed it up (see below) and can carry the fix as necessary. This is now fixed as far as linux-next is concerned, but any non trivial conflicts should be mentioned to your upstream maintainer when your tree is submitted for merging. You may also want to consider cooperating with the maintainer of the conflicting tree to minimise any particularly complex conflicts. -- Cheers, Stephen Rothwell diff --cc drivers/char/random.c index 6cca3ed45817,8d0af74f6569.. --- a/drivers/char/random.c +++ b/drivers/char/random.c @@@ -442,10 -471,15 +471,15 @@@ struct entropy_store __u8 last_data[EXTRACT_SIZE]; }; + static ssize_t extract_entropy(struct entropy_store *r, void *buf, + size_t nbytes, int min, int rsvd); + static ssize_t _extract_entropy(struct entropy_store *r, void *buf, + size_t nbytes, int fips); + + static void crng_reseed(struct crng_state *crng, struct entropy_store *r); static void push_to_pool(struct work_struct *work); -static __u32 input_pool_data[INPUT_POOL_WORDS]; -static __u32 blocking_pool_data[OUTPUT_POOL_WORDS]; +static __u32 input_pool_data[INPUT_POOL_WORDS] __latent_entropy; +static __u32 blocking_pool_data[OUTPUT_POOL_WORDS] __latent_entropy; - static __u32 nonblocking_pool_data[OUTPUT_POOL_WORDS] __latent_entropy; static struct entropy_store input_pool = { .poolinfo = &poolinfo_table[0],
Re: PROBLEM: network data corruption (bisected to e5a4b0bb803b)
Thanks for the detailed bug-report. I looked around the web to see if it was already reported or not. I found that this issue was reported before: [0], [1] and [2] by the same person (CC'ed). One difference is that the reporter had this issue with rsync on multiple SPARC systems. I ran a git grep on a 4.7.0-rc7+ (wt-2016-07-21-15-g97bd3b0). But it didn't find any patches directly referencing the commit. I'm not sure if this issue has been fixed by now or not. I would greatly appreciate any comment about this from the "people of netdev" (Al Viro? Alex Mcwhirter?).

I can confirm the issue I was having with this commit still exists on sparc with the latest mainline kernel.
Re: [PATCH v3 02/11] mm: Hardened usercopy
On Mon, Jul 25, 2016 at 7:03 PM, Michael Ellerman wrote: > Josh Poimboeuf writes: > >> On Thu, Jul 21, 2016 at 11:34:25AM -0700, Kees Cook wrote: >>> On Wed, Jul 20, 2016 at 11:52 PM, Michael Ellerman >>> wrote: >>> > Kees Cook writes: >>> > >>> >> diff --git a/mm/usercopy.c b/mm/usercopy.c >>> >> new file mode 100644 >>> >> index ..e4bf4e7ccdf6 >>> >> --- /dev/null >>> >> +++ b/mm/usercopy.c >>> >> @@ -0,0 +1,234 @@ >>> > ... >>> >> + >>> >> +/* >>> >> + * Checks if a given pointer and length is contained by the current >>> >> + * stack frame (if possible). >>> >> + * >>> >> + * 0: not at all on the stack >>> >> + * 1: fully within a valid stack frame >>> >> + * 2: fully on the stack (when can't do frame-checking) >>> >> + * -1: error condition (invalid stack position or bad stack frame) >>> >> + */ >>> >> +static noinline int check_stack_object(const void *obj, unsigned long >>> >> len) >>> >> +{ >>> >> + const void * const stack = task_stack_page(current); >>> >> + const void * const stackend = stack + THREAD_SIZE; >>> > >>> > That allows access to the entire stack, including the struct thread_info, >>> > is that what we want - it seems dangerous? Or did I miss a check >>> > somewhere else? >>> >>> That seems like a nice improvement to make, yeah. >>> >>> > We have end_of_stack() which computes the end of the stack taking >>> > thread_info into account (end being the opposite of your end above). >>> >>> Amusingly, the object_is_on_stack() check in sched.h doesn't take >>> thread_info into account either. :P Regardless, I think using >>> end_of_stack() may not be best. To tighten the check, I think we could >>> add this after checking that the object is on the stack: >>> >>> #ifdef CONFIG_STACK_GROWSUP >>> stackend -= sizeof(struct thread_info); >>> #else >>> stack += sizeof(struct thread_info); >>> #endif >>> >>> e.g. then if the pointer was in the thread_info, the second test would >>> fail, triggering the protection. 
>> >> FWIW, this won't work right on x86 after Andy's >> CONFIG_THREAD_INFO_IN_TASK patches get merged. > > Yeah. I wonder if it's better for the arch helper to just take the obj and > len, > and work out it's own bounds for the stack using current and whatever makes > sense on that arch. > > It would avoid too much ifdefery in the generic code, and also avoid any > confusion about whether stackend is the high or low address. > > eg. on powerpc we could do: > > int noinline arch_within_stack_frames(const void *obj, unsigned long len) > { > void *stack_low = end_of_stack(current); > void *stack_high = task_stack_page(current) + THREAD_SIZE; > > > Whereas arches with STACK_GROWSUP=y could do roughly the reverse, and x86 can > do > whatever it needs to depending on whether the thread_info is on or off stack. > > cheers Yeah, I agree: this should be in the arch code. If the arch can actually do frame checking, the thread_info (if it exists on the stack) would already be excluded. But it'd be a nice tightening of the check. -Kees -- Kees Cook Chrome OS & Brillo Security
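The bounds tightening discussed in this thread can be modeled in userspace. This is a minimal sketch only, assuming a downward-growing stack with thread_info stored at the low end of the stack region; FAKE_THREAD_SIZE, struct fake_thread_info and check_stack_object_model are hypothetical stand-ins, not kernel API, and the frame-walking case (return value 1) is left out.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins for THREAD_SIZE and struct thread_info. */
#define FAKE_THREAD_SIZE 8192u
struct fake_thread_info {
    unsigned long flags;
    int cpu;
};

/*
 * Model of the check under discussion:
 *   0: object is not on the stack at all
 *   2: object is fully on the usable part of the stack
 *  -1: object overlaps the stack partially, or touches thread_info
 */
static int check_stack_object_model(uintptr_t stack_base,
                                    uintptr_t obj, size_t len)
{
    uintptr_t stack = stack_base;
    uintptr_t stackend = stack_base + FAKE_THREAD_SIZE;

    /* Reject ranges that wrap around the address space. */
    if (obj + len < obj)
        return -1;

    /* Object is not on the stack at all. */
    if (obj + len <= stack || stackend <= obj)
        return 0;

    /* Tighten the low bound so thread_info is excluded. */
    stack += sizeof(struct fake_thread_info);

    /* Reject anything that leaves the usable stack area. */
    if (obj < stack || stackend < obj + len)
        return -1;

    return 2;
}
```

Moving this logic into a per-arch helper, as suggested above, would let each architecture pick its own low and high bounds instead of adjusting them with #ifdefs in generic code.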
Re: [PATCH 1/3] net: asix: Add in_pm parameter
Please correct the problems Grant Grundler mentioned in all of these patches, and resubmit this entire series freshly. Also, please include a proper "[PATCH 0/3] ..." introduction posting for the series which explains what this series is about, how it implements what it is doing, and why it is doing things that way. Thanks.
[PATCH] powerpc: sgy_cts1000: Fix gpio_halt_cb()'s signature
Halt callback in struct machdep_calls is declared with __noreturn attribute, so omitting that attribute in gpio_halt_cb()'s signature results in a compilation error. Change the signature to address the problem as well as change the code of the function to avoid ever returning from the function. Signed-off-by: Andrey Smirnov --- arch/powerpc/platforms/85xx/sgy_cts1000.c | 8 +--- 1 file changed, 5 insertions(+), 3 deletions(-) diff --git a/arch/powerpc/platforms/85xx/sgy_cts1000.c b/arch/powerpc/platforms/85xx/sgy_cts1000.c index 79fd0df..21d6aaa 100644 --- a/arch/powerpc/platforms/85xx/sgy_cts1000.c +++ b/arch/powerpc/platforms/85xx/sgy_cts1000.c @@ -38,18 +38,18 @@ static void gpio_halt_wfn(struct work_struct *work) } static DECLARE_WORK(gpio_halt_wq, gpio_halt_wfn); -static void gpio_halt_cb(void) +static void __noreturn gpio_halt_cb(void) { enum of_gpio_flags flags; int trigger, gpio; if (!halt_node) - return; + panic("No reset GPIO information was provided in DT\n"); gpio = of_get_gpio_flags(halt_node, 0, &flags); if (!gpio_is_valid(gpio)) - return; + panic("Provided GPIO is invalid\n"); trigger = (flags == OF_GPIO_ACTIVE_LOW); @@ -57,6 +57,8 @@ static void gpio_halt_cb(void) /* Probably wont return */ gpio_set_value(gpio, trigger); + + panic("Halt failed\n"); } /* This IRQ means someone pressed the power button and it is waiting for us -- 2.5.5
Re: [RFC patch 1/6] random: Simplify API for random address requests
On Mon, Jul 25, 2016 at 8:01 PM, Jason Cooper wrote: > To date, all callers of randomize_range() have set the length to 0, and > check for a zero return value. For the current callers, the only way > to get zero returned is if end <= start. Since they are all adding a > constant to the start address, this is unnecessary. > > We can remove a bunch of needless checks by simplifying the API to do > just what everyone wants, return an address between [start, start + > range]. > > While we're here, s/get_random_int/get_random_long/. No current call > site is adversely affected by get_random_int(), since all current range > requests are < MAX_UINT. However, we should match caller expectations > to avoid coming up short (ha!) in the future. > > Signed-off-by: Jason Cooper > --- > drivers/char/random.c | 17 - > include/linux/random.h | 2 +- > 2 files changed, 5 insertions(+), 14 deletions(-) > > diff --git a/drivers/char/random.c b/drivers/char/random.c > index 0158d3bff7e5..1251cb2cbab2 100644 > --- a/drivers/char/random.c > +++ b/drivers/char/random.c > @@ -1822,22 +1822,13 @@ unsigned long get_random_long(void) > EXPORT_SYMBOL(get_random_long); > > /* > - * randomize_range() returns a start address such that > - * > - *[.. .] > - * start end > - * > - * a with size "len" starting at the return value is inside in the > - * area defined by [start, end], but is otherwise randomized. > + * randomize_addr() returns a page aligned address within [start, start + > + * range] > */ > unsigned long > -randomize_range(unsigned long start, unsigned long end, unsigned long len) > +randomize_addr(unsigned long start, unsigned long range) Also, this series isn't bisectable since randomize_range gets removed here before the callers are updated. Perhaps add a macro that calls randomize_addr with a BUG_ON for len != 0? (And then remove it in the last patch?) 
-Kees > { > - unsigned long range = end - len - start; > - > - if (end <= start + len) > - return 0; > - return PAGE_ALIGN(get_random_int() % range + start); > + return PAGE_ALIGN(get_random_long() % range + start); > } > > /* Interface for in-kernel drivers of true hardware RNGs. > diff --git a/include/linux/random.h b/include/linux/random.h > index e47e533742b5..1ad877a98186 100644 > --- a/include/linux/random.h > +++ b/include/linux/random.h > @@ -34,7 +34,7 @@ extern const struct file_operations random_fops, > urandom_fops; > > unsigned int get_random_int(void); > unsigned long get_random_long(void); > -unsigned long randomize_range(unsigned long start, unsigned long end, > unsigned long len); > +unsigned long randomize_addr(unsigned long start, unsigned long range); > > u32 prandom_u32(void); > void prandom_bytes(void *buf, size_t nbytes); > -- > 2.9.2 > -- Kees Cook Chrome OS & Brillo Security
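The bisectability suggestion above (keep a randomize_range() shim that forwards to randomize_addr() and traps on len != 0) could look roughly like this userspace model. It is a sketch under assumptions: assert() stands in for BUG_ON(), rand() stands in for get_random_long(), the MODEL_* macros are hypothetical, and the shim is written as an inline function rather than a macro.

```c
#include <assert.h>
#include <stdlib.h>

#define MODEL_PAGE_SIZE 4096UL
#define MODEL_PAGE_ALIGN(x) \
    (((x) + MODEL_PAGE_SIZE - 1) & ~(MODEL_PAGE_SIZE - 1))

/* Stand-in for the new API; rand() replaces get_random_long(). */
static unsigned long randomize_addr(unsigned long start, unsigned long range)
{
    if (range == 0)     /* avoid modulo by zero in this model */
        return start;
    return MODEL_PAGE_ALIGN((unsigned long)rand() % range + start);
}

/*
 * Transitional shim so old call sites keep building mid-series.
 * assert() models BUG_ON(len != 0).
 */
static inline unsigned long randomize_range(unsigned long start,
                                            unsigned long end,
                                            unsigned long len)
{
    assert(len == 0);
    return randomize_addr(start, end - start);
}
```

One caveat of this model: the old function returned 0 when end <= start, whereas end - start would wrap here, so a real shim would need to decide how to treat that case before being removed in the last patch.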
Re: [kbuild-all] arch/xtensa/include/asm/initialize_mmu.h:55: Error: invalid register 'atomctl' for 'wsr' instruction
Hi Max, On Tue, Jul 26, 2016 at 02:20:25AM +0300, Max Filippov wrote: Hi Fengguang, On Fri, Jul 22, 2016 at 3:44 PM, Fengguang Wu wrote: On Fri, Jul 22, 2016 at 06:32:28PM +0800, kbuild test robot wrote: FYI, the error/warning still remains. tree: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git master head: 47ef4ad2684d380dd6d596140fb79395115c3950 commit: 9da8320bb97768e35f2e64fa7642015271d672eb xtensa: add test_kc705_hifi variant date: 4 months ago config: xtensa-audio_kc705_defconfig (attached as .config) compiler: xtensa-linux-gcc (GCC) 4.9.0 All errors (new ones prefixed by >>): arch/xtensa/include/asm/initialize_mmu.h: Assembler messages: arch/xtensa/include/asm/initialize_mmu.h:55: Error: invalid register 'atomctl' for 'wsr' instruction -- arch/xtensa/kernel/coprocessor.S: Assembler messages: arch/xtensa/kernel/coprocessor.S:93: Error: unknown opcode or format name 'rur.ae_ovf_sar' arch/xtensa/kernel/coprocessor.S:93: Error: unknown opcode or format name 'rur.ae_bithead' arch/xtensa/kernel/coprocessor.S:93: Error: unknown opcode or format name 'rur.ae_ts_fts_bu_bp' arch/xtensa/kernel/coprocessor.S:93: Error: unknown opcode or format name 'rur.ae_cw_sd_no' arch/xtensa/kernel/coprocessor.S:93: Error: unknown opcode or format name 'rur.ae_cbegin0' arch/xtensa/kernel/coprocessor.S:93: Error: unknown opcode or format name 'rur.ae_cend0' arch/xtensa/kernel/coprocessor.S:93: Error: unknown opcode or format name 'ae_s64.i' arch/xtensa/kernel/coprocessor.S:93: Error: unknown opcode or format name 'ae_s64.i' Do they really matter? Or can I shut these errors up? Looks like I haven't supplied you with the compiler for test_kc705_hifi, for which these errors are reported. I've built it and put it here: http://jcmvbkbc.spb.ru/~jcmvbkbc/tmp/201604261801/x86_64-gcc-5.3.0-nolibc-xtensa-test_kc705_hifi-elf.tar.xz Please integrate it into your system along with other xtensa compilers. OK, done. :) Thanks, Fengguang
Re: [PATCH] caif-hsi: Remove deprecated create_singlethread_workqueue
From: Bhaktipriya Shridhar Date: Mon, 25 Jul 2016 18:40:57 +0530 > alloc_workqueue replaces deprecated create_singlethread_workqueue(). > > A dedicated workqueue has been used since the workitems are being used > on a packet tx/rx path. Hence, WQ_MEM_RECLAIM has been set to guarantee > forward progress under memory pressure. > > An ordered workqueue has been used since workitems &cfhsi->wake_up_work > and &cfhsi->wake_down_work cannot be run concurrently. > > Calls to flush_workqueue() before destroy_workqueue() have been dropped > since destroy_workqueue() itself calls drain_workqueue() which flushes > repeatedly till the workqueue becomes empty. > > Signed-off-by: Bhaktipriya Shridhar Applied.
Re: [RFC patch 1/6] random: Simplify API for random address requests
On Mon, Jul 25, 2016 at 8:30 PM, Jason Cooper wrote: > All, > > On Tue, Jul 26, 2016 at 03:01:55AM +, Jason Cooper wrote: >> To date, all callers of randomize_range() have set the length to 0, and >> check for a zero return value. For the current callers, the only way >> to get zero returned is if end <= start. Since they are all adding a >> constant to the start address, this is unnecessary. >> >> We can remove a bunch of needless checks by simplifying the API to do >> just what everyone wants, return an address between [start, start + >> range]. >> >> While we're here, s/get_random_int/get_random_long/. No current call >> site is adversely affected by get_random_int(), since all current range >> requests are < MAX_UINT. However, we should match caller expectations >> to avoid coming up short (ha!) in the future. >> >> Signed-off-by: Jason Cooper >> --- >> drivers/char/random.c | 17 - >> include/linux/random.h | 2 +- >> 2 files changed, 5 insertions(+), 14 deletions(-) >> >> diff --git a/drivers/char/random.c b/drivers/char/random.c >> index 0158d3bff7e5..1251cb2cbab2 100644 >> --- a/drivers/char/random.c >> +++ b/drivers/char/random.c >> @@ -1822,22 +1822,13 @@ unsigned long get_random_long(void) >> EXPORT_SYMBOL(get_random_long); >> >> /* >> - * randomize_range() returns a start address such that >> - * >> - *[.. .] >> - * start end >> - * >> - * a with size "len" starting at the return value is inside in the >> - * area defined by [start, end], but is otherwise randomized. >> + * randomize_addr() returns a page aligned address within [start, start + >> + * range] >> */ >> unsigned long >> -randomize_range(unsigned long start, unsigned long end, unsigned long len) >> +randomize_addr(unsigned long start, unsigned long range) >> { >> - unsigned long range = end - len - start; >> - >> - if (end <= start + len) >> - return 0; >> - return PAGE_ALIGN(get_random_int() % range + start); >> + return PAGE_ALIGN(get_random_long() % range + start); >> } > > bah! 
old patch file. This should have been: > > if (range == 0) > return start; > else > return PAGE_ALIGN(get_random_long() % range + start); I think range should be limited to start + range < UINTMAX, and it should be very clear if the range is inclusive or exclusive. start = 0, range = 4096. does this mean 1 page, or 2 pages possible? -Kees > > sorry, > > Jason. > >> >> /* Interface for in-kernel drivers of true hardware RNGs. >> diff --git a/include/linux/random.h b/include/linux/random.h >> index e47e533742b5..1ad877a98186 100644 >> --- a/include/linux/random.h >> +++ b/include/linux/random.h >> @@ -34,7 +34,7 @@ extern const struct file_operations random_fops, >> urandom_fops; >> >> unsigned int get_random_int(void); >> unsigned long get_random_long(void); >> -unsigned long randomize_range(unsigned long start, unsigned long end, >> unsigned long len); >> +unsigned long randomize_addr(unsigned long start, unsigned long range); >> >> u32 prandom_u32(void); >> void prandom_bytes(void *buf, size_t nbytes); >> -- >> 2.9.2 >> -- Kees Cook Chrome OS & Brillo Security
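The inclusive-or-exclusive question can be checked by enumeration. The sketch below models the corrected function body with the random value passed in explicitly, then counts the distinct results over every possible residue. The MODEL_* macros and both function names are hypothetical, and a 4096-byte page is assumed.

```c
#include <stddef.h>

#define MODEL_PAGE_SIZE 4096UL
#define MODEL_PAGE_ALIGN(x) \
    (((x) + MODEL_PAGE_SIZE - 1) & ~(MODEL_PAGE_SIZE - 1))

/* The corrected body, with the random value supplied by the caller. */
static unsigned long randomize_addr_from(unsigned long raw,
                                         unsigned long start,
                                         unsigned long range)
{
    if (range == 0)
        return start;
    return MODEL_PAGE_ALIGN(raw % range + start);
}

/* Count distinct addresses reachable for a given start and range. */
static unsigned long count_possible_addrs(unsigned long start,
                                          unsigned long range)
{
    unsigned long count = 0, prev = 0, raw;

    if (range == 0)
        return 1;

    /* Results are non-decreasing in raw, so count the changes. */
    for (raw = 0; raw < range; raw++) {
        unsigned long a = randomize_addr_from(raw, start, range);

        if (count == 0 || a != prev) {
            count++;
            prev = a;
        }
    }
    return count;
}
```

Under this model, start = 0 and range = 4096 yields two possible addresses, 0 and 4096: a residue of 0 stays at 0, while residues 1 through 4095 all round up to 4096. So with PAGE_ALIGN rounding up, the upper bound start + range is reachable, which is the kind of detail the function comment would need to state explicitly.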
Re: [PATCH v18 net-next 1/1] hv_sock: introduce Hyper-V Sockets
From: Dexuan Cui Date: Tue, 26 Jul 2016 03:09:16 + > BTW, during the past month, at least 7 other people also reviewed > the patch and gave me quite a few good comments, which have > been addressed. Correction: Several people gave coding style and simple corrections to your patch. Very few gave any review of the _SUBSTANCE_ of your changes. And the one of the few who did, and suggested you build your facilities using the existing S390 hypervisor socket infrastructure, you brushed off _IMMEDIATELY_. That drives me crazy. The one person who gave you real feedback you basically didn't consider seriously at all. I know why you don't want to consider alternative implementations, and it's because you guys have so much invested in what you've implemented already. But that's tough and not our problem. And until this changes, yes, this submission will be stuck in the mud and continue slogging on like this. Sorry.
Re: PROBLEM: network data corruption (bisected to e5a4b0bb803b)
Christian Lamparter wrote: > > As for carl9170: I'm not sure what the driver or firmware can do about > this at this time. You can try to disable the hardware crypto by setting > nohwcrypt via the module option. However, this might not do anything at all. The nohwcrypt parameter didn't make any difference. > > > > lsusb identifies my network device as: > > > > Bus 005 Device 004: ID 0cf3:1002 Atheros Communications, Inc. TP-Link > > TL-WN821N v2 802.11n [Atheros AR9170] > > > > I have version 1.9.9 of carl9170-1.fw in /lib/firmware > Just one additional question: Is the TL-WN821N connected to a USB3 port? It never has been before. I tried it today and it made no difference. -- Alan Curry
[PATCH 1/2] powerpc: mpc85xx_mds: Select PHYLIB only if NETDEVICES is enabled
PHYLIB depends on NETDEVICES, so to avoid unmet dependencies warning from Kconfig it needs to be selected conditionally. Also add checks if PHYLIB is built-in to avoid undefined references to PHYLIB's symbols. Signed-off-by: Andrey Smirnov --- arch/powerpc/platforms/85xx/Kconfig | 2 +- arch/powerpc/platforms/85xx/mpc85xx_mds.c | 9 - 2 files changed, 9 insertions(+), 2 deletions(-) diff --git a/arch/powerpc/platforms/85xx/Kconfig b/arch/powerpc/platforms/85xx/Kconfig index e626461..3da35bc 100644 --- a/arch/powerpc/platforms/85xx/Kconfig +++ b/arch/powerpc/platforms/85xx/Kconfig @@ -72,7 +72,7 @@ config MPC85xx_CDS config MPC85xx_MDS bool "Freescale MPC85xx MDS" select DEFAULT_UIMAGE - select PHYLIB + select PHYLIB if NETDEVICES select HAS_RAPIDIO select SWIOTLB help diff --git a/arch/powerpc/platforms/85xx/mpc85xx_mds.c b/arch/powerpc/platforms/85xx/mpc85xx_mds.c index dbcb467..71aff5e 100644 --- a/arch/powerpc/platforms/85xx/mpc85xx_mds.c +++ b/arch/powerpc/platforms/85xx/mpc85xx_mds.c @@ -63,6 +63,8 @@ #define DBG(fmt...) #endif +#if IS_BUILTIN(CONFIG_PHYLIB) + #define MV88E_SCR 0x10 #define MV88E_SCR_125CLK 0x0010 static int mpc8568_fixup_125_clock(struct phy_device *phydev) @@ -152,6 +154,8 @@ static int mpc8568_mds_phy_fixups(struct phy_device *phydev) return err; } +#endif + /* * * Setup the architecture @@ -313,6 +317,7 @@ static void __init mpc85xx_mds_setup_arch(void) swiotlb_detect_4g(); } +#if IS_BUILTIN(CONFIG_PHYLIB) static int __init board_fixups(void) { @@ -342,9 +347,12 @@ static int __init board_fixups(void) return 0; } + machine_arch_initcall(mpc8568_mds, board_fixups); machine_arch_initcall(mpc8569_mds, board_fixups); +#endif + static int __init mpc85xx_publish_devices(void) { if (machine_is(mpc8568_mds)) @@ -435,4 +443,3 @@ define_machine(p1021_mds) { .pcibios_fixup_phb = fsl_pcibios_fixup_phb, #endif }; - -- 2.5.5
[PATCH 2/2] powerpc: e8248e: Select PHYLIB only if NETDEVICES is enabled
Select PHYLIB only if NETDEVICES is enabled and MDIO_BITBANG only if PHYLIB is present to avoid warnings from Kconfig. To prevent undefined references during linking register MDIO driver only if CONFIG_MDIO_BITBANG is enabled. Signed-off-by: Andrey Smirnov --- arch/powerpc/platforms/82xx/Kconfig | 4 ++-- arch/powerpc/platforms/82xx/ep8248e.c | 4 +++- 2 files changed, 5 insertions(+), 3 deletions(-) diff --git a/arch/powerpc/platforms/82xx/Kconfig b/arch/powerpc/platforms/82xx/Kconfig index 7c7df400..994d1a9 100644 --- a/arch/powerpc/platforms/82xx/Kconfig +++ b/arch/powerpc/platforms/82xx/Kconfig @@ -30,8 +30,8 @@ config EP8248E select 8272 select 8260 select FSL_SOC - select PHYLIB - select MDIO_BITBANG + select PHYLIB if NETDEVICES + select MDIO_BITBANG if PHYLIB help This enables support for the Embedded Planet EP8248E board. diff --git a/arch/powerpc/platforms/82xx/ep8248e.c b/arch/powerpc/platforms/82xx/ep8248e.c index cdab847..8fec050 100644 --- a/arch/powerpc/platforms/82xx/ep8248e.c +++ b/arch/powerpc/platforms/82xx/ep8248e.c @@ -298,7 +298,9 @@ static const struct of_device_id of_bus_ids[] __initconst = { static int __init declare_of_platform_devices(void) { of_platform_bus_probe(NULL, of_bus_ids, NULL); - platform_driver_register(&ep8248e_mdio_driver); + + if (IS_ENABLED(CONFIG_MDIO_BITBANG)) + platform_driver_register(&ep8248e_mdio_driver); return 0; } -- 2.5.5
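For readers unfamiliar with the IS_BUILTIN()/IS_ENABLED() helpers used in these two patches, the following is a simplified userspace re-creation of the preprocessor trick behind them. It is a sketch, not the kernel's exact text: the KC_* helper names, the CONFIG_* values chosen here and the *_model functions are assumptions made for illustration.

```c
#include <stdio.h>

/*
 * If the option macro expands to 1, KC_PLACEHOLDER_1 inserts an extra
 * "0," argument, which shifts a literal 1 into the second position.
 * Otherwise the pasted name is junk and the second argument is 0.
 */
#define KC_PLACEHOLDER_1 0,
#define KC_SECOND(ignored, val, ...) val
#define KC_IS_DEFINED(x) KC_IS_DEFINED_(x)
#define KC_IS_DEFINED_(val) KC_IS_DEFINED__(KC_PLACEHOLDER_##val)
#define KC_IS_DEFINED__(arg1_or_junk) KC_SECOND(arg1_or_junk 1, 0, 0)

#define IS_BUILTIN(option) KC_IS_DEFINED(option)
#define IS_MODULE(option)  KC_IS_DEFINED(option##_MODULE)
#define IS_ENABLED(option) (IS_BUILTIN(option) || IS_MODULE(option))

/* Example configuration (assumed): PHYLIB=y, MDIO_BITBANG=m. */
#define CONFIG_PHYLIB 1
#define CONFIG_MDIO_BITBANG_MODULE 1

static int registered;

/* Stand-in for platform_driver_register(). */
static void register_mdio_driver_model(void)
{
    registered++;
}

/* Mirrors the shape of declare_of_platform_devices() after the patch. */
static int declare_devices_model(void)
{
    if (IS_ENABLED(CONFIG_MDIO_BITBANG))
        register_mdio_driver_model();
    return registered;
}
```

Because the macros expand to plain integer constants, they work both in `#if` directives (as in the mpc85xx_mds.c hunk) and in ordinary `if` statements (as in the ep8248e.c hunk), where the compiler can still type-check the guarded code before discarding it.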
[PATCH 2/3] powerpc: Call chained reset handlers during reset
Call out to all restart handlers that were added via register_restart_handler() API when restarting the machine. Signed-off-by: Andrey Smirnov --- arch/powerpc/kernel/setup-common.c | 4 1 file changed, 4 insertions(+) diff --git a/arch/powerpc/kernel/setup-common.c b/arch/powerpc/kernel/setup-common.c index 5cd3283..205d073 100644 --- a/arch/powerpc/kernel/setup-common.c +++ b/arch/powerpc/kernel/setup-common.c @@ -145,6 +145,10 @@ void machine_restart(char *cmd) ppc_md.restart(cmd); smp_send_stop(); + + do_kernel_restart(cmd); + mdelay(1000); + machine_hang(); } -- 2.5.5
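As an illustration of what "chained restart handlers" means here, the sketch below models a priority-ordered handler list in userspace. It is only a toy model under stated assumptions: the real kernel uses a notifier chain, and every name ending in _model, plus the sample handlers and the trace buffer, is hypothetical.

```c
#include <stddef.h>

#define MAX_HANDLERS 8

struct restart_handler_model {
    int priority;                 /* higher priority runs first */
    int (*fn)(const char *cmd);   /* nonzero return stops the chain */
};

static struct restart_handler_model chain[MAX_HANDLERS];
static size_t chain_len;

/* Insert a handler, keeping the chain sorted by descending priority. */
static int register_restart_handler_model(int priority,
                                          int (*fn)(const char *cmd))
{
    size_t i;

    if (chain_len == MAX_HANDLERS)
        return -1;

    i = chain_len;
    while (i > 0 && chain[i - 1].priority < priority) {
        chain[i] = chain[i - 1];
        i--;
    }
    chain[i].priority = priority;
    chain[i].fn = fn;
    chain_len++;
    return 0;
}

/* Walk the chain; in a real system a handler normally never returns. */
static void do_kernel_restart_model(const char *cmd)
{
    size_t i;

    for (i = 0; i < chain_len; i++) {
        if (chain[i].fn(cmd))
            break;
    }
}

/* Sample handlers that only record the order in which they ran. */
static char trace[MAX_HANDLERS];
static size_t trace_len;

static int handler_a(const char *cmd)
{
    (void)cmd;
    trace[trace_len++] = 'A';
    return 0;
}

static int handler_b(const char *cmd)
{
    (void)cmd;
    trace[trace_len++] = 'B';
    return 0;
}
```

In the patch itself, the chain is only consulted after ppc_md.restart() has had its chance, and the machine hangs if no handler manages to reset it within the delay.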
[PATCH 3/3] powerpc: Convert fsl_rstcr_restart to a reset handler
Convert fsl_rstcr_restart into a function to be registered with register_restart_handler() API and introduce fsl_rstcr_restart_register() function that can be added as an initcall that would do aforementioned registration. Signed-off-by: Andrey Smirnov --- arch/powerpc/platforms/85xx/bsc913x_qds.c | 2 +- arch/powerpc/platforms/85xx/bsc913x_rdb.c | 2 +- arch/powerpc/platforms/85xx/c293pcie.c| 2 +- arch/powerpc/platforms/85xx/corenet_generic.c | 2 +- arch/powerpc/platforms/85xx/ge_imp3a.c| 2 +- arch/powerpc/platforms/85xx/mpc8536_ds.c | 2 +- arch/powerpc/platforms/85xx/mpc85xx_ads.c | 2 +- arch/powerpc/platforms/85xx/mpc85xx_cds.c | 26 +++--- arch/powerpc/platforms/85xx/mpc85xx_ds.c | 7 --- arch/powerpc/platforms/85xx/mpc85xx_mds.c | 7 --- arch/powerpc/platforms/85xx/mpc85xx_rdb.c | 21 +++-- arch/powerpc/platforms/85xx/mvme2500.c| 2 +- arch/powerpc/platforms/85xx/p1010rdb.c| 2 +- arch/powerpc/platforms/85xx/p1022_ds.c| 2 +- arch/powerpc/platforms/85xx/p1022_rdk.c | 3 ++- arch/powerpc/platforms/85xx/p1023_rdb.c | 2 +- arch/powerpc/platforms/85xx/ppa8548.c | 2 +- arch/powerpc/platforms/85xx/qemu_e500.c | 2 +- arch/powerpc/platforms/85xx/sbc8548.c | 2 +- arch/powerpc/platforms/85xx/socrates.c| 2 +- arch/powerpc/platforms/85xx/stx_gp3.c | 2 +- arch/powerpc/platforms/85xx/tqm85xx.c | 2 +- arch/powerpc/platforms/85xx/twr_p102x.c | 2 +- arch/powerpc/platforms/85xx/xes_mpc85xx.c | 7 --- arch/powerpc/platforms/86xx/gef_ppc9a.c | 2 +- arch/powerpc/platforms/86xx/gef_sbc310.c | 2 +- arch/powerpc/platforms/86xx/gef_sbc610.c | 2 +- arch/powerpc/platforms/86xx/mpc8610_hpcd.c| 2 +- arch/powerpc/platforms/86xx/mpc86xx_hpcn.c| 2 +- arch/powerpc/platforms/86xx/sbc8641d.c| 2 +- arch/powerpc/sysdev/fsl_soc.c | 22 +- arch/powerpc/sysdev/fsl_soc.h | 2 +- 32 files changed, 86 insertions(+), 57 deletions(-) diff --git a/arch/powerpc/platforms/85xx/bsc913x_qds.c b/arch/powerpc/platforms/85xx/bsc913x_qds.c index 07dd6ae..14ea7a0 100644 --- 
a/arch/powerpc/platforms/85xx/bsc913x_qds.c +++ 
b/arch/powerpc/platforms/85xx/bsc913x_qds.c @@ -53,6 +53,7 @@ static void __init bsc913x_qds_setup_arch(void) } machine_arch_initcall(bsc9132_qds, mpc85xx_common_publish_devices); +machine_arch_initcall(bsc9133_qds, fsl_rstcr_restart_register); /* * Called very early, device-tree isn't unflattened @@ -72,7 +73,6 @@ define_machine(bsc9132_qds) { .pcibios_fixup_bus = fsl_pcibios_fixup_bus, #endif .get_irq= mpic_get_irq, - .restart= fsl_rstcr_restart, .calibrate_decr = generic_calibrate_decr, .progress = udbg_progress, }; diff --git a/arch/powerpc/platforms/85xx/bsc913x_rdb.c b/arch/powerpc/platforms/85xx/bsc913x_rdb.c index e48f671..cd4e717 100644 --- a/arch/powerpc/platforms/85xx/bsc913x_rdb.c +++ b/arch/powerpc/platforms/85xx/bsc913x_rdb.c @@ -43,6 +43,7 @@ static void __init bsc913x_rdb_setup_arch(void) } machine_device_initcall(bsc9131_rdb, mpc85xx_common_publish_devices); +machine_arch_initcall(bsc9131_rdb, fsl_rstcr_restart_register); /* * Called very early, device-tree isn't unflattened @@ -59,7 +60,6 @@ define_machine(bsc9131_rdb) { .setup_arch = bsc913x_rdb_setup_arch, .init_IRQ = bsc913x_rdb_pic_init, .get_irq= mpic_get_irq, - .restart= fsl_rstcr_restart, .calibrate_decr = generic_calibrate_decr, .progress = udbg_progress, }; diff --git a/arch/powerpc/platforms/85xx/c293pcie.c b/arch/powerpc/platforms/85xx/c293pcie.c index 3b9e3f0..fbd63f9 100644 --- a/arch/powerpc/platforms/85xx/c293pcie.c +++ b/arch/powerpc/platforms/85xx/c293pcie.c @@ -48,6 +48,7 @@ static void __init c293_pcie_setup_arch(void) } machine_arch_initcall(c293_pcie, mpc85xx_common_publish_devices); +machine_arch_initcall(c293_pcie, fsl_rstcr_restart_register); /* * Called very early, device-tree isn't unflattened @@ -65,7 +66,6 @@ define_machine(c293_pcie) { .setup_arch = c293_pcie_setup_arch, .init_IRQ = c293_pcie_pic_init, .get_irq= mpic_get_irq, - .restart= fsl_rstcr_restart, .calibrate_decr = generic_calibrate_decr, .progress = udbg_progress, }; diff --git 
a/arch/powerpc/platforms/85xx/corenet_generic.c b/arch/powerpc/platforms/85xx/corenet_generic.c index 3a6a84f..297379b 100644 --- a/arch/powerpc/platforms/85xx/corenet_generic.c +++ b/arch/powerpc/platforms/85xx/corenet_generic.c @@ -225,7 +225,6 @@ define_machine(corenet_generic) { #else .get_irq
[PATCH 1/3] powerpc: Factor out common code in setup-common.c
Factor out a small bit of common code in machine_restart(), machine_power_off() and machine_halt(). Signed-off-by: Andrey Smirnov --- arch/powerpc/kernel/setup-common.c | 23 ++- 1 file changed, 14 insertions(+), 9 deletions(-) diff --git a/arch/powerpc/kernel/setup-common.c b/arch/powerpc/kernel/setup-common.c index 714b4ba..5cd3283 100644 --- a/arch/powerpc/kernel/setup-common.c +++ b/arch/powerpc/kernel/setup-common.c @@ -130,15 +130,22 @@ void machine_shutdown(void) ppc_md.machine_shutdown(); } +static void machine_hang(void) +{ + pr_emerg("System Halted, OK to turn off power\n"); + local_irq_disable(); + while (1) + ; +} + void machine_restart(char *cmd) { machine_shutdown(); if (ppc_md.restart) ppc_md.restart(cmd); + smp_send_stop(); - printk(KERN_EMERG "System Halted, OK to turn off power\n"); - local_irq_disable(); - while (1) ; + machine_hang(); } void machine_power_off(void) @@ -146,10 +153,9 @@ void machine_power_off(void) machine_shutdown(); if (pm_power_off) pm_power_off(); + smp_send_stop(); - printk(KERN_EMERG "System Halted, OK to turn off power\n"); - local_irq_disable(); - while (1) ; + machine_hang(); } /* Used by the G5 thermal driver */ EXPORT_SYMBOL_GPL(machine_power_off); @@ -162,10 +168,9 @@ void machine_halt(void) machine_shutdown(); if (ppc_md.halt) ppc_md.halt(); + smp_send_stop(); - printk(KERN_EMERG "System Halted, OK to turn off power\n"); - local_irq_disable(); - while (1) ; + machine_hang(); } -- 2.5.5
Re: [PATCH v2 02/10] userns: Add per user namespace sysctls.
David Miller writes:

> From: ebied...@xmission.com (Eric W. Biederman)
> Date: Mon, 25 Jul 2016 19:44:50 -0500
>
>> User namespaces have enabled unprivileged users access to a lot more
>> data structures and so to catch programs that go crazy we need a lot
>> more limits. I believe some of those limits make sense per namespace.
>> As it is easy in some cases to say any more than Y number of those
>> per namespace is excessive. For example a limit of 1,000,000 ipv4
>> routes per network namespaces is a sanity check as there are
>> currently 621,649 ipv4 prefixes advertized in bgp.
>
> When we give a new namespace to unprivileged users, we honestly should
> make the sysctl settings we give to them become "limits". They can
> further constrain the sysctl settings but may not raise them.

I won't disagree.

I was thinking in terms of global settings that hold the limits for
per-namespace counters, as we are talking about sanity-check limits.

Perhaps we could get sophisticated and do something more, but the
simpler we can make things and get the job done the better.

Eric
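To make the "limits" semantics discussed above concrete, here is a minimal user-space sketch. It is not taken from the patch set: `struct ns_limit`, `ns_effective_limit()` and `ns_set_limit()` are hypothetical names, and the model assumes each namespace simply points at its parent. A child may lower its own value but can never end up above what an ancestor allows.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical model, not kernel code: one sysctl-style limit per namespace. */
struct ns_limit {
	const struct ns_limit *parent;	/* NULL for the initial namespace */
	unsigned long max;		/* value configured in this namespace */
};

/* Effective limit: the smallest value on the path up to the root. */
static unsigned long ns_effective_limit(const struct ns_limit *ns)
{
	unsigned long limit;

	assert(ns != NULL);
	limit = ns->max;
	for (ns = ns->parent; ns; ns = ns->parent)
		if (ns->max < limit)
			limit = ns->max;
	return limit;
}

/*
 * Write path: a child may constrain the setting further, but may not
 * raise it above its parent's effective limit.
 * Returns 0 on success, -1 if the write is rejected.
 */
static int ns_set_limit(struct ns_limit *ns, unsigned long val)
{
	assert(ns != NULL);
	if (ns->parent && val > ns_effective_limit(ns->parent))
		return -1;
	ns->max = val;
	return 0;
}
```

Computing the effective value on read, rather than only checking on write, also covers the case where the parent lowers its limit after the child was configured.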
[PATCH 3/3] mm/duet: framework code
The Duet framework code: - bittree.c: red-black bitmap tree that keeps track of items of interest - debug.c: functions used to print information used to debug Duet - hash.c: implementation of the global hash table where page events are stored for all tasks - hook.c: the function invoked by the page cache hooks when Duet is online - init.c: routines used to bring Duet online or offline - path.c: routines performing resolution of UUIDs to paths using d_path - task.c: implementation of Duet task fd operations Signed-off-by: George Amvrosiadis --- init/Kconfig | 2 + mm/Makefile | 1 + mm/duet/Kconfig | 31 +++ mm/duet/Makefile | 7 + mm/duet/bittree.c | 537 + mm/duet/common.h | 211 mm/duet/debug.c | 98 + mm/duet/hash.c| 315 + mm/duet/hook.c| 81 mm/duet/init.c| 172 mm/duet/path.c| 184 + mm/duet/syscall.h | 61 ++ mm/duet/task.c| 584 ++ 13 files changed, 2284 insertions(+) create mode 100644 mm/duet/Kconfig create mode 100644 mm/duet/Makefile create mode 100644 mm/duet/bittree.c create mode 100644 mm/duet/common.h create mode 100644 mm/duet/debug.c create mode 100644 mm/duet/hash.c create mode 100644 mm/duet/hook.c create mode 100644 mm/duet/init.c create mode 100644 mm/duet/path.c create mode 100644 mm/duet/syscall.h create mode 100644 mm/duet/task.c diff --git a/init/Kconfig b/init/Kconfig index c02d897..6f94b5a 100644 --- a/init/Kconfig +++ b/init/Kconfig @@ -294,6 +294,8 @@ config USELIB earlier, you may need to enable this syscall. Current systems running glibc can safely disable this. 
+source mm/duet/Kconfig + config AUDIT bool "Auditing support" depends on NET diff --git a/mm/Makefile b/mm/Makefile index 78c6f7d..074c15f 100644 --- a/mm/Makefile +++ b/mm/Makefile @@ -99,3 +99,4 @@ obj-$(CONFIG_USERFAULTFD) += userfaultfd.o obj-$(CONFIG_IDLE_PAGE_TRACKING) += page_idle.o obj-$(CONFIG_FRAME_VECTOR) += frame_vector.o obj-$(CONFIG_DEBUG_PAGE_REF) += debug_page_ref.o +obj-$(CONFIG_DUET) += duet/ diff --git a/mm/duet/Kconfig b/mm/duet/Kconfig new file mode 100644 index 000..2f3a0c5 --- /dev/null +++ b/mm/duet/Kconfig @@ -0,0 +1,31 @@ +config DUET + bool "Duet framework support" + + help + Duet is a framework aiming to reduce the IO footprint of analytics + and maintenance work. By exposing page cache events to these tasks, + it allows them to adapt their data processing order, in order to + benefit from data available in the page cache. Duet's operation is + based on hooks into the page cache. + + To compile support for Duet, say Y. + +config DUET_STATS + bool "Duet statistics collection" + depends on DUET + help + This option enables support for the collection of statistics on the + operation of Duet. It will print information about the data structures + used internally, and profiling information about the framework. + + If unsure, say N. + +config DUET_DEBUG + bool "Duet debugging support" + depends on DUET + help + Enable runtime debugging support for the Duet framework. This may + enable additional and expensive checks with negative impact on + performance. + + To compile debugging support for Duet, say Y. If unsure, say N. diff --git a/mm/duet/Makefile b/mm/duet/Makefile new file mode 100644 index 000..c0c9e11 --- /dev/null +++ b/mm/duet/Makefile @@ -0,0 +1,7 @@ +# +# Makefile for the linux Duet framework. 
+# + +obj-$(CONFIG_DUET) += duet.o + +duet-y := init.o hash.o hook.o task.o bittree.o path.o debug.o diff --git a/mm/duet/bittree.c b/mm/duet/bittree.c new file mode 100644 index 000..3b20c35 --- /dev/null +++ b/mm/duet/bittree.c @@ -0,0 +1,537 @@ +/* + * Copyright (C) 2016 George Amvrosiadis. All rights reserved. + * + * This program is free software; you can redistribute it and/or + * modify it under the terms of the GNU General Public + * License v2 as published by the Free Software Foundation. + * + * This program is distributed in the hope that it will be useful, + * but WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU + * General Public License for more details. + */ + +#include "common.h" + +#define BMAP_READ 0x01/* Read bmaps (overrides other flags) */ +#define BMAP_CHECK 0x02/* Check given bmap value expression */ + /* Sets bmaps to match expression if not set */ + +/* Bmap expressions can be formed using the following flags: */ +#define BMAP_DONE_SET 0x04/* Set done bmap values */ +#define BMAP_DONE_RST 0x08/* Reset done bmap values */ +#define BMAP_RELV_SET 0x10/* Set re
[PATCH 1/3] mm: support for duet hooks
Adds the Duet hooks in the page cache. In filemap.c, two hooks are
added at the time of addition and removal of a page descriptor. In
page-flags.h, two more hooks are added to track page dirtying and
flushing. The hooks are inactive while Duet is offline.

Signed-off-by: George Amvrosiadis
---
 include/linux/duet.h       | 43 +
 include/linux/page-flags.h | 53 ++
 mm/filemap.c               | 11 ++
 3 files changed, 107 insertions(+)
 create mode 100644 include/linux/duet.h

diff --git a/include/linux/duet.h b/include/linux/duet.h
new file mode 100644
index 0000000..80491e2
--- /dev/null
+++ b/include/linux/duet.h
@@ -0,0 +1,43 @@
+/*
+ * Defs necessary for Duet hooks
+ *
+ * Author: George Amvrosiadis
+ *
+ * This program is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU General Public
+ * License v2 as published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+ * General Public License for more details.
+ */
+#ifndef _DUET_H
+#define _DUET_H
+
+/*
+ * Duet hooks into the page cache to monitor four types of events:
+ *   ADDED:	a page __descriptor__ was inserted into the page cache
+ *   REMOVED:	a page __descriptor__ was removed from the page cache
+ *   DIRTY:	page's dirty bit was set
+ *   FLUSHED:	page's dirty bit was cleared
+ */
+#define DUET_PAGE_ADDED		0x0001
+#define DUET_PAGE_REMOVED	0x0002
+#define DUET_PAGE_DIRTY		0x0004
+#define DUET_PAGE_FLUSHED	0x0008
+
+#define DUET_HOOK(funp, evt, data) \
+	do { \
+		rcu_read_lock(); \
+		funp = rcu_dereference(duet_hook_fp); \
+		if (funp) \
+			funp(evt, (void *)data); \
+		rcu_read_unlock(); \
+	} while (0)
+
+/* Hook function pointer initialized by the Duet framework */
+typedef void (duet_hook_t) (__u16, void *);
+extern duet_hook_t *duet_hook_fp;
+
+#endif /* _DUET_H */
diff --git a/include/linux/page-flags.h b/include/linux/page-flags.h
index e5a3244..53be4a0 100644
--- a/include/linux/page-flags.h
+++ b/include/linux/page-flags.h
@@ -12,6 +12,9 @@
 #include <linux/mm_types.h>
 #include <generated/bounds.h>
 #endif /* !__GENERATING_BOUNDS_H */
+#ifdef CONFIG_DUET
+#include <linux/duet.h>
+#endif /* CONFIG_DUET */

 /*
  * Various page->flags bits:
@@ -254,8 +257,58 @@ PAGEFLAG(Error, error, PF_NO_COMPOUND) TESTCLEARFLAG(Error, error, PF_NO_COMPOUN
 PAGEFLAG(Referenced, referenced, PF_HEAD)
 	TESTCLEARFLAG(Referenced, referenced, PF_HEAD)
 	__SETPAGEFLAG(Referenced, referenced, PF_HEAD)
+#ifdef CONFIG_DUET
+TESTPAGEFLAG(Dirty, dirty, PF_HEAD)
+
+static inline void SetPageDirty(struct page *page)
+{
+	duet_hook_t *dhfp = NULL;
+
+	if (!test_and_set_bit(PG_dirty, &page->flags))
+		DUET_HOOK(dhfp, DUET_PAGE_DIRTY, page);
+}
+
+static inline void __ClearPageDirty(struct page *page)
+{
+	duet_hook_t *dhfp = NULL;
+
+	if (__test_and_clear_bit(PG_dirty, &page->flags))
+		DUET_HOOK(dhfp, DUET_PAGE_FLUSHED, page);
+}
+
+static inline void ClearPageDirty(struct page *page)
+{
+	duet_hook_t *dhfp = NULL;
+
+	if (test_and_clear_bit(PG_dirty, &page->flags))
+		DUET_HOOK(dhfp, DUET_PAGE_FLUSHED, page);
+}
+
+static inline int TestSetPageDirty(struct page *page)
+{
+	duet_hook_t *dhfp = NULL;
+
+	if (!test_and_set_bit(PG_dirty, &page->flags)) {
+		DUET_HOOK(dhfp, DUET_PAGE_DIRTY, page);
+		return 0;
+	}
+	return 1;
+}
+
+static inline int TestClearPageDirty(struct page *page)
+{
+	duet_hook_t *dhfp = NULL;
+
+	if (test_and_clear_bit(PG_dirty, &page->flags)) {
+		DUET_HOOK(dhfp, DUET_PAGE_FLUSHED, page);
+		return 1;
+	}
+	return 0;
+}
+#else
 PAGEFLAG(Dirty, dirty, PF_HEAD) TESTSCFLAG(Dirty, dirty, PF_HEAD)
 	__CLEARPAGEFLAG(Dirty, dirty, PF_HEAD)
+#endif /* CONFIG_DUET */
 PAGEFLAG(LRU, lru, PF_HEAD) __CLEARPAGEFLAG(LRU, lru, PF_HEAD)
 PAGEFLAG(Active, active, PF_HEAD) __CLEARPAGEFLAG(Active, active, PF_HEAD)
 	TESTCLEARFLAG(Active, active, PF_HEAD)
diff --git a/mm/filemap.c b/mm/filemap.c
index 20f3b1f..f06ebc0 100644
--- a/mm/filemap.c
+++ b/mm/filemap.c
@@ -166,6 +166,11 @@ static void page_cache_tree_delete(struct address_space *mapping,
 void __delete_from_page_cache(struct page *page, void *shadow)
 {
 	struct address_space *mapping = page->mapping;
+#ifdef CONFIG_DUET
+	duet_hook_t *dhfp = NULL;
+
+	DUET_HOOK(dhfp, DUET_PAGE_REMOVED, page);
+#endif /* CONFIG_DUET */

 	trace_mm_filemap_delete_from_page_cache(page);
 	/*
@@ -628,6 +633,9 @@ static int __add_to_page_cache_locked(struct page *page,
 	int huge = PageHuge(page);
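For readers following the DUET_HOOK() macro in the patch above, here is a rough user-space model of the "hook is a no-op while Duet is offline" behaviour. This is only a sketch under stated assumptions: C11 atomics stand in for rcu_dereference() and the pointer update (they do not provide RCU's grace-period guarantees), and `duet_hook()`, `duet_online()`, `duet_offline()` and `count_events()` are hypothetical names that do not appear in the patch.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>
#include <stdint.h>

/* Event flags as defined in include/linux/duet.h in the patch. */
#define DUET_PAGE_ADDED		0x0001
#define DUET_PAGE_REMOVED	0x0002
#define DUET_PAGE_DIRTY		0x0004
#define DUET_PAGE_FLUSHED	0x0008

typedef void (duet_hook_t)(uint16_t, void *);

/* NULL while the framework is offline, so hooks cost one load and a branch. */
static _Atomic(duet_hook_t *) duet_hook_fp;

/* Stand-in for DUET_HOOK(): load the pointer once, call it only if set. */
static void duet_hook(uint16_t evt, void *data)
{
	duet_hook_t *funp;

	assert(evt != 0);
	funp = atomic_load_explicit(&duet_hook_fp, memory_order_acquire);
	if (funp)
		funp(evt, data);
}

/* Hypothetical handler: remember which event types were delivered. */
static unsigned int seen_events;

static void count_events(uint16_t evt, void *data)
{
	(void)data;
	seen_events |= evt;
}

/* Hypothetical online/offline switches: publish or retract the handler. */
static void duet_online(void)
{
	atomic_store_explicit(&duet_hook_fp, count_events, memory_order_release);
}

static void duet_offline(void)
{
	atomic_store_explicit(&duet_hook_fp, NULL, memory_order_release);
}
```

In the kernel, taking the framework offline would additionally have to wait for in-flight readers before freeing anything the handler uses; the model above deliberately leaves that out.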
[PATCH 2/3] mm/duet: syscall wiring
Usual syscall wiring for the four Duet syscalls.

Signed-off-by: George Amvrosiadis
---
 arch/x86/entry/syscalls/syscall_32.tbl |  4
 arch/x86/entry/syscalls/syscall_64.tbl |  4
 include/linux/syscalls.h               |  8
 include/uapi/asm-generic/unistd.h      | 12 +++-
 kernel/sys_ni.c                        |  6 ++
 5 files changed, 33 insertions(+), 1 deletion(-)

diff --git a/arch/x86/entry/syscalls/syscall_32.tbl b/arch/x86/entry/syscalls/syscall_32.tbl
index 4cddd17..f34ff94 100644
--- a/arch/x86/entry/syscalls/syscall_32.tbl
+++ b/arch/x86/entry/syscalls/syscall_32.tbl
@@ -386,3 +386,7 @@
 377	i386	copy_file_range		sys_copy_file_range
 378	i386	preadv2			sys_preadv2			compat_sys_preadv2
 379	i386	pwritev2		sys_pwritev2			compat_sys_pwritev2
+380	i386	duet_status		sys_duet_status
+381	i386	duet_init		sys_duet_init
+382	i386	duet_bmap		sys_duet_bmap
+383	i386	duet_get_path		sys_duet_get_path
diff --git a/arch/x86/entry/syscalls/syscall_64.tbl b/arch/x86/entry/syscalls/syscall_64.tbl
index 555263e..d04efaa 100644
--- a/arch/x86/entry/syscalls/syscall_64.tbl
+++ b/arch/x86/entry/syscalls/syscall_64.tbl
@@ -335,6 +335,10 @@
 326	common	copy_file_range		sys_copy_file_range
 327	64	preadv2			sys_preadv2
 328	64	pwritev2		sys_pwritev2
+329	common	duet_status		sys_duet_status
+330	common	duet_init		sys_duet_init
+331	common	duet_bmap		sys_duet_bmap
+332	common	duet_get_path		sys_duet_get_path

 #
 # x32-specific system call numbers start at 512 to avoid cache impact
diff --git a/include/linux/syscalls.h b/include/linux/syscalls.h
index d022390..da1049e 100644
--- a/include/linux/syscalls.h
+++ b/include/linux/syscalls.h
@@ -65,6 +65,8 @@ struct old_linux_dirent;
 struct perf_event_attr;
 struct file_handle;
 struct sigaltstack;
+struct duet_status_args;
+struct duet_uuid_arg;
 union bpf_attr;

 #include <linux/types.h>
@@ -898,4 +900,10 @@ asmlinkage long sys_copy_file_range(int fd_in, loff_t __user *off_in,

 asmlinkage long sys_mlock2(unsigned long start, size_t len, int flags);

+asmlinkage long sys_duet_status(u16 flags, struct duet_status_args __user *arg);
+asmlinkage long sys_duet_init(const char __user *taskname, u32 regmask,
+			      const char __user *pathname);
+asmlinkage long sys_duet_bmap(u16 flags, struct duet_uuid_arg __user *arg);
+asmlinkage long sys_duet_get_path(struct duet_uuid_arg __user *uarg,
+				  char __user *pathbuf, int pathbufsize);
 #endif
diff --git a/include/uapi/asm-generic/unistd.h b/include/uapi/asm-generic/unistd.h
index a26415b..7c287c0 100644
--- a/include/uapi/asm-generic/unistd.h
+++ b/include/uapi/asm-generic/unistd.h
@@ -725,8 +725,18 @@ __SC_COMP(__NR_preadv2, sys_preadv2, compat_sys_preadv2)
 #define __NR_pwritev2 287
 __SC_COMP(__NR_pwritev2, sys_pwritev2, compat_sys_pwritev2)

+/* mm/duet/syscall.c */
+#define __NR_duet_status 288
+__SYSCALL(__NR_duet_status, sys_duet_status)
+#define __NR_duet_init 289
+__SYSCALL(__NR_duet_init, sys_duet_init)
+#define __NR_duet_bmap 290
+__SYSCALL(__NR_duet_bmap, sys_duet_bmap)
+#define __NR_duet_get_path 291
+__SYSCALL(__NR_duet_get_path, sys_duet_get_path)
+
 #undef __NR_syscalls
-#define __NR_syscalls 288
+#define __NR_syscalls 292

 /*
  * All syscalls below here should go away really,
diff --git a/kernel/sys_ni.c b/kernel/sys_ni.c
index 2c5e3a8..3d4c53a 100644
--- a/kernel/sys_ni.c
+++ b/kernel/sys_ni.c
@@ -176,6 +176,12 @@ cond_syscall(sys_capget);
 cond_syscall(sys_capset);
 cond_syscall(sys_copy_file_range);

+/* Duet syscall entries */
+cond_syscall(sys_duet_status);
+cond_syscall(sys_duet_init);
+cond_syscall(sys_duet_bmap);
+cond_syscall(sys_duet_get_path);
+
 /* arch-specific weak syscall entries */
 cond_syscall(sys_pciconfig_read);
 cond_syscall(sys_pciconfig_write);
--
2.7.4
[PATCH 0/3] new feature: monitoring page cache events
I'm attaching a patch set implementing a mechanism we call Duet, which
allows applications to monitor events at the page cache level: page
additions, removals, dirtying, and flushing. Using such events,
applications can identify and prioritize processing of cached data,
thereby reducing their I/O footprint.

One class of users of these events is maintenance tasks that scan large
amounts of data (e.g., backup, defrag, scrubbing). Knowing what is
currently cached allows them to piggy-back on each other and on other
applications running in the system. I've managed to run up to 3 such
applications together (backup, scrubbing, defrag) and have them finish
their work with 1/3rd of the I/O by using Duet. In this case, the task
that traversed the data the fastest (scrubber) allowed the rest of the
tasks to piggy-back on the data brought into the cache. I.e., a file
that was read to be backed up was also picked up by the scrubber and
defrag process.

I've found adapting applications to be straightforward. Although I
don't include examples in this patch set, I've adapted btrfs scrubbing,
btrfs send (backup), btrfs defrag, rsync, and f2fs garbage collection
in a few hundred lines of code each (basically just had to add an event
handler and wire it up to the task's processing loop). You can read
more about this in our full paper:
http://dl.acm.org/citation.cfm?id=2815424. I'd be happy to generate
subsequent patch sets for individual tasks if there's interest in this
one.

We've also used Duet to speed up Hadoop and Spark by up to 54%,
depending on the overlap in the data processed, by taking into account
cache residency of HDFS blocks across the cluster when scheduling tasks:
https://www.usenix.org/conference/hotstorage16/workshop-program/presentation/deslauriers

Syscall interface (and how it works):

Duet uses hooks into the page cache (see the "mm: support for duet
hooks" patch). These hooks inform Duet of page events, which are stored
in a hash table. Only events that are of interest to running tasks are
stored, and only one copy of each event is stored for all interested
tasks. To register for events, the following syscalls are used (see the
"mm/duet: syscall wiring" patch for prototypes):

- sys_duet_init(char *taskname, u32 regmask, char *path): returns an fd
  that watches for events that occur under PATH (e.g. '/home') and are
  also described in REGMASK (e.g. DUET_PAGE_ADDED | DUET_PAGE_REMOVED).
  TASKNAME is an optional, human-readable name for the task.

- sys_duet_bmap(u16 flags, struct duet_uuid_arg *uuid): Duet allows
  applications to track processed items on an internal bitmap (which
  improves performance by being used to filter unnecessary events). The
  specified UUID is what read() returns on the fd created with
  sys_duet_init(), and uniquely identifies a file. FLAGS allow the
  bitmap to be set, reset, or have its state checked.

- sys_duet_get_path(struct duet_uuid_arg *uuid, char *buf, int bufsize):
  Applications running with Duet understand pathnames, not UUIDs. This
  syscall traverses the dentry cache and returns the corresponding path
  in BUF.

- sys_duet_status(u16 flags, struct duet_status_args *arg): Currently,
  the Duet framework can be turned on/off manually. This allows the
  admin to specify the maximum number of applications that will be
  registered concurrently, which allows us to size the internal hash
  table nodes appropriately (and limit performance or memory overhead).
  The syscall is also used for debugging purposes. I think this
  functionality should probably be exposed through ioctl()s to a device,
  and I'm open to suggestions on how to improve the current
  implementation.

The framework itself (a bit less than 2300 LoC) is currently placed
under mm/duet and the code is included in the "mm/duet: framework code"
patch.

Application interface:

Applications interface with Duet through a user library, which is
available at https://github.com/gamvrosi/duet-tools. In the same repo,
I have included a dummy_task application which provides an example of
how Duet can be used.

Changelog:

The patches are based on Linus' v4.7 tag, and touch on the following
parts of the kernel:

- mm/filemap.c and include/linux/page-flags.h: hooks in the page cache
  to track page events on page addition, removal, dirtying, and
  flushing.

- arch/x86/*, include/linux/syscalls.h, kernel/sys_ni.c: wiring the 4
  syscalls

- mm/duet/*: framework code

George Amvrosiadis (3):
  mm: support for duet hooks
  mm/duet: syscall wiring
  mm/duet: framework code

 arch/x86/entry/syscalls/syscall_32.tbl |  4 +
 arch/x86/entry/syscalls/syscall_64.tbl |  4 +
 include/linux/duet.h                   | 43 +++
 include/linux/page-flags.h             | 53 +++
 include/linux/syscalls.h               |  8 +
 include/uapi/asm-generic/unistd.h      | 12 +-
 init/Kconfig                           |  2 +
 kernel/sys_ni.c                        |  6 +
 mm/Makefile                            |  1 +
 mm/duet/Kconfig
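As an illustration of the "only events of interest are stored, and only one copy per event" rule described in the cover letter, here is a small user-space sketch. It is not the framework's implementation: `struct duet_task`, `task_register()`, `event_subscribers()` and `MAX_TASKS` are hypothetical, and the real code keeps its state in a hash table rather than a fixed array. Only the DUET_PAGE_* values come from the patches.

```c
#include <assert.h>
#include <stdint.h>

/* Event flags as defined in include/linux/duet.h in patch 1/3. */
#define DUET_PAGE_ADDED		0x0001
#define DUET_PAGE_REMOVED	0x0002
#define DUET_PAGE_DIRTY		0x0004
#define DUET_PAGE_FLUSHED	0x0008

#define MAX_TASKS	8	/* hypothetical cap on concurrently registered tasks */

/* Hypothetical per-task state: just the mask passed at registration. */
struct duet_task {
	int in_use;
	uint32_t regmask;
};

static struct duet_task tasks[MAX_TASKS];

/* Register a task; returns its id, or -1 if the table is full. */
static int task_register(uint32_t regmask)
{
	for (int id = 0; id < MAX_TASKS; id++) {
		if (!tasks[id].in_use) {
			tasks[id].in_use = 1;
			tasks[id].regmask = regmask;
			return id;
		}
	}
	return -1;
}

/*
 * Decide what to store for one page event: a bitmask of interested task
 * ids. Zero means no task cares and the event is dropped; otherwise a
 * single record carrying this mask serves every interested task.
 */
static uint8_t event_subscribers(uint16_t evt)
{
	uint8_t mask = 0;

	assert(evt != 0);
	for (int id = 0; id < MAX_TASKS; id++)
		if (tasks[id].in_use && (tasks[id].regmask & evt))
			mask |= (uint8_t)(1u << id);
	return mask;
}
```

A record that carries a subscriber mask is one plausible way to share a single stored event among tasks; the patch itself may do this differently.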
Re: [PATCH v2 3/3] x86/apic: Improved the setting of interrupt mode for bsp
Wei Jiangang writes:

> If we specify the 'notsc' parameter for the dump-capture kernel,
> and then trigger a crash(panic) by using "ALT-SysRq-c" or
> "echo c > /proc/sysrq-trigger", the dump-capture kernel will
> hang in calibrate_delay_converge() and wait for jiffies changes.
> serial log as follows:
>
> tsc: Fast TSC calibration using PIT
> tsc: Detected 2099.947 MHz processor
> Calibrating delay loop...
>
> The reason for jiffies not changes is there's no timer interrupt
> passed to dump-capture kernel.
>
> In fact, once kernel panic occurs, the local APIC is disabled
> by lapic_shutdown() in reboot path.
> generly speaking, local APIC state can be initialized by BIOS
> after Power-Up or Reset, which doesn't apply to kdump case.
> so the kernel has to be responsible for initialize the interrupt
> mode properly according the latest status of APIC in bootup path.
>
> An MP operating system is booted under either PIC mode or
> virtual wire mode. Later, the operating system switches to
> symmetric I/O mode as it enters multiprocessor mode.
> Two kinds of virtual wire mode are defined in Intel MP spec:
> virtual wire mode via local APIC or via I/O APIC.
>
> Now we determine the mode of APIC only through a SMP BIOS(MP table).
> That's not enough. It's better to do further check if APIC works
> with effective interrupt mode, and then, do some proper setting.

Reading through the code let me pause a moment and say: "Yowzers the
interrupt initialization code has gotten hard to follow. It is now full
of indirection with ill-defined semantics." pre_vector_init indeed.

I will argue this is the wrong fix.

We really should not have to worry about getting the system functional
in virtual wire mode on a modern system. And looking at the code,
someone has done half the work and made it conditional under
acpi_gbl_reduced_hardware. Now reduced hardware implies a bit more than
we are talking about, but if there is ACPI apic information we should
not need to worry about external interrupts and can just enable the
apics. In fact I think having MPtable information is enough for that.

So I think what needs to happen is for the apic initialization to get
an overhaul that makes apic initialization the happy path and the other
irq controllers the odd backwards-compatibility path. And when we are
done we never run in anything except full apic mode unless the hardware
doesn't support it.

I think that will leave things more robust as we don't need to set up
and then re-set-up the interrupts during boot.

Eric

> Signed-off-by: Cao jin
> Signed-off-by: Wei Jiangang
> ---
>  arch/x86/include/asm/io_apic.h |  5
>  arch/x86/kernel/apic/apic.c    | 60 +-
>  arch/x86/kernel/apic/io_apic.c | 28
>  3 files changed, 92 insertions(+), 1 deletion(-)
>
> diff --git a/arch/x86/include/asm/io_apic.h b/arch/x86/include/asm/io_apic.h
> index 6cbf2cfb3f8a..a3257366bf7f 100644
> --- a/arch/x86/include/asm/io_apic.h
> +++ b/arch/x86/include/asm/io_apic.h
> @@ -190,6 +190,7 @@ static inline unsigned int io_apic_read(unsigned int
> apic, unsigned int reg)
>  }
>
>  extern void setup_IO_APIC(void);
> +extern bool virt_wire_through_ioapic(void);
>  extern void enable_IO_APIC(void);
>  extern void disable_IO_APIC(void);
>  extern void setup_ioapic_dest(void);
> @@ -231,6 +232,10 @@ static inline void io_apic_init_mappings(void) { }
>  #define native_disable_io_apic NULL
>
>  static inline void setup_IO_APIC(void) { }
> +static inline bool virt_wire_through_ioapic(void)
> +{
> +	return false;
> +}
>  static inline void enable_IO_APIC(void) { }
>  static inline void setup_ioapic_dest(void) { }
>
> diff --git a/arch/x86/kernel/apic/apic.c b/arch/x86/kernel/apic/apic.c
> index 8e25b9b2d351..a3939fb130cc 100644
> --- a/arch/x86/kernel/apic/apic.c
> +++ b/arch/x86/kernel/apic/apic.c
> @@ -1124,6 +1124,58 @@ void __init sync_Arb_IDs(void)
>  }
>
>  /*
> + * Check APIC enable/disable flag
> + */
> +static bool check_apic_enabled(void)
> +{
> +	unsigned int value;
> +
> +	/*
> +	 * If APIC is disabled globally (IA32_APIC_BASE[11] == 0)
> +	 * the boot cpu hasn't X86_FEATURE_APIC,
> +	 * and init_bsp_APIC() has already checked it before.
> +	 * so no need to check global enable/disable flag here
> +	 */
> +
> +	/* Check the software enable/disable flag */
> +	value = apic_read(APIC_SPIV);
> +	if (!(value & APIC_SPIV_APIC_ENABLED))
> +		return false;
> +
> +	return true;
> +}
> +
> +/*
> + * Return false means the through-local-APIC virtual wire mode is inactive
> + */
> +static bool virt_wire_through_lapic(void)
> +{
> +	unsigned int value;
> +
> +	/*
> +	 * The through-local-APIC virtual wire mode requests
> +	 * local APIC to enable LINT0 for ExtINT delivery mode
> +	 * and LINT1 for NMI delivery mode
> +	 */
> +	value = apic_read(APIC_LVT0);
> +	if (GET_APIC_
linux-next: manual merge of the xen-tip tree with the tip tree
Hi all, Today's linux-next merge of the xen-tip tree got a conflict in: arch/x86/xen/smp.c between commit: 4c9075835511 ("xen/x86: Move irq allocation from Xen smp_op.cpu_up()") from the tip tree and commit: ad5475f9faf5 ("x86/xen: use xen_vcpu_id mapping for HYPERVISOR_vcpu_op") from the xen-tip tree. I fixed it up (I think - see below) and can carry the fix as necessary. This is now fixed as far as linux-next is concerned, but any non trivial conflicts should be mentioned to your upstream maintainer when your tree is submitted for merging. You may also want to consider cooperating with the maintainer of the conflicting tree to minimise any particularly complex conflicts. -- Cheers, Stephen Rothwell diff --cc arch/x86/xen/smp.c index 09d5cc062dbe,0b4d04c8ab4d.. --- a/arch/x86/xen/smp.c +++ b/arch/x86/xen/smp.c @@@ -486,7 -495,11 +493,7 @@@ static int xen_cpu_up(unsigned int cpu xen_pmu_init(cpu); - rc = HYPERVISOR_vcpu_op(VCPUOP_up, cpu, NULL); - rc = xen_smp_intr_init(cpu); - if (rc) - return rc; - + rc = HYPERVISOR_vcpu_op(VCPUOP_up, xen_vcpu_nr(cpu), NULL); BUG_ON(rc); while (cpu_report_state(cpu) != CPU_ONLINE)
linux-next: manual merge of the xen-tip tree with the tip tree
Hi all, Today's linux-next merge of the xen-tip tree got a conflict in: arch/x86/xen/enlighten.c between commit: 4c9075835511 ("xen/x86: Move irq allocation from Xen smp_op.cpu_up()") from the tip tree and commit: 88e957d6e47f ("xen: introduce xen_vcpu_id mapping") from the xen-tip tree. I fixed it up (I think - see below) and can carry the fix as necessary. This is now fixed as far as linux-next is concerned, but any non trivial conflicts should be mentioned to your upstream maintainer when your tree is submitted for merging. You may also want to consider cooperating with the maintainer of the conflicting tree to minimise any particularly complex conflicts. -- Cheers, Stephen Rothwell diff --cc arch/x86/xen/enlighten.c index dc96f939af88,85ef4c0442e0.. --- a/arch/x86/xen/enlighten.c +++ b/arch/x86/xen/enlighten.c @@@ -1803,49 -1823,21 +1824,53 @@@ static void __init init_hvm_pv_info(voi xen_domain_type = XEN_HVM_DOMAIN; } -static int xen_hvm_cpu_notify(struct notifier_block *self, unsigned long action, -void *hcpu) +static int xen_cpu_notify(struct notifier_block *self, unsigned long action, +void *hcpu) { int cpu = (long)hcpu; + int rc; + switch (action) { case CPU_UP_PREPARE: - if (cpu_acpi_id(cpu) != U32_MAX) - per_cpu(xen_vcpu_id, cpu) = cpu_acpi_id(cpu); - else - per_cpu(xen_vcpu_id, cpu) = cpu; - xen_vcpu_setup(cpu); - if (xen_have_vector_callback) { - if (xen_feature(XENFEAT_hvm_safe_pvclock)) - xen_setup_timer(cpu); + if (xen_hvm_domain()) { + /* + * This can happen if CPU was offlined earlier and + * offlining timed out in common_cpu_die(). 
+ */ + if (cpu_report_state(cpu) == CPU_DEAD_FROZEN) { + xen_smp_intr_free(cpu); + xen_uninit_lock_cpu(cpu); + } + ++ if (cpu_acpi_id(cpu) != U32_MAX) ++ per_cpu(xen_vcpu_id, cpu) = cpu_acpi_id(cpu); ++ else ++ per_cpu(xen_vcpu_id, cpu) = cpu; + xen_vcpu_setup(cpu); } + + if (xen_pv_domain() || + (xen_have_vector_callback && + xen_feature(XENFEAT_hvm_safe_pvclock))) + xen_setup_timer(cpu); + + rc = xen_smp_intr_init(cpu); + if (rc) { + WARN(1, "xen_smp_intr_init() for CPU %d failed: %d\n", + cpu, rc); + return NOTIFY_BAD; + } + + break; + case CPU_ONLINE: + xen_init_lock_cpu(cpu); + break; + case CPU_UP_CANCELED: + xen_smp_intr_free(cpu); + if (xen_pv_domain() || + (xen_have_vector_callback && + xen_feature(XENFEAT_hvm_safe_pvclock))) + xen_teardown_timer(cpu); break; default: break;
Re: [PATCH v9 0/7] Make cpuid <-> nodeid mapping persistent
On 2016-07-26 07:20, Andrew Morton wrote:
> On Mon, 25 Jul 2016 16:35:42 +0800 Dou Liyang wrote:
>
>> [Problem]
>>
>> cpuid <-> nodeid mapping is firstly established at boot time. And
>> workqueue caches the mapping in wq_numa_possible_cpumask in
>> wq_numa_init() at boot time.
>>
>> When doing node online/offline, cpuid <-> nodeid mapping is
>> established/destroyed, which means, cpuid <-> nodeid mapping will
>> change if node hotplug happens. But workqueue does not update
>> wq_numa_possible_cpumask.
>>
>> So here is the problem:
>>
>> Assume we have the following cpuid <-> nodeid in the beginning:
>>
>>   Node   | CPU
>>   node 0 | 0-14, 60-74
>>   node 1 | 15-29, 75-89
>>   node 2 | 30-44, 90-104
>>   node 3 | 45-59, 105-119
>>
>> and we hot-remove node2 and node3, it becomes:
>>
>>   Node   | CPU
>>   node 0 | 0-14, 60-74
>>   node 1 | 15-29, 75-89
>>
>> and we hot-add node4 and node5, it becomes:
>>
>>   Node   | CPU
>>   node 0 | 0-14, 60-74
>>   node 1 | 15-29, 75-89
>>   node 4 | 30-59
>>   node 5 | 90-119
>>
>> But in wq_numa_possible_cpumask, cpu30 is still mapped to node2, and
>> the like.
>>
>> When a pool workqueue is initialized, if its cpumask belongs to a
>> node, its pool->node will be mapped to that node. And memory used by
>> this workqueue will also be allocated on that node.
>
> Plan B is to hunt down and fix up all the workqueue structures at
> hotplug-time. Has that option been evaluated?

Yes, the option has been evaluated in this patch:
http://www.gossamer-threads.com/lists/linux/kernel/2116748

> Your fix is x86-only and this bug presumably affects other
> architectures, yes? I think a "Plan B" would fix all architectures?

Yes, the bug presumably affects the few architectures that support both
CPU hotplug and NUMA.

We sent "Plan B" to the community and got a lot of advice and ideas.
Based on these suggestions, we carefully weighed the two plans and
chose the first.

> Thirdly, what is the merge path for these patches? Is an x86
> or ACPI maintainer working with you on them?

Yes, we get a lot of guidance and help from RJ, who is an ACPI
maintainer.

Thanks,
Dou
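The stale-cache problem in the [Problem] description can be reproduced in a few lines of user-space C. This is only a model of the scenario, using the CPU ranges quoted in the mail: `cpu_to_node_live`, `cpu_to_node_cached`, `boot_mapping()` and `replug_nodes()` are hypothetical stand-ins, with the cached array playing the role of wq_numa_possible_cpumask.

```c
#include <assert.h>
#include <string.h>

#define NR_CPUS	120
#define NO_NODE	(-1)

static int cpu_to_node_live[NR_CPUS];	/* mapping as it is right now */
static int cpu_to_node_cached[NR_CPUS];	/* snapshot taken once at boot */

/* Assign CPUs first..last (inclusive) to a node in the live mapping. */
static void map_range(int first, int last, int node)
{
	assert(0 <= first && first <= last && last < NR_CPUS);
	for (int cpu = first; cpu <= last; cpu++)
		cpu_to_node_live[cpu] = node;
}

/* Boot-time layout from the mail, followed by the one-off snapshot. */
static void boot_mapping(void)
{
	map_range(0, 14, 0);   map_range(60, 74, 0);
	map_range(15, 29, 1);  map_range(75, 89, 1);
	map_range(30, 44, 2);  map_range(90, 104, 2);
	map_range(45, 59, 3);  map_range(105, 119, 3);
	memcpy(cpu_to_node_cached, cpu_to_node_live, sizeof(cpu_to_node_live));
}

/* Hot-remove node2 and node3, then hot-add node4 and node5. */
static void replug_nodes(void)
{
	map_range(30, 59, NO_NODE);
	map_range(90, 119, NO_NODE);
	map_range(30, 59, 4);
	map_range(90, 119, 5);
	/* Nothing refreshes cpu_to_node_cached here: that is the bug. */
}
```

After `boot_mapping()` and `replug_nodes()`, CPU 30 is on node 4 in the live table but still on node 2 in the snapshot, which is the mismatch the patch set avoids by keeping the cpuid <-> nodeid mapping persistent.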
[e1000_netpoll] BUG: sleeping function called from invalid context at kernel/irq/manage.c:110
Greetings,

This BUG message can be found in recent kernels as well as v4.4 and
linux-stable. It happens when running

        modprobe netconsole netconsole=@/,$port@$server/

[   39.937534] 22 Jul 13:30:40 ntpdate[440]: step time server 192.168.1.1 offset -673.833841 sec
[   39.943285] netpoll: netconsole: local port 6665
[   39.943436] netpoll: netconsole: local IPv4 address 0.0.0.0
[   39.943609] netpoll: netconsole: interface 'eth0'
[   39.943756] netpoll: netconsole: remote port 6672
[   39.943913] netpoll: netconsole: remote IPv4 address 192.168.1.1
[   39.944099] netpoll: netconsole: remote ethernet address ff:ff:ff:ff:ff:ff
[   39.944311] netpoll: netconsole: local IP 192.168.1.193
[   39.944514] BUG: sleeping function called from invalid context at kernel/irq/manage.c:110
[   39.944515] in_atomic(): 1, irqs_disabled(): 1, pid: 448, name: modprobe
[   39.944517] CPU: 6 PID: 448 Comm: modprobe Not tainted 4.7.0-rc7-wt-ath-10122-gf9b5ec2 #102
[   39.944518] Hardware name: /DZ77BH-55K, BIOS BHZ7710H.86A.0097.2012.1228.1346 12/28/2012
[   39.944522]  c90001f2f9e8 813417d9 88007faba5c0
[   39.944524]  006e c90001f2fa00 810aec03 81a25948
[   39.944525]  c90001f2fa28 810aec9a 8803e5bd9400 8803e50fbd68
[   39.944526] Call Trace:
[   39.944533]  [] dump_stack+0x63/0x8a
[   39.944536]  [] ___might_sleep+0xd3/0x120
[   39.944537]  [] __might_sleep+0x4a/0x80
[   39.944541]  [] synchronize_irq+0x38/0xa0
[   39.944543]  [] ? __irq_put_desc_unlock+0x1e/0x40
[   39.944545]  [] ? __disable_irq_nosync+0x43/0x60
[   39.944547]  [] disable_irq+0x1c/0x20
[   39.944559]  [] e1000_netpoll+0xf2/0x120 [e1000e]
[   39.944563]  [] netpoll_poll_dev+0x5c/0x1a0
[   39.944567]  [] ? __kmalloc_reserve+0x31/0x90
[   39.944569]  [] netpoll_send_skb_on_dev+0x16b/0x250
[   39.944572]  [] netpoll_send_udp+0x2ec/0x450
[   39.944576]  [] write_msg+0xb2/0xf0 [netconsole]
[   39.944578]  [] call_console_drivers+0x115/0x120
[   39.944580]  [] console_unlock+0x333/0x5c0
[   39.944583]  [] register_console+0x1c4/0x380
[   39.944586]  [] init_netconsole+0x1c5/0x1000 [netconsole]
[   39.944588]  [] ? 0xa004f000
[   39.944591]  [] do_one_initcall+0x3d/0x150
[   39.944592]  [] ? __might_sleep+0x4a/0x80
[   39.944596]  [] ? kmem_cache_alloc_trace+0x188/0x1e0
[   39.944598]  [] do_init_module+0x5f/0x1d8
[   39.944602]  [] load_module+0x1429/0x1b40
[   39.944604]  [] ? __symbol_put+0x40/0x40
[   39.944607]  [] ? kernel_read_file+0x178/0x1a0
[   39.944608]  [] ? kernel_read_file_from_fd+0x49/0x80
[   39.944611]  [] SYSC_finit_module+0xc3/0xf0
[   39.944614]  [] SyS_finit_module+0xe/0x10
[   39.944617]  [] entry_SYSCALL_64_fastpath+0x1a/0xa9
[   39.946384] console [netcon0] enabled
[   39.946514] netconsole: network logging started

Can this possibly be fixed?

Thanks,
Fengguang
Re: [PATCH v3 3/3] mac80211: mesh: fixed HT ies in beacon template
On 2016-07-22 14:26, Masashi Honma wrote:
> On 2016-07-14 05:07, Yaniv Machani wrote:
>> +
>> +	/* if channel width is 20MHz - configure HT capab accordingly*/
>> +	if (sdata->vif.bss_conf.chandef.width == NL80211_CHAN_WIDTH_20) {
>> +		cap &= ~IEEE80211_HT_CAP_SUP_WIDTH_20_40;
>> +		cap &= ~IEEE80211_HT_CAP_DSSSCCK40;
>> +	}
>
> I have tested this part of your patch and this works for me.
>
> Previously, "Supported Channel Width Set bit" in HT Capabilities element
> was 1 even though disable_ht40=1 existed in wpa_supplicant.conf.
> After application of the patch, the bit was 0.
>
> # I retransmit this because of mail delivery errors.

I forgot to mention I have used this patch to test.
http://lists.infradead.org/pipermail/hostap/2016-July/036029.html
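For anyone who wants to check the bit arithmetic outside the driver, here is a small stand-alone sketch of the quoted hunk. The two mask values are the ones defined in include/linux/ieee80211.h; `enum chan_width` and `ht_cap_for_width()` are hypothetical stand-ins for the nl80211 channel-width type and the surrounding mac80211 code.

```c
#include <assert.h>
#include <stdint.h>

/* HT Capabilities Info bits, values as in include/linux/ieee80211.h. */
#define IEEE80211_HT_CAP_SUP_WIDTH_20_40	0x0002
#define IEEE80211_HT_CAP_DSSSCCK40		0x1000

/* Hypothetical stand-in for enum nl80211_chan_width. */
enum chan_width { CHAN_WIDTH_20, CHAN_WIDTH_40 };

/*
 * Mirror of the quoted hunk: on a 20 MHz channel, stop advertising
 * 40 MHz support and DSSS/CCK in 40 MHz; otherwise leave cap untouched.
 */
static uint16_t ht_cap_for_width(uint16_t cap, enum chan_width width)
{
	assert(width == CHAN_WIDTH_20 || width == CHAN_WIDTH_40);
	if (width == CHAN_WIDTH_20) {
		cap &= (uint16_t)~IEEE80211_HT_CAP_SUP_WIDTH_20_40;
		cap &= (uint16_t)~IEEE80211_HT_CAP_DSSSCCK40;
	}
	return cap;
}
```

With an example capability word of 0x19ef, the 20 MHz case yields 0x09ed, i.e. the "Supported Channel Width Set" bit reads 0 as reported in the test above.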
Re: [RFC patch 1/6] random: Simplify API for random address requests
All, On Tue, Jul 26, 2016 at 03:01:55AM +, Jason Cooper wrote: > To date, all callers of randomize_range() have set the length to 0, and > check for a zero return value. For the current callers, the only way > to get zero returned is if end <= start. Since they are all adding a > constant to the start address, this is unnecessary. > > We can remove a bunch of needless checks by simplifying the API to do > just what everyone wants, return an address between [start, start + > range]. > > While we're here, s/get_random_int/get_random_long/. No current call > site is adversely affected by get_random_int(), since all current range > requests are < MAX_UINT. However, we should match caller expectations > to avoid coming up short (ha!) in the future. > > Signed-off-by: Jason Cooper > --- > drivers/char/random.c | 17 - > include/linux/random.h | 2 +- > 2 files changed, 5 insertions(+), 14 deletions(-) > > diff --git a/drivers/char/random.c b/drivers/char/random.c > index 0158d3bff7e5..1251cb2cbab2 100644 > --- a/drivers/char/random.c > +++ b/drivers/char/random.c > @@ -1822,22 +1822,13 @@ unsigned long get_random_long(void) > EXPORT_SYMBOL(get_random_long); > > /* > - * randomize_range() returns a start address such that > - * > - *[.. .] > - * start end > - * > - * a with size "len" starting at the return value is inside in the > - * area defined by [start, end], but is otherwise randomized. > + * randomize_addr() returns a page aligned address within [start, start + > + * range] > */ > unsigned long > -randomize_range(unsigned long start, unsigned long end, unsigned long len) > +randomize_addr(unsigned long start, unsigned long range) > { > - unsigned long range = end - len - start; > - > - if (end <= start + len) > - return 0; > - return PAGE_ALIGN(get_random_int() % range + start); > + return PAGE_ALIGN(get_random_long() % range + start); > } bah! old patch file. 
This should have been:

	if (range == 0)
		return start;
	else
		return PAGE_ALIGN(get_random_long() % range + start);

sorry,

Jason.

>
> /* Interface for in-kernel drivers of true hardware RNGs.
> diff --git a/include/linux/random.h b/include/linux/random.h
> index e47e533742b5..1ad877a98186 100644
> --- a/include/linux/random.h
> +++ b/include/linux/random.h
> @@ -34,7 +34,7 @@ extern const struct file_operations random_fops,
> urandom_fops;
>
> unsigned int get_random_int(void);
> unsigned long get_random_long(void);
> -unsigned long randomize_range(unsigned long start, unsigned long end,
> unsigned long len);
> +unsigned long randomize_addr(unsigned long start, unsigned long range);
>
> u32 prandom_u32(void);
> void prandom_bytes(void *buf, size_t nbytes);
> --
> 2.9.2
>
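To illustrate why the range == 0 check matters (a modulo by zero is undefined), here is a user-space sketch of the corrected function. The 4096-byte page size, the SKETCH_ macros and the fixed "random" value are assumptions made only to keep the example deterministic; the kernel uses PAGE_ALIGN() and get_random_long().

```c
#include <stdint.h>

/* Assumed page size for the sketch; the kernel takes this from PAGE_SIZE. */
#define SKETCH_PAGE_SIZE 4096UL
/* Round up to the next page boundary, as PAGE_ALIGN() does. */
#define SKETCH_PAGE_ALIGN(addr) \
	(((addr) + SKETCH_PAGE_SIZE - 1) & ~(SKETCH_PAGE_SIZE - 1))

/* Hypothetical stand-in for get_random_long(); a constant keeps the
 * sketch reproducible. */
static unsigned long sketch_random_long(void)
{
	return 0x12345678UL;
}

/* Mirrors the corrected logic: bail out before the modulo when the
 * caller passes an empty range. */
static unsigned long sketch_randomize_addr(unsigned long start,
					   unsigned long range)
{
	if (range == 0)
		return start;
	return SKETCH_PAGE_ALIGN(sketch_random_long() % range + start);
}
```

Note that because the alignment rounds up, the result can land up to one page past start + range; whether that matters for callers is not something this sketch tries to answer.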
Re: [PATCH] iio: adc: rockchip_saradc: Explicitly disable ADC on probe
On 07/25/2016 07:51 PM, Caesar Wang wrote: Hi Guenter, Thanks for fixing it. On 2016-07-26 03:39, Guenter Roeck wrote: If the ADC is read for the first time, the caller gets a timeout error, and the kernel log shows read channel() error: -110 The ADC may be enabled on boot, and needs to be explicitly disabled for a read sequence to work (otherwise there is no completion interrupt). Disable it explicitly in the probe function. Fixes: 44d6f2ef94f9 ("iio: adc: add driver for Rockchip saradc") Signed-off-by: Guenter Roeck --- drivers/iio/adc/rockchip_saradc.c | 3 +++ 1 file changed, 3 insertions(+) diff --git a/drivers/iio/adc/rockchip_saradc.c b/drivers/iio/adc/rockchip_saradc.c index f9ad6c2d6821..6aa3271d86b5 100644 --- a/drivers/iio/adc/rockchip_saradc.c +++ b/drivers/iio/adc/rockchip_saradc.c @@ -280,6 +280,9 @@ static int rockchip_saradc_probe(struct platform_device *pdev) goto err_pclk; } +/* Make sure ADC is disabled */ +writel_relaxed(0, info->regs + SARADC_CTRL); I think we should reset the saradc controller, to make sure the registers hold their reset value of 0; state left over from the loader-->kernel handoff may even cause harm, in my experience with tsadc (drivers/thermal/rockchip_thermal.c). e.g.: /** * Reset SARADC Controller, reset all saradc registers. */ static void rockchip_saradc_reset_controller(struct reset_control *reset) { reset_control_assert(reset); usleep_range(10, 20); reset_control_deassert(reset); } ..probe() { ... rockchip_saradc_reset_controller(); ... } Ok, I'll give it a try. Guenter - Caesar + platform_set_drvdata(pdev, indio_dev); indio_dev->name = dev_name(&pdev->dev);
Re: [PATCH 04/32] x86/intel_rdt: Add L3 cache capacity bitmask management
You must specify a mask for each L3 cache. So you can achieve your 80/80 split either with one rdtgroup that has an 80% mask on each of the sockets, using affinity to make one VM run only on CPUs on one socket and the second VM on the other, or with separate rdtgroups for each VM that give them the 80% when they are on their own socket and the spare 20% if they wander off to the other socket. > On Jul 25, 2016, at 19:13, Marcelo Tosatti wrote: > >> On Fri, Jul 22, 2016 at 02:43:23PM -0700, Luck, Tony wrote: >>> On Fri, Jul 22, 2016 at 04:12:04AM -0300, Marcelo Tosatti wrote: >>> How does this patchset handle the following condition: >>> >>> 6) Create reservations in such a way that the sum is larger than >>> total amount of cache, and CPU pinning (example from Karen Noel): >>> >>> VM-1 on socket-1 with 80% of reservation. >>> VM-2 on socket-2 with 80% of reservation. >>> VM-1 pinned to socket-1. >>> VM-2 pinned to socket-2. >> >> That's legal, but perhaps we need a description of >> overlapping cache reservations. >> >> Hardware tells you how finely you can divide the cache (and this >> information is shown in /sys/fs/resctrl/info/l3/max_cbm_len to save >> you from digging in CPUID leaves). E.g. on Broadwell the value is >> 20, so you can control cache allocations in 5% slices. >> >> A bitmask defines which slices you can use (and h/w has the restriction >> that you must have contiguous '1' bits in any mask). So you can pick >> your 80% using 0x0ffff, 0x1fffe, 0x3fffc, 0x7fff8 or 0xffff0. >> >> There is no requirement that masks be exclusive of each other. So >> you might pick the two extremes: 0x0ffff and 0xffff0 for your two >> VM's in this example. Each would be allowed to allocate up to 80%, >> but with a big overlap in the middle. Each has 20% exclusive, but >> there is a 60% range in the middle that they would compete for. 
> > These are different sockets, so there is no competing/sharing of L3 cache > here: the question is about whether the interface allows the > user to specify that 80/80 reservation without complaining: > because the VM's are pinned, they will never actually > share the same L3 cache. > > (haven't finished reading the patchset to be certain). > >> Is this specific case useful? Possibly not. I think the more common >> overlap cases might be between processes that you know have shared >> code/data. Also the case where some rdtgroup has access to allocate >> in the entire cache (mask 0xfffff on Broadwell) and some other >> rdtgroups >> have limited cache allocation with fewer bits in the mask. >> >> -Tony > > All you have to do is to build the bitmask for a given processor > from the union of the tasks which have been scheduled on that > processor. > >
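The mask arithmetic in the quoted explanation can be checked with a small sketch. It assumes a max_cbm_len of 20 as in the Broadwell example, so an 80% reservation is 16 contiguous bits; the cbm_ helper names are hypothetical and are not part of the resctrl interface.

```c
#include <stdint.h>

/* Count the set bits in a capacity bitmask. */
static unsigned int cbm_weight(uint32_t mask)
{
	unsigned int n = 0;

	while (mask) {
		n += mask & 1u;
		mask >>= 1;
	}
	return n;
}

/* Build a contiguous run of nbits ones starting at bit shift.
 * Assumes nbits < 32 and nbits + shift <= 32. */
static uint32_t cbm_make(unsigned int nbits, unsigned int shift)
{
	return ((UINT32_C(1) << nbits) - 1) << shift;
}

/* Share of the cache covered by mask, in percent of cbm_len slices. */
static unsigned int cbm_percent(uint32_t mask, unsigned int cbm_len)
{
	return cbm_weight(mask) * 100 / cbm_len;
}
```

Shifting the 16-bit run by 0 through 4 gives the five possible 80% masks; the two extremes (shift 0 and shift 4) intersect in 12 bits, which is the 60% contended range, and each keeps 4 bits (20%) to itself.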
RE: [PATCH v18 net-next 1/1] hv_sock: introduce Hyper-V Sockets
> From: David Miller [mailto:da...@davemloft.net] > > From: Dexuan Cui > Date: Sat, 23 Jul 2016 01:35:51 + > > > +static struct sock *hvsock_create(struct net *net, struct socket *sock, > > + gfp_t priority, unsigned short type) > > +{ > > + struct hvsock_sock *hvsk; > > + struct sock *sk; > > + > > + sk = sk_alloc(net, AF_HYPERV, priority, &hvsock_proto, 0); > > + if (!sk) > > + return NULL; > ... > > + /* Looks stream-based socket doesn't need this. */ > > + sk->sk_backlog_rcv = NULL; > > + > > + sk->sk_state = 0; > > + sock_reset_flag(sk, SOCK_DONE); > > All of these are unnecessary initializations, since sk_alloc() zeroes > out the 'sk' object for you. Hi David, Thanks for the comment! I'll remove the 3 lines. May I know if you have more comments? BTW, during the past month, at least 7 other people also reviewed the patch and gave me quite a few good comments, which have been addressed. Though only one of them gave the Reviewed-by line for now, I guess I would get more if I ping them to have a look at the latest version of the patch, i.e., v19 -- I'm going to post it with the aforementioned 3 lines removed and if you've more comments, I'm ready to address them too. :-) Thanks, -- Dexuan
Re: [PATCH -next] drm/hisilicon: Fix error handling of ade_power_up()
On 19 July 2016 at 19:30, Wei Yongjun wrote: > From: Wei Yongjun > > Fix the reset_control_deassert() fail and clk_prepare_enable() fail > error handling of ade_power_up(). > > Signed-off-by: Wei Yongjun Applied, thanks. -xinliang > --- > drivers/gpu/drm/hisilicon/kirin/kirin_drm_ade.c | 10 -- > 1 file changed, 8 insertions(+), 2 deletions(-) > > diff --git a/drivers/gpu/drm/hisilicon/kirin/kirin_drm_ade.c > b/drivers/gpu/drm/hisilicon/kirin/kirin_drm_ade.c > index c3707d4..e2bd1e6 100644 > --- a/drivers/gpu/drm/hisilicon/kirin/kirin_drm_ade.c > +++ b/drivers/gpu/drm/hisilicon/kirin/kirin_drm_ade.c > @@ -258,18 +258,24 @@ static int ade_power_up(struct ade_hw_ctx *ctx) > ret = reset_control_deassert(ctx->reset); > if (ret) { > DRM_ERROR("failed to deassert reset\n"); > - return ret; > + goto err_reset; > } > > ret = clk_prepare_enable(ctx->ade_core_clk); > if (ret) { > DRM_ERROR("failed to enable ade_core_clk (%d)\n", ret); > - return ret; > + goto err_prepare_enable; > } > > ade_init(ctx); > ctx->power_on = true; > return 0; > + > +err_prepare_enable: > + reset_control_assert(ctx->reset); > +err_reset: > + clk_disable_unprepare(ctx->media_noc_clk); > + return ret; > } > > static void ade_power_down(struct ade_hw_ctx *ctx) > > > ___ > dri-devel mailing list > dri-de...@lists.freedesktop.org > https://lists.freedesktop.org/mailman/listinfo/dri-devel
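The unwind order in the quoted patch follows the usual goto-based pattern: each failure label undoes only the steps that already succeeded, in reverse order. Below is a user-space sketch of that pattern with mock state. The struct, the failure-injection flags and the function name are all hypothetical, and the sketch assumes media_noc_clk is enabled as the first step of ade_power_up(), which the quoted hunk implies but does not show.

```c
#include <stdbool.h>

/* Hypothetical mock state standing in for the real clocks and reset line. */
struct mock_hw {
	bool noc_clk_on;
	bool reset_deasserted;
	bool core_clk_on;
	bool fail_deassert;	/* inject a failure in step 2 */
	bool fail_core_clk;	/* inject a failure in step 3 */
};

static int mock_power_up(struct mock_hw *hw)
{
	int ret;

	/* Step 1: enable the first clock (assumed to succeed here). */
	hw->noc_clk_on = true;

	/* Step 2: deassert reset. */
	if (hw->fail_deassert) {
		ret = -1;
		goto err_reset;
	}
	hw->reset_deasserted = true;

	/* Step 3: enable the core clock. */
	if (hw->fail_core_clk) {
		ret = -1;
		goto err_core_clk;
	}
	hw->core_clk_on = true;
	return 0;

err_core_clk:
	/* Undo step 2. */
	hw->reset_deasserted = false;
err_reset:
	/* Undo step 1. */
	hw->noc_clk_on = false;
	return ret;
}
```

Falling through from one label to the next is what keeps the cleanup in reverse order without duplicating code in each error branch.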
[RFC patch 6/6] unicore32: Use simpler API for random address requests
Currently, all callers to randomize_range() set the length to 0 and calculate end by adding a constant to the start address. We can simplify the API to remove a bunch of needless checks and variables. Use the new randomize_addr(start, range) call to set the requested address. Signed-off-by: Jason Cooper --- arch/unicore32/kernel/process.c | 3 +-- 1 file changed, 1 insertion(+), 2 deletions(-) diff --git a/arch/unicore32/kernel/process.c b/arch/unicore32/kernel/process.c index 00299c927852..b856178cf167 100644 --- a/arch/unicore32/kernel/process.c +++ b/arch/unicore32/kernel/process.c @@ -295,8 +295,7 @@ unsigned long get_wchan(struct task_struct *p) unsigned long arch_randomize_brk(struct mm_struct *mm) { - unsigned long range_end = mm->brk + 0x0200; - return randomize_range(mm->brk, range_end, 0) ? : mm->brk; + return randomize_addr(mm->brk, 0x0200); } /* -- 2.9.2
[RFC patch 4/6] arm64: Use simpler API for random address requests
Currently, all callers to randomize_range() set the length to 0 and calculate end by adding a constant to the start address. We can simplify the API to remove a bunch of needless checks and variables. Use the new randomize_addr(start, range) call to set the requested address. Signed-off-by: Jason Cooper --- arch/arm64/kernel/process.c | 8 ++-- 1 file changed, 2 insertions(+), 6 deletions(-) diff --git a/arch/arm64/kernel/process.c b/arch/arm64/kernel/process.c index 6cd2612236dc..11bf454baf86 100644 --- a/arch/arm64/kernel/process.c +++ b/arch/arm64/kernel/process.c @@ -374,12 +374,8 @@ unsigned long arch_align_stack(unsigned long sp) unsigned long arch_randomize_brk(struct mm_struct *mm) { - unsigned long range_end = mm->brk; - if (is_compat_task()) - range_end += 0x0200; + return randomize_addr(mm->brk, 0x0200); else - range_end += 0x4000; - - return randomize_range(mm->brk, range_end, 0) ? : mm->brk; + return randomize_addr(mm->brk, 0x4000); } -- 2.9.2
[RFC patch 2/6] x86: Use simpler API for random address requests
Currently, all callers to randomize_range() set the length to 0 and calculate end by adding a constant to the start address. We can simplify the API to remove a bunch of needless checks and variables. Use the new randomize_addr(start, range) call to set the requested address. Signed-off-by: Jason Cooper --- arch/x86/kernel/process.c| 3 +-- arch/x86/kernel/sys_x86_64.c | 5 + 2 files changed, 2 insertions(+), 6 deletions(-) diff --git a/arch/x86/kernel/process.c b/arch/x86/kernel/process.c index 96becbbb52e0..a083a2c0744e 100644 --- a/arch/x86/kernel/process.c +++ b/arch/x86/kernel/process.c @@ -507,8 +507,7 @@ unsigned long arch_align_stack(unsigned long sp) unsigned long arch_randomize_brk(struct mm_struct *mm) { - unsigned long range_end = mm->brk + 0x0200; - return randomize_range(mm->brk, range_end, 0) ? : mm->brk; + return randomize_addr(mm->brk, 0x0200); } /* diff --git a/arch/x86/kernel/sys_x86_64.c b/arch/x86/kernel/sys_x86_64.c index 10e0272d789a..f9cad22808fc 100644 --- a/arch/x86/kernel/sys_x86_64.c +++ b/arch/x86/kernel/sys_x86_64.c @@ -101,7 +101,6 @@ static void find_start_end(unsigned long flags, unsigned long *begin, unsigned long *end) { if (!test_thread_flag(TIF_ADDR32) && (flags & MAP_32BIT)) { - unsigned long new_begin; /* This is usually used needed to map code in small model, so it needs to be in the first 31bit. Limit it to that. This means we need to move the @@ -112,9 +111,7 @@ static void find_start_end(unsigned long flags, unsigned long *begin, *begin = 0x4000; *end = 0x8000; if (current->flags & PF_RANDOMIZE) { - new_begin = randomize_range(*begin, *begin + 0x0200, 0); - if (new_begin) - *begin = new_begin; + *begin = randomize_addr(*begin, 0x0200); } } else { *begin = current->mm->mmap_legacy_base; -- 2.9.2
[RFC patch 5/6] tile: Use simpler API for random address requests
Currently, all callers to randomize_range() set the length to 0 and calculate end by adding a constant to the start address. We can simplify the API to remove a bunch of needless checks and variables. Use the new randomize_addr(start, range) call to set the requested address. Signed-off-by: Jason Cooper --- arch/tile/mm/mmap.c | 3 +-- 1 file changed, 1 insertion(+), 2 deletions(-) diff --git a/arch/tile/mm/mmap.c b/arch/tile/mm/mmap.c index 851a94e6ae58..50f6a693a2b6 100644 --- a/arch/tile/mm/mmap.c +++ b/arch/tile/mm/mmap.c @@ -88,6 +88,5 @@ void arch_pick_mmap_layout(struct mm_struct *mm) unsigned long arch_randomize_brk(struct mm_struct *mm) { - unsigned long range_end = mm->brk + 0x0200; - return randomize_range(mm->brk, range_end, 0) ? : mm->brk; + return randomize_addr(mm->brk, 0x0200); } -- 2.9.2
[RFC patch 3/6] ARM: Use simpler API for random address requests
Currently, all callers to randomize_range() set the length to 0 and calculate end by adding a constant to the start address. We can simplify the API to remove a bunch of needless checks and variables. Use the new randomize_addr(start, range) call to set the requested address. Signed-off-by: Jason Cooper --- arch/arm/kernel/process.c | 3 +-- 1 file changed, 1 insertion(+), 2 deletions(-) diff --git a/arch/arm/kernel/process.c b/arch/arm/kernel/process.c index 4a803c5a1ff7..02dee671cded 100644 --- a/arch/arm/kernel/process.c +++ b/arch/arm/kernel/process.c @@ -314,8 +314,7 @@ unsigned long get_wchan(struct task_struct *p) unsigned long arch_randomize_brk(struct mm_struct *mm) { - unsigned long range_end = mm->brk + 0x0200; - return randomize_range(mm->brk, range_end, 0) ? : mm->brk; + return randomize_addr(mm->brk, 0x0200); } #ifdef CONFIG_MMU -- 2.9.2
[RFC patch 1/6] random: Simplify API for random address requests
To date, all callers of randomize_range() have set the length to 0, and check for a zero return value. For the current callers, the only way to get zero returned is if end <= start. Since they are all adding a constant to the start address, this is unnecessary. We can remove a bunch of needless checks by simplifying the API to do just what everyone wants, return an address between [start, start + range]. While we're here, s/get_random_int/get_random_long/. No current call site is adversely affected by get_random_int(), since all current range requests are < MAX_UINT. However, we should match caller expectations to avoid coming up short (ha!) in the future. Signed-off-by: Jason Cooper --- drivers/char/random.c | 17 - include/linux/random.h | 2 +- 2 files changed, 5 insertions(+), 14 deletions(-) diff --git a/drivers/char/random.c b/drivers/char/random.c index 0158d3bff7e5..1251cb2cbab2 100644 --- a/drivers/char/random.c +++ b/drivers/char/random.c @@ -1822,22 +1822,13 @@ unsigned long get_random_long(void) EXPORT_SYMBOL(get_random_long); /* - * randomize_range() returns a start address such that - * - *[.. .] - * start end - * - * a with size "len" starting at the return value is inside in the - * area defined by [start, end], but is otherwise randomized. + * randomize_addr() returns a page aligned address within [start, start + + * range] */ unsigned long -randomize_range(unsigned long start, unsigned long end, unsigned long len) +randomize_addr(unsigned long start, unsigned long range) { - unsigned long range = end - len - start; - - if (end <= start + len) - return 0; - return PAGE_ALIGN(get_random_int() % range + start); + return PAGE_ALIGN(get_random_long() % range + start); } /* Interface for in-kernel drivers of true hardware RNGs. 
diff --git a/include/linux/random.h b/include/linux/random.h index e47e533742b5..1ad877a98186 100644 --- a/include/linux/random.h +++ b/include/linux/random.h @@ -34,7 +34,7 @@ extern const struct file_operations random_fops, urandom_fops; unsigned int get_random_int(void); unsigned long get_random_long(void); -unsigned long randomize_range(unsigned long start, unsigned long end, unsigned long len); +unsigned long randomize_addr(unsigned long start, unsigned long range); u32 prandom_u32(void); void prandom_bytes(void *buf, size_t nbytes); -- 2.9.2
[PATCH v2 1/3] x86/apic: Remove "focus disabled" for 64bit case
Disabling processor focus on 64-bit causes a crash, with the following call trace: [] dump_stack+0x63/0x84 [] __warn+0xd1/0xf0 [] warn_slowpath_fmt+0x5f/0x80 [] ex_handler_wrmsr_unsafe+0x62/0x70 [] fixup_exception+0x39/0x50 [] do_general_protection+0x80/0x160 [] general_protection+0x28/0x30 [] ? native_write_msr+0x4/0x30 [] ? native_apic_msr_write+0x32/0x40 [] init_bsp_APIC+0x5f/0x118 [] init_ISA_irqs+0x19/0x4c [] native_init_IRQ+0xd/0x377 [] init_IRQ+0x42/0x49 [] start_kernel+0x2ce/0x4c8 [] ? set_init_arg+0x55/0x55 [] ? early_idt_handler_array+0x120/0x120 [] x86_64_start_reservations+0x2f/0x31 [] x86_64_start_kernel+0x14c/0x16f Keep the implementation consistent with setup_local_APIC() and always use processor focus on 64-bit. For more details, refer to commit 89c38c2867eb ("x86: apic - unify setup_local_APIC") Signed-off-by: Cao jin Signed-off-by: Wei Jiangang --- arch/x86/kernel/apic/apic.c | 2 -- 1 file changed, 2 deletions(-) diff --git a/arch/x86/kernel/apic/apic.c b/arch/x86/kernel/apic/apic.c index 60078a67d7e3..0273b652c689 100644 --- a/arch/x86/kernel/apic/apic.c +++ b/arch/x86/kernel/apic/apic.c @@ -1154,9 +1154,7 @@ void __init init_bsp_APIC(void) if ((boot_cpu_data.x86_vendor == X86_VENDOR_INTEL) && (boot_cpu_data.x86 == 15)) value &= ~APIC_SPIV_FOCUS_DISABLED; - else #endif - value |= APIC_SPIV_FOCUS_DISABLED; value |= SPURIOUS_APIC_VECTOR; apic_write(APIC_SPIV, value); -- 1.9.3
[PATCH v2 2/3] x86/apic: Update comment about disabling processor focus
Fix references to discarded end_level_ioapic_irq(). Signed-off-by: Cao jin Signed-off-by: Wei Jiangang --- arch/x86/kernel/apic/apic.c | 1 - 1 file changed, 1 deletion(-) diff --git a/arch/x86/kernel/apic/apic.c b/arch/x86/kernel/apic/apic.c index 0273b652c689..8e25b9b2d351 100644 --- a/arch/x86/kernel/apic/apic.c +++ b/arch/x86/kernel/apic/apic.c @@ -1346,7 +1346,6 @@ void setup_local_APIC(void) * Actually disabling the focus CPU check just makes the hang less * frequent as it makes the interrupt distributon model be more * like LRU than MRU (the short-term load is more even across CPUs). -* See also the comment in end_level_ioapic_irq(). --macro */ /* -- 1.9.3
[PATCH v2 3/3] x86/apic: Improved the setting of interrupt mode for bsp
If we specify the 'notsc' parameter for the dump-capture kernel and then trigger a crash (panic) using "ALT-SysRq-c" or "echo c > /proc/sysrq-trigger", the dump-capture kernel hangs in calibrate_delay_converge(), waiting for jiffies to change. The serial log is as follows:

tsc: Fast TSC calibration using PIT
tsc: Detected 2099.947 MHz processor
Calibrating delay loop...

Jiffies does not change because no timer interrupt is delivered to the dump-capture kernel. Once a kernel panic occurs, the local APIC is disabled by lapic_shutdown() in the reboot path. Generally speaking, the local APIC state is initialized by the BIOS after power-up or reset, but that does not apply to the kdump case, so the kernel has to initialize the interrupt mode properly in the boot path, according to the latest state of the APIC.

An MP operating system is booted under either PIC mode or virtual wire mode. Later, the operating system switches to symmetric I/O mode as it enters multiprocessor mode. Two kinds of virtual wire mode are defined in the Intel MP spec: virtual wire mode via the local APIC or via the I/O APIC.

Currently we determine the APIC mode only from the presence of an SMP BIOS (MP table). That is not enough. It is better to also check whether the APIC is operating in an effective interrupt mode, and then set it up properly.
Signed-off-by: Cao jin Signed-off-by: Wei Jiangang --- arch/x86/include/asm/io_apic.h | 5 arch/x86/kernel/apic/apic.c| 60 +- arch/x86/kernel/apic/io_apic.c | 28 3 files changed, 92 insertions(+), 1 deletion(-) diff --git a/arch/x86/include/asm/io_apic.h b/arch/x86/include/asm/io_apic.h index 6cbf2cfb3f8a..a3257366bf7f 100644 --- a/arch/x86/include/asm/io_apic.h +++ b/arch/x86/include/asm/io_apic.h @@ -190,6 +190,7 @@ static inline unsigned int io_apic_read(unsigned int apic, unsigned int reg) } extern void setup_IO_APIC(void); +extern bool virt_wire_through_ioapic(void); extern void enable_IO_APIC(void); extern void disable_IO_APIC(void); extern void setup_ioapic_dest(void); @@ -231,6 +232,10 @@ static inline void io_apic_init_mappings(void) { } #define native_disable_io_apic NULL static inline void setup_IO_APIC(void) { } +static inline bool virt_wire_through_ioapic(void) +{ + return false; +} static inline void enable_IO_APIC(void) { } static inline void setup_ioapic_dest(void) { } diff --git a/arch/x86/kernel/apic/apic.c b/arch/x86/kernel/apic/apic.c index 8e25b9b2d351..a3939fb130cc 100644 --- a/arch/x86/kernel/apic/apic.c +++ b/arch/x86/kernel/apic/apic.c @@ -1124,6 +1124,58 @@ void __init sync_Arb_IDs(void) } /* + * Check APIC enable/disable flag + */ +static bool check_apic_enabled(void) +{ + unsigned int value; + + /* +* If APIC is disabled globally (IA32_APIC_BASE[11] == 0) +* the boot cpu hasn't X86_FEATURE_APIC, +* and init_bsp_APIC() has already checked it before. 
+* so no need to check global enable/disable flag here +*/ + + /* Check the software enable/disable flag */ + value = apic_read(APIC_SPIV); + if (!(value & APIC_SPIV_APIC_ENABLED)) + return false; + + return true; +} + +/* + * Return false means the through-local-APIC virtual wire mode is inactive + */ +static bool virt_wire_through_lapic(void) +{ + unsigned int value; + + /* +* The through-local-APIC virtual wire mode requests +* local APIC to enable LINT0 for ExtINT delivery mode +* and LINT1 for NMI delivery mode +*/ + value = apic_read(APIC_LVT0); + if (GET_APIC_DELIVERY_MODE(value) != APIC_MODE_EXTINT) + return false; + + value = apic_read(APIC_LVT1); + if (GET_APIC_DELIVERY_MODE(value) != APIC_MODE_NMI) + return false; + + return true; +} + +static bool check_virt_wire_mode(void) +{ + /* If neither of virtual wire mode is active, return false */ + return (check_apic_enabled() && (virt_wire_through_lapic() || + virt_wire_through_ioapic())); +} + +/* * An initial setup of the virtual wire mode. */ void __init init_bsp_APIC(void) @@ -1133,8 +1185,14 @@ void __init init_bsp_APIC(void) /* * Don't do the setup now if we have a SMP BIOS as the * through-I/O-APIC virtual wire mode might be active. +* +* It's better to do further check if either through-I/O-APIC +* or through-local-APIC is active. +* the worst case is that both of them are inactive, If so, +* we need to enable the through-local-APIC virtual wire mode */ - if (smp_found_config || !boot_cpu_has(X86_FEATURE_APIC)) + if (pic_mode || !boot_cpu_has(X86_FEATURE_APIC) || + (smp_found_config && check_virt_wire_mode())) return; /* diff --git a/arch/x86/kernel/apic/io_apic.c b/arch/x86/kernel/apic/io_apic.c index 446702ed99dc..f794d389ba85 100
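The local-APIC part of the check in this patch can be exercised in user space by passing register values in as parameters. In the sketch below, the bit positions (software-enable flag at bit 8 of SPIV, delivery mode in bits 8-10 of the LVT entries, ExtINT = 7, NMI = 4) are what I believe arch/x86/include/asm/apicdef.h defines and should be verified against the tree. The sk_ names are made up, and the I/O APIC result is taken as a plain boolean because that part of the patch is cut off above.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed equivalents of APIC_SPIV_APIC_ENABLED, GET_APIC_DELIVERY_MODE(),
 * APIC_MODE_NMI and APIC_MODE_EXTINT. */
#define SK_SPIV_APIC_ENABLED	(UINT32_C(1) << 8)
#define SK_DELIVERY_MODE(x)	(((x) >> 8) & 0x7)
#define SK_MODE_NMI		0x4
#define SK_MODE_EXTINT		0x7

/* Through-local-APIC virtual wire: LINT0 delivers ExtINT, LINT1 delivers NMI. */
static bool sk_virt_wire_through_lapic(uint32_t lvt0, uint32_t lvt1)
{
	return SK_DELIVERY_MODE(lvt0) == SK_MODE_EXTINT &&
	       SK_DELIVERY_MODE(lvt1) == SK_MODE_NMI;
}

/* Virtual wire mode is active only if the APIC is software-enabled and
 * at least one of the two virtual wire variants is set up. */
static bool sk_check_virt_wire_mode(uint32_t spiv, uint32_t lvt0,
				    uint32_t lvt1, bool through_ioapic)
{
	if (!(spiv & SK_SPIV_APIC_ENABLED))
		return false;
	return sk_virt_wire_through_lapic(lvt0, lvt1) || through_ioapic;
}
```

In the kdump case described in the commit message, the software-enable bit would be clear after lapic_shutdown(), so the check returns false and init_bsp_APIC() goes on to set up the through-local-APIC mode.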
[PATCH v2 0/3] Fix dump-capture kernel hangs with notsc
v2: Just about the commit ("x86/apic: Improved the setting of interrupt mode for bsp") - Unify the name s/virtual_wire_via_*/virt_wire_through_* - Add check for PIC mode suggested-by Baoquan He - Add check enable/disable flag for IO-APIC suggested-by Xunlei Pang - Update comments v1: The goal is to fix dump-capture kernel with notsc option hangs in calibrate_delay_converge() Wei Jiangang (3): x86/apic: Remove "focus disabled" for 64bit case x86/apic: Update comment about disabling processor focus x86/apic: Improved the setting of interrupt mode for bsp arch/x86/include/asm/io_apic.h | 5 arch/x86/kernel/apic/apic.c| 63 +++--- arch/x86/kernel/apic/io_apic.c | 28 +++ 3 files changed, 92 insertions(+), 4 deletions(-) -- 1.9.3
Re: [RFC PATCH v7 1/7] Restartable sequences system call
- On Jul 25, 2016, at 7:02 PM, Andy Lutomirski l...@amacapital.net wrote: > On Thu, Jul 21, 2016 at 2:14 PM, Mathieu Desnoyers > wrote: >> Man page associated: >> >> RSEQ(2)Linux Programmer's Manual RSEQ(2) >> >> NAME >>rseq - Restartable sequences and cpu number cache >> >> SYNOPSIS >>#include >> >>int rseq(struct rseq * rseq, int flags); >> >> DESCRIPTION >>The rseq() ABI accelerates user-space operations on per-cpu >>data by defining a shared data structure ABI between each user- >>space thread and the kernel. >> >>The rseq argument is a pointer to the thread-local rseq struc‐ >>ture to be shared between kernel and user-space. A NULL rseq >>value can be used to check whether rseq is registered for the >>current thread. >> >>The layout of struct rseq is as follows: >> >>Structure alignment >> This structure needs to be aligned on multiples of 64 >> bytes. >> >>Structure size >> This structure has a fixed size of 128 bytes. >> >>Fields >> >>cpu_id >> Cache of the CPU number on which the calling thread is >> running. >> >>event_counter >> Restartable sequences event_counter field. > > That's an unhelpful description. Good point, how about: event_counter Counter guaranteed to be incremented when the current thread is preempted or when a signal is delivered to the current thread. Along the same line of thought, I would reword cpu_id as: cpu_id Cache of the CPU number on which the current thread is running. > >> >>rseq_cs >> Restartable sequences rseq_cs field. Points to a struct >> rseq_cs. > > Why is it a pointer? Rewording it like this should help: rseq_cs The rseq_cs field is a pointer to a struct rseq_cs. It is NULL when no rseq assembly block critical section is active for the current thread. Setting it to point to a critical section descriptor (struct rseq_cs) marks the beginning of the critical section. It is cleared after the end of the critical section. 
> >> >>The layout of struct rseq_cs is as follows: >> >>Structure alignment >> This structure needs to be aligned on multiples of 64 >> bytes. >> >>Structure size >> This structure has a fixed size of 192 bytes. >> >>Fields >> >>start_ip >> Instruction pointer address of the first instruction of >> the sequence of consecutive assembly instructions. >> >>post_commit_ip >> Instruction pointer address after the last instruction >> of the sequence of consecutive assembly instructions. >> >>abort_ip >> Instruction pointer address where to move the execution >> flow in case of abort of the sequence of consecutive >> assembly instructions. >> >>The flags argument is currently unused and must be specified as >>0. >> >>Typically, a library or application will keep the rseq struc‐ >>ture in a thread-local storage variable, or other memory areas > > "variable or other memory area" ok > >>belonging to each thread. It is recommended to perform volatile >>reads of the thread-local cache to prevent the compiler from >>doing load tearing. An alternative approach is to read each >>field from inline assembly. > > I don't think the man page needs to tell people how to implement > correct atomic loads. ok, I can remove the two previous sentences. > >> >>Each thread is responsible for registering its rseq structure. >>Only one rseq structure address can be registered per thread. >>Once set, the rseq address is idempotent for a given thread. > > "Idempotent" is a property that applies to an action, and the "rseq > address" is not an action. I don't know what you're trying to say. I mean there is only one address registered per thread, and it stays registered for the life-time of the thread. Perhaps I could say: "Once set, the rseq address never changes for a given thread." > >> >>In a typical usage scenario, the thread registering the rseq >>structure will be performing loads and stores from/to that >>structure. It is however also allowed to read that structure >>from other threads. 
The rseq field updates performed by the >>kernel provide single-copy atomicity semantics, which guarantee >>that other threads performing single-copy atomic reads of the >>cpu number cache will always observe a consistent value. > > s/single-copy/relaxed atomic/ perhaps? ok > >> >>Memory registered as rseq structure should ne
[PATCH] clocksource: sun4i: Clear interrupts after stopping timer in probe function
The bootloader (U-boot) sometimes uses this timer for various delays. It uses it as an ongoing counter, and does comparisons on the current counter value. The timer counter is never stopped. In some cases when the user interacts with the bootloader, or lets it idle for some time before loading Linux, the timer may expire, and an interrupt will be pending. This results in an unexpected interrupt when the timer interrupt is enabled by the kernel, at which point the event_handler isn't set yet. This results in a NULL pointer dereference exception, panic, and no way to reboot. Clear any pending interrupts after we stop the timer in the probe function to avoid this. Signed-off-by: Chen-Yu Tsai --- I've run into this many times while working on U-boot. Finally made time to figure it out. --- drivers/clocksource/sun4i_timer.c | 3 +++ 1 file changed, 3 insertions(+) diff --git a/drivers/clocksource/sun4i_timer.c b/drivers/clocksource/sun4i_timer.c index 6f3719d73390..d5725c82401d 100644 --- a/drivers/clocksource/sun4i_timer.c +++ b/drivers/clocksource/sun4i_timer.c @@ -193,6 +193,9 @@ static void __init sun4i_timer_init(struct device_node *node) /* Make sure timer is stopped before playing with interrupts */ sun4i_clkevt_time_stop(0); + /* clear timer0 interrupt */ + writel(0x1, timer_base + TIMER_IRQ_ST_REG); + sun4i_clockevent.cpumask = cpu_possible_mask; sun4i_clockevent.irq = irq; -- 2.8.1
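The failure sequence described in this patch can be modelled in user space with a mock timer. The sketch below assumes the status register is write-1-to-clear, which is how I read the writel(0x1, ...) in the hunk; the struct and function names are hypothetical and no real hardware access is involved.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical model of the timer block as the kernel finds it at probe. */
struct mock_timer {
	uint32_t irq_status;		/* pending bits, assumed write-1-to-clear */
	uint32_t irq_enable;		/* enabled interrupt sources */
	void (*event_handler)(void);	/* NULL until the clockevent is registered */
};

/* Model of a write to the status register: each 1 bit clears that
 * pending flag. */
static void mock_write_irq_status(struct mock_timer *t, uint32_t val)
{
	t->irq_status &= ~val;
}

/* Returns 1 if an enabled interrupt is pending while no handler is
 * installed, which is the situation that leads to the NULL dereference. */
static int mock_spurious_irq(const struct mock_timer *t)
{
	return (t->irq_status & t->irq_enable) != 0 &&
	       t->event_handler == NULL;
}
```

A timer left expired by the bootloader starts with the pending bit set; clearing it before the interrupt is enabled removes the window in which the handler could be called while still unset.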
Re: [PATCH v2 02/10] userns: Add per user namespace sysctls.
From: ebied...@xmission.com (Eric W. Biederman) Date: Mon, 25 Jul 2016 19:44:50 -0500 > User namespaces have enabled unprivileged users access to a lot more > data structures and so to catch programs that go crazy we need a lot > more limits. I believe some of those limits make sense per namespace. > As it is easy in some cases to say any more than Y number of those > per namespace is excessive. For example a limit of 1,000,000 ipv4 > routes per network namespaces is a sanity check as there are > currently 621,649 ipv4 prefixes advertized in bgp. When we give a new namespace to unprivileged users, we honestly should make the sysctl settings we give to them become "limits". They can further constrain the sysctl settings but may not raise them.
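The "constrain but never raise" rule suggested here could look roughly like the following user-space sketch. The function name, the parameters and the choice of -EPERM as the error code are assumptions for illustration only; nothing below is taken from the patch series.

```c
#include <errno.h>

/* Hypothetical setter for a per-namespace limit: the value inherited
 * from the parent acts as a ceiling. Lowering is allowed, raising
 * above the ceiling is refused. */
static int ns_set_limit(unsigned long parent_limit,
			unsigned long *cur_limit,
			unsigned long requested)
{
	if (requested > parent_limit)
		return -EPERM;
	*cur_limit = requested;
	return 0;
}
```

A namespace that inherited a ceiling of 100 could drop its own limit to 50 and later go back up to 100, but a request for 200 would be rejected and leave the current value untouched.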
Re: [PATCH] iio: adc: rockchip_saradc: Explicitly disable ADC on probe
Hi Guenter, Thanks for fixing it. On 2016-07-26 03:39, Guenter Roeck wrote: If the ADC is read for the first time, the caller gets a timeout error, and the kernel log shows read channel() error: -110 The ADC may be enabled on boot, and needs to be explicitly disabled for a read sequence to work (otherwise there is no completion interrupt). Disable it explicitly in the probe function. Fixes: 44d6f2ef94f9 ("iio: adc: add driver for Rockchip saradc") Signed-off-by: Guenter Roeck --- drivers/iio/adc/rockchip_saradc.c | 3 +++ 1 file changed, 3 insertions(+) diff --git a/drivers/iio/adc/rockchip_saradc.c b/drivers/iio/adc/rockchip_saradc.c index f9ad6c2d6821..6aa3271d86b5 100644 --- a/drivers/iio/adc/rockchip_saradc.c +++ b/drivers/iio/adc/rockchip_saradc.c @@ -280,6 +280,9 @@ static int rockchip_saradc_probe(struct platform_device *pdev) goto err_pclk; } + /* Make sure ADC is disabled */ + writel_relaxed(0, info->regs + SARADC_CTRL); I think we should reset the saradc controller, to make sure the registers hold their reset value (0); state left over from the loader-->kernel handoff may even cause harm, in my experience with tsadc. (drivers/thermal/rockchip_thermal.c) e.g.: /** * Reset SARADC Controller, reset all saradc registers. */ static void rockchip_saradc_reset_controller(struct reset_control *reset) { reset_control_assert(reset); usleep_range(10, 20); reset_control_deassert(reset); } ..probe() { ... rockchip_saradc_reset_controller(); ... } - Caesar + platform_set_drvdata(pdev, indio_dev); indio_dev->name = dev_name(&pdev->dev); -- caesar wang | software engineer | w...@rock-chip.com
Re: [PATCH 4.6 000/203] 4.6.5-stable review
On Mon, Jul 25, 2016 at 07:49:58PM -0600, Shuah Khan wrote: > On 07/25/2016 02:53 PM, Greg Kroah-Hartman wrote: > > This is the start of the stable review cycle for the 4.6.5 release. > > There are 203 patches in this series, all will be posted as a response > > to this one. If anyone has any issues with these being applied, please > > let me know. > > > > Responses should be made by Wed Jul 27 20:33:38 UTC 2016. > > Anything received after that time might be too late. > > > > The whole patch series can be found in one patch at: > > kernel.org/pub/linux/kernel/v4.x/stable-review/patch-4.6.5-rc1.gz > > or in the git tree and branch at: > > git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git > > linux-4.6.y > > and the diffstat can be found below. > > > Compiled and booted on my test system. No dmesg regressions, Great, thanks for testing all of these and letting me know. greg k-h
Re: [PATCH 1/3] memory: mediatek: Add a new interface mtk_smi_larb_is_ready
Hi, On Mon, Jul 25, 2016 at 5:39 PM, Matthias Brugger wrote: > > > On 20/07/16 05:01, Yong Wu wrote: >> >> Currently the iommu consumer always calls iommu_present to get whether >> the iommu is ready. But in MTK IOMMU, this function can't indicate >> this. The IOMMU call bus_set_iommu->mtk_iommu_add_device-> >> mtk_iommu_attach_device to parse the iommu data, then it's able to >> transfer "struct mtk_smi_iommu" to SMI-LARB, and the iommu uses the >> larbs as components, the iommu will not finish its probe until all the larbs >> have probed. >> >> If the iommu consumer(like DRM) begins to probe after the time of >> calling bus_set_iommu and before the time of SMI probe finish, it >> will hang like this: >> >> [7.832359] Call trace: >> [7.834778] [] mtk_smi_larb_get+0x24/0xa8 >> [7.840300] [] mtk_drm_crtc_enable+0x6c/0x450 >> >> Because the larb->mmu is NULL at that time. >> >> In order to avoid this issue, we add a new interface >> (mtk_smi_larb_is_ready) for checking whether the IOMMU and SMI have >> finished their probe. If it returns false, the iommu consumer should >> probe-defer for the IOMMU and SMI. >> > > Can't we just skip the functions in the probe and call bus_set_iommu only if > we were able to bind all components? > Something like this: Note that we have to call bus_set_iommu() and actually have .add_device() and .attach_device() called before any of the slave devices probe. I found a similar problem with rockchip IOMMU after adding power domain and runtime PM handling there. I also found that current design of IOMMU core and related DMA mapping code is utterly broken regarding the device add/probe ordering (no support for deferring things properly). So my idea is to keep .add_device() as is, since typically it doesn't seem to require anything from the IOMMU hardware and just initializes some per-device data, but make .attach_device() able to defer probe of that device if the respective IOMMU has not probed yet. 
I'm still in the process of figuring out the right way to achieve it, though... Best regards, Tomasz
Re: [Qemu-devel] [PATCH v2 0/2] vfio: add aer process
ping On 2016/7/19 16:13, Zhou Jie wrote: From: Chen Fan v1-v2: 1. Add aer process to vfio driver. Chen Fan (2): vfio : add aer process vfio : resume notifier drivers/vfio/pci/vfio_pci.c | 58 - drivers/vfio/pci/vfio_pci_intrs.c | 18 drivers/vfio/pci/vfio_pci_private.h | 3 ++ include/uapi/linux/vfio.h | 3 ++ 4 files changed, 81 insertions(+), 1 deletion(-)
Re: [PATCH v2] ceph: Mark the file cache as unreclaimable
> On Jul 26, 2016, at 01:12, Nikolay Borisov wrote: > > Ceph creates multiple caches with the SLAB_RECLAIMABLE flag set, so > that it can satisfy its internal needs. Inspecting the code shows that > most of the caches are indeed reclaimable since they are directly > related to the generic inode/dentry shrinkers. However, one of the > caches used to satisfy struct file is not reclaimable since its > entries are freed only when the last reference to the file is > dropped. If a heavily loaded node opens a lot of files it can > introduce non-trivial discrepancies between memory shown as reclaimable > and what is actually reclaimed when drop_caches is used. > > Fix this by removing the reclaimable flag for the file's cache. > > Signed-off-by: Nikolay Borisov > --- > > Fixed checkpatch warning + missing SOB line > > fs/ceph/super.c | 4 ++-- > 1 file changed, 2 insertions(+), 2 deletions(-) > > diff --git a/fs/ceph/super.c b/fs/ceph/super.c > index 91e02481ce06..8697cac6add0 100644 > --- a/fs/ceph/super.c > +++ b/fs/ceph/super.c > @@ -672,8 +672,8 @@ static int __init init_caches(void) > if (ceph_dentry_cachep == NULL) > goto bad_dentry; > > - ceph_file_cachep = KMEM_CACHE(ceph_file_info, > - SLAB_RECLAIM_ACCOUNT|SLAB_MEM_SPREAD); > + ceph_file_cachep = KMEM_CACHE(ceph_file_info, SLAB_MEM_SPREAD); > + > if (ceph_file_cachep == NULL) > goto bad_file; > Applied, thanks Yan, Zheng > -- > 2.7.4 >
Re: [PATCH] randomize_range: use random long instead of int
Hi William, Kees, On Mon, Jul 25, 2016 at 11:25:41AM -0700, william.c.robe...@intel.com wrote: > From: William Roberts > > Use a long when generating the random range rather than > an int. This will produce better random distributions as > well as matching all the types at hand. > > Signed-off-by: William Roberts > --- > drivers/char/random.c | 3 ++- > 1 file changed, 2 insertions(+), 1 deletion(-) Upon further review, I think we should dig into this a little bit deeper. Stand by, I'll post an RFC series shortly. thx, Jason. > diff --git a/drivers/char/random.c b/drivers/char/random.c > index 0158d3b..bbf11b5 100644 > --- a/drivers/char/random.c > +++ b/drivers/char/random.c > @@ -1837,7 +1837,8 @@ randomize_range(unsigned long start, unsigned long end, > unsigned long len) > > if (end <= start + len) > return 0; > - return PAGE_ALIGN(get_random_int() % range + start); > + > + return PAGE_ALIGN(get_random_long() % range + start); > } > > /* Interface for in-kernel drivers of true hardware RNGs. > -- > 1.9.1 >
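One way to see why the width of the random value matters here is to compute the largest offset that `rnd % range` can ever produce. The following is a user-space sketch only, not kernel code; `max_offset` is a made-up helper name, and it assumes a 64-bit `unsigned long` for the long case.

```c
#include <stdint.h>

/*
 * Largest value that "rnd % range" can take when rnd is uniformly
 * drawn from [0, max_rnd]. Hypothetical helper for illustration;
 * assumes range is non-zero.
 */
static uint64_t max_offset(uint64_t max_rnd, uint64_t range)
{
	/* If rnd can never reach range, the modulo is a no-op. */
	return max_rnd < range ? max_rnd : range - 1;
}
```

With a 32-bit random value and a range wider than 4 GiB, `max_offset(UINT32_MAX, range)` stays at `UINT32_MAX`, so the upper part of the range is never selected; with a 64-bit value the whole range is reachable. This says nothing about modulo bias, which is a separate question.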
Re: [PATCH] tools lib bpf: Use official ELF e_machine value
Hi Arnaldo, Please don't forget this patch. Thank you. On 2016/7/19 5:37, Alexei Starovoitov wrote: On Mon, Jul 18, 2016 at 06:01:08AM +, Wang Nan wrote: New LLVM will issue newly assigned EM_BPF machine code. The new code will be propagated to glibc and libelf. This patch introduces the new machine code to libbpf. Signed-off-by: Wang Nan Cc: Alexei Starovoitov Cc: Arnaldo Carvalho de Melo Cc: Zefan Li Cc: pi3or...@163.com --- tools/lib/bpf/libbpf.c | 7 ++- 1 file changed, 6 insertions(+), 1 deletion(-) diff --git a/tools/lib/bpf/libbpf.c b/tools/lib/bpf/libbpf.c index 32e6b6b..b699aea 100644 --- a/tools/lib/bpf/libbpf.c +++ b/tools/lib/bpf/libbpf.c @@ -37,6 +37,10 @@ #include "libbpf.h" #include "bpf.h" +#ifndef EM_BPF +#define EM_BPF 247 +#endif + #define __printf(a, b) __attribute__((format(printf, a, b))) __printf(1, 2) @@ -439,7 +443,8 @@ static int bpf_object__elf_init(struct bpf_object *obj) } ep = &obj->efile.ehdr; - if ((ep->e_type != ET_REL) || (ep->e_machine != 0)) { + /* Old LLVM set e_machine to EM_NONE */ + if ((ep->e_type != ET_REL) || (ep->e_machine && (ep->e_machine != EM_BPF))) { Thanks for the fix. Didn't realize we already check for zero here. btw EM_BPF will be in llvm 3.9 release. Acked-by: Alexei Starovoitov
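The acceptance rule in the patch can be read as "relocatable object, and machine is either EM_NONE or EM_BPF". Below is a stand-alone sketch of that predicate for illustration; `bpf_elf_header_ok` is a hypothetical name and not a libbpf function, and the constants are defined locally only in case `<elf.h>` is not included.

```c
#include <stdint.h>

/* ELF constants; EM_BPF uses the value from the patch above. */
#ifndef ET_REL
#define ET_REL 1
#endif
#ifndef EM_NONE
#define EM_NONE 0
#endif
#ifndef EM_BPF
#define EM_BPF 247
#endif

/*
 * Hypothetical helper: returns 1 if the header fields look like a
 * BPF object file, 0 otherwise.
 */
static int bpf_elf_header_ok(uint16_t e_type, uint16_t e_machine)
{
	if (e_type != ET_REL)
		return 0;
	/* Old LLVM emits EM_NONE, newer LLVM emits EM_BPF. */
	return e_machine == EM_NONE || e_machine == EM_BPF;
}
```

This is the logical negation of the error condition in the patched `if`, so objects produced by both old and new LLVM pass, while any other machine type is rejected.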
Re: [PATCH 04/32] x86/intel_rdt: Add L3 cache capacity bitmask management
On Fri, Jul 22, 2016 at 02:43:23PM -0700, Luck, Tony wrote: > On Fri, Jul 22, 2016 at 04:12:04AM -0300, Marcelo Tosatti wrote: > > How does this patchset handle the following condition: > > > > 6) Create reservations in such a way that the sum is larger than > > total amount of cache, and CPU pinning (example from Karen Noel): > > > > VM-1 on socket-1 with 80% of reservation. > > VM-2 on socket-2 with 80% of reservation. > > VM-1 pinned to socket-1. > > VM-2 pinned to socket-2. > > That's legal, but perhaps we need a description of > overlapping cache reservations. > > Hardware tells you how finely you can divide the cache (and this > information is shown in /sys/fs/resctrl/info/l3/max_cbm_len to save > you from digging in CPUID leaves). E.g. on Broadwell the value is > 20, so you can control cache allocations in 5% slices. > > A bitmask defines which slices you can use (and h/w has the restriction > that you must have contiguous '1' bits in any mask). So you can pick > your 80% using 0x0ffff, 0x1fffe, 0x3fffc, 0x7fff8 or 0xffff0. > > There is no requirement that masks be exclusive of each other. So > you might pick the two extremes: 0x0ffff and 0xffff0 for your two > VM's in this example. Each would be allowed to allocate up to 80%, > but with a big overlap in the middle. Each has 20% exclusive, but > there is a 60% range in the middle that they would compete for. These are different sockets, so there is no competing/sharing of L3 cache here: the question is about whether the interface allows the user to specify that 80/80 reservation without complaining: because the VM's are pinned, they will never actually share the same L3 cache. (haven't finished reading the patchset to be certain). > Is this specific case useful? Possibly not. I think the more common > overlap cases might be between processes that you know have shared > code/data. 
Also the case where some rdtgroup has access to allocate > in the entire cache (mask 0xfffff on Broadwell) and some other > rdtgroups > have limited cache allocation with fewer bits in the mask. > > -Tony All you have to do is to build the bitmask for a given processor from the union of the tasks which have been scheduled on that processor.
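The mask arithmetic discussed above (contiguous bits, 5% per slice, overlap between two masks) can be checked with a small user-space sketch. It assumes a 20-bit capacity bitmask as quoted for Broadwell; the helper names are made up for illustration and are not part of the resctrl interface.

```c
#include <stdint.h>

/* Assumption: 20-bit capacity bitmask, as quoted for Broadwell. */
#define CBM_LEN 20

/* Count the set bits in a mask. */
static unsigned int cbm_weight(uint32_t mask)
{
	unsigned int n = 0;

	for (; mask; mask &= mask - 1)
		n++;
	return n;
}

/* Hardware requires the '1' bits to form one contiguous run. */
static int cbm_is_contiguous(uint32_t mask)
{
	if (!mask)
		return 0;
	/* Strip trailing zeros; the rest must look like 0b0..01..1. */
	while (!(mask & 1))
		mask >>= 1;
	return (mask & (mask + 1)) == 0;
}

/* Share of the cache covered by a mask, in percent. */
static unsigned int cbm_percent(uint32_t mask)
{
	return cbm_weight(mask) * 100 / CBM_LEN;
}
```

For two 16-bit masks placed at opposite ends of the 20-bit range, 0x0ffff and 0xffff0, `cbm_percent()` gives 80 for each, 60 for their intersection and 20 for the part of either mask that the other does not cover. The union mentioned in the reply is simply the bitwise OR of the masks of the tasks scheduled on a processor.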
RE: [PATCH v3 02/11] mm: Hardened usercopy
David Laight writes: > From: Josh Poimboeuf >> Sent: 22 July 2016 18:46 >> > >> > e.g. then if the pointer was in the thread_info, the second test would >> > fail, triggering the protection. >> >> FWIW, this won't work right on x86 after Andy's >> CONFIG_THREAD_INFO_IN_TASK patches get merged. > > What ends up in the 'thread_info' area? It depends on the arch. > If it contains the fp save area then programs like gdb may end up requesting > copy_in/out directly from that area. On the arches I've seen thread_info doesn't usually contain register save areas, but if it did then it would be up to the arch helper to allow that copy to go through. However given thread_info generally contains lots of low level flags that would be a good target for an attacker, the best way to cope with ptrace wanting to copy to/from it would be to use a temporary, and prohibit copying directly to/from thread_info - IMHO. cheers