Re: [PATCH v2 4/6] soc: mediatek: devapc: rename variable for new IC support
Hi, Matthias On Tue, 2021-04-06 at 15:43 +0200, Matthias Brugger wrote: > Regarding the commit subject: > "soc: mediatek: devapc: rename variable for new IC support" > maybe something like: > "soc: mediatek: devapc: rename register variable infra_base" > > Other then that looks good to me. > OK. I will fix it in the next version. Thanks > On 01/04/2021 08:38, Nina Wu wrote: > > From: Nina Wu > > > > For new ICs, there are multiple devapc HWs for different subsys. > > For example, there is devapc respectively for infra, peri, peri2, etc. > > So we rename the variable 'infra_base' to 'base' for code readability. > > > > Signed-off-by: Nina Wu > > --- > > drivers/soc/mediatek/mtk-devapc.c | 24 > > 1 file changed, 12 insertions(+), 12 deletions(-) > > > > diff --git a/drivers/soc/mediatek/mtk-devapc.c > > b/drivers/soc/mediatek/mtk-devapc.c > > index 68c3e35..bcf6e3c 100644 > > --- a/drivers/soc/mediatek/mtk-devapc.c > > +++ b/drivers/soc/mediatek/mtk-devapc.c > > @@ -45,7 +45,7 @@ struct mtk_devapc_data { > > > > struct mtk_devapc_context { > > struct device *dev; > > - void __iomem *infra_base; > > + void __iomem *base; > > u32 vio_idx_num; > > struct clk *infra_clk; > > const struct mtk_devapc_data *data; > > @@ -56,7 +56,7 @@ static void clear_vio_status(struct mtk_devapc_context > > *ctx) > > void __iomem *reg; > > int i; > > > > - reg = ctx->infra_base + ctx->data->vio_sta_offset; > > + reg = ctx->base + ctx->data->vio_sta_offset; > > > > for (i = 0; i < VIO_MOD_TO_REG_IND(ctx->vio_idx_num - 1); i++) > > writel(GENMASK(31, 0), reg + 4 * i); > > @@ -71,7 +71,7 @@ static void mask_module_irq(struct mtk_devapc_context > > *ctx, bool mask) > > u32 val; > > int i; > > > > - reg = ctx->infra_base + ctx->data->vio_mask_offset; > > + reg = ctx->base + ctx->data->vio_mask_offset; > > > > if (mask) > > val = GENMASK(31, 0); > > @@ -113,11 +113,11 @@ static int devapc_sync_vio_dbg(struct > > mtk_devapc_context *ctx) > > int ret; > > u32 val; > > > > - pd_vio_shift_sta_reg = 
ctx->infra_base + > > + pd_vio_shift_sta_reg = ctx->base + > >ctx->data->vio_shift_sta_offset; > > - pd_vio_shift_sel_reg = ctx->infra_base + > > + pd_vio_shift_sel_reg = ctx->base + > >ctx->data->vio_shift_sel_offset; > > - pd_vio_shift_con_reg = ctx->infra_base + > > + pd_vio_shift_con_reg = ctx->base + > >ctx->data->vio_shift_con_offset; > > > > /* Find the minimum shift group which has violation */ > > @@ -159,8 +159,8 @@ static void devapc_extract_vio_dbg(struct > > mtk_devapc_context *ctx) > > void __iomem *vio_dbg0_reg; > > void __iomem *vio_dbg1_reg; > > > > - vio_dbg0_reg = ctx->infra_base + ctx->data->vio_dbg0_offset; > > - vio_dbg1_reg = ctx->infra_base + ctx->data->vio_dbg1_offset; > > + vio_dbg0_reg = ctx->base + ctx->data->vio_dbg0_offset; > > + vio_dbg1_reg = ctx->base + ctx->data->vio_dbg1_offset; > > > > vio_dbgs.vio_dbg0 = readl(vio_dbg0_reg); > > vio_dbgs.vio_dbg1 = readl(vio_dbg1_reg); > > @@ -198,7 +198,7 @@ static irqreturn_t devapc_violation_irq(int irq_number, > > void *data) > > */ > > static void start_devapc(struct mtk_devapc_context *ctx) > > { > > - writel(BIT(31), ctx->infra_base + ctx->data->apc_con_offset); > > + writel(BIT(31), ctx->base + ctx->data->apc_con_offset); > > > > mask_module_irq(ctx, false); > > } > > @@ -210,7 +210,7 @@ static void stop_devapc(struct mtk_devapc_context *ctx) > > { > > mask_module_irq(ctx, true); > > > > - writel(BIT(2), ctx->infra_base + ctx->data->apc_con_offset); > > + writel(BIT(2), ctx->base + ctx->data->apc_con_offset); > > } > > > > static const struct mtk_devapc_data devapc_mt6779 = { > > @@ -249,8 +249,8 @@ static int mtk_devapc_probe(struct platform_device > > *pdev) > > ctx->data = of_device_get_match_data(>dev); > > ctx->dev = >dev; > > > > - ctx->infra_base = of_iomap(node, 0); > > - if (!ctx->infra_base) > > + ctx->base = of_iomap(node, 0); > > + if (!ctx->base) > > return -EINVAL; > > > > if (of_property_read_u32(node, "vio_idx_num", >vio_idx_num)) > >
Re: [PATCH 00/14] usb: dwc2: Fix Partial Power down issues.
Hi Greg, On 4/7/2021 14:00, Artur Petrosyan wrote: > This patch set fixes and improves the Partial Power Down mode for > dwc2 core. > It adds support for the following cases > 1. Entering and exiting partial power down when a port is > suspended, resumed, port reset is asserted. > 2. Exiting the partial power down mode before removing driver. > 3. Exiting partial power down in wakeup detected interrupt handler. > 4. Exiting from partial power down mode when connector ID. > status changes to "connId B > > It updates and fixes the implementation of dwc2 entering and > exiting partial power down mode when the system (PC) is suspended. > > The patch set also improves the implementation of function handlers > for entering and exiting host or device partial power down. > > NOTE: This is the second patch set in the power saving mode fixes > series. > This patch set is part of multiple series and is continuation > of the "usb: dwc2: Fix and improve power saving modes" patch set. > (Patch set link: https://marc.info/?l=linux-usb=160379622403975=2). > The patches that were included in the "usb: dwc2: > Fix and improve power saving modes" which was submitted > earlier was too large and needed to be split up into > smaller patch sets. > > > Artur Petrosyan (14): >usb: dwc2: Add device partial power down functions >usb: dwc2: Add host partial power down functions >usb: dwc2: Update enter and exit partial power down functions >usb: dwc2: Add partial power down exit flow in wakeup intr. >usb: dwc2: Update port suspend/resume function definitions. >usb: dwc2: Add enter partial power down when port is suspended >usb: dwc2: Add exit partial power down when port is resumed >usb: dwc2: Add exit partial power down when port reset is asserted >usb: dwc2: Add part. power down exit from > dwc2_conn_id_status_change(). 
>usb: dwc2: Allow exit partial power down in urb enqueue >usb: dwc2: Fix session request interrupt handler >usb: dwc2: Update partial power down entering by system suspend >usb: dwc2: Fix partial power down exiting by system resume >usb: dwc2: Add exit partial power down before removing driver > > drivers/usb/dwc2/core.c | 113 ++--- > drivers/usb/dwc2/core.h | 27 ++- > drivers/usb/dwc2/core_intr.c | 46 ++-- > drivers/usb/dwc2/gadget.c| 148 ++- > drivers/usb/dwc2/hcd.c | 458 +-- > drivers/usb/dwc2/hw.h| 1 + > drivers/usb/dwc2/platform.c | 11 +- > 7 files changed, 558 insertions(+), 246 deletions(-) > > > base-commit: e9fcb07704fcef6fa6d0333fd2b3a62442eaf45b > I submitted this patch set yesterday. It contains 14 patches, but only 2 of them were received by LKML: the cover letter and the 13th patch. (https://lore.kernel.org/linux-usb/cover.1617782102.git.arthur.petros...@synopsys.com/T/#t) I checked here at Synopsys; Minas did receive all the patches, as his email is in the To list. Could this be an issue with the vger.kernel.org mailing server? I have checked every local possibility that could result in such behavior. Patch 13, which was received by LKML, has content similar to the other patches. The mailing tool used is ssmtp, and all of its configuration checks out. Could you please suggest what I should do in this situation? Regards, Artur
Re: [PATCH] drm/amd/pm: convert sysfs snprintf to sysfs_emit
On Wed, 7 Apr 2021 16:30:01 -0400 Alex Deucher wrote: > On Tue, Apr 6, 2021 at 10:13 AM Carlis wrote: > > > > From: Xuezhi Zhang > > > > Fix the following coccicheck warning: > > drivers/gpu/drm/amd/pm//amdgpu_pm.c:1940:8-16: > > WARNING: use scnprintf or sprintf > > drivers/gpu/drm/amd/pm//amdgpu_pm.c:1978:8-16: > > WARNING: use scnprintf or sprintf > > drivers/gpu/drm/amd/pm//amdgpu_pm.c:2022:8-16: > > WARNING: use scnprintf or sprintf > > drivers/gpu/drm/amd/pm//amdgpu_pm.c:294:8-16: > > WARNING: use scnprintf or sprintf > > drivers/gpu/drm/amd/pm//amdgpu_pm.c:154:8-16: > > WARNING: use scnprintf or sprintf > > drivers/gpu/drm/amd/pm//amdgpu_pm.c:496:8-16: > > WARNING: use scnprintf or sprintf > > drivers/gpu/drm/amd/pm//amdgpu_pm.c:512:9-17: > > WARNING: use scnprintf or sprintf > > drivers/gpu/drm/amd/pm//amdgpu_pm.c:1740:8-16: > > WARNING: use scnprintf or sprintf > > drivers/gpu/drm/amd/pm//amdgpu_pm.c:1667:8-16: > > WARNING: use scnprintf or sprintf > > drivers/gpu/drm/amd/pm//amdgpu_pm.c:2074:8-16: > > WARNING: use scnprintf or sprintf > > drivers/gpu/drm/amd/pm//amdgpu_pm.c:2047:9-17: > > WARNING: use scnprintf or sprintf > > drivers/gpu/drm/amd/pm//amdgpu_pm.c:2768:8-16: > > WARNING: use scnprintf or sprintf > > drivers/gpu/drm/amd/pm//amdgpu_pm.c:2738:8-16: > > WARNING: use scnprintf or sprintf > > drivers/gpu/drm/amd/pm//amdgpu_pm.c:2442:8-16: > > WARNING: use scnprintf or sprintf > > drivers/gpu/drm/amd/pm//amdgpu_pm.c:3246:8-16: > > WARNING: use scnprintf or sprintf > > drivers/gpu/drm/amd/pm//amdgpu_pm.c:3253:8-16: > > WARNING: use scnprintf or sprintf > > drivers/gpu/drm/amd/pm//amdgpu_pm.c:2458:8-16: > > WARNING: use scnprintf or sprintf > > drivers/gpu/drm/amd/pm//amdgpu_pm.c:3047:8-16: > > WARNING: use scnprintf or sprintf > > drivers/gpu/drm/amd/pm//amdgpu_pm.c:3133:8-16: > > WARNING: use scnprintf or sprintf > > drivers/gpu/drm/amd/pm//amdgpu_pm.c:3209:8-16: > > WARNING: use scnprintf or sprintf > > 
drivers/gpu/drm/amd/pm//amdgpu_pm.c:3216:8-16: > > WARNING: use scnprintf or sprintf > > drivers/gpu/drm/amd/pm//amdgpu_pm.c:2410:8-16: > > WARNING: use scnprintf or sprintf > > drivers/gpu/drm/amd/pm//amdgpu_pm.c:2496:8-16: > > WARNING: use scnprintf or sprintf > > drivers/gpu/drm/amd/pm//amdgpu_pm.c:2470:8-16: > > WARNING: use scnprintf or sprintf > > drivers/gpu/drm/amd/pm//amdgpu_pm.c:2426:8-16: > > WARNING: use scnprintf or sprintf > > drivers/gpu/drm/amd/pm//amdgpu_pm.c:2965:8-16: > > WARNING: use scnprintf or sprintf > > drivers/gpu/drm/amd/pm//amdgpu_pm.c:2972:8-16: > > WARNING: use scnprintf or sprintf > > drivers/gpu/drm/amd/pm//amdgpu_pm.c:3006:8-16: > > WARNING: use scnprintf or sprintf > > drivers/gpu/drm/amd/pm//amdgpu_pm.c:3013:8-16: > > WARNING: use scnprintf or sprintf > > > > Signed-off-by: Xuezhi Zhang > > I already applied a similar patch last week. > > Thanks, > > Alex > OK. Thanks, Xuezhi Zhang > > > --- > > drivers/gpu/drm/amd/pm/amdgpu_pm.c | 58 > > +++--- 1 file changed, 29 insertions(+), 29 > > deletions(-) > > > > diff --git a/drivers/gpu/drm/amd/pm/amdgpu_pm.c > > b/drivers/gpu/drm/amd/pm/amdgpu_pm.c index > > 5fa65f191a37..2777966ec1ca 100644 --- > > a/drivers/gpu/drm/amd/pm/amdgpu_pm.c +++ > > b/drivers/gpu/drm/amd/pm/amdgpu_pm.c @@ -151,7 +151,7 @@ static > > ssize_t amdgpu_get_power_dpm_state(struct device *dev, > > pm_runtime_mark_last_busy(ddev->dev); > > pm_runtime_put_autosuspend(ddev->dev); > > > > - return snprintf(buf, PAGE_SIZE, "%s\n", > > + return sysfs_emit(buf, "%s\n", > > (pm == POWER_STATE_TYPE_BATTERY) ? > > "battery" : (pm == POWER_STATE_TYPE_BALANCED) ? "balanced" : > > "performance"); } > > @@ -291,7 +291,7 @@ static ssize_t > > amdgpu_get_power_dpm_force_performance_level(struct device *dev, > > pm_runtime_mark_last_busy(ddev->dev); > > pm_runtime_put_autosuspend(ddev->dev); > > > > - return snprintf(buf, PAGE_SIZE, "%s\n", > > + return sysfs_emit(buf, "%s\n", > > (level == AMD_DPM_FORCED_LEVEL_AUTO) ? 
> > "auto" : (level == AMD_DPM_FORCED_LEVEL_LOW) ? "low" : > > (level == AMD_DPM_FORCED_LEVEL_HIGH) ? > > "high" : @@ -493,7 +493,7 @@ static ssize_t > > amdgpu_get_pp_cur_state(struct device *dev, if (i == data.nums) > > i = -EINVAL; > > > > - return snprintf(buf, PAGE_SIZE, "%d\n", i); > > + return sysfs_emit(buf, "%d\n", i); > > } > > > > static ssize_t amdgpu_get_pp_force_state(struct device *dev, > > @@ -509,7 +509,7 @@ static ssize_t amdgpu_get_pp_force_state(struct > > device *dev, if (adev->pp_force_state_enabled) > > return amdgpu_get_pp_cur_state(dev, attr, buf); > > else > > - return snprintf(buf, PAGE_SIZE, "\n"); > > + return sysfs_emit(buf, "\n"); > > } > > > > static ssize_t amdgpu_set_pp_force_state(struct device *dev, > > @@ -1664,7 +1664,7 @@ static ssize_t amdgpu_get_pp_sclk_od(struct > > device *dev,
Re: [PATCH v2 2/6] soc: mediatek: devapc: move 'vio_idx_num' info to DT
Hi, Matthias On Tue, 2021-04-06 at 15:41 +0200, Matthias Brugger wrote: > > On 01/04/2021 08:38, Nina Wu wrote: > > From: Nina Wu > > > > For new ICs, there are multiple devapc HWs for different subsys. > > The number of devices controlled by each devapc (i.e. 'vio_idx_num' > > in the code) varies. > > We move this info from compatible data to DT so that we do not need > > to add n compatible for a certain IC which has n devapc HWs with > > different 'vio_idx_num', respectively. > > > > Signed-off-by: Nina Wu > > --- > > drivers/soc/mediatek/mtk-devapc.c | 18 +- > > 1 file changed, 9 insertions(+), 9 deletions(-) > > > > diff --git a/drivers/soc/mediatek/mtk-devapc.c > > b/drivers/soc/mediatek/mtk-devapc.c > > index f1cea04..a0f6fbd 100644 > > --- a/drivers/soc/mediatek/mtk-devapc.c > > +++ b/drivers/soc/mediatek/mtk-devapc.c > > @@ -32,9 +32,6 @@ struct mtk_devapc_vio_dbgs { > > }; > > > > struct mtk_devapc_data { > > - /* numbers of violation index */ > > - u32 vio_idx_num; > > - > > /* reg offset */ > > u32 vio_mask_offset; > > u32 vio_sta_offset; > > @@ -49,6 +46,7 @@ struct mtk_devapc_data { > > struct mtk_devapc_context { > > struct device *dev; > > void __iomem *infra_base; > > + u32 vio_idx_num; > > We should try to stay backwards compatible (newer kernel with older DTS). I > think we don't need to move vio_idx_num to mtk_devapc_context. Just don't > declare it in the per SoC match data. More details see below... 
> > > struct clk *infra_clk; > > const struct mtk_devapc_data *data; > > }; > > @@ -60,10 +58,10 @@ static void clear_vio_status(struct mtk_devapc_context > > *ctx) > > > > reg = ctx->infra_base + ctx->data->vio_sta_offset; > > > > - for (i = 0; i < VIO_MOD_TO_REG_IND(ctx->data->vio_idx_num) - 1; i++) > > + for (i = 0; i < VIO_MOD_TO_REG_IND(ctx->vio_idx_num - 1); i++) > > writel(GENMASK(31, 0), reg + 4 * i); > > > > - writel(GENMASK(VIO_MOD_TO_REG_OFF(ctx->data->vio_idx_num) - 1, 0), > > + writel(GENMASK(VIO_MOD_TO_REG_OFF(ctx->vio_idx_num - 1), 0), > >reg + 4 * i); > > } > > > > @@ -80,15 +78,15 @@ static void mask_module_irq(struct mtk_devapc_context > > *ctx, bool mask) > > else > > val = 0; > > > > - for (i = 0; i < VIO_MOD_TO_REG_IND(ctx->data->vio_idx_num) - 1; i++) > > + for (i = 0; i < VIO_MOD_TO_REG_IND(ctx->vio_idx_num - 1); i++) > > writel(val, reg + 4 * i); > > > > val = readl(reg + 4 * i); > > if (mask) > > - val |= GENMASK(VIO_MOD_TO_REG_OFF(ctx->data->vio_idx_num) - 1, > > + val |= GENMASK(VIO_MOD_TO_REG_OFF(ctx->vio_idx_num - 1), > >0); > > else > > - val &= ~GENMASK(VIO_MOD_TO_REG_OFF(ctx->data->vio_idx_num) - 1, > > + val &= ~GENMASK(VIO_MOD_TO_REG_OFF(ctx->vio_idx_num - 1), > > 0); > > > > writel(val, reg + 4 * i); > > @@ -216,7 +214,6 @@ static void stop_devapc(struct mtk_devapc_context *ctx) > > } > > > > static const struct mtk_devapc_data devapc_mt6779 = { > > - .vio_idx_num = 511, > > .vio_mask_offset = 0x0, > > .vio_sta_offset = 0x400, > > .vio_dbg0_offset = 0x900, > > @@ -256,6 +253,9 @@ static int mtk_devapc_probe(struct platform_device > > *pdev) > > if (!ctx->infra_base) > > return -EINVAL; > > > > + if (of_property_read_u32(node, "vio_idx_num", >vio_idx_num)) > > + return -EINVAL; > > + > > ...only read the property if vio_idx_num == 0. > What do you think? > > Regards, > Matthias > Good idea. I will fix it in the next version. Thanks > > devapc_irq = irq_of_parse_and_map(node, 0); > > if (!devapc_irq) > > return -EINVAL; > >
Re: [PATCH] mtd: add OTP (one-time-programmable) erase ioctl
Michael, Would you please resend this patch, together with the mtd-utils and the SPI NOR patches, as a single patch set? It will help us all to have everything in one place. For the new ioctl we'll need acks from all the MTD maintainers and at least a Tested-by tag. Cheers, ta
Re: [RFC/RFT PATCH 1/3] memblock: update initialization of reserved pages
On Thu, Apr 08, 2021 at 10:46:18AM +0530, Anshuman Khandual wrote: > > > On 4/7/21 10:56 PM, Mike Rapoport wrote: > > From: Mike Rapoport > > > > The struct pages representing a reserved memory region are initialized > > using the reserve_bootmem_range() function. This function is called for each > > reserved region just before the memory is freed from memblock to the buddy > > page allocator. > > > > The struct pages for MEMBLOCK_NOMAP regions are kept with the default > > values set by the memory map initialization, which makes it necessary to > > have a special treatment for such pages in pfn_valid() and > > pfn_valid_within(). > > > > Split out initialization of the reserved pages to a function with a > > meaningful name and treat the MEMBLOCK_NOMAP regions the same way as the > > reserved regions and mark struct pages for the NOMAP regions as > > PageReserved. > > This would definitely need updating the comment for the MEMBLOCK_NOMAP definition > in include/linux/memblock.h just to make the semantics clear, Sure > though arm64 is currently the only user of MEMBLOCK_NOMAP.
> > Signed-off-by: Mike Rapoport > > --- > > mm/memblock.c | 23 +-- > > 1 file changed, 21 insertions(+), 2 deletions(-) > > > > diff --git a/mm/memblock.c b/mm/memblock.c > > index afaefa8fc6ab..6b7ea9d86310 100644 > > --- a/mm/memblock.c > > +++ b/mm/memblock.c > > @@ -2002,6 +2002,26 @@ static unsigned long __init > > __free_memory_core(phys_addr_t start, > > return end_pfn - start_pfn; > > } > > > > +static void __init memmap_init_reserved_pages(void) > > +{ > > + struct memblock_region *region; > > + phys_addr_t start, end; > > + u64 i; > > + > > + /* initialize struct pages for the reserved regions */ > > + for_each_reserved_mem_range(i, , ) > > + reserve_bootmem_region(start, end); > > + > > + /* and also treat struct pages for the NOMAP regions as PageReserved */ > > + for_each_mem_region(region) { > > + if (memblock_is_nomap(region)) { > > + start = region->base; > > + end = start + region->size; > > + reserve_bootmem_region(start, end); > > + } > > + } > > +} > > + > > static unsigned long __init free_low_memory_core_early(void) > > { > > unsigned long count = 0; > > @@ -2010,8 +2030,7 @@ static unsigned long __init > > free_low_memory_core_early(void) > > > > memblock_clear_hotplug(0, -1); > > > > - for_each_reserved_mem_range(i, , ) > > - reserve_bootmem_region(start, end); > > + memmap_init_reserved_pages(); > > > > /* > > * We need to use NUMA_NO_NODE instead of NODE_DATA(0)->node_id > > -- Sincerely yours, Mike.
Re: [v9,3/7] PCI: mediatek-gen3: Add MediaTek Gen3 driver for MT8192
Hi Bjorn, Lorenzo, Just gentle ping for this patch set, please kindly let me know your comments about this patch set. Thanks. On Wed, 2021-03-24 at 11:05 +0800, Jianjun Wang wrote: > MediaTek's PCIe host controller has three generation HWs, the new > generation HW is an individual bridge, it supports Gen3 speed and > compatible with Gen2, Gen1 speed. > > Add support for new Gen3 controller which can be found on MT8192. > > Signed-off-by: Jianjun Wang > Acked-by: Ryder Lee > --- > drivers/pci/controller/Kconfig | 13 + > drivers/pci/controller/Makefile | 1 + > drivers/pci/controller/pcie-mediatek-gen3.c | 464 > 3 files changed, 478 insertions(+) > create mode 100644 drivers/pci/controller/pcie-mediatek-gen3.c > > diff --git a/drivers/pci/controller/Kconfig b/drivers/pci/controller/Kconfig > index 5aa8977d7b0f..1e925ac47279 100644 > --- a/drivers/pci/controller/Kconfig > +++ b/drivers/pci/controller/Kconfig > @@ -233,6 +233,19 @@ config PCIE_MEDIATEK > Say Y here if you want to enable PCIe controller support on > MediaTek SoCs. > > +config PCIE_MEDIATEK_GEN3 > + tristate "MediaTek Gen3 PCIe controller" > + depends on ARCH_MEDIATEK || COMPILE_TEST > + depends on PCI_MSI_IRQ_DOMAIN > + help > + Adds support for PCIe Gen3 MAC controller for MediaTek SoCs. > + This PCIe controller is compatible with Gen3, Gen2 and Gen1 speed, > + and support up to 256 MSI interrupt numbers for > + multi-function devices. > + > + Say Y here if you want to enable Gen3 PCIe controller support on > + MediaTek SoCs. 
> + > config VMD > depends on PCI_MSI && X86_64 && SRCU > tristate "Intel Volume Management Device Driver" > diff --git a/drivers/pci/controller/Makefile b/drivers/pci/controller/Makefile > index e4559f2182f2..579973327815 100644 > --- a/drivers/pci/controller/Makefile > +++ b/drivers/pci/controller/Makefile > @@ -27,6 +27,7 @@ obj-$(CONFIG_PCIE_ROCKCHIP) += pcie-rockchip.o > obj-$(CONFIG_PCIE_ROCKCHIP_EP) += pcie-rockchip-ep.o > obj-$(CONFIG_PCIE_ROCKCHIP_HOST) += pcie-rockchip-host.o > obj-$(CONFIG_PCIE_MEDIATEK) += pcie-mediatek.o > +obj-$(CONFIG_PCIE_MEDIATEK_GEN3) += pcie-mediatek-gen3.o > obj-$(CONFIG_PCIE_MICROCHIP_HOST) += pcie-microchip-host.o > obj-$(CONFIG_VMD) += vmd.o > obj-$(CONFIG_PCIE_BRCMSTB) += pcie-brcmstb.o > diff --git a/drivers/pci/controller/pcie-mediatek-gen3.c > b/drivers/pci/controller/pcie-mediatek-gen3.c > new file mode 100644 > index ..3546e53b3c85 > --- /dev/null > +++ b/drivers/pci/controller/pcie-mediatek-gen3.c > @@ -0,0 +1,464 @@ > +// SPDX-License-Identifier: GPL-2.0 > +/* > + * MediaTek PCIe host controller driver. > + * > + * Copyright (c) 2020 MediaTek Inc. 
> + * Author: Jianjun Wang > + */ > + > +#include > +#include > +#include > +#include > +#include > +#include > +#include > +#include > +#include > +#include > +#include > + > +#include "../pci.h" > + > +#define PCIE_SETTING_REG 0x80 > +#define PCIE_PCI_IDS_1 0x9c > +#define PCI_CLASS(class) (class << 8) > +#define PCIE_RC_MODE BIT(0) > + > +#define PCIE_CFGNUM_REG 0x140 > +#define PCIE_CFG_DEVFN(devfn)((devfn) & GENMASK(7, 0)) > +#define PCIE_CFG_BUS(bus)(((bus) << 8) & GENMASK(15, 8)) > +#define PCIE_CFG_BYTE_EN(bytes) (((bytes) << 16) & GENMASK(19, > 16)) > +#define PCIE_CFG_FORCE_BYTE_EN BIT(20) > +#define PCIE_CFG_OFFSET_ADDR 0x1000 > +#define PCIE_CFG_HEADER(bus, devfn) \ > + (PCIE_CFG_BUS(bus) | PCIE_CFG_DEVFN(devfn)) > + > +#define PCIE_RST_CTRL_REG0x148 > +#define PCIE_MAC_RSTBBIT(0) > +#define PCIE_PHY_RSTBBIT(1) > +#define PCIE_BRG_RSTBBIT(2) > +#define PCIE_PE_RSTB BIT(3) > + > +#define PCIE_LTSSM_STATUS_REG0x150 > + > +#define PCIE_LINK_STATUS_REG 0x154 > +#define PCIE_PORT_LINKUP BIT(8) > + > +#define PCIE_TRANS_TABLE_BASE_REG0x800 > +#define PCIE_ATR_SRC_ADDR_MSB_OFFSET 0x4 > +#define PCIE_ATR_TRSL_ADDR_LSB_OFFSET0x8 > +#define PCIE_ATR_TRSL_ADDR_MSB_OFFSET0xc > +#define PCIE_ATR_TRSL_PARAM_OFFSET 0x10 > +#define PCIE_ATR_TLB_SET_OFFSET 0x20 > + > +#define PCIE_MAX_TRANS_TABLES8 > +#define PCIE_ATR_EN BIT(0) > +#define PCIE_ATR_SIZE(size) \ > + (size) - 1) << 1) & GENMASK(6, 1)) | PCIE_ATR_EN) > +#define PCIE_ATR_ID(id) ((id) & GENMASK(3, 0)) > +#define PCIE_ATR_TYPE_MEMPCIE_ATR_ID(0) > +#define PCIE_ATR_TYPE_IO PCIE_ATR_ID(1) > +#define PCIE_ATR_TLP_TYPE(type) (((type) << 16) & GENMASK(18, > 16)) > +#define PCIE_ATR_TLP_TYPE_MEMPCIE_ATR_TLP_TYPE(0) > +#define PCIE_ATR_TLP_TYPE_IO PCIE_ATR_TLP_TYPE(2) > + >
Re: [PATCH] ASoC: codecs: Fix rumtime PM imbalance in tas2552_probe
> On Wed, Apr 07, 2021 at 02:54:00PM +0800, Dinghao Liu wrote: > > > - pm_runtime_set_active(>dev); > > - pm_runtime_set_autosuspend_delay(>dev, 1000); > > - pm_runtime_use_autosuspend(>dev); > > - pm_runtime_enable(>dev); > > - pm_runtime_mark_last_busy(>dev); > > - pm_runtime_put_sync_autosuspend(>dev); > > - > > dev_set_drvdata(>dev, data); > > > > ret = devm_snd_soc_register_component(>dev, > > @@ -733,6 +726,13 @@ static int tas2552_probe(struct i2c_client *client, > > if (ret < 0) > > dev_err(>dev, "Failed to register component: %d\n", > > ret); > > > > + pm_runtime_set_active(>dev); > > + pm_runtime_set_autosuspend_delay(>dev, 1000); > > + pm_runtime_use_autosuspend(>dev); > > It's not clear to me that just moving the operations after the > registration is a good fix - once the component is registered we could > start trying to do runtime PM operations with it which AFAIR won't count > references and so on properly if runtime PM isn't enabled so if we later > enable runtime PM we might have the rest of the code in a confused state > about what's going on. Thanks for your advice. I checked the use of devm_snd_soc_register_component() in the kernel and found sometimes runtime PM is enabled before registration and sometimes after registration. To be on the safe side, I will send a new patch to fix this in error handling path. Regards, Dinghao
Re: [PATCH v3 03/12] dump_stack: Add vmlinux build ID to stack traces
Quoting Petr Mladek (2021-04-07 06:42:38) > > I think that you need to use something like: > > #ifdef CONFIG_STACKTRACE_BUILD_ID > #define BUILD_ID_FTM " %20phN" > #define BUILD_ID_VAL vmlinux_build_id > #else > #define BUILD_ID_FTM "%s" > #define BUILD_ID_VAL "" > #endif > > printk("%sCPU: %d PID: %d Comm: %.20s %s%s %s %.*s" BUILD_ID_FTM "\n", >log_lvl, raw_smp_processor_id(), current->pid, current->comm, >kexec_crash_loaded() ? "Kdump: loaded " : "", >print_tainted(), >init_utsname()->release, >(int)strcspn(init_utsname()->version, " "), >init_utsname()->version, >BUILD_ID_VAL); > Thanks. I didn't see this warning but I see it now after compiling again. Not sure how I missed this one. I've rolled in this fix as well.
Re: [PATCH] sched/fair: use signed long when compute energy delta in eas
Hi On Wed, Apr 7, 2021 at 10:11 PM Pierre wrote: > > Hi, > > I tested the patch, but the overflow still exists. > > In "sched/fair: Use pd_cache to speed up find_energy_efficient_cpu()" > > I wonder why we recompute the cpu util when cpu==dst_cpu in compute_energy(), > > since when the dst_cpu's util changes, it would also cause the overflow. > > The patches aim to cache the energy values for the CPUs whose > utilization is not modified (so we don't have to compute it multiple > times). The values cached are the 'base values' of the CPUs, i.e. when > the task is not placed on the CPU. When (cpu==dst_cpu) in > compute_energy(), it means the energy values need to be updated instead > of using the cached ones. > Well, would it be better to use task_util(p) + the cached values? But in that case, the cached values may need more parameters. > You are right, there is still a possibility to have a negative delta > with the patches at: > https://gitlab.arm.com/linux-arm/linux-power/-/commits/eas/next/integration-20210129 > Adding a check before subtracting the values, and bailing out in such > a case, would avoid this, such as at: > https://gitlab.arm.com/linux-arm/linux-pg/-/commits/feec_bail_out/ > In your patch, you bail out of the case via "goto fail", which means you don't use EAS in that case. However, in real scenarios this case often occurs when selecting a cpu for a small task. As a result, the small task would not select a cpu according to EAS; might that affect power consumption? > I think a similar modification should be done in your patch. Even though > it is a good idea to group the calls to compute_energy() to reduce the > chances of having updates of utilization values in between the > compute_energy() calls, > there is still a chance to have updates. I think it happened when I > applied your patch. > > About changing the delta(s) from 'unsigned long' to 'long', I am not > sure of the meaning of having a negative delta.
I think it would be > better to check and fail before it happens instead. > > Regards >
Re: [PATCH v4 08/16] KVM: x86/pmu: Add IA32_DS_AREA MSR emulation to manage guest DS buffer
Hi Peter, Thanks for your detailed comments. If you have more comments for other patches, please let me know. On 2021/4/7 23:39, Peter Zijlstra wrote: On Mon, Mar 29, 2021 at 01:41:29PM +0800, Like Xu wrote: @@ -3869,10 +3876,12 @@ static struct perf_guest_switch_msr *intel_guest_get_msrs(int *nr, void *data) if (arr[1].guest) arr[0].guest |= arr[1].guest; - else + else { arr[1].guest = arr[1].host; + arr[2].guest = arr[2].host; + } What's all this gibberish? The way I read that it says: if guest has PEBS_ENABLED guest GLOBAL_CTRL |= PEBS_ENABLED otherwise guest PEBS_ENABLED = host PEBS_ENABLED guest DS_AREA = host DS_AREA which is just completely random garbage afaict. Why would you leak host msrs into the guest? In fact, this is not a leak at all. When we do "arr[i].guest = arr[i].host;" assignment in the intel_guest_get_msrs(), the KVM will check "if (msrs[i].host == msrs[i].guest)" and if so, it disables the atomic switch for this msr during vmx transaction in the caller atomic_switch_perf_msrs(). In that case, the msr value doesn't change and any guest write will be trapped. If the next check is "msrs[i].host != msrs[i].guest", the atomic switch will be triggered again. Compared to before, this part of the logic has not changed, which helps to reduce overhead. Why would you change guest GLOBAL_CTRL implicitly; This is because in the early part of this function, we have operations: if (x86_pmu.flags & PMU_FL_PEBS_ALL) arr[0].guest &= ~cpuc->pebs_enabled; else arr[0].guest &= ~(cpuc->pebs_enabled & PEBS_COUNTER_MASK); and if guest has PEBS_ENABLED, we need these bits back for PEBS counters: arr[0].guest |= arr[1].guest; guest had better wrmsr that himself to control when stuff is enabled. When vm_entry, the msr value of GLOBAL_CTRL on the hardware may be different from trapped value "pmu->global_ctrl" written by the guest. 
If the perf scheduler cross-maps guest counter X to host counter Y, we have to enable bit Y in GLOBAL_CTRL before vm_entry rather than bit X. This just cannot be right.
Re: [PATCH v2 1/1] powerpc/iommu: Enable remaining IOMMU Pagesizes present in LoPAR
Leonardo Bras writes: > According to LoPAR, ibm,query-pe-dma-window output named "IO Page Sizes" > will let the OS know all possible pagesizes that can be used for creating a > new DDW. > > Currently Linux will only try using 3 of the 8 available options: > 4K, 64K and 16M. According to LoPAR, Hypervisor may also offer 32M, 64M, > 128M, 256M and 16G. Do we know of any hardware & hypervisor combination that will actually give us bigger pages? > Enabling bigger pages would be interesting for direct mapping systems > with a lot of RAM, while using less TCE entries. > > Signed-off-by: Leonardo Bras > --- > arch/powerpc/platforms/pseries/iommu.c | 49 ++ > 1 file changed, 42 insertions(+), 7 deletions(-) > > diff --git a/arch/powerpc/platforms/pseries/iommu.c > b/arch/powerpc/platforms/pseries/iommu.c > index 9fc5217f0c8e..6cda1c92597d 100644 > --- a/arch/powerpc/platforms/pseries/iommu.c > +++ b/arch/powerpc/platforms/pseries/iommu.c > @@ -53,6 +53,20 @@ enum { > DDW_EXT_QUERY_OUT_SIZE = 2 > }; A comment saying where the values come from would be good. > +#define QUERY_DDW_PGSIZE_4K 0x01 > +#define QUERY_DDW_PGSIZE_64K 0x02 > +#define QUERY_DDW_PGSIZE_16M 0x04 > +#define QUERY_DDW_PGSIZE_32M 0x08 > +#define QUERY_DDW_PGSIZE_64M 0x10 > +#define QUERY_DDW_PGSIZE_128M0x20 > +#define QUERY_DDW_PGSIZE_256M0x40 > +#define QUERY_DDW_PGSIZE_16G 0x80 I'm not sure the #defines really gain us much vs just putting the literal values in the array below? > +struct iommu_ddw_pagesize { > + u32 mask; > + int shift; > +}; > + > static struct iommu_table_group *iommu_pseries_alloc_group(int node) > { > struct iommu_table_group *table_group; > @@ -1099,6 +1113,31 @@ static void reset_dma_window(struct pci_dev *dev, > struct device_node *par_dn) >ret); > } > > +/* Returns page shift based on "IO Page Sizes" output at > ibm,query-pe-dma-window. 
See LoPAR */ > +static int iommu_get_page_shift(u32 query_page_size) > +{ > + const struct iommu_ddw_pagesize ddw_pagesize[] = { > + { QUERY_DDW_PGSIZE_16G, __builtin_ctz(SZ_16G) }, > + { QUERY_DDW_PGSIZE_256M, __builtin_ctz(SZ_256M) }, > + { QUERY_DDW_PGSIZE_128M, __builtin_ctz(SZ_128M) }, > + { QUERY_DDW_PGSIZE_64M, __builtin_ctz(SZ_64M) }, > + { QUERY_DDW_PGSIZE_32M, __builtin_ctz(SZ_32M) }, > + { QUERY_DDW_PGSIZE_16M, __builtin_ctz(SZ_16M) }, > + { QUERY_DDW_PGSIZE_64K, __builtin_ctz(SZ_64K) }, > + { QUERY_DDW_PGSIZE_4K, __builtin_ctz(SZ_4K) } > + }; cheers
Re: [PATCH v3 12/12] kdump: Use vmlinux_build_id to simplify
Quoting Petr Mladek (2021-04-07 10:03:28) > On Tue 2021-03-30 20:05:20, Stephen Boyd wrote: > > We can use the vmlinux_build_id array here now instead of open coding > > it. This mostly consolidates code. > > > > Cc: Jiri Olsa > > Cc: Alexei Starovoitov > > Cc: Jessica Yu > > Cc: Evan Green > > Cc: Hsin-Yi Wang > > Cc: Dave Young > > Cc: Baoquan He > > Cc: Vivek Goyal > > Cc: > > Signed-off-by: Stephen Boyd > > --- > > include/linux/crash_core.h | 6 +- > > kernel/crash_core.c| 41 ++ > > 2 files changed, 3 insertions(+), 44 deletions(-) > > > > diff --git a/include/linux/crash_core.h b/include/linux/crash_core.h > > index 206bde8308b2..fb8ab99bb2ee 100644 > > --- a/include/linux/crash_core.h > > +++ b/include/linux/crash_core.h > > @@ -39,7 +39,7 @@ phys_addr_t paddr_vmcoreinfo_note(void); > > #define VMCOREINFO_OSRELEASE(value) \ > > vmcoreinfo_append_str("OSRELEASE=%s\n", value) > > #define VMCOREINFO_BUILD_ID(value) \ > > - vmcoreinfo_append_str("BUILD-ID=%s\n", value) > > + vmcoreinfo_append_str("BUILD-ID=%20phN\n", value) > > Please, add also build check that BUILD_ID_MAX == 20. > I added a BUILD_BUG_ON() in kernel/crash_core.c. I tried static_assert() here but got mixed ISO errors from gcc-10, although it feels like it should work. In file included from ./arch/arm64/include/asm/cmpxchg.h:10, from ./arch/arm64/include/asm/atomic.h:16, from ./include/linux/atomic.h:7, from ./include/linux/mm_types_task.h:13, from ./include/linux/mm_types.h:5, from ./include/linux/buildid.h:5, from kernel/crash_core.c:7: kernel/crash_core.c: In function 'crash_save_vmcoreinfo_init': ./include/linux/build_bug.h:78:41: warning: ISO C90 forbids mixed declarations and code [-Wdeclaration-after-statement] 78 | #define __static_assert(expr, msg, ...) _Static_assert(expr, msg) | ^~ ./include/linux/build_bug.h:77:34: note: in expansion of macro '__static_assert' 77 | #define static_assert(expr, ...) 
__static_assert(expr, ##__VA_ARGS__, #expr) | ^~~ ./include/linux/crash_core.h:42:2: note: in expansion of macro 'static_assert' 42 | static_assert(ARRAY_SIZE(value) == BUILD_ID_SIZE_MAX); \ | ^ kernel/crash_core.c:401:2: note: in expansion of macro 'VMCOREINFO_BUILD_ID' 401 | VMCOREINFO_BUILD_ID(vmlinux_build_id); > > The function add_build_id_vmcoreinfo() is used in > crash_save_vmcoreinfo_init() in this context: > > > VMCOREINFO_OSRELEASE(init_uts_ns.name.release); > add_build_id_vmcoreinfo(); > VMCOREINFO_PAGESIZE(PAGE_SIZE); > > VMCOREINFO_SYMBOL(init_uts_ns); > VMCOREINFO_OFFSET(uts_namespace, name); > VMCOREINFO_SYMBOL(node_online_map); > > The function is not longer need. VMCOREINFO_BUILD_ID() > can be used directly: > > VMCOREINFO_OSRELEASE(init_uts_ns.name.release); > VMCOREINFO_BUILD_ID(vmlinux_build_id); > VMCOREINFO_PAGESIZE(PAGE_SIZE); > > VMCOREINFO_SYMBOL(init_uts_ns); > VMCOREINFO_OFFSET(uts_namespace, name); > VMCOREINFO_SYMBOL(node_online_map); > > Thanks. Makes sense. I've rolled that in.
[RFC PATCH] Add split_lock
bit_spinlocks are horrible on RT because there's absolutely nowhere to put the mutex to sleep on. They also do not participate in lockdep because there's nowhere to put the map. Most (all?) bit spinlocks are actually a split lock; logically they could be treated as a single spinlock, but for performance, we want to split the lock over many objects. Introduce the split_lock as somewhere to store the lockdep map and as somewhere that the RT kernel can put a mutex. It may also let us store a ticket lock for better performance on non-RT kernels in the future, but I have left the current cpu_relax() implementation intact for now. The API change breaks all users except for the two which have been converted. This is an RFC, and I'm willing to fix all the rest. Signed-off-by: Matthew Wilcox (Oracle) --- fs/dcache.c | 25 ++-- include/linux/bit_spinlock.h | 36 ++--- include/linux/list_bl.h | 9 include/linux/split_lock.h | 45 mm/slub.c| 6 +++-- 5 files changed, 84 insertions(+), 37 deletions(-) create mode 100644 include/linux/split_lock.h diff --git a/fs/dcache.c b/fs/dcache.c index 7d24ff7eb206..a3861d330001 100644 --- a/fs/dcache.c +++ b/fs/dcache.c @@ -96,6 +96,7 @@ EXPORT_SYMBOL(slash_name); static unsigned int d_hash_shift __read_mostly; +static DEFINE_SPLIT_LOCK(d_hash_lock); static struct hlist_bl_head *dentry_hashtable __read_mostly; static inline struct hlist_bl_head *d_hash(unsigned int hash) @@ -469,9 +470,9 @@ static void ___d_drop(struct dentry *dentry) else b = d_hash(dentry->d_name.hash); - hlist_bl_lock(b); + hlist_bl_lock(b, _hash_lock); __hlist_bl_del(>d_hash); - hlist_bl_unlock(b); + hlist_bl_unlock(b, _hash_lock); } void __d_drop(struct dentry *dentry) @@ -2074,9 +2075,9 @@ static struct dentry *__d_instantiate_anon(struct dentry *dentry, __d_set_inode_and_type(dentry, inode, add_flags); hlist_add_head(>d_u.d_alias, >i_dentry); if (!disconnected) { - hlist_bl_lock(>d_sb->s_roots); + hlist_bl_lock(>d_sb->s_roots, _hash_lock); hlist_bl_add_head(>d_hash, 
>d_sb->s_roots); - hlist_bl_unlock(>d_sb->s_roots); + hlist_bl_unlock(>d_sb->s_roots, _hash_lock); } spin_unlock(>d_lock); spin_unlock(>i_lock); @@ -2513,9 +2514,9 @@ static void __d_rehash(struct dentry *entry) { struct hlist_bl_head *b = d_hash(entry->d_name.hash); - hlist_bl_lock(b); + hlist_bl_lock(b, _hash_lock); hlist_bl_add_head_rcu(>d_hash, b); - hlist_bl_unlock(b); + hlist_bl_unlock(b, _hash_lock); } /** @@ -2606,9 +2607,9 @@ struct dentry *d_alloc_parallel(struct dentry *parent, goto retry; } - hlist_bl_lock(b); + hlist_bl_lock(b, _hash_lock); if (unlikely(READ_ONCE(parent->d_inode->i_dir_seq) != seq)) { - hlist_bl_unlock(b); + hlist_bl_unlock(b, _hash_lock); rcu_read_unlock(); goto retry; } @@ -2626,7 +2627,7 @@ struct dentry *d_alloc_parallel(struct dentry *parent, continue; if (!d_same_name(dentry, parent, name)) continue; - hlist_bl_unlock(b); + hlist_bl_unlock(b, _hash_lock); /* now we can try to grab a reference */ if (!lockref_get_not_dead(>d_lockref)) { rcu_read_unlock(); @@ -2664,7 +2665,7 @@ struct dentry *d_alloc_parallel(struct dentry *parent, new->d_flags |= DCACHE_PAR_LOOKUP; new->d_wait = wq; hlist_bl_add_head_rcu(>d_u.d_in_lookup_hash, b); - hlist_bl_unlock(b); + hlist_bl_unlock(b, _hash_lock); return new; mismatch: spin_unlock(>d_lock); @@ -2677,12 +2678,12 @@ void __d_lookup_done(struct dentry *dentry) { struct hlist_bl_head *b = in_lookup_hash(dentry->d_parent, dentry->d_name.hash); - hlist_bl_lock(b); + hlist_bl_lock(b, _hash_lock); dentry->d_flags &= ~DCACHE_PAR_LOOKUP; __hlist_bl_del(>d_u.d_in_lookup_hash); wake_up_all(dentry->d_wait); dentry->d_wait = NULL; - hlist_bl_unlock(b); + hlist_bl_unlock(b, _hash_lock); INIT_HLIST_NODE(>d_u.d_alias); INIT_LIST_HEAD(>d_lru); } diff --git a/include/linux/bit_spinlock.h b/include/linux/bit_spinlock.h index bbc4730a6505..641623d471b0 100644 --- a/include/linux/bit_spinlock.h +++ b/include/linux/bit_spinlock.h @@ -2,6 +2,7 @@ #ifndef __LINUX_BIT_SPINLOCK_H #define __LINUX_BIT_SPINLOCK_H 
+#include #include #include #include @@ -13,32 +14,23 @@ * Don't use this unless you really need to: spin_lock() and spin_unlock() * are significantly faster. */
Re: [PATCH v6 3/8] regulator: IRQ based event/error notification helpers
Hello Andy, All. On Wed, 2021-04-07 at 16:21 +0300, Andy Shevchenko wrote: > On Wed, Apr 7, 2021 at 1:04 PM Matti Vaittinen > wrote: > > Provide helper function for IC's implementing regulator > > notifications > > when an IRQ fires. The helper also works for IRQs which can not be > > acked. > > Helper can be set to disable the IRQ at handler and then re- > > enabling it > > on delayed work later. The helper also adds > > regulator_get_error_flags() > > errors in cache for the duration of IRQ disabling. > > Thanks for an update, my comments below. After addressing them, feel > free to add > Reviewed-by: Andy Shevchenko > > > Signed-off-by: Matti Vaittinen > > > > static int _regulator_get_error_flags(struct regulator_dev *rdev, > > unsigned int *flags) > > { > > - int ret; > > + int ret, tmpret; > > > > regulator_lock(rdev); > > > > + ret = rdev_get_cached_err_flags(rdev); > > + > > /* sanity check */ > > - if (!rdev->desc->ops->get_error_flags) { > > + if (rdev->desc->ops->get_error_flags) { > > + tmpret = rdev->desc->ops->get_error_flags(rdev, > > flags); > > + if (tmpret > 0) > > + ret |= tmpret; > > Oh, I don't like this. Easy fix is to rename ret (okay, it's been > used > elsewhere, so adding then) to something meaningful, like error_flags, > then you can easily understand that value should be positive and > error > codes are negative. No wonder if this looks hairy. I think I have got this plain wrong. The rdev_get_cached_err_flags() is not updating the flags. Looks like just plain mistake from my side. I think I've mixed the returning flags via parameter and return value. This must be fixed. Well spotted. > + */ > > +void *devm_regulator_irq_helper(struct device *dev, > > + const struct regulator_irq_desc *d, > > int irq, > > + int irq_flags, int common_errs, > > + int *per_rdev_errs, > > + struct regulator_dev **rdev, int > > rdev_amount) > > I didn't get why you need the ** pointer instead of plain pointer. We have an array of pointers. 
And we give a pointer to the first pointer. Maybe it's the lack of coffee but I don't see why a single pointer would be correct? rdev structures are not in contiguous memory, pointers to rdevs are. So we need the address of the first pointer, right? +#include > > +#include > > +#include > > +#include > > Not sure how this header is used. I haven't found any direct users of > it. Perhaps you wanted interrupt.h? Thanks. I think this specific header may be a leftover from first draft where I thought I'll use named IRQs. The header was for of_irq_get_byname(). That ended up as a mess for everything else but platform devices :) I'll check the headers, thanks. > > +#include > > +#include > > +#include > > + Blank line? I would separate group of generic headers with > particular to the subsystem I don't see this being used in regulator subsystem - and to tell the truth, I don't really see the value. > > +#include ... > > + > > +reread: > > + if (d->fatal_cnt && h->retry_cnt > d->fatal_cnt) { > > + if (d->die) > > + ret = d->die(rid); > > + else > > + die_loudly("Regulator HW failure? - no IC > > recovery"); > > + > > + /* > > +* If the 'last resort' IC recovery failed we will > > have > > +* nothing else left to do... > > +*/ > > + if (ret) > > + die_loudly("Regulator HW failure? - IC > > recovery failed"); > > Looking at the above code this will be executed if and only if > d->die() is defined, correct? > In that case, why not > > if (d->die) { > ret = ... > if (ret) >rdev_die_loudly(...); > } else >rdev_die_loudly(...); > > ? I think this should simply be: if (!d->die) die_loudly("Regulator HW failure? - no IC recovery"); ret = d->die(rdev); if (ret) die_loudly(...); ... > > +static void init_rdev_errors(struct regulator_irq *h) > > +{ > > + int i; > > + > > + for (i = 0; i < h->rdata.num_states; i++) { > > + if (h->rdata.states[i].possible_errs) > > + /* Can we trust writing this boolean is > > atomic? */ > > No.
boolean is compiler / platform specific and it may potentially > be written in a non-atomic way. Hmm.. I don't think this really is a problem here. We only use the use_cached_err for true/false evaluation - and if the error-getting API is called after the boolean is changed - then cached error is used, if before, then it is not used. Even if the value of the boolean was read in the middle of writing it, it will still evaluate either true or false - there is no 'maybe' state :) My point, I guess we can do the change without
Re: [PATCH net v4] atl1c: move tx cleanup processing out of interrupt
On 2021-04-07 19:55, Eric Dumazet wrote: On 4/6/21 4:49 PM, Gatis Peisenieks wrote: Tx queue cleanup happens in interrupt handler on same core as rx queue processing. Both can take considerable amount of processing in high packet-per-second scenarios. Sending big amounts of packets can stall the rx processing which is unfair and also can lead to out-of-memory condition since __dev_kfree_skb_irq queues the skbs for later kfree in softirq which is not allowed to happen with heavy load in interrupt handler. [ ... ] diff --git a/net/core/dev.c b/net/core/dev.c index 0f72ff5d34ba..489ac60b530c 100644 --- a/net/core/dev.c +++ b/net/core/dev.c @@ -6789,6 +6789,7 @@ int dev_set_threaded(struct net_device *dev, bool threaded) return err; } +EXPORT_SYMBOL(dev_set_threaded); void netif_napi_add(struct net_device *dev, struct napi_struct *napi, int (*poll)(struct napi_struct *, int), int weight) This has already been done in net-next Please base your patch on top of net-next, this can not be backported to old versions anyway, without some amount of pain. Thank you Eric, for heads up, the v5 patch sent for net-next in response to David Miller comment already does that.
Re: [PATCH][next] scsi: pm80xx: Fix potential infinite loop
On Wed, Apr 7, 2021 at 7:18 PM Martin K. Petersen wrote: > > > Hi Colin! > > > The for-loop iterates with a u8 loop counter i and compares this with > > the loop upper limit of pm8001_ha->max_q_num which is a u32 type. > > There is a potential infinite loop if pm8001_ha->max_q_num is larger > > than the u8 loop counter. Fix this by making the loop counter the same > > type as pm8001_ha->max_q_num. > > No particular objections to the patch for future-proofing. However, as > far as I can tell max_q_num is capped at 64 (PM8001_MAX_MSIX_VEC). Exactly. > > -- > Martin K. Petersen Oracle Linux Engineering
Re: [RFC/RFT PATCH 0/3] arm64: drop pfn_valid_within() and simplify pfn_valid()
Adding James here. + James Morse On 4/7/21 10:56 PM, Mike Rapoport wrote: > From: Mike Rapoport > > Hi, > > These patches aim to remove CONFIG_HOLES_IN_ZONE and essentially hardwire > pfn_valid_within() to 1. That would be really great for arm64 platform as it will save CPU cycles on many generic MM paths, given that our pfn_valid() has been expensive. > > The idea is to mark NOMAP pages as reserved in the memory map and restore Though I am not really sure, would that possibly be problematic for UEFI/EFI use cases as it might have just treated them as normal struct pages till now. > the intended semantics of pfn_valid() to designate availability of struct > page for a pfn. Right, that would be better as the current semantics is not ideal. > > With this the core mm will be able to cope with the fact that it cannot use > NOMAP pages and the holes created by NOMAP ranges within MAX_ORDER blocks > will be treated correctly even without the need for pfn_valid_within. > > The patches are only boot tested on qemu-system-aarch64 so I'd really > appreciate memory stress tests on real hardware. Did some preliminary memory stress tests on a guest with portions of memory marked as MEMBLOCK_NOMAP and did not find any obvious problem. But this might require some testing on real UEFI environment with firmware using MEMBLOCK_NOMAP memory to make sure that changing these struct pages to PageReserved() is safe. > > If this actually works we'll be one step closer to drop custom pfn_valid() > on arm64 altogether. Right, planning to rework and respin the RFC originally sent last month. https://patchwork.kernel.org/project/linux-mm/patch/1615174073-10520-1-git-send-email-anshuman.khand...@arm.com/
Re: [PATCH] arm64: dts: qcom: Move rmtfs memory region
Hey Sujit, Thanks for the patch. On 2021-03-30 07:16, Sujit Kautkar wrote: Move rmtfs memory region so that it does not overlap with system RAM (kernel data) when KAsan is enabled. This puts rmtfs right after mba_mem which is not supposed to increase beyond 0x9460 Signed-off-by: Sujit Kautkar --- arch/arm64/boot/dts/qcom/sc7180-trogdor.dtsi | 2 +- arch/arm64/boot/dts/qcom/sc7180.dtsi | 4 ++-- 2 files changed, 3 insertions(+), 3 deletions(-) diff --git a/arch/arm64/boot/dts/qcom/sc7180-trogdor.dtsi b/arch/arm64/boot/dts/qcom/sc7180-trogdor.dtsi index 07c8b2c926c0..fe052b477b72 100644 --- a/arch/arm64/boot/dts/qcom/sc7180-trogdor.dtsi +++ b/arch/arm64/boot/dts/qcom/sc7180-trogdor.dtsi @@ -45,7 +45,7 @@ trips { /* Increase the size from 2MB to 8MB */ _mem { - reg = <0x0 0x8440 0x0 0x80>; + reg = <0x0 0x9460 0x0 0x80>; Sorry for the late comments. Can you please do the same for sc7180-idp as well? Reviewed-by: Sibi Sankar }; / { diff --git a/arch/arm64/boot/dts/qcom/sc7180.dtsi b/arch/arm64/boot/dts/qcom/sc7180.dtsi index 1ea3344ab62c..ac956488908f 100644 --- a/arch/arm64/boot/dts/qcom/sc7180.dtsi +++ b/arch/arm64/boot/dts/qcom/sc7180.dtsi @@ -110,9 +110,9 @@ tz_mem: memory@80b0 { no-map; }; - rmtfs_mem: memory@8440 { + rmtfs_mem: memory@9460 { compatible = "qcom,rmtfs-mem"; - reg = <0x0 0x8440 0x0 0x20>; + reg = <0x0 0x9460 0x0 0x20>; no-map; qcom,client-id = <1>; -- Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum, a Linux Foundation Collaborative Project.
Re: [RFC/RFT PATCH 1/3] memblock: update initialization of reserved pages
On 4/7/21 10:56 PM, Mike Rapoport wrote: > From: Mike Rapoport > > The struct pages representing a reserved memory region are initialized > using reserve_bootmem_range() function. This function is called for each > reserved region just before the memory is freed from memblock to the buddy > page allocator. > > The struct pages for MEMBLOCK_NOMAP regions are kept with the default > values set by the memory map initialization which makes it necessary to > have a special treatment for such pages in pfn_valid() and > pfn_valid_within(). > > Split out initialization of the reserved pages to a function with a > meaningful name and treat the MEMBLOCK_NOMAP regions the same way as the > reserved regions and mark struct pages for the NOMAP regions as > PageReserved. This would definitely need updating the comment for MEMBLOCK_NOMAP definition in include/linux/memblock.h just to make the semantics is clear, though arm64 is currently the only user for MEMBLOCK_NOMAP. > > Signed-off-by: Mike Rapoport > --- > mm/memblock.c | 23 +-- > 1 file changed, 21 insertions(+), 2 deletions(-) > > diff --git a/mm/memblock.c b/mm/memblock.c > index afaefa8fc6ab..6b7ea9d86310 100644 > --- a/mm/memblock.c > +++ b/mm/memblock.c > @@ -2002,6 +2002,26 @@ static unsigned long __init > __free_memory_core(phys_addr_t start, > return end_pfn - start_pfn; > } > > +static void __init memmap_init_reserved_pages(void) > +{ > + struct memblock_region *region; > + phys_addr_t start, end; > + u64 i; > + > + /* initialize struct pages for the reserved regions */ > + for_each_reserved_mem_range(i, , ) > + reserve_bootmem_region(start, end); > + > + /* and also treat struct pages for the NOMAP regions as PageReserved */ > + for_each_mem_region(region) { > + if (memblock_is_nomap(region)) { > + start = region->base; > + end = start + region->size; > + reserve_bootmem_region(start, end); > + } > + } > +} > + > static unsigned long __init free_low_memory_core_early(void) > { > unsigned long count = 0; > @@ 
-2010,8 +2030,7 @@ static unsigned long __init > free_low_memory_core_early(void) > > memblock_clear_hotplug(0, -1); > > - for_each_reserved_mem_range(i, , ) > - reserve_bootmem_region(start, end); > + memmap_init_reserved_pages(); > > /* >* We need to use NUMA_NO_NODE instead of NODE_DATA(0)->node_id >
Re: [RFC/RFT PATCH 2/3] arm64: decouple check whether pfn is normal memory from pfn_valid()
On 4/7/21 10:56 PM, Mike Rapoport wrote: > From: Mike Rapoport > > The intended semantics of pfn_valid() is to verify whether there is a > struct page for the pfn in question and nothing else. Should there be a comment affirming this semantics interpretation, above the generic pfn_valid() in include/linux/mmzone.h ? > > Yet, on arm64 it is used to distinguish memory areas that are mapped in the > linear map vs those that require ioremap() to access them. > > Introduce a dedicated pfn_is_memory() to perform such check and use it > where appropriate. > > Signed-off-by: Mike Rapoport > --- > arch/arm64/include/asm/memory.h | 2 +- > arch/arm64/include/asm/page.h | 1 + > arch/arm64/kvm/mmu.c| 2 +- > arch/arm64/mm/init.c| 6 ++ > arch/arm64/mm/ioremap.c | 4 ++-- > arch/arm64/mm/mmu.c | 2 +- > 6 files changed, 12 insertions(+), 5 deletions(-) > > diff --git a/arch/arm64/include/asm/memory.h b/arch/arm64/include/asm/memory.h > index 0aabc3be9a75..7e77fdf71b9d 100644 > --- a/arch/arm64/include/asm/memory.h > +++ b/arch/arm64/include/asm/memory.h > @@ -351,7 +351,7 @@ static inline void *phys_to_virt(phys_addr_t x) > > #define virt_addr_valid(addr)({ > \ > __typeof__(addr) __addr = __tag_reset(addr);\ > - __is_lm_address(__addr) && pfn_valid(virt_to_pfn(__addr)); \ > + __is_lm_address(__addr) && pfn_is_memory(virt_to_pfn(__addr)); \ > }) > > void dump_mem_limit(void); > diff --git a/arch/arm64/include/asm/page.h b/arch/arm64/include/asm/page.h > index 012cffc574e8..32b485bcc6ff 100644 > --- a/arch/arm64/include/asm/page.h > +++ b/arch/arm64/include/asm/page.h > @@ -38,6 +38,7 @@ void copy_highpage(struct page *to, struct page *from); > typedef struct page *pgtable_t; > > extern int pfn_valid(unsigned long); > +extern int pfn_is_memory(unsigned long); > > #include > > diff --git a/arch/arm64/kvm/mmu.c b/arch/arm64/kvm/mmu.c > index 8711894db8c2..ad2ea65a3937 100644 > --- a/arch/arm64/kvm/mmu.c > +++ b/arch/arm64/kvm/mmu.c > @@ -85,7 +85,7 @@ void kvm_flush_remote_tlbs(struct 
kvm *kvm) > > static bool kvm_is_device_pfn(unsigned long pfn) > { > - return !pfn_valid(pfn); > + return !pfn_is_memory(pfn); > } > > /* > diff --git a/arch/arm64/mm/init.c b/arch/arm64/mm/init.c > index 3685e12aba9b..258b1905ed4a 100644 > --- a/arch/arm64/mm/init.c > +++ b/arch/arm64/mm/init.c > @@ -258,6 +258,12 @@ int pfn_valid(unsigned long pfn) > } > EXPORT_SYMBOL(pfn_valid); > > +int pfn_is_memory(unsigned long pfn) > +{ > + return memblock_is_map_memory(PFN_PHYS(pfn)); > +} > +EXPORT_SYMBOL(pfn_is_memory);> + Should not this be generic though ? There is nothing platform or arm64 specific in here. Wondering as pfn_is_memory() just indicates that the pfn is linear mapped, should not it be renamed as pfn_is_linear_memory() instead ? Regardless, it's fine either way. > static phys_addr_t memory_limit = PHYS_ADDR_MAX; > > /* > diff --git a/arch/arm64/mm/ioremap.c b/arch/arm64/mm/ioremap.c > index b5e83c46b23e..82a369b22ef5 100644 > --- a/arch/arm64/mm/ioremap.c > +++ b/arch/arm64/mm/ioremap.c > @@ -43,7 +43,7 @@ static void __iomem *__ioremap_caller(phys_addr_t > phys_addr, size_t size, > /* >* Don't allow RAM to be mapped. >*/ > - if (WARN_ON(pfn_valid(__phys_to_pfn(phys_addr > + if (WARN_ON(pfn_is_memory(__phys_to_pfn(phys_addr > return NULL; > > area = get_vm_area_caller(size, VM_IOREMAP, caller); > @@ -84,7 +84,7 @@ EXPORT_SYMBOL(iounmap); > void __iomem *ioremap_cache(phys_addr_t phys_addr, size_t size) > { > /* For normal memory we already have a cacheable mapping. 
*/ > - if (pfn_valid(__phys_to_pfn(phys_addr))) > + if (pfn_is_memory(__phys_to_pfn(phys_addr))) > return (void __iomem *)__phys_to_virt(phys_addr); > > return __ioremap_caller(phys_addr, size, __pgprot(PROT_NORMAL), > diff --git a/arch/arm64/mm/mmu.c b/arch/arm64/mm/mmu.c > index 5d9550fdb9cf..038d20fe163f 100644 > --- a/arch/arm64/mm/mmu.c > +++ b/arch/arm64/mm/mmu.c > @@ -81,7 +81,7 @@ void set_swapper_pgd(pgd_t *pgdp, pgd_t pgd) > pgprot_t phys_mem_access_prot(struct file *file, unsigned long pfn, > unsigned long size, pgprot_t vma_prot) > { > - if (!pfn_valid(pfn)) > + if (!pfn_is_memory(pfn)) > return pgprot_noncached(vma_prot); > else if (file->f_flags & O_SYNC) > return pgprot_writecombine(vma_prot); >
Re: [RFC/RFT PATCH 3/3] arm64: drop pfn_valid_within() and simplify pfn_valid()
On 4/7/21 10:56 PM, Mike Rapoport wrote: > From: Mike Rapoport > > The arm64's version of pfn_valid() differs from the generic because of two > reasons: > > * Parts of the memory map are freed during boot. This makes it necessary to > verify that there is actual physical memory that corresponds to a pfn > which is done by querying memblock. > > * There are NOMAP memory regions. These regions are not mapped in the > linear map and until the previous commit the struct pages representing > these areas had default values. > > As the consequence of absence of the special treatment of NOMAP regions in > the memory map it was necessary to use memblock_is_map_memory() in > pfn_valid() and to have pfn_valid_within() aliased to pfn_valid() so that > generic mm functionality would not treat a NOMAP page as a normal page. > > Since the NOMAP regions are now marked as PageReserved(), pfn walkers and > the rest of core mm will treat them as unusable memory and thus > pfn_valid_within() is no longer required at all and can be disabled by > removing CONFIG_HOLES_IN_ZONE on arm64. But what about the memory map that are freed during boot (mentioned above). Would not they still cause CONFIG_HOLES_IN_ZONE to be applicable and hence pfn_valid_within() ? > > pfn_valid() can be slightly simplified by replacing > memblock_is_map_memory() with memblock_is_memory(). Just to understand this better, pfn_valid() will now return true for all MEMBLOCK_NOMAP based memory but that is okay as core MM would still ignore them as unusable memory for being PageReserved(). 
> > Signed-off-by: Mike Rapoport > --- > arch/arm64/Kconfig | 3 --- > arch/arm64/mm/init.c | 4 ++-- > 2 files changed, 2 insertions(+), 5 deletions(-) > > diff --git a/arch/arm64/Kconfig b/arch/arm64/Kconfig > index e4e1b6550115..58e439046d05 100644 > --- a/arch/arm64/Kconfig > +++ b/arch/arm64/Kconfig > @@ -1040,9 +1040,6 @@ config NEED_PER_CPU_EMBED_FIRST_CHUNK > def_bool y > depends on NUMA > > -config HOLES_IN_ZONE > - def_bool y > - > source "kernel/Kconfig.hz" > > config ARCH_SPARSEMEM_ENABLE > diff --git a/arch/arm64/mm/init.c b/arch/arm64/mm/init.c > index 258b1905ed4a..bb6dd406b1f0 100644 > --- a/arch/arm64/mm/init.c > +++ b/arch/arm64/mm/init.c > @@ -243,7 +243,7 @@ int pfn_valid(unsigned long pfn) > > /* >* ZONE_DEVICE memory does not have the memblock entries. > - * memblock_is_map_memory() check for ZONE_DEVICE based > + * memblock_is_memory() check for ZONE_DEVICE based >* addresses will always fail. Even the normal hotplugged >* memory will never have MEMBLOCK_NOMAP flag set in their >* memblock entries. Skip memblock search for all non early > @@ -254,7 +254,7 @@ int pfn_valid(unsigned long pfn) > return pfn_section_valid(ms, pfn); > } > #endif > - return memblock_is_map_memory(addr); > + return memblock_is_memory(addr); > } > EXPORT_SYMBOL(pfn_valid); > >
Re: [syzbot] INFO: task hung in io_ring_exit_work
Hello, syzbot has tested the proposed patch but the reproducer is still triggering an issue: INFO: task hung in io_ring_exit_work INFO: task kworker/u4:0:9 blocked for more than 143 seconds. Not tainted 5.12.0-rc2-syzkaller #0 "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. task:kworker/u4:0state:D stack:26336 pid:9 ppid: 2 flags:0x4000 Workqueue: events_unbound io_ring_exit_work Call Trace: context_switch kernel/sched/core.c:4324 [inline] __schedule+0x911/0x21b0 kernel/sched/core.c:5075 schedule+0xcf/0x270 kernel/sched/core.c:5154 schedule_timeout+0x1db/0x250 kernel/time/timer.c:1868 do_wait_for_common kernel/sched/completion.c:85 [inline] __wait_for_common kernel/sched/completion.c:106 [inline] wait_for_common kernel/sched/completion.c:117 [inline] wait_for_completion+0x168/0x270 kernel/sched/completion.c:138 io_ring_exit_work+0x4e8/0x12d0 fs/io_uring.c:8611 process_one_work+0x98d/0x1600 kernel/workqueue.c:2275 worker_thread+0x64c/0x1120 kernel/workqueue.c:2421 kthread+0x3b1/0x4a0 kernel/kthread.c:292 ret_from_fork+0x1f/0x30 arch/x86/entry/entry_64.S:294 INFO: task kworker/u4:1:25 blocked for more than 144 seconds. Not tainted 5.12.0-rc2-syzkaller #0 "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. 
task:kworker/u4:1state:D stack:25312 pid: 25 ppid: 2 flags:0x4000 Workqueue: events_unbound io_ring_exit_work Call Trace: context_switch kernel/sched/core.c:4324 [inline] __schedule+0x911/0x21b0 kernel/sched/core.c:5075 schedule+0xcf/0x270 kernel/sched/core.c:5154 schedule_timeout+0x1db/0x250 kernel/time/timer.c:1868 do_wait_for_common kernel/sched/completion.c:85 [inline] __wait_for_common kernel/sched/completion.c:106 [inline] wait_for_common kernel/sched/completion.c:117 [inline] wait_for_completion+0x168/0x270 kernel/sched/completion.c:138 io_ring_exit_work+0x4e8/0x12d0 fs/io_uring.c:8611 process_one_work+0x98d/0x1600 kernel/workqueue.c:2275 worker_thread+0x64c/0x1120 kernel/workqueue.c:2421 kthread+0x3b1/0x4a0 kernel/kthread.c:292 ret_from_fork+0x1f/0x30 arch/x86/entry/entry_64.S:294 INFO: task kworker/u4:3:110 blocked for more than 145 seconds. Not tainted 5.12.0-rc2-syzkaller #0 "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. task:kworker/u4:3state:D stack:23608 pid: 110 ppid: 2 flags:0x4000 Workqueue: events_unbound io_ring_exit_work Call Trace: context_switch kernel/sched/core.c:4324 [inline] __schedule+0x911/0x21b0 kernel/sched/core.c:5075 schedule+0xcf/0x270 kernel/sched/core.c:5154 schedule_timeout+0x1db/0x250 kernel/time/timer.c:1868 do_wait_for_common kernel/sched/completion.c:85 [inline] __wait_for_common kernel/sched/completion.c:106 [inline] wait_for_common kernel/sched/completion.c:117 [inline] wait_for_completion+0x168/0x270 kernel/sched/completion.c:138 io_ring_exit_work+0x4e8/0x12d0 fs/io_uring.c:8611 process_one_work+0x98d/0x1600 kernel/workqueue.c:2275 worker_thread+0x64c/0x1120 kernel/workqueue.c:2421 kthread+0x3b1/0x4a0 kernel/kthread.c:292 ret_from_fork+0x1f/0x30 arch/x86/entry/entry_64.S:294 INFO: task kworker/u4:4:185 blocked for more than 145 seconds. Not tainted 5.12.0-rc2-syzkaller #0 "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. 
task:kworker/u4:4state:D stack:25584 pid: 185 ppid: 2 flags:0x4000 Workqueue: events_unbound io_ring_exit_work Call Trace: context_switch kernel/sched/core.c:4324 [inline] __schedule+0x911/0x21b0 kernel/sched/core.c:5075 schedule+0xcf/0x270 kernel/sched/core.c:5154 schedule_timeout+0x1db/0x250 kernel/time/timer.c:1868 do_wait_for_common kernel/sched/completion.c:85 [inline] __wait_for_common kernel/sched/completion.c:106 [inline] wait_for_common kernel/sched/completion.c:117 [inline] wait_for_completion+0x168/0x270 kernel/sched/completion.c:138 io_ring_exit_work+0x4e8/0x12d0 fs/io_uring.c:8611 process_one_work+0x98d/0x1600 kernel/workqueue.c:2275 worker_thread+0x64c/0x1120 kernel/workqueue.c:2421 kthread+0x3b1/0x4a0 kernel/kthread.c:292 ret_from_fork+0x1f/0x30 arch/x86/entry/entry_64.S:294 Showing all locks held in the system: 2 locks held by kworker/u4:0/9: #0: 88800fc69138 ((wq_completion)events_unbound){+.+.}-{0:0}, at: arch_atomic64_set arch/x86/include/asm/atomic64_64.h:34 [inline] #0: 88800fc69138 ((wq_completion)events_unbound){+.+.}-{0:0}, at: atomic64_set include/asm-generic/atomic-instrumented.h:856 [inline] #0: 88800fc69138 ((wq_completion)events_unbound){+.+.}-{0:0}, at: atomic_long_set include/asm-generic/atomic-long.h:41 [inline] #0: 88800fc69138 ((wq_completion)events_unbound){+.+.}-{0:0}, at: set_work_data kernel/workqueue.c:616 [inline] #0: 88800fc69138 ((wq_completion)events_unbound){+.+.}-{0:0}, at: set_work_pool_and_clear_pending kernel/workqueue.c:643 [inline] #0: 88800fc69138 ((wq_completion)events_unbound){+.+.}-{0:0}, at: process_one_work+0x871/0x1600
Re: [PATCH 2/2] powerpc: make 'boot_text_mapped' static
On 08/04/2021 at 03:18, Yu Kuai wrote: The sparse tool complains as follows: arch/powerpc/kernel/btext.c:48:5: warning: symbol 'boot_text_mapped' was not declared. Should it be static? This symbol is not used outside of btext.c, so this commit makes it static. Signed-off-by: Yu Kuai --- arch/powerpc/kernel/btext.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/arch/powerpc/kernel/btext.c b/arch/powerpc/kernel/btext.c index 359d0f4ca532..8df9230be6fa 100644 --- a/arch/powerpc/kernel/btext.c +++ b/arch/powerpc/kernel/btext.c @@ -45,7 +45,7 @@ unsigned long disp_BAT[2] __initdata = {0, 0}; static unsigned char vga_font[cmapsz]; -int boot_text_mapped __force_data = 0; +static int boot_text_mapped __force_data; Are you sure the initialisation to 0 can be removed? Usually initialisation to 0 is not needed because uninitialised variables go in the BSS section, which is zeroed at startup. But here the variable is flagged with __force_data, so it is not going in the BSS section. extern void rmci_on(void); extern void rmci_off(void);
Re: [PATCH 1/2] powerpc: remove set but not used variable 'force_printk_to_btext'
Le 08/04/2021 à 03:18, Yu Kuai a écrit : Fixes gcc '-Wunused-but-set-variable' warning: arch/powerpc/kernel/btext.c:49:12: error: 'force_printk_to_btext' defined but not used. You don't get this error as it is now. You will get this error only if you make it 'static', which is what you did in your first patch based on the 'sparse' report. When removing a non static variable, you should explain that you can remove it after you have verified that it is nowhere used, neither in that file nor in any other one. It is never used, and so can be removed. Signed-off-by: Yu Kuai --- arch/powerpc/kernel/btext.c | 1 - 1 file changed, 1 deletion(-) diff --git a/arch/powerpc/kernel/btext.c b/arch/powerpc/kernel/btext.c index 803c2a45b22a..359d0f4ca532 100644 --- a/arch/powerpc/kernel/btext.c +++ b/arch/powerpc/kernel/btext.c @@ -46,7 +46,6 @@ unsigned long disp_BAT[2] __initdata = {0, 0}; static unsigned char vga_font[cmapsz]; int boot_text_mapped __force_data = 0; -int force_printk_to_btext = 0; extern void rmci_on(void); extern void rmci_off(void);
[PATCH] Revert "drm/syncobj: use dma_fence_get_stub"
From: David Stevens This reverts commit 86bbd89d5da66fe760049ad3f04adc407ec0c4d6. Using the singleton stub fence in drm_syncobj_assign_null_handle means that all syncobjs created in an already signaled state or any syncobjs signaled by userspace will reference the singleton fence when exported to a sync_file. If those sync_files are queried with SYNC_IOC_FILE_INFO, then the timestamp_ns value returned will correspond to whenever the singleton stub fence was first initialized. This can break the ability of userspace to use timestamps of these fences, as the singleton stub fence's timestamp bears no relationship to any meaningful event. Signed-off-by: David Stevens --- drivers/gpu/drm/drm_syncobj.c | 58 ++- 1 file changed, 44 insertions(+), 14 deletions(-) diff --git a/drivers/gpu/drm/drm_syncobj.c b/drivers/gpu/drm/drm_syncobj.c index 349146049849..7cc11f1a83f4 100644 --- a/drivers/gpu/drm/drm_syncobj.c +++ b/drivers/gpu/drm/drm_syncobj.c @@ -211,6 +211,21 @@ struct syncobj_wait_entry { static void syncobj_wait_syncobj_func(struct drm_syncobj *syncobj, struct syncobj_wait_entry *wait); +struct drm_syncobj_stub_fence { + struct dma_fence base; + spinlock_t lock; +}; + +static const char *drm_syncobj_stub_fence_get_name(struct dma_fence *fence) +{ + return "syncobjstub"; +} + +static const struct dma_fence_ops drm_syncobj_stub_fence_ops = { + .get_driver_name = drm_syncobj_stub_fence_get_name, + .get_timeline_name = drm_syncobj_stub_fence_get_name, +}; + /** * drm_syncobj_find - lookup and reference a sync object. * @file_private: drm file private pointer @@ -344,18 +359,24 @@ void drm_syncobj_replace_fence(struct drm_syncobj *syncobj, } EXPORT_SYMBOL(drm_syncobj_replace_fence); -/** - * drm_syncobj_assign_null_handle - assign a stub fence to the sync object - * @syncobj: sync object to assign the fence on - * - * Assign a already signaled stub fence to the sync object. 
- */ -static void drm_syncobj_assign_null_handle(struct drm_syncobj *syncobj) +static int drm_syncobj_assign_null_handle(struct drm_syncobj *syncobj) { - struct dma_fence *fence = dma_fence_get_stub(); + struct drm_syncobj_stub_fence *fence; - drm_syncobj_replace_fence(syncobj, fence); - dma_fence_put(fence); + fence = kzalloc(sizeof(*fence), GFP_KERNEL); + if (fence == NULL) + return -ENOMEM; + + spin_lock_init(&fence->lock); + dma_fence_init(&fence->base, &drm_syncobj_stub_fence_ops, + &fence->lock, 0, 0); + dma_fence_signal(&fence->base); + + drm_syncobj_replace_fence(syncobj, &fence->base); + + dma_fence_put(&fence->base); + + return 0; } /* 5s default for wait submission */ @@ -469,6 +490,7 @@ EXPORT_SYMBOL(drm_syncobj_free); int drm_syncobj_create(struct drm_syncobj **out_syncobj, uint32_t flags, struct dma_fence *fence) { + int ret; struct drm_syncobj *syncobj; syncobj = kzalloc(sizeof(struct drm_syncobj), GFP_KERNEL); @@ -479,8 +501,13 @@ int drm_syncobj_create(struct drm_syncobj **out_syncobj, uint32_t flags, INIT_LIST_HEAD(&syncobj->cb_list); spin_lock_init(&syncobj->lock); - if (flags & DRM_SYNCOBJ_CREATE_SIGNALED) - drm_syncobj_assign_null_handle(syncobj); + if (flags & DRM_SYNCOBJ_CREATE_SIGNALED) { + ret = drm_syncobj_assign_null_handle(syncobj); + if (ret < 0) { + drm_syncobj_put(syncobj); + return ret; + } + } if (fence) drm_syncobj_replace_fence(syncobj, fence); @@ -1322,8 +1349,11 @@ drm_syncobj_signal_ioctl(struct drm_device *dev, void *data, if (ret < 0) return ret; - for (i = 0; i < args->count_handles; i++) - drm_syncobj_assign_null_handle(syncobjs[i]); + for (i = 0; i < args->count_handles; i++) { + ret = drm_syncobj_assign_null_handle(syncobjs[i]); + if (ret < 0) + break; + } drm_syncobj_array_free(syncobjs, args->count_handles); -- 2.31.0.208.g409f899ff0-goog
Re: linux-next: build failure after merge of the bluetooth tree
Hi Luiz, On Thu, 8 Apr 2021 04:47:04 + "Von Dentz, Luiz" wrote: > > I'd leave this for Marcel to comments, but there are quite many > instances of // comment like that, so I wonder what is going on, or > perhaps that is not allowed in include/uapi? We only do these standalone compile checks on the uapi header files. -- Cheers, Stephen Rothwell
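The standalone check Stephen mentions compiles each exported uapi header on its own, so userspace-visible breakage (such as a comment style a strict consumer rejects) shows up immediately. The idea can be reproduced with plain gcc; the flags below are illustrative, not the exact set linux-next uses:

```shell
# A header that is valid ISO C90 passes a standalone syntax-only compile.
cat > clean.h <<'EOF'
/* ISO C90 comment style */
struct example { int a; };
EOF
gcc -std=c90 -pedantic-errors -fsyntax-only -include clean.h -x c /dev/null \
    && echo "clean.h: OK"

# A // comment is rejected in strict C90 mode -- the kind of breakage a
# stray C++-style comment in a uapi header can cause for some consumers.
cat > cxxstyle.h <<'EOF'
// C++-style comment
struct example { int a; };
EOF
gcc -std=c90 -pedantic-errors -fsyntax-only -include cxxstyle.h -x c /dev/null \
    2>/dev/null || echo "cxxstyle.h: rejected"
```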
Re: [PATCH-next] powerpc/interrupt: Remove duplicate header file
Le 08/04/2021 à 05:56, johnny.che...@huawei.com a écrit : From: Chen Yi Delete one of the header files that are included twice. Guys, we have been flooded with such tiny patches over the last weeks, some changes being sent several times by different people. That one is included in https://patchwork.ozlabs.org/project/linuxppc-dev/patch/20210323062916.295346-1-wanjiab...@vivo.com/ And was already submitted a few hours earlier by someone else: https://patchwork.ozlabs.org/project/linuxppc-dev/patch/1616464656-59372-1-git-send-email-zhouchuan...@vivo.com/ Could you work all together and cook an overall patch including all duplicate removal from arch/powerpc/ files ? Best way would be I think to file an issue at https://github.com/linuxppc/issues/issues , then you do a complete analysis and list in the issue all places to be modified, then once the analysis is complete you send a full single patch. Thanks Christophe Signed-off-by: Chen Yi --- arch/powerpc/kernel/interrupt.c | 1 - 1 file changed, 1 deletion(-) diff --git a/arch/powerpc/kernel/interrupt.c b/arch/powerpc/kernel/interrupt.c index c4dd4b8f9cfa..f64ace0208b7 100644 --- a/arch/powerpc/kernel/interrupt.c +++ b/arch/powerpc/kernel/interrupt.c @@ -7,7 +7,6 @@ #include #include #include -#include #include #include #include
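For the tree-wide analysis Christophe asks for, the kernel already carries scripts/checkincludes.pl, whose core job is reporting any header included twice in the same file. That core can be approximated with a one-liner (the file below is a stand-in; the header names are illustrative, not the actual duplicate in interrupt.c):

```shell
# Stand-in source file with a duplicated include.
cat > interrupt.c <<'EOF'
#include <linux/context_tracking.h>
#include <linux/err.h>
#include <linux/compat.h>
#include <linux/err.h>
EOF

# Print any #include line that occurs more than once in the file --
# essentially what scripts/checkincludes.pl reports per file.
grep '^#include' interrupt.c | sort | uniq -d
```

Running the in-tree script over arch/powerpc/ would give the complete list the maintainers want folded into one patch.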
Re: [PATCH v1] usb: dwc3: core: Add shutdown callback for dwc3
On 3/30/2021 7:02 PM, Greg Kroah-Hartman wrote: On Tue, Mar 30, 2021 at 06:18:43PM +0530, Sai Prakash Ranjan wrote: On 2021-03-30 16:46, Greg Kroah-Hartman wrote: On Tue, Mar 30, 2021 at 03:25:58PM +0530, Sai Prakash Ranjan wrote: On 2021-03-30 14:37, Greg Kroah-Hartman wrote: On Tue, Mar 30, 2021 at 02:12:04PM +0530, Sandeep Maheswaram wrote: On 3/26/2021 7:07 PM, Greg Kroah-Hartman wrote: On Wed, Mar 24, 2021 at 12:57:32AM +0530, Sandeep Maheswaram wrote: This patch adds a shutdown callback to USB DWC core driver to ensure that it is properly shutdown in reboot/shutdown path. This is required where SMMU address translation is enabled like on SC7180 SoC and few others. If the hardware is still accessing memory after SMMU translation is disabled as part of SMMU shutdown callback in system reboot or shutdown path, then IOVAs(I/O virtual address) which it was using will go on the bus as the physical addresses which might result in unknown crashes (NoC/interconnect errors). Previously this was added in dwc3 qcom glue driver. https://patchwork.kernel.org/project/linux-arm-msm/list/?series=382449 But observed kernel panic as glue driver shutdown getting called after iommu shutdown. As we are adding iommu nodes in dwc core node in device tree adding shutdown callback in core driver seems correct. So shouldn't you also remove this from the qcom glue driver at the same time? Please submit both as a patch series. thanks, greg k-h Hi Greg, The qcom glue driver patch is not merged yet. I have just mentioned for it for reference. You know that we can not add callbacks for no in-kernel user, so what good is this patch for now? What in-kernel user? Since when does shutdown callback need an in-kernel user? When you reboot or shutdown a system, it gets called. The reason why the shutdown callback is needed is provided in the commit text. As I can't see the patch here, I have no idea... 
You are replying now to the same patch which adds this shutdown callback :) Anyways the qcom dwc3 driver patch which is abandoned which is also mentioned in the commit text is here [1] and the new shutdown callback patch which we are both replying to is in here [2] [1] https://lore.kernel.org/lkml/1605162619-10064-1-git-send-email-s...@codeaurora.org/ [2] https://lore.kernel.org/lkml/1616527652-7937-1-git-send-email-s...@codeaurora.org/ Thanks, so, what am I supposed to do here? The patch is long gone from my queue... greg k-h Hi Greg, Should I resend this patch ? If so let me know your opinion about Stephen's comment on just calling dwc3_remove in dwc3_shutdown and ignoring return value. https://lore.kernel.org/patchwork/patch/1401242/#1599316 Thanks Sandeep
RE: rtlwifi/rtl8192cu AP mode broken with PS STA
> -Original Message- > From: Maciej S. Szmigiero [mailto:m...@maciej.szmigiero.name] > Sent: Thursday, April 08, 2021 4:53 AM > To: Larry Finger; Pkshih > Cc: linux-wirel...@vger.kernel.org; net...@vger.kernel.org; > linux-kernel@vger.kernel.org; > johan...@sipsolutions.net; kv...@codeaurora.org > Subject: Re: rtlwifi/rtl8192cu AP mode broken with PS STA > > On 07.04.2021 06:21, Larry Finger wrote: > > On 4/6/21 9:48 PM, Pkshih wrote: > >> On Tue, 2021-04-06 at 11:25 -0500, Larry Finger wrote: > >>> On 4/6/21 7:06 AM, Maciej S. Szmigiero wrote: > On 06.04.2021 12:00, Kalle Valo wrote: > > "Maciej S. Szmigiero" writes: > > > >> On 29.03.2021 00:54, Maciej S. Szmigiero wrote: > >>> Hi, > >>> > >>> It looks like rtlwifi/rtl8192cu AP mode is broken when a STA is using > >>> PS, > >>> since the driver does not update its beacon to account for TIM > >>> changes, > >>> so a station that is sleeping will never learn that it has packets > >>> buffered at the AP. > >>> > >>> Looking at the code, the rtl8192cu driver implements neither the > >>> set_tim() > >>> callback, nor does it explicitly update beacon data periodically, so > >>> it > >>> has no way to learn that it had changed. > >>> > >>> This results in the AP mode being virtually unusable with STAs that do > >>> PS and don't allow for it to be disabled (IoT devices, mobile phones, > >>> etc.). > >>> > >>> I think the easiest fix here would be to implement set_tim() for > >>> example > >>> the way rt2x00 driver does: queue a work or schedule a tasklet to > >>> update > >>> the beacon data on the device. > >> > >> Are there any plans to fix this? > >> The driver is listed as maintained by Ping-Ke. > > > > Yeah, power save is hard and I'm not surprised that there are drivers > > with broken power save mode support. If there's no fix available we > > should stop supporting AP mode in the driver. 
> > > https://wireless.wiki.kernel.org/en/developers/documentation/mac80211/api > clearly documents that "For AP mode, it must (...) react to the set_tim() > callback or fetch each beacon from mac80211". > The driver isn't doing either so no wonder the beacon it is sending > isn't getting updated. > As I have said above, it seems to me that all that needs to be done here > is to queue a work in a set_tim() callback, then call > send_beacon_frame() from rtlwifi/core.c from this work. > But I don't know the exact device semantics, maybe it needs some other > notification that the beacon has changed, too, or even tries to > manage the TIM bitmap by itself. > It would be a shame to lose the AP mode for such minor thing, though. > I would play with this myself, but unfortunately I don't have time > to work on this right now. > That's where my question to Realtek comes: are there plans to actually > fix this? > >>> > >>> Yes, I am working on this. My only question is "if you are such an expert > >>> on the > >>> problem, why do you not fix it?" > >>> > >>> The example in rx200 is not particularly useful, and I have not found any > >>> other > >>> examples. > >>> > >> > >> Hi Larry, > >> > >> I have a draft patch that forks a work to do send_beacon_frame(), whose > >> behavior like Maciej mentioned. > > That's great, thanks! > > >> I did test on RTL8821AE; it works well. But, it seems already work well > >> even > >> I don't apply this patch, and I'm still digging why. > > It looks like PCI rtlwifi hardware uses a tasklet (specifically, > _rtl_pci_prepare_bcn_tasklet() in pci.c) to periodically transfer the > current beacon to the NIC. Got it. > > This tasklet is scheduled on a RTL_IMR_BCNINT interrupt, which sounds > like a beacon interval interrupt. > Yes, PCI series update every beacon, so TIM and DTIM count maintained by mac80211 work properly. > >> I don't have a rtl8192cu dongle on hand, but I'll try to find one. 
> > > > Maciej, > > > > Does this patch fix the problem? > > The beacon seems to be updating now and STAs no longer get stuck in PS > mode. > Although sometimes (every 2-3 minutes with continuous 1s interval pings) > there is around 5s delay in updating the transmitted beacon - don't know > why, maybe the NIC hardware still has the old version in queue? Since the USB device doesn't update every beacon, dtim_count isn't updated either. This leads to the STA not waking properly. Please try setting dtim_period=1 in hostapd.conf, which tells the STA to wake at every beacon interval. > > I doubt, however that this set_tim() callback should be added for every > rtlwifi device type. > > As I have said above, PCI devices seem to already have a mechanism in > place to update their beacon each beacon interval. > Your test that RTL8821AE works without this patch confirms that (at > least for the rtl8821ae driver). > > It seems this
Re: [PATCH] iommu/vt-d: Force to flush iotlb before creating superpage
Hi Longpeng, On 4/7/21 2:35 PM, Longpeng (Mike, Cloud Infrastructure Service Product Dept.) wrote: Hi Baolu, -Original Message- From: Lu Baolu [mailto:baolu...@linux.intel.com] Sent: Friday, April 2, 2021 12:44 PM To: Longpeng (Mike, Cloud Infrastructure Service Product Dept.) ; io...@lists.linux-foundation.org; linux-kernel@vger.kernel.org Cc: baolu...@linux.intel.com; David Woodhouse ; Nadav Amit ; Alex Williamson ; Kevin Tian ; Gonglei (Arei) ; sta...@vger.kernel.org Subject: Re: [PATCH] iommu/vt-d: Force to flush iotlb before creating superpage Hi Longpeng, On 4/1/21 3:18 PM, Longpeng(Mike) wrote: diff --git a/drivers/iommu/intel/iommu.c b/drivers/iommu/intel/iommu.c index ee09323..cbcb434 100644 --- a/drivers/iommu/intel/iommu.c +++ b/drivers/iommu/intel/iommu.c @@ -2342,9 +2342,20 @@ static inline int hardware_largepage_caps(struct dmar_domain *domain, * removed to make room for superpage(s). * We're adding new large pages, so make sure * we don't remove their parent tables. +* +* We also need to flush the iotlb before creating +* superpage to ensure it does not preserve any +* obsolete info. */ - dma_pte_free_pagetable(domain, iov_pfn, end_pfn, - largepage_lvl + 1); + if (dma_pte_present(pte)) { The dma_pte_free_pagetable() clears a batch of PTEs. So checking current PTE is insufficient. How about removing this check and always performing cache invalidation? Um...the PTE here may be present (e.g. 4K mapping --> superpage mapping) or NOT-present (e.g. create a totally new superpage mapping), but we only need to call free_pagetable and flush_iotlb in the former case, right? But this code covers multiple PTEs and perhaps crosses the page boundary. How about moving this code into a separate function and checking PTE presence there. A sample code could look like below: [compiled but not tested!]
diff --git a/drivers/iommu/intel/iommu.c b/drivers/iommu/intel/iommu.c index d334f5b4e382..0e04d450c38a 100644 --- a/drivers/iommu/intel/iommu.c +++ b/drivers/iommu/intel/iommu.c @@ -2300,6 +2300,41 @@ static inline int hardware_largepage_caps(struct dmar_domain *domain, return level; } +/* + * Ensure that old small page tables are removed to make room for superpage(s). + * We're going to add new large pages, so make sure we don't remove their parent + * tables. The IOTLB/devTLBs should be flushed if any PDE/PTEs are cleared. + */ +static void switch_to_super_page(struct dmar_domain *domain, +unsigned long start_pfn, +unsigned long end_pfn, int level) +{ + unsigned long lvl_pages = lvl_to_nr_pages(level); + struct dma_pte *pte = NULL; + int i; + + while (start_pfn <= end_pfn) { + if (!pte) + pte = pfn_to_dma_pte(domain, start_pfn, &level); + + if (dma_pte_present(pte)) { + dma_pte_free_pagetable(domain, start_pfn, + start_pfn + lvl_pages - 1, + level + 1); + + for_each_domain_iommu(i, domain) + iommu_flush_iotlb_psi(g_iommus[i], domain, + start_pfn, lvl_pages, + 0, 0); + } + + pte++; + start_pfn += lvl_pages; + if (first_pte_in_page(pte)) + pte = NULL; + } +} + static int __domain_mapping(struct dmar_domain *domain, unsigned long iov_pfn, unsigned long phys_pfn, unsigned long nr_pages, int prot) @@ -2341,22 +2376,11 @@ __domain_mapping(struct dmar_domain *domain, unsigned long iov_pfn, return -ENOMEM; /* It is large page*/ if (largepage_lvl > 1) { - unsigned long nr_superpages, end_pfn; + unsigned long end_pfn; pteval |= DMA_PTE_LARGE_PAGE; - lvl_pages = lvl_to_nr_pages(largepage_lvl); - - nr_superpages = nr_pages / lvl_pages; - end_pfn = iov_pfn + nr_superpages * lvl_pages - 1; - - /* -* Ensure that old small page tables are -* removed to make room for superpage(s). -* We're adding new large pages, so make sure -
Re: [PATCH] powerpc: remove old workaround for GCC < 4.9
Le 08/04/2021 à 05:05, Masahiro Yamada a écrit : According to Documentation/process/changes.rst, the minimum supported GCC version is 4.9. This workaround is dead code. This workaround is already on the way out, see https://github.com/linuxppc/linux/commit/802b5560393423166e436c7914b565f3cda9e6b9 Signed-off-by: Masahiro Yamada --- arch/powerpc/Makefile | 6 -- 1 file changed, 6 deletions(-) diff --git a/arch/powerpc/Makefile b/arch/powerpc/Makefile index 5f8544cf724a..32dd693b4e42 100644 --- a/arch/powerpc/Makefile +++ b/arch/powerpc/Makefile @@ -181,12 +181,6 @@ CC_FLAGS_FTRACE := -pg ifdef CONFIG_MPROFILE_KERNEL CC_FLAGS_FTRACE += -mprofile-kernel endif -# Work around gcc code-gen bugs with -pg / -fno-omit-frame-pointer in gcc <= 4.8 -# https://gcc.gnu.org/bugzilla/show_bug.cgi?id=44199 -# https://gcc.gnu.org/bugzilla/show_bug.cgi?id=52828 -ifndef CONFIG_CC_IS_CLANG -CC_FLAGS_FTRACE += $(call cc-ifversion, -lt, 0409, -mno-sched-epilog) -endif endif CFLAGS-$(CONFIG_TARGET_CPU_BOOL) += $(call cc-option,-mcpu=$(CONFIG_TARGET_CPU))
[PATCH v13 14/18] arm64: kexec: install a copy of the linear-map
To perform the kexec relocations with the MMU enabled, we need a copy of the linear map. Create one, and install it from the relocation code. This has to be done from the assembly code as it will be idmapped with TTBR0. The kernel runs in TTBR1, so can't use the break-before-make sequence on the mapping it is executing from. This makes no difference yet as the relocation code runs with the MMU disabled. Co-developed-by: James Morse Signed-off-by: Pavel Tatashin --- arch/arm64/include/asm/assembler.h | 19 +++ arch/arm64/include/asm/kexec.h | 2 ++ arch/arm64/kernel/asm-offsets.c | 2 ++ arch/arm64/kernel/hibernate-asm.S | 20 arch/arm64/kernel/machine_kexec.c | 16 ++-- arch/arm64/kernel/relocate_kernel.S | 3 +++ 6 files changed, 40 insertions(+), 22 deletions(-) diff --git a/arch/arm64/include/asm/assembler.h b/arch/arm64/include/asm/assembler.h index 29061b76aab6..3ce8131ad660 100644 --- a/arch/arm64/include/asm/assembler.h +++ b/arch/arm64/include/asm/assembler.h @@ -425,6 +425,25 @@ USER(\label, ic ivau, \tmp2)// invalidate I line PoU isb .endm +/* + * To prevent the possibility of old and new partial table walks being visible + * in the tlb, switch the ttbr to a zero page when we invalidate the old + * records. D4.7.1 'General TLB maintenance requirements' in ARM DDI 0487A.i + * Even switching to our copied tables will cause a changed output address at + * each stage of the walk. 
+ */ + .macro break_before_make_ttbr_switch zero_page, page_table, tmp, tmp2 + phys_to_ttbr \tmp, \zero_page + msr ttbr1_el1, \tmp + isb + tlbi vmalle1 + dsb nsh + phys_to_ttbr \tmp, \page_table + offset_ttbr1 \tmp, \tmp2 + msr ttbr1_el1, \tmp + isb + .endm + /* * reset_pmuserenr_el0 - reset PMUSERENR_EL0 if PMUv3 present */ diff --git a/arch/arm64/include/asm/kexec.h b/arch/arm64/include/asm/kexec.h index 305cf0840ed3..59ac166daf53 100644 --- a/arch/arm64/include/asm/kexec.h +++ b/arch/arm64/include/asm/kexec.h @@ -97,6 +97,8 @@ struct kimage_arch { phys_addr_t dtb_mem; phys_addr_t kern_reloc; phys_addr_t el2_vectors; + phys_addr_t ttbr1; + phys_addr_t zero_page; /* Core ELF header buffer */ void *elf_headers; unsigned long elf_headers_mem; diff --git a/arch/arm64/kernel/asm-offsets.c b/arch/arm64/kernel/asm-offsets.c index 2e3278df1fc3..609362b5aa76 100644 --- a/arch/arm64/kernel/asm-offsets.c +++ b/arch/arm64/kernel/asm-offsets.c @@ -158,6 +158,8 @@ int main(void) #ifdef CONFIG_KEXEC_CORE DEFINE(KIMAGE_ARCH_DTB_MEM, offsetof(struct kimage, arch.dtb_mem)); DEFINE(KIMAGE_ARCH_EL2_VECTORS, offsetof(struct kimage, arch.el2_vectors)); + DEFINE(KIMAGE_ARCH_ZERO_PAGE, offsetof(struct kimage, arch.zero_page)); + DEFINE(KIMAGE_ARCH_TTBR1, offsetof(struct kimage, arch.ttbr1)); DEFINE(KIMAGE_HEAD, offsetof(struct kimage, head)); DEFINE(KIMAGE_START, offsetof(struct kimage, start)); BLANK(); diff --git a/arch/arm64/kernel/hibernate-asm.S b/arch/arm64/kernel/hibernate-asm.S index 8ccca660034e..a31e621ba867 100644 --- a/arch/arm64/kernel/hibernate-asm.S +++ b/arch/arm64/kernel/hibernate-asm.S @@ -15,26 +15,6 @@ #include #include -/* - * To prevent the possibility of old and new partial table walks being visible - * in the tlb, switch the ttbr to a zero page when we invalidate the old - * records. D4.7.1 'General TLB maintenance requirements' in ARM DDI 0487A.i - * Even switching to our copied tables will cause a changed output address at - * each stage of the walk. 
+ */ -.macro break_before_make_ttbr_switch zero_page, page_table, tmp, tmp2 - phys_to_ttbr \tmp, \zero_page - msr ttbr1_el1, \tmp - isb - tlbi vmalle1 - dsb nsh - phys_to_ttbr \tmp, \page_table - offset_ttbr1 \tmp, \tmp2 - msr ttbr1_el1, \tmp - isb -.endm - - /* * Resume from hibernate * diff --git a/arch/arm64/kernel/machine_kexec.c b/arch/arm64/kernel/machine_kexec.c index f1451d807708..c875ef522e53 100644 --- a/arch/arm64/kernel/machine_kexec.c +++ b/arch/arm64/kernel/machine_kexec.c @@ -153,6 +153,8 @@ static void *kexec_page_alloc(void *arg) int machine_kexec_post_load(struct kimage *kimage) { + int rc; + pgd_t *trans_pgd; void *reloc_code = page_to_virt(kimage->control_code_page); long reloc_size; struct trans_pgd_info info = { @@ -169,12 +171,22 @@ int machine_kexec_post_load(struct kimage *kimage) kimage->arch.el2_vectors = 0; if (is_hyp_callable()) { - int rc = trans_pgd_copy_el2_vectors(&info, - &kimage->arch.el2_vectors); + rc = trans_pgd_copy_el2_vectors(&info, +
[PATCH v13 18/18] arm64/mm: remove useless trans_pgd_map_page()
From: Pingfan Liu The intent of trans_pgd_map_page() was to map a contiguous range of VA memory to the memory that is getting relocated during kexec. However, since we are now using the linear map instead of a contiguous range, this function is not needed. Signed-off-by: Pingfan Liu [Changed commit message] Signed-off-by: Pavel Tatashin --- arch/arm64/include/asm/trans_pgd.h | 5 +-- arch/arm64/mm/trans_pgd.c | 57 -- 2 files changed, 1 insertion(+), 61 deletions(-) diff --git a/arch/arm64/include/asm/trans_pgd.h b/arch/arm64/include/asm/trans_pgd.h index e0760e52d36d..234353df2f13 100644 --- a/arch/arm64/include/asm/trans_pgd.h +++ b/arch/arm64/include/asm/trans_pgd.h @@ -15,7 +15,7 @@ /* * trans_alloc_page * - Allocator that should return exactly one zeroed page, if this - * allocator fails, trans_pgd_create_copy() and trans_pgd_map_page() + * allocator fails, trans_pgd_create_copy() and trans_pgd_idmap_page() * return -ENOMEM error. * * trans_alloc_arg @@ -30,9 +30,6 @@ struct trans_pgd_info { int trans_pgd_create_copy(struct trans_pgd_info *info, pgd_t **trans_pgd, unsigned long start, unsigned long end); -int trans_pgd_map_page(struct trans_pgd_info *info, pgd_t *trans_pgd, - void *page, unsigned long dst_addr, pgprot_t pgprot); - int trans_pgd_idmap_page(struct trans_pgd_info *info, phys_addr_t *trans_ttbr0, unsigned long *t0sz, void *page); diff --git a/arch/arm64/mm/trans_pgd.c b/arch/arm64/mm/trans_pgd.c index 61549451ed3a..e24a749013c1 100644 --- a/arch/arm64/mm/trans_pgd.c +++ b/arch/arm64/mm/trans_pgd.c @@ -217,63 +217,6 @@ int trans_pgd_create_copy(struct trans_pgd_info *info, pgd_t **dst_pgdp, return rc; } -/* - * Add map entry to trans_pgd for a base-size page at PTE level. - * info: contains allocator and its argument - * trans_pgd: page table in which new map is added. - * page: page to be mapped. - * dst_addr: new VA address for the page - * pgprot: protection for the page. - * - * Returns 0 on success, and -ENOMEM on failure. 
- */ -int trans_pgd_map_page(struct trans_pgd_info *info, pgd_t *trans_pgd, - void *page, unsigned long dst_addr, pgprot_t pgprot) -{ - pgd_t *pgdp; - p4d_t *p4dp; - pud_t *pudp; - pmd_t *pmdp; - pte_t *ptep; - - pgdp = pgd_offset_pgd(trans_pgd, dst_addr); - if (pgd_none(READ_ONCE(*pgdp))) { - p4dp = trans_alloc(info); - if (!pgdp) - return -ENOMEM; - pgd_populate(NULL, pgdp, p4dp); - } - - p4dp = p4d_offset(pgdp, dst_addr); - if (p4d_none(READ_ONCE(*p4dp))) { - pudp = trans_alloc(info); - if (!pudp) - return -ENOMEM; - p4d_populate(NULL, p4dp, pudp); - } - - pudp = pud_offset(p4dp, dst_addr); - if (pud_none(READ_ONCE(*pudp))) { - pmdp = trans_alloc(info); - if (!pmdp) - return -ENOMEM; - pud_populate(NULL, pudp, pmdp); - } - - pmdp = pmd_offset(pudp, dst_addr); - if (pmd_none(READ_ONCE(*pmdp))) { - ptep = trans_alloc(info); - if (!ptep) - return -ENOMEM; - pmd_populate_kernel(NULL, pmdp, ptep); - } - - ptep = pte_offset_kernel(pmdp, dst_addr); - set_pte(ptep, pfn_pte(virt_to_pfn(page), pgprot)); - - return 0; -} - /* * The page we want to idmap may be outside the range covered by VA_BITS that * can be built using the kernel's p?d_populate() helpers. As a one off, for a -- 2.25.1
[PATCH v13 17/18] arm64: kexec: Remove cpu-reset.h
This header contains only cpu_soft_restart() which is never used directly anymore. So, remove this header, and rename the helper to be cpu_soft_restart(). Suggested-by: James Morse Signed-off-by: Pavel Tatashin --- arch/arm64/include/asm/kexec.h| 6 ++ arch/arm64/kernel/cpu-reset.S | 7 +++ arch/arm64/kernel/cpu-reset.h | 30 -- arch/arm64/kernel/machine_kexec.c | 6 ++ 4 files changed, 11 insertions(+), 38 deletions(-) delete mode 100644 arch/arm64/kernel/cpu-reset.h diff --git a/arch/arm64/include/asm/kexec.h b/arch/arm64/include/asm/kexec.h index 5fc87b51f8a9..ee71ae3b93ed 100644 --- a/arch/arm64/include/asm/kexec.h +++ b/arch/arm64/include/asm/kexec.h @@ -90,6 +90,12 @@ static inline void crash_prepare_suspend(void) {} static inline void crash_post_resume(void) {} #endif +#if defined(CONFIG_KEXEC_CORE) +void cpu_soft_restart(unsigned long el2_switch, unsigned long entry, + unsigned long arg0, unsigned long arg1, + unsigned long arg2); +#endif + #define ARCH_HAS_KIMAGE_ARCH struct kimage_arch { diff --git a/arch/arm64/kernel/cpu-reset.S b/arch/arm64/kernel/cpu-reset.S index 37721eb6f9a1..5d47d6c92634 100644 --- a/arch/arm64/kernel/cpu-reset.S +++ b/arch/arm64/kernel/cpu-reset.S @@ -16,8 +16,7 @@ .pushsection.idmap.text, "awx" /* - * __cpu_soft_restart(el2_switch, entry, arg0, arg1, arg2) - Helper for - * cpu_soft_restart. + * cpu_soft_restart(el2_switch, entry, arg0, arg1, arg2) * * @el2_switch: Flag to indicate a switch to EL2 is needed. * @entry: Location to jump to for soft reset. @@ -29,7 +28,7 @@ * branch to what would be the reset vector. It must be executed with the * flat identity mapping. */ -SYM_CODE_START(__cpu_soft_restart) +SYM_CODE_START(cpu_soft_restart) /* Clear sctlr_el1 flags. 
*/ mrs x12, sctlr_el1 mov_q x13, SCTLR_ELx_FLAGS @@ -51,6 +50,6 @@ SYM_CODE_START(__cpu_soft_restart) mov x1, x3 // arg1 mov x2, x4 // arg2 br x8 -SYM_CODE_END(__cpu_soft_restart) +SYM_CODE_END(cpu_soft_restart) .popsection diff --git a/arch/arm64/kernel/cpu-reset.h b/arch/arm64/kernel/cpu-reset.h deleted file mode 100644 index f6d95512fec6.. --- a/arch/arm64/kernel/cpu-reset.h +++ /dev/null @@ -1,30 +0,0 @@ -/* SPDX-License-Identifier: GPL-2.0-only */ -/* - * CPU reset routines - * - * Copyright (C) 2015 Huawei Futurewei Technologies. - */ - -#ifndef _ARM64_CPU_RESET_H -#define _ARM64_CPU_RESET_H - -#include - -void __cpu_soft_restart(unsigned long el2_switch, unsigned long entry, - unsigned long arg0, unsigned long arg1, unsigned long arg2); - -static inline void __noreturn cpu_soft_restart(unsigned long entry, - unsigned long arg0, - unsigned long arg1, - unsigned long arg2) -{ - typeof(__cpu_soft_restart) *restart; - - restart = (void *)__pa_symbol(__cpu_soft_restart); - - cpu_install_idmap(); - restart(0, entry, arg0, arg1, arg2); - unreachable(); -} - -#endif diff --git a/arch/arm64/kernel/machine_kexec.c b/arch/arm64/kernel/machine_kexec.c index a1c9bee0cddd..ef7ba93f2bd6 100644 --- a/arch/arm64/kernel/machine_kexec.c +++ b/arch/arm64/kernel/machine_kexec.c @@ -23,8 +23,6 @@ #include #include -#include "cpu-reset.h" - /** * kexec_image_info - For debugging output. */ @@ -197,10 +195,10 @@ void machine_kexec(struct kimage *kimage) * In kexec_file case, the kernel starts directly without purgatory. */ if (kimage->head & IND_DONE) { - typeof(__cpu_soft_restart) *restart; + typeof(cpu_soft_restart) *restart; cpu_install_idmap(); - restart = (void *)__pa_symbol(__cpu_soft_restart); + restart = (void *)__pa_symbol(cpu_soft_restart); restart(is_hyp_callable(), kimage->start, kimage->arch.dtb_mem, 0, 0); } else { -- 2.25.1
[PATCH v13 13/18] arm64: kexec: use ld script for relocation function
Currently, relocation code declares start and end variables which are used to compute its size. The better way to do this is to use the ld script instead, and put the relocation function in its own section. Signed-off-by: Pavel Tatashin --- arch/arm64/include/asm/sections.h | 1 + arch/arm64/kernel/machine_kexec.c | 14 ++ arch/arm64/kernel/relocate_kernel.S | 15 ++- arch/arm64/kernel/vmlinux.lds.S | 19 +++ 4 files changed, 28 insertions(+), 21 deletions(-) diff --git a/arch/arm64/include/asm/sections.h b/arch/arm64/include/asm/sections.h index 2f36b16a5b5d..31e459af89f6 100644 --- a/arch/arm64/include/asm/sections.h +++ b/arch/arm64/include/asm/sections.h @@ -20,5 +20,6 @@ extern char __exittext_begin[], __exittext_end[]; extern char __irqentry_text_start[], __irqentry_text_end[]; extern char __mmuoff_data_start[], __mmuoff_data_end[]; extern char __entry_tramp_text_start[], __entry_tramp_text_end[]; +extern char __relocate_new_kernel_start[], __relocate_new_kernel_end[]; #endif /* __ASM_SECTIONS_H */ diff --git a/arch/arm64/kernel/machine_kexec.c b/arch/arm64/kernel/machine_kexec.c index d5940b7889f8..f1451d807708 100644 --- a/arch/arm64/kernel/machine_kexec.c +++ b/arch/arm64/kernel/machine_kexec.c @@ -20,14 +20,11 @@ #include #include #include +#include #include #include "cpu-reset.h" -/* Global variables for the arm64_relocate_new_kernel routine. */ -extern const unsigned char arm64_relocate_new_kernel[]; -extern const unsigned long arm64_relocate_new_kernel_size; - /** * kexec_image_info - For debugging output. 
*/ @@ -157,6 +154,7 @@ static void *kexec_page_alloc(void *arg) int machine_kexec_post_load(struct kimage *kimage) { void *reloc_code = page_to_virt(kimage->control_code_page); + long reloc_size; struct trans_pgd_info info = { .trans_alloc_page = kexec_page_alloc, .trans_alloc_arg = kimage, @@ -177,14 +175,14 @@ int machine_kexec_post_load(struct kimage *kimage) return rc; } - memcpy(reloc_code, arm64_relocate_new_kernel, - arm64_relocate_new_kernel_size); + reloc_size = __relocate_new_kernel_end - __relocate_new_kernel_start; + memcpy(reloc_code, __relocate_new_kernel_start, reloc_size); kimage->arch.kern_reloc = __pa(reloc_code); /* Flush the reloc_code in preparation for its execution. */ - __flush_dcache_area(reloc_code, arm64_relocate_new_kernel_size); + __flush_dcache_area(reloc_code, reloc_size); flush_icache_range((uintptr_t)reloc_code, (uintptr_t)reloc_code + - arm64_relocate_new_kernel_size); + reloc_size); kexec_list_flush(kimage); kexec_image_info(kimage); diff --git a/arch/arm64/kernel/relocate_kernel.S b/arch/arm64/kernel/relocate_kernel.S index df023b82544b..7a600ba33ae1 100644 --- a/arch/arm64/kernel/relocate_kernel.S +++ b/arch/arm64/kernel/relocate_kernel.S @@ -15,6 +15,7 @@ #include #include +.pushsection ".kexec_relocate.text", "ax" /* * arm64_relocate_new_kernel - Put a 2nd stage image in place and boot it. * @@ -77,16 +78,4 @@ SYM_CODE_START(arm64_relocate_new_kernel) mov x3, xzr br x4 /* Jumps from el1 */ SYM_CODE_END(arm64_relocate_new_kernel) - -.align 3 /* To keep the 64-bit values below naturally aligned. */ - -.Lcopy_end: -.org KEXEC_CONTROL_PAGE_SIZE - -/* - * arm64_relocate_new_kernel_size - Number of bytes to copy to the - * control_code_page. 
- */ -.globl arm64_relocate_new_kernel_size -arm64_relocate_new_kernel_size: - .quad .Lcopy_end - arm64_relocate_new_kernel +.popsection diff --git a/arch/arm64/kernel/vmlinux.lds.S b/arch/arm64/kernel/vmlinux.lds.S index 7eea7888bb02..0d9d5e6af66f 100644 --- a/arch/arm64/kernel/vmlinux.lds.S +++ b/arch/arm64/kernel/vmlinux.lds.S @@ -12,6 +12,7 @@ #include #include #include +#include #include #include @@ -92,6 +93,16 @@ jiffies = jiffies_64; #define HIBERNATE_TEXT #endif +#ifdef CONFIG_KEXEC_CORE +#define KEXEC_TEXT \ + . = ALIGN(SZ_4K); \ + __relocate_new_kernel_start = .;\ + *(.kexec_relocate.text) \ + __relocate_new_kernel_end = .; +#else +#define KEXEC_TEXT +#endif + #ifdef CONFIG_UNMAP_KERNEL_AT_EL0 #define TRAMP_TEXT \ . = ALIGN(PAGE_SIZE); \ @@ -152,6 +163,7 @@ SECTIONS HYPERVISOR_TEXT IDMAP_TEXT HIBERNATE_TEXT + KEXEC_TEXT TRAMP_TEXT *(.fixup) *(.gnu.warning) @@ -336,3 +348,10 @@ ASSERT(swapper_pg_dir - reserved_pg_dir ==
[PATCH v13 16/18] arm64: kexec: remove the pre-kexec PoC maintenance
Now that kexec does its relocations with the MMU enabled, we no longer need to clean the relocation data to the PoC. Co-developed-by: James Morse Signed-off-by: Pavel Tatashin --- arch/arm64/kernel/machine_kexec.c | 40 --- 1 file changed, 40 deletions(-) diff --git a/arch/arm64/kernel/machine_kexec.c b/arch/arm64/kernel/machine_kexec.c index d5c8aefc66f3..a1c9bee0cddd 100644 --- a/arch/arm64/kernel/machine_kexec.c +++ b/arch/arm64/kernel/machine_kexec.c @@ -76,45 +76,6 @@ int machine_kexec_prepare(struct kimage *kimage) return 0; } -/** - * kexec_list_flush - Helper to flush the kimage list and source pages to PoC. - */ -static void kexec_list_flush(struct kimage *kimage) -{ - kimage_entry_t *entry; - - __flush_dcache_area(kimage, sizeof(*kimage)); - - for (entry = &kimage->head; ; entry++) { - unsigned int flag; - void *addr; - - /* flush the list entries. */ - __flush_dcache_area(entry, sizeof(kimage_entry_t)); - - flag = *entry & IND_FLAGS; - if (flag == IND_DONE) - break; - - addr = phys_to_virt(*entry & PAGE_MASK); - - switch (flag) { - case IND_INDIRECTION: - /* Set entry point just before the new list page. */ - entry = (kimage_entry_t *)addr - 1; - break; - case IND_SOURCE: - /* flush the source pages. */ - __flush_dcache_area(addr, PAGE_SIZE); - break; - case IND_DESTINATION: - break; - default: - BUG(); - } - } -} - /** * kexec_segment_flush - Helper to flush the kimage segments to PoC. */ @@ -200,7 +161,6 @@ int machine_kexec_post_load(struct kimage *kimage) __flush_dcache_area(reloc_code, reloc_size); flush_icache_range((uintptr_t)reloc_code, (uintptr_t)reloc_code + reloc_size); - kexec_list_flush(kimage); kexec_image_info(kimage); return 0; -- 2.25.1
[PATCH v13 15/18] arm64: kexec: keep MMU enabled during kexec relocation
Now that we have linear map page tables configured, keep the MMU enabled to allow faster relocation of segments to the final destination. Cavium ThunderX2: Kernel Image size: 38M Initramfs size: 46M Total relocation size: 84M MMU-disabled: relocation 7.489539915s MMU-enabled: relocation 0.03946095s Broadcom Stingray: for a moderate-size kernel + initramfs (25M), the relocation was taking 0.382s; with the MMU enabled it now takes only 0.019s, a ~20x improvement. The time is proportional to the relocation size, so a larger initramfs (e.g. 100M) could take over a second. Signed-off-by: Pavel Tatashin --- arch/arm64/include/asm/kexec.h | 3 +++ arch/arm64/kernel/asm-offsets.c | 1 + arch/arm64/kernel/machine_kexec.c | 16 ++ arch/arm64/kernel/relocate_kernel.S | 33 +++-- 4 files changed, 38 insertions(+), 15 deletions(-) diff --git a/arch/arm64/include/asm/kexec.h b/arch/arm64/include/asm/kexec.h index 59ac166daf53..5fc87b51f8a9 100644 --- a/arch/arm64/include/asm/kexec.h +++ b/arch/arm64/include/asm/kexec.h @@ -97,8 +97,11 @@ struct kimage_arch { phys_addr_t dtb_mem; phys_addr_t kern_reloc; phys_addr_t el2_vectors; + phys_addr_t ttbr0; phys_addr_t ttbr1; phys_addr_t zero_page; + unsigned long phys_offset; + unsigned long t0sz; /* Core ELF header buffer */ void *elf_headers; unsigned long elf_headers_mem; diff --git a/arch/arm64/kernel/asm-offsets.c b/arch/arm64/kernel/asm-offsets.c index 609362b5aa76..ec7bb80aedc8 100644 --- a/arch/arm64/kernel/asm-offsets.c +++ b/arch/arm64/kernel/asm-offsets.c @@ -159,6 +159,7 @@ int main(void) DEFINE(KIMAGE_ARCH_DTB_MEM, offsetof(struct kimage, arch.dtb_mem)); DEFINE(KIMAGE_ARCH_EL2_VECTORS, offsetof(struct kimage, arch.el2_vectors)); DEFINE(KIMAGE_ARCH_ZERO_PAGE,offsetof(struct kimage, arch.zero_page)); + DEFINE(KIMAGE_ARCH_PHYS_OFFSET, offsetof(struct kimage, arch.phys_offset)); DEFINE(KIMAGE_ARCH_TTBR1,offsetof(struct kimage, arch.ttbr1)); DEFINE(KIMAGE_HEAD, offsetof(struct kimage, head)); DEFINE(KIMAGE_START,
offsetof(struct kimage, start)); diff --git a/arch/arm64/kernel/machine_kexec.c b/arch/arm64/kernel/machine_kexec.c index c875ef522e53..d5c8aefc66f3 100644 --- a/arch/arm64/kernel/machine_kexec.c +++ b/arch/arm64/kernel/machine_kexec.c @@ -190,6 +190,11 @@ int machine_kexec_post_load(struct kimage *kimage) reloc_size = __relocate_new_kernel_end - __relocate_new_kernel_start; memcpy(reloc_code, __relocate_new_kernel_start, reloc_size); kimage->arch.kern_reloc = __pa(reloc_code); + rc = trans_pgd_idmap_page(, >arch.ttbr0, + >arch.t0sz, reloc_code); + if (rc) + return rc; + kimage->arch.phys_offset = virt_to_phys(kimage) - (long)kimage; /* Flush the reloc_code in preparation for its execution. */ __flush_dcache_area(reloc_code, reloc_size); @@ -223,9 +228,9 @@ void machine_kexec(struct kimage *kimage) local_daif_mask(); /* -* Both restart and cpu_soft_restart will shutdown the MMU, disable data +* Both restart and kernel_reloc will shutdown the MMU, disable data * caches. However, restart will start new kernel or purgatory directly, -* cpu_soft_restart will transfer control to arm64_relocate_new_kernel +* kernel_reloc contains the body of arm64_relocate_new_kernel * In kexec case, kimage->start points to purgatory assuming that * kernel entry and dtb address are embedded in purgatory by * userspace (kexec-tools). @@ -239,10 +244,13 @@ void machine_kexec(struct kimage *kimage) restart(is_hyp_callable(), kimage->start, kimage->arch.dtb_mem, 0, 0); } else { + void (*kernel_reloc)(struct kimage *kimage); + if (is_hyp_callable()) __hyp_set_vectors(kimage->arch.el2_vectors); - cpu_soft_restart(kimage->arch.kern_reloc, -virt_to_phys(kimage), 0, 0); + cpu_install_ttbr0(kimage->arch.ttbr0, kimage->arch.t0sz); + kernel_reloc = (void *)kimage->arch.kern_reloc; + kernel_reloc(kimage); } BUG(); /* Should never get here. 
*/ diff --git a/arch/arm64/kernel/relocate_kernel.S b/arch/arm64/kernel/relocate_kernel.S index e83b6380907d..433a57b3d76e 100644 --- a/arch/arm64/kernel/relocate_kernel.S +++ b/arch/arm64/kernel/relocate_kernel.S @@ -4,6 +4,8 @@ * * Copyright (C) Linaro. * Copyright (C) Huawei Futurewei Technologies. + * Copyright (C) 2020, Microsoft Corporation. + * Pavel Tatashin */ #include @@ -15,6 +17,15 @@ #include #include +.macro turn_off_mmu tmp1, tmp2 + mrs \tmp1, sctlr_el1 +
[PATCH v13 12/18] arm64: kexec: relocate in EL1 mode
Since we are going to keep the MMU enabled during relocation, we need to stay in EL1 throughout the relocation. Keep EL1 enabled, and switch to EL2 only before entering the new world. Suggested-by: James Morse Signed-off-by: Pavel Tatashin --- arch/arm64/kernel/cpu-reset.h | 3 +-- arch/arm64/kernel/machine_kexec.c | 4 ++-- arch/arm64/kernel/relocate_kernel.S | 13 +++-- 3 files changed, 14 insertions(+), 6 deletions(-) diff --git a/arch/arm64/kernel/cpu-reset.h b/arch/arm64/kernel/cpu-reset.h index 1922e7a690f8..f6d95512fec6 100644 --- a/arch/arm64/kernel/cpu-reset.h +++ b/arch/arm64/kernel/cpu-reset.h @@ -20,11 +20,10 @@ static inline void __noreturn cpu_soft_restart(unsigned long entry, { typeof(__cpu_soft_restart) *restart; - unsigned long el2_switch = is_hyp_callable(); restart = (void *)__pa_symbol(__cpu_soft_restart); cpu_install_idmap(); - restart(el2_switch, entry, arg0, arg1, arg2); + restart(0, entry, arg0, arg1, arg2); unreachable(); } diff --git a/arch/arm64/kernel/machine_kexec.c b/arch/arm64/kernel/machine_kexec.c index fb03b6676fb9..d5940b7889f8 100644 --- a/arch/arm64/kernel/machine_kexec.c +++ b/arch/arm64/kernel/machine_kexec.c @@ -231,8 +231,8 @@ void machine_kexec(struct kimage *kimage) } else { if (is_hyp_callable()) __hyp_set_vectors(kimage->arch.el2_vectors); - cpu_soft_restart(kimage->arch.kern_reloc, virt_to_phys(kimage), -0, 0); + cpu_soft_restart(kimage->arch.kern_reloc, +virt_to_phys(kimage), 0, 0); } BUG(); /* Should never get here. */ diff --git a/arch/arm64/kernel/relocate_kernel.S b/arch/arm64/kernel/relocate_kernel.S index 36b4496524c3..df023b82544b 100644 --- a/arch/arm64/kernel/relocate_kernel.S +++ b/arch/arm64/kernel/relocate_kernel.S @@ -13,6 +13,7 @@ #include #include #include +#include /* * arm64_relocate_new_kernel - Put a 2nd stage image in place and boot it. @@ -61,12 +62,20 @@ SYM_CODE_START(arm64_relocate_new_kernel) isb /* Start new image.
*/ + ldr x1, [x0, #KIMAGE_ARCH_EL2_VECTORS] /* relocation start */ + cbz x1, .Lel1 + ldr x1, [x0, #KIMAGE_START] /* relocation start */ + ldr x2, [x0, #KIMAGE_ARCH_DTB_MEM] /* dtb address */ + mov x3, xzr + mov x4, xzr + mov x0, #HVC_SOFT_RESTART + hvc #0 /* Jumps from el2 */ +.Lel1: ldr x4, [x0, #KIMAGE_START] /* relocation start */ ldr x0, [x0, #KIMAGE_ARCH_DTB_MEM] /* dtb address */ - mov x1, xzr mov x2, xzr mov x3, xzr - br x4 + br x4 /* Jumps from el1 */ SYM_CODE_END(arm64_relocate_new_kernel) .align 3 /* To keep the 64-bit values below naturally aligned. */ -- 2.25.1
[PATCH v13 11/18] arm64: kexec: kexec may require EL2 vectors
If we have an EL2 mode without VHE, the EL2 vectors are needed in order to switch to EL2 and jump to the new world with hypervisor privileges. In preparation for MMU-enabled relocation, configure our EL2 table now. Suggested-by: James Morse Signed-off-by: Pavel Tatashin --- arch/arm64/Kconfig| 2 +- arch/arm64/include/asm/kexec.h| 1 + arch/arm64/kernel/asm-offsets.c | 1 + arch/arm64/kernel/machine_kexec.c | 31 +++ 4 files changed, 34 insertions(+), 1 deletion(-) diff --git a/arch/arm64/Kconfig b/arch/arm64/Kconfig index e4e1b6550115..0e876d980a1f 100644 --- a/arch/arm64/Kconfig +++ b/arch/arm64/Kconfig @@ -1149,7 +1149,7 @@ config CRASH_DUMP config TRANS_TABLE def_bool y - depends on HIBERNATION + depends on HIBERNATION || KEXEC_CORE config XEN_DOM0 def_bool y diff --git a/arch/arm64/include/asm/kexec.h b/arch/arm64/include/asm/kexec.h index 9befcd87e9a8..305cf0840ed3 100644 --- a/arch/arm64/include/asm/kexec.h +++ b/arch/arm64/include/asm/kexec.h @@ -96,6 +96,7 @@ struct kimage_arch { void *dtb; phys_addr_t dtb_mem; phys_addr_t kern_reloc; + phys_addr_t el2_vectors; /* Core ELF header buffer */ void *elf_headers; unsigned long elf_headers_mem; diff --git a/arch/arm64/kernel/asm-offsets.c b/arch/arm64/kernel/asm-offsets.c index 0c92e193f866..2e3278df1fc3 100644 --- a/arch/arm64/kernel/asm-offsets.c +++ b/arch/arm64/kernel/asm-offsets.c @@ -157,6 +157,7 @@ int main(void) #endif #ifdef CONFIG_KEXEC_CORE DEFINE(KIMAGE_ARCH_DTB_MEM, offsetof(struct kimage, arch.dtb_mem)); + DEFINE(KIMAGE_ARCH_EL2_VECTORS, offsetof(struct kimage, arch.el2_vectors)); DEFINE(KIMAGE_HEAD, offsetof(struct kimage, head)); DEFINE(KIMAGE_START, offsetof(struct kimage, start)); BLANK(); diff --git a/arch/arm64/kernel/machine_kexec.c b/arch/arm64/kernel/machine_kexec.c index 2e734e4ae12e..fb03b6676fb9 100644 --- a/arch/arm64/kernel/machine_kexec.c +++ b/arch/arm64/kernel/machine_kexec.c @@ -20,6 +20,7 @@ #include #include #include +#include #include "cpu-reset.h" @@ -42,7 +43,9 @@ static void
_kexec_image_info(const char *func, int line, pr_debug("start: %lx\n", kimage->start); pr_debug("head:%lx\n", kimage->head); pr_debug("nr_segments: %lu\n", kimage->nr_segments); + pr_debug("dtb_mem: %pa\n", &kimage->arch.dtb_mem); pr_debug("kern_reloc: %pa\n", &kimage->arch.kern_reloc); + pr_debug("el2_vectors: %pa\n", &kimage->arch.el2_vectors); for (i = 0; i < kimage->nr_segments; i++) { pr_debug(" segment[%lu]: %016lx - %016lx, 0x%lx bytes, %lu pages\n", @@ -137,9 +140,27 @@ static void kexec_segment_flush(const struct kimage *kimage) } } +/* Allocates pages for kexec page table */ +static void *kexec_page_alloc(void *arg) +{ + struct kimage *kimage = (struct kimage *)arg; + struct page *page = kimage_alloc_control_pages(kimage, 0); + + if (!page) + return NULL; + + memset(page_address(page), 0, PAGE_SIZE); + + return page_address(page); +} + int machine_kexec_post_load(struct kimage *kimage) { void *reloc_code = page_to_virt(kimage->control_code_page); + struct trans_pgd_info info = { + .trans_alloc_page = kexec_page_alloc, + .trans_alloc_arg= kimage, + }; /* If in place, relocation is not used, only flush next kernel */ if (kimage->head & IND_DONE) { @@ -148,6 +169,14 @@ int machine_kexec_post_load(struct kimage *kimage) return 0; } + kimage->arch.el2_vectors = 0; + if (is_hyp_callable()) { + int rc = trans_pgd_copy_el2_vectors(&info, + &kimage->arch.el2_vectors); + if (rc) + return rc; + } + memcpy(reloc_code, arm64_relocate_new_kernel, arm64_relocate_new_kernel_size); kimage->arch.kern_reloc = __pa(reloc_code); @@ -200,6 +229,8 @@ void machine_kexec(struct kimage *kimage) restart(is_hyp_callable(), kimage->start, kimage->arch.dtb_mem, 0, 0); } else { + if (is_hyp_callable()) + __hyp_set_vectors(kimage->arch.el2_vectors); cpu_soft_restart(kimage->arch.kern_reloc, virt_to_phys(kimage), 0, 0); } -- 2.25.1
[PATCH v13 10/18] arm64: kexec: pass kimage as the only argument to relocation function
Currently, the kexec relocation function (arm64_relocate_new_kernel) accepts the following arguments: head: start of array that contains relocation information. entry: entry point for new kernel or purgatory. dtb_mem: first and only argument to entry. The number of arguments cannot be easily expanded, because this function is also called from HVC_SOFT_RESTART, which preserves only three arguments. Also, arm64_relocate_new_kernel is written in assembly and called without a stack, so there is no place to spill extra arguments in order to free up registers. Soon, we will need to pass more arguments: once we enable the MMU we will need to pass information about page tables. Pass kimage to arm64_relocate_new_kernel, and teach it to get the required fields from kimage. Suggested-by: James Morse Signed-off-by: Pavel Tatashin --- arch/arm64/kernel/asm-offsets.c | 7 +++ arch/arm64/kernel/machine_kexec.c | 6 -- arch/arm64/kernel/relocate_kernel.S | 10 -- 3 files changed, 15 insertions(+), 8 deletions(-) diff --git a/arch/arm64/kernel/asm-offsets.c b/arch/arm64/kernel/asm-offsets.c index a36e2fc330d4..0c92e193f866 100644 --- a/arch/arm64/kernel/asm-offsets.c +++ b/arch/arm64/kernel/asm-offsets.c @@ -9,6 +9,7 @@ #include #include +#include #include #include #include @@ -153,6 +154,12 @@ int main(void) DEFINE(PTRAUTH_USER_KEY_APGA,offsetof(struct ptrauth_keys_user, apga)); DEFINE(PTRAUTH_KERNEL_KEY_APIA, offsetof(struct ptrauth_keys_kernel, apia)); BLANK(); +#endif +#ifdef CONFIG_KEXEC_CORE + DEFINE(KIMAGE_ARCH_DTB_MEM, offsetof(struct kimage, arch.dtb_mem)); + DEFINE(KIMAGE_HEAD, offsetof(struct kimage, head)); + DEFINE(KIMAGE_START, offsetof(struct kimage, start)); + BLANK(); #endif return 0; } diff --git a/arch/arm64/kernel/machine_kexec.c b/arch/arm64/kernel/machine_kexec.c index b150b65f0b84..2e734e4ae12e 100644 --- a/arch/arm64/kernel/machine_kexec.c +++ b/arch/arm64/kernel/machine_kexec.c @@ -83,6 +83,8 @@ static void kexec_list_flush(struct kimage *kimage) { kimage_entry_t *entry; +
__flush_dcache_area(kimage, sizeof(*kimage)); + for (entry = &kimage->head; ; entry++) { unsigned int flag; void *addr; @@ -198,8 +200,8 @@ void machine_kexec(struct kimage *kimage) restart(is_hyp_callable(), kimage->start, kimage->arch.dtb_mem, 0, 0); } else { - cpu_soft_restart(kimage->arch.kern_reloc, kimage->head, -kimage->start, kimage->arch.dtb_mem); + cpu_soft_restart(kimage->arch.kern_reloc, virt_to_phys(kimage), +0, 0); } BUG(); /* Should never get here. */ diff --git a/arch/arm64/kernel/relocate_kernel.S b/arch/arm64/kernel/relocate_kernel.S index 718037bef560..36b4496524c3 100644 --- a/arch/arm64/kernel/relocate_kernel.S +++ b/arch/arm64/kernel/relocate_kernel.S @@ -27,9 +27,7 @@ */ SYM_CODE_START(arm64_relocate_new_kernel) /* Setup the list loop variables. */ - mov x18, x2 /* x18 = dtb address */ - mov x17, x1 /* x17 = kimage_start */ - mov x16, x0 /* x16 = kimage_head */ + ldr x16, [x0, #KIMAGE_HEAD] /* x16 = kimage_head */ mov x14, xzr/* x14 = entry ptr */ mov x13, xzr/* x13 = copy dest */ raw_dcache_line_size x15, x1/* x15 = dcache line size */ @@ -63,12 +61,12 @@ SYM_CODE_START(arm64_relocate_new_kernel) isb /* Start new image. */ - mov x0, x18 + ldr x4, [x0, #KIMAGE_START] /* relocation start */ + ldr x0, [x0, #KIMAGE_ARCH_DTB_MEM] /* dtb address */ mov x1, xzr mov x2, xzr mov x3, xzr - br x17 - + br x4 SYM_CODE_END(arm64_relocate_new_kernel) .align 3 /* To keep the 64-bit values below naturally aligned. */ -- 2.25.1
[PATCH v13 09/18] arm64: kexec: Use dcache ops macros instead of open-coding
From: James Morse kexec does dcache maintenance when it re-writes all memory. Our dcache_by_line_op macro depends on reading the sanitised DminLine from memory. Kexec may have overwritten this, so it open-codes the sequence. dcache_by_line_op is a whole set of macros; it uses dcache_line_size, which uses read_ctr for the sanitised DminLine. Reading the DminLine is the first thing dcache_by_line_op does. Rename dcache_by_line_op to dcache_by_myline_op and take DminLine as an argument. Kexec can now use the slightly smaller macro. This makes upcoming changes to the dcache maintenance easier on the eye. Code generated by the existing callers is unchanged. Signed-off-by: James Morse [Fixed merging issues] Signed-off-by: Pavel Tatashin --- arch/arm64/include/asm/assembler.h | 12 arch/arm64/kernel/relocate_kernel.S | 13 +++-- 2 files changed, 11 insertions(+), 14 deletions(-) diff --git a/arch/arm64/include/asm/assembler.h b/arch/arm64/include/asm/assembler.h index ca31594d3d6c..29061b76aab6 100644 --- a/arch/arm64/include/asm/assembler.h +++ b/arch/arm64/include/asm/assembler.h @@ -371,10 +371,9 @@ alternative_else alternative_endif .endm - .macro dcache_by_line_op op, domain, kaddr, size, tmp1, tmp2 - dcache_line_size \tmp1, \tmp2 + .macro dcache_by_myline_op op, domain, kaddr, size, linesz, tmp2 add \size, \kaddr, \size - sub \tmp2, \tmp1, #1 + sub \tmp2, \linesz, #1 bic \kaddr, \kaddr, \tmp2 9998: .ifc\op, cvau @@ -394,12 +393,17 @@ alternative_endif .endif .endif .endif - add \kaddr, \kaddr, \tmp1 + add \kaddr, \kaddr, \linesz cmp \kaddr, \size b.lo9998b dsb \domain .endm + .macro dcache_by_line_op op, domain, kaddr, size, tmp1, tmp2 + dcache_line_size \tmp1, \tmp2 + dcache_by_myline_op \op, \domain, \kaddr, \size, \tmp1, \tmp2 + .endm + /* * Macro to perform an instruction cache maintenance for the interval * [start, end) diff --git a/arch/arm64/kernel/relocate_kernel.S b/arch/arm64/kernel/relocate_kernel.S index 8058fabe0a76..718037bef560 ---
a/arch/arm64/kernel/relocate_kernel.S +++ b/arch/arm64/kernel/relocate_kernel.S @@ -41,16 +41,9 @@ SYM_CODE_START(arm64_relocate_new_kernel) tbz x16, IND_SOURCE_BIT, .Ltest_indirection /* Invalidate dest page to PoC. */ - mov x2, x13 - add x20, x2, #PAGE_SIZE - sub x1, x15, #1 - bic x2, x2, x1 -2: dc ivac, x2 - add x2, x2, x15 - cmp x2, x20 - b.lo2b - dsb sy - + mov x2, x13 + mov x1, #PAGE_SIZE + dcache_by_myline_op ivac, sy, x2, x1, x15, x20 copy_page x13, x12, x1, x2, x3, x4, x5, x6, x7, x8 b .Lnext .Ltest_indirection: -- 2.25.1
[PATCH v13 07/18] arm64: kexec: flush image and lists during kexec load time
Currently, during kexec load we are copying relocation function and flushing it. However, we can also flush kexec relocation buffers and if new kernel image is already in place (i.e. crash kernel), we can also flush the new kernel image itself. Signed-off-by: Pavel Tatashin --- arch/arm64/kernel/machine_kexec.c | 49 +++ 1 file changed, 23 insertions(+), 26 deletions(-) diff --git a/arch/arm64/kernel/machine_kexec.c b/arch/arm64/kernel/machine_kexec.c index 90a335c74442..3a034bc25709 100644 --- a/arch/arm64/kernel/machine_kexec.c +++ b/arch/arm64/kernel/machine_kexec.c @@ -59,23 +59,6 @@ void machine_kexec_cleanup(struct kimage *kimage) /* Empty routine needed to avoid build errors. */ } -int machine_kexec_post_load(struct kimage *kimage) -{ - void *reloc_code = page_to_virt(kimage->control_code_page); - - memcpy(reloc_code, arm64_relocate_new_kernel, - arm64_relocate_new_kernel_size); - kimage->arch.kern_reloc = __pa(reloc_code); - kexec_image_info(kimage); - - /* Flush the reloc_code in preparation for its execution. */ - __flush_dcache_area(reloc_code, arm64_relocate_new_kernel_size); - flush_icache_range((uintptr_t)reloc_code, (uintptr_t)reloc_code + - arm64_relocate_new_kernel_size); - - return 0; -} - /** * machine_kexec_prepare - Prepare for a kexec reboot. * @@ -152,6 +135,29 @@ static void kexec_segment_flush(const struct kimage *kimage) } } +int machine_kexec_post_load(struct kimage *kimage) +{ + void *reloc_code = page_to_virt(kimage->control_code_page); + + /* If in place flush new kernel image, else flush lists and buffers */ + if (kimage->head & IND_DONE) + kexec_segment_flush(kimage); + else + kexec_list_flush(kimage); + + memcpy(reloc_code, arm64_relocate_new_kernel, + arm64_relocate_new_kernel_size); + kimage->arch.kern_reloc = __pa(reloc_code); + kexec_image_info(kimage); + + /* Flush the reloc_code in preparation for its execution. 
*/ + __flush_dcache_area(reloc_code, arm64_relocate_new_kernel_size); + flush_icache_range((uintptr_t)reloc_code, (uintptr_t)reloc_code + + arm64_relocate_new_kernel_size); + + return 0; +} + /** * machine_kexec - Do the kexec reboot. * @@ -169,13 +175,6 @@ void machine_kexec(struct kimage *kimage) WARN(in_kexec_crash && (stuck_cpus || smp_crash_stop_failed()), "Some CPUs may be stale, kdump will be unreliable.\n"); - /* Flush the kimage list and its buffers. */ - kexec_list_flush(kimage); - - /* Flush the new image if already in place. */ - if ((kimage != kexec_crash_image) && (kimage->head & IND_DONE)) - kexec_segment_flush(kimage); - pr_info("Bye!\n"); local_daif_mask(); @@ -250,8 +249,6 @@ void arch_kexec_protect_crashkres(void) { int i; - kexec_segment_flush(kexec_crash_image); - for (i = 0; i < kexec_crash_image->nr_segments; i++) set_memory_valid( __phys_to_virt(kexec_crash_image->segment[i].mem), -- 2.25.1
[PATCH v13 08/18] arm64: kexec: skip relocation code for inplace kexec
In case of kdump or when segments are already in place the relocation is not needed, therefore the setup of relocation function and call to it can be skipped. Signed-off-by: Pavel Tatashin Suggested-by: James Morse --- arch/arm64/kernel/machine_kexec.c | 34 ++--- arch/arm64/kernel/relocate_kernel.S | 3 --- 2 files changed, 21 insertions(+), 16 deletions(-) diff --git a/arch/arm64/kernel/machine_kexec.c b/arch/arm64/kernel/machine_kexec.c index 3a034bc25709..b150b65f0b84 100644 --- a/arch/arm64/kernel/machine_kexec.c +++ b/arch/arm64/kernel/machine_kexec.c @@ -139,21 +139,23 @@ int machine_kexec_post_load(struct kimage *kimage) { void *reloc_code = page_to_virt(kimage->control_code_page); - /* If in place flush new kernel image, else flush lists and buffers */ - if (kimage->head & IND_DONE) + /* If in place, relocation is not used, only flush next kernel */ + if (kimage->head & IND_DONE) { kexec_segment_flush(kimage); - else - kexec_list_flush(kimage); + kexec_image_info(kimage); + return 0; + } memcpy(reloc_code, arm64_relocate_new_kernel, arm64_relocate_new_kernel_size); kimage->arch.kern_reloc = __pa(reloc_code); - kexec_image_info(kimage); /* Flush the reloc_code in preparation for its execution. */ __flush_dcache_area(reloc_code, arm64_relocate_new_kernel_size); flush_icache_range((uintptr_t)reloc_code, (uintptr_t)reloc_code + arm64_relocate_new_kernel_size); + kexec_list_flush(kimage); + kexec_image_info(kimage); return 0; } @@ -180,19 +182,25 @@ void machine_kexec(struct kimage *kimage) local_daif_mask(); /* -* cpu_soft_restart will shutdown the MMU, disable data caches, then -* transfer control to the kern_reloc which contains a copy of -* the arm64_relocate_new_kernel routine. arm64_relocate_new_kernel -* uses physical addressing to relocate the new image to its final -* position and transfers control to the image entry point when the -* relocation is complete. +* Both restart and cpu_soft_restart will shutdown the MMU, disable data +* caches. 
However, restart will start new kernel or purgatory directly, +* cpu_soft_restart will transfer control to arm64_relocate_new_kernel * In kexec case, kimage->start points to purgatory assuming that * kernel entry and dtb address are embedded in purgatory by * userspace (kexec-tools). * In kexec_file case, the kernel starts directly without purgatory. */ - cpu_soft_restart(kimage->arch.kern_reloc, kimage->head, kimage->start, -kimage->arch.dtb_mem); + if (kimage->head & IND_DONE) { + typeof(__cpu_soft_restart) *restart; + + cpu_install_idmap(); + restart = (void *)__pa_symbol(__cpu_soft_restart); + restart(is_hyp_callable(), kimage->start, kimage->arch.dtb_mem, + 0, 0); + } else { + cpu_soft_restart(kimage->arch.kern_reloc, kimage->head, +kimage->start, kimage->arch.dtb_mem); + } BUG(); /* Should never get here. */ } diff --git a/arch/arm64/kernel/relocate_kernel.S b/arch/arm64/kernel/relocate_kernel.S index b78ea5de97a4..8058fabe0a76 100644 --- a/arch/arm64/kernel/relocate_kernel.S +++ b/arch/arm64/kernel/relocate_kernel.S @@ -32,8 +32,6 @@ SYM_CODE_START(arm64_relocate_new_kernel) mov x16, x0 /* x16 = kimage_head */ mov x14, xzr/* x14 = entry ptr */ mov x13, xzr/* x13 = copy dest */ - /* Check if the new image needs relocation. */ - tbnzx16, IND_DONE_BIT, .Ldone raw_dcache_line_size x15, x1/* x15 = dcache line size */ .Lloop: and x12, x16, PAGE_MASK /* x12 = addr */ @@ -65,7 +63,6 @@ SYM_CODE_START(arm64_relocate_new_kernel) .Lnext: ldr x16, [x14], #8 /* entry = *ptr++ */ tbz x16, IND_DONE_BIT, .Lloop /* while (!(entry & DONE)) */ -.Ldone: /* wait for writes from copy_page to finish */ dsb nsh ic iallu -- 2.25.1
[PATCH v13 06/18] arm64: hibernate: abstract ttbr0 setup function
Currently, only hibernate sets a custom ttbr0 with a safe idmapped function. Kexec is also going to use this functionality when the relocation code is idmapped. Move the setup sequence to a dedicated cpu_install_ttbr0() for custom ttbr0. Suggested-by: James Morse Signed-off-by: Pavel Tatashin --- arch/arm64/include/asm/mmu_context.h | 24 arch/arm64/kernel/hibernate.c| 21 + 2 files changed, 25 insertions(+), 20 deletions(-) diff --git a/arch/arm64/include/asm/mmu_context.h b/arch/arm64/include/asm/mmu_context.h index bd02e99b1a4c..f64d0d5e1b1f 100644 --- a/arch/arm64/include/asm/mmu_context.h +++ b/arch/arm64/include/asm/mmu_context.h @@ -115,6 +115,30 @@ static inline void cpu_install_idmap(void) cpu_switch_mm(lm_alias(idmap_pg_dir), &init_mm); } +/* + * Load our new page tables. A strict BBM approach requires that we ensure that + * TLBs are free of any entries that may overlap with the global mappings we are + * about to install. + * + * For a real hibernate/resume/kexec cycle TTBR0 currently points to a zero + * page, but TLBs may contain stale ASID-tagged entries (e.g. for EFI runtime + * services), while for a userspace-driven test_resume cycle it points to + * userspace page tables (and we must point it at a zero page ourselves). + * + * We change T0SZ as part of installing the idmap. This is undone by + * cpu_uninstall_idmap() in __cpu_suspend_exit(). + */ +static inline void cpu_install_ttbr0(phys_addr_t ttbr0, unsigned long t0sz) +{ + cpu_set_reserved_ttbr0(); + local_flush_tlb_all(); + __cpu_set_tcr_t0sz(t0sz); + + /* avoid cpu_switch_mm() and its SW-PAN and CNP interactions */ + write_sysreg(ttbr0, ttbr0_el1); + isb(); +} + /* * Atomically replaces the active TTBR1_EL1 PGD with a new VA-compatible PGD, * avoiding the possibility of conflicting TLB entries being allocated.
diff --git a/arch/arm64/kernel/hibernate.c b/arch/arm64/kernel/hibernate.c index 0b8bad8bb6eb..ded5115bcb63 100644 --- a/arch/arm64/kernel/hibernate.c +++ b/arch/arm64/kernel/hibernate.c @@ -206,26 +206,7 @@ static int create_safe_exec_page(void *src_start, size_t length, if (rc) return rc; - /* -* Load our new page tables. A strict BBM approach requires that we -* ensure that TLBs are free of any entries that may overlap with the -* global mappings we are about to install. -* -* For a real hibernate/resume cycle TTBR0 currently points to a zero -* page, but TLBs may contain stale ASID-tagged entries (e.g. for EFI -* runtime services), while for a userspace-driven test_resume cycle it -* points to userspace page tables (and we must point it at a zero page -* ourselves). -* -* We change T0SZ as part of installing the idmap. This is undone by -* cpu_uninstall_idmap() in __cpu_suspend_exit(). -*/ - cpu_set_reserved_ttbr0(); - local_flush_tlb_all(); - __cpu_set_tcr_t0sz(t0sz); - write_sysreg(trans_ttbr0, ttbr0_el1); - isb(); - + cpu_install_ttbr0(trans_ttbr0, t0sz); *phys_dst_addr = virt_to_phys(page); return 0; -- 2.25.1
[PATCH v13 05/18] arm64: trans_pgd: hibernate: Add trans_pgd_copy_el2_vectors
Users of trans_pgd may also need a copy of the vector table, because it may be overwritten whenever the linear map can be overwritten. Move the setup of the EL2 vectors from hibernate to trans_pgd, so it can later be shared with kexec as well. Suggested-by: James Morse Signed-off-by: Pavel Tatashin --- arch/arm64/include/asm/trans_pgd.h | 3 +++ arch/arm64/include/asm/virt.h | 3 +++ arch/arm64/kernel/hibernate.c | 28 ++-- arch/arm64/mm/trans_pgd.c | 20 4 files changed, 36 insertions(+), 18 deletions(-) diff --git a/arch/arm64/include/asm/trans_pgd.h b/arch/arm64/include/asm/trans_pgd.h index 5d08e5adf3d5..e0760e52d36d 100644 --- a/arch/arm64/include/asm/trans_pgd.h +++ b/arch/arm64/include/asm/trans_pgd.h @@ -36,4 +36,7 @@ int trans_pgd_map_page(struct trans_pgd_info *info, pgd_t *trans_pgd, int trans_pgd_idmap_page(struct trans_pgd_info *info, phys_addr_t *trans_ttbr0, unsigned long *t0sz, void *page); +int trans_pgd_copy_el2_vectors(struct trans_pgd_info *info, + phys_addr_t *el2_vectors); + #endif /* _ASM_TRANS_TABLE_H */ diff --git a/arch/arm64/include/asm/virt.h b/arch/arm64/include/asm/virt.h index 4216c8623538..bfbb66018114 100644 --- a/arch/arm64/include/asm/virt.h +++ b/arch/arm64/include/asm/virt.h @@ -67,6 +67,9 @@ */ extern u32 __boot_cpu_mode[2]; +extern char __hyp_stub_vectors[]; +#define ARM64_VECTOR_TABLE_LEN SZ_2K + void __hyp_set_vectors(phys_addr_t phys_vector_base); void __hyp_reset_vectors(void); diff --git a/arch/arm64/kernel/hibernate.c b/arch/arm64/kernel/hibernate.c index c764574a1acb..0b8bad8bb6eb 100644 --- a/arch/arm64/kernel/hibernate.c +++ b/arch/arm64/kernel/hibernate.c @@ -48,12 +48,6 @@ */ extern int in_suspend; -/* temporary el2 vectors in the __hibernate_exit_text section. */ -extern char hibernate_el2_vectors[]; - -/* hyp-stub vectors, used to restore el2 during resume from hibernate. */ -extern char __hyp_stub_vectors[]; - /* * The logical cpu number we should resume on, initialised to a non-cpu * number.
@@ -428,6 +422,7 @@ int swsusp_arch_resume(void) void *zero_page; size_t exit_size; pgd_t *tmp_pg_dir; + phys_addr_t el2_vectors; void __noreturn (*hibernate_exit)(phys_addr_t, phys_addr_t, void *, void *, phys_addr_t, phys_addr_t); struct trans_pgd_info trans_info = { @@ -455,6 +450,14 @@ int swsusp_arch_resume(void) return -ENOMEM; } + if (is_hyp_callable()) { + rc = trans_pgd_copy_el2_vectors(_info, _vectors); + if (rc) { + pr_err("Failed to setup el2 vectors\n"); + return rc; + } + } + exit_size = __hibernate_exit_text_end - __hibernate_exit_text_start; /* * Copy swsusp_arch_suspend_exit() to a safe page. This will generate @@ -467,25 +470,14 @@ int swsusp_arch_resume(void) return rc; } - /* -* The hibernate exit text contains a set of el2 vectors, that will -* be executed at el2 with the mmu off in order to reload hyp-stub. -*/ - __flush_dcache_area(hibernate_exit, exit_size); - /* * KASLR will cause the el2 vectors to be in a different location in * the resumed kernel. Load hibernate's temporary copy into el2. * * We can skip this step if we booted at EL1, or are running with VHE. */ - if (is_hyp_callable()) { - phys_addr_t el2_vectors = (phys_addr_t)hibernate_exit; - el2_vectors += hibernate_el2_vectors - - __hibernate_exit_text_start; /* offset */ - + if (is_hyp_callable()) __hyp_set_vectors(el2_vectors); - } hibernate_exit(virt_to_phys(tmp_pg_dir), resume_hdr.ttbr1_el1, resume_hdr.reenter_kernel, restore_pblist, diff --git a/arch/arm64/mm/trans_pgd.c b/arch/arm64/mm/trans_pgd.c index 527f0a39c3da..61549451ed3a 100644 --- a/arch/arm64/mm/trans_pgd.c +++ b/arch/arm64/mm/trans_pgd.c @@ -322,3 +322,23 @@ int trans_pgd_idmap_page(struct trans_pgd_info *info, phys_addr_t *trans_ttbr0, return 0; } + +/* + * Create a copy of the vector table so we can call HVC_SET_VECTORS or + * HVC_SOFT_RESTART from contexts where the table may be overwritten. 
+ */ +int trans_pgd_copy_el2_vectors(struct trans_pgd_info *info, + phys_addr_t *el2_vectors) +{ + void *hyp_stub = trans_alloc(info); + + if (!hyp_stub) + return -ENOMEM; + *el2_vectors = virt_to_phys(hyp_stub); + memcpy(hyp_stub, &__hyp_stub_vectors, ARM64_VECTOR_TABLE_LEN); + __flush_icache_range((unsigned long)hyp_stub, +(unsigned long)hyp_stub + ARM64_VECTOR_TABLE_LEN); +
[PATCH v13 04/18] arm64: kernel: add helper for booted at EL2 and not VHE
Replace places that contain logic like this: is_hyp_mode_available() && !is_kernel_in_hyp_mode() With a dedicated boolean function is_hyp_callable(). This will be needed later in kexec in order to sooner switch back to EL2. Suggested-by: James Morse Signed-off-by: Pavel Tatashin --- arch/arm64/include/asm/virt.h | 5 + arch/arm64/kernel/cpu-reset.h | 3 +-- arch/arm64/kernel/hibernate.c | 9 +++-- arch/arm64/kernel/sdei.c | 2 +- 4 files changed, 10 insertions(+), 9 deletions(-) diff --git a/arch/arm64/include/asm/virt.h b/arch/arm64/include/asm/virt.h index 7379f35ae2c6..4216c8623538 100644 --- a/arch/arm64/include/asm/virt.h +++ b/arch/arm64/include/asm/virt.h @@ -128,6 +128,11 @@ static __always_inline bool is_protected_kvm_enabled(void) return cpus_have_final_cap(ARM64_KVM_PROTECTED_MODE); } +static inline bool is_hyp_callable(void) +{ + return is_hyp_mode_available() && !is_kernel_in_hyp_mode(); +} + #endif /* __ASSEMBLY__ */ #endif /* ! __ASM__VIRT_H */ diff --git a/arch/arm64/kernel/cpu-reset.h b/arch/arm64/kernel/cpu-reset.h index ed50e9587ad8..1922e7a690f8 100644 --- a/arch/arm64/kernel/cpu-reset.h +++ b/arch/arm64/kernel/cpu-reset.h @@ -20,8 +20,7 @@ static inline void __noreturn cpu_soft_restart(unsigned long entry, { typeof(__cpu_soft_restart) *restart; - unsigned long el2_switch = !is_kernel_in_hyp_mode() && - is_hyp_mode_available(); + unsigned long el2_switch = is_hyp_callable(); restart = (void *)__pa_symbol(__cpu_soft_restart); cpu_install_idmap(); diff --git a/arch/arm64/kernel/hibernate.c b/arch/arm64/kernel/hibernate.c index b1cef371df2b..c764574a1acb 100644 --- a/arch/arm64/kernel/hibernate.c +++ b/arch/arm64/kernel/hibernate.c @@ -48,9 +48,6 @@ */ extern int in_suspend; -/* Do we need to reset el2? */ -#define el2_reset_needed() (is_hyp_mode_available() && !is_kernel_in_hyp_mode()) - /* temporary el2 vectors in the __hibernate_exit_text section. 
*/ extern char hibernate_el2_vectors[]; @@ -125,7 +122,7 @@ int arch_hibernation_header_save(void *addr, unsigned int max_size) hdr->reenter_kernel = _cpu_resume; /* We can't use __hyp_get_vectors() because kvm may still be loaded */ - if (el2_reset_needed()) + if (is_hyp_callable()) hdr->__hyp_stub_vectors = __pa_symbol(__hyp_stub_vectors); else hdr->__hyp_stub_vectors = 0; @@ -387,7 +384,7 @@ int swsusp_arch_suspend(void) dcache_clean_range(__idmap_text_start, __idmap_text_end); /* Clean kvm setup code to PoC? */ - if (el2_reset_needed()) { + if (is_hyp_callable()) { dcache_clean_range(__hyp_idmap_text_start, __hyp_idmap_text_end); dcache_clean_range(__hyp_text_start, __hyp_text_end); } @@ -482,7 +479,7 @@ int swsusp_arch_resume(void) * * We can skip this step if we booted at EL1, or are running with VHE. */ - if (el2_reset_needed()) { + if (is_hyp_callable()) { phys_addr_t el2_vectors = (phys_addr_t)hibernate_exit; el2_vectors += hibernate_el2_vectors - __hibernate_exit_text_start; /* offset */ diff --git a/arch/arm64/kernel/sdei.c b/arch/arm64/kernel/sdei.c index 2c7ca449dd51..af0ac2f920cf 100644 --- a/arch/arm64/kernel/sdei.c +++ b/arch/arm64/kernel/sdei.c @@ -200,7 +200,7 @@ unsigned long sdei_arch_get_entry_point(int conduit) * dropped to EL1 because we don't support VHE, then we can't support * SDEI. */ - if (is_hyp_mode_available() && !is_kernel_in_hyp_mode()) { + if (is_hyp_callable()) { pr_err("Not supported on this hardware/boot configuration\n"); goto out_err; } -- 2.25.1
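The predicate introduced by this patch is a simple boolean combination of the two existing mode queries. A user-space sketch of its truth table (the two boolean parameters are illustrative stand-ins for the kernel's is_hyp_mode_available()/is_kernel_in_hyp_mode(); this is not kernel code):

```c
#include <assert.h>
#include <stdbool.h>

/* Stand-in for the new helper: EL2 is "callable" through the hyp-stub only
 * when the kernel booted at EL2 (hyp mode available) but currently runs at
 * EL1, i.e. it is not a VHE kernel running in hyp mode itself. */
static bool is_hyp_callable_sketch(bool hyp_mode_available,
                                   bool kernel_in_hyp_mode)
{
        return hyp_mode_available && !kernel_in_hyp_mode;
}
```

This covers the three boot configurations the series cares about: nVHE (booted at EL2, dropped to EL1), VHE (running at EL2), and booted at EL1.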
[PATCH v13 03/18] arm64: hyp-stub: Move el1_sync into the vectors
From: James Morse The hyp-stub's el1_sync code doesn't do very much, this can easily fit in the vectors. With this, all of the hyp-stubs behaviour is contained in its vectors. This lets kexec and hibernate copy the hyp-stub when they need its behaviour, instead of re-implementing it. Signed-off-by: James Morse [Fixed merging issues] Signed-off-by: Pavel Tatashin --- arch/arm64/kernel/hyp-stub.S | 59 ++-- 1 file changed, 29 insertions(+), 30 deletions(-) diff --git a/arch/arm64/kernel/hyp-stub.S b/arch/arm64/kernel/hyp-stub.S index ff329c5c074d..d1a73d0f74e0 100644 --- a/arch/arm64/kernel/hyp-stub.S +++ b/arch/arm64/kernel/hyp-stub.S @@ -21,6 +21,34 @@ SYM_CODE_START_LOCAL(\label) .align 7 b \label SYM_CODE_END(\label) +.endm + +.macro hyp_stub_el1_sync +SYM_CODE_START_LOCAL(hyp_stub_el1_sync) + .align 7 + cmp x0, #HVC_SET_VECTORS + b.ne2f + msr vbar_el2, x1 + b 9f + +2: cmp x0, #HVC_SOFT_RESTART + b.ne3f + mov x0, x2 + mov x2, x4 + mov x4, x1 + mov x1, x3 + br x4 // no return + +3: cmp x0, #HVC_RESET_VECTORS + beq 9f // Nothing to reset! + + /* Someone called kvm_call_hyp() against the hyp-stub... */ + mov_q x0, HVC_STUB_ERR + eret + +9: mov x0, xzr + eret +SYM_CODE_END(hyp_stub_el1_sync) .endm .text @@ -39,7 +67,7 @@ SYM_CODE_START(__hyp_stub_vectors) invalid_vector hyp_stub_el2h_fiq_invalid // FIQ EL2h invalid_vector hyp_stub_el2h_error_invalid // Error EL2h - ventry el1_sync// Synchronous 64-bit EL1 + hyp_stub_el1_sync // Synchronous 64-bit EL1 invalid_vector hyp_stub_el1_irq_invalid// IRQ 64-bit EL1 invalid_vector hyp_stub_el1_fiq_invalid// FIQ 64-bit EL1 invalid_vector hyp_stub_el1_error_invalid // Error 64-bit EL1 @@ -55,35 +83,6 @@ SYM_CODE_END(__hyp_stub_vectors) # Check the __hyp_stub_vectors didn't overflow .org . 
- (__hyp_stub_vectors_end - __hyp_stub_vectors) + SZ_2K - -SYM_CODE_START_LOCAL(el1_sync) - cmp x0, #HVC_SET_VECTORS - b.ne1f - msr vbar_el2, x1 - b 9f - -1: cmp x0, #HVC_VHE_RESTART - b.eqmutate_to_vhe - -2: cmp x0, #HVC_SOFT_RESTART - b.ne3f - mov x0, x2 - mov x2, x4 - mov x4, x1 - mov x1, x3 - br x4 // no return - -3: cmp x0, #HVC_RESET_VECTORS - beq 9f // Nothing to reset! - - /* Someone called kvm_call_hyp() against the hyp-stub... */ - mov_q x0, HVC_STUB_ERR - eret - -9: mov x0, xzr - eret -SYM_CODE_END(el1_sync) - // nVHE? No way! Give me the real thing! SYM_CODE_START_LOCAL(mutate_to_vhe) // Sanity check: MMU *must* be off -- 2.25.1
[PATCH v13 02/18] arm64: hyp-stub: Move invalid vector entries into the vectors
From: James Morse Most of the hyp-stub's vector entries are invalid. These are each a unique function that branches to itself. To move these into the vectors, merge the ventry and invalid_vector macros and give each one a unique name. This means we can copy the hyp-stub as it is self contained within its vectors. Signed-off-by: James Morse [Fixed merging issues] Signed-off-by: Pavel Tatashin --- arch/arm64/kernel/hyp-stub.S | 56 +++- 1 file changed, 23 insertions(+), 33 deletions(-) diff --git a/arch/arm64/kernel/hyp-stub.S b/arch/arm64/kernel/hyp-stub.S index 572b28646005..ff329c5c074d 100644 --- a/arch/arm64/kernel/hyp-stub.S +++ b/arch/arm64/kernel/hyp-stub.S @@ -16,31 +16,38 @@ #include #include +.macro invalid_vector label +SYM_CODE_START_LOCAL(\label) + .align 7 + b \label +SYM_CODE_END(\label) +.endm + .text .pushsection.hyp.text, "ax" .align 11 SYM_CODE_START(__hyp_stub_vectors) - ventry el2_sync_invalid// Synchronous EL2t - ventry el2_irq_invalid // IRQ EL2t - ventry el2_fiq_invalid // FIQ EL2t - ventry el2_error_invalid // Error EL2t + invalid_vector hyp_stub_el2t_sync_invalid // Synchronous EL2t + invalid_vector hyp_stub_el2t_irq_invalid // IRQ EL2t + invalid_vector hyp_stub_el2t_fiq_invalid // FIQ EL2t + invalid_vector hyp_stub_el2t_error_invalid // Error EL2t - ventry el2_sync_invalid// Synchronous EL2h - ventry el2_irq_invalid // IRQ EL2h - ventry el2_fiq_invalid // FIQ EL2h - ventry el2_error_invalid // Error EL2h + invalid_vector hyp_stub_el2h_sync_invalid // Synchronous EL2h + invalid_vector hyp_stub_el2h_irq_invalid // IRQ EL2h + invalid_vector hyp_stub_el2h_fiq_invalid // FIQ EL2h + invalid_vector hyp_stub_el2h_error_invalid // Error EL2h ventry el1_sync// Synchronous 64-bit EL1 - ventry el1_irq_invalid // IRQ 64-bit EL1 - ventry el1_fiq_invalid // FIQ 64-bit EL1 - ventry el1_error_invalid // Error 64-bit EL1 - - ventry el1_sync_invalid// Synchronous 32-bit EL1 - ventry el1_irq_invalid // IRQ 32-bit EL1 - ventry el1_fiq_invalid // FIQ 32-bit EL1 
- ventry el1_error_invalid // Error 32-bit EL1 + invalid_vector hyp_stub_el1_irq_invalid// IRQ 64-bit EL1 + invalid_vector hyp_stub_el1_fiq_invalid// FIQ 64-bit EL1 + invalid_vector hyp_stub_el1_error_invalid // Error 64-bit EL1 + + invalid_vector hyp_stub_32b_el1_sync_invalid // Synchronous 32-bit EL1 + invalid_vector hyp_stub_32b_el1_irq_invalid// IRQ 32-bit EL1 + invalid_vector hyp_stub_32b_el1_fiq_invalid// FIQ 32-bit EL1 + invalid_vector hyp_stub_32b_el1_error_invalid // Error 32-bit EL1 .align 11 SYM_INNER_LABEL(__hyp_stub_vectors_end, SYM_L_LOCAL) SYM_CODE_END(__hyp_stub_vectors) @@ -173,23 +180,6 @@ SYM_CODE_END(enter_vhe) .popsection -.macro invalid_vector label -SYM_CODE_START_LOCAL(\label) - b \label -SYM_CODE_END(\label) -.endm - - invalid_vector el2_sync_invalid - invalid_vector el2_irq_invalid - invalid_vector el2_fiq_invalid - invalid_vector el2_error_invalid - invalid_vector el1_sync_invalid - invalid_vector el1_irq_invalid - invalid_vector el1_fiq_invalid - invalid_vector el1_error_invalid - - .popsection - /* * __hyp_set_vectors: Call this after boot to set the initial hypervisor * vectors as part of hypervisor installation. On an SMP system, this should -- 2.25.1
[PATCH v13 01/18] arm64: hyp-stub: Check the size of the HYP stub's vectors
From: James Morse Hibernate contains a set of temporary EL2 vectors used to 'park' EL2 somewhere safe while all the memory is thrown in the air. Making kexec do its relocations with the MMU on means they have to be done at EL1, so EL2 has to be parked. This means yet another set of vectors. All these things do is HVC_SET_VECTORS and HVC_SOFT_RESTART, both of which are implemented by the hyp-stub. Lets copy it instead of re-inventing it. To do this the hyp-stub's entrails need to be packed neatly inside its 2K vectors. Start by moving the final 2K alignment inside the end marker, and add a build check that we didn't overflow 2K. Signed-off-by: James Morse Signed-off-by: Pavel Tatashin --- arch/arm64/kernel/hyp-stub.S | 6 +- 1 file changed, 5 insertions(+), 1 deletion(-) diff --git a/arch/arm64/kernel/hyp-stub.S b/arch/arm64/kernel/hyp-stub.S index 5eccbd62fec8..572b28646005 100644 --- a/arch/arm64/kernel/hyp-stub.S +++ b/arch/arm64/kernel/hyp-stub.S @@ -41,9 +41,13 @@ SYM_CODE_START(__hyp_stub_vectors) ventry el1_irq_invalid // IRQ 32-bit EL1 ventry el1_fiq_invalid // FIQ 32-bit EL1 ventry el1_error_invalid // Error 32-bit EL1 + .align 11 +SYM_INNER_LABEL(__hyp_stub_vectors_end, SYM_L_LOCAL) SYM_CODE_END(__hyp_stub_vectors) - .align 11 +# Check the __hyp_stub_vectors didn't overflow +.org . - (__hyp_stub_vectors_end - __hyp_stub_vectors) + SZ_2K + SYM_CODE_START_LOCAL(el1_sync) cmp x0, #HVC_SET_VECTORS -- 2.25.1
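The `.org` build check works because the assembler refuses to move the location counter backwards: `.org . - (__hyp_stub_vectors_end - __hyp_stub_vectors) + SZ_2K` lands the counter at `__hyp_stub_vectors + SZ_2K`, which is behind the current position exactly when the vectors exceed 2K. A small C sketch of that arithmetic (illustrative only, not part of the patch):

```c
#include <assert.h>
#include <stddef.h>

#define SZ_2K 2048

/* Mirrors ".org . - (end - start) + SZ_2K": the new location counter is
 * current - size + SZ_2K; asking the assembler to move it backwards
 * (i.e. size > SZ_2K) is a hard build error. */
static int org_check_passes(size_t current_pos, size_t vectors_size)
{
        size_t target = current_pos - vectors_size + SZ_2K;

        return target >= current_pos;   /* forward (or zero) motion only */
}
```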
[PATCH v13 00/18] arm64: MMU enabled kexec relocation
Changelog: v13: - Fixed a hang on ThunderX2, thank you Pingfan Liu for reporting the problem. In the relocation function we need civac, not ivac: we need to clean the data in addition to invalidating it. Since I was using a ThunderX2 machine, I also measured the new performance data on this large ARM64 server. The MMU improves kexec relocation 190 times on this machine! (see below for raw data). Saves 7.5s during CentOS kexec reboot. v12: - A major change compared to previous version. Instead of using contiguous VA range a copy of linear map is now used to perform copying of segments during relocation as it was agreed in the discussion of version 11 of this project. - In addition to using linear map, I also took several ideas from James Morse to better organize the kexec relocation: 1. skip relocation function entirely if that is not needed 2. remove the PoC flushing function since it is not needed anymore with MMU enabled. v11: - Fixed missing KEXEC_CORE dependency for trans_pgd.c - Removed useless "if(rc) return rc" statement (thank you Tyler Hicks) - Another 12 patches were accepted into maintainer's tree. Re-based patches against: https://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux.git Branch: for-next/kexec v10: - Addressed a lot of comments from James Morse and from Marc Zyngier - Added review-by's - Synchronized with mainline v9: - 9 patches from previous series landed in upstream, so now series is smaller - Added two patches from James Morse to address idmap issues for machines with high physical addresses. - Addressed comments from Selin Dag about compiling issues. He also tested my series and got similar performance results: ~60 ms instead of ~580 ms with an initramfs size of ~120MB.
v8: - Synced with mainline to keep series up-to-date v7: - Addressed comments from James Morse - arm64: hibernate: pass the allocated pgdp to ttbr0 Removed "Fixes" tag, and added Reviewed-by: James Morse - arm64: hibernate: check pgd table allocation Sent out as a standalone patch so it can be sent to stable Series applies on mainline + this patch - arm64: hibernate: add trans_pgd public functions Remove second allocation of tmp_pg_dir in swsusp_arch_resume Added Reviewed-by: James Morse - arm64: kexec: move relocation function setup and clean up Fixed typo in commit log Changed kern_reloc to phys_addr_t types. Added explanation why kern_reloc is needed. Split into four patches: arm64: kexec: make dtb_mem always enabled arm64: kexec: remove unnecessary debug prints arm64: kexec: call kexec_image_info only once arm64: kexec: move relocation function setup - arm64: kexec: add expandable argument to relocation function Changed types of new arguments from unsigned long to phys_addr_t. Changed offset prefix to KEXEC_* Split into four patches: arm64: kexec: cpu_soft_restart change argument types arm64: kexec: arm64_relocate_new_kernel clean-ups arm64: kexec: arm64_relocate_new_kernel don't use x0 as temp arm64: kexec: add expandable argument to relocation function - arm64: kexec: configure trans_pgd page table for kexec Added invalid entries into EL2 vector table Removed KEXEC_EL2_VECTOR_TABLE_SIZE and KEXEC_EL2_VECTOR_TABLE_OFFSET Copy relocation functions and table into separate pages Changed types in kern_reloc_arg.
Split into three patches: arm64: kexec: offset for relocation function arm64: kexec: kexec EL2 vectors arm64: kexec: configure trans_pgd page table for kexec - arm64: kexec: enable MMU during kexec relocation Split into two patches: arm64: kexec: enable MMU during kexec relocation arm64: kexec: remove head from relocation argument v6: - Sync with mainline tip - Added Acked's from Dave Young v5: - Addressed comments from Matthias Brugger: added review-by's, improved comments, and made cleanups to swsusp_arch_resume() in addition to create_safe_exec_page(). - Synced with mainline tip. v4: - Addressed comments from James Morse. - Split "check pgd table allocation" into two patches, and moved to the beginning of series for simpler backport of the fixes. Added "Fixes:" tags to commit logs. - Changed "arm64, hibernate:" to "arm64: hibernate:" - Added Reviewed-by's - Moved "add PUD_SECT_RDONLY" earlier in series
[PATCH 4/4] spi: spi-zynqmp-gqspi: fix incorrect operating mode in zynqmp_qspi_read_op
From: Quanyang Wang When starting a read operation, we should call zynqmp_qspi_setuprxdma first to set xqspi->mode according to xqspi->bytes_to_receive and to calculate correct xqspi->dma_rx_bytes. Then in the function zynqmp_qspi_fillgenfifo, generate the appropriate command with operating mode and bytes to transfer, and fill the GENFIFO with the command to perform the read operation. Calling zynqmp_qspi_fillgenfifo before zynqmp_qspi_setuprxdma will result in incorrect transfer length and operating mode. So change the calling order to fix this issue. Fixes: 1c26372e5aa9 ("spi: spi-zynqmp-gqspi: Update driver to use spi-mem framework") Signed-off-by: Quanyang Wang Reviewed-by: Amit Kumar Mahapatra --- drivers/spi/spi-zynqmp-gqspi.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/drivers/spi/spi-zynqmp-gqspi.c b/drivers/spi/spi-zynqmp-gqspi.c index cf73a069b759..036d8ae41c06 100644 --- a/drivers/spi/spi-zynqmp-gqspi.c +++ b/drivers/spi/spi-zynqmp-gqspi.c @@ -827,8 +827,8 @@ static void zynqmp_qspi_write_op(struct zynqmp_qspi *xqspi, u8 tx_nbits, static void zynqmp_qspi_read_op(struct zynqmp_qspi *xqspi, u8 rx_nbits, u32 genfifoentry) { - zynqmp_qspi_fillgenfifo(xqspi, rx_nbits, genfifoentry); zynqmp_qspi_setuprxdma(xqspi); + zynqmp_qspi_fillgenfifo(xqspi, rx_nbits, genfifoentry); } /** -- 2.25.1
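The ordering bug above can be modeled in a few lines: one step computes the transfer mode and DMA length, the next step consumes them, so swapping the calls reads stale state. This is an illustrative user-space sketch — the threshold, the word-alignment mask, and the function names are made up for the example, not taken from the driver:

```c
#include <assert.h>

/* Rough model of the two driver steps: "setuprxdma" decides IO-vs-DMA mode
 * and the DMA byte count, "fillgenfifo" consumes that state, so it must run
 * second (the fix in this patch). */
enum rx_mode { RX_IO, RX_DMA };

struct rx_state {
        unsigned int bytes_to_receive;
        unsigned int dma_rx_bytes;
        enum rx_mode mode;
};

static void setuprxdma_sketch(struct rx_state *x)
{
        if (x->bytes_to_receive < 8) {          /* small transfer: PIO */
                x->mode = RX_IO;
                x->dma_rx_bytes = 0;
        } else {                                /* large: DMA, word aligned */
                x->mode = RX_DMA;
                x->dma_rx_bytes = x->bytes_to_receive & ~0x3u;
        }
}

/* the transfer length programmed into the GENFIFO entry depends on the
 * mode/length state computed above */
static unsigned int fillgenfifo_len_sketch(const struct rx_state *x)
{
        return x->mode == RX_DMA ? x->dma_rx_bytes : x->bytes_to_receive;
}

/* fixed call order: setuprxdma first, then fillgenfifo */
static unsigned int read_op_len(unsigned int bytes_to_receive)
{
        struct rx_state x = { .bytes_to_receive = bytes_to_receive };

        setuprxdma_sketch(&x);
        return fillgenfifo_len_sketch(&x);
}
```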
[PATCH 3/4] spi: spi-zynqmp-gqspi: transmit dummy cycles by using the controller's internal functionality
From: Quanyang Wang There is a data corruption issue that occurs in the reading operation (cmd:0x6c) when transmitting common data as dummy cycles. The gqspi controller has the functionality to send dummy clock cycles. When writing data with the fields [receive, transmit, data_xfer] = [0,0,1] to the Generic FIFO, and configuring the correct SPI mode, the controller will transmit dummy cycles. So let's switch to hardware dummy-cycle transfer to fix this issue. Fixes: 1c26372e5aa9 ("spi: spi-zynqmp-gqspi: Update driver to use spi-mem framework") Signed-off-by: Quanyang Wang Reviewed-by: Amit Kumar Mahapatra --- drivers/spi/spi-zynqmp-gqspi.c | 40 +++--- 1 file changed, 18 insertions(+), 22 deletions(-) diff --git a/drivers/spi/spi-zynqmp-gqspi.c b/drivers/spi/spi-zynqmp-gqspi.c index 3b39461d58b3..cf73a069b759 100644 --- a/drivers/spi/spi-zynqmp-gqspi.c +++ b/drivers/spi/spi-zynqmp-gqspi.c @@ -521,7 +521,7 @@ static void zynqmp_qspi_filltxfifo(struct zynqmp_qspi *xqspi, int size) { u32 count = 0, intermediate; - while ((xqspi->bytes_to_transfer > 0) && (count < size)) { + while ((xqspi->bytes_to_transfer > 0) && (count < size) && (xqspi->txbuf)) { memcpy(&intermediate, xqspi->txbuf, 4); zynqmp_gqspi_write(xqspi, GQSPI_TXD_OFST, intermediate); @@ -580,7 +580,7 @@ static void zynqmp_qspi_fillgenfifo(struct zynqmp_qspi *xqspi, u8 nbits, genfifoentry |= GQSPI_GENFIFO_DATA_XFER; genfifoentry |= GQSPI_GENFIFO_TX; transfer_len = xqspi->bytes_to_transfer; - } else { + } else if (xqspi->rxbuf) { genfifoentry &= ~GQSPI_GENFIFO_TX; genfifoentry |= GQSPI_GENFIFO_DATA_XFER; genfifoentry |= GQSPI_GENFIFO_RX; @@ -588,6 +588,11 @@ static void zynqmp_qspi_fillgenfifo(struct zynqmp_qspi *xqspi, u8 nbits, transfer_len = xqspi->dma_rx_bytes; else transfer_len = xqspi->bytes_to_receive; + } else { + /* Sending dummy cycles here */ + genfifoentry &= ~(GQSPI_GENFIFO_TX | GQSPI_GENFIFO_RX); + genfifoentry |= GQSPI_GENFIFO_DATA_XFER; + transfer_len = xqspi->bytes_to_transfer; } genfifoentry |= 
zynqmp_qspi_selectspimode(xqspi, nbits); xqspi->genfifoentry = genfifoentry; @@ -1011,32 +1016,23 @@ static int zynqmp_qspi_exec_op(struct spi_mem *mem, } if (op->dummy.nbytes) { - tmpbuf = kzalloc(op->dummy.nbytes, GFP_KERNEL | GFP_DMA); - if (!tmpbuf) - return -ENOMEM; - memset(tmpbuf, 0xff, op->dummy.nbytes); - reinit_completion(&xqspi->data_completion); - xqspi->txbuf = tmpbuf; + xqspi->txbuf = NULL; xqspi->rxbuf = NULL; - xqspi->bytes_to_transfer = op->dummy.nbytes; + /* +* xqspi->bytes_to_transfer here represents the dummy cycles +* which need to be sent. +*/ + xqspi->bytes_to_transfer = op->dummy.nbytes * 8 / op->dummy.buswidth; xqspi->bytes_to_receive = 0; - zynqmp_qspi_write_op(xqspi, op->dummy.buswidth, + /* +* Using op->data.buswidth instead of op->dummy.buswidth here because +* we need to use it to configure the correct SPI mode. +*/ + zynqmp_qspi_write_op(xqspi, op->data.buswidth, genfifoentry); zynqmp_gqspi_write(xqspi, GQSPI_CONFIG_OFST, zynqmp_gqspi_read(xqspi, GQSPI_CONFIG_OFST) | GQSPI_CFG_START_GEN_FIFO_MASK); - zynqmp_gqspi_write(xqspi, GQSPI_IER_OFST, - GQSPI_IER_TXEMPTY_MASK | - GQSPI_IER_GENFIFOEMPTY_MASK | - GQSPI_IER_TXNOT_FULL_MASK); - if (!wait_for_completion_interruptible_timeout - (&xqspi->data_completion, msecs_to_jiffies(1000))) { - err = -ETIMEDOUT; - kfree(tmpbuf); - goto return_err; - } - - kfree(tmpbuf); } if (op->data.nbytes) { -- 2.25.1
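spi-mem describes the dummy phase in bytes, while the GENFIFO wants clock cycles on the wire; the conversion the patch uses is bits-in-the-phase divided by the bus width of that phase. A quick sketch of the arithmetic:

```c
#include <assert.h>

/* nbytes dummy bytes at a given bus width occupy nbytes * 8 / buswidth
 * clock cycles, which is what the patch stores in bytes_to_transfer for
 * the dummy phase. */
static unsigned int dummy_cycles(unsigned int nbytes, unsigned int buswidth)
{
        return nbytes * 8 / buswidth;
}
```

So, for example, four dummy bytes on a quad (x4) bus still come out to eight clock cycles, the same as one byte on a single-wire bus.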
Re: [PATCH 5/8] dt-bindings: soc: mediatek: apusys: Add new document for APU power domain
Hi, Rob, The error results from an un-merged dependency. Please note that the patch depends on the MT8192 clock patches, which haven't been accepted yet. https://patchwork.kernel.org/project/linux-mediatek/patch/20210324104110.13383-7-chun-jie.c...@mediatek.com/ Thanks for your review. On Wed, 2021-04-07 at 09:28 -0500, Rob Herring wrote: > On Wed, 07 Apr 2021 11:28:03 +0800, Flora Fu wrote: > > Document the bindings for APU power domain on MediaTek SoC. > > > > Signed-off-by: Flora Fu > > --- > > .../soc/mediatek/mediatek,apu-pm.yaml | 146 ++ > > 1 file changed, 146 insertions(+) > > create mode 100644 > > Documentation/devicetree/bindings/soc/mediatek/mediatek,apu-pm.yaml > > > > My bot found errors running 'make DT_CHECKER_FLAGS=-m dt_binding_check' > on your patch (DT_CHECKER_FLAGS is new in v5.13): > > yamllint warnings/errors: > > dtschema/dtc warnings/errors: > Documentation/devicetree/bindings/soc/mediatek/mediatek,apu-pm.example.dts:19:18: > fatal error: dt-bindings/clock/mt8192-clk.h: No such file or directory >19 | #include > | ^~~~ > compilation terminated. > make[1]: *** [scripts/Makefile.lib:377: > Documentation/devicetree/bindings/soc/mediatek/mediatek,apu-pm.example.dt.yaml] > Error 1 > make[1]: *** Waiting for unfinished jobs > make: *** [Makefile:1414: dt_binding_check] Error 2 > > See > https://urldefense.com/v3/__https://patchwork.ozlabs.org/patch/1463115__;!!CTRNKA9wMg0ARbw!0XUn1LcNHfvUShNClpM_yH73TAR9qdm29SZMckasoCQ8UzeKS-vZW0QUu3Ssn-s6$ > > > This check can fail if there are any dependencies. The base for a patch > series is generally the most recent rc1. > > If you already ran 'make dt_binding_check' and didn't see the above > error(s), then make sure 'yamllint' is installed and dt-schema is up to > date: > > pip3 install dtschema --upgrade > > Please check and re-submit. >
[PATCH 2/4] spi: spi-zynqmp-gqspi: add mutex locking for exec_op
From: Quanyang Wang The spi-mem framework has no locking to prevent concurrent calls to ctlr->mem_ops->exec_op. So add the locking to zynqmp_qspi_exec_op. Fixes: 1c26372e5aa9 ("spi: spi-zynqmp-gqspi: Update driver to use spi-mem framework") Signed-off-by: Quanyang Wang Reviewed-by: Amit Kumar Mahapatra --- drivers/spi/spi-zynqmp-gqspi.c | 5 + 1 file changed, 5 insertions(+) diff --git a/drivers/spi/spi-zynqmp-gqspi.c b/drivers/spi/spi-zynqmp-gqspi.c index d49ab6575553..3b39461d58b3 100644 --- a/drivers/spi/spi-zynqmp-gqspi.c +++ b/drivers/spi/spi-zynqmp-gqspi.c @@ -173,6 +173,7 @@ struct zynqmp_qspi { u32 genfifoentry; enum mode_type mode; struct completion data_completion; + struct mutex op_lock; }; /** @@ -951,6 +952,7 @@ static int zynqmp_qspi_exec_op(struct spi_mem *mem, op->cmd.opcode, op->cmd.buswidth, op->addr.buswidth, op->dummy.buswidth, op->data.buswidth); + mutex_lock(&xqspi->op_lock); zynqmp_qspi_config_op(xqspi, mem->spi); zynqmp_qspi_chipselect(mem->spi, false); genfifoentry |= xqspi->genfifocs; @@ -1084,6 +1086,7 @@ static int zynqmp_qspi_exec_op(struct spi_mem *mem, return_err: zynqmp_qspi_chipselect(mem->spi, true); + mutex_unlock(&xqspi->op_lock); return err; } @@ -1156,6 +1159,8 @@ static int zynqmp_qspi_probe(struct platform_device *pdev) goto clk_dis_pclk; } + mutex_init(&xqspi->op_lock); + pm_runtime_use_autosuspend(&pdev->dev); pm_runtime_set_autosuspend_delay(&pdev->dev, SPI_AUTOSUSPEND_TIMEOUT); pm_runtime_set_active(&pdev->dev); -- 2.25.1
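The locking pattern here is: take the mutex at the top of exec_op, and funnel every exit — including the error path — through the same unlock. A user-space sketch of that shape (a toy lock counter stands in for the kernel mutex; names are illustrative):

```c
#include <assert.h>

/* toy stand-in for struct mutex op_lock: 0 = free, 1 = held */
static int op_lock_depth;

static void mutex_lock_sketch(void)
{
        assert(op_lock_depth == 0);     /* would block, not recurse */
        op_lock_depth++;
}

static void mutex_unlock_sketch(void)
{
        assert(op_lock_depth == 1);     /* never unlock a free lock */
        op_lock_depth--;
}

/* mirrors the patch's structure: lock on entry, single unlock that both
 * the success and the goto-return_err paths reach */
static int exec_op_sketch(int simulate_error)
{
        int err = 0;

        mutex_lock_sketch();            /* mutex_lock(&xqspi->op_lock) */
        if (simulate_error) {
                err = -1;
                goto return_err;        /* errors must still unlock */
        }
        /* ... cmd/addr/dummy/data phases would run here ... */
return_err:
        mutex_unlock_sketch();          /* mutex_unlock(&xqspi->op_lock) */
        return err;
}
```

The assertions in the toy lock catch the bug the single-exit structure prevents: an early `return` on the error path would leave the lock held, and the next call would trip `op_lock_depth == 0`.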
[PATCH 1/4] spi: spi-zynqmp-gqspi: use wait_for_completion_timeout to make zynqmp_qspi_exec_op not interruptible
From: Quanyang Wang When Ctrl+C occurs during the process of zynqmp_qspi_exec_op, the function wait_for_completion_interruptible_timeout will return a non-zero value -ERESTARTSYS immediately. This will disrupt the SPI memory operation because data transmission may begin before the command or address transmission completes. Use wait_for_completion_timeout so that the wait is not interruptible. This patch fixes the error below: root@xilinx-zynqmp:~# flash_erase /dev/mtd3 0 0 Erasing 4 Kibyte @ 3d000 -- 4 % complete (Press Ctrl+C) [ 169.581911] zynqmp-qspi ff0f.spi: Chip select timed out [ 170.585907] zynqmp-qspi ff0f.spi: Chip select timed out [ 171.589910] zynqmp-qspi ff0f.spi: Chip select timed out [ 172.593910] zynqmp-qspi ff0f.spi: Chip select timed out [ 173.597907] zynqmp-qspi ff0f.spi: Chip select timed out [ 173.603480] spi-nor spi0.0: Erase operation failed. [ 173.608368] spi-nor spi0.0: Attempted to modify a protected sector. Fixes: 1c26372e5aa9 ("spi: spi-zynqmp-gqspi: Update driver to use spi-mem framework") Signed-off-by: Quanyang Wang Reviewed-by: Amit Kumar Mahapatra --- drivers/spi/spi-zynqmp-gqspi.c | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-) diff --git a/drivers/spi/spi-zynqmp-gqspi.c b/drivers/spi/spi-zynqmp-gqspi.c index c8fa6ee18ae7..d49ab6575553 100644 --- a/drivers/spi/spi-zynqmp-gqspi.c +++ b/drivers/spi/spi-zynqmp-gqspi.c @@ -973,7 +973,7 @@ static int zynqmp_qspi_exec_op(struct spi_mem *mem, zynqmp_gqspi_write(xqspi, GQSPI_IER_OFST, GQSPI_IER_GENFIFOEMPTY_MASK | GQSPI_IER_TXNOT_FULL_MASK); - if (!wait_for_completion_interruptible_timeout + if (!wait_for_completion_timeout (&xqspi->data_completion, msecs_to_jiffies(1000))) { err = -ETIMEDOUT; kfree(tmpbuf); @@ -1001,7 +1001,7 @@ static int zynqmp_qspi_exec_op(struct spi_mem *mem, GQSPI_IER_TXEMPTY_MASK | GQSPI_IER_GENFIFOEMPTY_MASK | GQSPI_IER_TXNOT_FULL_MASK); - if (!wait_for_completion_interruptible_timeout + if (!wait_for_completion_timeout (&xqspi->data_completion, msecs_to_jiffies(1000))) { err = -ETIMEDOUT; goto return_err; } @@ -1076,7 +1076,7 @@ static int zynqmp_qspi_exec_op(struct spi_mem *mem, GQSPI_IER_RXEMPTY_MASK); } } - if (!wait_for_completion_interruptible_timeout + if (!wait_for_completion_timeout (&xqspi->data_completion, msecs_to_jiffies(1000))) err = -ETIMEDOUT; } -- 2.25.1
[PATCH 0/4] spi: spi-zynqmp-gpspi: fix some issues
From: Quanyang Wang Hello, This series fix some issues that occurs when the gqspi driver switches to spi-mem framework. Hi Amit, I rewrite the "Subject" and "commit message" of these patches, so they look different from the ones which you reviewed before. I still keep your "Reviewed-by" and hope you will not mind. Regards, Quanyang Wang Quanyang Wang (4): spi: spi-zynqmp-gqspi: use wait_for_completion_timeout to make zynqmp_qspi_exec_op not interruptible spi: spi-zynqmp-gqspi: add mutex locking for exec_op spi: spi-zynqmp-gqspi: transmit dummy circles by using the controller's internal functionality spi: spi-zynqmp-gqspi: fix incorrect operating mode in zynqmp_qspi_read_op drivers/spi/spi-zynqmp-gqspi.c | 53 +- 1 file changed, 27 insertions(+), 26 deletions(-) -- 2.25.1
linux-next: build failure after merge of the bluetooth tree
Hi all, After merging the bluetooth tree, today's linux-next build (x86_64 allmodconfig) failed like this: In file included from :32: ./usr/include/linux/virtio_bt.h:1:1: error: C++ style comments are not allowed in ISO C90 1 | // SPDX-License-Identifier: BSD-3-Clause | ^ ./usr/include/linux/virtio_bt.h:1:1: note: (this will be reported only once per input file) Caused by commit 148a48f61393 ("Bluetooth: Add support for virtio transport driver") I have used the bluetooth tree from next-20210407 for today. -- Cheers, Stephen Rothwell pgpt9Z2bR_E0e.pgp Description: OpenPGP digital signature
[PATCH -next] powerpc/mce: Make symbol 'mce_ue_event_work' static
The sparse tool complains as follows: arch/powerpc/kernel/mce.c:43:1: warning: symbol 'mce_ue_event_work' was not declared. Should it be static? This symbol is not used outside of mce.c, so this commit marks it static. Signed-off-by: Li Huafei --- arch/powerpc/kernel/mce.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/arch/powerpc/kernel/mce.c b/arch/powerpc/kernel/mce.c index 11f0cae086ed..6aa6b1cda1ed 100644 --- a/arch/powerpc/kernel/mce.c +++ b/arch/powerpc/kernel/mce.c @@ -40,7 +40,7 @@ static struct irq_work mce_ue_event_irq_work = { .func = machine_check_ue_irq_work, }; -DECLARE_WORK(mce_ue_event_work, machine_process_ue_event); +static DECLARE_WORK(mce_ue_event_work, machine_process_ue_event); static BLOCKING_NOTIFIER_HEAD(mce_notifier_list); -- 2.17.1
[PATCH v3 6/6] percpu: implement partial chunk depopulation
This patch implements partial depopulation of percpu chunks. As of now, a chunk can be depopulated only as part of its final destruction, if there are no more outstanding allocations. However, to minimize memory waste, it might be useful to depopulate a partially filled chunk, if a small number of outstanding allocations prevents the chunk from being fully reclaimed. This patch implements the following depopulation process: it scans over the chunk pages, looks for a range of empty and populated pages and performs the depopulation. To avoid races with new allocations, the chunk is previously isolated. After the depopulation the chunk is sidelined to a special list or freed. New allocations can't be served using a sidelined chunk. The chunk can be moved back to a corresponding slot if there are not enough chunks with empty populated pages. The depopulation is scheduled on the free path. A chunk is a good target for depopulation if it: 1) has more than 1/4 of its total pages free and populated, 2) leaves the system enough free percpu pages aside from it, 3) isn't the reserved chunk, 4) isn't the first chunk, and 5) isn't entirely free. If it's already depopulated but got free populated pages, it's a good target too. The chunk is moved to a special pcpu_depopulate_list, the chunk->isolated flag is set and the async balancing is scheduled. The async balancing moves pcpu_depopulate_list to a local list (because pcpu_depopulate_list can be changed when pcpu_lock is released), and then tries to depopulate each chunk. The depopulation is performed in the reverse direction, to keep populated pages close to the beginning, once the global number of empty pages is reached. Depopulated chunks are sidelined to prevent further allocations. Skipped and fully empty chunks are returned to the corresponding slot. On the allocation path, if there are no suitable chunks found, the list of sidelined chunks is scanned prior to creating a new chunk. 
If there is a good sidelined chunk, it's placed back to the slot and the scanning is restarted. Many thanks to Dennis Zhou for his great ideas and a very constructive discussion which led to many improvements in this patchset! Signed-off-by: Roman Gushchin --- mm/percpu-internal.h | 2 + mm/percpu.c | 158 ++- 2 files changed, 158 insertions(+), 2 deletions(-) diff --git a/mm/percpu-internal.h b/mm/percpu-internal.h index 095d7eaa0db4..8e432663c41e 100644 --- a/mm/percpu-internal.h +++ b/mm/percpu-internal.h @@ -67,6 +67,8 @@ struct pcpu_chunk { void *data; /* chunk data */ bool immutable; /* no [de]population allowed */ + bool isolated; /* isolated from chunk slot lists */ + bool depopulated; /* sidelined after depopulation */ int start_offset; /* the overlap with the previous region to have a page aligned base_addr */ diff --git a/mm/percpu.c b/mm/percpu.c index 357fd6994278..5bb294e394b3 100644 --- a/mm/percpu.c +++ b/mm/percpu.c @@ -181,6 +181,19 @@ static LIST_HEAD(pcpu_map_extend_chunks); */ int pcpu_nr_empty_pop_pages[PCPU_NR_CHUNK_TYPES]; +/* + * List of chunks with a lot of free pages. Used to depopulate them + * asynchronously. + */ +static struct list_head pcpu_depopulate_list[PCPU_NR_CHUNK_TYPES]; + +/* + * List of previously depopulated chunks. They are not usually used for new + * allocations, but can be returned back to service if a need arises. + */ +static struct list_head pcpu_sideline_list[PCPU_NR_CHUNK_TYPES]; + + /* * The number of populated pages in use by the allocator, protected by * pcpu_lock. This number is kept per a unit per chunk (i.e. when a page gets @@ -562,6 +575,12 @@ static void pcpu_chunk_relocate(struct pcpu_chunk *chunk, int oslot) { int nslot = pcpu_chunk_slot(chunk); + /* +* Keep isolated and depopulated chunks on a sideline. 
+ */ + if (chunk->isolated || chunk->depopulated) + return; + if (oslot != nslot) __pcpu_chunk_move(chunk, nslot, oslot < nslot); } @@ -1790,6 +1809,19 @@ static void __percpu *pcpu_alloc(size_t size, size_t align, bool reserved, } } + /* search through sidelined depopulated chunks */ + list_for_each_entry(chunk, &pcpu_sideline_list[type], list) { + /* + * If the allocation can fit the chunk, place the chunk back + * into corresponding slot and restart the scanning. + */ + if (pcpu_check_chunk_hint(&chunk->chunk_md, bits, bit_align)) { + chunk->depopulated = false; + pcpu_chunk_relocate(chunk, -1); + goto restart; + } + } +
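For illustration, the five-point eligibility heuristic described in the commit message can be modelled in plain user-space C. This is a simplified sketch, not the kernel code: the struct fields and watermark parameter are illustrative stand-ins for the real chunk state.

```c
#include <assert.h>
#include <stdbool.h>

/* Simplified stand-in for a percpu chunk (illustrative fields only). */
struct chunk {
	int nr_pages;           /* total pages in the chunk */
	int nr_empty_pop_pages; /* free and populated pages */
	bool is_reserved;
	bool is_first;
	bool is_fully_free;
};

/*
 * Sketch of the depopulation heuristic: a chunk is a candidate if
 * more than 1/4 of its pages are free and populated, the system
 * would still keep enough empty populated pages aside of it, and
 * it is neither the reserved, the first, nor a fully free chunk.
 */
static bool depopulate_candidate(const struct chunk *c,
				 int global_empty_pop_pages,
				 int empty_low_mark)
{
	if (c->is_reserved || c->is_first || c->is_fully_free)
		return false;
	if (c->nr_empty_pop_pages <= c->nr_pages / 4)
		return false;
	/* keep a surplus of empty pages aside of this chunk */
	return global_empty_pop_pages - c->nr_empty_pop_pages >= empty_low_mark;
}
```

Fully free chunks are excluded because they can already be reclaimed whole by the existing free path; the heuristic only has to catch the partially filled ones.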
[PATCH v3 0/6] percpu: partial chunk depopulation
In our production experience the percpu memory allocator is sometimes struggling with returning the memory to the system. A typical example is the creation of several thousand memory cgroups (each has several chunks of percpu data used for vmstats, vmevents, ref counters etc). Deletion and complete releasing of these cgroups doesn't always lead to a shrinkage of the percpu memory, so that sometimes there are several GBs of memory wasted. The underlying problem is fragmentation: to release an underlying chunk, all percpu allocations should be released first. The percpu allocator tends to top up chunks to improve the utilization. It means new small-ish allocations (e.g. percpu ref counters) are placed onto almost-filled old-ish chunks, effectively pinning them in memory. This patchset solves this problem by implementing a partial depopulation of percpu chunks: chunks with many empty pages are asynchronously depopulated and the pages are returned to the system. To illustrate the problem the following script can be used: -- #!/bin/bash cd /sys/fs/cgroup mkdir percpu_test echo "+memory" > percpu_test/cgroup.subtree_control cat /proc/meminfo | grep Percpu for i in `seq 1 1000`; do mkdir percpu_test/cg_"${i}" for j in `seq 1 10`; do mkdir percpu_test/cg_"${i}"_"${j}" done done cat /proc/meminfo | grep Percpu for i in `seq 1 1000`; do for j in `seq 1 10`; do rmdir percpu_test/cg_"${i}"_"${j}" done done sleep 10 cat /proc/meminfo | grep Percpu for i in `seq 1 1000`; do rmdir percpu_test/cg_"${i}" done rmdir percpu_test -- It creates 11000 memory cgroups and removes every 10 out of 11. It prints the initial size of the percpu memory, the size after creating all cgroups and the size after deleting most of them. 
Results: vanilla: ./percpu_test.sh Percpu: 7488 kB Percpu: 481152 kB Percpu: 481152 kB with this patchset applied: ./percpu_test.sh Percpu: 7488 kB Percpu: 481408 kB Percpu: 135552 kB So the total size of the percpu memory was reduced by more than 3.5 times. v3: - introduced pcpu_check_chunk_hint() - fixed a bug related to the hint check - minor cosmetic changes - s/pretends/fixes (cc Vlastimil) v2: - depopulated chunks are sidelined - depopulation happens in the reverse order - depopulate list made per-chunk type - better results due to better heuristics v1: - depopulation heuristics changed and optimized - chunks are put into a separate list, depopulation scan this list - chunk->isolated is introduced, chunk->depopulate is dropped - rearranged patches a bit - fixed a panic discovered by krobot - made pcpu_nr_empty_pop_pages per chunk type - minor fixes rfc: https://lwn.net/Articles/850508/ Roman Gushchin (6): percpu: fix a comment about the chunks ordering percpu: split __pcpu_balance_workfn() percpu: make pcpu_nr_empty_pop_pages per chunk type percpu: generalize pcpu_balance_populated() percpu: factor out pcpu_check_chunk_hint() percpu: implement partial chunk depopulation mm/percpu-internal.h | 4 +- mm/percpu-stats.c| 9 +- mm/percpu.c | 306 +++ 3 files changed, 261 insertions(+), 58 deletions(-) -- 2.30.2
[PATCH v3 4/6] percpu: generalize pcpu_balance_populated()
To prepare for the depopulation of percpu chunks, split out the populating part of pcpu_balance_populated() into the new pcpu_grow_populated() (with the intention to add pcpu_shrink_populated() in the next commit). The goal of pcpu_balance_populated() is to determine whether there is a shortage or an excessive amount of empty percpu pages and call into the corresponding function. pcpu_grow_populated() takes a desired number of pages as an argument (nr_to_pop). If it creates a new chunk, nr_to_pop should be updated to reflect that the new chunk could be created already populated. Otherwise an infinite loop might appear. Signed-off-by: Roman Gushchin --- mm/percpu.c | 63 + 1 file changed, 39 insertions(+), 24 deletions(-) diff --git a/mm/percpu.c b/mm/percpu.c index 61339b3d9337..e20119668c42 100644 --- a/mm/percpu.c +++ b/mm/percpu.c @@ -1979,7 +1979,7 @@ static void pcpu_balance_free(enum pcpu_chunk_type type) } /** - * pcpu_balance_populated - manage the amount of populated pages + * pcpu_grow_populated - populate chunk(s) to satisfy atomic allocations * @type: chunk type * * Maintain a certain amount of populated pages to satisfy atomic allocations. @@ -1988,35 +1988,15 @@ static void pcpu_balance_free(enum pcpu_chunk_type type) * allocation causes the failure as it is possible that requests can be * serviced from already backed regions. */ -static void pcpu_balance_populated(enum pcpu_chunk_type type) +static void pcpu_grow_populated(enum pcpu_chunk_type type, int nr_to_pop) { /* gfp flags passed to underlying allocators */ const gfp_t gfp = GFP_KERNEL | __GFP_NORETRY | __GFP_NOWARN; struct list_head *pcpu_slot = pcpu_chunk_list(type); struct pcpu_chunk *chunk; - int slot, nr_to_pop, ret; + int slot, ret; - /* - * Ensure there are certain number of free populated pages for - * atomic allocs. Fill up from the most packed so that atomic - * allocs don't increase fragmentation. If atomic allocation - * failed previously, always populate the maximum amount. 
This - * should prevent atomic allocs larger than PAGE_SIZE from keeping - * failing indefinitely; however, large atomic allocs are not - * something we support properly and can be highly unreliable and - * inefficient. - */ retry_pop: - if (pcpu_atomic_alloc_failed) { - nr_to_pop = PCPU_EMPTY_POP_PAGES_HIGH; - /* best effort anyway, don't worry about synchronization */ - pcpu_atomic_alloc_failed = false; - } else { - nr_to_pop = clamp(PCPU_EMPTY_POP_PAGES_HIGH - - pcpu_nr_empty_pop_pages[type], - 0, PCPU_EMPTY_POP_PAGES_HIGH); - } - for (slot = pcpu_size_to_slot(PAGE_SIZE); slot < pcpu_nr_slots; slot++) { unsigned int nr_unpop = 0, rs, re; @@ -2060,12 +2040,47 @@ static void pcpu_balance_populated(enum pcpu_chunk_type type) if (chunk) { spin_lock_irq(&pcpu_lock); pcpu_chunk_relocate(chunk, -1); + nr_to_pop = max_t(int, 0, nr_to_pop - chunk->nr_populated); spin_unlock_irq(&pcpu_lock); - goto retry_pop; + if (nr_to_pop) + goto retry_pop; } } } +/** + * pcpu_balance_populated - manage the amount of populated pages + * @type: chunk type + * + * Populate or depopulate chunks to maintain a certain amount + * of free pages to satisfy atomic allocations, but not waste + * large amounts of memory. + */ +static void pcpu_balance_populated(enum pcpu_chunk_type type) +{ + int nr_to_pop; + + /* + * Ensure there are certain number of free populated pages for + * atomic allocs. Fill up from the most packed so that atomic + * allocs don't increase fragmentation. If atomic allocation + * failed previously, always populate the maximum amount. This + * should prevent atomic allocs larger than PAGE_SIZE from keeping + * failing indefinitely; however, large atomic allocs are not + * something we support properly and can be highly unreliable and + * inefficient. 
+ */ + if (pcpu_atomic_alloc_failed) { + nr_to_pop = PCPU_EMPTY_POP_PAGES_HIGH; + /* best effort anyway, don't worry about synchronization */ + pcpu_atomic_alloc_failed = false; + pcpu_grow_populated(type, nr_to_pop); + } else if (pcpu_nr_empty_pop_pages[type] < PCPU_EMPTY_POP_PAGES_HIGH) { + nr_to_pop = PCPU_EMPTY_POP_PAGES_HIGH - pcpu_nr_empty_pop_pages[type]; + pcpu_grow_populated(type, nr_to_pop); + } +} + /** * pcpu_balance_workfn - manage the amount of free chunks and
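After the split, pcpu_balance_populated() is left with one small decision: how many pages to ask pcpu_grow_populated() for. That decision can be modelled in isolation as below — a user-space sketch, where the PCPU_EMPTY_POP_PAGES_HIGH value is purely illustrative:

```c
#include <assert.h>
#include <stdbool.h>

#define PCPU_EMPTY_POP_PAGES_HIGH 4 /* illustrative watermark */

/*
 * Sketch of the nr_to_pop decision described in the commit message:
 * after a failed atomic allocation, populate the maximum; otherwise
 * top up to the high watermark.  Returns 0 when nothing is needed.
 */
static int nr_pages_to_populate(bool atomic_alloc_failed, int empty_pop_pages)
{
	if (atomic_alloc_failed)
		return PCPU_EMPTY_POP_PAGES_HIGH;
	if (empty_pop_pages < PCPU_EMPTY_POP_PAGES_HIGH)
		return PCPU_EMPTY_POP_PAGES_HIGH - empty_pop_pages;
	return 0;
}
```

Keeping this policy out of pcpu_grow_populated() is what lets the next commit add a symmetric shrink path without touching the population loop.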
[PATCH v3 3/6] percpu: make pcpu_nr_empty_pop_pages per chunk type
nr_empty_pop_pages is used to guarantee that there are some free populated pages to satisfy atomic allocations. Accounted and non-accounted allocations are using separate sets of chunks, so both need to have a surplus of empty pages. This commit makes pcpu_nr_empty_pop_pages and the corresponding logic per chunk type. Signed-off-by: Roman Gushchin --- mm/percpu-internal.h | 2 +- mm/percpu-stats.c | 9 +++-- mm/percpu.c | 14 +++--- 3 files changed, 15 insertions(+), 10 deletions(-) diff --git a/mm/percpu-internal.h b/mm/percpu-internal.h index 18b768ac7dca..095d7eaa0db4 100644 --- a/mm/percpu-internal.h +++ b/mm/percpu-internal.h @@ -87,7 +87,7 @@ extern spinlock_t pcpu_lock; extern struct list_head *pcpu_chunk_lists; extern int pcpu_nr_slots; -extern int pcpu_nr_empty_pop_pages; +extern int pcpu_nr_empty_pop_pages[]; extern struct pcpu_chunk *pcpu_first_chunk; extern struct pcpu_chunk *pcpu_reserved_chunk; diff --git a/mm/percpu-stats.c b/mm/percpu-stats.c index c8400a2adbc2..f6026dbcdf6b 100644 --- a/mm/percpu-stats.c +++ b/mm/percpu-stats.c @@ -145,6 +145,7 @@ static int percpu_stats_show(struct seq_file *m, void *v) int slot, max_nr_alloc; int *buffer; enum pcpu_chunk_type type; + int nr_empty_pop_pages; alloc_buffer: spin_lock_irq(&pcpu_lock); @@ -165,7 +166,11 @@ static int percpu_stats_show(struct seq_file *m, void *v) goto alloc_buffer; } -#define PL(X) \ + nr_empty_pop_pages = 0; + for (type = 0; type < PCPU_NR_CHUNK_TYPES; type++) + nr_empty_pop_pages += pcpu_nr_empty_pop_pages[type]; + +#define PL(X) \ seq_printf(m, " %-20s: %12lld\n", #X, (long long int)pcpu_stats_ai.X) seq_printf(m, @@ -196,7 +201,7 @@ static int percpu_stats_show(struct seq_file *m, void *v) PU(nr_max_chunks); PU(min_alloc_size); PU(max_alloc_size); - P("empty_pop_pages", pcpu_nr_empty_pop_pages); + P("empty_pop_pages", nr_empty_pop_pages); seq_putc(m, '\n'); #undef PU diff --git a/mm/percpu.c b/mm/percpu.c index 7e31e1b8725f..61339b3d9337 100644 --- a/mm/percpu.c +++ b/mm/percpu.c @@ -176,10 
+176,10 @@ struct list_head *pcpu_chunk_lists __ro_after_init; /* chunk list slots */ static LIST_HEAD(pcpu_map_extend_chunks); /* - * The number of empty populated pages, protected by pcpu_lock. The - * reserved chunk doesn't contribute to the count. + * The number of empty populated pages by chunk type, protected by pcpu_lock. + * The reserved chunk doesn't contribute to the count. */ -int pcpu_nr_empty_pop_pages; +int pcpu_nr_empty_pop_pages[PCPU_NR_CHUNK_TYPES]; /* * The number of populated pages in use by the allocator, protected by @@ -559,7 +559,7 @@ static inline void pcpu_update_empty_pages(struct pcpu_chunk *chunk, int nr) { chunk->nr_empty_pop_pages += nr; if (chunk != pcpu_reserved_chunk) - pcpu_nr_empty_pop_pages += nr; + pcpu_nr_empty_pop_pages[pcpu_chunk_type(chunk)] += nr; } /* @@ -1835,7 +1835,7 @@ static void __percpu *pcpu_alloc(size_t size, size_t align, bool reserved, mutex_unlock(&pcpu_alloc_mutex); } - if (pcpu_nr_empty_pop_pages < PCPU_EMPTY_POP_PAGES_LOW) + if (pcpu_nr_empty_pop_pages[type] < PCPU_EMPTY_POP_PAGES_LOW) pcpu_schedule_balance_work(); /* clear the areas and return address relative to base address */ @@ -2013,7 +2013,7 @@ static void pcpu_balance_populated(enum pcpu_chunk_type type) pcpu_atomic_alloc_failed = false; } else { nr_to_pop = clamp(PCPU_EMPTY_POP_PAGES_HIGH - - pcpu_nr_empty_pop_pages, + pcpu_nr_empty_pop_pages[type], 0, PCPU_EMPTY_POP_PAGES_HIGH); } @@ -2595,7 +2595,7 @@ void __init pcpu_setup_first_chunk(const struct pcpu_alloc_info *ai, /* link the first chunk in */ pcpu_first_chunk = chunk; - pcpu_nr_empty_pop_pages = pcpu_first_chunk->nr_empty_pop_pages; + pcpu_nr_empty_pop_pages[PCPU_CHUNK_ROOT] = pcpu_first_chunk->nr_empty_pop_pages; pcpu_chunk_relocate(pcpu_first_chunk, -1); /* include all regions of the first chunk */ -- 2.30.2
[PATCH v3 5/6] percpu: factor out pcpu_check_chunk_hint()
Factor out the pcpu_check_chunk_hint() helper, which will be useful in the future. The new function checks if the allocation can likely fit the given chunk. Signed-off-by: Roman Gushchin --- mm/percpu.c | 30 +- 1 file changed, 21 insertions(+), 9 deletions(-) diff --git a/mm/percpu.c b/mm/percpu.c index e20119668c42..357fd6994278 100644 --- a/mm/percpu.c +++ b/mm/percpu.c @@ -306,6 +306,26 @@ static unsigned long pcpu_block_off_to_off(int index, int off) return index * PCPU_BITMAP_BLOCK_BITS + off; } +/** + * pcpu_check_chunk_hint - check that allocation can fit a chunk + * @chunk_md: chunk's block + * @bits: size of request in allocation units + * @align: alignment of area (max PAGE_SIZE) + * + * Check to see if the allocation can fit in the chunk's contig hint. + * This is an optimization to prevent scanning by assuming if it + * cannot fit in the global hint, there is memory pressure and creating + * a new chunk would happen soon. + */ +static bool pcpu_check_chunk_hint(struct pcpu_block_md *chunk_md, int bits, + size_t align) +{ + int bit_off = ALIGN(chunk_md->contig_hint_start, align) - + chunk_md->contig_hint_start; + + return bit_off + bits <= chunk_md->contig_hint; +} + /* * pcpu_next_hint - determine which hint to use * @block: block of interest @@ -1065,15 +1085,7 @@ static int pcpu_find_block_fit(struct pcpu_chunk *chunk, int alloc_bits, struct pcpu_block_md *chunk_md = &chunk->chunk_md; int bit_off, bits, next_off; - /* - * Check to see if the allocation can fit in the chunk's contig hint. - * This is an optimization to prevent scanning by assuming if it - * cannot fit in the global hint, there is memory pressure and creating - * a new chunk would happen soon. - */ - bit_off = ALIGN(chunk_md->contig_hint_start, align) - - chunk_md->contig_hint_start; - if (bit_off + alloc_bits > chunk_md->contig_hint) + if (!pcpu_check_chunk_hint(chunk_md, alloc_bits, align)) return -1; bit_off = pcpu_next_hint(chunk_md, alloc_bits); -- 2.30.2
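The factored-out check boils down to a small alignment computation, which can be demonstrated standalone. The sketch below is a user-space model of the same logic with the chunk metadata reduced to two plain integers (the real kernel helper takes a struct pcpu_block_md); the ALIGN macro, like the kernel's, assumes a power-of-two alignment:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* round x up to a multiple of a (a must be a power of two) */
#define ALIGN(x, a) (((x) + ((a) - 1)) & ~((a) - 1))

/*
 * Model of the hint check: an allocation of @bits units at alignment
 * @align can fit only if, after rounding the contig hint start up to
 * the requested alignment, the request still fits within the chunk's
 * largest contiguous free area (the contig hint).
 */
static bool check_chunk_hint(int contig_hint_start, int contig_hint,
			     int bits, size_t align)
{
	int bit_off = ALIGN(contig_hint_start, align) - contig_hint_start;

	return bit_off + bits <= contig_hint;
}
```

The point of the early-exit is that the contig hint is a cheap upper bound: if the request cannot fit it even in principle, the full block scan is skipped entirely.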
[PATCH v3 2/6] percpu: split __pcpu_balance_workfn()
__pcpu_balance_workfn() became fairly big and hard to follow, but in fact it consists of two fully independent parts, responsible for the destruction of excessive free chunks and the population of the necessary amount of free pages. In order to simplify the code and prepare for adding new functionality, split it into two functions: 1) pcpu_balance_free, 2) pcpu_balance_populated. Move the taking/releasing of the pcpu_alloc_mutex to an upper level to keep the current synchronization in place. Signed-off-by: Roman Gushchin Reviewed-by: Dennis Zhou --- mm/percpu.c | 46 +- 1 file changed, 29 insertions(+), 17 deletions(-) diff --git a/mm/percpu.c b/mm/percpu.c index 2f27123bb489..7e31e1b8725f 100644 --- a/mm/percpu.c +++ b/mm/percpu.c @@ -1933,31 +1933,22 @@ void __percpu *__alloc_reserved_percpu(size_t size, size_t align) } /** - * __pcpu_balance_workfn - manage the amount of free chunks and populated pages + * pcpu_balance_free - manage the amount of free chunks * @type: chunk type * - * Reclaim all fully free chunks except for the first one. This is also - * responsible for maintaining the pool of empty populated pages. However, - * it is possible that this is called when physical memory is scarce causing - * OOM killer to be triggered. We should avoid doing so until an actual - * allocation causes the failure as it is possible that requests can be - * serviced from already backed regions. + * Reclaim all fully free chunks except for the first one. */ -static void __pcpu_balance_workfn(enum pcpu_chunk_type type) +static void pcpu_balance_free(enum pcpu_chunk_type type) { - /* gfp flags passed to underlying allocators */ - const gfp_t gfp = GFP_KERNEL | __GFP_NORETRY | __GFP_NOWARN; LIST_HEAD(to_free); struct list_head *pcpu_slot = pcpu_chunk_list(type); struct list_head *free_head = &pcpu_slot[pcpu_nr_slots - 1]; struct pcpu_chunk *chunk, *next; - int slot, nr_to_pop, ret; /* * There's no reason to keep around multiple unused chunks and VM * areas can be scarce. 
Destroy all free chunks except for one. */ - mutex_lock(&pcpu_alloc_mutex); spin_lock_irq(&pcpu_lock); list_for_each_entry_safe(chunk, next, free_head, list) { @@ -1985,6 +1976,25 @@ static void __pcpu_balance_workfn(enum pcpu_chunk_type type) pcpu_destroy_chunk(chunk); cond_resched(); } +} + +/** + * pcpu_balance_populated - manage the amount of populated pages + * @type: chunk type + * + * Maintain a certain amount of populated pages to satisfy atomic allocations. + * It is possible that this is called when physical memory is scarce causing + * OOM killer to be triggered. We should avoid doing so until an actual + * allocation causes the failure as it is possible that requests can be + * serviced from already backed regions. + */ +static void pcpu_balance_populated(enum pcpu_chunk_type type) +{ + /* gfp flags passed to underlying allocators */ + const gfp_t gfp = GFP_KERNEL | __GFP_NORETRY | __GFP_NOWARN; + struct list_head *pcpu_slot = pcpu_chunk_list(type); + struct pcpu_chunk *chunk; + int slot, nr_to_pop, ret; /* * Ensure there are certain number of free populated pages for @@ -2054,22 +2064,24 @@ static void __pcpu_balance_workfn(enum pcpu_chunk_type type) goto retry_pop; } } - - mutex_unlock(&pcpu_alloc_mutex); } /** * pcpu_balance_workfn - manage the amount of free chunks and populated pages * @work: unused * - * Call __pcpu_balance_workfn() for each chunk type. + * Call pcpu_balance_free() and pcpu_balance_populated() for each chunk type. */ static void pcpu_balance_workfn(struct work_struct *work) { enum pcpu_chunk_type type; - for (type = 0; type < PCPU_NR_CHUNK_TYPES; type++) - __pcpu_balance_workfn(type); + for (type = 0; type < PCPU_NR_CHUNK_TYPES; type++) { + mutex_lock(&pcpu_alloc_mutex); + pcpu_balance_free(type); + pcpu_balance_populated(type); + mutex_unlock(&pcpu_alloc_mutex); + } } /** -- 2.30.2
[PATCH v3 1/6] percpu: fix a comment about the chunks ordering
Since commit 3e54097beb22 ("percpu: manage chunks based on contig_bits instead of free_bytes"), chunks are sorted based on the size of the biggest continuous free area instead of the total number of free bytes. Update the corresponding comment to reflect this. Signed-off-by: Roman Gushchin --- mm/percpu.c | 5 - 1 file changed, 4 insertions(+), 1 deletion(-) diff --git a/mm/percpu.c b/mm/percpu.c index 6596a0a4286e..2f27123bb489 100644 --- a/mm/percpu.c +++ b/mm/percpu.c @@ -99,7 +99,10 @@ #include "percpu-internal.h" -/* the slots are sorted by free bytes left, 1-31 bytes share the same slot */ +/* + * The slots are sorted by the size of the biggest continuous free area. + * 1-31 bytes share the same slot. + */ #define PCPU_SLOT_BASE_SHIFT 5 /* chunks in slots below this are subject to being sidelined on failed alloc */ #define PCPU_SLOT_FAIL_THRESHOLD 3 -- 2.30.2
[PATCH-next] powerpc/interrupt: Remove duplicate header file
From: Chen Yi Delete one of the header files that are included twice. Signed-off-by: Chen Yi --- arch/powerpc/kernel/interrupt.c | 1 - 1 file changed, 1 deletion(-) diff --git a/arch/powerpc/kernel/interrupt.c b/arch/powerpc/kernel/interrupt.c index c4dd4b8f9cfa..f64ace0208b7 100644 --- a/arch/powerpc/kernel/interrupt.c +++ b/arch/powerpc/kernel/interrupt.c @@ -7,7 +7,6 @@ #include #include #include -#include #include #include #include -- 2.31.0
[PATCH -next v2] drm/bridge: lt8912b: Add header file
If CONFIG_DRM_LONTIUM_LT8912B=m, the following errors will be seen while compiling lontium-lt8912b.c drivers/gpu/drm/bridge/lontium-lt8912b.c: In function ‘lt8912_hard_power_on’: drivers/gpu/drm/bridge/lontium-lt8912b.c:252:2: error: implicit declaration of function ‘gpiod_set_value_cansleep’; did you mean ‘gpio_set_value_cansleep’? [-Werror=implicit-function-declaration] gpiod_set_value_cansleep(lt->gp_reset, 0); ^~~~ gpio_set_value_cansleep drivers/gpu/drm/bridge/lontium-lt8912b.c: In function ‘lt8912_parse_dt’: drivers/gpu/drm/bridge/lontium-lt8912b.c:628:13: error: implicit declaration of function ‘devm_gpiod_get_optional’; did you mean ‘devm_gpio_request_one’? [-Werror=implicit-function-declaration] gp_reset = devm_gpiod_get_optional(dev, "reset", GPIOD_OUT_HIGH); ^~~ devm_gpio_request_one drivers/gpu/drm/bridge/lontium-lt8912b.c:628:51: error: ‘GPIOD_OUT_HIGH’ undeclared (first use in this function); did you mean ‘GPIOF_INIT_HIGH’? gp_reset = devm_gpiod_get_optional(dev, "reset", GPIOD_OUT_HIGH); ^~ GPIOF_INIT_HIGH Signed-off-by: Zhang Jianhua --- v2: - add header file for lontium-lt8912b.c instead of add config dependence for CONFIG_DRM_LONTIUM_LT8912B --- drivers/gpu/drm/bridge/lontium-lt8912b.c | 1 + 1 file changed, 1 insertion(+) diff --git a/drivers/gpu/drm/bridge/lontium-lt8912b.c b/drivers/gpu/drm/bridge/lontium-lt8912b.c index 61491615bad0..79845b3b19e1 100644 --- a/drivers/gpu/drm/bridge/lontium-lt8912b.c +++ b/drivers/gpu/drm/bridge/lontium-lt8912b.c @@ -3,6 +3,7 @@ * Copyright (c) 2018, The Linux Foundation. All rights reserved. */ +#include #include #include #include -- 2.17.1
Re: [External] linux-next: manual merge of the net-next tree with the bpf tree
On Wed, Apr 7, 2021 at 8:11 PM Stephen Rothwell wrote: > > Hi all, > > Today's linux-next merge of the net-next tree got a conflict in: > > net/core/skmsg.c > > between commit: > > 144748eb0c44 ("bpf, sockmap: Fix incorrect fwd_alloc accounting") > > from the bpf tree and commit: > > e3526bb92a20 ("skmsg: Move sk_redir from TCP_SKB_CB to skb") > > from the net-next tree. > > I fixed it up (I think - see below) and can carry the fix as > necessary. This is now fixed as far as linux-next is concerned, but any > non trivial conflicts should be mentioned to your upstream maintainer > when your tree is submitted for merging. You may also want to consider > cooperating with the maintainer of the conflicting tree to minimise any > particularly complex conflicts. Looks good from my quick glance. Thanks!
[rcu:dev.2021.04.02a 73/73] ia64-linux-ld: undefined reference to `rcu_spawn_one_boost_kthread'
tree: https://git.kernel.org/pub/scm/linux/kernel/git/paulmck/linux-rcu.git dev.2021.04.02a head: 4bc4fd6b7e87ff0bdb1aa2493af85be2784717c0 commit: 4bc4fd6b7e87ff0bdb1aa2493af85be2784717c0 [73/73] rcu: Fix RCU priority boosting and add more debug output config: ia64-defconfig (attached as .config) compiler: ia64-linux-gcc (GCC) 9.3.0 reproduce (this is a W=1 build): wget https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O ~/bin/make.cross chmod +x ~/bin/make.cross # https://git.kernel.org/pub/scm/linux/kernel/git/paulmck/linux-rcu.git/commit/?id=4bc4fd6b7e87ff0bdb1aa2493af85be2784717c0 git remote add rcu https://git.kernel.org/pub/scm/linux/kernel/git/paulmck/linux-rcu.git git fetch --no-tags rcu dev.2021.04.02a git checkout 4bc4fd6b7e87ff0bdb1aa2493af85be2784717c0 # save the attached .config to linux build tree COMPILER_INSTALL_PATH=$HOME/0day COMPILER=gcc-9.3.0 make.cross ARCH=ia64 If you fix the issue, kindly add following tag as appropriate Reported-by: kernel test robot All errors (new ones prefixed by >>): ia64-linux-ld: kernel/rcu/tree.o: in function `rcutree_online_cpu': (.text+0xf922): undefined reference to `rcu_spawn_one_boost_kthread' >> ia64-linux-ld: (.text+0xf9a2): undefined reference to >> `rcu_spawn_one_boost_kthread' ia64-linux-ld: (.text+0xfa32): undefined reference to `rcu_spawn_one_boost_kthread' --- 0-DAY CI Kernel Test Service, Intel Corporation https://lists.01.org/hyperkitty/list/kbuild-...@lists.01.org .config.gz Description: application/gzip
Re: [External] linux-next: manual merge of the net-next tree with the bpf tree
On Wed, Apr 7, 2021 at 8:02 PM Stephen Rothwell wrote: > > Hi all, > > Today's linux-next merge of the net-next tree got a conflict in: > > include/linux/skmsg.h > > between commit: > > 1c84b33101c8 ("bpf, sockmap: Fix sk->prot unhash op reset") > > from the bpf tree and commit: > > 8a59f9d1e3d4 ("sock: Introduce sk->sk_prot->psock_update_sk_prot()") > > from the net-next tree. > > I didn't know how to fix it up so I just used the latter version for today - a better solution would be appreciated. This is now fixed as far as linux-next is concerned, but any non trivial conflicts should be mentioned to your upstream maintainer when your tree is submitted for merging. You may also want to consider cooperating with the maintainer of the conflicting tree to minimise any particularly complex conflicts. The right way to resolve this is to move the lines added in commit 1c84b33101c8 to the similar place in tcp_bpf_update_proto(). Thanks.
RE: [PATCH v16 1/2] scsi: ufs: Enable power management for wlun
Hi Asutosh Das, >+static inline bool is_rpmb_wlun(struct scsi_device *sdev) >+{ >+ return (sdev->lun == ufshcd_upiu_wlun_to_scsi_wlun(UFS_UPIU_RPMB_WLUN)); >+} >+ >+static inline bool is_device_wlun(struct scsi_device *sdev) >+{ >+ return (sdev->lun == ufshcd_upiu_wlun_to_scsi_wlun(UFS_UPIU_UFS_DEVICE_WLUN)); >+} >+ > static void ufshcd_init_lrb(struct ufs_hba *hba, struct ufshcd_lrb *lrb, int i) > { > struct utp_transfer_cmd_desc *cmd_descp = hba->ucdl_base_addr; >@@ -4099,11 +4113,11 @@ void ufshcd_auto_hibern8_update(struct ufs_hba *hba, u32 ahit) > spin_unlock_irqrestore(hba->host->host_lock, flags); > > if (update && !pm_runtime_suspended(hba->dev)) { Could it be changed to use hba->sdev_ufs_device->sdev_gendev instead of hba->dev? Thanks, Daejun
[PATCHv2] mm/mmap.c: lines in __do_munmap repeat logic of inlined find_vma_intersection
Some lines in __do_munmap used the same logic as find_vma_intersection (which is inlined) instead of directly using that function. (Can't believe I made a typo in the first one, compiled this one, sorry first patch kinda nervous for some reason) Signed-off-by: Gonzalo Matias Juarez Tello --- mm/mmap.c | 7 +-- 1 file changed, 1 insertion(+), 6 deletions(-) diff --git a/mm/mmap.c b/mm/mmap.c index 3f287599a7a3..1b29f8bf8344 100644 --- a/mm/mmap.c +++ b/mm/mmap.c @@ -2823,15 +2823,10 @@ int __do_munmap(struct mm_struct *mm, unsigned long start, size_t len, arch_unmap(mm, start, end); /* Find the first overlapping VMA */ - vma = find_vma(mm, start); + vma = find_vma_intersection(mm, start, end); if (!vma) return 0; prev = vma->vm_prev; - /* we have start < vma->vm_end */ - - /* if it doesn't overlap, we have nothing.. */ - if (vma->vm_start >= end) - return 0; /* * If we need to split any vma, do it now to save pain later. -- 2.31.1
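The patch's claim — that the removed lines are exactly find_vma_intersection() open-coded — can be checked with a small user-space model. This sketch is illustrative only: VMAs are modelled as a sorted array of ranges rather than the kernel's rb-tree/maple structures, and the function names carry a `model_` prefix to make clear they are stand-ins.

```c
#include <assert.h>
#include <stddef.h>

/* Minimal stand-in for a VMA: a [vm_start, vm_end) address range. */
struct vma {
	unsigned long vm_start, vm_end;
};

/* model of find_vma(): first vma with vm_end > addr, over a sorted array */
static struct vma *model_find_vma(struct vma *v, int n, unsigned long addr)
{
	for (int i = 0; i < n; i++)
		if (v[i].vm_end > addr)
			return &v[i];
	return NULL;
}

/*
 * model of find_vma_intersection(): the first vma overlapping
 * [start, end) -- i.e. model_find_vma() plus the very
 * "vm_start >= end" check the patch deletes from __do_munmap().
 */
static struct vma *model_find_vma_intersection(struct vma *v, int n,
					       unsigned long start,
					       unsigned long end)
{
	struct vma *vma = model_find_vma(v, n, start);

	if (vma && vma->vm_start >= end)
		return NULL;	/* doesn't overlap: nothing to unmap */
	return vma;
}
```

Either form returns the same VMA (or NULL), which is why the patch can drop the two explicit checks without changing behavior.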
Re: [RFC PATCH v2 10/10] firmware: arm_scmi: Add virtio transport
On Fri, Nov 6, 2020 at 2:59 AM Peter Hilber wrote: > +static int scmi_vio_probe(struct virtio_device *vdev) > +{ > + struct device *dev = &vdev->dev; > + struct scmi_vio_channel **vioch; > + bool have_vq_rx; > + int vq_cnt; > + int i; > + struct virtqueue *vqs[VIRTIO_SCMI_VQ_MAX_CNT]; > + > + vioch = devm_kcalloc(dev, VIRTIO_SCMI_VQ_MAX_CNT, sizeof(*vioch), > + GFP_KERNEL); > + if (!vioch) > + return -ENOMEM; > + > + have_vq_rx = virtio_has_feature(vdev, VIRTIO_SCMI_F_P2A_CHANNELS); > + vq_cnt = have_vq_rx ? VIRTIO_SCMI_VQ_MAX_CNT : 1; > + > + for (i = 0; i < vq_cnt; i++) { > + vioch[i] = devm_kzalloc(dev, sizeof(**vioch), GFP_KERNEL); > + if (!vioch[i]) > + return -ENOMEM; > + } > + > + if (have_vq_rx) > + vioch[VIRTIO_SCMI_VQ_RX]->is_rx = true; > + > + if (virtio_find_vqs(vdev, vq_cnt, vqs, scmi_vio_complete_callbacks, > + scmi_vio_vqueue_names, NULL)) { > + dev_err(dev, "Failed to get %d virtqueue(s)\n", vq_cnt); > + return -1; > + } > + dev_info(dev, "Found %d virtqueue(s)\n", vq_cnt); > + > + for (i = 0; i < vq_cnt; i++) { > + spin_lock_init(&vioch[i]->lock); > + vioch[i]->vqueue = vqs[i]; > + vioch[i]->vqueue->priv = vioch[i]; The vqueue->priv field is used by core, you can't update it else notifications won't work. 
> + } > + > + vdev->priv = vioch; > + > + virtio_device_ready(vdev); > + > + return 0; > +} diff --git a/drivers/firmware/arm_scmi/virtio.c b/drivers/firmware/arm_scmi/virtio.c index f70aa72f34f1..b1af77341b30 100644 --- a/drivers/firmware/arm_scmi/virtio.c +++ b/drivers/firmware/arm_scmi/virtio.c @@ -80,7 +80,8 @@ static int scmi_vio_populate_vq_rx(struct scmi_vio_channel *vioch, static void scmi_vio_complete_cb(struct virtqueue *vqueue) { - struct scmi_vio_channel *vioch = vqueue->priv; + struct scmi_vio_channel **_vioch = vqueue->vdev->priv; + struct scmi_vio_channel *vioch = _vioch[vqueue->index]; unsigned long iflags; unsigned int length; @@ -454,7 +455,6 @@ static int scmi_vio_probe(struct virtio_device *vdev) for (i = 0; i < vq_cnt; i++) { spin_lock_init(&vioch[i]->lock); vioch[i]->vqueue = vqs[i]; - vioch[i]->vqueue->priv = vioch[i]; } vdev->priv = vioch;
Re: [PATCH v2] hwmon: Add driver for fsp-3y PSUs and PDUs
On 4/7/21 7:34 PM, Václav Kubernát wrote: > This patch adds support for these devices: > - YH-5151E - the PDU > - YM-2151E - the PSU > > The device datasheet says that the devices support PMBus 1.2, but in my > testing, a lot of the commands aren't supported and if they are, they > sometimes behave strangely or inconsistently. For example, writes to the > PAGE command require using PEC, otherwise the write won't work and the > page won't switch, even though the standard says that PEC is optional. > On the other hand, writes to SMBALERT don't require PEC. Because of > this, the driver is mostly reverse engineered with the help of a tool > called pmbus_peek written by David Brownell (and later adopted by my > colleague Jan Kundrát). > > The device also has some sort of a timing issue when switching pages, > which is explained further in the code. > > Because of this, the driver support is limited. It exposes only the > values that have been tested to work correctly. > > Signed-off-by: Václav Kubernát > --- > Documentation/hwmon/fsp-3y.rst | 26 > drivers/hwmon/pmbus/Kconfig | 10 ++ > drivers/hwmon/pmbus/Makefile | 1 + > drivers/hwmon/pmbus/fsp-3y.c | 217 + > 4 files changed, 254 insertions(+) > create mode 100644 Documentation/hwmon/fsp-3y.rst > create mode 100644 drivers/hwmon/pmbus/fsp-3y.c > > diff --git a/Documentation/hwmon/fsp-3y.rst b/Documentation/hwmon/fsp-3y.rst > new file mode 100644 > index ..68a547021846 > --- /dev/null > +++ b/Documentation/hwmon/fsp-3y.rst > @@ -0,0 +1,26 @@ > +Kernel driver fsp3y > +== > +Supported devices: > + * 3Y POWER YH-5151E > + * 3Y POWER YM-2151E > + > +Author: Václav Kubernát > + > +Description > +--- > +This driver implements limited support for two 3Y POWER devices. 
> + > +Sysfs entries > +- > +in1_inputinput voltage > +in2_input12V output voltage > +in3_input5V output voltage > +curr1_input input current > +curr2_input 12V output current > +curr3_input 5V output current > +fan1_input fan rpm > +temp1_input temperature 1 > +temp2_input temperature 2 > +temp3_input temperature 3 > +power1_input input power > +power2_input output power > diff --git a/drivers/hwmon/pmbus/Kconfig b/drivers/hwmon/pmbus/Kconfig > index 03606d4298a4..9d12d446396c 100644 > --- a/drivers/hwmon/pmbus/Kconfig > +++ b/drivers/hwmon/pmbus/Kconfig > @@ -56,6 +56,16 @@ config SENSORS_BEL_PFE > This driver can also be built as a module. If so, the module will > be called bel-pfe. > > +config SENSORS_FSP_3Y > + tristate "FSP/3Y-Power power supplies" > + help > + If you say yes here you get hardware monitoring support for > + FSP/3Y-Power hot-swap power supplies. > + Supported models: YH-5151E, YM-2151E > + > + This driver can also be built as a module. If so, the module will > + be called fsp-3y. 
> + > config SENSORS_IBM_CFFPS > tristate "IBM Common Form Factor Power Supply" > depends on LEDS_CLASS > diff --git a/drivers/hwmon/pmbus/Makefile b/drivers/hwmon/pmbus/Makefile > index 6a4ba0fdc1db..bfe218ad898f 100644 > --- a/drivers/hwmon/pmbus/Makefile > +++ b/drivers/hwmon/pmbus/Makefile > @@ -8,6 +8,7 @@ obj-$(CONFIG_SENSORS_PMBUS) += pmbus.o > obj-$(CONFIG_SENSORS_ADM1266)+= adm1266.o > obj-$(CONFIG_SENSORS_ADM1275)+= adm1275.o > obj-$(CONFIG_SENSORS_BEL_PFE)+= bel-pfe.o > +obj-$(CONFIG_SENSORS_FSP_3Y) += fsp-3y.o > obj-$(CONFIG_SENSORS_IBM_CFFPS) += ibm-cffps.o > obj-$(CONFIG_SENSORS_INSPUR_IPSPS) += inspur-ipsps.o > obj-$(CONFIG_SENSORS_IR35221)+= ir35221.o > diff --git a/drivers/hwmon/pmbus/fsp-3y.c b/drivers/hwmon/pmbus/fsp-3y.c > new file mode 100644 > index ..2c165e034fa8 > --- /dev/null > +++ b/drivers/hwmon/pmbus/fsp-3y.c > @@ -0,0 +1,217 @@ > +// SPDX-License-Identifier: GPL-2.0-or-later > +/* > + * Hardware monitoring driver for FSP 3Y-Power PSUs > + * > + * Copyright (c) 2021 Václav Kubernát, CESNET > + */ > + > +#include > +#include > +#include > +#include > +#include "pmbus.h" > + > +#define YM2151_PAGE_12V_LOG 0x00 > +#define YM2151_PAGE_12V_REAL 0x00 > +#define YM2151_PAGE_5VSB_LOG 0x01 > +#define YM2151_PAGE_5VSB_REAL0x20 > +#define YH5151E_PAGE_12V_LOG 0x00 > +#define YH5151E_PAGE_12V_REAL0x00 > +#define YH5151E_PAGE_5V_LOG 0x01 > +#define YH5151E_PAGE_5V_REAL 0x10 > +#define YH5151E_PAGE_3V3_LOG 0x02 > +#define YH5151E_PAGE_3V3_REAL0x11 > + > +enum chips { > + ym2151e, > + yh5151e > +}; > + > +struct fsp3y_data { > + struct pmbus_driver_info info; > + enum chips chip; > + int page; > +}; > + > +#define to_fsp3y_data(x) container_of(x, struct fsp3y_data, info) > + > +static int page_log_to_page_real(int page_log, enum chips chip) > +{ > + switch (chip) { > + case
[PATCH -next] powerpc/security: Make symbol 'stf_barrier' static
The sparse tool complains as follows: arch/powerpc/kernel/security.c:253:6: warning: symbol 'stf_barrier' was not declared. Should it be static? This symbol is not used outside of security.c, so this commit marks it static. Signed-off-by: Li Huafei --- arch/powerpc/kernel/security.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/arch/powerpc/kernel/security.c b/arch/powerpc/kernel/security.c index e4e1a94ccf6a..4de6bbd9672e 100644 --- a/arch/powerpc/kernel/security.c +++ b/arch/powerpc/kernel/security.c @@ -250,7 +250,7 @@ ssize_t cpu_show_spectre_v2(struct device *dev, struct device_attribute *attr, c static enum stf_barrier_type stf_enabled_flush_types; static bool no_stf_barrier; -bool stf_barrier; +static bool stf_barrier; static int __init handle_no_stf_barrier(char *p) { -- 2.17.1
Re: [PATCH 2/2] pinctrl: qcom-pmic-gpio: Add support for pm8008
On Wed 07 Apr 17:35 CDT 2021, Guru Das Srinagesh wrote: > Add support for the two GPIOs present on PM8008. > > Signed-off-by: Guru Das Srinagesh > --- > drivers/pinctrl/qcom/pinctrl-spmi-gpio.c | 1 + > 1 file changed, 1 insertion(+) > > diff --git a/drivers/pinctrl/qcom/pinctrl-spmi-gpio.c > b/drivers/pinctrl/qcom/pinctrl-spmi-gpio.c > index c2b9f2e..76e997a 100644 > --- a/drivers/pinctrl/qcom/pinctrl-spmi-gpio.c > +++ b/drivers/pinctrl/qcom/pinctrl-spmi-gpio.c > @@ -1137,6 +1137,7 @@ static const struct of_device_id pmic_gpio_of_match[] = > { > { .compatible = "qcom,pm6150l-gpio", .data = (void *) 12 }, > /* pmx55 has 11 GPIOs with holes on 3, 7, 10, 11 */ > { .compatible = "qcom,pmx55-gpio", .data = (void *) 11 }, > + { .compatible = "qcom,pm8008-gpio", .data = (void *) 2 }, As with the binding, please keep these sorted alphabetically. With that: Reviewed-by: Bjorn Andersson Regards, Bjorn > { }, > }; > > -- > The Qualcomm Innovation Center, Inc. is a member of the Code Aurora Forum, > a Linux Foundation Collaborative Project >
Re: [PATCH 1/2] dt-bindings: pinctrl: qcom-pmic-gpio: Add pm8008 support
On Wed 07 Apr 17:34 CDT 2021, Guru Das Srinagesh wrote: > Add support for the PM8008 GPIO support to the Qualcomm PMIC GPIO > binding. > > Signed-off-by: Guru Das Srinagesh > --- > Documentation/devicetree/bindings/pinctrl/qcom,pmic-gpio.txt | 2 ++ > 1 file changed, 2 insertions(+) > > diff --git a/Documentation/devicetree/bindings/pinctrl/qcom,pmic-gpio.txt > b/Documentation/devicetree/bindings/pinctrl/qcom,pmic-gpio.txt > index 70e119b..1818481 100644 > --- a/Documentation/devicetree/bindings/pinctrl/qcom,pmic-gpio.txt > +++ b/Documentation/devicetree/bindings/pinctrl/qcom,pmic-gpio.txt > @@ -36,6 +36,7 @@ PMIC's from Qualcomm. > "qcom,pm6150-gpio" > "qcom,pm6150l-gpio" > "qcom,pmx55-gpio" > + "qcom,pm8008-gpio" Please keep these sorted alphabetically (i.e. '8' < 'x') With that Acked-by: Bjorn Andersson Regards, Bjorn > > And must contain either "qcom,spmi-gpio" or "qcom,ssbi-gpio" > if the device is on an spmi bus or an ssbi bus respectively > @@ -125,6 +126,7 @@ to specify in a pin configuration subnode: > gpio1-gpio12 for pm6150l > gpio1-gpio11 for pmx55 (holes on gpio3, gpio7, gpio10 > and gpio11) > + gpio1-gpio2 for pm8008 > > - function: > Usage: required > -- > The Qualcomm Innovation Center, Inc. is a member of the Code Aurora Forum, > a Linux Foundation Collaborative Project >
[PATCH] mm/mmap.c: lines in __do_munmap repeat logic of inlined find_vma_intersection
Some lines in __do_munmap used the same logic as find_vma_intersection (which is inlined) instead of directly using that function. Signed-off-by: Gonzalo Matias Juarez Tello --- mm/mmap.c | 7 +------ 1 file changed, 1 insertion(+), 6 deletions(-) diff --git a/mm/mmap.c b/mm/mmap.c index 3f287599a7a3..1b29f8bf8344 100644 --- a/mm/mmap.c +++ b/mm/mmap.c @@ -2823,15 +2823,10 @@ int __do_munmap(struct mm_struct *mm, unsigned long start, size_t len, arch_unmap(mm, start, end); /* Find the first overlapping VMA */ - vma = find_vma(mm, start); + vma = find_vma_intersection(mm, start, end); if (!vma) return 0; prev = vma->vm_prev; - /* we have start < vma->vm_end */ - - /* if it doesn't overlap, we have nothing.. */ - if (vma->vm_start >= end) - return 0; /* * If we need to split any vma, do it now to save pain later. -- 2.31.1
Re: [PATCH 3/4] mm/hugeltb: fix potential wrong gbl_reserve value for hugetlb_acct_memory()
On 2021/4/8 11:24, Miaohe Lin wrote: > On 2021/4/8 4:53, Mike Kravetz wrote: >> On 4/7/21 12:24 AM, Miaohe Lin wrote: >>> Hi: >>> On 2021/4/7 10:49, Mike Kravetz wrote: On 4/2/21 2:32 AM, Miaohe Lin wrote: > The resv_map could be NULL since this routine can be called in the evict > inode path for all hugetlbfs inodes. So we could have chg = 0 and this > would result in a negative value when chg - freed. This is unexpected for > hugepage_subpool_put_pages() and hugetlb_acct_memory(). I am not sure if this is possible. It is true that resv_map could be NULL. However, I believe resv map can only be NULL for inodes that are not regular or link inodes. This is the inode creation code in hugetlbfs_get_inode(). /* * Reserve maps are only needed for inodes that can have associated * page allocations. */ if (S_ISREG(mode) || S_ISLNK(mode)) { resv_map = resv_map_alloc(); if (!resv_map) return NULL; } >>> >>> Agree. >>> If resv_map is NULL, then no hugetlb pages can be allocated/associated with the file. As a result, remove_inode_hugepages will never find any huge pages associated with the inode and the passed value 'freed' will always be zero. >>> >>> But I am confused now. AFAICS, remove_inode_hugepages() searches the >>> address_space of >>> the inode to remove the hugepages while does not care if inode has >>> associated resv_map. >>> How does it prevent hugetlb pages from being allocated/associated with the >>> file if >>> resv_map is NULL? Could you please explain this more? >>> >> >> Recall that there are only two ways to get huge pages associated with >> a hugetlbfs file: fallocate and mmap/write fault. Directly writing to >> hugetlbfs files is not supported. 
>> >> If you take a closer look at hugetlbfs_get_inode, it has that code to >> allocate the resv map mentioned above as well as the following: >> >> switch (mode & S_IFMT) { >> default: >> init_special_inode(inode, mode, dev); >> break; >> case S_IFREG: >> inode->i_op = &hugetlbfs_inode_operations; >> inode->i_fop = &hugetlbfs_file_operations; >> break; >> case S_IFDIR: >> inode->i_op = &hugetlbfs_dir_inode_operations; >> inode->i_fop = &simple_dir_operations; >> >> /* directory inodes start off with i_nlink == 2 (for >> "." entry) */ >> inc_nlink(inode); >> break; >> case S_IFLNK: >> inode->i_op = &page_symlink_inode_operations; >> inode_nohighmem(inode); >> break; >> } >> >> Notice that only S_IFREG inodes will have i_fop == >> &hugetlbfs_file_operations. >> hugetlbfs_file_operations contain the hugetlbfs specific mmap and fallocate >> routines. Hence, only files with S_IFREG inodes can potentially have >> associated huge pages. S_IFLNK inodes can as well via file linking. >> >> If an inode is not S_ISREG(mode) || S_ISLNK(mode), then it will not have >> a resv_map. In addition, it will not have hugetlbfs_file_operations and >> can not have associated huge pages. >> > > Many many thanks for detailed and patient explanation! :) I think I have got > the idea! > >> I looked at this closely when adding commits >> 58b6e5e8f1ad hugetlbfs: fix memory leak for resv_map >> f27a5136f70a hugetlbfs: always use address space in inode for resv_map >> pointer >> >> I may not be remembering all of the details correctly. Commit f27a5136f70a >> added the comment that resv_map could be NULL to hugetlb_unreserve_pages. >> > > Since we must have freed == 0 while chg == 0. Should we make this assumption > explicit > by something like below? > > WARN_ON(chg < freed); > Or just a comment to avoid confusion? > Thanks again! >
Re: [PATCH 3/4] mm/hugeltb: fix potential wrong gbl_reserve value for hugetlb_acct_memory()
On 2021/4/8 4:53, Mike Kravetz wrote: > On 4/7/21 12:24 AM, Miaohe Lin wrote: >> Hi: >> On 2021/4/7 10:49, Mike Kravetz wrote: >>> On 4/2/21 2:32 AM, Miaohe Lin wrote: The resv_map could be NULL since this routine can be called in the evict inode path for all hugetlbfs inodes. So we could have chg = 0 and this would result in a negative value when chg - freed. This is unexpected for hugepage_subpool_put_pages() and hugetlb_acct_memory(). >>> >>> I am not sure if this is possible. >>> >>> It is true that resv_map could be NULL. However, I believe resv map >>> can only be NULL for inodes that are not regular or link inodes. This >>> is the inode creation code in hugetlbfs_get_inode(). >>> >>>/* >>> * Reserve maps are only needed for inodes that can have associated >>> * page allocations. >>> */ >>> if (S_ISREG(mode) || S_ISLNK(mode)) { >>> resv_map = resv_map_alloc(); >>> if (!resv_map) >>> return NULL; >>> } >>> >> >> Agree. >> >>> If resv_map is NULL, then no hugetlb pages can be allocated/associated >>> with the file. As a result, remove_inode_hugepages will never find any >>> huge pages associated with the inode and the passed value 'freed' will >>> always be zero. >>> >> >> But I am confused now. AFAICS, remove_inode_hugepages() searches the >> address_space of >> the inode to remove the hugepages while does not care if inode has >> associated resv_map. >> How does it prevent hugetlb pages from being allocated/associated with the >> file if >> resv_map is NULL? Could you please explain this more? >> > > Recall that there are only two ways to get huge pages associated with > a hugetlbfs file: fallocate and mmap/write fault. Directly writing to > hugetlbfs files is not supported. 
> > If you take a closer look at hugetlbfs_get_inode, it has that code to > allocate the resv map mentioned above as well as the following: > > switch (mode & S_IFMT) { > default: > init_special_inode(inode, mode, dev); > break; > case S_IFREG: > inode->i_op = &hugetlbfs_inode_operations; > inode->i_fop = &hugetlbfs_file_operations; > break; > case S_IFDIR: > inode->i_op = &hugetlbfs_dir_inode_operations; > inode->i_fop = &simple_dir_operations; > > /* directory inodes start off with i_nlink == 2 (for > "." entry) */ > inc_nlink(inode); > break; > case S_IFLNK: > inode->i_op = &page_symlink_inode_operations; > inode_nohighmem(inode); > break; > } > > Notice that only S_IFREG inodes will have i_fop == &hugetlbfs_file_operations. > hugetlbfs_file_operations contain the hugetlbfs specific mmap and fallocate > routines. Hence, only files with S_IFREG inodes can potentially have > associated huge pages. S_IFLNK inodes can as well via file linking. > > If an inode is not S_ISREG(mode) || S_ISLNK(mode), then it will not have > a resv_map. In addition, it will not have hugetlbfs_file_operations and > can not have associated huge pages. > Many many thanks for detailed and patient explanation! :) I think I have got the idea! > I looked at this closely when adding commits > 58b6e5e8f1ad hugetlbfs: fix memory leak for resv_map > f27a5136f70a hugetlbfs: always use address space in inode for resv_map pointer > > I may not be remembering all of the details correctly. Commit f27a5136f70a > added the comment that resv_map could be NULL to hugetlb_unreserve_pages. > Since we must have freed == 0 while chg == 0. Should we make this assumption explicit by something like below? WARN_ON(chg < freed); Thanks again!
[PATCH] usb: dwc2: Enable RPi in ACPI mode
From: Jeremy Linton The dwc2 driver has everything we need to run in ACPI mode except for the ACPI module device table boilerplate. With that added and identified as "BCM2848", an id in use by other OSs for this device, the dw2 controller on the BCM2711 will work. Signed-off-by: Jeremy Linton --- drivers/usb/dwc2/core.h | 2 ++ drivers/usb/dwc2/params.c | 14 ++ drivers/usb/dwc2/platform.c | 1 + 3 files changed, 17 insertions(+) diff --git a/drivers/usb/dwc2/core.h b/drivers/usb/dwc2/core.h index 7161344c6522..defc6034af49 100644 --- a/drivers/usb/dwc2/core.h +++ b/drivers/usb/dwc2/core.h @@ -38,6 +38,7 @@ #ifndef __DWC2_CORE_H__ #define __DWC2_CORE_H__ +#include #include #include #include @@ -1339,6 +1340,7 @@ irqreturn_t dwc2_handle_common_intr(int irq, void *dev); /* The device ID match table */ extern const struct of_device_id dwc2_of_match_table[]; +extern const struct acpi_device_id dwc2_acpi_match[]; int dwc2_lowlevel_hw_enable(struct dwc2_hsotg *hsotg); int dwc2_lowlevel_hw_disable(struct dwc2_hsotg *hsotg); diff --git a/drivers/usb/dwc2/params.c b/drivers/usb/dwc2/params.c index 92df3d620f7d..127878a0a397 100644 --- a/drivers/usb/dwc2/params.c +++ b/drivers/usb/dwc2/params.c @@ -232,6 +232,12 @@ const struct of_device_id dwc2_of_match_table[] = { }; MODULE_DEVICE_TABLE(of, dwc2_of_match_table); +const struct acpi_device_id dwc2_acpi_match[] = { + { "BCM2848", dwc2_set_bcm_params }, + { }, +}; +MODULE_DEVICE_TABLE(acpi, dwc2_acpi_match); + static void dwc2_set_param_otg_cap(struct dwc2_hsotg *hsotg) { u8 val; @@ -878,6 +884,14 @@ int dwc2_init_params(struct dwc2_hsotg *hsotg) if (match && match->data) { set_params = match->data; set_params(hsotg); + } else { + struct acpi_device_id *amatch; + + amatch = acpi_match_device(dwc2_acpi_match, hsotg->dev); + if (amatch && amatch->driver_data) { + set_params = amatch->driver_data; + set_params(hsotg); + } } dwc2_check_params(hsotg); diff --git a/drivers/usb/dwc2/platform.c b/drivers/usb/dwc2/platform.c index 
5f18acac7406..53fc6bc3ed1a 100644 --- a/drivers/usb/dwc2/platform.c +++ b/drivers/usb/dwc2/platform.c @@ -734,6 +734,7 @@ static struct platform_driver dwc2_platform_driver = { .driver = { .name = dwc2_driver_name, .of_match_table = dwc2_of_match_table, + .acpi_match_table = ACPI_PTR(dwc2_acpi_match), .pm = _dev_pm_ops, }, .probe = dwc2_driver_probe, -- 2.26.2
RE: [PATCH v3 08/10] fsdax: Dedup file range to use a compare function
> -Original Message- > From: Ritesh Harjani > Subject: Re: [PATCH v3 08/10] fsdax: Dedup file range to use a compare > function > > On 21/03/19 09:52AM, Shiyang Ruan wrote: > > With dax we cannot deal with readpage() etc. So, we create a dax > > comparison function which is similar to > > vfs_dedupe_file_range_compare(). > > And introduce dax_remap_file_range_prep() for filesystem use. > > > > Signed-off-by: Goldwyn Rodrigues > > Signed-off-by: Shiyang Ruan > > --- > > fs/dax.c | 56 > > > fs/remap_range.c | 45 --- > > fs/xfs/xfs_reflink.c | 9 +-- > > include/linux/dax.h | 4 > > include/linux/fs.h | 15 > > 5 files changed, 115 insertions(+), 14 deletions(-) > > > > diff --git a/fs/dax.c b/fs/dax.c > > index 348297b38f76..76f81f1d76ec 100644 > > --- a/fs/dax.c > > +++ b/fs/dax.c > > @@ -1833,3 +1833,59 @@ vm_fault_t dax_finish_sync_fault(struct vm_fault *vmf, > > return dax_insert_pfn_mkwrite(vmf, pfn, order); } > > EXPORT_SYMBOL_GPL(dax_finish_sync_fault); > > + > > +static loff_t dax_range_compare_actor(struct inode *ino1, loff_t pos1, > > + struct inode *ino2, loff_t pos2, loff_t len, void *data, > > + struct iomap *smap, struct iomap *dmap) { > > + void *saddr, *daddr; > > + bool *same = data; > > + int ret; > > + > > + if (smap->type == IOMAP_HOLE && dmap->type == IOMAP_HOLE) { > > + *same = true; > > + return len; > > + } > > + > > + if (smap->type == IOMAP_HOLE || dmap->type == IOMAP_HOLE) { > > + *same = false; > > + return 0; > > + } > > + > > + ret = dax_iomap_direct_access(smap, pos1, ALIGN(pos1 + len, PAGE_SIZE), > > + &saddr, NULL); > > shouldn't it take len as the second argument? The second argument of dax_iomap_direct_access() means offset, and the third one means length. So, I think this is right. > > > + if (ret < 0) > > + return -EIO; > > + > > + ret = dax_iomap_direct_access(dmap, pos2, ALIGN(pos2 + len, PAGE_SIZE), > > + &daddr, NULL); > > ditto. 
> > + if (ret < 0) > > + return -EIO; > > + > > + *same = !memcmp(saddr, daddr, len); > > + return len; > > +} > > + > > +int dax_dedupe_file_range_compare(struct inode *src, loff_t srcoff, > > + struct inode *dest, loff_t destoff, loff_t len, bool *is_same, > > + const struct iomap_ops *ops) > > +{ > > + int id, ret = 0; > > + > > + id = dax_read_lock(); > > + while (len) { > > + ret = iomap_apply2(src, srcoff, dest, destoff, len, 0, ops, > > + is_same, dax_range_compare_actor); > > + if (ret < 0 || !*is_same) > > + goto out; > > + > > + len -= ret; > > + srcoff += ret; > > + destoff += ret; > > + } > > + ret = 0; > > +out: > > + dax_read_unlock(id); > > + return ret; > > +} > > +EXPORT_SYMBOL_GPL(dax_dedupe_file_range_compare); > > diff --git a/fs/remap_range.c b/fs/remap_range.c index > > 77dba3a49e65..9079390edaf3 100644 > > --- a/fs/remap_range.c > > +++ b/fs/remap_range.c > > @@ -14,6 +14,7 @@ > > #include > > #include > > #include > > +#include > > #include "internal.h" > > > > #include > > @@ -199,9 +200,9 @@ static void vfs_unlock_two_pages(struct page *page1, > struct page *page2) > > * Compare extents of two files to see if they are the same. > > * Caller must have locked both inodes to prevent write races. 
> > */ > > -static int vfs_dedupe_file_range_compare(struct inode *src, loff_t srcoff, > > -struct inode *dest, loff_t destoff, > > -loff_t len, bool *is_same) > > +int vfs_dedupe_file_range_compare(struct inode *src, loff_t srcoff, > > + struct inode *dest, loff_t destoff, > > + loff_t len, bool *is_same) > > { > > loff_t src_poff; > > loff_t dest_poff; > > @@ -280,6 +281,7 @@ static int vfs_dedupe_file_range_compare(struct > > inode *src, loff_t srcoff, > > out_error: > > return error; > > } > > +EXPORT_SYMBOL(vfs_dedupe_file_range_compare); > > > > /* > > * Check that the two inodes are eligible for cloning, the ranges > > make @@ -289,9 +291,11 @@ static int > vfs_dedupe_file_range_compare(struct inode *src, loff_t srcoff, > > * If there's an error, then the usual negative error code is returned. > > * Otherwise returns 0 with *len set to the request length. > > */ > > -int generic_remap_file_range_prep(struct file *file_in, loff_t pos_in, > > - struct file *file_out, loff_t pos_out, > > - loff_t *len, unsigned int remap_flags) > > +static int > > +__generic_remap_file_range_prep(struct file *file_in, loff_t pos_in, > > + struct file *file_out,
linux-next: manual merge of the net-next tree with the net tree
Hi all, Today's linux-next merge of the net-next tree got a conflict in: net/tipc/crypto.c between commit: 2a2403ca3add ("tipc: increment the tmp aead refcnt before attaching it") from the net tree and commit: 97bc84bbd4de ("tipc: clean up warnings detected by sparse") from the net-next tree. I fixed it up (I used the former version) and can carry the fix as necessary. This is now fixed as far as linux-next is concerned, but any non trivial conflicts should be mentioned to your upstream maintainer when your tree is submitted for merging. You may also want to consider cooperating with the maintainer of the conflicting tree to minimise any particularly complex conflicts. -- Cheers, Stephen Rothwell pgp3UqR42C5pN.pgp Description: OpenPGP digital signature
linux-next: manual merge of the net-next tree with the bpf tree
Hi all, Today's linux-next merge of the net-next tree got a conflict in: net/core/skmsg.c between commit: 144748eb0c44 ("bpf, sockmap: Fix incorrect fwd_alloc accounting") from the bpf tree and commit: e3526bb92a20 ("skmsg: Move sk_redir from TCP_SKB_CB to skb") from the net-next tree. I fixed it up (I think - see below) and can carry the fix as necessary. This is now fixed as far as linux-next is concerned, but any non trivial conflicts should be mentioned to your upstream maintainer when your tree is submitted for merging. You may also want to consider cooperating with the maintainer of the conflicting tree to minimise any particularly complex conflicts. -- Cheers, Stephen Rothwell diff --cc net/core/skmsg.c index 5def3a2e85be,92a83c02562a.. --- a/net/core/skmsg.c +++ b/net/core/skmsg.c @@@ -806,12 -900,17 +900,13 @@@ int sk_psock_tls_strp_read(struct sk_ps int ret = __SK_PASS; rcu_read_lock(); - prog = READ_ONCE(psock->progs.skb_verdict); + prog = READ_ONCE(psock->progs.stream_verdict); if (likely(prog)) { - /* We skip full set_owner_r here because if we do a SK_PASS - * or SK_DROP we can skip skb memory accounting and use the - * TLS context. 
- */ skb->sk = psock->sk; - tcp_skb_bpf_redirect_clear(skb); - ret = sk_psock_bpf_run(psock, prog, skb); - ret = sk_psock_map_verd(ret, tcp_skb_bpf_redirect_fetch(skb)); + skb_dst_drop(skb); + skb_bpf_redirect_clear(skb); + ret = bpf_prog_run_pin_on_cpu(prog, skb); + ret = sk_psock_map_verd(ret, skb_bpf_redirect_fetch(skb)); skb->sk = NULL; } sk_psock_tls_verdict_apply(skb, psock->sk, ret); @@@ -876,13 -995,13 +991,14 @@@ static void sk_psock_strp_read(struct s kfree_skb(skb); goto out; } - prog = READ_ONCE(psock->progs.skb_verdict); - skb_set_owner_r(skb, sk); + prog = READ_ONCE(psock->progs.stream_verdict); if (likely(prog)) { + skb->sk = sk; - tcp_skb_bpf_redirect_clear(skb); - ret = sk_psock_bpf_run(psock, prog, skb); - ret = sk_psock_map_verd(ret, tcp_skb_bpf_redirect_fetch(skb)); + skb_dst_drop(skb); + skb_bpf_redirect_clear(skb); + ret = bpf_prog_run_pin_on_cpu(prog, skb); + ret = sk_psock_map_verd(ret, skb_bpf_redirect_fetch(skb)); + skb->sk = NULL; } sk_psock_verdict_apply(psock, skb, ret); out: @@@ -953,13 -1115,15 +1112,16 @@@ static int sk_psock_verdict_recv(read_d kfree_skb(skb); goto out; } - prog = READ_ONCE(psock->progs.skb_verdict); - skb_set_owner_r(skb, sk); + prog = READ_ONCE(psock->progs.stream_verdict); + if (!prog) + prog = READ_ONCE(psock->progs.skb_verdict); if (likely(prog)) { + skb->sk = sk; - tcp_skb_bpf_redirect_clear(skb); - ret = sk_psock_bpf_run(psock, prog, skb); - ret = sk_psock_map_verd(ret, tcp_skb_bpf_redirect_fetch(skb)); + skb_dst_drop(skb); + skb_bpf_redirect_clear(skb); + ret = bpf_prog_run_pin_on_cpu(prog, skb); + ret = sk_psock_map_verd(ret, skb_bpf_redirect_fetch(skb)); + skb->sk = NULL; } sk_psock_verdict_apply(psock, skb, ret); out: pgpmIksOHySb2.pgp Description: OpenPGP digital signature
[PATCH] powerpc: remove old workaround for GCC < 4.9
According to Documentation/process/changes.rst, the minimum supported GCC version is 4.9. This workaround is dead code. Signed-off-by: Masahiro Yamada --- arch/powerpc/Makefile | 6 ------ 1 file changed, 6 deletions(-) diff --git a/arch/powerpc/Makefile b/arch/powerpc/Makefile index 5f8544cf724a..32dd693b4e42 100644 --- a/arch/powerpc/Makefile +++ b/arch/powerpc/Makefile @@ -181,12 +181,6 @@ CC_FLAGS_FTRACE := -pg ifdef CONFIG_MPROFILE_KERNEL CC_FLAGS_FTRACE += -mprofile-kernel endif -# Work around gcc code-gen bugs with -pg / -fno-omit-frame-pointer in gcc <= 4.8 -# https://gcc.gnu.org/bugzilla/show_bug.cgi?id=44199 -# https://gcc.gnu.org/bugzilla/show_bug.cgi?id=52828 -ifndef CONFIG_CC_IS_CLANG -CC_FLAGS_FTRACE += $(call cc-ifversion, -lt, 0409, -mno-sched-epilog) -endif endif CFLAGS-$(CONFIG_TARGET_CPU_BOOL) += $(call cc-option,-mcpu=$(CONFIG_TARGET_CPU)) -- 2.27.0
Re: [PATCH v4 4/6] drm/sprd: add Unisoc's drm display controller driver
On Wed, 7 Apr 2021 at 18:45, Maxime Ripard wrote: > > Hi, > > Adding Jörg, Will and Robin, You forgot to add them actually :) I've added Robin and Joerg. > > On Wed, Mar 31, 2021 at 09:21:19AM +0800, Kevin Tang wrote: > > > > +static u32 check_mmu_isr(struct sprd_dpu *dpu, u32 reg_val) > > > > +{ > > > > + struct dpu_context *ctx = >ctx; > > > > + u32 mmu_mask = BIT_DPU_INT_MMU_VAOR_RD | > > > > + BIT_DPU_INT_MMU_VAOR_WR | > > > > + BIT_DPU_INT_MMU_INV_RD | > > > > + BIT_DPU_INT_MMU_INV_WR; > > > > + u32 val = reg_val & mmu_mask; > > > > + int i; > > > > + > > > > + if (val) { > > > > + drm_err(dpu->drm, "--- iommu interrupt err: 0x%04x ---\n", > > > val); > > > > + > > > > + if (val & BIT_DPU_INT_MMU_INV_RD) > > > > + drm_err(dpu->drm, "iommu invalid read error, addr: > > > 0x%08x\n", > > > > + readl(ctx->base + REG_MMU_INV_ADDR_RD)); > > > > + if (val & BIT_DPU_INT_MMU_INV_WR) > > > > + drm_err(dpu->drm, "iommu invalid write error, > > > addr: 0x%08x\n", > > > > + readl(ctx->base + REG_MMU_INV_ADDR_WR)); > > > > + if (val & BIT_DPU_INT_MMU_VAOR_RD) > > > > + drm_err(dpu->drm, "iommu va out of range read > > > error, addr: 0x%08x\n", > > > > + readl(ctx->base + REG_MMU_VAOR_ADDR_RD)); > > > > + if (val & BIT_DPU_INT_MMU_VAOR_WR) > > > > + drm_err(dpu->drm, "iommu va out of range write > > > error, addr: 0x%08x\n", > > > > + readl(ctx->base + REG_MMU_VAOR_ADDR_WR)); > > > > > > Is that the IOMMU page fault interrupt? I would expect it to be in the > > > iommu driver. > > > > Our iommu driver is indeed an separate driver, and also in upstreaming, > > but iommu fault interrupts reporting by display controller, not iommu > > itself, > > if use iommu_set_fault_handler() to hook in our reporting function, there > > must be cross-module access to h/w registers. > > Can you explain a bit more the design of the hardware then? Each device > connected to the IOMMU has a status register (and an interrupt) that > reports when there's a fault? 
On Unisoc's platforms, one IOMMU serves one master device only, and interrupts are handled by master devices rather than IOMMUs, since the registers are in the physical address range of master devices. > > I'd like to get an ack at least from the IOMMU maintainers and > reviewers. > > > > > +static void sprd_dpi_init(struct sprd_dpu *dpu) > > > > +{ > > > > + struct dpu_context *ctx = >ctx; > > > > + u32 int_mask = 0; > > > > + u32 reg_val; > > > > + > > > > + if (ctx->if_type == SPRD_DPU_IF_DPI) { > > > > + /* use dpi as interface */ > > > > + dpu_reg_clr(ctx, REG_DPU_CFG0, BIT_DPU_IF_EDPI); > > > > + /* disable Halt function for SPRD DSI */ > > > > + dpu_reg_clr(ctx, REG_DPI_CTRL, BIT_DPU_DPI_HALT_EN); > > > > + /* select te from external pad */ > > > > + dpu_reg_set(ctx, REG_DPI_CTRL, > > > BIT_DPU_EDPI_FROM_EXTERNAL_PAD); > > > > + > > > > + /* set dpi timing */ > > > > + reg_val = ctx->vm.hsync_len << 0 | > > > > + ctx->vm.hback_porch << 8 | > > > > + ctx->vm.hfront_porch << 20; > > > > + writel(reg_val, ctx->base + REG_DPI_H_TIMING); > > > > + > > > > + reg_val = ctx->vm.vsync_len << 0 | > > > > + ctx->vm.vback_porch << 8 | > > > > + ctx->vm.vfront_porch << 20; > > > > + writel(reg_val, ctx->base + REG_DPI_V_TIMING); > > > > + > > > > + if (ctx->vm.vsync_len + ctx->vm.vback_porch < 32) > > > > + drm_warn(dpu->drm, "Warning: (vsync + vbp) < 32, " > > > > + "underflow risk!\n"); > > > > > > I don't think a warning is appropriate here. Maybe we should just > > > outright reject any mode that uses it? > > > > > This issue has been fixed on the new soc, maybe I should remove it. > > If it still requires a workaround on older SoC, you can definitely add > it but we should prevent any situation where the underflow might occur > instead of reporting it once we're there. 
> > > > > +static enum drm_mode_status sprd_crtc_mode_valid(struct drm_crtc *crtc, > > > > + const struct drm_display_mode > > > *mode) > > > > +{ > > > > + struct sprd_dpu *dpu = to_sprd_crtc(crtc); > > > > + > > > > + drm_dbg(dpu->drm, "%s() mode: "DRM_MODE_FMT"\n", __func__, > > > DRM_MODE_ARG(mode)); > > > > + > > > > + if (mode->type & DRM_MODE_TYPE_PREFERRED) { > > > > + drm_display_mode_to_videomode(mode, >ctx.vm); > > > > > > You don't seem to use that anywhere else? And that's a bit fragile, > > > nothing really guarantees
linux-next: manual merge of the net-next tree with the bpf tree
Hi all, Today's linux-next merge of the net-next tree got a conflict in: include/linux/skmsg.h between commit: 1c84b33101c8 ("bpf, sockmap: Fix sk->prot unhash op reset") from the bpf tree and commit: 8a59f9d1e3d4 ("sock: Introduce sk->sk_prot->psock_update_sk_prot()") from the net-next tree. I didn't know how to fixed it up so I just used the latter version or today - a better solution would be appreciated. This is now fixed as far as linux-next is concerned, but any non trivial conflicts should be mentioned to your upstream maintainer when your tree is submitted for merging. You may also want to consider cooperating with the maintainer of the conflicting tree to minimise any particularly complex conflicts. -- Cheers, Stephen Rothwell pgpLwie9IQh1g.pgp Description: OpenPGP digital signature
[PATCH] staging: rtl8712: matched alignment with open parenthesis
Aligned arguments with open parenthesis to meet linux kernel coding style Reported by checkpatch Signed-off-by: Mitali Borkar --- drivers/staging/rtl8712/usb_ops.h | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-) diff --git a/drivers/staging/rtl8712/usb_ops.h b/drivers/staging/rtl8712/usb_ops.h index d62975447d29..7a6b619b73fa 100644 --- a/drivers/staging/rtl8712/usb_ops.h +++ b/drivers/staging/rtl8712/usb_ops.h @@ -21,9 +21,9 @@ void r8712_usb_write_mem(struct intf_hdl *pintfhdl, u32 addr, u32 cnt, u8 *wmem); u32 r8712_usb_write_port(struct intf_hdl *pintfhdl, u32 addr, - u32 cnt, u8 *wmem); +u32 cnt, u8 *wmem); u32 r8712_usb_read_port(struct intf_hdl *pintfhdl, u32 addr, -u32 cnt, u8 *rmem); + u32 cnt, u8 *rmem); void r8712_usb_set_intf_option(u32 *poption); void r8712_usb_set_intf_funs(struct intf_hdl *pintf_hdl); uint r8712_usb_init_intf_priv(struct intf_priv *pintfpriv); @@ -32,7 +32,7 @@ void r8712_usb_set_intf_ops(struct _io_ops *pops); void r8712_usb_read_port_cancel(struct _adapter *padapter); void r8712_usb_write_port_cancel(struct _adapter *padapter); int r8712_usbctrl_vendorreq(struct intf_priv *pintfpriv, u8 request, u16 value, - u16 index, void *pdata, u16 len, u8 requesttype); + u16 index, void *pdata, u16 len, u8 requesttype); #endif -- 2.30.2
[PATCH v7] soc: fsl: enable acpi support in RCPM driver
From: Peng Ma This patch enables ACPI support in RCPM driver. Signed-off-by: Peng Ma Signed-off-by: Ran Wang --- Change in v7: - Update comment for checking RCPM node which refferred to Change in v6: - Remove copyright udpate to rebase on latest mainline Change in v5: - Fix panic when dev->of_node is null Change in v4: - Make commit subject more accurate - Remove unrelated new blank line Change in v3: - Add #ifdef CONFIG_ACPI for acpi_device_id - Rename rcpm_acpi_imx_ids to rcpm_acpi_ids Change in v2: - Update acpi_device_id to fix conflict with other driver drivers/soc/fsl/rcpm.c | 24 ++-- 1 file changed, 22 insertions(+), 2 deletions(-) diff --git a/drivers/soc/fsl/rcpm.c b/drivers/soc/fsl/rcpm.c index 4ace28cab314..90d3f4060b0c 100644 --- a/drivers/soc/fsl/rcpm.c +++ b/drivers/soc/fsl/rcpm.c @@ -13,6 +13,7 @@ #include #include #include +#include #define RCPM_WAKEUP_CELL_MAX_SIZE 7 @@ -78,10 +79,20 @@ static int rcpm_pm_prepare(struct device *dev) "fsl,rcpm-wakeup", value, rcpm->wakeup_cells + 1); - /* Wakeup source should refer to current rcpm device */ - if (ret || (np->phandle != value[0])) + if (ret) continue; + /* +* For DT mode, would handle devices with "fsl,rcpm-wakeup" +* pointing to the current RCPM node. +* +* For ACPI mode, currently we assume there is only one +* RCPM controller existing. +*/ + if (is_of_node(dev->fwnode)) + if (np->phandle != value[0]) + continue; + /* Property "#fsl,rcpm-wakeup-cells" of rcpm node defines the * number of IPPDEXPCR register cells, and "fsl,rcpm-wakeup" * of wakeup source IP contains an integer array:
linux-next: manual merge of the net-next tree with the net tree
Hi all, Today's linux-next merge of the net-next tree got a conflict in: include/linux/ethtool.h between commit: a975d7d8a356 ("ethtool: Remove link_mode param and derive link params from driver") from the net tree and commit: 7888fe53b706 ("ethtool: Add common function for filling out strings") from the net-next tree. I fixed it up (see below) and can carry the fix as necessary. This is now fixed as far as linux-next is concerned, but any non trivial conflicts should be mentioned to your upstream maintainer when your tree is submitted for merging. You may also want to consider cooperating with the maintainer of the conflicting tree to minimise any particularly complex conflicts. -- Cheers, Stephen Rothwell diff --cc include/linux/ethtool.h index cdca84e6dd6b,5c631a298994.. --- a/include/linux/ethtool.h +++ b/include/linux/ethtool.h @@@ -573,12 -573,13 +575,22 @@@ struct ethtool_phy_ops */ void ethtool_set_ethtool_phy_ops(const struct ethtool_phy_ops *ops); +/* + * ethtool_params_from_link_mode - Derive link parameters from a given link mode + * @link_ksettings: Link parameters to be derived from the link mode + * @link_mode: Link mode + */ +void +ethtool_params_from_link_mode(struct ethtool_link_ksettings *link_ksettings, +enum ethtool_link_mode_bit_indices link_mode); ++ + /** + * ethtool_sprintf - Write formatted string to ethtool string data + * @data: Pointer to start of string to update + * @fmt: Format of string to write + * + * Write formatted string to data. Update data to point at start of + * next string. + */ + extern __printf(2, 3) void ethtool_sprintf(u8 **data, const char *fmt, ...); #endif /* _LINUX_ETHTOOL_H */ pgpQXT1TusKn0.pgp Description: OpenPGP digital signature
[PATCH] gpio: gpio-104-dio-48e: Fixed coding style issues
Fixed multiple bare uses of 'unsigned' without 'int'. Fixed space around '*' operator. Fixed function parameter alignment to opening parenthesis. Reported by checkpatch. Signed-off-by: Barney Goette --- drivers/gpio/gpio-104-dio-48e.c | 53 + 1 file changed, 27 insertions(+), 26 deletions(-) diff --git a/drivers/gpio/gpio-104-dio-48e.c b/drivers/gpio/gpio-104-dio-48e.c index 7a9021c4fa48..38badc421c32 100644 --- a/drivers/gpio/gpio-104-dio-48e.c +++ b/drivers/gpio/gpio-104-dio-48e.c @@ -49,15 +49,15 @@ struct dio48e_gpio { unsigned char out_state[6]; unsigned char control[2]; raw_spinlock_t lock; - unsigned base; + unsigned int base; unsigned char irq_mask; }; -static int dio48e_gpio_get_direction(struct gpio_chip *chip, unsigned offset) +static int dio48e_gpio_get_direction(struct gpio_chip *chip, unsigned int offset) { struct dio48e_gpio *const dio48egpio = gpiochip_get_data(chip); - const unsigned port = offset / 8; - const unsigned mask = BIT(offset % 8); + const unsigned int port = offset / 8; + const unsigned int mask = BIT(offset % 8); if (dio48egpio->io_state[port] & mask) return GPIO_LINE_DIRECTION_IN; @@ -65,14 +65,15 @@ static int dio48e_gpio_get_direction(struct gpio_chip *chip, unsigned offset) return GPIO_LINE_DIRECTION_OUT; } -static int dio48e_gpio_direction_input(struct gpio_chip *chip, unsigned offset) +static int dio48e_gpio_direction_input(struct gpio_chip *chip, unsigned int offset) { struct dio48e_gpio *const dio48egpio = gpiochip_get_data(chip); - const unsigned io_port = offset / 8; + const unsigned int io_port = offset / 8; const unsigned int control_port = io_port / 3; - const unsigned control_addr = dio48egpio->base + 3 + control_port*4; - unsigned long flags; - unsigned control; + const unsigned int control_addr = dio48egpio->base + 3 + control_port * 4; + + unsigned long flags; + unsigned int control; raw_spin_lock_irqsave(&dio48egpio->lock, flags); @@ -104,17 +105,17 @@ static int dio48e_gpio_direction_input(struct gpio_chip *chip, unsigned
offset) return 0; } -static int dio48e_gpio_direction_output(struct gpio_chip *chip, unsigned offset, - int value) +static int dio48e_gpio_direction_output(struct gpio_chip *chip, unsigned int offset, + int value) { struct dio48e_gpio *const dio48egpio = gpiochip_get_data(chip); - const unsigned io_port = offset / 8; + const unsigned int io_port = offset / 8; const unsigned int control_port = io_port / 3; - const unsigned mask = BIT(offset % 8); - const unsigned control_addr = dio48egpio->base + 3 + control_port*4; - const unsigned out_port = (io_port > 2) ? io_port + 1 : io_port; + const unsigned int mask = BIT(offset % 8); + const unsigned int control_addr = dio48egpio->base + 3 + control_port * 4; + const unsigned int out_port = (io_port > 2) ? io_port + 1 : io_port; unsigned long flags; - unsigned control; + unsigned int control; raw_spin_lock_irqsave(&dio48egpio->lock, flags); @@ -154,14 +155,14 @@ static int dio48e_gpio_direction_output(struct gpio_chip *chip, unsigned offset, return 0; } -static int dio48e_gpio_get(struct gpio_chip *chip, unsigned offset) +static int dio48e_gpio_get(struct gpio_chip *chip, unsigned int offset) { struct dio48e_gpio *const dio48egpio = gpiochip_get_data(chip); - const unsigned port = offset / 8; - const unsigned mask = BIT(offset % 8); - const unsigned in_port = (port > 2) ? port + 1 : port; + const unsigned int port = offset / 8; + const unsigned int mask = BIT(offset % 8); + const unsigned int in_port = (port > 2) ?
port + 1 : port; unsigned long flags; - unsigned port_state; + unsigned int port_state; raw_spin_lock_irqsave(&dio48egpio->lock, flags); @@ -202,12 +203,12 @@ static int dio48e_gpio_get_multiple(struct gpio_chip *chip, unsigned long *mask, return 0; } -static void dio48e_gpio_set(struct gpio_chip *chip, unsigned offset, int value) +static void dio48e_gpio_set(struct gpio_chip *chip, unsigned int offset, int value) { struct dio48e_gpio *const dio48egpio = gpiochip_get_data(chip); - const unsigned port = offset / 8; - const unsigned mask = BIT(offset % 8); - const unsigned out_port = (port > 2) ? port + 1 : port; + const unsigned int port = offset / 8; + const unsigned int mask = BIT(offset % 8); + const unsigned int out_port = (port > 2) ? port + 1 : port; unsigned long flags; raw_spin_lock_irqsave(&dio48egpio->lock, flags); @@ -306,7 +307,7 @@ static void dio48e_irq_unmask(struct irq_data *data) raw_spin_unlock_irqrestore(&dio48egpio->lock, flags); } -static int
Re: [PATCH] s390/pci: move ioremap/ioremap_prot/ioremap_wc/ioremap_wt/iounmap to arch/s390/mm/ioremap.c
On 2021/4/6 19:14, Niklas Schnelle wrote: > and move the have_mio variable out of the PCI only code or use a raw > "#ifdef CONFIG_PCI". Obviously we don't have any actual users of > ioremap() that don't depend on CONFIG_PCI but it would make it so that > ioremap() exists and should actually function without CONFIG_PCI. > The weird part though is that for anyone using it without CONFIG_PCI it > would stop working if that is set and the machine doesn't have MIO > support but would work if it does. Well, maybe it's better not to change it. And thank you for the explanation. Thanks, Bixuan Cui
Re: [PATCH 2/4] mm/hugetlb: simplify the return code of __vma_reservation_common()
On 2021/4/8 5:23, Mike Kravetz wrote: > On 4/6/21 8:09 PM, Miaohe Lin wrote: >> On 2021/4/7 10:37, Mike Kravetz wrote: >>> On 4/6/21 7:05 PM, Miaohe Lin wrote: Hi: On 2021/4/7 8:53, Mike Kravetz wrote: > On 4/2/21 2:32 AM, Miaohe Lin wrote: >> It's guaranteed that the vma is associated with a resv_map, i.e. either >> VM_MAYSHARE or HPAGE_RESV_OWNER, when the code reaches here or we would >> have returned via !resv check above. So ret must be less than 0 in the >> 'else' case. Simplify the return code to make this clear. > > I believe we still need that ternary operator in the return statement. > Why? > > There are two basic types of mappings to be concerned with: > shared and private. > For private mappings, a task can 'own' the mapping as indicated by > HPAGE_RESV_OWNER. Or, it may not own the mapping. The most common way > to create a non-owner private mapping is to have a task with a private > mapping fork. The parent process will have HPAGE_RESV_OWNER set, the > child process will not. The idea is that since the child has a COW copy > of the mapping it should not consume reservations made by the parent. The child process will not have HPAGE_RESV_OWNER set because at fork time, we do: /* * Clear hugetlb-related page reserves for children. This only * affects MAP_PRIVATE mappings. Faults generated by the child * are not guaranteed to succeed, even if read-only */ if (is_vm_hugetlb_page(tmp)) reset_vma_resv_huge_pages(tmp); i.e. we have vma->vm_private_data = (void *)0; for the child process and vma_resv_map() will return NULL in this case. Or am I missing something? > Only the parent (HPAGE_RESV_OWNER) is allowed to consume the > reservations. > Hope that makes sense?
> >> >> Signed-off-by: Miaohe Lin >> --- >> mm/hugetlb.c | 2 +- >> 1 file changed, 1 insertion(+), 1 deletion(-) >> >> diff --git a/mm/hugetlb.c b/mm/hugetlb.c >> index a03a50b7c410..b7864abded3d 100644 >> --- a/mm/hugetlb.c >> +++ b/mm/hugetlb.c >> @@ -2183,7 +2183,7 @@ static long __vma_reservation_common(struct hstate >> *h, >> return 1; >> } >> else > > This else also handles the case !HPAGE_RESV_OWNER. In this case, we IMO, for the case !HPAGE_RESV_OWNER, we won't reach here. What do you think? >>> >>> I think you are correct. >>> >>> However, if this is true we should be able to simplify the code even >>> further. There is no need to check for HPAGE_RESV_OWNER because we know >>> it must be set. Correct? If so, the code could look something like: >>> >>> if (vma->vm_flags & VM_MAYSHARE) >>> return ret; >>> >>> /* We know private mapping with HPAGE_RESV_OWNER */ >>> * ... * >>> * Add that existing comment */ >>> >>> if (ret > 0) >>> return 0; >>> if (ret == 0) >>> return 1; >>> return ret; >>> >> >> Many thanks for the good suggestion! Is this what you mean? > > I think the below changes would work fine. > > However, this patch/discussion has made me ask the question. Do we need > the HPAGE_RESV_OWNER flag? Is the following true? > !(vm_flags & VM_MAYSHARE) && vma_resv_map() ===> HPAGE_RESV_OWNER > !(vm_flags & VM_MAYSHARE) && !vma_resv_map() ===> !HPAGE_RESV_OWNER > I agree with you. HPAGE_RESV_OWNER is set in hugetlb_reserve_pages() and there's no way to clear it in the owner process. The child process cannot inherit both HPAGE_RESV_OWNER and resv_map. So for a !HPAGE_RESV_OWNER vma, it knows nothing about resv_map. IMO, in the !(vm_flags & VM_MAYSHARE) case, we must have: !!vma_resv_map() == !!HPAGE_RESV_OWNER > I am not suggesting we eliminate the flag and make corresponding > changes. Just curious if you believe we 'could' remove the flag and > depend on the above conditions.
> > One reason for NOT removing the flag is that the flag itself and > supporting code and comments help explain what happens with hugetlb > reserves for COW mappings. That code is hard to understand and the > existing code and comments around HPAGE_RESV_OWNER help with > understanding. Agree. This code took me several days to understand... > Thanks.
Re: [PATCH 2/3] dt-bindings: mfd: Convert pm8xxx bindings to yaml
On Wed 07 Apr 10:37 CDT 2021, ska...@codeaurora.org wrote: > Hi Bjorn, > > On 2021-03-11 22:33, Bjorn Andersson wrote: > > On Thu 11 Mar 01:29 CST 2021, satya priya wrote: [..] > > > +patternProperties: > > > + "rtc@[0-9a-f]+$": > > > > Can we somehow link this to individual binding docs instead of listing > > all the possible functions here? > > > > You mean we should split this into two: > qcom-pm8xxx.yaml and qcom-pm8xxx-rtc.yaml > Please correct me if I'm wrong. > Right, I'm worried that it will be quite hard to maintain this document once we start adding all the various pmic blocks to it. So if we somehow can maintain a series of qcom-pm8xxx-.yaml and just ref them into the main PMIC definition. @Rob, can you give us some guidance on how to structure this binding, where the various PMICs described will have some defined subset of a larger set of hardware blocks that's often shared between versions? Regards, Bjorn
[PATCH v2] Bluetooth: Add ncmd=0 recovery handling
During a command status or command complete event, the controller may set ncmd=0, indicating that it is not accepting any more commands. In such a case, the host holds off sending any more commands to the controller. If the controller doesn't recover from such a condition, the host will wait forever, until the user decides that Bluetooth is broken and power cycles it. This patch triggers the hardware error to reset the controller and driver when it gets into such a state, as there is no other way out. Reviewed-by: Abhishek Pandit-Subedi Signed-off-by: Manish Mandlik --- Changes in v2: - Emit the hardware error when ncmd=0 occurs include/net/bluetooth/hci.h | 1 + include/net/bluetooth/hci_core.h | 1 + net/bluetooth/hci_core.c | 15 +++ net/bluetooth/hci_event.c| 10 ++ 4 files changed, 27 insertions(+) diff --git a/include/net/bluetooth/hci.h b/include/net/bluetooth/hci.h index ea4ae551c426..c4b0650fb9ae 100644 --- a/include/net/bluetooth/hci.h +++ b/include/net/bluetooth/hci.h @@ -339,6 +339,7 @@ enum { #define HCI_PAIRING_TIMEOUT msecs_to_jiffies(60000) /* 60 seconds */ #define HCI_INIT_TIMEOUT msecs_to_jiffies(10000) /* 10 seconds */ #define HCI_CMD_TIMEOUT msecs_to_jiffies(2000) /* 2 seconds */ +#define HCI_NCMD_TIMEOUT msecs_to_jiffies(4000) /* 4 seconds */ #define HCI_ACL_TX_TIMEOUT msecs_to_jiffies(45000) /* 45 seconds */ #define HCI_AUTO_OFF_TIMEOUT msecs_to_jiffies(2000) /* 2 seconds */ #define HCI_POWER_OFF_TIMEOUT msecs_to_jiffies(5000) /* 5 seconds */ diff --git a/include/net/bluetooth/hci_core.h b/include/net/bluetooth/hci_core.h index ebdd4afe30d2..f14692b39fd5 100644 --- a/include/net/bluetooth/hci_core.h +++ b/include/net/bluetooth/hci_core.h @@ -470,6 +470,7 @@ struct hci_dev { struct delayed_work service_cache; struct delayed_work cmd_timer; + struct delayed_work ncmd_timer; struct work_struct rx_work; struct work_struct cmd_work; diff --git a/net/bluetooth/hci_core.c b/net/bluetooth/hci_core.c index b0d9c36acc03..c102a8763cb5 100644 ---
a/net/bluetooth/hci_core.c +++ b/net/bluetooth/hci_core.c @@ -2769,6 +2769,20 @@ static void hci_cmd_timeout(struct work_struct *work) queue_work(hdev->workqueue, &hdev->cmd_work); } +/* HCI ncmd timer function */ +static void hci_ncmd_timeout(struct work_struct *work) +{ + struct hci_dev *hdev = container_of(work, struct hci_dev, + ncmd_timer.work); + + bt_dev_err(hdev, "Controller not accepting commands anymore: ncmd = 0"); + + /* This is an irrecoverable state. Inject hw error event to reset +* the device and driver. +*/ + hci_reset_dev(hdev); +} + struct oob_data *hci_find_remote_oob_data(struct hci_dev *hdev, bdaddr_t *bdaddr, u8 bdaddr_type) { @@ -3831,6 +3845,7 @@ struct hci_dev *hci_alloc_dev(void) init_waitqueue_head(&hdev->suspend_wait_q); INIT_DELAYED_WORK(&hdev->cmd_timer, hci_cmd_timeout); + INIT_DELAYED_WORK(&hdev->ncmd_timer, hci_ncmd_timeout); hci_request_setup(hdev); diff --git a/net/bluetooth/hci_event.c b/net/bluetooth/hci_event.c index cf2f4a0abdbd..114a9170d809 100644 --- a/net/bluetooth/hci_event.c +++ b/net/bluetooth/hci_event.c @@ -3635,6 +3635,11 @@ static void hci_cmd_complete_evt(struct hci_dev *hdev, struct sk_buff *skb, if (*opcode != HCI_OP_NOP) cancel_delayed_work(&hdev->cmd_timer); + if (!ev->ncmd && !test_bit(HCI_RESET, &hdev->flags)) + schedule_delayed_work(&hdev->ncmd_timer, HCI_NCMD_TIMEOUT); + else + cancel_delayed_work(&hdev->ncmd_timer); + if (ev->ncmd && !test_bit(HCI_RESET, &hdev->flags)) atomic_set(&hdev->cmd_cnt, 1); @@ -3740,6 +3745,11 @@ static void hci_cmd_status_evt(struct hci_dev *hdev, struct sk_buff *skb, if (*opcode != HCI_OP_NOP) cancel_delayed_work(&hdev->cmd_timer); + if (!ev->ncmd && !test_bit(HCI_RESET, &hdev->flags)) + schedule_delayed_work(&hdev->ncmd_timer, HCI_NCMD_TIMEOUT); + else + cancel_delayed_work(&hdev->ncmd_timer); + if (ev->ncmd && !test_bit(HCI_RESET, &hdev->flags)) atomic_set(&hdev->cmd_cnt, 1); -- 2.31.0.208.g409f899ff0-goog
[PATCH v2] hwmon: Add driver for fsp-3y PSUs and PDUs
This patch adds support for these devices: - YH-5151E - the PDU - YM-2151E - the PSU The device datasheet says that the devices support PMBus 1.2, but in my testing, a lot of the commands aren't supported and if they are, they sometimes behave strangely or inconsistently. For example, writes to the PAGE command require using PEC, otherwise the write won't work and the page won't switch, even though the standard says that PEC is optional. On the other hand, writes to SMBALERT don't require PEC. Because of this, the driver is mostly reverse engineered with the help of a tool called pmbus_peek written by David Brownell (and later adopted by my colleague Jan Kundrát). The device also has some sort of a timing issue when switching pages, which is explained further in the code. Because of this, the driver support is limited. It exposes only the values that have been tested to work correctly. Signed-off-by: Václav Kubernát --- Documentation/hwmon/fsp-3y.rst | 26 drivers/hwmon/pmbus/Kconfig| 10 ++ drivers/hwmon/pmbus/Makefile | 1 + drivers/hwmon/pmbus/fsp-3y.c | 217 + 4 files changed, 254 insertions(+) create mode 100644 Documentation/hwmon/fsp-3y.rst create mode 100644 drivers/hwmon/pmbus/fsp-3y.c diff --git a/Documentation/hwmon/fsp-3y.rst b/Documentation/hwmon/fsp-3y.rst new file mode 100644 index ..68a547021846 --- /dev/null +++ b/Documentation/hwmon/fsp-3y.rst @@ -0,0 +1,26 @@ +Kernel driver fsp3y +== +Supported devices: + * 3Y POWER YH-5151E + * 3Y POWER YM-2151E + +Author: Václav Kubernát + +Description +--- +This driver implements limited support for two 3Y POWER devices.
+ +Sysfs entries +- +in1_input input voltage +in2_input 12V output voltage +in3_input 5V output voltage +curr1_input input current +curr2_input 12V output current +curr3_input 5V output current +fan1_input fan rpm +temp1_input temperature 1 +temp2_input temperature 2 +temp3_input temperature 3 +power1_input input power +power2_input output power diff --git a/drivers/hwmon/pmbus/Kconfig b/drivers/hwmon/pmbus/Kconfig index 03606d4298a4..9d12d446396c 100644 --- a/drivers/hwmon/pmbus/Kconfig +++ b/drivers/hwmon/pmbus/Kconfig @@ -56,6 +56,16 @@ config SENSORS_BEL_PFE This driver can also be built as a module. If so, the module will be called bel-pfe. +config SENSORS_FSP_3Y + tristate "FSP/3Y-Power power supplies" + help + If you say yes here you get hardware monitoring support for + FSP/3Y-Power hot-swap power supplies. + Supported models: YH-5151E, YM-2151E + + This driver can also be built as a module. If so, the module will + be called fsp-3y. + config SENSORS_IBM_CFFPS tristate "IBM Common Form Factor Power Supply" depends on LEDS_CLASS diff --git a/drivers/hwmon/pmbus/Makefile b/drivers/hwmon/pmbus/Makefile index 6a4ba0fdc1db..bfe218ad898f 100644 --- a/drivers/hwmon/pmbus/Makefile +++ b/drivers/hwmon/pmbus/Makefile @@ -8,6 +8,7 @@ obj-$(CONFIG_SENSORS_PMBUS) += pmbus.o obj-$(CONFIG_SENSORS_ADM1266) += adm1266.o obj-$(CONFIG_SENSORS_ADM1275) += adm1275.o obj-$(CONFIG_SENSORS_BEL_PFE) += bel-pfe.o +obj-$(CONFIG_SENSORS_FSP_3Y) += fsp-3y.o obj-$(CONFIG_SENSORS_IBM_CFFPS)+= ibm-cffps.o obj-$(CONFIG_SENSORS_INSPUR_IPSPS) += inspur-ipsps.o obj-$(CONFIG_SENSORS_IR35221) += ir35221.o diff --git a/drivers/hwmon/pmbus/fsp-3y.c b/drivers/hwmon/pmbus/fsp-3y.c new file mode 100644 index ..2c165e034fa8 --- /dev/null +++ b/drivers/hwmon/pmbus/fsp-3y.c @@ -0,0 +1,217 @@ +// SPDX-License-Identifier: GPL-2.0-or-later +/* + * Hardware monitoring driver for FSP 3Y-Power PSUs + * + * Copyright (c) 2021 Václav Kubernát, CESNET + */ + +#include +#include +#include +#include +#include
"pmbus.h" + +#define YM2151_PAGE_12V_LOG 0x00 +#define YM2151_PAGE_12V_REAL 0x00 +#define YM2151_PAGE_5VSB_LOG 0x01 +#define YM2151_PAGE_5VSB_REAL 0x20 +#define YH5151E_PAGE_12V_LOG 0x00 +#define YH5151E_PAGE_12V_REAL 0x00 +#define YH5151E_PAGE_5V_LOG 0x01 +#define YH5151E_PAGE_5V_REAL 0x10 +#define YH5151E_PAGE_3V3_LOG 0x02 +#define YH5151E_PAGE_3V3_REAL 0x11 + +enum chips { + ym2151e, + yh5151e +}; + +struct fsp3y_data { + struct pmbus_driver_info info; + enum chips chip; + int page; +}; + +#define to_fsp3y_data(x) container_of(x, struct fsp3y_data, info) + +static int page_log_to_page_real(int page_log, enum chips chip) +{ + switch (chip) { + case ym2151e: + switch (page_log) { + case YM2151_PAGE_12V_LOG: + return YM2151_PAGE_12V_REAL; + case YM2151_PAGE_5VSB_LOG: + return YM2151_PAGE_5VSB_REAL; + } + return -1; + case yh5151e: +
Re: [PATCH v2 5/5] percpu: implement partial chunk depopulation
Hello, On Wed, Apr 07, 2021 at 11:26:18AM -0700, Roman Gushchin wrote: > This patch implements partial depopulation of percpu chunks. > > As of now, a chunk can be depopulated only as a part of the final > destruction, if there are no more outstanding allocations. However > to minimize a memory waste it might be useful to depopulate a > partially filled chunk, if a small number of outstanding allocations > prevents the chunk from being fully reclaimed. > > This patch implements the following depopulation process: it scans > over the chunk pages, looks for a range of empty and populated pages > and performs the depopulation. To avoid races with new allocations, > the chunk is previously isolated. After the depopulation the chunk is > sidelined to a special list or freed. New allocations can't be served > using a sidelined chunk. The chunk can be moved back to a corresponding > slot if there are not enough chunks with empty populated pages. > > The depopulation is scheduled on the free path. If the chunk: > 1) has more than 1/4 of total pages free and populated > 2) the system has enough free percpu pages aside of this chunk > 3) isn't the reserved chunk > 4) isn't the first chunk > 5) isn't entirely free > it's a good target for depopulation. If it's already depopulated > but got free populated pages, it's a good target too. > The chunk is moved to a special pcpu_depopulate_list, chunk->isolate > flag is set and the async balancing is scheduled. > > The async balancing moves pcpu_depopulate_list to a local list > (because pcpu_depopulate_list can be changed when pcpu_lock is > released), and then tries to depopulate each chunk. The depopulation > is performed in the reverse direction to keep populated pages close to > the beginning, if the global number of empty pages is reached. > Depopulated chunks are sidelined to prevent further allocations. > Skipped and fully empty chunks are returned to the corresponding slot.
> > On the allocation path, if there are no suitable chunks found, > the list of sidelined chunks is scanned prior to creating a new chunk. > If there is a good sidelined chunk, it's placed back to the slot > and the scanning is restarted. > > Many thanks to Dennis Zhou for his great ideas and a very constructive > discussion which led to many improvements in this patchset! > > Signed-off-by: Roman Gushchin > --- > mm/percpu-internal.h | 2 + > mm/percpu.c | 164 ++- > 2 files changed, 164 insertions(+), 2 deletions(-) > > diff --git a/mm/percpu-internal.h b/mm/percpu-internal.h > index 095d7eaa0db4..8e432663c41e 100644 > --- a/mm/percpu-internal.h > +++ b/mm/percpu-internal.h > @@ -67,6 +67,8 @@ struct pcpu_chunk { > > void*data; /* chunk data */ > boolimmutable; /* no [de]population allowed */ > + boolisolated; /* isolated from chunk slot > lists */ > + booldepopulated;/* sidelined after depopulation > */ > int start_offset; /* the overlap with the previous > region to have a page aligned > base_addr */ > diff --git a/mm/percpu.c b/mm/percpu.c > index e20119668c42..0a5a5e84e0a4 100644 > --- a/mm/percpu.c > +++ b/mm/percpu.c > @@ -181,6 +181,19 @@ static LIST_HEAD(pcpu_map_extend_chunks); > */ > int pcpu_nr_empty_pop_pages[PCPU_NR_CHUNK_TYPES]; > > +/* > + * List of chunks with a lot of free pages. Used to depopulate them > + * asynchronously. > + */ > +static struct list_head pcpu_depopulate_list[PCPU_NR_CHUNK_TYPES]; > + > +/* > + * List of previously depopulated chunks. They are not usually used for new > + * allocations, but can be returned back to service if a need arises. > + */ > +static struct list_head pcpu_sideline_list[PCPU_NR_CHUNK_TYPES]; > + > + > /* > * The number of populated pages in use by the allocator, protected by > * pcpu_lock. This number is kept per a unit per chunk (i.e.
when a page > gets > @@ -542,6 +555,12 @@ static void pcpu_chunk_relocate(struct pcpu_chunk > *chunk, int oslot) > { > int nslot = pcpu_chunk_slot(chunk); > > + /* > + * Keep isolated and depopulated chunks on a sideline. > + */ > + if (chunk->isolated || chunk->depopulated) > + return; > + > if (oslot != nslot) > __pcpu_chunk_move(chunk, nslot, oslot < nslot); > } > @@ -1778,6 +1797,25 @@ static void __percpu *pcpu_alloc(size_t size, size_t > align, bool reserved, > } > } > > + /* search through sidelined depopulated chunks */ > + list_for_each_entry(chunk, &pcpu_sideline_list[type], list) { > + struct pcpu_block_md *chunk_md = &chunk->chunk_md; > + int bit_off; > + > + /* > + * If the allocation can fit in the chunk's contig hint, > +
Re: [PATCH V8 1/8] PM / devfreq: Add cpu based scaling support to passive_governor
On 4/1/21 9:16 AM, Chanwoo Choi wrote: > On 3/31/21 10:03 PM, andrew-sh.cheng wrote: >> On Wed, 2021-03-31 at 17:35 +0900, Chanwoo Choi wrote: >>> On 3/31/21 5:27 PM, Chanwoo Choi wrote: Hi, On 3/31/21 5:03 PM, andrew-sh.cheng wrote: > On Thu, 2021-03-25 at 17:14 +0900, Chanwoo Choi wrote: >> Hi, >> >> You are missing adding these patches to the linux-pm mailing list. >> Need to send them to the linux-pm ML. >> >> Also, before receiving this series, I tried to clean up these patches >> on a testing branch[1]. So I add my comment with my clean-up case. >> [1] >> https://git.kernel.org/pub/scm/linux/kernel/git/chanwoo/linux.git/log/?h=devfreq-testing-passive-gov >> >> >> And 'Saravana Kannan ' is a wrong email address. >> Please update the email or drop this email. > > Hi Chanwoo, > > Thank you for the advice. > I will resend patch v9 (add to linux-pm ML), remove this patch, and note > that my patch set is based on > https://git.kernel.org/pub/scm/linux/kernel/git/chanwoo/linux.git/log/?h=devfreq-testing-passive-gov > I have not yet tested this patch[1] on the devfreq-testing-passive-gov branch. So, if possible, I'd like you to test your patches with this patch[1] and then if there is no problem, could you send the next patches with patch[1]? [1]https://git.kernel.org/pub/scm/linux/kernel/git/chanwoo/linux.git/commit/?h=devfreq-testing-passive-gov&id=39c80d11a8f42dd63ecea1e0df595a0ceb83b454 >>> >>> >>> Sorry for the confusion. I made the devfreq-testing-passive-gov[1] >>> branch based on the latest devfreq-next branch.
>>> [1] >>> https://git.kernel.org/pub/scm/linux/kernel/git/chanwoo/linux.git/log/?h=devfreq-testing-passive-gov >>> >>> >>> First of all, if possible, I want to test them[1] with your patches in this >>> series. >>> And then if there are no problems, please let me know. After confirmation >>> from you, >>> I'll send the patches of the devfreq-testing-passive-gov[1] branch. >>> How about that? >>> >> Hi Chanwoo~ >> >> We will use this on the Google Chrome project. >> Google's Hsin-Yi has tested your patch + my patch set v8 [2~8] >> >> make sure cci devfreqs runs with cpufreq. >> suspend resume >> speedometer2 benchmark >> It is okay. >> >> Please send the patches of the devfreq-testing-passive-gov[1] branch. >> >> I will send patch v9 based on yours later. > > Thanks for your test. I'll send the patches today. I'm sorry for the delay; when I tested the patches for the devfreq parent type on Odroid-xu3, there were some problems related to lazy linking of OPP. So I'm trying to analyze them. Unfortunately, we need to postpone these patches to the next Linux version. [snip] -- Best Regards, Chanwoo Choi Samsung Electronics
[PATCH] staging: rtl8712: removed extra blank line
Removed an extra blank line so that only one blank line separates the two definitions. Reported by checkpatch. Signed-off-by: Mitali Borkar --- drivers/staging/rtl8712/rtl8712_wmac_regdef.h | 1 - 1 file changed, 1 deletion(-) diff --git a/drivers/staging/rtl8712/rtl8712_wmac_regdef.h b/drivers/staging/rtl8712/rtl8712_wmac_regdef.h index 662383fe7a8d..dfe3e9fbed43 100644 --- a/drivers/staging/rtl8712/rtl8712_wmac_regdef.h +++ b/drivers/staging/rtl8712/rtl8712_wmac_regdef.h @@ -32,6 +32,5 @@ #define AMPDU_MIN_SPACE (RTL8712_WMAC_ + 0x37) #define TXOP_STALL_CTRL (RTL8712_WMAC_ + 0x38) - #endif /*__RTL8712_WMAC_REGDEF_H__*/ -- 2.30.2