date:20210407

Re: [PATCH v2 4/6] soc: mediatek: devapc: rename variable for new IC support

2021-04-07 Thread Nina Wu

Hi, Matthias

On Tue, 2021-04-06 at 15:43 +0200, Matthias Brugger wrote:
> Regarding the commit subject:
> "soc: mediatek: devapc: rename variable for new IC support"
> maybe something like:
> "soc: mediatek: devapc: rename register variable infra_base"
> 
> Other then that looks good to me.
> 

OK. I will fix it in the next version.

Thanks

> On 01/04/2021 08:38, Nina Wu wrote:
> > From: Nina Wu 
> > 
> > For new ICs, there are multiple devapc HWs for different subsys.
> > For example, there is devapc respectively for infra, peri, peri2, etc.
> > So we rename the variable 'infra_base' to 'base' for code readability.
> > 
> > Signed-off-by: Nina Wu 
> > ---
> >  drivers/soc/mediatek/mtk-devapc.c | 24 
> >  1 file changed, 12 insertions(+), 12 deletions(-)
> > 
> > diff --git a/drivers/soc/mediatek/mtk-devapc.c 
> > b/drivers/soc/mediatek/mtk-devapc.c
> > index 68c3e35..bcf6e3c 100644
> > --- a/drivers/soc/mediatek/mtk-devapc.c
> > +++ b/drivers/soc/mediatek/mtk-devapc.c
> > @@ -45,7 +45,7 @@ struct mtk_devapc_data {
> >  
> >  struct mtk_devapc_context {
> > struct device *dev;
> > -   void __iomem *infra_base;
> > +   void __iomem *base;
> > u32 vio_idx_num;
> > struct clk *infra_clk;
> > const struct mtk_devapc_data *data;
> > @@ -56,7 +56,7 @@ static void clear_vio_status(struct mtk_devapc_context 
> > *ctx)
> > void __iomem *reg;
> > int i;
> >  
> > -   reg = ctx->infra_base + ctx->data->vio_sta_offset;
> > +   reg = ctx->base + ctx->data->vio_sta_offset;
> >  
> > for (i = 0; i < VIO_MOD_TO_REG_IND(ctx->vio_idx_num - 1); i++)
> > writel(GENMASK(31, 0), reg + 4 * i);
> > @@ -71,7 +71,7 @@ static void mask_module_irq(struct mtk_devapc_context 
> > *ctx, bool mask)
> > u32 val;
> > int i;
> >  
> > -   reg = ctx->infra_base + ctx->data->vio_mask_offset;
> > +   reg = ctx->base + ctx->data->vio_mask_offset;
> >  
> > if (mask)
> > val = GENMASK(31, 0);
> > @@ -113,11 +113,11 @@ static int devapc_sync_vio_dbg(struct 
> > mtk_devapc_context *ctx)
> > int ret;
> > u32 val;
> >  
> > -   pd_vio_shift_sta_reg = ctx->infra_base +
> > +   pd_vio_shift_sta_reg = ctx->base +
> >ctx->data->vio_shift_sta_offset;
> > -   pd_vio_shift_sel_reg = ctx->infra_base +
> > +   pd_vio_shift_sel_reg = ctx->base +
> >ctx->data->vio_shift_sel_offset;
> > -   pd_vio_shift_con_reg = ctx->infra_base +
> > +   pd_vio_shift_con_reg = ctx->base +
> >ctx->data->vio_shift_con_offset;
> >  
> > /* Find the minimum shift group which has violation */
> > @@ -159,8 +159,8 @@ static void devapc_extract_vio_dbg(struct 
> > mtk_devapc_context *ctx)
> > void __iomem *vio_dbg0_reg;
> > void __iomem *vio_dbg1_reg;
> >  
> > -   vio_dbg0_reg = ctx->infra_base + ctx->data->vio_dbg0_offset;
> > -   vio_dbg1_reg = ctx->infra_base + ctx->data->vio_dbg1_offset;
> > +   vio_dbg0_reg = ctx->base + ctx->data->vio_dbg0_offset;
> > +   vio_dbg1_reg = ctx->base + ctx->data->vio_dbg1_offset;
> >  
> > vio_dbgs.vio_dbg0 = readl(vio_dbg0_reg);
> > vio_dbgs.vio_dbg1 = readl(vio_dbg1_reg);
> > @@ -198,7 +198,7 @@ static irqreturn_t devapc_violation_irq(int irq_number, 
> > void *data)
> >   */
> >  static void start_devapc(struct mtk_devapc_context *ctx)
> >  {
> > -   writel(BIT(31), ctx->infra_base + ctx->data->apc_con_offset);
> > +   writel(BIT(31), ctx->base + ctx->data->apc_con_offset);
> >  
> > mask_module_irq(ctx, false);
> >  }
> > @@ -210,7 +210,7 @@ static void stop_devapc(struct mtk_devapc_context *ctx)
> >  {
> > mask_module_irq(ctx, true);
> >  
> > -   writel(BIT(2), ctx->infra_base + ctx->data->apc_con_offset);
> > +   writel(BIT(2), ctx->base + ctx->data->apc_con_offset);
> >  }
> >  
> >  static const struct mtk_devapc_data devapc_mt6779 = {
> > @@ -249,8 +249,8 @@ static int mtk_devapc_probe(struct platform_device 
> > *pdev)
> > ctx->data = of_device_get_match_data(>dev);
> > ctx->dev = >dev;
> >  
> > -   ctx->infra_base = of_iomap(node, 0);
> > -   if (!ctx->infra_base)
> > +   ctx->base = of_iomap(node, 0);
> > +   if (!ctx->base)
> > return -EINVAL;
> >  
> > if (of_property_read_u32(node, "vio_idx_num", >vio_idx_num))
> >

Re: [PATCH 00/14] usb: dwc2: Fix Partial Power down issues.

2021-04-07 Thread Artur Petrosyan

Hi Greg,

On 4/7/2021 14:00, Artur Petrosyan wrote:
> This patch set fixes and improves the Partial Power Down mode for
> dwc2 core.
> It adds support for the following cases
>  1. Entering and exiting partial power down when a port is
> suspended, resumed, port reset is asserted.
>  2. Exiting the partial power down mode before removing driver.
>  3. Exiting partial power down in wakeup detected interrupt handler.
>  4. Exiting from partial power down mode when connector ID.
> status changes to "connId B
> 
> It updates and fixes the implementation of dwc2 entering and
> exiting partial power down mode when the system (PC) is suspended.
> 
> The patch set also improves the implementation of function handlers
> for entering and exiting host or device partial power down.
> 
> NOTE: This is the second patch set in the power saving mode fixes
> series.
> This patch set is part of multiple series and is continuation
> of the "usb: dwc2: Fix and improve power saving modes" patch set.
> (Patch set link: https://marc.info/?l=linux-usb=160379622403975=2).
> The patches that were included in the "usb: dwc2:
> Fix and improve power saving modes" which was submitted
> earlier was too large and needed to be split up into
> smaller patch sets.
> 
> 
> Artur Petrosyan (14):
>usb: dwc2: Add device partial power down functions
>usb: dwc2: Add host partial power down functions
>usb: dwc2: Update enter and exit partial power down functions
>usb: dwc2: Add partial power down exit flow in wakeup intr.
>usb: dwc2: Update port suspend/resume function definitions.
>usb: dwc2: Add enter partial power down when port is suspended
>usb: dwc2: Add exit partial power down when port is resumed
>usb: dwc2: Add exit partial power down when port reset is asserted
>usb: dwc2: Add part. power down exit from
>  dwc2_conn_id_status_change().
>usb: dwc2: Allow exit partial power down in urb enqueue
>usb: dwc2: Fix session request interrupt handler
>usb: dwc2: Update partial power down entering by system suspend
>usb: dwc2: Fix partial power down exiting by system resume
>usb: dwc2: Add exit partial power down before removing driver
> 
>   drivers/usb/dwc2/core.c  | 113 ++---
>   drivers/usb/dwc2/core.h  |  27 ++-
>   drivers/usb/dwc2/core_intr.c |  46 ++--
>   drivers/usb/dwc2/gadget.c| 148 ++-
>   drivers/usb/dwc2/hcd.c   | 458 +--
>   drivers/usb/dwc2/hw.h|   1 +
>   drivers/usb/dwc2/platform.c  |  11 +-
>   7 files changed, 558 insertions(+), 246 deletions(-)
> 
> 
> base-commit: e9fcb07704fcef6fa6d0333fd2b3a62442eaf45b
> 
I have submitted this patch set yesterday. It contains 14 patches. But 
only 2 of those patches were received by LKML only the cover letter and 
the 13th patch. 
(https://lore.kernel.org/linux-usb/cover.1617782102.git.arthur.petros...@synopsys.com/T/#t)

I checked here at Synopsys, Minas did receive all the patches as his 
email is in To list. Could this be an issue of vger.kernel.org mailing 
server?

Because I checked every local possibility that could result to such 
behavior. The patch 13 which was received by LKML has the similar 
content as the other patches.

The mailing tool that was used is ssmtp, checked all the configurations 
everything is fine.

Could you please suggest what should I do in this situation?

Regards,
Artur

Re: [PATCH] drm/amd/pm: convert sysfs snprintf to sysfs_emit

2021-04-07 Thread Carlis

On Wed, 7 Apr 2021 16:30:01 -0400
Alex Deucher  wrote:

> On Tue, Apr 6, 2021 at 10:13 AM Carlis  wrote:
> >
> > From: Xuezhi Zhang 
> >
> > Fix the following coccicheck warning:
> > drivers/gpu/drm/amd/pm//amdgpu_pm.c:1940:8-16:
> > WARNING: use scnprintf or sprintf
> > drivers/gpu/drm/amd/pm//amdgpu_pm.c:1978:8-16:
> > WARNING: use scnprintf or sprintf
> > drivers/gpu/drm/amd/pm//amdgpu_pm.c:2022:8-16:
> > WARNING: use scnprintf or sprintf
> > drivers/gpu/drm/amd/pm//amdgpu_pm.c:294:8-16:
> > WARNING: use scnprintf or sprintf
> > drivers/gpu/drm/amd/pm//amdgpu_pm.c:154:8-16:
> > WARNING: use scnprintf or sprintf
> > drivers/gpu/drm/amd/pm//amdgpu_pm.c:496:8-16:
> > WARNING: use scnprintf or sprintf
> > drivers/gpu/drm/amd/pm//amdgpu_pm.c:512:9-17:
> > WARNING: use scnprintf or sprintf
> > drivers/gpu/drm/amd/pm//amdgpu_pm.c:1740:8-16:
> > WARNING: use scnprintf or sprintf
> > drivers/gpu/drm/amd/pm//amdgpu_pm.c:1667:8-16:
> > WARNING: use scnprintf or sprintf
> > drivers/gpu/drm/amd/pm//amdgpu_pm.c:2074:8-16:
> > WARNING: use scnprintf or sprintf
> > drivers/gpu/drm/amd/pm//amdgpu_pm.c:2047:9-17:
> > WARNING: use scnprintf or sprintf
> > drivers/gpu/drm/amd/pm//amdgpu_pm.c:2768:8-16:
> > WARNING: use scnprintf or sprintf
> > drivers/gpu/drm/amd/pm//amdgpu_pm.c:2738:8-16:
> > WARNING: use scnprintf or sprintf
> > drivers/gpu/drm/amd/pm//amdgpu_pm.c:2442:8-16:
> > WARNING: use scnprintf or sprintf
> > drivers/gpu/drm/amd/pm//amdgpu_pm.c:3246:8-16:
> > WARNING: use scnprintf or sprintf
> > drivers/gpu/drm/amd/pm//amdgpu_pm.c:3253:8-16:
> > WARNING: use scnprintf or sprintf
> > drivers/gpu/drm/amd/pm//amdgpu_pm.c:2458:8-16:
> > WARNING: use scnprintf or sprintf
> > drivers/gpu/drm/amd/pm//amdgpu_pm.c:3047:8-16:
> > WARNING: use scnprintf or sprintf
> > drivers/gpu/drm/amd/pm//amdgpu_pm.c:3133:8-16:
> > WARNING: use scnprintf or sprintf
> > drivers/gpu/drm/amd/pm//amdgpu_pm.c:3209:8-16:
> > WARNING: use scnprintf or sprintf
> > drivers/gpu/drm/amd/pm//amdgpu_pm.c:3216:8-16:
> > WARNING: use scnprintf or sprintf
> > drivers/gpu/drm/amd/pm//amdgpu_pm.c:2410:8-16:
> > WARNING: use scnprintf or sprintf
> > drivers/gpu/drm/amd/pm//amdgpu_pm.c:2496:8-16:
> > WARNING: use scnprintf or sprintf
> > drivers/gpu/drm/amd/pm//amdgpu_pm.c:2470:8-16:
> > WARNING: use scnprintf or sprintf
> > drivers/gpu/drm/amd/pm//amdgpu_pm.c:2426:8-16:
> > WARNING: use scnprintf or sprintf
> > drivers/gpu/drm/amd/pm//amdgpu_pm.c:2965:8-16:
> > WARNING: use scnprintf or sprintf
> > drivers/gpu/drm/amd/pm//amdgpu_pm.c:2972:8-16:
> > WARNING: use scnprintf or sprintf
> > drivers/gpu/drm/amd/pm//amdgpu_pm.c:3006:8-16:
> > WARNING: use scnprintf or sprintf
> > drivers/gpu/drm/amd/pm//amdgpu_pm.c:3013:8-16:
> > WARNING: use scnprintf or sprintf
> >
> > Signed-off-by: Xuezhi Zhang   
> 
> I already applied a similar patch last week.
> 
> Thanks,
> 
> Alex
> 
OK.
Thanks,

Xuezhi Zhang
> 
> > ---
> >  drivers/gpu/drm/amd/pm/amdgpu_pm.c | 58
> > +++--- 1 file changed, 29 insertions(+), 29
> > deletions(-)
> >
> > diff --git a/drivers/gpu/drm/amd/pm/amdgpu_pm.c
> > b/drivers/gpu/drm/amd/pm/amdgpu_pm.c index
> > 5fa65f191a37..2777966ec1ca 100644 ---
> > a/drivers/gpu/drm/amd/pm/amdgpu_pm.c +++
> > b/drivers/gpu/drm/amd/pm/amdgpu_pm.c @@ -151,7 +151,7 @@ static
> > ssize_t amdgpu_get_power_dpm_state(struct device *dev,
> > pm_runtime_mark_last_busy(ddev->dev);
> > pm_runtime_put_autosuspend(ddev->dev);
> >
> > -   return snprintf(buf, PAGE_SIZE, "%s\n",
> > +   return sysfs_emit(buf, "%s\n",
> > (pm == POWER_STATE_TYPE_BATTERY) ?
> > "battery" : (pm == POWER_STATE_TYPE_BALANCED) ? "balanced" :
> > "performance"); }
> > @@ -291,7 +291,7 @@ static ssize_t
> > amdgpu_get_power_dpm_force_performance_level(struct device *dev,
> > pm_runtime_mark_last_busy(ddev->dev);
> > pm_runtime_put_autosuspend(ddev->dev);
> >
> > -   return snprintf(buf, PAGE_SIZE, "%s\n",
> > +   return sysfs_emit(buf, "%s\n",
> > (level == AMD_DPM_FORCED_LEVEL_AUTO) ?
> > "auto" : (level == AMD_DPM_FORCED_LEVEL_LOW) ? "low" :
> > (level == AMD_DPM_FORCED_LEVEL_HIGH) ?
> > "high" : @@ -493,7 +493,7 @@ static ssize_t
> > amdgpu_get_pp_cur_state(struct device *dev, if (i == data.nums)
> > i = -EINVAL;
> >
> > -   return snprintf(buf, PAGE_SIZE, "%d\n", i);
> > +   return sysfs_emit(buf, "%d\n", i);
> >  }
> >
> >  static ssize_t amdgpu_get_pp_force_state(struct device *dev,
> > @@ -509,7 +509,7 @@ static ssize_t amdgpu_get_pp_force_state(struct
> > device *dev, if (adev->pp_force_state_enabled)
> > return amdgpu_get_pp_cur_state(dev, attr, buf);
> > else
> > -   return snprintf(buf, PAGE_SIZE, "\n");
> > +   return sysfs_emit(buf, "\n");
> >  }
> >
> >  static ssize_t amdgpu_set_pp_force_state(struct device *dev,
> > @@ -1664,7 +1664,7 @@ static ssize_t amdgpu_get_pp_sclk_od(struct
> > device *dev,

Re: [PATCH v2 2/6] soc: mediatek: devapc: move 'vio_idx_num' info to DT

2021-04-07 Thread Nina Wu

Hi, Matthias


On Tue, 2021-04-06 at 15:41 +0200, Matthias Brugger wrote:
> 
> On 01/04/2021 08:38, Nina Wu wrote:
> > From: Nina Wu 
> > 
> > For new ICs, there are multiple devapc HWs for different subsys.
> > The number of devices controlled by each devapc (i.e. 'vio_idx_num'
> > in the code) varies.
> > We move this info from compatible data to DT so that we do not need
> > to add n compatible for a certain IC which has n devapc HWs with
> > different 'vio_idx_num', respectively.
> > 
> > Signed-off-by: Nina Wu 
> > ---
> >  drivers/soc/mediatek/mtk-devapc.c | 18 +-
> >  1 file changed, 9 insertions(+), 9 deletions(-)
> > 
> > diff --git a/drivers/soc/mediatek/mtk-devapc.c 
> > b/drivers/soc/mediatek/mtk-devapc.c
> > index f1cea04..a0f6fbd 100644
> > --- a/drivers/soc/mediatek/mtk-devapc.c
> > +++ b/drivers/soc/mediatek/mtk-devapc.c
> > @@ -32,9 +32,6 @@ struct mtk_devapc_vio_dbgs {
> >  };
> >  
> >  struct mtk_devapc_data {
> > -   /* numbers of violation index */
> > -   u32 vio_idx_num;
> > -
> > /* reg offset */
> > u32 vio_mask_offset;
> > u32 vio_sta_offset;
> > @@ -49,6 +46,7 @@ struct mtk_devapc_data {
> >  struct mtk_devapc_context {
> > struct device *dev;
> > void __iomem *infra_base;
> > +   u32 vio_idx_num;
> 
> We should try to stay backwards compatible (newer kernel with older DTS). I
> think we don't need to move vio_idx_num to mtk_devapc_context. Just don't
> declare it in the per SoC match data. More details see below...
> 
> > struct clk *infra_clk;
> > const struct mtk_devapc_data *data;
> >  };
> > @@ -60,10 +58,10 @@ static void clear_vio_status(struct mtk_devapc_context 
> > *ctx)
> >  
> > reg = ctx->infra_base + ctx->data->vio_sta_offset;
> >  
> > -   for (i = 0; i < VIO_MOD_TO_REG_IND(ctx->data->vio_idx_num) - 1; i++)
> > +   for (i = 0; i < VIO_MOD_TO_REG_IND(ctx->vio_idx_num - 1); i++)
> > writel(GENMASK(31, 0), reg + 4 * i);
> >  
> > -   writel(GENMASK(VIO_MOD_TO_REG_OFF(ctx->data->vio_idx_num) - 1, 0),
> > +   writel(GENMASK(VIO_MOD_TO_REG_OFF(ctx->vio_idx_num - 1), 0),
> >reg + 4 * i);
> >  }
> >  
> > @@ -80,15 +78,15 @@ static void mask_module_irq(struct mtk_devapc_context 
> > *ctx, bool mask)
> > else
> > val = 0;
> >  
> > -   for (i = 0; i < VIO_MOD_TO_REG_IND(ctx->data->vio_idx_num) - 1; i++)
> > +   for (i = 0; i < VIO_MOD_TO_REG_IND(ctx->vio_idx_num - 1); i++)
> > writel(val, reg + 4 * i);
> >  
> > val = readl(reg + 4 * i);
> > if (mask)
> > -   val |= GENMASK(VIO_MOD_TO_REG_OFF(ctx->data->vio_idx_num) - 1,
> > +   val |= GENMASK(VIO_MOD_TO_REG_OFF(ctx->vio_idx_num - 1),
> >0);
> > else
> > -   val &= ~GENMASK(VIO_MOD_TO_REG_OFF(ctx->data->vio_idx_num) - 1,
> > +   val &= ~GENMASK(VIO_MOD_TO_REG_OFF(ctx->vio_idx_num - 1),
> > 0);
> >  
> > writel(val, reg + 4 * i);
> > @@ -216,7 +214,6 @@ static void stop_devapc(struct mtk_devapc_context *ctx)
> >  }
> >  
> >  static const struct mtk_devapc_data devapc_mt6779 = {
> > -   .vio_idx_num = 511,
> > .vio_mask_offset = 0x0,
> > .vio_sta_offset = 0x400,
> > .vio_dbg0_offset = 0x900,
> > @@ -256,6 +253,9 @@ static int mtk_devapc_probe(struct platform_device 
> > *pdev)
> > if (!ctx->infra_base)
> > return -EINVAL;
> >  
> > +   if (of_property_read_u32(node, "vio_idx_num", >vio_idx_num))
> > +   return -EINVAL;
> > +
> 
> ...only read the property if  vio_idx_num == 0.
> What do you think?
> 
> Regards,
> Matthias
> 

Good idea. I will fix it in the next version.

Thanks


> > devapc_irq = irq_of_parse_and_map(node, 0);
> > if (!devapc_irq)
> > return -EINVAL;
> >

Re: [PATCH] mtd: add OTP (one-time-programmable) erase ioctl

2021-04-07 Thread Tudor.Ambarus

Michael,

Would you please resend this patch, together with the mtd-utils
and the SPI NOR patch in a single patch set? You'll help us all
having all in a single place.

For the new ioctl we'll need acks from all the mtd maintainers
and at least a tested-by tag.

Cheers,
ta

Re: [RFC/RFT PATCH 1/3] memblock: update initialization of reserved pages

2021-04-07 Thread Mike Rapoport

On Thu, Apr 08, 2021 at 10:46:18AM +0530, Anshuman Khandual wrote:
> 
> 
> On 4/7/21 10:56 PM, Mike Rapoport wrote:
> > From: Mike Rapoport 
> > 
> > The struct pages representing a reserved memory region are initialized
> > using reserve_bootmem_range() function. This function is called for each
> > reserved region just before the memory is freed from memblock to the buddy
> > page allocator.
> > 
> > The struct pages for MEMBLOCK_NOMAP regions are kept with the default
> > values set by the memory map initialization which makes it necessary to
> > have a special treatment for such pages in pfn_valid() and
> > pfn_valid_within().
> > 
> > Split out initialization of the reserved pages to a function with a
> > meaningful name and treat the MEMBLOCK_NOMAP regions the same way as the
> > reserved regions and mark struct pages for the NOMAP regions as
> > PageReserved.
> 
> This would definitely need updating the comment for MEMBLOCK_NOMAP definition
> in include/linux/memblock.h just to make the semantics is clear,

Sure

> though arm64 is currently the only user for MEMBLOCK_NOMAP.

> > Signed-off-by: Mike Rapoport 
> > ---
> >  mm/memblock.c | 23 +--
> >  1 file changed, 21 insertions(+), 2 deletions(-)
> > 
> > diff --git a/mm/memblock.c b/mm/memblock.c
> > index afaefa8fc6ab..6b7ea9d86310 100644
> > --- a/mm/memblock.c
> > +++ b/mm/memblock.c
> > @@ -2002,6 +2002,26 @@ static unsigned long __init 
> > __free_memory_core(phys_addr_t start,
> > return end_pfn - start_pfn;
> >  }
> >  
> > +static void __init memmap_init_reserved_pages(void)
> > +{
> > +   struct memblock_region *region;
> > +   phys_addr_t start, end;
> > +   u64 i;
> > +
> > +   /* initialize struct pages for the reserved regions */
> > +   for_each_reserved_mem_range(i, , )
> > +   reserve_bootmem_region(start, end);
> > +
> > +   /* and also treat struct pages for the NOMAP regions as PageReserved */
> > +   for_each_mem_region(region) {
> > +   if (memblock_is_nomap(region)) {
> > +   start = region->base;
> > +   end = start + region->size;
> > +   reserve_bootmem_region(start, end);
> > +   }
> > +   }
> > +}
> > +
> >  static unsigned long __init free_low_memory_core_early(void)
> >  {
> > unsigned long count = 0;
> > @@ -2010,8 +2030,7 @@ static unsigned long __init 
> > free_low_memory_core_early(void)
> >  
> > memblock_clear_hotplug(0, -1);
> >  
> > -   for_each_reserved_mem_range(i, , )
> > -   reserve_bootmem_region(start, end);
> > +   memmap_init_reserved_pages();
> >  
> > /*
> >  * We need to use NUMA_NO_NODE instead of NODE_DATA(0)->node_id
> > 

-- 
Sincerely yours,
Mike.

Re: [v9,3/7] PCI: mediatek-gen3: Add MediaTek Gen3 driver for MT8192

2021-04-07 Thread Jianjun Wang

Hi Bjorn, Lorenzo,

Just gentle ping for this patch set, please kindly let me know your
comments about this patch set.

Thanks.

On Wed, 2021-03-24 at 11:05 +0800, Jianjun Wang wrote:
> MediaTek's PCIe host controller has three generation HWs, the new
> generation HW is an individual bridge, it supports Gen3 speed and
> compatible with Gen2, Gen1 speed.
> 
> Add support for new Gen3 controller which can be found on MT8192.
> 
> Signed-off-by: Jianjun Wang 
> Acked-by: Ryder Lee 
> ---
>  drivers/pci/controller/Kconfig  |  13 +
>  drivers/pci/controller/Makefile |   1 +
>  drivers/pci/controller/pcie-mediatek-gen3.c | 464 
>  3 files changed, 478 insertions(+)
>  create mode 100644 drivers/pci/controller/pcie-mediatek-gen3.c
> 
> diff --git a/drivers/pci/controller/Kconfig b/drivers/pci/controller/Kconfig
> index 5aa8977d7b0f..1e925ac47279 100644
> --- a/drivers/pci/controller/Kconfig
> +++ b/drivers/pci/controller/Kconfig
> @@ -233,6 +233,19 @@ config PCIE_MEDIATEK
> Say Y here if you want to enable PCIe controller support on
> MediaTek SoCs.
>  
> +config PCIE_MEDIATEK_GEN3
> + tristate "MediaTek Gen3 PCIe controller"
> + depends on ARCH_MEDIATEK || COMPILE_TEST
> + depends on PCI_MSI_IRQ_DOMAIN
> + help
> +   Adds support for PCIe Gen3 MAC controller for MediaTek SoCs.
> +   This PCIe controller is compatible with Gen3, Gen2 and Gen1 speed,
> +   and support up to 256 MSI interrupt numbers for
> +   multi-function devices.
> +
> +   Say Y here if you want to enable Gen3 PCIe controller support on
> +   MediaTek SoCs.
> +
>  config VMD
>   depends on PCI_MSI && X86_64 && SRCU
>   tristate "Intel Volume Management Device Driver"
> diff --git a/drivers/pci/controller/Makefile b/drivers/pci/controller/Makefile
> index e4559f2182f2..579973327815 100644
> --- a/drivers/pci/controller/Makefile
> +++ b/drivers/pci/controller/Makefile
> @@ -27,6 +27,7 @@ obj-$(CONFIG_PCIE_ROCKCHIP) += pcie-rockchip.o
>  obj-$(CONFIG_PCIE_ROCKCHIP_EP) += pcie-rockchip-ep.o
>  obj-$(CONFIG_PCIE_ROCKCHIP_HOST) += pcie-rockchip-host.o
>  obj-$(CONFIG_PCIE_MEDIATEK) += pcie-mediatek.o
> +obj-$(CONFIG_PCIE_MEDIATEK_GEN3) += pcie-mediatek-gen3.o
>  obj-$(CONFIG_PCIE_MICROCHIP_HOST) += pcie-microchip-host.o
>  obj-$(CONFIG_VMD) += vmd.o
>  obj-$(CONFIG_PCIE_BRCMSTB) += pcie-brcmstb.o
> diff --git a/drivers/pci/controller/pcie-mediatek-gen3.c 
> b/drivers/pci/controller/pcie-mediatek-gen3.c
> new file mode 100644
> index ..3546e53b3c85
> --- /dev/null
> +++ b/drivers/pci/controller/pcie-mediatek-gen3.c
> @@ -0,0 +1,464 @@
> +// SPDX-License-Identifier: GPL-2.0
> +/*
> + * MediaTek PCIe host controller driver.
> + *
> + * Copyright (c) 2020 MediaTek Inc.
> + * Author: Jianjun Wang 
> + */
> +
> +#include 
> +#include 
> +#include 
> +#include 
> +#include 
> +#include 
> +#include 
> +#include 
> +#include 
> +#include 
> +#include 
> +
> +#include "../pci.h"
> +
> +#define PCIE_SETTING_REG 0x80
> +#define PCIE_PCI_IDS_1   0x9c
> +#define PCI_CLASS(class) (class << 8)
> +#define PCIE_RC_MODE BIT(0)
> +
> +#define PCIE_CFGNUM_REG  0x140
> +#define PCIE_CFG_DEVFN(devfn)((devfn) & GENMASK(7, 0))
> +#define PCIE_CFG_BUS(bus)(((bus) << 8) & GENMASK(15, 8))
> +#define PCIE_CFG_BYTE_EN(bytes)  (((bytes) << 16) & GENMASK(19, 
> 16))
> +#define PCIE_CFG_FORCE_BYTE_EN   BIT(20)
> +#define PCIE_CFG_OFFSET_ADDR 0x1000
> +#define PCIE_CFG_HEADER(bus, devfn) \
> + (PCIE_CFG_BUS(bus) | PCIE_CFG_DEVFN(devfn))
> +
> +#define PCIE_RST_CTRL_REG0x148
> +#define PCIE_MAC_RSTBBIT(0)
> +#define PCIE_PHY_RSTBBIT(1)
> +#define PCIE_BRG_RSTBBIT(2)
> +#define PCIE_PE_RSTB BIT(3)
> +
> +#define PCIE_LTSSM_STATUS_REG0x150
> +
> +#define PCIE_LINK_STATUS_REG 0x154
> +#define PCIE_PORT_LINKUP BIT(8)
> +
> +#define PCIE_TRANS_TABLE_BASE_REG0x800
> +#define PCIE_ATR_SRC_ADDR_MSB_OFFSET 0x4
> +#define PCIE_ATR_TRSL_ADDR_LSB_OFFSET0x8
> +#define PCIE_ATR_TRSL_ADDR_MSB_OFFSET0xc
> +#define PCIE_ATR_TRSL_PARAM_OFFSET   0x10
> +#define PCIE_ATR_TLB_SET_OFFSET  0x20
> +
> +#define PCIE_MAX_TRANS_TABLES8
> +#define PCIE_ATR_EN  BIT(0)
> +#define PCIE_ATR_SIZE(size) \
> + (size) - 1) << 1) & GENMASK(6, 1)) | PCIE_ATR_EN)
> +#define PCIE_ATR_ID(id)  ((id) & GENMASK(3, 0))
> +#define PCIE_ATR_TYPE_MEMPCIE_ATR_ID(0)
> +#define PCIE_ATR_TYPE_IO PCIE_ATR_ID(1)
> +#define PCIE_ATR_TLP_TYPE(type)  (((type) << 16) & GENMASK(18, 
> 16))
> +#define PCIE_ATR_TLP_TYPE_MEMPCIE_ATR_TLP_TYPE(0)
> +#define PCIE_ATR_TLP_TYPE_IO PCIE_ATR_TLP_TYPE(2)
> +
>

Re: Re: [PATCH] ASoC: codecs: Fix rumtime PM imbalance in tas2552_probe

2021-04-07 Thread dinghao . liu

> On Wed, Apr 07, 2021 at 02:54:00PM +0800, Dinghao Liu wrote:
> 
> > -   pm_runtime_set_active(>dev);
> > -   pm_runtime_set_autosuspend_delay(>dev, 1000);
> > -   pm_runtime_use_autosuspend(>dev);
> > -   pm_runtime_enable(>dev);
> > -   pm_runtime_mark_last_busy(>dev);
> > -   pm_runtime_put_sync_autosuspend(>dev);
> > -
> > dev_set_drvdata(>dev, data);
> >  
> > ret = devm_snd_soc_register_component(>dev,
> > @@ -733,6 +726,13 @@ static int tas2552_probe(struct i2c_client *client,
> > if (ret < 0)
> > dev_err(>dev, "Failed to register component: %d\n", 
> > ret);
> >  
> > +   pm_runtime_set_active(>dev);
> > +   pm_runtime_set_autosuspend_delay(>dev, 1000);
> > +   pm_runtime_use_autosuspend(>dev);
> 
> It's not clear to me that just moving the operations after the
> registration is a good fix - once the component is registered we could
> start trying to do runtime PM operations with it which AFAIR won't count
> references and so on properly if runtime PM isn't enabled so if we later
> enable runtime PM we might have the rest of the code in a confused state
> about what's going on.

Thanks for your advice. I checked the use of devm_snd_soc_register_component() 
in the kernel and found sometimes runtime PM is enabled before registration 
and sometimes after registration. To be on the safe side, I will send a new
patch to fix this in error handling path.

Regards,
Dinghao

Re: [PATCH v3 03/12] dump_stack: Add vmlinux build ID to stack traces

2021-04-07 Thread Stephen Boyd

Quoting Petr Mladek (2021-04-07 06:42:38)
> 
> I think that you need to use something like:
> 
> #ifdef CONFIG_STACKTRACE_BUILD_ID
> #define BUILD_ID_FTM " %20phN"
> #define BUILD_ID_VAL vmlinux_build_id
> #else
> #define BUILD_ID_FTM "%s"
> #define BUILD_ID_VAL ""
> #endif
> 
> printk("%sCPU: %d PID: %d Comm: %.20s %s%s %s %.*s" BUILD_ID_FTM "\n",
>log_lvl, raw_smp_processor_id(), current->pid, current->comm,
>kexec_crash_loaded() ? "Kdump: loaded " : "",
>print_tainted(),
>init_utsname()->release,
>(int)strcspn(init_utsname()->version, " "),
>init_utsname()->version,
>BUILD_ID_VAL);
> 

Thanks. I didn't see this warning but I see it now after compiling
again. Not sure how I missed this one. I've rolled in this fix as well.

Re: [PATCH] sched/fair: use signed long when compute energy delta in eas

2021-04-07 Thread Xuewen Yan

Hi

On Wed, Apr 7, 2021 at 10:11 PM Pierre  wrote:
>
> Hi,
> > I test the patch, but the overflow still exists.
> > In the "sched/fair: Use pd_cache to speed up find_energy_efficient_cpu()"
> > I wonder why recompute the cpu util when cpu==dst_cpu in compute_energy(),
> > when the dst_cpu's util change, it also would cause the overflow.
>
> The patches aim to cache the energy values for the CPUs whose
> utilization is not modified (so we don't have to compute it multiple
> times). The values cached are the 'base values' of the CPUs, i.e. when
> the task is not placed on the CPU. When (cpu==dst_cpu) in
> compute_energy(), it means the energy values need to be updated instead
> of using the cached ones.
>
well, is it better to use the task_util(p) + cache values ? but in
this case, the cache
values may need more parameters.

> You are right, there is still a possibility to have a negative delta
> with the patches at:
> https://gitlab.arm.com/linux-arm/linux-power/-/commits/eas/next/integration-20210129
> Adding a check before subtracting the values, and bailing out in such
> case would avoid this, such as at:
> https://gitlab.arm.com/linux-arm/linux-pg/-/commits/feec_bail_out/
>
In your patch, you bail out the case by "go to fail", that means you
don't use eas in such
case. However, in the actual scene, the case often occurr when select
cpu for small task.
As a result, the small task would not select cpu according to the eas,
it may affect
power consumption?

> I think a similar modification should be done in your patch. Even though
> this is a good idea to group the calls to compute_energy() to reduce the
> chances of having updates of utilization values in between the
> compute_energy() calls,
> there is still a chance to have updates. I think it happened when I
> applied your patch.
>
> About changing the delta(s) from 'unsigned long' to 'long', I am not
> sure of the meaning of having a negative delta. I thing it would be
> better to check and fail before it happens instead.
>
> Regards
>

Re: [PATCH v4 08/16] KVM: x86/pmu: Add IA32_DS_AREA MSR emulation to manage guest DS buffer

2021-04-07 Thread Xu, Like


Hi Peter,

Thanks for your detailed comments.

If you have more comments for other patches, please let me know.

On 2021/4/7 23:39, Peter Zijlstra wrote:

On Mon, Mar 29, 2021 at 01:41:29PM +0800, Like Xu wrote:

@@ -3869,10 +3876,12 @@ static struct perf_guest_switch_msr 
*intel_guest_get_msrs(int *nr, void *data)
  
  		if (arr[1].guest)

arr[0].guest |= arr[1].guest;
-   else
+   else {
arr[1].guest = arr[1].host;
+   arr[2].guest = arr[2].host;
+   }

What's all this gibberish?

The way I read that it says:

if guest has PEBS_ENABLED
guest GLOBAL_CTRL |= PEBS_ENABLED
otherwise
guest PEBS_ENABLED = host PEBS_ENABLED
guest DS_AREA = host DS_AREA

which is just completely random garbage afaict. Why would you leak host
msrs into the guest?


In fact, this is not a leak at all.

When we do "arr[i].guest = arr[i].host;" assignment in the 
intel_guest_get_msrs(),
the KVM will check "if (msrs[i].host == msrs[i].guest)" and if so, it 
disables the atomic
switch for this msr during vmx transaction in the caller 
atomic_switch_perf_msrs().


In that case, the msr value doesn't change and any guest write will be trapped.
If the next check is "msrs[i].host != msrs[i].guest", the atomic switch 
will be triggered again.


Compared to before, this part of the logic has not changed, which helps to 
reduce overhead.



Why would you change guest GLOBAL_CTRL implicitly;


This is because in the early part of this function, we have operations:

    if (x86_pmu.flags & PMU_FL_PEBS_ALL)
        arr[0].guest &= ~cpuc->pebs_enabled;
    else
        arr[0].guest &= ~(cpuc->pebs_enabled & PEBS_COUNTER_MASK);

and if guest has PEBS_ENABLED, we need these bits back for PEBS counters:

    arr[0].guest |= arr[1].guest;


guest had better wrmsr that himself to control when stuff is enabled.


When vm_entry, the msr value of GLOBAL_CTRL on the hardware may be
different from trapped value "pmu->global_ctrl" written by the guest.

If the perf scheduler cross maps guest counter X to the host counter Y,
we have to enable the bit Y in GLOBAL_CTRL before vm_entry rather than X.



This just cannot be right.

Re: [PATCH v2 1/1] powerpc/iommu: Enable remaining IOMMU Pagesizes present in LoPAR

2021-04-07 Thread Michael Ellerman

Leonardo Bras  writes:
> According to LoPAR, ibm,query-pe-dma-window output named "IO Page Sizes"
> will let the OS know all possible pagesizes that can be used for creating a
> new DDW.
>
> Currently Linux will only try using 3 of the 8 available options:
> 4K, 64K and 16M. According to LoPAR, Hypervisor may also offer 32M, 64M,
> 128M, 256M and 16G.

Do we know of any hardware & hypervisor combination that will actually
give us bigger pages?

> Enabling bigger pages would be interesting for direct mapping systems
> with a lot of RAM, while using less TCE entries.
>
> Signed-off-by: Leonardo Bras 
> ---
>  arch/powerpc/platforms/pseries/iommu.c | 49 ++
>  1 file changed, 42 insertions(+), 7 deletions(-)
>
> diff --git a/arch/powerpc/platforms/pseries/iommu.c 
> b/arch/powerpc/platforms/pseries/iommu.c
> index 9fc5217f0c8e..6cda1c92597d 100644
> --- a/arch/powerpc/platforms/pseries/iommu.c
> +++ b/arch/powerpc/platforms/pseries/iommu.c
> @@ -53,6 +53,20 @@ enum {
>   DDW_EXT_QUERY_OUT_SIZE = 2
>  };

A comment saying where the values come from would be good.

> +#define QUERY_DDW_PGSIZE_4K  0x01
> +#define QUERY_DDW_PGSIZE_64K 0x02
> +#define QUERY_DDW_PGSIZE_16M 0x04
> +#define QUERY_DDW_PGSIZE_32M 0x08
> +#define QUERY_DDW_PGSIZE_64M 0x10
> +#define QUERY_DDW_PGSIZE_128M0x20
> +#define QUERY_DDW_PGSIZE_256M0x40
> +#define QUERY_DDW_PGSIZE_16G 0x80

I'm not sure the #defines really gain us much vs just putting the
literal values in the array below?

> +struct iommu_ddw_pagesize {
> + u32 mask;
> + int shift;
> +};
> +
>  static struct iommu_table_group *iommu_pseries_alloc_group(int node)
>  {
>   struct iommu_table_group *table_group;
> @@ -1099,6 +1113,31 @@ static void reset_dma_window(struct pci_dev *dev, 
> struct device_node *par_dn)
>ret);
>  }
>  
> +/* Returns page shift based on "IO Page Sizes" output at 
> ibm,query-pe-dma-window. See LoPAR */
> +static int iommu_get_page_shift(u32 query_page_size)
> +{
> + const struct iommu_ddw_pagesize ddw_pagesize[] = {
> + { QUERY_DDW_PGSIZE_16G,  __builtin_ctz(SZ_16G)  },
> + { QUERY_DDW_PGSIZE_256M, __builtin_ctz(SZ_256M) },
> + { QUERY_DDW_PGSIZE_128M, __builtin_ctz(SZ_128M) },
> + { QUERY_DDW_PGSIZE_64M,  __builtin_ctz(SZ_64M)  },
> + { QUERY_DDW_PGSIZE_32M,  __builtin_ctz(SZ_32M)  },
> + { QUERY_DDW_PGSIZE_16M,  __builtin_ctz(SZ_16M)  },
> + { QUERY_DDW_PGSIZE_64K,  __builtin_ctz(SZ_64K)  },
> + { QUERY_DDW_PGSIZE_4K,   __builtin_ctz(SZ_4K)   }
> + };


cheers

Re: [PATCH v3 12/12] kdump: Use vmlinux_build_id to simplify

2021-04-07 Thread Stephen Boyd

Quoting Petr Mladek (2021-04-07 10:03:28)
> On Tue 2021-03-30 20:05:20, Stephen Boyd wrote:
> > We can use the vmlinux_build_id array here now instead of open coding
> > it. This mostly consolidates code.
> > 
> > Cc: Jiri Olsa 
> > Cc: Alexei Starovoitov 
> > Cc: Jessica Yu 
> > Cc: Evan Green 
> > Cc: Hsin-Yi Wang 
> > Cc: Dave Young 
> > Cc: Baoquan He 
> > Cc: Vivek Goyal 
> > Cc: 
> > Signed-off-by: Stephen Boyd 
> > ---
> >  include/linux/crash_core.h |  6 +-
> >  kernel/crash_core.c| 41 ++
> >  2 files changed, 3 insertions(+), 44 deletions(-)
> > 
> > diff --git a/include/linux/crash_core.h b/include/linux/crash_core.h
> > index 206bde8308b2..fb8ab99bb2ee 100644
> > --- a/include/linux/crash_core.h
> > +++ b/include/linux/crash_core.h
> > @@ -39,7 +39,7 @@ phys_addr_t paddr_vmcoreinfo_note(void);
> >  #define VMCOREINFO_OSRELEASE(value) \
> >   vmcoreinfo_append_str("OSRELEASE=%s\n", value)
> >  #define VMCOREINFO_BUILD_ID(value) \
> > - vmcoreinfo_append_str("BUILD-ID=%s\n", value)
> > + vmcoreinfo_append_str("BUILD-ID=%20phN\n", value)
> 
> Please, add also build check that BUILD_ID_MAX == 20.
> 

I added a BUILD_BUG_ON() in kernel/crash_core.c. I tried static_assert()
here but got mixed ISO errors from gcc-10, although it feels like it
should work.

In file included from ./arch/arm64/include/asm/cmpxchg.h:10,
 from ./arch/arm64/include/asm/atomic.h:16,
 from ./include/linux/atomic.h:7,
 from ./include/linux/mm_types_task.h:13,
 from ./include/linux/mm_types.h:5,
 from ./include/linux/buildid.h:5,
 from kernel/crash_core.c:7:
kernel/crash_core.c: In function 'crash_save_vmcoreinfo_init':
./include/linux/build_bug.h:78:41: warning: ISO C90 forbids mixed declarations 
and code [-Wdeclaration-after-statement]
   78 | #define __static_assert(expr, msg, ...) _Static_assert(expr, msg)
  | ^~
./include/linux/build_bug.h:77:34: note: in expansion of macro '__static_assert'
   77 | #define static_assert(expr, ...) __static_assert(expr, ##__VA_ARGS__, 
#expr)
  |  ^~~
./include/linux/crash_core.h:42:2: note: in expansion of macro 'static_assert'
   42 |  static_assert(ARRAY_SIZE(value) == BUILD_ID_SIZE_MAX); \
  |  ^
kernel/crash_core.c:401:2: note: in expansion of macro 'VMCOREINFO_BUILD_ID'
  401 |  VMCOREINFO_BUILD_ID(vmlinux_build_id);

> 
> The function add_build_id_vmcoreinfo() is used in
> crash_save_vmcoreinfo_init() in this context:
> 
> 
> VMCOREINFO_OSRELEASE(init_uts_ns.name.release);
> add_build_id_vmcoreinfo();
> VMCOREINFO_PAGESIZE(PAGE_SIZE);
> 
> VMCOREINFO_SYMBOL(init_uts_ns);
> VMCOREINFO_OFFSET(uts_namespace, name);
> VMCOREINFO_SYMBOL(node_online_map);
> 
> The function is not longer need. VMCOREINFO_BUILD_ID()
> can be used directly:
> 
> VMCOREINFO_OSRELEASE(init_uts_ns.name.release);
> VMCOREINFO_BUILD_ID(vmlinux_build_id);
> VMCOREINFO_PAGESIZE(PAGE_SIZE);
> 
> VMCOREINFO_SYMBOL(init_uts_ns);
> VMCOREINFO_OFFSET(uts_namespace, name);
> VMCOREINFO_SYMBOL(node_online_map);
> 
> 

Thanks. Makes sense. I've rolled that in.

[RFC PATCH] Add split_lock

2021-04-07 Thread Matthew Wilcox (Oracle)

bit_spinlocks are horrible on RT because there's absolutely nowhere
to put the mutex to sleep on.  They also do not participate in lockdep
because there's nowhere to put the map.

Most (all?) bit spinlocks are actually a split lock; logically they
could be treated as a single spinlock, but for performance, we want to
split the lock over many objects.  Introduce the split_lock as somewhere
to store the lockdep map and as somewhere that the RT kernel can put
a mutex.  It may also let us store a ticket lock for better performance
on non-RT kernels in the future, but I have left the current cpu_relax()
implementation intact for now.

The API change breaks all users except for the two which have been
converted.  This is an RFC, and I'm willing to fix all the rest.

Signed-off-by: Matthew Wilcox (Oracle) 
---
 fs/dcache.c  | 25 ++--
 include/linux/bit_spinlock.h | 36 ++---
 include/linux/list_bl.h  |  9 
 include/linux/split_lock.h   | 45 
 mm/slub.c|  6 +++--
 5 files changed, 84 insertions(+), 37 deletions(-)
 create mode 100644 include/linux/split_lock.h

diff --git a/fs/dcache.c b/fs/dcache.c
index 7d24ff7eb206..a3861d330001 100644
--- a/fs/dcache.c
+++ b/fs/dcache.c
@@ -96,6 +96,7 @@ EXPORT_SYMBOL(slash_name);
 
 static unsigned int d_hash_shift __read_mostly;
 
+static DEFINE_SPLIT_LOCK(d_hash_lock);
 static struct hlist_bl_head *dentry_hashtable __read_mostly;
 
 static inline struct hlist_bl_head *d_hash(unsigned int hash)
@@ -469,9 +470,9 @@ static void ___d_drop(struct dentry *dentry)
else
b = d_hash(dentry->d_name.hash);
 
-   hlist_bl_lock(b);
+   hlist_bl_lock(b, _hash_lock);
__hlist_bl_del(>d_hash);
-   hlist_bl_unlock(b);
+   hlist_bl_unlock(b, _hash_lock);
 }
 
 void __d_drop(struct dentry *dentry)
@@ -2074,9 +2075,9 @@ static struct dentry *__d_instantiate_anon(struct dentry 
*dentry,
__d_set_inode_and_type(dentry, inode, add_flags);
hlist_add_head(>d_u.d_alias, >i_dentry);
if (!disconnected) {
-   hlist_bl_lock(>d_sb->s_roots);
+   hlist_bl_lock(>d_sb->s_roots, _hash_lock);
hlist_bl_add_head(>d_hash, >d_sb->s_roots);
-   hlist_bl_unlock(>d_sb->s_roots);
+   hlist_bl_unlock(>d_sb->s_roots, _hash_lock);
}
spin_unlock(>d_lock);
spin_unlock(>i_lock);
@@ -2513,9 +2514,9 @@ static void __d_rehash(struct dentry *entry)
 {
struct hlist_bl_head *b = d_hash(entry->d_name.hash);
 
-   hlist_bl_lock(b);
+   hlist_bl_lock(b, _hash_lock);
hlist_bl_add_head_rcu(>d_hash, b);
-   hlist_bl_unlock(b);
+   hlist_bl_unlock(b, _hash_lock);
 }
 
 /**
@@ -2606,9 +2607,9 @@ struct dentry *d_alloc_parallel(struct dentry *parent,
goto retry;
}
 
-   hlist_bl_lock(b);
+   hlist_bl_lock(b, _hash_lock);
if (unlikely(READ_ONCE(parent->d_inode->i_dir_seq) != seq)) {
-   hlist_bl_unlock(b);
+   hlist_bl_unlock(b, _hash_lock);
rcu_read_unlock();
goto retry;
}
@@ -2626,7 +2627,7 @@ struct dentry *d_alloc_parallel(struct dentry *parent,
continue;
if (!d_same_name(dentry, parent, name))
continue;
-   hlist_bl_unlock(b);
+   hlist_bl_unlock(b, _hash_lock);
/* now we can try to grab a reference */
if (!lockref_get_not_dead(>d_lockref)) {
rcu_read_unlock();
@@ -2664,7 +2665,7 @@ struct dentry *d_alloc_parallel(struct dentry *parent,
new->d_flags |= DCACHE_PAR_LOOKUP;
new->d_wait = wq;
hlist_bl_add_head_rcu(>d_u.d_in_lookup_hash, b);
-   hlist_bl_unlock(b);
+   hlist_bl_unlock(b, _hash_lock);
return new;
 mismatch:
spin_unlock(>d_lock);
@@ -2677,12 +2678,12 @@ void __d_lookup_done(struct dentry *dentry)
 {
struct hlist_bl_head *b = in_lookup_hash(dentry->d_parent,
 dentry->d_name.hash);
-   hlist_bl_lock(b);
+   hlist_bl_lock(b, _hash_lock);
dentry->d_flags &= ~DCACHE_PAR_LOOKUP;
__hlist_bl_del(>d_u.d_in_lookup_hash);
wake_up_all(dentry->d_wait);
dentry->d_wait = NULL;
-   hlist_bl_unlock(b);
+   hlist_bl_unlock(b, _hash_lock);
INIT_HLIST_NODE(>d_u.d_alias);
INIT_LIST_HEAD(>d_lru);
 }
diff --git a/include/linux/bit_spinlock.h b/include/linux/bit_spinlock.h
index bbc4730a6505..641623d471b0 100644
--- a/include/linux/bit_spinlock.h
+++ b/include/linux/bit_spinlock.h
@@ -2,6 +2,7 @@
 #ifndef __LINUX_BIT_SPINLOCK_H
 #define __LINUX_BIT_SPINLOCK_H
 
+#include 
 #include 
 #include 
 #include 
@@ -13,32 +14,23 @@
  * Don't use this unless you really need to: spin_lock() and spin_unlock()
  * are significantly faster.
  */

Re: [PATCH v6 3/8] regulator: IRQ based event/error notification helpers

2021-04-07 Thread Matti Vaittinen

Hello Andy, All.

On Wed, 2021-04-07 at 16:21 +0300, Andy Shevchenko wrote:
> On Wed, Apr 7, 2021 at 1:04 PM Matti Vaittinen
>  wrote:
> > Provide helper function for IC's implementing regulator
> > notifications
> > when an IRQ fires. The helper also works for IRQs which can not be
> > acked.
> > Helper can be set to disable the IRQ at handler and then re-
> > enabling it
> > on delayed work later. The helper also adds
> > regulator_get_error_flags()
> > errors in cache for the duration of IRQ disabling.
> 
> Thanks for an update, my comments below. After addressing them, feel
> free to add
> Reviewed-by: Andy Shevchenko 
> 
> > Signed-off-by: Matti Vaittinen 
> > 
> >  static int _regulator_get_error_flags(struct regulator_dev *rdev,
> > unsigned int *flags)
> >  {
> > -   int ret;
> > +   int ret, tmpret;
> > 
> > regulator_lock(rdev);
> > 
> > +   ret = rdev_get_cached_err_flags(rdev);
> > +
> > /* sanity check */
> > -   if (!rdev->desc->ops->get_error_flags) {
> > +   if (rdev->desc->ops->get_error_flags) {
> > +   tmpret = rdev->desc->ops->get_error_flags(rdev,
> > flags);
> > +   if (tmpret > 0)
> > +   ret |= tmpret;
> 
> Oh, I don't like this. Easy fix is to rename ret (okay, it's been
> used
> elsewhere, so adding then) to something meaningful, like error_flags,
> then you can easily understand that value should be positive and
> error
> codes are negative.

No wonder if this looks hairy. I think I have got this plain wrong. The
rdev_get_cached_err_flags() is not updating the flags. Looks like just
plain mistake from my side. I think I've mixed the returning flags via
parameter and return value. This must be fixed. Well spotted.


> + */
> > +void *devm_regulator_irq_helper(struct device *dev,
> > +   const struct regulator_irq_desc *d,
> > int irq,
> > +   int irq_flags, int common_errs,
> > +   int *per_rdev_errs,
> > +   struct regulator_dev **rdev, int
> > rdev_amount)
> 
> I didn't get why you need the ** pointer instead of plain pointer.

We have an array of pointers. And we give a pointer to the first
pointer. Maybe it's the lack of coffee but I don't see why a single
pointer would be correct? rdev structures are not in contagious memory,
pointers to rdevs are. So we need address of first pointer, right?
+#include 


> > +#include 
> > +#include 
> > +#include 
> 
> Not sure how this header is used. I haven't found any direct users of
> it. Perhaps you wanted interrupt.h?

Thanks. I think this specific header may be a leftover from first draft
where I thought I'll use named IRQs. The header was for
 of_irq_get_byname(). That ended up as a mess for everything else but
platform devices :) I'll check the headers, thanks.

> > +#include 
> > +#include 
> > +#include 
> 
> + Blank line? I would separate group of generic headers with
> particular to the subsystem

I don't see this being used in regulator subsystem - and to tell the
truth, I don't really see the value.

> > +#include 

...

> > +
> > +reread:
> > +   if (d->fatal_cnt && h->retry_cnt > d->fatal_cnt) {
> > +   if (d->die)
> > +   ret = d->die(rid);
> > +   else
> > +   die_loudly("Regulator HW failure? - no IC
> > recovery");
> > +
> > +   /*
> > +* If the 'last resort' IC recovery failed we will
> > have
> > +* nothing else left to do...
> > +*/
> > +   if (ret)
> > +   die_loudly("Regulator HW failure? - IC
> > recovery failed");
> 
> Looking at the above code this will be executed if and only if
> d->die() is defined, correct?
> In that case, why not
> 
> if (d->die) {
>   ret = ...
>   if (ret)
>rdev_die_loudly(...);
> } else
>rdev_die_loudly(...);
> 
> ?

I think this should simply be:

if (!d->die)
die_loudly("Regulator HW failure? - no IC recovery");

ret = d->die(rdev);
if (ret)
die_loudly(...);

...

> > +static void init_rdev_errors(struct regulator_irq *h)
> > +{
> > +   int i;
> > +
> > +   for (i = 0; i < h->rdata.num_states; i++) {
> > +   if (h->rdata.states[i].possible_errs)
> > +   /* Can we trust writing this boolean is
> > atomic? */
> 
> No. boolean is a compiler / platform specific and it may potentially
> be written in a non-atomic way.

Hmm.. I don't think this really is a problem here. We only use the
use_cached_err for true/false evaluation - and if error getting api is
called after the boolean is changed - then cached error is used, if
before, then it is not used. Even if the value of the boolean was read
in the middle of writing it, it will still evaluate either true or
false - there is no 'maybe' state :)

My point, I guess we can do the change without

Re: [PATCH net v4] atl1c: move tx cleanup processing out of interrupt

2021-04-07 Thread Gatis Peisenieks


On 2021-04-07 19:55, Eric Dumazet wrote:

On 4/6/21 4:49 PM, Gatis Peisenieks wrote:

Tx queue cleanup happens in interrupt handler on same core as rx queue
processing. Both can take considerable amount of processing in high
packet-per-second scenarios.

Sending big amounts of packets can stall the rx processing which is 
unfair

and also can lead to out-of-memory condition since __dev_kfree_skb_irq
queues the skbs for later kfree in softirq which is not allowed to 
happen

with heavy load in interrupt handler.



[ ... ]


diff --git a/net/core/dev.c b/net/core/dev.c
index 0f72ff5d34ba..489ac60b530c 100644
--- a/net/core/dev.c
+++ b/net/core/dev.c
@@ -6789,6 +6789,7 @@ int dev_set_threaded(struct net_device *dev, 
bool threaded)


 return err;
 }
+EXPORT_SYMBOL(dev_set_threaded);

 void netif_napi_add(struct net_device *dev, struct napi_struct *napi,
 int (*poll)(struct napi_struct *, int), int weight)


This has already been done in net-next

Please base your patch on top of net-next, this can not be backported 
to old

versions anyway, without some amount of pain.


Thank you Eric, for heads up, the v5 patch sent for net-next in response 
to

David Miller comment already does that.

Re: [PATCH][next] scsi: pm80xx: Fix potential infinite loop

2021-04-07 Thread Jinpu Wang

On Wed, Apr 7, 2021 at 7:18 PM Martin K. Petersen
 wrote:
>
>
> Hi Colin!
>
> > The for-loop iterates with a u8 loop counter i and compares this with
> > the loop upper limit of pm8001_ha->max_q_num which is a u32 type.
> > There is a potential infinite loop if pm8001_ha->max_q_num is larger
> > than the u8 loop counter. Fix this by making the loop counter the same
> > type as pm8001_ha->max_q_num.
>
> No particular objections to the patch for future-proofing. However, as
> far as I can tell max_q_num is capped at 64 (PM8001_MAX_MSIX_VEC).
Exactly.
>
> --
> Martin K. Petersen  Oracle Linux Engineering

Re: [RFC/RFT PATCH 0/3] arm64: drop pfn_valid_within() and simplify pfn_valid()

2021-04-07 Thread Anshuman Khandual

Adding James here.

+ James Morse 

On 4/7/21 10:56 PM, Mike Rapoport wrote:
> From: Mike Rapoport 
> 
> Hi,
> 
> These patches aim to remove CONFIG_HOLES_IN_ZONE and essentially hardwire
> pfn_valid_within() to 1. 

That would be really great for arm64 platform as it will save CPU cycles on
many generic MM paths, given that our pfn_valid() has been expensive.

> 
> The idea is to mark NOMAP pages as reserved in the memory map and restore

Though I am not really sure, would that possibly be problematic for UEFI/EFI
use cases as it might have just treated them as normal struct pages till now.

> the intended semantics of pfn_valid() to designate availability of struct
> page for a pfn.

Right, that would be better as the current semantics is not ideal.

> 
> With this the core mm will be able to cope with the fact that it cannot use
> NOMAP pages and the holes created by NOMAP ranges within MAX_ORDER blocks
> will be treated correctly even without the need for pfn_valid_within.
> 
> The patches are only boot tested on qemu-system-aarch64 so I'd really
> appreciate memory stress tests on real hardware.

Did some preliminary memory stress tests on a guest with portions of memory
marked as MEMBLOCK_NOMAP and did not find any obvious problem. But this might
require some testing on real UEFI environment with firmware using MEMBLOCK_NOMAP
memory to make sure that changing these struct pages to PageReserved() is safe.

> 
> If this actually works we'll be one step closer to drop custom pfn_valid()
> on arm64 altogether.

Right, planning to rework and respin the RFC originally sent last month.

https://patchwork.kernel.org/project/linux-mm/patch/1615174073-10520-1-git-send-email-anshuman.khand...@arm.com/

Re: [PATCH] arm64: dts: qcom: Move rmtfs memory region

2021-04-07 Thread Sibi Sankar


Hey Sujit,
Thanks for the patch.

On 2021-03-30 07:16, Sujit Kautkar wrote:

Move rmtfs memory region so that it does not overlap with system
RAM (kernel data) when KAsan is enabled. This puts rmtfs right
after mba_mem which is not supposed to increase beyond 0x9460

Signed-off-by: Sujit Kautkar 
---
 arch/arm64/boot/dts/qcom/sc7180-trogdor.dtsi | 2 +-
 arch/arm64/boot/dts/qcom/sc7180.dtsi | 4 ++--
 2 files changed, 3 insertions(+), 3 deletions(-)

diff --git a/arch/arm64/boot/dts/qcom/sc7180-trogdor.dtsi
b/arch/arm64/boot/dts/qcom/sc7180-trogdor.dtsi
index 07c8b2c926c0..fe052b477b72 100644
--- a/arch/arm64/boot/dts/qcom/sc7180-trogdor.dtsi
+++ b/arch/arm64/boot/dts/qcom/sc7180-trogdor.dtsi
@@ -45,7 +45,7 @@ trips {

 /* Increase the size from 2MB to 8MB */
 _mem {
-   reg = <0x0 0x8440 0x0 0x80>;
+   reg = <0x0 0x9460 0x0 0x80>;


Sorry for the late comments. Can you
please do the same for sc7180-idp
as well?

Reviewed-by: Sibi Sankar 


 };

 / {
diff --git a/arch/arm64/boot/dts/qcom/sc7180.dtsi
b/arch/arm64/boot/dts/qcom/sc7180.dtsi
index 1ea3344ab62c..ac956488908f 100644
--- a/arch/arm64/boot/dts/qcom/sc7180.dtsi
+++ b/arch/arm64/boot/dts/qcom/sc7180.dtsi
@@ -110,9 +110,9 @@ tz_mem: memory@80b0 {
no-map;
};

-   rmtfs_mem: memory@8440 {
+   rmtfs_mem: memory@9460 {
compatible = "qcom,rmtfs-mem";
-   reg = <0x0 0x8440 0x0 0x20>;
+   reg = <0x0 0x9460 0x0 0x20>;
no-map;

qcom,client-id = <1>;


--
Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum,
a Linux Foundation Collaborative Project.

Re: [RFC/RFT PATCH 1/3] memblock: update initialization of reserved pages

2021-04-07 Thread Anshuman Khandual




On 4/7/21 10:56 PM, Mike Rapoport wrote:
> From: Mike Rapoport 
> 
> The struct pages representing a reserved memory region are initialized
> using reserve_bootmem_range() function. This function is called for each
> reserved region just before the memory is freed from memblock to the buddy
> page allocator.
> 
> The struct pages for MEMBLOCK_NOMAP regions are kept with the default
> values set by the memory map initialization which makes it necessary to
> have a special treatment for such pages in pfn_valid() and
> pfn_valid_within().
> 
> Split out initialization of the reserved pages to a function with a
> meaningful name and treat the MEMBLOCK_NOMAP regions the same way as the
> reserved regions and mark struct pages for the NOMAP regions as
> PageReserved.

This would definitely need updating the comment for MEMBLOCK_NOMAP definition
in include/linux/memblock.h just to make the semantics is clear, though arm64
is currently the only user for MEMBLOCK_NOMAP.

> 
> Signed-off-by: Mike Rapoport 
> ---
>  mm/memblock.c | 23 +--
>  1 file changed, 21 insertions(+), 2 deletions(-)
> 
> diff --git a/mm/memblock.c b/mm/memblock.c
> index afaefa8fc6ab..6b7ea9d86310 100644
> --- a/mm/memblock.c
> +++ b/mm/memblock.c
> @@ -2002,6 +2002,26 @@ static unsigned long __init 
> __free_memory_core(phys_addr_t start,
>   return end_pfn - start_pfn;
>  }
>  
> +static void __init memmap_init_reserved_pages(void)
> +{
> + struct memblock_region *region;
> + phys_addr_t start, end;
> + u64 i;
> +
> + /* initialize struct pages for the reserved regions */
> + for_each_reserved_mem_range(i, , )
> + reserve_bootmem_region(start, end);
> +
> + /* and also treat struct pages for the NOMAP regions as PageReserved */
> + for_each_mem_region(region) {
> + if (memblock_is_nomap(region)) {
> + start = region->base;
> + end = start + region->size;
> + reserve_bootmem_region(start, end);
> + }
> + }
> +}
> +
>  static unsigned long __init free_low_memory_core_early(void)
>  {
>   unsigned long count = 0;
> @@ -2010,8 +2030,7 @@ static unsigned long __init 
> free_low_memory_core_early(void)
>  
>   memblock_clear_hotplug(0, -1);
>  
> - for_each_reserved_mem_range(i, , )
> - reserve_bootmem_region(start, end);
> + memmap_init_reserved_pages();
>  
>   /*
>* We need to use NUMA_NO_NODE instead of NODE_DATA(0)->node_id
>

Re: [RFC/RFT PATCH 2/3] arm64: decouple check whether pfn is normal memory from pfn_valid()

2021-04-07 Thread Anshuman Khandual



On 4/7/21 10:56 PM, Mike Rapoport wrote:
> From: Mike Rapoport 
> 
> The intended semantics of pfn_valid() is to verify whether there is a
> struct page for the pfn in question and nothing else.

Should there be a comment affirming this semantics interpretation, above the
generic pfn_valid() in include/linux/mmzone.h ?

> 
> Yet, on arm64 it is used to distinguish memory areas that are mapped in the
> linear map vs those that require ioremap() to access them.
> 
> Introduce a dedicated pfn_is_memory() to perform such check and use it
> where appropriate.
> 
> Signed-off-by: Mike Rapoport 
> ---
>  arch/arm64/include/asm/memory.h | 2 +-
>  arch/arm64/include/asm/page.h   | 1 +
>  arch/arm64/kvm/mmu.c| 2 +-
>  arch/arm64/mm/init.c| 6 ++
>  arch/arm64/mm/ioremap.c | 4 ++--
>  arch/arm64/mm/mmu.c | 2 +-
>  6 files changed, 12 insertions(+), 5 deletions(-)
> 
> diff --git a/arch/arm64/include/asm/memory.h b/arch/arm64/include/asm/memory.h
> index 0aabc3be9a75..7e77fdf71b9d 100644
> --- a/arch/arm64/include/asm/memory.h
> +++ b/arch/arm64/include/asm/memory.h
> @@ -351,7 +351,7 @@ static inline void *phys_to_virt(phys_addr_t x)
>  
>  #define virt_addr_valid(addr)({  
> \
>   __typeof__(addr) __addr = __tag_reset(addr);\
> - __is_lm_address(__addr) && pfn_valid(virt_to_pfn(__addr));  \
> + __is_lm_address(__addr) && pfn_is_memory(virt_to_pfn(__addr));  \
>  })
>  
>  void dump_mem_limit(void);
> diff --git a/arch/arm64/include/asm/page.h b/arch/arm64/include/asm/page.h
> index 012cffc574e8..32b485bcc6ff 100644
> --- a/arch/arm64/include/asm/page.h
> +++ b/arch/arm64/include/asm/page.h
> @@ -38,6 +38,7 @@ void copy_highpage(struct page *to, struct page *from);
>  typedef struct page *pgtable_t;
>  
>  extern int pfn_valid(unsigned long);
> +extern int pfn_is_memory(unsigned long);
>  
>  #include 
>  
> diff --git a/arch/arm64/kvm/mmu.c b/arch/arm64/kvm/mmu.c
> index 8711894db8c2..ad2ea65a3937 100644
> --- a/arch/arm64/kvm/mmu.c
> +++ b/arch/arm64/kvm/mmu.c
> @@ -85,7 +85,7 @@ void kvm_flush_remote_tlbs(struct kvm *kvm)
>  
>  static bool kvm_is_device_pfn(unsigned long pfn)
>  {
> - return !pfn_valid(pfn);
> + return !pfn_is_memory(pfn);
>  }
>  
>  /*
> diff --git a/arch/arm64/mm/init.c b/arch/arm64/mm/init.c
> index 3685e12aba9b..258b1905ed4a 100644
> --- a/arch/arm64/mm/init.c
> +++ b/arch/arm64/mm/init.c
> @@ -258,6 +258,12 @@ int pfn_valid(unsigned long pfn)
>  }
>  EXPORT_SYMBOL(pfn_valid);
>  
> +int pfn_is_memory(unsigned long pfn)
> +{
> + return memblock_is_map_memory(PFN_PHYS(pfn));
> +}
> +EXPORT_SYMBOL(pfn_is_memory);> +

Should not this be generic though ? There is nothing platform or arm64
specific in here. Wondering as pfn_is_memory() just indicates that the
pfn is linear mapped, should not it be renamed as pfn_is_linear_memory()
instead ? Regardless, it's fine either way.

>  static phys_addr_t memory_limit = PHYS_ADDR_MAX;
>  
>  /*
> diff --git a/arch/arm64/mm/ioremap.c b/arch/arm64/mm/ioremap.c
> index b5e83c46b23e..82a369b22ef5 100644
> --- a/arch/arm64/mm/ioremap.c
> +++ b/arch/arm64/mm/ioremap.c
> @@ -43,7 +43,7 @@ static void __iomem *__ioremap_caller(phys_addr_t 
> phys_addr, size_t size,
>   /*
>* Don't allow RAM to be mapped.
>*/
> - if (WARN_ON(pfn_valid(__phys_to_pfn(phys_addr
> + if (WARN_ON(pfn_is_memory(__phys_to_pfn(phys_addr
>   return NULL;
>  
>   area = get_vm_area_caller(size, VM_IOREMAP, caller);
> @@ -84,7 +84,7 @@ EXPORT_SYMBOL(iounmap);
>  void __iomem *ioremap_cache(phys_addr_t phys_addr, size_t size)
>  {
>   /* For normal memory we already have a cacheable mapping. */
> - if (pfn_valid(__phys_to_pfn(phys_addr)))
> + if (pfn_is_memory(__phys_to_pfn(phys_addr)))
>   return (void __iomem *)__phys_to_virt(phys_addr);
>  
>   return __ioremap_caller(phys_addr, size, __pgprot(PROT_NORMAL),
> diff --git a/arch/arm64/mm/mmu.c b/arch/arm64/mm/mmu.c
> index 5d9550fdb9cf..038d20fe163f 100644
> --- a/arch/arm64/mm/mmu.c
> +++ b/arch/arm64/mm/mmu.c
> @@ -81,7 +81,7 @@ void set_swapper_pgd(pgd_t *pgdp, pgd_t pgd)
>  pgprot_t phys_mem_access_prot(struct file *file, unsigned long pfn,
> unsigned long size, pgprot_t vma_prot)
>  {
> - if (!pfn_valid(pfn))
> + if (!pfn_is_memory(pfn))
>   return pgprot_noncached(vma_prot);
>   else if (file->f_flags & O_SYNC)
>   return pgprot_writecombine(vma_prot);
>

Re: [RFC/RFT PATCH 3/3] arm64: drop pfn_valid_within() and simplify pfn_valid()

2021-04-07 Thread Anshuman Khandual



On 4/7/21 10:56 PM, Mike Rapoport wrote:
> From: Mike Rapoport 
> 
> The arm64's version of pfn_valid() differs from the generic because of two
> reasons:
> 
> * Parts of the memory map are freed during boot. This makes it necessary to
>   verify that there is actual physical memory that corresponds to a pfn
>   which is done by querying memblock.
> 
> * There are NOMAP memory regions. These regions are not mapped in the
>   linear map and until the previous commit the struct pages representing
>   these areas had default values.
> 
> As the consequence of absence of the special treatment of NOMAP regions in
> the memory map it was necessary to use memblock_is_map_memory() in
> pfn_valid() and to have pfn_valid_within() aliased to pfn_valid() so that
> generic mm functionality would not treat a NOMAP page as a normal page.
> 
> Since the NOMAP regions are now marked as PageReserved(), pfn walkers and
> the rest of core mm will treat them as unusable memory and thus
> pfn_valid_within() is no longer required at all and can be disabled by
> removing CONFIG_HOLES_IN_ZONE on arm64.

But what about the memory map that are freed during boot (mentioned above).
Would not they still cause CONFIG_HOLES_IN_ZONE to be applicable and hence
pfn_valid_within() ?

> 
> pfn_valid() can be slightly simplified by replacing
> memblock_is_map_memory() with memblock_is_memory().

Just to understand this better, pfn_valid() will now return true for all
MEMBLOCK_NOMAP based memory but that is okay as core MM would still ignore
them as unusable memory for being PageReserved().

> 
> Signed-off-by: Mike Rapoport 
> ---
>  arch/arm64/Kconfig   | 3 ---
>  arch/arm64/mm/init.c | 4 ++--
>  2 files changed, 2 insertions(+), 5 deletions(-)
> 
> diff --git a/arch/arm64/Kconfig b/arch/arm64/Kconfig
> index e4e1b6550115..58e439046d05 100644
> --- a/arch/arm64/Kconfig
> +++ b/arch/arm64/Kconfig
> @@ -1040,9 +1040,6 @@ config NEED_PER_CPU_EMBED_FIRST_CHUNK
>   def_bool y
>   depends on NUMA
>  
> -config HOLES_IN_ZONE
> - def_bool y
> -
>  source "kernel/Kconfig.hz"
>  
>  config ARCH_SPARSEMEM_ENABLE
> diff --git a/arch/arm64/mm/init.c b/arch/arm64/mm/init.c
> index 258b1905ed4a..bb6dd406b1f0 100644
> --- a/arch/arm64/mm/init.c
> +++ b/arch/arm64/mm/init.c
> @@ -243,7 +243,7 @@ int pfn_valid(unsigned long pfn)
>  
>   /*
>* ZONE_DEVICE memory does not have the memblock entries.
> -  * memblock_is_map_memory() check for ZONE_DEVICE based
> +  * memblock_is_memory() check for ZONE_DEVICE based
>* addresses will always fail. Even the normal hotplugged
>* memory will never have MEMBLOCK_NOMAP flag set in their
>* memblock entries. Skip memblock search for all non early
> @@ -254,7 +254,7 @@ int pfn_valid(unsigned long pfn)
>   return pfn_section_valid(ms, pfn);
>  }
>  #endif
> - return memblock_is_map_memory(addr);
> + return memblock_is_memory(addr);
>  }
>  EXPORT_SYMBOL(pfn_valid);
>  
>

Re: [syzbot] INFO: task hung in io_ring_exit_work

2021-04-07 Thread syzbot

Hello,

syzbot has tested the proposed patch but the reproducer is still triggering an 
issue:
INFO: task hung in io_ring_exit_work

INFO: task kworker/u4:0:9 blocked for more than 143 seconds.
  Not tainted 5.12.0-rc2-syzkaller #0
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
task:kworker/u4:0state:D stack:26336 pid:9 ppid: 2 flags:0x4000
Workqueue: events_unbound io_ring_exit_work
Call Trace:
 context_switch kernel/sched/core.c:4324 [inline]
 __schedule+0x911/0x21b0 kernel/sched/core.c:5075
 schedule+0xcf/0x270 kernel/sched/core.c:5154
 schedule_timeout+0x1db/0x250 kernel/time/timer.c:1868
 do_wait_for_common kernel/sched/completion.c:85 [inline]
 __wait_for_common kernel/sched/completion.c:106 [inline]
 wait_for_common kernel/sched/completion.c:117 [inline]
 wait_for_completion+0x168/0x270 kernel/sched/completion.c:138
 io_ring_exit_work+0x4e8/0x12d0 fs/io_uring.c:8611
 process_one_work+0x98d/0x1600 kernel/workqueue.c:2275
 worker_thread+0x64c/0x1120 kernel/workqueue.c:2421
 kthread+0x3b1/0x4a0 kernel/kthread.c:292
 ret_from_fork+0x1f/0x30 arch/x86/entry/entry_64.S:294
INFO: task kworker/u4:1:25 blocked for more than 144 seconds.
  Not tainted 5.12.0-rc2-syzkaller #0
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
task:kworker/u4:1state:D stack:25312 pid:   25 ppid: 2 flags:0x4000
Workqueue: events_unbound io_ring_exit_work
Call Trace:
 context_switch kernel/sched/core.c:4324 [inline]
 __schedule+0x911/0x21b0 kernel/sched/core.c:5075
 schedule+0xcf/0x270 kernel/sched/core.c:5154
 schedule_timeout+0x1db/0x250 kernel/time/timer.c:1868
 do_wait_for_common kernel/sched/completion.c:85 [inline]
 __wait_for_common kernel/sched/completion.c:106 [inline]
 wait_for_common kernel/sched/completion.c:117 [inline]
 wait_for_completion+0x168/0x270 kernel/sched/completion.c:138
 io_ring_exit_work+0x4e8/0x12d0 fs/io_uring.c:8611
 process_one_work+0x98d/0x1600 kernel/workqueue.c:2275
 worker_thread+0x64c/0x1120 kernel/workqueue.c:2421
 kthread+0x3b1/0x4a0 kernel/kthread.c:292
 ret_from_fork+0x1f/0x30 arch/x86/entry/entry_64.S:294
INFO: task kworker/u4:3:110 blocked for more than 145 seconds.
  Not tainted 5.12.0-rc2-syzkaller #0
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
task:kworker/u4:3state:D stack:23608 pid:  110 ppid: 2 flags:0x4000
Workqueue: events_unbound io_ring_exit_work
Call Trace:
 context_switch kernel/sched/core.c:4324 [inline]
 __schedule+0x911/0x21b0 kernel/sched/core.c:5075
 schedule+0xcf/0x270 kernel/sched/core.c:5154
 schedule_timeout+0x1db/0x250 kernel/time/timer.c:1868
 do_wait_for_common kernel/sched/completion.c:85 [inline]
 __wait_for_common kernel/sched/completion.c:106 [inline]
 wait_for_common kernel/sched/completion.c:117 [inline]
 wait_for_completion+0x168/0x270 kernel/sched/completion.c:138
 io_ring_exit_work+0x4e8/0x12d0 fs/io_uring.c:8611
 process_one_work+0x98d/0x1600 kernel/workqueue.c:2275
 worker_thread+0x64c/0x1120 kernel/workqueue.c:2421
 kthread+0x3b1/0x4a0 kernel/kthread.c:292
 ret_from_fork+0x1f/0x30 arch/x86/entry/entry_64.S:294
INFO: task kworker/u4:4:185 blocked for more than 145 seconds.
  Not tainted 5.12.0-rc2-syzkaller #0
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
task:kworker/u4:4state:D stack:25584 pid:  185 ppid: 2 flags:0x4000
Workqueue: events_unbound io_ring_exit_work
Call Trace:
 context_switch kernel/sched/core.c:4324 [inline]
 __schedule+0x911/0x21b0 kernel/sched/core.c:5075
 schedule+0xcf/0x270 kernel/sched/core.c:5154
 schedule_timeout+0x1db/0x250 kernel/time/timer.c:1868
 do_wait_for_common kernel/sched/completion.c:85 [inline]
 __wait_for_common kernel/sched/completion.c:106 [inline]
 wait_for_common kernel/sched/completion.c:117 [inline]
 wait_for_completion+0x168/0x270 kernel/sched/completion.c:138
 io_ring_exit_work+0x4e8/0x12d0 fs/io_uring.c:8611
 process_one_work+0x98d/0x1600 kernel/workqueue.c:2275
 worker_thread+0x64c/0x1120 kernel/workqueue.c:2421
 kthread+0x3b1/0x4a0 kernel/kthread.c:292
 ret_from_fork+0x1f/0x30 arch/x86/entry/entry_64.S:294

Showing all locks held in the system:
2 locks held by kworker/u4:0/9:
 #0: 88800fc69138 ((wq_completion)events_unbound){+.+.}-{0:0}, at: 
arch_atomic64_set arch/x86/include/asm/atomic64_64.h:34 [inline]
 #0: 88800fc69138 ((wq_completion)events_unbound){+.+.}-{0:0}, at: 
atomic64_set include/asm-generic/atomic-instrumented.h:856 [inline]
 #0: 88800fc69138 ((wq_completion)events_unbound){+.+.}-{0:0}, at: 
atomic_long_set include/asm-generic/atomic-long.h:41 [inline]
 #0: 88800fc69138 ((wq_completion)events_unbound){+.+.}-{0:0}, at: 
set_work_data kernel/workqueue.c:616 [inline]
 #0: 88800fc69138 ((wq_completion)events_unbound){+.+.}-{0:0}, at: 
set_work_pool_and_clear_pending kernel/workqueue.c:643 [inline]
 #0: 88800fc69138 ((wq_completion)events_unbound){+.+.}-{0:0}, at: 
process_one_work+0x871/0x1600

Re: [PATCH 2/2] powerpc: make 'boot_text_mapped' static

2021-04-07 Thread Christophe Leroy





Le 08/04/2021 à 03:18, Yu Kuai a écrit :

The sparse tool complains as follow:

arch/powerpc/kernel/btext.c:48:5: warning:
  symbol 'boot_text_mapped' was not declared. Should it be static?

This symbol is not used outside of btext.c, so this commit make
it static.

Signed-off-by: Yu Kuai 
---
  arch/powerpc/kernel/btext.c | 2 +-
  1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/powerpc/kernel/btext.c b/arch/powerpc/kernel/btext.c
index 359d0f4ca532..8df9230be6fa 100644
--- a/arch/powerpc/kernel/btext.c
+++ b/arch/powerpc/kernel/btext.c
@@ -45,7 +45,7 @@ unsigned long disp_BAT[2] __initdata = {0, 0};
  
  static unsigned char vga_font[cmapsz];
  
-int boot_text_mapped __force_data = 0;

+static int boot_text_mapped __force_data;


Are you sure the initialisation to 0 can be removed ? Usually initialisation to 0 is not needed 
because not initialised variables go in the BSS section which is zeroed at startup. But here the 
variable is flagged with __force_data so it is not going in the BSS section.



  
  extern void rmci_on(void);

  extern void rmci_off(void);

Re: [PATCH 1/2] powerpc: remove set but not used variable 'force_printk_to_btext'

2021-04-07 Thread Christophe Leroy





Le 08/04/2021 à 03:18, Yu Kuai a écrit :

Fixes gcc '-Wunused-but-set-variable' warning:

arch/powerpc/kernel/btext.c:49:12: error: 'force_printk_to_btext'
defined but not used.


You don't get this error as it is now.
You will get this error only if you make it 'static', which is what you did in your first patch 
based on the 'sparse' report.


When removing a non static variable, you should explain that you can remove it after you have 
verifier that it is nowhere used, neither in that file nor in any other one.




It is never used, and so can be removed.

Signed-off-by: Yu Kuai 
---
  arch/powerpc/kernel/btext.c | 1 -
  1 file changed, 1 deletion(-)

diff --git a/arch/powerpc/kernel/btext.c b/arch/powerpc/kernel/btext.c
index 803c2a45b22a..359d0f4ca532 100644
--- a/arch/powerpc/kernel/btext.c
+++ b/arch/powerpc/kernel/btext.c
@@ -46,7 +46,6 @@ unsigned long disp_BAT[2] __initdata = {0, 0};
  static unsigned char vga_font[cmapsz];
  
  int boot_text_mapped __force_data = 0;

-int force_printk_to_btext = 0;
  
  extern void rmci_on(void);

  extern void rmci_off(void);

[PATCH] Revert "drm/syncobj: use dma_fence_get_stub"

2021-04-07 Thread David Stevens

From: David Stevens 

This reverts commit 86bbd89d5da66fe760049ad3f04adc407ec0c4d6.

Using the singleton stub fence in drm_syncobj_assign_null_handle means
that all syncobjs created in an already signaled state or any syncobjs
signaled by userspace will reference the singleton fence when exported
to a sync_file. If those sync_files are queried with SYNC_IOC_FILE_INFO,
then the timestamp_ns value returned will correspond to whenever the
singleton stub fence was first initialized. This can break the ability
of userspace to use timestamps of these fences, as the singleton stub
fence's timestamp bears no relationship to any meaningful event.

Signed-off-by: David Stevens 
---
 drivers/gpu/drm/drm_syncobj.c | 58 ++-
 1 file changed, 44 insertions(+), 14 deletions(-)

diff --git a/drivers/gpu/drm/drm_syncobj.c b/drivers/gpu/drm/drm_syncobj.c
index 349146049849..7cc11f1a83f4 100644
--- a/drivers/gpu/drm/drm_syncobj.c
+++ b/drivers/gpu/drm/drm_syncobj.c
@@ -211,6 +211,21 @@ struct syncobj_wait_entry {
 static void syncobj_wait_syncobj_func(struct drm_syncobj *syncobj,
  struct syncobj_wait_entry *wait);
 
+struct drm_syncobj_stub_fence {
+   struct dma_fence base;
+   spinlock_t lock;
+};
+
+static const char *drm_syncobj_stub_fence_get_name(struct dma_fence *fence)
+{
+   return "syncobjstub";
+}
+
+static const struct dma_fence_ops drm_syncobj_stub_fence_ops = {
+   .get_driver_name = drm_syncobj_stub_fence_get_name,
+   .get_timeline_name = drm_syncobj_stub_fence_get_name,
+};
+
 /**
  * drm_syncobj_find - lookup and reference a sync object.
  * @file_private: drm file private pointer
@@ -344,18 +359,24 @@ void drm_syncobj_replace_fence(struct drm_syncobj 
*syncobj,
 }
 EXPORT_SYMBOL(drm_syncobj_replace_fence);
 
-/**
- * drm_syncobj_assign_null_handle - assign a stub fence to the sync object
- * @syncobj: sync object to assign the fence on
- *
- * Assign a already signaled stub fence to the sync object.
- */
-static void drm_syncobj_assign_null_handle(struct drm_syncobj *syncobj)
+static int drm_syncobj_assign_null_handle(struct drm_syncobj *syncobj)
 {
-   struct dma_fence *fence = dma_fence_get_stub();
+   struct drm_syncobj_stub_fence *fence;
 
-   drm_syncobj_replace_fence(syncobj, fence);
-   dma_fence_put(fence);
+   fence = kzalloc(sizeof(*fence), GFP_KERNEL);
+   if (fence == NULL)
+   return -ENOMEM;
+
+   spin_lock_init(>lock);
+   dma_fence_init(>base, _syncobj_stub_fence_ops,
+  >lock, 0, 0);
+   dma_fence_signal(>base);
+
+   drm_syncobj_replace_fence(syncobj, >base);
+
+   dma_fence_put(>base);
+
+   return 0;
 }
 
 /* 5s default for wait submission */
@@ -469,6 +490,7 @@ EXPORT_SYMBOL(drm_syncobj_free);
 int drm_syncobj_create(struct drm_syncobj **out_syncobj, uint32_t flags,
   struct dma_fence *fence)
 {
+   int ret;
struct drm_syncobj *syncobj;
 
syncobj = kzalloc(sizeof(struct drm_syncobj), GFP_KERNEL);
@@ -479,8 +501,13 @@ int drm_syncobj_create(struct drm_syncobj **out_syncobj, 
uint32_t flags,
INIT_LIST_HEAD(>cb_list);
spin_lock_init(>lock);
 
-   if (flags & DRM_SYNCOBJ_CREATE_SIGNALED)
-   drm_syncobj_assign_null_handle(syncobj);
+   if (flags & DRM_SYNCOBJ_CREATE_SIGNALED) {
+   ret = drm_syncobj_assign_null_handle(syncobj);
+   if (ret < 0) {
+   drm_syncobj_put(syncobj);
+   return ret;
+   }
+   }
 
if (fence)
drm_syncobj_replace_fence(syncobj, fence);
@@ -1322,8 +1349,11 @@ drm_syncobj_signal_ioctl(struct drm_device *dev, void 
*data,
if (ret < 0)
return ret;
 
-   for (i = 0; i < args->count_handles; i++)
-   drm_syncobj_assign_null_handle(syncobjs[i]);
+   for (i = 0; i < args->count_handles; i++) {
+   ret = drm_syncobj_assign_null_handle(syncobjs[i]);
+   if (ret < 0)
+   break;
+   }
 
drm_syncobj_array_free(syncobjs, args->count_handles);
 
-- 
2.31.0.208.g409f899ff0-goog

Re: linux-next: build failure after merge of the bluetooth tree

2021-04-07 Thread Stephen Rothwell

Hi Luiz,

On Thu, 8 Apr 2021 04:47:04 + "Von Dentz, Luiz"  
wrote:
>
> I'd leave this for Marcel to comments, but there are quite many
> instances of // comment like that, so I wonder what is going on, or
> perhaps that is not allowed in include/uapi?

We only do these standalone compile checks on the uapi header files.

-- 
Cheers,
Stephen Rothwell


pgpACVhh2tkWa.pgp
Description: OpenPGP digital signature

Re: [PATCH-next] powerpc/interrupt: Remove duplicate header file

2021-04-07 Thread Christophe Leroy





Le 08/04/2021 à 05:56, johnny.che...@huawei.com a écrit :

From: Chen Yi 

Delete one of the header files  that are included
twice.


Guys, we have been flooded with such tiny patches over the last weeks, some changes being sent 
several times by different people.


That one is included in 
https://patchwork.ozlabs.org/project/linuxppc-dev/patch/20210323062916.295346-1-wanjiab...@vivo.com/


And was already submitted a few hours earlier by someone else: 
https://patchwork.ozlabs.org/project/linuxppc-dev/patch/1616464656-59372-1-git-send-email-zhouchuan...@vivo.com/


Could you work all together and cook an overall patch including all duplicate removal from 
arch/powerpc/ files ?


Best way would be I think to file an issue at https://github.com/linuxppc/issues/issues , then you 
do a complete analysis and list in the issue all places to be modified, then once the analysis is 
complete you send a full single patch.


Thanks
Christophe



Signed-off-by: Chen Yi 
---
  arch/powerpc/kernel/interrupt.c | 1 -
  1 file changed, 1 deletion(-)

diff --git a/arch/powerpc/kernel/interrupt.c b/arch/powerpc/kernel/interrupt.c
index c4dd4b8f9cfa..f64ace0208b7 100644
--- a/arch/powerpc/kernel/interrupt.c
+++ b/arch/powerpc/kernel/interrupt.c
@@ -7,7 +7,6 @@
  #include 
  #include 
  #include 
-#include 
  #include 
  #include 
  #include

Re: [PATCH v1] usb: dwc3: core: Add shutdown callback for dwc3

2021-04-07 Thread Sandeep Maheswaram




On 3/30/2021 7:02 PM, Greg Kroah-Hartman wrote:

On Tue, Mar 30, 2021 at 06:18:43PM +0530, Sai Prakash Ranjan wrote:

On 2021-03-30 16:46, Greg Kroah-Hartman wrote:

On Tue, Mar 30, 2021 at 03:25:58PM +0530, Sai Prakash Ranjan wrote:

On 2021-03-30 14:37, Greg Kroah-Hartman wrote:

On Tue, Mar 30, 2021 at 02:12:04PM +0530, Sandeep Maheswaram wrote:

On 3/26/2021 7:07 PM, Greg Kroah-Hartman wrote:

On Wed, Mar 24, 2021 at 12:57:32AM +0530, Sandeep Maheswaram wrote:

This patch adds a shutdown callback to USB DWC core driver to ensure that
it is properly shutdown in reboot/shutdown path. This is required
where SMMU address translation is enabled like on SC7180
SoC and few others. If the hardware is still accessing memory after
SMMU translation is disabled as part of SMMU shutdown callback in
system reboot or shutdown path, then IOVAs(I/O virtual address)
which it was using will go on the bus as the physical addresses which
might result in unknown crashes (NoC/interconnect errors).

Previously this was added in dwc3 qcom glue driver.
https://patchwork.kernel.org/project/linux-arm-msm/list/?series=382449
But observed kernel panic as glue driver shutdown getting called after
iommu shutdown. As we are adding iommu nodes in dwc core node
in device tree adding shutdown callback in core driver seems correct.

So shouldn't you also remove this from the qcom glue driver at the same
time?  Please submit both as a patch series.

thanks,

greg k-h

Hi Greg,

The qcom glue driver patch is not merged yet. I have just mentioned
for it for reference.

You know that we can not add callbacks for no in-kernel user, so what
good is this patch for now?


What in-kernel user? Since when does shutdown callback need an
in-kernel
user? When you reboot or shutdown a system, it gets called. The reason
why the shutdown callback is needed is provided in the commit text.

As I can't see the patch here, I have no idea...

You are replying now to the same patch which adds this shutdown callback :)
Anyways the qcom dwc3 driver patch which is abandoned which is also
mentioned
in the commit text is here [1] and the new shutdown callback patch which we
are both replying to is in here [2]

[1] 
https://lore.kernel.org/lkml/1605162619-10064-1-git-send-email-s...@codeaurora.org/

[2] 
https://lore.kernel.org/lkml/1616527652-7937-1-git-send-email-s...@codeaurora.org/

Thanks, so, what am I supposed to do here?  The patch is long gone from
my queue...

greg k-h


Hi Greg,

Should I resend this patch ? If so let me know your about opinion about 
Stephen's comment on just calling dwc3_remove in


dwc3_shutdown and ignoring return value.

https://lore.kernel.org/patchwork/patch/1401242/#1599316

Thanks

Sandeep

RE: rtlwifi/rtl8192cu AP mode broken with PS STA

2021-04-07 Thread Pkshih



> -Original Message-
> From: Maciej S. Szmigiero [mailto:m...@maciej.szmigiero.name]
> Sent: Thursday, April 08, 2021 4:53 AM
> To: Larry Finger; Pkshih
> Cc: linux-wirel...@vger.kernel.org; net...@vger.kernel.org; 
> linux-kernel@vger.kernel.org;
> johan...@sipsolutions.net; kv...@codeaurora.org
> Subject: Re: rtlwifi/rtl8192cu AP mode broken with PS STA
> 
> On 07.04.2021 06:21, Larry Finger wrote:
> > On 4/6/21 9:48 PM, Pkshih wrote:
> >> On Tue, 2021-04-06 at 11:25 -0500, Larry Finger wrote:
> >>> On 4/6/21 7:06 AM, Maciej S. Szmigiero wrote:
>  On 06.04.2021 12:00, Kalle Valo wrote:
> > "Maciej S. Szmigiero"  writes:
> >
> >> On 29.03.2021 00:54, Maciej S. Szmigiero wrote:
> >>> Hi,
> >>>
> >>> It looks like rtlwifi/rtl8192cu AP mode is broken when a STA is using 
> >>> PS,
> >>> since the driver does not update its beacon to account for TIM 
> >>> changes,
> >>> so a station that is sleeping will never learn that it has packets
> >>> buffered at the AP.
> >>>
> >>> Looking at the code, the rtl8192cu driver implements neither the 
> >>> set_tim()
> >>> callback, nor does it explicitly update beacon data periodically, so 
> >>> it
> >>> has no way to learn that it had changed.
> >>>
> >>> This results in the AP mode being virtually unusable with STAs that do
> >>> PS and don't allow for it to be disabled (IoT devices, mobile phones,
> >>> etc.).
> >>>
> >>> I think the easiest fix here would be to implement set_tim() for 
> >>> example
> >>> the way rt2x00 driver does: queue a work or schedule a tasklet to 
> >>> update
> >>> the beacon data on the device.
> >>
> >> Are there any plans to fix this?
> >> The driver is listed as maintained by Ping-Ke.
> >
> > Yeah, power save is hard and I'm not surprised that there are drivers
> > with broken power save mode support. If there's no fix available we
> > should stop supporting AP mode in the driver.
> >
>  https://wireless.wiki.kernel.org/en/developers/documentation/mac80211/api
>  clearly documents that "For AP mode, it must (...) react to the set_tim()
>  callback or fetch each beacon from mac80211".
>  The driver isn't doing either so no wonder the beacon it is sending
>  isn't getting updated.
>  As I have said above, it seems to me that all that needs to be done here
>  is to queue a work in a set_tim() callback, then call
>  send_beacon_frame() from rtlwifi/core.c from this work.
>  But I don't know the exact device semantics, maybe it needs some other
>  notification that the beacon has changed, too, or even tries to
>  manage the TIM bitmap by itself.
>  It would be a shame to lose the AP mode for such minor thing, though.
>  I would play with this myself, but unfortunately I don't have time
>  to work on this right now.
>  That's where my question to Realtek comes: are there plans to actually
>  fix this?
> >>>
> >>> Yes, I am working on this. My only question is "if you are such an expert 
> >>> on the
> >>> problem, why do you not fix it?"
> >>>
> >>> The example in rx200 is not particularly useful, and I have not found any 
> >>> other
> >>> examples.
> >>>
> >>
> >> Hi Larry,
> >>
> >> I have a draft patch that forks a work to do send_beacon_frame(), whose
> >> behavior like Maciej mentioned.
> 
> That's great, thanks!
> 
> >> I did test on RTL8821AE; it works well. But, it seems already work well 
> >> even
> >> I don't apply this patch, and I'm still digging why.
> 
> It looks like PCI rtlwifi hardware uses a tasklet (specifically,
> _rtl_pci_prepare_bcn_tasklet() in pci.c) to periodically transfer the
> current beacon to the NIC.

Got it.

> 
> This tasklet is scheduled on a RTL_IMR_BCNINT interrupt, which sounds
> like a beacon interval interrupt.
> 

Yes, PCI series update every beacon, so TIM and DTIM count maintained by
mac80211 work properly.

> >> I don't have a rtl8192cu dongle on hand, but I'll try to find one.
> >
> > Maceij,
> >
> > Does this patch fix the problem?
> 
> The beacon seems to be updating now and STAs no longer get stuck in PS
> mode.
> Although sometimes (every 2-3 minutes with continuous 1s interval pings)
> there is around 5s delay in updating the transmitted beacon - don't know
> why, maybe the NIC hardware still has the old version in queue?

Since USB device doesn't update every beacon, dtim_count isn't updated neither.
It leads STA doesn't awake properly. Please try to fix dtim_period=1 in
hostapd.conf, which tells STA awakes every beacon interval.

> 
> I doubt, however that this set_tim() callback should be added for every
> rtlwifi device type.
> 
> As I have said above, PCI devices seem to already have a mechanism in
> place to update their beacon each beacon interval.
> Your test that RTL8821AE works without this patch confirms that (at
> least for the rtl8821ae driver).
> 
> It seems this

Re: [PATCH] iommu/vt-d: Force to flush iotlb before creating superpage

2021-04-07 Thread Lu Baolu


Hi Longpeng,

On 4/7/21 2:35 PM, Longpeng (Mike, Cloud Infrastructure Service Product 
Dept.) wrote:

Hi Baolu,


-Original Message-
From: Lu Baolu [mailto:baolu...@linux.intel.com]
Sent: Friday, April 2, 2021 12:44 PM
To: Longpeng (Mike, Cloud Infrastructure Service Product Dept.)
; io...@lists.linux-foundation.org;
linux-kernel@vger.kernel.org
Cc: baolu...@linux.intel.com; David Woodhouse ; Nadav
Amit ; Alex Williamson ;
Kevin Tian ; Gonglei (Arei) ;
sta...@vger.kernel.org
Subject: Re: [PATCH] iommu/vt-d: Force to flush iotlb before creating superpage

Hi Longpeng,

On 4/1/21 3:18 PM, Longpeng(Mike) wrote:

diff --git a/drivers/iommu/intel/iommu.c b/drivers/iommu/intel/iommu.c
index ee09323..cbcb434 100644
--- a/drivers/iommu/intel/iommu.c
+++ b/drivers/iommu/intel/iommu.c
@@ -2342,9 +2342,20 @@ static inline int hardware_largepage_caps(struct

dmar_domain *domain,

 * removed to make room for superpage(s).
 * We're adding new large pages, so make sure
 * we don't remove their parent tables.
+*
+* We also need to flush the iotlb before 
creating
+* superpage to ensure it does not perserves any
+* obsolete info.
 */
-   dma_pte_free_pagetable(domain, iov_pfn, end_pfn,
-  largepage_lvl + 1);
+   if (dma_pte_present(pte)) {


The dma_pte_free_pagetable() clears a batch of PTEs. So checking current PTE is
insufficient. How about removing this check and always performing cache
invalidation?



Um...the PTE here may be present( e.g. 4K mapping --> superpage mapping ) 
orNOT-present ( e.g. create a totally new superpage mapping ), but we only need to 
call free_pagetable and flush_iotlb in the former case, right ?


But this code covers multiple PTEs and perhaps crosses the page
boundary.

How about moving this code into a separated function and check PTE
presence there. A sample code could look like below: [compiled but not
tested!]

diff --git a/drivers/iommu/intel/iommu.c b/drivers/iommu/intel/iommu.c
index d334f5b4e382..0e04d450c38a 100644
--- a/drivers/iommu/intel/iommu.c
+++ b/drivers/iommu/intel/iommu.c
@@ -2300,6 +2300,41 @@ static inline int hardware_largepage_caps(struct 
dmar_domain *domain,

return level;
 }

+/*
+ * Ensure that old small page tables are removed to make room for 
superpage(s).
+ * We're going to add new large pages, so make sure we don't remove 
their parent

+ * tables. The IOTLB/devTLBs should be flushed if any PDE/PTEs are cleared.
+ */
+static void switch_to_super_page(struct dmar_domain *domain,
+unsigned long start_pfn,
+unsigned long end_pfn, int level)
+{
+   unsigned long lvl_pages = lvl_to_nr_pages(level);
+   struct dma_pte *pte = NULL;
+   int i;
+
+   while (start_pfn <= end_pfn) {
+   if (!pte)
+   pte = pfn_to_dma_pte(domain, start_pfn, );
+
+   if (dma_pte_present(pte)) {
+   dma_pte_free_pagetable(domain, start_pfn,
+  start_pfn + lvl_pages - 1,
+  level + 1);
+
+   for_each_domain_iommu(i, domain)
+   iommu_flush_iotlb_psi(g_iommus[i], domain,
+ start_pfn, lvl_pages,
+ 0, 0);
+   }
+
+   pte++;
+   start_pfn += lvl_pages;
+   if (first_pte_in_page(pte))
+   pte = NULL;
+   }
+}
+
 static int
 __domain_mapping(struct dmar_domain *domain, unsigned long iov_pfn,
 unsigned long phys_pfn, unsigned long nr_pages, int prot)
@@ -2341,22 +2376,11 @@ __domain_mapping(struct dmar_domain *domain, 
unsigned long iov_pfn,

return -ENOMEM;
/* It is large page*/
if (largepage_lvl > 1) {
-   unsigned long nr_superpages, end_pfn;
+   unsigned long end_pfn;

pteval |= DMA_PTE_LARGE_PAGE;
-   lvl_pages = lvl_to_nr_pages(largepage_lvl);
-
-   nr_superpages = nr_pages / lvl_pages;
-   end_pfn = iov_pfn + nr_superpages * 
lvl_pages - 1;

-
-   /*
-* Ensure that old small page tables are
-* removed to make room for superpage(s).
-* We're adding new large pages, so make 
sure

-

Re: [PATCH] powerpc: remove old workaround for GCC < 4.9

2021-04-07 Thread Christophe Leroy





Le 08/04/2021 à 05:05, Masahiro Yamada a écrit :

According to Documentation/process/changes.rst, the minimum supported
GCC version is 4.9.

This workaround is dead code.


This workaround is already on the way out, see 
https://github.com/linuxppc/linux/commit/802b5560393423166e436c7914b565f3cda9e6b9






Signed-off-by: Masahiro Yamada 
---

  arch/powerpc/Makefile | 6 --
  1 file changed, 6 deletions(-)

diff --git a/arch/powerpc/Makefile b/arch/powerpc/Makefile
index 5f8544cf724a..32dd693b4e42 100644
--- a/arch/powerpc/Makefile
+++ b/arch/powerpc/Makefile
@@ -181,12 +181,6 @@ CC_FLAGS_FTRACE := -pg
  ifdef CONFIG_MPROFILE_KERNEL
  CC_FLAGS_FTRACE += -mprofile-kernel
  endif
-# Work around gcc code-gen bugs with -pg / -fno-omit-frame-pointer in gcc <= 
4.8
-# https://gcc.gnu.org/bugzilla/show_bug.cgi?id=44199
-# https://gcc.gnu.org/bugzilla/show_bug.cgi?id=52828
-ifndef CONFIG_CC_IS_CLANG
-CC_FLAGS_FTRACE+= $(call cc-ifversion, -lt, 0409, -mno-sched-epilog)
-endif
  endif
  
  CFLAGS-$(CONFIG_TARGET_CPU_BOOL) += $(call cc-option,-mcpu=$(CONFIG_TARGET_CPU))

[PATCH v13 14/18] arm64: kexec: install a copy of the linear-map

2021-04-07 Thread Pavel Tatashin

To perform the kexec relocations with the MMU enabled, we need a copy
of the linear map.

Create one, and install it from the relocation code. This has to be done
from the assembly code as it will be idmapped with TTBR0. The kernel
runs in TTRB1, so can't use the break-before-make sequence on the mapping
it is executing from.

The makes no difference yet as the relocation code runs with the MMU
disabled.

Co-developed-by: James Morse 
Signed-off-by: Pavel Tatashin 
---
 arch/arm64/include/asm/assembler.h  | 19 +++
 arch/arm64/include/asm/kexec.h  |  2 ++
 arch/arm64/kernel/asm-offsets.c |  2 ++
 arch/arm64/kernel/hibernate-asm.S   | 20 
 arch/arm64/kernel/machine_kexec.c   | 16 ++--
 arch/arm64/kernel/relocate_kernel.S |  3 +++
 6 files changed, 40 insertions(+), 22 deletions(-)

diff --git a/arch/arm64/include/asm/assembler.h 
b/arch/arm64/include/asm/assembler.h
index 29061b76aab6..3ce8131ad660 100644
--- a/arch/arm64/include/asm/assembler.h
+++ b/arch/arm64/include/asm/assembler.h
@@ -425,6 +425,25 @@ USER(\label, icivau, \tmp2)// 
invalidate I line PoU
isb
.endm
 
+/*
+ * To prevent the possibility of old and new partial table walks being visible
+ * in the tlb, switch the ttbr to a zero page when we invalidate the old
+ * records. D4.7.1 'General TLB maintenance requirements' in ARM DDI 0487A.i
+ * Even switching to our copied tables will cause a changed output address at
+ * each stage of the walk.
+ */
+   .macro break_before_make_ttbr_switch zero_page, page_table, tmp, tmp2
+   phys_to_ttbr \tmp, \zero_page
+   msr ttbr1_el1, \tmp
+   isb
+   tlbivmalle1
+   dsb nsh
+   phys_to_ttbr \tmp, \page_table
+   offset_ttbr1 \tmp, \tmp2
+   msr ttbr1_el1, \tmp
+   isb
+   .endm
+
 /*
  * reset_pmuserenr_el0 - reset PMUSERENR_EL0 if PMUv3 present
  */
diff --git a/arch/arm64/include/asm/kexec.h b/arch/arm64/include/asm/kexec.h
index 305cf0840ed3..59ac166daf53 100644
--- a/arch/arm64/include/asm/kexec.h
+++ b/arch/arm64/include/asm/kexec.h
@@ -97,6 +97,8 @@ struct kimage_arch {
phys_addr_t dtb_mem;
phys_addr_t kern_reloc;
phys_addr_t el2_vectors;
+   phys_addr_t ttbr1;
+   phys_addr_t zero_page;
/* Core ELF header buffer */
void *elf_headers;
unsigned long elf_headers_mem;
diff --git a/arch/arm64/kernel/asm-offsets.c b/arch/arm64/kernel/asm-offsets.c
index 2e3278df1fc3..609362b5aa76 100644
--- a/arch/arm64/kernel/asm-offsets.c
+++ b/arch/arm64/kernel/asm-offsets.c
@@ -158,6 +158,8 @@ int main(void)
 #ifdef CONFIG_KEXEC_CORE
   DEFINE(KIMAGE_ARCH_DTB_MEM,  offsetof(struct kimage, arch.dtb_mem));
   DEFINE(KIMAGE_ARCH_EL2_VECTORS,  offsetof(struct kimage, 
arch.el2_vectors));
+  DEFINE(KIMAGE_ARCH_ZERO_PAGE,offsetof(struct kimage, 
arch.zero_page));
+  DEFINE(KIMAGE_ARCH_TTBR1,offsetof(struct kimage, arch.ttbr1));
   DEFINE(KIMAGE_HEAD,  offsetof(struct kimage, head));
   DEFINE(KIMAGE_START, offsetof(struct kimage, start));
   BLANK();
diff --git a/arch/arm64/kernel/hibernate-asm.S 
b/arch/arm64/kernel/hibernate-asm.S
index 8ccca660034e..a31e621ba867 100644
--- a/arch/arm64/kernel/hibernate-asm.S
+++ b/arch/arm64/kernel/hibernate-asm.S
@@ -15,26 +15,6 @@
 #include 
 #include 
 
-/*
- * To prevent the possibility of old and new partial table walks being visible
- * in the tlb, switch the ttbr to a zero page when we invalidate the old
- * records. D4.7.1 'General TLB maintenance requirements' in ARM DDI 0487A.i
- * Even switching to our copied tables will cause a changed output address at
- * each stage of the walk.
- */
-.macro break_before_make_ttbr_switch zero_page, page_table, tmp, tmp2
-   phys_to_ttbr \tmp, \zero_page
-   msr ttbr1_el1, \tmp
-   isb
-   tlbivmalle1
-   dsb nsh
-   phys_to_ttbr \tmp, \page_table
-   offset_ttbr1 \tmp, \tmp2
-   msr ttbr1_el1, \tmp
-   isb
-.endm
-
-
 /*
  * Resume from hibernate
  *
diff --git a/arch/arm64/kernel/machine_kexec.c 
b/arch/arm64/kernel/machine_kexec.c
index f1451d807708..c875ef522e53 100644
--- a/arch/arm64/kernel/machine_kexec.c
+++ b/arch/arm64/kernel/machine_kexec.c
@@ -153,6 +153,8 @@ static void *kexec_page_alloc(void *arg)
 
 int machine_kexec_post_load(struct kimage *kimage)
 {
+   int rc;
+   pgd_t *trans_pgd;
void *reloc_code = page_to_virt(kimage->control_code_page);
long reloc_size;
struct trans_pgd_info info = {
@@ -169,12 +171,22 @@ int machine_kexec_post_load(struct kimage *kimage)
 
kimage->arch.el2_vectors = 0;
if (is_hyp_callable()) {
-   int rc = trans_pgd_copy_el2_vectors(,
-   >arch.el2_vectors);
+   rc = trans_pgd_copy_el2_vectors(,
+

[PATCH v13 18/18] arm64/mm: remove useless trans_pgd_map_page()

2021-04-07 Thread Pavel Tatashin

From: Pingfan Liu 

The intend of trans_pgd_map_page() was to map contigous range of VA
memory to the memory that is getting relocated during kexec. However,
since we are now using linear map instead of contigous range this
function is not needed

Signed-off-by: Pingfan Liu 
[Changed commit message]
Signed-off-by: Pavel Tatashin 
---
 arch/arm64/include/asm/trans_pgd.h |  5 +--
 arch/arm64/mm/trans_pgd.c  | 57 --
 2 files changed, 1 insertion(+), 61 deletions(-)

diff --git a/arch/arm64/include/asm/trans_pgd.h 
b/arch/arm64/include/asm/trans_pgd.h
index e0760e52d36d..234353df2f13 100644
--- a/arch/arm64/include/asm/trans_pgd.h
+++ b/arch/arm64/include/asm/trans_pgd.h
@@ -15,7 +15,7 @@
 /*
  * trans_alloc_page
  * - Allocator that should return exactly one zeroed page, if this
- *   allocator fails, trans_pgd_create_copy() and trans_pgd_map_page()
+ *   allocator fails, trans_pgd_create_copy() and trans_pgd_idmap_page()
  *   return -ENOMEM error.
  *
  * trans_alloc_arg
@@ -30,9 +30,6 @@ struct trans_pgd_info {
 int trans_pgd_create_copy(struct trans_pgd_info *info, pgd_t **trans_pgd,
  unsigned long start, unsigned long end);
 
-int trans_pgd_map_page(struct trans_pgd_info *info, pgd_t *trans_pgd,
-  void *page, unsigned long dst_addr, pgprot_t pgprot);
-
 int trans_pgd_idmap_page(struct trans_pgd_info *info, phys_addr_t *trans_ttbr0,
 unsigned long *t0sz, void *page);
 
diff --git a/arch/arm64/mm/trans_pgd.c b/arch/arm64/mm/trans_pgd.c
index 61549451ed3a..e24a749013c1 100644
--- a/arch/arm64/mm/trans_pgd.c
+++ b/arch/arm64/mm/trans_pgd.c
@@ -217,63 +217,6 @@ int trans_pgd_create_copy(struct trans_pgd_info *info, 
pgd_t **dst_pgdp,
return rc;
 }
 
-/*
- * Add map entry to trans_pgd for a base-size page at PTE level.
- * info:   contains allocator and its argument
- * trans_pgd:  page table in which new map is added.
- * page:   page to be mapped.
- * dst_addr:   new VA address for the page
- * pgprot: protection for the page.
- *
- * Returns 0 on success, and -ENOMEM on failure.
- */
-int trans_pgd_map_page(struct trans_pgd_info *info, pgd_t *trans_pgd,
-  void *page, unsigned long dst_addr, pgprot_t pgprot)
-{
-   pgd_t *pgdp;
-   p4d_t *p4dp;
-   pud_t *pudp;
-   pmd_t *pmdp;
-   pte_t *ptep;
-
-   pgdp = pgd_offset_pgd(trans_pgd, dst_addr);
-   if (pgd_none(READ_ONCE(*pgdp))) {
-   p4dp = trans_alloc(info);
-   if (!pgdp)
-   return -ENOMEM;
-   pgd_populate(NULL, pgdp, p4dp);
-   }
-
-   p4dp = p4d_offset(pgdp, dst_addr);
-   if (p4d_none(READ_ONCE(*p4dp))) {
-   pudp = trans_alloc(info);
-   if (!pudp)
-   return -ENOMEM;
-   p4d_populate(NULL, p4dp, pudp);
-   }
-
-   pudp = pud_offset(p4dp, dst_addr);
-   if (pud_none(READ_ONCE(*pudp))) {
-   pmdp = trans_alloc(info);
-   if (!pmdp)
-   return -ENOMEM;
-   pud_populate(NULL, pudp, pmdp);
-   }
-
-   pmdp = pmd_offset(pudp, dst_addr);
-   if (pmd_none(READ_ONCE(*pmdp))) {
-   ptep = trans_alloc(info);
-   if (!ptep)
-   return -ENOMEM;
-   pmd_populate_kernel(NULL, pmdp, ptep);
-   }
-
-   ptep = pte_offset_kernel(pmdp, dst_addr);
-   set_pte(ptep, pfn_pte(virt_to_pfn(page), pgprot));
-
-   return 0;
-}
-
 /*
  * The page we want to idmap may be outside the range covered by VA_BITS that
  * can be built using the kernel's p?d_populate() helpers. As a one off, for a
-- 
2.25.1

[PATCH v13 17/18] arm64: kexec: Remove cpu-reset.h

2021-04-07 Thread Pavel Tatashin

This header contains only cpu_soft_restart() which is never used directly
anymore. So, remove this header, and rename the helper to be
cpu_soft_restart().

Suggested-by: James Morse 
Signed-off-by: Pavel Tatashin 
---
 arch/arm64/include/asm/kexec.h|  6 ++
 arch/arm64/kernel/cpu-reset.S |  7 +++
 arch/arm64/kernel/cpu-reset.h | 30 --
 arch/arm64/kernel/machine_kexec.c |  6 ++
 4 files changed, 11 insertions(+), 38 deletions(-)
 delete mode 100644 arch/arm64/kernel/cpu-reset.h

diff --git a/arch/arm64/include/asm/kexec.h b/arch/arm64/include/asm/kexec.h
index 5fc87b51f8a9..ee71ae3b93ed 100644
--- a/arch/arm64/include/asm/kexec.h
+++ b/arch/arm64/include/asm/kexec.h
@@ -90,6 +90,12 @@ static inline void crash_prepare_suspend(void) {}
 static inline void crash_post_resume(void) {}
 #endif
 
+#if defined(CONFIG_KEXEC_CORE)
+void cpu_soft_restart(unsigned long el2_switch, unsigned long entry,
+ unsigned long arg0, unsigned long arg1,
+ unsigned long arg2);
+#endif
+
 #define ARCH_HAS_KIMAGE_ARCH
 
 struct kimage_arch {
diff --git a/arch/arm64/kernel/cpu-reset.S b/arch/arm64/kernel/cpu-reset.S
index 37721eb6f9a1..5d47d6c92634 100644
--- a/arch/arm64/kernel/cpu-reset.S
+++ b/arch/arm64/kernel/cpu-reset.S
@@ -16,8 +16,7 @@
 .pushsection.idmap.text, "awx"
 
 /*
- * __cpu_soft_restart(el2_switch, entry, arg0, arg1, arg2) - Helper for
- * cpu_soft_restart.
+ * cpu_soft_restart(el2_switch, entry, arg0, arg1, arg2)
  *
  * @el2_switch: Flag to indicate a switch to EL2 is needed.
  * @entry: Location to jump to for soft reset.
@@ -29,7 +28,7 @@
  * branch to what would be the reset vector. It must be executed with the
  * flat identity mapping.
  */
-SYM_CODE_START(__cpu_soft_restart)
+SYM_CODE_START(cpu_soft_restart)
/* Clear sctlr_el1 flags. */
mrs x12, sctlr_el1
mov_q   x13, SCTLR_ELx_FLAGS
@@ -51,6 +50,6 @@ SYM_CODE_START(__cpu_soft_restart)
mov x1, x3  // arg1
mov x2, x4  // arg2
br  x8
-SYM_CODE_END(__cpu_soft_restart)
+SYM_CODE_END(cpu_soft_restart)
 
 .popsection
diff --git a/arch/arm64/kernel/cpu-reset.h b/arch/arm64/kernel/cpu-reset.h
deleted file mode 100644
index f6d95512fec6..
--- a/arch/arm64/kernel/cpu-reset.h
+++ /dev/null
@@ -1,30 +0,0 @@
-/* SPDX-License-Identifier: GPL-2.0-only */
-/*
- * CPU reset routines
- *
- * Copyright (C) 2015 Huawei Futurewei Technologies.
- */
-
-#ifndef _ARM64_CPU_RESET_H
-#define _ARM64_CPU_RESET_H
-
-#include 
-
-void __cpu_soft_restart(unsigned long el2_switch, unsigned long entry,
-   unsigned long arg0, unsigned long arg1, unsigned long arg2);
-
-static inline void __noreturn cpu_soft_restart(unsigned long entry,
-  unsigned long arg0,
-  unsigned long arg1,
-  unsigned long arg2)
-{
-   typeof(__cpu_soft_restart) *restart;
-
-   restart = (void *)__pa_symbol(__cpu_soft_restart);
-
-   cpu_install_idmap();
-   restart(0, entry, arg0, arg1, arg2);
-   unreachable();
-}
-
-#endif
diff --git a/arch/arm64/kernel/machine_kexec.c 
b/arch/arm64/kernel/machine_kexec.c
index a1c9bee0cddd..ef7ba93f2bd6 100644
--- a/arch/arm64/kernel/machine_kexec.c
+++ b/arch/arm64/kernel/machine_kexec.c
@@ -23,8 +23,6 @@
 #include 
 #include 
 
-#include "cpu-reset.h"
-
 /**
  * kexec_image_info - For debugging output.
  */
@@ -197,10 +195,10 @@ void machine_kexec(struct kimage *kimage)
 * In kexec_file case, the kernel starts directly without purgatory.
 */
if (kimage->head & IND_DONE) {
-   typeof(__cpu_soft_restart) *restart;
+   typeof(cpu_soft_restart) *restart;
 
cpu_install_idmap();
-   restart = (void *)__pa_symbol(__cpu_soft_restart);
+   restart = (void *)__pa_symbol(cpu_soft_restart);
restart(is_hyp_callable(), kimage->start, kimage->arch.dtb_mem,
0, 0);
} else {
-- 
2.25.1

[PATCH v13 13/18] arm64: kexec: use ld script for relocation function

2021-04-07 Thread Pavel Tatashin

Currently, relocation code declares start and end variables
which are used to compute its size.

The better way to do this is to use ld script incited, and put relocation
function in its own section.

Signed-off-by: Pavel Tatashin 
---
 arch/arm64/include/asm/sections.h   |  1 +
 arch/arm64/kernel/machine_kexec.c   | 14 ++
 arch/arm64/kernel/relocate_kernel.S | 15 ++-
 arch/arm64/kernel/vmlinux.lds.S | 19 +++
 4 files changed, 28 insertions(+), 21 deletions(-)

diff --git a/arch/arm64/include/asm/sections.h 
b/arch/arm64/include/asm/sections.h
index 2f36b16a5b5d..31e459af89f6 100644
--- a/arch/arm64/include/asm/sections.h
+++ b/arch/arm64/include/asm/sections.h
@@ -20,5 +20,6 @@ extern char __exittext_begin[], __exittext_end[];
 extern char __irqentry_text_start[], __irqentry_text_end[];
 extern char __mmuoff_data_start[], __mmuoff_data_end[];
 extern char __entry_tramp_text_start[], __entry_tramp_text_end[];
+extern char __relocate_new_kernel_start[], __relocate_new_kernel_end[];
 
 #endif /* __ASM_SECTIONS_H */
diff --git a/arch/arm64/kernel/machine_kexec.c 
b/arch/arm64/kernel/machine_kexec.c
index d5940b7889f8..f1451d807708 100644
--- a/arch/arm64/kernel/machine_kexec.c
+++ b/arch/arm64/kernel/machine_kexec.c
@@ -20,14 +20,11 @@
 #include 
 #include 
 #include 
+#include 
 #include 
 
 #include "cpu-reset.h"
 
-/* Global variables for the arm64_relocate_new_kernel routine. */
-extern const unsigned char arm64_relocate_new_kernel[];
-extern const unsigned long arm64_relocate_new_kernel_size;
-
 /**
  * kexec_image_info - For debugging output.
  */
@@ -157,6 +154,7 @@ static void *kexec_page_alloc(void *arg)
 int machine_kexec_post_load(struct kimage *kimage)
 {
void *reloc_code = page_to_virt(kimage->control_code_page);
+   long reloc_size;
struct trans_pgd_info info = {
.trans_alloc_page   = kexec_page_alloc,
.trans_alloc_arg= kimage,
@@ -177,14 +175,14 @@ int machine_kexec_post_load(struct kimage *kimage)
return rc;
}
 
-   memcpy(reloc_code, arm64_relocate_new_kernel,
-  arm64_relocate_new_kernel_size);
+   reloc_size = __relocate_new_kernel_end - __relocate_new_kernel_start;
+   memcpy(reloc_code, __relocate_new_kernel_start, reloc_size);
kimage->arch.kern_reloc = __pa(reloc_code);
 
/* Flush the reloc_code in preparation for its execution. */
-   __flush_dcache_area(reloc_code, arm64_relocate_new_kernel_size);
+   __flush_dcache_area(reloc_code, reloc_size);
flush_icache_range((uintptr_t)reloc_code, (uintptr_t)reloc_code +
-  arm64_relocate_new_kernel_size);
+  reloc_size);
kexec_list_flush(kimage);
kexec_image_info(kimage);
 
diff --git a/arch/arm64/kernel/relocate_kernel.S 
b/arch/arm64/kernel/relocate_kernel.S
index df023b82544b..7a600ba33ae1 100644
--- a/arch/arm64/kernel/relocate_kernel.S
+++ b/arch/arm64/kernel/relocate_kernel.S
@@ -15,6 +15,7 @@
 #include 
 #include 
 
+.pushsection".kexec_relocate.text", "ax"
 /*
  * arm64_relocate_new_kernel - Put a 2nd stage image in place and boot it.
  *
@@ -77,16 +78,4 @@ SYM_CODE_START(arm64_relocate_new_kernel)
mov x3, xzr
br  x4  /* Jumps from el1 */
 SYM_CODE_END(arm64_relocate_new_kernel)
-
-.align 3   /* To keep the 64-bit values below naturally aligned. */
-
-.Lcopy_end:
-.org   KEXEC_CONTROL_PAGE_SIZE
-
-/*
- * arm64_relocate_new_kernel_size - Number of bytes to copy to the
- * control_code_page.
- */
-.globl arm64_relocate_new_kernel_size
-arm64_relocate_new_kernel_size:
-   .quad   .Lcopy_end - arm64_relocate_new_kernel
+.popsection
diff --git a/arch/arm64/kernel/vmlinux.lds.S b/arch/arm64/kernel/vmlinux.lds.S
index 7eea7888bb02..0d9d5e6af66f 100644
--- a/arch/arm64/kernel/vmlinux.lds.S
+++ b/arch/arm64/kernel/vmlinux.lds.S
@@ -12,6 +12,7 @@
 #include 
 #include 
 #include 
+#include 
 #include 
 #include 
 
@@ -92,6 +93,16 @@ jiffies = jiffies_64;
 #define HIBERNATE_TEXT
 #endif
 
+#ifdef CONFIG_KEXEC_CORE
+#define KEXEC_TEXT \
+   . = ALIGN(SZ_4K);   \
+   __relocate_new_kernel_start = .;\
+   *(.kexec_relocate.text) \
+   __relocate_new_kernel_end = .;
+#else
+#define KEXEC_TEXT
+#endif
+
 #ifdef CONFIG_UNMAP_KERNEL_AT_EL0
 #define TRAMP_TEXT \
. = ALIGN(PAGE_SIZE);   \
@@ -152,6 +163,7 @@ SECTIONS
HYPERVISOR_TEXT
IDMAP_TEXT
HIBERNATE_TEXT
+   KEXEC_TEXT
TRAMP_TEXT
*(.fixup)
*(.gnu.warning)
@@ -336,3 +348,10 @@ ASSERT(swapper_pg_dir - reserved_pg_dir ==

[PATCH v13 16/18] arm64: kexec: remove the pre-kexec PoC maintenance

2021-04-07 Thread Pavel Tatashin

Now that kexec does its relocations with the MMU enabled, we no longer
need to clean the relocation data to the PoC.

Co-developed-by: James Morse 
Signed-off-by: Pavel Tatashin 
---
 arch/arm64/kernel/machine_kexec.c | 40 ---
 1 file changed, 40 deletions(-)

diff --git a/arch/arm64/kernel/machine_kexec.c 
b/arch/arm64/kernel/machine_kexec.c
index d5c8aefc66f3..a1c9bee0cddd 100644
--- a/arch/arm64/kernel/machine_kexec.c
+++ b/arch/arm64/kernel/machine_kexec.c
@@ -76,45 +76,6 @@ int machine_kexec_prepare(struct kimage *kimage)
return 0;
 }
 
-/**
- * kexec_list_flush - Helper to flush the kimage list and source pages to PoC.
- */
-static void kexec_list_flush(struct kimage *kimage)
-{
-   kimage_entry_t *entry;
-
-   __flush_dcache_area(kimage, sizeof(*kimage));
-
-   for (entry = >head; ; entry++) {
-   unsigned int flag;
-   void *addr;
-
-   /* flush the list entries. */
-   __flush_dcache_area(entry, sizeof(kimage_entry_t));
-
-   flag = *entry & IND_FLAGS;
-   if (flag == IND_DONE)
-   break;
-
-   addr = phys_to_virt(*entry & PAGE_MASK);
-
-   switch (flag) {
-   case IND_INDIRECTION:
-   /* Set entry point just before the new list page. */
-   entry = (kimage_entry_t *)addr - 1;
-   break;
-   case IND_SOURCE:
-   /* flush the source pages. */
-   __flush_dcache_area(addr, PAGE_SIZE);
-   break;
-   case IND_DESTINATION:
-   break;
-   default:
-   BUG();
-   }
-   }
-}
-
 /**
  * kexec_segment_flush - Helper to flush the kimage segments to PoC.
  */
@@ -200,7 +161,6 @@ int machine_kexec_post_load(struct kimage *kimage)
__flush_dcache_area(reloc_code, reloc_size);
flush_icache_range((uintptr_t)reloc_code, (uintptr_t)reloc_code +
   reloc_size);
-   kexec_list_flush(kimage);
kexec_image_info(kimage);
 
return 0;
-- 
2.25.1

[PATCH v13 15/18] arm64: kexec: keep MMU enabled during kexec relocation

2021-04-07 Thread Pavel Tatashin

Now, that we have linear map page tables configured, keep MMU enabled
to allow faster relocation of segments to final destination.


Cavium ThunderX2:
Kernel Image size: 38M Iniramfs size: 46M Total relocation size: 84M
MMU-disabled:
relocation  7.489539915s
MMU-enabled:
relocation  0.03946095s

Broadcom Stingray:
The performance data: for a moderate size kernel + initramfs: 25M the
relocation was taking 0.382s, with enabled MMU it now takes
0.019s only or x20 improvement.

The time is proportional to the size of relocation, therefore if initramfs
is larger, 100M it could take over a second.

Signed-off-by: Pavel Tatashin 
---
 arch/arm64/include/asm/kexec.h  |  3 +++
 arch/arm64/kernel/asm-offsets.c |  1 +
 arch/arm64/kernel/machine_kexec.c   | 16 ++
 arch/arm64/kernel/relocate_kernel.S | 33 +++--
 4 files changed, 38 insertions(+), 15 deletions(-)

diff --git a/arch/arm64/include/asm/kexec.h b/arch/arm64/include/asm/kexec.h
index 59ac166daf53..5fc87b51f8a9 100644
--- a/arch/arm64/include/asm/kexec.h
+++ b/arch/arm64/include/asm/kexec.h
@@ -97,8 +97,11 @@ struct kimage_arch {
phys_addr_t dtb_mem;
phys_addr_t kern_reloc;
phys_addr_t el2_vectors;
+   phys_addr_t ttbr0;
phys_addr_t ttbr1;
phys_addr_t zero_page;
+   unsigned long phys_offset;
+   unsigned long t0sz;
/* Core ELF header buffer */
void *elf_headers;
unsigned long elf_headers_mem;
diff --git a/arch/arm64/kernel/asm-offsets.c b/arch/arm64/kernel/asm-offsets.c
index 609362b5aa76..ec7bb80aedc8 100644
--- a/arch/arm64/kernel/asm-offsets.c
+++ b/arch/arm64/kernel/asm-offsets.c
@@ -159,6 +159,7 @@ int main(void)
   DEFINE(KIMAGE_ARCH_DTB_MEM,  offsetof(struct kimage, arch.dtb_mem));
   DEFINE(KIMAGE_ARCH_EL2_VECTORS,  offsetof(struct kimage, 
arch.el2_vectors));
   DEFINE(KIMAGE_ARCH_ZERO_PAGE,offsetof(struct kimage, 
arch.zero_page));
+  DEFINE(KIMAGE_ARCH_PHYS_OFFSET,  offsetof(struct kimage, 
arch.phys_offset));
   DEFINE(KIMAGE_ARCH_TTBR1,offsetof(struct kimage, arch.ttbr1));
   DEFINE(KIMAGE_HEAD,  offsetof(struct kimage, head));
   DEFINE(KIMAGE_START, offsetof(struct kimage, start));
diff --git a/arch/arm64/kernel/machine_kexec.c 
b/arch/arm64/kernel/machine_kexec.c
index c875ef522e53..d5c8aefc66f3 100644
--- a/arch/arm64/kernel/machine_kexec.c
+++ b/arch/arm64/kernel/machine_kexec.c
@@ -190,6 +190,11 @@ int machine_kexec_post_load(struct kimage *kimage)
reloc_size = __relocate_new_kernel_end - __relocate_new_kernel_start;
memcpy(reloc_code, __relocate_new_kernel_start, reloc_size);
kimage->arch.kern_reloc = __pa(reloc_code);
+   rc = trans_pgd_idmap_page(, >arch.ttbr0,
+ >arch.t0sz, reloc_code);
+   if (rc)
+   return rc;
+   kimage->arch.phys_offset = virt_to_phys(kimage) - (long)kimage;
 
/* Flush the reloc_code in preparation for its execution. */
__flush_dcache_area(reloc_code, reloc_size);
@@ -223,9 +228,9 @@ void machine_kexec(struct kimage *kimage)
local_daif_mask();
 
/*
-* Both restart and cpu_soft_restart will shutdown the MMU, disable data
+* Both restart and kernel_reloc will shutdown the MMU, disable data
 * caches. However, restart will start new kernel or purgatory directly,
-* cpu_soft_restart will transfer control to arm64_relocate_new_kernel
+* kernel_reloc contains the body of arm64_relocate_new_kernel
 * In kexec case, kimage->start points to purgatory assuming that
 * kernel entry and dtb address are embedded in purgatory by
 * userspace (kexec-tools).
@@ -239,10 +244,13 @@ void machine_kexec(struct kimage *kimage)
restart(is_hyp_callable(), kimage->start, kimage->arch.dtb_mem,
0, 0);
} else {
+   void (*kernel_reloc)(struct kimage *kimage);
+
if (is_hyp_callable())
__hyp_set_vectors(kimage->arch.el2_vectors);
-   cpu_soft_restart(kimage->arch.kern_reloc,
-virt_to_phys(kimage), 0, 0);
+   cpu_install_ttbr0(kimage->arch.ttbr0, kimage->arch.t0sz);
+   kernel_reloc = (void *)kimage->arch.kern_reloc;
+   kernel_reloc(kimage);
}
 
BUG(); /* Should never get here. */
diff --git a/arch/arm64/kernel/relocate_kernel.S 
b/arch/arm64/kernel/relocate_kernel.S
index e83b6380907d..433a57b3d76e 100644
--- a/arch/arm64/kernel/relocate_kernel.S
+++ b/arch/arm64/kernel/relocate_kernel.S
@@ -4,6 +4,8 @@
  *
  * Copyright (C) Linaro.
  * Copyright (C) Huawei Futurewei Technologies.
+ * Copyright (C) 2020, Microsoft Corporation.
+ * Pavel Tatashin 
  */
 
 #include 
@@ -15,6 +17,15 @@
 #include 
 #include 
 
+.macro turn_off_mmu tmp1, tmp2
+   mrs \tmp1, sctlr_el1
+

[PATCH v13 12/18] arm64: kexec: relocate in EL1 mode

2021-04-07 Thread Pavel Tatashin

Since we are going to keep MMU enabled during relocation, we need to
keep EL1 mode throughout the relocation.

Keep EL1 enabled, and switch EL2 only before enterying the new world.

Suggested-by: James Morse 
Signed-off-by: Pavel Tatashin 
---
 arch/arm64/kernel/cpu-reset.h   |  3 +--
 arch/arm64/kernel/machine_kexec.c   |  4 ++--
 arch/arm64/kernel/relocate_kernel.S | 13 +++--
 3 files changed, 14 insertions(+), 6 deletions(-)

diff --git a/arch/arm64/kernel/cpu-reset.h b/arch/arm64/kernel/cpu-reset.h
index 1922e7a690f8..f6d95512fec6 100644
--- a/arch/arm64/kernel/cpu-reset.h
+++ b/arch/arm64/kernel/cpu-reset.h
@@ -20,11 +20,10 @@ static inline void __noreturn cpu_soft_restart(unsigned 
long entry,
 {
typeof(__cpu_soft_restart) *restart;
 
-   unsigned long el2_switch = is_hyp_callable();
restart = (void *)__pa_symbol(__cpu_soft_restart);
 
cpu_install_idmap();
-   restart(el2_switch, entry, arg0, arg1, arg2);
+   restart(0, entry, arg0, arg1, arg2);
unreachable();
 }
 
diff --git a/arch/arm64/kernel/machine_kexec.c 
b/arch/arm64/kernel/machine_kexec.c
index fb03b6676fb9..d5940b7889f8 100644
--- a/arch/arm64/kernel/machine_kexec.c
+++ b/arch/arm64/kernel/machine_kexec.c
@@ -231,8 +231,8 @@ void machine_kexec(struct kimage *kimage)
} else {
if (is_hyp_callable())
__hyp_set_vectors(kimage->arch.el2_vectors);
-   cpu_soft_restart(kimage->arch.kern_reloc, virt_to_phys(kimage),
-0, 0);
+   cpu_soft_restart(kimage->arch.kern_reloc,
+virt_to_phys(kimage), 0, 0);
}
 
BUG(); /* Should never get here. */
diff --git a/arch/arm64/kernel/relocate_kernel.S 
b/arch/arm64/kernel/relocate_kernel.S
index 36b4496524c3..df023b82544b 100644
--- a/arch/arm64/kernel/relocate_kernel.S
+++ b/arch/arm64/kernel/relocate_kernel.S
@@ -13,6 +13,7 @@
 #include 
 #include 
 #include 
+#include 
 
 /*
  * arm64_relocate_new_kernel - Put a 2nd stage image in place and boot it.
@@ -61,12 +62,20 @@ SYM_CODE_START(arm64_relocate_new_kernel)
isb
 
/* Start new image. */
+   ldr x1, [x0, #KIMAGE_ARCH_EL2_VECTORS]  /* relocation start */
+   cbz x1, .Lel1
+   ldr x1, [x0, #KIMAGE_START] /* relocation start */
+   ldr x2, [x0, #KIMAGE_ARCH_DTB_MEM]  /* dtb address */
+   mov x3, xzr
+   mov x4, xzr
+   mov x0, #HVC_SOFT_RESTART
+   hvc #0  /* Jumps from el2 */
+.Lel1:
ldr x4, [x0, #KIMAGE_START] /* relocation start */
ldr x0, [x0, #KIMAGE_ARCH_DTB_MEM]  /* dtb address */
-   mov x1, xzr
mov x2, xzr
mov x3, xzr
-   br  x4
+   br  x4  /* Jumps from el1 */
 SYM_CODE_END(arm64_relocate_new_kernel)
 
 .align 3   /* To keep the 64-bit values below naturally aligned. */
-- 
2.25.1

[PATCH v13 11/18] arm64: kexec: kexec may require EL2 vectors

2021-04-07 Thread Pavel Tatashin

If we have a EL2 mode without VHE, the EL2 vectors are needed in order
to switch to EL2 and jump to new world with hypervisor privileges.

In preporation to MMU enabled relocation, configure our EL2 table now.

Suggested-by: James Morse 
Signed-off-by: Pavel Tatashin 
---
 arch/arm64/Kconfig|  2 +-
 arch/arm64/include/asm/kexec.h|  1 +
 arch/arm64/kernel/asm-offsets.c   |  1 +
 arch/arm64/kernel/machine_kexec.c | 31 +++
 4 files changed, 34 insertions(+), 1 deletion(-)

diff --git a/arch/arm64/Kconfig b/arch/arm64/Kconfig
index e4e1b6550115..0e876d980a1f 100644
--- a/arch/arm64/Kconfig
+++ b/arch/arm64/Kconfig
@@ -1149,7 +1149,7 @@ config CRASH_DUMP
 
 config TRANS_TABLE
def_bool y
-   depends on HIBERNATION
+   depends on HIBERNATION || KEXEC_CORE
 
 config XEN_DOM0
def_bool y
diff --git a/arch/arm64/include/asm/kexec.h b/arch/arm64/include/asm/kexec.h
index 9befcd87e9a8..305cf0840ed3 100644
--- a/arch/arm64/include/asm/kexec.h
+++ b/arch/arm64/include/asm/kexec.h
@@ -96,6 +96,7 @@ struct kimage_arch {
void *dtb;
phys_addr_t dtb_mem;
phys_addr_t kern_reloc;
+   phys_addr_t el2_vectors;
/* Core ELF header buffer */
void *elf_headers;
unsigned long elf_headers_mem;
diff --git a/arch/arm64/kernel/asm-offsets.c b/arch/arm64/kernel/asm-offsets.c
index 0c92e193f866..2e3278df1fc3 100644
--- a/arch/arm64/kernel/asm-offsets.c
+++ b/arch/arm64/kernel/asm-offsets.c
@@ -157,6 +157,7 @@ int main(void)
 #endif
 #ifdef CONFIG_KEXEC_CORE
   DEFINE(KIMAGE_ARCH_DTB_MEM,  offsetof(struct kimage, arch.dtb_mem));
+  DEFINE(KIMAGE_ARCH_EL2_VECTORS,  offsetof(struct kimage, 
arch.el2_vectors));
   DEFINE(KIMAGE_HEAD,  offsetof(struct kimage, head));
   DEFINE(KIMAGE_START, offsetof(struct kimage, start));
   BLANK();
diff --git a/arch/arm64/kernel/machine_kexec.c 
b/arch/arm64/kernel/machine_kexec.c
index 2e734e4ae12e..fb03b6676fb9 100644
--- a/arch/arm64/kernel/machine_kexec.c
+++ b/arch/arm64/kernel/machine_kexec.c
@@ -20,6 +20,7 @@
 #include 
 #include 
 #include 
+#include 
 
 #include "cpu-reset.h"
 
@@ -42,7 +43,9 @@ static void _kexec_image_info(const char *func, int line,
pr_debug("start:   %lx\n", kimage->start);
pr_debug("head:%lx\n", kimage->head);
pr_debug("nr_segments: %lu\n", kimage->nr_segments);
+   pr_debug("dtb_mem: %pa\n", >arch.dtb_mem);
pr_debug("kern_reloc: %pa\n", >arch.kern_reloc);
+   pr_debug("el2_vectors: %pa\n", >arch.el2_vectors);
 
for (i = 0; i < kimage->nr_segments; i++) {
pr_debug("  segment[%lu]: %016lx - %016lx, 0x%lx bytes, %lu 
pages\n",
@@ -137,9 +140,27 @@ static void kexec_segment_flush(const struct kimage 
*kimage)
}
 }
 
+/* Allocates pages for kexec page table */
+static void *kexec_page_alloc(void *arg)
+{
+   struct kimage *kimage = (struct kimage *)arg;
+   struct page *page = kimage_alloc_control_pages(kimage, 0);
+
+   if (!page)
+   return NULL;
+
+   memset(page_address(page), 0, PAGE_SIZE);
+
+   return page_address(page);
+}
+
 int machine_kexec_post_load(struct kimage *kimage)
 {
void *reloc_code = page_to_virt(kimage->control_code_page);
+   struct trans_pgd_info info = {
+   .trans_alloc_page   = kexec_page_alloc,
+   .trans_alloc_arg= kimage,
+   };
 
/* If in place, relocation is not used, only flush next kernel */
if (kimage->head & IND_DONE) {
@@ -148,6 +169,14 @@ int machine_kexec_post_load(struct kimage *kimage)
return 0;
}
 
+   kimage->arch.el2_vectors = 0;
+   if (is_hyp_callable()) {
+   int rc = trans_pgd_copy_el2_vectors(,
+   >arch.el2_vectors);
+   if (rc)
+   return rc;
+   }
+
memcpy(reloc_code, arm64_relocate_new_kernel,
   arm64_relocate_new_kernel_size);
kimage->arch.kern_reloc = __pa(reloc_code);
@@ -200,6 +229,8 @@ void machine_kexec(struct kimage *kimage)
restart(is_hyp_callable(), kimage->start, kimage->arch.dtb_mem,
0, 0);
} else {
+   if (is_hyp_callable())
+   __hyp_set_vectors(kimage->arch.el2_vectors);
cpu_soft_restart(kimage->arch.kern_reloc, virt_to_phys(kimage),
 0, 0);
}
-- 
2.25.1

[PATCH v13 10/18] arm64: kexec: pass kimage as the only argument to relocation function

2021-04-07 Thread Pavel Tatashin

Currently, kexec relocation function (arm64_relocate_new_kernel) accepts
the following arguments:

head:   start of array that contains relocation information.
entry:  entry point for new kernel or purgatory.
dtb_mem:first and only argument to entry.

The number of arguments cannot be easily expended, because this
function is also called from HVC_SOFT_RESTART, which preserves only
three arguments. And, also arm64_relocate_new_kernel is written in
assembly but called without stack, thus no place to move extra arguments
to free registers.

Soon, we will need to pass more arguments: once we enable MMU we
will need to pass information about page tables.

Pass kimage to arm64_relocate_new_kernel, and teach it to get the
required fields from kimage.

Suggested-by: James Morse 
Signed-off-by: Pavel Tatashin 
---
 arch/arm64/kernel/asm-offsets.c |  7 +++
 arch/arm64/kernel/machine_kexec.c   |  6 --
 arch/arm64/kernel/relocate_kernel.S | 10 --
 3 files changed, 15 insertions(+), 8 deletions(-)

diff --git a/arch/arm64/kernel/asm-offsets.c b/arch/arm64/kernel/asm-offsets.c
index a36e2fc330d4..0c92e193f866 100644
--- a/arch/arm64/kernel/asm-offsets.c
+++ b/arch/arm64/kernel/asm-offsets.c
@@ -9,6 +9,7 @@
 
 #include 
 #include 
+#include 
 #include 
 #include 
 #include 
@@ -153,6 +154,12 @@ int main(void)
   DEFINE(PTRAUTH_USER_KEY_APGA,offsetof(struct 
ptrauth_keys_user, apga));
   DEFINE(PTRAUTH_KERNEL_KEY_APIA,  offsetof(struct ptrauth_keys_kernel, 
apia));
   BLANK();
+#endif
+#ifdef CONFIG_KEXEC_CORE
+  DEFINE(KIMAGE_ARCH_DTB_MEM,  offsetof(struct kimage, arch.dtb_mem));
+  DEFINE(KIMAGE_HEAD,  offsetof(struct kimage, head));
+  DEFINE(KIMAGE_START, offsetof(struct kimage, start));
+  BLANK();
 #endif
   return 0;
 }
diff --git a/arch/arm64/kernel/machine_kexec.c 
b/arch/arm64/kernel/machine_kexec.c
index b150b65f0b84..2e734e4ae12e 100644
--- a/arch/arm64/kernel/machine_kexec.c
+++ b/arch/arm64/kernel/machine_kexec.c
@@ -83,6 +83,8 @@ static void kexec_list_flush(struct kimage *kimage)
 {
kimage_entry_t *entry;
 
+   __flush_dcache_area(kimage, sizeof(*kimage));
+
for (entry = >head; ; entry++) {
unsigned int flag;
void *addr;
@@ -198,8 +200,8 @@ void machine_kexec(struct kimage *kimage)
restart(is_hyp_callable(), kimage->start, kimage->arch.dtb_mem,
0, 0);
} else {
-   cpu_soft_restart(kimage->arch.kern_reloc, kimage->head,
-kimage->start, kimage->arch.dtb_mem);
+   cpu_soft_restart(kimage->arch.kern_reloc, virt_to_phys(kimage),
+0, 0);
}
 
BUG(); /* Should never get here. */
diff --git a/arch/arm64/kernel/relocate_kernel.S 
b/arch/arm64/kernel/relocate_kernel.S
index 718037bef560..36b4496524c3 100644
--- a/arch/arm64/kernel/relocate_kernel.S
+++ b/arch/arm64/kernel/relocate_kernel.S
@@ -27,9 +27,7 @@
  */
 SYM_CODE_START(arm64_relocate_new_kernel)
/* Setup the list loop variables. */
-   mov x18, x2 /* x18 = dtb address */
-   mov x17, x1 /* x17 = kimage_start */
-   mov x16, x0 /* x16 = kimage_head */
+   ldr x16, [x0, #KIMAGE_HEAD] /* x16 = kimage_head */
mov x14, xzr/* x14 = entry ptr */
mov x13, xzr/* x13 = copy dest */
raw_dcache_line_size x15, x1/* x15 = dcache line size */
@@ -63,12 +61,12 @@ SYM_CODE_START(arm64_relocate_new_kernel)
isb
 
/* Start new image. */
-   mov x0, x18
+   ldr x4, [x0, #KIMAGE_START] /* relocation start */
+   ldr x0, [x0, #KIMAGE_ARCH_DTB_MEM]  /* dtb address */
mov x1, xzr
mov x2, xzr
mov x3, xzr
-   br  x17
-
+   br  x4
 SYM_CODE_END(arm64_relocate_new_kernel)
 
 .align 3   /* To keep the 64-bit values below naturally aligned. */
-- 
2.25.1

[PATCH v13 09/18] arm64: kexec: Use dcache ops macros instead of open-coding

2021-04-07 Thread Pavel Tatashin

From: James Morse 

kexec does dcache maintenance when it re-writes all memory. Our
dcache_by_line_op macro depends on reading the sanitised DminLine
from memory. Kexec may have overwritten this, so open-codes the
sequence.

dcache_by_line_op is a whole set of macros, it uses dcache_line_size
which uses read_ctr for the sanitsed DminLine. Reading the DminLine
is the first thing the dcache_by_line_op does.

Rename dcache_by_line_op dcache_by_myline_op and take DminLine as
an argument. Kexec can now use the slightly smaller macro.

This makes up-coming changes to the dcache maintenance easier on
the eye.

Code generated by the existing callers is unchanged.

Signed-off-by: James Morse 

[Fixed merging issues]

Signed-off-by: Pavel Tatashin 
---
 arch/arm64/include/asm/assembler.h  | 12 
 arch/arm64/kernel/relocate_kernel.S | 13 +++--
 2 files changed, 11 insertions(+), 14 deletions(-)

diff --git a/arch/arm64/include/asm/assembler.h 
b/arch/arm64/include/asm/assembler.h
index ca31594d3d6c..29061b76aab6 100644
--- a/arch/arm64/include/asm/assembler.h
+++ b/arch/arm64/include/asm/assembler.h
@@ -371,10 +371,9 @@ alternative_else
 alternative_endif
.endm
 
-   .macro dcache_by_line_op op, domain, kaddr, size, tmp1, tmp2
-   dcache_line_size \tmp1, \tmp2
+   .macro dcache_by_myline_op op, domain, kaddr, size, linesz, tmp2
add \size, \kaddr, \size
-   sub \tmp2, \tmp1, #1
+   sub \tmp2, \linesz, #1
bic \kaddr, \kaddr, \tmp2
 9998:
.ifc\op, cvau
@@ -394,12 +393,17 @@ alternative_endif
.endif
.endif
.endif
-   add \kaddr, \kaddr, \tmp1
+   add \kaddr, \kaddr, \linesz
cmp \kaddr, \size
b.lo9998b
dsb \domain
.endm
 
+   .macro dcache_by_line_op op, domain, kaddr, size, tmp1, tmp2
+   dcache_line_size \tmp1, \tmp2
+   dcache_by_myline_op \op, \domain, \kaddr, \size, \tmp1, \tmp2
+   .endm
+
 /*
  * Macro to perform an instruction cache maintenance for the interval
  * [start, end)
diff --git a/arch/arm64/kernel/relocate_kernel.S 
b/arch/arm64/kernel/relocate_kernel.S
index 8058fabe0a76..718037bef560 100644
--- a/arch/arm64/kernel/relocate_kernel.S
+++ b/arch/arm64/kernel/relocate_kernel.S
@@ -41,16 +41,9 @@ SYM_CODE_START(arm64_relocate_new_kernel)
tbz x16, IND_SOURCE_BIT, .Ltest_indirection
 
/* Invalidate dest page to PoC. */
-   mov x2, x13
-   add x20, x2, #PAGE_SIZE
-   sub x1, x15, #1
-   bic x2, x2, x1
-2: dc  ivac, x2
-   add x2, x2, x15
-   cmp x2, x20
-   b.lo2b
-   dsb sy
-
+   mov x2, x13
+   mov x1, #PAGE_SIZE
+   dcache_by_myline_op ivac, sy, x2, x1, x15, x20
copy_page x13, x12, x1, x2, x3, x4, x5, x6, x7, x8
b   .Lnext
 .Ltest_indirection:
-- 
2.25.1

[PATCH v13 07/18] arm64: kexec: flush image and lists during kexec load time

2021-04-07 Thread Pavel Tatashin

Currently, during kexec load we are copying relocation function and
flushing it. However, we can also flush kexec relocation buffers and
if new kernel image is already in place (i.e. crash kernel), we can
also flush the new kernel image itself.

Signed-off-by: Pavel Tatashin 
---
 arch/arm64/kernel/machine_kexec.c | 49 +++
 1 file changed, 23 insertions(+), 26 deletions(-)

diff --git a/arch/arm64/kernel/machine_kexec.c 
b/arch/arm64/kernel/machine_kexec.c
index 90a335c74442..3a034bc25709 100644
--- a/arch/arm64/kernel/machine_kexec.c
+++ b/arch/arm64/kernel/machine_kexec.c
@@ -59,23 +59,6 @@ void machine_kexec_cleanup(struct kimage *kimage)
/* Empty routine needed to avoid build errors. */
 }
 
-int machine_kexec_post_load(struct kimage *kimage)
-{
-   void *reloc_code = page_to_virt(kimage->control_code_page);
-
-   memcpy(reloc_code, arm64_relocate_new_kernel,
-  arm64_relocate_new_kernel_size);
-   kimage->arch.kern_reloc = __pa(reloc_code);
-   kexec_image_info(kimage);
-
-   /* Flush the reloc_code in preparation for its execution. */
-   __flush_dcache_area(reloc_code, arm64_relocate_new_kernel_size);
-   flush_icache_range((uintptr_t)reloc_code, (uintptr_t)reloc_code +
-  arm64_relocate_new_kernel_size);
-
-   return 0;
-}
-
 /**
  * machine_kexec_prepare - Prepare for a kexec reboot.
  *
@@ -152,6 +135,29 @@ static void kexec_segment_flush(const struct kimage 
*kimage)
}
 }
 
+int machine_kexec_post_load(struct kimage *kimage)
+{
+   void *reloc_code = page_to_virt(kimage->control_code_page);
+
+   /* If in place flush new kernel image, else flush lists and buffers */
+   if (kimage->head & IND_DONE)
+   kexec_segment_flush(kimage);
+   else
+   kexec_list_flush(kimage);
+
+   memcpy(reloc_code, arm64_relocate_new_kernel,
+  arm64_relocate_new_kernel_size);
+   kimage->arch.kern_reloc = __pa(reloc_code);
+   kexec_image_info(kimage);
+
+   /* Flush the reloc_code in preparation for its execution. */
+   __flush_dcache_area(reloc_code, arm64_relocate_new_kernel_size);
+   flush_icache_range((uintptr_t)reloc_code, (uintptr_t)reloc_code +
+  arm64_relocate_new_kernel_size);
+
+   return 0;
+}
+
 /**
  * machine_kexec - Do the kexec reboot.
  *
@@ -169,13 +175,6 @@ void machine_kexec(struct kimage *kimage)
WARN(in_kexec_crash && (stuck_cpus || smp_crash_stop_failed()),
"Some CPUs may be stale, kdump will be unreliable.\n");
 
-   /* Flush the kimage list and its buffers. */
-   kexec_list_flush(kimage);
-
-   /* Flush the new image if already in place. */
-   if ((kimage != kexec_crash_image) && (kimage->head & IND_DONE))
-   kexec_segment_flush(kimage);
-
pr_info("Bye!\n");
 
local_daif_mask();
@@ -250,8 +249,6 @@ void arch_kexec_protect_crashkres(void)
 {
int i;
 
-   kexec_segment_flush(kexec_crash_image);
-
for (i = 0; i < kexec_crash_image->nr_segments; i++)
set_memory_valid(
__phys_to_virt(kexec_crash_image->segment[i].mem),
-- 
2.25.1

[PATCH v13 08/18] arm64: kexec: skip relocation code for inplace kexec

2021-04-07 Thread Pavel Tatashin

In case of kdump or when segments are already in place the relocation
is not needed, therefore the setup of relocation function and call to
it can be skipped.

Signed-off-by: Pavel Tatashin 
Suggested-by: James Morse 
---
 arch/arm64/kernel/machine_kexec.c   | 34 ++---
 arch/arm64/kernel/relocate_kernel.S |  3 ---
 2 files changed, 21 insertions(+), 16 deletions(-)

diff --git a/arch/arm64/kernel/machine_kexec.c 
b/arch/arm64/kernel/machine_kexec.c
index 3a034bc25709..b150b65f0b84 100644
--- a/arch/arm64/kernel/machine_kexec.c
+++ b/arch/arm64/kernel/machine_kexec.c
@@ -139,21 +139,23 @@ int machine_kexec_post_load(struct kimage *kimage)
 {
void *reloc_code = page_to_virt(kimage->control_code_page);
 
-   /* If in place flush new kernel image, else flush lists and buffers */
-   if (kimage->head & IND_DONE)
+   /* If in place, relocation is not used, only flush next kernel */
+   if (kimage->head & IND_DONE) {
kexec_segment_flush(kimage);
-   else
-   kexec_list_flush(kimage);
+   kexec_image_info(kimage);
+   return 0;
+   }
 
memcpy(reloc_code, arm64_relocate_new_kernel,
   arm64_relocate_new_kernel_size);
kimage->arch.kern_reloc = __pa(reloc_code);
-   kexec_image_info(kimage);
 
/* Flush the reloc_code in preparation for its execution. */
__flush_dcache_area(reloc_code, arm64_relocate_new_kernel_size);
flush_icache_range((uintptr_t)reloc_code, (uintptr_t)reloc_code +
   arm64_relocate_new_kernel_size);
+   kexec_list_flush(kimage);
+   kexec_image_info(kimage);
 
return 0;
 }
@@ -180,19 +182,25 @@ void machine_kexec(struct kimage *kimage)
local_daif_mask();
 
/*
-* cpu_soft_restart will shutdown the MMU, disable data caches, then
-* transfer control to the kern_reloc which contains a copy of
-* the arm64_relocate_new_kernel routine.  arm64_relocate_new_kernel
-* uses physical addressing to relocate the new image to its final
-* position and transfers control to the image entry point when the
-* relocation is complete.
+* Both restart and cpu_soft_restart will shutdown the MMU, disable data
+* caches. However, restart will start new kernel or purgatory directly,
+* cpu_soft_restart will transfer control to arm64_relocate_new_kernel
 * In kexec case, kimage->start points to purgatory assuming that
 * kernel entry and dtb address are embedded in purgatory by
 * userspace (kexec-tools).
 * In kexec_file case, the kernel starts directly without purgatory.
 */
-   cpu_soft_restart(kimage->arch.kern_reloc, kimage->head, kimage->start,
-kimage->arch.dtb_mem);
+   if (kimage->head & IND_DONE) {
+   typeof(__cpu_soft_restart) *restart;
+
+   cpu_install_idmap();
+   restart = (void *)__pa_symbol(__cpu_soft_restart);
+   restart(is_hyp_callable(), kimage->start, kimage->arch.dtb_mem,
+   0, 0);
+   } else {
+   cpu_soft_restart(kimage->arch.kern_reloc, kimage->head,
+kimage->start, kimage->arch.dtb_mem);
+   }
 
BUG(); /* Should never get here. */
 }
diff --git a/arch/arm64/kernel/relocate_kernel.S 
b/arch/arm64/kernel/relocate_kernel.S
index b78ea5de97a4..8058fabe0a76 100644
--- a/arch/arm64/kernel/relocate_kernel.S
+++ b/arch/arm64/kernel/relocate_kernel.S
@@ -32,8 +32,6 @@ SYM_CODE_START(arm64_relocate_new_kernel)
mov x16, x0 /* x16 = kimage_head */
mov x14, xzr/* x14 = entry ptr */
mov x13, xzr/* x13 = copy dest */
-   /* Check if the new image needs relocation. */
-   tbnzx16, IND_DONE_BIT, .Ldone
raw_dcache_line_size x15, x1/* x15 = dcache line size */
 .Lloop:
and x12, x16, PAGE_MASK /* x12 = addr */
@@ -65,7 +63,6 @@ SYM_CODE_START(arm64_relocate_new_kernel)
 .Lnext:
ldr x16, [x14], #8  /* entry = *ptr++ */
tbz x16, IND_DONE_BIT, .Lloop   /* while (!(entry & DONE)) */
-.Ldone:
/* wait for writes from copy_page to finish */
dsb nsh
ic  iallu
-- 
2.25.1

[PATCH v13 06/18] arm64: hibernate: abstract ttrb0 setup function

2021-04-07 Thread Pavel Tatashin

Currently, only hibernate sets custom ttbr0 with safe idmaped function.
Kexec, is also going to be using this functinality when relocation code
is going to be idmapped.

Move the setup seqeuence to a dedicated cpu_install_ttbr0() for custom
ttbr0.

Suggested-by: James Morse 
Signed-off-by: Pavel Tatashin 
---
 arch/arm64/include/asm/mmu_context.h | 24 
 arch/arm64/kernel/hibernate.c| 21 +
 2 files changed, 25 insertions(+), 20 deletions(-)

diff --git a/arch/arm64/include/asm/mmu_context.h 
b/arch/arm64/include/asm/mmu_context.h
index bd02e99b1a4c..f64d0d5e1b1f 100644
--- a/arch/arm64/include/asm/mmu_context.h
+++ b/arch/arm64/include/asm/mmu_context.h
@@ -115,6 +115,30 @@ static inline void cpu_install_idmap(void)
cpu_switch_mm(lm_alias(idmap_pg_dir), _mm);
 }
 
+/*
+ * Load our new page tables. A strict BBM approach requires that we ensure that
+ * TLBs are free of any entries that may overlap with the global mappings we 
are
+ * about to install.
+ *
+ * For a real hibernate/resume/kexec cycle TTBR0 currently points to a zero
+ * page, but TLBs may contain stale ASID-tagged entries (e.g. for EFI runtime
+ * services), while for a userspace-driven test_resume cycle it points to
+ * userspace page tables (and we must point it at a zero page ourselves).
+ *
+ * We change T0SZ as part of installing the idmap. This is undone by
+ * cpu_uninstall_idmap() in __cpu_suspend_exit().
+ */
+static inline void cpu_install_ttbr0(phys_addr_t ttbr0, unsigned long t0sz)
+{
+   cpu_set_reserved_ttbr0();
+   local_flush_tlb_all();
+   __cpu_set_tcr_t0sz(t0sz);
+
+   /* avoid cpu_switch_mm() and its SW-PAN and CNP interactions */
+   write_sysreg(ttbr0, ttbr0_el1);
+   isb();
+}
+
 /*
  * Atomically replaces the active TTBR1_EL1 PGD with a new VA-compatible PGD,
  * avoiding the possibility of conflicting TLB entries being allocated.
diff --git a/arch/arm64/kernel/hibernate.c b/arch/arm64/kernel/hibernate.c
index 0b8bad8bb6eb..ded5115bcb63 100644
--- a/arch/arm64/kernel/hibernate.c
+++ b/arch/arm64/kernel/hibernate.c
@@ -206,26 +206,7 @@ static int create_safe_exec_page(void *src_start, size_t 
length,
if (rc)
return rc;
 
-   /*
-* Load our new page tables. A strict BBM approach requires that we
-* ensure that TLBs are free of any entries that may overlap with the
-* global mappings we are about to install.
-*
-* For a real hibernate/resume cycle TTBR0 currently points to a zero
-* page, but TLBs may contain stale ASID-tagged entries (e.g. for EFI
-* runtime services), while for a userspace-driven test_resume cycle it
-* points to userspace page tables (and we must point it at a zero page
-* ourselves).
-*
-* We change T0SZ as part of installing the idmap. This is undone by
-* cpu_uninstall_idmap() in __cpu_suspend_exit().
-*/
-   cpu_set_reserved_ttbr0();
-   local_flush_tlb_all();
-   __cpu_set_tcr_t0sz(t0sz);
-   write_sysreg(trans_ttbr0, ttbr0_el1);
-   isb();
-
+   cpu_install_ttbr0(trans_ttbr0, t0sz);
*phys_dst_addr = virt_to_phys(page);
 
return 0;
-- 
2.25.1

[PATCH v13 05/18] arm64: trans_pgd: hibernate: Add trans_pgd_copy_el2_vectors

2021-04-07 Thread Pavel Tatashin

Users of trans_pgd may also need a copy of vector table because it is
also may be overwritten if a linear map can be overwritten.

Move setup of EL2 vectors from hibernate to trans_pgd, so it can be
later shared with kexec as well.

Suggested-by: James Morse 
Signed-off-by: Pavel Tatashin 
---
 arch/arm64/include/asm/trans_pgd.h |  3 +++
 arch/arm64/include/asm/virt.h  |  3 +++
 arch/arm64/kernel/hibernate.c  | 28 ++--
 arch/arm64/mm/trans_pgd.c  | 20 
 4 files changed, 36 insertions(+), 18 deletions(-)

diff --git a/arch/arm64/include/asm/trans_pgd.h 
b/arch/arm64/include/asm/trans_pgd.h
index 5d08e5adf3d5..e0760e52d36d 100644
--- a/arch/arm64/include/asm/trans_pgd.h
+++ b/arch/arm64/include/asm/trans_pgd.h
@@ -36,4 +36,7 @@ int trans_pgd_map_page(struct trans_pgd_info *info, pgd_t 
*trans_pgd,
 int trans_pgd_idmap_page(struct trans_pgd_info *info, phys_addr_t *trans_ttbr0,
 unsigned long *t0sz, void *page);
 
+int trans_pgd_copy_el2_vectors(struct trans_pgd_info *info,
+  phys_addr_t *el2_vectors);
+
 #endif /* _ASM_TRANS_TABLE_H */
diff --git a/arch/arm64/include/asm/virt.h b/arch/arm64/include/asm/virt.h
index 4216c8623538..bfbb66018114 100644
--- a/arch/arm64/include/asm/virt.h
+++ b/arch/arm64/include/asm/virt.h
@@ -67,6 +67,9 @@
  */
 extern u32 __boot_cpu_mode[2];
 
+extern char __hyp_stub_vectors[];
+#define ARM64_VECTOR_TABLE_LEN SZ_2K
+
 void __hyp_set_vectors(phys_addr_t phys_vector_base);
 void __hyp_reset_vectors(void);
 
diff --git a/arch/arm64/kernel/hibernate.c b/arch/arm64/kernel/hibernate.c
index c764574a1acb..0b8bad8bb6eb 100644
--- a/arch/arm64/kernel/hibernate.c
+++ b/arch/arm64/kernel/hibernate.c
@@ -48,12 +48,6 @@
  */
 extern int in_suspend;
 
-/* temporary el2 vectors in the __hibernate_exit_text section. */
-extern char hibernate_el2_vectors[];
-
-/* hyp-stub vectors, used to restore el2 during resume from hibernate. */
-extern char __hyp_stub_vectors[];
-
 /*
  * The logical cpu number we should resume on, initialised to a non-cpu
  * number.
@@ -428,6 +422,7 @@ int swsusp_arch_resume(void)
void *zero_page;
size_t exit_size;
pgd_t *tmp_pg_dir;
+   phys_addr_t el2_vectors;
void __noreturn (*hibernate_exit)(phys_addr_t, phys_addr_t, void *,
  void *, phys_addr_t, phys_addr_t);
struct trans_pgd_info trans_info = {
@@ -455,6 +450,14 @@ int swsusp_arch_resume(void)
return -ENOMEM;
}
 
+   if (is_hyp_callable()) {
+   rc = trans_pgd_copy_el2_vectors(_info, _vectors);
+   if (rc) {
+   pr_err("Failed to setup el2 vectors\n");
+   return rc;
+   }
+   }
+
exit_size = __hibernate_exit_text_end - __hibernate_exit_text_start;
/*
 * Copy swsusp_arch_suspend_exit() to a safe page. This will generate
@@ -467,25 +470,14 @@ int swsusp_arch_resume(void)
return rc;
}
 
-   /*
-* The hibernate exit text contains a set of el2 vectors, that will
-* be executed at el2 with the mmu off in order to reload hyp-stub.
-*/
-   __flush_dcache_area(hibernate_exit, exit_size);
-
/*
 * KASLR will cause the el2 vectors to be in a different location in
 * the resumed kernel. Load hibernate's temporary copy into el2.
 *
 * We can skip this step if we booted at EL1, or are running with VHE.
 */
-   if (is_hyp_callable()) {
-   phys_addr_t el2_vectors = (phys_addr_t)hibernate_exit;
-   el2_vectors += hibernate_el2_vectors -
-  __hibernate_exit_text_start; /* offset */
-
+   if (is_hyp_callable())
__hyp_set_vectors(el2_vectors);
-   }
 
hibernate_exit(virt_to_phys(tmp_pg_dir), resume_hdr.ttbr1_el1,
   resume_hdr.reenter_kernel, restore_pblist,
diff --git a/arch/arm64/mm/trans_pgd.c b/arch/arm64/mm/trans_pgd.c
index 527f0a39c3da..61549451ed3a 100644
--- a/arch/arm64/mm/trans_pgd.c
+++ b/arch/arm64/mm/trans_pgd.c
@@ -322,3 +322,23 @@ int trans_pgd_idmap_page(struct trans_pgd_info *info, 
phys_addr_t *trans_ttbr0,
 
return 0;
 }
+
+/*
+ * Create a copy of the vector table so we can call HVC_SET_VECTORS or
+ * HVC_SOFT_RESTART from contexts where the table may be overwritten.
+ */
+int trans_pgd_copy_el2_vectors(struct trans_pgd_info *info,
+  phys_addr_t *el2_vectors)
+{
+   void *hyp_stub = trans_alloc(info);
+
+   if (!hyp_stub)
+   return -ENOMEM;
+   *el2_vectors = virt_to_phys(hyp_stub);
+   memcpy(hyp_stub, &__hyp_stub_vectors, ARM64_VECTOR_TABLE_LEN);
+   __flush_icache_range((unsigned long)hyp_stub,
+(unsigned long)hyp_stub + ARM64_VECTOR_TABLE_LEN);
+

[PATCH v13 04/18] arm64: kernel: add helper for booted at EL2 and not VHE

2021-04-07 Thread Pavel Tatashin

Replace places that contain logic like this:
is_hyp_mode_available() && !is_kernel_in_hyp_mode()

With a dedicated boolean function  is_hyp_callable(). This will be needed
later in kexec in order to sooner switch back to EL2.

Suggested-by: James Morse 
Signed-off-by: Pavel Tatashin 
---
 arch/arm64/include/asm/virt.h | 5 +
 arch/arm64/kernel/cpu-reset.h | 3 +--
 arch/arm64/kernel/hibernate.c | 9 +++--
 arch/arm64/kernel/sdei.c  | 2 +-
 4 files changed, 10 insertions(+), 9 deletions(-)

diff --git a/arch/arm64/include/asm/virt.h b/arch/arm64/include/asm/virt.h
index 7379f35ae2c6..4216c8623538 100644
--- a/arch/arm64/include/asm/virt.h
+++ b/arch/arm64/include/asm/virt.h
@@ -128,6 +128,11 @@ static __always_inline bool is_protected_kvm_enabled(void)
return cpus_have_final_cap(ARM64_KVM_PROTECTED_MODE);
 }
 
+static inline bool is_hyp_callable(void)
+{
+   return is_hyp_mode_available() && !is_kernel_in_hyp_mode();
+}
+
 #endif /* __ASSEMBLY__ */
 
 #endif /* ! __ASM__VIRT_H */
diff --git a/arch/arm64/kernel/cpu-reset.h b/arch/arm64/kernel/cpu-reset.h
index ed50e9587ad8..1922e7a690f8 100644
--- a/arch/arm64/kernel/cpu-reset.h
+++ b/arch/arm64/kernel/cpu-reset.h
@@ -20,8 +20,7 @@ static inline void __noreturn cpu_soft_restart(unsigned long 
entry,
 {
typeof(__cpu_soft_restart) *restart;
 
-   unsigned long el2_switch = !is_kernel_in_hyp_mode() &&
-   is_hyp_mode_available();
+   unsigned long el2_switch = is_hyp_callable();
restart = (void *)__pa_symbol(__cpu_soft_restart);
 
cpu_install_idmap();
diff --git a/arch/arm64/kernel/hibernate.c b/arch/arm64/kernel/hibernate.c
index b1cef371df2b..c764574a1acb 100644
--- a/arch/arm64/kernel/hibernate.c
+++ b/arch/arm64/kernel/hibernate.c
@@ -48,9 +48,6 @@
  */
 extern int in_suspend;
 
-/* Do we need to reset el2? */
-#define el2_reset_needed() (is_hyp_mode_available() && 
!is_kernel_in_hyp_mode())
-
 /* temporary el2 vectors in the __hibernate_exit_text section. */
 extern char hibernate_el2_vectors[];
 
@@ -125,7 +122,7 @@ int arch_hibernation_header_save(void *addr, unsigned int 
max_size)
hdr->reenter_kernel = _cpu_resume;
 
/* We can't use __hyp_get_vectors() because kvm may still be loaded */
-   if (el2_reset_needed())
+   if (is_hyp_callable())
hdr->__hyp_stub_vectors = __pa_symbol(__hyp_stub_vectors);
else
hdr->__hyp_stub_vectors = 0;
@@ -387,7 +384,7 @@ int swsusp_arch_suspend(void)
dcache_clean_range(__idmap_text_start, __idmap_text_end);
 
/* Clean kvm setup code to PoC? */
-   if (el2_reset_needed()) {
+   if (is_hyp_callable()) {
dcache_clean_range(__hyp_idmap_text_start, 
__hyp_idmap_text_end);
dcache_clean_range(__hyp_text_start, __hyp_text_end);
}
@@ -482,7 +479,7 @@ int swsusp_arch_resume(void)
 *
 * We can skip this step if we booted at EL1, or are running with VHE.
 */
-   if (el2_reset_needed()) {
+   if (is_hyp_callable()) {
phys_addr_t el2_vectors = (phys_addr_t)hibernate_exit;
el2_vectors += hibernate_el2_vectors -
   __hibernate_exit_text_start; /* offset */
diff --git a/arch/arm64/kernel/sdei.c b/arch/arm64/kernel/sdei.c
index 2c7ca449dd51..af0ac2f920cf 100644
--- a/arch/arm64/kernel/sdei.c
+++ b/arch/arm64/kernel/sdei.c
@@ -200,7 +200,7 @@ unsigned long sdei_arch_get_entry_point(int conduit)
 * dropped to EL1 because we don't support VHE, then we can't support
 * SDEI.
 */
-   if (is_hyp_mode_available() && !is_kernel_in_hyp_mode()) {
+   if (is_hyp_callable()) {
pr_err("Not supported on this hardware/boot configuration\n");
goto out_err;
}
-- 
2.25.1

[PATCH v13 03/18] arm64: hyp-stub: Move el1_sync into the vectors

2021-04-07 Thread Pavel Tatashin

From: James Morse 

The hyp-stub's el1_sync code doesn't do very much, this can easily fit
in the vectors.

With this, all of the hyp-stubs behaviour is contained in its vectors.
This lets kexec and hibernate copy the hyp-stub when they need its
behaviour, instead of re-implementing it.

Signed-off-by: James Morse 

[Fixed merging issues]

Signed-off-by: Pavel Tatashin 
---
 arch/arm64/kernel/hyp-stub.S | 59 ++--
 1 file changed, 29 insertions(+), 30 deletions(-)

diff --git a/arch/arm64/kernel/hyp-stub.S b/arch/arm64/kernel/hyp-stub.S
index ff329c5c074d..d1a73d0f74e0 100644
--- a/arch/arm64/kernel/hyp-stub.S
+++ b/arch/arm64/kernel/hyp-stub.S
@@ -21,6 +21,34 @@ SYM_CODE_START_LOCAL(\label)
.align 7
b   \label
 SYM_CODE_END(\label)
+.endm
+
+.macro hyp_stub_el1_sync
+SYM_CODE_START_LOCAL(hyp_stub_el1_sync)
+   .align 7
+   cmp x0, #HVC_SET_VECTORS
+   b.ne2f
+   msr vbar_el2, x1
+   b   9f
+
+2: cmp x0, #HVC_SOFT_RESTART
+   b.ne3f
+   mov x0, x2
+   mov x2, x4
+   mov x4, x1
+   mov x1, x3
+   br  x4  // no return
+
+3: cmp x0, #HVC_RESET_VECTORS
+   beq 9f  // Nothing to reset!
+
+   /* Someone called kvm_call_hyp() against the hyp-stub... */
+   mov_q   x0, HVC_STUB_ERR
+   eret
+
+9: mov x0, xzr
+   eret
+SYM_CODE_END(hyp_stub_el1_sync)
 .endm
 
.text
@@ -39,7 +67,7 @@ SYM_CODE_START(__hyp_stub_vectors)
invalid_vector  hyp_stub_el2h_fiq_invalid   // FIQ EL2h
invalid_vector  hyp_stub_el2h_error_invalid // Error EL2h
 
-   ventry  el1_sync// Synchronous 64-bit EL1
+   hyp_stub_el1_sync   // Synchronous 64-bit 
EL1
invalid_vector  hyp_stub_el1_irq_invalid// IRQ 64-bit EL1
invalid_vector  hyp_stub_el1_fiq_invalid// FIQ 64-bit EL1
invalid_vector  hyp_stub_el1_error_invalid  // Error 64-bit EL1
@@ -55,35 +83,6 @@ SYM_CODE_END(__hyp_stub_vectors)
 # Check the __hyp_stub_vectors didn't overflow
 .org . - (__hyp_stub_vectors_end - __hyp_stub_vectors) + SZ_2K
 
-
-SYM_CODE_START_LOCAL(el1_sync)
-   cmp x0, #HVC_SET_VECTORS
-   b.ne1f
-   msr vbar_el2, x1
-   b   9f
-
-1: cmp x0, #HVC_VHE_RESTART
-   b.eqmutate_to_vhe
-
-2: cmp x0, #HVC_SOFT_RESTART
-   b.ne3f
-   mov x0, x2
-   mov x2, x4
-   mov x4, x1
-   mov x1, x3
-   br  x4  // no return
-
-3: cmp x0, #HVC_RESET_VECTORS
-   beq 9f  // Nothing to reset!
-
-   /* Someone called kvm_call_hyp() against the hyp-stub... */
-   mov_q   x0, HVC_STUB_ERR
-   eret
-
-9: mov x0, xzr
-   eret
-SYM_CODE_END(el1_sync)
-
 // nVHE? No way! Give me the real thing!
 SYM_CODE_START_LOCAL(mutate_to_vhe)
// Sanity check: MMU *must* be off
-- 
2.25.1

[PATCH v13 02/18] arm64: hyp-stub: Move invalid vector entries into the vectors

2021-04-07 Thread Pavel Tatashin

From: James Morse 

Most of the hyp-stub's vector entries are invalid. These are each
a unique function that branches to itself. To move these into the
vectors, merge the ventry and invalid_vector macros and give each
one a unique name.

This means we can copy the hyp-stub as it is self contained within
its vectors.

Signed-off-by: James Morse 

[Fixed merging issues]

Signed-off-by: Pavel Tatashin 
---
 arch/arm64/kernel/hyp-stub.S | 56 +++-
 1 file changed, 23 insertions(+), 33 deletions(-)

diff --git a/arch/arm64/kernel/hyp-stub.S b/arch/arm64/kernel/hyp-stub.S
index 572b28646005..ff329c5c074d 100644
--- a/arch/arm64/kernel/hyp-stub.S
+++ b/arch/arm64/kernel/hyp-stub.S
@@ -16,31 +16,38 @@
 #include 
 #include 
 
+.macro invalid_vector  label
+SYM_CODE_START_LOCAL(\label)
+   .align 7
+   b   \label
+SYM_CODE_END(\label)
+.endm
+
.text
.pushsection.hyp.text, "ax"
 
.align 11
 
 SYM_CODE_START(__hyp_stub_vectors)
-   ventry  el2_sync_invalid// Synchronous EL2t
-   ventry  el2_irq_invalid // IRQ EL2t
-   ventry  el2_fiq_invalid // FIQ EL2t
-   ventry  el2_error_invalid   // Error EL2t
+   invalid_vector  hyp_stub_el2t_sync_invalid  // Synchronous EL2t
+   invalid_vector  hyp_stub_el2t_irq_invalid   // IRQ EL2t
+   invalid_vector  hyp_stub_el2t_fiq_invalid   // FIQ EL2t
+   invalid_vector  hyp_stub_el2t_error_invalid // Error EL2t
 
-   ventry  el2_sync_invalid// Synchronous EL2h
-   ventry  el2_irq_invalid // IRQ EL2h
-   ventry  el2_fiq_invalid // FIQ EL2h
-   ventry  el2_error_invalid   // Error EL2h
+   invalid_vector  hyp_stub_el2h_sync_invalid  // Synchronous EL2h
+   invalid_vector  hyp_stub_el2h_irq_invalid   // IRQ EL2h
+   invalid_vector  hyp_stub_el2h_fiq_invalid   // FIQ EL2h
+   invalid_vector  hyp_stub_el2h_error_invalid // Error EL2h
 
ventry  el1_sync// Synchronous 64-bit EL1
-   ventry  el1_irq_invalid // IRQ 64-bit EL1
-   ventry  el1_fiq_invalid // FIQ 64-bit EL1
-   ventry  el1_error_invalid   // Error 64-bit EL1
-
-   ventry  el1_sync_invalid// Synchronous 32-bit EL1
-   ventry  el1_irq_invalid // IRQ 32-bit EL1
-   ventry  el1_fiq_invalid // FIQ 32-bit EL1
-   ventry  el1_error_invalid   // Error 32-bit EL1
+   invalid_vector  hyp_stub_el1_irq_invalid// IRQ 64-bit EL1
+   invalid_vector  hyp_stub_el1_fiq_invalid// FIQ 64-bit EL1
+   invalid_vector  hyp_stub_el1_error_invalid  // Error 64-bit EL1
+
+   invalid_vector  hyp_stub_32b_el1_sync_invalid   // Synchronous 32-bit 
EL1
+   invalid_vector  hyp_stub_32b_el1_irq_invalid// IRQ 32-bit EL1
+   invalid_vector  hyp_stub_32b_el1_fiq_invalid// FIQ 32-bit EL1
+   invalid_vector  hyp_stub_32b_el1_error_invalid  // Error 32-bit EL1
.align 11
 SYM_INNER_LABEL(__hyp_stub_vectors_end, SYM_L_LOCAL)
 SYM_CODE_END(__hyp_stub_vectors)
@@ -173,23 +180,6 @@ SYM_CODE_END(enter_vhe)
 
.popsection
 
-.macro invalid_vector  label
-SYM_CODE_START_LOCAL(\label)
-   b \label
-SYM_CODE_END(\label)
-.endm
-
-   invalid_vector  el2_sync_invalid
-   invalid_vector  el2_irq_invalid
-   invalid_vector  el2_fiq_invalid
-   invalid_vector  el2_error_invalid
-   invalid_vector  el1_sync_invalid
-   invalid_vector  el1_irq_invalid
-   invalid_vector  el1_fiq_invalid
-   invalid_vector  el1_error_invalid
-
-   .popsection
-
 /*
  * __hyp_set_vectors: Call this after boot to set the initial hypervisor
  * vectors as part of hypervisor installation.  On an SMP system, this should
-- 
2.25.1

[PATCH v13 01/18] arm64: hyp-stub: Check the size of the HYP stub's vectors

2021-04-07 Thread Pavel Tatashin

From: James Morse 

Hibernate contains a set of temporary EL2 vectors used to 'park'
EL2 somewhere safe while all the memory is thrown in the air.
Making kexec do its relocations with the MMU on means they have to
be done at EL1, so EL2 has to be parked. This means yet another
set of vectors.

All these things do is HVC_SET_VECTORS and HVC_SOFT_RESTART, both
of which are implemented by the hyp-stub. Lets copy it instead
of re-inventing it.

To do this the hyp-stub's entrails need to be packed neatly inside
its 2K vectors.

Start by moving the final 2K alignment inside the end marker, and
add a build check that we didn't overflow 2K.

Signed-off-by: James Morse 
Signed-off-by: Pavel Tatashin 
---
 arch/arm64/kernel/hyp-stub.S | 6 +-
 1 file changed, 5 insertions(+), 1 deletion(-)

diff --git a/arch/arm64/kernel/hyp-stub.S b/arch/arm64/kernel/hyp-stub.S
index 5eccbd62fec8..572b28646005 100644
--- a/arch/arm64/kernel/hyp-stub.S
+++ b/arch/arm64/kernel/hyp-stub.S
@@ -41,9 +41,13 @@ SYM_CODE_START(__hyp_stub_vectors)
ventry  el1_irq_invalid // IRQ 32-bit EL1
ventry  el1_fiq_invalid // FIQ 32-bit EL1
ventry  el1_error_invalid   // Error 32-bit EL1
+   .align 11
+SYM_INNER_LABEL(__hyp_stub_vectors_end, SYM_L_LOCAL)
 SYM_CODE_END(__hyp_stub_vectors)
 
-   .align 11
+# Check the __hyp_stub_vectors didn't overflow
+.org . - (__hyp_stub_vectors_end - __hyp_stub_vectors) + SZ_2K
+
 
 SYM_CODE_START_LOCAL(el1_sync)
cmp x0, #HVC_SET_VECTORS
-- 
2.25.1

[PATCH v13 00/18] arm64: MMU enabled kexec relocation

2021-04-07 Thread Pavel Tatashin

Changelog:
v13:
- Fixed a hang on ThunderX2, thank you Pingfan Liu for reporting
  the problem. In relocation function we need civac not ivac, we
  need to clean data in addition to invalidating it.
  Since I was using ThunderX2 machine I also measured the new
  performance data on this large ARM64 server. The MMU improves
  kexec relocation 190 times on this machine! (see below for
  raw data). Saves 7.5s during CentOS kexec reboot.
v12:
- A major change compared to previous version. Instead of using
  contiguous VA range a copy of linear map is now used to perform
  copying of segments during relocation as it was agreed in the
  discussion of version 11 of this project.
- In addition to using linear map, I also took several ideas from
  James Morse to better organize the kexec relocation:
1. skip relocation function entirely if that is not needed
2. remove the PoC flushing function since it is not needed
   anymore with MMU enabled.
v11:
- Fixed missing KEXEC_CORE dependency for trans_pgd.c
- Removed useless "if(rc) return rc" statement (thank you Tyler Hicks)
- Another 12 patches were accepted into maintainer's get.
  Re-based patches against:
  https://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux.git
  Branch: for-next/kexec
v10:
- Addressed a lot of comments form James Morse and from  Marc Zyngier
- Added review-by's
- Synchronized with mainline

v9: - 9 patches from previous series landed in upstream, so now series
  is smaller
- Added two patches from James Morse to address idmap issues for 
machines
  with high physical addresses.
- Addressed comments from Selin Dag about compiling issues. He also 
tested
  my series and got similar performance results: ~60 ms instead of ~580 
ms
  with an initramfs size of ~120MB.
v8:
- Synced with mainline to keep series up-to-date
v7:
-- Addressed comments from James Morse
- arm64: hibernate: pass the allocated pgdp to ttbr0
  Removed "Fixes" tag, and added Added Reviewed-by: James Morse
- arm64: hibernate: check pgd table allocation
  Sent out as a standalone patch so it can be sent to stable
  Series applies on mainline + this patch
- arm64: hibernate: add trans_pgd public functions
  Remove second allocation of tmp_pg_dir in swsusp_arch_resume
  Added Reviewed-by: James Morse 
- arm64: kexec: move relocation function setup and clean up
  Fixed typo in commit log
  Changed kern_reloc to phys_addr_t types.
  Added explanation why kern_reloc is needed.
  Split into four patches:
  arm64: kexec: make dtb_mem always enabled
  arm64: kexec: remove unnecessary debug prints
  arm64: kexec: call kexec_image_info only once
  arm64: kexec: move relocation function setup
- arm64: kexec: add expandable argument to relocation function
  Changed types of new arguments from unsigned long to phys_addr_t.
  Changed offset prefix to KEXEC_*
  Split into four patches:
  arm64: kexec: cpu_soft_restart change argument types
  arm64: kexec: arm64_relocate_new_kernel clean-ups
  arm64: kexec: arm64_relocate_new_kernel don't use x0 as temp
  arm64: kexec: add expandable argument to relocation function
- arm64: kexec: configure trans_pgd page table for kexec
  Added invalid entries into EL2 vector table
  Removed KEXEC_EL2_VECTOR_TABLE_SIZE and KEXEC_EL2_VECTOR_TABLE_OFFSET
  Copy relocation functions and table into separate pages
  Changed types in kern_reloc_arg.
  Split into three patches:
  arm64: kexec: offset for relocation function
  arm64: kexec: kexec EL2 vectors
  arm64: kexec: configure trans_pgd page table for kexec
- arm64: kexec: enable MMU during kexec relocation
  Split into two patches:
  arm64: kexec: enable MMU during kexec relocation
  arm64: kexec: remove head from relocation argument
v6:
- Sync with mainline tip
- Added Acked's from Dave Young
v5:
- Addressed comments from Matthias Brugger: added review-by's, improved
  comments, and made cleanups to swsusp_arch_resume() in addition to
  create_safe_exec_page().
- Synced with mainline tip.
v4:
- Addressed comments from James Morse.
- Split "check pgd table allocation" into two patches, and moved to
  the beginning of series  for simpler backport of the fixes.
  Added "Fixes:" tags to commit logs.
- Changed "arm64, hibernate:" to "arm64: hibernate:"
- Added Reviewed-by's
- Moved "add PUD_SECT_RDONLY" earlier in series

[PATCH 4/4] spi: spi-zynqmp-gqspi: fix incorrect operating mode in zynqmp_qspi_read_op

2021-04-07 Thread quanyang . wang

From: Quanyang Wang 

When starting a read operation, we should call zynqmp_qspi_setuprxdma
first to set xqspi->mode according to xqspi->bytes_to_receive and
to calculate correct xqspi->dma_rx_bytes. Then in the function
zynqmp_qspi_fillgenfifo, generate the appropriate command with
operating mode and bytes to transfer, and fill the GENFIFO with
the command to perform the read operation.

Calling zynqmp_qspi_fillgenfifo before zynqmp_qspi_setuprxdma will
result in incorrect transfer length and operating mode. So change
the calling order to fix this issue.

Fixes: 1c26372e5aa9 ("spi: spi-zynqmp-gqspi: Update driver to use spi-mem 
framework")
Signed-off-by: Quanyang Wang 
Reviewed-by: Amit Kumar Mahapatra 
---
 drivers/spi/spi-zynqmp-gqspi.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/spi/spi-zynqmp-gqspi.c b/drivers/spi/spi-zynqmp-gqspi.c
index cf73a069b759..036d8ae41c06 100644
--- a/drivers/spi/spi-zynqmp-gqspi.c
+++ b/drivers/spi/spi-zynqmp-gqspi.c
@@ -827,8 +827,8 @@ static void zynqmp_qspi_write_op(struct zynqmp_qspi *xqspi, 
u8 tx_nbits,
 static void zynqmp_qspi_read_op(struct zynqmp_qspi *xqspi, u8 rx_nbits,
u32 genfifoentry)
 {
-   zynqmp_qspi_fillgenfifo(xqspi, rx_nbits, genfifoentry);
zynqmp_qspi_setuprxdma(xqspi);
+   zynqmp_qspi_fillgenfifo(xqspi, rx_nbits, genfifoentry);
 }
 
 /**
-- 
2.25.1

[PATCH 3/4] spi: spi-zynqmp-gqspi: transmit dummy circles by using the controller's internal functionality

2021-04-07 Thread quanyang . wang

From: Quanyang Wang 

There is a data corruption issue that occurs in the reading operation
(cmd:0x6c) when transmitting common data as dummy circles.

The gqspi controller has the functionality to send dummy clock circles.
When writing data with the fields [receive, transmit, data_xfer] = [0,0,1]
to the Generic FIFO, and configuring the correct SPI mode, the controller
will transmit dummy circles.

So let's switch to hardware dummy cycles transfer to fix this issue.

Fixes: 1c26372e5aa9 ("spi: spi-zynqmp-gqspi: Update driver to use spi-mem 
framework")
Signed-off-by: Quanyang Wang 
Reviewed-by: Amit Kumar Mahapatra 
---
 drivers/spi/spi-zynqmp-gqspi.c | 40 +++---
 1 file changed, 18 insertions(+), 22 deletions(-)

diff --git a/drivers/spi/spi-zynqmp-gqspi.c b/drivers/spi/spi-zynqmp-gqspi.c
index 3b39461d58b3..cf73a069b759 100644
--- a/drivers/spi/spi-zynqmp-gqspi.c
+++ b/drivers/spi/spi-zynqmp-gqspi.c
@@ -521,7 +521,7 @@ static void zynqmp_qspi_filltxfifo(struct zynqmp_qspi 
*xqspi, int size)
 {
u32 count = 0, intermediate;
 
-   while ((xqspi->bytes_to_transfer > 0) && (count < size)) {
+   while ((xqspi->bytes_to_transfer > 0) && (count < size) && 
(xqspi->txbuf)) {
memcpy(, xqspi->txbuf, 4);
zynqmp_gqspi_write(xqspi, GQSPI_TXD_OFST, intermediate);
 
@@ -580,7 +580,7 @@ static void zynqmp_qspi_fillgenfifo(struct zynqmp_qspi 
*xqspi, u8 nbits,
genfifoentry |= GQSPI_GENFIFO_DATA_XFER;
genfifoentry |= GQSPI_GENFIFO_TX;
transfer_len = xqspi->bytes_to_transfer;
-   } else {
+   } else if (xqspi->rxbuf) {
genfifoentry &= ~GQSPI_GENFIFO_TX;
genfifoentry |= GQSPI_GENFIFO_DATA_XFER;
genfifoentry |= GQSPI_GENFIFO_RX;
@@ -588,6 +588,11 @@ static void zynqmp_qspi_fillgenfifo(struct zynqmp_qspi 
*xqspi, u8 nbits,
transfer_len = xqspi->dma_rx_bytes;
else
transfer_len = xqspi->bytes_to_receive;
+   } else {
+   /* Sending dummy circles here */
+   genfifoentry &= ~(GQSPI_GENFIFO_TX | GQSPI_GENFIFO_RX);
+   genfifoentry |= GQSPI_GENFIFO_DATA_XFER;
+   transfer_len = xqspi->bytes_to_transfer;
}
genfifoentry |= zynqmp_qspi_selectspimode(xqspi, nbits);
xqspi->genfifoentry = genfifoentry;
@@ -1011,32 +1016,23 @@ static int zynqmp_qspi_exec_op(struct spi_mem *mem,
}
 
if (op->dummy.nbytes) {
-   tmpbuf = kzalloc(op->dummy.nbytes, GFP_KERNEL | GFP_DMA);
-   if (!tmpbuf)
-   return -ENOMEM;
-   memset(tmpbuf, 0xff, op->dummy.nbytes);
-   reinit_completion(>data_completion);
-   xqspi->txbuf = tmpbuf;
+   xqspi->txbuf = NULL;
xqspi->rxbuf = NULL;
-   xqspi->bytes_to_transfer = op->dummy.nbytes;
+   /*
+* xqspi->bytes_to_transfer here represents the dummy circles
+* which need to be sent.
+*/
+   xqspi->bytes_to_transfer = op->dummy.nbytes * 8 / 
op->dummy.buswidth;
xqspi->bytes_to_receive = 0;
-   zynqmp_qspi_write_op(xqspi, op->dummy.buswidth,
+   /*
+* Using op->data.buswidth instead of op->dummy.buswidth here 
because
+* we need to use it to configure the correct SPI mode.
+*/
+   zynqmp_qspi_write_op(xqspi, op->data.buswidth,
 genfifoentry);
zynqmp_gqspi_write(xqspi, GQSPI_CONFIG_OFST,
   zynqmp_gqspi_read(xqspi, GQSPI_CONFIG_OFST) |
   GQSPI_CFG_START_GEN_FIFO_MASK);
-   zynqmp_gqspi_write(xqspi, GQSPI_IER_OFST,
-  GQSPI_IER_TXEMPTY_MASK |
-  GQSPI_IER_GENFIFOEMPTY_MASK |
-  GQSPI_IER_TXNOT_FULL_MASK);
-   if (!wait_for_completion_interruptible_timeout
-   (>data_completion, msecs_to_jiffies(1000))) {
-   err = -ETIMEDOUT;
-   kfree(tmpbuf);
-   goto return_err;
-   }
-
-   kfree(tmpbuf);
}
 
if (op->data.nbytes) {
-- 
2.25.1

Re: [PATCH 5/8] dt-bindings: soc: mediatek: apusys: Add new document for APU power domain

2021-04-07 Thread Flora Fu

Hi, Rob, 

The error is resulted from some un-merged patch.

Please note that the patch depends MT8192 clock patches which haven't yet been 
accepted.
https://patchwork.kernel.org/project/linux-mediatek/patch/20210324104110.13383-7-chun-jie.c...@mediatek.com/

Thanks for your review.

On Wed, 2021-04-07 at 09:28 -0500, Rob Herring wrote:
> On Wed, 07 Apr 2021 11:28:03 +0800, Flora Fu wrote:
> > Document the bindings for APU power domain on MediaTek SoC.
> > 
> > Signed-off-by: Flora Fu 
> > ---
> >  .../soc/mediatek/mediatek,apu-pm.yaml | 146 ++
> >  1 file changed, 146 insertions(+)
> >  create mode 100644 
> > Documentation/devicetree/bindings/soc/mediatek/mediatek,apu-pm.yaml
> > 
> 
> My bot found errors running 'make DT_CHECKER_FLAGS=-m dt_binding_check'
> on your patch (DT_CHECKER_FLAGS is new in v5.13):
> 
> yamllint warnings/errors:
> 
> dtschema/dtc warnings/errors:
> Documentation/devicetree/bindings/soc/mediatek/mediatek,apu-pm.example.dts:19:18:
>  fatal error: dt-bindings/clock/mt8192-clk.h: No such file or directory
>19 | #include 
>   |  ^~~~
> compilation terminated.
> make[1]: *** [scripts/Makefile.lib:377: 
> Documentation/devicetree/bindings/soc/mediatek/mediatek,apu-pm.example.dt.yaml]
>  Error 1
> make[1]: *** Waiting for unfinished jobs
> make: *** [Makefile:1414: dt_binding_check] Error 2
> 
> See 
> https://urldefense.com/v3/__https://patchwork.ozlabs.org/patch/1463115__;!!CTRNKA9wMg0ARbw!0XUn1LcNHfvUShNClpM_yH73TAR9qdm29SZMckasoCQ8UzeKS-vZW0QUu3Ssn-s6$
>  
> 
> This check can fail if there are any dependencies. The base for a patch
> series is generally the most recent rc1.
> 
> If you already ran 'make dt_binding_check' and didn't see the above
> error(s), then make sure 'yamllint' is installed and dt-schema is up to
> date:
> 
> pip3 install dtschema --upgrade
> 
> Please check and re-submit.
>

[PATCH 2/4] spi: spi-zynqmp-gqspi: add mutex locking for exec_op

2021-04-07 Thread quanyang . wang

From: Quanyang Wang 

The spi-mem framework has no locking to prevent ctlr->mem_ops->exec_op
from concurrency. So add the locking to zynqmp_qspi_exec_op.

Fixes: 1c26372e5aa9 ("spi: spi-zynqmp-gqspi: Update driver to use spi-mem 
framework")
Signed-off-by: Quanyang Wang 
Reviewed-by: Amit Kumar Mahapatra 
---
 drivers/spi/spi-zynqmp-gqspi.c | 5 +
 1 file changed, 5 insertions(+)

diff --git a/drivers/spi/spi-zynqmp-gqspi.c b/drivers/spi/spi-zynqmp-gqspi.c
index d49ab6575553..3b39461d58b3 100644
--- a/drivers/spi/spi-zynqmp-gqspi.c
+++ b/drivers/spi/spi-zynqmp-gqspi.c
@@ -173,6 +173,7 @@ struct zynqmp_qspi {
u32 genfifoentry;
enum mode_type mode;
struct completion data_completion;
+   struct mutex op_lock;
 };
 
 /**
@@ -951,6 +952,7 @@ static int zynqmp_qspi_exec_op(struct spi_mem *mem,
op->cmd.opcode, op->cmd.buswidth, op->addr.buswidth,
op->dummy.buswidth, op->data.buswidth);
 
+   mutex_lock(>op_lock);
zynqmp_qspi_config_op(xqspi, mem->spi);
zynqmp_qspi_chipselect(mem->spi, false);
genfifoentry |= xqspi->genfifocs;
@@ -1084,6 +1086,7 @@ static int zynqmp_qspi_exec_op(struct spi_mem *mem,
 return_err:
 
zynqmp_qspi_chipselect(mem->spi, true);
+   mutex_unlock(>op_lock);
 
return err;
 }
@@ -1156,6 +1159,8 @@ static int zynqmp_qspi_probe(struct platform_device *pdev)
goto clk_dis_pclk;
}
 
+   mutex_init(>op_lock);
+
pm_runtime_use_autosuspend(>dev);
pm_runtime_set_autosuspend_delay(>dev, SPI_AUTOSUSPEND_TIMEOUT);
pm_runtime_set_active(>dev);
-- 
2.25.1

[PATCH 1/4] spi: spi-zynqmp-gqspi: use wait_for_completion_timeout to make zynqmp_qspi_exec_op not interruptible

2021-04-07 Thread quanyang . wang

From: Quanyang Wang 

When Ctrl+C occurs during the process of zynqmp_qspi_exec_op, the function
wait_for_completion_interruptible_timeout will return a non-zero value
-ERESTARTSYS immediately. This will disrupt the SPI memory operation
because the data transmitting may begin before the command or address
transmitting completes. Use wait_for_completion_timeout to prevent
the process from being interruptible.

This patch fixes the error as below:
root@xilinx-zynqmp:~# flash_erase /dev/mtd3 0 0
Erasing 4 Kibyte @ 3d000 --  4 % complete
(Press Ctrl+C)
[  169.581911] zynqmp-qspi ff0f.spi: Chip select timed out
[  170.585907] zynqmp-qspi ff0f.spi: Chip select timed out
[  171.589910] zynqmp-qspi ff0f.spi: Chip select timed out
[  172.593910] zynqmp-qspi ff0f.spi: Chip select timed out
[  173.597907] zynqmp-qspi ff0f.spi: Chip select timed out
[  173.603480] spi-nor spi0.0: Erase operation failed.
[  173.608368] spi-nor spi0.0: Attempted to modify a protected sector.

Fixes: 1c26372e5aa9 ("spi: spi-zynqmp-gqspi: Update driver to use spi-mem 
framework")
Signed-off-by: Quanyang Wang 
Reviewed-by: Amit Kumar Mahapatra 
---
 drivers/spi/spi-zynqmp-gqspi.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/drivers/spi/spi-zynqmp-gqspi.c b/drivers/spi/spi-zynqmp-gqspi.c
index c8fa6ee18ae7..d49ab6575553 100644
--- a/drivers/spi/spi-zynqmp-gqspi.c
+++ b/drivers/spi/spi-zynqmp-gqspi.c
@@ -973,7 +973,7 @@ static int zynqmp_qspi_exec_op(struct spi_mem *mem,
zynqmp_gqspi_write(xqspi, GQSPI_IER_OFST,
   GQSPI_IER_GENFIFOEMPTY_MASK |
   GQSPI_IER_TXNOT_FULL_MASK);
-   if (!wait_for_completion_interruptible_timeout
+   if (!wait_for_completion_timeout
(>data_completion, msecs_to_jiffies(1000))) {
err = -ETIMEDOUT;
kfree(tmpbuf);
@@ -1001,7 +1001,7 @@ static int zynqmp_qspi_exec_op(struct spi_mem *mem,
   GQSPI_IER_TXEMPTY_MASK |
   GQSPI_IER_GENFIFOEMPTY_MASK |
   GQSPI_IER_TXNOT_FULL_MASK);
-   if (!wait_for_completion_interruptible_timeout
+   if (!wait_for_completion_timeout
(>data_completion, msecs_to_jiffies(1000))) {
err = -ETIMEDOUT;
goto return_err;
@@ -1076,7 +1076,7 @@ static int zynqmp_qspi_exec_op(struct spi_mem *mem,
   GQSPI_IER_RXEMPTY_MASK);
}
}
-   if (!wait_for_completion_interruptible_timeout
+   if (!wait_for_completion_timeout
(>data_completion, msecs_to_jiffies(1000)))
err = -ETIMEDOUT;
}
-- 
2.25.1

[PATCH 0/4] spi: spi-zynqmp-gpspi: fix some issues

2021-04-07 Thread quanyang . wang

From: Quanyang Wang 

Hello,

This series fix some issues that occurs when the gqspi driver switches to 
spi-mem framework.

Hi Amit,
I rewrite the "Subject" and "commit message" of these patches, so they
look different from the ones which you reviewed before. I still keep
your "Reviewed-by" and hope you will not mind.

Regards,
Quanyang Wang

Quanyang Wang (4):
  spi: spi-zynqmp-gqspi: use wait_for_completion_timeout to make
zynqmp_qspi_exec_op not interruptible
  spi: spi-zynqmp-gqspi: add mutex locking for exec_op
  spi: spi-zynqmp-gqspi: transmit dummy circles by using the
controller's internal functionality
  spi: spi-zynqmp-gqspi: fix incorrect operating mode in
zynqmp_qspi_read_op

 drivers/spi/spi-zynqmp-gqspi.c | 53 +-
 1 file changed, 27 insertions(+), 26 deletions(-)

-- 
2.25.1

linux-next: build failure after merge of the bluetooth tree

2021-04-07 Thread Stephen Rothwell

Hi all,

After merging the bluetooth tree, today's linux-next build (x86_64
allmodconfig) failed like this:

In file included from :32:
./usr/include/linux/virtio_bt.h:1:1: error: C++ style comments are not allowed 
in ISO C90
1 | // SPDX-License-Identifier: BSD-3-Clause
  | ^
./usr/include/linux/virtio_bt.h:1:1: note: (this will be reported only once per 
input file)

Caused by commit

  148a48f61393 ("Bluetooth: Add support for virtio transport driver")

I have used the bluetooth tree from next-20210407 for today.

-- 
Cheers,
Stephen Rothwell


pgpt9Z2bR_E0e.pgp
Description: OpenPGP digital signature

[PATCH -next] powerpc/mce: Make symbol 'mce_ue_event_work' static

2021-04-07 Thread Li Huafei

The sparse tool complains as follows:

arch/powerpc/kernel/mce.c:43:1: warning:
 symbol 'mce_ue_event_work' was not declared. Should it be static?

This symbol is not used outside of mce.c, so this commit marks it
static.

Signed-off-by: Li Huafei 
---
 arch/powerpc/kernel/mce.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/powerpc/kernel/mce.c b/arch/powerpc/kernel/mce.c
index 11f0cae086ed..6aa6b1cda1ed 100644
--- a/arch/powerpc/kernel/mce.c
+++ b/arch/powerpc/kernel/mce.c
@@ -40,7 +40,7 @@ static struct irq_work mce_ue_event_irq_work = {
.func = machine_check_ue_irq_work,
 };
 
-DECLARE_WORK(mce_ue_event_work, machine_process_ue_event);
+static DECLARE_WORK(mce_ue_event_work, machine_process_ue_event);
 
 static BLOCKING_NOTIFIER_HEAD(mce_notifier_list);
 
-- 
2.17.1

[PATCH v3 6/6] percpu: implement partial chunk depopulation

2021-04-07 Thread Roman Gushchin

This patch implements partial depopulation of percpu chunks.

As now, a chunk can be depopulated only as a part of the final
destruction, if there are no more outstanding allocations. However
to minimize a memory waste it might be useful to depopulate a
partially filed chunk, if a small number of outstanding allocations
prevents the chunk from being fully reclaimed.

This patch implements the following depopulation process: it scans
over the chunk pages, looks for a range of empty and populated pages
and performs the depopulation. To avoid races with new allocations,
the chunk is previously isolated. After the depopulation the chunk is
sidelined to a special list or freed. New allocations can't be served
using a sidelined chunk. The chunk can be moved back to a corresponding
slot if there are not enough chunks with empty populated pages.

The depopulation is scheduled on the free path. Is the chunk:
  1) has more than 1/4 of total pages free and populated
  2) the system has enough free percpu pages aside of this chunk
  3) isn't the reserved chunk
  4) isn't the first chunk
  5) isn't entirely free
it's a good target for depopulation. If it's already depopulated
but got free populated pages, it's a good target too.
The chunk is moved to a special pcpu_depopulate_list, chunk->isolate
flag is set and the async balancing is scheduled.

The async balancing moves pcpu_depopulate_list to a local list
(because pcpu_depopulate_list can be changed when pcpu_lock is
releases), and then tries to depopulate each chunk.  The depopulation
is performed in the reverse direction to keep populated pages close to
the beginning, if the global number of empty pages is reached.
Depopulated chunks are sidelined to prevent further allocations.
Skipped and fully empty chunks are returned to the corresponding slot.

On the allocation path, if there are no suitable chunks found,
the list of sidelined chunks in scanned prior to creating a new chunk.
If there is a good sidelined chunk, it's placed back to the slot
and the scanning is restarted.

Many thanks to Dennis Zhou for his great ideas and a very constructive
discussion which led to many improvements in this patchset!

Signed-off-by: Roman Gushchin 
---
 mm/percpu-internal.h |   2 +
 mm/percpu.c  | 158 ++-
 2 files changed, 158 insertions(+), 2 deletions(-)

diff --git a/mm/percpu-internal.h b/mm/percpu-internal.h
index 095d7eaa0db4..8e432663c41e 100644
--- a/mm/percpu-internal.h
+++ b/mm/percpu-internal.h
@@ -67,6 +67,8 @@ struct pcpu_chunk {
 
void*data;  /* chunk data */
boolimmutable;  /* no [de]population allowed */
+   boolisolated;   /* isolated from chunk slot 
lists */
+   booldepopulated;/* sidelined after depopulation 
*/
int start_offset;   /* the overlap with the previous
   region to have a page aligned
   base_addr */
diff --git a/mm/percpu.c b/mm/percpu.c
index 357fd6994278..5bb294e394b3 100644
--- a/mm/percpu.c
+++ b/mm/percpu.c
@@ -181,6 +181,19 @@ static LIST_HEAD(pcpu_map_extend_chunks);
  */
 int pcpu_nr_empty_pop_pages[PCPU_NR_CHUNK_TYPES];
 
+/*
+ * List of chunks with a lot of free pages.  Used to depopulate them
+ * asynchronously.
+ */
+static struct list_head pcpu_depopulate_list[PCPU_NR_CHUNK_TYPES];
+
+/*
+ * List of previously depopulated chunks.  They are not usually used for new
+ * allocations, but can be returned back to service if a need arises.
+ */
+static struct list_head pcpu_sideline_list[PCPU_NR_CHUNK_TYPES];
+
+
 /*
  * The number of populated pages in use by the allocator, protected by
  * pcpu_lock.  This number is kept per a unit per chunk (i.e. when a page gets
@@ -562,6 +575,12 @@ static void pcpu_chunk_relocate(struct pcpu_chunk *chunk, 
int oslot)
 {
int nslot = pcpu_chunk_slot(chunk);
 
+   /*
+* Keep isolated and depopulated chunks on a sideline.
+*/
+   if (chunk->isolated || chunk->depopulated)
+   return;
+
if (oslot != nslot)
__pcpu_chunk_move(chunk, nslot, oslot < nslot);
 }
@@ -1790,6 +1809,19 @@ static void __percpu *pcpu_alloc(size_t size, size_t 
align, bool reserved,
}
}
 
+   /* search through sidelined depopulated chunks */
+   list_for_each_entry(chunk, _sideline_list[type], list) {
+   /*
+* If the allocation can fit the chunk, place the chunk back
+* into corresponding slot and restart the scanning.
+*/
+   if (pcpu_check_chunk_hint(>chunk_md, bits, bit_align)) {
+   chunk->depopulated = false;
+   pcpu_chunk_relocate(chunk, -1);
+   goto restart;
+   }
+   }
+

[PATCH v3 0/6] percpu: partial chunk depopulation

2021-04-07 Thread Roman Gushchin

In our production experience the percpu memory allocator is sometimes struggling
with returning the memory to the system. A typical example is a creation of
several thousands memory cgroups (each has several chunks of the percpu data
used for vmstats, vmevents, ref counters etc). Deletion and complete releasing
of these cgroups doesn't always lead to a shrinkage of the percpu memory,
so that sometimes there are several GB's of memory wasted.

The underlying problem is the fragmentation: to release an underlying chunk
all percpu allocations should be released first. The percpu allocator tends
to top up chunks to improve the utilization. It means new small-ish allocations
(e.g. percpu ref counters) are placed onto almost filled old-ish chunks,
effectively pinning them in memory.

This patchset solves this problem by implementing a partial depopulation
of percpu chunks: chunks with many empty pages are being asynchronously
depopulated and the pages are returned to the system.

To illustrate the problem the following script can be used:

--
#!/bin/bash

cd /sys/fs/cgroup

mkdir percpu_test
echo "+memory" > percpu_test/cgroup.subtree_control

cat /proc/meminfo | grep Percpu

for i in `seq 1 1000`; do
mkdir percpu_test/cg_"${i}"
for j in `seq 1 10`; do
mkdir percpu_test/cg_"${i}"_"${j}"
done
done

cat /proc/meminfo | grep Percpu

for i in `seq 1 1000`; do
for j in `seq 1 10`; do
rmdir percpu_test/cg_"${i}"_"${j}"
done
done

sleep 10

cat /proc/meminfo | grep Percpu

for i in `seq 1 1000`; do
rmdir percpu_test/cg_"${i}"
done

rmdir percpu_test
--

It creates 11000 memory cgroups and removes every 10 out of 11.
It prints the initial size of the percpu memory, the size after
creating all cgroups and the size after deleting most of them.

Results:
  vanilla:
./percpu_test.sh
Percpu: 7488 kB
Percpu:   481152 kB
Percpu:   481152 kB

  with this patchset applied:
./percpu_test.sh
Percpu: 7488 kB
Percpu:   481408 kB
Percpu:   135552 kB

So the total size of the percpu memory was reduced by more than 3.5 times.

v3:
  - introduced pcpu_check_chunk_hint()
  - fixed a bug related to the hint check
  - minor cosmetic changes
  - s/pretends/fixes (cc Vlastimil)

v2:
  - depopulated chunks are sidelined
  - depopulation happens in the reverse order
  - depopulate list made per-chunk type
  - better results due to better heuristics

v1:
  - depopulation heuristics changed and optimized
  - chunks are put into a separate list, depopulation scan this list
  - chunk->isolated is introduced, chunk->depopulate is dropped
  - rearranged patches a bit
  - fixed a panic discovered by krobot
  - made pcpu_nr_empty_pop_pages per chunk type
  - minor fixes

rfc:
  https://lwn.net/Articles/850508/


Roman Gushchin (6):
  percpu: fix a comment about the chunks ordering
  percpu: split __pcpu_balance_workfn()
  percpu: make pcpu_nr_empty_pop_pages per chunk type
  percpu: generalize pcpu_balance_populated()
  percpu: factor out pcpu_check_chunk_hint()
  percpu: implement partial chunk depopulation

 mm/percpu-internal.h |   4 +-
 mm/percpu-stats.c|   9 +-
 mm/percpu.c  | 306 +++
 3 files changed, 261 insertions(+), 58 deletions(-)

-- 
2.30.2

[PATCH v3 4/6] percpu: generalize pcpu_balance_populated()

2021-04-07 Thread Roman Gushchin

To prepare for the depopulation of percpu chunks, split out the
populating part of the pcpu_balance_populated() into the new
pcpu_grow_populated() (with an intention to add
pcpu_shrink_populated() in the next commit).

The goal of pcpu_balance_populated() is to determine whether
there is a shortage or an excessive amount of empty percpu pages
and call into the corresponding function.

pcpu_grow_populated() takes a desired number of pages as an argument
(nr_to_pop). If it creates a new chunk, nr_to_pop should be updated
to reflect that the new chunk could be created already populated.
Otherwise an infinite loop might appear.

Signed-off-by: Roman Gushchin 
---
 mm/percpu.c | 63 +
 1 file changed, 39 insertions(+), 24 deletions(-)

diff --git a/mm/percpu.c b/mm/percpu.c
index 61339b3d9337..e20119668c42 100644
--- a/mm/percpu.c
+++ b/mm/percpu.c
@@ -1979,7 +1979,7 @@ static void pcpu_balance_free(enum pcpu_chunk_type type)
 }
 
 /**
- * pcpu_balance_populated - manage the amount of populated pages
+ * pcpu_grow_populated - populate chunk(s) to satisfy atomic allocations
  * @type: chunk type
  *
  * Maintain a certain amount of populated pages to satisfy atomic allocations.
@@ -1988,35 +1988,15 @@ static void pcpu_balance_free(enum pcpu_chunk_type type)
  * allocation causes the failure as it is possible that requests can be
  * serviced from already backed regions.
  */
-static void pcpu_balance_populated(enum pcpu_chunk_type type)
+static void pcpu_grow_populated(enum pcpu_chunk_type type, int nr_to_pop)
 {
/* gfp flags passed to underlying allocators */
const gfp_t gfp = GFP_KERNEL | __GFP_NORETRY | __GFP_NOWARN;
struct list_head *pcpu_slot = pcpu_chunk_list(type);
struct pcpu_chunk *chunk;
-   int slot, nr_to_pop, ret;
+   int slot, ret;
 
-   /*
-* Ensure there are certain number of free populated pages for
-* atomic allocs.  Fill up from the most packed so that atomic
-* allocs don't increase fragmentation.  If atomic allocation
-* failed previously, always populate the maximum amount.  This
-* should prevent atomic allocs larger than PAGE_SIZE from keeping
-* failing indefinitely; however, large atomic allocs are not
-* something we support properly and can be highly unreliable and
-* inefficient.
-*/
 retry_pop:
-   if (pcpu_atomic_alloc_failed) {
-   nr_to_pop = PCPU_EMPTY_POP_PAGES_HIGH;
-   /* best effort anyway, don't worry about synchronization */
-   pcpu_atomic_alloc_failed = false;
-   } else {
-   nr_to_pop = clamp(PCPU_EMPTY_POP_PAGES_HIGH -
- pcpu_nr_empty_pop_pages[type],
- 0, PCPU_EMPTY_POP_PAGES_HIGH);
-   }
-
for (slot = pcpu_size_to_slot(PAGE_SIZE); slot < pcpu_nr_slots; slot++) 
{
unsigned int nr_unpop = 0, rs, re;
 
@@ -2060,12 +2040,47 @@ static void pcpu_balance_populated(enum pcpu_chunk_type 
type)
if (chunk) {
spin_lock_irq(_lock);
pcpu_chunk_relocate(chunk, -1);
+   nr_to_pop = max_t(int, 0, nr_to_pop - 
chunk->nr_populated);
spin_unlock_irq(_lock);
-   goto retry_pop;
+   if (nr_to_pop)
+   goto retry_pop;
}
}
 }
 
+/**
+ * pcpu_balance_populated - manage the amount of populated pages
+ * @type: chunk type
+ *
+ * Populate or depopulate chunks to maintain a certain amount
+ * of free pages to satisfy atomic allocations, but not waste
+ * large amounts of memory.
+ */
+static void pcpu_balance_populated(enum pcpu_chunk_type type)
+{
+   int nr_to_pop;
+
+   /*
+* Ensure there are certain number of free populated pages for
+* atomic allocs.  Fill up from the most packed so that atomic
+* allocs don't increase fragmentation.  If atomic allocation
+* failed previously, always populate the maximum amount.  This
+* should prevent atomic allocs larger than PAGE_SIZE from keeping
+* failing indefinitely; however, large atomic allocs are not
+* something we support properly and can be highly unreliable and
+* inefficient.
+*/
+   if (pcpu_atomic_alloc_failed) {
+   nr_to_pop = PCPU_EMPTY_POP_PAGES_HIGH;
+   /* best effort anyway, don't worry about synchronization */
+   pcpu_atomic_alloc_failed = false;
+   pcpu_grow_populated(type, nr_to_pop);
+   } else if (pcpu_nr_empty_pop_pages[type] < PCPU_EMPTY_POP_PAGES_HIGH) {
+   nr_to_pop = PCPU_EMPTY_POP_PAGES_HIGH - 
pcpu_nr_empty_pop_pages[type];
+   pcpu_grow_populated(type, nr_to_pop);
+   }
+}
+
 /**
  * pcpu_balance_workfn - manage the amount of free chunks and

[PATCH v3 3/6] percpu: make pcpu_nr_empty_pop_pages per chunk type

2021-04-07 Thread Roman Gushchin

nr_empty_pop_pages is used to guarantee that there are some free
populated pages to satisfy atomic allocations. Accounted and
non-accounted allocations are using separate sets of chunks,
so both need to have a surplus of empty pages.

This commit makes pcpu_nr_empty_pop_pages and the corresponding logic
per chunk type.

Signed-off-by: Roman Gushchin 
---
 mm/percpu-internal.h |  2 +-
 mm/percpu-stats.c|  9 +++--
 mm/percpu.c  | 14 +++---
 3 files changed, 15 insertions(+), 10 deletions(-)

diff --git a/mm/percpu-internal.h b/mm/percpu-internal.h
index 18b768ac7dca..095d7eaa0db4 100644
--- a/mm/percpu-internal.h
+++ b/mm/percpu-internal.h
@@ -87,7 +87,7 @@ extern spinlock_t pcpu_lock;
 
 extern struct list_head *pcpu_chunk_lists;
 extern int pcpu_nr_slots;
-extern int pcpu_nr_empty_pop_pages;
+extern int pcpu_nr_empty_pop_pages[];
 
 extern struct pcpu_chunk *pcpu_first_chunk;
 extern struct pcpu_chunk *pcpu_reserved_chunk;
diff --git a/mm/percpu-stats.c b/mm/percpu-stats.c
index c8400a2adbc2..f6026dbcdf6b 100644
--- a/mm/percpu-stats.c
+++ b/mm/percpu-stats.c
@@ -145,6 +145,7 @@ static int percpu_stats_show(struct seq_file *m, void *v)
int slot, max_nr_alloc;
int *buffer;
enum pcpu_chunk_type type;
+   int nr_empty_pop_pages;
 
 alloc_buffer:
spin_lock_irq(_lock);
@@ -165,7 +166,11 @@ static int percpu_stats_show(struct seq_file *m, void *v)
goto alloc_buffer;
}
 
-#define PL(X) \
+   nr_empty_pop_pages = 0;
+   for (type = 0; type < PCPU_NR_CHUNK_TYPES; type++)
+   nr_empty_pop_pages += pcpu_nr_empty_pop_pages[type];
+
+#define PL(X)  \
seq_printf(m, "  %-20s: %12lld\n", #X, (long long int)pcpu_stats_ai.X)
 
seq_printf(m,
@@ -196,7 +201,7 @@ static int percpu_stats_show(struct seq_file *m, void *v)
PU(nr_max_chunks);
PU(min_alloc_size);
PU(max_alloc_size);
-   P("empty_pop_pages", pcpu_nr_empty_pop_pages);
+   P("empty_pop_pages", nr_empty_pop_pages);
seq_putc(m, '\n');
 
 #undef PU
diff --git a/mm/percpu.c b/mm/percpu.c
index 7e31e1b8725f..61339b3d9337 100644
--- a/mm/percpu.c
+++ b/mm/percpu.c
@@ -176,10 +176,10 @@ struct list_head *pcpu_chunk_lists __ro_after_init; /* 
chunk list slots */
 static LIST_HEAD(pcpu_map_extend_chunks);
 
 /*
- * The number of empty populated pages, protected by pcpu_lock.  The
- * reserved chunk doesn't contribute to the count.
+ * The number of empty populated pages by chunk type, protected by pcpu_lock.
+ * The reserved chunk doesn't contribute to the count.
  */
-int pcpu_nr_empty_pop_pages;
+int pcpu_nr_empty_pop_pages[PCPU_NR_CHUNK_TYPES];
 
 /*
  * The number of populated pages in use by the allocator, protected by
@@ -559,7 +559,7 @@ static inline void pcpu_update_empty_pages(struct 
pcpu_chunk *chunk, int nr)
 {
chunk->nr_empty_pop_pages += nr;
if (chunk != pcpu_reserved_chunk)
-   pcpu_nr_empty_pop_pages += nr;
+   pcpu_nr_empty_pop_pages[pcpu_chunk_type(chunk)] += nr;
 }
 
 /*
@@ -1835,7 +1835,7 @@ static void __percpu *pcpu_alloc(size_t size, size_t 
align, bool reserved,
mutex_unlock(_alloc_mutex);
}
 
-   if (pcpu_nr_empty_pop_pages < PCPU_EMPTY_POP_PAGES_LOW)
+   if (pcpu_nr_empty_pop_pages[type] < PCPU_EMPTY_POP_PAGES_LOW)
pcpu_schedule_balance_work();
 
/* clear the areas and return address relative to base address */
@@ -2013,7 +2013,7 @@ static void pcpu_balance_populated(enum pcpu_chunk_type 
type)
pcpu_atomic_alloc_failed = false;
} else {
nr_to_pop = clamp(PCPU_EMPTY_POP_PAGES_HIGH -
- pcpu_nr_empty_pop_pages,
+ pcpu_nr_empty_pop_pages[type],
  0, PCPU_EMPTY_POP_PAGES_HIGH);
}
 
@@ -2595,7 +2595,7 @@ void __init pcpu_setup_first_chunk(const struct 
pcpu_alloc_info *ai,
 
/* link the first chunk in */
pcpu_first_chunk = chunk;
-   pcpu_nr_empty_pop_pages = pcpu_first_chunk->nr_empty_pop_pages;
+   pcpu_nr_empty_pop_pages[PCPU_CHUNK_ROOT] = 
pcpu_first_chunk->nr_empty_pop_pages;
pcpu_chunk_relocate(pcpu_first_chunk, -1);
 
/* include all regions of the first chunk */
-- 
2.30.2

[PATCH v3 5/6] percpu: factor out pcpu_check_chunk_hint()

2021-04-07 Thread Roman Gushchin

Factor out the pcpu_check_chunk_hint() helper, which will be useful
in the future. The new function checks if the allocation can likely
fit the given chunk.

Signed-off-by: Roman Gushchin 
---
 mm/percpu.c | 30 +-
 1 file changed, 21 insertions(+), 9 deletions(-)

diff --git a/mm/percpu.c b/mm/percpu.c
index e20119668c42..357fd6994278 100644
--- a/mm/percpu.c
+++ b/mm/percpu.c
@@ -306,6 +306,26 @@ static unsigned long pcpu_block_off_to_off(int index, int 
off)
return index * PCPU_BITMAP_BLOCK_BITS + off;
 }
 
+/**
+ * pcpu_check_chunk_hint - check that allocation can fit a chunk
+ * @chunk_md: chunk's block
+ * @bits: size of request in allocation units
+ * @align: alignment of area (max PAGE_SIZE)
+ *
+ * Check to see if the allocation can fit in the chunk's contig hint.
+ * This is an optimization to prevent scanning by assuming if it
+ * cannot fit in the global hint, there is memory pressure and creating
+ * a new chunk would happen soon.
+ */
+static bool pcpu_check_chunk_hint(struct pcpu_block_md *chunk_md, int bits,
+ size_t align)
+{
+   int bit_off = ALIGN(chunk_md->contig_hint_start, align) -
+   chunk_md->contig_hint_start;
+
+   return bit_off + bits <= chunk_md->contig_hint;
+}
+
 /*
  * pcpu_next_hint - determine which hint to use
  * @block: block of interest
@@ -1065,15 +1085,7 @@ static int pcpu_find_block_fit(struct pcpu_chunk *chunk, 
int alloc_bits,
struct pcpu_block_md *chunk_md = >chunk_md;
int bit_off, bits, next_off;
 
-   /*
-* Check to see if the allocation can fit in the chunk's contig hint.
-* This is an optimization to prevent scanning by assuming if it
-* cannot fit in the global hint, there is memory pressure and creating
-* a new chunk would happen soon.
-*/
-   bit_off = ALIGN(chunk_md->contig_hint_start, align) -
- chunk_md->contig_hint_start;
-   if (bit_off + alloc_bits > chunk_md->contig_hint)
+   if (!pcpu_check_chunk_hint(chunk_md, alloc_bits, align))
return -1;
 
bit_off = pcpu_next_hint(chunk_md, alloc_bits);
-- 
2.30.2

[PATCH v3 2/6] percpu: split __pcpu_balance_workfn()

2021-04-07 Thread Roman Gushchin

__pcpu_balance_workfn() became fairly big and hard to follow, but in
fact it consists of two fully independent parts, responsible for
the destruction of excessive free chunks and population of necessarily
amount of free pages.

In order to simplify the code and prepare for adding of a new
functionality, split it in two functions:

  1) pcpu_balance_free,
  2) pcpu_balance_populated.

Move the taking/releasing of the pcpu_alloc_mutex to an upper level
to keep the current synchronization in place.

Signed-off-by: Roman Gushchin 
Reviewed-by: Dennis Zhou 
---
 mm/percpu.c | 46 +-
 1 file changed, 29 insertions(+), 17 deletions(-)

diff --git a/mm/percpu.c b/mm/percpu.c
index 2f27123bb489..7e31e1b8725f 100644
--- a/mm/percpu.c
+++ b/mm/percpu.c
@@ -1933,31 +1933,22 @@ void __percpu *__alloc_reserved_percpu(size_t size, 
size_t align)
 }
 
 /**
- * __pcpu_balance_workfn - manage the amount of free chunks and populated pages
+ * pcpu_balance_free - manage the amount of free chunks
  * @type: chunk type
  *
- * Reclaim all fully free chunks except for the first one.  This is also
- * responsible for maintaining the pool of empty populated pages.  However,
- * it is possible that this is called when physical memory is scarce causing
- * OOM killer to be triggered.  We should avoid doing so until an actual
- * allocation causes the failure as it is possible that requests can be
- * serviced from already backed regions.
+ * Reclaim all fully free chunks except for the first one.
  */
-static void __pcpu_balance_workfn(enum pcpu_chunk_type type)
+static void pcpu_balance_free(enum pcpu_chunk_type type)
 {
-   /* gfp flags passed to underlying allocators */
-   const gfp_t gfp = GFP_KERNEL | __GFP_NORETRY | __GFP_NOWARN;
LIST_HEAD(to_free);
struct list_head *pcpu_slot = pcpu_chunk_list(type);
struct list_head *free_head = _slot[pcpu_nr_slots - 1];
struct pcpu_chunk *chunk, *next;
-   int slot, nr_to_pop, ret;
 
/*
 * There's no reason to keep around multiple unused chunks and VM
 * areas can be scarce.  Destroy all free chunks except for one.
 */
-   mutex_lock(_alloc_mutex);
spin_lock_irq(_lock);
 
list_for_each_entry_safe(chunk, next, free_head, list) {
@@ -1985,6 +1976,25 @@ static void __pcpu_balance_workfn(enum pcpu_chunk_type 
type)
pcpu_destroy_chunk(chunk);
cond_resched();
}
+}
+
+/**
+ * pcpu_balance_populated - manage the amount of populated pages
+ * @type: chunk type
+ *
+ * Maintain a certain amount of populated pages to satisfy atomic allocations.
+ * It is possible that this is called when physical memory is scarce causing
+ * OOM killer to be triggered.  We should avoid doing so until an actual
+ * allocation causes the failure as it is possible that requests can be
+ * serviced from already backed regions.
+ */
+static void pcpu_balance_populated(enum pcpu_chunk_type type)
+{
+   /* gfp flags passed to underlying allocators */
+   const gfp_t gfp = GFP_KERNEL | __GFP_NORETRY | __GFP_NOWARN;
+   struct list_head *pcpu_slot = pcpu_chunk_list(type);
+   struct pcpu_chunk *chunk;
+   int slot, nr_to_pop, ret;
 
/*
 * Ensure there are certain number of free populated pages for
@@ -2054,22 +2064,24 @@ static void __pcpu_balance_workfn(enum pcpu_chunk_type 
type)
goto retry_pop;
}
}
-
-   mutex_unlock(_alloc_mutex);
 }
 
 /**
  * pcpu_balance_workfn - manage the amount of free chunks and populated pages
  * @work: unused
  *
- * Call __pcpu_balance_workfn() for each chunk type.
+ * Call pcpu_balance_free() and pcpu_balance_populated() for each chunk type.
  */
 static void pcpu_balance_workfn(struct work_struct *work)
 {
enum pcpu_chunk_type type;
 
-   for (type = 0; type < PCPU_NR_CHUNK_TYPES; type++)
-   __pcpu_balance_workfn(type);
+   for (type = 0; type < PCPU_NR_CHUNK_TYPES; type++) {
+   mutex_lock(_alloc_mutex);
+   pcpu_balance_free(type);
+   pcpu_balance_populated(type);
+   mutex_unlock(_alloc_mutex);
+   }
 }
 
 /**
-- 
2.30.2

[PATCH v3 1/6] percpu: fix a comment about the chunks ordering

2021-04-07 Thread Roman Gushchin

Since the commit 3e54097beb22 ("percpu: manage chunks based on
contig_bits instead of free_bytes") chunks are sorted based on the
size of the biggest continuous free area instead of the total number
of free bytes. Update the corresponding comment to reflect this.

Signed-off-by: Roman Gushchin 
---
 mm/percpu.c | 5 -
 1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/mm/percpu.c b/mm/percpu.c
index 6596a0a4286e..2f27123bb489 100644
--- a/mm/percpu.c
+++ b/mm/percpu.c
@@ -99,7 +99,10 @@
 
 #include "percpu-internal.h"
 
-/* the slots are sorted by free bytes left, 1-31 bytes share the same slot */
+/*
+ * The slots are sorted by the size of the biggest continuous free area.
+ * 1-31 bytes share the same slot.
+ */
 #define PCPU_SLOT_BASE_SHIFT   5
 /* chunks in slots below this are subject to being sidelined on failed alloc */
 #define PCPU_SLOT_FAIL_THRESHOLD   3
-- 
2.30.2

[PATCH-next] powerpc/interrupt: Remove duplicate header file

2021-04-07 Thread johnny.chenyi

From: Chen Yi 

Delete one of the header files  that are included
twice.

Signed-off-by: Chen Yi 
---
 arch/powerpc/kernel/interrupt.c | 1 -
 1 file changed, 1 deletion(-)

diff --git a/arch/powerpc/kernel/interrupt.c b/arch/powerpc/kernel/interrupt.c
index c4dd4b8f9cfa..f64ace0208b7 100644
--- a/arch/powerpc/kernel/interrupt.c
+++ b/arch/powerpc/kernel/interrupt.c
@@ -7,7 +7,6 @@
 #include 
 #include 
 #include 
-#include 
 #include 
 #include 
 #include 
-- 
2.31.0

[PATCH -next v2] drm/bridge: lt8912b: Add header file

2021-04-07 Thread Zhang Jianhua

If CONFIG_DRM_LONTIUM_LT8912B=m, the following errors will be seen while
compiling lontium-lt8912b.c

drivers/gpu/drm/bridge/lontium-lt8912b.c: In function
‘lt8912_hard_power_on’:
drivers/gpu/drm/bridge/lontium-lt8912b.c:252:2: error: implicit
declaration of function ‘gpiod_set_value_cansleep’; did you mean
‘gpio_set_value_cansleep’? [-Werror=implicit-function-declaration]
  gpiod_set_value_cansleep(lt->gp_reset, 0);
  ^~~~
  gpio_set_value_cansleep
drivers/gpu/drm/bridge/lontium-lt8912b.c: In function ‘lt8912_parse_dt’:
drivers/gpu/drm/bridge/lontium-lt8912b.c:628:13: error: implicit
declaration of function ‘devm_gpiod_get_optional’; did you mean
‘devm_gpio_request_one’? [-Werror=implicit-function-declaration]
  gp_reset = devm_gpiod_get_optional(dev, "reset", GPIOD_OUT_HIGH);
 ^~~
 devm_gpio_request_one
drivers/gpu/drm/bridge/lontium-lt8912b.c:628:51: error: ‘GPIOD_OUT_HIGH’
undeclared (first use in this function); did you mean ‘GPIOF_INIT_HIGH’?
  gp_reset = devm_gpiod_get_optional(dev, "reset", GPIOD_OUT_HIGH);
   ^~
   GPIOF_INIT_HIGH

Signed-off-by: Zhang Jianhua 
---
v2:
- add header file  for lontium-lt8912b.c instead
  of add config dependence for CONFIG_DRM_LONTIUM_LT8912B
---
 drivers/gpu/drm/bridge/lontium-lt8912b.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/drivers/gpu/drm/bridge/lontium-lt8912b.c 
b/drivers/gpu/drm/bridge/lontium-lt8912b.c
index 61491615bad0..79845b3b19e1 100644
--- a/drivers/gpu/drm/bridge/lontium-lt8912b.c
+++ b/drivers/gpu/drm/bridge/lontium-lt8912b.c
@@ -3,6 +3,7 @@
  * Copyright (c) 2018, The Linux Foundation. All rights reserved.
  */
 
+#include 
 #include 
 #include 
 #include 
-- 
2.17.1

Re: [External] linux-next: manual merge of the net-next tree with the bpf tree

2021-04-07 Thread Cong Wang .

On Wed, Apr 7, 2021 at 8:11 PM Stephen Rothwell  wrote:
>
> Hi all,
>
> Today's linux-next merge of the net-next tree got a conflict in:
>
>   net/core/skmsg.c
>
> between commit:
>
>   144748eb0c44 ("bpf, sockmap: Fix incorrect fwd_alloc accounting")
>
> from the bpf tree and commit:
>
>   e3526bb92a20 ("skmsg: Move sk_redir from TCP_SKB_CB to skb")
>
> from the net-next tree.
>
> I fixed it up (I think - see below) and can carry the fix as
> necessary. This is now fixed as far as linux-next is concerned, but any
> non trivial conflicts should be mentioned to your upstream maintainer
> when your tree is submitted for merging.  You may also want to consider
> cooperating with the maintainer of the conflicting tree to minimise any
> particularly complex conflicts.

Looks good from my quick glance.

Thanks!

[rcu:dev.2021.04.02a 73/73] ia64-linux-ld: undefined reference to `rcu_spawn_one_boost_kthread'

2021-04-07 Thread kernel test robot

tree:   https://git.kernel.org/pub/scm/linux/kernel/git/paulmck/linux-rcu.git 
dev.2021.04.02a
head:   4bc4fd6b7e87ff0bdb1aa2493af85be2784717c0
commit: 4bc4fd6b7e87ff0bdb1aa2493af85be2784717c0 [73/73] rcu: Fix RCU priority 
boosting and add more debug output
config: ia64-defconfig (attached as .config)
compiler: ia64-linux-gcc (GCC) 9.3.0
reproduce (this is a W=1 build):
wget 
https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O 
~/bin/make.cross
chmod +x ~/bin/make.cross
# 
https://git.kernel.org/pub/scm/linux/kernel/git/paulmck/linux-rcu.git/commit/?id=4bc4fd6b7e87ff0bdb1aa2493af85be2784717c0
git remote add rcu 
https://git.kernel.org/pub/scm/linux/kernel/git/paulmck/linux-rcu.git
git fetch --no-tags rcu dev.2021.04.02a
git checkout 4bc4fd6b7e87ff0bdb1aa2493af85be2784717c0
# save the attached .config to linux build tree
COMPILER_INSTALL_PATH=$HOME/0day COMPILER=gcc-9.3.0 make.cross 
ARCH=ia64 

If you fix the issue, kindly add following tag as appropriate
Reported-by: kernel test robot 

All errors (new ones prefixed by >>):

   ia64-linux-ld: kernel/rcu/tree.o: in function `rcutree_online_cpu':
   (.text+0xf922): undefined reference to `rcu_spawn_one_boost_kthread'
>> ia64-linux-ld: (.text+0xf9a2): undefined reference to 
>> `rcu_spawn_one_boost_kthread'
   ia64-linux-ld: (.text+0xfa32): undefined reference to 
`rcu_spawn_one_boost_kthread'

---
0-DAY CI Kernel Test Service, Intel Corporation
https://lists.01.org/hyperkitty/list/kbuild-...@lists.01.org


.config.gz
Description: application/gzip

Re: [External] linux-next: manual merge of the net-next tree with the bpf tree

2021-04-07 Thread Cong Wang .

On Wed, Apr 7, 2021 at 8:02 PM Stephen Rothwell  wrote:
>
> Hi all,
>
> Today's linux-next merge of the net-next tree got a conflict in:
>
>   include/linux/skmsg.h
>
> between commit:
>
>   1c84b33101c8 ("bpf, sockmap: Fix sk->prot unhash op reset")
>
> from the bpf tree and commit:
>
>   8a59f9d1e3d4 ("sock: Introduce sk->sk_prot->psock_update_sk_prot()")
>
> from the net-next tree.
>
> I didn't know how to fixed it up so I just used the latter version or
> today - a better solution would be appreciated. This is now fixed as
> far as linux-next is concerned, but any non trivial conflicts should be
> mentioned to your upstream maintainer when your tree is submitted for
> merging.  You may also want to consider cooperating with the maintainer
> of the conflicting tree to minimise any particularly complex conflicts.

The right way to resolve this is to move the lines added in commit
1c84b33101c8 to the similar place in tcp_bpf_update_proto().

Thanks.

RE: [PATCH v16 1/2] scsi: ufs: Enable power management for wlun

2021-04-07 Thread Daejun Park

Hi Asutosh Das,

>+static inline bool is_rpmb_wlun(struct scsi_device *sdev)
>+{
>+return (sdev->lun == 
>ufshcd_upiu_wlun_to_scsi_wlun(UFS_UPIU_RPMB_WLUN));
>+}
>+
>+static inline bool is_device_wlun(struct scsi_device *sdev)
>+{
>+return (sdev->lun ==
>+ufshcd_upiu_wlun_to_scsi_wlun(UFS_UPIU_UFS_DEVICE_WLUN));
>+}
>+
> static void ufshcd_init_lrb(struct ufs_hba *hba, struct ufshcd_lrb *lrb, int 
> i)
> {
> struct utp_transfer_cmd_desc *cmd_descp = hba->ucdl_base_addr;
>@@ -4099,11 +4113,11 @@ void ufshcd_auto_hibern8_update(struct ufs_hba *hba, 
>u32 ahit)
> spin_unlock_irqrestore(hba->host->host_lock, flags);
> 
> if (update && !pm_runtime_suspended(hba->dev)) {

Could it be changed hba->sdev_ufs_device->sdev_gendev instead of hba->dev?

Thanks,
Daejun

[PATCHv2] mm/mmap.c: lines in __do_munmap repeat logic of inlined find_vma_intersection

2021-04-07 Thread Gonzalo Matias Juarez Tello

Some lines in __do_munmap used the same logic as find_vma_intersection
(which is inlined) instead of directly using that function.

(Can't believe I made a typo in the first one, compiled this one,
sorry first patch kinda nervous for some reason)

Signed-off-by: Gonzalo Matias Juarez Tello 
---
 mm/mmap.c | 7 +--
 1 file changed, 1 insertion(+), 6 deletions(-)

diff --git a/mm/mmap.c b/mm/mmap.c
index 3f287599a7a3..1b29f8bf8344 100644
--- a/mm/mmap.c
+++ b/mm/mmap.c
@@ -2823,15 +2823,10 @@ int __do_munmap(struct mm_struct *mm, unsigned long 
start, size_t len,
arch_unmap(mm, start, end);
 
/* Find the first overlapping VMA */
-   vma = find_vma(mm, start);
+   vma = find_vma_intersection(mm, start, end);
if (!vma)
return 0;
prev = vma->vm_prev;
-   /* we have  start < vma->vm_end  */
-
-   /* if it doesn't overlap, we have nothing.. */
-   if (vma->vm_start >= end)
-   return 0;
 
/*
 * If we need to split any vma, do it now to save pain later.
-- 
2.31.1

Re: [RFC PATCH v2 10/10] firmware: arm_scmi: Add virtio transport

2021-04-07 Thread Viresh Kumar

On Fri, Nov 6, 2020 at 2:59 AM Peter Hilber
 wrote:

> +static int scmi_vio_probe(struct virtio_device *vdev)
> +{
> +   struct device *dev = >dev;
> +   struct scmi_vio_channel **vioch;
> +   bool have_vq_rx;
> +   int vq_cnt;
> +   int i;
> +   struct virtqueue *vqs[VIRTIO_SCMI_VQ_MAX_CNT];
> +
> +   vioch = devm_kcalloc(dev, VIRTIO_SCMI_VQ_MAX_CNT, sizeof(*vioch),
> +GFP_KERNEL);
> +   if (!vioch)
> +   return -ENOMEM;
> +
> +   have_vq_rx = virtio_has_feature(vdev, VIRTIO_SCMI_F_P2A_CHANNELS);
> +   vq_cnt = have_vq_rx ? VIRTIO_SCMI_VQ_MAX_CNT : 1;
> +
> +   for (i = 0; i < vq_cnt; i++) {
> +   vioch[i] = devm_kzalloc(dev, sizeof(**vioch), GFP_KERNEL);
> +   if (!vioch[i])
> +   return -ENOMEM;
> +   }
> +
> +   if (have_vq_rx)
> +   vioch[VIRTIO_SCMI_VQ_RX]->is_rx = true;
> +
> +   if (virtio_find_vqs(vdev, vq_cnt, vqs, scmi_vio_complete_callbacks,
> +   scmi_vio_vqueue_names, NULL)) {
> +   dev_err(dev, "Failed to get %d virtqueue(s)\n", vq_cnt);
> +   return -1;
> +   }
> +   dev_info(dev, "Found %d virtqueue(s)\n", vq_cnt);
> +
> +   for (i = 0; i < vq_cnt; i++) {
> +   spin_lock_init([i]->lock);
> +   vioch[i]->vqueue = vqs[i];
> +   vioch[i]->vqueue->priv = vioch[i];

The vqueue->priv field is used by core, you can't update it else
notifications won't work.

> +   }
> +
> +   vdev->priv = vioch;
> +
> +   virtio_device_ready(vdev);
> +
> +   return 0;
> +}

diff --git a/drivers/firmware/arm_scmi/virtio.c
b/drivers/firmware/arm_scmi/virtio.c
index f70aa72f34f1..b1af77341b30 100644
--- a/drivers/firmware/arm_scmi/virtio.c
+++ b/drivers/firmware/arm_scmi/virtio.c
@@ -80,7 +80,8 @@ static int scmi_vio_populate_vq_rx(struct
scmi_vio_channel *vioch,

 static void scmi_vio_complete_cb(struct virtqueue *vqueue)
 {
-   struct scmi_vio_channel *vioch = vqueue->priv;
+   struct scmi_vio_channel **_vioch = vqueue->vdev->priv;
+   struct scmi_vio_channel *vioch = _vioch[vqueue->index];
unsigned long iflags;
unsigned int length;

@@ -454,7 +455,6 @@ static int scmi_vio_probe(struct virtio_device *vdev)
for (i = 0; i < vq_cnt; i++) {
spin_lock_init([i]->lock);
vioch[i]->vqueue = vqs[i];
-   vioch[i]->vqueue->priv = vioch[i];
}

vdev->priv = vioch;

Re: [PATCH v2] hwmon: Add driver for fsp-3y PSUs and PDUs

2021-04-07 Thread Guenter Roeck

On 4/7/21 7:34 PM, Václav Kubernát wrote:
> This patch adds support for these devices:
> - YH-5151E - the PDU
> - YM-2151E - the PSU
> 
> The device datasheet says that the devices support PMBus 1.2, but in my
> testing, a lot of the commands aren't supported and if they are, they
> sometimes behave strangely or inconsistently. For example, writes to the
> PAGE command requires using PEC, otherwise the write won't work and the
> page won't switch, even though, the standard says that PEC is opiotnal.
> On the other hand, writes the SMBALERT don't require PEC. Because of
> this, the driver is mostly reverse engineered with the help of a tool
> called pmbus_peek written by David Brownell (and later adopted by my
> colleague Jan Kundrát).
> 
> The device also has some sort of a timing issue when switching pages,
> which is explained further in the code.
> 
> Because of this, the driver support is limited. It exposes only the
> values, that have been tested to work correctly.
> 
> Signed-off-by: Václav Kubernát 
> ---
>  Documentation/hwmon/fsp-3y.rst |  26 
>  drivers/hwmon/pmbus/Kconfig|  10 ++
>  drivers/hwmon/pmbus/Makefile   |   1 +
>  drivers/hwmon/pmbus/fsp-3y.c   | 217 +
>  4 files changed, 254 insertions(+)
>  create mode 100644 Documentation/hwmon/fsp-3y.rst
>  create mode 100644 drivers/hwmon/pmbus/fsp-3y.c
> 
> diff --git a/Documentation/hwmon/fsp-3y.rst b/Documentation/hwmon/fsp-3y.rst
> new file mode 100644
> index ..68a547021846
> --- /dev/null
> +++ b/Documentation/hwmon/fsp-3y.rst
> @@ -0,0 +1,26 @@
> +Kernel driver fsp3y
> +==
> +Supported devices:
> +  * 3Y POWER YH-5151E
> +  * 3Y POWER YM-2151E
> +
> +Author: Václav Kubernát 
> +
> +Description
> +---
> +This driver implements limited support for two 3Y POWER devices.
> +
> +Sysfs entries
> +-
> +in1_inputinput voltage
> +in2_input12V output voltage
> +in3_input5V output voltage
> +curr1_input  input current
> +curr2_input  12V output current
> +curr3_input  5V output current
> +fan1_input   fan rpm
> +temp1_input  temperature 1
> +temp2_input  temperature 2
> +temp3_input  temperature 3
> +power1_input input power
> +power2_input output power
> diff --git a/drivers/hwmon/pmbus/Kconfig b/drivers/hwmon/pmbus/Kconfig
> index 03606d4298a4..9d12d446396c 100644
> --- a/drivers/hwmon/pmbus/Kconfig
> +++ b/drivers/hwmon/pmbus/Kconfig
> @@ -56,6 +56,16 @@ config SENSORS_BEL_PFE
> This driver can also be built as a module. If so, the module will
> be called bel-pfe.
>  
> +config SENSORS_FSP_3Y
> + tristate "FSP/3Y-Power power supplies"
> + help
> +   If you say yes here you get hardware monitoring support for
> +   FSP/3Y-Power hot-swap power supplies.
> +   Supported models: YH-5151E, YM-2151E
> +
> +   This driver can also be built as a module. If so, the module will
> +   be called fsp-3y.
> +
>  config SENSORS_IBM_CFFPS
>   tristate "IBM Common Form Factor Power Supply"
>   depends on LEDS_CLASS
> diff --git a/drivers/hwmon/pmbus/Makefile b/drivers/hwmon/pmbus/Makefile
> index 6a4ba0fdc1db..bfe218ad898f 100644
> --- a/drivers/hwmon/pmbus/Makefile
> +++ b/drivers/hwmon/pmbus/Makefile
> @@ -8,6 +8,7 @@ obj-$(CONFIG_SENSORS_PMBUS)   += pmbus.o
>  obj-$(CONFIG_SENSORS_ADM1266)+= adm1266.o
>  obj-$(CONFIG_SENSORS_ADM1275)+= adm1275.o
>  obj-$(CONFIG_SENSORS_BEL_PFE)+= bel-pfe.o
> +obj-$(CONFIG_SENSORS_FSP_3Y) += fsp-3y.o
>  obj-$(CONFIG_SENSORS_IBM_CFFPS)  += ibm-cffps.o
>  obj-$(CONFIG_SENSORS_INSPUR_IPSPS) += inspur-ipsps.o
>  obj-$(CONFIG_SENSORS_IR35221)+= ir35221.o
> diff --git a/drivers/hwmon/pmbus/fsp-3y.c b/drivers/hwmon/pmbus/fsp-3y.c
> new file mode 100644
> index ..2c165e034fa8
> --- /dev/null
> +++ b/drivers/hwmon/pmbus/fsp-3y.c
> @@ -0,0 +1,217 @@
> +// SPDX-License-Identifier: GPL-2.0-or-later
> +/*
> + * Hardware monitoring driver for FSP 3Y-Power PSUs
> + *
> + * Copyright (c) 2021 Václav Kubernát, CESNET
> + */
> +
> +#include 
> +#include 
> +#include 
> +#include 
> +#include "pmbus.h"
> +
> +#define YM2151_PAGE_12V_LOG  0x00
> +#define YM2151_PAGE_12V_REAL 0x00
> +#define YM2151_PAGE_5VSB_LOG 0x01
> +#define YM2151_PAGE_5VSB_REAL0x20
> +#define YH5151E_PAGE_12V_LOG 0x00
> +#define YH5151E_PAGE_12V_REAL0x00
> +#define YH5151E_PAGE_5V_LOG  0x01
> +#define YH5151E_PAGE_5V_REAL 0x10
> +#define YH5151E_PAGE_3V3_LOG 0x02
> +#define YH5151E_PAGE_3V3_REAL0x11
> +
> +enum chips {
> + ym2151e,
> + yh5151e
> +};
> +
> +struct fsp3y_data {
> + struct pmbus_driver_info info;
> + enum chips chip;
> + int page;
> +};
> +
> +#define to_fsp3y_data(x) container_of(x, struct fsp3y_data, info)
> +
> +static int page_log_to_page_real(int page_log, enum chips chip)
> +{
> + switch (chip) {
> + case

[PATCH -next] powerpc/security: Make symbol 'stf_barrier' static

2021-04-07 Thread Li Huafei

The sparse tool complains as follows:

arch/powerpc/kernel/security.c:253:6: warning:
 symbol 'stf_barrier' was not declared. Should it be static?

This symbol is not used outside of security.c, so this commit marks it
static.

Signed-off-by: Li Huafei 
---
 arch/powerpc/kernel/security.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/powerpc/kernel/security.c b/arch/powerpc/kernel/security.c
index e4e1a94ccf6a..4de6bbd9672e 100644
--- a/arch/powerpc/kernel/security.c
+++ b/arch/powerpc/kernel/security.c
@@ -250,7 +250,7 @@ ssize_t cpu_show_spectre_v2(struct device *dev, struct 
device_attribute *attr, c
 
 static enum stf_barrier_type stf_enabled_flush_types;
 static bool no_stf_barrier;
-bool stf_barrier;
+static bool stf_barrier;
 
 static int __init handle_no_stf_barrier(char *p)
 {
-- 
2.17.1

Re: [PATCH 2/2] pinctrl: qcom-pmic-gpio: Add support for pm8008

2021-04-07 Thread Bjorn Andersson

On Wed 07 Apr 17:35 CDT 2021, Guru Das Srinagesh wrote:

> Add support for the two GPIOs present on PM8008.
> 
> Signed-off-by: Guru Das Srinagesh 
> ---
>  drivers/pinctrl/qcom/pinctrl-spmi-gpio.c | 1 +
>  1 file changed, 1 insertion(+)
> 
> diff --git a/drivers/pinctrl/qcom/pinctrl-spmi-gpio.c 
> b/drivers/pinctrl/qcom/pinctrl-spmi-gpio.c
> index c2b9f2e..76e997a 100644
> --- a/drivers/pinctrl/qcom/pinctrl-spmi-gpio.c
> +++ b/drivers/pinctrl/qcom/pinctrl-spmi-gpio.c
> @@ -1137,6 +1137,7 @@ static const struct of_device_id pmic_gpio_of_match[] = 
> {
>   { .compatible = "qcom,pm6150l-gpio", .data = (void *) 12 },
>   /* pmx55 has 11 GPIOs with holes on 3, 7, 10, 11 */
>   { .compatible = "qcom,pmx55-gpio", .data = (void *) 11 },
> + { .compatible = "qcom,pm8008-gpio", .data = (void *) 2 },

As with the binding, please keep these sorted alphabetically.

With that:

Reviewed-by: Bjorn Andersson 

Regards,
Bjorn

>   { },
>  };
>  
> -- 
> The Qualcomm Innovation Center, Inc. is a member of the Code Aurora Forum,
> a Linux Foundation Collaborative Project
>

Re: [PATCH 1/2] dt-bindings: pinctrl: qcom-pmic-gpio: Add pm8008 support

2021-04-07 Thread Bjorn Andersson

On Wed 07 Apr 17:34 CDT 2021, Guru Das Srinagesh wrote:

> Add support for the PM8008 GPIO support to the Qualcomm PMIC GPIO
> binding.
> 
> Signed-off-by: Guru Das Srinagesh 
> ---
>  Documentation/devicetree/bindings/pinctrl/qcom,pmic-gpio.txt | 2 ++
>  1 file changed, 2 insertions(+)
> 
> diff --git a/Documentation/devicetree/bindings/pinctrl/qcom,pmic-gpio.txt 
> b/Documentation/devicetree/bindings/pinctrl/qcom,pmic-gpio.txt
> index 70e119b..1818481 100644
> --- a/Documentation/devicetree/bindings/pinctrl/qcom,pmic-gpio.txt
> +++ b/Documentation/devicetree/bindings/pinctrl/qcom,pmic-gpio.txt
> @@ -36,6 +36,7 @@ PMIC's from Qualcomm.
>   "qcom,pm6150-gpio"
>   "qcom,pm6150l-gpio"
>   "qcom,pmx55-gpio"
> + "qcom,pm8008-gpio"

Please keep these sorted alphabetically (i.e. '8' < 'x')

With that

Acked-by: Bjorn Andersson 

Regards,
Bjorn

>  
>   And must contain either "qcom,spmi-gpio" or "qcom,ssbi-gpio"
>   if the device is on an spmi bus or an ssbi bus respectively
> @@ -125,6 +126,7 @@ to specify in a pin configuration subnode:
>   gpio1-gpio12 for pm6150l
>   gpio1-gpio11 for pmx55 (holes on gpio3, gpio7, gpio10
>   and gpio11)
> + gpio1-gpio2 for pm8008
>  
>  - function:
>   Usage: required
> -- 
> The Qualcomm Innovation Center, Inc. is a member of the Code Aurora Forum,
> a Linux Foundation Collaborative Project
>

[PATCH] mm/mmap.c: lines in __do_munmap repeat logic of inlined find_vma_intersection

2021-04-07 Thread Gonzalo Matias Juarez Tello

Some lines in __do_munmap used the same logic as find_vma_intersection
(which is inlined) instead of directly using that function.

Signed-off-by: Gonzalo Matias Juarez Tello 
---
 mm/mmap.c | 7 +--
 1 file changed, 1 insertion(+), 6 deletions(-)

diff --git a/mm/mmap.c b/mm/mmap.c
index 3f287599a7a3..1b29f8bf8344 100644
--- a/mm/mmap.c
+++ b/mm/mmap.c
@@ -2823,15 +2823,10 @@ int __do_munmap(struct mm_struct *mm, unsigned long 
start, size_t len,
arch_unmap(mm, start, end);
 
/* Find the first overlapping VMA */
-   vma = find_vma(mm, start);
+   vma = find_vma_intersecion(mm, start, end);
if (!vma)
return 0;
prev = vma->vm_prev;
-   /* we have  start < vma->vm_end  */
-
-   /* if it doesn't overlap, we have nothing.. */
-   if (vma->vm_start >= end)
-   return 0;
 
/*
 * If we need to split any vma, do it now to save pain later.
-- 
2.31.1

Re: [PATCH 3/4] mm/hugeltb: fix potential wrong gbl_reserve value for hugetlb_acct_memory()

2021-04-07 Thread Miaohe Lin

On 2021/4/8 11:24, Miaohe Lin wrote:
> On 2021/4/8 4:53, Mike Kravetz wrote:
>> On 4/7/21 12:24 AM, Miaohe Lin wrote:
>>> Hi:
>>> On 2021/4/7 10:49, Mike Kravetz wrote:
 On 4/2/21 2:32 AM, Miaohe Lin wrote:
> The resv_map could be NULL since this routine can be called in the evict
> inode path for all hugetlbfs inodes. So we could have chg = 0 and this
> would result in a negative value when chg - freed. This is unexpected for
> hugepage_subpool_put_pages() and hugetlb_acct_memory().

 I am not sure if this is possible.

 It is true that resv_map could be NULL.  However, I believe resv map
 can only be NULL for inodes that are not regular or link inodes.  This
 is the inode creation code in hugetlbfs_get_inode().

/*
  * Reserve maps are only needed for inodes that can have associated
  * page allocations.
  */
 if (S_ISREG(mode) || S_ISLNK(mode)) {
 resv_map = resv_map_alloc();
 if (!resv_map)
 return NULL;
 }

>>>
>>> Agree.
>>>
 If resv_map is NULL, then no hugetlb pages can be allocated/associated
 with the file.  As a result, remove_inode_hugepages will never find any
 huge pages associated with the inode and the passed value 'freed' will
 always be zero.

>>>
>>> But I am confused now. AFAICS, remove_inode_hugepages() searches the 
>>> address_space of
>>> the inode to remove the hugepages while does not care if inode has 
>>> associated resv_map.
>>> How does it prevent hugetlb pages from being allocated/associated with the 
>>> file if
>>> resv_map is NULL? Could you please explain this more?
>>>
>>
>> Recall that there are only two ways to get huge pages associated with
>> a hugetlbfs file: fallocate and mmap/write fault.  Directly writing to
>> hugetlbfs files is not supported.
>>
>> If you take a closer look at hugetlbfs_get_inode, it has that code to
>> allocate the resv map mentioned above as well as the following:
>>
>>  switch (mode & S_IFMT) {
>>  default:
>>  init_special_inode(inode, mode, dev);
>>  break;
>>  case S_IFREG:
>>  inode->i_op = _inode_operations;
>>  inode->i_fop = _file_operations;
>>  break;
>>  case S_IFDIR:
>>  inode->i_op = _dir_inode_operations;
>>  inode->i_fop = _dir_operations;
>>
>>  /* directory inodes start off with i_nlink == 2 (for 
>> "." entry) */
>>  inc_nlink(inode);
>>  break;
>>  case S_IFLNK:
>>  inode->i_op = _symlink_inode_operations;
>>  inode_nohighmem(inode);
>>  break;
>>  }
>>
>> Notice that only S_IFREG inodes will have i_fop == 
>> _file_operations.
>> hugetlbfs_file_operations contain the hugetlbfs specific mmap and fallocate
>> routines.  Hence, only files with S_IFREG inodes can potentially have
>> associated huge pages.  S_IFLNK inodes can as well via file linking.
>>
>> If an inode is not S_ISREG(mode) || S_ISLNK(mode), then it will not have
>> a resv_map.  In addition, it will not have hugetlbfs_file_operations and
>> can not have associated huge pages.
>>
> 
> Many many thanks for detailed and patient explanation! :) I think I have got 
> the idea!
> 
>> I looked at this closely when adding commits
>> 58b6e5e8f1ad hugetlbfs: fix memory leak for resv_map
>> f27a5136f70a hugetlbfs: always use address space in inode for resv_map 
>> pointer
>>
>> I may not be remembering all of the details correctly.  Commit f27a5136f70a
>> added the comment that resv_map could be NULL to hugetlb_unreserve_pages.
>>
> 
> Since we must have freed == 0 while chg == 0. Should we make this assumption 
> explict
> by something like below?
> 
> WARN_ON(chg < freed);
> 

Or just a comment to avoid confusion ?

> Thanks again!
>

Re: [PATCH 3/4] mm/hugeltb: fix potential wrong gbl_reserve value for hugetlb_acct_memory()

2021-04-07 Thread Miaohe Lin

On 2021/4/8 4:53, Mike Kravetz wrote:
> On 4/7/21 12:24 AM, Miaohe Lin wrote:
>> Hi:
>> On 2021/4/7 10:49, Mike Kravetz wrote:
>>> On 4/2/21 2:32 AM, Miaohe Lin wrote:
 The resv_map could be NULL since this routine can be called in the evict
 inode path for all hugetlbfs inodes. So we could have chg = 0 and this
 would result in a negative value when chg - freed. This is unexpected for
 hugepage_subpool_put_pages() and hugetlb_acct_memory().
>>>
>>> I am not sure if this is possible.
>>>
>>> It is true that resv_map could be NULL.  However, I believe resv map
>>> can only be NULL for inodes that are not regular or link inodes.  This
>>> is the inode creation code in hugetlbfs_get_inode().
>>>
>>>/*
>>>  * Reserve maps are only needed for inodes that can have associated
>>>  * page allocations.
>>>  */
>>> if (S_ISREG(mode) || S_ISLNK(mode)) {
>>> resv_map = resv_map_alloc();
>>> if (!resv_map)
>>> return NULL;
>>> }
>>>
>>
>> Agree.
>>
>>> If resv_map is NULL, then no hugetlb pages can be allocated/associated
>>> with the file.  As a result, remove_inode_hugepages will never find any
>>> huge pages associated with the inode and the passed value 'freed' will
>>> always be zero.
>>>
>>
>> But I am confused now. AFAICS, remove_inode_hugepages() searches the 
>> address_space of
>> the inode to remove the hugepages while does not care if inode has 
>> associated resv_map.
>> How does it prevent hugetlb pages from being allocated/associated with the 
>> file if
>> resv_map is NULL? Could you please explain this more?
>>
> 
> Recall that there are only two ways to get huge pages associated with
> a hugetlbfs file: fallocate and mmap/write fault.  Directly writing to
> hugetlbfs files is not supported.
> 
> If you take a closer look at hugetlbfs_get_inode, it has that code to
> allocate the resv map mentioned above as well as the following:
> 
>   switch (mode & S_IFMT) {
>   default:
>   init_special_inode(inode, mode, dev);
>   break;
>   case S_IFREG:
>   inode->i_op = _inode_operations;
>   inode->i_fop = _file_operations;
>   break;
>   case S_IFDIR:
>   inode->i_op = _dir_inode_operations;
>   inode->i_fop = _dir_operations;
> 
>   /* directory inodes start off with i_nlink == 2 (for 
> "." entry) */
>   inc_nlink(inode);
>   break;
>   case S_IFLNK:
>   inode->i_op = _symlink_inode_operations;
>   inode_nohighmem(inode);
>   break;
>   }
> 
> Notice that only S_IFREG inodes will have i_fop == _file_operations.
> hugetlbfs_file_operations contain the hugetlbfs specific mmap and fallocate
> routines.  Hence, only files with S_IFREG inodes can potentially have
> associated huge pages.  S_IFLNK inodes can as well via file linking.
> 
> If an inode is not S_ISREG(mode) || S_ISLNK(mode), then it will not have
> a resv_map.  In addition, it will not have hugetlbfs_file_operations and
> can not have associated huge pages.
> 

Many many thanks for detailed and patient explanation! :) I think I have got 
the idea!

> I looked at this closely when adding commits
> 58b6e5e8f1ad hugetlbfs: fix memory leak for resv_map
> f27a5136f70a hugetlbfs: always use address space in inode for resv_map pointer
> 
> I may not be remembering all of the details correctly.  Commit f27a5136f70a
> added the comment that resv_map could be NULL to hugetlb_unreserve_pages.
> 

Since we must have freed == 0 while chg == 0. Should we make this assumption 
explict
by something like below?

WARN_ON(chg < freed);

Thanks again!

[PATCH] usb: dwc2: Enable RPi in ACPI mode

2021-04-07 Thread jlinton

From: Jeremy Linton 

The dwc2 driver has everything we need to run
in ACPI mode except for the ACPI module device table
boilerplate. With that added and identified as "BCM2848",
an id in use by other OSs for this device, the dw2
controller on the BCM2711 will work.

Signed-off-by: Jeremy Linton 
---
 drivers/usb/dwc2/core.h |  2 ++
 drivers/usb/dwc2/params.c   | 14 ++
 drivers/usb/dwc2/platform.c |  1 +
 3 files changed, 17 insertions(+)

diff --git a/drivers/usb/dwc2/core.h b/drivers/usb/dwc2/core.h
index 7161344c6522..defc6034af49 100644
--- a/drivers/usb/dwc2/core.h
+++ b/drivers/usb/dwc2/core.h
@@ -38,6 +38,7 @@
 #ifndef __DWC2_CORE_H__
 #define __DWC2_CORE_H__
 
+#include 
 #include 
 #include 
 #include 
@@ -1339,6 +1340,7 @@ irqreturn_t dwc2_handle_common_intr(int irq, void *dev);
 
 /* The device ID match table */
 extern const struct of_device_id dwc2_of_match_table[];
+extern const struct acpi_device_id dwc2_acpi_match[];
 
 int dwc2_lowlevel_hw_enable(struct dwc2_hsotg *hsotg);
 int dwc2_lowlevel_hw_disable(struct dwc2_hsotg *hsotg);
diff --git a/drivers/usb/dwc2/params.c b/drivers/usb/dwc2/params.c
index 92df3d620f7d..127878a0a397 100644
--- a/drivers/usb/dwc2/params.c
+++ b/drivers/usb/dwc2/params.c
@@ -232,6 +232,12 @@ const struct of_device_id dwc2_of_match_table[] = {
 };
 MODULE_DEVICE_TABLE(of, dwc2_of_match_table);
 
+const struct acpi_device_id dwc2_acpi_match[] = {
+   { "BCM2848", dwc2_set_bcm_params },
+   { },
+};
+MODULE_DEVICE_TABLE(acpi, dwc2_acpi_match);
+
 static void dwc2_set_param_otg_cap(struct dwc2_hsotg *hsotg)
 {
u8 val;
@@ -878,6 +884,14 @@ int dwc2_init_params(struct dwc2_hsotg *hsotg)
if (match && match->data) {
set_params = match->data;
set_params(hsotg);
+   } else {
+   struct acpi_device_id *amatch;
+
+   amatch = acpi_match_device(dwc2_acpi_match, hsotg->dev);
+   if (amatch && amatch->driver_data) {
+   set_params = amatch->driver_data;
+   set_params(hsotg);
+   }
}
 
dwc2_check_params(hsotg);
diff --git a/drivers/usb/dwc2/platform.c b/drivers/usb/dwc2/platform.c
index 5f18acac7406..53fc6bc3ed1a 100644
--- a/drivers/usb/dwc2/platform.c
+++ b/drivers/usb/dwc2/platform.c
@@ -734,6 +734,7 @@ static struct platform_driver dwc2_platform_driver = {
.driver = {
.name = dwc2_driver_name,
.of_match_table = dwc2_of_match_table,
+   .acpi_match_table = ACPI_PTR(dwc2_acpi_match),
.pm = _dev_pm_ops,
},
.probe = dwc2_driver_probe,
-- 
2.26.2

RE: [PATCH v3 08/10] fsdax: Dedup file range to use a compare function

2021-04-07 Thread ruansy.f...@fujitsu.com


> -Original Message-
> From: Ritesh Harjani 
> Subject: Re: [PATCH v3 08/10] fsdax: Dedup file range to use a compare 
> function
> 
> On 21/03/19 09:52AM, Shiyang Ruan wrote:
> > With dax we cannot deal with readpage() etc. So, we create a dax
> > comparison funciton which is similar with
> > vfs_dedupe_file_range_compare().
> > And introduce dax_remap_file_range_prep() for filesystem use.
> >
> > Signed-off-by: Goldwyn Rodrigues 
> > Signed-off-by: Shiyang Ruan 
> > ---
> >  fs/dax.c | 56
> 
> >  fs/remap_range.c | 45 ---
> >  fs/xfs/xfs_reflink.c |  9 +--
> >  include/linux/dax.h  |  4 
> >  include/linux/fs.h   | 15 
> >  5 files changed, 115 insertions(+), 14 deletions(-)
> >
> > diff --git a/fs/dax.c b/fs/dax.c
> > index 348297b38f76..76f81f1d76ec 100644
> > --- a/fs/dax.c
> > +++ b/fs/dax.c
> > @@ -1833,3 +1833,59 @@ vm_fault_t dax_finish_sync_fault(struct vm_fault
> *vmf,
> > return dax_insert_pfn_mkwrite(vmf, pfn, order);  }
> > EXPORT_SYMBOL_GPL(dax_finish_sync_fault);
> > +
> > +static loff_t dax_range_compare_actor(struct inode *ino1, loff_t pos1,
> > +   struct inode *ino2, loff_t pos2, loff_t len, void *data,
> > +   struct iomap *smap, struct iomap *dmap) {
> > +   void *saddr, *daddr;
> > +   bool *same = data;
> > +   int ret;
> > +
> > +   if (smap->type == IOMAP_HOLE && dmap->type == IOMAP_HOLE) {
> > +   *same = true;
> > +   return len;
> > +   }
> > +
> > +   if (smap->type == IOMAP_HOLE || dmap->type == IOMAP_HOLE) {
> > +   *same = false;
> > +   return 0;
> > +   }
> > +
> > +   ret = dax_iomap_direct_access(smap, pos1, ALIGN(pos1 + len, PAGE_SIZE),
> > + , NULL);
> 
> shouldn't it take len as the second argument?

The second argument of dax_iomap_direct_access() means offset, and the third 
one means length.  So, I think this is right.

> 
> > +   if (ret < 0)
> > +   return -EIO;
> > +
> > +   ret = dax_iomap_direct_access(dmap, pos2, ALIGN(pos2 + len, PAGE_SIZE),
> > + , NULL);
> 
> ditto.
> > +   if (ret < 0)
> > +   return -EIO;
> > +
> > +   *same = !memcmp(saddr, daddr, len);
> > +   return len;
> > +}
> > +
> > +int dax_dedupe_file_range_compare(struct inode *src, loff_t srcoff,
> > +   struct inode *dest, loff_t destoff, loff_t len, bool *is_same,
> > +   const struct iomap_ops *ops)
> > +{
> > +   int id, ret = 0;
> > +
> > +   id = dax_read_lock();
> > +   while (len) {
> > +   ret = iomap_apply2(src, srcoff, dest, destoff, len, 0, ops,
> > +  is_same, dax_range_compare_actor);
> > +   if (ret < 0 || !*is_same)
> > +   goto out;
> > +
> > +   len -= ret;
> > +   srcoff += ret;
> > +   destoff += ret;
> > +   }
> > +   ret = 0;
> > +out:
> > +   dax_read_unlock(id);
> > +   return ret;
> > +}
> > +EXPORT_SYMBOL_GPL(dax_dedupe_file_range_compare);
> > diff --git a/fs/remap_range.c b/fs/remap_range.c index
> > 77dba3a49e65..9079390edaf3 100644
> > --- a/fs/remap_range.c
> > +++ b/fs/remap_range.c
> > @@ -14,6 +14,7 @@
> >  #include 
> >  #include 
> >  #include 
> > +#include 
> >  #include "internal.h"
> >
> >  #include 
> > @@ -199,9 +200,9 @@ static void vfs_unlock_two_pages(struct page *page1,
> struct page *page2)
> >   * Compare extents of two files to see if they are the same.
> >   * Caller must have locked both inodes to prevent write races.
> >   */
> > -static int vfs_dedupe_file_range_compare(struct inode *src, loff_t srcoff,
> > -struct inode *dest, loff_t destoff,
> > -loff_t len, bool *is_same)
> > +int vfs_dedupe_file_range_compare(struct inode *src, loff_t srcoff,
> > + struct inode *dest, loff_t destoff,
> > + loff_t len, bool *is_same)
> >  {
> > loff_t src_poff;
> > loff_t dest_poff;
> > @@ -280,6 +281,7 @@ static int vfs_dedupe_file_range_compare(struct
> > inode *src, loff_t srcoff,
> >  out_error:
> > return error;
> >  }
> > +EXPORT_SYMBOL(vfs_dedupe_file_range_compare);
> >
> >  /*
> >   * Check that the two inodes are eligible for cloning, the ranges
> > make @@ -289,9 +291,11 @@ static int
> vfs_dedupe_file_range_compare(struct inode *src, loff_t srcoff,
> >   * If there's an error, then the usual negative error code is returned.
> >   * Otherwise returns 0 with *len set to the request length.
> >   */
> > -int generic_remap_file_range_prep(struct file *file_in, loff_t pos_in,
> > - struct file *file_out, loff_t pos_out,
> > - loff_t *len, unsigned int remap_flags)
> > +static int
> > +__generic_remap_file_range_prep(struct file *file_in, loff_t pos_in,
> > +   struct file *file_out,

linux-next: manual merge of the net-next tree with the net tree

2021-04-07 Thread Stephen Rothwell

Hi all,

Today's linux-next merge of the net-next tree got a conflict in:

  net/tipc/crypto.c

between commit:

  2a2403ca3add ("tipc: increment the tmp aead refcnt before attaching it")

from the net tree and commit:

  97bc84bbd4de ("tipc: clean up warnings detected by sparse")

from the net-next tree.

I fixed it up (I used the former version) and can carry the fix as
necessary. This is now fixed as far as linux-next is concerned, but any
non trivial conflicts should be mentioned to your upstream maintainer
when your tree is submitted for merging.  You may also want to consider
cooperating with the maintainer of the conflicting tree to minimise any
particularly complex conflicts.

-- 
Cheers,
Stephen Rothwell


pgp3UqR42C5pN.pgp
Description: OpenPGP digital signature

linux-next: manual merge of the net-next tree with the bpf tree

2021-04-07 Thread Stephen Rothwell

Hi all,

Today's linux-next merge of the net-next tree got a conflict in:

  net/core/skmsg.c

between commit:

  144748eb0c44 ("bpf, sockmap: Fix incorrect fwd_alloc accounting")

from the bpf tree and commit:

  e3526bb92a20 ("skmsg: Move sk_redir from TCP_SKB_CB to skb")

from the net-next tree.

I fixed it up (I think - see below) and can carry the fix as
necessary. This is now fixed as far as linux-next is concerned, but any
non trivial conflicts should be mentioned to your upstream maintainer
when your tree is submitted for merging.  You may also want to consider
cooperating with the maintainer of the conflicting tree to minimise any
particularly complex conflicts.

-- 
Cheers,
Stephen Rothwell

diff --cc net/core/skmsg.c
index 5def3a2e85be,92a83c02562a..
--- a/net/core/skmsg.c
+++ b/net/core/skmsg.c
@@@ -806,12 -900,17 +900,13 @@@ int sk_psock_tls_strp_read(struct sk_ps
int ret = __SK_PASS;
  
rcu_read_lock();
-   prog = READ_ONCE(psock->progs.skb_verdict);
+   prog = READ_ONCE(psock->progs.stream_verdict);
if (likely(prog)) {
 -  /* We skip full set_owner_r here because if we do a SK_PASS
 -   * or SK_DROP we can skip skb memory accounting and use the
 -   * TLS context.
 -   */
skb->sk = psock->sk;
-   tcp_skb_bpf_redirect_clear(skb);
-   ret = sk_psock_bpf_run(psock, prog, skb);
-   ret = sk_psock_map_verd(ret, tcp_skb_bpf_redirect_fetch(skb));
+   skb_dst_drop(skb);
+   skb_bpf_redirect_clear(skb);
+   ret = bpf_prog_run_pin_on_cpu(prog, skb);
+   ret = sk_psock_map_verd(ret, skb_bpf_redirect_fetch(skb));
skb->sk = NULL;
}
sk_psock_tls_verdict_apply(skb, psock->sk, ret);
@@@ -876,13 -995,13 +991,14 @@@ static void sk_psock_strp_read(struct s
kfree_skb(skb);
goto out;
}
-   prog = READ_ONCE(psock->progs.skb_verdict);
 -  skb_set_owner_r(skb, sk);
+   prog = READ_ONCE(psock->progs.stream_verdict);
if (likely(prog)) {
 +  skb->sk = sk;
-   tcp_skb_bpf_redirect_clear(skb);
-   ret = sk_psock_bpf_run(psock, prog, skb);
-   ret = sk_psock_map_verd(ret, tcp_skb_bpf_redirect_fetch(skb));
+   skb_dst_drop(skb);
+   skb_bpf_redirect_clear(skb);
+   ret = bpf_prog_run_pin_on_cpu(prog, skb);
+   ret = sk_psock_map_verd(ret, skb_bpf_redirect_fetch(skb));
 +  skb->sk = NULL;
}
sk_psock_verdict_apply(psock, skb, ret);
  out:
@@@ -953,13 -1115,15 +1112,16 @@@ static int sk_psock_verdict_recv(read_d
kfree_skb(skb);
goto out;
}
-   prog = READ_ONCE(psock->progs.skb_verdict);
 -  skb_set_owner_r(skb, sk);
+   prog = READ_ONCE(psock->progs.stream_verdict);
+   if (!prog)
+   prog = READ_ONCE(psock->progs.skb_verdict);
if (likely(prog)) {
 +  skb->sk = sk;
-   tcp_skb_bpf_redirect_clear(skb);
-   ret = sk_psock_bpf_run(psock, prog, skb);
-   ret = sk_psock_map_verd(ret, tcp_skb_bpf_redirect_fetch(skb));
+   skb_dst_drop(skb);
+   skb_bpf_redirect_clear(skb);
+   ret = bpf_prog_run_pin_on_cpu(prog, skb);
+   ret = sk_psock_map_verd(ret, skb_bpf_redirect_fetch(skb));
 +  skb->sk = NULL;
}
sk_psock_verdict_apply(psock, skb, ret);
  out:


pgpmIksOHySb2.pgp
Description: OpenPGP digital signature

[PATCH] powerpc: remove old workaround for GCC < 4.9

2021-04-07 Thread Masahiro Yamada

According to Documentation/process/changes.rst, the minimum supported
GCC version is 4.9.

This workaround is dead code.

Signed-off-by: Masahiro Yamada 
---

 arch/powerpc/Makefile | 6 --
 1 file changed, 6 deletions(-)

diff --git a/arch/powerpc/Makefile b/arch/powerpc/Makefile
index 5f8544cf724a..32dd693b4e42 100644
--- a/arch/powerpc/Makefile
+++ b/arch/powerpc/Makefile
@@ -181,12 +181,6 @@ CC_FLAGS_FTRACE := -pg
 ifdef CONFIG_MPROFILE_KERNEL
 CC_FLAGS_FTRACE += -mprofile-kernel
 endif
-# Work around gcc code-gen bugs with -pg / -fno-omit-frame-pointer in gcc <= 
4.8
-# https://gcc.gnu.org/bugzilla/show_bug.cgi?id=44199
-# https://gcc.gnu.org/bugzilla/show_bug.cgi?id=52828
-ifndef CONFIG_CC_IS_CLANG
-CC_FLAGS_FTRACE+= $(call cc-ifversion, -lt, 0409, -mno-sched-epilog)
-endif
 endif
 
 CFLAGS-$(CONFIG_TARGET_CPU_BOOL) += $(call 
cc-option,-mcpu=$(CONFIG_TARGET_CPU))
-- 
2.27.0

Re: [PATCH v4 4/6] drm/sprd: add Unisoc's drm display controller driver

2021-04-07 Thread Chunyan Zhang

On Wed, 7 Apr 2021 at 18:45, Maxime Ripard  wrote:
>
> Hi,
>
> Adding Jörg, Will and Robin,

You forgot to add them actually :)
I've added Robin and Joerg.

>
> On Wed, Mar 31, 2021 at 09:21:19AM +0800, Kevin Tang wrote:
> > > > +static u32 check_mmu_isr(struct sprd_dpu *dpu, u32 reg_val)
> > > > +{
> > > > + struct dpu_context *ctx = >ctx;
> > > > + u32 mmu_mask = BIT_DPU_INT_MMU_VAOR_RD |
> > > > + BIT_DPU_INT_MMU_VAOR_WR |
> > > > + BIT_DPU_INT_MMU_INV_RD |
> > > > + BIT_DPU_INT_MMU_INV_WR;
> > > > + u32 val = reg_val & mmu_mask;
> > > > + int i;
> > > > +
> > > > + if (val) {
> > > > + drm_err(dpu->drm, "--- iommu interrupt err: 0x%04x ---\n",
> > > val);
> > > > +
> > > > + if (val & BIT_DPU_INT_MMU_INV_RD)
> > > > + drm_err(dpu->drm, "iommu invalid read error, addr:
> > > 0x%08x\n",
> > > > + readl(ctx->base + REG_MMU_INV_ADDR_RD));
> > > > + if (val & BIT_DPU_INT_MMU_INV_WR)
> > > > + drm_err(dpu->drm, "iommu invalid write error,
> > > addr: 0x%08x\n",
> > > > + readl(ctx->base + REG_MMU_INV_ADDR_WR));
> > > > + if (val & BIT_DPU_INT_MMU_VAOR_RD)
> > > > + drm_err(dpu->drm, "iommu va out of range read
> > > error, addr: 0x%08x\n",
> > > > + readl(ctx->base + REG_MMU_VAOR_ADDR_RD));
> > > > + if (val & BIT_DPU_INT_MMU_VAOR_WR)
> > > > + drm_err(dpu->drm, "iommu va out of range write
> > > error, addr: 0x%08x\n",
> > > > + readl(ctx->base + REG_MMU_VAOR_ADDR_WR));
> > >
> > > Is that the IOMMU page fault interrupt? I would expect it to be in the
> > > iommu driver.
> >
> > Our iommu driver is indeed an separate driver, and also in upstreaming,
> > but iommu fault interrupts reporting by display controller, not iommu
> >  itself,
> > if use iommu_set_fault_handler() to hook in our reporting function, there
> > must be cross-module access to h/w registers.
>
> Can you explain a bit more the design of the hardware then? Each device
> connected to the IOMMU has a status register (and an interrupt) that
> reports when there's a fault?

On Unisoc's platforms, one IOMMU serves one master device only, and interrupts
are handled by master devices rather than IOMMUs, since the registers are in the
physical address range of master devices.

>
> I'd like to get an ack at least from the IOMMU maintainers and
> reviewers.
>
> > > > +static void sprd_dpi_init(struct sprd_dpu *dpu)
> > > > +{
> > > > + struct dpu_context *ctx = >ctx;
> > > > + u32 int_mask = 0;
> > > > + u32 reg_val;
> > > > +
> > > > + if (ctx->if_type == SPRD_DPU_IF_DPI) {
> > > > + /* use dpi as interface */
> > > > + dpu_reg_clr(ctx, REG_DPU_CFG0, BIT_DPU_IF_EDPI);
> > > > + /* disable Halt function for SPRD DSI */
> > > > + dpu_reg_clr(ctx, REG_DPI_CTRL, BIT_DPU_DPI_HALT_EN);
> > > > + /* select te from external pad */
> > > > + dpu_reg_set(ctx, REG_DPI_CTRL,
> > > BIT_DPU_EDPI_FROM_EXTERNAL_PAD);
> > > > +
> > > > + /* set dpi timing */
> > > > + reg_val = ctx->vm.hsync_len << 0 |
> > > > +   ctx->vm.hback_porch << 8 |
> > > > +   ctx->vm.hfront_porch << 20;
> > > > + writel(reg_val, ctx->base + REG_DPI_H_TIMING);
> > > > +
> > > > + reg_val = ctx->vm.vsync_len << 0 |
> > > > +   ctx->vm.vback_porch << 8 |
> > > > +   ctx->vm.vfront_porch << 20;
> > > > + writel(reg_val, ctx->base + REG_DPI_V_TIMING);
> > > > +
> > > > + if (ctx->vm.vsync_len + ctx->vm.vback_porch < 32)
> > > > + drm_warn(dpu->drm, "Warning: (vsync + vbp) < 32, "
> > > > + "underflow risk!\n");
> > >
> > > I don't think a warning is appropriate here. Maybe we should just
> > > outright reject any mode that uses it?
> > >
> >  This issue has been fixed on the new soc, maybe I should remove it.
>
> If it still requires a workaround on older SoC, you can definitely add
> it but we should prevent any situation where the underflow might occur
> instead of reporting it once we're there.
>
> > > > +static enum drm_mode_status sprd_crtc_mode_valid(struct drm_crtc *crtc,
> > > > + const struct drm_display_mode
> > > *mode)
> > > > +{
> > > > + struct sprd_dpu *dpu = to_sprd_crtc(crtc);
> > > > +
> > > > + drm_dbg(dpu->drm, "%s() mode: "DRM_MODE_FMT"\n", __func__,
> > > DRM_MODE_ARG(mode));
> > > > +
> > > > + if (mode->type & DRM_MODE_TYPE_PREFERRED) {
> > > > + drm_display_mode_to_videomode(mode, >ctx.vm);
> > >
> > > You don't seem to use that anywhere else? And that's a bit fragile,
> > > nothing really guarantees

linux-next: manual merge of the net-next tree with the bpf tree

2021-04-07 Thread Stephen Rothwell

Hi all,

Today's linux-next merge of the net-next tree got a conflict in:

  include/linux/skmsg.h

between commit:

  1c84b33101c8 ("bpf, sockmap: Fix sk->prot unhash op reset")

from the bpf tree and commit:

  8a59f9d1e3d4 ("sock: Introduce sk->sk_prot->psock_update_sk_prot()")

from the net-next tree.

I didn't know how to fixed it up so I just used the latter version or
today - a better solution would be appreciated. This is now fixed as
far as linux-next is concerned, but any non trivial conflicts should be
mentioned to your upstream maintainer when your tree is submitted for
merging.  You may also want to consider cooperating with the maintainer
of the conflicting tree to minimise any particularly complex conflicts.

-- 
Cheers,
Stephen Rothwell


pgpLwie9IQh1g.pgp
Description: OpenPGP digital signature

[PATCH] staging: rtl8712: matched alignment with open parenthesis

2021-04-07 Thread Mitali Borkar

Aligned arguments with open parenthesis to meet linux kernel coding
style
Reported by checkpatch

Signed-off-by: Mitali Borkar 
---
 drivers/staging/rtl8712/usb_ops.h | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/drivers/staging/rtl8712/usb_ops.h 
b/drivers/staging/rtl8712/usb_ops.h
index d62975447d29..7a6b619b73fa 100644
--- a/drivers/staging/rtl8712/usb_ops.h
+++ b/drivers/staging/rtl8712/usb_ops.h
@@ -21,9 +21,9 @@
 void r8712_usb_write_mem(struct intf_hdl *pintfhdl, u32 addr,
 u32 cnt, u8 *wmem);
 u32 r8712_usb_write_port(struct intf_hdl *pintfhdl, u32 addr,
- u32 cnt, u8 *wmem);
+u32 cnt, u8 *wmem);
 u32 r8712_usb_read_port(struct intf_hdl *pintfhdl, u32 addr,
-u32 cnt, u8 *rmem);
+   u32 cnt, u8 *rmem);
 void r8712_usb_set_intf_option(u32 *poption);
 void r8712_usb_set_intf_funs(struct intf_hdl *pintf_hdl);
 uint r8712_usb_init_intf_priv(struct intf_priv *pintfpriv);
@@ -32,7 +32,7 @@ void r8712_usb_set_intf_ops(struct _io_ops *pops);
 void r8712_usb_read_port_cancel(struct _adapter *padapter);
 void r8712_usb_write_port_cancel(struct _adapter *padapter);
 int r8712_usbctrl_vendorreq(struct intf_priv *pintfpriv, u8 request, u16 value,
- u16 index, void *pdata, u16 len, u8 requesttype);
+   u16 index, void *pdata, u16 len, u8 requesttype);
 
 #endif
 
-- 
2.30.2

[PATCH v7] soc: fsl: enable acpi support in RCPM driver

2021-04-07 Thread Ran Wang

From: Peng Ma 

This patch enables ACPI support in RCPM driver.

Signed-off-by: Peng Ma 
Signed-off-by: Ran Wang 
---
Change in v7:
 - Update comment for checking RCPM node which refferred to

Change in v6:
 - Remove copyright udpate to rebase on latest mainline

Change in v5:
 - Fix panic when dev->of_node is null

Change in v4:
 - Make commit subject more accurate
 - Remove unrelated new blank line

Change in v3:
 - Add #ifdef CONFIG_ACPI for acpi_device_id
 - Rename rcpm_acpi_imx_ids to rcpm_acpi_ids

Change in v2:
 - Update acpi_device_id to fix conflict with other driver

 drivers/soc/fsl/rcpm.c | 24 ++--
 1 file changed, 22 insertions(+), 2 deletions(-)

diff --git a/drivers/soc/fsl/rcpm.c b/drivers/soc/fsl/rcpm.c
index 4ace28cab314..90d3f4060b0c 100644
--- a/drivers/soc/fsl/rcpm.c
+++ b/drivers/soc/fsl/rcpm.c
@@ -13,6 +13,7 @@
 #include 
 #include 
 #include 
+#include 
 
 #define RCPM_WAKEUP_CELL_MAX_SIZE  7
 
@@ -78,10 +79,20 @@ static int rcpm_pm_prepare(struct device *dev)
"fsl,rcpm-wakeup", value,
rcpm->wakeup_cells + 1);
 
-   /*  Wakeup source should refer to current rcpm device */
-   if (ret || (np->phandle != value[0]))
+   if (ret)
continue;
 
+   /*
+* For DT mode, would handle devices with "fsl,rcpm-wakeup"
+* pointing to the current RCPM node.
+*
+* For ACPI mode, currently we assume there is only one
+* RCPM controller existing.
+*/
+   if (is_of_node(dev->fwnode))
+   if (np->phandle != value[0])
+   continue;
+
/* Property "#fsl,rcpm-wakeup-cells" of rcpm node defines the
 * number of IPPDEXPCR register cells, and "fsl,rcpm-wakeup"
 * of wakeup source IP contains an integer array:

linux-next: manual merge of the net-next tree with the net tree

2021-04-07 Thread Stephen Rothwell

Hi all,

Today's linux-next merge of the net-next tree got a conflict in:

  include/linux/ethtool.h

between commit:

  a975d7d8a356 ("ethtool: Remove link_mode param and derive link params from 
driver")

from the net tree and commit:

  7888fe53b706 ("ethtool: Add common function for filling out strings")

from the net-next tree.

I fixed it up (see below) and can carry the fix as necessary. This
is now fixed as far as linux-next is concerned, but any non trivial
conflicts should be mentioned to your upstream maintainer when your tree
is submitted for merging.  You may also want to consider cooperating
with the maintainer of the conflicting tree to minimise any particularly
complex conflicts.

-- 
Cheers,
Stephen Rothwell

diff --cc include/linux/ethtool.h
index cdca84e6dd6b,5c631a298994..
--- a/include/linux/ethtool.h
+++ b/include/linux/ethtool.h
@@@ -573,12 -573,13 +575,22 @@@ struct ethtool_phy_ops 
   */
  void ethtool_set_ethtool_phy_ops(const struct ethtool_phy_ops *ops);
  
 +/*
 + * ethtool_params_from_link_mode - Derive link parameters from a given link 
mode
 + * @link_ksettings: Link parameters to be derived from the link mode
 + * @link_mode: Link mode
 + */
 +void
 +ethtool_params_from_link_mode(struct ethtool_link_ksettings *link_ksettings,
 +enum ethtool_link_mode_bit_indices link_mode);
++
+ /**
+  * ethtool_sprintf - Write formatted string to ethtool string data
+  * @data: Pointer to start of string to update
+  * @fmt: Format of string to write
+  *
+  * Write formatted string to data. Update data to point at start of
+  * next string.
+  */
+ extern __printf(2, 3) void ethtool_sprintf(u8 **data, const char *fmt, ...);
  #endif /* _LINUX_ETHTOOL_H */


pgpQXT1TusKn0.pgp
Description: OpenPGP digital signature

[PATCH] gpio: gpio-104-dio-48e: Fixed coding style issues

2021-04-07 Thread Barney Goette

Fixed multiple bare uses of 'unsigned' without 'int'.
Fixed space around '*' operator.
Fixed function parameter alignment to opening parenthesis.
Reported by checkpatch.

Signed-off-by: Barney Goette 
---
 drivers/gpio/gpio-104-dio-48e.c | 53 +
 1 file changed, 27 insertions(+), 26 deletions(-)

diff --git a/drivers/gpio/gpio-104-dio-48e.c b/drivers/gpio/gpio-104-dio-48e.c
index 7a9021c4fa48..38badc421c32 100644
--- a/drivers/gpio/gpio-104-dio-48e.c
+++ b/drivers/gpio/gpio-104-dio-48e.c
@@ -49,15 +49,15 @@ struct dio48e_gpio {
unsigned char out_state[6];
unsigned char control[2];
raw_spinlock_t lock;
-   unsigned base;
+   unsigned int base;
unsigned char irq_mask;
 };
 
-static int dio48e_gpio_get_direction(struct gpio_chip *chip, unsigned offset)
+static int dio48e_gpio_get_direction(struct gpio_chip *chip, unsigned int 
offset)
 {
struct dio48e_gpio *const dio48egpio = gpiochip_get_data(chip);
-   const unsigned port = offset / 8;
-   const unsigned mask = BIT(offset % 8);
+   const unsigned int port = offset / 8;
+   const unsigned int mask = BIT(offset % 8);
 
if (dio48egpio->io_state[port] & mask)
return  GPIO_LINE_DIRECTION_IN;
@@ -65,14 +65,15 @@ static int dio48e_gpio_get_direction(struct gpio_chip 
*chip, unsigned offset)
return GPIO_LINE_DIRECTION_OUT;
 }
 
-static int dio48e_gpio_direction_input(struct gpio_chip *chip, unsigned offset)
+static int dio48e_gpio_direction_input(struct gpio_chip *chip, unsigned int 
offset)
 {
struct dio48e_gpio *const dio48egpio = gpiochip_get_data(chip);
-   const unsigned io_port = offset / 8;
+   const unsigned int io_port = offset / 8;
const unsigned int control_port = io_port / 3;
-   const unsigned control_addr = dio48egpio->base + 3 + control_port*4;
-   unsigned long flags;
-   unsigned control;
+   const unsigned int control_addr = dio48egpio->base + 3 + control_port * 
4;
+
+   unsigned int long flags;
+   unsigned int control;
 
raw_spin_lock_irqsave(>lock, flags);
 
@@ -104,17 +105,17 @@ static int dio48e_gpio_direction_input(struct gpio_chip 
*chip, unsigned offset)
return 0;
 }
 
-static int dio48e_gpio_direction_output(struct gpio_chip *chip, unsigned 
offset,
-   int value)
+static int dio48e_gpio_direction_output(struct gpio_chip *chip, unsigned int 
offset,
+   int value)
 {
struct dio48e_gpio *const dio48egpio = gpiochip_get_data(chip);
-   const unsigned io_port = offset / 8;
+   const unsigned int io_port = offset / 8;
const unsigned int control_port = io_port / 3;
-   const unsigned mask = BIT(offset % 8);
-   const unsigned control_addr = dio48egpio->base + 3 + control_port*4;
-   const unsigned out_port = (io_port > 2) ? io_port + 1 : io_port;
+   const unsigned int mask = BIT(offset % 8);
+   const unsigned int control_addr = dio48egpio->base + 3 + control_port * 
4;
+   const unsigned int out_port = (io_port > 2) ? io_port + 1 : io_port;
unsigned long flags;
-   unsigned control;
+   unsigned int control;
 
raw_spin_lock_irqsave(>lock, flags);
 
@@ -154,14 +155,14 @@ static int dio48e_gpio_direction_output(struct gpio_chip 
*chip, unsigned offset,
return 0;
 }
 
-static int dio48e_gpio_get(struct gpio_chip *chip, unsigned offset)
+static int dio48e_gpio_get(struct gpio_chip *chip, unsigned int offset)
 {
struct dio48e_gpio *const dio48egpio = gpiochip_get_data(chip);
-   const unsigned port = offset / 8;
-   const unsigned mask = BIT(offset % 8);
-   const unsigned in_port = (port > 2) ? port + 1 : port;
+   const unsigned int port = offset / 8;
+   const unsigned int mask = BIT(offset % 8);
+   const unsigned int in_port = (port > 2) ? port + 1 : port;
unsigned long flags;
-   unsigned port_state;
+   unsigned int port_state;
 
raw_spin_lock_irqsave(>lock, flags);
 
@@ -202,12 +203,12 @@ static int dio48e_gpio_get_multiple(struct gpio_chip 
*chip, unsigned long *mask,
return 0;
 }
 
-static void dio48e_gpio_set(struct gpio_chip *chip, unsigned offset, int value)
+static void dio48e_gpio_set(struct gpio_chip *chip, unsigned int offset, int 
value)
 {
struct dio48e_gpio *const dio48egpio = gpiochip_get_data(chip);
-   const unsigned port = offset / 8;
-   const unsigned mask = BIT(offset % 8);
-   const unsigned out_port = (port > 2) ? port + 1 : port;
+   const unsigned int port = offset / 8;
+   const unsigned int mask = BIT(offset % 8);
+   const unsigned int out_port = (port > 2) ? port + 1 : port;
unsigned long flags;
 
raw_spin_lock_irqsave(>lock, flags);
@@ -306,7 +307,7 @@ static void dio48e_irq_unmask(struct irq_data *data)
raw_spin_unlock_irqrestore(>lock, flags);
 }
 
-static int

Re: [PATCH] s390/pci: move ioremap/ioremap_prot/ioremap_wc/ioremap_wt/iounmap to arch/s390/mm/ioremap.c

2021-04-07 Thread Bixuan Cui




On 2021/4/6 19:14, Niklas Schnelle wrote:
> and move the have_mio variable out of the PCI only code or use a raw
> "#ifdef CONFIG_PCI". Obviously we don't have any actual users of
> ioremap() that don't depend on CONFIG_PCI but it would make it so that
> ioremap() exists and should actually function without CONFIG_PCI.
> The weird part though is that for anyone using it without CONFIG_PCI it
> would stop working if that is set and the machine doesn't have MIO
> support but would work if it does.
Well, Maybe it's better not to change it.And thank you for the explanation.

Thanks，
Bixuan Cui

Re: [PATCH 2/4] mm/hugeltb: simplify the return code of __vma_reservation_common()

2021-04-07 Thread Miaohe Lin

On 2021/4/8 5:23, Mike Kravetz wrote:
> On 4/6/21 8:09 PM, Miaohe Lin wrote:
>> On 2021/4/7 10:37, Mike Kravetz wrote:
>>> On 4/6/21 7:05 PM, Miaohe Lin wrote:
 Hi:
 On 2021/4/7 8:53, Mike Kravetz wrote:
> On 4/2/21 2:32 AM, Miaohe Lin wrote:
>> It's guaranteed that the vma is associated with a resv_map, i.e. either
>> VM_MAYSHARE or HPAGE_RESV_OWNER, when the code reaches here or we would
>> have returned via !resv check above. So ret must be less than 0 in the
>> 'else' case. Simplify the return code to make this clear.
>
> I believe we still neeed that ternary operator in the return statement.
> Why?
>
> There are two basic types of mappings to be concerned with:
> shared and private.
> For private mappings, a task can 'own' the mapping as indicated by
> HPAGE_RESV_OWNER.  Or, it may not own the mapping.  The most common way
> to create a non-owner private mapping is to have a task with a private
> mapping fork.  The parent process will have HPAGE_RESV_OWNER set, the
> child process will not.  The idea is that since the child has a COW copy
> of the mapping it should not consume reservations made by the parent.

 The child process will not have HPAGE_RESV_OWNER set because at fork time, 
 we do:
/*
 * Clear hugetlb-related page reserves for children. This only
 * affects MAP_PRIVATE mappings. Faults generated by the child
 * are not guaranteed to succeed, even if read-only
 */
if (is_vm_hugetlb_page(tmp))
reset_vma_resv_huge_pages(tmp);
 i.e. we have vma->vm_private_data = (void *)0; for child process and 
 vma_resv_map() will
 return NULL in this case.
 Or am I missed something?

> Only the parent (HPAGE_RESV_OWNER) is allowed to consume the
> reservations.
> Hope that makens sense?
>
>>
>> Signed-off-by: Miaohe Lin 
>> ---
>>  mm/hugetlb.c | 2 +-
>>  1 file changed, 1 insertion(+), 1 deletion(-)
>>
>> diff --git a/mm/hugetlb.c b/mm/hugetlb.c
>> index a03a50b7c410..b7864abded3d 100644
>> --- a/mm/hugetlb.c
>> +++ b/mm/hugetlb.c
>> @@ -2183,7 +2183,7 @@ static long __vma_reservation_common(struct hstate 
>> *h,
>>  return 1;
>>  }
>>  else
>
> This else also handles the case !HPAGE_RESV_OWNER.  In this case, we

 IMO, for the case !HPAGE_RESV_OWNER, we won't reach here. What do you 
 think?

>>>
>>> I think you are correct.
>>>
>>> However, if this is true we should be able to simply the code even
>>> further.  There is no need to check for HPAGE_RESV_OWNER because we know
>>> it must be set.  Correct?  If so, the code could look something like:
>>>
>>> if (vma->vm_flags & VM_MAYSHARE)
>>> return ret;
>>>
>>> /* We know private mapping with HPAGE_RESV_OWNER */
>>>  * ...   *
>>>  * Add that existing comment */
>>>
>>> if (ret > 0)
>>> return 0;
>>> if (ret == 0)
>>> return 1;
>>> return ret;
>>>
>>
>> Many thanks for good suggestion! What do you mean is this ?
> 
> I think the below changes would work fine.
> 
> However, this patch/discussion has made me ask the question.  Do we need
> the HPAGE_RESV_OWNER flag?  Is the followng true?
> !(vm_flags & VM_MAYSHARE) && vma_resv_map()  ===> HPAGE_RESV_OWNER
> !(vm_flags & VM_MAYSHARE) && !vma_resv_map() ===> !HPAGE_RESV_OWNER
> 

I agree with you.

HPAGE_RESV_OWNER is set in hugetlb_reserve_pages() and there's no way to clear 
it
in the owner process. The child process can not inherit both HPAGE_RESV_OWNER 
and
resv_map. So for !HPAGE_RESV_OWNER vma, it knows nothing about resv_map.

IMO, in !(vm_flags & VM_MAYSHARE) case, we must have:
!!vma_resv_map() == !!HPAGE_RESV_OWNER

> I am not suggesting we eliminate the flag and make corresponding
> changes.  Just curious if you believe we 'could' remove the flag and
> depend on the above conditions.
> 
> One reason for NOT removing the flag is that that flag itself and
> supporting code and commnets help explain what happens with hugetlb
> reserves for COW mappings.  That code is hard to understand and the
> existing code and coments around HPAGE_RESV_OWNER help with
> understanding.

Agree. These codes took me several days to understand...

> 

Thanks.

Re: [PATCH 2/3] dt-bindings: mfd: Convert pm8xxx bindings to yaml

2021-04-07 Thread Bjorn Andersson

On Wed 07 Apr 10:37 CDT 2021, ska...@codeaurora.org wrote:

> Hi Bjorn,
> 
> On 2021-03-11 22:33, Bjorn Andersson wrote:
> > On Thu 11 Mar 01:29 CST 2021, satya priya wrote:
[..]
> > > +patternProperties:
> > > +  "rtc@[0-9a-f]+$":
> > 
> > Can we somehow link this to individual binding docs instead of listing
> > all the possible functions here?
> > 
> 
> you mean we should split this into two:
> qcom-pm8xxx.yaml and qcom-pm8xxx-rtc.yaml
> Please correct me if wrong.
> 

Right, I'm worried that it will be quite hard to maintain this document
once we start adding all the various pmic blocks to it. So if we somehow
can maintain a series of qcom-pm8xxx-.yaml and just ref them into
the main PMIC definition.

@Rob, can you give us some guidance on how to structure this binding,
with the various PMICs described will have some defined subset of a
larger set of hardware blocks that's often shared between versions?

Regards,
Bjorn

[PATCH v2] Bluetooth: Add ncmd=0 recovery handling

2021-04-07 Thread Manish Mandlik

During command status or command complete event, the controller may set
ncmd=0 indicating that it is not accepting any more commands. In such a
case, host holds off sending any more commands to the controller. If the
controller doesn't recover from such condition, host will wait forever,
until the user decides that the Bluetooth is broken and may power cycles
the Bluetooth.

This patch triggers the hardware error to reset the controller and
driver when it gets into such state as there is no other wat out.

Reviewed-by: Abhishek Pandit-Subedi 
Signed-off-by: Manish Mandlik 
---

Changes in v2:
- Emit the hardware error when ncmd=0 occurs

 include/net/bluetooth/hci.h  |  1 +
 include/net/bluetooth/hci_core.h |  1 +
 net/bluetooth/hci_core.c | 15 +++
 net/bluetooth/hci_event.c| 10 ++
 4 files changed, 27 insertions(+)

diff --git a/include/net/bluetooth/hci.h b/include/net/bluetooth/hci.h
index ea4ae551c426..c4b0650fb9ae 100644
--- a/include/net/bluetooth/hci.h
+++ b/include/net/bluetooth/hci.h
@@ -339,6 +339,7 @@ enum {
 #define HCI_PAIRING_TIMEOUTmsecs_to_jiffies(6) /* 60 seconds */
 #define HCI_INIT_TIMEOUT   msecs_to_jiffies(1) /* 10 seconds */
 #define HCI_CMD_TIMEOUTmsecs_to_jiffies(2000)  /* 2 seconds */
+#define HCI_NCMD_TIMEOUT   msecs_to_jiffies(4000)  /* 4 seconds */
 #define HCI_ACL_TX_TIMEOUT msecs_to_jiffies(45000) /* 45 seconds */
 #define HCI_AUTO_OFF_TIMEOUT   msecs_to_jiffies(2000)  /* 2 seconds */
 #define HCI_POWER_OFF_TIMEOUT  msecs_to_jiffies(5000)  /* 5 seconds */
diff --git a/include/net/bluetooth/hci_core.h b/include/net/bluetooth/hci_core.h
index ebdd4afe30d2..f14692b39fd5 100644
--- a/include/net/bluetooth/hci_core.h
+++ b/include/net/bluetooth/hci_core.h
@@ -470,6 +470,7 @@ struct hci_dev {
struct delayed_work service_cache;
 
struct delayed_work cmd_timer;
+   struct delayed_work ncmd_timer;
 
struct work_struct  rx_work;
struct work_struct  cmd_work;
diff --git a/net/bluetooth/hci_core.c b/net/bluetooth/hci_core.c
index b0d9c36acc03..c102a8763cb5 100644
--- a/net/bluetooth/hci_core.c
+++ b/net/bluetooth/hci_core.c
@@ -2769,6 +2769,20 @@ static void hci_cmd_timeout(struct work_struct *work)
queue_work(hdev->workqueue, >cmd_work);
 }
 
+/* HCI ncmd timer function */
+static void hci_ncmd_timeout(struct work_struct *work)
+{
+   struct hci_dev *hdev = container_of(work, struct hci_dev,
+   ncmd_timer.work);
+
+   bt_dev_err(hdev, "Controller not accepting commands anymore: ncmd = 0");
+
+   /* This is an irrecoverable state. Inject hw error event to reset
+* the device and driver.
+*/
+   hci_reset_dev(hdev);
+}
+
 struct oob_data *hci_find_remote_oob_data(struct hci_dev *hdev,
  bdaddr_t *bdaddr, u8 bdaddr_type)
 {
@@ -3831,6 +3845,7 @@ struct hci_dev *hci_alloc_dev(void)
init_waitqueue_head(>suspend_wait_q);
 
INIT_DELAYED_WORK(>cmd_timer, hci_cmd_timeout);
+   INIT_DELAYED_WORK(>ncmd_timer, hci_ncmd_timeout);
 
hci_request_setup(hdev);
 
diff --git a/net/bluetooth/hci_event.c b/net/bluetooth/hci_event.c
index cf2f4a0abdbd..114a9170d809 100644
--- a/net/bluetooth/hci_event.c
+++ b/net/bluetooth/hci_event.c
@@ -3635,6 +3635,11 @@ static void hci_cmd_complete_evt(struct hci_dev *hdev, 
struct sk_buff *skb,
if (*opcode != HCI_OP_NOP)
cancel_delayed_work(>cmd_timer);
 
+   if (!ev->ncmd && !test_bit(HCI_RESET, >flags))
+   schedule_delayed_work(>ncmd_timer, HCI_NCMD_TIMEOUT);
+   else
+   cancel_delayed_work(>ncmd_timer);
+
if (ev->ncmd && !test_bit(HCI_RESET, >flags))
atomic_set(>cmd_cnt, 1);
 
@@ -3740,6 +3745,11 @@ static void hci_cmd_status_evt(struct hci_dev *hdev, 
struct sk_buff *skb,
if (*opcode != HCI_OP_NOP)
cancel_delayed_work(>cmd_timer);
 
+   if (!ev->ncmd && !test_bit(HCI_RESET, >flags))
+   schedule_delayed_work(>ncmd_timer, HCI_NCMD_TIMEOUT);
+   else
+   cancel_delayed_work(>ncmd_timer);
+
if (ev->ncmd && !test_bit(HCI_RESET, >flags))
atomic_set(>cmd_cnt, 1);
 
-- 
2.31.0.208.g409f899ff0-goog

[PATCH v2] hwmon: Add driver for fsp-3y PSUs and PDUs

2021-04-07 Thread Václav Kubernát

This patch adds support for these devices:
- YH-5151E - the PDU
- YM-2151E - the PSU

The device datasheet says that the devices support PMBus 1.2, but in my
testing, a lot of the commands aren't supported and if they are, they
sometimes behave strangely or inconsistently. For example, writes to the
PAGE command requires using PEC, otherwise the write won't work and the
page won't switch, even though, the standard says that PEC is opiotnal.
On the other hand, writes the SMBALERT don't require PEC. Because of
this, the driver is mostly reverse engineered with the help of a tool
called pmbus_peek written by David Brownell (and later adopted by my
colleague Jan Kundrát).

The device also has some sort of a timing issue when switching pages,
which is explained further in the code.

Because of this, the driver support is limited. It exposes only the
values, that have been tested to work correctly.

Signed-off-by: Václav Kubernát 
---
 Documentation/hwmon/fsp-3y.rst |  26 
 drivers/hwmon/pmbus/Kconfig|  10 ++
 drivers/hwmon/pmbus/Makefile   |   1 +
 drivers/hwmon/pmbus/fsp-3y.c   | 217 +
 4 files changed, 254 insertions(+)
 create mode 100644 Documentation/hwmon/fsp-3y.rst
 create mode 100644 drivers/hwmon/pmbus/fsp-3y.c

diff --git a/Documentation/hwmon/fsp-3y.rst b/Documentation/hwmon/fsp-3y.rst
new file mode 100644
index ..68a547021846
--- /dev/null
+++ b/Documentation/hwmon/fsp-3y.rst
@@ -0,0 +1,26 @@
+Kernel driver fsp3y
+==
+Supported devices:
+  * 3Y POWER YH-5151E
+  * 3Y POWER YM-2151E
+
+Author: Václav Kubernát 
+
+Description
+---
+This driver implements limited support for two 3Y POWER devices.
+
+Sysfs entries
+-
+in1_inputinput voltage
+in2_input12V output voltage
+in3_input5V output voltage
+curr1_input  input current
+curr2_input  12V output current
+curr3_input  5V output current
+fan1_input   fan rpm
+temp1_input  temperature 1
+temp2_input  temperature 2
+temp3_input  temperature 3
+power1_input input power
+power2_input output power
diff --git a/drivers/hwmon/pmbus/Kconfig b/drivers/hwmon/pmbus/Kconfig
index 03606d4298a4..9d12d446396c 100644
--- a/drivers/hwmon/pmbus/Kconfig
+++ b/drivers/hwmon/pmbus/Kconfig
@@ -56,6 +56,16 @@ config SENSORS_BEL_PFE
  This driver can also be built as a module. If so, the module will
  be called bel-pfe.
 
+config SENSORS_FSP_3Y
+   tristate "FSP/3Y-Power power supplies"
+   help
+ If you say yes here you get hardware monitoring support for
+ FSP/3Y-Power hot-swap power supplies.
+ Supported models: YH-5151E, YM-2151E
+
+ This driver can also be built as a module. If so, the module will
+ be called fsp-3y.
+
 config SENSORS_IBM_CFFPS
tristate "IBM Common Form Factor Power Supply"
depends on LEDS_CLASS
diff --git a/drivers/hwmon/pmbus/Makefile b/drivers/hwmon/pmbus/Makefile
index 6a4ba0fdc1db..bfe218ad898f 100644
--- a/drivers/hwmon/pmbus/Makefile
+++ b/drivers/hwmon/pmbus/Makefile
@@ -8,6 +8,7 @@ obj-$(CONFIG_SENSORS_PMBUS) += pmbus.o
 obj-$(CONFIG_SENSORS_ADM1266)  += adm1266.o
 obj-$(CONFIG_SENSORS_ADM1275)  += adm1275.o
 obj-$(CONFIG_SENSORS_BEL_PFE)  += bel-pfe.o
+obj-$(CONFIG_SENSORS_FSP_3Y)   += fsp-3y.o
 obj-$(CONFIG_SENSORS_IBM_CFFPS)+= ibm-cffps.o
 obj-$(CONFIG_SENSORS_INSPUR_IPSPS) += inspur-ipsps.o
 obj-$(CONFIG_SENSORS_IR35221)  += ir35221.o
diff --git a/drivers/hwmon/pmbus/fsp-3y.c b/drivers/hwmon/pmbus/fsp-3y.c
new file mode 100644
index ..2c165e034fa8
--- /dev/null
+++ b/drivers/hwmon/pmbus/fsp-3y.c
@@ -0,0 +1,217 @@
+// SPDX-License-Identifier: GPL-2.0-or-later
+/*
+ * Hardware monitoring driver for FSP 3Y-Power PSUs
+ *
+ * Copyright (c) 2021 Václav Kubernát, CESNET
+ */
+
+#include 
+#include 
+#include 
+#include 
+#include "pmbus.h"
+
+#define YM2151_PAGE_12V_LOG0x00
+#define YM2151_PAGE_12V_REAL   0x00
+#define YM2151_PAGE_5VSB_LOG   0x01
+#define YM2151_PAGE_5VSB_REAL  0x20
+#define YH5151E_PAGE_12V_LOG   0x00
+#define YH5151E_PAGE_12V_REAL  0x00
+#define YH5151E_PAGE_5V_LOG0x01
+#define YH5151E_PAGE_5V_REAL   0x10
+#define YH5151E_PAGE_3V3_LOG   0x02
+#define YH5151E_PAGE_3V3_REAL  0x11
+
+enum chips {
+   ym2151e,
+   yh5151e
+};
+
+struct fsp3y_data {
+   struct pmbus_driver_info info;
+   enum chips chip;
+   int page;
+};
+
+#define to_fsp3y_data(x) container_of(x, struct fsp3y_data, info)
+
+static int page_log_to_page_real(int page_log, enum chips chip)
+{
+   switch (chip) {
+   case ym2151e:
+   switch (page_log) {
+   case YM2151_PAGE_12V_LOG:
+   return YM2151_PAGE_12V_REAL;
+   case YM2151_PAGE_5VSB_LOG:
+   return YM2151_PAGE_5VSB_REAL;
+   }
+   return -1;
+   case yh5151e:
+

Re: [PATCH v2 5/5] percpu: implement partial chunk depopulation

2021-04-07 Thread Dennis Zhou

Hello,

On Wed, Apr 07, 2021 at 11:26:18AM -0700, Roman Gushchin wrote:
> This patch implements partial depopulation of percpu chunks.
> 
> As now, a chunk can be depopulated only as a part of the final
> destruction, if there are no more outstanding allocations. However
> to minimize a memory waste it might be useful to depopulate a
> partially filed chunk, if a small number of outstanding allocations
> prevents the chunk from being fully reclaimed.
> 
> This patch implements the following depopulation process: it scans
> over the chunk pages, looks for a range of empty and populated pages
> and performs the depopulation. To avoid races with new allocations,
> the chunk is previously isolated. After the depopulation the chunk is
> sidelined to a special list or freed. New allocations can't be served
> using a sidelined chunk. The chunk can be moved back to a corresponding
> slot if there are not enough chunks with empty populated pages.
> 
> The depopulation is scheduled on the free path. Is the chunk:
>   1) has more than 1/4 of total pages free and populated
>   2) the system has enough free percpu pages aside of this chunk
>   3) isn't the reserved chunk
>   4) isn't the first chunk
>   5) isn't entirely free
> it's a good target for depopulation. If it's already depopulated
> but got free populated pages, it's a good target too.
> The chunk is moved to a special pcpu_depopulate_list, chunk->isolate
> flag is set and the async balancing is scheduled.
> 
> The async balancing moves pcpu_depopulate_list to a local list
> (because pcpu_depopulate_list can be changed when pcpu_lock is
> releases), and then tries to depopulate each chunk.  The depopulation
> is performed in the reverse direction to keep populated pages close to
> the beginning, if the global number of empty pages is reached.
> Depopulated chunks are sidelined to prevent further allocations.
> Skipped and fully empty chunks are returned to the corresponding slot.
> 
> On the allocation path, if there are no suitable chunks found,
> the list of sidelined chunks in scanned prior to creating a new chunk.
> If there is a good sidelined chunk, it's placed back to the slot
> and the scanning is restarted.
> 
> Many thanks to Dennis Zhou for his great ideas and a very constructive
> discussion which led to many improvements in this patchset!
> 
> Signed-off-by: Roman Gushchin 
> ---
>  mm/percpu-internal.h |   2 +
>  mm/percpu.c  | 164 ++-
>  2 files changed, 164 insertions(+), 2 deletions(-)
> 
> diff --git a/mm/percpu-internal.h b/mm/percpu-internal.h
> index 095d7eaa0db4..8e432663c41e 100644
> --- a/mm/percpu-internal.h
> +++ b/mm/percpu-internal.h
> @@ -67,6 +67,8 @@ struct pcpu_chunk {
>  
>   void*data;  /* chunk data */
>   boolimmutable;  /* no [de]population allowed */
> + boolisolated;   /* isolated from chunk slot 
> lists */
> + booldepopulated;/* sidelined after depopulation 
> */
>   int start_offset;   /* the overlap with the previous
>  region to have a page aligned
>  base_addr */
> diff --git a/mm/percpu.c b/mm/percpu.c
> index e20119668c42..0a5a5e84e0a4 100644
> --- a/mm/percpu.c
> +++ b/mm/percpu.c
> @@ -181,6 +181,19 @@ static LIST_HEAD(pcpu_map_extend_chunks);
>   */
>  int pcpu_nr_empty_pop_pages[PCPU_NR_CHUNK_TYPES];
>  
> +/*
> + * List of chunks with a lot of free pages.  Used to depopulate them
> + * asynchronously.
> + */
> +static struct list_head pcpu_depopulate_list[PCPU_NR_CHUNK_TYPES];
> +
> +/*
> + * List of previously depopulated chunks.  They are not usually used for new
> + * allocations, but can be returned back to service if a need arises.
> + */
> +static struct list_head pcpu_sideline_list[PCPU_NR_CHUNK_TYPES];
> +
> +
>  /*
>   * The number of populated pages in use by the allocator, protected by
>   * pcpu_lock.  This number is kept per a unit per chunk (i.e. when a page 
> gets
> @@ -542,6 +555,12 @@ static void pcpu_chunk_relocate(struct pcpu_chunk 
> *chunk, int oslot)
>  {
>   int nslot = pcpu_chunk_slot(chunk);
>  
> + /*
> +  * Keep isolated and depopulated chunks on a sideline.
> +  */
> + if (chunk->isolated || chunk->depopulated)
> + return;
> +
>   if (oslot != nslot)
>   __pcpu_chunk_move(chunk, nslot, oslot < nslot);
>  }
> @@ -1778,6 +1797,25 @@ static void __percpu *pcpu_alloc(size_t size, size_t 
> align, bool reserved,
>   }
>   }
>  
> + /* search through sidelined depopulated chunks */
> + list_for_each_entry(chunk, _sideline_list[type], list) {
> + struct pcpu_block_md *chunk_md = >chunk_md;
> + int bit_off;
> +
> + /*
> +  * If the allocation can fit in the chunk's contig hint,
> +

Re: [PATCH V8 1/8] PM / devfreq: Add cpu based scaling support to passive_governor

2021-04-07 Thread Chanwoo Choi

On 4/1/21 9:16 AM, Chanwoo Choi wrote:
> On 3/31/21 10:03 PM, andrew-sh.cheng wrote:
>> On Wed, 2021-03-31 at 17:35 +0900, Chanwoo Choi wrote:
>>> On 3/31/21 5:27 PM, Chanwoo Choi wrote:
 Hi,

 On 3/31/21 5:03 PM, andrew-sh.cheng wrote:
> On Thu, 2021-03-25 at 17:14 +0900, Chanwoo Choi wrote:
>> Hi,
>>
>> You are missing to add these patches to linux-pm mailing list.
>> Need to send them to linu-pm ML.
>>
>> Also, before received this series, I tried to clean-up these patches
>> on testing branch[1]. So that I add my comment with my clean-up case.
>> [1] 
>> https://urldefense.com/v3/__https://git.kernel.org/pub/scm/linux/kernel/git/chanwoo/linux.git/log/?h=devfreq-testing-passive-gov__;!!CTRNKA9wMg0ARbw!zIrzeDp9vPnm1_SDzVPuzqdHn3zWie9DnfBXaA-j9-CSrVc6aR9_rJQQiw81_CgAPh9XRRs$
>>  
>>
>> And 'Saravana Kannan ' is wrong email address.
>> Please update the email or drop this email.
>
> Hi Chanwoo,
>
> Thank you for the advices.
> I will resend patch v9 (add to linux-pm ML), remove this patch, and note
> that my patch set base on
> https://urldefense.com/v3/__https://git.kernel.org/pub/scm/linux/kernel/git/chanwoo/linux.git/log/?h=devfreq-testing-passive-gov__;!!CTRNKA9wMg0ARbw!yUlsuxrL5PcbF7o6A9DlCfvoA6w8V8VXKjYIybYyiJg3D0HM-lI2xRuxLUV6b3UJ8WFhg_g$
>  

 I has not yet test this patch[1] on devfreq-testing-passive-gov branch.
 So that if possible, I'd like you to test your patches with this patch[1] 
 and then if there is no problem, could you send the next patches with 
 patch[1]?

 [1]https://urldefense.com/v3/__https://git.kernel.org/pub/scm/linux/kernel/git/chanwoo/linux.git/commit/?h=devfreq-testing-passive-gov=39c80d11a8f42dd63ecea1e0df595a0ceb83b454__;!!CTRNKA9wMg0ARbw!yUlsuxrL5PcbF7o6A9DlCfvoA6w8V8VXKjYIybYyiJg3D0HM-lI2xRuxLUV6b3UJR2cQqZs$
  
>>>
>>>
>>> Sorry for the confusion. I make the devfreq-testing-passive-gov[1]
>>> branch based on latest devfreq-next branch.
>>> [1] 
>>> https://urldefense.com/v3/__https://git.kernel.org/pub/scm/linux/kernel/git/chanwoo/linux.git/log/?h=devfreq-testing-passive-gov__;!!CTRNKA9wMg0ARbw!yUlsuxrL5PcbF7o6A9DlCfvoA6w8V8VXKjYIybYyiJg3D0HM-lI2xRuxLUV6b3UJ8WFhg_g$
>>>  
>>>
>>> First of all, if possible, I want to test them[1] with your patches in this 
>>> series.
>>> And then if there are no any problem, please let me know. After confirmed 
>>> from you,
>>> I'll send the patches of devfreq-testing-passive-gov[1] branch.
>>> How about that?
>>>
>> Hi Chanwoo~
>>
>> We will use this on Google Chrome project.
>> Google Hsin-Yi has test your patch + my patch set v8 [2~8]
>>
>> make sure cci devfreqs runs with cpufreq.
>> suspend resume
>> speedometer2 benchmark
>> It is okay.
>>
>> Please send the patches of devfreq-testing-passive-gov[1] branch.
>>
>> I will send patch v9 base on yours latter.
> 
> Thanks for your test. I'll send the patches today.

I'm sorry for delay because when I tested the patches
for devfreq parent type on Odroid-xu3, there are some problem
related to lazy linking of OPP. So I'm trying to analyze them.
Unfortunately, we need to postpone these patches to next linux
version.


[snip]

-- 
Best Regards,
Chanwoo Choi
Samsung Electronics

[PATCH] staging: rtl8712: removed extra blank line

2021-04-07 Thread Mitali Borkar

Removed an extra blank line so that only one blank line is present in
between two functions which separates them out.
Reported by checkpatch

Signed-off-by: Mitali Borkar 
---
 drivers/staging/rtl8712/rtl8712_wmac_regdef.h | 1 -
 1 file changed, 1 deletion(-)

diff --git a/drivers/staging/rtl8712/rtl8712_wmac_regdef.h 
b/drivers/staging/rtl8712/rtl8712_wmac_regdef.h
index 662383fe7a8d..dfe3e9fbed43 100644
--- a/drivers/staging/rtl8712/rtl8712_wmac_regdef.h
+++ b/drivers/staging/rtl8712/rtl8712_wmac_regdef.h
@@ -32,6 +32,5 @@
 #define AMPDU_MIN_SPACE(RTL8712_WMAC_ + 0x37)
 #defineTXOP_STALL_CTRL (RTL8712_WMAC_ + 0x38)
 
-
 #endif /*__RTL8712_WMAC_REGDEF_H__*/
 
-- 
2.30.2

1 2 3 4 5 6 7 8 9 10 >

1 - 100 of 1711 matches

Mail list logo