Re: [PATCH 3/3] dma-contiguous: Type cast MAX_ORDER as unsigned int

2021-02-11 Thread Anshuman Khandual


On 2/11/21 1:34 PM, Christoph Hellwig wrote:
> On Thu, Feb 11, 2021 at 11:52:11AM +0530, Anshuman Khandual wrote:
>> Type cast MAX_ORDER as unsigned int to fix the following build warning.
>>
>> In file included from ./include/linux/kernel.h:14,
>>  from ./include/asm-generic/bug.h:20,
>>  from ./arch/arm64/include/asm/bug.h:26,
>>  from ./include/linux/bug.h:5,
>>  from ./include/linux/mmdebug.h:5,
>>  from ./arch/arm64/include/asm/memory.h:166,
>>  from ./arch/arm64/include/asm/page.h:42,
>>  from kernel/dma/contiguous.c:46:
>> kernel/dma/contiguous.c: In function ‘rmem_cma_setup’:
>> ./include/linux/minmax.h:18:28: warning: comparison of distinct pointer
>> types lacks a cast
>>   (!!(sizeof((typeof(x) *)1 == (typeof(y) *)1)))
>> ^~
>> ./include/linux/minmax.h:32:4: note: in expansion of macro ‘__typecheck’
>>(__typecheck(x, y) && __no_side_effects(x, y))
>> ^~~
>> ./include/linux/minmax.h:42:24: note: in expansion of macro ‘__safe_cmp’
>>   __builtin_choose_expr(__safe_cmp(x, y), \
>> ^~
>> ./include/linux/minmax.h:58:19: note: in expansion of macro
>> ‘__careful_cmp’
>>  #define max(x, y) __careful_cmp(x, y, >)
>>^
>> kernel/dma/contiguous.c:402:35: note: in expansion of macro ‘max’
>>   phys_addr_t align = PAGE_SIZE << max(MAX_ORDER - 1, pageblock_order);
>>
>> Cc: Christoph Hellwig 
>> Cc: Marek Szyprowski 
>> Cc: Robin Murphy 
>> Cc: iommu@lists.linux-foundation.org
>> Cc: linux-ker...@vger.kernel.org
>> Signed-off-by: Anshuman Khandual 
>> ---
>>  kernel/dma/contiguous.c | 2 +-
>>  1 file changed, 1 insertion(+), 1 deletion(-)
>>
>> diff --git a/kernel/dma/contiguous.c b/kernel/dma/contiguous.c
>> index 3d63d91cba5c..1c2782349d71 100644
>> --- a/kernel/dma/contiguous.c
>> +++ b/kernel/dma/contiguous.c
>> @@ -399,7 +399,7 @@ static const struct reserved_mem_ops rmem_cma_ops = {
>>  
>>  static int __init rmem_cma_setup(struct reserved_mem *rmem)
>>  {
>> -phys_addr_t align = PAGE_SIZE << max(MAX_ORDER - 1, pageblock_order);
>> +phys_addr_t align = PAGE_SIZE << max((unsigned int)MAX_ORDER - 1, 
>> pageblock_order);
> 
> MAX_ORDER and pageblock_order should be the same type.  So either fix

Right.

> MAX_ORDER to be an unsigned constant, which would be fundamentally
> the right thing to do but might cause some fallout, or turn
> pageblock_order into an int, which is probably much either as the stub
> define of it already has an integer type derived from MAX_ORDER as well.

Right, will change pageblock_order as 'int' which would be easier.
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

Re: [PATCH v7 02/16] iommu/smmu: Report empty domain nesting info

2021-02-11 Thread Vivek Gautam
Hi Yi,


On Sat, Jan 23, 2021 at 2:29 PM Liu, Yi L  wrote:
>
> Hi Eric,
>
> > From: Auger Eric 
> > Sent: Tuesday, January 19, 2021 6:03 PM
> >
> > Hi Yi, Vivek,
> >
> [...]
> > > I see. I think there needs a change in the code there. Should also expect
> > > a nesting_info returned instead of an int anymore. @Eric, how about your
> > > opinion?
> > >
> > > domain = iommu_get_domain_for_dev(>pdev->dev);
> > > ret = iommu_domain_get_attr(domain, DOMAIN_ATTR_NESTING,
> > );
> > > if (ret || !(info.features & IOMMU_NESTING_FEAT_PAGE_RESP)) {
> > > /*
> > >  * No need go futher as no page request service support.
> > >  */
> > > return 0;
> > > }
> > Sure I think it is "just" a matter of synchro between the 2 series. Yi,
>
> exactly.
>
> > do you have plans to respin part of
> > [PATCH v7 00/16] vfio: expose virtual Shared Virtual Addressing to VMs
> > or would you allow me to embed this patch in my series.
>
> My v7 hasn’t touch the prq change yet. So I think it's better for you to
> embed it to  your series. ^_^
>

Can you please let me know if you have an updated series of these
patches? It will help me to work with virtio-iommu/arm side changes.

Thanks & regards
Vivek
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

Re: [PATCH 2/3] arm64/hugetlb: Enable HUGETLB_PAGE_SIZE_VARIABLE

2021-02-11 Thread Anshuman Khandual



On 2/11/21 1:31 PM, Christoph Hellwig wrote:
> On Thu, Feb 11, 2021 at 11:52:10AM +0530, Anshuman Khandual wrote:
>> MAX_ORDER which invariably depends on FORCE_MAX_ZONEORDER can be a variable
>> for a given page size, depending on whether TRANSPARENT_HUGEPAGE is enabled
>> or not. In certain page size and THP combinations HUGETLB_PAGE_ORDER can be
>> greater than MAX_ORDER, making it unusable as pageblock_order.
>>
>> This enables HUGETLB_PAGE_SIZE_VARIABLE making pageblock_order a variable
>> rather than the compile time constant HUGETLB_PAGE_ORDER which could break
>> MAX_ORDER rule for certain configurations.
>>
>> Cc: Catalin Marinas 
>> Cc: Will Deacon 
>> Cc: linux-arm-ker...@lists.infradead.org
>> Cc: linux-ker...@vger.kernel.org
>> Signed-off-by: Anshuman Khandual 
>> ---
>>  arch/arm64/Kconfig | 4 
>>  1 file changed, 4 insertions(+)
>>
>> diff --git a/arch/arm64/Kconfig b/arch/arm64/Kconfig
>> index f39568b28ec1..8e3a5578f663 100644
>> --- a/arch/arm64/Kconfig
>> +++ b/arch/arm64/Kconfig
>> @@ -1909,6 +1909,10 @@ config ARCH_ENABLE_THP_MIGRATION
>>  def_bool y
>>  depends on TRANSPARENT_HUGEPAGE
>>  
>> +config HUGETLB_PAGE_SIZE_VARIABLE
> 
> Please move the definition of HUGETLB_PAGE_SIZE_VARIABLE to
> mm/Kconfig and select it from the arch Kconfigfs instead of duplicating
> the definition.

Sure, will do.
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [PATCH 1/3] mm/page_alloc: Fix pageblock_order when HUGETLB_PAGE_ORDER >= MAX_ORDER

2021-02-11 Thread Anshuman Khandual



On 2/11/21 1:30 PM, Christoph Hellwig wrote:
>> -if (HPAGE_SHIFT > PAGE_SHIFT)
>> +if ((HPAGE_SHIFT > PAGE_SHIFT) && (HUGETLB_PAGE_ORDER < MAX_ORDER))
> 
> No need for the braces.

Will drop them.
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [PATCH 0/3] mm/page_alloc: Fix pageblock_order with HUGETLB_PAGE_SIZE_VARIABLE

2021-02-11 Thread Anshuman Khandual

On 2/11/21 2:07 PM, David Hildenbrand wrote:
> On 11.02.21 07:22, Anshuman Khandual wrote:
>> The following warning gets triggered while trying to boot a 64K page size
>> without THP config kernel on arm64 platform.
>>
>> WARNING: CPU: 5 PID: 124 at mm/vmstat.c:1080 __fragmentation_index+0xa4/0xc0
>> Modules linked in:
>> CPU: 5 PID: 124 Comm: kswapd0 Not tainted 5.11.0-rc6-4-ga0ea7d62002 #159
>> Hardware name: linux,dummy-virt (DT)
>> [    8.810673] pstate: 2045 (nzCv daif +PAN -UAO -TCO BTYPE=--)
>> [    8.811732] pc : __fragmentation_index+0xa4/0xc0
>> [    8.812555] lr : fragmentation_index+0xf8/0x138
>> [    8.813360] sp : 864079b0
>> [    8.813958] x29: 864079b0 x28: 0372
>> [    8.814901] x27: 7682 x26: 8000135b3948
>> [    8.815847] x25: 1fffe00010c80f48 x24: 
>> [    8.816805] x23:  x22: 000d
>> [    8.817764] x21: 0030 x20: 0005ffcb4d58
>> [    8.818712] x19: 000b x18: 
>> [    8.819656] x17:  x16: 
>> [    8.820613] x15:  x14: 8000114c6258
>> [    8.821560] x13: 6000bff969ba x12: 1fffe000bff969b9
>> [    8.822514] x11: 1fffe000bff969b9 x10: 6000bff969b9
>> [    8.823461] x9 : dfff8000 x8 : 0005ffcb4dcf
>> [    8.824415] x7 : 0001 x6 : 41b58ab3
>> [    8.825359] x5 : 600010c80f48 x4 : dfff8000
>> [    8.826313] x3 : 8000102be670 x2 : 0007
>> [    8.827259] x1 : 86407a60 x0 : 000d
>> [    8.828218] Call trace:
>> [    8.828667]  __fragmentation_index+0xa4/0xc0
>> [    8.829436]  fragmentation_index+0xf8/0x138
>> [    8.830194]  compaction_suitable+0x98/0xb8
>> [    8.830934]  wakeup_kcompactd+0xdc/0x128
>> [    8.831640]  balance_pgdat+0x71c/0x7a0
>> [    8.832327]  kswapd+0x31c/0x520
>> [    8.832902]  kthread+0x224/0x230
>> [    8.833491]  ret_from_fork+0x10/0x30
>> [    8.834150] ---[ end trace 472836f79c15516b ]---
>>
>> This warning comes from __fragmentation_index() when the requested order
>> is greater than MAX_ORDER.
>>
>> static int __fragmentation_index(unsigned int order,
>>  struct contig_page_info *info)
>> {
>>  unsigned long requested = 1UL << order;
>>
>>  if (WARN_ON_ONCE(order >= MAX_ORDER)) <= Triggered here
>>  return 0;
>>
>> Digging it further reveals that pageblock_order has been assigned a value
>> which is greater than MAX_ORDER failing the above check. But why this
>> happened ? Because HUGETLB_PAGE_ORDER for the given config on arm64 is
>> greater than MAX_ORDER.
>>
>> The solution involves enabling HUGETLB_PAGE_SIZE_VARIABLE which would make
>> pageblock_order a variable instead of constant HUGETLB_PAGE_ORDER. But that
>> change alone also did not really work as pageblock_order still got assigned
>> as HUGETLB_PAGE_ORDER in set_pageblock_order(). HUGETLB_PAGE_ORDER needs to
>> be less than MAX_ORDER for its appropriateness as pageblock_order otherwise
>> just fallback to MAX_ORDER - 1 as before. While here it also fixes a build
>> problem via type casting MAX_ORDER in rmem_cma_setup().
> 
> I'm wondering, is there any real value in allowing FORCE_MAX_ZONEORDER to be 
> "11" with ARM64_64K_PAGES/ARM64_16K_PAGES?

MAX_ORDER should be as high as would be required for the current config.
Unless THP is enabled, there is no need for it to be any higher than 11.
But I might be missing historical reasons around this as well. Probably
others from arm64 could help here.

> 
> Meaning: are there any real use cases that actually build a kernel without 
> TRANSPARENT_HUGEPAGE and with ARM64_64K_PAGES/ARM64_16K_PAGES?

THP is always optional. Besides kernel builds without THP should always
be supported. Assuming that all builds will have THP enabled, might not
be accurate.

> 
> As builds are essentially broken, I assume this is not that relevant? Or how 
> long has it been broken?

Git blame shows that it's been there for some time now. But how does
that make this irrelevant ? A problem should be fixed nonetheless.

> 
> It might be easier to just drop the "TRANSPARENT_HUGEPAGE" part from the 
> FORCE_MAX_ZONEORDER config.
> 

Not sure if it would be a good idea to unnecessarily have larger MAX_ORDER
value for a given config. But I might be missing other contexts here.
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

Re: [PATCH v13 05/15] iommu/smmuv3: Get prepared for nested stage support

2021-02-11 Thread Auger Eric
Hi Keqian,

On 2/2/21 8:14 AM, Keqian Zhu wrote:
> Hi Eric,
> 
> On 2020/11/18 19:21, Eric Auger wrote:
>> When nested stage translation is setup, both s1_cfg and
>> s2_cfg are set.
>>
>> We introduce a new smmu domain abort field that will be set
>> upon guest stage1 configuration passing.
>>
>> arm_smmu_write_strtab_ent() is modified to write both stage
>> fields in the STE and deal with the abort field.
>>
>> In nested mode, only stage 2 is "finalized" as the host does
>> not own/configure the stage 1 context descriptor; guest does.
>>
>> Signed-off-by: Eric Auger 
>>
>> ---
>> v10 -> v11:
>> - Fix an issue reported by Shameer when switching from with vSMMU
>>   to without vSMMU. Despite the spec does not seem to mention it
>>   seems to be needed to reset the 2 high 64b when switching from
>>   S1+S2 cfg to S1 only. Especially dst[3] needs to be reset (S2TTB).
>>   On some implementations, if the S2TTB is not reset, this causes
>>   a C_BAD_STE error
>> ---
>>  drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c | 64 +
>>  drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h |  2 +
>>  2 files changed, 56 insertions(+), 10 deletions(-)
>>
>> diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c 
>> b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
>> index 18ac5af1b284..412ea1bafa50 100644
>> --- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
>> +++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
>> @@ -1181,8 +1181,10 @@ static void arm_smmu_write_strtab_ent(struct 
>> arm_smmu_master *master, u32 sid,
>>   * three cases at the moment:
>>   *
>>   * 1. Invalid (all zero) -> bypass/fault (init)
>> - * 2. Bypass/fault -> translation/bypass (attach)
>> - * 3. Translation/bypass -> bypass/fault (detach)
>> + * 2. Bypass/fault -> single stage translation/bypass (attach)
>> + * 3. Single or nested stage Translation/bypass -> bypass/fault (detach)
>> + * 4. S2 -> S1 + S2 (attach_pasid_table)
>> + * 5. S1 + S2 -> S2 (detach_pasid_table)
> 
> The following line "BUG_ON(ste_live && !nested);" forbids this transform.

Yes as pointed out by Kunkun, there is always an abort in-between. I
will restore the original comment.

> And I have a look at the 6th patch, the transform seems S1 + S2 -> abort.
> So after detach, the status is not the same as that before attach. Does it
> match our expectation?

Indeed at detach time I think I should reset the abort() flag as this
latter is not imposed anymore by the guest.

Thanks!

Eric


> 
>>   *
>>   * Given that we can't update the STE atomically and the SMMU
>>   * doesn't read the thing in a defined order, that leaves us
>> @@ -1193,7 +1195,8 @@ static void arm_smmu_write_strtab_ent(struct 
>> arm_smmu_master *master, u32 sid,
>>   * 3. Update Config, sync
>>   */
>>  u64 val = le64_to_cpu(dst[0]);
>> -bool ste_live = false;
>> +bool s1_live = false, s2_live = false, ste_live;
>> +bool abort, nested = false, translate = false;
>>  struct arm_smmu_device *smmu = NULL;
>>  struct arm_smmu_s1_cfg *s1_cfg;
>>  struct arm_smmu_s2_cfg *s2_cfg;
>> @@ -1233,6 +1236,8 @@ static void arm_smmu_write_strtab_ent(struct 
>> arm_smmu_master *master, u32 sid,
>>  default:
>>  break;
>>  }
>> +nested = s1_cfg->set && s2_cfg->set;
>> +translate = s1_cfg->set || s2_cfg->set;
>>  }
>>  
>>  if (val & STRTAB_STE_0_V) {
>> @@ -1240,23 +1245,36 @@ static void arm_smmu_write_strtab_ent(struct 
>> arm_smmu_master *master, u32 sid,
>>  case STRTAB_STE_0_CFG_BYPASS:
>>  break;
>>  case STRTAB_STE_0_CFG_S1_TRANS:
>> +s1_live = true;
>> +break;
>>  case STRTAB_STE_0_CFG_S2_TRANS:
>> -ste_live = true;
>> +s2_live = true;
>> +break;
>> +case STRTAB_STE_0_CFG_NESTED:
>> +s1_live = true;
>> +s2_live = true;
>>  break;
>>  case STRTAB_STE_0_CFG_ABORT:
>> -BUG_ON(!disable_bypass);
>>  break;
>>  default:
>>  BUG(); /* STE corruption */
>>  }
>>  }
>>  
>> +ste_live = s1_live || s2_live;
>> +
>>  /* Nuke the existing STE_0 value, as we're going to rewrite it */
>>  val = STRTAB_STE_0_V;
>>  
>>  /* Bypass/fault */
>> -if (!smmu_domain || !(s1_cfg->set || s2_cfg->set)) {
>> -if (!smmu_domain && disable_bypass)
>> +
>> +if (!smmu_domain)
>> +abort = disable_bypass;
>> +else
>> +abort = smmu_domain->abort;
>> +
>> +if (abort || !translate) {
>> +if (abort)
>>  val |= FIELD_PREP(STRTAB_STE_0_CFG, 
>> STRTAB_STE_0_CFG_ABORT);
>>  else
>>  val |= FIELD_PREP(STRTAB_STE_0_CFG, 
>> 

Re: [PATCH v13 06/15] iommu/smmuv3: Implement attach/detach_pasid_table

2021-02-11 Thread Auger Eric
Hi Keqian,

On 2/2/21 9:03 AM, Keqian Zhu wrote:
> Hi Eric,
> 
> On 2020/11/18 19:21, Eric Auger wrote:
>> On attach_pasid_table() we program STE S1 related info set
>> by the guest into the actual physical STEs. At minimum
>> we need to program the context descriptor GPA and compute
>> whether the stage1 is translated/bypassed or aborted.
>>
>> Signed-off-by: Eric Auger 
>>
>> ---
>> v7 -> v8:
>> - remove smmu->features check, now done on domain finalize
>>
>> v6 -> v7:
>> - check versions and comment the fact we don't need to take
>>   into account s1dss and s1fmt
>> v3 -> v4:
>> - adapt to changes in iommu_pasid_table_config
>> - different programming convention at s1_cfg/s2_cfg/ste.abort
>>
>> v2 -> v3:
>> - callback now is named set_pasid_table and struct fields
>>   are laid out differently.
>>
>> v1 -> v2:
>> - invalidate the STE before changing them
>> - hold init_mutex
>> - handle new fields
>> ---
>>  drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c | 89 +
>>  1 file changed, 89 insertions(+)
>>
>> diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c 
>> b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
>> index 412ea1bafa50..805acdc18a3a 100644
>> --- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
>> +++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
>> @@ -2661,6 +2661,93 @@ static void arm_smmu_get_resv_regions(struct device 
>> *dev,
>>  iommu_dma_get_resv_regions(dev, head);
>>  }
>>  
>> +static int arm_smmu_attach_pasid_table(struct iommu_domain *domain,
>> +   struct iommu_pasid_table_config *cfg)
>> +{
>> +struct arm_smmu_domain *smmu_domain = to_smmu_domain(domain);
>> +struct arm_smmu_master *master;
>> +struct arm_smmu_device *smmu;
>> +unsigned long flags;
>> +int ret = -EINVAL;
>> +
>> +if (cfg->format != IOMMU_PASID_FORMAT_SMMUV3)
>> +return -EINVAL;
>> +
>> +if (cfg->version != PASID_TABLE_CFG_VERSION_1 ||
>> +cfg->vendor_data.smmuv3.version != PASID_TABLE_SMMUV3_CFG_VERSION_1)
>> +return -EINVAL;
>> +
>> +mutex_lock(_domain->init_mutex);
>> +
>> +smmu = smmu_domain->smmu;
>> +
>> +if (!smmu)
>> +goto out;
>> +
>> +if (smmu_domain->stage != ARM_SMMU_DOMAIN_NESTED)
>> +goto out;
>> +
>> +switch (cfg->config) {
>> +case IOMMU_PASID_CONFIG_ABORT:
>> +smmu_domain->s1_cfg.set = false;
>> +smmu_domain->abort = true;
>> +break;
>> +case IOMMU_PASID_CONFIG_BYPASS:
>> +smmu_domain->s1_cfg.set = false;
>> +smmu_domain->abort = false;
> I didn't test it, but it seems that this will cause BUG() in 
> arm_smmu_write_strtab_ent().
> At the line "BUG_ON(ste_live && !nested);". Maybe I miss something?

No you are fully correct. Shammeer hit the BUG_ON() when booting the
guest with iommu.passthrough = 1. So I removed the BUG_ON(). The legacy
BUG_ON(ste_live) still is there under the form of BUG_ON(s1_live).

Thanks!

Eric


> 
>> +break;
>> +case IOMMU_PASID_CONFIG_TRANSLATE:
>> +/* we do not support S1 <-> S1 transitions */
>> +if (smmu_domain->s1_cfg.set)
>> +goto out;
>> +
>> +/*
>> + * we currently support a single CD so s1fmt and s1dss
>> + * fields are also ignored
>> + */
>> +if (cfg->pasid_bits)
>> +goto out;
>> +
>> +smmu_domain->s1_cfg.cdcfg.cdtab_dma = cfg->base_ptr;
>> +smmu_domain->s1_cfg.set = true;
>> +smmu_domain->abort = false;
>> +break;
>> +default:
>> +goto out;
>> +}
>> +spin_lock_irqsave(_domain->devices_lock, flags);
>> +list_for_each_entry(master, _domain->devices, domain_head)
>> +arm_smmu_install_ste_for_dev(master);
>> +spin_unlock_irqrestore(_domain->devices_lock, flags);
>> +ret = 0;
>> +out:
>> +mutex_unlock(_domain->init_mutex);
>> +return ret;
>> +}
>> +
> [...]
> 
> Thanks,
> Keqian
> 

___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [PATCH RESEND v2 4/5] iommu/tegra-smmu: Rework tegra_smmu_probe_device()

2021-02-11 Thread Guillaume Tucker
On 10/02/2021 08:20, Nicolin Chen wrote:
> Hi Guillaume,
> 
> On Sat, Feb 06, 2021 at 01:40:13PM +, Guillaume Tucker wrote:
>>> It'd be nicer if I can get both logs of the vanilla kernel (failing)
>>> and the commit-reverted version (passing), each applying this patch.
>>
>> Sure, I've run 3 jobs:
>>
>> * v5.11-rc6 as a reference, to see the original issue:
>>   https://lava.collabora.co.uk/scheduler/job/3187848
>>
>> * + your debug patch:
>>   https://lava.collabora.co.uk/scheduler/job/3187849
>>
>> * + the "breaking" commit reverted, passing the tests:
>>   https://lava.collabora.co.uk/scheduler/job/3187851
> 
> Thanks for the help!
> 
> I am able to figure out what's probably wrong, yet not so sure
> about the best solution at this point.
> 
> Would it be possible for you to run one more time with another
> debugging patch? I'd like to see the same logs as previous:
> 1. Vanilla kernel + debug patch
> 2. Vanilla kernel + Reverted + debug patch

As it turns out, next-20210210 is passing all the tests again so
it looks like this got fixed in the meantime:

  https://lava.collabora.co.uk/scheduler/job/3210192
  https://lava.collabora.co.uk/results/3210192/0_igt-kms-tegra

And here's a more extensive list of IGT tests on next-20210211,
all the regressions have been fixed:

  https://kernelci.org/test/plan/id/60254c42f51df36be53abe62/


I haven't run a reversed bisection to find the fix, but I guess
it wouldn't be too hard to find out what happened by hand anyway.
I see the drm/tegra/for-5.12-rc1 tag has been merged into
linux-next, maybe that solved the issue?

FYI I've also run some jobs with your debug patch and with the
breaking patch reverted:

  https://lava.collabora.co.uk/scheduler/job/3210245
  https://lava.collabora.co.uk/scheduler/job/3210596

Meanwhile I'll see what can be done to improve the automated
bisection so if there are new IGT regressions they would get
reported earlier.  I guess it would have saved us all some time
if it had been bisected in December.

Thanks,
Guillaume
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: add a new dma_alloc_noncontiguous API v2

2021-02-11 Thread Laurent Pinchart
Hi Ricardo,

On Thu, Feb 11, 2021 at 02:20:30PM +0100, Ricardo Ribalda wrote:
> On Thu, Feb 11, 2021 at 2:06 PM Christoph Hellwig  wrote:
> > On Thu, Feb 11, 2021 at 10:08:18AM +0100, Ricardo Ribalda wrote:
> > > Hi Christoph
> > >
> > > What are your merge plans for the uvc change?
> > > http://git.infradead.org/users/hch/dma-mapping.git/commit/3dc47131f8aacc2093f68a9971d24c754e435520
> > >
> > > Are you going to remove the patch on your Merge request and then send
> > > it for review to Laurent? or merge it through your tree with a S-o-B
> > > him?
> >
> > I though I had all the ACKs to queue it up.  Is that not what was
> > intended?  Queueing up the API without a user is generally a bad idea.
> >
> 
> I am pretty sure we need the ack from Laurent. He maintains uvc
> 
> @Laurent Pinchart what are your thoughts?

I think it would have been nice to CC me on the patch in the first place
:-) I won't have time to review the series before next week at the
earliest.

-- 
Regards,

Laurent Pinchart
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: add a new dma_alloc_noncontiguous API v2

2021-02-11 Thread Ricardo Ribalda
HI Christoph

On Thu, Feb 11, 2021 at 2:06 PM Christoph Hellwig  wrote:
>
> On Thu, Feb 11, 2021 at 10:08:18AM +0100, Ricardo Ribalda wrote:
> > Hi Christoph
> >
> > What are your merge plans for the uvc change?
> > http://git.infradead.org/users/hch/dma-mapping.git/commit/3dc47131f8aacc2093f68a9971d24c754e435520
> >
> > Are you going to remove the patch on your Merge request and then send
> > it for review to Laurent? or merge it through your tree with a S-o-B
> > him?
>
> I though I had all the ACKs to queue it up.  Is that not what was
> intended?  Queueing up the API without a user is generally a bad idea.
>

I am pretty sure we need the ack from Laurent. He maintains uvc

@Laurent Pinchart what are your thoughts?

Thanks!


-- 
Ricardo Ribalda
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: add a new dma_alloc_noncontiguous API v2

2021-02-11 Thread Christoph Hellwig
On Thu, Feb 11, 2021 at 10:08:18AM +0100, Ricardo Ribalda wrote:
> Hi Christoph
> 
> What are your merge plans for the uvc change?
> http://git.infradead.org/users/hch/dma-mapping.git/commit/3dc47131f8aacc2093f68a9971d24c754e435520
> 
> Are you going to remove the patch on your Merge request and then send
> it for review to Laurent? or merge it through your tree with a S-o-B
> him?

I though I had all the ACKs to queue it up.  Is that not what was
intended?  Queueing up the API without a user is generally a bad idea.

___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [RFC PATCH v3 1/2] mempinfd: Add new syscall to provide memory pin

2021-02-11 Thread David Hildenbrand


Again in proper SVA it should be quite unlikely to take a fault caused
by something like migration, on the same likelyhood as the CPU. If
things are faulting so much this is a problem then I think it is a
system level problem with doing too much page motion.


My point is that single one SVA application shouldn't require system
to make global changes, such as disabling numa balancing, disabling
THP, to decrease page fault frequency by affecting other applications.

Anyway, guys are in lunar new year. Hopefully, we are getting more
real benchmark data afterwards to make the discussion more targeted.


Right, but I think functionality as proposed in this patch is highly 
unlikely to make it into the kernel. I'd be interested in approaches to 
mitigate this per process. E.g., temporarily stop the kernel from 
messing with THP of this specific process.


But even then, why should some random library make such decisions for a 
whole process? Just as, why should some random library pin pages never 
allocated by it and stop THP from being created or NUMA layout from 
getting optimized? This looks like a clear layer violation to me.


I fully agree with Jason: Why do the events happen that often such that 
your use cases are affected that heavily, such that we even need such 
ugly handling?


What mempinfd does is exposing dangerous functionality that we don't 
want 99.6% of all user space to ever use via a syscall to generic 
users to fix broken* hw.


*broken might be over-stressing the situation, but the HW (SVA) 
obviously seems to perform way worse than ordinary CPUs.


--
Thanks,

David / dhildenb

___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: add a new dma_alloc_noncontiguous API v2

2021-02-11 Thread Ricardo Ribalda
Hi Christoph

What are your merge plans for the uvc change?
http://git.infradead.org/users/hch/dma-mapping.git/commit/3dc47131f8aacc2093f68a9971d24c754e435520

Are you going to remove the patch on your Merge request and then send
it for review to Laurent? or merge it through your tree with a S-o-B
him?

Thanks

On Tue, Feb 9, 2021 at 6:02 PM Christoph Hellwig  wrote:
>
> On Tue, Feb 09, 2021 at 03:46:13PM +0100, Ricardo Ribalda wrote:
> > Hi Christoph
> >
> > I have tested it in both arm and x86, since there are not significant
> > changes with the previous version I did not do a performance test.
>
> I'll take this as a Tested-by.



-- 
Ricardo Ribalda
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [PATCH 0/3] mm/page_alloc: Fix pageblock_order with HUGETLB_PAGE_SIZE_VARIABLE

2021-02-11 Thread David Hildenbrand

On 11.02.21 07:22, Anshuman Khandual wrote:

The following warning gets triggered while trying to boot a 64K page size
without THP config kernel on arm64 platform.

WARNING: CPU: 5 PID: 124 at mm/vmstat.c:1080 __fragmentation_index+0xa4/0xc0
Modules linked in:
CPU: 5 PID: 124 Comm: kswapd0 Not tainted 5.11.0-rc6-4-ga0ea7d62002 #159
Hardware name: linux,dummy-virt (DT)
[8.810673] pstate: 2045 (nzCv daif +PAN -UAO -TCO BTYPE=--)
[8.811732] pc : __fragmentation_index+0xa4/0xc0
[8.812555] lr : fragmentation_index+0xf8/0x138
[8.813360] sp : 864079b0
[8.813958] x29: 864079b0 x28: 0372
[8.814901] x27: 7682 x26: 8000135b3948
[8.815847] x25: 1fffe00010c80f48 x24: 
[8.816805] x23:  x22: 000d
[8.817764] x21: 0030 x20: 0005ffcb4d58
[8.818712] x19: 000b x18: 
[8.819656] x17:  x16: 
[8.820613] x15:  x14: 8000114c6258
[8.821560] x13: 6000bff969ba x12: 1fffe000bff969b9
[8.822514] x11: 1fffe000bff969b9 x10: 6000bff969b9
[8.823461] x9 : dfff8000 x8 : 0005ffcb4dcf
[8.824415] x7 : 0001 x6 : 41b58ab3
[8.825359] x5 : 600010c80f48 x4 : dfff8000
[8.826313] x3 : 8000102be670 x2 : 0007
[8.827259] x1 : 86407a60 x0 : 000d
[8.828218] Call trace:
[8.828667]  __fragmentation_index+0xa4/0xc0
[8.829436]  fragmentation_index+0xf8/0x138
[8.830194]  compaction_suitable+0x98/0xb8
[8.830934]  wakeup_kcompactd+0xdc/0x128
[8.831640]  balance_pgdat+0x71c/0x7a0
[8.832327]  kswapd+0x31c/0x520
[8.832902]  kthread+0x224/0x230
[8.833491]  ret_from_fork+0x10/0x30
[8.834150] ---[ end trace 472836f79c15516b ]---

This warning comes from __fragmentation_index() when the requested order
is greater than MAX_ORDER.

static int __fragmentation_index(unsigned int order,
 struct contig_page_info *info)
{
 unsigned long requested = 1UL << order;

 if (WARN_ON_ONCE(order >= MAX_ORDER)) <= Triggered here
 return 0;

Digging it further reveals that pageblock_order has been assigned a value
which is greater than MAX_ORDER failing the above check. But why this
happened ? Because HUGETLB_PAGE_ORDER for the given config on arm64 is
greater than MAX_ORDER.

The solution involves enabling HUGETLB_PAGE_SIZE_VARIABLE which would make
pageblock_order a variable instead of constant HUGETLB_PAGE_ORDER. But that
change alone also did not really work as pageblock_order still got assigned
as HUGETLB_PAGE_ORDER in set_pageblock_order(). HUGETLB_PAGE_ORDER needs to
be less than MAX_ORDER for its appropriateness as pageblock_order otherwise
just fallback to MAX_ORDER - 1 as before. While here it also fixes a build
problem via type casting MAX_ORDER in rmem_cma_setup().


I'm wondering, is there any real value in allowing FORCE_MAX_ZONEORDER 
to be "11" with ARM64_64K_PAGES/ARM64_16K_PAGES?


Meaning: are there any real use cases that actually build a kernel 
without TRANSPARENT_HUGEPAGE and with ARM64_64K_PAGES/ARM64_16K_PAGES?


As builds are essentially broken, I assume this is not that relevant? Or 
how long has it been broken?


It might be easier to just drop the "TRANSPARENT_HUGEPAGE" part from the 
FORCE_MAX_ZONEORDER config.


--
Thanks,

David / dhildenb

___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [PATCH 3/3] dma-contiguous: Type cast MAX_ORDER as unsigned int

2021-02-11 Thread Christoph Hellwig
On Thu, Feb 11, 2021 at 11:52:11AM +0530, Anshuman Khandual wrote:
> Type cast MAX_ORDER as unsigned int to fix the following build warning.
> 
> In file included from ./include/linux/kernel.h:14,
>  from ./include/asm-generic/bug.h:20,
>  from ./arch/arm64/include/asm/bug.h:26,
>  from ./include/linux/bug.h:5,
>  from ./include/linux/mmdebug.h:5,
>  from ./arch/arm64/include/asm/memory.h:166,
>  from ./arch/arm64/include/asm/page.h:42,
>  from kernel/dma/contiguous.c:46:
> kernel/dma/contiguous.c: In function ‘rmem_cma_setup’:
> ./include/linux/minmax.h:18:28: warning: comparison of distinct pointer
> types lacks a cast
>   (!!(sizeof((typeof(x) *)1 == (typeof(y) *)1)))
> ^~
> ./include/linux/minmax.h:32:4: note: in expansion of macro ‘__typecheck’
>(__typecheck(x, y) && __no_side_effects(x, y))
> ^~~
> ./include/linux/minmax.h:42:24: note: in expansion of macro ‘__safe_cmp’
>   __builtin_choose_expr(__safe_cmp(x, y), \
> ^~
> ./include/linux/minmax.h:58:19: note: in expansion of macro
> ‘__careful_cmp’
>  #define max(x, y) __careful_cmp(x, y, >)
>^
> kernel/dma/contiguous.c:402:35: note: in expansion of macro ‘max’
>   phys_addr_t align = PAGE_SIZE << max(MAX_ORDER - 1, pageblock_order);
> 
> Cc: Christoph Hellwig 
> Cc: Marek Szyprowski 
> Cc: Robin Murphy 
> Cc: iommu@lists.linux-foundation.org
> Cc: linux-ker...@vger.kernel.org
> Signed-off-by: Anshuman Khandual 
> ---
>  kernel/dma/contiguous.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/kernel/dma/contiguous.c b/kernel/dma/contiguous.c
> index 3d63d91cba5c..1c2782349d71 100644
> --- a/kernel/dma/contiguous.c
> +++ b/kernel/dma/contiguous.c
> @@ -399,7 +399,7 @@ static const struct reserved_mem_ops rmem_cma_ops = {
>  
>  static int __init rmem_cma_setup(struct reserved_mem *rmem)
>  {
> - phys_addr_t align = PAGE_SIZE << max(MAX_ORDER - 1, pageblock_order);
> + phys_addr_t align = PAGE_SIZE << max((unsigned int)MAX_ORDER - 1, 
> pageblock_order);

MAX_ORDER and pageblock_order should be the same type.  So either fix
MAX_ORDER to be an unsigned constant, which would be fundamentally
the right thing to do but might cause some fallout, or turn
pageblock_order into an int, which is probably much either as the stub
define of it already has an integer type derived from MAX_ORDER as well.
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

Re: [PATCH 2/3] arm64/hugetlb: Enable HUGETLB_PAGE_SIZE_VARIABLE

2021-02-11 Thread Christoph Hellwig
On Thu, Feb 11, 2021 at 11:52:10AM +0530, Anshuman Khandual wrote:
> MAX_ORDER which invariably depends on FORCE_MAX_ZONEORDER can be a variable
> for a given page size, depending on whether TRANSPARENT_HUGEPAGE is enabled
> or not. In certain page size and THP combinations HUGETLB_PAGE_ORDER can be
> greater than MAX_ORDER, making it unusable as pageblock_order.
> 
> This enables HUGETLB_PAGE_SIZE_VARIABLE making pageblock_order a variable
> rather than the compile time constant HUGETLB_PAGE_ORDER which could break
> MAX_ORDER rule for certain configurations.
> 
> Cc: Catalin Marinas 
> Cc: Will Deacon 
> Cc: linux-arm-ker...@lists.infradead.org
> Cc: linux-ker...@vger.kernel.org
> Signed-off-by: Anshuman Khandual 
> ---
>  arch/arm64/Kconfig | 4 
>  1 file changed, 4 insertions(+)
> 
> diff --git a/arch/arm64/Kconfig b/arch/arm64/Kconfig
> index f39568b28ec1..8e3a5578f663 100644
> --- a/arch/arm64/Kconfig
> +++ b/arch/arm64/Kconfig
> @@ -1909,6 +1909,10 @@ config ARCH_ENABLE_THP_MIGRATION
>   def_bool y
>   depends on TRANSPARENT_HUGEPAGE
>  
> +config HUGETLB_PAGE_SIZE_VARIABLE

Please move the definition of HUGETLB_PAGE_SIZE_VARIABLE to
mm/Kconfig and select it from the arch Kconfigfs instead of duplicating
the definition.
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [PATCH 1/3] mm/page_alloc: Fix pageblock_order when HUGETLB_PAGE_ORDER >= MAX_ORDER

2021-02-11 Thread Christoph Hellwig
> - if (HPAGE_SHIFT > PAGE_SHIFT)
> + if ((HPAGE_SHIFT > PAGE_SHIFT) && (HUGETLB_PAGE_ORDER < MAX_ORDER))

No need for the braces.
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu