Hello Konrad,

On Tue, Jun 23, 2020 at 09:38:43AM -0400, Konrad Rzeszutek Wilk wrote:
> On Mon, Apr 27, 2020 at 06:53:18PM +0000, Ashish Kalra wrote:
> > Hello Konrad,
> > 
> > On Mon, Mar 30, 2020 at 10:25:51PM +0000, Ashish Kalra wrote:
> > > Hello Konrad,
> > > 
> > > On Tue, Mar 03, 2020 at 12:03:53PM -0500, Konrad Rzeszutek Wilk wrote:
> > > > On Tue, Feb 04, 2020 at 07:35:00PM +0000, Ashish Kalra wrote:
> > > > > Hello Konrad,
> > > > > 
> > > > > Looking fwd. to your feedback regarding support of other memory
> > > > > encryption architectures such as Power, S390, etc.
> > > > > 
> > > > > Thanks,
> > > > > Ashish
> > > > > 
> > > > > On Fri, Jan 24, 2020 at 11:00:08PM +0000, Ashish Kalra wrote:
> > > > > > > On Tue, Jan 21, 2020 at 03:54:03PM -0500, Konrad Rzeszutek Wilk wrote:
> > > > > > > > 
> > > > > > > > Additional memory calculations based on # of PCI devices and
> > > > > > > > their memory ranges will make it more complicated with so
> > > > > > > > many other permutations and combinations to explore, it is
> > > > > > > > essential to keep this patch as simple as possible by 
> > > > > > > > adjusting the bounce buffer size simply by determining it
> > > > > > > > from the amount of provisioned guest memory.
> > > > > > >> 
> > > > > > >> Please rework the patch to:
> > > > > > >> 
> > > > > > >>  - Use a log solution instead of the multiplication.
> > > > > > >>    Feel free to cap it at a sensible value.
> > > > > > 
> > > > > > Ok.
> > > > > > 
> > > > > > >> 
> > > > > > >>  - Also the code depends on SWIOTLB calling in to the
> > > > > > >>    adjust_swiotlb_default_size which looks wrong.
> > > > > > >> 
> > > > > > > >>    You should not adjust io_tlb_nslabs from swiotlb_size_or_default.
> > > > > > 
> > > > > > >>    That function's purpose is to report a value.
> > > > > > >> 
> > > > > > >>  - Make io_tlb_nslabs be visible outside of the SWIOTLB code.
> > > > > > >> 
> > > > > > > >>  - Can you utilize the IOMMU_INIT APIs and have your own detect
> > > > > > > >>    which would modify the io_tlb_nslabs (and set swiotlb=1?).
> > > > > > 
> > > > > > This seems to be a nice option, but then IOMMU_INIT APIs are
> > > > > > x86-specific and this swiotlb buffer size adjustment is also needed
> > > > > > for other memory encryption architectures like Power, S390, etc.
> > > > 
> > > > Oh dear. That I hadn't considered.
> > > > > > 
> > > > > > >> 
> > > > > > > >>    Actually you seem to be piggybacking on pci_swiotlb_detect_4gb - so
> > > > > > > >>    perhaps add in this code ? Albeit it really should be in its own
> > > > > > > >>    file, not in arch/x86/kernel/pci-swiotlb.c
> > > > > > 
> > > > > > > Actually, we piggyback on pci_swiotlb_detect_override which sets
> > > > > > > swiotlb=1 as x86_64_start_kernel() and invocation of sme_early_init()
> > > > > > > forces swiotlb on, but again this is all x86 architecture specific.
> > > > 
> > > > Then it looks like the best bet is to do it from within swiotlb_init?
> > > > We really can't do it from swiotlb_size_or_default - that function
> > > > should just return a value and nothing else.
> > > > 
> > > 
> > > Actually, we need to do it in swiotlb_size_or_default() as this gets
> > > called by reserve_crashkernel_low() in arch/x86/kernel/setup.c and used to
> > > reserve low crashkernel memory. If we adjust swiotlb size later in
> > > swiotlb_init() which gets called later than reserve_crashkernel_low(),
> > > then any swiotlb size changes/expansion will conflict/overlap with the
> > > low memory reserved for crashkernel.
> > > 
> > and will also potentially cause SWIOTLB buffer allocation failures.
> > 
> > Do you have any feedback, comments on the above ?
> 
> 
> The init boot chain looks like this:
> 
> initmem_init
>       pci_iommu_alloc
>               -> pci_swiotlb_detect_4gb
>               -> swiotlb_init
> 
> reserve_crashkernel
>       reserve_crashkernel_low
>               -> swiotlb_size_or_default
>               ..
> 
> 
> (rootfs code):
>       pci_iommu_init
>               -> a bunch of the other IOMMU late_init code gets called..
>               ->  pci_swiotlb_late_init 
> 
> I have to say I am lost as to how your patch fixes "If we adjust swiotlb
> size later .. then any swiotlb size .. will overlap with the low memory
> reserved for crashkernel"?
> 

Actually, as per the boot flow:

setup_arch() calls reserve_crashkernel(), whereas pci_iommu_alloc() is
invoked through mm_init()/mem_init() and not via initmem_init().

start_kernel:
...
setup_arch()
        reserve_crashkernel
                reserve_crashkernel_low
                        -> swiotlb_size_or_default

...
...
mm_init()
        mem_init()
                pci_iommu_alloc
                        -> pci_swiotlb_detect_4gb
                        -> swiotlb_init

So as per the above boot flow, reserve_crashkernel() gets called
before swiotlb detect/init. Hence, if we don't fix up or adjust
the SWIOTLB buffer size in swiotlb_size_or_default(), the crash kernel
will reserve memory that conflicts/overlaps with the SWIOTLB bounce
buffer memory once that is adjusted or fixed up later.

Therefore, we need to adjust/fix up the SWIOTLB bounce buffer size in
the swiotlb_size_or_default() function itself, before the swiotlb
detect/init functions get invoked.
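
To make the ordering argument concrete, here is a minimal user-space
sketch. It is not kernel code: every sketch_* name, the 256MB "encrypted
guest" size and the 8MB overhead are hypothetical stand-ins chosen for
illustration, and it only models the size mismatch, not physical address
overlap. The only thing it takes from the flow above is the call order:
the crashkernel reservation queries the size first, and the swiotlb is
initialized afterwards.

```c
#include <assert.h>
#include <stdbool.h>

#define MB (1024UL * 1024UL)

/* Hypothetical values for illustration; not the kernel's definitions. */
#define SKETCH_DEFAULT_SIZE   (64UL * MB)   /* unadjusted bounce buffer size */
#define SKETCH_ENCRYPTED_SIZE (256UL * MB)  /* assumed size for encrypted guests */
#define SKETCH_LOW_OVERHEAD   (8UL * MB)    /* assumed extra low-memory slack */

struct sketch_boot {
	bool mem_encrypt;           /* guest uses memory encryption */
	bool adjust_in_query;       /* does the size query report the adjusted size? */
	unsigned long low_reserved; /* what the crashkernel reservation planned for */
	unsigned long swiotlb_size; /* what swiotlb init ends up allocating */
};

/* Size the bounce buffer will eventually need. */
static unsigned long sketch_wanted_size(const struct sketch_boot *b)
{
	return b->mem_encrypt ? SKETCH_ENCRYPTED_SIZE : SKETCH_DEFAULT_SIZE;
}

/* Stand-in for the size query used by the crashkernel reservation. */
static unsigned long sketch_size_or_default(const struct sketch_boot *b)
{
	return b->adjust_in_query ? sketch_wanted_size(b) : SKETCH_DEFAULT_SIZE;
}

/* Runs early, from the setup_arch() stage. */
static void sketch_reserve_crashkernel_low(struct sketch_boot *b)
{
	b->low_reserved = sketch_size_or_default(b) + SKETCH_LOW_OVERHEAD;
}

/* Runs later, from the mm_init()/mem_init() stage. */
static void sketch_swiotlb_init(struct sketch_boot *b)
{
	b->swiotlb_size = sketch_wanted_size(b);
}

/*
 * Replays the boot order and returns by how many bytes the final bounce
 * buffer exceeds what the reservation accounted for (0 if it fits).
 */
unsigned long sketch_boot_shortfall(bool mem_encrypt, bool adjust_in_query)
{
	struct sketch_boot b = {
		.mem_encrypt = mem_encrypt,
		.adjust_in_query = adjust_in_query,
	};

	sketch_reserve_crashkernel_low(&b); /* happens first */
	sketch_swiotlb_init(&b);            /* happens second */

	assert(b.low_reserved >= SKETCH_LOW_OVERHEAD);
	unsigned long planned = b.low_reserved - SKETCH_LOW_OVERHEAD;

	return b.swiotlb_size > planned ? b.swiotlb_size - planned : 0;
}
```

With these assumed numbers, sketch_boot_shortfall(true, false) reports a
192MB mismatch because the reservation only saw the default size, while
sketch_boot_shortfall(true, true) reports none because the query already
returned the adjusted size.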

Thanks,
Ashish

> Or are you saying that 'reserve_crashkernel_low' is the _culprit_ and it
> is the one changing the size? And hence it modifying the swiotlb size
> will fix this problem? Aka _before_ all the other IOMMU get their hand
> on it?
> 
> If so why not create an
> IOMMU_INIT(crashkernel_adjust_swiotlb,pci_swiotlb_detect_override,
> NULL, NULL);
> 
> And crashkernel_adjust_swiotlb would change the size of swiotlb buffer
> if conditions are found to require it.
> 
> You also may want to put a #define DEBUG in arch/x86/kernel/pci-iommu_table.c
> to check out whether the tree structure of IOMMU entries is correct.
> 
> 
> 
> But still I am lost - if say the AMD one does decide for unknown reason
> to expand the SWIOTLB you are still stuck with the 'overlap with
> the low memory reserved' or so.
> 
> Perhaps add a late_init that gets called as the last one to validate
> this ? And maybe if the swiotlb gets turned off you also take proper
> steps?
> 
> > As such i feel, this patch is complete otherwise and can be included as
> > it is. 
> > 
> > Thanks,
> > Ashish
_______________________________________________
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu
