Re: [PATCH v1 09/10] mm/memory_hotplug: Don't mark pages PG_reserved when initializing the memmap

2019-11-04 Thread Boris Ostrovsky
On 10/24/19 8:09 AM, David Hildenbrand wrote:
> diff --git a/drivers/xen/balloon.c b/drivers/xen/balloon.c
> index 4f2e78a5e4db..af69f057913a 100644
> --- a/drivers/xen/balloon.c
> +++ b/drivers/xen/balloon.c
> @@ -374,6 +374,13 @@ static void xen_online_page(struct page *page, unsigned 
> int order)
>   mutex_lock(&balloon_mutex);
>   for (i = 0; i < size; i++) {
>   p = pfn_to_page(start_pfn + i);
> + /*
> +  * TODO: The core used to mark the pages reserved. Most probably
> +  * we can stop doing that now. However, especially
> +  * alloc_xenballooned_pages() left PG_reserved set
> +  * on pages that can get mapped to user space.
> +  */
> + __SetPageReserved(p);

I suspect this is not needed. Pages can get into balloon either from
here or from non-hotplug path (e.g. decrease_reservation()) and so when
we get a page from the balloon we would get a random page that may or
may not have Reserved bit set.

-boris


>   balloon_append(p);
>   }
>   mutex_unlock(&balloon_mutex);
>



Re: [PATCH v1 09/10] mm/memory_hotplug: Don't mark pages PG_reserved when initializing the memmap

2019-11-05 Thread Boris Ostrovsky
On 11/5/19 5:18 AM, David Hildenbrand wrote:
> On 04.11.19 23:44, Boris Ostrovsky wrote:
>> On 10/24/19 8:09 AM, David Hildenbrand wrote:
>>> diff --git a/drivers/xen/balloon.c b/drivers/xen/balloon.c
>>> index 4f2e78a5e4db..af69f057913a 100644
>>> --- a/drivers/xen/balloon.c
>>> +++ b/drivers/xen/balloon.c
>>> @@ -374,6 +374,13 @@ static void xen_online_page(struct page *page,
>>> unsigned int order)
>>>   mutex_lock(&balloon_mutex);
>>>   for (i = 0; i < size; i++) {
>>>   p = pfn_to_page(start_pfn + i);
>>> +    /*
>>> + * TODO: The core used to mark the pages reserved. Most
>>> probably
>>> + * we can stop doing that now. However, especially
>>> + * alloc_xenballooned_pages() left PG_reserved set
>>> + * on pages that can get mapped to user space.
>>> + */
>>> +    __SetPageReserved(p);
>>
>> I suspect this is not needed. Pages can get into balloon either from
>> here or from non-hotplug path (e.g. decrease_reservation()) and so when
>> we get a page from the balloon we would get a random page that may or
>> may not have Reserved bit set.
>
> Yeah, I also think it is fine. If you agree, I'll drop this hunk and
> add details to the patch description.
>
>


Yes, let's do that. Thanks.

-boris


Re: [PATCH RFC v1 5/6] xen-swiotlb: convert variables to arrays

2021-02-19 Thread Boris Ostrovsky


On 2/19/21 3:32 PM, Konrad Rzeszutek Wilk wrote:
> On Sun, Feb 07, 2021 at 04:56:01PM +0100, Christoph Hellwig wrote:
>> On Thu, Feb 04, 2021 at 09:40:23AM +0100, Christoph Hellwig wrote:
>>> So one thing that has been on my mind for a while:  I'd really like
>>> to kill the separate dma ops in Xen swiotlb.  If we compare xen-swiotlb
>>> to swiotlb the main difference seems to be:
>>>
>>>  - additional reasons to bounce I/O vs the plain DMA capable
>>>  - the possibility to do a hypercall on arm/arm64
>>>  - an extra translation layer before doing the phys_to_dma and vice
>>>versa
>>>  - an special memory allocator
>>>
>>> I wonder if inbetween a few jump labels or other no overhead enablement
>>> options and possibly better use of the dma_range_map we could kill
>>> off most of swiotlb-xen instead of maintaining all this code duplication?
>> So I looked at this a bit more.
>>
>> For x86 with XENFEAT_auto_translated_physmap (how common is that?)
> Juergen, Boris please correct me if I am wrong, but that 
> XENFEAT_auto_translated_physmap
> only works for PVH guests?


That's both HVM and PVH (for dom0 it's only PVH).


-boris



>
>> pfn_to_gfn is a nop, so plain phys_to_dma/dma_to_phys do work as-is.
>>
>> xen_arch_need_swiotlb always returns true for x86, and
>> range_straddles_page_boundary should never be true for the
>> XENFEAT_auto_translated_physmap case.
> Correct. The kernel should have no clue of what the real MFNs are
> for PFNs.
>> So as far as I can tell the mapping fast path for the
>> XENFEAT_auto_translated_physmap can be trivially reused from swiotlb.
>>
>> That leaves us with the next more complicated case, x86 or fully cache
>> coherent arm{,64} without XENFEAT_auto_translated_physmap.  In that case
>> we need to patch in a phys_to_dma/dma_to_phys that performs the MFN
>> lookup, which could be done using alternatives or jump labels.
>> I think if that is done right we should also be able to let that cover
>> the foreign pages in is_xen_swiotlb_buffer/is_swiotlb_buffer, but
>> in that worst case that would need another alternative / jump label.
>>
>> For non-coherent arm{,64} we'd also need to use alternatives or jump
>> labels to for the cache maintainance ops, but that isn't a hard problem
>> either.
>>
>>


Re: [PATCH 11/12] swiotlb: merge swiotlb-xen initialization into swiotlb

2022-03-02 Thread Boris Ostrovsky



On 3/2/22 8:15 AM, Boris Ostrovsky wrote:



On 3/1/22 9:55 PM, Stefano Stabellini wrote:

On Tue, 1 Mar 2022, Christoph Hellwig wrote:

Allow to pass a remap argument to the swiotlb initialization functions
to handle the Xen/x86 remap case.  ARM/ARM64 never did any remapping
from xen_swiotlb_fixup, so we don't even need that quirk.




Aside from that the rest looks OK. Also, you can add my:

Tested-by: Stefano Stabellini 



Not for me, I fail to boot with

[   52.202000] bnxt_en :31:00.0: swiotlb buffer is full (sz: 256 bytes), 
total 0 (slots), used 0 (slots)

(this is iscsi root so I need the NIC).


I bisected it to "x86: remove the IOMMU table infrastructure" but haven't 
actually looked at the code yet.



Again, this is as dom0. Baremetal is fine.


-boris



Re: [PATCH 11/12] swiotlb: merge swiotlb-xen initialization into swiotlb

2022-03-02 Thread Boris Ostrovsky




On 3/1/22 9:55 PM, Stefano Stabellini wrote:

On Tue, 1 Mar 2022, Christoph Hellwig wrote:

Allow to pass a remap argument to the swiotlb initialization functions
to handle the Xen/x86 remap case.  ARM/ARM64 never did any remapping
from xen_swiotlb_fixup, so we don't even need that quirk.




Aside from that the rest looks OK. Also, you can add my:

Tested-by: Stefano Stabellini 



Not for me, I fail to boot with

[   52.202000] bnxt_en :31:00.0: swiotlb buffer is full (sz: 256 bytes), 
total 0 (slots), used 0 (slots)

(this is iscsi root so I need the NIC).


I bisected it to "x86: remove the IOMMU table infrastructure" but haven't 
actually looked at the code yet.


-boris


Re: [PATCH 11/12] swiotlb: merge swiotlb-xen initialization into swiotlb

2022-03-03 Thread Boris Ostrovsky



On 3/3/22 5:57 AM, Christoph Hellwig wrote:

On Wed, Mar 02, 2022 at 08:15:03AM -0500, Boris Ostrovsky wrote:

Not for me, I fail to boot with

[   52.202000] bnxt_en :31:00.0: swiotlb buffer is full (sz: 256 bytes), 
total 0 (slots), used 0 (slots)

(this is iscsi root so I need the NIC).


I bisected it to "x86: remove the IOMMU table infrastructure" but haven't 
actually looked at the code yet.

Thanks. Looks like the sizing is going wrong.  Just to confirm, this is
dom0 on x86 and no special command line options?



Right.


module2 /boot/vmlinuz-5.17.0-rc6swiotlb placeholder 
root=UUID=dbef1262-8c8a-43db-8055-7d9bec7bece0 ro crashkernel=auto 
LANG=en_US.UTF-8 rd.luks=0 rd.lvm=0 rd.md=0 rd.dm=0 
netroot=iscsi:169.254.0.2:::1:iqn.2015-02.oracle.boot:uefi 
iscsi_param=node.session.timeo.replacement_timeout=6000 net.ifnames=1 
nvme_core.shutdown_timeout=10 ipmi_si.tryacpi=0 ipmi_si.trydmi=0 
ipmi_si.trydefaults=0 libiscsi.debug_libiscsi_eh=1  panic=20 nokaslr 
earlyprintk=xen console=hvc0 loglevel=8 4



Re: [PATCH 11/12] swiotlb: merge swiotlb-xen initialization into swiotlb

2022-03-04 Thread Boris Ostrovsky



On 3/4/22 12:28 PM, Christoph Hellwig wrote:

On Wed, Mar 02, 2022 at 08:15:03AM -0500, Boris Ostrovsky wrote:

Not for me, I fail to boot with

[   52.202000] bnxt_en :31:00.0: swiotlb buffer is full (sz: 256 bytes), 
total 0 (slots), used 0 (slots)

(this is iscsi root so I need the NIC).


I bisected it to "x86: remove the IOMMU table infrastructure" but haven't 
actually looked at the code yet.

That looks like the swiotlb buffer did not get initialized at all, but I
can't really explain why.

Can you stick in a printk and see if xen_swiotlb_init_early gets called
at all?




Actually, that's the only thing I did do so far and yes, it does get called.


-boris



Re: [PATCH 11/12] swiotlb: merge swiotlb-xen initialization into swiotlb

2022-03-04 Thread Boris Ostrovsky



On 3/4/22 12:43 PM, Christoph Hellwig wrote:

On Fri, Mar 04, 2022 at 12:36:17PM -0500, Boris Ostrovsky wrote:

I bisected it to "x86: remove the IOMMU table infrastructure" but haven't 
actually looked at the code yet.

That looks like the swiotlb buffer did not get initialized at all, but I
can't really explain why.

Can you stick in a printk and see if xen_swiotlb_init_early gets called
at all?



Actually, that's the only thing I did do so far and yes, it does get called.

So, specifically for "x86: remove the IOMMU table infrastructure" I
think we need the one-liner below so that swiotlb_exit doesn't get called
for the Xen case.  But that should have been fixed up by the next
patch already.



This indeed allows dom0 to boot. Not sure I see where in the next patch this 
would have been fixed?


(BTW, just noticed in iommu_setup() you set this variable to 1. Should be 
'true')


-boris



Re: [PATCH 11/12] swiotlb: merge swiotlb-xen initialization into swiotlb

2022-03-08 Thread Boris Ostrovsky



On 3/1/22 5:53 AM, Christoph Hellwig wrote:

Allow to pass a remap argument to the swiotlb initialization functions
to handle the Xen/x86 remap case.  ARM/ARM64 never did any remapping
from xen_swiotlb_fixup, so we don't even need that quirk.



Any chance this patch could be split? Lots of things are happening here and 
it's somewhat hard to review. (Patch 7 too BTW but I think I managed to get 
through it)



diff --git a/arch/x86/kernel/pci-dma.c b/arch/x86/kernel/pci-dma.c
index e0def4b1c3181..2f2c468acb955 100644
--- a/arch/x86/kernel/pci-dma.c
+++ b/arch/x86/kernel/pci-dma.c
@@ -71,15 +71,12 @@ static inline void __init pci_swiotlb_detect(void)
  #endif /* CONFIG_SWIOTLB */
  
  #ifdef CONFIG_SWIOTLB_XEN

-static bool xen_swiotlb;
-
  static void __init pci_xen_swiotlb_init(void)
  {
if (!xen_initial_domain() && !x86_swiotlb_enable)
return;



Now that there is a single call site for this routine I think this check can be 
dropped. We are only called here for xen_initial_domain()==true.


-boris


Re: [PATCH 11/12] swiotlb: merge swiotlb-xen initialization into swiotlb

2022-03-09 Thread Boris Ostrovsky



On 3/9/22 1:18 AM, Christoph Hellwig wrote:

On Tue, Mar 08, 2022 at 04:38:21PM -0500, Boris Ostrovsky wrote:

On 3/1/22 5:53 AM, Christoph Hellwig wrote:

Allow to pass a remap argument to the swiotlb initialization functions
to handle the Xen/x86 remap case.  ARM/ARM64 never did any remapping
from xen_swiotlb_fixup, so we don't even need that quirk.


Any chance this patch could be split? Lots of things are happening here and 
it's somewhat hard to review. (Patch 7 too BTW but I think I managed to get 
through it)

What would be your preferred split?



swiotlb_init() rework to be done separately?





diff --git a/arch/x86/kernel/pci-dma.c b/arch/x86/kernel/pci-dma.c
index e0def4b1c3181..2f2c468acb955 100644
--- a/arch/x86/kernel/pci-dma.c
+++ b/arch/x86/kernel/pci-dma.c
@@ -71,15 +71,12 @@ static inline void __init pci_swiotlb_detect(void)
   #endif /* CONFIG_SWIOTLB */
 #ifdef CONFIG_SWIOTLB_XEN
-static bool xen_swiotlb;
-
   static void __init pci_xen_swiotlb_init(void)
   {
if (!xen_initial_domain() && !x86_swiotlb_enable)
return;


Now that there is a single call site for this routine I think this check can be 
dropped. We are only called here for xen_initial_domain()==true.

The callsite just checks xen_pv_domain() and itself is called
unconditionally during initialization.



Oh, right, nevermind. *pv* domain.


-boris



Re: [PATCH 12/15] swiotlb: provide swiotlb_init variants that remap the buffer

2022-03-14 Thread Boris Ostrovsky



On 3/14/22 3:31 AM, Christoph Hellwig wrote:

-void __init swiotlb_init(bool addressing_limit, unsigned int flags)
+void __init swiotlb_init_remap(bool addressing_limit, unsigned int flags,
+   int (*remap)(void *tlb, unsigned long nslabs))
  {
-   size_t bytes = PAGE_ALIGN(default_nslabs << IO_TLB_SHIFT);
+   unsigned long nslabs = default_nslabs;
+   size_t bytes;
void *tlb;
  
  	if (!addressing_limit && !swiotlb_force_bounce)

@@ -271,12 +273,24 @@ void __init swiotlb_init(bool addressing_limit, unsigned 
int flags)
 * allow to pick a location everywhere for hypervisors with guest
 * memory encryption.
 */
+retry:
+   bytes = PAGE_ALIGN(default_nslabs << IO_TLB_SHIFT);
if (flags & SWIOTLB_ANY)
tlb = memblock_alloc(bytes, PAGE_SIZE);
else
tlb = memblock_alloc_low(bytes, PAGE_SIZE);
if (!tlb)
goto fail;
+   if (remap && remap(tlb, nslabs) < 0) {
+   memblock_free(tlb, PAGE_ALIGN(bytes));
+
+   /* Min is 2MB */
+   if (nslabs <= 1024)



This is IO_TLB_MIN_SLABS, isn't it? (Xen code didn't say so but that's what it 
meant to say I believe)



+   panic("%s: Failed to remap %zu bytes\n",
+ __func__, bytes);
+   nslabs = max(1024UL, ALIGN(nslabs >> 1, IO_TLB_SEGSIZE));
+   goto retry;
+   }

@@ -303,6 +323,7 @@ int swiotlb_init_late(size_t size, gfp_t gfp_mask)
if (swiotlb_force_disable)
return 0;
  
+retry:

order = get_order(nslabs << IO_TLB_SHIFT);
nslabs = SLABS_PER_PAGE << order;
bytes = nslabs << IO_TLB_SHIFT;
@@ -317,6 +338,17 @@ int swiotlb_init_late(size_t size, gfp_t gfp_mask)
  
  	if (!vstart)

return -ENOMEM;
+   if (remap)
+   rc = remap(vstart, nslabs);
+   if (rc) {
+   free_pages((unsigned long)vstart, order);
+
+   /* Min is 2MB */
+   if (nslabs <= 1024)



Same here.


-boris



+   return rc;
+   nslabs = max(1024UL, ALIGN(nslabs >> 1, IO_TLB_SEGSIZE));
+   goto retry;
+   }
  
  	if (order != get_order(bytes)) {

pr_warn("only able to allocate %ld MB\n",


Re: [PATCH 13/15] swiotlb: merge swiotlb-xen initialization into swiotlb

2022-03-14 Thread Boris Ostrovsky



On 3/14/22 3:31 AM, Christoph Hellwig wrote:

-
  static void __init pci_xen_swiotlb_init(void)
  {
if (!xen_initial_domain() && !x86_swiotlb_enable)
return;
x86_swiotlb_enable = true;
-   xen_swiotlb = true;
-   xen_swiotlb_init_early();
+   swiotlb_init_remap(true, x86_swiotlb_flags, xen_swiotlb_fixup);



I think we need to have SWIOTLB_ANY set in x86_swiotlb_flags here.




dma_ops = &xen_swiotlb_dma_ops;
if (IS_ENABLED(CONFIG_PCI))
pci_request_acs();
@@ -88,14 +85,16 @@ static void __init pci_xen_swiotlb_init(void)
  
  int pci_xen_swiotlb_init_late(void)

  {
-   int rc;
-
-   if (xen_swiotlb)
+   if (dma_ops == &xen_swiotlb_dma_ops)
return 0;
  
-	rc = xen_swiotlb_init();

-   if (rc)
-   return rc;
+   /* we can work with the default swiotlb */
+   if (!io_tlb_default_mem.nslabs) {
+   int rc = swiotlb_init_late(swiotlb_size_or_default(),
+  GFP_KERNEL, xen_swiotlb_fixup);



This may be comment for previous patch but looking at swiotlb_init_late():


retry:
    order = get_order(nslabs << IO_TLB_SHIFT);
    nslabs = SLABS_PER_PAGE << order;
    bytes = nslabs << IO_TLB_SHIFT;

    while ((SLABS_PER_PAGE << order) > IO_TLB_MIN_SLABS) {
    vstart = (void *)__get_free_pages(gfp_mask | __GFP_NOWARN,
  order);
    if (vstart)
    break;
    order--;
    }

    if (!vstart)
    return -ENOMEM;
    if (remap)
    rc = remap(vstart, nslabs);
    if (rc) {
    free_pages((unsigned long)vstart, order);

    /* Min is 2MB */
    if (nslabs <= 1024)
    return rc;
    nslabs = max(1024UL, ALIGN(nslabs >> 1, IO_TLB_SEGSIZE));
    goto retry;
    }

    if (order != get_order(bytes)) {
    pr_warn("only able to allocate %ld MB\n",
    (PAGE_SIZE << order) >> 20);
    nslabs = SLABS_PER_PAGE << order; <===
    }

    rc = swiotlb_late_init_with_tbl(vstart, nslabs);

Notice that we don't do remap() after final update to nslabs. We should.



-boris


Re: [PATCH 14/15] swiotlb: remove swiotlb_init_with_tbl and swiotlb_init_late_with_tbl

2022-03-14 Thread Boris Ostrovsky



On 3/14/22 3:31 AM, Christoph Hellwig wrote:

@@ -314,6 +293,7 @@ void __init swiotlb_init(bool addressing_limit, unsigned 
int flags)
  int swiotlb_init_late(size_t size, gfp_t gfp_mask,
int (*remap)(void *tlb, unsigned long nslabs))
  {
+   struct io_tlb_mem *mem = &io_tlb_default_mem;
unsigned long nslabs = ALIGN(size >> IO_TLB_SHIFT, IO_TLB_SEGSIZE);
unsigned long bytes;
unsigned char *vstart = NULL;
@@ -355,33 +335,28 @@ int swiotlb_init_late(size_t size, gfp_t gfp_mask,
(PAGE_SIZE << order) >> 20);
nslabs = SLABS_PER_PAGE << order;
}
-   rc = swiotlb_late_init_with_tbl(vstart, nslabs);
-   if (rc)
-   free_pages((unsigned long)vstart, order);
-
-   return rc;
-}
-
-int
-swiotlb_late_init_with_tbl(char *tlb, unsigned long nslabs)
-{
-   struct io_tlb_mem *mem = &io_tlb_default_mem;
-   unsigned long bytes = nslabs << IO_TLB_SHIFT;
  
-	if (swiotlb_force_disable)

-   return 0;
+   if (remap)
+   rc = remap(vstart, nslabs);
+   if (rc) {
+   free_pages((unsigned long)vstart, order);
  
-	/* protect against double initialization */

-   if (WARN_ON_ONCE(mem->nslabs))
-   return -ENOMEM;
+   /* Min is 2MB */
+   if (nslabs <= 1024)
+   return rc;
+   nslabs = max(1024UL, ALIGN(nslabs >> 1, IO_TLB_SEGSIZE));
+   goto retry;
+   }



We now end up with two attempts to remap. I think this second one is what we 
want since it solves the problem I pointed in previous patch.


-boris




Re: [PATCH 12/15] swiotlb: provide swiotlb_init variants that remap the buffer

2022-03-15 Thread Boris Ostrovsky




On 3/15/22 2:36 AM, Christoph Hellwig wrote:


@@ -271,12 +273,23 @@ void __init swiotlb_init(bool addressing_limit, unsigned 
int flags)
 * allow to pick a location everywhere for hypervisors with guest
 * memory encryption.
 */
+retry:
+   bytes = PAGE_ALIGN(default_nslabs << IO_TLB_SHIFT);
if (flags & SWIOTLB_ANY)
tlb = memblock_alloc(bytes, PAGE_SIZE);
else
tlb = memblock_alloc_low(bytes, PAGE_SIZE);
if (!tlb)
goto fail;
+   if (remap && remap(tlb, nslabs) < 0) {
+   memblock_free(tlb, PAGE_ALIGN(bytes));
+
+   if (nslabs <= IO_TLB_MIN_SLABS)
+   panic("%s: Failed to remap %zu bytes\n",
+ __func__, bytes);
+   nslabs = max(1024UL, ALIGN(nslabs >> 1, IO_TLB_SEGSIZE));



I spoke with Konrad (who wrote the original patch --- 
f4b2f07b2ed9b469ead87e06fc2fc3d12663a725) and apparently the reason for 2MB was 
to optimize for Xen's slab allocator, it had nothing to do with 
IO_TLB_MIN_SLABS. Since this is now common code we should not expose 
Xen-specific optimizations here and smaller values will still work so 
IO_TLB_MIN_SLABS is fine.

I think this should be mentioned in the commit message though, probably best in 
the next patch where you switch to this code.

As far as the hunk above, I don't think we need the max() here: with 
IO_TLB_MIN_SLABS being 512 we may get stuck in an infinite loop. Something like

nslabs = ALIGN(nslabs >> 1, IO_TLB_SEGSIZE);
if (nslabs <= IO_TLB_MIN_SLABS)
panic()

should be sufficient.



+   goto retry;
+   }
if (swiotlb_init_with_tbl(tlb, default_nslabs, flags))
goto fail_free_mem;
return;
@@ -287,12 +300,18 @@ void __init swiotlb_init(bool addressing_limit, unsigned 
int flags)
pr_warn("Cannot allocate buffer");
  }
  
+void __init swiotlb_init(bool addressing_limit, unsigned int flags)

+{
+   return swiotlb_init_remap(addressing_limit, flags, NULL);
+}
+
  /*
   * Systems with larger DMA zones (those that don't support ISA) can
   * initialize the swiotlb later using the slab allocator if needed.
   * This should be just like above, but with some error catching.
   */
-int swiotlb_init_late(size_t size, gfp_t gfp_mask)
+int swiotlb_init_late(size_t size, gfp_t gfp_mask,
+   int (*remap)(void *tlb, unsigned long nslabs))
  {
unsigned long nslabs = ALIGN(size >> IO_TLB_SHIFT, IO_TLB_SEGSIZE);
unsigned long bytes;
@@ -303,6 +322,7 @@ int swiotlb_init_late(size_t size, gfp_t gfp_mask)
if (swiotlb_force_disable)
return 0;
  
+retry:

order = get_order(nslabs << IO_TLB_SHIFT);
nslabs = SLABS_PER_PAGE << order;
bytes = nslabs << IO_TLB_SHIFT;
@@ -317,6 +337,16 @@ int swiotlb_init_late(size_t size, gfp_t gfp_mask)
  
  	if (!vstart)

return -ENOMEM;
+   if (remap)
+   rc = remap(vstart, nslabs);
+   if (rc) {
+   free_pages((unsigned long)vstart, order);
+
+   if (IO_TLB_MIN_SLABS <= 1024)
+   return rc;
+   nslabs = max(1024UL, ALIGN(nslabs >> 1, IO_TLB_SEGSIZE));



Same here. (The 'if' check above is wrong anyway).

Patches 13 and 14 look good.


-boris




+   goto retry;
+   }
  
  	if (order != get_order(bytes)) {

pr_warn("only able to allocate %ld MB\n",


Re: cleanup swiotlb initialization v8

2022-04-05 Thread Boris Ostrovsky



On 4/4/22 1:05 AM, Christoph Hellwig wrote:

Hi all,

this series tries to clean up the swiotlb initialization, including
that of swiotlb-xen.  To get there is also removes the x86 iommu table
infrastructure that massively obsfucates the initialization path.

Git tree:

 git://git.infradead.org/users/hch/misc.git swiotlb-init-cleanup

Gitweb:

 
http://git.infradead.org/users/hch/misc.git/shortlog/refs/heads/swiotlb-init-cleanup





Tested-by: Boris Ostrovsky 




Re: [PATCH] xen: replace xen_remap() with memremap()

2022-05-31 Thread Boris Ostrovsky



On 5/30/22 4:26 AM, Juergen Gross wrote:

xen_remap() is used to establish mappings for frames not under direct
control of the kernel: for Xenstore and console ring pages, and for
grant pages of non-PV guests.

Today xen_remap() is defined to use ioremap() on x86 (doing uncached
mappings), and ioremap_cache() on Arm (doing cached mappings).

Uncached mappings for those use cases are bad for performance, so they
should be avoided if possible. As all use cases of xen_remap() don't
require uncached mappings (the mapped area is always physical RAM),
a mapping using the standard WB cache mode is fine.

As sparse is flagging some of the xen_remap() use cases to be not
appropriate for iomem(), as the result is not annotated with the
__iomem modifier, eliminate xen_remap() completely and replace all
use cases with memremap() specifying the MEMREMAP_WB caching mode.

xen_unmap() can be replaced with memunmap().

Reported-by: kernel test robot 
Signed-off-by: Juergen Gross 




Reviewed-by: Boris Ostrovsky 



Re: [PATCH 3/3] tty: hvc_xen: hide xen_console_remove when unused

2016-01-25 Thread Boris Ostrovsky

On 01/25/2016 04:54 PM, Arnd Bergmann wrote:

xencons_disconnect_backend() is only called from xen_console_remove(),


and also from xencons_probe()/xencons_resume(). But those two are also 
under the

same ifdef.

-boris


which is conditionally compiled, so we get a harmless warning when
CONFIG_HVC_XEN_FRONTEND is unset:

hvc/hvc_xen.c:350:12: error: 'xen_console_remove' defined but not used 
[-Werror=unused-function]

This moves the function down into the same #ifdef section to silence
the warning.

Signed-off-by: Arnd Bergmann 
---
  drivers/tty/hvc/hvc_xen.c | 2 +-
  1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/tty/hvc/hvc_xen.c b/drivers/tty/hvc/hvc_xen.c
index fa816b7193b6..11725422dacb 100644
--- a/drivers/tty/hvc/hvc_xen.c
+++ b/drivers/tty/hvc/hvc_xen.c
@@ -323,6 +323,7 @@ void xen_console_resume(void)
}
  }
  
+#ifdef CONFIG_HVC_XEN_FRONTEND

  static void xencons_disconnect_backend(struct xencons_info *info)
  {
if (info->irq > 0)
@@ -363,7 +364,6 @@ static int xen_console_remove(struct xencons_info *info)
return 0;
  }
  
-#ifdef CONFIG_HVC_XEN_FRONTEND

  static int xencons_remove(struct xenbus_device *dev)
  {
return xen_console_remove(dev_get_drvdata(&dev->dev));


___
Linuxppc-dev mailing list
Linuxppc-dev@lists.ozlabs.org
https://lists.ozlabs.org/listinfo/linuxppc-dev

Re: [PATCH v4 10/13] x86: perf/core: use PERF_PMU_CAP_NO_EXCLUDE for exclude incapable PMUs

2019-01-08 Thread Boris Ostrovsky
On 1/8/19 5:48 AM, Peter Zijlstra wrote:
> On Mon, Jan 07, 2019 at 04:27:27PM +, Andrew Murray wrote:
>> For drivers that do not support context exclusion let's advertise the
>> PERF_PMU_CAP_NOEXCLUDE capability. This ensures that perf will
>> prevent us from handling events where any exclusion flags are set.
>> Let's also remove the now unnecessary check for exclusion flags.
>>
>> Signed-off-by: Andrew Murray 
>> ---
>>  arch/x86/events/amd/ibs.c  | 13 +
>>  arch/x86/events/amd/power.c| 10 ++
>>  arch/x86/events/intel/cstate.c | 12 +++-
>>  arch/x86/events/intel/rapl.c   |  9 ++---
>>  arch/x86/events/intel/uncore_snb.c |  9 ++---
>>  arch/x86/events/msr.c  | 10 ++
>>  6 files changed, 12 insertions(+), 51 deletions(-)
> You (correctly) don't add CAP_NO_EXCLUDE to the main x86 pmu code, but
> then you also don't check if it handles all the various exclude options
> correctly/consistently.
>
> Now; I must admit that that is a bit of a maze, but I think we can at
> least add exclude_idle and exclude_hv fails in there, nothing uses those
> afaict.
>
> On the various exclude options; they are as follows (IIUC):
>
>   - exclude_guest: we're a HV/host-kernel and we don't want the counter
>to run when we run a guest context.
>
>   - exclude_host: we're a HV/host-kernel and we don't want the counter
>   to run when we run in host context.
>
>   - exclude_hv: we're a guest and don't want the counter to run in HV
> context.
>
> Now, KVM always implies exclude_hv afaict (for guests), I'm not sure
> what, if anything Xen does on x86 (IIRC Brendan Gregg once said perf
> works on Xen) -- nor quite sure who to ask, Boris, Jeurgen?

perf does work inside guests.

VPMU is managed by the Xen and it presents to the guest only samples
that are associated with the guest. So from that perspective exclude_hv
doesn't seem to be needed.

There is a VPMU mode that allows profiling whole system (host and
guests) from dom0, and this where exclude_hv might be useful. But this
mode, ahem, needs some work.


-boris


Re: [PATCH 4/8] xen: Use the correctly the Xen memory terminologies

2015-07-28 Thread Boris Ostrovsky

On 07/28/2015 11:02 AM, Julien Grall wrote:

Based on include/xen/mm.h [1], Linux is mistakenly using MFN when GFN
is meant, I suspect this is because the first support for Xen was for
PV. This brough some misimplementation of helpers on ARM and make the
developper confused the expected behavior.

For instance, with pfn_to_mfn, we expect to get an MFN based on the name.
Although, if we look at the implementation on x86, it's returning a GFN.

For clarity and avoid new confusion, replace any reference of mfn into
gnf in any helpers used by PV drivers.






@@ -730,7 +730,7 @@ static void xen_do_pin(unsigned level, unsigned long pfn)
struct mmuext_op op;

op.cmd = level;
-   op.arg1.mfn = pfn_to_mfn(pfn);
+   op.arg1.mfn = pfn_to_gfn(pfn);



This looks slightly odd. It is correct but given that purpose of this 
series is to make things more clear perhaps we can add another union 
member (gfn) to mmuext_op.arg1?


(Of course, the hypervisor will continue referring to mfn which could 
still be confusing)





xen_extend_mmuext_op(&op);
  }
@@ -1323,7 +1323,7 @@ static void __xen_write_cr3(bool kernel, unsigned long 
cr3)
trace_xen_mmu_write_cr3(kernel, cr3);

if (cr3)
-   mfn = pfn_to_mfn(PFN_DOWN(cr3));
+   mfn = pfn_to_gfn(PFN_DOWN(cr3));


Here too. And further down, thoughout this patch.




-boris
___
Linuxppc-dev mailing list
Linuxppc-dev@lists.ozlabs.org
https://lists.ozlabs.org/listinfo/linuxppc-dev

Re: [PATCH 4/8] xen: Use the correctly the Xen memory terminologies

2015-07-29 Thread Boris Ostrovsky

On 07/29/2015 07:25 AM, Julien Grall wrote:

Hi Boris,

On 28/07/15 20:12, Boris Ostrovsky wrote:

On 07/28/2015 11:02 AM, Julien Grall wrote:

Based on include/xen/mm.h [1], Linux is mistakenly using MFN when GFN
is meant, I suspect this is because the first support for Xen was for
PV. This brough some misimplementation of helpers on ARM and make the
developper confused the expected behavior.

For instance, with pfn_to_mfn, we expect to get an MFN based on the name.
Although, if we look at the implementation on x86, it's returning a GFN.

For clarity and avoid new confusion, replace any reference of mfn into
gnf in any helpers used by PV drivers.






@@ -730,7 +730,7 @@ static void xen_do_pin(unsigned level, unsigned
long pfn)
   struct mmuext_op op;

   op.cmd = level;
-op.arg1.mfn = pfn_to_mfn(pfn);
+op.arg1.mfn = pfn_to_gfn(pfn);



This looks slightly odd. It is correct but given that purpose of this
series is to make things more clear perhaps we can add another union
member (gfn) to mmuext_op.arg1?

(Of course, the hypervisor will continue referring to mfn which could
still be confusing)


This operation is only used for PV guests, right?

IHMO re-introducing pfn_to_mfn for PV-guests only (i.e with a BUG_ON to
ensure no usage for auto-translated guest) would be the best solution.
It would avoid to have different name than the hypersivor one in the
hypercall interface. It will also make clear that virt_to_machine & co
is only PV specific.

I though doing this but I preferred to defer it to x86 expert as my
knowledge for x86 Xen is very limited. I don't know where it's more
suitable to use MFN or GFN. I guess this file (mmu.c) is mostly PV specific?

Would something like below fine for you?

static inline unsigned long pfn_to_mfn(unsigned long pfn)
{
unsigned long mfn;

BUG_ON(xen_feature(XENFEAT_auto_translated_physmap));

mfn = __pfn_to_mfn(pfn);
if (mfn != INVALID_P2M_ENTRY)
mfn &= ~(FOREIGN_FRAME_BIT | IDENTITY_FRAME_BIT);

return mfn;
}

static inline unsigned long pfn_to_gfn(unsigned long pfn)
{
if (xen_feature(XENFEAT_autotranslated_physmap))
return pfn;
else
return pfn_to_mfn(pfn);
}



But you'd still say 'op.arg1.mfn = pfn_to_gfn(pfn);' in xen_do_pin() 
i.e. assign GFN to MFN, right? That's what I was referring to.


(In general, I am not sure a guest should ever use 'mfn' as it is purely 
a hypervisor construct. Including p2m, which I think should really be 
p2g as this is what we use to figure out what to stick into page tables)


-boris




Similar splitting would be done for gfn_to_pfn and mfn_to_pfn.

Regards,



___
Linuxppc-dev mailing list
Linuxppc-dev@lists.ozlabs.org
https://lists.ozlabs.org/listinfo/linuxppc-dev

Re: [PATCH 4/8] xen: Use the correctly the Xen memory terminologies

2015-07-29 Thread Boris Ostrovsky

On 07/29/2015 10:23 AM, Julien Grall wrote:

On 29/07/15 15:14, Boris Ostrovsky wrote:

static inline unsigned long pfn_to_gfn(unsigned long pfn)
{
 if (xen_feature(XENFEAT_autotranslated_physmap))
 return pfn;
 else
 return pfn_to_mfn(pfn);
}


But you'd still say 'op.arg1.mfn = pfn_to_gfn(pfn);' in xen_do_pin()
i.e. assign GFN to MFN, right? That's what I was referring to.

Well no. I would use op.arg1.mfn = pfn_to_mfn(pfn) given that the code,
if I'm right, is only executed for PV.

mfn = pfn_to_gfn(...) was valid too because on PV is always an MFN. The
suggestion of pfn_to_mfn was just for more readability,


Right, and my comments were also not about correctness.





(In general, I am not sure a guest should ever use 'mfn' as it is purely
a hypervisor construct. Including p2m, which I think should really be
p2g as this is what we use to figure out what to stick into page tables)

I think avoid to use mfn in the hypervisor interface is out-of-scope for
this series. If we ever want to modify the Xen API in Linux, we should
do in sync with Xen to avoid inconsistency on naming.

Anyway, the oddity of mfn = pfn_to_gfn(...) is mostly contained in the
x86 specific code. I don't mind to either add pfn_to_mfn and use it or
add a comment /* PV-specific so mfn == gfn */ for every use of mfn =
pfn_to_gfn(...).


I think the former is better (even thought it adds a test)

-boris
___
Linuxppc-dev mailing list
Linuxppc-dev@lists.ozlabs.org
https://lists.ozlabs.org/listinfo/linuxppc-dev

Re: [PATCH v2 4/8] xen: Use the correctly the Xen memory terminologies

2015-08-04 Thread Boris Ostrovsky

On 08/04/2015 02:12 PM, Julien Grall wrote:
  
  /*

   * We detect special mappings in one of two ways:
@@ -217,9 +232,13 @@ static inline unsigned long bfn_to_local_pfn(unsigned long 
mfn)
  
  /* VIRT <-> MACHINE conversion */

  #define virt_to_machine(v)(phys_to_machine(XPADDR(__pa(v
-#define virt_to_pfn(v)  (PFN_DOWN(__pa(v)))
  #define virt_to_mfn(v)(pfn_to_mfn(virt_to_pfn(v)))
  #define mfn_to_virt(m)(__va(mfn_to_pfn(m) << PAGE_SHIFT))
+#define virt_to_pfn(v)  (PFN_DOWN(__pa(v)))


This looks like unnecessary change.



diff --git a/drivers/video/fbdev/xen-fbfront.c 
b/drivers/video/fbdev/xen-fbfront.c
index 09dc447..25e3cce 100644
--- a/drivers/video/fbdev/xen-fbfront.c
+++ b/drivers/video/fbdev/xen-fbfront.c
@@ -539,7 +539,7 @@ static int xenfb_remove(struct xenbus_device *dev)
  
  static unsigned long vmalloc_to_mfn(void *address)

  {
-   return pfn_to_mfn(vmalloc_to_pfn(address));
+   return pfn_to_gfn(vmalloc_to_pfn(address));
  }


Are you sure? This will return vmalloc_to_pfn(address)).


-boris
___
Linuxppc-dev mailing list
Linuxppc-dev@lists.ozlabs.org
https://lists.ozlabs.org/listinfo/linuxppc-dev

Re: [PATCH v2 4/8] xen: Use the correctly the Xen memory terminologies

2015-08-05 Thread Boris Ostrovsky

On 08/05/2015 06:51 AM, Julien Grall wrote:



diff --git a/drivers/video/fbdev/xen-fbfront.c
b/drivers/video/fbdev/xen-fbfront.c
index 09dc447..25e3cce 100644
--- a/drivers/video/fbdev/xen-fbfront.c
+++ b/drivers/video/fbdev/xen-fbfront.c
@@ -539,7 +539,7 @@ static int xenfb_remove(struct xenbus_device *dev)
 static unsigned long vmalloc_to_mfn(void *address)
   {
-return pfn_to_mfn(vmalloc_to_pfn(address));
+return pfn_to_gfn(vmalloc_to_pfn(address));
   }

Are you sure? This will return vmalloc_to_pfn(address)).

I guess you mean vmalloc_to_mfn will return vmalloc_to_pfn?

If so, it will be only the case on auto-translated case (because pfn ==
gfn). In the case of PV, the mfn will be returned.


How will mfn be returned on PV when pfn_to_gfn() is an identity function?

static inline unsigned long pfn_to_gfn(unsigned long pfn)
 {
 return pfn;
 }


-boris



Although, this function is misnamed. It's fixed in a follow-up patch
(see #6) because it's required more renaming than this function. I
didn't want to add such changes within this patch.

Regards,



___
Linuxppc-dev mailing list
Linuxppc-dev@lists.ozlabs.org
https://lists.ozlabs.org/listinfo/linuxppc-dev

Re: [Xen-devel] [PATCH v2 4/8] xen: Use the correctly the Xen memory terminologies

2015-08-05 Thread Boris Ostrovsky

On 08/05/2015 08:33 AM, Julien Grall wrote:

On 05/08/15 13:19, Boris Ostrovsky wrote:

On 08/05/2015 06:51 AM, Julien Grall wrote:

diff --git a/drivers/video/fbdev/xen-fbfront.c
b/drivers/video/fbdev/xen-fbfront.c
index 09dc447..25e3cce 100644
--- a/drivers/video/fbdev/xen-fbfront.c
+++ b/drivers/video/fbdev/xen-fbfront.c
@@ -539,7 +539,7 @@ static int xenfb_remove(struct xenbus_device *dev)
  static unsigned long vmalloc_to_mfn(void *address)
{
-return pfn_to_mfn(vmalloc_to_pfn(address));
+return pfn_to_gfn(vmalloc_to_pfn(address));
}

Are you sure? This will return vmalloc_to_pfn(address)).

I guess you mean vmalloc_to_mfn will return vmalloc_to_pfn?

If so, it will be only the case on auto-translated case (because pfn ==
gfn). In the case of PV, the mfn will be returned.

How will mfn be returned on PV when pfn_to_gfn() is an identity function?

static inline unsigned long pfn_to_gfn(unsigned long pfn)
  {
  return pfn;
  }

The identity function is only for ARM guest which are always
auto-translated (arch/arm/include/asm/xen/page.h).

The x86 version contains a check if the guest is auto-translated or not
(arch/x86/include/asm/xen/page.):

static inline unsigned long pfn_to_gfn(unsigned long pfn)
{
 if (xen_feature(XENFEAT_auto_translated_physmap))
 return pfn;
 else
 return pfn_to_mfn(pfn);
}


Of course --- I was looking at the top of the patch and didn't realize 
it was ARM changes. Sorry for the noise.


-boris
___
Linuxppc-dev mailing list
Linuxppc-dev@lists.ozlabs.org
https://lists.ozlabs.org/listinfo/linuxppc-dev

Re: [PATCH v1 13/16] xen: swiotlb: return error code from xen_swiotlb_map_sg()

2021-07-19 Thread Boris Ostrovsky


On 7/15/21 12:45 PM, Logan Gunthorpe wrote:
> From: Martin Oliveira 
>
> The .map_sg() op now expects an error code instead of zero on failure.
>
> xen_swiotlb_map_sg() may only fail if xen_swiotlb_map_page() fails, but
> xen_swiotlb_map_page() only supports returning errors as
> DMA_MAPPING_ERROR. So coalesce all errors into EINVAL.


Reviewed-by: Boris Ostrovsky 





Re: [PATCH v1 4/5] PCI: Adapt all code locations to not use struct pci_dev::driver directly

2021-07-30 Thread Boris Ostrovsky


On 7/29/21 4:37 PM, Uwe Kleine-König wrote:

> --- a/drivers/pci/xen-pcifront.c
> +++ b/drivers/pci/xen-pcifront.c
> @@ -599,12 +599,12 @@ static pci_ers_result_t pcifront_common_process(int cmd,
>   result = PCI_ERS_RESULT_NONE;
>  
>   pcidev = pci_get_domain_bus_and_slot(domain, bus, devfn);
> - if (!pcidev || !pcidev->driver) {
> + pdrv = pci_driver_of_dev(pcidev);
> + if (!pcidev || !pdrv) {


If pcidev is NULL we are dead by the time we reach 'if' statement.


-boris



>   dev_err(&pdev->xdev->dev, "device or AER driver is NULL\n");
>   pci_dev_put(pcidev);
>   return result;
>   }
> - pdrv = pcidev->driver;
>  


Re: [PATCH v1 4/5] PCI: Adapt all code locations to not use struct pci_dev::driver directly

2021-08-02 Thread Boris Ostrovsky


On 7/31/21 8:08 AM, Uwe Kleine-König wrote:
> Hello Boris,
>
> On Fri, Jul 30, 2021 at 04:37:31PM -0400, Boris Ostrovsky wrote:
>> On 7/29/21 4:37 PM, Uwe Kleine-König wrote:
>>> --- a/drivers/pci/xen-pcifront.c
>>> +++ b/drivers/pci/xen-pcifront.c
>>> @@ -599,12 +599,12 @@ static pci_ers_result_t pcifront_common_process(int 
>>> cmd,
>>> result = PCI_ERS_RESULT_NONE;
>>>  
>>> pcidev = pci_get_domain_bus_and_slot(domain, bus, devfn);
>>> -   if (!pcidev || !pcidev->driver) {
>>> +   pdrv = pci_driver_of_dev(pcidev);
>>> +   if (!pcidev || !pdrv) {
>> If pcidev is NULL we are dead by the time we reach 'if' statement.
> Oh, you're right. So this needs something like:
>
>   if (!pcidev || !(pdrv = pci_driver_of_dev(pcidev)))


Sure, that's fine. And while at it please also drop 'if (pdrv)' check below 
(it's not directly related to your change but is more noticeable now so since 
you are in that function anyway I'd appreciate if you could do that).


Thanks.

-boris


>
> or repeating the call to pci_driver_of_dev for each previous usage of
> pdev->driver.
>
> If there are no other preferences I'd got with the first approach for
> v2.
>
> Best regards and thanks for catching,
> Uwe
>


Re: [PATCH v2 5/6] PCI: Adapt all code locations to not use struct pci_dev::driver directly

2021-08-03 Thread Boris Ostrovsky


On 8/3/21 6:01 AM, Uwe Kleine-König wrote:
> This prepares removing the driver member of struct pci_dev which holds the
> same information than struct pci_dev::dev->driver.
>
> Signed-off-by: Uwe Kleine-König 
> ---
>  arch/powerpc/include/asm/ppc-pci.h|  3 +-
>  arch/powerpc/kernel/eeh_driver.c  | 12 ---
>  arch/x86/events/intel/uncore.c|  2 +-
>  arch/x86/kernel/probe_roms.c  |  2 +-
>  drivers/bcma/host_pci.c   |  6 ++--
>  drivers/crypto/hisilicon/qm.c |  2 +-
>  drivers/crypto/qat/qat_common/adf_aer.c   |  2 +-
>  drivers/message/fusion/mptbase.c  |  4 +--
>  drivers/misc/cxl/guest.c  | 21 +--
>  drivers/misc/cxl/pci.c| 25 +++--
>  .../ethernet/hisilicon/hns3/hns3_ethtool.c|  2 +-
>  .../ethernet/marvell/prestera/prestera_pci.c  |  2 +-
>  drivers/net/ethernet/mellanox/mlxsw/pci.c |  2 +-
>  .../ethernet/netronome/nfp/nfp_net_ethtool.c  |  2 +-
>  drivers/pci/iov.c | 23 +++-
>  drivers/pci/pci-driver.c  | 28 ---
>  drivers/pci/pci.c | 10 +++---
>  drivers/pci/pcie/err.c| 35 ++-
>  drivers/pci/xen-pcifront.c|  3 +-
>  drivers/ssb/pcihost_wrapper.c |  7 ++--
>  drivers/usb/host/xhci-pci.c   |  3 +-
>  21 files changed, 112 insertions(+), 84 deletions(-)


For Xen bits:

Reviewed-by: Boris Ostrovsky 




Re: [PATCH 0/5] xen: cleanup detection of non-essential pv devices

2021-11-23 Thread Boris Ostrovsky



On 11/22/21 3:20 AM, Juergen Gross wrote:

On 22.10.21 08:47, Juergen Gross wrote:

Today the non-essential pv devices are hard coded in the xenbus driver
and this list is lacking multiple entries.

This series reworks the detection logic of non-essential devices by
adding a flag for that purpose to struct xenbus_driver.

Juergen Gross (5):
   xen: add "not_essential" flag to struct xenbus_driver
   xen: flag xen_drm_front to be not essential for system boot
   xen: flag hvc_xen to be not essential for system boot
   xen: flag pvcalls-front to be not essential for system boot
   xen: flag xen_snd_front to be not essential for system boot

  drivers/gpu/drm/xen/xen_drm_front.c    |  1 +
  drivers/input/misc/xen-kbdfront.c  |  1 +
  drivers/tty/hvc/hvc_xen.c  |  1 +
  drivers/video/fbdev/xen-fbfront.c  |  1 +
  drivers/xen/pvcalls-front.c    |  1 +
  drivers/xen/xenbus/xenbus_probe_frontend.c | 14 +++---
  include/xen/xenbus.h   |  1 +
  sound/xen/xen_snd_front.c  |  1 +
  8 files changed, 10 insertions(+), 11 deletions(-)



Any further comments?



Reviewed-by: Boris Ostrovsky 


(I'll fix the semicolon typo in the last patch, no need to resend)



Re: [PATCH 0/5] xen: cleanup detection of non-essential pv devices

2021-11-25 Thread Boris Ostrovsky



On 11/22/21 3:20 AM, Juergen Gross wrote:

On 22.10.21 08:47, Juergen Gross wrote:

Today the non-essential pv devices are hard coded in the xenbus driver
and this list is lacking multiple entries.

This series reworks the detection logic of non-essential devices by
adding a flag for that purpose to struct xenbus_driver.

Juergen Gross (5):
   xen: add "not_essential" flag to struct xenbus_driver
   xen: flag xen_drm_front to be not essential for system boot
   xen: flag hvc_xen to be not essential for system boot
   xen: flag pvcalls-front to be not essential for system boot
   xen: flag xen_snd_front to be not essential for system boot

  drivers/gpu/drm/xen/xen_drm_front.c    |  1 +
  drivers/input/misc/xen-kbdfront.c  |  1 +
  drivers/tty/hvc/hvc_xen.c  |  1 +
  drivers/video/fbdev/xen-fbfront.c  |  1 +
  drivers/xen/pvcalls-front.c    |  1 +
  drivers/xen/xenbus/xenbus_probe_frontend.c | 14 +++---
  include/xen/xenbus.h   |  1 +
  sound/xen/xen_snd_front.c  |  1 +
  8 files changed, 10 insertions(+), 11 deletions(-)







Applied to for-linus-5.16c


-boris



Re: cleanup swiotlb initialization

2022-02-24 Thread Boris Ostrovsky



On 2/22/22 10:35 AM, Christoph Hellwig wrote:

Hi all,

this series tries to clean up the swiotlb initialization, including
that of swiotlb-xen.  To get there is also removes the x86 iommu table
infrastructure that massively obsfucates the initialization path.

Git tree:

 git://git.infradead.org/users/hch/misc.git swiotlb-init-cleanup



I haven't had a chance to look at this yet but this crashes as dom0:


[   37.377313] BUG: unable to handle page fault for address: c90042880018
[   37.378219] #PF: supervisor read access in kernel mode
[   37.378219] #PF: error_code(0x) - not-present page
[   37.378219] PGD 7c2f2ee067 P4D 7c2f2ee067 PUD 7bf019b067 PMD 105a30067 PTE 0
[   37.378219] Oops:  [#1] PREEMPT SMP NOPTI
[   37.378219] CPU: 14 PID: 1 Comm: swapper/0 Not tainted 5.17.0-rc5swiotlb #9
[   37.378219] Hardware name: Oracle Corporation ORACLE SERVER 
E1-2c/ASY,Generic,SM,E1-2c, BIOS 49004900 12/23/2020
[   37.378219] RIP: e030:init_iommu_one+0x248/0x2f0
[   37.378219] Code: 48 89 43 68 48 85 c0 74 c4 be 00 20 00 00 48 89 df e8 ea ee ff 
ff 48 89 43 78 48 85 c0 74 ae c6 83 98 00 00 00 00 48 8b 43 38 <48> 8b 40 18 a8 
01 74 07 83 8b a8 04 00 00 01 f6 83 a8 04 00 00 01
[   37.378219] RSP: e02b:c9004044bd18 EFLAGS: 00010286
[   37.378219] RAX: c9004288 RBX: 888107260800 RCX: 
[   37.378219] RDX: 8000 RSI: ea00041cab80 RDI: 
[   37.378219] RBP: c9004044bd38 R08: 0901 R09: ea00041cab00
[   37.378219] R10: 0002 R11:  R12: c90040435008
[   37.378219] R13: 0008 R14: efa0 R15: 
[   37.378219] FS:  () GS:88fef418() 
knlGS:
[   37.378219] CS:  e030 DS:  ES:  CR0: 80050033
[   37.378219] CR2: c90042880018 CR3: 0260a000 CR4: 00050660
[   37.378219] Call Trace:
[   37.378219]  
[   37.378219]  early_amd_iommu_init+0x3c5/0x72d
[   37.378219]  ? iommu_setup+0x284/0x284
[   37.378219]  state_next+0x158/0x68f
[   37.378219]  ? iommu_setup+0x284/0x284
[   37.378219]  iommu_go_to_state+0x28/0x2d
[   37.378219]  amd_iommu_init+0x15/0x4b
[   37.378219]  ? iommu_setup+0x284/0x284
[   37.378219]  pci_iommu_init+0x12/0x37
[   37.378219]  do_one_initcall+0x48/0x210
[   37.378219]  kernel_init_freeable+0x229/0x28c
[   37.378219]  ? rest_init+0xe0/0xe0
[   37.963966]  kernel_init+0x1a/0x130
[   37.979415]  ret_from_fork+0x22/0x30
[   37.991436]  
[   37.999465] Modules linked in:
[   38.007413] CR2: c90042880018
[   38.019416] ---[ end trace  ]---
[   38.023418] RIP: e030:init_iommu_one+0x248/0x2f0
[   38.023418] Code: 48 89 43 68 48 85 c0 74 c4 be 00 20 00 00 48 89 df e8 ea ee ff 
ff 48 89 43 78 48 85 c0 74 ae c6 83 98 00 00 00 00 48 8b 43 38 <48> 8b 40 18 a8 
01 74 07 83 8b a8 04 00 00 01 f6 83 a8 04 00 00 01
[   38.023418] RSP: e02b:c9004044bd18 EFLAGS: 00010286
[   38.023418] RAX: c9004288 RBX: 888107260800 RCX: 
[   38.155413] RDX: 8000 RSI: ea00041cab80 RDI: 
[   38.175965] Freeing initrd memory: 62640K
[   38.155413] RBP: c9004044bd38 R08: 0901 R09: ea00041cab00
[   38.155413] R10: 0002 R11:  R12: c90040435008
[   38.155413] R13: 0008 R14: efa0 R15: 
[   38.155413] FS:  () GS:88fef418() 
knlGS:
[   38.287414] CS:  e030 DS:  ES:  CR0: 80050033
[   38.309557] CR2: c90042880018 CR3: 0260a000 CR4: 00050660
[   38.332403] Kernel panic - not syncing: Fatal exception
[   38.351414] Rebooting in 20 seconds..



-boris



Re: cleanup swiotlb initialization

2022-02-24 Thread Boris Ostrovsky



On 2/24/22 10:58 AM, Christoph Hellwig wrote:

Thanks.

This looks really strange as early_amd_iommu_init should not interact much
with the changes.  I'll see if I can find a AMD system to test on.



Just to be clear: this crashes only as dom0. Boots fine as baremetal.


-boris




On Wed, Feb 23, 2022 at 07:57:49PM -0500, Boris Ostrovsky wrote:

[   37.377313] BUG: unable to handle page fault for address: c90042880018
[   37.378219] #PF: supervisor read access in kernel mode
[   37.378219] #PF: error_code(0x) - not-present page
[   37.378219] PGD 7c2f2ee067 P4D 7c2f2ee067 PUD 7bf019b067 PMD 105a30067 PTE 0
[   37.378219] Oops:  [#1] PREEMPT SMP NOPTI
[   37.378219] CPU: 14 PID: 1 Comm: swapper/0 Not tainted 5.17.0-rc5swiotlb #9
[   37.378219] Hardware name: Oracle Corporation ORACLE SERVER 
E1-2c/ASY,Generic,SM,E1-2c, BIOS 49004900 12/23/2020
[   37.378219] RIP: e030:init_iommu_one+0x248/0x2f0
[   37.378219] Code: 48 89 43 68 48 85 c0 74 c4 be 00 20 00 00 48 89 df e8 ea ee ff 
ff 48 89 43 78 48 85 c0 74 ae c6 83 98 00 00 00 00 48 8b 43 38 <48> 8b 40 18 a8 
01 74 07 83 8b a8 04 00 00 01 f6 83 a8 04 00 00 01
[   37.378219] RSP: e02b:c9004044bd18 EFLAGS: 00010286
[   37.378219] RAX: c9004288 RBX: 888107260800 RCX: 
[   37.378219] RDX: 8000 RSI: ea00041cab80 RDI: 
[   37.378219] RBP: c9004044bd38 R08: 0901 R09: ea00041cab00
[   37.378219] R10: 0002 R11:  R12: c90040435008
[   37.378219] R13: 0008 R14: efa0 R15: 
[   37.378219] FS:  () GS:88fef418() 
knlGS:
[   37.378219] CS:  e030 DS:  ES:  CR0: 80050033
[   37.378219] CR2: c90042880018 CR3: 0260a000 CR4: 00050660
[   37.378219] Call Trace:
[   37.378219]  
[   37.378219]  early_amd_iommu_init+0x3c5/0x72d
[   37.378219]  ? iommu_setup+0x284/0x284
[   37.378219]  state_next+0x158/0x68f
[   37.378219]  ? iommu_setup+0x284/0x284
[   37.378219]  iommu_go_to_state+0x28/0x2d
[   37.378219]  amd_iommu_init+0x15/0x4b
[   37.378219]  ? iommu_setup+0x284/0x284
[   37.378219]  pci_iommu_init+0x12/0x37
[   37.378219]  do_one_initcall+0x48/0x210
[   37.378219]  kernel_init_freeable+0x229/0x28c
[   37.378219]  ? rest_init+0xe0/0xe0
[   37.963966]  kernel_init+0x1a/0x130
[   37.979415]  ret_from_fork+0x22/0x30
[   37.991436]  
[   37.999465] Modules linked in:
[   38.007413] CR2: c90042880018
[   38.019416] ---[ end trace  ]---
[   38.023418] RIP: e030:init_iommu_one+0x248/0x2f0
[   38.023418] Code: 48 89 43 68 48 85 c0 74 c4 be 00 20 00 00 48 89 df e8 ea ee ff 
ff 48 89 43 78 48 85 c0 74 ae c6 83 98 00 00 00 00 48 8b 43 38 <48> 8b 40 18 a8 
01 74 07 83 8b a8 04 00 00 01 f6 83 a8 04 00 00 01
[   38.023418] RSP: e02b:c9004044bd18 EFLAGS: 00010286
[   38.023418] RAX: c9004288 RBX: 888107260800 RCX: 
[   38.155413] RDX: 8000 RSI: ea00041cab80 RDI: 
[   38.175965] Freeing initrd memory: 62640K
[   38.155413] RBP: c9004044bd38 R08: 0901 R09: ea00041cab00
[   38.155413] R10: 0002 R11:  R12: c90040435008
[   38.155413] R13: 0008 R14: efa0 R15: 
[   38.155413] FS:  () GS:88fef418() 
knlGS:
[   38.287414] CS:  e030 DS:  ES:  CR0: 80050033
[   38.309557] CR2: c90042880018 CR3: 0260a000 CR4: 00050660
[   38.332403] Kernel panic - not syncing: Fatal exception
[   38.351414] Rebooting in 20 seconds..



-boris

---end quoted text---



Re: cleanup swiotlb initialization

2022-02-24 Thread Boris Ostrovsky



On 2/24/22 11:39 AM, Christoph Hellwig wrote:

On Thu, Feb 24, 2022 at 11:18:33AM -0500, Boris Ostrovsky wrote:

On 2/24/22 10:58 AM, Christoph Hellwig wrote:

Thanks.

This looks really strange as early_amd_iommu_init should not interact much
with the changes.  I'll see if I can find a AMD system to test on.


Just to be clear: this crashes only as dom0. Boots fine as baremetal.

Ah.  I can gues what this might be.  On Xen the hypervisor controls the
IOMMU and we should never end up initializing it in Linux, right?



Right, we shouldn't be in that code path.


-boris



Re: cleanup swiotlb initialization

2022-02-25 Thread Boris Ostrovsky



On 2/25/22 3:47 AM, Christoph Hellwig wrote:

On Thu, Feb 24, 2022 at 12:07:26PM -0500, Boris Ostrovsky wrote:

Just to be clear: this crashes only as dom0. Boots fine as baremetal.

Ah.  I can gues what this might be.  On Xen the hypervisor controls the
IOMMU and we should never end up initializing it in Linux, right?


Right, we shouldn't be in that code path.

Can you try the patch below on top of the series?



Yes, this makes dom0 boot fine.


(It also addresses something I wanted to mention while looking at the patches, 
which was to remove Xen initialization code from pci_swiotlb_detect_4gb() since 
it had noting to do with what the routine's name implies.)


-boris



Re: [PATCH v3 7/8] PCI: Replace pci_dev::driver usage by pci_dev::dev.driver

2021-08-11 Thread Boris Ostrovsky


On 8/11/21 4:06 AM, Uwe Kleine-König wrote:
> struct pci_dev::driver contains (apart from a constant offset) the same
> data as struct pci_dev::dev->driver. Replace all remaining users of the
> former pointer by the latter to allow removing the former.
>
> Signed-off-by: Uwe Kleine-König 


Xen:


Reviewed-by: Boris Ostrovsky