Re: [Xen-devel] [PATCH v2] drm/xen-front: Make shmem backed display buffer coherent
Dropped in favor of https://lkml.org/lkml/2019/1/29/910

On 1/24/19 5:02 PM, Julien Grall wrote:
[...]
Re: [Xen-devel] [PATCH v2] drm/xen-front: Make shmem backed display buffer coherent
On 1/24/19 5:02 PM, Julien Grall wrote:
[...]
> MT_NORMAL_NC is different from MT_NORMAL. The use of the former will
> result in non-cacheable memory while the latter will result in
> cacheable memory.
>
> To me, this looks like the exact reason why you see artifacts on the
> display buffer. As the author of this code, can you explain why you
> decided to use pgprot_writecombine here instead of relying on the
> default VMA prot?

Sorry for the late reply, it took me quite some time to re-check all the
use-cases that we have with PV DRM. First of all, I found a bug in my
debug code which reported page attributes, and now I can confirm that all
the use-cases use MT_NORMAL, both backend and frontend.

You are right: there is no reason to have pgprot_writecombine and indeed
I have to rely on the almost default VMA prot. I came up with the
following (used by omap drm, for example):

diff --git a/drivers/gpu/drm/xen/xen_drm_front_gem.c b/drivers/gpu/drm/xen/xen_drm_front_gem.c
index ae28ad4b4254..867800a2ed42 100644
--- a/drivers/gpu/drm/xen/xen_drm_front_gem.c
+++ b/drivers/gpu/drm/xen/xen_drm_front_gem.c
@@ -238,8 +238,7 @@ static int gem_mmap_obj(struct xen_gem_object *xen_obj,
 	vma->vm_flags &= ~VM_PFNMAP;
 	vma->vm_flags |= VM_MIXEDMAP;
 	vma->vm_pgoff = 0;
-	vma->vm_page_prot =
-			pgprot_writecombine(vm_get_page_prot(vma->vm_flags));
+	vma->vm_page_prot = vm_get_page_prot(vma->vm_flags);
 
 	{
 		int cnt = xen_obj->num_pages > 5 ? 5 : xen_obj->num_pages;
@@ -295,7 +294,7 @@ void *xen_drm_front_gem_prime_vmap(struct drm_gem_object *gem_obj)
 		return NULL;
 
 	return vmap(xen_obj->pages, xen_obj->num_pages,
-		    VM_MAP, pgprot_writecombine(PAGE_KERNEL));
+		    VM_MAP, PAGE_KERNEL);
 }

With the above all the artifacts are gone now, as the page attributes are
the same across all domains. So, I consider this a fix and will either
send it as v3 or drop this patch and send a new one. THANK YOU for
helping figure this out!

> The cache flush is likely hiding the real problem rather than solving it.

It does hide the real issue. And I have confirmed this.

[...]
Re: [Xen-devel] [PATCH v2] drm/xen-front: Make shmem backed display buffer coherent
On 24/01/2019 14:34, Oleksandr Andrushchenko wrote:
> Hello, Julien!

Hi,

> On 1/22/19 1:44 PM, Julien Grall wrote:
[...]
> So, we have:
>
> DomU: frontend side
>
> !PTE_RDONLY + PTE_PXN + PTE_SHARED + PTE_AF + PTE_UXN +
> PTE_ATTRINDX(MT_NORMAL)

I still don't understand how you came up with MT_NORMAL when you seem
to confirm...

> DomD: backend side
>
> PTE_USER + !PTE_RDONLY + PTE_PXN + PTE_NG + PTE_CONT + PTE_TABLE_BIT +
> PTE_UXN + PTE_ATTRINDX(MT_NORMAL)
>
> From the above it seems that I don't violate the cached/non-cached
> agreement here
[...]
> pgprot_writecombine is PTE_ATTRINDX(MT_NORMAL_NC), so it seems to be
> applicable here? [1]

... that MT_NORMAL_NC is used for the frontend pages.

MT_NORMAL_NC is different from MT_NORMAL. The use of the former will
result in non-cacheable memory while the latter will result in
cacheable memory.

To me, this looks like the exact reason why you see artifacts on the
display buffer. As the author of this code, can you explain why you
decided to use pgprot_writecombine here instead of relying on the
default VMA prot?

[...]

> Well, you are right. But at the same time not flushing the buffer
> makes troubles, so this is why I am trying to figure out what is
> wrong here.

The cache flush is likely hiding the real problem rather than solving it.

> Barriers for the buffer? Not sure what you mean here.

If you share information between two entities, you may need some
ordering so the information is seen consistently by the consumer side.
This can be achieved by using barriers.

> Even more, we have a use case when the buffer is not touched by the
> CPU in DomD and is solely owned by the HW.

Memory ordering issues are subtle. The fact that one of your use-cases
works does not imply that barriers are not necessary. I am not saying
there are missing barriers, I am only pointing out potential reasons.

Anyway, I don't think your problem is a missing barrier here. It is
more likely because of mismatched memory attributes (see above).

Cheers,

--
Julien Grall
Re: [Xen-devel] [PATCH v2] drm/xen-front: Make shmem backed display buffer coherent
Hello, Julien!

Sorry for the late reply - it took quite some time to collect the data
requested.

On 1/22/19 1:44 PM, Julien Grall wrote:
[...]
> They are actually important no matter what is your use case. If you
> access the same physical page with different attributes, then you are
> asking for trouble.

So, we have:

DomU: frontend side
!PTE_RDONLY + PTE_PXN + PTE_SHARED + PTE_AF + PTE_UXN +
PTE_ATTRINDX(MT_NORMAL)

DomD: backend side
PTE_USER + !PTE_RDONLY + PTE_PXN + PTE_NG + PTE_CONT + PTE_TABLE_BIT +
PTE_UXN + PTE_ATTRINDX(MT_NORMAL)

From the above it seems that I don't violate the cached/non-cached
agreement here.

> My knowledge of Xen DRM is inexistent. However, looking at the code in
> 5.0-rc2, I don't seem to find the same attributes. For instance
> xen_drm_front_gem_prime_vmap and gem_mmap_obj are using
> pgprot_writecombine. So it looks like the mapping will be
> non-cacheable on Arm64.
>
> Can you explain how you came up with these attributes?

pgprot_writecombine is PTE_ATTRINDX(MT_NORMAL_NC), so it seems to be
applicable here? [1]

[...]

> We actually never required to use cache flush in other PV protocols, so
> I still don't understand why the PV DRM should be different here.

Well, you are right. But at the same time not flushing the buffer makes
troubles, so this is why I am trying to figure out what is wrong here.

> To me, it looks like you are either missing some barriers

Barriers for the buffer? Not sure what you mean here. Even more, we have
a use case when the buffer is not touched by the CPU in DomD and is
solely owned by the HW.

> or the memory attributes are not correct.

Please see above - I do need your advice here.

Thank you very much for your time,
Oleksandr

[1] https://elixir.bootlin.com/linux/v5.0-rc2/source/arch/arm64/include/asm/pgtable.h#L414
Re: [Xen-devel] [PATCH v2] drm/xen-front: Make shmem backed display buffer coherent
On 1/22/19 10:28 AM, Oleksandr Andrushchenko wrote:
> Hello, Julien!

Hi,

> On 1/21/19 7:09 PM, Julien Grall wrote:
> Well, I didn't get the attributes of pages at the backend side, but IMO
> those do not matter in my use-case (for simplicity I am not using
> zero-copying at the backend side):

They are actually important no matter what is your use case. If you
access the same physical page with different attributes, then you are
asking for trouble.

This is why Xen imposes all the pages shared to have their memory
attributes following some rules. Actually, speaking with Mark R., we may
want to tighten the attributes a bit more.

> 1. Frontend device allocates display buffer pages which come from shmem
> and have these attributes:
> !PTE_RDONLY + PTE_PXN + PTE_SHARED + PTE_AF + PTE_UXN +
> PTE_ATTRINDX(MT_NORMAL)

My knowledge of Xen DRM is inexistent. However, looking at the code in
5.0-rc2, I don't seem to find the same attributes. For instance
xen_drm_front_gem_prime_vmap and gem_mmap_obj are using
pgprot_writecombine. So it looks like the mapping will be non-cacheable
on Arm64.

Can you explain how you came up with these attributes?

[...]
> Using the DMA API on the frontend's side does help - no artifacts are
> seen. This is why I'm thinking that this is related to the
> frontend/kernel side rather than to the backend side. This is why I'm
> thinking this is related to caches and trying to figure out what can be
> done here instead of using the DMA API.

We actually never required to use cache flush in other PV protocols, so
I still don't understand why the PV DRM should be different here.

To me, it looks like you are either missing some barriers or the memory
attributes are not correct.

Cheers,

--
Julien Grall
Re: [Xen-devel] [PATCH v2] drm/xen-front: Make shmem backed display buffer coherent
Hello, Julien!

On 1/21/19 7:09 PM, Julien Grall wrote:
> Hello,
>
> On 21/01/2019 12:43, Oleksandr Andrushchenko wrote:
[...]
> So you should not need any cache maintenance here. Can you provide
> more details on the memory attributes you use for memory shared in
> both the backend and frontend?

>> It takes quite some time to collect this (because many components are
>> involved in the use-case), but for now the pages in the guest domain
>> are:
>> !PTE_RDONLY + PTE_PXN + PTE_SHARED + PTE_AF + PTE_UXN +
>> PTE_ATTRINDX(MT_NORMAL)
>
> So that's the attribute for the page mapped in the frontend, right?
> How about the backend side?

Please see below.

> Also, could that page be handed to the graphic card correctly?

Yes, if we use zero-copying. But please see below.

> If so, is your graphic card coherent?

Yes, it is.

> If one of your components is mapping with non-cacheable attributes
> then you are already not following the Xen Arm ABI. In that case, we
> would need to discuss how to extend the ABI.

Well, I didn't get the attributes of pages at the backend side, but IMO
those do not matter in my use-case (for simplicity I am not using
zero-copying at the backend side):

1. Frontend device allocates display buffer pages which come from shmem
and have these attributes:
!PTE_RDONLY + PTE_PXN + PTE_SHARED + PTE_AF + PTE_UXN +
PTE_ATTRINDX(MT_NORMAL)

2. Frontend grants references to these pages and shares those with the
backend.

3. Backend is a user-space application (Weston client), so it uses the
gntdev kernel driver to mmap the pages. The pages, which are used by
gntdev, are those coming from the Xen balloon driver and I believe they
are all normal memory and shouldn't be non-cached.

4. Once the frontend starts displaying, it flips the buffers and the
backend does *memcpy* from the frontend-backend shared buffer into
Weston's buffer. This means no HW at the backend side touches the shared
buffer.

5. I can see a distorted picture.

Previously I used a setup with zero-copying, so then the picture becomes
more complicated in terms of buffers and how those are used by the
backend, but anyway it seems that the very basic scenario with memory
copying doesn't work for me.

Using the DMA API on the frontend's side does help - no artifacts are
seen. This is why I'm thinking that this is related to the
frontend/kernel side rather than to the backend side. This is why I'm
thinking this is related to caches and trying to figure out what can be
done here instead of using the DMA API.

Thank you,
Oleksandr
Re: [Xen-devel] [PATCH v2] drm/xen-front: Make shmem backed display buffer coherent
Hello,

On 21/01/2019 12:43, Oleksandr Andrushchenko wrote:
[...]
> It takes quite some time to collect this (because many components are
> involved in the use-case), but for now the pages in the guest domain
> are:
> !PTE_RDONLY + PTE_PXN + PTE_SHARED + PTE_AF + PTE_UXN +
> PTE_ATTRINDX(MT_NORMAL)

So that's the attribute for the page mapped in the frontend, right? How
about the backend side?

Also, could that page be handed to the graphic card correctly? If so, is
your graphic card coherent?

If one of your components is mapping with non-cacheable attributes then
you are already not following the Xen Arm ABI. In that case, we would
need to discuss how to extend the ABI.

Cheers,

--
Julien Grall
Re: [Xen-devel] [PATCH v2] drm/xen-front: Make shmem backed display buffer coherent
On 1/18/19 1:43 PM, Julien Grall wrote:
> (+ Stefano)
>
> Hi,
>
> Sorry for jumping late in the conversation.
>
> On 18/01/2019 09:40, Oleksandr Andrushchenko wrote:
[...]
> So you should not need any cache maintenance here. Can you provide
> more details on the memory attributes you use for memory shared in
> both the backend and frontend?

It takes quite some time to collect this (because many components are
involved in the use-case), but for now the pages in the guest domain
are:

!PTE_RDONLY + PTE_PXN + PTE_SHARED + PTE_AF + PTE_UXN +
PTE_ATTRINDX(MT_NORMAL)

> Cheers,

Thank you,
Oleksandr
Re: [Xen-devel] [PATCH v2] drm/xen-front: Make shmem backed display buffer coherent
(+ Stefano)

Hi,

Sorry for jumping late in the conversation.

On 18/01/2019 09:40, Oleksandr Andrushchenko wrote:
> On 1/17/19 11:18 AM, Christoph Hellwig wrote:
[...]
> I'll have to get even deeper into it. Initially I looked at the code,
> but didn't find anything useful. Or maybe I have just overlooked
> obvious things there

From the Xen on Arm ABI:

"All memory which is shared with other entities in the system
(including the hypervisor and other guests) must reside in memory
which is mapped as Normal Inner Write-Back Outer Write-Back
Inner-Shareable.
This applies to:
- hypercall arguments passed via a pointer to guest memory.
- memory shared via the grant table mechanism (including PV I/O
rings etc).
- memory shared with the hypervisor (struct shared_info, struct
vcpu_info, the grant table, etc)."

So you should not need any cache maintenance here. Can you provide more
details on the memory attributes you use for memory shared in both the
backend and frontend?

Cheers,

--
Julien Grall
Re: [PATCH v2] drm/xen-front: Make shmem backed display buffer coherent
On 1/17/19 11:18 AM, Christoph Hellwig wrote:
> On Wed, Jan 16, 2019 at 06:43:29AM, Oleksandr Andrushchenko wrote:
[...]
> The right answer would be to implement cache maintenance hooks for
> this case in the Xen arch code. These would basically look the same
> as the low-level cache maintenance used by the DMA ops, but without
> going through the DMA mapping layer; in fact, they should probably
> reuse the same low-level assembly routines.
>
> I don't think this is the first usage of such Xen buffer sharing, so
> what do the other users do?

I'll have to get even deeper into it. Initially I looked at the code,
but didn't find anything useful. Or maybe I have just overlooked obvious
things there.

Thank you,
Oleksandr
Re: [PATCH v2] drm/xen-front: Make shmem backed display buffer coherent
On Wed, Jan 16, 2019 at 06:43:29AM +, Oleksandr Andrushchenko wrote:
>> This whole issue keeps getting more and more confusing.
> Well, I don't really do DMA here, but instead the buffers in
> question are shared with another Xen domain, so effectively it
> could be thought of as some sort of DMA here, where the "device" is
> that remote domain. If the buffers are not flushed then the
> remote part sees some inconsistency, which in my case results
> in artifacts on screen while displaying the buffers.
> When buffers are allocated via the DMA API then there are no
> artifacts; if buffers are allocated with shmem + DMA mapping then
> there are no artifacts as well.
> The only offending use-case is when I use shmem-backed buffers
> but do not flush them.

The right answer would be to implement cache maintenance hooks for
this case in the Xen arch code. These would basically look the same
as the low-level cache maintenance used by the DMA ops, but without
going through the DMA mapping layer; in fact they should probably
reuse the same low-level assembly routines.

I don't think this is the first usage of such Xen buffer sharing, so
what do the other users do?
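Christoph's proposal, sketched at a pseudocode level for orientation. The helper name `xen_shared_buffer_flush` is invented here for illustration, and `__flush_dcache_area` is the arm64 low-level routine of that era that the DMA ops ultimately rely on; the real interface would need to live in the Xen arch code and handle highmem pages, which this sketch does not.

```c
/*
 * Pseudocode-level sketch of the cache maintenance hook Christoph
 * describes: clean/invalidate the data cache for each page backing a
 * buffer shared with another domain, reusing the same low-level
 * routine the arm64 DMA ops use, but without creating a DMA mapping.
 * All names here are illustrative assumptions, not an existing Xen API.
 */
#include <linux/mm.h>
#include <asm/cacheflush.h>

static void xen_shared_buffer_flush(struct page **pages,
				    unsigned int num_pages)
{
	unsigned int i;

	/* Assumes lowmem pages so page_address() is valid. */
	for (i = 0; i < num_pages; i++)
		__flush_dcache_area(page_address(pages[i]), PAGE_SIZE);
}
```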
Re: [PATCH v2] drm/xen-front: Make shmem backed display buffer coherent
On 1/16/19 8:36 AM, Christoph Hellwig wrote:
> On Wed, Jan 16, 2019 at 07:30:02AM +0100, Gerd Hoffmann wrote:
>>   Hi,
>>
>>> +	if (!dma_map_sg(dev->dev, xen_obj->sgt->sgl, xen_obj->sgt->nents,
>>> +			DMA_BIDIRECTIONAL)) {
>>> +		ret = -EFAULT;
>>> +		goto fail_free_sgt;
>>> +	}
>> Hmm, so it seems the arm guys could not come up with a suggestion how
>> to solve that one in a better way. Ok, let's go with this then.
>>
>> But didn't we agree that this deserves a comment explaining that the
>> purpose of the dma_map_sg() call is to flush caches and that there is
>> no actual DMA happening here?
> Using a dma mapping call to flush caches is a complete no-go. But the
> real question is why you'd even want to flush caches if you do not
> want a dma mapping?
>
> This whole issue keeps getting more and more confusing.
Well, I don't really do DMA here, but instead the buffers in
question are shared with another Xen domain, so effectively it
could be thought of as some sort of DMA here, where the "device" is
that remote domain. If the buffers are not flushed then the
remote part sees some inconsistency, which in my case results
in artifacts on screen while displaying the buffers.
When buffers are allocated via the DMA API then there are no
artifacts; if buffers are allocated with shmem + DMA mapping then
there are no artifacts as well.
The only offending use-case is when I use shmem-backed buffers
but do not flush them.
Re: [PATCH v2] drm/xen-front: Make shmem backed display buffer coherent
On 1/16/19 8:30 AM, Gerd Hoffmann wrote:
>   Hi,
>
>> +	if (!dma_map_sg(dev->dev, xen_obj->sgt->sgl, xen_obj->sgt->nents,
>> +			DMA_BIDIRECTIONAL)) {
>> +		ret = -EFAULT;
>> +		goto fail_free_sgt;
>> +	}
> Hmm, so it seems the arm guys could not come up with a suggestion how
> to solve that one in a better way. Ok, let's go with this then.
>
> But didn't we agree that this deserves a comment explaining that the
> purpose of the dma_map_sg() call is to flush caches and that there is
> no actual DMA happening here?
My bad, will add the comment.
> cheers,
>   Gerd

Thank you,
Oleksandr
Re: [PATCH v2] drm/xen-front: Make shmem backed display buffer coherent
On Wed, Jan 16, 2019 at 07:30:02AM +0100, Gerd Hoffmann wrote:
>   Hi,
>
>> +	if (!dma_map_sg(dev->dev, xen_obj->sgt->sgl, xen_obj->sgt->nents,
>> +			DMA_BIDIRECTIONAL)) {
>> +		ret = -EFAULT;
>> +		goto fail_free_sgt;
>> +	}
>
> Hmm, so it seems the arm guys could not come up with a suggestion how
> to solve that one in a better way. Ok, let's go with this then.
>
> But didn't we agree that this deserves a comment explaining that the
> purpose of the dma_map_sg() call is to flush caches and that there is
> no actual DMA happening here?

Using a dma mapping call to flush caches is a complete no-go. But the
real question is why you'd even want to flush caches if you do not
want a dma mapping?

This whole issue keeps getting more and more confusing.
Re: [PATCH v2] drm/xen-front: Make shmem backed display buffer coherent
  Hi,

> +	if (!dma_map_sg(dev->dev, xen_obj->sgt->sgl, xen_obj->sgt->nents,
> +			DMA_BIDIRECTIONAL)) {
> +		ret = -EFAULT;
> +		goto fail_free_sgt;
> +	}

Hmm, so it seems the arm guys could not come up with a suggestion how
to solve that one in a better way. Ok, let's go with this then.

But didn't we agree that this deserves a comment explaining that the
purpose of the dma_map_sg() call is to flush caches and that there is
no actual DMA happening here?

cheers,
  Gerd
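For reference, the comment Gerd asks for could look like the hunk below. This is illustrative wording only, not an actual follow-up revision (the patch was later dropped in favor of a different approach):

```diff
+	/*
+	 * No actual DMA takes place here: dma_map_sg() is called only
+	 * so the CPU caches get flushed and the backend running in
+	 * another Xen domain sees consistent buffer contents.
+	 */
 	if (!dma_map_sg(dev->dev, xen_obj->sgt->sgl, xen_obj->sgt->nents,
 			DMA_BIDIRECTIONAL)) {
 		ret = -EFAULT;
 		goto fail_free_sgt;
 	}
```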
[PATCH v2] drm/xen-front: Make shmem backed display buffer coherent
From: Oleksandr Andrushchenko

When GEM backing storage is allocated with drm_gem_get_pages the
backing pages may be cached, thus making it possible that the backend
sees only partial content of the buffer, which may lead to screen
artifacts.

Make sure that the frontend's memory is coherent and the backend
always sees correct display buffer content.

Fixes: c575b7eeb89f ("drm/xen-front: Add support for Xen PV display frontend")

Signed-off-by: Oleksandr Andrushchenko
---
Changes since v1:
 - Remove GFP_USER|__GFP_DMA32 mapping flags (Gerd)
 - Use drm_prime_pages_to_sg directly (Noralf)

 drivers/gpu/drm/xen/xen_drm_front_gem.c | 38 ++++++++++++++++++++++++++++----
 1 file changed, 34 insertions(+), 4 deletions(-)

diff --git a/drivers/gpu/drm/xen/xen_drm_front_gem.c b/drivers/gpu/drm/xen/xen_drm_front_gem.c
index 28bc501af450..0b0d9b4f97dc 100644
--- a/drivers/gpu/drm/xen/xen_drm_front_gem.c
+++ b/drivers/gpu/drm/xen/xen_drm_front_gem.c
@@ -32,8 +32,11 @@ struct xen_gem_object {
 	/* set for buffers allocated by the backend */
 	bool be_alloc;
 
-	/* this is for imported PRIME buffer */
-	struct sg_table *sgt_imported;
+	/*
+	 * this is for imported PRIME buffer or the one allocated via
+	 * drm_gem_get_pages.
+	 */
+	struct sg_table *sgt;
 };
 
 static inline struct xen_gem_object *
@@ -124,8 +127,28 @@ static struct xen_gem_object *gem_create(struct drm_device *dev, size_t size)
 		goto fail;
 	}
 
+	xen_obj->sgt = drm_prime_pages_to_sg(xen_obj->pages,
+					     xen_obj->num_pages);
+	if (IS_ERR_OR_NULL(xen_obj->sgt)) {
+		ret = PTR_ERR(xen_obj->sgt);
+		xen_obj->sgt = NULL;
+		goto fail_put_pages;
+	}
+
+	if (!dma_map_sg(dev->dev, xen_obj->sgt->sgl, xen_obj->sgt->nents,
+			DMA_BIDIRECTIONAL)) {
+		ret = -EFAULT;
+		goto fail_free_sgt;
+	}
+
 	return xen_obj;
 
+fail_free_sgt:
+	sg_free_table(xen_obj->sgt);
+	xen_obj->sgt = NULL;
+fail_put_pages:
+	drm_gem_put_pages(&xen_obj->base, xen_obj->pages, true, false);
+	xen_obj->pages = NULL;
 fail:
 	DRM_ERROR("Failed to allocate buffer with size %zu\n", size);
 	return ERR_PTR(ret);
@@ -148,7 +171,7 @@ void xen_drm_front_gem_free_object_unlocked(struct drm_gem_object *gem_obj)
 	struct xen_gem_object *xen_obj = to_xen_gem_obj(gem_obj);
 
 	if (xen_obj->base.import_attach) {
-		drm_prime_gem_destroy(&xen_obj->base, xen_obj->sgt_imported);
+		drm_prime_gem_destroy(&xen_obj->base, xen_obj->sgt);
 		gem_free_pages_array(xen_obj);
 	} else {
 		if (xen_obj->pages) {
@@ -157,6 +180,13 @@ void xen_drm_front_gem_free_object_unlocked(struct drm_gem_object *gem_obj)
 						  xen_obj->pages);
 				gem_free_pages_array(xen_obj);
 			} else {
+				if (xen_obj->sgt) {
+					dma_unmap_sg(xen_obj->base.dev->dev,
+						     xen_obj->sgt->sgl,
+						     xen_obj->sgt->nents,
+						     DMA_BIDIRECTIONAL);
+					sg_free_table(xen_obj->sgt);
+				}
 				drm_gem_put_pages(&xen_obj->base,
 						  xen_obj->pages, true, false);
 			}
@@ -202,7 +232,7 @@ xen_drm_front_gem_import_sg_table(struct drm_device *dev,
 	if (ret < 0)
 		return ERR_PTR(ret);
 
-	xen_obj->sgt_imported = sgt;
+	xen_obj->sgt = sgt;
 
 	ret = drm_prime_sg_to_page_addr_arrays(sgt, xen_obj->pages, NULL,
 					       xen_obj->num_pages);
-- 
2.20.1