Re: Adreno crash on i.MX53 running 5.3-rc6
On 05/09/2019 20:05, Rob Clark wrote:
> On Thu, Sep 5, 2019 at 10:03 AM Rob Clark wrote:
> > On Wed, Sep 4, 2019 at 11:06 AM Robin Murphy wrote:
> > > On 04/09/2019 01:12, Rob Clark wrote:
> > > > On Tue, Sep 3, 2019 at 12:31 PM Fabio Estevam wrote:
> > > > >
> > > > > Hi Jonathan,
> > > > >
> > > > > On Tue, Sep 3, 2019 at 4:25 PM Jonathan Marek wrote:
> > > > > >
> > > > > > Hi,
> > > > > >
> > > > > > I tried this and it works with patches 4+5 from Rob's series and
> > > > > > changing gpummu to use sg_phys(sg) instead of sg->dma_address
> > > > > > (dma_address isn't set now that dma_map_sg isn't used).
> > > > >
> > > > > Thanks for testing it. I haven't had a chance to test it yet.
> > > > >
> > > > > Rob,
> > > > >
> > > > > I assume your series is targeted to 5.4, correct?
> > > >
> > > > maybe, although Christoph Hellwig didn't seem like a big fan of
> > > > exposing cache ops, and would rather add a new allocation API for
> > > > uncached pages.. so I'm not entirely sure what the way forward will be.
> > >
> > > TBH, the use of map/unmap looked reasonable in the context of
> > > "start/stop using these pages for stuff which may include DMA", so even
> > > if it was cheekily ignoring sg->dma_address I'm not sure I'd really
> > > consider it "abuse" - in comparison, using sync without a prior map
> > > unquestionably violates the API, and means that CONFIG_DMA_API_DEBUG
> > > will be rendered useless with false positives if this driver is active
> > > while trying to debug something else.
> > >
> > > The warning referenced in 0036bc73ccbe represents something being
> > > unmapped which didn't match a corresponding map - from what I can make
> > > of get_pages()/put_pages() it looks like that would need msm_obj->flags
> > > or msm_obj->sgt to change during the lifetime of the object, neither of
> > > which sounds like a thing that should legitimately happen. Are you sure
> > > this isn't all just hiding a subtle bug elsewhere? After all, if what
> > > was being unmapped wasn't right, who says that what's now being synced is?
> >
> > Correct, msm_obj->flags/sgt should not change.
> >
> > I reverted the various patches, and went back to the original setup
> > that used dma_{map,unmap}_sg() to reproduce the original issue that
> > prompted the change in the first place. It is a pretty massive flood
> > of splats, which pretty quickly overflowed the dmesg ring buffer, so I
> > might be missing some things, but I'll poke around some more.
> >
> > The one thing I wonder about, what would happen if the buffer is
> > allocated and dma_map_sg() called before drm/msm attaches its own
> > iommu_domains, and then dma_unmap_sg() afterwards. We aren't actually
> > ever using the iommu domain that DMA API is creating for the device,
> > so all the extra iommu_map/unmap (and tlb flush) is at best
> > unnecessary. But I'm not sure if it could be having some unintended
> > side effects that cause this sort of problem.

Right, one of the semi-intentional side-effects of 43c5bf11a610 is that iommu-dma no longer interferes with unmanaged domains - it will still go and make its own redundant mappings in the unattached default domain, but as long as the DMA API usage is fundamentally sound then it shouldn't actually get in the way.

> it seems like every time (or at least every time we splat), we end up
> w/ iova=f000 .. which doesn't sound likely to be right. Although from
> just looking at the dma-iommu.c code, I'm not sure how this happens.
> And adding some printk's results in enough traces that I can't boot
> for some reason..

Yeah, that's a bogus IOVA for sure, so regardless of how we actually make Adreno happy it would still be interesting to figure out how it came about. Do you see any WARNs from io-pgtable-arm before the one from __iommu_dma_unmap()?

Robin.

___
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel
Re: Adreno crash on i.MX53 running 5.3-rc6
On Thu, Sep 5, 2019 at 3:30 PM Rob Clark wrote: > > On Thu, Sep 5, 2019 at 12:05 PM Rob Clark wrote: > > > > On Thu, Sep 5, 2019 at 10:03 AM Rob Clark wrote: > > > > > > On Wed, Sep 4, 2019 at 11:06 AM Robin Murphy wrote: > > > > > > > > On 04/09/2019 01:12, Rob Clark wrote: > > > > > On Tue, Sep 3, 2019 at 12:31 PM Fabio Estevam > > > > > wrote: > > > > >> > > > > >> Hi Jonathan, > > > > >> > > > > >> On Tue, Sep 3, 2019 at 4:25 PM Jonathan Marek > > > > >> wrote: > > > > >>> > > > > >>> Hi, > > > > >>> > > > > >>> I tried this and it works with patches 4+5 from Rob's series and > > > > >>> changing gpummu to use sg_phys(sg) instead of sg->dma_address > > > > >>> (dma_address isn't set now that dma_map_sg isn't used). > > > > >> > > > > >> Thanks for testing it. I haven't had a chance to test it yet. > > > > >> > > > > >> Rob, > > > > >> > > > > >> I assume your series is targeted to 5.4, correct? > > > > > > > > > > maybe, although Christoph Hellwig didn't seem like a big fan of > > > > > exposing cache ops, and would rather add a new allocation API for > > > > > uncached pages.. so I'm not entirely sure what the way forward will > > > > > be. > > > > > > > > TBH, the use of map/unmap looked reasonable in the context of > > > > "start/stop using these pages for stuff which may include DMA", so even > > > > if it was cheekily ignoring sg->dma_address I'm not sure I'd really > > > > consider it "abuse" - in comparison, using sync without a prior map > > > > unquestionably violates the API, and means that CONFIG_DMA_API_DEBUG > > > > will be rendered useless with false positives if this driver is active > > > > while trying to debug something else. 
> > > > > > > > The warning referenced in 0036bc73ccbe represents something being > > > > unmapped which didn't match a corresponding map - from what I can make > > > > of get_pages()/put_pages() it looks like that would need msm_obj->flags > > > > or msm_obj->sgt to change during the lifetime of the object, neither of > > > > which sounds like a thing that should legitimately happen. Are you sure > > > > this isn't all just hiding a subtle bug elsewhere? After all, if what > > > > was being unmapped wasn't right, who says that what's now being synced > > > > is? > > > > > > > > > > Correct, msm_obj->flags/sgt should not change. > > > > > > I reverted the various patches, and went back to the original setup > > > that used dma_{map,unmap}_sg() to reproduce the original issue that > > > prompted the change in the first place. It is a pretty massive flood > > > of splats, which pretty quickly overflowed the dmesg ring buffer, so I > > > might be missing some things, but I'll poke around some more. > > > > > > The one thing I wonder about, what would happen if the buffer is > > > allocated and dma_map_sg() called before drm/msm attaches it's own > > > iommu_domains, and then dma_unmap_sg() afterwards. We aren't actually > > > ever using the iommu domain that DMA API is creating for the device, > > > so all the extra iommu_map/unmap (and tlb flush) is at best > > > unnecessary. But I'm not sure if it could be having some unintended > > > side effects that cause this sort of problem. > > > > > > > it seems like every time (or at least every time we splat), we end up > > w/ iova=f000 .. which doesn't sound likely to be right. > > Although from just looking at the dma-iommu.c code, I'm not sure how > > this happens. And adding some printk's results in enough traces that > > I can't boot for some reason.. > > > > Ok, I see better what is going on.. at least on the kernel that I'm > using on the yoga c630 laptop, where I have a patch[1] to skip domain > attach. 
> That results in to_smmu_domain(domain)->pgtbl_ops being null,
> so arm_smmu_map() fails. So we skip __finalise_sg() which sets the
> sg_dma_address(). Which causes the failure on unmap.
>
> That said, I'm pretty sure I've seen (or had reported) a similar splat
> (although maybe not so frequent) on devices without that patch (where
> the bootloader isn't enabling scanout). I'll have to switch over to a
> different device that doesn't light up display from bootloader, so
> that I can drop that skip-domain-attach patch.
>
> All that said, this would be much easier if I could do the cache
> operations without all this unneeded iommu stuff. (Not to mention the
> unnecessary TLB flushes that I suspect are also happening.)
>
> [1] https://patchwork.kernel.org/patch/11038793/

fwiw, with https://patchwork.freedesktop.org/series/63096/ we could go back to simply using dma_{map,unmap}_sg() in all cases, as the iommu dma_ops would no longer get in the way.

BR,
-R
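For readers following along, the failure mode Rob describes (arm_smmu_map() failing because pgtbl_ops is NULL, so the __finalise_sg() step never fills in sg_dma_address(), and the later unmap sees a stale, bogus IOVA) can be sketched as a tiny userspace model. All struct and function names below are hypothetical stand-ins for illustration, not the kernel's real API:

```c
#include <assert.h>

/* Hypothetical miniature model of the failure mode described above:
 * if the low-level IOMMU map step fails, the finalise step that fills
 * in sg->dma_address never runs, so a later unmap sees whatever stale
 * value was left there. */

struct mock_sg {
    unsigned long phys;        /* physical address of the segment */
    unsigned long dma_address; /* filled in only on successful map */
};

/* Stands in for arm_smmu_map(): fails when pgtbl_ops is NULL,
 * e.g. when domain attach was skipped. */
static int mock_smmu_map(void *pgtbl_ops, unsigned long iova)
{
    (void)iova;
    return pgtbl_ops ? 0 : -1;
}

/* Stands in for the iommu-dma map path: the "__finalise_sg"
 * equivalent (setting dma_address) only happens when the low-level
 * map succeeded; on failure, 0 entries are mapped. */
static int mock_map_sg(struct mock_sg *sg, void *pgtbl_ops,
                       unsigned long iova)
{
    if (mock_smmu_map(pgtbl_ops, iova) != 0)
        return 0;              /* dma_address left untouched */
    sg->dma_address = iova;
    return 1;
}
```

With a NULL pgtbl_ops the map reports zero mapped entries and dma_address stays stale, which is the shape of the bogus-IOVA unmap splat discussed above.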
Re: Adreno crash on i.MX53 running 5.3-rc6
On Thu, Sep 5, 2019 at 12:05 PM Rob Clark wrote: > > On Thu, Sep 5, 2019 at 10:03 AM Rob Clark wrote: > > > > On Wed, Sep 4, 2019 at 11:06 AM Robin Murphy wrote: > > > > > > On 04/09/2019 01:12, Rob Clark wrote: > > > > On Tue, Sep 3, 2019 at 12:31 PM Fabio Estevam > > > > wrote: > > > >> > > > >> Hi Jonathan, > > > >> > > > >> On Tue, Sep 3, 2019 at 4:25 PM Jonathan Marek > > > >> wrote: > > > >>> > > > >>> Hi, > > > >>> > > > >>> I tried this and it works with patches 4+5 from Rob's series and > > > >>> changing gpummu to use sg_phys(sg) instead of sg->dma_address > > > >>> (dma_address isn't set now that dma_map_sg isn't used). > > > >> > > > >> Thanks for testing it. I haven't had a chance to test it yet. > > > >> > > > >> Rob, > > > >> > > > >> I assume your series is targeted to 5.4, correct? > > > > > > > > maybe, although Christoph Hellwig didn't seem like a big fan of > > > > exposing cache ops, and would rather add a new allocation API for > > > > uncached pages.. so I'm not entirely sure what the way forward will > > > > be. > > > > > > TBH, the use of map/unmap looked reasonable in the context of > > > "start/stop using these pages for stuff which may include DMA", so even > > > if it was cheekily ignoring sg->dma_address I'm not sure I'd really > > > consider it "abuse" - in comparison, using sync without a prior map > > > unquestionably violates the API, and means that CONFIG_DMA_API_DEBUG > > > will be rendered useless with false positives if this driver is active > > > while trying to debug something else. > > > > > > The warning referenced in 0036bc73ccbe represents something being > > > unmapped which didn't match a corresponding map - from what I can make > > > of get_pages()/put_pages() it looks like that would need msm_obj->flags > > > or msm_obj->sgt to change during the lifetime of the object, neither of > > > which sounds like a thing that should legitimately happen. Are you sure > > > this isn't all just hiding a subtle bug elsewhere? 
After all, if what > > > was being unmapped wasn't right, who says that what's now being synced is? > > > > > > > Correct, msm_obj->flags/sgt should not change. > > > > I reverted the various patches, and went back to the original setup > > that used dma_{map,unmap}_sg() to reproduce the original issue that > > prompted the change in the first place. It is a pretty massive flood > > of splats, which pretty quickly overflowed the dmesg ring buffer, so I > > might be missing some things, but I'll poke around some more. > > > > The one thing I wonder about, what would happen if the buffer is > > allocated and dma_map_sg() called before drm/msm attaches it's own > > iommu_domains, and then dma_unmap_sg() afterwards. We aren't actually > > ever using the iommu domain that DMA API is creating for the device, > > so all the extra iommu_map/unmap (and tlb flush) is at best > > unnecessary. But I'm not sure if it could be having some unintended > > side effects that cause this sort of problem. > > > > it seems like every time (or at least every time we splat), we end up > w/ iova=f000 .. which doesn't sound likely to be right. > Although from just looking at the dma-iommu.c code, I'm not sure how > this happens. And adding some printk's results in enough traces that > I can't boot for some reason.. > Ok, I see better what is going on.. at least on the kernel that I'm using on the yoga c630 laptop, where I have a patch[1] to skip domain attach. That results in to_smmu_domain(domain)->pgtbl_ops being null, so arm_smmu_map() fails. So we skip __finalise_sg() which sets the sg_dma_address(). Which causes the failure on unmap. That said, I'm pretty sure I've seen (or had reported) a similar splat (although maybe not so frequent) on devices without that patch (where the bootloader isn't enabling scanout). 
I'll have to switch over to a different device that doesn't light up display from bootloader, so that I can drop that skip-domain-attach patch.

All that said, this would be much easier if I could do the cache operations without all this unneeded iommu stuff. (Not to mention the unnecessary TLB flushes that I suspect are also happening.)

[1] https://patchwork.kernel.org/patch/11038793/

BR,
-R
Re: Adreno crash on i.MX53 running 5.3-rc6
On Thu, Sep 5, 2019 at 10:03 AM Rob Clark wrote: > > On Wed, Sep 4, 2019 at 11:06 AM Robin Murphy wrote: > > > > On 04/09/2019 01:12, Rob Clark wrote: > > > On Tue, Sep 3, 2019 at 12:31 PM Fabio Estevam wrote: > > >> > > >> Hi Jonathan, > > >> > > >> On Tue, Sep 3, 2019 at 4:25 PM Jonathan Marek wrote: > > >>> > > >>> Hi, > > >>> > > >>> I tried this and it works with patches 4+5 from Rob's series and > > >>> changing gpummu to use sg_phys(sg) instead of sg->dma_address > > >>> (dma_address isn't set now that dma_map_sg isn't used). > > >> > > >> Thanks for testing it. I haven't had a chance to test it yet. > > >> > > >> Rob, > > >> > > >> I assume your series is targeted to 5.4, correct? > > > > > > maybe, although Christoph Hellwig didn't seem like a big fan of > > > exposing cache ops, and would rather add a new allocation API for > > > uncached pages.. so I'm not entirely sure what the way forward will > > > be. > > > > TBH, the use of map/unmap looked reasonable in the context of > > "start/stop using these pages for stuff which may include DMA", so even > > if it was cheekily ignoring sg->dma_address I'm not sure I'd really > > consider it "abuse" - in comparison, using sync without a prior map > > unquestionably violates the API, and means that CONFIG_DMA_API_DEBUG > > will be rendered useless with false positives if this driver is active > > while trying to debug something else. > > > > The warning referenced in 0036bc73ccbe represents something being > > unmapped which didn't match a corresponding map - from what I can make > > of get_pages()/put_pages() it looks like that would need msm_obj->flags > > or msm_obj->sgt to change during the lifetime of the object, neither of > > which sounds like a thing that should legitimately happen. Are you sure > > this isn't all just hiding a subtle bug elsewhere? After all, if what > > was being unmapped wasn't right, who says that what's now being synced is? > > > > Correct, msm_obj->flags/sgt should not change. 
>
> I reverted the various patches, and went back to the original setup
> that used dma_{map,unmap}_sg() to reproduce the original issue that
> prompted the change in the first place. It is a pretty massive flood
> of splats, which pretty quickly overflowed the dmesg ring buffer, so I
> might be missing some things, but I'll poke around some more.
>
> The one thing I wonder about, what would happen if the buffer is
> allocated and dma_map_sg() called before drm/msm attaches its own
> iommu_domains, and then dma_unmap_sg() afterwards. We aren't actually
> ever using the iommu domain that DMA API is creating for the device,
> so all the extra iommu_map/unmap (and tlb flush) is at best
> unnecessary. But I'm not sure if it could be having some unintended
> side effects that cause this sort of problem.

it seems like every time (or at least every time we splat), we end up w/ iova=f000 .. which doesn't sound likely to be right. Although from just looking at the dma-iommu.c code, I'm not sure how this happens. And adding some printk's results in enough traces that I can't boot for some reason..

BR,
-R
Re: Adreno crash on i.MX53 running 5.3-rc6
On Wed, Sep 4, 2019 at 11:06 AM Robin Murphy wrote: > > On 04/09/2019 01:12, Rob Clark wrote: > > On Tue, Sep 3, 2019 at 12:31 PM Fabio Estevam wrote: > >> > >> Hi Jonathan, > >> > >> On Tue, Sep 3, 2019 at 4:25 PM Jonathan Marek wrote: > >>> > >>> Hi, > >>> > >>> I tried this and it works with patches 4+5 from Rob's series and > >>> changing gpummu to use sg_phys(sg) instead of sg->dma_address > >>> (dma_address isn't set now that dma_map_sg isn't used). > >> > >> Thanks for testing it. I haven't had a chance to test it yet. > >> > >> Rob, > >> > >> I assume your series is targeted to 5.4, correct? > > > > maybe, although Christoph Hellwig didn't seem like a big fan of > > exposing cache ops, and would rather add a new allocation API for > > uncached pages.. so I'm not entirely sure what the way forward will > > be. > > TBH, the use of map/unmap looked reasonable in the context of > "start/stop using these pages for stuff which may include DMA", so even > if it was cheekily ignoring sg->dma_address I'm not sure I'd really > consider it "abuse" - in comparison, using sync without a prior map > unquestionably violates the API, and means that CONFIG_DMA_API_DEBUG > will be rendered useless with false positives if this driver is active > while trying to debug something else. > > The warning referenced in 0036bc73ccbe represents something being > unmapped which didn't match a corresponding map - from what I can make > of get_pages()/put_pages() it looks like that would need msm_obj->flags > or msm_obj->sgt to change during the lifetime of the object, neither of > which sounds like a thing that should legitimately happen. Are you sure > this isn't all just hiding a subtle bug elsewhere? After all, if what > was being unmapped wasn't right, who says that what's now being synced is? > Correct, msm_obj->flags/sgt should not change. 
I reverted the various patches, and went back to the original setup that used dma_{map,unmap}_sg() to reproduce the original issue that prompted the change in the first place. It is a pretty massive flood of splats, which pretty quickly overflowed the dmesg ring buffer, so I might be missing some things, but I'll poke around some more.

The one thing I wonder about: what would happen if the buffer is allocated and dma_map_sg() called before drm/msm attaches its own iommu_domains, and then dma_unmap_sg() afterwards? We aren't actually ever using the iommu domain that the DMA API is creating for the device, so all the extra iommu_map/unmap (and tlb flush) is at best unnecessary. But I'm not sure if it could be having some unintended side effects that cause this sort of problem.

BR,
-R
Re: Adreno crash on i.MX53 running 5.3-rc6
On 04/09/2019 01:12, Rob Clark wrote:
> On Tue, Sep 3, 2019 at 12:31 PM Fabio Estevam wrote:
> >
> > Hi Jonathan,
> >
> > On Tue, Sep 3, 2019 at 4:25 PM Jonathan Marek wrote:
> > >
> > > Hi,
> > >
> > > I tried this and it works with patches 4+5 from Rob's series and
> > > changing gpummu to use sg_phys(sg) instead of sg->dma_address
> > > (dma_address isn't set now that dma_map_sg isn't used).
> >
> > Thanks for testing it. I haven't had a chance to test it yet.
> >
> > Rob,
> >
> > I assume your series is targeted to 5.4, correct?
>
> maybe, although Christoph Hellwig didn't seem like a big fan of
> exposing cache ops, and would rather add a new allocation API for
> uncached pages.. so I'm not entirely sure what the way forward will be.

TBH, the use of map/unmap looked reasonable in the context of "start/stop using these pages for stuff which may include DMA", so even if it was cheekily ignoring sg->dma_address I'm not sure I'd really consider it "abuse" - in comparison, using sync without a prior map unquestionably violates the API, and means that CONFIG_DMA_API_DEBUG will be rendered useless with false positives if this driver is active while trying to debug something else.

The warning referenced in 0036bc73ccbe represents something being unmapped which didn't match a corresponding map - from what I can make of get_pages()/put_pages() it looks like that would need msm_obj->flags or msm_obj->sgt to change during the lifetime of the object, neither of which sounds like a thing that should legitimately happen. Are you sure this isn't all just hiding a subtle bug elsewhere? After all, if what was being unmapped wasn't right, who says that what's now being synced is?

Robin.
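Robin's point about sync without a prior map can be illustrated with a toy state machine mirroring the pairing rule that CONFIG_DMA_API_DEBUG enforces: sync is only legal between a map and its matching unmap, and every unmap must match an earlier map. This is a hypothetical userspace sketch, not kernel code:

```c
#include <stdbool.h>

/* Toy model of the DMA API pairing rule discussed above. A real
 * dma-debug implementation tracks per-mapping state; here a single
 * buffer with one state flag is enough to show the rule. */

enum buf_state { BUF_UNMAPPED, BUF_MAPPED };

struct dma_buf_model {
    enum buf_state state;
};

/* dma_map_sg() analogue: establishes the mapping. */
static bool model_map(struct dma_buf_model *b)
{
    if (b->state == BUF_MAPPED)
        return false;          /* double map: debug splat */
    b->state = BUF_MAPPED;
    return true;
}

/* dma_sync_sg_for_{cpu,device}() analogue: only valid while mapped -
 * "using sync without a prior map unquestionably violates the API". */
static bool model_sync(const struct dma_buf_model *b)
{
    return b->state == BUF_MAPPED;
}

/* dma_unmap_sg() analogue: must match a corresponding map. */
static bool model_unmap(struct dma_buf_model *b)
{
    if (b->state != BUF_MAPPED)
        return false;          /* unmap without matching map: splat */
    b->state = BUF_UNMAPPED;
    return true;
}
```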
Re: Adreno crash on i.MX53 running 5.3-rc6
Hi,

I tried this and it works with patches 4+5 from Rob's series and changing gpummu to use sg_phys(sg) instead of sg->dma_address (dma_address isn't set now that dma_map_sg isn't used).

Jonathan

On 9/3/19 11:22 AM, Rob Clark wrote:
> On Mon, Sep 2, 2019 at 11:03 AM Fabio Estevam wrote:
> >
> > Hi Robin,
> >
> > On Mon, Sep 2, 2019 at 11:45 AM Robin Murphy wrote:
> >
> > > Try 0036bc73ccbe - that looks like something that CONFIG_DMA_API_DEBUG
> > > should have been screaming about anyway.
> >
> > Thanks for your suggestion.
> >
> > I can successfully boot after reverting the following commits:
> >
> > commit 141db5703c887f46957615cd6616ca28fe4691e0 (HEAD)
> > Author: Fabio Estevam
> > Date: Mon Sep 2 14:58:18 2019 -0300
> >
> >     Revert "drm/msm: stop abusing dma_map/unmap for cache"
> >
> >     This reverts commit 0036bc73ccbe7e600a3468bf8e8879b122252274.
> >
> > commit fa5b1f620f2984c254877d6049214c39c24c8207
> > Author: Fabio Estevam
> > Date: Mon Sep 2 14:56:01 2019 -0300
> >
> >     Revert "drm/msm: Use the correct dma_sync calls in msm_gem"
> >
> >     This reverts commit 3de433c5b38af49a5fc7602721e2ab5d39f1e69c.
> >
> > Rob,
> >
> > What would be the recommended approach for fixing this?
>
> We need a direct way to handle cache, so we can stop trying to trick
> DMA API into doing what we want. Something like this is what I had in
> mind: https://patchwork.freedesktop.org/series/65211/
>
> I guess I could respin that. I'm not really sure of any other way to
> have things working on the different combinations of archs and dma_ops
> that we have. Lately fixing one has been breaking another.
>
> BR,
> -R
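The gpummu change Jonathan describes boils down to this: once dma_map_sg() is no longer called, sg->dma_address is never populated, so the GPU MMU must be programmed from the physical address instead. A minimal hypothetical model follows; the struct and helpers are simplified stand-ins for the kernel's scatterlist and sg_phys(), not the real API:

```c
/* Simplified stand-in for a scatterlist entry. In the kernel,
 * sg_phys(sg) computes page_to_phys(sg_page(sg)) + sg->offset, while
 * sg->dma_address is only valid after a successful dma_map_sg(). */

struct mock_sg {
    unsigned long page_phys;   /* physical address of the backing page */
    unsigned int offset;       /* offset within the page */
    unsigned long dma_address; /* stays 0 without dma_map_sg() */
};

/* Equivalent of the kernel's sg_phys(): page address plus offset. */
static unsigned long mock_sg_phys(const struct mock_sg *sg)
{
    return sg->page_phys + sg->offset;
}

/* Address the gpummu should program: the physical address, not the
 * (now unset) dma_address. */
static unsigned long gpummu_map_addr(const struct mock_sg *sg)
{
    return mock_sg_phys(sg);
}
```

This only works because gpummu does its own translation, so the CPU-visible physical address is exactly what its page tables need.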
Re: Adreno crash on i.MX53 running 5.3-rc6
Hi Rob,

On Tue, Sep 3, 2019 at 9:12 PM Rob Clark wrote:

> In the mean time, it is a bit ugly, but I guess something like this should work:

Yes, this works on an i.MX53 board, thanks:

Tested-by: Fabio Estevam

Is this something you could submit for 5.3?

Thanks
Re: Adreno crash on i.MX53 running 5.3-rc6
On Tue, Sep 3, 2019 at 12:31 PM Fabio Estevam wrote:
>
> Hi Jonathan,
>
> On Tue, Sep 3, 2019 at 4:25 PM Jonathan Marek wrote:
> >
> > Hi,
> >
> > I tried this and it works with patches 4+5 from Rob's series and
> > changing gpummu to use sg_phys(sg) instead of sg->dma_address
> > (dma_address isn't set now that dma_map_sg isn't used).
>
> Thanks for testing it. I haven't had a chance to test it yet.
>
> Rob,
>
> I assume your series is targeted to 5.4, correct?

maybe, although Christoph Hellwig didn't seem like a big fan of exposing cache ops, and would rather add a new allocation API for uncached pages.. so I'm not entirely sure what the way forward will be.

In the mean time, it is a bit ugly, but I guess something like this should work:

diff --git a/drivers/gpu/drm/msm/msm_gem.c b/drivers/gpu/drm/msm/msm_gem.c
index 7263f4373f07..5a6a79fbc9d6 100644
--- a/drivers/gpu/drm/msm/msm_gem.c
+++ b/drivers/gpu/drm/msm/msm_gem.c
@@ -52,7 +52,7 @@ static void sync_for_device(struct msm_gem_object *msm_obj)
 {
 	struct device *dev = msm_obj->base.dev->dev;
 
-	if (get_dma_ops(dev)) {
+	if (get_dma_ops(dev) && IS_ENABLED(CONFIG_ARM64)) {
 		dma_sync_sg_for_device(dev, msm_obj->sgt->sgl,
 				msm_obj->sgt->nents, DMA_BIDIRECTIONAL);
 	} else {
@@ -65,7 +65,7 @@ static void sync_for_cpu(struct msm_gem_object *msm_obj)
 {
 	struct device *dev = msm_obj->base.dev->dev;
 
-	if (get_dma_ops(dev)) {
+	if (get_dma_ops(dev) && IS_ENABLED(CONFIG_ARM64)) {
 		dma_sync_sg_for_cpu(dev, msm_obj->sgt->sgl,
 				msm_obj->sgt->nents, DMA_BIDIRECTIONAL);
 	} else {

BR,
-R

> If this is the case, what we should do about the i.MX5 regression on 5.3?
>
> Would a revert of the two commits be acceptable in 5.3 in order to
> avoid the regression?
>
> Please advise.
>
> Thanks
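The condition added in the hunks above gates the dma_ops sync path on both real dma_ops being present and the build being arm64, falling back to the map/unmap path otherwise. A toy userspace illustration of that dispatch (the enum and function names are made up for this sketch, not kernel code):

```c
#include <stdbool.h>

/* Which cache-maintenance path msm_gem would take in the workaround
 * above. PATH_DMA_SYNC stands for dma_sync_sg_for_{cpu,device}();
 * PATH_MAP_UNMAP stands for the dma_{map,unmap}_sg() fallback. */
enum sync_path { PATH_DMA_SYNC, PATH_MAP_UNMAP };

static enum sync_path pick_sync_path(bool have_dma_ops, bool is_arm64)
{
    /* mirrors: if (get_dma_ops(dev) && IS_ENABLED(CONFIG_ARM64)) */
    return (have_dma_ops && is_arm64) ? PATH_DMA_SYNC : PATH_MAP_UNMAP;
}
```

The point of the workaround is the middle case: on 32-bit ARM (the i.MX53), dma_ops exist but the sync path misbehaves, so the extra arch check forces the old fallback there.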
Re: Adreno crash on i.MX53 running 5.3-rc6
Hi Jonathan, On Tue, Sep 3, 2019 at 4:25 PM Jonathan Marek wrote: > > Hi, > > I tried this and it works with patches 4+5 from Rob's series and > changing gpummu to use sg_phys(sg) instead of sg->dma_address > (dma_address isn't set now that dma_map_sg isn't used). Thanks for testing it. I haven't had a chance to test it yet. Rob, I assume your series is targeted to 5.4, correct? If this is the case, what we should do about the i.MX5 regression on 5.3? Would a revert of the two commits be acceptable in 5.3 in order to avoid the regression? Please advise. Thanks
Re: Adreno crash on i.MX53 running 5.3-rc6
On Mon, Sep 2, 2019 at 11:03 AM Fabio Estevam wrote: > > Hi Robin, > > On Mon, Sep 2, 2019 at 11:45 AM Robin Murphy wrote: > > > Try 0036bc73ccbe - that looks like something that CONFIG_DMA_API_DEBUG > > should have been screaming about anyway. > > Thanks for your suggestion. > > I can successfully boot after reverting the following commits: > > commit 141db5703c887f46957615cd6616ca28fe4691e0 (HEAD) > Author: Fabio Estevam > Date: Mon Sep 2 14:58:18 2019 -0300 > > Revert "drm/msm: stop abusing dma_map/unmap for cache" > > This reverts commit 0036bc73ccbe7e600a3468bf8e8879b122252274. > > commit fa5b1f620f2984c254877d6049214c39c24c8207 > Author: Fabio Estevam > Date: Mon Sep 2 14:56:01 2019 -0300 > > Revert "drm/msm: Use the correct dma_sync calls in msm_gem" > > This reverts commit 3de433c5b38af49a5fc7602721e2ab5d39f1e69c. > > Rob, > > What would be the recommended approach for fixing this? > We need a direct way to handle cache, so we can stop trying to trick DMA API into doing what we want. Something like this is what I had in mind: https://patchwork.freedesktop.org/series/65211/ I guess I could respin that. I'm not really sure of any other way to have things working on the different combinations of archs and dma_ops that we have. Lately fixing one has been breaking another. BR, -R
Re: Adreno crash on i.MX53 running 5.3-rc6
Hi Robin, On Mon, Sep 2, 2019 at 11:45 AM Robin Murphy wrote: > Try 0036bc73ccbe - that looks like something that CONFIG_DMA_API_DEBUG > should have been screaming about anyway. Thanks for your suggestion. I can successfully boot after reverting the following commits: commit 141db5703c887f46957615cd6616ca28fe4691e0 (HEAD) Author: Fabio Estevam Date: Mon Sep 2 14:58:18 2019 -0300 Revert "drm/msm: stop abusing dma_map/unmap for cache" This reverts commit 0036bc73ccbe7e600a3468bf8e8879b122252274. commit fa5b1f620f2984c254877d6049214c39c24c8207 Author: Fabio Estevam Date: Mon Sep 2 14:56:01 2019 -0300 Revert "drm/msm: Use the correct dma_sync calls in msm_gem" This reverts commit 3de433c5b38af49a5fc7602721e2ab5d39f1e69c. Rob, What would be the recommended approach for fixing this? Thanks
Re: Adreno crash on i.MX53 running 5.3-rc6
On 02/09/2019 14:51, Fabio Estevam wrote:
> Hi,
>
> I am getting the following crash when booting the adreno driver on
> i.MX53 running a 5.3-rc6 kernel. Such error does not happen with 5.2 though.
>
> Before I start running a bisect, I am wondering if anyone has any
> ideas about this issue.

Try 0036bc73ccbe - that looks like something that CONFIG_DMA_API_DEBUG should have been screaming about anyway.

Robin.

> Thanks,
> Fabio Estevam
>
> [2.083249] 8<--- cut here ---
> [2.086460] Unable to handle kernel paging request at virtual address 50001000
> [2.094174] pgd = (ptrval)
> [2.096911] [50001000] *pgd=
> [2.100606] Internal error: Oops: 805 [#1] SMP ARM
> [2.105412] Modules linked in:
> [2.108487] CPU: 0 PID: 1 Comm: swapper/0 Not tainted 5.3.0-rc6-00271-g9f159ae07f07 #4
> [2.116411] Hardware name: Freescale i.MX53 (Device Tree Support)
> [2.122538] PC is at v7_dma_clean_range+0x20/0x38
> [2.127254] LR is at __dma_page_cpu_to_dev+0x28/0x90
> [2.132226] pc : []  lr : []  psr: 2013
> [2.138500] sp : d80b5a88 ip : de96c000 fp : d840ce6c
> [2.143732] r10:  r9 : 0001 r8 : d843e010
> [2.148964] r7 :  r6 : 8000 r5 : ddb6c000 r4 :
> [2.155500] r3 : 003f r2 : 0040 r1 : 50008000 r0 : 50001000
> [2.162037] Flags: nzCv IRQs on FIQs on Mode SVC_32 ISA ARM Segment none
> [2.169180] Control: 10c5387d Table: 70004019 DAC: 0051
> [2.174934] Process swapper/0 (pid: 1, stack limit = 0x(ptrval))
> [2.180949] Stack: (0xd80b5a88 to 0xd80b6000)
> [2.185319] 5a80: c011c7bc d8491780 d840ce6c d849b380 c011822c
> [2.193509] 5aa0: c0d01a18 c0118abc c0118a78 d84a0200 0008 c1308908 d838e800 d849a4a8
> [2.201697] 5ac0: d8491780 c06699b4 d8491600 d80b5b20 d84a0200
> [2.209886] 5ae0: d8491780 d8491600 d80b5b20 d8491600 d849a4a8 d84a0200 0003 d84a0358
> [2.218077] 5b00: c1308908 d8491600 d849a4a8 d8491780 d840ce6c c066a55c c1308908 c066a104
> [2.226266] 5b20: 01001000 d84a0200 10700ac6 d849a480 d84a0200 d8491600
> [2.234455] 5b40: e0845000 c1308908 c066a72c d849a480 d840ce6c d840ce00 c1308908
> [2.242643] 5b60: c066b584 d849a488 d849a4a8 c1308908 d840ce6c c066ff40
> [2.250832] 5b80: d849a488 d849a4a8 c1308908 d81b4000 e0845000
> [2.259021] 5ba0: d838e800 c1308908 d8491600 10700ac6 d80b5bc8 d840ce00 d840ce6c 0001
> [2.267210] 5bc0: e0845000 d838e800 c066ece4 0100 10ff
> [2.275399] 5be0: c1308908 0001 d81b4000 0100 0001 10700ac6
> [2.283587] 5c00: c0d6d564 d840ce00 d81b4010 0001 d81b4000 c0d6d564 c1308908 d80b5c48
> [2.291777] 5c20: d838e800 c061f9cc c1029dec d80b5c48 d838e800 c13e8788
> [2.299965] 5c40: c1308928 c102a234 0100 10ff
> [2.308154] 5c60: 0001 a013 10700ac6 c13b7658 d840ce00 d838e800 d81b4000
> [2.316343] 5c80: d840ce00 c1308908 0002 d838f800 c0620514 0001 10700ac6
> [2.324531] 5ca0: d8496440 d81b4010 c1aa1c00 d838e800 c061e070
> [2.332720] 5cc0: c0d6c534 df56cf34 00c8 10700ac6 d81b4010
> [2.340909] 5ce0: d8496440 d838e800 c103acd0 d8496280 c1380488 c06a3e10
> [2.349097] 5d00: d838f800 d838e800 d843e010 d8496440 c1308908
> [2.357286] 5d20: d83f9640 c1380488 c0668554 0006 0007 c13804d4 d83f9640
> [2.365475] 5d40: c1380488 c017ec18 d80c c0c43e40 d843e010 d8496440 0001 c0182a94
> [2.373665] 5d60: 6013 10700ac6 d843e010 d8496280 d8496400 0018 d8496440 0001
> [2.381854] 5d80: c13804d4 d83f9640 c1380488 c06a4280 c1380488 c0d764f8 d8496440
> [2.390044] 5da0: c1380488 d843e010 c0d764f8 c1308908 c13ef300 c06a44f0
> [2.398232] 5dc0: c0d8a0dc dffcc6f0 d843e010 dffcc6f0 d843e010 c06680b8
> [2.406421] 5de0: d84988c0 d83f9640 d84988c0 d84989a0 d8498230 10700ac6 0001 d843e010
> [2.414610] 5e00: c137eec0 c137eec0 c13ef300 c06ac1a0
> [2.422799] 5e20: d843e010 c1aa40dc c1aa40e0 c137eec0 c06aa014 d843e010 c137eec0
> [2.430988] 5e40: c137eec0 c1308908 c13e9880 c13e85d4 c06aa368 c1308908 c13e9880
> [2.439178] 5e60: c13e85d4 d843e010 c137eec0 c1308908 c13e9880 c13e85d4 c06aa618
> [2.447367] 5e80: c137eec0 d843e010 c06aa6a4 c137eec0 c06aa620 c06a844c
> [2.46] 5ea0: d80888d4 d80888a4 d84914d0 10700ac6 d80888d4 c137eec0 d8494f00 c1380d28
> [2.463745] 5ec0: c06a946c c105f3d4 c1308908 c137eec0 c1308908
> [2.471934] 5ee0: c125fdd0 c06ab304 c1308928 c1308908 c0103178 0109
> [2.480123] 5f00: dc6e dc00 c1126860 0109 0109 c014dc88 c11253ac c10607a0