Re: [PATCH 3/3] arm64: dts: meson: Describe G12b GPU as coherent
On 05/10/2020 09:39, Boris Brezillon wrote: On Mon, 5 Oct 2020 09:34:06 +0100 Steven Price wrote: On 05/10/2020 09:15, Boris Brezillon wrote: Hi Robin, Neil, On Wed, 16 Sep 2020 10:26:43 +0200 Neil Armstrong wrote: Hi Robin, On 16/09/2020 01:51, Robin Murphy wrote: According to a downstream commit I found in the Khadas vendor kernel, the GPU on G12b is wired up for ACE-lite, so (now that Panfrost knows how to handle this properly) we should describe it as such. Otherwise the mismatch leads to all manner of fun with mismatched attributes and inadvertently snooping stale data from caches, which would account for at least some of the brokenness observed on this platform. Signed-off-by: Robin Murphy --- arch/arm64/boot/dts/amlogic/meson-g12b.dtsi | 4 1 file changed, 4 insertions(+) diff --git a/arch/arm64/boot/dts/amlogic/meson-g12b.dtsi b/arch/arm64/boot/dts/amlogic/meson-g12b.dtsi index 9b8548e5f6e5..ee8fcae9f9f0 100644 --- a/arch/arm64/boot/dts/amlogic/meson-g12b.dtsi +++ b/arch/arm64/boot/dts/amlogic/meson-g12b.dtsi @@ -135,3 +135,7 @@ map1 { }; }; }; + + { + dma-coherent; +}; Thanks a lot for digging, I'll run a test to confirm it fixes the issue ! Sorry for the late reply. I triggered a dEQP run with this patch applied and I see a bunch of "panfrost ffe4.gpu: matching BO is not heap type" errors (see below for a full backtrace). That doesn't seem to happen when we drop this dma-coherent property. 
[ 690.945731] [ cut here ]
[ 690.950003] panfrost ffe4.gpu: matching BO is not heap type (GPU VA = 319a000)
[ 690.950051] WARNING: CPU: 0 PID: 120 at drivers/gpu/drm/panfrost/panfrost_mmu.c:465 panfrost_mmu_irq_handler_thread+0x47c/0x650
[ 690.968854] Modules linked in:
[ 690.971878] CPU: 0 PID: 120 Comm: irq/27-panfrost Tainted: GW 5.9.0-rc5-02434-g7d8109ec5a42 #784
[ 690.981964] Hardware name: Khadas VIM3 (DT)
[ 690.986107] pstate: 6005 (nZCv daif -PAN -UAO BTYPE=--)
[ 690.991627] pc : panfrost_mmu_irq_handler_thread+0x47c/0x650
[ 690.997232] lr : panfrost_mmu_irq_handler_thread+0x47c/0x650
[ 691.002836] sp : 800011bcbcd0
[ 691.006114] x29: 800011bcbcf0 x28: f3fe3800
[ 691.011375] x27: ceaf5350 x26: ca5fc500
[ 691.016636] x25: f32409c0 x24: 0001
[ 691.021897] x23: f3240880 x22: f3e3a800
[ 691.027159] x21: x20:
[ 691.032420] x19: 00010001 x18: 0020
[ 691.037681] x17: x16:
[ 691.042942] x15: f3fe3c70 x14:
[ 691.048204] x13: 8000116c2428 x12: 8000116c2086
[ 691.053466] x11: 800011bcbcd0 x10: 800011bcbcd0
[ 691.058727] x9 : fffe x8 :
[ 691.063988] x7 : 7420706165682074 x6 : 8000116c1816
[ 691.069249] x5 : x4 :
[ 691.074510] x3 : x2 : 8000e348c000
[ 691.079771] x1 : f1b91ff9af2df000 x0 :
[ 691.085033] Call trace:
[ 691.087452] panfrost_mmu_irq_handler_thread+0x47c/0x650
[ 691.092712] irq_thread_fn+0x2c/0xa0
[ 691.096246] irq_thread+0x184/0x208
[ 691.099699] kthread+0x140/0x160
[ 691.102890] ret_from_fork+0x10/0x34
[ 691.106424] ---[ end trace b5dd8c2dfada8236 ]---

It's quite possible this is caused by the GPU seeing a stale page table entry, so perhaps coherency isn't working as well as it should...

Do you get an "Unhandled Page fault" message after this?

Yep (see below).

--->8---
[...]
[ 689.805864] panfrost ffe4.gpu: Unhandled Page fault in AS0 at VA 0x03146080 [ 689.805864] Reason: TODO [ 689.805864] raw fault status: 0x10003C3 [ 689.805864] decoded fault status: SLAVE FAULT [ 689.805864] exception type 0xC3: TRANSLATION_FAULT_LEVEL3 [ 689.805864] access type 0x3: WRITE [ 689.805864] source id 0x100 [ 690.170419] panfrost ffe4.gpu: gpu sched timeout, js=1, config=0x7300, status=0x8, head=0x3101100, tail=0x3101100, sched_job=4b442768 [ 690.770373] panfrost ffe4.gpu: error powering up gpu shader [ 690.945123] panfrost ffe4.gpu: error powering up gpu shader [ 690.945731] [ cut here ] That's a write fault from level 3 of the page table triggered by shader core 0 in a fragment job. So could be writing out the frame buffer. It would be interesting to see if a patch like below would work round it: 8< diff --git a/drivers/gpu/drm/panfrost/panfrost_mmu.c b/drivers/gpu/drm/panfrost/panfrost_mmu.c index e8f7b11352d2..5144860afdea 100644 --- a/drivers/gpu/drm/panfrost/panfrost_mmu.c +++ b/drivers/gpu/drm/panfrost/panfrost_mmu.c @@ -460,9 +460,12 @@ static int panfrost_mmu_map_fault_addr(struct panfrost_device *pfdev, int as, bo = bomapping->obj; if (!bo->is_heap) { - dev_WARN(pfdev->dev, "matching BO is not heap type (GPU VA = %llx)", +
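For readers following the log, the decoded fields in the "Unhandled Page fault" message above can be reproduced from the raw fault status value. The following is a minimal sketch in C, assuming the bit layout implied by the log itself (exception type in bits 7:0, access type in bits 9:8, source id in bits 31:16). The helper names are hypothetical and this is not the Panfrost driver's code.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

/*
 * Hypothetical helpers. The bit positions are an assumption inferred from
 * the log lines quoted above, not copied from the Panfrost driver.
 */
static inline uint32_t fault_exception_type(uint32_t status)
{
	return status & 0xFFu;		/* bits 7:0 */
}

static inline uint32_t fault_access_type(uint32_t status)
{
	return (status >> 8) & 0x3u;	/* bits 9:8 */
}

static inline uint32_t fault_source_id(uint32_t status)
{
	return status >> 16;		/* bits 31:16 */
}

/* Print the fields in the same order as the kernel message. */
static void print_fault_status(uint32_t status)
{
	printf("raw fault status: 0x%X\n", (unsigned)status);
	printf("exception type 0x%X\n", (unsigned)fault_exception_type(status));
	printf("access type 0x%X\n", (unsigned)fault_access_type(status));
	printf("source id 0x%X\n", (unsigned)fault_source_id(status));
}

/* Check the sketch against the value reported in the log. */
static void check_logged_value(void)
{
	const uint32_t logged = 0x10003C3u;

	assert(fault_exception_type(logged) == 0xC3u);	/* TRANSLATION_FAULT_LEVEL3 */
	assert(fault_access_type(logged) == 0x3u);	/* WRITE */
	assert(fault_source_id(logged) == 0x100u);
}
```

Calling `check_logged_value()` and then `print_fault_status(0x10003C3u)` from a `main()` confirms that the three decoded values in the log (0xC3, 0x3, 0x100) are consistent with the raw status 0x10003C3 under this assumed layout.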
Re: [PATCH 3/3] arm64: dts: meson: Describe G12b GPU as coherent
On 05/10/2020 09:15, Boris Brezillon wrote: Hi Robin, Neil, On Wed, 16 Sep 2020 10:26:43 +0200 Neil Armstrong wrote: Hi Robin, On 16/09/2020 01:51, Robin Murphy wrote: According to a downstream commit I found in the Khadas vendor kernel, the GPU on G12b is wired up for ACE-lite, so (now that Panfrost knows how to handle this properly) we should describe it as such. Otherwise the mismatch leads to all manner of fun with mismatched attributes and inadvertently snooping stale data from caches, which would account for at least some of the brokenness observed on this platform. Signed-off-by: Robin Murphy --- arch/arm64/boot/dts/amlogic/meson-g12b.dtsi | 4 1 file changed, 4 insertions(+) diff --git a/arch/arm64/boot/dts/amlogic/meson-g12b.dtsi b/arch/arm64/boot/dts/amlogic/meson-g12b.dtsi index 9b8548e5f6e5..ee8fcae9f9f0 100644 --- a/arch/arm64/boot/dts/amlogic/meson-g12b.dtsi +++ b/arch/arm64/boot/dts/amlogic/meson-g12b.dtsi @@ -135,3 +135,7 @@ map1 { }; }; }; + + { + dma-coherent; +}; Thanks a lot for digging, I'll run a test to confirm it fixes the issue ! Sorry for the late reply. I triggered a dEQP run with this patch applied and I see a bunch of "panfrost ffe4.gpu: matching BO is not heap type" errors (see below for a full backtrace). That doesn't seem to happen when we drop this dma-coherent property. 
[ 690.945731] [ cut here ]
[ 690.950003] panfrost ffe4.gpu: matching BO is not heap type (GPU VA = 319a000)
[ 690.950051] WARNING: CPU: 0 PID: 120 at drivers/gpu/drm/panfrost/panfrost_mmu.c:465 panfrost_mmu_irq_handler_thread+0x47c/0x650
[ 690.968854] Modules linked in:
[ 690.971878] CPU: 0 PID: 120 Comm: irq/27-panfrost Tainted: GW 5.9.0-rc5-02434-g7d8109ec5a42 #784
[ 690.981964] Hardware name: Khadas VIM3 (DT)
[ 690.986107] pstate: 6005 (nZCv daif -PAN -UAO BTYPE=--)
[ 690.991627] pc : panfrost_mmu_irq_handler_thread+0x47c/0x650
[ 690.997232] lr : panfrost_mmu_irq_handler_thread+0x47c/0x650
[ 691.002836] sp : 800011bcbcd0
[ 691.006114] x29: 800011bcbcf0 x28: f3fe3800
[ 691.011375] x27: ceaf5350 x26: ca5fc500
[ 691.016636] x25: f32409c0 x24: 0001
[ 691.021897] x23: f3240880 x22: f3e3a800
[ 691.027159] x21: x20:
[ 691.032420] x19: 00010001 x18: 0020
[ 691.037681] x17: x16:
[ 691.042942] x15: f3fe3c70 x14:
[ 691.048204] x13: 8000116c2428 x12: 8000116c2086
[ 691.053466] x11: 800011bcbcd0 x10: 800011bcbcd0
[ 691.058727] x9 : fffe x8 :
[ 691.063988] x7 : 7420706165682074 x6 : 8000116c1816
[ 691.069249] x5 : x4 :
[ 691.074510] x3 : x2 : 8000e348c000
[ 691.079771] x1 : f1b91ff9af2df000 x0 :
[ 691.085033] Call trace:
[ 691.087452] panfrost_mmu_irq_handler_thread+0x47c/0x650
[ 691.092712] irq_thread_fn+0x2c/0xa0
[ 691.096246] irq_thread+0x184/0x208
[ 691.099699] kthread+0x140/0x160
[ 691.102890] ret_from_fork+0x10/0x34
[ 691.106424] ---[ end trace b5dd8c2dfada8236 ]---

It's quite possible this is caused by the GPU seeing a stale page table entry, so perhaps coherency isn't working as well as it should...

Do you get an "Unhandled Page fault" message after this? It might give some clues. Coherency issues are a pain to debug though and it's of course possible there are issues on this specific platform.

Steve
___
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel
Re: [PATCH 3/3] arm64: dts: meson: Describe G12b GPU as coherent
On Mon, 5 Oct 2020 09:34:06 +0100 Steven Price wrote: > On 05/10/2020 09:15, Boris Brezillon wrote: > > Hi Robin, Neil, > > > > On Wed, 16 Sep 2020 10:26:43 +0200 > > Neil Armstrong wrote: > > > >> Hi Robin, > >> > >> On 16/09/2020 01:51, Robin Murphy wrote: > >>> According to a downstream commit I found in the Khadas vendor kernel, > >>> the GPU on G12b is wired up for ACE-lite, so (now that Panfrost knows > >>> how to handle this properly) we should describe it as such. Otherwise > >>> the mismatch leads to all manner of fun with mismatched attributes and > >>> inadvertently snooping stale data from caches, which would account for > >>> at least some of the brokenness observed on this platform. > >>> > >>> Signed-off-by: Robin Murphy > >>> --- > >>> arch/arm64/boot/dts/amlogic/meson-g12b.dtsi | 4 > >>> 1 file changed, 4 insertions(+) > >>> > >>> diff --git a/arch/arm64/boot/dts/amlogic/meson-g12b.dtsi > >>> b/arch/arm64/boot/dts/amlogic/meson-g12b.dtsi > >>> index 9b8548e5f6e5..ee8fcae9f9f0 100644 > >>> --- a/arch/arm64/boot/dts/amlogic/meson-g12b.dtsi > >>> +++ b/arch/arm64/boot/dts/amlogic/meson-g12b.dtsi > >>> @@ -135,3 +135,7 @@ map1 { > >>> }; > >>> }; > >>> }; > >>> + > >>> + { > >>> + dma-coherent; > >>> +}; > >>> > >> > >> Thanks a lot for digging, I'll run a test to confirm it fixes the issue ! > > > > Sorry for the late reply. I triggered a dEQP run with this patch applied > > and I see a bunch of "panfrost ffe4.gpu: matching BO is not heap type" > > errors (see below for a full backtrace). That doesn't seem to happen when > > we drop this dma-coherent property. 
> > > > [ 690.945731] [ cut here ] > > [ 690.950003] panfrost ffe4.gpu: matching BO is not heap type (GPU VA > > = 319a000) > > [ 690.950051] WARNING: CPU: 0 PID: 120 at > > drivers/gpu/drm/panfrost/panfrost_mmu.c:465 > > panfrost_mmu_irq_handler_thread+0x47c/0x650 > > [ 690.968854] Modules linked in: > > [ 690.971878] CPU: 0 PID: 120 Comm: irq/27-panfrost Tainted: GW > > 5.9.0-rc5-02434-g7d8109ec5a42 #784 > > [ 690.981964] Hardware name: Khadas VIM3 (DT) > > [ 690.986107] pstate: 6005 (nZCv daif -PAN -UAO BTYPE=--) > > [ 690.991627] pc : panfrost_mmu_irq_handler_thread+0x47c/0x650 > > [ 690.997232] lr : panfrost_mmu_irq_handler_thread+0x47c/0x650 > > [ 691.002836] sp : 800011bcbcd0 > > [ 691.006114] x29: 800011bcbcf0 x28: f3fe3800 > > [ 691.011375] x27: ceaf5350 x26: ca5fc500 > > [ 691.016636] x25: f32409c0 x24: 0001 > > [ 691.021897] x23: f3240880 x22: f3e3a800 > > [ 691.027159] x21: x20: > > [ 691.032420] x19: 00010001 x18: 0020 > > [ 691.037681] x17: x16: > > [ 691.042942] x15: f3fe3c70 x14: > > [ 691.048204] x13: 8000116c2428 x12: 8000116c2086 > > [ 691.053466] x11: 800011bcbcd0 x10: 800011bcbcd0 > > [ 691.058727] x9 : fffe x8 : > > [ 691.063988] x7 : 7420706165682074 x6 : 8000116c1816 > > [ 691.069249] x5 : x4 : > > [ 691.074510] x3 : x2 : 8000e348c000 > > [ 691.079771] x1 : f1b91ff9af2df000 x0 : > > [ 691.085033] Call trace: > > [ 691.087452] panfrost_mmu_irq_handler_thread+0x47c/0x650 > > [ 691.092712] irq_thread_fn+0x2c/0xa0 > > [ 691.096246] irq_thread+0x184/0x208 > > [ 691.099699] kthread+0x140/0x160 > > [ 691.102890] ret_from_fork+0x10/0x34 > > [ 691.106424] ---[ end trace b5dd8c2dfada8236 ]--- > > > > It's quite possible this is caused by the GPU seeing a stale page table > entry, so perhaps coherency isn't working as well as it should... > > Do you get an "Unhandled Page fault" message after this? Yep (see below). 
--->8--- [ 689.640491] [ cut here ] [ 689.644754] panfrost ffe4.gpu: matching BO is not heap type (GPU VA = 3146000) [ 689.644802] WARNING: CPU: 0 PID: 120 at drivers/gpu/drm/panfrost/panfrost_mmu.c:465 panfrost_mmu_irq_handler_thread+0x47c/0x650 [ 689.663607] Modules linked in: [ 689.31] CPU: 0 PID: 120 Comm: irq/27-panfrost Tainted: GW 5.9.0-rc5-02434-g7d8109ec5a42 #784 [ 689.676717] Hardware name: Khadas VIM3 (DT) [ 689.680860] pstate: 6005 (nZCv daif -PAN -UAO BTYPE=--) [ 689.686380] pc : panfrost_mmu_irq_handler_thread+0x47c/0x650 [ 689.691987] lr : panfrost_mmu_irq_handler_thread+0x47c/0x650 [ 689.697590] sp : 800011bcbcd0 [ 689.700867] x29: 800011bcbcf0 x28: f3fe3800 [ 689.706128] x27: d89cf750 x26: da34a800 [ 689.711389] x25: f32409c0 x24: 0001 [ 689.716650] x23: f3240880 x22: d456e000 [ 689.721911] x21: x20: [ 689.727173] x19: 00010001 x18: 0020 [
Re: [PATCH 3/3] arm64: dts: meson: Describe G12b GPU as coherent
Hi Robin, Neil, On Wed, 16 Sep 2020 10:26:43 +0200 Neil Armstrong wrote: > Hi Robin, > > On 16/09/2020 01:51, Robin Murphy wrote: > > According to a downstream commit I found in the Khadas vendor kernel, > > the GPU on G12b is wired up for ACE-lite, so (now that Panfrost knows > > how to handle this properly) we should describe it as such. Otherwise > > the mismatch leads to all manner of fun with mismatched attributes and > > inadvertently snooping stale data from caches, which would account for > > at least some of the brokenness observed on this platform. > > > > Signed-off-by: Robin Murphy > > --- > > arch/arm64/boot/dts/amlogic/meson-g12b.dtsi | 4 > > 1 file changed, 4 insertions(+) > > > > diff --git a/arch/arm64/boot/dts/amlogic/meson-g12b.dtsi > > b/arch/arm64/boot/dts/amlogic/meson-g12b.dtsi > > index 9b8548e5f6e5..ee8fcae9f9f0 100644 > > --- a/arch/arm64/boot/dts/amlogic/meson-g12b.dtsi > > +++ b/arch/arm64/boot/dts/amlogic/meson-g12b.dtsi > > @@ -135,3 +135,7 @@ map1 { > > }; > > }; > > }; > > + > > + { > > + dma-coherent; > > +}; > > > > Thanks a lot for digging, I'll run a test to confirm it fixes the issue ! Sorry for the late reply. I triggered a dEQP run with this patch applied and I see a bunch of "panfrost ffe4.gpu: matching BO is not heap type" errors (see below for a full backtrace). That doesn't seem to happen when we drop this dma-coherent property. 
[ 690.945731] [ cut here ]
[ 690.950003] panfrost ffe4.gpu: matching BO is not heap type (GPU VA = 319a000)
[ 690.950051] WARNING: CPU: 0 PID: 120 at drivers/gpu/drm/panfrost/panfrost_mmu.c:465 panfrost_mmu_irq_handler_thread+0x47c/0x650
[ 690.968854] Modules linked in:
[ 690.971878] CPU: 0 PID: 120 Comm: irq/27-panfrost Tainted: GW 5.9.0-rc5-02434-g7d8109ec5a42 #784
[ 690.981964] Hardware name: Khadas VIM3 (DT)
[ 690.986107] pstate: 6005 (nZCv daif -PAN -UAO BTYPE=--)
[ 690.991627] pc : panfrost_mmu_irq_handler_thread+0x47c/0x650
[ 690.997232] lr : panfrost_mmu_irq_handler_thread+0x47c/0x650
[ 691.002836] sp : 800011bcbcd0
[ 691.006114] x29: 800011bcbcf0 x28: f3fe3800
[ 691.011375] x27: ceaf5350 x26: ca5fc500
[ 691.016636] x25: f32409c0 x24: 0001
[ 691.021897] x23: f3240880 x22: f3e3a800
[ 691.027159] x21: x20:
[ 691.032420] x19: 00010001 x18: 0020
[ 691.037681] x17: x16:
[ 691.042942] x15: f3fe3c70 x14:
[ 691.048204] x13: 8000116c2428 x12: 8000116c2086
[ 691.053466] x11: 800011bcbcd0 x10: 800011bcbcd0
[ 691.058727] x9 : fffe x8 :
[ 691.063988] x7 : 7420706165682074 x6 : 8000116c1816
[ 691.069249] x5 : x4 :
[ 691.074510] x3 : x2 : 8000e348c000
[ 691.079771] x1 : f1b91ff9af2df000 x0 :
[ 691.085033] Call trace:
[ 691.087452] panfrost_mmu_irq_handler_thread+0x47c/0x650
[ 691.092712] irq_thread_fn+0x2c/0xa0
[ 691.096246] irq_thread+0x184/0x208
[ 691.099699] kthread+0x140/0x160
[ 691.102890] ret_from_fork+0x10/0x34
[ 691.106424] ---[ end trace b5dd8c2dfada8236 ]---
Re: [PATCH 3/3] arm64: dts: meson: Describe G12b GPU as coherent
On 16/09/2020 01:51, Robin Murphy wrote:
> According to a downstream commit I found in the Khadas vendor kernel,
> the GPU on G12b is wired up for ACE-lite, so (now that Panfrost knows
> how to handle this properly) we should describe it as such. Otherwise
> the mismatch leads to all manner of fun with mismatched attributes and
> inadvertently snooping stale data from caches, which would account for
> at least some of the brokenness observed on this platform.
>
> Signed-off-by: Robin Murphy
> ---
>  arch/arm64/boot/dts/amlogic/meson-g12b.dtsi | 4 ++++
>  1 file changed, 4 insertions(+)
>
> diff --git a/arch/arm64/boot/dts/amlogic/meson-g12b.dtsi b/arch/arm64/boot/dts/amlogic/meson-g12b.dtsi
> index 9b8548e5f6e5..ee8fcae9f9f0 100644
> --- a/arch/arm64/boot/dts/amlogic/meson-g12b.dtsi
> +++ b/arch/arm64/boot/dts/amlogic/meson-g12b.dtsi
> @@ -135,3 +135,7 @@ map1 {
>  			};
>  		};
>  	};
> +
> + {
> +	dma-coherent;
> +};
>

Reviewed-by: Neil Armstrong
Re: [PATCH 3/3] arm64: dts: meson: Describe G12b GPU as coherent
Hi Robin,

On 16/09/2020 01:51, Robin Murphy wrote:
> According to a downstream commit I found in the Khadas vendor kernel,
> the GPU on G12b is wired up for ACE-lite, so (now that Panfrost knows
> how to handle this properly) we should describe it as such. Otherwise
> the mismatch leads to all manner of fun with mismatched attributes and
> inadvertently snooping stale data from caches, which would account for
> at least some of the brokenness observed on this platform.
>
> Signed-off-by: Robin Murphy
> ---
>  arch/arm64/boot/dts/amlogic/meson-g12b.dtsi | 4 ++++
>  1 file changed, 4 insertions(+)
>
> diff --git a/arch/arm64/boot/dts/amlogic/meson-g12b.dtsi b/arch/arm64/boot/dts/amlogic/meson-g12b.dtsi
> index 9b8548e5f6e5..ee8fcae9f9f0 100644
> --- a/arch/arm64/boot/dts/amlogic/meson-g12b.dtsi
> +++ b/arch/arm64/boot/dts/amlogic/meson-g12b.dtsi
> @@ -135,3 +135,7 @@ map1 {
>  			};
>  		};
>  	};
> +
> + {
> +	dma-coherent;
> +};
>

Thanks a lot for digging, I'll run a test to confirm it fixes the issue !

Neil
[PATCH 3/3] arm64: dts: meson: Describe G12b GPU as coherent
According to a downstream commit I found in the Khadas vendor kernel,
the GPU on G12b is wired up for ACE-lite, so (now that Panfrost knows
how to handle this properly) we should describe it as such. Otherwise
the mismatch leads to all manner of fun with mismatched attributes and
inadvertently snooping stale data from caches, which would account for
at least some of the brokenness observed on this platform.

Signed-off-by: Robin Murphy
---
 arch/arm64/boot/dts/amlogic/meson-g12b.dtsi | 4 ++++
 1 file changed, 4 insertions(+)

diff --git a/arch/arm64/boot/dts/amlogic/meson-g12b.dtsi b/arch/arm64/boot/dts/amlogic/meson-g12b.dtsi
index 9b8548e5f6e5..ee8fcae9f9f0 100644
--- a/arch/arm64/boot/dts/amlogic/meson-g12b.dtsi
+++ b/arch/arm64/boot/dts/amlogic/meson-g12b.dtsi
@@ -135,3 +135,7 @@ map1 {
 			};
 		};
 	};
+
+ {
+	dma-coherent;
+};
-- 
2.28.0.dirty