On 29/07/2019 12:29, Jose Abreu wrote:
++ Catalin, Will (ARM64 Maintainers)

From: Jon Hunter <jonath...@nvidia.com>
Date: Jul/29/2019, 11:55:18 (UTC+00:00)


On 29/07/2019 09:16, Jose Abreu wrote:
From: Jose Abreu <joab...@synopsys.com>
Date: Jul/27/2019, 16:56:37 (UTC+00:00)

From: Jon Hunter <jonath...@nvidia.com>
Date: Jul/26/2019, 15:11:00 (UTC+00:00)


On 25/07/2019 16:12, Jose Abreu wrote:
From: Jon Hunter <jonath...@nvidia.com>
Date: Jul/25/2019, 15:25:59 (UTC+00:00)


On 25/07/2019 14:26, Jose Abreu wrote:

...

Well, I wasn't expecting that :/

Per the documentation of barriers, I think we should set the descriptor
fields, then issue a barrier, and only then give ownership to the HW, so
that the remaining fields are coherent before the owner bit is set.

Anyway, can you also add a dma_rmb() after the call to
stmmac_rx_status()?
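
For illustration, a minimal sketch of the ordering being suggested here
(the field and flag names are simplified stand-ins, not the exact stmmac
definitions):

/* Writer side: fill every descriptor field first... */
desc->buf_addr = cpu_to_le32(dma_addr);
desc->ctrl = cpu_to_le32(len);

/* ...make those writes visible to the device... */
dma_wmb();

/* ...and only then hand the descriptor over to hardware. */
desc->status = cpu_to_le32(DESC_OWN);

/* Reader side: order the ownership/status read against reads of
 * the remaining fields, so stale data is not picked up.
 */
status = le32_to_cpu(READ_ONCE(desc->status));
if (!(status & DESC_OWN)) {
	dma_rmb();
	len = le32_to_cpu(desc->ctrl) & DESC_LEN_MASK;
}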

Yes. I removed the debug print and added the barrier, but that did not help.

So, I was finally able to set up NFS using your replicated setup and I
can't see the issue :(

The only difference I have from yours is that I'm using TCP in NFS
whilst you (I believe, from the logs) use UDP.

So I tried TCP by setting the kernel boot params 'nfsvers=3' and
'proto=tcp', and this does appear to be more stable, but not 100% so:
it still fails in the same place about 50% of the time.
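
For reference, the relevant part of such a command line (the server
address and export path here are made up) would look something like:

root=/dev/nfs ip=dhcp rw nfsroot=192.168.1.1:/srv/nfsroot,nfsvers=3,proto=tcp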

You do have flow control active, right? And your HW FIFO size is >= 4k?

How can I verify if flow control is active?

You can check it by dumping register MTL_RxQ_Operation_Mode (0xd30).

Where would be the appropriate place to dump this? After probe? It would
be best if you could share a code snippet of where to dump this.
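
A minimal sketch of such a dump (assuming stmmac's priv->ioaddr register
base; where exactly it runs matters less than that it runs after the MTL
has been configured):

/* 0xd30 is the MTL_RxQ_Operation_Mode offset quoted above (queue 0);
 * EHFC, bit 7, indicates HW flow control is enabled.
 */
u32 rx_op = readl(priv->ioaddr + 0xd30);

dev_info(priv->device, "MTL_RxQ0_Operation_Mode = 0x%08x\n", rx_op);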

Can you also add IOMMU debug in the file "drivers/iommu/iommu.c"?
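
Assuming the kernel has CONFIG_DYNAMIC_DEBUG enabled, the debug
statements in that file can also be switched on from the boot command
line instead of patching:

dyndbg="file drivers/iommu/iommu.c +p"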

You can find a boot log here:

https://paste.ubuntu.com/p/qtRqtYKHGF/

And, please try attached debug patch.

With this patch it appears to boot fine. So far no issues seen.

Thank you for testing.

Hi Catalin and Will,

Sorry to add you to such a long thread, but we are seeing a DMA issue
with the stmmac driver on an ARM64 platform with the IOMMU enabled.

The issue seems to be solved when the buffers allocated for DMA-based
transfers are *not* mapped with the DMA_ATTR_SKIP_CPU_SYNC flag *OR*
when the IOMMU is disabled.

Notice that, after a transfer is done, we do use
dma_sync_single_for_{cpu,device} and then we reuse *the same* page for
another transfer.
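
The pattern being described is roughly the following (a simplified
sketch with illustrative names; dir is DMA_FROM_DEVICE for RX buffers):

/* Map once at ring setup, deferring CPU cache maintenance. */
dma_addr = dma_map_page_attrs(dev, page, 0, PAGE_SIZE,
			      DMA_FROM_DEVICE, DMA_ATTR_SKIP_CPU_SYNC);
if (dma_mapping_error(dev, dma_addr))
	goto err;

/* For each use: hand the buffer to the device... */
dma_sync_single_for_device(dev, dma_addr, buf_len, DMA_FROM_DEVICE);
/* ...device writes the packet via DMA... */

/* ...then reclaim it for the CPU and reuse the same page. */
dma_sync_single_for_cpu(dev, dma_addr, buf_len, DMA_FROM_DEVICE);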

Can you please comment on whether DMA_ATTR_SKIP_CPU_SYNC cannot be used
on ARM64 platforms with an IOMMU?

In terms of what they do, there should be no difference on arm64 between:

dma_map_page(..., dir);
...
dma_unmap_page(..., dir);

and:

dma_map_page_attrs(..., dir, DMA_ATTR_SKIP_CPU_SYNC);
dma_sync_single_for_device(..., dir);
...
dma_sync_single_for_cpu(..., dir);
dma_unmap_page_attrs(..., dir, DMA_ATTR_SKIP_CPU_SYNC);

provided that the first sync covers the whole buffer and any subsequent ones cover at least the parts of the buffer which may have changed. Plus, for coherent hardware it's entirely moot either way.
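
Spelled out with full signatures (dev, page and len are placeholders,
and DMA_FROM_DEVICE stands in for dir), the two sequences above are:

/* A: map/unmap perform the cache maintenance. */
addr = dma_map_page(dev, page, 0, len, DMA_FROM_DEVICE);
/* ... DMA ... */
dma_unmap_page(dev, addr, len, DMA_FROM_DEVICE);

/* B: skip it at map/unmap, do it explicitly instead. */
addr = dma_map_page_attrs(dev, page, 0, len, DMA_FROM_DEVICE,
			  DMA_ATTR_SKIP_CPU_SYNC);
dma_sync_single_for_device(dev, addr, len, DMA_FROM_DEVICE);
/* ... DMA ... */
dma_sync_single_for_cpu(dev, addr, len, DMA_FROM_DEVICE);
dma_unmap_page_attrs(dev, addr, len, DMA_FROM_DEVICE,
		     DMA_ATTR_SKIP_CPU_SYNC);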

Given Jon's previous findings, I would lean towards the idea that performing the extra (redundant) cache maintenance plus barrier in dma_unmap is mostly just perturbing timing in the same way as the debug print, which also made things seem OK.

Robin.
