Hi, I have a HummingBoard Pulse, which is an NXP i.MX8MQ based board featuring two ethernets, with the second ethernet being an em(4) on a PCIe controller.
I had trouble getting it to work, but realized that the issue is the descriptor ring coherency. I looked into the code, tried to find if there are incorrect flushes, or something else, but nothing worked. Looking at the Linux driver I realized that their descriptor rings are allocated coherent. Some arm64 machines are fine with em(4), since the PCIe controller is coherent, but on my machine it is not. Explicitly mapping the rings coherent made my machine happy, so I believe that maybe the way that em(4) works, we need to make sure the rings are coherent. So I'd propose the following diff, which *only* makes the desciptor rings coherent, the packets stay cached and fast. This allows me to push plenty of traffic through my machine! This is a no-op on all x86, and on arm64-machines with coherent PCIe controllers. Opinions? ok? Patrick diff --git a/sys/dev/pci/if_em.c b/sys/dev/pci/if_em.c index aca6b4bb02f..27d0630bf9f 100644 --- a/sys/dev/pci/if_em.c +++ b/sys/dev/pci/if_em.c @@ -2113,7 +2113,7 @@ em_dma_malloc(struct em_softc *sc, bus_size_t size, struct em_dma_alloc *dma) goto destroy; r = bus_dmamem_map(sc->sc_dmat, &dma->dma_seg, dma->dma_nseg, size, - &dma->dma_vaddr, BUS_DMA_WAITOK); + &dma->dma_vaddr, BUS_DMA_WAITOK | BUS_DMA_COHERENT); if (r != 0) goto free;