On Fri, Jun 25, 2021 at 04:15:43PM +0200, Mark Kettenis wrote:
> > Date: Fri, 25 Jun 2021 13:27:28 +0000
> > From: Visa Hankala <v...@hankala.org>
> > 
> > On Thu, Jun 24, 2021 at 07:02:11PM +0000, Mickael Torres wrote:
> > > Hello,
> > > 
> > > On the risc-v SiFive Unmatched the internal cad0 ethernet interface stops 
> > > working randomly after some packets are sent/received. It looks like it's 
> > > because the bus_dmamap used isn't restricted to lower than 4GB physical 
> > > addresses, and the interface itself is.
> > 
> > I am surprised that this has not been raised before. I also wonder if
> > riscv64's DMA constraints are fully sane.
> 
> There is no DMA constraint on riscv64 yet.  We try to avoid having
> such a constraint on platforms that don't have a long history, hoping
> those platforms are (and remain) 64-bit "clean".  And on some modern
> platforms (e.g. arm64) there is no memory below 4GB, so we can't have
> a DMA constraint on those platforms.  The jury is still out where
> riscv64 will end up.
> 
> There is infrastructure to have the bootloader set the DMA constraint
> based on the device tree.
> 
> > > Configuring the interface for 64 bits DMA fixes the problem, and
> > > the machine is now useable with its internal ethernet port.
> > > 
> > > I didn't test very extensively, but it was very easy to run into
> > > "cad0: hresp error, interface stopped" before the patch. After the
> > > patch it survived a couple hours of tests and ping -f from and to
> > > it.
> > > 
> > > It is now depending on being compiled for __riscv64__ or not, would it be 
> > > better to do it dynamically when matching "sifive,fu740-c000-gem" ?
> > 
> > Hopefully all 64-bit platforms have a 64-bit capable revision of the
> > controller.
> 
> So far that seems to be true.  The controller on the PolarFire SoC is
> also 64-bit capable.
> 
> > However, I would avoid #ifdef'ing and make the selecting of
> > the DMA mode happen at runtime.
> > 
> > Below is how I had envisioned how the driver should work.
> > 
> > I have not tested the 64-bit side of the patch.
> 
> Seems to work fine.  The diff looks good to me.  Your diff does not
> set a 4GB boundary in the bus_dmamap_create() call for the rings.  It
> works without that, but if there really is a hardware constraint in
> crossing a 4GB boundary we may need to add this in.

So far I have not spotted such a restriction in the documentations
of Zynq UltraScale+ and PolarFire SoCs. These SoCs have 64-bit capable
GEM controllers.

The standalone GEM driver for Zynq in Xilinx embeddedsw library does
have comments about not crossing the 0xffffffff boundary. However, those
comments predate and seem inconsistent with 64-bit code.

> The hardware has a register that indicates whether 64-bit DMA is
> supported.  Maybe we should look at that instead of checking the
> compatible string.  But let's get this in and tweak it later.

The register is undocumented on the Zynq-7000. Reading the register
returns a constant (?) value (0x200), but the value probably means
something different.

However, making the register access conditional to GEM version might
work. Xilinx Zynq UltraScale+ and SiFive HiFive Unmatched have GEM
version 0x7, whereas MicroSemi PolarFire and Xilinx Versal appear to
have GEM version 0x107.

In addition, as cad(4) is now able to use 64-bit DMA, the DMA maps could
be created with the 64-bit capability turned on.

The diff is untested on 64-bit hardware.

Index: dev/fdt/if_cad.c
===================================================================
RCS file: src/sys/dev/fdt/if_cad.c,v
retrieving revision 1.4
diff -u -p -r1.4 if_cad.c
--- dev/fdt/if_cad.c    26 Jun 2021 10:47:59 -0000      1.4
+++ dev/fdt/if_cad.c    26 Jun 2021 11:08:36 -0000
@@ -126,6 +126,8 @@
 #define GEM_LADDRH(i)                  (0x008c + (i) * 8)
 #define GEM_LADDRNUM                   4
 #define GEM_MID                                0x00fc
+#define  GEM_MID_VERSION_MASK                  (0xfff << 16)
+#define  GEM_MID_VERSION_SHIFT                 16
 #define GEM_OCTTXL                     0x0100
 #define GEM_OCTTXH                     0x0104
 #define GEM_TXCNT                      0x0108
@@ -169,6 +171,8 @@
 #define GEM_RXIPCCNT                   0x01a8
 #define GEM_RXTCPCCNT                  0x01ac
 #define GEM_RXUDPCCNT                  0x01b0
+#define GEM_CFG6                       0x0294
+#define  GEM_CFG6_DMA64                                (1 << 23)
 #define GEM_TXQBASEHI                  0x04c8
 #define GEM_RXQBASEHI                  0x04d4
 
@@ -364,6 +368,7 @@ cad_attach(struct device *parent, struct
        struct cad_softc *sc = (struct cad_softc *)self;
        struct ifnet *ifp = &sc->sc_ac.ac_if;
        uint32_t hi, lo;
+       uint32_t rev, ver;
        unsigned int i;
        int node, phy;
 
@@ -416,8 +421,12 @@ cad_attach(struct device *parent, struct
                }
        }
 
+       rev = HREAD4(sc, GEM_MID);
+       ver = (rev & GEM_MID_VERSION_MASK) >> GEM_MID_VERSION_SHIFT;
+
        sc->sc_descsize = sizeof(struct cad_desc32);
-       if (OF_is_compatible(faa->fa_node, "sifive,fu740-c000-gem")) {
+       /* Register CFG6 is not present on Zynq-7000 / GEM version 0x2. */
+       if (ver >= 0x7 && (HREAD4(sc, GEM_CFG6) & GEM_CFG6_DMA64)) {
                sc->sc_descsize = sizeof(struct cad_desc64);
                sc->sc_dma64 = 1;
        }
@@ -459,7 +468,7 @@ cad_attach(struct device *parent, struct
                    IFCAP_CSUM_TCPv6 | IFCAP_CSUM_UDPv6;
        }
 
-       printf(": rev 0x%x, address %s\n", HREAD4(sc, GEM_MID),
+       printf(": rev 0x%x, address %s\n", rev,
            ether_sprintf(sc->sc_ac.ac_enaddr));
 
        sc->sc_mii.mii_ifp = ifp;
@@ -593,9 +602,13 @@ cad_up(struct cad_softc *sc)
        struct cad_desc32 *desc32;
        struct cad_desc64 *desc64;
        uint64_t addr;
+       int flags = BUS_DMA_WAITOK | BUS_DMA_ALLOCNOW;
        unsigned int i;
        uint32_t val;
 
+       if (sc->sc_dma64)
+               flags |= BUS_DMA_64BIT;
+
        /*
         * Set up Tx descriptor ring.
         */
@@ -612,7 +625,7 @@ cad_up(struct cad_softc *sc)
        for (i = 0; i < CAD_NTXDESC; i++) {
                txb = &sc->sc_txbuf[i];
                bus_dmamap_create(sc->sc_dmat, MCLBYTES, CAD_NTXSEGS,
-                   MCLBYTES, 0, BUS_DMA_WAITOK, &txb->bf_map);
+                   MCLBYTES, 0, flags, &txb->bf_map);
                txb->bf_m = NULL;
 
                if (sc->sc_dma64) {
@@ -657,7 +670,7 @@ cad_up(struct cad_softc *sc)
        for (i = 0; i < CAD_NRXDESC; i++) {
                rxb = &sc->sc_rxbuf[i];
                bus_dmamap_create(sc->sc_dmat, MCLBYTES, 1,
-                   MCLBYTES, 0, BUS_DMA_WAITOK, &rxb->bf_map);
+                   MCLBYTES, 0, flags, &rxb->bf_map);
                rxb->bf_m = NULL;
 
                /* Mark all descriptors as used so that driver owns them. */
@@ -1486,13 +1499,17 @@ struct cad_dmamem *
 cad_dmamem_alloc(struct cad_softc *sc, bus_size_t size, bus_size_t align)
 {
        struct cad_dmamem *cdm;
+       int flags = BUS_DMA_WAITOK | BUS_DMA_ALLOCNOW;
        int nsegs;
 
        cdm = malloc(sizeof(*cdm), M_DEVBUF, M_WAITOK | M_ZERO);
        cdm->cdm_size = size;
 
+       if (sc->sc_dma64)
+               flags |= BUS_DMA_64BIT;
+
        if (bus_dmamap_create(sc->sc_dmat, size, 1, size, 0,
-           BUS_DMA_WAITOK | BUS_DMA_ALLOCNOW, &cdm->cdm_map) != 0)
+           flags, &cdm->cdm_map) != 0)
                goto cdmfree;
        if (bus_dmamem_alloc(sc->sc_dmat, size, align, 0, &cdm->cdm_seg, 1,
            &nsegs, BUS_DMA_WAITOK) != 0)

Reply via email to