Apparently the original poster sent his question to me in private, then sent it again to the mailing list right as I was responding in private.
Anyways, no need to continue to guess; if anyone has any questions, feel
free to ask.

Below is my response. Note that I edited it slightly to fix an error that I found.

Scott

-------- Original Message --------
Subject: Re: use of bus_dmamap_sync
Date: Tue, 25 Oct 2005 07:59:03 -0600
From: Scott Long <[EMAIL PROTECTED]>
To: Dinesh Nair <[EMAIL PROTECTED]>
References: <[EMAIL PROTECTED]>

Dinesh Nair wrote:

hi scott,

i came across this message of yours,
http://lists.freebsd.org/pipermail/freebsd-current/2004-December/044395.html

and you seem like the perfect person to assist me in something. i've been
trying to figure out the best places to use bus_dmamap_sync when
reading/writing to a dma mapped address space. however, i cant seem to get
the gist of this, either from the mailing list discussions or the man page.
could you assist me ?

i'm on FreeBSD 4.11 right now, and i notice the definitions of BUS_DMASYNC_* has changed from an enum (0-3) in 4.x to a typedef in 5.x.

this is what i have done. i have used two buffers to handle reads from the
device and writes to the device. the pseudocode is as follows

rx_func()
{
    POSITION A
      bus_dmamap_sync(tag, map, BUS_DMASYNC_PREREAD);
      Ask hardware for data
      bus_dmamap_sync(tag, map, BUS_DMASYNC_POSTREAD);

    read from readbuf (i'm assuming that device has put data in
               readbuf)
    POSITION B
}

tx_func()
{
    POSITION C

    write to txbuf (here's where we write to txbuf)
      bus_dmamap_sync(tag, map, BUS_DMASYNC_PREWRITE);
      notify hardware of the write

    POSITION D
      bus_dmamap_sync(tag, map, BUS_DMASYNC_POSTWRITE);
}

what BUS_DMASYNC_{PRE,POST}{READ,WRITE} option should i use for bus_dmamap_sync in position A, B, C and D ?

any assistance would be gladly appreciated, as i'm seeing some really weird
symptoms on this device, where data written out is being immediately read
in. i'm guessing this has to do with my wrong usage of bus_dmamap_sync().


The point of the syncs is to do the proper memory barrier and cache
coherency magic between the CPU and the bus as well as do the memory
copies for bounce buffers.  If you are dealing with statically mapped
buffers, i.e. for an rx/tx descriptor ring, then you'll want code
exactly like described above.  In reality, most platforms only do stuff
for the POSTREAD and PREWRITE cases, but for the sake of completeness
the others are documented and usually used in drivers.  NetBSD might
have platforms that require operations for PREREAD and POSTWRITE, but
I've never looked that closely.
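
To make the bounce-buffer half of that concrete, here is a minimal user-space sketch. It is only a model: every toy_* name is hypothetical and is not part of busdma, and it assumes a map whose data is always bounced. It ignores memory barriers and cache maintenance entirely and shows only why POSTREAD and PREWRITE are the two operations that normally have work to do.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical user-space model; none of these names are the kernel API. */
enum toy_syncop {
	TOY_PREREAD,	/* before the device writes into memory */
	TOY_POSTREAD,	/* after the device wrote, before the CPU reads */
	TOY_PREWRITE,	/* after the CPU wrote, before the device reads */
	TOY_POSTWRITE	/* after the device finished reading */
};

struct toy_map {
	unsigned char	*client;	/* buffer the driver reads and writes */
	unsigned char	*bounce;	/* buffer the device can actually reach */
	size_t		 len;		/* size of both buffers in bytes */
};

/* Model of the copies a sync performs when a map is bounced. */
static void
toy_dmamap_sync(struct toy_map *map, enum toy_syncop op)
{
	assert(map != NULL && map->client != NULL && map->bounce != NULL);

	switch (op) {
	case TOY_PREWRITE:
		/* CPU data must reach the device-visible copy. */
		memcpy(map->bounce, map->client, map->len);
		break;
	case TOY_POSTREAD:
		/* Device data must reach the driver-visible copy. */
		memcpy(map->client, map->bounce, map->len);
		break;
	case TOY_PREREAD:
	case TOY_POSTWRITE:
		/* Nothing to copy in this model. */
		break;
	}
}
```

In this model, skipping the PREWRITE sync leaves the device looking at stale bounce data, and skipping the POSTREAD sync leaves the driver looking at a stale client buffer.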

If you are dealing with dynamic buffers,
i.e. for mbuf data, then you'll want the PREREAD and PREWRITE ops to
happen in the callback function for bus_dmamap_load() and the POSTREAD
and POSTWRITE ops to happen right before calling bus_dmamap_unload.  So
in this case it would be:

rx_buf()
{
        allocate buffer
        allocate map
        bus_dmamap_load(tag, map, buffer, size, rx_callback, arg, flags)
}

rx_callback(arg, segs, nsegs, error)
{
        convert segs to hardware format
        bus_dmamap_sync(tag, map, BUS_DMASYNC_PREREAD)
        notify hardware about buffer
}

rx_complete()
{
        bus_dmamap_sync(tag, map, BUS_DMASYNC_POSTREAD)
        bus_dmamap_unload(tag, map)
        deallocate map
        process buffer
}

tx_buf()
{
        fill buffer
        allocate map
        bus_dmamap_load(tag, map, buffer, size, tx_callback, arg, flags)
}

tx_callback(arg, segs, nsegs, error)
{
        convert segs to hardware format
        bus_dmamap_sync(tag, map, BUS_DMASYNC_PREWRITE)
        notify hardware about buffer
}

tx_complete()
{
        bus_dmamap_sync(tag, map, BUS_DMASYNC_POSTWRITE)
        bus_dmamap_unload(tag, map)
        deallocate map
        free buffer
}

This is the design that busdma was originally modelled on.  It works
well for storage devices where the load operation must succeed.  It
doesn't work as well for network devices where the latency of the
indirect calls is measurable.  So for that, I added
bus_dmamap_load_mbuf_sg().  It eliminates the callback function and
returns the scatter gather list directly.  So, the above example would
be:

tx_buf()
{
        bus_dma_segment_t segs[maxsegs];
        int nsegs;

        fill buffer
        allocate map
        bus_dmamap_load_mbuf_sg(tag, map, mbuf, segs, &nsegs, flags)
        convert segs to hardware format
        bus_dmamap_sync(tag, map, BUS_DMASYNC_PREWRITE)
        notify hardware about buffer
}
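
The 'convert segs to hardware format' step depends entirely on the device, so the following is only a user-space sketch under stated assumptions. struct toy_seg stands in for bus_dma_segment_t (it borrows the ds_addr and ds_len field names), while struct toy_desc, TOY_DESC_EOP and toy_segs_to_descs() are hypothetical and describe no real hardware.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Stand-in for bus_dma_segment_t: one contiguous piece of the buffer. */
struct toy_seg {
	uint64_t	ds_addr;	/* bus address of the segment */
	uint32_t	ds_len;		/* length in bytes */
};

/* Hypothetical hardware descriptor; real devices define their own layout. */
struct toy_desc {
	uint32_t	addr_lo;
	uint32_t	addr_hi;
	uint32_t	len;
	uint32_t	flags;
};

#define TOY_DESC_EOP	0x1u	/* hypothetical "end of packet" flag */

/*
 * Fill descriptors from a scatter/gather list.
 * Returns the number of descriptors written, or -1 if nsegs is not
 * positive or the list does not fit in the ndesc slots provided.
 */
static int
toy_segs_to_descs(const struct toy_seg *segs, int nsegs,
    struct toy_desc *descs, int ndesc)
{
	int i;

	assert(segs != NULL && descs != NULL);
	if (nsegs <= 0 || nsegs > ndesc)
		return (-1);

	for (i = 0; i < nsegs; i++) {
		descs[i].addr_lo = (uint32_t)(segs[i].ds_addr & 0xffffffffu);
		descs[i].addr_hi = (uint32_t)(segs[i].ds_addr >> 32);
		descs[i].len = segs[i].ds_len;
		/* Mark only the last segment as the end of the packet. */
		descs[i].flags = (i == nsegs - 1) ? TOY_DESC_EOP : 0;
	}
	return (nsegs);
}
```

In a driver the descriptors would normally live in a statically mapped ring, so the ring's own map would also need a PREWRITE sync before the hardware is notified.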

Also, the 'allocate map' part should be done carefully.  Most network
drivers are lazy and call bus_dmamap_create() and bus_dmamap_destroy()
for each buffer.  It's often better to pre-allocate the maps at init
time, put them on a list, and then just push and pop them off the list
at runtime.  This is usually faster than calling the busdma functions,
but you'll have to weigh the tradeoffs.
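
A minimal sketch of that free list, written as plain user-space C so it can be compiled on its own. All toy_* names and the pool size are hypothetical. The id field is a placeholder for a real bus_dmamap_t, and the sketch assumes the caller already holds whatever lock protects the pool.

```c
#include <assert.h>
#include <stddef.h>

#define TOY_NMAPS	4	/* hypothetical pool size */

/* Stand-in for a pre-created map plus its list linkage. */
struct toy_map_entry {
	int			 id;	/* placeholder for the map handle */
	struct toy_map_entry	*next;
};

struct toy_map_pool {
	struct toy_map_entry	 entries[TOY_NMAPS];
	struct toy_map_entry	*free_head;
};

/* Init time: bus_dmamap_create() would run once per entry here. */
static void
toy_pool_init(struct toy_map_pool *pool)
{
	int i;

	assert(pool != NULL);
	pool->free_head = NULL;
	for (i = 0; i < TOY_NMAPS; i++) {
		pool->entries[i].id = i;
		pool->entries[i].next = pool->free_head;
		pool->free_head = &pool->entries[i];
	}
}

/* Runtime: pop a ready map; NULL means the pool is exhausted. */
static struct toy_map_entry *
toy_pool_get(struct toy_map_pool *pool)
{
	struct toy_map_entry *e = pool->free_head;

	if (e != NULL) {
		pool->free_head = e->next;
		e->next = NULL;
	}
	return (e);
}

/* Runtime: push the map back once it has been unloaded. */
static void
toy_pool_put(struct toy_map_pool *pool, struct toy_map_entry *e)
{
	e->next = pool->free_head;
	pool->free_head = e;
}
```

The tradeoff is that the pool size caps the number of buffers in flight, so the driver has to handle toy_pool_get() returning NULL, for example by deferring the transmit.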

Scott


_______________________________________________
freebsd-hackers@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-hackers
To unsubscribe, send any mail to "[EMAIL PROTECTED]"