 > > So a diff like the one below seems like a good idea.  However I'm not
 > > very experienced with the OpenBSD kernel and I'm wondering what the
 > > idiomatic way is to express the fact that we need to make sure that
 > > neither the compiler nor an out-of-order CPU reorder the TX descriptor
 > > writes either.  Or do we just not worry about this?

 > Assuming I understand your question correctly, this cannot be guaranteed
 > and in fact it is unlikely to complete in the order you expect.  All newish
 > DMA engines interleave DMA transfers.  A driver for a piece of hardware
 > that got bitten by that (because they assumed in order completion of a
 > DMA) is ami(4).  For the full horror story read the interrupt path code
 > which does nothing but ensure that individual pieces are completed
 > before it calls it an overall completion.

I don't think there's any such complexity in this case.  Maybe a
better way of phrasing what is needed is to say that the correct
contents of the rest of the TX descriptor in system memory must be
visible to the ral device before the VALID|BUSY bits become visible
to it.

On x86, because of the strong architectural memory ordering model,
it is enough to make sure that the stores that set those bits come
last in program order, so all we need there is a compiler barrier
that keeps the compiler from moving those stores earlier in the
function.  On architectures with a weaker memory ordering model, some
sort of synchronization instruction is probably required between
writing the rest of the TX descriptor and writing the VALID|BUSY bits.
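
To make the two cases concrete, here is a rough userland sketch in
C11, not actual ral driver code.  The descriptor layout, the flag
values and the function names are all made up for illustration, and
the C11 fences only stand in for whatever the kernel's idiomatic
primitive turns out to be (bus_dmamap_sync(9) with
BUS_DMASYNC_PREWRITE comes to mind, but that is an assumption on my
part).  Whether a CPU fence alone is enough for device-visible memory
also depends on the platform and on how the memory is mapped.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical flag values and descriptor layout, for illustration only. */
#define TXD_VALID	0x1u
#define TXD_BUSY	0x2u

struct tx_desc {
	volatile uint32_t flags;	/* ownership bits, written last */
	uint32_t physaddr;		/* buffer address */
	uint16_t len;			/* buffer length */
};

/*
 * Strongly ordered case (e.g. x86): only the compiler has to be kept
 * from moving the flags store above the other descriptor writes.
 * atomic_signal_fence() constrains the compiler and emits no CPU fence.
 */
static void
tx_desc_publish_compiler_only(struct tx_desc *d, uint32_t physaddr,
    uint16_t len)
{
	assert(d != NULL);

	d->physaddr = physaddr;
	d->len = len;
	atomic_signal_fence(memory_order_release);
	d->flags = TXD_VALID | TXD_BUSY;
}

/*
 * Weakly ordered case: atomic_thread_fence() constrains the compiler
 * and also emits a CPU fence instruction where the architecture
 * needs one to order the earlier stores before the flags store.
 */
static void
tx_desc_publish_fenced(struct tx_desc *d, uint32_t physaddr, uint16_t len)
{
	assert(d != NULL);

	d->physaddr = physaddr;
	d->len = len;
	atomic_thread_fence(memory_order_release);
	d->flags = TXD_VALID | TXD_BUSY;
}
```

In both variants the VALID|BUSY store is the last thing written; the
only difference is how strong a barrier sits in front of it.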

 - R.
