On Tue, Jul 12, 2011 at 10:21:05PM +0800, Felix Fietkau wrote: > On 12.07.2011, at 21:03, Michał Mirosław <mirq-li...@rere.qmqm.pl> wrote: > > > On Tue, Jul 12, 2011 at 08:54:32PM +0800, Felix Fietkau wrote: > >> On 12.07.2011, at 17:55, Michał Mirosław <mirq-li...@rere.qmqm.pl> wrote: > >> > >>> On Tue, Jul 12, 2011 at 12:36:06PM +0800, Felix Fietkau wrote: > >>>> On 2011-07-11 8:52 AM, Michał Mirosław wrote: > >>>>> Also constify buf_addr for ath9k_hw_process_rxdesc_edma() to verify > >>>>> assumptions --- dma_sync_single_for_device() call can be removed. > >>>>> > >>>>> Signed-off-by: Michał Mirosław<mirq-li...@rere.qmqm.pl> > >>>>> --- > >>>>> drivers/net/wireless/ath/ath9k/ar9003_mac.c | 4 ++-- > >>>>> drivers/net/wireless/ath/ath9k/ar9003_mac.h | 2 +- > >>>>> drivers/net/wireless/ath/ath9k/recv.c | 10 +++------- > >>>>> 3 files changed, 6 insertions(+), 10 deletions(-) > >>>>> > >>>>> diff --git a/drivers/net/wireless/ath/ath9k/recv.c > >>>>> b/drivers/net/wireless/ath/ath9k/recv.c > >>>>> index 70dc8ec..c5f46d5 100644 > >>>>> --- a/drivers/net/wireless/ath/ath9k/recv.c > >>>>> +++ b/drivers/net/wireless/ath/ath9k/recv.c > >>>>> @@ -684,15 +684,11 @@ static bool ath_edma_get_buffers(struct ath_softc > >>>>> *sc, > >>>>> BUG_ON(!bf); > >>>>> > >>>>> dma_sync_single_for_cpu(sc->dev, bf->bf_buf_addr, > >>>>> - common->rx_bufsize, DMA_FROM_DEVICE); > >>>>> + common->rx_bufsize, DMA_BIDIRECTIONAL); > >>>>> > >>>>> ret = ath9k_hw_process_rxdesc_edma(ah, NULL, skb->data); > >>>>> - if (ret == -EINPROGRESS) { > >>>>> - /*let device gain the buffer again*/ > >>>>> - dma_sync_single_for_device(sc->dev, bf->bf_buf_addr, > >>>>> - common->rx_bufsize, DMA_FROM_DEVICE); > >>>>> + if (ret == -EINPROGRESS) > >>>>> return false; > >>>>> - } > >>>>> > >>>>> __skb_unlink(skb,&rx_edma->rx_fifo); > >>>>> if (ret == -EINVAL) { > >>>> I have strong doubts about this change. On most MIPS devices, > >>>> dma_sync_single_for_cpu is a no-op, whereas > >>>> dma_sync_single_for_device flushes the cache range. With this > >>>> change, the CPU could cache the DMA status part behind skb->data and > >>>> that cache entry would not be flushed inbetween calls to this > >>>> functions on the same buffer, likely leading to rx stalls. > >>> You're suggesting a platform implementation bug then. If the platform is > >>> not > >>> cache-coherent, it should invalidate relevant CPU cache lines for > >>> sync_to_cpu > >>> and unmap cases. Do other devices show such symptoms on MIPS systems? > >>> > >>> I'm not familiar with the platform internals, so we should ask MIPS > >>> people. > >> I only mentioned MIPS to describe the potential side effect of this > >> change. From my current understanding of the DMA API, it would be wrong on > >> other platforms as well. I believe the _for_device function needs to be > >> used to transfer ownership of the buffer back to the device, before > >> calling _for_cpu again later for another read. > > What you're saying reminds the wording in DMA-API-HOWTO.txt that I find > > wrong (or at least misleading) compared to what DMA-API.txt describes. > > DMA sync calls do not transfer the ownership of the buffer - they are > > cache synchronization points, ownership passing is handled entirely by > > the driver. > What I meant was that the DMA sync calls reflect the ownership transfer of > the memory regions. In this case ownership is transferred between device and > CPU multiple times and the code reflects that. > >> This is definitely required in this case, because when the return code is > >> -EINPROGRESS, the driver waits for the hardware to complete this buffer, > >> and the next call has to fetch the memory again after the device has > >> updated it. > > Correctness of this access should be provided by sync_to_cpu() call. > At least in MIPS I'm sure it isn't. If I remember correctly, it also isn't on > ARM, so I'm pretty sure that either your understanding of the API is > incorrect, or arch code does not implement it properly. In either case, this > change (and probably also the p54 one) should not be merged.
I briefly looked through DMA API implementation in MIPS, and except for R10k and R12k both sync_for_cpu and sync_for_device are no-ops (see: arch/mips/mm/dma-default.c). For R10k and R12k the syncs are in both points, and exactly like I described before - CPU cachelines are invalidated for DMA_FROM_DEVICE mappings, written back for DMA_TO_DEVICE, both for DMA_BIDIRECTIONAL (including redundant mapping+sync direction). So doing that sync_to_device you are just invalidating the same cachelines twice for no gain (or do nothing twice in some cases) - they are not read by CPU between sync_to_device -> sync_to_cpu (unless you have other bugs in the driver). Best Regards, Michał Mirosław _______________________________________________ ath9k-devel mailing list ath9k-devel@lists.ath9k.org https://lists.ath9k.org/mailman/listinfo/ath9k-devel