On Thu, Jul 04, 2019 at 02:14:28PM +0200, Arnd Bergmann wrote:
> On Thu, Jul 4, 2019 at 12:31 PM Ilias Apalodimas
> <ilias.apalodi...@linaro.org> wrote:
> > > On Wed, 3 Jul 2019 12:37:50 +0200
> > > Jose Abreu <jose.ab...@synopsys.com> wrote:
>
> > 1. page pool allocs packet. The API doesn't sync but I *think* you don't
> > have to explicitly, since the CPU won't touch that buffer until the NAPI
> > handler kicks in. In the NAPI handler you need to
> > dma_sync_single_for_cpu() and process the packet.
>
> > So bottom line, I *think* we can skip the dma_sync_single_for_device() on
> > the initial allocation *only*. If I am terribly wrong please let me know :)
>
> I think you have to do a sync_single_for_device /somewhere/ before the
> buffer is given to the device. On a non-cache-coherent machine with
> a write-back cache, there may be dirty cache lines that get written back
> after the device DMA's data into it (e.g. from a previous memset
> from before the buffer got freed), so you absolutely need to flush any
> dirty cache lines on it first.

Ok, my bad here, I forgot to add "when coherency is there", since the driver
I had in mind runs on such a device (I think this is configurable though, so
I'll add the sync explicitly to make sure we won't break any configurations).
In general you are right, thanks for the explanation!

> You may also need to invalidate the cache lines in the following
> sync_single_for_cpu() to eliminate clean cache lines with stale data
> that got there when speculatively reading between the cache-invalidate
> and the DMA.
>
>       Arnd

Thanks!
/Ilias