On Monday 22 June 2015 01:38 PM, Alexey Brodkin wrote:
> Hi all,
> 
> On Wed, 2015-06-17 at 07:03 +0000, Vineet Gupta wrote:
> +CC linux-arch, linux-mm, Arnd and Marek
> 
> On Tuesday 16 June 2015 11:11 PM, Alexey Brodkin wrote:
> 
> The current implementation of the descriptor init procedure only takes care of
> the ownership flag, while it is perfectly possible for the underlying memory to
> be filled with garbage on boot or driver installation.
> 
> And randomly set flags in non-zeroed des0 and des1 fields may lead to
> unpredictable behavior of the GMAC DMA block.
> 
> Solution to this problem is as simple as explicit zeroing of both des0
> and des1 fields of all buffer descriptors.
> 
> Signed-off-by: Alexey Brodkin <abrod...@synopsys.com>
> Cc: Giuseppe Cavallaro <peppe.cavall...@st.com>
> Cc: arc-linux-...@synopsys.com
> Cc: linux-ker...@vger.kernel.org
> Cc: sta...@vger.kernel.org
> 
> FWIW, this was causing sporadic/random networking flakiness on ARC SDP 
> platform (scheduled for upstream inclusion in next window)
> 
> This also leads to an interesting question: should
> arch/*/dma_alloc_coherent() and friends unconditionally zero out memory (vs.
> the current semantics of doing it only when the driver requests it via gfp)?
> This is the second instance where we ran into stale descriptor memory. The
> first one was in the dw_mmc driver, which was recently fixed upstream as well
> (although debugged independently by Alexey and using the upstream fix)
> 
> http://www.spinics.net/lists/linux-mmc/msg31600.html
> 
> The pro is a better out-of-box experience (despite buggy drivers), while the
> cons are that the drivers remain broken and boot time perhaps increases due to
> the extra memzero.
> 
> Probably if we already have dma_zalloc_coherent() that does explicit zeroing 
> of returned memory then there's no need to do implicit zeroing in 
> dma_alloc_coherent()?
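
For reference, the descriptor fix in the quoted patch boils down to clearing
both flag words before setting the ownership bit. Below is a minimal user-space
sketch of that idea, not the actual driver code: struct demo_desc, DEMO_OWN_BIT
and both helpers are hypothetical stand-ins, and the bit position of the
ownership flag is an assumption made only for illustration.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical, simplified descriptor; the real driver layout differs. */
struct demo_desc {
	uint32_t des0;	/* status word, carries the ownership flag */
	uint32_t des1;	/* control word */
	uint32_t des2;	/* buffer address, not touched by init here */
	uint32_t des3;	/* second buffer / next descriptor */
};

#define DEMO_OWN_BIT (1u << 31)	/* assumed position of the ownership flag */

/* Simulate stale memory handed back by an allocator that does not zero. */
static void demo_fill_garbage(struct demo_desc *ring, int ring_size)
{
	assert(ring && ring_size > 0);
	memset(ring, 0xA5, sizeof(*ring) * (size_t)ring_size);
}

/* Clear both flag words first, then set only the bit we mean to set. */
static void demo_init_rx_desc(struct demo_desc *ring, int ring_size)
{
	assert(ring && ring_size > 0);
	for (int i = 0; i < ring_size; i++) {
		ring[i].des0 = 0;
		ring[i].des1 = 0;
		ring[i].des0 |= DEMO_OWN_BIT;
	}
}
```

Calling demo_fill_garbage() and then demo_init_rx_desc() on a ring leaves des0
holding only the ownership bit and des1 cleared, regardless of what the memory
held before. Without the two explicit stores, the stale bits would survive the
OR and reach the DMA engine.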


The question is, when drivers don't use dma_zalloc_coherent() - meaning they
don't pass __GFP_ZERO - and that causes these random issues, do we need to be
more conservative in arch code (ARC at least is), or do we need to debug and fix
these drivers one by one?

FWIW, ARC needs to fix the __GFP_ZERO case, since we are doing memzero twice.
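
To make the double memzero concrete, here is a rough user-space model, assuming
an arch hook that always clears the buffer on top of an allocator that honors
the zero flag. Every name in it (DEMO_GFP_ZERO, demo_page_alloc() and so on) is
made up for illustration, malloc() stands in for the page allocator, and
stripping the flag is just one possible way to avoid the second pass, not
necessarily how the ARC code would be fixed.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define DEMO_GFP_ZERO 0x1u	/* stand-in for __GFP_ZERO */

static int demo_zero_passes;	/* how many times a buffer got cleared */

/* Stand-in for the page allocator: clears only when the flag is set. */
static void *demo_page_alloc(size_t size, unsigned int flags)
{
	void *p;

	assert(size > 0);
	p = malloc(size);
	if (p && (flags & DEMO_GFP_ZERO)) {
		memset(p, 0, size);
		demo_zero_passes++;
	}
	return p;
}

/* Conservative arch hook: always clears, even if the allocator just did. */
static void *demo_arch_alloc_always_zero(size_t size, unsigned int flags)
{
	void *p = demo_page_alloc(size, flags);

	if (p) {
		memset(p, 0, size);
		demo_zero_passes++;
	}
	return p;
}

/* Same hook, but the flag is stripped so the buffer is cleared only once. */
static void *demo_arch_alloc_zero_once(size_t size, unsigned int flags)
{
	void *p = demo_page_alloc(size, flags & ~DEMO_GFP_ZERO);

	if (p) {
		memset(p, 0, size);
		demo_zero_passes++;
	}
	return p;
}

/* zalloc-style wrapper: the plain allocation call plus the zero flag. */
static void *demo_zalloc(void *(*alloc)(size_t, unsigned int), size_t size)
{
	return alloc(size, DEMO_GFP_ZERO);
}
```

With demo_zero_passes reset to 0, demo_zalloc(demo_arch_alloc_always_zero, 64)
clears the buffer twice, while demo_zalloc(demo_arch_alloc_zero_once, 64)
clears it once and still returns zeroed memory.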

-Vineet