On Monday 22 June 2015 01:38 PM, Alexey Brodkin wrote:
> Hi all,
>
> On Wed, 2015-06-17 at 07:03 +0000, Vineet Gupta wrote:
> > +CC linux-arch, linux-mm, Arnd and Marek
> >
> > On Tuesday 16 June 2015 11:11 PM, Alexey Brodkin wrote:
> > > Current implementation of descriptor init procedure only takes care about
> > > ownership flag. While it is perfectly possible to have underlying memory
> > > filled with garbage on boot or driver installation.
> > >
> > > And randomly set flags in non-zeroed des0 and des1 fields may lead to
> > > unpredictable behavior of the GMAC DMA block.
> > >
> > > Solution to this problem is as simple as explicit zeroing of both des0
> > > and des1 fields of all buffer descriptors.
> > >
> > > Signed-off-by: Alexey Brodkin <abrod...@synopsys.com>
> > > Cc: Giuseppe Cavallaro <peppe.cavall...@st.com>
> > > Cc: arc-linux-...@synopsys.com
> > > Cc: linux-ker...@vger.kernel.org
> > > Cc: sta...@vger.kernel.org
> >
> > FWIW, this was causing sporadic/random networking flakiness on ARC SDP
> > platform (scheduled for upstream inclusion in next window)
> >
> > This also leads to an interesting question - should
> > arch/*/dma_alloc_coherent() and friends unconditionally zero out memory (vs.
> > the current semantics of only doing it based on gfp, as requested by the
> > driver). This is the second instance we ran into stale descriptor memory; the
> > first one was in the dw_mmc driver, which was recently fixed in upstream as well
> > (although debugged independently by Alexey and using the upstream fix)
> >
> > http://www.spinics.net/lists/linux-mmc/msg31600.html
> >
> > The pro is a better out-of-box experience (despite buggy drivers), while the
> > cons are that they remain broken and perhaps increased boot time due to the
> > extra memzero....
>
> Probably if we already have dma_zalloc_coherent() that does explicit zeroing
> of returned memory then there's no need to do implicit zeroing in
> dma_alloc_coherent()?

The question is: when drivers don't use dma_zalloc_coherent() (meaning they
don't pass __GFP_ZERO), which causes these random issues, do we need to be more
conservative in arch code (ARC at least is), or do we need to debug and fix
these drivers one by one?

FWIW, ARC needs to fix the __GFP_ZERO case, since we are doing memzero twice.

-Vineet
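
For readers following the thread, the descriptor fix described in the quoted patch can be illustrated with a small userspace sketch. This is not the driver's code: the struct layout, the names demo_desc, DEMO_DESC_OWN, demo_init_own_only() and demo_init_zeroed(), and the position of the ownership bit are all assumptions made for illustration. It only shows why setting the ownership flag alone leaves stale bits behind in des0 and des1 when the memory was not zeroed by the allocator.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical 4-word DMA descriptor; only the des0/des1 naming
 * comes from the patch description, the rest is assumed. */
struct demo_desc {
    uint32_t des0;  /* status word, carries the ownership flag */
    uint32_t des1;  /* control word */
    uint32_t des2;  /* buffer address */
    uint32_t des3;  /* next descriptor or second buffer */
};

#define DEMO_DESC_OWN (1u << 31)  /* assumed position of the ownership bit */

/* Behaviour the patch describes as insufficient: only the ownership
 * flag is touched, every other bit keeps whatever the memory held. */
static void demo_init_own_only(struct demo_desc *ring, size_t n, int dma_owned)
{
    for (size_t i = 0; i < n; i++) {
        if (dma_owned)
            ring[i].des0 |= DEMO_DESC_OWN;
        else
            ring[i].des0 &= ~DEMO_DESC_OWN;
    }
}

/* Behaviour the patch proposes: zero des0 and des1 explicitly,
 * then set the ownership flag if the DMA engine owns the descriptor. */
static void demo_init_zeroed(struct demo_desc *ring, size_t n, int dma_owned)
{
    for (size_t i = 0; i < n; i++) {
        ring[i].des0 = 0;
        ring[i].des1 = 0;
        if (dma_owned)
            ring[i].des0 |= DEMO_DESC_OWN;
    }
}
```

With a ring pre-filled with garbage, demo_init_own_only() leaves the garbage in place apart from the ownership bit, while demo_init_zeroed() leaves des0 holding only the ownership bit and des1 equal to zero, regardless of whether the allocation was made with __GFP_ZERO.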