On 6/19/2020 7:37 PM, Horia Geanta wrote: > On 6/17/2020 11:48 PM, Michael Walle wrote: >> Am 2020-06-17 21:15, schrieb Horia Geantă: >>> On 6/4/2020 6:48 PM, Michael Walle wrote: >>>> + >>>> + desc = memalign(ARCH_DMA_MINALIGN, desc_size); >>>> + if (!desc) { >>>> + debug("cannot allocate RNG init descriptor memory\n"); >>>> + return -ENOMEM; >>>> + } >>>> + >>>> + for (sh_idx = 0; sh_idx < RNG4_MAX_HANDLES; sh_idx++) { >>>> + /* >>>> + * If the corresponding bit is set, then it means the state >>>> + * handle was initialized by us, and thus it needs to be >>>> + * deinitialized as well >>>> + */ >>>> + >>>> + if (state_handle_mask & RDSTA_IF(sh_idx)) { >>>> + /* >>>> + * Create the descriptor for deinstantating this state >>>> + * handle. >>>> + */ >>>> + inline_cnstr_jobdesc_rng_deinstantiation(desc, sh_idx); >>>> + flush_dcache_range((unsigned long)desc, >>>> + (unsigned long)desc + desc_size); >>> Shouldn't this be roundup(desc_size, ARCH_DMA_MINALIGN) instead of >>> desc_size? >> >> I've seen the same idioms sometimes, but it wasn't clear to me why that >> would >> be needed; the hardware only uses the desc_size, right? >> > Yes, HW will use only [desc, desc + desc_size]. > > I think this is needed to avoid cacheline sharing issues > on non-coherent platforms: CPU needs to make sure a larger area > is written back to memory and corresponding cache lines are invalidated. > > Looking at flush_dcache_range() implementation, it does its own rounding, > based on CTR_EL0[DminLine] - "smallest data cache line size". > I guess this value might be smaller than ARCH_DMA_MINALIGN, > hence the explicit rounding to ARCH_DMA_MINALIGN is needed. > Btw, I think desc = memalign(ARCH_DMA_MINALIGN, desc_size); needs to be replaced with desc = malloc_cache_aligned(desc_size);
Horia