Hi,

Below are the timings for the clean and flush operations.

/*
Size        Clean     Dirty_clean   Flush     Dirty_Flush
(bytes)     T1(ns)    T2(ns)        T3(ns)    T4(ns)
============================================================
4096        30517     30517         30517     30517
8192        30517     30517         30517     30517
16384       30518     30518         30518     30518
32768       30518     30518         30518     61035  <--
36864       61036     61036         61035     61035
65536       91553     91553         91553     91553
131072      183106    183106        183106    183106

Full cache  30518     30518         30518     30518  <--

*/
/* Based on the above values, 32768 bytes is the break-even size for
 * flushing/cleaning the full D-cache instead of a range.
 */
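
To make the break-even reading explicit, here is a minimal sketch in plain user-space C (not kernel code). It takes the Clean (T1) column from the table and returns the largest measured size whose range operation is no slower than the full-cache operation. The names `struct sample`, `breakeven_size` and `clean_samples` are hypothetical and exist only for this illustration.

```c
#include <assert.h>
#include <stddef.h>

/* One table row: buffer size and the time of the range operation. */
struct sample {
    size_t size_bytes;
    unsigned long range_ns;
};

/*
 * Return the largest measured size whose range operation is no slower
 * than the full-cache operation, or 0 if even the smallest size is slower.
 * The samples must be sorted by ascending size.
 */
static size_t breakeven_size(const struct sample *s, size_t n,
                             unsigned long full_ns)
{
    size_t best = 0;

    assert(s != NULL);
    for (size_t i = 0; i < n; i++) {
        if (s[i].range_ns > full_ns)
            break;              /* range op is now the more expensive one */
        best = s[i].size_bytes;
    }
    return best;
}

/* Clean (T1) column from the table above. */
static const struct sample clean_samples[] = {
    { 4096, 30517 }, { 8192, 30517 }, { 16384, 30518 }, { 32768, 30518 },
    { 36864, 61036 }, { 65536, 91553 }, { 131072, 183106 },
};
```

Calling `breakeven_size(clean_samples, 7, 30518)` with the full-cache clean time gives 32768. Applied to the Dirty_Flush column, where the 32768 row already takes 61035 ns, the same function would stop one row earlier.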

I have noticed that with a 32KB DLIMIT there is a small drop of about 1 fps
in the skiamark profile after this change. This could be because the full
flush or clean causes more cache misses later in the execution.

However, with a 64KB DLIMIT, skiamark performance degrades further.
So I think 32KB is a good value.
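
As a rough sketch of what the DLIMIT check amounts to, in plain user-space C rather than the actual cache-v7.S change: the `enum dcache_op` type and `choose_dcache_op` helper are hypothetical, and treating a length exactly equal to DLIMIT as a full-cache operation is an assumption, since the table shows range and full clean costing the same at 32768.

```c
#include <assert.h>
#include <stddef.h>

/* Threshold in bytes; 32KB is the value proposed in this mail. */
#define DLIMIT (32u * 1024u)

/* Hypothetical labels for the two ways of cleaning/flushing. */
enum dcache_op { DCACHE_OP_RANGE, DCACHE_OP_FULL };

/*
 * Decide between a by-range operation and a full D-cache operation.
 * Assumption: a length equal to DLIMIT already uses the full operation.
 */
static enum dcache_op choose_dcache_op(size_t len)
{
    assert(len > 0);
    return len >= DLIMIT ? DCACHE_OP_FULL : DCACHE_OP_RANGE;
}
```

With this rule a 4096-byte buffer is still handled by range, while 32768 bytes and above take the full-cache path.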

However, problems are seen in the Android UI: small artifacts appear on
UI widgets during video playback.

These artifacts are not seen if clean is called on each CPU.

Also, I found that it takes some effort to implement clean_all / flush_all
APIs in the cache-v7.S (asm) file that execute on each CPU,
and hence that work was set aside.

I have also not yet investigated why, in both cases, a flush all performed
on both CPUs always works.

Thanks & Regards
Vijay



-----Original Message-----
From: Linus Walleij [mailto:linus.wall...@linaro.org] 
Sent: Monday, June 27, 2011 5:30 PM
To: Russell King - ARM Linux; Srinidhi KASAGAR; Vijaya Kumar K-1
Cc: Per Forlin; Nicolas Pitre; Chris Ball; linaro-...@lists.linaro.org; 
linux-mmc@vger.kernel.org; linux-arm-ker...@lists.infradead.org; Robert Fekete
Subject: Re: [PATCH v6 00/11] mmc: use nonblock mmc requests to minimize latency

On Mon, Jun 27, 2011 at 12:02 PM, Russell King - ARM Linux
<li...@arm.linux.org.uk> wrote:

> The next thing to think about in DMA-land is whether we should total up
> the size of the SG list and choose whether to flush the individual SG
> elements or do a full cache flush.  There becomes a point where the full
> cache flush becomes cheaper than flushing each SG element individually.

We noticed that even for a single (large) buffer, above a certain size
threshold flushing individual lines becomes more expensive than flushing
the entire cache.
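
A minimal sketch of the "total up the SG list" idea, in plain user-space C: `struct sg_elem` is a hypothetical stand-in and not the kernel's `struct scatterlist`, and the helper names and the `threshold` parameter are assumptions made for illustration only.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical scatter-gather element; not the kernel's struct scatterlist. */
struct sg_elem {
    void *addr;
    size_t len;
};

/* Sum the lengths of all elements in the list. */
static size_t sg_total_len(const struct sg_elem *sg, size_t nents)
{
    size_t total = 0;

    assert(sg != NULL || nents == 0);
    for (size_t i = 0; i < nents; i++)
        total += sg[i].len;
    return total;
}

/*
 * True when the combined size reaches the threshold, i.e. when one full
 * cache flush is assumed to be cheaper than flushing each element.
 */
static bool sg_prefers_full_flush(const struct sg_elem *sg, size_t nents,
                                  size_t threshold)
{
    return sg_total_len(sg, nents) >= threshold;
}
```

The decision is made once per list rather than per element, so a list of many small buffers can still cross the threshold.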

I asked colleagues to look into implementing this threshold in the
arch/arm/mm/cache-v7.S file, but I think they ran into trouble and
eventually had to give up on it.

Vijay or Srinidhi, can you share your findings?

Thanks,
Linus Walleij
