Re: [Qemu-devel] [PATCH V4 10/12] SD card users: optimize access to SDClass methods

2012-07-31 Thread Peter Maydell
On 27 July 2012 20:29, Igor Mitsyanko i.mitsya...@samsung.com wrote:
 Rather that repeatedly call SD_GET_CLASS() in a loop, call it once before
 a loop starts.

Anthony claims that SD_GET_CLASS should be cheap enough that we don't
need to hoist it out of loops like this. Do you have profiling data
or similar that caused you to write this patch?

-- PMM



Re: [Qemu-devel] [PATCH V4 10/12] SD card users: optimize access to SDClass methods

2012-07-31 Thread Igor Mitsyanko

On 07/31/2012 07:43 PM, Peter Maydell wrote:

On 27 July 2012 20:29, Igor Mitsyanko i.mitsya...@samsung.com wrote:

Rather that repeatedly call SD_GET_CLASS() in a loop, call it once before
a loop starts.

Anthony claims that SD_GET_CLASS should be cheap enough that we don't
need to hoist it out of loops like this. Do you have profiling data
or similar that caused you to write this patch?

-- PMM

Well, I've tested it by measuring an execution time of a 4Kb write to SD 
card, results showed that arithmetic mean of time for one 4k write was 
less by ~300us in sequence with SD_GET_CLASS extracted from the loop. 
Although I ran this test several times, I have little faith in test 
methodology and results, it obviously showed significant dispersion 
between measured time of distinct 4K writes (200-300% if I recall 
correctly). I really have no objection no objection to dropping this patch.




Re: [Qemu-devel] [PATCH V4 10/12] SD card users: optimize access to SDClass methods

2012-07-31 Thread Peter Maydell
On 31 July 2012 18:33, Igor Mitsyanko i.mitsya...@samsung.com wrote:
 On 07/31/2012 07:43 PM, Peter Maydell wrote:
 Anthony claims that SD_GET_CLASS should be cheap enough that we don't
 need to hoist it out of loops like this. Do you have profiling data
 or similar that caused you to write this patch?

 Well, I've tested it by measuring an execution time of a 4Kb write to SD
 card, results showed that arithmetic mean of time for one 4k write was less
 by ~300us in sequence with SD_GET_CLASS extracted from the loop.

How much is that as a % of the total time for the write ?

-- PMM



Re: [Qemu-devel] [PATCH V4 10/12] SD card users: optimize access to SDClass methods

2012-07-31 Thread Igor Mitsyanko

On 07/31/2012 09:47 PM, Peter Maydell wrote:

On 31 July 2012 18:33, Igor Mitsyanko i.mitsya...@samsung.com wrote:

On 07/31/2012 07:43 PM, Peter Maydell wrote:

Anthony claims that SD_GET_CLASS should be cheap enough that we don't
need to hoist it out of loops like this. Do you have profiling data
or similar that caused you to write this patch?

Well, I've tested it by measuring an execution time of a 4Kb write to SD
card, results showed that arithmetic mean of time for one 4k write was less
by ~300us in sequence with SD_GET_CLASS extracted from the loop.

How much is that as a % of the total time for the write ?

-- PMM


total write time was ~3.5-4.0 ms I think, so its ~0.08%



Re: [Qemu-devel] [PATCH V4 10/12] SD card users: optimize access to SDClass methods

2012-07-31 Thread Anthony Liguori
Igor Mitsyanko i.mitsya...@samsung.com writes:

 On 07/31/2012 07:43 PM, Peter Maydell wrote:
 On 27 July 2012 20:29, Igor Mitsyanko i.mitsya...@samsung.com wrote:
 Rather that repeatedly call SD_GET_CLASS() in a loop, call it once before
 a loop starts.
 Anthony claims that SD_GET_CLASS should be cheap enough that we don't
 need to hoist it out of loops like this. Do you have profiling data
 or similar that caused you to write this patch?

 -- PMM

 Well, I've tested it by measuring an execution time of a 4Kb write to SD 
 card, results showed that arithmetic mean of time for one 4k write was 
 less by ~300us in sequence with SD_GET_CLASS extracted from the loop.

How many loop iterations that?  300us is a huge amount of time, unless
you were looping on every byte, I have a hard time understanding that
delta.

Regards,

Anthony Liguori

 Although I ran this test several times, I have little faith in test 
 methodology and results, it obviously showed significant dispersion 
 between measured time of distinct 4K writes (200-300% if I recall 
 correctly). I really have no objection no objection to dropping this patch.




Re: [Qemu-devel] [PATCH V4 10/12] SD card users: optimize access to SDClass methods

2012-07-31 Thread Igor Mitsyanko

On 07/31/2012 10:15 PM, Anthony Liguori wrote:

How many loop iterations that?  300us is a huge amount of time, unless
you were looping on every byte, I have a hard time understanding that
delta.


I'm not sure I understand what you mean, I did loop on every byte, 
that's how our SD model works. So, for 4K bytes qemu have to call 
SD_GET_CLASS 4K times.
Or you're asking about write sequence length? It was 10 writes of 4096 
bytes (not statistically reliable, I have to agree).




[Qemu-devel] [PATCH V4 10/12] SD card users: optimize access to SDClass methods

2012-07-27 Thread Igor Mitsyanko
Rather that repeatedly call SD_GET_CLASS() in a loop, call it once before
a loop starts.

Signed-off-by: Igor Mitsyanko i.mitsya...@samsung.com
---
 hw/omap_mmc.c|9 +
 hw/pl181.c   |7 ---
 hw/pxa2xx_mmci.c |6 --
 3 files changed, 13 insertions(+), 9 deletions(-)

diff --git a/hw/omap_mmc.c b/hw/omap_mmc.c
index aec0285..8ceb25b 100644
--- a/hw/omap_mmc.c
+++ b/hw/omap_mmc.c
@@ -219,6 +219,7 @@ static void omap_mmc_command(struct omap_mmc_s *host, int 
cmd, int dir,
 
 static void omap_mmc_transfer(struct omap_mmc_s *host)
 {
+SDClass *sd_class = SD_GET_CLASS(host-card);
 uint8_t value;
 
 if (!host-transfer)
@@ -229,10 +230,10 @@ static void omap_mmc_transfer(struct omap_mmc_s *host)
 if (host-fifo_len  host-af_level)
 break;
 
-value = sd_read_data(host-card);
+value = sd_class-read_data(host-card);
 host-fifo[(host-fifo_start + host-fifo_len)  31] = value;
 if (-- host-blen_counter) {
-value = sd_read_data(host-card);
+value = sd_class-read_data(host-card);
 host-fifo[(host-fifo_start + host-fifo_len)  31] |=
 value  8;
 host-blen_counter --;
@@ -244,10 +245,10 @@ static void omap_mmc_transfer(struct omap_mmc_s *host)
 break;
 
 value = host-fifo[host-fifo_start]  0xff;
-sd_write_data(host-card, value);
+sd_class-write_data(host-card, value);
 if (-- host-blen_counter) {
 value = host-fifo[host-fifo_start]  8;
-sd_write_data(host-card, value);
+sd_class-write_data(host-card, value);
 host-blen_counter --;
 }
 
diff --git a/hw/pl181.c b/hw/pl181.c
index 7d91fbb..8cd898c 100644
--- a/hw/pl181.c
+++ b/hw/pl181.c
@@ -209,18 +209,19 @@ error:
 
 static void pl181_fifo_run(pl181_state *s)
 {
+SDClass *sd_class = SD_GET_CLASS(s-card);
 uint32_t bits;
 uint32_t value = 0;
 int n;
 int is_read;
 
 is_read = (s-datactrl  PL181_DATA_DIRECTION) != 0;
-if (s-datacnt != 0  (!is_read || sd_data_ready(s-card))
+if (s-datacnt != 0  (!is_read || sd_class-data_ready(s-card))
  !s-linux_hack) {
 if (is_read) {
 n = 0;
 while (s-datacnt  s-fifo_len  PL181_FIFO_LEN) {
-value |= (uint32_t)sd_read_data(s-card)  (n * 8);
+value |= (uint32_t)sd_class-read_data(s-card)  (n * 8);
 s-datacnt--;
 n++;
 if (n == 4) {
@@ -241,7 +242,7 @@ static void pl181_fifo_run(pl181_state *s)
 }
 n--;
 s-datacnt--;
-sd_write_data(s-card, value  0xff);
+sd_class-write_data(s-card, value  0xff);
 value = 8;
 }
 }
diff --git a/hw/pxa2xx_mmci.c b/hw/pxa2xx_mmci.c
index b505a4c..af19c36 100644
--- a/hw/pxa2xx_mmci.c
+++ b/hw/pxa2xx_mmci.c
@@ -117,12 +117,14 @@ static void pxa2xx_mmci_int_update(PXA2xxMMCIState *s)
 
 static void pxa2xx_mmci_fifo_update(PXA2xxMMCIState *s)
 {
+SDClass *sd_class = SD_GET_CLASS(s-card);
+
 if (!s-active)
 return;
 
 if (s-cmdat  CMDAT_WR_RD) {
 while (s-bytesleft  s-tx_len) {
-sd_write_data(s-card, s-tx_fifo[s-tx_start ++]);
+sd_class-write_data(s-card, s-tx_fifo[s-tx_start ++]);
 s-tx_start = 0x1f;
 s-tx_len --;
 s-bytesleft --;
@@ -132,7 +134,7 @@ static void pxa2xx_mmci_fifo_update(PXA2xxMMCIState *s)
 } else
 while (s-bytesleft  s-rx_len  32) {
 s-rx_fifo[(s-rx_start + (s-rx_len ++))  0x1f] =
-sd_read_data(s-card);
+sd_class-read_data(s-card);
 s-bytesleft --;
 s-intreq |= INT_RXFIFO_REQ;
 }
-- 
1.7.5.4