Re: [U-Boot] [PATCH 2/2] mmc: tegra: invalidate complete cachelines

2012-11-05 Thread Stephen Warren
On 11/02/2012 04:10 PM, Marek Vasut wrote:
 Dear Stephen Warren,
 
 On 11/02/2012 03:28 PM, Marek Vasut wrote:
 Dear Stephen Warren,

 On 11/02/2012 02:38 PM, Marek Vasut wrote:
 ...

 Dumb question -- might be unrelated. Does the tegra mmc driver do DMA?
 And if so, what happens if you do raw read to unaligned address (aka.
 how come you don't need the bounce buffer)?

 Yes, it does DMA, I believe. (At least if it doesn't, I have no idea why
 the driver is flushing caches!)

 I guess we only support the use of aligned addresses, so e.g. the
 following would work:

 ext2load mmc 0:1 0x0010 /file

 but the following wouldn't:

 ext2load mmc 0:1 0x0014 /file

 which while I suppose it is an artificial restriction, hasn't been an
 issue in practice.

 Then just enable the bounce buffer and it will work ;-)

 You suggested that last time, and it made no difference then... In fact,
 the config option you mentioned isn't used anywhere in the srouce tree
 except adding bouncebuf.o
 
 Bouncebuf.o ?

common/bouncebuf.c

 into the build right now; which config option
 do you think I should use?
 
 CONFIG_BOUNCE_BUFFER
 
 It's used on mx28-based boards.

Ahah, there's much more to it than just defining the config option, and
grep'ing for the config option doesn't reveal that.

Defining CONFIG_BOUNCE_BUFFER simply pulls in bouncebuf.o into the
build. It's then up to the individual MMC/... driver to actually make
use of the new functions - i.e. explicitly call e.g.
bounce_buffer_start(). The MXS MMC driver calls these without any ifdef
CONFIG_BOUNCE_BUFFER, and hence doesn't show up when you grep for that.
I had assumed that such functionality would be a conditional part of the
MMC core rather than the individual drivers, hence had missed the usage
in the MXS driver.

So, that config option plus some changes to the Tegra MMC driver might
work - I'll take a look. However, making that change will simply hide
any unaligned buffer and reduce performance, so I'd still prefer to
require that buffers be aligned in the future where possible.
___
U-Boot mailing list
U-Boot@lists.denx.de
http://lists.denx.de/mailman/listinfo/u-boot


Re: [U-Boot] [PATCH 2/2] mmc: tegra: invalidate complete cachelines

2012-11-05 Thread Marek Vasut
Dear Stephen Warren,

 On 11/02/2012 04:10 PM, Marek Vasut wrote:
  Dear Stephen Warren,
  
  On 11/02/2012 03:28 PM, Marek Vasut wrote:
  Dear Stephen Warren,
  
  On 11/02/2012 02:38 PM, Marek Vasut wrote:
  ...
  
  Dumb question -- might be unrelated. Does the tegra mmc driver do
  DMA? And if so, what happens if you do raw read to unaligned address
  (aka. how come you don't need the bounce buffer)?
  
  Yes, it does DMA, I believe. (At least if it doesn't, I have no idea
  why the driver is flushing caches!)
  
  I guess we only support the use of aligned addresses, so e.g. the
  following would work:
  
  ext2load mmc 0:1 0x0010 /file
  
  but the following wouldn't:
  
  ext2load mmc 0:1 0x0014 /file
  
  which while I suppose it is an artificial restriction, hasn't been an
  issue in practice.
  
  Then just enable the bounce buffer and it will work ;-)
  
  You suggested that last time, and it made no difference then... In fact,
  the config option you mentioned isn't used anywhere in the srouce tree
  except adding bouncebuf.o
  
  Bouncebuf.o ?
 
 common/bouncebuf.c

Ugh, it was applied ...

  into the build right now; which config option
  do you think I should use?
  
  CONFIG_BOUNCE_BUFFER
  
  It's used on mx28-based boards.
 
 Ahah, there's much more to it than just defining the config option, and
 grep'ing for the config option doesn't reveal that.
 
 Defining CONFIG_BOUNCE_BUFFER simply pulls in bouncebuf.o into the
 build. It's then up to the individual MMC/... driver to actually make
 use of the new functions - i.e. explicitly call e.g.
 bounce_buffer_start(). The MXS MMC driver calls these without any ifdef
 CONFIG_BOUNCE_BUFFER, and hence doesn't show up when you grep for that.
 I had assumed that such functionality would be a conditional part of the
 MMC core rather than the individual drivers, hence had missed the usage
 in the MXS driver.

Sorry about that, I didn't know the patch was pulled in so I expected the old 
approach.

 So, that config option plus some changes to the Tegra MMC driver might
 work - I'll take a look. However, making that change will simply hide
 any unaligned buffer and reduce performance, so I'd still prefer to
 require that buffers be aligned in the future where possible.

Best regards,
Marek Vasut
___
U-Boot mailing list
U-Boot@lists.denx.de
http://lists.denx.de/mailman/listinfo/u-boot


Re: [U-Boot] [PATCH 2/2] mmc: tegra: invalidate complete cachelines

2012-11-05 Thread Simon Glass
Hi Stephen,

On Mon, Nov 5, 2012 at 11:50 AM, Stephen Warren swar...@wwwdotorg.org wrote:
 On 11/02/2012 04:10 PM, Marek Vasut wrote:
 Dear Stephen Warren,

 On 11/02/2012 03:28 PM, Marek Vasut wrote:
 Dear Stephen Warren,

 On 11/02/2012 02:38 PM, Marek Vasut wrote:
 ...

 Dumb question -- might be unrelated. Does the tegra mmc driver do DMA?
 And if so, what happens if you do raw read to unaligned address (aka.
 how come you don't need the bounce buffer)?

 Yes, it does DMA, I believe. (At least if it doesn't, I have no idea why
 the driver is flushing caches!)

 I guess we only support the use of aligned addresses, so e.g. the
 following would work:

 ext2load mmc 0:1 0x0010 /file

 but the following wouldn't:

 ext2load mmc 0:1 0x0014 /file

 which while I suppose it is an artificial restriction, hasn't been an
 issue in practice.

 Then just enable the bounce buffer and it will work ;-)

 You suggested that last time, and it made no difference then... In fact,
 the config option you mentioned isn't used anywhere in the srouce tree
 except adding bouncebuf.o

 Bouncebuf.o ?

 common/bouncebuf.c

 into the build right now; which config option
 do you think I should use?

 CONFIG_BOUNCE_BUFFER

 It's used on mx28-based boards.

 Ahah, there's much more to it than just defining the config option, and
 grep'ing for the config option doesn't reveal that.

 Defining CONFIG_BOUNCE_BUFFER simply pulls in bouncebuf.o into the
 build. It's then up to the individual MMC/... driver to actually make
 use of the new functions - i.e. explicitly call e.g.
 bounce_buffer_start(). The MXS MMC driver calls these without any ifdef
 CONFIG_BOUNCE_BUFFER, and hence doesn't show up when you grep for that.
 I had assumed that such functionality would be a conditional part of the
 MMC core rather than the individual drivers, hence had missed the usage
 in the MXS driver.

 So, that config option plus some changes to the Tegra MMC driver might
 work - I'll take a look. However, making that change will simply hide
 any unaligned buffer and reduce performance, so I'd still prefer to
 require that buffers be aligned in the future where possible.

Yes that seems right to me.

Regards,
Simon
___
U-Boot mailing list
U-Boot@lists.denx.de
http://lists.denx.de/mailman/listinfo/u-boot


Re: [U-Boot] [PATCH 2/2] mmc: tegra: invalidate complete cachelines

2012-11-02 Thread Stephen Warren
On 04/26/2012 11:29 PM, Mike Frysinger wrote:
 On Thursday 26 April 2012 06:34:43 Thierry Reding wrote:
 For reference, see sd_change_freq() in drivers/mmc/mmc.c.

This is a follow-up to:

http://lists.denx.de/pipermail/u-boot/2012-April/123080.html

which was referenced from:

http://lists.denx.de/pipermail/u-boot/2012-September/133641.html

 yes, that shows what we're talking about. int sd_change_freq(struct
 mmc *mmc) { ... stack vars ... int err; 
 ALLOC_CACHE_ALIGN_BUFFER(uint, scr, 2); struct mmc_data data; int
 timeout; ... stack vars ... ... data.dest = (char *)scr; 
 data.blocksize = 8; data.blocks = 1; data.flags = MMC_DATA_READ; 
 err = mmc_send_cmd(mmc, cmd, data); ...
 
 static int mmc_send_cmd(struct mmc *mmc, struct mmc_cmd *cmd, 
 struct mmc_data *data) { ... if (data-flags  MMC_DATA_READ) { if
 ((uintptr_t)data-dest  (ARCH_DMA_MINALIGN - 1)) printf(Warning:
 unaligned read from %p  may fail\n, data-dest); 
 invalidate_dcache_range((ulong)data-dest, (ulong)data-dest + 
 data-blocks * data-blocksize); } ...
 
 so what invalidate_dcache_range() will invalidate is from scr and
 8 bytes after that.  the higher layers make sure scr is aligned
 properly, but not that it spans a full cache line.  so if the stack
 layout is: [random stuff] 0x100   [scr[0]] 0x104  [scr[1]] 0x108
 [err] 0x10a   [timeout] [more stuff]

That's not the stack layout. scr is allocated (as quoted above) using
ALLOC_CACHE_ALIGN_BUFFER. Therefore, the storage space allocated for
scr is always cache-aligned in address and size, even if the code only
uses 8 bytes of it.

 this implicitly invalidates [err] and [timeout] and more stuff.  by
 you forcibly rounding up the length, you no longer get a warning,
 but you still (silently) blew away those variables from cache
 possibly losing changes that weren't written back to external
 memory yet.

So, this problem won't actually occur here.

So, Thierry's proposed solution (which I'll re-quote below) would
actually work just fine:

 [PATCH 2/2] mmc: tegra: invalidate complete cachelines
 
 The MMC core sometimes reads buffers that are smaller than a
 complete cacheline, for example when reading the SCR. In order to
 avoid a warning from the ARM v7 cache handling code, this patch
 makes sure that complete cachelines are flushed.
 
 Signed-off-by: Thierry Reding thierry.reding at
 avionic-design.de --- drivers/mmc/tegra2_mmc.c |7 --- 1
 file changed, 4 insertions(+), 3 deletions(-)
 
 diff --git a/drivers/mmc/tegra2_mmc.c b/drivers/mmc/tegra2_mmc.c 
 index fb8a57d..3615876 100644 --- a/drivers/mmc/tegra2_mmc.c +++
 b/drivers/mmc/tegra2_mmc.c @@ -323,12 +323,13 @@ static int
 mmc_send_cmd(struct mmc *mmc, struct mmc_cmd *cmd, } writel(mask,
 host-reg-norintsts); if (data-flags  MMC_DATA_READ) { +
 ulong end = (ulong)data-dest + + 
 roundup(data-blocks *
 data-blocksize, +
 ARCH_DMA_MINALIGN); if
 ((uintptr_t)data-dest  (ARCH_DMA_MINALIGN - 1)) printf(Warning:
 unaligned read from %p  may fail\n, data-dest); -
 invalidate_dcache_range((ulong)data-dest, -  
 (ulong)data-dest
 + -   data-blocks * data-blocksize); +
 invalidate_dcache_range((ulong)data-dest, end); } }
 
 --

... so long as we require any struct mmc_data data's .dest field to
point at a buffer that was allocated using ALLOC_CACHE_ALIGN_BUFFER.

So, I'd like to propose that we take Thierry's fix, perhaps having
validated (or implemented some way of enforcing) that
ALLOC_CACHE_ALIGN_BUFFER() was used to allocate the buffer pointer.
___
U-Boot mailing list
U-Boot@lists.denx.de
http://lists.denx.de/mailman/listinfo/u-boot


Re: [U-Boot] [PATCH 2/2] mmc: tegra: invalidate complete cachelines

2012-11-02 Thread Simon Glass
Hi Stephen,

On Fri, Nov 2, 2012 at 1:22 PM, Stephen Warren swar...@wwwdotorg.org wrote:
 On 04/26/2012 11:29 PM, Mike Frysinger wrote:
 On Thursday 26 April 2012 06:34:43 Thierry Reding wrote:
 For reference, see sd_change_freq() in drivers/mmc/mmc.c.

 This is a follow-up to:

 http://lists.denx.de/pipermail/u-boot/2012-April/123080.html

 which was referenced from:

 http://lists.denx.de/pipermail/u-boot/2012-September/133641.html

 yes, that shows what we're talking about. int sd_change_freq(struct
 mmc *mmc) { ... stack vars ... int err;
 ALLOC_CACHE_ALIGN_BUFFER(uint, scr, 2); struct mmc_data data; int
 timeout; ... stack vars ... ... data.dest = (char *)scr;
 data.blocksize = 8; data.blocks = 1; data.flags = MMC_DATA_READ;
 err = mmc_send_cmd(mmc, cmd, data); ...

 static int mmc_send_cmd(struct mmc *mmc, struct mmc_cmd *cmd,
 struct mmc_data *data) { ... if (data-flags  MMC_DATA_READ) { if
 ((uintptr_t)data-dest  (ARCH_DMA_MINALIGN - 1)) printf(Warning:
 unaligned read from %p  may fail\n, data-dest);
 invalidate_dcache_range((ulong)data-dest, (ulong)data-dest +
 data-blocks * data-blocksize); } ...

 so what invalidate_dcache_range() will invalidate is from scr and
 8 bytes after that.  the higher layers make sure scr is aligned
 properly, but not that it spans a full cache line.  so if the stack
 layout is: [random stuff] 0x100   [scr[0]] 0x104  [scr[1]] 0x108
 [err] 0x10a   [timeout] [more stuff]

 That's not the stack layout. scr is allocated (as quoted above) using
 ALLOC_CACHE_ALIGN_BUFFER. Therefore, the storage space allocated for
 scr is always cache-aligned in address and size, even if the code only
 uses 8 bytes of it.

 this implicitly invalidates [err] and [timeout] and more stuff.  by
 you forcibly rounding up the length, you no longer get a warning,
 but you still (silently) blew away those variables from cache
 possibly losing changes that weren't written back to external
 memory yet.

 So, this problem won't actually occur here.

 So, Thierry's proposed solution (which I'll re-quote below) would
 actually work just fine:

 [PATCH 2/2] mmc: tegra: invalidate complete cachelines

 The MMC core sometimes reads buffers that are smaller than a
 complete cacheline, for example when reading the SCR. In order to
 avoid a warning from the ARM v7 cache handling code, this patch
 makes sure that complete cachelines are flushed.

 Signed-off-by: Thierry Reding thierry.reding at
 avionic-design.de --- drivers/mmc/tegra2_mmc.c |7 --- 1
 file changed, 4 insertions(+), 3 deletions(-)

 diff --git a/drivers/mmc/tegra2_mmc.c b/drivers/mmc/tegra2_mmc.c
 index fb8a57d..3615876 100644 --- a/drivers/mmc/tegra2_mmc.c +++
 b/drivers/mmc/tegra2_mmc.c @@ -323,12 +323,13 @@ static int
 mmc_send_cmd(struct mmc *mmc, struct mmc_cmd *cmd, } writel(mask,
 host-reg-norintsts); if (data-flags  MMC_DATA_READ) { +
 ulong end = (ulong)data-dest + + 
 roundup(data-blocks *
 data-blocksize, +
 ARCH_DMA_MINALIGN); if
 ((uintptr_t)data-dest  (ARCH_DMA_MINALIGN - 1)) printf(Warning:
 unaligned read from %p  may fail\n, data-dest); -
 invalidate_dcache_range((ulong)data-dest, -  
 (ulong)data-dest
 + -   data-blocks * data-blocksize); +
 invalidate_dcache_range((ulong)data-dest, end); } }

 --

 ... so long as we require any struct mmc_data data's .dest field to
 point at a buffer that was allocated using ALLOC_CACHE_ALIGN_BUFFER.

 So, I'd like to propose that we take Thierry's fix, perhaps having
 validated (or implemented some way of enforcing) that
 ALLOC_CACHE_ALIGN_BUFFER() was used to allocate the buffer pointer.

That sounds ok to me.

It is important to avoid silent failure if we can. Are you proposing
to pass a 'size' parameter along with any buffer pointer, or something
else?

Regards,
Simon
___
U-Boot mailing list
U-Boot@lists.denx.de
http://lists.denx.de/mailman/listinfo/u-boot


Re: [U-Boot] [PATCH 2/2] mmc: tegra: invalidate complete cachelines

2012-11-02 Thread Marek Vasut
Dear Simon Glass,

 Hi Stephen,
 
 On Fri, Nov 2, 2012 at 1:22 PM, Stephen Warren swar...@wwwdotorg.org wrote:
  On 04/26/2012 11:29 PM, Mike Frysinger wrote:
  On Thursday 26 April 2012 06:34:43 Thierry Reding wrote:
  For reference, see sd_change_freq() in drivers/mmc/mmc.c.
  
  This is a follow-up to:
  
  http://lists.denx.de/pipermail/u-boot/2012-April/123080.html
  
  which was referenced from:
  
  http://lists.denx.de/pipermail/u-boot/2012-September/133641.html
  
  yes, that shows what we're talking about. int sd_change_freq(struct
  mmc *mmc) { ... stack vars ... int err;
  ALLOC_CACHE_ALIGN_BUFFER(uint, scr, 2); struct mmc_data data; int
  timeout; ... stack vars ... ... data.dest = (char *)scr;
  data.blocksize = 8; data.blocks = 1; data.flags = MMC_DATA_READ;
  err = mmc_send_cmd(mmc, cmd, data); ...
  
  static int mmc_send_cmd(struct mmc *mmc, struct mmc_cmd *cmd,
  struct mmc_data *data) { ... if (data-flags  MMC_DATA_READ) { if
  ((uintptr_t)data-dest  (ARCH_DMA_MINALIGN - 1)) printf(Warning:
  unaligned read from %p  may fail\n, data-dest);
  invalidate_dcache_range((ulong)data-dest, (ulong)data-dest +
  data-blocks * data-blocksize); } ...
  
  so what invalidate_dcache_range() will invalidate is from scr and
  8 bytes after that.  the higher layers make sure scr is aligned
  properly, but not that it spans a full cache line.  so if the stack
  layout is: [random stuff] 0x100   [scr[0]] 0x104  [scr[1]] 0x108
  [err] 0x10a   [timeout] [more stuff]
  
  That's not the stack layout. scr is allocated (as quoted above) using
  ALLOC_CACHE_ALIGN_BUFFER. Therefore, the storage space allocated for
  scr is always cache-aligned in address and size, even if the code only
  uses 8 bytes of it.
  
  this implicitly invalidates [err] and [timeout] and more stuff.  by
  you forcibly rounding up the length, you no longer get a warning,
  but you still (silently) blew away those variables from cache
  possibly losing changes that weren't written back to external
  memory yet.
  
  So, this problem won't actually occur here.
  
  So, Thierry's proposed solution (which I'll re-quote below) would
  
  actually work just fine:
  [PATCH 2/2] mmc: tegra: invalidate complete cachelines
  
  The MMC core sometimes reads buffers that are smaller than a
  complete cacheline, for example when reading the SCR. In order to
  avoid a warning from the ARM v7 cache handling code, this patch
  makes sure that complete cachelines are flushed.
  
  Signed-off-by: Thierry Reding thierry.reding at
  avionic-design.de --- drivers/mmc/tegra2_mmc.c |7 --- 1
  file changed, 4 insertions(+), 3 deletions(-)
  
  diff --git a/drivers/mmc/tegra2_mmc.c b/drivers/mmc/tegra2_mmc.c
  index fb8a57d..3615876 100644 --- a/drivers/mmc/tegra2_mmc.c +++
  b/drivers/mmc/tegra2_mmc.c @@ -323,12 +323,13 @@ static int
  mmc_send_cmd(struct mmc *mmc, struct mmc_cmd *cmd, } writel(mask,
  host-reg-norintsts); if (data-flags  MMC_DATA_READ) { +
  ulong end = (ulong)data-dest + +
  roundup(data-blocks * data-blocksize, +  
   ARCH_DMA_MINALIGN); if ((uintptr_t)data-dest 
  (ARCH_DMA_MINALIGN - 1)) printf(Warning: unaligned read from %p  may
  fail\n, data-dest); -
  invalidate_dcache_range((ulong)data-dest, - 
  (ulong)data-dest + -   data-blocks *
  data-blocksize); + invalidate_dcache_range((ulong)data-dest, end); }
  }
  
  --
  
  ... so long as we require any struct mmc_data data's .dest field to
  point at a buffer that was allocated using ALLOC_CACHE_ALIGN_BUFFER.
  
  So, I'd like to propose that we take Thierry's fix, perhaps having
  validated (or implemented some way of enforcing) that
  ALLOC_CACHE_ALIGN_BUFFER() was used to allocate the buffer pointer.
 
 That sounds ok to me.
 
 It is important to avoid silent failure if we can. Are you proposing
 to pass a 'size' parameter along with any buffer pointer, or something
 else?

Dumb question -- might be unrelated. Does the tegra mmc driver do DMA? And if 
so, what happens if you do raw read to unaligned address (aka. how come you 
don't need the bounce buffer)?
___
U-Boot mailing list
U-Boot@lists.denx.de
http://lists.denx.de/mailman/listinfo/u-boot


Re: [U-Boot] [PATCH 2/2] mmc: tegra: invalidate complete cachelines

2012-11-02 Thread Stephen Warren
On 11/02/2012 02:25 PM, Simon Glass wrote:
 Hi Stephen,
 
 On Fri, Nov 2, 2012 at 1:22 PM, Stephen Warren swar...@wwwdotorg.org wrote:
 On 04/26/2012 11:29 PM, Mike Frysinger wrote:
 On Thursday 26 April 2012 06:34:43 Thierry Reding wrote:
 For reference, see sd_change_freq() in drivers/mmc/mmc.c.

 This is a follow-up to:

 http://lists.denx.de/pipermail/u-boot/2012-April/123080.html

 which was referenced from:

 http://lists.denx.de/pipermail/u-boot/2012-September/133641.html

 yes, that shows what we're talking about. int sd_change_freq(struct
 mmc *mmc) { ... stack vars ... int err;
 ALLOC_CACHE_ALIGN_BUFFER(uint, scr, 2); struct mmc_data data; int
 timeout; ... stack vars ... ... data.dest = (char *)scr;
 data.blocksize = 8; data.blocks = 1; data.flags = MMC_DATA_READ;
 err = mmc_send_cmd(mmc, cmd, data); ...

 static int mmc_send_cmd(struct mmc *mmc, struct mmc_cmd *cmd,
 struct mmc_data *data) { ... if (data-flags  MMC_DATA_READ) { if
 ((uintptr_t)data-dest  (ARCH_DMA_MINALIGN - 1)) printf(Warning:
 unaligned read from %p  may fail\n, data-dest);
 invalidate_dcache_range((ulong)data-dest, (ulong)data-dest +
 data-blocks * data-blocksize); } ...

 so what invalidate_dcache_range() will invalidate is from scr and
 8 bytes after that.  the higher layers make sure scr is aligned
 properly, but not that it spans a full cache line.  so if the stack
 layout is: [random stuff] 0x100   [scr[0]] 0x104  [scr[1]] 0x108
 [err] 0x10a   [timeout] [more stuff]

 That's not the stack layout. scr is allocated (as quoted above) using
 ALLOC_CACHE_ALIGN_BUFFER. Therefore, the storage space allocated for
 scr is always cache-aligned in address and size, even if the code only
 uses 8 bytes of it.

 this implicitly invalidates [err] and [timeout] and more stuff.  by
 you forcibly rounding up the length, you no longer get a warning,
 but you still (silently) blew away those variables from cache
 possibly losing changes that weren't written back to external
 memory yet.

 So, this problem won't actually occur here.

 So, Thierry's proposed solution (which I'll re-quote below) would
 actually work just fine:

 [PATCH 2/2] mmc: tegra: invalidate complete cachelines

 The MMC core sometimes reads buffers that are smaller than a
 complete cacheline, for example when reading the SCR. In order to
 avoid a warning from the ARM v7 cache handling code, this patch
 makes sure that complete cachelines are flushed.

 Signed-off-by: Thierry Reding thierry.reding at
 avionic-design.de --- drivers/mmc/tegra2_mmc.c |7 --- 1
 file changed, 4 insertions(+), 3 deletions(-)

 diff --git a/drivers/mmc/tegra2_mmc.c b/drivers/mmc/tegra2_mmc.c
 index fb8a57d..3615876 100644 --- a/drivers/mmc/tegra2_mmc.c +++
 b/drivers/mmc/tegra2_mmc.c @@ -323,12 +323,13 @@ static int
 mmc_send_cmd(struct mmc *mmc, struct mmc_cmd *cmd, } writel(mask,
 host-reg-norintsts); if (data-flags  MMC_DATA_READ) { +
 ulong end = (ulong)data-dest + + 
 roundup(data-blocks *
 data-blocksize, +
 ARCH_DMA_MINALIGN); if
 ((uintptr_t)data-dest  (ARCH_DMA_MINALIGN - 1)) printf(Warning:
 unaligned read from %p  may fail\n, data-dest); -
 invalidate_dcache_range((ulong)data-dest, -  
 (ulong)data-dest
 + -   data-blocks * data-blocksize); +
 invalidate_dcache_range((ulong)data-dest, end); } }

 --

 ... so long as we require any struct mmc_data data's .dest field to
 point at a buffer that was allocated using ALLOC_CACHE_ALIGN_BUFFER.

 So, I'd like to propose that we take Thierry's fix, perhaps having
 validated (or implemented some way of enforcing) that
 ALLOC_CACHE_ALIGN_BUFFER() was used to allocate the buffer pointer.
 
 That sounds ok to me.
 
 It is important to avoid silent failure if we can. Are you proposing
 to pass a 'size' parameter along with any buffer pointer, or something
 else?

I was wondering about creating some kind of macro to initialize the
buffer pointer in struct mmc_data, and writing that macro so that the
buffer passed to it had to have been allocated using
ALLOC_CACHE_ALIGN_BUFFER (e.g. by referencing the hidden variable it
creates, or something like that).

Passing in the buffer size would also work, although it seems like it'd
be easier to screw that up and hide the silent errors you mentioned?
___
U-Boot mailing list
U-Boot@lists.denx.de
http://lists.denx.de/mailman/listinfo/u-boot


Re: [U-Boot] [PATCH 2/2] mmc: tegra: invalidate complete cachelines

2012-11-02 Thread Stephen Warren
On 11/02/2012 02:38 PM, Marek Vasut wrote:
 Dear Simon Glass,
 
 Hi Stephen,

 On Fri, Nov 2, 2012 at 1:22 PM, Stephen Warren swar...@wwwdotorg.org wrote:
 On 04/26/2012 11:29 PM, Mike Frysinger wrote:
 On Thursday 26 April 2012 06:34:43 Thierry Reding wrote:
 For reference, see sd_change_freq() in drivers/mmc/mmc.c.

 This is a follow-up to:

 http://lists.denx.de/pipermail/u-boot/2012-April/123080.html

 which was referenced from:

 http://lists.denx.de/pipermail/u-boot/2012-September/133641.html

 yes, that shows what we're talking about. int sd_change_freq(struct
 mmc *mmc) { ... stack vars ... int err;
 ALLOC_CACHE_ALIGN_BUFFER(uint, scr, 2); struct mmc_data data; int
 timeout; ... stack vars ... ... data.dest = (char *)scr;
 data.blocksize = 8; data.blocks = 1; data.flags = MMC_DATA_READ;
 err = mmc_send_cmd(mmc, cmd, data); ...

 static int mmc_send_cmd(struct mmc *mmc, struct mmc_cmd *cmd,
 struct mmc_data *data) { ... if (data-flags  MMC_DATA_READ) { if
 ((uintptr_t)data-dest  (ARCH_DMA_MINALIGN - 1)) printf(Warning:
 unaligned read from %p  may fail\n, data-dest);
 invalidate_dcache_range((ulong)data-dest, (ulong)data-dest +
 data-blocks * data-blocksize); } ...

 so what invalidate_dcache_range() will invalidate is from scr and
 8 bytes after that.  the higher layers make sure scr is aligned
 properly, but not that it spans a full cache line.  so if the stack
 layout is: [random stuff] 0x100   [scr[0]] 0x104  [scr[1]] 0x108
 [err] 0x10a   [timeout] [more stuff]

 That's not the stack layout. scr is allocated (as quoted above) using
 ALLOC_CACHE_ALIGN_BUFFER. Therefore, the storage space allocated for
 scr is always cache-aligned in address and size, even if the code only
 uses 8 bytes of it.

 this implicitly invalidates [err] and [timeout] and more stuff.  by
 you forcibly rounding up the length, you no longer get a warning,
 but you still (silently) blew away those variables from cache
 possibly losing changes that weren't written back to external
 memory yet.

 So, this problem won't actually occur here.

 So, Thierry's proposed solution (which I'll re-quote below) would

 actually work just fine:
 [PATCH 2/2] mmc: tegra: invalidate complete cachelines

 The MMC core sometimes reads buffers that are smaller than a
 complete cacheline, for example when reading the SCR. In order to
 avoid a warning from the ARM v7 cache handling code, this patch
 makes sure that complete cachelines are flushed.

 Signed-off-by: Thierry Reding thierry.reding at
 avionic-design.de --- drivers/mmc/tegra2_mmc.c |7 --- 1
 file changed, 4 insertions(+), 3 deletions(-)

 diff --git a/drivers/mmc/tegra2_mmc.c b/drivers/mmc/tegra2_mmc.c
 index fb8a57d..3615876 100644 --- a/drivers/mmc/tegra2_mmc.c +++
 b/drivers/mmc/tegra2_mmc.c @@ -323,12 +323,13 @@ static int
 mmc_send_cmd(struct mmc *mmc, struct mmc_cmd *cmd, } writel(mask,
 host-reg-norintsts); if (data-flags  MMC_DATA_READ) { +
 ulong end = (ulong)data-dest + +
 roundup(data-blocks * data-blocksize, +  
  ARCH_DMA_MINALIGN); if ((uintptr_t)data-dest 
 (ARCH_DMA_MINALIGN - 1)) printf(Warning: unaligned read from %p  may
 fail\n, data-dest); -
 invalidate_dcache_range((ulong)data-dest, - 
 (ulong)data-dest + -   data-blocks *
 data-blocksize); + invalidate_dcache_range((ulong)data-dest, end); }
 }

 --

 ... so long as we require any struct mmc_data data's .dest field to
 point at a buffer that was allocated using ALLOC_CACHE_ALIGN_BUFFER.

 So, I'd like to propose that we take Thierry's fix, perhaps having
 validated (or implemented some way of enforcing) that
 ALLOC_CACHE_ALIGN_BUFFER() was used to allocate the buffer pointer.

 That sounds ok to me.

 It is important to avoid silent failure if we can. Are you proposing
 to pass a 'size' parameter along with any buffer pointer, or something
 else?
 
 Dumb question -- might be unrelated. Does the tegra mmc driver do DMA? And if 
 so, what happens if you do raw read to unaligned address (aka. how come you 
 don't need the bounce buffer)?

Yes, it does DMA, I believe. (At least if it doesn't, I have no idea why
the driver is flushing caches!)

I guess we only support the use of aligned addresses, so e.g. the
following would work:

ext2load mmc 0:1 0x0010 /file

but the following wouldn't:

ext2load mmc 0:1 0x0014 /file

which while I suppose it is an artificial restriction, hasn't been an
issue in practice.
___
U-Boot mailing list
U-Boot@lists.denx.de
http://lists.denx.de/mailman/listinfo/u-boot


Re: [U-Boot] [PATCH 2/2] mmc: tegra: invalidate complete cachelines

2012-11-02 Thread Marek Vasut
Dear Stephen Warren,

 On 11/02/2012 02:38 PM, Marek Vasut wrote:
  Dear Simon Glass,
  
  Hi Stephen,
  
  On Fri, Nov 2, 2012 at 1:22 PM, Stephen Warren swar...@wwwdotorg.org 
wrote:
  On 04/26/2012 11:29 PM, Mike Frysinger wrote:
  On Thursday 26 April 2012 06:34:43 Thierry Reding wrote:
  For reference, see sd_change_freq() in drivers/mmc/mmc.c.
  
  This is a follow-up to:
  
  http://lists.denx.de/pipermail/u-boot/2012-April/123080.html
  
  which was referenced from:
  
  http://lists.denx.de/pipermail/u-boot/2012-September/133641.html
  
  yes, that shows what we're talking about. int sd_change_freq(struct
  mmc *mmc) { ... stack vars ... int err;
  ALLOC_CACHE_ALIGN_BUFFER(uint, scr, 2); struct mmc_data data; int
  timeout; ... stack vars ... ... data.dest = (char *)scr;
  data.blocksize = 8; data.blocks = 1; data.flags = MMC_DATA_READ;
  err = mmc_send_cmd(mmc, cmd, data); ...
  
  static int mmc_send_cmd(struct mmc *mmc, struct mmc_cmd *cmd,
  struct mmc_data *data) { ... if (data-flags  MMC_DATA_READ) { if
  ((uintptr_t)data-dest  (ARCH_DMA_MINALIGN - 1)) printf(Warning:
  unaligned read from %p  may fail\n, data-dest);
  invalidate_dcache_range((ulong)data-dest, (ulong)data-dest +
  data-blocks * data-blocksize); } ...
  
  so what invalidate_dcache_range() will invalidate is from scr and
  8 bytes after that.  the higher layers make sure scr is aligned
  properly, but not that it spans a full cache line.  so if the stack
  layout is: [random stuff] 0x100   [scr[0]] 0x104  [scr[1]] 0x108
  [err] 0x10a   [timeout] [more stuff]
  
  That's not the stack layout. scr is allocated (as quoted above) using
  ALLOC_CACHE_ALIGN_BUFFER. Therefore, the storage space allocated for
  scr is always cache-aligned in address and size, even if the code only
  uses 8 bytes of it.
  
  this implicitly invalidates [err] and [timeout] and more stuff.  by
  you forcibly rounding up the length, you no longer get a warning,
  but you still (silently) blew away those variables from cache
  possibly losing changes that weren't written back to external
  memory yet.
  
  So, this problem won't actually occur here.
  
  So, Thierry's proposed solution (which I'll re-quote below) would
  
  actually work just fine:
  [PATCH 2/2] mmc: tegra: invalidate complete cachelines
  
  The MMC core sometimes reads buffers that are smaller than a
  complete cacheline, for example when reading the SCR. In order to
  avoid a warning from the ARM v7 cache handling code, this patch
  makes sure that complete cachelines are flushed.
  
  Signed-off-by: Thierry Reding thierry.reding at
  avionic-design.de --- drivers/mmc/tegra2_mmc.c |7 --- 1
  file changed, 4 insertions(+), 3 deletions(-)
  
  diff --git a/drivers/mmc/tegra2_mmc.c b/drivers/mmc/tegra2_mmc.c
  index fb8a57d..3615876 100644 --- a/drivers/mmc/tegra2_mmc.c +++
  b/drivers/mmc/tegra2_mmc.c @@ -323,12 +323,13 @@ static int
  mmc_send_cmd(struct mmc *mmc, struct mmc_cmd *cmd, } writel(mask,
  host-reg-norintsts); if (data-flags  MMC_DATA_READ) { +
  ulong end = (ulong)data-dest + +
  roundup(data-blocks * data-blocksize, +
  
   ARCH_DMA_MINALIGN); if ((uintptr_t)data-dest 
  
  (ARCH_DMA_MINALIGN - 1)) printf(Warning: unaligned read from %p 
  may fail\n, data-dest); -
  invalidate_dcache_range((ulong)data-dest, -
  (ulong)data-dest + -   data-blocks *
  data-blocksize); + invalidate_dcache_range((ulong)data-dest, end); }
  }
  
  --
  
  ... so long as we require any struct mmc_data data's .dest field to
  point at a buffer that was allocated using ALLOC_CACHE_ALIGN_BUFFER.
  
  So, I'd like to propose that we take Thierry's fix, perhaps having
  validated (or implemented some way of enforcing) that
  ALLOC_CACHE_ALIGN_BUFFER() was used to allocate the buffer pointer.
  
  That sounds ok to me.
  
  It is important to avoid silent failure if we can. Are you proposing
  to pass a 'size' parameter along with any buffer pointer, or something
  else?
  
  Dumb question -- might be unrelated. Does the tegra mmc driver do DMA?
  And if so, what happens if you do raw read to unaligned address (aka.
  how come you don't need the bounce buffer)?
 
 Yes, it does DMA, I believe. (At least if it doesn't, I have no idea why
 the driver is flushing caches!)
 
 I guess we only support the use of aligned addresses, so e.g. the
 following would work:
 
 ext2load mmc 0:1 0x0010 /file
 
 but the following wouldn't:
 
 ext2load mmc 0:1 0x0014 /file
 
 which while I suppose it is an artificial restriction, hasn't been an
 issue in practice.

Then just enable the bounce buffer and it will work ;-)
___
U-Boot mailing list
U-Boot@lists.denx.de
http://lists.denx.de/mailman/listinfo/u-boot


Re: [U-Boot] [PATCH 2/2] mmc: tegra: invalidate complete cachelines

2012-11-02 Thread Stephen Warren
On 11/02/2012 03:28 PM, Marek Vasut wrote:
 Dear Stephen Warren,
 
 On 11/02/2012 02:38 PM, Marek Vasut wrote:
...
 Dumb question -- might be unrelated. Does the tegra mmc driver do DMA?
 And if so, what happens if you do raw read to unaligned address (aka.
 how come you don't need the bounce buffer)?

 Yes, it does DMA, I believe. (At least if it doesn't, I have no idea why
 the driver is flushing caches!)

 I guess we only support the use of aligned addresses, so e.g. the
 following would work:

 ext2load mmc 0:1 0x0010 /file

 but the following wouldn't:

 ext2load mmc 0:1 0x0014 /file

 which while I suppose it is an artificial restriction, hasn't been an
 issue in practice.
 
 Then just enable the bounce buffer and it will work ;-)

You suggested that last time, and it made no difference then... In fact,
the config option you mentioned isn't used anywhere in the srouce tree
except adding bouncebuf.o into the build right now; which config option
do you think I should use?
___
U-Boot mailing list
U-Boot@lists.denx.de
http://lists.denx.de/mailman/listinfo/u-boot


Re: [U-Boot] [PATCH 2/2] mmc: tegra: invalidate complete cachelines

2012-11-02 Thread Marek Vasut
Dear Stephen Warren,

 On 11/02/2012 03:28 PM, Marek Vasut wrote:
  Dear Stephen Warren,
  
  On 11/02/2012 02:38 PM, Marek Vasut wrote:
 ...
 
  Dumb question -- might be unrelated. Does the tegra mmc driver do DMA?
  And if so, what happens if you do raw read to unaligned address (aka.
  how come you don't need the bounce buffer)?
  
  Yes, it does DMA, I believe. (At least if it doesn't, I have no idea why
  the driver is flushing caches!)
  
  I guess we only support the use of aligned addresses, so e.g. the
  following would work:
  
  ext2load mmc 0:1 0x0010 /file
  
  but the following wouldn't:
  
  ext2load mmc 0:1 0x0014 /file
  
  which while I suppose it is an artificial restriction, hasn't been an
  issue in practice.
  
  Then just enable the bounce buffer and it will work ;-)
 
 You suggested that last time, and it made no difference then... In fact,
 the config option you mentioned isn't used anywhere in the srouce tree
 except adding bouncebuf.o

Bouncebuf.o ?

 into the build right now; which config option
 do you think I should use?

CONFIG_BOUNCE_BUFFER

It's used on mx28-based boards.

Best regards,
Marek Vasut
___
U-Boot mailing list
U-Boot@lists.denx.de
http://lists.denx.de/mailman/listinfo/u-boot


Re: [U-Boot] [PATCH 2/2] mmc: tegra: invalidate complete cachelines

2012-04-26 Thread Thierry Reding
* Mike Frysinger wrote:
 On Tuesday 24 April 2012 03:53:44 Thierry Reding wrote:
  The MMC core sometimes reads buffers that are smaller than a complete
  cacheline, for example when reading the SCR. In order to avoid a warning
  from the ARM v7 cache handling code, this patch makes sure that complete
  cachelines are flushed.
 
 this is still wrong.  all you've done is bypass the error message without 
 addressing the underlying problem -- you're invalidating too much.

Reading 8 bytes is always less than a cacheline, so we don't have much
choice, do we? We could of course always read a whole cacheline even if only
8 bytes are requested, but does that have any advantage over reading 8 bytes
and then invalidating the cacheline?

Or maybe I'm missing the point.

Thierry


pgpv2cWITeleM.pgp
Description: PGP signature
___
U-Boot mailing list
U-Boot@lists.denx.de
http://lists.denx.de/mailman/listinfo/u-boot


Re: [U-Boot] [PATCH 2/2] mmc: tegra: invalidate complete cachelines

2012-04-26 Thread Simon Glass
Hi Thierry,

On Thu, Apr 26, 2012 at 6:18 PM, Thierry Reding 
thierry.red...@avionic-design.de wrote:

 * Mike Frysinger wrote:
  On Tuesday 24 April 2012 03:53:44 Thierry Reding wrote:
   The MMC core sometimes reads buffers that are smaller than a complete
   cacheline, for example when reading the SCR. In order to avoid a
 warning
   from the ARM v7 cache handling code, this patch makes sure that
 complete
   cachelines are flushed.
 
  this is still wrong.  all you've done is bypass the error message without
  addressing the underlying problem -- you're invalidating too much.

 Reading 8 bytes is always less than a cacheline, so we don't have much
 choice, do we? We could of course always read a whole cacheline even if
 only
 8 bytes are requested, but does that have any advantage over reading 8
 bytes
 and then invalidating the cacheline?

 Or maybe I'm missing the point.


Well the point is that you can read 8 bytes but you still must use a buffer
that is large enough for DMA activity. So the caller must allocate a buffer
larger than 8 bytes. I worry that what you have done will just introduce
obscure bugs, since we will potentially invalidate stack variants (for
example) and lose their values.

With the case problems, we are trying to fix them at source (i.e. at the
higher level).

Regards,
Simon



 Thierry

 ___
 U-Boot mailing list
 U-Boot@lists.denx.de
 http://lists.denx.de/mailman/listinfo/u-boot


___
U-Boot mailing list
U-Boot@lists.denx.de
http://lists.denx.de/mailman/listinfo/u-boot


Re: [U-Boot] [PATCH 2/2] mmc: tegra: invalidate complete cachelines

2012-04-26 Thread Thierry Reding
* Simon Glass wrote:
 Hi Thierry,
 
 On Thu, Apr 26, 2012 at 6:18 PM, Thierry Reding 
 thierry.red...@avionic-design.de wrote:
 
  * Mike Frysinger wrote:
   On Tuesday 24 April 2012 03:53:44 Thierry Reding wrote:
The MMC core sometimes reads buffers that are smaller than a complete
cacheline, for example when reading the SCR. In order to avoid a
  warning
from the ARM v7 cache handling code, this patch makes sure that
  complete
cachelines are flushed.
  
   this is still wrong.  all you've done is bypass the error message without
   addressing the underlying problem -- you're invalidating too much.
 
  Reading 8 bytes is always less than a cacheline, so we don't have much
  choice, do we? We could of course always read a whole cacheline even if
  only
  8 bytes are requested, but does that have any advantage over reading 8
  bytes
  and then invalidating the cacheline?
 
  Or maybe I'm missing the point.
 
 
 Well the point is that you can read 8 bytes but you still must use a buffer
 that is large enough for DMA activity. So the caller must allocate a buffer
 larger than 8 bytes.

The buffer used in this case is allocated with the ALLOC_CACHE_ALIGN_BUFFER()
macro, so the buffer size is not an issue. However, the mmc_data structure's
blocksize field is set to 8, which is the value that will eventually end up
in invalidate_dcache_range().

For reference, see sd_change_freq() in drivers/mmc/mmc.c.

 I worry that what you have done will just introduce obscure bugs, since we
 will potentially invalidate stack variants (for example) and lose their
 values.
 
 With the case problems, we are trying to fix them at source (i.e. at the
 higher level).

I understand. The question then becomes how to best fix the passed in size.
Always passing the size of a complete cacheline in the SEND_SCR command
doesn't seem like a better option because it may have other implications on
other hardware.

Thierry


pgpaTrmn3In8t.pgp
Description: PGP signature
___
U-Boot mailing list
U-Boot@lists.denx.de
http://lists.denx.de/mailman/listinfo/u-boot


Re: [U-Boot] [PATCH 2/2] mmc: tegra: invalidate complete cachelines

2012-04-26 Thread Mike Frysinger
On Thursday 26 April 2012 06:34:43 Thierry Reding wrote:
 For reference, see sd_change_freq() in drivers/mmc/mmc.c.

yes, that shows what we're talking about.
int sd_change_freq(struct mmc *mmc)
{
... stack vars ...
int err;
ALLOC_CACHE_ALIGN_BUFFER(uint, scr, 2);
struct mmc_data data;
int timeout;
... stack vars ...
...
data.dest = (char *)scr;
data.blocksize = 8;
data.blocks = 1;
data.flags = MMC_DATA_READ;
err = mmc_send_cmd(mmc, cmd, data);
...

static int mmc_send_cmd(struct mmc *mmc, struct mmc_cmd *cmd,
struct mmc_data *data)
{
...
if (data-flags  MMC_DATA_READ) {
if ((uintptr_t)data-dest  (ARCH_DMA_MINALIGN - 1))
printf(Warning: unaligned read from %p 
may fail\n, data-dest);
invalidate_dcache_range((ulong)data-dest,
(ulong)data-dest +
data-blocks * data-blocksize);
}
...

so what invalidate_dcache_range() will invalidate is from scr and 8 bytes 
after that.  the higher layers make sure scr is aligned properly, but not 
that it spans a full cache line.  so if the stack layout is:
[random stuff]
0x100   [scr[0]]
0x104   [scr[1]]
0x108   [err]
0x10a   [timeout]
[more stuff]

this implicitly invalidates [err] and [timeout] and more stuff.  by you 
forcibly rounding up the length, you no longer get a warning, but you still 
(silently) blew away those variables from cache possibly losing changes that 
weren't written back to external memory yet.

  I worry that what you have done will just introduce obscure bugs, since
  we will potentially invalidate stack variants (for example) and lose
  their values.
  
  With the case problems, we are trying to fix them at source (i.e. at the
  higher level).
 
 I understand. The question then becomes how to best fix the passed in size.
 Always passing the size of a complete cacheline in the SEND_SCR command
 doesn't seem like a better option because it may have other implications on
 other hardware.

i proposed this in a previous thread, but i think Simon missed it.

perhaps the warning in the core code could be dropped and all your changes in 
fringe code obsoleted (such as these USB patches): when it detects that an 
address is starting on an unaligned boundary, flush that line first, and then 
let it be invalidated.  accordingly, when the end length is on an unaligned 
boundary, do the same flush-then-invalidate step.  this should also make things 
work without a (significant) loss in performance.  if anything, i suspect the 
overhead of doing runtime buffer size calculations and manually aligning 
pointers (which is what ALLOC_CACHE_ALIGN_BUFFER does) is a wash compared to 
partially flushing cache lines in the core ...

Simon: what do you think of this last idea ?
-mike


signature.asc
Description: This is a digitally signed message part.
___
U-Boot mailing list
U-Boot@lists.denx.de
http://lists.denx.de/mailman/listinfo/u-boot


Re: [U-Boot] [PATCH 2/2] mmc: tegra: invalidate complete cachelines

2012-04-26 Thread Simon Glass
Hi Mike,

On Fri, Apr 27, 2012 at 5:29 PM, Mike Frysinger vap...@gentoo.org wrote:

 On Thursday 26 April 2012 06:34:43 Thierry Reding wrote:
  For reference, see sd_change_freq() in drivers/mmc/mmc.c.

 yes, that shows what we're talking about.
 int sd_change_freq(struct mmc *mmc)
 {
... stack vars ...
int err;
ALLOC_CACHE_ALIGN_BUFFER(uint, scr, 2);
struct mmc_data data;
int timeout;
... stack vars ...
 ...
data.dest = (char *)scr;
data.blocksize = 8;
data.blocks = 1;
data.flags = MMC_DATA_READ;
err = mmc_send_cmd(mmc, cmd, data);
 ...

 static int mmc_send_cmd(struct mmc *mmc, struct mmc_cmd *cmd,
 struct mmc_data *data)
 {
 ...
 if (data-flags  MMC_DATA_READ) {
 if ((uintptr_t)data-dest  (ARCH_DMA_MINALIGN - 1))
printf(Warning: unaligned read from %p 
may fail\n, data-dest);
 invalidate_dcache_range((ulong)data-dest,
(ulong)data-dest +
 data-blocks * data-blocksize);
}
 ...

 so what invalidate_dcache_range() will invalidate is from scr and 8 bytes
 after that.  the higher layers make sure scr is aligned properly, but not
 that it spans a full cache line.  so if the stack layout is:
[random stuff]
 0x100   [scr[0]]
 0x104   [scr[1]]
 0x108   [err]
 0x10a   [timeout]
[more stuff]

 this implicitly invalidates [err] and [timeout] and more stuff.  by you
 forcibly rounding up the length, you no longer get a warning, but you still
 (silently) blew away those variables from cache possibly losing changes
 that
 weren't written back to external memory yet.

   I worry that what you have done will just introduce obscure bugs, since
   we will potentially invalidate stack variants (for example) and lose
   their values.
  
   With the case problems, we are trying to fix them at source (i.e. at
 the
   higher level).
 
  I understand. The question then becomes how to best fix the passed in
 size.
  Always passing the size of a complete cacheline in the SEND_SCR command
  doesn't seem like a better option because it may have other implications
 on
  other hardware.

 i proposed this in a previous thread, but i think Simon missed it.

 perhaps the warning in the core code could be dropped and all your changes
 in
 fringe code obsoleted (such as these USB patches): when it detects that an
 address is starting on an unaligned boundary, flush that line first, and
 then
 let it be invalidated.  accordingly, when the end length is on an unaligned
 boundary, do the same flush-then-invalidate step.  this should also make
 things
 work without a (significant) loss in performance.  if anything, i suspect
 the
 overhead of doing runtime buffer size calculations and manually aligning
 pointers (which is what ALLOC_CACHE_ALIGN_BUFFER does) is a wash compared
 to
 partially flushing cache lines in the core ...

 Simon: what do you think of this last idea ?


I think I proved to myself a while back that it can't be done. Even if you
flush you can't know that more activity will not occur on that cache line
when you access the stack a microsecond later. Then much later when the DMA
updates the RAM address with the data, you will not see it, because you
have already pulled in that cache line. We are relying on the invalidate to
'hold' until we are ready to read the data.

How about just using an array of one element? So instead of:

struct fred fred;

do_something_with(fred);

use:

ALLOC_CACHE_ALIGN_BUFFER(struct fred, scr, 1)

do_something_with(fred);

Regards,
Simon


 -mike

___
U-Boot mailing list
U-Boot@lists.denx.de
http://lists.denx.de/mailman/listinfo/u-boot


Re: [U-Boot] [PATCH 2/2] mmc: tegra: invalidate complete cachelines

2012-04-25 Thread Mike Frysinger
On Tuesday 24 April 2012 03:53:44 Thierry Reding wrote:
 The MMC core sometimes reads buffers that are smaller than a complete
 cacheline, for example when reading the SCR. In order to avoid a warning
 from the ARM v7 cache handling code, this patch makes sure that complete
 cachelines are flushed.

this is still wrong.  all you've done is bypass the error message without 
addressing the underlying problem -- you're invalidating too much.
-mike


signature.asc
Description: This is a digitally signed message part.
___
U-Boot mailing list
U-Boot@lists.denx.de
http://lists.denx.de/mailman/listinfo/u-boot


[U-Boot] [PATCH 2/2] mmc: tegra: invalidate complete cachelines

2012-04-24 Thread Thierry Reding
The MMC core sometimes reads buffers that are smaller than a complete
cacheline, for example when reading the SCR. In order to avoid a warning
from the ARM v7 cache handling code, this patch makes sure that complete
cachelines are flushed.

Signed-off-by: Thierry Reding thierry.red...@avionic-design.de
---
 drivers/mmc/tegra2_mmc.c |7 ---
 1 file changed, 4 insertions(+), 3 deletions(-)

diff --git a/drivers/mmc/tegra2_mmc.c b/drivers/mmc/tegra2_mmc.c
index fb8a57d..3615876 100644
--- a/drivers/mmc/tegra2_mmc.c
+++ b/drivers/mmc/tegra2_mmc.c
@@ -323,12 +323,13 @@ static int mmc_send_cmd(struct mmc *mmc, struct mmc_cmd 
*cmd,
}
writel(mask, host-reg-norintsts);
if (data-flags  MMC_DATA_READ) {
+   ulong end = (ulong)data-dest +
+   roundup(data-blocks * data-blocksize,
+   ARCH_DMA_MINALIGN);
if ((uintptr_t)data-dest  (ARCH_DMA_MINALIGN - 1))
printf(Warning: unaligned read from %p 
may fail\n, data-dest);
-   invalidate_dcache_range((ulong)data-dest,
-   (ulong)data-dest +
-   data-blocks * data-blocksize);
+   invalidate_dcache_range((ulong)data-dest, end);
}
}
 
-- 
1.7.10

___
U-Boot mailing list
U-Boot@lists.denx.de
http://lists.denx.de/mailman/listinfo/u-boot