Re: [Mesa-dev] [PATCH 0/2] radeon: Use the DMA engine for buffer downloads

2014-08-09 Thread Niels Ole Salscheider
On Tuesday 04 March 2014, 02:08:58, Marek Olšák wrote:
 Could you please do this without changing u_upload_mgr? You can still
 use u_upload_alloc to allocate buffer memory in the driver and the map
 buffer read/write flags are not important with persistent coherent
 buffer mappings anyway.

Since commit 150ac07b855b5c5f879bf6ce9ca421ccd1a6c938 we allocate CPU-to-GPU
streaming buffers (i.e. those with PIPE_USAGE_STREAM) in VRAM.
We should therefore set buffer.usage to PIPE_USAGE_STAGING in
u_upload_alloc_buffer when we use u_upload_mgr for downloads. Otherwise the
download buffer also ends up in VRAM and we won't get any performance
improvement.
Would it now be OK to change u_upload_mgr, or do you have a better proposal?
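
To make the placement argument concrete, here is a small standalone model of
it. It is only a sketch: the toy_* names are made up and are not the Gallium
enums or the radeon driver code, and the mapping of staging buffers to
CPU-readable GTT memory is an assumption of the model, not something stated
in this thread.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins; these are not the Gallium enums. */
enum toy_usage  { TOY_USAGE_STREAM, TOY_USAGE_STAGING };
enum toy_domain { TOY_DOMAIN_VRAM, TOY_DOMAIN_GTT };

/* Usage hint a transfer helper would request:
 * staging for downloads, streaming for uploads. */
enum toy_usage toy_usage_for_transfer(bool is_download)
{
    return is_download ? TOY_USAGE_STAGING : TOY_USAGE_STREAM;
}

/* Placement model: streaming buffers live in VRAM (as described above);
 * putting staging buffers in GTT is an assumption of this sketch. */
enum toy_domain toy_domain_for_usage(enum toy_usage usage)
{
    switch (usage) {
    case TOY_USAGE_STREAM:  return TOY_DOMAIN_VRAM;
    case TOY_USAGE_STAGING: return TOY_DOMAIN_GTT;
    }
    assert(!"unknown usage");
    return TOY_DOMAIN_GTT;
}

/* True if the CPU would end up reading directly from VRAM (the slow path). */
bool toy_cpu_reads_vram(bool is_download)
{
    return is_download &&
           toy_domain_for_usage(toy_usage_for_transfer(is_download)) == TOY_DOMAIN_VRAM;
}
```

In this model a download that requested the streaming usage would be placed in
VRAM, which is exactly the case the staging usage is meant to avoid.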

Ole

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 0/2] radeon: Use the DMA engine for buffer downloads

2014-08-09 Thread Marek Olšák
You can try allocating the staging buffer with pipe_buffer_create instead
of u_upload_mgr. You can also use u_suballocator, which is like a
stripped-down version of u_upload_mgr. You would need another instance of
u_upload_mgr anyway, because we'd like to continue using
PIPE_USAGE_STREAM for uploads.
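
The suballocation idea can be sketched in a few lines of standalone C. This is
a rough illustration only: it does not use Mesa's actual u_suballocator API,
and every toy_* name is invented for the example. It hands out aligned ranges
from one fixed-size backing buffer and reports failure when a request no
longer fits.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical bump suballocator over a single backing buffer. */
struct toy_suballocator {
    size_t size;      /* total size of the backing buffer in bytes */
    size_t offset;    /* first free byte */
    size_t alignment; /* alignment of returned offsets, power of two */
};

void toy_suballoc_init(struct toy_suballocator *a, size_t size, size_t alignment)
{
    assert(alignment != 0 && (alignment & (alignment - 1)) == 0);
    a->size = size;
    a->offset = 0;
    a->alignment = alignment;
}

/* Reserve `bytes` bytes. On success, store the aligned start offset in
 * *out_offset and return true. Return false if the request does not fit;
 * a real allocator would then switch to a fresh backing buffer. */
bool toy_suballoc(struct toy_suballocator *a, size_t bytes, size_t *out_offset)
{
    size_t start = (a->offset + a->alignment - 1) & ~(a->alignment - 1);

    if (start > a->size || bytes > a->size - start)
        return false;

    *out_offset = start;
    a->offset = start + bytes;
    return true;
}
```

A driver following this suggestion would keep one such allocator for download
staging memory, separate from the upload manager that keeps using
PIPE_USAGE_STREAM.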

Marek


Re: [Mesa-dev] [PATCH 0/2] radeon: Use the DMA engine for buffer downloads

2014-03-04 Thread Niels Ole Salscheider
 Could you please do this without changing u_upload_mgr? You can still
 use u_upload_alloc to allocate buffer memory in the driver and the map
 buffer read/write flags are not important with persistent coherent
 buffer mappings anyway.

I have sent an updated patch to the list.

Ole


[Mesa-dev] [PATCH 0/2] radeon: Use the DMA engine for buffer downloads

2014-03-03 Thread Niels Ole Salscheider
Using the DMA engine for buffer downloads vastly improves performance, because
CPU reads from VRAM are slow due to the high latency of the PCIe bus.

The first patch allows u_upload_mgr to be used for downloads, too. The second
patch then uses u_upload_mgr in the radeon driver for downloads.
I considered renaming u_upload_mgr to u_transfer_mgr, since it might be
confusing that an upload manager can be used for downloads. But then again, we
also have transfers, so u_transfer_mgr might be confusing as well. Thus, I
decided not to rename it for now.

Without these patches, the buffer_bandwidth benchmark from uCLbench gives me:

./buffer_bandwidth --size=2000 --iterations=100
# device 0: AMD BARTS // type gpu (192 MB global memory, 64 KB constant memory, 32 KB local memory)
1/1 direct 2000 Bytes   759.29 MB/s(HD) 17.13 MB/s(DD) 14.61 MB/s(DH)

With these patches, the read performance is much better:

./buffer_bandwidth --size=2000 --iterations=100
# device 0: AMD BARTS // type gpu (192 MB global memory, 64 KB constant memory, 32 KB local memory)
1/1 direct 2000 Bytes   759.90 MB/s(HD) 613.49 MB/s(DD) 1841.07 MB/s(DH)

Judging by these numbers, it might even make sense to use the DMA engine for
larger buffer downloads...
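
As a quick check on the two benchmark runs, the speedup per column can be
computed with a few lines of C. The figures are copied from the output above;
the helper names are made up for this sketch. The DH column improves by a
factor of roughly 126, the DD column by roughly 36, and the HD column is
essentially unchanged.

```c
#include <assert.h>
#include <stdio.h>

/* Ratio of the patched bandwidth to the unpatched bandwidth (both in MB/s). */
double speedup(double before_mbs, double after_mbs)
{
    assert(before_mbs > 0.0);
    return after_mbs / before_mbs;
}

/* Print the speedup for the three columns of the benchmark output. */
void print_speedups(void)
{
    printf("HD: %.2fx\n", speedup(759.29, 759.90));
    printf("DD: %.2fx\n", speedup(17.13, 613.49));
    printf("DH: %.2fx\n", speedup(14.61, 1841.07));
}
```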

Niels Ole Salscheider (2):
  util/u_upload_mgr: Allow to also use it for downloads
  radeon: Use transfer manager for buffer downloads

 src/gallium/auxiliary/hud/hud_context.c         |  3 +-
 src/gallium/auxiliary/util/u_blitter.c          |  3 +-
 src/gallium/auxiliary/util/u_upload_mgr.c       | 49 +++-
 src/gallium/auxiliary/util/u_upload_mgr.h       | 13 -
 src/gallium/auxiliary/util/u_vbuf.c             |  3 +-
 src/gallium/auxiliary/vl/vl_compositor.c        |  3 +-
 src/gallium/drivers/ilo/ilo_context.c           |  3 +-
 src/gallium/drivers/r300/r300_context.c         |  3 +-
 src/gallium/drivers/radeon/r600_buffer_common.c | 78 +++--
 src/gallium/drivers/radeon/r600_pipe_common.c   | 14 -
 src/gallium/drivers/radeon/r600_pipe_common.h   |  1 +
 src/mesa/state_tracker/st_context.c             |  9 ++-
 12 files changed, 136 insertions(+), 46 deletions(-)

-- 
1.9.0



Re: [Mesa-dev] [PATCH 0/2] radeon: Use the DMA engine for buffer downloads

2014-03-03 Thread Marek Olšák
Could you please do this without changing u_upload_mgr? You can still
use u_upload_alloc to allocate buffer memory in the driver and the map
buffer read/write flags are not important with persistent coherent
buffer mappings anyway.

Marek
