[Mesa-dev] [PATCH 0/2] radeon: Use the DMA engine for buffer downloads

2014-03-03 Thread Niels Ole Salscheider
Using the DMA engine for buffer downloads vastly improves performance, because
CPU reads from VRAM are slow due to the high latency of the PCIe bus.
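
As a rough illustration of the download path described above, the following C sketch models it in plain user-space code. Everything in it is hypothetical: `sketch_buffer`, `sketch_dma_copy` and `sketch_download` are not Mesa names, heap memory stands in for both VRAM and the staging buffer, and `memcpy` stands in for the copy that the DMA engine would perform.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for a GPU buffer; plain heap memory models VRAM. */
struct sketch_buffer {
    uint8_t *data;
    size_t size;
};

/* Models the DMA engine copying VRAM -> staging memory.
 * In a real driver the GPU does this; the CPU never reads VRAM directly. */
static void sketch_dma_copy(uint8_t *staging, const struct sketch_buffer *vram,
                            size_t offset, size_t size)
{
    memcpy(staging, vram->data + offset, size);
}

/* Download `size` bytes starting at `offset` into `out`.
 * Returns 0 on success, -1 on a bad range or allocation failure. */
static int sketch_download(const struct sketch_buffer *vram, size_t offset,
                           size_t size, uint8_t *out)
{
    uint8_t *staging;

    if (offset > vram->size || size > vram->size - offset)
        return -1;
    if (size == 0)
        return 0;

    /* Step 1: allocate a CPU-friendly staging buffer. */
    staging = malloc(size);
    if (!staging)
        return -1;

    /* Step 2: let the (modelled) DMA engine fill it. */
    sketch_dma_copy(staging, vram, offset, size);

    /* Step 3: the CPU reads only from the staging buffer. */
    memcpy(out, staging, size);
    free(staging);
    return 0;
}
```

The point of the sketch is only the ordering of the steps: the slow CPU read from VRAM is replaced by a device-side copy followed by a CPU read from memory that is cheap for the CPU to access.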

The first patch allows u_upload_mgr to be used for downloads, too. The second
patch then uses u_upload_mgr in the radeon driver for downloads.
I considered renaming u_upload_mgr to u_transfer_mgr, since it might be
confusing that an "upload manager" can be used for downloads. But we already
use the term "transfers" elsewhere, so u_transfer_mgr might be just as
confusing. I therefore decided not to rename it for now.

Without these patches, the buffer_bandwidth benchmark from uCLbench gives me:

./buffer_bandwidth --size=2000 --iterations=100
# device 0: AMD BARTS // type gpu (192 MB global memory, 64 KB constant memory, 32 KB local memory)
1/1 direct 2000 Bytes   759.29 MB/s(HD) 17.13 MB/s(DD) 14.61 MB/s(DH)

With these patches, the read performance is much better:

./buffer_bandwidth --size=2000 --iterations=100
# device 0: AMD BARTS // type gpu (192 MB global memory, 64 KB constant memory, 32 KB local memory)
1/1 direct 2000 Bytes   759.90 MB/s(HD) 613.49 MB/s(DD) 1841.07 MB/s(DH)

Judging by these numbers, it might even make sense to use the DMA engine for
larger buffer downloads...
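
To put the two benchmark runs side by side, the small C helpers below compute the speedup for each column. This is only a sketch: the function names are made up for illustration, and it assumes that HD, DD and DH stand for host-to-device, device-to-device and device-to-host transfers and that the benchmark's "MB" means 10^6 bytes.

```c
#include <assert.h>

/* Ratio of the throughput with the patches to the throughput without. */
static double sketch_speedup(double before_mb_s, double after_mb_s)
{
    return after_mb_s / before_mb_s;
}

/* Time in microseconds to move `bytes` at `mb_per_s`.
 * Assumes 1 MB = 10^6 bytes, so bytes / (MB/s) comes out in microseconds. */
static double sketch_transfer_time_us(double bytes, double mb_per_s)
{
    return bytes / mb_per_s;
}
```

With the figures quoted above, `sketch_speedup(14.61, 1841.07)` is about 126 for DH and `sketch_speedup(17.13, 613.49)` is about 36 for DD, while HD is essentially unchanged (759.29 vs. 759.90 MB/s).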

Niels Ole Salscheider (2):
  util/u_upload_mgr: Allow to also use it for downloads
  radeon: Use transfer manager for buffer downloads

 src/gallium/auxiliary/hud/hud_context.c         |  3 +-
 src/gallium/auxiliary/util/u_blitter.c          |  3 +-
 src/gallium/auxiliary/util/u_upload_mgr.c       | 49 +++-
 src/gallium/auxiliary/util/u_upload_mgr.h       | 13 -
 src/gallium/auxiliary/util/u_vbuf.c             |  3 +-
 src/gallium/auxiliary/vl/vl_compositor.c        |  3 +-
 src/gallium/drivers/ilo/ilo_context.c           |  3 +-
 src/gallium/drivers/r300/r300_context.c         |  3 +-
 src/gallium/drivers/radeon/r600_buffer_common.c | 78 +++--
 src/gallium/drivers/radeon/r600_pipe_common.c   | 14 -
 src/gallium/drivers/radeon/r600_pipe_common.h   |  1 +
 src/mesa/state_tracker/st_context.c             |  9 ++-
 12 files changed, 136 insertions(+), 46 deletions(-)

-- 
1.9.0

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 0/2] radeon: Use the DMA engine for buffer downloads

2014-08-09 Thread Niels Ole Salscheider
On Tuesday 04 March 2014, 02:08:58, Marek Olšák wrote:
> Could you please do this without changing u_upload_mgr? You can still
> use u_upload_alloc to allocate buffer memory in the driver and the map
> buffer read/write flags are not important with persistent coherent
> buffer mappings anyway.

Since commit 150ac07b855b5c5f879bf6ce9ca421ccd1a6c938 we allocate CPU -> GPU
streaming buffers (i.e. those with PIPE_USAGE_STREAM) in VRAM.
We should therefore set buffer.usage to PIPE_USAGE_STAGING in
u_upload_alloc_buffer when we use u_upload_mgr for downloads; otherwise we
won't get any performance improvement.
Would it now be OK to change u_upload_mgr, or do you have a better proposal?
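
The reasoning above can be sketched in a few lines of C. The enums and functions here are hypothetical stand-ins, not Mesa's definitions: they only borrow the STREAM/STAGING naming, and the placement function encodes the assumption that staging buffers end up in CPU-accessible system memory while streaming buffers end up in VRAM.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins named after the PIPE_USAGE_* flags in the thread. */
enum sketch_usage {
    SKETCH_USAGE_STREAM,   /* CPU -> GPU streaming (uploads) */
    SKETCH_USAGE_STAGING   /* GPU -> CPU readback (downloads) */
};

/* Hypothetical memory domains. */
enum sketch_domain {
    SKETCH_DOMAIN_VRAM,
    SKETCH_DOMAIN_SYSTEM
};

/* Pick the usage flag from the transfer direction. */
static enum sketch_usage sketch_usage_for_direction(bool is_download)
{
    return is_download ? SKETCH_USAGE_STAGING : SKETCH_USAGE_STREAM;
}

/* Assumed placement policy: streaming buffers live in VRAM,
 * staging buffers live in system memory that the CPU can read quickly. */
static enum sketch_domain sketch_domain_for_usage(enum sketch_usage usage)
{
    return usage == SKETCH_USAGE_STREAM ? SKETCH_DOMAIN_VRAM
                                        : SKETCH_DOMAIN_SYSTEM;
}
```

Under these assumptions, a download buffer allocated with the streaming flag would land in VRAM again, which is exactly the case where the CPU read stays slow.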

Ole



Re: [Mesa-dev] [PATCH 0/2] radeon: Use the DMA engine for buffer downloads

2014-08-09 Thread Marek Olšák
You can try to do the allocation of the staging buffer with
pipe_buffer_create instead of u_upload_mgr. You can also use
u_suballocator, which is like a stripped-down version of u_upload_mgr.
You would need another instance of u_upload_mgr anyway, because we'd
like to continue using PIPE_USAGE_STREAM for uploads.
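
For readers unfamiliar with the idea of a suballocator, here is a generic C sketch of one: it hands out aligned ranges from a single larger parent buffer. It is not u_suballocator's actual interface or implementation; the `sketch_suballoc` type and its functions are hypothetical, and a real allocator would create a fresh parent buffer when the current one is full instead of failing.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical bump-style suballocator over one parent buffer. */
struct sketch_suballoc {
    size_t parent_size; /* size of the parent buffer in bytes */
    size_t alignment;   /* must be a power of two */
    size_t offset;      /* first byte not yet handed out */
};

static void sketch_suballoc_init(struct sketch_suballoc *s,
                                 size_t parent_size, size_t alignment)
{
    s->parent_size = parent_size;
    s->alignment = alignment;
    s->offset = 0;
}

/* Reserve `size` bytes; on success store the aligned start offset in
 * *out_offset and return 0. Return -1 if the parent buffer is exhausted. */
static int sketch_suballoc_alloc(struct sketch_suballoc *s, size_t size,
                                 size_t *out_offset)
{
    /* Round the current offset up to the next multiple of the alignment. */
    size_t start = (s->offset + s->alignment - 1) & ~(s->alignment - 1);

    if (start > s->parent_size || size > s->parent_size - start)
        return -1;

    *out_offset = start;
    s->offset = start + size;
    return 0;
}
```

In the spirit of the suggestion above, a driver would keep one such instance for staging (download) memory, separate from whatever it uses for streaming uploads.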

Marek

On Sat, Aug 9, 2014 at 2:35 PM, Niels Ole Salscheider
 wrote:
> On Tuesday 04 March 2014, 02:08:58, Marek Olšák wrote:
>> Could you please do this without changing u_upload_mgr? You can still
>> use u_upload_alloc to allocate buffer memory in the driver and the map
>> buffer read/write flags are not important with persistent coherent
>> buffer mappings anyway.
>
> Since 150ac07b855b5c5f879bf6ce9ca421ccd1a6c938 we allocate CPU -> GPU
> streaming buffers (i. e. those with PIPE_USAGE_STREAM) in VRAM.
> We should therefore set buffer.usage to PIPE_USAGE_STAGING in
> u_upload_alloc_buffer when we use u_upload_mgr for downloads - otherwise we
> won't get any performance improvements.
> Would it now be OK to change u_upload_mgr or do you have a better proposal?
>
> Ole


Re: [Mesa-dev] [PATCH 0/2] radeon: Use the DMA engine for buffer downloads

2014-03-03 Thread Marek Olšák
Could you please do this without changing u_upload_mgr? You can still
use u_upload_alloc to allocate buffer memory in the driver and the map
buffer read/write flags are not important with persistent coherent
buffer mappings anyway.

Marek

On Mon, Mar 3, 2014 at 9:29 PM, Niels Ole Salscheider
 wrote:
> Using the DMA engine for buffer downloads vastly improves performance. This is
> because reads from VRAM by the CPU are slow because of the high latency of the
> PCIe bus.
>
> The first patch allows u_upload_mgr to be used for downloads, too. The second
> patch then uses u_upload_mgr in the radeon driver for downloads.


Re: [Mesa-dev] [PATCH 0/2] radeon: Use the DMA engine for buffer downloads

2014-03-04 Thread Niels Ole Salscheider
> Could you please do this without changing u_upload_mgr? You can still
> use u_upload_alloc to allocate buffer memory in the driver and the map
> buffer read/write flags are not important with persistent coherent
> buffer mappings anyway.

I have sent an updated patch to the list.

Ole