Re: [Mesa-dev] [PATCH 0/4] RadeonSI: Upload constants to VRAM via SDMA

2019-02-11 Thread Dieter Nützel

On 12.02.2019 at 00:40, Dieter Nützel wrote:

Sorry for stepping in so late, but the whole family is recovering slowly
from a bad flu...

Tried your 'latest' three series all together on my Polaris 20 (NIR!).
UH (Unigine Heaven) and UV (Unigine Valley) reliably hang after a few
seconds. VM faults. I have to dig deeper (remotely) to get some logs.


UH

[47001.185090] amdgpu :01:00.0: GPU fault detected: 147 0x0b384801 for process heaven_x64 pid 18565 thread heaven_x64:cs0 pid 18586
[47001.185094] amdgpu :01:00.0:   VM_CONTEXT1_PROTECTION_FAULT_ADDR   0x0373EF67
[47001.185096] amdgpu :01:00.0:   VM_CONTEXT1_PROTECTION_FAULT_STATUS 0x06048001
[47001.185098] amdgpu :01:00.0: VM fault (0x01, vmid 3, pasid 32786) at page 57929575, read from 'TC4' (0x54433400) (72)
[47011.401741] [drm:amdgpu_job_timedout [amdgpu]] *ERROR* ring gfx timeout, signaled seq=11380701, emitted seq=11380703
[47011.401784] [drm:amdgpu_job_timedout [amdgpu]] *ERROR* Process information: process  pid 0 thread  pid 0
[47011.401787] amdgpu :01:00.0: GPU reset begin!
[47021.631605] [drm:amdgpu_dm_atomic_check [amdgpu]] *ERROR* [CRTC:49:crtc-0] hw_done or flip_done timed out
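
As a reading aid for the fault lines in this log, here is a minimal Python sketch that pulls the fields out of a joined "VM fault" line. The function name `decode_vm_fault` and the dictionary keys are made up for illustration; the 4 KiB page size used to turn the page number into a byte address is an assumption, not something stated in the log. What the log itself does show is that the decimal page number 57929575 equals the hex value 0x0373EF67 printed for VM_CONTEXT1_PROTECTION_FAULT_ADDR, and that 0x54433400 is the ASCII tag 'TC4'.

```python
import re

# Assumption: the "page" value counts 4 KiB GPU pages, so the byte address
# is the page number shifted left by 12 bits.
GPU_PAGE_SHIFT = 12

# Matches the "VM fault (...)" summary line once its wrapped halves are joined.
VM_FAULT_RE = re.compile(
    r"VM fault \((?P<status>0x[0-9a-fA-F]+), vmid (?P<vmid>\d+), "
    r"pasid (?P<pasid>\d+)\)\s+at page (?P<page>\d+), "
    r"(?P<access>read|write) from '(?P<client>[^']*)'\s+"
    r"\((?P<client_id>0x[0-9a-fA-F]+)\)"
)


def decode_vm_fault(line):
    """Return the fields of an amdgpu 'VM fault' line as a dict, or None."""
    match = VM_FAULT_RE.search(line)
    if match is None:
        return None
    page = int(match.group("page"))
    client_id = int(match.group("client_id"), 16)
    # The client ID is a packed ASCII tag: 0x54433400 is 'T', 'C', '4', NUL.
    tag = client_id.to_bytes(4, "big").rstrip(b"\x00").decode("ascii", "replace")
    return {
        "vmid": int(match.group("vmid")),
        "pasid": int(match.group("pasid")),
        "access": match.group("access"),
        "page": page,
        "address": page << GPU_PAGE_SHIFT,  # relies on the 4 KiB assumption
        "client": match.group("client"),
        "client_tag": tag,
    }


SAMPLE = (
    "[47001.185098] amdgpu :01:00.0: VM fault (0x01, vmid 3, pasid 32786) "
    "at page 57929575, read from 'TC4' (0x54433400) (72)"
)

if __name__ == "__main__":
    fault = decode_vm_fault(SAMPLE)
    print(
        f"vmid {fault['vmid']}: {fault['access']} at page {fault['page']:#x} "
        f"(byte address {fault['address']:#x}) from {fault['client_tag']}"
    )
```

Lines that are not VM fault summaries return `None`, so the function can be run over a whole dmesg dump.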



But my reported Polaris triangle corruptions are solved now.
I'll try to verify which patches fixed it.

Look here:
https://www.phoronix.com/forums/forum/phoronix/latest-phoronix-articles/1079319-running-the-radeonsi-nir-back-end-with-mesa-19-1-git?p=1079390#post1079390

Greetings,
Dieter

On 07.02.2019 at 02:21, Marek Olšák wrote:

Hi,

This patch series increases radeonsi performance in some cases.
glxgears performance decreases slightly.

Visible VRAM is usually congested due to CPU accesses, which cause
buffers to be evicted from that part of VRAM. This removes
the congestion for all data pushed into const_uploader.

We have had many problems with const_uploader slowing stuff down due
to visible VRAM congestion. The most recent one is this Starcraft 2
issue report on github:

https://github.com/iXit/Mesa-3D/issues/333

Since const_uploader reuses buffers from the winsys buffer cache,
the odds are that the reused buffers are already evicted, so the first
use is usually slower due to higher shader load latencies.

This series uses SDMA to get constants into VRAM, so it doesn't have
any of the above drawbacks.

SC2 numbers with various other methods (from the github issue report):
- originally: 50-55 fps
- changing const_uploader to STREAM: 75-80 fps
- use stream_uploader for constants in Nine: 90 fps
- this series: 105-110 fps
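
To put the SC2 numbers above side by side, here is a small Python sketch that turns each FPS range into a speedup over the original result. Using the midpoint of each range is an assumption made for illustration; the issue report only gives the ranges. The dictionary labels and function names are likewise made up here.

```python
# FPS ranges copied from the list above; a single value is stored as (x, x).
SC2_FPS = {
    "originally": (50, 55),
    "const_uploader as STREAM": (75, 80),
    "stream_uploader in Nine": (90, 90),
    "this series": (105, 110),
}


def midpoint(fps_range):
    """Middle of a (low, high) FPS range (an illustrative simplification)."""
    low, high = fps_range
    return (low + high) / 2


def speedups(results, baseline="originally"):
    """Return each configuration's midpoint FPS relative to the baseline."""
    base = midpoint(results[baseline])
    return {name: midpoint(fps) / base for name, fps in results.items()}


if __name__ == "__main__":
    for name, factor in speedups(SC2_FPS).items():
        print(f"{name}: {factor:.2f}x")
```

With midpoints, the series comes out at roughly twice the original frame rate (107.5 / 52.5).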

Trivial benchmarks such as glxgears can expect a 20% decrease
in performance due to the added cost of the SDMA CS ioctl that wasn't
there before.
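
A rough way to see why only trivial benchmarks notice this: if the extra ioctl is modelled as a fixed cost added to every frame, the relative FPS drop shrinks as the frame gets heavier. The Python sketch below works through that arithmetic. The fixed-cost model, the function names, and the 0.1 ms example frame time are all assumptions for illustration, not measurements from this series.

```python
def fps_after_added_cost(frame_time_ms, added_cost_ms):
    """FPS once a fixed per-frame cost is added to the frame time."""
    return 1000.0 / (frame_time_ms + added_cost_ms)


def added_cost_for_drop(frame_time_ms, fps_drop):
    """Per-frame cost (ms) that lowers FPS by the fraction fps_drop.

    From fps' = (1 - d) * fps it follows that t' = t / (1 - d),
    so the added cost is t' - t = t * d / (1 - d).
    """
    if not 0.0 <= fps_drop < 1.0:
        raise ValueError("fps_drop must be in [0, 1)")
    return frame_time_ms * fps_drop / (1.0 - fps_drop)


def fps_drop_for_cost(frame_time_ms, added_cost_ms):
    """Fractional FPS drop caused by a fixed per-frame cost."""
    return added_cost_ms / (frame_time_ms + added_cost_ms)


if __name__ == "__main__":
    # Hypothetical trivial frame: 0.1 ms per frame (10000 FPS).
    cost = added_cost_for_drop(0.1, 0.20)
    print(f"added cost for a 20% drop: {cost:.3f} ms")
    # The same fixed cost on a hypothetical 10 ms frame.
    print(f"drop on a 10 ms frame: {fps_drop_for_cost(10.0, cost):.2%}")
```

Under this model a 20% FPS drop corresponds to a 25% longer frame, and the same absolute cost is a fraction of a percent for a frame that takes milliseconds.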

CPU-bound apps with many IBs are almost unaffected thanks to winsys
multithreading.

Feedback welcome,

Thanks,
Marek
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 0/4] RadeonSI: Upload constants to VRAM via SDMA

2019-02-11 Thread Dieter Nützel

On 12.02.2019 at 03:22, Dieter Nützel wrote:
> On 12.02.2019 at 00:40, Dieter Nützel wrote:
>> Sorry for stepping in so late, but the whole family is recovering slowly
>> from a bad flu...
>>
>> Tried your 'latest' three series all together on my Polaris 20 (NIR!).
>> UH and UV reliably hang after a few seconds. VM faults. I have to dig
>> deeper (remotely) to get some logs.
>
> UH
>
> [47001.185090] amdgpu :01:00.0: GPU fault detected: 147 0x0b384801 for process heaven_x64 pid 18565 thread heaven_x64:cs0 pid 18586
> [47001.185094] amdgpu :01:00.0:   VM_CONTEXT1_PROTECTION_FAULT_ADDR   0x0373EF67
> [47001.185096] amdgpu :01:00.0:   VM_CONTEXT1_PROTECTION_FAULT_STATUS 0x06048001
> [47001.185098] amdgpu :01:00.0: VM fault (0x01, vmid 3, pasid 32786) at page 57929575, read from 'TC4' (0x54433400) (72)
> [47011.401741] [drm:amdgpu_job_timedout [amdgpu]] *ERROR* ring gfx timeout, signaled seq=11380701, emitted seq=11380703
> [47011.401784] [drm:amdgpu_job_timedout [amdgpu]] *ERROR* Process information: process  pid 0 thread  pid 0
> [47011.401787] amdgpu :01:00.0: GPU reset begin!
> [47021.631605] [drm:amdgpu_dm_atomic_check [amdgpu]] *ERROR* [CRTC:49:crtc-0] hw_done or flip_done timed out

These GPU faults are SOLVED after reverting the SDMA (1-4) series.

>> But my reported Polaris triangle corruptions are solved now.

These triangle corruptions (reported last year) were already SOLVED
_before_ all of these patches. GREAT!

>> I'll try to verify which patches fixed it.

If I have more time, I'll try to find the corresponding patch/fix.

Cheers,
Dieter

>> Look here:
>> https://www.phoronix.com/forums/forum/phoronix/latest-phoronix-articles/1079319-running-the-radeonsi-nir-back-end-with-mesa-19-1-git?p=1079390#post1079390
>>
>> Greetings,
>> Dieter


Re: [Mesa-dev] [PATCH 0/4] RadeonSI: Upload constants to VRAM via SDMA

2019-02-12 Thread Dieter Nützel

Sorry for stepping in so late, but the whole family is recovering slowly
from a bad flu...

Tried your 'latest' three series all together on my Polaris 20 (NIR!).
UH (Unigine Heaven) and UV (Unigine Valley) reliably hang after a few
seconds. VM faults. I have to dig deeper (remotely) to get some logs.

But my reported Polaris triangle corruptions are solved now.
I'll try to verify which patches fixed it.

Look here:
https://www.phoronix.com/forums/forum/phoronix/latest-phoronix-articles/1079319-running-the-radeonsi-nir-back-end-with-mesa-19-1-git?p=1079390#post1079390

Greetings,
Dieter


Re: [Mesa-dev] [PATCH 0/4] RadeonSI: Upload constants to VRAM via SDMA

2019-02-15 Thread Dieter Nützel

On 12.02.2019 at 05:10, Dieter Nützel wrote:
> On 12.02.2019 at 03:22, Dieter Nützel wrote:
>> On 12.02.2019 at 00:40, Dieter Nützel wrote:
>>> Sorry for stepping in so late, but the whole family is recovering
>>> slowly from a bad flu...
>>>
>>> Tried your 'latest' three series all together on my Polaris 20
>>> (NIR!).
>>> UH and UV reliably hang after a few seconds. VM faults. I have to dig
>>> deeper (remotely) to get some logs.
>>
>> UH
>>
>> [47001.185090] amdgpu :01:00.0: GPU fault detected: 147 0x0b384801 for process heaven_x64 pid 18565 thread heaven_x64:cs0 pid 18586
>> [47001.185094] amdgpu :01:00.0:   VM_CONTEXT1_PROTECTION_FAULT_ADDR   0x0373EF67
>> [47001.185096] amdgpu :01:00.0:   VM_CONTEXT1_PROTECTION_FAULT_STATUS 0x06048001
>> [47001.185098] amdgpu :01:00.0: VM fault (0x01, vmid 3, pasid 32786) at page 57929575, read from 'TC4' (0x54433400) (72)
>> [47011.401741] [drm:amdgpu_job_timedout [amdgpu]] *ERROR* ring gfx timeout, signaled seq=11380701, emitted seq=11380703
>> [47011.401784] [drm:amdgpu_job_timedout [amdgpu]] *ERROR* Process information: process  pid 0 thread  pid 0
>> [47011.401787] amdgpu :01:00.0: GPU reset begin!
>> [47021.631605] [drm:amdgpu_dm_atomic_check [amdgpu]] *ERROR* [CRTC:49:crtc-0] hw_done or flip_done timed out
>
> These GPU faults are SOLVED after reverting the SDMA (1-4) series.



So I gave this a second chance with LLVM 9.0 git
plus some other patches:

e83af67eed7 (HEAD -> master) ac: use new LLVM 8 intrinsic when loading 16-bit values
7f32d569ffc ac: add ac_build_llvm8_tbuffer_load() helper
037bda54a7d nir: remove simple dead if detection from nir_opt_dead_cf()
51fe88ff1ab radeonsi/nir: set shader_buffers_declared properly
e66a73aa1a6 radeonsi/nir: set colors_read properly
83955dfc81a radeonsi/nir: set input_usage_mask properly
4c355a562db radeonsi: use SDMA for uploading data through const_uploader
6855f871e47 gallium/u_upload_mgr: allow use of FLUSH_EXPLICIT with persistent mappings
2116355fc01 gallium/u_threaded: always unmap const_uploader
6e70cce39f3 st/mesa: always unmap the uploader in st_atom_array.c
22a88ca1d92 radeonsi: re-initialize query buffers if they are reused
6775665e5ee (origin/master, origin/HEAD) spirv: Eliminate dead input/output variables after translation.


UH
Ran some scenes, but then hit the same GPU fault. Sadly, I overwrote
my dmesg.log. :-(


UV
Ran some scenes, but then hit the same GPU fault.

Unigine Valley Benchmark 1.0 (1.0)
Unigine~# world_load valley/valley
Loading "valley/valley.cpp" 126ms
Loading "valley/valley.mat" 72 materials 1160ms
Loading "valley/sound/sound.prop" 142 properties 1ms
Loading "valley/valley.world" 2253ms
valley_x64: ../src/gallium/auxiliary/util/u_inlines.h:81: pipe_reference_described: Assertion `count != 1' failed.


[ 1079.415836] amdgpu :01:00.0: GPU fault detected: 147 0x0ca04801 for process valley_x64 pid 18050 thread valley_x64:cs0 pid 18071
[ 1079.415841] amdgpu :01:00.0:   VM_CONTEXT1_PROTECTION_FAULT_ADDR   0x094A9594
[ 1079.415842] amdgpu :01:00.0:   VM_CONTEXT1_PROTECTION_FAULT_STATUS 0x08048001
[ 1079.415845] amdgpu :01:00.0: VM fault (0x01, vmid 4, pasid 32769) at page 155882900, read from 'TC4' (0x54433400) (72)
[ 1089.543336] [drm:amdgpu_job_timedout [amdgpu]] *ERROR* ring gfx timeout, signaled seq=91489, emitted seq=91491
[ 1089.543379] [drm:amdgpu_job_timedout [amdgpu]] *ERROR* Process information: process valley_x64 pid 18050 thread valley_x64:cs0 pid 18071
[ 1089.543382] amdgpu :01:00.0: GPU reset begin!
[ 1099.773342] [drm:amdgpu_dm_atomic_check [amdgpu]] *ERROR* [CRTC:49:crtc-0] hw_done or flip_done timed out


Hope that helps some.

Dieter

Re: [Mesa-dev] [PATCH 0/4] RadeonSI: Upload constants to VRAM via SDMA

2019-02-19 Thread Marek Olšák
Yeah, u_threaded_context is broken.

Marek


Re: [Mesa-dev] [PATCH 0/4] RadeonSI: Upload constants to VRAM via SDMA

2019-02-25 Thread Dieter Nützel

Hello Marek,

when you sent your series, you wrote:

[-]
Trivial benchmarks such as glxgears can expect 20% decrease
in performance due to the added cost of the SDMA CS ioctl that wasn't
there before.
[-]

Any ideas on how to speed this up again?
glmark2 went from 9766 (best ever) down to 7455 (all with NIR).
Or are micro-benchmarks not worth more effort?

Dieter
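
To compare the two glmark2 runs in this message scene by scene, a minimal Python sketch like the following can parse the result lines and print the percentage change. The function names and the regular expression are made up for illustration, and the parser assumes each result sits on one unwrapped line. Only three result lines per run are copied in as sample data. Keep in mind that the two runs also differ in kernel, DRM, LLVM, and Mesa versions (see the GL_RENDERER and GL_VERSION lines), so the differences cannot be attributed to the SDMA series alone.

```python
import re

# One glmark2 result per line: "[scene] options: FPS: N FrameTime: X ms".
RESULT_RE = re.compile(
    r"^\[(?P<scene>[\w-]+)\]\s*(?P<options>.*?):\s*FPS:\s*(?P<fps>\d+)"
    r"\s+FrameTime:\s*(?P<ms>\d+(?:\.\d+)?)\s*ms",
    re.MULTILINE,
)


def parse_glmark2(text):
    """Map '[scene] options' to FPS for every unwrapped result line."""
    results = {}
    for match in RESULT_RE.finditer(text):
        key = f"[{match.group('scene')}] {match.group('options')}".strip()
        results[key] = int(match.group("fps"))
    return results


def percent_change(before, after):
    """Relative change from before to after, in percent."""
    return (after - before) / before * 100.0


# Sample lines copied from the two runs in this message.
SDMA = """\
[build] use-vbo=false: FPS: 3694 FrameTime: 0.271 ms
[build] use-vbo=true: FPS: 9341 FrameTime: 0.107 ms
[texture] texture-filter=nearest: FPS: 9140 FrameTime: 0.109 ms
"""

BEFORE = """\
[build] use-vbo=false: FPS: 3373 FrameTime: 0.296 ms
[build] use-vbo=true: FPS: 13121 FrameTime: 0.076 ms
[texture] texture-filter=nearest: FPS: 12172 FrameTime: 0.082 ms
"""

if __name__ == "__main__":
    before, after = parse_glmark2(BEFORE), parse_glmark2(SDMA)
    for key in before:
        if key in after:
            print(f"{key}: {percent_change(before[key], after[key]):+.1f}%")
    # Overall scores quoted in the message text.
    print(f"score: {percent_change(9766, 7455):+.1f}%")
```
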

SDMA
===
glmark2 2017.07
===
OpenGL Information
GL_VENDOR:     X.Org
GL_RENDERER:   Radeon RX 580 Series (POLARIS10, DRM 3.30.0, 5.0.0-rc1-1.g7262353-default+, LLVM 9.0.0)
GL_VERSION:    4.5 (Compatibility Profile) Mesa 19.1.0-devel (git-a9b32aaa16)
===
[build] use-vbo=false: FPS: 3694 FrameTime: 0.271 ms
[build] use-vbo=true: FPS: 9341 FrameTime: 0.107 ms
[texture] texture-filter=nearest: FPS: 9140 FrameTime: 0.109 ms
[texture] texture-filter=linear: FPS: 9163 FrameTime: 0.109 ms
[texture] texture-filter=mipmap: FPS: 9161 FrameTime: 0.109 ms
[shading] shading=gouraud: FPS: 9234 FrameTime: 0.108 ms
[shading] shading=blinn-phong-inf: FPS: 9255 FrameTime: 0.108 ms
[shading] shading=phong: FPS: 9226 FrameTime: 0.108 ms
[shading] shading=cel: FPS: 9310 FrameTime: 0.107 ms
[bump] bump-render=high-poly: FPS: 9298 FrameTime: 0.108 ms
[bump] bump-render=normals: FPS: 9121 FrameTime: 0.110 ms
[bump] bump-render=height: FPS: 9120 FrameTime: 0.110 ms
libpng warning: iCCP: known incorrect sRGB profile
[effect2d] kernel=0,1,0;1,-4,1;0,1,0;: FPS: 9858 FrameTime: 0.101 ms
libpng warning: iCCP: known incorrect sRGB profile
[effect2d] kernel=1,1,1,1,1;1,1,1,1,1;1,1,1,1,1;: FPS: 9854 FrameTime: 0.101 ms
[pulsar] light=false:quads=5:texture=false: FPS: 8468 FrameTime: 0.118 ms
libpng warning: iCCP: known incorrect sRGB profile
[desktop] blur-radius=5:effect=blur:passes=1:separable=true:windows=4: FPS: 5181 FrameTime: 0.193 ms
libpng warning: iCCP: known incorrect sRGB profile
[desktop] effect=shadow:windows=4: FPS: 5374 FrameTime: 0.186 ms
[buffer] columns=200:interleave=false:update-dispersion=0.9:update-fraction=0.5:update-method=map: FPS: 824 FrameTime: 1.214 ms
[buffer] columns=200:interleave=false:update-dispersion=0.9:update-fraction=0.5:update-method=subdata: FPS: 1114 FrameTime: 0.898 ms
[buffer] columns=200:interleave=true:update-dispersion=0.9:update-fraction=0.5:update-method=map: FPS: 899 FrameTime: 1.112 ms
[ideas] speed=duration: FPS: 3485 FrameTime: 0.287 ms
[jellyfish] : FPS: 7992 FrameTime: 0.125 ms
[terrain] : FPS: 1796 FrameTime: 0.557 ms
[shadow] : FPS: 7350 FrameTime: 0.136 ms
[refract] : FPS: 3595 FrameTime: 0.278 ms
[conditionals] fragment-steps=0:vertex-steps=0: FPS: 9401 FrameTime: 0.106 ms
[conditionals] fragment-steps=5:vertex-steps=0: FPS: 9413 FrameTime: 0.106 ms
[conditionals] fragment-steps=0:vertex-steps=5: FPS: 9417 FrameTime: 0.106 ms
[function] fragment-complexity=low:fragment-steps=5: FPS: 9365 FrameTime: 0.107 ms
[function] fragment-complexity=medium:fragment-steps=5: FPS: 9451 FrameTime: 0.106 ms
[loop] fragment-loop=false:fragment-steps=5:vertex-steps=5: FPS: 9300 FrameTime: 0.108 ms
[loop] fragment-steps=5:fragment-uniform=false:vertex-steps=5: FPS: 9440 FrameTime: 0.106 ms
[loop] fragment-steps=5:fragment-uniform=true:vertex-steps=5: FPS: 9392 FrameTime: 0.106 ms
===
glmark2 Score: 7455
===


Before
===
glmark2 2017.07
===
OpenGL Information
GL_VENDOR:     X.Org
GL_RENDERER:   Radeon RX 580 Series (POLARIS10, DRM 3.27.0, 4.20.0-rc3-1.g7262353-default+, LLVM 8.0.0)
GL_VERSION:    4.5 (Compatibility Profile) Mesa 19.0.0-devel (git-c49b3df3cb)
===
[build] use-vbo=false: FPS: 3373 FrameTime: 0.296 ms
[build] use-vbo=true: FPS: 13121 FrameTime: 0.076 ms
[texture] texture-filter=nearest: FPS: 12172 FrameTime: 0.082 ms
[texture] texture-filter=linear: FPS: 12557 FrameTime: 0.080 ms
[texture] texture-filter=mipmap: FPS: 12228 FrameTime: 0.082 ms
[shading] shading=gouraud: FPS: 12536 FrameTime: 0.080 ms
[shading] shading=blinn-phong-inf: FPS: 12782 FrameTime: 0.078 ms
[shading] shading=phong: FPS: 12619 FrameTime: 0.079 ms
[shading] shading=cel: FPS: 12735 FrameTime: 0.079 ms
[bump] bump-render=high-poly: FPS: 11412 FrameTime: 0.088 ms
[bump] bump-render=normals: FPS: 12467 FrameTime: 0.080 ms
[bump] bump-render=height: FPS: 12422 FrameTime: 0.081 ms
libpng warning: iCCP: known incorrect sRGB profile
[effect2d] kernel=0,1,0;1,-4,1;0,1,0;: FPS: 13252 FrameTime: 0.075 ms
libpng warning: iCCP: known incorrect sRGB profile
[effect2d] kernel=1,1,1,1,1;1,1,1,1,1;1,1,1,1,1;: FPS: 11468 FrameTime: 0.087 ms
[pulsar] light=false:quads=5:texture=

Re: [Mesa-dev] [PATCH 0/4] RadeonSI: Upload constants to VRAM via SDMA

2019-02-25 Thread Marek Olšák
We need to extend the CS ioctl to allow submitting 2 command buffers at the
same time.

Marek


Re: [Mesa-dev] [PATCH 0/4] RadeonSI: Upload constants to VRAM via SDMA

2019-04-08 Thread Dieter Nützel

Maybe someone is working on this, too?

I'm feeling fine again after a short 'trip' into the hospital...;-)

Dieter

On 26.02.2019 at 07:36, Marek Olšák wrote:
> We need to extend the CS ioctl to allow submitting 2 command buffers
> at the same time.
>
> Marek


Re: [Mesa-dev] [PATCH 0/4] RadeonSI: Upload constants to VRAM via SDMA

2019-04-08 Thread Marek Olšák
I'm pretty sure I merged this series in February.

Marek

On Mon, Apr 8, 2019 at 6:10 PM Dieter Nützel  wrote:

> Maybe someone working on this, too.
>
> I'm feeling fine again after a short 'trip' into the hospital...;-)
>
> Dieter
>
> On 26.02.2019 at 07:36, Marek Olšák wrote:
> > We need to extend the CS ioctl to allow submitting 2 command buffers
> > at the same time.
> >
> > Marek
> >
> > On Mon, Feb 25, 2019, 10:06 PM Dieter Nützel 
> > wrote:
> >
> >> Hello Marek,
> >>
> >> you wrote with your series sent:
> >>
> >> [-]
> >> Trivial benchmarks such as glxgears can expect 20% decrease
> >> in performance due to the added cost of the SDMA CS ioctl that
> >> wasn't
> >> there before.
> >> [-]
> >>
> >> Any ideas to speed this up, again?
> >> glmark2 went from 9766 (best ever) down to 7455 (all with NIR).
> >> Or are micro benchmarks not worth more effort?
> >>
> >> Dieter
> >>
> >> SDMA
> >> ===
> >> glmark2 2017.07
> >> ===
> >> OpenGL Information
> >> GL_VENDOR: X.Org
> >> GL_RENDERER:   Radeon RX 580 Series (POLARIS10, DRM 3.30.0,
> >> 5.0.0-rc1-1.g7262353-default+, LLVM 9.0.0)
> >> GL_VERSION:4.5 (Compatibility Profile) Mesa 19.1.0-devel
> >> (git-a9b32aaa16)
> >> ===
> >> [build] use-vbo=false: FPS: 3694 FrameTime: 0.271 ms
> >> [build] use-vbo=true: FPS: 9341 FrameTime: 0.107 ms
> >> [texture] texture-filter=nearest: FPS: 9140 FrameTime: 0.109 ms
> >> [texture] texture-filter=linear: FPS: 9163 FrameTime: 0.109 ms
> >> [texture] texture-filter=mipmap: FPS: 9161 FrameTime: 0.109 ms
> >> [shading] shading=gouraud: FPS: 9234 FrameTime: 0.108 ms
> >> [shading] shading=blinn-phong-inf: FPS: 9255 FrameTime: 0.108 ms
> >> [shading] shading=phong: FPS: 9226 FrameTime: 0.108 ms
> >> [shading] shading=cel: FPS: 9310 FrameTime: 0.107 ms
> >> [bump] bump-render=high-poly: FPS: 9298 FrameTime: 0.108 ms
> >> [bump] bump-render=normals: FPS: 9121 FrameTime: 0.110 ms
> >> [bump] bump-render=height: FPS: 9120 FrameTime: 0.110 ms
> >> libpng warning: iCCP: known incorrect sRGB profile
> >> [effect2d] kernel=0,1,0;1,-4,1;0,1,0;: FPS: 9858 FrameTime: 0.101 ms
> >> libpng warning: iCCP: known incorrect sRGB profile
> >> [effect2d] kernel=1,1,1,1,1;1,1,1,1,1;1,1,1,1,1;: FPS: 9854 FrameTime: 0.101 ms
> >> [pulsar] light=false:quads=5:texture=false: FPS: 8468 FrameTime: 0.118 ms
> >> libpng warning: iCCP: known incorrect sRGB profile
> >> [desktop] blur-radius=5:effect=blur:passes=1:separable=true:windows=4: FPS: 5181 FrameTime: 0.193 ms
> >> libpng warning: iCCP: known incorrect sRGB profile
> >> [desktop] effect=shadow:windows=4: FPS: 5374 FrameTime: 0.186 ms
> >> [buffer] columns=200:interleave=false:update-dispersion=0.9:update-fraction=0.5:update-method=map: FPS: 824 FrameTime: 1.214 ms
> >> [buffer] columns=200:interleave=false:update-dispersion=0.9:update-fraction=0.5:update-method=subdata: FPS: 1114 FrameTime: 0.898 ms
> >> [buffer] columns=200:interleave=true:update-dispersion=0.9:update-fraction=0.5:update-method=map: FPS: 899 FrameTime: 1.112 ms
> >> [ideas] speed=duration: FPS: 3485 FrameTime: 0.287 ms
> >> [jellyfish] : FPS: 7992 FrameTime: 0.125 ms
> >> [terrain] : FPS: 1796 FrameTime: 0.557 ms
> >> [shadow] : FPS: 7350 FrameTime: 0.136 ms
> >> [refract] : FPS: 3595 FrameTime: 0.278 ms
> >> [conditionals] fragment-steps=0:vertex-steps=0: FPS: 9401 FrameTime: 0.106 ms
> >> [conditionals] fragment-steps=5:vertex-steps=0: FPS: 9413 FrameTime: 0.106 ms
> >> [conditionals] fragment-steps=0:vertex-steps=5: FPS: 9417 FrameTime: 0.106 ms
> >> [function] fragment-complexity=low:fragment-steps=5: FPS: 9365 FrameTime: 0.107 ms
> >> [function] fragment-complexity=medium:fragment-steps=5: FPS: 9451 FrameTime: 0.106 ms
> >> [loop] fragment-loop=false:fragment-steps=5:vertex-steps=5: FPS: 9300 FrameTime: 0.108 ms
> >> [loop] fragment-steps=5:fragment-uniform=false:vertex-steps=5: FPS: 9440 FrameTime: 0.106 ms
> >> [loop] fragment-steps=5:fragment-uniform=true:vertex-steps=5: FPS: 9392 FrameTime: 0.106 ms
> >> ===
> >> glmark2 Score: 7455
> >> ===
> >>
> >> Before
> >> ===
> >> glmark2 2017.07
> >> ===
> >> OpenGL Information
> >> GL_VENDOR:     X.Org
> >> GL_RENDERER:   Radeon RX 580 Series (POLARIS10, DRM 3.27.0, 4.20.0-rc3-1.g7262353-default+, LLVM 8.0.0)
> >> GL_VERSION:    4.5 (Compatibility Profile) Mesa 19.0.0-devel (git-c49b3df3cb)
> >> ===
> >> [build] use-vbo=false: FPS: 3373 FrameTime: 0.296 ms
> >> [build] use-vbo=true: FPS: 1

Re: [Mesa-dev] [PATCH 0/4] RadeonSI: Upload constants to VRAM via SDMA

2019-04-08 Thread Dieter Nützel

On 09.04.2019 02:42, Marek Olšák wrote:
> I'm pretty sure I merged this series in February.
>
> Marek

Yes, of course you did (with my tb), but I meant... (see below)

> On Mon, Apr 8, 2019 at 6:10 PM Dieter Nützel wrote:
>
>> Maybe someone is working on this, too.
>>
>> I'm feeling fine again after a short 'trip' into the hospital...;-)
>>
>> Dieter
>>
>> On 26.02.2019 07:36, Marek Olšák wrote:
>>> We need to extend the CS ioctl to allow submitting 2 command buffers
>>> at the same time.

This additional work.

Dieter

>>> Marek
>>>
>>> On Mon, Feb 25, 2019, 10:06 PM Dieter Nützel wrote:
>>>
>>>> Hello Marek,
>>>>
>>>> you wrote with your series sent:
>>>>
>>>> [-]
>>>> Trivial benchmarks such as glxgears can expect 20% decrease
>>>> in performance due to the added cost of the SDMA CS ioctl that
>>>> wasn't there before.
>>>> [-]
>>>>
>>>> Any ideas to speed this up, again?
>>>> glmark2 went from 9766 (best ever) down to 7455 (all with NIR).
>>>> Or are micro benchmarks not worth more effort?
>>>>
>>>> Dieter
>>>>
>>>> SDMA
>>>> ===
>>>> glmark2 2017.07
>>>> ===
>>>> OpenGL Information
>>>> GL_VENDOR:     X.Org
>>>> GL_RENDERER:   Radeon RX 580 Series (POLARIS10, DRM 3.30.0, 5.0.0-rc1-1.g7262353-default+, LLVM 9.0.0)
>>>> GL_VERSION:    4.5 (Compatibility Profile) Mesa 19.1.0-devel (git-a9b32aaa16)
>>>> ===
>>>> [build] use-vbo=false: FPS: 3694 FrameTime: 0.271 ms
>>>> [build] use-vbo=true: FPS: 9341 FrameTime: 0.107 ms
>>>> [texture] texture-filter=nearest: FPS: 9140 FrameTime: 0.109 ms
>>>> [texture] texture-filter=linear: FPS: 9163 FrameTime: 0.109 ms
>>>> [texture] texture-filter=mipmap: FPS: 9161 FrameTime: 0.109 ms
>>>> [shading] shading=gouraud: FPS: 9234 FrameTime: 0.108 ms
>>>> [shading] shading=blinn-phong-inf: FPS: 9255 FrameTime: 0.108 ms
>>>> [shading] shading=phong: FPS: 9226 FrameTime: 0.108 ms
>>>> [shading] shading=cel: FPS: 9310 FrameTime: 0.107 ms
>>>> [bump] bump-render=high-poly: FPS: 9298 FrameTime: 0.108 ms
>>>> [bump] bump-render=normals: FPS: 9121 FrameTime: 0.110 ms
>>>> [bump] bump-render=height: FPS: 9120 FrameTime: 0.110 ms
>>>> libpng warning: iCCP: known incorrect sRGB profile
>>>> [effect2d] kernel=0,1,0;1,-4,1;0,1,0;: FPS: 9858 FrameTime: 0.101 ms
>>>> libpng warning: iCCP: known incorrect sRGB profile
>>>> [effect2d] kernel=1,1,1,1,1;1,1,1,1,1;1,1,1,1,1;: FPS: 9854 FrameTime: 0.101 ms
>>>> [pulsar] light=false:quads=5:texture=false: FPS: 8468 FrameTime: 0.118 ms
>>>> libpng warning: iCCP: known incorrect sRGB profile
>>>> [desktop] blur-radius=5:effect=blur:passes=1:separable=true:windows=4: FPS: 5181 FrameTime: 0.193 ms
>>>> libpng warning: iCCP: known incorrect sRGB profile
>>>> [desktop] effect=shadow:windows=4: FPS: 5374 FrameTime: 0.186 ms
>>>> [buffer] columns=200:interleave=false:update-dispersion=0.9:update-fraction=0.5:update-method=map: FPS: 824 FrameTime: 1.214 ms
>>>> [buffer] columns=200:interleave=false:update-dispersion=0.9:update-fraction=0.5:update-method=subdata: FPS: 1114 FrameTime: 0.898 ms
>>>> [buffer] columns=200:interleave=true:update-dispersion=0.9:update-fraction=0.5:update-method=map: FPS: 899 FrameTime: 1.112 ms
>>>> [ideas] speed=duration: FPS: 3485 FrameTime: 0.287 ms
>>>> [jellyfish] : FPS: 7992 FrameTime: 0.125 ms
>>>> [terrain] : FPS: 1796 FrameTime: 0.557 ms
>>>> [shadow] : FPS: 7350 FrameTime: 0.136 ms
>>>> [refract] : FPS: 3595 FrameTime: 0.278 ms
>>>> [conditionals] fragment-steps=0:vertex-steps=0: FPS: 9401 FrameTime: 0.106 ms
>>>> [conditionals] fragment-steps=5:vertex-steps=0: FPS: 9413 FrameTime: 0.106 ms
>>>> [conditionals] fragment-steps=0:vertex-steps=5: FPS: 9417 FrameTime: 0.106 ms
>>>> [function] fragment-complexity=low:fragment-steps=5: FPS: 9365 FrameTime: 0.107 ms
>>>> [function] fragment-complexity=medium:fragment-steps=5: FPS: 9451 FrameTime: 0.106 ms
>>>> [loop] fragment-loop=false:fragment-steps=5:vertex-steps=5: FPS: 9300 FrameTime: 0.108 ms
>>>> [loop] fragment-steps=5:fragment-uniform=false:vertex-steps=5: FPS: 9440 FrameTime: 0.106 ms
>>>> [loop] fragment-steps=5:fragment-uniform=true:vertex-steps=5: FPS: 9392 FrameTime: 0.106 ms
>>>> ===
>>>> glmark2 Score: 7455
>>>> ===
>>>>
>>>> Before
>>>> ===
>>>> glmark2 2017.07
>>>> ===
>>>> OpenGL Information
>>>> GL_VENDOR:     X.Org
>>>> GL_RENDERER:   Radeon RX 580 Series (POLARIS10, DRM 3.27.0, 4.20.0-rc3-1.g7262353-default+, LLVM 8.0.0)
>>>> GL_VERSION:    4.5 (Compatibility Profile) Mesa 19.0.0-devel (git-c49b3df3cb)
>>>> ===
>>>> [build] use-vbo=false: FPS: 3373 FrameTime: 0.296 ms
>>>> [build] use-vbo=true: FPS: 13121 FrameTime: 0.076 ms
>>>> [texture] texture-filter=nearest: FPS: 12172 FrameTime: 0.082 ms
>>>> [texture] texture-filter=linear: FPS: 12557 FrameTime: 0.080 ms
>>>> [texture] texture-filter=mipmap: FPS: 12228 FrameTime: 0.082 ms
>>>> [shading] shading=gouraud: FPS: 12536 FrameTime: 0.080 ms
>>>> [shading] shading=blinn-phong-inf: FPS: 12782 FrameTime: 0.078 ms
>>>> [shading] shading=phong: FPS: 12619 FrameTime: 0.079 ms
>>>> [shading] shading=c

Re: [Mesa-dev] [PATCH 0/4] RadeonSI: Upload constants to VRAM via SDMA

2019-04-09 Thread Marek Olšák
On Mon, Apr 8, 2019 at 10:32 PM Dieter Nützel wrote:

> On 09.04.2019 02:42, Marek Olšák wrote:
> > I'm pretty sure I merged this series in February.
> >
> > Marek
>
> Yes, of course you did (with my tb), but I meant... (see below)
>
> > On Mon, Apr 8, 2019 at 6:10 PM Dieter Nützel wrote:
> >
> >> Maybe someone is working on this, too.
> >>
> >> I'm feeling fine again after a short 'trip' into the hospital...;-)
> >>
> >> Dieter
> >>
> >> On 26.02.2019 07:36, Marek Olšák wrote:
> >>> We need to extend the CS ioctl to allow submitting 2 command buffers
> >>> at the same time.
>
> This additional work.
>

Well, that will take some time, as nobody is working on it.

Marek
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev
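
As a rough cross-check of the numbers quoted in this thread (a sketch added for illustration, not part of the original mails): the Python snippet below takes the two glmark2 scores Dieter mentions and the `[build] use-vbo=true` FPS values from the two pasted logs, then computes the relative score drop and the extra time per frame. The function and constant names are made up for this example. Note that the two runs also differ in kernel, LLVM and Mesa versions, so the difference cannot be attributed to the SDMA series alone.

```python
# Figures copied from the glmark2 logs quoted in this thread.
SCORE_BEFORE = 9766   # "best ever" score reported before the SDMA series
SCORE_SDMA = 7455     # score reported with the SDMA series applied

FPS_BEFORE = 13121    # [build] use-vbo=true, "Before" log
FPS_SDMA = 9341       # [build] use-vbo=true, "SDMA" log


def percent_drop(before: float, after: float) -> float:
    """Relative decrease from `before` to `after`, in percent."""
    return (before - after) / before * 100.0


def frame_time_ms(fps: float) -> float:
    """Average time per frame in milliseconds for a given FPS value."""
    return 1000.0 / fps


score_drop = percent_drop(SCORE_BEFORE, SCORE_SDMA)
added_ms = frame_time_ms(FPS_SDMA) - frame_time_ms(FPS_BEFORE)

print(f"glmark2 score drop: {score_drop:.1f}%")
print(f"extra time per frame (use-vbo=true): {added_ms:.3f} ms")
```

The score drop comes out a little above the 20% quoted from the cover letter, and the per-frame difference is a few hundredths of a millisecond, which only matters for scenes that already render in about a tenth of a millisecond.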

Re: [Mesa-dev] [PATCH 0/4] RadeonSI: Upload constants to VRAM via SDMA

2019-02-07 Thread Timur Kristóf
Tested-By: Timur Kristóf 

It may be worth noting that the issue addressed was seen with an external
GPU (through Thunderbolt 3), so the performance benefit may be less
extreme on a system where the bandwidth isn't as constrained.
(Also, the TB3 eGPU is still outperformed by a normal PCI-E GPU.)

Cheers & best regards,
Tim
