Re: [Mesa-dev] Profile-guides optimizations
Hi,

On 13.2.2020 20.44, Dylan Baker wrote:
> I actually spent a bunch of time toying with PGO a couple of years ago. I got the guidance all working and was able to train it, but what we found was that it made the specific workloads we threw at it much faster, but it made every real world use case I tried (playing a game, running piglit, gnome) slower, often significantly so.
>
> The hard part is not setting up pgo, it's getting the right training data.

Do you still remember exactly which use cases were used for training, and which cases suffered worst as a result?

Did you, for example, also profile the X server and (some) X and Wayland compositors during the application profiling?

Also, did you use normal FDO (instrumented code) or AutoFDO (profiling) to gather the profiling data?

I'm wondering whether AutoFDO could be done so that you install Mesa to the system, profile the whole system, and just filter out non-Mesa profiling data when applying the information to the new build...

- Eero

> Dylan
>
> Quoting Marek Olšák (2020-02-13 10:30:46)
> > [Forked from the other thread]
> >
> > Guys, we could run some simple tests similar to piglit/drawoverhead as the last step of the pgo=generate build. Tests like that should exercise the most common codepaths in drivers. We could add subtests that we care about the most.
> >
> > Marek
> >
> > On Thu., Feb. 13, 2020, 13:16 Dylan Baker, wrote:
> >
> > meson has builtins for both of these, -Db_lto=true turns on lto, for pgo you would run:
> >
> > meson build -Db_pgo=generate
> > ninja -C build
> >
> > meson configure build -Db_pgo=use
> > ninja -C build
> >
> > Quoting Marek Olšák (2020-02-12 10:46:12)
> > > How do you enable LTO+PGO? Is it something we could enable by default for release builds?
> > >
> > > Marek
> > >
> > > On Wed, Feb 12, 2020 at 1:56 AM Dieter Nützel wrote:
> > >
> > > Hello Gert,
> > >
> > > your merge 'broke' LTO and then later on PGO compilation/linking.
> > >
> > > I do generally compiling with '-Dgallium-drivers=r600,radeonsi,swrast' for testing radeonsi and (your) r600 work. ;-)
> > >
> > > After your merge I get several warnings in 'addrlib' with LTO and even a compiler error (gcc (SUSE Linux) 9.2.1 20200128).
> > >
> > > I had to disable 'r600' ('swrast' is needed for 'nine') to get a working LTO and even better PGO radeonsi driver.
> > > I'm preparing GREAT LTO+PGO (the latter is the greater) numbers over the last 2 months. I'll send my results later, today.
> > >
> > > Summary
> > > radeonsi is ~40% smaller and 16-20% faster with PGO (!!!).
> > >
> > > Honza and the GCC people (Intel's ICC folks) do GREAT things.
> > > 'glmark2' numbers are better than 'vkmark'. (Hello, Marek.).
> > >
> > > Need some sleep.
> > >
> > > See my log, below.
> > >
> > > Greetings and GREAT work!
> > >
> > > -Dieter
> > >
> > > On 09.02.2020 15:46, Gert Wollny wrote:
> > > > On Thursday, 23.01.2020, 20:31 +0100, Gert Wollny wrote:
> > > > > has anybody any objections if I merge the r600/NIR code?
> > > > > Without explicitly setting the debug flag it doesn't change a thing, but it would be better to continue developing in-tree.
> > > > Okay, if nobody objects, I'll merge it Monday evening.
> > > >
> > > > Best,
> > > > Gert
> > >
> > > [1425/1433] Linking target src/gallium/targets/dri/libgallium_dri.so.
> > > FAILED: src/gallium/targets/dri/libgallium_dri.so
> > > c++ -o src/gallium/targets/dri/libgallium_dri.so
> > > 'src/gallium/targets/dri/8381c20@@gallium_dri@sha/target.c.o' -flto
> > > -fprofile-generate -Wl,--as-needed -Wl,--no-undefined -Wl,-O1 -shared
> > > -fPIC -Wl,--start-group -Wl,-soname,libgallium_dri.so
> > > src/mesa/libmesa_gallium.a src/mesa/libmesa_common.a
> > > src/compiler/glsl/libglsl.a src/compiler/glsl/glcpp/libglcpp.a
> > > src/util/libmesa_util.a src/util/format/libmesa_format.a
> > > src/compiler/nir/libnir.a src/compiler/libcompiler.a
> > > src/mesa/libmesa_sse41.a src/mesa/drivers/dri/common/libdricommon.a
> > > src/mesa/drivers/dri/common/libmegadriver_stub.a
> > > src/gallium/state_trackers/dri/libdri.a
> > > src/gallium/auxiliary/libgalliumvl.a src/gallium/auxiliary/libgallium.a
> > > src/mapi/shared-glapi/libglapi.so.0.0.0
> > > src/gallium/auxiliary/pipe-loader/libpipe_loader_static.a
> > > src/loader/libloader.a src/util/libxmlconfig.a
> > > src/gallium/winsys/sw/null/libws_null.a
> > > src/gallium/winsys/sw/wrapper/libwsw.a
> > > src/gallium/winsys/sw/dri/libswdri.a
> > > src/gallium/winsys/sw/kms-dri/libswkmsdri.a
> > > src/gallium/drivers/llvmpipe/libllvmpipe.a
> > > src/gallium/drivers/softpipe/libsoftpipe.a
> > > src/gallium/drivers/r600/libr600.a
> > > src/gallium/winsys/r
Re: [Mesa-dev] Profile-guides optimizations
The GCC wiki says:

"GCC uses execution profiles consisting of basic block and edge frequency counts to guide optimizations such as instruction scheduling, basic block reordering, function splitting, and register allocation."

More info here: https://gcc.gnu.org/wiki/AutoFDO/Tutorial

Timur

On Friday, 14 February 2020, Marek Olšák wrote:
> Yeah I guess it reduces instruction cache misses, but then other codepaths are likely to get more misses.
>
> Does it do anything smarter?
>
> Marek
>
> On Thu., Feb. 13, 2020, 17:52 Dave Airlie, wrote:
>
> > On Fri, 14 Feb 2020 at 08:22, Marek Olšák wrote:
> > >
> > > I wonder what PGO really does other than placing likely/unlikely.
> >
> > With LTO it can do a lot more, like grouping hot functions into closer regions so they avoid TLB misses and faults etc.
> >
> > Dave.

--
Sent from my Sailfish device
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev
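For reference, here is a rough sketch of what the AutoFDO-style flow raised in this thread (sample the whole system, then keep only Mesa's share of the profile) might look like with the tools from the tutorial linked above. It only prints the commands instead of running them. The library path, the profile file names, the `build` directory and the 60-second sampling window are all placeholders, not anything taken from the thread. It also assumes the CPU supports branch-stack sampling for `perf record -b`, and that `create_gcov` (from the separate autofdo tools) only uses samples belonging to the binary it is given, which is my understanding rather than something verified here.

```shell
#!/bin/sh
# Dry-run sketch of an AutoFDO-style flow: prints the commands, runs nothing.
# All paths and names below are hypothetical placeholders.
MESA_LIB="${MESA_LIB:-/path/to/libgallium_dri.so}"
PERF_DATA="${PERF_DATA:-perf.data}"
AFDO_FILE="${AFDO_FILE:-/path/to/mesa.afdo}"

autofdo_steps() {
    # 1. Sample the whole system (-a) with branch stacks (-b) while the
    #    usual workloads (compositor, games, ...) are running.
    echo "perf record -a -b -o $PERF_DATA -- sleep 60"
    # 2. Convert the samples into a GCC profile for one binary only;
    #    this is where non-Mesa samples would be left out (assumption).
    echo "create_gcov --binary=$MESA_LIB --profile=$PERF_DATA --gcov=$AFDO_FILE -gcov_version=1"
    # 3. Rebuild Mesa with the sampled profile via GCC's -fauto-profile.
    echo "meson configure build -Dc_args=-fauto-profile=$AFDO_FILE -Dcpp_args=-fauto-profile=$AFDO_FILE"
    echo "ninja -C build"
}

autofdo_steps
```

Unlike the instrumented `-Db_pgo=generate` build, this would not need a special training build, since the profile comes from sampling the normally installed library.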
Re: [Mesa-dev] Profile-guides optimizations
Yeah I guess it reduces instruction cache misses, but then other codepaths are likely to get more misses.

Does it do anything smarter?

Marek

On Thu., Feb. 13, 2020, 17:52 Dave Airlie, wrote:

> On Fri, 14 Feb 2020 at 08:22, Marek Olšák wrote:
> >
> > I wonder what PGO really does other than placing likely/unlikely.
>
> With LTO it can do a lot more, like grouping hot functions into closer regions so they avoid TLB misses and faults etc.
>
> Dave.
Re: [Mesa-dev] Profile-guides optimizations
On Fri, 14 Feb 2020 at 08:22, Marek Olšák wrote:
>
> I wonder what PGO really does other than placing likely/unlikely.

With LTO it can do a lot more, like grouping hot functions into closer regions so they avoid TLB misses and faults etc.

Dave.
Re: [Mesa-dev] Profile-guides optimizations
I wonder what PGO really does other than placing likely/unlikely.

Marek

On Thu., Feb. 13, 2020, 13:43 Dylan Baker, wrote:

> I actually spent a bunch of time toying with PGO a couple of years ago. I got the guidance all working and was able to train it, but what we found was that it made the specific workloads we threw at it much faster, but it made every real world use case I tried (playing a game, running piglit, gnome) slower, often significantly so.
>
> The hard part is not setting up pgo, it's getting the right training data.
>
> Dylan
>
> Quoting Marek Olšák (2020-02-13 10:30:46)
> > [Forked from the other thread]
> >
> > Guys, we could run some simple tests similar to piglit/drawoverhead as the last step of the pgo=generate build. Tests like that should exercise the most common codepaths in drivers. We could add subtests that we care about the most.
> >
> > Marek
> >
> > On Thu., Feb. 13, 2020, 13:16 Dylan Baker, wrote:
> >
> > meson has builtins for both of these, -Db_lto=true turns on lto, for pgo you would run:
> >
> > meson build -Db_pgo=generate
> > ninja -C build
> >
> > meson configure build -Db_pgo=use
> > ninja -C build
> >
> > Quoting Marek Olšák (2020-02-12 10:46:12)
> > > How do you enable LTO+PGO? Is it something we could enable by default for release builds?
> > >
> > > Marek
> > >
> > > On Wed, Feb 12, 2020 at 1:56 AM Dieter Nützel <die...@nuetzel-hh.de> wrote:
> > >
> > > Hello Gert,
> > >
> > > your merge 'broke' LTO and then later on PGO compilation/linking.
> > >
> > > I do generally compiling with '-Dgallium-drivers=r600,radeonsi,swrast' for testing radeonsi and (your) r600 work. ;-)
> > >
> > > After your merge I get several warnings in 'addrlib' with LTO and even a compiler error (gcc (SUSE Linux) 9.2.1 20200128).
> > >
> > > I had to disable 'r600' ('swrast' is needed for 'nine') to get a working LTO and even better PGO radeonsi driver.
> > > I'm preparing GREAT LTO+PGO (the latter is the greater) numbers over the last 2 months. I'll send my results later, today.
> > >
> > > Summary
> > > radeonsi is ~40% smaller and 16-20% faster with PGO (!!!).
> > >
> > > Honza and the GCC people (Intel's ICC folks) do GREAT things.
> > > 'glmark2' numbers are better than 'vkmark'. (Hello, Marek.).
> > >
> > > Need some sleep.
> > >
> > > See my log, below.
> > >
> > > Greetings and GREAT work!
> > >
> > > -Dieter
> > >
> > > On 09.02.2020 15:46, Gert Wollny wrote:
> > > > On Thursday, 23.01.2020, 20:31 +0100, Gert Wollny wrote:
> > > > > has anybody any objections if I merge the r600/NIR code?
> > > > > Without explicitly setting the debug flag it doesn't change a thing, but it would be better to continue developing in-tree.
> > > > Okay, if nobody objects, I'll merge it Monday evening.
> > > >
> > > > Best,
> > > > Gert
> > >
> > > [1425/1433] Linking target src/gallium/targets/dri/libgallium_dri.so.
> > > FAILED: src/gallium/targets/dri/libgallium_dri.so
> > > c++ -o src/gallium/targets/dri/libgallium_dri.so
> > > 'src/gallium/targets/dri/8381c20@@gallium_dri@sha/target.c.o' -flto
> > > -fprofile-generate -Wl,--as-needed -Wl,--no-undefined -Wl,-O1 -shared
> > > -fPIC -Wl,--start-group -Wl,-soname,libgallium_dri.so
> > > src/mesa/libmesa_gallium.a src/mesa/libmesa_common.a
> > > src/compiler/glsl/libglsl.a src/compiler/glsl/glcpp/libglcpp.a
> > > src/util/libmesa_util.a src/util/format/libmesa_format.a
> > > src/compiler/nir/libnir.a src/compiler/libcompiler.a
> > > src/mesa/libmesa_sse41.a src/mesa/drivers/dri/common/libdricommon.a
> > > src/mesa/drivers/dri/common/libmegadriver_stub.a
> > > src/gallium/state_trackers/dri/libdri.a
> > > src/gallium/auxiliary/libgalliumvl.a src/gallium/auxiliary/libgallium.a
> > > src/mapi/shared-glapi/libglapi.so.0.0.0
> > > src/gallium/auxiliary/pipe-loader/libpipe_loader_static.a
> > > src/loader/libloader.a src/util/libxmlconfig.a
> > > src/gallium/winsys/sw/null/libws_null.a
> > > src/gallium/winsys/sw/wrapper/libwsw.a
> > > src/gallium/winsys/sw/dri/libswdri.a
> > > src/gallium/winsys/sw/kms-dri/libswkmsdri.a
> > > src/gallium/drivers/llvmpipe/libllvmpipe.a
> > > src/gallium/drivers/softpipe/libsoftpipe.a
> > > src/gallium/drivers/r600/libr600.a
> > > src/gallium/winsys/radeon/drm/libradeonwinsys.a
> > > src/gallium/drivers/radeonsi/libradeonsi.a
> > >
Re: [Mesa-dev] Profile-guides optimizations
I actually spent a bunch of time toying with PGO a couple of years ago. I got the guidance all working and was able to train it, but what we found was that it made the specific workloads we threw at it much faster, but it made every real-world use case I tried (playing a game, running piglit, gnome) slower, often significantly so.

The hard part is not setting up pgo, it's getting the right training data.

Dylan

Quoting Marek Olšák (2020-02-13 10:30:46)
> [Forked from the other thread]
>
> Guys, we could run some simple tests similar to piglit/drawoverhead as the last step of the pgo=generate build. Tests like that should exercise the most common codepaths in drivers. We could add subtests that we care about the most.
>
> Marek
>
> On Thu., Feb. 13, 2020, 13:16 Dylan Baker, wrote:
>
> meson has builtins for both of these, -Db_lto=true turns on lto, for pgo you would run:
>
> meson build -Db_pgo=generate
> ninja -C build
>
> meson configure build -Db_pgo=use
> ninja -C build
>
> Quoting Marek Olšák (2020-02-12 10:46:12)
> > How do you enable LTO+PGO? Is it something we could enable by default for release builds?
> >
> > Marek
> >
> > On Wed, Feb 12, 2020 at 1:56 AM Dieter Nützel wrote:
> >
> > Hello Gert,
> >
> > your merge 'broke' LTO and then later on PGO compilation/linking.
> >
> > I do generally compiling with '-Dgallium-drivers=r600,radeonsi,swrast' for testing radeonsi and (your) r600 work. ;-)
> >
> > After your merge I get several warnings in 'addrlib' with LTO and even a compiler error (gcc (SUSE Linux) 9.2.1 20200128).
> >
> > I had to disable 'r600' ('swrast' is needed for 'nine') to get a working LTO and even better PGO radeonsi driver.
> > I'm preparing GREAT LTO+PGO (the latter is the greater) numbers over the last 2 months. I'll send my results later, today.
> >
> > Summary
> > radeonsi is ~40% smaller and 16-20% faster with PGO (!!!).
> >
> > Honza and the GCC people (Intel's ICC folks) do GREAT things.
> > 'glmark2' numbers are better than 'vkmark'. (Hello, Marek.).
> >
> > Need some sleep.
> >
> > See my log, below.
> >
> > Greetings and GREAT work!
> >
> > -Dieter
> >
> > On 09.02.2020 15:46, Gert Wollny wrote:
> > > On Thursday, 23.01.2020, 20:31 +0100, Gert Wollny wrote:
> > > > has anybody any objections if I merge the r600/NIR code?
> > > > Without explicitly setting the debug flag it doesn't change a thing, but it would be better to continue developing in-tree.
> > > Okay, if nobody objects, I'll merge it Monday evening.
> > >
> > > Best,
> > > Gert
> >
> > [1425/1433] Linking target src/gallium/targets/dri/libgallium_dri.so.
> > FAILED: src/gallium/targets/dri/libgallium_dri.so
> > c++ -o src/gallium/targets/dri/libgallium_dri.so
> > 'src/gallium/targets/dri/8381c20@@gallium_dri@sha/target.c.o' -flto
> > -fprofile-generate -Wl,--as-needed -Wl,--no-undefined -Wl,-O1 -shared
> > -fPIC -Wl,--start-group -Wl,-soname,libgallium_dri.so
> > src/mesa/libmesa_gallium.a src/mesa/libmesa_common.a
> > src/compiler/glsl/libglsl.a src/compiler/glsl/glcpp/libglcpp.a
> > src/util/libmesa_util.a src/util/format/libmesa_format.a
> > src/compiler/nir/libnir.a src/compiler/libcompiler.a
> > src/mesa/libmesa_sse41.a src/mesa/drivers/dri/common/libdricommon.a
> > src/mesa/drivers/dri/common/libmegadriver_stub.a
> > src/gallium/state_trackers/dri/libdri.a
> > src/gallium/auxiliary/libgalliumvl.a src/gallium/auxiliary/libgallium.a
> > src/mapi/shared-glapi/libglapi.so.0.0.0
> > src/gallium/auxiliary/pipe-loader/libpipe_loader_static.a
> > src/loader/libloader.a src/util/libxmlconfig.a
> > src/gallium/winsys/sw/null/libws_null.a
> > src/gallium/winsys/sw/wrapper/libwsw.a
> > src/gallium/winsys/sw/dri/libswdri.a
> > src/gallium/winsys/sw/kms-dri/libswkmsdri.a
> > src/gallium/drivers/llvmpipe/libllvmpipe.a
> > src/gallium/drivers/softpipe/libsoftpipe.a
> > src/gallium/drivers/r600/libr600.a
> > src/gallium/winsys/radeon/drm/libradeonwinsys.a
> > src/gallium/drivers/radeonsi/libradeonsi.a
> > src/gallium/winsys/amdgpu/drm/libamdgpuwinsys.a
> > src/amd/addrlib/libaddrlib.a src/amd/common/libamd_common.a
> > src/amd/llvm/libamd_common_llvm.a -Wl,--build-id=sha1 -Wl,--gc-sections
> > -Wl,--version-script /opt/mesa/src/gallium/targets/dri/dri.sym
> > -Wl,--dynamic-list /opt/mesa/src/gallium/targets/dri/../dri-vdpau.dyn
> > /us
[Mesa-dev] Profile-guides optimizations
[Forked from the other thread]

Guys, we could run some simple tests similar to piglit/drawoverhead as the last step of the pgo=generate build. Tests like that should exercise the most common codepaths in drivers. We could add subtests that we care about the most.

Marek

On Thu., Feb. 13, 2020, 13:16 Dylan Baker, wrote:

> meson has builtins for both of these, -Db_lto=true turns on lto, for pgo you would run:
>
> meson build -Db_pgo=generate
> ninja -C build
>
> meson configure build -Db_pgo=use
> ninja -C build
>
> Quoting Marek Olšák (2020-02-12 10:46:12)
> > How do you enable LTO+PGO? Is it something we could enable by default for release builds?
> >
> > Marek
> >
> > On Wed, Feb 12, 2020 at 1:56 AM Dieter Nützel wrote:
> >
> > Hello Gert,
> >
> > your merge 'broke' LTO and then later on PGO compilation/linking.
> >
> > I do generally compiling with '-Dgallium-drivers=r600,radeonsi,swrast' for testing radeonsi and (your) r600 work. ;-)
> >
> > After your merge I get several warnings in 'addrlib' with LTO and even a compiler error (gcc (SUSE Linux) 9.2.1 20200128).
> >
> > I had to disable 'r600' ('swrast' is needed for 'nine') to get a working LTO and even better PGO radeonsi driver.
> > I'm preparing GREAT LTO+PGO (the latter is the greater) numbers over the last 2 months. I'll send my results later, today.
> >
> > Summary
> > radeonsi is ~40% smaller and 16-20% faster with PGO (!!!).
> >
> > Honza and the GCC people (Intel's ICC folks) do GREAT things.
> > 'glmark2' numbers are better than 'vkmark'. (Hello, Marek.).
> >
> > Need some sleep.
> >
> > See my log, below.
> >
> > Greetings and GREAT work!
> >
> > -Dieter
> >
> > On 09.02.2020 15:46, Gert Wollny wrote:
> > > On Thursday, 23.01.2020, 20:31 +0100, Gert Wollny wrote:
> > > > has anybody any objections if I merge the r600/NIR code?
> > > > Without explicitly setting the debug flag it doesn't change a thing, but it would be better to continue developing in-tree.
> > > Okay, if nobody objects, I'll merge it Monday evening.
> > >
> > > Best,
> > > Gert
> >
> > [1425/1433] Linking target src/gallium/targets/dri/libgallium_dri.so.
> > FAILED: src/gallium/targets/dri/libgallium_dri.so
> > c++ -o src/gallium/targets/dri/libgallium_dri.so
> > 'src/gallium/targets/dri/8381c20@@gallium_dri@sha/target.c.o' -flto
> > -fprofile-generate -Wl,--as-needed -Wl,--no-undefined -Wl,-O1 -shared
> > -fPIC -Wl,--start-group -Wl,-soname,libgallium_dri.so
> > src/mesa/libmesa_gallium.a src/mesa/libmesa_common.a
> > src/compiler/glsl/libglsl.a src/compiler/glsl/glcpp/libglcpp.a
> > src/util/libmesa_util.a src/util/format/libmesa_format.a
> > src/compiler/nir/libnir.a src/compiler/libcompiler.a
> > src/mesa/libmesa_sse41.a src/mesa/drivers/dri/common/libdricommon.a
> > src/mesa/drivers/dri/common/libmegadriver_stub.a
> > src/gallium/state_trackers/dri/libdri.a
> > src/gallium/auxiliary/libgalliumvl.a src/gallium/auxiliary/libgallium.a
> > src/mapi/shared-glapi/libglapi.so.0.0.0
> > src/gallium/auxiliary/pipe-loader/libpipe_loader_static.a
> > src/loader/libloader.a src/util/libxmlconfig.a
> > src/gallium/winsys/sw/null/libws_null.a
> > src/gallium/winsys/sw/wrapper/libwsw.a
> > src/gallium/winsys/sw/dri/libswdri.a
> > src/gallium/winsys/sw/kms-dri/libswkmsdri.a
> > src/gallium/drivers/llvmpipe/libllvmpipe.a
> > src/gallium/drivers/softpipe/libsoftpipe.a
> > src/gallium/drivers/r600/libr600.a
> > src/gallium/winsys/radeon/drm/libradeonwinsys.a
> > src/gallium/drivers/radeonsi/libradeonsi.a
> > src/gallium/winsys/amdgpu/drm/libamdgpuwinsys.a
> > src/amd/addrlib/libaddrlib.a src/amd/common/libamd_common.a
> > src/amd/llvm/libamd_common_llvm.a -Wl,--build-id=sha1 -Wl,--gc-sections
> > -Wl,--version-script /opt/mesa/src/gallium/targets/dri/dri.sym
> > -Wl,--dynamic-list /opt/mesa/src/gallium/targets/dri/../dri-vdpau.dyn
> > /usr/lib64/libdrm.so -L/usr/local/lib -lLLVM-10git -pthread
> > /usr/lib64/libexpat.so
> > /usr/lib64/gcc/x86_64-suse-linux/9/../../../../lib64/libz.so -lm
> > /usr/lib64/gcc/x86_64-suse-linux/9/../../../../lib64/libzstd.so
> > -L/usr/local/lib -lLLVM-10git /usr/lib64/libunwind.so -ldl -lsensors
> > -L/usr/local/lib -lLLVM-10git /usr/lib64/libdrm_radeon.so
> > /usr/lib64/libelf.so -L/usr/local/lib -lLLVM-10git -L/usr/local/lib
> > -lLLVM-10git -L/usr/local/lib -lLLVM-10git /usr/lib64/libdrm_amdgpu.so
> > -L/usr/local/lib -lLLVM-10git -Wl,--end-group
> > '-Wl,-rpath,$ORIGIN/../../../mesa:$ORIGIN/../../../compiler/glsl:$ORIGIN/../../../compiler/glsl/glcpp:$ORIGIN/../../../util:$ORIGIN/../../../util/format:$ORIGIN/../../../compiler/nir:$ORIGIN/../../../compiler:$ORIGIN/../../../mesa/drivers/dri/common:$ORIGIN/.
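
To make the proposal above concrete, here is a minimal sketch of the two-pass flow with a training run placed between the `-Db_pgo=generate` and `-Db_pgo=use` builds. It prints the steps by default and only executes them when `RUN=1` is set. `BUILD_DIR`, `TRAIN_CMD` and the script name `run-training-workload.sh` are hypothetical placeholders for whatever drawoverhead-style tests end up being chosen, and `meson setup build` is used as the explicit spelling of `meson build`. How the instrumented libraries get loaded by the training workload is left to that script.

```shell
#!/bin/sh
# Sketch of a two-pass PGO build with a training run in between.
# BUILD_DIR and TRAIN_CMD are hypothetical placeholders, not Mesa conventions.
BUILD_DIR="${BUILD_DIR:-build}"
TRAIN_CMD="${TRAIN_CMD:-./run-training-workload.sh}"

# Print each step unless RUN=1 is set, in which case execute it.
step() {
    if [ "${RUN:-0}" = "1" ]; then
        "$@"
    else
        echo "$*"
    fi
}

pgo_build() {
    # Pass 1: instrumented build using meson's builtin b_lto / b_pgo options.
    step meson setup "$BUILD_DIR" -Db_lto=true -Db_pgo=generate
    step ninja -C "$BUILD_DIR"
    # Training: run the chosen workload against the instrumented build.
    # Left unquoted on purpose so TRAIN_CMD may carry arguments.
    step $TRAIN_CMD
    # Pass 2: rebuild using the profile data collected during training.
    step meson configure "$BUILD_DIR" -Db_pgo=use
    step ninja -C "$BUILD_DIR"
}

pgo_build
```

The quality of the result depends entirely on what `TRAIN_CMD` runs, which is the point Dylan makes in his reply about training data.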