Hi Roland! On 4/11/19 8:18 PM, Roland Scheidegger wrote:
> What version of mesa are you using?
The original results were generated using version 19.0.2 (from the Arch Linux repositories), but I got the same results using the current git version (98934e6aa19795072a353dae6020dafadc76a1e3).
> The debug flags were changed a while ago (so that those perf tweaks can be disabled on release builds too), it needs to be either: GALLIVM_PERF=no_rho_approx,no_brilinear,no_quad_lod or easier GALLIVM_PERF=no_filter_hacks (which disables these 3 things above together). Although all of that only really affects filtering with mipmaps (not sure if you do?).
Using GALLIVM_PERF does not make a difference, either, but that should be expected because I'm not using mipmaps, just "regular" linear filtering (GL_LINEAR).
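For reference, a minimal sketch of how a test run could be launched with these flags set. The helper names gallivm_env and run_with_flags and the ./run_tests binary are hypothetical; only the GALLIVM_PERF variable and its flag names come from the discussion above.

```python
import os
import subprocess


def gallivm_env(perf_flags=("no_filter_hacks",), base_env=None):
    """Return a copy of the environment with GALLIVM_PERF set to the given flags."""
    env = dict(os.environ if base_env is None else base_env)
    # Multiple flags are passed as one comma-separated value.
    env["GALLIVM_PERF"] = ",".join(perf_flags)
    return env


def run_with_flags(command, perf_flags=("no_filter_hacks",)):
    """Run `command` (a list of arguments) with GALLIVM_PERF set; return its exit code."""
    return subprocess.run(command, env=gallivm_env(perf_flags)).returncode


# Hypothetical usage (the binary name is an assumption):
# run_with_flags(["./run_tests"])
```

Rendering the same scene once with and once without the flags and comparing the outputs pixel by pixel is enough to confirm whether the filter shortcuts play a role.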
> (more below)
See my responses below as well.
> On 11.04.19 at 18:00, Dominik Drees wrote:
>> Running with the suggested flags in the environment does not change the result for the test case I described below. The results with and without the environment variables set are pixel-wise equal.
>> By the way, and if this is of interest: For GL_NEAREST sampling the results from hardware and llvmpipe are equal as well.
>> Best, Dominik
>> On 4/11/19 4:36 PM, Ilia Mirkin wrote:
>>> llvmpipe takes a number of shortcuts in the interest of speed which cause inaccurate texturing. Try running with GALLIVM_DEBUG=no_rho_approx,no_brilinear,no_quad_lod and see if the issue still occurs.
>>> Cheers, -ilia
>>> On Thu, Apr 11, 2019 at 8:30 AM Dominik Drees <dominik.dr...@wwu.de> wrote:
>>>> Hello, everyone!
>>>> I have a question regarding the interpolation precision of llvmpipe. Feel free to redirect me to somewhere else if this is not the right place to ask.
>>>> Consider the following scenario: In a fragment shader we are sampling from a 16x16, 8 bit texture with values between 0 and 3 using linear interpolation. Then we write white to the screen if the sampled value is > 1/255 and black otherwise. The output looks very different when rendered with llvmpipe compared to the result produced by rendering hardware (for both intel (mesa i965) and nvidia (proprietary driver)).
>>>> I've uploaded exemplary output images here (https://imgur.com/a/D1udpez) and the corresponding fragment shader here (https://pastebin.com/pa808Req).
> The shader looks iffy to me, how do you use that vec4 in the if clause?
>>>> My hypothesis is that llvmpipe (in contrast to hardware) only uses 8 bit for the interpolation computation when reading from 8 bit textures and thus loses precision in the lower bits. Is that correct? If so, does anyone know of a workaround?
> So, in theory it is indeed possible the results are less accurate with llvmpipe (I believe all recent hw does rgba8 filtering with more than 8 bit precision). For formats fitting into rgba8, we have a fast path in llvmpipe (gallivm) for the lerp, which unpacks the 8bit values into 16bit values, does the lerp with that and packs back to 8 bit. The result is accurately rounded there (to 8 bit) but only for 1 lerp step - for a 2d texture there are 3 of those (one per direction, and a final one combining the result). And yes this means the filtered result only has 8 bits.
Do I understand you correctly in that for the 2D case, the results of the first two lerps (done in 16 bit) are converted to 8 bit, then converted back to 16 bit for the final (second stage) lerp?
If so and if I'm understanding this correctly, for 2D (i.e., a 2-stage linear interpolation) we potentially have an error in the order of one bit for the final 8 bit value due to the intermediate 16->8->16 conversion. For sampling from a 3D texture (i.e., a 3-stage linear interpolation) the effect would be amplified: The extra stage could cause an error with a magnitude of two bits of the final 8 bit result (if I'm doing the math in my head correctly).
Is there any (conceptual) reason why the result of a one dimensional interpolation step is reduced back to 8 bits before the second stage interpolation? Would avoiding these conversions not actually be faster (in addition to the improved accuracy)?
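To make the effect of the intermediate 16->8->16 conversion concrete, here is a small numeric sketch. It is not the gallivm code: it uses float weights and plain round-to-nearest, and all function names are made up for illustration. It only compares a bilinear filter that rounds every lerp result to an integer 8 bit value against one that keeps full precision throughout.

```python
import math


def lerp_float(a, b, t):
    """Linear interpolation without any intermediate rounding."""
    return a + (b - a) * t


def lerp_round8(a, b, t):
    """Lerp whose result is rounded to the nearest integer, i.e. stored as an
    8 bit value (inputs are assumed to be non-negative)."""
    return math.floor(lerp_float(a, b, t) + 0.5)


def bilinear(texels, tx, ty, lerp):
    """Two lerps along x followed by one lerp along y.
    texels is ((t00, t10), (t01, t11)), given in 8 bit units (0..255)."""
    (t00, t10), (t01, t11) = texels
    top = lerp(t00, t10, tx)
    bottom = lerp(t01, t11, tx)
    return lerp(top, bottom, ty)


def is_white(value):
    """The test from the original mail: white if the sample is > 1/255."""
    return value / 255.0 > 1.0 / 255.0


# Texels 1 and 2, sampled a quarter of the way between them.
quad = ((1, 2), (1, 2))
print(bilinear(quad, 0.25, 0.5, lerp_float))   # 1.25 -> white
print(bilinear(quad, 0.25, 0.5, lerp_round8))  # 1    -> black

# Intermediate rounding can also change the final 8 bit value itself.
quad2 = ((0, 1), (0, 0))
print(math.floor(bilinear(quad2, 0.5, 0.4, lerp_float) + 0.5))  # 0
print(bilinear(quad2, 0.5, 0.4, lerp_round8))                   # 1
```

The first case shows why a threshold at exactly 1/255 is so sensitive: any sample between 1 and 1.5 collapses to 1 once it is stored in 8 bits. The second case shows the one bit difference in the final value that intermediate rounding can introduce in 2D. A 3D lookup adds one more rounded stage on top of this.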
> I do believe you should not rely on implementations having more accuracy - as far as I know the filtering we do is conformant there (it is tricky to do better using the fast path).
In principle you are correct. In our regression tests we actually have (per test) configurable thresholds for maximum pixel distance/maximum number of differing pixels/neighborhood search radius etc. We could just increase these thresholds, but would risk missing some regressions that (for example) only affect a very small portion of the screen. For the larger part of our test suite llvmpipe actually works quite well within the established limits. For some other cases where we render a relatively small 8 bit 3D volume the differences basically trampled the previously set thresholds and were quite visible to the naked eye.
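As an illustration of what such thresholds can look like, here is a minimal sketch of a tolerance-based image comparison. It is not Voreen's actual implementation; the function names, the parameter semantics and the grayscale list-of-rows image layout are assumptions made for the example.

```python
def _has_close_neighbor(value, image, x, y, radius, max_pixel_distance):
    """True if any pixel of `image` within `radius` of (x, y) differs from
    `value` by at most `max_pixel_distance`."""
    height, width = len(image), len(image[0])
    for ny in range(max(0, y - radius), min(height, y + radius + 1)):
        for nx in range(max(0, x - radius), min(width, x + radius + 1)):
            if abs(image[ny][nx] - value) <= max_pixel_distance:
                return True
    return False


def images_match(img_a, img_b, max_pixel_distance, max_differing_pixels,
                 search_radius=0):
    """Compare two equally sized grayscale images (lists of rows of ints).

    A pixel of img_a counts as differing if no pixel of img_b within
    search_radius is within max_pixel_distance of it. The images match if
    at most max_differing_pixels pixels differ.
    """
    differing = 0
    for y, row in enumerate(img_a):
        for x, value in enumerate(row):
            if not _has_close_neighbor(value, img_b, x, y,
                                       search_radius, max_pixel_distance):
                differing += 1
    return differing <= max_differing_pixels


reference = [[0, 0], [0, 0]]
rendered = [[0, 2], [0, 0]]
print(images_match(reference, rendered, 1, 0))                   # False
print(images_match(reference, rendered, 2, 0))                   # True
print(images_match(reference, rendered, 1, 0, search_radius=1))  # True
```

Loosening any one of the three parameters makes the outlier pixel pass, which is exactly the trade-off described above: wider limits hide precision noise, but they also hide small real regressions.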
> There would be code to actually do filtering with full float precision, although there's no way to reach it with rgba8 formats unless you change the code. (If you want to try out the theory, look at lp_bld_sample_soa.c: lp_build_sample_soa_code() determines whether to use the fast (aos) filtering path (use_aos, determined mostly by util_format_fits_8unorm()). If you set this to false it will use the full float filtering path.) (FWIW I was actually thinking a while ago we should force this path when there's only 1 channel, albeit I never got around to test (benchmark) it - this is because the AoS filtering path is really optimized for rgba8 formats, and if you only have 1 channel it's quite possible float filtering is actually faster, since this handles the channels individually.) I guess though if the full float precision filtering is useful in general, we could add that to GALLIVM_PERF.
Forcing float precision indeed fixes the test case described below and our volume rendering regression tests! If this cannot be fixed in general I would be very happy about an option to force float precision via GALLIVM_PERF. FWIW, with forced float precision running our test suite is actually faster (~6 minutes) than "stock" master (~6:40), but these timings may be highly biased, of course.
Best, Dominik
> Roland
>>>> A little bit of background about the use case: We are trying to move the CI of Voreen (https://www.uni-muenster.de/Voreen/) to the Gitlab-CI running in docker without any hardware dependencies. Using llvmpipe for our regression tests works in principle, but shows significant differences in the raycasting rendering of an 8-bit-per-voxel dataset. (The effect is of course less visible than the constructed example case linked above, but still quite noticeable for a human.) Any help or pointers would be appreciated!
>>>> Best, Dominik
>>>> -- Dominik Drees Department of Computer Science Westfaelische Wilhelms-Universitaet Muenster email: dominik.dr...@wwu.de web: https://www.wwu.de/PRIA/personen/drees.shtml phone: +49 251 83 - 38448
>>>> _______________________________________________ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
-- Dominik Drees Department of Computer Science Westfaelische Wilhelms-Universitaet Muenster email: dominik.dr...@wwu.de web: https://www.wwu.de/PRIA/personen/drees.shtml phone: +49 251 83 - 38448
_______________________________________________ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev