Re: [Mesa-dev] ANNOUNCE: An open-source Vulkan driver for Intel hardware
I can confirm tri/cube work with latest git. Talos Principle refuses to
start because of the missing vkCmdBeginQuery; time to jump into the docs
to see how much of gen8 is copy-able there.

OG.

On Tue, Mar 1, 2016 at 10:28 AM, Jacek Konieczny wrote:
> On 2016-03-01 10:10, Martin Peres wrote:
>>
>> On 29/02/16 20:48, Jason Ekstrand wrote:
>>>
>>> On Fri, Feb 26, 2016 at 2:18 AM, Olivier Galibert wrote:
>>>
>>> Ok, I can tell you that 3DSTATE_DEPTH_BUFFER and
>>> 3DSTATE_STENCIL_BUFFER seem perfectly correct (assuming the gem
>>> address-patching-in works for the depth buffer address). I'll see if
>>> I can find a past version that works.
>>>
>>>
>>> FYI, this hang has been fixed now and most of the demos work
>>> more-or-less.
>>> --Jason
>>
>>
>> Just tried the vkcube with hsw and there is definitely an improvement
>> (the machine does not hard hang anymore) but vkcube now segfaults:
>
>
> For me both 'vkcube' and the 'cube' and 'tri' demos from
> LoaderAndValidationLayers work correctly with GIT revision
> 46b7c242da7c7c9ea7877a2c4b1fecdf5c1c0452.
>
> 'cube', 'tri' and most other Vulkan examples would cause GPU hang on
> earlier revisions, so the improvement is (was?) clear.
>
> Jacek
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] ANNOUNCE: An open-source Vulkan driver for Intel hardware
Beware of path issues, vk* has no error checking and gives funky values
to the driver if it fails at finding its extra files.

OG.

On Tue, Mar 1, 2016 at 10:10 AM, Martin Peres wrote:
> On 29/02/16 20:48, Jason Ekstrand wrote:
>
> On Fri, Feb 26, 2016 at 2:18 AM, Olivier Galibert wrote:
>>
>> Ok, I can tell you that 3DSTATE_DEPTH_BUFFER and
>> 3DSTATE_STENCIL_BUFFER seem perfectly correct (assuming the gem
>> address-patching-in works for the depth buffer address). I'll see if
>> I can find a past version that works.
>
>
> FYI, this hang has been fixed now and most of the demos work more-or-less.
> --Jason
>
>
> Just tried the vkcube with hsw and there is definitely an improvement (the
> machine does not hard hang anymore) but vkcube now segfaults:
>
> #0 0x75210f23 in anv_descriptor_set_create () from /usr/lib/libvulkan_intel.so
> #1 0x7521121d in anv_AllocateDescriptorSets () from /usr/lib/libvulkan_intel.so
> #2 0x004063b0 in ?? ()
> #3 0x004035df in ?? ()
> #4 0x00404986 in ?? ()
> #5 0x0040589f in ?? ()
> #6 0x76aa9710 in __libc_start_main () from /usr/lib/libc.so.6
> #7 0x00402e69 in ?? ()
>
> Is it supposed to?
>
> I will have a look at it tonight.
>
> Martin
Re: [Mesa-dev] ANNOUNCE: An open-source Vulkan driver for Intel hardware
Ok, I can tell you that 3DSTATE_DEPTH_BUFFER and 3DSTATE_STENCIL_BUFFER seem perfectly correct (assuming the gem address-patching-in works for the depth buffer address). I'll see if I can find a past version that works. OG. On Wed, Feb 17, 2016 at 4:31 PM, Jason Ekstrand wrote: > On Tue, Feb 16, 2016 at 11:22 PM, Olivier Galibert > wrote: >> >> I'm actually interested about how one goes about debugging that kind >> of problem, if you have pointers. I would have an idea or two on how >> to go about it if it was in userspace only, but once it crosses into >> the kernel I'm not sure what strategies are best. > > > This is almost certainly a userspace problem. I mentioned before that it's > probably a depth/stencil problem. I remember having similar problems a few > months ago when I was reviving gen7. I know that depth/stencil did work at > some point. > > I would start by looking at is where we emit the 3DSTATE_DEPTH_BUFFER and > 3DSTATE_STENCIL_BUFFER and trying to see if we're setting something up > wrong. Sometimes it's just a matter of looking at the documentation and > comparing the values we're setting to the docs and seeing if the make sense. > That's where I'd start. > > You could also try to go back a little ways (don't to past the update to > 1.0) to see if you can find a point where depth/stencil worked and try and > bisect to find where it broke. That may also provide hints as to what's > going wrong. > > Hope that helps, > --Jason > >> >> >> Best, >> >> OG. >> >> >> On Wed, Feb 17, 2016 at 2:51 AM, Jason Ekstrand >> wrote: >> > On Tue, Feb 16, 2016 at 1:21 PM, Olivier Galibert >> > wrote: >> >> >> >> Hi, >> >> >> >> I'm getting gpu hangs with the lunarg examples (cube and tri) on my >> >> Haswell (64 bits). I attach /sys/class/drm/card0/error fwiw. How >> >> should I go about debugging that? >> > >> > >> > It's a depth-stencil issue and we know about it. The gen7 code needs >> > some >> > love. I think Kristian and Jordan have been working on it. 
>> > --Jason >> > >> >> >> >> >> >> OG. >> >> >> >> >> >> On Tue, Feb 16, 2016 at 4:19 PM, Jason Ekstrand >> >> wrote: >> >> > The Intel mesa team is pleased to announce a brand-new open-source >> >> > Vulkan >> >> > driver for Intel hardware. We've been working hard on this over the >> >> > course >> >> > of the past year or so and are excited to finally share it with the >> >> > community. We will work on up-streaming the driver in the next few >> >> > weeks >> >> > and hope to have it all in place in time for mesa 11.3 (mesa 12?). >> >> > In >> >> > the >> >> > mean time, the driver can be found in the "vulkan" branch of the mesa >> >> > git >> >> > repo on freedesktop.org: >> >> > >> >> > https://cgit.freedesktop.org/mesa/mesa/log/?h=vulkan >> >> > >> >> > More information on building the driver and running a few simple apps >> >> > can >> >> > be found on the 01.org web site: >> >> > >> >> > >> >> > >> >> > https://01.org/linuxgraphics/blogs/jekstrand/2016/open-source-vulkan-drivers-intel-hardware >> >> > >> >> > We have talked to people at Red Hat and Cannonical and binaries >> >> > should >> >> > be >> >> > available for Fedora and Ubuntu soon. We will update the page on >> >> > 01.org >> >> > with links as soon as they are available. >> >> > >> >> > We have also created a small test suite called crucible which >> >> > contains a >> >> > few hundred tests (mostly for miptrees) that we created when bringing >> >> > up >> >> > the driver. This isn't really intended to be the piglit of vulkan. >> >> > With >> >> > the CTS being publicly available, most cross-platform tests should go >> >> > there. We mostly made crucible so that we could write a few tests >> >> > early >> >> > on >> >> > to get us going and for tests that were targetted specifically at our >> >> > implementation. None the less, they may prove usefu
Re: [Mesa-dev] ANNOUNCE: An open-source Vulkan driver for Intel hardware
Ok, I'll do that, thanks :-) No matter what, I'll learn interesting things. OG. On Wed, Feb 17, 2016 at 4:31 PM, Jason Ekstrand wrote: > On Tue, Feb 16, 2016 at 11:22 PM, Olivier Galibert > wrote: >> >> I'm actually interested about how one goes about debugging that kind >> of problem, if you have pointers. I would have an idea or two on how >> to go about it if it was in userspace only, but once it crosses into >> the kernel I'm not sure what strategies are best. > > > This is almost certainly a userspace problem. I mentioned before that it's > probably a depth/stencil problem. I remember having similar problems a few > months ago when I was reviving gen7. I know that depth/stencil did work at > some point. > > I would start by looking at is where we emit the 3DSTATE_DEPTH_BUFFER and > 3DSTATE_STENCIL_BUFFER and trying to see if we're setting something up > wrong. Sometimes it's just a matter of looking at the documentation and > comparing the values we're setting to the docs and seeing if the make sense. > That's where I'd start. > > You could also try to go back a little ways (don't to past the update to > 1.0) to see if you can find a point where depth/stencil worked and try and > bisect to find where it broke. That may also provide hints as to what's > going wrong. > > Hope that helps, > --Jason > >> >> >> Best, >> >> OG. >> >> >> On Wed, Feb 17, 2016 at 2:51 AM, Jason Ekstrand >> wrote: >> > On Tue, Feb 16, 2016 at 1:21 PM, Olivier Galibert >> > wrote: >> >> >> >> Hi, >> >> >> >> I'm getting gpu hangs with the lunarg examples (cube and tri) on my >> >> Haswell (64 bits). I attach /sys/class/drm/card0/error fwiw. How >> >> should I go about debugging that? >> > >> > >> > It's a depth-stencil issue and we know about it. The gen7 code needs >> > some >> > love. I think Kristian and Jordan have been working on it. >> > --Jason >> > >> >> >> >> >> >> OG. 
>> >> >> >> >> >> On Tue, Feb 16, 2016 at 4:19 PM, Jason Ekstrand >> >> wrote: >> >> > The Intel mesa team is pleased to announce a brand-new open-source >> >> > Vulkan >> >> > driver for Intel hardware. We've been working hard on this over the >> >> > course >> >> > of the past year or so and are excited to finally share it with the >> >> > community. We will work on up-streaming the driver in the next few >> >> > weeks >> >> > and hope to have it all in place in time for mesa 11.3 (mesa 12?). >> >> > In >> >> > the >> >> > mean time, the driver can be found in the "vulkan" branch of the mesa >> >> > git >> >> > repo on freedesktop.org: >> >> > >> >> > https://cgit.freedesktop.org/mesa/mesa/log/?h=vulkan >> >> > >> >> > More information on building the driver and running a few simple apps >> >> > can >> >> > be found on the 01.org web site: >> >> > >> >> > >> >> > >> >> > https://01.org/linuxgraphics/blogs/jekstrand/2016/open-source-vulkan-drivers-intel-hardware >> >> > >> >> > We have talked to people at Red Hat and Cannonical and binaries >> >> > should >> >> > be >> >> > available for Fedora and Ubuntu soon. We will update the page on >> >> > 01.org >> >> > with links as soon as they are available. >> >> > >> >> > We have also created a small test suite called crucible which >> >> > contains a >> >> > few hundred tests (mostly for miptrees) that we created when bringing >> >> > up >> >> > the driver. This isn't really intended to be the piglit of vulkan. >> >> > With >> >> > the CTS being publicly available, most cross-platform tests should go >> >> > there. We mostly made crucible so that we could write a few tests >> >> > early >> >> > on >> >> > to get us going and for tests that were targetted specifically at our >> >> > implementation. None the less, they may prove useful to someone and >> >> > we >> >> > are >> >> > happy to share them. The cruc
Re: [Mesa-dev] ANNOUNCE: An open-source Vulkan driver for Intel hardware
I'm actually interested about how one goes about debugging that kind of problem, if you have pointers. I would have an idea or two on how to go about it if it was in userspace only, but once it crosses into the kernel I'm not sure what strategies are best. Best, OG. On Wed, Feb 17, 2016 at 2:51 AM, Jason Ekstrand wrote: > On Tue, Feb 16, 2016 at 1:21 PM, Olivier Galibert > wrote: >> >> Hi, >> >> I'm getting gpu hangs with the lunarg examples (cube and tri) on my >> Haswell (64 bits). I attach /sys/class/drm/card0/error fwiw. How >> should I go about debugging that? > > > It's a depth-stencil issue and we know about it. The gen7 code needs some > love. I think Kristian and Jordan have been working on it. > --Jason > >> >> >> OG. >> >> >> On Tue, Feb 16, 2016 at 4:19 PM, Jason Ekstrand >> wrote: >> > The Intel mesa team is pleased to announce a brand-new open-source >> > Vulkan >> > driver for Intel hardware. We've been working hard on this over the >> > course >> > of the past year or so and are excited to finally share it with the >> > community. We will work on up-streaming the driver in the next few >> > weeks >> > and hope to have it all in place in time for mesa 11.3 (mesa 12?). In >> > the >> > mean time, the driver can be found in the "vulkan" branch of the mesa >> > git >> > repo on freedesktop.org: >> > >> > https://cgit.freedesktop.org/mesa/mesa/log/?h=vulkan >> > >> > More information on building the driver and running a few simple apps >> > can >> > be found on the 01.org web site: >> > >> > >> > https://01.org/linuxgraphics/blogs/jekstrand/2016/open-source-vulkan-drivers-intel-hardware >> > >> > We have talked to people at Red Hat and Cannonical and binaries should >> > be >> > available for Fedora and Ubuntu soon. We will update the page on 01.org >> > with links as soon as they are available. 
>> > >> > We have also created a small test suite called crucible which contains a >> > few hundred tests (mostly for miptrees) that we created when bringing up >> > the driver. This isn't really intended to be the piglit of vulkan. >> > With >> > the CTS being publicly available, most cross-platform tests should go >> > there. We mostly made crucible so that we could write a few tests early >> > on >> > to get us going and for tests that were targetted specifically at our >> > implementation. None the less, they may prove useful to someone and we >> > are >> > happy to share them. The crucible source code can be found at >> > >> > https://cgit.freedesktop.org/mesa/crucible/ >> > >> > Frequently Asked Questions: >> > >> > What all hardware does it support? >> > >> >The driver currently supports Sky Lake all the way back to Ivy >> > Bridge. >> >The driver is Vulkan 1.0 conformant for 64-bit builds on Sky Lake, >> >Broadwell, and Braswell. We are still having a couple of 32-bit >> > issues >> >and support for Haswell, Ivy Bridge, and Bay Trail should be >> > considered >> >experimental. >> > >> > How much code is shared between the Vulkan and GL drivers? >> > >> >For shaders, we're using a SPIR-V to NIR pass which is new, and a few >> >new NIR lowering passes for things that we previously depended on >> > GLSL >> >IR to handle. Beyond that, we're using the same core NIR and the >> > same >> >back-end compiler that we have for GL. We're carrying a few patches >> >against the back-end compiler, but the delta is very small and it's >> > all >> >stuff that we eventually want to do for GL anyway. >> > >> >The main API handling and state setup code is all new and written >> > from >> >the ground-up for Vulkan. For actually packing hardware packets, we >> > are >> >using a codegen system that Kristian developed early on in the >> > project >> >that's based on an XML description of the hardware packets. 
The >> > result >> >is state setup code that's both easier to work with and maybe even a >> >little more efficient than what we have in mesa today. >> > >> >We also have a brand-new surface layout library called ISL that >> > handles >
Re: [Mesa-dev] [PATCH v6] nir: Add an ALU op builder kind of like ir_builder.h
Hi,

I thought mesa was C++ by now? That API is really C-ish.

OG.

On Wed, Feb 18, 2015 at 2:12 AM, Kenneth Graunke wrote:
> On Friday, February 06, 2015 04:00:10 PM Eric Anholt wrote:
>> v2: Rebase on the nir_opcodes.h python code generation support.
>> v3: Use SSA values, and set an appropriate writemask on dot products.
>> v4: Make the arguments be SSA references as well. This lets you stack up
>>     expressions in the arguments of other expressions, at the cost of
>>     having to insert a fmov/imov if you want to swizzle. Also, add
>>     the generated file to NIR_GENERATED_FILES.
>> v5: Use more pythonish style for iterating the list.
>> v6: Infer the size of the dest from the size of the srcs, and auto-swizzle
>>     a single small src out to the appropriate size.
>> ---
>>  src/glsl/Makefile.am                  |   5 ++
>>  src/glsl/Makefile.sources             |   1 +
>>  src/glsl/nir/.gitignore               |   1 +
>>  src/glsl/nir/nir_builder.h            | 114 ++
>>  src/glsl/nir/nir_builder_opcodes_h.py |  38
>>  5 files changed, 159 insertions(+)
>>  create mode 100644 src/glsl/nir/nir_builder.h
>>  create mode 100644 src/glsl/nir/nir_builder_opcodes_h.py
>
> This patch is:
> Reviewed-by: Kenneth Graunke
>
> I do like Connor's ideas - we should definitely extend this and use it
> in more places. I think we can easily do that as a follow on series.
>
> It might make sense to (eventually) have an API like:
>
> nir_builder *nir_builder_create(...)
>
> nir_builder_insert_at_cf_list(nir_builder *b, nir_cf_list *cf_list)
> nir_builder_insert_at_block_start(nir_builder *b, nir_bblock *block)
> nir_builder_insert_at_block_end(nir_builder *b, nir_bblock *block)
> nir_builder_insert_after_instr(nir_builder *b, nir_instruction *instr)
> nir_builder_insert_before_instr(nir_builder *b, nir_instruction *instr)
>
> I could see us having to store a cf_list/bblock/instr and needing to
> swap around several fields, so having functions would be nicer than
> prodding at struct fields directly.
>
> But for now, I think it's sufficient - it'll be easy enough to create
> later, when we actually make the other APIs and start using them.
Re: [Mesa-dev] [PATCH] mesa: Add mesa SHA-1 functions
Hi,

Not sure there's anything to maintain, but sure, I'll maintain it.

Best,

OG.

On Sun, Dec 21, 2014 at 8:51 PM, Emil Velikov wrote:
> On 20 December 2014 at 14:21, Olivier Galibert wrote:
>> Here is an implementation I've written myself, so no license issues.
>>
> Thanks OG,
>
> Afaics the main issue is not the lack of implementation, but that
> no-one wants to step up to "maintain" it.
> Even adding code that is x2 the size is considered a better solution :'-(
>
> If you're up-to the maintenance task, we can resolve all the issues
> (linking, multi platform support) in half the size and a lot cleaner
> build :-)
>
> Cheers,
> Emil
Re: [Mesa-dev] [PATCH] mesa: Add mesa SHA-1 functions
Here is an implementation I've written myself, so no license issues. OG. On Fri, Dec 12, 2014 at 10:48 AM, Jose Fonseca wrote: > On 11/12/14 22:02, Brian Paul wrote: >> >> On 12/11/2014 02:51 PM, Carl Worth wrote: >>> >>> From: Kristian Høgsberg >>> >>> The upcoming shader cache uses the SHA-1 algorithm for cryptographic >>> naming. These new mesa_sha1 functions are implemented with the nettle >>> library. >>> --- >>> >>> This patch is another in support of my upcoming shader-cache work. >>> Thanks to >>> Kritian for coding this piece. >>> >>> As currently written, this patch introduces a new dependency of Mesa >>> on the >>> Nettle library to implement SHA-1. I'm open to recommendations if >>> people would prefer some other option. >>> >>> For example, the xserver can be configured to get a SHA-1 >>> implementation from >>> libmd, libc, CommonCrypto, CryptoAPI, libnettle, libgcrypt, libsha1, or >>> openssl. >>> >>> I don't know if it's important to offer as many options as that, which >>> is why >>> I'm asking for opinions here. >> >> >> >> We'll need a solution for Windows too. I don't have time right now to >> do any research into that. > > > Yes, ideally we'd have something small that we could bundle into mesa source > tree, for sake of non Linux OSes. > > If Windows was the only concern, we could use its Crypto API, > http://msdn.microsoft.com/en-us/library/windows/desktop/aa382379.aspx and > avoid depending on anything else, but some of the above mention libraries > are not trivial to install. > > The other alternative is to disable shader cache when no suitable dependency > is found. That is, make this an optional dependency. 
> > Jose > > ___ > mesa-dev mailing list > mesa-dev@lists.freedesktop.org > http://lists.freedesktop.org/mailman/listinfo/mesa-dev /* * Copyright © 2014 Olivier Galibert & Intel Corporation * * Permission is hereby granted, free of charge, to any person obtaining a * copy of this software and associated documentation files (the "Software"), * to deal in the Software without restriction, including without limitation * the rights to use, copy, modify, merge, publish, distribute, sublicense, * and/or sell copies of the Software, and to permit persons to whom the * Software is furnished to do so, subject to the following conditions: * * The above copyright notice and this permission notice (including the next * paragraph) shall be included in all copies or substantial portions of the * Software. * * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER * DEALINGS IN THE SOFTWARE. 
*/ #include #include #include "sha1.h" static inline unsigned int mesa_sha1_shift(unsigned int val, int count) { return (val << count) | (val >> (32-count)); } static void mesa_sha1_init(struct mesa_sha1 *ctx) { ctx->digest[0] = 0x67452301; ctx->digest[1] = 0xefcdab89; ctx->digest[2] = 0x98badcfe; ctx->digest[3] = 0x10325476; ctx->digest[4] = 0xc3d2e1f0; ctx->msize = 0; } static void mesa_sha1_handle_block(struct mesa_sha1 *ctx, const unsigned char *b) { unsigned int W[80]; for(int i=0; i != 16; i++) W[i] = (b[4*i] << 24) | (b[4*i+1] << 16) | (b[4*i+2] << 8) | b[4*i+3]; for(int i=16; i != 80; i++) W[i] = mesa_sha1_shift(W[i-3]^W[i-8]^W[i-14]^W[i-16], 1); unsigned int A = ctx->digest[0]; unsigned int B = ctx->digest[1]; unsigned int C = ctx->digest[2]; unsigned int D = ctx->digest[3]; unsigned int E = ctx->digest[4]; for(int i= 0; i != 20; i++) { unsigned int T = mesa_sha1_shift(A, 5) + ((B & C) | ((~B) & D))+ E + W[i] + 0x5A827999; E = D; D = C; C = mesa_sha1_shift(B, 30); B = A; A = T; } for(int i=20; i != 40; i++) { unsigned int T = mesa_sha1_shift(A, 5) + (B^C^D) + E + W[i] + 0x6ed9eba1; E = D; D = C; C = mesa_sha1_shift(B, 30); B = A; A = T; } for(int i=40; i != 60; i++) { unsigned int T = mesa_sha1_shift(A, 5) + ((B & C) | (B & D
Re: [Mesa-dev] [PATCH 03/16] mesa: Clamps the stencil value masks to GLint when queried
Hmm, if you convert to float you have a real problem: floats only
have 23 bits of mantissa, so if bit 31 or 32 is set bits 0-7 will be
lost. Converting directly won't change a thing there. Initing to 255
is definitely better it seems.

W.r.t clamping, in computer graphics clamping a value to an interval
means setting the value to the nearest boundary if it was outside of
the interval. Clamping can never change a value to something *inside*
the interval, which masking does.

OG.

On Thu, Dec 18, 2014 at 12:08 PM, Eduardo Lima Mitev wrote:
> On 12/18/2014 10:28 AM, Eduardo Lima Mitev wrote:
>> On 12/18/2014 09:55 AM, Olivier Galibert wrote:
>>> Something is not clear to me: In which way -1 is incorrect?
>>>
>>
>> Hi Olivier,
>>
>> The values being queried are the front and back stencil masks. Masks are
>> (conceptually?) an unsigned integer, AFAIU.
>
> Well, more accurately, just a string of bits, so -1 (or any other value)
> is probably fine. Problem is when the signed integer value is further
> converted to float, which is the case for these failing tests. (Note
> that only the test cases that query the mask as a float value are the
> ones failing).
>
> Giving a bit more of thought to this, and assuming the test is fine
> querying for a mask value using the glGetFloat API, the problem is Mesa
> converting from unsigned int (the mask) to signed int, then to float;
> instead of converting to float directly. I don't have a say on why it
> does that intermediate conversion to int, though.
>
> So my original solution was wrong from different angles, and the final
> solution is probably not the best one either, since we are "avoiding"
> the type conversion problem rather than fixing it.
>
> Thanks for raising these points.
>
> Eduardo
Re: [Mesa-dev] [PATCH 03/16] mesa: Clamps the stencil value masks to GLint when queried
Hi,

Something is not clear to me: In which way -1 is incorrect?

Also, w.r.t comments, what you're doing is masking, not clamping,
which incidentally is a good thing since clamping would be severely
bad for stencil.

Best,

OG.

On Thu, Dec 11, 2014 at 11:34 PM, Eduardo Lima Mitev wrote:
> Stencil value masks values (ctx->Stencil.ValueMask[]) stores GLuint values
> which are initialized with max unsigned integer (~0u). When these values
> are queried by glGet* (GL_STENCIL_VALUE_MASK or GL_STENCIL_BACK_VALUE_MASK),
> they are converted to a signed integer. Currently, these values overflow
> and return incorrect result (-1).
>
> This patch clamps these values to max int (0x7FFFFFFF) before storing.
>
> Fixes 6 dEQP failing tests:
> * dEQP-GLES3.functional.state_query.integers.stencil_value_mask_getfloat
> * dEQP-GLES3.functional.state_query.integers.stencil_back_value_mask_getfloat
> * dEQP-GLES3.functional.state_query.integers.stencil_value_mask_separate_getfloat
> * dEQP-GLES3.functional.state_query.integers.stencil_value_mask_separate_both_getfloat
> * dEQP-GLES3.functional.state_query.integers.stencil_back_value_mask_separate_getfloat
> * dEQP-GLES3.functional.state_query.integers.stencil_back_value_mask_separate_both_getfloat
> ---
>  src/mesa/main/get.c              | 11 ++-
>  src/mesa/main/get_hash_params.py |  2 +-
>  2 files changed, 11 insertions(+), 2 deletions(-)
>
> diff --git a/src/mesa/main/get.c b/src/mesa/main/get.c
> index 6091efc..4578a36 100644
> --- a/src/mesa/main/get.c
> +++ b/src/mesa/main/get.c
> @@ -726,7 +726,16 @@ find_custom_value(struct gl_context *ctx, const struct value_desc *d, union valu
>        v->value_int = _mesa_get_stencil_ref(ctx, 1);
>        break;
>     case GL_STENCIL_VALUE_MASK:
> -      v->value_int = ctx->Stencil.ValueMask[ctx->Stencil.ActiveFace];
> +      /* Since stencil value mask is a GLuint, it requires clamping
> +       * before storing in a signed int to avoid overflow.
> +       * Notice that Stencil.ValueMask values are initialized to ~0u,
> +       * so without clamping it will return -1 when assigned to value_int.
> +       */
> +      v->value_int = ctx->Stencil.ValueMask[ctx->Stencil.ActiveFace] & 0x7FFFFFFF;
> +      break;
> +   case GL_STENCIL_BACK_VALUE_MASK:
> +      /* Same as with GL_STENCIL_VALUE_MASK, value requires clamping. */
> +      v->value_int = ctx->Stencil.ValueMask[1] & 0x7FFFFFFF;
>        break;
>     case GL_STENCIL_WRITEMASK:
>        v->value_int = ctx->Stencil.WriteMask[ctx->Stencil.ActiveFace];
> diff --git a/src/mesa/main/get_hash_params.py b/src/mesa/main/get_hash_params.py
> index 09a61ac..a3bf1cb 100644
> --- a/src/mesa/main/get_hash_params.py
> +++ b/src/mesa/main/get_hash_params.py
> @@ -283,7 +283,7 @@ descriptor=[
>
>  # OpenGL 2.0
>    [ "STENCIL_BACK_FUNC", "CONTEXT_ENUM(Stencil.Function[1]), NO_EXTRA" ],
> -  [ "STENCIL_BACK_VALUE_MASK", "CONTEXT_INT(Stencil.ValueMask[1]), NO_EXTRA" ],
> +  [ "STENCIL_BACK_VALUE_MASK", "LOC_CUSTOM, TYPE_INT, NO_OFFSET, NO_EXTRA"],
>    [ "STENCIL_BACK_WRITEMASK", "CONTEXT_INT(Stencil.WriteMask[1]), NO_EXTRA" ],
>    [ "STENCIL_BACK_REF", "LOC_CUSTOM, TYPE_INT, NO_OFFSET, NO_EXTRA" ],
>    [ "STENCIL_BACK_FAIL", "CONTEXT_ENUM(Stencil.FailFunc[1]), NO_EXTRA" ],
> --
> 2.1.3
Re: [Mesa-dev] [PATCH v2] mesa: Initializes the stencil value masks to 0xFF instead of ~0u
Note that ~0U is perfectly correct w.r.t the GLES3 spec. It just
means that s=32, which happens to be greater than or equal to 8.

Best,

OG.

On Tue, Dec 16, 2014 at 8:58 AM, Eduardo Lima Mitev wrote:
> On 12/15/2014 08:30 PM, Ian Romanick wrote:
>> On 12/15/2014 08:04 AM, Eduardo Lima Mitev wrote:
>>>
>>> Since the maximum supported precision for stencil buffers is 8 bits, mask
>>> values should be initialized to 2^8 - 1 = 0xFF.
>>>
>>> Currently, these masks are initialized to max unsigned integer (~0u), which
>>> causes their values to overflow to -1 when converted to signed int by
>>> glGet* APIs.
>>
>> I did some research on this... before desktop OpenGL 3.1, the spec said
>> something quite different. Please add the following to the commit message:
>>
>> "In OpenGL 3.0 and before, an initial value of ~0u was specified:
>>
>>     In the initial state, stenciling is disabled, the front and back
>>     stencil reference value are both zero, the front and back stencil
>>     comparison functions are both ALWAYS, and the front and back
>>     stencil mask are both all ones."
>>
>
> Oh, interesting. I should have looked back into older specs to
> understand where the ~0u was coming from. Note taken.
>
>>
>> With that, this patch is
>>
>> Reviewed-by: Ian Romanick
>>
>
> Great. If you feel like nitpicking, you can check the final commit log
> here:
> https://github.com/Igalia/mesa/commit/3784f7b2d5aa739c4abf9aa28874b85bbd1550e5
>
> Thanks a lot!
>
> Eduardo
Re: [Mesa-dev] [PATCH] mesa: Add mesa SHA-1 functions
Hi,

SHA1 is easy to implement. If you want an always-working backup, I
have a couple of C versions I wrote myself. Libraries are only
interesting if they offer significant speedups through CPU-specific
code. Especially since the shader cache is not in the happy fun land
of hardware-based attacks (or attacks in the first place).

Best,

OG.

On Fri, Dec 12, 2014 at 10:48 AM, Jose Fonseca wrote:
> On 11/12/14 22:02, Brian Paul wrote:
>>
>> On 12/11/2014 02:51 PM, Carl Worth wrote:
>>>
>>> From: Kristian Høgsberg
>>>
>>> The upcoming shader cache uses the SHA-1 algorithm for cryptographic
>>> naming. These new mesa_sha1 functions are implemented with the nettle
>>> library.
>>> ---
>>>
>>> This patch is another in support of my upcoming shader-cache work.
>>> Thanks to Kristian for coding this piece.
>>>
>>> As currently written, this patch introduces a new dependency of Mesa
>>> on the Nettle library to implement SHA-1. I'm open to recommendations if
>>> people would prefer some other option.
>>>
>>> For example, the xserver can be configured to get a SHA-1
>>> implementation from libmd, libc, CommonCrypto, CryptoAPI, libnettle,
>>> libgcrypt, libsha1, or openssl.
>>>
>>> I don't know if it's important to offer as many options as that, which
>>> is why I'm asking for opinions here.
>>
>>
>>
>> We'll need a solution for Windows too. I don't have time right now to
>> do any research into that.
>
>
> Yes, ideally we'd have something small that we could bundle into mesa source
> tree, for sake of non Linux OSes.
>
> If Windows was the only concern, we could use its Crypto API,
> http://msdn.microsoft.com/en-us/library/windows/desktop/aa382379.aspx and
> avoid depending on anything else, but some of the above-mentioned libraries
> are not trivial to install.
>
> The other alternative is to disable shader cache when no suitable dependency
> is found. That is, make this an optional dependency.
>
> Jose
Re: [Mesa-dev] [PATCH] glsl: improve accuracy of atan()
Applied.

OG.

On Fri, Sep 26, 2014 at 6:11 PM, Erik Faye-Lund wrote:
> Our current atan()-approximation is pretty inaccurate at 1.0, so
> let's try to improve the situation by doing a direct approximation
> without going through atan.
>
> This new implementation uses an 11th degree polynomial to approximate
> atan in the [-1..1] range, and the following identity to reduce the
> entire range to [-1..1]:
>
> atan(x) = 0.5 * pi * sign(x) - atan(1.0 / x)
>
> This range-reduction idea is taken from the paper "Fast computation
> of Arctangent Functions for Embedded Applications: A Comparative
> Analysis" (Ukil et al. 2011).
>
> The polynomial that approximates atan(x) is:
>
> x   * 0.9999793128310355 - x^3  * 0.3326756418091246 +
> x^5 * 0.1938924977115610 - x^7  * 0.1173503194786851 +
> x^9 * 0.0536813784310406 - x^11 * 0.0121323213173444
>
> This polynomial was found with the following GNU Octave script:
>
> x = linspace(0, 1);
> y = atan(x);
> n = [1, 3, 5, 7, 9, 11];
> format long;
> polyfitc(x, y, n)
>
> The polyfitc function is not built-in, but too long to include here.
> It can be downloaded from the following URL:
>
> http://www.mathworks.com/matlabcentral/fileexchange/47851-constraint-polynomial-fit/content/polyfitc.m
>
> This fixes the following piglit test:
> shaders/glsl-const-folding-01
>
> Signed-off-by: Erik Faye-Lund
> Reviewed-by: Ian Romanick
> ---
>  src/glsl/builtin_functions.cpp | 65 +++---
>  1 file changed, 55 insertions(+), 10 deletions(-)
>
> diff --git a/src/glsl/builtin_functions.cpp b/src/glsl/builtin_functions.cpp
> index 9be7f6d..c126b60 100644
> --- a/src/glsl/builtin_functions.cpp
> +++ b/src/glsl/builtin_functions.cpp
> @@ -442,6 +442,7 @@ private:
>     ir_swizzle *matrix_elt(ir_variable *var, int col, int row);
>
>     ir_expression *asin_expr(ir_variable *x);
> +   void do_atan(ir_factory &body, const glsl_type *type, ir_variable *res, operand y_over_x);
>
>     /**
>      * Call function \param f with parameters specified as the linked
> @@ -2684,11 +2685,7 @@ builtin_builder::_atan2(const glsl_type *type)
>        ir_factory outer_then(&outer_if->then_instructions, mem_ctx);
>
>        /* Then...call atan(y/x) */
> -      ir_variable *y_over_x = outer_then.make_temp(glsl_type::float_type, "y_over_x");
> -      outer_then.emit(assign(y_over_x, div(y, x)));
> -      outer_then.emit(assign(r, mul(y_over_x, rsq(add(mul(y_over_x, y_over_x),
> -                                                      imm(1.0f))))));
> -      outer_then.emit(assign(r, asin_expr(r)));
> +      do_atan(body, glsl_type::float_type, r, div(y, x));
>
>        /* ...and fix it up: */
>        ir_if *inner_if = new(mem_ctx) ir_if(less(x, imm(0.0f)));
> @@ -2711,17 +2708,65 @@ builtin_builder::_atan2(const glsl_type *type)
>     return sig;
>  }
>
> +void
> +builtin_builder::do_atan(ir_factory &body, const glsl_type *type, ir_variable *res, operand y_over_x)
> +{
> +   /*
> +    * range-reduction, first step:
> +    *
> +    *      / y_over_x         if |y_over_x| <= 1.0;
> +    * x = <
> +    *      \ 1.0 / y_over_x   otherwise
> +    */
> +   ir_variable *x = body.make_temp(type, "atan_x");
> +   body.emit(assign(x, div(min2(abs(y_over_x),
> +                                imm(1.0f)),
> +                           max2(abs(y_over_x),
> +                                imm(1.0f)))));
> +
> +   /*
> +    *
approximate atan by evaluating polynomial: > +* > +* x * 0.793128310355 - x^3 * 0.3326756418091246 + > +* x^5 * 0.1938924977115610 - x^7 * 0.1173503194786851 + > +* x^9 * 0.0536813784310406 - x^11 * 0.0121323213173444 > +*/ > + ir_variable *tmp = body.make_temp(type, "atan_tmp"); > + body.emit(assign(tmp, mul(x, x))); > + body.emit(assign(tmp, > mul(add(mul(sub(mul(add(mul(sub(mul(add(mul(imm(-0.0121323213173444f), > + tmp), > + > imm(0.0536813784310406f)), > + tmp), > + > imm(0.1173503194786851f)), > + tmp), > + imm(0.1938924977115610f)), > + tmp), > + imm(0.3326756418091246f)), > + tmp), > + imm(0.793128310355f)), > + x))); > + > + /* range-reduction fixup */ > + body.emit(assign(tmp, add(tmp, > + mul(b2f(greater(abs(y_over_x), > + imm(1.0f, type->components(, > + add(mul(tmp, > + imm(-2.0f)), > + im
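The range reduction and polynomial evaluation quoted above can be sketched as a plain C function. This is only an illustration, not the patch's IR-building code, and the name atan_poly_sketch is made up. One assumption needs flagging: the leading coefficient as printed in this archive (0.793128310355) looks truncated, since the polynomial then would not come close to pi/4 at x = 1; the sketch assumes 0.9999793128310355 instead, which does. The final sign handling is also an assumption based on the identity in the commit message, because the quoted patch is cut off before that step.

```c
/* Sketch of the quoted atan approximation; coefficients other than the
 * first are copied from the commit message. */
static float atan_poly_sketch(float y_over_x)
{
   /* Range reduction: x = min(|v|, 1) / max(|v|, 1), so x is in [0, 1]. */
   const float a = y_over_x < 0.0f ? -y_over_x : y_over_x;
   const float lo = a < 1.0f ? a : 1.0f;
   const float hi = a > 1.0f ? a : 1.0f;
   const float x = lo / hi;
   const float x2 = x * x;
   float p;

   /* Horner evaluation of the odd 11th degree polynomial in x.
    * ASSUMPTION: leading coefficient 0.9999793128310355 (see lead-in). */
   p = -0.0121323213173444f;
   p = p * x2 + 0.0536813784310406f;
   p = p * x2 - 0.1173503194786851f;
   p = p * x2 + 0.1938924977115610f;
   p = p * x2 - 0.3326756418091246f;
   p = p * x2 + 0.9999793128310355f;
   p = p * x;

   /* Range-reduction fixup: atan(v) = pi/2 - atan(1/v) for v > 1. */
   if (a > 1.0f)
      p = 1.5707963267948966f - p;

   /* Restore the sign of the argument (atan is odd). */
   return y_over_x < 0.0f ? -p : p;
}
```

The min/max division mirrors the div(min2(...), max2(...)) expression in the patch, which avoids a branch when selecting between v and 1/v.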
Re: [Mesa-dev] [PATCH] glsl/glsl_parser_extras: Handle GLSL 4.50
Sorry for not replying earlier, I didn't see your answer. On Thu, Sep 4, 2014 at 12:33 AM, Matt Turner wrote: > Did you change the leading whitespace on purpose? Not really, I can un-change that. I have an emacs config that's supposedly what mesa wants, but it may be incorrect. >> - } supported_versions[12]; >> + } supported_versions[14]; > > Where does this number come from, and can we make it a little clearer > what it is? It's the maximum number of GLSL versions that a driver can support simultaneously. With the current code it should be 13; that is, if a driver supports GLSL 440, the code is going to write past the end of the array and likely segfault. Supporting 450 needs one more. Note that it's not (yet) a security issue since the largest version we support is 330. Best, OG. ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
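One way to make the number "a little clearer", as asked above, would be to derive the array size from the version tables instead of writing 14 by hand. The sketch below is only an illustration: the struct and function names are made up, and the two ES entries (100 and 300) are an assumption about where the extra two slots beyond the desktop list come from.

```c
#include <stddef.h>

#define ARRAY_SIZE(a) (sizeof(a) / sizeof((a)[0]))

static const unsigned known_desktop_glsl_versions[] =
   { 110, 120, 130, 140, 150, 330, 400, 410, 420, 430, 440, 450 };

/* ASSUMPTION: ES contexts can add these two entries on top. */
static const unsigned known_es_glsl_versions[] = { 100, 300 };

/* The bound now follows the tables automatically: 12 + 2 = 14. */
#define MAX_SUPPORTED_VERSIONS \
   (ARRAY_SIZE(known_desktop_glsl_versions) + \
    ARRAY_SIZE(known_es_glsl_versions))

/* Hypothetical stand-in for the relevant part of _mesa_glsl_parse_state. */
struct parse_state_sketch {
   struct {
      unsigned ver;
      int es;
   } supported_versions[MAX_SUPPORTED_VERSIONS];
   size_t num_supported_versions;
};

/* Append one version; refuses to write past the end of the array. */
static int add_version_sketch(struct parse_state_sketch *s,
                              unsigned ver, int es)
{
   if (s->num_supported_versions >= MAX_SUPPORTED_VERSIONS)
      return 0;
   s->supported_versions[s->num_supported_versions].ver = ver;
   s->supported_versions[s->num_supported_versions].es = es;
   s->num_supported_versions++;
   return 1;
}

/* Fill the list for a driver whose highest desktop GLSL is max_desktop. */
static void fill_versions_sketch(struct parse_state_sketch *s,
                                 unsigned max_desktop, int has_es)
{
   size_t i;

   s->num_supported_versions = 0;
   for (i = 0; i < ARRAY_SIZE(known_desktop_glsl_versions); i++) {
      if (known_desktop_glsl_versions[i] <= max_desktop)
         add_version_sketch(s, known_desktop_glsl_versions[i], 0);
   }
   if (has_es) {
      for (i = 0; i < ARRAY_SIZE(known_es_glsl_versions); i++)
         add_version_sketch(s, known_es_glsl_versions[i], 1);
   }
}
```

With this shape, adding 460 to the desktop table would grow the array without anyone having to remember to bump a literal.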
Re: [Mesa-dev] [PATCH] mesa/{version, getstring}: Future-proof version handling
Are we that far? OG. On Sat, Aug 23, 2014 at 7:22 PM, Ian Romanick wrote: > I'm content with waiting to add this until we're even close to > supporting any of those versions... especially given all the lines like > "false && // ARB_gpu_shader_fp64 &&". That's just clutter. > > On 08/21/2014 05:02 AM, Olivier Galibert wrote: >> Signed-off-by: Olivier Galibert >> --- >> src/mesa/main/getstring.c | 6 ++ >> src/mesa/main/version.c | 140 >> +- >> 2 files changed, 143 insertions(+), 3 deletions(-) >> >> diff --git a/src/mesa/main/getstring.c b/src/mesa/main/getstring.c >> index 431d60b..f9d13a7 100644 >> --- a/src/mesa/main/getstring.c >> +++ b/src/mesa/main/getstring.c >> @@ -58,6 +58,12 @@ shading_language_version(struct gl_context *ctx) >> return (const GLubyte *) "4.10"; >>case 420: >> return (const GLubyte *) "4.20"; >> + case 430: >> + return (const GLubyte *) "4.30"; >> + case 440: >> + return (const GLubyte *) "4.40"; >> + case 450: >> + return (const GLubyte *) "4.50"; >>default: >> _mesa_problem(ctx, >> "Invalid GLSL version in >> shading_language_version()"); >> diff --git a/src/mesa/main/version.c b/src/mesa/main/version.c >> index 4dea530..c7a2381 100644 >> --- a/src/mesa/main/version.c >> +++ b/src/mesa/main/version.c >> @@ -290,7 +290,122 @@ compute_version(const struct gl_extensions *extensions, >>extensions->EXT_texture_swizzle); >>/* ARB_sampler_objects is always enabled in >> mesa */ >> >> - if (ver_3_3) { >> + const GLboolean ver_4_0 = (ver_3_3 && >> + consts->GLSLVersion >= 400 && >> + extensions->ARB_draw_buffers_blend && >> + extensions->ARB_draw_indirect && >> + extensions->ARB_gpu_shader5 && >> + false && // ARB_gpu_shader_fp64 && >> + extensions->ARB_sample_shading && >> + false && // ARB_shader_subroutine >> + false && // ARB_tesselation_shader >> + extensions->ARB_texture_buffer_object_rgb32 && >> + extensions->ARB_texture_cube_map_array && >> + extensions->ARB_texture_gather && >> + extensions->ARB_texture_query_lod && >> + 
extensions->ARB_transform_feedback2 && >> + extensions->ARB_transform_feedback3); >> + >> + const GLboolean ver_4_1 = (ver_4_0 && >> + consts->GLSLVersion >= 410 && >> + extensions->ARB_ES2_compatibility && >> + false && // ARB_shader_precision >> + false && // ARB_vertex_attrib_64bit >> + extensions->ARB_viewport_array); >> + /* ARB_get_program_binary and >> + ARB_separate_shader_objects are always >> enabled in mesa */ >> + >> + const GLboolean ver_4_2 = (ver_4_1 && >> + consts->GLSLVersion >= 420 && >> + extensions->ARB_texture_compression_bptc && >> + extensions->ARB_shader_atomic_counters && >> + extensions->ARB_transform_feedback_instanced >> && >> + extensions->ARB_base_instance && >> + extensions->ARB_shader_image_load_store && >> + extensions->ARB_conservative_depth && >> + extensions->ARB_shading_language_420pack && >> + extensions->ARB_internalformat_query); >> + /* ARB_compressed_texture_pixel_storage, >> +
Re: [Mesa-dev] [RFC PATCH 00/16] A new IR for Mesa
In that case staying as close as possible to spir may make sense? OG. On Fri, Aug 22, 2014 at 5:08 AM, Dave Airlie wrote: > On 22 August 2014 12:46, Jason Ekstrand wrote: >> On Thu, Aug 21, 2014 at 7:36 PM, Dave Airlie wrote: >>> >>> On 21 August 2014 19:10, Henri Verbeet wrote: >>> > On 21 August 2014 04:56, Michel Dänzer wrote: >>> >> On 21.08.2014 04:29, Henri Verbeet wrote: >>> >>> For whatever it's worth, I have been avoiding radeonsi in part because >>> >>> of the LLVM dependency. Some of the other issues already mentioned >>> >>> aside, I also think it makes it just painful to do bisects over >>> >>> moderate/longer periods of time. >>> >> >>> >> More painful, sure, but not too bad IME. In particular, if you know the >>> >> regression is in Mesa, you can always use a stable release of LLVM for >>> >> the bisect. You only need to change the --with-llvm-prefix= parameter >>> >> to >>> >> Mesa's configure for that. Of course, it could still be mildly painful >>> >> if you need to go so far back that the current stable LLVM release >>> >> wasn't supported yet. But how often does that happen? Very rarely for >>> >> me. >>> >> >>> > Sure, it's not impossible, but is that really the kind of process you >>> > want users to go through when bisecting a regression? Perhaps throw in >>> > building 32-bit versions of both Mesa and LLVM on 64-bit as well if >>> > they want to run 32-bit applications. >>> > >>> >> Without LLVM, I'm not sure there would be a driver you could avoid. :) >>> >> >>> > R600g didn't really exist either, and that one seems to have worked >>> > out fine. I think in a large part because of work done by Jerome and >>> > Dave in the early days, but regardless. From what I've seen from SI, I >>> > don't think radeonsi needed to be a separate driver to start with, and >>> > while its ISA is certainly different from R600-Cayman, it doesn't >>> > particularly strike me as much harder to work with. 
>>> > >>> > Back to the more immediate topic though, I think think that on >>> > occasion the discussion is framed as "Is there any reason using LLVM >>> > IR wouldn't work?", while it would perhaps be more appropriate to >>> > think of as "Would using LLVM IR provide enough advantages to justify >>> > adding a LLVM dependency to core Mesa?". >>> >>> Could we use an llvm compatible IR? is also a question I'd like to see >>> answered. >> >> >> What do you mean by llvm compatible? Do you mean forking their IR inside >> mesa or just something that's easy to translate back and forth? >> > > Importing/forking the llvm IR code with a different symbol set, and > trying to not intentionally > be incompatible with their llvm. > > Dave. > ___ > mesa-dev mailing list > mesa-dev@lists.freedesktop.org > http://lists.freedesktop.org/mailman/listinfo/mesa-dev ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH] glsl/glsl_parser_extras: Handle GLSL 4.50
Signed-off-by: Olivier Galibert --- src/glsl/glsl_parser_extras.cpp | 2 +- src/glsl/glsl_parser_extras.h | 2 +- 2 files changed, 2 insertions(+), 2 deletions(-) diff --git a/src/glsl/glsl_parser_extras.cpp b/src/glsl/glsl_parser_extras.cpp index 490c3c8..87d4846 100644 --- a/src/glsl/glsl_parser_extras.cpp +++ b/src/glsl/glsl_parser_extras.cpp @@ -50,7 +50,7 @@ glsl_compute_version_string(void *mem_ctx, bool is_es, unsigned version) static const unsigned known_desktop_glsl_versions[] = - { 110, 120, 130, 140, 150, 330, 400, 410, 420, 430, 440 }; + { 110, 120, 130, 140, 150, 330, 400, 410, 420, 430, 440, 450 }; _mesa_glsl_parse_state::_mesa_glsl_parse_state(struct gl_context *_ctx, diff --git a/src/glsl/glsl_parser_extras.h b/src/glsl/glsl_parser_extras.h index c8b9478..cd252f1 100644 --- a/src/glsl/glsl_parser_extras.h +++ b/src/glsl/glsl_parser_extras.h @@ -215,7 +215,7 @@ struct _mesa_glsl_parse_state { struct { unsigned ver; bool es; - } supported_versions[12]; + } supported_versions[14]; bool es_shader; unsigned language_version; -- 2.1.0 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH] mapi/glapi/gen/gl_API.xml: Summer cleanup.
This adds all the extension names and numbers, adds some missing numbers and fixes the order in places. Future extension additions should be slightly easier by not requiring to find where it should go anymore. Signed-off-by: Olivier Galibert --- src/mapi/glapi/gen/gl_API.xml | 804 ++ 1 file changed, 578 insertions(+), 226 deletions(-) diff --git a/src/mapi/glapi/gen/gl_API.xml b/src/mapi/glapi/gen/gl_API.xml index 73f2f75..e91f37e 100644 --- a/src/mapi/glapi/gen/gl_API.xml +++ b/src/mapi/glapi/gen/gl_API.xml @@ -6275,7 +6275,7 @@ - + @@ -6300,7 +6300,7 @@ - + @@ -6335,6 +6335,9 @@ + + + @@ -6360,10 +6363,10 @@ - - - - + + + + @@ -6776,7 +6779,7 @@ - + @@ -7443,7 +7446,7 @@ parameter was in the NV functions. When this error was discovered and fixed, there was already at least one implementation of GLX protocol for ARB_vertex_program, but there were no - implementations of NV_vertex_program. The sollution was to renumber + implementations of NV_vertex_program. The solution was to renumber the opcodes for NV_vertex_program and convert the unused field in the ARB_vertex_program protocol to unused padding. 
--> @@ -7683,6 +7686,8 @@ + + @@ -8079,7 +8084,7 @@ -http://www.w3.org/2001/XInclude"/> +http://www.w3.org/2001/XInclude"/> @@ -8094,79 +8099,79 @@ -http://www.w3.org/2001/XInclude"/> +http://www.w3.org/2001/XInclude"/> -http://www.w3.org/2001/XInclude"/> +http://www.w3.org/2001/XInclude"/> -http://www.w3.org/2001/XInclude"/> +http://www.w3.org/2001/XInclude"/> -http://www.w3.org/2001/XInclude"/> +http://www.w3.org/2001/XInclude"/> -http://www.w3.org/2001/XInclude"/> +http://www.w3.org/2001/XInclude"/> -http://www.w3.org/2001/XInclude"/> +http://www.w3.org/2001/XInclude"/> -http://www.w3.org/2001/XInclude"/> +http://www.w3.org/2001/XInclude"/> -http://www.w3.org/2001/XInclude"/> +http://www.w3.org/2001/XInclude"/> -http://www.w3.org/2001/XInclude"/> +http://www.w3.org/2001/XInclude"/> -http://www.w3.org/2001/XInclude"/> +http://www.w3.org/2001/XInclude"/> -http://www.w3.org/2001/XInclude"/> +http://www.w3.org/2001/XInclude"/> -http://www.w3.org/2001/XInclude"/> +http://www.w3.org/2001/XInclude"/> - - + + -http://www.w3.org/2001/XInclude"/> +http://www.w3.org/2001/XInclude"/> -http://www.w3.org/2001/XInclude"/> +http://www.w3.org/2001/XInclude"/> -http://www.w3.org/2001/XInclude"/> +http://www.w3.org/2001/XInclude"/> -http://www.w3.org/2001/XInclude"/> +http://www.w3.org/2001/XInclude"/> -http://www.w3.org/2001/XInclude"/> +http://www.w3.org/2001/XInclude"/> -http://www.w3.org/2001/XInclude"/> +http://www.w3.org/2001/XInclude"/> -http://www.w3.org/2001/XInclude"/> +http://www.w3.org/2001/XInclude"/> -http://www.w3.org/2001/XInclude"/> +http://www.w3.org/2001/XInclude"/> http://www.w3.org/2001/XInclude"/> -http://www.w3.org/2001/XInclude"/> -http://www.w3.org/2001/XInclude"/> -http://www.w3.org/2001/XInclude"/> +http://www.w3.org/2001/XInclude"/> +http://www.w3.org/2001/XInclude"/> +http://www.w3.org/2001/XInclude"/> - - + + -http://www.w3.org/2001/XInclude"/> +http://www.w3.org/2001/XInclude"/> @@ -8176,15 +8181,15 @@ -http://www.w3.org/2001/XInclude"/> 
+http://www.w3.org/2001/XInclude"/> -http://www.w3.org/2001/XInclude"/> +http://www.w3.org/2001/XInclude"/> - + @@ -8205,13 +8210,17 @@ -http://www.w3.org/2001/XInclude"/> +http://www.w3.org/2001/XInclude"/> - +http://www.w3.org/2001/XInclude"/> -http://www.w3.org/2001/XInclude"/> +http://www.w3.org/2001/XInclude"/> -http://www.w3.org/2001/XInclude"/> + + + + + @@ -8243,21 +8252,28 @@ - +http://www.w3.org/2001/XInclude"/> +http://www.w3.org/2001/XInclude"/> +http://www.w3.org/2001/XInclude"/> -http://www.w3.org/2001/XInclude"/> + + - +http://www.w3.org/2001/XInclude"/> -http://www.w3.org/2001/XInclude"/> + + -http://www.w3.org/2001/XInclude"/> + -http://www.w3.org/2001/XInclude"/> +http://www.w3.org/2001/XInclude"/> +http://www.w3.org/2001/XInclude"/> -http://www.w3.org/2001/XInclude"/> + -http://www.
[Mesa-dev] [PATCH] mesa/{version, getstring}: Future-proof version handling
Signed-off-by: Olivier Galibert --- src/mesa/main/getstring.c | 6 ++ src/mesa/main/version.c | 140 +- 2 files changed, 143 insertions(+), 3 deletions(-) diff --git a/src/mesa/main/getstring.c b/src/mesa/main/getstring.c index 431d60b..f9d13a7 100644 --- a/src/mesa/main/getstring.c +++ b/src/mesa/main/getstring.c @@ -58,6 +58,12 @@ shading_language_version(struct gl_context *ctx) return (const GLubyte *) "4.10"; case 420: return (const GLubyte *) "4.20"; + case 430: + return (const GLubyte *) "4.30"; + case 440: + return (const GLubyte *) "4.40"; + case 450: + return (const GLubyte *) "4.50"; default: _mesa_problem(ctx, "Invalid GLSL version in shading_language_version()"); diff --git a/src/mesa/main/version.c b/src/mesa/main/version.c index 4dea530..c7a2381 100644 --- a/src/mesa/main/version.c +++ b/src/mesa/main/version.c @@ -290,7 +290,122 @@ compute_version(const struct gl_extensions *extensions, extensions->EXT_texture_swizzle); /* ARB_sampler_objects is always enabled in mesa */ - if (ver_3_3) { + const GLboolean ver_4_0 = (ver_3_3 && + consts->GLSLVersion >= 400 && + extensions->ARB_draw_buffers_blend && + extensions->ARB_draw_indirect && + extensions->ARB_gpu_shader5 && + false && // ARB_gpu_shader_fp64 && + extensions->ARB_sample_shading && + false && // ARB_shader_subroutine + false && // ARB_tesselation_shader + extensions->ARB_texture_buffer_object_rgb32 && + extensions->ARB_texture_cube_map_array && + extensions->ARB_texture_gather && + extensions->ARB_texture_query_lod && + extensions->ARB_transform_feedback2 && + extensions->ARB_transform_feedback3); + + const GLboolean ver_4_1 = (ver_4_0 && + consts->GLSLVersion >= 410 && + extensions->ARB_ES2_compatibility && + false && // ARB_shader_precision + false && // ARB_vertex_attrib_64bit + extensions->ARB_viewport_array); + /* ARB_get_program_binary and + ARB_separate_shader_objects are always enabled in mesa */ + + const GLboolean ver_4_2 = (ver_4_1 && + consts->GLSLVersion >= 420 && + 
extensions->ARB_texture_compression_bptc && + extensions->ARB_shader_atomic_counters && + extensions->ARB_transform_feedback_instanced && + extensions->ARB_base_instance && + extensions->ARB_shader_image_load_store && + extensions->ARB_conservative_depth && + extensions->ARB_shading_language_420pack && + extensions->ARB_internalformat_query); + /* ARB_compressed_texture_pixel_storage, + ARB_texture_storage and + ARB_map_buffer_alignment are always enabled in mesa */ + + const GLboolean ver_4_3 = (ver_4_2 && + consts->GLSLVersion >= 430 && + false && // ARB_arrays_of_arrays + extensions->ARB_ES3_compatibility && + extensions->ARB_compute_shader && + extensions->ARB_copy_image && + extensions->ARB_explicit_uniform_location && + extensions->ARB_fragment_layer_viewport && + false && // ARB_framebuffer_no_attachments + false && // ARB_internalformat_query2 + extensions->ARB_draw_indirect && + false && // ARB_program_interface_query + false &&
Re: [Mesa-dev] [RFC PATCH 00/16] A new IR for Mesa
And don't forget that explicit vec4 becomes immensely amusing once you add fp64/double to the problem. OG. On Wed, Aug 20, 2014 at 4:01 PM, Francisco Jerez wrote: > Connor Abbott writes: > >> On Tue, Aug 19, 2014 at 11:33 PM, Francisco Jerez >> wrote: >>> Connor Abbott writes: >>> On Tue, Aug 19, 2014 at 11:40 AM, Francisco Jerez wrote: > Tom Stellard writes: > >> On Tue, Aug 19, 2014 at 11:04:59AM -0400, Connor Abbott wrote: >>> On Mon, Aug 18, 2014 at 8:52 PM, Michel Dänzer >>> wrote: >>> > On 19.08.2014 01:28, Connor Abbott wrote: >>> >> On Mon, Aug 18, 2014 at 4:32 AM, Michel Dänzer >>> >> wrote: >>> >>> On 16.08.2014 09:12, Connor Abbott wrote: >>> I know what you might be thinking right now. "Wait, *another* IR? >>> Don't >>> we already have like 5 of those, not counting all the >>> driver-specific >>> ones? Isn't this stuff complicated enough already?" Well, there >>> are some >>> pretty good reasons to start afresh (again...). In the years we've >>> been >>> using GLSL IR, we've come to realize that, in fact, it's not what >>> we >>> want *at all* to do optimizations on. >>> >>> >>> >>> Did you evaluate using LLVM IR instead of inventing yet another one? >>> >>> >>> >>> >>> >>> -- >>> >>> Earthling Michel Dänzer| >>> >>> http://www.amd.com >>> >>> Libre software enthusiast |Mesa and X >>> >>> developer >>> >> >>> >> Yes. See >>> >> >>> >> http://lists.freedesktop.org/archives/mesa-dev/2014-February/053502.html >>> >> >>> >> and >>> >> >>> >> http://lists.freedesktop.org/archives/mesa-dev/2014-February/053522.html >>> > >>> > I know Ian can't deal with LLVM for some reason. I was wondering if >>> > *you* evaluated it, and if so, why you rejected it. 
>>> > >>> > >>> > -- >>> > Earthling Michel Dänzer| >>> > http://www.amd.com >>> > Libre software enthusiast |Mesa and X >>> > developer >>> >>> >>> Well, first of all, the fact that Ian and Ken don't want to use it >>> means that any plan to use LLVM for the Intel driver is dead in the >>> water anyways - you can translate NIR into LLVM if you want, but for >>> i965 we want to share optimizations between our 2 backends (FS and >>> vec4) that we can't do today in GLSL IR so this is what we want to use >>> for that, and since nobody else does anything with the core GLSL >>> compiler except when they have to, when we start moving things out of >>> GLSL IR this will probably replace GLSL IR as the infrastructure that >>> all Mesa drivers use. But with that in mind, here are a few reasons >>> why we wouldn't want to use LLVM: >>> >>> * LLVM wasn't built to understand structured CFG's, meaning that you >>> need to re-structurize it using a pass that's fragile and prone to >>> break if some other pass "optimizes" the shader in a way that makes it >>> non-structured (i.e. not expressible in terms of loops and if >>> statements). This loss of information also means that passes that need >>> to know things like, for example, the loop nesting depth need to do an >>> analysis pass whereas with NIR you can just walk up the control flow >>> tree and count the number of loops we hit. >>> >> >> LLVM has a pass to structurize the CFG. We use it in the radeon >> drivers, and it is run after all of the other LLVM optimizations which >> have >> no concept of structured CFG. It's not bug free, but it works really >> well even with all of the complex OpenCL kernels we throw at it. >> >> Your point about losing information when the CFG is de-structurized is >> valid, but for things like loop depth, I'm not sure why we couldn't >> write an >> LLVM analysis pass for this (if one doesn't already exist). >> > > I don't think this is such a big deal either. 
At least the > structurization pass used on newer AMD hardware isn't "fragile" in the > way you seem to imply -- AFAIK (unlike the old AMDIL heuristic > algorithm) it's guaranteed to give you a valid structurized output no > matter what the previous optimization passes have done to the CFG, > modulo bugs. I admit that the situation is nevertheless suboptimal. > Ideally this information wouldn't get lost along the way. For the long > term we may want to represent structured control flow directly in the IR > as you say, I just don't see how reinventing the IR saves us any work if > we could just fix the existing one. It seems to me that something like how we represent control flow is a pretty fundamental part of the IR - it affects any optimization pas
Re: [Mesa-dev] [PATCH 8/8] mesa: simplify _mesa_update_draw_buffers()
Hi, That patch makes glDrawBuffers(0, NULL); segfault because _mesa_drawbuffers expects buffers[0] to be valid. Note that the bug is there, but I'm not sure what the final setup should look like in that case. Best, OG. PS: reported by haagch on irc On Fri, Aug 8, 2014 at 11:20 PM, Brian Paul wrote: > There's no need to copy the array of DrawBuffer enums to a temp array. > --- > src/mesa/main/buffers.c |9 ++--- > 1 file changed, 2 insertions(+), 7 deletions(-) > > diff --git a/src/mesa/main/buffers.c b/src/mesa/main/buffers.c > index 6b4fac9..140cf6e 100644 > --- a/src/mesa/main/buffers.c > +++ b/src/mesa/main/buffers.c > @@ -567,16 +567,11 @@ _mesa_drawbuffers(struct gl_context *ctx, GLuint n, > const GLenum *buffers, > void > _mesa_update_draw_buffers(struct gl_context *ctx) > { > - GLenum buffers[MAX_DRAW_BUFFERS]; > - GLuint i; > - > /* should be a window system FBO */ > assert(_mesa_is_winsys_fbo(ctx->DrawBuffer)); > > - for (i = 0; i < ctx->Const.MaxDrawBuffers; i++) > - buffers[i] = ctx->Color.DrawBuffer[i]; > - > - _mesa_drawbuffers(ctx, ctx->Const.MaxDrawBuffers, buffers, NULL); > + _mesa_drawbuffers(ctx, ctx->Const.MaxDrawBuffers, > + ctx->Color.DrawBuffer, NULL); > } > > > -- > 1.7.10.4 > > ___ > mesa-dev mailing list > mesa-dev@lists.freedesktop.org > http://lists.freedesktop.org/mailman/listinfo/mesa-dev ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
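As a rough illustration of the failure mode reported above, the sketch below shows a setter that never dereferences the buffers pointer when n is 0. It is not the real _mesa_drawbuffers; the struct, the constants and the function name are all hypothetical, and what the "final setup" should be for n == 0 is an assumption (here: every slot set to none).

```c
#include <stddef.h>

#define SKETCH_MAX_DRAW_BUFFERS 8
#define SKETCH_BUFFER_NONE 0u

/* Hypothetical stand-in for the framebuffer draw-buffer state. */
struct fb_sketch {
   unsigned color_draw_buffer[SKETCH_MAX_DRAW_BUFFERS];
   unsigned num_color_draw_buffers;
};

/* Copy n entries from buffers; buffers may be NULL when n == 0. */
static void set_draw_buffers_sketch(struct fb_sketch *fb, unsigned n,
                                    const unsigned *buffers)
{
   unsigned i;

   if (n > SKETCH_MAX_DRAW_BUFFERS)
      n = SKETCH_MAX_DRAW_BUFFERS;

   /* buffers[i] is only read for i < n, so n == 0 never touches it. */
   for (i = 0; i < n; i++)
      fb->color_draw_buffer[i] = buffers[i];

   /* ASSUMPTION: unused slots, including slot 0 when n == 0, become none. */
   for (; i < SKETCH_MAX_DRAW_BUFFERS; i++)
      fb->color_draw_buffer[i] = SKETCH_BUFFER_NONE;

   fb->num_color_draw_buffers = n;
}
```

The point is only that any special handling of element 0 has to be guarded by n > 0.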
Re: [Mesa-dev] [Intel-gfx] [PATCH 1/5] intel gen4/5: fix GL_VERTEX_PROGRAM_TWO_SIDE.
On Mon, Jul 30, 2012 at 10:30:57AM -0700, Eric Anholt wrote: > I'm perfectly fine with the VUE containing slots for both when the app > has gone out of its way to ask for deprecated two-sided color > rendering. Are you also ok with recompiling the shaders when that enable is switched? OG. ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH 1/5] intel gen4/5: fix GL_VERTEX_PROGRAM_TWO_SIDE.
On Tue, Jul 17, 2012 at 07:37:43AM -0700, Paul Berry wrote: > If possible, I would still like to think of a way to address this situation > that (a) doesn't require modifying both fragment shader back-ends and the > SF program, and (b) helps all Mesa drivers, not just Intel Gen4-5. > Especially because I suspect we may have bugs in Gen6-7 related to this > situation. You don't :-) It's correctly handled in gen6_sf_state.c::get_attr_override with similar semantics too. > Would you be happy with one of the following two alternatives? > > 1. In the GLSL front-end, if we detect that a vertex shader writes to > gl_BackColor but not gl_FrontColor, then automatically insert > "gl_FrontColor = 0;" into the shader. This will guarantee that whenever > gl_BackColor is written, gl_FrontColor is too. > > 2. In the function brw_compute_vue_map(), assign a VUE slot for > VERT_RESULT_COL0 whenever *either* VERT_RESULT_COL0 or VERT_RESULT_BFC0 is > used. This will guarantee that we always have a VUE slot available for > front color, so we don't have to be as tricky in the FS and SF code. With both methods the SF code is not really simplified. Doing the mov without testing would require writing to/reserving a slot for gl_BackColor whenever gl_FrontColor is written to, which wouldn't be acceptable. It would also require writing to/reserving a slot for both of them whenever gl_Color is read, which is probably unacceptable too. So the need_* stuff is going to stay in any case :/ So the only simplification would be in the fs/wm, and I'm somewhat afraid of having a VUE slot that's not in outputs_written of the previous stage. The two seem to be expected to be equivalent. > This morning I'll try to ask some other Intel folks for their opinion on > the subject. Did they have an opinion? Best, OG. ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH 7/7] ir_to_mesa: Don't set component for ir_dereference in ir_quadop_vector
On Fri, Jul 27, 2012 at 10:49:25AM -0700, Kenneth Graunke wrote: > From: Ian Romanick > > There can only be one variable used in an ir_quadop_vector. Accesses > of this variable must be swizzled. There's nothing anywhere ensuring the presence of the swizzle. I completely agree that trashing the components value is a bad idea, but you should have SWIZZLE_X as a default value for the components array. Amusingly enough, it's already the case (SWIZZLE_X==0), so making it explicit would be perfect. OG. ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH 6/7] glsl: Fix ir_last_opcode value.
On Fri, Jul 27, 2012 at 10:49:24AM -0700, Kenneth Graunke wrote: > From: Ian Romanick > > Now that ir_quadop_vector exists, ir_last_binop and ir_last_opcode are > no longer the same. Only one place currently uses this enumeration, and > already handles ir_quadop_vector correctly. > > Signed-off-by: Ian Romanick > Signed-off-by: Kenneth Graunke > --- > src/glsl/ir.h | 2 +- > 1 file changed, 1 insertion(+), 1 deletion(-) > > diff --git a/src/glsl/ir.h b/src/glsl/ir.h > index e2743f6..a69494f 100644 > --- a/src/glsl/ir.h > +++ b/src/glsl/ir.h > @@ -1027,7 +1027,7 @@ enum ir_expression_operation { > /** > * A sentinel marking the last of all operations. > */ > - ir_last_opcode = ir_last_binop > + ir_last_opcode = ir_quadop_vector > }; Another obvious-in-hindsight bugfix. Reviewed-by: Olivier Galibert ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH 5/7] glsl: Add "typeless" constructor for quadop ir_expressions
On Fri, Jul 27, 2012 at 10:49:23AM -0700, Kenneth Graunke wrote: > From: Ian Romanick > > This matches the typeless constructors for unop and binop > ir_expressions. > > Signed-off-by: Ian Romanick > Reviewed-by: Kenneth Graunke > --- > src/glsl/ir.cpp | 17 + > src/glsl/ir.h | 2 ++ > 2 files changed, 19 insertions(+) > > diff --git a/src/glsl/ir.cpp b/src/glsl/ir.cpp > index b0e38d8..5faf34a 100644 > --- a/src/glsl/ir.cpp > +++ b/src/glsl/ir.cpp > @@ -236,6 +236,23 @@ ir_expression::ir_expression(int op, const struct > glsl_type *type, > this->operands[3] = op3; > } > > +ir_expression::ir_expression(int op, ir_rvalue *op0, ir_rvalue *op1, > + ir_rvalue *op2, ir_rvalue *op3) > +{ > + assert(op0->type->is_scalar()); > + assert((op0->type == op1->type) > + && (op0->type == op2->type) > + && (op0->type == op3->type)); > + > + this->ir_type = ir_type_expression; > + this->type = glsl_type::get_instance(op0->type->base_type, 4, 1); > + this->operation = ir_expression_operation(op); You're hardcoding ir_quadop_vector's properties here. A comment saying so could be useful, in case other quadops with different properties appear someday. In fact, you're hardcoding them so hard that passing "op" may not make sense. A static method ir_expression *ir_expression::build_quadop_vector perhaps? OG. ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH 4/7] glsl: Request an Nx1 type instance in ir_quadop_vector lowering pass.
On Fri, Jul 27, 2012 at 10:49:22AM -0700, Kenneth Graunke wrote: > From: Ian Romanick > > No types have 0 columns. The glsl_type::get_instance method contains > >if ((rows < 1) || (rows > 4) || (columns < 1) || (columns > 4)) > return error_type; > > To get a vector, use columns = 1. Reviewed-by: Olivier Galibert That's an obvious bugfix. If there's a stable branch with the glsl compiler in, it probably should go there. OG. ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH 2/7] glsl: Add glsl_type::get_sampler_instance method.
On Fri, Jul 27, 2012 at 10:49:20AM -0700, Kenneth Graunke wrote: > +/** > + * Convert sampler type attributes into an index in the sampler_types array > + */ > +#define SAMPLER_TYPE_INDEX(dim, sample_type, array, shadow) \ > + ((unsigned(dim) * 12) + (sample_type * 4) + (unsigned(array) * 2) \ > + + unsigned(shadow)) > + > +/** > + * \note > + * Arrays like this are \b the argument for C99-style designated > initializers. > + * Too bad C++ and VisualStudio are too cool for that sort of useful > + * functionality. > + */ > +const glsl_type *const glsl_type::sampler_types[] = { Did you think about using a 4-dimensional array and letting the compiler take care of the multiplies? It may not be that much more readable though. > + /* GLSL_SAMPLER_DIM_1D */ > + &builtin_130_types[10], /* uint */ > + NULL,/* uint, shadow */ What does NULL mean? OG. ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
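To illustrate the 4-dimensional array suggestion, the sketch below checks that indexing a [dim][sample_type][array][shadow] table lands on the same element offset as the quoted SAMPLER_TYPE_INDEX macro. It is plain C, so the macro uses C casts instead of the C++ functional casts in the patch. The table name, the element type and the number of dimensions (6) are assumptions for illustration; only the 3 x 2 x 2 inner shape is implied by the macro's multipliers.

```c
#include <stddef.h>

/* Flat index as in the quoted macro, rewritten with C casts. */
#define SAMPLER_TYPE_INDEX(dim, sample_type, array, shadow) \
   (((unsigned)(dim) * 12) + ((unsigned)(sample_type) * 4) + \
    ((unsigned)(array) * 2) + (unsigned)(shadow))

enum {
   SKETCH_NUM_DIMS = 6,         /* ASSUMPTION: number of sampler dims */
   SKETCH_NUM_SAMPLE_TYPES = 3, /* implied by 12 / 4 */
   SKETCH_NUM_ARRAY = 2,        /* implied by 4 / 2 */
   SKETCH_NUM_SHADOW = 2        /* implied by the final multiplier of 2 */
};

/* Hypothetical 4D table; the compiler computes the same multiplies. */
static const void *sampler_table_4d[SKETCH_NUM_DIMS][SKETCH_NUM_SAMPLE_TYPES]
                                   [SKETCH_NUM_ARRAY][SKETCH_NUM_SHADOW];

/* Element offset of table[d][t][a][s] from the start of the table. */
static size_t flat_offset_4d(unsigned d, unsigned t, unsigned a, unsigned s)
{
   const char *base = (const char *)sampler_table_4d;
   const char *elem = (const char *)&sampler_table_4d[d][t][a][s];

   return (size_t)(elem - base) / sizeof(sampler_table_4d[0][0][0][0]);
}

/* Returns 1 if every 4D index maps to the macro's flat index. */
static int layouts_agree(void)
{
   unsigned d, t, a, s;

   for (d = 0; d < SKETCH_NUM_DIMS; d++)
      for (t = 0; t < SKETCH_NUM_SAMPLE_TYPES; t++)
         for (a = 0; a < SKETCH_NUM_ARRAY; a++)
            for (s = 0; s < SKETCH_NUM_SHADOW; s++)
               if (flat_offset_4d(d, t, a, s) != SAMPLER_TYPE_INDEX(d, t, a, s))
                  return 0;
   return 1;
}
```

Whether the initializer for such a table would be more readable than the flat one is a separate question, as noted above.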
Re: [Mesa-dev] [PATCH 1/7] glsl: Make bvec and ivec types accessible without using get_instance.
On Fri, Jul 27, 2012 at 10:49:19AM -0700, Kenneth Graunke wrote: > It's more convenient to use shortcuts like glsl_type::bvec2_type than > the longwinded glsl_type::get_instance(GLSL_TYPE_BOOL, 2, 1). Yay, code in zones I understand :-) Reviewed-by: Olivier Galibert ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] Support for EXT/ARB_geometry_shader4
On Fri, Jul 27, 2012 at 10:40:28AM -0500, Bryan Cain wrote: > https://github.com/Plombo/mesa/tree/geometry-shaders . Quick remarks from a fast read: - you missed draw_pipe_clip.c:clip_init_state, where you need to plug in the gs info where appropriate. Should be easy. It will take care of the interpolation-on-clipping issues you currently have even if you don't know you have them :-) - starting with 4.0 EmitVertex and EndPrimitive are in fact EmitStreamVertex(0) and EndStreamPrimitive(0). It may be a good idea to implement the stream version at the ir_* level, even if the first implementation just ignores the parameter. - all the is_*_shader boolean variables should probably be an integer "shader type" variable, since there will be two more types to add for 4.0. - I'm not sure we want to use _ARB versions of constants when the suffix-less versions exist and have the same value. - cross_validate_outputs_to_inputs could use some kind of const char *_mesa_get_shader_type_string(gl_shader *sh) from somewhere like shaderapi.h. We'll need two more shader types soon. I'll see how hard the Intel gen4 support looks to be, shouldn't be that bad. Need to finish clipper first though. Best, OG. ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
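The "shader type" variable and the string helper suggested above could look roughly like the sketch below. None of these names exist in Mesa as far as this thread shows: the enum, its values and shader_type_string_sketch are hypothetical, and the helper takes the type directly instead of a gl_shader pointer to keep the example self-contained. The two tessellation entries stand for the "two more types" expected with 4.0.

```c
/* Hypothetical replacement for a set of is_*_shader booleans. */
enum shader_type_sketch {
   SKETCH_SHADER_VERTEX,
   SKETCH_SHADER_TESS_CTRL, /* new with 4.0 */
   SKETCH_SHADER_TESS_EVAL, /* new with 4.0 */
   SKETCH_SHADER_GEOMETRY,
   SKETCH_SHADER_FRAGMENT
};

/* Human-readable name for diagnostics such as link error messages. */
static const char *shader_type_string_sketch(enum shader_type_sketch type)
{
   switch (type) {
   case SKETCH_SHADER_VERTEX:
      return "vertex";
   case SKETCH_SHADER_TESS_CTRL:
      return "tessellation control";
   case SKETCH_SHADER_TESS_EVAL:
      return "tessellation evaluation";
   case SKETCH_SHADER_GEOMETRY:
      return "geometry";
   case SKETCH_SHADER_FRAGMENT:
      return "fragment";
   }
   return "unknown";
}
```

With a single enum, adding a stage means one new enumerator and one new case, instead of another boolean threaded through every caller.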
Re: [Mesa-dev] [PATCH 1/4] mesa: Add a Version field to the context with VersionMajor*10+VersionMinor.
On Thu, Jul 26, 2012 at 05:27:43PM -0700, Eric Anholt wrote:
> As we get into supporting GL 3.x core, we come across more and more features
> of the API that depend on the version number as opposed to just the extension
> list.  This will let us more sanely do version checks than "(VersionMajor == 3
> && VersionMinor >= 2) || VersionMajor >= 4".

Pure bikeshedding, but why not use *100 in order to be identical to glsl?

  OG.
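For readers comparing the two encodings, here is a small sketch. The function names are made up for illustration, and the glsl-style variant is only one reading of the "*100" suggestion (major*100 + minor*10, so that 1.3 becomes 130 as in "#version 130"):

```c
/* Packed version as proposed in the patch subject: major*10 + minor. */
unsigned sketch_gl_version(unsigned major, unsigned minor)
{
   return major * 10 + minor;
}

/* One possible glsl-like packing (assumption): 3.2 -> 320, 1.3 -> 130. */
unsigned sketch_glsl_style_version(unsigned major, unsigned minor)
{
   return major * 100 + minor * 10;
}

/* The long-hand check quoted in the commit message. */
int sketch_at_least_3_2_longhand(unsigned major, unsigned minor)
{
   return (major == 3 && minor >= 2) || major >= 4;
}

/* The same check against the packed field. */
int sketch_at_least_3_2_packed(unsigned major, unsigned minor)
{
   return sketch_gl_version(major, minor) >= 32;
}
```

Both checks agree for every single-digit minor version; the packed form just keeps the comparison to one integer test.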
Re: [Mesa-dev] [PATCH 1/9] intel gen4-5: fix the vue view in the fs.
On Thu, Jul 26, 2012 at 10:18:01AM -0700, Eric Anholt wrote:
> Olivier Galibert writes:
>
> > In some cases the fragment shader view of the vue registers was out of
> > sync with the builder.  This fixes it.
>
> s/builder/SF outputs/ ?
>
> I'd love to see the pre-gen6 code get rearranged so the FS walked the
> bitfield of FS inputs from SF and chose the urb offset for each.  But
> this does look like the minimal fix.

In other words, an explicit linking pass?  That could be useful with
geometry shaders, too.

  OG.
Re: [Mesa-dev] [PATCH 5/9] intel gen4-5: Compute the interpolation status for every variable in one place.
On Thu, Jul 26, 2012 at 10:22:26AM -0700, Eric Anholt wrote:
> I don't like seeing this data that should be referenced out of the
> program cache key being communicated through brw->.

What would you like it to be communicated through?

  OG.
Re: [Mesa-dev] [PATCH] sp_tex_sample: Fix stupid copy/paste error.
[Sorry, mail was down yesterday]

On Tue, Jul 24, 2012 at 10:06:05AM -0600, Brian Paul wrote:
> Does this fix bug 52369?

Yes.

> Do you need me to commit this for you?

Yes please.  Perhaps I should see about getting an fdo account.

Best,

  OG.
[Mesa-dev] [PATCH] sp_tex_sample: Fix stupid copy/paste error.
diff --git a/src/gallium/drivers/softpipe/sp_tex_sample.c b/src/gallium/drivers/softpipe/sp_tex_sample.c
index f215b90..0aeb8e2 100644
--- a/src/gallium/drivers/softpipe/sp_tex_sample.c
+++ b/src/gallium/drivers/softpipe/sp_tex_sample.c
@@ -1950,8 +1950,8 @@ mip_filter_linear_2d_linear_repeat_POT(
          float rgbax[TGSI_NUM_CHANNELS][TGSI_QUAD_SIZE];
          int c;
 
-         img_filter_2d_linear_repeat_POT(tgsi_sampler, s[j], t[j], p[j], level0, samp->faces[j], tgsi_sampler_lod_bias, &rgbax[0][j]);
-         img_filter_2d_linear_repeat_POT(tgsi_sampler, s[j], t[j], p[j], level0+1, samp->faces[j], tgsi_sampler_lod_bias, &rgbax[0][j]);
+         img_filter_2d_linear_repeat_POT(tgsi_sampler, s[j], t[j], p[j], level0, samp->faces[j], tgsi_sampler_lod_bias, &rgbax[0][0]);
+         img_filter_2d_linear_repeat_POT(tgsi_sampler, s[j], t[j], p[j], level0+1, samp->faces[j], tgsi_sampler_lod_bias, &rgbax[0][1]);
 
          for (c = 0; c < TGSI_NUM_CHANNELS; c++)
             rgba[c][j] = lerp(levelBlend, rgbax[c][0], rgbax[c][1]);
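To see why the destination columns matter, here is a reduced sketch of the same pattern. Everything in it is a stand-in (sketch_sample_level replaces the real image filter, and the sizes are arbitrary): the two mip levels have to land in columns 0 and 1 of the scratch array, because the blend reads exactly those two columns.

```c
#define SKETCH_NUM_CHANNELS 4
#define SKETCH_QUAD_SIZE 4

/* Linear blend between v0 and v1. */
float sketch_lerp(float a, float v0, float v1)
{
   return v0 + a * (v1 - v0);
}

/* Hypothetical stand-in for filtering one mip level: writes the same
 * value for every channel into column `col` of the scratch array. */
void sketch_sample_level(float value,
                         float rgbax[SKETCH_NUM_CHANNELS][SKETCH_QUAD_SIZE],
                         int col)
{
   for (int c = 0; c < SKETCH_NUM_CHANNELS; c++)
      rgbax[c][col] = value;
}

/* Blend two mip levels for one pixel: level0 goes to column 0 and
 * level0+1 to column 1, matching what the final lerp reads. */
float sketch_blend_two_levels(float level0_value, float level1_value,
                              float level_blend)
{
   float rgbax[SKETCH_NUM_CHANNELS][SKETCH_QUAD_SIZE] = {{0}};

   sample_levels:
   sketch_sample_level(level0_value, rgbax, 0);
   sketch_sample_level(level1_value, rgbax, 1);

   return sketch_lerp(level_blend, rgbax[0][0], rgbax[0][1]);
}
```

In the pre-patch code both calls wrote to column j, so the second level overwrote the first and the lerp read columns that were not filled consistently.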
Re: [Mesa-dev] [Intel-gfx] [PATCH 4/9] intel gen4-5: Fix backface/frontface selection when only one color is written to.
On Fri, Jul 20, 2012 at 10:01:03AM -0700, Eric Anholt wrote:
> > diff --git a/src/mesa/drivers/dri/i965/brw_fs.cpp b/src/mesa/drivers/dri/i965/brw_fs.cpp
> > index 3f98137..3b62952 100644
> > --- a/src/mesa/drivers/dri/i965/brw_fs.cpp
> > +++ b/src/mesa/drivers/dri/i965/brw_fs.cpp
> > @@ -972,6 +972,15 @@ fs_visitor::calculate_urb_setup()
> >          if (c->key.vp_outputs_written & BITFIELD64_BIT(i)) {
> >             int fp_index = _mesa_vert_result_to_frag_attrib((gl_vert_result) i);
> >
> > +            /* Special case: two-sided vertex option, vertex program
> > +             * only writes to the back color.  Map it to the
> > +             * associated front color location.
> > +             */
> > +            if (i >= VERT_RESULT_BFC0 && i <= VERT_RESULT_BFC1 &&
> > +                ctx->VertexProgram._TwoSideEnabled &&
> > +                urb_setup[i - VERT_RESULT_BFC0 + FRAG_ATTRIB_COL0] == -1)
> > +               fp_index = i - VERT_RESULT_BFC0 + FRAG_ATTRIB_COL0;
>
> In the fs_visitor (and brw_wm_pass*), you don't get to look at ctx->
> state like that -- you're getting called once with some set of ctx
> state, but the program will get reused even if the ctx state changes.
> You'd have to get that state into the wm prog key, and use that, which
> would guarantee that you have the appropriate program code.

Ok.  OTOH, we don't actually *need* to look at TwoSideEnabled.  If the
rest of the condition triggers, it's either correct or undefined
behaviour.  So we can do it unconditionally.

  OG.
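The review point about program keys can be illustrated with a toy cache. This is a sketch under stated assumptions, not the i965 program cache: the struct, its two fields and the function names are hypothetical. What it shows is that a compiled program is looked up by key contents only, so any context state the code generator reads must be part of the key.

```c
#include <string.h>

/* Hypothetical, much-reduced program key.  Every piece of state that the
 * generated code depends on has to be stored here. */
struct sketch_prog_key {
   unsigned long long vp_outputs_written;
   unsigned twoside_enabled;
};

#define SKETCH_CACHE_SIZE 8

struct sketch_cache {
   struct sketch_prog_key keys[SKETCH_CACHE_SIZE];
   int program_ids[SKETCH_CACHE_SIZE];
   int count;
   int compiles;   /* number of "compilations" performed so far */
};

/* Zero the whole key first so that padding bytes compare equal. */
void sketch_key_init(struct sketch_prog_key *key,
                     unsigned long long outputs, unsigned twoside)
{
   memset(key, 0, sizeof(*key));
   key->vp_outputs_written = outputs;
   key->twoside_enabled = twoside;
}

/* Return the program for this key, "compiling" one on a miss.
 * Returns -1 when the toy cache is full (no eviction in this sketch). */
int sketch_cache_get(struct sketch_cache *cache,
                     const struct sketch_prog_key *key)
{
   for (int i = 0; i < cache->count; i++) {
      if (memcmp(&cache->keys[i], key, sizeof(*key)) == 0)
         return cache->program_ids[i];
   }

   if (cache->count == SKETCH_CACHE_SIZE)
      return -1;

   memcpy(&cache->keys[cache->count], key, sizeof(*key));
   cache->program_ids[cache->count] = cache->compiles++;
   return cache->program_ids[cache->count++];
}
```

If twoside_enabled were read from the context at compile time but left out of the key, the second lookup would return the program built for the old state, which is the reuse problem described in the review.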
[Mesa-dev] [PATCH 9/9] intel gen4-5: Don't touch flatshaded values when clipping, only copy them.
This patch ensures that integers will pass through unscathed.  Doing
(useless) computations on them is risky, especially when their bit
patterns correspond to values like inf or nan.

Signed-off-by: Olivier Galibert
---
 src/mesa/drivers/dri/i965/brw_clip_util.c |   48 ++---
 1 file changed, 30 insertions(+), 18 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_clip_util.c b/src/mesa/drivers/dri/i965/brw_clip_util.c
index b06ad1d..998c304 100644
--- a/src/mesa/drivers/dri/i965/brw_clip_util.c
+++ b/src/mesa/drivers/dri/i965/brw_clip_util.c
@@ -293,30 +293,42 @@ void brw_clip_interp_vertex( struct brw_clip_compile *c,
           * header), so interpolate:
           *
           *        New = attr0 + t*attr1 - t*attr0
+          *
+          * unless it's flat shaded, then just copy the value from a
+          * source vertex.
           */
-         struct brw_reg tmp = get_tmp(c);
+         GLuint interp = brw->interpolation_mode[slot];
 
-         struct brw_reg t =
-            brw->interpolation_mode[slot] == INTERP_QUALIFIER_NOPERSPECTIVE ?
-            t_nopersp : t0;
+         if(interp == INTERP_QUALIFIER_SMOOTH ||
+            interp == INTERP_QUALIFIER_NOPERSPECTIVE) {
+            struct brw_reg tmp = get_tmp(c);
+            struct brw_reg t =
+               interp == INTERP_QUALIFIER_NOPERSPECTIVE ?
+               t_nopersp : t0;
 
-         brw_MUL(p,
-                 vec4(brw_null_reg()),
-                 deref_4f(v1_ptr, delta),
-                 t);
+            brw_MUL(p,
+                    vec4(brw_null_reg()),
+                    deref_4f(v1_ptr, delta),
+                    t);
 
-         brw_MAC(p,
-                 tmp,
-                 negate(deref_4f(v0_ptr, delta)),
-                 t);
+            brw_MAC(p,
+                    tmp,
+                    negate(deref_4f(v0_ptr, delta)),
+                    t);
 
-         brw_ADD(p,
-                 deref_4f(dest_ptr, delta),
-                 deref_4f(v0_ptr, delta),
-                 tmp);
-
-         release_tmp(c, tmp);
+            brw_ADD(p,
+                    deref_4f(dest_ptr, delta),
+                    deref_4f(v0_ptr, delta),
+                    tmp);
+
+            release_tmp(c, tmp);
+
+         } else if(interp == INTERP_QUALIFIER_FLAT) {
+            brw_MOV(p,
+                    deref_4f(dest_ptr, delta),
+                    deref_4f(v0_ptr, delta));
+         }
       }
    }
-- 
1.7.10.280.gaa39
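The risk described in the commit message can be reproduced on the CPU. The sketch below is an illustration only (the helper names are made up): it pushes an integer whose bit pattern happens to be +inf through the interpolation formula from the patch comment, New = attr0 + t*attr1 - t*attr0, and compares that with a plain copy.

```c
#include <stdint.h>
#include <string.h>

/* Reinterpret a 32-bit pattern as a float and back, without conversion. */
float sketch_bits_to_float(uint32_t bits)
{
   float f;
   memcpy(&f, &bits, sizeof(f));
   return f;
}

uint32_t sketch_float_to_bits(float f)
{
   uint32_t bits;
   memcpy(&bits, &f, sizeof(bits));
   return bits;
}

/* Interpolation as written in the clipper comment, applied to raw bits
 * as if they were floats. */
uint32_t sketch_interpolate_bits(uint32_t attr0, uint32_t attr1, float t)
{
   float a0 = sketch_bits_to_float(attr0);
   float a1 = sketch_bits_to_float(attr1);
   return sketch_float_to_bits(a0 + t * a1 - t * a0);
}

/* Flat-shaded path: the value is only copied, never computed on. */
uint32_t sketch_copy_bits(uint32_t attr0)
{
   return attr0;
}
```

For the integer 0x7f800000 (the bit pattern of +inf), interpolating between two identical values evaluates inf - inf and yields a NaN, so the integer is destroyed even though both source vertices agree; the copy path returns it unchanged.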
[Mesa-dev] [PATCH 8/9] intel gen4-5: Make noperspective clipping work.
At this point all interpolation tests with fixed clipping work. Signed-off-by: Olivier Galibert Reviewed-by: Paul Berry --- src/mesa/drivers/dri/i965/brw_clip.c |9 ++ src/mesa/drivers/dri/i965/brw_clip.h |1 + src/mesa/drivers/dri/i965/brw_clip_util.c | 147 ++--- 3 files changed, 146 insertions(+), 11 deletions(-) diff --git a/src/mesa/drivers/dri/i965/brw_clip.c b/src/mesa/drivers/dri/i965/brw_clip.c index 8512172..eca2844 100644 --- a/src/mesa/drivers/dri/i965/brw_clip.c +++ b/src/mesa/drivers/dri/i965/brw_clip.c @@ -239,6 +239,15 @@ brw_upload_clip_prog(struct brw_context *brw) break; } } + key.has_noperspective_shading = 0; + for (i = 0; i < BRW_VERT_RESULT_MAX; i++) { + if (brw->interpolation_mode[i] == INTERP_QUALIFIER_NOPERSPECTIVE && + brw->vs.prog_data->vue_map.slot_to_vert_result[i] != VERT_RESULT_HPOS) { + key.has_noperspective_shading = 1; + break; + } + } + key.pv_first = (ctx->Light.ProvokingVertex == GL_FIRST_VERTEX_CONVENTION); memcpy(key.interpolation_mode, brw->interpolation_mode, BRW_VERT_RESULT_MAX); diff --git a/src/mesa/drivers/dri/i965/brw_clip.h b/src/mesa/drivers/dri/i965/brw_clip.h index 3ad2e13..66dd928 100644 --- a/src/mesa/drivers/dri/i965/brw_clip.h +++ b/src/mesa/drivers/dri/i965/brw_clip.h @@ -47,6 +47,7 @@ struct brw_clip_prog_key { GLuint primitive:4; GLuint nr_userclip:4; GLuint has_flat_shading:1; + GLuint has_noperspective_shading:1; GLuint pv_first:1; GLuint do_unfilled:1; GLuint fill_cw:2; /* includes cull information */ diff --git a/src/mesa/drivers/dri/i965/brw_clip_util.c b/src/mesa/drivers/dri/i965/brw_clip_util.c index 692573e..b06ad1d 100644 --- a/src/mesa/drivers/dri/i965/brw_clip_util.c +++ b/src/mesa/drivers/dri/i965/brw_clip_util.c @@ -129,6 +129,8 @@ static void brw_clip_project_vertex( struct brw_clip_compile *c, /* Interpolate between two vertices and put the result into a0.0. * Increment a0.0 accordingly. + * + * Beware that dest_ptr can be equal to v0_ptr. 
*/ void brw_clip_interp_vertex( struct brw_clip_compile *c, struct brw_indirect dest_ptr, @@ -138,7 +140,8 @@ void brw_clip_interp_vertex( struct brw_clip_compile *c, bool force_edgeflag) { struct brw_compile *p = &c->func; - struct brw_reg tmp = get_tmp(c); + struct brw_context *brw = p->brw; + struct brw_reg t_nopersp, v0_ndc_copy; GLuint slot; /* Just copy the vertex header: @@ -148,13 +151,130 @@ void brw_clip_interp_vertex( struct brw_clip_compile *c, * back on Ironlake, so needn't change it */ brw_copy_indirect_to_indirect(p, dest_ptr, v0_ptr, 1); - - /* Iterate over each attribute (could be done in pairs?) + + /* +* First handle the 3D and NDC positioning, in case we need +* noperspective interpolation. Doing it early has no performance +* impact in any case. +*/ + + /* Start by picking up the v0 NDC coordinates, because that vertex +* may be shared with the destination. +*/ + if (c->key.has_noperspective_shading) { + GLuint offset = brw_vert_result_to_offset(&c->vue_map, +BRW_VERT_RESULT_NDC); + v0_ndc_copy = get_tmp(c); + brw_MOV(p, v0_ndc_copy, deref_4f(v0_ptr, offset)); + } + + /* +* Compute the new 3D position +* +* dest_hpos = v0_hpos * (1 - t0) + v1_hpos * t0 +*/ + { + GLuint delta = brw_vert_result_to_offset(&c->vue_map, VERT_RESULT_HPOS); + struct brw_reg tmp = get_tmp(c); + brw_MUL(p, + vec4(brw_null_reg()), + deref_4f(v1_ptr, delta), + t0); + + brw_MAC(p, + tmp, + negate(deref_4f(v0_ptr, delta)), + t0); + + brw_ADD(p, + deref_4f(dest_ptr, delta), + deref_4f(v0_ptr, delta), + tmp); + release_tmp(c, tmp); + } + + /* Then recreate the projected (NDC) coordinate in the new vertex +* header +*/ + brw_clip_project_vertex(c, dest_ptr); + + /* +* If we have noperspective attributes, we now need to compute the +* screen-space t. 
+*/ + if (c->key.has_noperspective_shading) { + GLuint delta = brw_vert_result_to_offset(&c->vue_map, BRW_VERT_RESULT_NDC); + struct brw_reg tmp = get_tmp(c); + t_nopersp = get_tmp(c); + + /* Build a register with coordinates from the second and new vertices + * + * t_nopersp = vec4(v1.xy, dest.xy) + */ + brw_MOV(p, t_nopersp, deref_4f(v1_ptr, delta)); + brw_MOV(p, tmp, deref_4f(dest_ptr, delta)); + brw_set_access_mode(p, BRW_ALIGN_16); + brw_MOV(p, + brw_writemask(t_nopersp, WRITEMASK_ZW), + brw_swizzle(tmp,
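The idea behind the screen-space t can be sketched on the CPU. The quoted patch is cut off before the actual computation, so the formula below (ratio of summed absolute NDC deltas) is an assumption made for illustration, as are all the names; only the dest_hpos interpolation is taken from the patch comments.

```c
/* Clip-space x, y and w only; z is not needed for this sketch. */
struct sketch_vertex {
   float x, y, w;
};

static float sketch_absf(float v)
{
   return v < 0.0f ? -v : v;
}

/* dest_hpos = v0_hpos * (1 - t0) + v1_hpos * t0, as in the patch comment. */
struct sketch_vertex sketch_interp_hpos(struct sketch_vertex v0,
                                        struct sketch_vertex v1, float t0)
{
   struct sketch_vertex dest;
   dest.x = v0.x * (1.0f - t0) + v1.x * t0;
   dest.y = v0.y * (1.0f - t0) + v1.y * t0;
   dest.w = v0.w * (1.0f - t0) + v1.w * t0;
   return dest;
}

/* Screen-space t (assumed formulation): the fraction of the projected
 * segment v0 -> v1 that is covered by the projected dest.  Returns 0 if
 * both endpoints project to the same point. */
float sketch_t_nopersp(struct sketch_vertex v0, struct sketch_vertex v1,
                       struct sketch_vertex dest)
{
   float x0 = v0.x / v0.w, y0 = v0.y / v0.w;
   float x1 = v1.x / v1.w, y1 = v1.y / v1.w;
   float xd = dest.x / dest.w, yd = dest.y / dest.w;

   float num = sketch_absf(xd - x0) + sketch_absf(yd - y0);
   float den = sketch_absf(x1 - x0) + sketch_absf(y1 - y0);

   return den != 0.0f ? num / den : 0.0f;
}
```

With v0 = (0, 0, w=1), v1 = (4, 2, w=2) and t0 = 0.5, the new vertex projects two thirds of the way along the screen-space segment, so noperspective attributes need t = 2/3 rather than 0.5.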
[Mesa-dev] [PATCH 7/9] intel gen4-5: Correctly handle flat vs. non-flat in the clipper.
At that point, all interpolation piglit tests involving fixed clipping work as long as there's no noperspective. Signed-off-by: Olivier Galibert Reviewed-by: Paul Berry --- src/mesa/drivers/dri/i965/brw_clip.c | 13 -- src/mesa/drivers/dri/i965/brw_clip.h |6 +-- src/mesa/drivers/dri/i965/brw_clip_line.c |6 +-- src/mesa/drivers/dri/i965/brw_clip_tri.c | 20 - src/mesa/drivers/dri/i965/brw_clip_unfilled.c |2 +- src/mesa/drivers/dri/i965/brw_clip_util.c | 56 +++-- src/mesa/drivers/dri/i965/brw_sf_emit.c |8 7 files changed, 50 insertions(+), 61 deletions(-) diff --git a/src/mesa/drivers/dri/i965/brw_clip.c b/src/mesa/drivers/dri/i965/brw_clip.c index b4a2e0a..8512172 100644 --- a/src/mesa/drivers/dri/i965/brw_clip.c +++ b/src/mesa/drivers/dri/i965/brw_clip.c @@ -218,7 +218,7 @@ brw_upload_clip_prog(struct brw_context *brw) struct intel_context *intel = &brw->intel; struct gl_context *ctx = &intel->ctx; struct brw_clip_prog_key key; - + int i; memset(&key, 0, sizeof(key)); /* Populate the key: @@ -231,11 +231,16 @@ brw_upload_clip_prog(struct brw_context *brw) key.primitive = brw->intel.reduced_primitive; /* CACHE_NEW_VS_PROG (also part of VUE map) */ key.attrs = brw->vs.prog_data->outputs_written; - /* _NEW_LIGHT */ - key.do_flat_shading = (ctx->Light.ShadeModel == GL_FLAT); + /* BRW_NEW_FRAGMENT_PROGRAM, _NEW_LIGHT */ + key.has_flat_shading = 0; + for (i = 0; i < BRW_VERT_RESULT_MAX; i++) { + if (brw->interpolation_mode[i] == INTERP_QUALIFIER_FLAT) { + key.has_flat_shading = 1; + break; + } + } key.pv_first = (ctx->Light.ProvokingVertex == GL_FIRST_VERTEX_CONVENTION); - /* BRW_NEW_FRAGMENT_PROGRAM, _NEW_LIGHT */ memcpy(key.interpolation_mode, brw->interpolation_mode, BRW_VERT_RESULT_MAX); /* _NEW_TRANSFORM (also part of VUE map)*/ diff --git a/src/mesa/drivers/dri/i965/brw_clip.h b/src/mesa/drivers/dri/i965/brw_clip.h index e78d074..3ad2e13 100644 --- a/src/mesa/drivers/dri/i965/brw_clip.h +++ b/src/mesa/drivers/dri/i965/brw_clip.h @@ -46,7 +46,7 @@ struct 
brw_clip_prog_key { unsigned char interpolation_mode[BRW_VERT_RESULT_MAX]; /* copy of the main context */ GLuint primitive:4; GLuint nr_userclip:4; - GLuint do_flat_shading:1; + GLuint has_flat_shading:1; GLuint pv_first:1; GLuint do_unfilled:1; GLuint fill_cw:2; /* includes cull information */ @@ -166,8 +166,8 @@ void brw_clip_kill_thread(struct brw_clip_compile *c); struct brw_reg brw_clip_plane_stride( struct brw_clip_compile *c ); struct brw_reg brw_clip_plane0_address( struct brw_clip_compile *c ); -void brw_clip_copy_colors( struct brw_clip_compile *c, - GLuint to, GLuint from ); +void brw_clip_copy_flatshaded_attributes( struct brw_clip_compile *c, + GLuint to, GLuint from ); void brw_clip_init_clipmask( struct brw_clip_compile *c ); diff --git a/src/mesa/drivers/dri/i965/brw_clip_line.c b/src/mesa/drivers/dri/i965/brw_clip_line.c index 6cf2bd2..729d8c0 100644 --- a/src/mesa/drivers/dri/i965/brw_clip_line.c +++ b/src/mesa/drivers/dri/i965/brw_clip_line.c @@ -271,11 +271,11 @@ void brw_emit_line_clip( struct brw_clip_compile *c ) brw_clip_line_alloc_regs(c); brw_clip_init_ff_sync(c); - if (c->key.do_flat_shading) { + if (c->key.has_flat_shading) { if (c->key.pv_first) - brw_clip_copy_colors(c, 1, 0); + brw_clip_copy_flatshaded_attributes(c, 1, 0); else - brw_clip_copy_colors(c, 0, 1); + brw_clip_copy_flatshaded_attributes(c, 0, 1); } clip_and_emit_line(c); diff --git a/src/mesa/drivers/dri/i965/brw_clip_tri.c b/src/mesa/drivers/dri/i965/brw_clip_tri.c index a29f8e0..71225f5 100644 --- a/src/mesa/drivers/dri/i965/brw_clip_tri.c +++ b/src/mesa/drivers/dri/i965/brw_clip_tri.c @@ -187,8 +187,8 @@ void brw_clip_tri_flat_shade( struct brw_clip_compile *c ) brw_IF(p, BRW_EXECUTE_1); { - brw_clip_copy_colors(c, 1, 0); - brw_clip_copy_colors(c, 2, 0); + brw_clip_copy_flatshaded_attributes(c, 1, 0); + brw_clip_copy_flatshaded_attributes(c, 2, 0); } brw_ELSE(p); { @@ -200,19 +200,19 @@ void brw_clip_tri_flat_shade( struct brw_clip_compile *c ) 
brw_imm_ud(_3DPRIM_TRIFAN)); brw_IF(p, BRW_EXECUTE_1); { - brw_clip_copy_colors(c, 0, 1); - brw_clip_copy_colors(c, 2, 1); + brw_clip_copy_flatshaded_attributes(c, 0, 1); + brw_clip_copy_flatshaded_attributes(c, 2, 1); } brw_ELSE(p); { - brw_clip_copy_colors(c, 1, 0); - brw_clip_copy_colors(c, 2, 0); + brw_clip_copy_flatshaded_attributes(c, 1, 0); + brw_clip_copy_flatsha
[Mesa-dev] [PATCH 6/9] intel gen4-5: Correctly setup the parameters in the sf.
This patch also corrects a couple of problems with noperspective interpolation. At this point all the glsl 1.1/1.3 interpolation tests that do not clip pass (the -none ones). The fs code does not use the pre-resolved interpolation modes in order not to mess with gen6+. Sharing the resolution would require putting brw_wm_prog before brw_clip_prog and brw_sf_prog. This may be a good thing, but it could have unexpected consequences, so it's better done independently in any case. Signed-off-by: Olivier Galibert Reviewed-by: Paul Berry --- src/mesa/drivers/dri/i965/brw_fs.cpp |2 +- src/mesa/drivers/dri/i965/brw_fs_visitor.cpp | 15 +++ src/mesa/drivers/dri/i965/brw_sf.c | 12 +- src/mesa/drivers/dri/i965/brw_sf.h |2 +- src/mesa/drivers/dri/i965/brw_sf_emit.c | 164 +- 5 files changed, 106 insertions(+), 89 deletions(-) diff --git a/src/mesa/drivers/dri/i965/brw_fs.cpp b/src/mesa/drivers/dri/i965/brw_fs.cpp index 3b62952..4734a5d 100644 --- a/src/mesa/drivers/dri/i965/brw_fs.cpp +++ b/src/mesa/drivers/dri/i965/brw_fs.cpp @@ -757,7 +757,7 @@ fs_visitor::emit_general_interpolation(ir_variable *ir) inst->predicated = true; inst->predicate_inverse = true; } - if (intel->gen < 6) { + if (intel->gen < 6 && interpolation_mode == INTERP_QUALIFIER_SMOOTH) { emit(BRW_OPCODE_MUL, attr, attr, this->pixel_w); } } diff --git a/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp b/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp index 08c0130..c6dc265 100644 --- a/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp +++ b/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp @@ -1872,6 +1872,21 @@ fs_visitor::emit_interpolation_setup_gen4() emit(BRW_OPCODE_ADD, this->delta_y[BRW_WM_PERSPECTIVE_PIXEL_BARYCENTRIC], this->pixel_y, fs_reg(negate(brw_vec1_grf(1, 1)))); + /* +* On Gen4-5, we accomplish perspective-correct interpolation by +* dividing the attribute values by w in the sf shader, +* interpolating the result linearly in screen space, and then +* multiplying by w in the fragment shader. 
So the interpolation +* step is always linear in screen space, regardless of whether the +* attribute is perspective or non-perspective. Accordingly, we +* use the same delta_x and delta_y values for both kinds of +* interpolation. +*/ + this->delta_x[BRW_WM_NONPERSPECTIVE_PIXEL_BARYCENTRIC] = + this->delta_x[BRW_WM_PERSPECTIVE_PIXEL_BARYCENTRIC]; + this->delta_y[BRW_WM_NONPERSPECTIVE_PIXEL_BARYCENTRIC] = + this->delta_y[BRW_WM_PERSPECTIVE_PIXEL_BARYCENTRIC]; + this->current_annotation = "compute pos.w and 1/pos.w"; /* Compute wpos.w. It's always in our setup, since it's needed to * interpolate the other attributes. diff --git a/src/mesa/drivers/dri/i965/brw_sf.c b/src/mesa/drivers/dri/i965/brw_sf.c index 26cbaf7..c00e85a 100644 --- a/src/mesa/drivers/dri/i965/brw_sf.c +++ b/src/mesa/drivers/dri/i965/brw_sf.c @@ -139,6 +139,7 @@ brw_upload_sf_prog(struct brw_context *brw) struct brw_sf_prog_key key; /* _NEW_BUFFERS */ bool render_to_fbo = _mesa_is_user_fbo(ctx->DrawBuffer); + int i; memset(&key, 0, sizeof(key)); @@ -190,11 +191,16 @@ brw_upload_sf_prog(struct brw_context *brw) if ((ctx->Point.SpriteOrigin == GL_LOWER_LEFT) != render_to_fbo) key.sprite_origin_lower_left = true; - /* _NEW_LIGHT */ - key.do_flat_shading = (ctx->Light.ShadeModel == GL_FLAT); + /* BRW_NEW_FRAGMENT_PROGRAM, _NEW_LIGHT */ + key.has_flat_shading = 0; + for (i = 0; i < BRW_VERT_RESULT_MAX; i++) { + if (brw->interpolation_mode[i] == INTERP_QUALIFIER_FLAT) { + key.has_flat_shading = 1; + break; + } + } key.do_twoside_color = ctx->VertexProgram._TwoSideEnabled; - /* BRW_NEW_FRAGMENT_PROGRAM, _NEW_LIGHT */ memcpy(key.interpolation_mode, brw->interpolation_mode, BRW_VERT_RESULT_MAX); /* _NEW_POLYGON */ diff --git a/src/mesa/drivers/dri/i965/brw_sf.h b/src/mesa/drivers/dri/i965/brw_sf.h index 5e261fb..47fdb3e 100644 --- a/src/mesa/drivers/dri/i965/brw_sf.h +++ b/src/mesa/drivers/dri/i965/brw_sf.h @@ -50,7 +50,7 @@ struct brw_sf_prog_key { uint8_t point_sprite_coord_replace; GLuint primitive:2; 
GLuint do_twoside_color:1; - GLuint do_flat_shading:1; + GLuint has_flat_shading:1; GLuint frontface_ccw:1; GLuint do_point_sprite:1; GLuint do_point_coord:1; diff --git a/src/mesa/drivers/dri/i965/brw_sf_emit.c b/src/mesa/drivers/dri/i965/brw_sf_emit.c index 9d8aa38..c99578a 100644 --- a/src/mesa/drivers/dri/i965/brw_sf_emit.c +++ b/src/mesa/drivers/dri/i965/brw_sf_emit.c @@ -44,6 +44,17 @@ /** + * Determine the vue slot corresponding to the given half of the given + *
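The Gen4-5 scheme described in the patch comment (divide by w in the sf, interpolate linearly in screen space, multiply by w in the fragment shader) can be checked numerically. The following is a sketch for one attribute along one edge; the function names are invented, and the pixel w is assumed to be the reciprocal of the linearly interpolated 1/w.

```c
/* Perspective-correct interpolation done the Gen4-5 way.
 * s is the screen-space interpolation factor between vertex 0 and 1. */
float sketch_interp_smooth_gen4(float a0, float w0, float a1, float w1,
                                float s)
{
   /* sf stage: pre-divide each attribute by its vertex w. */
   float a0_over_w = a0 / w0;
   float a1_over_w = a1 / w1;

   /* Linear interpolation in screen space of a/w and of 1/w. */
   float a_over_w = a0_over_w + s * (a1_over_w - a0_over_w);
   float inv_w = 1.0f / w0 + s * (1.0f / w1 - 1.0f / w0);

   /* fragment stage: multiply by the pixel w to undo the division. */
   float pixel_w = 1.0f / inv_w;
   return a_over_w * pixel_w;
}

/* noperspective attributes skip both the divide and the multiply, which
 * is why the patch only emits the MUL for INTERP_QUALIFIER_SMOOTH. */
float sketch_interp_noperspective(float a0, float a1, float s)
{
   return a0 + s * (a1 - a0);
}
```

For a0 = 0 at w = 1 and a1 = 1 at w = 3, the screen-space midpoint gets 0.25 with smooth interpolation and 0.5 with noperspective, using the same linear step in both cases.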
[Mesa-dev] [PATCH 5/9] intel gen4-5: Compute the interpolation status for every variable in one place.
The program keys are updated accordingly, but the values are not used yet. Signed-off-by: Olivier Galibert --- src/mesa/drivers/dri/i965/brw_clip.c| 90 ++- src/mesa/drivers/dri/i965/brw_clip.h|1 + src/mesa/drivers/dri/i965/brw_context.h | 11 src/mesa/drivers/dri/i965/brw_sf.c |5 +- src/mesa/drivers/dri/i965/brw_sf.h |1 + src/mesa/drivers/dri/i965/brw_wm.c |2 + src/mesa/drivers/dri/i965/brw_wm.h |1 + 7 files changed, 109 insertions(+), 2 deletions(-) diff --git a/src/mesa/drivers/dri/i965/brw_clip.c b/src/mesa/drivers/dri/i965/brw_clip.c index d411208..b4a2e0a 100644 --- a/src/mesa/drivers/dri/i965/brw_clip.c +++ b/src/mesa/drivers/dri/i965/brw_clip.c @@ -47,6 +47,86 @@ #define FRONT_UNFILLED_BIT 0x1 #define BACK_UNFILLED_BIT 0x2 +/** + * Lookup the interpolation mode information for every element in the + * vue. + */ +static void +brw_lookup_interpolation(struct brw_context *brw) +{ + /* pprog means "previous program", i.e. the last program before the +* fragment shader. It can only be the vertex shader for now, but +* it may be a geometry shader in the future. +*/ + const struct gl_program *pprog = &brw->vertex_program->Base; + const struct gl_fragment_program *fprog = brw->fragment_program; + struct brw_vue_map *vue_map = &brw->vs.prog_data->vue_map; + + /* Default everything to INTERP_QUALIFIER_NONE */ + memset(brw->interpolation_mode, INTERP_QUALIFIER_NONE, BRW_VERT_RESULT_MAX); + + /* If there is no fragment shader, interpolation won't be needed, +* so defaulting to none is good. +*/ + if (!fprog) + return; + + for (int i = 0; i < vue_map->num_slots; i++) { + /* First lookup the vert result, skip if there isn't one */ + int vert_result = vue_map->slot_to_vert_result[i]; + if (vert_result == BRW_VERT_RESULT_MAX) + continue; + + /* HPOS is special. In the clipper, it is handled specifically, + * so its value is irrelevant. In the sf, it's forced to + * linear. In the wm, it's special cased, irrelevant again. So + * force linear to remove the sf special case. 
+ */ + if (vert_result == VERT_RESULT_HPOS) { + brw->interpolation_mode[i] = INTERP_QUALIFIER_NOPERSPECTIVE; + continue; + } + + /* There is a 1-1 mapping of vert result to frag attrib except + * for BackColor and vars + */ + int frag_attrib = vert_result; + if (vert_result >= VERT_RESULT_BFC0 && vert_result <= VERT_RESULT_BFC1) + frag_attrib = vert_result - VERT_RESULT_BFC0 + FRAG_ATTRIB_COL0; + else if(vert_result >= VERT_RESULT_VAR0) + frag_attrib = vert_result - VERT_RESULT_VAR0 + FRAG_ATTRIB_VAR0; + + /* If the output is not used by the fragment shader, skip it. */ + if (!(fprog->Base.InputsRead & BITFIELD64_BIT(frag_attrib))) + continue; + + /* Lookup the interpolation mode */ + enum glsl_interp_qualifier interpolation_mode = fprog->InterpQualifier[frag_attrib]; + + /* If the mode is not specified, then the default varies. Color + * values follow the shader model, while all the rest uses + * smooth. + */ + if (interpolation_mode == INTERP_QUALIFIER_NONE) { + if (frag_attrib >= FRAG_ATTRIB_COL0 && frag_attrib <= FRAG_ATTRIB_COL1) +interpolation_mode = brw->intel.ctx.Light.ShadeModel == GL_FLAT ? INTERP_QUALIFIER_FLAT : INTERP_QUALIFIER_SMOOTH; + else +interpolation_mode = INTERP_QUALIFIER_SMOOTH; + } + + /* Finally, if we have both a front color and a back color for + * the same channel, the selection will be done before + * interpolation and the back color copied over the front color + * if necessary. So interpolating the back color is + * unnecessary. 
+ */ + if (vert_result >= VERT_RESULT_BFC0 && vert_result <= VERT_RESULT_BFC1) + if (pprog->OutputsWritten & BITFIELD64_BIT(vert_result - VERT_RESULT_BFC0 + VERT_RESULT_COL0)) +interpolation_mode = INTERP_QUALIFIER_NONE; + + brw->interpolation_mode[i] = interpolation_mode; + } +} static void compile_clip_prog( struct brw_context *brw, struct brw_clip_prog_key *key ) @@ -143,6 +223,10 @@ brw_upload_clip_prog(struct brw_context *brw) /* Populate the key: */ + + /* BRW_NEW_FRAGMENT_PROGRAM, _NEW_LIGHT */ + brw_lookup_interpolation(brw); + /* BRW_NEW_REDUCED_PRIMITIVE */ key.primitive = brw->intel.reduced_primitive; /* CACHE_NEW_VS_PROG (also part of VUE map) */ @@ -150,6 +234,10 @@ brw_upload_clip_prog(struct brw_context *brw) /* _NEW_LIGHT */ key.do_flat_shading = (ctx->Light.ShadeModel == GL_FLAT); key.pv_first = (ctx->Light.ProvokingV
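The defaulting rule in brw_lookup_interpolation, reduced to a pure function, looks roughly like this. It is a sketch with hypothetical names: the enum mirrors the INTERP_QUALIFIER_* values only in spirit, and the two flags stand in for the FRAG_ATTRIB_COL0/COL1 range test and the GL_FLAT shade model check.

```c
/* Stand-in for the INTERP_QUALIFIER_* values (names are hypothetical). */
enum sketch_interp {
   SKETCH_INTERP_NONE,
   SKETCH_INTERP_SMOOTH,
   SKETCH_INTERP_FLAT,
   SKETCH_INTERP_NOPERSPECTIVE
};

/* Resolve the effective interpolation mode of one fragment input.
 * declared:         qualifier written in the shader, or NONE
 * is_color:         non-zero for the two color inputs
 * shade_model_flat: non-zero when the shade model is flat */
enum sketch_interp sketch_resolve_interp(enum sketch_interp declared,
                                         int is_color, int shade_model_flat)
{
   /* An explicit qualifier always wins. */
   if (declared != SKETCH_INTERP_NONE)
      return declared;

   /* Unqualified colors follow the shade model ... */
   if (is_color)
      return shade_model_flat ? SKETCH_INTERP_FLAT : SKETCH_INTERP_SMOOTH;

   /* ... and everything else defaults to smooth. */
   return SKETCH_INTERP_SMOOTH;
}
```

Resolving the mode once and storing it per slot is what lets the clip and sf keys carry a simple per-slot array instead of repeating this decision in each stage.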
[Mesa-dev] [PATCH 4/9] intel gen4-5: Fix backface/frontface selection when only one color is written to.
Shaders, piglit test ones in particular, may write only to one of
gl_FrontColor/gl_BackColor.  The standard is unclear on whether the
behaviour is defined in that case, but it seems reasonable to support
it.  The choice made here is to pick up whichever color was actually
written to.

That makes most of the generated piglit tests useless for testing the
backface selection, but it's simple and it works.

Signed-off-by: Olivier Galibert
---
 src/mesa/drivers/dri/i965/brw_fs.cpp     |9 +
 src/mesa/drivers/dri/i965/brw_wm_pass2.c |9 +
 2 files changed, 18 insertions(+)

diff --git a/src/mesa/drivers/dri/i965/brw_fs.cpp b/src/mesa/drivers/dri/i965/brw_fs.cpp
index 3f98137..3b62952 100644
--- a/src/mesa/drivers/dri/i965/brw_fs.cpp
+++ b/src/mesa/drivers/dri/i965/brw_fs.cpp
@@ -972,6 +972,15 @@ fs_visitor::calculate_urb_setup()
          if (c->key.vp_outputs_written & BITFIELD64_BIT(i)) {
             int fp_index = _mesa_vert_result_to_frag_attrib((gl_vert_result) i);
 
+            /* Special case: two-sided vertex option, vertex program
+             * only writes to the back color.  Map it to the
+             * associated front color location.
+             */
+            if (i >= VERT_RESULT_BFC0 && i <= VERT_RESULT_BFC1 &&
+                ctx->VertexProgram._TwoSideEnabled &&
+                urb_setup[i - VERT_RESULT_BFC0 + FRAG_ATTRIB_COL0] == -1)
+               fp_index = i - VERT_RESULT_BFC0 + FRAG_ATTRIB_COL0;
+
             /* The back color slot is skipped when the front color is
              * also written to.  In addition, some slots can be
              * written in the vertex shader and not read in the
diff --git a/src/mesa/drivers/dri/i965/brw_wm_pass2.c b/src/mesa/drivers/dri/i965/brw_wm_pass2.c
index eacf7c0..48143f3 100644
--- a/src/mesa/drivers/dri/i965/brw_wm_pass2.c
+++ b/src/mesa/drivers/dri/i965/brw_wm_pass2.c
@@ -96,6 +96,15 @@ static void init_registers( struct brw_wm_compile *c )
       if (c->key.vp_outputs_written & BITFIELD64_BIT(j)) {
          int fp_index = _mesa_vert_result_to_frag_attrib(j);
 
+         /* Special case: two-sided vertex option, vertex program
+          * only writes to the back color.  Map it to the
+          * associated front color location.
+          */
+         if (j >= VERT_RESULT_BFC0 && j <= VERT_RESULT_BFC1 &&
+             intel->ctx.VertexProgram._TwoSideEnabled &&
+             !(c->key.vp_outputs_written & BITFIELD64_BIT(j - VERT_RESULT_BFC0 + VERT_RESULT_COL0)))
+            fp_index = j - VERT_RESULT_BFC0 + FRAG_ATTRIB_COL0;
+
          nr_interp_regs++;
 
          /* The back color slot is skipped when the front color is
-- 
1.7.10.280.gaa39
[Mesa-dev] [PATCH 3/9] intel gen4-5: fix GL_VERTEX_PROGRAM_TWO_SIDE selection.
The previous code only selected two-sided color in pure fixed-function
setups.  This version also activates it when needed with shader
programs.

Signed-off-by: Olivier Galibert
---
 src/mesa/drivers/dri/i965/brw_sf.c |2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/src/mesa/drivers/dri/i965/brw_sf.c b/src/mesa/drivers/dri/i965/brw_sf.c
index 23a874a..791210f 100644
--- a/src/mesa/drivers/dri/i965/brw_sf.c
+++ b/src/mesa/drivers/dri/i965/brw_sf.c
@@ -192,7 +192,7 @@ brw_upload_sf_prog(struct brw_context *brw)
 
    /* _NEW_LIGHT */
    key.do_flat_shading = (ctx->Light.ShadeModel == GL_FLAT);
-   key.do_twoside_color = (ctx->Light.Enabled && ctx->Light.Model.TwoSide);
+   key.do_twoside_color = ctx->VertexProgram._TwoSideEnabled;
 
    /* _NEW_POLYGON */
    if (key.do_twoside_color) {
-- 
1.7.10.280.gaa39
[Mesa-dev] [PATCH 2/9] intel gen4-5: simplify the bfc copy in the sf.
This patch is mostly designed to make followup patches simpler, but it's a simplification by itself. Signed-off-by: Olivier Galibert --- src/mesa/drivers/dri/i965/brw_sf_emit.c | 93 +-- 1 file changed, 52 insertions(+), 41 deletions(-) diff --git a/src/mesa/drivers/dri/i965/brw_sf_emit.c b/src/mesa/drivers/dri/i965/brw_sf_emit.c index ff6383b..9d8aa38 100644 --- a/src/mesa/drivers/dri/i965/brw_sf_emit.c +++ b/src/mesa/drivers/dri/i965/brw_sf_emit.c @@ -79,24 +79,9 @@ have_attr(struct brw_sf_compile *c, GLuint attr) /*** * Twoside lighting */ -static void copy_bfc( struct brw_sf_compile *c, - struct brw_reg vert ) -{ - struct brw_compile *p = &c->func; - GLuint i; - - for (i = 0; i < 2; i++) { - if (have_attr(c, VERT_RESULT_COL0+i) && - have_attr(c, VERT_RESULT_BFC0+i)) -brw_MOV(p, -get_vert_result(c, vert, VERT_RESULT_COL0+i), -get_vert_result(c, vert, VERT_RESULT_BFC0+i)); - } -} - - static void do_twoside_color( struct brw_sf_compile *c ) { + GLuint i, need_0, need_1; struct brw_compile *p = &c->func; GLuint backface_conditional = c->key.frontface_ccw ? BRW_CONDITIONAL_G : BRW_CONDITIONAL_L; @@ -105,12 +90,14 @@ static void do_twoside_color( struct brw_sf_compile *c ) if (c->key.primitive == SF_UNFILLED_TRIS) return; - /* XXX: What happens if BFC isn't present? This could only happen -* for user-supplied vertex programs, as t_vp_build.c always does -* the right thing. + /* If the vertex shader provides both front and backface color, do +* the selection. Otherwise the generated code will pick up +* whichever there is. 
*/ - if (!(have_attr(c, VERT_RESULT_COL0) && have_attr(c, VERT_RESULT_BFC0)) && - !(have_attr(c, VERT_RESULT_COL1) && have_attr(c, VERT_RESULT_BFC1))) + need_0 = have_attr(c, VERT_RESULT_COL0) && have_attr(c, VERT_RESULT_BFC0); + need_1 = have_attr(c, VERT_RESULT_COL1) && have_attr(c, VERT_RESULT_BFC1); + + if (!need_0 && !need_1) return; /* Need to use BRW_EXECUTE_4 and also do an 4-wide compare in order @@ -121,12 +108,15 @@ static void do_twoside_color( struct brw_sf_compile *c ) brw_push_insn_state(p); brw_CMP(p, vec4(brw_null_reg()), backface_conditional, c->det, brw_imm_f(0)); brw_IF(p, BRW_EXECUTE_4); - { - switch (c->nr_verts) { - case 3: copy_bfc(c, c->vert[2]); - case 2: copy_bfc(c, c->vert[1]); - case 1: copy_bfc(c, c->vert[0]); - } + for (i=0; i<c->nr_verts; i++) { + if (need_0) +brw_MOV(p, +get_vert_result(c, c->vert[i], VERT_RESULT_COL0), +get_vert_result(c, c->vert[i], VERT_RESULT_BFC0)); + if (need_1) +brw_MOV(p, +get_vert_result(c, c->vert[i], VERT_RESULT_COL1), +get_vert_result(c, c->vert[i], VERT_RESULT_BFC1)); } brw_ENDIF(p); brw_pop_insn_state(p); @@ -139,20 +129,27 @@ static void do_twoside_color( struct brw_sf_compile *c ) */ #define VERT_RESULT_COLOR_BITS (BITFIELD64_BIT(VERT_RESULT_COL0) | \ - BITFIELD64_BIT(VERT_RESULT_COL1)) +BITFIELD64_BIT(VERT_RESULT_COL1)) static void copy_colors( struct brw_sf_compile *c, struct brw_reg dst, -struct brw_reg src) + struct brw_reg src, + int allow_twoside) { struct brw_compile *p = &c->func; GLuint i; for (i = VERT_RESULT_COL0; i <= VERT_RESULT_COL1; i++) { - if (have_attr(c,i)) + if (have_attr(c,i)) { brw_MOV(p, get_vert_result(c, dst, i), get_vert_result(c, src, i)); + + } else if(allow_twoside && have_attr(c, i - VERT_RESULT_COL0 + VERT_RESULT_BFC0)) { +brw_MOV(p, +get_vert_result(c, dst, i - VERT_RESULT_COL0 + VERT_RESULT_BFC0), +get_vert_result(c, src, i - VERT_RESULT_COL0 + VERT_RESULT_BFC0)); + } } } @@ -167,9 +164,19 @@ static void do_flatshade_triangle( struct brw_sf_compile *c ) struct brw_compile 
*p = &c->func; struct intel_context *intel = &p->brw->intel; struct brw_reg ip = brw_ip_reg(); - GLuint nr = _mesa_bitcount_64(c->key.attrs & VERT_RESULT_COLOR_BITS); GLuint jmpi = 1; + GLuint nr; + + if (c->key.do_twoside_color) { + nr = ((c->key.attrs & (BITFIELD64_BIT(VERT_RESULT_COL0) | BITFIELD64_BIT(VERT_RESULT_BFC0))) != 0) + + ((c->key.attrs & (BITFIELD64_BIT(VERT_RESULT_COL1) | BITFIELD64_BIT(VERT_RESULT_BFC1))) != 0); + + } else { + nr = ((c->key.attrs & BITFIELD64_BIT(VERT_RESULT_COL0)) != 0) + +
[Mesa-dev] [PATCH 1/9] intel gen4-5: fix the vue view in the fs.
In some cases the fragment shader view of the vue registers was out of
sync with the builder.  This fixes it.

Signed-off-by: Olivier Galibert
---
 src/mesa/drivers/dri/i965/brw_fs.cpp     |    9 ++++++++-
 src/mesa/drivers/dri/i965/brw_wm_pass2.c |   10 +++++++++-
 2 files changed, 17 insertions(+), 2 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_fs.cpp b/src/mesa/drivers/dri/i965/brw_fs.cpp
index b3b25cc..3f98137 100644
--- a/src/mesa/drivers/dri/i965/brw_fs.cpp
+++ b/src/mesa/drivers/dri/i965/brw_fs.cpp
@@ -972,8 +972,15 @@ fs_visitor::calculate_urb_setup()
             if (c->key.vp_outputs_written & BITFIELD64_BIT(i)) {
                int fp_index = _mesa_vert_result_to_frag_attrib((gl_vert_result) i);

+               /* The back color slot is skipped when the front color is
+                * also written to.  In addition, some slots can be
+                * written in the vertex shader and not read in the
+                * fragment shader.  So the register number must always be
+                * incremented, mapped or not.
+                */
                if (fp_index >= 0)
-                  urb_setup[fp_index] = urb_next++;
+                  urb_setup[fp_index] = urb_next;
+               urb_next++;
             }
          }

diff --git a/src/mesa/drivers/dri/i965/brw_wm_pass2.c b/src/mesa/drivers/dri/i965/brw_wm_pass2.c
index 27c0a94..eacf7c0 100644
--- a/src/mesa/drivers/dri/i965/brw_wm_pass2.c
+++ b/src/mesa/drivers/dri/i965/brw_wm_pass2.c
@@ -97,8 +97,16 @@ static void init_registers( struct brw_wm_compile *c )
          int fp_index = _mesa_vert_result_to_frag_attrib(j);

          nr_interp_regs++;
+
+         /* The back color slot is skipped when the front color is
+          * also written to.  In addition, some slots can be
+          * written in the vertex shader and not read in the
+          * fragment shader.  So the register number must always be
+          * incremented, mapped or not.
+          */
          if (fp_index >= 0)
-            prealloc_reg(c, &c->payload.input_interp[fp_index], i++);
+            prealloc_reg(c, &c->payload.input_interp[fp_index], i);
+         i++;
       }
    }
    assert(nr_interp_regs >= 1);
-- 
1.7.10.280.gaa39

_______________________________________________
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev
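For illustration, the counting rule from the comment in the patch above can be sketched outside the driver. This is a minimal stand-alone sketch, not the i965 code: `map_slots`, `to_frag_attrib`, `NUM_SLOTS` and the slot numbering are hypothetical stand-ins for `calculate_urb_setup()` and `_mesa_vert_result_to_frag_attrib()`. The point it tries to show is that the register counter advances for every written vertex output, whether or not the fragment side has a matching input.

```c
#include <assert.h>
#include <stdint.h>

#define NUM_SLOTS 8   /* hypothetical number of vertex output slots */

/* Hypothetical mapping: odd slots have no fragment-side attribute
 * (-1), even slot s maps to fragment attribute s / 2.
 */
static int to_frag_attrib(int slot)
{
   return (slot & 1) ? -1 : slot / 2;
}

/* Fill setup[] with the register assigned to each fragment attribute
 * (-1 when unmapped).  Returns the number of registers consumed.
 */
static int map_slots(uint64_t written, int setup[NUM_SLOTS], int first_reg)
{
   int next = first_reg;
   int i;

   assert(first_reg >= 0);

   for (i = 0; i < NUM_SLOTS; i++)
      setup[i] = -1;

   for (i = 0; i < NUM_SLOTS; i++) {
      if (written & (UINT64_C(1) << i)) {
         int fp_index = to_frag_attrib(i);

         if (fp_index >= 0)
            setup[fp_index] = next;
         /* Advance even when unmapped: the slot still occupies a
          * register in the payload.
          */
         next++;
      }
   }
   return next - first_reg;
}
```

With slots 0, 1 and 2 written and registers starting at 2, attribute 0 lands in register 2 and attribute 1 in register 4, because the unmapped slot 1 still takes register 3. With a post-increment inside the `if`, attribute 1 would have ended up in register 3.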
[Mesa-dev] (no subject)
  Hi,

This is the second version of the clipping/interpolation patches.

Main differences:
- I tried to take all of Paul's remarks into account
- I split the first patch into 4 independent ones
- I've added a patch to ensure that integers pass through unscathed

Patch 4/9 is (slightly) controversial.  There may be better ways to do
it, or at least more general ones.  But it's simple, it works, and it
allows validating the other 8.  It's an easy one to revert if we build
an alternative.

Best,

  OG.

[PATCH 1/9] intel gen4-5: fix the vue view in the fs.
[PATCH 2/9] intel gen4-5: simplify the bfc copy in the sf.
[PATCH 3/9] intel gen4-5: fix GL_VERTEX_PROGRAM_TWO_SIDE selection.
[PATCH 4/9] intel gen4-5: Fix backface/frontface selection when one
[PATCH 5/9] intel gen4-5: Compute the interpolation status for every
[PATCH 6/9] intel gen4-5: Correctly setup the parameters in the sf.
[PATCH 7/9] intel gen4-5: Correctly handle flat vs. non-flat in the
[PATCH 8/9] intel gen4-5: Make noperspective clipping work.
[PATCH 9/9] intel gen4-5: Don't touch flatshaded values when
Re: [Mesa-dev] [PATCH] sp_tex_sample: Fix segfault with fbo-cubemap.
On Thu, Jul 19, 2012 at 10:57:38AM -0600, Brian Paul wrote:
> static const float ...

Indeed.

> Reviewed-by: Brian Paul

Thanks.  Could you commit it please?

Best,

  OG.
[Mesa-dev] [PATCH] sp_tex_sample: Fix segfault with fbo-cubemap.
The cube sampler generates two-dimensional texture coordinates and
hence passes NULL for the array for the third one.  The actual 2D
sampler, lower in the pipe, knew not to use that array since it
didn't need it.

But the samplers have become single-texel and the coordinate array
dereference has been moved up one step, to a level where the code does
not know only two coordinates are used.  Hence the segfault.

The simplest fix by far is to add a third dummy coordinate array in
the call to the next pipe step, which will be dereferenced to a
harmless 0 which then will be happily ignored by the sampler.

Signed-off-by: Olivier Galibert
---
 src/gallium/drivers/softpipe/sp_tex_sample.c |    7 ++++++-
 1 file changed, 6 insertions(+), 1 deletion(-)

  Brown paper bag time.  I had tested with (I think) everything with
  "tex" in the name.  Guess what fbo-cubemap doesn't have in the name?

  Fixes 52250.

diff --git a/src/gallium/drivers/softpipe/sp_tex_sample.c b/src/gallium/drivers/softpipe/sp_tex_sample.c
index 292dc6e..2f6e272 100644
--- a/src/gallium/drivers/softpipe/sp_tex_sample.c
+++ b/src/gallium/drivers/softpipe/sp_tex_sample.c
@@ -2090,6 +2090,11 @@ sample_cube(struct tgsi_sampler *tgsi_sampler,
    unsigned j;
    float [4], [4];

+   /* Not actually used, but the intermediate steps that do the
+    * dereferencing don't know it.
+    */
+   float [4] = { 0, 0, 0, 0 };
+
    /*
      major axis
      direction    target    sc    tc    ma
@@ -2157,7 +2162,7 @@ sample_cube(struct tgsi_sampler *tgsi_sampler,
        * is not active, this will point somewhere deeper into the
        * pipeline, eg. to mip_filter or even img_filter.
        */
-      samp->compare(tgsi_sampler, , , NULL, c0, control, rgba);
+      samp->compare(tgsi_sampler, , , , c0, control, rgba);
    }


-- 
1.7.10.280.gaa39
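The failure mode described in the commit message above can be sketched in a few lines. This is only an illustration under assumed names: `fetch_texel`, `sample_2d_face` and `QUAD_SIZE` are hypothetical and are not softpipe's interface. The intermediate stage reads all three coordinate arrays before it knows how many are needed, so the caller hands it a zero-filled dummy instead of NULL.

```c
#include <assert.h>
#include <stddef.h>

#define QUAD_SIZE 4   /* hypothetical: one coordinate per pixel of a quad */

/* Hypothetical intermediate stage: picks texel j out of all three
 * coordinate arrays before handing them down, without knowing whether
 * the third coordinate is used.
 */
static float fetch_texel(const float s[QUAD_SIZE], const float t[QUAD_SIZE],
                         const float p[QUAD_SIZE], int j)
{
   float cs = s[j];
   float ct = t[j];
   float cp = p[j];   /* p == NULL would crash right here */

   (void) cp;         /* a 2D lookup ignores the third coordinate */
   return cs + ct;    /* placeholder for the real texel lookup */
}

/* Hypothetical caller with only two meaningful coordinates. */
static float sample_2d_face(const float s[QUAD_SIZE],
                            const float t[QUAD_SIZE], int j)
{
   /* Never read for its value, but it must be dereferenceable. */
   static const float dummy_p[QUAD_SIZE] = { 0, 0, 0, 0 };

   assert(j >= 0 && j < QUAD_SIZE);
   return fetch_texel(s, t, dummy_p, j);
}
```

Making the dummy `static const` keeps it out of the per-call stack setup, at no cost to the callee.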
Re: [Mesa-dev] [PATCH] softpipe: Take all lods into account when texture sampling.
On Tue, Jul 17, 2012 at 03:41:44PM -0600, Brian Paul wrote:
> On 07/13/2012 10:30 AM, Olivier Galibert wrote:
> > On Wed, Jun 20, 2012 at 08:33:38AM -0600, Brian Paul wrote:
> >> Yeah, I think it's pretty clear that we need to support per-pixel LOD
> >> selection.  For softpipe, Olivier's big patch looks good.
> >
> > ... and then nothing happened.  Ping?  The only code remark was a
> > whitespace issue on one line :-)
>
> I'll commit/push your patch soon.  I don't always remember who has
> git-write access so if you can't push patches yourself you should
> probably indicate so.

I indeed don't have commit access, but more importantly there has been
discussion but not review, which is why I didn't know if I had to
change things :-)

Best,

  OG.
Re: [Mesa-dev] [PATCH 1/5] intel gen4/5: fix GL_VERTEX_PROGRAM_TWO_SIDE.
On Mon, Jul 16, 2012 at 08:43:17PM -0700, Paul Berry wrote:
> Also, I'm not convinced that #3 is necessary.  Is there something in the
> spec that dictates this behaviour?  My reading of the spec is that if the
> vertex shader writes to gl_BackColor but not glFrontColor, then the
> contents of gl_Color in the fragment shader is undefined.

Oh, I remember why I did that in the first place.  All the front/back
piglit tests only write the appropriate color slot and not the other
one.

The language is annoying:
  The following built-in vertex output variables are available, but
  deprecated.  A particular one should be written to if any
  functionality in a corresponding fragment shader or fixed pipeline
  uses it or state derived from it.  Otherwise, behavior is undefined.

    out vec4 gl_FrontColor;
    out vec4 gl_BackColor;
    out vec4 gl_FrontSecondaryColor;
    out vec4 gl_BackSecondaryColor;
  [...]

One could argue that you don't "use" gl_FrontColor if all your
polygons are back-facing.  Dunno.  Do you consider all of the twoside
piglit tests buggy?  We can fix *that*.

  OG.
Re: [Mesa-dev] [Intel-gfx] [PATCH 1/5] intel gen4/5: fix GL_VERTEX_PROGRAM_TWO_SIDE.
On Mon, Jul 16, 2012 at 08:43:17PM -0700, Paul Berry wrote:
> Can you split this into three separate patches?  That will make it easier
> to troubleshoot in case we find bugs with these patches in the future.

I'm going to try.

> Also, I'm not convinced that #3 is necessary.  Is there something in the
> spec that dictates this behaviour?  My reading of the spec is that if the
> vertex shader writes to gl_BackColor but not glFrontColor, then the
> contents of gl_Color in the fragment shader is undefined.

Given the number of security issues/information leaks that happen due
to reads out of place, I'm always extremely wary of reads from
nowhere.  So one pretty much has a choice between forcing a specific
value (like 0) or reading from someplace else that makes sense.  In
that particular case I considered reading from the other color slot
the easy way out.

> If we *do* decide that #3 is necessary, then I think a better way to
> accomplish it is to handle it in the GLSL vertex shader front-end, by
> replacing gl_BackColor with gl_FrontColor in cases where gl_FrontColor is
> not written to.  That way our special case code to handle this situation
> would be in just one place, rather than in three places (both fragment
> shader back-ends, and the SF program).  Also then the fix would apply to
> all hardware, not just Intel Gen4-5.

You'd have to switch off two-sided lighting too, but why not.

> Finally, I couldn't figure out what you meant by "the stray mov into
> lalaland".  Can you elaborate on which piece of code used to generate that
> stray mov, and why it doesn't anymore?  Thanks.

Looking at it again, I was wrong, it was protected.

  OG.
Re: [Mesa-dev] [PATCH 0/5] First batch of gm45 clipping/interpolation fixes
On Fri, Jul 13, 2012 at 02:45:10PM -0700, Kenneth Graunke wrote:
> Sorry...been really busy, and most of us haven't actually spent much if
> any time in the clipper shaders.  I'll try and review it within a week.

Ok cool, lack of time is something I completely understand :-)

> Despite the lack of response, I am really excited to see that you're
> working on this---this is a huge step toward bringing GL 3.x back to
> Gen4/5, and we're all really glad to see it happen!

Excellent.  I was starting to wonder if gen4/5 was abandoned (by lack
of resources if anything), nice to see it isn't.

  OG.
Re: [Mesa-dev] [PATCH] softpipe: Take all lods into account when texture sampling.
On Wed, Jun 20, 2012 at 08:33:38AM -0600, Brian Paul wrote:
> Yeah, I think it's pretty clear that we need to support per-pixel LOD
> selection.  For softpipe, Olivier's big patch looks good.

... and then nothing happened.  Ping?  The only code remark was a
whitespace issue on one line :-)

> For
> llvmpipe it's important to maintain performance for the common case
> where we compute LOD per quad but we'll also need new paths for
> per-pixel LOD.  Hopefully, the two paths can share some code.

I've been thinking, it looks reasonable to statically check whether
the lod/grad/bias is shared at the glsl level.  Then we could have
separate opcodes for the texturing variants for when we're sure things
are shared and when we aren't.  And pay the cost only when it is
needed.  Would that sound reasonable?

  OG.
Re: [Mesa-dev] [PATCH 2/2] mesa/st: Generates TGSI that always recognizes INSTANCEID/VERTEXID as integers.
On Thu, Jul 12, 2012 at 08:50:13PM +0100, jfons...@vmware.com wrote:
> From: José Fonseca
>
> Tested by running piglit draw-instanced, and by forcing llvmpipe advertise no
> native
> integer support, which now produces:

Looks like a very good solution to me.  Did you check
draw-non-instanced too?  51366 is a variant of the same issue.

  OG.
Re: [Mesa-dev] [PATCH 0/5] First batch of gm45 clipping/interpolation fixes
On Sat, Jun 30, 2012 at 08:50:10PM +0200, Olivier Galibert wrote:
> This is the first part of the fixes I've done to make my gm45 work
> correctly w.r.t clipping and interpolation.  There's a fair chance
> they work for everything gen 4/5, but I have no way to be sure.

So, not even one comment, nothing?

  OG.
Re: [Mesa-dev] Mesa (master): draw: fix flat shading and screen-space linear interpolation in clipper
On Wed, Jul 11, 2012 at 02:19:02PM +0200, Marek Olšák wrote:
> On Wed, Jul 11, 2012 at 1:09 PM, Jose Fonseca wrote:
> > My current plan is to:
> > - make it clear that INSTANCEID/VERTEXID always means integer
> > - require PIPE_SHADER_CAP_INTEGERS to be advertise in the vertex shader
> > stage in order to advertise INSTANCEID/VERTEXID in Mesa statetracker
> > - given that Mesa assumes integer, insert a I2F when loading
> > INSTANCEID/VERTEXID (this meets the new semantics while avoiding a big
> > re-architecture)
>
> The first two points sound good, but why I2F?  Note that softpipe fully
> supports integers while llvmpipe doesn't, and I2F after loading
> INSTANCEID would very likely break softpipe.

I think that would break llvmpipe too.  llvmpipe actually fully
supports integers, it only thinks it doesn't, at least according to
piglit (textureFetch is the only real remaining issue left for glsl
1.30).  And draw-instanced works perfectly well with native integer
llvmpipe (which is why I didn't see the problem before the bug
report).

  OG.
Re: [Mesa-dev] Mesa (master): draw: fix flat shading and screen-space linear interpolation in clipper
On Wed, Jul 11, 2012 at 12:51:32PM +0200, Marek Olšák wrote:
> Dude, you should really learn GLSL.  The idea to emulate integers is
> even older than the GLSL itself.  It first appeared in HLSL and NVIDIA
> Cg on hardware that wasn't even GL2-capable.

I'm learning 3.30+, which is what I consider useful now :-)  But that
makes it a little harder to remember what appeared when.

> From the GLSL 1.2 spec:
> "The uniform qualifier can be used with any of the basic data types,
> ...", then the section 4.1 lists the basic data types (like ivec4).

Fuck, damn.  Yes, we do have a problem.

  OG.
Re: [Mesa-dev] Mesa (master): draw: fix flat shading and screen-space linear interpolation in clipper
On Tue, Jul 10, 2012 at 09:19:05AM -0700, Stéphane Marchesin wrote:
> There is also option 3): revert the two patches causing the regression.

And then you'll have this problem again as soon as you want llvmpipe
to reach GL 3.00+/GLSL 1.30+.  So why not find a definitive solution
now?

Previous code converted the instance id to float, and said "it's an
integer guv', honest".  That does not fly in the face of native
integers, at all, unless you like your second instance to be numbered
1065353216.

  OG.
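The number 1065353216 quoted above is simply the bit pattern of the float 1.0 read back as an integer. A small stand-alone sketch, assuming IEEE 754 single precision; `float_bits` and `instance_id_seen_as_int` are illustrative names, not Mesa functions:

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Reinterpret the bits of a float as a 32-bit unsigned integer. */
static uint32_t float_bits(float f)
{
   uint32_t u;

   assert(sizeof(u) == sizeof(f));
   memcpy(&u, &f, sizeof(u));
   return u;
}

/* What a native-integer shader would read if the instance id had been
 * stored as a float and then treated as an integer without conversion.
 */
static uint32_t instance_id_seen_as_int(unsigned instance)
{
   return float_bits((float) instance);
}
```

Instance 0 survives by accident, since 0.0f is all zero bits. Instance 1 is stored as 0x3f800000, which is 1065353216 when read as an integer.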
Re: [Mesa-dev] Mesa (master): draw: fix flat shading and screen-space linear interpolation in clipper
On Tue, Jul 10, 2012 at 03:51:22PM +0200, Marek Olšák wrote:
> I just wanted to tell you Stephane's change cannot work and it even
> has no effect at the moment.  The native integer support is global in
> core Mesa.  It's because integer uniforms are converted to floats based
> on the global NativeInteger flag for all shader stages and that can't
> be fixed easily, because uniforms can be shared between shaders.
> Basically, all drivers must advertise integer support either for all
> shader stages or none.

Really?  I mean the idea here is that drivers like i915g which don't
have native integers in the fragger are going to advertise native
integers in the vs but stay at glsl 1.20.  Can you have integer
uniforms without 1.30+?  I don't think so.

  OG.
Re: [Mesa-dev] [PATCH 1/3] draw: draw_get_shader_param should return correct values WRT llvm
On Wed, Jul 04, 2012 at 01:59:44PM +0200, Marek Olšák wrote:
> Please disregard patch 1 and 2.  It wouldn't work.

What's wrong with them?

> I still plan to commit patch 3.

Patch 3 makes sense.  I probably should have done it like that in the
first place (learned a lot since :-).

  OG.
Re: [Mesa-dev] Mesa (master): draw: fix flat shading and screen-space linear interpolation in clipper
On Tue, Jul 03, 2012 at 12:39:47PM -0700, Jose Fonseca wrote:
> Note that all registers are stored as floats (for convenience, and
> because LLVM has no unions), so integers are bitcasted into floats
> while storing/loading.  And I'm not sure if your patch would break
> that.

I did test the patch with a llvmpipe in a glsl 120/no native integer
setup.  draw_instanced worked.  I didn't try a full piglit though.

> I still think that having draw/gallivm guessing whether native integer
> support is intended or not is bad.  Either:
> 1) TGSI is extended (e.g., more type annotations) so that native-integer
> support can inferred from it
> 2) draw/gallivm need to now if the driver has native-integer or not
>
> I'm inclined towards 1), as TGSI should be self-documented.  That is,
> it should not be necessary to know if the driver has or not native
> integer support to know whether system values should be assumed to
> be integers or floats...

It could be argued that dtype being TGSI_TYPE_FLOAT is the
documentation on what is expected.  But I'm quickly reaching the point
where I don't really care, just tell me what you want.  As long as
textureFetch stays the only issue between llvmpipe and 1.30 I'm ok.

Of course doing textureFetch right is going to require an interesting
overhaul of the texture allocations... need to finish fixing the gm45
interpolation/clipping first.

Best,

  OG.
Re: [Mesa-dev] Mesa (master): draw: fix flat shading and screen-space linear interpolation in clipper
On Mon, Jul 02, 2012 at 06:44:37AM -0700, Jose Fonseca wrote:
> But I think that this fix is too ad-hoc, and I suspect it may
> introduce other regressions.
>
> If I understood the problem correctly, the issue here is that some
> drivers want system values in floats, others want in
> integers.  Right?

It's slightly more perverted than that.  GLSL 1.20 says "if a value is
an integer, it will be forced into a float but don't expect more than
16 bits precision", while 1.30 has native integers.  Next to that, some
extensions (and gl versions) introduce integer system values but
require native integer support to have them implemented.  The glsl
parser handles that correctly by adding the needed type conversions
when accessing these values from a 1.20 shader.

But then in mesa someone decided to extend the extensions and
implement things like draw_instanced without native integer support.
st_glsl_to_tgsi behaves very differently when native integers aren't
there, forcing every type to float and ignoring the integer->float
type conversions.  What tells you that is that the requested type
(dtype) is float while the system value itself is integer.

In fact, I suspect the conversion code is ill-advised.  It was picked
up from the previous code, but actually it should only check that the
types are identical or that float is requested for an int, and bitch
otherwise.

Still, it would be interesting to know if that patch works for i915g,
even if we make things more cranky afterwards.

  OG.
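The stricter check suggested in the message above (accept identical types, accept float requested for an int, complain otherwise) could look roughly like the sketch below. It is only an illustration: `enum val_type` and `load_sysval` are hypothetical names, not the TGSI_TYPE_* constants or any draw/gallivm function, and registers are modelled as raw 32-bit patterns.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

enum val_type { TYPE_INT, TYPE_FLOAT };   /* hypothetical, not TGSI_TYPE_* */

/* Store a system value whose native type is stype into a 32-bit
 * register that the shader will read as dtype.
 * Returns 0 on success, -1 for a combination that should be rejected.
 */
static int load_sysval(uint32_t src_bits, enum val_type stype,
                       enum val_type dtype, uint32_t *reg)
{
   assert(reg != NULL);

   if (stype == dtype) {
      /* Same type on both sides: pass the bits through untouched. */
      *reg = src_bits;
      return 0;
   }

   if (stype == TYPE_INT && dtype == TYPE_FLOAT) {
      /* Float requested for an integer value: convert the value. */
      int32_t i;
      float f;

      memcpy(&i, &src_bits, sizeof(i));
      f = (float) i;
      memcpy(reg, &f, sizeof(f));
      return 0;
   }

   /* Anything else (e.g. int requested for a float) is a bug upstream. */
   return -1;
}
```

The first branch is the native-integer path, where instance 1 stays 1. The second is the no-native-integer path, where it becomes 1.0f.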
[Mesa-dev] [PATCH 5/5] intel gen4-5: Make noperspective clipping work.
At this point all interpolation tests with fixed clipping work. Signed-off-by: Olivier Galibert --- src/mesa/drivers/dri/i965/brw_clip.c |9 ++ src/mesa/drivers/dri/i965/brw_clip.h |1 + src/mesa/drivers/dri/i965/brw_clip_util.c | 133 ++--- 3 files changed, 132 insertions(+), 11 deletions(-) diff --git a/src/mesa/drivers/dri/i965/brw_clip.c b/src/mesa/drivers/dri/i965/brw_clip.c index 952eb4a..6bfdf24 100644 --- a/src/mesa/drivers/dri/i965/brw_clip.c +++ b/src/mesa/drivers/dri/i965/brw_clip.c @@ -234,6 +234,15 @@ brw_upload_clip_prog(struct brw_context *brw) break; } } + key.has_noperspective_shading = 0; + for (i = 0; i < BRW_VERT_RESULT_MAX; i++) { + if (brw_get_interpolation_mode(brw, i) == INTERP_QUALIFIER_NOPERSPECTIVE && + brw->vs.prog_data->vue_map.slot_to_vert_result[i] != VERT_RESULT_HPOS) { + key.has_noperspective_shading = 1; + break; + } + } + key.pv_first = (ctx->Light.ProvokingVertex == GL_FIRST_VERTEX_CONVENTION); brw_copy_interpolation_modes(brw, key.interpolation_mode); /* _NEW_TRANSFORM (also part of VUE map)*/ diff --git a/src/mesa/drivers/dri/i965/brw_clip.h b/src/mesa/drivers/dri/i965/brw_clip.h index 0ea0394..2a7245a 100644 --- a/src/mesa/drivers/dri/i965/brw_clip.h +++ b/src/mesa/drivers/dri/i965/brw_clip.h @@ -47,6 +47,7 @@ struct brw_clip_prog_key { GLuint primitive:4; GLuint nr_userclip:4; GLuint has_flat_shading:1; + GLuint has_noperspective_shading:1; GLuint pv_first:1; GLuint do_unfilled:1; GLuint fill_cw:2; /* includes cull information */ diff --git a/src/mesa/drivers/dri/i965/brw_clip_util.c b/src/mesa/drivers/dri/i965/brw_clip_util.c index 7b0205a..5bdcef8 100644 --- a/src/mesa/drivers/dri/i965/brw_clip_util.c +++ b/src/mesa/drivers/dri/i965/brw_clip_util.c @@ -129,6 +129,8 @@ static void brw_clip_project_vertex( struct brw_clip_compile *c, /* Interpolate between two vertices and put the result into a0.0. * Increment a0.0 accordingly. + * + * Beware that dest_ptr can be equal to v0_ptr. 
*/ void brw_clip_interp_vertex( struct brw_clip_compile *c, struct brw_indirect dest_ptr, @@ -138,8 +140,9 @@ void brw_clip_interp_vertex( struct brw_clip_compile *c, bool force_edgeflag) { struct brw_compile *p = &c->func; - struct brw_reg tmp = get_tmp(c); - GLuint slot; + struct brw_context *brw = p->brw; + struct brw_reg tmp, t_nopersp, v0_ndc_copy; + GLuint slot, delta; /* Just copy the vertex header: */ @@ -148,13 +151,119 @@ void brw_clip_interp_vertex( struct brw_clip_compile *c, * back on Ironlake, so needn't change it */ brw_copy_indirect_to_indirect(p, dest_ptr, v0_ptr, 1); - - /* Iterate over each attribute (could be done in pairs?) + + /* +* First handle the 3D and NDC positioning, in case we need +* noperspective interpolation. Doing it early has no performance +* impact in any case. +*/ + + /* Start by picking up the v0 NDC coordinates, because that vertex +* may be shared with the destination. +*/ + if (c->key.has_noperspective_shading) { + v0_ndc_copy = get_tmp(c); + brw_MOV(p, v0_ndc_copy, deref_4f(v0_ptr, + brw_vert_result_to_offset(&c->vue_map, + BRW_VERT_RESULT_NDC))); + } + + /* +* Compute the new 3D position +*/ + + delta = brw_vert_result_to_offset(&c->vue_map, VERT_RESULT_HPOS); + tmp = get_tmp(c); + brw_MUL(p, + vec4(brw_null_reg()), + deref_4f(v1_ptr, delta), + t0); + + brw_MAC(p, + tmp, + negate(deref_4f(v0_ptr, delta)), + t0); + + brw_ADD(p, + deref_4f(dest_ptr, delta), + deref_4f(v0_ptr, delta), + tmp); + release_tmp(c, tmp); + + /* Then recreate the projected (NDC) coordinate in the new vertex +* header */ + brw_clip_project_vertex(c, dest_ptr); + + /* +* If we have noperspective attributes, we now need to compute the +* screen-space t. 
+*/ + if (c->key.has_noperspective_shading) { + delta = brw_vert_result_to_offset(&c->vue_map, BRW_VERT_RESULT_NDC); + t_nopersp = get_tmp(c); + tmp = get_tmp(c); + + /* Build a register with coordinates from the second and new vertices */ + brw_MOV(p, t_nopersp, deref_4f(v1_ptr, delta)); + brw_MOV(p, tmp, deref_4f(dest_ptr, delta)); + brw_set_access_mode(p, BRW_ALIGN_16); + brw_MOV(p, + brw_writemask(t_nopersp, WRITEMASK_ZW), + brw_swizzle(tmp, 0,1,0,1)); + + /* Subtract the coordinates of the first vertex */ + brw_ADD(p, t_nopersp, t_nopersp, negate(brw_swizzle(v0_ndc_copy, 0,1,0,1))); + +
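As an aside on the "screen-space t" mentioned in the patch comments above: the clipper's `t0` is measured in clip space, while a noperspective attribute needs the interpolation factor as seen after the perspective divide. The sketch below shows that relationship in plain C. It is not the instruction sequence the patch emits; `struct vec4`, `lerp4`, `project` and `nopersp_t` are hypothetical names, and the projection onto the screen-space segment is one possible way to derive the factor.

```c
#include <assert.h>

struct vec4 { float x, y, z, w; };

/* Linear interpolation in clip space: v0 + t * (v1 - v0). */
static struct vec4 lerp4(struct vec4 v0, struct vec4 v1, float t)
{
   struct vec4 r;

   r.x = v0.x + t * (v1.x - v0.x);
   r.y = v0.y + t * (v1.y - v0.y);
   r.z = v0.z + t * (v1.z - v0.z);
   r.w = v0.w + t * (v1.w - v0.w);
   return r;
}

/* Perspective divide, clip space to NDC.  w holds 1/w afterwards. */
static struct vec4 project(struct vec4 v)
{
   struct vec4 r;

   assert(v.w != 0.0f);
   r.x = v.x / v.w;
   r.y = v.y / v.w;
   r.z = v.z / v.w;
   r.w = 1.0f / v.w;
   return r;
}

/* Given the clip-space factor t, return the factor that places the new
 * vertex at the same spot when interpolating linearly in screen space.
 */
static float nopersp_t(struct vec4 v0, struct vec4 v1, float t)
{
   struct vec4 n0 = project(v0);
   struct vec4 n1 = project(v1);
   struct vec4 nd = project(lerp4(v0, v1, t));
   float dx = n1.x - n0.x;
   float dy = n1.y - n0.y;
   float len2 = dx * dx + dy * dy;

   if (len2 == 0.0f)
      return 0.0f;   /* both ends project to the same point */

   return ((nd.x - n0.x) * dx + (nd.y - n0.y) * dy) / len2;
}
```

When both vertices have the same w the two factors coincide. With w = 1 and w = 3 at the ends, a clip-space t of 0.5 corresponds to a screen-space factor of 0.75, which is why reusing `t0` for noperspective attributes gives visibly wrong results.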
[Mesa-dev] [PATCH 4/5] intel gen4-5: Correctly handle flat vs. non-flat in the clipper.
At that point, all interpolation piglit tests involving fixed clipping work as long as there's no noperspective. Signed-off-by: Olivier Galibert --- src/mesa/drivers/dri/i965/brw_clip.c | 10 - src/mesa/drivers/dri/i965/brw_clip.h |6 +-- src/mesa/drivers/dri/i965/brw_clip_line.c |6 +-- src/mesa/drivers/dri/i965/brw_clip_tri.c | 20 - src/mesa/drivers/dri/i965/brw_clip_unfilled.c |2 +- src/mesa/drivers/dri/i965/brw_clip_util.c | 56 +++-- 6 files changed, 41 insertions(+), 59 deletions(-) diff --git a/src/mesa/drivers/dri/i965/brw_clip.c b/src/mesa/drivers/dri/i965/brw_clip.c index 52e8c47..952eb4a 100644 --- a/src/mesa/drivers/dri/i965/brw_clip.c +++ b/src/mesa/drivers/dri/i965/brw_clip.c @@ -215,7 +215,7 @@ brw_upload_clip_prog(struct brw_context *brw) struct intel_context *intel = &brw->intel; struct gl_context *ctx = &intel->ctx; struct brw_clip_prog_key key; - + int i; memset(&key, 0, sizeof(key)); brw_lookup_interpolation(brw); @@ -227,7 +227,13 @@ brw_upload_clip_prog(struct brw_context *brw) /* CACHE_NEW_VS_PROG (also part of VUE map) */ key.attrs = brw->vs.prog_data->outputs_written; /* _NEW_LIGHT */ - key.do_flat_shading = (ctx->Light.ShadeModel == GL_FLAT); + key.has_flat_shading = 0; + for (i = 0; i < BRW_VERT_RESULT_MAX; i++) { + if (brw_get_interpolation_mode(brw, i) == INTERP_QUALIFIER_FLAT) { + key.has_flat_shading = 1; + break; + } + } key.pv_first = (ctx->Light.ProvokingVertex == GL_FIRST_VERTEX_CONVENTION); brw_copy_interpolation_modes(brw, key.interpolation_mode); /* _NEW_TRANSFORM (also part of VUE map)*/ diff --git a/src/mesa/drivers/dri/i965/brw_clip.h b/src/mesa/drivers/dri/i965/brw_clip.h index 6f811ae..0ea0394 100644 --- a/src/mesa/drivers/dri/i965/brw_clip.h +++ b/src/mesa/drivers/dri/i965/brw_clip.h @@ -46,7 +46,7 @@ struct brw_clip_prog_key { GLbitfield64 interpolation_mode[2]; /* copy of the main context */ GLuint primitive:4; GLuint nr_userclip:4; - GLuint do_flat_shading:1; + GLuint has_flat_shading:1; GLuint pv_first:1; GLuint 
do_unfilled:1; GLuint fill_cw:2; /* includes cull information */ @@ -166,8 +166,8 @@ void brw_clip_kill_thread(struct brw_clip_compile *c); struct brw_reg brw_clip_plane_stride( struct brw_clip_compile *c ); struct brw_reg brw_clip_plane0_address( struct brw_clip_compile *c ); -void brw_clip_copy_colors( struct brw_clip_compile *c, - GLuint to, GLuint from ); +void brw_clip_copy_flatshaded_attributes( struct brw_clip_compile *c, + GLuint to, GLuint from ); void brw_clip_init_clipmask( struct brw_clip_compile *c ); diff --git a/src/mesa/drivers/dri/i965/brw_clip_line.c b/src/mesa/drivers/dri/i965/brw_clip_line.c index 6cf2bd2..729d8c0 100644 --- a/src/mesa/drivers/dri/i965/brw_clip_line.c +++ b/src/mesa/drivers/dri/i965/brw_clip_line.c @@ -271,11 +271,11 @@ void brw_emit_line_clip( struct brw_clip_compile *c ) brw_clip_line_alloc_regs(c); brw_clip_init_ff_sync(c); - if (c->key.do_flat_shading) { + if (c->key.has_flat_shading) { if (c->key.pv_first) - brw_clip_copy_colors(c, 1, 0); + brw_clip_copy_flatshaded_attributes(c, 1, 0); else - brw_clip_copy_colors(c, 0, 1); + brw_clip_copy_flatshaded_attributes(c, 0, 1); } clip_and_emit_line(c); diff --git a/src/mesa/drivers/dri/i965/brw_clip_tri.c b/src/mesa/drivers/dri/i965/brw_clip_tri.c index a29f8e0..71225f5 100644 --- a/src/mesa/drivers/dri/i965/brw_clip_tri.c +++ b/src/mesa/drivers/dri/i965/brw_clip_tri.c @@ -187,8 +187,8 @@ void brw_clip_tri_flat_shade( struct brw_clip_compile *c ) brw_IF(p, BRW_EXECUTE_1); { - brw_clip_copy_colors(c, 1, 0); - brw_clip_copy_colors(c, 2, 0); + brw_clip_copy_flatshaded_attributes(c, 1, 0); + brw_clip_copy_flatshaded_attributes(c, 2, 0); } brw_ELSE(p); { @@ -200,19 +200,19 @@ void brw_clip_tri_flat_shade( struct brw_clip_compile *c ) brw_imm_ud(_3DPRIM_TRIFAN)); brw_IF(p, BRW_EXECUTE_1); { - brw_clip_copy_colors(c, 0, 1); - brw_clip_copy_colors(c, 2, 1); + brw_clip_copy_flatshaded_attributes(c, 0, 1); + brw_clip_copy_flatshaded_attributes(c, 2, 1); } brw_ELSE(p); { - 
brw_clip_copy_colors(c, 1, 0); - brw_clip_copy_colors(c, 2, 0); + brw_clip_copy_flatshaded_attributes(c, 1, 0); + brw_clip_copy_flatshaded_attributes(c, 2, 0); } brw_ENDIF(p); } else { - brw_clip_copy_colors(c, 0, 2); - brw_clip_copy_colors(c, 1, 2); + brw_clip_copy_flatshaded_attributes(c, 0, 2); + brw_clip_copy_flatshaded_attributes(c, 1, 2); }
[Mesa-dev] [PATCH 3/5] intel gen4-5: Correctly setup the parameters in the sf.
This patch also corrects a couple of problems with noperspective
interpolation.

At this point all the glsl 1.1/1.3 interpolation tests that do not
clip pass (the -none ones).

The fs code does not use the pre-resolved interpolation modes in
order not to mess with gen6+.  Sharing the resolution would require
putting brw_wm_prog before brw_clip_prog and brw_sf_prog.  This may be
a good thing, but it could have unexpected consequences, so it's
better done independently in any case.

Signed-off-by: Olivier Galibert
---
 src/mesa/drivers/dri/i965/brw_fs.cpp         |    2 +-
 src/mesa/drivers/dri/i965/brw_fs_visitor.cpp |    5 +
 src/mesa/drivers/dri/i965/brw_sf.c           |    9 +-
 src/mesa/drivers/dri/i965/brw_sf.h           |    2 +-
 src/mesa/drivers/dri/i965/brw_sf_emit.c      |  164 +-
 5 files changed, 95 insertions(+), 87 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_fs.cpp b/src/mesa/drivers/dri/i965/brw_fs.cpp
index 710f2ff..b142f2b 100644
--- a/src/mesa/drivers/dri/i965/brw_fs.cpp
+++ b/src/mesa/drivers/dri/i965/brw_fs.cpp
@@ -506,7 +506,7 @@ fs_visitor::emit_general_interpolation(ir_variable *ir)
                struct brw_reg interp = interp_reg(location, k);
                emit_linterp(attr, fs_reg(interp), interpolation_mode,
                             ir->centroid);
-               if (intel->gen < 6) {
+               if (intel->gen < 6 && interpolation_mode == INTERP_QUALIFIER_SMOOTH) {
                   emit(BRW_OPCODE_MUL, attr, attr, this->pixel_w);
                }
             }
diff --git a/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp b/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp
index 9bd1e67..ab83a95 100644
--- a/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp
+++ b/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp
@@ -1924,6 +1924,11 @@ fs_visitor::emit_interpolation_setup_gen4()
    emit(BRW_OPCODE_ADD, this->delta_y[BRW_WM_PERSPECTIVE_PIXEL_BARYCENTRIC],
         this->pixel_y, fs_reg(negate(brw_vec1_grf(1, 1))));

+   this->delta_x[BRW_WM_NONPERSPECTIVE_PIXEL_BARYCENTRIC] =
+      this->delta_x[BRW_WM_PERSPECTIVE_PIXEL_BARYCENTRIC];
+   this->delta_y[BRW_WM_NONPERSPECTIVE_PIXEL_BARYCENTRIC] =
+      this->delta_y[BRW_WM_PERSPECTIVE_PIXEL_BARYCENTRIC];
+
    this->current_annotation = "compute pos.w and 1/pos.w";
    /* Compute wpos.w.  It's always in our setup, since it's needed to
     * interpolate the other attributes.
diff --git a/src/mesa/drivers/dri/i965/brw_sf.c b/src/mesa/drivers/dri/i965/brw_sf.c
index 0cc4fc7..85f5f51 100644
--- a/src/mesa/drivers/dri/i965/brw_sf.c
+++ b/src/mesa/drivers/dri/i965/brw_sf.c
@@ -139,6 +139,7 @@ brw_upload_sf_prog(struct brw_context *brw)
    struct brw_sf_prog_key key;
    /* _NEW_BUFFERS */
    bool render_to_fbo = _mesa_is_user_fbo(ctx->DrawBuffer);
+   int i;

    memset(&key, 0, sizeof(key));

@@ -191,7 +192,13 @@ brw_upload_sf_prog(struct brw_context *brw)
       key.sprite_origin_lower_left = true;

    /* _NEW_LIGHT */
-   key.do_flat_shading = (ctx->Light.ShadeModel == GL_FLAT);
+   key.has_flat_shading = 0;
+   for (i = 0; i < BRW_VERT_RESULT_MAX; i++) {
+      if (brw_get_interpolation_mode(brw, i) == INTERP_QUALIFIER_FLAT) {
+         key.has_flat_shading = 1;
+         break;
+      }
+   }
    key.do_twoside_color = (ctx->Light.Enabled && ctx->Light.Model.TwoSide) ||
       ctx->VertexProgram._TwoSideEnabled;
    brw_copy_interpolation_modes(brw, key.interpolation_mode);
diff --git a/src/mesa/drivers/dri/i965/brw_sf.h b/src/mesa/drivers/dri/i965/brw_sf.h
index 0a8135c..c718072 100644
--- a/src/mesa/drivers/dri/i965/brw_sf.h
+++ b/src/mesa/drivers/dri/i965/brw_sf.h
@@ -50,7 +50,7 @@ struct brw_sf_prog_key {
    uint8_t point_sprite_coord_replace;
    GLuint primitive:2;
    GLuint do_twoside_color:1;
-   GLuint do_flat_shading:1;
+   GLuint has_flat_shading:1;
    GLuint frontface_ccw:1;
    GLuint do_point_sprite:1;
    GLuint do_point_coord:1;
diff --git a/src/mesa/drivers/dri/i965/brw_sf_emit.c b/src/mesa/drivers/dri/i965/brw_sf_emit.c
index 9d8aa38..387685a 100644
--- a/src/mesa/drivers/dri/i965/brw_sf_emit.c
+++ b/src/mesa/drivers/dri/i965/brw_sf_emit.c
@@ -44,6 +44,17 @@


 /**
+ * Determine the vue slot corresponding to the given half of the given
+ * register.  half=0 means the first half of a register, half=1 means the
+ * second half.
+ */
+static inline int vert_reg_to_vue_slot(struct brw_sf_compile *c, GLuint reg,
+                                       int half)
+{
+   return (reg + c->urb_entry_read_offset) * 2 + half;
+}
+
+/**
  * Determine the vert_result corresponding to the given half of the given
  * register.  half=0 means the first half of a register, half=1 means the
  * second half.
@@ -51,11 +62,24 @@ static inline int vert_reg_to_vert_result(struct brw_sf_compile *c, GLuint reg,
                                           int half)
 {
[Mesa-dev] [PATCH 2/5] intel gen4-5: Compute the interpolation status for every variable in one place.
The program keys are updated accordingly, but the values are not used
yet.

Signed-off-by: Olivier Galibert
---
 src/mesa/drivers/dri/i965/brw_clip.c    |   82 ++-
 src/mesa/drivers/dri/i965/brw_clip.h    |    1 +
 src/mesa/drivers/dri/i965/brw_context.h |   59 ++
 src/mesa/drivers/dri/i965/brw_sf.c      |    3 +-
 src/mesa/drivers/dri/i965/brw_sf.h      |    1 +
 src/mesa/drivers/dri/i965/brw_wm.c      |    4 ++
 src/mesa/drivers/dri/i965/brw_wm.h      |    1 +
 7 files changed, 149 insertions(+), 2 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_clip.c b/src/mesa/drivers/dri/i965/brw_clip.c
index d411208..52e8c47 100644
--- a/src/mesa/drivers/dri/i965/brw_clip.c
+++ b/src/mesa/drivers/dri/i965/brw_clip.c
@@ -47,6 +47,83 @@
 #define FRONT_UNFILLED_BIT  0x1
 #define BACK_UNFILLED_BIT   0x2

+/**
+ * Lookup the interpolation mode information for every element in the
+ * vue.
+ */
+static void
+brw_lookup_interpolation(struct brw_context *brw)
+{
+   /* pprog means "previous program", i.e. the last program before the
+    * fragment shader.  It can only be the vertex shader for now, but
+    * it may be a geometry shader in the future.
+    */
+   const struct gl_program *pprog = &brw->vertex_program->Base;
+   const struct gl_fragment_program *fprog = brw->fragment_program;
+   struct brw_vue_map *vue_map = &brw->vs.prog_data->vue_map;
+
+   /* Default everything to INTERP_QUALIFIER_NONE */
+   brw_clear_interpolation_modes(brw);
+
+   /* If there is no fragment shader, interpolation won't be needed,
+    * so defaulting to none is good.
+    */
+   if (!fprog)
+      return;
+
+   for (int i = 0; i < vue_map->num_slots; i++) {
+      /* First lookup the vert result, skip if there isn't one */
+      int vert_result = vue_map->slot_to_vert_result[i];
+      if (vert_result == BRW_VERT_RESULT_MAX)
+         continue;
+
+      /* HPOS is special, it must be linear
+       */
+      if (vert_result == VERT_RESULT_HPOS) {
+         brw_set_interpolation_mode(brw, i, INTERP_QUALIFIER_NOPERSPECTIVE);
+         continue;
+      }
+
+      /* There is a 1-1 mapping of vert result to frag attrib except
+       * for BackColor and vars
+       */
+      int frag_attrib = vert_result;
+      if (vert_result >= VERT_RESULT_BFC0 && vert_result <= VERT_RESULT_BFC1)
+         frag_attrib = vert_result - VERT_RESULT_BFC0 + FRAG_ATTRIB_COL0;
+      else if(vert_result >= VERT_RESULT_VAR0)
+         frag_attrib = vert_result - VERT_RESULT_VAR0 + FRAG_ATTRIB_VAR0;
+
+      /* If the output is not used by the fragment shader, skip it. */
+      if (!(fprog->Base.InputsRead & BITFIELD64_BIT(frag_attrib)))
+         continue;
+
+      /* Lookup the interpolation mode */
+      enum glsl_interp_qualifier interpolation_mode = fprog->InterpQualifier[frag_attrib];
+
+      /* If the mode is not specified, then the default varies.  Color
+       * values follow the shader model, while all the rest uses
+       * smooth.
+       */
+      if (interpolation_mode == INTERP_QUALIFIER_NONE) {
+         if (frag_attrib >= FRAG_ATTRIB_COL0 && frag_attrib <= FRAG_ATTRIB_COL1)
+            interpolation_mode = brw->intel.ctx.Light.ShadeModel == GL_FLAT ? INTERP_QUALIFIER_FLAT : INTERP_QUALIFIER_SMOOTH;
+         else
+            interpolation_mode = INTERP_QUALIFIER_SMOOTH;
+      }
+
+      /* Finally, if we have both a front color and a back color for
+       * the same channel, the selection will be done before
+       * interpolation and the back color copied over the front color
+       * if necessary.  So interpolating the back color is
+       * unnecessary.
+       */
+      if (vert_result >= VERT_RESULT_BFC0 && vert_result <= VERT_RESULT_BFC1)
+         if (pprog->OutputsWritten & BITFIELD64_BIT(vert_result - VERT_RESULT_BFC0 + VERT_RESULT_COL0))
+            interpolation_mode = INTERP_QUALIFIER_NONE;
+
+      brw_set_interpolation_mode(brw, i, interpolation_mode);
+   }
+}

 static void compile_clip_prog( struct brw_context *brw,
                                struct brw_clip_prog_key *key )
@@ -141,6 +218,8 @@ brw_upload_clip_prog(struct brw_context *brw)

    memset(&key, 0, sizeof(key));

+   brw_lookup_interpolation(brw);
+
    /* Populate the key:
     */
    /* BRW_NEW_REDUCED_PRIMITIVE */
@@ -150,6 +229,7 @@ brw_upload_clip_prog(struct brw_context *brw)
    /* _NEW_LIGHT */
    key.do_flat_shading = (ctx->Light.ShadeModel == GL_FLAT);
    key.pv_first = (ctx->Light.ProvokingVertex == GL_FIRST_VERTEX_CONVENTION);
+   brw_copy_interpolation_modes(brw, key.interpolation_mode);
    /* _NEW_TRANSFORM (also part of VUE map)*/
    key.nr_userclip = _mesa_bitcount_64(ctx->Transform.ClipPlanesEnabled);

@@ -258,7 +338,7 @@ const struct brw_tracked_state brw_clip_prog = {
                 _NEW_TRANSFORM |
                 _NE
[Mesa-dev] [PATCH 1/5] intel gen4/5: fix GL_VERTEX_PROGRAM_TWO_SIDE.
There was... confusion about which register goes where.  With this
patch urb_setup is in line with the vue setup, even when these
annoying backcolor slots are used.  In addition, the stray mov into
lalaland is avoided when only one of the front/back slots is used and
the backface is looking at you.  The code instead picks whatever slot
was written to by the vertex shader.

That makes most of the generated piglit tests useless for testing the
backface selection, though.

Signed-off-by: Olivier Galibert
---
 src/mesa/drivers/dri/i965/brw_fs.cpp     |   18 +-
 src/mesa/drivers/dri/i965/brw_sf.c       |    3 +-
 src/mesa/drivers/dri/i965/brw_sf_emit.c  |   93 +-
 src/mesa/drivers/dri/i965/brw_wm_pass2.c |   19 +-
 4 files changed, 89 insertions(+), 44 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_fs.cpp b/src/mesa/drivers/dri/i965/brw_fs.cpp
index 6cef08a..710f2ff 100644
--- a/src/mesa/drivers/dri/i965/brw_fs.cpp
+++ b/src/mesa/drivers/dri/i965/brw_fs.cpp
@@ -721,8 +721,24 @@ fs_visitor::calculate_urb_setup()
          if (c->key.vp_outputs_written & BITFIELD64_BIT(i)) {
             int fp_index = _mesa_vert_result_to_frag_attrib((gl_vert_result) i);

+            /* Special case: two-sided vertex option, vertex program
+             * only writes to the back color.  Map it to the
+             * associated front color location.
+             */
+            if (i >= VERT_RESULT_BFC0 && i <= VERT_RESULT_BFC1 &&
+                ctx->VertexProgram._TwoSideEnabled &&
+                urb_setup[i - VERT_RESULT_BFC0 + FRAG_ATTRIB_COL0] == -1)
+               fp_index = i - VERT_RESULT_BFC0 + FRAG_ATTRIB_COL0;
+
+            /* The back color slot is skipped when the front color is
+             * also written to.  In addition, some slots can be
+             * written in the vertex shader and not read in the
+             * fragment shader.  So the register number must always be
+             * incremented, mapped or not.
+             */
             if (fp_index >= 0)
-               urb_setup[fp_index] = urb_next++;
+               urb_setup[fp_index] = urb_next;
+            urb_next++;
          }
       }

diff --git a/src/mesa/drivers/dri/i965/brw_sf.c b/src/mesa/drivers/dri/i965/brw_sf.c
index 23a874a..7867ab5 100644
--- a/src/mesa/drivers/dri/i965/brw_sf.c
+++ b/src/mesa/drivers/dri/i965/brw_sf.c
@@ -192,7 +192,8 @@ brw_upload_sf_prog(struct brw_context *brw)

    /* _NEW_LIGHT */
    key.do_flat_shading = (ctx->Light.ShadeModel == GL_FLAT);
-   key.do_twoside_color = (ctx->Light.Enabled && ctx->Light.Model.TwoSide);
+   key.do_twoside_color = (ctx->Light.Enabled && ctx->Light.Model.TwoSide) ||
+      ctx->VertexProgram._TwoSideEnabled;

    /* _NEW_POLYGON */
    if (key.do_twoside_color) {
diff --git a/src/mesa/drivers/dri/i965/brw_sf_emit.c b/src/mesa/drivers/dri/i965/brw_sf_emit.c
index ff6383b..9d8aa38 100644
--- a/src/mesa/drivers/dri/i965/brw_sf_emit.c
+++ b/src/mesa/drivers/dri/i965/brw_sf_emit.c
@@ -79,24 +79,9 @@ have_attr(struct brw_sf_compile *c, GLuint attr)
 /***
  * Twoside lighting
  */
-static void copy_bfc( struct brw_sf_compile *c,
-                      struct brw_reg vert )
-{
-   struct brw_compile *p = &c->func;
-   GLuint i;
-
-   for (i = 0; i < 2; i++) {
-      if (have_attr(c, VERT_RESULT_COL0+i) &&
-          have_attr(c, VERT_RESULT_BFC0+i))
-         brw_MOV(p,
-                 get_vert_result(c, vert, VERT_RESULT_COL0+i),
-                 get_vert_result(c, vert, VERT_RESULT_BFC0+i));
-   }
-}
-
-
 static void do_twoside_color( struct brw_sf_compile *c )
 {
+   GLuint i, need_0, need_1;
    struct brw_compile *p = &c->func;
    GLuint backface_conditional = c->key.frontface_ccw ? BRW_CONDITIONAL_G : BRW_CONDITIONAL_L;

@@ -105,12 +90,14 @@ static void do_twoside_color( struct brw_sf_compile *c )
    if (c->key.primitive == SF_UNFILLED_TRIS)
       return;

-   /* XXX: What happens if BFC isn't present?  This could only happen
-    * for user-supplied vertex programs, as t_vp_build.c always does
-    * the right thing.
+   /* If the vertex shader provides both front and backface color, do
+    * the selection.  Otherwise the generated code will pick up
+    * whichever there is.
     */
-   if (!(have_attr(c, VERT_RESULT_COL0) && have_attr(c, VERT_RESULT_BFC0)) &&
-       !(have_attr(c, VERT_RESULT_COL1) && have_attr(c, VERT_RESULT_BFC1)))
+   need_0 = have_attr(c, VERT_RESULT_COL0) && have_attr(c, VERT_RESULT_BFC0);
+   need_1 = have_attr(c, VERT_RESULT_COL1) && have_attr(c, VERT_RESULT_BFC1);
+
+   if (!need_0 && !need_1)
       return;

    /* Need to use BRW_EXECUTE_4 and also do an 4-wide compare in order
@@ -121,12 +108,15 @@ static void do_twoside_color( struct brw_sf_compile *c )
    brw_push
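The slot-numbering part of this patch (the register counter has to advance for every written output, mapped or not) can be illustrated with a small standalone C sketch. This is not the driver code: `assign_urb_slots`, `NUM_OUTPUTS` and the `out_to_attr` table are hypothetical stand-ins, chosen only to show why the increment moves out of the `if`.

```c
#include <stdint.h>

#define NUM_OUTPUTS 8   /* hypothetical size, for illustration only */

/* Give each fragment attribute the index of the register that holds it.
 * out_to_attr[i] is the fragment attribute fed by vertex output i, or -1
 * when the fragment shader has no use for that output.  urb_setup must
 * have room for every attribute index found in out_to_attr and is
 * expected to be pre-filled with -1.  Returns the number of registers
 * consumed.
 */
static int assign_urb_slots(uint64_t outputs_written,
                            const int out_to_attr[NUM_OUTPUTS],
                            int *urb_setup)
{
    int urb_next = 0;

    for (int i = 0; i < NUM_OUTPUTS; i++) {
        if (!(outputs_written & (UINT64_C(1) << i)))
            continue;
        if (out_to_attr[i] >= 0)
            urb_setup[out_to_attr[i]] = urb_next;
        /* The output occupies a register even when nothing reads it. */
        urb_next++;
    }
    return urb_next;
}
```

With outputs 0, 1 and 2 written and output 1 unread, the attribute fed by output 2 lands in register 2. Incrementing only inside the `if`, as the old code did, would have put it in register 1.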
[Mesa-dev] [PATCH 0/5] First batch of gm45 clipping/interpolation fixes
  Hi,

This is the first part of the fixes I've done to make my gm45 work
correctly w.r.t. clipping and interpolation.  There's a fair chance
they work for everything gen 4/5, but I have no way to be sure.

[PATCH 1/5] intel gen4-5: fix GL_VERTEX_PROGRAM_TWO_SIDE.
[PATCH 2/5] intel gen4-5: Compute the interpolation status for every
[PATCH 3/5] intel gen4-5: Correctly setup the parameters in the sf.
[PATCH 4/5] intel gen4-5: Correctly handle flat vs. non-flat in the
[PATCH 5/5] intel gen4-5: Make noperspective clipping work.

After this batch every piglit interpolation test involving no clipping
or fixed clipping passes.  Vertex clipping clearly never worked
(VERT_RESULT_CLIP_VERTEX is not used, so...) and clipdistance isn't
implemented.  These will be the topic of the second batch, whenever it
exists.

Best,

  OG.
Re: [Mesa-dev] Mesa (master): draw: fix flat shading and screen-space linear interpolation in clipper
On Fri, Jun 29, 2012 at 03:09:23PM -0700, Stéphane Marchesin wrote:
> I do, but it fixes a regression, so unless you have a fix, it's the way to
> go. If you have a fix I'll happily test it :)

Just between us, reverting on small regressions may not be optimal
long term on a project like mesa where the review/commit pipeline is
clogged.  The risk of losing developers is non-negligible.  The Linux
kernel can afford it because even if you miss a cycle you know that
you will have another one in two months, and there are a lot of
intermediate collation trees in which your patch can be tried out and
shaken for bugs (subsystem trees, -next, akpm patch tree, etc).  I'm
not sure Mesa can afford it.

That said, try this.

commit 56555c58d7f16c8d619c21feb23096155e2fb505
Author: Olivier Galibert
Date:   Sat Jun 30 00:41:20 2012 +0200

    lp_bld_tgsi_soa: Fix conversion of system values to float.

    Signed-off-by: Olivier Galibert

diff --git a/src/gallium/auxiliary/gallivm/lp_bld_tgsi_soa.c b/src/gallium/auxiliary/gallivm/lp_bld_tgsi_soa.c
index 55db561..f8df2bc 100644
--- a/src/gallium/auxiliary/gallivm/lp_bld_tgsi_soa.c
+++ b/src/gallium/auxiliary/gallivm/lp_bld_tgsi_soa.c
@@ -811,9 +811,10 @@ emit_fetch_system_value(
       break;
    }

+   /* Extend that when atype can happen to be float */
    if (atype != stype) {
       if (stype == TGSI_TYPE_FLOAT) {
-         res = LLVMBuildBitCast(builder, res, bld_base->base.vec_type, "");
+         res = lp_build_int_to_float(&bld_base->base, res);
       } else if (stype == TGSI_TYPE_UNSIGNED) {
          res = LLVMBuildBitCast(builder, res, bld_base->uint_bld.vec_type, "");
       } else if (stype == TGSI_TYPE_SIGNED) {
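For readers not familiar with the difference the patch above hinges on, here is a plain C analogy (it does not use the LLVM or gallivm APIs, and the two helper names are made up for the illustration). A bitcast keeps the 32 bits and changes only the type, while a conversion keeps the numeric value. The sketch assumes `float` is a 32-bit IEEE 754 type.

```c
#include <stdint.h>
#include <string.h>

/* Reinterpret the 32 bits of an integer as a float, which is what a
 * bitcast does.  Assumes float is 32-bit IEEE 754.
 */
static float bits_as_float(int32_t v)
{
    float f;

    memcpy(&f, &v, sizeof f);
    return f;
}

/* Convert the integer's numeric value to float, which is what an
 * int-to-float conversion does.
 */
static float value_as_float(int32_t v)
{
    return (float)v;
}
```

An integer system value of 3 stays 3.0 through `value_as_float`, but through `bits_as_float` it turns into a tiny denormal, which is the kind of garbage a float consumer would see with the bitcast.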
Re: [Mesa-dev] Mesa (master): draw: fix flat shading and screen-space linear interpolation in clipper
On Fri, Jun 29, 2012 at 12:52:06PM -0700, Stéphane Marchesin wrote:
> Yeah, but my question was more high level, whether the vertex id
> support required the previous refactor. It looks like it does though,
> and I don't want to untangle, so I'll revert both 3/4 and 4/4.

You realize that will re-break instanceID on llvmpipe for glsl > 120,
right?

  OG.
Re: [Mesa-dev] [PATCH] lp_build_lod_selector: Disable brilinear folding on explicit lod.
On Wed, Jun 27, 2012 at 03:17:05AM -0700, Jose Fonseca wrote:
> I took a look at the results, and it seems to me that brilinear
> code is fine -- the test is merely too strict, and does not forgive
> the gravitation towards integer lod that brilinear implements.

Yes, the current code maps [0,0.25] to 0, [0.25,0.75] to [0,1] and
[0.75,1] to 1.  So you need an error tolerance of 0.20, given that the
test is done on multiples of 0.2.

What are your criteria for deciding that a precision is "good enough"?

  OG.
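The 0.20 figure can be checked with a small C sketch of the staircase described in this message. The function names are made up, and the factor 2 and offset 0.5 are simply read off the quoted mapping; they are an assumption for the illustration, not values taken from `BRILINEAR_FACTOR` or the llvmpipe source.

```c
/* Staircase applied to the fractional part of the lod, following the
 * mapping given in the message: [0, 0.25] -> 0, [0.25, 0.75] -> [0, 1],
 * [0.75, 1] -> 1.
 */
static float staircase_fract(float fract)
{
    float t = 2.0f * fract - 0.5f;

    if (t < 0.0f)
        return 0.0f;
    if (t > 1.0f)
        return 1.0f;
    return t;
}

/* Absolute difference between the exact and the folded blend weight. */
static float staircase_error(float fract)
{
    float d = staircase_fract(fract) - fract;

    return d < 0.0f ? -d : d;
}
```

At the multiples of 0.2 the test uses, the error is 0.2 for fractions 0.2 and 0.8 and 0.1 for fractions 0.4 and 0.6, so a tolerance of 0.20 is the smallest that lets every sample through.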
Re: [Mesa-dev] [PATCH 2/3] llvmpipe: handle more PIPE_CAP_x queries
On Tue, Jun 26, 2012 at 02:46:01PM -0600, Brian Paul wrote:
> As with the previous commit for softpipe.
>
> v2: remove 'default' case to get compile-time warning
> ---
>  src/gallium/drivers/llvmpipe/lp_screen.c |   52 +++--
>  1 files changed, 48 insertions(+), 4 deletions(-)
>
> diff --git a/src/gallium/drivers/llvmpipe/lp_screen.c
> b/src/gallium/drivers/llvmpipe/lp_screen.c
> index 40037a5..e66737b 100644
> --- a/src/gallium/drivers/llvmpipe/lp_screen.c
> +++ b/src/gallium/drivers/llvmpipe/lp_screen.c
> +   case PIPE_CAP_GLSL_FEATURE_LEVEL:
> +      return 0;

Why not 120?

  OG.
Re: [Mesa-dev] [PATCH] lp_build_lod_selector: Disable brilinear folding on explicit lod.
On Mon, Jun 25, 2012 at 03:16:35PM -0700, Jose Fonseca wrote:
> Indeed lp_build_brilinear_lod is not faster than
> lp_build_ifloor_fract, but brilinear is faster, not because log is a
> faster approximation, but because it increases the odds that fract
> part is zero, which means that we can sample from a single mip
> level, instead of lerping between two mip levels.
>
> I think you have a good point here -- lp_build_brilinear_lod is a
> log2 approximation which is wrong here and that's a great catch, --
> but I have a point too: lp_build_ifloor_fract will slow down texture
> sampling.
>
> Just like log2 and brilinear log2, we need a variant of
> ifloor_fract, that increases the probability of fract part being
> zero, essentially by applying a stair case transformation like:

You can do that by multiplying by 'k', subtracting 0.5*k and clamping
to [0,1[.

The question is whether you really want to do something like that for
explicit lod, where the user supposedly knows exactly what he wants.
"textureLod" is not used often at all[1], so one can think that when
it's used you'd better do it precisely.

  OG.

[1] You see more uses of lod bias and/or textureGrad, the latter due
    to the use of conditionals.
Re: [Mesa-dev] [PATCH] lp_build_lod_selector: Disable brilinear folding on explicit lod.
On Mon, Jun 25, 2012 at 11:40:08AM -0700, Jose Fonseca wrote:
> My thoughts too.
>
> Brilinear filtering provides a significant boost, and I don't see why skip
> the optimization for explicit lod over implicit lods.

Warning: code misread :-)

Explicit lod does not need brilinear filtering because explicit lod is
post log2.  Brilinear is only about a faster log2, nothing else.
Explicit lod only needs the integer/fractional part separation.

The whole code is:

   if (mip_filter == PIPE_TEX_MIPFILTER_LINEAR) {
      if (!explicit_lod && !(gallivm_debug & GALLIVM_DEBUG_NO_BRILINEAR)) {
         lp_build_brilinear_lod(float_bld, lod, BRILINEAR_FACTOR,
                                out_lod_ipart, out_lod_fpart);
      }
      else {
         lp_build_ifloor_fract(float_bld, lod,
                               out_lod_ipart, out_lod_fpart);
      }

      lp_build_name(*out_lod_fpart, "lod_fpart");
   }
   else {
      *out_lod_ipart = lp_build_iround(float_bld, lod);
   }

and you're not going to tell me that lp_build_brilinear_lod is faster
than lp_build_ifloor_fract (especially since it includes it ;-)

  OG.
Re: [Mesa-dev] [PATCH] lp_build_lod_selector: Disable brilinear folding on explicit lod.
On Mon, Jun 25, 2012 at 07:31:20PM +0100, Roland Scheidegger wrote:
> Does this fix the test because lp_build_brilinear_lod produces bogus
> values in this case or just because the test is strict about such
> filtering optimizations? In the latter case I'm not sure I really see
> much point.

Bogus.  It does the fractional-part log2 approximation there, which
only makes sense if you called fast_log2 before (and even then the log
bias is going to be strangely applied, but meh).

> I'm surprised it can actually pass in either case since we drop all but
> the first lod per quad values on the floor anyway so I think you will
> get neither the right filtering weights between mipmaps nor even the
> right mip levels (if the integer part of the lod isn't the same) for
> anything but the first texel per quad.

Luck due to the design of the test.  It's rectangles with a fixed lod
value, so all the pixels of a quad have the same lod.  That's pretty
much why I cooked up miplevels-2 (only in vs though, it's much easier
there and the code is shared).

  OG.
Re: [Mesa-dev] [PATCH] llvmpipe: Remove the ARB_draw_instanced capability.
On Mon, Jun 25, 2012 at 05:34:25AM -0700, Jose Fonseca wrote:
> ----- Original Message -----
> > That capability requires integer handling and that's not yet active,
> > ending with a failure in draw-non-instanced unless you force it on.
> > See bug 51366.
> >
> > Frankly, I'd rather have that patch rejected and integer/glsl 130
> > capability activated instead.  There still are things missing, but
> > they mostly have their own extension anyway.  And the overall picture
> > ain't so bad.
>
> I'm personally also more interested in seeing llvmpipe to get the missing
> features for GLSL 1.30 / OGL 3.
>
> What's the overall picture of llvmpipe w/ integer/glsl 130? That is, how many
> piglit tests go from skipped to passed/failed?

To failed:

precision-05.vert
link-mismatch-layout-02
no-redeclaration-01.vert
feature-macro.vert
fs-exec-after-break
  - general failures, everybody has them

vs-clip-distance-bulk-assign
vs-clip-distance-inout-param
vs-clip-distance-out-param
vs-clip-distance-retval
  - haven't checked what the problem is, softpipe has it right

fs-isinf-vec2
fs-isinf-vec3
fs-isinf-vec4
vs-isinf-vec2
vs-isinf-vec3
vs-isinf-vec4
  - test is iffy

fs-texelFetch-2D
fs-texelFetchOffset-2D
  - no texelFetch support yet

fs-texture-sampler2dshadow-10
fs-texture-sampler2dshadow-11
  - dunno what's going on, softpipe fails it too

vs-attrib-ivec4-implied
vs-attrib-ivec4-precision
vs-attrib-uvec4-implied
vs-attrib-uvec4-precision
  - use glVertexAttribIPointer, which is GL 3.0+ only

vs-textureLod-miplevels
  - issue with vertex shader invalidation when sampler mode changes
    (as in, it's not done)

vs-textureLod-miplevels-2
  - you know that one, it's nowhere near fixed yet (the softpipe patch
    is awaiting review too :-)

texel-offset-limits
  - no limits defined in lp_screen.c, dunno whether texture() would
    take it into account either

To pass:

1503 total, it seems, you can be sure I'm not going to list them :-)

Best,

  OG.
[Mesa-dev] [PATCH] lp_build_lod_selector: Disable brilinear folding on explicit lod.
Brilinear folding must only be used if the log2 was computed with
brilinear too.  Fixes fs-textureLod-miplevels.

Signed-off-by: Olivier Galibert
---
 src/gallium/auxiliary/gallivm/lp_bld_sample.c |    2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/src/gallium/auxiliary/gallivm/lp_bld_sample.c b/src/gallium/auxiliary/gallivm/lp_bld_sample.c
index d966788..9deda61 100644
--- a/src/gallium/auxiliary/gallivm/lp_bld_sample.c
+++ b/src/gallium/auxiliary/gallivm/lp_bld_sample.c
@@ -513,7 +513,7 @@ lp_build_lod_selector(struct lp_build_sample_context *bld,
    }

    if (mip_filter == PIPE_TEX_MIPFILTER_LINEAR) {
-      if (!(gallivm_debug & GALLIVM_DEBUG_NO_BRILINEAR)) {
+      if (!explicit_lod && !(gallivm_debug & GALLIVM_DEBUG_NO_BRILINEAR)) {
          lp_build_brilinear_lod(float_bld, lod, BRILINEAR_FACTOR,
                                 out_lod_ipart, out_lod_fpart);
       }
--
1.7.10.280.gaa39
[Mesa-dev] [PATCH] u2f_emit: Fix type parameter in LLVM call.
The type is the destination type (i.e. float vector) and not the
source type.  Fixes piglit fs-{in,de}crement-uint.

Signed-off-by: Olivier Galibert
---
 src/gallium/auxiliary/gallivm/lp_bld_tgsi_action.c |    2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/src/gallium/auxiliary/gallivm/lp_bld_tgsi_action.c b/src/gallium/auxiliary/gallivm/lp_bld_tgsi_action.c
index cbc5945..17f288f 100644
--- a/src/gallium/auxiliary/gallivm/lp_bld_tgsi_action.c
+++ b/src/gallium/auxiliary/gallivm/lp_bld_tgsi_action.c
@@ -693,7 +693,7 @@ u2f_emit(
 {
    emit_data->output[emit_data->chan] = LLVMBuildUIToFP(bld_base->base.gallivm->builder,
                                                         emit_data->args[0],
-                                                        bld_base->uint_bld.vec_type, "");
+                                                        bld_base->base.vec_type, "");
 }

 static void
--
1.7.10.280.gaa39
[Mesa-dev] Comparison of llvmpipe with 2.9 and 3.1
  Hi,

I've just finished two piglit runs of llvmpipe with glsl 1.40 and gl
3.1 forced on, one with LLVM 2.9, the other with 3.1.  The least we
can say is that there aren't many differences.

- fp-indirections2, didn't have the patience to wait to see whether it
  would eventually stop.  Looks like something quadratic or worse in
  the LLVM optimizers.

- 17000-consecutive-chars-identifier, the memory corruption it creates
  behaved differently (probably due to the different glibc, it wasn't
  on the same box), causing a deadlock in malloc()

- texCombine fails on 3.1 only with:

Returncode: -5

Errors:
src/gallium/auxiliary/draw/draw_llvm.c:309:create_jit_vertex_header: Assertion `LLVMABISizeOfType(target, vertex_header) == __builtin_offsetof (struct vertex_header, data[data_elems])' failed.

Output:
--
GL_EXT_texture_env_combine verification test.
We only test a subset of all possible texture env combinations
because there's simply too many to exhaustively test them all.

So, in total, the story isn't bad.

Best,

  OG.
[Mesa-dev] [PATCH] builtin_variables: Only advertise gl_InstanceIDARB when GLSL handle integers.
It can be argued it makes no sense to advertise an integer system
variable in GLSL levels where integers aren't handled.

Signed-off-by: Olivier Galibert
---

I don't really know if that's a patch we want, but otoh having
gl_InstanceIDARB being a different type depending on the GLSL version
would be... weird.

diff --git a/src/glsl/builtin_variables.cpp b/src/glsl/builtin_variables.cpp
index 03b64c9..f9a341f 100644
--- a/src/glsl/builtin_variables.cpp
+++ b/src/glsl/builtin_variables.cpp
@@ -888,12 +888,13 @@ generate_ARB_draw_instanced_variables(exec_list *instructions,
                                       bool warn,
                                       _mesa_glsl_parser_targets target)
 {
-   /* gl_InstanceIDARB is only available in the vertex shader.
+   /* gl_InstanceIDARB is only available in the vertex shader, and
+    * only if the glsl level can handle integers.
     */
    if (target != vertex_shader)
       return;

-   if (state->ARB_draw_instanced_enable) {
+   if (state->ARB_draw_instanced_enable && state->language_version >= 130) {
       ir_variable *inst =
          add_variable(instructions, state->symbols,
                       "gl_InstanceIDARB", glsl_type::int_type,
Re: [Mesa-dev] [PATCH] softpipe: Do round-to-even, not round-up.
On Fri, May 18, 2012 at 08:55:39AM -0600, Brian Paul wrote:
> In any case, I think this function could be moved into u_math.c so it
> could be used elsewhere.
[...]
> I was looking at the GLSL round() and roundEven() functions.  The GLSL
> spec says round() can use whatever method is fastest.  But in
> builtin_functions.cpp the round() function is implemented in terms of
> the round_even builtin.  It seems to me that we should have a generic
> 'round' builtin function and separate TGSI_ROUND and TGSI_ROUND_EVEN
> opcodes so that drivers can really have the option of using a
> faster/looser round function.

I've tried doing that.  I've moved the function to u_math.c, then made
src/glsl/ir_constant_expression.cpp use it.  That blew up.

If I compile with scons, I get:

  Linking build/linux-x86_64-debug/glsl/builtin_compiler ...
build/linux-x86_64-debug/glsl/ir_constant_expression.o: In function `dot':
/home/galibert/X/work/mesa-play/src/glsl/ir_constant_expression.cpp:47: undefined reference to `_debug_assert_fail'
[...]
/home/galibert/X/work/mesa-play/src/glsl/ir_constant_expression.cpp:265: undefined reference to `ieee754_fp32_round_half_to_even'
[etc]

If I compile with autoconf/make I get:

ir_constant_expression.cpp:42:25: fatal error: util/u_math.h: No such file or directory

So at this point src/glsl and src/gallium are not supposed to meet
each other.  And changing that is not a responsibility I feel like
taking.

Any advice?

Best,

  OG.
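For reference, the round-half-to-even behaviour under discussion can be sketched in portable C as below. This is an illustration only: `sketch_round_half_to_even` is a made-up name, not the `ieee754_fp32_round_half_to_even` from the linker output above, and it is not claimed to match that function's implementation. It assumes 32-bit IEEE 754 floats and avoids libm on purpose.

```c
/* Round to the nearest integer, ties going to the even neighbour.
 * Assumes float is 32-bit IEEE 754.  The sign of zero is not
 * preserved (-0.5f gives +0.0f).
 */
static float sketch_round_half_to_even(float x)
{
    /* Floats with magnitude >= 2^23 have no fractional bits; NaN and
     * infinities also fail this range test and are returned unchanged.
     */
    if (!(x > -8388608.0f && x < 8388608.0f))
        return x;

    long whole = (long)x;              /* truncates toward zero */
    float frac = x - (float)whole;     /* exact, in (-1, 1) */
    float mag = frac < 0.0f ? -frac : frac;
    long step = frac < 0.0f ? -1 : 1;  /* direction away from zero */

    /* Move away from zero past the halfway point, or exactly at it
     * when the truncated value is odd.
     */
    if (mag > 0.5f || (mag == 0.5f && (whole & 1)))
        whole += step;
    return (float)whole;
}
```

So 0.5 and 2.5 round to 0 and 2, while 1.5 and 3.5 round to 2 and 4, which is where it differs from a round-half-up.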
[Mesa-dev] [PATCH] llvmpipe: Remove the ARB_draw_instanced capability.
That capability requires integer handling and that's not yet active,
ending with a failure in draw-non-instanced unless you force it on.
See bug 51366.

Frankly, I'd rather have that patch rejected and integer/glsl 130
capability activated instead.  There still are things missing, but
they mostly have their own extension anyway.  And the overall picture
ain't so bad.

Signed-off-by: Olivier Galibert

diff --git a/src/gallium/drivers/llvmpipe/lp_screen.c b/src/gallium/drivers/llvmpipe/lp_screen.c
index 40037a5..5eb826e 100644
--- a/src/gallium/drivers/llvmpipe/lp_screen.c
+++ b/src/gallium/drivers/llvmpipe/lp_screen.c
@@ -152,8 +152,6 @@ llvmpipe_get_param(struct pipe_screen *screen, enum pipe_cap param)
       return 1;
    case PIPE_CAP_DEPTH_CLIP_DISABLE:
       return 0;
-   case PIPE_CAP_TGSI_INSTANCEID:
-   case PIPE_CAP_VERTEX_ELEMENT_INSTANCE_DIVISOR:
    case PIPE_CAP_MIXED_COLORBUFFER_FORMATS:
    case PIPE_CAP_CONDITIONAL_RENDER:
       return 1;
[Mesa-dev] [PATCH] st_program.c: gl_ClipDistance must be interpolated in 3d space.
That old bug was hidden by the clipper always interpolating in 3d
space no matter what it should have been doing.  Now that the
interpolation has been fixed, the bug shows up.

Fixes bugzilla 51364.

Signed-off-by: Olivier Galibert

diff --git a/src/mesa/state_tracker/st_program.c b/src/mesa/state_tracker/st_program.c
index e6664fb..9f98298 100644
--- a/src/mesa/state_tracker/st_program.c
+++ b/src/mesa/state_tracker/st_program.c
@@ -569,12 +569,12 @@ st_translate_fragment_program(struct st_context *st,
          case FRAG_ATTRIB_CLIP_DIST0:
             input_semantic_name[slot] = TGSI_SEMANTIC_CLIPDIST;
             input_semantic_index[slot] = 0;
-            interpMode[slot] = TGSI_INTERPOLATE_LINEAR;
+            interpMode[slot] = TGSI_INTERPOLATE_PERSPECTIVE;
             break;
          case FRAG_ATTRIB_CLIP_DIST1:
             input_semantic_name[slot] = TGSI_SEMANTIC_CLIPDIST;
             input_semantic_index[slot] = 1;
-            interpMode[slot] = TGSI_INTERPOLATE_LINEAR;
+            interpMode[slot] = TGSI_INTERPOLATE_PERSPECTIVE;
             break;
             /* In most cases, there is nothing special about these
              * inputs, so adopt a convention to use the generic
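The two interpolation modes the patch switches between can be compared with a short C sketch along one edge. The function names are invented for the illustration and nothing here is taken from the state tracker or the clipper; it is just the textbook formulation, assuming positive clip-space w at both vertices.

```c
/* Screen-space linear interpolation ("noperspective"): t is the
 * position between the two vertices measured on screen.
 */
static float interp_linear(float a0, float a1, float t)
{
    return (1.0f - t) * a0 + t * a1;
}

/* Perspective-correct interpolation, i.e. linear in 3d space:
 * interpolate a/w and 1/w linearly on screen, then divide.  w0 and w1
 * are the clip-space w of the two vertices, assumed positive.
 */
static float interp_perspective(float a0, float w0,
                                float a1, float w1, float t)
{
    float num = (1.0f - t) * (a0 / w0) + t * (a1 / w1);
    float den = (1.0f - t) * (1.0f / w0) + t * (1.0f / w1);

    return num / den;
}
```

Halfway across the screen between a vertex with value 0 at w=1 and a vertex with value 1 at w=3, the linear result is 0.5 while the perspective-correct one is 0.25. The two only agree when both vertices have the same w.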
Re: [Mesa-dev] Mesa (master): draw: fix flat shading and screen-space linear interpolation in clipper
On Thu, Jun 21, 2012 at 10:28:22AM -0700, Jose Fonseca wrote:
> This patch series is causing regressions in select/feedback mode. Can you
> take a look?

Sure.  I wouldn't have expected that case to ever happen, but it makes
sense now that I think of it.

commit edc7b26b03c0393582ff5ec8c963207c7553850a
Author: Olivier Galibert
Date:   Thu Jun 21 19:37:11 2012 +0200

    clip_init_state: Handle the case when there isn't a fragment shader.

    Signed-off-by: Olivier Galibert

diff --git a/src/gallium/auxiliary/draw/draw_pipe_clip.c b/src/gallium/auxiliary/draw/draw_pipe_clip.c
index 2d36eb3..c02d0ef 100644
--- a/src/gallium/auxiliary/draw/draw_pipe_clip.c
+++ b/src/gallium/auxiliary/draw/draw_pipe_clip.c
@@ -586,6 +586,9 @@ clip_init_state( struct draw_stage *stage )
     * two outputs for one input, so we tuck the information in a
     * specific array.  Second if they don't have qualifiers, the
     * default value has to be picked from the global shade mode.
+    *
+    * Of course, if we don't have a fragment shader in the first
+    * place, defaults should be used.
     */

    /* First pick up the interpolation mode for
@@ -595,10 +598,12 @@ clip_init_state( struct draw_stage *stage )
    indexed_interp[0] = indexed_interp[1] = stage->draw->rasterizer->flatshade ?
       TGSI_INTERPOLATE_CONSTANT : TGSI_INTERPOLATE_PERSPECTIVE;

-   for (i = 0; i < fs->info.num_inputs; i++) {
-      if (fs->info.input_semantic_name[i] == TGSI_SEMANTIC_COLOR) {
-         if (fs->info.input_interpolate[i] != TGSI_INTERPOLATE_COLOR)
-            indexed_interp[fs->info.input_semantic_index[i]] = fs->info.input_interpolate[i];
+   if (fs) {
+      for (i = 0; i < fs->info.num_inputs; i++) {
+         if (fs->info.input_semantic_name[i] == TGSI_SEMANTIC_COLOR) {
+            if (fs->info.input_interpolate[i] != TGSI_INTERPOLATE_COLOR)
+               indexed_interp[fs->info.input_semantic_index[i]] = fs->info.input_interpolate[i];
+         }
       }
    }

@@ -627,12 +632,14 @@ clip_init_state( struct draw_stage *stage )
           */
          uint j;
          interp = TGSI_INTERPOLATE_PERSPECTIVE;
-         for (j = 0; j < fs->info.num_inputs; j++) {
-            if (vs->info.output_semantic_name[i] == fs->info.input_semantic_name[j] &&
-                vs->info.output_semantic_index[i] == fs->info.input_semantic_index[j]) {
-               interp = fs->info.input_interpolate[j];
-               break;
-            }
+         if (fs) {
+            for (j = 0; j < fs->info.num_inputs; j++) {
+               if (vs->info.output_semantic_name[i] == fs->info.input_semantic_name[j] &&
+                   vs->info.output_semantic_index[i] == fs->info.input_semantic_index[j]) {
+                  interp = fs->info.input_interpolate[j];
+                  break;
+               }
+            }
          }
       }

Re: [Mesa-dev] [PATCH] i965/msaa: Only do multisample rasterization if GL_MULTISAMPLE enabled.
On Thu, Jun 21, 2012 at 11:58:57AM +0200, Michel Dänzer wrote:
> On Thu, 2012-06-21 at 11:38 +0200, Olivier Galibert wrote:
> > On Thu, Jun 21, 2012 at 11:19:39AM +0200, Michel Dänzer wrote:
> > > On Tue, 2012-06-19 at 17:18 -0700, Kenneth Graunke wrote:
> > > > Also, distribute the appropriate emacs and vim settings to indent things
> > > > correctly.
> > >
> > > In any case, please do this *before* any kind of cleanup.
> >
> > (global-set-key [(control c) (s)] (lambda () (interactive) (setq
> > c-basic-offset 3 tab-width 8 indent-tabs-mode nil)))
>
> The point is to encode that in a file in the tree which is picked up
> automagically.

Errr, automagically running code coming from a repository without user
intervention is not usually considered smart, security-wise...

  OG.
Re: [Mesa-dev] [PATCH] i965/msaa: Only do multisample rasterization if GL_MULTISAMPLE enabled.
On Thu, Jun 21, 2012 at 11:19:39AM +0200, Michel Dänzer wrote:
> On Tue, 2012-06-19 at 17:18 -0700, Kenneth Graunke wrote:
> > Also, distribute the appropriate emacs and vim settings to indent things
> > correctly.
>
> In any case, please do this *before* any kind of cleanup.

(global-set-key [(control c) (s)] (lambda () (interactive) (setq c-basic-offset 3 tab-width 8 indent-tabs-mode nil)))

  OG.
Re: [Mesa-dev] [PATCH] softpipe: Take all lods into account when texture sampling.
On Wed, Jun 20, 2012 at 01:44:14PM +0100, Roland Scheidegger wrote:
> A lot of code I just glossed over it, but seems to look ok other than
> the (performance) implications this might have.

Actually, whether there's a performance implication is not obvious. In
practice the code just kicks the 4-pixel loop one or two function
calls higher. This unshares some tests, some function calls, and the
mip-size computation shifts. For normal texturing and on x86 the
tests are correctly predicted after the first one, and so are the
function calls, giving all of them a near-zero cost. So I'm not sure
the cost is really measurable.

With the actual vectorization the llvmpipe situation may be different
(not so sure with the aos texturing though).

Best,

  OG.
Re: [Mesa-dev] [PATCH] softpipe: Take all lods into account when texture sampling.
On Tue, Jun 19, 2012 at 02:46:35PM -0700, Jose Fonseca wrote:
> Could you give more background on why is this necessary?
>
> This will make software renderering slower, so I'd really like to avoid it on
> llvmpipe if at all possible.

Well, given the existence of textureLod and textureGrad, every texture
sample can easily hit a different mipmap or even, by switching between
minification and magnification, a different filter entirely. Even a
simple texture() is affected if your polygon is horizontal enough. And
this goes double for vertex shaders, where texture fetches have less
reason to be close in texture space.

textureSize and texelFetch, with their explicit lod, have of course
the same problem, only worse.

  OG.
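To make the point above concrete, here is a minimal sketch of what "taking the lod into account per pixel" means for a 2x2 quad. It is not softpipe's or llvmpipe's actual code: the names `filter_kind`, `lod_to_level`, `lod_to_filter` and `quad_levels` are made up for the illustration, and the rule "lod <= 0 means magnification" is a simplification of the GL one.

```c
/* Illustrative only: these names are hypothetical, not Mesa API. */
enum filter_kind { FILTER_MAG, FILTER_MIN };

/* Nearest mip level for one pixel: clamp the lod to the valid range,
 * then round to the nearest integer level. */
int lod_to_level(float lod, int last_level)
{
   if (lod < 0.0f)
      lod = 0.0f;
   if (lod > (float)last_level)
      lod = (float)last_level;
   return (int)(lod + 0.5f);   /* lod is non-negative here, so this rounds */
}

/* Simplified rule (assumption): lod <= 0 selects the magnification filter. */
enum filter_kind lod_to_filter(float lod)
{
   return lod <= 0.0f ? FILTER_MAG : FILTER_MIN;
}

/* Resolve level and filter independently for each of the 4 pixels of a
 * quad, instead of using one lod for the whole quad. */
void quad_levels(const float lod[4], int last_level,
                 int level[4], enum filter_kind filter[4])
{
   int j;
   for (j = 0; j < 4; j++) {
      level[j] = lod_to_level(lod[j], last_level);
      filter[j] = lod_to_filter(lod[j]);
   }
}
```

With per-pixel lods coming from textureLod, a single quad can end up with four different levels and both filters, which is why the per-quad shortcut breaks.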
[Mesa-dev] [PATCH 3/4] llvmpipe: Simplify and fix system variables fetch.
The system values array concept doesn't really work because it expects
the system values to be fixed per call, which is wrong for gl_VertexID
and iffy for gl_SampleID. So this patch does two things:
- kill the array, have emit_fetch_system_value directly pick the
  values it needs (only gl_InstanceID for now, as the previous code)
- correctly handle the expected type in emit_fetch_system_value

Signed-off-by: Olivier Galibert
Reviewed-by: Brian Paul
---
 src/gallium/auxiliary/draw/draw_llvm.c | 10 +--
 src/gallium/auxiliary/gallivm/lp_bld_tgsi.h | 11 +--
 src/gallium/auxiliary/gallivm/lp_bld_tgsi_soa.c | 88 +++
 src/gallium/drivers/llvmpipe/lp_state_fs.c |2 +-
 4 files changed, 33 insertions(+), 78 deletions(-)

diff --git a/src/gallium/auxiliary/draw/draw_llvm.c b/src/gallium/auxiliary/draw/draw_llvm.c index e1df2f1..8e787c5 100644 --- a/src/gallium/auxiliary/draw/draw_llvm.c +++ b/src/gallium/auxiliary/draw/draw_llvm.c @@ -459,7 +459,7 @@ generate_vs(struct draw_llvm *llvm, LLVMBuilderRef builder, LLVMValueRef (*outputs)[TGSI_NUM_CHANNELS], const LLVMValueRef (*inputs)[TGSI_NUM_CHANNELS], -LLVMValueRef system_values_array, +LLVMValueRef instance_id, LLVMValueRef context_ptr, struct lp_build_sampler_soa *draw_sampler, boolean clamp_vertex_color) @@ -491,7 +491,7 @@ generate_vs(struct draw_llvm *llvm, vs_type, NULL /*struct lp_build_mask_context *mask*/, consts_ptr, - system_values_array, + instance_id, NULL /*pos*/, inputs, outputs, @@ -1249,7 +1249,6 @@ draw_llvm_generate(struct draw_llvm *llvm, struct draw_llvm_variant *variant, LLVMValueRef stride, step, io_itr; LLVMValueRef io_ptr, vbuffers_ptr, vb_ptr; LLVMValueRef instance_id; - LLVMValueRef system_values_array; LLVMValueRef zero = lp_build_const_int32(gallivm, 0); LLVMValueRef one = lp_build_const_int32(gallivm, 1); struct draw_context *draw = llvm->draw; @@ -1340,9 +1339,6 @@ draw_llvm_generate(struct draw_llvm *llvm, struct draw_llvm_variant *variant, lp_build_context_init(&bld, gallivm, lp_type_int(32)); -
system_values_array = lp_build_system_values_array(gallivm, vs_info, - instance_id, NULL); - /* function will return non-zero i32 value if any clipped vertices */ ret_ptr = lp_build_alloca(gallivm, int32_type, ""); LLVMBuildStore(builder, zero, ret_ptr); @@ -1418,7 +1414,7 @@ draw_llvm_generate(struct draw_llvm *llvm, struct draw_llvm_variant *variant, builder, outputs, ptr_aos, - system_values_array, + instance_id, context_ptr, sampler, variant->key.clamp_vertex_color); diff --git a/src/gallium/auxiliary/gallivm/lp_bld_tgsi.h b/src/gallium/auxiliary/gallivm/lp_bld_tgsi.h index 141e799..c4e690c 100644 --- a/src/gallium/auxiliary/gallivm/lp_bld_tgsi.h +++ b/src/gallium/auxiliary/gallivm/lp_bld_tgsi.h @@ -205,7 +205,7 @@ lp_build_tgsi_soa(struct gallivm_state *gallivm, struct lp_type type, struct lp_build_mask_context *mask, LLVMValueRef consts_ptr, - LLVMValueRef system_values_array, + LLVMValueRef instance_id, const LLVMValueRef *pos, const LLVMValueRef (*inputs)[4], LLVMValueRef (*outputs)[4], @@ -225,13 +225,6 @@ lp_build_tgsi_aos(struct gallivm_state *gallivm, const struct tgsi_shader_info *info); -LLVMValueRef -lp_build_system_values_array(struct gallivm_state *gallivm, - const struct tgsi_shader_info *info, - LLVMValueRef instance_id, - LLVMValueRef facing); - - struct lp_exec_mask { struct lp_build_context *bld; @@ -388,7 +381,7 @@ struct lp_build_tgsi_soa_context */ LLVMValueRef inputs_array; - LLVMValueRef system_values_array; + LLVMValueRef instance_id; /** bitmask indicating which register files are accessed indirectly */ unsigned indirect_files; diff --git a/src/gallium/auxiliary/gallivm/lp_bld_tgsi_soa.c b/src/gallium/auxiliary/gallivm/lp_bld_tgsi_soa.c index 412dc0c..26be902 100644 --- a/src/gallium/auxiliary/gallivm/lp_bld_tgsi_soa.c +++ b/src/gallium/auxiliary/gallivm/lp_bld_tgsi_soa.c @@ -786,18 +786,37 @@ emit_fetch_system_value( { struct lp_build_tgsi_soa_context * bld = lp_soa_context(bld_base); struct gallivm_state *gallivm = 
bld->bld_base.base.gallivm; + const struct tgsi_shader_info *info = bld->bld_base.info; LLVMBuilde
[Mesa-dev] [PATCH 4/4] llvmpipe: Add vertex id support.
Signed-off-by: Olivier Galibert Reviewed-by: Brian Paul --- src/gallium/auxiliary/draw/draw_llvm.c | 32 ++- src/gallium/auxiliary/gallivm/lp_bld_tgsi.h | 13 +++-- src/gallium/auxiliary/gallivm/lp_bld_tgsi_soa.c | 11 +--- src/gallium/drivers/llvmpipe/lp_state_fs.c |5 +++- 4 files changed, 42 insertions(+), 19 deletions(-) diff --git a/src/gallium/auxiliary/draw/draw_llvm.c b/src/gallium/auxiliary/draw/draw_llvm.c index 8e787c5..e08221e 100644 --- a/src/gallium/auxiliary/draw/draw_llvm.c +++ b/src/gallium/auxiliary/draw/draw_llvm.c @@ -459,7 +459,7 @@ generate_vs(struct draw_llvm *llvm, LLVMBuilderRef builder, LLVMValueRef (*outputs)[TGSI_NUM_CHANNELS], const LLVMValueRef (*inputs)[TGSI_NUM_CHANNELS], -LLVMValueRef instance_id, +const struct lp_bld_tgsi_system_values *system_values, LLVMValueRef context_ptr, struct lp_build_sampler_soa *draw_sampler, boolean clamp_vertex_color) @@ -491,7 +491,7 @@ generate_vs(struct draw_llvm *llvm, vs_type, NULL /*struct lp_build_mask_context *mask*/, consts_ptr, - instance_id, + system_values, NULL /*pos*/, inputs, outputs, @@ -1248,7 +1248,6 @@ draw_llvm_generate(struct draw_llvm *llvm, struct draw_llvm_variant *variant, LLVMValueRef count, fetch_elts, fetch_count; LLVMValueRef stride, step, io_itr; LLVMValueRef io_ptr, vbuffers_ptr, vb_ptr; - LLVMValueRef instance_id; LLVMValueRef zero = lp_build_const_int32(gallivm, 0); LLVMValueRef one = lp_build_const_int32(gallivm, 1); struct draw_context *draw = llvm->draw; @@ -1270,6 +1269,9 @@ draw_llvm_generate(struct draw_llvm *llvm, struct draw_llvm_variant *variant, const unsigned pos = draw_current_shader_position_output(llvm->draw); const unsigned cv = draw_current_shader_clipvertex_output(llvm->draw); boolean have_clipdist = FALSE; + struct lp_bld_tgsi_system_values system_values; + + memset(&system_values, 0, sizeof(system_values)); arg_types[0] = get_context_ptr_type(llvm); /* context */ arg_types[1] = get_vertex_header_ptr_type(llvm); /* vertex_header */ @@ -1300,19 +1302,19 @@ 
draw_llvm_generate(struct draw_llvm *llvm, struct draw_llvm_variant *variant, LLVMAddAttribute(LLVMGetParam(variant_func, i), LLVMNoAliasAttribute); - context_ptr = LLVMGetParam(variant_func, 0); - io_ptr = LLVMGetParam(variant_func, 1); - vbuffers_ptr = LLVMGetParam(variant_func, 2); - stride = LLVMGetParam(variant_func, 5); - vb_ptr = LLVMGetParam(variant_func, 6); - instance_id = LLVMGetParam(variant_func, 7); + context_ptr = LLVMGetParam(variant_func, 0); + io_ptr= LLVMGetParam(variant_func, 1); + vbuffers_ptr = LLVMGetParam(variant_func, 2); + stride= LLVMGetParam(variant_func, 5); + vb_ptr= LLVMGetParam(variant_func, 6); + system_values.instance_id = LLVMGetParam(variant_func, 7); lp_build_name(context_ptr, "context"); lp_build_name(io_ptr, "io"); lp_build_name(vbuffers_ptr, "vbuffers"); lp_build_name(stride, "stride"); lp_build_name(vb_ptr, "vb"); - lp_build_name(instance_id, "instance_id"); + lp_build_name(system_values.instance_id, "instance_id"); if (elts) { fetch_elts = LLVMGetParam(variant_func, 3); @@ -1378,6 +1380,7 @@ draw_llvm_generate(struct draw_llvm *llvm, struct draw_llvm_variant *variant, lp_build_printf(builder, " --- io %d = %p, loop counter %d\n", io_itr, io, lp_loop.counter); #endif + system_values.vertex_id = lp_build_zero(gallivm, lp_type_uint_vec(32)); for (i = 0; i < TGSI_NUM_CHANNELS; ++i) { LLVMValueRef true_index = LLVMBuildAdd(builder, @@ -1395,7 +1398,10 @@ draw_llvm_generate(struct draw_llvm *llvm, struct draw_llvm_variant *variant, &true_index, 1, ""); true_index = LLVMBuildLoad(builder, fetch_ptr, "fetch_elt"); } - + + system_values.vertex_id = LLVMBuildInsertElement(gallivm->builder, + system_values.vertex_id, true_index, + lp_build_const_int32(gallivm, i), ""); for (j = 0; j < draw->pt.nr_vertex_elements; ++j) { struct pipe_vertex_element *velem = &draw->pt.vertex_element[j]; LLVMValueRef vb_index = @@ -1403,7 +1409,7 @@ draw_llvm_generate(struct draw_llvm *llvm, struc
[Mesa-dev] [PATCH 2/4] draw: fix flat shading and screen-space linear interpolation in clipper
This includes:
- correctly picking up which attributes are flatshaded and which are
  noperspective
- copying the flatshaded attributes when needed, including the
  non-built-in ones
- correctly interpolating the noperspective attributes in screen-space
  instead of in a 3d-correct fashion.

Signed-off-by: Olivier Galibert
Reviewed-by: Brian Paul
---
 src/gallium/auxiliary/draw/draw_pipe_clip.c | 144 +--
 1 file changed, 113 insertions(+), 31 deletions(-)

diff --git a/src/gallium/auxiliary/draw/draw_pipe_clip.c b/src/gallium/auxiliary/draw/draw_pipe_clip.c index 4da4d65..2d36eb3 100644 --- a/src/gallium/auxiliary/draw/draw_pipe_clip.c +++ b/src/gallium/auxiliary/draw/draw_pipe_clip.c @@ -39,6 +39,7 @@ #include "draw_vs.h" #include "draw_pipe.h" +#include "draw_fs.h" #ifndef IS_NEGATIVE @@ -56,11 +57,12 @@ struct clip_stage { struct draw_stage stage; /**< base class */ - /* Basically duplicate some of the flatshading logic here: -*/ - boolean flat; - uint num_color_attribs; - uint color_attribs[4]; /* front/back primary/secondary colors */ + /* List of the attributes to be flatshaded. */ + uint num_flat_attribs; + uint flat_attribs[PIPE_MAX_SHADER_OUTPUTS]; + + /* Mask of attributes in noperspective mode */ + boolean noperspective_attribs[PIPE_MAX_SHADER_OUTPUTS]; float (*plane)[4]; }; @@ -91,17 +93,16 @@ static void interp_attr( float dst[4], /** - * Copy front/back, primary/secondary colors from src vertex to dst vertex. - * Used when flat shading. + * Copy flat shaded attributes src vertex to dst vertex.
*/ -static void copy_colors( struct draw_stage *stage, -struct vertex_header *dst, -const struct vertex_header *src ) +static void copy_flat( struct draw_stage *stage, + struct vertex_header *dst, + const struct vertex_header *src ) { const struct clip_stage *clipper = clip_stage(stage); uint i; - for (i = 0; i < clipper->num_color_attribs; i++) { - const uint attr = clipper->color_attribs[i]; + for (i = 0; i < clipper->num_flat_attribs; i++) { + const uint attr = clipper->flat_attribs[i]; COPY_4FV(dst->data[attr], src->data[attr]); } } @@ -120,6 +121,7 @@ static void interp( const struct clip_stage *clip, const unsigned pos_attr = draw_current_shader_position_output(clip->stage.draw); const unsigned clip_attr = draw_current_shader_clipvertex_output(clip->stage.draw); unsigned j; + float t_nopersp; /* Vertex header. */ @@ -148,12 +150,36 @@ static void interp( const struct clip_stage *clip, dst->data[pos_attr][2] = pos[2] * oow * scale[2] + trans[2]; dst->data[pos_attr][3] = oow; } + + /** +* Compute the t in screen-space instead of 3d space to use +* for noperspective interpolation. +* +* The points can be aligned with the X axis, so in that case try +* the Y. When both points are at the same screen position, we can +* pick whatever value (the interpolated point won't be in front +* anyway), so just use the 3d t. 
+*/ + { + int k; + t_nopersp = t; + for (k = 0; k < 2; k++) + if (in->data[pos_attr][k] != out->data[pos_attr][k]) { +t_nopersp = (dst->data[pos_attr][k] - out->data[pos_attr][k]) / + (in->data[pos_attr][k] - out->data[pos_attr][k]); +break; + } + } /* Other attributes */ for (j = 0; j < nr_attrs; j++) { - if (j != pos_attr && j != clip_attr) -interp_attr(dst->data[j], t, in->data[j], out->data[j]); + if (j != pos_attr && j != clip_attr) { + if (clip->noperspective_attribs[j]) +interp_attr(dst->data[j], t_nopersp, in->data[j], out->data[j]); + else +interp_attr(dst->data[j], t, in->data[j], out->data[j]); + } } } @@ -406,14 +432,14 @@ do_clip_tri( struct draw_stage *stage, /* If flat-shading, copy provoking vertex color to polygon vertex[0] */ if (n >= 3) { - if (clipper->flat) { + if (clipper->num_flat_attribs) { if (stage->draw->rasterizer->flatshade_first) { if (inlist[0] != header->v[0]) { assert(tmpnr < MAX_CLIPPED_VERTICES + 1); if (tmpnr >= MAX_CLIPPED_VERTICES + 1) return; inlist[0] = dup_vert(stage, inlist[0], tmpnr++); - copy_colors(stage, inlist[0], header->v[0]); + copy_flat(stage, inlist[0], header->v[0]); } } else { @@ -422,7 +448,7 @@ do_clip_tri( struct draw_stage *stage, if (tmpnr >= MAX_CLIPPED_VERTICES + 1) return;
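The screen-space t computation in the hunk above can be read in isolation. The following is a standalone sketch of the same logic, pulled out of the draw module for clarity; the function name `nopersp_t` and its flat float-array parameters are assumptions made for the illustration, where the patch itself works on `dst->data[pos_attr]`, `in->data[pos_attr]` and `out->data[pos_attr]`.

```c
/* Standalone sketch of the patch's screen-space t computation.
 * in_pos/out_pos/dst_pos are the screen-space x,y of the inside vertex,
 * the outside vertex and the newly clipped vertex.  t is the 3d
 * interpolation factor, used as a fallback. */
float nopersp_t(const float in_pos[2], const float out_pos[2],
                const float dst_pos[2], float t)
{
   int k;

   /* Try x first; if both endpoints share the same x, try y. */
   for (k = 0; k < 2; k++) {
      if (in_pos[k] != out_pos[k])
         return (dst_pos[k] - out_pos[k]) / (in_pos[k] - out_pos[k]);
   }

   /* Both endpoints land on the same screen position: any value will
    * do, so keep the 3d t. */
   return t;
}
```

The result is what gets passed to interp_attr for attributes flagged in noperspective_attribs, while all the others keep using the 3d t.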
[Mesa-dev] [PATCH 1/4] softpipe: Offset is not to be applied to the layer parameter of array texture fetches.
Signed-off-by: Olivier Galibert
Reviewed-by: Brian Paul
---
 src/gallium/drivers/softpipe/sp_tex_sample.c |    4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/src/gallium/drivers/softpipe/sp_tex_sample.c b/src/gallium/drivers/softpipe/sp_tex_sample.c
index d4c0175..f29a6c7 100644
--- a/src/gallium/drivers/softpipe/sp_tex_sample.c
+++ b/src/gallium/drivers/softpipe/sp_tex_sample.c
@@ -2693,7 +2693,7 @@ sample_get_texels(struct tgsi_sampler *tgsi_sampler,
    case PIPE_TEXTURE_1D_ARRAY:
       for (j = 0; j < TGSI_QUAD_SIZE; j++) {
          int x = CLAMP(v_i[j] + offset[0], 0, width - 1);
-         int y = CLAMP(v_j[j] + offset[1], 0, layers - 1);
+         int y = CLAMP(v_j[j], 0, layers - 1);
          tx = get_texel_1d_array(samp, addr, x, y);
          for (c = 0; c < 4; c++) {
             rgba[c][j] = tx[c];
@@ -2715,7 +2715,7 @@ sample_get_texels(struct tgsi_sampler *tgsi_sampler,
       for (j = 0; j < TGSI_QUAD_SIZE; j++) {
          int x = CLAMP(v_i[j] + offset[0], 0, width - 1);
          int y = CLAMP(v_j[j] + offset[1], 0, height - 1);
-         int layer = CLAMP(v_k[j] + offset[2], 0, layers - 1);
+         int layer = CLAMP(v_k[j], 0, layers - 1);
          tx = get_texel_2d_array(samp, addr, x, y, layer);
          for (c = 0; c < 4; c++) {
             rgba[c][j] = tx[c];
-- 
1.7.10.280.gaa39
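The rule the patch enforces can be sketched in a few self-contained lines. This is an illustration, not the softpipe code: the struct `texel_coord` and the function `array_2d_fetch_coord` are hypothetical names, and the CLAMP macro is redefined locally so the snippet stands alone.

```c
/* Local stand-in for the CLAMP macro used in the patch. */
#define CLAMP(X, MIN, MAX) ((X) < (MIN) ? (MIN) : ((X) > (MAX) ? (MAX) : (X)))

/* Hypothetical helper type, for illustration only. */
struct texel_coord {
   int x, y, layer;
};

/* Coordinates for a texel fetch in a 2D array texture: the offset is
 * applied to the spatial coordinates only, never to the layer. */
struct texel_coord array_2d_fetch_coord(int i, int j, int k,
                                        const int offset[2],
                                        int width, int height, int layers)
{
   struct texel_coord tc;

   tc.x = CLAMP(i + offset[0], 0, width - 1);
   tc.y = CLAMP(j + offset[1], 0, height - 1);
   tc.layer = CLAMP(k, 0, layers - 1);   /* no offset here */
   return tc;
}
```

For a 1D array texture the same holds with the layer taking the place of y, which is the first hunk of the patch.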
[Mesa-dev] Ping: patches to apply
Hi,

They've been reviewed, and they've been changed where requested :-)

Best,

  OG.
Re: [Mesa-dev] Clarifications w.r.t MSAA
On Tue, Jun 12, 2012 at 01:50:08PM +0200, Christoph Bumiller wrote:
> > First question: how many depths should be computed, and for which
> > coordinates? Which of these values is associated with which sample?
>
> One for each sample point. The depth buffer will be multisampled as well.
> Coverage sampling (CSAA) where you have extra coverage samples that do
> NOT (necessarily) correspond to color sample locations are not covered
> by the GL spec, it's vendor-specific.

Ok. So that means that if the shader writes z, you have to do full
supersampling then.

> > Second question: how many samples should be shaded, and for which
> > coordinates? What is the impact of depth testing failure?
>
> As many as the user requested via glMinSampleShading, and the sample
> locations to choose seem to be up to the implementation.

Do you know what's usually expected? Center of collapsed samples, one
of the samples, center of the pixel?

> > Third question: what happens when a variable has a "sample" qualifier
> > in the fragment shader? Or "centroid"?
>
> "When interpolating variables declared using sample in when MULTISAMPLE
> is enabled, the fragment shader will be invoked separately for each (!)
> covered sample and the variable will be sampled at the corresponding
> sample point."

So a "sample" anywhere means full supersampling, ok.

Thanks,

  OG.
[Mesa-dev] Clarifications w.r.t MSAA
Hi all,

I'm getting a little lost in all the interactions between the
different parts of the GL standards and what I understand of the
expectations when it comes to MSAA. It would be nice if I could have
some clarifications. I'll start with what I think I understand (and
please correct me when I'm wrong) and add a number of questions. I'll
also ignore the "resolve" part, which isn't an issue (at least for me
:-).

MSAA is a variant on the supersampling theme where the coverage is
supersampled but depth, stencil and color may or may not be. The
destination buffer has enough space to store the full results of a
complete supersampling, but some of the values may be duplicated. The
variable MIN_SAMPLE_SHADING_VALUE allows the application to control
the minimum number of values that have to be computed. It can say for
instance that in a 16xMSAA case at least 4 samples per pixel are
required.

So let's take a case of 16xMSAA (say with the DX11 pattern) and let's
look at the pipeline. First the coverage is sampled for the 16 fixed
positions, leaving C active samples. Then there should be early depth
testing then shading, or the other way around, depending on the
shaders.

First question: how many depths should be computed, and for which
coordinates? Which of these values is associated with which sample?

Second question: how many samples should be shaded, and for which
coordinates? What is the impact of depth testing failure?

Third question: what happens when a variable has a "sample" qualifier
in the fragment shader? Or "centroid"?

Fourth question: how does gl_SampleMask interact with all that when
more than one sample is evaluated? And what does gl_SampleMaskIn look
like in the same case?

I hope you people can help me clarify all that stuff :-)

Best,

  OG.
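For the MIN_SAMPLE_SHADING_VALUE part, the arithmetic can be written down explicitly. The sketch below follows the formula from the ARB_sample_shading text, max(ceil(MIN_SAMPLE_SHADING_VALUE * samples), 1); the function name `min_shaded_samples` is made up for the illustration and is not a GL or Mesa entry point.

```c
/* Minimum number of unique shaded samples per pixel:
 *    max(ceil(min_sample_shading_value * samples), 1)
 * min_sample_shading_value is expected in [0, 1]. */
int min_shaded_samples(float min_sample_shading_value, int samples)
{
   float exact = min_sample_shading_value * (float)samples;
   int n = (int)exact;            /* truncates toward zero */

   if ((float)n < exact)
      n++;                        /* round up: ceil() for non-negative values */

   return n < 1 ? 1 : n;
}
```

With a value of 0.25 and 16 samples this gives the "at least 4 samples per pixel" case mentioned above; a value of 0 still requires one shaded sample.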
Re: [Mesa-dev] [PATCH 0/2] Add vertex id to llvmpipe.
On Fri, Jun 08, 2012 at 09:01:42AM -0700, Jose Fonseca wrote: > Oliver, > > There will be other system values in the future, so instead of passing every > value as a different parameter, please define a structure in > src/gallium/auxiliary/gallivm/lp_bld_tgsi.h as > > struct lp_bld_tgsi_system_values { > LLVMValueRef facing; > LLVMValueRef instance_id; > LLVMValueRef vertex_id; > ... > } > > which is then passed to lp_build_tgsi_soa and all other functions. > > Otherwise the change looks good overall. Something like that for the second part? OG. Author: Olivier Galibert Date: Fri Jun 1 22:58:58 2012 +0200 llvmpipe: Add vertex id support. Signed-off-by: Olivier Galibert diff --git a/src/gallium/auxiliary/draw/draw_llvm.c b/src/gallium/auxiliary/draw/draw_llvm.c index d5eb727..de495cf 100644 --- a/src/gallium/auxiliary/draw/draw_llvm.c +++ b/src/gallium/auxiliary/draw/draw_llvm.c @@ -456,7 +456,7 @@ generate_vs(struct draw_llvm *llvm, LLVMBuilderRef builder, LLVMValueRef (*outputs)[TGSI_NUM_CHANNELS], const LLVMValueRef (*inputs)[TGSI_NUM_CHANNELS], -LLVMValueRef instance_id, +const struct lp_bld_tgsi_system_values *system_values, LLVMValueRef context_ptr, struct lp_build_sampler_soa *draw_sampler, boolean clamp_vertex_color) @@ -488,7 +488,7 @@ generate_vs(struct draw_llvm *llvm, vs_type, NULL /*struct lp_build_mask_context *mask*/, consts_ptr, - instance_id, + system_values, NULL /*pos*/, inputs, outputs, @@ -1245,7 +1245,6 @@ draw_llvm_generate(struct draw_llvm *llvm, struct draw_llvm_variant *variant, LLVMValueRef count, fetch_elts, fetch_count; LLVMValueRef stride, step, io_itr; LLVMValueRef io_ptr, vbuffers_ptr, vb_ptr; - LLVMValueRef instance_id; LLVMValueRef zero = lp_build_const_int32(gallivm, 0); LLVMValueRef one = lp_build_const_int32(gallivm, 1); struct draw_context *draw = llvm->draw; @@ -1267,6 +1266,9 @@ draw_llvm_generate(struct draw_llvm *llvm, struct draw_llvm_variant *variant, const unsigned pos = draw_current_shader_position_output(llvm->draw); 
const unsigned cv = draw_current_shader_clipvertex_output(llvm->draw); boolean have_clipdist = FALSE; + struct lp_bld_tgsi_system_values system_values; + + memset(&system_values, 0, sizeof(system_values)); arg_types[0] = get_context_ptr_type(llvm); /* context */ arg_types[1] = get_vertex_header_ptr_type(llvm); /* vertex_header */ @@ -1297,19 +1299,19 @@ draw_llvm_generate(struct draw_llvm *llvm, struct draw_llvm_variant *variant, LLVMAddAttribute(LLVMGetParam(variant_func, i), LLVMNoAliasAttribute); - context_ptr = LLVMGetParam(variant_func, 0); - io_ptr = LLVMGetParam(variant_func, 1); - vbuffers_ptr = LLVMGetParam(variant_func, 2); - stride = LLVMGetParam(variant_func, 5); - vb_ptr = LLVMGetParam(variant_func, 6); - instance_id = LLVMGetParam(variant_func, 7); + context_ptr = LLVMGetParam(variant_func, 0); + io_ptr= LLVMGetParam(variant_func, 1); + vbuffers_ptr = LLVMGetParam(variant_func, 2); + stride= LLVMGetParam(variant_func, 5); + vb_ptr= LLVMGetParam(variant_func, 6); + system_values.instance_id = LLVMGetParam(variant_func, 7); lp_build_name(context_ptr, "context"); lp_build_name(io_ptr, "io"); lp_build_name(vbuffers_ptr, "vbuffers"); lp_build_name(stride, "stride"); lp_build_name(vb_ptr, "vb"); - lp_build_name(instance_id, "instance_id"); + lp_build_name(system_values.instance_id, "instance_id"); if (elts) { fetch_elts = LLVMGetParam(variant_func, 3); @@ -1375,6 +1377,7 @@ draw_llvm_generate(struct draw_llvm *llvm, struct draw_llvm_variant *variant, lp_build_printf(builder, " --- io %d = %p, loop counter %d\n", io_itr, io, lp_loop.counter); #endif + system_values.vertex_id = lp_build_zero(gallivm, lp_type_uint_vec(32)); for (i = 0; i < TGSI_NUM_CHANNELS; ++i) { LLVMValueRef true_index = LLVMBuildAdd(builder, @@ -1392,7 +1395,10 @@ draw_llvm_generate(struct draw_llvm *llvm, struct draw_llvm_variant *variant, &true_index, 1, ""); true_index = LLVMBuildLoad(builder, fetch_ptr, "fetch_elt"); } - + + system_values.vertex_id = 
LLVMBuildInsertElement(gallivm->builder, + system_values.vertex_id
Re: [Mesa-dev] [PATCH] glsl: Fix pi/2 constant in acos built-in function
On Tue, Jun 05, 2012 at 04:51:54PM -0700, Paul Berry wrote:
> The best idea I've got so far would be a shader_runner test with a fragment
> shader that computes dFdx(asin(x)), compares it to the theoretical closed
> form derivative of asin(x) (which is 1/sqrt(1-x^2)), and draws red pixels
> if the result is outside a certain error tolerance. We'd probably want to
> use a relative error (since the derivative of asin(x) can get quite large)
> and stop a bit shy of the endpoints where it goes to infinity.

Can't you take the perfectly reasonable hypothesis that the system's
asin is precise, and upload something like a 256x256 R32FG32FB32FA32F
texture with reference values? 262144 testing points should be good
enough :-)

And that's something that generalizes easily to all the functions you
may want to test on a segment.

  OG.
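A rough sketch of the host side of that idea, assuming the reference values are simply the function sampled at evenly spaced points of the segment. The helper `fill_reference` and the stand-in `ref_identity` are hypothetical names, not piglit code, and how the table is uploaded and compared in the shader is left out.

```c
#include <stddef.h>

/* Fill out[0..count-1] with f() sampled at evenly spaced points of
 * [lo, hi], endpoints included.  count must be at least 2. */
void fill_reference(float *out, size_t count, double lo, double hi,
                    double (*f)(double))
{
   size_t i;

   for (i = 0; i < count; i++) {
      double x = lo + (hi - lo) * (double)i / (double)(count - 1);
      out[i] = (float)f(x);
   }
}

/* Trivial stand-in reference function, only there to exercise the helper. */
double ref_identity(double x)
{
   return x;
}
```

Passing `asin` from `<math.h>` as `f`, with a table of 256 * 256 * 4 floats over [-1, 1], would give the 262144 reference values for the RGBA float texture suggested above; any other function on a segment works the same way.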
Re: [Mesa-dev] [PATCH] glsl: Fix pi/2 constant in acos built-in function
On Mon, Jun 04, 2012 at 03:23:34PM -0700, Paul Berry wrote:
> I'm not even kidding--I love this
> stuff and I'm jealous that I don't have time to work on it right now

Do you have a favorite method for Vandermonde matrix inversion?

  OG.
Re: [Mesa-dev] [PATCH] glsl: Fix pi/2 constant in acos built-in function
On Mon, Jun 04, 2012 at 01:11:13PM -0700, Ian Romanick wrote:
> From: Ian Romanick
>
> In single precision, 1.5707963 becomes 1.5707962513 which is too
> small. However, 1.5707964 becomes 1.5707963705 which is just right.
> The value 1.5707964 is already used in asin.ir.
>
> NOTE: This is a candidate for stable release branches.

If piglit stops bitching about that particular problem thanks to such a
small change, it's just beautiful.

Do we need a better precision atan, or should piglit just be told to
shut up? The shut-up patch was sent in ages ago, but I can't do the
"more precision" one if that's what's wanted.

  OG.
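The rounding behaviour quoted above is easy to check on the host. The snippet below is a small sketch, assuming IEEE 754 single-precision floats and a compiler that rounds decimal literals to nearest; the names `half_pi_old`, `half_pi_new`, `half_pi` and `print_candidates` are made up for the illustration.

```c
#include <stdio.h>

/* The two decimal constants discussed in the patch, stored as floats. */
const float half_pi_old = 1.5707963f;
const float half_pi_new = 1.5707964f;

/* pi/2 in double precision, for comparison. */
const double half_pi = 1.5707963267948966;

/* Print what the two constants actually become in single precision. */
void print_candidates(void)
{
   printf("old: %.10f\n", (double)half_pi_old);
   printf("new: %.10f\n", (double)half_pi_new);
   printf("ref: %.10f\n", half_pi);
}
```

The two literals land on adjacent floats: the old one ends up below pi/2, and the new one is the float closest to pi/2.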
[Mesa-dev] [PATCH 2/2] llvmpipe: Add vertex id support.
Signed-off-by: Olivier Galibert --- src/gallium/auxiliary/draw/draw_llvm.c | 10 -- src/gallium/auxiliary/gallivm/lp_bld_tgsi.h |3 ++- src/gallium/auxiliary/gallivm/lp_bld_tgsi_soa.c |7 +++ src/gallium/drivers/llvmpipe/lp_state_fs.c |2 +- 4 files changed, 18 insertions(+), 4 deletions(-) diff --git a/src/gallium/auxiliary/draw/draw_llvm.c b/src/gallium/auxiliary/draw/draw_llvm.c index d5eb727..71125ba 100644 --- a/src/gallium/auxiliary/draw/draw_llvm.c +++ b/src/gallium/auxiliary/draw/draw_llvm.c @@ -457,6 +457,7 @@ generate_vs(struct draw_llvm *llvm, LLVMValueRef (*outputs)[TGSI_NUM_CHANNELS], const LLVMValueRef (*inputs)[TGSI_NUM_CHANNELS], LLVMValueRef instance_id, +LLVMValueRef vertex_id, LLVMValueRef context_ptr, struct lp_build_sampler_soa *draw_sampler, boolean clamp_vertex_color) @@ -489,6 +490,7 @@ generate_vs(struct draw_llvm *llvm, NULL /*struct lp_build_mask_context *mask*/, consts_ptr, instance_id, + vertex_id, NULL /*pos*/, inputs, outputs, @@ -1245,7 +1247,7 @@ draw_llvm_generate(struct draw_llvm *llvm, struct draw_llvm_variant *variant, LLVMValueRef count, fetch_elts, fetch_count; LLVMValueRef stride, step, io_itr; LLVMValueRef io_ptr, vbuffers_ptr, vb_ptr; - LLVMValueRef instance_id; + LLVMValueRef instance_id, vertex_id; LLVMValueRef zero = lp_build_const_int32(gallivm, 0); LLVMValueRef one = lp_build_const_int32(gallivm, 1); struct draw_context *draw = llvm->draw; @@ -1375,6 +1377,7 @@ draw_llvm_generate(struct draw_llvm *llvm, struct draw_llvm_variant *variant, lp_build_printf(builder, " --- io %d = %p, loop counter %d\n", io_itr, io, lp_loop.counter); #endif + vertex_id = lp_build_zero(gallivm, lp_type_uint_vec(32)); for (i = 0; i < TGSI_NUM_CHANNELS; ++i) { LLVMValueRef true_index = LLVMBuildAdd(builder, @@ -1392,7 +1395,9 @@ draw_llvm_generate(struct draw_llvm *llvm, struct draw_llvm_variant *variant, &true_index, 1, ""); true_index = LLVMBuildLoad(builder, fetch_ptr, "fetch_elt"); } - + + vertex_id = LLVMBuildInsertElement(gallivm->builder, 
vertex_id, true_index, +lp_build_const_int32(gallivm, i), ""); for (j = 0; j < draw->pt.nr_vertex_elements; ++j) { struct pipe_vertex_element *velem = &draw->pt.vertex_element[j]; LLVMValueRef vb_index = @@ -1412,6 +1417,7 @@ draw_llvm_generate(struct draw_llvm *llvm, struct draw_llvm_variant *variant, outputs, ptr_aos, instance_id, + vertex_id, context_ptr, sampler, variant->key.clamp_vertex_color); diff --git a/src/gallium/auxiliary/gallivm/lp_bld_tgsi.h b/src/gallium/auxiliary/gallivm/lp_bld_tgsi.h index c4e690c..f87f899 100644 --- a/src/gallium/auxiliary/gallivm/lp_bld_tgsi.h +++ b/src/gallium/auxiliary/gallivm/lp_bld_tgsi.h @@ -206,6 +206,7 @@ lp_build_tgsi_soa(struct gallivm_state *gallivm, struct lp_build_mask_context *mask, LLVMValueRef consts_ptr, LLVMValueRef instance_id, + LLVMValueRef vertex_id, const LLVMValueRef *pos, const LLVMValueRef (*inputs)[4], LLVMValueRef (*outputs)[4], @@ -381,7 +382,7 @@ struct lp_build_tgsi_soa_context */ LLVMValueRef inputs_array; - LLVMValueRef instance_id; + LLVMValueRef instance_id, vertex_id; /** bitmask indicating which register files are accessed indirectly */ unsigned indirect_files; diff --git a/src/gallium/auxiliary/gallivm/lp_bld_tgsi_soa.c b/src/gallium/auxiliary/gallivm/lp_bld_tgsi_soa.c index 26be902..37599da 100644 --- a/src/gallium/auxiliary/gallivm/lp_bld_tgsi_soa.c +++ b/src/gallium/auxiliary/gallivm/lp_bld_tgsi_soa.c @@ -799,6 +799,11 @@ emit_fetch_system_value( atype = TGSI_TYPE_UNSIGNED; break; + case TGSI_SEMANTIC_VERTEXID: + res = bld->vertex_id; + atype = TGSI_TYPE_UNSIGNED; + break; + default: assert(!"unexpected semantic in emit_fetch_system_value"); res = bld_base->base.zero; @@ -1996,6 +2001,7 @@ lp_build_tgsi_soa(struct gallivm_state *gallivm, struct lp_build_mask_context *mask, LLVMValueRef consts_ptr, LLVMValueRef instance_id, + LLVMValueRef vertex_id, const LLVMValueRef *pos, const LLVMValueRef (*inp