Re: [Mesa-dev] ANNOUNCE: An open-source Vulkan driver for Intel hardware

2016-03-01 Thread Olivier Galibert
I can confirm tri/cube work with latest git.  Talos Principle refuses
to start because of missing vkCmdBeginQuery, time to jump into the
docs to see how much of gen8 is copy-able there.

  OG.


On Tue, Mar 1, 2016 at 10:28 AM, Jacek Konieczny <jaj...@jajcus.net> wrote:
> On 2016-03-01 10:10, Martin Peres wrote:
>>
>> On 29/02/16 20:48, Jason Ekstrand wrote:
>>>
>>> On Fri, Feb 26, 2016 at 2:18 AM, Olivier Galibert <galib...@pobox.com
>>> <mailto:galib...@pobox.com>> wrote:
>>>
>>> Ok, I can tell you that 3DSTATE_DEPTH_BUFFER and
>>> 3DSTATE_STENCIL_BUFFER seem perfectly correct (assuming the gem
>>> address-patching-in works for the depth buffer address). I'll see if
>>> I can find a past version that works.
>>>
>>>
>>> FYI, this hang has been fixed now and most of the demos work
>>> more-or-less.
>>> --Jason
>>
>>
> >> Just tried the vkcube with hsw and there is definitely an improvement
>> (the machine does not hard hang anymore) but vkcube now segfaults:
>
>
> For me both 'vkcube' and the 'cube' and 'tri' demos from
> LoaderAndValidationLayers work correctly with GIT revision
> 46b7c242da7c7c9ea7877a2c4b1fecdf5c1c0452.
>
> 'cube', 'tri' and most other Vulkan examples would cause GPU hang on
> earlier revisions, so the improvement is (was?) clear.
>
> Jacek
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] ANNOUNCE: An open-source Vulkan driver for Intel hardware

2016-03-01 Thread Olivier Galibert
Beware of path issues: vk* has no error checking and passes funky
values to the driver if it fails to find its extra files.

  OG.


On Tue, Mar 1, 2016 at 10:10 AM, Martin Peres <martin.pe...@free.fr> wrote:
> On 29/02/16 20:48, Jason Ekstrand wrote:
>
> On Fri, Feb 26, 2016 at 2:18 AM, Olivier Galibert <galib...@pobox.com>
> wrote:
>>
>> Ok, I can tell you that 3DSTATE_DEPTH_BUFFER and
>> 3DSTATE_STENCIL_BUFFER seem perfectly correct (assuming the gem
>> address-patching-in works for the depth buffer address).  I'll see if
>> I can find a past version that works.
>
>
> FYI, this hang has been fixed now and most of the demos work more-or-less.
> --Jason
>
>
> Just tried the vkcube with hsw and there is definitely an improvement (the
> machine does not hard hang anymore) but vkcube now segfaults:
>
> #0  0x75210f23 in anv_descriptor_set_create () from
> /usr/lib/libvulkan_intel.so
> #1  0x7521121d in anv_AllocateDescriptorSets () from
> /usr/lib/libvulkan_intel.so
> #2  0x004063b0 in ?? ()
> #3  0x004035df in ?? ()
> #4  0x00404986 in ?? ()
> #5  0x0040589f in ?? ()
> #6  0x76aa9710 in __libc_start_main () from /usr/lib/libc.so.6
> #7  0x00402e69 in ?? ()
>
> Is it supposed to?
>
> I will have a look at it tonight.
>
> Martin


Re: [Mesa-dev] ANNOUNCE: An open-source Vulkan driver for Intel hardware

2016-02-26 Thread Olivier Galibert
Ok, I can tell you that 3DSTATE_DEPTH_BUFFER and
3DSTATE_STENCIL_BUFFER seem perfectly correct (assuming the gem
address-patching-in works for the depth buffer address).  I'll see if
I can find a past version that works.

  OG.


On Wed, Feb 17, 2016 at 4:31 PM, Jason Ekstrand <ja...@jlekstrand.net> wrote:
> On Tue, Feb 16, 2016 at 11:22 PM, Olivier Galibert <galib...@pobox.com>
> wrote:
>>
>> I'm actually interested in how one goes about debugging that kind
>> of problem, if you have pointers.  I would have an idea or two on how
>> to go about it if it was in userspace only, but once it crosses into
>> the kernel I'm not sure what strategies are best.
>
>
> This is almost certainly a userspace problem.  I mentioned before that it's
> probably a depth/stencil problem.  I remember having similar problems a few
> months ago when I was reviving gen7.  I know that depth/stencil did work at
> some point.
>
> I would start by looking at where we emit the 3DSTATE_DEPTH_BUFFER and
> 3DSTATE_STENCIL_BUFFER and trying to see if we're setting something up
> wrong.  Sometimes it's just a matter of looking at the documentation and
> comparing the values we're setting to the docs and seeing if they make sense.
> That's where I'd start.
>
> You could also try to go back a little ways (don't go past the update to
> 1.0) to see if you can find a point where depth/stencil worked and try to
> bisect to find where it broke.  That may also provide hints as to what's
> going wrong.
>
> Hope that helps,
> --Jason
>
>>
>>
>> Best,
>>
>>   OG.
>>
>>
>> On Wed, Feb 17, 2016 at 2:51 AM, Jason Ekstrand <ja...@jlekstrand.net>
>> wrote:
>> > On Tue, Feb 16, 2016 at 1:21 PM, Olivier Galibert <galib...@pobox.com>
>> > wrote:
>> >>
>> >>   Hi,
>> >>
>> >> I'm getting gpu hangs with the lunarg examples (cube and tri) on my
>> >> Haswell (64 bits).  I attach /sys/class/drm/card0/error fwiw.  How
>> >> should I go about debugging that?
>> >
>> >
>> > It's a depth-stencil issue and we know about it.   The gen7 code needs
>> > some
>> > love.   I think Kristian and Jordan have been working on it.
>> > --Jason
>> >
>> >>
>> >>
>> >>   OG.
>> >>
>> >>
>> >> On Tue, Feb 16, 2016 at 4:19 PM, Jason Ekstrand <ja...@jlekstrand.net>
>> >> wrote:
>> >> > The Intel mesa team is pleased to announce a brand-new open-source
>> >> > Vulkan
>> >> > driver for Intel hardware.  We've been working hard on this over the
>> >> > course
>> >> > of the past year or so and are excited to finally share it with the
>> >> > community.  We will work on up-streaming the driver in the next few
>> >> > weeks
>> >> > and hope to have it all in place in time for mesa 11.3 (mesa 12?).
>> >> > In
>> >> > the
>> >> > mean time, the driver can be found in the "vulkan" branch of the mesa
>> >> > git
>> >> > repo on freedesktop.org:
>> >> >
>> >> > https://cgit.freedesktop.org/mesa/mesa/log/?h=vulkan
>> >> >
>> >> > More information on building the driver and running a few simple apps
>> >> > can
>> >> > be found on the 01.org web site:
>> >> >
>> >> >
>> >> >
>> >> > https://01.org/linuxgraphics/blogs/jekstrand/2016/open-source-vulkan-drivers-intel-hardware
>> >> >
>> >> > We have talked to people at Red Hat and Canonical and binaries
>> >> > should
>> >> > be
>> >> > available for Fedora and Ubuntu soon.  We will update the page on
>> >> > 01.org
>> >> > with links as soon as they are available.
>> >> >
>> >> > We have also created a small test suite called crucible which
>> >> > contains a
>> >> > few hundred tests (mostly for miptrees) that we created when bringing
>> >> > up
>> >> > the driver.  This isn't really intended to be the piglit of vulkan.
>> >> > With
>> >> > the CTS being publicly available, most cross-platform tests should go
>> >> > there.  We mostly made crucible so that we could write a few tests
>> >> > early
>> >> > on
>> >> > to get us going and for tests that were targeted specifically at our

Re: [Mesa-dev] ANNOUNCE: An open-source Vulkan driver for Intel hardware

2016-02-17 Thread Olivier Galibert
Ok, I'll do that, thanks :-)  No matter what, I'll learn interesting things.

  OG.


On Wed, Feb 17, 2016 at 4:31 PM, Jason Ekstrand <ja...@jlekstrand.net> wrote:
> On Tue, Feb 16, 2016 at 11:22 PM, Olivier Galibert <galib...@pobox.com>
> wrote:
>>
>> I'm actually interested in how one goes about debugging that kind
>> of problem, if you have pointers.  I would have an idea or two on how
>> to go about it if it was in userspace only, but once it crosses into
>> the kernel I'm not sure what strategies are best.
>
>
> This is almost certainly a userspace problem.  I mentioned before that it's
> probably a depth/stencil problem.  I remember having similar problems a few
> months ago when I was reviving gen7.  I know that depth/stencil did work at
> some point.
>
> I would start by looking at where we emit the 3DSTATE_DEPTH_BUFFER and
> 3DSTATE_STENCIL_BUFFER and trying to see if we're setting something up
> wrong.  Sometimes it's just a matter of looking at the documentation and
> comparing the values we're setting to the docs and seeing if they make sense.
> That's where I'd start.
>
> You could also try to go back a little ways (don't go past the update to
> 1.0) to see if you can find a point where depth/stencil worked and try to
> bisect to find where it broke.  That may also provide hints as to what's
> going wrong.
>
> Hope that helps,
> --Jason
>
>>
>>
>> Best,
>>
>>   OG.
>>
>>
>> On Wed, Feb 17, 2016 at 2:51 AM, Jason Ekstrand <ja...@jlekstrand.net>
>> wrote:
>> > On Tue, Feb 16, 2016 at 1:21 PM, Olivier Galibert <galib...@pobox.com>
>> > wrote:
>> >>
>> >>   Hi,
>> >>
>> >> I'm getting gpu hangs with the lunarg examples (cube and tri) on my
>> >> Haswell (64 bits).  I attach /sys/class/drm/card0/error fwiw.  How
>> >> should I go about debugging that?
>> >
>> >
>> > It's a depth-stencil issue and we know about it.   The gen7 code needs
>> > some
>> > love.   I think Kristian and Jordan have been working on it.
>> > --Jason
>> >
>> >>
>> >>
>> >>   OG.
>> >>
>> >>
>> >> On Tue, Feb 16, 2016 at 4:19 PM, Jason Ekstrand <ja...@jlekstrand.net>
>> >> wrote:
>> >> > The Intel mesa team is pleased to announce a brand-new open-source
>> >> > Vulkan
>> >> > driver for Intel hardware.  We've been working hard on this over the
>> >> > course
>> >> > of the past year or so and are excited to finally share it with the
>> >> > community.  We will work on up-streaming the driver in the next few
>> >> > weeks
>> >> > and hope to have it all in place in time for mesa 11.3 (mesa 12?).
>> >> > In
>> >> > the
>> >> > mean time, the driver can be found in the "vulkan" branch of the mesa
>> >> > git
>> >> > repo on freedesktop.org:
>> >> >
>> >> > https://cgit.freedesktop.org/mesa/mesa/log/?h=vulkan
>> >> >
>> >> > More information on building the driver and running a few simple apps
>> >> > can
>> >> > be found on the 01.org web site:
>> >> >
>> >> >
>> >> >
>> >> > https://01.org/linuxgraphics/blogs/jekstrand/2016/open-source-vulkan-drivers-intel-hardware
>> >> >
>> >> > We have talked to people at Red Hat and Canonical and binaries
>> >> > should
>> >> > be
>> >> > available for Fedora and Ubuntu soon.  We will update the page on
>> >> > 01.org
>> >> > with links as soon as they are available.
>> >> >
>> >> > We have also created a small test suite called crucible which
>> >> > contains a
>> >> > few hundred tests (mostly for miptrees) that we created when bringing
>> >> > up
>> >> > the driver.  This isn't really intended to be the piglit of vulkan.
>> >> > With
>> >> > the CTS being publicly available, most cross-platform tests should go
>> >> > there.  We mostly made crucible so that we could write a few tests
>> >> > early
>> >> > on
>> >> > to get us going and for tests that were targeted specifically at our
>> >> > implementation.  Nonetheless, they may prove useful to someone and
>> >> > we
>> >> > are
>> >

Re: [Mesa-dev] ANNOUNCE: An open-source Vulkan driver for Intel hardware

2016-02-16 Thread Olivier Galibert
I'm actually interested in how one goes about debugging that kind
of problem, if you have pointers.  I would have an idea or two on how
to go about it if it was in userspace only, but once it crosses into
the kernel I'm not sure what strategies are best.

Best,

  OG.


On Wed, Feb 17, 2016 at 2:51 AM, Jason Ekstrand <ja...@jlekstrand.net> wrote:
> On Tue, Feb 16, 2016 at 1:21 PM, Olivier Galibert <galib...@pobox.com>
> wrote:
>>
>>   Hi,
>>
>> I'm getting gpu hangs with the lunarg examples (cube and tri) on my
>> Haswell (64 bits).  I attach /sys/class/drm/card0/error fwiw.  How
>> should I go about debugging that?
>
>
> It's a depth-stencil issue and we know about it.   The gen7 code needs some
> love.   I think Kristian and Jordan have been working on it.
> --Jason
>
>>
>>
>>   OG.
>>
>>
>> On Tue, Feb 16, 2016 at 4:19 PM, Jason Ekstrand <ja...@jlekstrand.net>
>> wrote:
>> > The Intel mesa team is pleased to announce a brand-new open-source
>> > Vulkan
>> > driver for Intel hardware.  We've been working hard on this over the
>> > course
>> > of the past year or so and are excited to finally share it with the
>> > community.  We will work on up-streaming the driver in the next few
>> > weeks
>> > and hope to have it all in place in time for mesa 11.3 (mesa 12?).  In
>> > the
>> > mean time, the driver can be found in the "vulkan" branch of the mesa
>> > git
>> > repo on freedesktop.org:
>> >
>> > https://cgit.freedesktop.org/mesa/mesa/log/?h=vulkan
>> >
>> > More information on building the driver and running a few simple apps
>> > can
>> > be found on the 01.org web site:
>> >
>> >
>> > https://01.org/linuxgraphics/blogs/jekstrand/2016/open-source-vulkan-drivers-intel-hardware
>> >
>> > We have talked to people at Red Hat and Canonical and binaries should
>> > be
>> > available for Fedora and Ubuntu soon.  We will update the page on 01.org
>> > with links as soon as they are available.
>> >
>> > We have also created a small test suite called crucible which contains a
>> > few hundred tests (mostly for miptrees) that we created when bringing up
>> > the driver.  This isn't really intended to be the piglit of vulkan.
>> > With
>> > the CTS being publicly available, most cross-platform tests should go
>> > there.  We mostly made crucible so that we could write a few tests early
>> > on
>> > to get us going and for tests that were targeted specifically at our
>> > implementation.  Nonetheless, they may prove useful to someone and we
>> > are
>> > happy to share them.  The crucible source code can be found at
>> >
>> > https://cgit.freedesktop.org/mesa/crucible/
>> >
>> > Frequently Asked Questions:
>> >
>> > What all hardware does it support?
>> >
>> >The driver currently supports Sky Lake all the way back to Ivy
>> > Bridge.
>> >The driver is Vulkan 1.0 conformant for 64-bit builds on Sky Lake,
>> >Broadwell, and Braswell.  We are still having a couple of 32-bit
>> > issues
>> >and support for Haswell, Ivy Bridge, and Bay Trail should be
>> > considered
>> >experimental.
>> >
>> > How much code is shared between the Vulkan and GL drivers?
>> >
>> >For shaders, we're using a SPIR-V to NIR pass which is new, and a few
>> >new NIR lowering passes for things that we previously depended on
>> > GLSL
>> >IR to handle.  Beyond that, we're using the same core NIR and the
>> > same
>> >back-end compiler that we have for GL.  We're carrying a few patches
>> >against the back-end compiler, but the delta is very small and it's
>> > all
>> >stuff that we eventually want to do for GL anyway.
>> >
>> >The main API handling and state setup code is all new and written
>> > from
>> >the ground-up for Vulkan.  For actually packing hardware packets, we
>> > are
>> >using a codegen system that Kristian developed early on in the
>> > project
>> >that's based on an XML description of the hardware packets.  The
>> > result
>> >is state setup code that's both easier to work with and maybe even a
>> >little more efficient than what we have in mesa today.
>> >
>> >We also have a brand-new surface layout library called ISL that
>>

Re: [Mesa-dev] [PATCH v6] nir: Add an ALU op builder kind of like ir_builder.h

2015-02-17 Thread Olivier Galibert
  Hi,

I thought mesa was C++ by now?  That API is really C-ish.

  OG.


On Wed, Feb 18, 2015 at 2:12 AM, Kenneth Graunke kenn...@whitecape.org wrote:
 On Friday, February 06, 2015 04:00:10 PM Eric Anholt wrote:
 v2: Rebase on the nir_opcodes.h python code generation support.
 v3: Use SSA values, and set an appropriate writemask on dot products.
 v4: Make the arguments be SSA references as well.  This lets you stack up
 expressions in the arguments of other expressions, at the cost of
 having to insert a fmov/imov if you want to swizzle.  Also, add
 the generated file to NIR_GENERATED_FILES.
 v5: Use more pythonish style for iterating the list.
 v6: Infer the size of the dest from the size of the srcs, and auto-swizzle
 a single small src out to the appropriate size.
 ---
  src/glsl/Makefile.am  |   5 ++
  src/glsl/Makefile.sources |   1 +
  src/glsl/nir/.gitignore   |   1 +
  src/glsl/nir/nir_builder.h| 114 
 ++
  src/glsl/nir/nir_builder_opcodes_h.py |  38 
  5 files changed, 159 insertions(+)
  create mode 100644 src/glsl/nir/nir_builder.h
  create mode 100644 src/glsl/nir/nir_builder_opcodes_h.py

 This patch is:
 Reviewed-by: Kenneth Graunke kenn...@whitecape.org

 I do like Connor's ideas - we should definitely extend this and use it
 in more places.  I think we can easily do that as a follow on series.

 It might make sense to (eventually) have an API like:

 nir_builder *nir_builder_create(...)

 nir_builder_insert_at_cf_list(nir_builder *b, nir_cf_list *cf_list)
 nir_builder_insert_at_block_start(nir_builder *b, nir_bblock *block)
 nir_builder_insert_at_block_end(nir_builder *b, nir_bblock *block)
 nir_builder_insert_after_instr(nir_builder *b, nir_instruction *instr)
 nir_builder_insert_before_instr(nir_builder *b, nir_instruction *instr)

 I could see us having to store a cf_list/bblock/instr and needing to
 swap around several fields, so having functions would be nicer than
 prodding at struct fields directly.

 But for now, I think it's sufficient - it'll be easy enough to create
 later, when we actually make the other APIs and start using them.
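
To illustrate the insertion-point idea above, here is a minimal sketch in C. All toy_* names and types are hypothetical stand-ins invented for this example; they are not the real NIR structures or the proposed nir_builder_insert_* functions. The sketch only shows how setter functions can hide which fields the builder stores, and how an emit call can advance the cursor.

```c
#include <assert.h>
#include <stddef.h>

/* Toy stand-ins for illustration only; these are NOT the real NIR types. */
struct toy_block;
struct toy_instr {
    struct toy_instr *prev, *next;
    struct toy_block *block;    /* owning block, set when the instr is linked */
    int id;
};
struct toy_block { struct toy_instr *head, *tail; };

enum toy_cursor { TOY_BLOCK_START, TOY_BLOCK_END, TOY_AFTER_INSTR, TOY_BEFORE_INSTR };

struct toy_builder {
    enum toy_cursor kind;
    struct toy_block *block;
    struct toy_instr *instr;
};

/* Setters: callers never touch the builder fields directly. */
void toy_insert_at_block_start(struct toy_builder *b, struct toy_block *blk)
{ b->kind = TOY_BLOCK_START; b->block = blk; b->instr = NULL; }

void toy_insert_at_block_end(struct toy_builder *b, struct toy_block *blk)
{ b->kind = TOY_BLOCK_END; b->block = blk; b->instr = NULL; }

void toy_insert_after_instr(struct toy_builder *b, struct toy_instr *i)
{ b->kind = TOY_AFTER_INSTR; b->block = i->block; b->instr = i; }

void toy_insert_before_instr(struct toy_builder *b, struct toy_instr *i)
{ b->kind = TOY_BEFORE_INSTR; b->block = i->block; b->instr = i; }

/* Link ins right after pos; pos == NULL means "at the head of the block". */
static void toy_link_after(struct toy_block *blk, struct toy_instr *pos,
                           struct toy_instr *ins)
{
    ins->block = blk;
    ins->prev = pos;
    ins->next = pos ? pos->next : blk->head;
    if (ins->next) ins->next->prev = ins; else blk->tail = ins;
    if (pos) pos->next = ins; else blk->head = ins;
}

/* Insert at the cursor, then move the cursor to just after the new instr
 * so that consecutive emits appear in program order. */
void toy_emit(struct toy_builder *b, struct toy_instr *ins)
{
    struct toy_instr *pos = NULL;
    assert(b->block != NULL);
    switch (b->kind) {
    case TOY_BLOCK_START:  pos = NULL;            break;
    case TOY_BLOCK_END:    pos = b->block->tail;  break;
    case TOY_AFTER_INSTR:  pos = b->instr;        break;
    case TOY_BEFORE_INSTR: pos = b->instr->prev;  break;
    }
    toy_link_after(b->block, pos, ins);
    b->kind = TOY_AFTER_INSTR;
    b->instr = ins;
}
```

With this shape, changing what the builder stores internally (for example a control-flow list instead of a block) would only affect the setters and toy_emit, not the callers.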



Re: [Mesa-dev] [PATCH] mesa: Add mesa SHA-1 functions

2014-12-23 Thread Olivier Galibert
  Hi,

Not sure there's anything to maintain, but sure, I'll maintain it.

Best,

  OG.


On Sun, Dec 21, 2014 at 8:51 PM, Emil Velikov emil.l.veli...@gmail.com wrote:
 On 20 December 2014 at 14:21, Olivier Galibert galib...@pobox.com wrote:
 Here is an implementation I've written myself, so no license issues.

 Thanks OG,

 Afaics the main issue is not the lack of implementation, but that
 no-one wants to step up to maintain it.
 Even adding code that is x2 the size is considered a better solution :'-(

 If you're up to the maintenance task, we can resolve all the issues
 (linking, multi platform support) in half the size and a lot cleaner
 build :-)

 Cheers,
 Emil


Re: [Mesa-dev] [PATCH] mesa: Add mesa SHA-1 functions

2014-12-20 Thread Olivier Galibert
Here is an implementation I've written myself, so no license issues.

  OG.


On Fri, Dec 12, 2014 at 10:48 AM, Jose Fonseca jfons...@vmware.com wrote:
 On 11/12/14 22:02, Brian Paul wrote:

 On 12/11/2014 02:51 PM, Carl Worth wrote:

 From: Kristian Høgsberg k...@bitplanet.net

 The upcoming shader cache uses the SHA-1 algorithm for cryptographic
 naming. These new mesa_sha1 functions are implemented with the nettle
 library.
 ---

 This patch is another in support of my upcoming shader-cache work.
 Thanks to
 Kristian for coding this piece.

 As currently written, this patch introduces a new dependency of Mesa
 on the
 Nettle library to implement SHA-1. I'm open to recommendations if
 people would prefer some other option.

 For example, the xserver can be configured to get a SHA-1
 implementation from
 libmd, libc, CommonCrypto, CryptoAPI, libnettle, libgcrypt, libsha1, or
 openssl.

 I don't know if it's important to offer as many options as that, which
 is why
 I'm asking for opinions here.



 We'll need a solution for Windows too.  I don't have time right now to
 do any research into that.


 Yes, ideally we'd have something small that we could bundle into mesa source
 tree, for sake of non Linux OSes.

 If Windows was the only concern, we could use its Crypto API,
 http://msdn.microsoft.com/en-us/library/windows/desktop/aa382379.aspx and
 avoid depending on anything else, but some of the above-mentioned libraries
 are not trivial to install.

 The other alternative is to disable shader cache when no suitable dependency
 is found. That is, make this an optional dependency.

 Jose

/*
 * Copyright © 2014 Olivier Galibert  Intel Corporation
 *
 * Permission is hereby granted, free of charge, to any person obtaining a
 * copy of this software and associated documentation files (the "Software"),
 * to deal in the Software without restriction, including without limitation
 * the rights to use, copy, modify, merge, publish, distribute, sublicense,
 * and/or sell copies of the Software, and to permit persons to whom the
 * Software is furnished to do so, subject to the following conditions:
 *
 * The above copyright notice and this permission notice (including the next
 * paragraph) shall be included in all copies or substantial portions of the
 * Software.
 *
 * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
 * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
 * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.  IN NO EVENT SHALL
 * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
 * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
 * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER
 * DEALINGS IN THE SOFTWARE.
 */

#include <stdlib.h>
#include <string.h>

#include "sha1.h"

/* Rotate a 32-bit value left by count bits (0 < count < 32). */
static inline unsigned int mesa_sha1_shift(unsigned int val, int count)
{
    return (val << count) | (val >> (32-count));
}

static void mesa_sha1_init(struct mesa_sha1 *ctx)
{
    ctx->digest[0] = 0x67452301;
    ctx->digest[1] = 0xefcdab89;
    ctx->digest[2] = 0x98badcfe;
    ctx->digest[3] = 0x10325476;
    ctx->digest[4] = 0xc3d2e1f0;
    ctx->msize = 0;
}

static void mesa_sha1_handle_block(struct mesa_sha1 *ctx, const unsigned char *b)
{
    unsigned int W[80];
    /* Load the 64-byte block as 16 big-endian words; the cast keeps the
     * 24-bit shift out of signed-int territory. */
    for(int i=0; i != 16; i++)
        W[i] = ((unsigned int)b[4*i] << 24) | (b[4*i+1] << 16) | (b[4*i+2] << 8) | b[4*i+3];
    /* Expand to the 80-word message schedule. */
    for(int i=16; i != 80; i++)
        W[i] = mesa_sha1_shift(W[i-3]^W[i-8]^W[i-14]^W[i-16], 1);

    unsigned int A = ctx->digest[0];
    unsigned int B = ctx->digest[1];
    unsigned int C = ctx->digest[2];
    unsigned int D = ctx->digest[3];
    unsigned int E = ctx->digest[4];

    for(int i= 0; i != 20; i++) {
        unsigned int T = mesa_sha1_shift(A, 5) + ((B & C) | ((~B) & D)) + E + W[i] + 0x5A827999;
        E = D;
        D = C;
        C = mesa_sha1_shift(B, 30);
        B = A;
        A = T;
    }

    for(int i=20; i != 40; i++) {
        unsigned int T = mesa_sha1_shift(A, 5) + (B^C^D) + E + W[i] + 0x6ed9eba1;
        E = D;
        D = C;
        C = mesa_sha1_shift(B, 30);
        B = A;
        A = T;
    }

    for(int i=40; i != 60; i++) {
        unsigned int T = mesa_sha1_shift(A, 5) + ((B & C) | (B & D) | (C & D)) + E + W[i] + 0x8f1bbcdc;
        E = D;
        D = C;
        C = mesa_sha1_shift(B, 30);
        B = A;
        A = T;
    }

    for(int i=60; i != 80; i++) {
        unsigned int T = mesa_sha1_shift(A, 5) + (B^C^D) + E + W[i] + 0xca62c1d6;
        E = D;
        D = C;
        C = mesa_sha1_shift(B, 30);
        B = A;
        A = T;
    }

    ctx->digest[0] += A;
    ctx->digest[1] += B;
    ctx->digest[2] += C;
    ctx->digest[3] += D;
    ctx->digest[4] += E;
}

void
mesa_sha1_final(struct mesa_sha1 *ctx, unsigned char result[20

Re: [Mesa-dev] [PATCH 03/16] mesa: Clamps the stencil value masks to GLint when queried

2014-12-18 Thread Olivier Galibert
Hi,

Something is not clear to me: in what way is -1 incorrect?

Also, w.r.t. the comments, what you're doing is masking, not clamping,
which incidentally is a good thing since clamping would be severely
bad for stencil.

Best,

  OG.
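
To make the masking/clamping distinction above concrete, here is a small sketch in C. It assumes 32-bit GLuint/GLint, modeled with uint32_t/int32_t, and a mask of 0x7FFFFFFF (max int); the function names are made up for this illustration and are not Mesa code.

```c
#include <assert.h>
#include <stdint.h>

/* Masking: drop the top bit, keep the low 31 bits unchanged. */
int32_t mask_to_int(uint32_t v)
{
    return (int32_t)(v & 0x7FFFFFFFu);
}

/* Clamping: saturate anything above INT32_MAX to INT32_MAX. */
int32_t clamp_to_int(uint32_t v)
{
    return v > (uint32_t)INT32_MAX ? INT32_MAX : (int32_t)v;
}

void mask_vs_clamp_demo(void)
{
    /* For the all-ones default the two operations agree... */
    assert(mask_to_int(0xFFFFFFFFu) == INT32_MAX);
    assert(clamp_to_int(0xFFFFFFFFu) == INT32_MAX);
    /* ...but for other values with the top bit set they differ:
     * masking preserves the low bits, clamping overwrites them. */
    assert(mask_to_int(0x80000001u) == 1);
    assert(clamp_to_int(0x80000001u) == INT32_MAX);
}
```

Since only the low bits of a stencil mask matter to the stencil test, masking keeps the meaningful part intact while clamping would turn every low bit on.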


On Thu, Dec 11, 2014 at 11:34 PM, Eduardo Lima Mitev el...@igalia.com wrote:
 The stencil value masks (ctx->Stencil.ValueMask[]) store GLuint values
 which are initialized with max unsigned integer (~0u). When these values
 are queried by glGet* (GL_STENCIL_VALUE_MASK or GL_STENCIL_BACK_VALUE_MASK),
 they are converted to a signed integer. Currently, these values overflow
 and return an incorrect result (-1).

 This patch clamps these values to max int (0x7FFFFFFF) before storing.

 Fixes 6 dEQP failing tests:
 * dEQP-GLES3.functional.state_query.integers.stencil_value_mask_getfloat
 * dEQP-GLES3.functional.state_query.integers.stencil_back_value_mask_getfloat
 * 
 dEQP-GLES3.functional.state_query.integers.stencil_value_mask_separate_getfloat
 * 
 dEQP-GLES3.functional.state_query.integers.stencil_value_mask_separate_both_getfloat
 * 
 dEQP-GLES3.functional.state_query.integers.stencil_back_value_mask_separate_getfloat
 * 
 dEQP-GLES3.functional.state_query.integers.stencil_back_value_mask_separate_both_getfloat
 ---
  src/mesa/main/get.c  | 11 ++-
  src/mesa/main/get_hash_params.py |  2 +-
  2 files changed, 11 insertions(+), 2 deletions(-)

 diff --git a/src/mesa/main/get.c b/src/mesa/main/get.c
 index 6091efc..4578a36 100644
 --- a/src/mesa/main/get.c
 +++ b/src/mesa/main/get.c
 @@ -726,7 +726,16 @@ find_custom_value(struct gl_context *ctx, const struct 
 value_desc *d, union valu
        v->value_int = _mesa_get_stencil_ref(ctx, 1);
        break;
  case GL_STENCIL_VALUE_MASK:
 -  v->value_int = ctx->Stencil.ValueMask[ctx->Stencil.ActiveFace];
 +  /* Since stencil value mask is a GLuint, it requires clamping
 +   * before storing in a signed int to avoid overflow.
 +   * Notice that Stencil.ValueMask values are initialized to ~0u,
 +   * so without clamping it will return -1 when assigned to value_int.
 +   */
 +  v->value_int = ctx->Stencil.ValueMask[ctx->Stencil.ActiveFace] & 
 0x7FFFFFFF;
 +  break;
 +   case GL_STENCIL_BACK_VALUE_MASK:
 +  /* Same as with GL_STENCIL_VALUE_MASK, value requires clamping. */
 +  v->value_int = ctx->Stencil.ValueMask[1] & 0x7FFFFFFF;
        break;
  case GL_STENCIL_WRITEMASK:
        v->value_int = ctx->Stencil.WriteMask[ctx->Stencil.ActiveFace];
 diff --git a/src/mesa/main/get_hash_params.py 
 b/src/mesa/main/get_hash_params.py
 index 09a61ac..a3bf1cb 100644
 --- a/src/mesa/main/get_hash_params.py
 +++ b/src/mesa/main/get_hash_params.py
 @@ -283,7 +283,7 @@ descriptor=[

  # OpenGL 2.0
[ STENCIL_BACK_FUNC, CONTEXT_ENUM(Stencil.Function[1]), NO_EXTRA ],
 -  [ STENCIL_BACK_VALUE_MASK, CONTEXT_INT(Stencil.ValueMask[1]), NO_EXTRA 
 ],
 +  [ STENCIL_BACK_VALUE_MASK, LOC_CUSTOM, TYPE_INT, NO_OFFSET, NO_EXTRA],
[ STENCIL_BACK_WRITEMASK, CONTEXT_INT(Stencil.WriteMask[1]), NO_EXTRA 
 ],
[ STENCIL_BACK_REF, LOC_CUSTOM, TYPE_INT, NO_OFFSET, NO_EXTRA ],
[ STENCIL_BACK_FAIL, CONTEXT_ENUM(Stencil.FailFunc[1]), NO_EXTRA ],
 --
 2.1.3



Re: [Mesa-dev] [PATCH v2] mesa: Initializes the stencil value masks to 0xFF instead of ~0u

2014-12-16 Thread Olivier Galibert
Note that ~0U is perfectly correct w.r.t. the GLES3 spec.  It just
means that s=32, which happens to be greater than or equal to 8.

Best,

  OG.
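
As a small worked example of the point above: assuming the initial mask is defined as 2^s - 1 for some s at least as large as the stencil depth (which is how this thread reads the spec), both 0xFF and ~0u are instances of the same formula. The helper name below is made up for the illustration.

```c
#include <assert.h>
#include <stdint.h>

/* 2^s - 1 for 1 <= s <= 32, avoiding the undefined 32-bit shift by 32. */
uint32_t all_ones_mask(unsigned s)
{
    assert(s >= 1 && s <= 32);
    return s == 32 ? 0xFFFFFFFFu : (1u << s) - 1u;
}
```

For an 8-bit stencil buffer, all_ones_mask(8) and all_ones_mask(32) select the same stencil bits, since only the low 8 bits take part in the test.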


On Tue, Dec 16, 2014 at 8:58 AM, Eduardo Lima Mitev el...@igalia.com wrote:
 On 12/15/2014 08:30 PM, Ian Romanick wrote:
 On 12/15/2014 08:04 AM, Eduardo Lima Mitev wrote:

 Since the maximum supported precision for stencil buffers is 8 bits, mask
 values should be initialized to 2^8 - 1 = 0xFF.

 Currently, these masks are initialized to max unsigned integer (~0u), which
 causes their values to overflow to -1 when converted to signed int by 
 glGet* APIs.

 I did some research on this... before desktop OpenGL 3.1, the spec said
 something quite different.  Please add the following to the commit message:

 In OpenGL 3.0 and before, an initial value of ~0u was specified:

 In the initial state, stenciling is disabled, the front and back
 stencil reference value are both zero, the front and back stencil
 comparison functions are both ALWAYS, and the front and back
 stencil mask are both all ones.


 Oh, interesting. I should have looked back into older specs to
 understand where the ~0u was coming from. Note taken.


 With that, this patch is

 Reviewed-by: Ian Romanick ian.d.roman...@intel.com


 Great. If you feel like nitpicking, you can check the final commit log
 here:
 https://github.com/Igalia/mesa/commit/3784f7b2d5aa739c4abf9aa28874b85bbd1550e5

 Thanks a lot!

 Eduardo



Re: [Mesa-dev] [PATCH] mesa: Add mesa SHA-1 functions

2014-12-12 Thread Olivier Galibert
  Hi,

SHA1 is easy to implement.  If you want an always-working backup, I
have a couple of C versions I wrote myself.  Libraries are only
interesting if they offer significant speedups through CPU-specific
code, especially since the shader cache is not in the happy fun land of
hardware-based attacks (or attacks in the first place).

Best,

  OG.
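
To give a sense of how small a dependency-free SHA-1 can be, here is a compact one-shot sketch in C. It is not the implementation offered in this thread; the names rol32, sha1_block and sha1_oneshot are invented for this example, and it hashes a whole buffer at once rather than providing an init/update/final interface.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Rotate left; n must be in 1..31 so neither shift is undefined. */
static uint32_t rol32(uint32_t v, int n)
{
    assert(n > 0 && n < 32);
    return (v << n) | (v >> (32 - n));
}

/* Process one 64-byte block, updating the five-word state h. */
static void sha1_block(uint32_t h[5], const unsigned char *p)
{
    uint32_t w[80];
    for (int i = 0; i < 16; i++)
        w[i] = ((uint32_t)p[4*i] << 24) | ((uint32_t)p[4*i+1] << 16) |
               ((uint32_t)p[4*i+2] << 8) | p[4*i+3];
    for (int i = 16; i < 80; i++)
        w[i] = rol32(w[i-3] ^ w[i-8] ^ w[i-14] ^ w[i-16], 1);

    uint32_t a = h[0], b = h[1], c = h[2], d = h[3], e = h[4];
    for (int i = 0; i < 80; i++) {
        uint32_t f, k;
        if (i < 20)      { f = (b & c) | (~b & d);          k = 0x5a827999; }
        else if (i < 40) { f = b ^ c ^ d;                   k = 0x6ed9eba1; }
        else if (i < 60) { f = (b & c) | (b & d) | (c & d); k = 0x8f1bbcdc; }
        else             { f = b ^ c ^ d;                   k = 0xca62c1d6; }
        uint32_t t = rol32(a, 5) + f + e + w[i] + k;
        e = d; d = c; c = rol32(b, 30); b = a; a = t;
    }
    h[0] += a; h[1] += b; h[2] += c; h[3] += d; h[4] += e;
}

/* Hash len bytes at msg into a 20-byte big-endian digest. */
void sha1_oneshot(const unsigned char *msg, size_t len, unsigned char out[20])
{
    uint32_t h[5] = { 0x67452301, 0xefcdab89, 0x98badcfe, 0x10325476, 0xc3d2e1f0 };
    size_t i;
    for (i = 0; i + 64 <= len; i += 64)
        sha1_block(h, msg + i);

    /* Padding: 0x80, zeros, then the bit length as a 64-bit big-endian
     * integer; this takes one or two extra blocks. */
    unsigned char tail[128];
    size_t rem = len - i;
    size_t total = (rem + 1 + 8 <= 64) ? 64 : 128;
    uint64_t bits = (uint64_t)len * 8;
    memset(tail, 0, sizeof tail);
    memcpy(tail, msg + i, rem);
    tail[rem] = 0x80;
    for (int j = 0; j < 8; j++)
        tail[total - 1 - j] = (unsigned char)(bits >> (8 * j));
    sha1_block(h, tail);
    if (total == 128)
        sha1_block(h, tail + 64);

    for (int j = 0; j < 5; j++) {
        out[4*j]   = (unsigned char)(h[j] >> 24);
        out[4*j+1] = (unsigned char)(h[j] >> 16);
        out[4*j+2] = (unsigned char)(h[j] >> 8);
        out[4*j+3] = (unsigned char)h[j];
    }
}
```

A streaming interface would additionally need to buffer partial blocks between calls; that bookkeeping is left out here to keep the sketch short.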


On Fri, Dec 12, 2014 at 10:48 AM, Jose Fonseca jfons...@vmware.com wrote:
 On 11/12/14 22:02, Brian Paul wrote:

 On 12/11/2014 02:51 PM, Carl Worth wrote:

 From: Kristian Høgsberg k...@bitplanet.net

 The upcoming shader cache uses the SHA-1 algorithm for cryptographic
 naming. These new mesa_sha1 functions are implemented with the nettle
 library.
 ---

 This patch is another in support of my upcoming shader-cache work.
 Thanks to
 Kristian for coding this piece.

 As currently written, this patch introduces a new dependency of Mesa
 on the
 Nettle library to implement SHA-1. I'm open to recommendations if
 people would prefer some other option.

 For example, the xserver can be configured to get a SHA-1
 implementation from
 libmd, libc, CommonCrypto, CryptoAPI, libnettle, libgcrypt, libsha1, or
 openssl.

 I don't know if it's important to offer as many options as that, which
 is why
 I'm asking for opinions here.



 We'll need a solution for Windows too.  I don't have time right now to
 do any research into that.


 Yes, ideally we'd have something small that we could bundle into mesa source
 tree, for sake of non Linux OSes.

 If Windows was the only concern, we could use its Crypto API,
 http://msdn.microsoft.com/en-us/library/windows/desktop/aa382379.aspx and
 avoid depending on anything else, but some of the above-mentioned libraries
 are not trivial to install.

 The other alternative is to disable shader cache when no suitable dependency
 is found. That is, make this an optional dependency.

 Jose



Re: [Mesa-dev] [PATCH] glsl: improve accuracy of atan()

2014-10-10 Thread Olivier Galibert
Applied.

 OG.


On Fri, Sep 26, 2014 at 6:11 PM, Erik Faye-Lund kusmab...@gmail.com wrote:
 Our current atan()-approximation is pretty inaccurate at 1.0, so
 let's try to improve the situation by doing a direct approximation
 without going through asin.

 This new implementation uses an 11th degree polynomial to approximate
 atan in the [-1..1] range, and the following identity to reduce the
 entire range to [-1..1]:

 atan(x) = 0.5 * pi * sign(x) - atan(1.0 / x)

 This range-reduction idea is taken from the paper "Fast computation
 of Arctangent Functions for Embedded Applications: A Comparative
 Analysis" (Ukil et al. 2011).

 The polynomial that approximates atan(x) is:

 x   * 0.9999793128310355 - x^3  * 0.3326756418091246 +
 x^5 * 0.1938924977115610 - x^7  * 0.1173503194786851 +
 x^9 * 0.0536813784310406 - x^11 * 0.0121323213173444
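
The range reduction and polynomial described above can be sketched on the CPU in plain C as follows. This is only an illustration, not the GLSL IR the patch emits; the function names are invented here, and the leading coefficient is written as 0.9999793128310355, which is an assumption (it has to be close to 1 because atan(x) is approximately x near zero).

```c
#include <assert.h>

/* Odd 11th-degree polynomial in Horner form; intended for 0 <= x <= 1. */
static float atan_poly(float x)
{
    assert(x >= 0.0f && x <= 1.0f);
    float x2 = x * x;
    return x * (0.9999793128310355f +
           x2 * (-0.3326756418091246f +
           x2 * (0.1938924977115610f +
           x2 * (-0.1173503194786851f +
           x2 * (0.0536813784310406f +
           x2 * -0.0121323213173444f)))));
}

/* atan(v) for any finite v, via atan(x) = 0.5*pi*sign(x) - atan(1/x). */
float approx_atan(float v)
{
    const float half_pi = 1.57079632679489662f;
    float a = v < 0.0f ? -v : v;             /* |v| */
    float x = a > 1.0f ? 1.0f / a : a;       /* reduce to [0, 1] */
    float r = atan_poly(x);
    if (a > 1.0f)
        r = half_pi - r;                     /* undo the reduction */
    return v < 0.0f ? -r : r;                /* restore the sign */
}
```

Summing the six coefficients gives roughly 0.785395, within a few millionths of pi/4, which is the accuracy at 1.0 that the message is concerned with.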

 This polynomial was found with the following GNU Octave script:

 x = linspace(0, 1);
 y = atan(x);
 n = [1, 3, 5, 7, 9, 11];
 format long;
 polyfitc(x, y, n)

 The polyfitc function is not built-in, but too long to include here.
 It can be downloaded from the following URL:

 http://www.mathworks.com/matlabcentral/fileexchange/47851-constraint-polynomial-fit/content/polyfitc.m

 This fixes the following piglit test:
 shaders/glsl-const-folding-01

 Signed-off-by: Erik Faye-Lund kusmab...@gmail.com
 Reviewed-by: Ian Romanick ian.d.roman...@intel.com
 ---
  src/glsl/builtin_functions.cpp | 65 
 +++---
  1 file changed, 55 insertions(+), 10 deletions(-)

 diff --git a/src/glsl/builtin_functions.cpp b/src/glsl/builtin_functions.cpp
 index 9be7f6d..c126b60 100644
 --- a/src/glsl/builtin_functions.cpp
 +++ b/src/glsl/builtin_functions.cpp
 @@ -442,6 +442,7 @@ private:
 ir_swizzle *matrix_elt(ir_variable *var, int col, int row);

 ir_expression *asin_expr(ir_variable *x);
 +   void do_atan(ir_factory &body, const glsl_type *type, ir_variable *res, 
 operand y_over_x);

 /**
  * Call function \param f with parameters specified as the linked
 @@ -2684,11 +2685,7 @@ builtin_builder::_atan2(const glsl_type *type)
ir_factory outer_then(&outer_if->then_instructions, mem_ctx);

/* Then...call atan(y/x) */
 -  ir_variable *y_over_x = outer_then.make_temp(glsl_type::float_type, 
 "y_over_x");
 -  outer_then.emit(assign(y_over_x, div(y, x)));
 -  outer_then.emit(assign(r, mul(y_over_x, rsq(add(mul(y_over_x, 
 y_over_x),
 -  imm(1.0f))))));
 -  outer_then.emit(assign(r, asin_expr(r)));
 +  do_atan(body, glsl_type::float_type, r, div(y, x));

/* ...and fix it up: */
ir_if *inner_if = new(mem_ctx) ir_if(less(x, imm(0.0f)));
 @@ -2711,17 +2708,65 @@ builtin_builder::_atan2(const glsl_type *type)
 return sig;
  }

 +void
 +builtin_builder::do_atan(ir_factory &body, const glsl_type *type, 
 ir_variable *res, operand y_over_x)
 +{
 +   /*
 +* range-reduction, first step:
 +*
 +*  / y_over_x if |y_over_x| <= 1.0;
 +* x = <
 +*  \ 1.0 / y_over_x   otherwise
 +*/
 +   ir_variable *x = body.make_temp(type, "atan_x");
 +   body.emit(assign(x, div(min2(abs(y_over_x),
 +imm(1.0f)),
 +   max2(abs(y_over_x),
 +imm(1.0f)))));
 +
 +   /*
 +* approximate atan by evaluating polynomial:
 +*
 +* x   * 0.9999793128310355 - x^3  * 0.3326756418091246 +
 +* x^5 * 0.1938924977115610 - x^7  * 0.1173503194786851 +
 +* x^9 * 0.0536813784310406 - x^11 * 0.0121323213173444
 +*/
 +   ir_variable *tmp = body.make_temp(type, "atan_tmp");
 +   body.emit(assign(tmp, mul(x, x)));
 +   body.emit(assign(tmp, 
 mul(add(mul(sub(mul(add(mul(sub(mul(add(mul(imm(-0.0121323213173444f),
 + tmp),
 + 
 imm(0.0536813784310406f)),
 + tmp),
 + 
 imm(0.1173503194786851f)),
 + tmp),
 + imm(0.1938924977115610f)),
 + tmp),
 + imm(0.3326756418091246f)),
 + tmp),
 + imm(0.9999793128310355f)),
 + x)));
 +
 +   /* range-reduction fixup */
 +   body.emit(assign(tmp, add(tmp,
 + mul(b2f(greater(abs(y_over_x),
 +  imm(1.0f, type->components()))),
 +  add(mul(tmp,
 +  imm(-2.0f)),
 +  imm(M_PI_2f))))));
 +
 +   /* sign fixup */
 +   body.emit(assign(res, 

Re: [Mesa-dev] [PATCH] glsl/glsl_parser_extras: Handle GLSL 4.50

2014-10-03 Thread Olivier Galibert
Sorry for not replying earlier, I didn't see your answer.

On Thu, Sep 4, 2014 at 12:33 AM, Matt Turner matts...@gmail.com wrote:
 Did you change the leading whitespace on purpose?

Not really, I can un-change that.  I have an emacs config that's
supposedly what mesa wants, but it may be incorrect.


 -   } supported_versions[12];
 +   } supported_versions[14];

 Where does this number come from, and can we make it a little clearer
 what it is?

It's the maximum number of simultaneous glsl versions that a driver
can support.  With the current code it should be 13, that is, if a
driver supports glsl 440 it's going to segfault writing after the end
of array.  Supporting 450 needs one more.

Note that it's not (yet) a security issue since the largest we support is 330.
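To make the sizing argument concrete, here is a small C sketch that derives the array capacity from the version tables instead of hard-coding 12 or 14. The names ARRAY_SIZE, MAX_SUPPORTED_VERSIONS and collect_supported_versions are hypothetical, and the two ES entries (100 and 300) are an assumption inferred from the arithmetic in this message (11 desktop versions needing 13 slots); this is not how the Mesa code is actually written.

```c
#include <stddef.h>

#define ARRAY_SIZE(a) (sizeof(a) / sizeof((a)[0]))

static const unsigned known_desktop_glsl_versions[] =
   { 110, 120, 130, 140, 150, 330, 400, 410, 420, 430, 440, 450 };

/* Assumption: exactly two ES versions exist alongside the desktop ones. */
static const unsigned known_es_glsl_versions[] = { 100, 300 };

/* The capacity follows the tables, so adding a version cannot overflow. */
enum {
   MAX_SUPPORTED_VERSIONS = ARRAY_SIZE(known_desktop_glsl_versions) +
                            ARRAY_SIZE(known_es_glsl_versions)
};

struct glsl_version {
   unsigned ver;
   int es;
};

/* Fill 'out' with every known version a driver supports, never writing
 * more than 'cap' entries. Returns the number of entries written. */
size_t collect_supported_versions(unsigned max_desktop, unsigned max_es,
                                  struct glsl_version *out, size_t cap)
{
   size_t n = 0;

   for (size_t i = 0; i < ARRAY_SIZE(known_desktop_glsl_versions); i++) {
      if (known_desktop_glsl_versions[i] <= max_desktop && n < cap) {
         out[n].ver = known_desktop_glsl_versions[i];
         out[n].es = 0;
         n++;
      }
   }
   for (size_t i = 0; i < ARRAY_SIZE(known_es_glsl_versions); i++) {
      if (known_es_glsl_versions[i] <= max_es && n < cap) {
         out[n].ver = known_es_glsl_versions[i];
         out[n].es = 1;
         n++;
      }
   }
   return n;
}
```

With these tables a driver exposing desktop 440 plus ES 300 fills 13 entries, which is the count that would overrun a 12-element array.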

Best,

  OG.


Re: [Mesa-dev] [PATCH] mesa/{version, getstring}: Future-proof version handling

2014-08-23 Thread Olivier Galibert
Are we that far?

  OG.


On Sat, Aug 23, 2014 at 7:22 PM, Ian Romanick i...@freedesktop.org wrote:
 I'm content with waiting to add this until we're even close to
 supporting any of those versions... especially given all the lines like
 "false && // ARB_gpu_shader_fp64".  That's just clutter.

 On 08/21/2014 05:02 AM, Olivier Galibert wrote:
 Signed-off-by: Olivier Galibert galib...@pobox.com
 ---
  src/mesa/main/getstring.c |   6 ++
  src/mesa/main/version.c   | 140 
 +-
  2 files changed, 143 insertions(+), 3 deletions(-)

 diff --git a/src/mesa/main/getstring.c b/src/mesa/main/getstring.c
 index 431d60b..f9d13a7 100644
 --- a/src/mesa/main/getstring.c
 +++ b/src/mesa/main/getstring.c
 @@ -58,6 +58,12 @@ shading_language_version(struct gl_context *ctx)
   return (const GLubyte *) "4.10";
case 420:
   return (const GLubyte *) "4.20";
 +  case 430:
 + return (const GLubyte *) "4.30";
 +  case 440:
 + return (const GLubyte *) "4.40";
 +  case 450:
 + return (const GLubyte *) "4.50";
default:
   _mesa_problem(ctx,
 "Invalid GLSL version in 
 shading_language_version()");
 diff --git a/src/mesa/main/version.c b/src/mesa/main/version.c
 index 4dea530..c7a2381 100644
 --- a/src/mesa/main/version.c
 +++ b/src/mesa/main/version.c
 @@ -290,7 +290,122 @@ compute_version(const struct gl_extensions *extensions,
extensions->EXT_texture_swizzle);
/* ARB_sampler_objects is always enabled in 
 mesa */

 -   if (ver_3_3) {
 +   const GLboolean ver_4_0 = (ver_3_3 &&
 +  consts->GLSLVersion >= 400 &&
 +  extensions->ARB_draw_buffers_blend &&
 +  extensions->ARB_draw_indirect &&
 +  extensions->ARB_gpu_shader5 &&
 +  false && // ARB_gpu_shader_fp64
 +  extensions->ARB_sample_shading &&
 +  false && // ARB_shader_subroutine
 +  false && // ARB_tesselation_shader
 +  extensions->ARB_texture_buffer_object_rgb32 &&
 +  extensions->ARB_texture_cube_map_array &&
 +  extensions->ARB_texture_gather &&
 +  extensions->ARB_texture_query_lod &&
 +  extensions->ARB_transform_feedback2 &&
 +  extensions->ARB_transform_feedback3);
 +
 +   const GLboolean ver_4_1 = (ver_4_0 &&
 +  consts->GLSLVersion >= 410 &&
 +  extensions->ARB_ES2_compatibility &&
 +  false && // ARB_shader_precision
 +  false && // ARB_vertex_attrib_64bit
 +  extensions->ARB_viewport_array);
 +  /* ARB_get_program_binary and
 + ARB_separate_shader_objects are always 
 enabled in mesa */
 +
 +   const GLboolean ver_4_2 = (ver_4_1 &&
 +  consts->GLSLVersion >= 420 &&
 +  extensions->ARB_texture_compression_bptc &&
 +  extensions->ARB_shader_atomic_counters &&
 +  extensions->ARB_transform_feedback_instanced 
 &&
 +  extensions->ARB_base_instance &&
 +  extensions->ARB_shader_image_load_store &&
 +  extensions->ARB_conservative_depth &&
 +  extensions->ARB_shading_language_420pack &&
 +  extensions->ARB_internalformat_query);
 +  /* ARB_compressed_texture_pixel_storage,
 + ARB_texture_storage and
 + ARB_map_buffer_alignment are always 
 enabled in mesa */
 +
 +   const GLboolean ver_4_3 = (ver_4_2 &&
 +  consts->GLSLVersion >= 430 &&
 +  false && // ARB_arrays_of_arrays
 +  extensions->ARB_ES3_compatibility &&
 +  extensions->ARB_compute_shader &&
 +  extensions->ARB_copy_image &&
 +  extensions->ARB_explicit_uniform_location &&
 +  extensions->ARB_fragment_layer_viewport &&
 +  false && // ARB_framebuffer_no_attachments
 +  false && // ARB_internalformat_query2
 +  extensions->ARB_draw_indirect &&
 +  false && // ARB_program_interface_query
 +  false && // ARB_robust_buffer_access_behavior
 +  false && // ARB_shader_image_size
 +  false && // ARB_shader_storage_buffer_object

[Mesa-dev] [PATCH] mapi/glapi/gen/gl_API.xml: Summer cleanup.

2014-08-22 Thread Olivier Galibert
This adds all the extension names and numbers, adds some missing
numbers and fixes the order in places.  Future extension additions
should be slightly easier by not requiring to find where it should go
anymore.

Signed-off-by: Olivier Galibert galib...@pobox.com
---
 src/mapi/glapi/gen/gl_API.xml | 804 ++
 1 file changed, 578 insertions(+), 226 deletions(-)

diff --git a/src/mapi/glapi/gen/gl_API.xml b/src/mapi/glapi/gen/gl_API.xml
index 73f2f75..e91f37e 100644
--- a/src/mapi/glapi/gen/gl_API.xml
+++ b/src/mapi/glapi/gen/gl_API.xml
@@ -6275,7 +6275,7 @@
 </function>
 </category>
 
-<!-- ARB extension number 2 is a GLX extension. -->
+<!-- 2. GLX_ARB_get_proc_address is a GLX extension. -->
 
 <category name="GL_ARB_transpose_matrix" number="3">
 <enum name="TRANSPOSE_MODELVIEW_MATRIX_ARB"   value="0x84E3"/>
@@ -6300,7 +6300,7 @@
 </function>
 </category>
 
-<!-- ARB extension number 4 is a WGL extension. -->
+<!-- 4. WGL_ARB_buffer_region is a WGL extension. -->
 
 <category name="GL_ARB_multisample" number="5">
 <enum name="MULTISAMPLE_ARB"   count="1"  value="0x809D">
@@ -6335,6 +6335,9 @@
 </function>
 </category>
 
+<!--GLX_ARB_multisample is a GLX extension -->
+<!--WGL_ARB_multisample is a WGL extension -->
+
 <category name="GL_ARB_texture_env_add" number="6">
 <!-- No new functions, types, enums. -->
 </category>
@@ -6360,10 +6363,10 @@
 </enum>
 </category>
 
-<!-- ARB extension number 8 is a WGL extension. -->
-<!-- ARB extension number 9 is a WGL extension. -->
-<!-- ARB extension number 10 is a WGL extension. -->
-<!-- ARB extension number 11 is a WGL extension. -->
+<!-- 8. WGL_ARB_extensions_string is a WGL extension. -->
+<!-- 9. WGL_ARB_pixel_format is a WGL extension. -->
+<!-- 10. WGL_ARB_make_current_read is a WGL extension. -->
+<!-- 11. WGL_ARB_pbuffer is a WGL extension. -->
 
 <category name="GL_ARB_texture_compression" number="12">
 <enum name="COMPRESSED_ALPHA_ARB" value="0x84E9"/>
@@ -6776,7 +6779,7 @@
 <enum name="DOT3_RGBA_ARB" value="0x86AF"/>
 </category>
 
-<!-- ARB extension number 20 is a WGL extension. -->
+<!-- 20. WGL_ARB_render_texture is a WGL extension. -->
 
 <category name="GL_ARB_texture_mirrored_repeat" number="21">
 <enum name="MIRRORED_REPEAT_ARB"  value="0x8370"/>
@@ -7443,7 +7446,7 @@
  parameter was in the NV functions.  When this error was discovered
  and fixed, there was already at least one implementation of
  GLX protocol for ARB_vertex_program, but there were no
- implementations of NV_vertex_program.  The sollution was to renumber
+ implementations of NV_vertex_program.  The solution was to renumber
  the opcodes for NV_vertex_program and convert the unused field in
  the ARB_vertex_program protocol to unused padding.
   -->
@@ -7683,6 +7686,8 @@
 </function>
 </category>
 
+<!-- GLX_ARB_vertex_buffer_object is a GLX extension. -->
+
 <category name="GL_ARB_occlusion_query" number="29">
 <enum name="QUERY_COUNTER_BITS_ARB" count="1"  value="0x8864">
 <size name="GetQueryiv" mode="get"/>
@@ -8079,7 +8084,7 @@
 <!-- No new functions, types, enums. -->
 </category>
 
-<xi:include href="ARB_draw_buffers.xml" 
xmlns:xi="http://www.w3.org/2001/XInclude"/>
+<xi:include href="ARB_draw_buffers.xml" 
xmlns:xi="http://www.w3.org/2001/XInclude"/> <!-- 37. -->
 
 <category name="GL_ARB_texture_rectangle" number="38">
 <enum name="TEXTURE_RECTANGLE_ARB" count="1"  value="0x84F5">
@@ -8094,79 +8099,79 @@
 </enum>
 </category>
 
-<xi:include href="ARB_color_buffer_float.xml" 
xmlns:xi="http://www.w3.org/2001/XInclude"/>
+<xi:include href="ARB_color_buffer_float.xml" 
xmlns:xi="http://www.w3.org/2001/XInclude"/> <!-- 39.-->
 
 <!-- 40. GL_ARB_half_float_pixel -->
 
-<xi:include href="ARB_texture_float.xml" 
xmlns:xi="http://www.w3.org/2001/XInclude"/>
+<xi:include href="ARB_texture_float.xml" 
xmlns:xi="http://www.w3.org/2001/XInclude"/> <!-- 41. -->
 
 <!-- 42. GL_ARB_pixel_buffer_object -->
 
-<xi:include href="ARB_depth_buffer_float.xml" 
xmlns:xi="http://www.w3.org/2001/XInclude"/>
+<xi:include href="ARB_depth_buffer_float.xml" 
xmlns:xi="http://www.w3.org/2001/XInclude"/> <!-- 43. -->
 
-<xi:include href="ARB_draw_instanced.xml" 
xmlns:xi="http://www.w3.org/2001/XInclude"/>
+<xi:include href="ARB_draw_instanced.xml" 
xmlns:xi="http://www.w3.org/2001/XInclude"/> <!-- 44. -->
 
-<xi:include href="ARB_framebuffer_object.xml" 
xmlns:xi="http://www.w3.org/2001/XInclude"/>
+<xi:include href="ARB_framebuffer_object.xml" 
xmlns:xi="http://www.w3.org/2001/XInclude"/> <!-- 45. -->
 
 <!-- 46. GL_ARB_framebuffer_sRGB -->
 
-<xi:include href="ARB_geometry_shader4.xml" 
xmlns:xi="http://www.w3.org/2001/XInclude"/>
+<xi:include href="ARB_geometry_shader4.xml" 
xmlns:xi="http://www.w3.org/2001/XInclude"/> <!-- 47. -->
 
 <!-- 48. GL_ARB_half_float_vertex -->
 
-<xi:include href="ARB_instanced_arrays.xml" 
xmlns:xi="http://www.w3.org/2001/XInclude"/>
+<xi:include href="ARB_instanced_arrays.xml" 
xmlns:xi="http://www.w3.org/2001/XInclude"/> <!-- 49. -->
-xi:include href

[Mesa-dev] [PATCH] glsl/glsl_parser_extras: Handle GLSL 4.50

2014-08-22 Thread Olivier Galibert
Signed-off-by: Olivier Galibert galib...@pobox.com
---
 src/glsl/glsl_parser_extras.cpp | 2 +-
 src/glsl/glsl_parser_extras.h   | 2 +-
 2 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/src/glsl/glsl_parser_extras.cpp b/src/glsl/glsl_parser_extras.cpp
index 490c3c8..87d4846 100644
--- a/src/glsl/glsl_parser_extras.cpp
+++ b/src/glsl/glsl_parser_extras.cpp
@@ -50,7 +50,7 @@ glsl_compute_version_string(void *mem_ctx, bool is_es, 
unsigned version)
 
 
 static const unsigned known_desktop_glsl_versions[] =
-   { 110, 120, 130, 140, 150, 330, 400, 410, 420, 430, 440 };
+  { 110, 120, 130, 140, 150, 330, 400, 410, 420, 430, 440, 450 };
 
 
 _mesa_glsl_parse_state::_mesa_glsl_parse_state(struct gl_context *_ctx,
diff --git a/src/glsl/glsl_parser_extras.h b/src/glsl/glsl_parser_extras.h
index c8b9478..cd252f1 100644
--- a/src/glsl/glsl_parser_extras.h
+++ b/src/glsl/glsl_parser_extras.h
@@ -215,7 +215,7 @@ struct _mesa_glsl_parse_state {
struct {
   unsigned ver;
   bool es;
-   } supported_versions[12];
+   } supported_versions[14];
 
bool es_shader;
unsigned language_version;
-- 
2.1.0



Re: [Mesa-dev] [RFC PATCH 00/16] A new IR for Mesa

2014-08-22 Thread Olivier Galibert
In that case staying as close as possible to spir may make sense?

  OG.


On Fri, Aug 22, 2014 at 5:08 AM, Dave Airlie airl...@gmail.com wrote:
 On 22 August 2014 12:46, Jason Ekstrand ja...@jlekstrand.net wrote:
 On Thu, Aug 21, 2014 at 7:36 PM, Dave Airlie airl...@gmail.com wrote:

 On 21 August 2014 19:10, Henri Verbeet hverb...@gmail.com wrote:
  On 21 August 2014 04:56, Michel Dänzer mic...@daenzer.net wrote:
  On 21.08.2014 04:29, Henri Verbeet wrote:
  For whatever it's worth, I have been avoiding radeonsi in part because
  of the LLVM dependency. Some of the other issues already mentioned
  aside, I also think it makes it just painful to do bisects over
  moderate/longer periods of time.
 
  More painful, sure, but not too bad IME. In particular, if you know the
  regression is in Mesa, you can always use a stable release of LLVM for
  the bisect. You only need to change the --with-llvm-prefix= parameter
  to
  Mesa's configure for that. Of course, it could still be mildly painful
  if you need to go so far back that the current stable LLVM release
  wasn't supported yet. But how often does that happen? Very rarely for
  me.
 
  Sure, it's not impossible, but is that really the kind of process you
  want users to go through when bisecting a regression? Perhaps throw in
  building 32-bit versions of both Mesa and LLVM on 64-bit as well if
  they want to run 32-bit applications.
 
  Without LLVM, I'm not sure there would be a driver you could avoid. :)
 
  R600g didn't really exist either, and that one seems to have worked
  out fine. I think in a large part because of work done by Jerome and
  Dave in the early days, but regardless. From what I've seen from SI, I
  don't think radeonsi needed to be a separate driver to start with, and
  while its ISA is certainly different from R600-Cayman, it doesn't
  particularly strike me as much harder to work with.
 
  Back to the more immediate topic though, I think that on
  occasion the discussion is framed as "Is there any reason using LLVM
  IR wouldn't work?", while it would perhaps be more appropriate to
  think of as "Would using LLVM IR provide enough advantages to justify
  adding a LLVM dependency to core Mesa?".

 "Could we use an llvm compatible IR?" is also a question I'd like to see
 answered.


 What do you mean by llvm compatible?  Do you mean forking their IR inside
 mesa or just something that's easy to translate back and forth?


 Importing/forking the llvm IR code with a different symbol set, and
 trying to not intentionally
 be incompatible with their llvm.

 Dave.


[Mesa-dev] [PATCH] mesa/{version, getstring}: Future-proof version handling

2014-08-21 Thread Olivier Galibert
Signed-off-by: Olivier Galibert galib...@pobox.com
---
 src/mesa/main/getstring.c |   6 ++
 src/mesa/main/version.c   | 140 +-
 2 files changed, 143 insertions(+), 3 deletions(-)

diff --git a/src/mesa/main/getstring.c b/src/mesa/main/getstring.c
index 431d60b..f9d13a7 100644
--- a/src/mesa/main/getstring.c
+++ b/src/mesa/main/getstring.c
@@ -58,6 +58,12 @@ shading_language_version(struct gl_context *ctx)
  return (const GLubyte *) "4.10";
   case 420:
  return (const GLubyte *) "4.20";
+  case 430:
+ return (const GLubyte *) "4.30";
+  case 440:
+ return (const GLubyte *) "4.40";
+  case 450:
+ return (const GLubyte *) "4.50";
   default:
  _mesa_problem(ctx,
"Invalid GLSL version in shading_language_version()");
diff --git a/src/mesa/main/version.c b/src/mesa/main/version.c
index 4dea530..c7a2381 100644
--- a/src/mesa/main/version.c
+++ b/src/mesa/main/version.c
@@ -290,7 +290,122 @@ compute_version(const struct gl_extensions *extensions,
   extensions->EXT_texture_swizzle);
   /* ARB_sampler_objects is always enabled in mesa 
*/
 
-   if (ver_3_3) {
+   const GLboolean ver_4_0 = (ver_3_3 &&
+  consts->GLSLVersion >= 400 &&
+  extensions->ARB_draw_buffers_blend &&
+  extensions->ARB_draw_indirect &&
+  extensions->ARB_gpu_shader5 &&
+  false && // ARB_gpu_shader_fp64
+  extensions->ARB_sample_shading &&
+  false && // ARB_shader_subroutine
+  false && // ARB_tesselation_shader
+  extensions->ARB_texture_buffer_object_rgb32 &&
+  extensions->ARB_texture_cube_map_array &&
+  extensions->ARB_texture_gather &&
+  extensions->ARB_texture_query_lod &&
+  extensions->ARB_transform_feedback2 &&
+  extensions->ARB_transform_feedback3);
+
+   const GLboolean ver_4_1 = (ver_4_0 &&
+  consts->GLSLVersion >= 410 &&
+  extensions->ARB_ES2_compatibility &&
+  false && // ARB_shader_precision
+  false && // ARB_vertex_attrib_64bit
+  extensions->ARB_viewport_array);
+  /* ARB_get_program_binary and
+ ARB_separate_shader_objects are always 
enabled in mesa */
+
+   const GLboolean ver_4_2 = (ver_4_1 &&
+  consts->GLSLVersion >= 420 &&
+  extensions->ARB_texture_compression_bptc &&
+  extensions->ARB_shader_atomic_counters &&
+  extensions->ARB_transform_feedback_instanced &&
+  extensions->ARB_base_instance &&
+  extensions->ARB_shader_image_load_store &&
+  extensions->ARB_conservative_depth &&
+  extensions->ARB_shading_language_420pack &&
+  extensions->ARB_internalformat_query);
+  /* ARB_compressed_texture_pixel_storage,
+ ARB_texture_storage and
+ ARB_map_buffer_alignment are always enabled 
in mesa */
+
+   const GLboolean ver_4_3 = (ver_4_2 &&
+  consts->GLSLVersion >= 430 &&
+  false && // ARB_arrays_of_arrays
+  extensions->ARB_ES3_compatibility &&
+  extensions->ARB_compute_shader &&
+  extensions->ARB_copy_image &&
+  extensions->ARB_explicit_uniform_location &&
+  extensions->ARB_fragment_layer_viewport &&
+  false && // ARB_framebuffer_no_attachments
+  false && // ARB_internalformat_query2
+  extensions->ARB_draw_indirect &&
+  false && // ARB_program_interface_query
+  false && // ARB_robust_buffer_access_behavior
+  false && // ARB_shader_image_size
+  false && // ARB_shader_storage_buffer_object
+  extensions->ARB_stencil_texturing &&
+  extensions->ARB_texture_buffer_range &&
+  extensions->ARB_texture_query_levels &&
+  extensions->ARB_texture_multisample &&
+  extensions->ARB_texture_view);
+  /* ARB_clear_buffer_object,
+ KHR_debug

Re: [Mesa-dev] [RFC PATCH 00/16] A new IR for Mesa

2014-08-20 Thread Olivier Galibert
And don't forget that explicit vec4 becomes immensely amusing once you
add fp64/double to the problem.

  OG.


On Wed, Aug 20, 2014 at 4:01 PM, Francisco Jerez curroje...@riseup.net wrote:
 Connor Abbott cwabbo...@gmail.com writes:

 On Tue, Aug 19, 2014 at 11:33 PM, Francisco Jerez curroje...@riseup.net 
 wrote:
 Connor Abbott cwabbo...@gmail.com writes:

 On Tue, Aug 19, 2014 at 11:40 AM, Francisco Jerez curroje...@riseup.net 
 wrote:
 Tom Stellard t...@stellard.net writes:

 On Tue, Aug 19, 2014 at 11:04:59AM -0400, Connor Abbott wrote:
 On Mon, Aug 18, 2014 at 8:52 PM, Michel Dänzer mic...@daenzer.net 
 wrote:
  On 19.08.2014 01:28, Connor Abbott wrote:
  On Mon, Aug 18, 2014 at 4:32 AM, Michel Dänzer mic...@daenzer.net 
  wrote:
  On 16.08.2014 09:12, Connor Abbott wrote:
  I know what you might be thinking right now. "Wait, *another* IR? Don't
  we already have like 5 of those, not counting all the driver-specific
  ones? Isn't this stuff complicated enough already?" Well, there are some
  pretty good reasons to start afresh (again...). In the years we've been
  using GLSL IR, we've come to realize that, in fact, it's not what we
  want *at all* to do optimizations on.
 
  Did you evaluate using LLVM IR instead of inventing yet another one?
 
 
  --
  Earthling Michel Dänzer    |  http://www.amd.com
  Libre software enthusiast  |  Mesa and X developer
 
  Yes. See
 
  http://lists.freedesktop.org/archives/mesa-dev/2014-February/053502.html
 
  and
 
  http://lists.freedesktop.org/archives/mesa-dev/2014-February/053522.html
 
  I know Ian can't deal with LLVM for some reason. I was wondering if
  *you* evaluated it, and if so, why you rejected it.
 
 
  --
  Earthling Michel Dänzer    |  http://www.amd.com
  Libre software enthusiast  |  Mesa and X developer


 Well, first of all, the fact that Ian and Ken don't want to use it
 means that any plan to use LLVM for the Intel driver is dead in the
 water anyways - you can translate NIR into LLVM if you want, but for
 i965 we want to share optimizations between our 2 backends (FS and
 vec4) that we can't do today in GLSL IR so this is what we want to use
 for that, and since nobody else does anything with the core GLSL
 compiler except when they have to, when we start moving things out of
 GLSL IR this will probably replace GLSL IR as the infrastructure that
 all Mesa drivers use. But with that in mind, here are a few reasons
 why we wouldn't want to use LLVM:

 * LLVM wasn't built to understand structured CFG's, meaning that you
 need to re-structurize it using a pass that's fragile and prone to
 break if some other pass optimizes the shader in a way that makes it
 non-structured (i.e. not expressible in terms of loops and if
 statements). This loss of information also means that passes that need
 to know things like, for example, the loop nesting depth need to do an
 analysis pass whereas with NIR you can just walk up the control flow
 tree and count the number of loops we hit.
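The "walk up the control flow tree and count the loops" idea can be sketched in a few lines of C. The node layout below (cf_node with a type tag and a parent pointer) and the function name loop_nesting_depth are hypothetical and only illustrate the principle; they are not NIR's actual data structures.

```c
#include <stddef.h>

/* Hypothetical structured control-flow node kinds. */
enum cf_node_type { CF_FUNCTION, CF_IF, CF_LOOP, CF_BLOCK };

struct cf_node {
    enum cf_node_type type;
    const struct cf_node *parent;  /* NULL for the outermost node */
};

/* Number of loops that enclose 'node' (the node itself is not counted).
 * With a structured tree no separate analysis pass is needed. */
unsigned loop_nesting_depth(const struct cf_node *node)
{
    unsigned depth = 0;

    for (const struct cf_node *n = node->parent; n != NULL; n = n->parent) {
        if (n->type == CF_LOOP)
            depth++;
    }
    return depth;
}
```

For a block nested as function, loop, if, loop, block, the walk passes two loop nodes on the way to the root, so the depth is 2.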


 LLVM has a pass to structurize the CFG.  We use it in the radeon
 drivers, and it is run after all of the other LLVM optimizations which
 have no concept of structured CFG.  It's not bug free, but it works
 really well even with all of the complex OpenCL kernels we throw at it.

 Your point about losing information when the CFG is de-structurized is
 valid, but for things like loop depth, I'm not sure why we couldn't
 write an LLVM analysis pass for this (if one doesn't already exist).


 I don't think this is such a big deal either.  At least the
 structurization pass used on newer AMD hardware isn't fragile in the
 way you seem to imply -- AFAIK (unlike the old AMDIL heuristic
 algorithm) it's guaranteed to give you a valid structurized output no
 matter what the previous optimization passes have done to the CFG,
 modulo bugs.  I admit that the situation is nevertheless suboptimal.
 Ideally this information wouldn't get lost along the way.  For the long
 term we may want to represent structured control flow directly in the IR
 as you say, I just don't see how reinventing the IR saves us any work if
 we could just fix the existing one.

 It seems to me that something like how we represent control flow is a
 pretty fundamental part of the IR - it affects any optimization pass
 that needs to do anything beyond adding and removing instructions. How
 would you fix that, especially given that LLVM is primarily designed
 for CPU's where you don't want to be restricted to structured control
 flow at all? It seems like our goals (preserve the structure) conflict
 with the way LLVM has been designed.

 I think we can fix this by introducing new structured variants of the
 branch instruction in a way that doesn't alter the fundamental structure
 of the IR.  E.g. an if branch could look like:

 ifbr i1 cond, label iftrue, label iffalse, label join

 Where both 

Re: [Mesa-dev] [PATCH 8/8] mesa: simplify _mesa_update_draw_buffers()

2014-08-19 Thread Olivier Galibert
  Hi,

That patch makes glDrawBuffers(0, NULL); segfault because
_mesa_drawbuffers expects buffers[0] to be valid.  Note that the bug
is there, but I'm not sure what the final setup should look like in
that case.
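For illustration only, this is the kind of guard that avoids touching buffers[0] when the list is empty. The type name draw_buffer_t, the constant DRAW_BUFFER_NONE and the function first_draw_buffer are all hypothetical stand-ins, and treating an empty list as "no draw buffer" is an assumption; what the real state should end up as is exactly the open question above.

```c
#include <stddef.h>

/* Stand-ins for the GL types and enums; not Mesa definitions. */
typedef unsigned int draw_buffer_t;
#define DRAW_BUFFER_NONE 0u

/* Return the first requested draw buffer, or DRAW_BUFFER_NONE when the
 * caller passed an empty list (n == 0) or no array at all. */
draw_buffer_t first_draw_buffer(size_t n, const draw_buffer_t *buffers)
{
    if (n == 0 || buffers == NULL)
        return DRAW_BUFFER_NONE;  /* never dereference in this case */
    return buffers[0];
}
```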

Best,

  OG.

PS: reported by haagch on irc


On Fri, Aug 8, 2014 at 11:20 PM, Brian Paul bri...@vmware.com wrote:
 There's no need to copy the array of DrawBuffer enums to a temp array.
 ---
  src/mesa/main/buffers.c |9 ++---
  1 file changed, 2 insertions(+), 7 deletions(-)

 diff --git a/src/mesa/main/buffers.c b/src/mesa/main/buffers.c
 index 6b4fac9..140cf6e 100644
 --- a/src/mesa/main/buffers.c
 +++ b/src/mesa/main/buffers.c
 @@ -567,16 +567,11 @@ _mesa_drawbuffers(struct gl_context *ctx, GLuint n, 
 const GLenum *buffers,
  void
  _mesa_update_draw_buffers(struct gl_context *ctx)
  {
 -   GLenum buffers[MAX_DRAW_BUFFERS];
 -   GLuint i;
 -
 /* should be a window system FBO */
 assert(_mesa_is_winsys_fbo(ctx->DrawBuffer));

 -   for (i = 0; i < ctx->Const.MaxDrawBuffers; i++)
 -  buffers[i] = ctx->Color.DrawBuffer[i];
 -
 -   _mesa_drawbuffers(ctx, ctx->Const.MaxDrawBuffers, buffers, NULL);
 +   _mesa_drawbuffers(ctx, ctx->Const.MaxDrawBuffers,
 + ctx->Color.DrawBuffer, NULL);
  }


 --
 1.7.10.4



Re: [Mesa-dev] [Intel-gfx] [PATCH 1/5] intel gen4/5: fix GL_VERTEX_PROGRAM_TWO_SIDE.

2012-07-30 Thread Olivier Galibert
On Mon, Jul 30, 2012 at 10:30:57AM -0700, Eric Anholt wrote:
 I'm perfectly fine with the VUE containing slots for both when the app
 has gone out of its way to ask for deprecated two-sided color
 rendering.

Are you also ok with recompiling the shaders when that enable is
switched?

  OG.



Re: [Mesa-dev] [PATCH 1/5] intel gen4/5: fix GL_VERTEX_PROGRAM_TWO_SIDE.

2012-07-29 Thread Olivier Galibert
On Tue, Jul 17, 2012 at 07:37:43AM -0700, Paul Berry wrote:
 If possible, I would still like to think of a way to address this situation
 that (a) doesn't require modifying both fragment shader back-ends and the
 SF program, and (b) helps all Mesa drivers, not just Intel Gen4-5.
 Especially because I suspect we may have bugs in Gen6-7 related to this
 situation. 

You don't :-) It's correctly handled in
gen6_sf_state.c::get_attr_override with similar semantics too.

 Would you be happy with one of the following two alternatives?
 
 1. In the GLSL front-end, if we detect that a vertex shader writes to
 gl_BackColor but not gl_FrontColor, then automatically insert
 "gl_FrontColor = 0;" into the shader.  This will guarantee that whenever
 gl_BackColor is written, gl_FrontColor is too.
 
 2. In the function brw_compute_vue_map(), assign a VUE slot for
 VERT_RESULT_COL0 whenever *either* VERT_RESULT_COL0 or VERT_RESULT_BFC0 is
 used.  This will guarantee that we always have a VUE slot available for
 front color, so we don't have to be as tricky in the FS and SF code.
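Alternative 2 can be sketched as a small slot-assignment routine: a slot is reserved for the front color whenever either the front or the back color is written. The enum values, RESULT_BIT and compute_slot_map below are hypothetical simplifications for illustration, not the real brw_compute_vue_map() or its data structures.

```c
#include <stdint.h>

/* Hypothetical, reduced set of vertex outputs. */
enum vert_result {
    RESULT_POS,
    RESULT_COL0,
    RESULT_COL1,
    RESULT_BFC0,
    RESULT_BFC1,
    RESULT_MAX
};

#define RESULT_BIT(r) (UINT32_C(1) << (r))

/* Assign consecutive slots to the outputs that need one.
 * slot_of[r] is the slot index, or -1 if output r has no slot.
 * Returns the number of slots used. */
int compute_slot_map(uint32_t outputs_written, int slot_of[RESULT_MAX])
{
    uint32_t needed = outputs_written;
    int next_slot = 0;

    /* Front color gets a slot if either front or back color is written. */
    if (outputs_written & (RESULT_BIT(RESULT_COL0) | RESULT_BIT(RESULT_BFC0)))
        needed |= RESULT_BIT(RESULT_COL0);

    for (int r = 0; r < RESULT_MAX; r++)
        slot_of[r] = (needed & RESULT_BIT(r)) ? next_slot++ : -1;

    return next_slot;
}
```

Note that in this sketch the set of slots can differ from outputs_written, which is the mismatch the reply below is wary of.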

With both methods the SF code is not really simplified.  Doing the mov
without testing would require writing to/reserving a slot for
gl_BackColor if gl_FrontColor is written to, which wouldn't be
acceptable.  And to write to/reserve a slot for the two of them if
gl_Color is read in any case.  Probably unacceptable.  So the need_*
stuff is going to stay in any case :/

So the only simplification would be in the fs/wm and I'm somewhat
afraid of having a vue slot that's not in outputs_written of the
previous stage.  They seem to be expected to be equivalent.

 This morning I'll try to ask some other Intel folks for their opinion on
 the subject.

Did they have an opinion?

Best,

  OG.


Re: [Mesa-dev] [PATCH 1/9] intel gen4-5: fix the vue view in the fs.

2012-07-27 Thread Olivier Galibert
On Thu, Jul 26, 2012 at 10:18:01AM -0700, Eric Anholt wrote:
 Olivier Galibert galib...@pobox.com writes:
 
  In some cases the fragment shader view of the vue registers was out of
  sync with the builder.  This fixes it.
 
 s/builder/SF outputs/ ?
 
 I'd love to see the pre-gen6 code get rearranged so the FS walked the
 bitfield of FS inputs from SF and chose the urb offset for each.  But
 this does look like the minimal fix.

In other words, an explicit linking pass?  That could be useful with
geometry shaders, too.

  OG.


Re: [Mesa-dev] [PATCH 1/4] mesa: Add a Version field to the context with VersionMajor*10+VersionMinor.

2012-07-27 Thread Olivier Galibert
On Thu, Jul 26, 2012 at 05:27:43PM -0700, Eric Anholt wrote:
 As we get into supporting GL 3.x core, we come across more and more features
 of the API that depend on the version number as opposed to just the extension
 list.  This will let us more sanely do version checks than (VersionMajor == 3
 && VersionMinor >= 2) || VersionMajor >= 4.

Pure bikeshedding, but why not use *100 in order to be identical to glsl?
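A short C sketch of the two encodings under discussion: packing the version as major*10+minor turns the two-field comparison into a single integer test, and a *100 variant would line up with GLSL-style numbers such as 330. All function names here are invented for the example.

```c
/* Packed as proposed in the patch: 3.2 -> 32. */
unsigned gl_version_x10(unsigned major, unsigned minor)
{
    return major * 10 + minor;
}

/* GLSL-style packing suggested in the reply: 3.3 -> 330. */
unsigned gl_version_x100(unsigned major, unsigned minor)
{
    return major * 100 + minor * 10;
}

/* The check the commit message wants to get rid of. */
int at_least_3_2_split(unsigned major, unsigned minor)
{
    return (major == 3 && minor >= 2) || major >= 4;
}

/* The same check on the packed value. */
int at_least_3_2_packed(unsigned version)
{
    return version >= 32;
}
```

Both checks agree as long as the minor number stays below 10, which holds for every GL version to date.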

  OG.


Re: [Mesa-dev] Support for EXT/ARB_geometry_shader4

2012-07-27 Thread Olivier Galibert
On Fri, Jul 27, 2012 at 10:40:28AM -0500, Bryan Cain wrote:
 https://github.com/Plombo/mesa/tree/geometry-shaders .

Quick remarks from a fast read:
- you missed draw_pipe_clip.c:clip_init_state, where you need to plug
  in the gs info where appropriate.  Should be easy.  It will take
  care of the interpolation-on-clipping issues you currently have even
  if you don't know you have them :-)

- starting with 4.0 EmitVertex and EndPrimitive are in fact
  EmitStreamVertex(0) and EndStreamPrimitive(0).  It may be a good
  idea to implement the stream version at the ir_* level, even if the
  first implementation just ignores the parameter.

- all the is_*_shader boolean variables should probably be an integer
  shader type variable, since there will be two more types to add for
  4.0.

- I'm not sure we want to use _ARB versions of constants when the
  suffix-less versions exist and have the same value.

- cross_validate_outputs_to_inputs could use some kind of
  const char *_mesa_get_shader_type_string(gl_shader *sh) from
  somewhere like shaderapi.h.  We'll need two more shader types soon.
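The enum-plus-name-helper suggestion from the list above might look roughly like this in C. The enum shader_stage and the function shader_stage_name are hypothetical names chosen for the sketch; the real helper would presumably take a gl_shader and live next to the existing shader API code.

```c
/* One integer tag instead of several is_*_shader booleans.
 * New stages (e.g. the two expected for 4.0) become new entries here. */
enum shader_stage {
    STAGE_VERTEX,
    STAGE_GEOMETRY,
    STAGE_FRAGMENT,
    STAGE_COUNT
};

/* Human-readable stage name for linker error messages. */
const char *shader_stage_name(enum shader_stage stage)
{
    switch (stage) {
    case STAGE_VERTEX:   return "vertex";
    case STAGE_GEOMETRY: return "geometry";
    case STAGE_FRAGMENT: return "fragment";
    default:             return "unknown";
    }
}
```

A caller such as the output/input cross-validation could then print both stage names without a chain of boolean tests.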

I'll see how hard intel gen4 support looks to be, shouldn't be
that bad.  Need to finish clipper first though.

Best,

  OG.



Re: [Mesa-dev] [PATCH 1/7] glsl: Make bvec and ivec types accessible without using get_instance.

2012-07-27 Thread Olivier Galibert
On Fri, Jul 27, 2012 at 10:49:19AM -0700, Kenneth Graunke wrote:
 It's more convenient to use shortcuts like glsl_type::bvec2_type than
 the longwinded glsl_type::get_instance(GLSL_TYPE_BOOL, 2, 1).

Yay, code in zones I understand :-)

Reviewed-by: Olivier Galibert galib...@pobox.com


Re: [Mesa-dev] [PATCH 2/7] glsl: Add glsl_type::get_sampler_instance method.

2012-07-27 Thread Olivier Galibert
On Fri, Jul 27, 2012 at 10:49:20AM -0700, Kenneth Graunke wrote:
 +/**
 + * Convert sampler type attributes into an index in the sampler_types array
 + */
 +#define SAMPLER_TYPE_INDEX(dim, sample_type, array, shadow)  \
 +   ((unsigned(dim) * 12) + (sample_type * 4) + (unsigned(array) * 2) \
 ++ unsigned(shadow))
 +
 +/**
 + * \note
 + * Arrays like this are \b the argument for C99-style designated 
 initializers.
 + * Too bad C++ and VisualStudio are too cool for that sort of useful
 + * functionality.
 + */
 +const glsl_type *const glsl_type::sampler_types[] = {

Did you think about using a 4-dimensional array and letting the compiler
take care of the multiplies?  It may not be that much more readable though.



 +   /* GLSL_SAMPLER_DIM_1D */
 +   builtin_130_types[10],  /* uint */
 +   NULL,/* uint, shadow */

What does NULL mean?

  OG.
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 4/7] glsl: Request an Nx1 type instance in ir_quadop_vector lowering pass.

2012-07-27 Thread Olivier Galibert
On Fri, Jul 27, 2012 at 10:49:22AM -0700, Kenneth Graunke wrote:
 From: Ian Romanick ian.d.roman...@intel.com
 
 No types have 0 columns.  The glsl_type::get_instance method contains
 
     if ((rows < 1) || (rows > 4) || (columns < 1) || (columns > 4))
        return error_type;
 
 To get a vector, use columns = 1.

Reviewed-by: Olivier Galibert galib...@pobox.com

That's an obvious bugfix.  If there's a stable branch with the glsl
compiler in, it probably should go there.

  OG.

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 5/7] glsl: Add typeless constructor for quadop ir_expressions

2012-07-27 Thread Olivier Galibert
On Fri, Jul 27, 2012 at 10:49:23AM -0700, Kenneth Graunke wrote:
 From: Ian Romanick ian.d.roman...@intel.com
 
 This matches the typeless constructors for unop and binop
 ir_expressions.
 
 Signed-off-by: Ian Romanick ian.d.roman...@intel.com
 Reviewed-by: Kenneth Graunke kenn...@whitecape.org
 ---
  src/glsl/ir.cpp | 17 +
  src/glsl/ir.h   |  2 ++
  2 files changed, 19 insertions(+)
 
 diff --git a/src/glsl/ir.cpp b/src/glsl/ir.cpp
 index b0e38d8..5faf34a 100644
 --- a/src/glsl/ir.cpp
 +++ b/src/glsl/ir.cpp
 @@ -236,6 +236,23 @@ ir_expression::ir_expression(int op, const struct 
 glsl_type *type,
 this->operands[3] = op3;
  }
  
 +ir_expression::ir_expression(int op, ir_rvalue *op0, ir_rvalue *op1,
 +  ir_rvalue *op2, ir_rvalue *op3)
 +{
 +   assert(op0->type->is_scalar());
 +   assert((op0->type == op1->type)
 +          && (op0->type == op2->type)
 +          && (op0->type == op3->type));
 +
 +   this->ir_type = ir_type_expression;
 +   this->type = glsl_type::get_instance(op0->type->base_type, 4, 1);
 +   this->operation = ir_expression_operation(op);


You're hardcoding ir_quadop_vector's properties here.  A comment
saying so could be useful, if other quadops with different properties
happen someday.  In fact, you're hardcoding them so hard passing op
may not make sense.  A static method
   ir_expression *ir_expression::build_quadop_vector
perhaps?

  OG.
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 6/7] glsl: Fix ir_last_opcode value.

2012-07-27 Thread Olivier Galibert
On Fri, Jul 27, 2012 at 10:49:24AM -0700, Kenneth Graunke wrote:
 From: Ian Romanick ian.d.roman...@intel.com
 
 Now that ir_quadop_vector exists, ir_last_binop and ir_last_opcode are
 no longer the same.  Only one place currently uses this enumeration, and
 already handles ir_quadop_vector correctly.
 
 Signed-off-by: Ian Romanick ian.d.roman...@intel.com
 Signed-off-by: Kenneth Graunke kenn...@whitecape.org
 ---
  src/glsl/ir.h | 2 +-
  1 file changed, 1 insertion(+), 1 deletion(-)
 
 diff --git a/src/glsl/ir.h b/src/glsl/ir.h
 index e2743f6..a69494f 100644
 --- a/src/glsl/ir.h
 +++ b/src/glsl/ir.h
 @@ -1027,7 +1027,7 @@ enum ir_expression_operation {
 /**
  * A sentinel marking the last of all operations.
  */
 -   ir_last_opcode = ir_last_binop
 +   ir_last_opcode = ir_quadop_vector
  };

Another obvious-in-hindsight bugfix.

Reviewed-by: Olivier Galibert galib...@pobox.com
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 7/7] ir_to_mesa: Don't set component for ir_dereference in ir_quadop_vector

2012-07-27 Thread Olivier Galibert
On Fri, Jul 27, 2012 at 10:49:25AM -0700, Kenneth Graunke wrote:
 From: Ian Romanick ian.d.roman...@intel.com
 
 There can only be one variable used in an ir_quadop_vector.  Accesses
 of this variable must be swizzled.

There's nothing anywhere ensuring the presence of the swizzle.  I
completely agree that trashing the components value is a bad idea, but
you should have SWIZZLE_X as a default value for the components array.

Amusingly enough, it's already the case (SWIZZLE_X==0), so making it
explicit would be perfect.

  OG.
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH] sp_tex_sample: Fix stupid copy/paste error.

2012-07-24 Thread Olivier Galibert
diff --git a/src/gallium/drivers/softpipe/sp_tex_sample.c 
b/src/gallium/drivers/softpipe/sp_tex_sample.c
index f215b90..0aeb8e2 100644
--- a/src/gallium/drivers/softpipe/sp_tex_sample.c
+++ b/src/gallium/drivers/softpipe/sp_tex_sample.c
@@ -1950,8 +1950,8 @@ mip_filter_linear_2d_linear_repeat_POT(
  float rgbax[TGSI_NUM_CHANNELS][TGSI_QUAD_SIZE];
  int c;
 
- img_filter_2d_linear_repeat_POT(tgsi_sampler, s[j], t[j], p[j], 
level0,   samp->faces[j], tgsi_sampler_lod_bias, rgbax[0][j]);
- img_filter_2d_linear_repeat_POT(tgsi_sampler, s[j], t[j], p[j], 
level0+1, samp->faces[j], tgsi_sampler_lod_bias, rgbax[0][j]);
+ img_filter_2d_linear_repeat_POT(tgsi_sampler, s[j], t[j], p[j], 
level0,   samp->faces[j], tgsi_sampler_lod_bias, rgbax[0][0]);
+ img_filter_2d_linear_repeat_POT(tgsi_sampler, s[j], t[j], p[j], 
level0+1, samp->faces[j], tgsi_sampler_lod_bias, rgbax[0][1]);
 
  for (c = 0; c < TGSI_NUM_CHANNELS; c++)
 rgba[c][j] = lerp(levelBlend, rgbax[c][0], rgbax[c][1]);
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH] sp_tex_sample: Fix segfault with fbo-cubemap.

2012-07-19 Thread Olivier Galibert
On Thu, Jul 19, 2012 at 10:57:38AM -0600, Brian Paul wrote:
 
 static const float ...

Indeed.


 Reviewed-by: Brian Paul bri...@vmware.com

Thanks.  Could you commit it please?

Best,

  OG.
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] (no subject)

2012-07-19 Thread Olivier Galibert
  Hi,

This is the second version of the clipping/interpolation patches.

Main differences:
- I tried to take all of Paul's remarks into account
- I split the first patch into 4 independent ones
- I've added a patch to ensure that integers pass through unscathed

Patch 4/9 is (slightly) controversial.  There may be better ways to do
it, or at least more general ones.  But it's simple, it works, and it
allows validating the other 8.  It's an easy one to revert if we
build an alternative.

Best,

  OG.
 
[PATCH 1/9] intel gen4-5: fix the vue view in the fs.
[PATCH 2/9] intel gen4-5: simplify the bfc copy in the sf.
[PATCH 3/9] intel gen4-5: fix GL_VERTEX_PROGRAM_TWO_SIDE selection.
[PATCH 4/9] intel gen4-5: Fix backface/frontface selection when one
[PATCH 5/9] intel gen4-5: Compute the interpolation status for every
[PATCH 6/9] intel gen4-5: Correctly setup the parameters in the sf.
[PATCH 7/9] intel gen4-5: Correctly handle flat vs. non-flat in the
[PATCH 8/9] intel gen4-5: Make noperspective clipping work.
[PATCH 9/9] intel gen4-5: Don't touch flatshaded values when
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH 1/9] intel gen4-5: fix the vue view in the fs.

2012-07-19 Thread Olivier Galibert
In some cases the fragment shader view of the vue registers was out of
sync with the builder.  This fixes it.

Signed-off-by: Olivier Galibert galib...@pobox.com
---
 src/mesa/drivers/dri/i965/brw_fs.cpp |9 -
 src/mesa/drivers/dri/i965/brw_wm_pass2.c |   10 +-
 2 files changed, 17 insertions(+), 2 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_fs.cpp 
b/src/mesa/drivers/dri/i965/brw_fs.cpp
index b3b25cc..3f98137 100644
--- a/src/mesa/drivers/dri/i965/brw_fs.cpp
+++ b/src/mesa/drivers/dri/i965/brw_fs.cpp
@@ -972,8 +972,15 @@ fs_visitor::calculate_urb_setup()
 if (c->key.vp_outputs_written & BITFIELD64_BIT(i)) {
int fp_index = _mesa_vert_result_to_frag_attrib((gl_vert_result) i);
 
+   /* The back color slot is skipped when the front color is
+* also written to.  In addition, some slots can be
+* written in the vertex shader and not read in the
+* fragment shader.  So the register number must always be
+* incremented, mapped or not.
+*/
if (fp_index >= 0)
-  urb_setup[fp_index] = urb_next++;
+  urb_setup[fp_index] = urb_next;
+   urb_next++;
 }
   }
 
diff --git a/src/mesa/drivers/dri/i965/brw_wm_pass2.c 
b/src/mesa/drivers/dri/i965/brw_wm_pass2.c
index 27c0a94..eacf7c0 100644
--- a/src/mesa/drivers/dri/i965/brw_wm_pass2.c
+++ b/src/mesa/drivers/dri/i965/brw_wm_pass2.c
@@ -97,8 +97,16 @@ static void init_registers( struct brw_wm_compile *c )
int fp_index = _mesa_vert_result_to_frag_attrib(j);
 
nr_interp_regs++;
+
+   /* The back color slot is skipped when the front color is
+* also written to.  In addition, some slots can be
+* written in the vertex shader and not read in the
+* fragment shader.  So the register number must always be
+* incremented, mapped or not.
+*/
if (fp_index >= 0)
-  prealloc_reg(c, c->payload.input_interp[fp_index], i++);
+  prealloc_reg(c, c->payload.input_interp[fp_index], i);
+i++;
 }
   }
   assert(nr_interp_regs >= 1);
-- 
1.7.10.280.gaa39
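
The numbering rule this patch enforces can be sketched outside the driver as follows.  This is an illustration under stated assumptions, not the driver code: `assign_slots` and its parameters are hypothetical names, `written` stands for the vertex outputs bitmask and `frag_index` for the vert-result-to-frag-attrib mapping (-1 when unmapped).

```cpp
#include <cstddef>
#include <vector>

// For each fragment input, compute the register slot it is read from.
// A slot is consumed by every written vertex output, whether or not a
// fragment input maps to it, so the counter advances unconditionally.
std::vector<int>
assign_slots(const std::vector<bool> &written,
             const std::vector<int> &frag_index,
             int num_frag_inputs)
{
   std::vector<int> slot(num_frag_inputs, -1);
   int next = 0;

   for (std::size_t i = 0; i < written.size(); i++) {
      if (!written[i])
         continue;
      if (frag_index[i] >= 0)
         slot[frag_index[i]] = next;
      next++;   // incremented whether the output was mapped or not
   }
   return slot;
}
```

If the increment were tied to the mapping (as in the removed `urb_next++` form), an unmapped output in the middle would shift every later input down by one register.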

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH 2/9] intel gen4-5: simplify the bfc copy in the sf.

2012-07-19 Thread Olivier Galibert
This patch is mostly designed to make followup patches simpler, but
it's a simplification by itself.

Signed-off-by: Olivier Galibert galib...@pobox.com
---
 src/mesa/drivers/dri/i965/brw_sf_emit.c |   93 +--
 1 file changed, 52 insertions(+), 41 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_sf_emit.c 
b/src/mesa/drivers/dri/i965/brw_sf_emit.c
index ff6383b..9d8aa38 100644
--- a/src/mesa/drivers/dri/i965/brw_sf_emit.c
+++ b/src/mesa/drivers/dri/i965/brw_sf_emit.c
@@ -79,24 +79,9 @@ have_attr(struct brw_sf_compile *c, GLuint attr)
 /*** 
  * Twoside lighting
  */
-static void copy_bfc( struct brw_sf_compile *c,
- struct brw_reg vert )
-{
-   struct brw_compile *p = &c->func;
-   GLuint i;
-
-   for (i = 0; i < 2; i++) {
-      if (have_attr(c, VERT_RESULT_COL0+i) &&
- have_attr(c, VERT_RESULT_BFC0+i))
-brw_MOV(p, 
-get_vert_result(c, vert, VERT_RESULT_COL0+i),
-get_vert_result(c, vert, VERT_RESULT_BFC0+i));
-   }
-}
-
-
 static void do_twoside_color( struct brw_sf_compile *c )
 {
+   GLuint i, need_0, need_1;
struct brw_compile *p = &c->func;
GLuint backface_conditional = c->key.frontface_ccw ? BRW_CONDITIONAL_G : 
BRW_CONDITIONAL_L;
 
@@ -105,12 +90,14 @@ static void do_twoside_color( struct brw_sf_compile *c )
if (c->key.primitive == SF_UNFILLED_TRIS)
   return;
 
-   /* XXX: What happens if BFC isn't present?  This could only happen
-* for user-supplied vertex programs, as t_vp_build.c always does
-* the right thing.
+   /* If the vertex shader provides both front and backface color, do
+* the selection.  Otherwise the generated code will pick up
+* whichever there is.
 */
-   if (!(have_attr(c, VERT_RESULT_COL0) && have_attr(c, VERT_RESULT_BFC0)) &&
-   !(have_attr(c, VERT_RESULT_COL1) && have_attr(c, VERT_RESULT_BFC1)))
+   need_0 = have_attr(c, VERT_RESULT_COL0) && have_attr(c, VERT_RESULT_BFC0);
+   need_1 = have_attr(c, VERT_RESULT_COL1) && have_attr(c, VERT_RESULT_BFC1);
+
+   if (!need_0 && !need_1)
   return;

/* Need to use BRW_EXECUTE_4 and also do an 4-wide compare in order
@@ -121,12 +108,15 @@ static void do_twoside_color( struct brw_sf_compile *c )
brw_push_insn_state(p);
brw_CMP(p, vec4(brw_null_reg()), backface_conditional, c->det, 
brw_imm_f(0));
brw_IF(p, BRW_EXECUTE_4);
-   {
-  switch (c-nr_verts) {
-  case 3: copy_bfc(c, c->vert[2]);
-  case 2: copy_bfc(c, c->vert[1]);
-  case 1: copy_bfc(c, c->vert[0]);
-  }
+   for (i=0; i<c->nr_verts; i++) {
+  if (need_0)
+brw_MOV(p, 
+get_vert_result(c, c->vert[i], VERT_RESULT_COL0),
+get_vert_result(c, c->vert[i], VERT_RESULT_BFC0));
+  if (need_1)
+brw_MOV(p, 
+get_vert_result(c, c->vert[i], VERT_RESULT_COL1),
+get_vert_result(c, c->vert[i], VERT_RESULT_BFC1));
}
brw_ENDIF(p);
brw_pop_insn_state(p);
@@ -139,20 +129,27 @@ static void do_twoside_color( struct brw_sf_compile *c )
  */
 
 #define VERT_RESULT_COLOR_BITS (BITFIELD64_BIT(VERT_RESULT_COL0) | \
-   BITFIELD64_BIT(VERT_RESULT_COL1))
+BITFIELD64_BIT(VERT_RESULT_COL1))
 
 static void copy_colors( struct brw_sf_compile *c,
 struct brw_reg dst,
-struct brw_reg src)
+ struct brw_reg src,
+ int allow_twoside)
 {
struct brw_compile *p = &c->func;
GLuint i;
 
for (i = VERT_RESULT_COL0; i <= VERT_RESULT_COL1; i++) {
-  if (have_attr(c,i))
+  if (have_attr(c,i)) {
 brw_MOV(p, 
 get_vert_result(c, dst, i),
 get_vert_result(c, src, i));
+
+  } else if(allow_twoside && have_attr(c, i - VERT_RESULT_COL0 + 
VERT_RESULT_BFC0)) {
+brw_MOV(p, 
+get_vert_result(c, dst, i - VERT_RESULT_COL0 + 
VERT_RESULT_BFC0),
+get_vert_result(c, src, i - VERT_RESULT_COL0 + 
VERT_RESULT_BFC0));
+  }
}
 }
 
@@ -167,9 +164,19 @@ static void do_flatshade_triangle( struct brw_sf_compile 
*c )
struct brw_compile *p = &c->func;
struct intel_context *intel = &p->brw->intel;
struct brw_reg ip = brw_ip_reg();
-   GLuint nr = _mesa_bitcount_64(c->key.attrs & VERT_RESULT_COLOR_BITS);
GLuint jmpi = 1;
 
+   GLuint nr;
+
+   if (c->key.do_twoside_color) {
+  nr = ((c->key.attrs & (BITFIELD64_BIT(VERT_RESULT_COL0) | 
BITFIELD64_BIT(VERT_RESULT_BFC0))) != 0) +
+ ((c->key.attrs & (BITFIELD64_BIT(VERT_RESULT_COL1) | 
BITFIELD64_BIT(VERT_RESULT_BFC1))) != 0);
+
+   } else {
+  nr = ((c->key.attrs & BITFIELD64_BIT(VERT_RESULT_COL0)) != 0) +
+ ((c->key.attrs & BITFIELD64_BIT(VERT_RESULT_COL1)) != 0);
+   }
+
if (!nr)
   return;
 
@@ -186,16 +193,16 @@ static void do_flatshade_triangle( struct brw_sf_compile 
*c )
brw_MUL(p, c->pv, c->pv

[Mesa-dev] [PATCH 3/9] intel gen4-5: fix GL_VERTEX_PROGRAM_TWO_SIDE selection.

2012-07-19 Thread Olivier Galibert
The previous code only selected two-sided color in pure fixed-function
setups.  This version also activates it when needed with shader programs.

Signed-off-by: Olivier Galibert galib...@pobox.com
---
 src/mesa/drivers/dri/i965/brw_sf.c |2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/src/mesa/drivers/dri/i965/brw_sf.c 
b/src/mesa/drivers/dri/i965/brw_sf.c
index 23a874a..791210f 100644
--- a/src/mesa/drivers/dri/i965/brw_sf.c
+++ b/src/mesa/drivers/dri/i965/brw_sf.c
@@ -192,7 +192,7 @@ brw_upload_sf_prog(struct brw_context *brw)
 
/* _NEW_LIGHT */
key.do_flat_shading = (ctx->Light.ShadeModel == GL_FLAT);
-   key.do_twoside_color = (ctx->Light.Enabled && ctx->Light.Model.TwoSide);
+   key.do_twoside_color = ctx->VertexProgram._TwoSideEnabled;
 
/* _NEW_POLYGON */
if (key.do_twoside_color) {
-- 
1.7.10.280.gaa39

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH 4/9] intel gen4-5: Fix backface/frontface selection when only one color is written to.

2012-07-19 Thread Olivier Galibert
Shaders, piglit test ones in particular, may write only to one of
gl_FrontColor/gl_BackColor.  The standard is unclear on whether the
behaviour is defined in that case, but it seems reasonable to support
it.

The choice made here is to pick up whichever color was actually written
to.  That makes most of the generated piglit tests useless for testing
backface selection, but it's simple and it works.

Signed-off-by: Olivier Galibert galib...@pobox.com
---
 src/mesa/drivers/dri/i965/brw_fs.cpp |9 +
 src/mesa/drivers/dri/i965/brw_wm_pass2.c |9 +
 2 files changed, 18 insertions(+)

diff --git a/src/mesa/drivers/dri/i965/brw_fs.cpp 
b/src/mesa/drivers/dri/i965/brw_fs.cpp
index 3f98137..3b62952 100644
--- a/src/mesa/drivers/dri/i965/brw_fs.cpp
+++ b/src/mesa/drivers/dri/i965/brw_fs.cpp
@@ -972,6 +972,15 @@ fs_visitor::calculate_urb_setup()
 if (c->key.vp_outputs_written & BITFIELD64_BIT(i)) {
int fp_index = _mesa_vert_result_to_frag_attrib((gl_vert_result) i);
 
+/* Special case: two-sided vertex option, vertex program
+ * only writes to the back color.  Map it to the
+ * associated front color location.
+ */
+if (i >= VERT_RESULT_BFC0 && i <= VERT_RESULT_BFC1 &&
+ctx->VertexProgram._TwoSideEnabled &&
+urb_setup[i - VERT_RESULT_BFC0 + FRAG_ATTRIB_COL0] == -1)
+   fp_index = i - VERT_RESULT_BFC0 + FRAG_ATTRIB_COL0;
+
/* The back color slot is skipped when the front color is
 * also written to.  In addition, some slots can be
 * written in the vertex shader and not read in the
diff --git a/src/mesa/drivers/dri/i965/brw_wm_pass2.c 
b/src/mesa/drivers/dri/i965/brw_wm_pass2.c
index eacf7c0..48143f3 100644
--- a/src/mesa/drivers/dri/i965/brw_wm_pass2.c
+++ b/src/mesa/drivers/dri/i965/brw_wm_pass2.c
@@ -96,6 +96,15 @@ static void init_registers( struct brw_wm_compile *c )
 if (c->key.vp_outputs_written & BITFIELD64_BIT(j)) {
int fp_index = _mesa_vert_result_to_frag_attrib(j);
 
+/* Special case: two-sided vertex option, vertex program
+ * only writes to the back color.  Map it to the
+ * associated front color location.
+ */
+if (j >= VERT_RESULT_BFC0 && j <= VERT_RESULT_BFC1 &&
+intel->ctx.VertexProgram._TwoSideEnabled &&
+!(c->key.vp_outputs_written & BITFIELD64_BIT(j - 
VERT_RESULT_BFC0 + VERT_RESULT_COL0)))
+   fp_index = j - VERT_RESULT_BFC0 + FRAG_ATTRIB_COL0;
+
nr_interp_regs++;
 
/* The back color slot is skipped when the front color is
-- 
1.7.10.280.gaa39
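
The special case added by this patch can be restated as a small standalone sketch.  The reduced output numbering (`OUT_COL0` and friends) and the helper name `front_slot_for_back` are hypothetical and exist only for this illustration; the real code works on VERT_RESULT_* and FRAG_ATTRIB_* values.

```cpp
#include <cstdint>

// Hypothetical, reduced output numbering for illustration only.
enum { OUT_COL0 = 0, OUT_COL1 = 1, OUT_BFC0 = 2, OUT_BFC1 = 3 };

// Returns the front-color output that back-color output `out` should
// stand in for, or -1 when no remapping applies.
int
front_slot_for_back(int out, uint64_t outputs_written, bool two_side_enabled)
{
   if (out < OUT_BFC0 || out > OUT_BFC1 || !two_side_enabled)
      return -1;

   int front = out - OUT_BFC0 + OUT_COL0;

   // When the matching front color is also written, the regular
   // front/back selection handles it and nothing is remapped.
   if (outputs_written & (uint64_t(1) << front))
      return -1;

   return front;
}
```

A shader that writes only the back color then still feeds the fragment shader's color input, which is the "pick up whichever color was written" behaviour described in the commit message.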

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH 5/9] intel gen4-5: Compute the interpolation status for every variable in one place.

2012-07-19 Thread Olivier Galibert
The program keys are updated accordingly, but the values are not used
yet.

Signed-off-by: Olivier Galibert galib...@pobox.com
---
 src/mesa/drivers/dri/i965/brw_clip.c|   90 ++-
 src/mesa/drivers/dri/i965/brw_clip.h|1 +
 src/mesa/drivers/dri/i965/brw_context.h |   11 
 src/mesa/drivers/dri/i965/brw_sf.c  |5 +-
 src/mesa/drivers/dri/i965/brw_sf.h  |1 +
 src/mesa/drivers/dri/i965/brw_wm.c  |2 +
 src/mesa/drivers/dri/i965/brw_wm.h  |1 +
 7 files changed, 109 insertions(+), 2 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_clip.c 
b/src/mesa/drivers/dri/i965/brw_clip.c
index d411208..b4a2e0a 100644
--- a/src/mesa/drivers/dri/i965/brw_clip.c
+++ b/src/mesa/drivers/dri/i965/brw_clip.c
@@ -47,6 +47,86 @@
 #define FRONT_UNFILLED_BIT  0x1
 #define BACK_UNFILLED_BIT   0x2
 
+/**
+ * Lookup the interpolation mode information for every element in the
+ * vue.
+ */
+static void
+brw_lookup_interpolation(struct brw_context *brw)
+{
+   /* pprog means previous program, i.e. the last program before the
+* fragment shader.  It can only be the vertex shader for now, but
+* it may be a geometry shader in the future.
+*/
+   const struct gl_program *pprog = &brw->vertex_program->Base;
+   const struct gl_fragment_program *fprog = brw->fragment_program;
+   struct brw_vue_map *vue_map = &brw->vs.prog_data->vue_map;
+
+   /* Default everything to INTERP_QUALIFIER_NONE */
+   memset(brw->interpolation_mode, INTERP_QUALIFIER_NONE, BRW_VERT_RESULT_MAX);
+
+   /* If there is no fragment shader, interpolation won't be needed,
+* so defaulting to none is good.
+*/
+   if (!fprog)
+  return;
+
+   for (int i = 0; i < vue_map->num_slots; i++) {
+  /* First lookup the vert result, skip if there isn't one */
+  int vert_result = vue_map->slot_to_vert_result[i];
+  if (vert_result == BRW_VERT_RESULT_MAX)
+ continue;
+
+  /* HPOS is special.  In the clipper, it is handled specifically,
+   * so its value is irrelevant.  In the sf, it's forced to
+   * linear.  In the wm, it's special cased, irrelevant again.  So
+   * force linear to remove the sf special case.
+   */
+  if (vert_result == VERT_RESULT_HPOS) {
+ brw->interpolation_mode[i] = INTERP_QUALIFIER_NOPERSPECTIVE;
+ continue;
+  }
+
+  /* There is a 1-1 mapping of vert result to frag attrib except
+   * for BackColor and vars
+   */
+  int frag_attrib = vert_result;
+  if (vert_result >= VERT_RESULT_BFC0 && vert_result <= VERT_RESULT_BFC1)
+ frag_attrib = vert_result - VERT_RESULT_BFC0 + FRAG_ATTRIB_COL0;
+  else if(vert_result >= VERT_RESULT_VAR0)
+ frag_attrib = vert_result - VERT_RESULT_VAR0 + FRAG_ATTRIB_VAR0;
+
+  /* If the output is not used by the fragment shader, skip it. */
+  if (!(fprog->Base.InputsRead & BITFIELD64_BIT(frag_attrib)))
+ continue;
+
+  /* Lookup the interpolation mode */
+  enum glsl_interp_qualifier interpolation_mode = 
fprog->InterpQualifier[frag_attrib];
+
+  /* If the mode is not specified, then the default varies.  Color
+   * values follow the shader model, while all the rest uses
+   * smooth.
+   */
+  if (interpolation_mode == INTERP_QUALIFIER_NONE) {
+ if (frag_attrib >= FRAG_ATTRIB_COL0 && frag_attrib <= 
FRAG_ATTRIB_COL1)
+interpolation_mode = brw->intel.ctx.Light.ShadeModel == GL_FLAT ? 
INTERP_QUALIFIER_FLAT : INTERP_QUALIFIER_SMOOTH;
+ else
+interpolation_mode = INTERP_QUALIFIER_SMOOTH;
+  }
+
+  /* Finally, if we have both a front color and a back color for
+   * the same channel, the selection will be done before
+   * interpolation and the back color copied over the front color
+   * if necessary.  So interpolating the back color is
+   * unnecessary.
+   */
+  if (vert_result >= VERT_RESULT_BFC0 && vert_result <= VERT_RESULT_BFC1)
+ if (pprog->OutputsWritten & BITFIELD64_BIT(vert_result - 
VERT_RESULT_BFC0 + VERT_RESULT_COL0))
+interpolation_mode = INTERP_QUALIFIER_NONE;
+
+  brw->interpolation_mode[i] = interpolation_mode;
+   }
+}
 
 static void compile_clip_prog( struct brw_context *brw,
 struct brw_clip_prog_key *key )
@@ -143,6 +223,10 @@ brw_upload_clip_prog(struct brw_context *brw)
 
/* Populate the key:
 */
+
+   /* BRW_NEW_FRAGMENT_PROGRAM, _NEW_LIGHT */
+   brw_lookup_interpolation(brw);
+
/* BRW_NEW_REDUCED_PRIMITIVE */
key.primitive = brw->intel.reduced_primitive;
/* CACHE_NEW_VS_PROG (also part of VUE map) */
@@ -150,6 +234,10 @@ brw_upload_clip_prog(struct brw_context *brw)
/* _NEW_LIGHT */
key.do_flat_shading = (ctx->Light.ShadeModel == GL_FLAT);
key.pv_first = (ctx->Light.ProvokingVertex == GL_FIRST_VERTEX_CONVENTION);
+
+   /* BRW_NEW_FRAGMENT_PROGRAM, _NEW_LIGHT */
+   memcpy(key.interpolation_mode, brw->interpolation_mode, 
BRW_VERT_RESULT_MAX
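
The default-resolution rule described in the comments of brw_lookup_interpolation() can be condensed into a small sketch.  The enum and function names below are hypothetical simplifications, not the driver's types: an explicit qualifier wins, unqualified colors follow the shade model, and everything else defaults to smooth.

```cpp
// Simplified stand-in for the interpolation qualifiers used in the patch.
enum interp_mode {
   INTERP_NONE,
   INTERP_SMOOTH,
   INTERP_FLAT,
   INTERP_NOPERSPECTIVE
};

// Resolve the effective mode of one fragment input.
// declared:         qualifier written in the shader, INTERP_NONE if absent
// is_color:         true for the primary/secondary color inputs
// flat_shade_model: true when the GL shade model is GL_FLAT
interp_mode
resolve_interp(interp_mode declared, bool is_color, bool flat_shade_model)
{
   if (declared != INTERP_NONE)
      return declared;
   if (is_color && flat_shade_model)
      return INTERP_FLAT;
   return INTERP_SMOOTH;
}
```

Resolving this once per slot is what lets the clipper and sf stages consult a single table instead of each re-deriving the defaults.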

[Mesa-dev] [PATCH 6/9] intel gen4-5: Correctly setup the parameters in the sf.

2012-07-19 Thread Olivier Galibert
This patch also corrects a couple of problems with noperspective
interpolation.

At this point all the glsl 1.1/1.3 interpolation tests that do not
clip pass (the -none ones).

The fs code does not use the pre-resolved interpolation modes in order
not to mess with gen6+.  Sharing the resolution would require putting
brw_wm_prog before brw_clip_prog and brw_sf_prog.  This may be a good
thing, but it could have unexpected consequences, so it is better
done independently in any case.

Signed-off-by: Olivier Galibert galib...@pobox.com
Reviewed-by: Paul Berry stereotype...@gmail.com
---
 src/mesa/drivers/dri/i965/brw_fs.cpp |2 +-
 src/mesa/drivers/dri/i965/brw_fs_visitor.cpp |   15 +++
 src/mesa/drivers/dri/i965/brw_sf.c   |   12 +-
 src/mesa/drivers/dri/i965/brw_sf.h   |2 +-
 src/mesa/drivers/dri/i965/brw_sf_emit.c  |  164 +-
 5 files changed, 106 insertions(+), 89 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_fs.cpp 
b/src/mesa/drivers/dri/i965/brw_fs.cpp
index 3b62952..4734a5d 100644
--- a/src/mesa/drivers/dri/i965/brw_fs.cpp
+++ b/src/mesa/drivers/dri/i965/brw_fs.cpp
@@ -757,7 +757,7 @@ fs_visitor::emit_general_interpolation(ir_variable *ir)
  inst->predicated = true;
  inst->predicate_inverse = true;
   }
- if (intel->gen < 6) {
+ if (intel->gen < 6 && interpolation_mode == 
INTERP_QUALIFIER_SMOOTH) {
 emit(BRW_OPCODE_MUL, attr, attr, this->pixel_w);
  }
   }
diff --git a/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp 
b/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp
index 08c0130..c6dc265 100644
--- a/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp
+++ b/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp
@@ -1872,6 +1872,21 @@ fs_visitor::emit_interpolation_setup_gen4()
emit(BRW_OPCODE_ADD, this->delta_y[BRW_WM_PERSPECTIVE_PIXEL_BARYCENTRIC],
this->pixel_y, fs_reg(negate(brw_vec1_grf(1, 1))));
 
+   /*
+* On Gen4-5, we accomplish perspective-correct interpolation by
+* dividing the attribute values by w in the sf shader,
+* interpolating the result linearly in screen space, and then
+* multiplying by w in the fragment shader.  So the interpolation
+* step is always linear in screen space, regardless of whether the
+* attribute is perspective or non-perspective.  Accordingly, we
+* use the same delta_x and delta_y values for both kinds of
+* interpolation.
+*/
+   this->delta_x[BRW_WM_NONPERSPECTIVE_PIXEL_BARYCENTRIC] =
+ this->delta_x[BRW_WM_PERSPECTIVE_PIXEL_BARYCENTRIC];
+   this->delta_y[BRW_WM_NONPERSPECTIVE_PIXEL_BARYCENTRIC] =
+ this->delta_y[BRW_WM_PERSPECTIVE_PIXEL_BARYCENTRIC];
+
this->current_annotation = "compute pos.w and 1/pos.w";
/* Compute wpos.w.  It's always in our setup, since it's needed to
 * interpolate the other attributes.
diff --git a/src/mesa/drivers/dri/i965/brw_sf.c 
b/src/mesa/drivers/dri/i965/brw_sf.c
index 26cbaf7..c00e85a 100644
--- a/src/mesa/drivers/dri/i965/brw_sf.c
+++ b/src/mesa/drivers/dri/i965/brw_sf.c
@@ -139,6 +139,7 @@ brw_upload_sf_prog(struct brw_context *brw)
struct brw_sf_prog_key key;
/* _NEW_BUFFERS */
bool render_to_fbo = _mesa_is_user_fbo(ctx->DrawBuffer);
+   int i;
 
memset(&key, 0, sizeof(key));
 
@@ -190,11 +191,16 @@ brw_upload_sf_prog(struct brw_context *brw)
if ((ctx->Point.SpriteOrigin == GL_LOWER_LEFT) != render_to_fbo)
   key.sprite_origin_lower_left = true;
 
-   /* _NEW_LIGHT */
-   key.do_flat_shading = (ctx->Light.ShadeModel == GL_FLAT);
+   /* BRW_NEW_FRAGMENT_PROGRAM, _NEW_LIGHT */
+   key.has_flat_shading = 0;
+   for (i = 0; i < BRW_VERT_RESULT_MAX; i++) {
+  if (brw->interpolation_mode[i] == INTERP_QUALIFIER_FLAT) {
+ key.has_flat_shading = 1;
+ break;
+  }
+   }
key.do_twoside_color = ctx->VertexProgram._TwoSideEnabled;
 
-   /* BRW_NEW_FRAGMENT_PROGRAM, _NEW_LIGHT */
memcpy(key.interpolation_mode, brw->interpolation_mode, 
BRW_VERT_RESULT_MAX);
 
/* _NEW_POLYGON */
diff --git a/src/mesa/drivers/dri/i965/brw_sf.h 
b/src/mesa/drivers/dri/i965/brw_sf.h
index 5e261fb..47fdb3e 100644
--- a/src/mesa/drivers/dri/i965/brw_sf.h
+++ b/src/mesa/drivers/dri/i965/brw_sf.h
@@ -50,7 +50,7 @@ struct brw_sf_prog_key {
uint8_t point_sprite_coord_replace;
GLuint primitive:2;
GLuint do_twoside_color:1;
-   GLuint do_flat_shading:1;
+   GLuint has_flat_shading:1;
GLuint frontface_ccw:1;
GLuint do_point_sprite:1;
GLuint do_point_coord:1;
diff --git a/src/mesa/drivers/dri/i965/brw_sf_emit.c 
b/src/mesa/drivers/dri/i965/brw_sf_emit.c
index 9d8aa38..c99578a 100644
--- a/src/mesa/drivers/dri/i965/brw_sf_emit.c
+++ b/src/mesa/drivers/dri/i965/brw_sf_emit.c
@@ -44,6 +44,17 @@
 
 
 /**
+ * Determine the vue slot corresponding to the given half of the given
+ * register.  half=0 means the first half of a register, half=1 means the
+ * second half

[Mesa-dev] [PATCH 7/9] intel gen4-5: Correctly handle flat vs. non-flat in the clipper.

2012-07-19 Thread Olivier Galibert
At this point, all interpolation piglit tests involving fixed clipping
work as long as there's no noperspective.

Signed-off-by: Olivier Galibert galib...@pobox.com
Reviewed-by: Paul Berry stereotype...@gmail.com
---
 src/mesa/drivers/dri/i965/brw_clip.c  |   13 --
 src/mesa/drivers/dri/i965/brw_clip.h  |6 +--
 src/mesa/drivers/dri/i965/brw_clip_line.c |6 +--
 src/mesa/drivers/dri/i965/brw_clip_tri.c  |   20 -
 src/mesa/drivers/dri/i965/brw_clip_unfilled.c |2 +-
 src/mesa/drivers/dri/i965/brw_clip_util.c |   56 +++--
 src/mesa/drivers/dri/i965/brw_sf_emit.c   |8 
 7 files changed, 50 insertions(+), 61 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_clip.c 
b/src/mesa/drivers/dri/i965/brw_clip.c
index b4a2e0a..8512172 100644
--- a/src/mesa/drivers/dri/i965/brw_clip.c
+++ b/src/mesa/drivers/dri/i965/brw_clip.c
@@ -218,7 +218,7 @@ brw_upload_clip_prog(struct brw_context *brw)
struct intel_context *intel = &brw->intel;
struct gl_context *ctx = &intel->ctx;
struct brw_clip_prog_key key;
-
+   int i;
memset(&key, 0, sizeof(key));
 
/* Populate the key:
@@ -231,11 +231,16 @@ brw_upload_clip_prog(struct brw_context *brw)
key.primitive = brw->intel.reduced_primitive;
/* CACHE_NEW_VS_PROG (also part of VUE map) */
key.attrs = brw->vs.prog_data->outputs_written;
-   /* _NEW_LIGHT */
-   key.do_flat_shading = (ctx->Light.ShadeModel == GL_FLAT);
+   /* BRW_NEW_FRAGMENT_PROGRAM, _NEW_LIGHT */
+   key.has_flat_shading = 0;
+   for (i = 0; i < BRW_VERT_RESULT_MAX; i++) {
+  if (brw->interpolation_mode[i] == INTERP_QUALIFIER_FLAT) {
+ key.has_flat_shading = 1;
+ break;
+  }
+   }
key.pv_first = (ctx->Light.ProvokingVertex == GL_FIRST_VERTEX_CONVENTION);
 
-   /* BRW_NEW_FRAGMENT_PROGRAM, _NEW_LIGHT */
memcpy(key.interpolation_mode, brw->interpolation_mode, 
BRW_VERT_RESULT_MAX);
 
/* _NEW_TRANSFORM (also part of VUE map)*/
diff --git a/src/mesa/drivers/dri/i965/brw_clip.h 
b/src/mesa/drivers/dri/i965/brw_clip.h
index e78d074..3ad2e13 100644
--- a/src/mesa/drivers/dri/i965/brw_clip.h
+++ b/src/mesa/drivers/dri/i965/brw_clip.h
@@ -46,7 +46,7 @@ struct brw_clip_prog_key {
unsigned char interpolation_mode[BRW_VERT_RESULT_MAX]; /* copy of the main 
context */
GLuint primitive:4;
GLuint nr_userclip:4;
-   GLuint do_flat_shading:1;
+   GLuint has_flat_shading:1;
GLuint pv_first:1;
GLuint do_unfilled:1;
GLuint fill_cw:2;   /* includes cull information */
@@ -166,8 +166,8 @@ void brw_clip_kill_thread(struct brw_clip_compile *c);
 struct brw_reg brw_clip_plane_stride( struct brw_clip_compile *c );
 struct brw_reg brw_clip_plane0_address( struct brw_clip_compile *c );
 
-void brw_clip_copy_colors( struct brw_clip_compile *c,
-  GLuint to, GLuint from );
+void brw_clip_copy_flatshaded_attributes( struct brw_clip_compile *c,
+  GLuint to, GLuint from );
 
 void brw_clip_init_clipmask( struct brw_clip_compile *c );
 
diff --git a/src/mesa/drivers/dri/i965/brw_clip_line.c 
b/src/mesa/drivers/dri/i965/brw_clip_line.c
index 6cf2bd2..729d8c0 100644
--- a/src/mesa/drivers/dri/i965/brw_clip_line.c
+++ b/src/mesa/drivers/dri/i965/brw_clip_line.c
@@ -271,11 +271,11 @@ void brw_emit_line_clip( struct brw_clip_compile *c )
brw_clip_line_alloc_regs(c);
brw_clip_init_ff_sync(c);
 
-   if (c->key.do_flat_shading) {
+   if (c->key.has_flat_shading) {
   if (c->key.pv_first)
- brw_clip_copy_colors(c, 1, 0);
+ brw_clip_copy_flatshaded_attributes(c, 1, 0);
   else
- brw_clip_copy_colors(c, 0, 1);
+ brw_clip_copy_flatshaded_attributes(c, 0, 1);
}
 
clip_and_emit_line(c);
diff --git a/src/mesa/drivers/dri/i965/brw_clip_tri.c 
b/src/mesa/drivers/dri/i965/brw_clip_tri.c
index a29f8e0..71225f5 100644
--- a/src/mesa/drivers/dri/i965/brw_clip_tri.c
+++ b/src/mesa/drivers/dri/i965/brw_clip_tri.c
@@ -187,8 +187,8 @@ void brw_clip_tri_flat_shade( struct brw_clip_compile *c )
 
brw_IF(p, BRW_EXECUTE_1);
{
-  brw_clip_copy_colors(c, 1, 0);
-  brw_clip_copy_colors(c, 2, 0);
+  brw_clip_copy_flatshaded_attributes(c, 1, 0);
+  brw_clip_copy_flatshaded_attributes(c, 2, 0);
}
brw_ELSE(p);
{
@@ -200,19 +200,19 @@ void brw_clip_tri_flat_shade( struct brw_clip_compile *c )
 brw_imm_ud(_3DPRIM_TRIFAN));
 brw_IF(p, BRW_EXECUTE_1);
 {
-   brw_clip_copy_colors(c, 0, 1);
-   brw_clip_copy_colors(c, 2, 1);
+   brw_clip_copy_flatshaded_attributes(c, 0, 1);
+   brw_clip_copy_flatshaded_attributes(c, 2, 1);
 }
 brw_ELSE(p);
 {
-   brw_clip_copy_colors(c, 1, 0);
-   brw_clip_copy_colors(c, 2, 0);
+   brw_clip_copy_flatshaded_attributes(c, 1, 0);
+   brw_clip_copy_flatshaded_attributes(c, 2, 0

[Mesa-dev] [PATCH 8/9] intel gen4-5: Make noperspective clipping work.

2012-07-19 Thread Olivier Galibert
At this point all interpolation tests with fixed clipping work.

Signed-off-by: Olivier Galibert galib...@pobox.com
Reviewed-by: Paul Berry stereotype...@gmail.com
---
 src/mesa/drivers/dri/i965/brw_clip.c  |9 ++
 src/mesa/drivers/dri/i965/brw_clip.h  |1 +
 src/mesa/drivers/dri/i965/brw_clip_util.c |  147 ++---
 3 files changed, 146 insertions(+), 11 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_clip.c 
b/src/mesa/drivers/dri/i965/brw_clip.c
index 8512172..eca2844 100644
--- a/src/mesa/drivers/dri/i965/brw_clip.c
+++ b/src/mesa/drivers/dri/i965/brw_clip.c
@@ -239,6 +239,15 @@ brw_upload_clip_prog(struct brw_context *brw)
  break;
   }
}
+   key.has_noperspective_shading = 0;
+   for (i = 0; i < BRW_VERT_RESULT_MAX; i++) {
+      if (brw->interpolation_mode[i] == INTERP_QUALIFIER_NOPERSPECTIVE &&
+          brw->vs.prog_data->vue_map.slot_to_vert_result[i] != VERT_RESULT_HPOS) {
+ key.has_noperspective_shading = 1;
+ break;
+  }
+   }
+
key.pv_first = (ctx->Light.ProvokingVertex == GL_FIRST_VERTEX_CONVENTION);
 
memcpy(key.interpolation_mode, brw->interpolation_mode, BRW_VERT_RESULT_MAX);
diff --git a/src/mesa/drivers/dri/i965/brw_clip.h 
b/src/mesa/drivers/dri/i965/brw_clip.h
index 3ad2e13..66dd928 100644
--- a/src/mesa/drivers/dri/i965/brw_clip.h
+++ b/src/mesa/drivers/dri/i965/brw_clip.h
@@ -47,6 +47,7 @@ struct brw_clip_prog_key {
GLuint primitive:4;
GLuint nr_userclip:4;
GLuint has_flat_shading:1;
+   GLuint has_noperspective_shading:1;
GLuint pv_first:1;
GLuint do_unfilled:1;
GLuint fill_cw:2;   /* includes cull information */
diff --git a/src/mesa/drivers/dri/i965/brw_clip_util.c 
b/src/mesa/drivers/dri/i965/brw_clip_util.c
index 692573e..b06ad1d 100644
--- a/src/mesa/drivers/dri/i965/brw_clip_util.c
+++ b/src/mesa/drivers/dri/i965/brw_clip_util.c
@@ -129,6 +129,8 @@ static void brw_clip_project_vertex( struct 
brw_clip_compile *c,
 
 /* Interpolate between two vertices and put the result into a0.0.  
  * Increment a0.0 accordingly.
+ *
+ * Beware that dest_ptr can be equal to v0_ptr.
  */
 void brw_clip_interp_vertex( struct brw_clip_compile *c,
 struct brw_indirect dest_ptr,
@@ -138,7 +140,8 @@ void brw_clip_interp_vertex( struct brw_clip_compile *c,
 bool force_edgeflag)
 {
struct brw_compile *p = &c->func;
-   struct brw_reg tmp = get_tmp(c);
+   struct brw_context *brw = p->brw;
+   struct brw_reg t_nopersp, v0_ndc_copy;
GLuint slot;
 
/* Just copy the vertex header:
@@ -148,13 +151,130 @@ void brw_clip_interp_vertex( struct brw_clip_compile *c,
 * back on Ironlake, so needn't change it
 */
brw_copy_indirect_to_indirect(p, dest_ptr, v0_ptr, 1);
-  
-   /* Iterate over each attribute (could be done in pairs?)
+
+   /*
+* First handle the 3D and NDC positioning, in case we need
+* noperspective interpolation.  Doing it early has no performance
+* impact in any case.
+*/
+
+   /* Start by picking up the v0 NDC coordinates, because that vertex
+* may be shared with the destination.
+*/
+   if (c->key.has_noperspective_shading) {
+      GLuint offset = brw_vert_result_to_offset(&c->vue_map,
+                                                BRW_VERT_RESULT_NDC);
+  v0_ndc_copy = get_tmp(c);
+  brw_MOV(p, v0_ndc_copy, deref_4f(v0_ptr, offset));
+   }  
+
+   /*
+* Compute the new 3D position
+*
+* dest_hpos = v0_hpos * (1 - t0) + v1_hpos * t0
+*/
+   {
+      GLuint delta = brw_vert_result_to_offset(&c->vue_map, VERT_RESULT_HPOS);
+  struct brw_reg tmp = get_tmp(c);
+  brw_MUL(p, 
+  vec4(brw_null_reg()),
+  deref_4f(v1_ptr, delta),
+  t0);
+
+  brw_MAC(p,
+  tmp,   
+  negate(deref_4f(v0_ptr, delta)),
+  t0);
+ 
+  brw_ADD(p,
+  deref_4f(dest_ptr, delta), 
+  deref_4f(v0_ptr, delta),
+  tmp);
+  release_tmp(c, tmp);
+   }
+
+   /* Then recreate the projected (NDC) coordinate in the new vertex
+* header
+*/
+   brw_clip_project_vertex(c, dest_ptr);
+
+   /*
+* If we have noperspective attributes, we now need to compute the
+* screen-space t.
+*/
+   if (c->key.has_noperspective_shading) {
+      GLuint delta = brw_vert_result_to_offset(&c->vue_map, BRW_VERT_RESULT_NDC);
+  struct brw_reg tmp = get_tmp(c);
+  t_nopersp = get_tmp(c);
+
+  /* Build a register with coordinates from the second and new vertices
+   *
+   * t_nopersp = vec4(v1.xy, dest.xy)
+   */
+  brw_MOV(p, t_nopersp, deref_4f(v1_ptr, delta));
+  brw_MOV(p, tmp, deref_4f(dest_ptr, delta));
+  brw_set_access_mode(p, BRW_ALIGN_16);
+  brw_MOV(p,
+  brw_writemask(t_nopersp, WRITEMASK_ZW),
+  brw_swizzle(tmp, 0,1,0,1));
+
+  /* Subtract the coordinates of the first

[Mesa-dev] [PATCH 9/9] intel gen4-5: Don't touch flatshaded values when clipping, only copy them.

2012-07-19 Thread Olivier Galibert
This patch ensures that integers will pass through unscathed.  Doing
(useless) computations on them is risky, especially when their bit
patterns correspond to values like inf or nan.

Signed-off-by: Olivier Galibert galib...@pobox.com
---
 src/mesa/drivers/dri/i965/brw_clip_util.c |   48 ++---
 1 file changed, 30 insertions(+), 18 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_clip_util.c 
b/src/mesa/drivers/dri/i965/brw_clip_util.c
index b06ad1d..998c304 100644
--- a/src/mesa/drivers/dri/i965/brw_clip_util.c
+++ b/src/mesa/drivers/dri/i965/brw_clip_util.c
@@ -293,30 +293,42 @@ void brw_clip_interp_vertex( struct brw_clip_compile *c,
  * header), so interpolate:
  *
  *New = attr0 + t*attr1 - t*attr0
+  *
+  * unless it's flat shaded, then just copy the value from a
+  * source vertex.
  */
 
- struct brw_reg tmp = get_tmp(c);
+ GLuint interp = brw->interpolation_mode[slot];
 
- struct brw_reg t =
-brw->interpolation_mode[slot] == INTERP_QUALIFIER_NOPERSPECTIVE ?
-t_nopersp : t0;
+ if(interp == INTERP_QUALIFIER_SMOOTH ||
+interp == INTERP_QUALIFIER_NOPERSPECTIVE) {
+struct brw_reg tmp = get_tmp(c);
+struct brw_reg t =
+   interp == INTERP_QUALIFIER_NOPERSPECTIVE ?
+   t_nopersp : t0;
 
-brw_MUL(p, 
-vec4(brw_null_reg()),
-deref_4f(v1_ptr, delta),
-t);
+brw_MUL(p,
+vec4(brw_null_reg()),
+deref_4f(v1_ptr, delta),
+t);
 
-brw_MAC(p, 
-tmp, 
-negate(deref_4f(v0_ptr, delta)),
-t); 
+brw_MAC(p,
+tmp,
+negate(deref_4f(v0_ptr, delta)),
+t);
  
-brw_ADD(p,
-deref_4f(dest_ptr, delta), 
-deref_4f(v0_ptr, delta),
-tmp);
-
- release_tmp(c, tmp);
+brw_ADD(p,
+deref_4f(dest_ptr, delta),
+deref_4f(v0_ptr, delta),
+tmp);
+
+release_tmp(c, tmp);
+
+ } else if(interp == INTERP_QUALIFIER_FLAT) {
+brw_MOV(p,
+deref_4f(dest_ptr, delta),
+deref_4f(v0_ptr, delta));
+ }
   }
}
 
-- 
1.7.10.280.gaa39
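
As an illustration of the per-slot decision made in patch 9/9, here is a
small CPU-side sketch.  It is not Mesa code: the enum, the helper names
and the use of plain C floats are assumptions made for the example.  It
applies New = attr0 + t*attr1 - t*attr0 to smooth and noperspective
slots, and only copies the raw bits for flat slots, so an integer whose
bit pattern happens to encode a NaN goes through untouched.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-ins for Mesa's INTERP_QUALIFIER_* values. */
enum interp_mode {
    INTERP_NONE,
    INTERP_SMOOTH,
    INTERP_FLAT,
    INTERP_NOPERSPECTIVE
};

/* Reinterpret raw 32-bit attribute storage as a float, without conversion. */
static float bits_to_float(uint32_t bits)
{
    float f;
    memcpy(&f, &bits, sizeof f);
    return f;
}

/* Reinterpret a float as its raw 32-bit storage, without conversion. */
static uint32_t float_to_bits(float f)
{
    uint32_t bits;
    memcpy(&bits, &f, sizeof bits);
    return bits;
}

/* Compute one channel of the new (clipped) vertex from the raw bits of the
 * two source vertices.  t_persp and t_nopersp are the two interpolation
 * factors; only one of them is used, depending on the mode.
 */
static void clip_channel(enum interp_mode mode, uint32_t *dest,
                         uint32_t v0, uint32_t v1,
                         float t_persp, float t_nopersp)
{
    if (mode == INTERP_SMOOTH || mode == INTERP_NOPERSPECTIVE) {
        float t = (mode == INTERP_NOPERSPECTIVE) ? t_nopersp : t_persp;
        float a0 = bits_to_float(v0);
        float a1 = bits_to_float(v1);

        /* New = attr0 + t*attr1 - t*attr0 */
        *dest = float_to_bits(a0 + (t * a1 - t * a0));
    } else if (mode == INTERP_FLAT) {
        /* No arithmetic at all: the bits are copied as they are. */
        *dest = v0;
    }
    /* INTERP_NONE: the slot is left alone. */
}
```

With v0 = 1.0, v1 = 3.0 and t = 0.5 the smooth path gives 2.0, while a
flat slot holding the integer 0x7fc00001 (a NaN when read as a float)
comes out as 0x7fc00001 again.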

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH] softpipe: Take all lods into account when texture sampling.

2012-07-18 Thread Olivier Galibert
On Tue, Jul 17, 2012 at 03:41:44PM -0600, Brian Paul wrote:
> On 07/13/2012 10:30 AM, Olivier Galibert wrote:
> > On Wed, Jun 20, 2012 at 08:33:38AM -0600, Brian Paul wrote:
> >> Yeah, I think it's pretty clear that we need to support per-pixel LOD
> >> selection.  For softpipe, Olivier's big patch looks good.
> >
> > ... and then nothing happened.  Ping?  The only code remark was a
> > whitespace issue on one line :-)
>
> I'll commit/push your patch soon.  I don't always remember who has
> git-write access so if you can't push patches yourself you should
> probably indicate so.

I indeed don't have commit access, but more importantly there has been
discussion but not review, which is why I didn't know if I had to
change things :-)

Best,

  OG.


Re: [Mesa-dev] [Intel-gfx] [PATCH 1/5] intel gen4/5: fix GL_VERTEX_PROGRAM_TWO_SIDE.

2012-07-17 Thread Olivier Galibert
On Mon, Jul 16, 2012 at 08:43:17PM -0700, Paul Berry wrote:
> Can you split this into three separate patches?  That will make it easier
> to troubleshoot in case we find bugs with these patches in the future.

I'm going to try.


> Also, I'm not convinced that #3 is necessary.  Is there something in the
> spec that dictates this behaviour?  My reading of the spec is that if the
> vertex shader writes to gl_BackColor but not gl_FrontColor, then the
> contents of gl_Color in the fragment shader is undefined.

Given the number of security issues/information leaks that happen due
to reads out of place, I'm always extremely wary of reads from
nowhere.  So one pretty much has a choice between forcing a specific
value (like 0) or reading from someplace else that makes sense.  In
that particular case I considered reading from the other color slot
the easy way out.


> If we *do* decide that #3 is necessary, then I think a better way to
> accomplish it is to handle it in the GLSL vertex shader front-end, by
> replacing gl_BackColor with gl_FrontColor in cases where gl_FrontColor is
> not written to.  That way our special case code to handle this situation
> would be in just one place, rather than in three places (both fragment
> shader back-ends, and the SF program).  Also then the fix would apply to
> all hardware, not just Intel Gen4-5.

You'd have to switch off two-sided lighting too, but why not.


> Finally, I couldn't figure out what you meant by the stray mov into
> lalaland.  Can you elaborate on which piece of code used to generate that
> stray mov, and why it doesn't anymore?  Thanks.

Looking at it again, I was wrong, it was protected.

  OG.


Re: [Mesa-dev] [PATCH 0/5] First batch of gm45 clipping/interpolation fixes

2012-07-14 Thread Olivier Galibert
On Fri, Jul 13, 2012 at 02:45:10PM -0700, Kenneth Graunke wrote:
> Sorry...been really busy, and most of us haven't actually spent much if
> any time in the clipper shaders.  I'll try and review it within a week.

Ok cool, lack of time is something I completely understand :-)


> Despite the lack of response, I am really excited to see that you're
> working on this---this is a huge step toward bringing GL 3.x back to
> Gen4/5, and we're all really glad to see it happen!

Excellent.  I was starting to wonder if gen4/5 was abandoned (by lack
of resources if anything), nice to see it isn't.

  OG.



Re: [Mesa-dev] [PATCH 0/5] First batch of gm45 clipping/interpolation fixes

2012-07-13 Thread Olivier Galibert
On Sat, Jun 30, 2012 at 08:50:10PM +0200, Olivier Galibert wrote:
> This is the first part of the fixes I've done to make my gm45 work
> correctly w.r.t clipping and interpolation.  There's a fair chance
> they work for everything gen 4/5, but I have no way to be sure.

So, not even one comment, nothing?

  OG.


Re: [Mesa-dev] [PATCH 2/2] mesa/st: Generates TGSI that always recognizes INSTANCEID/VERTEXID as integers.

2012-07-13 Thread Olivier Galibert
On Thu, Jul 12, 2012 at 08:50:13PM +0100, jfons...@vmware.com wrote:
> From: José Fonseca jfons...@vmware.com
>
> Tested by running piglit draw-instanced, and by forcing llvmpipe to
> advertise no native integer support, which now produces:

Looks like a very good solution to me.  Did you check
draw-non-instanced too?  51366 is a variant of the same issue.

  OG.



Re: [Mesa-dev] [PATCH] softpipe: Take all lods into account when texture sampling.

2012-07-13 Thread Olivier Galibert
On Wed, Jun 20, 2012 at 08:33:38AM -0600, Brian Paul wrote:
> Yeah, I think it's pretty clear that we need to support per-pixel LOD
> selection.  For softpipe, Olivier's big patch looks good.

... and then nothing happened.  Ping?  The only code remark was a
whitespace issue on one line :-)


> For llvmpipe it's important to maintain performance for the common case
> where we compute LOD per quad but we'll also need new paths for
> per-pixel LOD.  Hopefully, the two paths can share some code.

I've been thinking, it looks reasonable to statically check whether
the lod/grad/bias is shared at the glsl level.  Then we could have
separate opcodes for the texturing variants for when we're sure things
are shared and when we aren't.  And pay the cost only when it is
needed.  Would that sound reasonable?

  OG.



Re: [Mesa-dev] Mesa (master): draw: fix flat shading and screen-space linear interpolation in clipper

2012-07-11 Thread Olivier Galibert
On Tue, Jul 10, 2012 at 03:51:22PM +0200, Marek Olšák wrote:
> I just wanted to tell you Stephane's change cannot work and it even
> has no effect at the moment. The native integer support is global in
> core Mesa. It's because integer uniforms are converted to floats based
> on the global NativeInteger flag for all shader stages and that can't
> be fixed easily, because uniforms can be shared between shaders.
> Basically, all drivers must advertise integer support either for all
> shader stages or none.

Really?  I mean the idea here is that drivers like i915g which don't
have native integers in the fragger are going to advertise native
integers in the vs but stay at glsl 1.20.  Can you have integer
uniforms without 1.30+?  I don't think so.

  OG.



Re: [Mesa-dev] Mesa (master): draw: fix flat shading and screen-space linear interpolation in clipper

2012-07-11 Thread Olivier Galibert
On Tue, Jul 10, 2012 at 09:19:05AM -0700, Stéphane Marchesin wrote:
> There is also option 3): revert the two patches causing the regression.

And then you'll have this problem again as soon as you want llvmpipe
to reach GL 3.00+/GLSL 1.30+.  So why not find a definitive solution
now?

Previous code converted the instance id to float, and said it's an
integer guv', honest.  That does not fly in the face of native
integers, at all, unless you like your second instance to be numbered
1065353216.
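
For anyone wondering where that number comes from, here is a minimal
sketch, assuming IEEE-754 single precision floats (the function name is
made up for the example).  The instance id is converted to float by
value, and then the register is read back as if it held an integer.

```c
#include <stdint.h>
#include <string.h>

/* Convert an integer id to float by value, then return the raw bits of
 * that float, which is what a consumer expecting a native integer in the
 * same register would actually see.
 */
static uint32_t float_bits_of_id(uint32_t instance_id)
{
    float as_float = (float)instance_id;    /* value conversion: 1 -> 1.0f */
    uint32_t bits;

    memcpy(&bits, &as_float, sizeof bits);  /* bit reinterpretation */
    return bits;
}
```

Instance 0 still reads as 0, but instance 1 reads as 0x3f800000, which
is 1065353216 in decimal.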

  OG.


Re: [Mesa-dev] Mesa (master): draw: fix flat shading and screen-space linear interpolation in clipper

2012-07-11 Thread Olivier Galibert
On Wed, Jul 11, 2012 at 12:51:32PM +0200, Marek Olšák wrote:
> Dude, you should really learn GLSL. The idea to emulate integers is
> even older than the GLSL itself. It first appeared in HLSL and NVIDIA
> Cg on hardware that wasn't even GL2-capable.

I'm learning 3.30+, which is what I consider useful now :-) But that
makes it a little harder to remember what appeared when.


> From the GLSL 1.2 spec:
> The uniform qualifier can be used with any of the basic data types,
> ..., then the section 4.1 lists the basic data types (like ivec4).

Fuck, damn.  Yes, we do have a problem.

  OG.


Re: [Mesa-dev] Mesa (master): draw: fix flat shading and screen-space linear interpolation in clipper

2012-07-11 Thread Olivier Galibert
On Wed, Jul 11, 2012 at 02:19:02PM +0200, Marek Olšák wrote:
> On Wed, Jul 11, 2012 at 1:09 PM, Jose Fonseca jfons...@vmware.com wrote:
> > My current plan is to:
> > - make it clear that INSTANCEID/VERTEXID always means integer
> > - require PIPE_SHADER_CAP_INTEGERS to be advertised in the vertex shader
> >   stage in order to advertise INSTANCEID/VERTEXID in Mesa statetracker
> > - given that Mesa assumes integer, insert a I2F when loading
> >   INSTANCEID/VERTEXID (this meets the new semantics while avoiding a big
> >   re-architecture)
>
> The first two points sound good, but why I2F? Note that softpipe fully
> supports integers while llvmpipe doesn't, and I2F after loading
> INSTANCEID would very likely break softpipe.

I think that would break llvmpipe too.  llvmpipe actually fully
supports integers, it only thinks it doesn't, at least according to
piglit (textureFetch is the only real remaining issue left for glsl
1.30).  And draw-instanced works perfectly well with native integer
llvmpipe (which is why I didn't see the problem before the bug
report).

  OG.


Re: [Mesa-dev] [PATCH 1/3] draw: draw_get_shader_param should return correct values WRT llvm

2012-07-04 Thread Olivier Galibert
On Wed, Jul 04, 2012 at 01:59:44PM +0200, Marek Olšák wrote:
> Please disregard patch 1 and 2. It wouldn't work.

What's wrong with them?


> I still plan to commit patch 3.

Patch 3 makes sense.  I probably should have done it like that in the
first place (learned a lot since :-).

  OG.


Re: [Mesa-dev] Mesa (master): draw: fix flat shading and screen-space linear interpolation in clipper

2012-07-03 Thread Olivier Galibert
On Tue, Jul 03, 2012 at 12:39:47PM -0700, Jose Fonseca wrote:
> Note that all registers are stored as floats (for convenience, and
> because LLVM has no unions), so integers are bitcasted into floats
> while storing/loading.  And I'm not sure if your patch would break
> that.

I did test the patch with a llvmpipe in a glsl 120/no native integer
setup.  draw_instanced worked.  I didn't try a full piglit though.


> I still think that having draw/gallivm guessing whether native integer
> support is intended or not is bad. Either:
> 1) TGSI is extended (e.g., more type annotations) so that native-integer
>    support can be inferred from it
> 2) draw/gallivm need to know if the driver has native-integer or not
>
> I'm inclined towards 1), as TGSI should be self-documented. That is,
> it should not be necessary to know if the driver has or not native
> integer support to know whether system values should be assumed to
> be integers or floats...

It could be argued that dtype being TGSI_TYPE_FLOAT is the
documentation on what is expected.  But I'm quickly reaching the point
where I don't really care, just tell me what you want.  As long as
textureFetch stays the only issue between llvmpipe and 1.30 I'm ok.

Of course doing textureFetch right is going to require an interesting
overhaul of the texture allocations... need to finish fixing the gm45
interpolation/clipping first.

Best,

  OG.



Re: [Mesa-dev] Mesa (master): draw: fix flat shading and screen-space linear interpolation in clipper

2012-07-02 Thread Olivier Galibert
On Mon, Jul 02, 2012 at 06:44:37AM -0700, Jose Fonseca wrote:
> But I think that this fix is too ad-hoc, and I suspect it may
> introduce other regressions.
>
> If I understood the problem correctly, the issue here is that some
> drivers want system values in floats, others want in
> integers. Right?

It's slightly more perverted than that.  GLSL 1.20 says if a value is
an integer, it will be forced into a float but don't expect more than
16 bits precision, while 1.30 has native integers.  Next to that,
some extensions (and gl versions) introduce integer system values but
require native integer support to have them implemented.  The glsl
parser handles that correctly by adding the needed type conversions
when accessing these values from a 1.20 shader.

But then in mesa someone decided to extend the extensions and
implement things like draw_instanced without native integer support.
st_glsl_to_tgsi behaves very differently when native integers aren't
there, forcing every type to float and ignoring the integer->float type
conversions.  What tells you that is that the requested type (dtype)
is float while the system value itself is integer.

In fact, I suspect the conversion code is ill-advised.  It was picked
up from the previous code, but actually it should only check that the
types are identical or that float is requested for an int, and bitch
otherwise.  Still, it would be interesting to know if that patch works
for i915g, even if we make things more cranky afterwards.
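
To make the stricter check concrete, here is a rough sketch of what I
mean.  The type tags and the function are hypothetical, only loosely
modelled on the TGSI float/signed/unsigned types, and are not the
actual draw/gallivm code.

```c
/* Hypothetical type tags for a system value and for the requested type. */
enum val_type { VAL_FLOAT, VAL_SIGNED, VAL_UNSIGNED };

/* What the loader should do for a given (stored, requested) pair. */
enum conv { CONV_NONE, CONV_INT_TO_FLOAT, CONV_INVALID };

/* Accept identical types, accept "float requested for an int" by
 * converting, and complain about everything else.
 */
static enum conv system_value_conversion(enum val_type stored,
                                         enum val_type requested)
{
    if (stored == requested)
        return CONV_NONE;

    if (requested == VAL_FLOAT &&
        (stored == VAL_SIGNED || stored == VAL_UNSIGNED))
        return CONV_INT_TO_FLOAT;

    return CONV_INVALID;
}
```

So an integer INSTANCEID requested as float gets converted, while a
float requested as an integer (or signed requested as unsigned) would
be flagged instead of being silently converted.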

  OG.


[Mesa-dev] [PATCH 0/5] First batch of gm45 clipping/interpolation fixes

2012-06-30 Thread Olivier Galibert
  Hi,

This is the first part of the fixes I've done to make my gm45 work
correctly w.r.t clipping and interpolation.  There's a fair chance
they work for everything gen 4/5, but I have no way to be sure.

[PATCH 1/5] intel gen4-5: fix GL_VERTEX_PROGRAM_TWO_SIDE.
[PATCH 2/5] intel gen4-5: Compute the interpolation status for every
[PATCH 3/5] intel gen4-5: Correctly setup the parameters in the sf.
[PATCH 4/5] intel gen4-5: Correctly handle flat vs. non-flat in the
[PATCH 5/5] intel gen4-5: Make noperspective clipping work.

After this batch every piglit interpolation test involving no clipping
or fixed clipping passes.  Vertex clipping clearly never worked
(VERT_RESULT_CLIP_VERTEX is not used, so...) and clipdistance isn't
implemented.  These will be the topic of the second batch, whenever it
exists.

Best,

  OG.


[Mesa-dev] [PATCH 1/5] intel gen4/5: fix GL_VERTEX_PROGRAM_TWO_SIDE.

2012-06-30 Thread Olivier Galibert
There was... confusion about which register goes where.  With that
patch urb_setup is in line with the vue setup, even when these
annoying backcolor slots are used.  And in addition the stray mov into
lalaland is avoided when only one of the front/back slots is used and
the backface is looking at you.  The code instead picks whatever slot
was written to by the vertex shader.  That makes most of the generated
piglit tests useless to test the backface selection though.
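
The register numbering part of the change in calculate_urb_setup() can
be illustrated with a standalone sketch.  This is not the driver code:
the arrays and the always_increment switch are assumptions made for the
example, and the back color remapping is left out.  The old behaviour
only advanced the register counter for outputs the fragment shader
reads, the new one advances it for every written output.

```c
/* written[i]  : non-zero if the vertex shader writes output i.
 * fp_index[i] : fragment input fed by output i, or -1 if the fragment
 *               shader does not read it.
 * urb_setup[] : receives, per fragment input, the register number; it
 *               must be large enough for the highest fp_index used.
 * always_increment selects the new behaviour (1) or the old one (0).
 */
static void assign_urb_slots(const int written[], const int fp_index[],
                             int n, int urb_setup[], int always_increment)
{
    int urb_next = 0;

    for (int i = 0; i < n; i++) {
        if (!written[i])
            continue;

        if (fp_index[i] >= 0) {
            urb_setup[fp_index[i]] = urb_next;
            if (!always_increment)
                urb_next++;     /* old: count only the mapped outputs */
        }
        if (always_increment)
            urb_next++;         /* new: count every written output */
    }
}
```

With three written outputs where the middle one is not read by the
fragment shader, the old numbering gives registers 0 and 1 to the two
read inputs, while the new numbering gives 0 and 2, matching the
position of the third output in the vue.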

Signed-off-by: Olivier Galibert galib...@pobox.com
---
 src/mesa/drivers/dri/i965/brw_fs.cpp |   18 +-
 src/mesa/drivers/dri/i965/brw_sf.c   |3 +-
 src/mesa/drivers/dri/i965/brw_sf_emit.c  |   93 +-
 src/mesa/drivers/dri/i965/brw_wm_pass2.c |   19 +-
 4 files changed, 89 insertions(+), 44 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_fs.cpp 
b/src/mesa/drivers/dri/i965/brw_fs.cpp
index 6cef08a..710f2ff 100644
--- a/src/mesa/drivers/dri/i965/brw_fs.cpp
+++ b/src/mesa/drivers/dri/i965/brw_fs.cpp
@@ -721,8 +721,24 @@ fs_visitor::calculate_urb_setup()
 if (c->key.vp_outputs_written & BITFIELD64_BIT(i)) {
int fp_index = _mesa_vert_result_to_frag_attrib((gl_vert_result) i);
 
+/* Special case: two-sided vertex option, vertex program
+ * only writes to the back color.  Map it to the
+ * associated front color location.
+ */
+if (i >= VERT_RESULT_BFC0 && i <= VERT_RESULT_BFC1 &&
+ctx->VertexProgram._TwoSideEnabled &&
+urb_setup[i - VERT_RESULT_BFC0 + FRAG_ATTRIB_COL0] == -1)
+   fp_index = i - VERT_RESULT_BFC0 + FRAG_ATTRIB_COL0;
+
+   /* The back color slot is skipped when the front color is
+* also written to.  In addition, some slots can be
+* written in the vertex shader and not read in the
+* fragment shader.  So the register number must always be
+* incremented, mapped or not.
+*/
if (fp_index >= 0)
-  urb_setup[fp_index] = urb_next++;
+  urb_setup[fp_index] = urb_next;
+   urb_next++;
 }
   }
 
diff --git a/src/mesa/drivers/dri/i965/brw_sf.c 
b/src/mesa/drivers/dri/i965/brw_sf.c
index 23a874a..7867ab5 100644
--- a/src/mesa/drivers/dri/i965/brw_sf.c
+++ b/src/mesa/drivers/dri/i965/brw_sf.c
@@ -192,7 +192,8 @@ brw_upload_sf_prog(struct brw_context *brw)
 
/* _NEW_LIGHT */
key.do_flat_shading = (ctx->Light.ShadeModel == GL_FLAT);
-   key.do_twoside_color = (ctx->Light.Enabled && ctx->Light.Model.TwoSide);
+   key.do_twoside_color = (ctx->Light.Enabled && ctx->Light.Model.TwoSide) ||
+ ctx->VertexProgram._TwoSideEnabled;
 
/* _NEW_POLYGON */
if (key.do_twoside_color) {
diff --git a/src/mesa/drivers/dri/i965/brw_sf_emit.c 
b/src/mesa/drivers/dri/i965/brw_sf_emit.c
index ff6383b..9d8aa38 100644
--- a/src/mesa/drivers/dri/i965/brw_sf_emit.c
+++ b/src/mesa/drivers/dri/i965/brw_sf_emit.c
@@ -79,24 +79,9 @@ have_attr(struct brw_sf_compile *c, GLuint attr)
 /*** 
  * Twoside lighting
  */
-static void copy_bfc( struct brw_sf_compile *c,
- struct brw_reg vert )
-{
-   struct brw_compile *p = &c->func;
-   GLuint i;
-
-   for (i = 0; i < 2; i++) {
-  if (have_attr(c, VERT_RESULT_COL0+i) &&
- have_attr(c, VERT_RESULT_BFC0+i))
-brw_MOV(p, 
-get_vert_result(c, vert, VERT_RESULT_COL0+i),
-get_vert_result(c, vert, VERT_RESULT_BFC0+i));
-   }
-}
-
-
 static void do_twoside_color( struct brw_sf_compile *c )
 {
+   GLuint i, need_0, need_1;
struct brw_compile *p = &c->func;
GLuint backface_conditional = c->key.frontface_ccw ? BRW_CONDITIONAL_G : BRW_CONDITIONAL_L;
 
@@ -105,12 +90,14 @@ static void do_twoside_color( struct brw_sf_compile *c )
if (c->key.primitive == SF_UNFILLED_TRIS)
   return;
 
-   /* XXX: What happens if BFC isn't present?  This could only happen
-* for user-supplied vertex programs, as t_vp_build.c always does
-* the right thing.
+   /* If the vertex shader provides both front and backface color, do
+* the selection.  Otherwise the generated code will pick up
+* whichever there is.
 */
-   if (!(have_attr(c, VERT_RESULT_COL0) && have_attr(c, VERT_RESULT_BFC0)) &&
-   !(have_attr(c, VERT_RESULT_COL1) && have_attr(c, VERT_RESULT_BFC1)))
+   need_0 = have_attr(c, VERT_RESULT_COL0) && have_attr(c, VERT_RESULT_BFC0);
+   need_1 = have_attr(c, VERT_RESULT_COL1) && have_attr(c, VERT_RESULT_BFC1);
+
+   if (!need_0 && !need_1)
   return;

/* Need to use BRW_EXECUTE_4 and also do an 4-wide compare in order
@@ -121,12 +108,15 @@ static void do_twoside_color( struct brw_sf_compile *c )
brw_push_insn_state(p);
brw_CMP(p, vec4(brw_null_reg()), backface_conditional, c->det, brw_imm_f(0));
brw_IF(p, BRW_EXECUTE_4);
-   {
-  switch (c->nr_verts) {
-  case 3: copy_bfc(c, c

[Mesa-dev] [PATCH 2/5] intel gen4-5: Compute the interpolation status for every variable in one place.

2012-06-30 Thread Olivier Galibert
The program keys are updated accordingly, but the values are not used
yet.
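
For reference, the rule used by brw_lookup_interpolation() in the patch
below when a variable has no explicit qualifier can be summarised with
this sketch.  The enum and the function are simplified stand-ins
written for the example, not the driver's own types.

```c
/* Hypothetical stand-ins for Mesa's INTERP_QUALIFIER_* values. */
enum interp { INTERP_NONE, INTERP_SMOOTH, INTERP_FLAT, INTERP_NOPERSPECTIVE };

/* Resolve the effective interpolation mode of a fragment input.
 * declared   : qualifier given in the shader, INTERP_NONE if absent.
 * is_color   : non-zero for the two color inputs (COL0/COL1).
 * flat_model : non-zero when the GL shade model is GL_FLAT.
 */
static enum interp resolve_interp(enum interp declared, int is_color,
                                  int flat_model)
{
    /* An explicit qualifier always wins. */
    if (declared != INTERP_NONE)
        return declared;

    /* Colors follow the shade model, everything else is smooth. */
    if (is_color)
        return flat_model ? INTERP_FLAT : INTERP_SMOOTH;

    return INTERP_SMOOTH;
}
```

The patch additionally forces HPOS to noperspective and resets a back
color to none when the matching front color is also written; those two
special cases are not covered by the sketch.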

Signed-off-by: Olivier Galibert galib...@pobox.com
---
 src/mesa/drivers/dri/i965/brw_clip.c|   82 ++-
 src/mesa/drivers/dri/i965/brw_clip.h|1 +
 src/mesa/drivers/dri/i965/brw_context.h |   59 ++
 src/mesa/drivers/dri/i965/brw_sf.c  |3 +-
 src/mesa/drivers/dri/i965/brw_sf.h  |1 +
 src/mesa/drivers/dri/i965/brw_wm.c  |4 ++
 src/mesa/drivers/dri/i965/brw_wm.h  |1 +
 7 files changed, 149 insertions(+), 2 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_clip.c 
b/src/mesa/drivers/dri/i965/brw_clip.c
index d411208..52e8c47 100644
--- a/src/mesa/drivers/dri/i965/brw_clip.c
+++ b/src/mesa/drivers/dri/i965/brw_clip.c
@@ -47,6 +47,83 @@
 #define FRONT_UNFILLED_BIT  0x1
 #define BACK_UNFILLED_BIT   0x2
 
+/**
+ * Lookup the interpolation mode information for every element in the
+ * vue.
+ */
+static void
+brw_lookup_interpolation(struct brw_context *brw)
+{
+   /* pprog means previous program, i.e. the last program before the
+* fragment shader.  It can only be the vertex shader for now, but
+* it may be a geometry shader in the future.
+*/
+   const struct gl_program *pprog = &brw->vertex_program->Base;
+   const struct gl_fragment_program *fprog = brw->fragment_program;
+   struct brw_vue_map *vue_map = &brw->vs.prog_data->vue_map;
+
+   /* Default everything to INTERP_QUALIFIER_NONE */
+   brw_clear_interpolation_modes(brw);
+
+   /* If there is no fragment shader, interpolation won't be needed,
+* so defaulting to none is good.
+*/
+   if (!fprog)
+  return;
+
+   for (int i = 0; i < vue_map->num_slots; i++) {
+      /* First lookup the vert result, skip if there isn't one */
+      int vert_result = vue_map->slot_to_vert_result[i];
+  if (vert_result == BRW_VERT_RESULT_MAX)
+ continue;
+
+  /* HPOS is special, it must be linear
+   */
+  if (vert_result == VERT_RESULT_HPOS) {
+ brw_set_interpolation_mode(brw, i, INTERP_QUALIFIER_NOPERSPECTIVE);
+ continue;
+  }
+
+  /* There is a 1-1 mapping of vert result to frag attrib except
+   * for BackColor and vars
+   */
+  int frag_attrib = vert_result;
+      if (vert_result >= VERT_RESULT_BFC0 && vert_result <= VERT_RESULT_BFC1)
+         frag_attrib = vert_result - VERT_RESULT_BFC0 + FRAG_ATTRIB_COL0;
+      else if(vert_result >= VERT_RESULT_VAR0)
+         frag_attrib = vert_result - VERT_RESULT_VAR0 + FRAG_ATTRIB_VAR0;
+
+      /* If the output is not used by the fragment shader, skip it. */
+      if (!(fprog->Base.InputsRead & BITFIELD64_BIT(frag_attrib)))
+ continue;
+
+  /* Lookup the interpolation mode */
+      enum glsl_interp_qualifier interpolation_mode = fprog->InterpQualifier[frag_attrib];
+
+  /* If the mode is not specified, then the default varies.  Color
+   * values follow the shader model, while all the rest uses
+   * smooth.
+   */
+  if (interpolation_mode == INTERP_QUALIFIER_NONE) {
+         if (frag_attrib >= FRAG_ATTRIB_COL0 && frag_attrib <= FRAG_ATTRIB_COL1)
+            interpolation_mode = brw->intel.ctx.Light.ShadeModel == GL_FLAT ? INTERP_QUALIFIER_FLAT : INTERP_QUALIFIER_SMOOTH;
+ else
+interpolation_mode = INTERP_QUALIFIER_SMOOTH;
+  }
+
+  /* Finally, if we have both a front color and a back color for
+   * the same channel, the selection will be done before
+   * interpolation and the back color copied over the front color
+   * if necessary.  So interpolating the back color is
+   * unnecessary.
+   */
+      if (vert_result >= VERT_RESULT_BFC0 && vert_result <= VERT_RESULT_BFC1)
+         if (pprog->OutputsWritten & BITFIELD64_BIT(vert_result - VERT_RESULT_BFC0 + VERT_RESULT_COL0))
+interpolation_mode = INTERP_QUALIFIER_NONE;
+
+  brw_set_interpolation_mode(brw, i, interpolation_mode);
+   }
+}
 
 static void compile_clip_prog( struct brw_context *brw,
 struct brw_clip_prog_key *key )
@@ -141,6 +218,8 @@ brw_upload_clip_prog(struct brw_context *brw)
 
memset(&key, 0, sizeof(key));
 
+   brw_lookup_interpolation(brw);
+
/* Populate the key:
 */
/* BRW_NEW_REDUCED_PRIMITIVE */
@@ -150,6 +229,7 @@ brw_upload_clip_prog(struct brw_context *brw)
/* _NEW_LIGHT */
key.do_flat_shading = (ctx->Light.ShadeModel == GL_FLAT);
key.pv_first = (ctx->Light.ProvokingVertex == GL_FIRST_VERTEX_CONVENTION);
+   brw_copy_interpolation_modes(brw, key.interpolation_mode);
/* _NEW_TRANSFORM (also part of VUE map)*/
key.nr_userclip = _mesa_bitcount_64(ctx->Transform.ClipPlanesEnabled);
 
@@ -258,7 +338,7 @@ const struct brw_tracked_state brw_clip_prog = {
_NEW_TRANSFORM |
_NEW_POLYGON | 
_NEW_BUFFERS),
-  .brw   = (BRW_NEW_REDUCED_PRIMITIVE),
+  .brw   = (BRW_NEW_FRAGMENT_PROGRAM|BRW_NEW_REDUCED_PRIMITIVE

[Mesa-dev] [PATCH 3/5] intel gen4-5: Correctly setup the parameters in the sf.

2012-06-30 Thread Olivier Galibert
This patch also corrects a couple of problems with noperspective
interpolation.

At that point all the glsl 1.1/1.3 interpolation tests that do not
clip pass (the -none ones).

The fs code does not use the pre-resolved interpolation modes in order
not to mess with gen6+.  Sharing the resolution would require putting
brw_wm_prog before brw_clip_prog and brw_sf_prog.  This may be a good
thing, but it could have unexpected consequences, so it is better
done independently in any case.

Signed-off-by: Olivier Galibert galib...@pobox.com
---
 src/mesa/drivers/dri/i965/brw_fs.cpp |2 +-
 src/mesa/drivers/dri/i965/brw_fs_visitor.cpp |5 +
 src/mesa/drivers/dri/i965/brw_sf.c   |9 +-
 src/mesa/drivers/dri/i965/brw_sf.h   |2 +-
 src/mesa/drivers/dri/i965/brw_sf_emit.c  |  164 +-
 5 files changed, 95 insertions(+), 87 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_fs.cpp b/src/mesa/drivers/dri/i965/brw_fs.cpp
index 710f2ff..b142f2b 100644
--- a/src/mesa/drivers/dri/i965/brw_fs.cpp
+++ b/src/mesa/drivers/dri/i965/brw_fs.cpp
@@ -506,7 +506,7 @@ fs_visitor::emit_general_interpolation(ir_variable *ir)
                  struct brw_reg interp = interp_reg(location, k);
                  emit_linterp(attr, fs_reg(interp), interpolation_mode,
                               ir->centroid);
-                 if (intel->gen < 6) {
+                 if (intel->gen < 6 && interpolation_mode == INTERP_QUALIFIER_SMOOTH) {
                     emit(BRW_OPCODE_MUL, attr, attr, this->pixel_w);
                  }
               }
diff --git a/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp b/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp
index 9bd1e67..ab83a95 100644
--- a/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp
+++ b/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp
@@ -1924,6 +1924,11 @@ fs_visitor::emit_interpolation_setup_gen4()
    emit(BRW_OPCODE_ADD, this->delta_y[BRW_WM_PERSPECTIVE_PIXEL_BARYCENTRIC],
         this->pixel_y, fs_reg(negate(brw_vec1_grf(1, 1))));
 
+   this->delta_x[BRW_WM_NONPERSPECTIVE_PIXEL_BARYCENTRIC] =
+     this->delta_x[BRW_WM_PERSPECTIVE_PIXEL_BARYCENTRIC];
+   this->delta_y[BRW_WM_NONPERSPECTIVE_PIXEL_BARYCENTRIC] =
+     this->delta_y[BRW_WM_PERSPECTIVE_PIXEL_BARYCENTRIC];
+
    this->current_annotation = "compute pos.w and 1/pos.w";
/* Compute wpos.w.  It's always in our setup, since it's needed to
 * interpolate the other attributes.
diff --git a/src/mesa/drivers/dri/i965/brw_sf.c b/src/mesa/drivers/dri/i965/brw_sf.c
index 0cc4fc7..85f5f51 100644
--- a/src/mesa/drivers/dri/i965/brw_sf.c
+++ b/src/mesa/drivers/dri/i965/brw_sf.c
@@ -139,6 +139,7 @@ brw_upload_sf_prog(struct brw_context *brw)
    struct brw_sf_prog_key key;
    /* _NEW_BUFFERS */
    bool render_to_fbo = _mesa_is_user_fbo(ctx->DrawBuffer);
+   int i;
 
    memset(&key, 0, sizeof(key));
 
@@ -191,7 +192,13 @@ brw_upload_sf_prog(struct brw_context *brw)
   key.sprite_origin_lower_left = true;
 
/* _NEW_LIGHT */
-   key.do_flat_shading = (ctx->Light.ShadeModel == GL_FLAT);
+   key.has_flat_shading = 0;
+   for (i = 0; i < BRW_VERT_RESULT_MAX; i++) {
+      if (brw_get_interpolation_mode(brw, i) == INTERP_QUALIFIER_FLAT) {
+         key.has_flat_shading = 1;
+         break;
+      }
+   }
    key.do_twoside_color = (ctx->Light.Enabled && ctx->Light.Model.TwoSide) ||
      ctx->VertexProgram._TwoSideEnabled;
brw_copy_interpolation_modes(brw, key.interpolation_mode);
diff --git a/src/mesa/drivers/dri/i965/brw_sf.h 
b/src/mesa/drivers/dri/i965/brw_sf.h
index 0a8135c..c718072 100644
--- a/src/mesa/drivers/dri/i965/brw_sf.h
+++ b/src/mesa/drivers/dri/i965/brw_sf.h
@@ -50,7 +50,7 @@ struct brw_sf_prog_key {
uint8_t point_sprite_coord_replace;
GLuint primitive:2;
GLuint do_twoside_color:1;
-   GLuint do_flat_shading:1;
+   GLuint has_flat_shading:1;
GLuint frontface_ccw:1;
GLuint do_point_sprite:1;
GLuint do_point_coord:1;
diff --git a/src/mesa/drivers/dri/i965/brw_sf_emit.c 
b/src/mesa/drivers/dri/i965/brw_sf_emit.c
index 9d8aa38..387685a 100644
--- a/src/mesa/drivers/dri/i965/brw_sf_emit.c
+++ b/src/mesa/drivers/dri/i965/brw_sf_emit.c
@@ -44,6 +44,17 @@
 
 
 /**
+ * Determine the vue slot corresponding to the given half of the given
+ * register.  half=0 means the first half of a register, half=1 means the
+ * second half.
+ */
+static inline int vert_reg_to_vue_slot(struct brw_sf_compile *c, GLuint reg,
+   int half)
+{
+   return (reg + c->urb_entry_read_offset) * 2 + half;
+}
+
+/**
  * Determine the vert_result corresponding to the given half of the given
  * register.  half=0 means the first half of a register, half=1 means the
  * second half.
@@ -51,11 +62,24 @@
 static inline int vert_reg_to_vert_result(struct brw_sf_compile *c, GLuint reg,
   int half)
 {
-   int vue_slot = (reg + c->urb_entry_read_offset) * 2 + half;
+   int vue_slot = vert_reg_to_vue_slot(c, reg, half

[Mesa-dev] [PATCH 4/5] intel gen4-5: Correctly handle flat vs. non-flat in the clipper.

2012-06-30 Thread Olivier Galibert
At that point, all interpolation piglit tests involving fixed clipping
work as long as there's no noperspective.

Signed-off-by: Olivier Galibert <galib...@pobox.com>
---
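To illustrate the intent (this is not the generated clip program, just
a scalar model with made-up types and names): flat-shaded attributes
are copied from the provoking vertex to the other vertices before
clipping, and the key bit only records whether any slot needs that.

```c
#include <stdbool.h>
#include <string.h>

/* Hypothetical stand-ins for the driver's types, for illustration only. */
enum interp_mode {
   INTERP_NONE,
   INTERP_SMOOTH,
   INTERP_FLAT,
   INTERP_NOPERSPECTIVE
};

#define NUM_SLOTS 4

struct vertex {
   float attr[NUM_SLOTS][4];
};

/* Counterpart of the has_flat_shading scan: true if any slot is flat. */
static bool any_flat(const enum interp_mode modes[NUM_SLOTS])
{
   for (int i = 0; i < NUM_SLOTS; i++) {
      if (modes[i] == INTERP_FLAT)
         return true;
   }
   return false;
}

/* Copy only the flat-shaded slots from vertex 'from' to vertex 'to'. */
static void copy_flatshaded_attributes(struct vertex *verts,
                                       const enum interp_mode modes[NUM_SLOTS],
                                       unsigned to, unsigned from)
{
   for (int i = 0; i < NUM_SLOTS; i++) {
      if (modes[i] == INTERP_FLAT)
         memcpy(verts[to].attr[i], verts[from].attr[i],
                sizeof(verts[to].attr[i]));
   }
}
```

Slots that are smooth or noperspective are left alone, which is the
difference from the old color-only copy.
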
 src/mesa/drivers/dri/i965/brw_clip.c  |   10 -
 src/mesa/drivers/dri/i965/brw_clip.h  |6 +--
 src/mesa/drivers/dri/i965/brw_clip_line.c |6 +--
 src/mesa/drivers/dri/i965/brw_clip_tri.c  |   20 -
 src/mesa/drivers/dri/i965/brw_clip_unfilled.c |2 +-
 src/mesa/drivers/dri/i965/brw_clip_util.c |   56 +++--
 6 files changed, 41 insertions(+), 59 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_clip.c b/src/mesa/drivers/dri/i965/brw_clip.c
index 52e8c47..952eb4a 100644
--- a/src/mesa/drivers/dri/i965/brw_clip.c
+++ b/src/mesa/drivers/dri/i965/brw_clip.c
@@ -215,7 +215,7 @@ brw_upload_clip_prog(struct brw_context *brw)
    struct intel_context *intel = &brw->intel;
    struct gl_context *ctx = &intel->ctx;
    struct brw_clip_prog_key key;
-
+   int i;
    memset(&key, 0, sizeof(key));
 
brw_lookup_interpolation(brw);
@@ -227,7 +227,13 @@ brw_upload_clip_prog(struct brw_context *brw)
/* CACHE_NEW_VS_PROG (also part of VUE map) */
    key.attrs = brw->vs.prog_data->outputs_written;
    /* _NEW_LIGHT */
-   key.do_flat_shading = (ctx->Light.ShadeModel == GL_FLAT);
+   key.has_flat_shading = 0;
+   for (i = 0; i < BRW_VERT_RESULT_MAX; i++) {
+      if (brw_get_interpolation_mode(brw, i) == INTERP_QUALIFIER_FLAT) {
+         key.has_flat_shading = 1;
+         break;
+      }
+   }
    key.pv_first = (ctx->Light.ProvokingVertex == GL_FIRST_VERTEX_CONVENTION);
brw_copy_interpolation_modes(brw, key.interpolation_mode);
/* _NEW_TRANSFORM (also part of VUE map)*/
diff --git a/src/mesa/drivers/dri/i965/brw_clip.h 
b/src/mesa/drivers/dri/i965/brw_clip.h
index 6f811ae..0ea0394 100644
--- a/src/mesa/drivers/dri/i965/brw_clip.h
+++ b/src/mesa/drivers/dri/i965/brw_clip.h
@@ -46,7 +46,7 @@ struct brw_clip_prog_key {
GLbitfield64 interpolation_mode[2]; /* copy of the main context */
GLuint primitive:4;
GLuint nr_userclip:4;
-   GLuint do_flat_shading:1;
+   GLuint has_flat_shading:1;
GLuint pv_first:1;
GLuint do_unfilled:1;
GLuint fill_cw:2;   /* includes cull information */
@@ -166,8 +166,8 @@ void brw_clip_kill_thread(struct brw_clip_compile *c);
 struct brw_reg brw_clip_plane_stride( struct brw_clip_compile *c );
 struct brw_reg brw_clip_plane0_address( struct brw_clip_compile *c );
 
-void brw_clip_copy_colors( struct brw_clip_compile *c,
-  GLuint to, GLuint from );
+void brw_clip_copy_flatshaded_attributes( struct brw_clip_compile *c,
+  GLuint to, GLuint from );
 
 void brw_clip_init_clipmask( struct brw_clip_compile *c );
 
diff --git a/src/mesa/drivers/dri/i965/brw_clip_line.c 
b/src/mesa/drivers/dri/i965/brw_clip_line.c
index 6cf2bd2..729d8c0 100644
--- a/src/mesa/drivers/dri/i965/brw_clip_line.c
+++ b/src/mesa/drivers/dri/i965/brw_clip_line.c
@@ -271,11 +271,11 @@ void brw_emit_line_clip( struct brw_clip_compile *c )
brw_clip_line_alloc_regs(c);
brw_clip_init_ff_sync(c);
 
-   if (c->key.do_flat_shading) {
+   if (c->key.has_flat_shading) {
       if (c->key.pv_first)
- brw_clip_copy_colors(c, 1, 0);
+ brw_clip_copy_flatshaded_attributes(c, 1, 0);
   else
- brw_clip_copy_colors(c, 0, 1);
+ brw_clip_copy_flatshaded_attributes(c, 0, 1);
}
 
clip_and_emit_line(c);
diff --git a/src/mesa/drivers/dri/i965/brw_clip_tri.c 
b/src/mesa/drivers/dri/i965/brw_clip_tri.c
index a29f8e0..71225f5 100644
--- a/src/mesa/drivers/dri/i965/brw_clip_tri.c
+++ b/src/mesa/drivers/dri/i965/brw_clip_tri.c
@@ -187,8 +187,8 @@ void brw_clip_tri_flat_shade( struct brw_clip_compile *c )
 
brw_IF(p, BRW_EXECUTE_1);
{
-  brw_clip_copy_colors(c, 1, 0);
-  brw_clip_copy_colors(c, 2, 0);
+  brw_clip_copy_flatshaded_attributes(c, 1, 0);
+  brw_clip_copy_flatshaded_attributes(c, 2, 0);
}
brw_ELSE(p);
{
@@ -200,19 +200,19 @@ void brw_clip_tri_flat_shade( struct brw_clip_compile *c )
 brw_imm_ud(_3DPRIM_TRIFAN));
 brw_IF(p, BRW_EXECUTE_1);
 {
-   brw_clip_copy_colors(c, 0, 1);
-   brw_clip_copy_colors(c, 2, 1);
+   brw_clip_copy_flatshaded_attributes(c, 0, 1);
+   brw_clip_copy_flatshaded_attributes(c, 2, 1);
 }
 brw_ELSE(p);
 {
-   brw_clip_copy_colors(c, 1, 0);
-   brw_clip_copy_colors(c, 2, 0);
+   brw_clip_copy_flatshaded_attributes(c, 1, 0);
+   brw_clip_copy_flatshaded_attributes(c, 2, 0);
 }
 brw_ENDIF(p);
   }
   else {
- brw_clip_copy_colors(c, 0, 2);
- brw_clip_copy_colors(c, 1, 2);
+ brw_clip_copy_flatshaded_attributes(c, 0, 2);
+ brw_clip_copy_flatshaded_attributes(c, 1, 2);
   }
}
brw_ENDIF(p);
@@ -606,8 +606,8

[Mesa-dev] [PATCH 5/5] intel gen4-5: Make noperspective clipping work.

2012-06-30 Thread Olivier Galibert
At this point all interpolation tests with fixed clipping work.

Signed-off-by: Olivier Galibert <galib...@pobox.com>
---
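For readers following the brw_clip_util.c changes: the clipper's t is a
parameter along the edge in clip space, but a noperspective attribute
has to be interpolated with a parameter measured in screen space.  A
rough scalar sketch of that conversion, assuming 2D NDC positions and
using the sum of the absolute X and Y deltas as the length measure (as
the comment in the patch suggests).  The struct and function names are
illustrative, not driver code:

```c
#include <math.h>

/* Clip-space position reduced to what matters here (illustrative type). */
struct clip_pos {
   float x, y, w;
};

/* Convert a clip-space interpolation factor t (0 at v0, 1 at v1) into
 * the matching screen-space factor by comparing projected distances. */
static float screen_space_t(struct clip_pos v0, struct clip_pos v1, float t)
{
   /* New vertex, interpolated in clip space. */
   struct clip_pos p = {
      v0.x + t * (v1.x - v0.x),
      v0.y + t * (v1.y - v0.y),
      v0.w + t * (v1.w - v0.w),
   };

   /* Project all three points to NDC. */
   float x0 = v0.x / v0.w, y0 = v0.y / v0.w;
   float x1 = v1.x / v1.w, y1 = v1.y / v1.w;
   float xp = p.x / p.w, yp = p.y / p.w;

   /* |dx| + |dy| from v0 to the new vertex, and from v0 to v1. */
   float num = fabsf(xp - x0) + fabsf(yp - y0);
   float den = fabsf(x1 - x0) + fabsf(y1 - y0);

   /* Degenerate edge on screen: every factor gives the same position. */
   if (den == 0.0f)
      return t;

   return num / den;
}
```

Since the projected new vertex lies on the segment between the two
projected endpoints, the ratio of |dx|+|dy| sums equals the ratio of
the actual screen distances, so no square root is needed.
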
 src/mesa/drivers/dri/i965/brw_clip.c  |9 ++
 src/mesa/drivers/dri/i965/brw_clip.h  |1 +
 src/mesa/drivers/dri/i965/brw_clip_util.c |  133 ++---
 3 files changed, 132 insertions(+), 11 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_clip.c 
b/src/mesa/drivers/dri/i965/brw_clip.c
index 952eb4a..6bfdf24 100644
--- a/src/mesa/drivers/dri/i965/brw_clip.c
+++ b/src/mesa/drivers/dri/i965/brw_clip.c
@@ -234,6 +234,15 @@ brw_upload_clip_prog(struct brw_context *brw)
  break;
   }
}
+   key.has_noperspective_shading = 0;
+   for (i = 0; i < BRW_VERT_RESULT_MAX; i++) {
+      if (brw_get_interpolation_mode(brw, i) == INTERP_QUALIFIER_NOPERSPECTIVE &&
+          brw->vs.prog_data->vue_map.slot_to_vert_result[i] != VERT_RESULT_HPOS) {
+         key.has_noperspective_shading = 1;
+         break;
+      }
+   }
+
    key.pv_first = (ctx->Light.ProvokingVertex == GL_FIRST_VERTEX_CONVENTION);
brw_copy_interpolation_modes(brw, key.interpolation_mode);
/* _NEW_TRANSFORM (also part of VUE map)*/
diff --git a/src/mesa/drivers/dri/i965/brw_clip.h 
b/src/mesa/drivers/dri/i965/brw_clip.h
index 0ea0394..2a7245a 100644
--- a/src/mesa/drivers/dri/i965/brw_clip.h
+++ b/src/mesa/drivers/dri/i965/brw_clip.h
@@ -47,6 +47,7 @@ struct brw_clip_prog_key {
GLuint primitive:4;
GLuint nr_userclip:4;
GLuint has_flat_shading:1;
+   GLuint has_noperspective_shading:1;
GLuint pv_first:1;
GLuint do_unfilled:1;
GLuint fill_cw:2;   /* includes cull information */
diff --git a/src/mesa/drivers/dri/i965/brw_clip_util.c 
b/src/mesa/drivers/dri/i965/brw_clip_util.c
index 7b0205a..5bdcef8 100644
--- a/src/mesa/drivers/dri/i965/brw_clip_util.c
+++ b/src/mesa/drivers/dri/i965/brw_clip_util.c
@@ -129,6 +129,8 @@ static void brw_clip_project_vertex( struct brw_clip_compile *c,
 
 /* Interpolate between two vertices and put the result into a0.0.  
  * Increment a0.0 accordingly.
+ *
+ * Beware that dest_ptr can be equal to v0_ptr.
  */
 void brw_clip_interp_vertex( struct brw_clip_compile *c,
 struct brw_indirect dest_ptr,
@@ -138,8 +140,9 @@ void brw_clip_interp_vertex( struct brw_clip_compile *c,
 bool force_edgeflag)
 {
    struct brw_compile *p = &c->func;
-   struct brw_reg tmp = get_tmp(c);
-   GLuint slot;
+   struct brw_context *brw = p->brw;
+   struct brw_reg tmp, t_nopersp, v0_ndc_copy;
+   GLuint slot, delta;
 
/* Just copy the vertex header:
 */
@@ -148,13 +151,119 @@ void brw_clip_interp_vertex( struct brw_clip_compile *c,
 * back on Ironlake, so needn't change it
 */
brw_copy_indirect_to_indirect(p, dest_ptr, v0_ptr, 1);
-  
-   /* Iterate over each attribute (could be done in pairs?)
+
+   /*
+* First handle the 3D and NDC positioning, in case we need
+* noperspective interpolation.  Doing it early has no performance
+* impact in any case.
+*/
+
+   /* Start by picking up the v0 NDC coordinates, because that vertex
+* may be shared with the destination.
+*/
+   if (c->key.has_noperspective_shading) {
+      v0_ndc_copy = get_tmp(c);
+      brw_MOV(p, v0_ndc_copy, deref_4f(v0_ptr,
+                                       brw_vert_result_to_offset(&c->vue_map,
+                                                                 BRW_VERT_RESULT_NDC)));
+   }  
+
+   /*
+* Compute the new 3D position
+*/
+
+   delta = brw_vert_result_to_offset(&c->vue_map, VERT_RESULT_HPOS);
+   tmp = get_tmp(c);
+   brw_MUL(p, 
+   vec4(brw_null_reg()),
+   deref_4f(v1_ptr, delta),
+   t0);
+
+   brw_MAC(p, 
+   tmp,  
+   negate(deref_4f(v0_ptr, delta)),
+   t0); 
+ 
+   brw_ADD(p,
+   deref_4f(dest_ptr, delta), 
+   deref_4f(v0_ptr, delta),
+   tmp);
+   release_tmp(c, tmp);
+
+   /* Then recreate the projected (NDC) coordinate in the new vertex
+* header
 */
+   brw_clip_project_vertex(c, dest_ptr);
+
+   /*
+* If we have noperspective attributes, we now need to compute the
+* screen-space t.
+*/
+   if (c->key.has_noperspective_shading) {
+      delta = brw_vert_result_to_offset(&c->vue_map, BRW_VERT_RESULT_NDC);
+  t_nopersp = get_tmp(c);
+  tmp = get_tmp(c);
+
+  /* Build a register with coordinates from the second and new vertices */
+  brw_MOV(p, t_nopersp, deref_4f(v1_ptr, delta));
+  brw_MOV(p, tmp, deref_4f(dest_ptr, delta));
+  brw_set_access_mode(p, BRW_ALIGN_16);
+  brw_MOV(p,
+  brw_writemask(t_nopersp, WRITEMASK_ZW),
+  brw_swizzle(tmp, 0,1,0,1));
+
+  /* Subtract the coordinates of the first vertex */
+      brw_ADD(p, t_nopersp, t_nopersp, negate(brw_swizzle(v0_ndc_copy, 0,1,0,1)));
+
+  /* Add the absolute value of the X and Y deltas so

Re: [Mesa-dev] Mesa (master): draw: fix flat shading and screen-space linear interpolation in clipper

2012-06-29 Thread Olivier Galibert
On Fri, Jun 29, 2012 at 12:52:06PM -0700, Stéphane Marchesin wrote:
> Yeah, but my question was more high level, whether the vertex id
> support required the previous refactor. It looks like it does though,
> and I don't want to untangle, so I'll revert both 3/4 and 4/4.

You realize that will re-break instanceID on llvmpipe for glsl  120, right?

  OG.


Re: [Mesa-dev] Mesa (master): draw: fix flat shading and screen-space linear interpolation in clipper

2012-06-29 Thread Olivier Galibert
On Fri, Jun 29, 2012 at 03:09:23PM -0700, Stéphane Marchesin wrote:
> I do, but it fixes a regression, so unless you have a fix, it's the way to
> go. If you have a fix I'll happily test it :)

Just between us, revert on small regressions may not be optimal long
term on a project like mesa where the review/commit pipeline is
clogged.  The risk of losing developers is non-negligible.  The linux
kernel can afford it because even if you miss a cycle you know that
you will have another one in two months, and there are a lot of
intermediate collation trees in which your patch can be tried out and
shaken for bugs (subsystem trees, -next, akp patch tree, etc).  I'm
not sure Mesa can afford it.

That said, try this.

commit 56555c58d7f16c8d619c21feb23096155e2fb505
Author: Olivier Galibert <galib...@pobox.com>
Date:   Sat Jun 30 00:41:20 2012 +0200

lp_bld_tgsi_soa: Fix conversion of system values to float.

Signed-off-by: Olivier Galibert <galib...@pobox.com>

diff --git a/src/gallium/auxiliary/gallivm/lp_bld_tgsi_soa.c b/src/gallium/auxiliary/gallivm/lp_bld_tgsi_soa.c
index 55db561..f8df2bc 100644
--- a/src/gallium/auxiliary/gallivm/lp_bld_tgsi_soa.c
+++ b/src/gallium/auxiliary/gallivm/lp_bld_tgsi_soa.c
@@ -811,9 +811,10 @@ emit_fetch_system_value(
       break;
    }
 
+   /* Extend that when atype can happen to be float */
    if (atype != stype) {
       if (stype == TGSI_TYPE_FLOAT) {
-         res = LLVMBuildBitCast(builder, res, bld_base->base.vec_type, "");
+         res = lp_build_int_to_float(&bld_base->base, res);
       } else if (stype == TGSI_TYPE_UNSIGNED) {
          res = LLVMBuildBitCast(builder, res, bld_base->uint_bld.vec_type, "");
       } else if (stype == TGSI_TYPE_SIGNED) {
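
The difference between the old and the new line is the difference
between reinterpreting the bits of an integer and converting its
value.  A small C analogy with plain scalars (not gallivm code; the
function names are illustrative):

```c
#include <stdint.h>
#include <string.h>

/* What a bitcast does: keep the 32 bits, change only the type. */
static float bitcast_int_to_float(int32_t i)
{
   float f;

   memcpy(&f, &i, sizeof(f));
   return f;
}

/* What an int-to-float conversion does: keep the value. */
static float convert_int_to_float(int32_t i)
{
   return (float)i;
}
```

An instance id of 3 converts to 3.0, while its bit pattern read as a
float is a tiny denormal, which is why a shader that consumes the
system value as a float saw garbage.
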


Re: [Mesa-dev] [PATCH] lp_build_lod_selector: Disable brilinear folding on explicit lod.

2012-06-27 Thread Olivier Galibert
On Wed, Jun 27, 2012 at 03:17:05AM -0700, Jose Fonseca wrote:
> I took a look at the results, and it seems to me that bri linear
> code is fine -- the test is merely too strict, and does not forgive
> the gravitation towards integer lod that brilinear implements.

Yes, the current code maps [0,.25] to 0, [0.25,0.75] to [0,1] and
[0.75,1] to 1.  So you need an error tolerance of 0.20 given how
the test is done on multiples of 0.2.
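
Spelled out as code, the mapping of the fractional part is a clamped
ramp.  A sketch assuming a factor of 2 (which is what produces the
0.25/0.75 breakpoints above); the function name is made up and this is
a scalar model, not the gallivm implementation:

```c
/* Scalar model of the brilinear fold applied to the fractional lod.
 * With factor 2: [0,0.25] -> 0, [0.25,0.75] -> [0,1], [0.75,1] -> 1. */
static float brilinear_fract(float fract, float factor)
{
   /* Ramp centred on 0.5, steeper than the identity by 'factor'. */
   float f = factor * (fract - 0.5f) + 0.5f;

   if (f < 0.0f)
      return 0.0f;
   if (f > 1.0f)
      return 1.0f;
   return f;
}
```

At fractions 0.2, 0.4, 0.6 and 0.8 this gives about 0, 0.3, 0.7 and 1,
so the worst deviation from the exact value is 0.2.
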

What are your criteria for deciding that a precision is good enough?

  OG.



Re: [Mesa-dev] [PATCH] lp_build_lod_selector: Disable brilinear folding on explicit lod.

2012-06-26 Thread Olivier Galibert
On Mon, Jun 25, 2012 at 03:16:35PM -0700, Jose Fonseca wrote:
> Indeed lp_build_brilinear_lod is not faster than
> lp_build_ifloor_fract, but brilinear is faster, not because log is a
> faster approximation, but because it increases the odds that fract
> part is zero, which means that we can sample from a single mip
> level, instead of lerping between two mip levels.
>
> I think you have a good point here -- lp_build_brilinear_lod is a
> log2 approximation which is wrong here and that's a great catch, --
> but I have a point too: lp_build_ifloor_fract will slow down texture
> sampling.
>
> Just like log2 and brilinear log2, we need a variant of
> ifloor_fract, that increases the probability of fract part being
> zero, essentially by applying a stair case transformation like:

You can do that by multiplying by 'k', subtracting 0.5*k and clamping
to [0,1[.  The question is whether you really want to do something
like that for explicit lod, where the user supposedly exactly knows
what he wants.  textureLod is not used often at all[1], so one can
think that when it's used you'd better do it precisely.

  OG.

[1] You see more uses of lod bias and/or textureGrad, the latter due
to the use of conditionals.


Re: [Mesa-dev] [PATCH 2/3] llvmpipe: handle more PIPE_CAP_x queries

2012-06-26 Thread Olivier Galibert
On Tue, Jun 26, 2012 at 02:46:01PM -0600, Brian Paul wrote:
> As with the previous commit for softpipe.
>
> v2: remove 'default' case to get compile-time warning
> ---
>  src/gallium/drivers/llvmpipe/lp_screen.c |   52 +++--
>  1 files changed, 48 insertions(+), 4 deletions(-)
>
> diff --git a/src/gallium/drivers/llvmpipe/lp_screen.c b/src/gallium/drivers/llvmpipe/lp_screen.c
> index 40037a5..e66737b 100644
> --- a/src/gallium/drivers/llvmpipe/lp_screen.c
> +++ b/src/gallium/drivers/llvmpipe/lp_screen.c
> +   case PIPE_CAP_GLSL_FEATURE_LEVEL:
> +      return 0;

Why not 120?

  OG.


[Mesa-dev] [PATCH] llvmpipe: Remove the ARB_draw_instanced capability.

2012-06-25 Thread Olivier Galibert
That capability requires integer handling and that's not yet active,
ending with a failure in draw-non-instanced unless you force it on.
See bug 51366.

Frankly, I'd rather have that patch rejected and integer/glsl 130
capability activated instead.  There still are things missing, but
they mostly have their own extension anyway.  And the overall picture
ain't so bad.

Signed-off-by: Olivier Galibert <galib...@pobox.com>

diff --git a/src/gallium/drivers/llvmpipe/lp_screen.c b/src/gallium/drivers/llvmpipe/lp_screen.c
index 40037a5..5eb826e 100644
--- a/src/gallium/drivers/llvmpipe/lp_screen.c
+++ b/src/gallium/drivers/llvmpipe/lp_screen.c
@@ -152,8 +152,6 @@ llvmpipe_get_param(struct pipe_screen *screen, enum pipe_cap param)
   return 1;
case PIPE_CAP_DEPTH_CLIP_DISABLE:
   return 0;
-   case PIPE_CAP_TGSI_INSTANCEID:
-   case PIPE_CAP_VERTEX_ELEMENT_INSTANCE_DIVISOR:
case PIPE_CAP_MIXED_COLORBUFFER_FORMATS:
case PIPE_CAP_CONDITIONAL_RENDER:
   return 1;


Re: [Mesa-dev] [PATCH] softpipe: Do round-to-even, not round-up.

2012-06-25 Thread Olivier Galibert
On Fri, May 18, 2012 at 08:55:39AM -0600, Brian Paul wrote:
> In any case, I think this function could be moved into u_math.c so it
> could be used elsewhere.
[...]
> I was looking at the GLSL round() and roundEven() functions.  The GLSL
> spec says round() can use whatever method is fastest.  But in
> builtin_functions.cpp the round() function is implemented in terms of
> the round_even builtin.  It seems to me that we should have a generic
> 'round' builtin function and separate TGSI_ROUND and TGSI_ROUND_EVEN
> opcodes so that drivers can really have the option of using a
> faster/looser round function.

I've tried doing that.  I've moved the function to u_math.c, then made
src/glsl/ir_constant_expression.cpp use it.  That blew up.

If I compile with scons, I get:
  Linking build/linux-x86_64-debug/glsl/builtin_compiler ...
build/linux-x86_64-debug/glsl/ir_constant_expression.o: In function `dot':
/home/galibert/X/work/mesa-play/src/glsl/ir_constant_expression.cpp:47: 
undefined reference to `_debug_assert_fail'
[...]
/home/galibert/X/work/mesa-play/src/glsl/ir_constant_expression.cpp:265: 
undefined reference to `ieee754_fp32_round_half_to_even'
[etc]

If I compile with autoconf/make I get:
ir_constant_expression.cpp:42:25: fatal error: util/u_math.h: No such file or 
directory

So at that point src/glsl and src/gallium are not supposed to meet
each other.  And changing that is not a responsibility I feel like
taking.  Any advice?

Best,

  OG.



[Mesa-dev] [PATCH] builtin_variables: Only advertise gl_InstanceIDARB when GLSL handle integers.

2012-06-25 Thread Olivier Galibert
It can be argued it makes no sense to advertise an integer system
variable in GLSL levels where integers aren't handled.

Signed-off-by: Olivier Galibert <galib...@pobox.com>

---

I don't really know if that's a patch we want, but otoh having
gl_InstanceIDARB being a different type depending on the GLSL version
would be... weird.


diff --git a/src/glsl/builtin_variables.cpp b/src/glsl/builtin_variables.cpp
index 03b64c9..f9a341f 100644
--- a/src/glsl/builtin_variables.cpp
+++ b/src/glsl/builtin_variables.cpp
@@ -888,12 +888,13 @@ generate_ARB_draw_instanced_variables(exec_list *instructions,
                                       bool warn,
                                       _mesa_glsl_parser_targets target)
 {
-   /* gl_InstanceIDARB is only available in the vertex shader.
+   /* gl_InstanceIDARB is only available in the vertex shader, and
+    * only if the glsl level can handle integers.
     */
    if (target != vertex_shader)
       return;
 
-   if (state->ARB_draw_instanced_enable) {
+   if (state->ARB_draw_instanced_enable && state->language_version >= 130) {
       ir_variable *inst =
          add_variable(instructions, state->symbols,
                       "gl_InstanceIDARB", glsl_type::int_type,


[Mesa-dev] Comparison of llvmpipe with 2.9 and 3.1

2012-06-25 Thread Olivier Galibert
  Hi,

I've just finished two piglit runs of llvmpipe with glsl 1.40 and gl
3.1 forced on, one with LLVM 2.9, the other with 3.1.

The least we can say is that there aren't many differences. 

- fp-indirections2, didn't have the patience to wait to see whether it
  would eventually stop.  Looks like something quadratic or worse in
  the LLVM optimizers.

- 17000-consecutive-chars-identifier, the memory corruption it creates
  behaved differently (probably due to the different glibc, it wasn't on
  the same box), causing a deadlock in malloc()

- texCombine fails on 3.1 only with:
Returncode: -5

Errors:
src/gallium/auxiliary/draw/draw_llvm.c:309:create_jit_vertex_header: Assertion 
`LLVMABISizeOfType(target, vertex_header) == __builtin_offsetof (struct 
vertex_header, data[data_elems])' failed.


Output:
--
GL_EXT_texture_env_combine verification test.
We only test a subset of all possible texture env combinations
because there's simply too many to exhaustively test them all.



So, in total, the story isn't bad.

Best,

  OG.


[Mesa-dev] [PATCH] u2f_emit: Fix type parameter in LLVM call.

2012-06-25 Thread Olivier Galibert
The type is the destination type (i.e. float vector) and not the
source type.  Fixes piglit fs-{in,de}crement-uint.

Signed-off-by: Olivier Galibert <galib...@pobox.com>
---
 src/gallium/auxiliary/gallivm/lp_bld_tgsi_action.c |2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/src/gallium/auxiliary/gallivm/lp_bld_tgsi_action.c b/src/gallium/auxiliary/gallivm/lp_bld_tgsi_action.c
index cbc5945..17f288f 100644
--- a/src/gallium/auxiliary/gallivm/lp_bld_tgsi_action.c
+++ b/src/gallium/auxiliary/gallivm/lp_bld_tgsi_action.c
@@ -693,7 +693,7 @@ u2f_emit(
 {
    emit_data->output[emit_data->chan] = LLVMBuildUIToFP(bld_base->base.gallivm->builder,
                                                         emit_data->args[0],
-                                                        bld_base->uint_bld.vec_type, "");
+                                                        bld_base->base.vec_type, "");
 }
 
 static void
-- 
1.7.10.280.gaa39



[Mesa-dev] [PATCH] lp_build_lod_selector: Disable brilinear folding on explicit lod.

2012-06-25 Thread Olivier Galibert
Brilinear folding must only be used if the log2 was computed with
brilinear too.  Fixes fs-textureLod-miplevels.

Signed-off-by: Olivier Galibert <galib...@pobox.com>
---
 src/gallium/auxiliary/gallivm/lp_bld_sample.c |2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/src/gallium/auxiliary/gallivm/lp_bld_sample.c b/src/gallium/auxiliary/gallivm/lp_bld_sample.c
index d966788..9deda61 100644
--- a/src/gallium/auxiliary/gallivm/lp_bld_sample.c
+++ b/src/gallium/auxiliary/gallivm/lp_bld_sample.c
@@ -513,7 +513,7 @@ lp_build_lod_selector(struct lp_build_sample_context *bld,
    }
 
    if (mip_filter == PIPE_TEX_MIPFILTER_LINEAR) {
-      if (!(gallivm_debug & GALLIVM_DEBUG_NO_BRILINEAR)) {
+      if (!explicit_lod && !(gallivm_debug & GALLIVM_DEBUG_NO_BRILINEAR)) {
  lp_build_brilinear_lod(float_bld, lod, BRILINEAR_FACTOR,
 out_lod_ipart, out_lod_fpart);
   }
-- 
1.7.10.280.gaa39



Re: [Mesa-dev] [PATCH] llvmpipe: Remove the ARB_draw_instanced capability.

2012-06-25 Thread Olivier Galibert
On Mon, Jun 25, 2012 at 05:34:25AM -0700, Jose Fonseca wrote:
> ----- Original Message -----
> > That capability requires integer handling and that's not yet active,
> > ending with a failure in draw-non-instanced unless you force it on.
> > See bug 51366.
> >
> > Frankly, I'd rather have that patch rejected and integer/glsl 130
> > capability activated instead.  There still are things missing, but
> > they mostly have their own extension anyway.  And the overall picture
> > ain't so bad.
>
> I'm personally also more interested in seeing llvmpipe to get the missing
> features for GLSL 1.30 / OGL 3.
>
> What's the overall picture of llvmpipe w/ integer/glsl 130? That is, how many
> piglit tests go from skipped to passed/failed?

To failed:

precision-05.vert
link-mismatch-layout-02
no-redeclaration-01.vert
feature-macro.vert
fs-exec-after-break
  - general failures, everybody has them

vs-clip-distance-bulk-assign
vs-clip-distance-inout-param
vs-clip-distance-out-param
vs-clip-distance-retval
  - haven't checked what the problem is, softpipe has it right

fs-isinf-vec2
fs-isinf-vec3
fs-isinf-vec4
vs-isinf-vec2
vs-isinf-vec3
vs-isinf-vec4
  - test is iffy

fs-texelFetch-2D
fs-texelFetchOffset-2D
  - no texelFetch support yet

fs-texture-sampler2dshadow-10
fs-texture-sampler2dshadow-11
  - dunno what's going on, softpipe fails it too

vs-attrib-ivec4-implied
vs-attrib-ivec4-precision
vs-attrib-uvec4-implied
vs-attrib-uvec4-precision
  - use glVertexAttribIPointer, which is GL 3.0+ only

vs-textureLod-miplevels
  - issue with vertex shader invalidation when sampler mode changes (as in,
    it's not done)

vs-textureLod-miplevels-2
  - you know that one, it's nowhere near fixed yet (the softpipe patch is
    awaiting review too :-)

texel-offset-limits
  - no limits defined in lp_screen.c, dunno whether texture() would take it
    into account either

To pass:
1503 total, it seems, you can be sure I'm not going to list them :-)

Best,

  OG.


Re: [Mesa-dev] [PATCH] lp_build_lod_selector: Disable brilinear folding on explicit lod.

2012-06-25 Thread Olivier Galibert
On Mon, Jun 25, 2012 at 07:31:20PM +0100, Roland Scheidegger wrote:
> Does this fix the test because lp_build_brilinear_lod produces bogus
> values in this case or just because the test is strict about such
> filtering optimizations? In the latter case I'm not sure I really see
> much point.

Bogus.  It does the fractional-part log2 approximation there, which
only makes sense if you called fast_log2 before (and even then the log
bias is going to be strangely applied, but meh).


> I'm surprised it can actually pass in either case since we drop all but
> the first lod per quad values on the floor anyway so I think you will
> get neither the right filtering weights between mipmaps nor even the
> right mip levels (if the integer part of the lod isn't the same) for
> anything but the first texel per quad.

Luck due to the design of the test.  It's rectangles with a fixed lod
value, so the quads all have the same.  That's pretty much why I
cooked up miplevels-2 (only in vs though, it's much easier there and
the code is shared).

  OG.


Re: [Mesa-dev] [PATCH] lp_build_lod_selector: Disable brilinear folding on explicit lod.

2012-06-25 Thread Olivier Galibert
On Mon, Jun 25, 2012 at 11:40:08AM -0700, Jose Fonseca wrote:
> My thoughts too.
>
> Brilinear filtering provides a significant boost, and I don't see why skip
> the optimization for explicit lod over implicit lods.

Warning: code misread :-)

Explicit lod does not need brilinear filtering because explicit lod is
post log2.  Brilinear is only about a faster log2, nothing else.
Explicit lod only needs the integer/fractional part separation.

The whole code is:
   if (mip_filter == PIPE_TEX_MIPFILTER_LINEAR) {
      if (!explicit_lod && !(gallivm_debug & GALLIVM_DEBUG_NO_BRILINEAR)) {
         lp_build_brilinear_lod(float_bld, lod, BRILINEAR_FACTOR,
                                out_lod_ipart, out_lod_fpart);
      }
      else {
         lp_build_ifloor_fract(float_bld, lod, out_lod_ipart, out_lod_fpart);
      }

      lp_build_name(*out_lod_fpart, "lod_fpart");
   }
   else {
      *out_lod_ipart = lp_build_iround(float_bld, lod);
   }

and you're not going to tell me that lp_build_brilinear_lod is faster
than lp_build_ifloor_fract (especially since it includes it ;-)

  OG.



[Mesa-dev] [PATCH] st_program.c: gl_ClipDistance must be interpolated in 3d space.

2012-06-24 Thread Olivier Galibert
That old bug was hidden by the clipper always interpolating in 3d
space no matter what it should have been doing.  Now that the
interpolation has been fixed, the bug shows up.

Fixes bugzilla 51364.

Signed-off-by: Olivier Galibert <galib...@pobox.com>

diff --git a/src/mesa/state_tracker/st_program.c b/src/mesa/state_tracker/st_program.c
index e6664fb..9f98298 100644
--- a/src/mesa/state_tracker/st_program.c
+++ b/src/mesa/state_tracker/st_program.c
@@ -569,12 +569,12 @@ st_translate_fragment_program(struct st_context *st,
  case FRAG_ATTRIB_CLIP_DIST0:
 input_semantic_name[slot] = TGSI_SEMANTIC_CLIPDIST;
 input_semantic_index[slot] = 0;
-interpMode[slot] = TGSI_INTERPOLATE_LINEAR;
+interpMode[slot] = TGSI_INTERPOLATE_PERSPECTIVE;
 break;
  case FRAG_ATTRIB_CLIP_DIST1:
 input_semantic_name[slot] = TGSI_SEMANTIC_CLIPDIST;
 input_semantic_index[slot] = 1;
-interpMode[slot] = TGSI_INTERPOLATE_LINEAR;
+interpMode[slot] = TGSI_INTERPOLATE_PERSPECTIVE;
 break;
 /* In most cases, there is nothing special about these
  * inputs, so adopt a convention to use the generic


Re: [Mesa-dev] [PATCH] softpipe: Take all lods into account when texture sampling.

2012-06-21 Thread Olivier Galibert
On Wed, Jun 20, 2012 at 01:44:14PM +0100, Roland Scheidegger wrote:
> A lot of code I just glossed over it, but seems to look ok other than
> the (performance) implications this might have.

Actually whether there's a performance implication is not obvious.  In
practice the code just kicks the 4-pixel loop one or two function
calls higher.  This unshares some tests, some function calls, and the
mip-size computation shifts.  For normal texturing and on x86 the
tests are correctly predicted after the first one, and so are the
function calls, giving all of them a near zero cost.  So I'm not sure
the cost is that measurable.

With the actual vectorization the llvmpipe situation may be different
(not so sure with the aos texturing though).

Best,

  OG.



Re: [Mesa-dev] [PATCH] i965/msaa: Only do multisample rasterization if GL_MULTISAMPLE enabled.

2012-06-21 Thread Olivier Galibert
On Thu, Jun 21, 2012 at 11:19:39AM +0200, Michel Dänzer wrote:
> On Tue, 2012-06-19 at 17:18 -0700, Kenneth Graunke wrote:
> > Also, distribute the appropriate emacs and vim settings to indent things
> > correctly.
>
> In any case, please do this *before* any kind of cleanup.

(global-set-key [(control c) (s)]  (lambda () (interactive) (setq 
c-basic-offset 3 tab-width 8 indent-tabs-mode nil)))

  OG.


Re: [Mesa-dev] [PATCH] i965/msaa: Only do multisample rasterization if GL_MULTISAMPLE enabled.

2012-06-21 Thread Olivier Galibert
On Thu, Jun 21, 2012 at 11:58:57AM +0200, Michel Dänzer wrote:
> On Thu, 2012-06-21 at 11:38 +0200, Olivier Galibert wrote:
> > On Thu, Jun 21, 2012 at 11:19:39AM +0200, Michel Dänzer wrote:
> > > On Tue, 2012-06-19 at 17:18 -0700, Kenneth Graunke wrote:
> > > > Also, distribute the appropriate emacs and vim settings to indent things
> > > > correctly.
> > >
> > > In any case, please do this *before* any kind of cleanup.
> >
> > (global-set-key [(control c) (s)]  (lambda () (interactive) (setq
> > c-basic-offset 3 tab-width 8 indent-tabs-mode nil)))
>
> The point is to encode that in a file in the tree which is picked up
> automagically.

Errr, automagically running code coming from a repository without user
intervention is not usually considered smart, security-wise...

  OG.



Re: [Mesa-dev] Mesa (master): draw: fix flat shading and screen-space linear interpolation in clipper

2012-06-21 Thread Olivier Galibert
On Thu, Jun 21, 2012 at 10:28:22AM -0700, Jose Fonseca wrote:
> This patch series is causing regressions in select/feedback mode. Can you
> take a look?

Sure.  I wouldn't have expected that case to ever happen, but it makes
sense now that I think of it.
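
The fix amounts to falling back to the defaults when there is no
fragment shader to ask.  A minimal standalone sketch of that pattern,
with made-up types (struct fs_info, pick_interp and the INTERP_* enum
are illustrative stand-ins, not the draw module's real structures):

```c
#include <stddef.h>

enum interp_mode { INTERP_CONSTANT, INTERP_LINEAR, INTERP_PERSPECTIVE };

/* Hypothetical, simplified stand-in for the fragment shader info. */
struct fs_info {
   unsigned num_inputs;
   int input_semantic[8];
   enum interp_mode input_interp[8];
};

/* Return the interpolation mode the fragment shader requests for the
 * given semantic, or the default when there is no fragment shader or
 * no matching input. */
static enum interp_mode
pick_interp(const struct fs_info *fs, int semantic)
{
   enum interp_mode interp = INTERP_PERSPECTIVE;   /* default */
   unsigned j;

   if (fs) {
      for (j = 0; j < fs->num_inputs; j++) {
         if (fs->input_semantic[j] == semantic) {
            interp = fs->input_interp[j];
            break;
         }
      }
   }
   return interp;
}
```

The actual change is the commit below.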

commit edc7b26b03c0393582ff5ec8c963207c7553850a
Author: Olivier Galibert galib...@pobox.com
Date:   Thu Jun 21 19:37:11 2012 +0200

clip_init_state: Handle the case when there isn't a fragment shader.

Signed-off-by: Olivier Galibert galib...@pobox.com

diff --git a/src/gallium/auxiliary/draw/draw_pipe_clip.c 
b/src/gallium/auxiliary/draw/draw_pipe_clip.c
index 2d36eb3..c02d0ef 100644
--- a/src/gallium/auxiliary/draw/draw_pipe_clip.c
+++ b/src/gallium/auxiliary/draw/draw_pipe_clip.c
@@ -586,6 +586,9 @@ clip_init_state( struct draw_stage *stage )
 * two outputs for one input, so we tuck the information in a
 * specific array.  Second if they don't have qualifiers, the
 * default value has to be picked from the global shade mode.
+*
+* Of course, if we don't have a fragment shader in the first
+* place, defaults should be used.
 */
 
/* First pick up the interpolation mode for
@@ -595,10 +598,12 @@ clip_init_state( struct draw_stage *stage )
    indexed_interp[0] = indexed_interp[1] = stage->draw->rasterizer->flatshade ?
       TGSI_INTERPOLATE_CONSTANT : TGSI_INTERPOLATE_PERSPECTIVE;
 
-   for (i = 0; i < fs->info.num_inputs; i++) {
-      if (fs->info.input_semantic_name[i] == TGSI_SEMANTIC_COLOR) {
-         if (fs->info.input_interpolate[i] != TGSI_INTERPOLATE_COLOR)
-            indexed_interp[fs->info.input_semantic_index[i]] = fs->info.input_interpolate[i];
+   if (fs) {
+      for (i = 0; i < fs->info.num_inputs; i++) {
+         if (fs->info.input_semantic_name[i] == TGSI_SEMANTIC_COLOR) {
+            if (fs->info.input_interpolate[i] != TGSI_INTERPOLATE_COLOR)
+               indexed_interp[fs->info.input_semantic_index[i]] = fs->info.input_interpolate[i];
+         }
       }
    }
 
@@ -627,12 +632,14 @@ clip_init_state( struct draw_stage *stage )
           */
          uint j;
          interp = TGSI_INTERPOLATE_PERSPECTIVE;
-         for (j = 0; j < fs->info.num_inputs; j++) {
-            if (vs->info.output_semantic_name[i] == fs->info.input_semantic_name[j] &&
-                vs->info.output_semantic_index[i] == fs->info.input_semantic_index[j]) {
-               interp = fs->info.input_interpolate[j];
-               break;
-            }
+         if (fs) {
+            for (j = 0; j < fs->info.num_inputs; j++) {
+               if (vs->info.output_semantic_name[i] == fs->info.input_semantic_name[j] &&
+                   vs->info.output_semantic_index[i] == fs->info.input_semantic_index[j]) {
+                  interp = fs->info.input_interpolate[j];
+                  break;
+               }
+            }
          }
       }
 


[Mesa-dev] Ping: patches to apply

2012-06-19 Thread Olivier Galibert
  Hi,

They've been reviewed, they've been changed when requested :-)

Best,

  OG.


[Mesa-dev] [PATCH 1/4] softpipe: Offset is not to be applied to the layer parameter of array texture fetches.

2012-06-19 Thread Olivier Galibert
Signed-off-by: Olivier Galibert galib...@pobox.com
Reviewed-by: Brian Paul bri...@vmware.com
---
 src/gallium/drivers/softpipe/sp_tex_sample.c |4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/src/gallium/drivers/softpipe/sp_tex_sample.c 
b/src/gallium/drivers/softpipe/sp_tex_sample.c
index d4c0175..f29a6c7 100644
--- a/src/gallium/drivers/softpipe/sp_tex_sample.c
+++ b/src/gallium/drivers/softpipe/sp_tex_sample.c
@@ -2693,7 +2693,7 @@ sample_get_texels(struct tgsi_sampler *tgsi_sampler,
    case PIPE_TEXTURE_1D_ARRAY:
       for (j = 0; j < TGSI_QUAD_SIZE; j++) {
          int x = CLAMP(v_i[j] + offset[0], 0, width - 1);
-         int y = CLAMP(v_j[j] + offset[1], 0, layers - 1);
+         int y = CLAMP(v_j[j], 0, layers - 1);
          tx = get_texel_1d_array(samp, addr, x, y);
          for (c = 0; c < 4; c++) {
             rgba[c][j] = tx[c];
@@ -2715,7 +2715,7 @@ sample_get_texels(struct tgsi_sampler *tgsi_sampler,
       for (j = 0; j < TGSI_QUAD_SIZE; j++) {
          int x = CLAMP(v_i[j] + offset[0], 0, width - 1);
          int y = CLAMP(v_j[j] + offset[1], 0, height - 1);
-         int layer = CLAMP(v_k[j] + offset[2], 0, layers - 1);
+         int layer = CLAMP(v_k[j], 0, layers - 1);
          tx = get_texel_2d_array(samp, addr, x, y, layer);
          for (c = 0; c < 4; c++) {
             rgba[c][j] = tx[c];
-- 
1.7.10.280.gaa39
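
For illustration, the rule the patch enforces for 2D array textures,
written as a standalone sketch (struct texel_coord, clamp_int and
fetch_coord_2d_array are hypothetical names, not softpipe functions):
the texel offset shifts x and y, while the layer index is only clamped.

```c
/* Clamp v to the inclusive range [lo, hi]. */
static int clamp_int(int v, int lo, int hi)
{
   return v < lo ? lo : (v > hi ? hi : v);
}

struct texel_coord { int x, y, layer; };

/* Compute the texel coordinates of a 2D-array fetch.  The offset is
 * applied to x and y only; offset[2] is deliberately ignored because
 * the layer is not a spatial coordinate. */
static struct texel_coord
fetch_coord_2d_array(int i, int j, int k, const int offset[3],
                     int width, int height, int layers)
{
   struct texel_coord c;

   c.x = clamp_int(i + offset[0], 0, width - 1);
   c.y = clamp_int(j + offset[1], 0, height - 1);
   c.layer = clamp_int(k, 0, layers - 1);
   return c;
}
```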



[Mesa-dev] [PATCH 2/4] draw: fix flat shading and screen-space linear interpolation in clipper

2012-06-19 Thread Olivier Galibert
This includes:
- picking up correctly which attributes are flatshaded and which are
  noperspective

- copying the flatshaded attributes when needed, including the
  non-built-in ones

- correctly interpolating the noperspective attributes in screen-space
  instead of in a 3d-correct fashion.

Signed-off-by: Olivier Galibert galib...@pobox.com
Reviewed-by: Brian Paul bri...@vmware.com
---
 src/gallium/auxiliary/draw/draw_pipe_clip.c |  144 +--
 1 file changed, 113 insertions(+), 31 deletions(-)

diff --git a/src/gallium/auxiliary/draw/draw_pipe_clip.c 
b/src/gallium/auxiliary/draw/draw_pipe_clip.c
index 4da4d65..2d36eb3 100644
--- a/src/gallium/auxiliary/draw/draw_pipe_clip.c
+++ b/src/gallium/auxiliary/draw/draw_pipe_clip.c
@@ -39,6 +39,7 @@
 
 #include "draw_vs.h"
 #include "draw_pipe.h"
+#include "draw_fs.h"
 
 
 #ifndef IS_NEGATIVE
@@ -56,11 +57,12 @@
 struct clip_stage {
struct draw_stage stage;  /** base class */
 
-   /* Basically duplicate some of the flatshading logic here:
-*/
-   boolean flat;
-   uint num_color_attribs;
-   uint color_attribs[4];  /* front/back primary/secondary colors */
+   /* List of the attributes to be flatshaded. */
+   uint num_flat_attribs;
+   uint flat_attribs[PIPE_MAX_SHADER_OUTPUTS];
+
+   /* Mask of attributes in noperspective mode */
+   boolean noperspective_attribs[PIPE_MAX_SHADER_OUTPUTS];
 
float (*plane)[4];
 };
@@ -91,17 +93,16 @@ static void interp_attr( float dst[4],
 
 
 /**
- * Copy front/back, primary/secondary colors from src vertex to dst vertex.
- * Used when flat shading.
+ * Copy flat shaded attributes src vertex to dst vertex.
  */
-static void copy_colors( struct draw_stage *stage,
-struct vertex_header *dst,
-const struct vertex_header *src )
+static void copy_flat( struct draw_stage *stage,
+   struct vertex_header *dst,
+   const struct vertex_header *src )
 {
const struct clip_stage *clipper = clip_stage(stage);
uint i;
-   for (i = 0; i < clipper->num_color_attribs; i++) {
-      const uint attr = clipper->color_attribs[i];
+   for (i = 0; i < clipper->num_flat_attribs; i++) {
+      const uint attr = clipper->flat_attribs[i];
       COPY_4FV(dst->data[attr], src->data[attr]);
    }
 }
@@ -120,6 +121,7 @@ static void interp( const struct clip_stage *clip,
    const unsigned pos_attr = draw_current_shader_position_output(clip->stage.draw);
    const unsigned clip_attr = draw_current_shader_clipvertex_output(clip->stage.draw);
unsigned j;
+   float t_nopersp;
 
/* Vertex header.
 */
@@ -148,12 +150,36 @@ static void interp( const struct clip_stage *clip,
       dst->data[pos_attr][2] = pos[2] * oow * scale[2] + trans[2];
       dst->data[pos_attr][3] = oow;
    }
+
+   /**
+    * Compute the t in screen-space instead of 3d space to use
+    * for noperspective interpolation.
+    *
+    * The points can be aligned with the X axis, so in that case try
+    * the Y.  When both points are at the same screen position, we can
+    * pick whatever value (the interpolated point won't be in front
+    * anyway), so just use the 3d t.
+    */
+   {
+      int k;
+      t_nopersp = t;
+      for (k = 0; k < 2; k++)
+         if (in->data[pos_attr][k] != out->data[pos_attr][k]) {
+            t_nopersp = (dst->data[pos_attr][k] - out->data[pos_attr][k]) /
+               (in->data[pos_attr][k] - out->data[pos_attr][k]);
+            break;
+         }
+   }
 
    /* Other attributes
     */
    for (j = 0; j < nr_attrs; j++) {
-      if (j != pos_attr && j != clip_attr)
-         interp_attr(dst->data[j], t, in->data[j], out->data[j]);
+      if (j != pos_attr && j != clip_attr) {
+         if (clip->noperspective_attribs[j])
+            interp_attr(dst->data[j], t_nopersp, in->data[j], out->data[j]);
+         else
+            interp_attr(dst->data[j], t, in->data[j], out->data[j]);
+      }
    }
 }
 
@@ -406,14 +432,14 @@ do_clip_tri( struct draw_stage *stage,
    /* If flat-shading, copy provoking vertex color to polygon vertex[0]
     */
    if (n >= 3) {
-      if (clipper->flat) {
+      if (clipper->num_flat_attribs) {
          if (stage->draw->rasterizer->flatshade_first) {
             if (inlist[0] != header->v[0]) {
                assert(tmpnr < MAX_CLIPPED_VERTICES + 1);
                if (tmpnr >= MAX_CLIPPED_VERTICES + 1)
                   return;
                inlist[0] = dup_vert(stage, inlist[0], tmpnr++);
-               copy_colors(stage, inlist[0], header->v[0]);
+               copy_flat(stage, inlist[0], header->v[0]);
             }
          }
          else {
@@ -422,7 +448,7 @@ do_clip_tri( struct draw_stage *stage,
                if (tmpnr >= MAX_CLIPPED_VERTICES + 1)
                   return;
                inlist[0] = dup_vert(stage, inlist[0], tmpnr++);
-               copy_colors(stage, inlist[0], header->v[2]);
+               copy_flat(stage, inlist[0], header->v[2
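
The screen-space t computation described in the patch comment can be
isolated into a small standalone helper.  This is only a sketch of the
same idea on bare 2D positions; screen_space_t is a made-up name and
the real code works on vertex_header data in place.

```c
/* Compute the interpolation parameter of dst between out (t = 0) and
 * in (t = 1) from screen-space x/y positions.  Try x first, then y;
 * if both endpoints project to the same screen position any value
 * will do, so fall back to the 3d parameter t3d. */
static float
screen_space_t(const float in[2], const float out[2],
               const float dst[2], float t3d)
{
   int k;

   for (k = 0; k < 2; k++) {
      if (in[k] != out[k])
         return (dst[k] - out[k]) / (in[k] - out[k]);
   }
   return t3d;
}
```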

[Mesa-dev] [PATCH 4/4] llvmpipe: Add vertex id support.

2012-06-19 Thread Olivier Galibert
Signed-off-by: Olivier Galibert galib...@pobox.com
Reviewed-by: Brian Paul bri...@vmware.com
---
 src/gallium/auxiliary/draw/draw_llvm.c  |   32 ++-
 src/gallium/auxiliary/gallivm/lp_bld_tgsi.h |   13 +++--
 src/gallium/auxiliary/gallivm/lp_bld_tgsi_soa.c |   11 +---
 src/gallium/drivers/llvmpipe/lp_state_fs.c  |5 +++-
 4 files changed, 42 insertions(+), 19 deletions(-)

diff --git a/src/gallium/auxiliary/draw/draw_llvm.c 
b/src/gallium/auxiliary/draw/draw_llvm.c
index 8e787c5..e08221e 100644
--- a/src/gallium/auxiliary/draw/draw_llvm.c
+++ b/src/gallium/auxiliary/draw/draw_llvm.c
@@ -459,7 +459,7 @@ generate_vs(struct draw_llvm *llvm,
 LLVMBuilderRef builder,
 LLVMValueRef (*outputs)[TGSI_NUM_CHANNELS],
 const LLVMValueRef (*inputs)[TGSI_NUM_CHANNELS],
-LLVMValueRef instance_id,
+const struct lp_bld_tgsi_system_values *system_values,
 LLVMValueRef context_ptr,
 struct lp_build_sampler_soa *draw_sampler,
 boolean clamp_vertex_color)
@@ -491,7 +491,7 @@ generate_vs(struct draw_llvm *llvm,
  vs_type,
  NULL /*struct lp_build_mask_context *mask*/,
  consts_ptr,
- instance_id,
+ system_values,
  NULL /*pos*/,
  inputs,
  outputs,
@@ -1248,7 +1248,6 @@ draw_llvm_generate(struct draw_llvm *llvm, struct draw_llvm_variant *variant,
    LLVMValueRef count, fetch_elts, fetch_count;
    LLVMValueRef stride, step, io_itr;
    LLVMValueRef io_ptr, vbuffers_ptr, vb_ptr;
-   LLVMValueRef instance_id;
    LLVMValueRef zero = lp_build_const_int32(gallivm, 0);
    LLVMValueRef one = lp_build_const_int32(gallivm, 1);
    struct draw_context *draw = llvm->draw;
@@ -1270,6 +1269,9 @@ draw_llvm_generate(struct draw_llvm *llvm, struct draw_llvm_variant *variant,
    const unsigned pos = draw_current_shader_position_output(llvm->draw);
    const unsigned cv = draw_current_shader_clipvertex_output(llvm->draw);
    boolean have_clipdist = FALSE;
+   struct lp_bld_tgsi_system_values system_values;
+
+   memset(&system_values, 0, sizeof(system_values));
 
arg_types[0] = get_context_ptr_type(llvm);   /* context */
arg_types[1] = get_vertex_header_ptr_type(llvm); /* vertex_header */
@@ -1300,19 +1302,19 @@ draw_llvm_generate(struct draw_llvm *llvm, struct draw_llvm_variant *variant,
          LLVMAddAttribute(LLVMGetParam(variant_func, i),
                           LLVMNoAliasAttribute);
 
-   context_ptr  = LLVMGetParam(variant_func, 0);
-   io_ptr       = LLVMGetParam(variant_func, 1);
-   vbuffers_ptr = LLVMGetParam(variant_func, 2);
-   stride       = LLVMGetParam(variant_func, 5);
-   vb_ptr       = LLVMGetParam(variant_func, 6);
-   instance_id  = LLVMGetParam(variant_func, 7);
+   context_ptr               = LLVMGetParam(variant_func, 0);
+   io_ptr                    = LLVMGetParam(variant_func, 1);
+   vbuffers_ptr              = LLVMGetParam(variant_func, 2);
+   stride                    = LLVMGetParam(variant_func, 5);
+   vb_ptr                    = LLVMGetParam(variant_func, 6);
+   system_values.instance_id = LLVMGetParam(variant_func, 7);
 
    lp_build_name(context_ptr, "context");
    lp_build_name(io_ptr, "io");
    lp_build_name(vbuffers_ptr, "vbuffers");
    lp_build_name(stride, "stride");
    lp_build_name(vb_ptr, "vb");
-   lp_build_name(instance_id, "instance_id");
+   lp_build_name(system_values.instance_id, "instance_id");
 
if (elts) {
   fetch_elts   = LLVMGetParam(variant_func, 3);
@@ -1378,6 +1380,7 @@ draw_llvm_generate(struct draw_llvm *llvm, struct draw_llvm_variant *variant,
       lp_build_printf(builder, " --- io %d = %p, loop counter %d\n",
                       io_itr, io, lp_loop.counter);
 #endif
+      system_values.vertex_id = lp_build_zero(gallivm, lp_type_uint_vec(32));
       for (i = 0; i < TGSI_NUM_CHANNELS; ++i) {
          LLVMValueRef true_index =
             LLVMBuildAdd(builder,
@@ -1395,7 +1398,10 @@ draw_llvm_generate(struct draw_llvm *llvm, struct draw_llvm_variant *variant,
                                                   &true_index, 1, "");
             true_index = LLVMBuildLoad(builder, fetch_ptr, "fetch_elt");
          }
-
+
+         system_values.vertex_id = LLVMBuildInsertElement(gallivm->builder,
+                                                          system_values.vertex_id, true_index,
+                                                          lp_build_const_int32(gallivm, i), "");
          for (j = 0; j < draw->pt.nr_vertex_elements; ++j) {
             struct pipe_vertex_element *velem = &draw->pt.vertex_element[j];
             LLVMValueRef vb_index =
@@ -1403,7 +1409,7 @@ draw_llvm_generate(struct draw_llvm *llvm, struct draw_llvm_variant *variant,
             LLVMValueRef vb = LLVMBuildGEP(builder, vb_ptr, &vb_index, 1, "");
             generate_fetch(gallivm

[Mesa-dev] [PATCH 3/4] llvmpipe: Simplify and fix system variables fetch.

2012-06-19 Thread Olivier Galibert
The system values array concept doesn't really work because it expects
the system values to be fixed per call, which is wrong for gl_VertexID
and iffy for gl_SampleID.  So this patch does two things:

- kill the array, have emit_fetch_system_value directly pick the
  values it needs (only gl_InstanceID for now, as the previous code)

- correctly handle the expected type in emit_fetch_system_value

Signed-off-by: Olivier Galibert galib...@pobox.com
Reviewed-by: Brian Paul bri...@vmware.com
---
 src/gallium/auxiliary/draw/draw_llvm.c  |   10 +--
 src/gallium/auxiliary/gallivm/lp_bld_tgsi.h |   11 +--
 src/gallium/auxiliary/gallivm/lp_bld_tgsi_soa.c |   88 +++
 src/gallium/drivers/llvmpipe/lp_state_fs.c  |2 +-
 4 files changed, 33 insertions(+), 78 deletions(-)

diff --git a/src/gallium/auxiliary/draw/draw_llvm.c 
b/src/gallium/auxiliary/draw/draw_llvm.c
index e1df2f1..8e787c5 100644
--- a/src/gallium/auxiliary/draw/draw_llvm.c
+++ b/src/gallium/auxiliary/draw/draw_llvm.c
@@ -459,7 +459,7 @@ generate_vs(struct draw_llvm *llvm,
 LLVMBuilderRef builder,
 LLVMValueRef (*outputs)[TGSI_NUM_CHANNELS],
 const LLVMValueRef (*inputs)[TGSI_NUM_CHANNELS],
-LLVMValueRef system_values_array,
+LLVMValueRef instance_id,
 LLVMValueRef context_ptr,
 struct lp_build_sampler_soa *draw_sampler,
 boolean clamp_vertex_color)
@@ -491,7 +491,7 @@ generate_vs(struct draw_llvm *llvm,
  vs_type,
  NULL /*struct lp_build_mask_context *mask*/,
  consts_ptr,
- system_values_array,
+ instance_id,
  NULL /*pos*/,
  inputs,
  outputs,
@@ -1249,7 +1249,6 @@ draw_llvm_generate(struct draw_llvm *llvm, struct draw_llvm_variant *variant,
    LLVMValueRef stride, step, io_itr;
    LLVMValueRef io_ptr, vbuffers_ptr, vb_ptr;
    LLVMValueRef instance_id;
-   LLVMValueRef system_values_array;
    LLVMValueRef zero = lp_build_const_int32(gallivm, 0);
    LLVMValueRef one = lp_build_const_int32(gallivm, 1);
    struct draw_context *draw = llvm->draw;
@@ -1340,9 +1339,6 @@ draw_llvm_generate(struct draw_llvm *llvm, struct draw_llvm_variant *variant,
 
    lp_build_context_init(&bld, gallivm, lp_type_int(32));
 
-   system_values_array = lp_build_system_values_array(gallivm, vs_info,
-                                                      instance_id, NULL);
-
    /* function will return non-zero i32 value if any clipped vertices */
    ret_ptr = lp_build_alloca(gallivm, int32_type, "");
    LLVMBuildStore(builder, zero, ret_ptr);
@@ -1418,7 +1414,7 @@ draw_llvm_generate(struct draw_llvm *llvm, struct draw_llvm_variant *variant,
                   builder,
                   outputs,
                   ptr_aos,
-                  system_values_array,
+                  instance_id,
                   context_ptr,
                   sampler,
                   variant->key.clamp_vertex_color);
diff --git a/src/gallium/auxiliary/gallivm/lp_bld_tgsi.h 
b/src/gallium/auxiliary/gallivm/lp_bld_tgsi.h
index 141e799..c4e690c 100644
--- a/src/gallium/auxiliary/gallivm/lp_bld_tgsi.h
+++ b/src/gallium/auxiliary/gallivm/lp_bld_tgsi.h
@@ -205,7 +205,7 @@ lp_build_tgsi_soa(struct gallivm_state *gallivm,
   struct lp_type type,
   struct lp_build_mask_context *mask,
   LLVMValueRef consts_ptr,
-  LLVMValueRef system_values_array,
+  LLVMValueRef instance_id,
   const LLVMValueRef *pos,
   const LLVMValueRef (*inputs)[4],
   LLVMValueRef (*outputs)[4],
@@ -225,13 +225,6 @@ lp_build_tgsi_aos(struct gallivm_state *gallivm,
   const struct tgsi_shader_info *info);
 
 
-LLVMValueRef
-lp_build_system_values_array(struct gallivm_state *gallivm,
- const struct tgsi_shader_info *info,
- LLVMValueRef instance_id,
- LLVMValueRef facing);
-
-
 struct lp_exec_mask {
struct lp_build_context *bld;
 
@@ -388,7 +381,7 @@ struct lp_build_tgsi_soa_context
 */
LLVMValueRef inputs_array;
 
-   LLVMValueRef system_values_array;
+   LLVMValueRef instance_id;
 
/** bitmask indicating which register files are accessed indirectly */
unsigned indirect_files;
diff --git a/src/gallium/auxiliary/gallivm/lp_bld_tgsi_soa.c 
b/src/gallium/auxiliary/gallivm/lp_bld_tgsi_soa.c
index 412dc0c..26be902 100644
--- a/src/gallium/auxiliary/gallivm/lp_bld_tgsi_soa.c
+++ b/src/gallium/auxiliary/gallivm/lp_bld_tgsi_soa.c
@@ -786,18 +786,37 @@ emit_fetch_system_value(
 {
struct lp_build_tgsi_soa_context * bld = lp_soa_context(bld_base);
    struct gallivm_state *gallivm = bld->bld_base.base.gallivm;
+   const struct tgsi_shader_info *info = bld->bld_base.info;
LLVMBuilderRef

Re: [Mesa-dev] [PATCH] softpipe: Take all lods into account when texture sampling.

2012-06-19 Thread Olivier Galibert
On Tue, Jun 19, 2012 at 02:46:35PM -0700, Jose Fonseca wrote:
> Could you give more background on why is this necessary?
>
> This will make software rendering slower, so I'd really like to avoid it on
> llvmpipe if at all possible.

Well, given the existence of textureLod and textureGrad every texture
sample can easily hit a different mipmap or even, by switching between
minification and magnification, a different filter entirely.  Even a
simple texture() is hit, if your polygon is horizontal enough.

And this goes double for vertex shaders, where texture fetches there
have less reason to be close in texture space.

textureSize and texelFetch, with their explicit lod, have of course
the same problem, only worse.

  OG.



[Mesa-dev] Clarifications w.r.t MSAA

2012-06-12 Thread Olivier Galibert
  Hi all,

I'm getting a little lost in all the interactions between the
different parts of the GL standards and what I understand of the
expectations when it comes to MSAA.  It would be nice if I could have
some clarifications.

I'll start with what I think I understand (and please correct me when
I'm wrong) and add a number of questions.  I'll also ignore the
resolve part, which isn't an issue (at least for me :-).


MSAA is a variant on the supersampling theme where the coverage is
supersampled but depth, stencil and color may or may not be.  The
destination buffer has enough space to store the full results of a
complete supersampling, but some of the values may be duplicated.

The variable MIN_SAMPLE_SHADING_VALUE allows the application to
control the minimum number of values that have to be computed.  It can
say for instance that in a 16xMSAA case at least 4 samples per pixel
are required.

So let's take a case of 16xMSAA (say with the DX11 pattern) and let's
look at the pipeline.  First the coverage is sampled for the 16 fixed
positions, leaving C active samples.  Then there should be early depth
testing then shading, or the other way around, depending on the
shaders.

First question: how many depths should be computed, and for which
coordinates? Which of these values is associated with which sample?

Second question: how many samples should be shaded, and for which
coordinates?  What is the impact of depth testing failure?

Third question: what happens when a variable has a sample qualifier
in the fragment shader?  Or centroid?

Fourth question: how does gl_SampleMask interact with all that when
more than one sample is evaluated?  And what does gl_SampleMaskIn look
like in the same case?

I hope you people can help me clarify all that stuff :-)

Best,

  OG.


Re: [Mesa-dev] [PATCH] glsl: Fix pi/2 constant in acos built-in function

2012-06-06 Thread Olivier Galibert
On Tue, Jun 05, 2012 at 04:51:54PM -0700, Paul Berry wrote:
> The best idea I've got so far would be a shader_runner test with a fragment
> shader that computes dFdx(asin(x)), compares it to the theoretical closed
> form derivative of asin(x) (which is 1/sqrt(1-x^2)), and draws red pixels
> if the result is outside a certain error tolerance.  We'd probably want to
> use a relative error (since the derivative of asin(x) can get quite large)
> and stop a bit shy of the endpoints where it goes to infinity.

Can't you take the perfectly reasonable hypothesis that the system's
asin is precise, and upload something like a 256x256 R32FG32FB32FA32F
texture with reference values?  262144 testing points should be good
enough :-)

And that's something that generalizes easily to all the functions you
may want to test on a segment.

  OG.



Re: [Mesa-dev] [PATCH] glsl: Fix pi/2 constant in acos built-in function

2012-06-04 Thread Olivier Galibert
On Mon, Jun 04, 2012 at 01:11:13PM -0700, Ian Romanick wrote:
> From: Ian Romanick ian.d.roman...@intel.com
>
> In single precision, 1.5707963 becomes 1.5707962513 which is too
> small.  However, 1.5707964 becomes 1.5707963705 which is just right.
> The value 1.5707964 is already used in asin.ir.
>
> NOTE: This is a candidate for stable release branches.

If piglit stops bitching on that particular problem thanks to such a
small change, it's just beautiful.

Do we need a better precision atan, or should piglit just be told to
shut up?  The shut-up patch was sent in ages ago, but I can't do
the more precise one if that's what's wanted.
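
The rounding in the quoted commit message is easy to check on the CPU.
A small sketch, assuming IEEE-754 single precision floats; the helper
name is made up.

```c
/* Return non-zero when the single-precision literal is the float
 * closest to pi/2, i.e. what you get by rounding the double value. */
static int
is_best_float_half_pi(float candidate)
{
   return candidate == (float)1.5707963267948966;
}
```

Under that assumption 1.5707964f passes and 1.5707963f does not: the
latter rounds down to the next representable float below pi/2.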

  OG.



Re: [Mesa-dev] [PATCH] glsl: Fix pi/2 constant in acos built-in function

2012-06-04 Thread Olivier Galibert
On Mon, Jun 04, 2012 at 03:23:34PM -0700, Paul Berry wrote:
> I'm not even kidding--I love this
> stuff and I'm jealous that I don't have time to work on it right now

Do you have a favorite method for Vandermonde matrix inversion?

  OG.


[Mesa-dev] [PATCH 0/2] Add vertex id to llvmpipe.

2012-06-01 Thread Olivier Galibert
  Hi,

The following pair of patches adds gl_VertexID support to llvmpipe.
They also simplify the system value fetch methodology (hopefully
generating better code in the end) and fix some issues with
gl_InstanceID.  The "I don't understand how it could ever work" kind
of issues: converting from int32 to float twice is not good, usually.
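
To show why the vertex id cannot be a per-call value: in a 4-wide
batch the instance id is the same for every lane, but each lane has
its own vertex id, taken from the index buffer when there is one.  A
scalar sketch of that gather (gather_vertex_ids and LANES are
illustrative names; the real code builds LLVM IR rather than running
a loop like this):

```c
#include <stddef.h>

#define LANES 4

/* Compute the vertex id of each lane of a batch starting at vertex
 * number first.  With an index buffer (elts != NULL) the id is the
 * fetched index, otherwise it is the running vertex number. */
static void
gather_vertex_ids(unsigned first, const unsigned *elts,
                  unsigned vertex_id[LANES])
{
   unsigned i;

   for (i = 0; i < LANES; i++) {
      unsigned index = first + i;

      if (elts)
         index = elts[index];
      vertex_id[i] = index;
   }
}
```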

Best,

  OG.



[Mesa-dev] [PATCH 1/2] llvmpipe: Simplify and fix system variables fetch.

2012-06-01 Thread Olivier Galibert
The system values array concept doesn't really work because it expects
the system values to be fixed per call, which is wrong for gl_VertexID
and iffy for gl_SampleID.  So this patch does two things:

- kill the array, have emit_fetch_system_value directly pick the
  values it needs (only gl_InstanceID for now, as the previous code)

- correctly handle the expected type in emit_fetch_system_value

Signed-off-by: Olivier Galibert galib...@pobox.com
---
 src/gallium/auxiliary/draw/draw_llvm.c  |   10 +--
 src/gallium/auxiliary/gallivm/lp_bld_tgsi.h |   11 +--
 src/gallium/auxiliary/gallivm/lp_bld_tgsi_soa.c |   88 +++
 src/gallium/drivers/llvmpipe/lp_state_fs.c  |2 +-
 4 files changed, 33 insertions(+), 78 deletions(-)

diff --git a/src/gallium/auxiliary/draw/draw_llvm.c 
b/src/gallium/auxiliary/draw/draw_llvm.c
index 4058e11..d5eb727 100644
--- a/src/gallium/auxiliary/draw/draw_llvm.c
+++ b/src/gallium/auxiliary/draw/draw_llvm.c
@@ -456,7 +456,7 @@ generate_vs(struct draw_llvm *llvm,
 LLVMBuilderRef builder,
 LLVMValueRef (*outputs)[TGSI_NUM_CHANNELS],
 const LLVMValueRef (*inputs)[TGSI_NUM_CHANNELS],
-LLVMValueRef system_values_array,
+LLVMValueRef instance_id,
 LLVMValueRef context_ptr,
 struct lp_build_sampler_soa *draw_sampler,
 boolean clamp_vertex_color)
@@ -488,7 +488,7 @@ generate_vs(struct draw_llvm *llvm,
  vs_type,
  NULL /*struct lp_build_mask_context *mask*/,
  consts_ptr,
- system_values_array,
+ instance_id,
  NULL /*pos*/,
  inputs,
  outputs,
@@ -1246,7 +1246,6 @@ draw_llvm_generate(struct draw_llvm *llvm, struct draw_llvm_variant *variant,
    LLVMValueRef stride, step, io_itr;
    LLVMValueRef io_ptr, vbuffers_ptr, vb_ptr;
    LLVMValueRef instance_id;
-   LLVMValueRef system_values_array;
    LLVMValueRef zero = lp_build_const_int32(gallivm, 0);
    LLVMValueRef one = lp_build_const_int32(gallivm, 1);
    struct draw_context *draw = llvm->draw;
@@ -1337,9 +1336,6 @@ draw_llvm_generate(struct draw_llvm *llvm, struct draw_llvm_variant *variant,
 
    lp_build_context_init(&bld, gallivm, lp_type_int(32));
 
-   system_values_array = lp_build_system_values_array(gallivm, vs_info,
-                                                      instance_id, NULL);
-
    /* function will return non-zero i32 value if any clipped vertices */
    ret_ptr = lp_build_alloca(gallivm, int32_type, "");
    LLVMBuildStore(builder, zero, ret_ptr);
@@ -1415,7 +1411,7 @@ draw_llvm_generate(struct draw_llvm *llvm, struct draw_llvm_variant *variant,
                   builder,
                   outputs,
                   ptr_aos,
-                  system_values_array,
+                  instance_id,
                   context_ptr,
                   sampler,
                   variant->key.clamp_vertex_color);
diff --git a/src/gallium/auxiliary/gallivm/lp_bld_tgsi.h 
b/src/gallium/auxiliary/gallivm/lp_bld_tgsi.h
index 141e799..c4e690c 100644
--- a/src/gallium/auxiliary/gallivm/lp_bld_tgsi.h
+++ b/src/gallium/auxiliary/gallivm/lp_bld_tgsi.h
@@ -205,7 +205,7 @@ lp_build_tgsi_soa(struct gallivm_state *gallivm,
   struct lp_type type,
   struct lp_build_mask_context *mask,
   LLVMValueRef consts_ptr,
-  LLVMValueRef system_values_array,
+  LLVMValueRef instance_id,
   const LLVMValueRef *pos,
   const LLVMValueRef (*inputs)[4],
   LLVMValueRef (*outputs)[4],
@@ -225,13 +225,6 @@ lp_build_tgsi_aos(struct gallivm_state *gallivm,
   const struct tgsi_shader_info *info);
 
 
-LLVMValueRef
-lp_build_system_values_array(struct gallivm_state *gallivm,
- const struct tgsi_shader_info *info,
- LLVMValueRef instance_id,
- LLVMValueRef facing);
-
-
 struct lp_exec_mask {
struct lp_build_context *bld;
 
@@ -388,7 +381,7 @@ struct lp_build_tgsi_soa_context
 */
LLVMValueRef inputs_array;
 
-   LLVMValueRef system_values_array;
+   LLVMValueRef instance_id;
 
/** bitmask indicating which register files are accessed indirectly */
unsigned indirect_files;
diff --git a/src/gallium/auxiliary/gallivm/lp_bld_tgsi_soa.c 
b/src/gallium/auxiliary/gallivm/lp_bld_tgsi_soa.c
index 412dc0c..26be902 100644
--- a/src/gallium/auxiliary/gallivm/lp_bld_tgsi_soa.c
+++ b/src/gallium/auxiliary/gallivm/lp_bld_tgsi_soa.c
@@ -786,18 +786,37 @@ emit_fetch_system_value(
 {
struct lp_build_tgsi_soa_context * bld = lp_soa_context(bld_base);
    struct gallivm_state *gallivm = bld->bld_base.base.gallivm;
+   const struct tgsi_shader_info *info = bld->bld_base.info;
    LLVMBuilderRef builder = gallivm->builder;
-   LLVMValueRef

[Mesa-dev] [PATCH 2/2] llvmpipe: Add vertex id support.

2012-06-01 Thread Olivier Galibert
Signed-off-by: Olivier Galibert galib...@pobox.com
---
 src/gallium/auxiliary/draw/draw_llvm.c  |   10 --
 src/gallium/auxiliary/gallivm/lp_bld_tgsi.h |3 ++-
 src/gallium/auxiliary/gallivm/lp_bld_tgsi_soa.c |7 +++
 src/gallium/drivers/llvmpipe/lp_state_fs.c  |2 +-
 4 files changed, 18 insertions(+), 4 deletions(-)

diff --git a/src/gallium/auxiliary/draw/draw_llvm.c 
b/src/gallium/auxiliary/draw/draw_llvm.c
index d5eb727..71125ba 100644
--- a/src/gallium/auxiliary/draw/draw_llvm.c
+++ b/src/gallium/auxiliary/draw/draw_llvm.c
@@ -457,6 +457,7 @@ generate_vs(struct draw_llvm *llvm,
 LLVMValueRef (*outputs)[TGSI_NUM_CHANNELS],
 const LLVMValueRef (*inputs)[TGSI_NUM_CHANNELS],
 LLVMValueRef instance_id,
+LLVMValueRef vertex_id,
 LLVMValueRef context_ptr,
 struct lp_build_sampler_soa *draw_sampler,
 boolean clamp_vertex_color)
@@ -489,6 +490,7 @@ generate_vs(struct draw_llvm *llvm,
  NULL /*struct lp_build_mask_context *mask*/,
  consts_ptr,
  instance_id,
+ vertex_id,
  NULL /*pos*/,
  inputs,
  outputs,
@@ -1245,7 +1247,7 @@ draw_llvm_generate(struct draw_llvm *llvm, struct 
draw_llvm_variant *variant,
LLVMValueRef count, fetch_elts, fetch_count;
LLVMValueRef stride, step, io_itr;
LLVMValueRef io_ptr, vbuffers_ptr, vb_ptr;
-   LLVMValueRef instance_id;
+   LLVMValueRef instance_id, vertex_id;
LLVMValueRef zero = lp_build_const_int32(gallivm, 0);
LLVMValueRef one = lp_build_const_int32(gallivm, 1);
struct draw_context *draw = llvm->draw;
@@ -1375,6 +1377,7 @@ draw_llvm_generate(struct draw_llvm *llvm, struct 
draw_llvm_variant *variant,
   lp_build_printf(builder, " --- io %d = %p, loop counter %d\n",
   io_itr, io, lp_loop.counter);
 #endif
+  vertex_id = lp_build_zero(gallivm, lp_type_uint_vec(32));
   for (i = 0; i < TGSI_NUM_CHANNELS; ++i) {
  LLVMValueRef true_index =
 LLVMBuildAdd(builder,
@@ -1392,7 +1395,9 @@ draw_llvm_generate(struct draw_llvm *llvm, struct 
draw_llvm_variant *variant,
  &true_index, 1, "");
 true_index = LLVMBuildLoad(builder, fetch_ptr, "fetch_elt");
  }
-
+ 
+ vertex_id = LLVMBuildInsertElement(gallivm->builder, vertex_id, true_index,
+lp_build_const_int32(gallivm, i), "");
  for (j = 0; j < draw->pt.nr_vertex_elements; ++j) {
 struct pipe_vertex_element *velem = &draw->pt.vertex_element[j];
 LLVMValueRef vb_index =
@@ -1412,6 +1417,7 @@ draw_llvm_generate(struct draw_llvm *llvm, struct 
draw_llvm_variant *variant,
   outputs,
   ptr_aos,
   instance_id,
+  vertex_id,
   context_ptr,
   sampler,
   variant->key.clamp_vertex_color);
diff --git a/src/gallium/auxiliary/gallivm/lp_bld_tgsi.h 
b/src/gallium/auxiliary/gallivm/lp_bld_tgsi.h
index c4e690c..f87f899 100644
--- a/src/gallium/auxiliary/gallivm/lp_bld_tgsi.h
+++ b/src/gallium/auxiliary/gallivm/lp_bld_tgsi.h
@@ -206,6 +206,7 @@ lp_build_tgsi_soa(struct gallivm_state *gallivm,
   struct lp_build_mask_context *mask,
   LLVMValueRef consts_ptr,
   LLVMValueRef instance_id,
+  LLVMValueRef vertex_id,
   const LLVMValueRef *pos,
   const LLVMValueRef (*inputs)[4],
   LLVMValueRef (*outputs)[4],
@@ -381,7 +382,7 @@ struct lp_build_tgsi_soa_context
 */
LLVMValueRef inputs_array;
 
-   LLVMValueRef instance_id;
+   LLVMValueRef instance_id, vertex_id;
 
/** bitmask indicating which register files are accessed indirectly */
unsigned indirect_files;
diff --git a/src/gallium/auxiliary/gallivm/lp_bld_tgsi_soa.c 
b/src/gallium/auxiliary/gallivm/lp_bld_tgsi_soa.c
index 26be902..e1abae8 100644
--- a/src/gallium/auxiliary/gallivm/lp_bld_tgsi_soa.c
+++ b/src/gallium/auxiliary/gallivm/lp_bld_tgsi_soa.c
@@ -799,6 +799,11 @@ emit_fetch_system_value(
   atype = TGSI_TYPE_UNSIGNED;
   break;
 
+   case TGSI_SEMANTIC_VERTEXID:
+  res = bld->vertex_id;
+  atype = TGSI_TYPE_FLOAT;
+  break;
+
default:
   assert(!"unexpected semantic in emit_fetch_system_value");
   res = bld_base->base.zero;
@@ -1996,6 +2001,7 @@ lp_build_tgsi_soa(struct gallivm_state *gallivm,
   struct lp_build_mask_context *mask,
   LLVMValueRef consts_ptr,
   LLVMValueRef instance_id,
+  LLVMValueRef vertex_id,
   const LLVMValueRef *pos,
   const LLVMValueRef (*inputs)[TGSI_NUM_CHANNELS],
   LLVMValueRef (*outputs)[TGSI_NUM_CHANNELS

[Mesa-dev] [PATCH 0/2] Add vertex id to llvmpipe. (v2)

2012-06-01 Thread Olivier Galibert
  Hi,

The following pair of patches adds gl_VertexID support to llvmpipe.
They also simplify the system value fetch methodology (hopefully
generating better code in the end) and fix some issues with
gl_InstanceID.  The "I don't understand how it could ever work" kind
of issues: converting from int32 to float twice is not good, usually.
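
To illustrate the "twice" remark, here is a minimal C sketch (not code from
the patches) of the difference between converting an integer's value to
float and merely relabeling the same 32 bits.  It assumes IEEE-754
single-precision floats; the helper names are made up for this example.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Value conversion: the integer 5 becomes the float 5.0f. */
static float int_value_to_float(int32_t v)
{
    return (float)v;
}

/* Bit relabel: the same 32 bits, viewed as a float. */
static float int_bits_as_float(int32_t v)
{
    float f;
    assert(sizeof f == sizeof v);  /* assumption: float is 32 bits wide */
    memcpy(&f, &v, sizeof f);
    return f;
}

/* Inverse relabel: read back the raw bits of a float. */
static int32_t float_bits_as_int(float f)
{
    int32_t v;
    assert(sizeof v == sizeof f);
    memcpy(&v, &f, sizeof v);
    return v;
}
```

A relabel round-trips an integer id unchanged, whereas a value conversion
followed by a relabel hands back the bit pattern of 5.0f (0x40A00000)
instead of 5.  That is presumably the kind of breakage a wrong type tag on
an integer system value can cause.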

v2: Fix a stupid type error for vertex id.

Best,

  OG.



[Mesa-dev] [PATCH 1/2] llvmpipe: Simplify and fix system variables fetch.

2012-06-01 Thread Olivier Galibert
The system values array concept doesn't really work because it expects the
system values to be fixed per call, which is wrong for gl_VertexID and
iffy for gl_SampleID.  So this patch does two things:

- kill the array, have emit_fetch_system_value directly pick the
  values it needs (only gl_InstanceID for now, as the previous code)

- correctly handle the expected type in emit_fetch_system_value
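
A plain-C sketch of why one value per call is not enough, assuming a batch
of four vertices processed side by side.  The LANES constant, the struct
and both helper names are made up for this illustration, and nothing here
is LLVM or gallivm API.  The instance id is the same in every lane, while
the vertex id differs per lane and, for indexed draws, comes from the
element buffer.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define LANES 4  /* assumption: four vertices per batch */

struct lanes_u32 {
    uint32_t v[LANES];
};

/* Per-call system value (e.g. an instance id): one scalar for all lanes. */
static struct lanes_u32 broadcast_u32(uint32_t x)
{
    struct lanes_u32 r;
    for (int i = 0; i < LANES; i++)
        r.v[i] = x;
    return r;
}

/*
 * Per-vertex system value (e.g. a vertex id): lane i gets index first + i,
 * looked up through the element buffer when the draw is indexed.
 * elts may be NULL for non-indexed draws; otherwise it must hold at
 * least first + LANES entries.
 */
static struct lanes_u32 gather_vertex_ids(const uint32_t *elts, uint32_t first)
{
    struct lanes_u32 r;
    assert(first <= UINT32_MAX - (LANES - 1));  /* no wrap-around */
    for (uint32_t i = 0; i < LANES; i++) {
        uint32_t index = first + i;
        r.v[i] = elts ? elts[index] : index;
    }
    return r;
}
```

A single scalar can describe the first helper's result but not the
second's, which is why a "fixed per call" array cannot carry a vertex id.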

Signed-off-by: Olivier Galibert <galib...@pobox.com>
---
 src/gallium/auxiliary/draw/draw_llvm.c  |   10 +--
 src/gallium/auxiliary/gallivm/lp_bld_tgsi.h |   11 +--
 src/gallium/auxiliary/gallivm/lp_bld_tgsi_soa.c |   88 +++
 src/gallium/drivers/llvmpipe/lp_state_fs.c  |2 +-
 4 files changed, 33 insertions(+), 78 deletions(-)

diff --git a/src/gallium/auxiliary/draw/draw_llvm.c 
b/src/gallium/auxiliary/draw/draw_llvm.c
index 4058e11..d5eb727 100644
--- a/src/gallium/auxiliary/draw/draw_llvm.c
+++ b/src/gallium/auxiliary/draw/draw_llvm.c
@@ -456,7 +456,7 @@ generate_vs(struct draw_llvm *llvm,
 LLVMBuilderRef builder,
 LLVMValueRef (*outputs)[TGSI_NUM_CHANNELS],
 const LLVMValueRef (*inputs)[TGSI_NUM_CHANNELS],
-LLVMValueRef system_values_array,
+LLVMValueRef instance_id,
 LLVMValueRef context_ptr,
 struct lp_build_sampler_soa *draw_sampler,
 boolean clamp_vertex_color)
@@ -488,7 +488,7 @@ generate_vs(struct draw_llvm *llvm,
  vs_type,
  NULL /*struct lp_build_mask_context *mask*/,
  consts_ptr,
- system_values_array,
+ instance_id,
  NULL /*pos*/,
  inputs,
  outputs,
@@ -1246,7 +1246,6 @@ draw_llvm_generate(struct draw_llvm *llvm, struct 
draw_llvm_variant *variant,
LLVMValueRef stride, step, io_itr;
LLVMValueRef io_ptr, vbuffers_ptr, vb_ptr;
LLVMValueRef instance_id;
-   LLVMValueRef system_values_array;
LLVMValueRef zero = lp_build_const_int32(gallivm, 0);
LLVMValueRef one = lp_build_const_int32(gallivm, 1);
struct draw_context *draw = llvm->draw;
@@ -1337,9 +1336,6 @@ draw_llvm_generate(struct draw_llvm *llvm, struct 
draw_llvm_variant *variant,
 
lp_build_context_init(&bld, gallivm, lp_type_int(32));
 
-   system_values_array = lp_build_system_values_array(gallivm, vs_info,
-  instance_id, NULL);
-
/* function will return non-zero i32 value if any clipped vertices */
ret_ptr = lp_build_alloca(gallivm, int32_type, "");
LLVMBuildStore(builder, zero, ret_ptr);
@@ -1415,7 +1411,7 @@ draw_llvm_generate(struct draw_llvm *llvm, struct 
draw_llvm_variant *variant,
   builder,
   outputs,
   ptr_aos,
-  system_values_array,
+  instance_id,
   context_ptr,
   sampler,
   variant->key.clamp_vertex_color);
diff --git a/src/gallium/auxiliary/gallivm/lp_bld_tgsi.h 
b/src/gallium/auxiliary/gallivm/lp_bld_tgsi.h
index 141e799..c4e690c 100644
--- a/src/gallium/auxiliary/gallivm/lp_bld_tgsi.h
+++ b/src/gallium/auxiliary/gallivm/lp_bld_tgsi.h
@@ -205,7 +205,7 @@ lp_build_tgsi_soa(struct gallivm_state *gallivm,
   struct lp_type type,
   struct lp_build_mask_context *mask,
   LLVMValueRef consts_ptr,
-  LLVMValueRef system_values_array,
+  LLVMValueRef instance_id,
   const LLVMValueRef *pos,
   const LLVMValueRef (*inputs)[4],
   LLVMValueRef (*outputs)[4],
@@ -225,13 +225,6 @@ lp_build_tgsi_aos(struct gallivm_state *gallivm,
   const struct tgsi_shader_info *info);
 
 
-LLVMValueRef
-lp_build_system_values_array(struct gallivm_state *gallivm,
- const struct tgsi_shader_info *info,
- LLVMValueRef instance_id,
- LLVMValueRef facing);
-
-
 struct lp_exec_mask {
struct lp_build_context *bld;
 
@@ -388,7 +381,7 @@ struct lp_build_tgsi_soa_context
 */
LLVMValueRef inputs_array;
 
-   LLVMValueRef system_values_array;
+   LLVMValueRef instance_id;
 
/** bitmask indicating which register files are accessed indirectly */
unsigned indirect_files;
diff --git a/src/gallium/auxiliary/gallivm/lp_bld_tgsi_soa.c 
b/src/gallium/auxiliary/gallivm/lp_bld_tgsi_soa.c
index 412dc0c..26be902 100644
--- a/src/gallium/auxiliary/gallivm/lp_bld_tgsi_soa.c
+++ b/src/gallium/auxiliary/gallivm/lp_bld_tgsi_soa.c
@@ -786,18 +786,37 @@ emit_fetch_system_value(
 {
struct lp_build_tgsi_soa_context * bld = lp_soa_context(bld_base);
struct gallivm_state *gallivm = bld->bld_base.base.gallivm;
+   const struct tgsi_shader_info *info = bld->bld_base.info;
LLVMBuilderRef builder = gallivm->builder;
-   LLVMValueRef

[Mesa-dev] [PATCH 2/2] llvmpipe: Add vertex id support.

2012-06-01 Thread Olivier Galibert
Signed-off-by: Olivier Galibert <galib...@pobox.com>
---
 src/gallium/auxiliary/draw/draw_llvm.c  |   10 --
 src/gallium/auxiliary/gallivm/lp_bld_tgsi.h |3 ++-
 src/gallium/auxiliary/gallivm/lp_bld_tgsi_soa.c |7 +++
 src/gallium/drivers/llvmpipe/lp_state_fs.c  |2 +-
 4 files changed, 18 insertions(+), 4 deletions(-)

diff --git a/src/gallium/auxiliary/draw/draw_llvm.c 
b/src/gallium/auxiliary/draw/draw_llvm.c
index d5eb727..71125ba 100644
--- a/src/gallium/auxiliary/draw/draw_llvm.c
+++ b/src/gallium/auxiliary/draw/draw_llvm.c
@@ -457,6 +457,7 @@ generate_vs(struct draw_llvm *llvm,
 LLVMValueRef (*outputs)[TGSI_NUM_CHANNELS],
 const LLVMValueRef (*inputs)[TGSI_NUM_CHANNELS],
 LLVMValueRef instance_id,
+LLVMValueRef vertex_id,
 LLVMValueRef context_ptr,
 struct lp_build_sampler_soa *draw_sampler,
 boolean clamp_vertex_color)
@@ -489,6 +490,7 @@ generate_vs(struct draw_llvm *llvm,
  NULL /*struct lp_build_mask_context *mask*/,
  consts_ptr,
  instance_id,
+ vertex_id,
  NULL /*pos*/,
  inputs,
  outputs,
@@ -1245,7 +1247,7 @@ draw_llvm_generate(struct draw_llvm *llvm, struct 
draw_llvm_variant *variant,
LLVMValueRef count, fetch_elts, fetch_count;
LLVMValueRef stride, step, io_itr;
LLVMValueRef io_ptr, vbuffers_ptr, vb_ptr;
-   LLVMValueRef instance_id;
+   LLVMValueRef instance_id, vertex_id;
LLVMValueRef zero = lp_build_const_int32(gallivm, 0);
LLVMValueRef one = lp_build_const_int32(gallivm, 1);
struct draw_context *draw = llvm->draw;
@@ -1375,6 +1377,7 @@ draw_llvm_generate(struct draw_llvm *llvm, struct 
draw_llvm_variant *variant,
   lp_build_printf(builder, " --- io %d = %p, loop counter %d\n",
   io_itr, io, lp_loop.counter);
 #endif
+  vertex_id = lp_build_zero(gallivm, lp_type_uint_vec(32));
   for (i = 0; i < TGSI_NUM_CHANNELS; ++i) {
  LLVMValueRef true_index =
 LLVMBuildAdd(builder,
@@ -1392,7 +1395,9 @@ draw_llvm_generate(struct draw_llvm *llvm, struct 
draw_llvm_variant *variant,
  &true_index, 1, "");
 true_index = LLVMBuildLoad(builder, fetch_ptr, "fetch_elt");
  }
-
+ 
+ vertex_id = LLVMBuildInsertElement(gallivm->builder, vertex_id, true_index,
+lp_build_const_int32(gallivm, i), "");
  for (j = 0; j < draw->pt.nr_vertex_elements; ++j) {
 struct pipe_vertex_element *velem = &draw->pt.vertex_element[j];
 LLVMValueRef vb_index =
@@ -1412,6 +1417,7 @@ draw_llvm_generate(struct draw_llvm *llvm, struct 
draw_llvm_variant *variant,
   outputs,
   ptr_aos,
   instance_id,
+  vertex_id,
   context_ptr,
   sampler,
   variant->key.clamp_vertex_color);
diff --git a/src/gallium/auxiliary/gallivm/lp_bld_tgsi.h 
b/src/gallium/auxiliary/gallivm/lp_bld_tgsi.h
index c4e690c..f87f899 100644
--- a/src/gallium/auxiliary/gallivm/lp_bld_tgsi.h
+++ b/src/gallium/auxiliary/gallivm/lp_bld_tgsi.h
@@ -206,6 +206,7 @@ lp_build_tgsi_soa(struct gallivm_state *gallivm,
   struct lp_build_mask_context *mask,
   LLVMValueRef consts_ptr,
   LLVMValueRef instance_id,
+  LLVMValueRef vertex_id,
   const LLVMValueRef *pos,
   const LLVMValueRef (*inputs)[4],
   LLVMValueRef (*outputs)[4],
@@ -381,7 +382,7 @@ struct lp_build_tgsi_soa_context
 */
LLVMValueRef inputs_array;
 
-   LLVMValueRef instance_id;
+   LLVMValueRef instance_id, vertex_id;
 
/** bitmask indicating which register files are accessed indirectly */
unsigned indirect_files;
diff --git a/src/gallium/auxiliary/gallivm/lp_bld_tgsi_soa.c 
b/src/gallium/auxiliary/gallivm/lp_bld_tgsi_soa.c
index 26be902..37599da 100644
--- a/src/gallium/auxiliary/gallivm/lp_bld_tgsi_soa.c
+++ b/src/gallium/auxiliary/gallivm/lp_bld_tgsi_soa.c
@@ -799,6 +799,11 @@ emit_fetch_system_value(
   atype = TGSI_TYPE_UNSIGNED;
   break;
 
+   case TGSI_SEMANTIC_VERTEXID:
+  res = bld->vertex_id;
+  atype = TGSI_TYPE_UNSIGNED;
+  break;
+
default:
   assert(!"unexpected semantic in emit_fetch_system_value");
   res = bld_base->base.zero;
@@ -1996,6 +2001,7 @@ lp_build_tgsi_soa(struct gallivm_state *gallivm,
   struct lp_build_mask_context *mask,
   LLVMValueRef consts_ptr,
   LLVMValueRef instance_id,
+  LLVMValueRef vertex_id,
   const LLVMValueRef *pos,
   const LLVMValueRef (*inputs)[TGSI_NUM_CHANNELS],
   LLVMValueRef (*outputs)[TGSI_NUM_CHANNELS

Re: [Mesa-dev] [PATCH] softpipe: Fix everything that is wrong with clipping and interpolation.

2012-05-30 Thread Olivier Galibert
On Wed, May 30, 2012 at 11:38:06AM +0100, Dave Airlie wrote:
> Have you checked llvmpipe with this? since it might need changes to go
> along with this.

It didn't look like llvmpipe was going anywhere near that.  OTOH, a
piglit run of llvmpipe I just did on the spot gave me zero errors in
that area (including int interpolation, if you can believe it), which
left me somewhat surprised.  I didn't think it was working correctly
in the first place...  It was failing in other places due to, e.g., the
lack of support for anything other than z24s8, so I didn't hit
softpipe by mistake, I think.


> Also a count of piglit tests it fixes in the commit might be good.

That's going to take a little more time, given my trees are
accumulating not-yet-committed fixes.

  OG.


Re: [Mesa-dev] [PATCH] softpipe: Fix everything that is wrong with clipping and interpolation.

2012-05-30 Thread Olivier Galibert
On Wed, May 30, 2012 at 07:32:16AM -0600, Brian Paul wrote:
> All the code above could use more comments.  Otherwise it takes some
> pretty intense studying to understand what's going on there.

Ok, I'll take that into account (and the previous comments too).

  OG.


[Mesa-dev] [PATCH] softpipe: Fix everything that is wrong with clipping and interpolation.

2012-05-30 Thread Olivier Galibert
This includes:
- picking up correctly which attributes are flatshaded and which are
  noperspective

- copying the flatshaded attributes when needed, including the
  non-built-in ones

- correctly interpolating the noperspective attributes in screen-space
  rather than in a 3d-correct fashion.

Signed-off-by: Olivier Galibert <galib...@pobox.com>
---
 src/gallium/auxiliary/draw/draw_pipe_clip.c |  144 +--
 1 file changed, 113 insertions(+), 31 deletions(-)

I've kicked the t_nopersp computation up so that it's always
evaluated, and I've added a bunch of comments.

Every generated interpolation test in piglit passes for both softpipe
and llvmpipe at that point (after forcing llvmpipe to GLSL 1.30 of
course).
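
For readers following the noperspective part, here is a standalone sketch
of the screen-space t idea, under assumptions that are mine and not the
patch's: positions are given in clip space as (x, y, w) with nonzero w,
the clipped point sits at parameter t between `out` (t = 0) and `in`
(t = 1), and the struct and function names are made up.  It projects both
endpoints and the clipped point by dividing by w, then measures how far
along the projected segment the point lies, falling back from X to Y and
finally to the 3d t.

```c
#include <assert.h>

/* Hypothetical clip-space position; only the components this sketch uses. */
struct clip_pos {
    float x, y, w;
};

/* Linear interpolation with t = 0 at `out` and t = 1 at `in`. */
static float lerp_out_in(float t, float out, float in)
{
    return out + t * (in - out);
}

/* Fraction of the way from s_out to s_in at which s_dst lies. */
static float fraction_along(float s_dst, float s_out, float s_in)
{
    return (s_dst - s_out) / (s_in - s_out);
}

static float screen_space_t(struct clip_pos in, struct clip_pos out, float t)
{
    /* w of the clipped point, found with the 3d t. */
    const float dw = lerp_out_in(t, out.w, in.w);
    assert(in.w != 0.0f && out.w != 0.0f && dw != 0.0f);

    /* Project X for both endpoints and compare on screen. */
    const float sx_in = in.x / in.w, sx_out = out.x / out.w;
    if (sx_in != sx_out)
        return fraction_along(lerp_out_in(t, out.x, in.x) / dw, sx_out, sx_in);

    /* Same screen X: try Y instead. */
    const float sy_in = in.y / in.w, sy_out = out.y / out.w;
    if (sy_in != sy_out)
        return fraction_along(lerp_out_in(t, out.y, in.y) / dw, sy_out, sy_in);

    /* Both endpoints project to the same screen point: reuse the 3d t. */
    return t;
}
```

With equal w at both ends the result equals t, so perspective and
noperspective interpolation agree; the two only diverge when w differs
along the edge.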

diff --git a/src/gallium/auxiliary/draw/draw_pipe_clip.c 
b/src/gallium/auxiliary/draw/draw_pipe_clip.c
index 4da4d65..2d36eb3 100644
--- a/src/gallium/auxiliary/draw/draw_pipe_clip.c
+++ b/src/gallium/auxiliary/draw/draw_pipe_clip.c
@@ -39,6 +39,7 @@
 
 #include "draw_vs.h"
 #include "draw_pipe.h"
+#include "draw_fs.h"
 
 
 #ifndef IS_NEGATIVE
@@ -56,11 +57,12 @@
 struct clip_stage {
struct draw_stage stage;  /** base class */
 
-   /* Basically duplicate some of the flatshading logic here:
-*/
-   boolean flat;
-   uint num_color_attribs;
-   uint color_attribs[4];  /* front/back primary/secondary colors */
+   /* List of the attributes to be flatshaded. */
+   uint num_flat_attribs;
+   uint flat_attribs[PIPE_MAX_SHADER_OUTPUTS];
+
+   /* Mask of attributes in noperspective mode */
+   boolean noperspective_attribs[PIPE_MAX_SHADER_OUTPUTS];
 
float (*plane)[4];
 };
@@ -91,17 +93,16 @@ static void interp_attr( float dst[4],
 
 
 /**
- * Copy front/back, primary/secondary colors from src vertex to dst vertex.
- * Used when flat shading.
+ * Copy flat shaded attributes from src vertex to dst vertex.
  */
-static void copy_colors( struct draw_stage *stage,
-struct vertex_header *dst,
-const struct vertex_header *src )
+static void copy_flat( struct draw_stage *stage,
+   struct vertex_header *dst,
+   const struct vertex_header *src )
 {
const struct clip_stage *clipper = clip_stage(stage);
uint i;
-   for (i = 0; i < clipper->num_color_attribs; i++) {
-      const uint attr = clipper->color_attribs[i];
+   for (i = 0; i < clipper->num_flat_attribs; i++) {
+      const uint attr = clipper->flat_attribs[i];
   COPY_4FV(dst->data[attr], src->data[attr]);
}
 }
@@ -120,6 +121,7 @@ static void interp( const struct clip_stage *clip,
const unsigned pos_attr = draw_current_shader_position_output(clip->stage.draw);
const unsigned clip_attr = draw_current_shader_clipvertex_output(clip->stage.draw);
unsigned j;
+   float t_nopersp;
 
/* Vertex header.
 */
@@ -148,12 +150,36 @@ static void interp( const struct clip_stage *clip,
   dst->data[pos_attr][2] = pos[2] * oow * scale[2] + trans[2];
   dst->data[pos_attr][3] = oow;
}
+   
+   /**
+* Compute the t in screen-space instead of 3d space to use
+* for noperspective interpolation.
+*
+* The points can be aligned with the X axis, so in that case try
+* the Y.  When both points are at the same screen position, we can
+* pick whatever value (the interpolated point won't be in front
+* anyway), so just use the 3d t.
+*/
+   {
+  int k;
+  t_nopersp = t;
+  for (k = 0; k < 2; k++)
+ if (in->data[pos_attr][k] != out->data[pos_attr][k]) {
+t_nopersp = (dst->data[pos_attr][k] - out->data[pos_attr][k]) /
+   (in->data[pos_attr][k] - out->data[pos_attr][k]);
+break;
+ }
+   }
 
/* Other attributes
 */
for (j = 0; j < nr_attrs; j++) {
-  if (j != pos_attr && j != clip_attr)
-interp_attr(dst->data[j], t, in->data[j], out->data[j]);
+  if (j != pos_attr && j != clip_attr) {
+ if (clip->noperspective_attribs[j])
+interp_attr(dst->data[j], t_nopersp, in->data[j], out->data[j]);
+ else
+interp_attr(dst->data[j], t, in->data[j], out->data[j]);
+  }
}
 }
 
@@ -406,14 +432,14 @@ do_clip_tri( struct draw_stage *stage,
/* If flat-shading, copy provoking vertex color to polygon vertex[0]
 */
if (n >= 3) {
-  if (clipper->flat) {
+  if (clipper->num_flat_attribs) {
  if (stage->draw->rasterizer->flatshade_first) {
 if (inlist[0] != header->v[0]) {
assert(tmpnr < MAX_CLIPPED_VERTICES + 1);
if (tmpnr >= MAX_CLIPPED_VERTICES + 1)
   return;
inlist[0] = dup_vert(stage, inlist[0], tmpnr++);
-   copy_colors(stage, inlist[0], header->v[0]);
+   copy_flat(stage, inlist[0], header->v[0]);
 }
  }
  else {
@@ -422,7 +448,7 @@ do_clip_tri( struct draw_stage *stage,
if (tmpnr >= MAX_CLIPPED_VERTICES + 1)
   return
