from:"Olivier Galibert"

Re: [Mesa-dev] ANNOUNCE: An open-source Vulkan driver for Intel hardware

2016-03-01 Thread Olivier Galibert

I can confirm tri/cube work with latest git.  Talos Principle refuses
to start because vkCmdBeginQuery is missing, so it's time to jump into
the docs to see how much of the gen8 code can be copied over.

  OG.


On Tue, Mar 1, 2016 at 10:28 AM, Jacek Konieczny  wrote:
> On 2016-03-01 10:10, Martin Peres wrote:
>>
>> On 29/02/16 20:48, Jason Ekstrand wrote:
>>>
>>> On Fri, Feb 26, 2016 at 2:18 AM, Olivier Galibert wrote:
>>>
>>> Ok, I can tell you that 3DSTATE_DEPTH_BUFFER and
>>> 3DSTATE_STENCIL_BUFFER seem perfectly correct (assuming the gem
>>> address-patching-in works for the depth buffer address). I'll see if
>>> I can find a past version that works.
>>>
>>>
>>> FYI, this hang has been fixed now and most of the demos work
>>> more-or-less.
>>> --Jason
>>
>>
>> Just tried the vkcube with hsw and there is definitely an improvement
>> (the machine does not hard hang anymore) but vkcube now segfaults:
>
>
> For me both 'vkcube' and the 'cube' and 'tri' demos from
> LoaderAndValidationLayers work correctly with GIT revision
> 46b7c242da7c7c9ea7877a2c4b1fecdf5c1c0452.
>
> 'cube', 'tri' and most other Vulkan examples would cause GPU hang on
> earlier revisions, so the improvement is (was?) clear.
>
> Jacek
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

Re: [Mesa-dev] ANNOUNCE: An open-source Vulkan driver for Intel hardware

2016-03-01 Thread Olivier Galibert

Beware of path issues: vk* has no error checking and passes funky
values to the driver if it fails to find its extra files.

  OG.


On Tue, Mar 1, 2016 at 10:10 AM, Martin Peres  wrote:
> On 29/02/16 20:48, Jason Ekstrand wrote:
>
> On Fri, Feb 26, 2016 at 2:18 AM, Olivier Galibert 
> wrote:
>>
>> Ok, I can tell you that 3DSTATE_DEPTH_BUFFER and
>> 3DSTATE_STENCIL_BUFFER seem perfectly correct (assuming the gem
>> address-patching-in works for the depth buffer address).  I'll see if
>> I can find a past version that works.
>
>
> FYI, this hang has been fixed now and most of the demos work more-or-less.
> --Jason
>
>
> Just tried the vkcube with hsw and there is definitely an improvement (the
> machine does not hard hang anymore) but vkcube now segfaults:
>
> #0  0x75210f23 in anv_descriptor_set_create () from /usr/lib/libvulkan_intel.so
> #1  0x7521121d in anv_AllocateDescriptorSets () from /usr/lib/libvulkan_intel.so
> #2  0x004063b0 in ?? ()
> #3  0x004035df in ?? ()
> #4  0x00404986 in ?? ()
> #5  0x0040589f in ?? ()
> #6  0x76aa9710 in __libc_start_main () from /usr/lib/libc.so.6
> #7  0x00402e69 in ?? ()
>
> Is it supposed to?
>
> I will have a look at it tonight.
>
> Martin

Re: [Mesa-dev] ANNOUNCE: An open-source Vulkan driver for Intel hardware

2016-02-26 Thread Olivier Galibert

Ok, I can tell you that 3DSTATE_DEPTH_BUFFER and
3DSTATE_STENCIL_BUFFER seem perfectly correct (assuming the gem
address-patching-in works for the depth buffer address).  I'll see if
I can find a past version that works.

  OG.


On Wed, Feb 17, 2016 at 4:31 PM, Jason Ekstrand  wrote:
> On Tue, Feb 16, 2016 at 11:22 PM, Olivier Galibert 
> wrote:
>>
>> I'm actually interested in how one goes about debugging that kind
>> of problem, if you have pointers.  I would have an idea or two on how
>> to go about it if it was in userspace only, but once it crosses into
>> the kernel I'm not sure what strategies are best.
>
>
> This is almost certainly a userspace problem.  I mentioned before that it's
> probably a depth/stencil problem.  I remember having similar problems a few
> months ago when I was reviving gen7.  I know that depth/stencil did work at
> some point.
>
> I would start by looking at where we emit the 3DSTATE_DEPTH_BUFFER and
> 3DSTATE_STENCIL_BUFFER and trying to see if we're setting something up
> wrong.  Sometimes it's just a matter of looking at the documentation and
> comparing the values we're setting to the docs and seeing if they make sense.
> That's where I'd start.
>
> You could also try to go back a little ways (don't go past the update to
> 1.0) to see if you can find a point where depth/stencil worked and try to
> bisect to find where it broke.  That may also provide hints as to what's
> going wrong.
>
> Hope that helps,
> --Jason

Re: [Mesa-dev] ANNOUNCE: An open-source Vulkan driver for Intel hardware

2016-02-17 Thread Olivier Galibert

Ok, I'll do that, thanks :-)  No matter what, I'll learn interesting things.

  OG.



Re: [Mesa-dev] ANNOUNCE: An open-source Vulkan driver for Intel hardware

2016-02-16 Thread Olivier Galibert

I'm actually interested in how one goes about debugging that kind
of problem, if you have pointers.  I would have an idea or two on how
to go about it if it were in userspace only, but once it crosses into
the kernel I'm not sure what strategies are best.

Best,

  OG.


On Wed, Feb 17, 2016 at 2:51 AM, Jason Ekstrand  wrote:
> On Tue, Feb 16, 2016 at 1:21 PM, Olivier Galibert 
> wrote:
>>
>>   Hi,
>>
>> I'm getting gpu hangs with the lunarg examples (cube and tri) on my
>> Haswell (64 bits).  I attach /sys/class/drm/card0/error fwiw.  How
>> should I go about debugging that?
>
>
> It's a depth-stencil issue and we know about it.   The gen7 code needs some
> love.   I think Kristian and Jordan have been working on it.
> --Jason
>
>>
>>
>>   OG.
>>
>>
>> On Tue, Feb 16, 2016 at 4:19 PM, Jason Ekstrand 
>> wrote:
>> > The Intel mesa team is pleased to announce a brand-new open-source
>> > Vulkan driver for Intel hardware.  We've been working hard on this
>> > over the course of the past year or so and are excited to finally
>> > share it with the community.  We will work on up-streaming the driver
>> > in the next few weeks and hope to have it all in place in time for
>> > mesa 11.3 (mesa 12?).  In the meantime, the driver can be found in
>> > the "vulkan" branch of the mesa git repo on freedesktop.org:
>> >
>> > https://cgit.freedesktop.org/mesa/mesa/log/?h=vulkan
>> >
>> > More information on building the driver and running a few simple apps
>> > can be found on the 01.org web site:
>> >
>> > https://01.org/linuxgraphics/blogs/jekstrand/2016/open-source-vulkan-drivers-intel-hardware
>> >
>> > We have talked to people at Red Hat and Canonical and binaries should
>> > be available for Fedora and Ubuntu soon.  We will update the page on
>> > 01.org with links as soon as they are available.
>> >
>> > We have also created a small test suite called crucible which
>> > contains a few hundred tests (mostly for miptrees) that we created
>> > when bringing up the driver.  This isn't really intended to be the
>> > piglit of vulkan.  With the CTS being publicly available, most
>> > cross-platform tests should go there.  We mostly made crucible so
>> > that we could write a few tests early on to get us going and for
>> > tests that were targeted specifically at our implementation.
>> > Nonetheless, they may prove useful to someone and we are happy to
>> > share them.  The crucible source code can be found at
>> >
>> > https://cgit.freedesktop.org/mesa/crucible/
>> >
>> > Frequently Asked Questions:
>> >
>> > What all hardware does it support?
>> >
>> >    The driver currently supports Sky Lake all the way back to Ivy
>> >    Bridge.  The driver is Vulkan 1.0 conformant for 64-bit builds on
>> >    Sky Lake, Broadwell, and Braswell.  We are still having a couple of
>> >    32-bit issues and support for Haswell, Ivy Bridge, and Bay Trail
>> >    should be considered experimental.
>> >
>> > How much code is shared between the Vulkan and GL drivers?
>> >
>> >    For shaders, we're using a SPIR-V to NIR pass which is new, and a
>> >    few new NIR lowering passes for things that we previously depended
>> >    on GLSL IR to handle.  Beyond that, we're using the same core NIR
>> >    and the same back-end compiler that we have for GL.  We're carrying
>> >    a few patches against the back-end compiler, but the delta is very
>> >    small and it's all stuff that we eventually want to do for GL
>> >    anyway.
>> >
>> >    The main API handling and state setup code is all new and written
>> >    from the ground-up for Vulkan.  For actually packing hardware
>> >    packets, we are using a codegen system that Kristian developed
>> >    early on in the project that's based on an XML description of the
>> >    hardware packets.  The result is state setup code that's both
>> >    easier to work with and maybe even a little more efficient than
>> >    what we have in mesa today.
>> >
>> >    We also have a brand-new surface layout library called ISL that
>> >    handles

Re: [Mesa-dev] [PATCH v6] nir: Add an ALU op builder kind of like ir_builder.h

2015-02-17 Thread Olivier Galibert

  Hi,

I thought mesa was C++ by now?  That API is really C-ish.

  OG.


On Wed, Feb 18, 2015 at 2:12 AM, Kenneth Graunke  wrote:
> On Friday, February 06, 2015 04:00:10 PM Eric Anholt wrote:
>> v2: Rebase on the nir_opcodes.h python code generation support.
>> v3: Use SSA values, and set an appropriate writemask on dot products.
>> v4: Make the arguments be SSA references as well.  This lets you stack up
>> expressions in the arguments of other expressions, at the cost of
>> having to insert a fmov/imov if you want to swizzle.  Also, add
>> the generated file to NIR_GENERATED_FILES.
>> v5: Use more pythonish style for iterating the list.
>> v6: Infer the size of the dest from the size of the srcs, and auto-swizzle
>> a single small src out to the appropriate size.
>> ---
>>  src/glsl/Makefile.am                  |   5 ++
>>  src/glsl/Makefile.sources             |   1 +
>>  src/glsl/nir/.gitignore               |   1 +
>>  src/glsl/nir/nir_builder.h            | 114 ++
>>  src/glsl/nir/nir_builder_opcodes_h.py |  38
>>  5 files changed, 159 insertions(+)
>>  create mode 100644 src/glsl/nir/nir_builder.h
>>  create mode 100644 src/glsl/nir/nir_builder_opcodes_h.py
>
> This patch is:
> Reviewed-by: Kenneth Graunke 
>
> I do like Connor's ideas - we should definitely extend this and use it
> in more places.  I think we can easily do that as a follow on series.
>
> It might make sense to (eventually) have an API like:
>
> nir_builder *nir_builder_create(...)
>
> nir_builder_insert_at_cf_list(nir_builder *b, nir_cf_list *cf_list)
> nir_builder_insert_at_block_start(nir_builder *b, nir_bblock *block)
> nir_builder_insert_at_block_end(nir_builder *b, nir_bblock *block)
> nir_builder_insert_after_instr(nir_builder *b, nir_instruction *instr)
> nir_builder_insert_before_instr(nir_builder *b, nir_instruction *instr)
>
> I could see us having to store a cf_list/bblock/instr and needing to
> swap around several fields, so having functions would be nicer than
> prodding at struct fields directly.
>
> But for now, I think it's sufficient - it'll be easy enough to create
> later, when we actually make the other APIs and start using them.
>

Re: [Mesa-dev] [PATCH] mesa: Add mesa SHA-1 functions

2014-12-23 Thread Olivier Galibert

  Hi,

Not sure there's anything to maintain, but sure, I'll maintain it.

Best,

  OG.


On Sun, Dec 21, 2014 at 8:51 PM, Emil Velikov  wrote:
> On 20 December 2014 at 14:21, Olivier Galibert  wrote:
>> Here is an implementation I've written myself, so no license issues.
>>
> Thanks OG,
>
> Afaics the main issue is not the lack of implementation, but that
> no-one wants to step up to "maintain" it.
> Even adding code that is x2 the size is considered a better solution :'-(
>
> If you're up to the maintenance task, we can resolve all the issues
> (linking, multi-platform support) in half the size and with a much
> cleaner build :-)
>
> Cheers,
> Emil

Re: [Mesa-dev] [PATCH] mesa: Add mesa SHA-1 functions

2014-12-20 Thread Olivier Galibert

Here is an implementation I've written myself, so no license issues.

  OG.


On Fri, Dec 12, 2014 at 10:48 AM, Jose Fonseca  wrote:
> On 11/12/14 22:02, Brian Paul wrote:
>>
>> On 12/11/2014 02:51 PM, Carl Worth wrote:
>>>
>>> From: Kristian Høgsberg 
>>>
>>> The upcoming shader cache uses the SHA-1 algorithm for cryptographic
>>> naming. These new mesa_sha1 functions are implemented with the nettle
>>> library.
>>> ---
>>>
>>> This patch is another in support of my upcoming shader-cache work.
>>> Thanks to Kristian for coding this piece.
>>>
>>> As currently written, this patch introduces a new dependency of Mesa
>>> on the Nettle library to implement SHA-1. I'm open to recommendations
>>> if people would prefer some other option.
>>>
>>> For example, the xserver can be configured to get a SHA-1
>>> implementation from libmd, libc, CommonCrypto, CryptoAPI, libnettle,
>>> libgcrypt, libsha1, or openssl.
>>>
>>> I don't know if it's important to offer as many options as that,
>>> which is why I'm asking for opinions here.
>>
>>
>>
>> We'll need a solution for Windows too.  I don't have time right now to
>> do any research into that.
>
>
> Yes, ideally we'd have something small that we could bundle into the mesa
> source tree, for the sake of non-Linux OSes.
>
> If Windows was the only concern, we could use its Crypto API,
> http://msdn.microsoft.com/en-us/library/windows/desktop/aa382379.aspx and
> avoid depending on anything else, but some of the above-mentioned libraries
> are not trivial to install.
>
> The other alternative is to disable shader cache when no suitable dependency
> is found. That is, make this an optional dependency.
>
> Jose
>
/*
 * Copyright © 2014 Olivier Galibert & Intel Corporation
 *
 * Permission is hereby granted, free of charge, to any person obtaining a
 * copy of this software and associated documentation files (the "Software"),
 * to deal in the Software without restriction, including without limitation
 * the rights to use, copy, modify, merge, publish, distribute, sublicense,
 * and/or sell copies of the Software, and to permit persons to whom the
 * Software is furnished to do so, subject to the following conditions:
 *
 * The above copyright notice and this permission notice (including the next
 * paragraph) shall be included in all copies or substantial portions of the
 * Software.
 *
 * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
 * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
 * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.  IN NO EVENT SHALL
 * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
 * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
 * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER
 * DEALINGS IN THE SOFTWARE.
 */

#include 
#include 

#include "sha1.h"

/* Rotate a 32-bit value left by count bits (0 < count < 32). */
static inline unsigned int mesa_sha1_shift(unsigned int val, int count)
{
    return (val << count) | (val >> (32-count));
}

static void mesa_sha1_init(struct mesa_sha1 *ctx)
{
    ctx->digest[0] = 0x67452301;
    ctx->digest[1] = 0xefcdab89;
    ctx->digest[2] = 0x98badcfe;
    ctx->digest[3] = 0x10325476;
    ctx->digest[4] = 0xc3d2e1f0;
    ctx->msize = 0;
}

static void mesa_sha1_handle_block(struct mesa_sha1 *ctx, const unsigned char *b)
{
    unsigned int W[80];
    /* Load 16 big-endian words.  The top byte is cast before shifting:
     * b[] promotes to signed int, and a byte >= 0x80 shifted left by 24
     * would overflow it. */
    for(int i=0; i != 16; i++)
        W[i] = ((unsigned int)b[4*i] << 24) | (b[4*i+1] << 16) | (b[4*i+2] << 8) | b[4*i+3];
    for(int i=16; i != 80; i++)
        W[i] = mesa_sha1_shift(W[i-3]^W[i-8]^W[i-14]^W[i-16], 1);

    unsigned int A = ctx->digest[0];
    unsigned int B = ctx->digest[1];
    unsigned int C = ctx->digest[2];
    unsigned int D = ctx->digest[3];
    unsigned int E = ctx->digest[4];

    for(int i= 0; i != 20; i++) {
        unsigned int T = mesa_sha1_shift(A, 5) + ((B & C) | ((~B) & D))+ E + W[i] + 0x5A827999;
        E = D;
        D = C;
        C = mesa_sha1_shift(B, 30);
        B = A;
        A = T;
    }

    for(int i=20; i != 40; i++) {
        unsigned int T = mesa_sha1_shift(A, 5) + (B^C^D)   + E + W[i] + 0x6ed9eba1;
        E = D;
        D = C;
        C = mesa_sha1_shift(B, 30);
        B = A;
        A = T;
    }

    for(int i=40; i != 60; i++) {
        unsigned int T = mesa_sha1_shift(A, 5) + ((B & C) | (B & D

Re: [Mesa-dev] [PATCH 03/16] mesa: Clamps the stencil value masks to GLint when queried

2014-12-18 Thread Olivier Galibert

Hmm, if you convert to float you have a real problem: floats only
have 23 bits of mantissa, so if bit 31 or 32 is set, bits 0-7 will be
lost.  Converting directly won't change a thing there.  Initializing
to 255 definitely seems better.
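
To make the precision point concrete, here is a small sketch in C.  It
is not Mesa code; the two helper names are made up for illustration,
and the signed step assumes the usual two's-complement behaviour, which
the C standard leaves implementation-defined.

```c
#include <stdint.h>

/* Mask value as a direct unsigned -> float conversion would report it.
 * (Hypothetical helper, not part of Mesa.) */
static float mask_to_float_direct(uint32_t mask)
{
    return (float)mask;
}

/* Mask value when it passes through a signed 32-bit int first.  The
 * unsigned -> signed step is implementation-defined for values above
 * INT32_MAX; on common two's-complement targets ~0u becomes -1. */
static float mask_to_float_via_int(uint32_t mask)
{
    int32_t as_int = (int32_t)mask;
    return (float)as_int;
}
```

With an all-ones mask the direct path rounds up to 4294967296.0 and the
via-int path gives -1.0 on such targets, so neither reports the
original bit pattern, while a value like 255 survives either path
exactly.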

W.r.t clamping, in computer graphics clamping a value to an interval
means setting the value to the nearest boundary if it was outside of
the interval.  Clamping can never change a value to something *inside*
the interval, which masking does.
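
A minimal sketch of the difference, assuming a 32-bit GLint so that
max int is 0x7FFFFFFF (the function names are made up for this
example):

```c
#include <stdint.h>

/* Clamp: anything above the limit snaps to the limit itself;
 * values already in range are returned unchanged. */
static uint32_t clamp_to_int_max(uint32_t v)
{
    return v > (uint32_t)INT32_MAX ? (uint32_t)INT32_MAX : v;
}

/* Mask: bit 31 is cleared and the low 31 bits pass through as they
 * are, so an out-of-range input can land anywhere in the range. */
static uint32_t mask_to_31_bits(uint32_t v)
{
    return v & 0x7FFFFFFFu;
}
```

For ~0u both give 0x7FFFFFFF, which is probably why the two got
confused, but for 0x80000001 the clamp gives 0x7FFFFFFF and the mask
gives 1.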

  OG.


On Thu, Dec 18, 2014 at 12:08 PM, Eduardo Lima Mitev  wrote:
> On 12/18/2014 10:28 AM, Eduardo Lima Mitev wrote:
>> On 12/18/2014 09:55 AM, Olivier Galibert wrote:
>>> Something is not clear to me: In which way -1 is incorrect?
>>>
>>
>> Hi Olivier,
>>
>> The values being queried are the front and back stencil masks. Masks are
>> (conceptually?) an unsigned integer, AFAIU.
>
> Well, more accurately, just a string of bits, so -1 (or any other value)
> is probably fine. Problem is when the signed integer value is further
> converted to float, which is the case for these failing tests. (Note
> that only the test cases that query the mask as a float value are the
> ones failing).
>
> Giving a bit more of thought to this, and assuming the test is fine
> querying for a mask value using the glGetFloat API, the problem is Mesa
> converting from unsigned int (the mask) to signed int, then to float;
> instead of converting to float directly. I don't have a say on why it
> does that intermediate conversion to int, though.
>
> So my original solution was wrong from different angles, and the final
> solution is probably not the best one either, since we are "avoiding"
> the type conversion problem rather than fixing it.
>
> Thanks for raising these points.
>
> Eduardo

Re: [Mesa-dev] [PATCH 03/16] mesa: Clamps the stencil value masks to GLint when queried

2014-12-18 Thread Olivier Galibert

Hi,

Something is not clear to me: in what way is -1 incorrect?

Also, w.r.t the comments, what you're doing is masking, not clamping,
which incidentally is a good thing since clamping would be severely
bad for stencil.

Best,

  OG.


On Thu, Dec 11, 2014 at 11:34 PM, Eduardo Lima Mitev  wrote:
> Stencil value masks (ctx->Stencil.ValueMask[]) store GLuint values
> which are initialized with max unsigned integer (~0u). When these values
> are queried by glGet* (GL_STENCIL_VALUE_MASK or GL_STENCIL_BACK_VALUE_MASK),
> they are converted to a signed integer. Currently, these values overflow
> and return an incorrect result (-1).
>
> This patch clamps these values to max int (0x7FFFFFFF) before storing.
>
> Fixes 6 dEQP failing tests:
> * dEQP-GLES3.functional.state_query.integers.stencil_value_mask_getfloat
> * dEQP-GLES3.functional.state_query.integers.stencil_back_value_mask_getfloat
> * dEQP-GLES3.functional.state_query.integers.stencil_value_mask_separate_getfloat
> * dEQP-GLES3.functional.state_query.integers.stencil_value_mask_separate_both_getfloat
> * dEQP-GLES3.functional.state_query.integers.stencil_back_value_mask_separate_getfloat
> * dEQP-GLES3.functional.state_query.integers.stencil_back_value_mask_separate_both_getfloat
> ---
>  src/mesa/main/get.c  | 11 ++-
>  src/mesa/main/get_hash_params.py |  2 +-
>  2 files changed, 11 insertions(+), 2 deletions(-)
>
> diff --git a/src/mesa/main/get.c b/src/mesa/main/get.c
> index 6091efc..4578a36 100644
> --- a/src/mesa/main/get.c
> +++ b/src/mesa/main/get.c
> @@ -726,7 +726,16 @@ find_custom_value(struct gl_context *ctx, const struct value_desc *d, union valu
>v->value_int = _mesa_get_stencil_ref(ctx, 1);
>break;
> case GL_STENCIL_VALUE_MASK:
> -  v->value_int = ctx->Stencil.ValueMask[ctx->Stencil.ActiveFace];
> +  /* Since stencil value mask is a GLuint, it requires clamping
> +   * before storing in a signed int to avoid overflow.
> +   * Notice that Stencil.ValueMask values are initialized to ~0u,
> +   * so without clamping it will return -1 when assigned to value_int.
> +   */
> +  v->value_int = ctx->Stencil.ValueMask[ctx->Stencil.ActiveFace] & 0x7FFFFFFF;
> +  break;
> +   case GL_STENCIL_BACK_VALUE_MASK:
> +  /* Same as with GL_STENCIL_VALUE_MASK, value requires clamping. */
> +  v->value_int = ctx->Stencil.ValueMask[1] & 0x7FFFFFFF;
>break;
> case GL_STENCIL_WRITEMASK:
>v->value_int = ctx->Stencil.WriteMask[ctx->Stencil.ActiveFace];
> diff --git a/src/mesa/main/get_hash_params.py b/src/mesa/main/get_hash_params.py
> index 09a61ac..a3bf1cb 100644
> --- a/src/mesa/main/get_hash_params.py
> +++ b/src/mesa/main/get_hash_params.py
> @@ -283,7 +283,7 @@ descriptor=[
>
>  # OpenGL 2.0
>[ "STENCIL_BACK_FUNC", "CONTEXT_ENUM(Stencil.Function[1]), NO_EXTRA" ],
> -  [ "STENCIL_BACK_VALUE_MASK", "CONTEXT_INT(Stencil.ValueMask[1]), NO_EXTRA" ],
> +  [ "STENCIL_BACK_VALUE_MASK", "LOC_CUSTOM, TYPE_INT, NO_OFFSET, NO_EXTRA"],
>[ "STENCIL_BACK_WRITEMASK", "CONTEXT_INT(Stencil.WriteMask[1]), NO_EXTRA" ],
>[ "STENCIL_BACK_REF", "LOC_CUSTOM, TYPE_INT, NO_OFFSET, NO_EXTRA" ],
>[ "STENCIL_BACK_FAIL", "CONTEXT_ENUM(Stencil.FailFunc[1]), NO_EXTRA" ],
> --
> 2.1.3
>

Re: [Mesa-dev] [PATCH v2] mesa: Initializes the stencil value masks to 0xFF instead of ~0u

2014-12-16 Thread Olivier Galibert

Note that ~0U is perfectly correct w.r.t the GLES3 spec.  It just
means that s=32, which happens to be greater than or equal to 8.

Best,

  OG.


On Tue, Dec 16, 2014 at 8:58 AM, Eduardo Lima Mitev  wrote:
> On 12/15/2014 08:30 PM, Ian Romanick wrote:
>> On 12/15/2014 08:04 AM, Eduardo Lima Mitev wrote:
>>>
>>> Since the maximum supported precision for stencil buffers is 8 bits, mask
>>> values should be initialized to 2^8 - 1 = 0xFF.
>>>
>>> Currently, these masks are initialized to max unsigned integer (~0u), which
>>> causes their values to overflow to -1 when converted to signed int by 
>>> glGet* APIs.
>>
>> I did some research on this... before desktop OpenGL 3.1, the spec said
>> something quite different.  Please add the following to the commit message:
>>
>> "In OpenGL 3.0 and before, an initial value of ~0u was specified:
>>
>> In the initial state, stenciling is disabled, the front and back
>> stencil reference value are both zero, the front and back stencil
>> comparison functions are both ALWAYS, and the front and back
>> stencil mask are both all ones."
>>
>
> Oh, interesting. I should have looked back into older specs to
> understand where the ~0u was coming from. Note taken.
>
>>
>> With that, this patch is
>>
>> Reviewed-by: Ian Romanick 
>>
>
> Great. If you feel like nitpicking, you can check the final commit log
> here:
> https://github.com/Igalia/mesa/commit/3784f7b2d5aa739c4abf9aa28874b85bbd1550e5
>
> Thanks a lot!
>
> Eduardo
>

Re: [Mesa-dev] [PATCH] mesa: Add mesa SHA-1 functions

2014-12-12 Thread Olivier Galibert

  Hi,

SHA1 is easy to implement.  If you want an always-working backup, I
have a couple of C versions I wrote myself.  Libraries are only
interesting if they offer significant speedups through CPU-specific
code, especially since the shader cache is not in the happy fun land
of hardware-based attacks (or attacks in the first place).
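
To give an idea of the size involved, here is a self-contained one-shot
sketch in plain C.  It is written for this note and is not one of the
versions mentioned above nor anything from the Mesa tree; the names
rol32, sha1_block and sha1_hex are made up, and it hashes a whole
buffer at once rather than offering an init/update/final interface.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

static uint32_t rol32(uint32_t v, int n)
{
    return (v << n) | (v >> (32 - n));
}

/* Process one 64-byte block, updating the five-word state h. */
static void sha1_block(uint32_t h[5], const unsigned char *blk)
{
    uint32_t w[80];
    for (int i = 0; i < 16; i++)
        w[i] = ((uint32_t)blk[4*i] << 24) | ((uint32_t)blk[4*i+1] << 16) |
               ((uint32_t)blk[4*i+2] << 8) | (uint32_t)blk[4*i+3];
    for (int i = 16; i < 80; i++)
        w[i] = rol32(w[i-3] ^ w[i-8] ^ w[i-14] ^ w[i-16], 1);

    uint32_t a = h[0], b = h[1], c = h[2], d = h[3], e = h[4];
    for (int i = 0; i < 80; i++) {
        uint32_t f, k;
        if (i < 20)      { f = (b & c) | (~b & d);          k = 0x5a827999; }
        else if (i < 40) { f = b ^ c ^ d;                   k = 0x6ed9eba1; }
        else if (i < 60) { f = (b & c) | (b & d) | (c & d); k = 0x8f1bbcdc; }
        else             { f = b ^ c ^ d;                   k = 0xca62c1d6; }
        uint32_t t = rol32(a, 5) + f + e + w[i] + k;
        e = d; d = c; c = rol32(b, 30); b = a; a = t;
    }
    h[0] += a; h[1] += b; h[2] += c; h[3] += d; h[4] += e;
}

/* One-shot SHA-1 of len bytes; writes 40 hex chars plus NUL to hex. */
static void sha1_hex(const void *data, size_t len, char hex[41])
{
    uint32_t h[5] = { 0x67452301, 0xefcdab89, 0x98badcfe,
                      0x10325476, 0xc3d2e1f0 };
    const unsigned char *p = data;
    size_t n = len;
    while (n >= 64) { sha1_block(h, p); p += 64; n -= 64; }

    /* Padding: 0x80, zeros, then the length in bits, 64-bit big-endian.
     * A second block is needed when fewer than 8 bytes remain free. */
    unsigned char tail[128] = { 0 };
    memcpy(tail, p, n);
    tail[n] = 0x80;
    size_t total = (n < 56) ? 64 : 128;
    uint64_t bits = (uint64_t)len * 8;
    for (int i = 0; i < 8; i++)
        tail[total - 1 - i] = (unsigned char)(bits >> (8 * i));
    sha1_block(h, tail);
    if (total == 128)
        sha1_block(h, tail + 64);

    for (int i = 0; i < 5; i++)
        sprintf(hex + 8 * i, "%08x", (unsigned)h[i]);
}
```

Calling sha1_hex("abc", 3, buf) should leave the standard test vector
a9993e364706816aba3e25717850c26c9cd0d89d in buf.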

Best,

  OG.


___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev

Re: [Mesa-dev] [PATCH] glsl: improve accuracy of atan()

2014-10-10 Thread Olivier Galibert

Applied.

 OG.


On Fri, Sep 26, 2014 at 6:11 PM, Erik Faye-Lund  wrote:
> Our current atan()-approximation is pretty inaccurate at 1.0, so
> let's try to improve the situation by doing a direct approximation
> without going through asin.
>
> This new implementation uses an 11th degree polynomial to approximate
> atan in the [-1..1] range, and the following identity to reduce the
> entire range to [-1..1]:
>
> atan(x) = 0.5 * pi * sign(x) - atan(1.0 / x)
>
> This range-reduction idea is taken from the paper "Fast computation
> of Arctangent Functions for Embedded Applications: A Comparative
> Analysis" (Ukil et al. 2011).
>
> The polynomial that approximates atan(x) is:
>
> x   * 0.9999793128310355 - x^3  * 0.3326756418091246 +
> x^5 * 0.1938924977115610 - x^7  * 0.1173503194786851 +
> x^9 * 0.0536813784310406 - x^11 * 0.0121323213173444
>
> This polynomial was found with the following GNU Octave script:
>
> x = linspace(0, 1);
> y = atan(x);
> n = [1, 3, 5, 7, 9, 11];
> format long;
> polyfitc(x, y, n)
>
> The polyfitc function is not built-in, but too long to include here.
> It can be downloaded from the following URL:
>
> http://www.mathworks.com/matlabcentral/fileexchange/47851-constraint-polynomial-fit/content/polyfitc.m
>
> This fixes the following piglit test:
> shaders/glsl-const-folding-01
>
> Signed-off-by: Erik Faye-Lund 
> Reviewed-by: Ian Romanick 
> ---
>  src/glsl/builtin_functions.cpp | 65 
> +++---
>  1 file changed, 55 insertions(+), 10 deletions(-)
>
> diff --git a/src/glsl/builtin_functions.cpp b/src/glsl/builtin_functions.cpp
> index 9be7f6d..c126b60 100644
> --- a/src/glsl/builtin_functions.cpp
> +++ b/src/glsl/builtin_functions.cpp
> @@ -442,6 +442,7 @@ private:
> ir_swizzle *matrix_elt(ir_variable *var, int col, int row);
>
> ir_expression *asin_expr(ir_variable *x);
> +   void do_atan(ir_factory &body, const glsl_type *type, ir_variable *res, 
> operand y_over_x);
>
> /**
>  * Call function \param f with parameters specified as the linked
> @@ -2684,11 +2685,7 @@ builtin_builder::_atan2(const glsl_type *type)
>ir_factory outer_then(&outer_if->then_instructions, mem_ctx);
>
>/* Then...call atan(y/x) */
> -  ir_variable *y_over_x = outer_then.make_temp(glsl_type::float_type, 
> "y_over_x");
> -  outer_then.emit(assign(y_over_x, div(y, x)));
> -  outer_then.emit(assign(r, mul(y_over_x, rsq(add(mul(y_over_x, y_over_x),
> -                                                 imm(1.0f))))));
> -  outer_then.emit(assign(r, asin_expr(r)));
> +  do_atan(body, glsl_type::float_type, r, div(y, x));
>
>/* ...and fix it up: */
>ir_if *inner_if = new(mem_ctx) ir_if(less(x, imm(0.0f)));
> @@ -2711,17 +2708,65 @@ builtin_builder::_atan2(const glsl_type *type)
> return sig;
>  }
>
> +void
> +builtin_builder::do_atan(ir_factory &body, const glsl_type *type, 
> ir_variable *res, operand y_over_x)
> +{
> +   /*
> +* range-reduction, first step:
> +*
> +*  / y_over_x if |y_over_x| <= 1.0;
> +* x = <
> +*  \ 1.0 / y_over_x   otherwise
> +*/
> +   ir_variable *x = body.make_temp(type, "atan_x");
> +   body.emit(assign(x, div(min2(abs(y_over_x),
> +                                imm(1.0f)),
> +                           max2(abs(y_over_x),
> +                                imm(1.0f)))));
> +
> +   /*
> +* approximate atan by evaluating polynomial:
> +*
> +    * x   * 0.9999793128310355 - x^3  * 0.3326756418091246 +
> +* x^5 * 0.1938924977115610 - x^7  * 0.1173503194786851 +
> +* x^9 * 0.0536813784310406 - x^11 * 0.0121323213173444
> +*/
> +   ir_variable *tmp = body.make_temp(type, "atan_tmp");
> +   body.emit(assign(tmp, mul(x, x)));
> +   body.emit(assign(tmp, 
> mul(add(mul(sub(mul(add(mul(sub(mul(add(mul(imm(-0.0121323213173444f),
> + tmp),
> + 
> imm(0.0536813784310406f)),
> + tmp),
> + 
> imm(0.1173503194786851f)),
> + tmp),
> + imm(0.1938924977115610f)),
> + tmp),
> + imm(0.3326756418091246f)),
> + tmp),
> + imm(0.9999793128310355f)),
> + x)));
> +
> +   /* range-reduction fixup */
> +   body.emit(assign(tmp, add(tmp,
> + mul(b2f(greater(abs(y_over_x),
> +  imm(1.0f, type->components()))),
> +  add(mul(tmp,
> +  imm(-2.0f)),
> +  im
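
The range reduction and polynomial described in the quoted patch can be
sketched in plain C as follows.  This is only an illustration, not the IR
builder code from the patch.  Assumptions: the leading coefficient is
0.9999793128310355 (with that value the polynomial gives roughly pi/4 at
x = 1, as atan requires), the sign is restored at the end following the
identity in the commit message, and the names atan_poly and approx_atan
are made up here.

```c
#define HALF_PI 1.5707963267948966

/* 11th degree odd polynomial in Horner form, intended for 0 <= x <= 1. */
static double atan_poly(double x)
{
    double x2 = x * x;
    double p = -0.0121323213173444;
    p = p * x2 + 0.0536813784310406;
    p = p * x2 - 0.1173503194786851;
    p = p * x2 + 0.1938924977115610;
    p = p * x2 - 0.3326756418091246;
    p = p * x2 + 0.9999793128310355;
    return p * x;
}

double approx_atan(double y_over_x)
{
    double a = y_over_x < 0.0 ? -y_over_x : y_over_x;

    /* Range reduction: x = min(|v|, 1) / max(|v|, 1), always in [0, 1]. */
    double x = (a < 1.0 ? a : 1.0) / (a > 1.0 ? a : 1.0);
    double r = atan_poly(x);

    /* Fixup: for |v| > 1 we evaluated atan(1/|v|), so use pi/2 - r. */
    if (a > 1.0)
        r = HALF_PI - r;

    /* atan is odd, so put the sign back. */
    return y_over_x < 0.0 ? -r : r;
}
```

The min/max form computes both branches of the range reduction without a
conditional, which is presumably why the patch builds it that way.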

Re: [Mesa-dev] [PATCH] glsl/glsl_parser_extras: Handle GLSL 4.50

2014-10-03 Thread Olivier Galibert

Sorry for not replying earlier, I didn't see your answer.

On Thu, Sep 4, 2014 at 12:33 AM, Matt Turner  wrote:
> Did you change the leading whitespace on purpose?

Not really, I can un-change that.  I have an emacs config that's
supposedly what mesa wants, but it may be incorrect.

>> -   } supported_versions[12];
>> +   } supported_versions[14];
>
> Where does this number come from, and can we make it a little clearer
> what it is?

It's the maximum number of simultaneous glsl versions that a driver
can support.  With the current code it should be 13; that is, if a
driver supports glsl 440 it's going to segfault by writing past the end
of the array.  Supporting 450 needs one more.

Note that it's not (yet) a security issue since the largest we support is 330.
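
One way to make the number self-explanatory would be to derive it from the
version tables instead of hard-coding it.  The following is a standalone C
sketch, not Mesa code.  Assumptions: the two extra slots are for the ES
versions 100 and 300, and the names version_list, add_version and
fill_versions are invented for the example.

```c
#include <stdbool.h>
#include <stddef.h>

#define ARRAY_SIZE(a) (sizeof(a) / sizeof((a)[0]))

static const unsigned known_desktop_glsl_versions[] =
   { 110, 120, 130, 140, 150, 330, 400, 410, 420, 430, 440, 450 };

/* Assumption: these are the ES versions that can be listed as well. */
static const unsigned known_es_glsl_versions[] = { 100, 300 };

/* The array bound follows the tables, so adding a version cannot overflow. */
#define MAX_SUPPORTED_VERSIONS \
   (ARRAY_SIZE(known_desktop_glsl_versions) + \
    ARRAY_SIZE(known_es_glsl_versions))

struct version_list {
   size_t count;
   struct {
      unsigned ver;
      bool es;
   } v[MAX_SUPPORTED_VERSIONS];
};

/* Append one version; refuse instead of writing past the end. */
bool add_version(struct version_list *l, unsigned ver, bool es)
{
   if (l->count >= MAX_SUPPORTED_VERSIONS)
      return false;
   l->v[l->count].ver = ver;
   l->v[l->count].es = es;
   l->count++;
   return true;
}

/* List every desktop version up to max_desktop, then the ES versions. */
size_t fill_versions(struct version_list *l, unsigned max_desktop)
{
   l->count = 0;
   for (size_t i = 0; i < ARRAY_SIZE(known_desktop_glsl_versions); i++)
      if (known_desktop_glsl_versions[i] <= max_desktop)
         add_version(l, known_desktop_glsl_versions[i], false);
   for (size_t i = 0; i < ARRAY_SIZE(known_es_glsl_versions); i++)
      add_version(l, known_es_glsl_versions[i], true);
   return l->count;
}
```

With the 450 entry in the table the bound comes out as 14, matching the
patch, and the explicit check turns an overflow into a refused insert.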

Best,

  OG.
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev

Re: [Mesa-dev] [PATCH] mesa/{version, getstring}: Future-proof version handling

2014-08-23 Thread Olivier Galibert

Are we that far?

  OG.


On Sat, Aug 23, 2014 at 7:22 PM, Ian Romanick  wrote:
> I'm content with waiting to add this until we're even close to
> supporting any of those versions... especially given all the lines like
> "false && // ARB_gpu_shader_fp64 &&".  That's just clutter.
>
> On 08/21/2014 05:02 AM, Olivier Galibert wrote:
>> Signed-off-by: Olivier Galibert 
>> ---
>>  src/mesa/main/getstring.c |   6 ++
>>  src/mesa/main/version.c   | 140 
>> +-
>>  2 files changed, 143 insertions(+), 3 deletions(-)
>>
>> diff --git a/src/mesa/main/getstring.c b/src/mesa/main/getstring.c
>> index 431d60b..f9d13a7 100644
>> --- a/src/mesa/main/getstring.c
>> +++ b/src/mesa/main/getstring.c
>> @@ -58,6 +58,12 @@ shading_language_version(struct gl_context *ctx)
>>   return (const GLubyte *) "4.10";
>>case 420:
>>   return (const GLubyte *) "4.20";
>> +  case 430:
>> + return (const GLubyte *) "4.30";
>> +  case 440:
>> + return (const GLubyte *) "4.40";
>> +  case 450:
>> + return (const GLubyte *) "4.50";
>>default:
>>   _mesa_problem(ctx,
>> "Invalid GLSL version in 
>> shading_language_version()");
>> diff --git a/src/mesa/main/version.c b/src/mesa/main/version.c
>> index 4dea530..c7a2381 100644
>> --- a/src/mesa/main/version.c
>> +++ b/src/mesa/main/version.c
>> @@ -290,7 +290,122 @@ compute_version(const struct gl_extensions *extensions,
>>extensions->EXT_texture_swizzle);
>>/* ARB_sampler_objects is always enabled in 
>> mesa */
>>
>> -   if (ver_3_3) {
>> +   const GLboolean ver_4_0 = (ver_3_3 &&
>> +  consts->GLSLVersion >= 400 &&
>> +  extensions->ARB_draw_buffers_blend &&
>> +  extensions->ARB_draw_indirect &&
>> +  extensions->ARB_gpu_shader5 &&
>> +  false && // ARB_gpu_shader_fp64 &&
>> +  extensions->ARB_sample_shading &&
>> +  false && // ARB_shader_subroutine
>> +  false && // ARB_tesselation_shader
>> +  extensions->ARB_texture_buffer_object_rgb32 &&
>> +  extensions->ARB_texture_cube_map_array &&
>> +  extensions->ARB_texture_gather &&
>> +  extensions->ARB_texture_query_lod &&
>> +  extensions->ARB_transform_feedback2 &&
>> +  extensions->ARB_transform_feedback3);
>> +
>> +   const GLboolean ver_4_1 = (ver_4_0 &&
>> +  consts->GLSLVersion >= 410 &&
>> +  extensions->ARB_ES2_compatibility &&
>> +  false && // ARB_shader_precision
>> +  false && // ARB_vertex_attrib_64bit
>> +  extensions->ARB_viewport_array);
>> +  /* ARB_get_program_binary and
>> + ARB_separate_shader_objects are always 
>> enabled in mesa */
>> +
>> +   const GLboolean ver_4_2 = (ver_4_1 &&
>> +  consts->GLSLVersion >= 420 &&
>> +  extensions->ARB_texture_compression_bptc &&
>> +  extensions->ARB_shader_atomic_counters &&
>> +  extensions->ARB_transform_feedback_instanced 
>> &&
>> +  extensions->ARB_base_instance &&
>> +  extensions->ARB_shader_image_load_store &&
>> +  extensions->ARB_conservative_depth &&
>> +  extensions->ARB_shading_language_420pack &&
>> +  extensions->ARB_internalformat_query);
>> +  /* ARB_compressed_texture_pixel_storage,
>> +

Re: [Mesa-dev] [RFC PATCH 00/16] A new IR for Mesa

2014-08-22 Thread Olivier Galibert

In that case staying as close as possible to spir may make sense?

  OG.


On Fri, Aug 22, 2014 at 5:08 AM, Dave Airlie  wrote:
> On 22 August 2014 12:46, Jason Ekstrand  wrote:
>> On Thu, Aug 21, 2014 at 7:36 PM, Dave Airlie  wrote:
>>>
>>> On 21 August 2014 19:10, Henri Verbeet  wrote:
>>> > On 21 August 2014 04:56, Michel Dänzer  wrote:
>>> >> On 21.08.2014 04:29, Henri Verbeet wrote:
>>> >>> For whatever it's worth, I have been avoiding radeonsi in part because
>>> >>> of the LLVM dependency. Some of the other issues already mentioned
>>> >>> aside, I also think it makes it just painful to do bisects over
>>> >>> moderate/longer periods of time.
>>> >>
>>> >> More painful, sure, but not too bad IME. In particular, if you know the
>>> >> regression is in Mesa, you can always use a stable release of LLVM for
>>> >> the bisect. You only need to change the --with-llvm-prefix= parameter
>>> >> to
>>> >> Mesa's configure for that. Of course, it could still be mildly painful
>>> >> if you need to go so far back that the current stable LLVM release
>>> >> wasn't supported yet. But how often does that happen? Very rarely for
>>> >> me.
>>> >>
>>> > Sure, it's not impossible, but is that really the kind of process you
>>> > want users to go through when bisecting a regression? Perhaps throw in
>>> > building 32-bit versions of both Mesa and LLVM on 64-bit as well if
>>> > they want to run 32-bit applications.
>>> >
>>> >> Without LLVM, I'm not sure there would be a driver you could avoid. :)
>>> >>
>>> > R600g didn't really exist either, and that one seems to have worked
>>> > out fine. I think in a large part because of work done by Jerome and
>>> > Dave in the early days, but regardless. From what I've seen from SI, I
>>> > don't think radeonsi needed to be a separate driver to start with, and
>>> > while its ISA is certainly different from R600-Cayman, it doesn't
>>> > particularly strike me as much harder to work with.
>>> >
> >>> > Back to the more immediate topic though, I think that on
>>> > occasion the discussion is framed as "Is there any reason using LLVM
>>> > IR wouldn't work?", while it would perhaps be more appropriate to
>>> > think of as "Would using LLVM IR provide enough advantages to justify
>>> > adding a LLVM dependency to core Mesa?".
>>>
>>> Could we use an llvm compatible IR? is also a question I'd like to see
>>> answered.
>>
>>
>> What do you mean by llvm compatible?  Do you mean forking their IR inside
>> mesa or just something that's easy to translate back and forth?
>>
>
> Importing/forking the llvm IR code with a different symbol set, and
> trying to not intentionally
> be incompatible with their llvm.
>
> Dave.
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev

[Mesa-dev] [PATCH] glsl/glsl_parser_extras: Handle GLSL 4.50

2014-08-22 Thread Olivier Galibert

Signed-off-by: Olivier Galibert 
---
 src/glsl/glsl_parser_extras.cpp | 2 +-
 src/glsl/glsl_parser_extras.h   | 2 +-
 2 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/src/glsl/glsl_parser_extras.cpp b/src/glsl/glsl_parser_extras.cpp
index 490c3c8..87d4846 100644
--- a/src/glsl/glsl_parser_extras.cpp
+++ b/src/glsl/glsl_parser_extras.cpp
@@ -50,7 +50,7 @@ glsl_compute_version_string(void *mem_ctx, bool is_es, unsigned version)
 
 
 static const unsigned known_desktop_glsl_versions[] =
-   { 110, 120, 130, 140, 150, 330, 400, 410, 420, 430, 440 };
+  { 110, 120, 130, 140, 150, 330, 400, 410, 420, 430, 440, 450 };
 
 
 _mesa_glsl_parse_state::_mesa_glsl_parse_state(struct gl_context *_ctx,
diff --git a/src/glsl/glsl_parser_extras.h b/src/glsl/glsl_parser_extras.h
index c8b9478..cd252f1 100644
--- a/src/glsl/glsl_parser_extras.h
+++ b/src/glsl/glsl_parser_extras.h
@@ -215,7 +215,7 @@ struct _mesa_glsl_parse_state {
struct {
   unsigned ver;
   bool es;
-   } supported_versions[12];
+   } supported_versions[14];
 
bool es_shader;
unsigned language_version;
-- 
2.1.0

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev

[Mesa-dev] [PATCH] mapi/glapi/gen/gl_API.xml: Summer cleanup.

2014-08-22 Thread Olivier Galibert

This adds all the extension names and numbers, adds some missing
numbers and fixes the order in places.  Future extension additions
should be slightly easier, since there is no longer any need to work
out where each one should go.

Signed-off-by: Olivier Galibert 
---
 src/mapi/glapi/gen/gl_API.xml | 804 ++
 1 file changed, 578 insertions(+), 226 deletions(-)

diff --git a/src/mapi/glapi/gen/gl_API.xml b/src/mapi/glapi/gen/gl_API.xml
index 73f2f75..e91f37e 100644
--- a/src/mapi/glapi/gen/gl_API.xml
+++ b/src/mapi/glapi/gen/gl_API.xml
@@ -7443,7 +7446,7 @@
  parameter was in the NV functions.  When this error was discovered
  and fixed, there was already at least one implementation of
  GLX protocol for ARB_vertex_program, but there were no
- implementations of NV_vertex_program.  The sollution was to renumber
+ implementations of NV_vertex_program.  The solution was to renumber
  the opcodes for NV_vertex_program and convert the unused field in
  the ARB_vertex_program protocol to unused padding.
   -->
[Other hunks omitted: the archive stripped the XML markup from the diff body.]

[Mesa-dev] [PATCH] mesa/{version, getstring}: Future-proof version handling

2014-08-21 Thread Olivier Galibert

Signed-off-by: Olivier Galibert 
---
 src/mesa/main/getstring.c |   6 ++
 src/mesa/main/version.c   | 140 +-
 2 files changed, 143 insertions(+), 3 deletions(-)

diff --git a/src/mesa/main/getstring.c b/src/mesa/main/getstring.c
index 431d60b..f9d13a7 100644
--- a/src/mesa/main/getstring.c
+++ b/src/mesa/main/getstring.c
@@ -58,6 +58,12 @@ shading_language_version(struct gl_context *ctx)
  return (const GLubyte *) "4.10";
   case 420:
  return (const GLubyte *) "4.20";
+  case 430:
+ return (const GLubyte *) "4.30";
+  case 440:
+ return (const GLubyte *) "4.40";
+  case 450:
+ return (const GLubyte *) "4.50";
   default:
  _mesa_problem(ctx,
"Invalid GLSL version in shading_language_version()");
diff --git a/src/mesa/main/version.c b/src/mesa/main/version.c
index 4dea530..c7a2381 100644
--- a/src/mesa/main/version.c
+++ b/src/mesa/main/version.c
@@ -290,7 +290,122 @@ compute_version(const struct gl_extensions *extensions,
   extensions->EXT_texture_swizzle);
   /* ARB_sampler_objects is always enabled in mesa */
 
-   if (ver_3_3) {
+   const GLboolean ver_4_0 = (ver_3_3 &&
+  consts->GLSLVersion >= 400 &&
+  extensions->ARB_draw_buffers_blend &&
+  extensions->ARB_draw_indirect &&
+  extensions->ARB_gpu_shader5 &&
+  false && // ARB_gpu_shader_fp64 &&
+  extensions->ARB_sample_shading &&
+  false && // ARB_shader_subroutine
+  false && // ARB_tesselation_shader
+  extensions->ARB_texture_buffer_object_rgb32 &&
+  extensions->ARB_texture_cube_map_array &&
+  extensions->ARB_texture_gather &&
+  extensions->ARB_texture_query_lod &&
+  extensions->ARB_transform_feedback2 &&
+  extensions->ARB_transform_feedback3);
+
+   const GLboolean ver_4_1 = (ver_4_0 &&
+  consts->GLSLVersion >= 410 &&
+  extensions->ARB_ES2_compatibility &&
+  false && // ARB_shader_precision
+  false && // ARB_vertex_attrib_64bit
+  extensions->ARB_viewport_array);
+  /* ARB_get_program_binary and
+ ARB_separate_shader_objects are always enabled in mesa */
+
+   const GLboolean ver_4_2 = (ver_4_1 &&
+  consts->GLSLVersion >= 420 &&
+  extensions->ARB_texture_compression_bptc &&
+  extensions->ARB_shader_atomic_counters &&
+  extensions->ARB_transform_feedback_instanced &&
+  extensions->ARB_base_instance &&
+  extensions->ARB_shader_image_load_store &&
+  extensions->ARB_conservative_depth &&
+  extensions->ARB_shading_language_420pack &&
+  extensions->ARB_internalformat_query);
+  /* ARB_compressed_texture_pixel_storage,
+ ARB_texture_storage and
+ ARB_map_buffer_alignment are always enabled in mesa */
+
+   const GLboolean ver_4_3 = (ver_4_2 &&
+  consts->GLSLVersion >= 430 &&
+  false && // ARB_arrays_of_arrays
+  extensions->ARB_ES3_compatibility &&
+  extensions->ARB_compute_shader &&
+  extensions->ARB_copy_image &&
+  extensions->ARB_explicit_uniform_location &&
+  extensions->ARB_fragment_layer_viewport &&
+  false && // ARB_framebuffer_no_attachments
+  false && // ARB_internalformat_query2
+  extensions->ARB_draw_indirect &&
+  false && // ARB_program_interface_query
+  false &&

Re: [Mesa-dev] [RFC PATCH 00/16] A new IR for Mesa

2014-08-20 Thread Olivier Galibert

And don't forget that explicit vec4 becomes immensely amusing once you
add fp64/double to the problem.

  OG.


On Wed, Aug 20, 2014 at 4:01 PM, Francisco Jerez  wrote:
> Connor Abbott  writes:
>
>> On Tue, Aug 19, 2014 at 11:33 PM, Francisco Jerez  
>> wrote:
>>> Connor Abbott  writes:
>>>
>>>> On Tue, Aug 19, 2014 at 11:40 AM, Francisco Jerez wrote:
> Tom Stellard  writes:
>
>> On Tue, Aug 19, 2014 at 11:04:59AM -0400, Connor Abbott wrote:
>>> On Mon, Aug 18, 2014 at 8:52 PM, Michel Dänzer  
>>> wrote:
>>> > On 19.08.2014 01:28, Connor Abbott wrote:
>>> >> On Mon, Aug 18, 2014 at 4:32 AM, Michel Dänzer  
>>> >> wrote:
>>> >>> On 16.08.2014 09:12, Connor Abbott wrote:
>>> >>>> I know what you might be thinking right now. "Wait, *another* IR?
>>> >>>> Don't we already have like 5 of those, not counting all the
>>> >>>> driver-specific ones? Isn't this stuff complicated enough already?"
>>> >>>> Well, there are some pretty good reasons to start afresh (again...).
>>> >>>> In the years we've been using GLSL IR, we've come to realize that,
>>> >>>> in fact, it's not what we want *at all* to do optimizations on.
>>> >>>
>>> >>> Did you evaluate using LLVM IR instead of inventing yet another one?
>>> >>>
>>> >>>
>>> >>> --
>>> >>> Earthling Michel Dänzer            |                  http://www.amd.com
>>> >>> Libre software enthusiast          |                Mesa and X developer
>>> >>
>>> >> Yes. See
>>> >>
>>> >> http://lists.freedesktop.org/archives/mesa-dev/2014-February/053502.html
>>> >>
>>> >> and
>>> >>
>>> >> http://lists.freedesktop.org/archives/mesa-dev/2014-February/053522.html
>>> >
>>> > I know Ian can't deal with LLVM for some reason. I was wondering if
>>> > *you* evaluated it, and if so, why you rejected it.
>>> >
>>> >
>>> > --
>>> > Earthling Michel Dänzer            |                  http://www.amd.com
>>> > Libre software enthusiast          |                Mesa and X developer
>>>
>>>
>>> Well, first of all, the fact that Ian and Ken don't want to use it
>>> means that any plan to use LLVM for the Intel driver is dead in the
>>> water anyways - you can translate NIR into LLVM if you want, but for
>>> i965 we want to share optimizations between our 2 backends (FS and
>>> vec4) that we can't do today in GLSL IR so this is what we want to use
>>> for that, and since nobody else does anything with the core GLSL
>>> compiler except when they have to, when we start moving things out of
>>> GLSL IR this will probably replace GLSL IR as the infrastructure that
>>> all Mesa drivers use. But with that in mind, here are a few reasons
>>> why we wouldn't want to use LLVM:
>>>
>>> * LLVM wasn't built to understand structured CFG's, meaning that you
>>> need to re-structurize it using a pass that's fragile and prone to
>>> break if some other pass "optimizes" the shader in a way that makes it
>>> non-structured (i.e. not expressible in terms of loops and if
>>> statements). This loss of information also means that passes that need
>>> to know things like, for example, the loop nesting depth need to do an
>>> analysis pass whereas with NIR you can just walk up the control flow
>>> tree and count the number of loops we hit.
>>>
>>
>> LLVM has a pass to structurize the CFG.  We use it in the radeon
>> drivers, and it is run after all of the other LLVM optimizations which 
>> have
>> no concept of structured CFG.  It's not bug free, but it works really
>> well even with all of the complex OpenCL kernels we throw at it.
>>
>> Your point about losing information when the CFG is de-structurized is
>> valid, but for things like loop depth, I'm not sure why we couldn't 
>> write an
>> LLVM analysis pass for this (if one doesn't already exist).
>>
>
> I don't think this is such a big deal either.  At least the
> structurization pass used on newer AMD hardware isn't "fragile" in the
> way you seem to imply -- AFAIK (unlike the old AMDIL heuristic
> algorithm) it's guaranteed to give you a valid structurized output no
> matter what the previous optimization passes have done to the CFG,
> modulo bugs.  I admit that the situation is nevertheless suboptimal.
> Ideally this information wouldn't get lost along the way.  For the long
> term we may want to represent structured control flow directly in the IR
> as you say, I just don't see how reinventing the IR saves us any work if
> we could just fix the existing one.

>>>> It seems to me that something like how we represent control flow is a
>>>> pretty fundamental part of the IR - it affects any optimization pas

Re: [Mesa-dev] [PATCH 8/8] mesa: simplify _mesa_update_draw_buffers()

2014-08-19 Thread Olivier Galibert

  Hi,

That patch makes glDrawBuffers(0, NULL); segfault because
_mesa_drawbuffers expects buffers[0] to be valid.  Note that the bug
is there, but I'm not sure what the final setup should look like in
that case.

Best,

  OG.

PS: reported by haagch on irc


On Fri, Aug 8, 2014 at 11:20 PM, Brian Paul  wrote:
> There's no need to copy the array of DrawBuffer enums to a temp array.
> ---
>  src/mesa/main/buffers.c |9 ++---
>  1 file changed, 2 insertions(+), 7 deletions(-)
>
> diff --git a/src/mesa/main/buffers.c b/src/mesa/main/buffers.c
> index 6b4fac9..140cf6e 100644
> --- a/src/mesa/main/buffers.c
> +++ b/src/mesa/main/buffers.c
> @@ -567,16 +567,11 @@ _mesa_drawbuffers(struct gl_context *ctx, GLuint n, 
> const GLenum *buffers,
>  void
>  _mesa_update_draw_buffers(struct gl_context *ctx)
>  {
> -   GLenum buffers[MAX_DRAW_BUFFERS];
> -   GLuint i;
> -
> /* should be a window system FBO */
> assert(_mesa_is_winsys_fbo(ctx->DrawBuffer));
>
> -   for (i = 0; i < ctx->Const.MaxDrawBuffers; i++)
> -  buffers[i] = ctx->Color.DrawBuffer[i];
> -
> -   _mesa_drawbuffers(ctx, ctx->Const.MaxDrawBuffers, buffers, NULL);
> +   _mesa_drawbuffers(ctx, ctx->Const.MaxDrawBuffers,
> + ctx->Color.DrawBuffer, NULL);
>  }
>
>
> --
> 1.7.10.4
>
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev

Re: [Mesa-dev] [Intel-gfx] [PATCH 1/5] intel gen4/5: fix GL_VERTEX_PROGRAM_TWO_SIDE.

2012-07-30 Thread Olivier Galibert

On Mon, Jul 30, 2012 at 10:30:57AM -0700, Eric Anholt wrote:
> I'm perfectly fine with the VUE containing slots for both when the app
> has gone out of its way to ask for deprecated two-sided color
> rendering.

Are you also ok with recompiling the shaders when that enable is
switched?

  OG.

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev

Re: [Mesa-dev] [PATCH 1/5] intel gen4/5: fix GL_VERTEX_PROGRAM_TWO_SIDE.

2012-07-29 Thread Olivier Galibert

On Tue, Jul 17, 2012 at 07:37:43AM -0700, Paul Berry wrote:
> If possible, I would still like to think of a way to address this situation
> that (a) doesn't require modifying both fragment shader back-ends and the
> SF program, and (b) helps all Mesa drivers, not just Intel Gen4-5.
> Especially because I suspect we may have bugs in Gen6-7 related to this
> situation. 

You don't :-) It's correctly handled in
gen6_sf_state.c::get_attr_override with similar semantics too.

> Would you be happy with one of the following two alternatives?
> 
> 1. In the GLSL front-end, if we detect that a vertex shader writes to
> gl_BackColor but not gl_FrontColor, then automatically insert
> "gl_FrontColor = 0;" into the shader.  This will guarantee that whenever
> gl_BackColor is written, gl_FrontColor is too.
> 
> 2. In the function brw_compute_vue_map(), assign a VUE slot for
> VERT_RESULT_COL0 whenever *either* VERT_RESULT_COL0 or VERT_RESULT_BFC0 is
> used.  This will guarantee that we always have a VUE slot available for
> front color, so we don't have to be as tricky in the FS and SF code.

With both methods the SF code is not really simplified.  Doing the mov
without testing would require writing to/reserving a slot for
gl_BackColor if gl_FrontColor is written to, which wouldn't be
acceptable.  And to write to/reserve a slot for the two of them if
gl_Color is read in any case.  Probably unacceptable.  So the need_*
stuff is going to stay in any case :/

So the only simplification would be in the fs/wm and I'm somewhat
afraid of having a vue slot that's not in outputs_written of the
previous stage.  They seem to be expected to be equivalent.

> This morning I'll try to ask some other Intel folks for their opinion on
> the subject.

Did they have an opinion?

Best,

  OG.
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev

Re: [Mesa-dev] [PATCH 7/7] ir_to_mesa: Don't set component for ir_dereference in ir_quadop_vector

2012-07-27 Thread Olivier Galibert

On Fri, Jul 27, 2012 at 10:49:25AM -0700, Kenneth Graunke wrote:
> From: Ian Romanick 
> 
> There can only be one variable used in an ir_quadop_vector.  Accesses
> of this variable must be swizzled.

There's nothing anywhere ensuring the presence of the swizzle.  I
completely agree that trashing the components value is a bad idea, but
you should have SWIZZLE_X as a default value for the components array.

Amusingly enough, it's already the case (SWIZZLE_X==0), so making it
explicit would be perfect.

  OG.
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev

Re: [Mesa-dev] [PATCH 6/7] glsl: Fix ir_last_opcode value.

2012-07-27 Thread Olivier Galibert

On Fri, Jul 27, 2012 at 10:49:24AM -0700, Kenneth Graunke wrote:
> From: Ian Romanick 
> 
> Now that ir_quadop_vector exists, ir_last_binop and ir_last_opcode are
> no longer the same.  Only one place currently uses this enumeration, and
> already handles ir_quadop_vector correctly.
> 
> Signed-off-by: Ian Romanick 
> Signed-off-by: Kenneth Graunke 
> ---
>  src/glsl/ir.h | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/src/glsl/ir.h b/src/glsl/ir.h
> index e2743f6..a69494f 100644
> --- a/src/glsl/ir.h
> +++ b/src/glsl/ir.h
> @@ -1027,7 +1027,7 @@ enum ir_expression_operation {
> /**
>  * A sentinel marking the last of all operations.
>  */
> -   ir_last_opcode = ir_last_binop
> +   ir_last_opcode = ir_quadop_vector
>  };

Another obvious-in-hindsight bugfix.

Reviewed-by: Olivier Galibert 
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev

Re: [Mesa-dev] [PATCH 5/7] glsl: Add "typeless" constructor for quadop ir_expressions

2012-07-27 Thread Olivier Galibert

On Fri, Jul 27, 2012 at 10:49:23AM -0700, Kenneth Graunke wrote:
> From: Ian Romanick 
> 
> This matches the typeless constructors for unop and binop
> ir_expressions.
> 
> Signed-off-by: Ian Romanick 
> Reviewed-by: Kenneth Graunke 
> ---
>  src/glsl/ir.cpp | 17 +
>  src/glsl/ir.h   |  2 ++
>  2 files changed, 19 insertions(+)
> 
> diff --git a/src/glsl/ir.cpp b/src/glsl/ir.cpp
> index b0e38d8..5faf34a 100644
> --- a/src/glsl/ir.cpp
> +++ b/src/glsl/ir.cpp
> @@ -236,6 +236,23 @@ ir_expression::ir_expression(int op, const struct 
> glsl_type *type,
> this->operands[3] = op3;
>  }
>  
> +ir_expression::ir_expression(int op, ir_rvalue *op0, ir_rvalue *op1,
> +  ir_rvalue *op2, ir_rvalue *op3)
> +{
> +   assert(op0->type->is_scalar());
> +   assert((op0->type == op1->type)
> +   && (op0->type == op2->type)
> +   && (op0->type == op3->type));
> +
> +   this->ir_type = ir_type_expression;
> +   this->type = glsl_type::get_instance(op0->type->base_type, 4, 1);
> +   this->operation = ir_expression_operation(op);


You're hardcoding ir_quadop_vector's properties here.  A comment
saying so could be useful, if other quadops with different properties
happen someday.  In fact, you're hardcoding them so hard passing "op"
may not make sense.  A static method
   ir_expression *ir_expression::build_quadop_vector
perhaps?
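
A rough sketch of what such a named factory could look like.  The toy_*
types are stand-ins invented for this illustration, not the real Mesa
classes, and the signature is only an assumption about the shape of the
suggestion:

```cpp
#include <cassert>

// Toy stand-ins; none of these are the real Mesa classes.
struct toy_type {
   int base_type;
   unsigned rows;      // vector elements
   unsigned columns;   // matrix columns
   bool is_scalar() const { return rows == 1 && columns == 1; }
};

struct toy_rvalue {
   toy_type type;
};

struct toy_expression {
   toy_type type;
   toy_rvalue *operands[4];

   // Named factory: the "four scalars of one base type become a
   // 4-component vector" rule is stated here instead of being hidden
   // behind a generic (op, op0..op3) constructor.
   static toy_expression build_quadop_vector(toy_rvalue *op0, toy_rvalue *op1,
                                             toy_rvalue *op2, toy_rvalue *op3)
   {
      toy_rvalue *ops[4] = { op0, op1, op2, op3 };
      toy_expression e;
      for (int i = 0; i < 4; i++) {
         assert(ops[i]->type.is_scalar());
         assert(ops[i]->type.base_type == op0->type.base_type);
         e.operands[i] = ops[i];
      }
      e.type = toy_type{ op0->type.base_type, 4u, 1u };
      return e;
   }
};
```

The point of the design is that no opcode argument is left for a caller
to get wrong.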

  OG.

Re: [Mesa-dev] [PATCH 4/7] glsl: Request an Nx1 type instance in ir_quadop_vector lowering pass.

2012-07-27 Thread Olivier Galibert

On Fri, Jul 27, 2012 at 10:49:22AM -0700, Kenneth Graunke wrote:
> From: Ian Romanick 
> 
> No types have 0 columns.  The glsl_type::get_instance method contains
> 
>if ((rows < 1) || (rows > 4) || (columns < 1) || (columns > 4))
>   return error_type;
> 
> To get a vector, use columns = 1.

Reviewed-by: Olivier Galibert 

That's an obvious bugfix.  If there's a stable branch with the glsl
compiler in, it probably should go there.
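
To make the failure mode concrete, a small sketch that reproduces only
the range check quoted above.  is_valid_shape() and is_vector_shape()
are hypothetical helpers, not glsl_type methods:

```cpp
#include <cassert>

// Same range test as the quoted get_instance() snippet.
inline bool is_valid_shape(unsigned rows, unsigned columns)
{
   return !((rows < 1) || (rows > 4) || (columns < 1) || (columns > 4));
}

// A vector is requested as N rows by exactly one column.
inline bool is_vector_shape(unsigned rows, unsigned columns)
{
   return is_valid_shape(rows, columns) && rows > 1 && columns == 1;
}
```

Asking for 4x0 fails the check, while 4x1 passes and describes a
4-component vector.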

  OG.


Re: [Mesa-dev] [PATCH 2/7] glsl: Add glsl_type::get_sampler_instance method.

2012-07-27 Thread Olivier Galibert

On Fri, Jul 27, 2012 at 10:49:20AM -0700, Kenneth Graunke wrote:
> +/**
> + * Convert sampler type attributes into an index in the sampler_types array
> + */
> +#define SAMPLER_TYPE_INDEX(dim, sample_type, array, shadow)  \
> +   ((unsigned(dim) * 12) + (sample_type * 4) + (unsigned(array) * 2) \
> +    + unsigned(shadow))
> +
> +/**
> + * \note
> + * Arrays like this are \b the argument for C99-style designated initializers.
> + * Too bad C++ and VisualStudio are too cool for that sort of useful
> + * functionality.
> + */
> +const glsl_type *const glsl_type::sampler_types[] = {

Did you think about using a 4-dimensional array and letting the compiler
take care of the multiplies?  It may not be that much more readable though.
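
A sketch of the equivalence, assuming the inner extents 3 x 2 x 2
implied by the 12/4/2/1 strides in the quoted macro.  NUM_DIMS and the
sampler_table type are made up for the illustration:

```cpp
#include <cassert>

// Copy of the quoted macro.
#define SAMPLER_TYPE_INDEX(dim, sample_type, array, shadow)  \
   ((unsigned(dim) * 12) + (sample_type * 4) + (unsigned(array) * 2) \
    + unsigned(shadow))

// Hypothetical number of sampler dimensions; only the inner extents
// (3 sample types, array yes/no, shadow yes/no) follow from the macro.
const unsigned NUM_DIMS = 6;
const unsigned NUM_SAMPLE_TYPES = 3;

struct sampler_table {
   // The compiler derives the strides from the array extents.
   unsigned slot[NUM_DIMS][NUM_SAMPLE_TYPES][2][2];

   sampler_table()
   {
      // Number the slots in plain nested-loop (row-major) order.
      unsigned next = 0;
      for (unsigned d = 0; d < NUM_DIMS; d++)
         for (unsigned s = 0; s < NUM_SAMPLE_TYPES; s++)
            for (unsigned a = 0; a < 2; a++)
               for (unsigned h = 0; h < 2; h++)
                  slot[d][s][a][h] = next++;
   }
};
```

Every slot[d][s][a][h] then holds exactly the value the macro computes,
so the hand-written multiplies and the 4-dimensional indexing address
the same element.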



> +   /* GLSL_SAMPLER_DIM_1D */
> +   &builtin_130_types[10],  /* uint */
> +   NULL,/* uint, shadow */

What does NULL mean?

  OG.

Re: [Mesa-dev] [PATCH 1/7] glsl: Make bvec and ivec types accessible without using get_instance.

2012-07-27 Thread Olivier Galibert

On Fri, Jul 27, 2012 at 10:49:19AM -0700, Kenneth Graunke wrote:
> It's more convenient to use shortcuts like glsl_type::bvec2_type than
> the longwinded glsl_type::get_instance(GLSL_TYPE_BOOL, 2, 1).

Yay, code in zones I understand :-)

Reviewed-by: Olivier Galibert 

Re: [Mesa-dev] Support for EXT/ARB_geometry_shader4

2012-07-27 Thread Olivier Galibert

On Fri, Jul 27, 2012 at 10:40:28AM -0500, Bryan Cain wrote:
> https://github.com/Plombo/mesa/tree/geometry-shaders .

Quick remarks from a fast read:
- you missed draw_pipe_clip.c:clip_init_state, where you need to plug
  in the gs info where appropriate.  Should be easy.  It will take
  care of the interpolation-on-clipping issues you currently have even
  if you don't know you have them :-)

- starting with 4.0, EmitVertex and EndPrimitive are in fact
  EmitStreamVertex(0) and EndStreamPrimitive(0).  It may be a good
  idea to implement the stream version at the ir_* level, even if the
  first implementation just ignores the parameter.

- all the is_*_shader boolean variables should probably be an integer
  "shader type" variable, since there will be two more types to add for
  4.0.

- I'm not sure we want to use _ARB versions of constants when the
  suffix-less versions exist and have the same value.

- cross_validate_outputs_to_inputs could use some kind of
  const char *_mesa_get_shader_type_string(gl_shader *sh) from
  somewhere like shaderapi.h.  We'll need two more shader types soon.
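
The two shader-type remarks above can be sketched together.  The enum
and the helper below are hypothetical names, not existing Mesa API; the
real helper would take a gl_shader pointer as suggested:

```cpp
#include <cassert>
#include <cstring>

// One integer "shader type" instead of a set of is_*_shader booleans.
enum toy_shader_type {
   TOY_SHADER_VERTEX,
   TOY_SHADER_GEOMETRY,
   TOY_SHADER_FRAGMENT,
   TOY_SHADER_TYPE_COUNT   // new stages only need a new entry here
};

// Single place that turns a shader type into text for error messages.
inline const char *toy_shader_type_string(toy_shader_type type)
{
   switch (type) {
   case TOY_SHADER_VERTEX:   return "vertex";
   case TOY_SHADER_GEOMETRY: return "geometry";
   case TOY_SHADER_FRAGMENT: return "fragment";
   default:                  return "unknown";
   }
}
```

Adding the two extra 4.0 stages would then touch the enum and the
switch, not every caller.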

I'll see how hard the Intel gen4 support looks to be; it shouldn't be
that bad.  I need to finish the clipper first though.

Best,

  OG.


Re: [Mesa-dev] [PATCH 1/4] mesa: Add a Version field to the context with VersionMajor*10+VersionMinor.

2012-07-27 Thread Olivier Galibert

On Thu, Jul 26, 2012 at 05:27:43PM -0700, Eric Anholt wrote:
> As we get into supporting GL 3.x core, we come across more and more features
> of the API that depend on the version number as opposed to just the extension
> list.  This will let us more sanely do version checks than "(VersionMajor == 3
> && VersionMinor >= 2) || VersionMajor >= 4".

Pure bikeshedding, but why not use *100 in order to be identical to glsl?
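
For comparison, a sketch of both encodings.  The helper names are
invented here, and the GLSL-style variant assumes a two-digit minor
number as in "1.30" becoming 130:

```cpp
#include <cassert>

// Encoding from the patch subject: VersionMajor*10 + VersionMinor.
inline int gl_version_x10(int major, int minor)
{
   return major * 10 + minor;
}

// GLSL-style encoding, assuming a two-digit minor (1.30 -> 130).
inline int glsl_style_version(int major, int minor_two_digits)
{
   return major * 100 + minor_two_digits;
}

// The longhand check quoted in the commit message.
inline bool at_least_3_2_longhand(int major, int minor)
{
   return (major == 3 && minor >= 2) || major >= 4;
}
```

Either encoding turns the longhand test into a single comparison, for
instance gl_version_x10(major, minor) >= 32.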

  OG.

Re: [Mesa-dev] [PATCH 1/9] intel gen4-5: fix the vue view in the fs.

2012-07-27 Thread Olivier Galibert

On Thu, Jul 26, 2012 at 10:18:01AM -0700, Eric Anholt wrote:
> Olivier Galibert  writes:
> 
> > In some cases the fragment shader view of the vue registers was out of
> > sync with the builder.  This fixes it.
> 
> s/builder/SF outputs/ ?
> 
> I'd love to see the pre-gen6 code get rearranged so the FS walked the
> bitfield of FS inputs from SF and chose the urb offset for each.  But
> this does look like the minimal fix.

In other words, an explicit linking pass?  That could be useful with
geometry shaders, too.

  OG.

Re: [Mesa-dev] [PATCH 5/9] intel gen4-5: Compute the interpolation status for every variable in one place.

2012-07-27 Thread Olivier Galibert

On Thu, Jul 26, 2012 at 10:22:26AM -0700, Eric Anholt wrote:
> I don't like seeing this data that should be referenced out of the
> program cache key being communicated through brw->.

What would you like it to be communicated through?

  OG.

Re: [Mesa-dev] [PATCH] sp_tex_sample: Fix stupid copy/paste error.

2012-07-25 Thread Olivier Galibert

[Sorry, mail was down yesterday]

On Tue, Jul 24, 2012 at 10:06:05AM -0600, Brian Paul wrote:
> Does this fix bug 52369?

Yes.

>  Do you need me to commit this for you?

Yes please.  Perhaps I should see about getting a fdo account.

Best,

  OG.

[Mesa-dev] [PATCH] sp_tex_sample: Fix stupid copy/paste error.

2012-07-24 Thread Olivier Galibert

diff --git a/src/gallium/drivers/softpipe/sp_tex_sample.c b/src/gallium/drivers/softpipe/sp_tex_sample.c
index f215b90..0aeb8e2 100644
--- a/src/gallium/drivers/softpipe/sp_tex_sample.c
+++ b/src/gallium/drivers/softpipe/sp_tex_sample.c
@@ -1950,8 +1950,8 @@ mip_filter_linear_2d_linear_repeat_POT(
  float rgbax[TGSI_NUM_CHANNELS][TGSI_QUAD_SIZE];
  int c;
 
- img_filter_2d_linear_repeat_POT(tgsi_sampler, s[j], t[j], p[j], level0,   samp->faces[j], tgsi_sampler_lod_bias, &rgbax[0][j]);
- img_filter_2d_linear_repeat_POT(tgsi_sampler, s[j], t[j], p[j], level0+1, samp->faces[j], tgsi_sampler_lod_bias, &rgbax[0][j]);
+ img_filter_2d_linear_repeat_POT(tgsi_sampler, s[j], t[j], p[j], level0,   samp->faces[j], tgsi_sampler_lod_bias, &rgbax[0][0]);
+ img_filter_2d_linear_repeat_POT(tgsi_sampler, s[j], t[j], p[j], level0+1, samp->faces[j], tgsi_sampler_lod_bias, &rgbax[0][1]);
 
  for (c = 0; c < TGSI_NUM_CHANNELS; c++)
 rgba[c][j] = lerp(levelBlend, rgbax[c][0], rgbax[c][1]);

Re: [Mesa-dev] [Intel-gfx] [PATCH 4/9] intel gen4-5: Fix backface/frontface selection when one one color is written to.

2012-07-20 Thread Olivier Galibert

On Fri, Jul 20, 2012 at 10:01:03AM -0700, Eric Anholt wrote:
> > diff --git a/src/mesa/drivers/dri/i965/brw_fs.cpp b/src/mesa/drivers/dri/i965/brw_fs.cpp
> > index 3f98137..3b62952 100644
> > --- a/src/mesa/drivers/dri/i965/brw_fs.cpp
> > +++ b/src/mesa/drivers/dri/i965/brw_fs.cpp
> > @@ -972,6 +972,15 @@ fs_visitor::calculate_urb_setup()
> >  if (c->key.vp_outputs_written & BITFIELD64_BIT(i)) {
> > int fp_index = _mesa_vert_result_to_frag_attrib((gl_vert_result) i);
> >  
> > +/* Special case: two-sided vertex option, vertex program
> > + * only writes to the back color.  Map it to the
> > + * associated front color location.
> > + */
> > +if (i >= VERT_RESULT_BFC0 && i <= VERT_RESULT_BFC1 &&
> > +ctx->VertexProgram._TwoSideEnabled &&
> > +urb_setup[i - VERT_RESULT_BFC0 + FRAG_ATTRIB_COL0] == -1)
> > +   fp_index = i - VERT_RESULT_BFC0 + FRAG_ATTRIB_COL0;
> 
> In the fs_visitor (and brw_wm_pass*), you don't get to look at ctx->
> state like that -- you're getting called once with some set of ctx
> state, but the program will get reused even if the ctx state changes.
> You'd have to get that state into the wm prog key, and use that, which
> would guarantee that you have the appropriate program code.

Ok.  OTOH, we don't actually *need* to look at TwoSideEnabled.  If the
rest of the condition triggers it's either correct or undefined
behaviour.  So we can do it systematically.
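
A sketch of that unconditional remapping.  The slot numbers and the
remap_lone_back_color() helper are hypothetical; the real VERT_RESULT_*
and FRAG_ATTRIB_* values come from Mesa's headers:

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical slot numbers, chosen only so the sketch compiles.
enum toy_vert_result {
   TOY_VR_COL0 = 1, TOY_VR_COL1 = 2, TOY_VR_BFC0 = 3, TOY_VR_BFC1 = 4
};
enum toy_frag_attrib { TOY_FA_COL0 = 1, TOY_FA_COL1 = 2 };

inline std::uint64_t toy_bit(int i)
{
   return std::uint64_t(1) << i;
}

// Returns the front color attribute a back color output should feed,
// or -1 when no remapping applies.  There is no TwoSideEnabled test:
// only the set of written outputs, which is part of the program key,
// is consulted.
inline int remap_lone_back_color(int vert_result, std::uint64_t outputs_written)
{
   if (vert_result < TOY_VR_BFC0 || vert_result > TOY_VR_BFC1)
      return -1;
   const int front = vert_result - TOY_VR_BFC0 + TOY_VR_COL0;
   if (outputs_written & toy_bit(front))
      return -1;   // front color was written too, nothing to remap
   return vert_result - TOY_VR_BFC0 + TOY_FA_COL0;
}
```

Since the decision depends only on key data, the compiled program stays
valid when the context state changes.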

  OG.

[Mesa-dev] [PATCH 9/9] intel gen4-5: Don't touch flatshaded values when clipping, only copy them.

2012-07-19 Thread Olivier Galibert

This patch ensures that integers will pass through unscathed.  Doing
(useless) computations on them is risky, especially when their bit
patterns correspond to values like inf or nan.

Signed-off-by: Olivier Galibert 
---
 src/mesa/drivers/dri/i965/brw_clip_util.c |   48 ++---
 1 file changed, 30 insertions(+), 18 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_clip_util.c b/src/mesa/drivers/dri/i965/brw_clip_util.c
index b06ad1d..998c304 100644
--- a/src/mesa/drivers/dri/i965/brw_clip_util.c
+++ b/src/mesa/drivers/dri/i965/brw_clip_util.c
@@ -293,30 +293,42 @@ void brw_clip_interp_vertex( struct brw_clip_compile *c,
  * header), so interpolate:
  *
  *New = attr0 + t*attr1 - t*attr0
+  *
+  * unless it's flat shaded, then just copy the value from a
+  * source vertex.
  */
 
- struct brw_reg tmp = get_tmp(c);
+ GLuint interp = brw->interpolation_mode[slot];
 
- struct brw_reg t =
-brw->interpolation_mode[slot] == INTERP_QUALIFIER_NOPERSPECTIVE ?
-t_nopersp : t0;
+ if(interp == INTERP_QUALIFIER_SMOOTH ||
+interp == INTERP_QUALIFIER_NOPERSPECTIVE) {
+struct brw_reg tmp = get_tmp(c);
+struct brw_reg t =
+   interp == INTERP_QUALIFIER_NOPERSPECTIVE ?
+   t_nopersp : t0;
 
-brw_MUL(p, 
-vec4(brw_null_reg()),
-deref_4f(v1_ptr, delta),
-t);
+brw_MUL(p,
+vec4(brw_null_reg()),
+deref_4f(v1_ptr, delta),
+t);
 
-brw_MAC(p, 
-tmp, 
-negate(deref_4f(v0_ptr, delta)),
-t); 
+brw_MAC(p,
+tmp,
+negate(deref_4f(v0_ptr, delta)),
+t);
  
-brw_ADD(p,
-deref_4f(dest_ptr, delta), 
-deref_4f(v0_ptr, delta),
-tmp);
-
- release_tmp(c, tmp);
+brw_ADD(p,
+deref_4f(dest_ptr, delta),
+deref_4f(v0_ptr, delta),
+tmp);
+
+release_tmp(c, tmp);
+
+ } else if(interp == INTERP_QUALIFIER_FLAT) {
+brw_MOV(p,
+deref_4f(dest_ptr, delta),
+deref_4f(v0_ptr, delta));
+ }
   }
}
 
-- 
1.7.10.280.gaa39
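
As an illustration of the risk described in the commit message, a small
host-side sketch.  It assumes 32-bit IEEE 754 floats; the helper names
are invented and nothing here is driver code.  An integer whose bit
pattern is that of +inf turns into NaN when pushed through the
interpolation formula, even if both endpoints hold the same value,
whereas a plain copy keeps the bits intact.

```cpp
#include <cassert>
#include <cmath>
#include <cstdint>
#include <cstring>

static_assert(sizeof(float) == sizeof(std::uint32_t), "32-bit float assumed");

// Reinterpret an integer's bits as a float, as the clipper would when
// it treats every attribute as floating point.
inline float bits_to_float(std::uint32_t bits)
{
   float f;
   std::memcpy(&f, &bits, sizeof f);
   return f;
}

// Same formula as the clipper comment: New = attr0 + t*attr1 - t*attr0
inline float interp(float a0, float a1, float t)
{
   return a0 + t * a1 - t * a0;
}

// Flat-shaded path: a plain copy, so the bit pattern is untouched.
inline std::uint32_t copy_flat(std::uint32_t v0)
{
   return v0;
}
```

With the integer 0x7f800000 on both vertices, interp() computes
inf - inf and yields NaN, while copy_flat() returns the original value.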


[Mesa-dev] [PATCH 8/9] intel gen4-5: Make noperspective clipping work.

2012-07-19 Thread Olivier Galibert

At this point all interpolation tests with fixed clipping work.

Signed-off-by: Olivier Galibert 
Reviewed-by: Paul Berry 
---
 src/mesa/drivers/dri/i965/brw_clip.c  |9 ++
 src/mesa/drivers/dri/i965/brw_clip.h  |1 +
 src/mesa/drivers/dri/i965/brw_clip_util.c |  147 ++---
 3 files changed, 146 insertions(+), 11 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_clip.c b/src/mesa/drivers/dri/i965/brw_clip.c
index 8512172..eca2844 100644
--- a/src/mesa/drivers/dri/i965/brw_clip.c
+++ b/src/mesa/drivers/dri/i965/brw_clip.c
@@ -239,6 +239,15 @@ brw_upload_clip_prog(struct brw_context *brw)
  break;
   }
}
+   key.has_noperspective_shading = 0;
+   for (i = 0; i < BRW_VERT_RESULT_MAX; i++) {
+  if (brw->interpolation_mode[i] == INTERP_QUALIFIER_NOPERSPECTIVE &&
+  brw->vs.prog_data->vue_map.slot_to_vert_result[i] != VERT_RESULT_HPOS) {
+ key.has_noperspective_shading = 1;
+ break;
+  }
+   }
+
key.pv_first = (ctx->Light.ProvokingVertex == GL_FIRST_VERTEX_CONVENTION);
 
    memcpy(key.interpolation_mode, brw->interpolation_mode, BRW_VERT_RESULT_MAX);
diff --git a/src/mesa/drivers/dri/i965/brw_clip.h b/src/mesa/drivers/dri/i965/brw_clip.h
index 3ad2e13..66dd928 100644
--- a/src/mesa/drivers/dri/i965/brw_clip.h
+++ b/src/mesa/drivers/dri/i965/brw_clip.h
@@ -47,6 +47,7 @@ struct brw_clip_prog_key {
GLuint primitive:4;
GLuint nr_userclip:4;
GLuint has_flat_shading:1;
+   GLuint has_noperspective_shading:1;
GLuint pv_first:1;
GLuint do_unfilled:1;
GLuint fill_cw:2;   /* includes cull information */
diff --git a/src/mesa/drivers/dri/i965/brw_clip_util.c b/src/mesa/drivers/dri/i965/brw_clip_util.c
index 692573e..b06ad1d 100644
--- a/src/mesa/drivers/dri/i965/brw_clip_util.c
+++ b/src/mesa/drivers/dri/i965/brw_clip_util.c
@@ -129,6 +129,8 @@ static void brw_clip_project_vertex( struct brw_clip_compile *c,
 
 /* Interpolate between two vertices and put the result into a0.0.  
  * Increment a0.0 accordingly.
+ *
+ * Beware that dest_ptr can be equal to v0_ptr.
  */
 void brw_clip_interp_vertex( struct brw_clip_compile *c,
 struct brw_indirect dest_ptr,
@@ -138,7 +140,8 @@ void brw_clip_interp_vertex( struct brw_clip_compile *c,
 bool force_edgeflag)
 {
struct brw_compile *p = &c->func;
-   struct brw_reg tmp = get_tmp(c);
+   struct brw_context *brw = p->brw;
+   struct brw_reg t_nopersp, v0_ndc_copy;
GLuint slot;
 
/* Just copy the vertex header:
@@ -148,13 +151,130 @@ void brw_clip_interp_vertex( struct brw_clip_compile *c,
 * back on Ironlake, so needn't change it
 */
brw_copy_indirect_to_indirect(p, dest_ptr, v0_ptr, 1);
-  
-   /* Iterate over each attribute (could be done in pairs?)
+
+   /*
+* First handle the 3D and NDC positioning, in case we need
+* noperspective interpolation.  Doing it early has no performance
+* impact in any case.
+*/
+
+   /* Start by picking up the v0 NDC coordinates, because that vertex
+* may be shared with the destination.
+*/
+   if (c->key.has_noperspective_shading) {
+  GLuint offset = brw_vert_result_to_offset(&c->vue_map,
+BRW_VERT_RESULT_NDC);
+  v0_ndc_copy = get_tmp(c);
+  brw_MOV(p, v0_ndc_copy, deref_4f(v0_ptr, offset));
+   }  
+
+   /*
+* Compute the new 3D position
+*
+* dest_hpos = v0_hpos * (1 - t0) + v1_hpos * t0
+*/
+   {
+  GLuint delta = brw_vert_result_to_offset(&c->vue_map, VERT_RESULT_HPOS);
+  struct brw_reg tmp = get_tmp(c);
+  brw_MUL(p, 
+  vec4(brw_null_reg()),
+  deref_4f(v1_ptr, delta),
+  t0);
+
+  brw_MAC(p,
+  tmp,   
+  negate(deref_4f(v0_ptr, delta)),
+  t0);
+ 
+  brw_ADD(p,
+  deref_4f(dest_ptr, delta), 
+  deref_4f(v0_ptr, delta),
+  tmp);
+  release_tmp(c, tmp);
+   }
+
+   /* Then recreate the projected (NDC) coordinate in the new vertex
+* header
+*/
+   brw_clip_project_vertex(c, dest_ptr);
+
+   /*
+* If we have noperspective attributes, we now need to compute the
+* screen-space t.
+*/
+   if (c->key.has_noperspective_shading) {
+  GLuint delta = brw_vert_result_to_offset(&c->vue_map, BRW_VERT_RESULT_NDC);
+  struct brw_reg tmp = get_tmp(c);
+  t_nopersp = get_tmp(c);
+
+  /* Build a register with coordinates from the second and new vertices
+   *
+   * t_nopersp = vec4(v1.xy, dest.xy)
+   */
+  brw_MOV(p, t_nopersp, deref_4f(v1_ptr, delta));
+  brw_MOV(p, tmp, deref_4f(dest_ptr, delta));
+  brw_set_access_mode(p, BRW_ALIGN_16);
+  brw_MOV(p,
+  brw_writemask(t_nopersp, WRITEMASK_ZW),
+  brw_swizzle(tmp,

[Mesa-dev] [PATCH 7/9] intel gen4-5: Correctly handle flat vs. non-flat in the clipper.

2012-07-19 Thread Olivier Galibert

At that point, all interpolation piglit tests involving fixed clipping
work as long as there's no noperspective.

Signed-off-by: Olivier Galibert 
Reviewed-by: Paul Berry 
---
 src/mesa/drivers/dri/i965/brw_clip.c  |   13 --
 src/mesa/drivers/dri/i965/brw_clip.h  |6 +--
 src/mesa/drivers/dri/i965/brw_clip_line.c |6 +--
 src/mesa/drivers/dri/i965/brw_clip_tri.c  |   20 -
 src/mesa/drivers/dri/i965/brw_clip_unfilled.c |2 +-
 src/mesa/drivers/dri/i965/brw_clip_util.c |   56 +++--
 src/mesa/drivers/dri/i965/brw_sf_emit.c   |8 
 7 files changed, 50 insertions(+), 61 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_clip.c b/src/mesa/drivers/dri/i965/brw_clip.c
index b4a2e0a..8512172 100644
--- a/src/mesa/drivers/dri/i965/brw_clip.c
+++ b/src/mesa/drivers/dri/i965/brw_clip.c
@@ -218,7 +218,7 @@ brw_upload_clip_prog(struct brw_context *brw)
struct intel_context *intel = &brw->intel;
struct gl_context *ctx = &intel->ctx;
struct brw_clip_prog_key key;
-
+   int i;
memset(&key, 0, sizeof(key));
 
/* Populate the key:
@@ -231,11 +231,16 @@ brw_upload_clip_prog(struct brw_context *brw)
key.primitive = brw->intel.reduced_primitive;
/* CACHE_NEW_VS_PROG (also part of VUE map) */
key.attrs = brw->vs.prog_data->outputs_written;
-   /* _NEW_LIGHT */
-   key.do_flat_shading = (ctx->Light.ShadeModel == GL_FLAT);
+   /* BRW_NEW_FRAGMENT_PROGRAM, _NEW_LIGHT */
+   key.has_flat_shading = 0;
+   for (i = 0; i < BRW_VERT_RESULT_MAX; i++) {
+  if (brw->interpolation_mode[i] == INTERP_QUALIFIER_FLAT) {
+ key.has_flat_shading = 1;
+ break;
+  }
+   }
key.pv_first = (ctx->Light.ProvokingVertex == GL_FIRST_VERTEX_CONVENTION);
 
-   /* BRW_NEW_FRAGMENT_PROGRAM, _NEW_LIGHT */
    memcpy(key.interpolation_mode, brw->interpolation_mode, BRW_VERT_RESULT_MAX);
 
/* _NEW_TRANSFORM (also part of VUE map)*/
diff --git a/src/mesa/drivers/dri/i965/brw_clip.h b/src/mesa/drivers/dri/i965/brw_clip.h
index e78d074..3ad2e13 100644
--- a/src/mesa/drivers/dri/i965/brw_clip.h
+++ b/src/mesa/drivers/dri/i965/brw_clip.h
@@ -46,7 +46,7 @@ struct brw_clip_prog_key {
    unsigned char interpolation_mode[BRW_VERT_RESULT_MAX]; /* copy of the main context */
GLuint primitive:4;
GLuint nr_userclip:4;
-   GLuint do_flat_shading:1;
+   GLuint has_flat_shading:1;
GLuint pv_first:1;
GLuint do_unfilled:1;
GLuint fill_cw:2;   /* includes cull information */
@@ -166,8 +166,8 @@ void brw_clip_kill_thread(struct brw_clip_compile *c);
 struct brw_reg brw_clip_plane_stride( struct brw_clip_compile *c );
 struct brw_reg brw_clip_plane0_address( struct brw_clip_compile *c );
 
-void brw_clip_copy_colors( struct brw_clip_compile *c,
-  GLuint to, GLuint from );
+void brw_clip_copy_flatshaded_attributes( struct brw_clip_compile *c,
+  GLuint to, GLuint from );
 
 void brw_clip_init_clipmask( struct brw_clip_compile *c );
 
diff --git a/src/mesa/drivers/dri/i965/brw_clip_line.c b/src/mesa/drivers/dri/i965/brw_clip_line.c
index 6cf2bd2..729d8c0 100644
--- a/src/mesa/drivers/dri/i965/brw_clip_line.c
+++ b/src/mesa/drivers/dri/i965/brw_clip_line.c
@@ -271,11 +271,11 @@ void brw_emit_line_clip( struct brw_clip_compile *c )
brw_clip_line_alloc_regs(c);
brw_clip_init_ff_sync(c);
 
-   if (c->key.do_flat_shading) {
+   if (c->key.has_flat_shading) {
   if (c->key.pv_first)
- brw_clip_copy_colors(c, 1, 0);
+ brw_clip_copy_flatshaded_attributes(c, 1, 0);
   else
- brw_clip_copy_colors(c, 0, 1);
+ brw_clip_copy_flatshaded_attributes(c, 0, 1);
}
 
clip_and_emit_line(c);
diff --git a/src/mesa/drivers/dri/i965/brw_clip_tri.c b/src/mesa/drivers/dri/i965/brw_clip_tri.c
index a29f8e0..71225f5 100644
--- a/src/mesa/drivers/dri/i965/brw_clip_tri.c
+++ b/src/mesa/drivers/dri/i965/brw_clip_tri.c
@@ -187,8 +187,8 @@ void brw_clip_tri_flat_shade( struct brw_clip_compile *c )
 
brw_IF(p, BRW_EXECUTE_1);
{
-  brw_clip_copy_colors(c, 1, 0);
-  brw_clip_copy_colors(c, 2, 0);
+  brw_clip_copy_flatshaded_attributes(c, 1, 0);
+  brw_clip_copy_flatshaded_attributes(c, 2, 0);
}
brw_ELSE(p);
{
@@ -200,19 +200,19 @@ void brw_clip_tri_flat_shade( struct brw_clip_compile *c )
 brw_imm_ud(_3DPRIM_TRIFAN));
 brw_IF(p, BRW_EXECUTE_1);
 {
-   brw_clip_copy_colors(c, 0, 1);
-   brw_clip_copy_colors(c, 2, 1);
+   brw_clip_copy_flatshaded_attributes(c, 0, 1);
+   brw_clip_copy_flatshaded_attributes(c, 2, 1);
 }
 brw_ELSE(p);
 {
-   brw_clip_copy_colors(c, 1, 0);
-   brw_clip_copy_colors(c, 2, 0);
+   brw_clip_copy_flatshaded_attributes(c, 1, 0);
+   brw_clip_copy_flatsha

[Mesa-dev] [PATCH 6/9] intel gen4-5: Correctly setup the parameters in the sf.

2012-07-19 Thread Olivier Galibert

This patch also corrects a couple of problems with noperspective
interpolation.

At that point all the glsl 1.1/1.3 interpolation tests that do not
clip pass (the -none ones).

The fs code does not use the pre-resolved interpolation modes in order
not to mess with gen6+.  Sharing the resolution would require putting
brw_wm_prog before brw_clip_prog and brw_sf_prog.  This may be a good
thing, but it could have unexpected consequences, so it's better
done independently in any case.

Signed-off-by: Olivier Galibert 
Reviewed-by: Paul Berry 
---
 src/mesa/drivers/dri/i965/brw_fs.cpp |2 +-
 src/mesa/drivers/dri/i965/brw_fs_visitor.cpp |   15 +++
 src/mesa/drivers/dri/i965/brw_sf.c   |   12 +-
 src/mesa/drivers/dri/i965/brw_sf.h   |2 +-
 src/mesa/drivers/dri/i965/brw_sf_emit.c  |  164 +-
 5 files changed, 106 insertions(+), 89 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_fs.cpp b/src/mesa/drivers/dri/i965/brw_fs.cpp
index 3b62952..4734a5d 100644
--- a/src/mesa/drivers/dri/i965/brw_fs.cpp
+++ b/src/mesa/drivers/dri/i965/brw_fs.cpp
@@ -757,7 +757,7 @@ fs_visitor::emit_general_interpolation(ir_variable *ir)
  inst->predicated = true;
  inst->predicate_inverse = true;
   }
- if (intel->gen < 6) {
+ if (intel->gen < 6 && interpolation_mode == INTERP_QUALIFIER_SMOOTH) {
 emit(BRW_OPCODE_MUL, attr, attr, this->pixel_w);
  }
   }
diff --git a/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp b/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp
index 08c0130..c6dc265 100644
--- a/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp
+++ b/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp
@@ -1872,6 +1872,21 @@ fs_visitor::emit_interpolation_setup_gen4()
emit(BRW_OPCODE_ADD, this->delta_y[BRW_WM_PERSPECTIVE_PIXEL_BARYCENTRIC],
    this->pixel_y, fs_reg(negate(brw_vec1_grf(1, 1))));
 
+   /*
+* On Gen4-5, we accomplish perspective-correct interpolation by
+* dividing the attribute values by w in the sf shader,
+* interpolating the result linearly in screen space, and then
+* multiplying by w in the fragment shader.  So the interpolation
+* step is always linear in screen space, regardless of whether the
+* attribute is perspective or non-perspective.  Accordingly, we
+* use the same delta_x and delta_y values for both kinds of
+* interpolation.
+*/
+   this->delta_x[BRW_WM_NONPERSPECTIVE_PIXEL_BARYCENTRIC] =
+ this->delta_x[BRW_WM_PERSPECTIVE_PIXEL_BARYCENTRIC];
+   this->delta_y[BRW_WM_NONPERSPECTIVE_PIXEL_BARYCENTRIC] =
+ this->delta_y[BRW_WM_PERSPECTIVE_PIXEL_BARYCENTRIC];
+
this->current_annotation = "compute pos.w and 1/pos.w";
/* Compute wpos.w.  It's always in our setup, since it's needed to
 * interpolate the other attributes.
diff --git a/src/mesa/drivers/dri/i965/brw_sf.c b/src/mesa/drivers/dri/i965/brw_sf.c
index 26cbaf7..c00e85a 100644
--- a/src/mesa/drivers/dri/i965/brw_sf.c
+++ b/src/mesa/drivers/dri/i965/brw_sf.c
@@ -139,6 +139,7 @@ brw_upload_sf_prog(struct brw_context *brw)
struct brw_sf_prog_key key;
/* _NEW_BUFFERS */
bool render_to_fbo = _mesa_is_user_fbo(ctx->DrawBuffer);
+   int i;
 
memset(&key, 0, sizeof(key));
 
@@ -190,11 +191,16 @@ brw_upload_sf_prog(struct brw_context *brw)
if ((ctx->Point.SpriteOrigin == GL_LOWER_LEFT) != render_to_fbo)
   key.sprite_origin_lower_left = true;
 
-   /* _NEW_LIGHT */
-   key.do_flat_shading = (ctx->Light.ShadeModel == GL_FLAT);
+   /* BRW_NEW_FRAGMENT_PROGRAM, _NEW_LIGHT */
+   key.has_flat_shading = 0;
+   for (i = 0; i < BRW_VERT_RESULT_MAX; i++) {
+  if (brw->interpolation_mode[i] == INTERP_QUALIFIER_FLAT) {
+ key.has_flat_shading = 1;
+ break;
+  }
+   }
key.do_twoside_color = ctx->VertexProgram._TwoSideEnabled;
 
-   /* BRW_NEW_FRAGMENT_PROGRAM, _NEW_LIGHT */
    memcpy(key.interpolation_mode, brw->interpolation_mode, BRW_VERT_RESULT_MAX);
 
/* _NEW_POLYGON */
diff --git a/src/mesa/drivers/dri/i965/brw_sf.h b/src/mesa/drivers/dri/i965/brw_sf.h
index 5e261fb..47fdb3e 100644
--- a/src/mesa/drivers/dri/i965/brw_sf.h
+++ b/src/mesa/drivers/dri/i965/brw_sf.h
@@ -50,7 +50,7 @@ struct brw_sf_prog_key {
uint8_t point_sprite_coord_replace;
GLuint primitive:2;
GLuint do_twoside_color:1;
-   GLuint do_flat_shading:1;
+   GLuint has_flat_shading:1;
GLuint frontface_ccw:1;
GLuint do_point_sprite:1;
GLuint do_point_coord:1;
diff --git a/src/mesa/drivers/dri/i965/brw_sf_emit.c b/src/mesa/drivers/dri/i965/brw_sf_emit.c
index 9d8aa38..c99578a 100644
--- a/src/mesa/drivers/dri/i965/brw_sf_emit.c
+++ b/src/mesa/drivers/dri/i965/brw_sf_emit.c
@@ -44,6 +44,17 @@
 
 
 /**
+ * Determine the vue slot corresponding to the given half of the given
+ *

[Mesa-dev] [PATCH 5/9] intel gen4-5: Compute the interpolation status for every variable in one place.

2012-07-19 Thread Olivier Galibert

The program keys are updated accordingly, but the values are not used
yet.

Signed-off-by: Olivier Galibert 
---
 src/mesa/drivers/dri/i965/brw_clip.c|   90 ++-
 src/mesa/drivers/dri/i965/brw_clip.h|1 +
 src/mesa/drivers/dri/i965/brw_context.h |   11 
 src/mesa/drivers/dri/i965/brw_sf.c  |5 +-
 src/mesa/drivers/dri/i965/brw_sf.h  |1 +
 src/mesa/drivers/dri/i965/brw_wm.c  |2 +
 src/mesa/drivers/dri/i965/brw_wm.h  |1 +
 7 files changed, 109 insertions(+), 2 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_clip.c b/src/mesa/drivers/dri/i965/brw_clip.c
index d411208..b4a2e0a 100644
--- a/src/mesa/drivers/dri/i965/brw_clip.c
+++ b/src/mesa/drivers/dri/i965/brw_clip.c
@@ -47,6 +47,86 @@
 #define FRONT_UNFILLED_BIT  0x1
 #define BACK_UNFILLED_BIT   0x2
 
+/**
+ * Lookup the interpolation mode information for every element in the
+ * vue.
+ */
+static void
+brw_lookup_interpolation(struct brw_context *brw)
+{
+   /* pprog means "previous program", i.e. the last program before the
+* fragment shader.  It can only be the vertex shader for now, but
+* it may be a geometry shader in the future.
+*/
+   const struct gl_program *pprog = &brw->vertex_program->Base;
+   const struct gl_fragment_program *fprog = brw->fragment_program;
+   struct brw_vue_map *vue_map = &brw->vs.prog_data->vue_map;
+
+   /* Default everything to INTERP_QUALIFIER_NONE */
+   memset(brw->interpolation_mode, INTERP_QUALIFIER_NONE, BRW_VERT_RESULT_MAX);
+
+   /* If there is no fragment shader, interpolation won't be needed,
+* so defaulting to none is good.
+*/
+   if (!fprog)
+  return;
+
+   for (int i = 0; i < vue_map->num_slots; i++) {
+  /* First lookup the vert result, skip if there isn't one */
+  int vert_result = vue_map->slot_to_vert_result[i];
+  if (vert_result == BRW_VERT_RESULT_MAX)
+ continue;
+
+  /* HPOS is special.  In the clipper, it is handled specifically,
+   * so its value is irrelevant.  In the sf, it's forced to
+   * linear.  In the wm, it's special cased, irrelevant again.  So
+   * force linear to remove the sf special case.
+   */
+  if (vert_result == VERT_RESULT_HPOS) {
+ brw->interpolation_mode[i] = INTERP_QUALIFIER_NOPERSPECTIVE;
+ continue;
+  }
+
+  /* There is a 1-1 mapping of vert result to frag attrib except
+   * for BackColor and vars
+   */
+  int frag_attrib = vert_result;
+  if (vert_result >= VERT_RESULT_BFC0 && vert_result <= VERT_RESULT_BFC1)
+ frag_attrib = vert_result - VERT_RESULT_BFC0 + FRAG_ATTRIB_COL0;
+  else if(vert_result >= VERT_RESULT_VAR0)
+ frag_attrib = vert_result - VERT_RESULT_VAR0 + FRAG_ATTRIB_VAR0;
+
+  /* If the output is not used by the fragment shader, skip it. */
+  if (!(fprog->Base.InputsRead & BITFIELD64_BIT(frag_attrib)))
+ continue;
+
+  /* Lookup the interpolation mode */
+  enum glsl_interp_qualifier interpolation_mode = fprog->InterpQualifier[frag_attrib];
+
+  /* If the mode is not specified, then the default varies.  Color
+   * values follow the shader model, while all the rest uses
+   * smooth.
+   */
+  if (interpolation_mode == INTERP_QUALIFIER_NONE) {
+ if (frag_attrib >= FRAG_ATTRIB_COL0 && frag_attrib <= FRAG_ATTRIB_COL1)
+    interpolation_mode = brw->intel.ctx.Light.ShadeModel == GL_FLAT ? INTERP_QUALIFIER_FLAT : INTERP_QUALIFIER_SMOOTH;
+ else
+interpolation_mode = INTERP_QUALIFIER_SMOOTH;
+  }
+
+  /* Finally, if we have both a front color and a back color for
+   * the same channel, the selection will be done before
+   * interpolation and the back color copied over the front color
+   * if necessary.  So interpolating the back color is
+   * unnecessary.
+   */
+  if (vert_result >= VERT_RESULT_BFC0 && vert_result <= VERT_RESULT_BFC1)
+ if (pprog->OutputsWritten & BITFIELD64_BIT(vert_result - VERT_RESULT_BFC0 + VERT_RESULT_COL0))
+interpolation_mode = INTERP_QUALIFIER_NONE;
+
+  brw->interpolation_mode[i] = interpolation_mode;
+   }
+}
 
 static void compile_clip_prog( struct brw_context *brw,
 struct brw_clip_prog_key *key )
@@ -143,6 +223,10 @@ brw_upload_clip_prog(struct brw_context *brw)
 
/* Populate the key:
 */
+
+   /* BRW_NEW_FRAGMENT_PROGRAM, _NEW_LIGHT */
+   brw_lookup_interpolation(brw);
+
/* BRW_NEW_REDUCED_PRIMITIVE */
key.primitive = brw->intel.reduced_primitive;
/* CACHE_NEW_VS_PROG (also part of VUE map) */
@@ -150,6 +234,10 @@ brw_upload_clip_prog(struct brw_context *brw)
/* _NEW_LIGHT */
key.do_flat_shading = (ctx->Light.ShadeModel == GL_FLAT);
key.pv_first = (ctx->Light.ProvokingV

[Mesa-dev] [PATCH 4/9] intel gen4-5: Fix backface/frontface selection when one one color is written to.

2012-07-19 Thread Olivier Galibert

Shaders, piglit test ones in particular, may write only to one of
gl_FrontColor/gl_BackColor.  The standard is unclear on whether the
behaviour is defined in that case, but it seems reasonable to support
it.

The choice made here is to pick whichever color was actually written
to.  That makes most of the generated piglit tests useless for testing
the backface selection, but it's simple and it works.

Signed-off-by: Olivier Galibert 
---
 src/mesa/drivers/dri/i965/brw_fs.cpp |9 +
 src/mesa/drivers/dri/i965/brw_wm_pass2.c |9 +
 2 files changed, 18 insertions(+)

diff --git a/src/mesa/drivers/dri/i965/brw_fs.cpp b/src/mesa/drivers/dri/i965/brw_fs.cpp
index 3f98137..3b62952 100644
--- a/src/mesa/drivers/dri/i965/brw_fs.cpp
+++ b/src/mesa/drivers/dri/i965/brw_fs.cpp
@@ -972,6 +972,15 @@ fs_visitor::calculate_urb_setup()
 if (c->key.vp_outputs_written & BITFIELD64_BIT(i)) {
int fp_index = _mesa_vert_result_to_frag_attrib((gl_vert_result) i);
 
+/* Special case: two-sided vertex option, vertex program
+ * only writes to the back color.  Map it to the
+ * associated front color location.
+ */
+if (i >= VERT_RESULT_BFC0 && i <= VERT_RESULT_BFC1 &&
+ctx->VertexProgram._TwoSideEnabled &&
+urb_setup[i - VERT_RESULT_BFC0 + FRAG_ATTRIB_COL0] == -1)
+   fp_index = i - VERT_RESULT_BFC0 + FRAG_ATTRIB_COL0;
+
/* The back color slot is skipped when the front color is
 * also written to.  In addition, some slots can be
 * written in the vertex shader and not read in the
diff --git a/src/mesa/drivers/dri/i965/brw_wm_pass2.c b/src/mesa/drivers/dri/i965/brw_wm_pass2.c
index eacf7c0..48143f3 100644
--- a/src/mesa/drivers/dri/i965/brw_wm_pass2.c
+++ b/src/mesa/drivers/dri/i965/brw_wm_pass2.c
@@ -96,6 +96,15 @@ static void init_registers( struct brw_wm_compile *c )
 if (c->key.vp_outputs_written & BITFIELD64_BIT(j)) {
int fp_index = _mesa_vert_result_to_frag_attrib(j);
 
+/* Special case: two-sided vertex option, vertex program
+ * only writes to the back color.  Map it to the
+ * associated front color location.
+ */
+if (j >= VERT_RESULT_BFC0 && j <= VERT_RESULT_BFC1 &&
+intel->ctx.VertexProgram._TwoSideEnabled &&
+!(c->key.vp_outputs_written & BITFIELD64_BIT(j - VERT_RESULT_BFC0 + VERT_RESULT_COL0)))
+   fp_index = j - VERT_RESULT_BFC0 + FRAG_ATTRIB_COL0;
+
nr_interp_regs++;
 
/* The back color slot is skipped when the front color is
-- 
1.7.10.280.gaa39

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev

[Mesa-dev] [PATCH 3/9] intel gen4-5: fix GL_VERTEX_PROGRAM_TWO_SIDE selection.

2012-07-19 Thread Olivier Galibert

The previous code only selected two-sided color in pure fixed-function
setups.  This version also activates it when needed with shader programs.

Signed-off-by: Olivier Galibert 
---
 src/mesa/drivers/dri/i965/brw_sf.c |2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/src/mesa/drivers/dri/i965/brw_sf.c b/src/mesa/drivers/dri/i965/brw_sf.c
index 23a874a..791210f 100644
--- a/src/mesa/drivers/dri/i965/brw_sf.c
+++ b/src/mesa/drivers/dri/i965/brw_sf.c
@@ -192,7 +192,7 @@ brw_upload_sf_prog(struct brw_context *brw)
 
/* _NEW_LIGHT */
key.do_flat_shading = (ctx->Light.ShadeModel == GL_FLAT);
-   key.do_twoside_color = (ctx->Light.Enabled && ctx->Light.Model.TwoSide);
+   key.do_twoside_color = ctx->VertexProgram._TwoSideEnabled;
 
/* _NEW_POLYGON */
if (key.do_twoside_color) {
-- 
1.7.10.280.gaa39


[Mesa-dev] [PATCH 2/9] intel gen4-5: simplify the bfc copy in the sf.

2012-07-19 Thread Olivier Galibert

This patch is mostly meant to make the follow-up patches simpler, but
it's also a simplification in its own right.

Signed-off-by: Olivier Galibert 
---
 src/mesa/drivers/dri/i965/brw_sf_emit.c |   93 +--
 1 file changed, 52 insertions(+), 41 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_sf_emit.c b/src/mesa/drivers/dri/i965/brw_sf_emit.c
index ff6383b..9d8aa38 100644
--- a/src/mesa/drivers/dri/i965/brw_sf_emit.c
+++ b/src/mesa/drivers/dri/i965/brw_sf_emit.c
@@ -79,24 +79,9 @@ have_attr(struct brw_sf_compile *c, GLuint attr)
 /*** 
  * Twoside lighting
  */
-static void copy_bfc( struct brw_sf_compile *c,
- struct brw_reg vert )
-{
-   struct brw_compile *p = &c->func;
-   GLuint i;
-
-   for (i = 0; i < 2; i++) {
-  if (have_attr(c, VERT_RESULT_COL0+i) &&
- have_attr(c, VERT_RESULT_BFC0+i))
-brw_MOV(p, 
-get_vert_result(c, vert, VERT_RESULT_COL0+i),
-get_vert_result(c, vert, VERT_RESULT_BFC0+i));
-   }
-}
-
-
 static void do_twoside_color( struct brw_sf_compile *c )
 {
+   GLuint i, need_0, need_1;
struct brw_compile *p = &c->func;
GLuint backface_conditional = c->key.frontface_ccw ? BRW_CONDITIONAL_G : BRW_CONDITIONAL_L;
 
@@ -105,12 +90,14 @@ static void do_twoside_color( struct brw_sf_compile *c )
if (c->key.primitive == SF_UNFILLED_TRIS)
   return;
 
-   /* XXX: What happens if BFC isn't present?  This could only happen
-* for user-supplied vertex programs, as t_vp_build.c always does
-* the right thing.
+   /* If the vertex shader provides both front and backface color, do
+* the selection.  Otherwise the generated code will pick up
+* whichever there is.
 */
-   if (!(have_attr(c, VERT_RESULT_COL0) && have_attr(c, VERT_RESULT_BFC0)) &&
-   !(have_attr(c, VERT_RESULT_COL1) && have_attr(c, VERT_RESULT_BFC1)))
+   need_0 = have_attr(c, VERT_RESULT_COL0) && have_attr(c, VERT_RESULT_BFC0);
+   need_1 = have_attr(c, VERT_RESULT_COL1) && have_attr(c, VERT_RESULT_BFC1);
+
+   if (!need_0 && !need_1)
   return;

/* Need to use BRW_EXECUTE_4 and also do an 4-wide compare in order
@@ -121,12 +108,15 @@ static void do_twoside_color( struct brw_sf_compile *c )
brw_push_insn_state(p);
brw_CMP(p, vec4(brw_null_reg()), backface_conditional, c->det, brw_imm_f(0));
brw_IF(p, BRW_EXECUTE_4);
-   {
-  switch (c->nr_verts) {
-  case 3: copy_bfc(c, c->vert[2]);
-  case 2: copy_bfc(c, c->vert[1]);
-  case 1: copy_bfc(c, c->vert[0]);
-  }
+   for (i=0; i<c->nr_verts; i++) {
+  if (need_0)
+brw_MOV(p, 
+get_vert_result(c, c->vert[i], VERT_RESULT_COL0),
+get_vert_result(c, c->vert[i], VERT_RESULT_BFC0));
+  if (need_1)
+brw_MOV(p, 
+get_vert_result(c, c->vert[i], VERT_RESULT_COL1),
+get_vert_result(c, c->vert[i], VERT_RESULT_BFC1));
}
brw_ENDIF(p);
brw_pop_insn_state(p);
@@ -139,20 +129,27 @@ static void do_twoside_color( struct brw_sf_compile *c )
  */
 
 #define VERT_RESULT_COLOR_BITS (BITFIELD64_BIT(VERT_RESULT_COL0) | \
-   BITFIELD64_BIT(VERT_RESULT_COL1))
+BITFIELD64_BIT(VERT_RESULT_COL1))
 
 static void copy_colors( struct brw_sf_compile *c,
 struct brw_reg dst,
-struct brw_reg src)
+ struct brw_reg src,
+ int allow_twoside)
 {
struct brw_compile *p = &c->func;
GLuint i;
 
for (i = VERT_RESULT_COL0; i <= VERT_RESULT_COL1; i++) {
-  if (have_attr(c,i))
+  if (have_attr(c,i)) {
 brw_MOV(p, 
 get_vert_result(c, dst, i),
 get_vert_result(c, src, i));
+
+  } else if(allow_twoside && have_attr(c, i - VERT_RESULT_COL0 + VERT_RESULT_BFC0)) {
+brw_MOV(p,
+get_vert_result(c, dst, i - VERT_RESULT_COL0 + VERT_RESULT_BFC0),
+get_vert_result(c, src, i - VERT_RESULT_COL0 + VERT_RESULT_BFC0));
+  }
}
 }
 
@@ -167,9 +164,19 @@ static void do_flatshade_triangle( struct brw_sf_compile *c )
struct brw_compile *p = &c->func;
struct intel_context *intel = &p->brw->intel;
struct brw_reg ip = brw_ip_reg();
-   GLuint nr = _mesa_bitcount_64(c->key.attrs & VERT_RESULT_COLOR_BITS);
GLuint jmpi = 1;
 
+   GLuint nr;
+
+   if (c->key.do_twoside_color) {
+  nr = ((c->key.attrs & (BITFIELD64_BIT(VERT_RESULT_COL0) | BITFIELD64_BIT(VERT_RESULT_BFC0))) != 0) +
+ ((c->key.attrs & (BITFIELD64_BIT(VERT_RESULT_COL1) | BITFIELD64_BIT(VERT_RESULT_BFC1))) != 0);
+
+   } else {
+  nr = ((c->key.attrs & BITFIELD64_BIT(VERT_RESULT_COL0)) != 0) +
+

[Mesa-dev] [PATCH 1/9] intel gen4-5: fix the vue view in the fs.

2012-07-19 Thread Olivier Galibert

In some cases the fragment shader view of the vue registers was out of
sync with the builder.  This fixes it.
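
To illustrate the kind of desync this addresses, here is a minimal
sketch.  It is not the driver code: assign_slots(), its parameters and
the flag always_advance are hypothetical names made up for the example.
The assumption is that the setup stage lays out one register per written
vertex output, so the fragment side has to advance its counter for every
written output, whether or not that output maps to a fragment input.

```c
#include <assert.h>

/* Hypothetical helper, for illustration only.
 *
 * written[i]   : non-zero if vertex output i is written
 * vs_to_fs[i]  : fragment input index for vertex output i, or -1 if the
 *                fragment shader has no matching input
 * always_advance = 1 models the fixed behaviour (count every written
 * output), 0 models the old behaviour (count only the mapped ones).
 *
 * Returns the number of register slots consumed.
 */
static int assign_slots(const int *written, const int *vs_to_fs,
                        int n_outputs, int *slot_of_fs_input, int n_inputs,
                        int always_advance)
{
   int next = 0;
   int i;

   for (i = 0; i < n_inputs; i++)
      slot_of_fs_input[i] = -1;

   for (i = 0; i < n_outputs; i++) {
      int fs;

      if (!written[i])
         continue;

      fs = vs_to_fs[i];
      assert(fs < n_inputs);

      if (fs >= 0)
         slot_of_fs_input[fs] = next;
      if (fs >= 0 || always_advance)
         next++;
   }
   return next;
}
```

With four written outputs where the second one has no fragment input,
the fixed variant places the later inputs one slot further than the old
one, which is where the setup stage actually put them under the
assumption above.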

Signed-off-by: Olivier Galibert 
---
 src/mesa/drivers/dri/i965/brw_fs.cpp |9 -
 src/mesa/drivers/dri/i965/brw_wm_pass2.c |   10 +-
 2 files changed, 17 insertions(+), 2 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_fs.cpp b/src/mesa/drivers/dri/i965/brw_fs.cpp
index b3b25cc..3f98137 100644
--- a/src/mesa/drivers/dri/i965/brw_fs.cpp
+++ b/src/mesa/drivers/dri/i965/brw_fs.cpp
@@ -972,8 +972,15 @@ fs_visitor::calculate_urb_setup()
 if (c->key.vp_outputs_written & BITFIELD64_BIT(i)) {
int fp_index = _mesa_vert_result_to_frag_attrib((gl_vert_result) i);
 
+   /* The back color slot is skipped when the front color is
+* also written to.  In addition, some slots can be
+* written in the vertex shader and not read in the
+* fragment shader.  So the register number must always be
+* incremented, mapped or not.
+*/
if (fp_index >= 0)
-  urb_setup[fp_index] = urb_next++;
+  urb_setup[fp_index] = urb_next;
+   urb_next++;
 }
   }
 
diff --git a/src/mesa/drivers/dri/i965/brw_wm_pass2.c b/src/mesa/drivers/dri/i965/brw_wm_pass2.c
index 27c0a94..eacf7c0 100644
--- a/src/mesa/drivers/dri/i965/brw_wm_pass2.c
+++ b/src/mesa/drivers/dri/i965/brw_wm_pass2.c
@@ -97,8 +97,16 @@ static void init_registers( struct brw_wm_compile *c )
int fp_index = _mesa_vert_result_to_frag_attrib(j);
 
nr_interp_regs++;
+
+   /* The back color slot is skipped when the front color is
+* also written to.  In addition, some slots can be
+* written in the vertex shader and not read in the
+* fragment shader.  So the register number must always be
+* incremented, mapped or not.
+*/
if (fp_index >= 0)
-  prealloc_reg(c, &c->payload.input_interp[fp_index], i++);
+  prealloc_reg(c, &c->payload.input_interp[fp_index], i);
+i++;
 }
   }
   assert(nr_interp_regs >= 1);
-- 
1.7.10.280.gaa39


[Mesa-dev] (no subject)

2012-07-19 Thread Olivier Galibert

  Hi,

This is the second version of the clipping/interpolation patches.

Main differences:
- I tried to take all of Paul's remarks into account
- I split the first patch into 4 independent ones
- I've added a patch to ensure that integers pass through unscathed

Patch 4/9 is (slightly) controversial.  There may be better ways to do
it, or at least more general ones.  But it's simple, it works, and it
makes it possible to validate the other 8.  It's an easy one to revert
if we build an alternative.

Best,

  OG.
 
[PATCH 1/9] intel gen4-5: fix the vue view in the fs.
[PATCH 2/9] intel gen4-5: simplify the bfc copy in the sf.
[PATCH 3/9] intel gen4-5: fix GL_VERTEX_PROGRAM_TWO_SIDE selection.
[PATCH 4/9] intel gen4-5: Fix backface/frontface selection when one
[PATCH 5/9] intel gen4-5: Compute the interpolation status for every
[PATCH 6/9] intel gen4-5: Correctly setup the parameters in the sf.
[PATCH 7/9] intel gen4-5: Correctly handle flat vs. non-flat in the
[PATCH 8/9] intel gen4-5: Make noperspective clipping work.
[PATCH 9/9] intel gen4-5: Don't touch flatshaded values when

Re: [Mesa-dev] [PATCH] sp_tex_sample: Fix segfault with fbo-cubemap.

2012-07-19 Thread Olivier Galibert

On Thu, Jul 19, 2012 at 10:57:38AM -0600, Brian Paul wrote:
> 
> static const float ...

Indeed.


> Reviewed-by: Brian Paul 

Thanks.  Could you commit it please?

Best,

  OG.

[Mesa-dev] [PATCH] sp_tex_sample: Fix segfault with fbo-cubemap.

2012-07-19 Thread Olivier Galibert

The cube sampler generates two-dimensional texture coordinates and
hence passes NULL for the array for the third one.  The actual 2D
sampler, lower in the pipe, knew not to use that array since it
didn't need it.  But the samplers have become single-texel and the
coordinate array dereference has been moved up one step, to a level
where the code does not know only two coordinates are used.  Hence the
segfault.

The simplest fix by far is to add a third dummy coordinate array in
the call to the next pipe step, which will be dereferenced to a
harmless 0 that the sampler will then happily ignore.
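
As a rough illustration of the failure mode and of the fix, here is a
minimal sketch.  It is not the softpipe code: sample_2d_texel(),
sample_quad() and sample_cube_face() are hypothetical stand-ins for the
single-texel sampler, the intermediate step that now does the
dereferencing, and the cube sampler.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical single-texel 2D sampler: only s and t matter, the third
 * coordinate is accepted but ignored. */
static float sample_2d_texel(float s, float t, float p)
{
   (void) p;
   return s + t;   /* placeholder for a real texel fetch */
}

/* Hypothetical intermediate step: it dereferences all three coordinate
 * arrays for each of the 4 pixels, without knowing how many coordinates
 * the sampler below really uses.  A NULL array would crash here. */
static void sample_quad(const float s[4], const float t[4],
                        const float p[4], float out[4])
{
   int j;

   assert(s != NULL && t != NULL && p != NULL);
   for (j = 0; j < 4; j++)
      out[j] = sample_2d_texel(s[j], t[j], p[j]);
}

/* Hypothetical cube-face caller: it only has two coordinates, so it
 * passes a dummy zero-filled array for the third one instead of NULL. */
static void sample_cube_face(const float s[4], const float t[4],
                             float out[4])
{
   static const float dummy_p[4] = { 0, 0, 0, 0 };

   sample_quad(s, t, dummy_p, out);
}
```

The dummy array costs four floats and keeps the intermediate step free
of special cases, which is the trade-off the paragraph above argues for.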

Signed-off-by: Olivier Galibert 
---
 src/gallium/drivers/softpipe/sp_tex_sample.c |7 ++-
 1 file changed, 6 insertions(+), 1 deletion(-)

Brown paper bag time.  I had tested with (I think) everything with
"tex" in the name.  Guess what fbo-cubemap doesn't have in the name?

Fixes 52250.

diff --git a/src/gallium/drivers/softpipe/sp_tex_sample.c b/src/gallium/drivers/softpipe/sp_tex_sample.c
index 292dc6e..2f6e272 100644
--- a/src/gallium/drivers/softpipe/sp_tex_sample.c
+++ b/src/gallium/drivers/softpipe/sp_tex_sample.c
@@ -2090,6 +2090,11 @@ sample_cube(struct tgsi_sampler *tgsi_sampler,
unsigned j;
float [4], [4];
 
+   /* Not actually used, but the intermediate steps that do the
+* dereferencing don't know it.
+*/
+   float [4] = { 0, 0, 0, 0 };
+
/*
  major axis
  directiontarget sc tcma
@@ -2157,7 +2162,7 @@ sample_cube(struct tgsi_sampler *tgsi_sampler,
 * is not active, this will point somewhere deeper into the
 * pipeline, eg. to mip_filter or even img_filter.
 */
-   samp->compare(tgsi_sampler, , , NULL, c0, control, rgba);
+   samp->compare(tgsi_sampler, , , , c0, control, rgba);
 }
 
 
-- 
1.7.10.280.gaa39


Re: [Mesa-dev] [PATCH] softpipe: Take all lods into account when texture sampling.

2012-07-18 Thread Olivier Galibert

On Tue, Jul 17, 2012 at 03:41:44PM -0600, Brian Paul wrote:
> On 07/13/2012 10:30 AM, Olivier Galibert wrote:
> > On Wed, Jun 20, 2012 at 08:33:38AM -0600, Brian Paul wrote:
> >> Yeah, I think it's pretty clear that we need to support per-pixel LOD
> >> selection.  For softpipe, Olivier's big patch looks good.
> >
> > ... and then nothing happened.  Ping?  The only code remark was a
> > whitespace issue on one line :-)
> 
> I'll commit/push your patch soon.  I don't always remember who has 
> git-write access so if you can't push patches yourself you should 
> probably indicate so.

I indeed don't have commit access, but more importantly there has been
discussion but not review, which is why I didn't know if I had to
change things :-)

Best,

  OG.

Re: [Mesa-dev] [PATCH 1/5] intel gen4/5: fix GL_VERTEX_PROGRAM_TWO_SIDE.

2012-07-17 Thread Olivier Galibert

On Mon, Jul 16, 2012 at 08:43:17PM -0700, Paul Berry wrote:
> Also, I'm not convinced that #3 is necessary.  Is there something in the
> spec that dictates this behaviour?  My reading of the spec is that if the
> vertex shader writes to gl_BackColor but not glFrontColor, then the
> contents of gl_Color in the fragment shader is undefined.

Oh, I remember why I did that in the first place.  All the front/back
piglit tests only write the appropriate color slot and not the other
one.

The language is annoying:
  The following built-in vertex output variables are available, but
  deprecated. A particular one should be written to if any
  functionality in a corresponding fragment shader or fixed pipeline
  uses it or state derived from it. Otherwise, behavior is undefined.
  out vec4 gl_FrontColor;
  out vec4 gl_BackColor;
  out vec4 gl_FrontSecondaryColor;
  out vec4 gl_BackSecondaryColor;
  [...]

One could argue that you don't "use" gl_FrontColor if all your
polygons are back-facing.  Dunno.  Do you consider all of the twoside
piglit tests buggy?  We can fix *that*.

  OG.


Re: [Mesa-dev] [Intel-gfx] [PATCH 1/5] intel gen4/5: fix GL_VERTEX_PROGRAM_TWO_SIDE.

2012-07-17 Thread Olivier Galibert

On Mon, Jul 16, 2012 at 08:43:17PM -0700, Paul Berry wrote:
> Can you split this into three separate patches?  That will make it easier
> to troubleshoot in case we find bugs with these patches in the future.

I'm going to try.

> Also, I'm not convinced that #3 is necessary.  Is there something in the
> spec that dictates this behaviour?  My reading of the spec is that if the
> vertex shader writes to gl_BackColor but not glFrontColor, then the
> contents of gl_Color in the fragment shader is undefined.

Given the number of security issues/information leaks that happen due
to reads out of place, I'm always extremely wary of reads from
nowhere.  So one pretty much has a choice between forcing a specific
value (like 0) or reading from someplace else that makes sense.  In
that particular case I considered reading from the other color slot
the easy way out.
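
A minimal sketch of that policy, purely for illustration:
pick_color_slot() and the color_slot enum are hypothetical names, not
anything in the driver.  It assumes the only inputs are which of the two
slots the vertex shader wrote and which way the primitive faces.

```c
#include <assert.h>

/* Hypothetical slot identifiers, for illustration only. */
enum color_slot { SLOT_NONE = -1, SLOT_FRONT = 0, SLOT_BACK = 1 };

/* Pick the slot to read the color from.  Never returns a slot that was
 * not written: if the slot matching the facing is missing, fall back to
 * the other one, and report SLOT_NONE if neither exists so the caller
 * can force a constant such as 0. */
static enum color_slot pick_color_slot(int front_written, int back_written,
                                       int is_back_facing)
{
   assert(is_back_facing == 0 || is_back_facing == 1);

   if (is_back_facing && back_written)
      return SLOT_BACK;
   if (!is_back_facing && front_written)
      return SLOT_FRONT;

   /* The slot we wanted was never written: read the other one. */
   if (front_written)
      return SLOT_FRONT;
   if (back_written)
      return SLOT_BACK;

   return SLOT_NONE;
}
```

Either fallback (other slot or forced constant) avoids the read from
nowhere; the sketch just makes the "other slot" choice explicit.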

> If we *do* decide that #3 is necessary, then I think a better way to
> accomplish it is to handle it in the GLSL vertex shader front-end, by
> replacing gl_BackColor with gl_FrontColor in cases where gl_FrontColor is
> not written to.  That way our special case code to handle this situation
> would be in just one place, rather than in three places (both fragment
> shader back-ends, and the SF program).  Also then the fix would apply to
> all hardware, not just Intel Gen4-5.

You'd have to switch off two-sided lighting too, but why not.

> Finally, I couldn't figure out what you meant by "the stray mov into
> lalaland".  Can you elaborate on which piece of code used to generate that
> stray mov, and why it doesn't anymore?  Thanks.

Looking at it again, I was wrong, it was protected.

  OG.

Re: [Mesa-dev] [PATCH 0/5] First batch of gm45 clipping/interpolation fixes

2012-07-14 Thread Olivier Galibert

On Fri, Jul 13, 2012 at 02:45:10PM -0700, Kenneth Graunke wrote:
> Sorry...been really busy, and most of us haven't actually spent much if
> any time in the clipper shaders.  I'll try and review it within a week.

Ok cool, lack of time is something I completely understand :-)


> Despite the lack of response, I am really excited to see that you're
> working on this---this is a huge step toward bringing GL 3.x back to
> Gen4/5, and we're all really glad to see it happen!

Excellent.  I was starting to wonder if gen4/5 was abandoned (for lack
of resources if anything), nice to see it isn't.

  OG.


Re: [Mesa-dev] [PATCH] softpipe: Take all lods into account when texture sampling.

2012-07-13 Thread Olivier Galibert

On Wed, Jun 20, 2012 at 08:33:38AM -0600, Brian Paul wrote:
> Yeah, I think it's pretty clear that we need to support per-pixel LOD 
> selection.  For softpipe, Olivier's big patch looks good.

... and then nothing happened.  Ping?  The only code remark was a
whitespace issue on one line :-)

> For 
> llvmpipe it's important to maintain performance for the common case 
> where we compute LOD per quad but we'll also need new paths for 
> per-pixel LOD.  Hopefully, the two paths can share some code.

I've been thinking, it looks reasonable to statically check whether
the lod/grad/bias is shared at the glsl level.  Then we could have
separate opcodes for the texturing variants for when we're sure things
are shared and when we aren't.  And pay the cost only when it is
needed.  Would that sound reasonable?

  OG.


Re: [Mesa-dev] [PATCH 2/2] mesa/st: Generates TGSI that always recognizes INSTANCEID/VERTEXID as integers.

2012-07-13 Thread Olivier Galibert

On Thu, Jul 12, 2012 at 08:50:13PM +0100, jfons...@vmware.com wrote:
> From: José Fonseca 
> 
> Tested by running piglit draw-instanced, and by forcing llvmpipe advertise no
> native integer support, which now produces:

Looks like a very good solution to me.  Did you check
draw-non-instanced too?  51366 is a variant of the same issue.

  OG.


Re: [Mesa-dev] [PATCH 0/5] First batch of gm45 clipping/interpolation fixes

2012-07-13 Thread Olivier Galibert

On Sat, Jun 30, 2012 at 08:50:10PM +0200, Olivier Galibert wrote:
> This is the first part of the fixes I've done to make my gm45 work
> correctly w.r.t clipping and interpolation.  There's a fair chance
> they work for everything gen 4/5, but I have no way to be sure.

So, not even one comment, nothing?

  OG.

Re: [Mesa-dev] Mesa (master): draw: fix flat shading and screen-space linear interpolation in clipper

2012-07-11 Thread Olivier Galibert

On Wed, Jul 11, 2012 at 02:19:02PM +0200, Marek Olšák wrote:
> On Wed, Jul 11, 2012 at 1:09 PM, Jose Fonseca  wrote:
> > My current plan is to:
> > - make it clear that INSTANCEID/VERTEXID always means integer
> > - require PIPE_SHADER_CAP_INTEGERS to be advertised in the vertex shader
> >   stage in order to advertise INSTANCEID/VERTEXID in Mesa statetracker
> > - given that Mesa assumes integer, insert a I2F when loading
> >   INSTANCEID/VERTEXID (this meets the new semantics while avoiding a big
> >   re-architecture)
> 
> The first two points sound good, but why I2F? Note that softpipe fully
> supports integers while llvmpipe doesn't, and I2F after loading
> INSTANCEID would very likely break softpipe.

I think that would break llvmpipe too.  llvmpipe actually fully
supports integers, it only thinks it doesn't, at least according to
piglit (textureFetch is the only real remaining issue left for glsl
1.30).  And draw-instanced works perfectly well with native integer
llvmpipe (which is why I didn't see the problem before the bug
report).

  OG.

Re: [Mesa-dev] Mesa (master): draw: fix flat shading and screen-space linear interpolation in clipper

2012-07-11 Thread Olivier Galibert

On Wed, Jul 11, 2012 at 12:51:32PM +0200, Marek Olšák wrote:
> Dude, you should really learn GLSL. The idea to emulate integers is
> even older than the GLSL itself. It first appeared in HLSL and NVIDIA
> Cg on hardware that wasn't even GL2-capable.

I'm learning 3.30+, which is what I consider useful now :-) But that
makes it a little harder to remember what appeared when.

> From the GLSL 1.2 spec:
> "The uniform qualifier can be used with any of the basic data types,
> ...", then the section 4.1 lists the basic data types (like ivec4).

Fuck, damn.  Yes, we do have a problem.

  OG.

Re: [Mesa-dev] Mesa (master): draw: fix flat shading and screen-space linear interpolation in clipper

2012-07-11 Thread Olivier Galibert

On Tue, Jul 10, 2012 at 09:19:05AM -0700, Stéphane Marchesin wrote:
> There is also option 3): revert the two patches causing the regression.

And then you'll have this problem again as soon as you want llvmpipe
to reach GL 3.00+/GLSL 1.30+.  So why not find a definitive solution
now?

Previous code converted the instance id to float, and said "it's an
integer guv', honest".  That does not fly in the face of native
integers, at all, unless you like your second instance to be numbered
1065353216.
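
For the record, that number is just the bit pattern of 1.0f read back as
an integer.  A minimal sketch, assuming IEEE 754 single-precision floats
and 32-bit integers; the two helper names are made up for the example:

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Reinterpret the bits of a float as a 32-bit signed integer. */
static int32_t float_bits_as_int(float f)
{
   int32_t i;

   assert(sizeof(i) == sizeof(f));
   memcpy(&i, &f, sizeof(i));
   return i;
}

/* Hypothetical model of the old path: the instance id is converted to a
 * float value, then a consumer with native integers reads the same
 * register as if it held an integer. */
static int32_t instance_id_seen_as_int(int32_t instance_id)
{
   float as_float = (float) instance_id;

   return float_bits_as_int(as_float);
}
```

Instance 0 happens to survive because 0.0f is all zero bits; instance 1
becomes 0x3f800000, which is the value quoted above.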

  OG.

Re: [Mesa-dev] Mesa (master): draw: fix flat shading and screen-space linear interpolation in clipper

2012-07-11 Thread Olivier Galibert

On Tue, Jul 10, 2012 at 03:51:22PM +0200, Marek Olšák wrote:
> I just wanted to tell you Stephane's change cannot work and it even
> has no effect at the moment. The native integer support is global in
> core Mesa. It's because integer uniforms are converted to floats based
> on the global NativeInteger flag for all shader stages and that can't
> be fixed easily, because uniforms can be shared between shaders.
> Basically, all drivers must advertise integer support either for all
> shader stages or none.

Really?  I mean the idea here is that drivers like i915g which don't
have native integers in the fragger are going to advertise native
integers in the vs but stay at glsl 1.20.  Can you have integer
uniforms without 1.30+?  I don't think so.

  OG.


Re: [Mesa-dev] [PATCH 1/3] draw: draw_get_shader_param should return correct values WRT llvm

2012-07-04 Thread Olivier Galibert

On Wed, Jul 04, 2012 at 01:59:44PM +0200, Marek Olšák wrote:
> Please disregard patch 1 and 2. It wouldn't work.

What's wrong with them?

> I still plan to commit patch 3.

Patch 3 makes sense.  I probably should have done it like that in the
first place (learned a lot since :-).

  OG.

Re: [Mesa-dev] Mesa (master): draw: fix flat shading and screen-space linear interpolation in clipper

2012-07-03 Thread Olivier Galibert

On Tue, Jul 03, 2012 at 12:39:47PM -0700, Jose Fonseca wrote:
> Note that all registers are stored as floats (for convenience, and
> because LLVM has no unions), so integers are bitcasted into floats
> while storing/loading.  And I'm not sure if your patch would break
> that.

I did test the patch with a llvmpipe in a glsl 120/no native integer
setup.  draw_instanced worked.  I didn't try a full piglit though.

> I still think that having draw/gallivm guessing whether native integer
> support is intended or not is bad. Either:
> 1) TGSI is extended (e.g., more type annotations) so that native-integer
> support can be inferred from it
> 2) draw/gallivm need to know if the driver has native-integer or not
> 
> I'm inclined towards 1), as TGSI should be self-documented. That is,
> it should not be necessary to know if the driver has or not native
> integer support to know whether system values should be assumed to
> be integers or floats...

It could be argued that dtype being TGSI_TYPE_FLOAT is the
documentation on what is expected.  But I'm quickly reaching the point
where I don't really care, just tell me what you want.  As long as
textureFetch stays the only issue between llvmpipe and 1.30 I'm ok.

Of course doing textureFetch right is going to require an interesting
overhaul of the texture allocations... need to finish fixing the gm45
interpolation/clipping first.

Best,

  OG.


Re: [Mesa-dev] Mesa (master): draw: fix flat shading and screen-space linear interpolation in clipper

2012-07-02 Thread Olivier Galibert

On Mon, Jul 02, 2012 at 06:44:37AM -0700, Jose Fonseca wrote:
> But I think that this fix is too ad-hoc, and I suspect it may
> introduce other regressions.
> 
> If I understood the problem correctly, the issue here is that some
> drivers want system values in floats, others want in
> integers. Right?

It's slightly more perverted than that.  GLSL 1.20 says "if a value is
an integer, it will be forced into a float but don't expect more than
16 bits precision", while 1.30 has native integers.  Next to that,
some extensions (and gl versions) introduce integer system values but
require native integer support to have them implemented.  The glsl
parser handles that correctly by adding the needed type conversions
when accessing these values from a 1.20 shader.

But then in mesa someone decided to extend the extensions and
implement things like draw_instanced without native integer support.
st_glsl_to_tgsi behaves very differently when native integers aren't
there, forcing every type to float and ignoring the integer->float type
conversions.  What tells you that is that the requested type (dtype)
is float while the system value itself is integer.

In fact, I suspect the conversion code is ill-advised.  It was picked
up from the previous code, but actually it should only check that the
types are identical or that float is requested for an int, and bitch
otherwise.  Still, it would be interesting to know if that patch works
for i915g, even if we make things more cranky afterwards.

  OG.

[Mesa-dev] [PATCH 5/5] intel gen4-5: Make noperspective clipping work.

2012-06-30 Thread Olivier Galibert

At this point all interpolation tests with fixed clipping work.
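
For reference, the reason noperspective needs its own handling is that
the clipper's interpolation parameter t is measured in clip space, while
noperspective attributes have to be interpolated with a parameter
measured in screen space.  The sketch below is plain C arithmetic, not
the shader code this patch emits; struct clip_vertex and both function
names are hypothetical.  It computes the screen-space parameter two
ways: from the closed form t' = t*w1 / ((1-t)*w0 + t*w1), and by
projecting the vertices and measuring in NDC.

```c
#include <assert.h>

/* Hypothetical clip-space vertex, reduced to what the example needs. */
struct clip_vertex { float x, y, w; };

static float abs_f(float v)
{
   return v < 0.0f ? -v : v;
}

/* Closed form: screen-space parameter for clip-space parameter t. */
static float nopersp_t_closed_form(float t, float w0, float w1)
{
   float denom = (1.0f - t) * w0 + t * w1;

   assert(denom != 0.0f);
   return t * w1 / denom;
}

/* Same value obtained by building the new vertex in clip space,
 * projecting all three vertices, and comparing distances in NDC.
 * Valid when the projected segment is not degenerate. */
static float nopersp_t_from_ndc(struct clip_vertex v0, struct clip_vertex v1,
                                float t)
{
   struct clip_vertex n;
   float n0x, n0y, n1x, n1y, nx, ny, num, den;

   n.x = v0.x + t * (v1.x - v0.x);
   n.y = v0.y + t * (v1.y - v0.y);
   n.w = v0.w + t * (v1.w - v0.w);
   assert(v0.w != 0.0f && v1.w != 0.0f && n.w != 0.0f);

   n0x = v0.x / v0.w;  n0y = v0.y / v0.w;
   n1x = v1.x / v1.w;  n1y = v1.y / v1.w;
   nx  = n.x / n.w;    ny  = n.y / n.w;

   num = abs_f(nx - n0x) + abs_f(ny - n0y);
   den = abs_f(n1x - n0x) + abs_f(n1y - n0y);
   assert(den != 0.0f);
   return num / den;
}
```

For example, with w0 = 1, w1 = 3 and t = 0.5 both functions give 0.75,
so the clipped vertex sits three quarters of the way along the edge on
screen even though it is halfway in clip space.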

Signed-off-by: Olivier Galibert 
---
 src/mesa/drivers/dri/i965/brw_clip.c  |9 ++
 src/mesa/drivers/dri/i965/brw_clip.h  |1 +
 src/mesa/drivers/dri/i965/brw_clip_util.c |  133 ++---
 3 files changed, 132 insertions(+), 11 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_clip.c b/src/mesa/drivers/dri/i965/brw_clip.c
index 952eb4a..6bfdf24 100644
--- a/src/mesa/drivers/dri/i965/brw_clip.c
+++ b/src/mesa/drivers/dri/i965/brw_clip.c
@@ -234,6 +234,15 @@ brw_upload_clip_prog(struct brw_context *brw)
  break;
   }
}
+   key.has_noperspective_shading = 0;
+   for (i = 0; i < BRW_VERT_RESULT_MAX; i++) {
+  if (brw_get_interpolation_mode(brw, i) == INTERP_QUALIFIER_NOPERSPECTIVE &&
+  brw->vs.prog_data->vue_map.slot_to_vert_result[i] != VERT_RESULT_HPOS) {
+ key.has_noperspective_shading = 1;
+ break;
+  }
+   }
+
key.pv_first = (ctx->Light.ProvokingVertex == GL_FIRST_VERTEX_CONVENTION);
brw_copy_interpolation_modes(brw, key.interpolation_mode);
/* _NEW_TRANSFORM (also part of VUE map)*/
diff --git a/src/mesa/drivers/dri/i965/brw_clip.h b/src/mesa/drivers/dri/i965/brw_clip.h
index 0ea0394..2a7245a 100644
--- a/src/mesa/drivers/dri/i965/brw_clip.h
+++ b/src/mesa/drivers/dri/i965/brw_clip.h
@@ -47,6 +47,7 @@ struct brw_clip_prog_key {
GLuint primitive:4;
GLuint nr_userclip:4;
GLuint has_flat_shading:1;
+   GLuint has_noperspective_shading:1;
GLuint pv_first:1;
GLuint do_unfilled:1;
GLuint fill_cw:2;   /* includes cull information */
diff --git a/src/mesa/drivers/dri/i965/brw_clip_util.c b/src/mesa/drivers/dri/i965/brw_clip_util.c
index 7b0205a..5bdcef8 100644
--- a/src/mesa/drivers/dri/i965/brw_clip_util.c
+++ b/src/mesa/drivers/dri/i965/brw_clip_util.c
@@ -129,6 +129,8 @@ static void brw_clip_project_vertex( struct brw_clip_compile *c,
 
 /* Interpolate between two vertices and put the result into a0.0.  
  * Increment a0.0 accordingly.
+ *
+ * Beware that dest_ptr can be equal to v0_ptr.
  */
 void brw_clip_interp_vertex( struct brw_clip_compile *c,
 struct brw_indirect dest_ptr,
@@ -138,8 +140,9 @@ void brw_clip_interp_vertex( struct brw_clip_compile *c,
 bool force_edgeflag)
 {
struct brw_compile *p = &c->func;
-   struct brw_reg tmp = get_tmp(c);
-   GLuint slot;
+   struct brw_context *brw = p->brw;
+   struct brw_reg tmp, t_nopersp, v0_ndc_copy;
+   GLuint slot, delta;
 
/* Just copy the vertex header:
 */
@@ -148,13 +151,119 @@ void brw_clip_interp_vertex( struct brw_clip_compile *c,
 * back on Ironlake, so needn't change it
 */
brw_copy_indirect_to_indirect(p, dest_ptr, v0_ptr, 1);
-  
-   /* Iterate over each attribute (could be done in pairs?)
+
+   /*
+* First handle the 3D and NDC positioning, in case we need
+* noperspective interpolation.  Doing it early has no performance
+* impact in any case.
+*/
+
+   /* Start by picking up the v0 NDC coordinates, because that vertex
+* may be shared with the destination.
+*/
+   if (c->key.has_noperspective_shading) {
+  v0_ndc_copy = get_tmp(c);
+  brw_MOV(p, v0_ndc_copy, deref_4f(v0_ptr,
+   brw_vert_result_to_offset(&c->vue_map,
+ BRW_VERT_RESULT_NDC)));
+   }  
+
+   /*
+* Compute the new 3D position
+*/
+
+   delta = brw_vert_result_to_offset(&c->vue_map, VERT_RESULT_HPOS);
+   tmp = get_tmp(c);
+   brw_MUL(p, 
+   vec4(brw_null_reg()),
+   deref_4f(v1_ptr, delta),
+   t0);
+
+   brw_MAC(p, 
+   tmp,  
+   negate(deref_4f(v0_ptr, delta)),
+   t0); 
+ 
+   brw_ADD(p,
+   deref_4f(dest_ptr, delta), 
+   deref_4f(v0_ptr, delta),
+   tmp);
+   release_tmp(c, tmp);
+
+   /* Then recreate the projected (NDC) coordinate in the new vertex
+* header
 */
+   brw_clip_project_vertex(c, dest_ptr);
+
+   /*
+* If we have noperspective attributes, we now need to compute the
+* screen-space t.
+*/
+   if (c->key.has_noperspective_shading) {
+  delta = brw_vert_result_to_offset(&c->vue_map, BRW_VERT_RESULT_NDC);
+  t_nopersp = get_tmp(c);
+  tmp = get_tmp(c);
+
+  /* Build a register with coordinates from the second and new vertices */
+  brw_MOV(p, t_nopersp, deref_4f(v1_ptr, delta));
+  brw_MOV(p, tmp, deref_4f(dest_ptr, delta));
+  brw_set_access_mode(p, BRW_ALIGN_16);
+  brw_MOV(p,
+  brw_writemask(t_nopersp, WRITEMASK_ZW),
+  brw_swizzle(tmp, 0,1,0,1));
+
+  /* Subtract the coordinates of the first vertex */
+  brw_ADD(p, t_nopersp, t_nopersp, negate(brw_swizzle(v0_ndc_copy, 0,1,0,1)));
+
+

[Mesa-dev] [PATCH 4/5] intel gen4-5: Correctly handle flat vs. non-flat in the clipper.

2012-06-30 Thread Olivier Galibert

At that point, all interpolation piglit tests involving fixed clipping
work as long as there's no noperspective.

Signed-off-by: Olivier Galibert 
---
 src/mesa/drivers/dri/i965/brw_clip.c  |   10 -
 src/mesa/drivers/dri/i965/brw_clip.h  |6 +--
 src/mesa/drivers/dri/i965/brw_clip_line.c |6 +--
 src/mesa/drivers/dri/i965/brw_clip_tri.c  |   20 -
 src/mesa/drivers/dri/i965/brw_clip_unfilled.c |2 +-
 src/mesa/drivers/dri/i965/brw_clip_util.c |   56 +++--
 6 files changed, 41 insertions(+), 59 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_clip.c b/src/mesa/drivers/dri/i965/brw_clip.c
index 52e8c47..952eb4a 100644
--- a/src/mesa/drivers/dri/i965/brw_clip.c
+++ b/src/mesa/drivers/dri/i965/brw_clip.c
@@ -215,7 +215,7 @@ brw_upload_clip_prog(struct brw_context *brw)
struct intel_context *intel = &brw->intel;
struct gl_context *ctx = &intel->ctx;
struct brw_clip_prog_key key;
-
+   int i;
memset(&key, 0, sizeof(key));
 
brw_lookup_interpolation(brw);
@@ -227,7 +227,13 @@ brw_upload_clip_prog(struct brw_context *brw)
/* CACHE_NEW_VS_PROG (also part of VUE map) */
key.attrs = brw->vs.prog_data->outputs_written;
/* _NEW_LIGHT */
-   key.do_flat_shading = (ctx->Light.ShadeModel == GL_FLAT);
+   key.has_flat_shading = 0;
+   for (i = 0; i < BRW_VERT_RESULT_MAX; i++) {
+  if (brw_get_interpolation_mode(brw, i) == INTERP_QUALIFIER_FLAT) {
+ key.has_flat_shading = 1;
+ break;
+  }
+   }
key.pv_first = (ctx->Light.ProvokingVertex == GL_FIRST_VERTEX_CONVENTION);
brw_copy_interpolation_modes(brw, key.interpolation_mode);
/* _NEW_TRANSFORM (also part of VUE map)*/
diff --git a/src/mesa/drivers/dri/i965/brw_clip.h b/src/mesa/drivers/dri/i965/brw_clip.h
index 6f811ae..0ea0394 100644
--- a/src/mesa/drivers/dri/i965/brw_clip.h
+++ b/src/mesa/drivers/dri/i965/brw_clip.h
@@ -46,7 +46,7 @@ struct brw_clip_prog_key {
GLbitfield64 interpolation_mode[2]; /* copy of the main context */
GLuint primitive:4;
GLuint nr_userclip:4;
-   GLuint do_flat_shading:1;
+   GLuint has_flat_shading:1;
GLuint pv_first:1;
GLuint do_unfilled:1;
GLuint fill_cw:2;   /* includes cull information */
@@ -166,8 +166,8 @@ void brw_clip_kill_thread(struct brw_clip_compile *c);
 struct brw_reg brw_clip_plane_stride( struct brw_clip_compile *c );
 struct brw_reg brw_clip_plane0_address( struct brw_clip_compile *c );
 
-void brw_clip_copy_colors( struct brw_clip_compile *c,
-  GLuint to, GLuint from );
+void brw_clip_copy_flatshaded_attributes( struct brw_clip_compile *c,
+  GLuint to, GLuint from );
 
 void brw_clip_init_clipmask( struct brw_clip_compile *c );
 
diff --git a/src/mesa/drivers/dri/i965/brw_clip_line.c b/src/mesa/drivers/dri/i965/brw_clip_line.c
index 6cf2bd2..729d8c0 100644
--- a/src/mesa/drivers/dri/i965/brw_clip_line.c
+++ b/src/mesa/drivers/dri/i965/brw_clip_line.c
@@ -271,11 +271,11 @@ void brw_emit_line_clip( struct brw_clip_compile *c )
brw_clip_line_alloc_regs(c);
brw_clip_init_ff_sync(c);
 
-   if (c->key.do_flat_shading) {
+   if (c->key.has_flat_shading) {
   if (c->key.pv_first)
- brw_clip_copy_colors(c, 1, 0);
+ brw_clip_copy_flatshaded_attributes(c, 1, 0);
   else
- brw_clip_copy_colors(c, 0, 1);
+ brw_clip_copy_flatshaded_attributes(c, 0, 1);
}
 
clip_and_emit_line(c);
diff --git a/src/mesa/drivers/dri/i965/brw_clip_tri.c b/src/mesa/drivers/dri/i965/brw_clip_tri.c
index a29f8e0..71225f5 100644
--- a/src/mesa/drivers/dri/i965/brw_clip_tri.c
+++ b/src/mesa/drivers/dri/i965/brw_clip_tri.c
@@ -187,8 +187,8 @@ void brw_clip_tri_flat_shade( struct brw_clip_compile *c )
 
brw_IF(p, BRW_EXECUTE_1);
{
-  brw_clip_copy_colors(c, 1, 0);
-  brw_clip_copy_colors(c, 2, 0);
+  brw_clip_copy_flatshaded_attributes(c, 1, 0);
+  brw_clip_copy_flatshaded_attributes(c, 2, 0);
}
brw_ELSE(p);
{
@@ -200,19 +200,19 @@ void brw_clip_tri_flat_shade( struct brw_clip_compile *c )
 brw_imm_ud(_3DPRIM_TRIFAN));
 brw_IF(p, BRW_EXECUTE_1);
 {
-   brw_clip_copy_colors(c, 0, 1);
-   brw_clip_copy_colors(c, 2, 1);
+   brw_clip_copy_flatshaded_attributes(c, 0, 1);
+   brw_clip_copy_flatshaded_attributes(c, 2, 1);
 }
 brw_ELSE(p);
 {
-   brw_clip_copy_colors(c, 1, 0);
-   brw_clip_copy_colors(c, 2, 0);
+   brw_clip_copy_flatshaded_attributes(c, 1, 0);
+   brw_clip_copy_flatshaded_attributes(c, 2, 0);
 }
 brw_ENDIF(p);
   }
   else {
- brw_clip_copy_colors(c, 0, 2);
- brw_clip_copy_colors(c, 1, 2);
+ brw_clip_copy_flatshaded_attributes(c, 0, 2);
+ brw_clip_copy_flatshaded_attributes(c, 1, 2);
   }

[Mesa-dev] [PATCH 3/5] intel gen4-5: Correctly setup the parameters in the sf.

2012-06-30 Thread Olivier Galibert

This patch also corrects a couple of problems with noperspective
interpolation.
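
A minimal sketch of the distinction this patch relies on, assuming the
usual definitions of the two modes: noperspective attributes are
interpolated linearly in screen space, while smooth attributes are
interpolated as a/w and then multiplied by the per-pixel w (which is
what the gen < 6 MUL by pixel_w in the brw_fs.cpp hunk is for).  The
function names and the two-vertex setup are made up for illustration,
they are not driver code.

```c
#include <assert.h>

/* Screen-space linear (noperspective) interpolation between two
 * vertices, t being the screen-space position between them.
 */
float interp_noperspective(float a0, float a1, float t)
{
   return a0 + t * (a1 - a0);
}

/* Perspective-correct (smooth) interpolation: a/w and 1/w are both
 * interpolated linearly in screen space, then the first is divided by
 * the second, which is the same as multiplying by the interpolated w.
 */
float interp_smooth(float a0, float w0, float a1, float w1, float t)
{
   assert(w0 != 0.0f && w1 != 0.0f);
   float a_over_w = interp_noperspective(a0 / w0, a1 / w1, t);
   float inv_w = interp_noperspective(1.0f / w0, 1.0f / w1, t);
   return a_over_w / inv_w;
}
```

When both vertices have the same w the two functions agree, which is
why a bug in this area only shows up on primitives seen in perspective.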

At that point all the glsl 1.1/1.3 interpolation tests that do not
clip pass (the -none ones).

The fs code does not use the pre-resolved interpolation modes in order
not to mess with gen6+.  Sharing the resolution would require putting
brw_wm_prog before brw_clip_prog and brw_sf_prog.  This may be a good
thing, but it could have unexpected consequences, so it had better be
done independently in any case.

Signed-off-by: Olivier Galibert 
---
 src/mesa/drivers/dri/i965/brw_fs.cpp |2 +-
 src/mesa/drivers/dri/i965/brw_fs_visitor.cpp |5 +
 src/mesa/drivers/dri/i965/brw_sf.c   |9 +-
 src/mesa/drivers/dri/i965/brw_sf.h   |2 +-
 src/mesa/drivers/dri/i965/brw_sf_emit.c  |  164 +-
 5 files changed, 95 insertions(+), 87 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_fs.cpp b/src/mesa/drivers/dri/i965/brw_fs.cpp
index 710f2ff..b142f2b 100644
--- a/src/mesa/drivers/dri/i965/brw_fs.cpp
+++ b/src/mesa/drivers/dri/i965/brw_fs.cpp
@@ -506,7 +506,7 @@ fs_visitor::emit_general_interpolation(ir_variable *ir)
  struct brw_reg interp = interp_reg(location, k);
   emit_linterp(attr, fs_reg(interp), interpolation_mode,
ir->centroid);
- if (intel->gen < 6) {
+ if (intel->gen < 6 && interpolation_mode == INTERP_QUALIFIER_SMOOTH) {
 emit(BRW_OPCODE_MUL, attr, attr, this->pixel_w);
  }
   }
diff --git a/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp b/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp
index 9bd1e67..ab83a95 100644
--- a/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp
+++ b/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp
@@ -1924,6 +1924,11 @@ fs_visitor::emit_interpolation_setup_gen4()
emit(BRW_OPCODE_ADD, this->delta_y[BRW_WM_PERSPECTIVE_PIXEL_BARYCENTRIC],
this->pixel_y, fs_reg(negate(brw_vec1_grf(1, 1))));
 
+   this->delta_x[BRW_WM_NONPERSPECTIVE_PIXEL_BARYCENTRIC] =
+ this->delta_x[BRW_WM_PERSPECTIVE_PIXEL_BARYCENTRIC];
+   this->delta_y[BRW_WM_NONPERSPECTIVE_PIXEL_BARYCENTRIC] =
+ this->delta_y[BRW_WM_PERSPECTIVE_PIXEL_BARYCENTRIC];
+
this->current_annotation = "compute pos.w and 1/pos.w";
/* Compute wpos.w.  It's always in our setup, since it's needed to
 * interpolate the other attributes.
diff --git a/src/mesa/drivers/dri/i965/brw_sf.c b/src/mesa/drivers/dri/i965/brw_sf.c
index 0cc4fc7..85f5f51 100644
--- a/src/mesa/drivers/dri/i965/brw_sf.c
+++ b/src/mesa/drivers/dri/i965/brw_sf.c
@@ -139,6 +139,7 @@ brw_upload_sf_prog(struct brw_context *brw)
struct brw_sf_prog_key key;
/* _NEW_BUFFERS */
bool render_to_fbo = _mesa_is_user_fbo(ctx->DrawBuffer);
+   int i;
 
memset(&key, 0, sizeof(key));
 
@@ -191,7 +192,13 @@ brw_upload_sf_prog(struct brw_context *brw)
   key.sprite_origin_lower_left = true;
 
/* _NEW_LIGHT */
-   key.do_flat_shading = (ctx->Light.ShadeModel == GL_FLAT);
+   key.has_flat_shading = 0;
+   for (i = 0; i < BRW_VERT_RESULT_MAX; i++) {
+  if (brw_get_interpolation_mode(brw, i) == INTERP_QUALIFIER_FLAT) {
+ key.has_flat_shading = 1;
+ break;
+  }
+   }
key.do_twoside_color = (ctx->Light.Enabled && ctx->Light.Model.TwoSide) ||
  ctx->VertexProgram._TwoSideEnabled;
brw_copy_interpolation_modes(brw, key.interpolation_mode);
diff --git a/src/mesa/drivers/dri/i965/brw_sf.h b/src/mesa/drivers/dri/i965/brw_sf.h
index 0a8135c..c718072 100644
--- a/src/mesa/drivers/dri/i965/brw_sf.h
+++ b/src/mesa/drivers/dri/i965/brw_sf.h
@@ -50,7 +50,7 @@ struct brw_sf_prog_key {
uint8_t point_sprite_coord_replace;
GLuint primitive:2;
GLuint do_twoside_color:1;
-   GLuint do_flat_shading:1;
+   GLuint has_flat_shading:1;
GLuint frontface_ccw:1;
GLuint do_point_sprite:1;
GLuint do_point_coord:1;
diff --git a/src/mesa/drivers/dri/i965/brw_sf_emit.c b/src/mesa/drivers/dri/i965/brw_sf_emit.c
index 9d8aa38..387685a 100644
--- a/src/mesa/drivers/dri/i965/brw_sf_emit.c
+++ b/src/mesa/drivers/dri/i965/brw_sf_emit.c
@@ -44,6 +44,17 @@
 
 
 /**
+ * Determine the vue slot corresponding to the given half of the given
+ * register.  half=0 means the first half of a register, half=1 means the
+ * second half.
+ */
+static inline int vert_reg_to_vue_slot(struct brw_sf_compile *c, GLuint reg,
+   int half)
+{
+   return (reg + c->urb_entry_read_offset) * 2 + half;
+}
+
+/**
  * Determine the vert_result corresponding to the given half of the given
  * register.  half=0 means the first half of a register, half=1 means the
  * second half.
@@ -51,11 +62,24 @@
 static inline int vert_reg_to_vert_result(struct brw_sf_compile *c, GLuint reg,
   int half)
 {

[Mesa-dev] [PATCH 2/5] intel gen4-5: Compute the interpolation status for every variable in one place.

2012-06-30 Thread Olivier Galibert

The program keys are updated accordingly, but the values are not used
yet.
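
The defaulting rule applied by the lookup function in the diff can be
sketched on its own as follows.  This is a simplified illustration
only: the enum and the function name are hypothetical stand-ins for
glsl_interp_qualifier and the per-slot loop, and the HPOS and
back-color special cases are left out.

```c
#include <assert.h>

/* Hypothetical stand-ins for Mesa's INTERP_QUALIFIER_* values. */
enum interp_mode {
   INTERP_NONE,
   INTERP_SMOOTH,
   INTERP_FLAT,
   INTERP_NOPERSPECTIVE
};

/* Resolve the effective interpolation mode of one fragment shader
 * input.  An explicit qualifier always wins; otherwise colors follow
 * the shade model and everything else is smooth.
 */
enum interp_mode
resolve_interp_mode(enum interp_mode declared, int is_color,
                    int shade_model_flat)
{
   assert(declared >= INTERP_NONE && declared <= INTERP_NOPERSPECTIVE);

   if (declared != INTERP_NONE)
      return declared;
   if (is_color && shade_model_flat)
      return INTERP_FLAT;
   return INTERP_SMOOTH;
}
```

Resolving this once and storing the result per slot is what lets the
clip, sf and wm keys share the same answer later in the series.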

Signed-off-by: Olivier Galibert 
---
 src/mesa/drivers/dri/i965/brw_clip.c|   82 ++-
 src/mesa/drivers/dri/i965/brw_clip.h|1 +
 src/mesa/drivers/dri/i965/brw_context.h |   59 ++
 src/mesa/drivers/dri/i965/brw_sf.c  |3 +-
 src/mesa/drivers/dri/i965/brw_sf.h  |1 +
 src/mesa/drivers/dri/i965/brw_wm.c  |4 ++
 src/mesa/drivers/dri/i965/brw_wm.h  |1 +
 7 files changed, 149 insertions(+), 2 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_clip.c b/src/mesa/drivers/dri/i965/brw_clip.c
index d411208..52e8c47 100644
--- a/src/mesa/drivers/dri/i965/brw_clip.c
+++ b/src/mesa/drivers/dri/i965/brw_clip.c
@@ -47,6 +47,83 @@
 #define FRONT_UNFILLED_BIT  0x1
 #define BACK_UNFILLED_BIT   0x2
 
+/**
+ * Lookup the interpolation mode information for every element in the
+ * vue.
+ */
+static void
+brw_lookup_interpolation(struct brw_context *brw)
+{
+   /* pprog means "previous program", i.e. the last program before the
+* fragment shader.  It can only be the vertex shader for now, but
+* it may be a geometry shader in the future.
+*/
+   const struct gl_program *pprog = &brw->vertex_program->Base;
+   const struct gl_fragment_program *fprog = brw->fragment_program;
+   struct brw_vue_map *vue_map = &brw->vs.prog_data->vue_map;
+
+   /* Default everything to INTERP_QUALIFIER_NONE */
+   brw_clear_interpolation_modes(brw);
+
+   /* If there is no fragment shader, interpolation won't be needed,
+* so defaulting to none is good.
+*/
+   if (!fprog)
+  return;
+
+   for (int i = 0; i < vue_map->num_slots; i++) {
+  /* First lookup the vert result, skip if there isn't one */
+  int vert_result = vue_map->slot_to_vert_result[i];
+  if (vert_result == BRW_VERT_RESULT_MAX)
+ continue;
+
+  /* HPOS is special, it must be linear
+   */
+  if (vert_result == VERT_RESULT_HPOS) {
+ brw_set_interpolation_mode(brw, i, INTERP_QUALIFIER_NOPERSPECTIVE);
+ continue;
+  }
+
+  /* There is a 1-1 mapping of vert result to frag attrib except
+   * for BackColor and vars
+   */
+  int frag_attrib = vert_result;
+  if (vert_result >= VERT_RESULT_BFC0 && vert_result <= VERT_RESULT_BFC1)
+ frag_attrib = vert_result - VERT_RESULT_BFC0 + FRAG_ATTRIB_COL0;
+  else if(vert_result >= VERT_RESULT_VAR0)
+ frag_attrib = vert_result - VERT_RESULT_VAR0 + FRAG_ATTRIB_VAR0;
+
+  /* If the output is not used by the fragment shader, skip it. */
+  if (!(fprog->Base.InputsRead & BITFIELD64_BIT(frag_attrib)))
+ continue;
+
+  /* Lookup the interpolation mode */
+  enum glsl_interp_qualifier interpolation_mode = fprog->InterpQualifier[frag_attrib];
+
+  /* If the mode is not specified, then the default varies.  Color
+   * values follow the shade model, while all the rest uses
+   * smooth.
+   */
+  if (interpolation_mode == INTERP_QUALIFIER_NONE) {
+ if (frag_attrib >= FRAG_ATTRIB_COL0 && frag_attrib <= FRAG_ATTRIB_COL1)
+interpolation_mode = brw->intel.ctx.Light.ShadeModel == GL_FLAT ? INTERP_QUALIFIER_FLAT : INTERP_QUALIFIER_SMOOTH;
+ else
+interpolation_mode = INTERP_QUALIFIER_SMOOTH;
+  }
+
+  /* Finally, if we have both a front color and a back color for
+   * the same channel, the selection will be done before
+   * interpolation and the back color copied over the front color
+   * if necessary.  So interpolating the back color is
+   * unnecessary.
+   */
+  if (vert_result >= VERT_RESULT_BFC0 && vert_result <= VERT_RESULT_BFC1)
+ if (pprog->OutputsWritten & BITFIELD64_BIT(vert_result - VERT_RESULT_BFC0 + VERT_RESULT_COL0))
+interpolation_mode = INTERP_QUALIFIER_NONE;
+
+  brw_set_interpolation_mode(brw, i, interpolation_mode);
+   }
+}
 
 static void compile_clip_prog( struct brw_context *brw,
 struct brw_clip_prog_key *key )
@@ -141,6 +218,8 @@ brw_upload_clip_prog(struct brw_context *brw)
 
memset(&key, 0, sizeof(key));
 
+   brw_lookup_interpolation(brw);
+
/* Populate the key:
 */
/* BRW_NEW_REDUCED_PRIMITIVE */
@@ -150,6 +229,7 @@ brw_upload_clip_prog(struct brw_context *brw)
/* _NEW_LIGHT */
key.do_flat_shading = (ctx->Light.ShadeModel == GL_FLAT);
key.pv_first = (ctx->Light.ProvokingVertex == GL_FIRST_VERTEX_CONVENTION);
+   brw_copy_interpolation_modes(brw, key.interpolation_mode);
/* _NEW_TRANSFORM (also part of VUE map)*/
key.nr_userclip = _mesa_bitcount_64(ctx->Transform.ClipPlanesEnabled);
 
@@ -258,7 +338,7 @@ const struct brw_tracked_state brw_clip_prog = {
_NEW_TRANSFORM |
_NE

[Mesa-dev] [PATCH 1/5] intel gen4/5: fix GL_VERTEX_PROGRAM_TWO_SIDE.

2012-06-30 Thread Olivier Galibert

There was... confusion about which register goes where.  With that
patch urb_setup is in line with the vue setup, even when these
annoying backcolor slots are used.  And in addition the stray mov into
lalaland is avoided when only one of the front/back slots is used and
the backface is looking at you.  The code instead picks whatever slot
was written to by the vertex shader.  That makes most of the generated
piglit tests useless to test the backface selection though.
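
To make the register numbering concrete, here is a small standalone
sketch of the rule the brw_fs.cpp hunk implements.  It assumes a
cut-down, hypothetical slot list (the SLOT_* and ATTR_* enums are not
the real VERT_RESULT_* and FRAG_ATTRIB_* values) and is only meant to
show why the register counter has to advance for every written slot.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical, cut-down versions of the vertex result and fragment
 * attribute enums.
 */
enum { SLOT_COL0, SLOT_COL1, SLOT_BFC0, SLOT_BFC1, SLOT_VAR0, SLOT_MAX };
enum { ATTR_COL0, ATTR_COL1, ATTR_VAR0, ATTR_MAX };

/* Fill urb_setup[] with the register index of each fragment attribute,
 * or -1 when nothing feeds it.
 */
void
assign_urb_registers(uint32_t outputs_written, int two_side,
                     int urb_setup[ATTR_MAX])
{
   int urb_next = 0;

   assert(urb_setup);
   for (int a = 0; a < ATTR_MAX; a++)
      urb_setup[a] = -1;

   for (int slot = 0; slot < SLOT_MAX; slot++) {
      int attr = -1;

      if (!(outputs_written & (1u << slot)))
         continue;

      switch (slot) {
      case SLOT_COL0: attr = ATTR_COL0; break;
      case SLOT_COL1: attr = ATTR_COL1; break;
      case SLOT_VAR0: attr = ATTR_VAR0; break;
      default: break;
      }

      /* Back color written without the matching front color: use it in
       * place of the front color.
       */
      if ((slot == SLOT_BFC0 || slot == SLOT_BFC1) && two_side &&
          urb_setup[slot - SLOT_BFC0 + ATTR_COL0] == -1)
         attr = slot - SLOT_BFC0 + ATTR_COL0;

      if (attr >= 0)
         urb_setup[attr] = urb_next;

      /* The slot occupies a register whether or not it was mapped. */
      urb_next++;
   }
}
```

With COL0, BFC0 and VAR0 written, VAR0 ends up in register 2 because
the unmapped BFC0 still takes register 1; with only BFC0 and VAR0
written, BFC0 stands in for COL0 in register 0.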

Signed-off-by: Olivier Galibert 
---
 src/mesa/drivers/dri/i965/brw_fs.cpp |   18 +-
 src/mesa/drivers/dri/i965/brw_sf.c   |3 +-
 src/mesa/drivers/dri/i965/brw_sf_emit.c  |   93 +-
 src/mesa/drivers/dri/i965/brw_wm_pass2.c |   19 +-
 4 files changed, 89 insertions(+), 44 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_fs.cpp b/src/mesa/drivers/dri/i965/brw_fs.cpp
index 6cef08a..710f2ff 100644
--- a/src/mesa/drivers/dri/i965/brw_fs.cpp
+++ b/src/mesa/drivers/dri/i965/brw_fs.cpp
@@ -721,8 +721,24 @@ fs_visitor::calculate_urb_setup()
 if (c->key.vp_outputs_written & BITFIELD64_BIT(i)) {
int fp_index = _mesa_vert_result_to_frag_attrib((gl_vert_result) i);
 
+/* Special case: two-sided vertex option, vertex program
+ * only writes to the back color.  Map it to the
+ * associated front color location.
+ */
+if (i >= VERT_RESULT_BFC0 && i <= VERT_RESULT_BFC1 &&
+ctx->VertexProgram._TwoSideEnabled &&
+urb_setup[i - VERT_RESULT_BFC0 + FRAG_ATTRIB_COL0] == -1)
+   fp_index = i - VERT_RESULT_BFC0 + FRAG_ATTRIB_COL0;
+
+   /* The back color slot is skipped when the front color is
+* also written to.  In addition, some slots can be
+* written in the vertex shader and not read in the
+* fragment shader.  So the register number must always be
+* incremented, mapped or not.
+*/
if (fp_index >= 0)
-  urb_setup[fp_index] = urb_next++;
+  urb_setup[fp_index] = urb_next;
+   urb_next++;
 }
   }
 
diff --git a/src/mesa/drivers/dri/i965/brw_sf.c b/src/mesa/drivers/dri/i965/brw_sf.c
index 23a874a..7867ab5 100644
--- a/src/mesa/drivers/dri/i965/brw_sf.c
+++ b/src/mesa/drivers/dri/i965/brw_sf.c
@@ -192,7 +192,8 @@ brw_upload_sf_prog(struct brw_context *brw)
 
/* _NEW_LIGHT */
key.do_flat_shading = (ctx->Light.ShadeModel == GL_FLAT);
-   key.do_twoside_color = (ctx->Light.Enabled && ctx->Light.Model.TwoSide);
+   key.do_twoside_color = (ctx->Light.Enabled && ctx->Light.Model.TwoSide) ||
+ ctx->VertexProgram._TwoSideEnabled;
 
/* _NEW_POLYGON */
if (key.do_twoside_color) {
diff --git a/src/mesa/drivers/dri/i965/brw_sf_emit.c b/src/mesa/drivers/dri/i965/brw_sf_emit.c
index ff6383b..9d8aa38 100644
--- a/src/mesa/drivers/dri/i965/brw_sf_emit.c
+++ b/src/mesa/drivers/dri/i965/brw_sf_emit.c
@@ -79,24 +79,9 @@ have_attr(struct brw_sf_compile *c, GLuint attr)
 /*** 
  * Twoside lighting
  */
-static void copy_bfc( struct brw_sf_compile *c,
- struct brw_reg vert )
-{
-   struct brw_compile *p = &c->func;
-   GLuint i;
-
-   for (i = 0; i < 2; i++) {
-  if (have_attr(c, VERT_RESULT_COL0+i) &&
- have_attr(c, VERT_RESULT_BFC0+i))
-brw_MOV(p, 
-get_vert_result(c, vert, VERT_RESULT_COL0+i),
-get_vert_result(c, vert, VERT_RESULT_BFC0+i));
-   }
-}
-
-
 static void do_twoside_color( struct brw_sf_compile *c )
 {
+   GLuint i, need_0, need_1;
struct brw_compile *p = &c->func;
GLuint backface_conditional = c->key.frontface_ccw ? BRW_CONDITIONAL_G : BRW_CONDITIONAL_L;
 
@@ -105,12 +90,14 @@ static void do_twoside_color( struct brw_sf_compile *c )
if (c->key.primitive == SF_UNFILLED_TRIS)
   return;
 
-   /* XXX: What happens if BFC isn't present?  This could only happen
-* for user-supplied vertex programs, as t_vp_build.c always does
-* the right thing.
+   /* If the vertex shader provides both front and backface color, do
+* the selection.  Otherwise the generated code will pick up
+* whichever there is.
 */
-   if (!(have_attr(c, VERT_RESULT_COL0) && have_attr(c, VERT_RESULT_BFC0)) &&
-   !(have_attr(c, VERT_RESULT_COL1) && have_attr(c, VERT_RESULT_BFC1)))
+   need_0 = have_attr(c, VERT_RESULT_COL0) && have_attr(c, VERT_RESULT_BFC0);
+   need_1 = have_attr(c, VERT_RESULT_COL1) && have_attr(c, VERT_RESULT_BFC1);
+
+   if (!need_0 && !need_1)
   return;

/* Need to use BRW_EXECUTE_4 and also do an 4-wide compare in order
@@ -121,12 +108,15 @@ static void do_twoside_color( struct brw_sf_compile *c )
brw_push

[Mesa-dev] [PATCH 0/5] First batch of gm45 clipping/interpolation fixes

2012-06-30 Thread Olivier Galibert

  Hi,

This is the first part of the fixes I've done to make my gm45 work
correctly w.r.t clipping and interpolation.  There's a fair chance
they work for everything gen 4/5, but I have no way to be sure.

[PATCH 1/5] intel gen4-5: fix GL_VERTEX_PROGRAM_TWO_SIDE.
[PATCH 2/5] intel gen4-5: Compute the interpolation status for every
[PATCH 3/5] intel gen4-5: Correctly setup the parameters in the sf.
[PATCH 4/5] intel gen4-5: Correctly handle flat vs. non-flat in the
[PATCH 5/5] intel gen4-5: Make noperspective clipping work.

After this batch every piglit interpolation test involving no clipping
or fixed clipping passes.  Vertex clipping clearly never worked
(VERT_RESULT_CLIP_VERTEX is not used, so...) and clipdistance isn't
implemented.  These will be the topic of the second batch, whenever it
exists.

Best,

  OG.
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev

Re: [Mesa-dev] Mesa (master): draw: fix flat shading and screen-space linear interpolation in clipper

2012-06-29 Thread Olivier Galibert

On Fri, Jun 29, 2012 at 03:09:23PM -0700, Stéphane Marchesin wrote:
> I do, but it fixes a regression, so unless you have a fix, it's the way to
> go. If you have a fix I'll happily test it :)

Just between us, revert on small regressions may not be optimal long
term on a project like mesa where the review/commit pipeline is
clogged.  The risk of losing developers is non-negligible.  The linux
kernel can afford it because even if you miss a cycle you know that
you will have another one in two months, and there are a lot of
intermediate collation trees in which your patch can be tried out and
shaken for bugs (subsystem trees, -next, akpm patch tree, etc).  I'm
not sure Mesa can afford it.

That said, try this.

commit 56555c58d7f16c8d619c21feb23096155e2fb505
Author: Olivier Galibert 
Date:   Sat Jun 30 00:41:20 2012 +0200

lp_bld_tgsi_soa: Fix conversion of system values to float.

Signed-off-by: Olivier Galibert 

diff --git a/src/gallium/auxiliary/gallivm/lp_bld_tgsi_soa.c b/src/gallium/auxiliary/gallivm/lp_bld_tgsi_soa.c
index 55db561..f8df2bc 100644
--- a/src/gallium/auxiliary/gallivm/lp_bld_tgsi_soa.c
+++ b/src/gallium/auxiliary/gallivm/lp_bld_tgsi_soa.c
@@ -811,9 +811,10 @@ emit_fetch_system_value(
   break;
}

+   /* Extend that when atype can happen to be float */
if (atype != stype) {
   if (stype == TGSI_TYPE_FLOAT) {
- res = LLVMBuildBitCast(builder, res, bld_base->base.vec_type, "");
+ res = lp_build_int_to_float(&bld_base->base, res);
   } else if (stype == TGSI_TYPE_UNSIGNED) {
  res = LLVMBuildBitCast(builder, res, bld_base->uint_bld.vec_type, "");
   } else if (stype == TGSI_TYPE_SIGNED) {
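
For reference, the difference between the bitcast that was there and
the int-to-float conversion that replaces it can be shown in plain C.
This is only an illustration under the assumption of 32-bit IEEE 754
floats; the helper names are made up and nothing here is gallivm code.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Reinterpret the bits of an integer as a float (what a bitcast does). */
float reinterpret_as_float(int32_t value)
{
   float f;

   assert(sizeof(f) == sizeof(value));
   memcpy(&f, &value, sizeof(f));
   return f;
}

/* Convert the integer value to the nearest float (what an int-to-float
 * conversion does).
 */
float convert_to_float(int32_t value)
{
   return (float)value;
}
```

An instance id of 3 converts to 3.0f, but its bit pattern reinterpreted
as a float is a tiny denormal, which is why the shader saw garbage.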

Re: [Mesa-dev] Mesa (master): draw: fix flat shading and screen-space linear interpolation in clipper

2012-06-29 Thread Olivier Galibert

On Fri, Jun 29, 2012 at 12:52:06PM -0700, Stéphane Marchesin wrote:
> Yeah, but my question was more high level, whether the vertex id
> support required the previous refactor. It looks like it does though,
> and I don't want to untangle, so I'll revert both 3/4 and 4/4.

You realize that will re-break instanceID on llvmpipe for glsl > 120, right?

  OG.

Re: [Mesa-dev] [PATCH] lp_build_lod_selector: Disable brilinear folding on explicit lod.

2012-06-27 Thread Olivier Galibert

On Wed, Jun 27, 2012 at 03:17:05AM -0700, Jose Fonseca wrote:
> I took a look at the results, and it seems to me that bri linear
> code is fine -- the test is merely too strict, and does not forgive
> the gravitation towards integer lod that brilinear implements.

Yes, the current code maps [0,0.25] to 0, [0.25,0.75] to [0,1] and
[0.75,1] to 1.  So you need an error tolerance of 0.20 given how
the test is done on multiples of 0.2.
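
A small sketch of that mapping, to show where the 0.20 comes from.  It
is an illustration of the intervals quoted above, not a transcription
of lp_build_brilinear_lod; the function names and the "factor"
parameter are assumptions (a factor of 2 reproduces the quoted
intervals).

```c
#include <assert.h>

/* Fold the fractional part of a lod towards 0 or 1: the middle of the
 * range is stretched by 'factor' and the ends are clamped.
 */
float fold_lod_fract(float fract, float factor)
{
   assert(factor >= 1.0f);
   float folded = (fract - 0.5f) * factor + 0.5f;

   if (folded < 0.0f)
      return 0.0f;
   if (folded > 1.0f)
      return 1.0f;
   return folded;
}

/* Largest deviation from the exact fraction over 'steps' evenly spaced
 * sample points in [0,1].
 */
float max_fold_error(int steps, float factor)
{
   float worst = 0.0f;

   assert(steps > 0);
   for (int i = 0; i <= steps; i++) {
      float fract = (float)i / (float)steps;
      float err = fold_lod_fract(fract, factor) - fract;

      if (err < 0.0f)
         err = -err;
      if (err > worst)
         worst = err;
   }
   return worst;
}
```

With five steps (multiples of 0.2) and a factor of 2, the worst case is
at 0.2 and 0.8, which get folded to 0 and 1 respectively.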

What are your criteria for deciding that a precision is "good enough"?

  OG.


Re: [Mesa-dev] [PATCH 2/3] llvmpipe: handle more PIPE_CAP_x queries

2012-06-26 Thread Olivier Galibert

On Tue, Jun 26, 2012 at 02:46:01PM -0600, Brian Paul wrote:
> As with the previous commit for softpipe.
> 
> v2: remove 'default' case to get compile-time warning
> ---
>  src/gallium/drivers/llvmpipe/lp_screen.c |   52 +++--
>  1 files changed, 48 insertions(+), 4 deletions(-)
> 
> diff --git a/src/gallium/drivers/llvmpipe/lp_screen.c b/src/gallium/drivers/llvmpipe/lp_screen.c
> index 40037a5..e66737b 100644
> --- a/src/gallium/drivers/llvmpipe/lp_screen.c
> +++ b/src/gallium/drivers/llvmpipe/lp_screen.c
> +   case PIPE_CAP_GLSL_FEATURE_LEVEL:
> +  return 0;

Why not 120?

  OG.

Re: [Mesa-dev] [PATCH] lp_build_lod_selector: Disable brilinear folding on explicit lod.

2012-06-25 Thread Olivier Galibert

On Mon, Jun 25, 2012 at 03:16:35PM -0700, Jose Fonseca wrote:
> Indeed lp_build_brilinear_lod is not faster than
> lp_build_ifloor_fract, but brilinear is faster, not because log is a
> faster approximation, but because it increases the odds that fract
> part is zero, which means that we can sample from a single mip
> level, instead of lerping between two mip levels.
> 
> I think you have a good point here -- lp_build_brilinear_lod is a
> log2 approximation which is wrong here and that's a great catch, --
> but I have a point too: lp_build_ifloor_fract will slow down texture
> sampling.
>
> Just like log2 and brilinear log2, we need a variant of
> ifloor_fract, that increases the probability of fract part being
> zero, essentially by applying a stair case transformation like:

You can do that by multiplying by 'k', subtracting 0.5*k and clamping
to [0,1[.  The question is whether you really want to do something
like that for explicit lod, where the user supposedly exactly knows
what he wants.  "textureLod" is not used often at all[1], so one can
think that when it's used you'd better do it precisely.

  OG.

[1] You see more uses of lod bias and/or textureGrad, the latter due
to the use of conditionals.

Re: [Mesa-dev] [PATCH] lp_build_lod_selector: Disable brilinear folding on explicit lod.

2012-06-25 Thread Olivier Galibert

On Mon, Jun 25, 2012 at 11:40:08AM -0700, Jose Fonseca wrote:
> My thoughts too.
> 
> Brilinear filtering provides a significant boost, and I don't see why skip
> the optimization for explicit lod over implicit lods.

Warning: code misread :-)

Explicit lod does not need brilinear filtering because explicit lod is
post log2.  Brilinear is only about a faster log2, nothing else.
Explicit lod only needs the integer/fractional part separation.

The whole code is:
   if (mip_filter == PIPE_TEX_MIPFILTER_LINEAR) {
  if (!explicit_lod && !(gallivm_debug & GALLIVM_DEBUG_NO_BRILINEAR)) {
 lp_build_brilinear_lod(float_bld, lod, BRILINEAR_FACTOR,
out_lod_ipart, out_lod_fpart);
  }
  else {
 lp_build_ifloor_fract(float_bld, lod, out_lod_ipart, out_lod_fpart);
  }

  lp_build_name(*out_lod_fpart, "lod_fpart");
   }
   else {
  *out_lod_ipart = lp_build_iround(float_bld, lod);
   }

and you're not going to tell me that lp_build_brilinear_lod is faster
than lp_build_ifloor_fract (especially since it includes it ;-)

  OG.


Re: [Mesa-dev] [PATCH] lp_build_lod_selector: Disable brilinear folding on explicit lod.

2012-06-25 Thread Olivier Galibert

On Mon, Jun 25, 2012 at 07:31:20PM +0100, Roland Scheidegger wrote:
> Does this fix the test because lp_build_brilinear_lod produces bogus
> values in this case or just because the test is strict about such
> filtering optimizations? In the latter case I'm not sure I really see
> much point.

Bogus.  It does the fractional-part log2 approximation there, which
only makes sense if you called fast_log2 before (and even then the log
bias is going to be strangely applied, but meh).

> I'm surprised it can actually pass in either case since we drop all but
> the first lod per quad values on the floor anyway so I think you will
> get neither the right filtering weights between mipmaps nor even the
> right mip levels (if the integer part of the lod isn't the same) for
> anything but the first texel per quad.

Luck due to the design of the test.  It's rectangles with a fixed lod
value, so the quads all have the same.  That's pretty much why I
cooked up miplevels-2 (only in vs though, it's much easier there and
the code is shared).

  OG.

Re: [Mesa-dev] [PATCH] llvmpipe: Remove the ARB_draw_instanced capability.

2012-06-25 Thread Olivier Galibert

On Mon, Jun 25, 2012 at 05:34:25AM -0700, Jose Fonseca wrote:
> - Original Message -
> > That capability requires integer handling and that's not yet active,
> > ending with a failure in draw-non-instanced unless you force it on.
> > See bug 51366.
> > 
> > Frankly, I'd rather have that patch rejected and integer/glsl 130
> > capability activated instead.  There still are things missing, but
> > they mostly have their own extension anyway.  And the overall picture
> > ain't so bad.
> 
> I'm personally also more interested in seeing llvmpipe to get the missing
> features for GLSL 1.30 / OGL 3.
> 
> What's the overall picture of llvmpipe w/ integer/glsl 130? That is, how many
> piglit tests go from skipped to passed/failed?

To failed:

precision-05.vert
link-mismatch-layout-02
no-redeclaration-01.vert
feature-macro.vert
fs-exec-after-break
  - general failures, everybody has them

vs-clip-distance-bulk-assign
vs-clip-distance-inout-param
vs-clip-distance-out-param
vs-clip-distance-retval
  - haven't checked what the problem is, softpipe has it right

fs-isinf-vec2
fs-isinf-vec3
fs-isinf-vec4
vs-isinf-vec2
vs-isinf-vec3
vs-isinf-vec4
  - test is iffy

fs-texelFetch-2D
fs-texelFetchOffset-2D
  - no texelFetch support yet

fs-texture-sampler2dshadow-10
fs-texture-sampler2dshadow-11
  - dunno what's going on, softpipe fails it too

vs-attrib-ivec4-implied
vs-attrib-ivec4-precision
vs-attrib-uvec4-implied
vs-attrib-uvec4-precision
  - use glVertexAttribIPointer, which is GL 3.0+ only

vs-textureLod-miplevels
  - issue with vertex shader invalidation when sampler mode changes (as in,
    it's not done)

vs-textureLod-miplevels-2
  - you know that one, it's nowhere near fixed yet (the softpipe patch is
    awaiting review too :-)

texel-offset-limits
  - no limits defined in lp_screen.c, dunno whether texture() would take it
    into account either

To pass:
1503 total, it seems, you can be sure I'm not going to list them :-)

Best,

  OG.

[Mesa-dev] [PATCH] lp_build_lod_selector: Disable brilinear folding on explicit lod.

2012-06-25 Thread Olivier Galibert

Brilinear folding must only be used if the log2 was computed with
brilinear too.  Fixes fs-textureLod-miplevels.

Signed-off-by: Olivier Galibert 
---
 src/gallium/auxiliary/gallivm/lp_bld_sample.c |2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/src/gallium/auxiliary/gallivm/lp_bld_sample.c b/src/gallium/auxiliary/gallivm/lp_bld_sample.c
index d966788..9deda61 100644
--- a/src/gallium/auxiliary/gallivm/lp_bld_sample.c
+++ b/src/gallium/auxiliary/gallivm/lp_bld_sample.c
@@ -513,7 +513,7 @@ lp_build_lod_selector(struct lp_build_sample_context *bld,
}
 
if (mip_filter == PIPE_TEX_MIPFILTER_LINEAR) {
-  if (!(gallivm_debug & GALLIVM_DEBUG_NO_BRILINEAR)) {
+  if (!explicit_lod && !(gallivm_debug & GALLIVM_DEBUG_NO_BRILINEAR)) {
  lp_build_brilinear_lod(float_bld, lod, BRILINEAR_FACTOR,
 out_lod_ipart, out_lod_fpart);
   }
-- 
1.7.10.280.gaa39


[Mesa-dev] [PATCH] u2f_emit: Fix type parameter in LLVM call.

2012-06-25 Thread Olivier Galibert

The type is the destination type (i.e. float vector) and not the
source type.  Fixes piglit fs-{in,de}crement-uint.

Signed-off-by: Olivier Galibert 
---
 src/gallium/auxiliary/gallivm/lp_bld_tgsi_action.c |2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/src/gallium/auxiliary/gallivm/lp_bld_tgsi_action.c b/src/gallium/auxiliary/gallivm/lp_bld_tgsi_action.c
index cbc5945..17f288f 100644
--- a/src/gallium/auxiliary/gallivm/lp_bld_tgsi_action.c
+++ b/src/gallium/auxiliary/gallivm/lp_bld_tgsi_action.c
@@ -693,7 +693,7 @@ u2f_emit(
 {
emit_data->output[emit_data->chan] = LLVMBuildUIToFP(bld_base->base.gallivm->builder,
emit_data->args[0],
-   bld_base->uint_bld.vec_type, "");
+   bld_base->base.vec_type, "");
 }
 
 static void
-- 
1.7.10.280.gaa39


[Mesa-dev] Comparison of llvmpipe with 2.9 and 3.1

2012-06-25 Thread Olivier Galibert

  Hi,

I've just finished two piglit runs of llvmpipe with glsl 1.40 and gl
3.1 forced on, one with LLVM 2.9, the other with 3.1.

The least we can say is that there aren't many differences. 

- fp-indirections2, didn't have the patience to wait to see whether it
  would eventually stop.  Looks like something quadratic or worse in
  the LLVM optimizers.

- 17000-consecutive-chars-identifier, the memory corruption it creates
  behaved differently (probably due to the different glibc, it wasn't on
  the same box), causing a deadlock in malloc()

- texCombine fails on 3.1 only with:
Returncode: -5

Errors:
src/gallium/auxiliary/draw/draw_llvm.c:309:create_jit_vertex_header: Assertion `LLVMABISizeOfType(target, vertex_header) == __builtin_offsetof (struct vertex_header, data[data_elems])' failed.


Output:
--
GL_EXT_texture_env_combine verification test.
We only test a subset of all possible texture env combinations
because there's simply too many to exhaustively test them all.



So, in total, the story isn't bad.

Best,

  OG.

[Mesa-dev] [PATCH] builtin_variables: Only advertise gl_InstanceIDARB when GLSL handle integers.

2012-06-25 Thread Olivier Galibert

It can be argued it makes no sense to advertise an integer system
variable in GLSL levels where integers aren't handled.

Signed-off-by: Olivier Galibert 

---

I don't really know if that's a patch we want, but otoh having
gl_InstanceIDARB being a different type depending on the GLSL version
would be... weird.


diff --git a/src/glsl/builtin_variables.cpp b/src/glsl/builtin_variables.cpp
index 03b64c9..f9a341f 100644
--- a/src/glsl/builtin_variables.cpp
+++ b/src/glsl/builtin_variables.cpp
@@ -888,12 +888,13 @@ generate_ARB_draw_instanced_variables(exec_list *instructions,
   bool warn,
   _mesa_glsl_parser_targets target)
 {
-   /* gl_InstanceIDARB is only available in the vertex shader.
+   /* gl_InstanceIDARB is only available in the vertex shader, and
+* only if the glsl level can handle integers.
 */
if (target != vertex_shader)
   return;
 
-   if (state->ARB_draw_instanced_enable) {
+   if (state->ARB_draw_instanced_enable && state->language_version >= 130) {
   ir_variable *inst =
  add_variable(instructions, state->symbols,
  "gl_InstanceIDARB", glsl_type::int_type,

Re: [Mesa-dev] [PATCH] softpipe: Do round-to-even, not round-up.

2012-06-24 Thread Olivier Galibert

On Fri, May 18, 2012 at 08:55:39AM -0600, Brian Paul wrote:
> In any case, I think this function could be moved into u_math.c so it 
> could be used elsewhere.
[...]
> I was looking at the GLSL round() and roundEven() functions.  The GLSL 
> spec says round() can use whatever method is fastest.  But in 
> builtin_functions.cpp the round() function is implemented in terms of 
> the round_even builtin.  It seems to me that we should have a generic 
> 'round' builtin function and separate TGSI_ROUND and TGSI_ROUND_EVEN 
> opcodes so that drivers can really have the option of using a 
> faster/looser round function.

I've tried doing that.  I've moved the function to u_math.c, then made
src/glsl/ir_constant_expression.cpp use it.  That blew up.

If I compile with scons, I get:
  Linking build/linux-x86_64-debug/glsl/builtin_compiler ...
build/linux-x86_64-debug/glsl/ir_constant_expression.o: In function `dot':
/home/galibert/X/work/mesa-play/src/glsl/ir_constant_expression.cpp:47: undefined reference to `_debug_assert_fail'
[...]
/home/galibert/X/work/mesa-play/src/glsl/ir_constant_expression.cpp:265: undefined reference to `ieee754_fp32_round_half_to_even'
[etc]

If I compile with autoconf/make I get:
ir_constant_expression.cpp:42:25: fatal error: util/u_math.h: No such file or directory

So at that point src/glsl and src/gallium are not supposed to meet
each other.  And changing that is not a responsibility I feel like
taking.  Any advice?
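
For reference, a minimal sketch of what a round-half-to-even helper can
look like in plain C, with no dependency on either src/glsl or
src/gallium.  The name round_half_to_even and the integer-cast approach
are assumptions made for illustration; this is not the
ieee754_fp32_round_half_to_even mentioned above.

```c
/* Hypothetical standalone helper: round to the nearest integer, with
 * exact ties (x.5) going to the even neighbour.  Not the Mesa function. */
float round_half_to_even(float x)
{
   int i;
   float frac;

   /* Floats with magnitude >= 2^23 are already integral.  The negated
    * test also sends NaN and infinities straight back to the caller. */
   if (!(x > -8388608.0f && x < 8388608.0f))
      return x;

   i = (int)x;            /* truncates toward zero */
   frac = x - (float)i;   /* same sign as x, magnitude below 1 */

   if (frac > 0.5f)
      i += 1;
   else if (frac < -0.5f)
      i -= 1;
   else if (frac == 0.5f)
      i += i & 1;         /* tie: move up only if i is odd */
   else if (frac == -0.5f)
      i -= i & 1;         /* tie: move down only if i is odd */

   return (float)i;
}
```

With this, 0.5 and 2.5 round to 0 and 2, while 1.5 rounds to 2.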

Best,

  OG.


[Mesa-dev] [PATCH] llvmpipe: Remove the ARB_draw_instanced capability.

2012-06-24 Thread Olivier Galibert

That capability requires integer handling and that's not yet active,
ending with a failure in draw-non-instanced unless you force it on.
See bug 51366.

Frankly, I'd rather have that patch rejected and integer/glsl 130
capability activated instead.  There still are things missing, but
they mostly have their own extension anyway.  And the overall picture
ain't so bad.

Signed-off-by: Olivier Galibert 

diff --git a/src/gallium/drivers/llvmpipe/lp_screen.c b/src/gallium/drivers/llvmpipe/lp_screen.c
index 40037a5..5eb826e 100644
--- a/src/gallium/drivers/llvmpipe/lp_screen.c
+++ b/src/gallium/drivers/llvmpipe/lp_screen.c
@@ -152,8 +152,6 @@ llvmpipe_get_param(struct pipe_screen *screen, enum pipe_cap param)
   return 1;
case PIPE_CAP_DEPTH_CLIP_DISABLE:
   return 0;
-   case PIPE_CAP_TGSI_INSTANCEID:
-   case PIPE_CAP_VERTEX_ELEMENT_INSTANCE_DIVISOR:
case PIPE_CAP_MIXED_COLORBUFFER_FORMATS:
case PIPE_CAP_CONDITIONAL_RENDER:
   return 1;

[Mesa-dev] [PATCH] st_program.c: gl_ClipDistance must be interpolated in 3d space.

2012-06-24 Thread Olivier Galibert

That old bug was hidden by the clipper always interpolating in 3d
space no matter what it should have been doing.  Now that the
interpolation has been fixed, the bug shows up.

Fixes bugzilla 51364.

Signed-off-by: Olivier Galibert 

diff --git a/src/mesa/state_tracker/st_program.c b/src/mesa/state_tracker/st_program.c
index e6664fb..9f98298 100644
--- a/src/mesa/state_tracker/st_program.c
+++ b/src/mesa/state_tracker/st_program.c
@@ -569,12 +569,12 @@ st_translate_fragment_program(struct st_context *st,
  case FRAG_ATTRIB_CLIP_DIST0:
 input_semantic_name[slot] = TGSI_SEMANTIC_CLIPDIST;
 input_semantic_index[slot] = 0;
-interpMode[slot] = TGSI_INTERPOLATE_LINEAR;
+interpMode[slot] = TGSI_INTERPOLATE_PERSPECTIVE;
 break;
  case FRAG_ATTRIB_CLIP_DIST1:
 input_semantic_name[slot] = TGSI_SEMANTIC_CLIPDIST;
 input_semantic_index[slot] = 1;
-interpMode[slot] = TGSI_INTERPOLATE_LINEAR;
+interpMode[slot] = TGSI_INTERPOLATE_PERSPECTIVE;
 break;
 /* In most cases, there is nothing special about these
  * inputs, so adopt a convention to use the generic

Re: [Mesa-dev] Mesa (master): draw: fix flat shading and screen-space linear interpolation in clipper

2012-06-21 Thread Olivier Galibert

On Thu, Jun 21, 2012 at 10:28:22AM -0700, Jose Fonseca wrote:
> This patch series is causing regressions in select/feedback mode. Can you 
> take a look?

Sure.  I wouldn't have expected that case to ever happen, but it makes
sense now that I think of it.

commit edc7b26b03c0393582ff5ec8c963207c7553850a
Author: Olivier Galibert 
Date:   Thu Jun 21 19:37:11 2012 +0200

clip_init_state: Handle the case when there isn't a fragment shader.

    Signed-off-by: Olivier Galibert 

diff --git a/src/gallium/auxiliary/draw/draw_pipe_clip.c b/src/gallium/auxiliary/draw/draw_pipe_clip.c
index 2d36eb3..c02d0ef 100644
--- a/src/gallium/auxiliary/draw/draw_pipe_clip.c
+++ b/src/gallium/auxiliary/draw/draw_pipe_clip.c
@@ -586,6 +586,9 @@ clip_init_state( struct draw_stage *stage )
 * two outputs for one input, so we tuck the information in a
 * specific array.  Second if they don't have qualifiers, the
 * default value has to be picked from the global shade mode.
+*
+* Of course, if we don't have a fragment shader in the first
+* place, defaults should be used.
 */
 
/* First pick up the interpolation mode for
@@ -595,10 +598,12 @@ clip_init_state( struct draw_stage *stage )
indexed_interp[0] = indexed_interp[1] = stage->draw->rasterizer->flatshade ?
   TGSI_INTERPOLATE_CONSTANT : TGSI_INTERPOLATE_PERSPECTIVE;
 
-   for (i = 0; i < fs->info.num_inputs; i++) {
-  if (fs->info.input_semantic_name[i] == TGSI_SEMANTIC_COLOR) {
- if (fs->info.input_interpolate[i] != TGSI_INTERPOLATE_COLOR)
-indexed_interp[fs->info.input_semantic_index[i]] = fs->info.input_interpolate[i];
+   if (fs) {
+  for (i = 0; i < fs->info.num_inputs; i++) {
+ if (fs->info.input_semantic_name[i] == TGSI_SEMANTIC_COLOR) {
+if (fs->info.input_interpolate[i] != TGSI_INTERPOLATE_COLOR)
+   indexed_interp[fs->info.input_semantic_index[i]] = fs->info.input_interpolate[i];
+ }
   }
}
 
@@ -627,12 +632,14 @@ clip_init_state( struct draw_stage *stage )
   */
  uint j;
  interp = TGSI_INTERPOLATE_PERSPECTIVE;
- for (j = 0; j < fs->info.num_inputs; j++) {
-if (vs->info.output_semantic_name[i] == fs->info.input_semantic_name[j] &&
-vs->info.output_semantic_index[i] == fs->info.input_semantic_index[j]) {
-   interp = fs->info.input_interpolate[j];
-   break;
-}   
+ if (fs) {
+for (j = 0; j < fs->info.num_inputs; j++) {
+   if (vs->info.output_semantic_name[i] == fs->info.input_semantic_name[j] &&
+   vs->info.output_semantic_index[i] == fs->info.input_semantic_index[j]) {
+  interp = fs->info.input_interpolate[j];
+  break;
+   }
+}
  }
   }
 

Re: [Mesa-dev] [PATCH] i965/msaa: Only do multisample rasterization if GL_MULTISAMPLE enabled.

2012-06-21 Thread Olivier Galibert

On Thu, Jun 21, 2012 at 11:58:57AM +0200, Michel Dänzer wrote:
> On Don, 2012-06-21 at 11:38 +0200, Olivier Galibert wrote: 
> > On Thu, Jun 21, 2012 at 11:19:39AM +0200, Michel Dänzer wrote:
> > > On Die, 2012-06-19 at 17:18 -0700, Kenneth Graunke wrote: 
> > > > Also, distribute the appropriate emacs and vim settings to indent things
> > > > correctly.
> > > 
> > > In any case, please do this *before* any kind of cleanup.
> > 
> > (global-set-key [(control c) (s)]  (lambda () (interactive) (setq c-basic-offset 3 tab-width 8 indent-tabs-mode nil)))
> 
> The point is to encode that in a file in the tree which is picked up
> automagically.

Errr, automagically running code coming from a repository without user
intervention is not usually considered smart, security-wise...

  OG.


Re: [Mesa-dev] [PATCH] i965/msaa: Only do multisample rasterization if GL_MULTISAMPLE enabled.

2012-06-21 Thread Olivier Galibert

On Thu, Jun 21, 2012 at 11:19:39AM +0200, Michel Dänzer wrote:
> On Die, 2012-06-19 at 17:18 -0700, Kenneth Graunke wrote: 
> > Also, distribute the appropriate emacs and vim settings to indent things
> > correctly.
> 
> In any case, please do this *before* any kind of cleanup.

(global-set-key [(control c) (s)]  (lambda () (interactive) (setq c-basic-offset 3 tab-width 8 indent-tabs-mode nil)))

  OG.

Re: [Mesa-dev] [PATCH] softpipe: Take all lods into account when texture sampling.

2012-06-21 Thread Olivier Galibert

On Wed, Jun 20, 2012 at 01:44:14PM +0100, Roland Scheidegger wrote:
> A lot of code I just glossed over it, but seems to look ok other than 
> the (performance) implications this might have.

Actually whether there's a performance implication is not obvious.  In
practice the code just kicks the 4-pixel loop one or two function
calls higher.  This unshares some tests, some function calls, and the
mip-size computation shifts.  For normal texturing and on x86 the
tests are correctly predicted after the first one, and so are the
function calls, giving all of them a near zero cost.  So I'm not sure
the cost is that measurable.

With the actual vectorization the llvmpipe situation may be different
(not so sure with the aos texturing though).

Best,

  OG.


Re: [Mesa-dev] [PATCH] softpipe: Take all lods into account when texture sampling.

2012-06-19 Thread Olivier Galibert

On Tue, Jun 19, 2012 at 02:46:35PM -0700, Jose Fonseca wrote:
> Could you give more background on why is this necessary?
> 
> This will make software renderering slower, so I'd really like to avoid it on 
> llvmpipe if at all possible.

Well, given the existence of textureLod and textureGrad every texture
sample can easily hit a different mipmap or even, by switching between
minification and magnification, a different filter entirely.  Even a
simple texture() is hit, if your polygon is horizontal enough.

And this goes double for vertex shaders, where texture fetches
have less reason to be close in texture space.

textureSize and texelFetch, with their explicit lod, have of course
the same problem, only worse.
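
To illustrate, a rough sketch of a per-sample choice of filter and
mipmap level for one quad.  The names (sample_choice, choose_level,
choose_levels_quad) are hypothetical, the min/mag switch-over is
assumed to sit at lod 0, and the level is simply rounded to nearest;
the actual softpipe code is more involved.

```c
enum filter_kind { FILTER_MAG, FILTER_MIN };

/* Hypothetical result type: which filter and which mipmap level one
 * sample ends up using. */
struct sample_choice {
   enum filter_kind filter;
   int level;
};

/* Pick filter and level for a single sample.  Assumes lod <= 0 means
 * magnification and uses nearest-level selection clamped to last_level. */
struct sample_choice choose_level(float lod, int last_level)
{
   struct sample_choice c;

   if (lod <= 0.0f) {
      c.filter = FILTER_MAG;
      c.level = 0;
   } else {
      c.filter = FILTER_MIN;
      c.level = (int)(lod + 0.5f);
      if (c.level > last_level)
         c.level = last_level;
   }
   return c;
}

/* Each of the four pixels of a quad carries its own lod, so nothing can
 * be decided once and shared across the quad. */
void choose_levels_quad(const float lod[4], int last_level,
                        struct sample_choice out[4])
{
   int j;
   for (j = 0; j < 4; j++)
      out[j] = choose_level(lod[j], last_level);
}
```

A quad with lods of -1.0, 0.2, 1.6 and 9.0 on a texture whose last
level is 3 ends up with one magnified sample and three minified ones
at levels 0, 2 and 3.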

  OG.


[Mesa-dev] [PATCH 3/4] llvmpipe: Simplify and fix system variables fetch.

2012-06-19 Thread Olivier Galibert

The system values array concept doesn't really work because it expects the
system values to be fixed per call, which is wrong for gl_VertexID and
iffy for gl_SampleID.  So this patch does two things:

- kill the array, have emit_fetch_system_value directly pick the
  values it needs (only gl_InstanceID for now, as the previous code)

- correctly handle the expected type in emit_fetch_system_value

Signed-off-by: Olivier Galibert 
Reviewed-by: Brian Paul 
---
 src/gallium/auxiliary/draw/draw_llvm.c  |   10 +--
 src/gallium/auxiliary/gallivm/lp_bld_tgsi.h |   11 +--
 src/gallium/auxiliary/gallivm/lp_bld_tgsi_soa.c |   88 +++
 src/gallium/drivers/llvmpipe/lp_state_fs.c  |2 +-
 4 files changed, 33 insertions(+), 78 deletions(-)

diff --git a/src/gallium/auxiliary/draw/draw_llvm.c b/src/gallium/auxiliary/draw/draw_llvm.c
index e1df2f1..8e787c5 100644
--- a/src/gallium/auxiliary/draw/draw_llvm.c
+++ b/src/gallium/auxiliary/draw/draw_llvm.c
@@ -459,7 +459,7 @@ generate_vs(struct draw_llvm *llvm,
 LLVMBuilderRef builder,
 LLVMValueRef (*outputs)[TGSI_NUM_CHANNELS],
 const LLVMValueRef (*inputs)[TGSI_NUM_CHANNELS],
-LLVMValueRef system_values_array,
+LLVMValueRef instance_id,
 LLVMValueRef context_ptr,
 struct lp_build_sampler_soa *draw_sampler,
 boolean clamp_vertex_color)
@@ -491,7 +491,7 @@ generate_vs(struct draw_llvm *llvm,
  vs_type,
  NULL /*struct lp_build_mask_context *mask*/,
  consts_ptr,
- system_values_array,
+ instance_id,
  NULL /*pos*/,
  inputs,
  outputs,
@@ -1249,7 +1249,6 @@ draw_llvm_generate(struct draw_llvm *llvm, struct draw_llvm_variant *variant,
LLVMValueRef stride, step, io_itr;
LLVMValueRef io_ptr, vbuffers_ptr, vb_ptr;
LLVMValueRef instance_id;
-   LLVMValueRef system_values_array;
LLVMValueRef zero = lp_build_const_int32(gallivm, 0);
LLVMValueRef one = lp_build_const_int32(gallivm, 1);
struct draw_context *draw = llvm->draw;
@@ -1340,9 +1339,6 @@ draw_llvm_generate(struct draw_llvm *llvm, struct draw_llvm_variant *variant,
 
lp_build_context_init(&bld, gallivm, lp_type_int(32));
 
-   system_values_array = lp_build_system_values_array(gallivm, vs_info,
-  instance_id, NULL);
-
/* function will return non-zero i32 value if any clipped vertices */
ret_ptr = lp_build_alloca(gallivm, int32_type, "");
LLVMBuildStore(builder, zero, ret_ptr);
@@ -1418,7 +1414,7 @@ draw_llvm_generate(struct draw_llvm *llvm, struct draw_llvm_variant *variant,
   builder,
   outputs,
   ptr_aos,
-  system_values_array,
+  instance_id,
   context_ptr,
   sampler,
   variant->key.clamp_vertex_color);
diff --git a/src/gallium/auxiliary/gallivm/lp_bld_tgsi.h b/src/gallium/auxiliary/gallivm/lp_bld_tgsi.h
index 141e799..c4e690c 100644
--- a/src/gallium/auxiliary/gallivm/lp_bld_tgsi.h
+++ b/src/gallium/auxiliary/gallivm/lp_bld_tgsi.h
@@ -205,7 +205,7 @@ lp_build_tgsi_soa(struct gallivm_state *gallivm,
   struct lp_type type,
   struct lp_build_mask_context *mask,
   LLVMValueRef consts_ptr,
-  LLVMValueRef system_values_array,
+  LLVMValueRef instance_id,
   const LLVMValueRef *pos,
   const LLVMValueRef (*inputs)[4],
   LLVMValueRef (*outputs)[4],
@@ -225,13 +225,6 @@ lp_build_tgsi_aos(struct gallivm_state *gallivm,
   const struct tgsi_shader_info *info);
 
 
-LLVMValueRef
-lp_build_system_values_array(struct gallivm_state *gallivm,
- const struct tgsi_shader_info *info,
- LLVMValueRef instance_id,
- LLVMValueRef facing);
-
-
 struct lp_exec_mask {
struct lp_build_context *bld;
 
@@ -388,7 +381,7 @@ struct lp_build_tgsi_soa_context
 */
LLVMValueRef inputs_array;
 
-   LLVMValueRef system_values_array;
+   LLVMValueRef instance_id;
 
/** bitmask indicating which register files are accessed indirectly */
unsigned indirect_files;
diff --git a/src/gallium/auxiliary/gallivm/lp_bld_tgsi_soa.c b/src/gallium/auxiliary/gallivm/lp_bld_tgsi_soa.c
index 412dc0c..26be902 100644
--- a/src/gallium/auxiliary/gallivm/lp_bld_tgsi_soa.c
+++ b/src/gallium/auxiliary/gallivm/lp_bld_tgsi_soa.c
@@ -786,18 +786,37 @@ emit_fetch_system_value(
 {
struct lp_build_tgsi_soa_context * bld = lp_soa_context(bld_base);
struct gallivm_state *gallivm = bld->bld_base.base.gallivm;
+   const struct tgsi_shader_info *info = bld->bld_base.info;
LLVMBuilde

[Mesa-dev] [PATCH 4/4] llvmpipe: Add vertex id support.

2012-06-19 Thread Olivier Galibert

Signed-off-by: Olivier Galibert 
Reviewed-by: Brian Paul 
---
 src/gallium/auxiliary/draw/draw_llvm.c  |   32 ++-
 src/gallium/auxiliary/gallivm/lp_bld_tgsi.h |   13 +++--
 src/gallium/auxiliary/gallivm/lp_bld_tgsi_soa.c |   11 +---
 src/gallium/drivers/llvmpipe/lp_state_fs.c  |5 +++-
 4 files changed, 42 insertions(+), 19 deletions(-)

diff --git a/src/gallium/auxiliary/draw/draw_llvm.c b/src/gallium/auxiliary/draw/draw_llvm.c
index 8e787c5..e08221e 100644
--- a/src/gallium/auxiliary/draw/draw_llvm.c
+++ b/src/gallium/auxiliary/draw/draw_llvm.c
@@ -459,7 +459,7 @@ generate_vs(struct draw_llvm *llvm,
 LLVMBuilderRef builder,
 LLVMValueRef (*outputs)[TGSI_NUM_CHANNELS],
 const LLVMValueRef (*inputs)[TGSI_NUM_CHANNELS],
-LLVMValueRef instance_id,
+const struct lp_bld_tgsi_system_values *system_values,
 LLVMValueRef context_ptr,
 struct lp_build_sampler_soa *draw_sampler,
 boolean clamp_vertex_color)
@@ -491,7 +491,7 @@ generate_vs(struct draw_llvm *llvm,
  vs_type,
  NULL /*struct lp_build_mask_context *mask*/,
  consts_ptr,
- instance_id,
+ system_values,
  NULL /*pos*/,
  inputs,
  outputs,
@@ -1248,7 +1248,6 @@ draw_llvm_generate(struct draw_llvm *llvm, struct draw_llvm_variant *variant,
LLVMValueRef count, fetch_elts, fetch_count;
LLVMValueRef stride, step, io_itr;
LLVMValueRef io_ptr, vbuffers_ptr, vb_ptr;
-   LLVMValueRef instance_id;
LLVMValueRef zero = lp_build_const_int32(gallivm, 0);
LLVMValueRef one = lp_build_const_int32(gallivm, 1);
struct draw_context *draw = llvm->draw;
@@ -1270,6 +1269,9 @@ draw_llvm_generate(struct draw_llvm *llvm, struct draw_llvm_variant *variant,
const unsigned pos = draw_current_shader_position_output(llvm->draw);
const unsigned cv = draw_current_shader_clipvertex_output(llvm->draw);
boolean have_clipdist = FALSE;
+   struct lp_bld_tgsi_system_values system_values;
+
+   memset(&system_values, 0, sizeof(system_values));
 
arg_types[0] = get_context_ptr_type(llvm);   /* context */
arg_types[1] = get_vertex_header_ptr_type(llvm); /* vertex_header */
@@ -1300,19 +1302,19 @@ draw_llvm_generate(struct draw_llvm *llvm, struct draw_llvm_variant *variant,
  LLVMAddAttribute(LLVMGetParam(variant_func, i),
   LLVMNoAliasAttribute);
 
-   context_ptr  = LLVMGetParam(variant_func, 0);
-   io_ptr   = LLVMGetParam(variant_func, 1);
-   vbuffers_ptr = LLVMGetParam(variant_func, 2);
-   stride   = LLVMGetParam(variant_func, 5);
-   vb_ptr   = LLVMGetParam(variant_func, 6);
-   instance_id  = LLVMGetParam(variant_func, 7);
+   context_ptr   = LLVMGetParam(variant_func, 0);
+   io_ptr= LLVMGetParam(variant_func, 1);
+   vbuffers_ptr  = LLVMGetParam(variant_func, 2);
+   stride= LLVMGetParam(variant_func, 5);
+   vb_ptr= LLVMGetParam(variant_func, 6);
+   system_values.instance_id = LLVMGetParam(variant_func, 7);
 
lp_build_name(context_ptr, "context");
lp_build_name(io_ptr, "io");
lp_build_name(vbuffers_ptr, "vbuffers");
lp_build_name(stride, "stride");
lp_build_name(vb_ptr, "vb");
-   lp_build_name(instance_id, "instance_id");
+   lp_build_name(system_values.instance_id, "instance_id");
 
if (elts) {
   fetch_elts   = LLVMGetParam(variant_func, 3);
@@ -1378,6 +1380,7 @@ draw_llvm_generate(struct draw_llvm *llvm, struct draw_llvm_variant *variant,
   lp_build_printf(builder, " --- io %d = %p, loop counter %d\n",
   io_itr, io, lp_loop.counter);
 #endif
+  system_values.vertex_id = lp_build_zero(gallivm, lp_type_uint_vec(32));
   for (i = 0; i < TGSI_NUM_CHANNELS; ++i) {
  LLVMValueRef true_index =
 LLVMBuildAdd(builder,
@@ -1395,7 +1398,10 @@ draw_llvm_generate(struct draw_llvm *llvm, struct draw_llvm_variant *variant,
  &true_index, 1, "");
 true_index = LLVMBuildLoad(builder, fetch_ptr, "fetch_elt");
  }
-
+ 
+ system_values.vertex_id = LLVMBuildInsertElement(gallivm->builder,
+  system_values.vertex_id, true_index,
+  lp_build_const_int32(gallivm, i), "");
  for (j = 0; j < draw->pt.nr_vertex_elements; ++j) {
 struct pipe_vertex_element *velem = &draw->pt.vertex_element[j];
 LLVMValueRef vb_index =
@@ -1403,7 +1409,7 @@ draw_llvm_generate(struct draw_llvm *llvm, struc

[Mesa-dev] [PATCH 2/4] draw: fix flat shading and screen-space linear interpolation in clipper

2012-06-19 Thread Olivier Galibert

This includes:
- picking up correctly which attributes are flatshaded and which are
  noperspective

- copying the flatshaded attributes when needed, including the
  non-built-in ones

- correctly interpolating the noperspective attributes in screen-space
  instead of in a 3d-correct fashion.
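
The screen-space t used for the noperspective case can be sketched in
standalone form as below.  It follows the logic of the patch, but the
function names and the flat float[2] arguments are made up for the
example, and lerp_attr assumes that t = 0 yields the outside vertex and
t = 1 the inside one.

```c
/* Hypothetical standalone version of the screen-space t computation.
 * in_pos/out_pos/dst_pos are the screen x,y of the inside vertex, the
 * outside vertex and the newly created clipped vertex. */
float screen_space_t(const float in_pos[2], const float out_pos[2],
                     const float dst_pos[2], float t_3d)
{
   int k;

   /* Use the first axis along which the two endpoints differ. */
   for (k = 0; k < 2; k++) {
      if (in_pos[k] != out_pos[k])
         return (dst_pos[k] - out_pos[k]) / (in_pos[k] - out_pos[k]);
   }

   /* Both endpoints land on the same screen position: any value will
    * do, so fall back to the 3d t. */
   return t_3d;
}

/* Assumed interpolation convention: out at t = 0, in at t = 1. */
float lerp_attr(float t, float in, float out)
{
   return out + t * (in - out);
}
```

A noperspective attribute would then be interpolated with the value
returned by screen_space_t, and every other attribute with the 3d t.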

Signed-off-by: Olivier Galibert 
Reviewed-by: Brian Paul 
---
 src/gallium/auxiliary/draw/draw_pipe_clip.c |  144 +--
 1 file changed, 113 insertions(+), 31 deletions(-)

diff --git a/src/gallium/auxiliary/draw/draw_pipe_clip.c b/src/gallium/auxiliary/draw/draw_pipe_clip.c
index 4da4d65..2d36eb3 100644
--- a/src/gallium/auxiliary/draw/draw_pipe_clip.c
+++ b/src/gallium/auxiliary/draw/draw_pipe_clip.c
@@ -39,6 +39,7 @@
 
 #include "draw_vs.h"
 #include "draw_pipe.h"
+#include "draw_fs.h"
 
 
 #ifndef IS_NEGATIVE
@@ -56,11 +57,12 @@
 struct clip_stage {
struct draw_stage stage;  /**< base class */
 
-   /* Basically duplicate some of the flatshading logic here:
-*/
-   boolean flat;
-   uint num_color_attribs;
-   uint color_attribs[4];  /* front/back primary/secondary colors */
+   /* List of the attributes to be flatshaded. */
+   uint num_flat_attribs;
+   uint flat_attribs[PIPE_MAX_SHADER_OUTPUTS];
+
+   /* Mask of attributes in noperspective mode */
+   boolean noperspective_attribs[PIPE_MAX_SHADER_OUTPUTS];
 
float (*plane)[4];
 };
@@ -91,17 +93,16 @@ static void interp_attr( float dst[4],
 
 
 /**
- * Copy front/back, primary/secondary colors from src vertex to dst vertex.
- * Used when flat shading.
+ * Copy flat shaded attributes src vertex to dst vertex.
  */
-static void copy_colors( struct draw_stage *stage,
-struct vertex_header *dst,
-const struct vertex_header *src )
+static void copy_flat( struct draw_stage *stage,
+   struct vertex_header *dst,
+   const struct vertex_header *src )
 {
const struct clip_stage *clipper = clip_stage(stage);
uint i;
-   for (i = 0; i < clipper->num_color_attribs; i++) {
-  const uint attr = clipper->color_attribs[i];
+   for (i = 0; i < clipper->num_flat_attribs; i++) {
+  const uint attr = clipper->flat_attribs[i];
   COPY_4FV(dst->data[attr], src->data[attr]);
}
 }
@@ -120,6 +121,7 @@ static void interp( const struct clip_stage *clip,
const unsigned pos_attr = draw_current_shader_position_output(clip->stage.draw);
const unsigned clip_attr = draw_current_shader_clipvertex_output(clip->stage.draw);
unsigned j;
+   float t_nopersp;
 
/* Vertex header.
 */
@@ -148,12 +150,36 @@ static void interp( const struct clip_stage *clip,
   dst->data[pos_attr][2] = pos[2] * oow * scale[2] + trans[2];
   dst->data[pos_attr][3] = oow;
}
+   
+   /**
+* Compute the t in screen-space instead of 3d space to use
+* for noperspective interpolation.
+*
+* The points can be aligned with the X axis, so in that case try
+* the Y.  When both points are at the same screen position, we can
+* pick whatever value (the interpolated point won't be in front
+* anyway), so just use the 3d t.
+*/
+   {
+  int k;
+  t_nopersp = t;
+  for (k = 0; k < 2; k++)
+ if (in->data[pos_attr][k] != out->data[pos_attr][k]) {
+t_nopersp = (dst->data[pos_attr][k] - out->data[pos_attr][k]) /
+   (in->data[pos_attr][k] - out->data[pos_attr][k]);
+break;
+ }
+   }
 
/* Other attributes
 */
for (j = 0; j < nr_attrs; j++) {
-  if (j != pos_attr && j != clip_attr)
-interp_attr(dst->data[j], t, in->data[j], out->data[j]);
+  if (j != pos_attr && j != clip_attr) {
+ if (clip->noperspective_attribs[j])
+interp_attr(dst->data[j], t_nopersp, in->data[j], out->data[j]);
+ else
+interp_attr(dst->data[j], t, in->data[j], out->data[j]);
+  }
}
 }
 
@@ -406,14 +432,14 @@ do_clip_tri( struct draw_stage *stage,
/* If flat-shading, copy provoking vertex color to polygon vertex[0]
 */
if (n >= 3) {
-  if (clipper->flat) {
+  if (clipper->num_flat_attribs) {
  if (stage->draw->rasterizer->flatshade_first) {
 if (inlist[0] != header->v[0]) {
assert(tmpnr < MAX_CLIPPED_VERTICES + 1);
if (tmpnr >= MAX_CLIPPED_VERTICES + 1)
   return;
inlist[0] = dup_vert(stage, inlist[0], tmpnr++);
-   copy_colors(stage, inlist[0], header->v[0]);
+   copy_flat(stage, inlist[0], header->v[0]);
 }
  }
  else {
@@ -422,7 +448,7 @@ do_clip_tri( struct draw_stage *stage,
if (tmpnr >= MAX_CLIPPED_VERTICES + 1)
   return;

[Mesa-dev] [PATCH 1/4] softpipe: Offset is not to be applied to the layer parameter of array texture fetches.

2012-06-19 Thread Olivier Galibert

Signed-off-by: Olivier Galibert 
Reviewed-by: Brian Paul 
---
 src/gallium/drivers/softpipe/sp_tex_sample.c |4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/src/gallium/drivers/softpipe/sp_tex_sample.c b/src/gallium/drivers/softpipe/sp_tex_sample.c
index d4c0175..f29a6c7 100644
--- a/src/gallium/drivers/softpipe/sp_tex_sample.c
+++ b/src/gallium/drivers/softpipe/sp_tex_sample.c
@@ -2693,7 +2693,7 @@ sample_get_texels(struct tgsi_sampler *tgsi_sampler,
case PIPE_TEXTURE_1D_ARRAY:
   for (j = 0; j < TGSI_QUAD_SIZE; j++) {
  int x = CLAMP(v_i[j] + offset[0], 0, width - 1);
- int y = CLAMP(v_j[j] + offset[1], 0, layers - 1);
+ int y = CLAMP(v_j[j], 0, layers - 1);
 tx = get_texel_1d_array(samp, addr, x, y);
 for (c = 0; c < 4; c++) {
rgba[c][j] = tx[c];
@@ -2715,7 +2715,7 @@ sample_get_texels(struct tgsi_sampler *tgsi_sampler,
   for (j = 0; j < TGSI_QUAD_SIZE; j++) {
  int x = CLAMP(v_i[j] + offset[0], 0, width - 1);
  int y = CLAMP(v_j[j] + offset[1], 0, height - 1);
- int layer = CLAMP(v_k[j] + offset[2], 0, layers - 1);
+ int layer = CLAMP(v_k[j], 0, layers - 1);
 tx = get_texel_2d_array(samp, addr, x, y, layer);
 for (c = 0; c < 4; c++) {
rgba[c][j] = tx[c];
-- 
1.7.10.280.gaa39


[Mesa-dev] Ping: patches to apply

2012-06-19 Thread Olivier Galibert

  Hi,

They've been reviewed, they've been changed when requested :-)

Best,

  OG.

Re: [Mesa-dev] Clarifications w.r.t MSAA

2012-06-12 Thread Olivier Galibert

On Tue, Jun 12, 2012 at 01:50:08PM +0200, Christoph Bumiller wrote:
> > First question: how many depths should be computed, and for which
> > coordinates? Which of these values is associated with which sample?
> 
> One for each sample point. The depth buffer will be multisampled as well.
> Coverage sampling (CSAA) where you have extra coverage samples that do
> NOT (necessarily) correspond to color sample locations are not covered
> by the GL spec, it's vendor-specific.

Ok.  So that means that if the shader writes z, you have to do full
supersampling then.


> > Second question: how many samples should be shaded, and for which
> > coordinates?  What is the impact of depth testing failure?
> 
> As many as the user requested via glMinSampleShading, and the sample
> locations to choose seem to be up to the implementation.

Do you know what's usually expected?  Center of collapsed samples, one
of the samples, center of the pixel?


> > Third question: what happens when a variable has a "sample" qualifier
> > in the fragment shader?  Or "centroid"?
> 
> "When interpolating variables declared using sample in when MULTISAMPLE
> is enabled, the fragment shader will be invoked separately for each (!)
> covered sample and the variable will be sampled at the corresponding
> sample point."

So a "sample" anywhere means full supersampling, ok.

Thanks,

  OG.

[Mesa-dev] Clarifications w.r.t MSAA

2012-06-12 Thread Olivier Galibert

  Hi all,

I'm getting a little lost in all the interactions between the
different parts of the GL standards and what I understand of the
expectations when it comes to MSAA.  It would be nice if I could have
some clarifications.

I'll start with what I think I understand (and please correct me when
I'm wrong) and add a number of questions.  I'll also ignore the
"resolve" part, which isn't an issue (at least for me :-).


MSAA is a variant on the supersampling theme where the coverage is
supersampled but depth, stencil and color may or may not be.  The
destination buffer has enough space to store the full results of a
complete supersampling, but some of the values may be duplicated.

The variable MIN_SAMPLE_SHADING_VALUE allows the application to
control the minimum number of values that have to be computed.  It can
say for instance that in a 16xMSAA case at least 4 samples per pixel
are required.
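
As a worked example, ARB_sample_shading puts that minimum at
max(ceil(MIN_SAMPLE_SHADING_VALUE * samples), 1).  The helper below is
only a sketch of that arithmetic; its name and signature are
hypothetical and it is not taken from any driver.

```c
#include <assert.h>

/* Hypothetical helper: minimum number of samples that must be shaded
 * separately, computed as max(ceil(value * num_samples), 1). */
int min_shaded_samples(float min_sample_shading_value, int num_samples)
{
   float scaled;
   int n;

   assert(num_samples > 0);

   /* The GL value is clamped to [0, 1]; do the same here. */
   if (min_sample_shading_value < 0.0f)
      min_sample_shading_value = 0.0f;
   if (min_sample_shading_value > 1.0f)
      min_sample_shading_value = 1.0f;

   scaled = min_sample_shading_value * (float)num_samples;

   /* ceil() done by hand to avoid pulling in libm. */
   n = (int)scaled;
   if ((float)n < scaled)
      n++;

   return n < 1 ? 1 : n;
}
```

For 16xMSAA a value of 0.25 gives the 4 samples per pixel of the
example above, and a value of 0 still gives 1.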

So let's take a case of 16xMSAA (say with the DX11 pattern) and let's
look at the pipeline.  First the coverage is sampled for the 16 fixed
positions, leaving C active samples.  Then there should be early depth
testing then shading, or the other way around, depending on the
shaders.

First question: how many depths should be computed, and for which
coordinates? Which of these values is associated with which sample?

Second question: how many samples should be shaded, and for which
coordinates?  What is the impact of depth testing failure?

Third question: what happens when a variable has a "sample" qualifier
in the fragment shader?  Or "centroid"?

Fourth question: how does gl_SampleMask interact with all that when
more than one sample is evaluated.  And what does gl_SampleMaskIn look
like in the same case?

I hope you people can help me clarify all that stuff :-)

Best,

  OG.

Re: [Mesa-dev] [PATCH 0/2] Add vertex id to llvmpipe.

2012-06-10 Thread Olivier Galibert

On Fri, Jun 08, 2012 at 09:01:42AM -0700, Jose Fonseca wrote:
> Oliver,
> 
> There will be other system values in the future, so instead of passing every 
> value as a different parameter, please define a structure in 
> src/gallium/auxiliary/gallivm/lp_bld_tgsi.h as
> 
> struct lp_bld_tgsi_system_values {
>   LLVMValueRef facing;
>   LLVMValueRef instance_id;
>   LLVMValueRef vertex_id;
>   ...
> }
> 
> which is then passed to lp_build_tgsi_soa and all other functions.
> 
> Otherwise the change looks good overall.

Something like that for the second part?

  OG.

Author: Olivier Galibert 
Date:   Fri Jun 1 22:58:58 2012 +0200

llvmpipe: Add vertex id support.

Signed-off-by: Olivier Galibert 

diff --git a/src/gallium/auxiliary/draw/draw_llvm.c b/src/gallium/auxiliary/draw/draw_llvm.c
index d5eb727..de495cf 100644
--- a/src/gallium/auxiliary/draw/draw_llvm.c
+++ b/src/gallium/auxiliary/draw/draw_llvm.c
@@ -456,7 +456,7 @@ generate_vs(struct draw_llvm *llvm,
 LLVMBuilderRef builder,
 LLVMValueRef (*outputs)[TGSI_NUM_CHANNELS],
 const LLVMValueRef (*inputs)[TGSI_NUM_CHANNELS],
-LLVMValueRef instance_id,
+const struct lp_bld_tgsi_system_values *system_values,
 LLVMValueRef context_ptr,
 struct lp_build_sampler_soa *draw_sampler,
 boolean clamp_vertex_color)
@@ -488,7 +488,7 @@ generate_vs(struct draw_llvm *llvm,
  vs_type,
  NULL /*struct lp_build_mask_context *mask*/,
  consts_ptr,
- instance_id,
+ system_values,
  NULL /*pos*/,
  inputs,
  outputs,
@@ -1245,7 +1245,6 @@ draw_llvm_generate(struct draw_llvm *llvm, struct draw_llvm_variant *variant,
LLVMValueRef count, fetch_elts, fetch_count;
LLVMValueRef stride, step, io_itr;
LLVMValueRef io_ptr, vbuffers_ptr, vb_ptr;
-   LLVMValueRef instance_id;
LLVMValueRef zero = lp_build_const_int32(gallivm, 0);
LLVMValueRef one = lp_build_const_int32(gallivm, 1);
struct draw_context *draw = llvm->draw;
@@ -1267,6 +1266,9 @@ draw_llvm_generate(struct draw_llvm *llvm, struct draw_llvm_variant *variant,
const unsigned pos = draw_current_shader_position_output(llvm->draw);
const unsigned cv = draw_current_shader_clipvertex_output(llvm->draw);
boolean have_clipdist = FALSE;
+   struct lp_bld_tgsi_system_values system_values;
+
+   memset(&system_values, 0, sizeof(system_values));
 
arg_types[0] = get_context_ptr_type(llvm);   /* context */
arg_types[1] = get_vertex_header_ptr_type(llvm); /* vertex_header */
@@ -1297,19 +1299,19 @@ draw_llvm_generate(struct draw_llvm *llvm, struct draw_llvm_variant *variant,
  LLVMAddAttribute(LLVMGetParam(variant_func, i),
   LLVMNoAliasAttribute);
 
-   context_ptr  = LLVMGetParam(variant_func, 0);
-   io_ptr   = LLVMGetParam(variant_func, 1);
-   vbuffers_ptr = LLVMGetParam(variant_func, 2);
-   stride   = LLVMGetParam(variant_func, 5);
-   vb_ptr   = LLVMGetParam(variant_func, 6);
-   instance_id  = LLVMGetParam(variant_func, 7);
+   context_ptr   = LLVMGetParam(variant_func, 0);
+   io_ptr= LLVMGetParam(variant_func, 1);
+   vbuffers_ptr  = LLVMGetParam(variant_func, 2);
+   stride= LLVMGetParam(variant_func, 5);
+   vb_ptr= LLVMGetParam(variant_func, 6);
+   system_values.instance_id = LLVMGetParam(variant_func, 7);
 
lp_build_name(context_ptr, "context");
lp_build_name(io_ptr, "io");
lp_build_name(vbuffers_ptr, "vbuffers");
lp_build_name(stride, "stride");
lp_build_name(vb_ptr, "vb");
-   lp_build_name(instance_id, "instance_id");
+   lp_build_name(system_values.instance_id, "instance_id");
 
if (elts) {
   fetch_elts   = LLVMGetParam(variant_func, 3);
@@ -1375,6 +1377,7 @@ draw_llvm_generate(struct draw_llvm *llvm, struct draw_llvm_variant *variant,
   lp_build_printf(builder, " --- io %d = %p, loop counter %d\n",
   io_itr, io, lp_loop.counter);
 #endif
+  system_values.vertex_id = lp_build_zero(gallivm, lp_type_uint_vec(32));
   for (i = 0; i < TGSI_NUM_CHANNELS; ++i) {
  LLVMValueRef true_index =
 LLVMBuildAdd(builder,
@@ -1392,7 +1395,10 @@ draw_llvm_generate(struct draw_llvm *llvm, struct draw_llvm_variant *variant,
  &true_index, 1, "");
 true_index = LLVMBuildLoad(builder, fetch_ptr, "fetch_elt");
  }
-
+ 
+ system_values.vertex_id = LLVMBuildInsertElement(gallivm->builder,
+  
system_values.vertex_id

Re: [Mesa-dev] [PATCH] glsl: Fix pi/2 constant in acos built-in function

2012-06-06 Thread Olivier Galibert

On Tue, Jun 05, 2012 at 04:51:54PM -0700, Paul Berry wrote:
> The best idea I've got so far would be a shader_runner test with a fragment
> shader that computes dFdx(asin(x)), compares it to the theoretical closed
> form derivative of asin(x) (which is 1/sqrt(1-x^2)), and draws red pixels
> if the result is outside a certain error tolerance.  We'd probably want to
> use a relative error (since the derivative of asin(x) can get quite large)
> and stop a bit shy of the endpoints where it goes to infinity.

Can't you take the perfectly reasonable hypothesis that the system's
asin is precise, and upload something like a 256x256 R32FG32FB32FA32F
texture with reference values?  262144 testing points should be good
enough :-)

And that's something that generalizes easily to all the functions you
may want to test on a segment.

  OG.
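
A minimal sketch of the reference-table idea above, assuming the table is filled on the CPU and the shader results are read back into a plain float array for comparison. All names here (`REF_COUNT`, `ref_sample_x`, `ref_fill`, `ref_max_abs_error`) are hypothetical, and the texture upload and readback are left out.

```c
#include <stddef.h>

/* 256 x 256 texels with 4 float channels each: 262144 test points. */
enum {
   REF_WIDTH = 256,
   REF_HEIGHT = 256,
   REF_CHANNELS = 4,
   REF_COUNT = REF_WIDTH * REF_HEIGHT * REF_CHANNELS
};

/* Map sample index i to an x value in [lo, hi], both endpoints included. */
double ref_sample_x(size_t i, double lo, double hi)
{
   return lo + (hi - lo) * (double)i / (double)(REF_COUNT - 1);
}

/* Fill out[0..REF_COUNT-1] with func evaluated across the segment.
 * This is the data one would upload as the reference texture. */
void ref_fill(float *out, double (*func)(double), double lo, double hi)
{
   size_t i;

   for (i = 0; i < REF_COUNT; ++i)
      out[i] = (float)func(ref_sample_x(i, lo, hi));
}

/* Largest absolute difference between reference and measured values. */
double ref_max_abs_error(const float *ref, const float *got, size_t count)
{
   double worst = 0.0;
   size_t i;

   for (i = 0; i < count; ++i) {
      double err = (double)got[i] - (double)ref[i];

      if (err < 0.0)
         err = -err;
      if (err > worst)
         worst = err;
   }
   return worst;
}
```

With `<math.h>` included, `ref_fill(table, asin, -1.0, 1.0)` would produce the asin reference values; any other function of one variable can be passed the same way, which is the generalization mentioned above. A relative-error variant may be preferable where the function's magnitude varies a lot.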


Re: [Mesa-dev] [PATCH] glsl: Fix pi/2 constant in acos built-in function

2012-06-04 Thread Olivier Galibert

On Mon, Jun 04, 2012 at 03:23:34PM -0700, Paul Berry wrote:
> I'm not even kidding--I love this
> stuff and I'm jealous that I don't have time to work on it right now

Do you have a favorite method for Vandermonde matrix inversion?

  OG.

Re: [Mesa-dev] [PATCH] glsl: Fix pi/2 constant in acos built-in function

2012-06-04 Thread Olivier Galibert

On Mon, Jun 04, 2012 at 01:11:13PM -0700, Ian Romanick wrote:
> From: Ian Romanick 
> 
> In single precision, 1.5707963 becomes 1.5707962513 which is too
> small.  However, 1.5707964 becomes 1.5707963705 which is just right.
> The value 1.5707964 is already used in asin.ir.
> 
> NOTE: This is a candidate for stable release branches.

If piglit stops bitching on that particular problem thanks to such a
small change, it's just beautiful.

Do we need a better precision atan, or should piglit just be told to
shut up?  The shut-up patch was sent in ages ago, but I can't do
the "more precision" one if that's what's wanted.

  OG.
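
The rounding described in the quoted commit message can be checked with a few lines of C. This is a sketch assuming IEEE 754 single precision for `float`; `to_single` and `print_single` are hypothetical helper names.

```c
#include <stdio.h>

/* Round a double-precision value to single precision, as happens when a
 * decimal constant ends up stored in a 32-bit float. */
float to_single(double value)
{
   return (float)value;
}

/* Print the single-precision value that a decimal constant rounds to. */
void print_single(const char *label, double value)
{
   printf("%-10s -> %.10f\n", label, (double)to_single(value));
}
```

Calling `print_single("1.5707963", 1.5707963)` and `print_single("1.5707964", 1.5707964)` shows 1.5707962513 and 1.5707963705 respectively, matching the figures in the commit message. The second value is also what pi/2 itself rounds to in single precision, which is why the seemingly "wrong" last digit is the better constant.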


[Mesa-dev] [PATCH 2/2] llvmpipe: Add vertex id support.

2012-06-01 Thread Olivier Galibert

Signed-off-by: Olivier Galibert 
---
 src/gallium/auxiliary/draw/draw_llvm.c          |   10 ++++++++--
 src/gallium/auxiliary/gallivm/lp_bld_tgsi.h     |    3 ++-
 src/gallium/auxiliary/gallivm/lp_bld_tgsi_soa.c |    7 +++++++
 src/gallium/drivers/llvmpipe/lp_state_fs.c      |    2 +-
 4 files changed, 18 insertions(+), 4 deletions(-)

diff --git a/src/gallium/auxiliary/draw/draw_llvm.c b/src/gallium/auxiliary/draw/draw_llvm.c
index d5eb727..71125ba 100644
--- a/src/gallium/auxiliary/draw/draw_llvm.c
+++ b/src/gallium/auxiliary/draw/draw_llvm.c
@@ -457,6 +457,7 @@ generate_vs(struct draw_llvm *llvm,
 LLVMValueRef (*outputs)[TGSI_NUM_CHANNELS],
 const LLVMValueRef (*inputs)[TGSI_NUM_CHANNELS],
 LLVMValueRef instance_id,
+LLVMValueRef vertex_id,
 LLVMValueRef context_ptr,
 struct lp_build_sampler_soa *draw_sampler,
 boolean clamp_vertex_color)
@@ -489,6 +490,7 @@ generate_vs(struct draw_llvm *llvm,
  NULL /*struct lp_build_mask_context *mask*/,
  consts_ptr,
  instance_id,
+ vertex_id,
  NULL /*pos*/,
  inputs,
  outputs,
@@ -1245,7 +1247,7 @@ draw_llvm_generate(struct draw_llvm *llvm, struct draw_llvm_variant *variant,
LLVMValueRef count, fetch_elts, fetch_count;
LLVMValueRef stride, step, io_itr;
LLVMValueRef io_ptr, vbuffers_ptr, vb_ptr;
-   LLVMValueRef instance_id;
+   LLVMValueRef instance_id, vertex_id;
LLVMValueRef zero = lp_build_const_int32(gallivm, 0);
LLVMValueRef one = lp_build_const_int32(gallivm, 1);
struct draw_context *draw = llvm->draw;
@@ -1375,6 +1377,7 @@ draw_llvm_generate(struct draw_llvm *llvm, struct draw_llvm_variant *variant,
   lp_build_printf(builder, " --- io %d = %p, loop counter %d\n",
   io_itr, io, lp_loop.counter);
 #endif
+  vertex_id = lp_build_zero(gallivm, lp_type_uint_vec(32));
   for (i = 0; i < TGSI_NUM_CHANNELS; ++i) {
  LLVMValueRef true_index =
 LLVMBuildAdd(builder,
@@ -1392,7 +1395,9 @@ draw_llvm_generate(struct draw_llvm *llvm, struct draw_llvm_variant *variant,
  &true_index, 1, "");
 true_index = LLVMBuildLoad(builder, fetch_ptr, "fetch_elt");
  }
-
+ 
+ vertex_id = LLVMBuildInsertElement(gallivm->builder, vertex_id, true_index,
+lp_build_const_int32(gallivm, i), "");
  for (j = 0; j < draw->pt.nr_vertex_elements; ++j) {
 struct pipe_vertex_element *velem = &draw->pt.vertex_element[j];
 LLVMValueRef vb_index =
@@ -1412,6 +1417,7 @@ draw_llvm_generate(struct draw_llvm *llvm, struct draw_llvm_variant *variant,
   outputs,
   ptr_aos,
   instance_id,
+  vertex_id,
   context_ptr,
   sampler,
   variant->key.clamp_vertex_color);
diff --git a/src/gallium/auxiliary/gallivm/lp_bld_tgsi.h b/src/gallium/auxiliary/gallivm/lp_bld_tgsi.h
index c4e690c..f87f899 100644
--- a/src/gallium/auxiliary/gallivm/lp_bld_tgsi.h
+++ b/src/gallium/auxiliary/gallivm/lp_bld_tgsi.h
@@ -206,6 +206,7 @@ lp_build_tgsi_soa(struct gallivm_state *gallivm,
   struct lp_build_mask_context *mask,
   LLVMValueRef consts_ptr,
   LLVMValueRef instance_id,
+  LLVMValueRef vertex_id,
   const LLVMValueRef *pos,
   const LLVMValueRef (*inputs)[4],
   LLVMValueRef (*outputs)[4],
@@ -381,7 +382,7 @@ struct lp_build_tgsi_soa_context
 */
LLVMValueRef inputs_array;
 
-   LLVMValueRef instance_id;
+   LLVMValueRef instance_id, vertex_id;
 
/** bitmask indicating which register files are accessed indirectly */
unsigned indirect_files;
diff --git a/src/gallium/auxiliary/gallivm/lp_bld_tgsi_soa.c b/src/gallium/auxiliary/gallivm/lp_bld_tgsi_soa.c
index 26be902..37599da 100644
--- a/src/gallium/auxiliary/gallivm/lp_bld_tgsi_soa.c
+++ b/src/gallium/auxiliary/gallivm/lp_bld_tgsi_soa.c
@@ -799,6 +799,11 @@ emit_fetch_system_value(
   atype = TGSI_TYPE_UNSIGNED;
   break;
 
+   case TGSI_SEMANTIC_VERTEXID:
+  res = bld->vertex_id;
+  atype = TGSI_TYPE_UNSIGNED;
+  break;
+
default:
   assert(!"unexpected semantic in emit_fetch_system_value");
   res = bld_base->base.zero;
@@ -1996,6 +2001,7 @@ lp_build_tgsi_soa(struct gallivm_state *gallivm,
   struct lp_build_mask_context *mask,
   LLVMValueRef consts_ptr,
   LLVMValueRef instance_id,
+  LLVMValueRef vertex_id,
   const LLVMValueRef *pos,
   const LLVMValueRef (*inp
