Re: [Mesa-dev] [PATCH 1/4] nv50: add target->hasDualIssueing()

2016-08-13 Thread Patrick Baggett
On Sat, Aug 13, 2016 at 10:43 AM, Tobias Klausmann
 wrote:
>
>
>
> On 13.08.2016 12:02, Karol Herbst wrote:
>>
>> Signed-off-by: Karol Herbst 
>> ---
>>   src/gallium/drivers/nouveau/codegen/nv50_ir_target.h| 1 +
>>   src/gallium/drivers/nouveau/codegen/nv50_ir_target_nvc0.cpp | 7 ++-
>>   src/gallium/drivers/nouveau/codegen/nv50_ir_target_nvc0.h   | 1 +
>>   3 files changed, 8 insertions(+), 1 deletion(-)
>>
>> diff --git a/src/gallium/drivers/nouveau/codegen/nv50_ir_target.h 
>> b/src/gallium/drivers/nouveau/codegen/nv50_ir_target.h
>> index 4a701f7..485ca16 100644
>> --- a/src/gallium/drivers/nouveau/codegen/nv50_ir_target.h
>> +++ b/src/gallium/drivers/nouveau/codegen/nv50_ir_target.h
>> @@ -222,6 +222,7 @@ public:
>>const Value *) const = 0;
>>// whether @insn can be issued together with @next (order matters)
>> +   virtual bool hasDualIssueing() const { return false; }
>>  virtual bool canDualIssue(const Instruction *insn,
>>const Instruction *next) const { return 
>> false; }
>>  virtual int getLatency(const Instruction *) const { return 1; }
>> diff --git a/src/gallium/drivers/nouveau/codegen/nv50_ir_target_nvc0.cpp 
>> b/src/gallium/drivers/nouveau/codegen/nv50_ir_target_nvc0.cpp
>> index 04ac288..faf2121 100644
>> --- a/src/gallium/drivers/nouveau/codegen/nv50_ir_target_nvc0.cpp
>> +++ b/src/gallium/drivers/nouveau/codegen/nv50_ir_target_nvc0.cpp
>> @@ -605,12 +605,17 @@ int TargetNVC0::getThroughput(const Instruction *i) 
>> const
>>  }
>>   }
>>   +bool TargetNVC0::hasDualIssueing() const

The correct spelling is "issuing". English can be so silly at times...

>> +{
>> +   return getChipset() >= 0xe4;
>> +}
>> +
>>   bool TargetNVC0::canDualIssue(const Instruction *a, const Instruction *b) 
>> const
>>   {
>>  const OpClass clA = operationClass[a->op];
>>  const OpClass clB = operationClass[b->op];
>>   -   if (getChipset() >= 0xe4) {
>> +   if (hasDualIssueing()) {
>> // not texturing
>> // not if the 2nd instruction isn't necessarily executed
>> if (clA == OPCLASS_TEXTURE || clA == OPCLASS_FLOW)
>> diff --git a/src/gallium/drivers/nouveau/codegen/nv50_ir_target_nvc0.h 
>> b/src/gallium/drivers/nouveau/codegen/nv50_ir_target_nvc0.h
>> index 7d11cd9..3d55da7 100644
>> --- a/src/gallium/drivers/nouveau/codegen/nv50_ir_target_nvc0.h
>> +++ b/src/gallium/drivers/nouveau/codegen/nv50_ir_target_nvc0.h
>> @@ -57,6 +57,7 @@ public:
>>  virtual bool isPostMultiplySupported(operation, float, int& e) const;
>>  virtual bool mayPredicate(const Instruction *, const Value *) const;
>>   +   virtual bool hasDualIssueing() const;
>>  virtual bool canDualIssue(const Instruction *, const Instruction *) 
>> const;
>>  virtual int getLatency(const Instruction *) const;
>>  virtual int getThroughput(const Instruction *) const;
>
>
> Reviewed-by: Tobias Klausmann 
> ___
> mesa-dev mailing list
> mesa-dev@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 3/6] nir: Turn -(b2f(a) + b2f(b) >= 0 into !(a || b).

2016-08-10 Thread Patrick Baggett
> >
> > For now, this patch is
> >
> > Reviewed-by: Ian Romanick 
>

I had a hard time parsing the title: "Turn -(b2f(a) + b2f(b) >= 0 into
!(a || b)"  at first, until I saw the replacement instructions. You're
missing a ')' on the commit line. :)


Re: [Mesa-dev] [PATCH] mesa: Make TexSubImage check negative dimensions sooner.

2016-06-08 Thread Patrick Baggett
Sorry, didn't CC mesa-dev, trying again...

On Wed, Jun 8, 2016 at 4:11 PM, Kenneth Graunke  wrote:
> Two dEQP tests expect INVALID_VALUE errors for negative width/height
> parameters, but get INVALID_OPERATION because they haven't actually
> created a destination image.  This is arguably not a bug in Mesa, as
> there's no specified ordering of error conditions.
>
> However, it's also really easy to make the tests pass, and there's
> no real harm in doing these checks earlier.
>
> Fixes:
> dEQP-GLES3.functional.negative_api.texture.texsubimage3d_neg_width_height
> dEQP-GLES31.functional.debug.negative_coverage.get_error.texture.texsubimage3d_neg_width_height
>
> Signed-off-by: Kenneth Graunke 
> ---
>  src/mesa/main/teximage.c | 68 
> ++--
>  1 file changed, 49 insertions(+), 19 deletions(-)
>
> diff --git a/src/mesa/main/teximage.c b/src/mesa/main/teximage.c
> index 58b7f27..d4f8278 100644
> --- a/src/mesa/main/teximage.c
> +++ b/src/mesa/main/teximage.c
> @@ -1102,6 +1102,32 @@ _mesa_legal_texture_dimensions(struct gl_context *ctx, 
> GLenum target,
> }
>  }
>
> +static bool
> +error_check_subtexture_negative_dimensions(struct gl_context *ctx,
> +   GLuint dims,
> +   GLsizei subWidth,
> +   GLsizei subHeight,
> +   GLsizei subDepth,
> +   const char *func)
> +{
> +   /* Check size */
> +   if (subWidth < 0) {
> +  _mesa_error(ctx, GL_INVALID_VALUE, "%s(width=%d)", func, subWidth);
> +  return true;
> +   }
> +
> +   if (dims > 1 && subHeight < 0) {
> +  _mesa_error(ctx, GL_INVALID_VALUE, "%s(height=%d)", func, subHeight);
> +  return true;
> +   }
> +
> +   if (dims > 2 && subDepth < 0) {
> +  _mesa_error(ctx, GL_INVALID_VALUE, "%s(depth=%d)", func, subDepth);
> +  return true;
> +   }
> +

What do you think of a structure like:

switch (dims) {
case 3:
    if (subDepth < 0) {
        ...
    }
    /* fall through */
case 2:
    if (subHeight < 0) {
        ...
    }
    /* fall through */
default:
    if (subWidth < 0) {
        ...
    }
}
return false; /* no negative dimension found */

I think this would reduce the overall number of expressions to check.
If you just want to check whether any are < 0, you can OR the sign
bits:


unsigned result = 0;
switch (dims) {
case 3: result |= (unsigned)subDepth & (1u << 31);  /* fall through */
case 2: result |= (unsigned)subHeight & (1u << 31); /* fall through */
default: result |= (unsigned)subWidth & (1u << 31);
}
return result != 0;

...then later call that function to generate a more detailed error
message about specifically which dimension was negative.
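
To make the sign-bit idea concrete, here is a minimal standalone sketch. It is not the code from the patch: the helper name any_negative_dim is made up, int32_t stands in for GLsizei, and it assumes 32-bit two's-complement values.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper, not part of Mesa: returns true if any of the
 * dimensions that matter for `dims` (1, 2 or 3) is negative.
 * The values are ORed together as unsigned, so the combined sign bit is
 * set if any input sign bit was set and no signed shift is involved. */
static bool
any_negative_dim(unsigned dims, int32_t w, int32_t h, int32_t d)
{
   uint32_t bits = 0;

   assert(dims >= 1 && dims <= 3);

   switch (dims) {
   case 3:
      bits |= (uint32_t)d;
      /* fall through */
   case 2:
      bits |= (uint32_t)h;
      /* fall through */
   default:
      bits |= (uint32_t)w;
   }

   return (bits >> 31) != 0;
}
```

A caller would use this as the cheap first test and only walk the individual dimensions to build the error string when it returns true.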

> +   return false;
> +}
>
>  /**
>   * Do error checking of xoffset, yoffset, zoffset, width, height and depth
> @@ -1119,25 +1145,6 @@ error_check_subtexture_dimensions(struct gl_context 
> *ctx, GLuint dims,
> const GLenum target = destImage->TexObject->Target;
> GLuint bw, bh, bd;
>
> -   /* Check size */
> -   if (subWidth < 0) {
> -  _mesa_error(ctx, GL_INVALID_VALUE,
> -  "%s(width=%d)", func, subWidth);
> -  return GL_TRUE;
> -   }
> -
> -   if (dims > 1 && subHeight < 0) {
> -  _mesa_error(ctx, GL_INVALID_VALUE,
> -  "%s(height=%d)", func, subHeight);
> -  return GL_TRUE;
> -   }
> -
> -   if (dims > 2 && subDepth < 0) {
> -  _mesa_error(ctx, GL_INVALID_VALUE,
> -  "%s(depth=%d)", func, subDepth);
> -  return GL_TRUE;
> -   }
> -
> /* check xoffset and width */
> if (xoffset < - (GLint) destImage->Border) {
>_mesa_error(ctx, GL_INVALID_VALUE, "%s(xoffset)", func);
> @@ -2104,6 +2111,12 @@ texsubimage_error_check(struct gl_context *ctx, GLuint 
> dimensions,
>return GL_TRUE;
> }
>
> +   if (error_check_subtexture_negative_dimensions(ctx, dimensions,
> +  width, height, depth,
> +  callerName)) {
> +  return GL_TRUE;
> +   }
> +
> texImage = _mesa_select_tex_image(texObj, target, level);
> if (!texImage) {
>/* non-existant texture level */
> @@ -2140,6 +2153,12 @@ texsubimage_error_check(struct gl_context *ctx, GLuint 
> dimensions,
>return GL_TRUE;
> }
>
> +   if (error_check_subtexture_negative_dimensions(ctx, dimensions,
> +  width, height, depth,
> +  callerName)) {
> +  return GL_TRUE;
> +   }
> +
> if (error_check_subtexture_dimensions(ctx, dimensions,
>   texImage, xoffset, yoffset, zoffset,
>   width, height, depth, callerName)) {
> @@ -2497,6 +2516,11 @@ copytexsubimage_error_check(struct gl_context *ctx, 

Re: [Mesa-dev] Patchwork review process (efficiency) questions

2016-06-03 Thread Patrick Baggett
> I will point out a couple notes/observations:
>
> Kernel (drm/dri-devel), xorg, and other related projects use the same
> process, and a lot of us do (or at least at some point have) been
> active in 2 or more of these.
>
> Also, I have seen/used some other processes (gerrit, github pulls,
> etc).. and IMO on those projects the review process ended up being a
> lot more rubber-stamping and less thorough review of the changes.
> There is some value in not making things too "push-button"..

What are people's opinions on patchwork? I'm a regular reader but not a
contributor. I find the interface appealing, and recently submitted
patches are not hard to find. Is it slower (workflow-wise) or less
convenient to use than email? Or are there certain use cases that just
don't work?

-- Patrick


Re: [Mesa-dev] Discussion: C++11 std::future in Mesa

2016-06-01 Thread Patrick Baggett
>
>
> No. Shader compilation can only be asynchronous if it's far enough
> from a draw call and the app doesn't query its status. If it's next to
> a draw call, multithreading is useless. Completely useless.
>

I don't know a lot about the shader compilation/linking process, so
I'm just asking this for my own benefit.

I read that the optimizations take a long time. Is it possible to
create a sort of -O0 version of the shader while the real version is
generated by some thread pool? Or would there be some shaders that
would just fail to run unless optimization took place (and the
developers count on that)?

> We need to get below 33 ms for all shaders needed to be compiled to
> render a frame. If there are 10 VS and 10 PS, one shader must be
> compiled within 1.65 ms on average. I don't see where your random
> guess meets that goal.
>
> Marek


Re: [Mesa-dev] [PATCH] i965/tiled_memcpy: don't unconditionally use __builtin_bswap32

2016-04-19 Thread Patrick Baggett
On Mon, Apr 18, 2016 at 9:31 PM, Jonathan Gray  wrote:

> Use the defines Mesa configure sets to indicate presence of the bswap32
> builtins.  This lets i965 work on OpenBSD again after the changes that
> were made in 0a5d8d9af42fd77fce1492d55f958da97816961a.
>
> Signed-off-by: Jonathan Gray 
> ---
>  src/mesa/drivers/dri/i965/intel_tiled_memcpy.c | 15 ++-
>  1 file changed, 14 insertions(+), 1 deletion(-)
>
> diff --git a/src/mesa/drivers/dri/i965/intel_tiled_memcpy.c
> b/src/mesa/drivers/dri/i965/intel_tiled_memcpy.c
> index a549854..c888e46 100644
> --- a/src/mesa/drivers/dri/i965/intel_tiled_memcpy.c
> +++ b/src/mesa/drivers/dri/i965/intel_tiled_memcpy.c
> @@ -64,6 +64,19 @@ ror(uint32_t n, uint32_t d)
> return (n >> d) | (n << (32 - d));
>  }
>
> +static inline uint32_t
> +bswap32(uint32_t n)
> +{
> +#if defined(HAVE___BUILTIN_BSWAP32)
> +   return __builtin_bswap32(n);
> +#else
> +   return (n >> 24) |
> +  ((n >> 8) & 0x0000ff00) |
> +  ((n << 8) & 0x00ff0000) |
> +  (n << 24);
> +#endif
> +}
>

If I recall, GCC recognizes an open-coded byte swapping function and will
replace it with the BSWAP instruction. I'm about 99% sure it is not
necessary to use __builtin_bswap32() to have the benefits of using BSWAP.
While I understand that you're trying to fix the use of
__builtin_bswap32(), I don't think it is really necessary to continue to
use it in your wrapper function. I'm not sure about -O0 though... anyways,
maybe it isn't worth looking too hard into, but you might be able to drop
some of the ugly #if defined() stuff.
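
For reference, a portable open-coded swap without any #if could look like the sketch below. The name bswap32_portable is made up for illustration. Whether a particular compiler and optimization level turns it into a single BSWAP is something to confirm by reading the generated assembly, not something this sketch guarantees.

```c
#include <assert.h>
#include <stdint.h>

/* Portable byte swap: 0xAABBCCDD becomes 0xDDCCBBAA.
 * Each term moves one byte of n to its mirrored position. */
static inline uint32_t
bswap32_portable(uint32_t n)
{
   return (n >> 24) |
          ((n >> 8) & 0x0000ff00u) |
          ((n << 8) & 0x00ff0000u) |
          (n << 24);
}
```

Swapping twice gives back the original value, which makes for an easy sanity check.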



> +
>  /**
>   * Copy RGBA to BGRA - swap R and B.
>   */
> @@ -76,7 +89,7 @@ rgba8_copy(void *dst, const void *src, size_t bytes)
> assert(bytes % 4 == 0);
>
> while (bytes >= 4) {
> -  *d = ror(__builtin_bswap32(*s), 8);
> +  *d = ror(bswap32(*s), 8);
>d += 1;
>s += 1;
>bytes -= 4;
> --
> 2.8.1
>


Re: [Mesa-dev] [PATCH] nir: Use double-precision pow() when bit_size is 64, powf() otherwise

2016-03-28 Thread Patrick Baggett
On Mon, Mar 28, 2016 at 1:58 PM, Patrick Baggett
<baggett.patr...@gmail.com> wrote:
>> What are the rules in C when you compare a double
>> variable with a single constant?
>>
>> void foo(double d)
>> {
>> /* Does d get converted to single, or does 0.0f get converted to
>>  * double?
>>  */
>> if (d == 0.0f)
>> printf("zero\n");
>> }
>
> The 0.0f is converted to a double. One site [1] has a likely looking
> reference. :) Sadly, I don't know how to check the C spec directly (I
> think that it is not free).
>
> [1] https://www.eskimo.com/~scs/cclass/int/sx4cb.html

Never mind, the spec is available; I found the link via Wikipedia.

6.3.1.8 Usual arithmetic conversions
1

Otherwise, if the corresponding real type of either operand is double,
the other operand is converted, without change of type domain, to a
type whose corresponding real type is double.

So yes, 100% sure that it is promoted to a double.
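
A small check of this rule, as a sketch that assumes a C11 compiler (for _Generic). The macro and function names are made up for illustration.

```c
#include <assert.h>
#include <stdbool.h>

/* C11 _Generic selects a branch from the static type of its operand.
 * Applied to (double + float), it reports the common type chosen by
 * the usual arithmetic conversions. */
#define IS_DOUBLE(x) _Generic((x), double: true, default: false)

/* Returns true if comparing a double with a float is done in double. */
static bool
float_operand_becomes_double(void)
{
   double d = 0.1;
   float f = 0.1f;

   /* f widened to double is not the same value as the double 0.1, so
    * d == f is false. Had d been narrowed to float instead, the two
    * would compare equal. */
   return IS_DOUBLE(d + f) && !(d == f);
}
```

For the original foo() this means `d == 0.0f` compares against the double 0.0, so nothing is lost by writing the constant as a float there.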


Re: [Mesa-dev] [PATCH] nir: Use double-precision pow() when bit_size is 64, powf() otherwise

2016-03-28 Thread Patrick Baggett
> What are the rules in C when you compare a double
> variable with a single constant?
>
> void foo(double d)
> {
> /* Does d get converted to single, or does 0.0f get converted to
>  * double?
>  */
> if (d == 0.0f)
> printf("zero\n");
> }

The 0.0f is converted to a double. One site [1] has a likely looking
reference. :) Sadly, I don't know how to check the C spec directly (I
think that it is not free).

[1] https://www.eskimo.com/~scs/cclass/int/sx4cb.html


Re: [Mesa-dev] [PATCH 08/10] nir: Simplify 0 < fabs(a)

2016-03-11 Thread Patrick Baggett
On Fri, Mar 11, 2016 at 10:21 AM, Ian Romanick <i...@freedesktop.org> wrote:
> On 03/10/2016 01:24 PM, Patrick Baggett wrote:
>> On Thu, Mar 10, 2016 at 3:08 PM, Patrick Baggett
>> <baggett.patr...@gmail.com> wrote:
>>> On Thu, Mar 10, 2016 at 12:25 PM, Ian Romanick <i...@freedesktop.org> wrote:
>>>> From: Ian Romanick <ian.d.roman...@intel.com>
>>>>
>>>> Sandy Bridge / Ivy Bridge / Haswell
>>>> total instructions in shared programs: 8462180 -> 8462174 (-0.00%)
>>>> instructions in affected programs: 564 -> 558 (-1.06%)
>>>> helped: 6
>>>> HURT: 0
>>>>
>>>> total cycles in shared programs: 117542462 -> 117542276 (-0.00%)
>>>> cycles in affected programs: 9768 -> 9582 (-1.90%)
>>>> helped: 12
>>>> HURT: 0
>>>>
>>>> Broadwell / Skylake
>>>> total instructions in shared programs: 8980833 -> 8980826 (-0.00%)
>>>> instructions in affected programs: 626 -> 619 (-1.12%)
>>>> helped: 7
>>>> HURT: 0
>>>>
>>>> total cycles in shared programs: 70077900 -> 70077714 (-0.00%)
>>>> cycles in affected programs: 9378 -> 9192 (-1.98%)
>>>> helped: 12
>>>> HURT: 0
>>>>
>>>> G45 and Ironlake showed no change.
>>>>
>>>> Signed-off-by: Ian Romanick <ian.d.roman...@intel.com>
>>>> ---
>>>>  src/compiler/nir/nir_opt_algebraic.py | 5 +
>>>>  1 file changed, 5 insertions(+)
>>>>
>>>> diff --git a/src/compiler/nir/nir_opt_algebraic.py 
>>>> b/src/compiler/nir/nir_opt_algebraic.py
>>>> index 4db8f84..1442ce8 100644
>>>> --- a/src/compiler/nir/nir_opt_algebraic.py
>>>> +++ b/src/compiler/nir/nir_opt_algebraic.py
>>>> @@ -108,6 +108,11 @@ optimizations = [
>>>> # inot(a)
>>>> (('fge', 0.0, ('b2f', a)), ('inot', a)),
>>>>
>>>> +   # 0.0 < fabs(a)
>>>> +   # 0.0 != fabs(a)  because fabs(a) must be >= 0
>>> I think this is wrong. Because >= 0.0 can mean that fabs(a) == 0.0 for
>>> some a, you can't say then fabs(a) != 0.0.
>>>
>>> Then, the counter-example is when a = 0.0
>>>
>>> 1) 0.0 != fabs(0.0)
>>> 2) 0.0 != 0.0
>>>
>> Rather, I mean the comment is wrong, but the conclusion that:
>> 0 < fabs(a) <-> a != 0.0
>> is correct. You can just build a truth table or just observe that when
>> a == 0, 0 < 0 is false, and
>> when a != 0.0, fabs(a) will be > 0, so 0 < fabs(a) will be always true.
>
> How about if I change it to
>
># 0.0 != fabs(a)  Since fabs(a) >= 0, 0 <= fabs(a) must be true
>
> I think it's trivial to see how to get from "0 < fabs(a)" to "0 !=
> fabs(a)" based on that.
Yeah, I think what gave me pause when I read it was "0.0 != fabs(a)",
because that's not a general mathematical truth unless qualified by
"a != 0.0". I don't have any particularly strong feelings about the
wording. I personally didn't reason about it using (in)equalities at
all. My logic was mostly based on domain analysis of the expression:
let p(a) := 0 < fabs(a)
p(0) <-> false
p(a) <-> true, for any other value of a
therefore p(a) <-> true when a != 0.0
therefore p(a) <-> a != 0

It's up to you.
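
The domain analysis can also be spot-checked in plain C. This is only a sketch over sample values, not a proof and not how NIR evaluates anything; the function name is made up, and the samples are assumed to be non-NaN.

```c
#include <assert.h>
#include <math.h>
#include <stdbool.h>
#include <stddef.h>

/* Returns true if "0 < fabs(a)" and "a != 0" agree for every sample. */
static bool
flt_fabs_matches_fne(const double *samples, size_t count)
{
   assert(samples != NULL || count == 0);

   for (size_t i = 0; i < count; i++) {
      const double a = samples[i];
      const bool lhs = 0.0 < fabs(a);   /* p(a) from the text above */
      const bool rhs = a != 0.0;        /* the proposed replacement */

      if (lhs != rhs)
         return false;
   }
   return true;
}
```

Both signed zeros, tiny and huge magnitudes, and infinities all agree. A NaN sample is a separate case under C comparison rules: 0 < fabs(NaN) is false while NaN != 0 is true.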

>
>>>> +   # 0.0 != a
>>>> +   (('flt', 0.0, ('fabs', a)), ('fne', a, 0.0)),
>>>> +
>>>> (('fge', ('fneg', ('fabs', a)), 0.0), ('feq', a, 0.0)),
>>>> (('bcsel', ('flt', a, b), a, b), ('fmin', a, b)),
>>>> (('bcsel', ('flt', a, b), b, a), ('fmax', a, b)),
>>>> --
>>>> 2.5.0
>>>>


Re: [Mesa-dev] [PATCH 08/10] nir: Simplify 0 < fabs(a)

2016-03-10 Thread Patrick Baggett
On Thu, Mar 10, 2016 at 3:08 PM, Patrick Baggett
<baggett.patr...@gmail.com> wrote:
> On Thu, Mar 10, 2016 at 12:25 PM, Ian Romanick <i...@freedesktop.org> wrote:
>> From: Ian Romanick <ian.d.roman...@intel.com>
>>
>> Sandy Bridge / Ivy Bridge / Haswell
>> total instructions in shared programs: 8462180 -> 8462174 (-0.00%)
>> instructions in affected programs: 564 -> 558 (-1.06%)
>> helped: 6
>> HURT: 0
>>
>> total cycles in shared programs: 117542462 -> 117542276 (-0.00%)
>> cycles in affected programs: 9768 -> 9582 (-1.90%)
>> helped: 12
>> HURT: 0
>>
>> Broadwell / Skylake
>> total instructions in shared programs: 8980833 -> 8980826 (-0.00%)
>> instructions in affected programs: 626 -> 619 (-1.12%)
>> helped: 7
>> HURT: 0
>>
>> total cycles in shared programs: 70077900 -> 70077714 (-0.00%)
>> cycles in affected programs: 9378 -> 9192 (-1.98%)
>> helped: 12
>> HURT: 0
>>
>> G45 and Ironlake showed no change.
>>
>> Signed-off-by: Ian Romanick <ian.d.roman...@intel.com>
>> ---
>>  src/compiler/nir/nir_opt_algebraic.py | 5 +
>>  1 file changed, 5 insertions(+)
>>
>> diff --git a/src/compiler/nir/nir_opt_algebraic.py 
>> b/src/compiler/nir/nir_opt_algebraic.py
>> index 4db8f84..1442ce8 100644
>> --- a/src/compiler/nir/nir_opt_algebraic.py
>> +++ b/src/compiler/nir/nir_opt_algebraic.py
>> @@ -108,6 +108,11 @@ optimizations = [
>> # inot(a)
>> (('fge', 0.0, ('b2f', a)), ('inot', a)),
>>
>> +   # 0.0 < fabs(a)
>> +   # 0.0 != fabs(a)  because fabs(a) must be >= 0
> I think this is wrong. Because >= 0.0 can mean that fabs(a) == 0.0 for
> some a, you can't say then fabs(a) != 0.0.
>
> Then, the counter-example is when a = 0.0
>
> 1) 0.0 != fabs(0.0)
> 2) 0.0 != 0.0
>
Rather, I mean the comment is wrong, but the conclusion that:
0 < fabs(a) <-> a != 0.0
is correct. You can just build a truth table or just observe that when
a == 0, 0 < 0 is false, and
when a != 0.0, fabs(a) will be > 0, so 0 < fabs(a) will be always true.



>> +   # 0.0 != a
>
>
>
>
>> +   (('flt', 0.0, ('fabs', a)), ('fne', a, 0.0)),
>> +
>> (('fge', ('fneg', ('fabs', a)), 0.0), ('feq', a, 0.0)),
>> (('bcsel', ('flt', a, b), a, b), ('fmin', a, b)),
>> (('bcsel', ('flt', a, b), b, a), ('fmax', a, b)),
>> --
>> 2.5.0
>>


Re: [Mesa-dev] [PATCH 08/10] nir: Simplify 0 < fabs(a)

2016-03-10 Thread Patrick Baggett
On Thu, Mar 10, 2016 at 12:25 PM, Ian Romanick  wrote:
> From: Ian Romanick 
>
> Sandy Bridge / Ivy Bridge / Haswell
> total instructions in shared programs: 8462180 -> 8462174 (-0.00%)
> instructions in affected programs: 564 -> 558 (-1.06%)
> helped: 6
> HURT: 0
>
> total cycles in shared programs: 117542462 -> 117542276 (-0.00%)
> cycles in affected programs: 9768 -> 9582 (-1.90%)
> helped: 12
> HURT: 0
>
> Broadwell / Skylake
> total instructions in shared programs: 8980833 -> 8980826 (-0.00%)
> instructions in affected programs: 626 -> 619 (-1.12%)
> helped: 7
> HURT: 0
>
> total cycles in shared programs: 70077900 -> 70077714 (-0.00%)
> cycles in affected programs: 9378 -> 9192 (-1.98%)
> helped: 12
> HURT: 0
>
> G45 and Ironlake showed no change.
>
> Signed-off-by: Ian Romanick 
> ---
>  src/compiler/nir/nir_opt_algebraic.py | 5 +
>  1 file changed, 5 insertions(+)
>
> diff --git a/src/compiler/nir/nir_opt_algebraic.py 
> b/src/compiler/nir/nir_opt_algebraic.py
> index 4db8f84..1442ce8 100644
> --- a/src/compiler/nir/nir_opt_algebraic.py
> +++ b/src/compiler/nir/nir_opt_algebraic.py
> @@ -108,6 +108,11 @@ optimizations = [
> # inot(a)
> (('fge', 0.0, ('b2f', a)), ('inot', a)),
>
> +   # 0.0 < fabs(a)
> +   # 0.0 != fabs(a)  because fabs(a) must be >= 0
I think this is wrong. Because >= 0.0 can mean that fabs(a) == 0.0 for
some a, you can't say then fabs(a) != 0.0.

Then, the counter-example is when a = 0.0

1) 0.0 != fabs(0.0)
2) 0.0 != 0.0

> +   # 0.0 != a




> +   (('flt', 0.0, ('fabs', a)), ('fne', a, 0.0)),
> +
> (('fge', ('fneg', ('fabs', a)), 0.0), ('feq', a, 0.0)),
> (('bcsel', ('flt', a, b), a, b), ('fmin', a, b)),
> (('bcsel', ('flt', a, b), b, a), ('fmax', a, b)),
> --
> 2.5.0
>


Re: [Mesa-dev] [PATCH 08/10] glsl: fix new gcc6 warnings

2016-02-17 Thread Patrick Baggett
On Wed, Feb 17, 2016 at 3:35 PM, Rob Clark  wrote:
> src/compiler/glsl/lower_discard_flow.cpp:79:1: warning: ‘ir_visitor_status 
> {anonymous}::lower_discard_flow_visitor::visit_enter(ir_loop_jump*)’ defined 
> but not used [-Wunused-function]
>  lower_discard_flow_visitor::visit_enter(ir_loop_jump *ir)
>  ^~
>
> The base class method that was intended to be overridden was
> 'visit(ir_loop_jump *ir)', not visit_enter().
>
Has there been a discussion about using the "override" keyword
(C++11)? It sounds like it could catch bugs like this, and if hidden
behind a #define, act as a no-op when C++11 is not supported. Although
obviously the new gcc6 warning is effectively doing much the same
thing...


> Signed-off-by: Rob Clark 
> ---
>  src/compiler/glsl/lower_discard_flow.cpp | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
>
> diff --git a/src/compiler/glsl/lower_discard_flow.cpp 
> b/src/compiler/glsl/lower_discard_flow.cpp
> index 9d0a56b..9e3a7c0 100644
> --- a/src/compiler/glsl/lower_discard_flow.cpp
> +++ b/src/compiler/glsl/lower_discard_flow.cpp
> @@ -62,8 +62,8 @@ public:
> {
> }
>
> +   ir_visitor_status visit(ir_loop_jump *ir);
> ir_visitor_status visit_enter(ir_discard *ir);
> -   ir_visitor_status visit_enter(ir_loop_jump *ir);
> ir_visitor_status visit_enter(ir_loop *ir);
> ir_visitor_status visit_enter(ir_function_signature *ir);
>
> @@ -76,7 +76,7 @@ public:
>  } /* anonymous namespace */
>
>  ir_visitor_status
> -lower_discard_flow_visitor::visit_enter(ir_loop_jump *ir)
> +lower_discard_flow_visitor::visit(ir_loop_jump *ir)
>  {
> if (ir->mode != ir_loop_jump::jump_continue)
>return visit_continue;
> --
> 2.5.0
>


Re: [Mesa-dev] [Bug 27512] Illegal instruction _mesa_x86_64_transform_points4_general

2016-01-05 Thread Patrick Baggett
Given that there is a _mesa_3dnow_transform_points4_2d in the x86-64 asm
(using MMX/3DNow! is deprecated in x86-64), it appears that this code was
copy-pasted. I wrote a quick patch to change prefetch[w] to prefetcht1,
which is more or less the equivalent in SSE. However, I'm not actually sure
those prefetches really benefit the code since they appear to be monotonic
addresses and hinting only 16 bytes ahead (a cache line is almost always at
least 32 bytes) -- maybe that sort of testing is for another day.


Re: [Mesa-dev] [PATCH 3/3] i965/nir/opt_peephole_ffma: Bypass fusion if any operand of fadd and fmul is a const

2015-10-23 Thread Patrick Baggett
On Fri, Oct 23, 2015 at 10:55 AM, Eduardo Lima Mitev 
wrote:

> When both fadd and fmul instructions have at least one operand that is a
> constant and it is only used once, the total number of instructions can
> be reduced from 3 (1 ffma + 2 load_const) to 2 (1 fmul + 1 fadd); because
> the constants will be propagated as immediate operands of fmul and fadd.
>
> This patch detects these situations and prevents fusing fmul+fadd into
> ffma.
>
> Shader-db results on i965 Haswell:
>
> total instructions in shared programs: 6235835 -> 6225895 (-0.16%)
> instructions in affected programs: 1124094 -> 1114154 (-0.88%)
> total loops in shared programs:1979 -> 1979 (0.00%)
> helped:7612
> HURT:  843
> GAINED:4
> LOST:  0
> ---
>  .../drivers/dri/i965/brw_nir_opt_peephole_ffma.c   | 31
> ++
>  1 file changed, 31 insertions(+)
>
> diff --git a/src/mesa/drivers/dri/i965/brw_nir_opt_peephole_ffma.c
> b/src/mesa/drivers/dri/i965/brw_nir_opt_peephole_ffma.c
> index a8448e7..c7fc15a 100644
> --- a/src/mesa/drivers/dri/i965/brw_nir_opt_peephole_ffma.c
> +++ b/src/mesa/drivers/dri/i965/brw_nir_opt_peephole_ffma.c
> @@ -133,6 +133,28 @@ get_mul_for_src(nir_alu_src *src, int num_components,
> return alu;
>  }
>
> +/**
> + * Given a list of (at least two) nir_alu_src's, tells if any of them is a
> + * constant value and is used only once.
> + */
> +static bool
> +any_alu_src_is_a_constant(nir_alu_src srcs[])
> +{
> +   for (unsigned i = 0; i < 2; i++) {
> +  if (srcs[i].src.ssa->parent_instr->type ==
> nir_instr_type_load_const) {
> + nir_load_const_instr *load_const =
> +nir_instr_as_load_const (srcs[i].src.ssa->parent_instr);
> +
> + if (list_is_single(&load_const->def.uses) &&
> + list_empty(&load_const->def.if_uses)) {
> +return true;
> + }
> +  }
> +   }
> +
> +   return false;
> +}
> +
>

The comment above this function reads "Given a list of (at least two)
nir_alu_src's...", but the function checks exactly two. Was it your
intention to support lists with size > 2?


>  static bool
>  brw_nir_opt_peephole_ffma_block(nir_block *block, void *void_state)
>  {
> @@ -183,6 +205,15 @@ brw_nir_opt_peephole_ffma_block(nir_block *block,
> void *void_state)
>mul_src[0] = mul->src[0].src.ssa;
>mul_src[1] = mul->src[1].src.ssa;
>
> +  /* If any of the operands of the fmul and any of the fadd is a
> constant,
> +   * we bypass because it will be more efficient as the constants
> will be
> +   * propagated as operands, potentially saving two load_const
> instructions.
> +   */
> +  if (any_alu_src_is_a_constant(mul->src) &&
> +  any_alu_src_is_a_constant(add->src)) {
> + continue;
> +  }
> +
>if (abs) {
>   for (unsigned i = 0; i < 2; i++) {
>  nir_alu_instr *abs = nir_alu_instr_create(state->mem_ctx,
> --
> 2.5.3
>


Re: [Mesa-dev] [PATCH] gallium/os: add os_wait_until_zero

2015-06-26 Thread Patrick Baggett
On Fri, Jun 26, 2015 at 11:40 AM, Marek Olšák mar...@gmail.com wrote:

 If p_atomic_read is fine, then this patch is fine too. So you're
 telling that this should work:

 while (p_atomic_read(var));

 I wouldn't be concerned about a memory barrier. This is only 1 int, so
 it should make its way into the shared cache eventually.


Yes, it does make it to the shared cache, but the assumption is that the
compiler will actually generate code to check the memory location more than
once. I've personally been bitten by this assumption - it's a bad one. Ilia
is right. If you have a variable that doesn't appear to be modified at all,
but you, the programmer, know it will be modified by another thread, you're
asking for an infinite loop. The only guarantee you get is that if this
code ran in isolation on a single thread, it will do what you told it to.
Consider even a trivial transformation:

while(1) {

if(var == 0) break;

}

The compiler can optimize this to a single statement:

if(var != 0) infinite_loop();

...because it produces the same results as the above code when run in
isolation. However, if 'var' is volatile, it cannot assume that the value
will remain the same and cannot apply this optimization. What's more fun
is that debug mode tends to not apply these sorts of optimizations, so your
code hangs in release builds, and when you check the memory location, you
can see that it has been updated. Commence tearing hair out. Then you look
at the assembly and hit your head on the desk. Or something like that. ;)
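
A bounded version of such a polling loop, as a sketch only. The function name and the max_spins limit are made up for illustration, and this is not the os_wait_until_zero from the patch.

```c
#include <assert.h>
#include <stdbool.h>

/* Polls *var until it reads zero or max_spins reads have been made.
 * The volatile qualifier obliges the compiler to perform every read, so
 * the loop cannot be collapsed into a single check. It says nothing
 * about atomicity or memory ordering; C11 <stdatomic.h> is the portable
 * tool when those matter. */
static bool
spin_until_zero(const volatile int *var, unsigned max_spins)
{
   assert(var);

   for (unsigned i = 0; i < max_spins; i++) {
      if (*var == 0)
         return true;
   }
   return false;
}
```

The spin limit is only there so that the sketch always terminates, even when nothing else ever writes the variable.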

Patrick


Re: [Mesa-dev] [PATCH] util: Unbreak usage of assert()/debug_assert() inside expressions.

2014-12-12 Thread Patrick Baggett
On Fri, Dec 12, 2014 at 10:17 AM, Roland Scheidegger srol...@vmware.com
wrote:

 Am 12.12.2014 um 15:09 schrieb Jose Fonseca:
  From: José Fonseca jfons...@vmware.com
 
  f0ba7d897d1c22202531acb70f134f2edc30557d made debug_assert()/assert()
  unsafe for expressions, but only now with u_atomic.h started to rely on
  them for Windows this became an issue.
 
  This fixes non-debug builds with MSVC.
  ---
   src/gallium/auxiliary/util/u_debug.h | 2 +-
   1 file changed, 1 insertion(+), 1 deletion(-)
 
  diff --git a/src/gallium/auxiliary/util/u_debug.h
 b/src/gallium/auxiliary/util/u_debug.h
  index badd5e2..4c22fdf 100644
  --- a/src/gallium/auxiliary/util/u_debug.h
  +++ b/src/gallium/auxiliary/util/u_debug.h
  @@ -185,7 +185,7 @@ void _debug_assert_fail(const char *expr,
   #ifdef DEBUG
   #define debug_assert(expr) ((expr) ? (void)0 :
 _debug_assert_fail(#expr, __FILE__, __LINE__, __FUNCTION__))
   #else
  -#define debug_assert(expr) do { } while (0 && (expr))
  +#define debug_assert(expr) (void)(0 && (expr))
   #endif
 
 
 


Just for my own education, can someone explain why `debug_assert()` needs
to have any expansion of `expr` at all? Rather, what
breaks with something like:

  #define debug_assert(expr) ((void)0)
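
To have something concrete to poke at, here is a sketch of the release-style form from the patch used inside an expression. The macro name debug_assert_release and the helper functions are made up for illustration. My assumption, not something stated in this thread, is that keeping `expr` in the expansion is about having it still parsed and its variables still referenced in non-debug builds.

```c
#include <assert.h>

/* Release-style variant: an expression that is never evaluated at run
 * time, because 0 && ... short-circuits. */
#define debug_assert_release(expr) ((void)(0 && (expr)))

static int calls; /* counts how often the side-effecting check runs */

static int
check_with_side_effect(int value)
{
   calls++;
   return value > 0;
}

/* Uses the macro inside a comma expression, which a
 * do { } while (0) statement form would not allow. */
static int
doubled(int value)
{
   /* The condition is still parsed and type-checked here, even though
    * it is never executed. */
   return (debug_assert_release(check_with_side_effect(value)), value * 2);
}
```

With `((void)0)` the same call site would still compile; the difference is that a condition that no longer compiles, or a variable used only in asserts, would go unnoticed in release builds.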


Re: [Mesa-dev] [PATCH 09/29] mesa: Add _mesa_swap2_copy and _mesa_swap4_copy

2014-11-20 Thread Patrick Baggett



 The restrict keyword is a C99 thing and I don't think it's supported in
 MSVC so that would be a problem.  If it won't build with MSVC then it's a
 non-starter.  If MSVC can handle restrict, then I don't know that I care
 much either way about 2 functions or 4


MSVC uses __restrict which functions identically -- but if there doesn't
already exist a #define around this MSVC-ism, then I guess it may be more
work than Iago was really signing up for. But it does exist.
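
Such a wrapper could be as small as the sketch below. SKETCH_RESTRICT and swap4_copy_sketch are made-up names, not existing Mesa macros or functions.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical wrapper macro; the name is made up for this sketch. */
#if defined(_MSC_VER)
#define SKETCH_RESTRICT __restrict
#elif defined(__STDC_VERSION__) && __STDC_VERSION__ >= 199901L
#define SKETCH_RESTRICT restrict
#else
#define SKETCH_RESTRICT /* no aliasing hint available */
#endif

/* Byte-swaps n 32-bit words from src into dst. The qualifiers promise
 * the compiler that dst and src do not overlap. */
static void
swap4_copy_sketch(uint32_t *SKETCH_RESTRICT dst,
                  const uint32_t *SKETCH_RESTRICT src, size_t n)
{
   size_t i;

   assert(dst != src || n == 0);

   for (i = 0; i < n; i++) {
      const uint32_t v = src[i];
      dst[i] = (v >> 24) | ((v >> 8) & 0x0000ff00u) |
               ((v << 8) & 0x00ff0000u) | (v << 24);
   }
}
```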


Re: [Mesa-dev] [PATCH 09/29] mesa: Add _mesa_swap2_copy and _mesa_swap4_copy

2014-11-19 Thread Patrick Baggett
On Tue, Nov 18, 2014 at 3:23 AM, Iago Toral Quiroga ito...@igalia.com
wrote:

 We have _mesa_swap{2,4} but these do in-place byte-swapping only. The new
 functions receive an extra parameter so we can swap bytes on a source
 input array and store the results in a (possibly different) destination
 array.


If this is being split into an in-place and different pointers version,
I think using the restrict keyword would be useful here to improve the
performance. Then, the in-place one cannot be implemented as copy(p,p,n),
but the code isn't overly complicated.
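
A minimal sketch of that split, assuming a hypothetical SKETCH_RESTRICT wrapper (not an existing Mesa macro) around the `__restrict` spelling accepted by MSVC, GCC and Clang. The function names and the use of uint16_t in place of GLushort are illustrative only.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical wrapper; expands to nothing on compilers we do not know. */
#if defined(_MSC_VER) || defined(__GNUC__)
#  define SKETCH_RESTRICT __restrict
#else
#  define SKETCH_RESTRICT
#endif

/* Copying variant: the caller promises that dst and src do not overlap. */
static void
sketch_swap2_copy(uint16_t *SKETCH_RESTRICT dst,
                  const uint16_t *SKETCH_RESTRICT src, size_t n)
{
   size_t i;
   for (i = 0; i < n; i++)
      dst[i] = (uint16_t)((src[i] >> 8) | ((src[i] << 8) & 0xff00));
}

/* In-place variant: written out separately, because calling
 * sketch_swap2_copy(p, p, n) would break the no-alias promise. */
static void
sketch_swap2(uint16_t *p, size_t n)
{
   size_t i;
   for (i = 0; i < n; i++)
      p[i] = (uint16_t)((p[i] >> 8) | ((p[i] << 8) & 0xff00));
}
```

Whether the qualifier buys anything measurable here would need to be checked in the generated code.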



 This is useful to implement byte-swapping in pixel uploads, since in this
 case we need to swap bytes on the src data which is owned by the
 application so we can't do an in-place byte swap.
 ---
  src/mesa/main/image.c | 25 +
  src/mesa/main/image.h | 10 --
  2 files changed, 25 insertions(+), 10 deletions(-)

 diff --git a/src/mesa/main/image.c b/src/mesa/main/image.c
 index 4ea5f04..9ad97c5 100644
 --- a/src/mesa/main/image.c
 +++ b/src/mesa/main/image.c
 @@ -41,36 +41,45 @@


  /**
 - * Flip the order of the 2 bytes in each word in the given array.
 + * Flip the order of the 2 bytes in each word in the given array (src) and
 + * store the result in another array (dst). For in-place byte-swapping
 this
 + * function can be called with the same array for src and dst.
   *
 - * \param p array.
 + * \param dst the array where byte-swapped data will be stored.
 + * \param src the array with the source data we want to byte-swap.
   * \param n number of words.
   */
  void
 -_mesa_swap2( GLushort *p, GLuint n )
 +_mesa_swap2_copy( GLushort *dst, GLushort *src, GLuint n )
  {
 GLuint i;
 for (i = 0; i < n; i++) {
 -  p[i] = (p[i] >> 8) | ((p[i] << 8) & 0xff00);
 +  dst[i] = (src[i] >> 8) | ((src[i] << 8) & 0xff00);
 }
  }



  /*
 - * Flip the order of the 4 bytes in each word in the given array.
 + * Flip the order of the 4 bytes in each word in the given array (src) and
 + * store the result in another array (dst). For in-place byte-swapping
 this
 + * function can be called with the same array for src and dst.
 + *
 + * \param dst the array where byte-swapped data will be stored.
 + * \param src the array with the source data we want to byte-swap.
 + * \param n number of words.
   */
  void
 -_mesa_swap4( GLuint *p, GLuint n )
 +_mesa_swap4_copy( GLuint *dst, GLuint *src, GLuint n )
  {
 GLuint i, a, b;
 for (i = 0; i < n; i++) {
 -  b = p[i];
 +  b = src[i];
 a =  (b >> 24)
 | ((b >> 8) & 0xff00)
 | ((b << 8) & 0xff0000)
 | ((b << 24) & 0xff000000);
 -  p[i] = a;
 +  dst[i] = a;
 }
  }

 diff --git a/src/mesa/main/image.h b/src/mesa/main/image.h
 index abd84bf..79c6e68 100644
 --- a/src/mesa/main/image.h
 +++ b/src/mesa/main/image.h
 @@ -33,10 +33,16 @@ struct gl_context;
  struct gl_pixelstore_attrib;

  extern void
 -_mesa_swap2( GLushort *p, GLuint n );
 +_mesa_swap2_copy( GLushort *dst, GLushort *src, GLuint n );

  extern void
 -_mesa_swap4( GLuint *p, GLuint n );
 +_mesa_swap4_copy( GLuint *dst, GLuint *src, GLuint n );
 +
 +static inline void
 +_mesa_swap2( GLushort *p, GLuint n ) { _mesa_swap2_copy(p, p, n); }
 +
 +static inline void
 +_mesa_swap4( GLuint *p, GLuint n ) { _mesa_swap4_copy(p, p, n); }

  extern GLintptr
  _mesa_image_offset( GLuint dimensions,
 --
 1.9.1



Re: [Mesa-dev] [PATCH v5] gallium/auxiliary: add inc and dec alternative with return (v3)

2014-11-17 Thread Patrick Baggett
On Mon, Nov 17, 2014 at 12:20 PM, Axel Davy axel.d...@ens.fr wrote:

 From: Christoph Bumiller christoph.bumil...@speed.at

 At this moment we use only zero or positive values.

 v2: Implement it also for Solaris, MSVC assembly
 and enable for other combinations.

 v3: Replace MSVC assembly by assert + warning during compilation

 Signed-off-by: David Heidelberg da...@ixit.cz
 ---
  src/gallium/auxiliary/util/u_atomic.h | 72
 +++
  1 file changed, 72 insertions(+)

 diff --git a/src/gallium/auxiliary/util/u_atomic.h
 b/src/gallium/auxiliary/util/u_atomic.h
 index 2f2b42b..9279073 100644
 --- a/src/gallium/auxiliary/util/u_atomic.h
 +++ b/src/gallium/auxiliary/util/u_atomic.h
 @@ -69,6 +69,18 @@ p_atomic_dec(int32_t *v)
  }

  static INLINE int32_t
 +p_atomic_inc_return(int32_t *v)
 +{
 +   return __sync_add_and_fetch(v, 1);
 +}
 +
 +static INLINE int32_t
 +p_atomic_dec_return(int32_t *v)
 +{
 +   return __sync_sub_and_fetch(v, 1);
 +}
 +
 +static INLINE int32_t
  p_atomic_cmpxchg(int32_t *v, int32_t old, int32_t _new)
  {
 return __sync_val_compare_and_swap(v, old, _new);
 @@ -116,6 +128,18 @@ p_atomic_dec(int32_t *v)
  }

  static INLINE int32_t
 +p_atomic_inc_return(int32_t *v)
 +{
 +   return __sync_add_and_fetch(v, 1);
 +}
 +
 +static INLINE int32_t
 +p_atomic_dec_return(int32_t *v)
 +{
 +   return __sync_sub_and_fetch(v, 1);
 +}
 +
 +static INLINE int32_t
  p_atomic_cmpxchg(int32_t *v, int32_t old, int32_t _new)
  {
 return __sync_val_compare_and_swap(v, old, _new);
 @@ -161,6 +185,18 @@ p_atomic_dec(int32_t *v)
  }

  static INLINE int32_t
 +p_atomic_inc_return(int32_t *v)
 +{
 +   return __sync_add_and_fetch(v, 1);
 +}
 +
 +static INLINE int32_t
 +p_atomic_dec_return(int32_t *v)
 +{
 +   return __sync_sub_and_fetch(v, 1);
 +}
 +
 +static INLINE int32_t
  p_atomic_cmpxchg(int32_t *v, int32_t old, int32_t _new)
  {
 return __sync_val_compare_and_swap(v, old, _new);
 @@ -186,6 +222,8 @@ p_atomic_cmpxchg(int32_t *v, int32_t old, int32_t _new)
  #define p_atomic_dec_zero(_v) ((boolean) --(*(_v)))
  #define p_atomic_inc(_v) ((void) (*(_v))++)
  #define p_atomic_dec(_v) ((void) (*(_v))--)
 +#define p_atomic_inc_return(_v) ((*(_v))++)
 +#define p_atomic_dec_return(_v) ((*(_v))--)
  #define p_atomic_cmpxchg(_v, old, _new) (*(_v) == old ? *(_v) = (_new) :
 *(_v))

  #endif
 @@ -197,6 +235,8 @@ p_atomic_cmpxchg(int32_t *v, int32_t old, int32_t _new)

  #define PIPE_ATOMIC "MSVC x86 assembly"

 +#include <assert.h>
 +
  #ifdef __cplusplus
  extern "C" {
  #endif
 @@ -236,6 +276,24 @@ p_atomic_dec(int32_t *v)
 }
  }

 +#pragma message ( "Warning: p_atomic_dec_return and p_atomic_inc_return
 unimplemented for PIPE_ATOMIC_ASM_MSVC_X86" )
 +
 +static INLINE int32_t
 +p_atomic_inc_return(int32_t *v)
 +{
 +   (void) v;
 +   assert(0);
 +   return 0;
 +}


Why isn't _InterlockedIncrement() used here? It is used for the void
functions. If you read the definition of _InterlockedIncrement() it returns
the new value -- isn't that what is needed?
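
Roughly what I have in mind, as a sketch only: the sketch_* names are made up, and the non-MSVC branch uses the GCC/Clang `__sync` builtins purely so that the example builds outside Visual Studio.

```c
#include <assert.h>
#include <stdint.h>

#if defined(_MSC_VER)
#include <intrin.h>

/* _InterlockedIncrement/_InterlockedDecrement return the resulting value. */
static int32_t
sketch_atomic_inc_return(int32_t *v)
{
   return (int32_t)_InterlockedIncrement((volatile long *)v);
}

static int32_t
sketch_atomic_dec_return(int32_t *v)
{
   return (int32_t)_InterlockedDecrement((volatile long *)v);
}

#else

/* Fallback so the sketch compiles with GCC or Clang. */
static int32_t
sketch_atomic_inc_return(int32_t *v)
{
   return __sync_add_and_fetch(v, 1);
}

static int32_t
sketch_atomic_dec_return(int32_t *v)
{
   return __sync_sub_and_fetch(v, 1);
}

#endif
```

Both branches return the value after the update, which matches what the `__sync_*_and_fetch` versions elsewhere in the patch return.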


 +
 +static INLINE int32_t
 +p_atomic_dec_return(int32_t *v)
 +{
 +   (void) v;
 +   assert(0);
 +   return 0;
 +}


Similarly here.


 +
  static INLINE int32_t
  p_atomic_cmpxchg(int32_t *v, int32_t old, int32_t _new)
  {
 @@ -288,6 +346,12 @@ p_atomic_inc(int32_t *v)
 _InterlockedIncrement((long *)v);
  }

 +static INLINE int32_t
 +p_atomic_inc_return(int32_t *v)
 +{
 +   return _InterlockedIncrement((long *)v);
 +}
 +
  static INLINE void
  p_atomic_dec(int32_t *v)
  {
 @@ -295,6 +359,12 @@ p_atomic_dec(int32_t *v)
  }

  static INLINE int32_t
 +p_atomic_dec_return(int32_t *v)
 +{
 +   return _InterlockedDecrement((long *)v);
 +}
 +
 +static INLINE int32_t
  p_atomic_cmpxchg(int32_t *v, int32_t old, int32_t _new)
  {
 return _InterlockedCompareExchange((long *)v, _new, old);
 @@ -329,6 +399,8 @@ p_atomic_dec_zero(int32_t *v)

  #define p_atomic_inc(_v) atomic_inc_32((uint32_t *) _v)
  #define p_atomic_dec(_v) atomic_dec_32((uint32_t *) _v)
 +#define p_atomic_inc_return(_v) atomic_inc_32_nv((uint32_t *) _v)
 +#define p_atomic_dec_return(_v) atomic_dec_32_nv((uint32_t *) _v)

  #define p_atomic_cmpxchg(_v, _old, _new) \
 atomic_cas_32( (uint32_t *) _v, (uint32_t) _old, (uint32_t) _new)
 --
 2.1.0



Re: [Mesa-dev] [PATCH v5] gallium/auxiliary: add inc and dec alternative with return (v3)

2014-11-17 Thread Patrick Baggett


 Looking at u_atomic.h there is a section that uses
 PIPE_ATOMIC_ASM_MSVC_X86 and has explicit assembly, and there's a
 section that uses PIPE_ATOMIC_MSVC_INTRINSIC and has intrinsics. No
 clue whatsoever what the difference between them is, but presumably it
 doesn't exist solely for the purpose of annoying developers...


I can't think of a good reason; I would be interested in knowing why. Last
time I checked, MSVC is terrible at optimizing around __asm{} blocks and if
I recall, only x86 (i.e. 32-bit) supports inline assembly. This is a bit
off-topic though...



   -ilia



Re: [Mesa-dev] [PATCH 2/3][RFC v2] mesa/main/x86: Add sse2 streaming clamping

2014-11-04 Thread Patrick Baggett
On Tue, Nov 4, 2014 at 6:05 AM, Juha-Pekka Heikkila 
juhapekka.heikk...@gmail.com wrote:

 Signed-off-by: Juha-Pekka Heikkila juhapekka.heikk...@gmail.com
 ---
  src/mesa/Makefile.am  |   8 +++
  src/mesa/main/x86/sse2_clamping.c | 103
 ++
  src/mesa/main/x86/sse2_clamping.h |  49 ++
  3 files changed, 160 insertions(+)
  create mode 100644 src/mesa/main/x86/sse2_clamping.c
  create mode 100644 src/mesa/main/x86/sse2_clamping.h

 diff --git a/src/mesa/Makefile.am b/src/mesa/Makefile.am
 index e71bccb..5d3c6f5 100644
 --- a/src/mesa/Makefile.am
 +++ b/src/mesa/Makefile.am
 @@ -111,6 +111,10 @@ if SSE41_SUPPORTED
  ARCH_LIBS += libmesa_sse41.la
  endif

 +if SSE2_SUPPORTED
 +ARCH_LIBS += libmesa_sse2.la
 +endif
 +
  MESA_ASM_FILES_FOR_ARCH =

  if HAVE_X86_ASM
 @@ -154,6 +158,10 @@ libmesa_sse41_la_SOURCES = \
 main/streaming-load-memcpy.c
  libmesa_sse41_la_CFLAGS = $(AM_CFLAGS) -msse4.1

 +libmesa_sse2_la_SOURCES = \
 +   main/x86/sse2_clamping.c
 +libmesa_sse2_la_CFLAGS = $(AM_CFLAGS) -msse2
 +
  pkgconfigdir = $(libdir)/pkgconfig
  pkgconfig_DATA = gl.pc

 diff --git a/src/mesa/main/x86/sse2_clamping.c
 b/src/mesa/main/x86/sse2_clamping.c
 new file mode 100644
 index 0000000..7df1c85
 --- /dev/null
 +++ b/src/mesa/main/x86/sse2_clamping.c
 @@ -0,0 +1,103 @@
 +/*
 + * Copyright © 2014 Intel Corporation
 + *
 + * Permission is hereby granted, free of charge, to any person obtaining a
 + * copy of this software and associated documentation files (the
 "Software"),
 + * to deal in the Software without restriction, including without
 limitation
 + * the rights to use, copy, modify, merge, publish, distribute,
 sublicense,
 + * and/or sell copies of the Software, and to permit persons to whom the
 + * Software is furnished to do so, subject to the following conditions:
 + *
 + * The above copyright notice and this permission notice (including the
 next
 + * paragraph) shall be included in all copies or substantial portions of
 the
 + * Software.
 + *
 + * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
 EXPRESS OR
 + * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
 MERCHANTABILITY,
 + * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.  IN NO EVENT
 SHALL
 + * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR
 OTHER
 + * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
 + * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER
 DEALINGS
 + * IN THE SOFTWARE.
 + *
 + * Authors:
 + *Juha-Pekka Heikkila juhapekka.heikk...@gmail.com
 + *
 + */
 +
 +#ifdef __SSE2__
 +#include "main/macros.h"
 +#include "main/x86/sse2_clamping.h"
 +#include <emmintrin.h>
 +
 +/**
 + * Clamp four float values to [min,max]
 + */
 +static inline void
 +_mesa_clamp_float_rgba(GLfloat src[4], GLfloat result[4], const float min,
 +   const float max)
 +{
 +   __m128  operand, minval, maxval;
 +
 +   operand = _mm_loadu_ps(src);
 +   minval = _mm_set1_ps(min);
 +   maxval = _mm_set1_ps(max);
 +   operand = _mm_max_ps(operand, minval);
 +   operand = _mm_min_ps(operand, maxval);
 +   _mm_storeu_ps(result, operand);
 +}
 +
 +
 +/* Clamp n amount float rgba pixels to [min,max] using SSE2


Conceptually, _mesa_streaming_clamp_float_rgba() is clamping a contiguous
array of floats to some min/max value. The fact that they are pixels is
somewhat incidental when looking at it from a stream perspective. It looks
like the code is more or less just operating on n*4 floats. Given that, a
more efficient implementation would check alignment and then use aligned
loads and streaming stores. It doesn't really matter if you straddle pixel
boundaries as long as each float is operated on. I'm not sure how much
effort you want to put into this though. :)
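
Something along these lines, as a rough sketch and not a drop-in replacement: sketch_clamp_floats is a made-up name, the caller would pass count = n * 4 for n RGBA pixels, and the scalar loop doubles as a fallback when SSE2 is not available. NaN handling is not considered here.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#if defined(__SSE2__)
#include <emmintrin.h>
#endif

/* Clamp 'count' contiguous floats from src into dst (non-overlapping). */
static void
sketch_clamp_floats(const float *src, float *dst, size_t count,
                    float lo, float hi)
{
   size_t i = 0;

#if defined(__SSE2__)
   const __m128 vlo = _mm_set1_ps(lo);
   const __m128 vhi = _mm_set1_ps(hi);

   if (((((uintptr_t)src) | ((uintptr_t)dst)) & 15) == 0) {
      /* Both pointers are 16-byte aligned: aligned loads, streaming stores. */
      for (; i + 4 <= count; i += 4) {
         __m128 v = _mm_load_ps(src + i);
         v = _mm_min_ps(_mm_max_ps(v, vlo), vhi);
         _mm_stream_ps(dst + i, v);
      }
      _mm_sfence(); /* order the non-temporal stores before returning */
   } else {
      /* Unaligned input or output: fall back to unaligned accesses. */
      for (; i + 4 <= count; i += 4) {
         __m128 v = _mm_loadu_ps(src + i);
         v = _mm_min_ps(_mm_max_ps(v, vlo), vhi);
         _mm_storeu_ps(dst + i, v);
      }
   }
#endif

   /* Scalar tail (and the whole array when SSE2 is unavailable). */
   for (; i < count; i++) {
      float f = src[i];
      dst[i] = f < lo ? lo : (f > hi ? hi : f);
   }
}
```

Streaming stores only pay off when the destination is not read again soon, so this would need measuring before being preferred over the simple loop.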


 + */
 +void
 +_mesa_streaming_clamp_float_rgba(const GLuint n, GLfloat rgba_src[][4],
 + GLfloat rgba_dst[][4], const GLfloat min,
 + const GLfloat max)
 +{
 +   int i;
 +
 +   for (i = 0; i < n; i++) {
 +  _mesa_clamp_float_rgba(rgba_src[i], rgba_dst[i], min, max);
 +   }
 +}
 +
 +
 +/* Clamp n amount float rgba pixels to [min,max] using SSE2 and apply
 + * scaling and mapping to components.
 + *
 + * this replace handling of [RGBA] channels:
 + * rgba_temp[RCOMP] = CLAMP(rgba[i][RCOMP], 0.0F, 1.0F);
 + * rgba[i][RCOMP] = rMap[F_TO_I(rgba_temp[RCOMP] * scale[RCOMP])];
 + */
 +void
 +_mesa_clamp_float_rgba_scale_and_map(const GLuint n, GLfloat
 rgba_src[][4],
 + GLfloat rgba_dst[][4], const GLfloat
 min,
 + const GLfloat max,
 + const GLfloat scale[4],
 + const GLfloat* rMap, const GLfloat*
 gMap,
 + const GLfloat* bMap, const GLfloat*
 aMap)
 +{
 +   int i;
 +   GLfloat 

Re: [Mesa-dev] [PATCH 11/11] glsl: Optimize X / X == 1

2014-08-07 Thread Patrick Baggett
Would this conform to the GLSL spec if X had a runtime value of 0? Seems
unsafe to replace X / X with 1 without a runtime test...maybe GLSL spec
allows such optimizations.
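
To illustrate the concern on the CPU side only (this says nothing about what the GLSL spec permits, and it assumes IEEE 754 floats without fast-math style compiler flags): dividing a zero by itself gives NaN, not 1. The sketch_* helpers are hypothetical.

```c
#include <assert.h>

/* Computes x / x at run time; volatile keeps the compiler from folding it. */
static float
sketch_self_divide(float x)
{
   volatile float d = x;
   return d / d;
}

/* NaN is the only value that compares unequal to itself. */
static int
sketch_is_nan(float f)
{
   volatile float v = f;
   return v != v;
}
```

So sketch_self_divide(2.0f) is 1.0f, but sketch_self_divide(0.0f) is NaN, which is the case the folded constant would change.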


On Thu, Aug 7, 2014 at 3:51 PM, thomashellan...@gmail.com wrote:

 From: Thomas Helland thomashellan...@gmail.com

 Shows no changes for shader-db.

 Signed-off-by: Thomas Helland thomashelland90 at gmail.com
 ---
  src/glsl/opt_algebraic.cpp | 2 ++
  1 file changed, 2 insertions(+)

 diff --git a/src/glsl/opt_algebraic.cpp b/src/glsl/opt_algebraic.cpp
 index 21bf332..a49752d 100644
 --- a/src/glsl/opt_algebraic.cpp
 +++ b/src/glsl/opt_algebraic.cpp
 @@ -513,6 +513,8 @@ ir_algebraic_visitor::handle_expression(ir_expression
 *ir)
}
if (is_vec_one(op_const[1]))
   return ir->operands[0];
 +  if(ir->operands[0]->equals(ir->operands[1]))
 + return new(mem_ctx) ir_constant(1.0f, 1);
break;

 case ir_binop_dot:
 --
 2.0.3



Re: [Mesa-dev] [PATCH 1/3] util: Add util_memcpy_cpu_to_le32() v2

2014-07-18 Thread Patrick Baggett
On Fri, Jul 18, 2014 at 2:10 PM, Tom Stellard thomas.stell...@amd.com
wrote:

 v2:
   - Preserve word boundaries.
 ---
  src/gallium/auxiliary/util/u_math.h | 17 +
  1 file changed, 17 insertions(+)

 diff --git a/src/gallium/auxiliary/util/u_math.h
 b/src/gallium/auxiliary/util/u_math.h
 index b9ed197..5de181a 100644
 --- a/src/gallium/auxiliary/util/u_math.h
 +++ b/src/gallium/auxiliary/util/u_math.h
 @@ -812,6 +812,23 @@ util_bswap16(uint16_t n)
 (n << 8);
  }

 +static INLINE void*
 +util_memcpy_cpu_to_le32(void *dest, void *src, size_t n)


I don't know where Mesa is with C99 standards, but if you are utilizing C99
keywords, I think restrict would help here to show that the two pointers
do not overlap. I'm not sure if you have to mark 'd' and 's' as restrict to get
the benefit when they are initialized by a typecast, but it probably wouldn't
be a bad idea.

This may be a no-go with C++ however.
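
A sketch of what I mean, under a few assumptions: SKETCH_RESTRICT is a hypothetical wrapper around `__restrict` (which MSVC, GCC and Clang accept in both C and C++), endianness is probed at run time instead of using PIPE_ARCH_BIG_ENDIAN, and sketch_bswap32 stands in for util_bswap32.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#if defined(_MSC_VER) || defined(__GNUC__)
#  define SKETCH_RESTRICT __restrict
#else
#  define SKETCH_RESTRICT
#endif

static uint32_t
sketch_bswap32(uint32_t n)
{
   return (n >> 24) | ((n >> 8) & 0x0000ff00u) |
          ((n << 8) & 0x00ff0000u) | (n << 24);
}

/* Run-time probe: on a big-endian host the first byte of 1 is 0. */
static int
sketch_is_big_endian(void)
{
   const uint16_t probe = 1;
   return *(const unsigned char *)&probe == 0;
}

static void *
sketch_memcpy_cpu_to_le32(void *SKETCH_RESTRICT dest,
                          const void *SKETCH_RESTRICT src, size_t n)
{
   /* Qualify the typed pointers too, since the loop goes through them. */
   uint32_t *SKETCH_RESTRICT d = (uint32_t *)dest;
   const uint32_t *SKETCH_RESTRICT s = (const uint32_t *)src;
   size_t i;

   assert(n % 4 == 0);
   if (!sketch_is_big_endian())
      return memcpy(dest, src, n);
   for (i = 0; i < n / 4; i++)
      d[i] = sketch_bswap32(s[i]);
   return dest;
}
```

Either way the destination ends up holding each 32-bit word in little-endian byte order.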


 +{
 +#ifdef PIPE_ARCH_BIG_ENDIAN
 +   size_t i, e;
 +   assert(n % 4 == 0);
 +
 +   for (i = 0, e = n / 4; i < e; i++) {
 +   uint32_t *d = (uint32_t*)dest;
 +   uint32_t *s = (uint32_t*)src;
 +   d[i] = util_bswap32(s[i]);
 +   }
 +   return dest;
 +#else
 +   return memcpy(dest, src, n);
 +#endif
 +}

  /**
   * Clamp X to [MIN, MAX].
 --
 1.8.1.5



Re: [Mesa-dev] [PATCH 1/2] util: Add util_memcpy_cpu_to_le()

2014-07-15 Thread Patrick Baggett
On Tue, Jul 15, 2014 at 11:19 AM, Tom Stellard thomas.stell...@amd.com
wrote:

 ---
  src/gallium/auxiliary/util/u_math.h  | 22 ++
  src/gallium/drivers/radeonsi/si_shader.c |  8 +---
  2 files changed, 23 insertions(+), 7 deletions(-)

 diff --git a/src/gallium/auxiliary/util/u_math.h
 b/src/gallium/auxiliary/util/u_math.h
 index b9ed197..cd3cf04 100644
 --- a/src/gallium/auxiliary/util/u_math.h
 +++ b/src/gallium/auxiliary/util/u_math.h
 @@ -812,6 +812,28 @@ util_bswap16(uint16_t n)
 (n << 8);
  }

 +static INLINE void*
 +util_memcpy_cpu_to_le(void *dest, void *src, size_t n)
 +{
 +#ifdef PIPE_ARCH_BIG_ENDIAN
 +   size_t i, e;
 +   for (i = 0, e = n % 8; i < e; i++) {
 +   char *d = (char*)dest;
 +   char *s = (char*)src;
 +   d[i] = s[e - i - 1];
 +   }
 +   dest += i;
 +   n -= i;
 +   for (i = 0, e = n / 8; i < e; i++) {
 +   uint64_t *d = (uint64_t*)dest;
 +   uint64_t *s = (uint64_t*)src;
 +   d[i] = util_bswap64(s[e - i - 1]);
 +   }


Doesn't this reverse all of the bytes (as if it were a list) without
preserving word boundaries? e.g.

|a, b, c, d | e, f, g, h | i, j, k, l | m, n, o, p | ->
|p, o, n, m | l, k, j, i | h, g, f, e | d, c, b, a |

The old code did something like this, didn't it?:
|a, b, c, d | e, f, g, h | i, j, k, l | m, n, o, p | ->
|d, c, b, a | h, g, f, e | l, k, j, i | p, o, n, m |

I don't know which is correct, but it does seem like a behavior change. Or
am I misreading the code?
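
The two behaviours I'm contrasting can be written out as follows. This is only an illustration with made-up function names, not a claim about which of them the patch actually implements.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Reverses the whole buffer as one list of bytes (buffers must not overlap). */
static void
sketch_reverse_all(unsigned char *dst, const unsigned char *src, size_t n)
{
   size_t i;
   for (i = 0; i < n; i++)
      dst[i] = src[n - i - 1];
}

/* Reverses the bytes inside each 4-byte word; the order of the words is
 * kept.  n is expected to be a multiple of 4. */
static void
sketch_swap_each_word32(unsigned char *dst, const unsigned char *src, size_t n)
{
   size_t i;
   for (i = 0; i + 4 <= n; i += 4) {
      dst[i + 0] = src[i + 3];
      dst[i + 1] = src[i + 2];
      dst[i + 2] = src[i + 1];
      dst[i + 3] = src[i + 0];
   }
}
```

Feeding the 16 bytes "abcdefghijklmnop" through each function reproduces the two diagrams above.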

 +   return dest;
 +#else
 +   return memcpy(dest, src, n);
 +#endif
 +}

  /**
   * Clamp X to [MIN, MAX].
 diff --git a/src/gallium/drivers/radeonsi/si_shader.c
 b/src/gallium/drivers/radeonsi/si_shader.c
 index f0650f4..6f0504b 100644
 --- a/src/gallium/drivers/radeonsi/si_shader.c
 +++ b/src/gallium/drivers/radeonsi/si_shader.c
 @@ -2559,13 +2559,7 @@ int si_compile_llvm(struct si_context *sctx, struct
 si_pipe_shader *shader,
 }

 ptr = (uint32_t*)sctx->b.ws->buffer_map(shader->bo->cs_buf,
 sctx->b.rings.gfx.cs, PIPE_TRANSFER_WRITE);
 -   if (SI_BIG_ENDIAN) {
 -   for (i = 0; i < binary.code_size / 4; ++i) {
 -   ptr[i] =
 util_cpu_to_le32((*(uint32_t*)(binary.code + i*4)));
 -   }
 -   } else {
 -   memcpy(ptr, binary.code, binary.code_size);
 -   }
 +   util_memcpy_cpu_to_le(ptr, binary.code, binary.code_size);
 sctx->b.ws->buffer_unmap(shader->bo->cs_buf);

 free(binary.code);
 --
 1.8.1.5



Re: [Mesa-dev] [PATCH 7/9] glsl: Make foreach macros usable from C by adding struct keyword.

2014-06-10 Thread Patrick Baggett


 Yep, no new warnings.

 I tried a little test program
 % cat t.cpp
 class asdf {
 int x;
 };

 void f() {
 asdf a;
 struct asdf b;
 class asdf c;
 }

C++ never ceases to amaze.


 and I can't make it generate warnings (other than unused variables)
 regardless of whether I define asdf as a class or a struct.


Re: [Mesa-dev] [PATCH 11/21] glsl: Store ir_variable::ir_type in 8 bits instead of 32

2014-05-28 Thread Patrick Baggett
On Wed, May 28, 2014 at 2:17 PM, Ian Romanick i...@freedesktop.org wrote:

 On 05/27/2014 08:28 PM, Matt Turner wrote:
  On Tue, May 27, 2014 at 7:49 PM, Ian Romanick i...@freedesktop.org
 wrote:
  From: Ian Romanick ian.d.roman...@intel.com
 
  No change in the peak ir_variable memory usage in a trimmed apitrace of
  dota2 on 64-bit.
 
  No change in the peak ir_variable memory usage in a trimmed apitrace of
  dota2 on 32-bit.
 
  Signed-off-by: Ian Romanick ian.d.roman...@intel.com
  ---
   src/glsl/ir.h | 5 +++--
   1 file changed, 3 insertions(+), 2 deletions(-)
 
  diff --git a/src/glsl/ir.h b/src/glsl/ir.h
  index 7faee74..bc02f6e 100644
  --- a/src/glsl/ir.h
  +++ b/src/glsl/ir.h
  @@ -92,12 +92,13 @@ enum ir_node_type {
*/
   class ir_instruction : public exec_node {
   private:
  -   enum ir_node_type ir_type;
  +   uint8_t ir_type;
 
   public:
  inline enum ir_node_type get_ir_type() const
  {
  -  return this->ir_type;
  +  STATIC_ASSERT(ir_type_max < 256);
  +  return (enum ir_node_type) this->ir_type;
  }
 
  /**
  --
  1.8.1.4
 
  Instead of doing this, you can mark the enum type with the PACKED
  attribute. I did this in a similar change in i965 already. See
  http://lists.freedesktop.org/archives/mesa-dev/2014-February/054643.html
 
  This way we still get enum type checking and warnings out of switch
  statements and such.

 Hmm... that would mean that patch 10 wouldn't strictly be necessary.
 The disadvantage is that the next patch would need (right?) some changes
 for MSVC, especially on 32-bit.  I think it would need to be

 #if sizeof(ir_node_type) < sizeof(void *)


I don't think the preprocessor can evaluate sizeof().
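
The size arithmetic can still be done at compile time, just by the compiler proper instead of by `#if`. A sketch, with a hypothetical enum standing in for ir_node_type and made-up SKETCH_*/sketch_* names:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for ir_node_type. */
enum sketch_node_type {
   sketch_type_a,
   sketch_type_b,
   sketch_type_max
};

/* sizeof is resolved by the compiler, so an ordinary constant expression
 * works where '#if sizeof(...)' does not. */
#define SKETCH_PADDING_BYTES                                \
   (sizeof(enum sketch_node_type) < sizeof(void *)          \
       ? sizeof(void *) - sizeof(enum sketch_node_type)     \
       : sizeof(void *))

struct sketch_node {
   enum sketch_node_type type;
   uint8_t padding[SKETCH_PADDING_BYTES];
};

/* Compile-time check in plain C: a negative array size is a hard error. */
typedef char sketch_padding_is_sane[
   (SKETCH_PADDING_BYTES >= 1 && SKETCH_PADDING_BYTES <= sizeof(void *))
      ? 1 : -1];
```

The typedef plays the role of the `#error` branch: if the condition is false the build fails, whatever the compiler.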


 # define PADDING_BYTES (sizeof(void *) - sizeof(ir_node_type))
 #else
 # define PADDING_BYTES sizeof(void *)
 #  if (__GNUC__ >= 3)
 #error GCC did us wrong.
 #  endif
 #endif

   uint8_t padding[PADDING_BYTES];

 Seems a little sketchy, but might still be better... hmm...


Re: [Mesa-dev] [PATCH 7/9] egl: Don't attempt to redefine stdint.h types with VS 2013.

2014-05-02 Thread Patrick Baggett
On Fri, May 2, 2014 at 10:11 AM, jfons...@vmware.com wrote:

 From: José Fonseca jfons...@vmware.com

 Just include stdint.h.
 ---
  src/egl/main/eglcompiler.h | 3 ++-
  1 file changed, 2 insertions(+), 1 deletion(-)

 diff --git a/src/egl/main/eglcompiler.h b/src/egl/main/eglcompiler.h
 index 53dab54..5ea83d6 100644
 --- a/src/egl/main/eglcompiler.h
 +++ b/src/egl/main/eglcompiler.h
 @@ -37,7 +37,8 @@
  /**
   * Get standard integer types
   */
 -#if (defined(__STDC_VERSION__) && __STDC_VERSION__ >= 199901L)
 +#if (defined(__STDC_VERSION__) && __STDC_VERSION__ >= 199901L) || \
 +(defined(_MSC_VER) && _MSC_VER >= 1600)


VS 2010 is where the support for stdint.h begins. This can be verified by
a quick Google search.



  #  include <stdint.h>
  #elif defined(_MSC_VER)
 typedef __int8 int8_t;
 --
 1.9.1



Re: [Mesa-dev] [PATCH 4/8] radeonsi: Use util_cpu_to_le32() instead of bswap32() on big-endian systems

2014-02-20 Thread Patrick Baggett
FWIW, memcpy() vs a for() loop has different semantics with respect to
address alignment. I don't know how much it will matter, but last time I
was reading assembly output, copying int[] via for() loop didn't produce a
codepath for 16-byte aligned addresses (allowing for SSE streaming) while
memcpy() has a lot of such logic. This won't matter much unless you have
lots to copy, and of course, compiler optimizations can change, so maybe
this situation has changed.

Patrick


On Thu, Feb 20, 2014 at 8:11 PM, Michel Dänzer mic...@daenzer.net wrote:

 On Thu, 2014-02-20 at 10:21 -0800, Tom Stellard wrote:
 
  diff --git a/src/gallium/drivers/radeonsi/si_shader.c
 b/src/gallium/drivers/radeonsi/si_shader.c
  index 54270cd..9b04e6b 100644
  --- a/src/gallium/drivers/radeonsi/si_shader.c
  +++ b/src/gallium/drivers/radeonsi/si_shader.c
  @@ -2335,7 +2335,7 @@ int si_compile_llvm(struct si_context *sctx,
 struct si_pipe_shader *shader,
ptr = (uint32_t*)sctx->b.ws->buffer_map(shader->bo->cs_buf,
 sctx->b.rings.gfx.cs, PIPE_TRANSFER_WRITE);
if (0 /*SI_BIG_ENDIAN*/) {
for (i = 0; i < binary.code_size / 4; ++i) {
  - ptr[i] = util_bswap32(*(uint32_t*)(binary.code +
 i*4));
  + ptr[i] =
 util_cpu_to_le32((*(uint32_t*)(binary.code + i*4)));
}
} else {
memcpy(ptr, binary.code, binary.code_size);

 We could get rid of the separate *_ENDIAN paths using util_cpu_to_le*().

 Either way, the non-clover patches are

 Reviewed-by: Michel Dänzer michel.daen...@amd.com


 --
 Earthling Michel Dänzer|  http://www.amd.com
 Libre software enthusiast  |Mesa and X developer



Re: [Mesa-dev] Request for support of GL_AMD_pinned_memory and GL_ARB_buffer_storage extensions

2014-02-05 Thread Patrick Baggett
My understanding is that this is like having MAP_UNSYNCHRONIZED on at all
times, even when it isn't mapped, because it is always mapped (into
memory). Is that correct Jose?

Patrick


On Wed, Feb 5, 2014 at 11:53 AM, Grigori Goronzy g...@chown.ath.cx wrote:

 On 05.02.2014 18:08, Jose Fonseca wrote:

 I honestly hope that GL_AMD_pinned_memory doesn't become popular. It
 would have been alright if it wasn't for this bit in
 http://www.opengl.org/registry/specs/AMD/pinned_memory.txt which says:

  2) Can the application still use the buffer using the CPU address?

  RESOLVED: YES. However, this access would be completely
  non synchronized to the OpenGL pipeline, unless explicit
  synchronization is being used (for example, through glFinish or
 by using
  sync objects).

 And I'm imagining apps which are streaming vertex data doing precisely
 just that...


 I don't understand your concern, this is exactly the same behavior
 GL_MAP_UNSYNCHRONIZED_BIT has, and apps are supposedly using that properly.
 How does apitrace handle it?

 Grigori



[Mesa-dev] Testing optimizer

2013-12-17 Thread Patrick Baggett
Hi all,

Is there a way to see the machine code that is generated by the GLSL
compiler for all GPU instruction sets? For example, I would like to know if
the optimizer optimizes certain (equivalent) constructs (or not), and avoid
them if possible. I know there is a lot to optimization on GPUs that I
don't know, but I'd still like to get some ballpark estimates. For example,
I'm curious whether:

//let p1, p2, p3 be vec2 uniforms

vec4(p1, 0, 0) + vec4(p2, 0, 0) + vec4(p3, 0, 1)

produces identical machine code as:

vec4(p1+p2+p3, 0, 1);

for all architectures supported by Mesa.


Re: [Mesa-dev] [PATCH 2/2] radeonsi: pad IBs to a multiple of 8 DWs

2013-09-06 Thread Patrick Baggett
 Any reason for this complicated logic, instead of simply:

 while (cs->cdw & 0x7)
 cs->buf[cs->cdw++] = 0x80000000;


Ah, that is eloquently terse; I'm going to have to remember that.
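
For my own notes, the idiom in isolation. The struct below is a hypothetical miniature of a command stream, and the padding value is passed in as a parameter instead of being hard-coded.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical, minimal command stream: 'cdw' counts the dwords written. */
struct sketch_cs {
   uint32_t buf[32];
   unsigned cdw;
};

/* Appends 'nop' until cdw is a multiple of 8.  (cdw & 0x7) equals cdw % 8,
 * which is zero exactly when the low three bits are clear.  The caller is
 * expected to have left room for up to 7 extra dwords. */
static void
sketch_pad_to_8_dwords(struct sketch_cs *cs, uint32_t nop)
{
   while (cs->cdw & 0x7)
      cs->buf[cs->cdw++] = nop;
}
```

A stream that is already aligned is left untouched, since the loop condition is false on entry.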

Patrick


 Earthling Michel Dänzer   |   http://www.amd.com
 Libre software enthusiast |  Debian, X and DRI developer



Re: [Mesa-dev] Another Take on the S3TC issue

2013-08-13 Thread Patrick Baggett
I've been hanging on this list for a while, and this isn't the first time
this has been suggested. The general thing that is repeated is basically
this: if you make an API (e.g. OpenGL) that supports S3TC without a
license, you're in trouble, even if it is a passthrough to the hardware,
which also required a license to produce in the first place. I think the
assumption most people make is that if the hardware vendor paid a license
to implement S3TC in an ASIC, then surely simply passing through data is
OK. After all, it is being done without any knowledge of the algorithm,
etc. From a common sense standpoint, I would agree.
However, the note in the S3TC extension itself[1] mentions explicitly
to be wary of such assumptions in the IP Status section, and notes that *a
license for one API is not a license for another*. This implies that for an
API to make use of S3TC, it requires a license, which Mesa in general, does
not have, while a hardware vendor might. All of this is theoretical as far
as I've read; I don't think anyone has legally challenged this for open
source drivers and posted the results on this mailing list -- most have
stayed away from it with prejudice. I think the patent was granted in
1999, so at least in the USA, hopefully we don't have too many more years
of this garbage.

Patrick

[1] http://www.opengl.org/registry/specs/EXT/texture_compression_s3tc.txt


On Tue, Aug 13, 2013 at 1:53 PM, Uwe Schmidt 
simon.schm...@cs-systemberatung.de wrote:

 Hi,

 I have read about the issue of implementing the S3TC Extension in Mesa:
 http://dri.freedesktop.org/wiki/S3TC/

 As I understood, the problem is, that encoding and decoding S3TC in
 software is covered by patents, while passing S3TC compressed data to the
 GPU is still ok.

 AS NOW:

 If force_s3tc_enable is enabled in Mesa3D, uploading a S3TC encoded
 texture works if format==internalFormat is true. If format!=internalFormat
 is true, it would fail (as i know).

 SO MY PROPOSAL:

 If 'format' is one of the S3TC types, and format!=internalFormat is true,
 then set internalFormat:=format.

 Else, if 'internalFormat' is one of the S3TC types, but the 'format' isn't,
 set internalFormat:=format (or any other format, Mesa3D can encode).




Re: [Mesa-dev] Another Take on the S3TC issue

2013-08-13 Thread Patrick Baggett
Erm... I'm wondering... why does the S3TC issue come up every few
 months out of its grave and haunt the list (and your nerves)?


I think it is because the issue looks deceptively simple. Hardware is
hardware, right? ASICs do the decompression, not software. Surely blindly
copying bits from one device to another *can't* be patent infringement.
Surely, right? :\

Patrick


Re: [Mesa-dev] [PATCH] r300g: add program name check for BSD

2013-06-26 Thread Patrick Baggett
On Wed, Jun 26, 2013 at 2:11 AM, Jonathan Gray j...@jsg.id.au wrote:

 program_invocation_short_name is glibc specific.  Provide an
 alternative using getprogname(), which can be found on *BSD and OS X.

 Signed-off-by: Jonathan Gray j...@jsg.id.au
 ---
  src/gallium/drivers/r300/r300_chipset.c | 10 +-
  1 file changed, 9 insertions(+), 1 deletion(-)

 diff --git src/gallium/drivers/r300/r300_chipset.c
 src/gallium/drivers/r300/r300_chipset.c
 index 11061ed..7f51ccb 100644
 --- src/gallium/drivers/r300/r300_chipset.c
 +++ src/gallium/drivers/r300/r300_chipset.c
 @@ -30,6 +30,14 @@
  #include <stdio.h>
  #include <errno.h>

 +#undef GET_PROGRAM_NAME
 +#ifdef __GLIBC__
 +#  define GET_PROGRAM_NAME() program_invocation_short_name


I think you are missing parentheses on the end of
program_invocation_short_name
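(As far as I can tell from the glibc manual, program_invocation_short_name is declared as a plain char * variable rather than a function, so the parentheses may not actually be needed. Here is a minimal sketch of the macro under that assumption. program_is() is a hypothetical helper of mine, not something from the patch, and the non-glibc branch assumes getprogname() exists, as the patch description says for *BSD and OS X.)

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE            /* ask glibc to expose its GNU extensions */
#endif
#include <assert.h>
#include <stdlib.h>            /* getprogname() on *BSD and OS X */
#include <string.h>

#ifdef __GLIBC__
/* glibc exports this as a variable, so the macro expands to an identifier */
extern char *program_invocation_short_name;
#  define GET_PROGRAM_NAME() program_invocation_short_name
#else /* *BSD and OS X */
#  define GET_PROGRAM_NAME() getprogname()
#endif

/* Hypothetical helper: returns 1 when the running program is `name`. */
static int
program_is(const char *name)
{
   assert(name != NULL);
   return strcmp(name, GET_PROGRAM_NAME()) == 0;
}
```

A blacklist loop like the one in the patch could then call program_is(list[i]) for each entry.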


 +#else /* *BSD and OS X */
 +#  include <stdlib.h>
 +#  define GET_PROGRAM_NAME() getprogname()
 +#endif
 +
  /* r300_chipset: A file all to itself for deducing the various properties
 of
   * Radeons. */

 @@ -49,7 +57,7 @@ static void r300_apply_hyperz_blacklist(struct
 r300_capabilities* caps)
  int i;

  for (i = 0; i < Elements(list); i++) {
 -if (strcmp(list[i], program_invocation_short_name) == 0) {
 +if (strcmp(list[i], GET_PROGRAM_NAME()) == 0) {
  caps->zmask_ram = 0;
  caps->hiz_ram = 0;
  break;
 --
 1.8.3.1

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] forking shared intel directory?

2013-06-21 Thread Patrick Baggett
On Fri, Jun 21, 2013 at 1:29 PM, Eric Anholt e...@anholt.net wrote:

 Long ago, when porting FBO and memory management support to i965, I
 merged a bunch of code between the i915 and i965 drivers and put it in
 the intel directory.  I think it served us well for a long time, as both
 drivers got improvements from shared work on that code.  But since then,
 we've talked several times about splitting things back apart (since we
 break i915 much more often than we improve it), so I spent yesterday and
 today looking at what the impact would be.


I'm not a developer, but I like to keep up with the drivers that I have
hardware for. Please take my opinions with a grain of salt.

When you say you break i915 more than you improve it, do you mean to say
that it is difficult to improve !i915 without breaking i915 and therefore
to improve development speed, it should be forked OR that i915 doesn't
receive enough testing / have maintainers who can resolve the issues and so
it burdens other developers to fix i915 and hence slows development?

The reason I ask is because if it is #2, then it sounds like you should be
looking for someone to volunteer as the official i915 maintainer [and if
none, then fork], but if it is #1, then maintainer or not, it will slow
down your efforts.




 LOC counts (wc -l):

             intel/   i915/   i965/   total
 master:     14751    13458   61109   89318
 fork-i915:      0    24322   74978   99300

 We duplicate ~10000 lines of code, but i915 drops ~4000 lines of code
 from its build and i965 drops ~1000.

 context size:
             i915    i965
 master:     99512   101456
 fork-i915:  99384   100824

 There's a bunch of cleanup I haven't done in the branch, like moving
 brw_vtbl.c contents to sensible places, or nuking the intel vs brw split
 that doesn't make any sense any more.

 I'm ambivalent about the change.  If the code growth from splitting was
 7000 lines or so, I'd be happy, but this feels pretty big.  On the
 other hand, the cleanups feel good to me.  I don't know how other
 developers feel.  There's a branch up at fork-i915 of my tree.  If
 people are excited about doing this and I get a bunch of acks for the
 two "copy the code to my directory" commits, I'll do those two then
 start sending out the non-copying changes for review.  If people don't
 like it, I won't be hurt.


___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] forking shared intel directory?

2013-06-21 Thread Patrick Baggett
On Fri, Jun 21, 2013 at 3:53 PM, Kenneth Graunke kenn...@whitecape.org wrote:

 On 06/21/2013 01:25 PM, Patrick Baggett wrote:

 I'm not a developer, but I like to keep up with the drivers that I have
 hardware for. Please take my opinions with a grain of salt.

 When you say you break i915 more than you improve it, do you mean to say
 that it is difficult to improve !i915 without breaking i915 and
 therefore to improve development speed, it should be forked OR that i915
 doesn't receive enough testing / have maintainers who can resolve the
 issues and so it burdens other developers to fix i915 and hence slows
 development?

 The reason I ask is because if it is #2, then it sounds like you should
 be looking for someone to volunteer as the official i915 maintainer [and
 if none, then fork], but if it is #1, then maintainer or not, it will
 slow down your efforts.


 Mostly the former...i915c already supports everything the hardware can do,
 while we're continually adding new features to i965+ (well, mostly gen6+).
  Things like HiZ, fast color clears, and ETC texture compression support
 affect the common miptree code, but they do nothing for i915 class
 hardware...there's only a potential downside of accidental breakage.

 The latter is true as well.  Unfortunately, community work is hampered by
 the fact that Intel hasn't released public documentation for i915 class
 hardware.  From time to time we've tried to find and motivate the right
 people to make that happen, but it hasn't yet.  Most people in the
 community are also more interested in working on the i915g driver.


Ah, thanks for the explanation, though I guess it doesn't do a whole, whole
lot to answer Eric's question.

On a side note: I was interested in the i915g driver, but I couldn't find
any documentation for it other than some architectural information about
the GPU's pipeline. I'm glad I wasn't just lacking the Google-fu. :\
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] forking shared intel directory?

2013-06-21 Thread Patrick Baggett
 The latter is true as well.  Unfortunately, community work is hampered by
 the fact that Intel hasn't released public documentation for i915 class
 hardware.  From time to time we've tried to find and motivate the right
 people to make that happen, but it hasn't yet.  Most people in the
 community are also more interested in working on the i915g driver.


 Ah, thanks for the explanation, though I guess it doesn't do a whole,
 whole lot to answer Eric's question.


That is to say, hearing that there isn't just a lack of maintainer or just
lack of ease for new development doesn't make either option seem better to
me, but you all know what's best here. Thanks for the info!
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 4/5] radeonsi/compute: Pass kernel arguments in a buffer

2013-05-24 Thread Patrick Baggett
The only difference I could see is that in the old code you passed
cb->buffer (which maybe points to a value?) directly into u_upload_data(),
whereas in the new code you pass cb->buffer as the parameter rbuffer
to r600_upload_const_buffer(), but inside that function you do
*rbuffer = NULL before you start, which effectively erases any previous
pointer. So if *rbuffer was examined by u_upload_data(), it may now be
different. I don't know if that matters, though.

Patrick


On Fri, May 24, 2013 at 1:07 PM, Tom Stellard t...@stellard.net wrote:

 From: Tom Stellard thomas.stell...@amd.com

 ---
  src/gallium/drivers/radeonsi/r600_buffer.c  | 31
 +
  src/gallium/drivers/radeonsi/radeonsi_compute.c | 26 ++---
  src/gallium/drivers/radeonsi/si_state.c | 29
 +++
  3 files changed, 51 insertions(+), 35 deletions(-)

 diff --git a/src/gallium/drivers/radeonsi/r600_buffer.c
 b/src/gallium/drivers/radeonsi/r600_buffer.c
 index cdf9988..87763c3 100644
 --- a/src/gallium/drivers/radeonsi/r600_buffer.c
 +++ b/src/gallium/drivers/radeonsi/r600_buffer.c
 @@ -25,6 +25,8 @@
   *  Corbin Simpson mostawesomed...@gmail.com
   */

 +#include <byteswap.h>
 +
  #include "pipe/p_screen.h"
  #include "util/u_format.h"
  #include "util/u_math.h"
 @@ -168,3 +170,32 @@ void r600_upload_index_buffer(struct r600_context
 *rctx,
 u_upload_data(rctx->uploader, 0, count * ib->index_size,
   ib->user_buffer, &ib->offset, &ib->buffer);
  }
 +
 +void r600_upload_const_buffer(struct r600_context *rctx, struct
 si_resource **rbuffer,
 +   const uint8_t *ptr, unsigned size,
 +   uint32_t *const_offset)
 +{
 +   *rbuffer = NULL;
 +
 +   if (R600_BIG_ENDIAN) {
 +   uint32_t *tmpPtr;
 +   unsigned i;
 +
 +   if (!(tmpPtr = malloc(size))) {
 +   R600_ERR("Failed to allocate BE swap buffer.\n");
 +   return;
 +   }
 +
 +   for (i = 0; i < size / 4; ++i) {
 +   tmpPtr[i] = bswap_32(((uint32_t *)ptr)[i]);
 +   }
 +
 +   u_upload_data(rctx->uploader, 0, size, tmpPtr,
 const_offset,
 +   (struct pipe_resource**)rbuffer);
 +
 +   free(tmpPtr);
 +   } else {
 +   u_upload_data(rctx->uploader, 0, size, ptr, const_offset,
 +   (struct pipe_resource**)rbuffer);
 +   }
 +}
 diff --git a/src/gallium/drivers/radeonsi/radeonsi_compute.c
 b/src/gallium/drivers/radeonsi/radeonsi_compute.c
 index 3fb6eb1..035076d 100644
 --- a/src/gallium/drivers/radeonsi/radeonsi_compute.c
 +++ b/src/gallium/drivers/radeonsi/radeonsi_compute.c
 @@ -91,8 +91,11 @@ static void radeonsi_launch_grid(
 struct r600_context *rctx = (struct r600_context*)ctx;
  struct si_pipe_compute *program = rctx->cs_shader_state.program;
 struct si_pm4_state *pm4 = CALLOC_STRUCT(si_pm4_state);
 +   struct si_resource *input_buffer;
 +   uint32_t input_offset = 0;
 +   uint64_t input_va;
 uint64_t shader_va;
 -   unsigned arg_user_sgpr_count;
 +   unsigned arg_user_sgpr_count = 2;
 unsigned i;
  struct si_pipe_shader *shader = &program->kernels[pc];

 @@ -109,21 +112,16 @@ static void radeonsi_launch_grid(
 si_pm4_inval_shader_cache(pm4);
  si_cmd_surface_sync(pm4, pm4->cp_coher_cntl);

 -   arg_user_sgpr_count = program->input_size / 4;
 -   if (program->input_size % 4 != 0) {
 -   arg_user_sgpr_count++;
 -   }
 +   /* Upload the input data */
 +   r600_upload_const_buffer(rctx, &input_buffer, input,
 +   program->input_size,
 &input_offset);
 +   input_va = r600_resource_va(ctx->screen, (struct
 pipe_resource*)input_buffer);
 +   input_va += input_offset;

 -   /* XXX: We should store arguments in memory if we run out of user
 sgprs.
 -*/
 -   assert(arg_user_sgpr_count < 16);
 +   si_pm4_add_bo(pm4, input_buffer, RADEON_USAGE_READ);

 -   for (i = 0; i < arg_user_sgpr_count; i++) {
 -   uint32_t *args = (uint32_t*)input;
 -   si_pm4_set_reg(pm4, R_00B900_COMPUTE_USER_DATA_0 +
 -   (i * 4),
 -   args[i]);
 -   }
 +   si_pm4_set_reg(pm4, R_00B900_COMPUTE_USER_DATA_0, input_va);
 +   si_pm4_set_reg(pm4, R_00B900_COMPUTE_USER_DATA_0 + 4,
 S_008F04_BASE_ADDRESS_HI (input_va >> 32) | S_008F04_STRIDE(0));

 si_pm4_set_reg(pm4, R_00B810_COMPUTE_START_X, 0);
 si_pm4_set_reg(pm4, R_00B814_COMPUTE_START_Y, 0);
 diff --git a/src/gallium/drivers/radeonsi/si_state.c
 b/src/gallium/drivers/radeonsi/si_state.c
 index dec535c..1e94f7e 100644
 --- a/src/gallium/drivers/radeonsi/si_state.c
 +++ b/src/gallium/drivers/radeonsi/si_state.c
 @@ -24,8 +24,6 @@
   *  

Re: [Mesa-dev] No configs available with xlib based egl

2013-05-07 Thread Patrick Baggett
Perhaps 16-bit color isn't supported? Maybe try other color bits or set
R/G/B individually and see what happens. Also, there is an eglinfo tool
source code in Mesa that can probably tell you a whole lot more.


Patrick


On Tue, May 7, 2013 at 7:56 AM, Divick Kishore divick.kish...@gmail.com wrote:

 Hi,
 I have compiled mesa with the following options:

 .././configure --prefix=~/lib/mesa/swrast/ --build=x86_64-linux-gnu
 --with-gallium-drivers= --with-driver=xlib --enable-egl --enable-gles1
 --enable-gles2 --with-egl-platforms=x11 CFLAGS="-Wall -g -O2"
 CXXFLAGS="-Wall -g -O2"

 but when I run a sample app with the following egl config, it returns 0
 configs.

 EGLint attr[] = {   // some attributes to set up our egl-interface
   EGL_BUFFER_SIZE, 16,
   EGL_RENDERABLE_TYPE,
   EGL_OPENGL_ES2_BIT,
   EGL_NONE
};

EGLConfig  ecfg;
EGLint num_config;
if ( !eglChooseConfig( egl_display, attr, &ecfg, 1, &num_config ) ) {
   cerr << "Failed to choose config (eglError: " << eglGetError()
  << ")" << endl;
   return 1;
}


 The code above prints 'Failed to choose config'.

 While the same code works fine when I compile with:

 ../../configure --prefix=~/lib/mesa/dri --build=x86_64-linux-gnu
 --with-driver=dri --with-dri-drivers=swrast
 --with-dri-driverdir=~/lib/mesa/dri/
 --with-dri-searchpath='~/lib/mesa/dri' --enable-glx-tls --enable-xa
 --enable-driglx-direct --with-egl-platforms=x11
 --enable-gallium-llvm=yes --with-gallium-drivers=swrast
 --enable-gles1 --enable-gles2 --enable-gallium-egl --disable-glu
 CFLAGS="-Wall -g -O2" CXXFLAGS="-Wall -g -O2"

 Could someone please suggest what could be causing this?

 Thanks & Regards,
 Divick
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] glxgears performance higher with software renderer compared to h/w drivers

2013-05-06 Thread Patrick Baggett
I don't think glxgears is the best benchmark for what is a typical OpenGL
load (if there is a typical). The 60 FPS with your hardware driver sounds
suspiciously like the refresh rate of your screen; perhaps it is
synchronized with the vertical retrace? Since I'm assuming you want to find
the fastest driver, why not try a free and open source game like openarena
to give you a better idea of how they actually perform.


Patrick


On Mon, May 6, 2013 at 9:33 AM, Divick Kishore divick.kish...@gmail.com wrote:

 Hi,
  I am trying to build s/w only mesa driver. It seems that the
 performance of software only renderer (compiled with
 --with-driver=xlib) is higher than that of h/w drivers. Could someone
 please help me understand what is causing this or if it is expected?

 I see that dri based s/w renderer is also slower than xlib/swrast
 driver. So how does dri based s/w rendering work and why is it slower
 than xlib/swrast driver?

 I presume that --with-driver=xlib builds s/w only renderer. Please
 correct me if I am wrong.

 ./configure -build=x86_64-linux-gnu --with-driver=dri
 --with-dri-drivers="i915 swrast"
 --with-dri-driverdir=/home/divick/work/mesa/mesa-8.0.5/build/dri/x86_64-linux-gnu/
 --with-dri-searchpath='/home/divick/work/mesa/mesa-8.0.5/build/dri/x86_64-linux-gnu/'
 --enable-glx-tls --enable-shared-glapi --enable-texture-float
 --enable-xa --enable-driglx-direct --with-egl-platforms="x11 drm"
 --enable-gallium-llvm --with-gallium-drivers="swrast i915"
 --enable-gles1 --enable-gles2 --enable-openvg --enable-gallium-egl
 --disable-glu CFLAGS="-Wall -g -O2" CXXFLAGS="-Wall -g -O2"

 with LIBGL_ALWAYS_SOFTWARE=1
 glxgears reports:

 GL_RENDERER   = Software Rasterizer
 GL_VERSION= 2.1 Mesa 8.0.5
 GL_VENDOR = Mesa Project

 fps: ~ 490 fps

 Without LIBGL_ALWAYS_SOFTWARE set:

 GL_RENDERER   = Mesa DRI Intel(R) Sandybridge Mobile
 GL_VERSION= 3.0 Mesa 8.0.5
 GL_VENDOR = Tungsten Graphics, Inc

 fps: ~ 60

 When compiled with configure options

  --build=x86_64-linux-gnu --disable-egl --with-gallium-drivers=
 --with-driver=xlib --disable-egl CFLAGS="-Wall -g -O2"
 CXXFLAGS="-Wall -g -O2"

 glxgears reports:

 GL_RENDERER   = Mesa X11
 GL_VERSION= 2.1 Mesa 8.0.5
 GL_VENDOR = Brian Paul

 fps: ~1600

 With drivers installed on system and with LIBGL_ALWAYS_SOFTWARE=1:

 GL_RENDERER   = Gallium 0.4 on llvmpipe (LLVM 0x209)
 GL_VERSION= 2.1 Mesa 8.0.5
 GL_VENDOR = VMware, Inc.

 fps: ~ 1130

 Thanks & Regards,
 Divick
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 03/17] swrast: Factor out texture slice counting.

2013-04-22 Thread Patrick Baggett
On Mon, Apr 22, 2013 at 11:14 AM, Eric Anholt e...@anholt.net wrote:

 This function is going to get used a lot more in upcoming patches.
 ---
  src/mesa/swrast/s_texture.c |   16 
  1 file changed, 12 insertions(+), 4 deletions(-)

 diff --git a/src/mesa/swrast/s_texture.c b/src/mesa/swrast/s_texture.c
 index 51048be..36a90dd 100644
 --- a/src/mesa/swrast/s_texture.c
 +++ b/src/mesa/swrast/s_texture.c
 @@ -58,6 +58,14 @@ _swrast_delete_texture_image(struct gl_context *ctx,
 _mesa_delete_texture_image(ctx, texImage);
  }

 +static unsigned int
 +texture_slices(struct gl_texture_image *texImage)
 +{
 +   if (texImage->TexObject->Target == GL_TEXTURE_1D_ARRAY)
 +  return texImage->Height;
 +   else
 +  return texImage->Depth;
 +}


I think you can const-qualify 'texImage'.
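Something like the following is what I have in mind. This is only a sketch: the two structs are hypothetical, cut-down stand-ins for the real Mesa ones so that it builds on its own, and GL_TEXTURE_1D_ARRAY is repeated with its value from the OpenGL headers.

```c
#include <assert.h>
#include <stddef.h>

/* Value of GL_TEXTURE_1D_ARRAY from the OpenGL headers, repeated here so
 * the sketch builds without them. */
#define GL_TEXTURE_1D_ARRAY 0x8C18

/* Hypothetical, cut-down stand-ins for the Mesa structs. */
struct gl_texture_object { unsigned Target; };
struct gl_texture_image {
   struct gl_texture_object *TexObject;
   unsigned Width, Height, Depth;
};

/* The function only reads through texImage, so the pointee can be const. */
static unsigned int
texture_slices(const struct gl_texture_image *texImage)
{
   assert(texImage != NULL && texImage->TexObject != NULL);
   if (texImage->TexObject->Target == GL_TEXTURE_1D_ARRAY)
      return texImage->Height;
   else
      return texImage->Depth;
}
```

Callers holding a non-const pointer can still pass it unchanged, so nothing else in the patch would need to move.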


  /**
   * Called via ctx->Driver.AllocTextureImageBuffer()
 @@ -83,11 +91,11 @@ _swrast_alloc_texture_image_buffer(struct gl_context
 *ctx,
  * We allocate the array for 1D/2D textures too in order to avoid
 special-
  * case code in the texstore routines.
  */
 -   swImg->ImageOffsets = malloc(texImage->Depth * sizeof(GLuint));
 +   swImg->ImageOffsets = malloc(texture_slices(texImage) *
 sizeof(GLuint));
  if (!swImg->ImageOffsets)
 return GL_FALSE;

 -   for (i = 0; i < texImage->Depth; i++) {
 +   for (i = 0; i < texture_slices(texImage); i++) {
 swImg->ImageOffsets[i] = i * texImage->Width * texImage->Height;
 }

 @@ -209,20 +217,20 @@ _swrast_map_teximage(struct gl_context *ctx,

  map = swImage->Buffer;

 +   assert(slice < texture_slices(texImage));
 +
  if (texImage->TexObject->Target == GL_TEXTURE_3D ||
  texImage->TexObject->Target == GL_TEXTURE_2D_ARRAY) {
 GLuint sliceSize = _mesa_format_image_size(texImage->TexFormat,
   texImage->Width,
   texImage->Height,
   1);
 -  assert(slice < texImage->Depth);
 map += slice * sliceSize;
  } else if (texImage->TexObject->Target == GL_TEXTURE_1D_ARRAY) {
 GLuint sliceSize = _mesa_format_image_size(texImage->TexFormat,
   texImage->Width,
   1,
   1);
 -  assert(slice < texImage->Height);
 map += slice * sliceSize;
 }

 --
 1.7.10.4

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 2/3] One definition of C99 inline/__func__ to rule them all.

2013-03-12 Thread Patrick Baggett
On Tue, Mar 12, 2013 at 3:39 PM, jfons...@vmware.com wrote:

 From: José Fonseca jfons...@vmware.com

 We were in four already...
 ---
  include/c99_compat.h  |  105
 +
  src/egl/main/eglcompiler.h|   44 ++
  src/gallium/include/pipe/p_compiler.h |   74 ++-
  src/mapi/mapi/u_compiler.h|   26 ++--
  src/mesa/main/compiler.h  |   56 ++
  5 files changed, 125 insertions(+), 180 deletions(-)
  create mode 100644 include/c99_compat.h

 diff --git a/include/c99_compat.h b/include/c99_compat.h
 new file mode 100644
 index 000..39f958f
 --- /dev/null
 +++ b/include/c99_compat.h
 @@ -0,0 +1,105 @@

 +/**
 + *
 + * Copyright 2007-2013 VMware, Inc.
 + * All Rights Reserved.
 + *
 + * Permission is hereby granted, free of charge, to any person obtaining a
 + * copy of this software and associated documentation files (the
 + * Software), to deal in the Software without restriction, including
 + * without limitation the rights to use, copy, modify, merge, publish,
 + * distribute, sub license, and/or sell copies of the Software, and to
 + * permit persons to whom the Software is furnished to do so, subject to
 + * the following conditions:
 + *
 + * The above copyright notice and this permission notice (including the
 + * next paragraph) shall be included in all copies or substantial portions
 + * of the Software.
 + *
 + * THE SOFTWARE IS PROVIDED AS IS, WITHOUT WARRANTY OF ANY KIND, EXPRESS
 + * OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
 + * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NON-INFRINGEMENT.
 + * IN NO EVENT SHALL VMWARE AND/OR ITS SUPPLIERS BE LIABLE FOR
 + * ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF
 CONTRACT,
 + * TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE
 + * SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.
 + *
 +
 **/
 +
 +#ifndef _C99_COMPAT_H_
 +#define _C99_COMPAT_H_
 +
 +
 +/*
 + * C99 inline keyword
 + */
 +#ifndef inline
 +#  ifdef __cplusplus
 + /* C++ supports inline keyword */
 +#  elif defined(__GNUC__)
 +#define inline __inline__
 +#  elif defined(_MSC_VER)
 +#define inline __inline
 +#  elif defined(__ICL)
 +#define inline __inline
 +#  elif defined(__INTEL_COMPILER)
 + /* Intel compiler supports inline keyword */
 +#  elif defined(__WATCOMC__) && (__WATCOMC__ >= 1100)
 +#define inline __inline
 +#  elif defined(__SUNPRO_C) && defined(__C99FEATURES__)


Solaris Studio supports __inline and __inline__


 + /* C99 supports inline keyword */
 +#  elif (__STDC_VERSION__ >= 199901L)
 + /* C99 supports inline keyword */
 +#  else
 +#define inline
 +#  endif
 +#endif



The order of the checks will not work as expected. Intel's compiler will
define __GNUC__, and so will clang. The check for __GNUC__ has to be the
last one.
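A rough sketch of the ordering I mean, with the vendor-specific checks first and __GNUC__ last. MY_INLINE is a hypothetical name, used only so the example does not redefine the real keyword, and the __SUNPRO_C branch relies on my note above about __inline rather than on anything in the patch.

```c
#include <assert.h>

/* MY_INLINE is a hypothetical name used so the sketch does not redefine
 * the real keyword. */
#if defined(__cplusplus)
#  define MY_INLINE inline            /* C++ supports inline keyword */
#elif defined(__STDC_VERSION__) && (__STDC_VERSION__ >= 199901L)
#  define MY_INLINE inline            /* C99 or later */
#elif defined(_MSC_VER) || defined(__ICL)
#  define MY_INLINE __inline
#elif defined(__INTEL_COMPILER)
#  define MY_INLINE inline            /* Intel compiler supports inline */
#elif defined(__WATCOMC__) && (__WATCOMC__ >= 1100)
#  define MY_INLINE __inline
#elif defined(__SUNPRO_C)
#  define MY_INLINE __inline          /* Solaris Studio, per the note above */
#elif defined(__GNUC__)
#  define MY_INLINE __inline__        /* checked last: others define it too */
#else
#  define MY_INLINE
#endif

/* Trivial function that exercises whichever spelling was selected. */
static MY_INLINE int
add_one(int x)
{
   assert(x < 2147483647);
   return x + 1;
}
```

With this order a compiler that also defines __GNUC__ is matched by its own branch before the generic GCC one is reached.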



 +
 +
 +/*
 + * C99 restrict keyword
 + *
 + * See also:
 + * -
 http://cellperformance.beyond3d.com/articles/2006/05/demystifying-the-restrict-keyword.html
 + */
 +#ifndef restrict
 +#  if (__STDC_VERSION__ >= 199901L)
 + /* C99 */
 +#  elif defined(__SUNPRO_C) && defined(__C99FEATURES__)
 + /* C99 */


Solaris Studio supports _Restrict when not in C99 mode as well.

#define restrict _Restrict


 +#  elif defined(__GNUC__)
 +#define restrict __restrict__
 +#  elif defined(_MSC_VER)
 +#define restrict __restrict
 +#  else
 +#define restrict /* */
 +#  endif
 +#endif
 +
 +
 +/*
 + * C99 __func__ macro
 + */
 +#ifndef __func__
 +#  if (__STDC_VERSION__ >= 199901L)
 + /* C99 */
 +#  elif defined(__SUNPRO_C) && defined(__C99FEATURES__)
 + /* C99 */


Solaris Studio supports __FUNCTION__ when not in C99 mode.


 +#  elif defined(__GNUC__)
 +#if __GNUC__ >= 2
 +#  define __func__ __FUNCTION__
 +#else
 +#  define __func__ "<unknown>"
 +#endif
 +#  elif defined(_MSC_VER)
 +#if _MSC_VER >= 1300
 +#  define __func__ __FUNCTION__
 +#else
 +#  define __func__ "<unknown>"
 +#endif
 +#  else
 +#define __func__ "<unknown>"
 +#  endif
 +#endif
 +
 +
 +#endif /* _C99_COMPAT_H_ */
 diff --git a/src/egl/main/eglcompiler.h b/src/egl/main/eglcompiler.h
 index 9823693..2499172 100644
 --- a/src/egl/main/eglcompiler.h
 +++ b/src/egl/main/eglcompiler.h
 @@ -31,6 +31,9 @@
  #define EGLCOMPILER_INCLUDED


 +#include "c99_compat.h" /* inline, __func__, etc. */
 +
 +
  /**
   * Get standard integer types
   */
 @@ -62,30 +65,7 @@
  #endif


 -/**
 - * Function inlining
 - */
 -#ifndef inline
 -#  ifdef __cplusplus
 - /* C++ supports inline keyword */
 -#  elif defined(__GNUC__)
 -#define inline __inline__
 -#  elif defined(_MSC_VER)
 -#define inline __inline
 -#  elif defined(__ICL)
 -#define inline __inline
 -#  elif defined(__INTEL_COMPILER)
 - 

Re: [Mesa-dev] [PATCH 2/2] mesa: Speedup the xrgb - argb special case in fast_read_rgba_pixels_memcpy

2013-03-11 Thread Patrick Baggett
On Mon, Mar 11, 2013 at 9:56 AM, Jose Fonseca jfons...@vmware.com wrote:

 I'm surprised this is faster.

 In particular, for big things we'll be touching memory twice.

 Did you measure the speed up?

 Jose


I'm sorry to be dull, but is there a SSE2 implementation of this somewhere
for x86 / x64 CPUs?

Patrick
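To make the question concrete, this is roughly what I would expect such a routine to look like. It is only a sketch, not anything that exists in Mesa as far as I know: xrgb_to_argb_row() is a hypothetical name, it assumes 32-bit pixels with alpha in the top byte of each word, and it falls back to a plain loop when SSE2 is not available.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#if defined(__SSE2__)
#include <emmintrin.h>   /* SSE2 intrinsics */
#endif

/* Hypothetical helper: copy one row of 32-bit pixels and force A=0xff.
 * Assumes alpha occupies the top byte of each 32-bit word. */
static void
xrgb_to_argb_row(uint32_t *dst, const uint32_t *src, size_t width)
{
   size_t i = 0;

   assert(dst != NULL && src != NULL);
#if defined(__SSE2__)
   {
      const __m128i alpha = _mm_set1_epi32((int)0xff000000u);

      /* Four pixels per iteration; unaligned loads and stores are used
       * because nothing here guarantees 16-byte alignment. */
      for (; i + 4 <= width; i += 4) {
         __m128i px = _mm_loadu_si128((const __m128i *)(src + i));
         _mm_storeu_si128((__m128i *)(dst + i), _mm_or_si128(px, alpha));
      }
   }
#endif
   /* Scalar tail, and the whole row when SSE2 is unavailable. */
   for (; i < width; i++)
      dst[i] = src[i] | 0xff000000u;
}
```

This touches each pixel once, reading and writing in a single pass, which is the property Jose was asking about.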



 - Original Message -
  ---
   src/mesa/main/readpix.c | 5 +++--
   1 file changed, 3 insertions(+), 2 deletions(-)
 
  diff --git a/src/mesa/main/readpix.c b/src/mesa/main/readpix.c
  index 349b0bc..0f5c84c 100644
  --- a/src/mesa/main/readpix.c
  +++ b/src/mesa/main/readpix.c
  @@ -285,11 +285,12 @@ fast_read_rgba_pixels_memcpy( struct gl_context
 *ctx,
 }
   } else if (copy_xrgb) {
  /* convert xrgb -> argb */
  +  int alphaOffset = texelBytes - 1;
  for (j = 0; j < height; j++) {
  - GLuint *dst4 = (GLuint *) dst, *map4 = (GLuint *) map;
  + memcpy(dst, map, width * texelBytes);
 int i;
 for (i = 0; i < width; i++) {
  -dst4[i] = map4[i] | 0xff000000;  /* set A=0xff */
  +dst[i * texelBytes + alphaOffset] = 0xff;  /* set A=0xff */
 }
 dst += dstStride;
 map += stride;
  --
  1.8.1.5
 
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH] meta: Allocate texture before initializing texture coordinates

2013-02-22 Thread Patrick Baggett
On Fri, Feb 22, 2013 at 2:23 PM, Ian Romanick i...@freedesktop.org wrote:

 On 02/15/2013 11:20 AM, Anuj Phogat wrote:

 tex->Sright and tex->Ttop are initialized during texture allocation.
 This fixes depth buffer blitting failures in khronos conformance tests
 when run on desktop GL 3.0.

 Note: This is a candidate for stable branches.

 Signed-off-by: Anuj Phogat anuj.pho...@gmail.com


 Reviewed-by: Ian Romanick ian.d.roman...@intel.com

 I think there is a lot of room for other improvements in this code.
 Like... why are we doing glReadPixels into malloc memory, then handing that
 same pointer to glTexImage2D.  We should (at least for desktop and GLES3)
 use a PBO.

 ---
   src/mesa/drivers/common/meta.c |   17 -
   1 files changed, 8 insertions(+), 9 deletions(-)

 diff --git a/src/mesa/drivers/common/meta.c b/src/mesa/drivers/common/meta.c
 index 4e32b50..29a209e 100644
 --- a/src/mesa/drivers/common/meta.c
 +++ b/src/mesa/drivers/common/meta.c
 @@ -1910,6 +1910,14 @@ _mesa_meta_BlitFramebuffer(struct gl_context
 *ctx,
 GLuint *tmp = malloc(srcW * srcH * sizeof(GLuint));

 if (tmp) {
 +
 + newTex = alloc_texture(depthTex, srcW, srcH,
 GL_DEPTH_COMPONENT);


Are out of memory conditions handled in alloc_texture?


 + _mesa_ReadPixels(srcX, srcY, srcW, srcH, GL_DEPTH_COMPONENT,
 +  GL_UNSIGNED_INT, tmp);
 + setup_drawpix_texture(ctx, depthTex, newTex, GL_DEPTH_COMPONENT,
 +   srcW, srcH, GL_DEPTH_COMPONENT,
 +   GL_UNSIGNED_INT, tmp);
 +
/* texcoords (after texture allocation!) */
{
   verts[0].s = 0.0F;
 @@ -1928,15 +1936,6 @@ _mesa_meta_BlitFramebuffer(struct gl_context
 *ctx,
 if (!blit->DepthFP)
   init_blit_depth_pixels(ctx);

 - /* maybe change tex format here */
 - newTex = alloc_texture(depthTex, srcW, srcH,
 GL_DEPTH_COMPONENT);
 -
 - _mesa_ReadPixels(srcX, srcY, srcW, srcH,
 -  GL_DEPTH_COMPONENT, GL_UNSIGNED_INT, tmp);
 -
 - setup_drawpix_texture(ctx, depthTex, newTex,
 GL_DEPTH_COMPONENT, srcW, srcH,
 -   GL_DEPTH_COMPONENT, GL_UNSIGNED_INT, tmp);
 -
 _mesa_BindProgramARB(GL_FRAGMENT_PROGRAM_ARB,
 blit->DepthFP);
_mesa_set_enable(ctx, GL_FRAGMENT_PROGRAM_ARB, GL_TRUE);
_mesa_ColorMask(GL_FALSE, GL_FALSE, GL_FALSE, GL_FALSE);


___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] use of likey() / unlikely() macros

2013-01-17 Thread Patrick Baggett
On Thu, Jan 17, 2013 at 10:37 AM, Brian Paul bri...@vmware.com wrote:


 In compiler.h we define the likely(), unlikely() macros which wrap GCC's
 __builtin_expect().  But we only use them in a handful of places.

 It seems to me that an obvious place to possibly use these would be for GL
 error testing.  For example, in glDrawArrays():

 if (unlikely(count <= 0)) {
   _mesa_error();
}

 Plus, in some of the glBegin/End per-vertex calls such as
 glVertexAttrib3fARB() where we error test the index parameter.

 I guess the key question is how much might we gain from this.  I don't
 really have a good feel for the value at this level.  In a tight inner
 loop, sure, but the GL error checking is pretty high-level code.


This is basically a micro-optimization, to be honest. Not that
micro-optimization is bad, but while it should improve performance, it
would take a lot for that to show up on profiles. In the case of error
checking at the start of a function, you might be lucky to save a few
cycles -- virtually unnoticeable.


 I haven't found much on the web about performance gains from
 __builtin_expect().  Anyone?


I read a few hearsay posts, but this one comes with actual numbers:

http://blog.man7.org/2012/10/how-much-do-builtinexpect-likely-and.html

Long story short: if you're wrong, slower; if you're right, marginal
improvement.

Its use is to change the ordering of jumps from gcc's default, which
assumes linear execution. For example, code like this:
---
if(A == NULL) //not likely
return ERR_NULL;

if(B >= MAX) //not likely
   return ERR_MAX;

if(C < MIN) //not likely
   return ERR_MIN;

doStuff();
---

generates jumps around the return statement, so in the normal case, you're
making a jump, which can mean you have a delay and possibly refetch
instructions. If you didn't jump, then the CPU will have the "then" part
already loaded in the icache. The optimal ordering then is:

if(A != NULL) {
if(B < MAX) {
if(C >= MIN) {
doStuff();
}
else return ERR_MIN;
}
else return ERR_MAX;
}
else return ERR_NULL;

---
In the common case then, the code does not branch, but executes a linear
stream of instructions. On modern x86 CPUs, this matters very little,
except for maybe a few in-order CPUs (maybe Intel Atom?). You're probably a
lot more likely to get some improvements from non-x86 where branch
prediction is weaker or unavailable and/or the CPU is in-order. ARM and
older SPARC CPUs come to mind. Also, some architectures allow you to encode
a branch prediction hint inside of the branch itself, e.g. IA64's
"br.call.sptk.many" (Branch / Call / Static Predict Taken / Many Times),
which gcc can take advantage of. Still overall, this is well within the
realm of micro-optimization.
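For completeness, here is a minimal sketch of how the early-return checks from the first snippet would look with the hints applied. The macro definitions follow the usual __builtin_expect() idiom. The error codes, the limits and check_args() itself are hypothetical names picked for the example, not Mesa code.

```c
#include <assert.h>
#include <stddef.h>

#if defined(__GNUC__)
#  define likely(x)   __builtin_expect(!!(x), 1)
#  define unlikely(x) __builtin_expect(!!(x), 0)
#else
#  define likely(x)   (x)
#  define unlikely(x) (x)
#endif

/* Hypothetical error codes and limits, chosen only for the example. */
enum { ERR_NONE = 0, ERR_NULL = -1, ERR_MAX = -2, ERR_MIN = -3 };
enum { ARG_MIN = 0, ARG_MAX = 100 };

static int
check_args(const int *a, int b, int c)
{
   /* Each error test is marked as rarely true, so the compiler is free to
    * lay out the successful path as the straight-line one. */
   if (unlikely(a == NULL))
      return ERR_NULL;
   if (unlikely(b >= ARG_MAX))
      return ERR_MAX;
   if (unlikely(c < ARG_MIN))
      return ERR_MIN;
   return ERR_NONE;   /* the expected, fall-through path */
}
```

The source keeps its readable early-return shape, and the hint only tells the compiler which way to arrange the branches. Whether the generated code actually changes depends on the compiler and optimization level.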

Patrick
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] GL 3.1 on Radeon HD 4670?

2012-10-31 Thread Patrick Baggett
DOH. I'm sorry, I read that Mesa supported GL 3.1 and somehow I generalized
that to all drivers. Thanks for that TODO list. I guess I need to start
reading about the R700 architecture...

Patrick

On Wed, Oct 31, 2012 at 1:28 PM, Alex Deucher alexdeuc...@gmail.com wrote:

 On Wed, Oct 31, 2012 at 1:11 PM, Patrick Baggett
 baggett.patr...@gmail.com wrote:
  Hi all,
 
  I've got a really weird duck of a system: an Itanium2 system running Linux
  3.7.0-rc3 with the newest libdrm and mesa git from yesterday. I
 configured
  it with --enable-texture-float and the radeon DRI driver. When I use
  glxinfo, I see that it is Mesa 9.1-devel but only OpenGL 3.0. Is that
  because my version of glxinfo doesn't create the appropriate context? Is
 there
  an updated version of glxinfo that does? Or a flag that I should pass to
  only consider core contexts?
 

 The open source r600g driver only supports GL 3.0 at the moment.  See
 this document to see what's still missing:
 http://cgit.freedesktop.org/mesa/mesa/tree/docs/GL3.txt

 Alex

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] R600 tiling halves the frame rate

2012-10-30 Thread Patrick Baggett
Is your screen refresh rate 70 Hz? Because if so, that means that it's
syncing to the vblank on Mesa, and not doing so on the proprietary one.

Patrick

On Mon, Oct 29, 2012 at 8:24 PM, Tzvetan Mikov tmi...@jupiter.com wrote:

 On 10/28/2012 12:56 PM, Tzvetan Mikov wrote:

 On 10/28/2012 04:26 AM, Marek Olšák wrote:
 No, there is no X11 at all. I am running my tests on a very bare system
 with EGL only, hoping to minimize the test surface and isolate any
 interferences.

 I will try it though (it will also enable me to compare against the
 proprietary drivers as a baseline, I guess).


 This is not directly related to tiling, but I installed the proprietary
 drivers on the same hardware, and I am getting about 3X the performance.
 (From 70 FPS to 225 FPS in 1920x1200 on a HD6460).

 Is it known what the main reason is for such a dramatic performance
 difference between the Mesa R600 driver and proprietary driver? This is a
 very simple test app rendering two textured rectangles on screen, so I am
 guessing the difference must be due to something fundamental.


 regards,
 Tzvetan


Re: [Mesa-dev] Mesa (master): Use signbit() in IS_NEGATIVE and DIFFERENT_SIGNS

2012-09-24 Thread Patrick Baggett
Concurrency::precise_math::signbit(), and only as of VS 2012 runtimes. This
is an awfully high bar for such a simple function.
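
For compilers that lack a C99 signbit(), a fallback does not need much. The following is only a sketch, assuming a 32-bit IEEE 754 float; the helper names are hypothetical and this is not the code Mesa ended up with:

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical fallback: returns 1 if the sign bit of x is set, else 0.
 * Assumes float is a 32-bit IEEE 754 value. memcpy avoids the aliasing
 * problems of a pointer cast. */
static int float_sign_bit(float x)
{
    uint32_t bits;
    memcpy(&bits, &x, sizeof bits);
    return (int)(bits >> 31);
}

/* Hypothetical equivalent of DIFFERENT_SIGNS() built on the helper above. */
static int different_signs(float x, float y)
{
    return float_sign_bit(x) != float_sign_bit(y);
}
```

Unlike a plain x < 0.0F comparison, this distinguishes -0.0 from +0.0, which is the same behaviour signbit() gives.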



On Mon, Sep 24, 2012 at 1:43 PM, Matt Turner matts...@gmail.com wrote:

 On Mon, Sep 24, 2012 at 11:02 AM, Brian Paul bri...@vmware.com wrote:
  On 09/24/2012 10:49 AM, Matt Turner wrote:
 
  Module: Mesa
  Branch: master
  Commit: 0f3ba405eada72e1ab4371948315b28608903927
  URL:
 
 http://cgit.freedesktop.org/mesa/mesa/commit/?id=0f3ba405eada72e1ab4371948315b28608903927
 
  Author: Matt Turner matts...@gmail.com
  Date:   Fri Sep 14 16:04:40 2012 -0700
 
  Use signbit() in IS_NEGATIVE and DIFFERENT_SIGNS
 
  signbit() appears to be available everywhere (even MSVC according to
  MSDN), so let's use it instead of open-coding some messy and confusing
  bit twiddling macros.
 
  Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=54805
  Reviewed-by: Paul Berry stereotype...@gmail.com
  Suggested-by: Ian Romanick ian.d.roman...@intel.com
 
  ---
 
configure.ac   |7 +++
src/mesa/main/macros.h |   21 ++---
2 files changed, 9 insertions(+), 19 deletions(-)
 
  diff --git a/configure.ac b/configure.ac
  index 4193496..cb65467 100644
  --- a/configure.ac
  +++ b/configure.ac
  @@ -499,6 +499,13 @@ AC_SUBST([DLOPEN_LIBS])
dnl See if posix_memalign is available
AC_CHECK_FUNC([posix_memalign], [DEFINES=$DEFINES
  -DHAVE_POSIX_MEMALIGN])
 
  +dnl signbit() is a macro in glibc's math.h, so AC_CHECK_FUNC fails. To
  handle
  +dnl this, use AC_CHECK_DECLS and fallback to AC_CHECK_FUNC in case it
  fails.
  +AC_CHECK_DECLS([signbit],[],
  +   AC_CHECK_FUNC([signbit],[],
  + AC_MSG_ERROR([could not find signbit()])),
  +   [#include <math.h>])
  +
dnl SELinux awareness.
AC_ARG_ENABLE([selinux],
[AS_HELP_STRING([--enable-selinux],
  diff --git a/src/mesa/main/macros.h b/src/mesa/main/macros.h
  index 04d59d7..7b7fd1b 100644
  --- a/src/mesa/main/macros.h
  +++ b/src/mesa/main/macros.h
  @@ -693,31 +693,14 @@ NORMALIZE_3FV(GLfloat v[3])
static inline GLboolean
IS_NEGATIVE(float x)
{
  -#if defined(USE_IEEE)
  -   fi_type fi;
  -   fi.f = x;
  -   return fi.i < 0;
  -#else
  -   return x < 0.0F;
  -#endif
  +   return signbit(x) != 0;
}
 
  -
/** Test two floats have opposite signs */
static inline GLboolean
DIFFERENT_SIGNS(GLfloat x, GLfloat y)
{
  -#if defined(USE_IEEE)
  -   fi_type xfi, yfi;
  -   xfi.f = x;
  -   yfi.f = y;
  -   return !!((xfi.i ^ yfi.i) & (1u << 31));
  -#else
  -   /* Could just use (x*y<0) except for the flatshading requirements.
  -* Maybe there's a better way?
  -*/
  -   return ((x) * (y) <= 0.0F && (x) - (y) != 0.0F);
  -#endif
  +   return signbit(x) != signbit(y);
}
 
 
 
  Looks like we don't have signbit() on Windows.  We build with scons
 there so
  the autoconf check isn't applicable.  I'll post a patch in a bit.
 
  -Brian

 MSDN claims that Windows does have signbit():
 http://msdn.microsoft.com/en-us/library/hh308342.aspx


Re: [Mesa-dev] [PATCH] mesa: loosen small matrix determinant check

2012-07-30 Thread Patrick Baggett
On Mon, Jul 30, 2012 at 4:31 AM, Pekka Paalanen ppaala...@gmail.com wrote:

 On Tue, 24 Jul 2012 11:31:59 -0600
 Brian Paul bri...@vmware.com wrote:

  When computing a matrix inverse, if the determinant is too small we
 could hit
  a divide by zero.  There's a check to prevent this (we basically give up
 on
  computing the inverse and return the identity matrix.)  This patches
 loosens
  this test to fix a lighting bug reported by Lars Henning Wendt.
 
  NOTE: This is a candidate for the 8.0 branch.
  ---
   src/mesa/math/m_matrix.c |2 +-
   1 files changed, 1 insertions(+), 1 deletions(-)
 
  diff --git a/src/mesa/math/m_matrix.c b/src/mesa/math/m_matrix.c
  index 02aedba..ef377ee 100644
  --- a/src/mesa/math/m_matrix.c
  +++ b/src/mesa/math/m_matrix.c
  @@ -513,7 +513,7 @@ static GLboolean invert_matrix_3d_general( GLmatrix
 *mat )
 
  det = pos + neg;
 
  -   if (det*det < 1e-25)
  +   if (det < 1e-25)
 return GL_FALSE;
 
  det = 1.0F / det;

 Hi,

 just a fly-by question; doesn't that break if determinant is negative?
 I.e. reflection transformations.

 Yeah, I think you need a fabsf() there.
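
Something along these lines, as a rough sketch only. The helper name is hypothetical and the 1e-25 threshold is just the one from the patch:

```c
#include <math.h>
#include <stdbool.h>

/* Hypothetical helper: decide whether a determinant is large enough in
 * magnitude to divide by. Comparing fabsf(det) keeps negative determinants
 * (reflections) invertible, which a plain "det < 1e-25" test would reject. */
static bool det_is_usable(float det)
{
    return fabsf(det) >= 1e-25F;
}
```

With this, det = -1.0 (a mirror transform) is accepted, while +/-1e-30 is still treated as singular.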


Re: [Mesa-dev] IROUND() issue

2012-05-18 Thread Patrick Baggett
On Fri, May 18, 2012 at 11:28 AM, Brian Paul bri...@vmware.com wrote:

 On 05/18/2012 10:11 AM, Jose Fonseca wrote:



 - Original Message -


 A while back I noticed that the piglit roundmode-pixelstore and
 roundmode-getinteger tests pass on my 64-bit Fedora system but fail
 on
 a 32-bit Ubuntu system.  Both glGetIntegerv() and glPixelStoref()
  use
 the IROUND() function to convert floats to ints.

 The implementation of IROUND() that uses the x86 fistp instruction is
 protected with:

 #if defined(USE_X86_ASM) && defined(__GNUC__) && defined(__i386__)


 but that evaluates to 0 on x86-64 (neither USE_X86_ASM nor __i386__
 are defined) so we use the C fallback:

 #define IROUND(f)  ((int) (((f) >= 0.0F) ? ((f) + 0.5F) : ((f) - 0.5F)))

 The C version of IROUND() does what we want for the piglit tests but
 not the x86 version.  I think the default x86 rounding mode is
 FE_UPWARD so that explains the failures.


 So I think I'd like to do the following:

 1. Enable the x86 fistp-based functions in imports.h for x86-64.


 It's illegal/inefficient to use x87 on x86-64. We should use the
 appropriate SSE intrinsic instead.


The instruction is cvtss2si. Even if you use SSE here, you depend on the
rounding mode in the MXCSR register, which means you'll have to set that,
because some applications change this mode to use a faster or more precise
rounding mode. It's the same problem that you have with fistp.
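
To make that concrete, here is a minimal sketch using the SSE intrinsics from xmmintrin.h, assuming an x86 target with SSE. The helper name is hypothetical; the point is only that the MXCSR rounding mode has to be saved, forced and restored around the conversion:

```c
#include <xmmintrin.h>

/* Hypothetical helper: convert with round-to-nearest no matter what
 * rounding mode the application left in MXCSR.
 * Note: a production version may need extra care to keep the compiler from
 * moving the conversion across the MXCSR writes. */
static int iround_nearest_sse(float f)
{
    unsigned int saved = _MM_GET_ROUNDING_MODE();   /* read MXCSR rounding bits */
    int result;

    _MM_SET_ROUNDING_MODE(_MM_ROUND_NEAREST);
    result = _mm_cvtss_si32(_mm_set_ss(f));         /* compiles to cvtss2si */
    _MM_SET_ROUNDING_MODE(saved);                   /* put the app's mode back */
    return result;
}
```

Round-to-nearest here means ties go to the even integer, so this is not bit-for-bit the same as the C fallback for values exactly halfway between two integers.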



  2. Rename IROUND() to IROUND_FAST() and define it as float->int
 conversion by whatever method is fastest.

 3. Define IROUND() as round to nearest int.  For the x86 fistp
 implementation this would involve setting/restoring the rounding
 mode.


If I recall, the FPU is generally run with a rounding mode other than
truncation by default, so float -> int conversions that involve truncation
(a C cast) usually require changing the rounding mode *to truncation*.
This was such a problem that SSE3 added fisttp, which is an FP integer
store with truncation. I guess though if the default rounding mode causes
problems, there isn't much that can be done but change it each time.
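
For reference, the two pure C behaviours being compared look like this. It is a sketch with hypothetical function names; the first body is the C fallback quoted above, the second is a plain cast:

```c
/* C fallback from the thread: round half away from zero. */
static int iround_c(float f)
{
    return (int)((f >= 0.0F) ? (f + 0.5F) : (f - 0.5F));
}

/* A C cast always truncates toward zero, whatever the hardware rounding
 * mode is, which is why x87 code had to switch modes around it. */
static int itrunc_c(float f)
{
    return (int)f;
}
```

So 2.5 goes to 3 with iround_c() and to 2 with itrunc_c(), independent of any FPU control word setting.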

Patrick


Re: [Mesa-dev] Four questions about DRI1 drivers

2012-03-01 Thread Patrick Baggett
Now I'm curious. Is it the case that every DRI1 driver *could be* a DRI2
driver with enough effort? Not talking about emulating hardware features.

Patrick

On Thu, Mar 1, 2012 at 1:46 PM, Dave Airlie airl...@gmail.com wrote:

 On Thu, Mar 1, 2012 at 7:25 PM, Connor Behan connor.be...@gmail.com
 wrote:
  On 01/03/12 01:36 AM, Dave Airlie wrote:
 
  You can still build r128_dri.so from Mesa 7.11 and it will work with
 later
  Mesa libGLs fine. You just can't build it from Mesa 8.0 source anymore.
 
  Really? Even if no one updates r128 to stay compatible with new libGLs
 and
  no one updating libGL gives a second thought as to whether that update
 will
  break r128? I thought the whole point of removing DRI1 drivers is that
 most
  of you are too pressured to keep that promise. If the plan really is to
  update libGL carefully so that DRI1 drivers will always work with it,
 then
  it seems like their removal does nothing but save a few MB of space on
 the
  git server.

 Thats the plan, some distros have to keep shipping older drivers, but
 also want to ship newer drivers.

 the libGL - driver interface is a lot more standard than the internal
 mesa-driver interfaces, and are not the same thing.

 Removing the drivers allowed major simplification of mesa internal
 interfaces not the GL-driver interface.

 It doesn't save any space on the git server since git holds all the
 history ever.

 Dave.


Re: [Mesa-dev] [PATCH] gallium/i965g: hide that utterly broken driver better

2011-11-28 Thread Patrick Baggett
On Mon, Nov 28, 2011 at 3:32 PM, Daniel Vetter daniel.vet...@ffwll.ch wrote:

 And warn loudly in case people want to use it. Too many tester report
 gpu hangs on irc and we rootcause this ...

 Signed-Off-by: Daniel Vetter daniel.vet...@ffwll.ch
 ---
  configure.ac |9 -
  1 files changed, 8 insertions(+), 1 deletions(-)

 diff --git a/configure.ac b/configure.ac
 index 8885a6d..4dee3ad 100644
 --- a/configure.ac
 +++ b/configure.ac
 @@ -658,7 +658,7 @@ GALLIUM_DRIVERS_DEFAULT=r300,r600,swrast
  AC_ARG_WITH([gallium-drivers],
 [AS_HELP_STRING([--with-gallium-drivers@:@=DIRS...@:@],
 [comma delimited Gallium drivers list, e.g.
 -i915,i965,nouveau,r300,r600,svga,swrast
 +i915,nouveau,r300,r600,svga,swrast
 @:@default=r300,r600,swrast@:@])],
 [with_gallium_drivers=$withval],
 [with_gallium_drivers=$GALLIUM_DRIVERS_DEFAULT])
 @@ -2007,10 +2007,17 @@ if echo $SRC_DIRS | grep 'gallium' >/dev/null 2>&1; then
 echo Winsys dirs: $GALLIUM_WINSYS_DIRS
 echo Driver dirs: $GALLIUM_DRIVERS_DIRS
 echo Trackers dirs:   $GALLIUM_STATE_TRACKERS_DIRS
 +   if echo $GALLIUM_DRIVERS_DIRS | grep i965 > /dev/null 2>&1; then
 +  echo
 +  echo WARNING: enabling i965 gallium driver
 +  echo the i965g driver is currently utterly broken, only
 for adventurours developers


I think the word is adventurous.


 +  echo
 +   fi
  else
 echo Gallium: no
  fi

 +
  dnl Libraries
  echo 
  echo Shared libs: $enable_shared
 --
 1.7.7.1



Re: [Mesa-dev] [PATCH] mesa: re-implement unpacking of DEPTH_COMPONENT32F

2011-11-22 Thread Patrick Baggett
On Tue, Nov 22, 2011 at 2:07 PM, Marek Olšák mar...@gmail.com wrote:

 Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=43122
 ---
  src/mesa/main/format_unpack.c |   10 ++
  1 files changed, 10 insertions(+), 0 deletions(-)

 diff --git a/src/mesa/main/format_unpack.c b/src/mesa/main/format_unpack.c
 index 6e2ce7a..52f224a 100644
 --- a/src/mesa/main/format_unpack.c
 +++ b/src/mesa/main/format_unpack.c
 @@ -1751,6 +1751,13 @@ unpack_float_z_Z32(GLuint n, const void *src,
 GLfloat *dst)
  }

  static void
 +unpack_float_z_Z32F(GLuint n, const void *src, GLfloat *dst)
 +{
 +   const GLfloat *s = ((const GLfloat *) src);
 +   memcpy(dst, s, n * sizeof(float));
 +}


Why bother typecasting here in a separate variable 's'?
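
What I have in mind is roughly the following. This is a sketch with plain C types standing in for GLuint/GLfloat, not a replacement patch:

```c
#include <string.h>

/* Sketch: memcpy() takes const void *, so src can be passed straight
 * through without the intermediate typed pointer. */
static void unpack_float_z_z32f(unsigned int n, const void *src, float *dst)
{
    memcpy(dst, src, n * sizeof(float));
}
```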



 +
 +static void
  unpack_float_z_Z32X24S8(GLuint n, const void *src, GLfloat *dst)
  {
const GLfloat *s = ((const GLfloat *) src);
 @@ -1783,6 +1790,9 @@ _mesa_unpack_float_z_row(gl_format format, GLuint n,
case MESA_FORMAT_Z32:
   unpack = unpack_float_z_Z32;
   break;
 +   case MESA_FORMAT_Z32_FLOAT:
 +  unpack = unpack_float_z_Z32F;
 +  break;
case MESA_FORMAT_Z32_FLOAT_X24S8:
   unpack = unpack_float_z_Z32X24S8;
   break;
 --
 1.7.5.4



Re: [Mesa-dev] Mesa (master): st/xorg: fix build without LLVM

2011-10-13 Thread Patrick Baggett
Well, the trivial answer is that Win32 uses some C/C++ runtime provided by
Microsoft, usually something like MSVCR90.DLL (v9.0) etc. Solaris uses
libC.so, for example. As far as I know, only systems where the GNU C/C++
compiler is the main system compiler (and generally therefore the GNU C++
runtime) use anything named libstdc++. So I'd expect Free/Net/OpenBSD +
Linux to use that naming and probably not much else. On other commercial
UNIXes, if it does exist, it is just for compatibility with C++ programs
compiled using g++.

Patrick

2011/10/13 Marcin Slusarz marcin.slus...@gmail.com

 On Thu, Oct 13, 2011 at 07:54:32PM +0200, Michel Dänzer wrote:
  On Thu, 2011-10-13 at 10:03 -0700, Marcin Ślusarz wrote:
   Module: Mesa
   Branch: master
   Commit: 349e4db99e938f8ee8826b0d27e490c66a1e8356
   URL:
 http://cgit.freedesktop.org/mesa/mesa/commit/?id=349e4db99e938f8ee8826b0d27e490c66a1e8356
  
   Author: Marcin Slusarz marcin.slus...@gmail.com
   Date:   Thu Oct 13 18:44:40 2011 +0200
  
   st/xorg: fix build without LLVM
  
   ---
  
src/gallium/targets/Makefile.xorg |2 ++
1 files changed, 2 insertions(+), 0 deletions(-)
  
   diff --git a/src/gallium/targets/Makefile.xorg
 b/src/gallium/targets/Makefile.xorg
   index 9269375..c96eded 100644
   --- a/src/gallium/targets/Makefile.xorg
   +++ b/src/gallium/targets/Makefile.xorg
   @@ -33,6 +33,8 @@ LD = $(CXX)
LDFLAGS += $(LLVM_LDFLAGS)
USE_CXX=1
DRIVER_LINKS += $(LLVM_LIBS) -lm -ldl
   +else
   +LDFLAGS += -lstdc++
endif
 
  This is wrong. Use g++ for linking libstdc++, gcc [...] -lstdc++ doesn't
  work everywhere.

 It wasn't my invention - I mimicked other targets (with partial exception
 of dri).
 Why gcc -lstdc++ doesn't work everywhere?

 ---
 From: Marcin Slusarz marcin.slus...@gmail.com
 Subject: [PATCH] gallium/targets: use g++ for linking

 As pointed by Michel Dänzer, gcc -lstdc++ doesn't work everywhere,
 because ...
 Use g++ for linking and remove redundant LDFLAGS += -lstdc++.
 ---
  src/gallium/targets/Makefile.dri   |2 --
  src/gallium/targets/Makefile.va|4 +---
  src/gallium/targets/Makefile.vdpau |4 +---
  src/gallium/targets/Makefile.xorg  |5 +
  src/gallium/targets/Makefile.xvmc  |4 +---
  5 files changed, 4 insertions(+), 15 deletions(-)

 diff --git a/src/gallium/targets/Makefile.dri
 b/src/gallium/targets/Makefile.dri
 index 857ebfe..a26b3ee 100644
 --- a/src/gallium/targets/Makefile.dri
 +++ b/src/gallium/targets/Makefile.dri
 @@ -4,8 +4,6 @@
  ifeq ($(MESA_LLVM),1)
  LDFLAGS += $(LLVM_LDFLAGS)
  DRIVER_EXTRAS = $(LLVM_LIBS)
 -else
 -LDFLAGS += -lstdc++
  endif

  MESA_MODULES = \
 diff --git a/src/gallium/targets/Makefile.va
 b/src/gallium/targets/Makefile.va
 index 7ced430..b6ee595 100644
 --- a/src/gallium/targets/Makefile.va
 +++ b/src/gallium/targets/Makefile.va
 @@ -17,8 +17,6 @@ STATE_TRACKER_LIB =
 $(TOP)/src/gallium/state_trackers/va/libvatracker.a
  ifeq ($(MESA_LLVM),1)
  LDFLAGS += $(LLVM_LDFLAGS)
  DRIVER_EXTRAS = $(LLVM_LIBS)
 -else
 -LDFLAGS += -lstdc++
  endif

  # XXX: Hack, VA public funcs aren't exported
 @@ -39,7 +37,7 @@ OBJECTS = $(C_SOURCES:.c=.o) \
  default: depend symlinks $(TOP)/$(LIB_DIR)/gallium/$(LIBNAME)

  $(TOP)/$(LIB_DIR)/gallium/$(LIBNAME): $(OBJECTS) $(PIPE_DRIVERS)
 $(STATE_TRACKER_LIB) $(TOP)/$(LIB_DIR)/gallium Makefile
 -   $(MKLIB) -o $(LIBBASENAME) -linker '$(CC)' -ldflags '$(LDFLAGS)' \
 +   $(MKLIB) -o $(LIBBASENAME) -linker '$(CXX)' -ldflags '$(LDFLAGS)' \
-major $(VA_MAJOR) -minor $(VA_MINOR) $(MKLIB_OPTIONS) \
-install $(TOP)/$(LIB_DIR)/gallium \
$(OBJECTS) $(STATE_TRACKER_LIB) $(PIPE_DRIVERS) $(LIBS)
 $(DRIVER_EXTRAS)
 diff --git a/src/gallium/targets/Makefile.vdpau
 b/src/gallium/targets/Makefile.vdpau
 index c634915..f6b89ad 100644
 --- a/src/gallium/targets/Makefile.vdpau
 +++ b/src/gallium/targets/Makefile.vdpau
 @@ -17,8 +17,6 @@ STATE_TRACKER_LIB =
 $(TOP)/src/gallium/state_trackers/vdpau/libvdpautracker.a
  ifeq ($(MESA_LLVM),1)
  LDFLAGS += $(LLVM_LDFLAGS)
  DRIVER_EXTRAS = $(LLVM_LIBS)
 -else
 -LDFLAGS += -lstdc++
  endif

  # XXX: Hack, VDPAU public funcs aren't exported if we link to
 libvdpautracker.a :(
 @@ -39,7 +37,7 @@ OBJECTS = $(C_SOURCES:.c=.o) \
  default: depend symlinks $(TOP)/$(LIB_DIR)/gallium/$(LIBNAME)

  $(TOP)/$(LIB_DIR)/gallium/$(LIBNAME): $(OBJECTS) $(PIPE_DRIVERS)
 $(STATE_TRACKER_LIB) $(TOP)/$(LIB_DIR)/gallium Makefile
 -   $(MKLIB) -o $(LIBBASENAME) -linker '$(CC)' -ldflags '$(LDFLAGS)' \
 +   $(MKLIB) -o $(LIBBASENAME) -linker '$(CXX)' -ldflags '$(LDFLAGS)' \
-major $(VDPAU_MAJOR) -minor $(VDPAU_MINOR) $(MKLIB_OPTIONS)
 \
-install $(TOP)/$(LIB_DIR)/gallium \
$(OBJECTS) $(STATE_TRACKER_LIB) $(PIPE_DRIVERS) $(LIBS)
 $(DRIVER_EXTRAS)
 diff --git a/src/gallium/targets/Makefile.xorg
 b/src/gallium/targets/Makefile.xorg
 index c96eded..0538b2b 100644
 --- a/src/gallium/targets/Makefile.xorg
 +++ 

Re: [Mesa-dev] DEATH to old drivers!

2011-08-24 Thread Patrick Baggett
My Voodoo3 3500 AGP just wept.

On Wed, Aug 24, 2011 at 4:36 PM, Eric Anholt e...@anholt.net wrote:

 On Wed, 24 Aug 2011 12:11:32 -0700, Ian Romanick i...@freedesktop.org
 wrote:
  -BEGIN PGP SIGNED MESSAGE-
  Hash: SHA1
 
  I'd like to propose giving the ax to a bunch of old, unmaintained
  drivers.  I've been doing a bunch of refactoring and reworking of core
  Mesa code, and these drivers have been causing me problems for a number
  of reasons.

 Acked!



Re: [Mesa-dev] GPL'd vl_mpeg12_bitstream.c

2011-08-12 Thread Patrick Baggett
Why not ask the original author to relicense?

2011/8/12 Marek Olšák mar...@gmail.com

 2011/8/12 Christian König deathsim...@vodafone.de:
  On Friday, 12.08.2011, at 10:49 -0400, Younes Manton wrote:
  Sorry, by incompatible I didn't mean that you couldn't use them
  together, but that one is more restrictive than the other. Like the
  discussion you quoted states, if you combine MIT and GPL you have to
  satisfy both of them, which means you have to satisfy the GPL. I
  personally don't care that much, but unfortunately with the way
  gallium is built it affects more than just VDPAU.
 
  Every driver in lib/gallium includes that code, including swrast_dri
  (softpipe), r600_dri, etc, and libGL loads those drivers. If you build
  with the swrast config instead of DRI I believe gallium libGL
  statically links with softpipe, so basically my understanding is that
  anyone linking with gallium libGL (both swrast and DRI configs) has to
  satisfy the GPL now.
  Ah crap, you're right. I'd forgotten that the GPL is a problem even when
  code is just linked in, as opposed to being used.
 
  Maybe someone else who is more familiar with these sorts of things can
  comment and confirm that this is accurate and whether or not it's a
  problem.
  I already asked around in my AMD team, and the general answer was: Oh
  fuck I've no idea, please don't give me a headache. I could asked around
  a bit more, but I don't think we get a definitive answer before xmas.
 
  As a short term solution we could compile that code conditionally, and
  only enable it when the VDPAU state tracker is enabled. But as the long
  term solution the code just needs a rewrite, beside having a license
  problem, it is just not very optimal. The original code is something
  like a decade old, and is using a whole bunch of quirks which are not
  useful by today’s standards (not including the sign in mv tables for
  example). ffmpegs/libavs implementation for example is something like
  half the size and even faster, but uses more memory for table lookups.
  But that code is also dual licensed under the GPL/LGPL.
 
  Using LGPL code instead could also be a solution, because very important
  parts of Mesa (the GLSL parser for example) is already licensed under
  that, but I'm also not an expert with that also.

 Even though the GLSL parser is licensed under LGPL (because Bison is),
 there is a special exception that we may license it under whatever
 licence we want if we don't make software that does exactly what Bison
 does. So the whole GLSL compiler is actually licensed under the MIT
 license. There was one LGPL dependency (talloc), but Intel has paid
 special attention to get rid of that. My recollection is nobody wanted
 LGPL or GPL code in Mesa.

 Marek


Re: [Mesa-dev] rationale for GLubyte pointers for strings?

2011-07-19 Thread Patrick Baggett
SGI invented OpenGL and offered it first on their IRIX platform. SGI's
MIPSpro compiler has the char datatype as unsigned by default, so the
compiler would likely complain if assigning a GLbyte pointer to an
[unsigned] character pointer. Thus, to do something like

char* ext = glGetString(GL_VENDOR);

doesn't require a cast on IRIX, while the same code would require a cast
using other compilers due to the aforementioned problem.
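
A small sketch for checking this on a given compiler. The helper names are hypothetical, and I can't confirm what MIPSpro actually diagnosed. As far as ISO C goes, char, signed char and unsigned char are three distinct types, so strictly speaking the pointer conversion wants a cast whichever way plain char is signed:

```c
#include <limits.h>

/* Hypothetical check: 1 if plain char is unsigned on this compiler. */
static int plain_char_is_unsigned(void)
{
    return CHAR_MIN == 0;
}

/* Hypothetical wrapper: the explicit cast that a GLubyte-style
 * (unsigned char) string needs before it is used as a C string. */
static const char *ubytes_as_cstr(const unsigned char *s)
{
    return (const char *)s;
}
```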

Patrick


On Tue, Jul 19, 2011 at 1:44 PM, Allen Akin a...@arden.org wrote:

 On Tue, Jul 19, 2011 at 12:20:54PM -0600, tom fogal wrote:
 | glGetString and gluErrorString, plus maybe some other functions, return
 | GLubyte pointers instead of simply character pointers...
 | What's the rationale here?

 I agree, it's odd.  I don't remember the rationale, but my best guess is
 that it papered over some compatibility issue with another language
 binding (probably Fortran).  I suppose there's a very slight possibility
 that it sprang from a compatibility issue with Cray.

 Allen


Re: [Mesa-dev] is it possible to dynamic load OSMesa?

2011-07-15 Thread Patrick Baggett
If libOSMesa.so is a separate library, then isn't libGL.so too? You're calling
glGetIntegerv() from libGL.so but not from libOSMesa.so -- try doing
dlsym(handle, "glGetIntegerv") and removing libGL.so from the link line.
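
Roughly like this, as a sketch only. The helper names and the function pointer typedef are my own, with unsigned int and int standing in for GLenum and GLint; only dlopen() and dlsym() are real API:

```c
#include <dlfcn.h>
#include <stddef.h>

/* Assumed signature of glGetIntegerv, spelled with plain C types. */
typedef void (*get_integerv_fn)(unsigned int pname, int *params);

/* Hypothetical helper: open a library and look up one symbol in it.
 * The handle is deliberately left open so the symbol stays valid. */
static void *lookup_symbol(const char *library, const char *symbol)
{
    void *handle = dlopen(library, RTLD_NOW | RTLD_GLOBAL);
    if (handle == NULL)
        return NULL;
    return dlsym(handle, symbol);
}

/* Hypothetical helper: fetch glGetIntegerv from libOSMesa.so itself,
 * so the call lands in the library that owns the current context. */
static get_integerv_fn load_get_integerv(void)
{
    get_integerv_fn fn = NULL;
    void *sym = lookup_symbol("libOSMesa.so", "glGetIntegerv");
    if (sym != NULL)
        *(void **)(&fn) = sym;   /* POSIX idiom for void* -> function pointer */
    return fn;
}
```

If load_get_integerv() returns NULL, either the library could not be opened or it does not export the symbol; dlerror() will say which.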

Patrick

On Fri, Jul 15, 2011 at 2:41 PM, Paul Gotzel paul.got...@gmail.com wrote:

 Hello,

 I've downloaded the latest 7.10.3 and I need to be able to dynamically load
 OSMesa.  Is this possible?  I've tried to use dlopen and dlsym to load the
 functions and all the OSMesa calls return success but when I make the gl
 calls I get:

 GL User Error: glGetIntegerv called without a rendering context
 GL User Error: glGetIntegerv called without a rendering context
 GL User Error: glGetIntegerv called without a rendering context

 Any help would be appreciated.

 Thanks,
 Paul

 My sample program is as follows.  I compile it with the same flags as the
 rest of the demo programs without linking to OSMesa.

 static void *
 loadOSMesa()
 {
   return dlopen("libOSMesa.so", RTLD_DEEPBIND | RTLD_NOW | RTLD_GLOBAL);
 }

 static OSMesaContext
 dynOSMesaCreateContext()
 {
   typedef OSMesaContext (*CreateContextProto)( GLenum , GLint , GLint ,
                                                GLint , OSMesaContext );
   static void *createPfunc = NULL;
   CreateContextProto createContext;
   if (createPfunc == NULL)
   {
     void *handle = loadOSMesa();
     if (handle)
     {
       createPfunc = dlsym(handle, "OSMesaCreateContextExt");
     }
   }

   if (createPfunc)
   {
     createContext = (CreateContextProto)(createPfunc);
     return (*createContext)(GL_RGBA, 16, 0, 0, NULL);
   }
   return 0;
 }

 static GLboolean
 dynOSMesaMakeCurrent(OSMesaContext cid, void * win, GLenum type, GLsizei w,
                      GLsizei h)
 {
   typedef GLboolean (*MakeCurrentProto)(OSMesaContext, void *, GLenum,
                                         GLsizei, GLsizei);
   static void *currentPfunc = NULL;
   MakeCurrentProto makeCurrent;
   if (currentPfunc == NULL)
   {
     void *handle = loadOSMesa();
     if (handle)
     {
       currentPfunc = dlsym(handle, "OSMesaMakeCurrent");
     }
   }
   if (currentPfunc)
   {
     makeCurrent = (MakeCurrentProto)(currentPfunc);
     return (*makeCurrent)(cid, win, type, w, h);
   }
   return GL_FALSE;
 }

 int
 main(int argc, char *argv[])
 {
    OSMesaContext ctx;
    void *buffer;

    ctx = dynOSMesaCreateContext();
    if (!ctx) {
       printf("OSMesaCreateContext failed!\n");
       return 0;
    }

    int Width = 100;
    int Height = 100;

    /* Allocate the image buffer */
    buffer = malloc( Width * Height * 4 * sizeof(GLubyte) );
    if (!buffer) {
       printf("Alloc image buffer failed!\n");
       return 0;
    }

    /* Bind the buffer to the context and make it current */
    if (!dynOSMesaMakeCurrent( ctx, buffer, GL_UNSIGNED_BYTE, Width, Height )) {
       printf("OSMesaMakeCurrent failed!\n");
       return 0;
    }


    {
       int z, s, a;
       glGetIntegerv(GL_DEPTH_BITS, &z);
       glGetIntegerv(GL_STENCIL_BITS, &s);
       glGetIntegerv(GL_ACCUM_RED_BITS, &a);
       printf("Depth=%d Stencil=%d Accum=%d\n", z, s, a);
    }

    return 0;
 }




Re: [Mesa-dev] [PATCH v2 2/2] xorg/nouveau: blacklist all pre NV30 cards

2011-06-05 Thread Patrick Baggett
Wasn't nouveau targeted to provide HW acceleration for old cards like the
TNT2, or has that idea been killed?

Patrick

On Sun, Jun 5, 2011 at 2:06 PM, Marcin Slusarz marcin.slus...@gmail.com wrote:

 On Tue, May 17, 2011 at 12:20:14AM +0200, Marcin Slusarz wrote:
  Bail out early in probe, so other driver can take control of the card.
  Doing it in screen_create would be too late.
  ---
   src/gallium/targets/xorg-nouveau/nouveau_xorg.c |   44
 ++-
   1 files changed, 35 insertions(+), 9 deletions(-)

 ping



Re: [Mesa-dev] [PATCH] mesa: silence some compilation warnings.

2011-05-12 Thread Patrick Baggett
I would be wary of assuming you can typecast long -> pointer, or pointer ->
long. On 64-bit Windows, sizeof(int) == sizeof(long) == 4 but sizeof(void*)
== 8. On 64-bit Linux (gcc), sizeof(int) == 4 and sizeof(long) ==
sizeof(void*) == 8. It would be better to use stdint.h with uintptr_t -- it
was designed to solve exactly this problem. If you insist on using long, why
not use long long (C99), which is 64 bits on both platforms?
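
A minimal sketch of the stdint.h approach, using intptr_t because a file descriptor is signed. The function names are hypothetical and are not taken from the EGL code:

```c
#include <stdint.h>

/* Hypothetical helpers: smuggle an int through a void * and back.
 * intptr_t is an integer type wide enough to hold a pointer, so the
 * casts are warning-free on both LP64 (Linux) and LLP64 (Win64). */
static void *fd_to_handle(int fd)
{
    return (void *)(intptr_t)fd;
}

static int handle_to_fd(void *handle)
{
    return (int)(intptr_t)handle;
}
```

Something like handle_to_fd(fd_to_handle(fd)) round-trips for any int value, including -1.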



On Thu, May 12, 2011 at 3:49 AM, zhigang gong zhigang.g...@gmail.com wrote:

 glu.h: typedef void (GLAPIENTRYP _GLUfuncptr)(); causes the following
   warning: function declaration isn't a prototype.
 egl:   When converting a (void *) to an int type, it's better to
   convert to long first; otherwise, in a 64-bit environment, it
   causes a compilation warning.
 ---
  include/GL/glu.h|2 +-
  src/egl/drivers/dri2/egl_dri2.c |4 ++--
  src/egl/drivers/dri2/platform_drm.c |4 ++--
  src/egl/drivers/dri2/platform_x11.c |2 +-
  src/egl/main/eglapi.c   |2 +-
  5 files changed, 7 insertions(+), 7 deletions(-)

 diff --git a/include/GL/glu.h b/include/GL/glu.h
 index cd967ac..ba2228d 100644
 --- a/include/GL/glu.h
 +++ b/include/GL/glu.h
 @@ -284,7 +284,7 @@ typedef GLUtesselator GLUtriangulatorObj;
  #define GLU_TESS_MAX_COORD 1.0e150

  /* Internal convenience typedefs */
 -typedef void (GLAPIENTRYP _GLUfuncptr)();
 +typedef void (GLAPIENTRYP _GLUfuncptr)(void);

  GLAPI void GLAPIENTRY gluBeginCurve (GLUnurbs* nurb);
  GLAPI void GLAPIENTRY gluBeginPolygon (GLUtesselator* tess);
 diff --git a/src/egl/drivers/dri2/egl_dri2.c
 b/src/egl/drivers/dri2/egl_dri2.c
 index afab679..f5f5ac3 100644
 --- a/src/egl/drivers/dri2/egl_dri2.c
 +++ b/src/egl/drivers/dri2/egl_dri2.c
 @@ -835,7 +835,7 @@ dri2_create_image_khr_renderbuffer(_EGLDisplay
 *disp, _EGLContext *ctx,
struct dri2_egl_display *dri2_dpy = dri2_egl_display(disp);
struct dri2_egl_context *dri2_ctx = dri2_egl_context(ctx);
struct dri2_egl_image *dri2_img;
 -   GLuint renderbuffer = (GLuint) buffer;
 +   GLuint renderbuffer =  (unsigned long) buffer;

if (renderbuffer == 0) {
   _eglError(EGL_BAD_PARAMETER, dri2_create_image_khr);
 @@ -870,7 +870,7 @@ dri2_create_image_mesa_drm_buffer(_EGLDisplay
 *disp, _EGLContext *ctx,

(void) ctx;

 -   name = (EGLint) buffer;
 +   name = (unsigned long) buffer;

err = _eglParseImageAttribList(attrs, disp, attr_list);
if (err != EGL_SUCCESS)
 diff --git a/src/egl/drivers/dri2/platform_drm.c
 b/src/egl/drivers/dri2/platform_drm.c
 index 68912e3..cea8418 100644
 --- a/src/egl/drivers/dri2/platform_drm.c
 +++ b/src/egl/drivers/dri2/platform_drm.c
 @@ -596,7 +596,7 @@ dri2_get_device_name(int fd)
   goto out;
}

 -   device_name = udev_device_get_devnode(device);
 +   device_name = (char*)udev_device_get_devnode(device);
if (!device_name)
   goto out;
device_name = strdup(device_name);
 @@ -690,7 +690,7 @@ dri2_initialize_drm(_EGLDriver *drv, _EGLDisplay *disp)
memset(dri2_dpy, 0, sizeof *dri2_dpy);

    disp->DriverData = (void *) dri2_dpy;
 -   dri2_dpy->fd = (int) disp->PlatformDisplay;
 +   dri2_dpy->fd = (long) disp->PlatformDisplay;

    dri2_dpy->driver_name = dri2_get_driver_for_fd(dri2_dpy->fd);
if (dri2_dpy-driver_name == NULL)
 diff --git a/src/egl/drivers/dri2/platform_x11.c
 b/src/egl/drivers/dri2/platform_x11.c
 index 5d4ac6a..90136f4 100644
 --- a/src/egl/drivers/dri2/platform_x11.c
 +++ b/src/egl/drivers/dri2/platform_x11.c
 @@ -784,7 +784,7 @@ dri2_create_image_khr_pixmap(_EGLDisplay *disp,
 _EGLContext *ctx,

(void) ctx;

 -   drawable = (xcb_drawable_t) buffer;
 +   drawable = (xcb_drawable_t) (long)buffer;
    xcb_dri2_create_drawable (dri2_dpy->conn, drawable);
attachments[0] = XCB_DRI2_ATTACHMENT_BUFFER_FRONT_LEFT;
buffers_cookie =
 diff --git a/src/egl/main/eglapi.c b/src/egl/main/eglapi.c
 index 336ec23..9063752 100644
 --- a/src/egl/main/eglapi.c
 +++ b/src/egl/main/eglapi.c
 @@ -1168,7 +1168,7 @@ eglQueryModeStringMESA(EGLDisplay dpy, EGLModeMESA
 mode)
  EGLDisplay EGLAPIENTRY
  eglGetDRMDisplayMESA(int fd)
  {
 -   _EGLDisplay *dpy = _eglFindDisplay(_EGL_PLATFORM_DRM, (void *) fd);
 +   _EGLDisplay *dpy = _eglFindDisplay(_EGL_PLATFORM_DRM, (void *)
 (long)fd);
return _eglGetDisplayHandle(dpy);
  }

 --
 1.7.3.1


[Mesa-dev] Thanks To All!

2011-05-03 Thread Patrick Baggett
I just wanted to say thanks! to everyone who has been taking part in
Mesa3D. I have an R500-based card and it is good to know that it still
functions on Linux even after ATI/AMD decided it was too old to support.
Not only that, it still receives improvements from Mesa. I even hear
whispers that those cards might function on Power architecture systems, and
I can't help but find myself impressed. Good job to you all and keep up
the good work.

Patrick Baggett


Re: [Mesa-dev] Naked DXTn support via ARB_texture_compression?

2011-03-20 Thread Patrick Baggett
Offhand, anyone know when these patents expire?

Patrick


Re: [Mesa-dev] Truncated extensions string

2011-03-11 Thread Patrick Baggett
I feel like there is some kind of underlying lesson that we, OpenGL app
programmers, should be getting out of this...

What about a pseudo-database of app -> extension list rather than by year?
Surely Quake3 makes use of no more than 10 extensions. I'd imagine the same
holds true for other old games as well. Simply running "strings" on their
binaries could figure that out...
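
A minimal sketch of the app -> extension list idea, in C. Everything in it
is hypothetical: the struct, the table contents, and the function name
extensions_for_app are made up for illustration and are not Mesa code. A
real table would be filled in from what each binary actually references,
and the result would still need to be intersected with what the driver
supports.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical per-application entry; not part of Mesa. */
struct app_ext_entry {
   const char *app_name;    /* process name to match */
   const char *extensions;  /* space-separated extensions the app looks for */
};

/* Illustrative contents only. */
static const struct app_ext_entry app_ext_db[] = {
   { "quake3", "GL_ARB_multitexture GL_EXT_texture_env_add "
               "GL_EXT_compiled_vertex_array" },
};

/* Return the reduced extension string for a known application, or
 * full_list unchanged when the application is not in the table. */
static const char *
extensions_for_app(const char *app_name, const char *full_list)
{
   size_t i;

   assert(app_name != NULL && full_list != NULL);
   for (i = 0; i < sizeof(app_ext_db) / sizeof(app_ext_db[0]); i++) {
      if (strcmp(app_ext_db[i].app_name, app_name) == 0)
         return app_ext_db[i].extensions;
   }
   return full_list;
}
```

Unknown applications fall through to the full list, so only the games that
are known to break would see a shortened string.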

On Fri, Mar 11, 2011 at 2:14 PM, Kenneth Graunke kenn...@whitecape.org wrote:

 On Friday, March 11, 2011 10:46:31 AM José Fonseca wrote:
  On Fri, 2011-03-11 at 09:04 -0800, Eric Anholt wrote:
   On Fri, 11 Mar 2011 10:33:13 +, José Fonseca jfons...@vmware.com wrote:
The problem from
http://www.mail-archive.com/mesa3d-dev@lists.sourceforge.net/msg12493.html
is back, and now a bit worse -- it causes Quake3 arena demo to crash
(at least the windows version). The full version works fine. I'm not
sure what other applications are hit by this. See the above thread for
more background.
   
   
There are two major approaches:

1) sort extensions chronologically instead of alphabetically. See
attached patch for that

  - for those who prefer to see extensions sorted alphabetically in
    glxinfo, we could modify glxinfo to sort them before displaying

2) detect broken applications (i.e., by process name), and only sort
extensions strings chronologically then

Personally I think that varying behavior based on process name is an
ugly and brittle hack, so I'd prefer 1), but I just want to put this
behind me above all, so whatever works is also fine by me.
  
   If this is just a hack for one broken application, and we think that
   building in a workaround for this particular broken application is
   important (I don't), I still prefer an obvious hack for that broken
   application like feeding it a tiny extension string that it cares about,
   instead of reordering the extension list.
 
  There are many versions of Quake3 out there, some fixed, others not, and
  others enhanced. This means a tiny string would prevent any Quake3
  application from finding newer extensions. So I think that if we go for
  the application name detection then we should present the whole
  extension string sorted chronologically, instead of giving a tiny
  string.
 
  Jose

 I agree with José - it's not one broken application, it's a number of old,
 sometimes closed-source games that we can't change.

 I'm not sure how changing the sorting solves the problem, anyway - the
 amount of data returned would still overflow the buffer, possibly wreaking
 havoc.  I'd rather avoid that.

 Ian and I talked about this a year ago, and the solution I believe we came
 up with was to use a driconf option or environment variable:

 If MESA_MAX_EXTENSION_YEAR=2006, then glGetString would only return
 extensions created in 2006 or earlier.  The rationale is that if a game
 came out in 2006, it won't know about any extensions from 2007 anyway, so
 advertising them is useless.  The fixed-size buffer is also almost
 certainly large enough to handle this cut-down list of extensions.

 This should be trivial to do now that you already have the years for each
 extension...just store them in the table, rather than in comments, and
 check before listing an extension.

 A driconf option is nice because it allows this to be overridden in .drirc
 on a per-app basis, rather than having to set an environment variable.  It
 might be a bit more work though.

 --Kenneth
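
The year cutoff described above could look roughly like the C sketch below.
It is only an illustration under stated assumptions: the table layout, the
listed years, and the function names max_extension_year and
build_extension_string are hypothetical, and only the
MESA_MAX_EXTENSION_YEAR variable name comes from the thread. The builder
also stops before overflowing the caller's buffer.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical table entry: extension name plus the year it appeared. */
struct ext_entry {
   const char *name;
   int year;
};

/* Illustrative subset; the years are placeholders for this sketch. */
static const struct ext_entry ext_table[] = {
   { "GL_ARB_multitexture",         1998 },
   { "GL_EXT_texture_env_add",      1999 },
   { "GL_ARB_vertex_buffer_object", 2003 },
   { "GL_ARB_framebuffer_object",   2008 },
};

/* Read the cutoff year from the environment; 0 means "no cutoff". */
static int
max_extension_year(void)
{
   const char *s = getenv("MESA_MAX_EXTENSION_YEAR");
   char *end;
   long year;

   if (s == NULL || *s == '\0')
      return 0;
   year = strtol(s, &end, 10);
   if (*end != '\0' || year <= 0 || year > 9999)
      return 0;   /* ignore malformed values */
   return (int) year;
}

/* Write the names whose year is <= max_year into buf, each followed by a
 * space. Stops early rather than writing past size bytes. Returns the
 * string length. */
static size_t
build_extension_string(char *buf, size_t size, int max_year)
{
   size_t used = 0, i;

   assert(buf != NULL && size > 0);
   buf[0] = '\0';
   for (i = 0; i < sizeof(ext_table) / sizeof(ext_table[0]); i++) {
      size_t len;

      if (max_year > 0 && ext_table[i].year > max_year)
         continue;   /* newer than the cutoff: do not advertise */
      len = strlen(ext_table[i].name);
      if (used + len + 2 > size)   /* name + space + terminating NUL */
         break;
      memcpy(buf + used, ext_table[i].name, len);
      used += len;
      buf[used++] = ' ';
      buf[used] = '\0';
   }
   return used;
}
```

A caller would pass max_extension_year() as the last argument, so an unset
variable keeps the full list. A driconf option could feed the same
parameter instead of the environment variable.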



Re: [Mesa-dev] [PATCH] os: add spinlocks

2010-12-15 Thread Patrick Baggett
UP = Uniprocessor system, (S)MP = (Symmetric) multiprocessor system.

On Wed, Dec 15, 2010 at 2:23 AM, Marek Olšák mar...@gmail.com wrote:

 On Tue, Dec 14, 2010 at 8:10 PM, Thomas Hellstrom
 thellst...@vmware.com wrote:

 Hmm,

 for the uninformed, where do we need to use spinlocks in gallium and how
 do we avoid using them on a UP system?


 I plan to use spinlocks to guard very simple code like the macro
 remove_from_list, which might be, under some circumstances, called too
 often. Entering and leaving a mutex is quite visible in callgrind.

 What does UP stand for?

 Marek
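
For readers following along, here is a minimal sketch of the use case Marek
describes: a spinlock guarding a very short list unlink. It assumes C11
<stdatomic.h> and is not the API from the patch; the names sketch_spinlock,
sketch_spin_lock, sketch_spin_unlock and locked_remove are hypothetical, and
the small list helpers only stand in for the remove_from_list macro.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical spinlock built on a C11 atomic flag. */
typedef struct {
   atomic_flag locked;
} sketch_spinlock;

#define SKETCH_SPINLOCK_INIT { ATOMIC_FLAG_INIT }

static void
sketch_spin_lock(sketch_spinlock *l)
{
   /* Busy-wait until the flag was clear when we set it. */
   while (atomic_flag_test_and_set_explicit(&l->locked, memory_order_acquire))
      ;
}

static void
sketch_spin_unlock(sketch_spinlock *l)
{
   atomic_flag_clear_explicit(&l->locked, memory_order_release);
}

/* Minimal circular doubly linked list. */
struct node {
   struct node *prev, *next;
};

static void
list_init(struct node *head)
{
   head->prev = head->next = head;
}

/* Insert n right after head. */
static void
list_add(struct node *head, struct node *n)
{
   n->next = head->next;
   n->prev = head;
   head->next->prev = n;
   head->next = n;
}

/* Unlink n while holding the spinlock; the critical section is only a
 * few pointer writes, which is the case where spinning can pay off. */
static void
locked_remove(sketch_spinlock *l, struct node *n)
{
   assert(l != NULL && n != NULL);
   sketch_spin_lock(l);
   n->prev->next = n->next;
   n->next->prev = n->prev;
   n->next = n->prev = NULL;
   sketch_spin_unlock(l);
}
```

On a UP system a waiter that spins cannot make progress until the lock
holder is scheduled again, which is presumably the concern behind the
question above; the sketch does not try to address that.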


