Re: [Mesa-dev] [PATCH 05/15] glsl: Add new {fr, ld}exp built-ins IR and prototypes.

2013-08-27 Thread Paul Berry
On 26 August 2013 17:49, Ian Romanick i...@freedesktop.org wrote:

 On 08/23/2013 02:02 PM, Paul Berry wrote:

 On 23 August 2013 13:19, Matt Turner matts...@gmail.com wrote:

 On Fri, Aug 23, 2013 at 8:55 AM, Paul Berry stereotype...@gmail.com wrote:
   On 22 August 2013 16:08, Matt Turner matts...@gmail.com wrote:
  
   ---
    src/glsl/builtins/ir/frexp.ir                   | 25 +
    src/glsl/builtins/ir/ldexp.ir                   | 25 +
    src/glsl/builtins/profiles/ARB_gpu_shader5.glsl | 10 ++
    3 files changed, 60 insertions(+)
    create mode 100644 src/glsl/builtins/ir/frexp.ir
    create mode 100644 src/glsl/builtins/ir/ldexp.ir

   diff --git a/src/glsl/builtins/ir/frexp.ir b/src/glsl/builtins/ir/frexp.ir
   new file mode 100644
   index 000..a514994
   --- /dev/null
   +++ b/src/glsl/builtins/ir/frexp.ir
   @@ -0,0 +1,25 @@
   +((function frexp
   +   (signature float
   +     (parameters
   +       (declare (in) float x)
   +       (declare (out) int exp))
   +     ((return (expression float frexp (var_ref x) (var_ref exp)))))


   Having an ir_expression that writes to one of its parameters is going to
   break assumptions in a lot of our optimization passes.

 I'm concerned that that may be a problem we have to solve anyway.

 While our hardware doesn't support an frexp instruction (like e.g.,
 AMD does) and we could probably do what you suggest, we do have
 instructions that correspond directly to the uaddCarry() and
 usubBorrow() built-ins in this same extension. They return a value and
 also have an out parameter.

 genUType uaddCarry(genUType x, genUType y, out genUType carry);
 genUType usubBorrow(genUType x, genUType y, out genUType borrow);

 We could probably avoid the problem you describe by lowering them, but
 it's feeling increasingly distasteful.

 Your code would make a good piglit test. I'll do some experiments.


 Hmm, interesting.

 The way LLVM solves this problem, as I understand it, is through
 so-called intrinsic functions
 (http://llvm.org/docs/LangRef.html#intrinsic-functions).  I wonder if we
 should start doing that in Mesa.

 Briefly, here is what it would look like, using uaddCarry as an example:

 1. First we do an inefficient implementation of uaddCarry in terms of
 existing GLSL functions, much like you did for frexp in your
 frexp_to_arith lowering pass, except that we do it in
 src/glsl/builtins/glsl/uaddCarry.glsl, so it's a little easier to review
 :).  Optimization passes already deal with function out parameters
 properly, and function inlining automatically splices in the proper code
 during linking.

 2. For back-ends that don't have an efficient native way to do
 uaddCarry, we're done.  The uaddCarry function works as is.

 3. For back-ends that do have an efficient way to do uaddCarry, we add a
 mechanism to allow the back-end to tell the linker: don't inline the
 definition of this built-in.  Just leave it as an ir_call because I have
 my own special implementation of it*.


 I had thought about solving this in a slightly different way, but there
 are a couple potential tricky bits.

 Provide an implementation of the built-in function in the GLSL library.

 float frexp(float x, out int exponent)
 {
 return __intrinsic_frexp(x, exponent);
 }

 Provide a default implementation of the intrinsic elsewhere.

 Allow drivers to supply an alternate library with custom versions of the
 intrinsics.

 Since the GLSL library's frexp is the same in either case, the problem
 Paul identifies below should be avoided.

 The tricky bit, and the problem we always come to when talking about
 intrinsics is dealing with constant expressions.  That doesn't (shouldn't?)
 apply to this case because of the out parameter, but it may apply to other
 cases.


Yeah, good point about constant expressions.  With my proposal, that could
be addressed by having the constant expression evaluator always recurse
into the GLSL implementation, regardless of whether the function is an
intrinsic (this should be fine, since the only reason for the intrinsic
version of the function to be used is to take advantage of efficient
instructions in the GPU).
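
As a rough illustration of that policy, here is a small Python model (not
Mesa code). The tables REFERENCE_IMPLS and BACKEND_INTRINSICS and the
functions fold_constant_call and runtime_lowering are hypothetical names
invented for this sketch. It uses packUnorm2x16, the built-in from Ian's
example, and assumes the first vector component lands in the low 16 bits.
The point is only that constant folding never consults the back-end's
intrinsic list, while run-time lowering does.

```python
def pack_unorm_2x16_reference(x, y):
    """Reference model of packUnorm2x16: x goes in the low 16 bits."""
    def to_unorm16(c):
        clamped = min(max(c, 0.0), 1.0)
        # Python's round() is used here; exact tie-breaking is not modelled.
        return int(round(clamped * 65535.0))
    return to_unorm16(x) | (to_unorm16(y) << 16)

# Hypothetical tables: the generic library, and what one back-end claims.
REFERENCE_IMPLS = {"packUnorm2x16": pack_unorm_2x16_reference}
BACKEND_INTRINSICS = {"packUnorm2x16"}

def fold_constant_call(name, args):
    """Constant folding always evaluates the reference implementation."""
    return REFERENCE_IMPLS[name](*args)

def runtime_lowering(name):
    """Run-time calls use the intrinsic only if the back-end advertises it."""
    return "intrinsic" if name in BACKEND_INTRINSICS else "inline"
```

With these definitions, fold_constant_call("packUnorm2x16", (1.0, 0.0))
evaluates to 65535 even though the back-end claims the intrinsic.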

I confess that I don't understand the rest of your proposal as well as I
would like.  Maybe the three of us should discuss it in person next time
we're in the office.



 Right now an application could do:

 float foo[packUnorm2x16(vec2(1,0))];

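 If packUnorm2x16 becomes __intrinsic_packUnorm2x16, the constant
 expression evaluator has to be able to handle whatever
 __intrinsic_packUnorm2x16 becomes.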

Re: [Mesa-dev] [PATCH 05/15] glsl: Add new {fr, ld}exp built-ins IR and prototypes.

2013-08-27 Thread Matt Turner
On Mon, Aug 26, 2013 at 5:49 PM, Ian Romanick i...@freedesktop.org wrote:
 On 08/23/2013 02:02 PM, Paul Berry wrote:

 On 23 August 2013 13:19, Matt Turner matts...@gmail.com wrote:

 On Fri, Aug 23, 2013 at 8:55 AM, Paul Berry stereotype...@gmail.com wrote:
   On 22 August 2013 16:08, Matt Turner matts...@gmail.com wrote:
  
   ---
    src/glsl/builtins/ir/frexp.ir                   | 25 +
    src/glsl/builtins/ir/ldexp.ir                   | 25 +
    src/glsl/builtins/profiles/ARB_gpu_shader5.glsl | 10 ++
    3 files changed, 60 insertions(+)
    create mode 100644 src/glsl/builtins/ir/frexp.ir
    create mode 100644 src/glsl/builtins/ir/ldexp.ir

   diff --git a/src/glsl/builtins/ir/frexp.ir b/src/glsl/builtins/ir/frexp.ir
   new file mode 100644
   index 000..a514994
   --- /dev/null
   +++ b/src/glsl/builtins/ir/frexp.ir
   @@ -0,0 +1,25 @@
   +((function frexp
   +   (signature float
   +     (parameters
   +       (declare (in) float x)
   +       (declare (out) int exp))
   +     ((return (expression float frexp (var_ref x) (var_ref exp)))))


   Having an ir_expression that writes to one of its parameters is going to
   break assumptions in a lot of our optimization passes.

 I'm concerned that that may be a problem we have to solve anyway.

 While our hardware doesn't support an frexp instruction (like e.g.,
 AMD does) and we could probably do what you suggest, we do have
 instructions that correspond directly to the uaddCarry() and
 usubBorrow() built-ins in this same extension. They return a value and
 also have an out parameter.

 genUType uaddCarry(genUType x, genUType y, out genUType carry);
 genUType usubBorrow(genUType x, genUType y, out genUType borrow);

 We could probably avoid the problem you describe by lowering them, but
 it's feeling increasingly distasteful.

 Your code would make a good piglit test. I'll do some experiments.


 Hmm, interesting.

 The way LLVM solves this problem, as I understand it, is through
 so-called intrinsic functions
 (http://llvm.org/docs/LangRef.html#intrinsic-functions).  I wonder if we
 should start doing that in Mesa.

 Briefly, here is what it would look like, using uaddCarry as an example:

 1. First we do an inefficient implementation of uaddCarry in terms of
 existing GLSL functions, much like you did for frexp in your
 frexp_to_arith lowering pass, except that we do it in
 src/glsl/builtins/glsl/uaddCarry.glsl, so it's a little easier to review
 :).  Optimization passes already deal with function out parameters
 properly, and function inlining automatically splices in the proper code
 during linking.

 2. For back-ends that don't have an efficient native way to do
 uaddCarry, we're done.  The uaddCarry function works as is.

 3. For back-ends that do have an efficient way to do uaddCarry, we add a
 mechanism to allow the back-end to tell the linker: don't inline the
 definition of this built-in.  Just leave it as an ir_call because I have
 my own special implementation of it*.


 I had thought about solving this in a slightly different way, but there are
 a couple potential tricky bits.

 Provide an implementation of the built-in function in the GLSL library.

 float frexp(float x, out int exponent)
 {
 return __intrinsic_frexp(x, exponent);
 }

 Provide a default implementation of the intrinsic elsewhere.

 Allow drivers to supply an alternate library with custom versions of the
 intrinsics.

 Since the GLSL library's frexp is the same in either case, the problem Paul
 identifies below should be avoided.

 The tricky bit, and the problem we always come to when talking about
 intrinsics is dealing with constant expressions.  That doesn't (shouldn't?)
 apply to this case because of the out parameter, but it may apply to other
 cases.

Maybe this is a problem in the general case, but I think the only
thing we'd want to use intrinsics for at the moment are exactly the
things you can't consider to be constant expressions -- because of the
multiple outputs.
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 05/15] glsl: Add new {fr, ld}exp built-ins IR and prototypes.

2013-08-26 Thread Ian Romanick

On 08/23/2013 02:02 PM, Paul Berry wrote:

On 23 August 2013 13:19, Matt Turner matts...@gmail.com wrote:

On Fri, Aug 23, 2013 at 8:55 AM, Paul Berry stereotype...@gmail.com wrote:
  On 22 August 2013 16:08, Matt Turner matts...@gmail.com wrote:
 
  ---
   src/glsl/builtins/ir/frexp.ir                   | 25 +
   src/glsl/builtins/ir/ldexp.ir                   | 25 +
   src/glsl/builtins/profiles/ARB_gpu_shader5.glsl | 10 ++
   3 files changed, 60 insertions(+)
   create mode 100644 src/glsl/builtins/ir/frexp.ir
   create mode 100644 src/glsl/builtins/ir/ldexp.ir

  diff --git a/src/glsl/builtins/ir/frexp.ir b/src/glsl/builtins/ir/frexp.ir
  new file mode 100644
  index 000..a514994
  --- /dev/null
  +++ b/src/glsl/builtins/ir/frexp.ir
  @@ -0,0 +1,25 @@
  +((function frexp
  +   (signature float
  +     (parameters
  +       (declare (in) float x)
  +       (declare (out) int exp))
  +     ((return (expression float frexp (var_ref x) (var_ref exp)))))


  Having an ir_expression that writes to one of its parameters is going to
  break assumptions in a lot of our optimization passes.

I'm concerned that that may be a problem we have to solve anyway.

While our hardware doesn't support an frexp instruction (like e.g.,
AMD does) and we could probably do what you suggest, we do have
instructions that correspond directly to the uaddCarry() and
usubBorrow() built-ins in this same extension. They return a value and
also have an out parameter.

genUType uaddCarry(genUType x, genUType y, out genUType carry);
genUType usubBorrow(genUType x, genUType y, out genUType borrow);

We could probably avoid the problem you describe by lowering them, but
it's feeling increasingly distasteful.

Your code would make a good piglit test. I'll do some experiments.


Hmm, interesting.

The way LLVM solves this problem, as I understand it, is through
so-called intrinsic functions
(http://llvm.org/docs/LangRef.html#intrinsic-functions).  I wonder if we
should start doing that in Mesa.

Briefly, here is what it would look like, using uaddCarry as an example:

1. First we do an inefficient implementation of uaddCarry in terms of
existing GLSL functions, much like you did for frexp in your
frexp_to_arith lowering pass, except that we do it in
src/glsl/builtins/glsl/uaddCarry.glsl, so it's a little easier to review
:).  Optimization passes already deal with function out parameters
properly, and function inlining automatically splices in the proper code
during linking.

2. For back-ends that don't have an efficient native way to do
uaddCarry, we're done.  The uaddCarry function works as is.

3. For back-ends that do have an efficient way to do uaddCarry, we add a
mechanism to allow the back-end to tell the linker: don't inline the
definition of this built-in.  Just leave it as an ir_call because I have
my own special implementation of it*.


I had thought about solving this in a slightly different way, but there 
are a couple potential tricky bits.


Provide an implementation of the built-in function in the GLSL library.

float frexp(float x, out int exponent)
{
return __intrinsic_frexp(x, exponent);
}

Provide a default implementation of the intrinsic elsewhere.

Allow drivers to supply an alternate library with custom versions of the 
intrinsics.


Since the GLSL library's frexp is the same in either case, the problem 
Paul identifies below should be avoided.
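
A minimal Python model of that layering, purely as a sketch: the GLSL
library's frexp stays the same, and only the table behind
__intrinsic_frexp changes. DEFAULT_INTRINSICS, make_glsl_library, and the
driver table are hypothetical names made up here, and the out parameter is
modelled by returning a (significand, exponent) pair.

```python
import math

def default_intrinsic_frexp(x):
    """Default (generic) body for __intrinsic_frexp."""
    return math.frexp(x)  # (significand, exponent)

# Hypothetical default intrinsic library shipped with the compiler.
DEFAULT_INTRINSICS = {"__intrinsic_frexp": default_intrinsic_frexp}

def make_glsl_library(driver_intrinsics=None):
    """Build the built-in library, letting a driver override intrinsics."""
    intrinsics = dict(DEFAULT_INTRINSICS)
    intrinsics.update(driver_intrinsics or {})

    def frexp(x):
        # Same wrapper body regardless of which intrinsic table is in use.
        return intrinsics["__intrinsic_frexp"](x)

    return {"frexp": frexp}
```

A driver with a native instruction would pass its own
{"__intrinsic_frexp": ...} entry; everyone else gets the default body.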


The tricky bit, and the problem we always come to when talking about 
intrinsics is dealing with constant expressions.  That doesn't 
(shouldn't?) apply to this case because of the out parameter, but it may 
apply to other cases.


Right now an application could do:

float foo[packUnorm2x16(vec2(1,0))];

If packUnorm2x16 becomes __intrinsic_packUnorm2x16, the constant 
expression evaluator has to be able to handle whatever 
__intrinsic_packUnorm2x16 becomes.



4. In the back-end visitor code, the ir_call visitor looks at the name
of the function being called.  If it's uaddCarry, then the back-end
visitor just emits the efficient back-end code.  Any other ir_calls
should have been eliminated by the function inlining.

*We'll need to be careful to make sure that the right thing happens if
the user overrides uaddCarry with their own user-defined function, of
course :)


Now that I've actually thought through it, I'm really excited about this
idea.  It seems way more straightforward than what we are currently
doing (e.g. in lower_packing_builtins.cpp), and it works nicely with the
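other back-ends because if a back-end doesn't advertise an intrinsic
definition of a given function, it automatically gets the version
declared in src/glsl/builtins without having to do any extra work.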

Re: [Mesa-dev] [PATCH 05/15] glsl: Add new {fr, ld}exp built-ins IR and prototypes.

2013-08-23 Thread Paul Berry
On 22 August 2013 16:08, Matt Turner matts...@gmail.com wrote:

 ---
  src/glsl/builtins/ir/frexp.ir                   | 25 +
  src/glsl/builtins/ir/ldexp.ir                   | 25 +
  src/glsl/builtins/profiles/ARB_gpu_shader5.glsl | 10 ++
  3 files changed, 60 insertions(+)
  create mode 100644 src/glsl/builtins/ir/frexp.ir
  create mode 100644 src/glsl/builtins/ir/ldexp.ir

 diff --git a/src/glsl/builtins/ir/frexp.ir b/src/glsl/builtins/ir/frexp.ir
 new file mode 100644
 index 000..a514994
 --- /dev/null
 +++ b/src/glsl/builtins/ir/frexp.ir
 @@ -0,0 +1,25 @@
 +((function frexp
 +   (signature float
 +     (parameters
 +       (declare (in) float x)
 +       (declare (out) int exp))
 +     ((return (expression float frexp (var_ref x) (var_ref exp)))))


Having an ir_expression that writes to one of its parameters is going to
break assumptions in a lot of our optimization passes.  For example, if
opt_tree_grafting encounters this code:

uniform float u;
void main()
{
  int exp;
  float f = frexp(u, exp);   // writes exp through the out parameter
  float g = float(exp)/256.0;
  float h = float(exp) + 1.0;
  gl_FragColor = vec4(f, g, h, g + h);
}

it may try to optimize it to this:

uniform float u;
void main()
{
  int exp;
  float g = float(exp)/256.0;   // reads exp before frexp() has written it
  float h = float(exp) + 1.0;
  gl_FragColor = vec4(frexp(u, exp), g, h, g + h);
}
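
To make the hazard concrete, here is a small Python model (not Mesa code).
The names frexp_with_out, original, and grafted are made up for this
sketch, and the out parameter is modelled as a one-element list. In real
GLSL the early read of exp would be undefined; the model uses 0 so the
difference is visible.

```python
import math

def frexp_with_out(x, exp_out):
    """Model of GLSL frexp(): returns the significand, writes the exponent."""
    significand, exponent = math.frexp(x)
    exp_out[0] = exponent
    return significand

def original(u):
    exp = [0]                       # stands in for 'int exp;'
    f = frexp_with_out(u, exp)      # exp is written here...
    g = exp[0] / 256.0              # ...and read afterwards
    h = exp[0] + 1.0
    return (f, g, h, g + h)

def grafted(u):
    exp = [0]
    g = exp[0] / 256.0              # reads exp before frexp has written it
    h = exp[0] + 1.0
    return (frexp_with_out(u, exp), g, h, g + h)
```

For u = 8.0 the first version yields (0.5, 0.015625, 5.0, 5.015625) and the
grafted one yields (0.5, 0.0, 1.0, 1.0), so moving the call is not a valid
transformation once the expression has a side effect.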

I think what we need to do is either:

1. Punt on the frexp_to_arith lowering pass for now, and instead just put
the lowered code right here, or

2. In patch 7, replace ir_binop_frexp with two unary ops, one that computes
the mantissa (return value of frexp()), and one that computes the integer
exponent.  Then this code will be effectively:

float frexp(float x, out int exp)
{
   exp = ir_unop_frexp_exponent(x);
   return ir_unop_frexp_mantissa(x);
}

 I'm leaning toward option 1, because I suspect it will generate more
efficient code (option 2 is likely to cause the if test in frexp_to_arith
to be duplicated).
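
A sketch of what option 2 amounts to, modelled in Python with math.frexp
standing in for the decomposition. The function names frexp_mantissa and
frexp_exponent are illustrative stand-ins for the two proposed unary ops,
not existing Mesa opcodes. Each op is a pure function of x, which is what
keeps the optimizer's assumptions intact, but note that each one repeats
the full decomposition, which is the duplication concern.

```python
import math

def frexp_mantissa(x):
    """Pure unary op: significand in [0.5, 1), or 0 for x == 0."""
    return math.frexp(x)[0]

def frexp_exponent(x):
    """Pure unary op: integer e such that x == significand * 2**e."""
    return math.frexp(x)[1]

def frexp(x):
    """Built-in expressed via the two ops; returns (significand, exponent)."""
    exponent = frexp_exponent(x)
    return frexp_mantissa(x), exponent
```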


 +
 +   (signature vec2
 +     (parameters
 +       (declare (in) vec2 x)
 +       (declare (out) ivec2 exp))
 +     ((return (expression vec2 frexp (var_ref x) (var_ref exp)))))
 +
 +   (signature vec3
 +     (parameters
 +       (declare (in) vec3 x)
 +       (declare (out) ivec3 exp))
 +     ((return (expression vec3 frexp (var_ref x) (var_ref exp)))))
 +
 +   (signature vec4
 +     (parameters
 +       (declare (in) vec4 x)
 +       (declare (out) ivec4 exp))
 +     ((return (expression vec4 frexp (var_ref x) (var_ref exp)))))
 +))
 diff --git a/src/glsl/builtins/ir/ldexp.ir b/src/glsl/builtins/ir/ldexp.ir
 new file mode 100644
 index 000..dd25f5a
 --- /dev/null
 +++ b/src/glsl/builtins/ir/ldexp.ir
 @@ -0,0 +1,25 @@
 +((function ldexp
 +   (signature float
 +     (parameters
 +       (declare (in) float x)
 +       (declare (in) int exp))
 +     ((return (expression float ldexp (var_ref x) (var_ref exp)))))


Note: ldexp is fine as a binop, since both its parameters are inputs.
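
For comparison, a tiny Python model of ldexp (the wrapper name is just for
illustration): the result depends only on the two inputs and nothing is
written back, so it can be moved or duplicated freely.

```python
import math

def ldexp(x, exp):
    """Pure function of its two inputs: x * 2**exp."""
    return math.ldexp(x, exp)
```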


 +
 +   (signature vec2
 +     (parameters
 +       (declare (in) vec2 x)
 +       (declare (in) ivec2 exp))
 +     ((return (expression vec2 ldexp (var_ref x) (var_ref exp)))))
 +
 +   (signature vec3
 +     (parameters
 +       (declare (in) vec3 x)
 +       (declare (in) ivec3 exp))
 +     ((return (expression vec3 ldexp (var_ref x) (var_ref exp)))))
 +
 +   (signature vec4
 +     (parameters
 +       (declare (in) vec4 x)
 +       (declare (in) ivec4 exp))
 +     ((return (expression vec4 ldexp (var_ref x) (var_ref exp)))))
 +))
 diff --git a/src/glsl/builtins/profiles/ARB_gpu_shader5.glsl
 b/src/glsl/builtins/profiles/ARB_gpu_shader5.glsl
 index 3f76283..36fc0de 100644
 --- a/src/glsl/builtins/profiles/ARB_gpu_shader5.glsl
 +++ b/src/glsl/builtins/profiles/ARB_gpu_shader5.glsl
 @@ -59,3 +59,13 @@ float fma(float a, float b, float c);
  vec2  fma(vec2  a, vec2  b, vec2  c);
  vec3  fma(vec3  a, vec3  b, vec3  c);
  vec4  fma(vec4  a, vec4  b, vec4  c);
 +
 +float frexp(float x, out int   exp);
 +vec2  frexp(vec2  x, out ivec2 exp);
 +vec3  frexp(vec3  x, out ivec3 exp);
 +vec4  frexp(vec4  x, out ivec4 exp);
 +
 +float ldexp(float x, int   exp);
 +vec2  ldexp(vec2  x, ivec2 exp);
 +vec3  ldexp(vec3  x, ivec3 exp);
 +vec4  ldexp(vec4  x, ivec4 exp);
 --
 1.8.3.2



Re: [Mesa-dev] [PATCH 05/15] glsl: Add new {fr, ld}exp built-ins IR and prototypes.

2013-08-23 Thread Matt Turner
On Fri, Aug 23, 2013 at 8:55 AM, Paul Berry stereotype...@gmail.com wrote:
 On 22 August 2013 16:08, Matt Turner matts...@gmail.com wrote:

 ---
  src/glsl/builtins/ir/frexp.ir                   | 25 +
  src/glsl/builtins/ir/ldexp.ir                   | 25 +
  src/glsl/builtins/profiles/ARB_gpu_shader5.glsl | 10 ++
  3 files changed, 60 insertions(+)
  create mode 100644 src/glsl/builtins/ir/frexp.ir
  create mode 100644 src/glsl/builtins/ir/ldexp.ir

 diff --git a/src/glsl/builtins/ir/frexp.ir b/src/glsl/builtins/ir/frexp.ir
 new file mode 100644
 index 000..a514994
 --- /dev/null
 +++ b/src/glsl/builtins/ir/frexp.ir
 @@ -0,0 +1,25 @@
 +((function frexp
 +   (signature float
 +     (parameters
 +       (declare (in) float x)
 +       (declare (out) int exp))
 +     ((return (expression float frexp (var_ref x) (var_ref exp)))))


 Having an ir_expression that writes to one of its parameters is going to
 break assumptions in a lot of our optimization passes.

I'm concerned that that may be a problem we have to solve anyway.

While our hardware doesn't support an frexp instruction (like e.g.,
AMD does) and we could probably do what you suggest, we do have
instructions that correspond directly to the uaddCarry() and
usubBorrow() built-ins in this same extension. They return a value and
also have an out parameter.

genUType uaddCarry(genUType x, genUType y, out genUType carry);
genUType usubBorrow(genUType x, genUType y, out genUType borrow);

We could probably avoid the problem you describe by lowering them, but
it's feeling increasingly distasteful.
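
For reference, lowering these two to plain arithmetic would look roughly
like the following Python model (a sketch, not the Mesa lowering pass).
The function names and the MASK32 constant are made up here, inputs are
assumed to already be in the 32-bit unsigned range, and the out parameter
is modelled as a second return value.

```python
MASK32 = 0xFFFFFFFF  # GLSL uint is 32 bits wide

def uadd_carry(x, y):
    """Return (sum mod 2**32, carry) using a wrapping add and a compare."""
    total = (x + y) & MASK32           # wrapping 32-bit add
    carry = 1 if total < x else 0      # the add wrapped iff the sum shrank
    return total, carry

def usub_borrow(x, y):
    """Return (difference mod 2**32, borrow)."""
    diff = (x - y) & MASK32            # wrapping 32-bit subtract
    borrow = 1 if x < y else 0         # a borrow is needed iff x < y
    return diff, borrow
```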

Your code would make a good piglit test. I'll do some experiments.


Re: [Mesa-dev] [PATCH 05/15] glsl: Add new {fr, ld}exp built-ins IR and prototypes.

2013-08-23 Thread Paul Berry
On 23 August 2013 13:19, Matt Turner matts...@gmail.com wrote:

 On Fri, Aug 23, 2013 at 8:55 AM, Paul Berry stereotype...@gmail.com wrote:
  On 22 August 2013 16:08, Matt Turner matts...@gmail.com wrote:

  ---
   src/glsl/builtins/ir/frexp.ir                   | 25 +
   src/glsl/builtins/ir/ldexp.ir                   | 25 +
   src/glsl/builtins/profiles/ARB_gpu_shader5.glsl | 10 ++
   3 files changed, 60 insertions(+)
   create mode 100644 src/glsl/builtins/ir/frexp.ir
   create mode 100644 src/glsl/builtins/ir/ldexp.ir

  diff --git a/src/glsl/builtins/ir/frexp.ir b/src/glsl/builtins/ir/frexp.ir
  new file mode 100644
  index 000..a514994
  --- /dev/null
  +++ b/src/glsl/builtins/ir/frexp.ir
  @@ -0,0 +1,25 @@
  +((function frexp
  +   (signature float
  +     (parameters
  +       (declare (in) float x)
  +       (declare (out) int exp))
  +     ((return (expression float frexp (var_ref x) (var_ref exp)))))
 
 
  Having an ir_expression that writes to one of its parameters is going to
  break assumptions in a lot of our optimization passes.

 I'm concerned that that may be a problem we have to solve anyway.

 While our hardware doesn't support an frexp instruction (like e.g.,
 AMD does) and we could probably do what you suggest, we do have
 instructions that correspond directly to the uaddCarry() and
 usubBorrow() built-ins in this same extension. They return a value and
 also have an out parameter.

 genUType uaddCarry(genUType x, genUType y, out genUType carry);
 genUType usubBorrow(genUType x, genUType y, out genUType borrow);

 We could probably avoid the problem you describe by lowering them, but
 it's feeling increasingly distasteful.

 Your code would make a good piglit test. I'll do some experiments.


Hmm, interesting.

The way LLVM solves this problem, as I understand it, is through so-called
intrinsic functions (http://llvm.org/docs/LangRef.html#intrinsic-functions).
I wonder if we should start doing that in Mesa.

Briefly, here is what it would look like, using uaddCarry as an example:

1. First we do an inefficient implementation of uaddCarry in terms of
existing GLSL functions, much like you did for frexp in your frexp_to_arith
lowering pass, except that we do it in
src/glsl/builtins/glsl/uaddCarry.glsl, so it's a little easier to review
:).  Optimization passes already deal with function out parameters
properly, and function inlining automatically splices in the proper code
during linking.

2. For back-ends that don't have an efficient native way to do uaddCarry,
we're done.  The uaddCarry function works as is.

3. For back-ends that do have an efficient way to do uaddCarry, we add a
mechanism to allow the back-end to tell the linker: don't inline the
definition of this built-in.  Just leave it as an ir_call because I have my
own special implementation of it*.

4. In the back-end visitor code, the ir_call visitor looks at the name of
the function being called.  If it's uaddCarry, then the back-end visitor
just emits the efficient back-end code.  Any other ir_calls should have
been eliminated by the function inlining.

*We'll need to be careful to make sure that the right thing happens if the
user overrides uaddCarry with their own user-defined function, of course :)
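
The four steps and the footnote can be modelled in a few lines of Python.
This is only a sketch of the decision logic, not Mesa's linker:
BUILTIN_LIBRARY, link_call, and the tuple results are hypothetical names
invented for illustration.

```python
def uadd_carry_generic(x, y):
    """Step 1: generic (slow) body; returns (sum mod 2**32, carry)."""
    total = (x + y) & 0xFFFFFFFF
    return total, int(total < x)

# Hypothetical table of built-ins that have a generic GLSL body.
BUILTIN_LIBRARY = {"uaddCarry": uadd_carry_generic}

def link_call(name, backend_intrinsics, user_functions=()):
    """Decide what the linker does with a call to `name`.

    ("user", name):   the shader defines its own function (the footnote).
    ("call", name):   step 3, left as a call for the back-end to handle.
    ("inline", impl): step 2, the generic body is spliced in.
    """
    if name in user_functions:
        return ("user", name)
    if name in backend_intrinsics:
        return ("call", name)
    return ("inline", BUILTIN_LIBRARY[name])
```

A back-end that advertises nothing gets ("inline", ...) for every built-in,
which matches step 2; step 4 is then the back-end's handling of any
remaining ("call", "uaddCarry") results.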


Now that I've actually thought through it, I'm really excited about this
idea.  It seems way more straightforward than what we are currently doing
(e.g. in lower_packing_builtins.cpp), and it works nicely with the other
back-ends because if a back-end doesn't advertise an intrinsic definition
of a given function, it automatically gets the version declared in
src/glsl/builtins without having to do any extra work.

What do you think?