Re: [Mesa-dev] [PATCH 10/32] i965/fs: Remove logic to keep track of MRF metadata in lower_load_payload().

2015-02-20 Thread Francisco Jerez
Jason Ekstrand ja...@jlekstrand.net writes:

 I'm still a little pensive.  But

 Reviewed-by: Jason Ekstrand jason.ekstr...@intel.com

Thanks.

 Now for a little aside.  I have come to the conclusion that I made a grave
 mistake when I did the LOAD_PAYLOAD stuff.  In retrospect, I should have
 just subclassed fs_inst for load_payload.  The problem is that we need to
 snag a bunch of information for the sources when we create the
 load_payload.  In particular, we need to know the width of the source so
 that we know how much space it consumes in the payload and we need to know
 the information required to properly re-create the mov such as
 force_sechalf and force_writemask_all.  Really, in order to do things
 properly, we need to gather this information *before* we do any
 optimizations.  The nasty pile of code that you're editing together with
 the effective_width parameter is a lame attempt to capture/reconstruct
 this information.  Really, we should just subclass, capture the information
 up-front, and do it properly.

Yes, absolutely, this lowering pass is a real mess.  There are four more
unreviewed patches in the mailing list fixing bugs of the metadata
guessing of lower_load_payload() [1][2][3][4], you may want to take a
look at them.  There are more bugs I'm aware of but it didn't seem
practical to fix them.

That said, I don't think it would be worth subclassing fs_inst.

My suggestion would have been to keep it simple and lower LOAD_PAYLOAD
into a series of MOVs with force_writemask_all enabled in all cases,
simply rely on the optimizer to eliminate those where possible.  Then
get rid of the metadata and effective_width guessing.  Instead agree on
immediates and uniforms being exec_size-wide by convention
(i.e. LOAD_PAYLOAD's exec_size rather than the original instruction's),
then prevent constant propagation from propagating an immediate load
into a LOAD_PAYLOAD if it would lead to a change in the semantics of the
program, and maybe just run constant propagation after lowering so we
can be sure those cases are cleaned up properly where register coalesce
isn't enough.

[1] http://lists.freedesktop.org/archives/mesa-dev/2015-January/074614.html
[2] http://lists.freedesktop.org/archives/mesa-dev/2015-February/076094.html
[3] http://lists.freedesktop.org/archives/mesa-dev/2015-February/076097.html
[4] http://lists.freedesktop.org/archives/mesa-dev/2015-February/076098.html


 --Jason

 On Thu, Feb 19, 2015 at 1:53 PM, Jason Ekstrand ja...@jlekstrand.net
 wrote:



 On Thu, Feb 19, 2015 at 1:25 PM, Francisco Jerez curroje...@riseup.net
 wrote:

 Jason Ekstrand ja...@jlekstrand.net writes:

  On Thu, Feb 19, 2015 at 12:13 PM, Francisco Jerez 
 curroje...@riseup.net
  wrote:
 
  Jason Ekstrand ja...@jlekstrand.net writes:
 
   On Fri, Feb 6, 2015 at 4:01 PM, Francisco Jerez 
 curroje...@riseup.net
   wrote:
  
   Hey Matt,
  
   Matt Turner matts...@gmail.com writes:
  
On Fri, Feb 6, 2015 at 6:42 AM, Francisco Jerez 
  curroje...@riseup.net
   wrote:
MRFs cannot be read from anyway so they cannot possibly be a
 valid
source of LOAD_PAYLOAD.
---
   
The function only seems to test inst-dst.file == MRF. I don't
 see any
code for handling MRF sources. What am I missing?
  
   That test is for handling MRF sources -- More precisely, it's
   collecting the writemask and half flags for MRF writes, which can
 only
   possibly be useful if we're going to use them later on to read
 something
   out of an MRF into a payload, which we shouldn't be doing in the
 first
   place.
  
   Aside from simplifying the function somewhat, that allows us to
 drop the
   16 register gap reserved for MRFs at register offset zero, what will
   allow us to drop the vgrf_to_reg[] offset calculation completely
 (also
   in split_virtual_grfs()) in a future patch (not sent for review
 yet).
  
  
   No, we do read from MRF's sort-of...  Send messages have an implicit
  read
   from an MRF.
 
  Heh, and that's pretty much the only way you read from it.
 
   This was written precicely so that we could use LOAD_PAYLOAD
   to build MRF payloads.  We do on pre-GEN6.
  
  I'm aware, but you don't need any of this meta-data to LOAD_PAYLOAD
  *into* an MRF, and LOAD_PAYLOAD with an MRF as source should be illegal
  anyway.
 
 
  And no one is using it that way.  All of the metadata checks you are
  deleting are checks on the *destination*.
 

 Didn't you write this code yourself?  The only use for the collected
 metadata is initializing the instruction flags of the MOVs subsequent
 LOAD_PAYLOAD instructions are lowered to, based on the metadata already
 collected for its source registers, which can never be MRFs, so the
 metadata you collect from MRF writes is never actually used.


 Right... I misred something initially.  Yes, we should never be tracking
 MRF's as a source of a LOAD_PAYLOAD.  I'll give it a better look a bit
 later, but it looks better.



pgpjBO09kHb1i.pgp
Description: PGP signature

Re: [Mesa-dev] [PATCH 10/32] i965/fs: Remove logic to keep track of MRF metadata in lower_load_payload().

2015-02-20 Thread Jason Ekstrand
On Fri, Feb 20, 2015 at 4:11 AM, Francisco Jerez curroje...@riseup.net
wrote:

 Jason Ekstrand ja...@jlekstrand.net writes:

  I'm still a little pensive.  But
 
  Reviewed-by: Jason Ekstrand jason.ekstr...@intel.com
 
 Thanks.

  Now for a little aside.  I have come to the conclusion that I made a
 grave
  mistake when I did the LOAD_PAYLOAD stuff.  In retrospect, I should have
  just subclassed fs_inst for load_payload.  The problem is that we need to
  snag a bunch of information for the sources when we create the
  load_payload.  In particular, we need to know the width of the source so
  that we know how much space it consumes in the payload and we need to
 know
  the information required to properly re-create the mov such as
  force_sechalf and force_writemask_all.  Really, in order to do things
  properly, we need to gather this information *before* we do any
  optimizations.  The nasty pile of code that you're editing together with
  the effective_width parameter is a lame attempt to capture/reconstruct
  this information.  Really, we should just subclass, capture the
 information
  up-front, and do it properly.
 
 Yes, absolutely, this lowering pass is a real mess.  There are four more
 unreviewed patches in the mailing list fixing bugs of the metadata
 guessing of lower_load_payload() [1][2][3][4], you may want to take a
 look at them.  There are more bugs I'm aware of but it didn't seem
 practical to fix them.


Yeah, Matt pointed me at those.  I'll give them a look later today.


 That said, I don't think it would be worth subclassing fs_inst.

 My suggestion would have been to keep it simple and lower LOAD_PAYLOAD
 into a series of MOVs with force_writemask_all enabled in all cases,
 simply rely on the optimizer to eliminate those where possible.  Then
 get rid of the metadata and effective_width guessing.  Instead agree on
 immediates and uniforms being exec_size-wide by convention
 (i.e. LOAD_PAYLOAD's exec_size rather than the original instruction's),
 then prevent constant propagation from propagating an immediate load
 into a LOAD_PAYLOAD if it would lead to a change in the semantics of the
 program, and maybe just run constant propagation after lowering so we
 can be sure those cases are cleaned up properly where register coalesce
 isn't enough.


First off, simply setting force_writemask_all isn't an option.  Consider
the following scenario:

a = foo;
if (bar) {
   b = 7;
} else {
   use(a);
}
use(b);

If use(a) is the last use of the variable a, then the live ranges of a
and b don't actually over-lap and we can assign a and b to the same
register.  However, if force_writemask_all is set on b, it will destroy the
value in a before its last use.  Right now, we don't actually do this
because our liveness analysis pass flattens everything so it will think
that b and a over-lap even though they don't.  However, if we ever decide
to make the liveness information more accurate, having a pile of
force_writemask_all instructions will be a major problem.  So, while it is
*technically* safe for now, it's a really bad idea in the long-term.

Regarding the other suggestion of just requiring width == exec_size for
immediates and uniforms, that seems pretty reasonable to me.  I'd like to
know what it will do to optimizations, but it seems ok initially.  I'm
still a bigger fan of just subclassing and stashing everything we need to
know up-front.  If we do it right, the only things that will actually need
to know about the subclass are the function for creating a LOAD_PAYLOAD and
lower_load_payloads.

--Jason



 [1]
 http://lists.freedesktop.org/archives/mesa-dev/2015-January/074614.html
 [2]
 http://lists.freedesktop.org/archives/mesa-dev/2015-February/076094.html
 [3]
 http://lists.freedesktop.org/archives/mesa-dev/2015-February/076097.html
 [4]
 http://lists.freedesktop.org/archives/mesa-dev/2015-February/076098.html


  --Jason
 
  On Thu, Feb 19, 2015 at 1:53 PM, Jason Ekstrand ja...@jlekstrand.net
  wrote:
 
 
 
  On Thu, Feb 19, 2015 at 1:25 PM, Francisco Jerez curroje...@riseup.net
 
  wrote:
 
  Jason Ekstrand ja...@jlekstrand.net writes:
 
   On Thu, Feb 19, 2015 at 12:13 PM, Francisco Jerez 
  curroje...@riseup.net
   wrote:
  
   Jason Ekstrand ja...@jlekstrand.net writes:
  
On Fri, Feb 6, 2015 at 4:01 PM, Francisco Jerez 
  curroje...@riseup.net
wrote:
   
Hey Matt,
   
Matt Turner matts...@gmail.com writes:
   
 On Fri, Feb 6, 2015 at 6:42 AM, Francisco Jerez 
   curroje...@riseup.net
wrote:
 MRFs cannot be read from anyway so they cannot possibly be a
  valid
 source of LOAD_PAYLOAD.
 ---

 The function only seems to test inst-dst.file == MRF. I don't
  see any
 code for handling MRF sources. What am I missing?
   
That test is for handling MRF sources -- More precisely, it's
collecting the writemask and half flags for MRF writes, which can
  only
possibly be useful if we're going to use them later on to read
  something

Re: [Mesa-dev] [PATCH 10/32] i965/fs: Remove logic to keep track of MRF metadata in lower_load_payload().

2015-02-20 Thread Francisco Jerez
Jason Ekstrand ja...@jlekstrand.net writes:

 On Fri, Feb 20, 2015 at 4:11 AM, Francisco Jerez curroje...@riseup.net
 wrote:

 Jason Ekstrand ja...@jlekstrand.net writes:

  I'm still a little pensive.  But
 
  Reviewed-by: Jason Ekstrand jason.ekstr...@intel.com
 
 Thanks.

  Now for a little aside.  I have come to the conclusion that I made a
 grave
  mistake when I did the LOAD_PAYLOAD stuff.  In retrospect, I should have
  just subclassed fs_inst for load_payload.  The problem is that we need to
  snag a bunch of information for the sources when we create the
  load_payload.  In particular, we need to know the width of the source so
  that we know how much space it consumes in the payload and we need to
 know
  the information required to properly re-create the mov such as
  force_sechalf and force_writemask_all.  Really, in order to do things
  properly, we need to gather this information *before* we do any
  optimizations.  The nasty pile of code that you're editing together with
  the effective_width parameter is a lame attempt to capture/reconstruct
  this information.  Really, we should just subclass, capture the
 information
  up-front, and do it properly.
 
 Yes, absolutely, this lowering pass is a real mess.  There are four more
 unreviewed patches in the mailing list fixing bugs of the metadata
 guessing of lower_load_payload() [1][2][3][4], you may want to take a
 look at them.  There are more bugs I'm aware of but it didn't seem
 practical to fix them.


 Yeah, Matt pointed me at those.  I'll give them a look later today.


 That said, I don't think it would be worth subclassing fs_inst.

 My suggestion would have been to keep it simple and lower LOAD_PAYLOAD
 into a series of MOVs with force_writemask_all enabled in all cases,
 simply rely on the optimizer to eliminate those where possible.  Then
 get rid of the metadata and effective_width guessing.  Instead agree on
 immediates and uniforms being exec_size-wide by convention
 (i.e. LOAD_PAYLOAD's exec_size rather than the original instruction's),
 then prevent constant propagation from propagating an immediate load
 into a LOAD_PAYLOAD if it would lead to a change in the semantics of the
 program, and maybe just run constant propagation after lowering so we
 can be sure those cases are cleaned up properly where register coalesce
 isn't enough.


 First off, simply setting force_writemask_all isn't an option.  Consider
 the following scenario:

 a = foo;
 if (bar) {
b = 7;
 } else {
use(a);
 }
 use(b);

 If use(a) is the last use of the variable a, then the live ranges of a
 and b don't actually over-lap and we can assign a and b to the same
 register.  However, if force_writemask_all is set on b, it will destroy the
 value in a before its last use.  Right now, we don't actually do this
 because our liveness analysis pass flattens everything so it will think
 that b and a over-lap even though they don't.  However, if we ever decide
 to make the liveness information more accurate, having a pile of
 force_writemask_all instructions will be a major problem.  So, while it is
 *technically* safe for now, it's a really bad idea in the long-term.

The thing is we *will* have to deal with that scenario.  Building a
message payload inherently involves crossing channel boundaries (because
of headers, unsupported SIMD modes by some shared functions, etc.).  I'd
say it's a bug in the liveness analysis pass if it wouldn't consider the
liveness interval of a and b to overlap in your example if the
assignment of b had force_writemask_all set.

A reasonable compromise might be to leave it up to the caller whether to
set the force_writemask_all and force_sechalf flags or not.  I.e. just
copy the same flags set on the LOAD_PAYLOAD instruction to the MOV
instructions.  That would still allow reducing the liveness intervals in
cases where we know that the payload respects channel boundaries.

Tracking the flag information per-register in cases where the same
payload has well- and ill-behaved values seems rather premature and
rather useless to me because the optimizer is likely to be able to get
rid of redundant copies with force_writemask_all -- register coalesce
is doing this already AFAIK, maybe by accident.

 Regarding the other suggestion of just requiring width == exec_size for
 immediates and uniforms, that seems pretty reasonable to me.  I'd like to
 know what it will do to optimizations, but it seems ok initially.  I'm
 still a bigger fan of just subclassing and stashing everything we need to
 know up-front.  If we do it right, the only things that will actually need
 to know about the subclass are the function for creating a LOAD_PAYLOAD and
 lower_load_payloads.

 --Jason



 [1]
 http://lists.freedesktop.org/archives/mesa-dev/2015-January/074614.html
 [2]
 http://lists.freedesktop.org/archives/mesa-dev/2015-February/076094.html
 [3]
 http://lists.freedesktop.org/archives/mesa-dev/2015-February/076097.html
 [4]
 

Re: [Mesa-dev] [PATCH 10/32] i965/fs: Remove logic to keep track of MRF metadata in lower_load_payload().

2015-02-20 Thread Jason Ekstrand
On Fri, Feb 20, 2015 at 1:09 PM, Francisco Jerez curroje...@riseup.net
wrote:

 Jason Ekstrand ja...@jlekstrand.net writes:

  On Fri, Feb 20, 2015 at 4:11 AM, Francisco Jerez curroje...@riseup.net
  wrote:
 
  Jason Ekstrand ja...@jlekstrand.net writes:
 
   I'm still a little pensive.  But
  
   Reviewed-by: Jason Ekstrand jason.ekstr...@intel.com
  
  Thanks.
 
   Now for a little aside.  I have come to the conclusion that I made a
  grave
   mistake when I did the LOAD_PAYLOAD stuff.  In retrospect, I should
 have
   just subclassed fs_inst for load_payload.  The problem is that we
 need to
   snag a bunch of information for the sources when we create the
   load_payload.  In particular, we need to know the width of the source
 so
   that we know how much space it consumes in the payload and we need to
  know
   the information required to properly re-create the mov such as
   force_sechalf and force_writemask_all.  Really, in order to do things
   properly, we need to gather this information *before* we do any
   optimizations.  The nasty pile of code that you're editing together
 with
   the effective_width parameter is a lame attempt to
 capture/reconstruct
   this information.  Really, we should just subclass, capture the
  information
   up-front, and do it properly.
  
  Yes, absolutely, this lowering pass is a real mess.  There are four more
  unreviewed patches in the mailing list fixing bugs of the metadata
  guessing of lower_load_payload() [1][2][3][4], you may want to take a
  look at them.  There are more bugs I'm aware of but it didn't seem
  practical to fix them.
 
 
  Yeah, Matt pointed me at those.  I'll give them a look later today.
 
 
  That said, I don't think it would be worth subclassing fs_inst.
 
  My suggestion would have been to keep it simple and lower LOAD_PAYLOAD
  into a series of MOVs with force_writemask_all enabled in all cases,
  simply rely on the optimizer to eliminate those where possible.  Then
  get rid of the metadata and effective_width guessing.  Instead agree on
  immediates and uniforms being exec_size-wide by convention
  (i.e. LOAD_PAYLOAD's exec_size rather than the original instruction's),
  then prevent constant propagation from propagating an immediate load
  into a LOAD_PAYLOAD if it would lead to a change in the semantics of the
  program, and maybe just run constant propagation after lowering so we
  can be sure those cases are cleaned up properly where register coalesce
  isn't enough.
 
 
  First off, simply setting force_writemask_all isn't an option.  Consider
  the following scenario:
 
  a = foo;
  if (bar) {
 b = 7;
  } else {
 use(a);
  }
  use(b);
 
  If use(a) is the last use of the variable a, then the live ranges of a
  and b don't actually over-lap and we can assign a and b to the same
  register.  However, if force_writemask_all is set on b, it will destroy
 the
  value in a before its last use.  Right now, we don't actually do this
  because our liveness analysis pass flattens everything so it will think
  that b and a over-lap even though they don't.  However, if we ever decide
  to make the liveness information more accurate, having a pile of
  force_writemask_all instructions will be a major problem.  So, while it
 is
  *technically* safe for now, it's a really bad idea in the long-term.
 
 The thing is we *will* have to deal with that scenario.  Building a
 message payload inherently involves crossing channel boundaries (because
 of headers, unsupported SIMD modes by some shared functions, etc.).  I'd
 say it's a bug in the liveness analysis pass if it wouldn't consider the
 liveness interval of a and b to overlap in your example if the
 assignment of b had force_writemask_all set.


Yes, I'm aware of that.  However, the part of that register that has to
squash everything is usually only a couple of registers while the entire
payload may be up to 13 (if I remmeber correctly).  We'd rather not have to
squash everything if we can.  All that being said, our liveness/register
allocation can't handle this and making register allocation handle parts of
registers interfering but not other bits is going to be a real pain.  Maybe
not even worth it.  If you'd rather force_writemask_all everything, I'll
sign off on that for now.  I just wanted to point out that it may not be a
good permanent solution.


 A reasonable compromise might be to leave it up to the caller whether to
 set the force_writemask_all and force_sechalf flags or not.  I.e. just
 copy the same flags set on the LOAD_PAYLOAD instruction to the MOV
 instructions.  That would still allow reducing the liveness intervals in
 cases where we know that the payload respects channel boundaries.

 Tracking the flag information per-register in cases where the same
 payload has well- and ill-behaved values seems rather premature and
 rather useless to me because the optimizer is likely to be able to get
 rid of redundant copies with force_writemask_all -- register coalesce
 is 

Re: [Mesa-dev] [PATCH 10/32] i965/fs: Remove logic to keep track of MRF metadata in lower_load_payload().

2015-02-20 Thread Jason Ekstrand
One more comment here...  This particularly regards your plan of separating
it into things that match the destination and other things and not copy
prop uniforms or immediates into the other things.  There is another case
we need to handle.  On older gens (SNB maybe?) the SIMD16 FB write messages
are in the order (r0, g0, b0, a0, r1, g1, b1, a1) instead of the usual (r0,
r1, g0, g1, b0, b1, a0, a1).  This isn't going to work if we expect SIMD16
load_payload instructions to be lowerable to a bunch of SIMD16 MOVs.  On
those platforms, there is a special type of MOV-to-MRF that can move from
(gN, gN+1) to (mK, mK+4) and we handle that in lower_load_payload right
now.  However, it doesn't follow the standard rules.
--Jason

On Fri, Feb 20, 2015 at 2:10 PM, Jason Ekstrand ja...@jlekstrand.net
wrote:



 On Fri, Feb 20, 2015 at 1:09 PM, Francisco Jerez curroje...@riseup.net
 wrote:

 Jason Ekstrand ja...@jlekstrand.net writes:

  On Fri, Feb 20, 2015 at 4:11 AM, Francisco Jerez curroje...@riseup.net
 
  wrote:
 
  Jason Ekstrand ja...@jlekstrand.net writes:
 
   I'm still a little pensive.  But
  
   Reviewed-by: Jason Ekstrand jason.ekstr...@intel.com
  
  Thanks.
 
   Now for a little aside.  I have come to the conclusion that I made a
  grave
   mistake when I did the LOAD_PAYLOAD stuff.  In retrospect, I should
 have
   just subclassed fs_inst for load_payload.  The problem is that we
 need to
   snag a bunch of information for the sources when we create the
   load_payload.  In particular, we need to know the width of the
 source so
   that we know how much space it consumes in the payload and we need to
  know
   the information required to properly re-create the mov such as
   force_sechalf and force_writemask_all.  Really, in order to do things
   properly, we need to gather this information *before* we do any
   optimizations.  The nasty pile of code that you're editing together
 with
   the effective_width parameter is a lame attempt to
 capture/reconstruct
   this information.  Really, we should just subclass, capture the
  information
   up-front, and do it properly.
  
  Yes, absolutely, this lowering pass is a real mess.  There are four
 more
  unreviewed patches in the mailing list fixing bugs of the metadata
  guessing of lower_load_payload() [1][2][3][4], you may want to take a
  look at them.  There are more bugs I'm aware of but it didn't seem
  practical to fix them.
 
 
  Yeah, Matt pointed me at those.  I'll give them a look later today.
 
 
  That said, I don't think it would be worth subclassing fs_inst.
 
  My suggestion would have been to keep it simple and lower LOAD_PAYLOAD
  into a series of MOVs with force_writemask_all enabled in all cases,
  simply rely on the optimizer to eliminate those where possible.  Then
  get rid of the metadata and effective_width guessing.  Instead agree on
  immediates and uniforms being exec_size-wide by convention
  (i.e. LOAD_PAYLOAD's exec_size rather than the original instruction's),
  then prevent constant propagation from propagating an immediate load
  into a LOAD_PAYLOAD if it would lead to a change in the semantics of
 the
  program, and maybe just run constant propagation after lowering so we
  can be sure those cases are cleaned up properly where register coalesce
  isn't enough.
 
 
  First off, simply setting force_writemask_all isn't an option.  Consider
  the following scenario:
 
  a = foo;
  if (bar) {
 b = 7;
  } else {
 use(a);
  }
  use(b);
 
  If use(a) is the last use of the variable a, then the live ranges of a
  and b don't actually over-lap and we can assign a and b to the same
  register.  However, if force_writemask_all is set on b, it will destroy
 the
  value in a before its last use.  Right now, we don't actually do this
  because our liveness analysis pass flattens everything so it will think
  that b and a over-lap even though they don't.  However, if we ever
 decide
  to make the liveness information more accurate, having a pile of
  force_writemask_all instructions will be a major problem.  So, while it
 is
  *technically* safe for now, it's a really bad idea in the long-term.
 
 The thing is we *will* have to deal with that scenario.  Building a
 message payload inherently involves crossing channel boundaries (because
 of headers, unsupported SIMD modes by some shared functions, etc.).  I'd
 say it's a bug in the liveness analysis pass if it wouldn't consider the
 liveness interval of a and b to overlap in your example if the
 assignment of b had force_writemask_all set.


 Yes, I'm aware of that.  However, the part of that register that has to
 squash everything is usually only a couple of registers while the entire
 payload may be up to 13 (if I remmeber correctly).  We'd rather not have to
 squash everything if we can.  All that being said, our liveness/register
 allocation can't handle this and making register allocation handle parts of
 registers interfering but not other bits is going to be a real pain.  

Re: [Mesa-dev] [PATCH 10/32] i965/fs: Remove logic to keep track of MRF metadata in lower_load_payload().

2015-02-20 Thread Francisco Jerez
Jason Ekstrand ja...@jlekstrand.net writes:

 On Fri, Feb 20, 2015 at 1:09 PM, Francisco Jerez curroje...@riseup.net
 wrote:

 Jason Ekstrand ja...@jlekstrand.net writes:

  On Fri, Feb 20, 2015 at 4:11 AM, Francisco Jerez curroje...@riseup.net
  wrote:
 
  Jason Ekstrand ja...@jlekstrand.net writes:
 
   I'm still a little pensive.  But
  
   Reviewed-by: Jason Ekstrand jason.ekstr...@intel.com
  
  Thanks.
 
   Now for a little aside.  I have come to the conclusion that I made a
  grave
   mistake when I did the LOAD_PAYLOAD stuff.  In retrospect, I should
 have
   just subclassed fs_inst for load_payload.  The problem is that we
 need to
   snag a bunch of information for the sources when we create the
   load_payload.  In particular, we need to know the width of the source
 so
   that we know how much space it consumes in the payload and we need to
  know
   the information required to properly re-create the mov such as
   force_sechalf and force_writemask_all.  Really, in order to do things
   properly, we need to gather this information *before* we do any
   optimizations.  The nasty pile of code that you're editing together
 with
   the effective_width parameter is a lame attempt to
 capture/reconstruct
   this information.  Really, we should just subclass, capture the
  information
   up-front, and do it properly.
  
  Yes, absolutely, this lowering pass is a real mess.  There are four more
  unreviewed patches in the mailing list fixing bugs of the metadata
  guessing of lower_load_payload() [1][2][3][4], you may want to take a
  look at them.  There are more bugs I'm aware of but it didn't seem
  practical to fix them.
 
 
  Yeah, Matt pointed me at those.  I'll give them a look later today.
 
 
  That said, I don't think it would be worth subclassing fs_inst.
 
  My suggestion would have been to keep it simple and lower LOAD_PAYLOAD
  into a series of MOVs with force_writemask_all enabled in all cases,
  simply rely on the optimizer to eliminate those where possible.  Then
  get rid of the metadata and effective_width guessing.  Instead agree on
  immediates and uniforms being exec_size-wide by convention
  (i.e. LOAD_PAYLOAD's exec_size rather than the original instruction's),
  then prevent constant propagation from propagating an immediate load
  into a LOAD_PAYLOAD if it would lead to a change in the semantics of the
  program, and maybe just run constant propagation after lowering so we
  can be sure those cases are cleaned up properly where register coalesce
  isn't enough.
 
 
  First off, simply setting force_writemask_all isn't an option.  Consider
  the following scenario:
 
  a = foo;
  if (bar) {
 b = 7;
  } else {
 use(a);
  }
  use(b);
 
  If use(a) is the last use of the variable a, then the live ranges of a
  and b don't actually over-lap and we can assign a and b to the same
  register.  However, if force_writemask_all is set on b, it will destroy
 the
  value in a before its last use.  Right now, we don't actually do this
  because our liveness analysis pass flattens everything so it will think
  that b and a over-lap even though they don't.  However, if we ever decide
  to make the liveness information more accurate, having a pile of
  force_writemask_all instructions will be a major problem.  So, while it
 is
  *technically* safe for now, it's a really bad idea in the long-term.
 
 The thing is we *will* have to deal with that scenario.  Building a
 message payload inherently involves crossing channel boundaries (because
 of headers, unsupported SIMD modes by some shared functions, etc.).  I'd
 say it's a bug in the liveness analysis pass if it wouldn't consider the
 liveness interval of a and b to overlap in your example if the
 assignment of b had force_writemask_all set.


 Yes, I'm aware of that.  However, the part of that register that has to
 squash everything is usually only a couple of registers while the entire
 payload may be up to 13 (if I remmeber correctly).  We'd rather not have to
 squash everything if we can.  All that being said, our liveness/register
 allocation can't handle this and making register allocation handle parts of
 registers interfering but not other bits is going to be a real pain.  Maybe
 not even worth it.  If you'd rather force_writemask_all everything, I'll
 sign off on that for now.  I just wanted to point out that it may not be a
 good permanent solution.


 A reasonable compromise might be to leave it up to the caller whether to
 set the force_writemask_all and force_sechalf flags or not.  I.e. just
 copy the same flags set on the LOAD_PAYLOAD instruction to the MOV
 instructions.  That would still allow reducing the liveness intervals in
 cases where we know that the payload respects channel boundaries.

 Tracking the flag information per-register in cases where the same
 payload has well- and ill-behaved values seems rather premature and
 rather useless to me because the optimizer is likely to be able to get
 rid of redundant 

Re: [Mesa-dev] [PATCH 10/32] i965/fs: Remove logic to keep track of MRF metadata in lower_load_payload().

2015-02-20 Thread Francisco Jerez
Jason Ekstrand ja...@jlekstrand.net writes:

 One more comment here...  This particularly regards your plan of separating
 it into things that match the destination and other things and not copy
 prop uniforms or immediates into the other things.  There is another case
 we need to handle.  On older gens (SNB maybe?) the SIMD16 FB write messages
 are in the order (r0, g0, b0, a0, r1, g1, b1, a1) instead of the usual (r0,
 r1, g0, g1, b0, b1, a0, a1).  This isn't going to work if we expect SIMD16
 load_payload instructions to be lowerable to a bunch of SIMD16 MOVs.  On
 those platforms, there is a special type of MOV-to-MRF that can move from
 (gN, gN+1) to (mK, mK+4) and we handle that in lower_load_payload right
 now.  However, it doesn't follow the standard rules.
 --Jason

I don't see why it couldn't be handled in roughly the same way you do it
now?  Recognize when src[i + 4] is the same 8-wide register as src[i]
shifted by 8 and emit a COMPR4 copy in that case?

 On Fri, Feb 20, 2015 at 2:10 PM, Jason Ekstrand ja...@jlekstrand.net
 wrote:



 On Fri, Feb 20, 2015 at 1:09 PM, Francisco Jerez curroje...@riseup.net
 wrote:

 Jason Ekstrand ja...@jlekstrand.net writes:

  On Fri, Feb 20, 2015 at 4:11 AM, Francisco Jerez curroje...@riseup.net
 
  wrote:
 
  Jason Ekstrand ja...@jlekstrand.net writes:
 
   I'm still a little pensive.  But
  
   Reviewed-by: Jason Ekstrand jason.ekstr...@intel.com
  
  Thanks.
 
   Now for a little aside.  I have come to the conclusion that I made a
  grave
   mistake when I did the LOAD_PAYLOAD stuff.  In retrospect, I should
 have
   just subclassed fs_inst for load_payload.  The problem is that we
 need to
   snag a bunch of information for the sources when we create the
   load_payload.  In particular, we need to know the width of the
 source so
   that we know how much space it consumes in the payload and we need to
  know
   the information required to properly re-create the mov such as
   force_sechalf and force_writemask_all.  Really, in order to do things
   properly, we need to gather this information *before* we do any
   optimizations.  The nasty pile of code that you're editing together
 with
   the effective_width parameter is a lame attempt to
 capture/reconstruct
   this information.  Really, we should just subclass, capture the
  information
   up-front, and do it properly.
  
  Yes, absolutely, this lowering pass is a real mess.  There are four
 more
  unreviewed patches in the mailing list fixing bugs of the metadata
  guessing of lower_load_payload() [1][2][3][4], you may want to take a
  look at them.  There are more bugs I'm aware of but it didn't seem
  practical to fix them.
 
 
  Yeah, Matt pointed me at those.  I'll give them a look later today.
 
 
  That said, I don't think it would be worth subclassing fs_inst.
 
  My suggestion would have been to keep it simple and lower LOAD_PAYLOAD
  into a series of MOVs with force_writemask_all enabled in all cases,
  simply rely on the optimizer to eliminate those where possible.  Then
  get rid of the metadata and effective_width guessing.  Instead agree on
  immediates and uniforms being exec_size-wide by convention
  (i.e. LOAD_PAYLOAD's exec_size rather than the original instruction's),
  then prevent constant propagation from propagating an immediate load
  into a LOAD_PAYLOAD if it would lead to a change in the semantics of
 the
  program, and maybe just run constant propagation after lowering so we
  can be sure those cases are cleaned up properly where register coalesce
  isn't enough.
 
 
  First off, simply setting force_writemask_all isn't an option.  Consider
  the following scenario:
 
  a = foo;
  if (bar) {
 b = 7;
  } else {
 use(a);
  }
  use(b);
 
  If use(a) is the last use of the variable a, then the live ranges of a
  and b don't actually over-lap and we can assign a and b to the same
  register.  However, if force_writemask_all is set on b, it will destroy
 the
  value in a before its last use.  Right now, we don't actually do this
  because our liveness analysis pass flattens everything so it will think
  that b and a over-lap even though they don't.  However, if we ever
 decide
  to make the liveness information more accurate, having a pile of
  force_writemask_all instructions will be a major problem.  So, while it
 is
  *technically* safe for now, it's a really bad idea in the long-term.
 
 The thing is we *will* have to deal with that scenario.  Building a
 message payload inherently involves crossing channel boundaries (because
 of headers, unsupported SIMD modes by some shared functions, etc.).  I'd
 say it's a bug in the liveness analysis pass if it wouldn't consider the
 liveness interval of a and b to overlap in your example if the
 assignment of b had force_writemask_all set.


 Yes, I'm aware of that.  However, the part of that register that has to
 squash everything is usually only a couple of registers while the entire
 payload may be up to 13 (if I remmeber 

Re: [Mesa-dev] [PATCH 10/32] i965/fs: Remove logic to keep track of MRF metadata in lower_load_payload().

2015-02-20 Thread Jason Ekstrand
On Fri, Feb 20, 2015 at 2:43 PM, Francisco Jerez curroje...@riseup.net
wrote:

 Jason Ekstrand ja...@jlekstrand.net writes:

  One more comment here...  This particularly regards your plan of
 separating
  it into things that match the destination and other things and not
 copy
  prop uniforms or immediates into the other things.  There is another
 case
  we need to handle.  On older gens (SNB maybe?) the SIMD16 FB write
 messages
  are in the order (r0, g0, b0, a0, r1, g1, b1, a1) instead of the usual
 (r0,
  r1, g0, g1, b0, b1, a0, a1).  This isn't going to work if we expect
 SIMD16
  load_payload instructions to be lowerable to a bunch of SIMD16 MOVs.  On
  those platforms, there is a special type of MOV-to-MRF that can move from
  (gN, gN+1) to (mK, mK+4) and we handle that in lower_load_payload right
  now.  However, it doesn't follow the standard rules.
  --Jason
 
 I don't see why it couldn't be handled in roughly the same way you do it
 now?  Recognize when src[i + 4] is the same 8-wide register as src[i]
 shifted by 8 and emit a COMPR4 copy in that case?


Sure.  I haven't thought about it hard enough to prove it can't be done.
Just wanted to make sure you were aware of that particular monkey-wrench.
--Jason


  On Fri, Feb 20, 2015 at 2:10 PM, Jason Ekstrand ja...@jlekstrand.net
  wrote:
 
 
 
  On Fri, Feb 20, 2015 at 1:09 PM, Francisco Jerez curroje...@riseup.net
 
  wrote:
 
  Jason Ekstrand ja...@jlekstrand.net writes:
 
   On Fri, Feb 20, 2015 at 4:11 AM, Francisco Jerez 
 curroje...@riseup.net
  
   wrote:
  
   Jason Ekstrand ja...@jlekstrand.net writes:
  
I'm still a little pensive.  But
   
Reviewed-by: Jason Ekstrand jason.ekstr...@intel.com
   
   Thanks.
  
Now for a little aside.  I have come to the conclusion that I
 made a
   grave
mistake when I did the LOAD_PAYLOAD stuff.  In retrospect, I
 should
  have
just subclassed fs_inst for load_payload.  The problem is that we
  need to
snag a bunch of information for the sources when we create the
load_payload.  In particular, we need to know the width of the
  source so
that we know how much space it consumes in the payload and we
 need to
   know
the information required to properly re-create the mov such as
force_sechalf and force_writemask_all.  Really, in order to do
 things
properly, we need to gather this information *before* we do any
optimizations.  The nasty pile of code that you're editing
 together
  with
the effective_width parameter is a lame attempt to
  capture/reconstruct
this information.  Really, we should just subclass, capture the
   information
up-front, and do it properly.
   
   Yes, absolutely, this lowering pass is a real mess.  There are four
  more
   unreviewed patches in the mailing list fixing bugs of the metadata
   guessing of lower_load_payload() [1][2][3][4], you may want to take
 a
   look at them.  There are more bugs I'm aware of but it didn't seem
   practical to fix them.
  
  
   Yeah, Matt pointed me at those.  I'll give them a look later today.
  
  
   That said, I don't think it would be worth subclassing fs_inst.
  
   My suggestion would have been to keep it simple and lower
 LOAD_PAYLOAD
   into a series of MOVs with force_writemask_all enabled in all cases,
   simply rely on the optimizer to eliminate those where possible.
 Then
   get rid of the metadata and effective_width guessing.  Instead
 agree on
   immediates and uniforms being exec_size-wide by convention
   (i.e. LOAD_PAYLOAD's exec_size rather than the original
 instruction's),
   then prevent constant propagation from propagating an immediate load
   into a LOAD_PAYLOAD if it would lead to a change in the semantics of
  the
   program, and maybe just run constant propagation after lowering so
 we
   can be sure those cases are cleaned up properly where register
 coalesce
   isn't enough.
  
  
   First off, simply setting force_writemask_all isn't an option.
 Consider
   the following scenario:
  
   a = foo;
   if (bar) {
  b = 7;
   } else {
  use(a);
   }
   use(b);
  
   If use(a) is the last use of the variable a, then the live ranges
 of a
   and b don't actually over-lap and we can assign a and b to the same
   register.  However, if force_writemask_all is set on b, it will
 destroy
  the
   value in a before its last use.  Right now, we don't actually do this
   because our liveness analysis pass flattens everything so it will
 think
   that b and a over-lap even though they don't.  However, if we ever
  decide
   to make the liveness information more accurate, having a pile of
   force_writemask_all instructions will be a major problem.  So, while
 it
  is
   *technically* safe for now, it's a really bad idea in the long-term.
  
  The thing is we *will* have to deal with that scenario.  Building a
  message payload inherently involves crossing channel boundaries
 (because
  of headers, unsupported SIMD modes by some shared functions, etc.).
 I'd
  

Re: [Mesa-dev] [PATCH 10/32] i965/fs: Remove logic to keep track of MRF metadata in lower_load_payload().

2015-02-19 Thread Jason Ekstrand
On Fri, Feb 6, 2015 at 4:01 PM, Francisco Jerez curroje...@riseup.net
wrote:

 Hey Matt,

 Matt Turner matts...@gmail.com writes:

  On Fri, Feb 6, 2015 at 6:42 AM, Francisco Jerez curroje...@riseup.net
 wrote:
  MRFs cannot be read from anyway so they cannot possibly be a valid
  source of LOAD_PAYLOAD.
  ---
 
  The function only seems to test inst-dst.file == MRF. I don't see any
  code for handling MRF sources. What am I missing?

 That test is for handling MRF sources -- More precisely, it's
 collecting the writemask and half flags for MRF writes, which can only
 possibly be useful if we're going to use them later on to read something
 out of an MRF into a payload, which we shouldn't be doing in the first
 place.

 Aside from simplifying the function somewhat, that allows us to drop the
 16 register gap reserved for MRFs at register offset zero, what will
 allow us to drop the vgrf_to_reg[] offset calculation completely (also
 in split_virtual_grfs()) in a future patch (not sent for review yet).


No, we do read from MRF's sort-of...  Send messages have an implicit read
from an MRF.  This was written precicely so that we could use LOAD_PAYLOAD
to build MRF payloads.  We do on pre-GEN6.


 ___
 mesa-dev mailing list
 mesa-dev@lists.freedesktop.org
 http://lists.freedesktop.org/mailman/listinfo/mesa-dev


___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 10/32] i965/fs: Remove logic to keep track of MRF metadata in lower_load_payload().

2015-02-19 Thread Francisco Jerez
Jason Ekstrand ja...@jlekstrand.net writes:

 On Fri, Feb 6, 2015 at 4:01 PM, Francisco Jerez curroje...@riseup.net
 wrote:

 Hey Matt,

 Matt Turner matts...@gmail.com writes:

  On Fri, Feb 6, 2015 at 6:42 AM, Francisco Jerez curroje...@riseup.net
 wrote:
  MRFs cannot be read from anyway so they cannot possibly be a valid
  source of LOAD_PAYLOAD.
  ---
 
  The function only seems to test inst-dst.file == MRF. I don't see any
  code for handling MRF sources. What am I missing?

 That test is for handling MRF sources -- More precisely, it's
 collecting the writemask and half flags for MRF writes, which can only
 possibly be useful if we're going to use them later on to read something
 out of an MRF into a payload, which we shouldn't be doing in the first
 place.

 Aside from simplifying the function somewhat, that allows us to drop the
 16 register gap reserved for MRFs at register offset zero, what will
 allow us to drop the vgrf_to_reg[] offset calculation completely (also
 in split_virtual_grfs()) in a future patch (not sent for review yet).


 No, we do read from MRF's sort-of...  Send messages have an implicit read
 from an MRF.

Heh, and that's pretty much the only way you read from it.

 This was written precicely so that we could use LOAD_PAYLOAD
 to build MRF payloads.  We do on pre-GEN6.

I'm aware, but you don't need any of this meta-data to LOAD_PAYLOAD
*into* an MRF, and LOAD_PAYLOAD with an MRF as source should be illegal
anyway.


 ___
 mesa-dev mailing list
 mesa-dev@lists.freedesktop.org
 http://lists.freedesktop.org/mailman/listinfo/mesa-dev




pgpfF5PKyva7v.pgp
Description: PGP signature
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 10/32] i965/fs: Remove logic to keep track of MRF metadata in lower_load_payload().

2015-02-19 Thread Jason Ekstrand
On Thu, Feb 19, 2015 at 12:13 PM, Francisco Jerez curroje...@riseup.net
wrote:

 Jason Ekstrand ja...@jlekstrand.net writes:

  On Fri, Feb 6, 2015 at 4:01 PM, Francisco Jerez curroje...@riseup.net
  wrote:
 
  Hey Matt,
 
  Matt Turner matts...@gmail.com writes:
 
   On Fri, Feb 6, 2015 at 6:42 AM, Francisco Jerez 
 curroje...@riseup.net
  wrote:
   MRFs cannot be read from anyway so they cannot possibly be a valid
   source of LOAD_PAYLOAD.
   ---
  
   The function only seems to test inst-dst.file == MRF. I don't see any
   code for handling MRF sources. What am I missing?
 
  That test is for handling MRF sources -- More precisely, it's
  collecting the writemask and half flags for MRF writes, which can only
  possibly be useful if we're going to use them later on to read something
  out of an MRF into a payload, which we shouldn't be doing in the first
  place.
 
  Aside from simplifying the function somewhat, that allows us to drop the
  16 register gap reserved for MRFs at register offset zero, what will
  allow us to drop the vgrf_to_reg[] offset calculation completely (also
  in split_virtual_grfs()) in a future patch (not sent for review yet).
 
 
  No, we do read from MRF's sort-of...  Send messages have an implicit
 read
  from an MRF.

 Heh, and that's pretty much the only way you read from it.

  This was written precicely so that we could use LOAD_PAYLOAD
  to build MRF payloads.  We do on pre-GEN6.
 
 I'm aware, but you don't need any of this meta-data to LOAD_PAYLOAD
 *into* an MRF, and LOAD_PAYLOAD with an MRF as source should be illegal
 anyway.


And no one is using it that way.  All of the metadata checks you are
deleting are checks on the *destination*.


 
  ___
  mesa-dev mailing list
  mesa-dev@lists.freedesktop.org
  http://lists.freedesktop.org/mailman/listinfo/mesa-dev
 
 

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 10/32] i965/fs: Remove logic to keep track of MRF metadata in lower_load_payload().

2015-02-19 Thread Francisco Jerez
Jason Ekstrand ja...@jlekstrand.net writes:

 On Thu, Feb 19, 2015 at 12:13 PM, Francisco Jerez curroje...@riseup.net
 wrote:

 Jason Ekstrand ja...@jlekstrand.net writes:

  On Fri, Feb 6, 2015 at 4:01 PM, Francisco Jerez curroje...@riseup.net
  wrote:
 
  Hey Matt,
 
  Matt Turner matts...@gmail.com writes:
 
   On Fri, Feb 6, 2015 at 6:42 AM, Francisco Jerez 
 curroje...@riseup.net
  wrote:
   MRFs cannot be read from anyway so they cannot possibly be a valid
   source of LOAD_PAYLOAD.
   ---
  
   The function only seems to test inst-dst.file == MRF. I don't see any
   code for handling MRF sources. What am I missing?
 
  That test is for handling MRF sources -- More precisely, it's
  collecting the writemask and half flags for MRF writes, which can only
  possibly be useful if we're going to use them later on to read something
  out of an MRF into a payload, which we shouldn't be doing in the first
  place.
 
  Aside from simplifying the function somewhat, that allows us to drop the
  16 register gap reserved for MRFs at register offset zero, what will
  allow us to drop the vgrf_to_reg[] offset calculation completely (also
  in split_virtual_grfs()) in a future patch (not sent for review yet).
 
 
  No, we do read from MRF's sort-of...  Send messages have an implicit
 read
  from an MRF.

 Heh, and that's pretty much the only way you read from it.

  This was written precicely so that we could use LOAD_PAYLOAD
  to build MRF payloads.  We do on pre-GEN6.
 
 I'm aware, but you don't need any of this meta-data to LOAD_PAYLOAD
 *into* an MRF, and LOAD_PAYLOAD with an MRF as source should be illegal
 anyway.


 And no one is using it that way.  All of the metadata checks you are
 deleting are checks on the *destination*.


Didn't you write this code yourself?  The only use for the collected
metadata is initializing the instruction flags of the MOVs subsequent
LOAD_PAYLOAD instructions are lowered to, based on the metadata already
collected for its source registers, which can never be MRFs, so the
metadata you collect from MRF writes is never actually used.


 
  ___
  mesa-dev mailing list
  mesa-dev@lists.freedesktop.org
  http://lists.freedesktop.org/mailman/listinfo/mesa-dev
 
 



pgpznbosO3BD6.pgp
Description: PGP signature
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 10/32] i965/fs: Remove logic to keep track of MRF metadata in lower_load_payload().

2015-02-19 Thread Jason Ekstrand
On Thu, Feb 19, 2015 at 1:25 PM, Francisco Jerez curroje...@riseup.net
wrote:

 Jason Ekstrand ja...@jlekstrand.net writes:

  On Thu, Feb 19, 2015 at 12:13 PM, Francisco Jerez curroje...@riseup.net
 
  wrote:
 
  Jason Ekstrand ja...@jlekstrand.net writes:
 
   On Fri, Feb 6, 2015 at 4:01 PM, Francisco Jerez 
 curroje...@riseup.net
   wrote:
  
   Hey Matt,
  
   Matt Turner matts...@gmail.com writes:
  
On Fri, Feb 6, 2015 at 6:42 AM, Francisco Jerez 
  curroje...@riseup.net
   wrote:
MRFs cannot be read from anyway so they cannot possibly be a valid
source of LOAD_PAYLOAD.
---
   
The function only seems to test inst-dst.file == MRF. I don't see
 any
code for handling MRF sources. What am I missing?
  
   That test is for handling MRF sources -- More precisely, it's
   collecting the writemask and half flags for MRF writes, which can
 only
   possibly be useful if we're going to use them later on to read
 something
   out of an MRF into a payload, which we shouldn't be doing in the
 first
   place.
  
   Aside from simplifying the function somewhat, that allows us to drop
 the
   16 register gap reserved for MRFs at register offset zero, what will
   allow us to drop the vgrf_to_reg[] offset calculation completely
 (also
   in split_virtual_grfs()) in a future patch (not sent for review yet).
  
  
   No, we do read from MRF's sort-of...  Send messages have an implicit
  read
   from an MRF.
 
  Heh, and that's pretty much the only way you read from it.
 
   This was written precicely so that we could use LOAD_PAYLOAD
   to build MRF payloads.  We do on pre-GEN6.
  
  I'm aware, but you don't need any of this meta-data to LOAD_PAYLOAD
  *into* an MRF, and LOAD_PAYLOAD with an MRF as source should be illegal
  anyway.
 
 
  And no one is using it that way.  All of the metadata checks you are
  deleting are checks on the *destination*.
 

 Didn't you write this code yourself?  The only use for the collected
 metadata is initializing the instruction flags of the MOVs subsequent
 LOAD_PAYLOAD instructions are lowered to, based on the metadata already
 collected for its source registers, which can never be MRFs, so the
 metadata you collect from MRF writes is never actually used.


Right... I misred something initially.  Yes, we should never be tracking
MRF's as a source of a LOAD_PAYLOAD.  I'll give it a better look a bit
later, but it looks better.
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 10/32] i965/fs: Remove logic to keep track of MRF metadata in lower_load_payload().

2015-02-19 Thread Jason Ekstrand
I'm still a little pensive.  But

Reviewed-by: Jason Ekstrand jason.ekstr...@intel.com

Now for a little aside.  I have come to the conclusion that I made a grave
mistake when I did the LOAD_PAYLOAD stuff.  In retrospect, I should have
just subclassed fs_inst for load_payload.  The problem is that we need to
snag a bunch of information for the sources when we create the
load_payload.  In particular, we need to know the width of the source so
that we know how much space it consumes in the payload and we need to know
the information required to properly re-create the mov such as
force_sechalf and force_writemask_all.  Really, in order to do things
properly, we need to gather this information *before* we do any
optimizations.  The nasty pile of code that you're editing together with
the effective_width parameter is a lame attempt to capture/reconstruct
this information.  Really, we should just subclass, capture the information
up-front, and do it properly.

--Jason

On Thu, Feb 19, 2015 at 1:53 PM, Jason Ekstrand ja...@jlekstrand.net
wrote:



 On Thu, Feb 19, 2015 at 1:25 PM, Francisco Jerez curroje...@riseup.net
 wrote:

 Jason Ekstrand ja...@jlekstrand.net writes:

  On Thu, Feb 19, 2015 at 12:13 PM, Francisco Jerez 
 curroje...@riseup.net
  wrote:
 
  Jason Ekstrand ja...@jlekstrand.net writes:
 
   On Fri, Feb 6, 2015 at 4:01 PM, Francisco Jerez 
 curroje...@riseup.net
   wrote:
  
   Hey Matt,
  
   Matt Turner matts...@gmail.com writes:
  
On Fri, Feb 6, 2015 at 6:42 AM, Francisco Jerez 
  curroje...@riseup.net
   wrote:
MRFs cannot be read from anyway so they cannot possibly be a
 valid
source of LOAD_PAYLOAD.
---
   
The function only seems to test inst-dst.file == MRF. I don't
 see any
code for handling MRF sources. What am I missing?
  
   That test is for handling MRF sources -- More precisely, it's
   collecting the writemask and half flags for MRF writes, which can
 only
   possibly be useful if we're going to use them later on to read
 something
   out of an MRF into a payload, which we shouldn't be doing in the
 first
   place.
  
   Aside from simplifying the function somewhat, that allows us to
 drop the
   16 register gap reserved for MRFs at register offset zero, what will
   allow us to drop the vgrf_to_reg[] offset calculation completely
 (also
   in split_virtual_grfs()) in a future patch (not sent for review
 yet).
  
  
   No, we do read from MRF's sort-of...  Send messages have an implicit
  read
   from an MRF.
 
  Heh, and that's pretty much the only way you read from it.
 
   This was written precicely so that we could use LOAD_PAYLOAD
   to build MRF payloads.  We do on pre-GEN6.
  
  I'm aware, but you don't need any of this meta-data to LOAD_PAYLOAD
  *into* an MRF, and LOAD_PAYLOAD with an MRF as source should be illegal
  anyway.
 
 
  And no one is using it that way.  All of the metadata checks you are
  deleting are checks on the *destination*.
 

 Didn't you write this code yourself?  The only use for the collected
 metadata is initializing the instruction flags of the MOVs subsequent
 LOAD_PAYLOAD instructions are lowered to, based on the metadata already
 collected for its source registers, which can never be MRFs, so the
 metadata you collect from MRF writes is never actually used.


 Right... I misred something initially.  Yes, we should never be tracking
 MRF's as a source of a LOAD_PAYLOAD.  I'll give it a better look a bit
 later, but it looks better.

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 10/32] i965/fs: Remove logic to keep track of MRF metadata in lower_load_payload().

2015-02-06 Thread Francisco Jerez
Hey Matt,

Matt Turner matts...@gmail.com writes:

 On Fri, Feb 6, 2015 at 6:42 AM, Francisco Jerez curroje...@riseup.net wrote:
 MRFs cannot be read from anyway so they cannot possibly be a valid
 source of LOAD_PAYLOAD.
 ---

 The function only seems to test inst-dst.file == MRF. I don't see any
 code for handling MRF sources. What am I missing?

That test is for handling MRF sources -- More precisely, it's
collecting the writemask and half flags for MRF writes, which can only
possibly be useful if we're going to use them later on to read something
out of an MRF into a payload, which we shouldn't be doing in the first
place.

Aside from simplifying the function somewhat, that allows us to drop the
16 register gap reserved for MRFs at register offset zero, what will
allow us to drop the vgrf_to_reg[] offset calculation completely (also
in split_virtual_grfs()) in a future patch (not sent for review yet).


pgpTS_XOX7QSF.pgp
Description: PGP signature
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 10/32] i965/fs: Remove logic to keep track of MRF metadata in lower_load_payload().

2015-02-06 Thread Matt Turner
On Fri, Feb 6, 2015 at 6:42 AM, Francisco Jerez curroje...@riseup.net wrote:
 MRFs cannot be read from anyway so they cannot possibly be a valid
 source of LOAD_PAYLOAD.
 ---

The function only seems to test inst-dst.file == MRF. I don't see any
code for handling MRF sources. What am I missing?
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev