On 10/29/2013 03:02 AM, Kirill Yukhin wrote:
> Hello Richard,
> 
> On 28 Oct 14:45, Richard Henderson wrote:
>> On 10/28/2013 01:58 PM, Kirill Yukhin wrote:
>>> Hello Richard,
>>> On 28 Oct 08:20, Richard Henderson wrote:
>>>> Why is a masked *scalar* operation useful?
>>>
>>> The reason the instructions exist is so that
>>> you can do fully fault correct predicated scalar algorithms.
>>
>> Using VEC_MERGE isn't the proper representation for that.
>>
>> If that's your real goal, then COND_EXEC is the only way to let
>> rtl know that faults are suppressed in the false condition.
> 
> I believe the cond_exec approach is supposed to look like this:
>   (define_subst "mask_scalar"
>     [(set (match_operand:SUBST_V 0)
>           (vec_merge:SUBST_V
>             (match_operand:SUBST_V 1)
>             (match_operand:SUBST_V 2)
>             (const_int 1)))]
>     "TARGET_AVX512F"
>     [(cond_exec (eq:CC
>                   (match_operand:<avx512fmaskmode> 3 "register_operand" "k")
>                   (const_int 1))
>                 (set (match_dup 0)
>                      (vec_merge:SUBST_V
>                        (match_dup 1)
>                        (match_dup 2)
>                        (const_int 1))))])
> 
> But this will only describe merge-masking, and incorrectly at that.
> We will need to add a clobber to signal that even for a false
> condition we will zero the higher part of the register.
> The preferable zero-masking will be indistinguishable from
> merge-masking, and we will need to choose which mask mode to
> enable. Bad turn.

No, a cond_exec approach to scalars would use scalar modes
not vector modes with vec_merge.  In that case the higher
part of the register is ignored and undefined, and the fact
that zeroing happens is irrelevant.

> 
> IMHO, we have 3 options to implement scalar masked insns:
>   1. `vec_merge' over vec_merge (current approach).
>      Pro.
>        1. Precise semantic description

False, as I've described above.  The compiler will believe
that all exceptions, especially memory exceptions, will be
taken.

>        2. Unified approach with vector patterns
>        3. Freedom for simplifier to reduce EVEX to VEX for
>        certain const masks
>      Cons.
>        1. Too precise semantic description and as a
>        consequence complicated code in md-file
> 
>   2. `cond_exec' approach
>     Pro.
>       1. Looks useful for the compiler when trying to generate
>       predicated code
>     Cons.
>       1. Not precise. Extra clobbers (?) needed: to signal
>       that we're changing the register even for false
>       condition in cond_exec 
>       2. Unable to describe zero masking nicely
>       3. Code still complicated as for option #1
>       4. Simplifier won't work (clobber is always clobber)
> 
>   3. Make all masked scalar insns to be unspecs
>     Pro.
>       1. Straight-forward, not overweighted. Enough for
>       intrinsics to work
>     Cons.
>       1. Since every unspec needs a code: substs won't be
>       applied directly: huge volume of similar code
>       2. Simplifier won't work
>       3. Generation of predicated code becomes hard
> 
> Am I missing some options, or is that all we have?
> If so, what option would you prefer?

As far as builtins are concerned, all three approaches are
functional.  But in order for the compiler to be able to
automatically create conditional code from normal scalar
code we'll have to use cond_exec.

Indeed, if we've done our job right, then the user-facing
inlines that we expose could be written

  inline double add_mask(double r, double x, double y, int m)
  {
    if (m & 1)
      r = x + y;
    return r;
  }

although honestly such inlines would be silly.
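
To make the fault-suppression point concrete, here is a minimal C
sketch of the two semantics for a masked scalar load. The names
load_merge and load_pred are hypothetical, chosen only for
illustration; they are not GCC functions or intrinsics, and the
sketch says nothing about how the md patterns should be written.

```c
#include <stddef.h>

/* Merge-style semantics (what a vec_merge description implies):
   the load always executes and the mask only selects the result,
   so p must be a valid pointer even when the mask bit is clear.  */
static double load_merge(double r, const double *p, int m)
{
    double loaded = *p;            /* unconditional access: may fault */
    return (m & 1) ? loaded : r;
}

/* Predicated semantics (what cond_exec expresses): the load is
   skipped entirely when the mask bit is clear, so no fault can be
   raised in the false condition.  */
static double load_pred(double r, const double *p, int m)
{
    if (m & 1)
        r = *p;                    /* access happens only when enabled */
    return r;
}
```

With a clear mask bit, load_pred(r, NULL, 0) simply returns r, while
load_merge requires a dereferenceable pointer regardless of the mask.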
