> I believe this patch to be nothing but an improvement over the current > state, and that a fix to the constraint problem should be a separate patch. > > In that basis, am I OK to commit?
One minor nit: > (define_special_predicate "shift_operator" >... >+ (ior (match_test "GET_CODE (XEXP (op, 1)) == CONST_INT >+ && ((unsigned HOST_WIDE_INT) INTVAL (XEXP (op, 1))) < >32") >+ (match_test "REG_P (XEXP (op, 1))")))) We're already enforcing the REG_P elsewhere, and it's only valid in some contexts, so I'd change this to: (match_test "GET_CODE (XEXP (op, 1)) != CONST_INT || ((unsigned HOST_WIDE_INT) INTVAL (XEXP (op, 1))) < 32") > Now, let me explain the other problem: > > As it stands, all shift-related patterns, for ARM or Thumb2 mode, permit > a shift to be expressed as either a shift type and amount (register or > constant), or as a multiply and power-of-two constant. Added complication is that only ARM mode accepts a register. > Problem scenario 1: > > Consider pattern (plus (mult r1 r2) r3). > > It so happens that reload knows that r2 contains a constant, say 20, > so reload checks to see if that could be converted to an immediate. > Now, 20 is not a power of two, so recog would reject it, but it is in > the range 0..31 so it does match the 'M' constraint. Oops! Though as you mention below we the predicate don't allow the second operand to be a register, so this can never happen. Reload may do unexpected things, but if it starts randomly changing valid const_int values then we have much bigger problems. > Problem scenario 2: > > Consider pattern (ashiftrt r1 r2). > > Again, it so happens that reload knows that r2 contains a constant, in > this case let's say 64, so again reload checks to see if that could > be converted to an immediate. This time, 64 is not in the range > 0..31, so recog would reject it, but it is a power of two, so it does > match the 'M' constraint. Again, oops! > > I see two ways to fix this properly: > > 1. Duplicate all the patterns in the machine description, once for the > mult case, and once for the other cases. This could probably be > done with a code iterator, if preferred. > > 2. Redefine the canonical form of (plus (mult .. 2^n) ..) such that it > always uses the (presumably cheaper) shift-and-add option. However, > this would require all other targets where madd really is the best > option to fix it up. (I'd imagine that two instructions for shift > and add would be cheaper speed wise, if properly scheduled, on most > targets? That doesn't help the size optimization though.) 3. Consistently accept both power-of-two and 0..31 for shifts. Large shift counts give undefined results[1], so replace them with an arbitrary value (e.g. 0) during assembly output. Argualy not an entirely "proper" fix, but I think it'll keep everything happy. > However, it's not obvious to me that this needs fixing: > * The failure mode would be an ICE, and we've not seen any. Then again noone noticed the negative-shift ICE until recently :-/ > * There's a comment in arm.c:shift_op that suggests that this can't > happen, somehow, at least in the mult case. > - I'm not sure exactly how reload works, but it seems reasonable > that it will never try to convert a register to an immediate > because the pattern does not allow registers in the first place. > - This logic doesn't hold in the opposite case though. > Have I explained all that clearly? I think you've convered most of it. For bonus points we should probably disallow MULT in the arm_shiftsi3 pattern, stop it interacting with the regulat mulsi3 pattern in undesirable ways. Paul [1] Or at least not any result gcc will be expecting.