I also wonder whether it would be useful to have 32-bit do the vector logical ops in GPRs as well. At the moment, the patches don't allow it (vector types must be done in the Altivec/VSX registers, and TImode is done by splitting the operation into 4 separate categories). On the 64-bit side, having __int128_t passed in GPRs means you want to avoid ping-ponging between the GPRs and VSX registers. In addition, the atomic quad word support (patch #7) has to run in GPRs, so we need add/subtract/logical to have versions that run in GPRs.
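To make the GPR side concrete, here is a rough C sketch of what a 128-bit logical AND amounts to once the value lives in GPRs. It assumes the split is into word-sized pieces: four on a 32-bit target, two on a 64-bit target. The type and function names (ti_words32, ti_words64, ti_and32, ti_and64, ti_and_selfcheck) are made up for illustration and are not from the patches.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-ins for a TImode value held in GPRs. */
typedef struct { uint32_t w[4]; } ti_words32;  /* 32-bit target: four GPRs */
typedef struct { uint64_t d[2]; } ti_words64;  /* 64-bit target: two GPRs  */

/* 128-bit AND as four independent 32-bit ANDs. */
ti_words32 ti_and32(ti_words32 a, ti_words32 b)
{
    ti_words32 r;
    for (int i = 0; i < 4; i++)
        r.w[i] = a.w[i] & b.w[i];
    return r;
}

/* 128-bit AND as two independent 64-bit ANDs. */
ti_words64 ti_and64(ti_words64 a, ti_words64 b)
{
    ti_words64 r;
    r.d[0] = a.d[0] & b.d[0];
    r.d[1] = a.d[1] & b.d[1];
    return r;
}

/* Small self-check: each piece is computed independently of the others. */
void ti_and_selfcheck(void)
{
    ti_words64 a = {{ 0xFFFFFFFF00000000ull, 0x00000000FFFFFFFFull }};
    ti_words64 b = {{ 0x123456789ABCDEF0ull, 0x123456789ABCDEF0ull }};
    ti_words64 r = ti_and64(a, b);
    assert(r.d[0] == 0x1234567800000000ull);
    assert(r.d[1] == 0x000000009ABCDEF0ull);
}
```

Logical ops have no carry between the pieces, so the split is trivial; add/subtract would need a carry chain across the words.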
It might work better if you added a mode V1TI for TI in vector regs, and then used plain TI only for GPRs. It certainly will make things a lot more regular; whether it actually works better, I have no idea.
The way you have things now, the vector patterns are split to GPR patterns only after reload, which is much too late to do most optimisations on them. On the other hand, deciding early what register set some op should go to isn't too pleasant either; is it always the best choice to use the vector regs when possible?

Segher