Hi Uros,
> > The major theme of this patch is to generalize many of i386.md's
> > *di3_doubleword patterns to become *<dwi>_doubleword patterns, i.e.
> > whenever there exists a "double word" optimization for DImode with
> > -m32, there should be an equivalent TImode optimization on TARGET_64BIT.
> 
> No, please do not mix two different themes in one patch.
> 
> OTOH, the only TImode optimization that can be used with SSE registers is with
> logic instructions and some constant shifts, but there is no TImode arithmetic.
> I assume your end goal is to introduce STV for TImode on 64-bit targets, because
> DImode patterns for x86_32 were introduced to avoid early decomposition by
> middle end and to split instructions that STV didn't convert to vector
> instructions after STV pass. So, let's start with basic V1TImode support before
> optimizations are introduced.

I'm not sure I understand.  What basic V1TImode support do you/we want next?

The testcase and worked example with this patch show its benefits without STV
and without using V1TImode vectors.  As explained in the subject, and;cmp can be
turned into the cheaper not;cmp $0 for TImode (and for DImode with -m32), in the
same way as we currently do for SImode everywhere.  Having double word modes
visible to combine allows it to work its magic.  A recent patch ensured that
double word compares were visible to combine; this optimization just requires
that double word logic (AND, IOR and XOR) is visible after combine.  In fact,
for -m32 DImode it already is; it's just that TImode is inconsistent, leading
to missed optimizations.  Likewise, STV can't choose between implementations
before there are alternative implementations to choose from.
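
To make the transformation concrete, here is a minimal sketch.  It assumes the
and;cmp in question has the form (x & y) == y, which is my reading rather than
something spelled out above, and it uses uint64_t to stand in for the double
word mode (DImode with -m32).  The function names are hypothetical and are not
taken from the patch or its testcase.

```c
#include <assert.h>
#include <stdint.h>

/* Original form: AND, then compare the result against y. */
int subset_and_cmp(uint64_t x, uint64_t y)
{
    return (x & y) == y;
}

/* Rewritten form: NOT x, AND with y, then compare against zero. */
int subset_not_test(uint64_t x, uint64_t y)
{
    return (~x & y) == 0;
}

/* Check that both forms agree on a handful of sample values. */
void self_check(void)
{
    static const uint64_t samples[] = {
        0, 1, 0xff, 0x8000000000000000ULL, 0xffffffff00000000ULL, UINT64_MAX
    };
    const int n = (int)(sizeof samples / sizeof samples[0]);

    for (int i = 0; i < n; i++)
        for (int j = 0; j < n; j++)
            assert(subset_and_cmp(samples[i], samples[j])
                   == subset_not_test(samples[i], samples[j]));
}
```

Both functions answer "are all bits of y also set in x?", so they are
interchangeable; the second form only needs a comparison against zero.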

As always I'm happy to do things in the order you want (modulo my 36 hour spin
cycle).  In fact, the reason this is being done now is that you recommended it
would be best to fix pr65105-5.c after the "double word comparison", which I
fully agree with, as it leads to a better solution that doesn't require
peephole2 (in your own words, "why isn't this being done in combine?").

I'm almost certainly misunderstanding something.  Which piece needs to be done
next?

Perhaps I should have used the term "the common theme" rather than
"the major theme", which may have made it sound like there were unrelated
or independent bits in this patch.  But there are no V1TI changes in it.

Thanks in advance for any clarification.

Cheers,
Roger