Re: New warning for expanded vector operations
On Fri, Oct 14, 2011 at 2:57 PM, Richard Guenther wrote: > On Fri, Oct 14, 2011 at 3:42 PM, Artem Shinkarov > wrote: >> On Thu, Oct 13, 2011 at 10:40 AM, Artem Shinkarov >> wrote: >>> On Thu, Oct 13, 2011 at 10:23 AM, Richard Guenther >>> wrote: >>>> On Thu, Oct 13, 2011 at 10:59 AM, Mike Stump wrote: >>>>> On Oct 12, 2011, at 2:37 PM, Artem Shinkarov wrote: >>>>>> This patch fixed PR50704. >>>>>> >>>>>> gcc/testsuite: >>>>>> * gcc.target/i386/warn-vect-op-3.c: Exclude ia32 target. >>>>>> * gcc.target/i386/warn-vect-op-1.c: Ditto. >>>>>> * gcc.target/i386/warn-vect-op-2.c: Ditto. >>>>>> >>>>>> Ok for trunk? >>>>> >>>>> Ok. Is this x32 clean? :-) If not, HJ will offer an even better >>>>> spelling. >>>> >>>> I suppose you instead want sth like >>>> >>>> { dg-require-effective-target lp64 } >>>> >>>> ? >>>> >>> >>> See our discussion with HJ here: >>> http://gcc.gnu.org/bugzilla/show_bug.cgi?id=50704 >>> /* { dg-do compile { target { ! { ia32 } } } } */ was his idea. As >>> far as x32 sets UNITS_PER_WORD to 8, these tests should work fine. >>> >>> Artem. >>> >> >> Ping. >> >> So can I commit the changes? > > Yes. > > Thanks, > Richard. Committed with 179991. >> >> Thanks, >> Artem. >> >
Re: New warning for expanded vector operations
On Thu, Oct 13, 2011 at 10:40 AM, Artem Shinkarov wrote: > On Thu, Oct 13, 2011 at 10:23 AM, Richard Guenther > wrote: >> On Thu, Oct 13, 2011 at 10:59 AM, Mike Stump wrote: >>> On Oct 12, 2011, at 2:37 PM, Artem Shinkarov wrote: >>>> This patch fixed PR50704. >>>> >>>> gcc/testsuite: >>>> * gcc.target/i386/warn-vect-op-3.c: Exclude ia32 target. >>>> * gcc.target/i386/warn-vect-op-1.c: Ditto. >>>> * gcc.target/i386/warn-vect-op-2.c: Ditto. >>>> >>>> Ok for trunk? >>> >>> Ok. Is this x32 clean? :-) If not, HJ will offer an even better spelling. >> >> I suppose you instead want sth like >> >> { dg-require-effective-target lp64 } >> >> ? >> > > See our discussion with HJ here: > http://gcc.gnu.org/bugzilla/show_bug.cgi?id=50704 > /* { dg-do compile { target { ! { ia32 } } } } */ was his idea. As > far as x32 sets UNITS_PER_WORD to 8, these tests should work fine. > > Artem. > Ping. So can I commit the changes? Thanks, Artem.
Re: Vector alignment tracking
On Thu, Oct 13, 2011 at 4:54 PM, Andi Kleen wrote: > Artem Shinkarov writes: >> >> 1) Currently in C we cannot provide information that an array is >> aligned to a certain number. The problem is hidden in the fact, that > > Have you considered doing it the other way round: when an optimization > needs something to be aligned, make the declaration aligned? > > -Andi Andi, I can't realistically imagine how could it work. The problem is that for an arbitrary arr[x], I have no idea whether it should be aligned or not. what if arr = ptr + 5; v = *(vec *) arr; I can make arr aligned, because it would be better for performance, but obviously, the pointer expression breaks this alignment. But the code is valid, because unaligned move is still possible. So I think that checking is a more conservative approach. Or I am missing someting? Thanks, Artem. > -- > a...@linux.intel.com -- Speaking for myself only >
Vector alignment tracking
Hi I would like to share some plans about improving the situation with vector alignment tracking. First of all, I would like to start with a well-known bug: http://gcc.gnu.org/bugzilla/show_bug.cgi?id=50716. There are several aspects of the problem: 1) We would like to avoid the quiet segmentation fault. 2) We would like to warn a user about the potential problems considering assignment of vectors with different alignment. 3) We would like to replace obvious aligned vector assignments with aligned move, and unaligned with unaligned. All these aspects are interconnected and in order to find the problem, we have to improve the alignment tracking facilities. 1) Currently in C we cannot provide information that an array is aligned to a certain number. The problem is hidden in the fact, that pointer can represents an array or an address of an object. And it turns out that current aligned attribute doesn't help here. My proposal is to introduce an attribute called array_alligned (I am very flexible on the name) which can be applied only to the pointers and which would show that the pointer of this type represents an array, where the first element is aligned to the given number. 2) After we have the new attribute, we can have a pass which would check all the pointer arithmetic expressions, and in case of vectors, mark the assignments with __builtin_assume_aligned. 3) In the separate pass we need to mark an alignments of the function return types, in order to propagate this information through the flow-graph. 4) In case of LTO, it becomes possible to track all the pointer dereferences, and depending on the parameters warn, or change aligned assignment to unaligned and vice-versa. As a very first draft of (1) I include the patch, that introduces array_aligned attribute. The attribute sets is_array_flag in the type, ans uses alignment number to store the alignment of the array. In this implementation, we loose information about the alignment of the pointer itself, but I don't know if we need it in this particular situation. Alternatively we can keep array_alignment in a separate field, which one is better I am not sure. Thanks, Artem. Index: gcc/c-family/c-common.c === --- gcc/c-family/c-common.c (revision 179906) +++ gcc/c-family/c-common.c (working copy) @@ -341,6 +341,7 @@ static tree handle_destructor_attribute static tree handle_mode_attribute (tree *, tree, tree, int, bool *); static tree handle_section_attribute (tree *, tree, tree, int, bool *); static tree handle_aligned_attribute (tree *, tree, tree, int, bool *); +static tree handle_aligned_array_attribute (tree *, tree, tree, int, bool *); static tree handle_weak_attribute (tree *, tree, tree, int, bool *) ; static tree handle_alias_ifunc_attribute (bool, tree *, tree, tree, bool *); static tree handle_ifunc_attribute (tree *, tree, tree, int, bool *); @@ -643,6 +644,8 @@ const struct attribute_spec c_common_att handle_section_attribute, false }, { "aligned",0, 1, false, false, false, handle_aligned_attribute, false }, + { "aligned_array", 0, 1, false, false, false, + handle_aligned_array_attribute, false }, { "weak", 0, 0, true, false, false, handle_weak_attribute, false }, { "ifunc", 1, 1, true, false, false, @@ -6682,6 +6685,26 @@ handle_section_attribute (tree *node, tr } return NULL_TREE; +} + +/* Handle "aligned_array" attribute. */ +static tree +handle_aligned_array_attribute (tree *node, tree ARG_UNUSED (name), tree args, + int flags, bool *no_add_attrs) +{ + if (!TYPE_P (*node) || !POINTER_TYPE_P (*node)) +{ + error ("array_alignment attribute must be applied to a pointer-type"); + *no_add_attrs = true; +} + else +{ + tree ret = handle_aligned_attribute (node, name, args, flags, no_add_attrs); + TYPE_IS_ARRAY (*node) = true; + return ret; +} + + return NULL_TREE; } /* Handle a "aligned" attribute; arguments as in Index: gcc/tree.h === --- gcc/tree.h (revision 179906) +++ gcc/tree.h (working copy) @@ -2149,6 +2149,7 @@ struct GTY(()) tree_block { #define TYPE_NEXT_VARIANT(NODE) (TYPE_CHECK (NODE)->type_common.next_variant) #define TYPE_MAIN_VARIANT(NODE) (TYPE_CHECK (NODE)->type_common.main_variant) #define TYPE_CONTEXT(NODE) (TYPE_CHECK (NODE)->type_common.context) +#define TYPE_IS_ARRAY(NODE) (TYPE_CHECK (NODE)->type_common.is_array_flag) /* Vector types need to check target flags to determine type. */ extern enum machine_mode vector_type_mode (const_tree); @@ -2411,6 +2412,7 @@ struct GTY(()) tree_type_common { unsigned lang_flag_5 : 1; unsigned lang_flag_6 : 1; + unsigned is_array_flag: 1;
Re: New warning for expanded vector operations
On Thu, Oct 13, 2011 at 10:23 AM, Richard Guenther wrote: > On Thu, Oct 13, 2011 at 10:59 AM, Mike Stump wrote: >> On Oct 12, 2011, at 2:37 PM, Artem Shinkarov wrote: >>> This patch fixed PR50704. >>> >>> gcc/testsuite: >>> * gcc.target/i386/warn-vect-op-3.c: Exclude ia32 target. >>> * gcc.target/i386/warn-vect-op-1.c: Ditto. >>> * gcc.target/i386/warn-vect-op-2.c: Ditto. >>> >>> Ok for trunk? >> >> Ok. Is this x32 clean? :-) If not, HJ will offer an even better spelling. > > I suppose you instead want sth like > > { dg-require-effective-target lp64 } > > ? > See our discussion with HJ here: http://gcc.gnu.org/bugzilla/show_bug.cgi?id=50704 /* { dg-do compile { target { ! { ia32 } } } } */ was his idea. As far as x32 sets UNITS_PER_WORD to 8, these tests should work fine. Artem.
Re: New warning for expanded vector operations
This patch fixed PR50704. gcc/testsuite: * gcc.target/i386/warn-vect-op-3.c: Exclude ia32 target. * gcc.target/i386/warn-vect-op-1.c: Ditto. * gcc.target/i386/warn-vect-op-2.c: Ditto. Ok for trunk? Artem. On Wed, Oct 12, 2011 at 4:40 PM, H.J. Lu wrote: > On Tue, Oct 11, 2011 at 9:11 AM, Artem Shinkarov > wrote: >> >> Committed with the revision 179807. >> >> > > This caused: > > http://gcc.gnu.org/bugzilla/show_bug.cgi?id=50704 > > -- > H.J. > fix-performance-tests.diff Description: Binary data
Re: New warning for expanded vector operations
On Tue, Oct 11, 2011 at 11:52 AM, Richard Guenther wrote: > On Mon, Oct 10, 2011 at 3:21 PM, Artem Shinkarov > wrote: >> On Mon, Oct 10, 2011 at 12:02 PM, Richard Guenther >> wrote: >>> On Fri, Oct 7, 2011 at 9:44 AM, Artem Shinkarov >>> wrote: >>>> On Fri, Oct 7, 2011 at 6:22 AM, Artem Shinkarov >>>> wrote: >>>>> On Wed, Oct 5, 2011 at 12:35 PM, Richard Guenther >>>>> wrote: >>>>>> On Wed, Oct 5, 2011 at 1:28 PM, Artem Shinkarov >>>>>> wrote: >>>>>>> On Wed, Oct 5, 2011 at 9:40 AM, Richard Guenther >>>>>>> wrote: >>>>>>>> On Wed, Oct 5, 2011 at 12:18 AM, Artem Shinkarov >>>>>>>> wrote: >>>>>>>>> Hi >>>>>>>>> >>>>>>>>> Here is a patch to inform a programmer about the expanded vector >>>>>>>>> operation. >>>>>>>>> Bootstrapped on x86-unknown-linux-gnu. >>>>>>>>> >>>>>>>>> ChangeLog: >>>>>>>>> >>>>>>>>> * gcc/tree-vect-generic.c (expand_vector_piecewise): Adjust to >>>>>>>>> produce the warning. >>>>>>>>> (expand_vector_parallel): Adjust to produce the warning. >>>>>>>> >>>>>>>> Entries start without gcc/, they are relative to the gcc/ChangeLog >>>>>>>> file. >>>>>>> >>>>>>> Sure, sorry. >>>>>>> >>>>>>>>> (lower_vec_shuffle): Adjust to produce the warning. >>>>>>>>> * gcc/common.opt: New warning Wvector-operation-expanded. >>>>>>>>> * gcc/doc/invoke.texi: Document the wawning. >>>>>>>>> >>>>>>>>> >>>>>>>>> Ok? >>>>>>>> >>>>>>>> I don't like the name -Wvector-operation-expanded. We emit a >>>>>>>> similar warning for missed inline expansions with -Winline, so >>>>>>>> maybe -Wvector-extensions (that's the name that appears >>>>>>>> in the C extension documentation). >>>>>>> >>>>>>> Hm, I don't care much about the name, unless it gets clear what the >>>>>>> warning is used for. I am not really sure that Wvector-extensions >>>>>>> makes it clear. Also, I don't see anything bad if the warning will >>>>>>> pop up during the vectorisation. Any vector operation performed >>>>>>> outside the SIMD accelerator looks suspicious, because it actually >>>>>>> doesn't improve performance. Such a warning during the vectorisation >>>>>>> could mean that a programmer forgot some flag, or the constant >>>>>>> propagation failed to deliver a constant, or something else. >>>>>>> >>>>>>> Conceptually the text I am producing is not really a warning, it is >>>>>>> more like an information, but I am not aware of the mechanisms that >>>>>>> would allow me to introduce a flag triggering inform () or something >>>>>>> similar. >>>>>>> >>>>>>> What I think we really need to avoid is including this warning in the >>>>>>> standard Ox. >>>>>>> >>>>>>>> + location_t loc = gimple_location (gsi_stmt (*gsi)); >>>>>>>> + >>>>>>>> + warning_at (loc, OPT_Wvector_operation_expanded, >>>>>>>> + "vector operation will be expanded piecewise"); >>>>>>>> >>>>>>>> v = VEC_alloc(constructor_elt, gc, (nunits + delta - 1) / delta); >>>>>>>> for (i = 0; i < nunits; >>>>>>>> @@ -260,6 +264,10 @@ expand_vector_parallel (gimple_stmt_iter >>>>>>>> tree result, compute_type; >>>>>>>> enum machine_mode mode; >>>>>>>> int n_words = tree_low_cst (TYPE_SIZE_UNIT (type), 1) / >>>>>>>> UNITS_PER_WORD; >>>>>>>> + location_t loc = gimple_location (gsi_stmt (*gsi)); >>>>>>>> + >>>>>>>> + warning_at (loc, OPT
Re: New warning for expanded vector operations
On Mon, Oct 10, 2011 at 12:02 PM, Richard Guenther wrote: > On Fri, Oct 7, 2011 at 9:44 AM, Artem Shinkarov > wrote: >> On Fri, Oct 7, 2011 at 6:22 AM, Artem Shinkarov >> wrote: >>> On Wed, Oct 5, 2011 at 12:35 PM, Richard Guenther >>> wrote: >>>> On Wed, Oct 5, 2011 at 1:28 PM, Artem Shinkarov >>>> wrote: >>>>> On Wed, Oct 5, 2011 at 9:40 AM, Richard Guenther >>>>> wrote: >>>>>> On Wed, Oct 5, 2011 at 12:18 AM, Artem Shinkarov >>>>>> wrote: >>>>>>> Hi >>>>>>> >>>>>>> Here is a patch to inform a programmer about the expanded vector >>>>>>> operation. >>>>>>> Bootstrapped on x86-unknown-linux-gnu. >>>>>>> >>>>>>> ChangeLog: >>>>>>> >>>>>>> * gcc/tree-vect-generic.c (expand_vector_piecewise): Adjust to >>>>>>> produce the warning. >>>>>>> (expand_vector_parallel): Adjust to produce the warning. >>>>>> >>>>>> Entries start without gcc/, they are relative to the gcc/ChangeLog file. >>>>> >>>>> Sure, sorry. >>>>> >>>>>>> (lower_vec_shuffle): Adjust to produce the warning. >>>>>>> * gcc/common.opt: New warning Wvector-operation-expanded. >>>>>>> * gcc/doc/invoke.texi: Document the wawning. >>>>>>> >>>>>>> >>>>>>> Ok? >>>>>> >>>>>> I don't like the name -Wvector-operation-expanded. We emit a >>>>>> similar warning for missed inline expansions with -Winline, so >>>>>> maybe -Wvector-extensions (that's the name that appears >>>>>> in the C extension documentation). >>>>> >>>>> Hm, I don't care much about the name, unless it gets clear what the >>>>> warning is used for. I am not really sure that Wvector-extensions >>>>> makes it clear. Also, I don't see anything bad if the warning will >>>>> pop up during the vectorisation. Any vector operation performed >>>>> outside the SIMD accelerator looks suspicious, because it actually >>>>> doesn't improve performance. Such a warning during the vectorisation >>>>> could mean that a programmer forgot some flag, or the constant >>>>> propagation failed to deliver a constant, or something else. >>>>> >>>>> Conceptually the text I am producing is not really a warning, it is >>>>> more like an information, but I am not aware of the mechanisms that >>>>> would allow me to introduce a flag triggering inform () or something >>>>> similar. >>>>> >>>>> What I think we really need to avoid is including this warning in the >>>>> standard Ox. >>>>> >>>>>> + location_t loc = gimple_location (gsi_stmt (*gsi)); >>>>>> + >>>>>> + warning_at (loc, OPT_Wvector_operation_expanded, >>>>>> + "vector operation will be expanded piecewise"); >>>>>> >>>>>> v = VEC_alloc(constructor_elt, gc, (nunits + delta - 1) / delta); >>>>>> for (i = 0; i < nunits; >>>>>> @@ -260,6 +264,10 @@ expand_vector_parallel (gimple_stmt_iter >>>>>> tree result, compute_type; >>>>>> enum machine_mode mode; >>>>>> int n_words = tree_low_cst (TYPE_SIZE_UNIT (type), 1) / UNITS_PER_WORD; >>>>>> + location_t loc = gimple_location (gsi_stmt (*gsi)); >>>>>> + >>>>>> + warning_at (loc, OPT_Wvector_operation_expanded, >>>>>> + "vector operation will be expanded in parallel"); >>>>>> >>>>>> what's the difference between 'piecewise' and 'in parallel'? >>>>> >>>>> Parallel is a little bit better for performance than piecewise. >>>> >>>> I see. That difference should probably be documented, maybe with >>>> an example. >>>> >>>> Richard. >>>> >>>>>> @@ -301,16 +309,15 @@ expand_vector_addition (gimple_stmt_iter >>>>>> { >>>>>> int parts_per_word = UNITS_PER_WORD >>>>>>
Re: New warning for expanded vector operations
On Fri, Oct 7, 2011 at 6:22 AM, Artem Shinkarov wrote: > On Wed, Oct 5, 2011 at 12:35 PM, Richard Guenther > wrote: >> On Wed, Oct 5, 2011 at 1:28 PM, Artem Shinkarov >> wrote: >>> On Wed, Oct 5, 2011 at 9:40 AM, Richard Guenther >>> wrote: >>>> On Wed, Oct 5, 2011 at 12:18 AM, Artem Shinkarov >>>> wrote: >>>>> Hi >>>>> >>>>> Here is a patch to inform a programmer about the expanded vector >>>>> operation. >>>>> Bootstrapped on x86-unknown-linux-gnu. >>>>> >>>>> ChangeLog: >>>>> >>>>> * gcc/tree-vect-generic.c (expand_vector_piecewise): Adjust to >>>>> produce the warning. >>>>> (expand_vector_parallel): Adjust to produce the warning. >>>> >>>> Entries start without gcc/, they are relative to the gcc/ChangeLog file. >>> >>> Sure, sorry. >>> >>>>> (lower_vec_shuffle): Adjust to produce the warning. >>>>> * gcc/common.opt: New warning Wvector-operation-expanded. >>>>> * gcc/doc/invoke.texi: Document the wawning. >>>>> >>>>> >>>>> Ok? >>>> >>>> I don't like the name -Wvector-operation-expanded. We emit a >>>> similar warning for missed inline expansions with -Winline, so >>>> maybe -Wvector-extensions (that's the name that appears >>>> in the C extension documentation). >>> >>> Hm, I don't care much about the name, unless it gets clear what the >>> warning is used for. I am not really sure that Wvector-extensions >>> makes it clear. Also, I don't see anything bad if the warning will >>> pop up during the vectorisation. Any vector operation performed >>> outside the SIMD accelerator looks suspicious, because it actually >>> doesn't improve performance. Such a warning during the vectorisation >>> could mean that a programmer forgot some flag, or the constant >>> propagation failed to deliver a constant, or something else. >>> >>> Conceptually the text I am producing is not really a warning, it is >>> more like an information, but I am not aware of the mechanisms that >>> would allow me to introduce a flag triggering inform () or something >>> similar. >>> >>> What I think we really need to avoid is including this warning in the >>> standard Ox. >>> >>>> + location_t loc = gimple_location (gsi_stmt (*gsi)); >>>> + >>>> + warning_at (loc, OPT_Wvector_operation_expanded, >>>> + "vector operation will be expanded piecewise"); >>>> >>>> v = VEC_alloc(constructor_elt, gc, (nunits + delta - 1) / delta); >>>> for (i = 0; i < nunits; >>>> @@ -260,6 +264,10 @@ expand_vector_parallel (gimple_stmt_iter >>>> tree result, compute_type; >>>> enum machine_mode mode; >>>> int n_words = tree_low_cst (TYPE_SIZE_UNIT (type), 1) / UNITS_PER_WORD; >>>> + location_t loc = gimple_location (gsi_stmt (*gsi)); >>>> + >>>> + warning_at (loc, OPT_Wvector_operation_expanded, >>>> + "vector operation will be expanded in parallel"); >>>> >>>> what's the difference between 'piecewise' and 'in parallel'? >>> >>> Parallel is a little bit better for performance than piecewise. >> >> I see. That difference should probably be documented, maybe with >> an example. >> >> Richard. >> >>>> @@ -301,16 +309,15 @@ expand_vector_addition (gimple_stmt_iter >>>> { >>>> int parts_per_word = UNITS_PER_WORD >>>> / tree_low_cst (TYPE_SIZE_UNIT (TREE_TYPE (type)), >>>> 1); >>>> + location_t loc = gimple_location (gsi_stmt (*gsi)); >>>> >>>> if (INTEGRAL_TYPE_P (TREE_TYPE (type)) >>>> && parts_per_word >= 4 >>>> && TYPE_VECTOR_SUBPARTS (type) >= 4) >>>> - return expand_vector_parallel (gsi, f_parallel, >>>> - type, a, b, code); >>>> + return expand_vector_parallel (gsi, f_parallel, type, a, b, code); >>>> else >>>> - return expand_vector_piecewise (gsi, f, >>>> - type, TREE_TYPE (type), >>>> -
Re: New warning for expanded vector operations
On Wed, Oct 5, 2011 at 12:35 PM, Richard Guenther wrote: > On Wed, Oct 5, 2011 at 1:28 PM, Artem Shinkarov > wrote: >> On Wed, Oct 5, 2011 at 9:40 AM, Richard Guenther >> wrote: >>> On Wed, Oct 5, 2011 at 12:18 AM, Artem Shinkarov >>> wrote: >>>> Hi >>>> >>>> Here is a patch to inform a programmer about the expanded vector operation. >>>> Bootstrapped on x86-unknown-linux-gnu. >>>> >>>> ChangeLog: >>>> >>>> * gcc/tree-vect-generic.c (expand_vector_piecewise): Adjust to >>>> produce the warning. >>>> (expand_vector_parallel): Adjust to produce the warning. >>> >>> Entries start without gcc/, they are relative to the gcc/ChangeLog file. >> >> Sure, sorry. >> >>>> (lower_vec_shuffle): Adjust to produce the warning. >>>> * gcc/common.opt: New warning Wvector-operation-expanded. >>>> * gcc/doc/invoke.texi: Document the wawning. >>>> >>>> >>>> Ok? >>> >>> I don't like the name -Wvector-operation-expanded. We emit a >>> similar warning for missed inline expansions with -Winline, so >>> maybe -Wvector-extensions (that's the name that appears >>> in the C extension documentation). >> >> Hm, I don't care much about the name, unless it gets clear what the >> warning is used for. I am not really sure that Wvector-extensions >> makes it clear. Also, I don't see anything bad if the warning will >> pop up during the vectorisation. Any vector operation performed >> outside the SIMD accelerator looks suspicious, because it actually >> doesn't improve performance. Such a warning during the vectorisation >> could mean that a programmer forgot some flag, or the constant >> propagation failed to deliver a constant, or something else. >> >> Conceptually the text I am producing is not really a warning, it is >> more like an information, but I am not aware of the mechanisms that >> would allow me to introduce a flag triggering inform () or something >> similar. >> >> What I think we really need to avoid is including this warning in the >> standard Ox. >> >>> + location_t loc = gimple_location (gsi_stmt (*gsi)); >>> + >>> + warning_at (loc, OPT_Wvector_operation_expanded, >>> + "vector operation will be expanded piecewise"); >>> >>> v = VEC_alloc(constructor_elt, gc, (nunits + delta - 1) / delta); >>> for (i = 0; i < nunits; >>> @@ -260,6 +264,10 @@ expand_vector_parallel (gimple_stmt_iter >>> tree result, compute_type; >>> enum machine_mode mode; >>> int n_words = tree_low_cst (TYPE_SIZE_UNIT (type), 1) / UNITS_PER_WORD; >>> + location_t loc = gimple_location (gsi_stmt (*gsi)); >>> + >>> + warning_at (loc, OPT_Wvector_operation_expanded, >>> + "vector operation will be expanded in parallel"); >>> >>> what's the difference between 'piecewise' and 'in parallel'? >> >> Parallel is a little bit better for performance than piecewise. > > I see. That difference should probably be documented, maybe with > an example. > > Richard. > >>> @@ -301,16 +309,15 @@ expand_vector_addition (gimple_stmt_iter >>> { >>> int parts_per_word = UNITS_PER_WORD >>> / tree_low_cst (TYPE_SIZE_UNIT (TREE_TYPE (type)), 1); >>> + location_t loc = gimple_location (gsi_stmt (*gsi)); >>> >>> if (INTEGRAL_TYPE_P (TREE_TYPE (type)) >>> && parts_per_word >= 4 >>> && TYPE_VECTOR_SUBPARTS (type) >= 4) >>> - return expand_vector_parallel (gsi, f_parallel, >>> - type, a, b, code); >>> + return expand_vector_parallel (gsi, f_parallel, type, a, b, code); >>> else >>> - return expand_vector_piecewise (gsi, f, >>> - type, TREE_TYPE (type), >>> - a, b, code); >>> + return expand_vector_piecewise (gsi, f, type, >>> + TREE_TYPE (type), a, b, code); >>> } >>> >>> /* Check if vector VEC consists of all the equal elements and >>> >>> unless i miss something loc is unused here. Please avoid random >>> whitespace changes (just review your patch yourself before posting >>> and revert pieces tha
Re: Fix pr50607 bconstp-3.c failure
On Thu, Oct 6, 2011 at 3:27 AM, Hans-Peter Nilsson wrote: > On Thu, 6 Oct 2011, Artem Shinkarov wrote: >> Successfully regtested on x86-unknown-linux-gnu. Committed to the >> mainline with the revision 179588. >> >> ChangeLog: >> 2011-10-06 Artjoms Sinkarovs >> * c-tree.h (c_expr_t): New typedef for struct c_expr. >> (C_EXPR_APPEND): New macro. >> * c-parser.c (c_parser_get_builtin_args): Preserve >> original_tree_code of c_expr structure. Fixes bconstp-3.c >> failure of PR50607. >> (c_parser_postfix_expression): Adjust to the new function. > > Write that changelog entry as: > > PR middle-end/50607 > * c-tree.h (c_expr_t): New typedef for struct c_expr. > (C_EXPR_APPEND): New macro. > * c-parser.c (c_parser_get_builtin_args): Preserve > original_tree_code of c_expr structure. > (c_parser_postfix_expression): Adjust to the new function. > > (note top marker and without the "Fixes...") > and there'll be an automatic entry in bugzilla, like for example > PR50609. You still have to close the bugzilla entry manually > (not all noteworthy fixes are reason to close a bug report). > > Mostly as a note for the future but you could fix that now, when > adding the missing empty line after the date+email line. ;) > > brgds, H-P > Thanks for the advice, I didn't know about the automatic mail. Fixed with commit 179589. As for closing bugzilla entry, I would prefer to make sure that the patch really fixes the problem. It would be very helpful if you could confirm this. Thanks, Artem.
Re: Fix pr50607 bconstp-3.c failure
On Wed, Oct 5, 2011 at 5:32 PM, Artem Shinkarov wrote: > On Wed, Oct 5, 2011 at 5:28 PM, Joseph S. Myers > wrote: >> On Wed, 5 Oct 2011, Artem Shinkarov wrote: >> >>> Joseph, is it possible to commit the patch together with the space fixes? >> >> You should not commit whitespace fixes to lines not otherwise modified by >> a patch, except in a separate commit that *only* has whitespace or other >> formatting fixes, so that "svn blame" results are more meaningful. That >> is: commit whitespace fixes (assuming they are genuinely fixes rather than >> making things worse) *separately* in a commit whose commit message makes >> clear it is just whitespace fixes. > > Ok, I see. In the given patch all the fixes are mainly auto-generated > and remove the trailing spaces. But I see your point. > > Thanks, > Artem. >> -- >> Joseph S. Myers >> jos...@codesourcery.com >> > Successfully regtested on x86-unknown-linux-gnu. Committed to the mainline with the revision 179588. ChangeLog: 2011-10-06 Artjoms Sinkarovs * c-tree.h (c_expr_t): New typedef for struct c_expr. (C_EXPR_APPEND): New macro. * c-parser.c (c_parser_get_builtin_args): Preserve original_tree_code of c_expr structure. Fixes bconstp-3.c failure of PR50607. (c_parser_postfix_expression): Adjust to the new function. Artem.
Re: Fix pr50607 bconstp-3.c failure
On Wed, Oct 5, 2011 at 5:28 PM, Joseph S. Myers wrote: > On Wed, 5 Oct 2011, Artem Shinkarov wrote: > >> Joseph, is it possible to commit the patch together with the space fixes? > > You should not commit whitespace fixes to lines not otherwise modified by > a patch, except in a separate commit that *only* has whitespace or other > formatting fixes, so that "svn blame" results are more meaningful. That > is: commit whitespace fixes (assuming they are genuinely fixes rather than > making things worse) *separately* in a commit whose commit message makes > clear it is just whitespace fixes. Ok, I see. In the given patch all the fixes are mainly auto-generated and remove the trailing spaces. But I see your point. Thanks, Artem. > -- > Joseph S. Myers > jos...@codesourcery.com >
Re: Fix pr50607 bconstp-3.c failure
On Wed, Oct 5, 2011 at 4:22 PM, Joseph S. Myers wrote: > On Wed, 5 Oct 2011, Artem Shinkarov wrote: > >> On Tue, Oct 4, 2011 at 11:51 PM, Joseph S. Myers >> wrote: >> > On Tue, 4 Oct 2011, Artem Shinkarov wrote: >> > >> >> Hi >> >> >> >> Here is the patch tho fix bconstp-3.c failure in the bug 50607. The >> >> failure was cause because the new parser routine did not consider >> >> original_tree_code of the expression. >> >> >> >> The patch is bootstrapped on x86-unknown-linux-gnu and is being tested. >> > >> > Please repost the patch for review without the unrelated whitespace >> > changes. >> > >> > -- >> > Joseph S. Myers >> > jos...@codesourcery.com >> > >> >> Sure. The new version is in the attachment. > > This version is OK, provided that it has passed bootstrap with no > regressions and that it fixes the failures in question. > > -- > Joseph S. Myers > jos...@codesourcery.com > Thanks, I am going to apply it then, when the regtesting is over in case it would be successfull. Joseph, is it possible to commit the patch together with the space fixes? Thanks, Artem.
Re: New warning for expanded vector operations
On Wed, Oct 5, 2011 at 9:40 AM, Richard Guenther wrote: > On Wed, Oct 5, 2011 at 12:18 AM, Artem Shinkarov > wrote: >> Hi >> >> Here is a patch to inform a programmer about the expanded vector operation. >> Bootstrapped on x86-unknown-linux-gnu. >> >> ChangeLog: >> >> * gcc/tree-vect-generic.c (expand_vector_piecewise): Adjust to >> produce the warning. >> (expand_vector_parallel): Adjust to produce the warning. > > Entries start without gcc/, they are relative to the gcc/ChangeLog file. Sure, sorry. >> (lower_vec_shuffle): Adjust to produce the warning. >> * gcc/common.opt: New warning Wvector-operation-expanded. >> * gcc/doc/invoke.texi: Document the wawning. >> >> >> Ok? > > I don't like the name -Wvector-operation-expanded. We emit a > similar warning for missed inline expansions with -Winline, so > maybe -Wvector-extensions (that's the name that appears > in the C extension documentation). Hm, I don't care much about the name, unless it gets clear what the warning is used for. I am not really sure that Wvector-extensions makes it clear. Also, I don't see anything bad if the warning will pop up during the vectorisation. Any vector operation performed outside the SIMD accelerator looks suspicious, because it actually doesn't improve performance. Such a warning during the vectorisation could mean that a programmer forgot some flag, or the constant propagation failed to deliver a constant, or something else. Conceptually the text I am producing is not really a warning, it is more like an information, but I am not aware of the mechanisms that would allow me to introduce a flag triggering inform () or something similar. What I think we really need to avoid is including this warning in the standard Ox. > + location_t loc = gimple_location (gsi_stmt (*gsi)); > + > + warning_at (loc, OPT_Wvector_operation_expanded, > + "vector operation will be expanded piecewise"); > > v = VEC_alloc(constructor_elt, gc, (nunits + delta - 1) / delta); > for (i = 0; i < nunits; > @@ -260,6 +264,10 @@ expand_vector_parallel (gimple_stmt_iter > tree result, compute_type; > enum machine_mode mode; > int n_words = tree_low_cst (TYPE_SIZE_UNIT (type), 1) / UNITS_PER_WORD; > + location_t loc = gimple_location (gsi_stmt (*gsi)); > + > + warning_at (loc, OPT_Wvector_operation_expanded, > + "vector operation will be expanded in parallel"); > > what's the difference between 'piecewise' and 'in parallel'? Parallel is a little bit better for performance than piecewise. > @@ -301,16 +309,15 @@ expand_vector_addition (gimple_stmt_iter > { > int parts_per_word = UNITS_PER_WORD > / tree_low_cst (TYPE_SIZE_UNIT (TREE_TYPE (type)), 1); > + location_t loc = gimple_location (gsi_stmt (*gsi)); > > if (INTEGRAL_TYPE_P (TREE_TYPE (type)) > && parts_per_word >= 4 > && TYPE_VECTOR_SUBPARTS (type) >= 4) > - return expand_vector_parallel (gsi, f_parallel, > - type, a, b, code); > + return expand_vector_parallel (gsi, f_parallel, type, a, b, code); > else > - return expand_vector_piecewise (gsi, f, > - type, TREE_TYPE (type), > - a, b, code); > + return expand_vector_piecewise (gsi, f, type, > + TREE_TYPE (type), a, b, code); > } > > /* Check if vector VEC consists of all the equal elements and > > unless i miss something loc is unused here. Please avoid random > whitespace changes (just review your patch yourself before posting > and revert pieces that do nothing). Yes you are right, sorry. > +@item -Wvector-operation-expanded > +@opindex Wvector-operation-expanded > +@opindex Wno-vector-operation-expanded > +Warn if vector operation is not implemented via SIMD capabilities of the > +architecture. Mainly useful for the performance tuning. > > I'd mention that this is for vector operations as of the C extension > documented in "Vector Extensions". > > The vectorizer can produce some operations that will need further > lowering - we probably should make sure to _not_ warn about those. > Try running the vect.exp testsuite with the new warning turned on > (eventually disabling SSE), like with > > obj/gcc> make check-gcc > RUNTESTFLAGS="--target_board=unix/-Wvector-extensions/-mno-sse > vect.exp" Again, see the comment above. I think, if the warning can be triggered only manually, then we are fine. But I'll check anyway how many warnings I
Re: Fix pr50607 bconstp-3.c failure
On Tue, Oct 4, 2011 at 11:51 PM, Joseph S. Myers wrote: > On Tue, 4 Oct 2011, Artem Shinkarov wrote: > >> Hi >> >> Here is the patch tho fix bconstp-3.c failure in the bug 50607. The >> failure was cause because the new parser routine did not consider >> original_tree_code of the expression. >> >> The patch is bootstrapped on x86-unknown-linux-gnu and is being tested. > > Please repost the patch for review without the unrelated whitespace > changes. > > -- > Joseph S. Myers > jos...@codesourcery.com > Sure. The new version is in the attachment. Thanks, Artem. diff -up '-F^(define' gcc-bootstrap/gcc//c-parser.c gcc-new/gcc//c-parser.c --- gcc-bootstrap/gcc//c-parser.c 2011-10-05 00:09:59.067560839 +0100 +++ gcc-new/gcc//c-parser.c 2011-10-05 00:08:23.454756162 +0100 @@ -5993,16 +5993,16 @@ c_parser_alignof_expression (c_parser *p for the middle-end nodes like COMPLEX_EXPR, VEC_SHUFFLE_EXPR and others. The name of the builtin is passed using BNAME parameter. Function returns true if there were no errors while parsing and - stores the arguments in EXPR_LIST. List of original types can be - obtained by passing non NULL value to ORIG_TYPES. */ + stores the arguments in CEXPR_LIST. */ static bool c_parser_get_builtin_args (c_parser *parser, const char *bname, - VEC(tree,gc) **expr_list, - VEC(tree,gc) **orig_types) + VEC(c_expr_t,gc) **ret_cexpr_list) { location_t loc = c_parser_peek_token (parser)->location; - *expr_list = NULL; + VEC (c_expr_t,gc) *cexpr_list; + c_expr_t expr; + *ret_cexpr_list = NULL; if (c_parser_next_token_is_not (parser, CPP_OPEN_PAREN)) { error_at (loc, "cannot take address of %qs", bname); @@ -6017,14 +6017,20 @@ c_parser_get_builtin_args (c_parser *par return true; } - if (orig_types) -*expr_list = c_parser_expr_list (parser, false, false, orig_types); - else -*expr_list = c_parser_expr_list (parser, false, false, NULL); + expr = c_parser_expr_no_commas (parser, NULL); + cexpr_list = VEC_alloc (c_expr_t, gc, 1); + C_EXPR_APPEND (cexpr_list, expr); + while (c_parser_next_token_is (parser, CPP_COMMA)) +{ + c_parser_consume_token (parser); + expr = c_parser_expr_no_commas (parser, NULL); + C_EXPR_APPEND (cexpr_list, expr); +} if (!c_parser_require (parser, CPP_CLOSE_PAREN, "expected %<)%>")) return false; + *ret_cexpr_list = cexpr_list; return true; } @@ -6378,20 +6384,20 @@ c_parser_postfix_expression (c_parser *p break; case RID_CHOOSE_EXPR: { - VEC(tree,gc) *expr_list; - VEC(tree,gc) *orig_types; - tree e1value, e2value, e3value, c; + VEC (c_expr_t, gc) *cexpr_list; + c_expr_t *e1_p, *e2_p, *e3_p; + tree c; c_parser_consume_token (parser); if (!c_parser_get_builtin_args (parser, "__builtin_choose_expr", - &expr_list, &orig_types)) + &cexpr_list)) { expr.value = error_mark_node; break; } - if (VEC_length (tree, expr_list) != 3) + if (VEC_length (c_expr_t, cexpr_list) != 3) { error_at (loc, "wrong number of arguments to " "%<__builtin_choose_expr%>"); @@ -6399,31 +6405,20 @@ c_parser_postfix_expression (c_parser *p break; } - e1value = VEC_index (tree, expr_list, 0); - e2value = VEC_index (tree, expr_list, 1); - e3value = VEC_index (tree, expr_list, 2); - - c = e1value; - mark_exp_read (e2value); - mark_exp_read (e3value); + e1_p = VEC_index (c_expr_t, cexpr_list, 0); + e2_p = VEC_index (c_expr_t, cexpr_list, 1); + e3_p = VEC_index (c_expr_t, cexpr_list, 2); + + c = e1_p->value; + mark_exp_read (e2_p->value); + mark_exp_read (e3_p->value); if (TREE_CODE (c) != INTEGER_CST || !INTEGRAL_TYPE_P (TREE_TYPE (c))) error_at (loc, "first argument to %<__builtin_choose_expr%> not" " a constant"); constant_expression_warning (c); - - if (integer_zerop (c)) - { - expr.value = e3value; - expr.original_type = VEC_index (tree, orig_types, 2); - } - else - { - expr.value = e2value; - expr.original_type = VEC_index (tree, orig_types, 1); - } - +
New warning for expanded vector operations
Hi Here is a patch to inform a programmer about the expanded vector operation. Bootstrapped on x86-unknown-linux-gnu. ChangeLog: * gcc/tree-vect-generic.c (expand_vector_piecewise): Adjust to produce the warning. (expand_vector_parallel): Adjust to produce the warning. (lower_vec_shuffle): Adjust to produce the warning. * gcc/common.opt: New warning Wvector-operation-expanded. * gcc/doc/invoke.texi: Document the wawning. Ok? Thanks, Artem Shinkarov. P.S. It is hard to write a reasonable testcase for the patch, because one needs to guess which architecture would expand a given vector operation. But the patch is trivial. Index: gcc/tree-vect-generic.c === --- gcc/tree-vect-generic.c (revision 179464) +++ gcc/tree-vect-generic.c (working copy) @@ -235,6 +235,10 @@ expand_vector_piecewise (gimple_stmt_ite int delta = tree_low_cst (part_width, 1) / tree_low_cst (TYPE_SIZE (TREE_TYPE (type)), 1); int i; + location_t loc = gimple_location (gsi_stmt (*gsi)); + + warning_at (loc, OPT_Wvector_operation_expanded, + "vector operation will be expanded piecewise"); v = VEC_alloc(constructor_elt, gc, (nunits + delta - 1) / delta); for (i = 0; i < nunits; @@ -260,6 +264,10 @@ expand_vector_parallel (gimple_stmt_iter tree result, compute_type; enum machine_mode mode; int n_words = tree_low_cst (TYPE_SIZE_UNIT (type), 1) / UNITS_PER_WORD; + location_t loc = gimple_location (gsi_stmt (*gsi)); + + warning_at (loc, OPT_Wvector_operation_expanded, + "vector operation will be expanded in parallel"); /* We have three strategies. If the type is already correct, just do the operation an element at a time. Else, if the vector is wider than @@ -301,16 +309,15 @@ expand_vector_addition (gimple_stmt_iter { int parts_per_word = UNITS_PER_WORD / tree_low_cst (TYPE_SIZE_UNIT (TREE_TYPE (type)), 1); + location_t loc = gimple_location (gsi_stmt (*gsi)); if (INTEGRAL_TYPE_P (TREE_TYPE (type)) && parts_per_word >= 4 && TYPE_VECTOR_SUBPARTS (type) >= 4) -return expand_vector_parallel (gsi, f_parallel, - type, a, b, code); +return expand_vector_parallel (gsi, f_parallel, type, a, b, code); else -return expand_vector_piecewise (gsi, f, - type, TREE_TYPE (type), - a, b, code); +return expand_vector_piecewise (gsi, f, type, + TREE_TYPE (type), a, b, code); } /* Check if vector VEC consists of all the equal elements and @@ -400,8 +407,8 @@ expand_vector_operation (gimple_stmt_ite case PLUS_EXPR: case MINUS_EXPR: if (!TYPE_OVERFLOW_TRAPS (type)) - return expand_vector_addition (gsi, do_binop, do_plus_minus, type, -gimple_assign_rhs1 (assign), + return expand_vector_addition (gsi, do_binop, do_plus_minus, type, +gimple_assign_rhs1 (assign), gimple_assign_rhs2 (assign), code); break; @@ -622,6 +629,8 @@ lower_vec_shuffle (gimple_stmt_iterator return true; } + warning_at (loc, OPT_Wvector_operation_expanded, + "vector shuffling operation will be expanded piecewise"); if (operand_equal_p (vec0, vec1, 0)) { unsigned i; Index: gcc/common.opt === --- gcc/common.opt (revision 179464) +++ gcc/common.opt (working copy) @@ -694,6 +694,10 @@ Wcoverage-mismatch Common Var(warn_coverage_mismatch) Init(1) Warning Warn in case profiles in -fprofile-use do not match +Wvector-operation-expanded +Common Var(warn_vector_operation_expanded) Warning +Warn when a vector operation is expanded piecewise + Xassembler Driver Separate Index: gcc/doc/invoke.texi === --- gcc/doc/invoke.texi (revision 179464) +++ gcc/doc/invoke.texi (working copy) @@ -271,7 +271,8 @@ Objective-C and Objective-C++ Dialects}. -Wunused-label -Wunused-local-typedefs -Wunused-parameter @gol -Wno-unused-result -Wunused-value @gol -Wunused-variable @gol -Wunused-but-set-parameter -Wunused-but-set-variable @gol --Wvariadic-macros -Wvla -Wvolatile-register-var -Wwrite-strings} +-Wvariadic-macros -Wvector-operation-expanded -Wvla +-Wvolatile-register-var -Wwrite-strings} @item C and Objective-C-only Warning Options @gccoptlist{-Wbad-function-cast -Wmissing-declarations @gol @@ -4532,6 +4533,12 @@ Warn if variadic macros are used in peda alternate syntax when in pedantic ISO C99 mode. This is default. To inhibit the warning messages, use @option{-Wno-variadic-macros}.
Fix pr50607 bconstp-3.c failure
Hi Here is the patch tho fix bconstp-3.c failure in the bug 50607. The failure was cause because the new parser routine did not consider original_tree_code of the expression. The patch is bootstrapped on x86-unknown-linux-gnu and is being tested. Thanks, Artem. Index: c-parser.c === --- c-parser.c (revision 179464) +++ c-parser.c (working copy) @@ -297,7 +297,7 @@ c_lex_one_token (c_parser *parser, c_tok /* Else they are not special keywords. */ } - else if (c_dialect_objc () + else if (c_dialect_objc () && (OBJC_IS_AT_KEYWORD (rid_code) || OBJC_IS_CXX_KEYWORD (rid_code))) { @@ -689,9 +689,9 @@ c_parser_next_token_starts_declspecs (c_ setter/getter on the class. c_token_starts_declspecs() can't differentiate between the two cases because it only checks the current token, so we have a special check here. */ - if (c_dialect_objc () + if (c_dialect_objc () && token->type == CPP_NAME - && token->id_kind == C_ID_CLASSNAME + && token->id_kind == C_ID_CLASSNAME && c_parser_peek_2nd_token (parser)->type == CPP_DOT) return false; @@ -706,9 +706,9 @@ c_parser_next_tokens_start_declaration ( c_token *token = c_parser_peek_token (parser); /* Same as above. */ - if (c_dialect_objc () + if (c_dialect_objc () && token->type == CPP_NAME - && token->id_kind == C_ID_CLASSNAME + && token->id_kind == C_ID_CLASSNAME && c_parser_peek_2nd_token (parser)->type == CPP_DOT) return false; @@ -1516,7 +1516,7 @@ c_parser_declaration_or_fndef (c_parser return; if (specs->attrs) { - warning_at (c_parser_peek_token (parser)->location, + warning_at (c_parser_peek_token (parser)->location, OPT_Wattributes, "prefix attributes are ignored for methods"); specs->attrs = NULL_TREE; @@ -1551,12 +1551,12 @@ c_parser_declaration_or_fndef (c_parser return; if (specs->attrs) { - warning_at (c_parser_peek_token (parser)->location, + warning_at (c_parser_peek_token (parser)->location, OPT_Wattributes, "prefix attributes are ignored for implementations"); specs->attrs = NULL_TREE; } - c_parser_objc_class_definition (parser, NULL_TREE); + c_parser_objc_class_definition (parser, NULL_TREE); return; } break; @@ -1582,7 +1582,7 @@ c_parser_declaration_or_fndef (c_parser break; } } - + pending_xref_error (); prefix_attrs = specs->attrs; all_prefix_attrs = prefix_attrs; @@ -1597,7 +1597,7 @@ c_parser_declaration_or_fndef (c_parser should diagnose if there were no declaration specifiers) or a function definition (in which case the diagnostic for implicit int suffices). */ - declarator = c_parser_declarator (parser, + declarator = c_parser_declarator (parser, specs->typespec_kind != ctsk_none, C_DTR_NORMAL, &dummy); if (declarator == NULL) @@ -1657,13 +1657,13 @@ c_parser_declaration_or_fndef (c_parser if (d) finish_decl (d, UNKNOWN_LOCATION, NULL_TREE, NULL_TREE, asm_name); - + if (c_parser_next_token_is_keyword (parser, RID_IN)) { if (d) *objc_foreach_object_declaration = d; else - *objc_foreach_object_declaration = error_mark_node; + *objc_foreach_object_declaration = error_mark_node; } } if (c_parser_next_token_is (parser, CPP_COMMA)) @@ -3084,7 +3084,7 @@ c_parser_parms_declarator (c_parser *par && !attrs && c_parser_next_token_is (parser, CPP_NAME) && c_parser_peek_token (parser)->id_kind == C_ID_ID - + /* Look ahead to detect typos in type names. */ && c_parser_peek_2nd_token (parser)->type != CPP_NAME && c_parser_peek_2nd_token (parser)->type != CPP_MULT @@ -3478,11 +3478,11 @@ c_parser_attributes (c_parser *parser) /* Parse the attribute contents. If they start with an identifier which is followed by a comma or close parenthesis, then the arguments start with that -identifier; otherwise they are an expression list. +identifier; otherwise they are an expression list. In objective-c the identifier may be a classname. */ if (c_parser_next_token_is (parser, CPP_NAME) && (c_parser_p
Re: Vector shuffling
Ping. Richard, the patch in the attachment should be submitted asap. The other problem could wait for a while. Thanks, Artem. On Tue, Oct 4, 2011 at 12:04 AM, Artem Shinkarov wrote: > On Mon, Oct 3, 2011 at 6:12 PM, Richard Henderson wrote: >> On 10/03/2011 09:43 AM, Artem Shinkarov wrote: >>> Hi, Richard >>> >>> There is a problem with the testcases of the patch you have committed >>> for me. The code in every test-case is doubled. Could you please, >>> apply the following patch, otherwise it would fail all the tests from >>> the vector-shuffle-patch would fail. >> >> Huh. Dunno what happened there. Fixed. >> >>> Also, if it is possible, could you change my name from in the >>> ChangeLog from "Artem Shinkarov" to "Artjoms Sinkarovs". The last >>> version is the way I am spelled in the passport, and the name I use in >>> the ChangeLog. >> >> Fixed. >> >> >> r~ >> > > Richard, there was a problem causing segfault in ix86_expand_vshuffle > which I have fixed with the patch attached. > > Another thing I cannot figure out is the following case: > #define vector(elcount, type) \ > __attribute__((vector_size((elcount)*sizeof(type type > > vector (8, short) __attribute__ ((noinline)) > f (vector (8, short) x, vector (8, short) y, vector (8, short) mask) { > return __builtin_shuffle (x, y, mask); > } > > int main (int argc, char *argv[]) { > vector (8, short) v0 = {argc, 1,2,3,4,5,6,7}; > vector (8, short) v1 = {argc, 1,argc,3,4,5,argc,7}; > vector (8, short) mask0 = {0,2,3,1,4,5,6,7}; > vector (8, short) v2; > int i; > > v2 = f (v0, v1, mask0); > /* v2 = __builtin_shuffle (v0, v1, mask0); */ > for (i = 0; i < 8; i ++) > __builtin_printf ("%i, ", v2[i]); > > return 0; > } > > I am compiling with support of ssse3, in my case it is ./xgcc -B. b.c > -O3 -mtune=core2 -march=core2 > > And I get 1, 1, 1, 3, 4, 5, 1, 7, on the output, which is wrong. > > But if I will call __builtin_shuffle directly, then the answer is correct. > > Any ideas? > > > Thanks, > Artem. >
Re: Vector shuffling
On Mon, Oct 3, 2011 at 6:12 PM, Richard Henderson wrote: > On 10/03/2011 09:43 AM, Artem Shinkarov wrote: >> Hi, Richard >> >> There is a problem with the testcases of the patch you have committed >> for me. The code in every test-case is doubled. Could you please, >> apply the following patch, otherwise it would fail all the tests from >> the vector-shuffle-patch would fail. > > Huh. Dunno what happened there. Fixed. > >> Also, if it is possible, could you change my name from in the >> ChangeLog from "Artem Shinkarov" to "Artjoms Sinkarovs". The last >> version is the way I am spelled in the passport, and the name I use in >> the ChangeLog. > > Fixed. > > > r~ > Richard, there was a problem causing segfault in ix86_expand_vshuffle which I have fixed with the patch attached. Another thing I cannot figure out is the following case: #define vector(elcount, type) \ __attribute__((vector_size((elcount)*sizeof(type type vector (8, short) __attribute__ ((noinline)) f (vector (8, short) x, vector (8, short) y, vector (8, short) mask) { return __builtin_shuffle (x, y, mask); } int main (int argc, char *argv[]) { vector (8, short) v0 = {argc, 1,2,3,4,5,6,7}; vector (8, short) v1 = {argc, 1,argc,3,4,5,argc,7}; vector (8, short) mask0 = {0,2,3,1,4,5,6,7}; vector (8, short) v2; int i; v2 = f (v0, v1, mask0); /* v2 = __builtin_shuffle (v0, v1, mask0); */ for (i = 0; i < 8; i ++) __builtin_printf ("%i, ", v2[i]); return 0; } I am compiling with support of ssse3, in my case it is ./xgcc -B. b.c -O3 -mtune=core2 -march=core2 And I get 1, 1, 1, 3, 4, 5, 1, 7, on the output, which is wrong. But if I will call __builtin_shuffle directly, then the answer is correct. Any ideas? Thanks, Artem. Index: gcc/config/i386/i386.c === --- gcc/config/i386/i386.c (revision 179464) +++ gcc/config/i386/i386.c (working copy) @@ -19312,14 +19312,17 @@ ix86_expand_vshuffle (rtx operands[]) xops[1] = operands[1]; xops[2] = operands[2]; xops[3] = gen_rtx_EQ (mode, mask, w_vector); - xops[4] = t1; - xops[5] = t2; + xops[4] = t2; + xops[5] = t1; return ix86_expand_int_vcond (xops); } - /* mask = mask * {w, w, ...} */ - new_mask = expand_simple_binop (maskmode, MULT, new_mask, w_vector, + /* mask = mask * {16/w, 16/w, ...} */ + for (i = 0; i < w; i++) +vec[i] = GEN_INT (16/w); + vt = gen_rtx_CONST_VECTOR (maskmode, gen_rtvec_v (w, vec)); + new_mask = expand_simple_binop (maskmode, MULT, new_mask, vt, NULL_RTX, 0, OPTAB_DIRECT); /* Convert mask to vector of chars. */ @@ -19332,7 +19335,7 @@ ix86_expand_vshuffle (rtx operands[]) ... */ for (i = 0; i < w; i++) for (j = 0; j < 16/w; j++) - vec[i*w+j] = GEN_INT (i*16/w); + vec[i*(16/w)+j] = GEN_INT (i*16/w); vt = gen_rtx_CONST_VECTOR (V16QImode, gen_rtvec_v (16, vec)); vt = force_reg (V16QImode, vt); @@ -19344,7 +19347,7 @@ ix86_expand_vshuffle (rtx operands[]) new_mask = new_mask + {0,1,..,16/w, 0,1,..,16/w, ...} */ for (i = 0; i < w; i++) for (j = 0; j < 16/w; j++) - vec[i*w+j] = GEN_INT (j); + vec[i*(16/w)+j] = GEN_INT (j); vt = gen_rtx_CONST_VECTOR (V16QImode, gen_rtvec_v (16, vec)); new_mask = expand_simple_binop (V16QImode, PLUS, new_mask, vt, @@ -19386,8 +19389,8 @@ ix86_expand_vshuffle (rtx operands[]) xops[1] = operands[1]; xops[2] = operands[2]; xops[3] = gen_rtx_EQ (mode, mask, w_vector); - xops[4] = t1; - xops[5] = t2; + xops[4] = t2; + xops[5] = t1; return ix86_expand_int_vcond (xops); }
Re: Vector Shuffle plans
On Mon, Oct 3, 2011 at 8:02 PM, Richard Henderson wrote: > On 10/03/2011 11:40 AM, Artem Shinkarov wrote: >> Currently if vec_perm_ok returns false, we do not try to use a new >> vshuffle routine. Would it make sense to implement that? The only >> potential problem I can see is a possible performance degradation. >> This leads us to the second issue. > > Implement that where? In the vectorizer? No, I don't think so. > The _ok routine, while also indicating what the backend expander > supports, could also be thought of as a cost cutoff predicate. > Unless the vectorization folk request some more exact cost metric > I don't see any reason to change this. I was thinking more about the expander of the backend itself. When we throw sorry () in the ix86_expand_vec_perm_builtin, we can fall back to the vshuffle routine, unless it would lead to the performance degradation. >> When we perform vshuffle, we need to know whether it make sense to use >> pshufb (in case of x86) or to perform data movement via standard >> non-simd registers. Do we have this information in the current >> cost-model? > > Not really. Again, if you're talking about the vectorizer, it > gets even more complicated than this because... > >> Also, in certain cases, when the mask is constant, I would >> assume the memory movement is also faster. For example if the mask is >> {4,5,6,7,0,1,2,3...}, then two integer moves should do a better job. > > ... even before SSSE3 PSHUFB, we have all sorts of insns that can > perform a constant shuffle without having to resort to either > general-purpose registers or memory. E.g. PSHUFD. For specific > data types, we can handle arbitrary constant shuffle with 1 or 2 > insns, even when arbitrary variable shuffles aren't. But these cases are more or less covered. I am thinking about the cases when vec_perm_ok, returns false, but the actual permutation could be done faster with memory/register transfers, rather than with the PSHUFB & Co. > It's certainly something that we could add to tree-vect-generic.c. > I have no plans to do anything of the sort, however. I didn't quite understand what do you think can be added to the tree-vect-generic? I thought that we are talking about more or less backend issues. In any case I am investigating these problems, and I will appreciate any help or advices. Thanks, Artem.
Re: Vector Shuffle plans
On Mon, Oct 3, 2011 at 7:07 PM, Richard Henderson wrote: > On 10/03/2011 10:42 AM, David Miller wrote: >>> You might have a look at the "Vector Shuffle" thread, where we've been >>> trying to provide builtin-level access to this feature. We've not added >>> an rtx-level code for this because so far there isn't *that* much in >>> common between the various cpus. They all seem to differ in niggling >>> details... >>> >>> You'll have a somewhat harder time than i386 for this feature, given >>> that you've got to pack bytes into nibbles. But it can certainly be done. >> >> Ok, I'll take a look. > > Oh, you should know that, at present, our generic shuffle support assumes > that shuffles with a constant control (which are also generated by the > vectorizer) get expanded to builtins. And as builtins we wind up with > lots of them -- one per type. > > I'm going to start fixing that in the coming week. > > The vectorizer will be changed to emit VEC_SHUFFLE_EXPR. It will still use > the target hook to see if the constant shuffle is supported. > > The lower-vector pass currently tests the target hook and swaps the > VEC_SHUFFLE_EXPRs that are validate into builtins. That will be changed > to simply leave them unchanged if the other target hook returns NULL. > As the targets are updated to use vshuffle, the builtins get deleted > to return NULL. After all targets are updated, we can remove this check > and the target hook itself. This should preserve bisection on each of > the affected targets. > > The rtl expander won't have to change. > > The target backends will need to accept an immediate for vshuffle op3, > if anything special ought to be done for constant shuffles. In addition, > the builtins should be removed, as previously noted. > > > r~ > Several orthogonal vector-shuffling issues. Currently if vec_perm_ok returns false, we do not try to use a new vshuffle routine. Would it make sense to implement that? The only potential problem I can see is a possible performance degradation. This leads us to the second issue. When we perform vshuffle, we need to know whether it make sense to use pshufb (in case of x86) or to perform data movement via standard non-simd registers. Do we have this information in the current cost-model? Also, in certain cases, when the mask is constant, I would assume the memory movement is also faster. For example if the mask is {4,5,6,7,0,1,2,3...}, then two integer moves should do a better job. Were there any attempts to perform such an analysis, and if not, should we formalise the cases when the substitution of sorts would make some sense. Thanks, Artem.
Re: Vector shuffling
On Mon, Oct 3, 2011 at 6:12 PM, Richard Henderson wrote: > On 10/03/2011 09:43 AM, Artem Shinkarov wrote: >> Hi, Richard >> >> There is a problem with the testcases of the patch you have committed >> for me. The code in every test-case is doubled. Could you please, >> apply the following patch, otherwise it would fail all the tests from >> the vector-shuffle-patch would fail. > > Huh. Dunno what happened there. Fixed. > This is a common pattern, when the patch adds new files and you apply the same patch to the code-base second time. In that case the content of the files is doubled. This is an annoying feature of svn. May be there is a solution to the problem, but I never managed to find one. >> Also, if it is possible, could you change my name from in the >> ChangeLog from "Artem Shinkarov" to "Artjoms Sinkarovs". The last >> version is the way I am spelled in the passport, and the name I use in >> the ChangeLog. > > Fixed. Thank you very much. Artem. > > > r~ >
Re: Vector shuffling
Hi, Richard There is a problem with the testcases of the patch you have committed for me. The code in every test-case is doubled. Could you please, apply the following patch, otherwise it would fail all the tests from the vector-shuffle-patch would fail. Also, if it is possible, could you change my name from in the ChangeLog from "Artem Shinkarov" to "Artjoms Sinkarovs". The last version is the way I am spelled in the passport, and the name I use in the ChangeLog. Thanks, Artem. On Mon, Oct 3, 2011 at 4:13 PM, Richard Henderson wrote: > On 10/03/2011 05:14 AM, Artem Shinkarov wrote: >> Hi, can anyone commit it please? >> >> Richard? >> Or may be Richard? > > Committed. > > > r~ > Index: gcc/testsuite/gcc.c-torture/execute/vect-shuffle-1.c === --- gcc/testsuite/gcc.c-torture/execute/vect-shuffle-1.c(revision 179464) +++ gcc/testsuite/gcc.c-torture/execute/vect-shuffle-1.c(working copy) @@ -17,55 +17,9 @@ int main (int argc, char *argv[]) { /*vector (8, short) v0 = {argc, 1,2,3,4,5,6,7}; vector (8, short) v1 = {argc, 1,argc,3,4,5,argc,7}; vector (8, short) v2; - -vector (8, short) smask = {0,0,1,2,3,4,5,6}; - -v2 = __builtin_shuffle (v0, smask); -shufcompare (short, 8, v2, v0, smask); -v2 = __builtin_shuffle (v0, v1); -shufcompare (short, 8, v2, v0, v1); -v2 = __builtin_shuffle (smask, v0); -shufcompare (short, 8, v2, smask, v0);*/ - -vector (4, int) i0 = {argc, 1,2,3}; -vector (4, int) i1 = {argc, 1, argc, 3}; -vector (4, int) i2; - -vector (4, int) imask = {0,3,2,1}; - -/*i2 = __builtin_shuffle (i0, imask); -shufcompare (int, 4, i2, i0, imask);*/ -i2 = __builtin_shuffle (i0, i1); -shufcompare (int, 4, i2, i0, i1); - -i2 = __builtin_shuffle (imask, i0); -shufcompare (int, 4, i2, imask, i0); - -return 0; -} - -#define vector(elcount, type) \ -__attribute__((vector_size((elcount)*sizeof(type type - -#define vidx(type, vec, idx) (*(((type *) &(vec)) + idx)) -#define shufcompare(type, count, vres, v0, mask) \ -do { \ -int __i; \ -for (__i = 0; __i < count; __i++) { \ -if (vidx(type, vres, __i) != vidx(type, v0, vidx(type, mask, __i))) \ -__builtin_abort (); \ -} \ -} while (0) - - -int main (int argc, char *argv[]) { -/*vector (8, short) v0 = {argc, 1,2,3,4,5,6,7}; -vector (8, short) v1 = {argc, 1,argc,3,4,5,argc,7}; -vector (8, short) v2; - vector (8, short) smask = {0,0,1,2,3,4,5,6}; - + v2 = __builtin_shuffle (v0, smask); shufcompare (short, 8, v2, v0, smask); v2 = __builtin_shuffle (v0, v1); @@ -83,10 +37,10 @@ int main (int argc, char *argv[]) { shufcompare (int, 4, i2, i0, imask);*/ i2 = __builtin_shuffle (i0, i1); shufcompare (int, 4, i2, i0, i1); - + i2 = __builtin_shuffle (imask, i0); shufcompare (int, 4, i2, imask, i0); - + return 0; } Index: gcc/testsuite/gcc.c-torture/execute/vect-shuffle-2.c === --- gcc/testsuite/gcc.c-torture/execute/vect-shuffle-2.c(revision 179464) +++ gcc/testsuite/gcc.c-torture/execute/vect-shuffle-2.c(working copy) @@ -42,47 +42,3 @@ int main (int argc, char *argv[]) { return 0; } -#define vector(elcount, type) \ -__attribute__((vector_size((elcount)*sizeof(type type - -#define vidx(type, vec, idx) (*(((type *) &(vec)) + idx)) - -#define shuf2compare(type, count, vres, v0, v1, mask) \ -do { \ -int __i; \ -for (__i = 0; __i < count; __i++) { \ -if (vidx(type, vres, __i) != ((vidx(type, mask, __i) < count) ? \ - vidx(type, v0, vidx(type, mask, __i)) : \ - vidx(type, v1, (vidx(type, mask, __i) - count \ -__builtin_abort (); \ -} \ -} while (0) - - -int main (int argc, char *argv[]) { -vector (8, short) v0 = {5, 5,5,5,5,5,argc,7}; -vector (8, short) v1 = {argc, 1,8,8,4,9,argc,4}; -vector (8, short) v2; - -//vector (8, short) mask = {1,2,5,4,3,6,7}; - -vector (8, short) mask0 = {0,2,3,1,4,5,6,7}; -vector (8, short) mask1 = {0,12,3,4,3,0,10,9}; - -vector (8, short) mask2 = {0,8,1,9,2,10,3,11}; - -v2 = __builtin_shuffle (v0, v1, mask0); -shuf2compare (short, 8, v2, v0, v1, mask0); - -v2 = __builtin_shuffle (v0, v1, mask1); -shuf2compare (short, 8, v2, v0, v1, mask1); - -v2 = __builtin_shuffle (v0, v1, mask2); -shuf2compare (short, 8, v2, v0, v1, mask2); - -v2 = __builtin_shuffle (mask0, mask0, v0); -shuf2compare (short, 8, v2, mask0, mask0, v0); - -return 0; -} - Index: gcc/testsuite/gcc.c-torture/execute/vect-shuffle-3.c === --- gcc/testsuite/gcc.c-torture/execute/vect-shuffle-3.c
Re: Vector shuffling
Hi, can anyone commit it please? Richard? Or may be Richard? Thanks, Artem. On Sat, Oct 1, 2011 at 12:21 AM, Artem Shinkarov wrote: > Sorry for that, the vector comparison was submitted earlier. In the > attachment there is a new version of the patch against the latest > checkout. > > Richard, can you have a look at the genopinit.c, I am using > set_direct_optab_handler, is it correct? > > All the rest seems to be the same. > > > Thanks, > Artem. > > > On Fri, Sep 30, 2011 at 10:24 PM, Richard Henderson wrote: >> On 09/30/2011 12:14 PM, Artem Shinkarov wrote: >>> Ok, in the attachment there is a patch which fixes mentioned errors. >> >> The changes are ok. I would have committed it for you, only the patch >> isn't against mainline. There are 4 rejects. >> >> >> r~ >> >
Re: Vector Comparison patch
On Fri, Sep 30, 2011 at 4:54 PM, Jakub Jelinek wrote: > On Fri, Sep 30, 2011 at 04:48:41PM +0100, Artem Shinkarov wrote: >> Most likely we can. The question is what do we really want to check >> with this test. My intention was to check that a programmer can >> statically get correspondence of the types, in a sense that sizeof >> (float) == sizeof (int) and sizeof (double) == sizeof (long long). As >> it seems my original assumption does not hold. Before using __typeof, >> I would try to make sure that there is no other way to determine these >> correspondences. > > You can use preprocessor too, either just surround the whole test > with #if __SIZEOF_INT__ == __SIZEOF_FLOAT__ and similar, > or select the right type through preprocessor > #if __SIZEOF_INT__ == __SIZEOF_FLOAT__ > #define FLOATCMPTYPE int > #elif __SIZEOF_LONG__ == __SIZEOF_FLOAT__ > #define FLOATCMPTYPE long > #else > ... > or __typeof, etc. > > Jakub > Ok, here is a patch which uses __typeof. Passes on x86_64. Artem. Index: gcc/testsuite/gcc.c-torture/execute/vector-compare-1.c === --- gcc/testsuite/gcc.c-torture/execute/vector-compare-1.c (revision 179378) +++ gcc/testsuite/gcc.c-torture/execute/vector-compare-1.c (working copy) @@ -39,17 +39,17 @@ int main (int argc, char *argv[]) { int i; i0 = (vector (4, INT)){argc, 1, 2, 10}; -i1 = (vector (4, INT)){0, 3, 2, (INT)-23}; +i1 = (vector (4, INT)){0, 3, 2, (INT)-23}; test (4, i0, i1, ires, "%i"); #undef INT -#define INT unsigned int +#define INT unsigned int vector (4, int) ures; vector (4, INT) u0; vector (4, INT) u1; u0 = (vector (4, INT)){argc, 1, 2, 10}; -u1 = (vector (4, INT)){0, 3, 2, (INT)-23}; +u1 = (vector (4, INT)){0, 3, 2, (INT)-23}; test (4, u0, u1, ures, "%u"); #undef INT @@ -60,7 +60,7 @@ int main (int argc, char *argv[]) { vector (8, short) sres; s0 = (vector (8, SHORT)){argc, 1, 2, 10, 6, 87, (SHORT)-5, 2}; -s1 = (vector (8, SHORT)){0, 3, 2, (SHORT)-23, 12, 10, (SHORT)-2, 0}; +s1 = (vector (8, SHORT)){0, 3, 2, (SHORT)-23, 12, 10, (SHORT)-2, 0}; test (8, s0, s1, sres, "%i"); #undef SHORT @@ -70,7 +70,7 @@ int main (int argc, char *argv[]) { vector (8, short) usres; us0 = (vector (8, SHORT)){argc, 1, 2, 10, 6, 87, (SHORT)-5, 2}; -us1 = (vector (8, SHORT)){0, 3, 2, (SHORT)-23, 12, 10, (SHORT)-2, 0}; +us1 = (vector (8, SHORT)){0, 3, 2, (SHORT)-23, 12, 10, (SHORT)-2, 0}; test (8, us0, us1, usres, "%u"); #undef SHORT @@ -102,19 +102,19 @@ int main (int argc, char *argv[]) { /* Float comparison. */ vector (4, float) f0; vector (4, float) f1; -vector (4, int) ifres; +__typeof (f0 == f1) ifres; f0 = (vector (4, float)){(float)argc, 1., 2., 10.}; -f1 = (vector (4, float)){0., 3., 2., (float)-23}; +f1 = (vector (4, float)){0., 3., 2., (float)-23}; test (4, f0, f1, ifres, "%f"); - + /* Double comparison. */ vector (2, double) d0; vector (2, double) d1; -vector (2, long long) idres; +__typeof (d0 == d1) idres; d0 = (vector (2, double)){(double)argc, 10.}; -d1 = (vector (2, double)){0., (double)-23}; +d1 = (vector (2, double)){0., (double)-23}; test (2, d0, d1, idres, "%f");
Re: Vector Comparison patch
On Fri, Sep 30, 2011 at 4:43 PM, Jakub Jelinek wrote: > On Fri, Sep 30, 2011 at 05:36:47PM +0200, Georg-Johann Lay wrote: >> >> The target has >> >> >> >> 2 = sizeof (short) >> >> 2 = sizeof (int) >> >> 4 = sizeof (long int) >> >> 8 = sizeof (long long int) >> >> >> >> Could you fix that? I.e. parametrize sizeof(int) out or skip the test by >> >> means of >> >> >> >> /* { dg-require-effective-target int32plus } */ >> >> >> >> or similar. >> >> >> >> Thanks, Johann >> >> >> >> [...] >> >> >> > The problem actually happens when we compare float vector with float >> > vector, it is assumed that we should get int vector as a result, but >> > it turns out that we are getting long int. >> > >> > The same with double, we assume that sizeof (double) == sizeof (long >> > long). But as it seems double has the same size as float. >> >> Yes. >> >> sizeof(double) = sizeof(float) = 4 >> >> > Hm, I can put conditional of sort: >> > if (sizeof (doulbe) == sizeof (long long)) and others. Or may be there >> > is more elegant way of solving this? >> >> That's too late because this won't prevent the compiler from error. >> The error already happens at compile time, not at run time. > > Isn't it possible to do something like: > vector (4, float) f0; > vector (4, float) f1; > - vector (4, int) ifres; > + vector (4, __typeof (f0 > f1)) ifres; > > f0 = (vector (4, float)){(float)argc, 1., 2., 10.}; > f1 = (vector (4, float)){0., 3., 2., (float)-23}; > test (4, f0, f1, ifres, "%f"); > > /* Double comparison. */ > vector (2, double) d0; > vector (2, double) d1; > - vector (2, long long) idres; > + vector (2, __typeof (d0 > d1)) idres; > > d0 = (vector (2, double)){(double)argc, 10.}; > d1 = (vector (2, double)){0., (double)-23}; > test (2, d0, d1, idres, "%f"); > > Jakub > Most likely we can. The question is what do we really want to check with this test. My intention was to check that a programmer can statically get correspondence of the types, in a sense that sizeof (float) == sizeof (int) and sizeof (double) == sizeof (long long). As it seems my original assumption does not hold. Before using __typeof, I would try to make sure that there is no other way to determine these correspondences. Artem.
Re: Vector Comparison patch
On Fri, Sep 30, 2011 at 4:01 PM, Georg-Johann Lay wrote: > Artem Shinkarov schrieb: >> Here is a new version of the patch which considers the changes from >> 2011-09-02 Richard Guenther >> >> >> ChangeLog >> >> 20011-09-06 Artjoms Sinkarovs >> >> gcc/ >> * fold-const.c (constant_boolean_node): Adjust the meaning >> of boolean for vector types: true = {-1,..}, false = {0,..}. >> (fold_unary_loc): Avoid conversion of vector comparison to >> boolean type. >> * expr.c (expand_expr_real_2): Expand vector comparison by >> building an appropriate VEC_COND_EXPR. >> * c-typeck.c (build_binary_op): Typecheck vector comparisons. >> (c_objc_common_truthvalue_conversion): Adjust. >> * tree-vect-generic.c (do_compare): Helper function. >> (expand_vector_comparison): Check if hardware supports >> vector comparison of the given type or expand vector >> piecewise. >> (expand_vector_operation): Treat comparison as binary >> operation of vector type. >> (expand_vector_operations_1): Adjust. >> * tree-cfg.c (verify_gimple_comparison): Adjust. >> >> gcc/config/i386 >> * i386.c (ix86_expand_sse_movcc): Consider a case when >> vcond operators are {-1,..} and {0,..}. >> >> gcc/doc >> * extend.texi: Adjust. >> >> gcc/testsuite >> * gcc.c-torture/execute/vector-compare-1.c: New test. >> * gcc.c-torture/execute/vector-compare-2.c: New test. >> * gcc.dg/vector-compare-1.c: New test. >> * gcc.dg/vector-compare-2.c: New test. >> >> bootstrapped and tested on x86_64-unknown-linux-gnu. >> >> >> Thanks, >> Artem. > > Hi Artem, > > the new test case gcc.c-torture/execute/vector-compare-1.c causes bunch of > FAILS in regression tests for avr-unknown-none (see attachment). > > The target has > > 2 = sizeof (short) > 2 = sizeof (int) > 4 = sizeof (long int) > 8 = sizeof (long long int) > > Could you fix that? I.e. parametrize sizeof(int) out or skip the test by > means of > > /* { dg-require-effective-target int32plus } */ > > or similar. > > Thanks, Johann > > > > > > > > > ./gcc/testsuite/gcc.c-torture/execute/vector-compare-1.c: In function 'main': > ./gcc/testsuite/gcc.c-torture/execute/vector-compare-1.c:109:5: error: > incompatible types when assigning to type '__vector(4) int' from type > '__vector(4) long int' > ./gcc/testsuite/gcc.c-torture/execute/vector-compare-1.c:109:5: error: > incompatible types when assigning to type '__vector(4) int' from type > '__vector(4) long int' > ./gcc/testsuite/gcc.c-torture/execute/vector-compare-1.c:109:5: error: > incompatible types when assigning to type '__vector(4) int' from type > '__vector(4) long int' > ./gcc/testsuite/gcc.c-torture/execute/vector-compare-1.c:109:5: error: > incompatible types when assigning to type '__vector(4) int' from type > '__vector(4) long int' > ./gcc/testsuite/gcc.c-torture/execute/vector-compare-1.c:109:5: error: > incompatible types when assigning to type '__vector(4) int' from type > '__vector(4) long int' > ./gcc/testsuite/gcc.c-torture/execute/vector-compare-1.c:109:5: error: > incompatible types when assigning to type '__vector(4) int' from type > '__vector(4) long int' > ./gcc/testsuite/gcc.c-torture/execute/vector-compare-1.c:118:5: error: > incompatible types when assigning to type '__vector(2) long long int' from > type '__vector(2) long int' > ./gcc/testsuite/gcc.c-torture/execute/vector-compare-1.c:118:5: error: > incompatible types when assigning to type '__vector(2) long long int' from > type '__vector(2) long int' > ./gcc/testsuite/gcc.c-torture/execute/vector-compare-1.c:118:5: error: > incompatible types when assigning to type '__vector(2) long long int' from > type '__vector(2) long int' > ./gcc/testsuite/gcc.c-torture/execute/vector-compare-1.c:118:5: error: > incompatible types when assigning to type '__vector(2) long long int' from > type '__vector(2) long int' > ./gcc/testsuite/gcc.c-torture/execute/vector-compare-1.c:118:5: error: > incompatible types when assigning to type '__vector(2) long long int' from > type '__vector(2) long int' > ./gcc/testsuite/gcc.c-torture/execute/vector-compare-1.c:118:5: error: > incompatible types when assigning to type &
Re: Scalar vector binary operation
Hi In the attachment there is a patch with the changes. I don't have rights to commit to the cvs. If we write about vector-related stuff, may be it would make sense to mention that from now on we can make vector shifts and we can index vector. These are changes from 4.6 (I guess), but they were never mentioned in any changes. Thanks, Artem. On Wed, Sep 28, 2011 at 3:46 PM, Ian Lance Taylor wrote: > Artem Shinkarov writes: > >> I can try to put a description in the document. I am not sure that I >> have rights to commit to the svn, but at least I can try to write the >> text. >> >> There are also pending patches for vector-comparison (almost >> submitted) and vector shuffling (still under discussion), but I hope >> to finish both of them until the new release is coming. >> >> Could you point out where the cnhanges.html is located in svn? > > It's actually still in CVS, at gcc.gnu.org:/cvs/gcc. > > Ian > Index: htdocs/gcc-4.7/changes.html === RCS file: /cvs/gcc/wwwdocs/htdocs/gcc-4.7/changes.html,v retrieving revision 1.38 diff -u -r1.38 changes.html --- htdocs/gcc-4.7/changes.html 27 Sep 2011 18:42:56 - 1.38 +++ htdocs/gcc-4.7/changes.html 28 Sep 2011 18:52:33 - @@ -150,6 +150,18 @@ through which the compiler can be hinted about pointer alignment and can use it to improve generated code. + When a binary operation performed on vector types and one of the operands + is a uniform vector it is possible to replace the vector with the + generating element. For example: + +typedef int v4si __attribute__ ((vector_size (16))); +v4si res, a = {1,2,3,4}; +int x; + +res = 2 + a; /* means {2,2,2,2} + a */ +res = a - x; /* means a - {x,x,x,x} */ + + C++
Re: Scalar vector binary operation
Ian I can try to put a description in the document. I am not sure that I have rights to commit to the svn, but at least I can try to write the text. There are also pending patches for vector-comparison (almost submitted) and vector shuffling (still under discussion), but I hope to finish both of them until the new release is coming. Could you point out where the cnhanges.html is located in svn? Thanks, Artem.
Re: Vector shuffling
On Thu, Sep 15, 2011 at 8:05 PM, Richard Henderson wrote: >> +The elements of the input vectors are numbered from left to right across >> +one or both of the vectors. Each element in the mask specifies a number >> +of element from the input vector(s). Consider the following example. > > It would be more preferable to talk about the memory ordering of the > elements rather than "left" and "right" which are ambiguous at best. > >> + if (TREE_CODE (mask) == VECTOR_CST) >> + { >> + tree m_type, call; >> + tree fn = targetm.vectorize.builtin_vec_perm (TREE_TYPE (v0), >> &m_type); >> + /*rtx t;*/ > > Leftover crap. Fixed. >> + >> + if (!fn) >> + goto vshuffle; >> + >> + if (m_type != TREE_TYPE (TREE_TYPE (mask))) >> + { >> + int units = TYPE_VECTOR_SUBPARTS (TREE_TYPE (mask)); >> + tree cvt = build_vector_type (m_type, units); >> + mask = fold_convert (cvt, mask); >> + } >> + >> + call = fold_build1 (ADDR_EXPR, build_pointer_type (TREE_TYPE (fn)), >> fn); >> + call = build_call_nary (type /* ? */, call, 3, v0, v1, mask); >> + >> + return expand_expr_real_1 (call, target, VOIDmode, EXPAND_NORMAL, >> NULL); >> + } >> + >> +vshuffle: >> + gcc_assert (operand_equal_p (v0, v1, 0)); > > Why can't a non-constant shuffle have different V0 and V1? That seems > like a direct violation of the documentation, and any sort of usefulness. Ok, I agree. The reason why this assert is here is that noone in the middle-end generates the code that does not meet this assert. In principle we definitely want to support it in the upcoming patches, but it would be nice to start with a simple thing. > Also, while I'm ok with the use of builtin_vec_perm here in the short > term, I think that in the long term we should simply force the named > pattern to handle constants. Then the vectorizer can simply use the > predicate and the tree code and we can drop the large redundancy of > builtins with different argument types. > > Indeed, once this patch is applied, I think that ought to be the very > next task in this domain. > >> +/* Vector shuffle expression. A = VEC_SHUFFLE_EXPR > > Typo in "mask". > >> + foreach i in length (mask): >> + A = mask[i] < length (v0) ? v0[mask[i]] : v1[mask[i]] > > Surely it's v1[mask[i] - length]. > >> + if (TREE_CODE (vect) == VECTOR_CST) >> + { >> + unsigned i; > > Indentation is off all through this function. Fixed. >> + mask = gen_rtx_AND (maskmode, mask, mm); >> + >> + /* Convert mask to vector of chars. */ >> + mask = simplify_gen_subreg (V16QImode, mask, maskmode, 0); >> + mask = force_reg (V16QImode, mask); > > Why are you using force_reg to do all the dirty work? Seems to > me this should be using expand_normal. All throughout this > function. That would also avoid the need for all of the extra > force_reg stuff that ought not be there for -O0. I don't really understand this. As far as I know, expand_normal "converts" tree to rtx. All my computations are happening at the level of rtx and force_reg is needed just to bring an rtx expression to the register of the correct mode. If I am missing something, could you give an example how can I use expand_normal instead of force_reg in this particular code. > I also see that you're not even attempting to use xop_pperm. As I said, I am happy to experiment with the cases v0 != v1 in the upcoming patches. Let's just start with a simple thing and see what kind of issues/problems it would bring. > Is ssse3_pshufb why you do the wrong thing in the expander for v0 != v1? My personal feeling is that it may be the case with v0 != v1, that it would be more efficient to perform piecewise shuffling rather than bitwise dances around the masks. > And give the vshuffle named pattern the wrong number of arguments? Ok, If I'll make vshuffle to accept only two arguments -- vector and mask, would it be ok? > It's certainly possible to handle it, though it takes a few more steps, > and might well be more efficient as a libgcc function rather than inline. I don't really understand why it could be more efficient. I thought that inline gives more chances to the final RTL optimisation. > > > r~ > > > Thanks, Artem.
Re: Vector Comparison patch
On Tue, Sep 6, 2011 at 3:56 PM, Richard Guenther wrote: > On Tue, Sep 6, 2011 at 4:50 PM, Artem Shinkarov > wrote: >> Here is a new version of the patch which considers the changes from >> 2011-09-02 Richard Guenther >> >> >> ChangeLog >> >> 20011-09-06 Artjoms Sinkarovs >> >> gcc/ >> * fold-const.c (constant_boolean_node): Adjust the meaning >> of boolean for vector types: true = {-1,..}, false = {0,..}. >> (fold_unary_loc): Avoid conversion of vector comparison to >> boolean type. > > Both changes have already been done. I missed the way you applied constant_boolean node, sorry for that. But fold_unary_loc seems confusing to me. We have the following code: else if (!INTEGRAL_TYPE_P (type)) return build3_loc (loc, COND_EXPR, type, op0, constant_boolean_node (true, type), constant_boolean_node (false, type)); But this is wrong for the vector types, because it should construct VEC_COND_EXPR, not COND_EXPR. That is why I had a special case for vectors. >> * expr.c (expand_expr_real_2): Expand vector comparison by >> building an appropriate VEC_COND_EXPR. > > I prefer > > Index: gcc/expr.c > === > *** gcc/expr.c.orig 2011-08-29 11:48:23.0 +0200 > --- gcc/expr.c 2011-08-29 12:58:59.0 +0200 > *** do_store_flag (sepops ops, rtx target, e > *** 10309,10314 > --- 10309,10325 > STRIP_NOPS (arg0); > STRIP_NOPS (arg1); > > + /* For vector typed comparisons emit code to generate the desired > + all-ones or all-zeros mask. Conveniently use the VEC_COND_EXPR > + expander for this. */ > + if (TREE_CODE (ops->type) == VECTOR_TYPE) > + { > + tree ifexp = build2 (ops->code, ops->type, arg0, arg1); > + tree if_true = constant_boolean_node (true, ops->type); > + tree if_false = constant_boolean_node (false, ops->type); > + return expand_vec_cond_expr (ops->type, ifexp, if_true, > if_false, target); > + } > + > /* Get the rtx comparison code to use. We know that EXP is a comparison > > as I said multiple times. > >> * c-typeck.c (build_binary_op): Typecheck vector comparisons. >> (c_objc_common_truthvalue_conversion): Adjust. >> * tree-vect-generic.c (do_compare): Helper function. >> (expand_vector_comparison): Check if hardware supports >> vector comparison of the given type or expand vector >> piecewise. >> (expand_vector_operation): Treat comparison as binary >> operation of vector type. >> (expand_vector_operations_1): Adjust. >> * tree-cfg.c (verify_gimple_comparison): Adjust. > > The tree-cfg.c change has already been done. > > Richard. > >> >> gcc/config/i386 >> * i386.c (ix86_expand_sse_movcc): Consider a case when >> vcond operators are {-1,..} and {0,..}. >> >> gcc/doc >> * extend.texi: Adjust. >> >> gcc/testsuite >> * gcc.c-torture/execute/vector-compare-1.c: New test. >> * gcc.c-torture/execute/vector-compare-2.c: New test. >> * gcc.dg/vector-compare-1.c: New test. >> * gcc.dg/vector-compare-2.c: New test. >> >> bootstrapped and tested on x86_64-unknown-linux-gnu. >> >> >> Thanks, >> Artem. >> > All the rest is adjusted in the new version of the patch you can find in the attachment. ChangLog 20011-09-06 Artjoms Sinkarovs gcc/ * expr.c (do_store_flag): Expand vector comparison by building an appropriate VEC_COND_EXPR. * c-typeck.c (build_binary_op): Typecheck vector comparisons. (c_objc_common_truthvalue_conversion): Adjust. * tree-vect-generic.c (do_compare): Helper function. (expand_vector_comparison): Check if hardware supports vector comparison of the given type or expand vector piecewise. (expand_vector_operation): Treat comparison as binary operation of vector type. (expand_vector_operations_1): Adjust. gcc/config/i386 * i386.c (ix86_expand_sse_movcc): Consider a case when vcond operators are {-1,..} and {0,..}. gcc/doc * extend.texi: Adjust. gcc/testsuite * gcc.c-torture/execute/vector-compare-1.c: New test. * gcc.c-torture/execute/vector-compare-2.c: New test. * gcc.dg/vector-compare-1.c: New test. * gcc.dg/vector-compare-2.c: New test. bootstrapped and tested on x86_64-unknown-linux-gnu. Index: gcc/doc/extend.texi =
Re: Vector Comparison patch
Here is a new version of the patch which considers the changes from 2011-09-02 Richard Guenther ChangeLog 20011-09-06 Artjoms Sinkarovs gcc/ * fold-const.c (constant_boolean_node): Adjust the meaning of boolean for vector types: true = {-1,..}, false = {0,..}. (fold_unary_loc): Avoid conversion of vector comparison to boolean type. * expr.c (expand_expr_real_2): Expand vector comparison by building an appropriate VEC_COND_EXPR. * c-typeck.c (build_binary_op): Typecheck vector comparisons. (c_objc_common_truthvalue_conversion): Adjust. * tree-vect-generic.c (do_compare): Helper function. (expand_vector_comparison): Check if hardware supports vector comparison of the given type or expand vector piecewise. (expand_vector_operation): Treat comparison as binary operation of vector type. (expand_vector_operations_1): Adjust. * tree-cfg.c (verify_gimple_comparison): Adjust. gcc/config/i386 * i386.c (ix86_expand_sse_movcc): Consider a case when vcond operators are {-1,..} and {0,..}. gcc/doc * extend.texi: Adjust. gcc/testsuite * gcc.c-torture/execute/vector-compare-1.c: New test. * gcc.c-torture/execute/vector-compare-2.c: New test. * gcc.dg/vector-compare-1.c: New test. * gcc.dg/vector-compare-2.c: New test. bootstrapped and tested on x86_64-unknown-linux-gnu. Thanks, Artem. Index: gcc/doc/extend.texi === --- gcc/doc/extend.texi (revision 178579) +++ gcc/doc/extend.texi (working copy) @@ -6561,6 +6561,29 @@ invoke undefined behavior at runtime. W accesses for vector subscription can be enabled with @option{-Warray-bounds}. +In GNU C vector comparison is supported within standard comparison +operators: @code{==, !=, <, <=, >, >=}. Comparison operands can be +vector expressions of integer-type or real-type. Comparison between +integer-type vectors and real-type vectors are not supported. The +result of the comparison is a vector of the same width and number of +elements as the comparison operands with a signed integral element +type. + +Vectors are compared element-wise producing 0 when comparison is false +and -1 (constant of the appropriate type where all bits are set) +otherwise. Consider the following example. + +@smallexample +typedef int v4si __attribute__ ((vector_size (16))); + +v4si a = @{1,2,3,4@}; +v4si b = @{3,2,1,4@}; +v4si c; + +c = a > b; /* The result would be @{0, 0,-1, 0@} */ +c = a == b; /* The result would be @{0,-1, 0,-1@} */ +@end smallexample + You can declare variables and use them in function calls and returns, as well as in assignments and some casts. You can specify a vector type as a return type for a function. Vector types can also be used as function Index: gcc/fold-const.c === --- gcc/fold-const.c(revision 178579) +++ gcc/fold-const.c(working copy) @@ -5934,7 +5934,15 @@ extract_muldiv_1 (tree t, tree c, enum t tree constant_boolean_node (bool value, tree type) { - if (type == integer_type_node) + if (TREE_CODE (type) == VECTOR_TYPE) +{ + tree tval; + + gcc_assert (TREE_CODE (TREE_TYPE (type)) == INTEGER_TYPE); + tval = build_int_cst (TREE_TYPE (type), value ? -1 : 0); + return build_vector_from_val (type, tval); +} + else if (type == integer_type_node) return value ? integer_one_node : integer_zero_node; else if (type == boolean_type_node) return value ? boolean_true_node : boolean_false_node; @@ -7670,6 +7678,16 @@ fold_unary_loc (location_t loc, enum tre return build2_loc (loc, TREE_CODE (op0), type, TREE_OPERAND (op0, 0), TREE_OPERAND (op0, 1)); + else if (TREE_CODE (type) == VECTOR_TYPE) + { + tree el_type = TREE_TYPE (type); + tree op_el_type = TREE_TYPE (TREE_TYPE (op0)); + + if (el_type == op_el_type) + return op0; + else + build1_loc (loc, VIEW_CONVERT_EXPR, type, op0); + } else if (!INTEGRAL_TYPE_P (type)) return build3_loc (loc, COND_EXPR, type, op0, constant_boolean_node (true, type), Index: gcc/testsuite/gcc.c-torture/execute/vector-compare-1.c === --- gcc/testsuite/gcc.c-torture/execute/vector-compare-1.c (revision 0) +++ gcc/testsuite/gcc.c-torture/execute/vector-compare-1.c (revision 0) @@ -0,0 +1,123 @@ +#define vector(elcount, type) \ +__attribute__((vector_size((elcount)*sizeof(type type + +#define check_compare(count, res, i0, i1, op, fmt) \ +do { \ +int __i; \ +for (__i = 0; __i < count; __i ++) { \ + if ((res)[__i] != ((i0)[__i] op (i1)[__i]
Re: Vector shuffling
On Fri, Sep 2, 2011 at 4:41 PM, Joseph S. Myers wrote: > On Fri, 2 Sep 2011, Artem Shinkarov wrote: > >> + /* Avoid C_MAYBE_CONST_EXPRs inside VEC_SHUFFLE_EXPR. */ >> + tmp = c_fully_fold (v0, false, &maybe_const); >> + v0 = save_expr (tmp); >> + wrap &= maybe_const; > > I suppose you need this save_expr because of the two-argument case, but > shouldn't need it otherwise. > >> + if (!two_arguments) >> + { >> + tmp = c_fully_fold (v1, false, &maybe_const); >> + v1 = save_expr (tmp); > > And you shouldn't need this save_expr at all. > >> + tmp = c_fully_fold (mask, false, &maybe_const); >> + mask = save_expr (tmp); > > Or this one. Joseph, I don't understand this comment. I have 2 or 3 arguments in the VEC_SHUFFLE_EXPR and any of them can be C_MAYBE_CONST_EXPR, so I need to wrap mask (the last argument) to avoid the following failure: #define vector(elcount, type) \ __attribute__((vector_size((elcount)*sizeof(type type extern int p, q, v, r; int main () { vector (4, int) i0 = {argc, 1,2,3}; vector (4, int) i1 = {argc, 1, argc, 3}; vector (4, int) imask = {0,3,2,1}; vector (4, int) extmask = {p,q,r,v}; i2 = __builtin_shuffle (i0, (p,q)? imask:extmask); return 0; } and the same failure would happen if __builtin_shuffle expression will be in the following form: i2 = __builtin_shuffle (i0, (p,q)? imask:extmask, i2); All the rest -- agreed, and is fixed already. Thanks, Artem. >> +/* Helper function to read arguments of builtins which are interfaces >> + for the middle-end nodes like COMPLEX_EXPR, VEC_SHUFLE_EXPR and > > Spelling of SHUFFLE. > >> + others. The name of the builtin is passed using BNAME parameter. > > Two spaces after ".". > >> + Function returns true if there were no errors while parsing and >> + stores the arguments in EXPR_LIST*/ > > ". " at end of comment. > >> +static bool >> +c_parser_get_builtin_args (c_parser * parser, const char * bname, >> + VEC(tree,gc) ** expr_list) > > No spaces after "*". > >> + if (c_parser_next_token_is_not (parser, CPP_OPEN_PAREN)) >> + { >> + error_at (loc, "cannot take address of %<%s%>", bname); > > %qs is a simpler form of %<%s%>. > >> @@ -6461,6 +6500,35 @@ c_parser_postfix_expression (c_parser *p > > Should also convert __builtin_choose_expr and __builtin_complex to use the > new helper. > >> + if (! c_parser_get_builtin_args (parser, > > No space after "!". > >> + { >> + error_at (loc, "%<__builtin_shuffle%> wrong number of >> arguments"); > > "wrong number of arguments to %<__builtin_shuffle%>". > > -- > Joseph S. Myers > jos...@codesourcery.com >
Re: Vector shuffling
On Wed, Aug 31, 2011 at 4:38 PM, Joseph S. Myers wrote: > On Wed, 31 Aug 2011, Artem Shinkarov wrote: > >> 1) Helper function for the pseudo-builtins. >> In my case the builtin can have 2 or 3 arguments, and I think that I >> expressed that in a pretty much short way without any helper function. >> Am I missing something? > > The point is to refactor what's common between this and other > pseudo-builtins, not to have two pseudo-builtins doing things one way and > one doing them another way Joseph, I don't mind adjusting, just look into the patch and tell me if the way it is done at the moment is the right way to do it. I don't see a good reason to write a helper function the way you describe, because the number of operations we do there is very small. However, if you think that this is a right way to go, I can put the statements I am using right now to handle arguments of RID_BUILTIN_SHUFFLE in a helper function. So is there anything missing? Thanks, Artem. > -- > Joseph S. Myers > jos...@codesourcery.com >
Re: Vector shuffling
On Wed, Aug 31, 2011 at 12:51 AM, Chris Lattner wrote: > On Aug 30, 2011, at 10:01 AM, Artem Shinkarov wrote: >>>> The patch at the moment lacks of some examples, but mainly it works >>>> fine for me. It would be nice if i386 gurus could look into the way I >>>> am doing the expansion. >>>> >>>> Middle-end parts seems to be more or less fine, they have not changed >>>> much from the previous time. >>> >>> +@code{__builtin_shuffle (vec, mask)} and >>> +@code{__builtin_shuffle (vec0, vec1, mask)}. Both functions construct >>> >>> the latter would be __builtin_shuffle2. >> >> Why?? >> That was the syntax we agreed on that elegantly handles both cases in one >> place. > > If you're going to add vector shuffling builtins, you might consider adding > the same builtin that clang has for compatibility: > http://clang.llvm.org/docs/LanguageExtensions.html#__builtin_shufflevector > > It should be straight-forward to map it into the same IR. > > -Chris > Chris I am trying to use OpenCL syntax here which says that the mask for shuffling is a vector. Also I didn't really get from the clang description if the indexes could be non-constnants? If not, then I have a problem here, because I want to support this. Artem.
Re: Vector shuffling
On Tue, Aug 30, 2011 at 2:03 PM, Richard Guenther wrote: > On Tue, Aug 30, 2011 at 4:31 AM, Artem Shinkarov > wrote: >> Hi >> >> This is a patch for the explicit vector shuffling we have discussed a >> long time ago here: >> http://gcc.gnu.org/ml/gcc-patches/2010-08/msg01092.html >> >> The new patch introduces the new tree code, as we agreed, and expands >> this code by checking the vshuffle pattern in the backend. >> >> The patch at the moment lacks of some examples, but mainly it works >> fine for me. It would be nice if i386 gurus could look into the way I >> am doing the expansion. >> >> Middle-end parts seems to be more or less fine, they have not changed >> much from the previous time. > > +@code{__builtin_shuffle (vec, mask)} and > +@code{__builtin_shuffle (vec0, vec1, mask)}. Both functions construct > > the latter would be __builtin_shuffle2. Why?? That was the syntax we agreed on that elegantly handles both cases in one place. > +bool > +expand_vec_shuffle_expr_p (enum machine_mode mode, tree v0, > + tree v1, tree mask) > +{ > +#define inner_type_size(vec) \ > + GET_MODE_BITSIZE (TYPE_MODE (TREE_TYPE (TREE_TYPE (vec > > missing comment. No #defines like this please, just initialize > two temporary variables. > > + > +rtx > +expand_vec_shuffle_expr (tree type, tree v0, tree v1, tree mask, rtx target) > +{ > > comment. > > +vshuffle: > + gcc_assert (v1 == v0); > + > + icode = direct_optab_handler (vshuffle_optab, mode); > > hmm, so we don't have a vshuffle2 optab but always go via the > builtin function, but only for constant masks there? I wonder > if we should arrange for targets to only support a vshuffle > optab (thus, transition away from the builtin) and so > unconditionally have a vshuffle2 optab only (with possibly > equivalent v1 and v0?) I have only implemented the case with non-constant mask that supports only one argument. I think that it would be enough for the first version. Later we can introduce vshuffle2 pattern and reuse the code that expands vshuffle at the moment. > I suppose Richard might remember what he had in mind back > when we discussed this. > > Index: gcc/c-typeck.c > === > --- gcc/c-typeck.c (revision 177758) > +++ gcc/c-typeck.c (working copy) > @@ -2815,6 +2815,68 @@ build_function_call_vec (location_t loc, > && !check_builtin_function_arguments (fundecl, nargs, argarray)) > return error_mark_node; > > + /* Typecheck a builtin function which is declared with variable > + argument list. */ > + if (fundecl && DECL_BUILT_IN (fundecl) > + && DECL_BUILT_IN_CLASS (fundecl) == BUILT_IN_NORMAL) > > just add to check_builtin_function_arguments which is called right > in front of your added code. > > + /* Here we change the return type of the builtin function > + from int f(...) --> t f(...) where t is a type of the > + first argument. */ > + fundecl = copy_node (fundecl); > + TREE_TYPE (fundecl) = build_function_type (TREE_TYPE (firstarg), > + TYPE_ARG_TYPES (TREE_TYPE > (fundecl))); > + function = build_fold_addr_expr (fundecl); > > oh, hum - now I remember ;) Eventually the C frontend should handle > this not via the function call mechanism but similar to how Joseph > added __builtin_complex support with > > 2011-08-19 Joseph Myers > > * c-parser.c (c_parser_postfix_expression): Handle RID_BUILTIN_COMPLEX. > * doc/extend.texi (__builtin_complex): Document. > > and then emit VEC_SHUFFLE_EXPRs directly from the frontend. Joseph? > > FOR_EACH_CONSTRUCTOR_VALUE (CONSTRUCTOR_ELTS (inside_init), ix, > value) > - if (!CONSTANT_CLASS_P (value)) > + if (!CONSTANT_CLASS_P (value)) > > watch out for spurious whitespace changes. > > Index: gcc/gimplify.c > === > --- gcc/gimplify.c (revision 177758) > +++ gcc/gimplify.c (working copy) > @@ -7050,6 +7050,7 @@ gimplify_expr (tree *expr_p, gimple_seq > break; > > case BIT_FIELD_REF: > + case VEC_SHUFFLE_EXPR: > > I don't think that's quite the right place given the is_gimple_lvalue > predicate on the first operand. More like > > case VEC_SHUFFLE_EXPR: > goto expr_3; > > +/* Vector shuffle expression. A = VEC_SHUFFLE_EXPR > > typo, mask > > + means > + > + freach i in length (mask): > + A = mask[i
Vector shuffling
Hi This is a patch for the explicit vector shuffling we have discussed a long time ago here: http://gcc.gnu.org/ml/gcc-patches/2010-08/msg01092.html The new patch introduces the new tree code, as we agreed, and expands this code by checking the vshuffle pattern in the backend. The patch at the moment lacks of some examples, but mainly it works fine for me. It would be nice if i386 gurus could look into the way I am doing the expansion. Middle-end parts seems to be more or less fine, they have not changed much from the previous time. ChangeLog: 2011-08-30 Artjoms Sinkarovs gcc/ * optabs.c (expand_vec_shuffle_expr_p): New function. Checks if given expression can be expanded by the target. (expand_vec_shuffle_expr): New function. Expand VEC_SHUFFLE_EXPR using target vector instructions. * optabs.h: New optab vshuffle. (expand_vec_shuffle_expr_p): New prototype. (expand_vec_shuffle_expr): New prototype. * genopinit.c: Adjust to support vecshuffle. * builtins.def: New builtin __builtin_shuffle. * c-typeck.c (build_function_call_vec): Typecheck __builtin_shuffle, allowing only two or three arguments. Change the type of builtin depending on the arguments. (digest_init): Warn when constructor has less elements than vector type. * gimplify.c (gimplify_exp): Adjusted to support VEC_SHUFFLE_EXPR. * tree.def: New tree code VEC_SHUFFLE_EXPR. * tree-vect-generic.c (vector_element): New function. Returns an element of the vector at the given position. (lower_builtin_shuffle): Change builtin_shuffle with VEC_SHUFLLE_EXPR or expand an expression piecewise. (expand_vector_operations_1): Adjusted. (gate_expand_vector_operations_noop): New gate function. * gimple.c (get_gimple_rhs_num_ops): Adjust. * passes.c: Move veclower down. * tree-pretty-print.c (dump_generic_node): Recognize VEC_SHUFFLE_EXPR as valid expression. * tree-ssa-operands: Adjust. gcc/config/i386 * sse.md: (sseshuffint) New mode_attr. Correspondence between the vector and the type of the mask when shuffling. (vecshuffle): New expansion. * i386-protos.h (ix86_expand_vshuffle): New prototype. * i386.c (ix86_expand_vshuffle): Expand vshuffle using pshufb. (ix86_vectorize_builtin_vec_perm_ok): Adjust. gcc/doc * extend.texi: Adjust. gcc/testsuite * gcc.c-torture/execute/vect-shuffle-2.c: New test. * gcc.c-torture/execute/vect-shuffle-4.c: New test. * gcc.c-torture/execute/vect-shuffle-1.c: New test. * gcc.c-torture/execute/vect-shuffle-3.c: New test. bootstrapped on x86_64-unknown-linux-gnu. The AVX parts are not tested, because I don't have actual hardware. It works with -mavx, the assembler code looks fine to me. I'll test it on a real hardware in couple of days. Thanks, Artem Shinkarov. Index: gcc/doc/extend.texi === --- gcc/doc/extend.texi (revision 177758) +++ gcc/doc/extend.texi (working copy) @@ -6553,6 +6553,32 @@ invoke undefined behavior at runtime. W accesses for vector subscription can be enabled with @option{-Warray-bounds}. +Vector shuffling is available using functions +@code{__builtin_shuffle (vec, mask)} and +@code{__builtin_shuffle (vec0, vec1, mask)}. Both functions construct +a permutation of elements from one or two vectors and return a vector +of the same type as input vector(s). The mask is a vector of +integer-typed elements. The size of each element of the mask must be +the same as the size of each input vector element. The number of +elements in input vector(s) and mask must be the same. + +The elements of the input vectors are numbered from left to right across +one or both of the vectors. Each element in the mask specifies a number +of element from the input vector(s). Consider the following example. + +@smallexample +typedef int v4si __attribute__ ((vector_size (16))); + +v4si a = @{1,2,3,4@}; +v4si b = @{5,6,7,8@}; +v4si mask1 = @{0,1,1,3@}; +v4si mask2 = @{0,4,2,5@}; +v4si res; + +res = __builtin_shuffle (a, mask1); /* res is @{1,2,2,4@} */ +res = __builtin_shuffle2 (a, b, mask2); /* res is @{1,5,3,6@} */ +@end smallexample + You can declare variables and use them in function calls and returns, as well as in assignments and some casts. You can specify a vector type as a return type for a function. Vector types can also be used as function Index: gcc/tree-pretty-print.c === --- gcc/tree-pretty-print.c (revision 177758) +++ gcc/tree-pretty-print.c (working copy) @@ -2063,6 +2063,16 @@ dump_generic_node (pretty_printer *buffe dump_generic_node (buffer, TREE_OPERAND (node, 2), spc, flags, false); pp_stri
Re: Vector Comparison patch
Hi Here is a patch with vector comparison only. Comparison is expanded using VEC_COND_EXPR, conversions between the different types inside the VEC_COND_EXPR are happening in optabs.c. The comparison generally works, however, the x86 backend does not recognize vectors of all 1s of type float and double, which is very bad, but I hope it could be fixed easily. Here is my humble attempt: Index: gcc/config/i386/predicates.md === --- gcc/config/i386/predicates.md (revision 177665) +++ gcc/config/i386/predicates.md (working copy) @@ -763,7 +763,19 @@ (define_predicate "vector_all_ones_opera for (i = 0; i < nunits; ++i) { rtx x = CONST_VECTOR_ELT (op, i); - if (x != constm1_rtx) + rtx y; + + if (GET_MODE_CLASS (GET_MODE (x)) == MODE_FLOAT) + { + REAL_VALUE_TYPE r; + REAL_VALUE_FROM_INT (r, -1, -1, GET_MODE (x)); + y = CONST_DOUBLE_FROM_REAL_VALUE (r, GET_MODE (x)); + } + else + y = constm1_rtx; + + /* if (x != constm1_rtx) */ + if (!rtx_equal_p (x, y)) return false; } return true; But the problem I have here is that -1 actually converts to -1.0, where I need to treat -0x1 as float. Something like: int p = -1; void *x = &p; float r = *((float *)x); Is there any way to do that in this context? Or may be there is another way to support real-typed vectors of -1 as constants? ChangeLog 20011-08-27 Artjoms Sinkarovs gcc/ * optabs.c (vector_compare_rtx): Allow comparison operands and vcond operands have different type. (expand_vec_cond_expr): Convert operands in case they do not match. * fold-const.c (constant_boolean_node): Adjust the meaning of boolean for vector types: true = {-1,..}, false = {0,..}. (fold_unary_loc): Avoid conversion of vector comparison to boolean type. * expr.c (expand_expr_real_2): Expand vector comparison by building an appropriate VEC_COND_EXPR. * c-typeck.c (build_binary_op): Typecheck vector comparisons. (c_objc_common_truthvalue_conversion): Adjust. * gimplify.c (gimplify_expr): Support vector comparison in gimple. * tree.def: Adjust comment. * tree-vect-generic.c (do_compare): Helper function. (expand_vector_comparison): Check if hardware supports vector comparison of the given type or expand vector piecewise. (expand_vector_operation): Treat comparison as binary operation of vector type. (expand_vector_operations_1): Adjust. * tree-cfg.c (verify_gimple_comparison): Adjust. gcc/config/i386 * i386.c (ix86_expand_sse_movcc): Consider a case when vcond operators are {-1,..} and {0,..}. gcc/doc * extend.texi: Adjust. gcc/testsuite * gcc.c-torture/execute/vector-compare-1.c: New test. * gcc.c-torture/execute/vector-compare-2.c: New test. * gcc.dg/vector-compare-1.c: New test. * gcc.dg/vector-compare-2.c: New test. Bootstrapped and tested on x86_64-unknown-linux-gnu. Artem. Index: gcc/doc/extend.texi === --- gcc/doc/extend.texi (revision 177665) +++ gcc/doc/extend.texi (working copy) @@ -6553,6 +6553,29 @@ invoke undefined behavior at runtime. W accesses for vector subscription can be enabled with @option{-Warray-bounds}. +In GNU C vector comparison is supported within standard comparison +operators: @code{==, !=, <, <=, >, >=}. Comparison operands can be +vector expressions of integer-type or real-type. Comparison between +integer-type vectors and real-type vectors are not supported. The +result of the comparison is a vector of the same width and number of +elements as the comparison operands with a signed integral element +type. + +Vectors are compared element-wise producing 0 when comparison is false +and -1 (constant of the appropriate type where all bits are set) +otherwise. Consider the following example. + +@smallexample +typedef int v4si __attribute__ ((vector_size (16))); + +v4si a = @{1,2,3,4@}; +v4si b = @{3,2,1,4@}; +v4si c; + +c = a > b; /* The result would be @{0, 0,-1, 0@} */ +c = a == b; /* The result would be @{0,-1, 0,-1@} */ +@end smallexample + You can declare variables and use them in function calls and returns, as well as in assignments and some casts. You can specify a vector type as a return type for a function. Vector types can also be used as function Index: gcc/optabs.c === --- gcc/optabs.c(revision 177665) +++ gcc/optabs.c(working copy) @@ -6502,7 +6502,8 @@ get_rtx_code (enum tree_code tcode, bool unsigned operators. Do not generate compare instruction. */ static rtx -vect
Re: Vector Comparison patch
On Thu, Aug 25, 2011 at 2:00 PM, Richard Guenther wrote: > On Thu, Aug 25, 2011 at 2:45 PM, Artem Shinkarov > wrote: >> On Thu, Aug 25, 2011 at 12:39 PM, Richard Guenther >> wrote: >>> On Thu, Aug 25, 2011 at 1:07 PM, Artem Shinkarov >>> wrote: >>>> On Thu, Aug 25, 2011 at 11:09 AM, Richard Guenther >>>> wrote: >>>>> On Thu, Aug 25, 2011 at 8:20 AM, Artem Shinkarov >>>>> wrote: >>>>>> Here is a cleaned-up patch without the hook. Mostly it works in a way >>>>>> we discussed. >>>>>> >>>>>> So I think it is a right time to do something about vcond patterns, >>>>>> which would allow me to get rid of conversions that I need to put all >>>>>> over the code. >>>>>> >>>>>> Also at the moment the patch breaks lto frontend with a simple example: >>>>>> #define vector(elcount, type) \ >>>>>> __attribute__((vector_size((elcount)*sizeof(type type >>>>>> >>>>>> int main (int argc, char *argv[]) { >>>>>> vector (4, float) f0; >>>>>> vector (4, float) f1; >>>>>> >>>>>> f0 = f1 != f0 >>>>>> ? (vector (4, float)){-1,-1,-1,-1} : (vector (4, >>>>>> float)){0,0,0,0}; >>>>>> >>>>>> return (int)f0[argc]; >>>>>> } >>>>>> >>>>>> test-lto.c:8:14: internal compiler error: in convert, at >>>>>> lto/lto-lang.c:1244 >>>>>> >>>>>> I looked into the file, the conversion function is defined as >>>>>> gcc_unreachable (). I am not very familiar with lto, so I don't really >>>>>> know what is the right way to treat the conversions. >>>>>> >>>>>> And I seriously need help with backend patterns. >>>>> >>>>> On the patch. >>>>> >>>>> The documentation needs review by a native english speaker, but here >>>>> are some factual comments: >>>>> >>>>> +In C vector comparison is supported within standard comparison operators: >>>>> >>>>> it should read 'In GNU C' here and everywhere else as this is a GNU >>>>> extension. >>>>> >>>>> The result of the >>>>> +comparison is a signed integer-type vector where the size of each >>>>> +element must be the same as the size of compared vectors element. >>>>> >>>>> The result type of the comparison is determined by the C frontend, >>>>> it isn't under control of the user. What you are implying here is >>>>> restrictions on vector assignments, which are documented elsewhere. >>>>> I'd just say >>>>> >>>>> 'The result of the comparison is a vector of the same width and number >>>>> of elements as the comparison operands with a signed integral element >>>>> type.' >>>>> >>>>> +In addition to the vector comparison C supports conditional expressions >>>>> >>>>> See above. >>>>> >>>>> +For the convenience condition in the vector conditional can be just a >>>>> +vector of signed integer type. >>>>> >>>>> 'of integer type.' I don't see a reason to disallow unsigned integers, >>>>> they can be equally well compared against zero. >>>> >>>> I'll have a final go on the documentation, it is untouched from the old >>>> patches. >>>> >>>>> Index: gcc/targhooks.h >>>>> === >>>>> --- gcc/targhooks.h (revision 177665) >>>>> +++ gcc/targhooks.h (working copy) >>>>> @@ -86,6 +86,7 @@ extern int default_builtin_vectorization >>>>> extern tree default_builtin_reciprocal (unsigned int, bool, bool); >>>>> >>>>> extern bool default_builtin_vector_alignment_reachable (const_tree, >>>>> bool); >>>>> + >>>>> extern bool >>>>> default_builtin_support_vector_misalignment (enum machine_mode mode, >>>>> const_tree, >>>>> >>>>> spurious whitespace change. >
Re: Vector Comparison patch
On Thu, Aug 25, 2011 at 12:39 PM, Richard Guenther wrote: > On Thu, Aug 25, 2011 at 1:07 PM, Artem Shinkarov > wrote: >> On Thu, Aug 25, 2011 at 11:09 AM, Richard Guenther >> wrote: >>> On Thu, Aug 25, 2011 at 8:20 AM, Artem Shinkarov >>> wrote: >>>> Here is a cleaned-up patch without the hook. Mostly it works in a way >>>> we discussed. >>>> >>>> So I think it is a right time to do something about vcond patterns, >>>> which would allow me to get rid of conversions that I need to put all >>>> over the code. >>>> >>>> Also at the moment the patch breaks lto frontend with a simple example: >>>> #define vector(elcount, type) \ >>>> __attribute__((vector_size((elcount)*sizeof(type type >>>> >>>> int main (int argc, char *argv[]) { >>>> vector (4, float) f0; >>>> vector (4, float) f1; >>>> >>>> f0 = f1 != f0 >>>> ? (vector (4, float)){-1,-1,-1,-1} : (vector (4, float)){0,0,0,0}; >>>> >>>> return (int)f0[argc]; >>>> } >>>> >>>> test-lto.c:8:14: internal compiler error: in convert, at >>>> lto/lto-lang.c:1244 >>>> >>>> I looked into the file, the conversion function is defined as >>>> gcc_unreachable (). I am not very familiar with lto, so I don't really >>>> know what is the right way to treat the conversions. >>>> >>>> And I seriously need help with backend patterns. >>> >>> On the patch. >>> >>> The documentation needs review by a native english speaker, but here >>> are some factual comments: >>> >>> +In C vector comparison is supported within standard comparison operators: >>> >>> it should read 'In GNU C' here and everywhere else as this is a GNU >>> extension. >>> >>> The result of the >>> +comparison is a signed integer-type vector where the size of each >>> +element must be the same as the size of compared vectors element. >>> >>> The result type of the comparison is determined by the C frontend, >>> it isn't under control of the user. What you are implying here is >>> restrictions on vector assignments, which are documented elsewhere. >>> I'd just say >>> >>> 'The result of the comparison is a vector of the same width and number >>> of elements as the comparison operands with a signed integral element >>> type.' >>> >>> +In addition to the vector comparison C supports conditional expressions >>> >>> See above. >>> >>> +For the convenience condition in the vector conditional can be just a >>> +vector of signed integer type. >>> >>> 'of integer type.' I don't see a reason to disallow unsigned integers, >>> they can be equally well compared against zero. >> >> I'll have a final go on the documentation, it is untouched from the old >> patches. >> >>> Index: gcc/targhooks.h >>> === >>> --- gcc/targhooks.h (revision 177665) >>> +++ gcc/targhooks.h (working copy) >>> @@ -86,6 +86,7 @@ extern int default_builtin_vectorization >>> extern tree default_builtin_reciprocal (unsigned int, bool, bool); >>> >>> extern bool default_builtin_vector_alignment_reachable (const_tree, bool); >>> + >>> extern bool >>> default_builtin_support_vector_misalignment (enum machine_mode mode, >>> const_tree, >>> >>> spurious whitespace change. >> >> Yes, thanks. >> >>> Index: gcc/optabs.c >>> === >>> --- gcc/optabs.c (revision 177665) >>> +++ gcc/optabs.c (working copy) >>> @@ -6572,16 +6572,36 @@ expand_vec_cond_expr (tree vec_cond_type >>> ... >>> + else >>> + { >>> + rtx rtx_op0; >>> + rtx vec; >>> + >>> + rtx_op0 = expand_normal (op0); >>> + comparison = gen_rtx_NE (mode, NULL_RTX, NULL_RTX); >>> + vec = CONST0_RTX (mode); >>> + >>> + create_output_operand (&ops[0], target, mode); >>> + create_input_operand (&ops[1], rtx_op1, mode); >>> + create_input_operand (&ops[2], rtx_op2, mode); >
Re: Vector Comparison patch
On Thu, Aug 25, 2011 at 11:09 AM, Richard Guenther wrote: > On Thu, Aug 25, 2011 at 8:20 AM, Artem Shinkarov > wrote: >> Here is a cleaned-up patch without the hook. Mostly it works in a way >> we discussed. >> >> So I think it is a right time to do something about vcond patterns, >> which would allow me to get rid of conversions that I need to put all >> over the code. >> >> Also at the moment the patch breaks lto frontend with a simple example: >> #define vector(elcount, type) \ >> __attribute__((vector_size((elcount)*sizeof(type type >> >> int main (int argc, char *argv[]) { >> vector (4, float) f0; >> vector (4, float) f1; >> >> f0 = f1 != f0 >> ? (vector (4, float)){-1,-1,-1,-1} : (vector (4, float)){0,0,0,0}; >> >> return (int)f0[argc]; >> } >> >> test-lto.c:8:14: internal compiler error: in convert, at lto/lto-lang.c:1244 >> >> I looked into the file, the conversion function is defined as >> gcc_unreachable (). I am not very familiar with lto, so I don't really >> know what is the right way to treat the conversions. >> >> And I seriously need help with backend patterns. > > On the patch. > > The documentation needs review by a native english speaker, but here > are some factual comments: > > +In C vector comparison is supported within standard comparison operators: > > it should read 'In GNU C' here and everywhere else as this is a GNU > extension. > > The result of the > +comparison is a signed integer-type vector where the size of each > +element must be the same as the size of compared vectors element. > > The result type of the comparison is determined by the C frontend, > it isn't under control of the user. What you are implying here is > restrictions on vector assignments, which are documented elsewhere. > I'd just say > > 'The result of the comparison is a vector of the same width and number > of elements as the comparison operands with a signed integral element > type.' > > +In addition to the vector comparison C supports conditional expressions > > See above. > > +For the convenience condition in the vector conditional can be just a > +vector of signed integer type. > > 'of integer type.' I don't see a reason to disallow unsigned integers, > they can be equally well compared against zero. I'll have a final go on the documentation, it is untouched from the old patches. > Index: gcc/targhooks.h > === > --- gcc/targhooks.h (revision 177665) > +++ gcc/targhooks.h (working copy) > @@ -86,6 +86,7 @@ extern int default_builtin_vectorization > extern tree default_builtin_reciprocal (unsigned int, bool, bool); > > extern bool default_builtin_vector_alignment_reachable (const_tree, bool); > + > extern bool > default_builtin_support_vector_misalignment (enum machine_mode mode, > const_tree, > > spurious whitespace change. Yes, thanks. > Index: gcc/optabs.c > === > --- gcc/optabs.c (revision 177665) > +++ gcc/optabs.c (working copy) > @@ -6572,16 +6572,36 @@ expand_vec_cond_expr (tree vec_cond_type > ... > + else > + { > + rtx rtx_op0; > + rtx vec; > + > + rtx_op0 = expand_normal (op0); > + comparison = gen_rtx_NE (mode, NULL_RTX, NULL_RTX); > + vec = CONST0_RTX (mode); > + > + create_output_operand (&ops[0], target, mode); > + create_input_operand (&ops[1], rtx_op1, mode); > + create_input_operand (&ops[2], rtx_op2, mode); > + create_input_operand (&ops[3], comparison, mode); > + create_input_operand (&ops[4], rtx_op0, mode); > + create_input_operand (&ops[5], vec, mode); > > this still builds the fake(?) != comparison, but as you said you need help > with the .md part if we want to use a machine specific pattern for this > case (which we eventually want, for the sake of using XOP vcond). Yes, I am waiting for it. This is the only way at the moment to make sure that in m = a > b; r = m ? c : d; m in the vcond is not transformed into the m != 0. > Index: gcc/target.h > === > --- gcc/target.h (revision 177665) > +++ gcc/target.h (working copy) > @@ -51,6 +51,7 @@ > #define GCC_TARGET_H > > #include "insn-modes.h" > +#include "gimple.h" > > #ifdef ENABLE_CHECKING > > spurious change. Old stuff, fixed. &
Re: Vector Comparison patch
On Thu, Aug 25, 2011 at 8:34 AM, Richard Guenther wrote: > On Thu, Aug 25, 2011 at 8:20 AM, Artem Shinkarov > wrote: >> Here is a cleaned-up patch without the hook. Mostly it works in a way >> we discussed. >> >> So I think it is a right time to do something about vcond patterns, >> which would allow me to get rid of conversions that I need to put all >> over the code. >> >> Also at the moment the patch breaks lto frontend with a simple example: >> #define vector(elcount, type) \ >> __attribute__((vector_size((elcount)*sizeof(type type >> >> int main (int argc, char *argv[]) { >> vector (4, float) f0; >> vector (4, float) f1; >> >> f0 = f1 != f0 >> ? (vector (4, float)){-1,-1,-1,-1} : (vector (4, float)){0,0,0,0}; >> >> return (int)f0[argc]; >> } >> >> test-lto.c:8:14: internal compiler error: in convert, at lto/lto-lang.c:1244 >> >> I looked into the file, the conversion function is defined as >> gcc_unreachable (). I am not very familiar with lto, so I don't really >> know what is the right way to treat the conversions. > > convert cannot be called from the middle-end, instead use fold_convert. Thanks, great. I didn't know that. Using fold_convert solves my problem and make all my tests pass. > >> And I seriously need help with backend patterns. > > I'll look at the patch in detail later today. Thanks, Artem. > Richard. > >> >> Thanks, >> Artem. >> >
Re: Vector Comparison patch
On Tue, Aug 23, 2011 at 12:23 PM, Richard Guenther wrote: > On Tue, Aug 23, 2011 at 1:11 PM, Artem Shinkarov > wrote: >> On Tue, Aug 23, 2011 at 11:56 AM, Richard Guenther >> wrote: >>> On Tue, Aug 23, 2011 at 12:45 PM, Artem Shinkarov >>> wrote: >>>> I'm confused. >>>> There is a set of problems which are tightly connected and you address >>>> only one one of them. >>>> >>>> I need to do something with C_MAYBE_CONST_EXPR node to allow the >>>> gimplification of the expression. In order to achieve that I am >>>> wrapping expression which can contain C_MAYBE_EXPR_NODE into >>>> SAVE_EXPR. This works fine, but, the vector condition is lifted out. >>>> So the question is how to get rid of C_MAYBE_CONST_EXPR nodes, making >>>> sure that the expression is still inside VEC_COND_EXPR? >>> >>> I can't answer this, but no C_MAYBE_CONST_EXPR nodes may survive >>> until gimplification. I thought c_fully_fold is exactly used (instead >>> of c_save_expr) because it _doesn't_ wrap things in C_MAYBE_CONST_EXPR >>> nodes. Instead you delay that (well, commented out in your patch). >> >> Ok. So for the time being save_expr is the only way that we know to >> avoid C_MAYBE_CONST_EXPR nodes. >> >>>> All the rest is fine -- a > b is transformed to VEC_COND_EXPR of the >>>> integer type, and when we are using it we can add != 0 to the mask, no >>>> problem. The problem is to make sure that the vector expression is not >>>> lifted out from the VEC_COND_EXPR and that C_MAYBE_CONST_EXPRs are >>>> also no there at the same time. >>> >>> Well, for example for floating-point comparisons and -fnon-call-exceptions >>> you _will_ get comparisons lifted out of the VEC_COND_EXPR. But >>> that shouldn't be an issue because C semantics are ensured for >>> the mask ? v0 : v1 source form by changing it to mask != 0 ? v0 : v1 and >>> the VEC_COND_EXPR semantic for a non-comparison mask operand >>> is (v0 & mask) | (v1 & ~mask). Which means that we have to be able to >>> expand mask = v0 < v1 anyway, but we'll simply expand it if it were >>> VEC_COND_EXPR . >> >> Richard, I think you almost get it, but there is a tiny thing you have >> missed. >> Look, let's assume, that by some reason when we gimplified a > b, the >> comparison was lifted out. So we have the following situation: >> >> D.1 = a > b; >> comp = vcond >> ... >> >> Ok? >> Now, I fully agree that we want to treat lifted a > b as VCOND. Now, >> what I am doing in the veclower is when I meet vector comparison a > >> b, I wrap it in the VCOND, otherwise it would not be recognized by >> optabs. literally I am doing: >> >> rhs = gimplify_build3 (gsi, VEC_COND_EXPR, a, b, {-1}, {0}> >> >> And here is a devil hidden. By some reason, when this expression is >> gimplified, a > b is lifted again and is left outside the >> VEC_COND_EXPR, and that is the problem I am trying to fight with. Have >> any ideas what could be done here? > > Well, don't do it. Check if the target can expand > > D.1 = a > b; > > via feeding it vcond and if not, expand it > piecewise > in veclower. If it can handle it - leave it alone! > > In expand_expr_real_2 add to the EQ_EXPR (etc.) case the case > of a vector-typed comparison and use the vcond optab for it, again > via vcond . If you look at the EQ_EXPR case > it dispatches to do_store_flag - that's the best place to handle > vector-typed compares. > > Richard. > That sounds like a plan. I'll investigate if it can be done. Also, if we can handle a > b, then we don't need to construct vcond b, {-1}, {0}>, we will know that it would be constructed correctly when expanding. Thanks for your help, Artem.
Re: Vector Comparison patch
Sorry, not rhs = gimplify_build3 (gsi, VEC_COND_EXPR, a, b, {-1}, {0}> but rather rhs = gimplify_build3 (gsi, VEC_COND_EXPR, build2 (GT_EXPR, type, a, b), {-1}, {0}> Artem.
Re: Vector Comparison patch
On Tue, Aug 23, 2011 at 11:56 AM, Richard Guenther wrote: > On Tue, Aug 23, 2011 at 12:45 PM, Artem Shinkarov > wrote: >> On Tue, Aug 23, 2011 at 11:33 AM, Richard Guenther >> wrote: >>> On Tue, Aug 23, 2011 at 12:24 PM, Artem Shinkarov >>> wrote: >>>> On Tue, Aug 23, 2011 at 11:08 AM, Richard Guenther >>>> wrote: >>>>> On Tue, Aug 23, 2011 at 11:44 AM, Artem Shinkarov >>>>> wrote: >>>>>> On Tue, Aug 23, 2011 at 9:17 AM, Richard Guenther >>>>>> wrote: >>>>>>> On Mon, Aug 22, 2011 at 11:11 PM, Artem Shinkarov >>>>>>> wrote: >>>>>>>> I'll just send you my current version. I'll be a little bit more >>>>>>>> specific. >>>>>>>> >>>>>>>> The problem starts when you try to lower the following expression: >>>>>>>> >>>>>>>> x = a > b; >>>>>>>> x1 = vcond >>>>>>>> vcond >>>>>>>> >>>>>>>> Now, you go from the beginning to the end of the block, and you cannot >>>>>>>> leave a > b, because only vconds are valid expressions to expand. >>>>>>>> >>>>>>>> Now, you meet a > b first. You try to transform it into vcond b, >>>>>>>> -1, 0>, you build this expression, then you try to gimplify it, and >>>>>>>> you see that you have something like: >>>>>>>> >>>>>>>> x' = a >b; >>>>>>>> x = vcond >>>>>>>> x1 = vcond >>>>>>>> vcond >>>>>>>> >>>>>>>> and your gsi stands at the x1 now, so the gimplification created a >>>>>>>> comparison that optab would not understand. And I am not really sure >>>>>>>> that you would be able to solve this problem easily. >>>>>>>> >>>>>>>> It would helpr, if you could create vcond, but you >>>>>>>> cant and x op y is a single tree that must be gimplified, and I am not >>>>>>>> sure that you can persuade gimplifier to leave this expression >>>>>>>> untouched. >>>>>>>> >>>>>>>> In the attachment the current version of the patch. >>>>>>> >>>>>>> I can't reproduce it with your patch. For >>>>>>> >>>>>>> #define vector(elcount, type) \ >>>>>>> __attribute__((vector_size((elcount)*sizeof(type type >>>>>>> >>>>>>> vector (4, float) x, y; >>>>>>> vector (4, int) a,b; >>>>>>> int >>>>>>> main (int argc, char *argv[]) >>>>>>> { >>>>>>> vector (4, int) i0 = x < y; >>>>>>> vector (4, int) i1 = i0 ? a : b; >>>>>>> return 0; >>>>>>> } >>>>>>> >>>>>>> I get from the C frontend: >>>>>>> >>>>>>> vector(4) int i0 = VEC_COND_EXPR < SAVE_EXPR , { -1, -1, >>>>>>> -1, -1 } , { 0, 0, 0, 0 } > ; >>>>>>> vector(4) int i1 = VEC_COND_EXPR < SAVE_EXPR , SAVE_EXPR , >>>>>>> SAVE_EXPR > ; >>>>>>> >>>>>>> but I have expected i0 != 0 in the second VEC_COND_EXPR. >>>>>> >>>>>> I don't put it there. This patch adds != 0, rather removing. But this >>>>>> could be changed. >>>>> >>>>> ? >>>>> >>>>>>> I do see that the gimplifier pulls away the condition for the first >>>>>>> VEC_COND_EXPR though: >>>>>>> >>>>>>> x.0 = x; >>>>>>> y.1 = y; >>>>>>> D.2735 = x.0 < y.1; >>>>>>> D.2734 = D.2735; >>>>>>> D.2736 = D.2734; >>>>>>> i0 = [vec_cond_expr] VEC_COND_EXPR < D.2736 , { -1, -1, -1, -1 } , >>>>>>> { 0, 0, 0, 0 } > ; >>>>>>> >>>>>>> which is, I believe because of the SAVE_EXPR wrapped around the >>>>>>> comparison. Why do you bother wrapping all ope
Re: Vector Comparison patch
On Tue, Aug 23, 2011 at 11:33 AM, Richard Guenther wrote: > On Tue, Aug 23, 2011 at 12:24 PM, Artem Shinkarov > wrote: >> On Tue, Aug 23, 2011 at 11:08 AM, Richard Guenther >> wrote: >>> On Tue, Aug 23, 2011 at 11:44 AM, Artem Shinkarov >>> wrote: >>>> On Tue, Aug 23, 2011 at 9:17 AM, Richard Guenther >>>> wrote: >>>>> On Mon, Aug 22, 2011 at 11:11 PM, Artem Shinkarov >>>>> wrote: >>>>>> I'll just send you my current version. I'll be a little bit more >>>>>> specific. >>>>>> >>>>>> The problem starts when you try to lower the following expression: >>>>>> >>>>>> x = a > b; >>>>>> x1 = vcond >>>>>> vcond >>>>>> >>>>>> Now, you go from the beginning to the end of the block, and you cannot >>>>>> leave a > b, because only vconds are valid expressions to expand. >>>>>> >>>>>> Now, you meet a > b first. You try to transform it into vcond b, >>>>>> -1, 0>, you build this expression, then you try to gimplify it, and >>>>>> you see that you have something like: >>>>>> >>>>>> x' = a >b; >>>>>> x = vcond >>>>>> x1 = vcond >>>>>> vcond >>>>>> >>>>>> and your gsi stands at the x1 now, so the gimplification created a >>>>>> comparison that optab would not understand. And I am not really sure >>>>>> that you would be able to solve this problem easily. >>>>>> >>>>>> It would helpr, if you could create vcond, but you >>>>>> cant and x op y is a single tree that must be gimplified, and I am not >>>>>> sure that you can persuade gimplifier to leave this expression >>>>>> untouched. >>>>>> >>>>>> In the attachment the current version of the patch. >>>>> >>>>> I can't reproduce it with your patch. For >>>>> >>>>> #define vector(elcount, type) \ >>>>> __attribute__((vector_size((elcount)*sizeof(type type >>>>> >>>>> vector (4, float) x, y; >>>>> vector (4, int) a,b; >>>>> int >>>>> main (int argc, char *argv[]) >>>>> { >>>>> vector (4, int) i0 = x < y; >>>>> vector (4, int) i1 = i0 ? a : b; >>>>> return 0; >>>>> } >>>>> >>>>> I get from the C frontend: >>>>> >>>>> vector(4) int i0 = VEC_COND_EXPR < SAVE_EXPR , { -1, -1, >>>>> -1, -1 } , { 0, 0, 0, 0 } > ; >>>>> vector(4) int i1 = VEC_COND_EXPR < SAVE_EXPR , SAVE_EXPR , >>>>> SAVE_EXPR > ; >>>>> >>>>> but I have expected i0 != 0 in the second VEC_COND_EXPR. >>>> >>>> I don't put it there. This patch adds != 0, rather removing. But this >>>> could be changed. >>> >>> ? >>> >>>>> I do see that the gimplifier pulls away the condition for the first >>>>> VEC_COND_EXPR though: >>>>> >>>>> x.0 = x; >>>>> y.1 = y; >>>>> D.2735 = x.0 < y.1; >>>>> D.2734 = D.2735; >>>>> D.2736 = D.2734; >>>>> i0 = [vec_cond_expr] VEC_COND_EXPR < D.2736 , { -1, -1, -1, -1 } , >>>>> { 0, 0, 0, 0 } > ; >>>>> >>>>> which is, I believe because of the SAVE_EXPR wrapped around the >>>>> comparison. Why do you bother wrapping all operands in save-exprs? >>>> >>>> I bother because they could be MAYBE_CONST which breaks the >>>> gimplifier. But I don't really know if you can do it better. I can >>>> always do this checking on operands of constructed vcond... >>> >>> Err, the patch does >>> >>> + /* Avoid C_MAYBE_CONST in VEC_COND_EXPR. */ >>> + tmp = c_fully_fold (ifexp, false, &maybe_const); >>> + ifexp = save_expr (tmp); >>> + wrap &= maybe_const; >>> >>> why is >>> >>> ifexp = save_expr (tmp); >>> >>> necessary here? SAVE_EXPR is if you need to protect side-effects >>> from being evaluated twice if you use an operand twice. But all >>> operands are just used a sing
Re: Vector Comparison patch
On Tue, Aug 23, 2011 at 11:08 AM, Richard Guenther wrote: > On Tue, Aug 23, 2011 at 11:44 AM, Artem Shinkarov > wrote: >> On Tue, Aug 23, 2011 at 9:17 AM, Richard Guenther >> wrote: >>> On Mon, Aug 22, 2011 at 11:11 PM, Artem Shinkarov >>> wrote: >>>> I'll just send you my current version. I'll be a little bit more specific. >>>> >>>> The problem starts when you try to lower the following expression: >>>> >>>> x = a > b; >>>> x1 = vcond >>>> vcond >>>> >>>> Now, you go from the beginning to the end of the block, and you cannot >>>> leave a > b, because only vconds are valid expressions to expand. >>>> >>>> Now, you meet a > b first. You try to transform it into vcond b, >>>> -1, 0>, you build this expression, then you try to gimplify it, and >>>> you see that you have something like: >>>> >>>> x' = a >b; >>>> x = vcond >>>> x1 = vcond >>>> vcond >>>> >>>> and your gsi stands at the x1 now, so the gimplification created a >>>> comparison that optab would not understand. And I am not really sure >>>> that you would be able to solve this problem easily. >>>> >>>> It would helpr, if you could create vcond, but you >>>> cant and x op y is a single tree that must be gimplified, and I am not >>>> sure that you can persuade gimplifier to leave this expression >>>> untouched. >>>> >>>> In the attachment the current version of the patch. >>> >>> I can't reproduce it with your patch. For >>> >>> #define vector(elcount, type) \ >>> __attribute__((vector_size((elcount)*sizeof(type type >>> >>> vector (4, float) x, y; >>> vector (4, int) a,b; >>> int >>> main (int argc, char *argv[]) >>> { >>> vector (4, int) i0 = x < y; >>> vector (4, int) i1 = i0 ? a : b; >>> return 0; >>> } >>> >>> I get from the C frontend: >>> >>> vector(4) int i0 = VEC_COND_EXPR < SAVE_EXPR , { -1, -1, >>> -1, -1 } , { 0, 0, 0, 0 } > ; >>> vector(4) int i1 = VEC_COND_EXPR < SAVE_EXPR , SAVE_EXPR , >>> SAVE_EXPR > ; >>> >>> but I have expected i0 != 0 in the second VEC_COND_EXPR. >> >> I don't put it there. This patch adds != 0, rather removing. But this >> could be changed. > > ? > >>> I do see that the gimplifier pulls away the condition for the first >>> VEC_COND_EXPR though: >>> >>> x.0 = x; >>> y.1 = y; >>> D.2735 = x.0 < y.1; >>> D.2734 = D.2735; >>> D.2736 = D.2734; >>> i0 = [vec_cond_expr] VEC_COND_EXPR < D.2736 , { -1, -1, -1, -1 } , >>> { 0, 0, 0, 0 } > ; >>> >>> which is, I believe because of the SAVE_EXPR wrapped around the >>> comparison. Why do you bother wrapping all operands in save-exprs? >> >> I bother because they could be MAYBE_CONST which breaks the >> gimplifier. But I don't really know if you can do it better. I can >> always do this checking on operands of constructed vcond... > > Err, the patch does > > + /* Avoid C_MAYBE_CONST in VEC_COND_EXPR. */ > + tmp = c_fully_fold (ifexp, false, &maybe_const); > + ifexp = save_expr (tmp); > + wrap &= maybe_const; > > why is > > ifexp = save_expr (tmp); > > necessary here? SAVE_EXPR is if you need to protect side-effects > from being evaluated twice if you use an operand twice. But all > operands are just used a single time. Again, the only reason why save_expr is there is to avoid MAYBE_CONST nodes to break the gimplification. But may be it is a wrong way of doing it, but it does the job. > And I expected, instead of > > + if ((COMPARISON_CLASS_P (ifexp) > + && TREE_TYPE (TREE_OPERAND (ifexp, 0)) != TREE_TYPE (op1)) > + || TREE_TYPE (ifexp) != TREE_TYPE (op1)) > + { > + tree comp_type = COMPARISON_CLASS_P (ifexp) > + ? TREE_TYPE (TREE_OPERAND (ifexp, 0)) > + : TREE_TYPE (ifexp); > + > + op1 = convert (comp_type, op1); > + op2 = convert (comp_type, op2); > + vcond = build3 (VEC_COND_EXPR, comp_type, ifexp, op1, op2); > + vcond = convert (TREE_TYPE (op1), vcond); > + } > + else > + vcond = build3 (VEC_COND_EXPR, TREE_TYPE (op1), ifexp, op1, op2); > > if (!COMPARISON_CLASS_P (ifexp)) > ifexp = bu
Re: Vector Comparison patch
On Tue, Aug 23, 2011 at 9:17 AM, Richard Guenther wrote: > On Mon, Aug 22, 2011 at 11:11 PM, Artem Shinkarov > wrote: >> I'll just send you my current version. I'll be a little bit more specific. >> >> The problem starts when you try to lower the following expression: >> >> x = a > b; >> x1 = vcond >> vcond >> >> Now, you go from the beginning to the end of the block, and you cannot >> leave a > b, because only vconds are valid expressions to expand. >> >> Now, you meet a > b first. You try to transform it into vcond b, >> -1, 0>, you build this expression, then you try to gimplify it, and >> you see that you have something like: >> >> x' = a >b; >> x = vcond >> x1 = vcond >> vcond >> >> and your gsi stands at the x1 now, so the gimplification created a >> comparison that optab would not understand. And I am not really sure >> that you would be able to solve this problem easily. >> >> It would helpr, if you could create vcond, but you >> cant and x op y is a single tree that must be gimplified, and I am not >> sure that you can persuade gimplifier to leave this expression >> untouched. >> >> In the attachment the current version of the patch. > > I can't reproduce it with your patch. For > > #define vector(elcount, type) \ > __attribute__((vector_size((elcount)*sizeof(type type > > vector (4, float) x, y; > vector (4, int) a,b; > int > main (int argc, char *argv[]) > { > vector (4, int) i0 = x < y; > vector (4, int) i1 = i0 ? a : b; > return 0; > } > > I get from the C frontend: > > vector(4) int i0 = VEC_COND_EXPR < SAVE_EXPR , { -1, -1, > -1, -1 } , { 0, 0, 0, 0 } > ; > vector(4) int i1 = VEC_COND_EXPR < SAVE_EXPR , SAVE_EXPR , > SAVE_EXPR > ; > > but I have expected i0 != 0 in the second VEC_COND_EXPR. I don't put it there. This patch adds != 0, rather removing. But this could be changed. > I do see that the gimplifier pulls away the condition for the first > VEC_COND_EXPR though: > > x.0 = x; > y.1 = y; > D.2735 = x.0 < y.1; > D.2734 = D.2735; > D.2736 = D.2734; > i0 = [vec_cond_expr] VEC_COND_EXPR < D.2736 , { -1, -1, -1, -1 } , > { 0, 0, 0, 0 } > ; > > which is, I believe because of the SAVE_EXPR wrapped around the > comparison. Why do you bother wrapping all operands in save-exprs? I bother because they could be MAYBE_CONST which breaks the gimplifier. But I don't really know if you can do it better. I can always do this checking on operands of constructed vcond... You are right, that if you just put a comparison of variables there then we are fine. My point is that whenever gimplifier is pulling out the comparison from the first operand, replacing it with the variable, then we are screwed, because there is no chance to put it back, and that is exactly what happens in expand_vector_comparison, if you uncomment the replacement -- comparison is always represented as x = a > b. > With that the > > /* Currently the expansion of VEC_COND_EXPR does not allow > expessions where the type of vectors you compare differs > form the type of vectors you select from. For the time > being we insert implicit conversions. */ > if ((COMPARISON_CLASS_P (ifexp) > && TREE_TYPE (TREE_OPERAND (ifexp, 0)) != TREE_TYPE (op1)) > || TREE_TYPE (ifexp) != TREE_TYPE (op1)) > > checks will fail (because ifexp is a SAVE_EXPR). > > I'll run into errors when not adding the SAVE_EXPR around the ifexp, > the transform into x < y ? {-1,...} : {0,...} is not happening. >> >> Thanks, >> Artem. >> >> >> On Mon, Aug 22, 2011 at 9:58 PM, Richard Guenther >> wrote: >>> On Mon, Aug 22, 2011 at 10:49 PM, Artem Shinkarov >>> wrote: >>>> On Mon, Aug 22, 2011 at 9:42 PM, Richard Guenther >>>> wrote: >>>>> On Mon, Aug 22, 2011 at 5:58 PM, Artem Shinkarov >>>>> wrote: >>>>>> On Mon, Aug 22, 2011 at 4:50 PM, Richard Guenther >>>>>> wrote: >>>>>>> On Mon, Aug 22, 2011 at 5:43 PM, Artem Shinkarov >>>>>>> wrote: >>>>>>>> On Mon, Aug 22, 2011 at 4:34 PM, Richard Guenther >>>>>>>> wrote: >>>>>>>>> On Mon, Aug 22, 2011 at 5:21 PM, Artem Shinkarov >>>>>>>>> wrote: >>>>>>>>>> On Mon, Aug 22, 2011 at 4:01 PM, Richard Guenther >>>>>>>>>> wrote: >>>>>>>>>>> On Mon, Aug 22, 20
Re: Vector Comparison patch
On Mon, Aug 22, 2011 at 9:42 PM, Richard Guenther wrote: > On Mon, Aug 22, 2011 at 5:58 PM, Artem Shinkarov > wrote: >> On Mon, Aug 22, 2011 at 4:50 PM, Richard Guenther >> wrote: >>> On Mon, Aug 22, 2011 at 5:43 PM, Artem Shinkarov >>> wrote: >>>> On Mon, Aug 22, 2011 at 4:34 PM, Richard Guenther >>>> wrote: >>>>> On Mon, Aug 22, 2011 at 5:21 PM, Artem Shinkarov >>>>> wrote: >>>>>> On Mon, Aug 22, 2011 at 4:01 PM, Richard Guenther >>>>>> wrote: >>>>>>> On Mon, Aug 22, 2011 at 2:05 PM, Artem Shinkarov >>>>>>> wrote: >>>>>>>> On Mon, Aug 22, 2011 at 12:25 PM, Richard Guenther >>>>>>>> wrote: >>>>>>>>> On Mon, Aug 22, 2011 at 12:53 AM, Artem Shinkarov >>>>>>>>> wrote: >>>>>>>>>> Richard >>>>>>>>>> >>>>>>>>>> I formalized an approach a little-bit, now it works without target >>>>>>>>>> hooks, but some polishing is still required. I want you to comment on >>>>>>>>>> the several important approaches that I use in the patch. >>>>>>>>>> >>>>>>>>>> So how does it work. >>>>>>>>>> 1) All the vector comparisons at the level of type-checker are >>>>>>>>>> introduced using VEC_COND_EXPR with constant selection operands being >>>>>>>>>> {-1} and {0}. For example v0 > v1 is transformed into >>>>>>>>>> VEC_COND_EXPR>>>>>>>>>> v1, {-1}, {0}>. >>>>>>>>>> >>>>>>>>>> 2) When optabs expand VEC_COND_EXPR, two cases are considered: >>>>>>>>>> 2.a) first operand of VEC_COND_EXPR is comparison, in that case >>>>>>>>>> nothing changes. >>>>>>>>>> 2.b) first operand is something else, in that case, we specially mark >>>>>>>>>> this case, recognize it in the backend, and do not create a >>>>>>>>>> comparison, but use the mask as it was a result of some comparison. >>>>>>>>>> >>>>>>>>>> 3) In order to make sure that mask in VEC_COND_EXPR is >>>>>>>>>> a >>>>>>>>>> vector comparison we use is_vector_comparison function, if it returns >>>>>>>>>> false, then we replace mask with mask != {0}. >>>>>>>>>> >>>>>>>>>> So we end-up with the following functionality: >>>>>>>>>> VEC_COND_EXPR -- if we know that mask is a result of >>>>>>>>>> comparison of two vectors, we leave it as it is, otherwise change >>>>>>>>>> with >>>>>>>>>> mask != {0}. >>>>>>>>>> >>>>>>>>>> Plain vector comparison a b is represented with VEC_COND_EXPR, >>>>>>>>>> which correctly expands, without creating useless masking. >>>>>>>>>> >>>>>>>>>> >>>>>>>>>> Basically for me there are two questions: >>>>>>>>>> 1) Can we perform information passing in optabs in a nicer way? >>>>>>>>>> 2) How is_vector_comparison could be improved? I have several ideas, >>>>>>>>>> like checking if constant vector all consists of 0 and -1, and so on. >>>>>>>>>> But first is it conceptually fine. >>>>>>>>>> >>>>>>>>>> P.S. I tired to put the functionality of is_vector_comparison in >>>>>>>>>> tree-ssa-forwprop, but the thing is that it is called only with -On, >>>>>>>>>> which I find inappropriate, and the functionality gets more >>>>>>>>>> complicated. >>>>>>>>> >>>>>>>>> Why is it inappropriate to not optimize it at -O0? If the user >>>>>>>>> separates comparison and ?: expression it's his own fault. >>>>>>>> >>>>>>>> Well, because all the information is there, and I perfectly envision >>>>>>>> the case when user expressed co
Re: Vector Comparison patch
On Mon, Aug 22, 2011 at 8:50 PM, Uros Bizjak wrote: > On Mon, Aug 22, 2011 at 5:34 PM, Richard Guenther > wrote: > >>> In this case it is simple to analyse that a is a comparison, but you >>> cannot embed the operations of a into VEC_COND_EXPR. >> >> Sure, but if the above is C source the frontend would generate >> res = a != 0 ? v0 : v1; initially. An optimization pass could still >> track this information and replace VEC_COND_EXPR >> with VEC_COND_EXPR (no existing one would track >> vector contents though). >> >>> Ok, I am testing the patch that removes hooks. Could you push a little >>> bit the backend-patterns business? >> >> Well, I suppose we're waiting for Uros here. I hadn't much luck with >> fiddling with the mode-iterator stuff myself. > > It is not _that_ trivial change, since we have ix86_expand_fp_vcond > and ix86_expand_int_vcond to merge. ATM, FP version deals with FP > operands and vice versa. We have to merge them somehow and split out > comparison part that handles FP as well as integer operands. > > I also don't know why vcond is not allowed to FAIL... probably > middle-end should be enhanced for a fallback if some comparison isn't > supported by optab. > > Uros. > Uros My biggest problem in the backend is to create a correct description in *.md, which would accept the generic case. I can imagine adding all the cases, but as I mentioned already it explodes. I mean, we will have to do it for SSE, then the same number of cases for AVX, and so on. I would assume that there is a chance to persuade md, that the only thing that matters is the size of the element type of mask and operands. If you can help me with the pattern, I am happy to merge x86 code. Thanks, Artem.
Re: Vector Comparison patch
On Mon, Aug 22, 2011 at 4:50 PM, Richard Guenther wrote: > On Mon, Aug 22, 2011 at 5:43 PM, Artem Shinkarov > wrote: >> On Mon, Aug 22, 2011 at 4:34 PM, Richard Guenther >> wrote: >>> On Mon, Aug 22, 2011 at 5:21 PM, Artem Shinkarov >>> wrote: >>>> On Mon, Aug 22, 2011 at 4:01 PM, Richard Guenther >>>> wrote: >>>>> On Mon, Aug 22, 2011 at 2:05 PM, Artem Shinkarov >>>>> wrote: >>>>>> On Mon, Aug 22, 2011 at 12:25 PM, Richard Guenther >>>>>> wrote: >>>>>>> On Mon, Aug 22, 2011 at 12:53 AM, Artem Shinkarov >>>>>>> wrote: >>>>>>>> Richard >>>>>>>> >>>>>>>> I formalized an approach a little-bit, now it works without target >>>>>>>> hooks, but some polishing is still required. I want you to comment on >>>>>>>> the several important approaches that I use in the patch. >>>>>>>> >>>>>>>> So how does it work. >>>>>>>> 1) All the vector comparisons at the level of type-checker are >>>>>>>> introduced using VEC_COND_EXPR with constant selection operands being >>>>>>>> {-1} and {0}. For example v0 > v1 is transformed into VEC_COND_EXPR>>>>>>>> v1, {-1}, {0}>. >>>>>>>> >>>>>>>> 2) When optabs expand VEC_COND_EXPR, two cases are considered: >>>>>>>> 2.a) first operand of VEC_COND_EXPR is comparison, in that case >>>>>>>> nothing changes. >>>>>>>> 2.b) first operand is something else, in that case, we specially mark >>>>>>>> this case, recognize it in the backend, and do not create a >>>>>>>> comparison, but use the mask as it was a result of some comparison. >>>>>>>> >>>>>>>> 3) In order to make sure that mask in VEC_COND_EXPR is a >>>>>>>> vector comparison we use is_vector_comparison function, if it returns >>>>>>>> false, then we replace mask with mask != {0}. >>>>>>>> >>>>>>>> So we end-up with the following functionality: >>>>>>>> VEC_COND_EXPR -- if we know that mask is a result of >>>>>>>> comparison of two vectors, we leave it as it is, otherwise change with >>>>>>>> mask != {0}. >>>>>>>> >>>>>>>> Plain vector comparison a b is represented with VEC_COND_EXPR, >>>>>>>> which correctly expands, without creating useless masking. >>>>>>>> >>>>>>>> >>>>>>>> Basically for me there are two questions: >>>>>>>> 1) Can we perform information passing in optabs in a nicer way? >>>>>>>> 2) How is_vector_comparison could be improved? I have several ideas, >>>>>>>> like checking if constant vector all consists of 0 and -1, and so on. >>>>>>>> But first is it conceptually fine. >>>>>>>> >>>>>>>> P.S. I tired to put the functionality of is_vector_comparison in >>>>>>>> tree-ssa-forwprop, but the thing is that it is called only with -On, >>>>>>>> which I find inappropriate, and the functionality gets more >>>>>>>> complicated. >>>>>>> >>>>>>> Why is it inappropriate to not optimize it at -O0? If the user >>>>>>> separates comparison and ?: expression it's his own fault. >>>>>> >>>>>> Well, because all the information is there, and I perfectly envision >>>>>> the case when user expressed comparison separately, just to avoid code >>>>>> duplication. >>>>>> >>>>>> Like: >>>>>> mask = a > b; >>>>>> res1 = mask ? v0 : v1; >>>>>> res2 = mask ? v2 : v3; >>>>>> >>>>>> Which in this case would be different from >>>>>> res1 = a > b ? v0 : v1; >>>>>> res2 = a > b ? v2 : v3; >>>>>> >>>>>>> Btw, the new hook is still in the patch. >>>>>>> >>>>>>> I would simply always create != 0 if it isn't and let optimizers >>>>>>> (tree-ssa-forwprop.c) optimize this. Y
Re: Vector Comparison patch
On Mon, Aug 22, 2011 at 4:34 PM, Richard Guenther wrote: > On Mon, Aug 22, 2011 at 5:21 PM, Artem Shinkarov > wrote: >> On Mon, Aug 22, 2011 at 4:01 PM, Richard Guenther >> wrote: >>> On Mon, Aug 22, 2011 at 2:05 PM, Artem Shinkarov >>> wrote: >>>> On Mon, Aug 22, 2011 at 12:25 PM, Richard Guenther >>>> wrote: >>>>> On Mon, Aug 22, 2011 at 12:53 AM, Artem Shinkarov >>>>> wrote: >>>>>> Richard >>>>>> >>>>>> I formalized an approach a little-bit, now it works without target >>>>>> hooks, but some polishing is still required. I want you to comment on >>>>>> the several important approaches that I use in the patch. >>>>>> >>>>>> So how does it work. >>>>>> 1) All the vector comparisons at the level of type-checker are >>>>>> introduced using VEC_COND_EXPR with constant selection operands being >>>>>> {-1} and {0}. For example v0 > v1 is transformed into VEC_COND_EXPR>>>>>> v1, {-1}, {0}>. >>>>>> >>>>>> 2) When optabs expand VEC_COND_EXPR, two cases are considered: >>>>>> 2.a) first operand of VEC_COND_EXPR is comparison, in that case nothing >>>>>> changes. >>>>>> 2.b) first operand is something else, in that case, we specially mark >>>>>> this case, recognize it in the backend, and do not create a >>>>>> comparison, but use the mask as it was a result of some comparison. >>>>>> >>>>>> 3) In order to make sure that mask in VEC_COND_EXPR is a >>>>>> vector comparison we use is_vector_comparison function, if it returns >>>>>> false, then we replace mask with mask != {0}. >>>>>> >>>>>> So we end-up with the following functionality: >>>>>> VEC_COND_EXPR -- if we know that mask is a result of >>>>>> comparison of two vectors, we leave it as it is, otherwise change with >>>>>> mask != {0}. >>>>>> >>>>>> Plain vector comparison a b is represented with VEC_COND_EXPR, >>>>>> which correctly expands, without creating useless masking. >>>>>> >>>>>> >>>>>> Basically for me there are two questions: >>>>>> 1) Can we perform information passing in optabs in a nicer way? >>>>>> 2) How is_vector_comparison could be improved? I have several ideas, >>>>>> like checking if constant vector all consists of 0 and -1, and so on. >>>>>> But first is it conceptually fine. >>>>>> >>>>>> P.S. I tired to put the functionality of is_vector_comparison in >>>>>> tree-ssa-forwprop, but the thing is that it is called only with -On, >>>>>> which I find inappropriate, and the functionality gets more >>>>>> complicated. >>>>> >>>>> Why is it inappropriate to not optimize it at -O0? If the user >>>>> separates comparison and ?: expression it's his own fault. >>>> >>>> Well, because all the information is there, and I perfectly envision >>>> the case when user expressed comparison separately, just to avoid code >>>> duplication. >>>> >>>> Like: >>>> mask = a > b; >>>> res1 = mask ? v0 : v1; >>>> res2 = mask ? v2 : v3; >>>> >>>> Which in this case would be different from >>>> res1 = a > b ? v0 : v1; >>>> res2 = a > b ? v2 : v3; >>>> >>>>> Btw, the new hook is still in the patch. >>>>> >>>>> I would simply always create != 0 if it isn't and let optimizers >>>>> (tree-ssa-forwprop.c) optimize this. You still have to deal with >>>>> non-comparison operands during expansion though, but if >>>>> you always forced a != 0 from the start you can then simply >>>>> interpret it as a proper comparison result (in which case I'd >>>>> modify the backends to have an alternate pattern or directly >>>>> expand to masking operations - using the fake comparison >>>>> RTX is too much of a hack). >>>> >>>> Richard, I think you didn't get the problem. >>>> I really need the way, to pass the information, that the expression >>>> that is in the first operand of vcond is an appropriate mask. I
Re: Vector Comparison patch
On Mon, Aug 22, 2011 at 4:01 PM, Richard Guenther wrote: > On Mon, Aug 22, 2011 at 2:05 PM, Artem Shinkarov > wrote: >> On Mon, Aug 22, 2011 at 12:25 PM, Richard Guenther >> wrote: >>> On Mon, Aug 22, 2011 at 12:53 AM, Artem Shinkarov >>> wrote: >>>> Richard >>>> >>>> I formalized an approach a little-bit, now it works without target >>>> hooks, but some polishing is still required. I want you to comment on >>>> the several important approaches that I use in the patch. >>>> >>>> So how does it work. >>>> 1) All the vector comparisons at the level of type-checker are >>>> introduced using VEC_COND_EXPR with constant selection operands being >>>> {-1} and {0}. For example v0 > v1 is transformed into VEC_COND_EXPR>>>> v1, {-1}, {0}>. >>>> >>>> 2) When optabs expand VEC_COND_EXPR, two cases are considered: >>>> 2.a) first operand of VEC_COND_EXPR is comparison, in that case nothing >>>> changes. >>>> 2.b) first operand is something else, in that case, we specially mark >>>> this case, recognize it in the backend, and do not create a >>>> comparison, but use the mask as it was a result of some comparison. >>>> >>>> 3) In order to make sure that mask in VEC_COND_EXPR is a >>>> vector comparison we use is_vector_comparison function, if it returns >>>> false, then we replace mask with mask != {0}. >>>> >>>> So we end-up with the following functionality: >>>> VEC_COND_EXPR -- if we know that mask is a result of >>>> comparison of two vectors, we leave it as it is, otherwise change with >>>> mask != {0}. >>>> >>>> Plain vector comparison a b is represented with VEC_COND_EXPR, >>>> which correctly expands, without creating useless masking. >>>> >>>> >>>> Basically for me there are two questions: >>>> 1) Can we perform information passing in optabs in a nicer way? >>>> 2) How is_vector_comparison could be improved? I have several ideas, >>>> like checking if constant vector all consists of 0 and -1, and so on. >>>> But first is it conceptually fine. >>>> >>>> P.S. I tired to put the functionality of is_vector_comparison in >>>> tree-ssa-forwprop, but the thing is that it is called only with -On, >>>> which I find inappropriate, and the functionality gets more >>>> complicated. >>> >>> Why is it inappropriate to not optimize it at -O0? If the user >>> separates comparison and ?: expression it's his own fault. >> >> Well, because all the information is there, and I perfectly envision >> the case when user expressed comparison separately, just to avoid code >> duplication. >> >> Like: >> mask = a > b; >> res1 = mask ? v0 : v1; >> res2 = mask ? v2 : v3; >> >> Which in this case would be different from >> res1 = a > b ? v0 : v1; >> res2 = a > b ? v2 : v3; >> >>> Btw, the new hook is still in the patch. >>> >>> I would simply always create != 0 if it isn't and let optimizers >>> (tree-ssa-forwprop.c) optimize this. You still have to deal with >>> non-comparison operands during expansion though, but if >>> you always forced a != 0 from the start you can then simply >>> interpret it as a proper comparison result (in which case I'd >>> modify the backends to have an alternate pattern or directly >>> expand to masking operations - using the fake comparison >>> RTX is too much of a hack). >> >> Richard, I think you didn't get the problem. >> I really need the way, to pass the information, that the expression >> that is in the first operand of vcond is an appropriate mask. I though >> for quite a while and this hack is the only answer I found, is there a >> better way to do that. I could for example introduce another >> tree-node, but it would be overkill as well. >> >> Now why do I need it so much: >> I want to implement the comparison in a way that {1, 5, 0, -1} is >> actually {-1,-1,-1,-1}. So whenever I am not sure that mask of >> VEC_COND_EXPR is a real comparison I transform it to mask != {0} (not >> always). To check the stuff, I use is_vector_comparison in >> tree-vect-generic. >> >> So I really have the difference between mask ? x : y and mask != {0} ? >> x : y, otherwise I could treat mask != {0} in the backend as just >> mask. >> >&g
Re: Vector Comparison patch
On Mon, Aug 22, 2011 at 12:25 PM, Richard Guenther wrote: > On Mon, Aug 22, 2011 at 12:53 AM, Artem Shinkarov > wrote: >> Richard >> >> I formalized an approach a little-bit, now it works without target >> hooks, but some polishing is still required. I want you to comment on >> the several important approaches that I use in the patch. >> >> So how does it work. >> 1) All the vector comparisons at the level of type-checker are >> introduced using VEC_COND_EXPR with constant selection operands being >> {-1} and {0}. For example v0 > v1 is transformed into VEC_COND_EXPR>> v1, {-1}, {0}>. >> >> 2) When optabs expand VEC_COND_EXPR, two cases are considered: >> 2.a) first operand of VEC_COND_EXPR is comparison, in that case nothing >> changes. >> 2.b) first operand is something else, in that case, we specially mark >> this case, recognize it in the backend, and do not create a >> comparison, but use the mask as it was a result of some comparison. >> >> 3) In order to make sure that mask in VEC_COND_EXPR is a >> vector comparison we use is_vector_comparison function, if it returns >> false, then we replace mask with mask != {0}. >> >> So we end-up with the following functionality: >> VEC_COND_EXPR -- if we know that mask is a result of >> comparison of two vectors, we leave it as it is, otherwise change with >> mask != {0}. >> >> Plain vector comparison a b is represented with VEC_COND_EXPR, >> which correctly expands, without creating useless masking. >> >> >> Basically for me there are two questions: >> 1) Can we perform information passing in optabs in a nicer way? >> 2) How is_vector_comparison could be improved? I have several ideas, >> like checking if constant vector all consists of 0 and -1, and so on. >> But first is it conceptually fine. >> >> P.S. I tired to put the functionality of is_vector_comparison in >> tree-ssa-forwprop, but the thing is that it is called only with -On, >> which I find inappropriate, and the functionality gets more >> complicated. > > Why is it inappropriate to not optimize it at -O0? If the user > separates comparison and ?: expression it's his own fault. Well, because all the information is there, and I perfectly envision the case when user expressed comparison separately, just to avoid code duplication. Like: mask = a > b; res1 = mask ? v0 : v1; res2 = mask ? v2 : v3; Which in this case would be different from res1 = a > b ? v0 : v1; res2 = a > b ? v2 : v3; > Btw, the new hook is still in the patch. > > I would simply always create != 0 if it isn't and let optimizers > (tree-ssa-forwprop.c) optimize this. You still have to deal with > non-comparison operands during expansion though, but if > you always forced a != 0 from the start you can then simply > interpret it as a proper comparison result (in which case I'd > modify the backends to have an alternate pattern or directly > expand to masking operations - using the fake comparison > RTX is too much of a hack). Richard, I think you didn't get the problem. I really need the way, to pass the information, that the expression that is in the first operand of vcond is an appropriate mask. I though for quite a while and this hack is the only answer I found, is there a better way to do that. I could for example introduce another tree-node, but it would be overkill as well. Now why do I need it so much: I want to implement the comparison in a way that {1, 5, 0, -1} is actually {-1,-1,-1,-1}. So whenever I am not sure that mask of VEC_COND_EXPR is a real comparison I transform it to mask != {0} (not always). To check the stuff, I use is_vector_comparison in tree-vect-generic. So I really have the difference between mask ? x : y and mask != {0} ? x : y, otherwise I could treat mask != {0} in the backend as just mask. If this link between optabs and backend breaks, then the patch falls apart. Because every time the comparison is taken out VEC_COND_EXPR, I will have to put != {0}. Keep in mind, that I cannot always put the comparison inside the VEC_COND_EXPR, what if it is defined in a function-comparison, or somehow else? So what would be an appropriate way to connect optabs and the backend? Thanks, Artem. All the rest would be adjusted later. > tree > constant_boolean_node (int value, tree type) > { > - if (type == integer_type_node) > + if (TREE_CODE (type) == VECTOR_TYPE) > + { > + tree tval; > + > + gcc_assert (TREE_CODE (TREE_TYPE (type)) == INTEGER_TYPE); > + tval = build_int_cst (TREE_TYPE (type), value); > + return build_vector_from_val (type, tval); > > as value is either 0 or 1 that won't work. Oh, I see you
Re: Vector Comparison patch
On Fri, Aug 19, 2011 at 3:54 PM, Richard Guenther wrote: > On Fri, Aug 19, 2011 at 2:29 AM, Artem Shinkarov > wrote: >> Hi, I had the problem with passing information about single variable >> from expand_vec_cond_expr optab into ix86_expand_*_vcond. >> >> I looked into it this problem for quite a while and found a solution. >> Now the question if it could be done better. >> >> First of all the problem: >> >> If we represent any vector comparison with VEC_COND_EXPR < v0 v1 >> ? {-1,...} : {0,...} >, then in the assembler we do not want to see >> this useless comparison with {-1...}. >> >> Now it is easy to fix the problem about excessive masking. The real >> challenge starts when the comparison inside vcond is expressed as a >> variable. In that case in order to construct correct vector expression >> we need to adjust cond in cond ? v0 : v1 to cond == {-1...} or as we >> agreed recently cond != {0,..}. But hat we need to do only to make >> vec_cond_expr happy. On the level of assembler we don't want this >> condition. >> >> Now, if I just construct the tree, then in x86, rtx_equal_p, does not >> know that this is a constant vector full of -1, because the comparison >> operands are not immediate. So I need somehow to mark the fact in >> optabs, and then check the information in the x86. > > Well, this is why I was suggesting the bitwise semantic for a mask > operand. What we should do on the tree level (and that should happen > already), is forward the comparison into the COND_EXPR. Thus, > > mask = v1 < v2; > v3 = mask ? v4 : v5; > > should get changed to > > v3 = v1 < v2 ? v4 : v5; > > by tree-ssa-forwprop.c. If that is not happening we have to fix that there. Yeah, that is something I am working on. > Because we _don't_ know the mask is all -1 or 0 ;) The user might > put in {3, 5 ,1 3} and expect it to be treated like {-1,...} but it isn't > so already. > >> At the moment I do something like this: >> >> optabs: >> >> if (!COMPARISON_CLASS_P (op0)) >> ops[3] = gen_rtx_EQ (mode, NULL_RTX, NULL_RTX); >> >> This expression is preserved while checking and verifying. >> >> ix86: >> if (GET_CODE (comp) == EQ && XEXP (comp, 0) == NULL_RTX >> && XEXP (comp, 1) == NULL_RTX) >> >> See the patch attached for more details. The patch is just to give you >> an idea of the way I am doing it and it seems to work. Please don't >> criticise the patch itself, better help me to understand if there is a >> better way to pass the information from optabs to ix86. > > Hm, I'm not sure the expand_vec_cond_expr will work that way, > I'd have to play with it myself (but will now be running for weekend). > > Is the special-casing of a < b ? {-1,-1,-1} : {0,0,0,0} in the backend > working for you? I think there are probably some rtl all-ones and all-zeros > predicates you can re-use. > > Richard. It works fine. Masks all ones and all zeroes are predefined, all -1 are not, but I am switching to all zeroes. The real question is that this special case of comparison with two empty operands is a little bit hackish. On the other hand there should be no problem with that, because operand 3 is used only to get the code of comparison, noone is looking inside the arguments, so we could use this fact. The question is whether there is a better way. Thanks, Artem.
Re: Vector Comparison patch
Hi, I had the problem with passing information about single variable from expand_vec_cond_expr optab into ix86_expand_*_vcond. I looked into it this problem for quite a while and found a solution. Now the question if it could be done better. First of all the problem: If we represent any vector comparison with VEC_COND_EXPR < v0 v1 ? {-1,...} : {0,...} >, then in the assembler we do not want to see this useless comparison with {-1...}. Now it is easy to fix the problem about excessive masking. The real challenge starts when the comparison inside vcond is expressed as a variable. In that case in order to construct correct vector expression we need to adjust cond in cond ? v0 : v1 to cond == {-1...} or as we agreed recently cond != {0,..}. But hat we need to do only to make vec_cond_expr happy. On the level of assembler we don't want this condition. Now, if I just construct the tree, then in x86, rtx_equal_p, does not know that this is a constant vector full of -1, because the comparison operands are not immediate. So I need somehow to mark the fact in optabs, and then check the information in the x86. At the moment I do something like this: optabs: if (!COMPARISON_CLASS_P (op0)) ops[3] = gen_rtx_EQ (mode, NULL_RTX, NULL_RTX); This expression is preserved while checking and verifying. ix86: if (GET_CODE (comp) == EQ && XEXP (comp, 0) == NULL_RTX && XEXP (comp, 1) == NULL_RTX) See the patch attached for more details. The patch is just to give you an idea of the way I am doing it and it seems to work. Please don't criticise the patch itself, better help me to understand if there is a better way to pass the information from optabs to ix86. Thanks, Artem. On Thu, Aug 18, 2011 at 3:31 PM, Richard Henderson wrote: > On 08/18/2011 02:23 AM, Richard Guenther wrote: >> The first one (inefficient) is vec0 > vec1 ? {-1,...} : {0,...} >> The second is vec0 > vec1. expand_vec_cond_expr is stupid, which is >> fine, but it means that we need to construct it carefully. >>> > >>> > This is still important. >> Yes. I think the backends need to handle optimizing this case, >> esp. considering targets that do not have instructions to produce >> a {-1,...}/{0,...} bitmask from a comparison but produce a vector >> of condition codes. With using vec0 > vec1 ? {-1...} : {0,...} for >> mask = vec0 > vec1; we avoid exposing the result kind of >> vector comparisons. >> >> It should be easily possible for x86 for example to recognize >> the -1 : 0 case. >> > > I think you've been glossing over the hard part with "..." up there. > I challenge you to actually fill that in with something meaningful > in rtl. > > I suspect that you simply have to add another named pattern that > will Do What You Want on mips and suchlike that produce a CCmode. > > > > r~ > Index: gcc/optabs.c === --- gcc/optabs.c(revision 177665) +++ gcc/optabs.c(working copy) @@ -6557,6 +6557,8 @@ expand_vec_cond_expr_p (tree type, enum /* Generate insns for a VEC_COND_EXPR, given its TYPE and its three operands. */ +rtx rtx_build_vector_from_val (enum machine_mode, HOST_WIDE_INT); +rtx gen_const_vector1 (enum machine_mode, int); rtx expand_vec_cond_expr (tree vec_cond_type, tree op0, tree op1, tree op2, @@ -6572,16 +6574,39 @@ expand_vec_cond_expr (tree vec_cond_type if (icode == CODE_FOR_nothing) return 0; - comparison = vector_compare_rtx (op0, unsignedp, icode); rtx_op1 = expand_normal (op1); rtx_op2 = expand_normal (op2); + + if (COMPARISON_CLASS_P (op0)) +{ + comparison = vector_compare_rtx (op0, unsignedp, icode); + create_output_operand (&ops[0], target, mode); + create_input_operand (&ops[1], rtx_op1, mode); + create_input_operand (&ops[2], rtx_op2, mode); + create_fixed_operand (&ops[3], comparison); + create_fixed_operand (&ops[4], XEXP (comparison, 0)); + create_fixed_operand (&ops[5], XEXP (comparison, 1)); + +} + else +{ + enum rtx_code rcode; + rtx rtx_op0; + rtx vec; + + rtx_op0 = expand_normal (op0); + rcode = get_rtx_code (EQ_EXPR, unsignedp); + comparison = gen_rtx_EQ (mode, NULL_RTX, NULL_RTX); + vec = rtx_build_vector_from_val (mode, -1); + + create_output_operand (&ops[0], target, mode); + create_input_operand (&ops[1], rtx_op1, mode); + create_input_operand (&ops[2], rtx_op2, mode); + create_input_operand (&ops[3], comparison, mode); + create_input_operand (&ops[4], rtx_op0, mode); + create_input_operand (&ops[5], vec, mode); +} - create_output_operand (&ops[0], target, mode); - create_input_operand (&ops[1], rtx_op1, mode); - create_input_operand (&ops[2], rtx_op2, mode); - create_fixed_operand (&ops[3], comparison); - create_fixed_operand (&ops[4], XEXP (comparison, 0)); - create_fixed_operand (&ops[5], XEXP (comparison, 1)); expand_insn (icode, 6, ops); return o
Re: Vector Comparison patch
Richard, I am trying to make sure that when vcond has {-1} and {0} it does not trigger masking. Currently I am doing this: Index: config/i386/i386.c === --- config/i386/i386.c (revision 177665) +++ config/i386/i386.c (working copy) @@ -25,6 +25,7 @@ along with GCC; see the file COPYING3. #include "tm.h" #include "rtl.h" #include "tree.h" +#include "tree-flow.h" #include "tm_p.h" #include "regs.h" #include "hard-reg-set.h" @@ -18434,7 +18435,30 @@ ix86_expand_sse_movcc (rtx dest, rtx cmp { enum machine_mode mode = GET_MODE (dest); rtx t2, t3, x; - + rtx mask_true; + + rtvec v; + int units, i; + enum machine_mode inner; + + units = GET_MODE_NUNITS (mode); + inner = GET_MODE_INNER (mode); + v = rtvec_alloc (units); + for (i = 0; i < units; ++i) +RTVEC_ELT (v, i) = gen_rtx_CONST_INT (inner, -1); + + mask_true = gen_rtx_raw_CONST_VECTOR (mode, v); + + fprintf (stderr, "I am here\n"); + debug_rtx (mask_true); + debug_rtx (op_true); + if (rtx_equal_p (op_true, mask_true)) +{ + fprintf (stderr, "Yes it is\n"); + emit_insn (gen_rtx_SET (VOIDmode, dest, cmp)); + return; +} + else if (op_false == CONST0_RTX (mode)) { op_true = force_reg (mode, op_true); It works out the case when mask is -1 very well, however in the code generated by the expansion I still see excessive operations: ires = i0 < i1 ? (vector (4, int)){-1,-1,-1,-1} : (vector (4, int)){0,0,0,0}; expands to: pcmpgtd %xmm1, %xmm0 pcmpeqd %xmm1, %xmm1 pcmpeqd %xmm1, %xmm0 movdqa %xmm0, -24(%rsp) Where the code ires = i0 < i1; using my hook expands to: pcmpgtd %xmm1, %xmm0 movdqa %xmm0, -24(%rsp) So someone is putting two extra instructions there, and I cannot really figure out who is doing that. Anyone knows how could I fix this... Artem.
Re: Vector Comparison patch
On Wed, Aug 17, 2011 at 10:52 PM, Joseph S. Myers wrote: > On Wed, 17 Aug 2011, Artem Shinkarov wrote: > >> +For the convenience condition in the vector conditional can be just a >> +vector of signed integer type. In that case this vector is implicitly >> +compared with vectors of zeroes. Consider an example: > > Where is this bit tested in the testcases added? In the gcc.c-torture/execute/vector-vcond-2.c at the end of test-case: /* Condition expressed with a single variable. */ dres = l0 ? d0 : d1; check_compare (2, dres, l0, ((vector (2, long)){-1,-1}), d0, d1, ==, "%f", "%i"); lres = l1 ? l0 : l1; check_compare (2, lres, l1, ((vector (2, long)){-1,-1}), l0, l1, ==, "%i", "%i"); fres = i0 ? f0 : f1; check_compare (4, fres, i0, ((vector (4, int)){-1,-1,-1,-1}), f0, f1, ==, "%f", "%i"); ires = i1 ? i0 : i1; check_compare (4, ires, i1, ((vector (4, int)){-1,-1,-1,-1}), i0, i1, ==, "%i", "%i"); > >> + if (TREE_CODE (type1) != VECTOR_TYPE >> + || TREE_CODE (type2) != VECTOR_TYPE) >> + { >> + error_at (colon_loc, "vector comparisom arguments must be of " >> + "type vector"); > > "comparison" Thanks, adjusted. >> + /* Avoid C_MAYBE_CONST in VEC_COND_EXPR. */ >> + sc = c_fully_fold (ifexp, false, &maybe_const); >> + sc = save_expr (sc); >> + if (!maybe_const) >> + ifexp = c_wrap_maybe_const (sc, true); >> + else >> + ifexp = sc; > > This looks like it's duplicating c_save_expr; that is, like "ifexp = > c_save_expr (ifexp);" would suffice. > > But, it's not clear that it actually achieves the effect described in the > comment; have you actually tried with function calls, assignments etc. in > the operands? I tested it with gcc.dg/vector-compare-2.c: typedef int vec __attribute__((vector_size(16))); vec i,j; extern vec a, b, c; vec foo (int x) { return (x ? i : j) ? a : b; } vec bar (int x) { return a ? (x ? i : j) : b; } vec baz (int x) { return a ? b : (x ? i : j); } Is it good enough? > The code in build_binary_op uses save_expr rather than > c_save_expr because it does some intermediate operations before calling > c_wrap_maybe_const, and if you really want to avoid C_MAYBE_CONST in > VEC_COND_EXPR then you'll need to continue calling save_expr, as here, but > delay the call to c_wrap_maybe_const so that the whole VEC_COND_EXPR is > wrapped if required. Ok, but I need to wrap it at some point, where do you think it would be appropriate to do? Thanks, Artem.
Re: Vector Comparison patch
> Yes. I think the backends need to handle optimizing this case, > esp. considering targets that do not have instructions to produce > a {-1,...}/{0,...} bitmask from a comparison but produce a vector > of condition codes. With using vec0 > vec1 ? {-1...} : {0,...} for > mask = vec0 > vec1; we avoid exposing the result kind of > vector comparisons. > > It should be easily possible for x86 for example to recognize > the -1 : 0 case. Ok, I am fine with this approach. Ho could we check if vector comparison returns {-1..}/{0..} or something else. If I can check that, I could adjust expand_vec_cond_exrp, and get rid of the hook. > Yes, and I think the fix is in the backends. I still think we have to > sort out the best building blocks we want the targets to expose. > Currently we only have the vectorizer vcond patterns which should > be enough to get the C language support implemented. After that > we should concentrate on generating efficient code for all variants. Ok, see my above comment. > Yeah, well. That's really a question for language lawyers ;) I agree > that it would be nice to have mask ? val0 : val1 behave "the same" > for scalars and vectors. The question is whether for vectors you > define it on the bit-level (which makes it equal to (mask & val0) | > (~mask & val1)) > or on the vector component level. The vector component level > is probably what people would expect. > > Which means we have to treat mask ? val0 : val1 as > mask != {0,...} ? val0 : val1. > I'd use != {0,0,...} as eventually a zero vector is cheaper to construct > and it supports the scalar ?: semantics - whenever the mask element > is non-zero it's true. Ok, I am fine with x != {0,...}, I can adjust it in both cases. Artem.
Re: Vector Comparison patch
On Wed, Aug 17, 2011 at 3:58 PM, Richard Guenther wrote: > On Wed, Aug 17, 2011 at 3:30 PM, Artem Shinkarov > wrote: >> Hi >> >> Several comments before the new version of the patch. >> 1) x != x >> I am happy to adjust constant_boolean_node, but look at the code >> around line 9074 in fold-const.c, you will see that x x >> elimination, even with adjusted constant_boolean_node, will look about >> the same as my code. Because I need to check the parameters (!FLOAT_P, >> HONOR_NANS) on TREE_TYPE (arg0) not arg0, and I need to construct >> constant_boolean_node (-1), not 1 in case of true. >> But I will change constant_boolean_node to accept vector types. > > Hm, that should be handled transparently if you look at the defines > of FLOAT_TYPE_P and the HONOR_* macros. > Ok, Currently I have this, what do you think: int true_val = TREE_CODE (type) == VECTOR_TYPE ? -1 : 0; tree arg0_type = TREE_CODE (type) == VECTOR_TYPE ? TREE_TYPE (TREE_TYPE (arg0)) : TREE_TYPE (arg0); switch (code) { case EQ_EXPR: if (! FLOAT_TYPE_P (arg0_type) || ! HONOR_NANS (TYPE_MODE (arg0_type))) return constant_boolean_node (true_val, type); break; case GE_EXPR: case LE_EXPR: if (! FLOAT_TYPE_P (arg0_type) || ! HONOR_NANS (TYPE_MODE (arg0_type))) return constant_boolean_node (true_val, type); return fold_build2_loc (loc, EQ_EXPR, type, arg0, arg1); case NE_EXPR: /* For NE, we can only do this simplification if integer or we don't honor IEEE floating point NaNs. */ if (FLOAT_TYPE_P (arg0_type) && HONOR_NANS (TYPE_MODE (arg0_type))) break; /* ... fall through ... */ case GT_EXPR: case LT_EXPR: return constant_boolean_node (0, type); default: gcc_unreachable (); } Works fine for both vector and scalar cases. >> >> 2) comparison vs vcond >> v = v1 < v2; >> v = v1 < v2 ? {-1,...} : {0,...}; >> >> are not the same. >> 16,25c16,22 >> < movdqa .LC1(%rip), %xmm1 >> < pshufd $225, %xmm1, %xmm1 >> < pshufd $39, %xmm0, %xmm0 >> < movss %xmm2, %xmm1 >> < pshufd $225, %xmm1, %xmm1 >> < pcmpgtd %xmm1, %xmm0 >> < pcmpeqd %xmm1, %xmm1 >> < pcmpeqd %xmm1, %xmm0 >> < pand %xmm1, %xmm0 >> < movdqa %xmm0, -24(%rsp) >> --- >>> pshufd $39, %xmm0, %xmm1 >>> movdqa .LC1(%rip), %xmm0 >>> pshufd $225, %xmm0, %xmm0 >>> movss %xmm2, %xmm0 >>> pshufd $225, %xmm0, %xmm0 >>> pcmpgtd %xmm0, %xmm1 >>> movdqa %xmm1, -24(%rsp) >> >> So I would keep the hook, it could be removed at any time when the >> standard expansion will start to work fine. > > Which one is which? You must be joking. :) The first one (inefficient) is vec0 > vec1 ? {-1,...} : {0,...} The second is vec0 > vec1. expand_vec_cond_expr is stupid, which is fine, but it means that we need to construct it carefully. > I'd really like to make this patch simpler at first, > and removing that hook is an obvious thing that _should_ be possible, > even optimally (by fixing the targets). Ok, let's remove the hook, then could you provide some more information rather than we just need to do it? Simple in this case means inefficient -- I would hope to make it efficient as well. >> 3) mask ? vec0 : vec1 >> So no, I don't think we need to convert {3, 4, -1, 5} to {0,0,-1,0} >> (that would surprise my anyway, I'd have expected {-1,-1,-1,-1} ;)). >> >> Does OpenCL somehow support you here? >> >> OpenCL says that vector operation mask ? vec0 : vec1 is the same as >> select (vec0, vec1, mask). The semantics of select operation is the >> following: >> >> gentype select (gentype a, gentype b, igentype c) >> For each component of a vector type, >> result[i] = if MSB of c[i] is set ? b[i] : a[i]. >> >> I am not sure what they really understand using the term MSB. As far >> as I know MSB is Most Significant Bit, so does it mean that in case of >> 3-bit integer 100 would trigger true but 011 would be still false... > > Yes, MSB is Most Significant Bit - that's a somewhat odd definition ;) > >> My reading would be that if all bits set, then take the first element, >> otherwise the second. >> >> It is also confusing when a ? vec0 : vec1, and a != 0 ? vec0 vec1 >&
Re: Vector Comparison patch
Hi Several comments before the new version of the patch. 1) x != x I am happy to adjust constant_boolean_node, but look at the code around line 9074 in fold-const.c, you will see that x x elimination, even with adjusted constant_boolean_node, will look about the same as my code. Because I need to check the parameters (!FLOAT_P, HONOR_NANS) on TREE_TYPE (arg0) not arg0, and I need to construct constant_boolean_node (-1), not 1 in case of true. But I will change constant_boolean_node to accept vector types. 2) comparison vs vcond v = v1 < v2; v = v1 < v2 ? {-1,...} : {0,...}; are not the same. 16,25c16,22 < movdqa .LC1(%rip), %xmm1 < pshufd $225, %xmm1, %xmm1 < pshufd $39, %xmm0, %xmm0 < movss %xmm2, %xmm1 < pshufd $225, %xmm1, %xmm1 < pcmpgtd %xmm1, %xmm0 < pcmpeqd %xmm1, %xmm1 < pcmpeqd %xmm1, %xmm0 < pand%xmm1, %xmm0 < movdqa %xmm0, -24(%rsp) --- > pshufd $39, %xmm0, %xmm1 > movdqa .LC1(%rip), %xmm0 > pshufd $225, %xmm0, %xmm0 > movss %xmm2, %xmm0 > pshufd $225, %xmm0, %xmm0 > pcmpgtd %xmm0, %xmm1 > movdqa %xmm1, -24(%rsp) So I would keep the hook, it could be removed at any time when the standard expansion will start to work fine. 3) mask ? vec0 : vec1 So no, I don't think we need to convert {3, 4, -1, 5} to {0,0,-1,0} (that would surprise my anyway, I'd have expected {-1,-1,-1,-1} ;)). Does OpenCL somehow support you here? OpenCL says that vector operation mask ? vec0 : vec1 is the same as select (vec0, vec1, mask). The semantics of select operation is the following: gentype select (gentype a, gentype b, igentype c) For each component of a vector type, result[i] = if MSB of c[i] is set ? b[i] : a[i]. I am not sure what they really understand using the term MSB. As far as I know MSB is Most Significant Bit, so does it mean that in case of 3-bit integer 100 would trigger true but 011 would be still false... My reading would be that if all bits set, then take the first element, otherwise the second. It is also confusing when a ? vec0 : vec1, and a != 0 ? vec0 vec1 produce different results. So I would stick to all bits set being true scenario. 4) Backend stuff. Ok, we could always fall back to reject the cases when cond and operands have different type, and then fix the backend. Adjustments are coming. Artem.
Re: Vector Comparison patch
On Tue, Aug 16, 2011 at 4:28 PM, Richard Guenther wrote: > On Mon, Aug 15, 2011 at 6:58 PM, Artem Shinkarov > wrote: >> On Mon, Aug 15, 2011 at 3:24 PM, Richard Guenther >> wrote: >>> On Fri, Aug 12, 2011 at 4:03 AM, Artem Shinkarov >>> wrote: >>>> Hi >>>> >>>> Here is a completed version of the vector comparison patch we >>>> discussed a long time ago here: >>>> http://gcc.gnu.org/ml/gcc-patches/2010-08/msg01184.html >>>> >>>> The patch implements vector comparison according to the OpenCL >>>> standard, when the result of the comparison of two vectors is vector >>>> of signed integers, where -1 represents true and 0 false. >>>> >>>> The patch implements vector conditional res = VCOND >>>> which is expanded into: >>>> foreach (i in length (V1)) res[i] = V1 == 0 ? V3[i] : V2[i]. >>> >>> Some comments on the patch below. First, in general I don't see >>> why you need a new target hook to specify whether to "vectorize" >>> a comparison. Why are the existing hooks used by the vectorizer >>> not enough? >>> >>> Index: gcc/fold-const.c >>> === >>> --- gcc/fold-const.c (revision 177665) >>> +++ gcc/fold-const.c (working copy) >>> @@ -9073,34 +9073,61 @@ fold_comparison (location_t loc, enum tr >>> floating-point, we can only do some of these simplifications.) */ >>> if (operand_equal_p (arg0, arg1, 0)) >>> { >>> - switch (code) >>> + if (TREE_CODE (TREE_TYPE (arg0)) == VECTOR_TYPE) >>> { >>> - case EQ_EXPR: >>> - if (! FLOAT_TYPE_P (TREE_TYPE (arg0)) >>> - || ! HONOR_NANS (TYPE_MODE (TREE_TYPE (arg0 >>> - return constant_boolean_node (1, type); >>> >>> I think this change should go in a separate patch for improved >>> constant folding. It shouldn't be necessary for enabling vector compares, >>> no? >> >> Unfortunately no, this case must be covered here, otherwise x != x >> condition fails. > > How does it fail? When I have x > x, x == x, and so on, fold-const.c trigger operand_equal_p (arg0, arg1, 0), which returns true, and then it calls constant_boolean_node (, type). But the problem is that the result of the comparison is a vector, not a boolean. So we have an assertion failure: test.c: In function ‘foo’: test.c:9:3: internal compiler error: in build_int_cst_wide, at tree.c:1222 Please submit a full bug report, with preprocessed source if appropriate. > >>> >>> + if (TYPE_UNSIGNED (TREE_TYPE (TREE_TYPE (ifexp >>> + { >>> + error_at (colon_loc, "vector comparison must be of signed " >>> + "integer vector type"); >>> + return error_mark_node; >>> + } >>> >>> why that? >> >> Well, later on I rely on this fact. I mean OpenCL says that it should >> return -1 in the sense that all bits set. I don't really know, I can >> support unsigned masks as well, but wouldn't it just introduce a >> source for possible errors. I mean that natural choice for true and >> flase is 0 and 1, not 0 and -1. Anyway I don't have a strong opinion >> there, and I could easily adjust it if we decide that we want it. > > I think we want to allow both signed and unsigned masks. Ok, I'll adjust. > >>> >>> + if (TYPE_VECTOR_SUBPARTS (type1) != TYPE_VECTOR_SUBPARTS (type2) >>> + || TYPE_VECTOR_SUBPARTS (TREE_TYPE (ifexp)) >>> + != TYPE_VECTOR_SUBPARTS (type1)) >>> + { >>> + error_at (colon_loc, "vectors of different length found in " >>> + "vector comparison"); >>> + return error_mark_node; >>> + } >>> >>> I miss verification that type1 and type2 are vector types, or is that done >>> elsewhere? I think type1 and type2 are already verified to be >>> compatible (but you might double-check). At least the above would be >>> redundant with >> >> Thanks, type1 and type2 both vectors comparison is missing, going to >> be added in the new version of the patch. >>> >>> + if (type1 != type2) >>> + { >>> + error_at (colon_loc, "vectors of different types involved in " >>> +
Re: Vector Comparison patch
On Mon, Aug 15, 2011 at 3:24 PM, Richard Guenther wrote: > On Fri, Aug 12, 2011 at 4:03 AM, Artem Shinkarov > wrote: >> Hi >> >> Here is a completed version of the vector comparison patch we >> discussed a long time ago here: >> http://gcc.gnu.org/ml/gcc-patches/2010-08/msg01184.html >> >> The patch implements vector comparison according to the OpenCL >> standard, when the result of the comparison of two vectors is vector >> of signed integers, where -1 represents true and 0 false. >> >> The patch implements vector conditional res = VCOND >> which is expanded into: >> foreach (i in length (V1)) res[i] = V1 == 0 ? V3[i] : V2[i]. > > Some comments on the patch below. First, in general I don't see > why you need a new target hook to specify whether to "vectorize" > a comparison. Why are the existing hooks used by the vectorizer > not enough? > > Index: gcc/fold-const.c > === > --- gcc/fold-const.c (revision 177665) > +++ gcc/fold-const.c (working copy) > @@ -9073,34 +9073,61 @@ fold_comparison (location_t loc, enum tr > floating-point, we can only do some of these simplifications.) */ > if (operand_equal_p (arg0, arg1, 0)) > { > - switch (code) > + if (TREE_CODE (TREE_TYPE (arg0)) == VECTOR_TYPE) > { > - case EQ_EXPR: > - if (! FLOAT_TYPE_P (TREE_TYPE (arg0)) > - || ! HONOR_NANS (TYPE_MODE (TREE_TYPE (arg0 > - return constant_boolean_node (1, type); > > I think this change should go in a separate patch for improved > constant folding. It shouldn't be necessary for enabling vector compares, no? Unfortunately no, this case must be covered here, otherwise x != x condition fails. > > + if (TYPE_UNSIGNED (TREE_TYPE (TREE_TYPE (ifexp > + { > + error_at (colon_loc, "vector comparison must be of signed " > + "integer vector type"); > + return error_mark_node; > + } > > why that? Well, later on I rely on this fact. I mean OpenCL says that it should return -1 in the sense that all bits set. I don't really know, I can support unsigned masks as well, but wouldn't it just introduce a source for possible errors. I mean that natural choice for true and flase is 0 and 1, not 0 and -1. Anyway I don't have a strong opinion there, and I could easily adjust it if we decide that we want it. > > + if (TYPE_VECTOR_SUBPARTS (type1) != TYPE_VECTOR_SUBPARTS (type2) > + || TYPE_VECTOR_SUBPARTS (TREE_TYPE (ifexp)) > + != TYPE_VECTOR_SUBPARTS (type1)) > + { > + error_at (colon_loc, "vectors of different length found in " > + "vector comparison"); > + return error_mark_node; > + } > > I miss verification that type1 and type2 are vector types, or is that done > elsewhere? I think type1 and type2 are already verified to be > compatible (but you might double-check). At least the above would be > redundant with Thanks, type1 and type2 both vectors comparison is missing, going to be added in the new version of the patch. > > + if (type1 != type2) > + { > + error_at (colon_loc, "vectors of different types involved in " > + "vector comparison"); > + return error_mark_node; > + } You are right, what I meant here is TREE_TYPE (type1) != TREE_TYPE (type2), because vector (4, int) have the same number of elements as vector (4, float). This would be fixed in the new version. > > Joseph may have comments about the fully-fold stuff that follows. > > + /* Currently the expansion of VEC_COND_EXPR does not allow > + expessions where the type of vectors you compare differs > + form the type of vectors you select from. For the time > + being we insert implicit conversions. */ > + if ((COMPARISON_CLASS_P (ifexp) > > Why only for comparison-class? Not only, there is || involved: (COMPARISON_CLASS_P (ifexp) && TREE_TYPE (TREE_OPERAND (ifexp, 0)) != type1) || TREE_TYPE (ifexp) != type1 So if this is a comparison class, we check the first operand, because the result of the comparison fits, however the operands could not. In case we have an expression of signed vector, we know that we would transform it into exp != {0,0,...} in tree-vect-generic.c, but if the types of operands do not match we convert them. > > + && TREE_TYPE (TREE_OPERAND (ifexp, 0)) != type1) > + || TREE_TYPE (ifexp) != type1) > + { > + tree comp_type
Re: Scalar vector binary operation
Sorry, I didn't attach the patch itself. Here we go, in the attachment. Artem. Index: gcc/doc/extend.texi === --- gcc/doc/extend.texi (revision 177589) +++ gcc/doc/extend.texi (working copy) @@ -6526,18 +6526,25 @@ In C it is possible to use shifting oper integer-type vectors. The operation is defined as following: @code{@{a0, a1, @dots{}, an@} >> @{b0, b1, @dots{}, bn@} == @{a0 >> b0, a1 >> b1, @dots{}, an >> bn@}}@. Vector operands must have the same number of -elements. Additionally second operands can be a scalar integer in which -case the scalar is converted to the type used by the vector operand (with -possible truncation) and each element of this new vector is the scalar's -value. +elements. + +For the convenience in C it is allowed to use a binary vector operation +where one operand is a scalar. In that case the compiler will transform +the scalar operand into a vector where each element is the scalar from +the operation. The transformation will happen only if the scalar could be +safely converted to the vector-element type. Consider the following code. @smallexample typedef int v4si __attribute__ ((vector_size (16))); -v4si a, b; +v4si a, b, c; +long l; + +a = b + 1;/* a = b + @{1,1,1,1@}; */ +a = 2 * b;/* a = @{2,2,2,2@} * b; */ -b = a >> 1; /* b = a >> @{1,1,1,1@}; */ +a = l + a;/* Error, cannot convert long to int. */ @end smallexample In C vectors can be subscripted as if the vector were an array with Index: gcc/c-family/c-common.c === --- gcc/c-family/c-common.c (revision 177589) +++ gcc/c-family/c-common.c (working copy) @@ -1922,143 +1922,92 @@ shorten_binary_op (tree result_type, tre return result_type; } -/* Warns if the conversion of EXPR to TYPE may alter a value. - This is a helper function for warnings_for_convert_and_check. */ - -static void -conversion_warning (tree type, tree expr) +/* Checks if expression EXPR of real/integer type cannot be converted + to the real/integer type TYPE. Function returns true when: + * EXPR is a constant which cannot be exactly converted to TYPE + * EXPR is not a constant and size of EXPR's type > than size of TYPE, + for EXPR type and TYPE being both integers or both real. + * EXPR is not a constant of real type and TYPE is an integer. + * EXPR is not a constant of integer type which cannot be + exactly converted to real type. + Function allows conversions between types of different signedness and + does not return true in that case. Function can produce signedness + warnings if PRODUCE_WARNS is true. */ +bool +unsafe_conversion_p (tree type, tree expr, bool produce_warns) { bool give_warning = false; - - int i; - const int expr_num_operands = TREE_OPERAND_LENGTH (expr); tree expr_type = TREE_TYPE (expr); location_t loc = EXPR_LOC_OR_HERE (expr); - if (!warn_conversion && !warn_sign_conversion) -return; - - /* If any operand is artificial, then this expression was generated - by the compiler and we do not warn. */ - for (i = 0; i < expr_num_operands; i++) -{ - tree op = TREE_OPERAND (expr, i); - if (op && DECL_P (op) && DECL_ARTIFICIAL (op)) - return; -} - - switch (TREE_CODE (expr)) + if (TREE_CODE (expr) == REAL_CST || TREE_CODE (expr) == INTEGER_CST) { -case EQ_EXPR: -case NE_EXPR: -case LE_EXPR: -case GE_EXPR: -case LT_EXPR: -case GT_EXPR: -case TRUTH_ANDIF_EXPR: -case TRUTH_ORIF_EXPR: -case TRUTH_AND_EXPR: -case TRUTH_OR_EXPR: -case TRUTH_XOR_EXPR: -case TRUTH_NOT_EXPR: - /* Conversion from boolean to a signed:1 bit-field (which only -can hold the values 0 and -1) doesn't lose information - but -it does change the value. */ - if (TYPE_PRECISION (type) == 1 && !TYPE_UNSIGNED (type)) - warning_at (loc, OPT_Wconversion, - "conversion to %qT from boolean expression", type); - return; - -case REAL_CST: -case INTEGER_CST: - /* Warn for real constant that is not an exact integer converted - to integer type. */ +to integer type. */ if (TREE_CODE (expr_type) == REAL_TYPE - && TREE_CODE (type) == INTEGER_TYPE) -{ - if (!real_isinteger (TREE_REAL_CST_PTR (expr), TYPE_MODE (expr_type))) -give_warning = true; -} + && TREE_CODE (type) == INTEGER_TYPE) + { + if (!real_isinteger (TREE_REAL_CST_PTR (expr), TYPE_MODE (expr_type))) + give_warning = true; + } /* Warn for an integer constant that does not fit into integer type. */ else if (TREE_CODE (expr_type) == INTEGER_TYPE - && TREE_CODE (type) == INTEGER_TYPE - && !int_fits_type_p (expr, type)) -{ - if (TYPE_UNSIGNED (type) && !TYPE_UNSIG
Scalar vector binary operation
This is a patch that was approved a long time ago here: http://gcc.gnu.org/ml/gcc-patches/2011-01/msg01833.html but was never submitted. 2011-08-09 Artjoms Sinkarovs /gcc * c-typeck.c (scalar_to_vector): New function. Try scalar to vector conversion. (stv_conv): New enum for scalar_to_vector return type. (build_binary_op): Adjust. * doc/extend.texi: Description of scalar to vector expansion. /gcc/c-family * c-common.c (unsafe_conversion_p): New function. Check if it is unsafe to convert an expression to the type. (conversion_warning): Adjust, use unsafe_conversion_p. * c-common.h (unsafe_conversion_p): New function declaration. /gcc/testsuite * gcc.c-torture/execute/scal-to-vec1.c: New test. * gcc.c-torture/execute/scal-to-vec2.c: New test. * gcc.c-torture/execute/scal-to-vec3.c: New test. * gcc.dg/scal-to-vec1.c: New test. * gcc.dg/scal-to-vec2.c: New test. bootstrapped and tested on x86_64_unknown-linux Thanks, Artem.