AARCH64 vs SLOW_BYTE_ACCESS

2017-07-10 Thread Andrew Pinski
I was looking into some bitfield code for aarch64 and was wondering
why SLOW_BYTE_ACCESS is set to 0.  I can't seem to figure out why
though.
The header says:
   Although there's no difference in instruction count or cycles,
  in AArch64 we don't want to expand to a sub-word to a 64-bit access
  if we don't have to, for power-saving reasons.  */

But that does not make sense, because with SLOW_BYTE_ACCESS set to 0, GCC
expands a sub-word access to a 64-bit access.

When I set SLOW_BYTE_ACCESS to 1, I get a speedup of between 38% and 208%
for bitfield accesses inside a loop on ThunderX CN88xx.

Should we change SLOW_BYTE_ACCESS (or maybe better yet get rid of it)?

Thanks,
Andrew Pinski
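
For context, here is a minimal sketch of the kind of loop the message above describes. The struct layout, field names and function name are all hypothetical and are not taken from the benchmark that was measured. The C source is the same whichever access width the compiler picks; only the generated loads differ.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical record with sub-word bitfields; the layout is illustrative only. */
struct flags {
  unsigned int kind  : 3;
  unsigned int valid : 1;
  unsigned int count : 12;
};

/* Sum the 'count' field of every entry whose 'valid' bit is set.  Each
   iteration reads two bitfields, so the width of the memory access the
   compiler chooses for them shows up directly in the loop body. */
uint32_t sum_valid_counts(const struct flags *v, size_t n)
{
  uint32_t total = 0;
  for (size_t i = 0; i < n; i++)
    if (v[i].valid)
      total += v[i].count;
  return total;
}
```

Comparing the assembly for such a loop with SLOW_BYTE_ACCESS set to 0 and to 1 is one way to see which access width was chosen.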


Re: whereis PLUGIN_REGISTER_GGC_CACHES? how to migrate it for GCC v6.x?

2017-07-10 Thread Leslie Zhai



On 2017-07-10 22:16, David Malcolm wrote:
> On Sat, 2017-07-08 at 15:50 +0800, Leslie Zhai wrote:
> > Hi GCC developers,
> >
> > There was
> >
> > PLUGIN_REGISTER_GGC_CACHES
> >
> > pseudo-events for register_callback in GCC v4.x, but how to migrate
> > it
> > for GCC v6.x? there is no  PLUGIN_REGISTER_GGC_CACHES deprecated log
> > in
> > ChangeLog-201X nor git log plugin.h... please give me some hint,
> > thanks
> > a lot!
>
> Trevor [CCed] removed it 2014-12-10 in r218558
> (eb06b2519a361b7784b1807115fcb3dea0226035) in the commit:
> "remove gengtype support for param_is use_param, if_marked and splay
> tree allocators"
>
> The patch was here:
>   https://gcc.gnu.org/ml/gcc-patches/2014-11/msg02965.html
>
> where he talks about plugin migration.

 Some plugins may need to add extra GGC root tables, e.g. to handle their own
 @code{GTY}-ed data. This can be done with the @code{PLUGIN_REGISTER_GGC_ROOTS}
 pseudo-event with a null callback and the extra root table (of type @code{struct
-ggc_root_tab*}) as @code{user_data}.  Plugins that want to use the
-@code{if_marked} hash table option can add the extra GGC cache tables generated
-by @code{gengtype} using the @code{PLUGIN_REGISTER_GGC_CACHES} pseudo-event with
-a null callback and the extra cache table (of type @code{struct ggc_cache_tab*})
-as @code{user_data}.  Running the @code{gengtype -p @var{source-dir}
-@var{file-list} @var{plugin*.c} ...} utility generates these extra root tables.
+ggc_root_tab*}) as @code{user_data}.  Running the
+ @code{gengtype -p @var{source-dir} @var{file-list} @var{plugin*.c} ...}
+utility generates these extra root tables.


After diffing gcc-6.3.0/gcc/testsuite/gcc.dg/plugin against
gcc-4.8.0/gcc/testsuite/gcc.dg/plugin, I migrated to GCC v6.x like this:


  // Register our garbage collector roots.
#if GCC_MAJOR < 6
  register_callback(plugin_name, PLUGIN_REGISTER_GGC_CACHES, NULL,
#else
  register_callback(plugin_name, PLUGIN_REGISTER_GGC_ROOTS, NULL,
#endif
                    const_cast(gt_ggc_rc__gt_cache_h));


Trevor also describes how to migrate a GTY((if_marked(XXX), param_is(XXX)))
htab_t to a GTY((cache)) hash_table, such as:



#if (GCC_MAJOR < 6)
// FIXME: gengtype not support macro?
static GTY((if_marked("tree2int_marked_p"), param_is(struct tree2int)))
    htab_t intCache;
#else
struct intCacheHasher : ggc_cache_ptr_hash<tree2int> {
  static inline hashval_t hash(tree2int *t2i) {
    return tree_map_base_hash(&t2i->base);
  }

  static inline bool equal(tree2int *a, tree2int *b) {
    return a->base.from == b->base.from;
  }
};
static GTY((cache))
    hash_table<intCacheHasher> *intCache;
#endif


But I have no idea why gengtype does not support macros
(https://gcc.gnu.org/ml/gcc/2017-07/msg00045.html): it just ignores the
#if (GCC_MAJOR < 6) and still parses the
GTY((if_marked(XXX), param_is(XXX))) htab_t instead of the
GTY((cache)) hash_table... Please give me some hint, thanks a lot!




> Hope this is helpful
> Dave


--
Regards,
Leslie Zhai - a LLVM developer https://reviews.llvm.org/p/xiangzhai/





Re: whereis PLUGIN_REGISTER_GGC_CACHES? how to migrate it for GCC v6.x?

2017-07-10 Thread Leslie Zhai

Hi David,

Thanks for your kind response!


On 2017-07-10 22:16, David Malcolm wrote:
> On Sat, 2017-07-08 at 15:50 +0800, Leslie Zhai wrote:
> > Hi GCC developers,
> >
> > There was
> >
> > PLUGIN_REGISTER_GGC_CACHES
> >
> > pseudo-events for register_callback in GCC v4.x, but how to migrate
> > it
> > for GCC v6.x? there is no  PLUGIN_REGISTER_GGC_CACHES deprecated log
> > in
> > ChangeLog-201X nor git log plugin.h... please give me some hint,
> > thanks
> > a lot!
>
> Trevor [CCed] removed it 2014-12-10 in r218558
> (eb06b2519a361b7784b1807115fcb3dea0226035) in the commit:
> "remove gengtype support for param_is use_param, if_marked and splay
> tree allocators"
>
> The patch was here:
>   https://gcc.gnu.org/ml/gcc-patches/2014-11/msg02965.html
>
> where he talks about plugin migration.
>
> Hope this is helpful

Yes, it is very helpful :) Thank you very much!

> Dave


--
Regards,
Leslie Zhai - a LLVM developer https://reviews.llvm.org/p/xiangzhai/






Re: combiner: how to compute cost for bit insertion?

2017-07-10 Thread Segher Boessenkool
On Mon, Jul 10, 2017 at 05:10:03PM +0200, Georg-Johann Lay wrote:
> Any ideas for a sane approach?

You could change insn_rtx_cost to actually calculate the cost of the
insn, not just set_src_cost of a single set.  This will need checking
on a great many targets, not least because most targets' cost
functions are, uh, not so good.  Big project, but hopefully well worth
the effort.

Or you could change your cost function to assume that a QImode shift is
an insert instruction.  Not correct of course (but neither is the currently
calculated cost), but perhaps it gives good results in practice?


Segher
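
To make the numbers under discussion concrete, here is a toy model in C of the accept/reject decision. It is only a sketch under stated assumptions: `combine_accepts` is a hypothetical helper, not a GCC function, and it assumes the combiner rejects a replacement only when its cost is strictly higher than the summed cost of the insns it replaces.

```c
#include <stdbool.h>

/* Hypothetical helper (not a GCC function): accept a combination unless the
   replacement is strictly more expensive than the two insns it replaces. */
bool combine_accepts(int cost_a, int cost_b, int replacement_cost)
{
  int original = cost_a + cost_b;
  return replacement_cost <= original;
}
```

With the values reported in the original question (4 + 4 = 8 against a replacement cost of 24), `combine_accepts(4, 4, 24)` is false. If the cost function reported a small value for the insert pattern instead, say a made-up 2, the same check would pass.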


Re: Missed optimization with const member

2017-07-10 Thread Martin Sebor

On 07/07/2017 06:26 AM, Ion Gaztañaga wrote:
> On 05/07/2017 17:24, Martin Sebor wrote:
> >
> > [*] While the example (copied below) is valid, accessing the object
> > after somefunction() has returned via a reference or pointer to it
> > is not.
> >
> >    void somefunction(const Object& object)
> >    {
> >      void* p = const_cast<Object*>(&object);
> >      object.~Object();
> >      new(p) Object();
> >    }
>
> I think it's problematic as explained in p0532r0.pdf. We construct and
> destroy objects in the internal buffer of std::vector and we don't
> update the pointer every time. I don't see myself understanding when
> std::launder must be used, looks too expert-grade feature.


The problem with vector is new in C++17 and caused by lifting
the assignability requirement without fully considering the
ramifications.

But there are a number of intertwined problems here and this
is just one of them.  The text that makes references and
pointers invalid was added in C++03 to fix one problem (CWG
issue #89).  Launder tries to patch over a problem caused by
introducing optional in C++14, again without considering CWG
89.  As a result of its narrow focus (as Niko's paper points
out) it doesn't fix other manifestations of it.  CWG issue
#2182 describes a related defect in arithmetic involving
pointers to individual objects constructed at consecutive
locations in the same block of memory that launder doesn't
do anything for.  As Niko's paper also highlights, launder
isn't a complete or, IMO, even a very good solution to
the problems.  It adds even more complexity without solving
all the underlying problems.  Adding more, complex features
that even experts have trouble understanding is what caused
these problems to begin with.

Martin


combiner: how to compute cost for bit insertion?

2017-07-10 Thread Georg-Johann Lay

Hi, I'd need some help with the following optimization issue:

avr backend supports insns for bit insertion, and insn combiner tries to 
use them:


unsigned char bset (unsigned char a, unsigned char n)
{
  return (a & ~0x40) | (n & 0x40);
}


Trying 7 -> 14:
Successfully matched this instruction:
(set (zero_extract:QI (reg/i:QI 24 r24)
        (const_int 1 [0x1])
        (const_int 6 [0x6]))
    (lshiftrt:QI (reg:QI 52)
        (const_int 6 [0x6])))
rejecting combination of insns 7 and 14
original costs 4 + 4 = 8
replacement cost 24

Hence the existing insn is rejected because its cost is too high.

The problem is that the backend only sees

avr_rtx_costs[bset:combine(266)]=true (size) total=24, outer=set:
(lshiftrt:QI (reg:QI 52)
    (const_int 6 [0x6]))

Hence this looks like a QI shift: the ZERO_EXTRACT is killed and only the
outer SET is available, which is not very helpful.


A shift is actually more expensive than a bit insertion.

How can I fix that?


What I'd like to avoid is writing a hell of a lot of complicated patterns
like the one for:


Trying 8, 7 -> 9:
Failed to match this instruction:
(set (reg:QI 50)
    (ior:QI (and:QI (reg/v:QI 49 [ n ])
            (const_int 64 [0x40]))
        (and:QI (reg:QI 24 r24 [ a ])
            (const_int -65 [0xffbf]))))

This would be a different representation of bit insertion, but it would
also need many patterns:


* Ones for same bit number (like in the example)
* Ones where the src bit is smaller than the dest bit (needs ASHIFT).
* Ones where the src bit is greater than the dest bit (needs LSHIFTRT).
* Ones where the MSB has to be inserted (will use other canonical form)
* Ones where the LSB has to be inserted (will use other canonical form)
* ... you name it.

Any ideas for a sane approach?


Thanks,

Johann
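
The cases listed above all describe one operation at the C level: copy one bit of the source into one bit of the destination. The sketch below states that operation once, with the bit numbers as parameters. `insert_bit` is a hypothetical helper introduced only for illustration, and `bset` is the function from the message rewritten with fixed-width types.

```c
#include <stdint.h>

/* Hypothetical helper: copy bit 'src_bit' of n into bit 'dst_bit' of a,
   leaving all other bits of a unchanged.  Both bit numbers must be 0..7. */
uint8_t insert_bit(uint8_t a, uint8_t n, unsigned src_bit, unsigned dst_bit)
{
  uint8_t bit = (uint8_t)((n >> src_bit) & 1u);   /* extract the source bit */
  uint8_t mask = (uint8_t)(1u << dst_bit);        /* destination bit mask   */
  return (uint8_t)((a & (uint8_t)~mask) | (uint8_t)(bit << dst_bit));
}

/* The function from the message: same source and destination bit (bit 6). */
uint8_t bset(uint8_t a, uint8_t n)
{
  return (uint8_t)((a & ~0x40) | (n & 0x40));
}
```

For every a and n, `insert_bit(a, n, 6, 6)` gives the same result as `bset(a, n)`. Choosing src_bit below or above dst_bit corresponds to the ASHIFT and LSHIFTRT cases in the list.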


Re: whereis PLUGIN_REGISTER_GGC_CACHES? how to migrate it for GCC v6.x?

2017-07-10 Thread David Malcolm
On Sat, 2017-07-08 at 15:50 +0800, Leslie Zhai wrote:
> Hi GCC developers,
> 
> There was
> 
> PLUGIN_REGISTER_GGC_CACHES
> 
> pseudo-events for register_callback in GCC v4.x, but how to migrate
> it 
> for GCC v6.x? there is no  PLUGIN_REGISTER_GGC_CACHES deprecated log
> in 
> ChangeLog-201X nor git log plugin.h... please give me some hint,
> thanks 
> a lot!

Trevor [CCed] removed it 2014-12-10 in r218558
(eb06b2519a361b7784b1807115fcb3dea0226035) in the commit:
"remove gengtype support for param_is use_param, if_marked and splay
tree allocators"

The patch was here:
  https://gcc.gnu.org/ml/gcc-patches/2014-11/msg02965.html

where he talks about plugin migration.

Hope this is helpful
Dave


Add support to trace comparison instructions and switch statements

2017-07-10 Thread 吴潍浠(此彼)
Hi

I wrote some code to make GCC support comparison-guided fuzzing.
It is very similar to
http://clang.llvm.org/docs/SanitizerCoverage.html#tracing-data-flow .
With -fsanitize-coverage=trace-cmp the compiler will insert extra
instrumentation around comparison instructions and switch statements.
I think it is useful for fuzzing.  :D

The patch is below; I may supply test cases later.

With Regards
Wish Wu
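
As a rough illustration of what comparison tracing means, the sketch below instruments one comparison by hand. It is an assumption-laden sketch, not the code this patch generates: `demo_trace_cmp4`, `check_magic` and the three globals are hypothetical names, with the hook modeled on the two-operand 32-bit callback described in the clang documentation linked above.

```c
#include <stdint.h>

/* Operands of the most recent traced comparison, as recorded by the hook. */
uint32_t last_lhs, last_rhs;
unsigned trace_calls;

/* Hypothetical stand-in for a 32-bit comparison callback: a fuzzer runtime
   would use the operands to guide input mutation; here we only record them. */
void demo_trace_cmp4(uint32_t lhs, uint32_t rhs)
{
  last_lhs = lhs;
  last_rhs = rhs;
  trace_calls++;
}

/* Hand-instrumented form of `return x == 0xdeadbeef;`.  With compiler
   support, the call to the hook would be inserted automatically. */
int check_magic(uint32_t x)
{
  demo_trace_cmp4(x, 0xdeadbeefu);
  return x == 0xdeadbeefu;
}
```

Seeing both operands lets a fuzzer learn the constant it has to match, rather than guessing a 32-bit value blindly.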

Index: gcc/asan.c
===
--- gcc/asan.c  (revision 250082)
+++ gcc/asan.c  (working copy)
@@ -2705,6 +2705,29 @@ initialize_sanitizer_builtins (void)
   tree BT_FN_SIZE_CONST_PTR_INT
 = build_function_type_list (size_type_node, const_ptr_type_node,
integer_type_node, NULL_TREE);
+
+  tree BT_FN_VOID_UINT8_UINT8
+= build_function_type_list (void_type_node, unsigned_char_type_node,
+   unsigned_char_type_node, NULL_TREE);
+  tree BT_FN_VOID_UINT16_UINT16
+= build_function_type_list (void_type_node, uint16_type_node,
+   uint16_type_node, NULL_TREE);
+  tree BT_FN_VOID_UINT32_UINT32
+= build_function_type_list (void_type_node, uint32_type_node,
+   uint32_type_node, NULL_TREE);
+  tree BT_FN_VOID_UINT64_UINT64
+= build_function_type_list (void_type_node, uint64_type_node,
+   uint64_type_node, NULL_TREE);
+  tree BT_FN_VOID_FLOAT_FLOAT
+= build_function_type_list (void_type_node, float_type_node,
+   float_type_node, NULL_TREE);
+  tree BT_FN_VOID_DOUBLE_DOUBLE
+= build_function_type_list (void_type_node, double_type_node,
+   double_type_node, NULL_TREE);
+  tree BT_FN_VOID_UINT64_PTR
+= build_function_type_list (void_type_node, uint64_type_node,
+   ptr_type_node, NULL_TREE);
+
   tree BT_FN_BOOL_VPTR_PTR_IX_INT_INT[5];
   tree BT_FN_IX_CONST_VPTR_INT[5];
   tree BT_FN_IX_VPTR_IX_INT[5];
Index: gcc/builtin-types.def
===
--- gcc/builtin-types.def   (revision 250082)
+++ gcc/builtin-types.def   (working copy)
@@ -338,8 +338,20 @@ DEF_FUNCTION_TYPE_2 (BT_FN_VOID_PTRMODE_PTR,
 BT_VOID, BT_PTRMODE, BT_PTR)
 DEF_FUNCTION_TYPE_2 (BT_FN_VOID_PTR_PTRMODE,
 BT_VOID, BT_PTR, BT_PTRMODE)
+DEF_FUNCTION_TYPE_2 (BT_FN_VOID_UINT8_UINT8,
+BT_VOID, BT_UINT8, BT_UINT8)
+DEF_FUNCTION_TYPE_2 (BT_FN_VOID_UINT16_UINT16,
+BT_VOID, BT_UINT16, BT_UINT16)
+DEF_FUNCTION_TYPE_2 (BT_FN_VOID_UINT32_UINT32,
+BT_VOID, BT_UINT32, BT_UINT32)
 DEF_FUNCTION_TYPE_2 (BT_FN_VOID_UINT64_UINT64,
 BT_VOID, BT_UINT64, BT_UINT64)
+DEF_FUNCTION_TYPE_2 (BT_FN_VOID_FLOAT_FLOAT,
+BT_VOID, BT_FLOAT, BT_FLOAT)
+DEF_FUNCTION_TYPE_2 (BT_FN_VOID_DOUBLE_DOUBLE,
+BT_VOID, BT_DOUBLE, BT_DOUBLE)
+DEF_FUNCTION_TYPE_2 (BT_FN_VOID_UINT64_PTR,
+BT_VOID, BT_UINT64, BT_PTR)
 DEF_FUNCTION_TYPE_2 (BT_FN_VOID_VALIST_REF_VALIST_ARG,
 BT_VOID, BT_VALIST_REF, BT_VALIST_ARG)
 DEF_FUNCTION_TYPE_2 (BT_FN_LONG_LONG_LONG,
Index: gcc/common.opt
===
--- gcc/common.opt  (revision 250082)
+++ gcc/common.opt  (working copy)
@@ -226,10 +226,9 @@ unsigned int flag_sanitize
 Variable
 unsigned int flag_sanitize_recover = (SANITIZE_UNDEFINED | 
SANITIZE_UNDEFINED_NONDEFAULT | SANITIZE_KERNEL_ADDRESS) & 
~(SANITIZE_UNREACHABLE | SANITIZE_RETURN)
 
-fsanitize-coverage=trace-pc
-Common Report Var(flag_sanitize_coverage)
-Enable coverage-guided fuzzing code instrumentation.
-Inserts call to __sanitizer_cov_trace_pc into every basic block.
+; What the coverage sanitizers should instrument
+Variable
+unsigned int flag_sanitize_coverage
 
 ; Flag whether a prefix has been added to dump_base_name
 Variable
@@ -975,6 +974,10 @@ fsanitize=
 Common Driver Report Joined
 Select what to sanitize.
 
+fsanitize-coverage=
+Common Driver Report Joined
+Select what to coverage sanitize.
+
 fasan-shadow-offset=
 Common Joined RejectNegative Var(common_deferred_options) Defer
 -fasan-shadow-offset=  Use custom shadow memory offset.
Index: gcc/flag-types.h
===
--- gcc/flag-types.h(revision 250082)
+++ gcc/flag-types.h(working copy)
@@ -250,6 +250,14 @@ enum sanitize_code {
  | SANITIZE_BOUNDS_STRICT
 };
 
+/* Different trace modes */
+enum sanitize_coverage_code {
+  /* Trace PC */
+  SANITIZE_COV_TRACE_PC = 1UL << 0,
+  /* Trace Compare */
+  SANITIZE_COV_TRACE_CMP = 1UL << 1
+};
+
 /* flag_vtable_verify initialization levels. */
 enum vtv_priority {
   VTV_NO_PRIORITY   = 0,  /* i.E. Do NOT do vtable verification. */
Index: gcc/opts.c

gengtype not support #if (GCC_MAJOR < 6)? how to support both for GCC v4.x and v6.x?

2017-07-10 Thread Leslie Zhai

Hi GCC developers,

As ChangeLog-2014 mentions, GGC support for if_marked and param_is was
removed, so I am migrating to GCC v6.x, for example:



#if (GCC_MAJOR < 6)
// FIXME: gengtype not support macro?
//static GTY((if_marked("tree2int_marked_p"), param_is(struct tree2int)))
//htab_t intCache;
#else
struct intCacheHasher : ggc_cache_ptr_hash<tree2int> {
  static inline hashval_t hash(tree2int *t2i) {
    return tree_map_base_hash(&t2i->base);
  }

  static inline bool equal(tree2int *a, tree2int *b) {
    return a->base.from == b->base.from;
  }

  static int keep_cache_entry(tree2int *&t2i) {
    return ggc_marked_p(t2i->base.from);
  }
};
static GTY((cache))
    hash_table<intCacheHasher> *intCache;
#endif


$ gcc-6.3.0/build/gcc/build/gengtype -r gcc-6.3.0/build/gcc/gtype.state \
    -P /tmp/gt-cache-6.3.inc Input.cpp


But it still parses the deprecated if_marked and param_is meant for
GCC v4.x. Please give me some hint, thanks a lot!


--
Regards,
Leslie Zhai - a LLVM developer https://reviews.llvm.org/p/xiangzhai/