gcc-6-20180307 is now available

2018-03-07 Thread gccadmin
Snapshot gcc-6-20180307 is now available on
  ftp://gcc.gnu.org/pub/gcc/snapshots/6-20180307/
and on various mirrors, see http://gcc.gnu.org/mirrors.html for details.

This snapshot has been generated from the GCC 6 SVN branch
with the following options: svn://gcc.gnu.org/svn/gcc/branches/gcc-6-branch 
revision 258342

You'll find:

 gcc-6-20180307.tar.xz    Complete GCC

  SHA256=39718ba6a009390bcfb7ecb5d2ed0debce0c2f0c6000d625c50afb5e9ffa912d
  SHA1=5ac4fdd01b947d23cb5a705dcbf518d3605f8c71

Diffs from 6-20180228 are available in the diffs/ subdirectory.

When a particular snapshot is ready for public consumption the LATEST-6
link is updated and a message is sent to the gcc list.  Please do not use
a snapshot before it has been announced that way.


Re: BLKmode parameters are stored in unaligned stack slot when passed via registers.

2018-03-07 Thread Jeff Law
On 03/06/2018 08:21 AM, Renlin Li wrote:
> Hi all,
> 
> The problem described here probably only affects targets whose ABI allows
> structured arguments of a certain size to be passed via registers.
> 
> If the mode of the parameter type is BLKmode, then in the callee, during RTL
> expansion, a stack slot will be reserved for this parameter, and the
> incoming value will be copied into the stack slot.
> 
> However, the stack slot for the parameter will not be aligned if the
> alignment of the parameter type exceeds MAX_SUPPORTED_STACK_ALIGNMENT.
> Unaligned memory accesses may then cause run-time errors.
My recollection here (the PA has an ABI which mandates this kind of
stuff) is that you have to copy the object out of the potentially
unaligned location into a suitably aligned local.

The copy should occur in an alignment-safe way.  It also has to
handle structures that are partially in registers and partially in memory,
and structures which are justified in the wrong direction.

We never tried to optimize this stuff.  It was rare enough to not worry
about.
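
As a rough illustration of the copy described above, here is a minimal C
sketch. It only shows the idea at the source level and is not what GCC's
expander actually emits. The helper names align_up, is_aligned and
copy_in_from_unaligned are hypothetical and exist only for this example;
the union is the one from the original report.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* The over-aligned union from the original report. */
union U {
  uint32_t M0;
  uint32_t M1;
  uint32_t M2;
  uint32_t M3;
} __attribute__((aligned(16)));

/* Hypothetical helper: round ADDR up to the next multiple of ALIGN.
   ALIGN must be a power of two.  */
uintptr_t align_up(uintptr_t addr, uintptr_t align)
{
  assert(align != 0 && (align & (align - 1)) == 0);
  return (addr + align - 1) & ~(align - 1);
}

/* Hypothetical helper: return 1 if P is aligned to ALIGN bytes.  */
int is_aligned(const void *p, uintptr_t align)
{
  return ((uintptr_t) p & (align - 1)) == 0;
}

/* Hypothetical helper: copy a union U from SRC, which may sit at any byte
   address, into the suitably aligned DST.  memcpy makes no alignment
   assumption about either pointer, so the copy itself is alignment safe.  */
void copy_in_from_unaligned(union U *dst, const void *src)
{
  memcpy(dst, src, sizeof *dst);
}
```

A caller would declare a local "union U" (which the compiler aligns to 16
bytes), call copy_in_from_unaligned on the incoming bytes, and from then on
only touch the aligned local.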

jeff


Re: GCC GSOC Participation

2018-03-07 Thread Andi Kleen
> I would suggest that you start with reading through Andi's email to
> another student who expressed interest in that project which you can
> find at: https://gcc.gnu.org/ml/gcc/2018-02/msg00216.html
> 
> Andi, do you have any further suggestions what Prateek should check-out,
> perhaps build, examine and experiment with in order to come up with a
> nice proposal? 

See other mail.

> Do you personally prefer starting with any particular
> existing fuzzer, for example?

Yes, you should start with an existing fuzzer and just add the extensions.
Otherwise too much time will be spent on the base language functionality.
I would suggest either csmith or yarpgen.

-Andi


Re: GCC GSOC Participation

2018-03-07 Thread Andi Kleen
On Wed, Mar 07, 2018 at 03:52:15AM +0530, Prathamesh Kulkarni wrote:
> On 3 March 2018 at 16:22, Prateek Kalra  wrote:
> > Hello GCC Community,
> > My name is Prateek Kalra. I am pursuing an integrated dual
> > degree (B.Tech + M.Tech) in Computer Science Software Engineering from Gautam
> > Buddha University, Greater Noida. I am currently in the 8th semester of the
> > programme.
> > I have experience in competitive programming with C++. Here's my LinkedIn
> > profile:
> > https://www.linkedin.com/in/prateek-kalra-6a40bab3/.
> > I am interested in the GSoC project "Implement a fuzzer leveraging GCC
> > extensions".
> > I took compiler design as one of my course subjects in the previous
> > semester and secured an 'A' grade at the end of the semester.
> > I have theoretical knowledge of fuzz testing and csmith, that is, how random
> > C programs are generated to find compiler bugs, and I am very keen to
> > work on this project.
> > I request you to guide me through the process. I would really
> > appreciate it if you could mentor me with the further research of this project
> > idea.

Hi Prateek,

Further research on the project:

- Look at the gcc language extensions in the gcc documentation. Select some
(can be one or a combination of multiple)
of suitable complexity and get familiar with the concepts. Examples are
OpenMP, transactions, vector extensions. For some of them the gcc documentation
is enough, for others (like OpenMP or transactions) you'll need to
download the external specification. Look at the specification
and get familiar with it. Likely you'll also need to do some research
in the underlying concepts (e.g. parallelism for OpenMP or vectorization
for the vector APIs)

- Look at csmith or yarpgen and get familiar with the code

Then you can make a choice which fuzzer you want to use as a base
and make a proposal which gcc extensions you would want to target
with the project.
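
To give a feel for what such a target could look like, here is a small
hand-written C sketch that uses GCC's vector extensions and checks the
vector result against a plain scalar loop. It is only an assumption about
the kind of program an extended fuzzer might generate, not output from
csmith or yarpgen. The names v4u32, madd_vector, madd_scalar and
madd_paths_agree are made up for this example. Unsigned lanes are used so
that wraparound is well defined.

```c
#include <stdint.h>

/* GCC vector extension: four unsigned 32-bit lanes in a 16-byte vector. */
typedef uint32_t v4u32 __attribute__((vector_size(16)));

/* Vector path: lane-wise a * b + c using GCC's vector operators. */
v4u32 madd_vector(v4u32 a, v4u32 b, v4u32 c)
{
  return a * b + c;
}

/* Scalar reference: the same computation, one lane at a time. */
void madd_scalar(const uint32_t *a, const uint32_t *b, const uint32_t *c,
                 uint32_t *out)
{
  for (int i = 0; i < 4; i++)
    out[i] = a[i] * b[i] + c[i];
}

/* Return 1 when the vector and scalar paths agree on every lane.
   A mismatch would point at a miscompilation of one of the two paths. */
int madd_paths_agree(const uint32_t *a, const uint32_t *b, const uint32_t *c)
{
  v4u32 va = { a[0], a[1], a[2], a[3] };
  v4u32 vb = { b[0], b[1], b[2], b[3] };
  v4u32 vc = { c[0], c[1], c[2], c[3] };
  v4u32 r = madd_vector(va, vb, vc);

  uint32_t ref[4];
  madd_scalar(a, b, c, ref);

  for (int i = 0; i < 4; i++)
    if (r[i] != ref[i])
      return 0;
  return 1;
}
```

A generated test would typically call madd_paths_agree on random inputs and
abort on disagreement, so that the same source can be built at different
optimization levels and compared.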

-Andi


Re: Further for GSoC.

2018-03-07 Thread Tejas Joshi
 On 6 March 2018 at 22:25, Martin Jambor wrote:

> > You might have figured this out already but just in case something is
> > not clear:
> >
> >   1. How to check out our sources using svn and git is described at
> > https://gcc.gnu.org/svn.html and https://gcc.gnu.org/wiki/GitMirror
> > respectively, and
> >
> >   2. perhaps more importantly, how to configure, build and test GCC is
> > described in steps linked from https://gcc.gnu.org/install/ (look
> > for --disable-bootstrap, among other things).
>
> Or start at https://gcc.gnu.org/wiki/InstallingGCC and
> https://gcc.gnu.org/wiki/GettingStarted
>
> You need to check out the GCC source code from version control and find
> the files and functions referenced in there (locating pieces of GCC code
> using find, grep, etc. on the GCC source tree is something you'll need to
> do a lot), and make sure you can build GCC, run the testsuite, save
> results from a testsuite run, build and run the testsuite again and compare
> the results of the two runs (this is something that would need doing very
> many times in the course of any project working on GCC).


Thank you all for your suggestions.
I have not been mailing for a few days because I was caught up in exams.
I am interested in this project and am working on it.
Thank you.

-Tejas





Re: Further for GSoC.

2018-03-07 Thread Jonathan Wakely
On 6 March 2018 at 22:25, Martin Jambor wrote:
> You might have figured this out already but just in case something is
> not clear:
>
>   1. How to check out our sources using svn and git is described at
> https://gcc.gnu.org/svn.html and https://gcc.gnu.org/wiki/GitMirror
> respectively, and
>
>   2. perhaps more importantly, how to configure, build and test GCC is
> described in steps linked from https://gcc.gnu.org/install/ (look
> for --disable-bootstrap, among other things).

Or start at https://gcc.gnu.org/wiki/InstallingGCC and
https://gcc.gnu.org/wiki/GettingStarted


Re: about the gsoc

2018-03-07 Thread Martin Jambor
Hello Jagmeet, 

On Wed, Mar 07 2018, Jagmeet Singh wrote:
> Any one for help me
>
> I want to ask question about the ideas
>
> reply please

Please ask your question directly on this mailing list
(gcc@gcc.gnu.org).

Martin


Re: Size and speed comparison of GCC 7 & 8

2018-03-07 Thread Martin Liška
On 03/07/2018 11:13 AM, Martin Liška wrote:
> V2: fixed headers in the last table of the PDF.
> 
> Martin
> 

For i386.ii at -O2 -g, here is the perf diff between GCC 7 (baseline) and GCC 8:

# Baseline  Delta Abs  Shared Object  Symbol
# ........  .........  .............  ......
#
               +0.65%  cc1plus        [.] hash_table, ipa_call_summary*, simple_hashmap_traits >, ipa_call_summary*> >::hash_entry, xcallocator>::find_slot_with_hash
     0.18%     +0.43%  cc1plus        [.] sreal::operator*
               +0.41%  cc1plus        [.] hash_table, ipa_fn_summary*, simple_hashmap_traits >, ipa_fn_summary*> >::hash_entry, xcallocator>::find_slot_with_hash
     0.07%     +0.35%  cc1plus        [.] cgraph_node::find_replacement
               +0.33%  cc1plus        [.] profile_count::to_sreal_scale
               +0.33%  cc1plus        [.] predicate::probability
               +0.27%  cc1plus        [.] call_summary::get
     0.04%     +0.25%  cc1plus        [.] sreal::operator/
               +0.24%  cc1plus        [.] sreal::normalize
     0.70%     -0.23%  [kernel]       [.] 0x9c80019f
               +0.23%  cc1plus        [.] wide_int_to_tree_1
     0.09%     +0.22%  cc1plus        [.] sreal::operator+
               +0.21%  cc1plus        [.] analyze_function_body
     0.04%     +0.19%  cc1plus        [.] dwarf2out_var_location
     0.19%     -0.19%  cc1plus        [.] compute_inlined_call_time
               +0.19%  cc1plus        [.] function_summary::get
     0.30%     -0.18%  cc1plus        [.] can_inline_edge_p
     1.91%     -0.16%  cc1plus        [.] bitmap_set_bit
     0.74%     -0.15%  cc1plus        [.] pre_and_rev_post_order_compute_fn
     0.80%     +0.15%  [unknown]      [.] 0x9c80019f
               +0.14%  cc1plus        [.] cleanup_control_flow_pre
     0.81%     -0.14%  cc1plus        [.] ggc_set_mark
     0.13%     +0.13%  cc1plus        [.] variably_modified_type_p
     0.81%     -0.13%  cc1plus        [.] et_splay
     0.17%     -0.13%  cc1plus        [.] curr_insn_transform
               +0.12%  cc1plus        [.] profile_count::from_gcov_type
               +0.12%  cc1plus        [.] process_alt_operands
               +0.12%  cc1plus        [.] can_inline_edge_by_limits_p
     0.60%     -0.11%  cc1plus        [.] estimate_calls_size_and_time
     1.36%     -0.11%  libc-2.26.so   [.] _int_malloc
     0.27%     +0.11%  cc1plus        [.] constrain_operands
               +0.11%  cc1plus        [.] bitmap_alloc
     0.60%     +0.11%  cc1plus        [.] hash_table::find_slot_with_hash
               +0.11%  cc1plus        [.] predicate::evaluate
               +0.10%  cc1plus        [.] vr_values::get_value_range
     0.22%     -0.10%  cc1plus        [.] nonzero_bits1
     0.24%     +0.10%  cc1plus        [.] big_speedup_p
               +0.09%  cc1plus        [.] get_class_binding_direct
               +0.09%  cc1plus        [.] maybe_hot_count_p
     0.58%     -0.09%  cc1plus        [.] walk_tree_1
     0.06%     +0.09%  cc1plus        [.] estimate_size_after_inlining
               +0.09%  cc1plus        [.] mark_use
               +0.09%  cc1plus        [.] ix86_hard_regno_call_part_clobbered
     0.59%     -0.09%  cc1plus        [.] bitmap_bit_p
     0.23%     -0.09%  cc1plus        [.] delete_trivially_dead_insns
     0.41%     -0.09%  cc1plus        [.] cse_insn
     0.47%     -0.09%  cc1plus        [.] (anonymous namespace)::dom_info::calc_idoms
               +0.08%  cc1plus        [.] hash_table::find_slot_with_hash
               +0.08%  cc1plus        [.] profile_count::to_frequency
     0.28%     -0.08%  libc-2.26.so   [.] msort_with_tmp.part.0
     0.90%     -0.08%  libc-2.26.so   [.] _int_free
               +0.08%  cc1plus        [.] substitute_and_fold_engine::replace_uses_in
     0.20%     -0.08%  cc1plus        [.] rtx_equal_for_memref_p
               +0.08%  cc1plus        [.] predicate::add_clause
     0.18%     -0.07%  cc1plus        [.] update_callee_keys
     0.67%     -0.07%  cc1plus        [.] gt_ggc_mx_lang_tree_node

Martin



Re: BLKmode parameters are stored in unaligned stack slot when passed via registers.

2018-03-07 Thread Richard Biener
On Tue, Mar 6, 2018 at 9:02 PM, Renlin Li  wrote:
> Hi Richard,
>
>
> On 06/03/18 16:04, Richard Biener wrote:
>>
>> On Tue, Mar 6, 2018 at 4:21 PM, Renlin Li  wrote:
>>>
>>> Hi all,
>>>
>>> The problem described here probably only affects targets whose ABI allows
>>> structured arguments of a certain size to be passed via registers.
>>>
>>> If the mode of the parameter type is BLKmode, then in the callee, during RTL
>>> expansion, a stack slot will be reserved for this parameter, and the
>>> incoming value will be copied into the stack slot.
>>>
>>> However, the stack slot for the parameter will not be aligned if the
>>> alignment of the parameter type exceeds MAX_SUPPORTED_STACK_ALIGNMENT.
>>> Unaligned memory accesses may then cause run-time errors.
>>>
>>> For local variables on the stack, the alignment of the data type is
>>> honored, although the documentation states that it is not guaranteed.
>>>
>>> For example:
>>>
>>> #include <stdint.h>
>>> union U {
>>>   uint32_t M0;
>>>   uint32_t M1;
>>>   uint32_t M2;
>>>   uint32_t M3;
>>> } __attribute((aligned(16)));
>>>
>>> void tmp (union U *);
>>> void foo (union U P0)
>>> {
>>>   union U P1 = P0;
>>>   tmp (&P1);
>>> }
>>>
>>> The code-gen from armv7-a is like this:
>>>
>>> foo:
>>>  @ args = 0, pretend = 0, frame = 48
>>>  @ frame_needed = 0, uses_anonymous_args = 0
>>>  str  lr, [sp, #-4]!
>>>  sub  sp, sp, #52
>>>  mov  ip, sp
>>>  stm  ip, {r0, r1, r2, r3}  --> ip is not 128-bit aligned
>>>  add  lr, sp, #39
>>>  bic  lr, lr, #15
>>>  ldm  ip, {r0, r1, r2, r3}
>>>  stm  lr, {r0, r1, r2, r3}  --> lr is 128-bit aligned
>>>  mov  r0, lr
>>>  bl   tmp
>>>  add  sp, sp, #52
>>>  @ sp needed
>>>  ldr  pc, [sp], #4
>>>
>>> There are other obvious missed optimizations in the code generation
>>> above.
>>> The stack slots for parameter P0 and local variable P1 could be merged,
>>> so that some of the load/store instructions could be removed.
>>> I think this is a known missed-optimization case.
>>>
>>> To summarize, there are two issues here:
>>> 1, (wrong code) an unaligned stack slot is allocated for parameters during
>>> function expansion.
>>> 2, (missed optimization) the stack slot for a parameter is sometimes not
>>> necessary.
>>> In certain scenarios, the argument register could be used directly.
>>> Currently, this is only possible when the parameter mode is not
>>> BLKmode.
>>>
>>> For issue 1, we can do something similar to expand_used_vars:
>>> dynamically align the stack slot address for parameters whose alignment
>>> exceeds PREFERRED_STACK_BOUNDARY. Other parameters could be stored in the
>>> gap between the aligned address and fp when possible.
>>>
>>> For issue 2, I checked the behavior of LLVM. It seems the stack slot
>>> allocation for parameters is explicitly exposed by the alloca IR
>>> instruction at the very beginning.
>>> Later, there are optimization/transformation passes like mem2reg,
>>> reg2mem, sroa etc. to remove unnecessary alloca instructions.
>>>
>>> In gcc, the stack allocation for parameters and local variables is done
>>> implicitly during the expand pass.
>>> RTL passes are not able to remove the unnecessary stack allocation
>>> and load/store operations.
>>>
>>> For example:
>>>
>>> uint32_t bar(union U P0)
>>> {
>>>   return P0.M0;
>>> }
>>>
>>> Currently, the code-gen is different on different targets.
>>> There are various backend hooks which make the code-gen sub-optimal.
>>> For example, the aarch64 target can return directly with w0, while the
>>> armv7-a target generates an unnecessary store and load.
>>>
>>> However, this optimization should be target independent and unrelated to
>>> the target's alignment configuration.
>>> Both issues 1 and 2 could be resolved if gcc had a similar approach. But I
>>> assume the change is big.
>>>
>>> Are there any suggestions for solving issue 1 and improving issue 2 in a
>>> generic way?
>>> I can create a bugzilla ticket to record the issue.
>>
>>
>> What does the ABI say for passing such over-aligned data types?
>>
>> For solving 1) you could copy the argument as passed by the ABI
>> to a properly aligned stack location in the callee.
>>
>> Generally it sounds like either the ABI doesn't specify anything
>> or the ABI specifies something that violates user expectation.
>>
>> For 2) again, it is the ABI which specifies whether an argument
>> is passed via the stack or via registers.  So - what does the ABI say?
>
>
>
> The compiler is doing the right thing here by passing the argument via
> registers. To be specific, there is such a clause in the ARM PCS:
>
>> B.5 If the argument is an alignment adjusted type its value is passed as a
>> copy of the actual value. The
>> copy will have an alignment defined as follows.
>> ...
>> For a Composite Type, the alignment of the copy will have 4-byte alignment
>> if its natural alignment is
>> <= 4 and 8-byte alignment if its natural alignment is >= 8
>