RISC-V sibcall optimization with save-restore

2019-03-20 Thread Paulo Matos


Is there a way around this issue to allow the libcall to be emitted,
even if the sibcall optimization was enabled? Or to emit a sibcall even
if it had been disabled? Since the problem stems from the fact that at
sibcall_ok_for_function_p I don't have enough information to know what
to do, is there a way to decide this later on?

Thanks,

-- 
Paulo Matos


Re: riscv64 dep. computation

2019-02-15 Thread Paulo Matos



On 15/02/2019 19:15, Jim Wilson wrote:
> On Thu, Feb 14, 2019 at 11:33 PM Paulo Matos  wrote:
>> Are global variables not supposed to alias each other?
>> If I indeed do that, gcc still won't group loads and stores:
>> https://cx.rv8.io/g/rFjGLa
> 
> I meant something like
> struct foo_t x, y;
> and now they clearly don't alias.  As global pointers they may still alias.
> 

Ah ok, of course. Like that it makes sense they don't alias.
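Written out, Jim's clarification (distinct struct objects rather than
global pointers) would look something like this sketch; the names are
illustrative, not taken from the thread:

```c
/* Two distinct global objects of the same struct type.  Unlike the
   pointer parameters in the original testcase, the compiler knows that
   x and y occupy different storage, so even the char-typed member
   accesses cannot alias, and sched1 may group all loads before all
   stores. */
struct foo_t
{
  unsigned char a;
  unsigned char b;
  unsigned char c;
};

struct foo_t x, y;

void
copy_globals (void)
{
  y.a = x.a;
  y.b = x.b;
  y.c = x.c;
}
```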

Thanks,

-- 
Paulo Matos


Re: riscv64 dep. computation

2019-02-14 Thread Paulo Matos



On 14/02/2019 19:56, Jim Wilson wrote:
> On 2/14/19 3:13 AM, Paulo Matos wrote:
>> If I compile this with -O2, sched1 groups all loads and all stores
>> together. That's perfect. However, if I change TYPE to unsigned char and
>> recompile, the stores and loads are interleaved.
>>
>> Further investigation shows that for unsigned char there are extra
>> dependencies that block the scheduler from grouping stores and loads.
> 
> The ISO C standard says that anything can be cast to char *, and char
> * can be cast to anything.  Hence, a char * pointer aliases everything.
> 
> If you look at the alias set info in the MEMs, you can see that the char
> * references are in alias set 0, which means that they alias everything.
>  The short * references are in alias set 2 which means they only alias
> other stuff in alias set 2.  The difference here is that short * does
> not alias the structure pointers, but char * does.  I haven't tried
> debugging your example, but this is presumably where the difference
> comes from.
>

OK, that makes sense. Indeed, if I use restrict on the argument
pointers, the compiler sorts itself out and groups the loads and stores.
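For reference, the restrict variant would be a sketch like the
following: the qualifier is a promise from the programmer that the two
pointers do not alias, which removes the char-vs-short difference.

```c
struct foo_t
{
  unsigned char a;
  unsigned char b;
  unsigned char c;
};

/* With restrict, the compiler may assume x and y point to disjoint
   objects, so the char-typed (alias set 0) accesses no longer create
   store->load dependencies, and the scheduler can group the loads
   together again. */
void
func (struct foo_t *restrict x, struct foo_t *restrict y)
{
  y->a = x->a;
  y->b = x->b;
  y->c = x->c;
}
```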

> Because x and y are pointer parameters, the compiler must assume that
> they might alias.  And because char * aliases everything, the char
> references alias them too.  If you change x and y to global variables,
> then they no longer alias each other, and the compiler will schedule all
> of the loads first, even for char.
> 

Are global variables not supposed to alias each other?
If I indeed do that, gcc still won't group loads and stores:
https://cx.rv8.io/g/rFjGLa

-- 
Paulo Matos


riscv64 dep. computation

2019-02-14 Thread Paulo Matos
Hello,

While working on a private port of riscv, I noticed that upstream shows
the same behaviour.

For the code:
#define TYPE unsigned short

struct foo_t
{
  TYPE a;
  TYPE b;
  TYPE c;
};

void
func (struct foo_t *x, struct foo_t *y)
{
  y->a = x->a;
  y->b = x->b;
  y->c = x->c;
}

If I compile this with -O2, sched1 groups all loads and all stores
together. That's perfect. However, if I change TYPE to unsigned char and
recompile, the stores and loads are interleaved.

Further investigation shows that for unsigned char there are extra
dependencies that block the scheduler from grouping stores and loads.

For example, there's a dependency between:
(insn 8 3 9 2 (set (mem:QI (reg/v/f:DI 76 [ yD.1533 ]) [0
y_6(D)->aD.1529+0 S1 A8])
(subreg/s/v:QI (reg:DI 72 [ _1 ]) 0)) "test.c":13:8 142
{*movqi_internal}
 (expr_list:REG_DEAD (reg:DI 72 [ _1 ])
(nil)))

and
(insn 11 10 12 2 (set (reg:DI 74 [ _3 ])
(zero_extend:DI (mem:QI (plus:DI (reg/v/f:DI 75 [ xD.1532 ])
(const_int 2 [0x2])) [0 x_5(D)->cD.1531+0 S1 A8])))
"test.c":15:11 89 {zero_extendqidi2}
 (expr_list:REG_DEAD (reg/v/f:DI 75 [ xD.1532 ])
(nil)))

which didn't exist in the `unsigned short' case.

I can't find where this dependency is coming from, but I also can't
justify it, so it seems like a bug to me. Is there a reason for this to
happen that I might not be aware of?

While I am at it, debugging compute_block_dependencies in sched-rgn.c is
a massive pain. It calls sched_analyze, which receives a struct
deps_desc that tracks the dependencies in the insn list. Is there a way
to pretty-print this structure in gdb?

Kind regards,

-- 
Paulo Matos


Re: Extracting live registers

2018-11-07 Thread Paulo Matos



On 07/11/2018 20:27, Segher Boessenkool wrote:
> 
> Sure, it shows the register information at the edges of basic blocks
> only.  This is what you asked for btw ;-)
> 
> 

True, but I need a way to map that information to the assembly
instructions in the basic block. :) I think it's not impossible with
everything that gcc provides, but it would certainly require a fair
amount of parsing of these files, which is not ideal given that their
format might change under my feet.

-- 
Paulo Matos


Re: Extracting live registers

2018-11-06 Thread Paulo Matos



On 07/11/2018 00:40, Segher Boessenkool wrote:
> Hi Paulo,
> 
> -fdump-rtl-alignments[-all] is the last dump with all that information I
> think.  This one also has all this info without -all it seems.  With -all
> it shows it interleaving the RTL dump as well, which may or may not be
> handy for you.
> 

Thanks; however, it provides no correspondence to the set of asm
instructions in the basic block. After you mentioned
-fdump-rtl-alignments, I tried a few related flags and hit upon what I
thought would work: -dA and -dP. Unfortunately, these don't output
live-out information per basic block, so they're not helpful for my
application. It would be great if -dA or -dP showed live-out info as
well, but that doesn't seem to be the case at the moment.

-- 
Paulo Matos


Re: Extracting live registers

2018-11-06 Thread Paulo Matos
Apologies, wrong mailing list. Should have sent this to gcc-help.

On 06/11/2018 21:35, Paulo Matos wrote:
> Hi,
> 
> I remember from a while ago that there's some option (or there was...)
> that gets GCC to print some register allocation information together
> with the assembler output.
> 
> I am interested in obtaining the live registers per basic block. I think
> the option I had in mind did that, but I can't remember the option name
> anymore. Can someone point me to the option or a way to extract such
> information?
> 
> Kind regards,
> 

-- 
Paulo Matos


Extracting live registers

2018-11-06 Thread Paulo Matos
Hi,

I remember from a while ago that there's some option (or there was...)
that gets GCC to print some register allocation information together
with the assembler output.

I am interested in obtaining the live registers per basic block. I think
the option I had in mind did that, but I can't remember the option name
anymore. Can someone point me to the option or a way to extract such
information?

Kind regards,
-- 
Paulo Matos


Re: Both GCC and GDB buildbot use gcc114

2018-02-27 Thread Paulo Matos
On 27/02/18 13:53, Yao Qi wrote:
> Hi Paulo,
> I noticed that GDB buildbot pending build queue on aarch64-linux
> becomes longer and longer,
> https://gdb-build.sergiodj.net/builders/Ubuntu-AArch64-m64
> as it takes longer to finish each build and test.
> 
> It looks like you deployed an aarch64-linux buildslave for GCC buildbot
> on gcc114 as well, which is the reason GDB build and test is slowed
> down (GCC build and test is slowed down too).
> 
> We'd better avoid using the same machine for two buildbots.  Is there
> an easy way to move one buildslave to a different machine like gcc115
> or gcc116?  As far as I know, they are identical.
>

Apologies for the clash on resources. Using gcc115 and gcc116 only now.

Kind regards,

-- 
Paulo Matos


Re: jamais-vu can now ignore renumbering of source lines in dg output (Re: GCC Buildbot Update)

2018-01-29 Thread Paulo Matos


On 29/01/18 15:19, David Malcolm wrote:
>>
>> Hi,
>>
>> I am looking at this today and I noticed that having the source file
>> for
>> all recent GCC revisions is costly in terms of time (if we wish to
>> compress them) and space (for storage). I was instead thinking that
>> jv
>> could calculate the differences offline using pysvn and the old and
>> new
>> revision numbers.
> 
> Note that access to the source files is optional - jv doesn't need
> them, it just helps for the particular situation described above.
> 

I understand, but it would be great to have line-number filtering.

>> I have started implementing this in my port. Would you consider
>> merging it?
> 
> Sounds reasonable - though bear in mind that gcc might be switching to
> git at some point.
> 

Yes, I know... but... if we wait for that to happen to implement
something... :)

> Send a pull request (I've turned on travis CI on the github repository,
> so pull requests now automatically get tested on a bunch of different
> Python 3 versions).
> 

Sure.

-- 
Paulo Matos


Re: jamais-vu can now ignore renumbering of source lines in dg output (Re: GCC Buildbot Update)

2018-01-29 Thread Paulo Matos


On 24/01/18 20:20, David Malcolm wrote:
> 
> I've added a new feature to jamais-vu (as of
> 77849e2809ca9a049d5683571e27ebe190977fa8): it can now ignore test
> results that merely changed line number.  
> 
> For example, if the old .sum file has a:
> 
>   PASS: g++.dg/diagnostic/param-type-mismatch.C  -std=gnu++11  (test for 
> errors, line 106)
> 
> and the new .sum file has a:
> 
>   PASS: g++.dg/diagnostic/param-type-mismatch.C  -std=gnu++11  (test for 
> errors, line 103)
> 
> and diffing the source trees reveals that line 106 became line 103, the
> change won't be reported by "jv compare".
> 
> It also does it for dg-{begin|end}-multiline-output.
> 
> It will report them if the outcome changed (e.g. from PASS to FAIL).
> 
> To do this filtering, jv needs access to the old and new source trees,
> so it can diff the pertinent source files, so "jv compare" has gained
> the optional arguments
>   --old-source-path=
> and
>   --new-source-path=
> See the example in the jv Makefile for more info.  If they're not
> present, it should work as before (without being able to do the above
> filtering).


Hi,

I am looking at this today and I noticed that having the source file for
all recent GCC revisions is costly in terms of time (if we wish to
compress them) and space (for storage). I was instead thinking that jv
could calculate the differences offline using pysvn and the old and new
revision numbers.

I have started implementing this in my port. Would you consider merging it?

-- 
Paulo Matos


Re: jamais-vu can now ignore renumbering of source lines in dg output (Re: GCC Buildbot Update)

2018-01-24 Thread Paulo Matos


On 24/01/18 20:20, David Malcolm wrote:
> 
> I've added a new feature to jamais-vu (as of
> 77849e2809ca9a049d5683571e27ebe190977fa8): it can now ignore test
> results that merely changed line number.  
> 
> For example, if the old .sum file has a:
> 
>   PASS: g++.dg/diagnostic/param-type-mismatch.C  -std=gnu++11  (test for 
> errors, line 106)
> 
> and the new .sum file has a:
> 
>   PASS: g++.dg/diagnostic/param-type-mismatch.C  -std=gnu++11  (test for 
> errors, line 103)
> 
> and diffing the source trees reveals that line 106 became line 103, the
> change won't be reported by "jv compare".
> 
> It also does it for dg-{begin|end}-multiline-output.
> 
> It will report them if the outcome changed (e.g. from PASS to FAIL).
> 
> To do this filtering, jv needs access to the old and new source trees,
> so it can diff the pertinent source files, so "jv compare" has gained
> the optional arguments
>   --old-source-path=
> and
>   --new-source-path=
> See the example in the jv Makefile for more info.  If they're not
> present, it should work as before (without being able to do the above
> filtering).
> 
> Is this something that the buildbot can use?
> 

Hi David,

Thanks for the amazing improvements.
I will take a look at them on Monday. I have a lot of work at the
moment, so I decided to dedicate 1/5 of my week (usually Monday) to the
buildbot. I will definitely get it integrated on Monday and hopefully
have something to say afterwards.

Thanks for keeping me up-to-date with these changes.

-- 
Paulo Matos


Re: GCC Buildbot Update

2017-12-20 Thread Paulo Matos


On 20/12/17 12:48, James Greenhalgh wrote:
> On Wed, Dec 20, 2017 at 10:02:45AM +0000, Paulo Matos wrote:
>>
>>
>> On 20/12/17 10:51, Christophe Lyon wrote:
>>>
>>> The recent fix changed the Makefile and configure script in libatomic.
>>> I guess that if your incremental builds does not run configure, it's
>>> still using old Makefiles, and old options.
>>>
>>>
>> You're right. I guess incremental builds should always call configure,
>> just in case.
> 
> For my personal bisect scripts I try an incremental build, with a
> full rebuild as a fallback on failure.
> 
> That gives me the benefits of an incremental build most of the time (I
> don't have stats on how often) with an automated approach to keeping things
> going where there are issues.
> 
> Note that there are rare cases where dependencies are missed in the toolchain
> and an incremental build will give you a toolchain with undefined
> behaviour, as one compilation unit takes a new definition of a
> struct/interface and the other sits on an outdated compile from the
> previous build.
> 
> I don't have a good way to detect these.
> 

That's definitely a shortcoming of incremental builds. Unfortunately we
cannot cope with full builds for each commit (even for incremental
builds we'll need an alternative soon), so I think I will implement the
same strategy of a full build if the incremental one fails.

With regards to incremental builds with undefined behaviour, that
probably means that dependencies are incorrectly calculated. It would be
great to sort these out. If we could detect that there are issues with
the incremental build, we could then try to understand which
dependencies were not properly calculated. Just a guess; however,
implementing this might take a while and would obviously need a lot more
resources than we have available now.

-- 
Paulo Matos


Re: GCC Buildbot Update

2017-12-20 Thread Paulo Matos


On 20/12/17 10:51, Christophe Lyon wrote:
> 
> The recent fix changed the Makefile and configure script in libatomic.
> I guess that if your incremental builds does not run configure, it's
> still using old Makefiles, and old options.
> 
> 
You're right. I guess incremental builds should always call configure,
just in case.

Thanks,
-- 
Paulo Matos


Re: GCC Buildbot Update

2017-12-20 Thread Paulo Matos


On 15/12/17 10:21, Christophe Lyon wrote:
> On 15 December 2017 at 10:19, Paulo Matos <pmatos@linki.tools> wrote:
>>
>>
>> On 14/12/17 21:32, Christophe Lyon wrote:
>>> Great, I thought the CF machines were reserved for developers.
>>> Good news you could add builders on them.
>>>
>>
>> Oh. I have seen similar things happening on CF machines so I thought it
>> was not a problem. I have never specifically asked for permission.
>>
>>>> pmatos@gcc115:~/gcc-8-20171203_BUILD$ as -march=armv8.1-a
>>>> Assembler messages:
>>>> Error: unknown architecture `armv8.1-a'
>>>>
>>>> Error: unrecognized option -march=armv8.1-a
>>>>
>>>> However, if I run a compiler build manually with just:
>>>>
>>>> $ configure --disable-multilib
>>>> $ nice -n 19 make -j4 all
>>>>
>>>> This compiles just fine. So I am at the moment attempting to investigate
>>>> what might cause the difference between what buildbot does and what I do
>>>> through ssh.
>>>>
>>> I suspect you are hitting a bug introduced recently, and fixed by:
>>> https://gcc.gnu.org/ml/gcc-patches/2017-12/msg00434.html
>>>
>>
>> Wow, that's really useful. Thanks for letting me know.
>>
> And the patch was committed last night (r255659), so maybe your builds now 
> work?
> 

On some machines, in incremental builds, I am still seeing this:
Assembler messages:
Error: unknown architectural extension `lse'
Error: unrecognized option -march=armv8-a+lse
make[4]: *** [load_1_1_.lo] Error 1
make[4]: *** Waiting for unfinished jobs

Looks related... the only strange thing is that this doesn't happen in
full builds.

-- 
Paulo Matos


Re: GCC Buildbot Update

2017-12-16 Thread Paulo Matos


On 15/12/17 18:05, Segher Boessenkool wrote:
> All the cfarm machines are shared resources.  Benchmarking on them will
> not work no matter what.  And being a shared resource means all users
> have to share and be mindful of others.
> 

Yes, we'll definitely need better machines for benchmarking. Something I
haven't thought of yet.

>> So it would be good if there was a strict separation of machines used
>> for bots and machines used by humans. In other words bots should only
>> run on dedicated machines.
> 
> The aarch64 builds should probably not use all of gcc113..gcc116.
>
> We do not have enough resources to dedicate machines to bots.
>

I have disabled gcc116.

Thanks,
-- 
Paulo Matos


Re: GCC Buildbot Update

2017-12-16 Thread Paulo Matos


On 15/12/17 15:29, David Malcolm wrote:
> On Fri, 2017-12-15 at 10:16 +0100, Paulo Matos wrote:
>>
>> On 14/12/17 12:39, David Malcolm wrote:
> 
> [...]
> 
>>> It looks like you're capturing the textual output from "jv compare"
>>> and
>>> using the exit code.  Would you prefer to import "jv" as a python
>>> module and use some kind of API?  Or a different output format?
>>>
>>
>> Well, I am using a fork of it which I converted to Python3. Would you
>> be
>> open to convert yours to Python3? The reason I am doing this is
>> because
>> all other Python software I have and the buildbot use Python3.
> 
> Done.
> 
> I found and fixed some more bugs, also (introduced during my
> refactoring, sigh...)
> 

That's great. Thank you very much for this work.

>> I would also prefer to have some json format or something but when I
>> looked at it, the software was just printing to stdout and I didn't
>> want
>> to spend too much time implementing it, so I thought parsing the
>> output
>> was just easier.
> 
> I can add JSON output (or whatever), but I need to get back to gcc 8
> work, so if the stdout output is good enough for now, let's defer
> output changes.
> 

Agree, for now I can use what I already have to read the output of jv.
I think I can now delete my fork and just use upstream jv as a submodule.

-- 
Paulo Matos


Re: GCC Buildbot Update

2017-12-15 Thread Paulo Matos


On 15/12/17 10:21, Christophe Lyon wrote:
> And the patch was committed last night (r255659), so maybe your builds now 
> work?
> 

Forgot to mention that. Yes, it built!
https://gcc-buildbot.linki.tools/#/builders/5

-- 
Paulo Matos


Re: GCC Buildbot Update

2017-12-15 Thread Paulo Matos


On 15/12/17 08:42, Markus Trippelsdorf wrote:
> 
> I don't think this is good news at all. 
> 

As I pointed out in a reply to Chris, I haven't sought permission, but I
am pretty sure something similar runs on the CF machines for other
projects.

The downside is that if we can't use the CF, I have no extra machines to
run the buildbot on.

> Once a buildbot runs on a CF machine it immediately becomes impossible
> to do any meaningful measurement on that machine. That is mainly because
> of the random I/O (untar, rm -fr, etc.) of the bot. As a result variance
> goes to the roof and all measurements drown in noise.
> 
> So it would be good if there was a strict separation of machines used
> for bots and machines used by humans. In other words bots should only
> run on dedicated machines.
> 

I understand your concern though. Do you know who this issue could be
raised with? FSF?

-- 
Paulo Matos


Re: GCC Buildbot Update

2017-12-15 Thread Paulo Matos


On 14/12/17 21:32, Christophe Lyon wrote:
> Great, I thought the CF machines were reserved for developers.
> Good news you could add builders on them.
> 

Oh. I have seen similar things happening on CF machines so I thought it
was not a problem. I have never specifically asked for permission.

>> pmatos@gcc115:~/gcc-8-20171203_BUILD$ as -march=armv8.1-a
>> Assembler messages:
>> Error: unknown architecture `armv8.1-a'
>>
>> Error: unrecognized option -march=armv8.1-a
>>
>> However, if I run a compiler build manually with just:
>>
>> $ configure --disable-multilib
>> $ nice -n 19 make -j4 all
>>
>> This compiles just fine. So I am at the moment attempting to investigate
>> what might cause the difference between what buildbot does and what I do
>> through ssh.
>>
> I suspect you are hitting a bug introduced recently, and fixed by:
> https://gcc.gnu.org/ml/gcc-patches/2017-12/msg00434.html
> 

Wow, that's really useful. Thanks for letting me know.

-- 
Paulo Matos


Re: GCC Buildbot Update

2017-12-15 Thread Paulo Matos


On 14/12/17 12:39, David Malcolm wrote:
> 
> Looking at some of the red blobs in e.g. the grid view there seem to be
> a few failures in the initial "update gcc trunk repo" step of the form:
> 
> svn: Working copy '.' locked
> svn: run 'svn cleanup' to remove locks (type 'svn help cleanup' for
> details)
> 

Yes, that's a big annoyance and a reason I have thought about moving to
using the git mirror, however that would probably bring other issues so
I am holding off. I need to add a reporter so that if it fails I am
notified by email and mobile phone.

This happens when there's a timeout from a server _during_ a
checkout/update (the svn repo unfortunately times out way too often). I
thought about doing an svn cleanup before each checkout but read it's
not good practice. If you have any suggestions on this please let me know.

> https://gcc-lnt.linki.tools/#/builders/3/builds/388/steps/0/logs/stdio
> 

Apologies, https://gcc-lnt.linki.tools is currently incorrectly
forwarding you to https://gcc-buildbot.linki.tools. I meant to have it
return an error until I open that up.

> Is there a bug-tracking location for the buildbot?
> Presumably:
>   https://github.com/LinkiTools/gcc-buildbot/issues
> ?
> 

That's correct.

> I actually found a serious bug in jamais-vu yesterday - it got confused
> by  multiple .sum lines for the same source line e.g. from multiple
> "dg-" directives that all specify a particular line).  For example,
> when testing one of my patches, of the 3 tests reporting as
>   "c-c++-common/pr83059.c  -std=c++11  (test for warnings, line 7)"
> one of the 3 PASS results became a FAIL.  jv correctly reported that
> new FAILs had occurred, but wouldn't identify them, and mistakenly
> reported that new PASSes had occurred also.
> 
> I've fixed that now; to do so I've done some refactoring and added a
> testsuite.
>

Perfect, thank you very much for this work.

> It looks like you're capturing the textual output from "jv compare" and
> using the exit code.  Would you prefer to import "jv" as a python
> module and use some kind of API?  Or a different output format?
> 

Well, I am using a fork of it which I converted to Python 3. Would you
be open to converting yours to Python 3? The reason I am doing this is
that all the other Python software I have, and the buildbot, use Python 3.

I would also prefer to have some json format or something but when I
looked at it, the software was just printing to stdout and I didn't want
to spend too much time implementing it, so I thought parsing the output
was just easier.

> If you file pull request(s) for the changes you've made in your copy of
> jamais-vu, I can take at look at merging them.
>

Happy to do so...
Will merge your changes into my fork first then.

Kind regards,
-- 
Paulo Matos


GCC Buildbot Update

2017-12-14 Thread Paulo Matos
 test results there as well.

*TODO:*

So on my todo list for the next iteration I have:

- analysis of out-of-memory issues in CF for Fast builders;
- analysis of aarch64 build failure;
- merge regression testing verification branch into master and deploy
into production;
  - this should then trigger the irc bot reporter for regressions;
- open up LNT for benchmarking and add benchmarking job for x64_64 using
csibe (as an initial proof of concept);

If you have any wishes, questions, want to offer some fast machines to
have workers on or if you know what's wrong with the Fast builder or the
aarch64 machines, please let me know.

I hope to send another update in about a month's time.

Kind regards,
-- 
Paulo Matos


Re: GCC CI on Power

2017-11-07 Thread Paulo Matos


On 07/11/17 16:54, David Malcolm wrote:
> On Mon, 2017-11-06 at 18:46 -0200, Nathália Harumi wrote:
>> Hi,
>> My name is Nathália Harumi, I'm a student of The University of
>> Campinas.
>>
>> I'm working on validation projects on OpenPower lab (which is a
>> partnership
>> between IBM and The University of Campinas, in Brazil) and we'd like
>> to
>> find a way to contribute the GCC community.
>>
>> Now a days, we use a Buildbot platform for internal validation
>> projects on
>> Power with several ppc architectures and flavours as workers. We're
>> also
>> working with glibc community to improve it buildbot and to provide
>> workers
>> for builds on ppc.
>>
>> So, we'd like to know which platform you use for CI and how we can
>> contribute with it.
>>
>> Thank you,
>> Nathália.
> 
> Hi Nathália
> 
> I don't think there's anything "official" yet, but Paulo Matos [CCed]
> has been experimenting with running buildbot for gcc:
> 
>   https://gcc.gnu.org/ml/gcc/2017-10/msg00068.html
> 
> (albeit it with some significant unsolved problems, as I understand it)
> 
> Maybe you and Paulo could set things up so that the ppc workers you
> have could contribute to Paulo's buildbot instance?
> 

Dave, thanks for forwarding this to me.

Hello Nathalia,

Thanks for reaching out.
I am running an experimental buildbot for GCC (currently possibly down
as it's moving servers, but downtime should be short).

There are a few issues to fix, which I will be dealing with in the next
couple of weeks. A few examples of important issues to fix are:

* dealing with regressions on the gcc testsuite;
* trimming down the testsuite to get a fast testsuite run which takes
no longer than a coffee break;
* ensuring that notifications are processed properly.

Would you be able to share your configuration or solution to these issues?

Kind regards,

-- 
Paulo Matos


Re: GCC Buildbot Update - Definition of regression

2017-10-11 Thread Paulo Matos


On 11/10/17 11:15, Christophe Lyon wrote:
> 
> You can have a look at
> https://git.linaro.org/toolchain/gcc-compare-results.git/
> where compare_tests is a patched version of the contrib/ script,
> it calls the main perl script (which is not the prettiest thing :-)
> 

Thanks, that's useful. I will take a look.

-- 
Paulo Matos


Re: GCC Buildbot Update - Definition of regression

2017-10-11 Thread Paulo Matos


On 11/10/17 10:35, Christophe Lyon wrote:
> 
> FWIW, we consider regressions:
> * any->FAIL because we don't want such a regression at the whole testsuite 
> level
> * any->UNRESOLVED for the same reason
> * {PASS,UNSUPPORTED,UNTESTED,UNRESOLVED}-> XPASS
> * new XPASS
> * XFAIL disappears (may mean that a testcase was removed, worth a manual 
> check)
> * ERRORS
> 

That's certainly stricter than what was proposed by Joseph. I will run a
few tests on historical data to see what I get using both approaches.

> 
> 
>>> ERRORs in the .sum or .log files should be watched out for as well,
>>> however, as sometimes they may indicate broken Tcl syntax in the
>>> testsuite, which may cause many tests not to be run.
>>>
>>> Note that the test names that come after PASS:, FAIL: etc. aren't unique
>>> between different .sum files, so you need to associate tests with a tuple
>>> (.sum file, test name) (and even then, sometimes multiple tests in a .sum
>>> file have the same name, but that's a testsuite bug).  If you're using
>>> --target_board options that run tests for more than one multilib in the
>>> same testsuite run, add the multilib to that tuple as well.
>>>
>>
>> Thanks for all the comments. Sounds sensible.
>> By not being unique, you mean between languages?
> Yes, but not only as Joseph mentioned above.
> 
> You have the obvious example of c-c++-common/*san tests, which are
> common to gcc and g++.
> 
>> I assume that two gcc.sum from different builds will always refer to the
>> same test/configuration when referring to (for example):
>> PASS: gcc.c-torture/compile/2105-1.c   -O1  (test for excess errors)
>>
>> In this case, I assume that "gcc.c-torture/compile/2105-1.c   -O1
>> (test for excess errors)" will always be referring to the same thing.
>>
> In gcc.sum, I can see 4 occurrences of
> PASS: gcc.dg/Werror-13.c  (test for errors, line )
> 
> Actually, there are quite a few others like that
> 

That actually surprised me.

I also see:
PASS: gcc.dg/Werror-13.c  (test for errors, line )
PASS: gcc.dg/Werror-13.c  (test for errors, line )
PASS: gcc.dg/Werror-13.c  (test for errors, line )
PASS: gcc.dg/Werror-13.c  (test for errors, line )

among others like it. Looks like a line number is missing?

In any case, it feels like the code I have to track this down needs to
be improved.

-- 
Paulo Matos


Re: GCC Buildbot Update - Definition of regression

2017-10-11 Thread Paulo Matos


On 10/10/17 23:25, Joseph Myers wrote:
> On Tue, 10 Oct 2017, Paulo Matos wrote:
> 
>> new test -> FAIL; New test starts as fail
> 
> No, that's not a regression, but you might want to treat it as one (in the 
> sense that it's a regression at the higher level of "testsuite run should 
> have no unexpected failures", even if the test in question would have 
> failed all along if added earlier and so the underlying compiler bug, if 
> any, is not a regression).  It should have human attention to classify it 
> and either fix the test or XFAIL it (with issue filed in Bugzilla if a 
> bug), but it's not a regression.  (Exception: where a test failing results 
> in its name changing, e.g. through adding "(internal compiler error)".)
> 

When someone adds a new test to the testsuite, isn't it supposed not to
FAIL? If it does FAIL, shouldn't this be considered a regression?

Now, the danger is that since regressions are comparisons with previous
run something like this would happen:

run1:
...
FAIL: foo.c ; new test
...

run1 fails because new test entered as a FAIL

run2:
...
FAIL: foo.c
...

run2 succeeds because there are no changes.

For this reason, all of these issues need to be taken care of straight
away or they become part of the 'normal' status and no more failures are
reported... unless, of course, a more complex regression analysis is
implemented.

Also, when I say run1 fails or succeeds, that is just the term I use to
describe red/green in the buildbot interface for a given build, not
necessarily what I expect the process will do.

> 
> My suggestion is:
> 
> PASS -> FAIL is an unambiguous regression.
> 
> Anything else -> FAIL and new FAILing tests aren't regressions at the 
> individual test level, but may be treated as such at the whole testsuite 
> level.
> 
> Any transition where the destination result is not FAIL is not a 
> regression.
> 
> ERRORs in the .sum or .log files should be watched out for as well, 
> however, as sometimes they may indicate broken Tcl syntax in the 
> testsuite, which may cause many tests not to be run.
> 
> Note that the test names that come after PASS:, FAIL: etc. aren't unique 
> between different .sum files, so you need to associate tests with a tuple 
> (.sum file, test name) (and even then, sometimes multiple tests in a .sum 
> file have the same name, but that's a testsuite bug).  If you're using 
> --target_board options that run tests for more than one multilib in the 
> same testsuite run, add the multilib to that tuple as well.
> 

Thanks for all the comments. Sounds sensible.
By not being unique, you mean between languages?
I assume that two gcc.sum from different builds will always refer to the
same test/configuration when referring to (for example):
PASS: gcc.c-torture/compile/2105-1.c   -O1  (test for excess errors)

In this case, I assume that "gcc.c-torture/compile/2105-1.c   -O1
(test for excess errors)" will always be referring to the same thing.
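As a concrete reading of Joseph's suggestion, the per-test
classification could be sketched like this (a hypothetical helper, not
code from the buildbot; results are taken as plain strings):

```c
#include <stddef.h>
#include <string.h>

enum verdict
{
  NOT_A_REGRESSION,   /* destination result is not FAIL */
  REGRESSION,         /* unambiguous: PASS -> FAIL */
  NEEDS_ATTENTION     /* other -> FAIL, or a brand-new FAILing test:
                         flagged at the whole-testsuite level */
};

/* old_result may be NULL for a test that did not exist in the
   previous run. */
enum verdict
classify (const char *old_result, const char *new_result)
{
  if (strcmp (new_result, "FAIL") != 0)
    return NOT_A_REGRESSION;
  if (old_result != NULL && strcmp (old_result, "PASS") == 0)
    return REGRESSION;
  return NEEDS_ATTENTION;
}
```

Under this reading, transitions such as XFAIL -> FAIL or a new FAILing
test get human attention at the testsuite level rather than being
reported as individual regressions.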

-- 
Paulo Matos


Re: GCC Buildbot Update - Definition of regression

2017-10-11 Thread Paulo Matos


On 11/10/17 06:17, Markus Trippelsdorf wrote:
> On 2017.10.10 at 21:45 +0200, Paulo Matos wrote:
>> Hi all,
>>
>> It's almost 3 weeks since I last posted on GCC Buildbot. Here's an update:
>>
>> * 3 x86_64 workers from CF are now installed;
>> * There's one scheduler for trunk doing fresh builds for every Daily bump;
>> * One scheduler doing incremental builds for each active branch;
>> * An IRC bot which is currently silent;
> 
> Using -j8 for the bot on a 8/16 (core/thread) machine like gcc67 is not
> acceptable, because it will render it unusable for everybody else.

I was going to correct you on that given what I read in
https://gcc.gnu.org/wiki/CompileFarm#Usage

but it was my mistake. I assumed that for an N-thread machine, I could
use N/2 processes but the guide explicitly says N-core, not N-thread.
Therefore I should be using 4 processes for gcc67 (or 0 given what follows).

I will fix also the number of processes used by the other workers.

> Also gcc67 has a buggy Ryzen CPU that causes random gcc crashes. Not the
> best setup for a regression tester...
> 

Is that documented anywhere? I will remove this worker.

Thanks,

-- 
Paulo Matos


GCC Buildbot Update - Definition of regression

2017-10-10 Thread Paulo Matos
Hi all,

It's almost 3 weeks since I last posted on GCC Buildbot. Here's an update:

* 3 x86_64 workers from CF are now installed;
* There's one scheduler for trunk doing fresh builds for every Daily bump;
* One scheduler doing incremental builds for each active branch;
* An IRC bot which is currently silent;

The next steps are:
* Enable LNT (I have installed this but have yet to connect to buildbot)
for tracking performance benchmarks over time -- it should come up as
http://gcc-lnt.linki.tools in the near future.
* Enable regression analysis --- This is fundamental. I understand that
without this the buildbot is pretty useless so it has highest priority.
However, I would like some agreement as to what in GCC should be
considered a regression. Each test in deja gnu can have several status:
FAIL, PASS, UNSUPPORTED, UNTESTED, XPASS, KPASS, XFAIL, KFAIL, UNRESOLVED

Since GCC doesn't have a 'clean bill' of test results we need to analyse
the sum files for the current run and compare with the last run of the
same branch. I have written down that if, for each test, there's a
transition that looks like the following, then a regression exists and
the test run should be marked as a failure.

ANY -> no test  ; Test disappears
ANY / XPASS -> XPASS; Test goes from any status other than XPASS
to XPASS
ANY / KPASS -> KPASS; Test goes from any status other than KPASS
to KPASS
new test -> FAIL; New test starts as fail
PASS -> ANY ; Test moves away from PASS
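As a sketch, the transition rules above could be expressed like this
(illustrative Python, not the actual gcc-buildbot code):

```python
def is_regression(old_status, new_status):
    """Classify a DejaGnu (old -> new) status transition.

    old_status is None for a test that did not exist in the previous
    run; new_status is None for a test that disappeared.
    """
    if new_status is None:                        # ANY -> no test
        return True
    if old_status is None:                        # new test -> FAIL
        return new_status == "FAIL"
    if new_status == "XPASS" and old_status != "XPASS":
        return True                               # ANY / XPASS -> XPASS
    if new_status == "KPASS" and old_status != "KPASS":
        return True                               # ANY / KPASS -> KPASS
    if old_status == "PASS" and new_status != "PASS":
        return True                               # PASS -> ANY
    return False
```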

This is a suggestion. I am keen to have corrections from people who use
this on a daily basis and/or have a better understanding of each status.

As soon as we reach a consensus, I will deploy this analysis and enable
IRC bot to notify on the #gcc channel the results of the tests.

-- 
Paulo Matos


Re: GCC Buildbot

2017-09-26 Thread Paulo Matos


On 26/09/17 10:43, Martin Liška wrote:
> On 09/25/2017 02:49 PM, Paulo Matos wrote:
>> For benchmarks like Qt, blitz (as mentioned in the gcc testing page), we
>> can plot the build time of the benchmark and resulting size when
>> compiling for size.
>>
> 
> Please consider using LNT:
> http://llvm.org/docs/lnt/
> 
> Usage:
> http://lnt.llvm.org/
> 
> I've been investigating the tools and I know that ARM people use the tool:
> https://gcc.gnu.org/wiki/cauldron2017#ARM-perf
> 

Good suggestion. I was actually at the presentation. The reason I was
going with influx+grafana was because I know the process and already use
that internally --- the LNT configuration was unknown to me but you're
right. It might be better in the long term. I will look at the
documentation.

Thanks.

-- 
Paulo Matos


Re: GCC Buildbot

2017-09-25 Thread Paulo Matos


On 25/09/17 13:36, Martin Liška wrote:
> 
> Would be great, what exactly do you want to visualize? For me, even having 
> green/red spots
> works fine in order to quickly identify what builds are wrong.
> 

There are several options, and I think it mostly depends on what everyone
would like to see, but I am thinking of a dashboard with green/red spots
as you mention, which depends not on the existence of failures but on the
existence of a regression at a certain revision. Also, a historical graph
of results and gcc build times might be interesting as well.

For benchmarks like Qt, blitz (as mentioned in the gcc testing page), we
can plot the build time of the benchmark and resulting size when
compiling for size.

Again, I expect that once there's something visible and people are keen
to use it, they'll ask for something specific. However, once the
infrastructure is in place, it shouldn't be too hard to add specific
visualizations.

> 
> Hopefully both. I'm attaching my config file (probably more for inspiration 
> that a real use).
> I'll ask my manager whether we can find a machine that can run more complex 
> tests. I'll inform you.
> 

Thanks for the configuration file. I will take a look. Will eagerly wait
for news on the hardware request.

> 
> Yes, duplication in way that it is (will be) same things. I'm adding author 
> of the tool,
> hopefully we can unify the effort (and resources of course).
> 

Great.

-- 
Paulo Matos


Re: GCC Buildbot

2017-09-25 Thread Paulo Matos


On 25/09/17 13:14, Jonathan Wakely wrote:
> On 25 September 2017 at 11:13, Paulo Matos wrote:
>>> Apart from that, I fully agree with octoploid that 
>>> http://toolchain.lug-owl.de/buildbot/ is duplicated effort which is running
>>> on GCC compile farm machines and uses a shell scripts to utilize. I would 
>>> prefer to integrate it to Buildbot and utilize same
>>> GCC Farm machines for native builds.
>>>
>>
>> Octoploid? Is that a typo?
> 
> No, it's Markus Trippelsdorf's username.
> 

Ah, thanks for the clarification.

-- 
Paulo Matos


Re: GCC Buildbot

2017-09-25 Thread Paulo Matos


On 25/09/17 11:52, Martin Liška wrote:
> Hi Paulo.
> 
> Thank you for working on that! To be honest, I've been running local buildbot 
> on
> my desktop machine which does builds your buildbot instance can do (please 
> see:
> https://pasteboard.co/GLZ0vLMu.png):
> 

Hi Martin,

Thanks for sharing your builders. Looks like you've got a good setup going.

I have done only the very basics, since my interest was to understand
whether people would find it useful. I didn't want to waste my time
building something people have no interest in using.

It seems there is some interest so I am gathering some requirements in
the GitHub issues of the project. One very important feature is
visualization of results, so I am integrating support for data gathering
in influxdb to display using grafana. I do not work full time on this,
so it's going slowly but I should have a dashboard to show in the next
couple of weeks.

> - doing time to time (once a week) sanitizer builds: ASAN, UBSAN and run 
> test-suite
> - doing profiled bootstrap, LTO bootstrap (yes, it has been broken for quite 
> some time) and LTO profiled bootstrap
> - building project with --enable-gather-detailed-mem-stats
> - doing coverage --enable-coverage, running test-suite and uploading to a 
> location: https://gcc.opensuse.org/gcc-lcov/
> - similar for Doxygen: https://gcc.opensuse.org/gcc-doxygen/
> - periodic building of some projects: Inkscape, GIMP, linux-kernel, Firefox - 
> I do it with -O2, -O2+LTO, -O3, ...
>   Would be definitely fine, but it takes some care to maintain compatible 
> versions of a project and GCC compiler.
>   Plus handling of dependencies of external libraries can be irritating.
> - cross build for primary architectures
>
> That's list of what I have and can be inspiration for you. I can help if you 
> want and we can find a reasonable resources
> where this can be run.
>

Thanks. That's great. As you can see from #9 in
https://github.com/LinkiTools/gcc-buildbot/issues/9, most of these things
I hope to be able to run on the CompileFarm unless, of course, people
host a worker on their own hardware. Regarding your offer of resources:
are you offering to merge your config, or hardware? Either would be
great; however, I expect your config would have to be ported to buildbot
nine before merging.

> Apart from that, I fully agree with octoploid that 
> http://toolchain.lug-owl.de/buildbot/ is duplicated effort which is running
> on GCC compile farm machines and uses a shell scripts to utilize. I would 
> prefer to integrate it to Buildbot and utilize same
> GCC Farm machines for native builds.
> 

Octoploid? Is that a typo?
I discussed that at the Cauldron with David and was surprised to learn
that the buildbot you reference is actually not a buildbot implementation
using the Python framework but handwritten software. So, in that respect,
it is not duplicated effort. It is duplicated effort if, on the other
hand, we try to test the same things. I will try to understand how to
merge efforts with that buildbot.

> Another inspiration (for builds) can come from what LLVM folks do:
> http://lab.llvm.org:8011/builders
> 

Thanks for the pointer. At one point I tried to read their configuration;
however, I found the one by GDB simpler and used it as a basis for what I
have. I will look at their builders nonetheless to understand what they
build and how long they take.

> Anyway, it's good starting point what you did and I'm looking forward to more 
> common use of the tool.
> Martin
> 

Thanks,
-- 
Paulo Matos


Re: GCC Buildbot

2017-09-22 Thread Paulo Matos


On 22/09/17 01:23, Joseph Myers wrote:
> On Thu, 21 Sep 2017, Paulo Matos wrote:
> 
>> Interesting suggestion. I haven't had the opportunity to look at the
>> compile farm. However, it could be interesting to have a mix of workers:
>> native compile farm ones and some x86_64 doing cross compilation and
>> testing.
> 
> Note that even without a simulator (but with target libc), you can test 
> just the compilation parts of the testsuite using a board file with a 
> dummy _load implementation.
> 

I was not aware of that. I will keep that in mind once I try to set up a
cross-compilation worker.

I assume you have done this before. Do you have any scripts for
cross-compiling you can share?

Thanks,

-- 
Paulo Matos


Re: GCC Buildbot

2017-09-21 Thread Paulo Matos
t framework can also lead to embarrassing
> situations, like when I blamed a C++ front-end patch for a regression
> in fortran ;-)
> 
> Most of the time, I consider it's more efficient for the project if I warn
> the author of the patch that introduced the regression than if I try to
> fix it myself. Except for the most trivial ones, it resulted several times
> in duplicated effort and waste of time. But of course, there are many
> more efficient gcc developers than me here :)
> 

I think that's the point. I mean, as soon as a regression/build failure
is noticed, the buildbot should notify the right people of what happened
and those need to take notice and fix it or revert their patch. If
someone submits a patch, is notified it breaks GCC and does nothing,
then we have a bigger problem.

> Regarding the cpu power, maybe we could have free slots in
> some cloud? (travis? amazon?, )
> 

Any suggestions on how to get these free slots? :)

Thanks for all the great suggestions and tips on your email.
-- 
Paulo Matos


Re: GCC Buildbot

2017-09-21 Thread Paulo Matos


On 21/09/17 16:41, Martin Sebor wrote:
> 
> The regression and the testresults lists are useful but not nearly
> as much as they could be.  For one, the presentation isn't user
> friendly (a matrix view would be much more informative).  But even
> beyond it, rather than using the pull model (people have to make
> an effort to search it for results of their changes or the changes
> of others to find regressions), the regression testing setup could
> be improved by adopting the push model and automatically emailing
> authors of changes that caused regressions (or opening bugs, or
> whatever else might get our attention).
> 

This is certainly one of the notifications that I think need to be
implemented. If a patch breaks the build or testing, the responsible parties
need to be informed, i.e. committers, authors and possibly the list as well.

Thanks,

-- 
Paulo Matos


Re: GCC Buildbot

2017-09-21 Thread Paulo Matos


On 21/09/17 14:11, Mark Wielaard wrote:
> Hi,
> 
> First let me say I am also a fan of buildbot. I use it for a couple of
> projects and it is really flexible, low on resources, easy to add new
> builders/workers and easily extensible if you like python.
> 
> On Thu, 2017-09-21 at 07:18 +0200, Markus Trippelsdorf wrote:
>> And it has the basic problem of all automatic testing: that in the
>> long run everyone simply ignores it.
>> The same thing would happen with the proposed new buildbot. It would
>> use still more resources on the already overused machines without
>> producing useful results.
> 
> But this is a real concern and will happen if you are too eager testing
> all the things all the time. So I would recommend to start as small as
> possible. Pick a target that builds as fast as possible. Once you go
> over 30 minutes of build/test time it really is already too long. Both
> on machine resources and human attention span. And pick a subset of the
> testsuite that should be zero-FAIL. Only then will people really take
> notice when the build turns from green-to-red. Otherwise people will
> always have an excuse "well, those tests aren't really reliable, it
> could be something else". And then only once you have a stable
> small/fast builder that reliably warns committers that their commit
> broke something extend it to new targets/setups/tests as long as you
> can keep the false warnings as close to zero as possible.
> 

Thanks. This is an interesting idea; however, it might not be an easy
exercise to choose a subset of the tests for each compiled language that
PASS, are quick to run, and are representative. It would be interesting
to hear from some of the main developers which of the tests would be
better to run.

-- 
Paulo Matos


Re: GCC Buildbot

2017-09-21 Thread Paulo Matos


On 21/09/17 02:27, Joseph Myers wrote:
> On Wed, 20 Sep 2017, Segher Boessenkool wrote:
> 
>>> - buildbot can notify people if the build fails or if there's a test
>>> regression. Notification can be sent to IRC and email for example. What
>>> would people prefer to have as the settings for notifications?
>>
>> Just try it!  IRC is most useful I think, at least for now.  But try
>> whatever seems useful, if there are too many complaints you can always
>> turn it off again ;-)
> 
> We have the gcc-regression list for automatic regression notifications 
> (but as per my previous message, regression notifications are much more 
> useful if someone actually does the analysis, promptly, to identify the 
> cause and possibly a fix).
> 

Yes, the gcc-regression list. Will add a notifier to email the list.

> My glibc bots only detect testsuite regressions that change the exit 
> status of "make check" from successful to unsuccessful (and regressions 
> that break any part of the build, etc.).  That works for glibc, where the 
> vast bulk of configurations have clean compilation-only testsuite results.  
> It won't work for GCC - you need to detect changes in the results of 
> individual tests (or new tests which start out as FAIL and may not 
> strictly be regressions but should still be fixed).  Ideally the expected 
> baseline *would* be zero FAILs, but we're a long way off that.
> 

Yes, with GCC it is slightly more complex, but it should be possible to
calculate regressions even in the presence of non-zero FAILs.

Thanks for your comments,

-- 
Paulo Matos


Re: GCC Buildbot

2017-09-21 Thread Paulo Matos


On 21/09/17 01:01, Segher Boessenkool wrote:
> Hi!
> 
> On Wed, Sep 20, 2017 at 05:01:55PM +0200, Paulo Matos wrote:
>> This mail's intention is to gauge the interest of having a buildbot for
>> GCC.
> 
> +1.  Or no, +100.
> 
>> - which machines we can use as workers: we certainly need more worker
>> (previously known as slave) machines to test GCC in different
>> archs/configurations;
> 
> I think this would use too much resources (essentially the full machines)
> for the GCC Compile Farm.  If you can dial it down so it only uses a
> small portion of the machines, we can set up slaves there, at least on
> the more unusual architectures.  But then it may become too slow to be
> useful.
> 

We can certainly decide what builds on workers in the compile farm and
what doesn't. We can also decide what type of build we want: a full
bootstrap, all languages, etc. I still have to look at that. I'm not sure
how to access the compile farm or who has access to those machines.

>> - what kind of build configurations do we need and what they should do:
>> for example, do we want to build gcc standalone against system (the one
>> installed in the worker) binutils, glibc, etc or do we want a builder to
>> bootstrap everything?
> 
> Bootstrap is painfully slow, but it catches many more problems.
> 

Could possibly do that on a schedule instead.

>> -  We are building gcc for C, C++, ObjC (Which is the default). Shall we
>> add more languages to the mix?
> 
> I'd add Fortran, it tends to find problems (possibly because it has much
> bigger tests than most C/C++ tests are).  But all extra testing uses
> disproportionally more resources...  Find a sweet spot :-)  You probably
> should start with as little as possible, or perhaps do bigger configs
> every tenth build, or something like that.
> 

Sounds like a good idea.

>> - the gdb buildbot has a feature I have disabled (the TRY scheduler)
>> which allows people to submit patches to the buildbot, buildbot patches
>> the current svn version, builds and tests that. Would we want something
>> like this?
> 
> This is very useful, but should be mostly separate...  There are of course
> the security considerations, but also this really needs clean builds every
> time, and perhaps even bootstraps.
> 

There could be two types of try schedulers, one for full bootstraps and
one just for GCC. Security-wise, we could always containerize.

>> - buildbot can notify people if the build fails or if there's a test
>> regression. Notification can be sent to IRC and email for example. What
>> would people prefer to have as the settings for notifications?
> 
> Just try it!  IRC is most useful I think, at least for now.  But try
> whatever seems useful, if there are too many complaints you can always
> turn it off again ;-)
>
> Thank you for working on this.
>

Thanks for all the comments. I will add the initial notifications into
IRC and see how people react.

-- 
Paulo Matos


Re: GCC Buildbot

2017-09-21 Thread Paulo Matos


On 20/09/17 19:14, Joseph Myers wrote:
> On Wed, 20 Sep 2017, Paulo Matos wrote:
> 
>> - buildbot can notify people if the build fails or if there's a test
>> regression. Notification can be sent to IRC and email for example. What
>> would people prefer to have as the settings for notifications?
> 
> It's very useful if someone reviews regressions manually and files bugs / 
> notifies authors of patches that seem likely to have caused them / fixes 
> them if straightforward.  However, that can take a lot of time (especially 
> if you're building a large number of configurations, and ideally there 
> would be at least compilation testing for every target architecture 
> supported by GCC if enough machine resources are available).  (I do that 
> for my glibc bot, which does compilation-only testing for many 
> configurations, covering all supported glibc ABIs except Hurd - the 
> summaries of results go to libc-testresults, but the detailed logs that 
> show e.g. which test failed or failed to build aren't public; each build 
> cycle for the mainline bot produces about 3 GB of logs, which get deleted 
> after the following build cycle.)
> 

I totally agree that things are only going to improve if people get
involved in checking for regressions and keeping an eye on what's going
on. The framework can help a lot here by notifying the right people and
the mailing list if something gets broken or if there are regressions,
but once the notification is sent someone certainly needs to pick it up.

I believe that once the framework is there, if it is reliable and
user-friendly and does not force people to check the UI every day,
instead notifying them only when things go wrong, it will prompt people
to take notice and do something about breakages.

At the moment, there are no issues with regards to logs sizes but we are
starting small with a single worker. Once we have more we'll have to
revisit this issue.

-- 
Paulo Matos


Re: GCC Buildbot

2017-09-21 Thread Paulo Matos


On 20/09/17 17:07, Jeff Law wrote:
> I'd strongly recommend using one of the existing infrastructures.  I
> know several folks (myself included) are using Jenkins/Hudson.  There's
> little to be gained building a completely new infrastructure to manage a
> buildbot.
> 

As David pointed out in another email, I should have referenced the
buildbot homepage:
http://buildbot.net/

This is a framework with batteries included to build the kind of things
we want to have for testing. I certainly don't want to start a Jenkins
vs Buildbot discussion.

Kind regards,

-- 
Paulo Matos


GCC Buildbot

2017-09-20 Thread Paulo Matos
Hi all,

I am internally running buildbot for a few projects, including one for a
simple gcc setup for a private port. After some discussions with David
Edelsohn at the last couple of Cauldrons, who told me this might be
interesting for the community in general, I have contacted Sergio DJ
with a few questions on his buildbot configuration for GDB. I then
stripped out his configuration and transformed it into one from GCC,
with a few private additions and ported it to the most recent buildbot
version nine (which is numerically 0.9.x).

To make a long story short: https://gcc-buildbot.linki.tools
With brief documentation in: https://linkitools.github.io/gcc-buildbot
and configuration in: https://github.com/LinkiTools/gcc-buildbot

Now, this is still pretty raw but it:
* Configures a fedora x86_64 for C, C++ and ObjectiveC (./configure
--disable-multilib)
* Does an incremental build
* Runs all tests
* Grabs the test results and stores them as properties
* Creates a tarball of the sum and log files from the testsuite
directory and uploads them

This mail's intention is to gauge the interest of having a buildbot for
GCC. Buildbot is a generic Python framework to build a test framework so
the possibilities are pretty much endless as all workflows are
programmed in Python and with buildbot nine the interface is also
modifiable, if required.
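For illustration, a workflow along these lines might be sketched in a
buildbot-nine master configuration like the following (a rough sketch with
assumed step names and paths, not the actual gcc-buildbot configuration):

```python
from buildbot.plugins import steps, util

# Illustrative build factory: incremental SVN checkout, configure,
# build, and run the testsuite.
factory = util.BuildFactory()
factory.addStep(steps.SVN(repourl='svn://gcc.gnu.org/svn/gcc/trunk',
                          mode='incremental', workdir='gcc'))
factory.addStep(steps.ShellCommand(
    name='configure',
    command=['../gcc/configure', '--disable-multilib'],
    workdir='build'))
factory.addStep(steps.Compile(command=['make', '-j4'], workdir='build'))
factory.addStep(steps.ShellCommand(
    name='check',
    command=['make', '-k', 'check'],
    workdir='build',
    # The testsuite has expected FAILs, so the step's exit status alone
    # should not mark the build red; regressions are judged separately.
    flunkOnFailure=False))
```

The .sum/.log tarball upload and regression comparison would then be further
steps appended to the same factory.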

If this is something of interest, then we will need to understand what
is required, among those:

- which machines we can use as workers: we certainly need more worker
(previously known as slave) machines to test GCC in different
archs/configurations;
- what kind of build configurations do we need and what they should do:
for example, do we want to build gcc standalone against system (the one
installed in the worker) binutils, glibc, etc or do we want a builder to
bootstrap everything?
- initially I was doing fresh builds and uploading a tarball (450 MB)
for download. This took way too long. I have moved to incremental builds
with no tarball generation, but if required we could do this for forced
and/or nightly builds. Ideas?
- We are currently running the whole testsuite for each incremental
build (~40mins). If we want a faster turnaround time, we could run just
an important subset of tests. Suggestions?
- would we like to run anything on the compiler besides the gcc
testsuite? I know Honza does, or used to do, lots of firefox builds to
test LTO. Shall we build those, for example? I noticed there's a testing
subpage which contains a few other libraries, should we build these?
(https://gcc.gnu.org/testing/)
- Currently we have a force build which allows people to force a build
on the worker. This requires no authentication and can certainly be
abused. We can add some sort of authentication, like for example, only
allow users with a gcc.gnu.org email? For now, it's not a problem.
-  We are building gcc for C, C++, ObjC (Which is the default). Shall we
add more languages to the mix?
- the gdb buildbot has a feature I have disabled (the TRY scheduler)
which allows people to submit patches to the buildbot, buildbot patches
the current svn version, builds and tests that. Would we want something
like this?
- buildbot can notify people if the build fails or if there's a test
regression. Notification can be sent to IRC and email for example. What
would people prefer to have as the settings for notifications?
- an example of a successful build is:
https://gcc-buildbot.linki.tools/#/builders/1/builds/38
This build shows several Changes because between the start and finish of
a build there were several new commits. Properties show among other
things test results. Responsible users show the people who were involved
in the changes for the build.

I am sure there are lots of other questions and issues. Please let me
know if you find this interesting and what you would like to see
implemented.

Kind regards,

-- 
Paulo Matos


Re: Is test case with 700k lines of code a valid test case?

2016-03-19 Thread Paulo Matos


On 14/03/16 16:31, Andrey Tarasevich wrote:
> Hi,
> 
> I have a source file with 700k lines of code 99% of which are printf() 
> statements. Compiling this test case crashes GCC 5.3.0 with segmentation 
> fault. 
> Can such test case be considered valid or source files of size 35 MB are too 
> much for a C compiler and it should crash? It crashes on Ubuntu 14.04 64bit 
> with 16GB of RAM. 
> 
> Cheers,
> Andrey
> 

I would think it's useful but a reduced version would be great.
Can you reduce the test? If you need a hand, I can help. Contact me
directly and I will give it a try.

Cheers,
-- 
Paulo Matos


Re: Is test case with 700k lines of code a valid test case?

2016-03-19 Thread Paulo Matos


On 18/03/16 15:02, Jonathan Wakely wrote:
> 
> It's probably crashing because it's too large, so if you reduce it
> then it won't crash.
> 

Would be curious to see what the limit is though, or whether it depends
on the machine he's running GCC on.

-- 
Paulo Matos





Re: Repository for the conversion machinery

2015-08-27 Thread Paulo Matos



On 27/08/15 16:56, Paulo Matos wrote:


I noticed I am not on the list (check commit r225509, user pmatos) either.

And thanks for your help on this transition.



r188804 | mkuvyrkov ma...@codesourcery.com

for example.

--
Paulo Matos


Re: Repository for the conversion machinery

2015-08-27 Thread Paulo Matos



On 27/08/15 16:48, Eric S. Raymond wrote:

FX fxcoud...@gmail.com:

[context for the Fortran list: the svn repo is about to be converted into a git 
repo, which will be the official gcc repo onwards]

Hi Eric,

I realize that some of our Fortran maintainers (and committers) are not listed 
in the map file:

Fortran Janne Blomqvist j...@gcc.gnu.org
Fortran Tobias Burnus   bur...@net-b.de
Fortran Daniel Franke   franke.dan...@gmail.com
Fortran Daniel Kraftd...@domob.eu
Fortran Mikael Morinmikael.mo...@sfr.fr
Fortran Janus Weil  ja...@gcc.gnu.org

Is that normal?


Do they have actual commits in the repository or were their commits shipped
as patches and merged by a core maintainer?

If the former, then I don't know why they're not in the map.  It contains
an entry for every distinct Unix username it could extract.  What usernames
should I expect these people to have?



I noticed I am not on the list (check commit r225509, user pmatos) either.

And thanks for your help on this transition.

--
Paulo Matos


svn timeouts

2015-08-27 Thread Paulo Matos

Hi,

Am I the only one regularly getting svn timeouts lately?
svn: E210002: Unable to connect to a repository at URL 
'svn://gcc.gnu.org/svn/gcc/trunk'

svn: E210002: Network connection closed unexpectedly

Is this because the repository is being overloaded with requests
related to the start of the transition?


--
Paulo Matos


RE: Expectations for 0/0

2015-07-29 Thread Paulo Matos


 -Original Message-
 From: Andrew Haley [mailto:a...@redhat.com]
 Sent: 28 July 2015 18:38
 To: Paulo Matos; gcc@gcc.gnu.org
 Subject: Re: Expectations for 0/0
 
 On 07/28/2015 04:40 PM, Paulo Matos wrote:
   The block skips the test for ((unsigned int) xx << 1 == 0 && yy == -1),
  should we skip it if they're both zero as well?
 
 Yes.  It's undefined behaviour.  If we don't want to invoke nasal
 daemons we shouldn't do this.


Thanks. I will propose a patch to avoid this.

-- 
Paulo Matos


RE: Expectations for 0/0

2015-07-29 Thread Paulo Matos


 -Original Message-
 From: gcc-ow...@gcc.gnu.org [mailto:gcc-ow...@gcc.gnu.org] On Behalf
 Of Paulo Matos
 Sent: 29 July 2015 10:12
 To: Andrew Haley; gcc@gcc.gnu.org
 Subject: RE: Expectations for 0/0
 
 
 
  -Original Message-
  From: Andrew Haley [mailto:a...@redhat.com]
  Sent: 28 July 2015 18:38
  To: Paulo Matos; gcc@gcc.gnu.org
  Subject: Re: Expectations for 0/0
 
  On 07/28/2015 04:40 PM, Paulo Matos wrote:
    The block skips the test for ((unsigned int) xx << 1 == 0 && yy == -1),
   should we skip it if they're both zero as well?
 
  Yes.  It's undefined behaviour.  If we don't want to invoke nasal
  daemons we shouldn't do this.
 
 
 Thanks. I will propose a patch to avoid this.
 

My mistake. The check is already in the test but as I simplified the test, I 
ended up removing the check for 0 in the denominator.

Apologies.

-- 
Paulo Matos


Expectations for 0/0

2015-07-28 Thread Paulo Matos
Hi,

What are the expectations for the 0/0 division?
Test execute.exp=arith-rand.c generates two integers, both being 0, and one of
the testing blocks is:
  { signed int xx = x, yy = y, r1, r2;
    if ((unsigned int) xx << 1 == 0 && yy == -1)
      continue;
    r1 = xx / yy;
    r2 = xx % yy;
    if (ABS (r2) >= (unsigned int) ABS (yy) || (signed int) (r1 * yy + r2) != xx)
      abort ();
  }

Our routine returns : 
R1: 0x
R2: 0xf

Then it aborts because ABS (r2) >= (unsigned int) ABS (yy).
While I understand the results from our division routine might be peculiar, 
this division is also undefined.

The block skips the test for ((unsigned int) xx << 1 == 0 && yy == -1); should
we skip it if they're both zero as well? If not, what do you expect to get from
0/0 and 0%0?
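For reference, the undefined cases and the behaviour of the test's guard can
be sketched in Python, assuming 32-bit int (the function names here are mine,
for illustration only):

```python
INT_MIN = -2**31

def c_sdiv_undefined(xx, yy):
    """True when xx / yy (and xx % yy) is undefined behaviour in C:
    division by zero, or the overflowing INT_MIN / -1."""
    return yy == 0 or (xx == INT_MIN and yy == -1)

def guard_skips(xx, yy):
    """The arith-rand.c guard: (unsigned int) xx << 1 == 0 && yy == -1.
    Shifting out the sign bit leaves 0 only for xx in {0, INT_MIN}."""
    return ((xx << 1) & 0xFFFFFFFF) == 0 and yy == -1
```

Note that the guard catches INT_MIN / -1 (and, harmlessly, 0 / -1), but not
yy == 0, which is the gap in question.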

Regards,

Paulo Matos




RE: Possible range based 'for' bug

2015-06-22 Thread Paulo Matos


 -Original Message-
 From: gcc-ow...@gcc.gnu.org [mailto:gcc-ow...@gcc.gnu.org] On Behalf
 Of Julian Klappenbach
 Sent: 21 June 2015 16:56
 To: gcc@gcc.gnu.org
 Subject: Re: Possible range based 'for' bug
 
 Version info:
 
 Configured with:
 --prefix=/Applications/Xcode.app/Contents/Developer/usr
 --with-gxx-include-dir=/usr/include/c++/4.2.1
 Apple LLVM version 6.1.0 (clang-602.0.53) (based on LLVM 3.6.0svn)
 Target: x86_64-apple-darwin14.3.0
 Thread model: posix
 

Is this what gcc --version returns on your mac?

Paulo Matos


Re: [PATCH] Fix typo

2015-05-11 Thread Paulo Matos
On Sun, 10 May 2015 22:07:53 -0600, Jeff Law wrote:

 On 05/10/2015 03:00 PM, Paulo Matos wrote:
 Yes.  This would fall under the obvious rule and can be committed
 without waiting for approvals.
 
 jeff

Thanks. Committed.
-- 
Paulo Matos



Re: [PATCH] Add myself to MAINTAINERS

2015-05-11 Thread Paulo Matos
On Sun, 10 May 2015 21:08:04 +, Paulo Matos wrote:

 Somehow I never added myself to the MAINTAINERS file.
 Apologies for that. OK to commit?
 
 2015-05-10  Paulo Matos  pa...@matos-sorge.com
 
 * MAINTAINERS: Add myself as commit after approval.
 
  diff --git a/MAINTAINERS b/MAINTAINERS
  index 7dc4c8f..c5d6c99 100644
  --- a/MAINTAINERS
  +++ b/MAINTAINERS
  @@ -483,6 +483,7 @@ David Malcolm  dmalc...@redhat.com
   Mikhail Maltsev  malts...@gmail.com
   Simon Martin  simar...@users.sourceforge.net
   Ranjit Mathew  rmat...@hotmail.com
  +Paulo Matos  pa...@matos-sorge.com
   Michael Matz  m...@suse.de
   Greg McGary  g...@gnu.org
   Roland McGrath  rol...@hack.frob.com

Committed as obvious.

-- 
Paulo Matos



[PATCH] Add myself to MAINTAINERS

2015-05-10 Thread Paulo Matos
Somehow I never added myself to the MAINTAINERS file. 
Apologies for that. OK to commit?

2015-05-10  Paulo Matos  pa...@matos-sorge.com

* MAINTAINERS: Add myself as commit after approval.

diff --git a/MAINTAINERS b/MAINTAINERS
index 7dc4c8f..c5d6c99 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -483,6 +483,7 @@ David Malcolm   
dmalc...@redhat.com
 Mikhail Maltsev
malts...@gmail.com
 Simon Martin   
simar...@users.sourceforge.net
 Ranjit Mathew  rmat...@hotmail.com
+Paulo Matospa...@matos-sorge.com
 Michael Matz   m...@suse.de
 Greg McGaryg...@gnu.org
 Roland McGrath rol...@hack.frob.com




[PATCH] Fix typo

2015-05-10 Thread Paulo Matos
OK to commit?

2015-05-10  Paulo Matos  pa...@matos-sorge.com  
  
* configure.ac: Fix typo.  
* configure: Regenerate.  
  
diff --git a/configure b/configure  
index a3f66ba..8ee279f 100755  
--- a/configure  
+++ b/configure  
@@ -7423,7 +7423,7 @@ fi  
 # multilib is not explicitly enabled.  
 case $target:$have_compiler:$host:$target:$enable_multilib in  
   x86_64-*linux*:yes:$build:$build:)  
-# Make sure we have a developement environment that handles 32-bit  
+# Make sure we have a development environment that handles 32-bit  
 dev64=no  
echo "int main () { return 0; }" > conftest.c  
 ${CC} -m32 -o conftest ${CFLAGS} ${CPPFLAGS} ${LDFLAGS} conftest.c  
@@ -7434,7 +7434,7 @@ case $target:$have_compiler:$host:$target:
$enable_multilib in  
 fi  
 rm -f conftest*  
 if test x${dev64} != xyes ; then  
-  as_fn_error I suspect your system does not have 32-bit 
developement libraries (libc and headers). If you have them, rerun 
configure with --enable-multilib. If you do not have them, and want to 
build a 64-bit-only compiler, rerun configure with --disable-multilib. 
$LINENO 5
+  as_fn_error I suspect your system does not have 32-bit 
development libraries (libc and headers). If you have them, rerun 
configure with --enable-multilib. If you do not have them, and want to 
build a 64-bit-only compiler, rerun configure with --disable-multilib. 
$LINENO 5
 fi
 ;;
 esac
diff --git a/configure.ac b/configure.ac
index 987dfab..4081fb9 100644
--- a/configure.ac
+++ b/configure.ac
@@ -3063,7 +3063,7 @@ fi
 # multilib is not explicitly enabled.
 case $target:$have_compiler:$host:$target:$enable_multilib in
   x86_64-*linux*:yes:$build:$build:)
-# Make sure we have a developement environment that handles 32-bit
+# Make sure we have a development environment that handles 32-bit
 dev64=no
echo "int main () { return 0; }" > conftest.c
 ${CC} -m32 -o conftest ${CFLAGS} ${CPPFLAGS} ${LDFLAGS} conftest.c
@@ -3074,7 +3074,7 @@ case $target:$have_compiler:$host:$target:
$enable_multilib in
 fi 
 rm -f conftest*
 if test x${dev64} != xyes ; then
-  AC_MSG_ERROR([I suspect your system does not have 32-bit 
developement libraries (libc and headers). If you have them, rerun 
configure with --enable-multilib. If you do not have them, and want to 
build a 64-bit-only compiler, rerun configure with --disable-multilib.])
+  AC_MSG_ERROR([I suspect your system does not have 32-bit 
development libraries (libc and headers). If you have them, rerun 
configure with --enable-multilib. If you do not have them, and want to 
build a 64-bit-only compiler, rerun configure with --disable-multilib.])
 fi
 ;;
 esac




RE: GCC version bikeshedding

2014-07-21 Thread Paulo Matos


 -Original Message-
 From: gcc-ow...@gcc.gnu.org [mailto:gcc-ow...@gcc.gnu.org] On Behalf
 Of Andi Kleen
 Sent: 20 July 2014 22:29
 To: Paulo Matos
 Cc: gcc@gcc.gnu.org
 Subject: Re: GCC version bikeshedding
 
 Paulo Matos pa...@matos-sorge.com writes:
 
  That's what I understood as well. Someone mentioned to leave the
 patch
  level number to the distros to use which sounded like a good idea.
 
 Sounds like a bad idea, as then there would be non unique gcc
 versions.
 redhat gcc 5.0.2 potentially being completely different from suse gcc
 5.0.2
 

I understand your concern but I am not convinced it's bad. The main reason for 
this is that we wouldn't distribute GCCs x.y.z with z != 0 so if you would see 
5.0.3 in the wild then you could only conclude it's 5.0 with a few patches from 
some vendor. As I type this I am starting to think how frustrating this might 
become. However, isn't it the case that nowadays you can have different gcc 
4.9.1-2 distributed from different distros? The default gcc in my linode shows: 
gcc version 4.6.3 (Ubuntu/Linaro 4.6.3-1ubuntu5)

So, I can't see why in the future you couldn't have:
Gcc version 5.1 (Ubuntu/Linaro 5.1.3)

This is only if the release managers want to assign the patch level number to 
distros. I don't think there was a decision on this.

Paulo Matos


Re: GCC version bikeshedding

2014-07-20 Thread Paulo Matos

On 20/07/14 17:59, Richard Biener wrote:

On July 20, 2014 5:55:06 PM GMT+01:00, Jakub Jelinek ja...@redhat.com wrote:

Hi!

So, what versioning scheme have we actually agreed on, before I change
it in
wwwdocs?  Is that
5.0.0 in ~ April 2015, 5.0.1 in ~ June-July 2015 and 5.1.0 in ~ April
2016,
or
5.0 in ~ April 2015, 5.1 in ~ June-July 2015 and 6.0 in ~ April 2016?
The only thing I understood was that we don't want 4.10, but for the
rest
various people expressed different preferences and then it was
presented as
agreement on 5.0, which applies to both of the above.

It is not a big deal either way of course.


I understood we agreed on 5.0 and further 5.1, 5.2 releases from the branch and 
6.0 a year later. With unspecified uses for the patch level number (so leave it 
at zero).



That's what I understood as well. Someone mentioned to leave the patch 
level number to the distros to use which sounded like a good idea.


Paulo Matos


Richard.


Jakub









RE: Roadmap for 4.9.1, 4.10.0 and onwards?

2014-05-20 Thread Paulo Matos


 -Original Message-
 From: gcc-ow...@gcc.gnu.org [mailto:gcc-ow...@gcc.gnu.org] On Behalf
 Of Basile Starynkevitch
 Sent: 20 May 2014 16:29
 To: Bruce Adams
 Cc: gcc@gcc.gnu.org
 Subject: Re: Roadmap for 4.9.1, 4.10.0 and onwards?
 
 On Tue, 2014-05-20 at 11:09 +0100, Bruce Adams wrote:
  Hi,
  I've been tracking the latest releases of gcc since 4.7 or so
 (variously interested in C++1y support, cilk and openmp).
  One thing I've found hard to locate is information about planned
 inclusions for future releases.
  As much relies on unpredictable community contributions I don't
 expect there to be a concrete or reliable plan.
 
  However, equally I'm sure the steering committee have some ideas
 over
  what ought to be upcoming releases.
 
 As a whole, the steering committee does not have any idea, because GCC
 development is based upon volunteer contributions.


I understand the argument but I am not sure it's the way to go. Even if the 
project is based on volunteer contributions it would be interesting to have a 
tentative roadmap. This, I would think, would also help possible beginner 
volunteers know where to start if they wanted to contribute to the project. So 
the roadmap could be a list of features (big or small) or bug fixes that we 
would like fixed for a particular version. Even if we don't want to name it 
roadmap it would still be interesting to have a list of things that are being 
worked on or on the process of being merged into mainline and therefore will 
make it to the next major version.

That being said, I know it's hard to set some time apart to write this kind of 
thing given most of us prefer to be hacking on GCC. From a newcomer point of 
view, however, not having things like a roadmap makes it look like the project 
is heading nowhere.


RE: jump_table_data and active_insn_p

2014-05-12 Thread Paulo Matos


 -Original Message-
 From: Steven Bosscher [mailto:stevenb@gmail.com]
 Sent: 05 May 2014 10:11
 To: Paulo Matos
 Cc: gcc@gcc.gnu.org
 Subject: Re: jump_table_data and active_insn_p
 
 On Mon, Mar 17, 2014 at 12:51 PM, Paulo Matos wrote:
  Why is jump_table_data an active_insn?
  int
  active_insn_p (const_rtx insn)
  {
 return (CALL_P (insn) || JUMP_P (insn)
 || JUMP_TABLE_DATA_P (insn) /* FIXME */
 || (NONJUMP_INSN_P (insn)
 && (! reload_completed
 || (GET_CODE (PATTERN (insn)) != USE
 && GET_CODE (PATTERN (insn)) != CLOBBER)))); }
 
  It is clear that someone [Steven Bosscher] thought it needs fixing
 but what's the problem with just removing it from the OR-expression?
 
 Places using active_insn_p, next_active_insn, prev_active_insn, etc.,
 need to be audited to make sure it's safe to remove JUMP_TABLE_DATA
 from the OR-expression.
 
 I've done most of that work, but it needs finishing and for that I
 need to find some time.
 See http://gcc.gnu.org/ml/gcc-patches/2013-11/msg03122.html
 

Fair enough.

Thanks for the explanation.

Paulo Matos

 Ciao!
 Steven


returning short-enum and truncate doesn't trigger conversion warning

2014-03-19 Thread Paulo Matos
Hi all,

This is either a C standard subtlety or a bug in GCC.
For example:
unsigned short foo (unsigned int a)
{
  return a;
}

enum xpto
{
  A = 0,
  B = 1,
  X = 512
};

extern void print (unsigned int);

unsigned char bar (enum xpto a)
{
  print (sizeof (unsigned char));
  print (sizeof (enum xpto));
  return a;
}

$ ~/work/tmp/GCC/builds/gcc-trunk/gcc/cc1 -O2 test.c -Wall -Wconversion 
--short-enums -quiet
test.c: In function 'foo':
test.c:3:3: warning: conversion to 'short unsigned int' from 'unsigned int' may 
alter its value [-Wconversion]
   return a;
   ^

I was expecting a warning for bar as well: since sizeof (unsigned char) is 1 and 
sizeof (enum xpto) is 2, the value is truncated, but no warning is 
issued.

Shall I open a PR?

Paulo Matos




RE: returning short-enum and truncate doesn't trigger conversion warning

2014-03-19 Thread Paulo Matos


 -Original Message-
 From: David Brown [mailto:da...@westcontrol.com]
 Sent: 19 March 2014 14:44
 To: Paulo Matos; gcc@gcc.gnu.org
 Subject: Re: returning short-enum and truncate doesn't trigger
 conversion warning
 
 On 19/03/14 15:33, Paulo Matos wrote:
  Hi all,
 
  This is either a C standard subtlety or a bug in GCC.
 
 This is perfectly normal behaviour for C, and not a bug.  It is also a
 topic for the gcc help list, rather than the development list.
 

Apologies, but I submitted this to gcc devel because I do think it is a bug.

 
  enum xpto
  {
A = 0,
B = 1,
X = 512
  };
 
  extern void print (unsigned int);
 
  unsigned char bar (enum xpto a)
  {
print (sizeof (unsigned char));
print (sizeof (enum xpto));
return a;
  }
 
 The sizeof operator returns an integer of type size_t (typically the
 same as unsigned int or unsigned long, depending on the platform
 details).  But the compiler can see that the particular values in
 question - 1 and 2 - can be converted to unsigned int without loss of
 precision or changing the value.  Therefore no warning is given.


512 cannot be converted to an unsigned char without loss of precision. 
The prints are actually not relevant for the question. They were just used for 
me to confirm the sizes of the types. 
enum xpto
{
  A = 0,
  B = 1,
  X = 512
};
extern void print (unsigned int);

unsigned char bar (enum xpto a)
{
   return a;
}

Variable a could be 0, 1 or 512. In case of the latter there is loss of 
precision but still no warning.
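Concretely, the truncation is easy to demonstrate on any host: converting 512 to unsigned char (with the usual 8-bit char) reduces it modulo 256, so the value silently becomes 0. A minimal sketch mirroring bar above (the function name is changed only to avoid clashing with the quoted example):

```c
enum xpto { A = 0, B = 1, X = 512 };

/* Returning an enum value through an unsigned char truncates it
   modulo UCHAR_MAX + 1, so X (512) silently comes back as 0 --
   regardless of whether the enum itself is 2 or 4 bytes wide.  */
static unsigned char bar_demo (enum xpto a)
{
  return a;
}
```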
 
 
  $ ~/work/tmp/GCC/builds/gcc-trunk/gcc/cc1 -O2 test.c -Wall
  -Wconversion --short-enums -quiet
  test.c: In function 'foo':
  test.c:3:3: warning: conversion to 'short unsigned int' from
 'unsigned int' may alter its value [-Wconversion]
 return a;
 ^
 
  I was expecting a warning for bar as well since sizeof unsigned char
 is 1 and sizeof enum xpto is 2, therefore the value is truncated but
 no warning is issued.
 
  Shall I open a PR?
 
  Paulo Matos
 
 
 



RE: returning short-enum and truncate doesn't trigger conversion warning

2014-03-19 Thread Paulo Matos


 -Original Message-
 From: David Brown [mailto:da...@westcontrol.com]
 Sent: 19 March 2014 15:47
 To: Paulo Matos; gcc@gcc.gnu.org
 Subject: Re: returning short-enum and truncate doesn't trigger
 conversion warning
 
 
 Usually the discovery of bugs gets discussed on the help list - fixes
 for the bugs are an issue for the developers list.  And it is
 certainly a help issue until we know you have found a bug.


Sure, I will make sure I submit these questions to gcc-help first.
 
 
 
  enum xpto
  {
A = 0,
B = 1,
X = 512
  };
 
  extern void print (unsigned int);
 
  unsigned char bar (enum xpto a)
  {
print (sizeof (unsigned char));
print (sizeof (enum xpto));
return a;
  }
 
  The sizeof operator returns an integer of type size_t (typically
 the
  same as unsigned int or unsigned long, depending on the platform
  details).  But the compiler can see that the particular values in
  question - 1 and 2 - can be converted to unsigned int without loss
 of
  precision or changing the value.  Therefore no warning is given.
 
 
  512 cannot be converted to an unsigned char without loss of
 precision.
  The prints are actually not relevant for the question. They were
 just used for me to confirm the sizes of the types.
 
 Not only are the prints not relevant, but they hid the real issue - I
 missed the point of your question!
 
 Conversions involving enumerations are not listed in the documentation
 for -Wconversion, but it would be nice to get a warning here.  Filing
 it as an enhancement request seems reasonable.
 

Will report as an enhancement.

Thanks,

Paulo Matos


jump_table_data and active_insn_p

2014-03-17 Thread Paulo Matos
Why is jump_table_data an active_insn?
int
active_insn_p (const_rtx insn)
{
  return (CALL_P (insn) || JUMP_P (insn)
          || JUMP_TABLE_DATA_P (insn) /* FIXME */
          || (NONJUMP_INSN_P (insn)
              && (! reload_completed
                  || (GET_CODE (PATTERN (insn)) != USE
                      && GET_CODE (PATTERN (insn)) != CLOBBER))));
}

It is clear that someone [Steven Bosscher] thought it needs fixing but what's 
the problem with just removing it from the OR-expression?

Cheers,

Paulo Matos




RE: dom requires PROP_loops

2014-03-14 Thread Paulo Matos
 -Original Message-
 From: Richard Biener [mailto:richard.guent...@gmail.com]
 Sent: 13 March 2014 18:46
 To: Paulo Matos
 Cc: gcc@gcc.gnu.org
 Subject: RE: dom requires PROP_loops
 
 On March 13, 2014 5:00:53 PM CET, Paulo Matos pma...@broadcom.com wrote:
  -Original Message-
  From: Richard Biener [mailto:richard.guent...@gmail.com]
  Sent: 13 March 2014 13:24
  To: Paulo Matos
  Cc: gcc@gcc.gnu.org
  Subject: Re: dom requires PROP_loops
 
 
  Probably RTL cfgcleaup needs the same treatment as GIMPLE cfgcleanup
  then - allow removal if loop properties allows it.
 
 
 In both cfgcleanup.c and tree-cfgcleanup.c I can see code that protects
 loop latches, but I see no code that allows removal of latch if
 property allows it.
 From what you say I would expect this would already be implemented in
 tree-cfgcleanup.c, however what actually happens is that since
 current_loops is non-null (PROP_loops is not destroyed in tree
 loopdone), tree-cfgcleanup call chain ends up calling
 cleanup_tree_cfg_bb on the bb loop latch and tree_forwarder_block_p
 returns false for bb because of the following code thereby not removing
 the latch:
   if (current_loops)
 {
   basic_block dest;
   /* Protect loop latches, headers and preheaders.  */
   if (bb->loop_father->header == bb)
  return false;
   dest = EDGE_SUCC (bb, 0)->dest;
 
   if (dest->loop_father->header == dest)
  return false;
 }
 
 Why do we need to protect the latch?
 
 You are looking at old sources.


That's correct. I was looking at 4.8. Let me take a look at what trunk is 
doing... :)
 
 Richard.
 
 Paulo Matos
 
  Richard.
 
 



RE: dom requires PROP_loops

2014-03-13 Thread Paulo Matos


 -Original Message-
 From: Richard Biener [mailto:richard.guent...@gmail.com]
 Sent: 11 March 2014 10:52
 To: Paulo Matos
 Cc: gcc@gcc.gnu.org
 Subject: Re: dom requires PROP_loops
 
 On Mon, Mar 10, 2014 at 12:57 PM, Paulo Matos pma...@broadcom.com wrote:
  Hello,
 
  In an attempt to test some optimization I destroyed the loop property in
 pass_tree_loop_done and reinstated it in pass_rtl_loop_init, however then I
 noticed that pass_dominator started generating wrong code.
  My guess is that we should mark pass_dominator with PROP_loops as a required
 property? Do you agree?
 
 No, PROP_loops is something artificial.  Passes needing loops
 will compute them (call loop_optimizer_init).
 
 You probably did sth wrong with how you destroy PROP_loops.


Haven't done anything out of the ordinary. I actually copied what other passes 
do:
  cfun->curr_properties &= ~PROP_loops;

before calling loop_optimizer_finalize.

I find it strange that you actually need to do this since something like this 
should be done automatically if you mark the property as being destroyed.

The background of my investigation in to this is that we don't like 
instructions in loop latches. This blocks generation of zero-overhead loops for 
us.
PROP_loops is enabled from tree loopinit to rtl loop_done2 and with this 
property enabled cfg_cleanup doesn't remove empty latches allowing GCC to move 
instructions into the latch in the meantime.

If I destroy this property (as above) in tree_ssa_loop_done and recreate it in 
rtl loop_init we get a major performance boost because CFG cleanup removes empty 
loop latches, but I found a case where jump threading then generates wrong code.
 
 Richard.



RE: dom requires PROP_loops

2014-03-13 Thread Paulo Matos
 -Original Message-
 From: Richard Biener [mailto:richard.guent...@gmail.com]
 Sent: 13 March 2014 13:24
 To: Paulo Matos
 Cc: gcc@gcc.gnu.org
 Subject: Re: dom requires PROP_loops
 
 
 Probably RTL cfgcleaup needs the same treatment as GIMPLE cfgcleanup
 then - allow removal if loop properties allows it.
 

In both cfgcleanup.c and tree-cfgcleanup.c I can see code that protects loop 
latches, but I see no code that allows removal of latch if property allows it. 
From what you say I would expect this would already be implemented in 
tree-cfgcleanup.c, however what actually happens is that since current_loops is 
non-null (PROP_loops is not destroyed in tree loopdone), tree-cfgcleanup call 
chain ends up calling cleanup_tree_cfg_bb on the bb loop latch and 
tree_forwarder_block_p returns false for bb because of the following code 
thereby not removing the latch:
  if (current_loops)
    {
      basic_block dest;
      /* Protect loop latches, headers and preheaders.  */
      if (bb->loop_father->header == bb)
        return false;
      dest = EDGE_SUCC (bb, 0)->dest;

      if (dest->loop_father->header == dest)
        return false;
    }

Why do we need to protect the latch?

Paulo Matos

 Richard.





dom requires PROP_loops

2014-03-10 Thread Paulo Matos
Hello,

In an attempt to test some optimization I destroyed the loop property in 
pass_tree_loop_done and reinstated it in pass_rtl_loop_init, however then I 
noticed that pass_dominator started generating wrong code.
My guess is that we should mark pass_dominator with PROP_loops as a required 
property? Do you agree?

Cheers,

Paulo Matos




Re: [PATCH] [lto/55113] Fix use of -fshort-double with -flto for powerpc

2014-03-08 Thread Paulo Matos

On 07/03/14 09:03, Richard Biener wrote:


Btw, can you check the attached as well?  It makes sure all TUs
have -fshort-double set consistently and it automatically enables
it at link-time, not allowing to override the setting.

If it works for you please check it in, too.  (I can't really test it
because -fshort-double brokeness on x86_64).



I have tested it and everything looks fine. I have now committed all the 
code and testcase.  Hopefully it's now sorted.


Thanks for your help,

Paulo Matos


Thanks,
Richard.




--
PMatos


ira marks conflict with hard register, can't explain it.

2014-03-07 Thread Paulo Matos
Hello,

I have been trying to understand why IRA is marking a register to conflict with 
a hard register but can't find a good explanation.

Before IRA pass I have the following insn chain in bb2:
(insn 15 16 21 2 (set (reg/v/f:SI 122 [ pdata ])
(reg:SI 0 r0 [ pdata ])) ../src/bdv258_limt_process.c:66 481 {fp_movsi}
 (expr_list:REG_DEAD (reg:SI 0 r0 [ pdata ])
(nil)))
(insn 21 15 20 2 (set (reg:HI 123 [ MEM[(struct limt_state *)pdata_5(D)].lime ])
(mem:HI (plus:SI (reg/v/f:SI 122 [ pdata ])
(const_int 44 [0x2c])) [4 MEM[(struct limt_state 
*)pdata_5(D)].lime+0 S2 A32])) ../src/bdv258_limt_process.c:87 484 {fp_movhi}
 (nil))
(insn 20 21 22 2 (set (reg/v:SI 105 [ blk_sz ])
(mem:SI (plus:SI (reg/v/f:SI 122 [ pdata ])
(const_int 16 [0x10])) [2 MEM[(struct limt_state 
*)pdata_5(D)].blk_sz+0 S4 A32])) ../src/bdv258_limt_process.c:84 481 {fp_movsi}
 (nil))
(insn 22 20 23 2 (set (reg:BI 124)
(ne:BI (reg:HI 123 [ MEM[(struct limt_state *)pdata_5(D)].lime ])
(const_int 0 [0]))) ../src/bdv258_limt_process.c:87 1368 
{cmp_himode}
 (expr_list:REG_DEAD (reg:HI 123 [ MEM[(struct limt_state 
*)pdata_5(D)].lime ])
(nil)))
(jump_insn 23 22 24 2 (set (pc)
(if_then_else (ne (reg:BI 124)
(const_int 0 [0]))
(label_ref 56)
(pc))) ../src/bdv258_limt_process.c:87 1459 {cbranchbi4}
 (expr_list:REG_DEAD (reg:BI 124)
(expr_list:REG_BR_PROB (const_int 3900 [0xf3c])
(expr_list:REG_PRED_WIDTH (const_int 2 [0x2])
(nil


It starts by moving r0 to r122 and then using r122 as base register of address.
The problem is that IRA allocated r122 to r10 and r123 to r0. Therefore in the 
postreload phase, insn 21 will have r10 changed to r0, ending up like: r0 <- mem 
(r0 + 16), but insn 20 will still use r10.
It would be better to have r122 allocated to r0; then postreload could 
replace r122 by r0 in the address expressions and eliminate insn 15.

This is what happens if I specify -fira-loop-pressure (and that's how I came to 
the other thread where I mention it). 
Without -fira-loop-pressure I see the following conflicts for r122:
;; a1(r122,l0) conflicts: a0(r107,l0) a2(r136,l0) a3(r135,l0) a4(r137,l0) 
a6(r134,l0) a7(r133,l0) a8(r111,l0) a5(r105,l0) a9(r125,l0) a11(r124,l0) 
a12(r123,l0) a21(r120,l0) a22(r119,l0) a23(r114,l0) a24(r118,l0) a25(r117,l0) 
a26(r113,l0) a28(r116,l0)
;; total conflict hard regs: 0 20-25
;; conflict hard regs: 0 20-25

Why is r122 conflicting with r0? r0 is dead in insn 21, I can't see a reason 
for conflict.
Is there something I am missing?

Cheers,

Paulo Matos




ira-loop-pressure not marked as optimization?

2014-03-06 Thread Paulo Matos
Hi,

Upon noticing ira-hoist-pressure in `gcc --help=optimizers` and not 
ira-loop-pressure,
I am wondering why the latter is not marked as an Optimization in common.opt:

fira-loop-pressure
Common Report Var(flag_ira_loop_pressure)
Use IRA based register pressure calculation
in RTL loop optimizations.

Should we mark it as such?

Cheers,

Paulo Matos




Re: ira-loop-pressure not marked as optimization?

2014-03-06 Thread Paulo Matos

On 06/03/14 21:03, Vladimir Makarov wrote:

On 03/06/2014 03:14 PM, Paulo J. Matos wrote:


I have just run a few long benchmarks and even though I had high
expectations for improving performance the flag in our port causes a
~3% degradation in performance. Haven't had time to investigate why.




Strange.  What port is this?  Can I see the benchmarks.  I know Pat
Haugen from IBM found a bug which can generated worse code in some cases
and has a fix.  I hope Pat submits it on stage 1.



Vladimir, unfortunately I cannot send the benchmarks as they are private 
customer code. However, I will investigate the degradation and try to 
find a case which I can reduce and post on the list.


The port is for an internal Broadcom chip, SIMD/VLIW sharing some 
similarities and technical issues from a compiler point of view as IA-64.



Still, would you accept a patch to mark this flag as an optimization?


I think we should.



I will submit a patch to gcc-patches.

Thanks,

--
Paulo Matos


Re: [PATCH] [lto/55113] Fix use of -fshort-double with -flto for powerpc

2014-03-06 Thread Paulo Matos

On 06/03/2014 11:19, Richard Biener wrote:


I have reverted the patch for now.

Richard.


That's fine Richard, thanks. I got stuck with another issue in the 
meantime but I will look at it again very soon.


--
Paulo Matos


Re: [PATCH] [lto/55113] Fix use of -fshort-double with -flto for powerpc

2014-03-05 Thread Paulo Matos

On 05/03/2014 11:51, Richard Biener wrote:
On Wed, Mar 5, 2014 at 12:43 PM, Dominique Dhumieres 
domi...@lps.ens.fr wrote:


Revision 208312 causes

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=60427


Uhm.  pointer comparison against double_type_node ...

I'd say we want to revert the patch.  Paulo, please do that.  Let's
do the alternate approach of marking -fshort-double eligible for LTO
as well and handle it there properly.



Sure, I will prepare a new patch and post it for approval by the end of 
the day.


Apologies for the regression.
--
Paulo Matos


Simplify using context in loop-iv

2014-02-07 Thread Paulo Matos
Hello,

This is a followup of 
http://gcc.gnu.org/ml/gcc/2014-01/msg00120.html
and
http://gcc.gnu.org/ml/gcc/2014-01/msg00055.html

This is a slightly long patch that attempts to simplify the niter and infinite 
expressions by recursively looking through the dominant basic blocks for 
register definitions, replacing them in the original expressions and 
simplifying the expression with simplify-rtx.

It allows GCC to generate a lot more zero overhead loops by finding much better 
bounds to loops or discovering that infinite loops are actually not so.

I would like to have some comments on the patch and its possible integration 
into upstream (I guess it can only go upstream once 4.9 is released, right?).

I bootstrapped and tested x86_64-unknown-linux-gnu with no regressions.

Thanks,

Paulo Matos




simplify-using-context.patch
Description: simplify-using-context.patch


FW: Git repo lagging behing

2014-02-04 Thread Paulo Matos
Since the owner of the repo http://gcc.gnu.org/git/?p=gcc.git;a=summary is 
marked as being: gcc@gcc.gnu.org
moving this thread of the gcc mailing list.

Paulo Matos


-Original Message-
From: gcc-help-ow...@gcc.gnu.org [mailto:gcc-help-ow...@gcc.gnu.org] On Behalf 
Of Paulo Matos
Sent: 04 February 2014 14:08
To: gcc-h...@gcc.gnu.org
Subject: Git repo lagging behing

Hi,

Don't know if anybody noticed but something's wrong with the git repo. It's 
lagging behind. Last commit there was:
18 hours ago  tejohnson  2014-02-03  Teresa Johnson tejohn...@google.com

which was certainly not the last commit on trunk.

Paulo Matos




RE: Regression [v850,mep...]: sign_extend in loop breaks zero-overhead loop generation

2014-01-31 Thread Paulo Matos

 -Original Message-
 From: Jeff Law [mailto:l...@redhat.com]
 Sent: 30 January 2014 15:50
 To: Paulo Matos; Andreas Schwab
 Cc: gcc@gcc.gnu.org
 Subject: Re: Regression [v850,mep...]: sign_extend in loop breaks 
 zero-overhead
 loop generation
 
 
  OK, of course. Don't know what I am doing today.
  It's undefined because 'i' might overflow... I will get back to this. Thanks
 for pointing this out.
 When you've got it sorted out, go ahead and file a BZ, include the
 regression markers so that it shows up in the searches most of us are
 paying the most attention to right now.
 
 jeff

Sure. I opened it as 
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=5

Paulo Matos


Regression [v850,mep...]: sign_extend in loop breaks zero-overhead loop generation

2014-01-30 Thread Paulo Matos
Hello,

I am tracking a performance and size regression from 4.5.4 present in trunk.
Consider the following function:
==
extern short delayLength;
typedef int Sample;
extern Sample *temp_ptr;
extern Sample x;

void
foo (short blockSize)
{
  short i;
  unsigned short loopCount;

  loopCount = (unsigned short) (blockSize + delayLength) % 8;
  for (i = 0; i < loopCount; i++)
    *temp_ptr++ = x ^ *temp_ptr++;
}
==

For v850, before the commit
commit e0ae2fe2a0bebe9de31e3d8eb4feace4909ef009
Author: vries vries@138bc75d-0d04-0410-961f-82ee72b054a4
Date:   Fri May 20 19:32:30 2011 +

2011-05-20  Tom de Vries  t...@codesourcery.com

PR target/45098
* tree-ssa-loop-ivopts.c: Include expmed.h.
(get_shiftadd_cost): New function.
(force_expr_to_var_cost): Declare forward.  Use get_shiftadd_cost.


git-svn-id: svn+ssh://gcc.gnu.org/svn/gcc/trunk@173976 
138bc75d-0d04-0410-961f-82ee72b054a4

gcc generated for -O2:
_foo:
movhi hi(_delayLength),r0,r10
ld.h lo(_delayLength)[r10],r10
add r10,r6
andi 7,r6,r10
be .L1
movhi hi(_temp_ptr),r0,r16
ld.w lo(_temp_ptr)[r16],r15
mov r10,r17
shl 2,r17
mov r15,r14
movhi hi(_x),r0,r13
mov r15,r10
add r17,r14
movea lo(_x),r13,r13
.L3:
ld.w 0[r10],r11
ld.w 0[r13],r12
xor r12,r11
st.w r11,0[r10]
add 4,r10
cmp r14,r10
bne .L3
mov r15,r10
add r17,r10
st.w r10,lo(_temp_ptr)[r16]
.L1:
jmp [r31]

After the commit it generates:
_foo:
movhi hi(_delayLength),r0,r10
ld.h lo(_delayLength)[r10],r16
add r16,r6
andi 7,r6,r16
be .L1
movhi hi(_temp_ptr),r0,r17
ld.w lo(_temp_ptr)[r17],r18
movhi hi(_x),r0,r14
mov r18,r11
mov r16,r15
mov 0,r10
movea lo(_x),r14,r14
.L3:
ld.w 0[r11],r12
ld.w 0[r14],r13
add 1,r10
xor r13,r12
shl 16,r10
st.w r12,0[r11]
sar 16,r10
add 4,r11
cmp r15,r10
bne .L3
shl 2,r16
add r18,r16
st.w r16,lo(_temp_ptr)[r17]
.L1:
jmp [r31]

The problem is inside the loop: 
shl 16,r10
st.w r12,0[r11]
sar 16,r10
add 4,r11
cmp r15,r10


shl followed by sar is used to sign-extend r10; previous gcc versions did not 
emit this, and it is unnecessary.
At the point of commit v850 didn't have e3v5 support or zero overhead loops but 
now it does and this blocks generation of zero overhead loops. (with trunk and 
-mv850e3v5, gcc generates a sxh instruction instead of the shift pattern but 
the point is the same).
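For reference, the shl 16 / sar 16 pair (or sxh) computes a 16-bit sign extension. A portable C sketch of the same operation — left-shifting into the sign bit of a signed type is undefined, so this version goes through fixed-width casts instead of a shift pair:

```c
#include <stdint.h>

/* Sign-extend the low 16 bits of a 32-bit value, i.e. what the
   "shl 16; sar 16" pair (or the sxh instruction) in the generated
   code computes.  */
static int32_t sext16 (int32_t x)
{
  return (int32_t) (int16_t) (uint16_t) x;
}
```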

For mep the situation repeats. mep generates:
foo:
# frame: 8   8 regs
lh  $10, %sdaoff(delayLength)($gp)
add $sp, -8
add3$1, $1, $10
and3$10, $1, 0x7
beqz$10, .L1
lw  $2, %sdaoff(temp_ptr)($gp)
mov $1, 0
add3$11, $gp, %sdaoff(x)
bra .L5
.L3:
mov $2, $9
.L5:
lw  $0, ($11)
lw  $3, 4($2)
add $1, 1
exth$1
xor $3, $0
slt3$0, $1, $10
add3$9, $2, 8
sw  $3, ($2)
bnez$0, .L3
sw  $9, %sdaoff(temp_ptr)($gp)
.L1:
add $sp, 8
ret


Again exth sign-extends $1 and blocks generation of a zero-overhead loop because 
suddenly the loop is not simple anymore. Unfortunately I cannot test mep before the 
patch as at the time mep was not in mainline.

Does anyone understand why the mentioned patch is forcing the generation of the 
sign extend inside the loop? Is this just a problem with cost calculation in 
the backends or some issue lurking in tree-ssa-loop-ivopts?

Thanks,

Paulo Matos




RE: Regression [v850,mep...]: sign_extend in loop breaks zero-overhead loop generation

2014-01-30 Thread Paulo Matos
 -Original Message-
 From: Andreas Schwab [mailto:sch...@linux-m68k.org]
 Sent: 30 January 2014 14:29
 To: Paulo Matos
 Cc: gcc@gcc.gnu.org
 Subject: Re: Regression [v850,mep...]: sign_extend in loop breaks 
 zero-overhead
 loop generation
 
 Paulo Matos pma...@broadcom.com writes:
 
  void
  foo (short blockSize)
  {
short i;
unsigned short loopCount;
 
loopCount = (unsigned short) (blockSize + delayLength) % 8;
 for (i = 0; i < loopCount; i++)
*temp_ptr++ = x ^ *temp_ptr++;
  }
 
 You know that this is undefined code?


Correct, my apologies. I didn't notice the undefined behaviour that the reducer 
introduced in the original code.
However, the issue persists.
If instead I write:
void
foo (short blockSize)
{
  short i;
  unsigned short loopCount;
  loopCount = (unsigned short) (blockSize + delayLength) % 8;
  for (i = 0; i < loopCount; i++)
    *temp_ptr++ = x ^ *temp_ptr;
}

the sign extend is still generated from the referenced commit and persists 
until trunk. This causes the zero overhead loop not to be generated.
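As an aside, a well-defined form of the loop body reads through the pointer before storing and advancing, so the pointer is not read and modified unsequenced in one expression. A sketch only — not the original test case, and the regression may well still reproduce with this form:

```c
typedef int Sample;

/* One well-defined iteration of the loop body: read, then store,
   then advance -- no unsequenced modification and read of the
   pointer within a single expression.  */
static Sample *xor_step (Sample *p, Sample x)
{
  Sample v = *p;   /* read the old value first */
  *p = x ^ v;      /* store the xor'd value */
  return p + 1;    /* advance the pointer */
}
```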

Thanks,

Paulo Matos
 
 Andreas.
 
 --
 Andreas Schwab, sch...@linux-m68k.org
 GPG Key fingerprint = 58CA 54C7 6D53 942B 1756  01D3 44D5 214B 8276 4ED5
 And now for something completely different.


RE: Regression [v850,mep...]: sign_extend in loop breaks zero-overhead loop generation

2014-01-30 Thread Paulo Matos
 -Original Message-
 From: Andreas Schwab [mailto:sch...@linux-m68k.org]
 Sent: 30 January 2014 15:15
 To: Paulo Matos
 Cc: gcc@gcc.gnu.org
 Subject: Re: Regression [v850,mep...]: sign_extend in loop breaks 
 zero-overhead
 loop generation
 
 Paulo Matos pma...@broadcom.com writes:
 
  If instead I write:
  void
  foo (short blockSize)
  {
short i;
unsigned short loopCount;
loopCount = (unsigned short) (blockSize + delayLength) % 8;
for (i = 0; i < loopCount; i++)
*temp_ptr++ = x ^ *temp_ptr;
  }
 
 This is still undefined.
 

OK, of course. Don't know what I am doing today.
It's undefined because 'i' might overflow... I will get back to this. Thanks 
for pointing this out.

 Andreas.
 
 --
 Andreas Schwab, sch...@linux-m68k.org
 GPG Key fingerprint = 58CA 54C7 6D53 942B 1756  01D3 44D5 214B 8276 4ED5
 And now for something completely different.


[PATCH] Vector mode addresses

2014-01-30 Thread Paulo Matos
As a followup of the thread:
http://gcc.gnu.org/ml/gcc/2014-01/msg00206.html

consider the attached patch for submission. It fixes an ICE in case you try to 
use vector mode addresses and do not have it as mode dependent target hook.
As discussed in the thread adding VECTOR_MODE_P to the target hook is a way to 
avoid ICE but as Richard S. mentioned it's more general to patch GCC up.

Bootstrapped and tested on x86_64-unknown-linux-gnu.

2014-01-30  Paulo Matos  pma...@broadcom.com

* simplify-rtx.c (simplify_subreg): Do not adjust address if memory 
   address has vector mode.


OK to submit?

Paulo Matos




vector-mode.patch
Description: vector-mode.patch


RE: [PATCH] Vector mode addresses

2014-01-30 Thread Paulo Matos
 -Original Message-
 From: Richard Sandiford [mailto:rdsandif...@googlemail.com]
 Sent: 30 January 2014 12:43
 To: Paulo Matos
 Cc: gcc-patches@gcc.gnu.org
 Subject: Re: [PATCH] Vector mode addresses
 
 Paulo Matos pma...@broadcom.com writes:
  As a followup of the thread:
  http://gcc.gnu.org/ml/gcc/2014-01/msg00206.html
 
  consider the attached patch for submission. It fixes an ICE in case
  you try to use vector mode addresses and do not have it as mode
  dependent target hook.  As discussed in the thread adding
  VECTOR_MODE_P to the target hook is a way to avoid ICE but as Richard
  S. mentioned it's more general to patch GCC up.
 
 But like I said, I think the VECTOR_MODE_P check should be in
 mode_dependent_address_p (recog.c) rather than here.  If vector
 addresses are supported then they are mode-dependent, since the
 number of elements in the address mode must match the number of
 elements in the MEM mode.


Have I misunderstood what you said:
If we do support vector addresses then I think the right fix is to
check VECTOR_MODE_P there. 

From this I understood that you agreed that if vector_mode is supported for an 
address the check should be in simplify_rtx as it would prevent all target 
ports from adding the check to their hook, making it therefore more generic. 
You re-enforced this when you said:
I'd just prefer it
to be in generic code because I think it's a target-independent rule.

Apologies if I completely misunderstood you.

 
 Thanks,
 Richard


Re: [PATCH] Vector mode addresses

2014-01-30 Thread Paulo Matos

On 30/01/14 14:01, Richard Biener wrote:


I'm curious on where you are seeing MEMs with a vector mode address.
What does that MEM even mean?



Yes, it looks strange but it was the way we came up with to implement an 
instruction that loads from a pair of addresses.


From what I wrote previously to Richard.
We have an instruction that loads two 32 bit values into a lower and 
upper parts of a 64bit register using a base register and a 64 bit 
register used as a double index.

So,
r1 <- [r0, r2]
means:
low(r1) = [r0 + low(r2)]
high(r1) = [r0 + high(r2)]


 From the referenced mail:

new_rtx: (mem:V4SI (plus:V4SI (vec_concat:V4SI (vec_concat:V2SI
(const_int 0 [0])
 (const_int 0 [0]))
 (vec_concat:V2SI (const_int 0 [0])
 (const_int 0 [0])))

that should be invalid and somehow lacks the subreg:DI.  The bug is where
that got lost.



I don't think it got lost. GCC was trying to simplify it. That's why my 
patch was in simplify_subreg. GCC was trying to simplify a subreg in 
DImode with this mem rtx as SUBREG_REG and offset 8.


--
Paulo Matos




Re: [PATCH] Vector mode addresses

2014-01-30 Thread Paulo Matos

On 30/01/14 17:10, Richard Sandiford wrote:


Right, in the context:

   Just in case: the point I was trying to make in the second paragraph
   was that the code in the patch only triggers (as you say) because the
   address isn't seen as mode-dependent.  But this kind of address does
   look mode-dependent to me, since it only works for MEM modes that have
   the same number of elements as the index vector.  So this does sound
   like a case where mode_dependent_address_p ought to return true.

   If we do support vector addresses then I think the right fix is to
   check VECTOR_MODE_P there.

   http://gcc.gnu.org/ml/gcc/2014-01/msg00242.html

I.e. there == mode_dependent_address_p (defined in recog.c).


 From this I understood that you agreed that if vector_mode is supported
for an address the check should be in simplify_rtx as it would prevent
all target ports from adding the check to their hook, making it
therefore more generic. You re-enforced this when you said:
I'd just prefer it
to be in generic code because I think it's a target-independent rule.


No, I meant that I'd prefer it to be done in the target-independent
mode_dependent_address_p function.  This was in response to:

   Well, isn't it the case that a mem with vector mode is always mode
   dependent, in which case adding VECTOR_MODE_P to mode-dependent target
   hook would be enough making the patch not so useful?

   http://gcc.gnu.org/ml/gcc/2014-01/msg00248.html

where it sounded like you were instead going to add the check to your
target's TARGET_MODE_DEPENDENT_ADDRESS_P hook.  I don't think it makes
sense to defer the VECTOR_MODE_P check to the target hook since I don't
know how target-independent code could attach any meaning to something
like (mem:V4HI (reg:V4SI ...)) -> (mem:DI (reg:V4SI ...)).

Again, this is all on the basis that vector-mode addresses really
are supported.



Now I understand what you mean. I was pretty confused by what you meant 
by the target-independent mode_dependent_address_p because initially I had 
the feeling that targetm.mode_dependent_address_p was being called in 
simplify_rtx, but there is actually a mode_dependent_address_p in recog.c, 
and that is where you suggested adding the check _if_ vector modes are 
supported. Got it.


I apologize for my misunderstanding and thank you for your patience.

--
Paulo Matos


Thanks,
Richard





Re: [PATCH] Vector mode addresses

2014-01-30 Thread Paulo Matos

On 30/01/14 20:47, Jakub Jelinek wrote:

On Thu, Jan 30, 2014 at 08:27:47PM +, Paulo Matos wrote:

Yes, it looks strange but it was the way we came up with to
implement an instruction that loads from a pair of addresses.

 From what I wrote previously to Richard.
We have an instruction that loads two 32 bit values into a lower
and upper parts of a 64bit register using a base register and a 64
bit register used as a double index.
So,
r1 <- [r0, r2]
means:
low(r1) = [r0 + low(r2)]
high(r1) = [r0 + high(r2)]


That sounds like gather instruction e.g. i?86 AVX2/AVX512F has.
I don't like using vector mode for the address though, i?86 uses UNSPECs and
IMNSHO you should too.  Or express it as vec_concat of two MEM loads
where address of each one will be the base + vec_select of the vector
register.

Jakub



I will take a closer look at our pattern and how i?86 implements it.

Thanks for the advice.

Paulo Matos



RE: ICE in trunk due to MEM with address in vector mode

2014-01-27 Thread Paulo Matos
 -Original Message-
 From: Richard Sandiford [mailto:rdsandif...@googlemail.com]
 Sent: 24 January 2014 16:34
 To: Paulo Matos
 Cc: gcc@gcc.gnu.org
 Subject: Re: ICE in trunk due to MEM with address in vector mode
 
 Paulo Matos pma...@broadcom.com writes:
  If we do support vector addresses then I think the right fix is to
  check VECTOR_MODE_P there.
 
  Well, isn't it the case that a mem with vector mode is always mode
  dependent, in which case adding VECTOR_MODE_P to mode-dependent target
  hook would be enough, making the patch not so useful?
 
 Yeah, adding it to the target hook would work too.  I'd just prefer it
 to be in generic code because I think it's a target-independent rule.
 
 I can see that for an out-of-tree port it would be easier to put it in
 the target hook though.


I agree. If indeed VECTOR_MODE_P is officially supported then the patch would 
be best.
I will submit the patch to gcc-patches and wait for the reaction then.

Thanks for your help,

Paulo Matos

 
 Thanks,
 Richard


Mode change for bswap pattern expansion

2014-01-27 Thread Paulo Matos
Hello,

On a vector processor we can do a bswapsi with two instructions, by first 
rotating half-words (16 bits) by 8 and then rotating full words by 16. 
However, this means expanding:
(set (match_operand:SI 0 "register_operand" "")
     (bswap:SI (match_operand:SI 1 "register_operand" "")))

to:
(set (match_dup:V2HI 0)
 (rotate:V2HI (match_dup:V2HI 1)
  (const_int 8)))
(set (match_dup:SI 0)
 (rotate:SI (match_dup:SI 0)
(const_int 16)))

This is obviously not correct, because match_dup cannot set the mode. The point 
I am trying to make is that I can't find a good way to deal with the mode 
changes. I don't think GCC is too happy if I change the modes of the same 
operand from one instruction to the other right? The only other way is to emit 
paradoxical subregs. So something along these lines:
(set (subreg:V2HI (match_dup 0) 0)
 (rotate:V2HI (subreg:V2HI (match_dup 1) 0)
  (const_int 8)))
(set (match_dup 0)
 (rotate:SI (match_dup 0)
(const_int 16)))

Is there a better way to handle a situation like this?

Cheers,

Paulo Matos




RE: Mode change for bswap pattern expansion

2014-01-27 Thread Paulo Matos


 -Original Message-
 From: Richard Sandiford [mailto:rdsandif...@googlemail.com]
 Sent: 27 January 2014 16:06
 To: Paulo Matos
 Cc: gcc@gcc.gnu.org
 Subject: Re: Mode change for bswap pattern expansion
 
 Paulo Matos pma...@broadcom.com writes:
  On a vector processor we can do a bswapsi with two instructions, by first
 rotating half-words (16 bits) by 8 and then rotating full words by 16.
  However, this means expanding:
  (set (match_operand:SI 0 "register_operand" "")
   (bswap:SI (match_operand:SI 1 "register_operand" "")))
 
  to:
  (set (match_dup:V2HI 0)
   (rotate:V2HI (match_dup:V2HI 1)
(const_int 8)))
  (set (match_dup:SI 0)
   (rotate:SI (match_dup:SI 0)
  (const_int 16)))
 
  This is obviously not correct, because match_dup cannot set the mode. The 
  point
 I am trying to make is that I can't find a good way to deal with the mode
 changes. I don't think GCC is too happy if I change the modes of the same 
 operand
 from one instruction to the other right? The only other way is to emit
 paradoxical subregs. So something along these lines:
  (set (subreg:V2HI (match_dup 0) 0)
   (rotate:V2HI (subreg:V2HI (match_dup 1) 0)
(const_int 8)))
  (set (match_dup 0)
   (rotate:SI (match_dup 0)
  (const_int 16)))
 
 It's usually better not to hard-code the subregs in the pattern.
 Instead you could use C code to create the subregs, e.g.:
 
   [(set (match_dup 3)
 (rotate:V2HI (match_dup 2)
  (const_int 8)))
(set (match_dup 0)
 (rotate:SI (match_dup 4)
(const_int 16)))]
   
 {
   operands[2] = gen_lowpart (V2HImode, operands[1]);
   operands[3] = gen_reg_rtx (V2HImode);
   operands[4] = gen_lowpart (SImode, operands[3]);
 }
 
 so that any hard regs are correctly handled.  Or it might be easier to code
 it using emit_insn (gen_* (...))s instead.
 
 BTW, paradoxical subregs are where the outer mode is strictly larger
 than the inner mode.


That's right. My mis-understanding.
 
 MIPS uses essentially the same sequence, except that it has a special
 instruction to do the first rotate (WSBH), rather than it being an instance
 of a general vector rotate.  For MIPS we just model it as an unspec SImode
 operation.  Maybe that would be easier here too.
 

I will look at how MIPS is doing it.

However, the unspec SI has severe performance penalties on my port since it is 
able to issue more than one instruction per cycle; having each instruction 
separate allows the scheduler to issue each of the bswapsi parts into different 
slots with other instructions.


Thanks,

Paulo Matos

 Thanks,
 Richard


RE: Mode change for bswap pattern expansion

2014-01-27 Thread Paulo Matos
 -Original Message-
 From: Richard Sandiford [mailto:rsand...@linux.vnet.ibm.com]
 Sent: 27 January 2014 16:50
 To: Paulo Matos
 Cc: gcc@gcc.gnu.org
 Subject: Re: Mode change for bswap pattern expansion
  
 Sorry, I meant we use an unspec for the first (V2HI) rotate.
 I.e. rather than:
 
   (set (subreg:V2HI (match_dup 2) 0)
(rotate:V2HI (subreg:V2HI (match_dup 1) 0)
 (const_int 8)))
   (set (match_dup 0)
(rotate:SI (match_dup 2)
   (const_int 16)))
 
 we have:
 
   (set (match_dup 2) (unspec:SI [(match_dup 1)] UNSPEC_FOO))
   (set (match_dup 0)
(rotate:SI (match_dup 2)
   (const_int 16)))
 
 In your case the define_insn for the UNSPEC_FOO pattern would have the
 same attributes as a V2HI rotate, so it should get scheduled in the same way.
 

In that case it would work. My only concern would then be if it prevents 
further optimizations. On the other hand I am not sure if GCC would try to 
optimize a rotate with vector V2HI mode...
Might give both solutions a try and see what results I get.

Thanks,

Paulo Matos

 Thanks,
 Richard



RE: ICE in trunk due to MEM with address in vector mode

2014-01-24 Thread Paulo Matos

 -Original Message-
 From: Richard Sandiford [mailto:rdsandif...@googlemail.com]
 Sent: 24 January 2014 10:46
 To: Paulo Matos
 Cc: gcc@gcc.gnu.org
 Subject: Re: ICE in trunk due to MEM with address in vector mode
 
 Paulo Matos pma...@broadcom.com writes:
  Hello,
 
  I have found a strange case of an ICE due to a MEM with an address
  with vector mode.
 
 AFAIK that isn't supported.  Addresses are assumed to be MODE_INT or
 MODE_PARTIAL_INT.  Hopefully someone will correct me if this has changed
 without me noticing it.
 
 Is this for some kind of gather load?  I think an address like that
 would have to be mode-dependent anyway, since the number of units
 in the MEM mode and the address mode would need to match.


Thanks for your comment.
We have an instruction that loads two 32 bit values into a lower and upper 
parts of a 64bit register using a base register and a 64 bit register used as a 
double index.
So,
r1 <- [r0, r2]
means:
low(r1) = [r0 + low(r2)]
high(r1) = [r0 + high(r2)]

In this case we use vector addresses. If it is indeed not allowed then I need 
another solution. Used to work well in 4.8 and before. In trunk it breaks, but 
works with the patch I included before.
Can't find anything in the documentation regarding this case so it would be 
good to make it clear one way or the other.

Paulo Matos

 
 Thanks,
 Richard


RE: ICE in trunk due to MEM with address in vector mode

2014-01-24 Thread Paulo Matos
 -Original Message-
 From: Richard Sandiford [mailto:rdsandif...@googlemail.com]
 Sent: 24 January 2014 12:21
 To: Paulo Matos
 Cc: gcc@gcc.gnu.org
 Subject: Re: ICE in trunk due to MEM with address in vector mode
 
 Just in case: the point I was trying to make in the second paragraph
 was that the code in the patch only triggers (as you say) because the
 address isn't seen as mode-dependent.  But this kind of address does
 look mode-dependent to me, since it only works for MEM modes that have
 the same number of elements as the index vector.  So this does sound
 like a case where mode_dependent_address_p ought to return true.
 

Yes, you're probably right. Sorry I haven't gotten back to you on that part of 
your initial email.

  If we do support vector addresses then I think the right fix is to
 check VECTOR_MODE_P there.
 

Well, isn't it the case that a mem with vector mode is always mode dependent, 
in which case adding VECTOR_MODE_P to the mode-dependent target hook would be 
enough, making the patch not so useful?

Cheers,

Paulo Matos

 Thanks,
 Richard


ICE in trunk due to MEM with address in vector mode

2014-01-23 Thread Paulo Matos
Hello,

I have found a strange case of an ICE due to a MEM with an address with vector 
mode.

The ICE looks like:
baaclc_block.c: In function 'fn2':
baaclc_block.c:22:1: internal compiler error: in trunc_int_for_mode, at 
explow.c:56
 }
 ^
0x64d050 trunc_int_for_mode(long, machine_mode)
/home/pmatos/work/fp_gcc-master/gcc/explow.c:56
0x637148 gen_int_mode(long, machine_mode)
/home/pmatos/work/fp_gcc-master/gcc/emit-rtl.c:422
0x64d61a plus_constant(machine_mode, rtx_def*, long)
/home/pmatos/work/fp_gcc-master/gcc/explow.c:190
0x63b3e1 adjust_address_1(rtx_def*, machine_mode, long, int, int, int, long)
/home/pmatos/work/fp_gcc-master/gcc/emit-rtl.c:2088
0x98f05b simplify_subreg(machine_mode, rtx_def*, machine_mode, unsigned int)
/home/pmatos/work/fp_gcc-master/gcc/simplify-rtx.c:6061
0x98f381 simplify_gen_subreg(machine_mode, rtx_def*, machine_mode, unsigned int)
/home/pmatos/work/fp_gcc-master/gcc/simplify-rtx.c:6123
0xf546ab propagate_rtx_1
/home/pmatos/work/fp_gcc-master/gcc/fwprop.c:565
0xf54c6c propagate_rtx
/home/pmatos/work/fp_gcc-master/gcc/fwprop.c:730
0xf56628 forward_propagate_and_simplify
/home/pmatos/work/fp_gcc-master/gcc/fwprop.c:1392
0xf568f4 forward_propagate_into
/home/pmatos/work/fp_gcc-master/gcc/fwprop.c:1465
0xf56b8d fwprop
/home/pmatos/work/fp_gcc-master/gcc/fwprop.c:1550
0xf56c30 execute
/home/pmatos/work/fp_gcc-master/gcc/fwprop.c:1586

I cannot reproduce it with my host backend... however, I can try to explain 
what's happening. 
fwprop is calling propagate_rtx with rtxs of the form:
X: (subreg:DI (reg/v:V4SI 93 [ v4wspecExp ]) 8
mode: DImode
old_rtx: (reg/v:V4SI 93 [ v4wspecExp ])
new_rtx: (mem:V4SI (plus:V4SI (vec_concat:V4SI (vec_concat:V2SI (const_int 0 
[0])
(const_int 0 [0]))
(vec_concat:V2SI (const_int 0 [0])
(const_int 0 [0])))
(reg:V4SI 92 [ D.2926 ])) [0  S16 A6

This will go through simplify_subreg that upon seeing a memory where the 
address is not mode dependent will call adjust_address_1 with adjust_address 
argument enabled and it will call plus_constant to adjust the address, however 
plus_constant won't find any constants in the address and will simply fallback 
to: x = gen_rtx_PLUS (mode, x, gen_int_mode (c, mode));

gen_int_mode (8, V4SI) will call trunc_int_for_mode with 8 and V4SI, which will 
ICE because trunc_int_for_mode doesn't handle vector modes.

I suggest the following patch to simplify-rtx.c:
diff --git a/gcc/simplify-rtx.c b/gcc/simplify-rtx.c
index 04af01e..b01211f5 100755
--- a/gcc/simplify-rtx.c
+++ b/gcc/simplify-rtx.c
@@ -6057,7 +6057,11 @@ simplify_subreg (enum machine_mode outermode, rtx op,
  have instruction to move the whole thing.  */
          && (! MEM_VOLATILE_P (op)
              || ! have_insn_for (SET, innermode))
-         && GET_MODE_SIZE (outermode) <= GET_MODE_SIZE (GET_MODE (op)))
+         && GET_MODE_SIZE (outermode) <= GET_MODE_SIZE (GET_MODE (op))
+         && !VECTOR_MODE_P (GET_MODE (XEXP (op, 0))))
 return adjust_address_nv (op, outermode, byte);
 
   /* Handle complex values represented as CONCAT


Comments?

Paulo Matos




Avoiding paradoxical subregs

2014-01-17 Thread Paulo Matos
Hello,

I am seeing GCC combining:
(insn 17 16 18 3 (set (reg:SI 104 [ D.4588 ])
(zero_extend:SI (reg:HI 103 [ D.4587 ]))) test-18.c:24 1426 
{zero_extendhisi2}
 (expr_list:REG_DEAD (reg:HI 103 [ D.4587 ])
(nil)))
(insn 18 17 19 3 (set (reg:BI 105)
(gt:BI (reg:SI 104 [ D.4588 ])
(const_int 4 [0x4]))) test-18.c:24 1372 {cmp_simode}
 (expr_list:REG_DEAD (reg:SI 104 [ D.4588 ])
(nil)))

into a paradoxical use:
(set (reg:BI 105)
(gt:BI (subreg:SI (reg:HI 103 [ D.4587 ]) 0)
(const_int 4 [0x4])))


Reload then transforms (subreg:SI (reg:HI 103 [ D.4587 ]) 0) into (reg:SI 18), 
allocating 103 to 18, but changing the mode to SI.
This is wrong and it will generate wrong code.

I can use canonicalize_comparison like s390 to remove the subreg, however the 
question then becomes about how to avoid this in general. We cannot allow a 
zero_extend to become a paradoxical subreg and then have the subreg discarded. 
Most of our instructions are vector instructions which will change the 
remainder of the register.

I have defined WORD_REGISTER_OPERATIONS and CANNOT_CHANGE_MODE_CLASS(from, to, 
class) and set it to true if GET_MODE_SIZE (from) < GET_MODE_SIZE (to) but it 
didn't help.

Any suggestions for a generic solution?

Paulo Matos




RE: Avoiding paradoxical subregs

2014-01-17 Thread Paulo Matos
 -Original Message-
 From: Eric Botcazou [mailto:ebotca...@adacore.com]
 Sent: 17 January 2014 16:23
 To: Paulo Matos
 Cc: gcc@gcc.gnu.org
 Subject: Re: Avoiding paradoxical subregs
 
  I can use canonicalize_comparison like s390 to remove the subreg, however
  the question then becomes about how to avoid this in general. We cannot
  allow a zero_extend to become a paradoxical subreg and then have the subreg
  discarded. Most of our instructions are vector instructions which will
  change the remainder of the register.
 
  I have defined WORD_REGISTER_OPERATIONS and CANNOT_CHANGE_MODE_CLASS(from,
  to, class) and set it to true if GET_MODE_SIZE (from) < GET_MODE_SIZE (to)
  but it didn't help.
 
  Any suggestions for a generic solution?
 
 What kind of generic solution?  Eliminating paradoxical subregs altogether?
 That's very likely not doable if you define WORD_REGISTER_OPERATIONS anyway.
 You need to pinpoint where things start to go wrong, for example in combine.
 

I am not implying that this is a GCC bug, unless you think 
WORD_REGISTER_OPERATIONS should have avoided the creation of such paradoxical 
subreg. What I was looking after was for a generic solution on my backend that 
either eliminates the use of paradoxical subregs or forces reload the transform 
(subreg:m (reg:n K)), where subreg is paradoxical, into a zero_extend.

I mentioned a generic one because the only solution I have is for comparison 
instructions only:
canonicalize_comparison (int *, rtx *op0, rtx *,
  bool op0_preserve_value)
{
  if (op0_preserve_value)
return;

  /* Remove paradoxical subregs.  */
  if (GET_CODE (*op0) == SUBREG)
{
  rtx inner = SUBREG_REG (*op0);

  if (GET_MODE_SIZE (GET_MODE (inner)) < GET_MODE_SIZE (GET_MODE (*op0)))
*op0 = simplify_gen_unary (ZERO_EXTEND, GET_MODE (*op0), inner, 
GET_MODE (inner));
}
}

Another more generic solution would be to write my own register_operand 
predicate that excludes paradoxical subregs, but I am not sure what kind of 
Pandora's box I would be opening by doing that.
-- 
Paulo Matos


RE: Infinite number of iterations in loop [v850, mep]

2014-01-15 Thread Paulo Matos
 -Original Message-
 From: gcc-ow...@gcc.gnu.org [mailto:gcc-ow...@gcc.gnu.org] On Behalf Of Paulo
 Matos
 Sent: 09 January 2014 16:48
 To: Richard Biener
 Cc: Andrew Haley; gcc@gcc.gnu.org; Jan Hubicka
 Subject: RE: Infinite number of iterations in loop [v850, mep]
 
 
 I would like some comments on the following patch that seems to work but I 
 think
 it could be generalized.
 The idea is for the specific infinite condition of type (and reg int), we can
 search for the definition of reg,
 check nonzero_bits and check that they don't match any of the bits in int.
 

Forget the patch, I am implementing a much more robust patch with the same 
objective which I will post with a request for comments in a separate thread.

Paulo Matos



Context dependent expression simplification

2014-01-14 Thread Paulo Matos
Hello,

Before I start to write code to reinvent the wheel, I would like to know if 
there's something already out there to do context dependent expression 
simplification.
What I need is to simplify an expression at a given point in a BB.

For example:

  bb2:
   r1 <- 2
   if r2 != 0 goto bb3 else bb4   
  bb3:
   r3 <- r2 << 1
   goto bb4
  bb4:
   ...
   ...
   if ... goto bb4 else bb5

Is there any way already implemented to find the value of (and (plus r1 r3) 
(const_int 1)) at the end of bb4 and simplify it to (const_int 0)?

Cheers,

Paulo Matos




RE: Context dependent expression simplification

2014-01-14 Thread Paulo Matos
 -Original Message-
 From: Jakub Jelinek [mailto:ja...@redhat.com]
 Sent: 14 January 2014 13:45
 To: Paulo Matos
 Cc: gcc@gcc.gnu.org
 Subject: Re: Context dependent expression simplification
 
 On Tue, Jan 14, 2014 at 01:40:36PM +, Paulo Matos wrote:
  Before I start to write code to reinvent the wheel, I would like to know if
 there's something already out there to do context dependent expression
 simplification.
  What I need is to simplify an expression at a given point in a BB.
 
  For example:
 
bb2:
 r1 <- 2
 if r2 != 0 goto bb3 else bb4
bb3:
 r3 <- r2 << 1
 goto bb4
bb4:
 ...
 ...
 if ... goto bb4 else bb5
 
  Is there any way already implemented to find the value of (and (plus r1 r3)
 (const_int 1)) at the end of bb4 and simplify it to (const_int 0)?
 
  VRP should handle this (of course at the GIMPLE level); it performs
  context-dependent optimizations through its edge assertions.

I need to do this during loop_doloop to simplify infinite conditons.
Might have to implement something to do this at RTL level then.

Paulo Matos

 
   Jakub


Useless statement in loop latch looks like performance regression

2014-01-10 Thread Paulo Matos
}
 (expr_list:REG_DEAD (reg:SI 109 [ D.3280 ])
(expr_list:REG_DEAD (reg:SI 101 [ D.3283 ])
(expr_list:REG_UNUSED (reg:CC 17 flags)
(nil)
(insn 56 55 57 4 (parallel [
(set (reg:SI 85 [ ivtmp.7 ])
(plus:SI (reg:SI 85 [ ivtmp.7 ])
(const_int 1 [0x1])))
(clobber (reg:CC 17 flags))
]) 273 {*addsi_1}
 (expr_list:REG_UNUSED (reg:CC 17 flags)
(nil)))
(insn 57 56 58 4 (parallel [
(set (reg:DI 97 [ ivtmp.8 ])
(plus:DI (reg:DI 97 [ ivtmp.8 ])
(const_int 2 [0x2])))
(clobber (reg:CC 17 flags))
]) 274 {*adddi_1}
 (expr_list:REG_UNUSED (reg:CC 17 flags)
(nil)))
(insn 58 57 59 4 (set (reg:CCZ 17 flags)
(compare:CCZ (reg:SI 85 [ ivtmp.7 ])
(reg:SI 98 [ D.3281 ]))) coremark.c:141 7 {*cmpsi_1}
 (nil))
(jump_insn 59 58 60 4 (set (pc)
(if_then_else (eq (reg:CCZ 17 flags)
(const_int 0 [0]))
(label_ref 64)
(pc))) coremark.c:141 621 {*jcc_1}
 (expr_list:REG_DEAD (reg:CCZ 17 flags)
(expr_list:REG_BR_PROB (const_int 900 [0x384])
(nil)))
 - 64)
;;  succ:   5 [91.0%]  (FALLTHRU)
;;  6 [9.0%]  (LOOP_EXIT)
;; lr  out   6 [bp] 7 [sp] 16 [argp] 20 [frame] 85 91 96 97 98 102 103 104 
105

;; basic block 5, loop depth 2, count 0, freq 8281, maybe hot
;;  prev block 4, next block 6, flags: (RTL, MODIFIED)
;;  pred:   4 [91.0%]  (FALLTHRU)
;; bb 5 artificial_defs: { }
;; bb 5 artificial_uses: { u35(6){ }u36(7){ }u37(16){ }u38(20){ }}
;; lr  in6 [bp] 7 [sp] 16 [argp] 20 [frame] 85 91 96 97 98 102 103 104 
105
;; lr  use   6 [bp] 7 [sp] 16 [argp] 20 [frame] 96
;; lr  def   101
(note 60 59 34 5 [bb 5] NOTE_INSN_BASIC_BLOCK)
(insn 34 60 84 5 (set (reg:SI 101 [ D.3283 ])
(reg:SI 96 [ D.3280 ])) coremark.c:141 89 {*movsi_internal}
 (expr_list:REG_DEAD (reg:SI 96 [ D.3280 ])
(nil)))
(jump_insn 84 34 85 5 (set (pc)
(label_ref 61)) 659 {jump}
 (nil)
 - 61)
;;  succ:   4 [100.0%]  (DFS_BACK)
;; lr  out   6 [bp] 7 [sp] 16 [argp] 20 [frame] 85 91 97 98 101 102 103 104 
105

However, registers 96 and 101 only show up in block 4 here:
(insn 55 54 56 4 (parallel [
(set (reg:SI 96 [ D.3280 ])
(plus:SI (reg:SI 109 [ D.3280 ])
(reg:SI 101 [ D.3283 ])))
(clobber (reg:CC 17 flags))
]) coremark.c:142 273 {*addsi_1}
 (expr_list:REG_DEAD (reg:SI 109 [ D.3280 ])
(expr_list:REG_DEAD (reg:SI 101 [ D.3283 ])
(expr_list:REG_UNUSED (reg:CC 17 flags)
(nil)

I don't really understand why GCC is generating this extra basic block; however, 
it looks like a regression. Block 5 could disappear and insn 55 be rewritten as 
(replacing reg 101 by reg 96):

(set (reg:SI 96 [ D.3280 ])
(plus:SI (reg:SI 109 [ D.3280 ])
(reg:SI 96 [ D.3283 ])))

It is worse on my port due to several technical constraints. For example, zero 
overhead loops can only be generated if there are no internal branches, which 
means that any loop without an empty latch can't be transformed into a zero 
overhead loop, which is the case with this example. So, loops with empty 
latches are desired as much as possible. Once GCC starts creating code like 
this, performance on my port drops drastically.

What are your thoughts?

Note: I just noticed this doesn't happen in trunk anymore. Any idea of what 
went into trunk to fix this?

Paulo Matos




RE: Useless statement in loop latch looks like performance regression

2014-01-10 Thread Paulo Matos


Paulo Matos


 -Original Message-
 From: Richard Biener [mailto:richard.guent...@gmail.com]
 Sent: 10 January 2014 13:25
 To: Paulo Matos
 Cc: gcc@gcc.gnu.org
 Subject: Re: Useless statement in loop latch looks like performance regression
 
 Most likely changes to SSA coalescing at out-of-SSA time like
 
 2013-09-26  Richard Biener  rguent...@suse.de
 
 * tree-ssa-live.c (var_map_base_init): Handle SSA names with
 DECL_IGNORED_P base VAR_DECLs like anonymous SSA names.
 (loe_visit_block): Use gcc_checking_assert.
 * tree-ssa-coalesce.c (create_outofssa_var_map): Use
 gimple_assign_ssa_name_copy_p.
 (gimple_can_coalesce_p): Adjust according to the var_map_base_init
 change.
 
 and an earlier patch by Jeff Law.

You're right, I tracked it down to this one:

Author: law law@138bc75d-0d04-0410-961f-82ee72b054a4
Date:   Fri Jun 14 18:52:32 2013 +

* gimple.h (gimple_can_coalesce_p): Prototype.
* tree-ssa-coalesce.c (gimple_can_coalesce_p): New function.
(create_outofssa_var_map, coalesce_partitions): Use it.
* tree-ssa-uncprop.c (uncprop_into_successor_phis): Similarly.
* tree-ssa-live.c (var_map_base_init): Use TYPE_CANONICAL
if it's available.

* gcc.dg/tree-ssa/coalesce-1.c: New test.

Thanks,

Paulo Matos

 
 Richard.
 




RE: Infinite number of iterations in loop [v850, mep]

2014-01-09 Thread Paulo Matos
 -Original Message-
 From: Richard Biener [mailto:richard.guent...@gmail.com]
 Sent: 08 January 2014 14:42
 To: Paulo Matos
 Cc: Andrew Haley; gcc@gcc.gnu.org; Jan Hubicka
 Subject: Re: Infinite number of iterations in loop [v850, mep]
 
 Well.  We have
 
 Loop 2 is simple:
   simple exit 5 - 7
   infinite if: (expr_list:REG_DEP_TRUE (and:SI (reg:SI 76)
 (const_int 1 [0x1]))
 (nil))
   number of iterations: (lshiftrt:SI (plus:SI (minus:SI (reg:SI 68 [ D.1398 ])
 (reg:SI 64 [ ivtmp___6 ]))
 (const_int -2 [0xfffe]))
 (const_int 1 [0x1]))
   upper bound: 2147483646
   realistic bound: -1
 Doloop: Possible infinite iteration case.
 Doloop: The loop is not suitable.
 
 as we replaced the induction variable by a pointer induction with
 step 2.  So this might be a very common issue for RTL loop opts,
 the upper bound of the IV is 2 * N in this case, so 2 * N & 1
 should be always false and thus infinite be optimized.
 
 (insn 34 33 36 3 (parallel [
 (set (reg:SI 76)
 (plus:SI (reg/v:SI 71 [ N ])
 (reg/v:SI 71 [ N ])))
 (clobber (reg:CC 32 psw))
 ]) 21 {addsi3}
  (expr_list:REG_UNUSED (reg:CC 32 psw)
 (nil)))
 
 that doesn't look too difficult to do with the above definition.
 nonzero_bits might be of use here, not sure (not my area of
 expertise).
 

I am trying to do something that shouldn't be too hard with the current df 
infrastructure but I don't think I am doing it the right way.
Once I have the assumption (and:SI (reg:SI 76) (const_int 1 [0x1])),
I need to reach the definition of reg 76, which is insn 34. Generally we can 
only do this if there is no other definition of reg 76 except one in a 
dominator basic block.
The CFG looks like:
  BB2 -> BB3, BB7 (exit)
  BB3 -> BB6, exit
  BB6 -> BB4
  BB4 -> BB5, BB9
  BB5 -> BB8, BB7 (exit)
  BB9 -> BB4 (loop)
  BB8 -> BB6 (loop)

BB3 contains insn 34 and there's no other definition of reg 76 in loop 
BB6-BB4-BB5-BB8 or the inner BB4-BB9.
Is there a way to do this search for the definition of reg 76 automatically? I can
see df_find_def; however, it requires me to have insn 34 already, and finding it is
exactly what I need to do. Is there anything in GCC to do this already?

I am sure GCC must be doing this already somewhere.

Cheers,

Paulo Matos



 Richard.
 



RE: Infinite number of iterations in loop [v850, mep]

2014-01-09 Thread Paulo Matos
 -Original Message-
 From: Richard Biener [mailto:richard.guent...@gmail.com]
 Sent: 08 January 2014 14:42
 To: Paulo Matos
 Cc: Andrew Haley; gcc@gcc.gnu.org; Jan Hubicka
 Subject: Re: Infinite number of iterations in loop [v850, mep]
 
 On Wed, Jan 8, 2014 at 3:09 PM, Paulo Matos pma...@broadcom.com wrote:
  -Original Message-
  From: Richard Biener [mailto:richard.guent...@gmail.com]
  Sent: 08 January 2014 11:03
  To: Paulo Matos
  Cc: Andrew Haley; gcc@gcc.gnu.org
  Subject: Re: Infinite number of iterations in loop [v850, mep]
 
  That was referring to the case with extern b.  For the above case the
  issue must be sth else.  Trying a cross to v850-elf to see if it
  reproduces for me (if 'b' is a stack or argument slot then we might
  bogusly think that *c++ = 0 may clobber it, otherwise RTL
  number of iteration analysis might just be confused).
 
  So for example (should be arch independent)
 
  struct X { int i; int j; int k; int l[24]; };
 
  int foo (struct X x, int *p)
  {
int z = x.j;
*p = 1;
return z;
  }
 
  see if there is an anti-dependence between x.j and *p on the RTL side
  (at least the code dispatching to the tree oracle using the MEM_EXPRs
  should save you from that).
 
  So - v850 at least doesn't pass b in memory and the doloop recognition
  works for me (on trunk).
 
 
  You are right, everything is fine with the above example regarding the
 anti-dependence and with the loop as well. I got confused with mine not
 generating a loop for
  void fn1 (unsigned int b)
  {
unsigned int a;
    for (a = 0; a < b; a++)
  *c++ = 0;
  }
 
  but that is simply because in our case it is not profitable.
 
  However, for the case:
  void matrix_add_const(unsigned int N, short *A, short val) {
   unsigned int i,j;
   for (i=0; i<N; i++) {
     for (j=0; j<N; j++) {
 A[i*N+j] += val;
}
   }
  }
 
  GCC thinks for v850 and my port that the inner loop might be infinite.
  It looks like GCC is mangling the loop so much that the obviousness that the
 inner loop is finite is lost.
 
  This, however, turns out to be very performance-degrading. Using -fno-ivopts
 makes generation of loops work again in both my port and v850.
  Is there a way to fine-tune ivopts besides trying to tune the costs, or do you
 reckon this is something iv-analysis should be smarter about?
 
 Well.  We have
 
 Loop 2 is simple:
  simple exit 5 -> 7
   infinite if: (expr_list:REG_DEP_TRUE (and:SI (reg:SI 76)
 (const_int 1 [0x1]))
 (nil))
   number of iterations: (lshiftrt:SI (plus:SI (minus:SI (reg:SI 68 [ D.1398 ])
 (reg:SI 64 [ ivtmp___6 ]))
  (const_int -2 [0xfffffffe]))
 (const_int 1 [0x1]))
   upper bound: 2147483646
   realistic bound: -1
 Doloop: Possible infinite iteration case.
 Doloop: The loop is not suitable.
 
 as we replaced the induction variable by a pointer induction with
 step 2.  So this might be a very common issue for RTL loop opts,
 the upper bound of the IV is 2 * N in this case, so 2 * N & 1
 should always be false and thus the infinite condition can be optimized away.
 
 (insn 34 33 36 3 (parallel [
 (set (reg:SI 76)
 (plus:SI (reg/v:SI 71 [ N ])
 (reg/v:SI 71 [ N ])))
 (clobber (reg:CC 32 psw))
 ]) 21 {addsi3}
  (expr_list:REG_UNUSED (reg:CC 32 psw)
 (nil)))
 
 that doesn't look too difficult to do with the above definition.
 nonzero_bits might be of use here, not sure (not my area of
 expertise).
 

I would like some comments on the following patch, which seems to work but I
think could be generalized.
The idea is that, for the specific infinite condition of the form (and reg int),
we can search for the definition of reg, check its nonzero_bits, and verify that
none of them overlap any of the bits in int.

diff --git a/gcc/loop-iv.c b/gcc/loop-iv.c
index 4c34007..215fd22 100644
--- a/gcc/loop-iv.c
+++ b/gcc/loop-iv.c
@@ -2064,6 +2064,50 @@ simplify_using_initial_values (struct loop *loop, enum rtx_code op, rtx *expr)
   e = single_pred_edge (e->src);
 }
 
+  /* For certain patterns we can do even better, like (and (reg) 1).  */
+  if (GET_CODE (*expr) == AND
+      && REG_P (XEXP (*expr, 0))
+      && CONST_INT_P (XEXP (*expr, 1)))
+{
+  rtx reg = XEXP (*expr, 0);
+  unsigned HOST_WIDE_INT mask = INTVAL (XEXP (*expr, 1));
+  rtx insn_def = NULL_RTX;
+  basic_block bb = loop_preheader_edge (loop)->src;
+
+  while (1)
+   {
+ rtx insn;
+
+ if (bb == ENTRY_BLOCK_PTR)
+   break;
+
+ FOR_BB_INSNS_REVERSE (bb, insn)
+   {
+ if (!INSN_P (insn))
+   break;
+ if (df_reg_defined (insn, reg))
+   {
+ insn_def = insn;
+ break;
+   }
+   }
+
+ if (insn_def)
+   break;
+ bb = get_immediate_dominator (CDI_DOMINATORS, bb);
+   }
+
+  if (insn_def)
+   {
+ rtx set = single_set (insn_def

RE: Infinite number of iterations in loop [v850, mep]

2014-01-08 Thread Paulo Matos
 -Original Message-
 From: Richard Biener [mailto:richard.guent...@gmail.com]
 Sent: 08 January 2014 11:03
 To: Paulo Matos
 Cc: Andrew Haley; gcc@gcc.gnu.org
 Subject: Re: Infinite number of iterations in loop [v850, mep]
 
 That was referring to the case with extern b.  For the above case the
 issue must be sth else.  Trying a cross to v850-elf to see if it
 reproduces for me (if 'b' is a stack or argument slot then we might
 bogusly think that *c++ = 0 may clobber it, otherwise RTL
 number of iteration analysis might just be confused).
 
 So for example (should be arch independent)
 
 struct X { int i; int j; int k; int l[24]; };
 
 int foo (struct X x, int *p)
 {
   int z = x.j;
   *p = 1;
   return z;
 }
 
 see if there is an anti-dependence between x.j and *p on the RTL side
 (at least the code dispatching to the tree oracle using the MEM_EXPRs
 should save you from that).
 
 So - v850 at least doesn't pass b in memory and the doloop recognition
 works for me (on trunk).
 

You are right, everything is fine with the above example regarding the 
anti-dependence and with the loop as well. I got confused with mine not 
generating a loop for 
void fn1 (unsigned int b)
{
  unsigned int a;
  for (a = 0; a < b; a++)
*c++ = 0;
}

but that is simply because in our case it is not profitable.

However, for the case:
void matrix_add_const(unsigned int N, short *A, short val) {
 unsigned int i,j;
 for (i=0; i<N; i++) {
  for (j=0; j<N; j++) {
   A[i*N+j] += val;
  }
 }
}

GCC thinks for v850 and my port that the inner loop might be infinite.
It looks like GCC is mangling the loop so much that the obviousness that the 
inner loop is finite is lost.

This, however, turns out to be very performance-degrading. Using -fno-ivopts
makes generation of loops work again in both my port and v850.
Is there a way to fine-tune ivopts besides trying to tune the costs or do you 
reckon this is something iv-analysis should be smarter about?

Paulo Matos

 Richard.
 
  The same situation occurs with a coremark function:
  void matrix_add_const(ee_u32 N, MATDAT *A, MATDAT val) {
   ee_u32 i,j;
   for (i=0; i<N; i++) {
     for (j=0; j<N; j++) {
 A[i*N+j] += val;
}
   }
  }
 
  It also says the inner loop might be infinite, but it can't be: N is given as an
 argument, j starts at 0, and j is unsigned like N. This will loop N times. GCC
 cannot possibly assume that stores through array A will overwrite the value of N
 in the loop. This seems like something is wrong in alias analysis.
 
  --
  PMatos

