Re: jamais-vu can now ignore renumbering of source lines in dg output (Re: GCC Buildbot Update)

2018-01-29 Thread Paulo Matos


On 29/01/18 15:19, David Malcolm wrote:
>>
>> Hi,
>>
>> I am looking at this today and I noticed that having the source file
>> for
>> all recent GCC revisions is costly in terms of time (if we wish to
>> compress them) and space (for storage). I was instead thinking that
>> jv
>> could calculate the differences offline using pysvn and the old and
>> new
>> revision numbers.
> 
> Note that access to the source files is optional - jv doesn't need
> them, it just helps for the particular situation described above.
> 

I understand but it would be great to have line number filtering.

>> I have started implementing this in my port. Would you consider
>> merging it?
> 
> Sounds reasonable - though bear in mind that gcc might be switching to
> git at some point.
> 

Yes, I know... but... if we wait for that to happen to implement
something... :)

> Send a pull request (I've turned on travis CI on the github repository,
> so pull requests now automatically get tested on a bunch of different
> Python 3 versions).
> 

Sure.

-- 
Paulo Matos


Re: jamais-vu can now ignore renumbering of source lines in dg output (Re: GCC Buildbot Update)

2018-01-29 Thread David Malcolm
On Mon, 2018-01-29 at 14:55 +0100, Paulo Matos wrote:
> 
> On 24/01/18 20:20, David Malcolm wrote:
> > 
> > I've added a new feature to jamais-vu (as of
> > 77849e2809ca9a049d5683571e27ebe190977fa8): it can now ignore test
> > results that merely changed line number.  
> > 
> > For example, if the old .sum file has a:
> > 
> >   PASS: g++.dg/diagnostic/param-type-mismatch.C  -
> > std=gnu++11  (test for errors, line 106)
> > 
> > and the new .sum file has a:
> > 
> >   PASS: g++.dg/diagnostic/param-type-mismatch.C  -
> > std=gnu++11  (test for errors, line 103)
> > 
> > and diffing the source trees reveals that line 106 became line 103,
> > the
> > change won't be reported by "jv compare".
> > 
> > It also does it for dg-{begin|end}-multiline-output.
> > 
> > It will report them if the outcome changed (e.g. from PASS to
> > FAIL).
> > 
> > To do this filtering, jv needs access to the old and new source
> > trees,
> > so it can diff the pertinent source files, so "jv compare" has
> > gained
> > the optional arguments
> >   --old-source-path=
> > and
> >   --new-source-path=
> > See the example in the jv Makefile for more info.  If they're not
> > present, it should work as before (without being able to do the
> > above
> > filtering).
> 
> 
> Hi,
> 
> I am looking at this today and I noticed that having the source file
> for
> all recent GCC revisions is costly in terms of time (if we wish to
> compress them) and space (for storage). I was instead thinking that
> jv
> could calculate the differences offline using pysvn and the old and
> new
> revision numbers.

Note that access to the source files is optional - jv doesn't need
them, it just helps for the particular situation described above.

> I have started implementing this in my port. Would you consider
> merging it?

Sounds reasonable - though bear in mind that gcc might be switching to
git at some point.

Send a pull request (I've turned on travis CI on the github repository,
so pull requests now automatically get tested on a bunch of different
Python 3 versions).

Thanks
Dave


Re: jamais-vu can now ignore renumbering of source lines in dg output (Re: GCC Buildbot Update)

2018-01-29 Thread Paulo Matos


On 24/01/18 20:20, David Malcolm wrote:
> 
> I've added a new feature to jamais-vu (as of
> 77849e2809ca9a049d5683571e27ebe190977fa8): it can now ignore test
> results that merely changed line number.  
> 
> For example, if the old .sum file has a:
> 
>   PASS: g++.dg/diagnostic/param-type-mismatch.C  -std=gnu++11  (test for 
> errors, line 106)
> 
> and the new .sum file has a:
> 
>   PASS: g++.dg/diagnostic/param-type-mismatch.C  -std=gnu++11  (test for 
> errors, line 103)
> 
> and diffing the source trees reveals that line 106 became line 103, the
> change won't be reported by "jv compare".
> 
> It also does it for dg-{begin|end}-multiline-output.
> 
> It will report them if the outcome changed (e.g. from PASS to FAIL).
> 
> To do this filtering, jv needs access to the old and new source trees,
> so it can diff the pertinent source files, so "jv compare" has gained
> the optional arguments
>   --old-source-path=
> and
>   --new-source-path=
> See the example in the jv Makefile for more info.  If they're not
> present, it should work as before (without being able to do the above
> filtering).


Hi,

I am looking at this today and I noticed that having the source file for
all recent GCC revisions is costly in terms of time (if we wish to
compress them) and space (for storage). I was instead thinking that jv
could calculate the differences offline using pysvn and the old and new
revision numbers.

I have started implementing this in my port. Would you consider merging it?

-- 
Paulo Matos


Re: jamais-vu can now ignore renumbering of source lines in dg output (Re: GCC Buildbot Update)

2018-01-24 Thread Paulo Matos


On 24/01/18 20:20, David Malcolm wrote:
> 
> I've added a new feature to jamais-vu (as of
> 77849e2809ca9a049d5683571e27ebe190977fa8): it can now ignore test
> results that merely changed line number.  
> 
> For example, if the old .sum file has a:
> 
>   PASS: g++.dg/diagnostic/param-type-mismatch.C  -std=gnu++11  (test for 
> errors, line 106)
> 
> and the new .sum file has a:
> 
>   PASS: g++.dg/diagnostic/param-type-mismatch.C  -std=gnu++11  (test for 
> errors, line 103)
> 
> and diffing the source trees reveals that line 106 became line 103, the
> change won't be reported by "jv compare".
> 
> It also does it for dg-{begin|end}-multiline-output.
> 
> It will report them if the outcome changed (e.g. from PASS to FAIL).
> 
> To do this filtering, jv needs access to the old and new source trees,
> so it can diff the pertinent source files, so "jv compare" has gained
> the optional arguments
>   --old-source-path=
> and
>   --new-source-path=
> See the example in the jv Makefile for more info.  If they're not
> present, it should work as before (without being able to do the above
> filtering).
> 
> Is this something that the buildbot can use?
> 

Hi David,

Thanks for the amazing improvements.
I will take a look at them on Monday. I have a lot of work at the moment
so I decided to take 1/5 of my week (usually Monday) to work on buildbot
so I will definitely get it integrated on Monday and hopefully have
something to say afterwards.

Thanks for keeping me up-to-date with these changes.

-- 
Paulo Matos


jamais-vu can now ignore renumbering of source lines in dg output (Re: GCC Buildbot Update)

2018-01-24 Thread David Malcolm
On Sat, 2017-12-16 at 12:06 +0100, Paulo Matos wrote:
> 
> On 15/12/17 15:29, David Malcolm wrote:
> > On Fri, 2017-12-15 at 10:16 +0100, Paulo Matos wrote:
> > > 
> > > On 14/12/17 12:39, David Malcolm wrote:
> > 
> > [...]
> > 
> > > > It looks like you're capturing the textual output from "jv
> > > > compare"
> > > > and
> > > > using the exit code.  Would you prefer to import "jv" as a
> > > > python
> > > > module and use some kind of API?  Or a different output format?
> > > > 
> > > 
> > > Well, I am using a fork of it which I converted to Python3. Would
> > > you
> > > be
> > > open to convert yours to Python3? The reason I am doing this is
> > > because
> > > all other Python software I have and the buildbot use Python3.
> > 
> > Done.
> > 
> > I found and fixed some more bugs, also (introduced during my
> > refactoring, sigh...)
> > 
> 
> That's great. Thank you very much for this work.
> 
> > > I would also prefer to have some json format or something but
> > > when I
> > > looked at it, the software was just printing to stdout and I
> > > didn't
> > > want
> > > to spend too much time implementing it, so I thought parsing the
> > > output
> > > was just easier.
> > 
> > I can add JSON output (or whatever), but I need to get back to gcc
> > 8
> > work, so if the stdout output is good enough for now, let's defer
> > output changes.
> > 
> 
> Agree, for now I can use what I already have to read the output of
> jv.
> I think I can now delete my fork and just use upstream jv as a
> submodule.

I've added a new feature to jamais-vu (as of
77849e2809ca9a049d5683571e27ebe190977fa8): it can now ignore test
results that merely changed line number.  

For example, if the old .sum file has a:

  PASS: g++.dg/diagnostic/param-type-mismatch.C  -std=gnu++11  (test for 
errors, line 106)

and the new .sum file has a:

  PASS: g++.dg/diagnostic/param-type-mismatch.C  -std=gnu++11  (test for 
errors, line 103)

and diffing the source trees reveals that line 106 became line 103, the
change won't be reported by "jv compare".

It also does it for dg-{begin|end}-multiline-output.

It will report them if the outcome changed (e.g. from PASS to FAIL).

To do this filtering, jv needs access to the old and new source trees,
so it can diff the pertinent source files, so "jv compare" has gained
the optional arguments
  --old-source-path=
and
  --new-source-path=
See the example in the jv Makefile for more info.  If they're not
present, it should work as before (without being able to do the above
filtering).

Is this something that the buildbot can use?

Dave


Re: GCC Buildbot Update

2017-12-20 Thread Paulo Matos


On 20/12/17 12:48, James Greenhalgh wrote:
> On Wed, Dec 20, 2017 at 10:02:45AM +, Paulo Matos wrote:
>>
>>
>> On 20/12/17 10:51, Christophe Lyon wrote:
>>>
>>> The recent fix changed the Makefile and configure script in libatomic.
>>> I guess that if your incremental builds does not run configure, it's
>>> still using old Makefiles, and old options.
>>>
>>>
>> You're right. I guess incremental builds should always call configure,
>> just in case.
> 
> For my personal bisect scripts I try an incremental build, with a
> full rebuild as a fallback on failure.
> 
> That gives me the benefits of an incremental build most of the time (I
> don't have stats on how often) with an automated approach to keeping things
> going where there are issues.
> 
> Note that there are rare cases where depencies are missed in the toolchain
> and an incremental build will give you a toolchain with undefined
> behaviour, as one compilation unit takes a new definition of a
> struct/interface and the other sits on an outdated compile from the
> previous build.
> 
> I don't have a good way to detect these.
> 

That's definitely a shortcoming of incremental builds. Unfortunately we
cannot cope with full builds for each commit (even for incremental
builds we'll need an alternative soon). So I will implement the same
strategy of full build if incremental fails, I think.

With respect with regards to incremental builds with undefined behaviour
that probably means that dependencies are incorrectly calculated. It
would be great to sort these out. If we could detect that there are
issues with the incremental build we could then try to understand which
dependencies were not properly calculated. Just a guess, however
implementing this might take awhile and would obviously need a lot more
resources than we have available now.

-- 
Paulo Matos


Re: GCC Buildbot Update

2017-12-20 Thread James Greenhalgh
On Wed, Dec 20, 2017 at 10:02:45AM +, Paulo Matos wrote:
> 
> 
> On 20/12/17 10:51, Christophe Lyon wrote:
> > 
> > The recent fix changed the Makefile and configure script in libatomic.
> > I guess that if your incremental builds does not run configure, it's
> > still using old Makefiles, and old options.
> > 
> > 
> You're right. I guess incremental builds should always call configure,
> just in case.

For my personal bisect scripts I try an incremental build, with a
full rebuild as a fallback on failure.

That gives me the benefits of an incremental build most of the time (I
don't have stats on how often) with an automated approach to keeping things
going where there are issues.

Note that there are rare cases where depencies are missed in the toolchain
and an incremental build will give you a toolchain with undefined
behaviour, as one compilation unit takes a new definition of a
struct/interface and the other sits on an outdated compile from the
previous build.

I don't have a good way to detect these.

Thanks,
James



Re: GCC Buildbot Update

2017-12-20 Thread Christophe Lyon
On 20 December 2017 at 11:02, Paulo Matos  wrote:
>
>
> On 20/12/17 10:51, Christophe Lyon wrote:
>>
>> The recent fix changed the Makefile and configure script in libatomic.
>> I guess that if your incremental builds does not run configure, it's
>> still using old Makefiles, and old options.
>>
>>
> You're right. I guess incremental builds should always call configure,
> just in case.
>

Maybe, but this does not always work. Sometimes, I have to rm -rf $builddir


> Thanks,
> --
> Paulo Matos


Re: GCC Buildbot Update

2017-12-20 Thread Paulo Matos


On 20/12/17 10:51, Christophe Lyon wrote:
> 
> The recent fix changed the Makefile and configure script in libatomic.
> I guess that if your incremental builds does not run configure, it's
> still using old Makefiles, and old options.
> 
> 
You're right. I guess incremental builds should always call configure,
just in case.

Thanks,
-- 
Paulo Matos


Re: GCC Buildbot Update

2017-12-20 Thread Christophe Lyon
On 20 December 2017 at 09:31, Paulo Matos  wrote:
>
>
> On 15/12/17 10:21, Christophe Lyon wrote:
>> On 15 December 2017 at 10:19, Paulo Matos  wrote:
>>>
>>>
>>> On 14/12/17 21:32, Christophe Lyon wrote:
 Great, I thought the CF machines were reserved for developpers.
 Good news you could add builders on them.

>>>
>>> Oh. I have seen similar things happening on CF machines so I thought it
>>> was not a problem. I have never specifically asked for permission.
>>>
> pmatos@gcc115:~/gcc-8-20171203_BUILD$ as -march=armv8.1-a
> Assembler messages:
> Error: unknown architecture `armv8.1-a'
>
> Error: unrecognized option -march=armv8.1-a
>
> However, if I run the a compiler build manually with just:
>
> $ configure --disable-multilib
> $ nice -n 19 make -j4 all
>
> This compiles just fine. So I am at the moment attempting to investigate
> what might cause the difference between what buildbot does and what I do
> through ssh.
>
 I suspect you are hitting a bug introduced recently, and fixed by:
 https://gcc.gnu.org/ml/gcc-patches/2017-12/msg00434.html

>>>
>>> Wow, that's really useful. Thanks for letting me know.
>>>
>> And the patch was committed last night (r255659), so maybe your builds now 
>> work?
>>
>
> On some machines, in incremental builds I still seeing this:
> Assembler messages:
> Error: unknown architectural extension `lse'
> Error: unrecognized option -march=armv8-a+lse
> make[4]: *** [load_1_1_.lo] Error 1
> make[4]: *** Waiting for unfinished jobs
>
> Looks related... the only strange thing happening is that this doesn't
> happen in full builds.
>

The recent fix changed the Makefile and configure script in libatomic.
I guess that if your incremental builds does not run configure, it's
still using old Makefiles, and old options.


> --
> Paulo Matos


Re: GCC Buildbot Update

2017-12-20 Thread Paulo Matos


On 15/12/17 10:21, Christophe Lyon wrote:
> On 15 December 2017 at 10:19, Paulo Matos  wrote:
>>
>>
>> On 14/12/17 21:32, Christophe Lyon wrote:
>>> Great, I thought the CF machines were reserved for developpers.
>>> Good news you could add builders on them.
>>>
>>
>> Oh. I have seen similar things happening on CF machines so I thought it
>> was not a problem. I have never specifically asked for permission.
>>
 pmatos@gcc115:~/gcc-8-20171203_BUILD$ as -march=armv8.1-a
 Assembler messages:
 Error: unknown architecture `armv8.1-a'

 Error: unrecognized option -march=armv8.1-a

 However, if I run the a compiler build manually with just:

 $ configure --disable-multilib
 $ nice -n 19 make -j4 all

 This compiles just fine. So I am at the moment attempting to investigate
 what might cause the difference between what buildbot does and what I do
 through ssh.

>>> I suspect you are hitting a bug introduced recently, and fixed by:
>>> https://gcc.gnu.org/ml/gcc-patches/2017-12/msg00434.html
>>>
>>
>> Wow, that's really useful. Thanks for letting me know.
>>
> And the patch was committed last night (r255659), so maybe your builds now 
> work?
> 

On some machines, in incremental builds I still seeing this:
Assembler messages:
Error: unknown architectural extension `lse'
Error: unrecognized option -march=armv8-a+lse
make[4]: *** [load_1_1_.lo] Error 1
make[4]: *** Waiting for unfinished jobs

Looks related... the only strange thing happening is that this doesn't
happen in full builds.

-- 
Paulo Matos


Re: GCC Buildbot Update

2017-12-16 Thread Paulo Matos


On 15/12/17 18:05, Segher Boessenkool wrote:
> All the cfarm machines are shared resources.  Benchmarking on them will
> not work no matter what.  And being a shared resource means all users
> have to share and be mindful of others.
> 

Yes, we'll definitely need better machines for benchmarking. Something I
haven't thought of yet.

>> So it would be good if there was a strict separation of machines used
>> for bots and machines used by humans. In other words bots should only
>> run on dedicated machines.
> 
> The aarch64 builds should probably not use all of gcc113..gcc116.
>
> We do not have enough resources to dedicate machines to bots.
>

I have disabled gcc116.

Thanks,
-- 
Paulo Matos


Re: GCC Buildbot Update

2017-12-16 Thread Paulo Matos


On 15/12/17 15:29, David Malcolm wrote:
> On Fri, 2017-12-15 at 10:16 +0100, Paulo Matos wrote:
>>
>> On 14/12/17 12:39, David Malcolm wrote:
> 
> [...]
> 
>>> It looks like you're capturing the textual output from "jv compare"
>>> and
>>> using the exit code.  Would you prefer to import "jv" as a python
>>> module and use some kind of API?  Or a different output format?
>>>
>>
>> Well, I am using a fork of it which I converted to Python3. Would you
>> be
>> open to convert yours to Python3? The reason I am doing this is
>> because
>> all other Python software I have and the buildbot use Python3.
> 
> Done.
> 
> I found and fixed some more bugs, also (introduced during my
> refactoring, sigh...)
> 

That's great. Thank you very much for this work.

>> I would also prefer to have some json format or something but when I
>> looked at it, the software was just printing to stdout and I didn't
>> want
>> to spend too much time implementing it, so I thought parsing the
>> output
>> was just easier.
> 
> I can add JSON output (or whatever), but I need to get back to gcc 8
> work, so if the stdout output is good enough for now, let's defer
> output changes.
> 

Agree, for now I can use what I already have to read the output of jv.
I think I can now delete my fork and just use upstream jv as a submodule.

-- 
Paulo Matos


Re: GCC Buildbot Update

2017-12-15 Thread Segher Boessenkool
On Fri, Dec 15, 2017 at 08:42:18AM +0100, Markus Trippelsdorf wrote:
> On 2017.12.14 at 21:32 +0100, Christophe Lyon wrote:
> > On 14 December 2017 at 09:56, Paulo Matos  wrote:
> > > I got an email suggesting I add some aarch64 workers so I did:
> > > 4 workers from CF (gcc113, gcc114, gcc115 and gcc116);
> > >
> > Great, I thought the CF machines were reserved for developpers.
> > Good news you could add builders on them.
> 
> I don't think this is good news at all. 
> 
> Once a buildbot runs on a CF machine it immediately becomes impossible
> to do any meaningful measurement on that machine. That is mainly because
> of the random I/O (untar, rm -fr, etc.) of the bot. As a result variance
> goes to the roof and all measurements drown in noise.

Automated runs should not use an unreasonable amount of resources (and
neither should manual runs, but the bar for automated things lies much
lower, since they are more annoying).

All the cfarm machines are shared resources.  Benchmarking on them will
not work no matter what.  And being a shared resource means all users
have to share and be mindful of others.

> So it would be good if there was a strict separation of machines used
> for bots and machines used by humans. In other words bots should only
> run on dedicated machines.

The aarch64 builds should probably not use all of gcc113..gcc116.

We do not have enough resources to dedicate machines to bots.


Segher


Re: GCC Buildbot Update

2017-12-15 Thread David Malcolm
On Fri, 2017-12-15 at 10:16 +0100, Paulo Matos wrote:
> 
> On 14/12/17 12:39, David Malcolm wrote:

[...]

> > It looks like you're capturing the textual output from "jv compare"
> > and
> > using the exit code.  Would you prefer to import "jv" as a python
> > module and use some kind of API?  Or a different output format?
> > 
> 
> Well, I am using a fork of it which I converted to Python3. Would you
> be
> open to convert yours to Python3? The reason I am doing this is
> because
> all other Python software I have and the buildbot use Python3.

Done.

I found and fixed some more bugs, also (introduced during my
refactoring, sigh...)

> I would also prefer to have some json format or something but when I
> looked at it, the software was just printing to stdout and I didn't
> want
> to spend too much time implementing it, so I thought parsing the
> output
> was just easier.

I can add JSON output (or whatever), but I need to get back to gcc 8
work, so if the stdout output is good enough for now, let's defer
output changes.

> > If you file pull request(s) for the changes you've made in your
> > copy of
> > jamais-vu, I can take at look at merging them.
> > 
> 
> Happy to do so...
> Will merge your changes into my fork first then.
> 
> Kind regards,


Re: GCC Buildbot Update

2017-12-15 Thread Markus Trippelsdorf
On 2017.12.15 at 10:21 +0100, Paulo Matos wrote:
> 
> 
> On 15/12/17 08:42, Markus Trippelsdorf wrote:
> > 
> > I don't think this is good news at all. 
> > 
> 
> As I pointed out in a reply to Chris, I haven't seeked permission but I
> am pretty sure something similar runs in the CF machines from other
> projects.
> 
> The downside is that if we can't use the CF, I have no extra machines to
> run the buildbot on.
> 
> > Once a buildbot runs on a CF machine it immediately becomes impossible
> > to do any meaningful measurement on that machine. That is mainly because
> > of the random I/O (untar, rm -fr, etc.) of the bot. As a result variance
> > goes to the roof and all measurements drown in noise.
> > 
> > So it would be good if there was a strict separation of machines used
> > for bots and machines used by humans. In other words bots should only
> > run on dedicated machines.
> > 
> 
> I understand your concern though. Do you know who this issue could be
> raised with? FSF?

I think the best place would be the CF user mailing list
.
(All admins and users should be subscribed.)

-- 
Markus


Re: GCC Buildbot Update

2017-12-15 Thread Paulo Matos


On 15/12/17 10:21, Christophe Lyon wrote:
> And the patch was committed last night (r255659), so maybe your builds now 
> work?
> 

Forgot to mention that. Yes, it built!
https://gcc-buildbot.linki.tools/#/builders/5

-- 
Paulo Matos


Re: GCC Buildbot Update

2017-12-15 Thread Paulo Matos


On 15/12/17 08:42, Markus Trippelsdorf wrote:
> 
> I don't think this is good news at all. 
> 

As I pointed out in a reply to Chris, I haven't seeked permission but I
am pretty sure something similar runs in the CF machines from other
projects.

The downside is that if we can't use the CF, I have no extra machines to
run the buildbot on.

> Once a buildbot runs on a CF machine it immediately becomes impossible
> to do any meaningful measurement on that machine. That is mainly because
> of the random I/O (untar, rm -fr, etc.) of the bot. As a result variance
> goes to the roof and all measurements drown in noise.
> 
> So it would be good if there was a strict separation of machines used
> for bots and machines used by humans. In other words bots should only
> run on dedicated machines.
> 

I understand your concern though. Do you know who this issue could be
raised with? FSF?

-- 
Paulo Matos


Re: GCC Buildbot Update

2017-12-15 Thread Christophe Lyon
On 15 December 2017 at 10:19, Paulo Matos  wrote:
>
>
> On 14/12/17 21:32, Christophe Lyon wrote:
>> Great, I thought the CF machines were reserved for developpers.
>> Good news you could add builders on them.
>>
>
> Oh. I have seen similar things happening on CF machines so I thought it
> was not a problem. I have never specifically asked for permission.
>
>>> pmatos@gcc115:~/gcc-8-20171203_BUILD$ as -march=armv8.1-a
>>> Assembler messages:
>>> Error: unknown architecture `armv8.1-a'
>>>
>>> Error: unrecognized option -march=armv8.1-a
>>>
>>> However, if I run the a compiler build manually with just:
>>>
>>> $ configure --disable-multilib
>>> $ nice -n 19 make -j4 all
>>>
>>> This compiles just fine. So I am at the moment attempting to investigate
>>> what might cause the difference between what buildbot does and what I do
>>> through ssh.
>>>
>> I suspect you are hitting a bug introduced recently, and fixed by:
>> https://gcc.gnu.org/ml/gcc-patches/2017-12/msg00434.html
>>
>
> Wow, that's really useful. Thanks for letting me know.
>
And the patch was committed last night (r255659), so maybe your builds now work?

> --
> Paulo Matos


Re: GCC Buildbot Update

2017-12-15 Thread Paulo Matos


On 14/12/17 21:32, Christophe Lyon wrote:
> Great, I thought the CF machines were reserved for developpers.
> Good news you could add builders on them.
> 

Oh. I have seen similar things happening on CF machines so I thought it
was not a problem. I have never specifically asked for permission.

>> pmatos@gcc115:~/gcc-8-20171203_BUILD$ as -march=armv8.1-a
>> Assembler messages:
>> Error: unknown architecture `armv8.1-a'
>>
>> Error: unrecognized option -march=armv8.1-a
>>
>> However, if I run the a compiler build manually with just:
>>
>> $ configure --disable-multilib
>> $ nice -n 19 make -j4 all
>>
>> This compiles just fine. So I am at the moment attempting to investigate
>> what might cause the difference between what buildbot does and what I do
>> through ssh.
>>
> I suspect you are hitting a bug introduced recently, and fixed by:
> https://gcc.gnu.org/ml/gcc-patches/2017-12/msg00434.html
> 

Wow, that's really useful. Thanks for letting me know.

-- 
Paulo Matos


Re: GCC Buildbot Update

2017-12-15 Thread Paulo Matos


On 14/12/17 12:39, David Malcolm wrote:
> 
> Looking at some of the red blobs in e.g. the grid view there seem to be
> a few failures in the initial "update gcc trunk repo" step of the form:
> 
> svn: Working copy '.' locked
> svn: run 'svn cleanup' to remove locks (type 'svn help cleanup' for
> details)
> 

Yes, that's a big annoyance and a reason I have thought about moving to
using the git mirror, however that would probably bring other issues so
I am holding off. I need to add a reporter so that if it fails I am
notified by email and mobile phone.

This happens when there's a timeout from a server _during_ a
checkout/update (the svn repo unfortunately times out way too often). I
thought about doing an svn cleanup before each checkout but read it's
not good practice. If you have any suggestions on this please let me know.

> https://gcc-lnt.linki.tools/#/builders/3/builds/388/steps/0/logs/stdio
> 

Apologies, https://gcc-lnt.linki.tools is currently incorrectly
forwarding you to https://gcc-buildbot.linki.tools. I meant to have it
return an error until I open that up.

> Is there a bug-tracking location for the buildbot?
> Presumably:
>   https://github.com/LinkiTools/gcc-buildbot/issues
> ?
> 

That's correct.

> I actually found a serious bug in jamais-vu yesterday - it got confused
> by  multiple .sum lines for the same source line e.g. from multiple
> "dg-" directives that all specify a particular line).  For example,
> when testing one of my patches, of the 3 tests reporting as
>   "c-c++-common/pr83059.c  -std=c++11  (test for warnings, line 7)"
> one of the 3 PASS results became a FAIL.  jv correctly reported that
> new FAILs had occurred, but wouldn't identify them, and mistakenly
> reported that new PASSes has occurred also.
> 
> I've fixed that now; to do so I've done some refactoring and added a
> testsuite.
>

Perfect, thank you very much for this work.

> It looks like you're capturing the textual output from "jv compare" and
> using the exit code.  Would you prefer to import "jv" as a python
> module and use some kind of API?  Or a different output format?
> 

Well, I am using a fork of it which I converted to Python3. Would you be
open to convert yours to Python3? The reason I am doing this is because
all other Python software I have and the buildbot use Python3.

I would also prefer to have some json format or something but when I
looked at it, the software was just printing to stdout and I didn't want
to spend too much time implementing it, so I thought parsing the output
was just easier.

> If you file pull request(s) for the changes you've made in your copy of
> jamais-vu, I can take at look at merging them.
>

Happy to do so...
Will merge your changes into my fork first then.

Kind regards,
-- 
Paulo Matos


Re: GCC Buildbot Update

2017-12-14 Thread Markus Trippelsdorf
On 2017.12.14 at 21:32 +0100, Christophe Lyon wrote:
> On 14 December 2017 at 09:56, Paulo Matos  wrote:
> > I got an email suggesting I add some aarch64 workers so I did:
> > 4 workers from CF (gcc113, gcc114, gcc115 and gcc116);
> >
> Great, I thought the CF machines were reserved for developpers.
> Good news you could add builders on them.

I don't think this is good news at all. 

Once a buildbot runs on a CF machine it immediately becomes impossible
to do any meaningful measurement on that machine. That is mainly because
of the random I/O (untar, rm -fr, etc.) of the bot. As a result variance
goes to the roof and all measurements drown in noise.

So it would be good if there was a strict separation of machines used
for bots and machines used by humans. In other words bots should only
run on dedicated machines.

-- 
Markus


Re: GCC Buildbot Update

2017-12-14 Thread Christophe Lyon
On 14 December 2017 at 09:56, Paulo Matos  wrote:
> Hello,
>
> Apologies for the delay on the update. It was my plan to do an update on
> a monthly basis but it slipped by a couple of weeks.
>
Hi,

Thanks for the update!


> The current status is:
>
> *Workers:*
>
> - x86_64
>
> 2 workers from CF (gcc16 and gcc20) up and running;
> 1 worker from my farm (jupiter-F26) up and running;
>
> 2 broken CF (gcc75 and gcc76) - the reason for the brokenness is that
> the machines work well but all outgoing ports except the git port is
> open (9418 if not mistaken). This means that not only we cannot svn co
> gcc but we can't connect a worker to the master through port 9918. I
> have contacted the cf admin but the reply was that nothing can be done
> as they don't really own the machine. They seemed to have relayed the
> request to the machine owners.
>
> - aarch64
>
> I got an email suggesting I add some aarch64 workers so I did:
> 4 workers from CF (gcc113, gcc114, gcc115 and gcc116);
>
Great, I thought the CF machines were reserved for developpers.
Good news you could add builders on them.

> *Builds:*
>
> As before we have the full build and the incremental build. Both enabled
> for x86_64 and aarch64, except they are currently failing for aarch64
> (more on that later).
>
> The full build is triggered on Daily bump commit and the incremental
> build is triggered for each commit.
>
> The problem with this setup is that the incremental builder takes too
> long to run the tests. Around 1h30m on CF machines for x86_64.
>
> Segher Boessenkool sent me a patch to disable guality and prettyprinters
> which coupled with --disable-gomp at configure time was supposed to make
> things much faster. I have added this as the Fast builder, except this
> is failing during the test runs:
> unable to alloc 389376 bytes
> /bin/bash: line 21: 32472 Aborted `if [ -f
> ${srcdir}/../dejagnu/runtest ] ; then echo ${srcdir}/../dejagnu/runtest
> ; else echo runtest; fi` --tool gcc
> /bin/bash: fork: Cannot allocate memory
> make[3]: [check-parallel-gcc] Error 254 (ignored)
> make[3]: execvp: /bin/bash: Cannot allocate memory
> make[3]: [check-parallel-gcc_1] Error 127 (ignored)
> make[3]: execvp: /bin/bash: Cannot allocate memory
> make[3]: [check-parallel-gcc_1] Error 127 (ignored)
> make[3]: execvp: /bin/bash: Cannot allocate memory
> make[3]: *** [check-parallel-gcc_1] Error 127
>
>
> However, something interesting is happening here since the munin
> interface for gcc16 doesn't show the machine running out of memory:
> https://cfarm.tetaneutral.net/munin/gccfarm/gcc16/memory.html
> (something confirmed by the cf admins)
>
> The aarch64 build is failing as mentioned earlier. If you check the logs:
> https://gcc-buildbot.linki.tools/#/builders/5/builds/10
> the problem seems to be the assembler issuing:
> Assembler messages:
> Error: unknown architecture `armv8.1-a'
> Error: unrecognized option -march=armv8.1-a
>
>
> If I go to the machines and check the versions I get:
> pmatos@gcc115:~/gcc-8-20171203_BUILD$ as --version
> GNU assembler (GNU Binutils for Ubuntu) 2.24
> Copyright 2013 Free Software Foundation, Inc.
> This program is free software; you may redistribute it under the terms of
> the GNU General Public License version 3 or later.
> This program has absolutely no warranty.
> This assembler was configured for a target of `aarch64-linux-gnu'.
>
> pmatos@gcc115:~/gcc-8-20171203_BUILD$ gcc --version
> gcc (Ubuntu/Linaro 4.8.4-2ubuntu1~14.04.3) 4.8.4
> Copyright (C) 2013 Free Software Foundation, Inc.
> This is free software; see the source for copying conditions.  There is NO
> warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.
>
> pmatos@gcc115:~/gcc-8-20171203_BUILD$ as -march=armv8.1-a
> Assembler messages:
> Error: unknown architecture `armv8.1-a'
>
> Error: unrecognized option -march=armv8.1-a
>
> However, if I run the a compiler build manually with just:
>
> $ configure --disable-multilib
> $ nice -n 19 make -j4 all
>
> This compiles just fine. So I am at the moment attempting to investigate
> what might cause the difference between what buildbot does and what I do
> through ssh.
>
I suspect you are hitting a bug introduced recently, and fixed by:
https://gcc.gnu.org/ml/gcc-patches/2017-12/msg00434.html

> *Reporters:*
>
> There is a single reporter which is a irc bot currently silent.
>
> *Regression analysis:*
>
> This is one of the most important issues to tackle and I have a solution
> in a branch regression-testing :
> https://github.com/LinkiTools/gcc-buildbot/tree/regression-testing
>
> using jamais-vu from David Malco

Re: GCC Buildbot Update

2017-12-14 Thread David Malcolm
On Thu, 2017-12-14 at 09:56 +0100, Paulo Matos wrote:
> Hello,
> 
> Apologies for the delay on the update. It was my plan to do an update
> on
> a monthly basis but it slipped by a couple of weeks.

Thanks for working on this.

> The current status is:
> 
> *Workers:*

[...snip...]

> *Builds:*

[...snip...]

Looking at some of the red blobs in e.g. the grid view there seem to be
a few failures in the initial "update gcc trunk repo" step of the form:

svn: Working copy '.' locked
svn: run 'svn cleanup' to remove locks (type 'svn help cleanup' for
details)

https://gcc-lnt.linki.tools/#/builders/3/builds/388/steps/0/logs/stdio

Is there a bug-tracking location for the buildbot?
Presumably:
  https://github.com/LinkiTools/gcc-buildbot/issues
?

*Reporters:*
> 
> There is a single reporter which is a irc bot currently silent.
> 
> *Regression analysis:*
> 
> This is one of the most important issues to tackle and I have a
> solution
> in a branch regression-testing :
> https://github.com/LinkiTools/gcc-buildbot/tree/regression-testing
> 
> using jamais-vu from David Malcolm to analyze the regressions.
> It needs some more testing and I should be able to get it working
> still
> this year.

I actually found a serious bug in jamais-vu yesterday - it got confused
by  multiple .sum lines for the same source line e.g. from multiple
"dg-" directives that all specify a particular line).  For example,
when testing one of my patches, of the 3 tests reporting as
  "c-c++-common/pr83059.c  -std=c++11  (test for warnings, line 7)"
one of the 3 PASS results became a FAIL.  jv correctly reported that
new FAILs had occurred, but wouldn't identify them, and mistakenly
reported that new PASSes has occurred also.

I've fixed that now; to do so I've done some refactoring and added a
testsuite.

It looks like you're capturing the textual output from "jv compare" and
using the exit code.  Would you prefer to import "jv" as a python
module and use some kind of API?  Or a different output format?

If you file pull request(s) for the changes you've made in your copy of
jamais-vu, I can take at look at merging them.

[...]

> I hope to send another update in about a months time.
> 
> Kind regards,

Thanks again for your work on this
Dave


GCC Buildbot Update

2017-12-14 Thread Paulo Matos
Hello,

Apologies for the delay on the update. It was my plan to do an update on
a monthly basis but it slipped by a couple of weeks.

The current status is:

*Workers:*

- x86_64

2 workers from CF (gcc16 and gcc20) up and running;
1 worker from my farm (jupiter-F26) up and running;

2 broken CF (gcc75 and gcc76) - the reason for the brokenness is that
the machines work well but all outgoing ports except the git port is
open (9418 if not mistaken). This means that not only we cannot svn co
gcc but we can't connect a worker to the master through port 9918. I
have contacted the cf admin but the reply was that nothing can be done
as they don't really own the machine. They seemed to have relayed the
request to the machine owners.

- aarch64

I got an email suggesting I add some aarch64 workers so I did:
4 workers from CF (gcc113, gcc114, gcc115 and gcc116);

*Builds:*

As before we have the full build and the incremental build. Both enabled
for x86_64 and aarch64, except they are currently failing for aarch64
(more on that later).

The full build is triggered on Daily bump commit and the incremental
build is triggered for each commit.

The problem with this setup is that the incremental builder takes too
long to run the tests. Around 1h30m on CF machines for x86_64.

Segher Boessenkool sent me a patch to disable guality and prettyprinters
which coupled with --disable-gomp at configure time was supposed to make
things much faster. I have added this as the Fast builder, except this
is failing during the test runs:
unable to alloc 389376 bytes
/bin/bash: line 21: 32472 Aborted `if [ -f
${srcdir}/../dejagnu/runtest ] ; then echo ${srcdir}/../dejagnu/runtest
; else echo runtest; fi` --tool gcc
/bin/bash: fork: Cannot allocate memory
make[3]: [check-parallel-gcc] Error 254 (ignored)
make[3]: execvp: /bin/bash: Cannot allocate memory
make[3]: [check-parallel-gcc_1] Error 127 (ignored)
make[3]: execvp: /bin/bash: Cannot allocate memory
make[3]: [check-parallel-gcc_1] Error 127 (ignored)
make[3]: execvp: /bin/bash: Cannot allocate memory
make[3]: *** [check-parallel-gcc_1] Error 127


However, something interesting is happening here since the munin
interface for gcc16 doesn't show the machine running out of memory:
https://cfarm.tetaneutral.net/munin/gccfarm/gcc16/memory.html
(something confirmed by the cf admins)

The aarch64 build is failing as mentioned earlier. If you check the logs:
https://gcc-buildbot.linki.tools/#/builders/5/builds/10
the problem seems to be the assembler issuing:
Assembler messages:
Error: unknown architecture `armv8.1-a'
Error: unrecognized option -march=armv8.1-a


If I go to the machines and check the versions I get:
pmatos@gcc115:~/gcc-8-20171203_BUILD$ as --version
GNU assembler (GNU Binutils for Ubuntu) 2.24
Copyright 2013 Free Software Foundation, Inc.
This program is free software; you may redistribute it under the terms of
the GNU General Public License version 3 or later.
This program has absolutely no warranty.
This assembler was configured for a target of `aarch64-linux-gnu'.

pmatos@gcc115:~/gcc-8-20171203_BUILD$ gcc --version
gcc (Ubuntu/Linaro 4.8.4-2ubuntu1~14.04.3) 4.8.4
Copyright (C) 2013 Free Software Foundation, Inc.
This is free software; see the source for copying conditions.  There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.

pmatos@gcc115:~/gcc-8-20171203_BUILD$ as -march=armv8.1-a
Assembler messages:
Error: unknown architecture `armv8.1-a'

Error: unrecognized option -march=armv8.1-a

However, if I run the a compiler build manually with just:

$ configure --disable-multilib
$ nice -n 19 make -j4 all

This compiles just fine. So I am at the moment attempting to investigate
what might cause the difference between what buildbot does and what I do
through ssh.

*Reporters:*

There is a single reporter which is a irc bot currently silent.

*Regression analysis:*

This is one of the most important issues to tackle and I have a solution
in a branch regression-testing :
https://github.com/LinkiTools/gcc-buildbot/tree/regression-testing

using jamais-vu from David Malcolm to analyze the regressions.
It needs some more testing and I should be able to get it working still
this year.

*LNT:*

I had mentioned I wanted to setup an interface which would allow for
easy visibility of test failures, time taken to build/test, etc.
Initially I thought a stack of influx+grafana would be a good idea, but
was pointed out to using LNT as presented by James Greenhalgh in the GNU
Cauldron. I have setup LNT (soon to be available under
https://gcc-lnt.linki.tools) and contacted James to learn more about the
setup. As it turns out James is just using it for benchmarking results
and out-of-the-box only seems to support the LLVM testing infrastructure
so getting GCC results in there might take a bit more of scripting and
plumbing.

I will probably take the same route and set it up first for the
benchmarking resul

Re: GCC Buildbot Update - Definition of regression

2017-10-13 Thread David Malcolm
On Wed, 2017-10-11 at 16:17 +0200, Marc Glisse wrote:
> On Wed, 11 Oct 2017, David Malcolm wrote:
> 
> > On Wed, 2017-10-11 at 11:18 +0200, Paulo Matos wrote:
> > > 
> > > On 11/10/17 11:15, Christophe Lyon wrote:
> > > > 
> > > > You can have a look at
> > > > https://git.linaro.org/toolchain/gcc-compare-results.git/
> > > > where compare_tests is a patched version of the contrib/
> > > > script,
> > > > it calls the main perl script (which is not the prettiest thing
> > > > :-)
> > > > 
> > > 
> > > Thanks, that's useful. I will take a look.
> > 
> > You may also want to look at this script I wrote:
> > 
> >  https://github.com/davidmalcolm/jamais-vu
> > 
> > (it has Python classes for working with DejaGnu output)
> 
> By the way, David, how do you handle comparisons for the jit
> testsuite? jv 
> gives
> 
> Tests that went away in build/gcc/testsuite/jit/jit.sum: 81
> ---
> 
>   PASS:  t
>   PASS:  test-
>   PASS:  test-arith-overflow.c
>   PASS:  test-arith-overflow.c.exe iteration 1 of 5: verify_uint_over
>   PASS:  test-arith-overflow.c.exe iteration 2 of 5: verify_uint_o
>   PASS:  test-arith-overflow.c.exe iteration 3 of 5: verify
> [...]
> 
> Tests appeared in build/gcc/testsuite/jit/jit.sum: 78
> -
> 
>   PASS:  test-arith-overflow.c.exe iteration 1
>   PASS:  test-arith-overflow.c.exe iteration 2 of
>   PASS:  test-arith-overflow.c.exe iteration 4 of 5: verify_u
>   PASS:  test-combination.
>   PASS:  test-combination.c.exe it
> [...]
> 
> The issue is more likely in the testsuite, but I assume you have a 
> workflow that allows working around the issue?

I believe the issue here is PR jit/69435 ("Truncated lines in
jit.log").
I suspect that the attachment in comment #2 there ought to fix it
(sorry that this issue stalled; in the meantime I've been simplying
verifying the absence of FAILs and checking the number of PASSes in
jit.sum).

Dave


Re: GCC Buildbot Update - Definition of regression

2017-10-11 Thread Hans-Peter Nilsson
On Tue, 10 Oct 2017, Paulo Matos wrote:

> This is a suggestion. I am keen to have corrections from people who use
> this on a daily basis and/or have a better understanding of each status.

Not mentioning them (oddly I don't see anyone mentioning them)
makes me think you've not looked there so allow me to point out:
consider re-using Geoff Keating's regression tester scripts.
They're all in your nearest gcc checkout, in contrib/regression.
I suggest using whatever definition those scripts define.
They've worked for my regression testing (though my local
automated tester is not active at the moment).  Just remember to
always use the option --add-passes-despite-regression or else
btest-gcc.sh requires a clean bill before adding new PASSes to
the list of PASSing tests considered for regression.  (A clean
bill happens too rarely for non-primary targets, for long times,
for reasons beyond port maintainer powers.)

Also, you may have to fight release maintainers for the
"regression" definition.  Previous arguments have been along the
line of "it's not a regression if there hasn't been a release
with the test for that functionality passing".

brgds, H-P


Re: GCC Buildbot Update - Definition of regression

2017-10-11 Thread Joseph Myers
On Wed, 11 Oct 2017, Martin Sebor wrote:

> I don't have a strong opinion on the definition of a Regression
> in this context but I would very much like to see status changes
> highlighted in the test results to indicate that something that

There are lots of things that are useful *if* you have someone actively 
reviewing them for every test run (maybe several times a day) and alerting 
people / filing bugs in Bugzilla if there are problems.  Some of those 
things, however, are likely to have too many false positives for a display 
people can quickly look at to see if the build is red or green, or for 
automatically telling people their patch broke something.

If we can clean up results for each system the bot runs tests on - 
XFAILing and filing bugs in Bugzilla for failures where there isn't a 
reasonably simple and obvious fix - we can make green mean "no FAILs or 
ERRORs" (remembering the possibility that with very broken testing, 
sometimes an ERROR might only be in the .log not the .sum).  Other 
differences (such as PASS -> UNSUPPORTED) can then be reviewed manually by 
someone who takes responsibility for doing so, resulting in bugs being 
filed if appropriate, without affecting the basic red/green status.

(Variants such as green meaning "no FAILs or ERRORs, except for failing 
guality tests where there should be no regressions" are possible as well, 
for cases like that where PASS/FAIL status depends on non-GCC components 
and meaningfully selective XFAILing is hard.)

-- 
Joseph S. Myers
jos...@codesourcery.com


Re: GCC Buildbot Update - Definition of regression

2017-10-11 Thread Andreas Schwab
On Okt 10 2017, Joseph Myers  wrote:

> Anything else -> FAIL and new FAILing tests aren't regressions at the 
> individual test level, but may be treated as such at the whole testsuite 
> level.

An ICE FAIL is a regression, but this is always a new test.

Andreas.

-- 
Andreas Schwab, SUSE Labs, sch...@suse.de
GPG Key fingerprint = 0196 BAD8 1CE9 1970 F4BE  1748 E4D4 88E3 0EEA B9D7
"And now for something completely different."


Re: GCC Buildbot Update - Definition of regression

2017-10-11 Thread Martin Sebor

PASS -> ANY ; Test moves away from PASS


No, only a regression if the destination result is FAIL (if it's
UNRESOLVED then there might be a separate regression - execution test
becoming UNRESOLVED should be accompanied by compilation becoming FAIL).
If it's XFAIL, it might formally be a regression, but one already being
tracked in another way (presumably Bugzilla) which should not turn the bot
red.  If it's XPASS, that simply means XFAILing conditions slightly wider
than necessary in order to mark failure in another configuration as
expected.

My suggestion is:

PASS -> FAIL is an unambiguous regression.

Anything else -> FAIL and new FAILing tests aren't regressions at the
individual test level, but may be treated as such at the whole testsuite
level.


I don't have a strong opinion on the definition of a Regression
in this context but I would very much like to see status changes
highlighted in the test results to indicate that something that
worked before no longer works as well, to help us spot the kinds
of problems I've run into and a had trouble with.  (Showing the
SVN revision number along with each transition would be great.)
Here are a couple of examples.

A recent change of mine caused a test in the target_supports.exp
file to fail to detect attribute ifunc support.  That in turn
prevented regression tests for the attribute from being compiled
(changed them from PASS to UNSUPPORTED) which ultimately masked
a bug my change had introduced.

My script that looks for regressions in my own test results would
normally catch this before I commit such a change.  Unfortunately,
the script ignores results with the UNSUPPORTED status, so this
bug slipped in unnoticed.

Regardless of whether or not these types of errors are considered
Regressions, highlighting them perhaps in different colors would
be helpful.


Any transition where the destination result is not FAIL is not a
regression.

ERRORs in the .sum or .log files should be watched out for as well,
however, as sometimes they may indicate broken Tcl syntax in the
testsuite, which may cause many tests not to be run.


Yes, please.  I had a problem happen with a test with a bad DejaGnu
directive.  The test failed in an non-obvious way (I think it caused
an ERROR in the log) which caused a small number of tests that ran
after it to fail.  Because of parallel make (I run tests with make
-j96) the failing tests changed from one run of the test suite to
the next and the whole problem ended up being quite hard to debug.
(The ultimate root cause was a stray backslash in a dj-warning
directive introduced by copying and pasting between an Emacs session
in one terminal and a via session in another.  The backslash was in
column 80 and so virtually impossible to see.)

Martin


Re: GCC Buildbot Update - Definition of regression

2017-10-11 Thread Marc Glisse

On Wed, 11 Oct 2017, David Malcolm wrote:


On Wed, 2017-10-11 at 11:18 +0200, Paulo Matos wrote:


On 11/10/17 11:15, Christophe Lyon wrote:


You can have a look at
https://git.linaro.org/toolchain/gcc-compare-results.git/
where compare_tests is a patched version of the contrib/ script,
it calls the main perl script (which is not the prettiest thing :-)



Thanks, that's useful. I will take a look.


You may also want to look at this script I wrote:

 https://github.com/davidmalcolm/jamais-vu

(it has Python classes for working with DejaGnu output)


By the way, David, how do you handle comparisons for the jit testsuite? jv 
gives


Tests that went away in build/gcc/testsuite/jit/jit.sum: 81
---

 PASS:  t
 PASS:  test-
 PASS:  test-arith-overflow.c
 PASS:  test-arith-overflow.c.exe iteration 1 of 5: verify_uint_over
 PASS:  test-arith-overflow.c.exe iteration 2 of 5: verify_uint_o
 PASS:  test-arith-overflow.c.exe iteration 3 of 5: verify
[...]

Tests appeared in build/gcc/testsuite/jit/jit.sum: 78
-

 PASS:  test-arith-overflow.c.exe iteration 1
 PASS:  test-arith-overflow.c.exe iteration 2 of
 PASS:  test-arith-overflow.c.exe iteration 4 of 5: verify_u
 PASS:  test-combination.
 PASS:  test-combination.c.exe it
[...]

The issue is more likely in the testsuite, but I assume you have a 
workflow that allows working around the issue?


--
Marc Glisse


Re: GCC Buildbot Update - Definition of regression

2017-10-11 Thread Joseph Myers
On Wed, 11 Oct 2017, Christophe Lyon wrote:

> * {PASS,UNSUPPORTED,UNTESTED,UNRESOLVED}-> XPASS

I don't think any of these should be considered regressions.  It's good if 
someone manually checks anything that's *consistently* XPASSing, to see if 
the XFAIL should be removed or restricted to narrower conditions, but if 
the result of a test has become any kind of pass, it cannot possibly be 
considered a regression.  (You might have a flaky test XFAILed because it 
passes or fails at random, though I think that random variation is more 
common for GDB than for GCC.)

-- 
Joseph S. Myers
jos...@codesourcery.com


Re: GCC Buildbot Update - Definition of regression

2017-10-11 Thread Joseph Myers
On Wed, 11 Oct 2017, Paulo Matos wrote:

> On 10/10/17 23:25, Joseph Myers wrote:
> > On Tue, 10 Oct 2017, Paulo Matos wrote:
> > 
> >> new test -> FAIL; New test starts as fail
> > 
> > No, that's not a regression, but you might want to treat it as one (in the 
> > sense that it's a regression at the higher level of "testsuite run should 
> > have no unexpected failures", even if the test in question would have 
> > failed all along if added earlier and so the underlying compiler bug, if 
> > any, is not a regression).  It should have human attention to classify it 
> > and either fix the test or XFAIL it (with issue filed in Bugzilla if a 
> > bug), but it's not a regression.  (Exception: where a test failing results 
> > in its name changing, e.g. through adding "(internal compiler error)".)
> > 
> 
> When someone adds a new test to the testsuite, isn't it supposed to not
> FAIL? If is does FAIL, shouldn't this be considered a regression?

Only a regression at the whole-testsuite level (in that "no FAILs" is the 
desired state).  Not a regression in the sense of a regression bug in GCC 
that might be relevant for release management (something user-visible that 
worked in a previous GCC version but no longer works).  And if e.g. 
someone added a dg-require-effective-target (for example) line to a 
testcase, so incrementing all the line numbers in that test, every PASS / 
FAIL assertion in that test will have its line number increase by 1, so 
being renamed, so resulting in spurious detection of a regression if you 
consider new FAILs as regressions (even at the whole-testsuite level, an 
increased line number on an existing FAIL is not meaningfully a 
regression).

> For this reason all of this issues need to be taken care straight away

Well, I think it *does* make sense to do sufficient analysis on existing 
FAILs to decide if they are testsuite issues or compiler bugs, fix if they 
are testsuite issues and XFAIL with reference to a bug in Bugzilla if 
compiler bugs.  That is, try to get to the point where no-FAILs is the 
normal expected testsuite state and it's Bugzilla, not 
expected-FAILs-not-marked-as-XFAIL, that is used to track regressions and 
other bugs.

> By not being unique, you mean between languages?

Yes (e.g. c-c++-common tests in both gcc and g++ tests might have the same 
name in both .sum files, but should still be counted as different tests).

> I assume that two gcc.sum from different builds will always refer to the
> same test/configuration when referring to (for example):
> PASS: gcc.c-torture/compile/2105-1.c   -O1  (test for excess errors)

The problem is when e.g. multiple diagnostics are being tested for on the 
same line but the "test name" field in the dg-* directive is an empty 
string for all of them.  One possible approach is to automatically (in 
your regression checking scripts) append a serial number to the first, 
second, third etc. cases of any given repeated test name in a .sum file.  
Or you could count such duplicates as being errors that automatically 
result in red test results, and get fixes for them into GCC as soon as 
possible.

-- 
Joseph S. Myers
jos...@codesourcery.com


Re: GCC Buildbot Update - Definition of regression

2017-10-11 Thread David Malcolm
On Wed, 2017-10-11 at 11:18 +0200, Paulo Matos wrote:
> 
> On 11/10/17 11:15, Christophe Lyon wrote:
> > 
> > You can have a look at
> > https://git.linaro.org/toolchain/gcc-compare-results.git/
> > where compare_tests is a patched version of the contrib/ script,
> > it calls the main perl script (which is not the prettiest thing :-)
> > 
> 
> Thanks, that's useful. I will take a look.

You may also want to look at this script I wrote:

  https://github.com/davidmalcolm/jamais-vu

(it has Python classes for working with DejaGnu output)

Dave


Re: GCC Buildbot Update - Definition of regression

2017-10-11 Thread Jonathan Wakely
On 11 October 2017 at 07:34, Paulo Matos wrote:
> When someone adds a new test to the testsuite, isn't it supposed to not
> FAIL?

Yes, but sometimes it FAILs because the test is using a new feature
that only works on some targets, and the new test was missing the
right directives to make it UNSUPPORTED on other targets.

> If it does FAIL, shouldn't this be considered a regression?

No, it's not a regression, because it's not something that used to
work and now fails.

Maybe it should still be flagged as red, but it's not strictly a
regression. I would call it a "new failure" rather than regression.


Re: GCC Buildbot Update - Definition of regression

2017-10-11 Thread Paulo Matos


On 11/10/17 11:15, Christophe Lyon wrote:
> 
> You can have a look at
> https://git.linaro.org/toolchain/gcc-compare-results.git/
> where compare_tests is a patched version of the contrib/ script,
> it calls the main perl script (which is not the prettiest thing :-)
> 

Thanks, that's useful. I will take a look.

-- 
Paulo Matos


Re: GCC Buildbot Update - Definition of regression

2017-10-11 Thread Christophe Lyon
On 11 October 2017 at 11:03, Paulo Matos  wrote:
>
>
> On 11/10/17 10:35, Christophe Lyon wrote:
>>
>> FWIW, we consider regressions:
>> * any->FAIL because we don't want such a regression at the whole testsuite 
>> level
>> * any->UNRESOLVED for the same reason
>> * {PASS,UNSUPPORTED,UNTESTED,UNRESOLVED}-> XPASS
>> * new XPASS
>> * XFAIL disappears (may mean that a testcase was removed, worth a manual 
>> check)
>> * ERRORS
>>
>
> That's certainly stricter than what was proposed by Joseph. I will
> run a few tests on historical data to see what I get using both approaches.
>
>>
>>
 ERRORs in the .sum or .log files should be watched out for as well,
 however, as sometimes they may indicate broken Tcl syntax in the
 testsuite, which may cause many tests not to be run.

 Note that the test names that come after PASS:, FAIL: etc. aren't unique
 between different .sum files, so you need to associate tests with a tuple
 (.sum file, test name) (and even then, sometimes multiple tests in a .sum
 file have the same name, but that's a testsuite bug).  If you're using
 --target_board options that run tests for more than one multilib in the
 same testsuite run, add the multilib to that tuple as well.

>>>
>>> Thanks for all the comments. Sounds sensible.
>>> By not being unique, you mean between languages?
>> Yes, but not only, as Joseph mentioned above.
>>
>> You have the obvious example of c-c++-common/*san tests, which are
>> common to gcc and g++.
>>
>>> I assume that two gcc.sum from different builds will always refer to the
>>> same test/configuration when referring to (for example):
>>> PASS: gcc.c-torture/compile/2105-1.c   -O1  (test for excess errors)
>>>
>>> In this case, I assume that "gcc.c-torture/compile/2105-1.c   -O1
>>> (test for excess errors)" will always be referring to the same thing.
>>>
>> In gcc.sum, I can see 4 occurrences of
>> PASS: gcc.dg/Werror-13.c  (test for errors, line )
>>
>> Actually, there are quite a few others like that
>>
>
> That actually surprised me.
>
> I also see:
> PASS: gcc.dg/Werror-13.c  (test for errors, line )
> PASS: gcc.dg/Werror-13.c  (test for errors, line )
> PASS: gcc.dg/Werror-13.c  (test for errors, line )
> PASS: gcc.dg/Werror-13.c  (test for errors, line )
>
> among others like it. Looks like a line number is missing?
>
> In any case, it feels like the code I have to track this down needs to
> be improved.
>
We had to derive our scripts from the ones in contrib/ because these
failed to handle some cases (e.g. when the same test reports
both PASS and FAIL; yes, it does happen).

You can have a look at
https://git.linaro.org/toolchain/gcc-compare-results.git/
where compare_tests is a patched version of the contrib/ script,
it calls the main perl script (which is not the prettiest thing :-)

Christophe

> --
> Paulo Matos


Re: GCC Buildbot Update - Definition of regression

2017-10-11 Thread Paulo Matos


On 11/10/17 10:35, Christophe Lyon wrote:
> 
> FWIW, we consider regressions:
> * any->FAIL because we don't want such a regression at the whole testsuite 
> level
> * any->UNRESOLVED for the same reason
> * {PASS,UNSUPPORTED,UNTESTED,UNRESOLVED}-> XPASS
> * new XPASS
> * XFAIL disappears (may mean that a testcase was removed, worth a manual 
> check)
> * ERRORS
> 

That's certainly stricter than what was proposed by Joseph. I will
run a few tests on historical data to see what I get using both approaches.

> 
> 
>>> ERRORs in the .sum or .log files should be watched out for as well,
>>> however, as sometimes they may indicate broken Tcl syntax in the
>>> testsuite, which may cause many tests not to be run.
>>>
>>> Note that the test names that come after PASS:, FAIL: etc. aren't unique
>>> between different .sum files, so you need to associate tests with a tuple
>>> (.sum file, test name) (and even then, sometimes multiple tests in a .sum
>>> file have the same name, but that's a testsuite bug).  If you're using
>>> --target_board options that run tests for more than one multilib in the
>>> same testsuite run, add the multilib to that tuple as well.
>>>
>>
>> Thanks for all the comments. Sounds sensible.
>> By not being unique, you mean between languages?
> Yes, but not only, as Joseph mentioned above.
> 
> You have the obvious example of c-c++-common/*san tests, which are
> common to gcc and g++.
> 
>> I assume that two gcc.sum from different builds will always refer to the
>> same test/configuration when referring to (for example):
>> PASS: gcc.c-torture/compile/2105-1.c   -O1  (test for excess errors)
>>
>> In this case, I assume that "gcc.c-torture/compile/2105-1.c   -O1
>> (test for excess errors)" will always be referring to the same thing.
>>
> In gcc.sum, I can see 4 occurrences of
> PASS: gcc.dg/Werror-13.c  (test for errors, line )
> 
> Actually, there are quite a few others like that
> 

That actually surprised me.

I also see:
PASS: gcc.dg/Werror-13.c  (test for errors, line )
PASS: gcc.dg/Werror-13.c  (test for errors, line )
PASS: gcc.dg/Werror-13.c  (test for errors, line )
PASS: gcc.dg/Werror-13.c  (test for errors, line )

among others like it. Looks like a line number is missing?

In any case, it feels like the code I have to track this down needs to
be improved.

-- 
Paulo Matos


Re: GCC Buildbot Update - Definition of regression

2017-10-11 Thread Christophe Lyon
On 11 October 2017 at 08:34, Paulo Matos  wrote:
>
>
> On 10/10/17 23:25, Joseph Myers wrote:
>> On Tue, 10 Oct 2017, Paulo Matos wrote:
>>
>>> new test -> FAIL; New test starts as fail
>>
>> No, that's not a regression, but you might want to treat it as one (in the
>> sense that it's a regression at the higher level of "testsuite run should
>> have no unexpected failures", even if the test in question would have
>> failed all along if added earlier and so the underlying compiler bug, if
>> any, is not a regression).  It should have human attention to classify it
>> and either fix the test or XFAIL it (with issue filed in Bugzilla if a
>> bug), but it's not a regression.  (Exception: where a test failing results
>> in its name changing, e.g. through adding "(internal compiler error)".)
>>
>
> When someone adds a new test to the testsuite, isn't it supposed to not
> FAIL? If it does FAIL, shouldn't this be considered a regression?
>
> Now, the danger is that since regressions are comparisons with previous
> run something like this would happen:
>
> run1:
> ...
> FAIL: foo.c ; new test
> ...
>
> run1 fails because new test entered as a FAIL
>
> run2:
> ...
> FAIL: foo.c
> ...
>
> run2 succeeds because there are no changes.
>
> For this reason all of these issues need to be taken care of straight away
> or they become part of the 'normal' status and no more failures are
> issued... unless of course a more complex regression analysis is
> implemented.
>
Agreed.

> Also, when I say run1 fails or succeeds, this is just the term I use to
> display red/green in the buildbot interface for a given build, not
> necessarily what I expect the process will do.
>
>>
>> My suggestion is:
>>
>> PASS -> FAIL is an unambiguous regression.
>>
>> Anything else -> FAIL and new FAILing tests aren't regressions at the
>> individual test level, but may be treated as such at the whole testsuite
>> level.
>>
>> Any transition where the destination result is not FAIL is not a
>> regression.
>>

FWIW, we consider regressions:
* any->FAIL because we don't want such a regression at the whole testsuite level
* any->UNRESOLVED for the same reason
* {PASS,UNSUPPORTED,UNTESTED,UNRESOLVED}-> XPASS
* new XPASS
* XFAIL disappears (may mean that a testcase was removed, worth a manual check)
* ERRORS
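
For illustration, those criteria (apart from ERRORS, which are not per-test
transitions) could be sketched roughly like this in Python, with None standing
for "test absent in that run":

def is_regression(old, new):
    """Classify one old -> new status transition per the list above.
    old/new are DejaGnu statuses such as "PASS", "FAIL", "XPASS", ...
    or None when the test does not appear in that run."""
    if new in ("FAIL", "UNRESOLVED"):
        return old != new            # any -> FAIL, any -> UNRESOLVED
    if new == "XPASS":
        # {PASS,UNSUPPORTED,UNTESTED,UNRESOLVED} -> XPASS, or a new XPASS
        return old in (None, "PASS", "UNSUPPORTED", "UNTESTED", "UNRESOLVED")
    if new is None and old == "XFAIL":
        return True                  # an XFAIL disappeared: worth a manual check
    return False

# Applied to two {test: status} dictionaries old_results/new_results:
#   {t for t in old_results.keys() | new_results.keys()
#      if is_regression(old_results.get(t), new_results.get(t))}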



>> ERRORs in the .sum or .log files should be watched out for as well,
>> however, as sometimes they may indicate broken Tcl syntax in the
>> testsuite, which may cause many tests not to be run.
>>
>> Note that the test names that come after PASS:, FAIL: etc. aren't unique
>> between different .sum files, so you need to associate tests with a tuple
>> (.sum file, test name) (and even then, sometimes multiple tests in a .sum
>> file have the same name, but that's a testsuite bug).  If you're using
>> --target_board options that run tests for more than one multilib in the
>> same testsuite run, add the multilib to that tuple as well.
>>
>
> Thanks for all the comments. Sounds sensible.
> By not being unique, you mean between languages?
Yes, but not only, as Joseph mentioned above.

You have the obvious example of c-c++-common/*san tests, which are
common to gcc and g++.

> I assume that two gcc.sum from different builds will always refer to the
> same test/configuration when referring to (for example):
> PASS: gcc.c-torture/compile/2105-1.c   -O1  (test for excess errors)
>
> In this case, I assume that "gcc.c-torture/compile/2105-1.c   -O1
> (test for excess errors)" will always be referring to the same thing.
>
In gcc.sum, I can see 4 occurrences of
PASS: gcc.dg/Werror-13.c  (test for errors, line )

Actually, there are quite a few others like that

Christophe

> --
> Paulo Matos


Re: GCC Buildbot Update - Definition of regression

2017-10-10 Thread Markus Trippelsdorf
On 2017.10.11 at 08:22 +0200, Paulo Matos wrote:
> 
> 
> On 11/10/17 06:17, Markus Trippelsdorf wrote:
> > On 2017.10.10 at 21:45 +0200, Paulo Matos wrote:
> >> Hi all,
> >>
> >> It's almost 3 weeks since I last posted on GCC Buildbot. Here's an update:
> >>
> >> * 3 x86_64 workers from CF are now installed;
> >> * There's one scheduler for trunk doing fresh builds for every Daily bump;
> >> * One scheduler doing incremental builds for each active branch;
> >> * An IRC bot which is currently silent;
> > 
> > Using -j8 for the bot on an 8/16 (core/thread) machine like gcc67 is not
> > acceptable, because it will render it unusable for everybody else.
> 
> I was going to correct you on that given what I read in
> https://gcc.gnu.org/wiki/CompileFarm#Usage
> 
> but it was my mistake. I assumed that for an N-thread machine, I could
> use N/2 processes but the guide explicitly says N-core, not N-thread.
> Therefore I should be using 4 processes for gcc67 (or 0 given what follows).
> 
> I will also fix the number of processes used by the other workers.

Thanks. And while you are at it please set the niceness to 19.

> > Also gcc67 has a buggy Ryzen CPU that causes random gcc crashes. Not the
> > best setup for a regression tester...
> > 
> 
> Is that documented anywhere? I will remove this worker.

https://community.amd.com/thread/215773

-- 
Markus


Re: GCC Buildbot Update - Definition of regression

2017-10-10 Thread Paulo Matos


On 10/10/17 23:25, Joseph Myers wrote:
> On Tue, 10 Oct 2017, Paulo Matos wrote:
> 
>> new test -> FAIL; New test starts as fail
> 
> No, that's not a regression, but you might want to treat it as one (in the 
> sense that it's a regression at the higher level of "testsuite run should 
> have no unexpected failures", even if the test in question would have 
> failed all along if added earlier and so the underlying compiler bug, if 
> any, is not a regression).  It should have human attention to classify it 
> and either fix the test or XFAIL it (with issue filed in Bugzilla if a 
> bug), but it's not a regression.  (Exception: where a test failing results 
> in its name changing, e.g. through adding "(internal compiler error)".)
> 

When someone adds a new test to the testsuite, isn't it supposed to not
FAIL? If it does FAIL, shouldn't this be considered a regression?

Now, the danger is that since regressions are comparisons with previous
run something like this would happen:

run1:
...
FAIL: foo.c ; new test
...

run1 fails because new test entered as a FAIL

run2:
...
FAIL: foo.c
...

run2 succeeds because there are no changes.

For this reason all of these issues need to be taken care of straight away
or they become part of the 'normal' status and no more failures are
issued... unless of course a more complex regression analysis is
implemented.
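
One way to keep such a FAIL red until someone deals with it would be to compare
against a persisted baseline of known failures instead of only against the
previous run.  A minimal Python sketch (the file name and format are invented
for illustration):

import json

def compare_to_baseline(results, baseline_path="known-failures.json"):
    """results maps test name -> status for the current run.  A new FAIL
    stays flagged on every subsequent run until it is fixed or explicitly
    added to the baseline file by a human."""
    with open(baseline_path) as f:
        known_failures = set(json.load(f))
    failing_now = {t for t, s in results.items() if s == "FAIL"}
    new_failures = failing_now - known_failures
    fixed = known_failures - failing_now
    return new_failures, fixed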

Also, when I say run1 fails or succeeds, this is just the term I use to
display red/green in the buildbot interface for a given build, not
necessarily what I expect the process will do.

> 
> My suggestion is:
> 
> PASS -> FAIL is an unambiguous regression.
> 
> Anything else -> FAIL and new FAILing tests aren't regressions at the 
> individual test level, but may be treated as such at the whole testsuite 
> level.
> 
> Any transition where the destination result is not FAIL is not a 
> regression.
> 
> ERRORs in the .sum or .log files should be watched out for as well, 
> however, as sometimes they may indicate broken Tcl syntax in the 
> testsuite, which may cause many tests not to be run.
> 
> Note that the test names that come after PASS:, FAIL: etc. aren't unique 
> between different .sum files, so you need to associate tests with a tuple 
> (.sum file, test name) (and even then, sometimes multiple tests in a .sum 
> file have the same name, but that's a testsuite bug).  If you're using 
> --target_board options that run tests for more than one multilib in the 
> same testsuite run, add the multilib to that tuple as well.
> 

Thanks for all the comments. Sounds sensible.
By not being unique, you mean between languages?
I assume that two gcc.sum from different builds will always refer to the
same test/configuration when referring to (for example):
PASS: gcc.c-torture/compile/2105-1.c   -O1  (test for excess errors)

In this case, I assume that "gcc.c-torture/compile/2105-1.c   -O1
(test for excess errors)" will always be referring to the same thing.

-- 
Paulo Matos


Re: GCC Buildbot Update - Definition of regression

2017-10-10 Thread Paulo Matos


On 11/10/17 06:17, Markus Trippelsdorf wrote:
> On 2017.10.10 at 21:45 +0200, Paulo Matos wrote:
>> Hi all,
>>
>> It's almost 3 weeks since I last posted on GCC Buildbot. Here's an update:
>>
>> * 3 x86_64 workers from CF are now installed;
>> * There's one scheduler for trunk doing fresh builds for every Daily bump;
>> * One scheduler doing incremental builds for each active branch;
>> * An IRC bot which is currently silent;
> 
> Using -j8 for the bot on an 8/16 (core/thread) machine like gcc67 is not
> acceptable, because it will render it unusable for everybody else.

I was going to correct you on that given what I read in
https://gcc.gnu.org/wiki/CompileFarm#Usage

but it was my mistake. I assumed that for an N-thread machine, I could
use N/2 processes but the guide explicitly says N-core, not N-thread.
Therefore I should be using 4 processes for gcc67 (or 0 given what follows).

I will also fix the number of processes used by the other workers.

> Also gcc67 has a buggy Ryzen CPU that causes random gcc crashes. Not the
> best setup for a regression tester...
> 

Is that documented anywhere? I will remove this worker.

Thanks,

-- 
Paulo Matos


Re: GCC Buildbot Update - Definition of regression

2017-10-10 Thread Markus Trippelsdorf
On 2017.10.10 at 21:45 +0200, Paulo Matos wrote:
> Hi all,
> 
> It's almost 3 weeks since I last posted on GCC Buildbot. Here's an update:
> 
> * 3 x86_64 workers from CF are now installed;
> * There's one scheduler for trunk doing fresh builds for every Daily bump;
> * One scheduler doing incremental builds for each active branch;
> * An IRC bot which is currently silent;

Using -j8 for the bot on an 8/16 (core/thread) machine like gcc67 is not
acceptable, because it will render it unusable for everybody else.
Also gcc67 has a buggy Ryzen CPU that causes random gcc crashes. Not the
best setup for a regression tester...

-- 
Markus


Re: GCC Buildbot Update - Definition of regression

2017-10-10 Thread Joseph Myers
On Tue, 10 Oct 2017, Paulo Matos wrote:

> ANY -> no test  ; Test disappears

No, that's not a regression.  Simply adding a line to a testcase will 
change the line number that appears in the PASS / FAIL line for an 
individual assertion therein.  Or the names will change when e.g. 
-std=c++2a becomes -std=c++20 and all the tests with a C++ standard 
version in them change their names.  Or if a bogus test is removed.

> ANY / XPASS -> XPASS; Test goes from any status other than XPASS
> to XPASS
> ANY / KPASS -> KPASS; Test goes from any status other than KPASS
> to KPASS

No, that's not a regression.  It's inevitable that XFAILing conditions may 
sometimes be broader than ideal, if it's not possible to describe the 
exact failure conditions to the testsuite, and so sometimes a test may 
reasonably XPASS.  Such tests *may* sometimes be candidates for a more 
precise XFAIL condition, but they aren't regressions.

> new test -> FAIL; New test starts as fail

No, that's not a regression, but you might want to treat it as one (in the 
sense that it's a regression at the higher level of "testsuite run should 
have no unexpected failures", even if the test in question would have 
failed all along if added earlier and so the underlying compiler bug, if 
any, is not a regression).  It should have human attention to classify it 
and either fix the test or XFAIL it (with issue filed in Bugzilla if a 
bug), but it's not a regression.  (Exception: where a test failing results 
in its name changing, e.g. through adding "(internal compiler error)".)

> PASS -> ANY ; Test moves away from PASS

No, only a regression if the destination result is FAIL (if it's 
UNRESOLVED then there might be a separate regression - execution test 
becoming UNRESOLVED should be accompanied by compilation becoming FAIL).  
If it's XFAIL, it might formally be a regression, but one already being 
tracked in another way (presumably Bugzilla) which should not turn the bot 
red.  If it's XPASS, that simply means XFAILing conditions slightly wider 
than necessary in order to mark failure in another configuration as 
expected.

My suggestion is:

PASS -> FAIL is an unambiguous regression.

Anything else -> FAIL and new FAILing tests aren't regressions at the 
individual test level, but may be treated as such at the whole testsuite 
level.

Any transition where the destination result is not FAIL is not a 
regression.

ERRORs in the .sum or .log files should be watched out for as well, 
however, as sometimes they may indicate broken Tcl syntax in the 
testsuite, which may cause many tests not to be run.

Note that the test names that come after PASS:, FAIL: etc. aren't unique 
between different .sum files, so you need to associate tests with a tuple 
(.sum file, test name) (and even then, sometimes multiple tests in a .sum 
file have the same name, but that's a testsuite bug).  If you're using 
--target_board options that run tests for more than one multilib in the 
same testsuite run, add the multilib to that tuple as well.
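
For illustration, a rough Python sketch of that keying (the multilib tag is
supplied by the caller, and real .sum files have more structure than this
handles):

import re

_RESULT_RE = re.compile(
    r"^(PASS|FAIL|XPASS|XFAIL|KPASS|KFAIL|UNRESOLVED|UNSUPPORTED|UNTESTED): (.*)$")

def parse_sum(sum_path, multilib=""):
    """Map (sum file, multilib, test name) -> status for one .sum file, so
    that identically named tests in e.g. gcc.sum and g++.sum stay distinct.
    Duplicate names within one file still collide here; that is the separate
    testsuite issue discussed elsewhere in this thread."""
    results = {}
    with open(sum_path) as f:
        for line in f:
            m = _RESULT_RE.match(line)
            if m:
                status, name = m.groups()
                results[(sum_path, multilib, name.strip())] = status
    return results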

-- 
Joseph S. Myers
jos...@codesourcery.com


GCC Buildbot Update - Definition of regression

2017-10-10 Thread Paulo Matos
Hi all,

It's almost 3 weeks since I last posted on GCC Buildbot. Here's an update:

* 3 x86_64 workers from CF are now installed;
* There's one scheduler for trunk doing fresh builds for every Daily bump;
* One scheduler doing incremental builds for each active branch;
* An IRC bot which is currently silent;

The next steps are:
* Enable LNT (I have installed this but have yet to connect to buildbot)
for tracking performance benchmarks over time -- it should come up as
http://gcc-lnt.linki.tools in the near future.
* Enable regression analysis --- This is fundamental. I understand that
without this the buildbot is pretty useless so it has highest priority.
However, I would like some agreement as to what in GCC should be
considered a regression. Each test in DejaGnu can have one of several statuses:
FAIL, PASS, UNSUPPORTED, UNTESTED, XPASS, KPASS, XFAIL, KFAIL, UNRESOLVED

Since GCC doesn't have a 'clean bill' of test results we need to analyse
the sum files for the current run and compare with the last run of the
same branch. I have written down that if, for any test, there's a
transition that looks like the following, then a regression exists and
the test run should be marked as a failure.

ANY -> no test  ; Test disappears
ANY / XPASS -> XPASS; Test goes from any status other than XPASS
to XPASS
ANY / KPASS -> KPASS; Test goes from any status other than KPASS
to KPASS
new test -> FAIL; New test starts as fail
PASS -> ANY ; Test moves away from PASS

This is a suggestion. I am keen to have corrections from people who use
this on a daily basis and/or have a better understanding of each status.

As soon as we reach a consensus, I will deploy this analysis and enable
IRC bot to notify on the #gcc channel the results of the tests.

-- 
Paulo Matos


Re: GCC Buildbot

2017-09-26 Thread Paulo Matos


On 26/09/17 10:43, Martin Liška wrote:
> On 09/25/2017 02:49 PM, Paulo Matos wrote:
>> For benchmarks like Qt, blitz (as mentioned in the gcc testing page), we
>> can plot the build time of the benchmark and resulting size when
>> compiling for size.
>>
> 
> Please consider using LNT:
> http://llvm.org/docs/lnt/
> 
> Usage:
> http://lnt.llvm.org/
> 
> I've been investigating the tools and I know that ARM people use the tool:
> https://gcc.gnu.org/wiki/cauldron2017#ARM-perf
> 

Good suggestion. I was actually at the presentation. The reason I was
going with influx+grafana was because I know the process and already use
that internally --- the LNT configuration was unknown to me but you're
right. It might be better in the long term. I will look at the
documentation.

Thanks.

-- 
Paulo Matos


Re: GCC Buildbot

2017-09-26 Thread Martin Liška
On 09/25/2017 02:49 PM, Paulo Matos wrote:
> For benchmarks like Qt, blitz (as mentioned in the gcc testing page), we
> can plot the build time of the benchmark and resulting size when
> compiling for size.
> 

Please consider using LNT:
http://llvm.org/docs/lnt/

Usage:
http://lnt.llvm.org/

I've been investigating the tools and I know that ARM people use the tool:
https://gcc.gnu.org/wiki/cauldron2017#ARM-perf

Martin


Re: GCC Buildbot

2017-09-25 Thread Paulo Matos


On 25/09/17 13:36, Martin Liška wrote:
> 
> Would be great, what exactly do you want to visualize? For me, even having 
> green/red spots
> works fine in order to quickly identify what builds are wrong.
> 

There are several options and I think mostly it depends on what everyone
would like to see, but I am thinking of a dashboard with green/red
spots as you mention, which would depend not on the existence of failures
but on the existence of a regression at a certain revision. Also, a
historical graph of results and gcc build times might be interesting as
well.

For benchmarks like Qt, blitz (as mentioned in the gcc testing page), we
can plot the build time of the benchmark and resulting size when
compiling for size.

Again, I expect that once there's something visible and people are keen
to use it, they'll ask for something specific. However, once the
infrastructure is in place, it shouldn't be too hard to add specific
visualizations.

> 
> Hopefully both. I'm attaching my config file (probably more for inspiration 
> than a real use).
> I'll ask my manager whether we can find a machine that can run more complex 
> tests. I'll inform you.
> 

Thanks for the configuration file. I will take a look. Will eagerly wait
for news on the hardware request.

> 
> Yes, duplication in the sense that it is (or will be) testing the same things. I'm adding the author 
> of the tool,
> hopefully we can unify the effort (and resources of course).
> 

Great.

-- 
Paulo Matos


Re: GCC Buildbot

2017-09-25 Thread Paulo Matos


On 25/09/17 13:14, Jonathan Wakely wrote:
> On 25 September 2017 at 11:13, Paulo Matos wrote:
>>> Apart from that, I fully agree with octoploid that 
>>> http://toolchain.lug-owl.de/buildbot/ is duplicated effort which is running
>>> on GCC compile farm machines and uses shell scripts to drive it. I would 
>>> prefer to integrate it into Buildbot and utilize the same
>>> GCC Farm machines for native builds.
>>>
>>
>> Octoploid? Is that a typo?
> 
> No, it's Markus Trippelsdorf's username.
> 

Ah, thanks for the clarification.

-- 
Paulo Matos


Re: GCC Buildbot

2017-09-25 Thread Martin Liška
On 09/25/2017 12:13 PM, Paulo Matos wrote:
> 
> 
> On 25/09/17 11:52, Martin Liška wrote:
>> Hi Paulo.
>>
>> Thank you for working on that! To be honest, I've been running a local 
>> buildbot on
>> my desktop machine which does the builds your buildbot instance can do (please 
>> see:
>> https://pasteboard.co/GLZ0vLMu.png):
>>
> 
> Hi Martin,
> 
> Thanks for sharing your builders. Looks like you've got a good setup going.
> 
> I have done the very basic only since it was my interest to understand
> if people would find it useful. I didn't want to waste my time building
> something people have no interest to use.

Sure, nice kick off.

> 
> It seems there is some interest so I am gathering some requirements in
> the GitHub issues of the project. One very important feature is
> visualization of results, so I am integrating support for data gathering
> in influxdb to display using grafana. I do not work full time on this,
> so it's going slowly but I should have a dashboard to show in the next
> couple of weeks.

Would be great, what exactly do you want to visualize? For me, even having 
green/red spots
works fine in order to quickly identify what builds are wrong.

> 
>> - doing from time to time (once a week) sanitizer builds: ASAN, UBSAN and run 
>> test-suite
>> - doing profiled bootstrap, LTO bootstrap (yes, it has been broken for quite 
>> some time) and LTO profiled bootstrap
>> - building project with --enable-gather-detailed-mem-stats
>> - doing coverage --enable-coverage, running test-suite and uploading to a 
>> location: https://gcc.opensuse.org/gcc-lcov/
>> - similar for Doxygen: https://gcc.opensuse.org/gcc-doxygen/
>> - periodic building of some projects: Inkscape, GIMP, linux-kernel, Firefox 
>> - I do it with -O2, -O2+LTO, -O3, ...
>>   Would be definitely fine, but it takes some care to maintain compatible 
>> versions of a project and GCC compiler.
>>   Plus handling of dependencies of external libraries can be irritating.
>> - cross build for primary architectures
>>
>> That's a list of what I have and can be inspiration for you. I can help if you 
>> want and we can find reasonable resources
>> where this can be run.
>>
> 
> Thanks. That's great. As you can see from #9 in
> https://github.com/LinkiTools/gcc-buildbot/issues/9, most of the things
> I hope to be able to run in the CompileFarm unless, of course,
> people host a worker on their own hardware. Regarding your offer of
> resources: are you offering to merge your config or hardware? Either
> would be great, however I expect your config to have to be ported to
> buildbot nine before merging.

Hopefully both. I'm attaching my config file (probably more for inspiration 
than a real use).
I'll ask my manager whether we can find a machine that can run more complex 
tests. I'll inform you.

> 
>> Apart from that, I fully agree with octoploid that 
>> http://toolchain.lug-owl.de/buildbot/ is duplicated effort which is running
>> on GCC compile farm machines and uses shell scripts to drive it. I would 
>> prefer to integrate it into Buildbot and utilize the same
>> GCC Farm machines for native builds.
>>
> 
> Octoploid? Is that a typo?
> I discussed that at the Cauldron with David and was surprised to learn that
> the buildbot you reference is actually not a buildbot implementation
> using the Python framework but handwritten software. So, in that
> respect, it is not duplicated effort. It is duplicated effort if, on the
> other hand, we try to test the same things. I will try to understand how
> to merge efforts with that buildbot.

Yes, duplication in the sense that it is (or will be) testing the same things. I'm adding the author of 
the tool,
hopefully we can unify the effort (and resources of course).

Martin

> 
>> Another inspiration (for builds) can come from what LLVM folks do:
>> http://lab.llvm.org:8011/builders
>>
> 
> Thanks for the pointer. I at one point tried to read their
> configuration. However, I found the one by gdb simpler and used it as a
> basis for what I have. I will look at their builders nonetheless to
> understand what they build and how long they take.
> 
>> Anyway, what you did is a good starting point and I'm looking forward to 
>> more common use of the tool.
>> Martin
>>
> 
> Thanks,
> 

# -*- python -*-
# ex: set syntax=python:

# This is a sample buildmaster config file. It must be installed as
# 'master.cfg' in your buildmaster's base directory.

# This is the dictionary that the buildmaster pays attention to. We also use
# a shorter alias to save typing.
c = BuildmasterConfig = {}

from base64 import 

Re: GCC Buildbot

2017-09-25 Thread Jonathan Wakely
On 25 September 2017 at 11:13, Paulo Matos wrote:
>> Apart from that, I fully agree with octoploid that 
>> http://toolchain.lug-owl.de/buildbot/ is duplicated effort which is running
>> on GCC compile farm machines and uses a shell scripts to utilize. I would 
>> prefer to integrate it to Buildbot and utilize same
>> GCC Farm machines for native builds.
>>
>
> Octoploid? Is that a typo?

No, it's Markus Trippelsdorf's username.


Re: GCC Buildbot

2017-09-25 Thread Paulo Matos


On 25/09/17 11:52, Martin Liška wrote:
> Hi Paulo.
> 
> Thank you for working on that! To be honest, I've been running a local buildbot 
> on
> my desktop machine which does the builds your buildbot instance can do (please 
> see:
> https://pasteboard.co/GLZ0vLMu.png):
> 

Hi Martin,

Thanks for sharing your builders. Looks like you've got a good setup going.

I have done the very basic only since it was my interest to understand
if people would find it useful. I didn't want to waste my time building
something people have no interest to use.

It seems there is some interest so I am gathering some requirements in
the GitHub issues of the project. One very important feature is
visualization of results, so I am integrating support for data gathering
in influxdb to display using grafana. I do not work full time on this,
so it's going slowly but I should have a dashboard to show in the next
couple of weeks.

> - doing from time to time (once a week) sanitizer builds: ASAN, UBSAN and run 
> test-suite
> - doing profiled bootstrap, LTO bootstrap (yes, it has been broken for quite 
> some time) and LTO profiled bootstrap
> - building project with --enable-gather-detailed-mem-stats
> - doing coverage --enable-coverage, running test-suite and uploading to a 
> location: https://gcc.opensuse.org/gcc-lcov/
> - similar for Doxygen: https://gcc.opensuse.org/gcc-doxygen/
> - periodic building of some projects: Inkscape, GIMP, linux-kernel, Firefox - 
> I do it with -O2, -O2+LTO, -O3, ...
>   Would be definitely fine, but it takes some care to maintain compatible 
> versions of a project and GCC compiler.
>   Plus handling of dependencies of external libraries can be irritating.
> - cross build for primary architectures
>
> That's a list of what I have and can be inspiration for you. I can help if you 
> want and we can find reasonable resources
> where this can be run.
>

Thanks. That's great. As you can see from #9 in
https://github.com/LinkiTools/gcc-buildbot/issues/9, most of the things
I hope to be able to run in the CompileFarm unless, of course,
people host a worker on their own hardware. Regarding your offer of
resources: are you offering to merge your config or hardware? Either
would be great, however I expect your config to have to be ported to
buildbot nine before merging.

> Apart from that, I fully agree with octoploid that 
> http://toolchain.lug-owl.de/buildbot/ is duplicated effort which is running
> on GCC compile farm machines and uses shell scripts to drive it. I would 
> prefer to integrate it into Buildbot and utilize the same
> GCC Farm machines for native builds.
> 

Octoploid? Is that a typo?
I discussed that at the Cauldron with David and was surprised to learn that
the buildbot you reference is actually not a buildbot implementation
using the Python framework but handwritten software. So, in that
respect, it is not duplicated effort. It is duplicated effort if, on the
other hand, we try to test the same things. I will try to understand how
to merge efforts with that buildbot.

> Another inspiration (for builds) can come from what LLVM folks do:
> http://lab.llvm.org:8011/builders
> 

Thanks for the pointer. I at one point tried to read their
configuration. However, I found the one by gdb simpler and used it as a
basis for what I have. I will look at their builders nonetheless to
understand what they build and how long they take.

> Anyway, what you did is a good starting point and I'm looking forward to more 
> common use of the tool.
> Martin
> 

Thanks,
-- 
Paulo Matos


Re: GCC Buildbot

2017-09-25 Thread Martin Liška
Hi Paulo.

Thank you for working on that! To be honest, I've been running a local buildbot on
my desktop machine which does the builds your buildbot instance can do (please see:
https://pasteboard.co/GLZ0vLMu.png):

- doing from time to time (once a week) sanitizer builds: ASAN, UBSAN and run 
test-suite
- doing profiled bootstrap, LTO bootstrap (yes, it has been broken for quite 
some time) and LTO profiled bootstrap
- building project with --enable-gather-detailed-mem-stats
- doing coverage --enable-coverage, running test-suite and uploading to a 
location: https://gcc.opensuse.org/gcc-lcov/
- similar for Doxygen: https://gcc.opensuse.org/gcc-doxygen/
- periodic building of some projects: Inkscape, GIMP, linux-kernel, Firefox - I 
do it with -O2, -O2+LTO, -O3, ...
  Would be definitely fine, but it takes some care to maintain compatible 
versions of a project and GCC compiler.
  Plus handling of dependencies of external libraries can be irritating.
- cross build for primary architectures

That's a list of what I have and can be inspiration for you. I can help if you 
want and we can find reasonable resources
where this can be run.

Apart from that, I fully agree with octoploid that 
http://toolchain.lug-owl.de/buildbot/ is duplicated effort which is running
on GCC compile farm machines and uses shell scripts to drive it. I would 
prefer to integrate it into Buildbot and utilize the same
GCC Farm machines for native builds.

Another inspiration (for builds) can come from what LLVM folks do:
http://lab.llvm.org:8011/builders

Anyway, what you did is a good starting point and I'm looking forward to more 
common use of the tool.
Martin


Re: GCC Buildbot

2017-09-22 Thread Joseph Myers
On Fri, 22 Sep 2017, Paulo Matos wrote:

> > Note that even without a simulator (but with target libc), you can test 
> > just the compilation parts of the testsuite using a board file with a 
> > dummy _load implementation.
> > 
> 
> I was not aware of that. I will keep that in mind once I try to setup a
> cross-compilation worker.
> 
> I assume you have done this before. Do you have any scripts for
> cross-compiling you can share?

Here is a board file (by Janis Johnson) that stubs out test execution (you 
still need to be able to link, and for some bare-metal configurations 
linking may not work by default without specifying a linker script that 
chooses a BSP).

load_generic_config "sim"
set_board_info hostname "dummy-run"
set board_info(localhost,isremote) 0
process_multilib_options ""

# Override ${target}_load to do nothing.
foreach t $target_list {
    proc ${t}_load { args } {
        send_user "dummy load does nothing\n"
        return [list "unresolved" ""]
    }
}
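# Presumably such a board file would be saved as e.g. dummy-run.exp somewhere
# on DejaGnu's board search path and selected with --target_board=dummy-run;
# the exact setup may vary.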

glibc's build-many-glibcs.py shows the preferred modern approach for 
bootstrapping a cross toolchain for a target using glibc.  (However, the 
set of configurations it builds aims at covering glibc configuration 
space, especially as regards ABIs, and is probably more than is relevant 
for testing GCC for those architectures.)  Tools such as crosstool-ng have 
recipes for building cross toolchains that may also be useful to look at, 
but many such tools often do things for glibc builds that haven't been 
necessary for several years.

For bare metal you'll want newlib instead of glibc (configure that with 
the same --build --host --target as GCC, since it's the same top-level 
build system, and that build system deals internally with making "host" 
for configuring the newlib subdirectory the same as "target" was at top 
level).

-- 
Joseph S. Myers
jos...@codesourcery.com


Re: GCC Buildbot

2017-09-22 Thread Paulo Matos


On 22/09/17 01:23, Joseph Myers wrote:
> On Thu, 21 Sep 2017, Paulo Matos wrote:
> 
>> Interesting suggestion. I haven't had the opportunity to look at the
>> compile farm. However, it could be interesting to have a mix of workers:
>> native compile farm ones and some x86_64 doing cross compilation and
>> testing.
> 
> Note that even without a simulator (but with target libc), you can test 
> just the compilation parts of the testsuite using a board file with a 
> dummy _load implementation.
> 

I was not aware of that. I will keep that in mind once I try to setup a
cross-compilation worker.

I assume you have done this before. Do you have any scripts for
cross-compiling you can share?

Thanks,

-- 
Paulo Matos


Re: GCC Buildbot

2017-09-21 Thread Joseph Myers
On Thu, 21 Sep 2017, Paulo Matos wrote:

> Interesting suggestion. I haven't had the opportunity to look at the
> compile farm. However, it could be interesting to have a mix of workers:
> native compile farm ones and some x86_64 doing cross compilation and
> testing.

Note that even without a simulator (but with target libc), you can test 
just the compilation parts of the testsuite using a board file with a 
dummy _load implementation.

-- 
Joseph S. Myers
jos...@codesourcery.com


Re: GCC Buildbot

2017-09-21 Thread Joseph Myers
On Thu, 21 Sep 2017, Paulo Matos wrote:

> I totally agree that only if people get involved in checking if there
> were regressions and keeping an eye on what's going on are things going
> to improve. The framework can help a lot here by notifying the right
> people and the mailing list if something gets broken or if there are
> regressions but once the notification is sent someone certainly needs to
> pick it up.

A regression that isn't fixed quickly needs to end up as an appropriately 
regression-marked (subject, target milestone) bug in Bugzilla, as that's 
what's used to track regressions for release management purposes.  And as 
far as possible it should be one bug per logical issue, whether it causes 
one test FAIL or thousands.

Note: a test result

FAIL: foo

where there was no previous

PASS: foo

on that platform is not generally a regression, although it should still 
be fixed to keep test results clean (we want that 0-FAIL expected 
baseline...).  But it *can* be a regression if e.g.

PASS: foo

changed to

FAIL: foo (internal compiler error)

as it's not always the case that the test names - the text after "PASS: " 
or "FAIL: " - are properly invariant under changes in the results of the 
tests.
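
For illustration, a comparison script could handle that particular renaming by
also trying the lookup with the annotation stripped.  A rough Python sketch
(the regex is only approximate):

import re

_ICE_SUFFIX = re.compile(r"\s*\(internal compiler error[^)]*\)$")

def regressed_from_pass(fail_name, old_results):
    """True if a test that now FAILs was a PASS before, treating
    'FAIL: foo (internal compiler error)' as the same test as 'PASS: foo'.
    old_results maps test name -> status from the previous run."""
    if old_results.get(fail_name) == "PASS":
        return True
    stripped = _ICE_SUFFIX.sub("", fail_name)
    return stripped != fail_name and old_results.get(stripped) == "PASS"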

-- 
Joseph S. Myers
jos...@codesourcery.com


Re: GCC Buildbot

2017-09-21 Thread Paulo Matos


On 21/09/17 14:18, Christophe Lyon wrote:
>>
>> If this is something of interest, then we will need to understand what
>> is required, among those:
>>
>> - which machines we can use as workers: we certainly need more worker
>> (previously known as slave) machines to test GCC in different
>> archs/configurations;
> 
> To cover various archs, it may be more practical to build cross-compilers,
> using "cheap" x86_64 builders, and relying on qemu or other simulators
> to run the tests. I don't think the GCC compute farm can offer powerful
> enough machines for all the archs we want to test.
> 

Interesting suggestion. I haven't had the opportunity to look at the
compile farm. However, it could be interesting to have a mix of workers:
native compile farm ones and some x86_64 doing cross compilation and
testing.

> It's not as good as using native hardware, but this is often faster.
> And it does not prevent from using native hardware for weekly
> bootstraps for instance.
> 
>> - what kind of build configurations do we need and what they should do:
>> for example, do we want to build gcc standalone against system (the one
>> installed in the worker) binutils, glibc, etc or do we want a builder to
>> bootstrap everything?
> 
> Using the system tools is OK for native builders, maybe not when building
> cross-compilers.
> 
> Then I think it's way safer to stick to given binutils/glibc/newlib 
> versions
> and monitor only gcc changes. There are already frequent regressions,
> and it's easier to be sure it's related to gcc-changes only.
> 
> And have a mechanism to upgrade such components after checking
> the impact on the gcc testsuite.
> 
> In Linaro we have a job tracking all master branches, it is almost
> always red :(
> 

Oh, that's surprising actually. I wouldn't have expected that. I would
have hoped that actually all masters would work most of the time. Do you
know if there's a specific reason for this?

>> - Currently we have a force build which allows people to force a build
>> on the worker. This requires no authentication and can certainly be
>> abused. We can add some sort of authentication, like for example, only
>> allow users with a gcc.gnu.org email? For now, it's not a problem.
>> -  We are building gcc for C, C++, ObjC (Which is the default). Shall we
>> add more languages to the mix?
>> - the gdb buildbot has a feature I have disabled (the TRY scheduler)
>> which allows people to submit patches to the buildbot, buildbot patches
>> the current svn version, builds and tests that. Would we want something
>> like this?
> 
> I think this is very useful.
> We have something like that both at Linaro and ST.
> On a few occasions, I did manually submit other people's patches
> for testing after they submitted them to gcc-patches@. It always
> caught a few problems in some less configurations.
> 

I wonder how feasible it is to automatically extract the patches and
automatically do the running, and post back the results to the patches'
thread... just something that occurred to me. Haven't yet investigated.

>> - buildbot can notify people if the build fails or if there's a test
>> regression. Notification can be sent to IRC and email for example. What
>> would people prefer to have as the settings for notifications?
> 
> I've recently seen complaints on the gdb list because the buildbot
> was sending notifications to too many people. I'm afraid that this
> is going to be a touchy area if the notifications contain too many
> false positives.
> 

I discussed this with Pedro and Sergio and it was due to a bug in the
configuration that Sergio fixed, so notifications don't need to contain
false positives, unless of course, there's a bug. I will try to avoid
making the same mistake as Sergio and not spam GCC developers.

>> - an example of a successful build is:
>> https://gcc-buildbot.linki.tools/#/builders/1/builds/38
>> This build shows several Changes because between the start and finish of
>> a build there were several new commits. Properties show among other
>> things test results. Responsible users show the people who were involved
>> in the changes for the build.
>>
>> I am sure there are lots of other questions and issues. Please let me
>> know if you find this interesting and what you would like to see
>> implemented.
>>
> 
> To summarize, I think such bots are very valuable, even if they only
> act as post-commit validations.
> 
> But as other people expressed, the main difficulty is what to do with
> the results. Analyzing regression reports to make sure they are
> not false positive is very time consuming.
> 
> Having a buggy bisect framework can also lead to embarrassing
> situations, like when I blamed a C++ front-end patch for a regression
> in fortran ;-)
> 
> Most of the time, I consider it's more efficient for the project if I warn
> the author of the patch that introduced the regression than if I try to
> fix it myself. Except for the most trivial ones, it resulted several times
> in d

Re: GCC Buildbot

2017-09-21 Thread Paulo Matos


On 21/09/17 16:41, Martin Sebor wrote:
> 
> The regression and the testresults lists are useful but not nearly
> as much as they could be.  For one, the presentation isn't user
> friendly (a matrix view would be much more informative).  But even
> beyond it, rather than using the pull model (people have to make
> an effort to search it for results of their changes or the changes
> of others to find regressions), the regression testing setup could
> be improved by adopting the push model and automatically emailing
> authors of changes that caused regressions (or opening bugs, or
> whatever else might get our attention).
> 

This is certainly one of the notifications that I think needs to be
implemented. If a patch breaks the build or testing, the responsible parties
need to be informed, i.e. committers, authors and possibly the list as well.

Thanks,

-- 
Paulo Matos


Re: GCC Buildbot

2017-09-21 Thread Paulo Matos


On 21/09/17 14:11, Mark Wielaard wrote:
> Hi,
> 
> First let me say I am also a fan of buildbot. I use it for a couple of
> projects and it is really flexible, low on resources, easy to add new
> builders/workers and easily extensible if you like python.
> 
> On Thu, 2017-09-21 at 07:18 +0200, Markus Trippelsdorf wrote:
>> And it has the basic problem of all automatic testing: that in the
>> long run everyone simply ignores it.
>> The same thing would happen with the proposed new buildbot. It would
>> use still more resources on the already overused machines without
>> producing useful results.
> 
> But this is a real concern and will happen if you are too eager testing
> all the things all the time. So I would recommend to start as small as
> possible. Pick a target that builds as fast as possible. Once you go
> over 30 minutes of build/test time it really is already too long. Both
> on machine resources and human attention span. And pick a subset of the
> testsuite that should be zero-FAIL. Only then will people really take
> notice when the build turns from green-to-red. Otherwise people will
> always have an excuse "well, those tests aren't really reliable, it
> could be something else". And then only once you have a stable
> small/fast builder that reliably warns committers that their commit
> broke something extend it to new targets/setups/tests as long as you
> can keep the false warnings as close to zero as possible.
> 

Thanks. This is an interesting idea; however, it might not be an easy
exercise to choose a subset of the tests for each compiled language that
PASS, are quick to run and are representative. It would be interesting to
hear from some of the main developers which of the tests would be better
to run.

-- 
Paulo Matos


Re: GCC Buildbot

2017-09-21 Thread Paulo Matos


On 21/09/17 02:27, Joseph Myers wrote:
> On Wed, 20 Sep 2017, Segher Boessenkool wrote:
> 
>>> - buildbot can notify people if the build fails or if there's a test
>>> regression. Notification can be sent to IRC and email for example. What
>>> would people prefer to have as the settings for notifications?
>>
>> Just try it!  IRC is most useful I think, at least for now.  But try
>> whatever seems useful, if there are too many complaints you can always
>> turn it off again ;-)
> 
> We have the gcc-regression list for automatic regression notifications 
> (but as per my previous message, regression notifications are much more 
> useful if someone actually does the analysis, promptly, to identify the 
> cause and possibly a fix).
> 

Yes, the gcc-regression list. Will add a notifier to email the list.
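
For illustration, a minimal sketch of such a notifier in a buildbot nine
master.cfg, using the MailNotifier reporter (the addresses and mode below are
placeholders, not the actual configuration):

from buildbot.plugins import reporters

# Placeholder addresses; mode=('problem',) mails only when a build goes from
# good to bad rather than on every failure.
c['services'] = [
    reporters.MailNotifier(
        fromaddr="buildbot@example.org",
        mode=('problem',),
        extraRecipients=["gcc-regression@gcc.gnu.org"],
        sendToInterestedUsers=False,
    ),
]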

> My glibc bots only detect testsuite regressions that change the exit 
> status of "make check" from successful to unsuccessful (and regressions 
> that break any part of the build, etc.).  That works for glibc, where the 
> vast bulk of configurations have clean compilation-only testsuite results.  
> It won't work for GCC - you need to detect changes in the results of 
> individual tests (or new tests which start out as FAIL and may not 
> strictly be regressions but should still be fixed).  Ideally the expected 
> baseline *would* be zero FAILs, but we're a long way off that.
> 

Yes, with GCC it is slightly more complex but it should be possible to
calculate regressions even in the presence of non-zero FAILs.

Thanks for your comments,

-- 
Paulo Matos


Re: GCC Buildbot

2017-09-21 Thread Paulo Matos


On 21/09/17 01:01, Segher Boessenkool wrote:
> Hi!
> 
> On Wed, Sep 20, 2017 at 05:01:55PM +0200, Paulo Matos wrote:
>> This mail's intention is to gauge the interest of having a buildbot for
>> GCC.
> 
> +1.  Or no, +100.
> 
>> - which machines we can use as workers: we certainly need more worker
>> (previously known as slave) machines to test GCC in different
>> archs/configurations;
> 
> I think this would use too much resources (essentially the full machines)
> for the GCC Compile Farm.  If you can dial it down so it only uses a
> small portion of the machines, we can set up slaves there, at least on
> the more unusual architectures.  But then it may become too slow to be
> useful.
> 

We can certainly decide what builds on workers in the compile farm and
what doesn't. We can also decide what type of build we want: a full
bootstrap, all languages, etc. I still have to look at that. I am not sure how
to access the compile farm or who has access to it.

>> - what kind of build configurations do we need and what they should do:
>> for example, do we want to build gcc standalone against system (the one
>> installed in the worker) binutils, glibc, etc or do we want a builder to
>> bootstrap everything?
> 
> Bootstrap is painfully slow, but it catches many more problems.
> 

Could possibly do that on a schedule instead.

>> -  We are building gcc for C, C++, ObjC (Which is the default). Shall we
>> add more languages to the mix?
> 
> I'd add Fortran, it tends to find problems (possibly because it has much
> bigger tests than most C/C++ tests are).  But all extra testing uses
> disproportionally more resources...  Find a sweet spot :-)  You probably
> should start with as little as possible, or perhaps do bigger configs
> every tenth build, or something like that.
> 

Sounds like a good idea.

>> - the gdb buildbot has a feature I have disabled (the TRY scheduler)
>> which allows people to submit patches to the buildbot, buildbot patches
>> the current svn version, builds and tests that. Would we want something
>> like this?
> 
> This is very useful, but should be mostly separate...  There are of course
> the security considerations, but also this really needs clean builds every
> time, and perhaps even bootstraps.
> 

There could be two types of try schedulers, one for full bootstraps and
one just for GCC. Security wise we could always containerize.

>> - buildbot can notify people if the build fails or if there's a test
>> regression. Notification can be sent to IRC and email for example. What
>> would people prefer to have as the settings for notifications?
> 
> Just try it!  IRC is most useful I think, at least for now.  But try
> whatever seems useful, if there are too many complaints you can always
> turn it off again ;-)
>
> Thank you for working on this.
>

Thanks for all the comments. I will add the initial notifications into
IRC and see how people react.

-- 
Paulo Matos


Re: GCC Buildbot

2017-09-21 Thread Paulo Matos


On 20/09/17 19:14, Joseph Myers wrote:
> On Wed, 20 Sep 2017, Paulo Matos wrote:
> 
>> - buildbot can notify people if the build fails or if there's a test
>> regression. Notification can be sent to IRC and email for example. What
>> would people prefer to have as the settings for notifications?
> 
> It's very useful if someone reviews regressions manually and files bugs / 
> notifies authors of patches that seem likely to have caused them / fixes 
> them if straightforward.  However, that can take a lot of time (especially 
> if you're building a large number of configurations, and ideally there 
> would be at least compilation testing for every target architecture 
> supported by GCC if enough machine resources are available).  (I do that 
> for my glibc bot, which does compilation-only testing for many 
> configurations, covering all supported glibc ABIs except Hurd - the 
> summaries of results go to libc-testresults, but the detailed logs that 
> show e.g. which test failed or failed to build aren't public; each build 
> cycle for the mainline bot produces about 3 GB of logs, which get deleted 
> after the following build cycle.)
> 

I totally agree that only if people get involved in checking if there
were regressions and keeping an eye on what's going on are things going
to improve. The framework can help a lot here by notifying the right
people and the mailing list if something gets broken or if there are
regressions but once the notification is sent someone certainly needs to
pick it up.

I believe that once the framework is there and if it's reliable and user
friendly and does not force people to check the UI every day, instead
notifying people only when things go wrong, then it will force people to
take notice and do something about breakages.

At the moment, there are no issues with regards to logs sizes but we are
starting small with a single worker. Once we have more we'll have to
revisit this issue.

-- 
Paulo Matos


Re: GCC Buildbot

2017-09-21 Thread Joseph Myers
On Thu, 21 Sep 2017, Markus Trippelsdorf wrote:

> And it has the basic problem of all automatic testing: that in the long
> run everyone simply ignores it.

Hence, see my comments about the value of having someone who monitors the 
results and files bugs / notifies patch authors / fixes issues.

It doesn't need to be the same person who runs the bot.  It doesn't need 
to be the same person for every architecture.  But the results need to be 
monitored and issues raised.

That's what I do with my glibc bots.  A lot of the time they run cleanly, 
but sometimes when they show failures it does take a significant amount of 
work to understand them and fix or inform appropriate people.  (A lot of 
fixes were also involved in getting those bots to a near-clean baseline 
state.)

I once did it, a long time ago, for some GCC bots (on HP-UX, and I think 
i686-pc-linux-gnu as well, as I recall), but that project ended and I 
stopped running the HP-UX bots and largely stopped monitoring the 
i686-pc-linux-gnu one.  My experience indicates that GCC bots would 
require a lot more monitoring than glibc ones, especially if testing lots 
of unusual configurations.

I think it would be straightforward to adapt build-many-glibcs.py to 
operate as a GCC bot running the compilation parts of the GCC testsuite 
for all or almost all supported architectures (and a bit more complicated 
to make it track regressions at the level of individual test failures), 
but I don't know how long an all-architectures compilation test run would 
take, and whether there would be people to monitor all-architectures test 
results is another matter.  (build-many-glibcs.py is useful in glibc 
development even apart from the bots, to allow people to do some 
all-architectures testing of global changes.)

> The same thing is true for the regression mailing list
> https://gcc.gnu.org/ml/gcc-regression/current/.
> It is obvious that nobody pays any attention to it, e.g. PGO bootstrap
> is broken for several months on x86_64 and i686 bootstrap is broken for
> a long time, too.

I don't know if he currently monitors it, but HJ has certainly filed bugs 
for regressions found by his bots in the past.

-- 
Joseph S. Myers
jos...@codesourcery.com


Re: GCC Buildbot

2017-09-21 Thread Martin Sebor

On 09/20/2017 11:18 PM, Markus Trippelsdorf wrote:

On 2017.09.20 at 18:01 -0500, Segher Boessenkool wrote:

Hi!

On Wed, Sep 20, 2017 at 05:01:55PM +0200, Paulo Matos wrote:

This mail's intention is to gauge the interest of having a buildbot for
GCC.


+1.  Or no, +100.


- which machines we can use as workers: we certainly need more worker
(previously known as slave) machines to test GCC in different
archs/configurations;


I think this would use too much resources (essentially the full machines)
for the GCC Compile Farm.  If you can dial it down so it only uses a
small portion of the machines, we can set up slaves there, at least on
the more unusual architectures.  But then it may become too slow to be
useful.


There is already a buildbot that uses GCC compile farm resources:
http://toolchain.lug-owl.de/buildbot/

And it has the basic problem of all automatic testing: that in the long
run everyone simply ignores it.


I don't think that's a fair characterization.  The problem is not
that people ignore all automated build results (this, in fact,
couldn't be farther from the truth).  Rather, there is not a single
build and test system for GCC but a multitude of disjoint efforts,
each offering different views with varying levels of functionality
and detail, and each maintained to a different degree.  If we could
agree to adopt one that meets most of our needs and if it were
maintained with same diligence and attention as GCC itself I'm sure
it would benefit the project greatly.


The same thing would happen with the proposed new buildbot. It would use
still more resources on the already overused machines without producing
useful results.

The same thing is true for the regression mailing list
https://gcc.gnu.org/ml/gcc-regression/current/.
It is obvious that nobody pays any attention to it, e.g. PGO bootstrap
has been broken for several months on x86_64, and i686 bootstrap has been
broken for a long time, too.


The regression and the testresults lists are useful, but not nearly
as much as they could be.  For one, the presentation isn't user
friendly (a matrix view would be much more informative).  But even
beyond that, rather than using the pull model (people have to make
an effort to search for results of their changes, or the changes
of others, to find regressions), the regression testing setup could
be improved by adopting the push model and automatically emailing
authors of changes that caused regressions (or opening bugs, or
whatever else might get our attention).
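As an illustration, the push model boils down to mapping a regression back
to a commit range and mailing the authors.  A minimal sketch, assuming a
reachable SVN repository and an SMTP relay (the repository URL, the SMTP
host and the author-to-address mapping below are placeholders, not part of
any existing bot):

    # Sketch: mail the authors of a suspect revision range about regressions.
    # REPO, SMTP_HOST and the author@gcc.gnu.org mapping are illustrative only.
    import subprocess
    import smtplib
    import xml.etree.ElementTree as ET
    from email.message import EmailMessage

    REPO = "svn://gcc.gnu.org/svn/gcc/trunk"
    SMTP_HOST = "localhost"

    def authors_in_range(first_rev, last_rev):
        """Return the set of commit authors between two SVN revisions."""
        xml_log = subprocess.check_output(
            ["svn", "log", "--xml", "-r", "%s:%s" % (first_rev, last_rev), REPO])
        tree = ET.fromstring(xml_log)
        return {entry.findtext("author") for entry in tree.iter("logentry")}

    def notify(first_rev, last_rev, regressed_tests):
        """Mail each author in the range a plain-text summary of regressions."""
        body = "Regressions appeared between r%s and r%s:\n\n%s\n" % (
            first_rev, last_rev, "\n".join(sorted(regressed_tests)))
        with smtplib.SMTP(SMTP_HOST) as smtp:
            for author in authors_in_range(first_rev, last_rev):
                msg = EmailMessage()
                msg["Subject"] = "[bot] regressions in r%s:%s" % (first_rev, last_rev)
                msg["From"] = "buildbot@example.org"
                msg["To"] = "%s@gcc.gnu.org" % author
                msg.set_content(body)
                smtp.send_message(msg)

Opening a bug instead of (or in addition to) mailing would simply replace
the smtplib part with a call to Bugzilla's API.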


Only a mandatory pre-commit hook that would reject commits that break
anything would work. But running the testsuite takes much too long to
make this approach feasible.


A commit hook would be very handy, but it can't very well do
the same amount of testing as an automated build bot.

Martin



Re: GCC Buildbot

2017-09-21 Thread Christophe Lyon
On 20 September 2017 at 17:01, Paulo Matos  wrote:
> Hi all,
>
> I am internally running buildbot for a few projects, including one for a
> simple gcc setup for a private port. After some discussions with David
> Edelsohn at the last couple of Cauldrons, who told me this might be
> interesting for the community in general, I have contacted Sergio DJ
> with a few questions on his buildbot configuration for GDB. I then
> stripped out his configuration and transformed it into one from GCC,
> with a few private additions and ported it to the most recent buildbot
> version nine (which is numerically 0.9.x).
>

That's something I'd have liked to discuss at the Cauldron, but I
couldn't attend.

> To make a long story short: https://gcc-buildbot.linki.tools
> With brief documentation in: https://linkitools.github.io/gcc-buildbot
> and configuration in: https://github.com/LinkiTools/gcc-buildbot
>
> Now, this is still pretty raw but it:
> * Configures a fedora x86_64 for C, C++ and ObjectiveC (./configure
> --disable-multilib)
> * Does an incremental build
> * Runs all tests
> * Grabs the test results and stores them as properties
> * Creates a tarball of the sum and log files from the testsuite
> directory and uploads them
>
> This mail's intention is to gauge the interest of having a buildbot for
> GCC. Buildbot is a generic Python framework to build a test framework so
> the possibilities are pretty much endless as all workflows are
> programmed in Python and with buildbot nine the interface is also
> modifiable, if required.

I think there is no question about the interest of such a feature. It's almost
mandatory nowadays.

FYI, I've been involved in some "bots" for GCC for the past 4-5 years.
Our interest is in the ARM and AArch64 targets.

I don't want to start a Buildbot vs Jenkins vs something else war,
but I can share my experience. I did look at Buildbot, including when
the GDB guys started their own, but I must admit that I have trouble
with Python ;-)

A general warning: avoid sharing resources; it's always
a cause of trouble.

In ST, I stopped using my team's Jenkins instance because it
was overloaded, needed to be restarted at inconvenient times, ...
I'm now using a nice crontab :-)
Still in ST, I am using our Compute Farm, a large number of x86_64
servers where you submit batch jobs, wait, then parse the results;
the workspace is deleted upon job completion. I have to cope with
various rules to get a decent throughput and to minimize pending
time as much as possible.

Yet, and probably because the machines are shared with other users
running large (much larger?) programs at the same time, I have to face
random failures (processes are killed randomly, interrupted system calls,
etc). Trying to handle these problems gracefully is very time consuming.

I upload the results on a Linaro server, so that I can share them
when I report a regression. For disk space reasons, I currently
keep about 2 months of results. For the trunk:
http://people.linaro.org/~christophe.lyon/cross-validation/gcc/trunk/


In Linaro, we use Jenkins, and a few dedicated x86_64 builders
as well as arm and aarch64 builders and test machines. We have
much less cpu power than what I can currently use in ST, so
we run fewer builds and fewer configurations. But even there we have
to face a few random test results (mostly when threads and libgomp
are involved).
https://ci.linaro.org/view/tcwg-ci/job/tcwg-upstream-monitoring/

These random false failures have been preventing us from sending
results automatically.

>
> If this is something of interest, then we will need to understand what
> is required, among those:
>
> - which machines we can use as workers: we certainly need more worker
> (previously known as slave) machines to test GCC in different
> archs/configurations;

To cover various archs, it may be more practical to build cross-compilers,
using "cheap" x86_64 builders, and relying on qemu or other simulators
to run the tests. I don't think the GCC compute farm can offer powerful
enough machines for all the archs we want to test.

It's not as good as using native hardware, but it is often faster.
And it does not prevent using native hardware for weekly bootstraps,
for instance.
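To sketch what that could look like in buildbot nine terms, a cross test
step can be a plain ShellCommand that points the testsuite at a DejaGnu
simulator board; the board name and the timeout below are placeholders and
have to match whatever qemu-backed board file the worker actually provides:

    # Sketch of a cross-testing step for a buildbot nine build factory.
    # "arm-sim" is a placeholder board name; it must match a DejaGnu board
    # file available on the worker (for instance one wrapping qemu-arm).
    from buildbot.plugins import steps

    run_cross_tests = steps.ShellCommand(
        name="check-gcc-cross",
        command=["make", "-k", "check-gcc",
                 "RUNTESTFLAGS=--target_board=arm-sim"],
        workdir="build/objdir",
        timeout=4 * 3600,        # simulators are slow, allow a few hours
        haltOnFailure=False)     # keep going so the .sum files still get collected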

> - what kind of build configurations do we need and what they should do:
> for example, do we want to build gcc standalone against system (the one
> installed in the worker) binutils, glibc, etc or do we want a builder to
> bootstrap everything?

Using the system tools is OK for native builders, maybe not when building
cross-compilers.

Then I think it's way safer to stick to given binutils/glibc/newlib versions
and monitor only gcc changes. There are already frequent regressions,
and it's easier to be sure

Re: GCC Buildbot

2017-09-21 Thread Mark Wielaard
Hi,

First let me say I am also a fan of buildbot. I use it for a couple of
projects and it is really flexible, low on resources, easy to add new
builders/workers and easily extensible if you like python.

On Thu, 2017-09-21 at 07:18 +0200, Markus Trippelsdorf wrote:
> And it has the basic problem of all automatic testing: that in the
> long run everyone simply ignores it.
> The same thing would happen with the proposed new buildbot. It would
> use still more resources on the already overused machines without
> producing useful results.

But this is a real concern, and it will happen if you are too eager and
test everything all the time. So I would recommend starting as small as
possible. Pick a target that builds as fast as possible; once you go
over 30 minutes of build/test time it really is already too long, both
in machine resources and in human attention span. And pick a subset of
the testsuite that should be zero-FAIL. Only then will people really
take notice when the build turns from green to red. Otherwise people
will always have an excuse: "well, those tests aren't really reliable,
it could be something else". Only once you have a stable, small and
fast builder that reliably warns committers that their commit broke
something should you extend it to new targets/setups/tests, and only as
long as you can keep the false warnings as close to zero as possible.
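With buildbot nine, that green-to-red behaviour is mostly a matter of
configuring the reporters to fire only on a state change rather than on
every failing build.  A rough sketch, with placeholder addresses, server
and channel names (the exact notify_events spelling should be checked
against the installed buildbot version):

    # Sketch: notify only when a builder flips between passing and failing,
    # so a permanently red builder does not generate endless noise.
    # Addresses, hosts and the channel are placeholders.
    from buildbot.plugins import reporters

    c['services'] = [
        reporters.MailNotifier(
            fromaddr="buildbot@example.org",
            mode=('change',),             # only on green-to-red / red-to-green
            sendToInterestedUsers=True),  # mail the authors of the blamed changes
        reporters.IRC(
            host="irc.example.org",
            nick="gcc-bb",
            channels=["#gcc-buildbot"],
            notify_events={'successToFailure': True,
                           'failureToSuccess': True}),
    ]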

Cheers,

Mark


Re: GCC Buildbot

2017-09-20 Thread Paulo Matos


On 20/09/17 17:07, Jeff Law wrote:
> I'd strongly recommend using one of the existing infrastructures.  I
> know several folks (myself included) are using Jenkins/Hudson.  There's
> little to be gained building a completely new infrastructure to manage a
> buildbot.
> 

As David pointed out in another email, I should have referenced the
buildbot homepage:
http://buildbot.net/

This is a framework with batteries included to build the kind of things
we want to have for testing. I certainly don't want to start a Jenkins
vs Buildbot discussion.

Kind regards,

-- 
Paulo Matos


Re: GCC Buildbot

2017-09-20 Thread Markus Trippelsdorf
On 2017.09.20 at 18:01 -0500, Segher Boessenkool wrote:
> Hi!
> 
> On Wed, Sep 20, 2017 at 05:01:55PM +0200, Paulo Matos wrote:
> > This mail's intention is to gauge the interest of having a buildbot for
> > GCC.
> 
> +1.  Or no, +100.
> 
> > - which machines we can use as workers: we certainly need more worker
> > (previously known as slave) machines to test GCC in different
> > archs/configurations;
> 
> I think this would use too many resources (essentially the full machines)
> for the GCC Compile Farm.  If you can dial it down so it only uses a
> small portion of the machines, we can set up slaves there, at least on
> the more unusual architectures.  But then it may become too slow to be
> useful.

There is already a buildbot that uses GCC compile farm resources:
http://toolchain.lug-owl.de/buildbot/

And it has the basic problem of all automatic testing: that in the long
run everyone simply ignores it.
The same thing would happen with the proposed new buildbot. It would use
still more resources on the already overused machines without producing
useful results.

The same thing is true for the regression mailing list
https://gcc.gnu.org/ml/gcc-regression/current/.
It is obvious that nobody pays any attention to it, e.g. PGO bootstrap
has been broken for several months on x86_64, and i686 bootstrap has been
broken for a long time, too.

Only a mandatory pre-commit hook that would reject commits that break
anything would work. But running the testsuite takes much too long to
make this approach feasible.

-- 
Markus


Re: GCC Buildbot

2017-09-20 Thread Joseph Myers
On Wed, 20 Sep 2017, Segher Boessenkool wrote:

> > - buildbot can notify people if the build fails or if there's a test
> > regression. Notification can be sent to IRC and email for example. What
> > would people prefer to have as the settings for notifications?
> 
> Just try it!  IRC is most useful I think, at least for now.  But try
> whatever seems useful, if there are too many complaints you can always
> turn it off again ;-)

We have the gcc-regression list for automatic regression notifications 
(but as per my previous message, regression notifications are much more 
useful if someone actually does the analysis, promptly, to identify the 
cause and possibly a fix).

My glibc bots only detect testsuite regressions that change the exit 
status of "make check" from successful to unsuccessful (and regressions 
that break any part of the build, etc.).  That works for glibc, where the 
vast bulk of configurations have clean compilation-only testsuite results.  
It won't work for GCC - you need to detect changes in the results of 
individual tests (or new tests which start out as FAIL and may not 
strictly be regressions but should still be fixed).  Ideally the expected 
baseline *would* be zero FAILs, but we're a long way off that.
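Detecting changes at that level is essentially a diff of two DejaGnu .sum
files keyed by test name.  A rough sketch of the idea (real tooling needs
to cope with line-number churn in test names, flaky tests and more):

    # Sketch: compare two .sum files and report tests whose outcome got worse,
    # plus new tests that start out as FAIL.
    import re

    OUTCOMES = ("PASS", "FAIL", "XPASS", "XFAIL",
                "UNRESOLVED", "UNSUPPORTED", "UNTESTED")
    LINE_RE = re.compile(r"^(%s): (.*)$" % "|".join(OUTCOMES))

    def parse_sum(path):
        """Map test name -> outcome for one .sum file."""
        results = {}
        with open(path, errors="replace") as f:
            for line in f:
                m = LINE_RE.match(line)
                if m:
                    results[m.group(2).strip()] = m.group(1)
        return results

    def regressions(old_sum, new_sum):
        old, new = parse_sum(old_sum), parse_sum(new_sum)
        bad = {"FAIL", "UNRESOLVED", "XPASS"}
        report = []
        for name, outcome in sorted(new.items()):
            if outcome in bad and name in old and old[name] not in bad:
                report.append((name, old[name], outcome))      # real regression
            elif outcome == "FAIL" and name not in old:
                report.append((name, "(new test)", outcome))   # new FAIL
        return report

    for name, before, after in regressions("old/gcc.sum", "new/gcc.sum"):
        print("%s -> %s: %s" % (before, after, name))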

-- 
Joseph S. Myers
jos...@codesourcery.com


Re: GCC Buildbot

2017-09-20 Thread Segher Boessenkool
Hi!

On Wed, Sep 20, 2017 at 05:01:55PM +0200, Paulo Matos wrote:
> This mail's intention is to gauge the interest of having a buildbot for
> GCC.

+1.  Or no, +100.

> - which machines we can use as workers: we certainly need more worker
> (previously known as slave) machines to test GCC in different
> archs/configurations;

I think this would use too many resources (essentially the full machines)
for the GCC Compile Farm.  If you can dial it down so it only uses a
small portion of the machines, we can set up slaves there, at least on
the more unusual architectures.  But then it may become too slow to be
useful.

> - what kind of build configurations do we need and what they should do:
> for example, do we want to build gcc standalone against system (the one
> installed in the worker) binutils, glibc, etc or do we want a builder to
> bootstrap everything?

Bootstrap is painfully slow, but it catches many more problems.

> -  We are building gcc for C, C++, ObjC (Which is the default). Shall we
> add more languages to the mix?

I'd add Fortran; it tends to find problems (possibly because its tests
are much bigger than most C/C++ tests).  But all extra testing uses
disproportionately more resources...  Find a sweet spot :-)  You probably
should start with as little as possible, or perhaps do bigger configs
every tenth build, or something like that.

> - the gdb buildbot has a feature I have disabled (the TRY scheduler)
> which allows people to submit patches to the buildbot, buildbot patches
> the current svn version, builds and tests that. Would we want something
> like this?

This is very useful, but should be mostly separate...  There are of course
the security considerations, but also this really needs clean builds every
time, and perhaps even bootstraps.

> - buildbot can notify people if the build fails or if there's a test
> regression. Notification can be sent to IRC and email for example. What
> would people prefer to have as the settings for notifications?

Just try it!  IRC is most useful I think, at least for now.  But try
whatever seems useful, if there are too many complaints you can always
turn it off again ;-)

Thank you for working on this.


Segher


Re: GCC Buildbot

2017-09-20 Thread R0b0t1
Hello friends!

On Wed, Sep 20, 2017 at 10:12 AM, David Edelsohn  wrote:
> On Wed, Sep 20, 2017 at 11:07 AM, Jeff Law  wrote:
>> On 09/20/2017 09:01 AM, Paulo Matos wrote:
>>> Hi all,
>>>
>>> I am internally running buildbot for a few projects, including one for a
>>> simple gcc setup for a private port. After some discussions with David
>>> Edelsohn at the last couple of Cauldrons, who told me this might be
>>> interesting for the community in general, I have contacted Sergio DJ
>>> with a few questions on his buildbot configuration for GDB. I then
>>> stripped out his configuration and transformed it into one from GCC,
>>> with a few private additions and ported it to the most recent buildbot
>>> version nine (which is numerically 0.9.x).
>>>
>>> To make a long story short: https://gcc-buildbot.linki.tools
>>> With brief documentation in: https://linkitools.github.io/gcc-buildbot
>>> and configuration in: https://github.com/LinkiTools/gcc-buildbot
>>>
>>> Now, this is still pretty raw but it:
>>> * Configures a fedora x86_64 for C, C++ and ObjectiveC (./configure
>>> --disable-multilib)
>>> * Does an incremental build
>>> * Runs all tests
>>> * Grabs the test results and stores them as properties
>>> * Creates a tarball of the sum and log files from the testsuite
>>> directory and uploads them
>>>
>>> This mail's intention is to gauge the interest of having a buildbot for
>>> GCC. Buildbot is a generic Python framework to build a test framework so
>>> the possibilities are pretty much endless as all workflows are
>>> programmed in Python and with buildbot nine the interface is also
>>> modifiable, if required.
>>>
>>> If this is something of interest, then we will need to understand what
>>> is required, among those:
>>>
>>> - which machines we can use as workers: we certainly need more worker
>>> (previously known as slave) machines to test GCC in different
>>> archs/configurations;
>>> - what kind of build configurations do we need and what they should do:
>>> for example, do we want to build gcc standalone against system (the one
>>> installed in the worker) binutils, glibc, etc or do we want a builder to
>>> bootstrap everything?
>>> - initially I was doing fresh builds and uploading a tarball (450 MB)
>>> for download. This took way too long. I have moved to incremental builds
>>> with no tarball generation but if required we could do this for forced
>>> builds and/or nightly. Ideas?
>>> - We are currently running the whole testsuite for each incremental
>>> build (~40mins). If we want a faster turnaround time, we could run just
>>> an important subset of tests. Suggestions?
>>> - would we like to run anything on the compiler besides the gcc
>>> testsuite? I know Honza does, or used to do, lots of firefox builds to
>>> test LTO. Shall we build those, for example? I noticed there's a testing
>>> subpage which contains a few other libraries, should we build these?
>>> (https://gcc.gnu.org/testing/)
>>> - Currently we have a force build which allows people to force a build
>>> on the worker. This requires no authentication and can certainly be
>>> abused. We can add some sort of authentication, like for example, only
>>> allow users with a gcc.gnu.org email? For now, it's not a problem.
>>> -  We are building gcc for C, C++, ObjC (Which is the default). Shall we
>>> add more languages to the mix?
>>> - the gdb buildbot has a feature I have disabled (the TRY scheduler)
>>> which allows people to submit patches to the buildbot, buildbot patches
>>> the current svn version, builds and tests that. Would we want something
>>> like this?
>>> - buildbot can notify people if the build fails or if there's a test
>>> regression. Notification can be sent to IRC and email for example. What
>>> would people prefer to have as the settings for notifications?
>>> - an example of a successful build is:
>>> https://gcc-buildbot.linki.tools/#/builders/1/builds/38
>>> This build shows several Changes because between the start and finish of
>>> a build there were several new commits. Properties show among other
>>> things test results. Responsible users show the people who were involved
>>> in the changes for the build.
>>>
>>> I am sure there are lots of other questions and issues. Please let me
>>> know if you find this interesting an

Re: GCC Buildbot

2017-09-20 Thread Joseph Myers
On Wed, 20 Sep 2017, Paulo Matos wrote:

> - buildbot can notify people if the build fails or if there's a test
> regression. Notification can be sent to IRC and email for example. What
> would people prefer to have as the settings for notifications?

It's very useful if someone reviews regressions manually and files bugs / 
notifies authors of patches that seem likely to have caused them / fixes 
them if straightforward.  However, that can take a lot of time (especially 
if you're building a large number of configurations, and ideally there 
would be at least compilation testing for every target architecture 
supported by GCC if enough machine resources are available).  (I do that 
for my glibc bot, which does compilation-only testing for many 
configurations, covering all supported glibc ABIs except Hurd - the 
summaries of results go to libc-testresults, but the detailed logs that 
show e.g. which test failed or failed to build aren't public; each build 
cycle for the mainline bot produces about 3 GB of logs, which get deleted 
after the following build cycle.)

-- 
Joseph S. Myers
jos...@codesourcery.com


Re: GCC Buildbot

2017-09-20 Thread David Edelsohn
On Wed, Sep 20, 2017 at 11:07 AM, Jeff Law  wrote:
> On 09/20/2017 09:01 AM, Paulo Matos wrote:
>> Hi all,
>>
>> I am internally running buildbot for a few projects, including one for a
>> simple gcc setup for a private port. After some discussions with David
>> Edelsohn at the last couple of Cauldrons, who told me this might be
>> interesting for the community in general, I have contacted Sergio DJ
>> with a few questions on his buildbot configuration for GDB. I then
>> stripped out his configuration and transformed it into one from GCC,
>> with a few private additions and ported it to the most recent buildbot
>> version nine (which is numerically 0.9.x).
>>
>> To make a long story short: https://gcc-buildbot.linki.tools
>> With brief documentation in: https://linkitools.github.io/gcc-buildbot
>> and configuration in: https://github.com/LinkiTools/gcc-buildbot
>>
>> Now, this is still pretty raw but it:
>> * Configures a fedora x86_64 for C, C++ and ObjectiveC (./configure
>> --disable-multilib)
>> * Does an incremental build
>> * Runs all tests
>> * Grabs the test results and stores them as properties
>> * Creates a tarball of the sum and log files from the testsuite
>> directory and uploads them
>>
>> This mail's intention is to gauge the interest of having a buildbot for
>> GCC. Buildbot is a generic Python framework to build a test framework so
>> the possibilities are pretty much endless as all workflows are
>> programmed in Python and with buildbot nine the interface is also
>> modifiable, if required.
>>
>> If this is something of interest, then we will need to understand what
>> is required, among those:
>>
>> - which machines we can use as workers: we certainly need more worker
>> (previously known as slave) machines to test GCC in different
>> archs/configurations;
>> - what kind of build configurations do we need and what they should do:
>> for example, do we want to build gcc standalone against system (the one
>> installed in the worker) binutils, glibc, etc or do we want a builder to
>> bootstrap everything?
>> - initially I was doing fresh builds and uploading a tarball (450 MB)
>> for download. This took way too long. I have moved to incremental builds
>> with no tarball generation but if required we could do this for forced
>> builds and/or nightly. Ideas?
>> - We are currently running the whole testsuite for each incremental
>> build (~40mins). If we want a faster turnaround time, we could run just
>> an important subset of tests. Suggestions?
>> - would we like to run anything on the compiler besides the gcc
>> testsuite? I know Honza does, or used to do, lots of firefox builds to
>> test LTO. Shall we build those, for example? I noticed there's a testing
>> subpage which contains a few other libraries, should we build these?
>> (https://gcc.gnu.org/testing/)
>> - Currently we have a force build which allows people to force a build
>> on the worker. This requires no authentication and can certainly be
>> abused. We can add some sort of authentication, like for example, only
>> allow users with a gcc.gnu.org email? For now, it's not a problem.
>> -  We are building gcc for C, C++, ObjC (Which is the default). Shall we
>> add more languages to the mix?
>> - the gdb buildbot has a feature I have disabled (the TRY scheduler)
>> which allows people to submit patches to the buildbot, buildbot patches
>> the current svn version, builds and tests that. Would we want something
>> like this?
>> - buildbot can notify people if the build fails or if there's a test
>> regression. Notification can be sent to IRC and email for example. What
>> would people prefer to have as the settings for notifications?
>> - an example of a successful build is:
>> https://gcc-buildbot.linki.tools/#/builders/1/builds/38
>> This build shows several Changes because between the start and finish of
>> a build there were several new commits. Properties show among other
>> things test results. Responsible users show the people who were involved
>> in the changes for the build.
>>
>> I am sure there are lots of other questions and issues. Please let me
>> know if you find this interesting and what you would like to see
>> implemented.
> I'd strongly recommend using one of the existing infrastructures.  I
> know several folks (myself included) are using Jenkins/Hudson.  There's
> little to be gained building a completely new infrastructure to manage a
> buildbot.

Python Buildbot is not a completely new infrastructure.  It is widely
deployed and used for many projects.  LLVM utilizes a Buildbot
cluster.

I strongly support that we explore how to utilize this offer.  Please
don't bikeshed this.

Thanks, David


Re: GCC Buildbot

2017-09-20 Thread Jeff Law
On 09/20/2017 09:01 AM, Paulo Matos wrote:
> Hi all,
> 
> I am internally running buildbot for a few projects, including one for a
> simple gcc setup for a private port. After some discussions with David
> Edelsohn at the last couple of Cauldrons, who told me this might be
> interesting for the community in general, I have contacted Sergio DJ
> with a few questions on his buildbot configuration for GDB. I then
> stripped out his configuration and transformed it into one from GCC,
> with a few private additions and ported it to the most recent buildbot
> version nine (which is numerically 0.9.x).
> 
> To make a long story short: https://gcc-buildbot.linki.tools
> With brief documentation in: https://linkitools.github.io/gcc-buildbot
> and configuration in: https://github.com/LinkiTools/gcc-buildbot
> 
> Now, this is still pretty raw but it:
> * Configures a fedora x86_64 for C, C++ and ObjectiveC (./configure
> --disable-multilib)
> * Does an incremental build
> * Runs all tests
> * Grabs the test results and stores them as properties
> * Creates a tarball of the sum and log files from the testsuite
> directory and uploads them
> 
> This mail's intention is to gauge the interest of having a buildbot for
> GCC. Buildbot is a generic Python framework to build a test framework so
> the possibilities are pretty much endless as all workflows are
> programmed in Python and with buildbot nine the interface is also
> modifiable, if required.
> 
> If this is something of interest, then we will need to understand what
> is required, among those:
> 
> - which machines we can use as workers: we certainly need more worker
> (previously known as slave) machines to test GCC in different
> archs/configurations;
> - what kind of build configurations do we need and what they should do:
> for example, do we want to build gcc standalone against system (the one
> installed in the worker) binutils, glibc, etc or do we want a builder to
> bootstrap everything?
> - initially I was doing fresh builds and uploading a tarball (450 MB)
> for download. This took way too long. I have moved to incremental builds
> with no tarball generation but if required we could do this for forced
> builds and/or nightly. Ideas?
> - We are currently running the whole testsuite for each incremental
> build (~40mins). If we want a faster turnaround time, we could run just
> an important subset of tests. Suggestions?
> - would we like to run anything on the compiler besides the gcc
> testsuite? I know Honza does, or used to do, lots of firefox builds to
> test LTO. Shall we build those, for example? I noticed there's a testing
> subpage which contains a few other libraries, should we build these?
> (https://gcc.gnu.org/testing/)
> - Currently we have a force build which allows people to force a build
> on the worker. This requires no authentication and can certainly be
> abused. We can add some sort of authentication, like for example, only
> allow users with a gcc.gnu.org email? For now, it's not a problem.
> -  We are building gcc for C, C++, ObjC (Which is the default). Shall we
> add more languages to the mix?
> - the gdb buildbot has a feature I have disabled (the TRY scheduler)
> which allows people to submit patches to the buildbot, buildbot patches
> the current svn version, builds and tests that. Would we want something
> like this?
> - buildbot can notify people if the build fails or if there's a test
> regression. Notification can be sent to IRC and email for example. What
> would people prefer to have as the settings for notifications?
> - an example of a successful build is:
> https://gcc-buildbot.linki.tools/#/builders/1/builds/38
> This build shows several Changes because between the start and finish of
> a build there were several new commits. Properties show among other
> things test results. Responsible users show the people who were involved
> in the changes for the build.
> 
> I am sure there are lots of other questions and issues. Please let me
> know if you find this interesting and what you would like to see
> implemented.
I'd strongly recommend using one of the existing infrastructures.  I
know several folks (myself included) are using Jenkins/Hudson.  There's
little to be gained building a completely new infrastructure to manage a
buildbot.


Jeff
> 
> Kind regards,
> 



GCC Buildbot

2017-09-20 Thread Paulo Matos
Hi all,

I am internally running buildbot for a few projects, including one for a
simple gcc setup for a private port. After some discussions with David
Edelsohn at the last couple of Cauldrons, who told me this might be
interesting for the community in general, I have contacted Sergio DJ
with a few questions on his buildbot configuration for GDB. I then
stripped out his configuration and transformed it into one from GCC,
with a few private additions and ported it to the most recent buildbot
version nine (which is numerically 0.9.x).

To make a long story short: https://gcc-buildbot.linki.tools
With brief documentation in: https://linkitools.github.io/gcc-buildbot
and configuration in: https://github.com/LinkiTools/gcc-buildbot

Now, this is still pretty raw but it:
* Configures a fedora x86_64 for C, C++ and ObjectiveC (./configure
--disable-multilib)
* Does an incremental build
* Runs all tests
* Grabs the test results and stores them as properties
* Creates a tarball of the sum and log files from the testsuite
directory and uploads them

This mail's intention is to gauge the interest of having a buildbot for
GCC. Buildbot is a generic Python framework to build a test framework so
the possibilities are pretty much endless as all workflows are
programmed in Python and with buildbot nine the interface is also
modifiable, if required.
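For readers who have not used it, such a workflow boils down to a small
build factory in master.cfg.  A stripped-down sketch of the pipeline above
(worker name, password, ports and the poll interval are placeholders; the
actual configuration in the repository linked above is more complete):

    # Minimal sketch of a buildbot nine master configuration for a GCC
    # builder: poll SVN, check out, configure, build, run the testsuite and
    # bundle up the .sum/.log files.  Names and credentials are placeholders.
    from buildbot.plugins import changes, schedulers, steps, util, worker

    c = BuildmasterConfig = {}
    c['protocols'] = {'pb': {'port': 9989}}

    c['workers'] = [worker.Worker('fedora-x86_64-1', 'password')]

    c['change_source'] = [changes.SVNPoller(
        repourl='svn://gcc.gnu.org/svn/gcc/trunk', pollInterval=300)]

    factory = util.BuildFactory([
        steps.SVN(repourl='svn://gcc.gnu.org/svn/gcc/trunk',
                  workdir='gcc', mode='incremental'),
        steps.ShellCommand(name='mkdir-objdir',
                           command=['mkdir', '-p', 'objdir'], workdir='.'),
        steps.Configure(command=['../gcc/configure', '--disable-multilib',
                                 '--enable-languages=c,c++,objc'],
                        workdir='objdir'),
        steps.Compile(command=['make', '-j4'], workdir='objdir'),
        steps.ShellCommand(name='check',
                           command=['make', '-k', 'check'],
                           workdir='objdir', haltOnFailure=False),
        steps.ShellCommand(name='collect-sums',
                           command='tar czf sums.tar.gz '
                                   '$(find . -name "*.sum" -o -name "*.log")',
                           workdir='objdir'),
    ])

    c['schedulers'] = [schedulers.SingleBranchScheduler(
        name='trunk',
        change_filter=util.ChangeFilter(branch=None),
        treeStableTimer=120,
        builderNames=['gcc-trunk-x86_64'])]

    c['builders'] = [util.BuilderConfig(
        name='gcc-trunk-x86_64',
        workernames=['fedora-x86_64-1'],
        factory=factory)]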

If this is something of interest, then we will need to understand what
is required, among those:

- which machines we can use as workers: we certainly need more worker
(previously known as slave) machines to test GCC in different
archs/configurations;
- what kind of build configurations do we need and what they should do:
for example, do we want to build gcc standalone against system (the one
installed in the worker) binutils, glibc, etc or do we want a builder to
bootstrap everything?
- initially I was doing fresh builds and uploading a tarball (450 MB)
for download. This took way too long. I have moved to incremental builds
with no tarball generation but if required we could do this for forced
builds and/or nightly. Ideas?
- We are currently running the whole testsuite for each incremental
build (~40mins). If we want a faster turnaround time, we could run just
an important subset of tests. Suggestions?
- would we like to run anything on the compiler besides the gcc
testsuite? I know Honza does, or used to do, lots of firefox builds to
test LTO. Shall we build those, for example? I noticed there's a testing
subpage which contains a few other libraries, should we build these?
(https://gcc.gnu.org/testing/)
- Currently we have a force build which allows people to force a build
on the worker. This requires no authentication and can certainly be
abused. We can add some sort of authentication, like for example, only
allow users with a gcc.gnu.org email? For now, it's not a problem.
-  We are building gcc for C, C++, ObjC (Which is the default). Shall we
add more languages to the mix?
- the gdb buildbot has a feature I have disabled (the TRY scheduler)
which allows people to submit patches to the buildbot, buildbot patches
the current svn version, builds and tests that. Would we want something
like this?
- buildbot can notify people if the build fails or if there's a test
regression. Notification can be sent to IRC and email for example. What
would people prefer to have as the settings for notifications?
- an example of a successful build is:
https://gcc-buildbot.linki.tools/#/builders/1/builds/38
This build shows several Changes because between the start and finish of
a build there were several new commits. Properties show among other
things test results. Responsible users show the people who were involved
in the changes for the build.

I am sure there are lots of other questions and issues. Please let me
know if you find this interesting and what you would like to see
implemented.

Kind regards,

-- 
Paulo Matos


Re: gcc buildbot?

2014-01-10 Thread Christophe Lyon

On 01/10/14 10:11, Richard Sandiford wrote:

Hi,

Philippe Baril Lecavalier  writes:

I have been experimenting with buildbot lately, and I would be glad to
help in providing it. If there is interest, I could have a prototype and
a detailed proposal ready in a few days. It could serve GCC, binutils
and some important libraries as well.

Thanks for the offer.  I think the current state is that Jan-Benedict Glaw
has put together a buildbot for testing that binutils, gcc and gdb
build for a wide range of targets:

http://toolchain.lug-owl.de/buildbot/

which has been very useful for catching target-specific build failures.
AFAIK the bot doesn't (and wasn't supposed to) check for testsuite
regressions, but I could be wrong about that.  There was some discussion
about adding testsuite coverage here:

http://gcc.gnu.org/ml/gcc/2013-08/msg00317.html


One aspect hasn't been discussed in that thread: I don't think it's possible to
run the testsuite for every single commit, since 'make check' takes a really
long time on some targets.

I have developed a cross-validation environment in which I cross-build and
cross-validate several ARM and AArch64 combinations. In my case, each
build+check job takes about 3h (make -j4, in tmpfs, c, c++ and fortran), and
I have to restrict the validation to an "interesting" subset of commits.
Running on actual HW would be slower.

But it's definitely worth it :-)

Christophe.



Re: gcc buildbot?

2014-01-10 Thread Richard Sandiford
Hi,

Philippe Baril Lecavalier  writes:
> I have been experimenting with buildbot lately, and I would be glad to
> help in providing it. If there is interest, I could have a prototype and
> a detailed proposal ready in a few days. It could serve GCC, binutils
> and some important libraries as well.

Thanks for the offer.  I think the current state is that Jan-Benedict Glaw
has put together a buildbot for testing that binutils, gcc and gdb
build for a wide range of targets:

   http://toolchain.lug-owl.de/buildbot/

which has been very useful for catching target-specific build failures.
AFAIK the bot doesn't (and wasn't supposed to) check for testsuite
regressions, but I could be wrong about that.  There was some discussion
about adding testsuite coverage here:

   http://gcc.gnu.org/ml/gcc/2013-08/msg00317.html

I'm not sure how up-to-date that is though.

TBH, as you can probably tell, I'm not really the right person to be
answering your question :-)

Thanks,
Richard


gcc buildbot?

2014-01-06 Thread Philippe Baril Lecavalier


Hi,

Is anyone working on an implementation of buildbot 
for GCC?

I have been experimenting with buildbot lately, and I would be glad to
help in providing it. If there is interest, I could have a prototype and
a detailed proposal ready in a few days. It could serve GCC, binutils
and some important libraries as well.

Cheers,

Philippe Baril Lecavalier