[webkit-dev] The tree is on fire: a tragedy of the commons

2010-02-26 Thread Adam Barth
Not to point fingers, but we've been having trouble keeping
build.webkit.org green these past few weeks.  As I write this message,
every platform is broken, again.  As the project scales, polluting the
build with brokenness impacts more developers and drains more
productivity.

Here are some approaches we could use to turn this tragedy of the
commons around:

1) Adopt a "rollout first, ask questions later" ethic.  The vast
majority of changes are not important enough to break the build for
everyone else.  If we adopt a "rollout first, ask questions later"
ethic, committers would feel free to roll out brokenness to unbreak the
build and contributors shouldn't be offended if their patch is rolled
out without their knowledge.  We can always re-land the broken patch
later once it actually works.

2) Require pre-commit vetting of patches.  We have the resources to
build and test every patch on at least one platform before landing the
patch in the main tree.  Vetting patches before landing will help us
avoid breaking every platform at once.  Once the patch has been
vetted, it can either be landed mechanically (i.e., by commit-queue)
or manually.

Here's how I would design the life and times of a patch:

1) Contributor uploads patch and nominates it for review.
2) Patch vetted by the EWS on numerous platforms.
3) If the EWS finds a problem, return to step 1.
4) Reviewer marks patch review+.
5) Committer decides the patch is ready to land.
6) Patch built and tested against top-of-tree on at least one platform.
7) If the patch fails to build or pass tests, return to step 1.
8) Patch landed.
9) If the patch turns any of the "core builders" red, patch is rolled
out, return to step 1.
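
The nine steps above can be read as a small state machine in which any
failed check sends the patch back to step 1. The following Python is
only an illustrative sketch of that reading: the state names and the
`advance` helper are hypothetical and do not correspond to any actual
WebKit tool.

```python
from enum import Enum, auto


class Step(Enum):
    UPLOADED = auto()          # step 1: uploaded and nominated for review
    EWS_PASSED = auto()        # steps 2-3: EWS vetted the patch
    REVIEWED = auto()          # step 4: reviewer marked review+
    READY_TO_LAND = auto()     # step 5: committer decides to land
    PRECOMMIT_PASSED = auto()  # steps 6-7: built and tested on top-of-tree
    LANDED = auto()            # step 8: patch landed
    DONE = auto()              # step 9 passed: core builders stayed green


# Enum members iterate in definition order, which matches the step order.
ORDER = list(Step)


def advance(current: Step, check_passed: bool) -> Step:
    """Return the next state, or fall back to step 1 when a check fails."""
    if not check_passed:
        return Step.UPLOADED
    if current is Step.DONE:
        return current
    return ORDER[ORDER.index(current) + 1]
```

In this sketch a committer who skips steps 6 and 7 would jump from
READY_TO_LAND straight to LANDED, which `advance` does not allow.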

I suspect most of our brokenness is coming from committers skipping steps 6 and 7.

Adam
___
webkit-dev mailing list
webkit-dev@lists.webkit.org
http://lists.webkit.org/mailman/listinfo.cgi/webkit-dev


Re: [webkit-dev] The tree is on fire: a tragedy of the commons

2010-02-26 Thread Jeremy Orlow
On Fri, Feb 26, 2010 at 10:36 AM, Adam Barth  wrote:

> Not to point fingers, but we've been having trouble keeping
> build.webkit.org green these past few weeks.  As I write this message,
> every platform is broken, again.  As the project scales, polluting the
> build with brokenness impacts more developers and drains more
> productivity.
>
> Here are some approaches we could use to turn this tragedy of the
> commons around:
>
> 1) Adopt a "rollout first, ask questions later" ethic.  The vast
> majority of changes are not important enough to break the build for
> everyone else.  If we adopt a "rollout first, ask questions later"
> ethic, committers would feel free to rollout brokenness to unbreak the
> build and contributors shouldn't be offended if their patch is rolled
> out without their knowledge.  We can always re-land the broken patch
> later once it actually works.
>
> 2) Require pre-commit vetting of patches.  We have the resources to
> build and test every patch on at least one platform before landing the
> patch in the main tree.  Vetting patches before landing will help us
> avoid breaking every platform at once.  Once the patch has been
> vetted, it can either be landed mechanically (i.e., by commit-queue)
> or manually.
>
> Here's how I would design the life and times of a patch:
>
> 1) Contributor uploads patch and nominates it for review.
> 2) Patch vetted by the EWS on numerous platforms.
> 3) If the EWS finds a problem, return to step 1.
> 4) Reviewer marks patch review+.
> 5) Committer decides the patch is ready to land.
> 6) Patch built and tested against top-of-tree on at least one platform.
> 7) If the patch fail to build or pass tests, return to step 1.
> 8) Patch landed.
> 9) If the patch turns any of the "core builders" red, patch is rolled
> out, return to step 1.
>
> I suspect most of our brokenness coming from committers skipping steps 6
> and 7.
>

LGTM.  The only thing I'd add is that we REALLY need emails to start going
out to webkit-dev (and ideally the suspected patch owners as well) when
things do break.  What is this blocked on?


Re: [webkit-dev] The tree is on fire: a tragedy of the commons

2010-02-26 Thread Xan Lopez
On Fri, Feb 26, 2010 at 11:36 AM, Adam Barth  wrote:
> Not to point fingers, but we've been having trouble keeping
> build.webkit.org green these past few weeks.  As I write this message,
> every platform is broken, again.  As the project scales, polluting the
> build with brokenness impacts more developers and drains more
> productivity.
>
> Here are some approaches we could use to turn this tragedy of the
> commons around:
>
> 1) Adopt a "rollout first, ask questions later" ethic.  The vast
> majority of changes are not important enough to break the build for
> everyone else.  If we adopt a "rollout first, ask questions later"
> ethic, committers would feel free to rollout brokenness to unbreak the
> build and contributors shouldn't be offended if their patch is rolled
> out without their knowledge.  We can always re-land the broken patch
> later once it actually works.

In my experience this is more or less the current policy, especially
for build breakage (as opposed to test breakage). Maybe a bit less
hard-line, in that we usually try to contact the culprit and give them
some time to fix issues, but I think there's no remorse in rolling out
patches if there's brokenness and nobody is working on fixing it.

>
> 2) Require pre-commit vetting of patches.  We have the resources to
> build and test every patch on at least one platform before landing the
> patch in the main tree.  Vetting patches before landing will help us
> avoid breaking every platform at once.  Once the patch has been
> vetted, it can either be landed mechanically (i.e., by commit-queue)
> or manually.
>
> Here's how I would design the life and times of a patch:
>
> 1) Contributor uploads patch and nominates it for review.
> 2) Patch vetted by the EWS on numerous platforms.
> 3) If the EWS finds a problem, return to step 1.
> 4) Reviewer marks patch review+.
> 5) Committer decides the patch is ready to land.
> 6) Patch built and tested against top-of-tree on at least one platform.
> 7) If the patch fail to build or pass tests, return to step 1.
> 8) Patch landed.
> 9) If the patch turns any of the "core builders" red, patch is rolled
> out, return to step 1.

EWS has been a huge boon in productivity at least for us GTK+ folks,
so I fully support any step to increase its awesomeness! Of course
what we need to do is to work on increasing the number of "core
builders", but that's an orthogonal issue and our own responsibility.

Cheers,

Xan

>
> I suspect most of our brokenness coming from committers skipping steps 6 and 
> 7.
>
> Adam


Re: [webkit-dev] The tree is on fire: a tragedy of the commons

2010-02-26 Thread Kenneth Christiansen
> 1) Contributor uploads patch and nominates it for review.
> 2) Patch vetted by the EWS on numerous platforms.

When a non-committer uploads a patch, it is not vetted by the EWS. I
know that is due to security issues. It would be really nice to have an
option for a reviewer to approve it to run on the EWS.

Kenneth


Re: [webkit-dev] The tree is on fire: a tragedy of the commons

2010-02-26 Thread Eric Seidel
On Fri, Feb 26, 2010 at 4:14 AM, Kenneth Christiansen
 wrote:
>> 1) Contributor uploads patch and nominates it for review.
>> 2) Patch vetted by the EWS on numerous platforms.
>
> When a non-committer uploads a patch, it is not being vet by EWS. I
> know that is due to security issues. It would be really nice with an
> option for a reviewer to accept it to run on the EWS.

The only EWS which requires committer access is Mac-EWS.  All other
EWS bots will run any patch.

-eric


Re: [webkit-dev] The tree is on fire: a tragedy of the commons

2010-02-26 Thread Alex Milowski
On Fri, Feb 26, 2010 at 7:09 AM, Eric Seidel  wrote:
> On Fri, Feb 26, 2010 at 4:14 AM, Kenneth Christiansen
>  wrote:
>>> 1) Contributor uploads patch and nominates it for review.
>>> 2) Patch vetted by the EWS on numerous platforms.
>>
>> When a non-committer uploads a patch, it is not being vet by EWS. I
>> know that is due to security issues. It would be really nice with an
>> option for a reviewer to accept it to run on the EWS.
>
> The only EWS which requires committer access is Mac-EWS.  All other
> EWS bots will run any patch.

Why is that?  That's the platform I'm most interested in seeing run.

-- 
--Alex Milowski
"The excellence of grammar as a guide is proportional to the paucity of the
inflexions, i.e. to the degree of analysis effected by the language
considered."

Bertrand Russell in a footnote of Principles of Mathematics


Re: [webkit-dev] The tree is on fire: a tragedy of the commons

2010-02-26 Thread Eric Seidel
On Fri, Feb 26, 2010 at 7:12 AM, Alex Milowski  wrote:
>> The only EWS which requires committer access is Mac-EWS.  All other
>> EWS bots will run any patch.
>
> Why is that?   That's the platform I'm most interested in see run.

Various reasons.  Mostly due to our current hardware setup.  If
someone has some mac hardware they'd like to donate to the cause it
would be most welcome.

-eric


Re: [webkit-dev] The tree is on fire: a tragedy of the commons

2010-02-26 Thread Alex Milowski
On Fri, Feb 26, 2010 at 7:17 AM, Eric Seidel  wrote:
> On Fri, Feb 26, 2010 at 7:12 AM, Alex Milowski  wrote:
>>> The only EWS which requires committer access is Mac-EWS.  All other
>>> EWS bots will run any patch.
>>
>> Why is that?   That's the platform I'm most interested in see run.
>
> Various reasons.  Mostly due to our current hardware setup.  If
> someone has some mac hardware they'd like to donate to the cause it
> would be most welcome.

That seems really, really solvable.

-- 
--Alex Milowski
"The excellence of grammar as a guide is proportional to the paucity of the
inflexions, i.e. to the degree of analysis effected by the language
considered."

Bertrand Russell in a footnote of Principles of Mathematics


Re: [webkit-dev] The tree is on fire: a tragedy of the commons

2010-02-26 Thread Adam Barth
On Fri, Feb 26, 2010 at 7:24 AM, Alex Milowski  wrote:
> On Fri, Feb 26, 2010 at 7:17 AM, Eric Seidel  wrote:
>> On Fri, Feb 26, 2010 at 7:12 AM, Alex Milowski  wrote:
>>>> The only EWS which requires committer access is Mac-EWS.  All other
>>>> EWS bots will run any patch.
>>>
>>> Why is that?   That's the platform I'm most interested in see run.
>>
>> Various reasons.  Mostly due to our current hardware setup.  If
>> someone has some mac hardware they'd like to donate to the cause it
>> would be most welcome.
>
> That seems really, really solvable.

The core issue here is that the license for Mac OS X prevents us from
running the OS in a virtual machine.  The way we protect ourselves
from random folks haxoring the EWS on Linux is by running them on EC2
and re-imaging the machines periodically.

If you'd like to donate hardware that you're willing to have random
folks run code on, please let me or Eric know and we'll show you how
to get the mac-ews up and running.

Adam


Re: [webkit-dev] The tree is on fire: a tragedy of the commons

2010-02-26 Thread Alex Milowski
On Fri, Feb 26, 2010 at 8:19 AM, Adam Barth  wrote:
> On Fri, Feb 26, 2010 at 7:24 AM, Alex Milowski  wrote:
>> On Fri, Feb 26, 2010 at 7:17 AM, Eric Seidel  wrote:
>>> On Fri, Feb 26, 2010 at 7:12 AM, Alex Milowski  wrote:
>>>>> The only EWS which requires committer access is Mac-EWS.  All other
>>>>> EWS bots will run any patch.
>>>>
>>>> Why is that?   That's the platform I'm most interested in see run.
>>>
>>> Various reasons.  Mostly due to our current hardware setup.  If
>>> someone has some mac hardware they'd like to donate to the cause it
>>> would be most welcome.
>>
>> That seems really, really solvable.
>
> The core issue here is that the license for Mac OS X prevents us from
> running the OS in a virtual machine.  The way we protect ourselves
> from random folks haxoring the EWS on Linux is by running them on EC2
> and re-imagining the machines periodically.

So, it is possible to run Mac OS X on a virtual machine:

http://www.appleinsider.com/articles/07/11/01/apple_frees_mac_os_x_leopard_server_to_run_in_virtual_machines.html

or the:

http://images.apple.com/legal/sla/docs/macosxserver105.pdf

which says:

"You may also Install and use other copies of Mac OS X Server Software
on the same Apple-labeled computer,"

You just need to use Apple hardware.  Hence the request for hardware.  :)

The real issue is you can't run this in the cloud like on an EC2 server
because of the hardware restriction in Apple's license, right?

> If you'd like to donate hardware that you're willing to have random
> folks run code on, please let me or Eric know and we'll show you how
> to get the mac-ews up and running.
>

I have limited bandwidth where I'm at and so hosting something, while
possible, needs careful consideration.  I've contemplated running something
like EWS for my own work so I'd be interested in learning how this works.

...but will just one server out there somewhere solve this problem?  Don't
we need several?

-- 
--Alex Milowski
"The excellence of grammar as a guide is proportional to the paucity of the
inflexions, i.e. to the degree of analysis effected by the language
considered."

Bertrand Russell in a footnote of Principles of Mathematics


Re: [webkit-dev] The tree is on fire: a tragedy of the commons

2010-02-26 Thread Adam Barth
On Fri, Feb 26, 2010 at 8:47 AM, Alex Milowski  wrote:
> On Fri, Feb 26, 2010 at 8:19 AM, Adam Barth  wrote:
>> On Fri, Feb 26, 2010 at 7:24 AM, Alex Milowski  wrote:
>>> On Fri, Feb 26, 2010 at 7:17 AM, Eric Seidel  wrote:
>>>> On Fri, Feb 26, 2010 at 7:12 AM, Alex Milowski  wrote:
>>>>>> The only EWS which requires committer access is Mac-EWS.  All other
>>>>>> EWS bots will run any patch.
>>>>>
>>>>> Why is that?   That's the platform I'm most interested in see run.
>>>>
>>>> Various reasons.  Mostly due to our current hardware setup.  If
>>>> someone has some mac hardware they'd like to donate to the cause it
>>>> would be most welcome.
>>>
>>> That seems really, really solvable.
>>
>> The core issue here is that the license for Mac OS X prevents us from
>> running the OS in a virtual machine.  The way we protect ourselves
>> from random folks haxoring the EWS on Linux is by running them on EC2
>> and re-imagining the machines periodically.
>
> So, it is possible to run Mac OS X on a virtual machine:

Oh, awesome!

> The real issue is you can't run this in the cloud like on an EC2 server
> because of the hardware restriction in Apple's license, right?

EC2 has support for Linux and Windows, but not Mac.  I have been
meaning to set up a Windows box, but I haven't gotten around to it
yet.  If you know of a cloud provider that has Mac, we can set up the
mac-ews there.

>> If you'd like to donate hardware that you're willing to have random
>> folks run code on, please let me or Eric know and we'll show you how
>> to get the mac-ews up and running.
>>
>
> I have limited bandwidth where I'm at and so hosting something, while
> possible, needs careful consideration.  I've contemplated running something
> like EWS for my own work so I'd be interested in learning how this work.

Amazon tells me that our current bots use about 4 GB/month of download
bandwidth and 600 MB/month of upload bandwidth.  I presume almost all
of the bandwidth is to update the working copies of the four bots
hosted there.

> ...but will just one server out there somewhere solve this problem?  Don't
> we need several?

It depends on how beefy your server is, but one server is probably
fine.  The current mac-ews is running on one machine and has no
trouble keeping up with the load.

Adam


Re: [webkit-dev] The tree is on fire: a tragedy of the commons

2010-02-26 Thread Adam Barth
On Fri, Feb 26, 2010 at 8:55 AM, Adam Barth  wrote:
> Amazon tells me that our current bots use about 4 GB/month of download
> bandwidth and 600 MB/month of upload bandwidth.  I presume almost all
> of the bandwidth is to update the working copies of the four bots
> hosted there.

In case you're curious, Amazon charges us 9 cents/month for that much bandwidth.

Adam


Re: [webkit-dev] The tree is on fire: a tragedy of the commons

2010-02-26 Thread Kenneth Christiansen
That is some of the best 9 cents spent ever!

Kenneth

On Fri, Feb 26, 2010 at 1:58 PM, Adam Barth  wrote:
> On Fri, Feb 26, 2010 at 8:55 AM, Adam Barth  wrote:
>> Amazon tells me that our current bots use about 4 GB/month of download
>> bandwidth and 600 MB/month of upload bandwidth.  I presume almost all
>> of the bandwidth is to update the working copies of the four bots
>> hosted there.
>
> In case you're curious, Amazon charges us 9 cents/month for that much 
> bandwidth.
>
> Adam



-- 
Kenneth Rohde Christiansen
Technical Lead / Senior Software Engineer
Qt Labs Americas, Nokia Technology Institute, INdT
Phone  +55 81 8895 6002 / E-mail kenneth.christiansen at openbossa.org


Re: [webkit-dev] The tree is on fire: a tragedy of the commons

2010-02-26 Thread Adam Barth
Well, the total bill is a bit bigger, but yeah.  :)

Adam


On Fri, Feb 26, 2010 at 9:05 AM, Kenneth Christiansen
 wrote:
> That is some of the best 9 cents spend ever!
>
> Kenneth
>
> On Fri, Feb 26, 2010 at 1:58 PM, Adam Barth  wrote:
>> On Fri, Feb 26, 2010 at 8:55 AM, Adam Barth  wrote:
>>> Amazon tells me that our current bots use about 4 GB/month of download
>>> bandwidth and 600 MB/month of upload bandwidth.  I presume almost all
>>> of the bandwidth is to update the working copies of the four bots
>>> hosted there.
>>
>> In case you're curious, Amazon charges us 9 cents/month for that much 
>> bandwidth.
>>
>> Adam


Re: [webkit-dev] The tree is on fire: a tragedy of the commons

2010-02-26 Thread Dimitri Glazkov
To summarize the thread:

1) We're adopting a "when in doubt, roll it out" approach to patches
that turn the tree red.
2) Need to find a way to run Mac-EWS for non-committers.
3) Enable "build-break" emails to webkit-dev or another opt-in mailing list

What else?

:DG<

On Fri, Feb 26, 2010 at 9:08 AM, Adam Barth  wrote:
> Well, the total bill is a bit bigger, but yeah.  :)
>
> Adam
>
>
> On Fri, Feb 26, 2010 at 9:05 AM, Kenneth Christiansen
>  wrote:
>> That is some of the best 9 cents spend ever!
>>
>> Kenneth
>>
>> On Fri, Feb 26, 2010 at 1:58 PM, Adam Barth  wrote:
>>> On Fri, Feb 26, 2010 at 8:55 AM, Adam Barth  wrote:
>>>> Amazon tells me that our current bots use about 4 GB/month of download
>>>> bandwidth and 600 MB/month of upload bandwidth.  I presume almost all
>>>> of the bandwidth is to update the working copies of the four bots
>>>> hosted there.
>>>
>>> In case you're curious, Amazon charges us 9 cents/month for that much 
>>> bandwidth.
>>>
>>> Adam


Re: [webkit-dev] The tree is on fire: a tragedy of the commons

2010-02-26 Thread Maciej Stachowiak


On Feb 26, 2010, at 9:17 AM, Dimitri Glazkov wrote:

> To summarize the thread:
>
> 1) We're adopting "when in doubt, roll it out" approach to patches
> that turn tree red.

I think it's polite, though not mandatory, to make a reasonable effort
to find the person responsible for the breakage and give them a chance
to fix it. (This doesn't have to mean hunting around for hours or
days, but you could send email or ask on IRC.) It is also acceptable to
fix it yourself, if it is obvious how.

> 2) Need to find a way to run Mac-EWS for non-committers.
> 3) Enable "build-break" emails to webkit-dev or another opt-in
> mailing list
>
> What else?

I'd like it if we had an IRC bot that announced build breakage on
#webkit.

Regards,
Maciej



Re: [webkit-dev] The tree is on fire: a tragedy of the commons

2010-02-26 Thread Nikolas Zimmermann


On 26.02.2010 at 18:17, Dimitri Glazkov wrote:

> To summarize the thread:
>
> 1) We're adopting "when in doubt, roll it out" approach to patches
> that turn tree red.
> 2) Need to find a way to run Mac-EWS for non-committers.
> 3) Enable "build-break" emails to webkit-dev or another opt-in
> mailing list
>
> What else?

I'm a bit scared of rule 1. How about we define a minimum delay before
rolling out patches after they break something?
Let's say, if a commit breaks the tree, give the committer a time frame
of 30 minutes to fix it; otherwise roll out (we could even automate
that).

Example: When landing an SVG patch that worked fine on Leopard but
broke Snow Leopard, I'd like to have some time to identify whether it's
the fault of my patch or a platform-specific problem. If it's the fault
of my patch, I have no problem with reverting. But if I can't
immediately fix the problem because it's a platform-specific issue that
cannot be fixed (in terms of WebKit), then adding to the Skipped list
and filing a new bug just takes 5 minutes. Reverting the whole patch,
just to reland it with a Skipped list addition, is a bit too much work
for me.


What do others think?

Cheers,
Niko



Re: [webkit-dev] The tree is on fire: a tragedy of the commons

2010-02-26 Thread Maciej Stachowiak


On Feb 26, 2010, at 1:36 AM, Adam Barth wrote:

> Not to point fingers, but we've been having trouble keeping
> build.webkit.org green these past few weeks.  As I write this message,
> every platform is broken, again.  As the project scales, polluting the
> build with brokenness impacts more developers and drains more
> productivity.
>
> Here are some approaches we could use to turn this tragedy of the
> commons around:
>
> 1) Adopt a "rollout first, ask questions later" ethic.  The vast
> majority of changes are not important enough to break the build for
> everyone else.  If we adopt a "rollout first, ask questions later"
> ethic, committers would feel free to rollout brokenness to unbreak the
> build and contributors shouldn't be offended if their patch is rolled
> out without their knowledge.  We can always re-land the broken patch
> later once it actually works.
>
> 2) Require pre-commit vetting of patches.  We have the resources to
> build and test every patch on at least one platform before landing the
> patch in the main tree.  Vetting patches before landing will help us
> avoid breaking every platform at once.  Once the patch has been
> vetted, it can either be landed mechanically (i.e., by commit-queue)
> or manually.
>
> Here's how I would design the life and times of a patch:
>
> 1) Contributor uploads patch and nominates it for review.
> 2) Patch vetted by the EWS on numerous platforms.
> 3) If the EWS finds a problem, return to step 1.
> 4) Reviewer marks patch review+.
> 5) Committer decides the patch is ready to land.
> 6) Patch built and tested against top-of-tree on at least one platform.
> 7) If the patch fail to build or pass tests, return to step 1.
> 8) Patch landed.
> 9) If the patch turns any of the "core builders" red, patch is rolled
> out, return to step 1.
>
> I suspect most of our brokenness coming from committers skipping
> steps 6 and 7.

One data point: I broke the build this weekend, because I introduced a  
problem that affected debug builds but not release. I did a full  
release build on my own system before committing. When someone pointed  
out the breakage, I rolled the patch out myself until I could fix it.  
If the problems were such that I could fix them as quickly as rolling  
out, I would


I feel like the biggest failure in my case was that I forgot to look  
at the bot once my patch went through a cycle. This is why I wish it  
would do some form of more active notification. Sometimes I get  
distracted after committing and forget to keep hitting reload on the  
buildbot page.
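
As a rough illustration of what "more active notification" could mean,
the Python sketch below compares two snapshots of builder status and
reports only the builders that have just gone from green to red. The
builder names, the revision string, and both helper functions are
hypothetical; this is not how build.webkit.org or any existing bot
works.

```python
from typing import Dict, List


def newly_broken(previous: Dict[str, bool], current: Dict[str, bool]) -> List[str]:
    """Return builders that were green (True) before and are red (False) now.

    A builder missing from `previous` is treated as having been green.
    """
    return sorted(
        name
        for name, green in current.items()
        if not green and previous.get(name, True)
    )


def format_alert(builders: List[str], revision: str) -> str:
    """Build a one-line message suitable for an email subject or IRC."""
    return "Build broken on {} (suspect: {})".format(", ".join(builders), revision)
```

Reporting only the green-to-red transition keeps a builder that stays
red from generating a new message on every cycle.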


Regards,
Maciej






Re: [webkit-dev] The tree is on fire: a tragedy of the commons

2010-02-26 Thread Chris Jerdonek
On Fri, Feb 26, 2010 at 1:36 AM, Adam Barth  wrote:

> 2) Require pre-commit vetting of patches.  We have the resources to
>
> Here's how I would design the life and times of a patch:
>
> 1) Contributor uploads patch and nominates it for review.
> 2) Patch vetted by the EWS on numerous platforms.
> 3) If the EWS finds a problem, return to step 1.
> 4) Reviewer marks patch review+.

It seems like this would preclude serial patches from getting reviewed together.

If I break a larger patch into smaller pieces for the benefit of the
reviewer (so that the second piece depends on the first getting
committed, etc), it seems like this process would mean that the second
piece can't get reviewed until the first piece is committed.

It seems like the committer should be allowed to decide when (2) and
(3) happen relative to the other steps -- provided it happens some
time before landing.

--Chris

> 5) Committer decides the patch is ready to land.
> 6) Patch built and tested against top-of-tree on at least one platform.
> 7) If the patch fail to build or pass tests, return to step 1.
> 8) Patch landed.
> 9) If the patch turns any of the "core builders" red, patch is rolled
> out, return to step 1.


Re: [webkit-dev] The tree is on fire: a tragedy of the commons

2010-02-26 Thread Eric Seidel
The bots take 15 minutes to cycle.  From the moment the build is broken,
that's FIX_TIME + BOT_CYCLE_TIME until they're green again.

I think we should cap the fix grace period at something like 15
minutes, which means no more than 30 minutes of tree redness per break.
That might be too aggressive to start with for WebKit, but I think we
should move towards that.

I would re-write rule one as something like this:
1.  Comment in the bugzilla bug when the build breaks.  If there is no
bugzilla bug, comment in #webkit.
2.  15 minutes after the break or 10 minutes after the comment, with
no reply from the breaker, roll out the patch.
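
The timing in this proposal can be sketched in a few lines of Python.
This is one possible reading, not an agreed policy: it assumes that a
comment is required before any rollout and that the rollout is allowed
as soon as either deadline has passed. The function and constant names
are made up for the example.

```python
from datetime import datetime, timedelta
from typing import Optional

BOT_CYCLE_TIME = timedelta(minutes=15)       # bots take 15 minutes to cycle
GRACE_AFTER_BREAK = timedelta(minutes=15)    # "15 minutes after the break"
GRACE_AFTER_COMMENT = timedelta(minutes=10)  # "10 minutes after the comment"


def may_roll_out(now: datetime, broke_at: datetime,
                 commented_at: Optional[datetime],
                 breaker_replied: bool) -> bool:
    """Decide whether the patch may be rolled out under the assumed reading."""
    if breaker_replied or commented_at is None:
        return False
    return (now - broke_at >= GRACE_AFTER_BREAK
            or now - commented_at >= GRACE_AFTER_COMMENT)


def tree_redness(fix_time: timedelta) -> timedelta:
    """FIX_TIME + BOT_CYCLE_TIME: how long the tree stays red per break."""
    return fix_time + BOT_CYCLE_TIME
```

With a 15-minute grace period, `tree_redness(timedelta(minutes=15))`
gives the 30-minute cap mentioned above.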

-eric

On Fri, Feb 26, 2010 at 9:32 AM, Nikolas Zimmermann
 wrote:
>
> On 26.02.2010 at 18:17, Dimitri Glazkov wrote:
>
>> To summarize the thread:
>>
>> 1) We're adopting "when in doubt, roll it out" approach to patches
>> that turn tree red.
>> 2) Need to find a way to run Mac-EWS for non-committers.
>> 3) Enable "build-break" emails to webkit-dev or another opt-in mailing
>> list
>>
>> What else?
>
> I'm a bit scared of rule 1. How about we define a minimum delay when to
> roll-out patches, after they break something?
> Let's say, if a commit breaks the tree, give the commiter a time frame of 30
> minutes to fix it - otherwhise roll-out (we could even automate that.)
>
> Example: When landing a SVG patch, that worked fine on Leopard, but broke
> Snow Leopard, I'd like to have some time to identify wheter it's the
> fault of my patch, or a platform specific problem. If it's the fault of my
> patch, I have no problem with reverting. But if I can't immediately fix the
> problem, because it's a platform specific issue, which can not be fixed (in
> terms of WebKit), then adding to the Skipped list, and filing a new bug
> just takes 5 minutes. Reverting the whole patch, just to reland it with a
> Skipped list addition is a bit too much work for me.
>
> What do others think?
>
> Cheers,
> Niko
>


Re: [webkit-dev] The tree is on fire: a tragedy of the commons

2010-02-26 Thread Alex Milowski
On Fri, Feb 26, 2010 at 8:55 AM, Adam Barth  wrote:
> On Fri, Feb 26, 2010 at 8:47 AM, Alex Milowski  wrote:
>> On Fri, Feb 26, 2010 at 8:19 AM, Adam Barth  wrote:
>>> On Fri, Feb 26, 2010 at 7:24 AM, Alex Milowski  wrote:
>>>> On Fri, Feb 26, 2010 at 7:17 AM, Eric Seidel  wrote:
>>>>> On Fri, Feb 26, 2010 at 7:12 AM, Alex Milowski  wrote:
>>>>>>> The only EWS which requires committer access is Mac-EWS.  All other
>>>>>>> EWS bots will run any patch.
>>>>>>
>>>>>> Why is that?   That's the platform I'm most interested in see run.
>>>>>
>>>>> Various reasons.  Mostly due to our current hardware setup.  If
>>>>> someone has some mac hardware they'd like to donate to the cause it
>>>>> would be most welcome.
>>>>
>>>> That seems really, really solvable.
>>>
>>> The core issue here is that the license for Mac OS X prevents us from
>>> running the OS in a virtual machine.  The way we protect ourselves
>>> from random folks haxoring the EWS on Linux is by running them on EC2
>>> and re-imagining the machines periodically.
>>
>> So, it is possible to run Mac OS X on a virtual machine:
>
> Oh, awesome!
>
>> The real issue is you can't run this in the cloud like on an EC2 server
>> because of the hardware restriction in Apple's license, right?
>
> EC2 has support for Linux and Windows, but not Mac.  I have been
> meaning to set up a Windows box, but I haven't gotten around to it
> yet.  If you know of a cloud provider that has Mac, we can set up the
> mac-ews there.

The only non-dedicated server hosting provider I've found is GoDaddy:

   http://www.godaddy.com/hosting/mac-hosting.aspx

I don't know if starting/stopping instances is as easy as Amazon's EC2
service (which I use).  I've never used their virtual hosting service.

-- 
--Alex Milowski
"The excellence of grammar as a guide is proportional to the paucity of the
inflexions, i.e. to the degree of analysis effected by the language
considered."

Bertrand Russell in a footnote of Principles of Mathematics


Re: [webkit-dev] The tree is on fire: a tragedy of the commons

2010-02-26 Thread Dimitri Glazkov
On Fri, Feb 26, 2010 at 9:44 AM, Eric Seidel  wrote:
> The bots take 15 minutes to cycle.  The moment the build is broken,
> thats FIX_TIME + BOT_CYCLE_TIME until their green again.
>
> I think we should cap the fix grace period at something like 15
> minutes, that means no more than 30 minutes of tree redness per break.
>  That might be too aggressive to start with for WebKit, but I think we
> should move towards that.
>
> I would re-write rule one as something like this:
> 1.  Comment in the bugzilla bug when the build breaks.  If there is no
> bugzilla bug, comment in #webkit.
> 2.  15 minutes after the break or 10 minutes after the comment, with
> no reply from the breaker, roll out the patch.

Sounds great. Is this going to be a new page on webkit.org?
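
To make the timing in the quoted rule concrete, here is a minimal Python
sketch. It is only an illustration, not something proposed on the list: the
function names are hypothetical, and it assumes the "or" in step 2 means
whichever deadline comes first (swap min for max to get the more lenient
reading). The 15-minute bot cycle and grace periods are taken from the quoted
message.

```python
from datetime import datetime, timedelta

BOT_CYCLE_TIME = timedelta(minutes=15)       # "The bots take 15 minutes to cycle"
GRACE_AFTER_BREAK = timedelta(minutes=15)    # "15 minutes after the break"
GRACE_AFTER_COMMENT = timedelta(minutes=10)  # "10 minutes after the comment"


def rollout_deadline(break_time, comment_time=None):
    """Earliest time at which the patch may be rolled out (assumed reading)."""
    deadline = break_time + GRACE_AFTER_BREAK
    if comment_time is not None:
        # Assumption: either condition is sufficient, so take the earlier one.
        deadline = min(deadline, comment_time + GRACE_AFTER_COMMENT)
    return deadline


def should_roll_out(now, break_time, comment_time=None, breaker_replied=False):
    """True once the deadline has passed with no reply from the breaker."""
    if breaker_replied:
        return False
    return now >= rollout_deadline(break_time, comment_time)


def worst_case_red_time(fix_time=GRACE_AFTER_BREAK):
    """FIX_TIME + BOT_CYCLE_TIME: how long the tree stays red per break."""
    return fix_time + BOT_CYCLE_TIME
```

With a 15-minute cap on the fix time, worst_case_red_time() comes to the 30
minutes of redness per break mentioned in the quoted message.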

:DG<

> -eric
>
> On Fri, Feb 26, 2010 at 9:32 AM, Nikolas Zimmermann
>  wrote:
>>
>> Am 26.02.2010 um 18:17 schrieb Dimitri Glazkov:
>>
>>> To summarize the thread:
>>>
>>> 1) We're adopting a "when in doubt, roll it out" approach to patches
>>> that turn the tree red.
>>> 2) Need to find a way to run Mac-EWS for non-committers.
>>> 3) Enable "build-break" emails to webkit-dev or another opt-in mailing
>>> list
>>>
>>> What else?
>>
>> I'm a bit scared of rule 1. How about we define a minimum delay before
>> rolling out patches after they break something?
>> Let's say, if a commit breaks the tree, give the committer a time frame of 30
>> minutes to fix it - otherwise roll out (we could even automate that.)
>>
>> Example: When landing an SVG patch that worked fine on Leopard but broke
>> Snow Leopard, I'd like to have some time to identify whether it's the
>> fault of my patch or a platform-specific problem. If it's the fault of my
>> patch, I have no problem with reverting. But if I can't immediately fix the
>> problem because it's a platform-specific issue that cannot be fixed (in
>> terms of WebKit), then adding to the Skipped list and filing a new bug
>> just takes 5 minutes. Reverting the whole patch, just to reland it with a
>> Skipped list addition, is a bit too much work for me.
>>
>> What do others think?
>>
>> Cheers,
>> Niko


Re: [webkit-dev] The tree is on fire: a tragedy of the commons

2010-02-26 Thread Jeremy Orlow
On Fri, Feb 26, 2010 at 6:47 PM, Dimitri Glazkov wrote:

> On Fri, Feb 26, 2010 at 9:44 AM, Eric Seidel  wrote:
> > The bots take 15 minutes to cycle.  The moment the build is broken,
> > that's FIX_TIME + BOT_CYCLE_TIME until they're green again.
> >
> > I think we should cap the fix grace period at something like 15
> > minutes, that means no more than 30 minutes of tree redness per break.
> >  That might be too aggressive to start with for WebKit, but I think we
> > should move towards that.
> >
> > I would re-write rule one as something like this:
> > 1.  Comment in the bugzilla bug when the build breaks.  If there is no
> > bugzilla bug, comment in #webkit.
> > 2.  15 minutes after the break or 10 minutes after the comment, with
> > no reply from the breaker, roll out the patch.
>
> Sounds great. Is this going to be a new page on webkit.org?
>

Agree it sounds like a good plan.

Re the emails: who knows how to do that?  Can someone own this process to
completion and do it as soon as possible?  It'd be much appreciated!




Re: [webkit-dev] The tree is on fire: a tragedy of the commons

2010-02-26 Thread Alexey Proskuryakov


On 26.02.2010, at 9:29, Maciej Stachowiak wrote:

> I'd like it if we had an IRC bot that announced build breakage on
> #webkit.

Perhaps better yet, on #webkit-build, as buildbot used to do.

- WBR, Alexey Proskuryakov



Re: [webkit-dev] The tree is on fire: a tragedy of the commons

2010-02-26 Thread Alexey Proskuryakov


On 26.02.2010, at 9:50, Jeremy Orlow wrote:


>>> I would re-write rule one as something like this:
>>> 1.  Comment in the bugzilla bug when the build breaks.  If there is no
>>> bugzilla bug, comment in #webkit.
>>> 2.  15 minutes after the break or 10 minutes after the comment, with
>>> no reply from the breaker, roll out the patch.
>>
>> Sounds great. Is this going to be a new page on webkit.org?
>
> Agree it sounds like a good plan.

So, is the assumption that everyone reads bugmail immediately? When
pinged on #webkit, I get an audible notification, but it's likely that
I won't see bugmail until much later.


- WBR, Alexey Proskuryakov



Re: [webkit-dev] The tree is on fire: a tragedy of the commons

2010-02-26 Thread Maciej Stachowiak


On Feb 26, 2010, at 9:58 AM, Alexey Proskuryakov wrote:



> On 26.02.2010, at 9:50, Jeremy Orlow wrote:
>
>>>> I would re-write rule one as something like this:
>>>> 1.  Comment in the bugzilla bug when the build breaks.  If there is no
>>>> bugzilla bug, comment in #webkit.
>>>> 2.  15 minutes after the break or 10 minutes after the comment, with
>>>> no reply from the breaker, roll out the patch.
>>>
>>> Sounds great. Is this going to be a new page on webkit.org?
>>
>> Agree it sounds like a good plan.
>
> So, is the assumption that everyone reads bugmail immediately? When
> pinged on #webkit, I get an audible notification, but it's likely
> that I won't see bugmail until much later.

I suspect the odds of most people reading bugmail within 10 minutes
are pretty low.


Regards,
Maciej



Re: [webkit-dev] The tree is on fire: a tragedy of the commons

2010-02-26 Thread Maciej Stachowiak


On Feb 26, 2010, at 9:56 AM, Alexey Proskuryakov wrote:



> On 26.02.2010, at 9:29, Maciej Stachowiak wrote:
>
>> I'd like it if we had an IRC bot that announced build breakage on
>> #webkit.
>
> Perhaps better yet, on #webkit-build, as buildbot used to do.

In the past, no one ever joined #webkit-build so this was not an
effective means of notification.


Regards,
Maciej



Re: [webkit-dev] The tree is on fire: a tragedy of the commons

2010-02-26 Thread Jeremy Orlow
On Fri, Feb 26, 2010 at 7:06 PM, Maciej Stachowiak  wrote:

>
> On Feb 26, 2010, at 9:56 AM, Alexey Proskuryakov wrote:
>
>
>> On 26.02.2010, at 9:29, Maciej Stachowiak wrote:
>>
>>  I'd like it if we had an IRC bot that announced build breakage on
>>> #webkit.
>>>
>>
>>
>> Perhaps better yet, on #webkit-build, as buildbot used to do.
>>
>
> In the past, no one ever joined #webkit-build so this was not an effective
> means of notification.
>

I didn't even know it existed until now.  Was there ever an email sent out
on this?  If so, I missed it.


Re: [webkit-dev] The tree is on fire: a tragedy of the commons

2010-02-26 Thread Geoffrey Garen
I think it would be more productive to start with better systems for informing 
people that they've broken something, and move on to rolling out patches 
aggressively if informing people doesn't work.

It's not surprising that people neglect a red tree when they don't know about 
it.

A lot of the proposals on this thread would interfere with this work flow:

1. Finish patch and get it working on local machine.
2. Check in, automatically test for compatibility on other machines and OS's in 
parallel, resolving unexpected problems as they arise.

and change it to this work flow:

0. Purchase and set up about 15 different build environments.
1. Finish patch and get it working on local machine.
2. Manually test on build environments purchased and set up in (0).
3. Check in.

That would be a serious blow to productivity -- probably a cure that is worse 
than the disease.

Bear in mind that the build environments problem is multiplied by Google's 
choice to use a separate JavaScript engine, which effectively almost doubles 
the testing surface area.

Geoff



Re: [webkit-dev] The tree is on fire: a tragedy of the commons

2010-02-26 Thread Enrica Casucci
I didn't know that failing tests would block the commit queue. I saw they were 
failing yesterday afternoon and I thought it was ok to wait until this morning 
to fix them.
My apologies for the inconvenience.
I believe a reasonable approach to handle these situations is to try to contact 
the person responsible for breaking the tests on IRC and, if there is no response 
within an hour, roll back.
I believe that requiring everyone to run the layout tests (the entire suite) 
before committing is the right thing to do.
The only time I haven't done it was yesterday :-(.
Lesson learned.

Enrica

On Feb 26, 2010, at 10:15 AM, Jeremy Orlow wrote:

> On Fri, Feb 26, 2010 at 7:06 PM, Maciej Stachowiak  wrote:
> 
> On Feb 26, 2010, at 9:56 AM, Alexey Proskuryakov wrote:
> 
> 
> On 26.02.2010, at 9:29, Maciej Stachowiak wrote:
> 
> I'd like it if we had an IRC bot that announced build breakage on #webkit.
> 
> 
> Perhaps better yet, on #webkit-build, as buildbot used to do.
> 
> In the past, no one ever joined #webkit-build so this was not an effective 
> means of notification.
> 
> I didn't even know it existed until now.  Was there ever an email sent out on 
> this?  If so, I missed it.


Re: [webkit-dev] The tree is on fire: a tragedy of the commons

2010-02-26 Thread Alexey Proskuryakov


On 26.02.2010, at 10:15, Jeremy Orlow wrote:

> I didn't even know it existed until now.  Was there ever an email
> sent out on this?  If so, I missed it.

Buildbot used to announce results there, but it was a few years ago.
My recollection is that when it worked, about half of active
committers actually joined the channel. I still do, because I'm too
lazy to remove it from my auto-connect list :)

Buildbot was also listening to commands on this channel, which I think
worked as of several months ago. But it no longer works either.


- WBR, Alexey Proskuryakov



Re: [webkit-dev] The tree is on fire: a tragedy of the commons

2010-02-26 Thread Ojan Vafai
On Fri, Feb 26, 2010 at 10:40 AM, Geoffrey Garen  wrote:

> A lot of the proposals on this thread would interfere with this work flow:
>
> 1. Finish patch and get it working on local machine.
> 2. Check in, automatically test for compatibility on other machines and
> OS's in parallel, resolving unexpected problems as they arise.
>

There is a non-trivial cost of this workflow on the rest of the team.
-keeps the commit-queue from running
-often results in new test failures going unnoticed because the tree is
already red
-we can't generally trust that all the tests should pass locally

Clearly, every developer having access to every environment and knowing how
to setup/build/test on each environment is not an option.

Would it be enough for you if you could send a patch to the EWS and get back
the results for any test failures? This would let you maintain the above
workflow without actually committing. Adam/Eric, how close is the EWS to
enabling that? The missing pieces as I see it are:

1. Running the layout tests as part of the EWS.
2. Giving access to the results of any failing tests.

> and change it to this work flow:
>
> 0. Purchase and set up about 15 different build environments.
> 1. Finish patch and get it working on local machine.
> 2. Manually test on build environments purchased and set up in (0).
> 3. Check in.
>
> That would be a serious blow to productivity -- probably a cure that is
> worse than the disease.
>
> Bear in mind that the build environments problem is multiplied by Google's
> choice to use a separate JavaScript engine, which effectively almost doubles
> the testing surface area.
>
> Geoff


Re: [webkit-dev] The tree is on fire: a tragedy of the commons

2010-02-26 Thread Geoffrey Garen
> There is a non-trivial cost of this workflow on the rest of the team.
> -keeps the commit-queue from running
> -often results in new test failures going unnoticed because the tree is 
> already red
> -we can't generally trust that all the tests should pass locally

I think all of the costs you list fundamentally derive from "failures going 
unnoticed." That's the rationale for my suggestion that we start by making sure 
that failures are noticed.

> Would it be enough for you if you could send a patch to the EWS and get back 
> the results for any test failures?

It would certainly be very helpful.

I don't know if it would be enough to make me think a harsh policy of rolling 
out patches was a good idea.

But if we had a good system for making failures noticed, and a working EWS, and 
we still had problems with a red tree, I'm sure I would support some further 
action to solve the problem.

Geoff


Re: [webkit-dev] The tree is on fire: a tragedy of the commons

2010-02-26 Thread Maciej Stachowiak


On Feb 26, 2010, at 11:08 AM, Alexey Proskuryakov wrote:



> On 26.02.2010, at 10:15, Jeremy Orlow wrote:
>
>> I didn't even know it existed until now.  Was there ever an email
>> sent out on this?  If so, I missed it.
>
> Buildbot used to announce results there, but it was a few years ago.
> My recollection is that when it worked, about half of active
> committers actually joined the channel. I still do, because I'm too
> lazy to remove it from my auto-connect list :)
>
> Buildbot was also listening to commands on this channel, which I
> think worked as of several months ago. But it no longer works either.


I believe it announced successes as well as failures, which somewhat  
limited the utility. I think notice only of failures (or returning to  
green after previous failures), plus mention of the blameworthy  
committer's IRC nick, would make a much better notification system.
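
As a rough sketch of that filtering idea (an illustration only, not the
behavior of any existing bot), the Python below turns a chronological stream
of build results into announcement lines. It stays silent on ordinary
successes, reports every failure with the committer's nick, and reports the
first green build after a failure. The function name, the tuple layout, and
the message wording are all hypothetical, and builders with no history are
assumed to have been green.

```python
def announcements(results):
    """Turn chronological build results into IRC-style announcement lines.

    `results` is an iterable of (builder, passed, committer_nick) tuples.
    """
    last_passed = {}  # builder name -> whether its previous build passed
    lines = []
    for builder, passed, nick in results:
        # Assumption: a builder we have not seen before was green.
        previously_passed = last_passed.get(builder, True)
        if not passed:
            # Failures are always announced, naming the committer.
            lines.append(f"{nick}: {builder} build failed")
        elif not previously_passed:
            # Announce a success only when it ends a run of failures.
            lines.append(f"{builder} is green again")
        last_passed[builder] = passed
    return lines
```

Routine green-after-green builds produce no output at all, which is the main
difference from announcing successes as well as failures.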


Regards,
Maciej



Re: [webkit-dev] The tree is on fire: a tragedy of the commons

2010-02-26 Thread Maciej Stachowiak


On Feb 26, 2010, at 11:34 AM, Geoffrey Garen wrote:


>> There is a non-trivial cost of this workflow on the rest of the team.
>> -keeps the commit-queue from running
>> -often results in new test failures going unnoticed because the
>> tree is already red
>> -we can't generally trust that all the tests should pass locally
>
> I think all of the costs you list fundamentally derive from
> "failures going unnoticed." That's the rationale for my suggestion
> that we start by making sure that failures are noticed.

I strongly agree with Geoff that our first step should be to make
failures more visible.

> But if we had a good system for making failures noticed, and a
> working EWS, and we still had problems with a red tree, I'm sure I
> would support some further action to solve the problem.

I agree with this as well.

One goal I have always had for the WebKit project is to have the  
minimum amount of policy necessary for the project to run smoothly. It  
seems good to me that we have less in the way of rules and bureaucracy  
than other open source projects of a similar scale. As the project  
grows, we will certainly need some additional policy. But I would  
prefer to take it in steps. It seems to me like making failures more  
visible would go a long way.



Regards,
Maciej



Re: [webkit-dev] The tree is on fire: a tragedy of the commons

2010-02-26 Thread Simon Fraser
On Feb 26, 2010, at 11:34 AM, Geoffrey Garen wrote:

>> There is a non-trivial cost of this workflow on the rest of the team.
>> -keeps the commit-queue from running
>> -often results in new test failures going unnoticed because the tree is 
>> already red
>> -we can't generally trust that all the tests should pass locally
> 
> I think all of the costs you list fundamentally derive from "failures going 
> unnoticed." That's the rationale for my suggestion that we start by making 
> sure that failures are noticed.
> 
>> Would it be enough for you if you could send a patch to the EWS and get back 
>> the results for any test failures?
> 
> It would certainly be very helpful.
> 
> I don't know if it would be enough to make me think a harsh policy of rolling 
> out patches was a good idea.
> 
> But if we had a good system for making failures noticed, and a working EWS, 
> and we still had problems with a red tree, I'm sure I would support some 
> further action to solve the problem.

Mozilla has (or at least had when I worked there) two additional "tree rules" 
that helped keep the tree green:

1. A sheriff was appointed at all times, and had the authority to close the 
tree if there was significant build or test breakage. Closing the tree meant 
that it was blocked to new commits other than those intended to fix problems. 
Closing the tree also sends a strong message that "something is broken, please 
pitch in and fix it if you can".

Sheriff duties were shared around between responsible committers, so as not to 
overly burden one person.

2. The Mozilla tinderbox page (their buildbot waterfall) had a way for people 
to leave comments, by adding a "star" to a particular build with a comment. 
This is used as a way to communicate that someone has noticed the breakage, and 
is working on it.

In general, I think the waterfall page could be improved in order to make 
"breakage archeology" easier. Entries in the Changes column should be direct 
links to trac changesets, for example.
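
The "closed tree" rule in item 1 can be sketched in a few lines of Python.
This is only an illustration under stated assumptions: the Tree class and its
method names are hypothetical, and it is not a description of how Mozilla's
tooling actually enforced the rule.

```python
from dataclasses import dataclass


@dataclass
class Tree:
    """Hypothetical model of a source tree that a sheriff can close."""
    closed: bool = False
    reason: str = ""

    def close(self, reason):
        # The sheriff closes the tree and records why.
        self.closed = True
        self.reason = reason

    def reopen(self):
        self.closed = False
        self.reason = ""

    def may_commit(self, is_fix):
        # A closed tree only accepts commits intended to fix the breakage.
        return is_fix or not self.closed
```

For example, after tree.close("Snow Leopard tests failing"), a call to
tree.may_commit(is_fix=False) returns False until the sheriff reopens the
tree.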

Simon



Re: [webkit-dev] The tree is on fire: a tragedy of the commons

2010-02-26 Thread Maciej Stachowiak


On Feb 26, 2010, at 11:43 AM, Simon Fraser wrote:



> Mozilla has (or at least had when I worked there) two additional
> "tree rules" that helped keep the tree green:
>
> 1. A sheriff was appointed at all times, and had the authority to
> close the tree if there was significant build or test breakage.
> Closing the tree meant that it was blocked to new commits other than
> those intended to fix problems. Closing the tree also sends a strong
> message that "something is broken, please pitch in and fix it if you
> can".
>
> Sheriff duties were shared around between responsible committers, so
> as not to overly burden one person.

I think the build sheriff idea is a good one. Maybe what we want is to
have a sheriff responsible for each build train that has an active
buildbot. (It could be the same person responsible for several build
trains, the main qualification would be having reasonable familiarity
with a port and access to its build environment.)

However, I am not so sure "close the tree" is necessarily the best
focus for sheriff actions. What I'd prefer to see is that the sheriff
is the person primarily responsible for reverting broken patches if not
fixed in a timely manner. Then we could have some human judgment in
the process and specific people with clear responsibility.

> 2. The Mozilla tinderbox page (their buildbot waterfall) had a way
> for people to leave comments, by adding a "star" to a particular
> build with a comment. This is used as a way to communicate that
> someone has noticed the breakage, and is working on it.

Sounds like a good idea. Wondering if that fits better in the console
view or the extensions view.

> In general, I think the waterfall page could be improved in order to
> make "breakage archeology" easier. Entries in the Changes column
> should be direct links to trac changesets, for example.

That sounds good too. Another thing that would help is adding "next
page" links to the console view, like we have on the waterfall. The
console link often makes it easier to quickly identify the patch that
went bad, but only if the badness is recent enough to show up.


Regards,
Maciej



Re: [webkit-dev] The tree is on fire: a tragedy of the commons

2010-02-26 Thread Jeremy Orlow
On Fri, Feb 26, 2010 at 8:53 PM, Maciej Stachowiak  wrote:

>
> On Feb 26, 2010, at 11:43 AM, Simon Fraser wrote:
>
>
>> Mozilla has (or at least had when I worked there) two additional "tree
>> rules" that helped keep the tree green:
>>
>> 1. A sheriff was appointed at all times, and had the authority to close
>> the tree if there was significant build or test breakage. Closing the tree
>> meant that it was blocked to new commits other than those intended to fix
>> problems. Closing the tree also sends a strong message that "something is
>> broken, please pitch in and fix it if you can".
>>
>> Sheriff duties were shared around between responsible committers, so as
>> not to overly burden one person.
>>
>
> I think the build sheriff idea is a good one. Maybe what we want is to have
> a sheriff responsible for each build train that has an active buildbot. (It
> could be the same person responsible for several build trains, the main
> qualification would be having reasonable familiarity with a port and access
> to its build environment.)
>
> However, I am not so sure "close the tree" is necessarily the best focus
> for sheriff actions. What I'd prefer to see is that the sheriff is the person
> primarily responsible for reverting broken patches if not fixed in a timely
> manner. Then we could have some human judgment in the process and specific
> people with clear responsibility.


I agree "close the tree" is not necessary for the reasons you listed.
And I think most people from Chromium would welcome this change
(sheriff + ability to close).  We've been advocating it for some time now.
:-)


>> 2. The Mozilla tinderbox page (their buildbot waterfall) had a way for
>> people to leave comments, by adding a "star" to a particular build with a
>> comment. This is used as a way to communicate that someone has noticed the
>> breakage, and is working on it.
>>
>
> Sounds like a good idea. Wondering if that fits better in the console view
> or the extensions view.
>
>
>
>> In general, I think the waterfall page could be improved in order to make
>> "breakage archeology" easier. Entries in the Changes column should be direct
>> links to trac changesets, for example.
>>
>
> That sounds good too. Another thing that would help is adding "next page"
> links to the console view, like we have on the waterfall. The console link
> often makes it easier to quickly identify the patch that went bad, but only
> if the badness is recent enough to show up.
>
> Regards,
> Maciej
>
>


Re: [webkit-dev] The tree is on fire: a tragedy of the commons

2010-02-26 Thread William Siegrist
On Feb 26, 2010, at 9:29 AM, Maciej Stachowiak wrote:

> 
> On Feb 26, 2010, at 9:17 AM, Dimitri Glazkov wrote:
> 
>> To summarize the thread:
>> 
>> 1) We're adopting "when in doubt, roll it out" approach to patches
>> that turn tree red.
> 
> I think it's polite, though not mandatory, to make a reasonable effort to 
> find the person responsible for the breakage and give them a chance to fix 
> it. (This doesn't have to mean hunting around for hours or days, but you 
> could send email or ask on IRC.) Also acceptable to fix it yourself, if it is 
> obvious how.
> 
>> 2) Need to find a way to run Mac-EWS for non-committers.
>> 3) Enable "build-break" emails to webkit-dev or another opt-in mailing list
>> 
>> What else?
> 
> I'd like it if we had an IRC bot that announced build breakage on #webkit.
> 


The buildbot master lives on hardware that cannot host IRC bots, at least by 
default. I'd rather the bot be external to the master, but if you really need a 
bot on that hardware, I can start the request process. 

-Bill


Re: [webkit-dev] The tree is on fire: a tragedy of the commons

2010-02-26 Thread Jeremy Orlow
On Fri, Feb 26, 2010 at 9:00 PM, Jeremy Orlow  wrote:

> On Fri, Feb 26, 2010 at 8:53 PM, Maciej Stachowiak  wrote:
>
>>
>> On Feb 26, 2010, at 11:43 AM, Simon Fraser wrote:
>>
>>
>>> Mozilla has (or at least had when I worked there) two additional "tree
>>> rules" that helped keep the tree green:
>>>
>>> 1. A sheriff was appointed at all times, and had the authority to close
>>> the tree if there was significant build or test breakage. Closing the tree
>>> meant that it was blocked to new commits other than those intended to fix
>>> problems. Closing the tree also sends a strong message that "something is
>>> broken, please pitch in and fix it if you can".
>>>
>>> Sheriff duties were shared around between responsible committers, so as
>>> not to overly burden one person.
>>>
>>
>> I think the build sheriff idea is a good one. Maybe what we want is to
>> have a sheriff responsible for each build train that has an active buildbot.
>> (It could be the same person responsible for several build trains, the main
>> qualification would be having reasonable familiarity with a port and access
>> to its build environment.)
>>
>> However, I am not so sure "close the tree" is necessarily the best focus
>> for sheriff actions. What I'd prefer to see is that the sheriff is the person
>> primarily responsible for reverting broken patches if not fixed in a timely
>> manner. Then we could have some human judgment in the process and specific
>> people with clear responsibility.
>
>
> I agree "close the tree" is not necessary for the reasons you listed.
> And I think most people from the Chromium project would welcome this change
> (sheriff + ability to close).  We've been advocating it for some time now.
> :-)
>

Oops... I completely misread what you said.

Being able to close the tree is important because it can sometimes take a
while to sort out what caused which failures, and it's important not to allow
more breakage in the meantime.  In Chromium, we often have a good deal of
redness, but as long as the sheriffs feel they're on top of it, the tree
stays open.  Now, I'll admit that we have many more long-running bots (like
memory leak bots), so these kinds of train wrecks that require sorting happen
far less often in WebKit, but it still might be nice to have the ability when
necessary.

The suggestion below (2) about notes on the waterfall sounds great, but we
do OK by abusing the "tree is closed/open" string to keep track of other
state (like who's working on what fix).  We've found this works "good
enough".  And maybe some informal banner like this would be good enough for
the first rev, unless we thought per-CL annotations would be easy to
implement.

I'll note that in the Chromium project, we've had a very strong "keep the
tree green" ethic for some time now, and we have a good deal of experience
with it.  Certainly there are multiple ways to solve various problems, but
it might be worth looking at how we do things to see whether other parts of
our process might be of interest.


>>> 2. The Mozilla tinderbox page (their buildbot waterfall) had a way for
>>> people to leave comments, by adding a "star" to a particular build with a
>>> comment. This is used as a way to communicate that someone has noticed the
>>> breakage, and is working on it.
>>>
>>
>> Sounds like a good idea. Wondering if that fits better in the console view
>> or the extensions view.
>>
>>
>>
>>> In general, I think the waterfall page could be improved in order to make
>>> "breakage archeology" easier. Entries in the Changes column should be direct
>>> links to trac changesets, for example.
>>>
>>
>> That sounds good too. Another thing that would help is adding "next page"
>> links to the console view, like we have on the waterfall. The console link
>> often makes it easier to quickly identify the patch that went bad, but only
>> if the badness is recent enough to show up.
>>
>> Regards,
>> Maciej
>>
>>


Re: [webkit-dev] The tree is on fire: a tragedy of the commons

2010-02-26 Thread Ojan Vafai
On Fri, Feb 26, 2010 at 11:53 AM, Maciej Stachowiak  wrote:

> On Feb 26, 2010, at 11:43 AM, Simon Fraser wrote:
>
> 2. The Mozilla tinderbox page (their buildbot waterfall) had a way for
> people to leave comments, by adding a "star" to a particular build with a
> comment. This is used as a way to communicate that someone has noticed the
> breakage, and is working on it.
>
> Sounds like a good idea. Wondering if that fits better in the console view
> or the extensions view.


Another, perhaps easier to implement, approach would be to have a single
status message that is iframed at the top of the waterfall and console
pages. This has proven good enough for Chromium. See the message at the top
of build.chromium.org.

http://chromium-status.appspot.com/current

The status can then be updated at http://chromium-status.appspot.com/
(requires login...not sure why), which also shows the last 25 statuses.

People use it in ways like "2 win release failures -> ojan, mac compile ->
dglazkov, qt failure -> ???" to indicate that ojan/dglazkov are currently
actively fixing those and qt has a failure that needs an owner.
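
A free-form status line like the one above can also be read by a script,
for example to list failures that still need an owner. The following is a
minimal sketch, assuming the informal "failure -> owner" convention with
commas between entries and "???" for an unowned failure. The function
names are hypothetical and are not part of the chromium-status app.

```python
from typing import Dict, List, Optional


def parse_status(status: str) -> Dict[str, Optional[str]]:
    """Map each failure description to its owner, or None if unowned.

    Assumes entries are comma-separated and written as "failure -> owner".
    """
    owners: Dict[str, Optional[str]] = {}
    for item in status.split(","):
        item = item.strip()
        if not item:
            continue
        failure, arrow, owner = item.partition("->")
        owner = owner.strip()
        # No arrow, an empty owner, or a run of question marks all mean
        # that nobody has claimed the failure yet.
        if not arrow or not owner or set(owner) == {"?"}:
            owners[failure.strip()] = None
        else:
            owners[failure.strip()] = owner
    return owners


def unowned_failures(status: str) -> List[str]:
    """Return the failures that still need an owner, in status order."""
    return [f for f, o in parse_status(status).items() if o is None]


if __name__ == "__main__":
    line = ("2 win release failures -> ojan, mac compile -> dglazkov, "
            "qt failure -> ???")
    print(unowned_failures(line))
```

Because the status stays a plain string, people can keep writing it by
hand; the parser only has to tolerate the loose conventions already in use.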

For the record, I fully support making warnings more visible and improving
the EWS/buildbot infrastructure before resorting to adding new policies.

On the topic of buildbot infrastructure, one problem I've had is that the
bots sometimes fall quite far behind. I made a commit last week that took
*hours* before the tests ran on the Windows bot. Sitting around for 30
minutes to see the tree green after a commit is one thing; sitting around
for 4 hours is another. Hopefully, running tests in parallel will resolve
many of these issues.
 

Ojan


Re: [webkit-dev] The tree is on fire: a tragedy of the commons

2010-02-26 Thread Maciej Stachowiak


On Feb 26, 2010, at 12:11 PM, William Siegrist wrote:

> On Feb 26, 2010, at 9:29 AM, Maciej Stachowiak wrote:
>
>> On Feb 26, 2010, at 9:17 AM, Dimitri Glazkov wrote:
>>
>>> To summarize the thread:
>>>
>>> 1) We're adopting "when in doubt, roll it out" approach to patches
>>> that turn tree red.
>>
>> I think it's polite, though not mandatory, to make a reasonable
>> effort to find the person responsible for the breakage and give
>> them a chance to fix it. (This doesn't have to mean hunting around
>> for hours or days, but you could send email or ask on IRC.) Also
>> acceptable to fix it yourself, if it is obvious how.
>>
>>> 2) Need to find a way to run Mac-EWS for non-committers.
>>> 3) Enable "build-break" emails to webkit-dev or another opt-in
>>> mailing list
>>>
>>> What else?
>>
>> I'd like it if we had an IRC bot that announced build breakage on
>> #webkit.
>
> The buildbot master lives on hardware that cannot host IRC bots, at
> least by default. I'd rather the bot be external to the master, but
> if you really need a bot on that hardware, I can start the request
> process.


As long as the master can notify whatever host is running the bot, it
seems to me that it doesn't much matter whether the bot runs on the
same hardware as the master.


I'm not really up on the internal details of buildbot, so I am not  
sure what would be easier to implement.
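
To illustrate the "bot external to the master" option, here is a minimal
sketch of the logic such a bot might run on its own host. It assumes the
master can deliver a "build finished" event to that host by some means
(how is left open here), and it only formats the IRC line to send; it does
not open a connection. The class, method names, and message wording are
hypothetical and do not come from buildbot. The "r" revision prefix assumes
Subversion-style revision numbers.

```python
from typing import Dict, Optional


def irc_privmsg(channel: str, text: str) -> str:
    """Format one IRC PRIVMSG line, terminated by CRLF as IRC requires."""
    # Strip embedded line breaks so the text cannot inject extra commands.
    safe = text.replace("\r", " ").replace("\n", " ")
    return f"PRIVMSG {channel} :{safe}\r\n"


class BreakageAnnouncer:
    """Announce only state changes, so a red builder is not repeated."""

    def __init__(self, channel: str = "#webkit") -> None:
        self.channel = channel
        # builder name -> whether its last finished build passed
        self.last_passed: Dict[str, bool] = {}

    def on_build_finished(self, builder: str, revision: int,
                          passed: bool) -> Optional[str]:
        """Return the IRC line to send, or None if nothing changed."""
        # A builder never seen before is assumed to have been green.
        previous = self.last_passed.get(builder, True)
        self.last_passed[builder] = passed
        if previous and not passed:
            return irc_privmsg(self.channel,
                               f"{builder} broke at r{revision}")
        if not previous and passed:
            return irc_privmsg(self.channel,
                               f"{builder} is green again at r{revision}")
        return None
```

Keeping the announce-on-transition logic separate from the transport means
the same code works whether the master pushes events to the bot or the bot
polls the master.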


Regards,
Maciej
