Re: [Wikitech-l] Merging near deployment branch cut time

2014-03-07 Thread Gabriel Wicke
On 03/07/2014 10:08 AM, Greg Grossmeier wrote:
> What we should do, however, is have a true "deployment pipeline".
> Briefly defined: A deployment pipeline is a sequence of events that
> increase your confidence in the quality of any particular build/commit
> point.
> 
> A typical example is:
> commit -> unit tests -> integration tests -> manual tests -> release

This is pretty much the way this currently works in Parsoid. We deploy
twice per week, with integration tests currently being our mass
round-trip testing setup on 160k pages. Those tests take a few hours to
run, so we only deploy revisions for which round-trip testing has
finished. Anything uncovered there is fed back to improvements in parser
tests, so over time it has become less common to catch regressions only
in round-trip tests. With improved integration tests, manual testing
should also mostly be eliminated over time.

> Really, a pipeline isn't a thing like your indoor plumbing but more of a
> mindset/way of designing your test infrastructure. But, it means that
> you keep things self-contained (contrary to the mobile example above)
> and things progress through the pipeline in a predictable way/pace.

This is one of the big arguments for narrow interfaces and services.

In Parsoid we have small mock implementations of the MediaWiki API
endpoints we use, which allow us to run parser tests without a wiki in
the background. Network services tend to be at a medium granularity (coarser
than modules, finer than the entire system) with necessarily narrow
interfaces. Doing much of the testing at this level often seems to
strike a good balance between effort, run time (still suitable for CI)
and capturing the interface behavior essential to users of the service.
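As a toy illustration of that idea (hypothetical names, not Parsoid's actual mock code): if the parser reaches the wiki only through a narrow API call, tests can inject a canned in-memory implementation of that endpoint.

```python
# Toy sketch (hypothetical, not Parsoid's actual mocks): the parser fetches
# wikitext only through an injected api_get callable, so tests can swap the
# real MediaWiki API for a canned in-memory endpoint.

def parse_page(title, api_get):
    """Fetch wikitext through the API interface and do a trivial 'parse'."""
    wikitext = api_get(action="query", titles=title)
    # Convert the first '''bold''' span, just to have observable output.
    return wikitext.replace("'''", "<b>", 1).replace("'''", "</b>", 1)

def mock_api_get(**params):
    """In-memory stand-in for the MediaWiki API endpoint, for use in tests."""
    pages = {"Main Page": "'''Hello''' world"}
    return pages[params["titles"]]

# Parser tests run against the mock; no wiki needed in the background.
html = parse_page("Main Page", mock_api_get)
```

The narrow interface is what makes the swap cheap: only one callable has to be faked, not a whole wiki.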

Gabriel

___
Wikitech-l mailing list
Wikitech-l@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikitech-l

Re: [Wikitech-l] Merging near deployment branch cut time

2014-03-07 Thread addshorewiki
One thing I didn't mention is that we recently marked a few of our selenium
tests as smoke tests which I just spotted was suggested in the other email
thread for core!

Great idea!

Addshore



Re: [Wikitech-l] Merging near deployment branch cut time

2014-03-07 Thread addshorewiki
I feel like I should probably post here about the current Wikibase /
Wikidata deployment pipeline too which probably differs slightly to other
products.

On a per-commit basis:
A commit is made and the unit tests run on Jenkins; the commit is
reviewed, amended, and merged; Jenkins runs the unit tests again as a
gate on submit; and post-merge, Travis also runs the unit tests against
PHP 5.3, 5.4, and 5.5 with both SQLite and MySQL.

Daily at 10AM UTC:
Our build process is triggered and creates a build for Wikidata. Both
the WMF Jenkins and our WMDE Jenkins run the unit tests; if both pass,
we +2 CR and the build is merged. It is then deployed straight to beta,
where our Jenkins runs all of our Selenium tests, and the tests report
their outcome back to the build's Gerrit commit.

Branch Day (every 2 weeks):
Tuesday is generally branch day: we branch the Wikidata repo and re-run
all the unit tests and Selenium tests. We deploy to test.wikidata, test
manually on Thursday, and deploy to wikidata.org the following Tuesday.

We have very good test coverage which generally makes everything much
easier! I probably missed something of interest above but generally
everything is covered.

Addshore



Re: [Wikitech-l] Merging near deployment branch cut time

2014-03-07 Thread Greg Grossmeier

> Let's also take this into a new thread. There are a lot of different
> conversations now going on


My opinion is that fixing this with policy is going to be hard.

Either everyone who commits needs to be mindful of what day/time it is
and whether or not another human has cut the new branch yet (which isn't
set in stone on when, it varies by a couple hours, depending on a lot of
factors), OR we modify the branch cut based on some arbitrary offset
(24 hours ago) or some human looks at the merges and picks a point.

None of those are ideal/scalable.

What we should do, however, is have a true "deployment pipeline".
Briefly defined: A deployment pipeline is a sequence of events that
increase your confidence in the quality of any particular build/commit
point.

A typical example is:
commit -> unit tests -> integration tests -> manual tests -> release


Each step has the ability to fail a build, which means "You shall not
pass!" to that commit point. The earlier you get a "You shall not pass!"
the better, because it means developers spend less time waiting to learn
whether what they committed is OK.
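A minimal sketch of that gating idea (stage names and checks are made up, not our actual Jenkins jobs): each stage can reject a build, and a failure stops the commit from progressing any further.

```python
# Illustrative sketch of a gated deployment pipeline: run each stage in
# order, and stop at the first stage whose check rejects the commit.

def run_pipeline(commit, stages):
    """Run stages in order; return (stages_passed, failing_stage_or_None)."""
    passed = []
    for name, check in stages:
        if not check(commit):
            return passed, name  # "You shall not pass!"
        passed.append(name)
    return passed, None          # survived every gate: ready for release

stages = [
    ("unit tests", lambda c: c["units_ok"]),
    ("integration tests", lambda c: c["integration_ok"]),
    ("manual tests", lambda c: c["manual_ok"]),
]

good = {"units_ok": True, "integration_ok": True, "manual_ok": True}
bad = {"units_ok": True, "integration_ok": False, "manual_ok": True}

ok_result = run_pipeline(good, stages)   # passes every stage
bad_result = run_pipeline(bad, stages)   # stopped at integration tests
```

The point of the early stop is exactly the "earlier is better" property above: a commit that fails unit tests never wastes integration or manual-testing time.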


What this means for us:
The Mobile team is actually a good example. They are doing The Right
Thing and have a lot of tests written, including browser tests. They run
into problems when, eg: they write a new feature and associated test and
commit it.

Beta Cluster gets that code (feature and test) within 5 minutes.

But, test.wikipedia and en.wikipedia get that feature much later, days
later.

However, the test code is run by Jenkins across all environments (Beta
Cluster, test.wikipedia, en.wikipedia, etc.) all the time. So, the mobile
team gets a ton of false positives when their new test runs against, e.g.,
production, where the feature isn't enabled yet (on purpose).

The QA team is working on this problem now (loosely termed the
"versioned test problem").


How a pipeline would help:

Really, a pipeline isn't a thing like your indoor plumbing but more of a
mindset/way of designing your test infrastructure. But, it means that
you keep things self-contained (contrary to the mobile example above)
and things progress through the pipeline in a predictable way/pace.

It also means that each code commit spends the exact same amount of time
in the various stages as other code commits. Right now some code sits on
Beta Cluster for 7 days before hitting production, whereas other code
spends 0-5 minutes. That's not good.

Wanna help us on this problem? We're hiring:
https://hire.jobvite.com/Jobvite/jobvite.aspx?b=nHZ0zmw6 
(2 job openings)


Greg


-- 
| Greg Grossmeier            GPG: B2FA 27B1 F7EB D327 6B8E |
| identi.ca: @greg                A18D 1138 8E47 FAC8 1C7D |


Re: [Wikitech-l] Merging near deployment branch cut time

2014-03-07 Thread Bryan Davis
On Fri, Mar 7, 2014 at 10:30 AM, Jon Robson  wrote:
> Let's also take this into a new thread. There are a lot of different
> conversations now going on
>
> On Fri, Mar 7, 2014 at 9:21 AM, Brad Jorsch (Anomie)
>  wrote:
>> On Fri, Mar 7, 2014 at 12:08 PM, C. Scott Ananian 
>> wrote:
>>
>>> I agree.  I think a better technical solution would be to halt jenkins'
>>> auto-merge for the 24 hour period, so that +2'ed changes are not
>>> automatically merged until after the branch is cut.
>>
>>
>> I don't see how that's any better. Things still aren't getting merged.
>>
>> If anything, the "cut using master@{24 hours ago}" is a much better
>> idea.[1] Although it might be useful to see if Wednesday tends to be a
>> relatively active bug-fixing day as the community on non-Wikipedia sites
>> finds issues in the version that was deployed to them on Tuesday, in which
>> case keeping those from making it into the new cut on Thursday (and so
>> requiring more backports or waiting an extra week for fixes) might not be
>> so great.
>>
>>
>>  [1]: And yes, 'master@{24 hours ago}' is valid git syntax.
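Brad's footnote [1] is easy to sanity-check in a throwaway repo (a sketch, assuming git is on the PATH): `<branch>@{24 hours ago}` resolves through the branch's reflog, so you can cut a branch from wherever the branch pointed at that time. In a brand-new repo git just warns and falls back to the oldest reflog entry.

```python
# Sanity check of the '@{<date>}' revision syntax in a temporary repo.
import os
import subprocess
import tempfile

def git(*args, cwd):
    """Run a git command in cwd, capturing output instead of printing it."""
    return subprocess.run(["git", *args], cwd=cwd,
                          capture_output=True, text=True)

with tempfile.TemporaryDirectory() as repo:
    git("init", cwd=repo)
    git("config", "user.email", "test@example.org", cwd=repo)
    git("config", "user.name", "Test", cwd=repo)
    with open(os.path.join(repo, "f.txt"), "w") as f:
        f.write("x\n")
    git("add", "f.txt", cwd=repo)
    git("commit", "-m", "first", cwd=repo)
    branch = git("rev-parse", "--abbrev-ref", "HEAD", cwd=repo).stdout.strip()
    # Cut a branch from where the branch pointed (per the reflog) 24h ago.
    result = git("branch", "cut", f"{branch}@{{24 hours ago}}", cwd=repo)
    verify = git("rev-parse", "--verify", "cut", cwd=repo)
```

Note the syntax consults the local reflog, not commit timestamps, so it reflects when *your* clone saw the branch move.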

In another project in another place with another team... I solved a
very similar problem by creating release branches from tags that the
testing system placed on revisions that had passed the integration
test suite rather than whatever HEAD happened to be at the time that
the branch was cut. We had a very heavy (~2hr wall clock time; ~24hr
cpu time) integration test suite that ran once a day. When the master
job that kicked off these tests found that no child tests had failed
it would tag the revisions that had been tested across all of the
involved repositories with something like 'integration-MMDD'. Our
release branch was forked from these tags. In the MW situation it
might be nicer to run such test processes more often than once a day,
but the fundamental idea could/should work.
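The scheme can be sketched like this (made-up revisions and names, not Bryan's actual setup): rather than branching from whatever HEAD happens to be, branch from the newest revision the integration run has marked green.

```python
# Sketch of "branch from the last tested revision": the daily integration
# job tags the revisions it verified, and the release branch is cut from
# the newest green tag instead of from HEAD. Data below is made up.
from datetime import date

def last_green(history):
    """history is an oldest-to-newest list of (sha, integration_passed)."""
    for sha, passed in reversed(history):
        if passed:
            return sha
    return None  # no revision has passed the suite yet

def tag_name(day):
    """Name for the tag the master test job would place after a green run."""
    return f"integration-{day:%Y%m%d}"

history = [
    ("a1f3", True),   # passed yesterday's integration run
    ("b2e9", True),   # passed today's integration run
    ("c3d7", False),  # merged after the run started; not yet tested
]

cut_point = last_green(history)      # branch from here, not from "c3d7"
tag = tag_name(date(2014, 3, 7))
```

Untested commits like "c3d7" simply wait for the next green run, which is the gated promotion Bryan describes.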

This type of gated promotion process won't stop all problems from
getting to the next stage, but it might make the inclusion of code
into the weekly branch slightly safer. It also could be seen as a
stepping stone to further automation that could decrease the time
between production minor version deploys. Someday we might be able to
deploy "continuously" to some set of wikis where continuously doesn't
mean every individual commit but every time the integration test suite
says that things are stable.

Bryan
-- 
Bryan Davis                                    Wikimedia Foundation
[[m:User:BDavis_(WMF)]]     Sr Software Engineer           Boise, ID USA
irc: bd808                                     v: 415.839.6885 x6855
