Re: [Wikitech-l] [Ops] Changes to SWAT deployment policies, effective Monday April 30th

2018-04-28 Thread Gergo Tisza
On Sat, Apr 28, 2018 at 2:19 AM, Tyler Cipriani 
wrote:

> Looking at this info maybe 6 is the magic number?
>

Other options would be to have more windows (an EU morning SWAT maybe?),
and/or make it less of an expectation to have your patches deployed in the
window you scheduled them for (e.g. not starting to deploy new patches
after :45; anything that overflows just gets pushed to the next day).

Another thing that could be reconsidered for more even use of SWAT windows
is the requirement that the patch owner has to be present; we could just
request a test plan in the commit summary instead for low-risk patches.
___
Wikitech-l mailing list
Wikitech-l@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikitech-l

Re: [Wikitech-l] [Ops] Changes to SWAT deployment policies, effective Monday April 30th

2018-04-28 Thread Martin Urbanec
On Sat, Apr 28, 2018 at 0:05, Greg Grossmeier
wrote:

> 
> > ie. if the new policy was introduced at the start of the year, it would
> > have caused at least a two-week delay in all three SWAT windows at one
> > point at least, and it would have caused an ever-growing backlog of
> > patches for EU mid-day.
>
> 2 week delays aren't possible as all SWATs are backports of master, so
> if waiting a week it'd go out with the normal train ;)
>

No, the majority of patches deployed in a SWAT window are in
operations/mediawiki-config, and this repository does not have release
branches or train deploys. If I were the only one using a certain SWAT
window and needed to deploy 5 patches per window, then in the first window
I'd deploy 4 patches and the remaining one would be rescheduled; in the
second window I'd have 1 rescheduled patch and 5 new ones (6 in total), 4
would be deployed and 2 rescheduled; in the third window I'd have 2
rescheduled patches and 5 new ones (7 in total), 4 would be deployed and 3
rescheduled; and so on.

I'd have a constantly growing backlog of patches and would be forced to
either a) reserve a window to deploy all the waiting patches or b) use
another SWAT window.
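
The arithmetic above can be sketched in a few lines of Python. This is only an illustration, assuming a single requester, a constant 5 new patches per window and a hard limit of 4; the function name is made up for the example.

```python
def simulate_backlog(new_per_window, limit, windows):
    """Return (queued, deployed, rescheduled) for each successive window."""
    history = []
    carried = 0  # patches rescheduled from the previous window
    for _ in range(windows):
        queued = carried + new_per_window
        deployed = min(queued, limit)
        carried = queued - deployed
        history.append((queued, deployed, carried))
    return history


for n, (queued, deployed, rescheduled) in enumerate(simulate_backlog(5, 4, 3), start=1):
    print(f"window {n}: {queued} queued, {deployed} deployed, {rescheduled} rescheduled")
# window 1: 5 queued, 4 deployed, 1 rescheduled
# window 2: 6 queued, 4 deployed, 2 rescheduled
# window 3: 7 queued, 4 deployed, 3 rescheduled
```

With these inputs the rescheduled count grows by one per window, so it never drains unless an extra window is used.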

This will cause even more pain after US holidays, because bigger SWAT
windows are expected then.

I know that deploying 8 patches in a window is a problem (at least in the EU
SWAT window; I don't use the other windows that much), but 4 is the other
extreme.

>
> But point taken and thanks for the data.
>
> I'm comfortable with a 6 patch "limit". I put limit in scare quotes
> because I assume you all know that high priority changes can go out when
> they need to go out. The limit is to reduce SWAT deploy stress for what
> is possible in a 1 hour window and not try to squeeze things in and
> reduce the clarity when watching logs.
>
> Greg
>
> --
> | Greg Grossmeier            GPG: B2FA 27B1 F7EB D327 6B8E |
> | Release Team Manager            A18D 1138 8E47 FAC8 1C7D |
>

Re: [Wikitech-l] [Ops] Changes to SWAT deployment policies, effective Monday April 30th

2018-04-27 Thread Tyler Cipriani

On 18-04-27 21:00:35, Gergo Tisza wrote:

> On Fri, Apr 27, 2018 at 7:05 PM, Niharika Kohli
> wrote:
>
> > Also, I think dropping the limit to 4 patches per window is extreme,
> > especially if we are asking people to start splitting their patches now.
> > Very often we can +2 multiple patches in one go if they don't affect each
> > other, or sync out changes together if they happen to the same file. I've
> > deployed 8 patches in a window often, with people asking if they can add
> > more yet. Due to timezone limitations, most people can only attend one of
> > the SWAT windows and if they can't get it out in that window, they have to
> > wait a whole day or more to get it out.
>
> FWIW, here are some stats on the patch counts for the first three months of
> 2018 (might contain errors, tried not to spend too much time on it):
>
> EU mid: 8, 6, 2, 10, 1, 6, 1, 2, 1, 2, 2, 6, 2, 9, 6, 10, 2, 1, 1, 8, 7, 5,
> 5, 5, 2, 7, 6, 8, 4, 7, 4, 4, 4, 1, 1, 4, 3, 3, 2, 4, 4, 5, 5, 1
> (average: 4.25, max of 2-week rolling average: 5.7)
> Morning: 7, 1, 1, 5, 1, 3, 4, 1, 6, 3, 4, 3, 3, 5, 6, 4, 3, 5, 8, 4, 5, 4,
> 3, 0, 3, 1, 5, 9, 2, 1, 3
> (average: 3.6, max of 2-week rolling average: 5)
> Evening: 8, 1, 1, 1, 1, 3, 3, 2, 1, 3, 4, 2, 1, 1, 2, 1, 1, 5, 1, 6, 1, 3,
> 3, 1, 2, 3, 1, 0, 9, 10, 5, 4, 1, 6, 2, 1, 2, 4, 3
> (average: 2.8, max of 2-week rolling average: 4.75)



These stats are really cool and they made me want to dig a little more. There 
have been a few times where having data about the actual syncs that make up a 
given SWAT window would be nice[0] (this being another one of those times).


As of now, to get information about the number of syncs that make up a 
given SWAT window -- or to see how long a SWAT window actually takes -- there 
is some digging in the SAL required (and even then it can be hard to 
figure out what happened if there is a window with patches, but no 
syncs, or just one sync, etc). Anyway, I spent some time digging in the 
SAL[1] to correlate SWAT windows on Wikitech to actual syncs and 
deployments.


One thing I found is that the number of patches listed on Wikitech isn't
necessarily the number of patches that make it out in a given window -- which
makes sense -- sometimes we run out of time in the window or people don't show
up or something breaks and we have to stop.


2018-01-02 Europe:  8 patches  1:05 6/8
2018-01-03 Evening: 8 patches  1:01 8/8
2018-01-08 Europe:  8 patches  1:03 8/8
2018-01-29 Europe:  9 patches  0:58 4/9
2018-02-06 Europe:  10 patches 1:02 7/10
2018-02-13 Europe:  8 patches  1:01 5/8
2018-02-28 Europe:  8 patches  1:16 7/8

The other thing I found is that there was no SWAT window between 2018-01-02 and 
2018-03-09 with > 6 patches that we kept within the allotted 1 hour time limit 
and deployed all the patches (although we were very close a couple of times).


2018-01-02 Europe:  8 patches  1:05 6/8
2018-01-03 Evening: 8 patches  1:01
2018-01-03 Morning: 7 patches  1:19
2018-01-08 Europe:  8 patches  1:03
2018-01-29 Europe:  9 patches  0:58 4/9
2018-02-06 Europe:  10 patches 1:02 7/10
2018-02-13 Europe:  8 patches  1:01 5/8
2018-02-14 Europe:  7 patches  1:31 5/7
2018-02-26 Europe:  7 patches  1:32
2018-02-28 Europe:  8 patches  1:16 7/8
2018-03-05 Europe:  7 patches  0:57 5/7
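
The check described above can be reproduced from the rows in this list. A rough sketch, assuming the duration column is h:mm and that a row without a deployed/scheduled fraction means every patch went out (that second assumption is a guess; the SAL would need to be consulted to confirm it):

```python
# (date, window, scheduled patches, duration "h:mm", deployed or None if not listed)
WINDOWS = [
    ("2018-01-02", "Europe", 8, "1:05", 6),
    ("2018-01-03", "Evening", 8, "1:01", None),
    ("2018-01-03", "Morning", 7, "1:19", None),
    ("2018-01-08", "Europe", 8, "1:03", None),
    ("2018-01-29", "Europe", 9, "0:58", 4),
    ("2018-02-06", "Europe", 10, "1:02", 7),
    ("2018-02-13", "Europe", 8, "1:01", 5),
    ("2018-02-14", "Europe", 7, "1:31", 5),
    ("2018-02-26", "Europe", 7, "1:32", None),
    ("2018-02-28", "Europe", 8, "1:16", 7),
    ("2018-03-05", "Europe", 7, "0:57", 5),
]


def minutes(duration):
    """Convert an "h:mm" string to minutes."""
    hours, mins = duration.split(":")
    return int(hours) * 60 + int(mins)


def clean_windows(rows, limit_minutes=60):
    """Windows that stayed within the time limit and deployed everything."""
    result = []
    for date, window, scheduled, duration, deployed in rows:
        # Assumption: a missing deployed count means every patch went out.
        shipped = scheduled if deployed is None else deployed
        if minutes(duration) <= limit_minutes and shipped == scheduled:
            result.append((date, window))
    return result


print(clean_windows(WINDOWS))  # prints []
```

The only two windows that finished within the hour (2018-01-29 and 2018-03-05) both left patches undeployed, which is why the result is empty.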

Looking at this info maybe 6 is the magic number?

FWIW, I feel like I struggle to get out 8 patches in an hour (depending on the 
patches).


Although requiring more patches per change while also allowing fewer patches
in a given window may not be the best course of action. As Chad said elsewhere
in the thread, maybe we should focus on "changes" per window, where a "change"
is the equivalent of a patch currently.

-- Tyler

[0]. 
[1].  


Re: [Wikitech-l] [Ops] Changes to SWAT deployment policies, effective Monday April 30th

2018-04-27 Thread Greg Grossmeier

> ie. if the new policy was introduced at the start of the year, it would
> have caused at least a two-week delay in all three SWAT windows at one
> point at least, and it would have caused an ever-growing backlog of patches
> for EU mid-day.

2 week delays aren't possible as all SWATs are backports of master, so
if waiting a week it'd go out with the normal train ;)

But point taken and thanks for the data.

I'm comfortable with a 6 patch "limit". I put limit in scare quotes
because I assume you all know that high priority changes can go out when
they need to go out. The limit is to reduce SWAT deploy stress for what
is possible in a 1 hour window and not try to squeeze things in and
reduce the clarity when watching logs.

Greg

-- 
| Greg Grossmeier            GPG: B2FA 27B1 F7EB D327 6B8E |
| Release Team Manager            A18D 1138 8E47 FAC8 1C7D |


Re: [Wikitech-l] [Ops] Changes to SWAT deployment policies, effective Monday April 30th

2018-04-27 Thread Chad
On Fri, Apr 27, 2018 at 12:42 PM Stas Malyshev 
wrote:

> Hi!
>
> > The new policy asks the folks submitting patches to split up patches to
> > avoid bad intermediate states ahead of time.
>
> Thank you Tyler for the explanation! Which means this patch needs to be
> split into several patches? Given the lower patch limit, this
> becomes kinda challenging - if this patch becomes, say, three commits,
> then using the figures from Gergo it is possible to apply it (without
> blocking the whole SWAT window) in only ~27% of the windows available.
> Given that many people can't use every window for timezone reasons, it
> may become tricky.
>
> I think we should reconsider how we do both of those things - if we're
> requiring splitting the patches, the limit should be considered
> differently - there's no point in performing the whole cycle of checks
> after merging each component of a multi-component patch, so maybe those
> should be counted differently. Though, of course, merging patches
> separately probably makes it slower than before. If we additionally have
> a limit of four - which is low even by current historic usage - I am
> concerned this will lead to long wait times and a backlog of patches
> which might then block other work.
>
>

I think we could do 4 "changes" in a SWAT, which could consist of a couple
of individual commits?

-Chad

Re: [Wikitech-l] [Ops] Changes to SWAT deployment policies, effective Monday April 30th

2018-04-27 Thread Stas Malyshev
Hi!

> The new policy asks the folks submitting patches to split up patches to
> avoid bad intermediate states ahead of time.

Thank you Tyler for the explanation! Which means this patch needs to be
split into several patches? Given the lower patch limit, this
becomes kinda challenging - if this patch becomes, say, three commits,
then using the figures from Gergo it is possible to apply it (without
blocking the whole SWAT window) in only ~27% of the windows available.
Given that many people can't use every window for timezone reasons, it
may become tricky.
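
One way to arrive at a figure of roughly 27% from the per-window patch counts posted in this thread is sketched below. It assumes that a three-commit change "fits" only when the window already had at most one patch scheduled under a limit of four; this is a guess at the calculation and the helper name is made up for the example.

```python
# Patch counts per window for the first three months of 2018, as posted in the thread.
EU_MID = [8, 6, 2, 10, 1, 6, 1, 2, 1, 2, 2, 6, 2, 9, 6, 10, 2, 1, 1, 8, 7, 5,
          5, 5, 2, 7, 6, 8, 4, 7, 4, 4, 4, 1, 1, 4, 3, 3, 2, 4, 4, 5, 5, 1]
MORNING = [7, 1, 1, 5, 1, 3, 4, 1, 6, 3, 4, 3, 3, 5, 6, 4, 3, 5, 8, 4, 5, 4,
           3, 0, 3, 1, 5, 9, 2, 1, 3]
EVENING = [8, 1, 1, 1, 1, 3, 3, 2, 1, 3, 4, 2, 1, 1, 2, 1, 1, 5, 1, 6, 1, 3,
           3, 1, 2, 3, 1, 0, 9, 10, 5, 4, 1, 6, 2, 1, 2, 4, 3]


def share_with_room(counts, extra, limit):
    """Fraction of windows where `extra` more patches still fit under `limit`."""
    fits = sum(1 for count in counts if count + extra <= limit)
    return fits / len(counts)


all_windows = EU_MID + MORNING + EVENING
print(f"{share_with_room(all_windows, extra=3, limit=4):.0%}")  # prints 27%
```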

I think we should reconsider how we do both of those things - if we're
requiring splitting the patches, the limit should be considered
differently - there's no point in performing the whole cycle of checks
after merging each component of a multi-component patch, so maybe those
should be counted differently. Though, of course, merging patches
separately probably makes it slower than before. If we additionally have
a limit of four - which is low even by current historic usage - I am
concerned this will lead to long wait times and a backlog of patches
which might then block other work.
-- 
Stas Malyshev
smalys...@wikimedia.org


Re: [Wikitech-l] [Ops] Changes to SWAT deployment policies, effective Monday April 30th

2018-04-27 Thread Tyler Cipriani

On 18-04-27 10:49:28, Stas Malyshev wrote:

> Hi!
>
> > First, we now disallow multi-sync patch deployments. See T187761[0].
> > This means that the sync order of files is determined by git commit
> > parent relationships (or Gerrit's "depends-on"). This is to prevent SWAT
> > deployers from accidentally syncing two patches in the wrong order.
>
> Question about this: if there's a patch that requires files to land in a
> specific order, e.g. one in which part of the config is moved into another
> file (example: https://gerrit.wikimedia.org/r/c/419367), is this handled
> automatically by scap (i.e. all changes in the same patch land at the
> same time atomically and scap takes care of nothing ever seeing the
> intermediate states) or does it have to be managed manually, and if so, how?


Scap doesn't currently handle this since for MediaWiki deploys it's 
still using rsync at a basic level.


Currently, syncing changes in a way that avoids bad intermediate states 
is handled by the SWAT deployer and is determined on the fly at the time 
of deployment.


That is, the deployer figures out how to sync stuff on the fly. And 
deployers are pretty good at it, mostly. If I were deploying that change 
today I'd split it up and sync one at a time:


   - wmf-config/WikibaseSearchSettings.php
   - wmf-config/InitialiseSettings.php
   - wmf-config/Wikibase.php
   - wmf-config/Wikibase-production.php

(or maybe I'd combine the last two into a sync-dir wmf-config).

The new policy asks the folks submitting patches to split up patches to 
avoid bad intermediate states ahead of time.


So instead of me syncing that change one file at a time, maybe that 
change becomes two changes and I can pull a change that adds the 
WikibaseSearchSettings.php file and the variables in 
InitialiseSettings.php -- sync-dir wmf-config, and then a second patch 
that could be synced all at once as well. Two syncs rather than 3 or 4 
in this case.
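
As a rough illustration of letting dependencies decide the sync order, here is a sketch using Python's standard `graphlib`. The change names are hypothetical, and assigning Wikibase.php and Wikibase-production.php to the second change is an assumption based on the file list above; this is not how scap itself works.

```python
from graphlib import TopologicalSorter

# Hypothetical change names; the file split follows the description above.
CHANGES = {
    "add-search-settings": [
        "wmf-config/WikibaseSearchSettings.php",
        "wmf-config/InitialiseSettings.php",
    ],
    "use-search-settings": [
        "wmf-config/Wikibase.php",
        "wmf-config/Wikibase-production.php",
    ],
}

# Maps each change to the changes it depends on (its git parents / depends-on).
DEPENDS_ON = {
    "add-search-settings": [],
    "use-search-settings": ["add-search-settings"],
}


def sync_order(depends_on):
    """Order changes so that every change comes after the ones it depends on."""
    return list(TopologicalSorter(depends_on).static_order())


for change in sync_order(DEPENDS_ON):
    print(change, "->", ", ".join(CHANGES[change]))
```

Each change in the resulting order would then be one sync, so the deployer no longer has to work out the file order at deploy time.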


The hope is that this will be more efficient, less error-prone, and 
lower the difficulty factor for deployers. Ideally, this makes it more 
and more difficult to earn a t-shirt[0] :)


-- Tyler

[0]. 


Re: [Wikitech-l] [Ops] Changes to SWAT deployment policies, effective Monday April 30th

2018-04-27 Thread Gergo Tisza
On Fri, Apr 27, 2018 at 7:05 PM, Niharika Kohli 
wrote:

> Also, I think dropping the limit to 4 patches per window is extreme,
> especially if we are asking people to start splitting their patches now.
> Very often we can +2 multiple patches in one go if they don't affect each
> other, or sync out changes together if they happen to the same file. I've
> deployed 8 patches in a window often, with people asking if they can add
> more yet. Due to timezone limitations, most people can only attend one of
> the SWAT windows and if they can't get it out in that window, they have to
> wait a whole day or more to get it out.
>

FWIW, here are some stats on the patch counts for the first three months of
2018 (might contain errors, tried not to spend too much time on it):

EU mid: 8, 6, 2, 10, 1, 6, 1, 2, 1, 2, 2, 6, 2, 9, 6, 10, 2, 1, 1, 8, 7, 5,
5, 5, 2, 7, 6, 8, 4, 7, 4, 4, 4, 1, 1, 4, 3, 3, 2, 4, 4, 5, 5, 1
(average: 4.25, max of 2-week rolling average: 5.7)
Morning: 7, 1, 1, 5, 1, 3, 4, 1, 6, 3, 4, 3, 3, 5, 6, 4, 3, 5, 8, 4, 5, 4,
3, 0, 3, 1, 5, 9, 2, 1, 3
(average: 3.6, max of 2-week rolling average: 5)
Evening: 8, 1, 1, 1, 1, 3, 3, 2, 1, 3, 4, 2, 1, 1, 2, 1, 1, 5, 1, 6, 1, 3,
3, 1, 2, 3, 1, 0, 9, 10, 5, 4, 1, 6, 2, 1, 2, 4, 3
(average: 2.8, max of 2-week rolling average: 4.75)

I.e., if the new policy had been introduced at the start of the year, it would
have caused a delay of at least two weeks in all three SWAT windows at some
point, and it would have caused an ever-growing backlog of patches
for EU mid-day.
(That's without taking the split patch requirement into account; it's hard
to tell how many patches that would have affected.)
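
The averages above can be rechecked with a short script, which also counts how many windows exceeded a limit of 4. This is only a sketch over the posted counts; the rolling averages are not reproduced here because the list does not include the dates of the windows.

```python
from statistics import mean

# Patch counts per window, copied from the lists above.
PATCH_COUNTS = {
    "EU mid": [8, 6, 2, 10, 1, 6, 1, 2, 1, 2, 2, 6, 2, 9, 6, 10, 2, 1, 1, 8, 7, 5,
               5, 5, 2, 7, 6, 8, 4, 7, 4, 4, 4, 1, 1, 4, 3, 3, 2, 4, 4, 5, 5, 1],
    "Morning": [7, 1, 1, 5, 1, 3, 4, 1, 6, 3, 4, 3, 3, 5, 6, 4, 3, 5, 8, 4, 5, 4,
                3, 0, 3, 1, 5, 9, 2, 1, 3],
    "Evening": [8, 1, 1, 1, 1, 3, 3, 2, 1, 3, 4, 2, 1, 1, 2, 1, 1, 5, 1, 6, 1, 3,
                3, 1, 2, 3, 1, 0, 9, 10, 5, 4, 1, 6, 2, 1, 2, 4, 3],
}


def windows_over(counts, limit):
    """Number of windows that had more patches than `limit`."""
    return sum(1 for count in counts if count > limit)


for name, counts in PATCH_COUNTS.items():
    over = windows_over(counts, limit=4)
    print(f"{name}: average {mean(counts):.2f}, "
          f"{over} of {len(counts)} windows over 4 patches")
```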

Re: [Wikitech-l] [Ops] Changes to SWAT deployment policies, effective Monday April 30th

2018-04-27 Thread Stas Malyshev
Hi!

> First, we now disallow multi-sync patch deployments. See T187761[0].
> This means that the sync order of files is determined by git commit
> parent relationships (or Gerrit's "depends-on"). This is to prevent SWAT
> deployers from accidentally syncing two patches in the wrong order.

Question about this: if there's a patch that requires files to land in a
specific order, e.g. one in which part of the config is moved into another
file (example: https://gerrit.wikimedia.org/r/c/419367), is this handled
automatically by scap (i.e. all changes in the same patch land at the
same time atomically and scap takes care of nothing ever seeing the
intermediate states) or does it have to be managed manually, and if so, how?

Thanks,
-- 
Stas Malyshev
smalys...@wikimedia.org


Re: [Wikitech-l] [Ops] Changes to SWAT deployment policies, effective Monday April 30th

2018-04-27 Thread Niharika Kohli
On Fri, Apr 27, 2018 at 8:51 AM, Roan Kattouw 
wrote:

> Will multi-file, single-directory syncs still be allowed? In other words,
> can I deploy a change to the Foo extension that touches many files with
> scap sync-dir extension/Foo ?
>

According to the task description, yes.

While the "one command sync" patch system makes sense to us SWAT deployers,
we do get a fair number of volunteer developers contributing patches for
SWAT. They will have a hard time telling when they need to split a patch up
and how. How do we plan to address that?

Also, I think dropping the limit to 4 patches per window is extreme,
especially if we are asking people to start splitting their patches now.
Very often we can +2 multiple patches in one go if they don't affect each
other, or sync out changes together if they happen to the same file. I've
deployed 8 patches in a window often, with people asking if they can add
more yet. Due to timezone limitations, most people can only attend one of
the SWAT windows and if they can't get it out in that window, they have to
wait a whole day or more to get it out.


>
> On Thu, Apr 26, 2018, 15:15 Greg Grossmeier  wrote:
>
>> Hello,
>>
>> I have made two changes to SWAT policies today.
>>
>> First, we now disallow multi-sync patch deployments. See T187761[0].
>> This means that the sync order of files is determined by git commit
>> parent relationships (or Gerrit's "depends-on"). This is to prevent SWAT
>> deployers from accidentally syncing two patches in the wrong order.
>>
>> Second, we are reducing the number of allowed patches from 8 to 4. This
>> is to reduce stress on the SWAT deployer as well as set expectations for
>> requesters on the pace of the windows. See the approximate best case
>> time spent breakdown[1] for how we came to this number.
>>
>> I've updated the on-wiki documentation on wikitech[2][3].
>>
>>
>> Thank you for flying scap,
>>
>> Greg
>>
>>
>> [0] https://phabricator.wikimedia.org/T187761
>> [1]
>> * +2/Wait for Jenkins to merge - 2 min
>> * prepare git on tin - 1 min
>> * Deploy to mwdebug - 1 min
>> * Verify on mwdebug - 3 min
>> * Deploy to production - 1 min
>> * Verify & wait/watch logs - 2 min
>> [2] https://wikitech.wikimedia.org/w/index.php?title=SWAT_deploys&diff=prev&oldid=1789212
>> [3] https://wikitech.wikimedia.org/w/index.php?title=SWAT_deploys&diff=next&oldid=1789212
>>
>> --
>> | Greg Grossmeier            GPG: B2FA 27B1 F7EB D327 6B8E |
>> | Release Team Manager            A18D 1138 8E47 FAC8 1C7D |
>>
>> ___
>> Ops mailing list
>> o...@lists.wikimedia.org
>> https://lists.wikimedia.org/mailman/listinfo/ops
>>
>
>
>


-- 
Niharika
Product Manager
Community Tech
Wikimedia Foundation

Re: [Wikitech-l] [Ops] Changes to SWAT deployment policies, effective Monday April 30th

2018-04-27 Thread Roan Kattouw
Will multi-file, single-directory syncs still be allowed? In other words,
can I deploy a change to the Foo extension that touches many files with
scap sync-dir extension/Foo ?

On Thu, Apr 26, 2018, 15:15 Greg Grossmeier  wrote:

> Hello,
>
> I have made two changes to SWAT policies today.
>
> First, we now disallow multi-sync patch deployments. See T187761[0].
> This means that the sync order of files is determined by git commit
> parent relationships (or Gerrit's "depends-on"). This is to prevent SWAT
> deployers from accidentally syncing two patches in the wrong order.
>
> Second, we are reducing the number of allowed patches from 8 to 4. This
> is to reduce stress on the SWAT deployer as well as set expectations for
> requesters on the pace of the windows. See the approximate best case
> time spent breakdown[1] for how we came to this number.
>
> I've updated the on-wiki documentation on wikitech[2][3].
>
>
> Thank you for flying scap,
>
> Greg
>
>
> [0] https://phabricator.wikimedia.org/T187761
> [1]
> * +2/Wait for Jenkins to merge - 2 min
> * prepare git on tin - 1 min
> * Deploy to mwdebug - 1 min
> * Verify on mwdebug - 3 min
> * Deploy to production - 1 min
> * Verify & wait/watch logs - 2 min
> [2]
> https://wikitech.wikimedia.org/w/index.php?title=SWAT_deploys&diff=prev&oldid=1789212
> [3]
> https://wikitech.wikimedia.org/w/index.php?title=SWAT_deploys&diff=next&oldid=1789212
>
> --
> | Greg Grossmeier            GPG: B2FA 27B1 F7EB D327 6B8E |
> | Release Team Manager            A18D 1138 8E47 FAC8 1C7D |
>
>
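
For reference, the best-case breakdown in [1] of the quoted message adds up as follows. A small sketch, assuming patches are handled strictly one after another with nothing batched; the step names are copied from the list and the function name is made up.

```python
# Best-case minutes per step, from the breakdown in the quoted message.
STEP_MINUTES = {
    "+2/Wait for Jenkins to merge": 2,
    "prepare git on tin": 1,
    "Deploy to mwdebug": 1,
    "Verify on mwdebug": 3,
    "Deploy to production": 1,
    "Verify & wait/watch logs": 2,
}

PER_PATCH = sum(STEP_MINUTES.values())  # 10 minutes per patch


def best_case_minutes(patches):
    """Best-case window length when patches are deployed one at a time."""
    return patches * PER_PATCH


for patches in (4, 6, 8):
    print(f"{patches} patches: {best_case_minutes(patches)} min")
# 4 patches: 40 min
# 6 patches: 60 min
# 8 patches: 80 min
```

Under these assumptions 8 patches cannot fit into a 1 hour window even in the best case, while 4 leaves about 20 minutes of slack.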

Re: [Wikitech-l] [Ops] Changes to SWAT deployment policies, effective Monday April 30th

2018-04-27 Thread Brad Jorsch (Anomie)
On Thu, Apr 26, 2018 at 6:14 PM, Greg Grossmeier  wrote:

> First, we now disallow multi-sync patch deployments. See T187761[0].
> This means that the sync order of files is determined by git commit
> parent relationships (or Gerrit's "depends-on"). This is to prevent SWAT
> deployers from accidentally syncing two patches in the wrong order.
>

Is full scap now fast enough that sync-file is no longer necessary?
Discussion on that task seems to say "no".

I'd hate to see people mangling[1] the git log by submitting and merging
patch chains for updating individual files of a single logical change just
to satisfy this SWAT requirement. I'd hate even more if people do that in
mediawiki/core master (versus splitting an existing patch while
backporting), to the point where I'd recommend -2ing such patch-chains
there.

[1]: "mangling" both in that it would introduce unnecessary changes and in
that looking at a single change doesn't let you see the changes to other
files that should logically be grouped with it.


P.S. So if someone has a change that needs to touch 2 files in both
branches, they'd use up the whole smaller SWAT window for that one change
because they'd need 2 patches (1 per file) that both count double (because
two branches)?


-- 
Brad Jorsch (Anomie)
Senior Software Engineer
Wikimedia Foundation