Re: js-inbound as a separate tree
On 12/19/13, 4:20 PM, David Burns wrote:
> I know that RelEng are looking into how to do scheduling better. I am not sure where they are with this or if it is started, but it's a good first step. The whole "a push can take hours to build/test" is the thing that we need to be pushing against. I think if we solve that problem there will be a significant drop in bad pushes.

To reduce turnaround time, trychooser syntax now supports T-shaped try runs. For example, the following syntax will build on all platforms, but only test one platform (in this case, Linux64):

try: -b do -p all -u all[x64] -t none

http://trychooser.pub.build.mozilla.org/

chris
___
dev-platform mailing list
dev-platform@lists.mozilla.org
https://lists.mozilla.org/listinfo/dev-platform
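The T-shaped try line in the message above can also be assembled by a small script. This is only a sketch: `build_try_syntax` and its parameter names are hypothetical and not part of trychooser, it uses only the four flags that appear in the message (`-b`, `-p`, `-u`, `-t`), and it does not check the values against what the try server accepts.

```python
def build_try_syntax(build="do", platforms="all", unittests="all",
                     test_platform=None, talos="none"):
    """Compose a try commit-message line from the flags shown in the thread.

    Hypothetical helper for illustration: it only joins strings and does
    not validate any name against the real try server.
    """
    if test_platform:
        # Restrict the unit tests to one platform, e.g. all[x64].
        unittests = "{}[{}]".format(unittests, test_platform)
    return "try: -b {} -p {} -u {} -t {}".format(
        build, platforms, unittests, talos)


# Build everywhere, run unit tests on one platform only.
print(build_try_syntax(test_platform="x64"))
```

Called with `test_platform="x64"`, the helper reproduces the line quoted in the message; leaving `test_platform` unset gives the ordinary "test everywhere" form.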
js-inbound as a separate tree
On dev-tech-js-engine-internals, there's been some discussion about reviving a separate tree for JS engine development.

The tradeoffs are like any other team-specific tree.

Pro:
- protect the rest of the project from closures and breakage due to JS patches
- protect the JS team from closures and breakage on mozilla-inbound
- avoid perverse incentives (rushing to land while the tree is open)

Con:
- more work for sheriffs (mostly merges)
- breakage caused by merges is a huge pain to track down
- makes it harder to land stuff that touches both JS and other modules

We did this before once (the badly named tracemonkey tree), and it was, I dunno, OK. The sheriffs have leveled up a *lot* since then.

There is one JS-specific downside: because everything else in Gecko depends on the JS engine, JS patches might be extra likely to conflict with stuff landing on mozilla-inbound, causing problems that only surface after merging (the worst kind). I don't remember this being a big deal when the JS engine had its own repo before, though.

We could use one of these to start:
https://wiki.mozilla.org/ReleaseEngineering/DisposableProjectBranches

Thoughts?

-j
Re: js-inbound as a separate tree
As someone who works mostly on the intersection of the JS engine and everything else, I'm not really wild about this. SpiderMonkey is pretty intimately tied to the rest of Gecko, certainly just as much as something like gfx. I think fx-team makes more sense, since most of the patches there consist primarily of changes to XUL/CSS/JS.

The main problem with inbound seems to be that it requires all developers, who are generally working on disjoint things, to devote attention to serializing their patches into inbound with other patches that are mostly unrelated (but might not be!). As the number of pushers and inbound closures increases, this becomes more and more of an attention-suck.

The long-term solution that we're working towards is some kind of bugzilla-based auto-lander IIUC. But in the meantime, it seems like it would be trivial to write a locally-hosted (mach-integrated?) auto-lander script that automates the process of:

(1) Wait until inbound is open.
(2) pull -u, apply the patches, and make sure they apply cleanly.
(3) Push, and mark the bug.

In the case where the patches don't apply, the developer can be alerted, since her attention is basically required in that case anyway. In all other cases, we effectively emulate the experience of pushing to an always-open inbound.

This would be a relatively trivial tool to write, especially compared with the infra and staff burden of maintaining a bunch of separate repos.

Thoughts?

bholley

On Thu, Dec 19, 2013 at 10:48 AM, Jason Orendorff jorendo...@mozilla.com wrote:
> On dev-tech-js-engine-internals, there's been some discussion about reviving a separate tree for JS engine development.
>
> The tradeoffs are like any other team-specific tree.
>
> Pro:
> - protect the rest of the project from closures and breakage due to JS patches
> - protect the JS team from closures and breakage on mozilla-inbound
> - avoid perverse incentives (rushing to land while the tree is open)
>
> Con:
> - more work for sheriffs (mostly merges)
> - breakage caused by merges is a huge pain to track down
> - makes it harder to land stuff that touches both JS and other modules
>
> We did this before once (the badly named tracemonkey tree), and it was, I dunno, OK. The sheriffs have leveled up a *lot* since then.
>
> There is one JS-specific downside: because everything else in Gecko depends on the JS engine, JS patches might be extra likely to conflict with stuff landing on mozilla-inbound, causing problems that only surface after merging (the worst kind). I don't remember this being a big deal when the JS engine had its own repo before, though.
>
> We could use one of these to start:
> https://wiki.mozilla.org/ReleaseEngineering/DisposableProjectBranches
>
> Thoughts?
>
> -j
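The three steps of the proposed local auto-lander could be sketched roughly as follows. Everything here is an assumption made for illustration, not an existing tool: `auto_land` and the callables `is_tree_open`, `run`, and `mark_bug` are hypothetical and must be supplied by the caller, and using `hg import` to apply the patches is one possible choice that the message does not specify.

```python
import time


def auto_land(patches, is_tree_open, run, mark_bug,
              poll_seconds=60, sleep=time.sleep):
    """Land patches once the tree opens; return True on success.

    The callables are hypothetical stand-ins supplied by the caller:
      is_tree_open() -> bool     reports whether inbound accepts pushes
      run(argv) -> bool          runs a command, True if it exited cleanly
      mark_bug(patches) -> None  records the landing in the bug
    """
    # (1) Wait until inbound is open.
    while not is_tree_open():
        sleep(poll_seconds)

    # (2) Update to the latest tip, then apply the patches one by one.
    if not run(["hg", "pull", "-u"]):
        return False
    for patch in patches:
        if not run(["hg", "import", patch]):
            # The patch no longer applies: the developer has to step in.
            return False

    # (3) Push, and mark the bug.
    if not run(["hg", "push"]):
        return False
    mark_bug(patches)
    return True
```

A `False` result is the "alert the developer" case from the message. The sketch does not clean up after a partial failure, and it does not re-check the tree between the pull and the push, so a push that loses the race with another landing also ends up as `False`.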
Re: js-inbound as a separate tree
We already have the approximate equivalent of this. It's the 'checkin-needed' keyword. Add this to your bug, and the sheriffs will land the patch for you, using approximately the process you describe. The only difference is that this is done out-of-band, so turnaround may take up to 24 hrs.

The advantage of using this is that it allows the sheriffs to regulate patch landings more evenly, and avoid landings during peak hours, which makes bustages easier to spot, and potentially reduces the duration of tree closures.

The disadvantage is that you may end up waiting to have your patch landed, and some bugs may be hard for the sheriffs to land; e.g., if there are a lot of patches in a single bug, and some of them have already landed, or there are patches for multiple repos.

Jonathan

On 12/19/2013 2:42 PM, Bobby Holley wrote:
> As someone who works mostly on the intersection of the JS engine and everything else, I'm not really wild about this. SpiderMonkey is pretty intimately tied to the rest of Gecko, certainly just as much as something like gfx. I think fx-team makes more sense, since most of the patches there consist primarily of changes to XUL/CSS/JS.
>
> The main problem with inbound seems to be that it requires all developers, who are generally working on disjoint things, to devote attention to serializing their patches into inbound with other patches that are mostly unrelated (but might not be!). As the number of pushers and inbound closures increases, this becomes more and more of an attention-suck.
>
> The long-term solution that we're working towards is some kind of bugzilla-based auto-lander IIUC. But in the meantime, it seems like it would be trivial to write a locally-hosted (mach-integrated?) auto-lander script that automates the process of:
>
> (1) Wait until inbound is open.
> (2) pull -u, apply the patches, and make sure they apply cleanly.
> (3) Push, and mark the bug.
>
> In the case where the patches don't apply, the developer can be alerted, since her attention is basically required in that case anyway. In all other cases, we effectively emulate the experience of pushing to an always-open inbound.
>
> This would be a relatively trivial tool to write, especially compared with the infra and staff burden of maintaining a bunch of separate repos.
>
> Thoughts?
>
> bholley
>
> On Thu, Dec 19, 2013 at 10:48 AM, Jason Orendorff jorendo...@mozilla.com wrote:
>> On dev-tech-js-engine-internals, there's been some discussion about reviving a separate tree for JS engine development.
>>
>> The tradeoffs are like any other team-specific tree.
>>
>> Pro:
>> - protect the rest of the project from closures and breakage due to JS patches
>> - protect the JS team from closures and breakage on mozilla-inbound
>> - avoid perverse incentives (rushing to land while the tree is open)
>>
>> Con:
>> - more work for sheriffs (mostly merges)
>> - breakage caused by merges is a huge pain to track down
>> - makes it harder to land stuff that touches both JS and other modules
>>
>> We did this before once (the badly named tracemonkey tree), and it was, I dunno, OK. The sheriffs have leveled up a *lot* since then.
>>
>> There is one JS-specific downside: because everything else in Gecko depends on the JS engine, JS patches might be extra likely to conflict with stuff landing on mozilla-inbound, causing problems that only surface after merging (the worst kind). I don't remember this being a big deal when the JS engine had its own repo before, though.
>>
>> We could use one of these to start:
>> https://wiki.mozilla.org/ReleaseEngineering/DisposableProjectBranches
>>
>> Thoughts?
>>
>> -j
Re: js-inbound as a separate tree
Personally I find the branches we have annoying, and they are papering over the real problem: our feedback cycles once landed are far too long. For that reason alone I am against the idea.

I think if we can solve the build/test scheduling and be smart about how we do our testing, we can greatly reduce the time the tree is closed.

More comments inline.

David

On 19/12/2013 18:48, Jason Orendorff wrote:
> On dev-tech-js-engine-internals, there's been some discussion about reviving a separate tree for JS engine development.
>
> The tradeoffs are like any other team-specific tree.
>
> Pro:
> - protect the rest of the project from closures and breakage due to JS patches

mozilla-inbound has been closed for on average ~4 days a month (data at the end of the email). This includes the 8 days in November, when we weren't monitoring leaks properly. These ~4 days haven't been split into infrastructure vs. test/build failures causing the closure, and they do include known downtime from RelEng when they do work.

> - protect the JS team from closures and breakage on mozilla-inbound

See my comment above.

> - avoid perverse incentives (rushing to land while the tree is open)

When auto-land is ready we will be able to throttle landings for people adding checkin-needed to bugs, since the tree is fragile on reopening. Currently the sheriffs watch for that and land things accordingly. They do the throttling themselves.

> Con:
> - more work for sheriffs (mostly merges)

If mostly merges, are you suggesting there will be little traffic on the branch or the JS team will watch the tree for failures? If the former, is there value in having another branch when there is low traffic?

> - breakage caused by merges is a huge pain to track down

Yup! Not to mention merge conflicts that can happen between branches. Today there was a complaint in #jsapi when someone was trying to fix an issue but the test framework was out of sync and no merge was imminent. This was between b2g-inbound and mozilla-inbound. Adding another inbound feels like it's going to make it even harder.

> - makes it harder to land stuff that touches both JS and other modules

I already have this pain with working on something that B2G use too. The B2G team has been working with RelEng to try to mitigate it, but it's still painful.

> We did this before once (the badly named tracemonkey tree), and it was, I dunno, OK. The sheriffs have leveled up a *lot* since then.
>
> There is one JS-specific downside: because everything else in Gecko depends on the JS engine, JS patches might be extra likely to conflict with stuff landing on mozilla-inbound, causing problems that only surface after merging (the worst kind). I don't remember this being a big deal when the JS engine had its own repo before, though.
>
> We could use one of these to start:
> https://wiki.mozilla.org/ReleaseEngineering/DisposableProjectBranches
>
> Thoughts?
>
> -j

(treestatus)☁ mozilla-inbound python treestatus-stats.py --tree mozilla-inbound
Added on :2012-05-14T09:59:46
Tree has been closed for a total of 64 days, 23:18:12 since it was created on 2012-05-14T09:59:46
2012-08 : 1 day, 1:26:57
2012-09 : 1 day, 3:31:16
2012-10 : 2 days, 21:33:14
2012-11 : 20:45:45
2012-12 : 2 days, 1:19:51
2013-01 : 2 days, 8:17:55
2013-02 : 4 days, 0:24:59
2013-03 : 6 days, 3:13:09
2013-04 : 4 days, 17:51:39
2013-05 : 5 days, 13:33:49
2013-06 : 2 days, 15:42:37
2013-07 : 6 days, 13:46:11
2013-08 : 4 days, 5:42:17
2013-09 : 4 days, 20:59:41
2013-10 : 4 days, 21:22:40
2013-11 : 8 days, 4:58:30
2013-12 : 2 days, 16:47:42
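The "~4 days a month" figure can be checked against the monthly listing in the message. The sketch below is not the `treestatus-stats.py` script, whose source is not shown in the thread; `parse_duration` is a hypothetical helper, and the only input is the 17 monthly lines copied from the output.

```python
import re
from datetime import timedelta

# Monthly closure figures copied from the treestatus output in the message.
MONTHLY_CLOSURES = """\
2012-08 : 1 day, 1:26:57
2012-09 : 1 day, 3:31:16
2012-10 : 2 days, 21:33:14
2012-11 : 20:45:45
2012-12 : 2 days, 1:19:51
2013-01 : 2 days, 8:17:55
2013-02 : 4 days, 0:24:59
2013-03 : 6 days, 3:13:09
2013-04 : 4 days, 17:51:39
2013-05 : 5 days, 13:33:49
2013-06 : 2 days, 15:42:37
2013-07 : 6 days, 13:46:11
2013-08 : 4 days, 5:42:17
2013-09 : 4 days, 20:59:41
2013-10 : 4 days, 21:22:40
2013-11 : 8 days, 4:58:30
2013-12 : 2 days, 16:47:42
"""


def parse_duration(text):
    """Parse '2 days, 21:33:14' or '20:45:45' into a timedelta."""
    match = re.fullmatch(r"(?:(\d+) days?, )?(\d+):(\d+):(\d+)", text.strip())
    if match is None:
        raise ValueError("unrecognised duration: %r" % text)
    days, hours, minutes, seconds = (int(g or 0) for g in match.groups())
    return timedelta(days=days, hours=hours, minutes=minutes, seconds=seconds)


closures = {}
for line in MONTHLY_CLOSURES.splitlines():
    month, _, duration = line.partition(" : ")
    closures[month] = parse_duration(duration)

total = sum(closures.values(), timedelta())
mean = total / len(closures)
print("total closed:", total)
print("mean per listed month:", mean)
```

The monthly figures sum to the same 64 days, 23:18:12 that the script reports as the overall total, and the mean over the 17 listed months comes out a little under four days.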
Re: js-inbound as a separate tree
On 12/19/13 4:55 PM, David Burns wrote:
> On 19/12/2013 18:48, Jason Orendorff wrote:
>> Con:
>> - more work for sheriffs (mostly merges)
>
> If mostly merges, are you suggesting there will be little traffic on the branch or the JS team will watch the tree for failures?

Neither, I'm just saying the overall rate of broken patches wouldn't increase much, which I think shouldn't be controversial.

That is, sheriffing is not watching trees, it's fighting bustage. Each busted patch and each intermittent orange creates a ton of work. It stands to reason that diverting some patches to a separate tree won't increase the volume of patches, except to the degree it actually improves developer efficiency (and let's have that problem, please).

> 2013-07 : 6 days, 13:46:11
> 2013-08 : 4 days, 5:42:17
> 2013-09 : 4 days, 20:59:41
> 2013-10 : 4 days, 21:22:40
> 2013-11 : 8 days, 4:58:30
> 2013-12 : 2 days, 16:47:42

I know the point of including these numbers was, "hey look, it's not that bad," but this is really shocking.

We're looking at an average of something like 125 hours per month that developers can't check stuff in. Even if the breakage is evenly distributed across time zones (optimistic) we're looking at zero 9s of availability. We've all gotten used to it, but it's kind of nuts.

-j
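The "125 hours per month" and "zero 9s" figures follow from the six months quoted in the message. A rough check, assuming an average month of 365.25 / 12 days and treating closure time as plain wall-clock downtime (both are assumptions of this sketch, not something stated in the thread):

```python
import math
from datetime import timedelta

# Closure time for the six months quoted in the message (2013-07 to 2013-12).
closed = [
    timedelta(days=6, hours=13, minutes=46, seconds=11),
    timedelta(days=4, hours=5, minutes=42, seconds=17),
    timedelta(days=4, hours=20, minutes=59, seconds=41),
    timedelta(days=4, hours=21, minutes=22, seconds=40),
    timedelta(days=8, hours=4, minutes=58, seconds=30),
    timedelta(days=2, hours=16, minutes=47, seconds=42),
]

total_closed_hours = sum(closed, timedelta()).total_seconds() / 3600
mean_closed_hours = total_closed_hours / len(closed)

# Assumption: one month is 365.25 / 12 days of wall-clock time.
hours_per_month = 365.25 / 12 * 24
availability = 1 - mean_closed_hours / hours_per_month

# Number of leading nines: 0.9 -> 1, 0.99 -> 2; anything below 0.9 -> 0.
nines = math.floor(-math.log10(1 - availability))

print("mean closed hours per month: %.1f" % mean_closed_hours)
print("availability: %.1f%%" % (availability * 100))
print("nines:", nines)
```

Under these assumptions the tree was closed for roughly 126 hours in an average month, which puts availability below 90%, so not even a single nine.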
Re: js-inbound as a separate tree
On 19/12/2013 23:56, Jason Orendorff wrote:
> On 12/19/13 4:55 PM, David Burns wrote:
>> On 19/12/2013 18:48, Jason Orendorff wrote:
>>> Con:
>>> - more work for sheriffs (mostly merges)
>>
>> If mostly merges, are you suggesting there will be little traffic on the branch or the JS team will watch the tree for failures?
>
> Neither, I'm just saying the overall rate of broken patches wouldn't increase much, which I think shouldn't be controversial.
>
> That is, sheriffing is not watching trees, it's fighting bustage. Each busted patch and each intermittent orange creates a ton of work. It stands to reason that diverting some patches to a separate tree won't increase the volume of patches, except to the degree it actually improves developer efficiency (and let's have that problem, please).

For context, I manage the sheriffs so want to be sure what I am signing them up for. If the overall rate of broken patches wouldn't increase much, why can't we keep things on inbound and, when the tree is closed, just use the checkin-needed keyword and let the sheriffs continue to manage the bustage and start landing patches again?

>> 2013-07 : 6 days, 13:46:11
>> 2013-08 : 4 days, 5:42:17
>> 2013-09 : 4 days, 20:59:41
>> 2013-10 : 4 days, 21:22:40
>> 2013-11 : 8 days, 4:58:30
>> 2013-12 : 2 days, 16:47:42
>
> I know the point of including these numbers was, "hey look, it's not that bad," but this is really shocking.

I know it's bad and this is why I am tracking this information! I am watching how many backouts are affecting closures[1] and what the backout to push ratio[2] is. Currently these figures scare me, and the default stance that I get from platform engineers is "It's probably cheaper to push and get backed out than push to try". This comes back to my point about papering over the cracks by spreading things around.

> We're looking at an average of something like 125 hours per month that developers can't check stuff in. Even if the breakage is evenly distributed across time zones (optimistic) we're looking at zero 9s of availability.

I know that RelEng are looking into how to do scheduling better. I am not sure where they are with this or if it is started, but it's a good first step. The whole "a push can take hours to build/test" is the thing that we need to be pushing against. I think if we solve that problem there will be a significant drop in bad pushes. A bad push is 3 times more expensive than a good push just in compute hours (we have 1 backout in every 15 pushes on average), never mind the cost of someone doing a pull after a bad push and them trying to solve why things don't build.

> We've all gotten used to it, but it's kind of nuts.

Couldn't agree more!

> -j

David

[1] https://secure.theautomatedtester.co.uk/owncloud/public.php?service=filest=f54a3e2edabb70771d64e473b30780ac
[2] https://secure.theautomatedtester.co.uk/owncloud/public.php?service=filest=ca3312fa7e0914e8352e96d44a48569f
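The two figures in the message (a bad push costing 3 times a good one, and 1 backout in every 15 pushes) can be combined into an expected compute cost per push. This sketch assumes that "3 times more expensive" means a bad push costs three units where a clean push costs one, and that every backout corresponds to exactly one bad push; neither reading is confirmed in the thread.

```python
from fractions import Fraction

BACKOUT_RATE = Fraction(1, 15)  # from the message: 1 backout in every 15 pushes
GOOD_PUSH_COST = 1              # one unit of compute for a clean push
BAD_PUSH_COST = 3               # assumed reading of "3 times more expensive"

# Average compute cost per push, in units of one clean push.
expected_cost = ((1 - BACKOUT_RATE) * GOOD_PUSH_COST
                 + BACKOUT_RATE * BAD_PUSH_COST)

# Extra compute compared with a world where every push is clean.
overhead = expected_cost / GOOD_PUSH_COST - 1

# Fraction of all compute that is spent on bad pushes.
bad_share = BACKOUT_RATE * BAD_PUSH_COST / expected_cost

print("expected cost per push:", expected_cost, "=", float(expected_cost))
print("overhead from bad pushes: %.1f%%" % (float(overhead) * 100))
print("share of compute spent on bad pushes: %.1f%%" % (float(bad_share) * 100))
```

On these assumptions bad pushes add about 13% to the compute bill and account for roughly 18% of it, before counting the human cost of pulling a broken tree that the message mentions.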