All work was completed on Monday and services are fully restored. If you are experiencing any issues, please contact #vcs.

NOTE: if you use the remote git.mozilla.org/integration/gecko-dev.git please keep reading.

If you use this particular remote and pulled during the problem window (roughly Jan 28-Jan 30), you may have picked up some bad shas. This remote now has the correct shas, and you can rebase any local work onto it.

The other public remote for this repository, github.com/mozilla/gecko-dev.git, was not affected. (The bad shas were not pushed to that instance.)

--Hal

On 2015-02-01 19:35 , Laura Thomson wrote:
7:30pm PST update:
Work is largely complete (with two exceptions, and no force pushes
required) and we are currently re-enabling automation.

We have left gecko-projects and integration/gecko-dev for now and will pick
these up in the morning.

Thank you for your patience while the team worked through this outage. We
will have a postmortem next week and post a link to the writeup here.

Best,

Laura



On Sun, Feb 1, 2015 at 8:00 PM, Laura Thomson <lthom...@mozilla.com> wrote:

5pm PST update:
Work is progressing smoothly.  We are currently at step 8 in the detailed
plan posted earlier. That is, all commits have been manually processed and
the head of both systems matches.

We'll send an update when work is complete, or at 9am PST, whichever is
sooner.

Best,

Laura

On Sun, Feb 1, 2015 at 2:42 PM, Laura Thomson <lthom...@mozilla.com> wrote:

Here is an update on our plans and status.

= Overview =
gps and hwine will implement the plan, which is, in summary, manually
playing back the problematic merges one by one to ensure both systems are
in agreement.

gps has point and hwine is online for peer review of the work.

No further tree closures are needed for this plan.

= Procedure in detail =
1) Make backup copy of SHA-1 mapfiles on both systems (in progress)
2) Manually iterate through Mercurial commits starting at 8991b10184de
and run gexport on that commit
3) Compare resulting SHA-1s in Git across conversion systems
4) Manually Git cherry-pick and update the mapfiles as needed
** go/no-go point (work-to-completion is guaranteed diminishing from here)
5) Prune entries from mapfiles newer than and including 8991b10184de (the
first merge in central)
6) After bfa194d93aed has been converted to Git with the same SHA-1,
proceed to convert remaining commits via `hg gexport`.
7) Verify new head matches in both systems
8) Manually push this new head to the "master" branch from both systems
(non-force)
** if force push on legacy, notify downstream partners
9) Turn on automated conversion again
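
Steps 3 and 4 above come down to finding Mercurial commits for which the
two conversion systems recorded different Git SHA-1s. A minimal Python
sketch of that comparison follows. It assumes each mapfile line has the
form "<git sha> <hg sha>", which is an assumption about the file format
and is not confirmed in this thread. The names parse_mapfile and
find_disagreements are hypothetical helpers, not part of any tool
mentioned here.

```python
def parse_mapfile(text):
    """Parse mapfile text into a dict of hg sha -> git sha.

    Assumed line format (not confirmed by the thread): "<git sha> <hg sha>".
    """
    mapping = {}
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue  # skip blank lines
        git_sha, hg_sha = line.split()
        mapping[hg_sha] = git_sha
    return mapping


def find_disagreements(legacy, modern):
    """Return hg shas converted by both systems whose git shas differ."""
    shared = legacy.keys() & modern.keys()
    return sorted(h for h in shared if legacy[h] != modern[h])
```

Commits that only one system has converted are ignored here on purpose;
they are not disagreements yet, only work still to be done.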

= Success conditions =
* Legacy and modern vcs-sync are producing same shas
* Legacy and modern vcs-sync can push fast forward to gecko.git &
gecko-dev.git (respectively)
* Modern vcs-sync also has sha agreement with gecko-projects.git
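
The fast-forward condition above can be read as an ancestry check: a push
is a fast-forward only if the head currently on the remote is an ancestor
of (or the same as) the head being pushed. The Python sketch below
illustrates this on a toy commit graph. The parents dictionary and the
function name is_fast_forward are hypothetical and only stand in for the
real repository data.

```python
def is_fast_forward(parents, old_head, new_head):
    """Return True if old_head is reachable from new_head via parent links.

    parents maps a commit id to the list of its parent commit ids
    (a hypothetical in-memory stand-in for the real commit DAG).
    """
    stack = [new_head]
    seen = set()
    while stack:
        commit = stack.pop()
        if commit == old_head:
            return True
        if commit in seen:
            continue
        seen.add(commit)
        stack.extend(parents.get(commit, []))
    return False
```

If this check fails for either system, pushing the new head would need a
force push, which is what the plan is trying to avoid.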

= Next update =
The next update to lists, etc, will be when it's fixed, if things change
significantly, or at 5pm PST, whichever comes first.

Let me know if you have questions.

Best,

Laura



On Fri, Jan 30, 2015 at 8:01 PM, Gregory Szorc <g...@mozilla.com> wrote:

I figured people would like an update.

There were multiple, independent failures in the replication systems (there
are 2 systems that replicate Mercurial to Git).

At least one system wasn't DAG aware. It was effectively using the "tip"
commit of the Mercurial repositories (the most recently committed
changeset) to constitute the Git branch head when it should have been using
the latest commit on the "default" branch. It is a minor miracle this
hasn't broken before, as all anybody needed to do was push to an older head
to create a non-fast-forward push.
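
To make the difference between "tip" and the head of "default" concrete,
here is a small Python sketch on a toy history. It is a simplification and
not the code of either replication system: the Changeset tuple, the node
names and both functions are hypothetical, and a branch head is taken to
be a changeset with no child on the same branch.

```python
from collections import namedtuple

# rev: local revision number, node: id, branch: named branch,
# parents: tuple of parent revision numbers (all values are made up)
Changeset = namedtuple("Changeset", "rev node branch parents")


def tip(changesets):
    """Most recently added changeset, regardless of branch."""
    return max(changesets, key=lambda c: c.rev)


def branch_head(changesets, branch="default"):
    """Newest changeset on `branch` that has no child on that branch."""
    on_branch = [c for c in changesets if c.branch == branch]
    has_child = set()
    for c in on_branch:
        has_child.update(c.parents)
    heads = [c for c in on_branch if c.rev not in has_child]
    return max(heads, key=lambda c: c.rev)


history = [
    Changeset(0, "n0", "default", ()),
    Changeset(1, "n1", "default", (0,)),
    # A later commit on an older head: it becomes tip,
    # but it is not the head of "default".
    Changeset(2, "n2", "old-release", (0,)),
]
```

With this history, tip(history) is n2 while branch_head(history) is n1.
A converter that follows tip would try to move the Git branch from n1 to
n2, which is not a descendant of n1, hence the non-fast-forward push.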

The other system got in a really wonky state when processing some merge
commits in mozilla-central. Instead of converting a handful of commits in
the 2nd merge parent, it converted all commits down to Mercurial revision 0
and merged in an unrelated DAG head with tens of thousands of commits! It's
a good thing GitHub rejected a malformed author line, or the gecko-dev
repository would be epically whacky right now and would almost certainly
require a hard reset / force push to correct.

Both systems are replicating Firefox Mercurial commits to Git. And the
SHA-1s need to be consistent between them. We're capable of fixing at least
one of these systems now. But we're hesitant to fix one unless we are
pretty sure both systems agree about SHA-1s. We have obligations with
partners to not force push. And, you don't like force pushing either. So
caution is needed before bringing any system back online.

There is currently no ETA for service restoration. But people are working
on it. I wish I had better news to report.

On Thu, Jan 29, 2015 at 1:06 AM, Gregory Szorc <g...@mozilla.com> wrote:

Git replication is currently broken due to a mistake of mine when mass
closing branches earlier today.

Don't expect restoration before 1200 PDT.

Bug 927219.

_______________________________________________
dev-version-control mailing list
dev-version-cont...@lists.mozilla.org
https://lists.mozilla.org/listinfo/dev-version-control




_______________________________________________
dev-b2g mailing list
dev-b2g@lists.mozilla.org
https://lists.mozilla.org/listinfo/dev-b2g
