I'm -0 on merging as-is. I have the same concerns as Robert and he's voiced
them very well so I won't waste time re-airing them.

(2) I spot checked the content, pulled out some common patterns, and
> it mostly looks good, but there were also some issues (e.g. several
> pages were replaced with the contents from entirely different pages).
> I would be more comfortable if, say, a smoke test of comparing the old
> and new sites, with html tags stripped and ignoring whitespace,
> yielded what should be empty diffs.
>

Can you share any details about this analysis?

+1 for verifying the old and new are the same by diffing the output.


> (3) It'd be good to have someone give a stamp of approval on the
> infrastructure changes, at least to validate that we're not going to
> be taking on extra tech debt with regard to jenkins stability and
> developer workflow. I see that Brian has at least looked at this some.


My involvement so far was just recognizing a problem (creating root-owned
files on jenkins workers) and helping to fix it. If there's anyone
available who's familiar with the website infrastructure it would be great
if they could take a look instead (if not I could probably acquaint myself
enough to review).

On Thu, May 7, 2020 at 11:57 PM Robert Bradshaw <rober...@google.com> wrote:

> This is a tough situation.
>
> It would have been much better if this transition was structured in
> such a way that the review was more manageable (e.g. the suggestion of
> scripts, not mixing in voluminous unnecessary changes like whitespace,
> and not updating content), and possibly even incrementally (e.g. the
> new site would have been developed over multiple PRs in a subdomain or
> subdirectory while being worked on). But hindsight is 20/20 and no
> one, myself included, thought to bring this up when the original
> migration was proposed, so this is more something to keep in mind for
> the future. I also appreciate the efforts that have been made to clean
> things up (e.g. preserving history) and address feedback.
>
> So, where do we go from here? My first thought is that I really don't
> want to set a precedent that just because a PR "will require a large
> effort" and in a state that if we don't "move forward and merge what
> we have now" then "work done so far will be lost" means that we think
> it's OK to forgo doing a proper review.
>
> On the other hand, there are some mitigating factors with this being
> the website and not the code in that "bugs," though possibly
> embarrassing, won't break production pipelines or data loss, and
> though the source is technically part of the release, when we find
> something to fix we can fix the live website much more quickly than go
> through the whole release process and convince people to upgrade. (I
> recognize accepting this argument is, to some degree at least, saying
> that we don't care about the correctness of docs as much as so-called
> "real" code, if we go there.)
>
> If we decide to go ahead and merge (and I would not object), there are
> some things I would like to see.
>
> (1) I would like to understand what we would do afterwards to "review
> the outcome, and ensure that all the content is there," and why it
> can't be done before merging instead. (Is it because it'd take time
> and we don't want to incorporate changes that are made to the website
> in the meantime? I think that boat has sailed, but maybe we can avoid
> making it worse...)
>
> (2) I spot checked the content, pulled out some common patterns, and
> it mostly looks good, but there were also some issues (e.g. several
> pages were replaced with the contents from entirely different pages).
> I would be more comfortable if, say, a smoke test of comparing the old
> and new sites, with html tags stripped and ignoring whitespace,
> yielded what should be empty diffs.
>
> (3) It'd be good to have someone give a stamp of approval on the
> infrastructure changes, at least to validate that we're not going to
> be taking on extra tech debt with regard to jenkins stability and
> developer workflow. I see that Brian has at least looked at this some.
>
> - Robert
>
>
> On Thu, May 7, 2020 at 12:40 PM Aizhamal Nurmamat kyzy
> <aizha...@apache.org> wrote:
> >
> > Thank you Ahmet.
> >
> > Robert/Brian, what do you think?
> >
> > The website staging and pre commit tests have passed [1]. If nobody has
> objections, we could merge it soon.
> >
> >
> > [1] https://github.com/apache/beam/pull/11554
> >
> > On Thu, May 7, 2020 at 11:38 AM Ahmet Altay <al...@google.com> wrote:
> >>
> >>
> >>
> >> On Thu, May 7, 2020 at 10:50 AM Aizhamal Nurmamat kyzy <
> aizha...@apache.org> wrote:
> >>>
> >>> Thanks for the writeup Ahmet.
> >>>
> >>> My bias is to move forward and merge the PR. After this, we'll review
> the outcome, and ensure that all the content is there. Nam will help us
> with that.
> >>> The reason that I'd like to move forward and merge what we have now -
> is that if we don't do that, the work done so far will be lost.
> >>> We'll make sure to stage the website in its current state, and use
> that as reference/archive to ensure all the content have been moved.
> >>>
> >>> Is this reasonable to everyone?
> >>
> >>
> >> This is reasonable to me. I agree with your reasons.
> >>
> >> What do others think?
> >>
> >>>
> >>>
> >>>
> >>> On Wed, May 6, 2020 at 7:07 PM Ahmet Altay <al...@google.com> wrote:
> >>>>
> >>>>
> >>>>
> >>>> On Wed, May 6, 2020 at 2:33 PM Aizhamal Nurmamat kyzy <
> aizha...@apache.org> wrote:
> >>>>>>
> >>>>>>
> >>>>>> > 1) Currently, the main blocker for merging is Staging Test
> Failures.
> >>>>>>
> >>>>>> That and finishing the review. (Is someone tracking/coordinating
> this?)
> >>>>>
> >>>>>
> >>>>> I am coordinating the work on the failed tests, but I would need
> other committer's help to perform the review. @Ahmet, could you help us
> prioritize the review for this PR?
> >>>>
> >>>>
> >>>> The problem is there are too many manual changes. Reviewing this
> change in this form will require a large effort. I do not think I can
> interrupt other projects to prioritize reviews on this PR. IMO, we have a
> few options:
> >>>>
> >>>> - PR to be restructured in the format suggested in this thread. A
> commit for infrastructure changes from Jekyll to hugo. A second commit for
> a script that will convert the majority of the content. A third commit for
> the execution of the script. And a fourth commit for the additional manual
> content changes. If Nam can get to this form, people on this thread
> myself/Robert/Pablo/Brian can review the changes.
> >>>> - Another option is, we can accept that we already invested in this
> transition and overall this is a good change, and merge the PR more or less
> in its current form (with tests fixed and open comments addressed) even
> though it has issues. And then overtime fix the issues we encounter. There
> was already some amount of review and visual comparisons, we risk losing
> some recent content changes but I am assuming this will not be much. If Nam
> can commit to compare two sites after a merge, fixing the majority of the
> delta, this might be a viable option.
> >>>>
> >>>> Another thing we can do, we can archive/store a read-only copy of the
> current website in an "archive" url temporarily instead of completely
> deleting it. It will give us a baseline for a while to go back to the old
> content and move any missing data. (And maybe, someone can come up with an
> innovative way to compare the textual content of both sites.) A note on the
> stop world approach, I believe we are already failing on that with merge
> conflicts showing up on the PR. It will be better for us to complete the
> transition as soon as possible. Fixing after the initial merge might be a
> simpler task, especially if we can archive the old site.
> >>>>
> >>>>>
> >>>>>
> >>>>>>
> >>>>>> > Michal showed Nam how to handle the 1st test which was about
> Apache License missing.
> >>>>>> >
> >>>>>> > However, the 2nd and 3rd tests looked like some kind of
> permissions error on the Jenkins worker, not to be configured by code. For
> more details based on Jenkin logs, the 2nd test failed because of
> website/www/site/themes and the 3rd test failed because of
> website/www/node_modules, they are both auto-generated files on build. Can
> someone help Nam to look into this?
> >>>>>> >
> >>>>>> > RAT ("Run RAT PreCommit") — FAILURE
> >>>>>> > Website_Stage_GCS ("Run Website_Stage_GCS PreCommit") — FAILURE
> >>>>>> > Website_Stage_GCS ("Run Website_Stage_GCS PreCommit") — FAILURE
> >>>>>> >
> >>>>>> > 2) Are there any other blockers for merging? @Ahmet/Robert/others
> please share if there are any other blockers.
> >>>>>> >
> >>>>>> >
> >>>>>> > [1] https://github.com/gohugoio/hugo/pull/4494
> >>>>>> >
> >>>>>> >
> >>>>>> > On Wed, May 6, 2020 at 10:19 AM Robert Bradshaw <
> rober...@google.com> wrote:
> >>>>>> >>
> >>>>>> >> On Mon, May 4, 2020 at 7:07 PM Ahmet Altay <al...@google.com>
> wrote:
> >>>>>> >> >
> >>>>>> >> >> On Mon, May 4, 2020 at 6:30 PM Robert Bradshaw <
> rober...@google.com> wrote:
> >>>>>> >> >>>
> >>>>>> >> >>> I took the massive commit and split it up into:
> >>>>>> >> >>>
> >>>>>> >> >>> (1) Infrastructure changes (basically everything outside of
> >>>>>> >> >>> (website/www/site/content)
> >>>>>> >> >>> (2) Sed script changes, and
> >>>>>> >> >>> (3) Manual changes (everything not in (1) and (2)).
> >>>>>> >> >
> >>>>>> >> >
> >>>>>> >> > Thank you Robert. This makes it much easier. What is the
> source of the sed script? I am not sure why some of those lines are there.
> It would be much easier for us to comment on the script source if it is
> reviewable somewhere.
> >>>>>> >>
> >>>>>> >> I just gathered up common patterns as I was trying to go through
> and
> >>>>>> >> review the files... Mostly it was an exercise in finding a
> compact
> >>>>>> >> representation for the delta, not trying to be a perfect
> conversion.
> >>>>>> >> (I do think in retrospect, if we do something like this again, it
> >>>>>> >> would be preferable to commit a script that does the
> auto-conversion
> >>>>>> >> (maybe even with some patch files for manual changes) both for
> ease of
> >>>>>> >> reviewing and to avoid the stop-the-world situation we're in
> now. (I'm
> >>>>>> >> still worried that some changes will get lost in the shuffle.)
>

Reply via email to