Ideally, I would like to see the site builds automated by CI; we still have CALCITE-3129 [1] open.

My thinking is that if we automate the site building and deployment process, we can use the following heuristics:
- Build the site completely and deploy when a final release tag is pushed to the repo.
- Build the site on a partial basis in all other cases:
  - Option 1: Check out the last final release tag and apply only those site changes that touch certain whitelisted categories, such as news and community. This keeps documentation changes for unreleased code from being deployed before the final release, and should allow us to get rid of the site branch.
  - Option 2: We keep the site branch, but we automate the current process. On every commit to master that changes files in the site directory, we check whether the change only touches certain whitelisted categories, such as news and community. If so, we cherry-pick it into the site branch automatically using GitHub Actions, then build and deploy the site. When a final release tag is pushed to the repo, we use GitHub Actions to make the master and site branches equal and automatically build and deploy the site.

This would remove the need to build and publish the site manually and simplify the process, as we would only ever commit to master. As an added bonus, if we keep the site branch but automate the process, maybe we can lock the site branch so that only CI can push to it. The downside, of course, is that we're relying on heuristics for the partial build, so there's some "magic" to it.
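
To make the whitelist heuristic concrete, here is a minimal sketch of the check that both options rely on. The two directory prefixes and the function name only_whitelisted are assumptions for illustration, not an agreed list or existing tooling.

```shell
# Hypothetical whitelist of site paths that may be published between
# releases. These prefixes are assumptions for illustration only.
WHITELIST='site/_posts/ site/community/'

# Reads one changed path per line from stdin. Succeeds (exit 0) only if
# every path starts with one of the whitelisted prefixes.
only_whitelisted() {
  while IFS= read -r path; do
    [ -z "$path" ] && continue
    ok=1
    for prefix in $WHITELIST; do
      case "$path" in
        "$prefix"*) ok=0; break ;;
      esac
    done
    [ "$ok" -eq 0 ] || return 1
  done
  return 0
}
```

A CI job could feed it the list of files changed by a commit, for example from git diff --name-only, and only cherry-pick and deploy when the function succeeds.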

Francis


[1] https://issues.apache.org/jira/browse/CALCITE-3129

On 26/03/2022 8:58 am, Stamatis Zampetakis wrote:
Hello,

Thanks for starting this discussion Liya. It is important to find which
parts of the process are unclear and improve them if possible.

The current procedure for updating the website remains unchanged and it is
documented here:
https://github.com/apache/calcite/blob/a6a1e2cef332893fd90286098869c56529e052c3/site/README.md

If the procedure is not followed, which has happened a few times in the
past (someone commits directly to site without committing to master), then
we will have commits in site that may get lost forever.
When we discover such commits we should port them to master; the
cherry-pick then goes in the opposite direction (from site to master).
This is usually discovered/done by the release manager and that's why we
have the respective instructions in the howto [1].

After a release we don't care much what happens because master and site
should be equal. As Francis pointed out this is usually done with a force
push.

Regarding Julian's question, the commit hashes before the force pushes done
by Liya are the following (according to commits@calcite):
* master -> dcbc493bf699d961427952c5efc047b76d859096
* site -> aa9dfc7dbc64c784040cf20ed168016ae3b9c2c5

Best,
Stamatis

[1]
https://github.com/apache/calcite/blob/a6a1e2cef332893fd90286098869c56529e052c3/site/_docs/howto.md?plain=1#L696

On Fri, Mar 25, 2022 at 7:36 PM Julian Hyde <jh...@apache.org> wrote:

Does anyone know (or could find out) the SHA of the master and site
branches at the time that Fan attempted to move the site changes over?
If so, we could recreate the same environment, and figure out a set of
git commands that would have worked then and will work for the next
release manager. This process is safe because we can do these
experiments in a local git sandbox, without pushing to any remote.

On Fri, Mar 25, 2022 at 6:09 AM Fan Liya <liya.fa...@gmail.com> wrote:

Hi Francis,

Thanks for your feedback.

It seems we should choose option 2.
In addition, it seems less risky to run "git push --force" commands on
the site branch.

Best,
Liya Fan

On Fri, Mar 25, 2022 at 12:14, Francis Chuang <francischu...@apache.org> wrote:

Hi Liya,

Thanks for bringing this up. We have always done the following when
committing:
1. Always commit to master.
2. If we need to publish the change to the site now (for example, new
committer or announcement), cherry-pick the change into the site branch
and publish it.
3. After a release, make the site branch the same as master (git reset
--hard master) and force push (git push --force origin site).
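
As a rough illustration, the three steps above can be replayed in a throwaway local repository. This is only a sketch: the commit names, the commit helper, and the sandbox identity are made up for the example, and the final force push is left commented out because the sandbox has no remote.

```shell
set -e
sandbox=$(mktemp -d)
cd "$sandbox"
git init -q .
# Name the initial branch master regardless of the local git defaults.
git symbolic-ref HEAD refs/heads/master
git config user.name "Sandbox"
git config user.email "sandbox@example.invalid"

# Hypothetical helper: one commit that adds one file.
commit() { echo "$1" > "$1.txt"; git add "$1.txt"; git commit -q -m "$1"; }

commit base
git branch site                      # site starts out equal to master

# Step 1: always commit to master.
commit docs-change
commit news-item
news=$(git rev-parse HEAD)

# Step 2: publish only the news item now.
git checkout -q site
git cherry-pick "$news" >/dev/null

# Step 3: after the release, make site the same as master.
git reset -q --hard master
# git push --force origin site      # not run here: the sandbox has no remote
```

After the reset, site and master point at the same commit, so the cherry-picked copy of the news item is discarded in favour of the original commit on master.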

Francis

On 25/03/2022 3:03 pm, Fan Liya wrote:
Hi all,

As part of the release process, we need to synchronize the master and
site branches (please see
https://calcite.apache.org/docs/howto.html#making-a-release-candidate).
Usually, the site is behind the master branch by some commits.
If the existing commits in the site branch are in the same order as in
the master branch, the task is easy: just switch to the site branch
and run

git rebase master

However, if some commits are in different orders, it can be tricky.
For example, the master branch may have the following commits (in
order):

A, B, X1, X2, ... , Xn.

and the site branch may have the following commits (in order):

B, A, X1, X2.

Basically we have two choices:

1. We can live with the out of order commits, because after
cherry-picking commits X3, X4, ... , Xn to the site branch, the file
contents will be consistent.

The problem is that, since the two branches have diverged, we cannot
use the rebase command. Instead, we have to cherry-pick commits
manually, one by one, which takes considerable effort. In addition, for
any subsequent release, we again have to cherry-pick each commit
manually.

2. We can make the commit order consistent, which will make
subsequent releases easy.
However, the problem is that, to make the commit order consistent,
some git force push is unavoidable, which is risky to some
extent.
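
The situation behind these two choices can be reproduced in a local sandbox. The sketch below builds the "A, B" versus "B, A" histories from the example and checks two things: whether site is an ancestor of master, and whether the file contents nevertheless match. The commit helper and the sandbox identity are assumptions made for the example.

```shell
set -e
sandbox=$(mktemp -d)
cd "$sandbox"
git init -q .
# Name the initial branch master regardless of the local git defaults.
git symbolic-ref HEAD refs/heads/master
git config user.name "Sandbox"
git config user.email "sandbox@example.invalid"

# Hypothetical helper: one commit that adds one file.
commit() { echo "$1" > "$1.txt"; git add "$1.txt"; git commit -q -m "$1"; }

commit base
git branch site
commit A; commit B                   # master: A, B
git checkout -q site
commit B; commit A                   # site: B, A (same changes, other order)

# Is site an ancestor of master, i.e. can master simply continue from it?
if git merge-base --is-ancestor site master; then
  history=linear
else
  history=diverged
fi

# Do the two branches contain the same files with the same contents?
if git diff --quiet master site; then
  contents=identical
else
  contents=different
fi

echo "history: $history, contents: $contents"
```

In this sandbox the histories have diverged while the contents are identical, which is the case described in choice 1.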

So what is the recommended way to do this? Thanks in advance for
your feedback!

Best,
Liya Fan

