Hello, catching up a little bit on this.
I have one practical question though: right now, what is the recommended practice for checking whether a new release of a sub-project breaks the website build?

On Sat, Nov 13, 2021, at 22:55, David Jencks wrote:
> Inline...
>
> > On Nov 13, 2021, at 12:40 PM, Zoran Regvart <zo...@regvart.com> wrote:
> >
> > Hi David,
> > lots of great stuff here, I'll try to keep my replies short though...
> >
> > On Sat, Nov 13, 2021 at 7:47 PM David Jencks <david.a.jen...@gmail.com> wrote:
> >>
> >> The Antora build part of the website is getting better at detecting
> >> problems and failing the build, and the website build seems to me to be
> >> failing more often. Perhaps we can find ways to improve our process so
> >> there are fewer problematic commits and it’s easier to detect and fix
> >> problems earlier.
> >
> > My intent with the website was that we should fail to publish as often
> > as necessary to do our best in not publishing a broken website. I
> > think we can tolerate the website not being up to date with the latest
> > for a day or two.
>
> Absolutely! But there have been lots of times recently when I’ve broken the
> website and had no idea for days.
>
> >
> >> There are a few problems caused by interactions between near-in-time
> >> commits and commits that bring in stuff that is obsolete due to recent
> >> website build changes. Let’s ignore those :-)… especially the second kind
> >> will iron themselves out over time.
> >
> > +1
> >
> >> So, people keep merging PRs that change the documentation without checking
> >> that it doesn’t break the website build, either locally or as a CI check
> >> on the PR.
> >
> > I think the xref syntax is difficult for folk to wrap their heads
> > around. I think everyone should be familiar with the documentation we
> > have here:
> >
> > https://github.com/apache/camel-website/#links-between-pages-in-antora-content
>
> I think that’s a really inaccessible location for the information…. I’d like
> to move it to a page in the manual, near the release guidelines. Something
> like “how to contribute to the docs”.
>
> >
> >> They theoretically could do a local website build that incorporates their
> >> changes, but right now it’s way too hard and time consuming. (I’ll discuss
> >> the problems with the projects that attempt to do a partial local build
> >> later)
> >> So one good step would be to make local website builds to check doc
> >> changes easy and quick. I’ve made some progress on this.
> >
> > +1 I think so too, local build and then the CI build against the git
> > repository should be our first lines of defense. We should make those
> > take seconds not minutes and then mandatory.
> >
> >> Another step would be for CI to check the website build on each PR, either
> >> the whole site or a partial build. I think GH actions can trigger each
> >> other, but I’ve never set it up. Do we have enough GH action time to do a
> >> full website build on every PR to any camel subproject? Is it practical
> >> to trigger the website build only when something documentation-related
> >> changes? (this detection would need to be carefully set up in each
> >> subproject) If these are possible I think we should just do this. It’s
> >> probably possible to set up quicker partial builds, but it’s decidedly
> >> more complicated.
> >
> > Most issues I've seen have been with xref linking, I think we should
> > focus on that first. Other checks I think fail less often. Perhaps not
> > building at all could be a good solution.
> > For example in the Camel main, Guillaume built a Maven plugin that
> > checks for broken xrefs in seconds. But that (currently) works only
> > within the main Camel repository.
> >
> > What if we could expand that or build comparable tooling that checks
> > xrefs in the adocs of the git repository the developer is working on
> > but also takes into account xrefs to other subprojects? Two ideas I
> > have here would be to clone other subprojects and build an index of
> > xrefs against them as well; or to use the sitemap XMLs (could be
> > fairly quick!) from the live website and reverse them back to xrefs
> > for checking.
>
> Well, the site manifest is like the sitemap with antora-compatible
> information. I think what I’m proposing with the sitemap and partial builds
> will be quick enough without another tool we struggle to keep up to date.
>
> >
> >> Another step would be to make it extremely visible when the jenkins
> >> website build fails. I try to follow the dev list pretty closely, and see
> >> a lot of GH PR CI build failures reported, but apparently the jenkins
> >> build has been failing for several days and I had no idea.
> >
> > +1 I think this is an area I can focus on next. I think we're in
> > agreement that we don't want to send emails to the dev@ mailing list,
> > one idea is to create GitHub issues; I just thought of a "status"
> > channel on Zulip. Or perhaps both.
> >
> >> In principle, what other steps could we take?
> >>
> >> --
> >>
> >> Comments on the existing attempts to have subproject-specific partial
> >> builds:
> >>
> >> Dan Allen (of Antora) has repeatedly said that subsidiary builds such as
> >> local or partial builds should be done from (clones) of the repo
> >> containing the playbook for the actual site. For a long time I disagreed
> >> and thought approaches like that of camel-quarkus to have a local build in
> >> the subproject were workable but I’m now convinced that they are totally
> >> unmaintainable.
> >> They rely on updating each such subproject every time the
> >> main playbook changes, and in a way that requires deep understanding of
> >> the entire site build. It just isn’t going to work, ever.
> >
> > I wonder if we could have the approach of Camel Quarkus and solve the
> > issue of outdated playbooks by having a git submodule of the website
> > in every project. Be warned though, the website git repository is very
> > large (3.6GB).
>
> That’s another issue…. Isn’t most of that the built site branch? I think the
> site should be published to a separate repo from the sources, something like
> camel-site-pub. Then we won’t need to delete earlier versions, for one
> thing. I set this up for Aries and Felix, it’s not hard to do.
>
> >
> >> --
> >> Maybe there’s hope…
> >>
> >> If we’re going to encourage or require local builds of the website, there
> >> needs to be a defined file system relationship between the camel-website
> >> clone and the subproject(s) clone(s). I have a “global” directory (named
> >> camel) into which I’ve cloned all the subprojects next to one another
> >> (together with some extra git work trees). I think this is the simplest
> >> arrangement and I think we could require it.
> >>
> >> Next, there needs to be an easy way (preferably automated) to modify the
> >> playbook to take account of building against one (or possibly more) local
> >> clones. E.g., if I’m working on camel-quarkus, I should only need to have
> >> camel-quarkus cloned, and still be able to do a build. Doing this is much
> >> more plausible if we can assume that every branch participating in the
> >> website is present and up to date locally. Does anyone know if it’s
> >> possible to write a git script that can update branches without switching
> >> to them? If we can assume this, then the local build just involves
> >> changing the playbook source url from GitHub….<project>.git to
> >> `./../<project>` and adjusting the checked out branch name.
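
The playbook change described above (pointing one content source at a sibling clone) could be sketched roughly as follows. This is only an illustration in Python, not part of the camel-website build: `localize_playbook` and `fetch_branch_command` are hypothetical names, the playbook is assumed to be already loaded into a dict with the usual Antora `content.sources` layout, and the repository URLs in the usage note are examples. On the git question, `git fetch <remote> <branch>:<branch>` fast-forwards a local branch without checking it out, although git refuses to do this for the branch that is currently checked out.

```python
import copy


def localize_playbook(playbook, project, branch="HEAD", parent_dir="./.."):
    """Return a copy of the playbook with one content source pointed at a local clone.

    Assumption: `playbook` is a dict shaped like an Antora playbook, with a list
    of sources under playbook["content"]["sources"], each having a "url" key.
    """
    result = copy.deepcopy(playbook)  # leave the caller's playbook untouched
    suffix = f"/{project}.git"
    matched = False
    for source in result["content"]["sources"]:
        if source["url"].endswith(suffix):
            # Swap the remote URL for the sibling directory, e.g. ./../camel-quarkus
            source["url"] = f"{parent_dir}/{project}"
            # Build only the requested branch (HEAD means the local worktree)
            source["branches"] = branch
            matched = True
    if not matched:
        raise ValueError(f"no content source found for {project}")
    return result


def fetch_branch_command(remote, branch):
    """Build the git command that updates a local branch without switching to it."""
    return ["git", "fetch", remote, f"{branch}:{branch}"]
```

For example, with a playbook dict whose sources include `https://github.com/apache/camel-quarkus.git`, calling `localize_playbook(playbook, "camel-quarkus")` returns a copy in which that source has `url` set to `./../camel-quarkus` and `branches` set to `HEAD`, while the other sources are left alone. Loading and saving the YAML file is deliberately left out of the sketch.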
> >
> > I think this could help, but I'm a bit skeptical that if this is not
> > automated folk will skip over this. My workflow is somewhat similar, I
> > have all subprojects checked out in the same (parent) directory, so I
> > just change the playbook to use HEAD branch and ../camel-$subproject
> > to build.
> >
> >> Then there’s the problem that the full Antora build takes something like 6
> >> minutes now, which is too long for anyone to wait for. So, we need an
> >> effective way of doing quick partial builds. I’ve been working on this
> >> with some progress. Dan has an idea he calls a site manifest, which means
> >> that the site build writes out the content catalog with information about
> >> the Antora coordinates and the site location of every page. Then a partial
> >> build can read this in to populate the partial build content catalog, so
> >> that xrefs can be properly resolved. This was originally developed to
> >> enable a “subsidiary site” to have xrefs to a “main site”. I’ve adapted
> >> this to be an Antora pipeline extension, and it can be used in a couple of
> >> ways.
> >
> > Here's where it dawned on me that we already have the manifest of
> > sorts in the sitemap XML files. But the idea of each subproject
> > building its bit of the website is also interesting to me.
> >
> >> - A site manifest could be published as part of the actual site. In this
> >> case the partial build would fetch it, and only pages actually present
> >> locally would get local links. You’d find out whether there are any
> >> problems, but it might be hard to locate the local pages through
> >> navigation.
> >>
> >> - If you do a full build locally to generate a local site manifest, a
> >> partial build using that site manifest will only overwrite the rebuilt
> >> local files, leaving you with a functional local site.
> >>
> >> - Possibly the full Jenkins build could also package the Antora site as a
> >> zip archive, and local builds could fetch and unpack it rather than doing
> >> a full local build.
> >
> > I think INFRA might not look too keenly on us taking up too much disk
> > space on ci-builds.a.o. We _could/perhaps_ push to repository.a.o. as
> > a -SNAPSHOT.
>
> I thought we’d have Antora also package as a zip or tar.gz and just include
> it in the website, like the site manifest.
>
> >
> >> With the site manifest, there’s still the problem of modifying the
> >> playbook to only build a little bit. I’ve written another extension that
> >> you configure with the part you want to build, and it applies appropriate
> >> filters. You can configure it down to one page. It also watches for
> >> changes and rebuilds when it detects a change: I think I’ll need to make
> >> that configurable since it’s great to see your changes quickly but not
> >> what you want for a build step.
> >> I have not yet tried to make it easy to select which subproject you want
> >> to build: so far it requires knowing how to configure the extensions. I’ve
> >> started having some ideas on how this might be done.
> >
> > This would bring super fast previews, could be part of the preview
> > functionality we already have for the website...
> >
> >> What I’m envisioning and hoping for is a pre-PR process that involves
> >> running, in a local camel-website clone, something like `yarn
> >> partial-build-camel-quarkus` that will in less than a minute detect any
> >> errors and produce a local site you can look at with the local changes.
> >
> > This would be really cool.
> >
> >> Thoughts?
> >
> > If we agree that most issues are broken xrefs (that's how it seems to
> > me) perhaps focusing on not building the Antora bits at all, but doing
> > something along the lines of what Guillaume built with information (say
> > from XML sitemaps) about other Antora components in the mix, feels
> > like it would bring some quick wins.
>
> I really think that as soon as we try to cross component boundaries we’ll be
> reinventing Antora for no good reason. People should preview their doc
> changes locally IMO, so let’s make that quick and easy and also so it will
> detect xref problems.
>
> >
> > Sorry I don't think that was short...
>
> It could have been much longer!!
>
> David Jencks
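
For reference, the standalone xref check suggested in the thread (comparing the xrefs in a repository's adoc files against a list of known pages) might look roughly like the sketch below. It is an illustration only: `find_unknown_xrefs` is a hypothetical name, the set of known pages is assumed to come from somewhere such as a site manifest and to be already normalized to the same form as the xref targets, and the regex only handles the simple `xref:target[text]` form. It does not attempt Antora's own resolution of relative or versioned targets, which is the part that would mean reinventing Antora.

```python
import re

# Matches the target of an AsciiDoc xref macro, e.g. xref:component:module:page.adoc[Text]
XREF_PATTERN = re.compile(r"xref:([^\[\s]+)\[")


def find_unknown_xrefs(adoc_text, known_pages):
    """Return xref targets in adoc_text that are not in known_pages.

    Assumption: known_pages is a set of page identifiers written exactly the
    way they appear in xrefs; no relative or version resolution is done here.
    """
    unknown = []
    for match in XREF_PATTERN.finditer(adoc_text):
        # Drop any #fragment so that only the page itself is looked up
        target = match.group(1).split("#", 1)[0]
        if target not in known_pages:
            unknown.append(target)
    return unknown
```

Given text containing `xref:components::timer-component.adoc[Timer]` and `xref:manual::missing.adoc#intro[Missing]` (both page names made up for the example) and a known set holding only the first target, the function returns a list with the single entry `manual::missing.adoc`.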