Hi all,

Here's a working version of "CI on the guides":
https://github.com/apache/polaris/pull/3553
It builds upon PR #3550.

A CI run takes about 10 minutes for all guides and looks like this one:
https://github.com/apache/polaris/actions/runs/21437618287?pr=3553

The TL;DR of how it works:
- It exercises all getting-started guide Markdown files in CI
- Testing can be run locally as well, either for all guides or a single guide
- For each Markdown file, the `shell` and `sql` code blocks are extracted
- Supports "docker compose" and Spark SQL Shell invocations
- Custom assertions can be added (as a `shell` code block in an HTML comment,
  so the assertions aren't rendered on the web site)

While working on this, I had to fix a couple of things in the docker-compose
files. Some are related to docker-compose service dependencies and timing,
others are due to guides that simply no longer work.
I'll come up with separate PRs to address the findings individually.

Robert

On Mon, Jan 26, 2026 at 3:29 PM Robert Stupp <[email protected]> wrote:

> Hi all,
>
> Here's a prototype as a PR https://github.com/apache/polaris/pull/3550 -
> please try it out and let me know what you think.
>
> On Tue, Jan 20, 2026 at 9:12 PM Dmitri Bourlatchkov <[email protected]>
> wrote:
>
>> Hi All,
>>
>> Building CI for getting-started guides sounds useful to me. I suppose we'd
>> have to formalize the format of the related `.md` files somehow to make
>> automated execution possible.
>>
>> I wonder about the reliability of these tests too. If CI is flaky (e.g.
>> containers not starting properly), it might be an irritation more than an
>> aid. It's worth a try in any case.
>>
>> Cheers,
>> Dmitri.
>>
>> On Tue, Jan 20, 2026 at 2:48 PM Yong Zheng <[email protected]> wrote:
>>
>> > 100%. There are so many open source projects with outdated getting-started
>> > examples and it will be nice to have these in our CI pipelines. The only
>> > concern on my end is how do we define coverage for getting-started
>> > examples? Currently most of them have simple examples that do the following:
>> > 1. use catalog
>> > 2. create namespace
>> > 3. create table under namespace
>> > 4. create some dummy data
>> >
>> > Will these be sufficient for CI? With these, we will only know the basic
>> > stuff works, but if users try to do more complex things, we can't really
>> > guarantee it will still work. But will this be sufficient?
>> >
>> > Thanks,
>> > Yong Zheng
>> >
>> > On 2026/01/20 10:55:30 Robert Stupp wrote:
>> > > Hi all,
>> > >
>> > > We have a nice collection of getting started guides in the source
>> > > repository [1].
>> > > The user-targeting description of each guide is in a README.md file.
>> > >
>> > > I would like to start a discussion and gather feedback about two
>> > > topics regarding the getting-started guides:
>> > >
>> > > 1. Website:
>> > > The user-facing getting-started guides are well written but not very
>> > > visible to users, because those are not on the web site.
>> > > What are your thoughts on moving the getting-started guides to the
>> > > website?
>> > >
>> > > 2. CI coverage:
>> > > Most, actually all, getting-started guides include code snippets
>> > > referencing Docker compose files.
>> > > Manually verifying these code snippets and Docker compose files,
>> > > during initial contribution or when those are being updated, is quite
>> > > some work.
>> > > I _think_ we can automate the verification of the code snippets, and
>> > > with those the Docker compose files, in CI.
>> > > The overall idea is to parse the getting-started guide markdown and
>> > > let a workflow execute the code blocks for shell/bash.
>> > > I am not sure whether all guides can actually be verified, because
>> > > some of those Docker compose files start a couple of containers, which
>> > > can be a resource (RAM/CPU) issue in GitHub's hosted runners.
>> > > The alternatives would be:
>> > > - Never update the getting-started guides with the risk that those
>> > > become stale and outdated.
>> > > - Keep the manual verification process.
>> > > Any thoughts on this?
>> > >
>> > > Robert
>> > >
>> > >
>> > > [1] https://github.com/apache/polaris/tree/main/getting-started

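For readers who want a feel for the extraction step mentioned in the TL;DR at the top of this thread (pulling `shell` and `sql` code blocks out of a guide, including assertion blocks hidden in HTML comments), here is a minimal Python sketch. It is an illustration only and not the implementation in PR #3553: the names `CodeBlock` and `extract_blocks` are hypothetical, and it assumes fences are exactly three backticks at the start of a line with an info string of exactly `shell` or `sql`.

```python
import re
from typing import List, NamedTuple

# Build the fence marker programmatically so this example can itself be
# embedded in a fenced code block without confusing Markdown renderers.
FENCE = "`" * 3


class CodeBlock(NamedTuple):
    """Hypothetical container for one extracted block (not from the PR)."""
    lang: str     # "shell" or "sql"
    body: str     # block content without the fence lines
    hidden: bool  # True if the block sits inside an HTML comment


# Assumption: only blocks tagged exactly `shell` or `sql` are of interest.
FENCE_RE = re.compile(
    r"^" + re.escape(FENCE) + r"(shell|sql)[ \t]*\n"  # opening fence line
    r"(.*?)"                                          # block body (lazy)
    r"^" + re.escape(FENCE) + r"[ \t]*$",             # closing fence line
    re.MULTILINE | re.DOTALL,
)
COMMENT_RE = re.compile(r"<!--.*?-->", re.DOTALL)


def extract_blocks(markdown: str) -> List[CodeBlock]:
    """Return all shell/sql fenced blocks in document order."""
    comment_spans = [m.span() for m in COMMENT_RE.finditer(markdown)]
    blocks = []
    for m in FENCE_RE.finditer(markdown):
        # A block counts as hidden when it lies entirely inside a comment.
        hidden = any(s <= m.start() and m.end() <= e for s, e in comment_spans)
        blocks.append(CodeBlock(m.group(1), m.group(2), hidden))
    return blocks
```

A runner could then execute the visible blocks in order and treat the hidden `shell` blocks as assertions. A real implementation would likely use a proper Markdown parser, since this regex ignores longer fences, indented fences, and blocks nested inside other fenced blocks.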