Thanks for the proposal!

+1 on moving them to the site.

Having a basic validation (create catalog/namespace/table, ingest and query
dummy data) would be more helpful than nothing. A lot of the time, these
examples just break silently. It'd be nice if we could have a stable
and lightweight validation. I agree with Dmitri that it may not be worth it
if it tends to be flaky, or takes longer to execute than any other pipeline.
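
To make the markdown-parsing idea from Robert's proposal a bit more concrete,
here is a minimal sketch of the extraction step, assuming the guides mark
their runnable snippets as fenced blocks tagged `sh`, `shell`, or `bash`. The
function name and the tag set are hypothetical and not taken from the Polaris
repository; it does not handle nested or longer fences, and actually running
the extracted blocks is left out.

```python
FENCE = "`" * 3  # built programmatically so the sample below contains no literal fence
SHELL_LANGS = {"sh", "shell", "bash"}  # assumed set of tags that mark runnable snippets


def extract_shell_blocks(markdown: str) -> list[str]:
    """Return the bodies of fenced code blocks tagged as shell, in document order."""
    blocks = []
    current = None  # lines of the block being read, or None when outside a block
    keep = False    # whether the block being read is a shell block
    for line in markdown.splitlines():
        stripped = line.strip()
        if stripped.startswith(FENCE):
            if current is None:
                # Opening fence: the first word after the backticks is the language tag.
                info = stripped[len(FENCE):].strip().lower()
                lang = info.split()[0] if info else ""
                current = []
                keep = lang in SHELL_LANGS
            else:
                # Closing fence: store the block only if it was tagged as shell.
                if keep:
                    blocks.append("\n".join(current))
                current = None
            continue
        if current is not None:
            current.append(line)
    return blocks


if __name__ == "__main__":
    sample = "\n".join([
        "# Example guide",
        FENCE + "bash",
        "echo hello",
        FENCE,
        FENCE + "yaml",
        "services: {}",
        FENCE,
    ])
    for block in extract_shell_blocks(sample):
        print(block)
```

A CI job could then write each extracted block to a temporary script and run
it with a timeout, which is also where the flakiness and runtime concerns
would show up.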

Yufei


On Tue, Jan 20, 2026 at 12:13 PM Dmitri Bourlatchkov <[email protected]>
wrote:

> Hi All,
>
> Building CI for getting-started guides sounds useful to me. I suppose we'd
> have to formalize the format of the related `.md` files somehow to make
> automated execution possible.
>
> I wonder about the reliability of these tests too. If CI is flaky (e.g.
> containers not starting properly), it might be an irritation more than an
> aid. It's worth a try in any case.
>
> Cheers,
> Dmitri.
>
> On Tue, Jan 20, 2026 at 2:48 PM Yong Zheng <[email protected]> wrote:
>
> > 100%. There are so many open source projects with outdated
> > getting-started examples, and it would be nice to have these in our CI
> > pipelines. The only concern on my end is how we define coverage for
> > getting-started examples. Currently most of them have simple examples
> > that do the following:
> > 1. use catalog
> > 2. create namespace
> > 3. create table under namespace
> > 4. create some dummy data
> >
> > Will these be sufficient for CI? With these, we will only know that the
> > basic stuff works, but if users try to do more complex things, we can't
> > really guarantee it will still work. But will this be sufficient?
> >
> > Thanks,
> > Yong Zheng
> >
> > On 2026/01/20 10:55:30 Robert Stupp wrote:
> > > Hi all,
> > >
> > > We have a nice collection of getting started guides in the source
> > > repository [1].
> > > The user-targeting description of each guide is in a README.md file.
> > >
> > > I would like to start a discussion and gather feedback about two
> > > topics regarding the getting-started guides:
> > >
> > > 1. Website:
> > > The user-facing getting-started guides are well written but not very
> > > visible to users, because they are not on the website.
> > > What are your thoughts on moving the getting-started guides to the
> > > website?
> > >
> > > 2. CI coverage:
> > > Most, actually all, getting-started guides include code snippets
> > > referencing Docker compose files.
> > > Manually verifying these code snippets and Docker compose files,
> > > during initial contribution or when they are being updated, is quite
> > > a lot of work.
> > > I _think_ we can automate the verification of the code snippets, and
> > > with those the Docker compose files, in CI.
> > > The overall idea is to parse the getting-started guide markdown and
> > > let a workflow execute the code blocks for shell/bash.
> > > I am not sure whether all guides can actually be verified, because
> > > some of those Docker compose files start a couple of containers, which
> > > can be a resource (RAM/CPU) issue in GitHub's hosted runners.
> > > The alternatives would be:
> > > - Never update the getting-started guides with the risk that those
> > > become stale and outdated.
> > > - Keep the manual verification process.
> > > Any thoughts on this?
> > >
> > > Robert
> > >
> > >
> > > [1] https://github.com/apache/polaris/tree/main/getting-started
> > >
> >
>
