Re: Cassandra 6 docs

Patrick McFadin Fri, 03 Apr 2026 08:36:54 -0700

Ah yes. I updated Antora and it was having trouble with some of the
equations. I think it will be useful for me to do a search for that
pattern. Thanks for flagging.


As a matter of keeping some order, I will need to move these into trunk.
That can be done in stages. We need to start with the restructure though. I
will have to rely heavily on Claude to do a lot of the work. DS had 10
people working on docs and still could barely keep up. As a solo dude I
need to leverage some tools or it will never get done.

Just as a matter of scope: I burned through all my credits on a Claude Max
plan multiple times over a week and now have maxed out my weekly credits
until tomorrow. It’s a lot of text and work even for a machine. If you look
in my repo, I’m transparent on my methodology. It just takes a long time to
run.

On Fri, Apr 3, 2026 at 8:17 AM C. Scott Andreas <[email protected]>
wrote:

> I'm also fine with a big-bang update to the docs. In some sense, it seems
> like the only way it might happen.
>
> I like the examples-first approach to introducing features in this
> version, beyond prototypes that show all variations supported by the
> grammar.
>
> For the purpose of producing my own documentation to use as a user, I took
> the antlr grammar for Accord CQL and used a model to generate a few dozen
> examples of what's possible and not possible in the syntax. I'd love for
> something like this to be included in the Accord section.
>
> I do think there are a few areas to work on here, as there are some
> examples of unrendered LaTeX (e.g.,) here [1]. I also wouldn't put Vector
> Search at the top level of the docs (same hierarchy position as CQL). But I
> think this is a really good start and a useful direction to pursue.
>
> – Scott
>
> [1]
> https://pmcfadin.github.io/cassandra6-docs-workzone/Cassandra/trunk/cassandra/developing/data-modeling/data-modeling_refining.html
>
> On Apr 3, 2026, at 12:13 AM, Mick <[email protected]> wrote:
>
>
> There are many moving parts here, and I don't want to be telling content
> reviewers their effort was wasted because the build broke, or because
> something got restructured out from under them.
>
> I hear the concern about process killing momentum, I don't want that
> either, but I think the risk runs the other way here. A big-bang approach
> won't make the docs any easier to maintain. Maybe part of this work does
> improve or simplify those pain points, but that hasn't come to light yet.
>
> If individual pages (or groups of pages where applicable) are separate
> PRs, then the call out to everyone to pick just one thing they cared about
> and review it will actually move faster. Otherwise it gets blocked until
> it's all done: that includes the layout, double-checking old info hasn't
> been lost, that it's an improvement in maintenance, and that it actually
> fits into the build.
>
> I'm not keen to take on debt for the sake of a quick win.
>
> WRT AI generated content, I think we're at the point we should be looking
> at how we tackle it at the patch/PR/ticket level. We've got a lot of input
> on dev@ already; I think an example in practice is our best next step.
>
>
>
> On 2 Apr 2026, at 23:58, Jon Haddad <[email protected]> wrote:
>
> I understand Mick's desire to break things up, but I am personally OK with
> it if we do big updates to the docs. To me, it's process for the sake of
> process, and has discouraged me from making contributions in the past. I'd
> rather not have a good body of work get abandoned because we ask someone to
> do a bunch of extra work for very little benefit.
>
> Here's a good example, CASSANDRA-20960 [1]. I tried to make the build
> process easier, Mick wanted to do something else, then neither thing got
> done. Building the docs still sucks.
>
> I say let it rip. The docs need love, Patrick's willing to do it, and very
> few people are willing to contribute.
>
> Jon
>
> [1] https://issues.apache.org/jira/browse/CASSANDRA-20960
>
>
>
> On Thu, Apr 2, 2026 at 2:43 PM Patrick McFadin <[email protected]> wrote:
> **Scope**
>
> I'll start a discussion thread about the format change. Most of the pages
> were simply rearranged into the buckets, but large tracts of information
> were just missing. After trying to shoehorn things in, I realized
> the docs had a poor foundation. My solution to the overwhelming amount of
> things was to ask everyone to pick just one thing they cared about and
> review it.
>
> **AI-generated content and community process**
> The information from there was drawn from those discussions and from other
> OSS projects where the same thing is happening. I expected this to be the
> hot button, and it's already happening, so we should settle this and get it
> in writing.
>
> Patrick
>
> On Thu, Apr 2, 2026 at 1:58 PM Mick <[email protected]> wrote:
> Thanks for the work here, the docs clearly need attention and it's awesome
> to see someone driving it!
>
> I want to raise two separate concerns.
>
> **Scope**
>
> Mixing new content additions with a wholesale restructure of the layout in
> a single contribution makes it very hard to review, and hard to have a
> traceable, community-owned conversation about *why* things are being done
> in particular ways. I'd suggest splitting this into two parallel tracks:
>
> - New or updated content added as focused, reviewable PRs
> - A separate discussion thread (here on the list) to propose and evaluate
> layout/structural changes, inviting broader input before any restructure
> lands
>
> That way the community can engage meaningfully with both, rather than
> being presented with a fait accompli that's difficult to unpack.
>
>
> **AI-generated content and community process**
>
> The content reads as AI-generated, and I want to flag that we have some
> unfinished business here as a project. Ariel opened a thread in May on
> exactly this topic: what vetting we require for AI-generated contributions;
> and IIRC I'm not sure we reached a resolution/consensus. This seems like a
> good tangible example to test it on.
>
> The ASF's guidance doesn't prohibit AI-generated contributions, but it
> does ask that contributors address copyright provenance (which tool,
> whether output was scanned or otherwise vetted for third-party material),
> and recommends disclosing tooling in commit messages. You've made brief
> mention of that here, but I think if we break the proposal up into smaller
> incremental PRs, and work with providing the (specific) information around
> the AI-generated content, it'll be easier to build the collaborative
> momentum needed.
>
>
>
>
> > On 2 Apr 2026, at 17:51, Patrick McFadin <[email protected]> wrote:
> >
> > Hello all my fellow Cassandra people.
> >
> > I have been working on a big rewrite of our Cassandra docs for a while.
> > The current version released with 5 is "Just the facts" on several things.
> > DataStax docs have filled in many gaps over the years, but a few
> > developments have made me rethink our docs. 1) Since Lorena retired, there
> > hasn't been the same focus on the in-tree docs. 2) Let's face it, LLMs have
> > made docs WAY easier to generate and maintain. If you can point to the
> > sweet spot for LLMs, generating docs is at the top of the list.
> >
> > Now I need everyone's help getting this in-tree. What I have done is set
> > up a temporary repo in my github as a workspace as I'm doing a lot of the
> > heavy lifting. Since this is generating HTML, it will be easy for everyone
> > to look and comment. Once I get to a good settling place, then I'll bring
> > it over to trunk and get it published on our website.
> >
> > The big changes:
> >
> > The primary change has been in how the docs are presented. Before, the
> > docs were top-down and mostly reference. If you had a specific use case or
> > need, it was up to you to figure out where in the docs the data was
> > located. I've flipped that to be persona-based now. The 4 major buckets
> > I've put things in are Developer, Operator, Contributor, and Reference. For
> > example, Accord and ACID transactions have completely different focuses
> > when talking to a developer or an operator. Each person will find what's
> > relevant to them.
> >
> > In that, I added a lot more pages to each. The sources I took from were
> > previous docs, CEPs, the actual code, presentations, and blogs over the
> > years. I even used Jon Haddad's Cassandra agentic coding skills! (A huge
> > help for validation)
> >
> > What I need now is people looking at and judging the output. I'm
> > constantly reading and revising, but I need your input. The best thing I
> > could ask for is to pick one thing you are the most opinionated about.
> > Maybe it's UCS or repairs. A feature you worked on that isn't covered well.
> > You pick. The way I would like you to look at it is "Is this enough
> > information that I would share the link?" If it needs more or is missing,
> > let me know. Let's fix it today.
> >
> > Here is the review link:
> > https://pmcfadin.github.io/cassandra6-docs-workzone/home/draft/index.html
> >
> > Some notes on navigation. You will see the different personas on the
> > left-hand nav. When you click on one, it changes the view. That's just how
> > Antora works in this environment. In prod, it will be cleaner. To go back
> > to the top level, just click the home icon.
> >
> > To communicate changes, you can either reply to this thread or start a
> > convo in the #cassandra-comdev channel on ASF Slack. Or, you can private
> > message me on whatever platform you choose.
> >
> > Thanks and looking forward to some awesome docs in Cassandra 6!
> >
> > Patrick
>
>
>
