Re: Cassandra 6 docs

Mick Fri, 03 Apr 2026 00:12:32 -0700

There are many moving parts here, and I don't want to be telling content 
reviewers their effort was wasted because the build broke, or because something 
got restructured out from under them.


I hear the concern about process killing momentum, and I don't want that either, 
but I think the risk runs the other way here. A big-bang approach won't make 
the docs any easier to maintain. Maybe part of this work does improve or 
simplify those pain points, but that hasn't come to light yet.

If individual pages (or groups of pages where applicable) are separate PRs, 
then the call-out for everyone to pick just one thing they care about and 
review it will actually move faster. Otherwise everything stays blocked until 
it's all done: the layout, double-checking that old info hasn't been lost, 
confirming it's an improvement in maintenance, and confirming it actually fits 
into the build.

I'm not keen to take on debt for the sake of a quick win.

WRT AI-generated content, I think we're at the point where we should be looking 
at how we tackle it at the patch/PR/ticket level. We've got a lot of input on 
dev@ already; I think an example in practice is our best next step.



> On 2 Apr 2026, at 23:58, Jon Haddad <[email protected]> wrote:
> 
> I understand Mick's desire to break things up, but I am personally OK with it 
> if we do big updates to the docs.  To me, it's process for the sake of 
> process, and has discouraged me from making contributions in the past.  I'd 
> rather not have a good body of work get abandoned because we ask someone to 
> do a bunch of extra work for very little benefit.
> 
> Here's a good example, CASSANDRA-20960 [1].  I tried to make the build 
> process easier, Mick wanted to do something else, then neither thing got 
> done.  Building the docs still sucks.  
> 
> I say let it rip.  The docs need love, Patrick's willing to do it, and very 
> few people are willing to contribute.
> 
> Jon
> 
> [1] https://issues.apache.org/jira/browse/CASSANDRA-20960
> 
> 
> 
> On Thu, Apr 2, 2026 at 2:43 PM Patrick McFadin <[email protected]> wrote:
> **Scope**
> 
> I'll start a discussion thread about the format change. Most of the pages 
> were simply rearranged into the buckets, but large tracts of information were 
> just missing. After trying to shoehorn things in, I realized the docs had a 
> poor foundation. My solution to the overwhelming amount of material was to 
> ask everyone to pick just one thing they cared about and review it. 
> 
> **AI-generated content and community process**
> The information from there was drawn from those discussions and from other 
> OSS projects where the same thing is happening. I expected this to be the hot 
> button, and it's already happening, so we should settle this and get it in 
> writing. 
> 
> Patrick
> 
> On Thu, Apr 2, 2026 at 1:58 PM Mick <[email protected]> wrote:
> Thanks for the work here, the docs clearly need attention and it's awesome to 
> see someone driving it!
> 
> I want to raise two separate concerns.
> 
> **Scope**
> 
> Mixing new content additions with a wholesale restructure of the layout in a 
> single contribution makes it very hard to review, and hard to have a 
> traceable, community-owned conversation about *why* things are being done in 
> particular ways. I'd suggest splitting this into two parallel tracks:
> 
> - New or updated content added as focused, reviewable PRs
> - A separate discussion thread (here on the list) to propose and evaluate 
> layout/structural changes, inviting broader input before any restructure lands
> 
> That way the community can engage meaningfully with both, rather than being 
> presented with a fait accompli that's difficult to unpack.
> 
> 
> **AI-generated content and community process**
> 
> The content reads as AI-generated, and I want to flag that we have some 
> unfinished business here as a project. Ariel opened a thread in May on 
> exactly this topic (what vetting we require for AI-generated contributions), 
> and IIRC we didn't reach a resolution/consensus. This seems like a good 
> tangible example to test it on.
> 
> The ASF's guidance doesn't prohibit AI-generated contributions, but it does 
> ask that contributors address copyright provenance (which tool, whether 
> output was scanned or otherwise vetted for third-party material), and 
> recommends disclosing tooling in commit messages. You've made brief mention 
> of that here, but I think if we break the proposal up into smaller 
> incremental PRs, and provide the specific information around the 
> AI-generated content, it'll be easier to build the collaborative momentum 
> needed.
> 
> 
> 
> 
> > On 2 Apr 2026, at 17:51, Patrick McFadin <[email protected]> wrote:
> > 
> > Hello all my fellow Cassandra people. 
> > 
> > I have been working on a big rewrite of our Cassandra docs for a while. The 
> > current version released with 5 is "Just the facts" on several things. 
> > DataStax docs have filled in many gaps over the years, but a few 
> > developments have made me rethink our docs. 1) Since Lorena retired, there 
> > hasn't been the same focus on the in-tree docs. 2) Let's face it, LLMs have 
> > made docs WAY easier to generate and maintain. If you can point to the 
> > sweet spot for LLMs, generating docs is at the top of the list.
> > 
> > Now I need everyone's help getting this in-tree. What I have done is set up 
> > a temporary repo in my github as a workspace as I'm doing a lot of the 
> > heavy lifting. Since this is generating HTML, it will be easy for everyone 
> > to look and comment. Once I get to a good settling place, then I'll bring 
> > it over to trunk and get it published on our website. 
> > 
> > The big changes:
> > 
> > The primary change has been in how the docs are presented. Before, the docs 
> > were top-down and mostly reference. If you had a specific use case or need, 
> > it was up to you to figure out where in the docs the data was located. I've 
> > flipped that to be persona-based now. The 4 major buckets I've put things 
> > in are Developer, Operator, Contributor, and Reference. For example, Accord 
> > and ACID transactions have completely different focuses when talking to a 
> > developer or an operator. Each person will find what's relevant to them. 
> > 
> > In that, I added a lot more pages to each. The sources I took from were 
> > previous docs, CEPs, the actual code, presentations, and blogs over the 
> > years. I even used Jon Haddad's Cassandra agentic coding skills! (A huge 
> > help for validation) 
> > 
> > What I need now is people looking at and judging the output. I'm constantly 
> > reading and revising, but I need your input. The best thing I could ask for 
> > is to pick one thing you are the most opinionated about. Maybe it's UCS or 
> > repairs. A feature you worked on that isn't covered well. You pick. The way 
> > I would like you to look at it is "Is this enough information that I would 
> > share the link?" If it needs more or is missing something, let me know. Let's fix it 
> > today. 
> > 
> > Here is the review link: 
> > https://pmcfadin.github.io/cassandra6-docs-workzone/home/draft/index.html
> > 
> > Some notes on navigation. You will see the different personas on the left 
> > hand nav. When you click on one, it changes the view. That's just how 
> > Antora works in this environment. In prod, it will be cleaner. To go back 
> > to the top level, just click the home icon.
> > 
> > To communicate changes, you can either reply to this thread or start a 
> > convo in the #cassandra-comdev channel on ASF Slack. Or, you can private 
> > message me on whatever platform you choose. 
> > 
> > Thanks and looking forward to some awesome docs in Cassandra 6!
> > 
> > Patrick
>
