Ah yes. I updated Antora and it was having trouble with some of the equations. I think it will be useful for me to do a search for that pattern. Thanks for flagging.
To keep some order, I will need to move these into trunk. That can be done in stages, but we need to start with the restructure. I will have to rely heavily on Claude to do a lot of the work. DS had 10 people working on docs and still could barely keep up. As a solo dude, I need to leverage some tools or it will never get done. As a matter of scope: I burned through all my credits on a Claude Max plan multiple times over a week, and I have now maxed out my weekly credits until tomorrow. It’s a lot of text and work, even for a machine. If you look in my repo, I’m transparent about my methodology. It just takes a long time to run.

On Fri, Apr 3, 2026 at 8:17 AM C. Scott Andreas <[email protected]> wrote:

> I'm also fine with a big-bang update to the docs. In some sense, it seems like the only way it might happen.
>
> I like the examples-first approach to introducing features in this version, beyond prototypes that show all variations supported by the grammar.
>
> For the purpose of producing my own documentation to use as a user, I took the ANTLR grammar for Accord CQL and used a model to generate a few dozen examples of what's possible and not possible in the syntax. I'd love for something like this to be included in the Accord section.
>
> I do think there are a few areas to work on here, as there are some examples of unrendered LaTeX (e.g.,) here [1]. I also wouldn't put Vector Search at the top level of the docs (same hierarchy position as CQL). But I think this is a really good start and a useful direction to pursue.
>
> – Scott
>
> [1] https://pmcfadin.github.io/cassandra6-docs-workzone/Cassandra/trunk/cassandra/developing/data-modeling/data-modeling_refining.html
>
> On Apr 3, 2026, at 12:13 AM, Mick <[email protected]> wrote:
>
> There are many moving parts here, and I don't want to be telling content reviewers their effort was wasted because the build broke, or because something got restructured out from under them.
>
> I hear the concern about process killing momentum; I don't want that either, but I think the risk runs the other way here. A big-bang approach won't make the docs any easier to maintain. Maybe part of this work does improve or simplify those pain points, but that hasn't come to light yet.
>
> If individual pages (or groups of pages where applicable) are separate PRs, then the call out to everyone to pick just one thing they cared about and review it will actually move faster. Otherwise it gets blocked until it's all done: that includes the layout, double-checking old info hasn't been lost, that it's an improvement in maintenance, and that it actually fits into the build.
>
> I'm not keen to take on debt for the sake of a quick win.
>
> WRT AI generated content, I think we're at the point we should be looking at how we tackle it at the patch/PR/ticket level. We've got a lot of input on dev@ already; I think an example in practice is our best next step.
>
> On 2 Apr 2026, at 23:58, Jon Haddad <[email protected]> wrote:
>
> I understand Mick's desire to break things up, but I am personally OK with it if we do big updates to the docs. To me, it's process for the sake of process, and has discouraged me from making contributions in the past. I'd rather not have a good body of work get abandoned because we ask someone to do a bunch of extra work for very little benefit.
>
> Here's a good example, CASSANDRA-20960 [1]. I tried to make the build process easier, Mick wanted to do something else, then neither thing got done. Building the docs still sucks.
>
> I say let it rip. The docs need love, Patrick's willing to do it, and very few people are willing to contribute.
>
> Jon
>
> [1] https://issues.apache.org/jira/browse/CASSANDRA-20960
>
> On Thu, Apr 2, 2026 at 2:43 PM Patrick McFadin <[email protected]> wrote:
>
> **Scope**
>
> I'll start a discussion thread about the format change.
> Most of the pages were simply rearranged into the buckets, but large tracts of information were just missing. After trying to shoehorn things in, I realized the docs had a poor foundation. My solution to the overwhelming amount of things was to ask everyone to pick just one thing they cared about and review it.
>
> **AI-generated content and community process**
>
> The information from there was drawn from those discussions and from other OSS projects where the same thing is happening. I expected this to be the hot button, and it's already happening, so we should settle this and get it in writing.
>
> Patrick
>
> On Thu, Apr 2, 2026 at 1:58 PM Mick <[email protected]> wrote:
>
> Thanks for the work here, the docs clearly need attention and it's awesome to see someone driving it!
>
> I want to raise two separate concerns.
>
> **Scope**
>
> Mixing new content additions with a wholesale restructure of the layout in a single contribution makes it very hard to review, and hard to have a traceable, community-owned conversation about *why* things are being done in particular ways. I'd suggest splitting this into two parallel tracks:
>
> - New or updated content added as focused, reviewable PRs
> - A separate discussion thread (here on the list) to propose and evaluate layout/structural changes, inviting broader input before any restructure lands
>
> That way the community can engage meaningfully with both, rather than being presented with a fait accompli that's difficult to unpack.
>
> **AI-generated content and community process**
>
> The content reads as AI-generated, and I want to flag that we have some unfinished business here as a project. Ariel opened a thread in May on exactly this topic: what vetting we require for AI-generated contributions; and IIRC I'm not sure we reached a resolution/consensus. This seems like a good tangible example to test it on.
>
> The ASF's guidance doesn't prohibit AI-generated contributions, but it does ask that contributors address copyright provenance (which tool, whether output was scanned or otherwise vetted for third-party material), and recommends disclosing tooling in commit messages. You've made brief mention of that here, but I think if we break the proposal up into smaller incremental PRs, and work with providing the (specific) information around the AI-generated content, it'll be easier to build the collaborative momentum needed.
>
> On 2 Apr 2026, at 17:51, Patrick McFadin <[email protected]> wrote:
> >
> > Hello all my fellow Cassandra people.
> >
> > I have been working on a big rewrite of our Cassandra docs for a while. The current version released with 5 is "Just the facts" on several things. DataStax docs have filled in many gaps over the years, but a few developments have made me rethink our docs. 1) Since Lorena retired, there hasn't been the same focus on the in-tree docs. 2) Let's face it, LLMs have made docs WAY easier to generate and maintain. If you can point to the sweet spot for LLMs, generating docs is at the top of the list.
> >
> > Now I need everyone's help getting this in-tree. What I have done is set up a temporary repo in my github as a workspace as I'm doing a lot of the heavy lifting. Since this is generating HTML, it will be easy for everyone to look and comment. Once I get to a good settling place, then I'll bring it over to trunk and get it published on our website.
> >
> > The big changes:
> >
> > The primary change has been in how the docs are presented. Before the docs were top-down and mostly reference. If you had a specific use case or need, it was up to you to figure out where in the docs the data was located. I've flipped that to be persona-based now. The 4 major buckets I've put things in are Developer, Operator, Contributor, and Reference.
> > For example, Accord and ACID transactions have completely different focuses when talking to a developer or an operator. Each person will find what's relevant to them.
> >
> > In that, I added a lot more pages to each. The sources I took from were previous docs, CEPs, the actual code, presentations, and blogs over the years. I even used Jon Haddad's Cassandra agentic coding skills! (A huge help for validation)
> >
> > What I need now is people looking at and judging the output. I'm constantly reading and revising, but I need your input. The best thing I could ask for is to pick one thing you are the most opinionated about. Maybe it's UCS or repairs. A feature you worked on that isn't covered well. You pick. The way I would like you to look at it is "Is this enough information and I would share the link." If it needs more or is missing, let me know. Let's fix it today.
> >
> > Here is the review link: https://pmcfadin.github.io/cassandra6-docs-workzone/home/draft/index.html
> >
> > Some notes on navigation. You will see the different personas on the left hand nav. When you click on one, it changes the view. That's just how Antora works in this environment. In prod, it will be cleaner. To go back to the top level, just click the home icon.
> >
> > To communicate changes, you can either reply to this thread or start a convo in the #cassandra-comdev channel on ASF Slack. Or, you can private message me on whatever platform you choose.
> >
> > Thanks and looking forward to some awesome docs in Cassandra 6!
> >
> > Patrick
