Thank you Benedict.

Considering there were no objections I am closing the discussion and
getting back to work on the ticket itself. Thank you all. Have a great week
ahead.

On Wed, 20 Oct 2021 at 18:06, bened...@apache.org <bened...@apache.org>
wrote:

> Thanks for moving this forwards Ekaterina.
>
> I think what we perhaps discovered is that there’s not really any
> consensus about how to best do config files. I think in this situation it’s
> best to defer to the one who’s actually putting in the time to _do_, so I
> am more than happy to defer to your decisions.
>
> I’m sure everyone is looking forward to the improved consistency of this
> work.
>
>
> From: Ekaterina Dimitrova <e.dimitr...@gmail.com>
> Date: Wednesday, 20 October 2021 at 22:27
> To: dev@cassandra.apache.org <dev@cassandra.apache.org>
> Subject: Re: [DISCUSS] CASSANDRA-15234
> Hi everyone,
>
> I think it is time to summarize the discussion.
>
> First of all, thank you for all the valuable input, suggestions, concerns,
> and comments!
>
> The things that I believe we all agree on:
>
>    - Simplicity for maintenance on our end - automation as much as possible
>      so we don’t have to maintain more than one configuration file and our
>      config is less prone to human errors while adding new features
>    - Simplicity for our users - as clear and as simple as possible,
>      keeping the users’ toolset in mind
>    - Simplicity for testing and verification of the different config file
>      formats
>
>
> It seems to me that most people want to see both proposed versions
> committed (feel free to correct me if I am wrong), with the default values
> revised and potentially all parameters that do not have to be changed
> commented out. Also, versions with stripped comments, plus a way to
> maintain everything automatically as much as possible.
>
> With that said, it seems to me the current patch in CASSANDRA-15234 can be
> committed after a rebase and after addressing any outstanding review
> comments. The new version of cassandra.yaml, which groups the parameters,
> can be added in a new ticket by me or anyone with free cycles for that. It
> will require additional work on backward compatibility, so that Cassandra
> can keep operating with all of the current versions, but it will be an
> additional option that doesn’t disqualify the old ones. It therefore seems
> fair game to add it at any point in the future, as it won’t be a breaking
> change. We won’t replace anything. We will only add more options.
>
> If someone disagrees and wants to implement all possible options and
> functionalities at once, I will be happy to hand over the work and try to
> find the time to provide feedback/reviews later.
>
> Please do not hesitate to correct me if I misunderstood something.
>
> I will leave this discussion open until Monday and if there are no
> objections I will continue with CASSANDRA-15234 as per my proposal.
>
> Best regards,
>
> Ekaterina
>
> On Fri, 10 Sep 2021 at 20:18, Patrick McFadin <pmcfa...@gmail.com> wrote:
>
> > Ah, I feel like cassandra.yaml discussions are such an evergreen topic.
> >
> > This was something brought up a while back, but I remember years ago we
> > talked about emulating the config options that some other databases have
> > done. Providing different versions of the config for different approaches.
> > For instance, MySQL has had 'my-small.cnf' with just the bare minimum
> > config and restricted parameters for something like a laptop. A friendly
> > option for newcomers would be a clearly labeled 'cassandra-small.yaml'
> > with just the bare minimum and good comments. Then people new to Cassandra
> > wouldn't have a panic moment wondering if they have to know what concurrent
> > compactors are and how many you actually need? (Is there a right answer
> > even???) It's tackling the way operators approach config by the use case
> > they are trying to satisfy. Run one node on my laptop. Run a small cluster
> > on a budget cloud server. Run any size cluster on a ginormous server.
> >
> > Unfortunately, the cleaner solution would be how Apache HTTPD solved it
> > back in the day with include files. It made config management much easier
> > and the overwhelm factor much lower. YAML doesn't support it and it would
> > all have to be custom code in the Cassandra config loader. Not the best
> > option really.
> >
> > Back to the original question, I think Ekaterina's sectioned version could
> > be used for new operators because there is a lot to learn looking at the
> > comments.  Publish the following options:
> >
> > cassandra-small.yaml: Just the 'Quickstart' section
> > cassandra-medium.yaml: 'Quickstart' and 'Commonly used' with sane defaults
> > cassandra-advanced.yaml: Every section
> >
> > The addition is a similarly named JVM properties file.
> >
> > As somebody who has been using Cassandra for a while and would like to have
> > a more verbose version (especially for config management) Benedict's
> > grouped version is fantastic. Just one option there:
> >
> > cassandra-full.yaml
> >
> > That's my idea to satisfy the various operators that approach a new
> > install.
> >
> > Patrick
> >
> > > On Fri, Sep 10, 2021 at 3:31 PM Jeremiah D Jordan <jeremiah.jor...@gmail.com>
> > > wrote:
> >
> > > > > Also, if you run the above command you will see we actually have a
> > > > > lot of things show (129 lines)… it would be nice to clean it up as
> > > > > only a small subset is required and normal users won’t care about
> > > > > most of what is shown
> > > >
> > > > +1 for this.  It would be good to clean up the config code and yaml such
> > > > that only “things that are required to be changed” are not commented out
> > > > in the file, and everything else is commented out by default.  Last I
> > > > checked there were many fields that when commented out would not use a
> > > > sensible value, or would result in NPEs because they didn’t have a code
> > > > level default.
> > >
> > > -Jeremiah
> > >
> > > > > On Sep 10, 2021, at 1:24 PM, David Capwell <dcapw...@apple.com.INVALID> wrote:
> > > > >
> > > > > We can have both, but I would hope we do not have humans maintaining
> > > > > both.  If we maintain the commented one, and did something like the
> > > > > below while we compile then the burden to maintain doesn’t exist
> > > > >
> > > > > # remove comments and empty lines
> > > > > $ egrep -v '^[[:space:]]*#|^[[:space:]]*$' conf/cassandra.yaml.doc > conf/cassandra.yaml
> > > > >
> > > > > We do this right now with conf/hotspot_compiler so as long as our build
> > > > > maintains the other file +1
> > > >
> > > > > Also, if you run the above command you will see we actually have a
> > > > > lot of things show (129 lines)… it would be nice to clean it up as
> > > > > only a small subset is required and normal users won’t care about
> > > > > most of what is shown
> > > >
> > > >> On Sep 3, 2021, at 6:45 AM, bened...@apache.org wrote:
> > > >>
> > > > >>> I think as the comments were stripped only for the POC. I guess many
> > > > >>> of them will get back in the actual doc version unfortunately.
> > > > >>
> > > > >> Well, I think the grouped format lends itself to much briefer comments,
> > > > >> with groups of related parameters getting an overall description. Even
> > > > >> as a developer who understands most of the toggles I found the old file
> > > > >> very hard to navigate.
> > > > >>
> > > > >> I also don’t see why we cannot have both heavily commented versions and
> > > > >> uncommented (or lightly commented) versions.
> > > > >>
> > > > >> I don’t personally see why multiple different config templates would be
> > > > >> confusing if they’re in a suitably labelled directory, even if we settle
> > > > >> on one for the default. It might even be nice to have a pared-down
> > > > >> config that has only those properties we expect the normal user to need,
> > > > >> so it’s particularly easy to navigate.
> > > >>
> > > >>
> > > >> From: Ekaterina Dimitrova <e.dimitr...@gmail.com>
> > > >> Date: Friday, 3 September 2021 at 14:40
> > > >> To: dev@cassandra.apache.org <dev@cassandra.apache.org>
> > > >> Subject: Re: [DISCUSS] CASSANDRA-15234
> > > >>>>
> > > >>>> It’s worth noting that the two don’t have to be in >conflict: we
> > could
> > > >>> offer two template yaml with the parameters grouped differently,
> for
> > > users
> > > >>> to decide for themselves.
> > > >>
> > > >> Sure, my only concern is that three versions of the yaml could bring
> > > >> confusion (we will have backward compatibility to the current one
> for
> > > some
> > > >> time). But it might be only me. I am open for feedback
> > > >>
> > > >>
> > > >>> If we can document this, it would be great as stuff >like “enabled”
> > are
> > > >>> inconsistent so not sure if I did it properly =D
> > > >>>
> > > >> Well, this is for now only in the ticket in the first version but no
> > one
> > > >> raised any concern. We will definitely have to update our docs on
> this
> > > and
> > > >> whatever else we came to agreement on - both for users and
> > contributors.
> > > >>
> > > >>> though I will agree that it can be hard for some >tools (such
> > > >>> as bash templating), but feel we can always find a >common ground
> > > >> Valid point and I believe it is one of the reasons we delayed the
> > > ticket,
> > > >> in order to get feedback on that. I am really interested to hear
> what
> > > >> concerns people might have.
> > > >>
> > > >>
> > > >>> Opening up a 1500+ line .yaml file is very daunting, >even if most
> of
> > > it is
> > > >>> comments. Can't blame folks for being >overwhelmed at the prospect
> of
> > > >> tuning
> > > >>> Cassandra w/that as our operator config API. :)
> > > >> I am all in for simplification and to make our users’ lives easier.
> > But
> > > at
> > > >> this point we shouldn’t be comparing the length of the files I think
> > as
> > > the
> > > >> comments were stripped only for the POC. I guess many of them will
> get
> > > back
> > > >> in the actual doc version unfortunately.
> > > >>
> > > >> Thank you all,
> > > >> Ekaterina
> > > >>
> > > > >> On Thu, 2 Sep 2021 at 20:07, Joshua McKenzie <jmcken...@apache.org> wrote:
> > > >>
> > > > >>> Reading through the two, the grouping approach seems like it's a lot
> > > > >>> more friendly to newcomers as well as providing context specific cues
> > > > >>> for relationships between params you're editing. Showing and not
> > > > >>> telling, if you will.
> > > > >>>
> > > > >>> Opening up a 1500+ line .yaml file is very daunting, even if most of it
> > > > >>> is comments. Can't blame folks for being overwhelmed at the prospect of
> > > > >>> tuning Cassandra w/that as our operator config API. :)
> > > >>>
> > > >>> ~Josh
> > > >>>
> > > > >>> On Thu, Sep 2, 2021 at 1:48 PM David Capwell <dcapw...@apple.com.invalid>
> > > > >>> wrote:
> > > > >>>
> > > > >>>> Thanks for bringing this back up; Caleb and I were talking about the
> > > > >>>> lack of clarity with regard to CASSANDRA-16896, fleshing this out
> > > > >>>> would make those configs nicer!
> > > > >>>>
> > > > >>>>> To standardize naming - that we did by agreeing to the form noun_verb
> > > > >>>>
> > > > >>>> If we can document this, it would be great as stuff like “enabled” are
> > > > >>>> inconsistent so not sure if I did it properly =D
> > > > >>>>
> > > > >>>>>
> > > > >>>>> Provision of values with units while maintaining backward
> > > > >>>>> compatibility.
> > > > >>>>
> > > > >>>> +1000000000000
> > > > >>>>
> > > > >>>> I really hate local_read_size_threshold_kb; I would love
> > > > >>>> local_read_size_threshold: 10kb.  Once we have the infrastructure in
> > > > >>>> place (believe your patch before had these tools) I would love to
> > > > >>>> switch!
> > > >>>>
> > > >>>>
> > > > >>>>> Another proposal is done by Benedict; grouping the config parameters.
> > > > >>>>
> > > > >>>> Yep, this is what triggered Caleb and I to talk about this thread! To
> > > > >>>> group or not to group; that is the question
> > > > >>>>
> > > > >>>> Personally I like grouping from an organization point of view so am in
> > > > >>>> favor of that; though I will agree that it can be hard for some tools
> > > > >>>> (such as bash templating), but feel we can always find a common ground
> > > >>>>
> > > >>>>
> > > >>>>> On Sep 2, 2021, at 8:44 AM, bened...@apache.org wrote:
> > > >>>>>
> > > >>>>> Thanks for bringing this to the list Ekaterina!
> > > >>>>>
> > > > >>>>> It’s worth noting that the two don’t have to be in conflict: we could
> > > > >>>>> offer two template yaml with the parameters grouped differently, for
> > > > >>>>> users to decide for themselves.
> > > > >>>>>
> > > > >>>>> The proposals primarily define parameter names differently, with my
> > > > >>>>> proposal going by kind->place, and the other proposal maintaining
> > > > >>>>> (mostly) the existing name form (which is a bit more like place->kind).
> > > > >>>>> While the example yaml groups by kind, you can convert nested
> > > > >>>>> definitions into a ‘dot’ form (e.g. limits.concurrency.reads) for use
> > > > >>>>> in a different grouping.
> > > > >>>>>
> > > > >>>>> One advantage of grouping parameters together is that it aids
> > > > >>>>> maintaining coherency of naming between systems, and also potentially
> > > > >>>>> permits a more succinct config file and better discovery. But it’s far
> > > > >>>>> from a silver bullet, as value judgements have to be made about where
> > > > >>>>> the grouping lines are. I’m sure anything we settle on will be a huge
> > > > >>>>> improvement over the status quo, however.
> > > >>>>>
> > > >>>>>
> > > >>>>>
> > > >>>>>
> > > >>>>> From: Ekaterina Dimitrova <e.dimitr...@gmail.com>
> > > >>>>> Date: Thursday, 2 September 2021 at 16:32
> > > >>>>> To: dev@cassandra.apache.org <dev@cassandra.apache.org>
> > > >>>>> Subject: [DISCUSS] CASSANDRA-15234
> > > >>>>> Hi team,
> > > >>>>>
> > > > >>>>> I would like to bring to the attention of the community
> > > > >>>>> CASSANDRA-15234, standardise config and JVM parameters.
> > > > >>>>>
> > > > >>>>> This is work we discussed back in Summer 2020 just before our first 4.0
> > > > >>>>> Beta release. During the discussion we figured out that there is more
> > > > >>>>> than one option to do the job and not enough time to get user feedback
> > > > >>>>> and finish it, so this was delayed post-4.0. And here I am, bringing it
> > > > >>>>> back to the table.
> > > >>>>>
> > > > >>>>> This work’s goal is:
> > > > >>>>>
> > > > >>>>> - To standardize naming - that we did by agreeing to the form noun_verb
> > > > >>>>> - Provision of values with units while maintaining backward
> > > > >>>>>   compatibility.
> > > >>>>>
> > > >>>>>
> > > >>>>> Those two parts are more or less already done.
> > > >>>>>
> > > > >>>>> More interesting is the third part - reorganizing the cassandra.yaml
> > > > >>>>> file.
> > > > >>>>>
> > > > >>>>> My personal approach was to split it into sections, done here
> > > > >>>>> <https://github.com/ekaterinadimitrova2/cassandra/blob/b4eebe080835da79d032f9314262c268b71172a8/conf/cassandra.yaml>.
> > > > >>>>>
> > > > >>>>> Another proposal is done by Benedict; grouping the config parameters.
> > > > >>>>>
> > > > >>>>> To make it clearer, he created a yaml
> > > > >>>>> <https://github.com/belliottsmith/cassandra/blob/5f80d1c0d38873b7a27dc137656d8b81f8e6bbd7/conf/cassandra_nocomment.yaml>
> > > > >>>>> with comments mostly stripped.
> > > >>>>>
> > > > >>>>> In his version, there are basic settings for network, disk etc all
> > > > >>>>> grouped together, followed by operator tuneables mostly under limits
> > > > >>>>> within which we now have throughput, concurrency, capacity. This leads
> > > > >>>>> to settings for some features being kept separate (most notably for
> > > > >>>>> caching), but helps the operator understand what they have to play with
> > > > >>>>> for controlling resource consumption.
> > > > >>>>>
> > > > >>>>> I am interested to hear what people think about the two options or if
> > > > >>>>> anyone has another idea to share, open discussion.
> > > >>>>>
> > > >>>>> Thank you,
> > > >>>>>
> > > >>>>> Ekaterina
> > > >>>>
> > > >>>>
> > > >>>>
> > ---------------------------------------------------------------------
> > > >>>> To unsubscribe, e-mail: dev-unsubscr...@cassandra.apache.org
> > > >>>> For additional commands, e-mail: dev-h...@cassandra.apache.org
> > > >>>>
> > > >>>>
> > > >>>
> > > >
> > > >
> > > > ---------------------------------------------------------------------
> > > > To unsubscribe, e-mail: dev-unsubscr...@cassandra.apache.org
> > > > For additional commands, e-mail: dev-h...@cassandra.apache.org
> > > >
> > >
> > >
> > > ---------------------------------------------------------------------
> > > To unsubscribe, e-mail: dev-unsubscr...@cassandra.apache.org
> > > For additional commands, e-mail: dev-h...@cassandra.apache.org
> > >
> > >
> >
>
