"Dinesh's message cautions against making "breaking" changes that are likely to
break parsing of output by current users (e.g., changes to naming/meaning/"
That is 100% correct. So by that logic, changing the output which you grep on
to something else will break your scripts if you expect it there.
For example, take the sstablemetadata command. I know it is not nodetool, but
that does not matter; it is just an example. The same "problem" can probably be
found in nodetool; sstablemetadata just came to my mind first because that is
what I hit recently.
sstablemetadata writes this:
Repaired at: 0
Originating host id: d2d12c56-7d9c-49a7-aaef-05bd2633b09e
Pending repair: --
Replay positions covered: {CommitLogPosition(segmentId=1689261027905,
position=59450)=CommitLogPosition(segmentId=1689261027905, position=60508)}
totalColumnsSet: 0
totalRows: 1
Estimated tombstone drop times:
Do you see "totalColumnsSet" and "totalRows" when all other keys in that output
(in the whole command) follow a different format? In this case, they should be
"Total columns set" and "Total rows".
So when we change it to that, anybody who is grepping "totalRows" will get no
output. That is a breaking change to me. His script stopped working.
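A minimal sketch of that breakage, in Python. The "old" sample is copied from
the output above; the "new" sample and the helper name grep_value are
hypothetical and only stand in for whatever a real script does:

```python
import re

# Lines copied from the sstablemetadata output quoted above.
OLD_OUTPUT = """\
Repaired at: 0
Pending repair: --
totalColumnsSet: 0
totalRows: 1
"""

# Hypothetical output after the two keys are renamed to match the others.
NEW_OUTPUT = """\
Repaired at: 0
Pending repair: --
Total columns set: 0
Total rows: 1
"""


def grep_value(output, key):
    """Return the text printed after 'key:' on its line, or None if absent."""
    pattern = rf"^{re.escape(key)}:[ \t]*(.*)$"
    match = re.search(pattern, output, re.MULTILINE)
    return match.group(1) if match else None


old_rows = grep_value(OLD_OUTPUT, "totalRows")  # found in the old format
new_rows = grep_value(NEW_OUTPUT, "totalRows")  # key no longer exists
```

The script itself is unchanged between the two calls; only the label moved,
and the lookup silently returns nothing.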
You are correct and I agree with you completely that STRICT ADDITIONS (what I
was suggesting) are fine because we are not breaking anything to anybody.
So here, if I want to change this, then by what Dinesh says (we change the
naming, so we break it), I need to offer a JSON / YAML alternative to what
sstablemetadata currently prints. (It might as well be nodetool; this is just
an example.)
________________________________________
From: C. Scott Andreas <[email protected]>
Sent: Thursday, July 13, 2023 17:01
To: [email protected]
Cc: [email protected]
Subject: Re: Changing the output of tooling between majors
Dinesh's message cautions against making "breaking" changes that are likely to
break parsing of output by current users (e.g., changes to
naming/meaning/position of existing fields vs. adding new ones). I don't read
his message as saying that any change to nodetool output is conditional on
offering a JSON/YAML representation, though.
What are some changes that you'd like to make?
– Scott
On Jul 13, 2023, at 7:44 AM, "Miklosovic, Stefan"
<[email protected]> wrote:
For example Dinesh said this:
"Until nodetool can support JSON as output format for all interaction and there
is a significant adoption in the user community, I would strongly advise
against making breaking changes to the CLI output."
That is where I get the idea that we need JSON output in order to fix a typo.
That is, if we look at fixing a typo as a breaking change, which I would say it
is: if somebody is grepping for it and it is not there, it will break.
Do you understand it the same way, or am I interpreting it wrong?
________________________________________
From: C. Scott Andreas <[email protected]>
Sent: Thursday, July 13, 2023 16:35
To: [email protected]
Cc: dev
Subject: Re: Changing the output of tooling between majors
"From what I see you guys want to condition any change by offering json/yaml as
well."
I don't think I've seen a proposal to block changes to nodetool output on
machine-parseable formats in this thread.
Additions of new delimited fields to nodetool output are mostly
straightforward. Changes to fields that exist today are likely to cause
problems - as Josh mentions. These seem best to take on a case-by-case basis
rather than trying to hammer out an abstract policy. What changes would you
like to make?
I do think we will have difficulty evolving output formats of text-based
Cassandra tooling until we offer machine-parseable output formats.
– Scott
On Jul 13, 2023, at 6:39 AM, Josh McKenzie <[email protected]> wrote:
I just find it ridiculous we can not change "someProperty: 10" to "Some
Property: 10" and there is so much red tape about that.
Well, we're talking about programmatic parsing here. This feels like
complaining about a compiler that won't let you build if you're missing a ;
We can change it, but that doesn't mean the aggregate cost/benefit across our
entire ecosystem is worth it. The value of correcting a typo is pretty small,
and the cost for everyone downstream is not. This is why we should spellcheck
things in APIs before we release them. :)
On Wed, Jul 12, 2023, at 2:45 PM, Miklosovic, Stefan wrote:
Eric,
I appreciate your feedback on this, especially the additional background about
where you are coming from in the second paragraph.
I think we are on the same page after all. I definitely understand that people
are depending on this output and we need to be careful. That is why I propose
to change it only in major releases. What I feel is that everybody's usage /
expectations are a little bit different and the outputs of the commands are
very diverse, so it is hard to balance this so that everybody is happy.
I am trying to come up with a solution which would not change the most
important commands unnecessarily while also having some free room to tweak the
existing commands where we see it appropriate. I just find it ridiculous we can
not change "someProperty: 10" to "Some Property: 10" and there is so much red
tape about that.
If I had to summarize this whole discussion, the best conclusion I can think
of is to not change what is used the most (this would probably need to be
defined more explicitly), and if we have to change something else, we had
better document that extensively and provide json/yaml so that people can
divorce themselves from parsing the human-readable format (which probably
everyone agrees should not happen in the first place).
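As a rough illustration of that separation, here is a small Python sketch. The
field names, labels, and values are hypothetical and are not the schema of any
actual Cassandra tool; the point is only that a script reading a stable JSON
key is unaffected when a human-readable label is reworded:

```python
import json

# Hypothetical metadata; keys are illustrative, not a real tool's schema.
metadata = {"repaired_at": 0, "total_columns_set": 0, "total_rows": 1}

# Human-readable labels, which could be reworded without touching the keys.
LABELS = {
    "repaired_at": "Repaired at",
    "total_columns_set": "Total columns set",
    "total_rows": "Total rows",
}


def render_text(data):
    """Render 'Label: value' lines intended for people."""
    return "\n".join(f"{LABELS[key]}: {value}" for key, value in data.items())


def render_json(data):
    """Render the same data with stable keys intended for scripts."""
    return json.dumps(data, sort_keys=True)


# A script reads the stable key instead of grepping for a label.
total_rows = json.loads(render_json(metadata))["total_rows"]
```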
What I am afraid of is that in order to satisfy these conditions, if, for
example, we just want to fix a typo or the format of a key of some value, then
we would need to deliver a JSON/YAML format as well if there is not one yet,
and that would mean that such a trivial change would require far more work in
terms of implementing the JSON/YAML output. Some commands are quite
sophisticated, and I do not want to be blocked from changing a field in the
human-readable output because providing the corresponding JSON/YAML format
would be a gigantic portion of the work itself.
From what I see, you guys want to condition any change on offering json/yaml
as well, and I don't know if that is not just too much.
________________________________________
From: Eric Evans <[email protected]>
Sent: Wednesday, July 12, 2023 19:48
To: [email protected]
Subject: Re: Changing the output of tooling between majors
On Wed, Jul 12, 2023 at 1:54 AM Miklosovic, Stefan <[email protected]> wrote:
I agree with Jackson that having a different output format (JSON/YAML) in order
to be able to change the default output resolves nothing in practice.
As Jackson said, "operators who maintain these scripts aren’t going to re-write
them just because a better way of doing them is newly available, usually
they’re too busy with other work and will keep using those old scripts until
they stop working".
This is true. If this approach is adopted, what will happen in practice is that
we change the output and provide a different format, and then a user detects
this change because his scripts stopped working. As he has an existing solution
in place which parses the text from the human-readable output, he will try to
fix that; he will not suddenly convert all the scripting he has to parsing JSON
just because we added it. Starting with JSON parsing might be done if he has no
scripting in place yet, but then we would not cover already existing
deployments.
I think this is quite an extreme conclusion to draw. If tooling had stable,
structured output formats, and if we documented an expectation that
human-readable console output was unstable, then presumably it would be safe to
assume that any new scripters would avail themselves of the stable formats, or
expect breakage later. I think it's also fair to assume that at least some
people would spend the time to convert their scripts, particularly if forced to
revisit them (for example, after a breaking change to console output). As
someone who manages several large-scale mission-critical Cassandra clusters
under constrained resources, this is how I would approach it.
TL;DR Don't let perfect be the enemy of good
<https://en.wikipedia.org/wiki/Perfect_is_the_enemy_of_good>
[ ... ]
For that reason, what we could agree on is that we would never change the
output for "tier 1" commands and if we ever changed something, it would be
STRICT ADDITIONS only. In other words, everything a command printed, it would
continue to print forever. Only new lines could be introduced. We need to do this
because Cassandra is evolving over time and we need to keep the output aligned
as new functionality appears. But the output would be backward compatible.
Plus, we are talking about majors only.
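One way to make "strict additions only" checkable is sketched below in Python.
It assumes the output consists of "key: value" lines; the function names and
the "New field" line are hypothetical. The check passes only if every key
printed before is still printed, in the same relative order:

```python
def keys_of(output):
    """Collect the key part of every 'key: value' line, in order."""
    return [
        line.split(":", 1)[0].strip()
        for line in output.splitlines()
        if ":" in line
    ]


def is_strict_addition(old_output, new_output):
    """True if the old keys appear, in order, among the new keys."""
    remaining = iter(keys_of(new_output))
    # 'key in remaining' advances the iterator, so order is enforced.
    return all(key in remaining for key in keys_of(old_output))


OLD = "Repaired at: 0\ntotalRows: 1\n"
ADDED = "Repaired at: 0\nNew field: 5\ntotalRows: 1\n"  # hypothetical new line
RENAMED = "Repaired at: 0\nTotal rows: 1\n"             # key was renamed

addition_ok = is_strict_addition(OLD, ADDED)
rename_ok = is_strict_addition(OLD, RENAMED)
```

A check like this compares keys only; it would not notice a change in the
meaning or format of a value.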
The only reason we would ever change the output of "tier 1" commands, if it is
not an addition, is to fix a typo in the existing output. This would again
happen only in majors.
All other output for all other commands might be changed but their output will
not need to be strictly additive. This would again happen only between majors.
What is your opinion about this?
To be clear about where I'm coming from: I'm not arguing against you or anyone
else making changes like these (in major versions, or otherwise). If —for
example— we had console output that was incorrect, incomplete, or obviously
misleading, I'd absolutely want to see that fixed, script breakage be damned.
All I want is for folks to recognize the problems this sort of thing can
create, and show a bit of empathy before submitting a change. For operators on
the receiving end, it can be really frustrating, especially when there is no
normative change (i.e. it's in service of aesthetics).
--
Eric Evans <[email protected]>
Staff SRE, Data Persistence
Wikimedia Foundation