If anybody understood my message as promoting the removal of all the commands whose output we can get by other means, that is not the case at all. I do not want to remove any of them. I am just elaborating on "parsing the output of nodetool and problems related to that if it is changed" in this particular case.
________________________________________
From: Miklosovic, Stefan <stefan.mikloso...@netapp.com>
Sent: Saturday, July 8, 2023 17:43
To: dev
Subject: Re: Changing the output of tooling between majors

Thank you, Josh, for your insight.

I think they should not parse that output in the first place. Gradually introducing JSON / YAML output formats for nodetool is cool, but I think it started to happen too late; people were already parsing the raw nodetool output, and here we are.

I played with nodetool a little bit to see where we are with this. There are 135 commands in total. We can leave out all "set*" commands, because people just don't parse their output, but we cannot ignore "get*" because that is potential output to parse. That leaves 116 commands. We can also ignore all "disable*" and "enable*" commands, which brings us to 98. Then there is the group of "invalidate*" commands; we can skip them too, so we are at 90, and minus the help command, 89.

The commands that are left fall into two main groups: commands which execute some action, and commands which display some statistics or state about the internals of a Cassandra node. The output of the first group, the "action commands", is again not going to be parsed. These are listed in (1) (I could have made some mistakes here and there).

So the commands whose output we can potentially parse are listed in (2); there are roughly 51 of them.
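The tally above can be reproduced in a few lines of Python. The per-group sizes are not stated in the message; they are inferred here from the differences between the running totals (135, 116, 98, 90, 89), so treat them as assumptions rather than counts taken from nodetool itself.

```python
# Running totals quoted in the message: 135 -> 116 -> 98 -> 90 -> 89.
TOTAL_COMMANDS = 135

# Group sizes inferred from the differences between those totals (assumption).
EXCLUDED_GROUPS = [
    ("set*", 19),
    ("enable* / disable*", 18),
    ("invalidate*", 8),
    ("help", 1),
]


def running_totals(total, groups):
    """Return the number of commands left after each group is excluded."""
    remaining = total
    totals = []
    for _name, size in groups:
        remaining -= size
        totals.append(remaining)
    return totals


totals = running_totals(TOTAL_COMMANDS, EXCLUDED_GROUPS)
print(totals)
```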
Some of these commands have an equivalent in the system_views vtables. These are, if I haven't forgotten something:

clientstats (system_views.clients)
compactionhistory (system.compaction_history)
compactionstats (system_views.sstable_tasks)
gossipinfo (system_views.gossip_info)
listsnapshots (system_views.snapshots)
tpstats (system_views.thread_pools)

Some of them already support a different output format (JSON or YAML). They are:

datapaths
tablestats
tpstats (also has a CQL table)
compactionhistory (also has a CQL table)

I would argue that some commands with the prefix "status" and "get" can go away too, because their value is visible in system_views.settings. Some of these settings will even be updateable after Maxim's work.

statusbackup                 incremental_backups
statushandoff                hinted_handoff_enabled
getmaxhintwindow             max_hint_window
getconcurrentcompactors      concurrent_compactors
getconcurrentviewbuilders    concurrent_materialized_view_builders
getdefaultrf                 default_keyspace_rf
gettimeout                   (this just reflects cassandra.yaml more or less)

Then there is the whole family of "get throttle / threshold" commands like the ones below. I am too lazy to go through them, but they are somehow retrievable from CQL system_views.settings too.
getbatchlogreplaythrottle
getcolumnindexsize
getcompactionthreshold
getcompactionthroughput
getinterdcstreamthroughput
getsnapshotthrottle
getstreamthroughput

There are commands which just return an integer, or for which there is nothing to change about the output / it is just not necessary, like:

gettraceprobability
getsstables

So the commands which do not have an equivalent of their output in some CQL table, and for which no JSON / YAML format is available, are:

describecluster
describering
failuredetector
gcstats
getauditlog
getauthcacheconfig
getconcurrency
getendpoints
getfullquerylog
getlogginglevels
getseeds
info
listpendinghints
netstats
profileload (replacement of toppartitions (which should be removed in 5.0, actually))
proxyhistograms
rangekeysample
repair
repair_admin
ring
status
statusautocompaction
statusbinary
statusgossip
tablehistograms
toppartitions
viewbuildstatus

From these, if one asks which ones actually make sense to try to tweak the output of, they might be:

describecluster
describering
info
listpendinghints
netstats
proxyhistograms
repair_admin (if somebody wants to list stuff in JSON)
ring
status
tablehistograms
viewbuildstatus

The point I want to make is that I do not think the problem of changing the output is that hot. There are basically 15 commands at most for which the output matters, because they have no CQL equivalent or JSON / YAML output. If we have been providing CQL / JSON / YAML for a couple of years, I do not believe that the argument "let's not break it for folks in nodetool" is still relevant. CQL output has been there since 4.0 at least (at least!), and YAML / JSON is also not something completely new. It is not like we are suddenly forcing people to change their habits; there has been enough time to move to CQL / JSON / YAML etc.

But really, the question I still don't have an answer to is who is actually parsing the output. I think I will ping the user ML to probe the situation a little bit.
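To make the "use CQL instead of parsing nodetool" alternative concrete, here is a minimal sketch that turns the mappings listed in this message into CQL statements. The function name cql_for is hypothetical, the keyspace is spelled system_views throughout, and the name / value columns of system_views.settings are an assumption worth checking against the Cassandra version in use. The sketch only builds query strings; it does not talk to a cluster.

```python
# nodetool command -> CQL table, as listed in the message.
NODETOOL_TO_TABLE = {
    "clientstats": "system_views.clients",
    "compactionhistory": "system.compaction_history",
    "compactionstats": "system_views.sstable_tasks",
    "gossipinfo": "system_views.gossip_info",
    "listsnapshots": "system_views.snapshots",
    "tpstats": "system_views.thread_pools",
}

# nodetool command -> row name in system_views.settings, as listed in the message.
NODETOOL_TO_SETTING = {
    "statusbackup": "incremental_backups",
    "statushandoff": "hinted_handoff_enabled",
    "getmaxhintwindow": "max_hint_window",
    "getconcurrentcompactors": "concurrent_compactors",
    "getconcurrentviewbuilders": "concurrent_materialized_view_builders",
    "getdefaultrf": "default_keyspace_rf",
}


def cql_for(command):
    """Return a CQL statement covering a nodetool command, or None if unmapped."""
    if command in NODETOOL_TO_TABLE:
        return f"SELECT * FROM {NODETOOL_TO_TABLE[command]}"
    if command in NODETOOL_TO_SETTING:
        name = NODETOOL_TO_SETTING[command]
        # Assumes the settings vtable exposes `name` and `value` columns.
        return f"SELECT name, value FROM system_views.settings WHERE name = '{name}'"
    return None


for cmd in ("tpstats", "getdefaultrf", "ring"):
    print(cmd, "->", cql_for(cmd))
```

Commands such as ring or status return None here, which matches the point of the message: those are the ones with no CQL or JSON / YAML equivalent yet.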
(1) https://gist.github.com/smiklosovic/3f4ea8ccae53ad503af13c53789815be
(2) https://gist.github.com/smiklosovic/f9a681016c22e2dfe88c883b6881cb7c

________________________________________
From: Josh McKenzie <jmcken...@apache.org>
Sent: Saturday, July 8, 2023 14:47
To: dev
Subject: Re: Changing the output of tooling between majors

> Once there is, we are free to change the default output however we want.

One thing I always try to keep in mind on discussions like this. A thought experiment (with very hand-wavy numbers; try not to get hung up on them):

* Let's say there are 5,000 discrete "users" of C* out there (different groups of people using the DB)
* And assume 5% have written some kind of scripting / automation to parse our tooling output (250)
* And let's assume it'd take 18 developer hours (a few days at 6 hours/day) to retool to the new output, validate and test correctness, and then roll it out to qa, test, validate, and then to prod, test, validate

You're looking at 250 * 18 hours = 4,500 hours, or 112.5 40-hour work weeks (2+ years for some poor sod without vacations) worth of work from what seems to be a simple change.

Now, that estimate could be off by an order of magnitude either way, but the motion of the exercise is valuable, I think. There's a real magnified downstream cost to our community when we make changes to APIs, and we need to weigh that against the cost to the project in terms of maintaining those interfaces.

The above mental exercise really strongly applies to the periodic discussions where we talk about deprecating JMX support.

Not saying we should or shouldn't change things here for the record, just want to call this out for anyone that might not have been thinking about things this way.
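The thought experiment above is simple enough to write down as a small calculation, which also makes it easy to plug in different guesses. The function and parameter names are hypothetical; the inputs are the hand-wavy numbers from the message, not measurements.

```python
def retooling_cost(users, percent_parsing, hours_per_team, hours_per_week=40):
    """Estimate the downstream effort caused by changing tooling output.

    Returns (teams affected, total developer hours, total work weeks).
    """
    # Integer arithmetic keeps the team count exact (5% of 5,000 is 250).
    teams = users * percent_parsing // 100
    total_hours = teams * hours_per_team
    return teams, total_hours, total_hours / hours_per_week


# Numbers from the message: 5,000 users, 5% parse the output, 18 hours each.
teams, hours, weeks = retooling_cost(5000, 5, 18)
print(f"{teams} teams, {hours} hours, {weeks} work weeks")

# "Off by an order of magnitude either way" gives the range of total hours.
print(f"range: {hours // 10} to {hours * 10} hours")
```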
On Fri, Jul 7, 2023, at 3:23 PM, Brandon Williams wrote:
On Fri, Jul 7, 2023 at 2:20 PM Miklosovic, Stefan <stefan.mikloso...@netapp.com> wrote:
>
> Great thanks. That might work.
>
> So we do not change the default output unless there is json / yaml equivalent.
>
> Once there is, we are free to change the default output however we want.

Yes, exactly. Then we have the best of both worlds: programmatic access that isn't flimsy, and a pretty display however we want it.
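As an illustration of the "programmatic access that isn't flimsy" half of that trade, here is a minimal sketch of a script that asks nodetool for JSON instead of scraping the human-readable display. It assumes the command supports the -F json option (the thread names datapaths, tablestats, tpstats and compactionhistory as having JSON / YAML output); check `nodetool help <command>` before relying on it. The helper name nodetool_json and its injectable run parameter are hypothetical, added so the function can be exercised without a running node.

```python
import json
import subprocess


def nodetool_json(command, run=subprocess.run):
    """Run `nodetool <command> -F json` and return the parsed JSON document.

    `run` defaults to subprocess.run and can be replaced by a stub in tests.
    """
    result = run(
        ["nodetool", command, "-F", "json"],
        capture_output=True,  # collect stdout instead of printing it
        text=True,            # decode stdout to str
        check=True,           # raise if nodetool exits with a non-zero status
    )
    return json.loads(result.stdout)


# Usage against a live node (not executed here):
#   stats = nodetool_json("tablestats")
#   print(sorted(stats))  # top-level keys of the returned document
```

A script written this way depends only on the structure of the JSON document, so the default text display can be changed freely without breaking it.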