I'm surprised by this compatibility requirement. It's quite onerous, since
it means we can't evolve the output at all. There's no standardized way to
parse CLI output, so who knows what might break user scripts. For example,
if we wanted to display a "+" for ACLs in ls output, that'd be incompatible.
Same deal for an xattr or encryption bit in ls output, or for adding new
cluster/node state to dfsadmin -report. We have some left- and
right-justified columns in cacheadmin output, and changing a column header
might add an extra space and again break a script. Our CLI output is just
not intended to be a stable API.
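
As a rough illustration of the ls case, here is a minimal Python sketch of
the kind of user script that could break. The sample lines and the regular
expression are invented for this example; they are not copied from real
ls output or from any real script.

```python
import re

# Hypothetical pattern a user script might use for one line of ls-style
# output: exactly 10 permission characters, then whitespace-separated fields.
LS_LINE = re.compile(r"^([d-][rwx-]{9})\s+(\S+)\s+(\S+)\s+(\S+)\s+(\d+)\s+(.*)$")

def parse_ls_line(line):
    """Return (permissions, owner, size), or None if the line does not match."""
    m = LS_LINE.match(line)
    if m is None:
        return None
    return m.group(1), m.group(3), int(m.group(5))

# Invented sample lines, before and after a "+" ACL marker is appended.
old_line = "-rw-r--r--   3 andrew supergroup       1024 /user/andrew/data.txt"
new_line = "-rw-r--r--+  3 andrew supergroup       1024 /user/andrew/data.txt"

print(parse_ls_line(old_line))  # parsed tuple
print(parse_ls_line(new_line))  # None: the "+" no longer fits the pattern
```

The script is not doing anything unreasonable, yet a one-character addition
to the output is enough to make it silently stop matching.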

This is also not something typically upheld by unix-y commands. BSD vs. GNU
already leads to incompatible flags and output. Most of these commands
haven't been changed in 20 years, but that doesn't constitute a compat
guarantee.

One example I like is git. It splits its commands into "porcelain" and
"plumbing", where plumbing is meant for script use. An excerpt from the man
page:

       The interface (input, output, set of options and the semantics) to
       these low-level commands are meant to be a lot more stable than
       Porcelain level commands, because these commands are primarily for
       scripted use. The interface to Porcelain commands on the other hand
       are subject to change in order to improve the end user experience.

This is something I'd like to follow for our own commands: provide
different APIs for machine consumption vs. human consumption, and make this
clear in the compat guide. Of course, we should still be judicious when
changing the human output, but I just don't see a good way forward without
relaxing our current compat guidelines.
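
A minimal sketch of what that split could look like, in Python for brevity.
Everything here is hypothetical: the function names, the field names, and
the choice of JSON are assumptions for illustration, not an existing Hadoop
interface.

```python
import json

# Hypothetical report data; the field names are invented for this sketch.
report = {"total_blocks": 120, "corrupt_blocks": 2}

def render_human(data):
    """Porcelain-style output: layout, wording, and order are free to change."""
    lines = []
    for key, value in data.items():
        label = key.replace("_", " ").capitalize() + ":"
        lines.append(f"{label:<28}{value:>8}")
    return "\n".join(lines)

def render_machine(data):
    """Plumbing-style output: JSON that only evolves by adding new fields."""
    return json.dumps(data, sort_keys=True)

def read_corrupt_blocks(machine_output):
    """A consumer that looks fields up by name and ignores unknown ones."""
    return json.loads(machine_output)["corrupt_blocks"]

# Adding a field changes the human-readable text, but a consumer of the
# machine format still reads the same value.
newer = dict(report, decommissioning_replicas=5)
print(render_human(newer))
print(read_corrupt_blocks(render_machine(report)))
print(read_corrupt_blocks(render_machine(newer)))
```

The point of the sketch is only that the compat guarantee attaches to the
machine format, where additions are safe for consumers that read fields by
name, while the human format stays free to improve.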

The other thing to consider is providing supported Java APIs for the
commonly-parsed shell commands. Maintaining compatibility for Java APIs is
something we have much more experience with.

Best,
Andrew

On Fri, Apr 24, 2015 at 1:17 PM, Yongjun Zhang <yzh...@cloudera.com> wrote:

> Thanks Chris, good clarification!
>
> --Yongjun
>
> On Fri, Apr 24, 2015 at 12:36 PM, Chris Nauroth <cnaur...@hortonworks.com>
> wrote:
>
> > Metrics/JMX is covered by our compatibility guidelines:
> >
> >
> > http://hadoop.apache.org/docs/r2.7.0/hadoop-project-dist/hadoop-common/Compatibility.html#MetricsJMX
> >
> >
> > Metrics/JMX is similar to our usage of Protocol Buffers/JSON that I
> > mentioned.  It supports backwards-compatible evolution if the change is
> > done correctly.  Adding new fields/beans is compatible.  Changing the
> > names or data types of existing fields/beans is incompatible.  Deleting
> > existing fields/beans is incompatible.
> >
> > --Chris Nauroth
> >
> >
> >
> >
> > On 4/24/15, 11:19 AM, "Yongjun Zhang" <yzh...@cloudera.com> wrote:
> >
> > >Thanks Allen and Chris!
> > >
> > >What about adding new entries to jmx report? Somehow I had the impression
> > >that if we add new entries to it, it's not considered incompatible. Often
> > >within the same minor release, we want to add new info to jmx report
> > >instead of waiting for a major release.
> > >
> > >For CLI like fsck, maybe we can add a new command line option to enable
> > >the change, and if the command line option is not enabled, don't change
> > >the output, so we can still commit the change within the same release line?
> > >
> > >Thanks.
> > >
> > >--Yongjun
> > >
> > >
> > >On Fri, Apr 24, 2015 at 11:05 AM, Chris Nauroth <cnaur...@hortonworks.com>
> > >wrote:
> > >
> > >> Allen, thank you for calling this out.  I was not aware of this part of
> > >> the compatibility guidelines.  I committed one of those fsck changes in
> > >> HDFS-7933.  I see you flagged the issue as incompatible, which agrees
> > >> with the compatibility guidelines.
> > >>
> > >> "Changing the path of a command, removing or renaming command line
> > >> options, the order of arguments, or the command return code and output
> > >> break compatibility and may adversely affect users."
> > >>
> > >> Most of this intuitively makes sense.  Even ignorant of the
> > >> compatibility guidelines, I would have known to push back on patches
> > >> that change the path, remove or rename existing options, or change the
> > >> order of arguments.
> > >>
> > >> HDFS-7933 was an example of an output change, and I find this part of
> > >> the compatibility guidelines much more challenging.  We need to be able
> > >> to evolve CLI output within a release line.  On the protocol side, our
> > >> use of Protocol Buffers and JSON supports evolution if we use it
> > >> correctly.  How can we achieve the equivalent for the CLI?  For example,
> > >> can we turn HDFS-7933 into a backwards-compatible change if it preserves
> > >> the old output, and only adds the new information if the user passes a
> > >> new argument, such as -count-decom?
> > >>
> > >> Are there other specific issues that you have in mind for CLI
> > >> incompatibility problems?  Let's see if we can find a way to amend them
> > >> to satisfy the compatibility guidelines.
> > >>
> > >> --Chris Nauroth
> > >>
> > >>
> > >>
> > >>
> > >> On 4/24/15, 1:02 AM, "Allen Wittenauer" <a...@altiscale.com> wrote:
> > >>
> > >> >
> > >> >On Apr 24, 2015, at 5:53 AM, Yongjun Zhang <yzh...@cloudera.com>
> > >> >wrote:
> > >> >
> > >> >>
> > >> >> Basically we are adding two additional lines to the report (as
> > >> >> highlighted above).
> > >> >>
> > >> >> Theoretically if a tool parses existing fsck report and expects the
> > >> >> "Corrupt blocks" entry to be right after the "Average block
> > >> >> replication" entry, then the change would break the tool. But is this
> > >> >> really a concern?
> > >> >>
> > >> >> I guess this is not really a concern, so I don't think this change is
> > >> >> incompatible. But would anyone please comment?
> > >> >>
> > >> >
> > >> >       If it changes the output of a CLI command, it's an incompatible
> > >> >change:
> > >> >
> > >> >http://hadoop.apache.org/docs/r2.7.0/hadoop-project-dist/hadoop-common/Compatibility.html#Command_Line_Interface_CLI
> > >> >
> > >> >
> > >> >       Other changes to fsck have been punted to 3.x for the *exact
> > >> >same reason*. In other cases, committers have violated these rules in
> > >> >branch-2 (not just to fsck, but to all sorts of other command line bits,
> > >> >even removing command options!) to the point that our compatibility
> > >> >guarantees are pretty much useless.  It's open season on nuking the
> > >> >ecosystem. :(
> > >> >
> > >> >       People not following the compat rules is one of the reasons I
> > >> >started building my own changes and release notes, because we have too
> > >> >many committers either accidentally committing incompatible changes or
> > >> >just outright lying about them.  (... and, as much as I hate to say it,
> > >> >the HDFS project is easily the biggest offender.)
> > >>
> > >>
> >
> >
>
