From the perspective of an end user who is reading multiple versions' 
listings at once, seeing the same JIRA listed as fixed in multiple releases is 
totally confusing, especially now that release notes are actually readable.  
"So which version was it ACTUALLY fixed in?" is going to be the question. It'd 
be worthwhile for folks to actually build, say, trunk and look at the release 
notes section of the site build to see how these things are presented in 
aggregate before coming to any conclusions.  Just viewing a single version's 
output will likely give a skewed perspective.  (Or, I suppose you can read 
https://gitlab.com/_a__w_/eco-release-metadata/tree/master/HADOOP too, but the 
sort order is "wrong" for web viewing.)

        My read of the HowToCommit fix rules is that they were written from the 
perspective of how we typically use branches to cut releases. In other words, 
the changes and release notes for 2.6.x (x>0) and 2.7.y (y>0) will likely not 
be fully present/complete in 2.8.0, so 2.8.0's notes wouldn't actually reflect 
the entirety of, say, the 2.7.4 release if 2.7.4 and 2.8.0 are being worked in 
parallel.  This in turn means the changes and release notes become orthogonal 
once the minor release branch is cut. This is also important because there is 
no guarantee that a change made in, say, 2.7.4 is actually in 2.8.0: the 
code may have changed to the point that the fix isn't needed or wanted.

        From an automation perspective, I took this to mean that the a.b.0 
release notes are expected to be committed to all non-released major 
branches.  So trunk will have release notes for 2.7.0, 2.8.0, 2.9.0, etc. 
but not for 2.7.1, 2.8.1, or 2.9.1.  This makes the fix rules actually pretty 
easy:  list the lowest a.b.0 release and all non-.0 releases.  trunk, as 
always, is only listed if that is the only place where it was committed 
(i.e., the lowest a.b.0 release happens to be the highest one available).
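
        To make that rule concrete, here is a rough sketch of it in Python. 
This is only an illustration, not the actual release tooling: the function 
name fix_versions is made up, versions are assumed to be plain a.b.c strings 
(no -alpha/-beta suffixes), and "trunk" is treated as a literal label rather 
than being mapped to a version number.

```python
def fix_versions(committed):
    """Reduce the places a change was committed to the fix versions to list.

    Hypothetical helper: `committed` holds plain "a.b.c" strings and,
    optionally, the literal label "trunk".
    """
    committed = list(committed)
    releases = [v for v in committed if v != "trunk"]

    # trunk is listed only when it is the sole place the change landed.
    if not releases:
        return ["trunk"] if "trunk" in committed else []

    parsed = [tuple(int(part) for part in v.split(".")) for v in releases]
    dot_zero = [v for v in parsed if v[2] == 0]      # a.b.0 releases
    maintenance = [v for v in parsed if v[2] != 0]   # a.b.c with c > 0

    # Keep every non-.0 release, plus only the lowest a.b.0 release.
    keep = maintenance + ([min(dot_zero)] if dot_zero else [])
    return [".".join(str(n) for n in v) for v in sorted(keep)]


if __name__ == "__main__":
    print(fix_versions(["2.6.5", "2.7.4", "2.8.0", "2.9.0", "trunk"]))
    print(fix_versions(["trunk"]))
```

        With a change committed to 2.6.5, 2.7.4, 2.8.0, 2.9.0 and trunk, the 
sketch keeps 2.6.5, 2.7.4 and 2.8.0; 2.9.0 and trunk are dropped because a 
lower a.b.0 release already covers them.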

        I suspect people are feeling confused or think the rules need to be 
changed mainly because a) we have a lot more branches getting RE work than ever 
before in Hadoop's history and b) 2.8.0 has been hanging out in an unreleased 
branch for ~7 months.  [The PMC should probably vote to kill that branch and 
just cut a new 2.8.0 based off of the current top of branch-2. I think that'd 
go a long way to clearing the confusion as well as actually making 2.8.0 
relevant again for those that still want to work on branch-2.]

        Also:

> Assuming the semantic versioning (http://semver.org) as
> our baseline thinking, 

        We don't use semantic versioning and you'll find zero references to it 
in any Apache Hadoop documentation.  If we were following semver, even in the 
loosest sense, 2.7.0 should have been 3.0.0 because of the JRE upgrade 
requirement (which, ironically, is still causing issues for folks moving 
things between 2.6 and 2.7+; see the other thread about the Dockerfile). In a 
stricter sense, we should be on v11 or something, given the number of 
incompatible changes throughout branch-2's history.


> On Jul 22, 2016, at 11:44 AM, Andrew Wang <andrew.w...@cloudera.com> wrote:
> 
>> 
>> 
>>> I am also not quite sure I understand the rationale of what's in the
>> HowToCommit wiki. Assuming the semantic versioning (http://semver.org) as
>> our baseline thinking, having concurrent release streams alone breaks the
>> principle. And that is *regardless of* how we line up individual releases
>> in time (2.6.4 v. 2.7.3). Semantic versioning means 2.6.z < 2.7.* where *
>> is any number. Therefore, the moment we have any new 2.6.z release after
>> 2.7.0, the rule is broken and remains that way. Timing of subsequent
>> releases is somewhat irrelevant.
>> 
>> From a practical standpoint, I would love to know whether a certain patch
>> has been backported to a specific version. Thus, I would love to see fix
>> version enumerating all the releases that the JIRA went into. Basically the
>> more disclosure, the better. That would also make it easier for us
>> committers to see the state of the porting and identify issues like being
>> ported to 2.6.x but not to 2.7.x. What do you think? Should we revise our
>> policy?
>> 
>> 
> I also err towards more fix versions. Based on our branching strategy of
> branch-x -> branch-x.y -> branch-x.y.z, I think this means that the
> changelog will identify everything since the previous
> last-version-component of the branch name. So 2.6.5 diffs against 2.6.4,
> 2.8.0 diffs against 2.7.0, 3.0.0 against 2.0.0. This makes it more
> straightforward for users to determine what changelogs are important, based
> purely on the version number.
> 
> I agree with Sangjin that the #1 question that the changelogs should
> address is whether a certain patch is present in a version. For this
> usecase, it's better to have duplicate info than to omit something.
> 
> To answer "what's new", I think that's answered by the manually curated
> release notes, like the ones we put together at HADOOP-13383.

