Apache Hadoop qbt Report: trunk+JDK8 on Linux/x86

2016-07-22 Thread Apache Jenkins Server
For more details, see 
https://builds.apache.org/job/hadoop-qbt-trunk-java8-linux-x86/109/

No changes




-1 overall


The following subsystems voted -1:
asflicense unit


The following subsystems voted -1 but
were configured to be filtered/ignored:
cc checkstyle javac javadoc pylint shellcheck shelldocs whitespace


The following subsystems are considered long running:
(runtime bigger than 1h  0m  0s)
unit


Specific tests:

Failed junit tests :

   
hadoop.hdfs.server.datanode.fsdataset.impl.TestLazyPersistReplicaRecovery 
   hadoop.yarn.server.nodemanager.TestDirectoryCollection 
   hadoop.yarn.server.TestMiniYarnClusterNodeUtilization 
   hadoop.yarn.server.TestContainerManagerSecurity 
   hadoop.yarn.client.api.impl.TestYarnClient 
  

   cc:

   
https://builds.apache.org/job/hadoop-qbt-trunk-java8-linux-x86/109/artifact/out/diff-compile-cc-root.txt
  [4.0K]

   javac:

   
https://builds.apache.org/job/hadoop-qbt-trunk-java8-linux-x86/109/artifact/out/diff-compile-javac-root.txt
  [172K]

   checkstyle:

   
https://builds.apache.org/job/hadoop-qbt-trunk-java8-linux-x86/109/artifact/out/diff-checkstyle-root.txt
  [16M]

   pylint:

   
https://builds.apache.org/job/hadoop-qbt-trunk-java8-linux-x86/109/artifact/out/diff-patch-pylint.txt
  [16K]

   shellcheck:

   
https://builds.apache.org/job/hadoop-qbt-trunk-java8-linux-x86/109/artifact/out/diff-patch-shellcheck.txt
  [20K]

   shelldocs:

   
https://builds.apache.org/job/hadoop-qbt-trunk-java8-linux-x86/109/artifact/out/diff-patch-shelldocs.txt
  [16K]

   whitespace:

   
https://builds.apache.org/job/hadoop-qbt-trunk-java8-linux-x86/109/artifact/out/whitespace-eol.txt
  [12M]
   
https://builds.apache.org/job/hadoop-qbt-trunk-java8-linux-x86/109/artifact/out/whitespace-tabs.txt
  [1.3M]

   javadoc:

   
https://builds.apache.org/job/hadoop-qbt-trunk-java8-linux-x86/109/artifact/out/diff-javadoc-javadoc-root.txt
  [2.3M]

   unit:

   
https://builds.apache.org/job/hadoop-qbt-trunk-java8-linux-x86/109/artifact/out/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt
  [144K]
   
https://builds.apache.org/job/hadoop-qbt-trunk-java8-linux-x86/109/artifact/out/patch-unit-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-nodemanager.txt
  [36K]
   
https://builds.apache.org/job/hadoop-qbt-trunk-java8-linux-x86/109/artifact/out/patch-unit-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-tests.txt
  [268K]
   
https://builds.apache.org/job/hadoop-qbt-trunk-java8-linux-x86/109/artifact/out/patch-unit-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-client.txt
  [16K]
   
https://builds.apache.org/job/hadoop-qbt-trunk-java8-linux-x86/109/artifact/out/patch-unit-hadoop-mapreduce-project_hadoop-mapreduce-client_hadoop-mapreduce-client-nativetask.txt
  [124K]

   asflicense:

   
https://builds.apache.org/job/hadoop-qbt-trunk-java8-linux-x86/109/artifact/out/patch-asflicense-problems.txt
  [4.0K]

Powered by Apache Yetus 0.4.0-SNAPSHOT   http://yetus.apache.org



-
To unsubscribe, e-mail: common-dev-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-dev-h...@hadoop.apache.org

[jira] [Created] (HADOOP-13407) s3a directory housekeeping operations to be done in async thread

2016-07-22 Thread Steve Loughran (JIRA)
Steve Loughran created HADOOP-13407:
---

 Summary: s3a directory housekeeping operations to be done in async 
thread
 Key: HADOOP-13407
 URL: https://issues.apache.org/jira/browse/HADOOP-13407
 Project: Hadoop Common
  Issue Type: Sub-task
  Components: fs/s3
Affects Versions: 2.8.0
Reporter: Steve Loughran
Priority: Minor


Some of the delays on s3a calls are due to cleaning up parent pseudo 
directories; repeated getParent/GET calls to look for the entries, then to 
delete them.

We could possibly make this asynchronous; the core semantics would be retained, 
just the cleanup delayed.

Risks?
# while the cleanup is in progress, getFileStatus of parent dirs could imply that the parent dir is still empty
# failure

Of course, these risks exist today. We really need an s3a fsck.
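The deferred cleanup could be sketched roughly as below — a hypothetical illustration using a single-threaded ExecutorService, not the actual S3AFileSystem code; the class and method names are stand-ins:

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

/**
 * Illustrative sketch only: run the pseudo-directory cleanup on a
 * background thread so the write path returns before the repeated
 * getParent/GET lookups and DELETE calls complete.
 */
class AsyncDirCleanup {
  private final ExecutorService cleanupPool =
      Executors.newSingleThreadExecutor();
  // Counter so callers can observe that the deferred work actually ran.
  final AtomicInteger cleanups = new AtomicInteger();

  // Stand-in for the synchronous cleanup of empty parent pseudo-dirs.
  void deleteUnnecessaryFakeDirectories(String path) {
    // the repeated getParent/GET calls and DELETEs would go here
    cleanups.incrementAndGet();
  }

  // Called at the end of a create/rename; returns immediately.
  void finishedWrite(String path) {
    cleanupPool.submit(() -> deleteUnnecessaryFakeDirectories(path));
  }

  // Drain the pool so in-flight cleanups finish before shutdown.
  void close() throws InterruptedException {
    cleanupPool.shutdown();
    cleanupPool.awaitTermination(30, TimeUnit.SECONDS);
  }
}
```

A real implementation would also have to decide what getFileStatus sees while a cleanup is still queued, which is exactly risk #1 above.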



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)




Apache Hadoop qbt Report: trunk+JDK8 on Linux/x86

2016-07-22 Thread Apache Jenkins Server
For more details, see 
https://builds.apache.org/job/hadoop-qbt-trunk-java8-linux-x86/110/

[Jul 21, 2016 11:41:02 PM] (xiao) HDFS-10225. DataNode hot swap drives should 
disallow storage type
[Jul 22, 2016 6:21:47 AM] (cdouglas) HADOOP-13393. Omit unsupported 
fs.defaultFS setting in ADLS




-1 overall


The following subsystems voted -1:
asflicense unit


The following subsystems voted -1 but
were configured to be filtered/ignored:
cc checkstyle javac javadoc pylint shellcheck shelldocs whitespace


The following subsystems are considered long running:
(runtime bigger than 1h  0m  0s)
unit


Specific tests:

Failed junit tests :

   hadoop.hdfs.server.namenode.TestFSImageWithXAttr 
   hadoop.yarn.server.nodemanager.TestDirectoryCollection 
   hadoop.yarn.server.TestContainerManagerSecurity 
   hadoop.yarn.server.TestMiniYarnClusterNodeUtilization 
   hadoop.yarn.client.api.impl.TestYarnClient 
   hadoop.mapred.TestMRCJCFileOutputCommitter 
  

   cc:

   
https://builds.apache.org/job/hadoop-qbt-trunk-java8-linux-x86/110/artifact/out/diff-compile-cc-root.txt
  [4.0K]

   javac:

   
https://builds.apache.org/job/hadoop-qbt-trunk-java8-linux-x86/110/artifact/out/diff-compile-javac-root.txt
  [172K]

   checkstyle:

   
https://builds.apache.org/job/hadoop-qbt-trunk-java8-linux-x86/110/artifact/out/diff-checkstyle-root.txt
  [16M]

   pylint:

   
https://builds.apache.org/job/hadoop-qbt-trunk-java8-linux-x86/110/artifact/out/diff-patch-pylint.txt
  [16K]

   shellcheck:

   
https://builds.apache.org/job/hadoop-qbt-trunk-java8-linux-x86/110/artifact/out/diff-patch-shellcheck.txt
  [20K]

   shelldocs:

   
https://builds.apache.org/job/hadoop-qbt-trunk-java8-linux-x86/110/artifact/out/diff-patch-shelldocs.txt
  [16K]

   whitespace:

   
https://builds.apache.org/job/hadoop-qbt-trunk-java8-linux-x86/110/artifact/out/whitespace-eol.txt
  [12M]
   
https://builds.apache.org/job/hadoop-qbt-trunk-java8-linux-x86/110/artifact/out/whitespace-tabs.txt
  [1.3M]

   javadoc:

   
https://builds.apache.org/job/hadoop-qbt-trunk-java8-linux-x86/110/artifact/out/diff-javadoc-javadoc-root.txt
  [2.3M]

   unit:

   
https://builds.apache.org/job/hadoop-qbt-trunk-java8-linux-x86/110/artifact/out/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt
  [144K]
   
https://builds.apache.org/job/hadoop-qbt-trunk-java8-linux-x86/110/artifact/out/patch-unit-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-nodemanager.txt
  [36K]
   
https://builds.apache.org/job/hadoop-qbt-trunk-java8-linux-x86/110/artifact/out/patch-unit-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-tests.txt
  [268K]
   
https://builds.apache.org/job/hadoop-qbt-trunk-java8-linux-x86/110/artifact/out/patch-unit-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-client.txt
  [16K]
   
https://builds.apache.org/job/hadoop-qbt-trunk-java8-linux-x86/110/artifact/out/patch-unit-hadoop-mapreduce-project_hadoop-mapreduce-client_hadoop-mapreduce-client-jobclient.txt
  [92K]
   
https://builds.apache.org/job/hadoop-qbt-trunk-java8-linux-x86/110/artifact/out/patch-unit-hadoop-mapreduce-project_hadoop-mapreduce-client_hadoop-mapreduce-client-nativetask.txt
  [120K]

   asflicense:

   
https://builds.apache.org/job/hadoop-qbt-trunk-java8-linux-x86/110/artifact/out/patch-asflicense-problems.txt
  [4.0K]

Powered by Apache Yetus 0.4.0-SNAPSHOT   http://yetus.apache.org




Re: [DISCUSS] The order of classpath isolation work and updating/shading dependencies on trunk

2016-07-22 Thread Allen Wittenauer

Does any of this work actually help processes that sit outside of YARN?

> On Jul 21, 2016, at 12:29 PM, Sean Busbey  wrote:
> 
> thanks for bringing this up! big +1 on upgrading dependencies for 3.0.
> 
> I have an updated patch for HADOOP-11804 ready to post this week. I've
> been updating HBase's master branch to try to make use of it, but
> could use some other reviews.
> 
> On Thu, Jul 21, 2016 at 4:30 AM, Tsuyoshi Ozawa  wrote:
>> Hi developers,
>> 
>> I'd like to discuss how to make progress on dependency
>> management in the Apache Hadoop trunk code, since there has been lots of
>> work on updating dependencies in parallel. Summarizing the recent work and
>> activities as follows:
>> 
>> 0) Currently, we have merged the minimum dependency updates needed to make
>> Hadoop JDK-8 compatible (compilable and runnable on JDK-8).
>> 1) After that, some people suggested that we should update the other
>> dependencies on trunk (e.g. protobuf, netty, jackson, etc.).
>> 2) In parallel, Sangjin and Sean are working on classpath isolation:
>> HADOOP-13070, HADOOP-11804 and HADOOP-11656.
>> 
>> The main problems we are trying to solve in the activities above are as follows:
>> 
>> * 1) tries to solve the dependency hell between user-level jars and
>> system (Hadoop)-level jars.
>> * 2) tries to address updating old libraries.
>> 
>> IIUC, 1) and 2) look unrelated, but they are in fact related. 2) tries
>> to separate the class loaders for client-side dependencies and
>> server-side dependencies in Hadoop, so we can change the policy for
>> updating libraries after doing 2). We can also decide which libraries
>> can be shaded after 2).
>> 
>> Hence, IMHO, the straightforward way to go is to do 2) first.
>> After that, we can update both client-side and server-side
>> dependencies based on a new policy (maybe we should discuss what kinds of
>> incompatibility are acceptable, and which are not).
>> 
>> Thoughts?
>> 
>> Thanks,
>> - Tsuyoshi
>> 
>> 
> 
> 
> 
> -- 
> busbey
> 
> 





Re: [DISCUSS] 2.6.x line releases

2016-07-22 Thread Allen Wittenauer

> On Jul 21, 2016, at 5:06 PM, Zhe Zhang  wrote:
> 
> We are using 2.6 releases and would like to see 2.6.5.
> 
> One related issue, pre-commit is not working for 2.6 (see comment from
> Allen
> https://issues.apache.org/jira/browse/HDFS-10653?focusedCommentId=15388621&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-15388621).
> Should we fix for the purpose of the release?


Someone probably just needs to invest some time in modifying the 
Dockerfile in 2.6 to actually meet the build requirements.  Right now, it's way 
way wrong.



Re: [DISCUSS] 2.6.x line releases

2016-07-22 Thread Zhe Zhang
Thanks Allen for the note. I thought the 2.6 Dockerfile issue was addressed
in https://issues.apache.org/jira/browse/HADOOP-12800?

On Fri, Jul 22, 2016 at 9:04 AM Allen Wittenauer 
wrote:

> [quoted text trimmed]

-- 
Zhe Zhang
Apache Hadoop Committer
http://zhe-thoughts.github.io/about/ | @oldcap


Re: [DISCUSS] The order of classpath isolation work and updating/shading dependencies on trunk

2016-07-22 Thread Sean Busbey
My work on HADOOP-11804 *only* helps processes that sit outside of YARN. :)

On Fri, Jul 22, 2016 at 10:48 AM, Allen Wittenauer
 wrote:
>
> Does any of this work actually help processes that sit outside of YARN?
>
> [earlier quoted messages trimmed]



-- 
busbey




Re: [DISCUSS] The order of classpath isolation work and updating/shading dependencies on trunk

2016-07-22 Thread Sangjin Lee
The work on HADOOP-13070 and the ApplicationClassLoader is generic and goes
beyond YARN; it can be used in any JVM process that uses Hadoop. The current
use cases are MR containers, Hadoop's RunJar (as in "hadoop jar"), and the
YARN node manager auxiliary services. I'm not sure if that's what you were
asking, but I hope it helps.
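The idea can be illustrated with a child-first (parent-last) classloader sketch. This shows the general technique of consulting application jars before the parent (Hadoop) classpath, with an allowlist of system classes that always delegate; it is an illustration, not the actual org.apache.hadoop.util.ApplicationClassLoader source:

```java
import java.net.URL;
import java.net.URLClassLoader;

/**
 * Child-first classloader sketch: application jars are searched before
 * the parent classpath, except for a configurable list of "system"
 * package prefixes that must always come from the parent.
 */
class ChildFirstClassLoader extends URLClassLoader {
  private final String[] systemPrefixes; // e.g. "java.", "org.apache.hadoop."

  ChildFirstClassLoader(URL[] appJars, ClassLoader parent,
                        String[] systemPrefixes) {
    super(appJars, parent);
    this.systemPrefixes = systemPrefixes;
  }

  private boolean isSystemClass(String name) {
    for (String p : systemPrefixes) {
      if (name.startsWith(p)) return true;
    }
    return false;
  }

  @Override
  protected Class<?> loadClass(String name, boolean resolve)
      throws ClassNotFoundException {
    synchronized (getClassLoadingLock(name)) {
      Class<?> c = findLoadedClass(name);
      if (c == null && !isSystemClass(name)) {
        try {
          c = findClass(name); // try the application jars first
        } catch (ClassNotFoundException ignored) {
          // fall through to the parent
        }
      }
      if (c == null) {
        c = getParent().loadClass(name); // system classes and fallbacks
      }
      if (resolve) resolveClass(c);
      return c;
    }
  }
}
```

Because the lookup order is inverted only for non-system classes, a user jar can ship its own Guava or Jackson without colliding with the versions on the server classpath.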

Regards,
Sangjin

On Fri, Jul 22, 2016 at 9:16 AM, Sean Busbey  wrote:

> My work on HADOOP-11804 *only* helps processes that sit outside of YARN. :)
>
> On Fri, Jul 22, 2016 at 10:48 AM, Allen Wittenauer
>  wrote:
> >
> > Does any of this work actually help processes that sit outside of YARN?
> >
> > [earlier quoted messages trimmed]


Re: Setting JIRA fix versions for 3.0.0 releases

2016-07-22 Thread Sangjin Lee
On Thu, Jul 21, 2016 at 3:58 PM, Andrew Wang 
wrote:

> Thanks for the input Vinod, inline:
>
>
> > Similarly, the list of features we are enabling in this alpha would be
> > good - maybe update the Roadmap wiki. Things like classpath isolation,
> > which were part of the original 3.x roadmap, are still not done.
> >
> > I already updated the website release notes at HADOOP-13383. I can update
> the Roadmap wiki and break out what's new in alpha1 too, thanks for the
> reminder.
>
>
> > > * Community bandwidth isn't zero-sum. This particularly applies to
> > people working on features that are only present in trunk, like EC, shell
> > script rewrite, etc.
> >
> >
> > A bunch of us are going to be busy with finishing 2.8.0. It isn’t
> > zero-sum, but it prevents those of us involved with 2.8.0 from looking
> > at it, even though we are very interested in doing so.
> >
> > There's a plan for more 3.0.0 alphas, so there's still time to help out
> before things settle down for beta. If 2.8.0 is ready to go, it should
> happen before even alpha2.
>
> >
> > Obviously, I am not making the case that this issue won’t happen ever. In
> > fact, this already happened with the parallel 2.6.x and 2.7.x releases.
> And
> > we precisely avoided major confusion there by lining up 2.7.2 behind
> 2.6.3
> > etc.
> >
>
> Could you clarify how lining up releases differently avoids the fix version
> issue?
>
> What we've been doing is something like "one fix version per active release
> line". Since we've revived 3.x as a release line, it seems like a lot of
> JIRAs need 3.x fix versions now.
>
> As an aside, I honestly don't know how to interpret the fix version
> instructions on the HowToCommit wiki. They don't seem to match up with what
> we actually do in practice.
>

I am also not quite sure I understand the rationale of what's in the
HowToCommit wiki. Assuming the semantic versioning (http://semver.org) as
our baseline thinking, having concurrent release streams alone breaks the
principle. And that is *regardless of* how we line up individual releases
in time (2.6.4 v. 2.7.3). Semantic versioning means 2.6.z < 2.7.* where *
is any number. Therefore, the moment we have any new 2.6.z release after
2.7.0, the rule is broken and remains that way. Timing of subsequent
releases is somewhat irrelevant.
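To make the ordering point concrete, here is a small illustration (the release dates are approximate and for illustration only; the comparator is hypothetical helper code, not Hadoop tooling):

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/**
 * With concurrent release lines, numeric version order and calendar
 * order disagree: a 2.6.z release can ship after 2.7.0.
 */
class ReleaseOrder {
  // Releases listed in calendar order (version, approximate date).
  static final String[][] BY_DATE = {
      {"2.7.0", "2015-04"},
      {"2.7.2", "2016-01"},
      {"2.6.4", "2016-02"}, // a 2.6.z release shipped after 2.7.0
  };

  // Sort the same versions numerically, component by component.
  static List<String> versionOrder() {
    List<String> vs = new ArrayList<>();
    for (String[] r : BY_DATE) vs.add(r[0]);
    vs.sort((a, b) -> {
      String[] x = a.split("\\."), y = b.split("\\.");
      for (int i = 0; i < Math.min(x.length, y.length); i++) {
        int c = Integer.compare(Integer.parseInt(x[i]), Integer.parseInt(y[i]));
        if (c != 0) return c;
      }
      return Integer.compare(x.length, y.length);
    });
    return vs;
  }

  public static void main(String[] args) {
    // Version order puts 2.6.4 first, even though it was the most
    // recent of the three releases in calendar time -- so "lower
    // version" no longer implies "earlier release".
    System.out.println(versionOrder()); // [2.6.4, 2.7.0, 2.7.2]
  }
}
```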

From a practical standpoint, I would love to know whether a certain patch
has been backported to a specific version. Thus, I would love to see fix
version enumerating all the releases that the JIRA went into. Basically the
more disclosure, the better. That would also make it easier for us
committers to see the state of the porting and identify issues like being
ported to 2.6.x but not to 2.7.x. What do you think? Should we revise our
policy?


>
> Best,
> Andrew
>


Re: Setting JIRA fix versions for 3.0.0 releases

2016-07-22 Thread Andrew Wang
> [quoted text trimmed]
>
I also err towards more fix versions. Based on our branching strategy of
branch-x -> branch-x.y -> branch-x.y.z, I think this means that the
changelog will identify everything since the previous
last-version-component of the branch name. So 2.6.5 diffs against 2.6.4,
2.8.0 diffs against 2.7.0, 3.0.0 against 2.0.0. This makes it more
straightforward for users to determine what changelogs are important, based
purely on the version number.
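That diff-base convention can be written down mechanically — a hypothetical helper for illustration, not actual release tooling:

```java
/**
 * Hypothetical sketch of the diff-base reading above: x.y.z (z > 0)
 * diffs against x.y.(z-1); x.y.0 diffs against x.(y-1).0; and x.0.0
 * diffs against (x-1).0.0.
 */
class ChangelogBase {
  static String diffBase(String version) {
    String[] p = version.split("\\.");
    int major = Integer.parseInt(p[0]);
    int minor = Integer.parseInt(p[1]);
    int patch = Integer.parseInt(p[2]);
    if (patch > 0) return major + "." + minor + "." + (patch - 1);
    if (minor > 0) return major + "." + (minor - 1) + ".0";
    return (major - 1) + ".0.0";
  }

  public static void main(String[] args) {
    System.out.println(diffBase("2.6.5")); // 2.6.4
    System.out.println(diffBase("2.8.0")); // 2.7.0
    System.out.println(diffBase("3.0.0")); // 2.0.0
  }
}
```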

I agree with Sangjin that the #1 question that the changelogs should
address is whether a certain patch is present in a version. For this
usecase, it's better to have duplicate info than to omit something.

To answer "what's new", I think that's answered by the manually curated
release notes, like the ones we put together at HADOOP-13383.


[jira] [Created] (HADOOP-13409) Andrew's test JIRA

2016-07-22 Thread Andrew Wang (JIRA)
Andrew Wang created HADOOP-13409:


 Summary: Andrew's test JIRA
 Key: HADOOP-13409
 URL: https://issues.apache.org/jira/browse/HADOOP-13409
 Project: Hadoop Common
  Issue Type: Bug
Reporter: Andrew Wang
Assignee: Andrew Wang


Test JIRA for JIRA interaction script



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)




Re: 2.7.3 release plan

2016-07-22 Thread Vinod Kumar Vavilapalli
The nexus issue persists to date. I have been constantly getting a “Remote end 
closed due to an SSH handshake” error. I tried to upload the artifacts in pieces, 
but that was too slow and spanned days.

I finally found a “-DretryFailedDeploymentCount=10” option for mvn deploy.

Sending out the vote now.

Thanks
+Vinod

> On Jul 21, 2016, at 11:53 AM, Vinod Kumar Vavilapalli  
> wrote:
> 
> Started on the RC two days ago, but running into a nexus issue - not able to 
> push the jars up stream for two days - the remote end is dropping the 
> connection consistently.
> 
> Trying to figure out workarounds.
> 
> Thanks
> +Vinod
> 
>> On Jul 13, 2016, at 4:09 PM, Vinod Kumar Vavilapalli  
>> wrote:
>> 
>> HADOOP-12893 is done after 4 months, thanks to great work from a bunch of 
>> folks besides Xiao Chen.
>> 
>> Creating the RC now.
>> 
>> Thanks
>> +Vinod
>> 
>>> On Jun 14, 2016, at 7:28 PM, Vinod Kumar Vavilapalli  
>>> wrote:
>>> 
>>> Release branch 2.7.3 is created. I also updated branch-2.7 to point to 
>>> 2.7.4-SNAPSHOT now.
>>> 
>>> Thanks
>>> +Vinod
>>> 
 On Jun 14, 2016, at 3:54 PM, Vinod Kumar Vavilapalli  
 wrote:
 
 HADOOP-12893 is finally close to completion after > 3 months thanks to 
 efforts from Akira AJISAKA, Xiao Chen and Andrew Wang.
 
 I’m creating the release branch and kickstarting the release activities.
 
 Thanks
 +Vinod
 
> On May 16, 2016, at 5:39 PM, Vinod Kumar Vavilapalli  
> wrote:
> 
> I am just waiting on HADOOP-12893.
> 
> HADOOP-13154 just got created in the last one day, will have to see if it 
> really should block the release.
> 
> Major tickets are usually taken on a time basis: if they get in by the 
> proposed timelines, we get them in. Otherwise, we move them over.
> 
> Thanks
> +Vinod
> 
>> On May 16, 2016, at 5:20 PM, larry mccay  wrote:
>> 
>> Curious on the status of 2.7.3
>> 
>> It seems that we still have two outstanding critical/blocker JIRAs:
>> 
>> 1. HADOOP-12893
>> Verify LICENSE.txt and NOTICE.txt
>> 
>> 2. HADOOP-13154
>> S3AFileSystem
>> printAmazonServiceException/printAmazonClientException appear copy & 
>> paste of AWS examples 
>> 
>> 
>> But 45-ish when we include Majors as well.
>> 
>> I know there are a number of critical issues with fixes that need to go 
>> out.
>> 
>> What is the plan?
>> 
>> On Tue, Apr 12, 2016 at 2:09 PM, Vinod Kumar Vavilapalli 
>> >> wrote:
>> 
>>> Others and I committed a few, I pushed out a few.
>>> 
>>> Down to just three now!
>>> 
>>> +Vinod
>>> 
 On Apr 6, 2016, at 3:00 PM, Vinod Kumar Vavilapalli 
 
>>> wrote:
 
 Down to only 10 blocker / critical tickets (
>>> https://issues.apache.org/jira/issues/?filter=12335343 <
>>> https://issues.apache.org/jira/issues/?filter=12335343>) now!
 
 Thanks
 +Vinod
 
> On Mar 30, 2016, at 4:18 PM, Vinod Kumar Vavilapalli <
>>> vino...@apache.org > wrote:
> 
> Hi all,
> 
> Got nudged about 2.7.3. Was previously waiting for 2.6.4 to go out
>>> (which did go out mid February). Got a little busy since.
> 
> Following up the 2.7.2 maintenance release, we should work towards a
>>> 2.7.3. The focus obviously is to have blocker issues [1], bug-fixes and
>>> *no* features / improvements.
> 
> I hope to cut an RC in a week - giving enough time for outstanding
>>> blocker / critical issues. Will start moving out any tickets that are 
>>> not
>>> blockers and/or won’t fit the timeline - there are 3 blockers and 15
>>> critical tickets outstanding as of now.
> 
> Thanks,
> +Vinod
> 
> [1] 2.7.3 release blockers:
>>> https://issues.apache.org/jira/issues/?filter=12335343 <
>>> https://issues.apache.org/jira/issues/?filter=12335343>
 
>>> 
>>> 
> 
> 
> 
> 
 
>>> 
>>> 
>> 
>> 
> 





Re: Setting JIRA fix versions for 3.0.0 releases

2016-07-22 Thread Allen Wittenauer

From the perspective of an end user who is reading multiple versions' 
listings at once, listing the same JIRA being fixed in multiple releases is 
totally confusing, especially now that release notes are actually readable.  
"So which version was it ACTUALLY fixed in?" is going to be the question. It'd 
be worthwhile for folks to actually build, say, trunk and look at the release 
notes section of the site build to see how these things are presented in 
aggregate before coming to any conclusions.  Just viewing a single version's 
output will likely give a skewed perspective.  (Or, I suppose you can read 
https://gitlab.com/_a__w_/eco-release-metadata/tree/master/HADOOP too, but the 
sort order is "wrong" for web viewing.)

My read of the HowToCommit fix rules is that they were written from the 
perspective of how we typically use branches to cut releases. In other words, 
the changes and release notes for 2.6.x, where x>0, 2.7.y, where y>0, will 
likely not be fully present/complete in 2.8.0 so wouldn't actually reflect the 
entirety of, say, the 2.7.4 release if 2.7.4 and 2.8.0 are being worked in 
parallel.   This in turn means the changes and release notes become orthogonal 
once the minor release branch is cut. This is also important because there is 
no guarantee that a change made in, say, 2.7.4 is actually in 2.8.0 because the 
code may have changed to the point that the fix isn't needed or wanted.

From an automation perspective, I took the perspective that this means 
that the a.b.0 release notes are expected to be committed to all non-released 
major branches.  So trunk will have release notes for 2.7.0, 2.8.0, 2.9.0, etc 
but not from 2.7.1, 2.8.1, or 2.9.1.  This makes the fix rules actually pretty 
easy:  the lowest a.b.0 release and all non-.0 releases.  trunk, as always, is 
only listed if that is the only place where it was committed. (i.e., the lowest 
a.b.0 release happens to be the highest one available.)
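That rule can be sketched as a small filter — hypothetical code for illustration, not the actual Apache Yetus release-doc tooling:

```java
import java.util.ArrayList;
import java.util.List;

/**
 * Sketch of the listing rule described above: show every non-.0 fix
 * version, plus only the lowest a.b.0 version.
 */
class FixVersionRule {
  static List<String> displayVersions(List<String> fixVersions) {
    String lowestDotZero = null;
    List<String> out = new ArrayList<>();
    for (String v : fixVersions) {
      if (v.endsWith(".0")) { // an a.b.0 release
        if (lowestDotZero == null || numericCompare(v, lowestDotZero) < 0) {
          lowestDotZero = v;
        }
      } else {
        out.add(v); // non-.0 maintenance releases are always listed
      }
    }
    if (lowestDotZero != null) {
      out.add(0, lowestDotZero); // only the lowest a.b.0 survives
    }
    return out;
  }

  // Compare dotted version strings component by component.
  static int numericCompare(String a, String b) {
    String[] x = a.split("\\."), y = b.split("\\.");
    int n = Math.max(x.length, y.length);
    for (int i = 0; i < n; i++) {
      int xi = i < x.length ? Integer.parseInt(x[i]) : 0;
      int yi = i < y.length ? Integer.parseInt(y[i]) : 0;
      if (xi != yi) return Integer.compare(xi, yi);
    }
    return 0;
  }
}
```

For example, a JIRA fixed in 2.7.0, 2.8.0, 2.6.4, and 2.7.1 would be listed under 2.7.0, 2.6.4, and 2.7.1 only.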

I suspect people are feeling confused or think the rules need to be 
changed mainly because a) we have a lot more branches getting RE work than ever 
before in Hadoop's history and b) 2.8.0 has been hanging out in an unreleased 
branch for ~7 months.  [The PMC should probably vote to kill that branch and 
just cut a new 2.8.0 based off of the current top of branch-2. I think that'd 
go a long way to clearing the confusion as well as actually making 2.8.0 
relevant again for those that still want to work on branch-2.]

Also:

> Assuming the semantic versioning (http://semver.org) as
> our baseline thinking, 

We don't use semantic versioning and you'll find zero references to it 
in any Apache Hadoop documentation.  If we were following semver, even in the 
loosest sense, 2.7.0 should have been 3.0.0 with the JRE upgrade requirement. 
(which, ironically, is still causing issues with folks moving things between 
2.6 and 2.7+, see the other thread about the Dockerfile.) In a stricter sense, 
we should be on v11 or something, given the amount of incompatible changes 
throughout branch-2's history.


> On Jul 22, 2016, at 11:44 AM, Andrew Wang  wrote:
> 
>> 
>> 
>>> I am also not quite sure I understand the rationale of what's in the
>> HowToCommit wiki. Assuming the semantic versioning (http://semver.org) as
>> our baseline thinking, having concurrent release streams alone breaks the
>> principle. And that is *regardless of* how we line up individual releases
>> in time (2.6.4 v. 2.7.3). Semantic versioning means 2.6.z < 2.7.* where *
>> is any number. Therefore, the moment we have any new 2.6.z release after
>> 2.7.0, the rule is broken and remains that way. Timing of subsequent
>> releases is somewhat irrelevant.
>> 
>> From a practical standpoint, I would love to know whether a certain patch
>> has been backported to a specific version. Thus, I would love to see fix
>> version enumerating all the releases that the JIRA went into. Basically the
>> more disclosure, the better. That would also make it easier for us
>> committers to see the state of the porting and identify issues like being
>> ported to 2.6.x but not to 2.7.x. What do you think? Should we revise our
>> policy?
>> 
>> 
> I also err towards more fix versions. Based on our branching strategy of
> branch-x -> branch-x.y -> branch-x.y.z, I think this means that the
> changelog will identify everything since the previous
> last-version-component of the branch name. So 2.6.5 diffs against 2.6.4,
> 2.8.0 diffs against 2.7.0, 3.0.0 against 2.0.0. This makes it more
> straightforward for users to determine what changelogs are important, based
> purely on the version number.
> 
> I agree with Sangjin that the #1 question that the changelogs should
> address is whether a certain patch is present in a version. For this
> use case, it's better to have duplicate info than to omit something.
> 
> To answer "what's new", I think that's answered by the manually curated
> release notes, like 

Re: [DISCUSS] 2.6.x line releases

2016-07-22 Thread Allen Wittenauer

> On Jul 22, 2016, at 9:07 AM, Zhe Zhang  wrote:
> 
> Thanks Allen for the note. I thought the 2.6 Dockerfile issue was addressed 
> in https://issues.apache.org/jira/browse/HADOOP-12800?

2.6 builds on JDK6 and JDK7.  2.x, where x>6, builds on JDK7 and JDK8.  [1] 
This should have been a big red flag in that patch:

---
+RUN apt-get install -y oracle-java8-installer
---

(It's interesting that ~2 years later, we're still dealing with the fallout of 
the JRE compatibility break in 2.7.  I wonder how many of those PMCs who voted 
for it are still actively involved.)

[1] For completeness, 3.x only builds on JDK8 but probably not JDK9, given 
log4j 1.x, etc, etc.  
-
To unsubscribe, e-mail: common-dev-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-dev-h...@hadoop.apache.org



Re: [DISCUSS] 2.6.x line releases

2016-07-22 Thread Zhe Zhang
Thanks for pointing it out, Allen. I'll work on an addendum 2.6 patch for
HADOOP-12800.

On Fri, Jul 22, 2016 at 3:59 PM Allen Wittenauer 
wrote:

>
> > On Jul 22, 2016, at 9:07 AM, Zhe Zhang 
> wrote:
> >
> > Thanks Allen for the note. I thought the 2.6 Dockerfile issue was
> addressed in https://issues.apache.org/jira/browse/HADOOP-12800?
>
> 2.6 builds on JDK6 and JDK7.  2.x, where x>6, builds on JDK7 and JDK8.
> [1] This should have been a big red flag in that patch:
>
> ---
> +RUN apt-get install -y oracle-java8-installer
> ---
>
> (It's interesting that ~2 years later, we're still dealing with the
> fallout of the JRE compatibility break in 2.7.  I wonder how many of those
> PMCs who voted for it are still actively involved.)
>
> [1] For completeness, 3.x only builds on JDK8 but probably not JDK9, given
> log4j 1.x, etc, etc.

-- 
Zhe Zhang
Apache Hadoop Committer
http://zhe-thoughts.github.io/about/ | @oldcap


[jira] [Created] (HADOOP-13410) RunJar adds the content of the jar twice to the classpath

2016-07-22 Thread Sangjin Lee (JIRA)
Sangjin Lee created HADOOP-13410:


 Summary: RunJar adds the content of the jar twice to the classpath
 Key: HADOOP-13410
 URL: https://issues.apache.org/jira/browse/HADOOP-13410
 Project: Hadoop Common
  Issue Type: Bug
  Components: util
Reporter: Sangjin Lee


Today when you run a "hadoop jar" command, the jar is unzipped to a temporary 
location and gets added to the classloader.

However, the original jar itself is still added to the classpath.
{code}
  List<URL> classPath = new ArrayList<>();
  classPath.add(new File(workDir + "/").toURI().toURL());
  classPath.add(file.toURI().toURL());
  classPath.add(new File(workDir, "classes/").toURI().toURL());
  File[] libs = new File(workDir, "lib").listFiles();
  if (libs != null) {
for (File lib : libs) {
  classPath.add(lib.toURI().toURL());
}
  }
{code}

As a result, the contents of the jar are present on the classpath *twice* and 
are completely redundant. Although this does not necessarily cause correctness 
issues, stricter code that expects each file to be present only once may fail.

I cannot think of a good reason why the jar should be added to the classpath if 
the unjarred content was added to it. I think we should remove the jar from the 
classpath.
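A minimal sketch of the proposed change, under the assumption that the surrounding RunJar context stays the same (hypothetical standalone method, not the actual patch): build the classpath from the unpacked work directory only, dropping the jar's own URL.

```java
import java.io.File;
import java.io.UncheckedIOException;
import java.net.MalformedURLException;
import java.net.URL;
import java.util.ArrayList;
import java.util.List;

// Sketch of the proposed fix: the original jar's URL is deliberately
// omitted because its contents are already reachable via workDir.
public final class RunJarClasspathSketch {
    static List<URL> buildClassPath(File workDir) {
        try {
            List<URL> classPath = new ArrayList<>();
            classPath.add(new File(workDir + "/").toURI().toURL());
            // NOT adding file.toURI().toURL() here -- that is the change.
            classPath.add(new File(workDir, "classes/").toURI().toURL());
            File[] libs = new File(workDir, "lib").listFiles();
            if (libs != null) {
                for (File lib : libs) {
                    classPath.add(lib.toURI().toURL());
                }
            }
            return classPath;
        } catch (MalformedURLException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```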



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: common-dev-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-dev-h...@hadoop.apache.org



Re: Setting JIRA fix versions for 3.0.0 releases

2016-07-22 Thread Vinod Kumar Vavilapalli
I’ve been using jdiff simply because of a lack of alternative.

If you’ve had experience with the tool [1], if you think it serves our purpose, and 
if you can spare some time, that’ll be greatly appreciated. I can also pitch in 
with whatever help is needed.

I think we should pick one of 2.6.3 or 2.7.2 as the baseline.

Thanks
+Vinod

> On Jul 21, 2016, at 2:41 PM, Sean Busbey  wrote:
> 
> I can come up with this, at least for Source / Binary API compatibility,
> provided folks don't mind if I use the Java API Compliance Checker[1]
> instead of jdiff.
> 
> I'm already familiar with quickly using it, esp with Audience
> Annotations from my work in HBase.
> 
> Do you want this check from some particular branch-2 release? It
> matters since the releases along branch-2 have themselves had some
> noise[2].
> 
> [1]: https://github.com/lvc/japi-compliance-checker 
> 
> [2]: http://abi-laboratory.pro/java/tracker/timeline/hadoop/ 
> 
> 
> -- 
> busbey



Re: [DISCUSS] The order of classpath isolation work and updating/shading dependencies on trunk

2016-07-22 Thread Allen Wittenauer

But if I don't use ApplicationClassLoader, my java app is basically screwed 
then, right?

Also:  right now, the non-Linux and/or non-x86 platforms have to supply their 
own leveldbjni jar (or at least the C level library?) in order to make YARN 
even functional.  How is that going to work with the class path manipulation?


> On Jul 22, 2016, at 9:57 AM, Sangjin Lee  wrote:
> 
> The work on HADOOP-13070 and the ApplicationClassLoader are generic and go 
> beyond YARN. It can be used in any JVM that uses hadoop. The current use 
> cases are MR containers, hadoop's RunJar (as in "hadoop jar"), and the YARN 
> node manager auxiliary services. I'm not sure if that's what you were asking, 
> but I hope it helps.
> 
> Regards,
> Sangjin
> 
> On Fri, Jul 22, 2016 at 9:16 AM, Sean Busbey  wrote:
> My work on HADOOP-11804 *only* helps processes that sit outside of YARN. :)
> 
> On Fri, Jul 22, 2016 at 10:48 AM, Allen Wittenauer
>  wrote:
> >
> > Does any of this work actually help processes that sit outside of YARN?
> >
> >> On Jul 21, 2016, at 12:29 PM, Sean Busbey  wrote:
> >>
> >> thanks for bringing this up! big +1 on upgrading dependencies for 3.0.
> >>
> >> I have an updated patch for HADOOP-11804 ready to post this week. I've
> >> been updating HBase's master branch to try to make use of it, but
> >> could use some other reviews.
> >>
> >> On Thu, Jul 21, 2016 at 4:30 AM, Tsuyoshi Ozawa  wrote:
> >>> Hi developers,
> >>>
> >>> I'd like to discuss how to make progress on dependency
> >>> management in the Apache Hadoop trunk code, since there has been lots of
> >>> work on updating dependencies in parallel. Summarizing recent work and
> >>> activities as follows:
> >>>
> >>> 0) Currently, we have merged the minimum dependency updates needed to
> >>> make Hadoop JDK-8 compatible (compilable and runnable on JDK-8).
> >>> 1) After that, some people suggested that we should update the other
> >>> dependencies on trunk (e.g. protobuf, netty, jackson, etc.).
> >>> 2) In parallel, Sangjin and Sean are working on classpath isolation:
> >>> HADOOP-13070, HADOOP-11804 and HADOOP-11656.
> >>>
> >>> The main problems we try to solve in the activities above are as follows:
> >>>
> >>> * 1) tries to solve dependency hell between user-level jar and
> >>> system(Hadoop)-level jar.
> >>> * 2) tries to solve updating old libraries.
> >>>
> >>> IIUC, 1) and 2) look unrelated, but they are in fact related. 2) tries
> >>> to separate the class loaders for client-side dependencies and
> >>> server-side dependencies in Hadoop, so we can change the policy for
> >>> updating libraries after doing 2). We can also decide which libraries
> >>> can be shaded after 2).
> >>>
> >>> Hence, IMHO, the straightforward way to go is to do 2) first.
> >>> After that, we can update both client-side and server-side
> >>> dependencies based on the new policy (maybe we should discuss what kinds
> >>> of incompatibility are acceptable, and which are not).
> >>>
> >>> Thoughts?
> >>>
> >>> Thanks,
> >>> - Tsuyoshi
> >>>
> >>> -
> >>> To unsubscribe, e-mail: hdfs-dev-unsubscr...@hadoop.apache.org
> >>> For additional commands, e-mail: hdfs-dev-h...@hadoop.apache.org
> >>>
> >>
> >>
> >>
> >> --
> >> busbey
> >>
> >> -
> >> To unsubscribe, e-mail: common-dev-unsubscr...@hadoop.apache.org
> >> For additional commands, e-mail: common-dev-h...@hadoop.apache.org
> >>
> >
> >
> > -
> > To unsubscribe, e-mail: hdfs-dev-unsubscr...@hadoop.apache.org
> > For additional commands, e-mail: hdfs-dev-h...@hadoop.apache.org
> >
> 
> 
> 
> --
> busbey
> 
> -
> To unsubscribe, e-mail: common-dev-unsubscr...@hadoop.apache.org
> For additional commands, e-mail: common-dev-h...@hadoop.apache.org
> 
> 


-
To unsubscribe, e-mail: common-dev-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-dev-h...@hadoop.apache.org



RE: [DISCUSS] The order of classpath isolation work and updating/shading dependencies on trunk

2016-07-22 Thread Zheng, Kai
For the leveldb thing, couldn't we have an alternative option in Java for 
platforms where leveldb isn't supported yet, for whatever reason? IMO, a native 
library is best used for optimization and performance in production. For 
development and pure-Java platforms, a pure-Java approach should still be 
provided and used by default. That is to say, if no Hadoop native code is 
used, all the functionality should still work and not break.

HDFS erasure coding works this way. For that, we spent much effort developing 
an ISA-L-compatible erasure coder in pure Java that is used by default, though 
for performance the native ISA-L coder is recommended in production 
deployments.

Regards,
Kai

-Original Message-
From: Allen Wittenauer [mailto:a...@effectivemachines.com] 
Sent: Saturday, July 23, 2016 8:16 AM
To: Sangjin Lee 
Cc: Sean Busbey ; common-dev@hadoop.apache.org; 
yarn-...@hadoop.apache.org; hdfs-...@hadoop.apache.org; 
mapreduce-...@hadoop.apache.org
Subject: Re: [DISCUSS] The order of classpath isolation work and 
updating/shading dependencies on trunk



[VOTE] Release Apache Hadoop 2.7.3 RC0

2016-07-22 Thread Vinod Kumar Vavilapalli
Hi all,

I've created a release candidate RC0 for Apache Hadoop 2.7.3.

As discussed before, this is the next maintenance release to follow up 2.7.2.

The RC is available for validation at: 
http://home.apache.org/~vinodkv/hadoop-2.7.3-RC0/ 


The RC tag in git is: release-2.7.3-RC0

The maven artifacts are available via repository.apache.org at 
https://repository.apache.org/content/repositories/orgapachehadoop-1040/ 


The release-notes are inside the tar-balls at location 
hadoop-common-project/hadoop-common/src/main/docs/releasenotes.html. I hosted 
this at http://home.apache.org/~vinodkv/hadoop-2.7.3-RC0/releasenotes.html 
for your quick perusal.

As you may have noted, a very long fix-cycle for the License & Notice issues 
(HADOOP-12893) caused 2.7.3 (along with every other Hadoop release) to slip by 
quite a bit. This release's related discussion thread is linked below: [1].

Please try the release and vote; the vote will run for the usual 5 days.

Thanks,
Vinod

[1]: 2.7.3 release plan: 
https://www.mail-archive.com/hdfs-dev%40hadoop.apache.org/msg24439.html 


Re: Setting JIRA fix versions for 3.0.0 releases

2016-07-22 Thread Andrew Wang
Thanks for the input Allen, good perspective as always, inline:


> From the perspective of an end user who is reading multiple
> versions' listings at once, listing the same JIRA being fixed in multiple
> releases is totally confusing, especially now that release notes are
> actually readable.  "So which version was it ACTUALLY fixed in?" is going
> to be the question. It'd be worthwhile for folks to actually build, say,
> trunk and look at the release notes section of the site build to see how
> these things are presented in aggregate before coming to any conclusions.
> Just viewing a single version's output will likely give a skewed
> perspective.  (Or, I suppose you can read
> https://gitlab.com/_a__w_/eco-release-metadata/tree/master/HADOOP too,
> but the sort order is "wrong" for web viewing.)
>
Does this mean you find our current system of listing a JIRA as being
fixed in both a 2.6.x and 2.7.x to be confusing?

FWIW, my use case is normally not "what is the earliest release that has
this fix?" but rather "is this fix in this release?". If it's easy to query
the latter, you can also determine the former. Some kind of query tool
could help here.


> My read of the HowToCommit fix rules is that they were written
> from the perspective of how we typically use branches to cut releases. In
> other words, the changes and release notes for 2.6.x, where x>0, 2.7.y,
> where y>0, will likely not be fully present/complete in 2.8.0 so wouldn't
> actually reflect the entirety of, say, the 2.7.4 release if 2.7.4 and 2.8.0
> are being worked in parallel.   This in turn means the changes and release
> notes become orthogonal once the minor release branch is cut. This is also
> important because there is no guarantee that a change made in, say, 2.7.4
> is actually in 2.8.0 because the code may have changed to the point that
> the fix isn't needed or wanted.
>
> From an automation perspective, I took the perspective that this
> means that the a.b.0 release notes are expected to be committed to all
> non-released major branches.  So trunk will have release notes for 2.7.0,
> 2.8.0, 2.9.0, etc but not from 2.7.1, 2.8.1, or 2.9.1.


For the release notes, am I correct in interpreting this as:

* diff a.0.0 from the previous x.y.0 release
* diff a.b.0 from the previous a.0.0 or a.b.0 release
* diff a.b.c from the previous a.b.0 or a.b.c release
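The three rules above can be spelled out as a sketch, assuming plain a.b.c version strings with no -alpha/-beta labels (`baseline` is a hypothetical helper, not an actual Hadoop tool):

```java
import java.util.List;

// Sketch of the three changelog-baseline rules: given a release and the
// list of all earlier releases, pick what its changelog diffs against.
public final class ChangelogBaselineSketch {
    static int[] parse(String v) {
        String[] p = v.split("\\.");
        return new int[] {
            Integer.parseInt(p[0]), Integer.parseInt(p[1]), Integer.parseInt(p[2])
        };
    }

    static int compare(String a, String b) {
        int[] x = parse(a), y = parse(b);
        for (int i = 0; i < 3; i++) {
            if (x[i] != y[i]) {
                return x[i] - y[i];
            }
        }
        return 0;
    }

    static String baseline(String release, List<String> prior) {
        int[] r = parse(release);
        String best = null;
        for (String p : prior) {
            int[] v = parse(p);
            boolean candidate;
            if (r[1] == 0 && r[2] == 0) {
                // a.0.0: diff against the previous x.y.0
                candidate = v[0] < r[0] && v[2] == 0;
            } else if (r[2] == 0) {
                // a.b.0: diff against the previous a.y.0
                candidate = v[0] == r[0] && v[1] < r[1] && v[2] == 0;
            } else {
                // a.b.c: diff against the previous a.b.y
                candidate = v[0] == r[0] && v[1] == r[1] && v[2] < r[2];
            }
            if (candidate && (best == null || compare(p, best) > 0)) {
                best = p;
            }
        }
        return best;
    }
}
```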

Ray pointed me at the changelogs of a few other enterprise software
products, and this strategy seems pretty common. I like it.

I realize now that this means a lot more JIRAs will need the 2.8.0 fix
version, since they only have 2.6.x and 2.7.x.


>   This makes the fix rules actually pretty easy:  the lowest a.b.0 release
> and all non-.0 releases.


I think this needs to be amended to handle the case of multiple major
release branches, since we could have something committed for both 2.9.0
and 3.1.0. So "lowest a.b.0 release within each major version"?
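Under the same assumption of plain a.b.c version strings, the amended rule could be sketched as a filter over a JIRA's fix versions (hypothetical helper, not an actual project tool):

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Sketch of the amended fix-version rule: keep every non-.0 release,
// plus the lowest a.b.0 release within each major version.
public final class FixVersionRuleSketch {
    static List<String> keep(List<String> fixVersions) {
        Map<Integer, String> lowestDotZeroPerMajor = new HashMap<>();
        List<String> kept = new ArrayList<>();
        for (String v : fixVersions) {
            String[] p = v.split("\\.");
            int major = Integer.parseInt(p[0]);
            if (p[2].equals("0")) {
                String cur = lowestDotZeroPerMajor.get(major);
                if (cur == null
                        || Integer.parseInt(p[1]) < Integer.parseInt(cur.split("\\.")[1])) {
                    lowestDotZeroPerMajor.put(major, v);
                }
            } else {
                kept.add(v);  // non-.0 releases are always listed
            }
        }
        kept.addAll(lowestDotZeroPerMajor.values());
        return kept;
    }
}
```

For example, a JIRA marked 2.6.5, 2.7.4, 2.8.0, 2.9.0, 3.1.0 would keep everything except 2.9.0, since 2.8.0 is already the lowest .0 within the 2.x line.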

> trunk, as always, is only listed if that is the only place where it was
> committed. (i.e., the lowest a.b.0 release happens to be the highest one
> available.)
>

This was true previously (no releases from trunk, trunk is versioned
a.0.0), but now that trunk is essentially a minor release branch, its fix
version needs to be treated as such.


> I suspect people are feeling confused or think the rules need to
> be changed...
>

The explanation here is far clearer than what's on HowToCommit,
particularly since the HowToCommit example branches are irrelevant in
today's era. If we like this versioning strategy, I'm happy to crib some of
this text and update the HowToCommit wiki.

In good news, my little python script to do bulk fixVersion updates seems
to work, so we can proceed posthaste once we have a plan.

Thanks,
Andrew


Re: Setting JIRA fix versions for 3.0.0 releases

2016-07-22 Thread Allen Wittenauer

> On Jul 22, 2016, at 7:16 PM, Andrew Wang  wrote:
> 
> Does this mean you find our current system of listing a JIRA as being fixed 
> in both a 2.6.x and 2.7.x to be confusing?

Nope.  I'm only confused when there isn't a .0 release in the fix line. 
 When I see 2.6.x and 2.7.x I know that it was back ported to those branches.  
If I don't see a .0, I figure it's either a mistake or something that was 
already fixed by another change in that major/minor branch.  It's almost always 
the former, however.

> FWIW, my use case is normally not "what is the earliest release that has this 
> fix?" but rather "is this fix in this release?". If it's easy to query the 
> latter, you can also determine the former. Some kind of query tool could help 
> here.

It literally becomes a grep if people commit the release data into the 
source tree, the release data is correct, etc:

$ mvn install site -Preleasedocs -Pdocs -DskipTests
$ grep issueid hadoop-common-project/hadoop-common/src/site/markdown/release/*/CHANGES*

We should probably update the release process to make sure that *in 
progress* release data is also committed when a .0 is cut.  That's likely 
missing. Another choice would be to modify the pom so that releasedocmaker runs 
against a range rather than a single version, but that gets a bit tricky with 
release dates, how big the range should be, etc.  Not impossible, just tricky.  
Probably needs to be a script that gets run as part of create-release, maybe?

(In reality, I do this grep against my git repo that generates the 
change log data automatically.  This way it is always up-to-date and not 
dependent upon release data being committed.  But that same grep could be done 
with a JQL query just as easily.)

> For the release notes, am I correct in interpreting this as:
> 
> * diff a.0.0 from the previous x.y.0 release
> * diff a.b.0  from the previous a.0.0 or a.b.0 release
> * diff a.b.c from the previous a.b.0 or a.b.c release

Pretty much yes.

> Ray pointed me at the changelogs of a few other enterprise software products, 
> and this strategy seems pretty common. I like it.

It's extremely common, to the point that listing every fix under every 
release it touched is, at least to me, weird and extremely unconventional.

> I realize now that this means a lot more JIRAs will need the 2.8.0 fix 
> version, since they only have 2.6.x and 2.7.x.

Yup.

>   This makes the fix rules actually pretty easy:  the lowest a.b.0 release 
> and all non-.0 releases.
> 
> I think this needs to be amended to handle the case of multiple major release 
> branches, since we could have something committed for both 2.9.0 and 3.1.0. 
> So "lowest a.b.0 release within each major version"?

Yeah, switching to effectively trunk-based development makes the rules 
harder.  It's one of the reasons why the two big enterprisey companies I worked 
at prior to working on Hadoop didn't really do trunk-based for the vast 
majority of projects.  They always cut a branch (or equivalent for that SCM) to 
delineate a break.   Given the amount of ex-Sun folks involved in the early 
days of Hadoop, our pre-existing development processes very much reflect that 
culture.

> This was true previously (no releases from trunk, trunk is versioned a.0.0), 
> but now that trunk is essentially a minor release branch, its fix version 
> needs to be treated as such.

Yeah, I misspoke a bit when dealing with a head-of-tree model.  
3.0.0-alpha1 will generate different notes than 3.0.0-alpha2, obviously. Every 
3.0.0-(label) release is effectively a major version in that case.



-
To unsubscribe, e-mail: common-dev-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-dev-h...@hadoop.apache.org



Re: [DISCUSS] The order of classpath isolation work and updating/shading dependencies on trunk

2016-07-22 Thread Allen Wittenauer

> On Jul 22, 2016, at 5:47 PM, Zheng, Kai  wrote:
> 
> For the leveldb thing, wouldn't we have an alternative option in Java for the 
> platforms where leveldb isn't supported yet due to whatever reasons. IMO, 
> native library would be best to be used for optimization and production for 
> performance. For development and pure Java platform, by default pure Java 
> approach should still be provided and used. That is to say, if no Hadoop 
> native is used, all the functionalities should still work and not break. 

Yes and no.  I can certainly understand some high-end features being 
tied to native libraries, simply because system programming with Java is like 
being a touch typist with your nose.  

That said, absolutely key functionality should definitely work. Take a 
look at the last Linux/ppc64le report that was emailed to these very lists a 
few days ago [1]:


https://builds.apache.org/job/hadoop-qbt-trunk-java8-linux-ppc/30/artifact/out/console-report.html

Almost all of those YARN failures are due to MiniYARN trying to 
initialize leveldb as part of service startup and failing because the embedded 
shared library is built for the wrong hardware architecture. Rather than 
catching the exception and doing something else, the code just blows up in a 
very dramatic fashion. That translates into YARN being completely busted and 
unusable without some very weird workarounds.
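The kind of defensive handling being described -- try the native-backed store and degrade instead of dying -- might look roughly like this (the Store interface and the simulated failure are hypothetical; this is not the actual YARN code):

```java
// Sketch of the fallback pattern: attempt a native-backed implementation,
// and catch UnsatisfiedLinkError (which IS catchable, despite being an
// Error) to degrade to a pure-Java one instead of taking the service down.
public final class NativeFallbackSketch {
    interface Store { String name(); }

    static Store openStore() {
        try {
            // Stand-in for instantiating e.g. a leveldbjni-backed store;
            // here we simulate the wrong-architecture load failure directly.
            throw new UnsatisfiedLinkError("wrong hardware architecture");
        } catch (UnsatisfiedLinkError e) {
            // Degrade gracefully rather than letting the error propagate.
            return new Store() {
                public String name() { return "pure-java-fallback"; }
            };
        }
    }
}
```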

To get us back on topic:  the classpath isolation work absolutely 
cannot make this situation worse.  We either need to make sure end users can 
replace/modify Hadoop's dependencies if they require native libraries, or work 
harder on making multiplatform stuff better supported.  The nightly PowerPC 
builds should help tremendously towards this goal. [2]

[1] While I greatly appreciate the OpenPOWER Foundation getting the ASF access 
to these boxes -- Mesos and Hadoop are both actively using them -- it'd be 
great if they were more reliable so we could get a report every day of the 
week. :(

[2] At some point, I'll set up a manually triggered precommit job to test 
patches.  But until both boxes are online and available on a consistent basis, 
it just isn't worth the effort.
-
To unsubscribe, e-mail: common-dev-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-dev-h...@hadoop.apache.org