Re: [VOTE] Merge feature branch YARN-5355 (Timeline Service v2) to trunk

2017-08-29 Thread Chris Trezzo
+1

I helped deploy and run the ATSv2 auxiliary service on a cluster with ~400
node managers. I also verified that ATSv2 has no significant impact when
disabled.

Nice work everyone!
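For anyone who wants to reproduce the "no impact when disabled" check: the
feature is gated by two YARN configuration keys. Below is a minimal sketch of
reading them; it assumes the stock YarnConfiguration constant for the enabled
flag and spells the version key out as a string literal, so treat the names
and defaults as illustrative rather than authoritative:

{noformat}
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.yarn.conf.YarnConfiguration;

public class AtsV2Check {
  public static void main(String[] args) {
    Configuration conf = new YarnConfiguration();
    // yarn.timeline-service.enabled gates the timeline service as a whole
    // (off by default); yarn.timeline-service.version selects v1 vs v2.
    boolean enabled = conf.getBoolean(
        YarnConfiguration.TIMELINE_SERVICE_ENABLED, false);
    float version = conf.getFloat("yarn.timeline-service.version", 1.0f);
    System.out.println("ATSv2 active: " + (enabled && version >= 2.0f));
  }
}
{noformat}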

On Tue, Aug 29, 2017 at 6:17 AM, Aaron Gresch  wrote:

> +1 non-binding
>
> I did some testing with security off running both ATS v1 and v2.
>
> On Tue, Aug 22, 2017 at 1:32 AM, Vrushali Channapattan <
> vrushalic2...@gmail.com> wrote:
>
> > Hi folks,
> >
> > Per earlier discussion [1], I'd like to start a formal vote to merge
> > feature branch YARN-5355 [2] (Timeline Service v.2) to trunk. The vote
> > will run for 7 days and will end on August 29 at 11:00 PM PDT.
> >
> > We have previously completed one merge to trunk [3], and Timeline
> > Service v.2 has been part of Hadoop release 3.0.0-alpha1.
> >
> > Since then, we have been working in a feature branch [2] to extend the
> > capabilities of Timeline Service v.2, and we are reasonably confident
> > that the state of the feature meets the criteria to be merged to trunk.
> > We'd love folks to get their hands on it in a test capacity and provide
> > valuable feedback so that we can make it production-ready.
> >
> > In a nutshell, Timeline Service v.2 delivers significant scalability
> > and usability improvements based on a new architecture. What we would
> > like to merge to trunk is termed "alpha 2" (milestone 2). The feature
> > has a complete end-to-end read/write flow with security and read-level
> > authorization via whitelists. You should be able to start setting it up
> > and testing it.
> >
> > At a high level, the following are the key features that have been
> > implemented since alpha 1:
> > - Security via Kerberos authentication and delegation tokens
> > - Read-side simple authorization via whitelist
> > - Client-configurable entity sort ordering
> > - Richer REST APIs for apps, app attempts, and containers; fetching
> >   metrics by time range; pagination; sub-app entities
> > - Support for storing sub-application entities (entities that exist
> >   outside the scope of an application)
> > - Configurable TTLs (time-to-live) for tables, configurable table
> >   prefixes, and a configurable HBase cluster
> > - Flow-level aggregations done as dynamic (table-level) coprocessors
> > - Uses the latest stable HBase release, 1.2.6
> >
> > A total of 82 subtasks were completed as part of this effort.
> >
> > We paid close attention to ensure that Timeline Service v.2 does not
> > impact existing functionality when disabled (which is the default).
> >
> > Special thanks to the team of folks who worked hard and contributed to
> > this effort with patches, reviews, and guidance: Rohith Sharma K S,
> > Varun Saxena, Haibo Chen, Sangjin Lee, Li Lu, Vinod Kumar Vavilapalli,
> > Joep Rottinghuis, Jason Lowe, Jian He, Robert Kanter, Michael Stack.
> >
> > Regards,
> > Vrushali
> >
> > [1] http://www.mail-archive.com/yarn-dev@hadoop.apache.org/msg27383.html
> > [2] https://issues.apache.org/jira/browse/YARN-5355
> > [3] https://issues.apache.org/jira/browse/YARN-2928
> > [4] https://github.com/apache/hadoop/commits/YARN-5355
> >
>
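The announcement above mentions a complete end-to-end write flow. As a rough
illustration of what publishing an entity to ATSv2 looks like with the
TimelineV2Client API, here is a minimal sketch; the entity type/id are
invented, and it omits the collector-address handshake an AM normally gets
from the RM, so treat it as an outline rather than working AM code:

{noformat}
import org.apache.hadoop.yarn.api.records.ApplicationId;
import org.apache.hadoop.yarn.api.records.timelineservice.TimelineEntity;
import org.apache.hadoop.yarn.client.api.TimelineV2Client;
import org.apache.hadoop.yarn.conf.YarnConfiguration;

public class TimelinePutSketch {
  public static void main(String[] args) throws Exception {
    // Hypothetical application id; a real AM would use its own.
    ApplicationId appId = ApplicationId.newInstance(
        System.currentTimeMillis(), 1);
    TimelineV2Client client = TimelineV2Client.createTimelineClient(appId);
    client.init(new YarnConfiguration());
    client.start();
    try {
      TimelineEntity entity = new TimelineEntity();
      entity.setType("MY_APP_EVENT"); // made-up type
      entity.setId("event-1");        // made-up id
      entity.setCreatedTime(System.currentTimeMillis());
      // Synchronous put; putEntitiesAsync is the fire-and-forget variant.
      client.putEntities(entity);
    } finally {
      client.stop();
    }
  }
}
{noformat}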


[jira] [Created] (MAPREDUCE-6862) Fragments are not handled correctly by resource limit checking

2017-03-09 Thread Chris Trezzo (JIRA)
Chris Trezzo created MAPREDUCE-6862:
---

 Summary: Fragments are not handled correctly by resource limit 
checking
 Key: MAPREDUCE-6862
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-6862
 Project: Hadoop Map/Reduce
  Issue Type: Bug
Affects Versions: 3.0.0-alpha1, 2.9.0
Reporter: Chris Trezzo
Assignee: Chris Trezzo
Priority: Minor


If a user specifies a fragment for a libjars, files, or archives path via the
generic options parser and resource limit checking is enabled, the client
crashes with a FileNotFoundException:
{noformat}
java.io.FileNotFoundException: File file:/home/mapred/test.txt#testFrag.txt 
does not exist
at 
org.apache.hadoop.fs.RawLocalFileSystem.deprecatedGetFileStatus(RawLocalFileSystem.java:638)
at 
org.apache.hadoop.fs.RawLocalFileSystem.getFileLinkStatusInternal(RawLocalFileSystem.java:864)
at 
org.apache.hadoop.fs.RawLocalFileSystem.getFileStatus(RawLocalFileSystem.java:628)
at 
org.apache.hadoop.fs.FilterFileSystem.getFileStatus(FilterFileSystem.java:442)
at 
org.apache.hadoop.mapreduce.JobResourceUploader.getFileStatus(JobResourceUploader.java:413)
at 
org.apache.hadoop.mapreduce.JobResourceUploader.explorePath(JobResourceUploader.java:395)
at 
org.apache.hadoop.mapreduce.JobResourceUploader.checkLocalizationLimits(JobResourceUploader.java:304)
at 
org.apache.hadoop.mapreduce.JobResourceUploader.uploadResources(JobResourceUploader.java:103)
at 
org.apache.hadoop.mapreduce.JobSubmitter.copyAndConfigureFiles(JobSubmitter.java:102)
at 
org.apache.hadoop.mapreduce.JobSubmitter.submitJobInternal(JobSubmitter.java:197)
at org.apache.hadoop.mapreduce.Job$11.run(Job.java:1344)
at org.apache.hadoop.mapreduce.Job$11.run(Job.java:1341)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:422)
at 
org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1892)
at org.apache.hadoop.mapreduce.Job.submit(Job.java:1341)
at org.apache.hadoop.mapreduce.Job.waitForCompletion(Job.java:1362)
at 
org.apache.hadoop.examples.QuasiMonteCarlo.estimatePi(QuasiMonteCarlo.java:306)
at 
org.apache.hadoop.examples.QuasiMonteCarlo.run(QuasiMonteCarlo.java:359)
at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:76)
at 
org.apache.hadoop.examples.QuasiMonteCarlo.main(QuasiMonteCarlo.java:367)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at 
org.apache.hadoop.util.ProgramDriver$ProgramDescription.invoke(ProgramDriver.java:71)
at org.apache.hadoop.util.ProgramDriver.run(ProgramDriver.java:144)
at org.apache.hadoop.examples.ExampleDriver.main(ExampleDriver.java:74)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at org.apache.hadoop.util.RunJar.run(RunJar.java:239)
at org.apache.hadoop.util.RunJar.main(RunJar.java:153)
{noformat}
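The trace shows the limit-checking code stat'ing the raw path with the
fragment still attached. A likely shape for a fix is to strip the URI
fragment before calling getFileStatus; here is a minimal sketch of that idea
(the helper is hypothetical, not the actual patch):

{noformat}
import java.net.URI;
import java.net.URISyntaxException;
import org.apache.hadoop.fs.Path;

public final class FragmentUtil {
  /**
   * Drop a URI fragment (e.g. "#testFrag.txt") so the underlying file can
   * be stat'ed; the fragment only names the link created at localization.
   */
  public static Path stripFragment(Path path) {
    URI uri = path.toUri();
    if (uri.getFragment() == null) {
      return path;
    }
    try {
      return new Path(
          new URI(uri.getScheme(), uri.getSchemeSpecificPart(), null));
    } catch (URISyntaxException e) {
      throw new IllegalArgumentException("Bad path: " + path, e);
    }
  }
}
{noformat}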






[jira] [Created] (MAPREDUCE-6846) Fragments specified for libjar paths are not handled correctly

2017-02-08 Thread Chris Trezzo (JIRA)
Chris Trezzo created MAPREDUCE-6846:
---

 Summary: Fragments specified for libjar paths are not handled 
correctly
 Key: MAPREDUCE-6846
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-6846
 Project: Hadoop Map/Reduce
  Issue Type: Bug
Affects Versions: 3.0.0-alpha2, 2.7.3
Reporter: Chris Trezzo
Assignee: Chris Trezzo
Priority: Minor


If a user specifies a fragment for a libjars path via the generic options
parser, the client crashes with a FileNotFoundException:
{noformat}
java.io.FileNotFoundException: File file:/home/mapred/test.txt#testFrag.txt 
does not exist
at 
org.apache.hadoop.fs.RawLocalFileSystem.deprecatedGetFileStatus(RawLocalFileSystem.java:638)
at 
org.apache.hadoop.fs.RawLocalFileSystem.getFileLinkStatusInternal(RawLocalFileSystem.java:864)
at 
org.apache.hadoop.fs.RawLocalFileSystem.getFileStatus(RawLocalFileSystem.java:628)
at 
org.apache.hadoop.fs.FilterFileSystem.getFileStatus(FilterFileSystem.java:442)
at org.apache.hadoop.fs.FileUtil.copy(FileUtil.java:363)
at org.apache.hadoop.fs.FileUtil.copy(FileUtil.java:314)
at 
org.apache.hadoop.mapreduce.JobResourceUploader.copyRemoteFiles(JobResourceUploader.java:387)
at 
org.apache.hadoop.mapreduce.JobResourceUploader.uploadLibJars(JobResourceUploader.java:154)
at 
org.apache.hadoop.mapreduce.JobResourceUploader.uploadResources(JobResourceUploader.java:105)
at 
org.apache.hadoop.mapreduce.JobSubmitter.copyAndConfigureFiles(JobSubmitter.java:102)
at 
org.apache.hadoop.mapreduce.JobSubmitter.submitJobInternal(JobSubmitter.java:197)
at org.apache.hadoop.mapreduce.Job$11.run(Job.java:1344)
at org.apache.hadoop.mapreduce.Job$11.run(Job.java:1341)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:422)
at 
org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1892)
at org.apache.hadoop.mapreduce.Job.submit(Job.java:1341)
at org.apache.hadoop.mapreduce.Job.waitForCompletion(Job.java:1362)
at 
org.apache.hadoop.examples.QuasiMonteCarlo.estimatePi(QuasiMonteCarlo.java:306)
at 
org.apache.hadoop.examples.QuasiMonteCarlo.run(QuasiMonteCarlo.java:359)
at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:76)
at 
org.apache.hadoop.examples.QuasiMonteCarlo.main(QuasiMonteCarlo.java:367)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at 
org.apache.hadoop.util.ProgramDriver$ProgramDescription.invoke(ProgramDriver.java:71)
at org.apache.hadoop.util.ProgramDriver.run(ProgramDriver.java:144)
at org.apache.hadoop.examples.ExampleDriver.main(ExampleDriver.java:74)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at org.apache.hadoop.util.RunJar.run(RunJar.java:239)
at org.apache.hadoop.util.RunJar.main(RunJar.java:153)
{noformat}

This is actually inconsistent with the behavior for files and archives. Here is
a table showing the current behavior for each type of path and resource, where
(/) means the path is handled correctly:
|| || Qualified path (e.g. file:/home/mapred/test.txt#frag.txt) || Absolute path (e.g. /home/mapred/test.txt#frag.txt) || Relative path (e.g. test.txt#frag.txt) ||
|| -libjars | FileNotFound | FileNotFound | FileNotFound |
|| -files | (/) | (/) | (/) |
|| -archives | (/) | (/) | (/) |
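For context on why fragments matter: the part after the '#' is the link name
the resource gets in the task's working directory, so -libjars should honor
fragments just as -files and -archives do. A sketch of equivalent programmatic
usage via the distributed-cache API (paths invented for the example):

{noformat}
import java.net.URI;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.mapreduce.Job;

public class FragmentUsage {
  public static void main(String[] args) throws Exception {
    Job job = Job.getInstance(new Configuration(), "fragment-demo");
    // test.txt is downloaded once, then appears in each task's working
    // directory under the symlink name alias.txt taken from the fragment.
    job.addCacheFile(new URI("hdfs:///user/mapred/test.txt#alias.txt"));
    // ... set mapper/reducer, input/output paths, then submit as usual.
  }
}
{noformat}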






Re: [VOTE] Release cadence and EOL

2017-01-18 Thread Chris Trezzo
Thanks Sangjin for pushing this forward! I have a few questions:

1. What is the definition of end-of-life for a release in the Hadoop
project? My current understanding is as follows: when a release line
reaches end-of-life, there are no more planned releases for that line.
Committers are no longer responsible for back-porting bug fixes to the line
(including fixes for security vulnerabilities), and it is essentially
unmaintained.

2. How do major releases affect the end-of-life proposal? For example, how
does a new minor release in the next major release affect the end-of-life
of minor releases in a previous major release? Is it possible to have a
maintained 2.x release if there is a 3.3 release?

Thanks!

On Tue, Jan 17, 2017 at 10:32 AM, Karthik Kambatla 
wrote:

> +1
>
> I would also like to see some process guidelines. I should have brought
> this up on the discussion thread, but I thought of them only now :(
>
>- Is an RM responsible for all maintenance releases against a minor
>release, or for finding another RM to drive a maintenance release? In
>the past, this hasn't been a major issue.
>- When do we pick/volunteer an RM for a minor release? IMO, this should
>be right after the previous release goes out. For example, Junping is
>driving 2.8.0 now. As soon as that is done, we need to find a volunteer
>to RM 2.9.0 six months after.
>- The release process has multiple steps, based on
>major/minor/maintenance. It would be nice to capture/track how long each
>step takes so the RM can be prepared, e.g. herding the cats for an RC
>takes x weeks, compatibility checks take y days of work.
>
>
> On Tue, Jan 17, 2017 at 10:05 AM, Sangjin Lee  wrote:
>
> > Thanks for correcting me! I left out a sentence by mistake. :)
> >
> > To correct the proposal we're voting for:
> >
> > A minor release on the latest major line should happen every 6 months,
> > and a maintenance release on a minor release (as there may be
> > concurrently maintained minor releases) every 2 months.
> >
> > A minor release line is end-of-lifed 2 years after it is released or
> > when there are 2 newer minor releases, whichever is sooner. The
> > community reserves the right to extend or shorten the life of a release
> > line if there is a good reason to do so.
> >
> > Sorry for the snafu.
> >
> > Regards,
> > Sangjin
> >
> > On Tue, Jan 17, 2017 at 9:58 AM, Daniel Templeton 
> > wrote:
> >
> > > Thanks for driving this, Sangjin. Quick question, though: the
> > > subject line is "Release cadence and EOL," but I don't see anything
> > > about cadence in the proposal. Did I miss something?
> > >
> > > Daniel
> > >
> > >
> > > On 1/17/17 8:35 AM, Sangjin Lee wrote:
> > >
> > >> Following up on the discussion thread on this topic (
> > >> https://s.apache.org/eFOf), I'd like to put the proposal for the
> > >> release cadence and EOL up for a vote. The proposal is as follows:
> > >>
> > >> "A minor release line is end-of-lifed 2 years after it is released
> > >> or there are 2 newer minor releases, whichever is sooner. The
> > >> community reserves the right to extend or shorten the life of a
> > >> release line if there is a good reason to do so."
> > >>
> > >> This also entails that we, the Hadoop community, commit to
> > >> following this practice and to solving challenges to make it
> > >> possible. Andrew Wang laid out some of those challenges and what
> > >> can be done in the discussion thread mentioned above.
> > >>
> > >> I'll set the voting period to 7 days. I understand a majority rule
> > >> would apply in this case. Your vote is greatly appreciated, and so
> > >> are suggestions!
> > >>
> > >> Thanks,
> > >> Sangjin
> > >>
> > >>
> > >
> > >
> > >
> >
>


[jira] [Created] (MAPREDUCE-6825) YARNRunner#createApplicationSubmissionContext method is longer than 150 lines

2016-12-16 Thread Chris Trezzo (JIRA)
Chris Trezzo created MAPREDUCE-6825:
---

 Summary: YARNRunner#createApplicationSubmissionContext method is 
longer than 150 lines
 Key: MAPREDUCE-6825
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-6825
 Project: Hadoop Map/Reduce
  Issue Type: Bug
Reporter: Chris Trezzo
Priority: Trivial


bq. ./hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-jobclient/src/main/java/org/apache/hadoop/mapred/YARNRunner.java:341: public ApplicationSubmissionContext createApplicationSubmissionContext(:3: Method length is 249 lines (max allowed is 150).

{{YARNRunner#createApplicationSubmissionContext}} is longer than 150 lines and 
needs to be refactored.
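The refactor itself is mechanical: pull each cohesive step into its own
private helper so every method stays under the 150-line checkstyle limit. A
toy sketch of the extract-method pattern (names invented, unrelated to the
actual YARNRunner code):

{noformat}
// Before: one long method doing several steps in sequence.
// After: each cohesive step pulled into a short, named helper.
public class LongMethodExample {
  public String buildReport(int[] values) {
    int sum = computeSum(values);
    double mean = computeMean(values, sum);
    return formatReport(sum, mean);
  }

  private int computeSum(int[] values) {
    int sum = 0;
    for (int v : values) {
      sum += v;
    }
    return sum;
  }

  private double computeMean(int[] values, int sum) {
    return values.length == 0 ? 0.0 : (double) sum / values.length;
  }

  private String formatReport(int sum, double mean) {
    return "sum=" + sum + ", mean=" + mean;
  }
}
{noformat}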






[jira] [Created] (MAPREDUCE-6824) TaskAttemptImpl#createCommonContainerLaunchContext is longer than 150 lines

2016-12-16 Thread Chris Trezzo (JIRA)
Chris Trezzo created MAPREDUCE-6824:
---

 Summary: TaskAttemptImpl#createCommonContainerLaunchContext is 
longer than 150 lines
 Key: MAPREDUCE-6824
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-6824
 Project: Hadoop Map/Reduce
  Issue Type: Bug
Reporter: Chris Trezzo
Priority: Trivial


bq. ./hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app/src/main/java/org/apache/hadoop/mapreduce/v2/app/job/impl/TaskAttemptImpl.java:752: private static ContainerLaunchContext createCommonContainerLaunchContext(:3: Method length is 172 lines (max allowed is 150).

{{TaskAttemptImpl#createCommonContainerLaunchContext}} is longer than 150 lines 
and needs to be refactored.






Re: [VOTE] Release Apache Hadoop 2.6.5 (RC1)

2016-10-05 Thread Chris Trezzo
+1 (non-binding)

Thanks for cutting another release candidate Sangjin!

1. Diff'ed the CHANGES-*.txt files against the branch-2.6.5 git log and
verified the set of listed issues matches.
2. Verified md5 checksums and signature on src and binary tar.gz.
3. Built from source.
4. Started up a pseudo distributed cluster.
5. Successfully ran a PI job.
6. Ran the balancer.
7. Inspected UI for RM, NN, JobHistory.

On Sun, Oct 2, 2016 at 5:12 PM, Sangjin Lee  wrote:

> Hi folks,
>
> I have pushed a new release candidate (RC1) for the Apache Hadoop 2.6.5
> release (the next maintenance release in the 2.6.x release line). RC1
> contains fixes to CHANGES.txt, and is otherwise identical to RC0.
>
> Below are the details of this release candidate:
>
> The RC is available for validation at:
> http://home.apache.org/~sjlee/hadoop-2.6.5-RC1/.
>
> The RC tag in git is release-2.6.5-RC1 and its git commit is
> e8c9fe0b4c252caf2ebf1464220599650f119997.
>
> The maven artifacts are staged via repository.apache.org at:
> https://repository.apache.org/content/repositories/orgapachehadoop-1050/.
>
> You can find my public key at
> http://svn.apache.org/repos/asf/hadoop/common/dist/KEYS.
>
> Please try the release and vote. The vote will run for the usual 5 days. I
> would greatly appreciate your timely vote. Thanks!
>
> Regards,
> Sangjin
>


Re: [VOTE] Release Apache Hadoop 2.6.5 (RC0)

2016-09-29 Thread Chris Trezzo
+1

Thanks Sangjin!

1. Verified md5 checksums and signatures on the src and release tar.gz.
2. Built from source.
3. Started up a pseudo distributed cluster.
4. Successfully ran a PI job.
5. Ran the balancer.
6. Inspected UI for RM, NN, JobHistory.

On Tue, Sep 27, 2016 at 4:11 PM, Lei Xu  wrote:

> +1
>
> The steps I've done:
>
> * Downloaded release tar and source tar, verified MD5.
> * Run a HDFS cluster, and copy files between local filesystem and HDFS.
>
>
> On Tue, Sep 27, 2016 at 1:28 PM, Sangjin Lee  wrote:
> > Hi folks,
> >
> > I have created a release candidate RC0 for the Apache Hadoop 2.6.5
> release
> > (the next maintenance release in the 2.6.x release line). Below are the
> > details of this release candidate:
> >
> > The RC is available for validation at:
> > http://home.apache.org/~sjlee/hadoop-2.6.5-RC0/.
> >
> > The RC tag in git is release-2.6.5-RC0 and its git commit is
> > 6939fc935fba5651fdb33386d88aeb8e875cf27a.
> >
> > The maven artifacts are staged via repository.apache.org at:
> > https://repository.apache.org/content/repositories/orgapachehadoop-1048/
> .
> >
> > You can find my public key at
> > http://svn.apache.org/repos/asf/hadoop/common/dist/KEYS.
> >
> > Please try the release and vote. The vote will run for the usual 5 days.
> > Huge thanks to Chris Trezzo for spearheading the release management and
> > doing all the work!
> >
> > Thanks,
> > Sangjin
>
>
>
> --
> Lei (Eddy) Xu
> Software Engineer, Cloudera
>
>
>


Re: [Release thread] 2.6.5 release activities

2016-09-16 Thread Chris Trezzo
We have now cut branch-2.6.5.

On Wed, Sep 14, 2016 at 4:38 PM, Sangjin Lee  wrote:

> We ported 16 issues to branch-2.6. We will go ahead and start the release
> process, including cutting the release branch. If you have any critical
> change that should be made part of 2.6.5, please reach out to us and commit
> the changes. Thanks!
>
> Sangjin
>
> On Mon, Sep 12, 2016 at 3:24 PM, Sangjin Lee  wrote:
>
>> Thanks Chris!
>>
>> I'll help Chris to get those JIRAs marked in his spreadsheet committed.
>> We'll cut the release branch shortly after that. If you have any critical
>> change that should be made part of 2.6.5 (CVE patches included), please
>> reach out to us and commit the changes. If all things go well, we'd like to
>> cut the branch in a few days.
>>
>> Thanks,
>> Sangjin
>>
>> On Fri, Sep 9, 2016 at 1:24 PM, Chris Trezzo  wrote:
>>
>>> Hi all,
>>>
>>> I wanted to give an update on the Hadoop 2.6.5 release efforts.
>>>
>>> Here is what has been done so far:
>>>
>>> 1. I have gone through all of the potential backports and recorded the
>>> commit hashes for each of them from the branch that seems the most
>>> appropriate (i.e. if there was a backport to 2.7.x then I used the hash
>>> from the backport).
>>>
>>> 2. I verified if the cherry pick for each commit is clean. This was best
>>> effort as some of the patches are in parts of the code that I am less
>>> familiar with. This is recorded in the public spread sheet here:
>>> https://docs.google.com/spreadsheets/d/1lfG2CYQ7W4q3olWpOCo6EBAey1WYC8hTRUemHvYPPzY/edit?usp=sharing
>>>
>>> I am going to need help from committers to get these backports committed.
>>> If there are any committers that have some spare cycles, especially if
>>> you
>>> were involved with the initial commit for one of these issues, please
>>> look
>>> at the spreadsheet and volunteer to backport one of the issues.
>>>
>>> As always, please let me know if you have any questions or feel that I
>>> have
>>> missed something.
>>>
>>> Thank you!
>>> Chris Trezzo
>>>
>>> On Mon, Aug 15, 2016 at 10:55 AM, Allen Wittenauer <
>>> a...@effectivemachines.com
>>> > wrote:
>>>
>>> >
>>> > > On Aug 12, 2016, at 8:19 AM, Junping Du  wrote:
>>> > >
>>> > >  In this community, we are so aggressive about dropping Java 7
>>> > > support in the 3.0.x release. So why are we so conservative about
>>> > > continuing to release new bits that support Java 6?
>>> >
>>> > I don't view a group of people putting bug fixes into a micro
>>> > release as particularly conservative.  If a group within the community
>>> > wasn't interested in doing it, 2.6.5 wouldn't be happening.
>>> >
>>> > But let's put the releases into context, because I think it
>>> tells
>>> > a more interesting story.
>>> >
>>> > * hadoop 2.6.x = EOLed JREs (6,7)
>>> > * hadoop 2.7 -> hadoop 2.x = transitional (7,8)
>>> > * hadoop 3.x = JRE 8
>>> > * hadoop 4.x = JRE 9
>>> >
>>> > There are groups of people still using JDK6 and they want bug
>>> > fixes in a maintenance release.  Boom, there's 2.6.x.
>>> >
>>> > Hadoop 3.x has been pushed off for years for "reasons".  So we
>>> > still have releases coming off of branch-2.  If 2.7 had been released
>>> as
>>> > 3.x, this chart would look less weird. But it wasn't, thus 2.x has
>>> > this weird wart in the middle that supports JDK7 and JDK8.  Given the
>>> public
>>> > policy and roadmaps of at least one major vendor at the time of this
>>> > writing, we should expect to see JDK7 support for at least the next two
>>> > years after 3.x appears. Bang, there's 2.x, where x is some large
>>> number.
>>> >
>>> > Then there is the future.  People using JRE 8 want to use newer
>>> > dependencies.  A reasonable request. Some of these dependency updates
>>> won't
>>> > work with JRE 7.   We can't do that in hadoop 2.x in any sort of
>>> > compatible way without breaking the universe. (Tons of JIRAs on this
>>> > point.)

Re: [Release thread] 2.6.5 release activities

2016-09-09 Thread Chris Trezzo
Hi all,

I wanted to give an update on the Hadoop 2.6.5 release efforts.

Here is what has been done so far:

1. I have gone through all of the potential backports and recorded the
commit hashes for each of them from the branch that seems the most
appropriate (i.e. if there was a backport to 2.7.x then I used the hash
from the backport).

2. I verified if the cherry pick for each commit is clean. This was best
effort as some of the patches are in parts of the code that I am less
familiar with. This is recorded in the public spread sheet here:
https://docs.google.com/spreadsheets/d/1lfG2CYQ7W4q3olWpOCo6EBAey1WYC8hTRUemHvYPPzY/edit?usp=sharing

I am going to need help from committers to get these backports committed.
If there are any committers that have some spare cycles, especially if you
were involved with the initial commit for one of these issues, please look
at the spreadsheet and volunteer to backport one of the issues.

As always, please let me know if you have any questions or feel that I have
missed something.

Thank you!
Chris Trezzo

On Mon, Aug 15, 2016 at 10:55 AM, Allen Wittenauer  wrote:

>
> > On Aug 12, 2016, at 8:19 AM, Junping Du  wrote:
> >
> >  In this community, we are so aggressive about dropping Java 7 support
> > in the 3.0.x release. So why are we so conservative about continuing to
> > release new bits that support Java 6?
>
> I don't view a group of people putting bug fixes into a micro
> release as particularly conservative.  If a group within the community
> wasn't interested in doing it, 2.6.5 wouldn't be happening.
>
> But let's put the releases into context, because I think it tells
> a more interesting story.
>
> * hadoop 2.6.x = EOLed JREs (6,7)
> * hadoop 2.7 -> hadoop 2.x = transitional (7,8)
> * hadoop 3.x = JRE 8
> * hadoop 4.x = JRE 9
>
> There are groups of people still using JDK6 and they want bug
> fixes in a maintenance release.  Boom, there's 2.6.x.
>
> Hadoop 3.x has been pushed off for years for "reasons".  So we
> still have releases coming off of branch-2.  If 2.7 had been released as
> 3.x, this chart would look less weird. But it wasn't, thus 2.x has this
> weird wart in the middle that supports JDK7 and JDK8.  Given the public
> policy and roadmaps of at least one major vendor at the time of this
> writing, we should expect to see JDK7 support for at least the next two
> years after 3.x appears. Bang, there's 2.x, where x is some large number.
>
> Then there is the future.  People using JRE 8 want to use newer
> dependencies.  A reasonable request. Some of these dependency updates won't
> work with JRE 7.   We can't do that in hadoop 2.x in any sort of compatible
> way without breaking the universe. (Tons of JIRAs on this point.) This
> means we can only do it in 3.x (re: Hadoop Compatibility Guidelines).
> Kapow, there's 3.x
>
> The log4j community has stated that v1 won't work with JDK9. In
> turn, this means we'll need to upgrade to v2 at some point.  Upgrading to
> v2 will break the log4j properties file (and maybe other things?). Another
> incompatible change and it likely won't appear until Apache Hadoop v4
> unless someone takes the initiative to fix it before v3 hits store
> shelves.  This makes JDK9 the likely target for Apache Hadoop v4.
>
> Having major release cadences tied to JRE updates isn't
> necessarily a bad thing and definitely forces the community to a) actually
> stop beating around the bush on majors and b) actually makes it relatively
> easy to determine what the schedule looks like to some degree.
>
>
>
>
>
>
>


Re: [Release thread] 2.6.5 release activities

2016-08-12 Thread Chris Trezzo
Thanks everyone for the discussion! Based on what has been said in this
thread, I will move forward on the preparation efforts (working with
Sangjin and the community) for a 2.6.5 release. If there is a strong
objection to this, please let us know.

I see three initial tasks going forward:

1. Fix the pre-commit build for branch-2.6. I am not entirely sure what is
involved here, but will start looking into it using HADOOP-12800 as a
starting point.

2. Look through the unresolved issues still targeted to 2.6.5 and
resolve/re-target as necessary. There are currently 17 of them:
https://issues.apache.org/jira/issues/?jql=(project%20%3D%20%22Hadoop%20Common%22%20OR%20project%20%3D%20%22Hadoop%20YARN%22%20OR%20project%20%3D%20%22Hadoop%20Map%2FReduce%22%20OR%20project%20%3D%20%22Hadoop%20HDFS%22)%20AND%20(fixVersion%20%3D%202.6.5%20OR%20%22Target%20Version%2Fs%22%20%3D%202.6.5)%20AND%20(status%20!%3D%20Resolved%20AND%20status%20!%3D%20Closed)%20ORDER%20BY%20status%20ASC

3. Look through the backport candidates in the spreadsheet in more detail
and backport/drop as necessary. There are currently 15 of them:
https://docs.google.com/spreadsheets/d/1lfG2CYQ7W4q3olWpOCo6EBAey1WYC8hTRUemHvYPPzY/edit?usp=sharing

Finally, I think it would be awesome to continue the release end-of-life
discussion. As we get better and better at release cadence it is going to
be really helpful to have an established EOL policy. It will be less
confusing for contributors and help set clear expectations for end users as
to when they need to start making reasonable migration plans, especially if
they want to stay on a well supported release line. I would be happy to
help with this effort as well. It would be great if we could leverage that
discussion and have EOL information for the 2.6 line when we release 2.6.5.

As always, please let me know if I have missed something.

Thanks!
Chris Trezzo

On Thu, Aug 11, 2016 at 1:03 PM, Karthik Kambatla 
wrote:

> Since there is sufficient interest in 2.6.5, we should probably do it. All
> the reasons Allen outlines make sense.
>
> That said, Junping brings up a very important point that we should think of
> for future releases. For a new user or a user that does not directly
> contribute to the project, more stable releases make it hard to pick from.
>
> As Chris T mentioned, the notion of EOL for our releases seems like a good
> idea. However, to come up with any EOLs, we need to have some sort of
> cadence for the releases. While this is hard for major releases (big bang,
> potentially incompatible features), it should be doable for minor releases.
>
> How do people feel about doing a minor release every 6 months, with
> follow-up maintenance releases every 2 months until the next minor and as
> needed after that? That way, we could EOL a minor release a year after its
> initial release? In the future, we could consider shrinking this window. In
> addition to the EOL, this also makes our releases a little more predictable
> for both users and vendors. Note that 2.6.0 went out almost 2 years ago and
> we haven't had a new minor release in 14 months. I am happy to start
> another DISCUSS thread around this if people think it is useful.
>
> Thanks
> Karthik
>
> On Thu, Aug 11, 2016 at 12:50 PM, Allen Wittenauer <
> a...@effectivemachines.com
> > wrote:
>
> >
> > > On Aug 11, 2016, at 8:10 AM, Junping Du  wrote:
> > >
> > > Allen, to be clear, I am not against any branch release effort here.
> > However,
> >
> > "I'm not an X but "
> >
> > > as RM for previous releases 2.6.3 and 2.6.4, I feel I have a
> > > responsibility to take care of branch-2.6 together with the other RMs
> > > (Vinod and Sangjin) and to understand the current gap - especially, to
> > > get consensus from the community on the future plan for 2.6.x.
> > > Our bylaws give anyone the freedom to undertake a release effort, but
> > > they don't take away our right to raise reasonable questions/concerns
> > > about any release plan. As you mentioned below, people can potentially
> > > fire up a branch-1 release effort. But if you called for a release
> > > plan for branch-1 tomorrow, I cannot imagine nobody would question
> > > that effort. Isn't that so?
> >
> > From previous discussions I've seen around releases, I
> > think it would depend upon which employee from which vendor raised the
> > question.
> >
> > > Let's keep discussions on releasing 2.6.5 more technical. IMO, to make
> > > the 2.6.5 release more reasonable, shouldn't we check the following
> > > questions first?
> > > 1. Do we have any significant issues that should land in 2.6.5
> > > compared with 2.6.4?
> > > 2. If so, any technical reasons (like: upgrade i

Re: [Release thread] 2.6.5 release activities

2016-08-10 Thread Chris Trezzo
Thanks Jason and Junping for the comments! I will update the spreadsheet
for HADOOP-13362 and YARN-4794.

As for continuing 2.6.x releases, please see the discussion in the "[DISCUSS]
2.6.x line releases" thread. Sean, Akira and Zhe all expressed interest in
additional 2.6.x releases. I started this thread based off of that
interest. I understand there is a burden to maintaining a large number of
branches. I am not sure what the community's end-of-life policy is, but
maybe we can issue a warning with the 2.6.5 release stating when we will
stop maintaining the release line. This at least gives users some time to
make migration plans to a newer version.

On Wed, Aug 10, 2016 at 9:36 AM, Junping Du  wrote:

> Thanks Chris for bringing up this discussion.
> Before we go into a detailed discussion of releasing 2.6.5, I have a quick
> question: do we think it is necessary to continue to release from
> branch-2.6 (2.6.5, etc.) after 2.7 has been out for more than 1 year? Any
> reason not to suggest users upgrade to the 2.7.3 release, with the latest
> fixes, which is being released now?
> My major concern with more release efforts on legacy branches is the same
> as my comments on other release plans before - it seems too many release
> trains get planned in the same time window (2.6.x, 2.7.x, 2.8, 3.0-alpha,
> 3.1-beta, etc.). Not only could users get confused by this, but I also
> suspect we don't have enough bandwidth in the community to push all of
> these releases forward at high quality during the same time window - just
> as Chris Douglas mentioned in another email thread on committer activity
> and bandwidth. IMO, maybe it is better to focus on a limited number of
> releases and move them faster?
>
> BTW, I agree with Jason that HADOOP-13362 is not needed for branch-2.6
> unless we backport container metrics related patches there.
>
>
> Thanks,
>
> Junping
> ________
> From: Jason Lowe 
> Sent: Wednesday, August 10, 2016 4:14 PM
> To: Chris Trezzo; common-...@hadoop.apache.org; hdfs-...@hadoop.apache.org;
> mapreduce-dev@hadoop.apache.org; yarn-...@hadoop.apache.org
> Subject: Re: [Release thread] 2.6.5 release activities
>
> Thanks for organizing this, Chris!
> I don't believe HADOOP-13362 is needed since it's related to
> ContainerMetrics.  ContainerMetrics weren't added until 2.7 by YARN-2984.
> YARN-4794 looks applicable to 2.6.  The change drops right in except it
> has JDK7-isms (multi-catch clause), so it needs a slight change.
>
> Jason
>
>   From: Chris Trezzo 
>  To: "common-...@hadoop.apache.org" ;
> hdfs-...@hadoop.apache.org; "mapreduce-dev@hadoop.apache.org" <
> mapreduce-dev@hadoop.apache.org>; "yarn-...@hadoop.apache.org" <
> yarn-...@hadoop.apache.org>
>  Sent: Tuesday, August 9, 2016 7:32 PM
>  Subject: [Release thread] 2.6.5 release activities
>
> Based on the sentiment in the "[DISCUSS] 2.6.x line releases" thread, I
> have moved forward with some of the initial effort in creating a 2.6.5
> release. I am forking this thread so we have a dedicated 2.6.5 release
> thread.
>
> I have gone through the git logs and gathered a list of JIRAs that are in
> branch-2.7 but are missing from branch-2.6. I limited the diff to issues
> with a commit date after 1/26/2016. I did this because 2.6.4 was cut from
> branch-2.6 around that date (http://markmail.org/message/xmy7ebs6l3643o5e)
> and presumably issues that were committed to branch-2.7 before then were
> already looked at as part of 2.6.4.
>
> I have collected these issues in a spreadsheet and have given them an
> initial triage on whether they are candidates for a backport to 2.6.5. The
> spreadsheet is sorted by the status of the issues with the potential
> backport candidates at the top. Here is a link to the spreadsheet:
> https://docs.google.com/spreadsheets/d/1lfG2CYQ7W4q3olWpOCo6EBAey1WYC8hTRUemHvYPPzY/edit?usp=sharing
>
> As of now, I have identified 16 potential backport candidates. Please take
> a look at the list and let me know if there are any that you think should
> not be on the list, or ones that you think I have missed. This was just an
> initial high-level triage, so there could definitely be issues that are
> mislabeled.
>
> As a side note: we still need to look at the pre-commit build for 2.6 and
> follow up with an addendum for HADOOP-12800.
>
> Thanks everyone!
> Chris Trezzo
>
>


[Release thread] 2.6.5 release activities

2016-08-09 Thread Chris Trezzo
Based on the sentiment in the "[DISCUSS] 2.6.x line releases" thread, I
have moved forward with some of the initial effort in creating a 2.6.5
release. I am forking this thread so we have a dedicated 2.6.5 release
thread.

I have gone through the git logs and gathered a list of JIRAs that are in
branch-2.7 but are missing from branch-2.6. I limited the diff to issues
with a commit date after 1/26/2016. I did this because 2.6.4 was cut from
branch-2.6 around that date (http://markmail.org/message/xmy7ebs6l3643o5e)
and presumably issues that were committed to branch-2.7 before then were
already looked at as part of 2.6.4.

I have collected these issues in a spreadsheet and have given them an
initial triage on whether they are candidates for a backport to 2.6.5. The
spreadsheet is sorted by the status of the issues with the potential
backport candidates at the top. Here is a link to the spreadsheet:
https://docs.google.com/spreadsheets/d/1lfG2CYQ7W4q3olWpOCo6EBAey1WYC8hTRUemHvYPPzY/edit?usp=sharing

As of now, I have identified 16 potential backport candidates. Please take
a look at the list and let me know if there are any that you think should
not be on the list, or ones that you think I have missed. This was just an
initial high-level triage, so there could definitely be issues that are
mislabeled.

As a side note: we still need to look at the pre-commit build for 2.6 and
follow up with an addendum for HADOOP-12800.

Thanks everyone!
Chris Trezzo


[jira] [Created] (MAPREDUCE-6747) TestMapReduceJobControl#testJobControlWithKillJob times out in trunk

2016-08-05 Thread Chris Trezzo (JIRA)
Chris Trezzo created MAPREDUCE-6747:
---

 Summary: TestMapReduceJobControl#testJobControlWithKillJob times 
out in trunk
 Key: MAPREDUCE-6747
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-6747
 Project: Hadoop Map/Reduce
  Issue Type: Bug
Reporter: Chris Trezzo
Priority: Minor


TestMapReduceJobControl#testJobControlWithKillJob seems to time out while 
waiting for all jobs to complete. This seems to only happen if the test is run 
with the other tests in the class (specifically testJobControlWithFailJob). If 
testJobControlWithKillJob is run by itself, the test passes.

Looking into the test logs, when run with another test from the class, the test 
runs into an issue while setting permissions on the local file system:
{noformat}
2016-08-05 11:40:32,101 WARN  [Thread-100] util.Shell 
(Shell.java:joinThread(1023)) - Interrupted while joining on: 
Thread[Thread-105,5,main]
java.lang.InterruptedException
at java.lang.Object.wait(Native Method)
at java.lang.Thread.join(Thread.java:1245)
at java.lang.Thread.join(Thread.java:1319)
at org.apache.hadoop.util.Shell.joinThread(Shell.java:1020)
at org.apache.hadoop.util.Shell.runCommand(Shell.java:969)
at org.apache.hadoop.util.Shell.run(Shell.java:878)
at 
org.apache.hadoop.util.Shell$ShellCommandExecutor.execute(Shell.java:1172)
at org.apache.hadoop.util.Shell.execCommand(Shell.java:1266)
at org.apache.hadoop.util.Shell.execCommand(Shell.java:1248)
at 
org.apache.hadoop.fs.RawLocalFileSystem.setPermission(RawLocalFileSystem.java:781)
at 
org.apache.hadoop.fs.RawLocalFileSystem.mkOneDirWithMode(RawLocalFileSystem.java:526)
at 
org.apache.hadoop.fs.RawLocalFileSystem.mkdirsWithOptionalPermission(RawLocalFileSystem.java:566)
at 
org.apache.hadoop.fs.RawLocalFileSystem.mkdirs(RawLocalFileSystem.java:538)
at 
org.apache.hadoop.fs.RawLocalFileSystem.mkdirsWithOptionalPermission(RawLocalFileSystem.java:565)
at 
org.apache.hadoop.fs.RawLocalFileSystem.mkdirs(RawLocalFileSystem.java:538)
at 
org.apache.hadoop.fs.RawLocalFileSystem.mkdirsWithOptionalPermission(RawLocalFileSystem.java:565)
at 
org.apache.hadoop.fs.RawLocalFileSystem.mkdirs(RawLocalFileSystem.java:538)
at 
org.apache.hadoop.fs.ChecksumFileSystem.mkdirs(ChecksumFileSystem.java:696)
at 
org.apache.hadoop.mapreduce.lib.output.FileOutputCommitter.setupJob(FileOutputCommitter.java:343)
at 
org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:541)
{noformat}

Conversely, when the test is run by itself, this issue is not hit.






[jira] [Created] (MAPREDUCE-6690) Limit the number of resources a single map reduce job can submit for localization

2016-05-06 Thread Chris Trezzo (JIRA)
Chris Trezzo created MAPREDUCE-6690:
---

 Summary: Limit the number of resources a single map reduce job can 
submit for localization
 Key: MAPREDUCE-6690
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-6690
 Project: Hadoop Map/Reduce
  Issue Type: New Feature
Reporter: Chris Trezzo
Assignee: Chris Trezzo


Users will sometimes submit a large number of resources to be localized as
part of a single map reduce job. This can cause issues with YARN localization
that destabilize the cluster and potentially impact other users' jobs. These
resources are specified via the files, libjars, archives, and jobjar
command-line arguments, or directly through the configuration (i.e. the
distributed cache API). The specified resources could be too large in multiple
dimensions:
# Total size
# Number of files
# Size of an individual resource (e.g. a large fat jar)

We would like to encourage good behavior on the client side by having the
option of enforcing resource limits along the above dimensions.

There should be a separate effort to enforce limits at the YARN layer on the
server side, but this JIRA only covers the map reduce layer on the client
side. In practice, having these client-side limits will get us a long way
towards preventing these localization anti-patterns.
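As a rough illustration of the client-side enforcement proposed here, a
minimal sketch follows; the configuration keys and the method are invented
for the example and are not necessarily the names this JIRA will use:

{noformat}
import java.io.IOException;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileStatus;

public final class LocalizationLimits {
  // Hypothetical keys; a negative value means "no limit".
  static final String MAX_RESOURCES = "mapreduce.job.cache.limit.max-resources";
  static final String MAX_TOTAL_SIZE = "mapreduce.job.cache.limit.max-total-size";
  static final String MAX_SINGLE_SIZE = "mapreduce.job.cache.limit.max-single-resource-size";

  /** Fail job submission early if resources exceed the configured limits. */
  public static void check(Configuration conf, FileStatus[] resources)
      throws IOException {
    int maxCount = conf.getInt(MAX_RESOURCES, -1);
    long maxTotal = conf.getLong(MAX_TOTAL_SIZE, -1);
    long maxSingle = conf.getLong(MAX_SINGLE_SIZE, -1);

    if (maxCount >= 0 && resources.length > maxCount) {
      throw new IOException(
          "Too many resources: " + resources.length + " > " + maxCount);
    }
    long total = 0;
    for (FileStatus s : resources) {
      if (maxSingle >= 0 && s.getLen() > maxSingle) {
        throw new IOException("Resource too large: " + s.getPath() + " ("
            + s.getLen() + " > " + maxSingle + " bytes)");
      }
      total += s.getLen();
    }
    if (maxTotal >= 0 && total > maxTotal) {
      throw new IOException(
          "Total resource size too large: " + total + " > " + maxTotal);
    }
  }
}
{noformat}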






Re: 2.7.3 release plan

2016-04-05 Thread Chris Trezzo
In light of the additional conversation on HDFS-8791, I would like to
re-propose the following:

1. Revert the new datanode layout (HDFS-8791) from the 2.7 branch. The
layout change currently does not support downgrades, which breaks our
upgrade/downgrade policies for dot releases.

2. Cut a 2.8 release off of the 2.7.3 release with the addition of
HDFS-8791. This would give customers a stable release that they could
deploy with the new layout. As discussed on the JIRA, this is still in line
with user expectations for minor releases, as we have done layout changes in
a number of 2.x minor releases already. The current 2.8 would become 2.9
and continue its current release schedule.

What does everyone think? If unsupported downgrades between minor releases
are still not agreeable, then, as stated by Vinod, we would need to either
add support for downgrades with DN layout changes or revert the layout
change from branch-2. If we are OK with the layout change in a minor
release, but think that the issue does not affect enough customers to
warrant a separate release, we could simply leave it in branch-2 and let it
be released with the current 2.8.


On Mon, Apr 4, 2016 at 1:48 PM, Vinod Kumar Vavilapalli 
wrote:

> I commented on the JIRA way back (see
> https://issues.apache.org/jira/browse/HDFS-8791?focusedCommentId=1503&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-1503),
> saying what I said below. Unfortunately, I haven’t followed the patch along
> after my initial comment.
>
> This isn’t about any specific release - starting 2.6 we declared support
> for rolling upgrades and downgrades. Any patch that breaks this should not
> be in branch-2.
>
> Two options from where I stand
>  (1) For folks who worked on the patch: Is there a way to (a) make the
> upgrade-downgrade seamless for people who don’t care about this and (b)
> have explicit documentation for people who care to switch this behavior on
> and are willing to risk not having downgrades. If this means a new
> configuration property, so be it. It’s a necessary evil.
>  (2) Just let specific users backport this into specific 2.x branches they
> need and leave it only on trunk.
>
> Unless this behavior stops breaking rolling upgrades/downgrades, I think
> we should just revert it from branch-2 and definitely 2.7.3 as it stands
> today.
>
> +Vinod
>
>
> > On Apr 1, 2016, at 2:54 PM, Chris Trezzo  wrote:
> >
> > A few thoughts:
> >
> > 1. To echo Andrew Wang, HDFS-8578 (parallel upgrades) should be a
> > prerequisite for HDFS-8791. Without that patch, upgrades can be very slow
> > for data nodes depending on your setup.
> >
> > 2. We have already deployed this patch internally so, with my Twitter hat
> > on, I would be perfectly happy as long as it makes it into trunk and 2.8.
> > That being said, I would be hesitant to deploy the current 2.7.x or 2.6.x
> > releases on a large production cluster that has a diverse set of block
> ids
> > without this patch, especially if your data nodes have a large number of
> > disks or you are using federation. To be clear though: this highly
> depends
> > on your setup and at a minimum you should verify that this regression
> will
> > not affect you. The current block-id based layout in 2.6.x and 2.7.2 has
> a
> > performance regression that gets worse over time. When you see it
> happening
> > on a live cluster, it is one of the harder issues to identify a root
> cause
> > and debug. I do understand that this is currently only affecting a
> smaller
> > number of users, but I also think this number has potential to increase
> as
> > time goes on. Maybe we can issue a warning in the release notes for
> future
> > 2.7.x and 2.6.x releases?
> >
> > 3. One option (this was suggested on HDFS-8791 and I think Sean alluded
> to
> > this proposal on this thread) would be to cut a 2.8 release off of the
> > 2.7.3 release with the new layout. What people currently think of as 2.8
> > would then become 2.9. This would give customers a stable release that
> they
> > could deploy with the new layout and would not break upgrade and
> downgrade
> > expectations.
> >
> > On Fri, Apr 1, 2016 at 11:32 AM, Andrew Purtell 
> wrote:
> >
> >> As a downstream consumer of Apache Hadoop 2.7.x releases, I expect we
> would
> >> patch the release to revert HDFS-8791 before pushing it out to
> production.
> >> For what it's worth.
> >>
> >>
> >> On Fri, Apr 1, 2016 at 11:23 AM, Andrew Wang 
> >> wrote:
> >>
> >>> One other thing I wanted to bring up regarding HDFS-8791, we haven't
> >>> backported the parallel DN upgrade improvement (HDFS-8578) to
> >>> branch-2.6.

Re: 2.7.3 release plan

2016-04-01 Thread Chris Trezzo
A few thoughts:

1. To echo Andrew Wang, HDFS-8578 (parallel upgrades) should be a
prerequisite for HDFS-8791. Without that patch, upgrades can be very slow
for data nodes depending on your setup.

2. We have already deployed this patch internally so, with my Twitter hat
on, I would be perfectly happy as long as it makes it into trunk and 2.8.
That being said, I would be hesitant to deploy the current 2.7.x or 2.6.x
releases on a large production cluster that has a diverse set of block ids
without this patch, especially if your data nodes have a large number of
disks or you are using federation. To be clear though: this highly depends
on your setup and at a minimum you should verify that this regression will
not affect you. The current block-id based layout in 2.6.x and 2.7.2 has a
performance regression that gets worse over time. When you see it happening
on a live cluster, it is one of the harder issues to identify a root cause
and debug. I do understand that this is currently only affecting a smaller
number of users, but I also think this number has potential to increase as
time goes on. Maybe we can issue a warning in the release notes for future
2.7.x and 2.6.x releases?

3. One option (this was suggested on HDFS-8791 and I think Sean alluded to
this proposal on this thread) would be to cut a 2.8 release off of the
2.7.3 release with the new layout. What people currently think of as 2.8
would then become 2.9. This would give customers a stable release that they
could deploy with the new layout and would not break upgrade and downgrade
expectations.

On Fri, Apr 1, 2016 at 11:32 AM, Andrew Purtell  wrote:

> As a downstream consumer of Apache Hadoop 2.7.x releases, I expect we would
> patch the release to revert HDFS-8791 before pushing it out to production.
> For what it's worth.
>
>
> On Fri, Apr 1, 2016 at 11:23 AM, Andrew Wang 
> wrote:
>
> > One other thing I wanted to bring up regarding HDFS-8791, we haven't
> > backported the parallel DN upgrade improvement (HDFS-8578) to branch-2.6.
> > HDFS-8578 is a very important related fix since otherwise upgrade will be
> > very slow.
> >
> > On Thu, Mar 31, 2016 at 10:35 AM, Andrew Wang 
> > wrote:
> >
> > > As I expressed on HDFS-8791, I do not want to include this JIRA in a
> > > maintenance release. I've only seen it crop up on a handful of our
> > > customer's clusters, and large users like Twitter and Yahoo that seem
> to
> > be
> > > more affected are also the most able to patch this change in
> themselves.
> > >
> > > Layout upgrades are quite disruptive, and I don't think it's worth
> > > breaking upgrade and downgrade expectations when it doesn't affect the
> > (in
> > > my experience) vast majority of users.
> > >
> > > Vinod seemed to have a similar opinion in his comment on HDFS-8791, but
> > > will let him elaborate.
> > >
> > > Best,
> > > Andrew
> > >
> > > On Thu, Mar 31, 2016 at 9:11 AM, Sean Busbey 
> > wrote:
> > >
> > >> As of 2 days ago, there were already 135 jiras associated with 2.7.3,
> > >> if *any* of them end up introducing a regression the inclusion of
> > >> HDFS-8791 means that folks will have cluster downtime in order to back
> > >> things out. If that happens to any substantial number of downstream
> > >> folks, or any particularly vocal downstream folks, then it is very
> > >> likely we'll lose the remaining trust of operators for rolling out
> > >> maintenance releases. That's a pretty steep cost.
> > >>
> > >> Please do not include HDFS-8791 in any 2.6.z release. Folks having to
> > >> be aware that an upgrade from e.g. 2.6.5 to 2.7.2 will fail is an
> > >> unreasonable burden.
> > >>
> > >> I agree that this fix is important, I just think we should either cut
> > >> a version of 2.8 that includes it or find a way to do it that gives an
> > >> operational path for rolling downgrade.
> > >>
> > >> On Thu, Mar 31, 2016 at 10:10 AM, Junping Du 
> > wrote:
> > >> > Thanks for bringing up this topic, Sean.
> > >> > When I released our latest Hadoop release, 2.6.4, the patch for
> > >> > HDFS-8791 hadn't been committed yet, so that's why we didn't
> > >> > discuss this earlier.
> > >> > I remember that in the JIRA discussion we treated this layout
> > >> > change as a Blocker bug fixing a significant performance
> > >> > regression, not as a normal performance improvement. And I believe
> > >> > the HDFS community already did their best, carefully and
> > >> > patiently, to deliver the fix and other related patches (like the
> > >> > upgrade fix in HDFS-8578). Taking HDFS-8578 as an example, you can
> > >> > see 30+ rounds of patch review back and forth by senior
> > >> > committers, not to mention the outstanding performance test data
> > >> > in HDFS-8791.
> > >> > I would trust our HDFS committers' judgement to land HDFS-8791 in
> > >> > 2.7.3. However, that needs final confirmation from Vinod, who
> > >> > serves as RM for branch-2.7. In addition, I don't see any blocker
> > >> > issue with bringing it into 2.6.5 now.
> > >> > Just my 2 cents.
> > >> >
> >

Re: continuing releases on Apache Hadoop 2.6.x

2015-11-18 Thread Chris Trezzo
Thanks Junping for the clarification! It was not my intention to violate
the rules. I would be happy to work with you and help you manage the
release in whatever way is most effective.

Chris

On Wednesday, November 18, 2015, Junping Du  wrote:

> Thanks, Chris Trezzo, for volunteering to help with the 2.6.3 release. I
> think Sangjin was asking for a committer to serve as release manager for
> 2.6.3 according to Apache rules:
> http://www.apache.org/dev/release-publishing.html.
> I would like to serve in that role and work closely with you and Sangjin
> on the 2.6.3 release, if there are no objections from others.
>
> Thanks,
>
> Junping
> ____
> From: Chris Trezzo >
> Sent: Wednesday, November 18, 2015 1:13 AM
> To: yarn-...@hadoop.apache.org 
> Cc: common-...@hadoop.apache.org ;
> hdfs-...@hadoop.apache.org ; mapreduce-dev@hadoop.apache.org
> 
> Subject: Re: continuing releases on Apache Hadoop 2.6.x
>
> Hi Sangjin,
>
> I would be happy to volunteer to work with you as a release manager for
> 2.6.3. Shooting for a time in early December seems reasonable to me. I also
> agree that if we miss that window, January would be the next best option.
>
> Thanks,
> Chris
>
> On Tue, Nov 17, 2015 at 5:10 PM, Sangjin Lee  > wrote:
>
> > I'd like to pick up this email discussion again. It is time that we
> > started thinking about the next release in the 2.6.x line. IMO we want
> > to strike a balance between maintaining a reasonable release cadence
> > and getting a good amount of high-quality fixes. The timeframe is a
> > little tricky as the holidays are approaching. If we have enough fixes
> > accumulated in branch-2.6, some time in early December might be a good
> > target for cutting the first release candidate. Once we miss that
> > window, I think we are looking at next January. I'd like to hear your
> > thoughts on this.
> >
> > It'd be good if someone can volunteer for the release manager for 2.6.3.
> > I'd be happy to help out in any way I can. Thanks!
> >
> > Regards,
> > Sangjin
> >
> > On Mon, Nov 2, 2015 at 11:45 AM, Vinod Vavilapalli <
> > vino...@hortonworks.com >
> > wrote:
> >
> > > Just to stress the following: it is very important that any critical
> > > bug-fixes we push into 2.8.0 or even trunk also be considered for
> > > 2.6.3 and 2.7.3 if it makes sense. This is the only way we can avoid
> > > extremely long release cycles like that of 2.6.1.
> > >
> > > Also, to clarify a little, use Target-version if you want a discussion
> of
> > > the backport, but if you do end up backporting patches after that, you
> > > should set the fix-version to be 2.6.1.
> > >
> > > Thanks
> > > +Vinod
> > >
> > >
> > > > On Nov 2, 2015, at 11:29 AM, Sangjin Lee  > wrote:
> > > >
> > > > As you may have seen, 2.6.2 is out
> > > > <http://markmail.org/thread/yw53xgz6wzpqnclt>. I have also
> retargeted
> > > all
> > > > open issues that were targeted for 2.6.2 to 2.6.3.
> > > >
> > > > Continuing the discussion in the email thread here
> > > > <http://markmail.org/thread/ofjlzurok223bzyi>, I'd like us to
> maintain
> > > the
> > > > cadence of monthly point releases in the 2.6.x line. It would be
> great
> > if
> > > > we can have 2.6.3 released before the year-end holidays.
> > > >
> > > > If you have any bugfixes and improvements that are targeted for 2.7.x
> > (or
> > > > 2.8) that you think are applicable to 2.6.x, please *set the target
> > > version
> > > > to 2.6.3* and merge them to branch-2.6. Please use your judgment in
> > terms
> > > > of the applicability and quality of the changes so that we can ensure
> > > each
> > > > point release is consistently better quality than the previous one.
> > > Thanks
> > > > everyone!
> > > >
> > > > Regards,
> > > > Sangjin
> > >
> > >
> >
>


Re: continuing releases on Apache Hadoop 2.6.x

2015-11-17 Thread Chris Trezzo
Hi Sangjin,

I would be happy to volunteer to work with you as a release manager for
2.6.3. Shooting for a time in early December seems reasonable to me. I also
agree that if we miss that window, January would be the next best option.

Thanks,
Chris

On Tue, Nov 17, 2015 at 5:10 PM, Sangjin Lee  wrote:

> I'd like to pick up this email discussion again. It is time that we started
> thinking about the next release in the 2.6.x line. IMO we want to strike
> a balance between maintaining a reasonable release cadence and getting a
> good amount of high-quality fixes. The timeframe is a little tricky as
> the holidays are approaching. If we have enough fixes accumulated in
> branch-2.6, some time in early December might be a good target for cutting the
> first release candidate. Once we miss that window, I think we are looking
> at next January. I'd like to hear your thoughts on this.
>
> It'd be good if someone could volunteer to be the release manager for 2.6.3.
> I'd be happy to help out in any way I can. Thanks!
>
> Regards,
> Sangjin
>
> On Mon, Nov 2, 2015 at 11:45 AM, Vinod Vavilapalli <
> vino...@hortonworks.com>
> wrote:
>
> > Just to stress the following: it is very important that we consider any
> > critical bug-fixes we push into 2.8.0 or even trunk for 2.6.3 and 2.7.3 as
> > well, if it makes sense. This is the only way we can avoid extremely long
> > release cycles like that of 2.6.1.
> >
> > Also, to clarify a little: use Target-version if you want a discussion of
> > the backport, but if you do end up backporting a patch after that, you
> > should set its fix-version to 2.6.3.
> >
> > Thanks
> > +Vinod
> >
> >
> > > On Nov 2, 2015, at 11:29 AM, Sangjin Lee  wrote:
> > >
> > > As you may have seen, 2.6.2 is out
> > > <http://markmail.org/thread/yw53xgz6wzpqnclt>. I have also retargeted
> > all
> > > open issues that were targeted for 2.6.2 to 2.6.3.
> > >
> > > Continuing the discussion in the email thread here
> > > <http://markmail.org/thread/ofjlzurok223bzyi>, I'd like us to maintain
> > the
> > > cadence of monthly point releases in the 2.6.x line. It would be great
> if
> > > we can have 2.6.3 released before the year-end holidays.
> > >
> > > If you have any bugfixes and improvements that are targeted for 2.7.x
> (or
> > > 2.8) that you think are applicable to 2.6.x, please *set the target
> > version
> > > to 2.6.3* and merge them to branch-2.6. Please use your judgment in
> terms
> > > of the applicability and quality of the changes so that we can ensure
> > each
> > > point release is consistently better quality than the previous one.
> > Thanks
> > > everyone!
> > >
> > > Regards,
> > > Sangjin
> >
> >
>


[jira] [Created] (MAPREDUCE-6365) Refactor JobResourceUploader#uploadFilesInternal

2015-05-12 Thread Chris Trezzo (JIRA)
Chris Trezzo created MAPREDUCE-6365:
---

 Summary: Refactor JobResourceUploader#uploadFilesInternal
 Key: MAPREDUCE-6365
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-6365
 Project: Hadoop Map/Reduce
  Issue Type: Improvement
Reporter: Chris Trezzo
Priority: Minor


JobResourceUploader#uploadFilesInternal is a large method with several similar
pieces of code that could be pulled out into separate methods. This refactor
would improve the readability of the code.
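
As a rough illustration of the kind of extraction this refactor could make
(the class and helper below are hypothetical, not the actual patch), the
repeated copy-and-set-replication code for each resource type could collapse
into a single method:

import java.io.IOException;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class UploadHelperSketch {
  // Hypothetical helper: the -files, -libjars and -archives branches of
  // uploadFilesInternal could each call this instead of repeating the
  // copy-to-staging and set-replication steps inline.
  static Path copyToStaging(Configuration conf, Path localResource,
      Path stagingDir, short replication) throws IOException {
    FileSystem fs = stagingDir.getFileSystem(conf);
    Path target = new Path(stagingDir, localResource.getName());
    // delSrc=false so the local copy is kept; overwrite=true for reruns.
    fs.copyFromLocalFile(false, true, localResource, target);
    fs.setReplication(target, replication);
    return target;
  }
}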



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (MAPREDUCE-6267) Refactor JobSubmitter#copyAndConfigureFiles into its own class

2015-02-26 Thread Chris Trezzo (JIRA)
Chris Trezzo created MAPREDUCE-6267:
---

 Summary: Refactor JobSubmitter#copyAndConfigureFiles into its own
class
 Key: MAPREDUCE-6267
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-6267
 Project: Hadoop Map/Reduce
  Issue Type: Improvement
Reporter: Chris Trezzo
Assignee: Chris Trezzo
Priority: Minor


Refactor the uploading logic in JobSubmitter#copyAndConfigureFiles into its
own class. This makes the JobSubmitter class more readable and isolates the
logic that actually uploads the job resources to HDFS.
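
A minimal sketch of the separation this aims for (the class shape is
illustrative only, not the committed patch):

import java.io.IOException;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.mapreduce.Job;

class JobResourceUploaderSketch {
  private final FileSystem jtFs;

  JobResourceUploaderSketch(FileSystem jtFs) {
    this.jtFs = jtFs;
  }

  // All of the copy-to-HDFS logic that used to live inline in
  // JobSubmitter#copyAndConfigureFiles would move here, so that
  // JobSubmitter only constructs this class and delegates.
  void uploadFiles(Job job, Path submitJobDir) throws IOException {
    // -files, -libjars, -archives and job-jar handling goes here.
  }
}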



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


Re: Thinking ahead to hadoop-2.6

2014-09-23 Thread Chris Trezzo
I would like to see the shared cache (YARN-1492) make it in as well. We are
going through the final review process (with Karthik and Vinod) and should
be fairly close to complete. This is another feature that has been under
development for a while.

Thanks,
Chris

On Tue, Sep 23, 2014 at 4:00 PM, Suresh Srinivas 
wrote:

> I actually would like to see both archival storage and single replica
> memory writes in the 2.6 release. Archival storage is in the final stages
> of getting ready for branch-2 merge as Nicholas has already indicated on
> the dev mailing list. Hopefully HDFS-6581 gets ready sooner. Both of these
> features have been in development for some time.
>
> On Tue, Sep 23, 2014 at 3:27 PM, Andrew Wang 
> wrote:
>
> > Hey Arun,
> >
> > Maybe we could do a quick run through of the Roadmap wiki and
> add/retarget
> > things accordingly?
> >
> > I think the KMS and transparent encryption are ready to go. We've got
> > very few further bug fixes pending, but that's it.
> >
> > Two HDFS things that I think probably won't make the end of the week are
> > archival storage (HDFS-6584) and single replica memory writes
> (HDFS-6581),
> > which I believe are under the HSM banner. HDFS-6584 was just merged to
> > trunk and I think needs a little more work before it goes into branch-2.
> > HDFS-6581 hasn't even been merged to trunk yet, so it seems a bit further
> > off.
> >
> > Just my 2c as I did not work directly on these features. I just generally
> > shy away from shipping bits quite this fresh.
> >
> > Thanks,
> > Andrew
> >
> > On Tue, Sep 23, 2014 at 3:03 PM, Arun Murthy 
> wrote:
> >
> > > Looks like most of the content is in and hadoop-2.6 is shaping up
> nicely.
> > >
> > > I'll create branch-2.6 by end of the week and we can go from there to
> > > stabilize it - hopefully in the next few weeks.
> > >
> > > Thoughts?
> > >
> > > thanks,
> > > Arun
> > >
> > > On Tue, Aug 12, 2014 at 1:34 PM, Arun C Murthy 
> > > wrote:
> > >
> > > > Folks,
> > > >
> > > >  With hadoop-2.5 nearly done, it's time to start thinking ahead to
> > > > hadoop-2.6.
> > > >
> > > >  Currently, here is the Roadmap per the wiki:
> > > >
> > > > • HADOOP
> > > >   • Credential provider HADOOP-10607
> > > > • HDFS
> > > >   • Heterogeneous storage (Phase 2) - Support APIs for using
> > > >     storage tiers by the applications HDFS-5682
> > > >   • Memory as storage tier HDFS-5851
> > > > • YARN
> > > >   • Dynamic Resource Configuration YARN-291
> > > >   • NodeManager Restart YARN-1336
> > > >   • ResourceManager HA Phase 2 YARN-556
> > > >   • Support for admin-specified labels in YARN YARN-796
> > > >   • Support for automatic, shared cache for YARN application
> > > >     artifacts YARN-1492
> > > >   • Support NodeGroup layer topology on YARN YARN-18
> > > >   • Support for Docker containers in YARN YARN-1964
> > > >   • YARN service registry YARN-913
> > > >
> > > >  My suspicion is, as is normal, some will make the cut and some won't.
> > > > Please do add/subtract from the list as appropriate. Ideally, it would
> > > > be good to ship hadoop-2.6 in 6-8 weeks (say, October) to keep up a
> > > > cadence.
> > > >
> > > >  More importantly, as we discussed previously, we'd like hadoop-2.6
> to
> > be
> > > > the *last* Apache Hadoop 2.x release that supports JDK6. I'll start a
> > > > discussion with other communities (HBase, Pig, Hive, Oozie etc.) and
> > see
> > > > how they feel about this.
> > > >
> > > > thanks,
> > > > Arun
> > > >
> > > >
> > >
> > >
> > > --
> > >
> > > --
> > > Arun C. Murthy
> > > Hortonworks Inc.
> > > http://hortonworks.com/
> > >
> > >
> >
>
>
>
> --
> http://hortonworks.com/download/
>

Review request: YARN-1492 (i.e. the YARN shared cache)

2014-09-05 Thread Chris Trezzo
Hi All,

This email is to draw more attention to YARN-1492 and hopefully gain
traction on code reviews. At Twitter we have been running the shared cache
on our clusters and it has already served over 36 million requests.

The new shared cache feature is relatively isolated from existing code and
is completely enabled/disabled by configuration. When disabled, there are no
behavioral changes compared to the existing code base. It would be great to
see this committed to trunk, and even more awesome if it makes 2.6.
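
For reference, here is a minimal sketch of what turning the feature on looks
like. The property names are the ones documented for the shared cache in
later Hadoop releases and may not match this patch exactly, so treat them as
assumptions:

import org.apache.hadoop.conf.Configuration;

public class SharedCacheEnableSketch {
  public static void main(String[] args) {
    Configuration conf = new Configuration();
    // Server side (yarn-site.xml): the shared cache is off by default.
    conf.setBoolean("yarn.sharedcache.enabled", true);
    // HDFS directory where cached resources are stored (assumed default).
    conf.set("yarn.sharedcache.root-dir", "/sharedcache");
    System.out.println("shared cache enabled: "
        + conf.getBoolean("yarn.sharedcache.enabled", false));
  }
}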

This is a larger patch, but I have broken it up into a number of sub-tasks
in an attempt to make it more digestible for the review process. A couple of
things to note:

1. The two patches that interact with existing code in a substantial way
are:
YARN-2236 <https://issues.apache.org/jira/browse/YARN-2236> - This patch adds
the cache uploader service to the node manager.
MAPREDUCE-5951 <https://issues.apache.org/jira/browse/MAPREDUCE-5951> - This
patch adds support for the shared cache at the MapReduce layer, allowing jobs
to cache job jars, lib jars, files and archives.

2. If you would like to try out the entire feature there is a "big bang"
patch in YARN-1492 and instructions on how to set it up in a comment on the
issue here
<https://issues.apache.org/jira/browse/YARN-1492?focusedCommentId=14123617&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-14123617>
.

Please ping me if you have any questions or think there is something else I
could do to make reviewing easier.

Thanks!
Chris Trezzo


[jira] [Created] (MAPREDUCE-5951) Add support for the YARN Shared Cache

2014-06-30 Thread Chris Trezzo (JIRA)
Chris Trezzo created MAPREDUCE-5951:
---

 Summary: Add support for the YARN Shared Cache
 Key: MAPREDUCE-5951
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5951
 Project: Hadoop Map/Reduce
  Issue Type: New Feature
Reporter: Chris Trezzo
Assignee: Chris Trezzo


Implement the necessary changes so that the MapReduce application can leverage 
the new YARN shared cache (i.e. YARN-1492).
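
As an illustration, opting a job in might look like the sketch below on the
MapReduce side; the mapreduce.job.sharedcache.mode property and its value
format are taken from the shared-cache documentation of later releases and
are an assumption here:

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.mapreduce.Job;

public class SharedCacheJobSketch {
  public static void main(String[] args) throws Exception {
    Configuration conf = new Configuration();
    // Ask the framework to check the shared cache for these resource
    // categories before uploading them to the staging directory.
    conf.set("mapreduce.job.sharedcache.mode",
        "jobjar,libjars,files,archives");
    Job job = Job.getInstance(conf, "shared-cache-example");
    System.out.println(
        job.getConfiguration().get("mapreduce.job.sharedcache.mode"));
  }
}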



--
This message was sent by Atlassian JIRA
(v6.2#6252)