Re: [VOTE] Merge yarn-native-services branch into trunk

2017-09-01 Thread Wangda Tan
+1 (binding). I have used the YARN service assembly to run different kinds
of jobs (for example, distributed TensorFlow), and it makes it really easy
for end users to run jobs on YARN.

Thanks to the whole team for the great job!

Best,
Wangda


On Fri, Sep 1, 2017 at 3:33 PM, Gour Saha  wrote:

> +1 (non-binding)
>
> On 9/1/17, 11:58 AM, "Billie Rinaldi"  wrote:
>
> >+1 (non-binding)
> >
> >On Thu, Aug 31, 2017 at 8:33 PM, Jian He  wrote:
> >
> >> Hi All,
> >>
> >> I would like to call a vote for merging yarn-native-services to trunk.
> >> The vote will run for 7 days as usual.
> >>
> >> At a high level, the following are the key features implemented.
> >> - YARN-5079[1]. A native YARN framework (ApplicationMaster) to migrate and
> >> orchestrate existing services to YARN, either docker or non-docker based.
> >> - YARN-4793[2]. A REST API server for users to deploy a service via a
> >> simple JSON spec
> >> - YARN-4757[3]. Extending today's service registry with a simple DNS
> >> service to enable users to discover services deployed on YARN
> >> - YARN-6419[4]. UI support for native-services on the new YARN UI
> >> All these new services are optional and are sitting outside of the
> >> existing system, and have no impact on existing system if disabled.
> >>
> >> Special thanks to a team of folks who worked hard towards this: Billie
> >> Rinaldi, Gour Saha, Vinod Kumar Vavilapalli, Jonathan Maron, Rohith Sharma
> >> K S, Sunil G, Akhil PB. This effort would not have been possible without
> >> their ideas and hard work.
> >>
> >> Thanks,
> >> Jian
> >>
> >> [1] https://issues.apache.org/jira/browse/YARN-5079
> >> [2] https://issues.apache.org/jira/browse/YARN-4793
> >> [3] https://issues.apache.org/jira/browse/YARN-4757
> >> [4] https://issues.apache.org/jira/browse/YARN-6419
> >>
> >>
> >> -
> >> To unsubscribe, e-mail: mapreduce-dev-unsubscr...@hadoop.apache.org
> >> For additional commands, e-mail: mapreduce-dev-h...@hadoop.apache.org
> >>
> >>


[jira] [Resolved] (YARN-7126) Create introductory site documentation for YARN native services

2017-09-01 Thread Jian He (JIRA)

 [ https://issues.apache.org/jira/browse/YARN-7126?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jian He resolved YARN-7126.
---
   Resolution: Fixed
 Hadoop Flags: Reviewed
Fix Version/s: yarn-native-services

Committed to yarn-native-services, thanks Gour and Billie!

> Create introductory site documentation for YARN native services
> ---
>
> Key: YARN-7126
> URL: https://issues.apache.org/jira/browse/YARN-7126
> Project: Hadoop YARN
> Issue Type: Sub-task
> Reporter: Gour Saha
> Assignee: Gour Saha
> Fix For: yarn-native-services
>
> Attachments: YARN-7126-yarn-native-services.001.patch, 
> YARN-7126-yarn-native-services.002.patch, 
> YARN-7126-yarn-native-services.003.patch, 
> YARN-7126-yarn-native-services.004.patch
>
>




--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: yarn-dev-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-dev-h...@hadoop.apache.org



Re: [VOTE] Merge yarn-native-services branch into trunk

2017-09-01 Thread Gour Saha
+1 (non-binding)

On 9/1/17, 11:58 AM, "Billie Rinaldi"  wrote:

>+1 (non-binding)
>
>On Thu, Aug 31, 2017 at 8:33 PM, Jian He  wrote:
>
>> Hi All,
>>
>> I would like to call a vote for merging yarn-native-services to trunk.
>> The vote will run for 7 days as usual.
>>
>> At a high level, the following are the key features implemented.
>> - YARN-5079[1]. A native YARN framework (ApplicationMaster) to migrate and
>> orchestrate existing services to YARN, either docker or non-docker based.
>> - YARN-4793[2]. A REST API server for users to deploy a service via a
>> simple JSON spec
>> - YARN-4757[3]. Extending today's service registry with a simple DNS
>> service to enable users to discover services deployed on YARN
>> - YARN-6419[4]. UI support for native-services on the new YARN UI
>> All these new services are optional and are sitting outside of the
>> existing system, and have no impact on existing system if disabled.
>>
>> Special thanks to a team of folks who worked hard towards this: Billie
>> Rinaldi, Gour Saha, Vinod Kumar Vavilapalli, Jonathan Maron, Rohith Sharma
>> K S, Sunil G, Akhil PB. This effort would not have been possible without
>> their ideas and hard work.
>>
>> Thanks,
>> Jian
>>
>> [1] https://issues.apache.org/jira/browse/YARN-5079
>> [2] https://issues.apache.org/jira/browse/YARN-4793
>> [3] https://issues.apache.org/jira/browse/YARN-4757
>> [4] https://issues.apache.org/jira/browse/YARN-6419


2017-09-01 Hadoop 3 release status update

2017-09-01 Thread Andrew Wang
https://cwiki.apache.org/confluence/display/HADOOP/Hadoop+3+release+status+updates

2017-09-01

We're two weeks out from beta1; the focus is on blocker burndown.

Highlights:

   - S3Guard merged!
   - TSv2 alpha2 merged!
   - branch-3.0 has been cut after discussion on dev lists.

Red flags:

   - 10 blockers on the dashboard; closed and bumped some, but new ones
   appeared.
   - Still need to land YARN native services and fix some S3Guard doc
   issues for beta1.
   - Rolling upgrade JIRAs for YARN and HDFS are not making any visible
   progress.

Previously tracked beta1 blockers that have been resolved:

   - HADOOP-13363 (Upgrade to protobuf 3): I dropped this from beta1 since
   it's simply not going to happen in time.
   - YARN-7076: This was quickly resolved! Thanks Jian, Junping, Jason for
   the action.
   - YARN-7094 (Document that server-side graceful decom is currently not
   recommended): Patch committed!

beta1 blockers:

   - HADOOP-14826 (review S3 docs prior to 3.0.0-beta1): New blocker with
   S3Guard merged. Should just be a quick doc update.
   - HADOOP-14284 (Shade Guava everywhere): Agreement to shade yarn-client
   at HADOOP-14771. Shading hadoop-hdfs is still being discussed?
   - HADOOP-14771 (hadoop-client does not include hadoop-yarn-client):
   Patch up, needs review, waiting on Busbey.
   - YARN-5536 (Multiple format support (JSON, etc.) for exclude node file
   in NM graceful decommission with timeout): We're waiting on input from
   Junping.
   - MAPREDUCE-6941 (The default setting doesn't work for MapReduce job):
   Ray thinks this is a Won't Fix, waiting on Junping to confirm.
   - HADOOP-14238 (Rechecking Guava's object is not exposed to user-facing
   API): This relates to HADOOP-14771, I left a JIRA comment.

beta1 features:

   - Erasure coding
      - There are three must-dos. Two have patches, one might not be a
      must-do.
      - HDFS-11882 has been revved and reviewed, seems close
      - HDFS-11467 and HDFS-7859 are related, Sammi/Eddy/Kai are
      discussing, Sammi thinks we can still make beta1.
   - Addressing incompatible changes (YARN-6142 and HDFS-11096)
      - Sean has HDFS rolling upgrade scripts up, waiting on Ray to add
      some YARN/MR coverage too.
      - Need to do a final runthrough of the JACC reports for YARN and HDFS.
   - Classpath isolation (HADOOP-11656)
      - Sean has retriaged the subtasks and has been posting patches.
   - Compat guide (HADOOP-13714)
      - New patch is up, but needs review. Daniel asked Chris Douglas and
      Steve Loughran.
   - YARN native services
      - Jian sent out the merge vote
   - TSv2 alpha 2
      - This was merged, no problems thus far :)

GA features:

   - Resource profiles (Wangda Tan)
      - Merge vote was sent out. Since branch-3.0 has been cut, this can be
      merged to trunk (3.1.0) and then backported once we've completed
      testing.
   - HDFS router-based federation (Chris Douglas)
      - This is like YARN federation, very separate and doesn't add new
      APIs, run in production at MSFT.
      - If it passes Cloudera internal integration testing, I'm fine
      putting this in for GA.
   - API-based scheduler configuration (Jonathan Hung)
      - Jonathan mentioned that his main goal is to get this in for 2.9.0,
      which seems likely to go out after 3.0.0 GA since there hasn't been
      any serious release planning yet. Jonathan said that delaying this
      until 3.1.0 is fine.


[jira] [Created] (YARN-7149) Cross-queue preemption sometimes starves an underserved queue

2017-09-01 Thread Eric Payne (JIRA)
Eric Payne created YARN-7149:


 Summary: Cross-queue preemption sometimes starves an underserved queue
 Key: YARN-7149
 URL: https://issues.apache.org/jira/browse/YARN-7149
 Project: Hadoop YARN
 Issue Type: Bug
 Components: capacity scheduler
 Affects Versions: 3.0.0-alpha3, 2.9.0
 Reporter: Eric Payne
 Assignee: Eric Payne


In branch-2 and trunk, I am consistently seeing some use cases where 
cross-queue preemption does not happen when it should. I do not see this in 
branch-2.8.

Use Case:
| | *Size* | *Minimum Container Size* |
|MyCluster | 20 GB | 0.5 GB |

| *Queue Name* | *Capacity* | *Absolute Capacity* | *Minimum User Limit Percent (MULP)* | *User Limit Factor (ULF)* |
|Q1 | 50% = 10 GB | 100% = 20 GB | 10% = 1 GB | 2.0 |
|Q2 | 50% = 10 GB | 100% = 20 GB | 10% = 1 GB | 2.0 |

- {{User1}} launches {{App1}} in {{Q1}} and consumes all resources (20 GB)
- {{User2}} launches {{App2}} in {{Q2}} and requests 10 GB
- _Note: containers are 0.5 GB._
- Preemption monitor kills 2 containers (equals 1 GB) from {{App1}} in {{Q1}}.
- Capacity Scheduler assigns 2 containers (equals 1 GB) to {{App2}} in {{Q2}}.
- _No more containers are ever preempted, even though {{Q2}} is far underserved_
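
The numbers in this use case can be checked with a short calculation. The following is only a rough Python sketch, not scheduler code: every variable name is made up for illustration, and the "expected" amount rests on the assumption that preemption should bring Q1 back toward its 10 GB guarantee, bounded by what Q2 requested.

```python
# Illustrative arithmetic only; these names are hypothetical, not YARN APIs.
CLUSTER_GB = 20.0
CONTAINER_GB = 0.5

# Per-queue figures from the table (Q1 and Q2 are configured identically).
queue_guarantee_gb = 0.50 * CLUSTER_GB       # Capacity: 50% = 10 GB
mulp_gb = 0.10 * queue_guarantee_gb          # MULP: 10% = 1 GB

q1_used_gb = 20.0        # App1 holds the whole cluster
q2_requested_gb = 10.0   # App2 asks for 10 GB

# Assumption: preemption should pull Q1 back toward its guarantee,
# bounded by what Q2 is actually asking for.
q1_over_guarantee_gb = q1_used_gb - queue_guarantee_gb
expected_preempt_gb = min(q1_over_guarantee_gb, q2_requested_gb)
expected_containers = int(expected_preempt_gb / CONTAINER_GB)

# Observed in the report: only 2 containers are ever preempted.
observed_containers = 2
observed_preempt_gb = observed_containers * CONTAINER_GB
shortfall_gb = expected_preempt_gb - observed_preempt_gb

print(f"expected: {expected_containers} containers ({expected_preempt_gb} GB)")
print(f"observed: {observed_containers} containers ({observed_preempt_gb} GB)")
print(f"Q2 shortfall: {shortfall_gb} GB")
```

Under that assumption, the observed 1 GB is far below what Q2 would need, and it happens to equal the 1 GB MULP figure in the table; the report itself does not say whether that is the cause.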







Re: Apache Hadoop 2.8.2 Release Plan

2017-09-01 Thread Junping Du
This issue (HADOOP-14439) was off my radar because it is marked as Minor 
priority. If my understanding is correct, there is a trade-off here between 
security and backward compatibility. IMO, security generally takes priority 
over backward compatibility, especially since 2.8.0 is still a non-production 
release. I think we should skip this for 2.8.2 as long as it doesn't break 
compatibility with 2.7.x. Thoughts?

Thanks,

Junping

From: larry mccay 
Sent: Friday, September 1, 2017 10:55 AM
To: Steve Loughran
Cc: common-...@hadoop.apache.org; hdfs-...@hadoop.apache.org; 
mapreduce-...@hadoop.apache.org; yarn-dev@hadoop.apache.org
Subject: Re: Apache Hadoop 2.8.2 Release Plan

If we do "fix" this in 2.8.2 we should seriously consider not doing so in
3.0.
This is a very poor practice.

I can see an argument for backward compatibility in 2.8.x line though.

On Fri, Sep 1, 2017 at 1:41 PM, Steve Loughran 
wrote:

> One thing we need to consider is
>
> HADOOP-14439: regression: secret stripping from S3x URIs breaks some
> downstream code
>
> Hadoop 2.8 has a best-effort attempt to strip out secrets from the
> toString() value of an s3a or s3n path where someone has embedded them in
> the URI; this has caused problems in some uses, specifically: when people
> use secrets this way (bad) and assume that you can round trip paths to
> string and back
>
> Should we fix this? If so, Hadoop 2.8.2 is the time to do it
>
>
> > On 1 Sep 2017, at 11:14, Junping Du  wrote:
> >
> > HADOOP-14814 get committed and HADOOP-9747 get push out to 2.8.3, so we
> are clean on blocker/critical issues now.
> > I finish practice of going through JACC report and no more incompatible
> public API changes get found between 2.8.2 and 2.7.4. Also I check commit
> history and fixed 10+ commits which are missing from branch-2.8.2 for some
> reason. So, the current branch-2.8.2 should be good to go for RC stage, and
> I will kick off our first RC tomorrow.
> > In the meanwhile, please don't land any commits to branch-2.8.2 since
> now. If some issues really belong to blocker, please ping me on the JIRA
> before doing any commits. branch-2.8 is still open for landing. Thanks for
> your cooperation!
> >
> >
> > Thanks,
> >
> > Junping
> >
> > 
> > From: Junping Du 
> > Sent: Wednesday, August 30, 2017 12:35 AM
> > To: Brahma Reddy Battula; common-...@hadoop.apache.org;
> hdfs-...@hadoop.apache.org; mapreduce-...@hadoop.apache.org;
> yarn-dev@hadoop.apache.org
> > Subject: Re: Apache Hadoop 2.8.2 Release Plan
> >
> > Thanks Brahma for comment on this thread. To be clear, I always update
> branch version just before RC kicking off.
> >
> > For 2.8.2 release, I don't have plan to involve big top or other
> third-party test tools. As always, we will rely on test/verify efforts from
> community especially from large deployed production cluster - as far as I
> know,  there are already several companies. like: Yahoo!, Alibaba, etc.
> already deploy 2.8 release in large production clusters for months which
> give me more confidence on 2.8.2.
> >
> >
> > Here is more update on 2.8.2 release:
> >
> > Blocker issues:
> >
> >-  A new blocker YARN-7076 get reported and fixed by Jian He through
> last weekend.
> >
> >-  Another new blocker - HADOOP-14814 get identified from my latest
> jdiff run against 2.7.4. The simple fix on an incompatible API change
> should get commit soon.
> >
> >
> > Critical issues:
> >
> >-  YARN-7083 already get committed. Thanks Jason for reporting the
> issue and delivering the fix.
> >
> >-  YARN-6091 get push out from 2.8.2 as issue is not a regression and
> pending for a while.
> >
> >-  Daryn is actively working on HADOOP-9747 for a while, and the
> patch are getting close to be committed. However, according to Daryn, the
> patch seems to cause some regression in some corner cases in secured
> environment (Storm auto tgt, etc.). May need some additional watch/review
> on this JIRA's fixes.
> >
> >
> >
> > My monitoring JACC report between 2.8.2 and 2.7.4 will get finish
> tomorrow. If everything is going smoothly, I am planning to kick off RC0
> around holiday (this weekend).
> >
> >
> >
> > Thanks,
> >
> >
> >
> > Junping
> >
> >
> >
> > 
> > From: Brahma Reddy Battula 
> > Sent: Tuesday, August 29, 2017 8:42 AM
> > To: Junping Du; common-...@hadoop.apache.org; hdfs-...@hadoop.apache.org;
> mapreduce-...@hadoop.apache.org; yarn-dev@hadoop.apache.org
> > Subject: Re: Apache Hadoop 2.8.2 Release Plan
> >
> >
> > Hi All
> >
> > Update on 2.8.2 release status
> > we are down to 3 critical issues ( YARN-6091,YARN-7083,HADOOP-9747),all
> are patch available and closer to commit.
> > Junping is closing tracking this.
> >
> > Todo:
> >
> > 1) Update pom.xml ..?  currently it's with 2.8.3
> > 

Heads up: branch-3.0 has been cut, commit here for 3.0.0-beta1

2017-09-01 Thread Andrew Wang
Hi folks,

I've proceeded with the plan from our earlier thread and cut branch-3.0.
The branches and maven versions are now set as follows:

trunk: 3.1.0-SNAPSHOT
branch-3.0: 3.0.0-beta1-SNAPSHOT

branch-2's are still the same.

This means if you want to commit something for beta1, commit it to
branch-3.0 too. Excepting features already committed for beta1 (e.g. EC,
native services, S3Guard, TSv2, YARN federation), please treat branch-3.0
the same as a maintenance release branch.

I'm planning to cut the release branch branch-3.0.0-beta1 just before RC.
If you have anything that we pushed out of 3.0.0-beta1 and that is waiting
for 3.0.0 GA, please hold it in trunk until after we release 3.0.0-beta1
(which should be relatively soon).

Best,
Andrew


Re: [VOTE] Merge yarn-native-services branch into trunk

2017-09-01 Thread Billie Rinaldi
+1 (non-binding)

On Thu, Aug 31, 2017 at 8:33 PM, Jian He  wrote:

> Hi All,
>
> I would like to call a vote for merging yarn-native-services to trunk. The
> vote will run for 7 days as usual.
>
> At a high level, the following are the key features implemented.
> - YARN-5079[1]. A native YARN framework (ApplicationMaster) to migrate and
> orchestrate existing services to YARN, either docker or non-docker based.
> - YARN-4793[2]. A REST API server for users to deploy a service via a
> simple JSON spec
> - YARN-4757[3]. Extending today's service registry with a simple DNS
> service to enable users to discover services deployed on YARN
> - YARN-6419[4]. UI support for native-services on the new YARN UI
> All these new services are optional and are sitting outside of the
> existing system, and have no impact on existing system if disabled.
>
> Special thanks to a team of folks who worked hard towards this: Billie
> Rinaldi, Gour Saha, Vinod Kumar Vavilapalli, Jonathan Maron, Rohith Sharma
> K S, Sunil G, Akhil PB. This effort would not have been possible without
> their ideas and hard work.
>
> Thanks,
> Jian
>
> [1] https://issues.apache.org/jira/browse/YARN-5079
> [2] https://issues.apache.org/jira/browse/YARN-4793
> [3] https://issues.apache.org/jira/browse/YARN-4757
> [4] https://issues.apache.org/jira/browse/YARN-6419


Re: [DISCUSS] Branches and versions for Hadoop 3

2017-09-01 Thread Andrew Wang
Hi folks,

We've landed two of our beta1 features, S3Guard and TSv2, into trunk. Jian
just sent out the vote for our remaining beta1 feature, YARN native
services, but I think it's time to branch to unblock the resource profiles
merge to 3.1.

I'll cut just branch-3.0 for now, since we don't have anything urgent that
needs to go into 3.0.0-beta1 vs. 3.0.0 GA.

Cheers,
Andrew

On Tue, Aug 29, 2017 at 11:21 PM, varunsax...@apache.org <
varun.saxena.apa...@gmail.com> wrote:

> Hi Andrew,
>
> We have completed the merge of TSv2 to trunk.
> You can now go ahead with the branching.
>
> Regards,
> Varun Saxena.
>
> On Tue, Aug 29, 2017 at 11:35 PM, Andrew Wang 
> wrote:
>
>> Sure. Ping me when the TSv2 goes in, and I can take care of branching.
>>
>> We're still waiting on the native services and S3Guard merges, but I
>> don't want to hold branching to the last minute.
>>
>> On Tue, Aug 29, 2017 at 10:51 AM, Vrushali C 
>> wrote:
>>
>>> Hi Andrew,
>>> As Rohith mentioned, if you are good with it, from the TSv2 side, we are
>>> ready to go for merge tonight itself (Pacific time)  right after the voting
>>> period ends. Varun Saxena has been diligently rebasing up until now so most
>>> likely our merge should be reasonably straightforward.
>>>
>>> @Wangda: your resource profile vote ends tomorrow, could we please
>>> coordinate our merges?
>>>
>>> thanks
>>> Vrushali
>>>
>>>
>>> On Mon, Aug 28, 2017 at 10:45 PM, Rohith Sharma K S <
>>> rohithsharm...@apache.org> wrote:
>>>
>>>> On 29 August 2017 at 06:24, Andrew Wang wrote:
>>>>
>>>> > So far I've seen no -1's to the branching proposal, so I plan to
>>>> > execute this tomorrow unless there's further feedback.
>>>> >
>>>> For on going branch merge threads i.e TSv2, voting will be closing
>>>> tomorrow. Does it end up in merging into trunk(3.1.0-SNAPSHOT) and
>>>> branch-3.0(3.0.0-beta1-SNAPSHOT) ? If so, would you be able to wait for
>>>> couple of more days before creating branch-3.0 so that TSv2 branch merge
>>>> would be done directly to trunk?
>>>>
>>>> >
>>>> > Regarding the above discussion, I think Jason and I have essentially
>>>> > the same opinion.
>>>> >
>>>> > I hope that keeping trunk a release branch means a higher bar for
>>>> > merges and code review in general. In the past, I've seen some patches
>>>> > committed to trunk-only as a way of passing responsibility to a future
>>>> > user or reviewer. That doesn't help anyone; patches should be committed
>>>> > with the intent of running them in production.
>>>> >
>>>> > I'd also like to repeat the above thanks to the many, many
>>>> > contributors who've helped with release improvements. Allen's work on
>>>> > create-release and automated changes and release notes were essential,
>>>> > as was Xiao's work on LICENSE and NOTICE files. I'm also looking
>>>> > forward to Marton's site improvements, which addresses one of the
>>>> > remaining sore spots in the release process.
>>>> >
>>>> > Things have gotten smoother with each alpha we've done over the last
>>>> > year, and it's a testament to everyone's work that we have a good
>>>> > probability of shipping beta and GA later this year.
>>>> >
>>>> > Cheers,
>>>> > Andrew
>>>> >
>>>> >
>>>>
>>>
>>>
>>
>


[jira] [Created] (YARN-7148) TestLogsCLI fails in trunk and branch-2

2017-09-01 Thread Xuan Gong (JIRA)
Xuan Gong created YARN-7148:
---

 Summary: TestLogsCLI fails in trunk and branch-2
 Key: YARN-7148
 URL: https://issues.apache.org/jira/browse/YARN-7148
 Project: Hadoop YARN
 Issue Type: Sub-task
 Reporter: Xuan Gong
 Assignee: Xuan Gong


The test case failures (TestLogsCLI) should be related to YARN-6877.






Re: Apache Hadoop 2.8.2 Release Plan

2017-09-01 Thread larry mccay
If we do "fix" this in 2.8.2 we should seriously consider not doing so in
3.0.
This is a very poor practice.

I can see an argument for backward compatibility in the 2.8.x line though.

On Fri, Sep 1, 2017 at 1:41 PM, Steve Loughran 
wrote:

> One thing we need to consider is
>
> HADOOP-14439: regression: secret stripping from S3x URIs breaks some
> downstream code
>
> Hadoop 2.8 has a best-effort attempt to strip out secrets from the
> toString() value of an s3a or s3n path where someone has embedded them in
> the URI; this has caused problems in some uses, specifically: when people
> use secrets this way (bad) and assume that you can round trip paths to
> string and back
>
> Should we fix this? If so, Hadoop 2.8.2 is the time to do it
>
>
> > On 1 Sep 2017, at 11:14, Junping Du  wrote:
> >
> > HADOOP-14814 get committed and HADOOP-9747 get push out to 2.8.3, so we
> are clean on blocker/critical issues now.
> > I finish practice of going through JACC report and no more incompatible
> public API changes get found between 2.8.2 and 2.7.4. Also I check commit
> history and fixed 10+ commits which are missing from branch-2.8.2 for some
> reason. So, the current branch-2.8.2 should be good to go for RC stage, and
> I will kick off our first RC tomorrow.
> > In the meanwhile, please don't land any commits to branch-2.8.2 since
> now. If some issues really belong to blocker, please ping me on the JIRA
> before doing any commits. branch-2.8 is still open for landing. Thanks for
> your cooperation!
> >
> >
> > Thanks,
> >
> > Junping
> >
> > 
> > From: Junping Du 
> > Sent: Wednesday, August 30, 2017 12:35 AM
> > To: Brahma Reddy Battula; common-...@hadoop.apache.org;
> hdfs-...@hadoop.apache.org; mapreduce-...@hadoop.apache.org;
> yarn-dev@hadoop.apache.org
> > Subject: Re: Apache Hadoop 2.8.2 Release Plan
> >
> > Thanks Brahma for comment on this thread. To be clear, I always update
> branch version just before RC kicking off.
> >
> > For 2.8.2 release, I don't have plan to involve big top or other
> third-party test tools. As always, we will rely on test/verify efforts from
> community especially from large deployed production cluster - as far as I
> know,  there are already several companies. like: Yahoo!, Alibaba, etc.
> already deploy 2.8 release in large production clusters for months which
> give me more confidence on 2.8.2.
> >
> >
> > Here is more update on 2.8.2 release:
> >
> > Blocker issues:
> >
> >-  A new blocker YARN-7076 get reported and fixed by Jian He through
> last weekend.
> >
> >-  Another new blocker - HADOOP-14814 get identified from my latest
> jdiff run against 2.7.4. The simple fix on an incompatible API change
> should get commit soon.
> >
> >
> > Critical issues:
> >
> >-  YARN-7083 already get committed. Thanks Jason for reporting the
> issue and delivering the fix.
> >
> >-  YARN-6091 get push out from 2.8.2 as issue is not a regression and
> pending for a while.
> >
> >-  Daryn is actively working on HADOOP-9747 for a while, and the
> patch are getting close to be committed. However, according to Daryn, the
> patch seems to cause some regression in some corner cases in secured
> environment (Storm auto tgt, etc.). May need some additional watch/review
> on this JIRA's fixes.
> >
> >
> >
> > My monitoring JACC report between 2.8.2 and 2.7.4 will get finish
> tomorrow. If everything is going smoothly, I am planning to kick off RC0
> around holiday (this weekend).
> >
> >
> >
> > Thanks,
> >
> >
> >
> > Junping
> >
> >
> >
> > 
> > From: Brahma Reddy Battula 
> > Sent: Tuesday, August 29, 2017 8:42 AM
> > To: Junping Du; common-...@hadoop.apache.org; hdfs-...@hadoop.apache.org;
> mapreduce-...@hadoop.apache.org; yarn-dev@hadoop.apache.org
> > Subject: Re: Apache Hadoop 2.8.2 Release Plan
> >
> >
> > Hi All
> >
> > Update on 2.8.2 release status
> > we are down to 3 critical issues ( YARN-6091,YARN-7083,HADOOP-9747),all
> are patch available and closer to commit.
> > Junping is closing tracking this.
> >
> > Todo:
> >
> > 1) Update pom.xml ..?  currently it's with 2.8.3
> > https://github.com/apache/hadoop/blob/branch-2.8.2/pom.xml#L21
> > 2) Wiki
> is outdated, need to update the wiki..?
> > 3) As this is going to stable release,are we planing enable Big top for
> 2.8.2 testing Or Dynamometer testing (anybody from linked-in can help)..?
> >
> > @Junping Du,Please correct me,if I am wrong.
> >
> >
> > --Brahma Reddy Battula
> > 
> > From: Junping Du 
> > Sent: Monday, August 7, 2017 2:44 PM
> > To: common-...@hadoop.apache.org; hdfs-...@hadoop.apache.org;
> mapreduce-...@hadoop.apache.org; yarn-dev@hadoop.apache.org
> > Subject: Re: Apache Hadoop 2.8.2 

Re: Apache Hadoop 2.8.2 Release Plan

2017-09-01 Thread Steve Loughran
One thing we need to consider is

HADOOP-14439: regression: secret stripping from S3x URIs breaks some downstream 
code

Hadoop 2.8 makes a best-effort attempt to strip out secrets from the toString() 
value of an s3a or s3n path where someone has embedded them in the URI. This 
has caused problems in some uses, specifically when people use secrets this 
way (bad) and assume that you can round trip paths to string and back.

Should we fix this? If so, Hadoop 2.8.2 is the time to do it
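
To make the round-trip problem concrete, here is a minimal Python sketch. It is not Hadoop's code: `strip_secrets` is a hypothetical stand-in for the best-effort stripping described above, and the bucket name and credentials are invented for illustration.

```python
from urllib.parse import urlsplit, urlunsplit


def strip_secrets(uri: str) -> str:
    """Hypothetical helper: drop any user:password@ part from a URI."""
    parts = urlsplit(uri)
    host = parts.hostname or ""
    if parts.port is not None:
        host = f"{host}:{parts.port}"
    return urlunsplit((parts.scheme, host, parts.path, parts.query, parts.fragment))


# Made-up path with credentials embedded in the URI (the "bad" practice).
original = "s3a://ACCESSKEY:SECRETKEY@example-bucket/data/part-0000"

# What a log line or toString()-style rendering would show after stripping.
printed = strip_secrets(original)

# Downstream code that parses the printed form back no longer has the secret.
round_tripped = urlsplit(printed)

print(printed)
print("secret before:", urlsplit(original).password)
print("secret after: ", round_tripped.password)
```

Code that builds a new path from the printed string ends up without credentials, which is the kind of breakage the JIRA describes; keeping the secret in the string form would avoid that at the cost of exposing it wherever the path is printed.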


> On 1 Sep 2017, at 11:14, Junping Du  wrote:
> 
> HADOOP-14814 get committed and HADOOP-9747 get push out to 2.8.3, so we are 
> clean on blocker/critical issues now.
> I finish practice of going through JACC report and no more incompatible 
> public API changes get found between 2.8.2 and 2.7.4. Also I check commit 
> history and fixed 10+ commits which are missing from branch-2.8.2 for some 
> reason. So, the current branch-2.8.2 should be good to go for RC stage, and I 
> will kick off our first RC tomorrow.
> In the meanwhile, please don't land any commits to branch-2.8.2 since now. If 
> some issues really belong to blocker, please ping me on the JIRA before doing 
> any commits. branch-2.8 is still open for landing. Thanks for your 
> cooperation!
> 
> 
> Thanks,
> 
> Junping
> 
> 
> From: Junping Du 
> Sent: Wednesday, August 30, 2017 12:35 AM
> To: Brahma Reddy Battula; common-...@hadoop.apache.org; 
> hdfs-...@hadoop.apache.org; mapreduce-...@hadoop.apache.org; 
> yarn-dev@hadoop.apache.org
> Subject: Re: Apache Hadoop 2.8.2 Release Plan
> 
> Thanks Brahma for comment on this thread. To be clear, I always update branch 
> version just before RC kicking off.
> 
> For 2.8.2 release, I don't have plan to involve big top or other third-party 
> test tools. As always, we will rely on test/verify efforts from community 
> especially from large deployed production cluster - as far as I know,  there 
> are already several companies. like: Yahoo!, Alibaba, etc. already deploy 2.8 
> release in large production clusters for months which give me more confidence 
> on 2.8.2.
> 
> 
> Here is more update on 2.8.2 release:
> 
> Blocker issues:
> 
>-  A new blocker YARN-7076 get reported and fixed by Jian He through last 
> weekend.
> 
>-  Another new blocker - HADOOP-14814 get identified from my latest jdiff 
> run against 2.7.4. The simple fix on an incompatible API change should get 
> commit soon.
> 
> 
> Critical issues:
> 
>-  YARN-7083 already get committed. Thanks Jason for reporting the issue 
> and delivering the fix.
> 
>-  YARN-6091 get push out from 2.8.2 as issue is not a regression and 
> pending for a while.
> 
>-  Daryn is actively working on HADOOP-9747 for a while, and the patch are 
> getting close to be committed. However, according to Daryn, the patch seems 
> to cause some regression in some corner cases in secured environment (Storm 
> auto tgt, etc.). May need some additional watch/review on this JIRA's fixes.
> 
> 
> 
> My monitoring JACC report between 2.8.2 and 2.7.4 will get finish tomorrow. 
> If everything is going smoothly, I am planning to kick off RC0 around holiday 
> (this weekend).
> 
> 
> 
> Thanks,
> 
> 
> 
> Junping
> 
> 
> 
> 
> From: Brahma Reddy Battula 
> Sent: Tuesday, August 29, 2017 8:42 AM
> To: Junping Du; common-...@hadoop.apache.org; hdfs-...@hadoop.apache.org; 
> mapreduce-...@hadoop.apache.org; yarn-dev@hadoop.apache.org
> Subject: Re: Apache Hadoop 2.8.2 Release Plan
> 
> 
> Hi All
> 
> Update on 2.8.2 release status
> we are down to 3 critical issues ( YARN-6091,YARN-7083,HADOOP-9747),all are 
> patch available and closer to commit.
> Junping is closing tracking this.
> 
> Todo:
> 
> 1) Update pom.xml ..?  currently it's with 2.8.3
> https://github.com/apache/hadoop/blob/branch-2.8.2/pom.xml#L21
> 2) Wiki is 
> outdated, need to update the wiki..?
> 3) As this is going to stable release,are we planing enable Big top for 2.8.2 
> testing Or Dynamometer testing (anybody from linked-in can help)..?
> 
> @Junping Du,Please correct me,if I am wrong.
> 
> 
> --Brahma Reddy Battula
> 
> From: Junping Du 
> Sent: Monday, August 7, 2017 2:44 PM
> To: common-...@hadoop.apache.org; hdfs-...@hadoop.apache.org; 
> mapreduce-...@hadoop.apache.org; yarn-dev@hadoop.apache.org
> Subject: Re: Apache Hadoop 2.8.2 Release Plan
> 
> Hello community,
>Here is a quick update on status for 2.8.2:
>- We are 0 blockers now!
>- Still 9 critical issues, 8 of them are Patch Available and with actively 
> working.
>For details of pending blocker/critical issues, please refer: 
> https://s.apache.org/JM5x
> 

Apache Hadoop qbt Report: trunk+JDK8 on Linux/x86

2017-09-01 Thread Apache Jenkins Server
For more details, see 
https://builds.apache.org/job/hadoop-qbt-trunk-java8-linux-x86/510/

[Aug 31, 2017 9:23:17 AM] (yqlin) HDFS-12317. HDFS metrics render error in the 
page of Github. Contributed
[Aug 31, 2017 1:12:01 PM] (sunilg) YARN-7116. CapacityScheduler Web UI: Queue's 
AM usage is always show on
[Aug 31, 2017 4:35:01 PM] (templedf) YARN-6780. ResourceWeights.toString() 
cleanup (Contributed by weiyuan
[Aug 31, 2017 10:05:41 PM] (subu) YARN-7095. Federation: routing 
getNode/getNodes/getMetrics REST
[Aug 31, 2017 11:41:43 PM] (junping_du) YARN-6877. Create an abstract log 
reader for extendability. Contributed
[Sep 1, 2017 2:06:49 AM] (aw) HADOOP-14364. refresh changelog/release notes 
with newer Apache Yetus
[Sep 1, 2017 2:39:31 AM] (aw) YARN-6721. container-executor should have stack 
checking
[Sep 1, 2017 4:10:52 AM] (aw) HADOOP-14781. Clarify that HADOOP_CONF_DIR 
shouldn't actually be set in
[Sep 1, 2017 4:13:22 AM] (jzhuge) HADOOP-14824. Update ADLS SDK to 2.2.2 for 
MSI fix. Contributed by Atul
[Sep 1, 2017 5:36:56 AM] (liuml07) HDFS-12363. Possible NPE in
[Sep 1, 2017 5:59:16 AM] (bibinchundatt) YARN-7141. Move logging APIs to slf4j 
in timelineservice after ATSv2
[Sep 1, 2017 6:18:48 AM] (liuml07) HDFS-12380. Simplify dataQueue.wait 
condition logical operation in
[Sep 1, 2017 6:20:01 AM] (xiao) HDFS-12300. Audit-log delegation token related 
operations.




-1 overall


The following subsystems voted -1:
findbugs unit


The following subsystems voted -1 but
were configured to be filtered/ignored:
cc checkstyle javac javadoc pylint shellcheck shelldocs whitespace


The following subsystems are considered long running:
(runtime bigger than 1h  0m  0s)
unit


Specific tests:

FindBugs :

   module:hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager
   Hard coded reference to an absolute pathname in org.apache.hadoop.yarn.server.nodemanager.containermanager.linux.runtime.DockerLinuxContainerRuntime.launchContainer(ContainerRuntimeContext) At DockerLinuxContainerRuntime.java:[line 490] 

Failed junit tests :

   hadoop.security.TestShellBasedUnixGroupsMapping 
   hadoop.hdfs.TestReconstructStripedFile 
   hadoop.hdfs.TestLeaseRecoveryStriped 
   hadoop.hdfs.TestClientProtocolForPipelineRecovery 
   hadoop.hdfs.TestDFSStripedOutputStreamWithFailure050 
   hadoop.hdfs.server.blockmanagement.TestUnderReplicatedBlocks 
   hadoop.hdfs.server.datanode.TestDirectoryScanner 
   hadoop.hdfs.TestDFSStripedOutputStreamWithFailure170 
   hadoop.hdfs.TestDFSStripedOutputStreamWithFailure210 
   hadoop.hdfs.server.namenode.TestNamenodeCapacityReport 
   hadoop.hdfs.TestDFSStripedOutputStreamWithFailure030 
   hadoop.hdfs.server.blockmanagement.TestReconstructStripedBlocksWithRackAwareness 
   hadoop.hdfs.TestDFSStripedOutputStreamWithFailure080 
   hadoop.hdfs.TestModTime 
   hadoop.hdfs.TestDFSStripedOutputStreamWithFailure040 
   hadoop.hdfs.TestReadStripedFileWithDecoding 
   hadoop.hdfs.TestDFSStripedOutputStreamWithFailureWithRandomECPolicy 
   hadoop.hdfs.TestDFSStripedOutputStreamWithFailure100 
   hadoop.hdfs.TestDFSStripedOutputStreamWithFailure000 
   hadoop.fs.http.client.TestHttpFSWithHttpFSFileSystem 
   hadoop.yarn.server.resourcemanager.scheduler.capacity.TestIncreaseAllocationExpirer 
   hadoop.yarn.server.resourcemanager.scheduler.capacity.TestContainerAllocation 
   hadoop.yarn.server.TestContainerManagerSecurity 
   hadoop.yarn.client.cli.TestLogsCLI 
   hadoop.yarn.server.router.webapp.TestRouterWebServiceUtil 
   hadoop.mapreduce.v2.hs.webapp.TestHSWebApp 
   hadoop.mapreduce.security.ssl.TestEncryptedShuffle 
   hadoop.yarn.sls.TestReservationSystemInvariants 
   hadoop.yarn.sls.TestSLSRunner 

Timed out junit tests :

   org.apache.hadoop.hdfs.TestWriteReadStripedFile 
  

   cc:

       https://builds.apache.org/job/hadoop-qbt-trunk-java8-linux-x86/510/artifact/out/diff-compile-cc-root.txt  [4.0K]

   javac:

       https://builds.apache.org/job/hadoop-qbt-trunk-java8-linux-x86/510/artifact/out/diff-compile-javac-root.txt  [292K]

   checkstyle:

       https://builds.apache.org/job/hadoop-qbt-trunk-java8-linux-x86/510/artifact/out/diff-checkstyle-root.txt  [17M]

   pylint:

       https://builds.apache.org/job/hadoop-qbt-trunk-java8-linux-x86/510/artifact/out/diff-patch-pylint.txt  [20K]

   shellcheck:

       https://builds.apache.org/job/hadoop-qbt-trunk-java8-linux-x86/510/artifact/out/diff-patch-shellcheck.txt  [20K]

   shelldocs:

       https://builds.apache.org/job/hadoop-qbt-trunk-java8-linux-x86/510/artifact/out/diff-patch-shelldocs.txt  [12K]

   whitespace:

   

[jira] [Resolved] (YARN-7147) ATS1.5 crash due to OOM

2017-09-01 Thread Rohith Sharma K S (JIRA)

 [ https://issues.apache.org/jira/browse/YARN-7147?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Rohith Sharma K S resolved YARN-7147.
-
Resolution: Duplicate

Closing as duplicate!

> ATS1.5 crash due to OOM
> ---
>
> Key: YARN-7147
> URL: https://issues.apache.org/jira/browse/YARN-7147
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: timelineserver
>Reporter: Rohith Sharma K S
>Assignee: Rohith Sharma K S
> Attachments: Screen Shot - suspect-1.png, Screen Shot - suspect-2.png
>
>
> It is observed that in a production cluster, even though _app-cache-size_ is 
> set to a minimal value (less than 5), the ATS server goes down with OOM. The 
> _entity-group-fs-store.cache-store-class_ is configured with 
> MemoryTimelineStore, which is the default. The heap size configured for the 
> ATS daemon is 8GB. 
> This is because ATS parses the entity log file per domain and caches it. If 
> the domain has a lot of entity information, the in-memory cache store loads 
> all of it, which causes the OOM. After a restart, it caches the same domain 
> again and goes OOM again. 
> Possible ways to handle it are:
> # Threshold the number of entities loaded into the in-memory cache. This can 
> still lead to OOM if the data size is huge. 
> # Threshold based on the data size in the store. 
> We faced the first issue, where the number of entities is very large.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: yarn-dev-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-dev-h...@hadoop.apache.org



YARN javadoc failures Re: [DISCUSS] Branches and versions for Hadoop 3

2017-09-01 Thread Allen Wittenauer

> On Aug 28, 2017, at 9:58 AM, Allen Wittenauer wrote:
>   The automation only goes so far.  At least while investigating Yetus 
> bugs, I've seen more than enough blatantly and purposefully ignored errors and 
> warnings that I'm not convinced it will be effective. ("That javadoc compile 
> failure didn't come from my patch!"  Um, yes, yes it did.) PR for features 
> has greatly trumped code correctness for a few years now.


I'm psychic.

Looks like YARN-6877 is crashing JDK8 javadoc.  Maven stops processing 
and errors out before even giving a build error/success. Reverting the patch 
makes things work again. Anyway, Yetus caught it, warned about it continuously, 
but it was still committed.  





[jira] [Created] (YARN-7147) ATS1.5 crash due to OOM

2017-09-01 Thread Rohith Sharma K S (JIRA)
Rohith Sharma K S created YARN-7147:
---

 Summary: ATS1.5 crash due to OOM
 Key: YARN-7147
 URL: https://issues.apache.org/jira/browse/YARN-7147
 Project: Hadoop YARN
  Issue Type: Bug
  Components: timelineserver
Reporter: Rohith Sharma K S
Assignee: Rohith Sharma K S


It is observed that in a production cluster, even though _app-cache-size_ is 
set to a minimal value (less than 5), the ATS server goes down with OOM. The 
_entity-group-fs-store.cache-store-class_ is configured with 
MemoryTimelineStore, which is the default. The heap size configured for the 
ATS daemon is 8GB. 

This is because ATS parses the entity log file per domain and caches it. If the 
domain has a lot of entity information, the in-memory cache store loads all of 
it, which causes the OOM. After a restart, it caches the same domain again and 
goes OOM again. 

Possible ways to handle it are:
# Threshold the number of entities loaded into the in-memory cache. This can 
still lead to OOM if the data size is huge. 
# Threshold based on the data size in the store. 

We faced the first issue, where the number of entities is very large.
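
To illustrate the first option (capping the number of entities held in memory), here is a minimal sketch. It is not the ATS or MemoryTimelineStore code: the class name BoundedEntityCache and the maxEntities parameter are hypothetical, and the sketch assumes that evicting the least recently used entry is acceptable. As noted above, a count-based cap alone does not bound memory when individual entities are large.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/**
 * Hypothetical illustration only; not part of Hadoop or the timeline server.
 * An LRU map that never holds more than maxEntities entries.
 */
public class BoundedEntityCache<K, V> extends LinkedHashMap<K, V> {
    private static final long serialVersionUID = 1L;

    // Assumed threshold on the number of cached entities.
    private final int maxEntities;

    public BoundedEntityCache(int maxEntities) {
        // accessOrder = true makes iteration order least-recently-used first.
        super(16, 0.75f, true);
        if (maxEntities <= 0) {
            throw new IllegalArgumentException("maxEntities must be positive");
        }
        this.maxEntities = maxEntities;
    }

    @Override
    protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
        // Called by LinkedHashMap after each insertion; returning true
        // evicts the least recently used entry.
        return size() > maxEntities;
    }

    public static void main(String[] args) {
        BoundedEntityCache<String, String> cache = new BoundedEntityCache<>(2);
        cache.put("entity-1", "a");
        cache.put("entity-2", "b");
        cache.get("entity-1");      // touch entity-1 so entity-2 becomes eldest
        cache.put("entity-3", "c"); // exceeds the cap, evicts entity-2
        System.out.println(cache.keySet()); // prints [entity-1, entity-3]
    }
}
```

A size-based variant (the second option) would track an estimated byte count per entry instead of the entry count; that estimate is application-specific and is not shown here.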






Re: Apache Hadoop 2.8.2 Release Plan

2017-09-01 Thread Junping Du
HADOOP-14814 got committed and HADOOP-9747 got pushed out to 2.8.3, so we are 
clean on blocker/critical issues now.
I finished going through the JACC report and no more incompatible public 
API changes were found between 2.8.2 and 2.7.4. I also checked the commit 
history and fixed 10+ commits which were missing from branch-2.8.2 for some 
reason. So, the current branch-2.8.2 should be good to go for the RC stage, and 
I will kick off our first RC tomorrow.
In the meantime, please don't land any commits on branch-2.8.2 from now on. If 
an issue really is a blocker, please ping me on the JIRA before doing 
any commits. branch-2.8 is still open for landing. Thanks for your cooperation!


Thanks,

Junping


From: Junping Du 
Sent: Wednesday, August 30, 2017 12:35 AM
To: Brahma Reddy Battula; common-...@hadoop.apache.org; 
hdfs-...@hadoop.apache.org; mapreduce-...@hadoop.apache.org; 
yarn-dev@hadoop.apache.org
Subject: Re: Apache Hadoop 2.8.2 Release Plan

Thanks Brahma for commenting on this thread. To be clear, I always update the 
branch version just before kicking off an RC.

For the 2.8.2 release, I don't plan to involve Bigtop or other third-party 
test tools. As always, we will rely on test/verification efforts from the 
community, especially from large deployed production clusters. As far as I 
know, several companies, such as Yahoo! and Alibaba, have already been running 
the 2.8 release in large production clusters for months, which gives me more 
confidence in 2.8.2.


Here is a further update on the 2.8.2 release:

Blocker issues:

-  A new blocker, YARN-7076, was reported and fixed by Jian He over the last 
weekend.

-  Another new blocker, HADOOP-14814, was identified from my latest jdiff 
run against 2.7.4. The simple fix for an incompatible API change should be 
committed soon.


Critical issues:

-  YARN-7083 has already been committed. Thanks Jason for reporting the issue 
and delivering the fix.

-  YARN-6091 was pushed out of 2.8.2, as the issue is not a regression and 
has been pending for a while.

-  Daryn has been actively working on HADOOP-9747 for a while, and the patch 
is getting close to being committed. However, according to Daryn, the patch 
seems to cause some regressions in corner cases in secured environments (Storm 
auto tgt, etc.). The fixes on this JIRA may need some additional watching/review.



My JACC report run between 2.8.2 and 2.7.4 will finish tomorrow. If 
everything goes smoothly, I am planning to kick off RC0 around the holiday 
(this weekend).

Thanks,

Junping




From: Brahma Reddy Battula 
Sent: Tuesday, August 29, 2017 8:42 AM
To: Junping Du; common-...@hadoop.apache.org; hdfs-...@hadoop.apache.org; 
mapreduce-...@hadoop.apache.org; yarn-dev@hadoop.apache.org
Subject: Re: Apache Hadoop 2.8.2 Release Plan


Hi All

Update on 2.8.2 release status:
We are down to 3 critical issues (YARN-6091, YARN-7083, HADOOP-9747); all are 
Patch Available and close to commit.
Junping is closely tracking this.

Todo:

1) Update pom.xml? Currently it's at 2.8.3:
https://github.com/apache/hadoop/blob/branch-2.8.2/pom.xml#L21
2) The wiki is outdated; do we need to update it?
3) As this is going to be a stable release, are we planning to enable Bigtop 
for 2.8.2 testing, or Dynamometer testing (can anybody from LinkedIn help)?

@Junping Du, please correct me if I am wrong.


--Brahma Reddy Battula

From: Junping Du 
Sent: Monday, August 7, 2017 2:44 PM
To: common-...@hadoop.apache.org; hdfs-...@hadoop.apache.org; 
mapreduce-...@hadoop.apache.org; yarn-dev@hadoop.apache.org
Subject: Re: Apache Hadoop 2.8.2 Release Plan

Hello community,
Here is a quick update on the status of 2.8.2:
- We are at 0 blockers now!
- Still 9 critical issues; 8 of them are Patch Available and are being 
actively worked on.
For details of pending blocker/critical issues, please refer to: 
https://s.apache.org/JM5x

I am planning to cut the first RC in the week of Aug. 21st to give these 
critical issues a bit more time (~2 weeks) to get fixed. Let's work towards the 
first production GA release of Apache Hadoop 2.8. Let me know if you have any 
thoughts or comments.

Cheers,

Junping

From: Junping Du 
Sent: Monday, July 24, 2017 1:41 PM
To: common-...@hadoop.apache.org; hdfs-...@hadoop.apache.org; 
mapreduce-...@hadoop.apache.org; yarn-dev@hadoop.apache.org
Subject: Re:

I have done the change.

All committers,

  2.8.2 release is supposed to be a stable/production release for 
branch-2.8. For commits to go for 2.8.2 release (only important and low risk 
bug fixes), please commit to trunk,