[jira] [Created] (YARN-6898) RM node labels page should display total used resources of each label.

2017-07-27 Thread YunFan Zhou (JIRA)
YunFan Zhou created YARN-6898:
-

 Summary: RM node labels page should display total used resources of 
each label.
 Key: YARN-6898
 URL: https://issues.apache.org/jira/browse/YARN-6898
 Project: Hadoop YARN
  Issue Type: Improvement
  Components: scheduler
Reporter: YunFan Zhou









Re: Moved remaining 16 Jiras targeted for version 2.7.4 to 2.7.5

2017-07-27 Thread Konstantin Shvachko
And some more.

On Thu, Jul 27, 2017 at 5:11 PM, Konstantin Shvachko 
wrote:

> Thanks,
> --Konst
>


[jira] [Created] (YARN-6897) Refactoring RMWebServices by moving some util methods to RMWebAppUtil

2017-07-27 Thread Giovanni Matteo Fumarola (JIRA)
Giovanni Matteo Fumarola created YARN-6897:
--

 Summary: Refactoring RMWebServices by moving some util methods to 
RMWebAppUtil
 Key: YARN-6897
 URL: https://issues.apache.org/jira/browse/YARN-6897
 Project: Hadoop YARN
  Issue Type: Sub-task
Reporter: Giovanni Matteo Fumarola
Assignee: Giovanni Matteo Fumarola


In YARN-6896, the Router needs to use some methods already implemented in 
{{RMWebServices}}. This JIRA continues the work done in YARN-6634.
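As a rough illustration of the refactoring pattern (a minimal sketch; the 
helper below and its exact signature are hypothetical, not the actual patch):

{code}
import javax.servlet.http.HttpServletRequest;

import org.apache.hadoop.security.UserGroupInformation;

// Sketch of the refactoring pattern only; this helper and its signature are
// hypothetical, not the actual YARN-6897 change.
public final class RMWebAppUtil {
  private RMWebAppUtil() {
  }

  // Formerly a private helper inside RMWebServices; hoisted into this util
  // class so the federation Router (YARN-6896) can reuse it as well.
  public static UserGroupInformation getCallerUserGroupInformation(
      HttpServletRequest request, boolean usePrincipal) {
    String remoteUser = (usePrincipal && request.getUserPrincipal() != null)
        ? request.getUserPrincipal().getName()
        : request.getRemoteUser();
    return remoteUser == null
        ? null
        : UserGroupInformation.createRemoteUser(remoteUser);
  }
}
{code}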






Moved remaining 16 Jiras targeted for version 2.7.4 to 2.7.5

2017-07-27 Thread Konstantin Shvachko
Thanks,
--Konst


[jira] [Created] (YARN-6896) Federation: routing REST invocations transparently to multiple RMs

2017-07-27 Thread Giovanni Matteo Fumarola (JIRA)
Giovanni Matteo Fumarola created YARN-6896:
--

 Summary: Federation: routing REST invocations transparently to 
multiple RMs
 Key: YARN-6896
 URL: https://issues.apache.org/jira/browse/YARN-6896
 Project: Hadoop YARN
  Issue Type: Sub-task
Reporter: Giovanni Matteo Fumarola
Assignee: Giovanni Matteo Fumarola


This JIRA tracks the design/implementation of the layer for routing 
RMWebServicesProtocol requests to the appropriate RM(s) in a federated YARN 
cluster.
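
As a conceptual sketch of such a routing layer (all class and method names 
below are hypothetical, not the actual YARN-6896 design; the /ws/v1/cluster 
endpoints are the documented RM REST paths):

{code}
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

// Conceptual sketch of transparent REST routing in a federated cluster: the
// Router exposes the same REST surface as a single RM and forwards each call
// to the right sub-cluster RM. All names here are hypothetical.
class RestRouterSketch {
  private final Map<String, String> appToHomeRm; // appId -> home RM address
  private final List<String> allSubClusterRms;

  RestRouterSketch(Map<String, String> routing, List<String> rms) {
    this.appToHomeRm = routing;
    this.allSubClusterRms = rms;
  }

  // A per-application call is forwarded to the app's home sub-cluster RM.
  String urlForApp(String appId) {
    return "http://" + appToHomeRm.get(appId)
        + "/ws/v1/cluster/apps/" + appId;
  }

  // A cluster-wide call fans out to every sub-cluster RM; the Router merges
  // the responses before returning them to the client.
  List<String> urlsForClusterInfo() {
    return allSubClusterRms.stream()
        .map(rm -> "http://" + rm + "/ws/v1/cluster/info")
        .collect(Collectors.toList());
  }
}
{code}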






Re: [Vote] merge feature branch YARN-2915 (Federation) to trunk

2017-07-27 Thread Botong Huang
+1 (non-binding)

We have just deployed the latest bits (62f1ce2a3d9) from YARN-2915 in our
test cluster and ran multiple jobs. We confirm that Federation is working
e2e!

Our cluster setup: eight sub-clusters, each with one RM and four NM nodes.
One Router machine. SQL Server on Ubuntu is used as the FederationStateStore.

Cheers,

Botong

On Thu, Jul 27, 2017 at 2:30 PM, Carlo Aldo Curino 
wrote:

> +1
>
> Cheers,
> Carlo
>
> On Thu, Jul 27, 2017 at 12:45 PM, Arun Suresh  wrote:
>
> > +1
> >
> > Cheers
> > -Arun
> >
> > On Jul 25, 2017 8:24 PM, "Subru Krishnan"  wrote:
> >
> >> [...]

Re: About 2.7.4 Release

2017-07-27 Thread Konstantin Shvachko
Hey guys,

Looks like we are done with blockers for Apache Hadoop 2.7.4 release.
https://issues.apache.org/jira/issues/?filter=12340814

I just committed HDFS-11896, and decided not to wait for HDFS-11576; see the
JIRA comment.
Thanks Vinod for pointing out HDFS-11742. It was thoroughly tested on a small
cluster and Kihwal committed it last week, thanks.

Will start building initial RC. Please refrain from committing to
branch-2.7 for some time.

Thank you everybody for contributing.
--Konst

On Thu, Jul 20, 2017 at 12:07 PM, Vinod Kumar Vavilapalli <
vino...@apache.org> wrote:

> Thanks for taking 2.7.4 over Konstantin!
>
> Regarding rolling RC next week, I still see that there are 4 blocker /
> critical tickets targeted for 2.7.4:
> https://issues.apache.org/jira/issues/?jql=project%20in%20(HDFS%2C%20MAPREDUCE%2C%20HADOOP%2C%20YARN)%20AND%20priority%20in%20(Blocker%2C%20Critical)%20AND%20resolution%20%3D%20Unresolved%20AND%20%22Target%20Version%2Fs%22%20%3D%202.7.4
>
>
> We should get closure on them.
> https://issues.apache.org/jira/browse/HDFS-11742 definitely was something
> that was deemed a blocker for 2.8.2, not sure about 2.7.4.
>
> I’m ‘back’ - let me know if you need any help.
>
> Thanks
> +Vinod
>
> On Jul 13, 2017, at 5:45 PM, Konstantin Shvachko 
> wrote:
>
> Hi everybody.
>
> We have been doing some internal testing of Hadoop 2.7.4. The testing is
> going well.
> Did not find any major issues on our workloads.
> Used an internal tool called Dynamometer to check NameNode performance on
> real cluster traces. Good.
> Overall test cluster performance looks good.
> Some more testing is still going on.
>
> I plan to build an RC next week, if there are no objections.
>
> Thanks,
> --Konst
>
> On Thu, Jun 15, 2017 at 4:42 PM, Konstantin Shvachko  >
> wrote:
>
> Hey guys.
>
> An update on 2.7.4 progress.
> We are down to 4 blockers. There is some work remaining on those.
> https://issues.apache.org/jira/browse/HDFS-11896?filter=12340814
> Would be good if people could follow up on review comments.
>
> I looked through nightly Jenkins build results for 2.7.4 both on Apache
> Jenkins and internal.
> Some tests fail intermittently, but there are no consistent failures. I filed
> HDFS-11985 to track some of them.
> https://issues.apache.org/jira/browse/HDFS-11985
> I do not currently consider these failures as blockers. LMK if some of
> them are.
>
> We started internal testing of branch-2.7 on one of our smallish (100+
> nodes) test clusters.
> Will update on the results.
>
> There is a plan to enable BigTop for 2.7.4 testing.
>
> Akira, Brahma thank you for setting up a wiki page for 2.7.4 release.
> Thank you everybody for contributing to this effort.
>
> Regards,
> --Konstantin
>
>
> On Tue, May 30, 2017 at 12:08 AM, Akira Ajisaka 
> wrote:
>
> Sure.
> If you want to edit the wiki, please tell me your ASF confluence account.
>
> -Akira
>
> On 2017/05/30 15:31, Rohith Sharma K S wrote:
>
> A couple more JIRAs need to be backported for the 2.7.4 release. These will
> solve RM HA instability issues.
> https://issues.apache.org/jira/browse/YARN-5333
> https://issues.apache.org/jira/browse/YARN-5988
> https://issues.apache.org/jira/browse/YARN-6304
>
> I will raise JIRAs to backport them.
>
> @Akira, could you help to add these JIRAs to the wiki?
>
> Thanks & Regards
> Rohith Sharma K S
>
> On 29 May 2017 at 12:19, Akira Ajisaka  wrote:
>
> Created a page for 2.7.4 release.
>
> https://cwiki.apache.org/confluence/display/HADOOP/Hadoop+2.7.4
>
> If you want to edit this wiki, please ping me.
>
> Regards,
> Akira
>
>
> On 2017/05/23 4:42, Brahma Reddy Battula wrote:
>
> Hi Konstantin Shvachko
>
>
>
> How about creating a wiki page for the 2.7.4 release status, like the ones
> for 2.8 and trunk, at the following link?
>
>
> https://cwiki.apache.org/confluence/display/HADOOP
>
>
> 
> From: Konstantin Shvachko 
> Sent: Saturday, May 13, 2017 3:58 AM
> To: Akira Ajisaka
> Cc: Hadoop Common; Hdfs-dev; mapreduce-...@hadoop.apache.org;
> yarn-dev@hadoop.apache.org
> Subject: Re: About 2.7.4 Release
>
> Latest update on the links and filters. Here is the correct link for
> the
> filter:
> https://issues.apache.org/jira/secure/IssueNavigator.jspa?requestId=12340814
>
> Also updated: https://s.apache.org/Dzg4
>
> Had to do some Jira debugging. Sorry for the confusion.
>
> Thanks,
> --Konstantin
>
> On Wed, May 10, 2017 at 2:30 PM, Konstantin Shvachko <
> shv.had...@gmail.com>
> wrote:
>
> Hey Akira,
>
>
> I didn't have 

Re: [Vote] merge feature branch YARN-2915 (Federation) to trunk

2017-07-27 Thread Carlo Aldo Curino
+1

Cheers,
Carlo

On Thu, Jul 27, 2017 at 12:45 PM, Arun Suresh  wrote:

> +1
>
> Cheers
> -Arun
>
> On Jul 25, 2017 8:24 PM, "Subru Krishnan"  wrote:
>
>> [...]


[jira] [Resolved] (YARN-6793) Duplicated reservation in Fair Scheduler preemption

2017-07-27 Thread Yufei Gu (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-6793?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yufei Gu resolved YARN-6793.

Resolution: Duplicate

> Duplicated reservation in Fair Scheduler preemption 
> 
>
> Key: YARN-6793
> URL: https://issues.apache.org/jira/browse/YARN-6793
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: fairscheduler
>Affects Versions: 2.8.1, 3.0.0-alpha3
>Reporter: Yufei Gu
>Assignee: Yufei Gu
>Priority: Critical
>
> There is a delay between when preemption happens and when containers are 
> killed. If the resources released from nodes before the container killing 
> are not enough for the resource request the preemption is asking for, a 
> reservation happens again at that node.
> E.g., the scheduler reserves  on node 1 for app 1 during 
> preemption. It will take 15s by default to kill containers on node 1 to 
> fulfill that resource request. If  was released from 
> node 1 before the killing, the scheduler reserves  again on 
> node 1 for app 1. The second reservation may never be unreserved.
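
The sequence can be illustrated with a self-contained toy simulation (all 
numbers and names are made up; this is not FairScheduler code):

{code}
// Toy simulation of the sequence described above; not FairScheduler code.
public class ReservationLeakDemo {
  static int freeOnNode1 = 0;     // resources currently free on node 1
  static int reservations = 0;    // reservations held by app 1 on node 1

  static void heartbeat(int pending) {
    if (freeOnNode1 < pending) {
      reservations++;             // node can't fit the request: reserve
      // ...and trigger preemption; the kills only land ~15s later.
    } else {
      freeOnNode1 -= pending;     // request fits: allocate and clear
      reservations = 0;
    }
  }

  public static void main(String[] args) {
    heartbeat(4096);              // reservation #1, preemption starts
    freeOnNode1 += 1024;          // an unrelated small container finishes
    heartbeat(4096);              // still too small: reservation #2
    System.out.println("reservations = " + reservations); // 2; one can leak
  }
}
{code}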






Re: [Vote] merge feature branch YARN-2915 (Federation) to trunk

2017-07-27 Thread Arun Suresh
+1

Cheers
-Arun

On Jul 25, 2017 8:24 PM, "Subru Krishnan"  wrote:

> Hi all,
>
> Per earlier discussion [9], I'd like to start a formal vote to merge
> feature YARN Federation (YARN-2915) [1] to trunk. The vote will run for 7
> days, and will end Aug 1 7PM PDT.
>
> We have been developing the feature in a branch (YARN-2915 [2]) for a
> while, and we are reasonably confident that the state of the feature meets
> the criteria to be merged onto trunk.
>
> *Key Ideas*:
>
> YARN’s centralized design allows strict enforcement of scheduling
> invariants and effective resource sharing, but becomes a scalability
> bottleneck (in number of jobs and nodes) well before reaching the scale of
> our clusters (e.g., 20k-50k nodes).
>
>
> To address these limitations, we developed a scale-out, federation-based
> solution (YARN-2915). Our architecture scales near-linearly to
> datacenter-sized clusters by partitioning nodes across multiple sub-clusters
> (each running a YARN cluster of a few thousand nodes). Applications can span
> multiple sub-clusters *transparently (i.e. no code change or recompilation
> of existing apps)*, thanks to a layer of indirection that negotiates with
> multiple sub-clusters' Resource Managers on behalf of the application.
>
>
> This design is structurally scalable, as it bounds the number of nodes each
> RM is responsible for. Appropriate policies ensure that the majority of
> applications reside within a single sub-cluster, thus further controlling
> the load on each RM. This provides near linear scale-out by simply adding
> more sub-clusters. The same mechanism enables pooling of resources from
> clusters owned and operated by different teams.
>
> Status:
>
>- The version we would like to merge to trunk is termed "MVP" (minimum
>viable product). The feature will have a complete end-to-end application
>execution flow with the ability to span a single application across
>multiple YARN (sub) clusters.
>- There were 50+ sub-tasks that were completed as part of this
>effort. Every patch has been reviewed and +1ed by a committer. Thanks to
>Jian, Wangda, Karthik, Vinod, Varun & Arun for the thorough reviews!
>- Federation is designed to be built around YARN and consequently has
>minimal code changes to core YARN. The relevant JIRAs that modify
> existing
>YARN code base are YARN-3671 [7] & YARN-3673 [8]. We also paid close
>attention to ensure that if federation is disabled there is zero impact
> to
>existing functionality (disabled by default).
>- We found a few bugs as we went along which we fixed directly upstream
>in trunk and/or branch-2.
>- We have been continuously rebasing the feature branch [2], so the merge
>should be a straightforward cherry-pick.
>- The current version has been rather thoroughly tested and is currently
>deployed in a *10,000+ node federated YARN cluster that's running
>upwards of 50k jobs daily with a reliability of 99.9%*.
>- We have a few ideas for follow-up extensions/improvements, which are
>tracked in the umbrella JIRA YARN-5597[3].
>
>
> Documentation:
>
>- Quick start guide (maven site) - YARN-6484[4].
>- Overall design doc [5] and the slide deck [6] we used for our talk at
>Hadoop Summit 2016 are available in the umbrella JIRA - YARN-2915.
>
>
> Credits:
>
> This is a group effort that would not have been possible without the ideas
> and hard work of many other folks and we would like to specifically call
> out Giovanni, Botong & Ellen for their invaluable contributions. Also big
> thanks to the many folks in the community (Sriram, Kishore, Sarvesh, Jian,
> Wangda, Karthik, Vinod, Varun, Inigo, Vrushali, Sangjin, Joep, Rohith and
> many more) that helped us shape our ideas and code with very insightful
> feedback and comments.
>
> Cheers,
> Subru & Carlo
>
> [1] YARN-2915: https://issues.apache.org/jira/browse/YARN-2915
> [2] https://github.com/apache/hadoop/tree/YARN-2915
> [3] YARN-5597: https://issues.apache.org/jira/browse/YARN-5597
> [4] YARN-6484: https://issues.apache.org/jira/browse/YARN-6484
> [5] https://issues.apache.org/jira/secure/attachment/12733292/Yarn_federation_design_v1.pdf
> [6] https://issues.apache.org/jira/secure/attachment/12819229/YARN-Federation-Hadoop-Summit_final.pptx
> [7] YARN-3671: https://issues.apache.org/jira/browse/YARN-3671
> [8] YARN-3673: https://issues.apache.org/jira/browse/YARN-3673
> [9]
> http://mail-archives.apache.org/mod_mbox/hadoop-yarn-dev/201706.mbox/%3CCAOScs9bSsZ7mzH15Y%2BSPDU8YuNUAq7QicjXpDoX_tKh3MS4HsA%40mail.gmail.com%3E
>


[jira] [Created] (YARN-6895) Preemption reservation may cause regular reservation leaks

2017-07-27 Thread Miklos Szegedi (JIRA)
Miklos Szegedi created YARN-6895:


 Summary: Preemption reservation may cause regular reservation leaks
 Key: YARN-6895
 URL: https://issues.apache.org/jira/browse/YARN-6895
 Project: Hadoop YARN
  Issue Type: Bug
  Components: fairscheduler
Affects Versions: 3.0.0-alpha4
Reporter: Miklos Szegedi
Assignee: Miklos Szegedi
Priority: Blocker


We found a limitation in the implementation of YARN-6432. If the container 
released is smaller than the preemption request, a node reservation is created 
that is never deleted.






[jira] [Created] (YARN-6894) RM Apps API returns only active apps when query parameter queue used

2017-07-27 Thread Grant Sohn (JIRA)
Grant Sohn created YARN-6894:


 Summary: RM Apps API returns only active apps when query parameter 
queue used
 Key: YARN-6894
 URL: https://issues.apache.org/jira/browse/YARN-6894
 Project: Hadoop YARN
  Issue Type: Bug
  Components: resourcemanager, restapi
Reporter: Grant Sohn
Priority: Minor


If you run the RM's Cluster Applications API with no query parameters, you get 
a list of apps.
If you run the RM's Cluster Applications API with any query parameter other 
than "queue", you get the list of apps with the parameter filters applied.
However, when you use the "queue" query parameter, you only see the 
applications that are active in the cluster (NEW, NEW_SAVING, SUBMITTED, 
ACCEPTED, RUNNING). This behavior is inconsistent with the rest of the API. If 
there is a sound reason behind this, it should be documented, and there might 
be one, as the mapred queue CLI behaves similarly.

http://hadoop.apache.org/docs/stable/hadoop-yarn/hadoop-yarn-site/ResourceManagerRest.html#Cluster_Applications_API
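
For reference, a minimal sketch that reproduces the comparison against the 
documented endpoints above (the RM address is a placeholder, and JSON parsing 
is omitted):

{code}
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;

// Minimal sketch reproducing the report against the documented Cluster
// Applications API; "rm-host:8088" is a placeholder for a real RM address.
public class ClusterAppsQuery {
  public static void main(String[] args) throws Exception {
    HttpClient http = HttpClient.newHttpClient();
    String[] urls = {
        // Returns apps in all states, including finished ones.
        "http://rm-host:8088/ws/v1/cluster/apps",
        // Per this report, only NEW/NEW_SAVING/SUBMITTED/ACCEPTED/RUNNING
        // apps come back, unlike with every other filter parameter.
        "http://rm-host:8088/ws/v1/cluster/apps?queue=default"
    };
    for (String url : urls) {
      HttpResponse<String> resp = http.send(
          HttpRequest.newBuilder(URI.create(url)).GET().build(),
          HttpResponse.BodyHandlers.ofString());
      System.out.println(url + " -> " + resp.body());
    }
  }
}
{code}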






[jira] [Created] (YARN-6893) Fix Missing Memory and CPU Metric in WebUI v2 Flow Activity Tab

2017-07-27 Thread Abdullah Yousufi (JIRA)
Abdullah Yousufi created YARN-6893:
--

 Summary: Fix Missing Memory and CPU Metric in WebUI v2 Flow 
Activity Tab
 Key: YARN-6893
 URL: https://issues.apache.org/jira/browse/YARN-6893
 Project: Hadoop YARN
  Issue Type: Bug
  Components: yarn-ui-v2
Affects Versions: 3.0.0-alpha4
Reporter: Abdullah Yousufi
Assignee: Abdullah Yousufi


The table in the Flow Activity tab currently shows the aggregate memory and 
number of CPU vcores as N/A.






[jira] [Created] (YARN-6892) Improve API implementation in Resources and DominantResourceCalculator in alignment with ResourceInformation

2017-07-27 Thread Sunil G (JIRA)
Sunil G created YARN-6892:
-

 Summary: Improve API implementation in Resources and 
DominantResourceCalculator in alignment with ResourceInformation
 Key: YARN-6892
 URL: https://issues.apache.org/jira/browse/YARN-6892
 Project: Hadoop YARN
  Issue Type: Sub-task
Reporter: Sunil G
Assignee: Sunil G


In YARN-3926, the APIs in Resources and DominantResourceCalculator spend 
significant CPU cycles in most of their methods. For better performance, it is 
better to improve these APIs, since the resource-type order is defined at the 
system level (the ResourceUtils class ensures this post-YARN-6788).

This work precedes YARN-6788.
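
As a rough illustration of the performance argument (a sketch with 
hypothetical types, not the actual Resources/DominantResourceCalculator code):

{code}
import java.util.Map;

// Sketch only: why a fixed, system-wide resource-type order helps.
// Types and method names are hypothetical.
class ResourceSketch {
  // Before: hot scheduler paths walk a map keyed by resource name.
  static long totalViaMap(Map<String, Long> resources) {
    long total = 0;
    for (Map.Entry<String, Long> e : resources.entrySet()) {
      total += e.getValue();            // hash lookups on every comparison
    }
    return total;
  }

  // After: with a global index per resource type, values live in a flat
  // array and hot paths iterate by position.
  static long totalViaArray(long[] resources) {
    long total = 0;
    for (int i = 0; i < resources.length; i++) {
      total += resources[i];            // cache-friendly, no hashing
    }
    return total;
  }
}
{code}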






[jira] [Resolved] (YARN-6891) Can kill other user's applications via RM UI

2017-07-27 Thread Junping Du (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-6891?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Junping Du resolved YARN-6891.
--
Resolution: Duplicate

> Can kill other user's applications via RM UI
> 
>
> Key: YARN-6891
> URL: https://issues.apache.org/jira/browse/YARN-6891
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: Sumana Sathish
>Assignee: Junping Du
>Priority: Critical
>
> In a secured cluster with an unsecured UI, which has the following config:
> {code}
> "hadoop.http.authentication.simple.anonymous.allowed" => "true"
> "hadoop.http.authentication.type" => kerberos
> {code}
> The UI can be accessed without any security settings.
> Also, any user can kill other users' applications via the UI.






[jira] [Created] (YARN-6891) Can kill other user's applications via RM UI

2017-07-27 Thread Sumana Sathish (JIRA)
Sumana Sathish created YARN-6891:


 Summary: Can kill other user's applications via RM UI
 Key: YARN-6891
 URL: https://issues.apache.org/jira/browse/YARN-6891
 Project: Hadoop YARN
  Issue Type: Bug
Reporter: Sumana Sathish
Assignee: Junping Du
Priority: Critical


In a secured cluster with an unsecured UI, which has the following config:
{code}
"hadoop.http.authentication.simple.anonymous.allowed" => "true"
"hadoop.http.authentication.type" => kerberos
{code}

The UI can be accessed without any security settings.

Also, any user can kill other users' applications via the UI.






[jira] [Created] (YARN-6890) If the UI is not secured, we allow users to kill other users' jobs even if the YARN cluster is secured.

2017-07-27 Thread Junping Du (JIRA)
Junping Du created YARN-6890:


 Summary: If the UI is not secured, we allow users to kill other 
users' jobs even if the YARN cluster is secured.
 Key: YARN-6890
 URL: https://issues.apache.org/jira/browse/YARN-6890
 Project: Hadoop YARN
  Issue Type: Bug
Reporter: Sumana Sathish
Assignee: Junping Du
Priority: Critical


Configuring SPNEGO for web browsers can be a headache, so many production 
clusters choose to configure unsecured UI access even for a secured cluster. 
In this setup, users (logged in as some random person) can watch other users' 
jobs, which is expected. However, the kill button (added in YARN-3249 and 
enabled by default) shouldn't work in this situation.
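
A minimal sketch of the kind of guard this argues for, using Hadoop's 
UserGroupInformation and AccessControlList (the surrounding class, the ACL 
value, and the appOwner parameter are hypothetical, not the actual RM web app 
code):

{code}
import javax.servlet.http.HttpServletRequest;

import org.apache.hadoop.security.UserGroupInformation;
import org.apache.hadoop.security.authorize.AccessControlList;

// Sketch only: only honor a kill from the UI when the caller is
// authenticated and is the application owner or an admin.
public class KillGuard {
  // Assumed admin ACL, for illustration.
  private final AccessControlList adminAcl = new AccessControlList("yarn");

  boolean canKill(HttpServletRequest request, String appOwner) {
    String remoteUser = request.getRemoteUser();
    if (remoteUser == null) {
      // Anonymous UI access on a secured cluster: pages stay readable,
      // but a kill is never honored.
      return false;
    }
    UserGroupInformation caller =
        UserGroupInformation.createRemoteUser(remoteUser);
    return caller.getShortUserName().equals(appOwner)
        || adminAcl.isUserAllowed(caller);
  }
}
{code}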






[jira] [Resolved] (YARN-6889) The throughput of timeline server is too small

2017-07-27 Thread Ravi Prakash (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-6889?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ravi Prakash resolved YARN-6889.

Resolution: Duplicate

> The throughput of timeline server is too small
> --
>
> Key: YARN-6889
> URL: https://issues.apache.org/jira/browse/YARN-6889
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: timelineserver
>Reporter: YunFan Zhou
>Priority: Critical
>
> I recently ran a large-scale stress test against a single timeline server, 
> and found that the throughput of the timeline server is too small.
> I set up multiple servers, and each of them ran multiple processes. Each 
> process ran multiple threads, and I sent a different data size with each 
> thread.
> Although I used different loads and different scenarios, the timeline 
> server's processing power was basically the same: it can process about 70 
> messages per second.
> That can't meet our requirements; we should improve it.
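
A minimal sketch of this kind of multi-threaded stress driver, assuming the 
ATS v1 TimelineClient API (thread count, duration, and entity payload are 
arbitrary, and error handling is omitted):

{code}
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicLong;

import org.apache.hadoop.yarn.api.records.timeline.TimelineEntity;
import org.apache.hadoop.yarn.client.api.TimelineClient;
import org.apache.hadoop.yarn.conf.YarnConfiguration;

// Rough throughput driver for the timeline server; counts write attempts
// over a fixed window and prints entities/second.
public class TimelineThroughput {
  public static void main(String[] args) throws Exception {
    TimelineClient client = TimelineClient.createTimelineClient();
    client.init(new YarnConfiguration());
    client.start();

    AtomicLong sent = new AtomicLong();
    ExecutorService pool = Executors.newFixedThreadPool(16);
    long end = System.nanoTime() + TimeUnit.SECONDS.toNanos(60);
    for (int t = 0; t < 16; t++) {
      pool.execute(() -> {
        while (System.nanoTime() < end) {
          long n = sent.incrementAndGet();   // counts attempts, not successes
          TimelineEntity e = new TimelineEntity();
          e.setEntityType("STRESS_TEST");
          e.setEntityId("entity-" + n);
          e.setStartTime(System.currentTimeMillis());
          try {
            client.putEntities(e);           // one write per loop iteration
          } catch (Exception ignored) {
          }
        }
      });
    }
    pool.shutdown();
    pool.awaitTermination(2, TimeUnit.MINUTES);
    System.out.println("entities/sec = " + sent.get() / 60.0);
    client.stop();
  }
}
{code}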






Apache Hadoop qbt Report: trunk+JDK8 on Linux/x86

2017-07-27 Thread Apache Jenkins Server
For more details, see 
https://builds.apache.org/job/hadoop-qbt-trunk-java8-linux-x86/476/

[Jul 26, 2017 6:12:39 PM] (kihwal) HADOOP-14578. Bind IPC connections to 
kerberos UPN host for proxy users.




-1 overall


The following subsystems voted -1:
compile findbugs unit


The following subsystems voted -1 but
were configured to be filtered/ignored:
cc checkstyle javac javadoc pylint shellcheck shelldocs whitespace


The following subsystems are considered long running:
(runtime bigger than 1h  0m  0s)
unit


Specific tests:

FindBugs :

   module:hadoop-hdfs-project/hadoop-hdfs-client 
   Possible exposure of partially initialized object in 
org.apache.hadoop.hdfs.DFSClient.initThreadsNumForStripedReads(int) At 
DFSClient.java:object in 
org.apache.hadoop.hdfs.DFSClient.initThreadsNumForStripedReads(int) At 
DFSClient.java:[line 2888] 
   org.apache.hadoop.hdfs.server.protocol.SlowDiskReports.equals(Object) 
makes inefficient use of keySet iterator instead of entrySet iterator At 
SlowDiskReports.java:keySet iterator instead of entrySet iterator At 
SlowDiskReports.java:[line 105] 
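
For readers unfamiliar with this FindBugs pattern, the keySet finding flags 
code of the following shape (a generic illustration, not the actual 
SlowDiskReports source):

{code}
import java.util.Map;

// Generic illustration of the "inefficient keySet iterator" pattern.
class KeySetVsEntrySet {
  // Flagged: each get() is a second hash lookup per key.
  static long sumViaKeySet(Map<String, Long> m) {
    long sum = 0;
    for (String key : m.keySet()) {
      sum += m.get(key);
    }
    return sum;
  }

  // Preferred: entrySet yields key and value in one pass.
  static long sumViaEntrySet(Map<String, Long> m) {
    long sum = 0;
    for (Map.Entry<String, Long> e : m.entrySet()) {
      sum += e.getValue();
    }
    return sum;
  }
}
{code}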

FindBugs :

   module:hadoop-hdfs-project/hadoop-hdfs 
   Possible null pointer dereference in 
org.apache.hadoop.hdfs.qjournal.server.JournalNode.getJournalsStatus() due to 
return value of called method Dereferenced at 
JournalNode.java:org.apache.hadoop.hdfs.qjournal.server.JournalNode.getJournalsStatus()
 due to return value of called method Dereferenced at JournalNode.java:[line 
302] 
   
org.apache.hadoop.hdfs.server.common.HdfsServerConstants$StartupOption.setClusterId(String)
 unconditionally sets the field clusterId At HdfsServerConstants.java:clusterId 
At HdfsServerConstants.java:[line 193] 
   
org.apache.hadoop.hdfs.server.common.HdfsServerConstants$StartupOption.setForce(int)
 unconditionally sets the field force At HdfsServerConstants.java:force At 
HdfsServerConstants.java:[line 217] 
   
org.apache.hadoop.hdfs.server.common.HdfsServerConstants$StartupOption.setForceFormat(boolean)
 unconditionally sets the field isForceFormat At 
HdfsServerConstants.java:isForceFormat At HdfsServerConstants.java:[line 229] 
   
org.apache.hadoop.hdfs.server.common.HdfsServerConstants$StartupOption.setInteractiveFormat(boolean)
 unconditionally sets the field isInteractiveFormat At 
HdfsServerConstants.java:isInteractiveFormat At HdfsServerConstants.java:[line 
237] 
   Possible null pointer dereference in 
org.apache.hadoop.hdfs.server.datanode.DataStorage.linkBlocksHelper(File, File, 
int, HardLink, boolean, File, List) due to return value of called method 
Dereferenced at 
DataStorage.java:org.apache.hadoop.hdfs.server.datanode.DataStorage.linkBlocksHelper(File,
 File, int, HardLink, boolean, File, List) due to return value of called method 
Dereferenced at DataStorage.java:[line 1339] 
   Possible null pointer dereference in 
org.apache.hadoop.hdfs.server.namenode.NNStorageRetentionManager.purgeOldLegacyOIVImages(String,
 long) due to return value of called method Dereferenced at 
NNStorageRetentionManager.java:org.apache.hadoop.hdfs.server.namenode.NNStorageRetentionManager.purgeOldLegacyOIVImages(String,
 long) due to return value of called method Dereferenced at 
NNStorageRetentionManager.java:[line 258] 
   Possible null pointer dereference in 
org.apache.hadoop.hdfs.server.namenode.NNUpgradeUtil$1.visitFile(Path, 
BasicFileAttributes) due to return value of called method Dereferenced at 
NNUpgradeUtil.java:org.apache.hadoop.hdfs.server.namenode.NNUpgradeUtil$1.visitFile(Path,
 BasicFileAttributes) due to return value of called method Dereferenced at 
NNUpgradeUtil.java:[line 133] 
   Useless condition:argv.length >= 1 at this point At DFSAdmin.java:[line 
2100] 
   Useless condition:numBlocks == -1 at this point At 
ImageLoaderCurrent.java:[line 727] 

FindBugs :

   
module:hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager
 
   Useless object stored in variable removedNullContainers of method 
org.apache.hadoop.yarn.server.nodemanager.NodeStatusUpdaterImpl.removeOrTrackCompletedContainersFromContext(List)
 At NodeStatusUpdaterImpl.java:removedNullContainers of method 
org.apache.hadoop.yarn.server.nodemanager.NodeStatusUpdaterImpl.removeOrTrackCompletedContainersFromContext(List)
 At NodeStatusUpdaterImpl.java:[line 642] 
   
org.apache.hadoop.yarn.server.nodemanager.NodeStatusUpdaterImpl.removeVeryOldStoppedContainersFromCache()
 makes inefficient use of keySet iterator instead of entrySet iterator At 
NodeStatusUpdaterImpl.java:keySet iterator instead of entrySet iterator At 
NodeStatusUpdaterImpl.java:[line 719] 
   Hard coded reference to an absolute pathname in 
org.apache.hadoop.yarn.server.nodemanager.containermanager.linux.runtime.DockerLinuxContainerRuntime.launchContainer(ContainerRuntimeContext)
 At DockerLinuxContainerRuntime.java:absolute pathname in 

[jira] [Created] (YARN-6889) The throughput of timeline server is too small

2017-07-27 Thread YunFan Zhou (JIRA)
YunFan Zhou created YARN-6889:
-

 Summary: The throughput of timeline server is too small
 Key: YARN-6889
 URL: https://issues.apache.org/jira/browse/YARN-6889
 Project: Hadoop YARN
  Issue Type: Improvement
  Components: timelineserver
Reporter: YunFan Zhou









[jira] [Created] (YARN-6888) Refactor AppLevelTimelineCollector such that RM does not have aggregator threads created

2017-07-27 Thread Vrushali C (JIRA)
Vrushali C created YARN-6888:


 Summary: Refactor AppLevelTimelineCollector such that RM does not 
have aggregator threads created
 Key: YARN-6888
 URL: https://issues.apache.org/jira/browse/YARN-6888
 Project: Hadoop YARN
  Issue Type: Sub-task
Reporter: Vrushali C
Assignee: Vrushali C



Currently both the RM and the NM use the same AppLevelTimelineCollector class. 
The NM requires aggregator threads per application so that it can perform 
in-memory aggregation of application metrics, but the RM does not need this. 
Since they share the code, the RM has a bunch of "TimelineCollector 
Aggregation" threads created (one per running app).

Filing this JIRA to refactor AppLevelTimelineCollector such that the RM does 
not have aggregator threads created.
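
One possible shape of that refactoring, as a minimal sketch (class and member 
names here are hypothetical, not the actual patch):

{code}
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;

// Sketch only: keep per-app aggregation threads on the NM while the RM
// skips them entirely.
class AppLevelCollectorSketch {
  private final boolean runsInNodeManager;
  private ScheduledExecutorService aggregationExecutor;

  AppLevelCollectorSketch(boolean runsInNodeManager) {
    this.runsInNodeManager = runsInNodeManager;
  }

  void serviceStart() {
    if (runsInNodeManager) {
      // Only the NM needs in-memory aggregation of application metrics;
      // the RM-side collector never creates the per-app thread.
      aggregationExecutor = Executors.newSingleThreadScheduledExecutor();
    }
  }

  void serviceStop() {
    if (aggregationExecutor != null) {
      aggregationExecutor.shutdownNow();
    }
  }
}
{code}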



