[jira] [Created] (YARN-4661) Per-queue preemption policy in FairScheduler
He Tianyi created YARN-4661:
---
Summary: Per-queue preemption policy in FairScheduler
Key: YARN-4661
URL: https://issues.apache.org/jira/browse/YARN-4661
Project: Hadoop YARN
Issue Type: Wish
Components: fairscheduler
Affects Versions: 2.6.0
Reporter: He Tianyi
Priority: Minor

When {{FairScheduler}} needs to preempt a container, it finds one by hierarchically sorting and selecting the {{AppSchedulable}} that is most over its fair share (in {{FairSharePolicy}}), then picking that app's latest launched container.

In some cases the strategy above becomes non-optimal: one may want to kill the latest container launched in the queue (not in the selected {{AppSchedulable}}) for a better trade-off between fairness and efficiency. Since apps that are over their fair share tend to have started earlier than other apps, even their latest launched container may have been running for quite some time.

Perhaps, besides {{policy}}, we could make it possible to also specify a {{preemptionPolicy}} used only for selecting the container to preempt, without changing the scheduling policy. For example:
{quote}
fifo
fair
{quote}
Any suggestions or comments?

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
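To make the proposal concrete, the separate {{preemptionPolicy}} might look roughly like this in a fair-scheduler allocation file. This is a hypothetical sketch only: the {{preemptionPolicy}} element is the thing being proposed here and does not exist in FairScheduler, and the queue name is invented.

```xml
<allocations>
  <queue name="example">
    <!-- Existing setting: order apps within the queue FIFO for scheduling. -->
    <schedulingPolicy>fifo</schedulingPolicy>
    <!-- Proposed (hypothetical) setting: choose preemption victims by fair share,
         independently of the scheduling policy above. -->
    <preemptionPolicy>fair</preemptionPolicy>
  </queue>
</allocations>
```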
[jira] [Commented] (YARN-3902) Fair scheduler preempts ApplicationMaster
[ https://issues.apache.org/jira/browse/YARN-3902?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15124947#comment-15124947 ]

Arun Suresh commented on YARN-3902:
---
Hello [~He Tianyi],
Sure, I'll assign this to you. I can help with the reviews etc.
BTW, now that YARN-3116 is resolved, you can figure out which is the AM container, which should make things a bit simpler.

> Fair scheduler preempts ApplicationMaster
> -
>
> Key: YARN-3902
> URL: https://issues.apache.org/jira/browse/YARN-3902
> Project: Hadoop YARN
> Issue Type: Bug
> Components: fairscheduler
> Affects Versions: 2.3.0
> Environment: 3.16.0-0.bpo.4-amd64 #1 SMP Debian 3.16.7-ckt2-1~bpo70+1 (2014-12-08) x86_64
> Reporter: He Tianyi
> Assignee: Arun Suresh
> Original Estimate: 72h
> Remaining Estimate: 72h
>
> YARN-2022 fixed a similar issue in CapacityScheduler. However, FairScheduler still suffers from it, preempting the AM while other normal containers are running.
> I think we should take the same approach: avoid preempting the AM unless no container other than the AM is running.
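The approach discussed in this issue could be sketched roughly as below. This is a hypothetical illustration, not the actual FairScheduler code: the class, the {{Container}} holder, and the method names are all invented for the example. It prefers the most recently launched non-AM container and falls back to the AM only when it is the sole container left.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

// Hypothetical sketch of "avoid preempting the AM unless nothing else runs".
public class PreemptionCandidateSketch {

  public static class Container {
    public final String id;
    public final long launchTime;
    public final boolean isAm;

    public Container(String id, long launchTime, boolean isAm) {
      this.id = id;
      this.launchTime = launchTime;
      this.isAm = isAm;
    }
  }

  // Pick a victim: latest launched non-AM container; the AM only as last resort.
  public static Container chooseVictim(List<Container> running) {
    List<Container> nonAm = new ArrayList<>();
    for (Container c : running) {
      if (!c.isAm) {
        nonAm.add(c);
      }
    }
    // Only consider the AM when the app has no other containers left.
    List<Container> pool = nonAm.isEmpty() ? running : nonAm;
    return pool.stream()
        .max(Comparator.comparingLong(c -> c.launchTime))
        .orElse(null);
  }

  public static void main(String[] args) {
    List<Container> running = Arrays.asList(
        new Container("am", 100L, true),
        new Container("c1", 200L, false),
        new Container("c2", 300L, false));
    // Latest non-AM container is chosen even though the AM exists.
    System.out.println(chooseVictim(running).id); // prints c2
  }
}
```

With YARN-3116 resolved, identifying which container is the AM (the `isAm` flag above) becomes straightforward, which is what makes this filtering simple to add.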
[jira] [Commented] (YARN-3902) Fair scheduler preempts ApplicationMaster
[ https://issues.apache.org/jira/browse/YARN-3902?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15124961#comment-15124961 ]

Arun Suresh commented on YARN-3902:
---
[~He Tianyi], for some reason, JIRA is not able to find you in the Assignee list. Can you please assign it to yourself?
[jira] [Updated] (YARN-4409) Fix javadoc and checkstyle issues in timelineservice code
[ https://issues.apache.org/jira/browse/YARN-4409?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Varun Saxena updated YARN-4409:
---
Attachment: YARN-4409-YARN-2928.wip.01.patch

> Fix javadoc and checkstyle issues in timelineservice code
> -
>
> Key: YARN-4409
> URL: https://issues.apache.org/jira/browse/YARN-4409
> Project: Hadoop YARN
> Issue Type: Sub-task
> Components: timelineserver
> Affects Versions: YARN-2928
> Reporter: Varun Saxena
> Assignee: Varun Saxena
> Labels: yarn-2928-1st-milestone
> Attachments: YARN-4409-YARN-2928.wip.01.patch
>
> There are a large number of javadoc and checkstyle issues currently open in the timelineservice code. We need to fix them before we merge it into trunk.
> Refer to https://issues.apache.org/jira/browse/YARN-3862?focusedCommentId=15035267=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-15035267
> We still have 94 open checkstyle issues, and javadocs are failing for Java 8.
[jira] [Commented] (YARN-4409) Fix javadoc and checkstyle issues in timelineservice code
[ https://issues.apache.org/jira/browse/YARN-4409?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15124962#comment-15124962 ]

Varun Saxena commented on YARN-4409:
---
This patch is on top of YARN-4446.
[jira] [Commented] (YARN-4340) Add "list" API to reservation system
[ https://issues.apache.org/jira/browse/YARN-4340?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15124849#comment-15124849 ]

Hadoop QA commented on YARN-4340:
---
-1 overall

|| Vote || Subsystem || Runtime || Comment ||
| 0 | reexec | 0m 0s | Docker mode activated. |
| +1 | @author | 0m 0s | The patch does not contain any @author tags. |
| +1 | test4tests | 0m 0s | The patch appears to include 10 new or modified test files. |
| 0 | mvndep | 1m 33s | Maven dependency ordering for branch |
| +1 | mvninstall | 11m 6s | trunk passed |
| +1 | compile | 14m 36s | trunk passed with JDK v1.8.0_66 |
| +1 | compile | 11m 33s | trunk passed with JDK v1.7.0_91 |
| +1 | checkstyle | 1m 51s | trunk passed |
| +1 | mvnsite | 4m 34s | trunk passed |
| +1 | mvneclipse | 1m 55s | trunk passed |
| +1 | findbugs | 8m 48s | trunk passed |
| +1 | javadoc | 4m 24s | trunk passed with JDK v1.8.0_66 |
| +1 | javadoc | 6m 32s | trunk passed with JDK v1.7.0_91 |
| 0 | mvndep | 1m 15s | Maven dependency ordering for patch |
| +1 | mvninstall | 3m 45s | the patch passed |
| +1 | compile | 14m 20s | the patch passed with JDK v1.8.0_66 |
| +1 | cc | 14m 20s | the patch passed |
| +1 | javac | 14m 20s | the patch passed |
| +1 | compile | 11m 27s | the patch passed with JDK v1.7.0_91 |
| +1 | cc | 11m 27s | the patch passed |
| +1 | javac | 11m 27s | the patch passed |
| +1 | checkstyle | 1m 39s | root: patch generated 0 new + 347 unchanged - 2 fixed = 347 total (was 349) |
| +1 | mvnsite | 4m 28s | the patch passed |
| +1 | mvneclipse | 1m 56s | the patch passed |
| +1 | whitespace | 0m 0s | Patch has no whitespace issues. |
| +1 | findbugs | 10m 28s | the patch passed |
| +1 | javadoc | 4m 24s | the patch passed with JDK v1.8.0_66 |
| +1 | javadoc | 6m 51s | the patch passed with JDK v1.7.0_91 |
| +1 | unit | 0m 49s | hadoop-yarn-api in the patch passed with JDK v1.8.0_66. |
| +1 | unit | 3m 14s | hadoop-yarn-common in the patch passed with JDK v1.8.0_66. |
| -1 | unit | 11m 28s | hadoop-yarn-server-nodemanager in the patch failed with JDK v1.8.0_66. |
| -1 | unit | 73m 47s | hadoop-yarn-server-resourcemanager in the patch failed with JDK
[jira] [Commented] (YARN-2139) [Umbrella] Support for Disk as a Resource in YARN
[ https://issues.apache.org/jira/browse/YARN-2139?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15124824#comment-15124824 ]

He Tianyi commented on YARN-2139:
---
I recently introduced SSDs in my cluster for MapReduce shuffle. That raised one issue: if map output gets too large, it cannot be placed on the SSD. We had to implement a custom strategy (called SSDFirst) that makes a best effort to use the SSD but falls back to HDD when the SSD's available space gets tight. This works in most cases, but it is only a local optimum. To achieve a global optimum, the scheduler must be aware of and manage these resources.

> [Umbrella] Support for Disk as a Resource in YARN
> --
>
> Key: YARN-2139
> URL: https://issues.apache.org/jira/browse/YARN-2139
> Project: Hadoop YARN
> Issue Type: New Feature
> Reporter: Wei Yan
> Attachments: Disk_IO_Isolation_Scheduling_3.pdf, Disk_IO_Scheduling_Design_1.pdf, Disk_IO_Scheduling_Design_2.pdf, YARN-2139-prototype-2.patch, YARN-2139-prototype.patch
>
> YARN should consider disk as another resource for (1) scheduling tasks on nodes, (2) isolation at runtime, (3) spindle locality.
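The "SSDFirst" fallback described in the comment could be sketched roughly as follows. This is a hypothetical illustration of the idea (class, method, and the reserve threshold are invented, not the actual implementation): write to the SSD while it has headroom, and fall back to HDD once free SSD space gets tight.

```java
// Hypothetical sketch of an SSD-first volume choice with HDD fallback.
public class SsdFirstSketch {

  // Returns "SSD" if placing bytesNeeded there would still leave at least
  // ssdReserveBytes free; otherwise falls back to "HDD".
  public static String pickVolume(long bytesNeeded, long ssdFreeBytes,
                                  long ssdReserveBytes) {
    if (ssdFreeBytes - bytesNeeded >= ssdReserveBytes) {
      return "SSD";
    }
    return "HDD";
  }

  public static void main(String[] args) {
    long gib = 1L << 30;
    // 10 GiB output, 100 GiB free on SSD, keep 20 GiB in reserve: fits on SSD.
    System.out.println(pickVolume(10 * gib, 100 * gib, 20 * gib)); // prints SSD
    // 90 GiB output would eat into the reserve: fall back to HDD.
    System.out.println(pickVolume(90 * gib, 100 * gib, 20 * gib)); // prints HDD
  }
}
```

As the comment notes, this per-node heuristic is only a local optimum; a global optimum needs the scheduler itself to track and allocate disk as a resource, which is what this umbrella issue proposes.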
[jira] [Commented] (YARN-3902) Fair scheduler preempts ApplicationMaster
[ https://issues.apache.org/jira/browse/YARN-3902?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15124827#comment-15124827 ]

He Tianyi commented on YARN-3902:
---
Hi, [~asuresh]. Any feedback on this? Maybe you could assign this to me?
[jira] [Commented] (YARN-3902) Fair scheduler preempts ApplicationMaster
[ https://issues.apache.org/jira/browse/YARN-3902?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15125165#comment-15125165 ]

He Tianyi commented on YARN-3902:
---
Hi [~asuresh]. I think I am unable to assign it to myself either; there is no longer an 'Assign to me' link.
[jira] [Commented] (YARN-4466) ResourceManager should tolerate unexpected exceptions to happen in non-critical subsystem/services like SystemMetricsPublisher
[ https://issues.apache.org/jira/browse/YARN-4466?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15125248#comment-15125248 ]

Naganarasimha G R commented on YARN-4466:
---
Hi [~djp], any thoughts on my previous comment?

> ResourceManager should tolerate unexpected exceptions to happen in non-critical subsystem/services like SystemMetricsPublisher
> --
>
> Key: YARN-4466
> URL: https://issues.apache.org/jira/browse/YARN-4466
> Project: Hadoop YARN
> Issue Type: Bug
> Components: resourcemanager
> Reporter: Junping Du
> Assignee: Naganarasimha G R
>
> From my comment in YARN-4452 (https://issues.apache.org/jira/browse/YARN-4452?focusedCommentId=15059805=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-15059805): we should make the RM more robust by ignoring (but logging) unexpected exceptions in its non-critical subsystems/services.
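The "ignore but log" idea behind this issue could be sketched roughly as below. This is a hypothetical illustration only (the class, interface, and method names are invented, not RM code): errors from a non-critical service such as a metrics publisher are caught and logged rather than allowed to propagate and take down the ResourceManager.

```java
// Hypothetical sketch: isolate failures in non-critical services.
public class NonCriticalDispatchSketch {

  // Stand-in for a non-critical subsystem, e.g. a metrics publisher.
  public interface Handler {
    void handle(String event);
  }

  // Deliver an event to a non-critical handler; swallow and log any
  // unexpected exception instead of letting it crash the caller.
  public static void dispatchNonCritical(Handler h, String event) {
    try {
      h.handle(event);
    } catch (Exception e) {
      // A bug in a metrics publisher should not bring down the RM.
      System.err.println("Ignoring error in non-critical service: " + e);
    }
  }

  public static void main(String[] args) {
    dispatchNonCritical(ev -> {
      throw new RuntimeException("simulated publisher bug");
    }, "appCreated");
    // Reaching this line shows the failure was contained.
    System.out.println("dispatcher survived the non-critical failure");
  }
}
```

The trade-off is that swallowed exceptions must still be logged loudly enough to be noticed; critical subsystems should keep failing fast.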