[jira] [Commented] (YARN-41) The RM should handle the graceful shutdown of the NM.
[ https://issues.apache.org/jira/browse/YARN-41?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14557771#comment-14557771 ]

Devaraj K commented on YARN-41:

bq. Both ways doesn't break API compatibility as NodeState is marked as UNSTABLE and we were just adding a DECOMMISSIONING state. Isn't it?

I agree that it doesn't break compatibility at the Java API level. However, users who retrieve the node list via the web services may need to change their client code if we add a new node state here, since the node listing would then include the new state.

bq. The compatibility issue I mentioned above is still there as behavior changes: user (and management tools) could feel confused that after this patch, the node will show up in the decommission list after normally shutdown.

Here we are trying to notify the RM immediately that the NM is no longer available for use, instead of waiting for the NM expiry interval to learn of its unavailability. Users would benefit from this rather than being troubled by the behavior change. I think adding a new state for shutdown here would rectify the behavioral confusion you describe.

bq. I agree with Vinod that LOST is pretty far away here, but DECOMMISSIONED is also not so ideal I think.

I agree that neither state is an exact match for this behavior. DECOMMISSIONED was chosen because it is the closest fit and avoids the complexity of a new state. I will add a new state, i.e. SHUTDOWN, for such nodes in the next patch.

The RM should handle the graceful shutdown of the NM.

Key: YARN-41
URL: https://issues.apache.org/jira/browse/YARN-41
Project: Hadoop YARN
Issue Type: New Feature
Components: nodemanager, resourcemanager
Reporter: Ravi Teja Ch N V
Assignee: Devaraj K
Attachments: MAPREDUCE-3494.1.patch, MAPREDUCE-3494.2.patch, MAPREDUCE-3494.patch, YARN-41-1.patch, YARN-41-2.patch, YARN-41-3.patch, YARN-41-4.patch, YARN-41-5.patch, YARN-41-6.patch, YARN-41.patch

Instead of waiting for the NM expiry, RM should remove and handle the NM, which is shutdown gracefully.
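To make the proposal concrete, here is a minimal, self-contained sketch of the intended state handling. The enum and class below are hypothetical stand-ins for illustration only (the real logic lives in the RM's node state machine and the NodeState record), assuming a new SHUTDOWN state as discussed above:

{code:java}
// Hypothetical stand-ins; not the actual RM classes.
enum SketchNodeState { RUNNING, LOST, DECOMMISSIONED, SHUTDOWN }

class SketchRMNode {
    private SketchNodeState state = SketchNodeState.RUNNING;

    /** The NM sent a graceful-shutdown notification: mark the node immediately. */
    void onGracefulShutdown() {
        // No need to wait for the NM expiry interval; the node is known to be
        // gone, and SHUTDOWN distinguishes this from LOST or DECOMMISSIONED.
        state = SketchNodeState.SHUTDOWN;
    }

    /** The liveness monitor timed out a node that went silent. */
    void onHeartbeatExpiry() {
        state = SketchNodeState.LOST;
    }

    SketchNodeState getState() {
        return state;
    }
}
{code}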
[jira] [Resolved] (YARN-1969) Fair Scheduler: Add policy for Earliest Endtime First
[ https://issues.apache.org/jira/browse/YARN-1969?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Maysam Yabandeh resolved YARN-1969.
Resolution: Not A Problem

After the fix in MAPREDUCE-5844 and a couple of other issues, I no longer see much urgency for this feature. It is still a cool one, but there is currently no high demand for it. Canceling the jira for the moment; we can resume it later when another specific application for it emerges.

Fair Scheduler: Add policy for Earliest Endtime First

Key: YARN-1969
URL: https://issues.apache.org/jira/browse/YARN-1969
Project: Hadoop YARN
Issue Type: Improvement
Components: fairscheduler
Reporter: Maysam Yabandeh
Assignee: Maysam Yabandeh

What we are observing is that some big jobs with many allocated containers are waiting for a few containers to finish. Under *fair-share scheduling*, however, they have a low priority, since there are other jobs (usually much smaller newcomers) that are using resources well below their fair share; hence newly released containers are not offered to the big, yet close-to-be-finished, job. Nevertheless, everybody would benefit from an unfair scheduling that offers the resource to the big job, since the sooner the big job finishes, the sooner it releases its many allocated resources for use by other jobs. In other words, we need a relaxed version of *Earliest Endtime First scheduling* that takes into account the number of already-allocated resources and the estimated time to finish. For example, if a job is using MEM GB of memory and is expected to finish in TIME minutes, its scheduling priority would be a function p of (MEM, TIME). The expected time to finish can be estimated by the AppMaster using TaskRuntimeEstimator#estimatedRuntime and supplied to the RM in the resource request messages. To be less susceptible to apps gaming the system, we can limit this scheduling to leaf queues which have applications.
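As a sketch of what such a policy comparator might look like (the classes, field names, and the particular scoring function p are assumptions for illustration, not the FairScheduler API):

{code:java}
import java.util.Comparator;

/** Hypothetical per-app snapshot used only for this sketch. */
class SketchApp {
    final long allocatedMemGb;       // MEM: resources currently held
    final long estimatedMinutesLeft; // TIME: e.g. from TaskRuntimeEstimator#estimatedRuntime

    SketchApp(long allocatedMemGb, long estimatedMinutesLeft) {
        this.allocatedMemGb = allocatedMemGb;
        this.estimatedMinutesLeft = estimatedMinutesLeft;
    }
}

/**
 * Relaxed Earliest-Endtime-First: apps holding more resources and closer to
 * finishing sort first, so finishing them frees capacity sooner.
 */
class EarliestEndtimeFirstComparator implements Comparator<SketchApp> {
    @Override
    public int compare(SketchApp a, SketchApp b) {
        // One possible p(MEM, TIME): time remaining per unit of held memory.
        // A lower score sorts first, i.e. is offered containers first.
        double scoreA = (double) a.estimatedMinutesLeft / Math.max(1, a.allocatedMemGb);
        double scoreB = (double) b.estimatedMinutesLeft / Math.max(1, b.allocatedMemGb);
        return Double.compare(scoreA, scoreB);
    }
}
{code}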
[jira] [Commented] (YARN-314) Schedulers should allow resource requests of different sizes at the same priority and location
[ https://issues.apache.org/jira/browse/YARN-314?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14557912#comment-14557912 ]

Sandy Ryza commented on YARN-314:

Do we have applications that need this capability?

Schedulers should allow resource requests of different sizes at the same priority and location

Key: YARN-314
URL: https://issues.apache.org/jira/browse/YARN-314
Project: Hadoop YARN
Issue Type: Sub-task
Components: scheduler
Affects Versions: 2.0.2-alpha
Reporter: Sandy Ryza
Assignee: Karthik Kambatla
Attachments: yarn-314-prelim.patch

Currently, resource requests for the same priority and locality are expected to all be the same size. While it doesn't look like this is needed by apps currently, and it can be circumvented by specifying different priorities if absolutely necessary, it seems to me that the ability to request containers with different resource requirements at the same priority level should be there for the future and for completeness' sake.
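For context, this is what the change would permit from an AM's point of view. A minimal sketch using the existing AMRMClient API, assuming the schedulers accepted it; today the two differently-sized requests at one priority collide, which is exactly what this issue proposes to fix:

{code:java}
import org.apache.hadoop.yarn.api.records.Priority;
import org.apache.hadoop.yarn.api.records.Resource;
import org.apache.hadoop.yarn.client.api.AMRMClient;
import org.apache.hadoop.yarn.client.api.AMRMClient.ContainerRequest;

public class SamePrioritySketch {
    public static void main(String[] args) {
        AMRMClient<ContainerRequest> amrmClient = AMRMClient.createAMRMClient();
        Priority priority = Priority.newInstance(1);

        // Two requests at the SAME priority with DIFFERENT capabilities.
        ContainerRequest small =
            new ContainerRequest(Resource.newInstance(1024, 1), null, null, priority);
        ContainerRequest large =
            new ContainerRequest(Resource.newInstance(4096, 4), null, null, priority);

        amrmClient.addContainerRequest(small);
        // Without the change proposed here, this second request conflicts with
        // the first; the current workaround is to use a different priority.
        amrmClient.addContainerRequest(large);
    }
}
{code}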
[jira] [Created] (YARN-3708) container num become -1 after job finished
tongshiquan created YARN-3708:

Summary: container num become -1 after job finished
Key: YARN-3708
URL: https://issues.apache.org/jira/browse/YARN-3708
Project: Hadoop YARN
Issue Type: Bug
Components: yarn
Affects Versions: 2.7.0
Reporter: tongshiquan
Priority: Minor
[jira] [Updated] (YARN-3708) container num become -1 after job finished
[ https://issues.apache.org/jira/browse/YARN-3708?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

tongshiquan updated YARN-3708:
Attachment: screenshot-1.png

container num become -1 after job finished

Key: YARN-3708
URL: https://issues.apache.org/jira/browse/YARN-3708
Project: Hadoop YARN
Issue Type: Bug
Components: yarn
Affects Versions: 2.7.0
Reporter: tongshiquan
Priority: Minor
Attachments: screenshot-1.png
[jira] [Commented] (YARN-2238) filtering on UI sticks even if I move away from the page
[ https://issues.apache.org/jira/browse/YARN-2238?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14557960#comment-14557960 ]

Rohith commented on YARN-2238:

+1 lgtm (non-binding)

filtering on UI sticks even if I move away from the page

Key: YARN-2238
URL: https://issues.apache.org/jira/browse/YARN-2238
Project: Hadoop YARN
Issue Type: Bug
Components: webapp
Affects Versions: 2.4.0
Reporter: Sangjin Lee
Assignee: Jian He
Labels: usability
Attachments: YARN-2238.patch, YARN-2238.png, filtered.png

The main data table in many web pages (RM, AM, etc.) seems to show an unexpected filtering behavior. If I filter the table by typing something into the key or value field (or, I suspect, any search field), the data table gets filtered. The example I used is the job configuration page for an MR job. That is expected. However, when I move away from that page and visit any other web page of the same type (e.g. another job configuration page), the page is rendered with the filtering applied! That is unexpected. What's even stranger is that it does not render the filtering term. As a result, I have a page that's mysteriously filtered but doesn't tell me what it's filtering on.
[jira] [Commented] (YARN-2238) filtering on UI sticks even if I move away from the page
[ https://issues.apache.org/jira/browse/YARN-2238?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14557959#comment-14557959 ]

Rohith commented on YARN-2238:

Tested locally with the YARN-3707 fix; working fine :-)
[jira] [Updated] (YARN-3543) ApplicationReport should be able to tell whether the Application is AM managed or not.
[ https://issues.apache.org/jira/browse/YARN-3543?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Rohith updated YARN-3543:
Attachment: 0004-YARN-3543.patch

Attaching the same patch as the previous one to kick off Jenkins.

ApplicationReport should be able to tell whether the Application is AM managed or not.

Key: YARN-3543
URL: https://issues.apache.org/jira/browse/YARN-3543
Project: Hadoop YARN
Issue Type: Improvement
Components: api
Affects Versions: 2.6.0
Reporter: Spandan Dutta
Assignee: Rohith
Attachments: 0001-YARN-3543.patch, 0001-YARN-3543.patch, 0002-YARN-3543.patch, 0002-YARN-3543.patch, 0003-YARN-3543.patch, 0004-YARN-3543.patch, 0004-YARN-3543.patch, 0004-YARN-3543.patch, YARN-3543-AH.PNG, YARN-3543-RM.PNG

Currently we can know whether an application submitted by the user is AM managed from the ApplicationSubmissionContext. This can only be done at the time the user submits the job. We should have access to this info from the ApplicationReport as well, so that we can check whether an app is AM managed or not at any time.
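Once the report carries the flag, the client-side check becomes straightforward. A sketch against the existing YarnClient API; the commented-out accessor is hypothetical, since its exact name depends on the final patch:

{code:java}
import org.apache.hadoop.yarn.api.records.ApplicationId;
import org.apache.hadoop.yarn.api.records.ApplicationReport;
import org.apache.hadoop.yarn.client.api.YarnClient;
import org.apache.hadoop.yarn.conf.YarnConfiguration;

public class CheckAmManaged {
    public static void main(String[] args) throws Exception {
        YarnClient yarnClient = YarnClient.createYarnClient();
        yarnClient.init(new YarnConfiguration());
        yarnClient.start();
        try {
            // Placeholder id; in practice this comes from the submitted app.
            ApplicationId appId = ApplicationId.newInstance(1234567890123L, 1);
            ApplicationReport report = yarnClient.getApplicationReport(appId);

            // Hypothetical accessor this issue proposes to add, e.g.:
            // boolean unmanaged = report.isUnmanagedApp();
            System.out.println("Got report for " + report.getApplicationId());
        } finally {
            yarnClient.stop();
        }
    }
}
{code}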
[jira] [Commented] (YARN-3705) forcemanual transition of RM active/standby state in automatic-failover mode should change elector state
[ https://issues.apache.org/jira/browse/YARN-3705?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14557934#comment-14557934 ]

Masatake Iwasaki commented on YARN-3705:

When I invoked {{hdfs haadmin -transitionToStandby --forcemanual nn1}}, the ZKFC detected the change of status and made the standby (nn2) active. I think the behaviour in YARN is not consistent with HDFS. I'm going to try adding checking logic using o.a.h.ha.HealthMonitor to the EmbeddedElectorService of the RM. Alternatively, just making transitionToActive/transitionToStandby change the elector's status might work for the manual transition case.

forcemanual transition of RM active/standby state in automatic-failover mode should change elector state

Key: YARN-3705
URL: https://issues.apache.org/jira/browse/YARN-3705
Project: Hadoop YARN
Issue Type: Sub-task
Reporter: Masatake Iwasaki
Assignee: Masatake Iwasaki

Executing {{rmadmin -transitionToActive --forcemanual}} and {{rmadmin -transitionToStandby --forcemanual}} in automatic-failover.enabled mode changes the active/standby state of the ResourceManager while keeping the state of the ActiveStandbyElector. It should make the elector quit and rejoin; otherwise, forcemanual transition should not be allowed in automatic-failover mode, in order to avoid confusion.
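A minimal sketch of the second option above (having the manual transitions also move the elector). The wrapper class is a hypothetical stand-in, not the actual EmbeddedElectorService code, though ActiveStandbyElector#quitElection and #joinElection are the real o.a.h.ha APIs:

{code:java}
import org.apache.hadoop.ha.ActiveStandbyElector;

/** Hypothetical wrapper; the real change would live in EmbeddedElectorService. */
class ElectorSyncSketch {
    private final ActiveStandbyElector elector;
    private final byte[] localActiveInfo; // data published when this RM is active

    ElectorSyncSketch(ActiveStandbyElector elector, byte[] localActiveInfo) {
        this.elector = elector;
        this.localActiveInfo = localActiveInfo;
    }

    /** forcemanual transitionToStandby: also give up ZooKeeper leadership. */
    void onForcedTransitionToStandby() {
        // needFence=false: we stepped down cleanly, so no fencing is required.
        elector.quitElection(false);
    }

    /** forcemanual transitionToActive: rejoin so elector state matches RM state. */
    void onForcedTransitionToActive() {
        elector.joinElection(localActiveInfo);
    }
}
{code}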
[jira] [Resolved] (YARN-3708) container num become -1 after job finished
[ https://issues.apache.org/jira/browse/YARN-3708?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Rohith resolved YARN-3708.
Resolution: Duplicate

This is a duplicate of YARN-3552. Closing the issue as a duplicate.

container num become -1 after job finished

Key: YARN-3708
URL: https://issues.apache.org/jira/browse/YARN-3708
Project: Hadoop YARN
Issue Type: Bug
Components: yarn
Affects Versions: 2.7.0
Reporter: tongshiquan
Priority: Minor
Attachments: screenshot-1.png