[jira] [Created] (YARN-8964) UI2 should use clusters/{cluster name} for all ATSv2 REST APIs
Rohith Sharma K S created YARN-8964: --- Summary: UI2 should use clusters/{cluster name} for all ATSv2 REST APIs Key: YARN-8964 URL: https://issues.apache.org/jira/browse/YARN-8964 Project: Hadoop YARN Issue Type: Improvement Reporter: Rohith Sharma K S UI2 makes REST calls to TimelineReader without a cluster name. It is advised to make the REST calls with clusters/{cluster name} so that a remote TimelineReader daemon can serve different clusters. *Example*: *Current*: /ws/v2/timeline/flows/ *Change*: /ws/v2/timeline/*clusters/\{cluster name\}*/flows/ *yarn.resourcemanager.cluster-id* is configured with the cluster name, so this config could be used to get the cluster-id. cc: [~sunilg] [~akhilpb] -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
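The proposed URL change can be sketched as a small helper that prefixes each TimelineReader path with the cluster id (which UI2 would read from yarn.resourcemanager.cluster-id). The class and method names below are illustrative, not the actual UI2 implementation; the cluster id is passed in directly to keep the sketch self-contained:

```java
// Sketch: build an ATSv2 REST path that includes the cluster id, as
// proposed in YARN-8964. The fallback preserves the current behavior
// when no cluster id is configured.
public class TimelinePathBuilder {
    private static final String BASE = "/ws/v2/timeline";

    public static String clusterPath(String clusterId, String resource) {
        if (clusterId == null || clusterId.isEmpty()) {
            // Old, cluster-less form: /ws/v2/timeline/flows/
            return BASE + "/" + resource;
        }
        // New form: /ws/v2/timeline/clusters/{cluster name}/flows/
        return BASE + "/clusters/" + clusterId + "/" + resource;
    }
}
```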
[jira] [Commented] (YARN-8961) [UI2] Flow Run End Time shows 'Invalid date'
[ https://issues.apache.org/jira/browse/YARN-8961?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16669677#comment-16669677 ] Akhil PB commented on YARN-8961: Hi [~charanh], Could you please attach the flow runs response? The invalid date might be caused by an empty/null date from the backend. Wanted to double check this. > [UI2] Flow Run End Time shows 'Invalid date' > > > Key: YARN-8961 > URL: https://issues.apache.org/jira/browse/YARN-8961 > Project: Hadoop YARN > Issue Type: Bug >Reporter: Charan Hebri >Assignee: Akhil PB >Priority: Major > Attachments: Invalid_Date.png > > > End Time for Flow Runs is shown as *Invalid date* for runs that are in > progress. This should be shown as *N/A* just like it is shown for 'CPU > VCores' and 'Memory Used'. Attached relevant screenshot. > cc [~akhilpb]
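The fix discussed above (show *N/A* instead of *Invalid date* when a run has no end time yet) amounts to guarding the formatter against a missing timestamp. UI2 itself is Ember/JavaScript; the sketch below shows the same guard logic in Java with an illustrative helper name and date pattern:

```java
import java.time.Instant;
import java.time.ZoneOffset;
import java.time.format.DateTimeFormatter;

// Sketch: format a flow-run end time, treating a missing (null or zero)
// timestamp as "N/A" instead of producing an invalid date string.
public class EndTimeFormatter {
    private static final DateTimeFormatter FMT =
        DateTimeFormatter.ofPattern("yyyy-MM-dd HH:mm:ss").withZone(ZoneOffset.UTC);

    public static String format(Long endTimeMillis) {
        if (endTimeMillis == null || endTimeMillis <= 0) {
            return "N/A";  // run still in progress; backend reported no end time
        }
        return FMT.format(Instant.ofEpochMilli(endTimeMillis));
    }
}
```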
[jira] [Commented] (YARN-8902) Add volume manager that manages CSI volume lifecycle
[ https://issues.apache.org/jira/browse/YARN-8902?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16669572#comment-16669572 ] Weiwei Yang commented on YARN-8902: --- v7 patch fixed the findbugs and checkstyle issues > Add volume manager that manages CSI volume lifecycle > > > Key: YARN-8902 > URL: https://issues.apache.org/jira/browse/YARN-8902 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Weiwei Yang >Assignee: Weiwei Yang >Priority: Major > Attachments: YARN-8902.001.patch, YARN-8902.002.patch, > YARN-8902.003.patch, YARN-8902.004.patch, YARN-8902.005.patch, > YARN-8902.006.patch, YARN-8902.007.patch > > > The CSI volume manager is a service running in RM process, that manages all > CSI volumes' lifecycle. The details about volume's lifecycle states can be > found in [CSI > spec|https://github.com/container-storage-interface/spec/blob/master/spec.md].
[jira] [Updated] (YARN-8902) Add volume manager that manages CSI volume lifecycle
[ https://issues.apache.org/jira/browse/YARN-8902?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Weiwei Yang updated YARN-8902: -- Attachment: YARN-8902.007.patch > Add volume manager that manages CSI volume lifecycle > > > Key: YARN-8902 > URL: https://issues.apache.org/jira/browse/YARN-8902 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Weiwei Yang >Assignee: Weiwei Yang >Priority: Major > Attachments: YARN-8902.001.patch, YARN-8902.002.patch, > YARN-8902.003.patch, YARN-8902.004.patch, YARN-8902.005.patch, > YARN-8902.006.patch, YARN-8902.007.patch > > > The CSI volume manager is a service running in RM process, that manages all > CSI volumes' lifecycle. The details about volume's lifecycle states can be > found in [CSI > spec|https://github.com/container-storage-interface/spec/blob/master/spec.md].
[jira] [Updated] (YARN-8761) Service AM support for decommissioning component instances
[ https://issues.apache.org/jira/browse/YARN-8761?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Billie Rinaldi updated YARN-8761: - Attachment: YARN-8761.03.patch > Service AM support for decommissioning component instances > -- > > Key: YARN-8761 > URL: https://issues.apache.org/jira/browse/YARN-8761 > Project: Hadoop YARN > Issue Type: Bug >Reporter: Billie Rinaldi >Assignee: Billie Rinaldi >Priority: Major > Attachments: YARN-8761.01.patch, YARN-8761.02.patch, > YARN-8761.03.patch > > > The idea behind this feature is to have a flex down where specific component > instances are removed. Currently on a flex down, the service AM chooses for > removal the component instances with the highest IDs.
[jira] [Commented] (YARN-8838) Add security check for container user is same as websocket user
[ https://issues.apache.org/jira/browse/YARN-8838?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16669437#comment-16669437 ] Eric Yang commented on YARN-8838: - Patch 002 allows the YARN admin user to log in to a container if the yarn.acl.enable feature is enabled. > Add security check for container user is same as websocket user > --- > > Key: YARN-8838 > URL: https://issues.apache.org/jira/browse/YARN-8838 > Project: Hadoop YARN > Issue Type: Sub-task > Components: nodemanager >Reporter: Eric Yang >Assignee: Eric Yang >Priority: Major > Labels: docker > Attachments: YARN-8838.001.patch, YARN-8838.002.patch > > > When a user is authenticated via the SPNEGO entry point, the node manager must verify > that the remote user is the same as the container user before starting the web socket > session. One possible solution is to verify that the web request user matches the > yarn container local directory owner during onWebSocketConnect.
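The check proposed in the description (compare the authenticated websocket user against the owner of the container's local directory) can be sketched as below. The class and method names are illustrative, not the actual NodeManager code, and the check fails closed when the owner cannot be determined:

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;

// Sketch: verify the authenticated websocket user matches the owner of
// the container's local directory before starting the shell session.
public class ContainerUserCheck {
    public static boolean isSameUser(String remoteUser, Path containerLocalDir) {
        if (remoteUser == null) {
            return false;
        }
        try {
            String owner = Files.getOwner(containerLocalDir).getName();
            return owner.equals(remoteUser);
        } catch (IOException e) {
            // Fail closed: if the owner cannot be read, deny access.
            return false;
        }
    }
}
```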
[jira] [Updated] (YARN-8838) Add security check for container user is same as websocket user
[ https://issues.apache.org/jira/browse/YARN-8838?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eric Yang updated YARN-8838: Attachment: YARN-8838.002.patch > Add security check for container user is same as websocket user > --- > > Key: YARN-8838 > URL: https://issues.apache.org/jira/browse/YARN-8838 > Project: Hadoop YARN > Issue Type: Sub-task > Components: nodemanager >Reporter: Eric Yang >Assignee: Eric Yang >Priority: Major > Labels: docker > Attachments: YARN-8838.001.patch, YARN-8838.002.patch > > > When a user is authenticated via the SPNEGO entry point, the node manager must verify > that the remote user is the same as the container user before starting the web socket > session. One possible solution is to verify that the web request user matches the > yarn container local directory owner during onWebSocketConnect.
[jira] [Commented] (YARN-8957) Add Serializable interface to ComponentContainers
[ https://issues.apache.org/jira/browse/YARN-8957?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16669411#comment-16669411 ] Zhankun Tang commented on YARN-8957: [~csingh] Thanks for the review and for sharing its original JIRA! > Add Serializable interface to ComponentContainers > - > > Key: YARN-8957 > URL: https://issues.apache.org/jira/browse/YARN-8957 > Project: Hadoop YARN > Issue Type: Improvement >Reporter: Zhankun Tang >Assignee: Zhankun Tang >Priority: Minor > Attachments: YARN-8957-trunk.001.patch > > > In the YARN service API: > public class ComponentContainers > { private static final long serialVersionUID = -1456748479118874991L; ... } > > seems like it should be > > public class ComponentContainers {color:#d04437}implements > Serializable{color} { > private static final long serialVersionUID = -1456748479118874991L; ... }
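The requested change is minimal: a serialVersionUID field only takes effect once the class actually implements java.io.Serializable. A self-contained sketch of the corrected class follows; the serialVersionUID is the one quoted in the report, while the fields and accessors are illustrative, not the real YARN service API class:

```java
import java.io.Serializable;
import java.util.ArrayList;
import java.util.List;

// Sketch of the fix: without "implements Serializable", the
// serialVersionUID below is dead weight and Java serialization of this
// class would throw NotSerializableException.
public class ComponentContainers implements Serializable {
    private static final long serialVersionUID = -1456748479118874991L;

    private String componentName;                          // illustrative field
    private List<String> containerIds = new ArrayList<>(); // illustrative field

    public void setComponentName(String name) { this.componentName = name; }
    public String getComponentName() { return componentName; }
    public List<String> getContainerIds() { return containerIds; }
}
```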
[jira] [Comment Edited] (YARN-8839) Define a protocol exchange between websocket client and server for interactive shell
[ https://issues.apache.org/jira/browse/YARN-8839?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16669405#comment-16669405 ] Eric Yang edited comment on YARN-8839 at 10/30/18 10:43 PM: In the current implementation, there is one control message implemented. Each protocol frame starts with the number one, followed by a JSON string. For example, sending a heartbeat looks like: {code} 1{} {code} This is used to ping the server to check whether there is any output. We can use the same format to set the terminal size: {code} 1{cols:80, rows:25} {code} The rest of the data stream is treated as a byte array. was (Author: eyang): In the current implementation, there is one control implemented. The structure of the protocol is leading by number one, and follow by a JSON string, for example, sending heartbeat looks like: 1{} This is used to ping server to check if there is any output. We can use the same format to set terminal size: 1{cols:80, rows:25} The rest of the data stream are treated as byte array. > Define a protocol exchange between websocket client and server for > interactive shell > > > Key: YARN-8839 > URL: https://issues.apache.org/jira/browse/YARN-8839 > Project: Hadoop YARN > Issue Type: Sub-task > Components: nodemanager >Reporter: Eric Yang >Assignee: Eric Yang >Priority: Major > Labels: docker > > Running an interactive shell is more than piping stdio from docker exec through > a web socket. To enable terminal-based programs to run, there are certain > functions that work outside of the stdio streams to the destination program. A > couple of known functions to improve terminal usability: > # Resize terminal columns and rows > # Set title of the window > # Upload files via zmodem protocol > # Set terminal type > # Heartbeat (poll server side for more data) > # Send keystroke payload to server side > If we want to be at parity with commonly supported ssh terminal functions, we > need to develop a set of protocols between websocket client and server. > Client and server intercept the messages to enable functions that are > normally outside of the stdio streams.
[jira] [Commented] (YARN-8839) Define a protocol exchange between websocket client and server for interactive shell
[ https://issues.apache.org/jira/browse/YARN-8839?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16669405#comment-16669405 ] Eric Yang commented on YARN-8839: - In the current implementation, there is one control message implemented. Each protocol frame starts with the number one, followed by a JSON string. For example, sending a heartbeat looks like: 1{} This is used to ping the server to check whether there is any output. We can use the same format to set the terminal size: 1{cols:80, rows:25} The rest of the data stream is treated as a byte array. > Define a protocol exchange between websocket client and server for > interactive shell > > > Key: YARN-8839 > URL: https://issues.apache.org/jira/browse/YARN-8839 > Project: Hadoop YARN > Issue Type: Sub-task > Components: nodemanager >Reporter: Eric Yang >Assignee: Eric Yang >Priority: Major > Labels: docker > > Running an interactive shell is more than piping stdio from docker exec through > a web socket. To enable terminal-based programs to run, there are certain > functions that work outside of the stdio streams to the destination program. A > couple of known functions to improve terminal usability: > # Resize terminal columns and rows > # Set title of the window > # Upload files via zmodem protocol > # Set terminal type > # Heartbeat (poll server side for more data) > # Send keystroke payload to server side > If we want to be at parity with commonly supported ssh terminal functions, we > need to develop a set of protocols between websocket client and server. > Client and server intercept the messages to enable functions that are > normally outside of the stdio streams.
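The framing described in the comment (a leading '1' followed by a JSON control payload, everything else passed through as raw data) can be sketched as a tiny encoder/decoder. The class and method names are illustrative, the JSON handling is simplified to plain strings, and note the inherent ambiguity: raw data that happens to start with "1{" would also match this framing:

```java
// Sketch: frame format for the interactive-shell websocket protocol
// described in YARN-8839. A message starting with "1{" carries a JSON
// control payload (heartbeat, terminal resize, ...); anything else is
// treated as raw byte-array data.
public class ShellProtocol {
    private static final String CONTROL_PREFIX = "1{";

    /** Encode a control payload, e.g. a heartbeat is encodeControl("{}"). */
    public static String encodeControl(String json) {
        return "1" + json;
    }

    public static boolean isControl(String message) {
        return message != null && message.startsWith(CONTROL_PREFIX);
    }

    /** Extract the JSON payload from a control frame (strips the leading "1"). */
    public static String controlPayload(String message) {
        return message.substring(1);
    }
}
```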
[jira] [Created] (YARN-8963) Add flag to disable interactive shell
Eric Yang created YARN-8963: --- Summary: Add flag to disable interactive shell Key: YARN-8963 URL: https://issues.apache.org/jira/browse/YARN-8963 Project: Hadoop YARN Issue Type: Sub-task Reporter: Eric Yang For some production jobs, the application admin might choose to disable debugging to prevent developers or system admins from accessing the containers. It would be nice to add an environment variable flag to disable the interactive shell during application submission.
[jira] [Created] (YARN-8962) Add ability to use interactive shell with normal yarn container
Eric Yang created YARN-8962: --- Summary: Add ability to use interactive shell with normal yarn container Key: YARN-8962 URL: https://issues.apache.org/jira/browse/YARN-8962 Project: Hadoop YARN Issue Type: Sub-task Reporter: Eric Yang This task focuses on extending the interactive shell capability to YARN containers without Docker. This will improve some aspects of debugging MapReduce or Spark applications.
[jira] [Commented] (YARN-8867) Retrieve the status of resource localization
[ https://issues.apache.org/jira/browse/YARN-8867?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16669395#comment-16669395 ] Eric Yang commented on YARN-8867: - [~csingh] Thank you for the update. Is this status going to show up in the Yarn Service JSON or surface to the end user through some other mechanism? The status definition may also include a state for not yet started, like PENDING. The rest looks like it is on the right track. Thanks > Retrieve the status of resource localization > > > Key: YARN-8867 > URL: https://issues.apache.org/jira/browse/YARN-8867 > Project: Hadoop YARN > Issue Type: Sub-task > Components: yarn >Reporter: Chandni Singh >Assignee: Chandni Singh >Priority: Major > Attachments: YARN-8867.wip.patch > > > Refer YARN-3854. > Currently NM does not have an API to retrieve the status of localization. > Unless the client can know when the localization of a resource is complete > irrespective of the type of the resource, it cannot take any appropriate > action. > We need an API in {{ContainerManagementProtocol}} to retrieve the status on > the localization.
[jira] [Commented] (YARN-8714) [Submarine] Support files/tarballs to be localized for a training job.
[ https://issues.apache.org/jira/browse/YARN-8714?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16669392#comment-16669392 ] Wangda Tan commented on YARN-8714: -- [~tangzhankun] , sure, please go ahead. > [Submarine] Support files/tarballs to be localized for a training job. > -- > > Key: YARN-8714 > URL: https://issues.apache.org/jira/browse/YARN-8714 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Wangda Tan >Priority: Major > > See > https://docs.google.com/document/d/199J4pB3blqgV9SCNvBbTqkEoQdjoyGMjESV4MktCo0k/edit#heading=h.vkxp9edl11m7, > {{job run --localizations ...}}
[jira] [Updated] (YARN-8867) Retrieve the status of resource localization
[ https://issues.apache.org/jira/browse/YARN-8867?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chandni Singh updated YARN-8867: Summary: Retrieve the status of resource localization (was: Retrieve the progress of resource localization) > Retrieve the status of resource localization > > > Key: YARN-8867 > URL: https://issues.apache.org/jira/browse/YARN-8867 > Project: Hadoop YARN > Issue Type: Sub-task > Components: yarn >Reporter: Chandni Singh >Assignee: Chandni Singh >Priority: Major > Attachments: YARN-8867.wip.patch > > > Refer YARN-3854. > Currently NM does not have an API to retrieve the progress of localization. > Unless the client can know when the localization of a resource is complete > irrespective of the type of the resource, it cannot take any appropriate > action. > We need an API in {{ContainerManagementProtocol}} to retrieve the progress on > the localization.
[jira] [Commented] (YARN-8867) Retrieve the status of resource localization
[ https://issues.apache.org/jira/browse/YARN-8867?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16669368#comment-16669368 ] Chandni Singh commented on YARN-8867: - [~eyang] I changed the title and description of the jira to avoid confusion. > Retrieve the status of resource localization > > > Key: YARN-8867 > URL: https://issues.apache.org/jira/browse/YARN-8867 > Project: Hadoop YARN > Issue Type: Sub-task > Components: yarn >Reporter: Chandni Singh >Assignee: Chandni Singh >Priority: Major > Attachments: YARN-8867.wip.patch > > > Refer YARN-3854. > Currently NM does not have an API to retrieve the status of localization. > Unless the client can know when the localization of a resource is complete > irrespective of the type of the resource, it cannot take any appropriate > action. > We need an API in {{ContainerManagementProtocol}} to retrieve the status on > the localization.
[jira] [Updated] (YARN-8867) Retrieve the status of resource localization
[ https://issues.apache.org/jira/browse/YARN-8867?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chandni Singh updated YARN-8867: Description: Refer YARN-3854. Currently NM does not have an API to retrieve the status of localization. Unless the client can know when the localization of a resource is complete irrespective of the type of the resource, it cannot take any appropriate action. We need an API in {{ContainerManagementProtocol}} to retrieve the status on the localization. was: Refer YARN-3854. Currently NM does not have an API to retrieve the progress of localization. Unless the client can know when the localization of a resource is complete irrespective of the type of the resource, it cannot take any appropriate action. We need an API in {{ContainerManagementProtocol}} to retrieve the progress on the localization. > Retrieve the status of resource localization > > > Key: YARN-8867 > URL: https://issues.apache.org/jira/browse/YARN-8867 > Project: Hadoop YARN > Issue Type: Sub-task > Components: yarn >Reporter: Chandni Singh >Assignee: Chandni Singh >Priority: Major > Attachments: YARN-8867.wip.patch > > > Refer YARN-3854. > Currently NM does not have an API to retrieve the status of localization. > Unless the client can know when the localization of a resource is complete > irrespective of the type of the resource, it cannot take any appropriate > action. > We need an API in {{ContainerManagementProtocol}} to retrieve the status on > the localization.
[jira] [Commented] (YARN-8867) Retrieve the progress of resource localization
[ https://issues.apache.org/jira/browse/YARN-8867?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16669361#comment-16669361 ] Eric Yang commented on YARN-8867: - The JIRA title says progress; instead, what would be reported is the status of the localization. Either change the title of the JIRA to match the implementation, or change the localization status to a progress with a numeric percentage for accuracy. > Retrieve the progress of resource localization > -- > > Key: YARN-8867 > URL: https://issues.apache.org/jira/browse/YARN-8867 > Project: Hadoop YARN > Issue Type: Sub-task > Components: yarn >Reporter: Chandni Singh >Assignee: Chandni Singh >Priority: Major > Attachments: YARN-8867.wip.patch > > > Refer YARN-3854. > Currently NM does not have an API to retrieve the progress of localization. > Unless the client can know when the localization of a resource is complete > irrespective of the type of the resource, it cannot take any appropriate > action. > We need an API in {{ContainerManagementProtocol}} to retrieve the progress on > the localization.
[jira] [Commented] (YARN-8867) Retrieve the progress of resource localization
[ https://issues.apache.org/jira/browse/YARN-8867?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16669342#comment-16669342 ] Chandni Singh commented on YARN-8867: - [~eyang] Thanks for reviewing. Retrieving the progress is a good point. I just think it can be done as a next step. As you stated, this patch adds the basic protocol to query the status of localized resources. With this, the status would only say whether the resource localization is IN_PROGRESS, COMPLETED, or FAILED. Adding more information in {{LocalizationStatus}}, like progress, could be handled separately. Let me know if that is okay? > Retrieve the progress of resource localization > -- > > Key: YARN-8867 > URL: https://issues.apache.org/jira/browse/YARN-8867 > Project: Hadoop YARN > Issue Type: Sub-task > Components: yarn >Reporter: Chandni Singh >Assignee: Chandni Singh >Priority: Major > Attachments: YARN-8867.wip.patch > > > Refer YARN-3854. > Currently NM does not have an API to retrieve the progress of localization. > Unless the client can know when the localization of a resource is complete > irrespective of the type of the resource, it cannot take any appropriate > action. > We need an API in {{ContainerManagementProtocol}} to retrieve the progress on > the localization.
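The per-resource states discussed in this thread can be captured in a small status record. This is a sketch, not the YARN-8867 patch: the class and field names are illustrative, and PENDING is included per the review suggestion that a "not yet started" state may be needed:

```java
// Sketch: a per-resource localization status record. The states
// IN_PROGRESS, COMPLETED and FAILED come from the discussion above;
// PENDING covers the "not yet started" state suggested by the reviewer.
public class LocalizationStatusSketch {
    public enum State { PENDING, IN_PROGRESS, COMPLETED, FAILED }

    private final String resourceKey;
    private final State state;

    public LocalizationStatusSketch(String resourceKey, State state) {
        this.resourceKey = resourceKey;
        this.state = state;
    }

    public String getResourceKey() { return resourceKey; }
    public State getState() { return state; }

    /** Terminal states need no further polling by the client. */
    public boolean isFinal() {
        return state == State.COMPLETED || state == State.FAILED;
    }
}
```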
[jira] [Updated] (YARN-8932) ResourceUtilization cpu is misused in oversubscription as a percentage
[ https://issues.apache.org/jira/browse/YARN-8932?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Haibo Chen updated YARN-8932: - Attachment: YARN-8932-YARN-1011.02.patch > ResourceUtilization cpu is misused in oversubscription as a percentage > -- > > Key: YARN-8932 > URL: https://issues.apache.org/jira/browse/YARN-8932 > Project: Hadoop YARN > Issue Type: Sub-task >Affects Versions: YARN-1011 >Reporter: Haibo Chen >Assignee: Haibo Chen >Priority: Major > Attachments: YARN-8932-YARN-1011.00.patch, > YARN-8932-YARN-1011.01.patch, YARN-8932-YARN-1011.02.patch > > > The ResourceUtilization javadoc mistakenly documents the cpu as a percentage > represented by a float number in [0, 1.0f], however it is used as the # of > vcores used in reality. > See javadoc and discussion in YARN-8911. > /** > * Get CPU utilization. > * > * @return CPU utilization normalized to 1 CPU > */ > @Public > @Unstable > public abstract float getCPU();
[jira] [Commented] (YARN-8932) ResourceUtilization cpu is misused in oversubscription as a percentage
[ https://issues.apache.org/jira/browse/YARN-8932?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16669331#comment-16669331 ] Haibo Chen commented on YARN-8932: -- Thanks for the review, [~rkanter]. I have updated the patch to include a few more comments to explain what the numbers mean after the change. The cpu utilization numbers involved were, before this patch, percentage numbers. By multiplying them by the total number of vcores, they are changed to absolute cpu utilization in terms of the number of vcores used. > ResourceUtilization cpu is misused in oversubscription as a percentage > -- > > Key: YARN-8932 > URL: https://issues.apache.org/jira/browse/YARN-8932 > Project: Hadoop YARN > Issue Type: Sub-task >Affects Versions: YARN-1011 >Reporter: Haibo Chen >Assignee: Haibo Chen >Priority: Major > Attachments: YARN-8932-YARN-1011.00.patch, > YARN-8932-YARN-1011.01.patch, YARN-8932-YARN-1011.02.patch > > > The ResourceUtilization javadoc mistakenly documents the cpu as a percentage > represented by a float number in [0, 1.0f], however it is used as the # of > vcores used in reality. > See javadoc and discussion in YARN-8911. > /** > * Get CPU utilization. > * > * @return CPU utilization normalized to 1 CPU > */ > @Public > @Unstable > public abstract float getCPU();
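The conversion described in the comment is simple arithmetic between the two conventions for CPU utilization. The sketch below illustrates that arithmetic with hypothetical helper names; it is not the YARN-8932 patch itself:

```java
// Sketch: convert CPU utilization between the two conventions discussed
// in YARN-8932: a percentage in [0, 1.0] of the node's capacity versus
// the absolute number of vcores used on the node.
public class CpuUtilizationConversion {
    /** Percentage in [0, 1.0] multiplied by total vcores -> absolute vcores used. */
    public static float toVcoresUsed(float percentage, int totalVcores) {
        return percentage * totalVcores;
    }

    /** Absolute vcores used divided by total vcores -> percentage in [0, 1.0]. */
    public static float toPercentage(float vcoresUsed, int totalVcores) {
        return vcoresUsed / totalVcores;
    }
}
```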
[jira] [Commented] (YARN-8867) Retrieve the progress of resource localization
[ https://issues.apache.org/jira/browse/YARN-8867?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16669325#comment-16669325 ] Eric Yang commented on YARN-8867: - [~csingh] The current wip patch looks ok for adding a new communication protocol in yarn, node manager and mapreduce client to retrieve localization status. We probably want to define what constitutes a localization status. In the CLI and UI, a progress field seems to indicate a numeric percentage. The current communication protocol returns a list of localization statuses. If we have a container image to be localized, and it is represented by one of the localization statuses, it does not translate well to a percentage value. Docker image download is also hard to represent as a percentage value because docker pull output does not show percentage output. Given those limitations from the backend, it would be in our best interests to define what the user will see as progress. If a progress percentage is still the information that is going to be displayed, we may want to spend more time on how to calculate a progress percentage from the statuses. Otherwise, localizing a single container without other resources might look odd: the UI would be stuck at 0 percent and then jump to 100 percent, because there is only one localization status. > Retrieve the progress of resource localization > -- > > Key: YARN-8867 > URL: https://issues.apache.org/jira/browse/YARN-8867 > Project: Hadoop YARN > Issue Type: Sub-task > Components: yarn >Reporter: Chandni Singh >Assignee: Chandni Singh >Priority: Major > Attachments: YARN-8867.wip.patch > > > Refer YARN-3854. > Currently NM does not have an API to retrieve the progress of localization. > Unless the client can know when the localization of a resource is complete > irrespective of the type of the resource, it cannot take any appropriate > action. > We need an API in {{ContainerManagementProtocol}} to retrieve the progress on > the localization.
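One naive derivation of a progress percentage from the status list is completed-resources over total-resources. The sketch below (illustrative names, not part of any attached patch) makes the limitation raised in the comment concrete: with a single resource, the percentage jumps straight from 0 to 100:

```java
import java.util.List;

// Sketch: derive a coarse progress percentage from a list of
// per-resource localization states. With a single resource this jumps
// straight from 0 to 100, which is exactly the UI limitation discussed
// in the review comment.
public class LocalizationProgress {
    public enum State { PENDING, IN_PROGRESS, COMPLETED, FAILED }

    public static int percentComplete(List<State> states) {
        if (states.isEmpty()) {
            return 100;  // nothing to localize
        }
        long done = states.stream().filter(s -> s == State.COMPLETED).count();
        return (int) (100 * done / states.size());
    }
}
```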
[jira] [Commented] (YARN-8932) ResourceUtilization cpu is misused in oversubscription as a percentage
[ https://issues.apache.org/jira/browse/YARN-8932?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16669288#comment-16669288 ] Robert Kanter commented on YARN-8932: - Thanks for the patch [~haibochen]. A minor comment: - In the tests, you update some of the constants. It would be good to put in comments to explain them. > ResourceUtilization cpu is misused in oversubscription as a percentage > -- > > Key: YARN-8932 > URL: https://issues.apache.org/jira/browse/YARN-8932 > Project: Hadoop YARN > Issue Type: Sub-task >Affects Versions: YARN-1011 >Reporter: Haibo Chen >Assignee: Haibo Chen >Priority: Major > Attachments: YARN-8932-YARN-1011.00.patch, > YARN-8932-YARN-1011.01.patch > > > The ResourceUtilization javadoc mistakenly documents the cpu as a percentage > represented by a float number in [0, 1.0f], however it is used as the # of > vcores used in reality. > See javadoc and discussion in YARN-8911. > /** > * Get CPU utilization. > * > * @return CPU utilization normalized to 1 CPU > */ > @Public > @Unstable > public abstract float getCPU();
[jira] [Commented] (YARN-8902) Add volume manager that manages CSI volume lifecycle
[ https://issues.apache.org/jira/browse/YARN-8902?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16669269#comment-16669269 ] Hadoop QA commented on YARN-8902:

-1 overall

|| Vote || Subsystem || Runtime || Comment ||
| 0 | reexec | 0m 21s | Docker mode activated. |
|| Prechecks ||
| +1 | @author | 0m 0s | The patch does not contain any @author tags. |
| +1 | test4tests | 0m 0s | The patch appears to include 4 new or modified test files. |
|| trunk Compile Tests ||
| 0 | mvndep | 0m 12s | Maven dependency ordering for branch |
| +1 | mvninstall | 19m 52s | trunk passed |
| +1 | compile | 2m 41s | trunk passed |
| +1 | checkstyle | 1m 0s | trunk passed |
| +1 | mvnsite | 1m 22s | trunk passed |
| +1 | shadedclient | 14m 24s | branch has no errors when building and testing our client artifacts. |
| +1 | findbugs | 2m 6s | trunk passed |
| +1 | javadoc | 0m 54s | trunk passed |
|| Patch Compile Tests ||
| 0 | mvndep | 0m 12s | Maven dependency ordering for patch |
| +1 | mvninstall | 1m 17s | the patch passed |
| +1 | compile | 2m 34s | the patch passed |
| +1 | javac | 2m 34s | the patch passed |
| -0 | checkstyle | 0m 56s | hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server: The patch generated 11 new + 58 unchanged - 0 fixed = 69 total (was 58) |
| +1 | mvnsite | 1m 14s | the patch passed |
| +1 | whitespace | 0m 0s | The patch has no whitespace issues. |
| +1 | shadedclient | 13m 8s | patch has no errors when building and testing our client artifacts. |
| -1 | findbugs | 1m 19s | hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager generated 1 new + 0 unchanged - 0 fixed = 1 total (was 0) |
| +1 | javadoc | 0m 49s | the patch passed |
|| Other Tests ||
| +1 | unit | 2m 19s | hadoop-yarn-server-common in the patch passed. |
| -1 | unit | 105m 7s | hadoop-yarn-server-resourcemanager in the patch failed. |
| +1 | asflicense | 0m 25s | The patch does not generate ASF License warnings. |
| | | 172m 25s | |

|| Reason || Tests ||
| FindBugs | module:hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager | Should org.apache.hadoop.yarn.server.resourcemanager.volume.csi.provisioner.VolumeProvisioningResults$VolumeProvisioningResult be a _static_ inner class? At VolumeProvisioningResults.java:[lines 64-79] |

|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:8f97d6f |
| JIRA Issue | YARN-8902 |
| JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12946186/YARN-8902
[jira] [Commented] (YARN-8960) Can't get submarine service status using the command of "yarn app -status" under security environment
[ https://issues.apache.org/jira/browse/YARN-8960?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16669239#comment-16669239 ] Hadoop QA commented on YARN-8960:

-1 overall

|| Vote || Subsystem || Runtime || Comment ||
| 0 | reexec | 0m 21s | Docker mode activated. |
|| Prechecks ||
| +1 | @author | 0m 0s | The patch does not contain any @author tags. |
| -1 | test4tests | 0m 0s | The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. |
|| trunk Compile Tests ||
| +1 | mvninstall | 20m 50s | trunk passed |
| +1 | compile | 0m 25s | trunk passed |
| +1 | checkstyle | 0m 17s | trunk passed |
| +1 | mvnsite | 0m 28s | trunk passed |
| +1 | shadedclient | 12m 38s | branch has no errors when building and testing our client artifacts. |
| +1 | findbugs | 0m 33s | trunk passed |
| +1 | javadoc | 0m 20s | trunk passed |
|| Patch Compile Tests ||
| +1 | mvninstall | 0m 25s | the patch passed |
| +1 | compile | 0m 20s | the patch passed |
| +1 | javac | 0m 20s | the patch passed |
| -0 | checkstyle | 0m 13s | hadoop-yarn-project/hadoop-yarn/hadoop-yarn-applications/hadoop-yarn-submarine: The patch generated 7 new + 31 unchanged - 0 fixed = 38 total (was 31) |
| +1 | mvnsite | 0m 24s | the patch passed |
| +1 | whitespace | 0m 0s | The patch has no whitespace issues. |
| +1 | shadedclient | 13m 17s | patch has no errors when building and testing our client artifacts. |
| +1 | findbugs | 0m 37s | the patch passed |
| +1 | javadoc | 0m 18s | the patch passed |
|| Other Tests ||
| +1 | unit | 0m 31s | hadoop-yarn-submarine in the patch passed. |
| +1 | asflicense | 0m 26s | The patch does not generate ASF License warnings. |
| | | 52m 45s | |

|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:8f97d6f |
| JIRA Issue | YARN-8960 |
| JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12946198/YARN-8960.002.patch |
| Optional Tests | dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient findbugs checkstyle |
| uname | Linux b93bd98809b5 3.13.0-143-generic #192-Ubuntu SMP Tue Feb 27 10:45:36 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/patchprocess/precommit/personality/provided.sh |
| git revision | trunk / 277a3d8 |
| maven | version: Apache Maven 3.3.9 |
| Default Java | 1.8.0_181 |
| findbugs | v3.1.0-RC1 |
| checkstyle | https://builds.apache.org/job/PreCommit-YARN-Build/22383/artifact/out/diff-checkstyle-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-applications_hadoop-yarn-submarine.txt |
| Test Results | https://builds.apache.org/job/PreCommit-YARN-Build/22383/testReport/ |
| Max. process+thread count | 329 (vs. ulimi
[jira] [Commented] (YARN-8897) LoadBasedRouterPolicy throws "NPE" in case of sub cluster unavailability
[ https://issues.apache.org/jira/browse/YARN-8897?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16669196#comment-16669196 ] Hadoop QA commented on YARN-8897:

+1 overall

|| Vote || Subsystem || Runtime || Comment ||
| 0 | reexec | 0m 22s | Docker mode activated. |
|| Prechecks ||
| +1 | @author | 0m 0s | The patch does not contain any @author tags. |
| +1 | test4tests | 0m 0s | The patch appears to include 1 new or modified test files. |
|| trunk Compile Tests ||
| +1 | mvninstall | 21m 51s | trunk passed |
| +1 | compile | 0m 31s | trunk passed |
| +1 | checkstyle | 0m 21s | trunk passed |
| +1 | mvnsite | 0m 33s | trunk passed |
| +1 | shadedclient | 12m 49s | branch has no errors when building and testing our client artifacts. |
| +1 | findbugs | 0m 54s | trunk passed |
| +1 | javadoc | 0m 25s | trunk passed |
|| Patch Compile Tests ||
| +1 | mvninstall | 0m 31s | the patch passed |
| +1 | compile | 0m 27s | the patch passed |
| +1 | javac | 0m 27s | the patch passed |
| +1 | checkstyle | 0m 16s | the patch passed |
| +1 | mvnsite | 0m 29s | the patch passed |
| +1 | whitespace | 0m 0s | The patch has no whitespace issues. |
| +1 | shadedclient | 13m 14s | patch has no errors when building and testing our client artifacts. |
| +1 | findbugs | 0m 59s | the patch passed |
| +1 | javadoc | 0m 21s | the patch passed |
|| Other Tests ||
| +1 | unit | 2m 18s | hadoop-yarn-server-common in the patch passed. |
| +1 | asflicense | 0m 24s | The patch does not generate ASF License warnings. |
| | | 56m 59s | |

|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:8f97d6f |
| JIRA Issue | YARN-8897 |
| JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12946184/YARN-8897-003.patch |
| Optional Tests | dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient findbugs checkstyle |
| uname | Linux 76000ac9f62a 3.13.0-153-generic #203-Ubuntu SMP Thu Jun 14 08:52:28 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/patchprocess/precommit/personality/provided.sh |
| git revision | trunk / 277a3d8 |
| maven | version: Apache Maven 3.3.9 |
| Default Java | 1.8.0_181 |
| findbugs | v3.1.0-RC1 |
| Test Results | https://builds.apache.org/job/PreCommit-YARN-Build/22382/testReport/ |
| Max. process+thread count | 294 (vs. ulimit of 1) |
| modules | C: hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-common U: hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-common |
| Console output | https://builds.apache.org/job/PreCommit-YARN-Build/22382/console |
| Powered by | Apache Yetus 0.8.0 http://yetus.apache.org |

This message was automatically generated.

> LoadBasedRouterPolicy throws "NPE" in case of sub cluster una
[jira] [Commented] (YARN-8932) ResourceUtilization cpu is misused in oversubscription as a percentage
[ https://issues.apache.org/jira/browse/YARN-8932?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16669151#comment-16669151 ] Haibo Chen commented on YARN-8932:

The unit test failure is unrelated: the code changes do not touch TestWorkPreservingRMRestart, and I could not reproduce it locally. TestQueueManagementDynamicEditPolicy.testEditScheduler is tracked by YARN-8494, and TestNMProxy.testNMProxyRPCRetry has been failing in other jiras.

> ResourceUtilization cpu is misused in oversubscription as a percentage
[jira] [Commented] (YARN-8955) Add a flag to use local docker image instead of getting latest from registry
[ https://issues.apache.org/jira/browse/YARN-8955?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16669119#comment-16669119 ] Chandni Singh commented on YARN-8955:

[~eyang] I would like to take this up since I am working on YARN-3854. Assigning it to myself.

> Add a flag to use local docker image instead of getting latest from registry
> ----------------------------------------------------------------------------
> Key: YARN-8955
> URL: https://issues.apache.org/jira/browse/YARN-8955
> Project: Hadoop YARN
> Issue Type: Sub-task
> Reporter: Eric Yang
> Priority: Major
> Labels: Docker
>
> Some companies have a security policy to use local docker images instead of
> pulling the latest images from the internet. When a docker image is pulled in
> the localization phase, there are two possible outcomes: the image is the
> latest from a trusted registry, or the image is a static local copy.
> This task is to add a configuration flag that gives priority to the local
> image over the trusted registry image.
> If an image already exists locally, the node manager does not trigger a docker
> pull to get the latest image from trusted registries.
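A minimal sketch of the flag's intended semantics. The class, method, and flag names below are hypothetical illustrations, not the actual NodeManager code or configuration key, which this issue has yet to define:

```java
// Hypothetical sketch of the proposed "prefer local image" behavior; names
// are illustrative only, not part of the YARN NodeManager implementation.
public class LocalImagePolicy {
    // Proposed flag: when true, a locally present image wins and no pull happens.
    private final boolean preferLocalImage;

    public LocalImagePolicy(boolean preferLocalImage) {
        this.preferLocalImage = preferLocalImage;
    }

    // Decide whether localization should run "docker pull" for this image.
    public boolean shouldPull(boolean imageExistsLocally) {
        if (preferLocalImage && imageExistsLocally) {
            return false; // use the static local copy, skip the registry
        }
        return true; // default behavior: refresh from the trusted registry
    }
}
```

With the flag off, the behavior is unchanged: every localization pulls the latest image from the trusted registry, image present locally or not.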
[jira] [Assigned] (YARN-8955) Add a flag to use local docker image instead of getting latest from registry
[ https://issues.apache.org/jira/browse/YARN-8955?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chandni Singh reassigned YARN-8955:

Assignee: Chandni Singh

> Add a flag to use local docker image instead of getting latest from registry
[jira] [Commented] (YARN-8957) Add Serializable interface to ComponentContainers
[ https://issues.apache.org/jira/browse/YARN-8957?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16669114#comment-16669114 ] Chandni Singh commented on YARN-8957:

+1 (non-binding). It should implement {{Serializable}}. {{ComponentContainers}} was added with YARN-8542.

> Add Serializable interface to ComponentContainers
> -------------------------------------------------
> Key: YARN-8957
> URL: https://issues.apache.org/jira/browse/YARN-8957
> Project: Hadoop YARN
> Issue Type: Improvement
> Reporter: Zhankun Tang
> Assignee: Zhankun Tang
> Priority: Minor
> Attachments: YARN-8957-trunk.001.patch
>
> In the YARN service API:
> public class ComponentContainers {
>   private static final long serialVersionUID = -1456748479118874991L; ... }
> seems it should be:
> public class ComponentContainers implements Serializable {
>   private static final long serialVersionUID = -1456748479118874991L; ... }
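The point of the fix can be shown with a serialization round trip. This is a simplified stand-in class, not the real {{ComponentContainers}} from the YARN service API: declaring a serialVersionUID has no effect unless the class also implements Serializable, and without it ObjectOutputStream throws NotSerializableException.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

// Simplified stand-in for ComponentContainers; only the Serializable aspect
// is modeled here, the real class carries the service's container list.
public class SerializableSketch {
    static class ComponentContainers implements Serializable {
        private static final long serialVersionUID = -1456748479118874991L;
        String componentName;
    }

    // Serialize to bytes and read the object back, as RPC/state-store paths do.
    static ComponentContainers roundTrip(ComponentContainers in) throws Exception {
        ByteArrayOutputStream bos = new ByteArrayOutputStream();
        new ObjectOutputStream(bos).writeObject(in);
        ObjectInputStream ois =
            new ObjectInputStream(new ByteArrayInputStream(bos.toByteArray()));
        return (ComponentContainers) ois.readObject();
    }

    public static void main(String[] args) throws Exception {
        ComponentContainers cc = new ComponentContainers();
        cc.componentName = "sleeper";
        System.out.println(roundTrip(cc).componentName); // prints sleeper
    }
}
```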
[jira] [Commented] (YARN-8958) Schedulable entities leak in fair ordering policy when recovering containers between remove app attempt and remove app
[ https://issues.apache.org/jira/browse/YARN-8958?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16669111#comment-16669111 ] Tao Yang commented on YARN-8958:

Attached v2 patch to fix the UT failures:
(1) Set {{yarn.resourcemanager.store.class=org.apache.hadoop.yarn.server.resourcemanager.recovery.MemoryRMStateStore}} for TestFairOrderingPolicy#testSchedulableEntitiesLeak so that the RM does not recover apps from state left behind by a former test case.
(2) TestCapacityScheduler#testAllocateReorder has always had a problem: it activates only one app but expects two. It passed before because app2 was added into the schedulable entities by calling CapacityScheduler#allocate explicitly in the test case (app2 was added into entitiesToReorder and then into schedulableEntities) even though app2 was never activated. This patch exposes the problem; after setting {{yarn.scheduler.capacity.maximum-am-resource-percent=1.0}} both apps are activated and the test case passes again.

> Schedulable entities leak in fair ordering policy when recovering containers
> between remove app attempt and remove app
> ----------------------------------------------------------------------------
> Key: YARN-8958
> URL: https://issues.apache.org/jira/browse/YARN-8958
> Project: Hadoop YARN
> Issue Type: Bug
> Components: capacityscheduler
> Affects Versions: 3.2.1
> Reporter: Tao Yang
> Assignee: Tao Yang
> Priority: Major
> Attachments: YARN-8958.001.patch, YARN-8958.002.patch
>
> We found an NPE in ClientRMService#getApplications when querying apps with a
> specified queue. The cause is an app that cannot be found by calling
> RMContextImpl#getRMApps (it is finished and swapped out of memory) but can
> still be queried from the fair ordering policy.
> To reproduce the schedulable entities leak in fair ordering policy:
> (1) create app1 and launch container1 on node1
> (2) restart RM
> (3) remove app1 attempt; app1 is removed from the schedulable entities.
> (4) recover container1; the state of container1 changes to COMPLETED, app1 is
> brought back into entitiesToReorder after the container is released, and app1
> is then added back into the schedulable entities when the scheduler calls
> FairOrderingPolicy#getAssignmentIterator.
> (5) remove app1
> To solve this problem, we should make sure schedulableEntities can only be
> affected by adding or removing an app attempt; a new entity should not be
> added into schedulableEntities by the reordering process.
> {code:java}
> protected void reorderSchedulableEntity(S schedulableEntity) {
>   //remove, update comparable data, and reinsert to update position in order
>   schedulableEntities.remove(schedulableEntity);
>   updateSchedulingResourceUsage(
>       schedulableEntity.getSchedulingResourceUsage());
>   schedulableEntities.add(schedulableEntity);
> }
> {code}
> The code above can be improved as follows, so that only an existent entity can
> be re-added into schedulableEntities.
> {code:java}
> protected void reorderSchedulableEntity(S schedulableEntity) {
>   //remove, update comparable data, and reinsert to update position in order
>   boolean exists = schedulableEntities.remove(schedulableEntity);
>   updateSchedulingResourceUsage(
>       schedulableEntity.getSchedulingResourceUsage());
>   if (exists) {
>     schedulableEntities.add(schedulableEntity);
>   } else {
>     LOG.info("Skip reordering non-existent schedulable entity: "
>         + schedulableEntity.getId());
>   }
> }
> {code}
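The resurrection described above can be reproduced with a toy model. This is a standalone sketch whose names mirror the description, not the CapacityScheduler code: the buggy reorder blindly re-inserts, so an entity removed with its app attempt comes back; the fixed version re-inserts only if the remove actually found it.

```java
import java.util.TreeSet;

// Toy model of the YARN-8958 leak: reordering must not resurrect entities
// that were already removed with their app attempt.
public class ReorderSketch {
    static TreeSet<String> schedulableEntities = new TreeSet<>();

    // Buggy version: blindly re-inserts, resurrecting removed entities.
    static void reorderBuggy(String entity) {
        schedulableEntities.remove(entity);
        schedulableEntities.add(entity);
    }

    // Fixed version: only re-insert entities that were actually present.
    static void reorderFixed(String entity) {
        boolean exists = schedulableEntities.remove(entity);
        if (exists) {
            schedulableEntities.add(entity);
        }
    }

    public static void main(String[] args) {
        schedulableEntities.add("app1");
        schedulableEntities.remove("app1"); // app attempt removed
        reorderBuggy("app1");               // leak: app1 resurrected
        System.out.println(schedulableEntities.contains("app1")); // prints true

        schedulableEntities.clear();
        reorderFixed("app1");               // no-op for a removed entity
        System.out.println(schedulableEntities.contains("app1")); // prints false
    }
}
```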
[jira] [Updated] (YARN-6729) Clarify documentation on how to enable cgroup support
[ https://issues.apache.org/jira/browse/YARN-6729?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Shane Kumpf updated YARN-6729:

Summary: Clarify documentation on how to enable cgroup support (was: NM percentage-physical-cpu-limit should be always 100 if DefaultLCEResourcesHandler is used)

> Clarify documentation on how to enable cgroup support
> -----------------------------------------------------
> Key: YARN-6729
> URL: https://issues.apache.org/jira/browse/YARN-6729
> Project: Hadoop YARN
> Issue Type: Bug
> Components: nodemanager
> Reporter: Yufei Gu
> Assignee: Zhankun Tang
> Priority: Major
> Attachments: YARN-6729-trunk.001.patch
>
> NM percentage-physical-cpu-limit is not honored in DefaultLCEResourcesHandler,
> which may cause container cpu usage calculation issues; e.g. container vcore
> usage is potentially more than 100% if percentage-physical-cpu-limit is set to
> a value less than 100.
[jira] [Updated] (YARN-8854) Upgrade jquery datatable version references to v1.10.19
[ https://issues.apache.org/jira/browse/YARN-8854?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sunil Govindan updated YARN-8854:

Fix Version/s: 3.3.0

> Upgrade jquery datatable version references to v1.10.19
> -------------------------------------------------------
> Key: YARN-8854
> URL: https://issues.apache.org/jira/browse/YARN-8854
> Project: Hadoop YARN
> Issue Type: Bug
> Reporter: Akhil PB
> Assignee: Akhil PB
> Priority: Critical
> Fix For: 3.3.0
>
> Attachments: YARN-8854.001.patch, YARN-8854.002.patch, YARN-8854.003.patch, YARN-8854.004.patch
[jira] [Updated] (YARN-8854) Upgrade jquery datatable version references to v1.10.19
[ https://issues.apache.org/jira/browse/YARN-8854?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sunil Govindan updated YARN-8854:

Summary: Upgrade jquery datatable version references to v1.10.19 (was: [Hadoop YARN Common] Update jquery datatable version references)

> Upgrade jquery datatable version references to v1.10.19
[jira] [Updated] (YARN-8958) Schedulable entities leak in fair ordering policy when recovering containers between remove app attempt and remove app
[ https://issues.apache.org/jira/browse/YARN-8958?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tao Yang updated YARN-8958:

Attachment: YARN-8958.002.patch

> Schedulable entities leak in fair ordering policy when recovering containers
> between remove app attempt and remove app
[jira] [Commented] (YARN-8905) [Router] Add JvmMetricsInfo and pause monitor
[ https://issues.apache.org/jira/browse/YARN-8905?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16669075#comment-16669075 ] Hadoop QA commented on YARN-8905:

+1 overall

|| Vote || Subsystem || Runtime || Comment ||
| 0 | reexec | 0m 18s | Docker mode activated. |
|| Prechecks ||
| +1 | @author | 0m 0s | The patch does not contain any @author tags. |
| +1 | test4tests | 0m 0s | The patch appears to include 1 new or modified test files. |
|| trunk Compile Tests ||
| +1 | mvninstall | 20m 36s | trunk passed |
| +1 | compile | 0m 25s | trunk passed |
| +1 | checkstyle | 0m 17s | trunk passed |
| +1 | mvnsite | 0m 27s | trunk passed |
| +1 | shadedclient | 12m 39s | branch has no errors when building and testing our client artifacts. |
| +1 | findbugs | 0m 33s | trunk passed |
| +1 | javadoc | 0m 20s | trunk passed |
|| Patch Compile Tests ||
| +1 | mvninstall | 0m 25s | the patch passed |
| +1 | compile | 0m 21s | the patch passed |
| +1 | javac | 0m 21s | the patch passed |
| +1 | checkstyle | 0m 13s | the patch passed |
| +1 | mvnsite | 0m 23s | the patch passed |
| +1 | whitespace | 0m 0s | The patch has no whitespace issues. |
| +1 | shadedclient | 13m 16s | patch has no errors when building and testing our client artifacts. |
| +1 | findbugs | 0m 38s | the patch passed |
| +1 | javadoc | 0m 18s | the patch passed |
|| Other Tests ||
| +1 | unit | 1m 30s | hadoop-yarn-server-router in the patch passed. |
| +1 | asflicense | 0m 26s | The patch does not generate ASF License warnings. |
| | | 53m 27s | |

|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:8f97d6f |
| JIRA Issue | YARN-8905 |
| JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12946185/YARN-8905-005.patch |
| Optional Tests | dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient findbugs checkstyle |
| uname | Linux 4310e2a7660c 3.13.0-139-generic #188-Ubuntu SMP Tue Jan 9 14:43:09 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/patchprocess/precommit/personality/provided.sh |
| git revision | trunk / 62d98ca |
| maven | version: Apache Maven 3.3.9 |
| Default Java | 1.8.0_181 |
| findbugs | v3.1.0-RC1 |
| Test Results | https://builds.apache.org/job/PreCommit-YARN-Build/22380/testReport/ |
| Max. process+thread count | 705 (vs. ulimit of 1) |
| modules | C: hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-router U: hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-router |
| Console output | https://builds.apache.org/job/PreCommit-YARN-Build/22380/console |
| Powered by | Apache Yetus 0.8.0 http://yetus.apache.org |

This message was automatically generated.

> [Router] Add JvmMetricsInfo and pause monitor
> --
[jira] [Updated] (YARN-8778) Add Command Line interface to invoke interactive docker shell
[ https://issues.apache.org/jira/browse/YARN-8778?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eric Yang updated YARN-8778: Attachment: YARN-8778.007.patch > Add Command Line interface to invoke interactive docker shell > - > > Key: YARN-8778 > URL: https://issues.apache.org/jira/browse/YARN-8778 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Zian Chen >Assignee: Eric Yang >Priority: Major > Labels: Docker > Attachments: YARN-8778.001.patch, YARN-8778.002.patch, > YARN-8778.003.patch, YARN-8778.004.patch, YARN-8778.005.patch, > YARN-8778.006.patch, YARN-8778.007.patch > > > CLI will be the mandatory interface we are providing for a user to use > interactive docker shell feature. We will need to create a new class > “InteractiveDockerShellCLI” to read command line into the servlet and pass > all the way down to docker executor. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Updated] (YARN-8776) Container Executor change to create stdin/stdout pipeline
[ https://issues.apache.org/jira/browse/YARN-8776?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eric Yang updated YARN-8776: Attachment: YARN-8776.006.patch > Container Executor change to create stdin/stdout pipeline > - > > Key: YARN-8776 > URL: https://issues.apache.org/jira/browse/YARN-8776 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Zian Chen >Assignee: Zian Chen >Priority: Major > Labels: Docker > Attachments: YARN-8776.001.patch, YARN-8776.002.patch, > YARN-8776.003.patch, YARN-8776.004.patch, YARN-8776.005.patch, > YARN-8776.006.patch > > > The pipeline is built to connect the stdin/stdout channel from WebSocket > servlet through container-executor to docker executor. So when the WebSocket > servlet is started, we need to invoke container-executor “dockerExec” method > (which will be implemented) to create a new docker executor and use “docker > exec -it $ContainerId” command which executes an interactive bash shell on > the container.
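The stdin/stdout bridging described above can be sketched minimally as follows. This is not the container-executor implementation; `cat` stands in for `docker exec -i <containerId> bash` so the sketch runs without Docker, and the class and method names are illustrative only.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.nio.charset.StandardCharsets;

// Minimal sketch of the pipeline shape: spawn a child process and bridge
// bytes in and out. "cat" stands in for "docker exec -i <containerId> bash".
public class PipeBridge {

    static String roundTrip(String input) throws IOException, InterruptedException {
        Process child = new ProcessBuilder("cat").start();
        try (OutputStream toChild = child.getOutputStream()) {
            // Bytes written here become the child's stdin (the
            // WebSocket -> shell direction in the real pipeline).
            toChild.write(input.getBytes(StandardCharsets.UTF_8));
        }
        ByteArrayOutputStream captured = new ByteArrayOutputStream();
        try (InputStream fromChild = child.getInputStream()) {
            // The child's stdout flows back (the shell -> WebSocket direction).
            fromChild.transferTo(captured);
        }
        child.waitFor();
        return captured.toString(StandardCharsets.UTF_8.name());
    }

    public static void main(String[] args) throws Exception {
        System.out.println(roundTrip("hello from the pipeline")); // prints "hello from the pipeline"
    }
}
```

In the real feature the two pump loops would run concurrently on separate threads, since an interactive shell reads and writes interleaved.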
[jira] [Updated] (YARN-8914) Add xtermjs to YARN UI2
[ https://issues.apache.org/jira/browse/YARN-8914?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eric Yang updated YARN-8914: Attachment: YARN-8914.005.patch > Add xtermjs to YARN UI2 > --- > > Key: YARN-8914 > URL: https://issues.apache.org/jira/browse/YARN-8914 > Project: Hadoop YARN > Issue Type: Sub-task > Components: yarn-ui-v2 >Reporter: Eric Yang >Assignee: Eric Yang >Priority: Major > Attachments: YARN-8914.001.patch, YARN-8914.002.patch, > YARN-8914.003.patch, YARN-8914.004.patch, YARN-8914.005.patch > > > In the container listing from UI2, we can add a link to connect to docker > container using xtermjs.
[jira] [Commented] (YARN-8958) Schedulable entities leak in fair ordering policy when recovering containers between remove app attempt and remove app
[ https://issues.apache.org/jira/browse/YARN-8958?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16668954#comment-16668954 ] Hadoop QA commented on YARN-8958: - | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 20s{color} | {color:blue} Docker mode activated. {color} | || || || || {color:brown} Prechecks {color} || | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} | | {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s{color} | {color:green} The patch appears to include 1 new or modified test files. {color} | || || || || {color:brown} trunk Compile Tests {color} || | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 20m 55s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 46s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 38s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 48s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 13m 22s{color} | {color:green} branch has no errors when building and testing our client artifacts. 
{color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 1m 15s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 30s{color} | {color:green} trunk passed {color} | || || || || {color:brown} Patch Compile Tests {color} || | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 0m 45s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 41s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 41s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 33s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 43s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s{color} | {color:green} The patch has no whitespace issues. {color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 13m 22s{color} | {color:green} patch has no errors when building and testing our client artifacts. {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 1m 17s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 26s{color} | {color:green} the patch passed {color} | || || || || {color:brown} Other Tests {color} || | {color:red}-1{color} | {color:red} unit {color} | {color:red}105m 2s{color} | {color:red} hadoop-yarn-server-resourcemanager in the patch failed. {color} | | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 28s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} | | {color:black}{color} | {color:black} {color} | {color:black}161m 32s{color} | {color:black} {color} | \\ \\ || Reason || Tests || | Failed junit tests | hadoop.yarn.server.resourcemanager.scheduler.policy.TestFairOrderingPolicy | | | hadoop.yarn.server.resourcemanager.scheduler.capacity.TestCapacityScheduler | \\ \\ || Subsystem || Report/Notes || | Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:8f97d6f | | JIRA Issue | YARN-8958 | | JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12946177/YARN-8958.001.patch | | Optional Tests | dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient findbugs checkstyle | | uname | Linux 8e70d66ca204 3.13.0-153-generic #203-Ubuntu SMP Thu Jun 14 08:52:28 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux | | Build tool | maven | | Personality | /testptch/patchprocess/precommit/personality/provided.sh | | git revision | trunk / 7757331 | | maven | version: Apache Maven 3.3.9 | | Default Java | 1.8.0_181 | | findbugs | v3.1.0-RC1 | | unit | https://builds.apache.org/job/PreCommit-YARN-Build/22379/artifact/out/patch-unit-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-resourcemanager.txt | | Test Results | https://builds.apache.org/job/PreCommit-YARN-Build/22379/testReport/ | | Max. process+thread count | 894 (vs. ulimit of 1) | | modules | C: hadoop-yarn-project/had
[jira] [Updated] (YARN-8575) Avoid committing allocation proposal to unavailable nodes in async scheduling
[ https://issues.apache.org/jira/browse/YARN-8575?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bibin A Chundatt updated YARN-8575: --- Affects Version/s: (was: 3.1.2) (was: 3.2.0) > Avoid committing allocation proposal to unavailable nodes in async scheduling > - > > Key: YARN-8575 > URL: https://issues.apache.org/jira/browse/YARN-8575 > Project: Hadoop YARN > Issue Type: Bug > Components: capacityscheduler >Reporter: Tao Yang >Assignee: Tao Yang >Priority: Major > Fix For: 3.2.0, 3.1.2 > > Attachments: YARN-8575.001.patch, YARN-8575.002.patch > > > Recently we found a new error as follows: > {noformat} > ERROR > org.apache.hadoop.yarn.server.resourcemanager.scheduler.common.fica.FiCaSchedulerApp: > node to unreserve doesn't exist, nodeid: host1:45454 > {noformat} > Reproduce this problem: > (1) Create a reserve proposal for app1 on node1 > (2) node1 is successfully decommissioned and removed from node tracker > (3) Try to commit this outdated reserve proposal, it will be accepted and > applied. > This error may occur after decommissioning some NMs. The application > that prints the error log will always have a reserved container on a > non-existent (decommissioned) NM and the pending request will never be satisfied. > To solve this problem, the scheduler should check node state in > FiCaSchedulerApp#accept to avoid committing outdated proposals on unusable > nodes.
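The proposed fix — rejecting a proposal whose target node has been removed between proposal creation and commit — can be sketched as a standalone illustration. The `NodeTracker` class and `accept` signature below are hypothetical stand-ins, not the actual FiCaSchedulerApp/ClusterNodeTracker code.

```java
import java.util.HashSet;
import java.util.Set;

// Standalone sketch of the proposed check: before committing an allocation
// or reserve proposal, verify the target node is still tracked.
public class ProposalNodeCheck {

    static class NodeTracker {
        private final Set<String> liveNodes = new HashSet<>();
        void addNode(String nodeId) { liveNodes.add(nodeId); }
        void removeNode(String nodeId) { liveNodes.remove(nodeId); } // decommission
        boolean isTracked(String nodeId) { return liveNodes.contains(nodeId); }
    }

    // Mirrors the check suggested for FiCaSchedulerApp#accept: an outdated
    // proposal targeting a removed (decommissioned) node is rejected.
    static boolean accept(NodeTracker tracker, String proposalNodeId) {
        return tracker.isTracked(proposalNodeId);
    }

    public static void main(String[] args) {
        NodeTracker tracker = new NodeTracker();
        tracker.addNode("host1:45454");
        System.out.println(accept(tracker, "host1:45454")); // true: node still live
        tracker.removeNode("host1:45454");                  // node decommissioned
        System.out.println(accept(tracker, "host1:45454")); // false: proposal rejected
    }
}
```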
[jira] [Commented] (YARN-6729) NM percentage-physical-cpu-limit should be always 100 if DefaultLCEResourcesHandler is used
[ https://issues.apache.org/jira/browse/YARN-6729?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16668663#comment-16668663 ] Shane Kumpf commented on YARN-6729: --- Thanks, [~tangzhankun]! I'm good with that approach. I'll commit this shortly unless there are other comments. {quote}One thing in my mind is that once we put "yarn.nodemanager.resource.cpu.enabled" into the document. Does it mean that these features are stable? Because you know, for the historical reason, that settings are marked "unstable". {quote} Enabling CgroupsLCEResourcesHandler follows the same code path, so I don't have much concern. We can discuss the annotations and removal of the deprecated code as part of YARN-8924. > NM percentage-physical-cpu-limit should be always 100 if > DefaultLCEResourcesHandler is used > --- > > Key: YARN-6729 > URL: https://issues.apache.org/jira/browse/YARN-6729 > Project: Hadoop YARN > Issue Type: Bug > Components: nodemanager >Reporter: Yufei Gu >Assignee: Zhankun Tang >Priority: Major > Attachments: YARN-6729-trunk.001.patch > > > NM percentage-physical-cpu-limit is not honored in > DefaultLCEResourcesHandler, which may cause container cpu usage calculation > issue. e.g. container vcore usage is potentially more than 100% if > percentage-physical-cpu-limit is set to a value less than 100.
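A rough illustration of why the calculation goes wrong: the usage math scales measured CPU up by percentage-physical-cpu-limit, assuming the container really is throttled to that fraction of each core — which DefaultLCEResourcesHandler never enforces. The method below is a hypothetical simplification, not the NodeManager's actual monitor code.

```java
// Hypothetical simplification of the container CPU usage calculation.
public class CpuLimitExample {

    static long milliVcoresUsed(float cpuPercentPerCore, int vcores, int cpuLimitPercent) {
        // Divide measured CPU% by the configured limit fraction; with no
        // actual throttling in place this exceeds 1000 milli-vcores per vcore.
        return (long) (cpuPercentPerCore * 10f * vcores / (cpuLimitPercent / 100f));
    }

    public static void main(String[] args) {
        // Limit enforced (cgroups): a fully busy core maps to 1000 milli-vcores.
        System.out.println(milliVcoresUsed(100f, 1, 100)); // 1000
        // Limit configured at 60% but not enforced: reported usage > 100%.
        System.out.println(milliVcoresUsed(100f, 1, 60));  // 1666
    }
}
```

This is why the issue title asks for the limit to be treated as 100 whenever DefaultLCEResourcesHandler is in use.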
[jira] [Assigned] (YARN-8961) [UI2] Flow Run End Time shows 'Invalid date'
[ https://issues.apache.org/jira/browse/YARN-8961?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sunil Govindan reassigned YARN-8961: Assignee: Akhil PB > [UI2] Flow Run End Time shows 'Invalid date' > > > Key: YARN-8961 > URL: https://issues.apache.org/jira/browse/YARN-8961 > Project: Hadoop YARN > Issue Type: Bug >Reporter: Charan Hebri >Assignee: Akhil PB >Priority: Major > Attachments: Invalid_Date.png > > > End Time for Flow Runs is shown as *Invalid date* for runs that are in > progress. This should be shown as *N/A* just like how it is shown for 'CPU > VCores' and 'Memory Used'. Attached relevant screenshot. > cc [~akhilpb]
[jira] [Created] (YARN-8961) [UI2] Flow Run End Time shows 'Invalid date'
Charan Hebri created YARN-8961: -- Summary: [UI2] Flow Run End Time shows 'Invalid date' Key: YARN-8961 URL: https://issues.apache.org/jira/browse/YARN-8961 Project: Hadoop YARN Issue Type: Bug Reporter: Charan Hebri Attachments: Invalid_Date.png End Time for Flow Runs is shown as *Invalid date* for runs that are in progress. This should be shown as *N/A* just like how it is shown for 'CPU VCores' and 'Memory Used'. Attached relevant screenshot. cc [~akhilpb]
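The guard the issue asks for amounts to checking for a missing or zero end time before formatting. UI2 itself is Ember/JavaScript; the sketch below shows the same logic in Java for illustration only, with a hypothetical formatter and date pattern.

```java
import java.time.Instant;
import java.time.ZoneOffset;
import java.time.format.DateTimeFormatter;

// Sketch of the requested guard: treat a missing or zero end time as an
// in-progress run and show "N/A" instead of producing "Invalid date".
public class EndTimeFormatter {

    private static final DateTimeFormatter FMT = DateTimeFormatter
        .ofPattern("yyyy-MM-dd HH:mm:ss").withZone(ZoneOffset.UTC);

    static String formatEndTime(Long endTimeMillis) {
        if (endTimeMillis == null || endTimeMillis <= 0) {
            return "N/A"; // run still in progress, no end time from the backend
        }
        return FMT.format(Instant.ofEpochMilli(endTimeMillis));
    }

    public static void main(String[] args) {
        System.out.println(formatEndTime(null));           // N/A
        System.out.println(formatEndTime(0L));             // N/A
        System.out.println(formatEndTime(1540000000000L)); // a formatted timestamp
    }
}
```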
[jira] [Updated] (YARN-8960) Can't get submarine service status using the command of "yarn app -status" under security environment
[ https://issues.apache.org/jira/browse/YARN-8960?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zac Zhou updated YARN-8960: --- Attachment: YARN-8960.002.patch > Can't get submarine service status using the command of "yarn app -status" > under security environment > - > > Key: YARN-8960 > URL: https://issues.apache.org/jira/browse/YARN-8960 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Zac Zhou >Assignee: Zac Zhou >Priority: Major > Attachments: YARN-8960.001.patch, YARN-8960.002.patch > > > After submitting a submarine job, we tried to get service status using the > following command: > yarn app -status ${service_name} > But we got the following error: > HTTP error code : 500 > > The stack in resourcemanager log is : > ERROR org.apache.hadoop.yarn.service.webapp.ApiServer: Get service failed: {} > java.lang.reflect.UndeclaredThrowableException > at > org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1748) > at > org.apache.hadoop.yarn.service.webapp.ApiServer.getServiceFromClient(ApiServer.java:800) > at > org.apache.hadoop.yarn.service.webapp.ApiServer.getService(ApiServer.java:186) > at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) > at > sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) > at > sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > at java.lang.reflect.Method.invoke(Method.java:498) > at > com.sun.jersey.spi.container.JavaMethodInvokerFactory$1.invoke(JavaMethodInvokerFactory.java:60) > at > com.sun.jersey.server.impl.model.method.dispatch.AbstractResourceMethodDispatchProvider$ResponseOutInvoker > ._dispatch(AbstractResourceMethodDispatchProvider.java:205) > at > com.sun.jersey.server.impl.model.method.dispatch.ResourceJavaMethodDispatcher.dispatch(ResourceJavaMethodD > ispatcher.java:75) > at > com.sun.jersey.server.impl.uri.rules.HttpMethodRule.accept(HttpMethodRule.java:302) > at > 
com.sun.jersey.server.impl.uri.rules.RightHandPathRule.accept(RightHandPathRule.java:147) > at > com.sun.jersey.server.impl.uri.rules.ResourceClassRule.accept(ResourceClassRule.java:108) > at > com.sun.jersey.server.impl.uri.rules.RightHandPathRule.accept(RightHandPathRule.java:147) > at > com.sun.jersey.server.impl.uri.rules.RootResourceClassesRule.accept(RootResourceClassesRule.java:84) > at > com.sun.jersey.server.impl.application.WebApplicationImpl._handleRequest(WebApplicationImpl.java:1542) > at > com.sun.jersey.server.impl.application.WebApplicationImpl._handleRequest(WebApplicationImpl.java:1473) > at > com.sun.jersey.server.impl.application.WebApplicationImpl.handleRequest(WebApplicationImpl.java:1419) > at > com.sun.jersey.server.impl.application.WebApplicationImpl.handleRequest(WebApplicationImpl.java:1409) > at > com.sun.jersey.spi.container.servlet.WebComponent.service(WebComponent.java:409) > at > com.sun.jersey.spi.container.servlet.ServletContainer.service(ServletContainer.java:558) > at > com.sun.jersey.spi.container.servlet.ServletContainer.service(ServletContainer.java:733) > at javax.servlet.http.HttpServlet.service(HttpServlet.java:790) > at org.eclipse.jetty.servlet.ServletHolder.handle(ServletHolder.java:848) > at > org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1772) > at > com.google.inject.servlet.FilterChainInvocation.doFilter(FilterChainInvocation.java:89) > at > com.sun.jersey.spi.container.servlet.ServletContainer.doFilter(ServletContainer.java:941) > at > com.sun.jersey.spi.container.servlet.ServletContainer.doFilter(ServletContainer.java:875) > at > org.apache.hadoop.yarn.server.resourcemanager.webapp.RMWebAppFilter.doFilter(RMWebAppFilter.java:179) > at > com.sun.jersey.spi.container.servlet.ServletContainer.doFilter(ServletContainer.java:829) > at > com.google.inject.servlet.FilterChainInvocation.doFilter(FilterChainInvocation.java:82) > at > 
com.google.inject.servlet.ManagedFilterPipeline.dispatch(ManagedFilterPipeline.java:119) > at com.google.inject.servlet.GuiceFilter$1.call(GuiceFilter.java:133) > at com.google.inject.servlet.GuiceFilter$1.call(GuiceFilter.java:130) > at com.google.inject.servlet.GuiceFilter$Context.call(GuiceFilter.java:203) > at com.google.inject.servlet.GuiceFilter.doFilter(GuiceFilter.java:130) > at > org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1759) > at > org.apache.hadoop.security.http.XFrameOptionsFilter.doFilter(XFrameOptionsFilter.java:57) > at > org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1759) > at > org.apache.hadoop.security.authentication.server.AuthenticationFilter.doFilter(Authentica
[jira] [Commented] (YARN-8960) Can't get submarine service status using the command of "yarn app -status" under security environment
[ https://issues.apache.org/jira/browse/YARN-8960?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16668550#comment-16668550 ] Zac Zhou commented on YARN-8960: The root cause is that the submarine service doesn't have keytab and principal configuration. We need to add parameter support for that. A patch will be attached shortly. > Can't get submarine service status using the command of "yarn app -status" > under security environment > - > > Key: YARN-8960 > URL: https://issues.apache.org/jira/browse/YARN-8960 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Zac Zhou >Assignee: Zac Zhou >Priority: Major > > After submitting a submarine job, we tried to get service status using the > following command: > yarn app -status ${service_name} > But we got the following error: > HTTP error code : 500 > > The stack in resourcemanager log is : > ERROR org.apache.hadoop.yarn.service.webapp.ApiServer: Get service failed: {} > java.lang.reflect.UndeclaredThrowableException > at > org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1748) > at > org.apache.hadoop.yarn.service.webapp.ApiServer.getServiceFromClient(ApiServer.java:800) > at > org.apache.hadoop.yarn.service.webapp.ApiServer.getService(ApiServer.java:186) > at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) > at > sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) > at > sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > at java.lang.reflect.Method.invoke(Method.java:498) > at > com.sun.jersey.spi.container.JavaMethodInvokerFactory$1.invoke(JavaMethodInvokerFactory.java:60) > at > com.sun.jersey.server.impl.model.method.dispatch.AbstractResourceMethodDispatchProvider$ResponseOutInvoker > ._dispatch(AbstractResourceMethodDispatchProvider.java:205) > at > com.sun.jersey.server.impl.model.method.dispatch.ResourceJavaMethodDispatcher.dispatch(ResourceJavaMethodD > 
ispatcher.java:75) > at > com.sun.jersey.server.impl.uri.rules.HttpMethodRule.accept(HttpMethodRule.java:302) > at > com.sun.jersey.server.impl.uri.rules.RightHandPathRule.accept(RightHandPathRule.java:147) > at > com.sun.jersey.server.impl.uri.rules.ResourceClassRule.accept(ResourceClassRule.java:108) > at > com.sun.jersey.server.impl.uri.rules.RightHandPathRule.accept(RightHandPathRule.java:147) > at > com.sun.jersey.server.impl.uri.rules.RootResourceClassesRule.accept(RootResourceClassesRule.java:84) > at > com.sun.jersey.server.impl.application.WebApplicationImpl._handleRequest(WebApplicationImpl.java:1542) > at > com.sun.jersey.server.impl.application.WebApplicationImpl._handleRequest(WebApplicationImpl.java:1473) > at > com.sun.jersey.server.impl.application.WebApplicationImpl.handleRequest(WebApplicationImpl.java:1419) > at > com.sun.jersey.server.impl.application.WebApplicationImpl.handleRequest(WebApplicationImpl.java:1409) > at > com.sun.jersey.spi.container.servlet.WebComponent.service(WebComponent.java:409) > at > com.sun.jersey.spi.container.servlet.ServletContainer.service(ServletContainer.java:558) > at > com.sun.jersey.spi.container.servlet.ServletContainer.service(ServletContainer.java:733) > at javax.servlet.http.HttpServlet.service(HttpServlet.java:790) > at org.eclipse.jetty.servlet.ServletHolder.handle(ServletHolder.java:848) > at > org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1772) > at > com.google.inject.servlet.FilterChainInvocation.doFilter(FilterChainInvocation.java:89) > at > com.sun.jersey.spi.container.servlet.ServletContainer.doFilter(ServletContainer.java:941) > at > com.sun.jersey.spi.container.servlet.ServletContainer.doFilter(ServletContainer.java:875) > at > org.apache.hadoop.yarn.server.resourcemanager.webapp.RMWebAppFilter.doFilter(RMWebAppFilter.java:179) > at > com.sun.jersey.spi.container.servlet.ServletContainer.doFilter(ServletContainer.java:829) > at > 
com.google.inject.servlet.FilterChainInvocation.doFilter(FilterChainInvocation.java:82) > at > com.google.inject.servlet.ManagedFilterPipeline.dispatch(ManagedFilterPipeline.java:119) > at com.google.inject.servlet.GuiceFilter$1.call(GuiceFilter.java:133) > at com.google.inject.servlet.GuiceFilter$1.call(GuiceFilter.java:130) > at com.google.inject.servlet.GuiceFilter$Context.call(GuiceFilter.java:203) > at com.google.inject.servlet.GuiceFilter.doFilter(GuiceFilter.java:130) > at > org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1759) > at > org.apache.hadoop.security.http.XFrameOptionsFilter.doFilter(XFrameOptionsFilter.java:57) > at > org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter
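The fix direction in the comment above — passing keytab and principal through to the service — corresponds to the kerberos block the YARN service specification accepts. The fragment below is illustrative: the field names follow the YARN Services API, while the service name, principal, and keytab path are placeholders.

```json
{
  "name": "submarine-job",
  "kerberos_principal": {
    "principal_name": "submarine-user/_HOST@EXAMPLE.COM",
    "keytab": "file:///etc/security/keytabs/submarine-user.keytab"
  }
}
```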
[jira] [Created] (YARN-8960) Can't get submarine service status using the command of "yarn app -status" under security environment
Zac Zhou created YARN-8960: -- Summary: Can't get submarine service status using the command of "yarn app -status" under security environment Key: YARN-8960 URL: https://issues.apache.org/jira/browse/YARN-8960 Project: Hadoop YARN Issue Type: Sub-task Reporter: Zac Zhou Assignee: Zac Zhou After submitting a submarine job, we tried to get service status using the following command: yarn app -status ${service_name} But we got the following error: HTTP error code : 500 The stack in resourcemanager log is : ERROR org.apache.hadoop.yarn.service.webapp.ApiServer: Get service failed: {} java.lang.reflect.UndeclaredThrowableException at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1748) at org.apache.hadoop.yarn.service.webapp.ApiServer.getServiceFromClient(ApiServer.java:800) at org.apache.hadoop.yarn.service.webapp.ApiServer.getService(ApiServer.java:186) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:498) at com.sun.jersey.spi.container.JavaMethodInvokerFactory$1.invoke(JavaMethodInvokerFactory.java:60) at com.sun.jersey.server.impl.model.method.dispatch.AbstractResourceMethodDispatchProvider$ResponseOutInvoker ._dispatch(AbstractResourceMethodDispatchProvider.java:205) at com.sun.jersey.server.impl.model.method.dispatch.ResourceJavaMethodDispatcher.dispatch(ResourceJavaMethodD ispatcher.java:75) at com.sun.jersey.server.impl.uri.rules.HttpMethodRule.accept(HttpMethodRule.java:302) at com.sun.jersey.server.impl.uri.rules.RightHandPathRule.accept(RightHandPathRule.java:147) at com.sun.jersey.server.impl.uri.rules.ResourceClassRule.accept(ResourceClassRule.java:108) at com.sun.jersey.server.impl.uri.rules.RightHandPathRule.accept(RightHandPathRule.java:147) at 
com.sun.jersey.server.impl.uri.rules.RootResourceClassesRule.accept(RootResourceClassesRule.java:84) at com.sun.jersey.server.impl.application.WebApplicationImpl._handleRequest(WebApplicationImpl.java:1542) at com.sun.jersey.server.impl.application.WebApplicationImpl._handleRequest(WebApplicationImpl.java:1473) at com.sun.jersey.server.impl.application.WebApplicationImpl.handleRequest(WebApplicationImpl.java:1419) at com.sun.jersey.server.impl.application.WebApplicationImpl.handleRequest(WebApplicationImpl.java:1409) at com.sun.jersey.spi.container.servlet.WebComponent.service(WebComponent.java:409) at com.sun.jersey.spi.container.servlet.ServletContainer.service(ServletContainer.java:558) at com.sun.jersey.spi.container.servlet.ServletContainer.service(ServletContainer.java:733) at javax.servlet.http.HttpServlet.service(HttpServlet.java:790) at org.eclipse.jetty.servlet.ServletHolder.handle(ServletHolder.java:848) at org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1772) at com.google.inject.servlet.FilterChainInvocation.doFilter(FilterChainInvocation.java:89) at com.sun.jersey.spi.container.servlet.ServletContainer.doFilter(ServletContainer.java:941) at com.sun.jersey.spi.container.servlet.ServletContainer.doFilter(ServletContainer.java:875) at org.apache.hadoop.yarn.server.resourcemanager.webapp.RMWebAppFilter.doFilter(RMWebAppFilter.java:179) at com.sun.jersey.spi.container.servlet.ServletContainer.doFilter(ServletContainer.java:829) at com.google.inject.servlet.FilterChainInvocation.doFilter(FilterChainInvocation.java:82) at com.google.inject.servlet.ManagedFilterPipeline.dispatch(ManagedFilterPipeline.java:119) at com.google.inject.servlet.GuiceFilter$1.call(GuiceFilter.java:133) at com.google.inject.servlet.GuiceFilter$1.call(GuiceFilter.java:130) at com.google.inject.servlet.GuiceFilter$Context.call(GuiceFilter.java:203) at com.google.inject.servlet.GuiceFilter.doFilter(GuiceFilter.java:130) at 
org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1759) at org.apache.hadoop.security.http.XFrameOptionsFilter.doFilter(XFrameOptionsFilter.java:57) at org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1759) at org.apache.hadoop.security.authentication.server.AuthenticationFilter.doFilter(AuthenticationFilter.java:6 44) at org.apache.hadoop.security.authentication.server.AuthenticationFilter.doFilter(AuthenticationFilter.java:5 92) at org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1759) at org.apache.hadoop.http.HttpServer2$QuotingInputFilter.doFilter(HttpServer2.java:1610) at org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1759) at org.apache.hadoop.http.NoCacheFilter.doFilter(NoCacheFilter.java:45) at org.eclipse.jetty
[jira] [Commented] (YARN-8959) TestContainerResizing fails randomly
[ https://issues.apache.org/jira/browse/YARN-8959?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16668429#comment-16668429 ] Bibin A Chundatt commented on YARN-8959: for i in {1..100}; do result=`mvn test -Dtest=TestContainerResizing | grep BUILD`; echo $i$result ;cat target/surefire-reports/org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.TestContainerResizing.txt | grep TestContainerResizing ;mv target/surefire-reports target/surefire-reports$i; done > TestContainerResizing fails randomly > > > Key: YARN-8959 > URL: https://issues.apache.org/jira/browse/YARN-8959 > Project: Hadoop YARN > Issue Type: Bug >Reporter: Bibin A Chundatt >Assignee: Bilwa S T >Priority: Minor > > org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.TestContainerResizing.testSimpleDecreaseContainer > {code} > testSimpleDecreaseContainer(org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.TestContainerResizing) > Time elapsed: 0.348 s <<< FAILURE! > java.lang.AssertionError: expected:<1024> but was:<3072> > at org.junit.Assert.fail(Assert.java:88) > at org.junit.Assert.failNotEquals(Assert.java:834) > at org.junit.Assert.assertEquals(Assert.java:645) > at org.junit.Assert.assertEquals(Assert.java:631) > at > org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.TestContainerResizing.checkUsedResource(TestContainerResizing.java:1011) > at > org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.TestContainerResizing.testSimpleDecreaseContainer(TestContainerResizing.java:210) > {code} > org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.TestContainerResizing.testIncreaseContainerUnreservedWhenContainerCompleted > {code} > testIncreaseContainerUnreservedWhenContainerCompleted(org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.TestContainerResizing) > Time elapsed: 0.445 s <<< FAILURE! 
> java.lang.AssertionError: expected:<1024> but was:<7168> > at org.junit.Assert.fail(Assert.java:88) > at org.junit.Assert.failNotEquals(Assert.java:834) > at org.junit.Assert.assertEquals(Assert.java:645) > at org.junit.Assert.assertEquals(Assert.java:631) > at > org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.TestContainerResizing.checkUsedResource(TestContainerResizing.java:1011) > at > org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.TestContainerResizing.testIncreaseContainerUnreservedWhenContainerCompleted(TestContainerResizing.java:729) > {code} > org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.TestContainerResizing.testExcessiveReservationWhenDecreaseSameContainer > {code} > testExcessiveReservationWhenDecreaseSameContainer(org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.TestContainerResizing) > Time elapsed: 0.321 s <<< FAILURE! > java.lang.AssertionError: expected:<1024> but was:<2048> > at org.junit.Assert.fail(Assert.java:88) > at org.junit.Assert.failNotEquals(Assert.java:834) > at org.junit.Assert.assertEquals(Assert.java:645) > at org.junit.Assert.assertEquals(Assert.java:631) > at > org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.TestContainerResizing.checkUsedResource(TestContainerResizing.java:1015) > at > org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.TestContainerResizing.testExcessiveReservationWhenDecreaseSameContainer(TestContainerResizing.java:623) > at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) > at > sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) > at > sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > at java.lang.reflect.Method.invoke(Method.java:498) > at > org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:50) > {code}
[jira] [Assigned] (YARN-8959) TestContainerResizing fails randomly
[ https://issues.apache.org/jira/browse/YARN-8959?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bilwa S T reassigned YARN-8959:
---
Assignee: Bilwa S T

> TestContainerResizing fails randomly
>
> Key: YARN-8959
> URL: https://issues.apache.org/jira/browse/YARN-8959
> Project: Hadoop YARN
> Issue Type: Bug
> Reporter: Bibin A Chundatt
> Assignee: Bilwa S T
> Priority: Minor
>
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.TestContainerResizing.testSimpleDecreaseContainer
> {code}
> testSimpleDecreaseContainer(org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.TestContainerResizing)  Time elapsed: 0.348 s  <<< FAILURE!
> java.lang.AssertionError: expected:<1024> but was:<3072>
> 	at org.junit.Assert.fail(Assert.java:88)
> 	at org.junit.Assert.failNotEquals(Assert.java:834)
> 	at org.junit.Assert.assertEquals(Assert.java:645)
> 	at org.junit.Assert.assertEquals(Assert.java:631)
> 	at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.TestContainerResizing.checkUsedResource(TestContainerResizing.java:1011)
> 	at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.TestContainerResizing.testSimpleDecreaseContainer(TestContainerResizing.java:210)
> {code}
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.TestContainerResizing.testIncreaseContainerUnreservedWhenContainerCompleted
> {code}
> testIncreaseContainerUnreservedWhenContainerCompleted(org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.TestContainerResizing)  Time elapsed: 0.445 s  <<< FAILURE!
> java.lang.AssertionError: expected:<1024> but was:<7168>
> 	at org.junit.Assert.fail(Assert.java:88)
> 	at org.junit.Assert.failNotEquals(Assert.java:834)
> 	at org.junit.Assert.assertEquals(Assert.java:645)
> 	at org.junit.Assert.assertEquals(Assert.java:631)
> 	at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.TestContainerResizing.checkUsedResource(TestContainerResizing.java:1011)
> 	at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.TestContainerResizing.testIncreaseContainerUnreservedWhenContainerCompleted(TestContainerResizing.java:729)
> {code}
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.TestContainerResizing.testExcessiveReservationWhenDecreaseSameContainer
> {code}
> testExcessiveReservationWhenDecreaseSameContainer(org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.TestContainerResizing)  Time elapsed: 0.321 s  <<< FAILURE!
> java.lang.AssertionError: expected:<1024> but was:<2048>
> 	at org.junit.Assert.fail(Assert.java:88)
> 	at org.junit.Assert.failNotEquals(Assert.java:834)
> 	at org.junit.Assert.assertEquals(Assert.java:645)
> 	at org.junit.Assert.assertEquals(Assert.java:631)
> 	at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.TestContainerResizing.checkUsedResource(TestContainerResizing.java:1015)
> 	at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.TestContainerResizing.testExcessiveReservationWhenDecreaseSameContainer(TestContainerResizing.java:623)
> 	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> 	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
> 	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> 	at java.lang.reflect.Method.invoke(Method.java:498)
> 	at org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:50)
> {code}

--
This message was sent by Atlassian JIRA (v7.6.3#76005)
-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Created] (YARN-8959) TestContainerResizing fails randomly
Bibin A Chundatt created YARN-8959:
--
Summary: TestContainerResizing fails randomly
Key: YARN-8959
URL: https://issues.apache.org/jira/browse/YARN-8959
Project: Hadoop YARN
Issue Type: Bug
Reporter: Bibin A Chundatt

org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.TestContainerResizing.testSimpleDecreaseContainer
{code}
testSimpleDecreaseContainer(org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.TestContainerResizing)  Time elapsed: 0.348 s  <<< FAILURE!
java.lang.AssertionError: expected:<1024> but was:<3072>
	at org.junit.Assert.fail(Assert.java:88)
	at org.junit.Assert.failNotEquals(Assert.java:834)
	at org.junit.Assert.assertEquals(Assert.java:645)
	at org.junit.Assert.assertEquals(Assert.java:631)
	at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.TestContainerResizing.checkUsedResource(TestContainerResizing.java:1011)
	at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.TestContainerResizing.testSimpleDecreaseContainer(TestContainerResizing.java:210)
{code}
org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.TestContainerResizing.testIncreaseContainerUnreservedWhenContainerCompleted
{code}
testIncreaseContainerUnreservedWhenContainerCompleted(org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.TestContainerResizing)  Time elapsed: 0.445 s  <<< FAILURE!
java.lang.AssertionError: expected:<1024> but was:<7168>
	at org.junit.Assert.fail(Assert.java:88)
	at org.junit.Assert.failNotEquals(Assert.java:834)
	at org.junit.Assert.assertEquals(Assert.java:645)
	at org.junit.Assert.assertEquals(Assert.java:631)
	at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.TestContainerResizing.checkUsedResource(TestContainerResizing.java:1011)
	at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.TestContainerResizing.testIncreaseContainerUnreservedWhenContainerCompleted(TestContainerResizing.java:729)
{code}
org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.TestContainerResizing.testExcessiveReservationWhenDecreaseSameContainer
{code}
testExcessiveReservationWhenDecreaseSameContainer(org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.TestContainerResizing)  Time elapsed: 0.321 s  <<< FAILURE!
java.lang.AssertionError: expected:<1024> but was:<2048>
	at org.junit.Assert.fail(Assert.java:88)
	at org.junit.Assert.failNotEquals(Assert.java:834)
	at org.junit.Assert.assertEquals(Assert.java:645)
	at org.junit.Assert.assertEquals(Assert.java:631)
	at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.TestContainerResizing.checkUsedResource(TestContainerResizing.java:1015)
	at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.TestContainerResizing.testExcessiveReservationWhenDecreaseSameContainer(TestContainerResizing.java:623)
	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
	at java.lang.reflect.Method.invoke(Method.java:498)
	at org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:50)
{code}

--
This message was sent by Atlassian JIRA (v7.6.3#76005)
-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
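The repeated mismatches (expected:<1024> but a larger used value observed) are characteristic of assertions racing with asynchronous container release and allocation in the scheduler. A common remedy for this class of flakiness is to poll until the used resource converges instead of asserting immediately; the sketch below is illustrative only (not the actual YARN patch), and `waitFor` here is a simplified stand-in for a helper such as Hadoop's GenericTestUtils.waitFor:

```java
import java.util.function.BooleanSupplier;

public class WaitForSketch {

    /** Poll the condition every intervalMs until it holds or timeoutMs elapses. */
    static boolean waitFor(BooleanSupplier condition, long intervalMs, long timeoutMs)
            throws InterruptedException {
        long deadline = System.currentTimeMillis() + timeoutMs;
        while (System.currentTimeMillis() < deadline) {
            if (condition.getAsBoolean()) {
                return true;
            }
            Thread.sleep(intervalMs);
        }
        return condition.getAsBoolean(); // one final check at the deadline
    }

    public static void main(String[] args) throws InterruptedException {
        long start = System.currentTimeMillis();
        // Simulated async update: the asserted state only becomes true after ~100 ms,
        // mimicking a container-release event the test would otherwise race against.
        boolean ok = waitFor(() -> System.currentTimeMillis() - start > 100, 10, 2000);
        System.out.println(ok);
    }
}
```

A test written this way tolerates scheduler-event latency up to the timeout instead of depending on event ordering.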
[jira] [Commented] (YARN-8902) Add volume manager that manages CSI volume lifecycle
[ https://issues.apache.org/jira/browse/YARN-8902?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16668389#comment-16668389 ] Weiwei Yang commented on YARN-8902:
---
[~leftnoteasy], [~sunilg], please help to review.

> Add volume manager that manages CSI volume lifecycle
>
> Key: YARN-8902
> URL: https://issues.apache.org/jira/browse/YARN-8902
> Project: Hadoop YARN
> Issue Type: Sub-task
> Reporter: Weiwei Yang
> Assignee: Weiwei Yang
> Priority: Major
> Attachments: YARN-8902.001.patch, YARN-8902.002.patch, YARN-8902.003.patch, YARN-8902.004.patch, YARN-8902.005.patch, YARN-8902.006.patch
>
> The CSI volume manager is a service running in the RM process that manages all CSI volumes' lifecycle. The details about a volume's lifecycle states can be found in the [CSI spec|https://github.com/container-storage-interface/spec/blob/master/spec.md].

--
This message was sent by Atlassian JIRA (v7.6.3#76005)
-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-8902) Add volume manager that manages CSI volume lifecycle
[ https://issues.apache.org/jira/browse/YARN-8902?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16668387#comment-16668387 ] Weiwei Yang commented on YARN-8902:
---
v6 patch uploaded. The major change is to remove the async processing in the volume processor; let's keep it simple in the first version. We do synchronous volume processing as follows:
# AppMaster calls allocate()
# the request gets to the volume processor
# if a volume resource is found in this request, do #4, #5, #6 before entering the next processor
# submit a provisioning task to the VolumeManager and wait for its execution, or time out in 10 sec
# once the provisioning task is done (all volumes are provisioned), call the next processor's allocate()
# if the task fails (some volumes failed to be provisioned), throw an exception to notify the client
Hope this makes sense. Thanks.

> Add volume manager that manages CSI volume lifecycle
>
> Key: YARN-8902
> URL: https://issues.apache.org/jira/browse/YARN-8902
> Project: Hadoop YARN
> Issue Type: Sub-task
> Reporter: Weiwei Yang
> Assignee: Weiwei Yang
> Priority: Major
> Attachments: YARN-8902.001.patch, YARN-8902.002.patch, YARN-8902.003.patch, YARN-8902.004.patch, YARN-8902.005.patch, YARN-8902.006.patch
>
> The CSI volume manager is a service running in the RM process that manages all CSI volumes' lifecycle. The details about a volume's lifecycle states can be found in the [CSI spec|https://github.com/container-storage-interface/spec/blob/master/spec.md].

--
This message was sent by Atlassian JIRA (v7.6.3#76005)
-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
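The synchronous flow described in the comment above can be sketched as a blocking wait on a provisioning Future with a 10-second timeout. This is illustrative only; the class and method names below are invented for the sketch, not the actual API of the YARN-8902 patch:

```java
import java.util.List;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

public class VolumeProvisioningSketch {

    /** Hypothetical exception surfaced to the client when provisioning fails. */
    static class VolumeProvisioningException extends Exception {
        VolumeProvisioningException(String msg) { super(msg); }
    }

    private final ExecutorService pool = Executors.newSingleThreadExecutor();

    /**
     * Provision all volumes synchronously before the next processor's allocate()
     * is entered; fail fast on a provisioning error or after 10 seconds.
     */
    void provisionAndWait(List<String> volumes) throws VolumeProvisioningException {
        Future<Boolean> task = pool.submit(() -> {
            for (String v : volumes) {
                if (v.isEmpty()) {
                    return false; // simulate a volume that fails to provision
                }
            }
            return true;
        });
        try {
            if (!task.get(10, TimeUnit.SECONDS)) {
                throw new VolumeProvisioningException("some volumes failed to provision");
            }
        } catch (TimeoutException e) {
            task.cancel(true);
            throw new VolumeProvisioningException("provisioning timed out");
        } catch (InterruptedException | ExecutionException e) {
            throw new VolumeProvisioningException("provisioning error: " + e);
        }
    }

    public static void main(String[] args) throws Exception {
        VolumeProvisioningSketch sketch = new VolumeProvisioningSketch();
        sketch.provisionAndWait(List.of("vol-1", "vol-2"));
        System.out.println("provisioned");
        try {
            sketch.provisionAndWait(List.of("vol-3", ""));
        } catch (VolumeProvisioningException e) {
            System.out.println("failed: " + e.getMessage());
        }
        sketch.pool.shutdown();
    }
}
```

Blocking the allocate() path keeps the first version simple at the cost of holding the caller for up to the timeout; the async variant was deliberately deferred.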
[jira] [Updated] (YARN-8902) Add volume manager that manages CSI volume lifecycle
[ https://issues.apache.org/jira/browse/YARN-8902?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Weiwei Yang updated YARN-8902:
--
Attachment: YARN-8902.006.patch

> Add volume manager that manages CSI volume lifecycle
>
> Key: YARN-8902
> URL: https://issues.apache.org/jira/browse/YARN-8902
> Project: Hadoop YARN
> Issue Type: Sub-task
> Reporter: Weiwei Yang
> Assignee: Weiwei Yang
> Priority: Major
> Attachments: YARN-8902.001.patch, YARN-8902.002.patch, YARN-8902.003.patch, YARN-8902.004.patch, YARN-8902.005.patch, YARN-8902.006.patch
>
> The CSI volume manager is a service running in the RM process that manages all CSI volumes' lifecycle. The details about a volume's lifecycle states can be found in the [CSI spec|https://github.com/container-storage-interface/spec/blob/master/spec.md].

--
This message was sent by Atlassian JIRA (v7.6.3#76005)
-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Updated] (YARN-8905) [Router] Add JvmMetricsInfo and pause monitor
[ https://issues.apache.org/jira/browse/YARN-8905?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bilwa S T updated YARN-8905:
--
Attachment: YARN-8905-005.patch

> [Router] Add JvmMetricsInfo and pause monitor
>
> Key: YARN-8905
> URL: https://issues.apache.org/jira/browse/YARN-8905
> Project: Hadoop YARN
> Issue Type: Sub-task
> Reporter: Bibin A Chundatt
> Assignee: Bilwa S T
> Priority: Minor
> Attachments: YARN-8905-001.patch, YARN-8905-002.patch, YARN-8905-003.patch, YARN-8905-004.patch, YARN-8905-005.patch
>
> Similar to the resourcemanager and nodemanager services, we can add JvmMetricsInfo to the router service too.

--
This message was sent by Atlassian JIRA (v7.6.3#76005)
-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Updated] (YARN-8897) LoadBasedRouterPolicy throws "NPE" in case of sub cluster unavailability
[ https://issues.apache.org/jira/browse/YARN-8897?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bilwa S T updated YARN-8897:
--
Attachment: YARN-8897-003.patch

> LoadBasedRouterPolicy throws "NPE" in case of sub cluster unavailability
>
> Key: YARN-8897
> URL: https://issues.apache.org/jira/browse/YARN-8897
> Project: Hadoop YARN
> Issue Type: Bug
> Components: federation, router
> Reporter: Akshay Agarwal
> Assignee: Bilwa S T
> Priority: Minor
> Attachments: YARN-8897-001.patch, YARN-8897-002.patch, YARN-8897-003.patch
>
> If no sub clusters are available for the *Load Based Router Policy* with *cluster weight* as *1* in a Router Based Federation setup, a *NullPointerException* is thrown.
>
> *Exception Details:*
> {code:java}
> java.lang.NullPointerException: java.lang.NullPointerException
> 	at org.apache.hadoop.yarn.server.federation.policies.router.LoadBasedRouterPolicy.getHomeSubcluster(LoadBasedRouterPolicy.java:99)
> 	at org.apache.hadoop.yarn.server.federation.policies.RouterPolicyFacade.getHomeSubcluster(RouterPolicyFacade.java:204)
> 	at org.apache.hadoop.yarn.server.router.clientrm.FederationClientInterceptor.submitApplication(FederationClientInterceptor.java:362)
> 	at org.apache.hadoop.yarn.server.router.clientrm.RouterClientRMService.submitApplication(RouterClientRMService.java:218)
> 	at org.apache.hadoop.yarn.api.impl.pb.service.ApplicationClientProtocolPBServiceImpl.submitApplication(ApplicationClientProtocolPBServiceImpl.java:282)
> 	at org.apache.hadoop.yarn.proto.ApplicationClientProtocol$ApplicationClientProtocolService$2.callBlockingMethod(ApplicationClientProtocol.java:579)
> 	at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:523)
> 	at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:991)
> 	at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:872)
> 	at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:818)
> 	at java.security.AccessController.doPrivileged(Native Method)
> 	at javax.security.auth.Subject.doAs(Subject.java:422)
> 	at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1729)
> 	at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2678)
> 	at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
> 	at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
> 	at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
> 	at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
> 	at org.apache.hadoop.yarn.ipc.RPCUtil.instantiateException(RPCUtil.java:53)
> 	at org.apache.hadoop.yarn.ipc.RPCUtil.instantiateRuntimeException(RPCUtil.java:85)
> 	at org.apache.hadoop.yarn.ipc.RPCUtil.unwrapAndThrowException(RPCUtil.java:122)
> 	at org.apache.hadoop.yarn.api.impl.pb.client.ApplicationClientProtocolPBClientImpl.submitApplication(ApplicationClientProtocolPBClientImpl.java:297)
> 	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> 	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
> 	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> 	at java.lang.reflect.Method.invoke(Method.java:498)
> 	at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:422)
> 	at org.apache.hadoop.io.retry.RetryInvocationHandler$Call.invokeMethod(RetryInvocationHandler.java:165)
> 	at org.apache.hadoop.io.retry.RetryInvocationHandler$Call.invoke(RetryInvocationHandler.java:157)
> 	at org.apache.hadoop.io.retry.RetryInvocationHandler$Call.invokeOnce(RetryInvocationHandler.java:95)
> 	at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:359)
> 	at com.sun.proxy.$Proxy15.submitApplication(Unknown Source)
> 	at org.apache.hadoop.yarn.client.api.impl.YarnClientImpl.submitApplication(YarnClientImpl.java:288)
> 	at org.apache.hadoop.mapred.ResourceMgrDelegate.submitApplication(ResourceMgrDelegate.java:300)
> 	at org.apache.hadoop.mapred.YARNRunner.submitJob(YARNRunner.java:331)
> 	at org.apache.hadoop.mapreduce.JobSubmitter.submitJobInternal(JobSubmitter.java:254)
> 	at org.apache.hadoop.mapreduce.Job$11.run(Job.java:1570)
> 	at org.apache.hadoop.mapreduce.Job$11.run(Job.java:1567)
> 	at java.security.AccessController.doPrivileged(Native Method)
> 	at javax.security.auth.Subject.doAs(Subject.java:422)
> 	at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1729)
> 	at org.apache.hadoop.mapreduce.Job.submit(Job.java:1567)
> 	at org.apache.hadoop.mapreduce.Job.waitForCompletion(Job.java:1588)
> 	at org.apache.hadoop.exam
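Per the description, the NPE surfaces in LoadBasedRouterPolicy.getHomeSubcluster when no sub clusters are available. One general way to avoid this class of failure is to guard the selection with an explicit emptiness check and raise an actionable error instead. This sketch is purely illustrative: the class name, exception, and the "most available memory" criterion are assumptions for the example, not the actual LoadBasedRouterPolicy code or the YARN-8897 patch:

```java
import java.util.Map;

public class LoadBasedSelectionSketch {

    /** Hypothetical error raised when routing is impossible. */
    static class NoActiveSubClustersException extends Exception {
        NoActiveSubClustersException(String m) { super(m); }
    }

    /** Pick the subcluster with the most available memory; fail clearly when none exist. */
    static String getHomeSubcluster(Map<String, Long> availableMemoryBySubCluster)
            throws NoActiveSubClustersException {
        if (availableMemoryBySubCluster == null || availableMemoryBySubCluster.isEmpty()) {
            // Guard replaces the NullPointerException with an actionable message.
            throw new NoActiveSubClustersException("no active subclusters to route to");
        }
        return availableMemoryBySubCluster.entrySet().stream()
                .max(Map.Entry.comparingByValue())
                .get().getKey();
    }

    public static void main(String[] args) throws Exception {
        System.out.println(getHomeSubcluster(Map.of("sc1", 4096L, "sc2", 8192L)));
        try {
            getHomeSubcluster(Map.of());
        } catch (NoActiveSubClustersException e) {
            System.out.println("rejected: " + e.getMessage());
        }
    }
}
```

Failing with a typed exception lets the Router return a meaningful submission error to the client rather than an opaque NPE over RPC.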
[jira] [Commented] (YARN-8905) [Router] Add JvmMetricsInfo and pause monitor
[ https://issues.apache.org/jira/browse/YARN-8905?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16668352#comment-16668352 ] Bibin A Chundatt commented on YARN-8905:
---
[~BilwaST] Please fix the checkstyle, whitespace and license issues.

> [Router] Add JvmMetricsInfo and pause monitor
>
> Key: YARN-8905
> URL: https://issues.apache.org/jira/browse/YARN-8905
> Project: Hadoop YARN
> Issue Type: Sub-task
> Reporter: Bibin A Chundatt
> Assignee: Bilwa S T
> Priority: Minor
> Attachments: YARN-8905-001.patch, YARN-8905-002.patch, YARN-8905-003.patch, YARN-8905-004.patch
>
> Similar to the resourcemanager and nodemanager services, we can add JvmMetricsInfo to the router service too.

--
This message was sent by Atlassian JIRA (v7.6.3#76005)
-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Updated] (YARN-8958) Schedulable entities leak in fair ordering policy when recovering containers between remove app attempt and remove app
[ https://issues.apache.org/jira/browse/YARN-8958?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tao Yang updated YARN-8958:
---
Attachment: YARN-8958.001.patch

> Schedulable entities leak in fair ordering policy when recovering containers between remove app attempt and remove app
>
> Key: YARN-8958
> URL: https://issues.apache.org/jira/browse/YARN-8958
> Project: Hadoop YARN
> Issue Type: Bug
> Components: capacityscheduler
> Affects Versions: 3.2.1
> Reporter: Tao Yang
> Assignee: Tao Yang
> Priority: Major
> Attachments: YARN-8958.001.patch
>
> We found an NPE in ClientRMService#getApplications when querying apps with a specified queue. The cause is that there is one app which can't be found by calling RMContextImpl#getRMApps (it is finished and swapped out of memory) but can still be queried from the fair ordering policy.
> To reproduce the schedulable entities leak in fair ordering policy:
> (1) create app1 and launch container1 on node1
> (2) restart RM
> (3) remove the app1 attempt; app1 is removed from the schedulable entities.
> (4) recover container1; the state of container1 changes to COMPLETED, app1 is brought back to entitiesToReorder after the container is released, and app1 is then added back into the schedulable entities when the scheduler calls FairOrderingPolicy#getAssignmentIterator.
> (5) remove app1
> To solve this problem, we should make sure schedulableEntities can only be affected by adding or removing an app attempt; a new entity should not be added into schedulableEntities by the reordering process.
> {code:java}
> protected void reorderSchedulableEntity(S schedulableEntity) {
>   //remove, update comparable data, and reinsert to update position in order
>   schedulableEntities.remove(schedulableEntity);
>   updateSchedulingResourceUsage(
>       schedulableEntity.getSchedulingResourceUsage());
>   schedulableEntities.add(schedulableEntity);
> }
> {code}
> The code above can be improved as follows to make sure only an existing entity can be re-added into schedulableEntities.
> {code:java}
> protected void reorderSchedulableEntity(S schedulableEntity) {
>   //remove, update comparable data, and reinsert to update position in order
>   boolean exists = schedulableEntities.remove(schedulableEntity);
>   updateSchedulingResourceUsage(
>       schedulableEntity.getSchedulingResourceUsage());
>   if (exists) {
>     schedulableEntities.add(schedulableEntity);
>   } else {
>     LOG.info("Skip reordering non-existent schedulable entity: "
>         + schedulableEntity.getId());
>   }
> }
> {code}

--
This message was sent by Atlassian JIRA (v7.6.3#76005)
-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-8958) Schedulable entities leak in fair ordering policy when recovering containers between remove app attempt and remove app
[ https://issues.apache.org/jira/browse/YARN-8958?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16668305#comment-16668305 ] Tao Yang commented on YARN-8958:
---
Attached v1 patch for review.

> Schedulable entities leak in fair ordering policy when recovering containers between remove app attempt and remove app
>
> Key: YARN-8958
> URL: https://issues.apache.org/jira/browse/YARN-8958
> Project: Hadoop YARN
> Issue Type: Bug
> Components: capacityscheduler
> Affects Versions: 3.2.1
> Reporter: Tao Yang
> Assignee: Tao Yang
> Priority: Major
> Attachments: YARN-8958.001.patch
>
> We found an NPE in ClientRMService#getApplications when querying apps with a specified queue. The cause is that there is one app which can't be found by calling RMContextImpl#getRMApps (it is finished and swapped out of memory) but can still be queried from the fair ordering policy.
> To reproduce the schedulable entities leak in fair ordering policy:
> (1) create app1 and launch container1 on node1
> (2) restart RM
> (3) remove the app1 attempt; app1 is removed from the schedulable entities.
> (4) recover container1; the state of container1 changes to COMPLETED, app1 is brought back to entitiesToReorder after the container is released, and app1 is then added back into the schedulable entities when the scheduler calls FairOrderingPolicy#getAssignmentIterator.
> (5) remove app1
> To solve this problem, we should make sure schedulableEntities can only be affected by adding or removing an app attempt; a new entity should not be added into schedulableEntities by the reordering process.
> {code:java}
> protected void reorderSchedulableEntity(S schedulableEntity) {
>   //remove, update comparable data, and reinsert to update position in order
>   schedulableEntities.remove(schedulableEntity);
>   updateSchedulingResourceUsage(
>       schedulableEntity.getSchedulingResourceUsage());
>   schedulableEntities.add(schedulableEntity);
> }
> {code}
> The code above can be improved as follows to make sure only an existing entity can be re-added into schedulableEntities.
> {code:java}
> protected void reorderSchedulableEntity(S schedulableEntity) {
>   //remove, update comparable data, and reinsert to update position in order
>   boolean exists = schedulableEntities.remove(schedulableEntity);
>   updateSchedulingResourceUsage(
>       schedulableEntity.getSchedulingResourceUsage());
>   if (exists) {
>     schedulableEntities.add(schedulableEntity);
>   } else {
>     LOG.info("Skip reordering non-existent schedulable entity: "
>         + schedulableEntity.getId());
>   }
> }
> {code}

--
This message was sent by Atlassian JIRA (v7.6.3#76005)
-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Created] (YARN-8958) Schedulable entities leak in fair ordering policy when recovering containers between remove app attempt and remove app
Tao Yang created YARN-8958:
--
Summary: Schedulable entities leak in fair ordering policy when recovering containers between remove app attempt and remove app
Key: YARN-8958
URL: https://issues.apache.org/jira/browse/YARN-8958
Project: Hadoop YARN
Issue Type: Bug
Components: capacityscheduler
Affects Versions: 3.2.1
Reporter: Tao Yang
Assignee: Tao Yang

We found an NPE in ClientRMService#getApplications when querying apps with a specified queue. The cause is that there is one app which can't be found by calling RMContextImpl#getRMApps (it is finished and swapped out of memory) but can still be queried from the fair ordering policy.
To reproduce the schedulable entities leak in fair ordering policy:
(1) create app1 and launch container1 on node1
(2) restart RM
(3) remove the app1 attempt; app1 is removed from the schedulable entities.
(4) recover container1; the state of container1 changes to COMPLETED, app1 is brought back to entitiesToReorder after the container is released, and app1 is then added back into the schedulable entities when the scheduler calls FairOrderingPolicy#getAssignmentIterator.
(5) remove app1
To solve this problem, we should make sure schedulableEntities can only be affected by adding or removing an app attempt; a new entity should not be added into schedulableEntities by the reordering process.
{code:java}
protected void reorderSchedulableEntity(S schedulableEntity) {
  //remove, update comparable data, and reinsert to update position in order
  schedulableEntities.remove(schedulableEntity);
  updateSchedulingResourceUsage(
      schedulableEntity.getSchedulingResourceUsage());
  schedulableEntities.add(schedulableEntity);
}
{code}
The code above can be improved as follows to make sure only an existing entity can be re-added into schedulableEntities.
{code:java}
protected void reorderSchedulableEntity(S schedulableEntity) {
  //remove, update comparable data, and reinsert to update position in order
  boolean exists = schedulableEntities.remove(schedulableEntity);
  updateSchedulingResourceUsage(
      schedulableEntity.getSchedulingResourceUsage());
  if (exists) {
    schedulableEntities.add(schedulableEntity);
  } else {
    LOG.info("Skip reordering non-existent schedulable entity: "
        + schedulableEntity.getId());
  }
}
{code}

--
This message was sent by Atlassian JIRA (v7.6.3#76005)
-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
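The reproduction steps above boil down to an unguarded remove-then-add: the reorder path re-inserts an entity that was already removed with its app attempt. A minimal stand-alone sketch of the leak mechanics, where a plain TreeSet stands in for the policy's ordered set of schedulable entities:

```java
import java.util.TreeSet;

public class ReorderLeakSketch {
    public static void main(String[] args) {
        // Simplified stand-in for FairOrderingPolicy's schedulableEntities.
        TreeSet<String> schedulableEntities = new TreeSet<>();
        schedulableEntities.add("app1");

        // Step (3): remove the app attempt -> app1 leaves the set.
        schedulableEntities.remove("app1");

        // Step (4): a recovered container completes and triggers a reorder.
        // The unguarded remove-then-add re-inserts app1 even though it was gone:
        schedulableEntities.remove("app1"); // returns false: not present
        schedulableEntities.add("app1");    // leak: app1 is back in the set

        // With the guarded variant (re-add only when remove() returned true),
        // app1 would stay out of the set here.
        System.out.println(schedulableEntities.contains("app1"));
    }
}
```

The boolean returned by remove() is exactly the signal the guarded version uses to distinguish a live entity being repositioned from a stale one being resurrected.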