[jira] [Commented] (YARN-8496) The capacity scheduler uses label to cause vcore to be incorrect

2018-07-06 Thread tangshangwen (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-8496?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16535626#comment-16535626
 ] 

tangshangwen commented on YARN-8496:


I'll update a patch later



> The capacity scheduler uses label to cause vcore to be incorrect
> 
>
> Key: YARN-8496
> URL: https://issues.apache.org/jira/browse/YARN-8496
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: capacity scheduler
>Affects Versions: 2.7.6
>Reporter: tangshangwen
>Assignee: tangshangwen
>Priority: Major
> Attachments: yarn-bug.png
>
>
>  In my cluster I use label scheduling, and I found that it causes the vcore 
> count of the cluster to be incorrect.
>  
> capacity-scheduler.xml
>  
> {code:xml}
> <configuration>
>   <property>
>     <name>yarn.scheduler.capacity.root.queues</name>
>     <value>support</value>
>   </property>
>   <property>
>     <name>yarn.scheduler.capacity.root.support.capacity</name>
>     <value>100</value>
>   </property>
>   <property>
>     <name>yarn.scheduler.capacity.root.support.accessible-node-labels</name>
>     <value>test1</value>
>   </property>
>   <property>
>     <name>yarn.scheduler.capacity.root.support.accessible-node-labels.test1.capacity</name>
>     <value>100</value>
>   </property>
>   <property>
>     <name>yarn.scheduler.capacity.root.accessible-node-labels.test1.capacity</name>
>     <value>100</value>
>   </property>
> </configuration>
> {code}
>  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Comment Edited] (YARN-5565) Capacity Scheduler not assigning value correctly.

2018-07-06 Thread Zian Chen (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-5565?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16535622#comment-16535622
 ] 

Zian Chen edited comment on YARN-5565 at 7/7/18 4:48 AM:
-

Hi [~gurmukhd], after investigating this issue, I think it is related to Java 
float precision. A Java float has 32 bits (4 bytes), of which 23 bits are used 
for the mantissa (about 7 decimal *digits*), which results in some unexpected 
precision loss here.

Also, the capacity value is percentage based, so it will be converted and 
capped to 0.0714286f.

I tested locally: System.out.println(0.07142857142857143f) prints 0.071428575.

So to be safe, we suggest not setting capacity with more precision than a Java 
float can represent.
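For illustration only, a minimal Java sketch (not from any patch) that 
reproduces the precision loss described above, using the value from the 
report below:

{code:java}
public class CapacityFloatPrecision {
  public static void main(String[] args) {
    // Value from capacity-scheduler.xml: 7.142857142857143 (i.e. 100/14)
    double configured = 7.142857142857143d;

    // Capacity is stored as a float, so only ~7 significant decimal digits survive.
    float asFloat = (float) configured;
    // Percentage-based capacity is also converted to a fraction (value / 100).
    float asFraction = (float) (configured / 100);

    System.out.println(asFloat);    // prints 7.142857
    System.out.println(asFraction); // prints 0.071428575
  }
}
{code}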


was (Author: zian chen):
Hi [~gurmukhd], after investigating this issue, I think it is related to Java 
float precision. A Java float has 32 bits (4 bytes), of which 23 bits are used 
for the mantissa (about 7 decimal *digits*), which results in some unexpected 
precision loss here.

Also, the capacity value is percentage based, so it will be converted and 
capped to 0.0714286f.

I tested locally: System.out.println(0.07142857142857143f) prints 0.071428575.

So to be safe, we suggest not setting capacity with more precision than a Java 
float can represent.

 

 

 

> Capacity Scheduler not assigning value correctly.
> -
>
> Key: YARN-5565
> URL: https://issues.apache.org/jira/browse/YARN-5565
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: capacity scheduler, yarn
>Affects Versions: 2.7.2
> Environment: hadoop 2.7.2
>Reporter: gurmukh singh
>Priority: Major
>  Labels: capacity-scheduler, scheduler, yarn
>
> Hi
> I was testing and found that the value assigned in the scheduler 
> configuration is not consistent with what the ResourceManager is assigning.
> If I set the configuration as below (I understand that it is a Java float), 
> the rounding is not correct.
> capacity-scheduler.xml
> <property>
>   <name>yarn.scheduler.capacity.q1.capacity</name>
>   <value>7.142857142857143</value>
> </property>
> In Java: System.err.println(7.142857142857143f) ===> 7.142857
> But instead, the ResourceManager is assigning 7.1428566.
> Tested this on hadoop 2.7.2



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Assigned] (YARN-5565) Capacity Scheduler not assigning value correctly.

2018-07-06 Thread Zian Chen (JIRA)


 [ 
https://issues.apache.org/jira/browse/YARN-5565?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zian Chen reassigned YARN-5565:
---

Assignee: Zian Chen

> Capacity Scheduler not assigning value correctly.
> -
>
> Key: YARN-5565
> URL: https://issues.apache.org/jira/browse/YARN-5565
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: capacity scheduler, yarn
>Affects Versions: 2.7.2
> Environment: hadoop 2.7.2
>Reporter: gurmukh singh
>Assignee: Zian Chen
>Priority: Major
>  Labels: capacity-scheduler, scheduler, yarn
>
> Hi
> I was testing and found that the value assigned in the scheduler 
> configuration is not consistent with what the ResourceManager is assigning.
> If I set the configuration as below (I understand that it is a Java float), 
> the rounding is not correct.
> capacity-scheduler.xml
> <property>
>   <name>yarn.scheduler.capacity.q1.capacity</name>
>   <value>7.142857142857143</value>
> </property>
> In Java: System.err.println(7.142857142857143f) ===> 7.142857
> But instead, the ResourceManager is assigning 7.1428566.
> Tested this on hadoop 2.7.2



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-5565) Capacity Scheduler not assigning value correctly.

2018-07-06 Thread Zian Chen (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-5565?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16535622#comment-16535622
 ] 

Zian Chen commented on YARN-5565:
-

Hi [~gurmukhd], after investigating this issue, I think it is related to Java 
float precision. A Java float has 32 bits (4 bytes), of which 23 bits are used 
for the mantissa (about 7 decimal *digits*), which results in some unexpected 
precision loss here.

Also, the capacity value is percentage based, so it will be converted and 
capped to 0.0714286f.

I tested locally: System.out.println(0.07142857142857143f) prints 0.071428575.

So to be safe, we suggest not setting capacity with more precision than a Java 
float can represent.

 

 

 

> Capacity Scheduler not assigning value correctly.
> -
>
> Key: YARN-5565
> URL: https://issues.apache.org/jira/browse/YARN-5565
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: capacity scheduler, yarn
>Affects Versions: 2.7.2
> Environment: hadoop 2.7.2
>Reporter: gurmukh singh
>Priority: Major
>  Labels: capacity-scheduler, scheduler, yarn
>
> Hi
> I was testing and found that the value assigned in the scheduler 
> configuration is not consistent with what the ResourceManager is assigning.
> If I set the configuration as below (I understand that it is a Java float), 
> the rounding is not correct.
> capacity-scheduler.xml
> <property>
>   <name>yarn.scheduler.capacity.q1.capacity</name>
>   <value>7.142857142857143</value>
> </property>
> In Java: System.err.println(7.142857142857143f) ===> 7.142857
> But instead, the ResourceManager is assigning 7.1428566.
> Tested this on hadoop 2.7.2



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-7481) Gpu locality support for Better AI scheduling

2018-07-06 Thread Chen Qingcha (JIRA)


 [ 
https://issues.apache.org/jira/browse/YARN-7481?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chen Qingcha updated YARN-7481:
---
Attachment: hadoop_2.9.0.patch

> Gpu locality support for Better AI scheduling
> -
>
> Key: YARN-7481
> URL: https://issues.apache.org/jira/browse/YARN-7481
> Project: Hadoop YARN
>  Issue Type: New Feature
>  Components: api, RM, yarn
>Affects Versions: 2.7.2
>Reporter: Chen Qingcha
>Priority: Major
> Fix For: 2.7.2
>
> Attachments: GPU locality support for Job scheduling.pdf, 
> hadoop-2.7.2-gpu.patch, hadoop-2.7.2.gpu-port.patch, hadoop_2.9.0.patch
>
>   Original Estimate: 1,344h
>  Remaining Estimate: 1,344h
>
> We enhance Hadoop with GPU support for better AI job scheduling. 
> Currently, YARN-3926 also supports GPU scheduling, which treats GPUs as a 
> countable resource. 
> However, GPU placement is also very important to deep learning jobs for 
> better efficiency.
>  For example, a 2-GPU job running on GPUs {0, 1} could be faster than one 
> running on GPUs {0, 7}, if GPUs 0 and 1 are under the same PCI-E switch 
> while 0 and 7 are not.
>  We add support to Hadoop 2.7.2 to enable GPU locality scheduling, which 
> supports fine-grained GPU placement. 
> A 64-bit bitmap is added to the YARN Resource, which indicates both GPU 
> usage and locality information in a node (up to 64 GPUs per node). A '1' in 
> the corresponding bit position means the GPU is available, and '0' otherwise.
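Purely as an illustration of the bitmap idea described above (not code from 
the attached patches), a minimal Java sketch; the class and method names are 
hypothetical:

{code:java}
public class GpuBitmapSketch {
  // Bit i == 1 means GPU i on this node is available (up to 64 GPUs per node).
  public static boolean isAvailable(long bitmap, int gpuIndex) {
    return ((bitmap >>> gpuIndex) & 1L) == 1L;
  }

  // Clear bit gpuIndex to mark that GPU as allocated / unavailable.
  public static long allocate(long bitmap, int gpuIndex) {
    return bitmap & ~(1L << gpuIndex);
  }

  public static void main(String[] args) {
    long bitmap = 0xFFL;                              // GPUs 0-7 present and free
    System.out.println(isAvailable(bitmap, 1));       // true
    bitmap = allocate(bitmap, 0);
    bitmap = allocate(bitmap, 1);
    System.out.println(Long.toBinaryString(bitmap));  // 11111100
  }
}
{code}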



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-7481) Gpu locality support for Better AI scheduling

2018-07-06 Thread Chen Qingcha (JIRA)


 [ 
https://issues.apache.org/jira/browse/YARN-7481?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chen Qingcha updated YARN-7481:
---
Attachment: (was: hadoop_2.9.0.patch)

> Gpu locality support for Better AI scheduling
> -
>
> Key: YARN-7481
> URL: https://issues.apache.org/jira/browse/YARN-7481
> Project: Hadoop YARN
>  Issue Type: New Feature
>  Components: api, RM, yarn
>Affects Versions: 2.7.2
>Reporter: Chen Qingcha
>Priority: Major
> Fix For: 2.7.2
>
> Attachments: GPU locality support for Job scheduling.pdf, 
> hadoop-2.7.2-gpu.patch, hadoop-2.7.2.gpu-port.patch
>
>   Original Estimate: 1,344h
>  Remaining Estimate: 1,344h
>
> We enhance Hadoop with GPU support for better AI job scheduling. 
> Currently, YARN-3926 also supports GPU scheduling, which treats GPUs as a 
> countable resource. 
> However, GPU placement is also very important to deep learning jobs for 
> better efficiency.
>  For example, a 2-GPU job running on GPUs {0, 1} could be faster than one 
> running on GPUs {0, 7}, if GPUs 0 and 1 are under the same PCI-E switch 
> while 0 and 7 are not.
>  We add support to Hadoop 2.7.2 to enable GPU locality scheduling, which 
> supports fine-grained GPU placement. 
> A 64-bit bitmap is added to the YARN Resource, which indicates both GPU 
> usage and locality information in a node (up to 64 GPUs per node). A '1' in 
> the corresponding bit position means the GPU is available, and '0' otherwise.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-8289) Modify distributedshell to support Node Attributes

2018-07-06 Thread Sunil Govindan (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-8289?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16535579#comment-16535579
 ] 

Sunil Govindan commented on YARN-8289:
--

[~Zian Chen] I have a patch ready for this. I'll share it along with the 
scheduler changes.

> Modify distributedshell to support Node Attributes
> --
>
> Key: YARN-8289
> URL: https://issues.apache.org/jira/browse/YARN-8289
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: distributed-shell
>Affects Versions: YARN-3409
>Reporter: Naganarasimha G R
>Assignee: Naganarasimha G R
>Priority: Major
>
> Modifications required in Distributed shell to support NodeAttributes



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-8289) Modify distributedshell to support Node Attributes

2018-07-06 Thread Zian Chen (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-8289?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16535541#comment-16535541
 ] 

Zian Chen commented on YARN-8289:
-

Hi [~Naganarasimha], [~cheersyang], I think we pinged the wrong person here. 
[~sunilg], do you have any plans to work on this? I can pick it up if you 
don't have the bandwidth.

> Modify distributedshell to support Node Attributes
> --
>
> Key: YARN-8289
> URL: https://issues.apache.org/jira/browse/YARN-8289
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: distributed-shell
>Affects Versions: YARN-3409
>Reporter: Naganarasimha G R
>Assignee: Naganarasimha G R
>Priority: Major
>
> Modifications required in Distributed shell to support NodeAttributes



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-8480) Add boolean option for resources

2018-07-06 Thread Zian Chen (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-8480?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16535536#comment-16535536
 ] 

Zian Chen commented on YARN-8480:
-

[~templedf], would you mind giving some examples of what kinds of resources 
could be treated as "boolean" resources here?

To me, node labels and node attributes are orthogonal to each other. Node 
labels address the requirement of placing apps on specific nodes that have 
the resources they need, or of partitioning the cluster by organization or 
workload. Node attributes, on the other hand, solve the problem of 
characterizing a node by many of its attributes and selecting a node based on 
an expression over those attributes during scheduling.

If we can find user scenarios for boolean resources that are not well covered 
by node labels or node attributes, that will be a very good starting point 
for supporting this feature.

 

> Add boolean option for resources
> 
>
> Key: YARN-8480
> URL: https://issues.apache.org/jira/browse/YARN-8480
> Project: Hadoop YARN
>  Issue Type: Improvement
>Reporter: Daniel Templeton
>Assignee: Szilard Nemeth
>Priority: Major
>
> Make it possible to define a resource with a boolean value.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-8302) ATS v2 should handle HBase connection issue properly

2018-07-06 Thread Hudson (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-8302?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16535442#comment-16535442
 ] 

Hudson commented on YARN-8302:
--

SUCCESS: Integrated in Jenkins build Hadoop-trunk-Commit #14533 (See 
[https://builds.apache.org/job/Hadoop-trunk-Commit/14533/])
YARN-8302. ATS v2 should handle HBase connection issue properly. 
(rohithsharmaks: rev ba683204498c97654be4727ab9e128c433a45498)
* (add) 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-timelineservice-hbase-tests/src/test/java/org/apache/hadoop/yarn/server/timelineservice/storage/TestTimelineReaderHBaseDown.java
* (edit) 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-timelineservice-hbase/hadoop-yarn-server-timelineservice-hbase-client/src/main/java/org/apache/hadoop/yarn/server/timelineservice/storage/HBaseTimelineReaderImpl.java
* (edit) 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api/src/main/java/org/apache/hadoop/yarn/conf/YarnConfiguration.java


> ATS v2 should handle HBase connection issue properly
> 
>
> Key: YARN-8302
> URL: https://issues.apache.org/jira/browse/YARN-8302
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: ATSv2
>Affects Versions: 3.1.0
>Reporter: Yesha Vora
>Assignee: Billie Rinaldi
>Priority: Major
> Fix For: 3.2.0, 3.1.1
>
> Attachments: YARN-8302.1.patch, YARN-8302.2.patch, YARN-8302.3.patch
>
>
> ATS v2 calls time out with the error below when ATS can't connect to the 
> HBase instance.
> {code}
> bash-4.2$ curl -i -k -s -1  -H 'Content-Type: application/json'  -H 'Accept: 
> application/json' --max-time 5   --negotiate -u : 
> 'https://xxx:8199/ws/v2/timeline/apps/application_1526357251888_0022/entities/YARN_CONTAINER?fields=ALL&_=1526425686092'
> curl: (28) Operation timed out after 5002 milliseconds with 0 bytes received
> {code}
> {code:title=ATS log}
> 2018-05-15 23:10:03,623 INFO  client.RpcRetryingCallerImpl 
> (RpcRetryingCallerImpl.java:callWithRetries(134)) - Call exception, tries=7, 
> retries=7, started=8165 ms ago, cancelled=false, msg=Call to xxx/xxx:17020 
> failed on connection exception: 
> org.apache.hbase.thirdparty.io.netty.channel.AbstractChannel$AnnotatedConnectException:
>  Connection refused: xxx/xxx:17020, details=row 
> 'prod.timelineservice.app_flow,
> ,99' on table 'hbase:meta' at region=hbase:meta,,1.1588230740, 
> hostname=xxx,17020,1526348294182, seqNum=-1
> 2018-05-15 23:10:13,651 INFO  client.RpcRetryingCallerImpl 
> (RpcRetryingCallerImpl.java:callWithRetries(134)) - Call exception, tries=8, 
> retries=8, started=18192 ms ago, cancelled=false, msg=Call to xxx/xxx:17020 
> failed on connection exception: 
> org.apache.hbase.thirdparty.io.netty.channel.AbstractChannel$AnnotatedConnectException:
>  Connection refused: xxx/xxx:17020, details=row 
> 'prod.timelineservice.app_flow,
> ,99' on table 'hbase:meta' at region=hbase:meta,,1.1588230740, 
> hostname=xxx,17020,1526348294182, seqNum=-1
> 2018-05-15 23:10:23,730 INFO  client.RpcRetryingCallerImpl 
> (RpcRetryingCallerImpl.java:callWithRetries(134)) - Call exception, tries=9, 
> retries=9, started=28272 ms ago, cancelled=false, msg=Call to xxx/xxx:17020 
> failed on connection exception: 
> org.apache.hbase.thirdparty.io.netty.channel.AbstractChannel$AnnotatedConnectException:
>  Connection refused: xxx/xxx:17020, details=row 
> 'prod.timelineservice.app_flow,
> ,99' on table 'hbase:meta' at region=hbase:meta,,1.1588230740, 
> hostname=xxx,17020,1526348294182, seqNum=-1
> 2018-05-15 23:10:33,788 INFO  client.RpcRetryingCallerImpl 
> (RpcRetryingCallerImpl.java:callWithRetries(134)) - Call exception, tries=10, 
> retries=10, started=38330 ms ago, cancelled=false, msg=Call to xxx/xxx:17020 
> failed on connection exception: 
> org.apache.hbase.thirdparty.io.netty.channel.AbstractChannel$AnnotatedConnectException:
>  Connection refused: xxx/xxx:17020, details=row 
> 'prod.timelineservice.app_flow,
> ,99' on table 'hbase:meta' at region=hbase:meta,,1.1588230740, 
> hostname=xxx,17020,1526348294182, seqNum=-1{code}
> There are two issues here:
> 1) Check why ATS can't connect to HBase.
> 2) In case of a connection error, the ATS call should not time out; it 
> should fail with a proper error.
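As an illustration of point 2) only (this is not the approach taken in the 
attached patches), a minimal Java sketch of bounding a storage read with a 
deadline so the caller gets a clear error instead of hanging behind the HBase 
client's retries; all names here are hypothetical:

{code:java}
import java.io.IOException;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

public class BoundedStorageRead {
  private static final ExecutorService POOL = Executors.newCachedThreadPool();

  // Run a storage read with a deadline and surface a clear error on timeout.
  public static <T> T readWithDeadline(Callable<T> read, long timeoutMs) throws IOException {
    Future<T> future = POOL.submit(read);
    try {
      return future.get(timeoutMs, TimeUnit.MILLISECONDS);
    } catch (TimeoutException e) {
      future.cancel(true);
      throw new IOException("Timeline storage did not respond within " + timeoutMs + " ms", e);
    } catch (InterruptedException | ExecutionException e) {
      throw new IOException("Timeline storage read failed", e);
    }
  }
}
{code}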



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-8499) ATS v2 should handle connection issues in general for all storages

2018-07-06 Thread Rohith Sharma K S (JIRA)


 [ 
https://issues.apache.org/jira/browse/YARN-8499?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rohith Sharma K S updated YARN-8499:

Issue Type: Improvement  (was: Bug)

> ATS v2 should handle connection issues in general for all storages
> --
>
> Key: YARN-8499
> URL: https://issues.apache.org/jira/browse/YARN-8499
> Project: Hadoop YARN
>  Issue Type: Improvement
>Reporter: Sunil Govindan
>Priority: Major
>
> Post YARN-8302, HBase connection issues are handled in ATSv2. However, this 
> could be made general by introducing an API in the storage interface and 
> implementing it in each storage as per that store's semantics.
>  
> cc [~rohithsharma] [~vinodkv] [~vrushalic]



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-8302) ATS v2 should handle HBase connection issue properly

2018-07-06 Thread Rohith Sharma K S (JIRA)


 [ 
https://issues.apache.org/jira/browse/YARN-8302?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rohith Sharma K S updated YARN-8302:

Issue Type: Improvement  (was: Bug)

> ATS v2 should handle HBase connection issue properly
> 
>
> Key: YARN-8302
> URL: https://issues.apache.org/jira/browse/YARN-8302
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: ATSv2
>Affects Versions: 3.1.0
>Reporter: Yesha Vora
>Assignee: Billie Rinaldi
>Priority: Major
> Fix For: 3.2.0, 3.1.1
>
> Attachments: YARN-8302.1.patch, YARN-8302.2.patch, YARN-8302.3.patch
>
>
> ATS v2 calls time out with the error below when ATS can't connect to the 
> HBase instance.
> {code}
> bash-4.2$ curl -i -k -s -1  -H 'Content-Type: application/json'  -H 'Accept: 
> application/json' --max-time 5   --negotiate -u : 
> 'https://xxx:8199/ws/v2/timeline/apps/application_1526357251888_0022/entities/YARN_CONTAINER?fields=ALL&_=1526425686092'
> curl: (28) Operation timed out after 5002 milliseconds with 0 bytes received
> {code}
> {code:title=ATS log}
> 2018-05-15 23:10:03,623 INFO  client.RpcRetryingCallerImpl 
> (RpcRetryingCallerImpl.java:callWithRetries(134)) - Call exception, tries=7, 
> retries=7, started=8165 ms ago, cancelled=false, msg=Call to xxx/xxx:17020 
> failed on connection exception: 
> org.apache.hbase.thirdparty.io.netty.channel.AbstractChannel$AnnotatedConnectException:
>  Connection refused: xxx/xxx:17020, details=row 
> 'prod.timelineservice.app_flow,
> ,99' on table 'hbase:meta' at region=hbase:meta,,1.1588230740, 
> hostname=xxx,17020,1526348294182, seqNum=-1
> 2018-05-15 23:10:13,651 INFO  client.RpcRetryingCallerImpl 
> (RpcRetryingCallerImpl.java:callWithRetries(134)) - Call exception, tries=8, 
> retries=8, started=18192 ms ago, cancelled=false, msg=Call to xxx/xxx:17020 
> failed on connection exception: 
> org.apache.hbase.thirdparty.io.netty.channel.AbstractChannel$AnnotatedConnectException:
>  Connection refused: xxx/xxx:17020, details=row 
> 'prod.timelineservice.app_flow,
> ,99' on table 'hbase:meta' at region=hbase:meta,,1.1588230740, 
> hostname=xxx,17020,1526348294182, seqNum=-1
> 2018-05-15 23:10:23,730 INFO  client.RpcRetryingCallerImpl 
> (RpcRetryingCallerImpl.java:callWithRetries(134)) - Call exception, tries=9, 
> retries=9, started=28272 ms ago, cancelled=false, msg=Call to xxx/xxx:17020 
> failed on connection exception: 
> org.apache.hbase.thirdparty.io.netty.channel.AbstractChannel$AnnotatedConnectException:
>  Connection refused: xxx/xxx:17020, details=row 
> 'prod.timelineservice.app_flow,
> ,99' on table 'hbase:meta' at region=hbase:meta,,1.1588230740, 
> hostname=xxx,17020,1526348294182, seqNum=-1
> 2018-05-15 23:10:33,788 INFO  client.RpcRetryingCallerImpl 
> (RpcRetryingCallerImpl.java:callWithRetries(134)) - Call exception, tries=10, 
> retries=10, started=38330 ms ago, cancelled=false, msg=Call to xxx/xxx:17020 
> failed on connection exception: 
> org.apache.hbase.thirdparty.io.netty.channel.AbstractChannel$AnnotatedConnectException:
>  Connection refused: xxx/xxx:17020, details=row 
> 'prod.timelineservice.app_flow,
> ,99' on table 'hbase:meta' at region=hbase:meta,,1.1588230740, 
> hostname=xxx,17020,1526348294182, seqNum=-1{code}
> There are two issues here:
> 1) Check why ATS can't connect to HBase.
> 2) In case of a connection error, the ATS call should not time out; it 
> should fail with a proper error.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-8302) ATS v2 should handle HBase connection issue properly

2018-07-06 Thread Rohith Sharma K S (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-8302?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16535414#comment-16535414
 ] 

Rohith Sharma K S commented on YARN-8302:
-

committing shortly

> ATS v2 should handle HBase connection issue properly
> 
>
> Key: YARN-8302
> URL: https://issues.apache.org/jira/browse/YARN-8302
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: ATSv2
>Affects Versions: 3.1.0
>Reporter: Yesha Vora
>Assignee: Billie Rinaldi
>Priority: Major
> Attachments: YARN-8302.1.patch, YARN-8302.2.patch, YARN-8302.3.patch
>
>
> ATS v2 calls time out with the error below when ATS can't connect to the 
> HBase instance.
> {code}
> bash-4.2$ curl -i -k -s -1  -H 'Content-Type: application/json'  -H 'Accept: 
> application/json' --max-time 5   --negotiate -u : 
> 'https://xxx:8199/ws/v2/timeline/apps/application_1526357251888_0022/entities/YARN_CONTAINER?fields=ALL&_=1526425686092'
> curl: (28) Operation timed out after 5002 milliseconds with 0 bytes received
> {code}
> {code:title=ATS log}
> 2018-05-15 23:10:03,623 INFO  client.RpcRetryingCallerImpl 
> (RpcRetryingCallerImpl.java:callWithRetries(134)) - Call exception, tries=7, 
> retries=7, started=8165 ms ago, cancelled=false, msg=Call to xxx/xxx:17020 
> failed on connection exception: 
> org.apache.hbase.thirdparty.io.netty.channel.AbstractChannel$AnnotatedConnectException:
>  Connection refused: xxx/xxx:17020, details=row 
> 'prod.timelineservice.app_flow,
> ,99' on table 'hbase:meta' at region=hbase:meta,,1.1588230740, 
> hostname=xxx,17020,1526348294182, seqNum=-1
> 2018-05-15 23:10:13,651 INFO  client.RpcRetryingCallerImpl 
> (RpcRetryingCallerImpl.java:callWithRetries(134)) - Call exception, tries=8, 
> retries=8, started=18192 ms ago, cancelled=false, msg=Call to xxx/xxx:17020 
> failed on connection exception: 
> org.apache.hbase.thirdparty.io.netty.channel.AbstractChannel$AnnotatedConnectException:
>  Connection refused: xxx/xxx:17020, details=row 
> 'prod.timelineservice.app_flow,
> ,99' on table 'hbase:meta' at region=hbase:meta,,1.1588230740, 
> hostname=xxx,17020,1526348294182, seqNum=-1
> 2018-05-15 23:10:23,730 INFO  client.RpcRetryingCallerImpl 
> (RpcRetryingCallerImpl.java:callWithRetries(134)) - Call exception, tries=9, 
> retries=9, started=28272 ms ago, cancelled=false, msg=Call to xxx/xxx:17020 
> failed on connection exception: 
> org.apache.hbase.thirdparty.io.netty.channel.AbstractChannel$AnnotatedConnectException:
>  Connection refused: xxx/xxx:17020, details=row 
> 'prod.timelineservice.app_flow,
> ,99' on table 'hbase:meta' at region=hbase:meta,,1.1588230740, 
> hostname=xxx,17020,1526348294182, seqNum=-1
> 2018-05-15 23:10:33,788 INFO  client.RpcRetryingCallerImpl 
> (RpcRetryingCallerImpl.java:callWithRetries(134)) - Call exception, tries=10, 
> retries=10, started=38330 ms ago, cancelled=false, msg=Call to xxx/xxx:17020 
> failed on connection exception: 
> org.apache.hbase.thirdparty.io.netty.channel.AbstractChannel$AnnotatedConnectException:
>  Connection refused: xxx/xxx:17020, details=row 
> 'prod.timelineservice.app_flow,
> ,99' on table 'hbase:meta' at region=hbase:meta,,1.1588230740, 
> hostname=xxx,17020,1526348294182, seqNum=-1{code}
> There are two issues here:
> 1) Check why ATS can't connect to HBase.
> 2) In case of a connection error, the ATS call should not time out; it 
> should fail with a proper error.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-8487) Remove the unused Variable in TestBroadcastAMRMProxyFederationPolicy#testNotifyOfResponseFromUnknownSubCluster

2018-07-06 Thread Zian Chen (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-8487?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16535411#comment-16535411
 ] 

Zian Chen commented on YARN-8487:
-

Hi [~msingh], [~BilwaST], I quickly checked the method 
testNotifyOfResponseFromUnknownSubCluster, and it looks like the Map response 
variable is not used.

Would you mind if I provide a quick fix on this JIRA?

> Remove the unused Variable in 
> TestBroadcastAMRMProxyFederationPolicy#testNotifyOfResponseFromUnknownSubCluster
> --
>
> Key: YARN-8487
> URL: https://issues.apache.org/jira/browse/YARN-8487
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: amrmproxy
>Reporter: Mukul Kumar Singh
>Assignee: Bilwa S T
>Priority: Major
>




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-8383) TimelineServer 1.5 start fails with NoClassDefFoundError

2018-07-06 Thread Jason Lowe (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-8383?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16535392#comment-16535392
 ] 

Jason Lowe commented on YARN-8383:
--

Sorry for the delay, as I was out on vacation and finally got back to this.

I think this was caused by YARN-6628.  It looks like the jackson core is being 
shaded but the fst library is not being updated to reference the shaded path.

I can try to poke at this some more early next week to see if providing a 
shaded version of fst that in turn references the shaded version of jackson can 
work.

> TimelineServer 1.5 start fails with NoClassDefFoundError
> 
>
> Key: YARN-8383
> URL: https://issues.apache.org/jira/browse/YARN-8383
> Project: Hadoop YARN
>  Issue Type: Bug
>Affects Versions: 2.8.4
>Reporter: Rohith Sharma K S
>Priority: Blocker
>
> TimelineServer 1.5 start fails with NoClassDefFoundError.
> {noformat}
> 2018-05-31 22:10:58,548 FATAL 
> org.apache.hadoop.yarn.server.applicationhistoryservice.ApplicationHistoryServer:
>  Error starting ApplicationHistoryServer
> java.lang.NoClassDefFoundError: com/fasterxml/jackson/core/JsonFactory
>   at 
> org.apache.hadoop.yarn.server.timeline.RollingLevelDBTimelineStore.(RollingLevelDBTimelineStore.java:174)
>   at java.lang.Class.forName0(Native Method)
>   at java.lang.Class.forName(Class.java:348)
>   at 
> org.apache.hadoop.conf.Configuration.getClassByNameOrNull(Configuration.java:2306)
>   at 
> org.apache.hadoop.conf.Configuration.getClassByName(Configuration.java:2271)
>   at 
> org.apache.hadoop.conf.Configuration.getClass(Configuration.java:2367)
>   at 
> org.apache.hadoop.conf.Configuration.getClass(Configuration.java:2393)
>   at 
> org.apache.hadoop.yarn.server.timeline.EntityGroupFSTimelineStore.createSummaryStore(EntityGroupFSTimelineStore.java:239)
>   at 
> org.apache.hadoop.yarn.server.timeline.EntityGroupFSTimelineStore.serviceInit(EntityGroupFSTimelineStore.java:146)
>   at 
> org.apache.hadoop.service.AbstractService.init(AbstractService.java:163)
>   at 
> org.apache.hadoop.service.CompositeService.serviceInit(CompositeService.java:107)
>   at 
> org.apache.hadoop.yarn.server.applicationhistoryservice.ApplicationHistoryServer.serviceInit(ApplicationHistoryServer.java:115)
>   at 
> org.apache.hadoop.service.AbstractService.init(AbstractService.java:163)
>   at 
> org.apache.hadoop.yarn.server.applicationhistoryservice.ApplicationHistoryServer.launchAppHistoryServer(ApplicationHistoryServer.java:180)
>   at 
> org.apache.hadoop.yarn.server.applicationhistoryservice.ApplicationHistoryServer.main(ApplicationHistoryServer.java:190)
> Caused by: java.lang.ClassNotFoundException: 
> com.fasterxml.jackson.core.JsonFactory
>   at java.net.URLClassLoader.findClass(URLClassLoader.java:381)
>   at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
>   at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:335)
>   at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
>   ... 15 more
> {noformat}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-8302) ATS v2 should handle HBase connection issue properly

2018-07-06 Thread genericqa (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-8302?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16535326#comment-16535326
 ] 

genericqa commented on YARN-8302:
-

| (/) *{color:green}+1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
26s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 1 new or modified test 
files. {color} |
|| || || || {color:brown} trunk Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  1m  
9s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 24m 
18s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  7m 
19s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
17s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  1m 
31s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
11m  7s{color} | {color:green} branch has no errors when building and testing 
our client artifacts. {color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  0m  
0s{color} | {color:blue} Skipped patched modules with no Java source: 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-timelineservice-hbase-tests
 {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  1m 
39s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m  
3s{color} | {color:green} trunk passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
11s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  1m 
10s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  6m 
40s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  6m 
40s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
16s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  1m 
26s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
10m  6s{color} | {color:green} patch has no errors when building and testing 
our client artifacts. {color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  0m  
0s{color} | {color:blue} Skipped patched modules with no Java source: 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-timelineservice-hbase-tests
 {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  1m 
51s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
58s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  0m 
37s{color} | {color:green} hadoop-yarn-api in the patch passed. {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  0m 
28s{color} | {color:green} hadoop-yarn-server-timelineservice-hbase-client in 
the patch passed. {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 12m 
45s{color} | {color:green} hadoop-yarn-server-timelineservice-hbase-tests in 
the patch passed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
28s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 85m 30s{color} | 
{color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:abb62dd |
| JIRA Issue | YARN-8302 |
| JIRA Patch URL | 

[jira] [Commented] (YARN-8492) ATSv2 HBase tests are failing with ClassNotFoundException

2018-07-06 Thread Hudson (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-8492?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16535273#comment-16535273
 ] 

Hudson commented on YARN-8492:
--

SUCCESS: Integrated in Jenkins build Hadoop-trunk-Commit #14531 (See 
[https://builds.apache.org/job/Hadoop-trunk-Commit/14531/])
YARN-8492. ATSv2 HBase tests are failing with ClassNotFoundException. (sunilg: 
rev e4bf38cf50943565796c00f8b5711a2882813488)
* (edit) 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-timelineservice-hbase-tests/pom.xml


> ATSv2 HBase tests are failing with ClassNotFoundException
> -
>
> Key: YARN-8492
> URL: https://issues.apache.org/jira/browse/YARN-8492
> Project: Hadoop YARN
>  Issue Type: Bug
>Affects Versions: 3.2.0
>Reporter: Rohith Sharma K S
>Assignee: Rohith Sharma K S
>Priority: Major
>  Labels: test
> Fix For: 3.2.0
>
> Attachments: YARN-8492.01.patch, YARN-8492.02.patch
>
>
> It is seen in recent QA reports that ATSv2 HBase tests are failing with 
> ClassNotFoundException.
> This looks to be a regression from a hadoop-common patch or some other 
> patch. We need to figure out which JIRA broke this and fix the test failures.
>  hadoop.yarn.server.timelineservice.storage.flow.TestHBaseStorageFlowRun
>       
> hadoop.yarn.server.timelineservice.storage.TestHBaseTimelineStorageSchema
>       
> hadoop.yarn.server.timelineservice.storage.TestHBaseTimelineStorageEntities
>       hadoop.yarn.server.timelineservice.storage.TestHBaseTimelineStorageApps
>       hadoop.yarn.server.timelineservice.storage.TestTimelineReaderHBaseDown
>       
> hadoop.yarn.server.timelineservice.storage.flow.TestHBaseStorageFlowRunCompaction
>       
> hadoop.yarn.server.timelineservice.storage.TestHBaseTimelineStorageDomain
>       
> hadoop.yarn.server.timelineservice.reader.TestTimelineReaderWebServicesHBaseStorage
>       
> hadoop.yarn.server.timelineservice.storage.flow.TestHBaseStorageFlowActivity
>  
> {noformat}
> ERROR] 
> org.apache.hadoop.yarn.server.timelineservice.storage.TestHBaseTimelineStorageApps
>   Time elapsed: 0.102 s  <<< ERROR!
> java.lang.NoClassDefFoundError: 
> org/apache/hadoop/crypto/key/KeyProviderTokenIssuer
>   at java.lang.ClassLoader.defineClass1(Native Method)
>   at java.lang.ClassLoader.defineClass(ClassLoader.java:763)
>   at 
> java.security.SecureClassLoader.defineClass(SecureClassLoader.java:142)
>   at java.net.URLClassLoader.defineClass(URLClassLoader.java:467)
>   at java.net.URLClassLoader.access$100(URLClassLoader.java:73)
>   at java.net.URLClassLoader$1.run(URLClassLoader.java:368)
>   at java.net.URLClassLoader$1.run(URLClassLoader.java:362)
>   at java.security.AccessController.doPrivileged(Native Method)
>   at java.net.URLClassLoader.findClass(URLClassLoader.java:361)
>   at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
>   at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:349)
>   at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
>   at 
> org.apache.hadoop.yarn.server.timelineservice.storage.TestHBaseTimelineStorageApps.setupBeforeClass(TestHBaseTimelineStorageApps.java:97)
> Caused by: java.lang.ClassNotFoundException: 
> org.apache.hadoop.crypto.key.KeyProviderTokenIssuer
>   at java.net.URLClassLoader.findClass(URLClassLoader.java:381)
>   at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
>   at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:349)
>   at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
> {noformat}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Comment Edited] (YARN-7055) YARN Timeline Service v.2: beta 1

2018-07-06 Thread Vrushali C (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-7055?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16535264#comment-16535264
 ] 

Vrushali C edited comment on YARN-7055 at 7/6/18 7:43 PM:
--

Current status: 
Done: 
- Migration Path plan for tsv1 to tsv2.
- Supporting multiple hbase versions.

In Progress:
- Security / Authorization ACLs

TBD (cc [~rohithsharma]) :
- Fault tolerance
- Misc items:
-- hbase shaded jars
-- Yarn Client + ATSv2 Integration YARN-8303
-- Knox + SSO proxy doesn’t work with TimelineReader
-- Serving application logs from ATSv2 (YARN-5742); related YARN-7753 is fixed




was (Author: vrushalic):
Current status: 
Done: 
- Migration Path plan for tsv1 to tsv2.
- Supporting multiple hbase versions.

In Progress:
- Security / Authorization ACLs

TBD (cc [~rohithsharma]) :
- Fault tolerance
- Misc items:
 hbase shaded jars
 Yarn Client + ATSv2 Integration YARN-8303
 Knox + SSO proxy doesn’t work with TimelineReader
 Serving application logs from ATSv2 (YARN-5742); related YARN-7753 is fixed



> YARN Timeline Service v.2: beta 1 
> --
>
> Key: YARN-7055
> URL: https://issues.apache.org/jira/browse/YARN-7055
> Project: Hadoop YARN
>  Issue Type: New Feature
>  Components: timelineclient, timelinereader, timelineserver
>Reporter: Vrushali C
>Assignee: Vrushali C
>Priority: Major
> Attachments: TSv2 next steps.pdf
>
>
> This is an umbrella JIRA for the beta 1 milestone for YARN Timeline Service 
> v.2.
> YARN-2928 was alpha1, YARN-5355 was alpha2. 



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-7055) YARN Timeline Service v.2: beta 1

2018-07-06 Thread Vrushali C (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-7055?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16535264#comment-16535264
 ] 

Vrushali C commented on YARN-7055:
--

Current status: 
Done: 
- Migration Path plan for tsv1 to tsv2.
- Supporting multiple hbase versions.

In Progress:
- Security / Authorization ACLs

TBD (cc [~rohithsharma]) :
- Fault tolerance
- Misc items:
 hbase shaded jars
 Yarn Client + ATSv2 Integration YARN-8303
 Knox + SSO proxy doesn’t work with TimelineReader
 Serving application logs from ATSv2 (YARN-5742); related YARN-7753 is fixed



> YARN Timeline Service v.2: beta 1 
> --
>
> Key: YARN-7055
> URL: https://issues.apache.org/jira/browse/YARN-7055
> Project: Hadoop YARN
>  Issue Type: New Feature
>  Components: timelineclient, timelinereader, timelineserver
>Reporter: Vrushali C
>Assignee: Vrushali C
>Priority: Major
> Attachments: TSv2 next steps.pdf
>
>
> This is an umbrella JIRA for the beta 1 milestone for YARN Timeline Service 
> v.2.
> YARN-2928 was alpha1, YARN-5355 was alpha2. 



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Created] (YARN-8500) Use hbase shaded jars

2018-07-06 Thread Vrushali C (JIRA)
Vrushali C created YARN-8500:


 Summary: Use hbase shaded jars
 Key: YARN-8500
 URL: https://issues.apache.org/jira/browse/YARN-8500
 Project: Hadoop YARN
  Issue Type: Sub-task
Reporter: Vrushali C
Assignee: Vrushali C



Move to using HBase shaded jars in ATSv2.

Related JIRA: YARN-7213



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-8302) ATS v2 should handle HBase connection issue properly

2018-07-06 Thread Sunil Govindan (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-8302?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16535246#comment-16535246
 ] 

Sunil Govindan commented on YARN-8302:
--

Thanks [~vinodkv]. I have corrected the YarnConfiguration annotation issue. I 
will raise another JIRA to track common error handling across all storages; 
YARN-8499 has been created to handle this.

> ATS v2 should handle HBase connection issue properly
> 
>
> Key: YARN-8302
> URL: https://issues.apache.org/jira/browse/YARN-8302
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: ATSv2
>Affects Versions: 3.1.0
>Reporter: Yesha Vora
>Assignee: Billie Rinaldi
>Priority: Major
> Attachments: YARN-8302.1.patch, YARN-8302.2.patch, YARN-8302.3.patch
>
>
> ATS v2 calls time out with the error below when ATS can't connect to the 
> HBase instance.
> {code}
> bash-4.2$ curl -i -k -s -1  -H 'Content-Type: application/json'  -H 'Accept: 
> application/json' --max-time 5   --negotiate -u : 
> 'https://xxx:8199/ws/v2/timeline/apps/application_1526357251888_0022/entities/YARN_CONTAINER?fields=ALL&_=1526425686092'
> curl: (28) Operation timed out after 5002 milliseconds with 0 bytes received
> {code}
> {code:title=ATS log}
> 2018-05-15 23:10:03,623 INFO  client.RpcRetryingCallerImpl 
> (RpcRetryingCallerImpl.java:callWithRetries(134)) - Call exception, tries=7, 
> retries=7, started=8165 ms ago, cancelled=false, msg=Call to xxx/xxx:17020 
> failed on connection exception: 
> org.apache.hbase.thirdparty.io.netty.channel.AbstractChannel$AnnotatedConnectException:
>  Connection refused: xxx/xxx:17020, details=row 
> 'prod.timelineservice.app_flow,
> ,99' on table 'hbase:meta' at region=hbase:meta,,1.1588230740, 
> hostname=xxx,17020,1526348294182, seqNum=-1
> 2018-05-15 23:10:13,651 INFO  client.RpcRetryingCallerImpl 
> (RpcRetryingCallerImpl.java:callWithRetries(134)) - Call exception, tries=8, 
> retries=8, started=18192 ms ago, cancelled=false, msg=Call to xxx/xxx:17020 
> failed on connection exception: 
> org.apache.hbase.thirdparty.io.netty.channel.AbstractChannel$AnnotatedConnectException:
>  Connection refused: xxx/xxx:17020, details=row 
> 'prod.timelineservice.app_flow,
> ,99' on table 'hbase:meta' at region=hbase:meta,,1.1588230740, 
> hostname=xxx,17020,1526348294182, seqNum=-1
> 2018-05-15 23:10:23,730 INFO  client.RpcRetryingCallerImpl 
> (RpcRetryingCallerImpl.java:callWithRetries(134)) - Call exception, tries=9, 
> retries=9, started=28272 ms ago, cancelled=false, msg=Call to xxx/xxx:17020 
> failed on connection exception: 
> org.apache.hbase.thirdparty.io.netty.channel.AbstractChannel$AnnotatedConnectException:
>  Connection refused: xxx/xxx:17020, details=row 
> 'prod.timelineservice.app_flow,
> ,99' on table 'hbase:meta' at region=hbase:meta,,1.1588230740, 
> hostname=xxx,17020,1526348294182, seqNum=-1
> 2018-05-15 23:10:33,788 INFO  client.RpcRetryingCallerImpl 
> (RpcRetryingCallerImpl.java:callWithRetries(134)) - Call exception, tries=10, 
> retries=10, started=38330 ms ago, cancelled=false, msg=Call to xxx/xxx:17020 
> failed on connection exception: 
> org.apache.hbase.thirdparty.io.netty.channel.AbstractChannel$AnnotatedConnectException:
>  Connection refused: xxx/xxx:17020, details=row 
> 'prod.timelineservice.app_flow,
> ,99' on table 'hbase:meta' at region=hbase:meta,,1.1588230740, 
> hostname=xxx,17020,1526348294182, seqNum=-1{code}
> There are two issues here:
> 1) Check why ATS can't connect to HBase.
> 2) In case of a connection error, the ATS call should not time out; it 
> should fail with a proper error.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Created] (YARN-8499) ATS v2 should handle connection issues in general for all storages

2018-07-06 Thread Sunil Govindan (JIRA)
Sunil Govindan created YARN-8499:


 Summary: ATS v2 should handle connection issues in general for all 
storages
 Key: YARN-8499
 URL: https://issues.apache.org/jira/browse/YARN-8499
 Project: Hadoop YARN
  Issue Type: Bug
Reporter: Sunil Govindan


Post YARN-8302, HBase connection issues are handled in ATSv2. However, this 
could be made general by introducing an API in the storage interface and 
implementing it in each storage as per that store's semantics.

 

cc [~rohithsharma] [~vinodkv] [~vrushalic]



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-8302) ATS v2 should handle HBase connection issue properly

2018-07-06 Thread Sunil Govindan (JIRA)


 [ 
https://issues.apache.org/jira/browse/YARN-8302?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sunil Govindan updated YARN-8302:
-
Attachment: YARN-8302.3.patch

> ATS v2 should handle HBase connection issue properly
> 
>
> Key: YARN-8302
> URL: https://issues.apache.org/jira/browse/YARN-8302
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: ATSv2
>Affects Versions: 3.1.0
>Reporter: Yesha Vora
>Assignee: Billie Rinaldi
>Priority: Major
> Attachments: YARN-8302.1.patch, YARN-8302.2.patch, YARN-8302.3.patch
>
>
> ATS v2 calls time out with the error below when ATS can't connect to the 
> HBase instance.
> {code}
> bash-4.2$ curl -i -k -s -1  -H 'Content-Type: application/json'  -H 'Accept: 
> application/json' --max-time 5   --negotiate -u : 
> 'https://xxx:8199/ws/v2/timeline/apps/application_1526357251888_0022/entities/YARN_CONTAINER?fields=ALL&_=1526425686092'
> curl: (28) Operation timed out after 5002 milliseconds with 0 bytes received
> {code}
> {code:title=ATS log}
> 2018-05-15 23:10:03,623 INFO  client.RpcRetryingCallerImpl 
> (RpcRetryingCallerImpl.java:callWithRetries(134)) - Call exception, tries=7, 
> retries=7, started=8165 ms ago, cancelled=false, msg=Call to xxx/xxx:17020 
> failed on connection exception: 
> org.apache.hbase.thirdparty.io.netty.channel.AbstractChannel$AnnotatedConnectException:
>  Connection refused: xxx/xxx:17020, details=row 
> 'prod.timelineservice.app_flow,
> ,99' on table 'hbase:meta' at region=hbase:meta,,1.1588230740, 
> hostname=xxx,17020,1526348294182, seqNum=-1
> 2018-05-15 23:10:13,651 INFO  client.RpcRetryingCallerImpl 
> (RpcRetryingCallerImpl.java:callWithRetries(134)) - Call exception, tries=8, 
> retries=8, started=18192 ms ago, cancelled=false, msg=Call to xxx/xxx:17020 
> failed on connection exception: 
> org.apache.hbase.thirdparty.io.netty.channel.AbstractChannel$AnnotatedConnectException:
>  Connection refused: xxx/xxx:17020, details=row 
> 'prod.timelineservice.app_flow,
> ,99' on table 'hbase:meta' at region=hbase:meta,,1.1588230740, 
> hostname=xxx,17020,1526348294182, seqNum=-1
> 2018-05-15 23:10:23,730 INFO  client.RpcRetryingCallerImpl 
> (RpcRetryingCallerImpl.java:callWithRetries(134)) - Call exception, tries=9, 
> retries=9, started=28272 ms ago, cancelled=false, msg=Call to xxx/xxx:17020 
> failed on connection exception: 
> org.apache.hbase.thirdparty.io.netty.channel.AbstractChannel$AnnotatedConnectException:
>  Connection refused: xxx/xxx:17020, details=row 
> 'prod.timelineservice.app_flow,
> ,99' on table 'hbase:meta' at region=hbase:meta,,1.1588230740, 
> hostname=xxx,17020,1526348294182, seqNum=-1
> 2018-05-15 23:10:33,788 INFO  client.RpcRetryingCallerImpl 
> (RpcRetryingCallerImpl.java:callWithRetries(134)) - Call exception, tries=10, 
> retries=10, started=38330 ms ago, cancelled=false, msg=Call to xxx/xxx:17020 
> failed on connection exception: 
> org.apache.hbase.thirdparty.io.netty.channel.AbstractChannel$AnnotatedConnectException:
>  Connection refused: xxx/xxx:17020, details=row 
> 'prod.timelineservice.app_flow,
> ,99' on table 'hbase:meta' at region=hbase:meta,,1.1588230740, 
> hostname=xxx,17020,1526348294182, seqNum=-1{code}
> There are two issues here:
> 1) Check why ATS can't connect to HBase.
> 2) In case of a connection error, the ATS call should not time out; it 
> should fail with a proper error.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-8492) ATSv2 HBase tests are failing with ClassNotFoundException

2018-07-06 Thread Sunil Govindan (JIRA)


 [ 
https://issues.apache.org/jira/browse/YARN-8492?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sunil Govindan updated YARN-8492:
-
Labels: test  (was: )

> ATSv2 HBase tests are failing with ClassNotFoundException
> -
>
> Key: YARN-8492
> URL: https://issues.apache.org/jira/browse/YARN-8492
> Project: Hadoop YARN
>  Issue Type: Bug
>Affects Versions: 3.2.0
>Reporter: Rohith Sharma K S
>Assignee: Rohith Sharma K S
>Priority: Major
>  Labels: test
> Attachments: YARN-8492.01.patch, YARN-8492.02.patch
>
>
> It is seen in recent QA reports that ATSv2 HBase tests are failing with 
> ClassNotFoundException.
> This looks to be a regression from a hadoop-common patch or some other 
> patch. We need to figure out which JIRA broke this and fix the test failures.
>  hadoop.yarn.server.timelineservice.storage.flow.TestHBaseStorageFlowRun
>       
> hadoop.yarn.server.timelineservice.storage.TestHBaseTimelineStorageSchema
>       
> hadoop.yarn.server.timelineservice.storage.TestHBaseTimelineStorageEntities
>       hadoop.yarn.server.timelineservice.storage.TestHBaseTimelineStorageApps
>       hadoop.yarn.server.timelineservice.storage.TestTimelineReaderHBaseDown
>       
> hadoop.yarn.server.timelineservice.storage.flow.TestHBaseStorageFlowRunCompaction
>       
> hadoop.yarn.server.timelineservice.storage.TestHBaseTimelineStorageDomain
>       
> hadoop.yarn.server.timelineservice.reader.TestTimelineReaderWebServicesHBaseStorage
>       
> hadoop.yarn.server.timelineservice.storage.flow.TestHBaseStorageFlowActivity
>  
> {noformat}
> ERROR] 
> org.apache.hadoop.yarn.server.timelineservice.storage.TestHBaseTimelineStorageApps
>   Time elapsed: 0.102 s  <<< ERROR!
> java.lang.NoClassDefFoundError: 
> org/apache/hadoop/crypto/key/KeyProviderTokenIssuer
>   at java.lang.ClassLoader.defineClass1(Native Method)
>   at java.lang.ClassLoader.defineClass(ClassLoader.java:763)
>   at 
> java.security.SecureClassLoader.defineClass(SecureClassLoader.java:142)
>   at java.net.URLClassLoader.defineClass(URLClassLoader.java:467)
>   at java.net.URLClassLoader.access$100(URLClassLoader.java:73)
>   at java.net.URLClassLoader$1.run(URLClassLoader.java:368)
>   at java.net.URLClassLoader$1.run(URLClassLoader.java:362)
>   at java.security.AccessController.doPrivileged(Native Method)
>   at java.net.URLClassLoader.findClass(URLClassLoader.java:361)
>   at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
>   at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:349)
>   at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
>   at 
> org.apache.hadoop.yarn.server.timelineservice.storage.TestHBaseTimelineStorageApps.setupBeforeClass(TestHBaseTimelineStorageApps.java:97)
> Caused by: java.lang.ClassNotFoundException: 
> org.apache.hadoop.crypto.key.KeyProviderTokenIssuer
>   at java.net.URLClassLoader.findClass(URLClassLoader.java:381)
>   at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
>   at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:349)
>   at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
> {noformat}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Assigned] (YARN-6767) Timeline client won't be able to write when TimelineCollector is not up yet, or NM is down

2018-07-06 Thread Vrushali C (JIRA)


 [ 
https://issues.apache.org/jira/browse/YARN-6767?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vrushali C reassigned YARN-6767:


Assignee: Vrushali C

> Timeline client won't be able to write when TimelineCollector is not up yet, 
> or NM is down
> --
>
> Key: YARN-6767
> URL: https://issues.apache.org/jira/browse/YARN-6767
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: timelineclient
>Affects Versions: 3.0.0-alpha4
>Reporter: Haibo Chen
>Assignee: Vrushali C
>Priority: Major
>
> As discussed in the call, when an application first starts to run, its 
> corresponding TimelineCollector instance may not be up yet, or if the 
> TimelineCollector goes down when node manager dies (TimelineCollector now 
> runs as part of NM auxiliary services), the timeline client
> will not be able to write entities. We need to address or mitigate the issue if 
> possible, or at least call it out.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Comment Edited] (YARN-6767) Timeline client won't be able to write when TimelineCollector is not up yet, or NM is down

2018-07-06 Thread Vrushali C (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-6767?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16535235#comment-16535235
 ] 

Vrushali C edited comment on YARN-6767 at 7/6/18 7:01 PM:
--

We need to handle collector fault tolerance, that is, the case of the collector 
going down after it has come up. That is tracked in YARN-7272.

But the case of the collector not being up in the first place needs to be 
handled by the client framework. 


was (Author: vrushalic):
We need to handle collector fault tolerance, that is the case of collector 
going down after it had come up. 

But the case of collector not being up itself in the first place needs to be 
handled by the client framework. 

> Timeline client won't be able to write when TimelineCollector is not up yet, 
> or NM is down
> --
>
> Key: YARN-6767
> URL: https://issues.apache.org/jira/browse/YARN-6767
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: timelineclient
>Affects Versions: 3.0.0-alpha4
>Reporter: Haibo Chen
>Priority: Major
>
> As discussed in the call, when an application first starts to run, its 
> corresponding TimelineCollector instance may not be up yet, or if the 
> TimelineCollector goes down when node manager dies (TimelineCollector now 
> runs as part of NM auxiliary services), the timeline client
> will not be able to write entities. We need to address or mitigate the issue if 
> possible, or at least call it out.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-6767) Timeline client won't be able to write when TimelineCollector is not up yet, or NM is down

2018-07-06 Thread Vrushali C (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-6767?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16535235#comment-16535235
 ] 

Vrushali C commented on YARN-6767:
--

We need to handle collector fault tolerance, that is, the case of the collector 
going down after it has come up. 

But the case of the collector not being up in the first place needs to be 
handled by the client framework. 
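
As an illustration only (this is not the actual TimelineV2Client code), a rough sketch of the kind of buffer-and-retry behaviour the client framework could adopt while no collector address is known yet. The CollectorLookup hook and the backoff values are assumptions made for this sketch.

{code:java}
// Hypothetical sketch: buffer writes and retry with backoff until a collector address is known.
import java.util.Queue;
import java.util.concurrent.ConcurrentLinkedQueue;
import java.util.concurrent.TimeUnit;

public class BufferingTimelineWriter {
  /** Hypothetical hook that returns the collector address, or null if it is not up yet. */
  interface CollectorLookup { String collectorAddress(); }

  private final Queue<String> pending = new ConcurrentLinkedQueue<>();
  private final CollectorLookup lookup;

  BufferingTimelineWriter(CollectorLookup lookup) { this.lookup = lookup; }

  /** Queue the entity; nothing is lost just because the collector is not reachable yet. */
  public void putEntity(String entityJson) { pending.add(entityJson); }

  /** Drain the buffer once a collector shows up, backing off between attempts. */
  public void flush(long maxWaitMillis) throws InterruptedException {
    long waited = 0, backoff = 100;
    while (lookup.collectorAddress() == null && waited < maxWaitMillis) {
      TimeUnit.MILLISECONDS.sleep(backoff);
      waited += backoff;
      backoff = Math.min(backoff * 2, 5_000);
    }
    String address = lookup.collectorAddress();
    if (address == null) {
      throw new IllegalStateException("No TimelineCollector after " + waited + " ms; "
          + pending.size() + " entities still buffered");
    }
    String entity;
    while ((entity = pending.poll()) != null) {
      // A real client would POST the entity to the collector REST endpoint here.
      System.out.println("write " + entity + " -> " + address);
    }
  }
}
{code}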

> Timeline client won't be able to write when TimelineCollector is not up yet, 
> or NM is down
> --
>
> Key: YARN-6767
> URL: https://issues.apache.org/jira/browse/YARN-6767
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: timelineclient
>Affects Versions: 3.0.0-alpha4
>Reporter: Haibo Chen
>Priority: Major
>
> As discussed in the call, when an application first starts to run, its 
> corresponding TimelineCollector instance may not be up yet, or if the 
> TimelineCollector goes down when node manager dies (TimelineCollector now 
> runs as part of NM auxiliary services), the timeline client
> will not be able to write entities. We need to address or mitigate the issue if 
> possible, or at least call it out.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-8302) ATS v2 should handle HBase connection issue properly

2018-07-06 Thread genericqa (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-8302?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16535233#comment-16535233
 ] 

genericqa commented on YARN-8302:
-

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
44s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 1 new or modified test 
files. {color} |
|| || || || {color:brown} trunk Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  1m  
6s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 28m 
22s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  8m 
20s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
22s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  1m 
49s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
13m 19s{color} | {color:green} branch has no errors when building and testing 
our client artifacts. {color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  0m  
0s{color} | {color:blue} Skipped patched modules with no Java source: 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-timelineservice-hbase-tests
 {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  2m  
0s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m 
22s{color} | {color:green} trunk passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
12s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  1m 
22s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  8m 
40s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  8m 
40s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
20s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  1m 
41s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
12m  6s{color} | {color:green} patch has no errors when building and testing 
our client artifacts. {color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  0m  
0s{color} | {color:blue} Skipped patched modules with no Java source: 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-timelineservice-hbase-tests
 {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  2m 
25s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m 
20s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  0m 
46s{color} | {color:green} hadoop-yarn-api in the patch passed. {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  0m 
33s{color} | {color:green} hadoop-yarn-server-timelineservice-hbase-client in 
the patch passed. {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red}  0m 41s{color} 
| {color:red} hadoop-yarn-server-timelineservice-hbase-tests in the patch 
failed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
36s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 87m 46s{color} | 
{color:black} {color} |
\\
\\
|| Reason || Tests ||
| Failed junit tests | 
hadoop.yarn.server.timelineservice.storage.TestHBaseTimelineStorageDomain |
|   | 

[jira] [Commented] (YARN-7451) Add missing tests to verify the presence of custom resources of RM apps and scheduler webservice endpoints

2018-07-06 Thread Hudson (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-7451?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16535219#comment-16535219
 ] 

Hudson commented on YARN-7451:
--

SUCCESS: Integrated in Jenkins build Hadoop-trunk-Commit #14529 (See 
[https://builds.apache.org/job/Hadoop-trunk-Commit/14529/])
YARN-7451. Add missing tests to verify the presence of custom resources 
(sunilg: rev a129e3e74e16ed039d637dc1499dc3e5df317d94)
* (add) 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/webapp/helper/XmlCustomResourceTypeTestCase.java
* (edit) 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fair/TestFairSchedulerConfiguration.java
* (edit) 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/webapp/TestRMWebServicesFairScheduler.java
* (add) 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/webapp/helper/AppInfoXmlVerifications.java
* (add) 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/webapp/helper/BufferedClientResponse.java
* (edit) 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/webapp/dao/SchedulerInfo.java
* (add) 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/webapp/helper/ResourceRequestsJsonVerifications.java
* (add) 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/webapp/helper/ResourceRequestsXmlVerifications.java
* (add) 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/webapp/fairscheduler/FairSchedulerXmlVerifications.java
* (edit) 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/webapp/TestRMWebServicesCapacitySched.java
* (add) 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/webapp/fairscheduler/CustomResourceTypesConfigurationProvider.java
* (add) 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/webapp/helper/AppInfoJsonVerifications.java
* (add) 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/webapp/TestRMWebServicesAppsCustomResourceTypes.java
* (edit) 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/webapp/TestRMWebServicesSchedulerActivities.java
* (edit) 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/webapp/dao/AppInfo.java
* (add) 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/webapp/fairscheduler/FairSchedulerJsonVerifications.java
* (edit) 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/webapp/TestRMWebServices.java
* (edit) 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/webapp/TestRMWebServicesConfigurationMutation.java
* (add) 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/webapp/fairscheduler/TestRMWebServicesFairSchedulerCustomResourceTypes.java
* (add) 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/webapp/helper/JsonCustomResourceTypeTestcase.java
* (edit) 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/webapp/TestRMWebServicesApps.java


> Add missing tests to verify the presence of custom resources of RM apps and 
> scheduler webservice endpoints
> --
>
> Key: YARN-7451

[jira] [Commented] (YARN-7556) Fair scheduler configuration should allow resource types in the minResources and maxResources properties

2018-07-06 Thread Hudson (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-7556?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16535218#comment-16535218
 ] 

Hudson commented on YARN-7556:
--

SUCCESS: Integrated in Jenkins build Hadoop-trunk-Commit #14529 (See 
[https://builds.apache.org/job/Hadoop-trunk-Commit/14529/])
YARN-7556. Fair scheduler configuration should allow resource types in (sunilg: 
rev 9edc74f64a31450af3c55c0dadf352862e4b359d)
* (edit) hadoop-yarn-project/hadoop-yarn/dev-support/findbugs-exclude.xml
* (edit) 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fair/ConfigurableResource.java
* (edit) 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api/src/main/java/org/apache/hadoop/yarn/api/records/impl/LightWeightResource.java
* (edit) 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api/src/main/java/org/apache/hadoop/yarn/api/records/Resource.java
* (edit) 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fair/allocation/AllocationFileQueueParser.java
* (edit) 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fair/FairSchedulerConfiguration.java
* (edit) 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fair/TestFairSchedulerConfiguration.java
* (edit) 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-site/src/site/markdown/FairScheduler.md


> Fair scheduler configuration should allow resource types in the minResources 
> and maxResources properties
> 
>
> Key: YARN-7556
> URL: https://issues.apache.org/jira/browse/YARN-7556
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: fairscheduler
>Affects Versions: 3.0.0-beta1
>Reporter: Daniel Templeton
>Assignee: Szilard Nemeth
>Priority: Critical
> Fix For: 3.2.0
>
> Attachments: YARN-7556.001.patch, YARN-7556.002.patch, 
> YARN-7556.003.patch, YARN-7556.004.patch, YARN-7556.005.patch, 
> YARN-7556.006.patch, YARN-7556.007.patch, YARN-7556.008.patch, 
> YARN-7556.009.patch
>
>




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-7753) [UI2] Meta information about Application logs has to be pulled from ATS 1.5 instead of ATS2

2018-07-06 Thread Vinod Kumar Vavilapalli (JIRA)


 [ 
https://issues.apache.org/jira/browse/YARN-7753?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vinod Kumar Vavilapalli updated YARN-7753:
--
Summary: [UI2] Meta information about Application logs has to be pulled 
from ATS 1.5 instead of ATS2  (was: [UI2] Application logs has to be pulled 
from ATS 1.5 instead of ATS2)

> [UI2] Meta information about Application logs has to be pulled from ATS 1.5 
> instead of ATS2
> ---
>
> Key: YARN-7753
> URL: https://issues.apache.org/jira/browse/YARN-7753
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: yarn-ui-v2
>Reporter: Sunil Govindan
>Assignee: Sunil Govindan
>Priority: Major
> Fix For: 3.1.0
>
> Attachments: YARN-7753.001.patch, YARN-7753.002.patch
>
>
> Currently the UI tries to pull logs from ATS v2. Instead, they should be pulled 
> from ATS v1, as ATSv2 doesn't have a log story yet.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-8302) ATS v2 should handle HBase connection issue properly

2018-07-06 Thread Vinod Kumar Vavilapalli (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-8302?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16535208#comment-16535208
 ] 

Vinod Kumar Vavilapalli commented on YARN-8302:
---

This should be generic to all storages - "if storage is down, give errors on 
the API" - not just for HBase Storage, no?
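
A minimal sketch of that idea, independent of any particular backend: bound the storage read and convert backend unavailability into an immediate, descriptive error instead of letting the REST call hang. The TimelineFetch interface and the timeout handling are assumptions for illustration, not code from the attached patch.

{code:java}
// Illustrative only: bound any storage read and turn backend unavailability into a fast, clear error.
import java.util.concurrent.*;

public class BoundedReader {
  /** Hypothetical stand-in for a storage read (HBase, filesystem, anything). */
  interface TimelineFetch<T> extends Callable<T> {}

  private final ExecutorService pool = Executors.newCachedThreadPool();

  public <T> T read(TimelineFetch<T> fetch, long timeoutMillis) {
    Future<T> f = pool.submit(fetch);
    try {
      return f.get(timeoutMillis, TimeUnit.MILLISECONDS);
    } catch (TimeoutException e) {
      f.cancel(true);
      // A real reader would map this to an HTTP error with a message, not a client-side timeout.
      throw new RuntimeException("timeline storage did not respond within " + timeoutMillis + " ms", e);
    } catch (ExecutionException e) {
      throw new RuntimeException("timeline storage is unavailable: " + e.getCause(), e);
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
      throw new RuntimeException("interrupted while reading timeline storage", e);
    }
  }
}
{code}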

> ATS v2 should handle HBase connection issue properly
> 
>
> Key: YARN-8302
> URL: https://issues.apache.org/jira/browse/YARN-8302
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: ATSv2
>Affects Versions: 3.1.0
>Reporter: Yesha Vora
>Assignee: Billie Rinaldi
>Priority: Major
> Attachments: YARN-8302.1.patch, YARN-8302.2.patch
>
>
> The ATS v2 call times out with the error below when it can't connect to the 
> HBase instance.
> {code}
> bash-4.2$ curl -i -k -s -1  -H 'Content-Type: application/json'  -H 'Accept: 
> application/json' --max-time 5   --negotiate -u : 
> 'https://xxx:8199/ws/v2/timeline/apps/application_1526357251888_0022/entities/YARN_CONTAINER?fields=ALL&_=1526425686092'
> curl: (28) Operation timed out after 5002 milliseconds with 0 bytes received
> {code}
> {code:title=ATS log}
> 2018-05-15 23:10:03,623 INFO  client.RpcRetryingCallerImpl 
> (RpcRetryingCallerImpl.java:callWithRetries(134)) - Call exception, tries=7, 
> retries=7, started=8165 ms ago, cancelled=false, msg=Call to xxx/xxx:17020 
> failed on connection exception: 
> org.apache.hbase.thirdparty.io.netty.channel.AbstractChannel$AnnotatedConnectException:
>  Connection refused: xxx/xxx:17020, details=row 
> 'prod.timelineservice.app_flow,
> ,99' on table 'hbase:meta' at region=hbase:meta,,1.1588230740, 
> hostname=xxx,17020,1526348294182, seqNum=-1
> 2018-05-15 23:10:13,651 INFO  client.RpcRetryingCallerImpl 
> (RpcRetryingCallerImpl.java:callWithRetries(134)) - Call exception, tries=8, 
> retries=8, started=18192 ms ago, cancelled=false, msg=Call to xxx/xxx:17020 
> failed on connection exception: 
> org.apache.hbase.thirdparty.io.netty.channel.AbstractChannel$AnnotatedConnectException:
>  Connection refused: xxx/xxx:17020, details=row 
> 'prod.timelineservice.app_flow,
> ,99' on table 'hbase:meta' at region=hbase:meta,,1.1588230740, 
> hostname=xxx,17020,1526348294182, seqNum=-1
> 2018-05-15 23:10:23,730 INFO  client.RpcRetryingCallerImpl 
> (RpcRetryingCallerImpl.java:callWithRetries(134)) - Call exception, tries=9, 
> retries=9, started=28272 ms ago, cancelled=false, msg=Call to xxx/xxx:17020 
> failed on connection exception: 
> org.apache.hbase.thirdparty.io.netty.channel.AbstractChannel$AnnotatedConnectException:
>  Connection refused: xxx/xxx:17020, details=row 
> 'prod.timelineservice.app_flow,
> ,99' on table 'hbase:meta' at region=hbase:meta,,1.1588230740, 
> hostname=xxx,17020,1526348294182, seqNum=-1
> 2018-05-15 23:10:33,788 INFO  client.RpcRetryingCallerImpl 
> (RpcRetryingCallerImpl.java:callWithRetries(134)) - Call exception, tries=10, 
> retries=10, started=38330 ms ago, cancelled=false, msg=Call to xxx/xxx:17020 
> failed on connection exception: 
> org.apache.hbase.thirdparty.io.netty.channel.AbstractChannel$AnnotatedConnectException:
>  Connection refused: xxx/xxx:17020, details=row 
> 'prod.timelineservice.app_flow,
> ,99' on table 'hbase:meta' at region=hbase:meta,,1.1588230740, 
> hostname=xxx,17020,1526348294182, seqNum=-1{code}
> There are two issues here.
> 1) Check why ATS can't connect to HBase.
> 2) In case of a connection error, the ATS call should not time out. It should 
> fail with a proper error.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-8435) Fix NPE when the same client simultaneously contact for the first time Yarn Router

2018-07-06 Thread Giovanni Matteo Fumarola (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-8435?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16535206#comment-16535206
 ] 

Giovanni Matteo Fumarola commented on YARN-8435:


Thanks [~sunilg] for the comment.
Yes, it is supposed to land on trunk. I updated the fix version.

> Fix NPE when the same client simultaneously contact for the first time Yarn 
> Router
> --
>
> Key: YARN-8435
> URL: https://issues.apache.org/jira/browse/YARN-8435
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: router
>Affects Versions: 2.9.0, 3.0.2
>Reporter: rangjiaheng
>Assignee: rangjiaheng
>Priority: Critical
> Fix For: 3.2.0
>
> Attachments: YARN-8435.v1.patch, YARN-8435.v2.patch, 
> YARN-8435.v3.patch, YARN-8435.v4.patch, YARN-8435.v5.patch, 
> YARN-8435.v6.patch, YARN-8435.v7.patch
>
>
> When two client processes (with the same user name and the same hostname) begin 
> to connect to the Yarn Router at the same time, to submit an application, kill 
> an application, ... and so on, a java.lang.NullPointerException may be thrown 
> by the Yarn Router.
>  
>  
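
As a generic illustration of the check-then-act race the description points at (names are hypothetical, this is not the Router code), the usual remedy is to create the per-user state atomically:

{code:java}
// Generic illustration of the race: two threads see "no pipeline for this user" at the
// same time, and one of them can later observe a half-initialized or replaced entry.
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

public class PipelineCache {
  static final class Pipeline {
    final String user;
    Pipeline(String user) { this.user = user; /* expensive init would happen here */ }
  }

  private final ConcurrentMap<String, Pipeline> pipelines = new ConcurrentHashMap<>();

  // Racy version: check-then-put lets two callers interleave between get() and put().
  Pipeline getRacy(String user) {
    Pipeline p = pipelines.get(user);
    if (p == null) {
      p = new Pipeline(user);
      pipelines.put(user, p);   // the second caller may overwrite the first one's entry
    }
    return p;
  }

  // Safe version: computeIfAbsent creates the pipeline atomically, exactly once per user.
  Pipeline getSafe(String user) {
    return pipelines.computeIfAbsent(user, Pipeline::new);
  }
}
{code}

Whether the Router's actual fix takes this form or adds explicit locking is up to the attached patches; the sketch only shows the shape of the problem.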



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-8435) Fix NPE when the same client simultaneously contact for the first time Yarn Router

2018-07-06 Thread Giovanni Matteo Fumarola (JIRA)


 [ 
https://issues.apache.org/jira/browse/YARN-8435?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Giovanni Matteo Fumarola updated YARN-8435:
---
Fix Version/s: 3.2.0

> Fix NPE when the same client simultaneously contact for the first time Yarn 
> Router
> --
>
> Key: YARN-8435
> URL: https://issues.apache.org/jira/browse/YARN-8435
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: router
>Affects Versions: 2.9.0, 3.0.2
>Reporter: rangjiaheng
>Assignee: rangjiaheng
>Priority: Critical
> Fix For: 3.2.0
>
> Attachments: YARN-8435.v1.patch, YARN-8435.v2.patch, 
> YARN-8435.v3.patch, YARN-8435.v4.patch, YARN-8435.v5.patch, 
> YARN-8435.v6.patch, YARN-8435.v7.patch
>
>
> When two client processes (with the same user name and the same hostname) begin 
> to connect to the Yarn Router at the same time, to submit an application, kill 
> an application, ... and so on, a java.lang.NullPointerException may be thrown 
> by the Yarn Router.
>  
>  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-8435) Fix NPE when the same client simultaneously contact for the first time Yarn Router

2018-07-06 Thread Sunil Govindan (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-8435?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16535183#comment-16535183
 ] 

Sunil Govindan commented on YARN-8435:
--

[~giovanni.fumarola] Is this Jira supposed to land on trunk?

Also, when you commit, could you please update the fix version as well? Thank you.

> Fix NPE when the same client simultaneously contact for the first time Yarn 
> Router
> --
>
> Key: YARN-8435
> URL: https://issues.apache.org/jira/browse/YARN-8435
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: router
>Affects Versions: 2.9.0, 3.0.2
>Reporter: rangjiaheng
>Assignee: rangjiaheng
>Priority: Critical
> Attachments: YARN-8435.v1.patch, YARN-8435.v2.patch, 
> YARN-8435.v3.patch, YARN-8435.v4.patch, YARN-8435.v5.patch, 
> YARN-8435.v6.patch, YARN-8435.v7.patch
>
>
> When two client processes (with the same user name and the same hostname) begin 
> to connect to the Yarn Router at the same time, to submit an application, kill 
> an application, ... and so on, a java.lang.NullPointerException may be thrown 
> by the Yarn Router.
>  
>  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-8302) ATS v2 should handle HBase connection issue properly

2018-07-06 Thread Rohith Sharma K S (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-8302?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16535150#comment-16535150
 ] 

Rohith Sharma K S commented on YARN-8302:
-

+1 for the latest patch, pending Jenkins. I ran the patch with both profiles and 
the tests executed fine.

> ATS v2 should handle HBase connection issue properly
> 
>
> Key: YARN-8302
> URL: https://issues.apache.org/jira/browse/YARN-8302
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: ATSv2
>Affects Versions: 3.1.0
>Reporter: Yesha Vora
>Assignee: Billie Rinaldi
>Priority: Major
> Attachments: YARN-8302.1.patch, YARN-8302.2.patch
>
>
> The ATS v2 call times out with the error below when it can't connect to the 
> HBase instance.
> {code}
> bash-4.2$ curl -i -k -s -1  -H 'Content-Type: application/json'  -H 'Accept: 
> application/json' --max-time 5   --negotiate -u : 
> 'https://xxx:8199/ws/v2/timeline/apps/application_1526357251888_0022/entities/YARN_CONTAINER?fields=ALL&_=1526425686092'
> curl: (28) Operation timed out after 5002 milliseconds with 0 bytes received
> {code}
> {code:title=ATS log}
> 2018-05-15 23:10:03,623 INFO  client.RpcRetryingCallerImpl 
> (RpcRetryingCallerImpl.java:callWithRetries(134)) - Call exception, tries=7, 
> retries=7, started=8165 ms ago, cancelled=false, msg=Call to xxx/xxx:17020 
> failed on connection exception: 
> org.apache.hbase.thirdparty.io.netty.channel.AbstractChannel$AnnotatedConnectException:
>  Connection refused: xxx/xxx:17020, details=row 
> 'prod.timelineservice.app_flow,
> ,99' on table 'hbase:meta' at region=hbase:meta,,1.1588230740, 
> hostname=xxx,17020,1526348294182, seqNum=-1
> 2018-05-15 23:10:13,651 INFO  client.RpcRetryingCallerImpl 
> (RpcRetryingCallerImpl.java:callWithRetries(134)) - Call exception, tries=8, 
> retries=8, started=18192 ms ago, cancelled=false, msg=Call to xxx/xxx:17020 
> failed on connection exception: 
> org.apache.hbase.thirdparty.io.netty.channel.AbstractChannel$AnnotatedConnectException:
>  Connection refused: xxx/xxx:17020, details=row 
> 'prod.timelineservice.app_flow,
> ,99' on table 'hbase:meta' at region=hbase:meta,,1.1588230740, 
> hostname=xxx,17020,1526348294182, seqNum=-1
> 2018-05-15 23:10:23,730 INFO  client.RpcRetryingCallerImpl 
> (RpcRetryingCallerImpl.java:callWithRetries(134)) - Call exception, tries=9, 
> retries=9, started=28272 ms ago, cancelled=false, msg=Call to xxx/xxx:17020 
> failed on connection exception: 
> org.apache.hbase.thirdparty.io.netty.channel.AbstractChannel$AnnotatedConnectException:
>  Connection refused: xxx/xxx:17020, details=row 
> 'prod.timelineservice.app_flow,
> ,99' on table 'hbase:meta' at region=hbase:meta,,1.1588230740, 
> hostname=xxx,17020,1526348294182, seqNum=-1
> 2018-05-15 23:10:33,788 INFO  client.RpcRetryingCallerImpl 
> (RpcRetryingCallerImpl.java:callWithRetries(134)) - Call exception, tries=10, 
> retries=10, started=38330 ms ago, cancelled=false, msg=Call to xxx/xxx:17020 
> failed on connection exception: 
> org.apache.hbase.thirdparty.io.netty.channel.AbstractChannel$AnnotatedConnectException:
>  Connection refused: xxx/xxx:17020, details=row 
> 'prod.timelineservice.app_flow,
> ,99' on table 'hbase:meta' at region=hbase:meta,,1.1588230740, 
> hostname=xxx,17020,1526348294182, seqNum=-1{code}
> There are two issues here.
> 1) Check why ATS can't connect to HBase.
> 2) In case of a connection error, the ATS call should not time out. It should 
> fail with a proper error.
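
For reference, one common way to keep an unreachable HBase from being retried for minutes, as in the log above, is to bound the client retries and timeouts via standard HBase client settings. The values below are arbitrary examples and are not taken from the attached patch.

{code:java}
// Illustrative only: bound how long the HBase client keeps retrying so a dead HBase
// surfaces as an error quickly instead of retry-looping while the REST call times out.
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;

public class BoundedHBaseConf {
  public static Configuration create() {
    Configuration conf = HBaseConfiguration.create();
    conf.setInt("hbase.client.retries.number", 3);         // default is much higher
    conf.setInt("hbase.rpc.timeout", 10_000);              // per-RPC timeout in ms
    conf.setInt("hbase.client.operation.timeout", 30_000); // overall per-operation budget in ms
    return conf;
  }
}
{code}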



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-8302) ATS v2 should handle HBase connection issue properly

2018-07-06 Thread Sunil Govindan (JIRA)


 [ 
https://issues.apache.org/jira/browse/YARN-8302?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sunil Govindan updated YARN-8302:
-
Attachment: YARN-8302.2.patch

> ATS v2 should handle HBase connection issue properly
> 
>
> Key: YARN-8302
> URL: https://issues.apache.org/jira/browse/YARN-8302
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: ATSv2
>Affects Versions: 3.1.0
>Reporter: Yesha Vora
>Assignee: Billie Rinaldi
>Priority: Major
> Attachments: YARN-8302.1.patch, YARN-8302.2.patch
>
>
> The ATS v2 call times out with the error below when it can't connect to the 
> HBase instance.
> {code}
> bash-4.2$ curl -i -k -s -1  -H 'Content-Type: application/json'  -H 'Accept: 
> application/json' --max-time 5   --negotiate -u : 
> 'https://xxx:8199/ws/v2/timeline/apps/application_1526357251888_0022/entities/YARN_CONTAINER?fields=ALL&_=1526425686092'
> curl: (28) Operation timed out after 5002 milliseconds with 0 bytes received
> {code}
> {code:title=ATS log}
> 2018-05-15 23:10:03,623 INFO  client.RpcRetryingCallerImpl 
> (RpcRetryingCallerImpl.java:callWithRetries(134)) - Call exception, tries=7, 
> retries=7, started=8165 ms ago, cancelled=false, msg=Call to xxx/xxx:17020 
> failed on connection exception: 
> org.apache.hbase.thirdparty.io.netty.channel.AbstractChannel$AnnotatedConnectException:
>  Connection refused: xxx/xxx:17020, details=row 
> 'prod.timelineservice.app_flow,
> ,99' on table 'hbase:meta' at region=hbase:meta,,1.1588230740, 
> hostname=xxx,17020,1526348294182, seqNum=-1
> 2018-05-15 23:10:13,651 INFO  client.RpcRetryingCallerImpl 
> (RpcRetryingCallerImpl.java:callWithRetries(134)) - Call exception, tries=8, 
> retries=8, started=18192 ms ago, cancelled=false, msg=Call to xxx/xxx:17020 
> failed on connection exception: 
> org.apache.hbase.thirdparty.io.netty.channel.AbstractChannel$AnnotatedConnectException:
>  Connection refused: xxx/xxx:17020, details=row 
> 'prod.timelineservice.app_flow,
> ,99' on table 'hbase:meta' at region=hbase:meta,,1.1588230740, 
> hostname=xxx,17020,1526348294182, seqNum=-1
> 2018-05-15 23:10:23,730 INFO  client.RpcRetryingCallerImpl 
> (RpcRetryingCallerImpl.java:callWithRetries(134)) - Call exception, tries=9, 
> retries=9, started=28272 ms ago, cancelled=false, msg=Call to xxx/xxx:17020 
> failed on connection exception: 
> org.apache.hbase.thirdparty.io.netty.channel.AbstractChannel$AnnotatedConnectException:
>  Connection refused: xxx/xxx:17020, details=row 
> 'prod.timelineservice.app_flow,
> ,99' on table 'hbase:meta' at region=hbase:meta,,1.1588230740, 
> hostname=xxx,17020,1526348294182, seqNum=-1
> 2018-05-15 23:10:33,788 INFO  client.RpcRetryingCallerImpl 
> (RpcRetryingCallerImpl.java:callWithRetries(134)) - Call exception, tries=10, 
> retries=10, started=38330 ms ago, cancelled=false, msg=Call to xxx/xxx:17020 
> failed on connection exception: 
> org.apache.hbase.thirdparty.io.netty.channel.AbstractChannel$AnnotatedConnectException:
>  Connection refused: xxx/xxx:17020, details=row 
> 'prod.timelineservice.app_flow,
> ,99' on table 'hbase:meta' at region=hbase:meta,,1.1588230740, 
> hostname=xxx,17020,1526348294182, seqNum=-1{code}
> There are two issues here.
> 1) Check why ATS can't connect to HBase.
> 2) In case of a connection error, the ATS call should not time out. It should 
> fail with a proper error.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-8480) Add boolean option for resources

2018-07-06 Thread Sunil Govindan (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-8480?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16535146#comment-16535146
 ] 

Sunil Govindan commented on YARN-8480:
--

I think a boolean resource is a somewhat vague concept to me. Currently we are 
considering only COUNTABLE resources, and since you have mentioned that a boolean 
resource is not consumable, it more or less falls into the NodeAttribute space. 
Node attributes don't hold any resources, unlike resources/labels. 

I think non-countable, sharable resources in YARN are very much an item whose 
usage still needs to be discussed. Since this affects the resource calculator, we 
need to be clear on how it will be used in normal calculations.

 

> Add boolean option for resources
> 
>
> Key: YARN-8480
> URL: https://issues.apache.org/jira/browse/YARN-8480
> Project: Hadoop YARN
>  Issue Type: Improvement
>Reporter: Daniel Templeton
>Assignee: Szilard Nemeth
>Priority: Major
>
> Make it possible to define a resource with a boolean value.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-8480) Add boolean option for resources

2018-07-06 Thread Daniel Templeton (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-8480?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16535129#comment-16535129
 ] 

Daniel Templeton commented on YARN-8480:


A boolean resource wouldn't be consumed by its use.  The node attributes work 
is definitely in the same vein, though much heavier weight.  The other 
difference is that there's no dependency on node labels or placement 
constraints here, so the scheduler implementation cost is very low.  There is 
some overlap with node attributes, but when we have three entirely distinct 
ways to specify resources (resources, node labels, and node attributes), that's 
not too surprising.
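
To make the "not consumed by its use" point concrete, here is a hypothetical sketch (boolean resources are a proposal, not an existing YARN API) of how a fit check for a boolean capability differs from a countable resource:

{code:java}
// Hypothetical sketch only: a countable resource is subtracted on allocation, a boolean
// one is just matched and never consumed, which is why the scheduler-side cost is low.
public class ResourceFitSketch {
  // Countable: capacity shrinks as containers are placed.
  static boolean fitsCountable(long requested, long available) {
    return requested <= available;
  }
  static long allocateCountable(long requested, long available) {
    return available - requested;          // consumed
  }

  // Boolean: either the node offers the capability or it does not; nothing is subtracted.
  static boolean fitsBoolean(boolean requested, boolean nodeHasIt) {
    return !requested || nodeHasIt;        // never consumed
  }

  public static void main(String[] args) {
    System.out.println(fitsCountable(2, 8));       // true, and 6 would remain
    System.out.println(fitsBoolean(true, false));  // false: node lacks the capability
    System.out.println(fitsBoolean(true, true));   // true, and stays true for the next request
  }
}
{code}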

> Add boolean option for resources
> 
>
> Key: YARN-8480
> URL: https://issues.apache.org/jira/browse/YARN-8480
> Project: Hadoop YARN
>  Issue Type: Improvement
>Reporter: Daniel Templeton
>Assignee: Szilard Nemeth
>Priority: Major
>
> Make it possible to define a resource with a boolean value.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-8193) YARN RM hangs abruptly (stops allocating resources) when running successive applications.

2018-07-06 Thread Jason Lowe (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-8193?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16535019#comment-16535019
 ] 

Jason Lowe commented on YARN-8193:
--

bq. The build failed due to some reason not related to the patch:

The problem can be seen in the precommit log:
{noformat}
Switched to branch 'branch-2.9.0'
Your branch is up-to-date with 'origin/branch-2.9.0'.
HEAD is now at 756ebc8 HADOOP-15036. Update LICENSE.txt for HADOOP-14840. 
(asuresh)
{noformat}

The patch targets branch-2.9.0, which does not have HADOOP-15375, the fix for 
the cert error during the build.  I will upload the same patch against 
branch-2, the next place this should be committed before it goes to 
branch-2.9, which does have the cert fix.  The patch applies cleanly, so it's 
the same bits.  Hopefully this will result in a useful Jenkins run.

> YARN RM hangs abruptly (stops allocating resources) when running successive 
> applications.
> -
>
> Key: YARN-8193
> URL: https://issues.apache.org/jira/browse/YARN-8193
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: yarn
>Reporter: Zian Chen
>Assignee: Zian Chen
>Priority: Critical
> Fix For: 3.2.0, 3.1.1
>
> Attachments: YARN-8193-branch-2-001.patch, 
> YARN-8193-branch-2.9.0-001.patch, YARN-8193.001.patch, YARN-8193.002.patch
>
>
> When running massive queries successively, at some point the RM just hangs and 
> stops allocating resources. At the point the RM hangs, YARN throws a 
> NullPointerException at RegularContainerAllocator.getLocalityWaitFactor.
> There's sufficient space given to yarn.nodemanager.local-dirs (not a node 
> health issue, RM didn't report any node being unhealthy). There is no fixed 
> trigger for this (query or operation).
> This problem goes away on restarting ResourceManager. No NM restart is 
> required. 
>  
>  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-8193) YARN RM hangs abruptly (stops allocating resources) when running successive applications.

2018-07-06 Thread Jason Lowe (JIRA)


 [ 
https://issues.apache.org/jira/browse/YARN-8193?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jason Lowe updated YARN-8193:
-
Attachment: YARN-8193-branch-2-001.patch

> YARN RM hangs abruptly (stops allocating resources) when running successive 
> applications.
> -
>
> Key: YARN-8193
> URL: https://issues.apache.org/jira/browse/YARN-8193
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: yarn
>Reporter: Zian Chen
>Assignee: Zian Chen
>Priority: Critical
> Fix For: 3.2.0, 3.1.1
>
> Attachments: YARN-8193-branch-2-001.patch, 
> YARN-8193-branch-2.9.0-001.patch, YARN-8193.001.patch, YARN-8193.002.patch
>
>
> When running massive queries successively, at some point the RM just hangs and 
> stops allocating resources. At the point the RM hangs, YARN throws a 
> NullPointerException at RegularContainerAllocator.getLocalityWaitFactor.
> There's sufficient space given to yarn.nodemanager.local-dirs (not a node 
> health issue, RM didn't report any node being unhealthy). There is no fixed 
> trigger for this (query or operation).
> This problem goes away on restarting ResourceManager. No NM restart is 
> required. 
>  
>  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Resolved] (YARN-8462) Resource Manager shutdown with FATAL Exception

2018-07-06 Thread Jason Lowe (JIRA)


 [ 
https://issues.apache.org/jira/browse/YARN-8462?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jason Lowe resolved YARN-8462.
--
Resolution: Duplicate

This is being handled by YARN-8193 with a new branch-2 patch posted there.

> Resource Manager shutdown with FATAL Exception
> --
>
> Key: YARN-8462
> URL: https://issues.apache.org/jira/browse/YARN-8462
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: capacity scheduler, resourcemanager
>Affects Versions: 2.9.0
>Reporter: Amithsha
>Priority: Critical
>
> Intermittently, the Resource Manager goes down with the following exceptions: 
>  
> 2018-06-25 15:24:30,572 FATAL event.EventDispatcher 
> (EventDispatcher.java:run(75)) - Error in handling event type NODE_UPDATE to 
> the Event Dispatcher
> java.lang.NullPointerException
>         at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.allocator.RegularContainerAllocator.getLocalityWaitFactor(RegularContainerAllocator.java:268)
>         at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.allocator.RegularContainerAllocator.canAssign(RegularContainerAllocator.java:315)
>         at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.allocator.RegularContainerAllocator.assignOffSwitchContainers(RegularContainerAllocator.java:388)
>         at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.allocator.RegularContainerAllocator.assignContainersOnNode(RegularContainerAllocator.java:469)
>         at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.allocator.RegularContainerAllocator.tryAllocateOnNode(RegularContainerAllocator.java:250)
>         at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.allocator.RegularContainerAllocator.allocate(RegularContainerAllocator.java:819)
>         at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.allocator.RegularContainerAllocator.assignContainers(RegularContainerAllocator.java:857)
>         at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.allocator.ContainerAllocator.assignContainers(ContainerAllocator.java:55)
>         at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.common.fica.FiCaSchedulerApp.assignContainers(FiCaSchedulerApp.java:868)
>         at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.LeafQueue.assignContainers(LeafQueue.java:1121)
>         at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.ParentQueue.assignContainersToChildQueues(ParentQueue.java:734)
>         at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.ParentQueue.assignContainers(ParentQueue.java:558)
>         at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.ParentQueue.assignContainersToChildQueues(ParentQueue.java:734)
>         at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.ParentQueue.assignContainers(ParentQueue.java:558)
>         at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.ParentQueue.assignContainersToChildQueues(ParentQueue.java:734)
>         at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.ParentQueue.assignContainers(ParentQueue.java:558)
>         at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.allocateOrReserveNewContainers(CapacityScheduler.java:1338)
>         at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.allocateContainerOnSingleNode(CapacityScheduler.java:1333)
>         at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.allocateContainersToNode(CapacityScheduler.java:1422)
>         at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.allocateContainersToNode(CapacityScheduler.java:1197)
>         at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.nodeUpdate(CapacityScheduler.java:1059)
>         at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.handle(CapacityScheduler.java:1464)
>         at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.handle(CapacityScheduler.java:150)
>         at 
> org.apache.hadoop.yarn.event.EventDispatcher$EventProcessor.run(EventDispatcher.java:66)
>         at java.lang.Thread.run(Thread.java:745)
> 2018-06-25 15:24:30,573 INFO  event.EventDispatcher 
> (EventDispatcher.java:run(79)) - Exiting, bbye..
> 2018-06-25 15:24:30,579 ERROR delegation.AbstractDelegationTokenSecretManager 
> (AbstractDelegationTokenSecretManager.java:run(690)) - ExpiredTokenRemover 
> received java.lang.InterruptedException: sleep interrupted
>  
> Before the build we applied the patches 

[jira] [Updated] (YARN-7481) Gpu locality support for Better AI scheduling

2018-07-06 Thread Chen Qingcha (JIRA)


 [ 
https://issues.apache.org/jira/browse/YARN-7481?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chen Qingcha updated YARN-7481:
---
Attachment: (was: hadoop_2.9.0.patch)

> Gpu locality support for Better AI scheduling
> -
>
> Key: YARN-7481
> URL: https://issues.apache.org/jira/browse/YARN-7481
> Project: Hadoop YARN
>  Issue Type: New Feature
>  Components: api, RM, yarn
>Affects Versions: 2.7.2
>Reporter: Chen Qingcha
>Priority: Major
> Fix For: 2.7.2
>
> Attachments: GPU locality support for Job scheduling.pdf, 
> hadoop-2.7.2-gpu.patch, hadoop-2.7.2.gpu-port.patch, hadoop_2.9.0.patch
>
>   Original Estimate: 1,344h
>  Remaining Estimate: 1,344h
>
> We enhance Hadoop with GPU support for better AI job scheduling. 
> Currently, YARN-3926 also supports GPU scheduling, which treats GPUs as a 
> countable resource. 
> However, GPU placement is also very important to deep learning jobs for better 
> efficiency.
>  For example, a 2-GPU job running on GPUs {0,1} can be faster than one running 
> on GPUs {0,7}, if GPUs 0 and 1 are under the same PCI-E switch while 0 and 7 are not.
>  We add support to Hadoop 2.7.2 to enable GPU locality scheduling, which 
> supports fine-grained GPU placement. 
> A 64-bit bitmap is added to the YARN Resource, which indicates both GPU usage 
> and locality information on a node (up to 64 GPUs per node): '1' means 
> available and '0' otherwise in the corresponding bit position.
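
To illustrate the bitmap idea from the description (this is not the attached patch; the 4-GPUs-per-switch grouping is an assumption chosen only to match the {0,1} vs {0,7} example above):

{code:java}
// Illustration of the bitmap encoding: bit i set to 1 means GPU i is available.
public class GpuBitmapSketch {
  static final int GPUS_PER_SWITCH = 4;   // assumed topology, for illustration only

  static int availableGpus(long bitmap) {
    return Long.bitCount(bitmap);
  }

  static boolean isAvailable(long bitmap, int gpu) {
    return (bitmap & (1L << gpu)) != 0;
  }

  static boolean sameSwitch(int gpuA, int gpuB) {
    return gpuA / GPUS_PER_SWITCH == gpuB / GPUS_PER_SWITCH;
  }

  public static void main(String[] args) {
    long bitmap = 0b1000_0011L;                    // GPUs 0, 1 and 7 are free
    System.out.println(availableGpus(bitmap));     // 3
    System.out.println(isAvailable(bitmap, 7));    // true
    System.out.println(sameSwitch(0, 1));          // true  -> preferred placement
    System.out.println(sameSwitch(0, 7));          // false -> slower pairing
  }
}
{code}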



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-7481) Gpu locality support for Better AI scheduling

2018-07-06 Thread Chen Qingcha (JIRA)


 [ 
https://issues.apache.org/jira/browse/YARN-7481?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chen Qingcha updated YARN-7481:
---
Attachment: hadoop_2.9.0.patch

> Gpu locality support for Better AI scheduling
> -
>
> Key: YARN-7481
> URL: https://issues.apache.org/jira/browse/YARN-7481
> Project: Hadoop YARN
>  Issue Type: New Feature
>  Components: api, RM, yarn
>Affects Versions: 2.7.2
>Reporter: Chen Qingcha
>Priority: Major
> Fix For: 2.7.2
>
> Attachments: GPU locality support for Job scheduling.pdf, 
> hadoop-2.7.2-gpu.patch, hadoop-2.7.2.gpu-port.patch, hadoop_2.9.0.patch
>
>   Original Estimate: 1,344h
>  Remaining Estimate: 1,344h
>
> We enhance Hadoop with GPU support for better AI job scheduling. 
> Currently, YARN-3926 also supports GPU scheduling, which treats GPUs as a 
> countable resource. 
> However, GPU placement is also very important to deep learning jobs for better 
> efficiency.
>  For example, a 2-GPU job running on GPUs {0,1} can be faster than one running 
> on GPUs {0,7}, if GPUs 0 and 1 are under the same PCI-E switch while 0 and 7 are not.
>  We add support to Hadoop 2.7.2 to enable GPU locality scheduling, which 
> supports fine-grained GPU placement. 
> A 64-bit bitmap is added to the YARN Resource, which indicates both GPU usage 
> and locality information on a node (up to 64 GPUs per node): '1' means 
> available and '0' otherwise in the corresponding bit position.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-8496) The capacity scheduler uses label to cause vcore to be incorrect

2018-07-06 Thread Zian Chen (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-8496?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16534607#comment-16534607
 ] 

Zian Chen commented on YARN-8496:
-

[~tangshangwen], thanks for raising this issue. This allocation is basically a 
multi-resource allocation that uses DRC (the DominantResourceCalculator) when 
calculating the amount of resources we allocate for each container. When we check 
whether the node has enough available resources for another allocation, we pick 
the dominant resource and do the calculation on that alone. This can leave the 
dominant resource with capacity still available while some other resource is 
already insufficient for the allocation. 

I think we need to fix this. 

[~sunilg], could you share your thoughts on this?
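
A simplified, self-contained illustration of that behaviour (not the CapacityScheduler code): checking only the dominant resource can admit a request even though vcores are exhausted, while a per-resource check rejects it.

{code:java}
// Simplified illustration only. Checking just the dominant resource can over-allocate
// the non-dominant one, which is how the cluster vcore accounting ends up looking wrong.
public class DominantCheckSketch {
  static boolean fitsDominantOnly(long reqMem, long reqVcores,
                                  long availMem, long availVcores,
                                  long totalMem, long totalVcores) {
    double memShare = (double) reqMem / totalMem;
    double cpuShare = (double) reqVcores / totalVcores;
    // Only the dominant (larger-share) resource is compared against availability.
    return memShare >= cpuShare ? reqMem <= availMem : reqVcores <= availVcores;
  }

  static boolean fitsAllResources(long reqMem, long reqVcores,
                                  long availMem, long availVcores) {
    return reqMem <= availMem && reqVcores <= availVcores;
  }

  public static void main(String[] args) {
    // Node: 64 GB / 8 vcores total, with 32 GB still free but 0 vcores left.
    // Request: 16 GB / 1 vcore -> memory is the dominant share (16/64 > 1/8).
    System.out.println(fitsDominantOnly(16, 1, 32, 0, 64, 8)); // true: over-allocates vcores
    System.out.println(fitsAllResources(16, 1, 32, 0));        // false: correctly rejected
  }
}
{code}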

 

 

> The capacity scheduler uses label to cause vcore to be incorrect
> 
>
> Key: YARN-8496
> URL: https://issues.apache.org/jira/browse/YARN-8496
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: capacity scheduler
>Affects Versions: 2.7.6
>Reporter: tangshangwen
>Assignee: tangshangwen
>Priority: Major
> Attachments: yarn-bug.png
>
>
>  In my cluster, I used label scheduling, and I found that it caused the vcore 
> of the cluster to be incorrect
>  
> capacity-scheduler.xml
>  
> {code:java}
> <configuration>
>   <property>
>     <name>yarn.scheduler.capacity.root.queues</name>
>     <value>support</value>
>   </property>
>   <property>
>     <name>yarn.scheduler.capacity.root.support.capacity</name>
>     <value>100</value>
>   </property>
>   <property>
>     <name>yarn.scheduler.capacity.root.support.accessible-node-labels</name>
>     <value>test1</value>
>   </property>
>   <property>
>     <name>yarn.scheduler.capacity.root.support.accessible-node-labels.test1.capacity</name>
>     <value>100</value>
>   </property>
>   <property>
>     <name>yarn.scheduler.capacity.root.accessible-node-labels.test1.capacity</name>
>     <value>100</value>
>   </property>
> </configuration>
> {code}
>  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-8492) ATSv2 HBase tests are failing with ClassNotFoundException

2018-07-06 Thread Takanobu Asanuma (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-8492?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16534497#comment-16534497
 ] 

Takanobu Asanuma commented on YARN-8492:


[~rohithsharma] Sorry for introducing the bug in YARN-8363, and thanks for 
analyzing and fixing it.

> ATSv2 HBase tests are failing with ClassNotFoundException
> -
>
> Key: YARN-8492
> URL: https://issues.apache.org/jira/browse/YARN-8492
> Project: Hadoop YARN
>  Issue Type: Bug
>Affects Versions: 3.2.0
>Reporter: Rohith Sharma K S
>Assignee: Rohith Sharma K S
>Priority: Major
> Attachments: YARN-8492.01.patch, YARN-8492.02.patch
>
>
> It is seen in a recent QA report that the ATSv2 HBase tests are failing with a 
> ClassNotFoundException.
> This looks to be a regression from a hadoop-common patch or some other patch. We 
> need to figure out which JIRA broke this and fix the test failures.
>  hadoop.yarn.server.timelineservice.storage.flow.TestHBaseStorageFlowRun
>       
> hadoop.yarn.server.timelineservice.storage.TestHBaseTimelineStorageSchema
>       
> hadoop.yarn.server.timelineservice.storage.TestHBaseTimelineStorageEntities
>       hadoop.yarn.server.timelineservice.storage.TestHBaseTimelineStorageApps
>       hadoop.yarn.server.timelineservice.storage.TestTimelineReaderHBaseDown
>       
> hadoop.yarn.server.timelineservice.storage.flow.TestHBaseStorageFlowRunCompaction
>       
> hadoop.yarn.server.timelineservice.storage.TestHBaseTimelineStorageDomain
>       
> hadoop.yarn.server.timelineservice.reader.TestTimelineReaderWebServicesHBaseStorage
>       
> hadoop.yarn.server.timelineservice.storage.flow.TestHBaseStorageFlowActivity
>  
> {noformat}
> [ERROR] 
> org.apache.hadoop.yarn.server.timelineservice.storage.TestHBaseTimelineStorageApps
>   Time elapsed: 0.102 s  <<< ERROR!
> java.lang.NoClassDefFoundError: 
> org/apache/hadoop/crypto/key/KeyProviderTokenIssuer
>   at java.lang.ClassLoader.defineClass1(Native Method)
>   at java.lang.ClassLoader.defineClass(ClassLoader.java:763)
>   at 
> java.security.SecureClassLoader.defineClass(SecureClassLoader.java:142)
>   at java.net.URLClassLoader.defineClass(URLClassLoader.java:467)
>   at java.net.URLClassLoader.access$100(URLClassLoader.java:73)
>   at java.net.URLClassLoader$1.run(URLClassLoader.java:368)
>   at java.net.URLClassLoader$1.run(URLClassLoader.java:362)
>   at java.security.AccessController.doPrivileged(Native Method)
>   at java.net.URLClassLoader.findClass(URLClassLoader.java:361)
>   at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
>   at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:349)
>   at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
>   at 
> org.apache.hadoop.yarn.server.timelineservice.storage.TestHBaseTimelineStorageApps.setupBeforeClass(TestHBaseTimelineStorageApps.java:97)
> Caused by: java.lang.ClassNotFoundException: 
> org.apache.hadoop.crypto.key.KeyProviderTokenIssuer
>   at java.net.URLClassLoader.findClass(URLClassLoader.java:381)
>   at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
>   at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:349)
>   at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
> {noformat}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org