[jira] [Created] (YARN-3758) The minimum memory setting (yarn.scheduler.minimum-allocation-mb) is not working in container
skrho created YARN-3758:
----------------------------
Summary: The minimum memory setting (yarn.scheduler.minimum-allocation-mb) is not working in container
Key: YARN-3758
URL: https://issues.apache.org/jira/browse/YARN-3758
Project: Hadoop YARN
Issue Type: Bug
Components: resourcemanager
Affects Versions: 2.4.0
Reporter: skrho

Hello there~~

I have two clusters. The first cluster has 5 nodes, one default application queue, and 8 GB of physical memory per node. The second cluster has 10 nodes, 2 application queues, and 230 GB of physical memory per node.

Whenever a MapReduce job runs, I want the ResourceManager to give each container a minimum of 256 MB of memory, so I changed the following configuration in yarn-site.xml:

yarn.scheduler.minimum-allocation-mb : 256
mapreduce.map.java.opts : -Xms256m
mapreduce.reduce.java.opts : -Xms256m
mapreduce.map.memory.mb : 256
mapreduce.reduce.memory.mb : 256

On the first cluster, whenever a MapReduce job is running, I can see 256 MB of used memory in the web console ( http://installedIP:8088/cluster/nodes ). But on the second cluster, whenever a MapReduce job is running, I can see 1024 MB of used memory in the web console ( http://installedIP:8088/cluster/nodes ).

I know the default memory value is 1024 MB, so if the memory settings are not changed, the default value is used. I have been testing for two weeks, but I don't know why the minimum memory setting is not working on the second cluster. Why does this difference happen? Is my configuration wrong, or is there a bug?

Thank you for reading~~

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
[jira] [Commented] (YARN-3749) We should make a copy of configuration when init MiniYARNCluster with multiple RMs
[ https://issues.apache.org/jira/browse/YARN-3749?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14568667#comment-14568667 ] Hadoop QA commented on YARN-3749: - \\ \\ | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | pre-patch | 18m 29s | Pre-patch trunk compilation is healthy. | | {color:green}+1{color} | @author | 0m 0s | The patch does not contain any @author tags. | | {color:green}+1{color} | tests included | 0m 0s | The patch appears to include 7 new or modified test files. | | {color:green}+1{color} | javac | 7m 36s | There were no new javac warning messages. | | {color:green}+1{color} | javadoc | 9m 33s | There were no new javadoc warning messages. | | {color:green}+1{color} | release audit | 0m 22s | The applied patch does not increase the total number of release audit warnings. | | {color:red}-1{color} | checkstyle | 1m 49s | The applied patch generated 1 new checkstyle issues (total was 212, now 213). | | {color:green}+1{color} | whitespace | 0m 1s | The patch has no lines that end in whitespace. | | {color:green}+1{color} | install | 1m 32s | mvn install still works. | | {color:green}+1{color} | eclipse:eclipse | 0m 33s | The patch built with eclipse:eclipse. | | {color:green}+1{color} | findbugs | 4m 34s | The patch does not introduce any new Findbugs (version 3.0.0) warnings. | | {color:green}+1{color} | yarn tests | 0m 25s | Tests passed in hadoop-yarn-api. | | {color:green}+1{color} | yarn tests | 7m 1s | Tests passed in hadoop-yarn-client. | | {color:red}-1{color} | yarn tests | 60m 25s | Tests failed in hadoop-yarn-server-resourcemanager. | | {color:green}+1{color} | yarn tests | 1m 52s | Tests passed in hadoop-yarn-server-tests. | | | | 115m 5s | | \\ \\ || Reason || Tests || | Failed unit tests | hadoop.yarn.server.resourcemanager.TestRMEmbeddedElector | | | hadoop.yarn.server.resourcemanager.scheduler.fair.TestFairScheduler | | Timed out tests | org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.TestNodeLabelContainerAllocation | \\ \\ || Subsystem || Report/Notes || | Patch URL | http://issues.apache.org/jira/secure/attachment/12736732/YARN-3749.6.patch | | Optional Tests | javadoc javac unit findbugs checkstyle | | git revision | trunk / 990078b | | checkstyle | https://builds.apache.org/job/PreCommit-YARN-Build/8158/artifact/patchprocess/diffcheckstylehadoop-yarn-api.txt | | hadoop-yarn-api test log | https://builds.apache.org/job/PreCommit-YARN-Build/8158/artifact/patchprocess/testrun_hadoop-yarn-api.txt | | hadoop-yarn-client test log | https://builds.apache.org/job/PreCommit-YARN-Build/8158/artifact/patchprocess/testrun_hadoop-yarn-client.txt | | hadoop-yarn-server-resourcemanager test log | https://builds.apache.org/job/PreCommit-YARN-Build/8158/artifact/patchprocess/testrun_hadoop-yarn-server-resourcemanager.txt | | hadoop-yarn-server-tests test log | https://builds.apache.org/job/PreCommit-YARN-Build/8158/artifact/patchprocess/testrun_hadoop-yarn-server-tests.txt | | Test Results | https://builds.apache.org/job/PreCommit-YARN-Build/8158/testReport/ | | Java | 1.7.0_55 | | uname | Linux asf902.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux | | Console output | https://builds.apache.org/job/PreCommit-YARN-Build/8158/console | This message was automatically generated. 
We should make a copy of configuration when init MiniYARNCluster with multiple RMs
-----------------------------------------------------------------------------------
Key: YARN-3749
URL: https://issues.apache.org/jira/browse/YARN-3749
Project: Hadoop YARN
Issue Type: Bug
Reporter: Chun Chen
Assignee: Chun Chen
Attachments: YARN-3749.2.patch, YARN-3749.3.patch, YARN-3749.4.patch, YARN-3749.5.patch, YARN-3749.6.patch, YARN-3749.7.patch, YARN-3749.patch

While writing a test case for YARN-2674, I found the DS client trying to connect to both rm1 and rm2 with the same address 0.0.0.0:18032 during RM failover, even though I had initially set yarn.resourcemanager.address.rm1=0.0.0.0:18032 and yarn.resourcemanager.address.rm2=0.0.0.0:28032. After digging, I found that ClientRMService is where the value of yarn.resourcemanager.address.rm2 gets changed to 0.0.0.0:18032. See the following code in ClientRMService:
{code}
clientBindAddress = conf.updateConnectAddr(YarnConfiguration.RM_BIND_HOST,
    YarnConfiguration.RM_ADDRESS,
    YarnConfiguration.DEFAULT_RM_ADDRESS,
    server.getListenerAddress());
{code}
Since we use the same instance of configuration in rm1 and rm2 and init both RM before
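The issue title points at the fix direction: give each RM in MiniYARNCluster its own copy of the configuration, so one RM's call to updateConnectAddr cannot overwrite the rm2 address seen by the other. A minimal sketch of that idea (variable names are assumed; this is not the attached patch):
{code}
// Sketch: copy the shared conf before handing it to each RM, so in-place updates
// such as updateConnectAddr() stay local to that RM instance.
for (int i = 0; i < numResourceManagers; i++) {
  Configuration rmConf = new YarnConfiguration(conf);   // a copy, not a shared reference
  resourceManagers[i].init(rmConf);
}
{code}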
[jira] [Updated] (YARN-3758) The minimum memory setting (yarn.scheduler.minimum-allocation-mb) is not working in container
[ https://issues.apache.org/jira/browse/YARN-3758?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

skrho updated YARN-3758:
------------------------
Description:
Hello there~~

I have two clusters. The first cluster has 5 nodes, one default application queue, the Capacity scheduler, and 8 GB of physical memory per node. The second cluster has 10 nodes, 2 application queues, the fair-scheduler, and 230 GB of physical memory per node.

Whenever a MapReduce job runs, I want the ResourceManager to give each container a minimum of 256 MB of memory, so I changed the following configuration in yarn-site.xml and mapred-site.xml:

yarn.scheduler.minimum-allocation-mb : 256
mapreduce.map.java.opts : -Xms256m
mapreduce.reduce.java.opts : -Xms256m
mapreduce.map.memory.mb : 256
mapreduce.reduce.memory.mb : 256

On the first cluster, whenever a MapReduce job is running, I can see 256 MB of used memory in the web console ( http://installedIP:8088/cluster/nodes ). But on the second cluster, whenever a MapReduce job is running, I can see 1024 MB of used memory in the web console ( http://installedIP:8088/cluster/nodes ).

I know the default memory value is 1024 MB, so if the memory settings are not changed, the default value is used. I have been testing for two weeks, but I don't know why the minimum memory setting is not working on the second cluster. Why does this difference happen? Is my configuration wrong, or is there a bug?

Thank you for reading~~

was:
Hello there~~

I have two clusters. The first cluster has 5 nodes, one default application queue, and 8 GB of physical memory per node. The second cluster has 10 nodes, 2 application queues, and 230 GB of physical memory per node.

Whenever a MapReduce job runs, I want the ResourceManager to give each container a minimum of 256 MB of memory, so I changed the following configuration in yarn-site.xml and mapred-site.xml:

yarn.scheduler.minimum-allocation-mb : 256
mapreduce.map.java.opts : -Xms256m
mapreduce.reduce.java.opts : -Xms256m
mapreduce.map.memory.mb : 256
mapreduce.reduce.memory.mb : 256

On the first cluster, whenever a MapReduce job is running, I can see 256 MB of used memory in the web console ( http://installedIP:8088/cluster/nodes ). But on the second cluster, whenever a MapReduce job is running, I can see 1024 MB of used memory in the web console ( http://installedIP:8088/cluster/nodes ).

I know the default memory value is 1024 MB, so if the memory settings are not changed, the default value is used. I have been testing for two weeks, but I don't know why the minimum memory setting is not working on the second cluster. Why does this difference happen? Is my configuration wrong, or is there a bug?

Thank you for reading~~

The minimum memory setting (yarn.scheduler.minimum-allocation-mb) is not working in container
----------------------------------------------------------------------------------------------
Key: YARN-3758
URL: https://issues.apache.org/jira/browse/YARN-3758
Project: Hadoop YARN
Issue Type: Bug
Components: resourcemanager
Affects Versions: 2.4.0
Reporter: skrho

Hello there~~

I have two clusters. The first cluster has 5 nodes, one default application queue, the Capacity scheduler, and 8 GB of physical memory per node. The second cluster has 10 nodes, 2 application queues, the fair-scheduler, and 230 GB of physical memory per node.

Whenever a MapReduce job runs, I want the ResourceManager to give each container a minimum of 256 MB of memory, so I changed the following configuration in yarn-site.xml and mapred-site.xml:

yarn.scheduler.minimum-allocation-mb : 256
mapreduce.map.java.opts : -Xms256m
mapreduce.reduce.java.opts : -Xms256m
mapreduce.map.memory.mb : 256
mapreduce.reduce.memory.mb : 256

On the first cluster, whenever a MapReduce job is running, I can see 256 MB of used memory in the web console ( http://installedIP:8088/cluster/nodes ). But on the second cluster, whenever a MapReduce job is running, I can see 1024 MB of used memory in the web console ( http://installedIP:8088/cluster/nodes ).

I know the default memory value is 1024 MB, so if the memory settings are not changed, the default value is used. I have been testing for two weeks, but I don't know why the minimum memory setting is not working on the second cluster. Why does this difference happen? Is my configuration wrong, or is there a bug?

Thank you for reading~~

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
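One detail worth noting in the updated description above: the first cluster runs the Capacity scheduler and the second runs the fair-scheduler, and the two schedulers round container requests up using different step sizes. A rough illustration of that normalization (simplified, not the actual scheduler code; the FairScheduler step comes from yarn.scheduler.increment-allocation-mb, whose default is 1024):
{code}
// Simplified view of how YARN normalizes a requested container size:
//   normalized = min(maximum, roundUp(max(requested, minimum), increment))
static int normalizeMemory(int requestedMb, int minimumMb, int incrementMb, int maximumMb) {
  int base = Math.max(requestedMb, minimumMb);
  int rounded = ((base + incrementMb - 1) / incrementMb) * incrementMb;  // round up to a multiple of the increment
  return Math.min(rounded, maximumMb);
}

// CapacityScheduler effectively steps by yarn.scheduler.minimum-allocation-mb:
//   normalizeMemory(256, 256, 256, 8192)  -> 256   (what the first cluster shows)
// FairScheduler steps by yarn.scheduler.increment-allocation-mb (default 1024):
//   normalizeMemory(256, 256, 1024, 8192) -> 1024  (matches what the second cluster shows)
{code}
So on a fair-scheduler cluster, yarn.scheduler.increment-allocation-mb is one setting worth checking alongside the minimum.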
[jira] [Updated] (YARN-3706) Generalize native HBase writer for additional tables
[ https://issues.apache.org/jira/browse/YARN-3706?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Joep Rottinghuis updated YARN-3706:
-----------------------------------
Attachment: YARN-3726-YARN-2928.004.patch

YARN-3726-YARN-2928.004.patch:
- fixed bug in cleanse (found thanks to unit test)
- fixed value separator (was ! instead of ?).
- Added readResult and readResults to EntityColumnPrefix (still need to add signature in interface).
- Added initial unit test for TimeLineWriterUtils
- Added relationship checking to TestTimelineWriterImpl

Generalize native HBase writer for additional tables
-----------------------------------------------------
Key: YARN-3706
URL: https://issues.apache.org/jira/browse/YARN-3706
Project: Hadoop YARN
Issue Type: Sub-task
Components: timelineserver
Reporter: Joep Rottinghuis
Assignee: Joep Rottinghuis
Priority: Minor
Attachments: YARN-3706-YARN-2928.001.patch, YARN-3726-YARN-2928.002.patch, YARN-3726-YARN-2928.003.patch, YARN-3726-YARN-2928.004.patch

When reviewing YARN-3411 we noticed that we could change the class hierarchy a little in order to accommodate additional tables easily. In order to get ready for benchmark testing we left the original layout in place, as performance would not be impacted by the code hierarchy. Here is a separate jira to address the hierarchy.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
[jira] [Commented] (YARN-2962) ZKRMStateStore: Limit the number of znodes under a znode
[ https://issues.apache.org/jira/browse/YARN-2962?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14568789#comment-14568789 ]

Varun Saxena commented on YARN-2962:
------------------------------------
I was waiting for input from [~vinodkv] and [~asuresh] so that we reach a common understanding on what we will do about the backward-compatibility part. Anyway, in the coming week I plan to upload a patch implementing one of the approaches discussed.

ZKRMStateStore: Limit the number of znodes under a znode
---------------------------------------------------------
Key: YARN-2962
URL: https://issues.apache.org/jira/browse/YARN-2962
Project: Hadoop YARN
Issue Type: Improvement
Components: resourcemanager
Affects Versions: 2.6.0
Reporter: Karthik Kambatla
Assignee: Varun Saxena
Priority: Critical
Attachments: YARN-2962.01.patch, YARN-2962.2.patch, YARN-2962.3.patch

We ran into this issue when we hit the default ZK server message size limits, primarily because the message had too many znodes even though individually they were all small.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
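For context on the size limit being described, a back-of-the-envelope illustration (assuming the limit in question is ZooKeeper's jute.maxbuffer, roughly 1 MB by default): a single getChildren() response has to carry every child znode name, so a parent with many small children can still exceed the buffer size.
{code}
// Rough arithmetic only; the numbers are assumptions, not measurements.
int apps = 20000;                       // znodes under one parent (e.g. stored applications)
int bytesPerChildName = 60;             // approximate length of one application znode name
int approxResponseBytes = apps * bytesPerChildName;   // ~1.2 MB, above a ~1 MB jute.maxbuffer
{code}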
[jira] [Commented] (YARN-3753) RM failed to come up with java.io.IOException: Wait for ZKClient creation timed out
[ https://issues.apache.org/jira/browse/YARN-3753?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14568670#comment-14568670 ] Hadoop QA commented on YARN-3753: - \\ \\ | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:red}-1{color} | pre-patch | 14m 53s | Findbugs (version ) appears to be broken on trunk. | | {color:green}+1{color} | @author | 0m 0s | The patch does not contain any @author tags. | | {color:green}+1{color} | tests included | 0m 0s | The patch appears to include 1 new or modified test files. | | {color:green}+1{color} | javac | 7m 31s | There were no new javac warning messages. | | {color:green}+1{color} | javadoc | 9m 29s | There were no new javadoc warning messages. | | {color:green}+1{color} | release audit | 0m 22s | The applied patch does not increase the total number of release audit warnings. | | {color:green}+1{color} | checkstyle | 0m 25s | There were no new checkstyle issues. | | {color:green}+1{color} | whitespace | 0m 0s | The patch has no lines that end in whitespace. | | {color:green}+1{color} | install | 1m 33s | mvn install still works. | | {color:green}+1{color} | eclipse:eclipse | 0m 33s | The patch built with eclipse:eclipse. | | {color:green}+1{color} | findbugs | 1m 27s | The patch does not introduce any new Findbugs (version 3.0.0) warnings. | | {color:green}+1{color} | yarn tests | 50m 16s | Tests passed in hadoop-yarn-server-resourcemanager. | | | | 86m 33s | | \\ \\ || Subsystem || Report/Notes || | Patch URL | http://issues.apache.org/jira/secure/attachment/12736741/YARN-3753.1.patch | | Optional Tests | javadoc javac unit findbugs checkstyle | | git revision | trunk / 990078b | | hadoop-yarn-server-resourcemanager test log | https://builds.apache.org/job/PreCommit-YARN-Build/8161/artifact/patchprocess/testrun_hadoop-yarn-server-resourcemanager.txt | | Test Results | https://builds.apache.org/job/PreCommit-YARN-Build/8161/testReport/ | | Java | 1.7.0_55 | | uname | Linux asf906.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux | | Console output | https://builds.apache.org/job/PreCommit-YARN-Build/8161/console | This message was automatically generated. RM failed to come up with java.io.IOException: Wait for ZKClient creation timed out - Key: YARN-3753 URL: https://issues.apache.org/jira/browse/YARN-3753 Project: Hadoop YARN Issue Type: Bug Components: yarn Reporter: Sumana Sathish Assignee: Jian He Priority: Critical Attachments: YARN-3753.1.patch, YARN-3753.patch RM failed to come up with the following error while submitting an mapreduce job. 
{code:title=RM log} 015-05-30 03:40:12,190 ERROR recovery.RMStateStore (RMStateStore.java:transition(179)) - Error storing app: application_1432956515242_0006 java.io.IOException: Wait for ZKClient creation timed out at org.apache.hadoop.yarn.server.resourcemanager.recovery.ZKRMStateStore$ZKAction.runWithCheck(ZKRMStateStore.java:1098) at org.apache.hadoop.yarn.server.resourcemanager.recovery.ZKRMStateStore$ZKAction.runWithRetries(ZKRMStateStore.java:1122) at org.apache.hadoop.yarn.server.resourcemanager.recovery.ZKRMStateStore.doStoreMultiWithRetries(ZKRMStateStore.java:923) at org.apache.hadoop.yarn.server.resourcemanager.recovery.ZKRMStateStore.doStoreMultiWithRetries(ZKRMStateStore.java:937) at org.apache.hadoop.yarn.server.resourcemanager.recovery.ZKRMStateStore.createWithRetries(ZKRMStateStore.java:970) at org.apache.hadoop.yarn.server.resourcemanager.recovery.ZKRMStateStore.storeApplicationStateInternal(ZKRMStateStore.java:609) at org.apache.hadoop.yarn.server.resourcemanager.recovery.RMStateStore$StoreAppTransition.transition(RMStateStore.java:175) at org.apache.hadoop.yarn.server.resourcemanager.recovery.RMStateStore$StoreAppTransition.transition(RMStateStore.java:160) at org.apache.hadoop.yarn.state.StateMachineFactory$SingleInternalArc.doTransition(StateMachineFactory.java:362) at org.apache.hadoop.yarn.state.StateMachineFactory.doTransition(StateMachineFactory.java:302) at org.apache.hadoop.yarn.state.StateMachineFactory.access$300(StateMachineFactory.java:46) at org.apache.hadoop.yarn.state.StateMachineFactory$InternalStateMachine.doTransition(StateMachineFactory.java:448) at org.apache.hadoop.yarn.server.resourcemanager.recovery.RMStateStore.handleStoreEvent(RMStateStore.java:837) at org.apache.hadoop.yarn.server.resourcemanager.recovery.RMStateStore$ForwardingEventHandler.handle(RMStateStore.java:900) at
[jira] [Created] (YARN-3756) The minimum memory setting (yarn.scheduler.minimum-allocation-mb) is not working in container
skrho created YARN-3756:
----------------------------
Summary: The minimum memory setting (yarn.scheduler.minimum-allocation-mb) is not working in container
Key: YARN-3756
URL: https://issues.apache.org/jira/browse/YARN-3756
Project: Hadoop YARN
Issue Type: Bug
Components: resourcemanager
Affects Versions: 2.4.0
Environment: hadoop 2.4.0
Reporter: skrho

Hello there~~

I have two clusters. The first cluster has 5 nodes, one default application queue, and 8 GB of physical memory per node. The second cluster has 10 nodes, 2 application queues, and 230 GB of physical memory per node.

Whenever a MapReduce job runs, I want the ResourceManager to give each container a minimum of 256 MB of memory, so I changed the following configuration in yarn-site.xml and mapred-site.xml:

yarn.scheduler.minimum-allocation-mb : 256
mapreduce.map.java.opts : -Xms256m
mapreduce.reduce.java.opts : -Xms256m
mapreduce.map.memory.mb : 256
mapreduce.reduce.memory.mb : 256

On the first cluster, whenever a MapReduce job is running, I can see 256 MB of used memory in the web console ( http://installedIP:8088/cluster/nodes ). But on the second cluster, whenever a MapReduce job is running, I can see 1024 MB of used memory in the web console ( http://installedIP:8088/cluster/nodes ).

I know the default memory value is 1024 MB, so if the memory settings are not changed, the default value is used. I have been testing for two weeks, but I don't know why the minimum memory setting is not working on the second cluster. Why does this difference happen? Is my configuration wrong, or is there a bug?

Thank you for reading~~

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
[jira] [Updated] (YARN-3753) RM failed to come up with java.io.IOException: Wait for ZKClient creation timed out
[ https://issues.apache.org/jira/browse/YARN-3753?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jian He updated YARN-3753: -- Attachment: YARN-3753.2.patch RM failed to come up with java.io.IOException: Wait for ZKClient creation timed out - Key: YARN-3753 URL: https://issues.apache.org/jira/browse/YARN-3753 Project: Hadoop YARN Issue Type: Bug Components: yarn Reporter: Sumana Sathish Assignee: Jian He Priority: Critical Attachments: YARN-3753.1.patch, YARN-3753.2.patch, YARN-3753.patch RM failed to come up with the following error while submitting an mapreduce job. {code:title=RM log} 015-05-30 03:40:12,190 ERROR recovery.RMStateStore (RMStateStore.java:transition(179)) - Error storing app: application_1432956515242_0006 java.io.IOException: Wait for ZKClient creation timed out at org.apache.hadoop.yarn.server.resourcemanager.recovery.ZKRMStateStore$ZKAction.runWithCheck(ZKRMStateStore.java:1098) at org.apache.hadoop.yarn.server.resourcemanager.recovery.ZKRMStateStore$ZKAction.runWithRetries(ZKRMStateStore.java:1122) at org.apache.hadoop.yarn.server.resourcemanager.recovery.ZKRMStateStore.doStoreMultiWithRetries(ZKRMStateStore.java:923) at org.apache.hadoop.yarn.server.resourcemanager.recovery.ZKRMStateStore.doStoreMultiWithRetries(ZKRMStateStore.java:937) at org.apache.hadoop.yarn.server.resourcemanager.recovery.ZKRMStateStore.createWithRetries(ZKRMStateStore.java:970) at org.apache.hadoop.yarn.server.resourcemanager.recovery.ZKRMStateStore.storeApplicationStateInternal(ZKRMStateStore.java:609) at org.apache.hadoop.yarn.server.resourcemanager.recovery.RMStateStore$StoreAppTransition.transition(RMStateStore.java:175) at org.apache.hadoop.yarn.server.resourcemanager.recovery.RMStateStore$StoreAppTransition.transition(RMStateStore.java:160) at org.apache.hadoop.yarn.state.StateMachineFactory$SingleInternalArc.doTransition(StateMachineFactory.java:362) at org.apache.hadoop.yarn.state.StateMachineFactory.doTransition(StateMachineFactory.java:302) at org.apache.hadoop.yarn.state.StateMachineFactory.access$300(StateMachineFactory.java:46) at org.apache.hadoop.yarn.state.StateMachineFactory$InternalStateMachine.doTransition(StateMachineFactory.java:448) at org.apache.hadoop.yarn.server.resourcemanager.recovery.RMStateStore.handleStoreEvent(RMStateStore.java:837) at org.apache.hadoop.yarn.server.resourcemanager.recovery.RMStateStore$ForwardingEventHandler.handle(RMStateStore.java:900) at org.apache.hadoop.yarn.server.resourcemanager.recovery.RMStateStore$ForwardingEventHandler.handle(RMStateStore.java:895) at org.apache.hadoop.yarn.event.AsyncDispatcher.dispatch(AsyncDispatcher.java:175) at org.apache.hadoop.yarn.event.AsyncDispatcher$1.run(AsyncDispatcher.java:108) at java.lang.Thread.run(Thread.java:745) 2015-05-30 03:40:12,194 FATAL resourcemanager.ResourceManager (ResourceManager.java:handle(750)) - Received a org.apache.hadoop.yarn.server.resourcemanager.RMFatalEvent of type STATE_STORE_OP_FAILED. 
Cause: java.io.IOException: Wait for ZKClient creation timed out at org.apache.hadoop.yarn.server.resourcemanager.recovery.ZKRMStateStore$ZKAction.runWithCheck(ZKRMStateStore.java:1098) at org.apache.hadoop.yarn.server.resourcemanager.recovery.ZKRMStateStore$ZKAction.runWithRetries(ZKRMStateStore.java:1122) at org.apache.hadoop.yarn.server.resourcemanager.recovery.ZKRMStateStore.doStoreMultiWithRetries(ZKRMStateStore.java:923) at org.apache.hadoop.yarn.server.resourcemanager.recovery.ZKRMStateStore.doStoreMultiWithRetries(ZKRMStateStore.java:937) at org.apache.hadoop.yarn.server.resourcemanager.recovery.ZKRMStateStore.createWithRetries(ZKRMStateStore.java:970) at org.apache.hadoop.yarn.server.resourcemanager.recovery.ZKRMStateStore.storeApplicationStateInternal(ZKRMStateStore.java:609) at org.apache.hadoop.yarn.server.resourcemanager.recovery.RMStateStore$StoreAppTransition.transition(RMStateStore.java:175) at org.apache.hadoop.yarn.server.resourcemanager.recovery.RMStateStore$StoreAppTransition.transition(RMStateStore.java:160) at org.apache.hadoop.yarn.state.StateMachineFactory$SingleInternalArc.doTransition(StateMachineFactory.java:362) at org.apache.hadoop.yarn.state.StateMachineFactory.doTransition(StateMachineFactory.java:302) at org.apache.hadoop.yarn.state.StateMachineFactory.access$300(StateMachineFactory.java:46) at org.apache.hadoop.yarn.state.StateMachineFactory$InternalStateMachine.doTransition(StateMachineFactory.java:448) at
[jira] [Commented] (YARN-3753) RM failed to come up with java.io.IOException: Wait for ZKClient creation timed out
[ https://issues.apache.org/jira/browse/YARN-3753?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14568797#comment-14568797 ] Hadoop QA commented on YARN-3753: - \\ \\ | (/) *{color:green}+1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | pre-patch | 15m 53s | Pre-patch trunk compilation is healthy. | | {color:green}+1{color} | @author | 0m 0s | The patch does not contain any @author tags. | | {color:green}+1{color} | tests included | 0m 0s | The patch appears to include 1 new or modified test files. | | {color:green}+1{color} | javac | 7m 33s | There were no new javac warning messages. | | {color:green}+1{color} | javadoc | 9m 36s | There were no new javadoc warning messages. | | {color:green}+1{color} | release audit | 0m 23s | The applied patch does not increase the total number of release audit warnings. | | {color:green}+1{color} | checkstyle | 0m 48s | There were no new checkstyle issues. | | {color:green}+1{color} | whitespace | 0m 0s | The patch has no lines that end in whitespace. | | {color:green}+1{color} | install | 1m 34s | mvn install still works. | | {color:green}+1{color} | eclipse:eclipse | 0m 33s | The patch built with eclipse:eclipse. | | {color:green}+1{color} | findbugs | 1m 27s | The patch does not introduce any new Findbugs (version 3.0.0) warnings. | | {color:green}+1{color} | yarn tests | 50m 6s | Tests passed in hadoop-yarn-server-resourcemanager. | | | | 88m 7s | | \\ \\ || Subsystem || Report/Notes || | Patch URL | http://issues.apache.org/jira/secure/attachment/12736776/YARN-3753.2.patch | | Optional Tests | javadoc javac unit findbugs checkstyle | | git revision | trunk / 990078b | | hadoop-yarn-server-resourcemanager test log | https://builds.apache.org/job/PreCommit-YARN-Build/8164/artifact/patchprocess/testrun_hadoop-yarn-server-resourcemanager.txt | | Test Results | https://builds.apache.org/job/PreCommit-YARN-Build/8164/testReport/ | | Java | 1.7.0_55 | | uname | Linux asf902.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux | | Console output | https://builds.apache.org/job/PreCommit-YARN-Build/8164/console | This message was automatically generated. RM failed to come up with java.io.IOException: Wait for ZKClient creation timed out - Key: YARN-3753 URL: https://issues.apache.org/jira/browse/YARN-3753 Project: Hadoop YARN Issue Type: Bug Components: yarn Reporter: Sumana Sathish Assignee: Jian He Priority: Critical Attachments: YARN-3753.1.patch, YARN-3753.2.patch, YARN-3753.patch RM failed to come up with the following error while submitting an mapreduce job. 
{code:title=RM log} 015-05-30 03:40:12,190 ERROR recovery.RMStateStore (RMStateStore.java:transition(179)) - Error storing app: application_1432956515242_0006 java.io.IOException: Wait for ZKClient creation timed out at org.apache.hadoop.yarn.server.resourcemanager.recovery.ZKRMStateStore$ZKAction.runWithCheck(ZKRMStateStore.java:1098) at org.apache.hadoop.yarn.server.resourcemanager.recovery.ZKRMStateStore$ZKAction.runWithRetries(ZKRMStateStore.java:1122) at org.apache.hadoop.yarn.server.resourcemanager.recovery.ZKRMStateStore.doStoreMultiWithRetries(ZKRMStateStore.java:923) at org.apache.hadoop.yarn.server.resourcemanager.recovery.ZKRMStateStore.doStoreMultiWithRetries(ZKRMStateStore.java:937) at org.apache.hadoop.yarn.server.resourcemanager.recovery.ZKRMStateStore.createWithRetries(ZKRMStateStore.java:970) at org.apache.hadoop.yarn.server.resourcemanager.recovery.ZKRMStateStore.storeApplicationStateInternal(ZKRMStateStore.java:609) at org.apache.hadoop.yarn.server.resourcemanager.recovery.RMStateStore$StoreAppTransition.transition(RMStateStore.java:175) at org.apache.hadoop.yarn.server.resourcemanager.recovery.RMStateStore$StoreAppTransition.transition(RMStateStore.java:160) at org.apache.hadoop.yarn.state.StateMachineFactory$SingleInternalArc.doTransition(StateMachineFactory.java:362) at org.apache.hadoop.yarn.state.StateMachineFactory.doTransition(StateMachineFactory.java:302) at org.apache.hadoop.yarn.state.StateMachineFactory.access$300(StateMachineFactory.java:46) at org.apache.hadoop.yarn.state.StateMachineFactory$InternalStateMachine.doTransition(StateMachineFactory.java:448) at org.apache.hadoop.yarn.server.resourcemanager.recovery.RMStateStore.handleStoreEvent(RMStateStore.java:837) at org.apache.hadoop.yarn.server.resourcemanager.recovery.RMStateStore$ForwardingEventHandler.handle(RMStateStore.java:900) at
[jira] [Commented] (YARN-3591) Resource Localisation on a bad disk causes subsequent containers failure
[ https://issues.apache.org/jira/browse/YARN-3591?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14568834#comment-14568834 ]

Lavkesh Lahngir commented on YARN-3591:
---------------------------------------
[~zxu]: Can we get away without storing this in the NMStateStore? The other changes seem to be okay. It's not a big change in terms of code, but adding it to the NM state could be debatable. [~vvasudev]: Thoughts?

Resource Localisation on a bad disk causes subsequent containers failure
-------------------------------------------------------------------------
Key: YARN-3591
URL: https://issues.apache.org/jira/browse/YARN-3591
Project: Hadoop YARN
Issue Type: Bug
Affects Versions: 2.7.0
Reporter: Lavkesh Lahngir
Assignee: Lavkesh Lahngir
Attachments: 0001-YARN-3591.1.patch, 0001-YARN-3591.patch, YARN-3591.2.patch, YARN-3591.3.patch, YARN-3591.4.patch

It happens when a resource has been localised on a disk and, after localisation, that disk goes bad. The NM keeps the paths of localised resources in memory. At resource-request time, isResourcePresent(rsrc) is called, which calls file.exists() on the localised path. In some cases when the disk has gone bad, inodes are still cached and file.exists() returns true, but at read time the file will not open. Note: file.exists() actually calls stat64 natively, which returns true because it was able to find the inode information from the OS. A proposal is to call file.list() on the parent path of the resource, which calls open() natively. If the disk is good it should return an array of paths with length at least 1.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
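A minimal sketch of the proposed check described above (an illustration of the idea only; the method shape and names are assumed, not taken from the attached patches):
{code}
// file.exists() can answer from cached inode data even when the disk has gone bad;
// java.io.File.list() opens the parent directory natively and returns null on failure.
static boolean isResourcePresent(java.io.File localizedPath) {
  String[] siblings = localizedPath.getParentFile().list();
  return siblings != null && siblings.length >= 1 && localizedPath.exists();
}
{code}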
[jira] [Created] (YARN-3757) The minimum memory setting (yarn.scheduler.minimum-allocation-mb) is not working in container
skrho created YARN-3757:
----------------------------
Summary: The minimum memory setting (yarn.scheduler.minimum-allocation-mb) is not working in container
Key: YARN-3757
URL: https://issues.apache.org/jira/browse/YARN-3757
Project: Hadoop YARN
Issue Type: Bug
Components: resourcemanager
Affects Versions: 2.4.0
Environment: Hadoop 2.4.0
Reporter: skrho

Hello there~~

I have two clusters. The first cluster has 5 nodes, one default application queue, and 8 GB of physical memory per node. The second cluster has 10 nodes, 2 application queues, and 230 GB of physical memory per node.

Whenever a MapReduce job runs, I want the ResourceManager to give each container a minimum of 256 MB of memory, so I changed the following configuration in yarn-site.xml:

yarn.scheduler.minimum-allocation-mb : 256
mapreduce.map.java.opts : -Xms256m
mapreduce.reduce.java.opts : -Xms256m
mapreduce.map.memory.mb : 256
mapreduce.reduce.memory.mb : 256

On the first cluster, whenever a MapReduce job is running, I can see 256 MB of used memory in the web console ( http://installedIP:8088/cluster/nodes ). But on the second cluster, whenever a MapReduce job is running, I can see 1024 MB of used memory in the web console ( http://installedIP:8088/cluster/nodes ).

I know the default memory value is 1024 MB, so if the memory settings are not changed, the default value is used. I have been testing for two weeks, but I don't know why the minimum memory setting is not working on the second cluster. Why does this difference happen? Is my configuration wrong, or is there a bug?

Thank you for reading~~

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
[jira] [Commented] (YARN-3733) DominantRC#compare() does not work as expected if cluster resource is empty
[ https://issues.apache.org/jira/browse/YARN-3733?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14568682#comment-14568682 ]

Rohith commented on YARN-3733:
------------------------------
Updated the summary as per the defect.

DominantRC#compare() does not work as expected if cluster resource is empty
-----------------------------------------------------------------------------
Key: YARN-3733
URL: https://issues.apache.org/jira/browse/YARN-3733
Project: Hadoop YARN
Issue Type: Bug
Components: resourcemanager
Affects Versions: 2.7.0
Environment: Suse 11 SP3, 2 NM, 2 RM; one NM has 3 GB and 6 vcores
Reporter: Bibin A Chundatt
Assignee: Rohith
Priority: Blocker
Attachments: YARN-3733.patch

Steps to reproduce
==================
1. Install HA with 2 RMs and 2 NMs (3072 MB * 2 total cluster memory).
2. Configure map and reduce size to 512 MB after changing the scheduler minimum size to 512 MB.
3. Configure the capacity scheduler with the AM limit set to .5 (DominantResourceCalculator is configured).
4. Submit 30 concurrent tasks.
5. Switch the RM.

Actual
======
AMs get allocated for 12 jobs and all 12 start running. No other YARN child is initiated; *all 12 jobs stay in the RUNNING state for ever*.

Expected
========
Only 6 should be running at a time, since the maximum AM allocation is .5 (3072 MB).

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
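For background on why an empty cluster resource can trip up the comparison named in the summary, a simplified illustration follows (this is not the DominantResourceCalculator code; it only shows the float behaviour that dominant-share comparisons rely on):
{code}
// Dominant-share style comparisons divide by the cluster resource; with an empty
// cluster resource those divisions produce Infinity/NaN and ordering breaks down.
float clusterMemory = 0f, clusterVcores = 0f;                 // empty cluster resource
float lhsShare = 1024f / clusterMemory;                       // Infinity
float rhsShare = 0f / clusterVcores;                          // NaN
boolean ordered = lhsShare > rhsShare || lhsShare < rhsShare; // false: NaN never orders against anything
{code}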
[jira] [Commented] (YARN-3170) YARN architecture document needs updating
[ https://issues.apache.org/jira/browse/YARN-3170?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14568853#comment-14568853 ] Brahma Reddy Battula commented on YARN-3170: Updated patch..Kindly review!! YARN architecture document needs updating - Key: YARN-3170 URL: https://issues.apache.org/jira/browse/YARN-3170 Project: Hadoop YARN Issue Type: Improvement Components: documentation Reporter: Allen Wittenauer Assignee: Brahma Reddy Battula Attachments: YARN-3170-002.patch, YARN-3170-003.patch, YARN-3170-004.patch, YARN-3170-005.patch, YARN-3170-006.patch, YARN-3170-007.patch, YARN-3170-008.patch, YARN-3170-009.patch, YARN-3170-010.patch, YARN-3170.patch The marketing paragraph at the top, NextGen MapReduce, etc are all marketing rather than actual descriptions. It also needs some general updates, esp given it reads as though 0.23 was just released yesterday. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-3749) We should make a copy of configuration when init MiniYARNCluster with multiple RMs
[ https://issues.apache.org/jira/browse/YARN-3749?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14568735#comment-14568735 ] Hadoop QA commented on YARN-3749: - \\ \\ | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | pre-patch | 20m 3s | Pre-patch trunk compilation is healthy. | | {color:green}+1{color} | @author | 0m 0s | The patch does not contain any @author tags. | | {color:green}+1{color} | tests included | 0m 0s | The patch appears to include 8 new or modified test files. | | {color:green}+1{color} | javac | 7m 34s | There were no new javac warning messages. | | {color:green}+1{color} | javadoc | 9m 42s | There were no new javadoc warning messages. | | {color:green}+1{color} | release audit | 0m 22s | The applied patch does not increase the total number of release audit warnings. | | {color:red}-1{color} | checkstyle | 2m 17s | The applied patch generated 1 new checkstyle issues (total was 212, now 213). | | {color:green}+1{color} | whitespace | 0m 1s | The patch has no lines that end in whitespace. | | {color:green}+1{color} | install | 1m 32s | mvn install still works. | | {color:green}+1{color} | eclipse:eclipse | 0m 32s | The patch built with eclipse:eclipse. | | {color:green}+1{color} | findbugs | 6m 5s | The patch does not introduce any new Findbugs (version 3.0.0) warnings. | | {color:green}+1{color} | yarn tests | 0m 22s | Tests passed in hadoop-yarn-api. | | {color:green}+1{color} | yarn tests | 6m 58s | Tests passed in hadoop-yarn-client. | | {color:green}+1{color} | yarn tests | 1m 57s | Tests passed in hadoop-yarn-common. | | {color:red}-1{color} | yarn tests | 60m 34s | Tests failed in hadoop-yarn-server-resourcemanager. | | {color:green}+1{color} | yarn tests | 1m 51s | Tests passed in hadoop-yarn-server-tests. | | | | 121m 2s | | \\ \\ || Reason || Tests || | Timed out tests | org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.TestNodeLabelContainerAllocation | \\ \\ || Subsystem || Report/Notes || | Patch URL | http://issues.apache.org/jira/secure/attachment/12736753/YARN-3749.7.patch | | Optional Tests | javadoc javac unit findbugs checkstyle | | git revision | trunk / 990078b | | checkstyle | https://builds.apache.org/job/PreCommit-YARN-Build/8163/artifact/patchprocess/diffcheckstylehadoop-yarn-api.txt | | hadoop-yarn-api test log | https://builds.apache.org/job/PreCommit-YARN-Build/8163/artifact/patchprocess/testrun_hadoop-yarn-api.txt | | hadoop-yarn-client test log | https://builds.apache.org/job/PreCommit-YARN-Build/8163/artifact/patchprocess/testrun_hadoop-yarn-client.txt | | hadoop-yarn-common test log | https://builds.apache.org/job/PreCommit-YARN-Build/8163/artifact/patchprocess/testrun_hadoop-yarn-common.txt | | hadoop-yarn-server-resourcemanager test log | https://builds.apache.org/job/PreCommit-YARN-Build/8163/artifact/patchprocess/testrun_hadoop-yarn-server-resourcemanager.txt | | hadoop-yarn-server-tests test log | https://builds.apache.org/job/PreCommit-YARN-Build/8163/artifact/patchprocess/testrun_hadoop-yarn-server-tests.txt | | Test Results | https://builds.apache.org/job/PreCommit-YARN-Build/8163/testReport/ | | Java | 1.7.0_55 | | uname | Linux asf907.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux | | Console output | https://builds.apache.org/job/PreCommit-YARN-Build/8163/console | This message was automatically generated. 
We should make a copy of configuration when init MiniYARNCluster with multiple RMs
-----------------------------------------------------------------------------------
Key: YARN-3749
URL: https://issues.apache.org/jira/browse/YARN-3749
Project: Hadoop YARN
Issue Type: Bug
Reporter: Chun Chen
Assignee: Chun Chen
Attachments: YARN-3749.2.patch, YARN-3749.3.patch, YARN-3749.4.patch, YARN-3749.5.patch, YARN-3749.6.patch, YARN-3749.7.patch, YARN-3749.patch

While writing a test case for YARN-2674, I found the DS client trying to connect to both rm1 and rm2 with the same address 0.0.0.0:18032 during RM failover, even though I had initially set yarn.resourcemanager.address.rm1=0.0.0.0:18032 and yarn.resourcemanager.address.rm2=0.0.0.0:28032. After digging, I found that ClientRMService is where the value of yarn.resourcemanager.address.rm2 gets changed to 0.0.0.0:18032. See the following code in ClientRMService:
{code}
clientBindAddress = conf.updateConnectAddr(YarnConfiguration.RM_BIND_HOST,
    YarnConfiguration.RM_ADDRESS,
    YarnConfiguration.DEFAULT_RM_ADDRESS,
    server.getListenerAddress());
{code}
Since we
[jira] [Updated] (YARN-3758) The minimum memory setting (yarn.scheduler.minimum-allocation-mb) is not working in container
[ https://issues.apache.org/jira/browse/YARN-3758?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

skrho updated YARN-3758:
------------------------
Description:
Hello there~~

I have two clusters. The first cluster has 5 nodes, one default application queue, and 8 GB of physical memory per node. The second cluster has 10 nodes, 2 application queues, and 230 GB of physical memory per node.

Whenever a MapReduce job runs, I want the ResourceManager to give each container a minimum of 256 MB of memory, so I changed the following configuration in yarn-site.xml and mapred-site.xml:

yarn.scheduler.minimum-allocation-mb : 256
mapreduce.map.java.opts : -Xms256m
mapreduce.reduce.java.opts : -Xms256m
mapreduce.map.memory.mb : 256
mapreduce.reduce.memory.mb : 256

On the first cluster, whenever a MapReduce job is running, I can see 256 MB of used memory in the web console ( http://installedIP:8088/cluster/nodes ). But on the second cluster, whenever a MapReduce job is running, I can see 1024 MB of used memory in the web console ( http://installedIP:8088/cluster/nodes ).

I know the default memory value is 1024 MB, so if the memory settings are not changed, the default value is used. I have been testing for two weeks, but I don't know why the minimum memory setting is not working on the second cluster. Why does this difference happen? Is my configuration wrong, or is there a bug?

Thank you for reading~~

was:
Hello there~~

I have two clusters. The first cluster has 5 nodes, one default application queue, and 8 GB of physical memory per node. The second cluster has 10 nodes, 2 application queues, and 230 GB of physical memory per node.

Whenever a MapReduce job runs, I want the ResourceManager to give each container a minimum of 256 MB of memory, so I changed the following configuration in yarn-site.xml:

yarn.scheduler.minimum-allocation-mb : 256
mapreduce.map.java.opts : -Xms256m
mapreduce.reduce.java.opts : -Xms256m
mapreduce.map.memory.mb : 256
mapreduce.reduce.memory.mb : 256

On the first cluster, whenever a MapReduce job is running, I can see 256 MB of used memory in the web console ( http://installedIP:8088/cluster/nodes ). But on the second cluster, whenever a MapReduce job is running, I can see 1024 MB of used memory in the web console ( http://installedIP:8088/cluster/nodes ).

I know the default memory value is 1024 MB, so if the memory settings are not changed, the default value is used. I have been testing for two weeks, but I don't know why the minimum memory setting is not working on the second cluster. Why does this difference happen? Is my configuration wrong, or is there a bug?

Thank you for reading~~

The minimum memory setting (yarn.scheduler.minimum-allocation-mb) is not working in container
----------------------------------------------------------------------------------------------
Key: YARN-3758
URL: https://issues.apache.org/jira/browse/YARN-3758
Project: Hadoop YARN
Issue Type: Bug
Components: resourcemanager
Affects Versions: 2.4.0
Reporter: skrho

Hello there~~

I have two clusters. The first cluster has 5 nodes, one default application queue, and 8 GB of physical memory per node. The second cluster has 10 nodes, 2 application queues, and 230 GB of physical memory per node.

Whenever a MapReduce job runs, I want the ResourceManager to give each container a minimum of 256 MB of memory, so I changed the following configuration in yarn-site.xml and mapred-site.xml:

yarn.scheduler.minimum-allocation-mb : 256
mapreduce.map.java.opts : -Xms256m
mapreduce.reduce.java.opts : -Xms256m
mapreduce.map.memory.mb : 256
mapreduce.reduce.memory.mb : 256

On the first cluster, whenever a MapReduce job is running, I can see 256 MB of used memory in the web console ( http://installedIP:8088/cluster/nodes ). But on the second cluster, whenever a MapReduce job is running, I can see 1024 MB of used memory in the web console ( http://installedIP:8088/cluster/nodes ).

I know the default memory value is 1024 MB, so if the memory settings are not changed, the default value is used. I have been testing for two weeks, but I don't know why the minimum memory setting is not working on the second cluster. Why does this difference happen? Is my configuration wrong, or is there a bug?

Thank you for reading~~

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
[jira] [Updated] (YARN-3755) Log the command of launching containers
[ https://issues.apache.org/jira/browse/YARN-3755?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jeff Zhang updated YARN-3755:
-----------------------------
Attachment: YARN-3755-2.patch

Uploaded a new patch to address the checkstyle issue.

Log the command of launching containers
---------------------------------------
Key: YARN-3755
URL: https://issues.apache.org/jira/browse/YARN-3755
Project: Hadoop YARN
Issue Type: Improvement
Affects Versions: 2.7.0
Reporter: Jeff Zhang
Assignee: Jeff Zhang
Attachments: YARN-3755-1.patch, YARN-3755-2.patch

In the ResourceManager log, YARN logs the command for launching the AM, which is very useful. But there is no such log in the NM log for launching containers, so it is difficult to diagnose containers that fail to launch because of some issue in the commands. Although a user can look at the commands in the container launch script file, that is an internal detail of YARN that users usually don't know about. From the user's perspective, they only know the commands they specified when building the YARN application.

{code}
2015-06-01 16:06:42,245 INFO org.apache.hadoop.yarn.server.resourcemanager.amlauncher.AMLauncher: Command to launch container container_1433145984561_0001_01_01 : $JAVA_HOME/bin/java -server -Djava.net.preferIPv4Stack=true -Dhadoop.metrics.log.level=WARN -Xmx1024m -Dlog4j.configuratorClass=org.apache.tez.common.TezLog4jConfigurator -Dlog4j.configuration=tez-container-log4j.properties -Dyarn.app.container.log.dir=<LOG_DIR> -Dtez.root.logger=info,CLA -Dsun.nio.ch.bugLevel='' org.apache.tez.dag.app.DAGAppMaster 1><LOG_DIR>/stdout 2><LOG_DIR>/stderr
{code}

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
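For comparison, a sketch of the kind of NodeManager-side logging being asked for (not the attached patch; the surrounding variables are assumed to be whatever is in scope where the container launch context is available):
{code}
// Sketch only: mirror AMLauncher's "Command to launch container ..." log line on
// the NM side, using the commands carried in the container launch context.
List<String> commands = launchContext.getCommands();
LOG.info("Command to launch container " + containerId + " : "
    + StringUtils.join(" ", commands));
{code}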
[jira] [Resolved] (YARN-3757) The minimum memory setting (yarn.scheduler.minimum-allocation-mb) is not working in container
[ https://issues.apache.org/jira/browse/YARN-3757?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

skrho resolved YARN-3757.
-------------------------
Resolution: Duplicate

The minimum memory setting (yarn.scheduler.minimum-allocation-mb) is not working in container
----------------------------------------------------------------------------------------------
Key: YARN-3757
URL: https://issues.apache.org/jira/browse/YARN-3757
Project: Hadoop YARN
Issue Type: Bug
Components: resourcemanager
Affects Versions: 2.4.0
Environment: Hadoop 2.4.0
Reporter: skrho

Hello there~~

I have two clusters. The first cluster has 5 nodes, one default application queue, and 8 GB of physical memory per node. The second cluster has 10 nodes, 2 application queues, and 230 GB of physical memory per node.

Whenever a MapReduce job runs, I want the ResourceManager to give each container a minimum of 256 MB of memory, so I changed the following configuration in yarn-site.xml:

yarn.scheduler.minimum-allocation-mb : 256
mapreduce.map.java.opts : -Xms256m
mapreduce.reduce.java.opts : -Xms256m
mapreduce.map.memory.mb : 256
mapreduce.reduce.memory.mb : 256

On the first cluster, whenever a MapReduce job is running, I can see 256 MB of used memory in the web console ( http://installedIP:8088/cluster/nodes ). But on the second cluster, whenever a MapReduce job is running, I can see 1024 MB of used memory in the web console ( http://installedIP:8088/cluster/nodes ).

I know the default memory value is 1024 MB, so if the memory settings are not changed, the default value is used. I have been testing for two weeks, but I don't know why the minimum memory setting is not working on the second cluster. Why does this difference happen? Is my configuration wrong, or is there a bug?

Thank you for reading~~

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
[jira] [Commented] (YARN-3758) The minimum memory setting (yarn.scheduler.minimum-allocation-mb) is not working in container
[ https://issues.apache.org/jira/browse/YARN-3758?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14568790#comment-14568790 ]

Naganarasimha G R commented on YARN-3758:
-----------------------------------------
YARN-3756 and YARN-3757 are the same as this issue! Can you close them?

The minimum memory setting (yarn.scheduler.minimum-allocation-mb) is not working in container
----------------------------------------------------------------------------------------------
Key: YARN-3758
URL: https://issues.apache.org/jira/browse/YARN-3758
Project: Hadoop YARN
Issue Type: Bug
Components: resourcemanager
Affects Versions: 2.4.0
Reporter: skrho

Hello there~~

I have two clusters. The first cluster has 5 nodes, one default application queue, the Capacity scheduler, and 8 GB of physical memory per node. The second cluster has 10 nodes, 2 application queues, the fair-scheduler, and 230 GB of physical memory per node.

Whenever a MapReduce job runs, I want the ResourceManager to give each container a minimum of 256 MB of memory, so I changed the following configuration in yarn-site.xml and mapred-site.xml:

yarn.scheduler.minimum-allocation-mb : 256
mapreduce.map.java.opts : -Xms256m
mapreduce.reduce.java.opts : -Xms256m
mapreduce.map.memory.mb : 256
mapreduce.reduce.memory.mb : 256

On the first cluster, whenever a MapReduce job is running, I can see 256 MB of used memory in the web console ( http://installedIP:8088/cluster/nodes ). But on the second cluster, whenever a MapReduce job is running, I can see 1024 MB of used memory in the web console ( http://installedIP:8088/cluster/nodes ).

I know the default memory value is 1024 MB, so if the memory settings are not changed, the default value is used. I have been testing for two weeks, but I don't know why the minimum memory setting is not working on the second cluster. Why does this difference happen? Is my configuration wrong, or is there a bug?

Thank you for reading~~

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
[jira] [Resolved] (YARN-3756) The minimum memory setting (yarn.scheduler.minimum-allocation-mb) is not working in container
[ https://issues.apache.org/jira/browse/YARN-3756?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

skrho resolved YARN-3756.
-------------------------
Resolution: Duplicate

The minimum memory setting (yarn.scheduler.minimum-allocation-mb) is not working in container
----------------------------------------------------------------------------------------------
Key: YARN-3756
URL: https://issues.apache.org/jira/browse/YARN-3756
Project: Hadoop YARN
Issue Type: Bug
Components: resourcemanager
Affects Versions: 2.4.0
Environment: hadoop 2.4.0
Reporter: skrho

Hello there~~

I have two clusters. The first cluster has 5 nodes, one default application queue, and 8 GB of physical memory per node. The second cluster has 10 nodes, 2 application queues, and 230 GB of physical memory per node.

Whenever a MapReduce job runs, I want the ResourceManager to give each container a minimum of 256 MB of memory, so I changed the following configuration in yarn-site.xml and mapred-site.xml:

yarn.scheduler.minimum-allocation-mb : 256
mapreduce.map.java.opts : -Xms256m
mapreduce.reduce.java.opts : -Xms256m
mapreduce.map.memory.mb : 256
mapreduce.reduce.memory.mb : 256

On the first cluster, whenever a MapReduce job is running, I can see 256 MB of used memory in the web console ( http://installedIP:8088/cluster/nodes ). But on the second cluster, whenever a MapReduce job is running, I can see 1024 MB of used memory in the web console ( http://installedIP:8088/cluster/nodes ).

I know the default memory value is 1024 MB, so if the memory settings are not changed, the default value is used. I have been testing for two weeks, but I don't know why the minimum memory setting is not working on the second cluster. Why does this difference happen? Is my configuration wrong, or is there a bug?

Thank you for reading~~

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
[jira] [Assigned] (YARN-3761) Set delegation token service address at the server side
[ https://issues.apache.org/jira/browse/YARN-3761?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Varun Saxena reassigned YARN-3761:
----------------------------------
Assignee: Varun Saxena

Set delegation token service address at the server side
--------------------------------------------------------
Key: YARN-3761
URL: https://issues.apache.org/jira/browse/YARN-3761
Project: Hadoop YARN
Issue Type: Improvement
Components: security
Reporter: Zhijie Shen
Assignee: Varun Saxena

Nowadays, YARN components generate the delegation token without the service address set and leave it to the client to set it. With our Java client library this is usually fine. However, it becomes a problem for users of the REST API: the delegation token is returned as a URL string, and it is unfriendly to ask a thin client to deserialize the URL string, set the token service address, and serialize it again for further use. If we move the task of setting the service address to the server side, the client can be rid of this trouble.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
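For context, the client-side step that this proposal would make unnecessary looks roughly like the following (a sketch using the common Hadoop token helpers; variable names are assumed):
{code}
// What a thin REST client currently has to do with the token string it receives:
Token<?> token = new Token<>();
token.decodeFromUrlString(tokenUrlString);        // deserialize the URL-safe string
SecurityUtil.setTokenService(token, rmAddress);   // fill in the missing service address
String forwarded = token.encodeToUrlString();     // serialize again for further use
{code}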
[jira] [Updated] (YARN-3069) Document missing properties in yarn-default.xml
[ https://issues.apache.org/jira/browse/YARN-3069?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ray Chiang updated YARN-3069: - Attachment: YARN-3069.011.patch Thanks Akira! New patch with the following changes: - Fix description for yarn.node-labels.fs-store.retry-policy-spec - Remove YARN registry entries from yarn-default.xml - Remove one outdated entry yarn.application.classpath.prepend.distcache - Add entry for yarn.intermediate-data-encryption.enable I'll also go through the yarn-default.xml file once more to make sure no default values will change. Document missing properties in yarn-default.xml --- Key: YARN-3069 URL: https://issues.apache.org/jira/browse/YARN-3069 Project: Hadoop YARN Issue Type: Bug Components: documentation Reporter: Ray Chiang Assignee: Ray Chiang Labels: BB2015-05-TBR, supportability Attachments: YARN-3069.001.patch, YARN-3069.002.patch, YARN-3069.003.patch, YARN-3069.004.patch, YARN-3069.005.patch, YARN-3069.006.patch, YARN-3069.007.patch, YARN-3069.008.patch, YARN-3069.009.patch, YARN-3069.010.patch, YARN-3069.011.patch The following properties are currently not defined in yarn-default.xml. These properties should either be A) documented in yarn-default.xml OR B) listed as an exception (with comments, e.g. for internal use) in the TestYarnConfigurationFields unit test Any comments for any of the properties below are welcome. org.apache.hadoop.yarn.server.sharedcachemanager.RemoteAppChecker org.apache.hadoop.yarn.server.sharedcachemanager.store.InMemorySCMStore security.applicationhistory.protocol.acl yarn.app.container.log.backups yarn.app.container.log.dir yarn.app.container.log.filesize yarn.client.app-submission.poll-interval yarn.client.application-client-protocol.poll-timeout-ms yarn.is.minicluster yarn.log.server.url yarn.minicluster.control-resource-monitoring yarn.minicluster.fixed.ports yarn.minicluster.use-rpc yarn.node-labels.fs-store.retry-policy-spec yarn.node-labels.fs-store.root-dir yarn.node-labels.manager-class yarn.nodemanager.container-executor.os.sched.priority.adjustment yarn.nodemanager.container-monitor.process-tree.class yarn.nodemanager.disk-health-checker.enable yarn.nodemanager.docker-container-executor.image-name yarn.nodemanager.linux-container-executor.cgroups.delete-timeout-ms yarn.nodemanager.linux-container-executor.group yarn.nodemanager.log.deletion-threads-count yarn.nodemanager.user-home-dir yarn.nodemanager.webapp.https.address yarn.nodemanager.webapp.spnego-keytab-file yarn.nodemanager.webapp.spnego-principal yarn.nodemanager.windows-secure-container-executor.group yarn.resourcemanager.configuration.file-system-based-store yarn.resourcemanager.delegation-token-renewer.thread-count yarn.resourcemanager.delegation.key.update-interval yarn.resourcemanager.delegation.token.max-lifetime yarn.resourcemanager.delegation.token.renew-interval yarn.resourcemanager.history-writer.multi-threaded-dispatcher.pool-size yarn.resourcemanager.metrics.runtime.buckets yarn.resourcemanager.nm-tokens.master-key-rolling-interval-secs yarn.resourcemanager.reservation-system.class yarn.resourcemanager.reservation-system.enable yarn.resourcemanager.reservation-system.plan.follower yarn.resourcemanager.reservation-system.planfollower.time-step yarn.resourcemanager.rm.container-allocation.expiry-interval-ms yarn.resourcemanager.webapp.spnego-keytab-file yarn.resourcemanager.webapp.spnego-principal yarn.scheduler.include-port-in-node-name yarn.timeline-service.delegation.key.update-interval 
yarn.timeline-service.delegation.token.max-lifetime yarn.timeline-service.delegation.token.renew-interval yarn.timeline-service.generic-application-history.enabled yarn.timeline-service.generic-application-history.fs-history-store.compression-type yarn.timeline-service.generic-application-history.fs-history-store.uri yarn.timeline-service.generic-application-history.store-class yarn.timeline-service.http-cross-origin.enabled yarn.tracking.url.generator -- This message was sent by Atlassian JIRA (v6.3.4#6332)
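For option B above, the exceptions live in the configuration-consistency unit test. A sketch of what such an exclusion might look like (the member names follow the TestConfigurationFieldsBase pattern and are written from memory, so treat them as assumptions rather than the exact API):
{code}
// Sketch: mark a property as intentionally undocumented so the
// TestYarnConfigurationFields comparison against yarn-default.xml skips it.
@Override
public void initializeMemberVariables() {
  xmlFilename = "yarn-default.xml";
  configurationPropsToSkipCompare = new HashSet<String>();
  // internal/minicluster-only property, deliberately not documented:
  configurationPropsToSkipCompare.add("yarn.is.minicluster");
}
{code}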
[jira] [Commented] (YARN-1462) AHS API and other AHS changes to handle tags for completed MR jobs
[ https://issues.apache.org/jira/browse/YARN-1462?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14569548#comment-14569548 ] Sergey Shelukhin commented on YARN-1462: [~sseth] can you please comment on the above (use of Private API)? AHS API and other AHS changes to handle tags for completed MR jobs -- Key: YARN-1462 URL: https://issues.apache.org/jira/browse/YARN-1462 Project: Hadoop YARN Issue Type: Sub-task Affects Versions: 2.2.0 Reporter: Karthik Kambatla Assignee: Xuan Gong Fix For: 2.8.0 Attachments: YARN-1462-branch-2.7-1.2.patch, YARN-1462-branch-2.7-1.patch, YARN-1462.1.patch, YARN-1462.2.patch, YARN-1462.3.patch AHS related work for tags. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-3069) Document missing properties in yarn-default.xml
[ https://issues.apache.org/jira/browse/YARN-3069?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14569646#comment-14569646 ] Hadoop QA commented on YARN-3069: - \\ \\ | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:red}-1{color} | pre-patch | 19m 46s | Findbugs (version ) appears to be broken on trunk. | | {color:green}+1{color} | @author | 0m 0s | The patch does not contain any @author tags. | | {color:green}+1{color} | tests included | 0m 0s | The patch appears to include 1 new or modified test files. | | {color:green}+1{color} | javac | 7m 33s | There were no new javac warning messages. | | {color:green}+1{color} | javadoc | 9m 39s | There were no new javadoc warning messages. | | {color:green}+1{color} | release audit | 0m 22s | The applied patch does not increase the total number of release audit warnings. | | {color:green}+1{color} | site | 2m 58s | Site still builds. | | {color:green}+1{color} | checkstyle | 1m 36s | There were no new checkstyle issues. | | {color:red}-1{color} | whitespace | 0m 1s | The patch has 1 line(s) that end in whitespace. Use git apply --whitespace=fix. | | {color:green}+1{color} | install | 1m 32s | mvn install still works. | | {color:green}+1{color} | eclipse:eclipse | 0m 33s | The patch built with eclipse:eclipse. | | {color:green}+1{color} | findbugs | 3m 22s | The patch does not introduce any new Findbugs (version 3.0.0) warnings. | | {color:green}+1{color} | common tests | 23m 34s | Tests passed in hadoop-common. | | {color:green}+1{color} | yarn tests | 1m 55s | Tests passed in hadoop-yarn-common. | | | | 72m 56s | | \\ \\ || Subsystem || Report/Notes || | Patch URL | http://issues.apache.org/jira/secure/attachment/12736976/YARN-3069.011.patch | | Optional Tests | site javadoc javac unit findbugs checkstyle | | git revision | trunk / a2bd621 | | whitespace | https://builds.apache.org/job/PreCommit-YARN-Build/8168/artifact/patchprocess/whitespace.txt | | hadoop-common test log | https://builds.apache.org/job/PreCommit-YARN-Build/8168/artifact/patchprocess/testrun_hadoop-common.txt | | hadoop-yarn-common test log | https://builds.apache.org/job/PreCommit-YARN-Build/8168/artifact/patchprocess/testrun_hadoop-yarn-common.txt | | Test Results | https://builds.apache.org/job/PreCommit-YARN-Build/8168/testReport/ | | Java | 1.7.0_55 | | uname | Linux asf905.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux | | Console output | https://builds.apache.org/job/PreCommit-YARN-Build/8168/console | This message was automatically generated. Document missing properties in yarn-default.xml --- Key: YARN-3069 URL: https://issues.apache.org/jira/browse/YARN-3069 Project: Hadoop YARN Issue Type: Bug Components: documentation Reporter: Ray Chiang Assignee: Ray Chiang Labels: BB2015-05-TBR, supportability Attachments: YARN-3069.001.patch, YARN-3069.002.patch, YARN-3069.003.patch, YARN-3069.004.patch, YARN-3069.005.patch, YARN-3069.006.patch, YARN-3069.007.patch, YARN-3069.008.patch, YARN-3069.009.patch, YARN-3069.010.patch, YARN-3069.011.patch The following properties are currently not defined in yarn-default.xml. These properties should either be A) documented in yarn-default.xml OR B) listed as an exception (with comments, e.g. for internal use) in the TestYarnConfigurationFields unit test Any comments for any of the properties below are welcome. 
org.apache.hadoop.yarn.server.sharedcachemanager.RemoteAppChecker org.apache.hadoop.yarn.server.sharedcachemanager.store.InMemorySCMStore security.applicationhistory.protocol.acl yarn.app.container.log.backups yarn.app.container.log.dir yarn.app.container.log.filesize yarn.client.app-submission.poll-interval yarn.client.application-client-protocol.poll-timeout-ms yarn.is.minicluster yarn.log.server.url yarn.minicluster.control-resource-monitoring yarn.minicluster.fixed.ports yarn.minicluster.use-rpc yarn.node-labels.fs-store.retry-policy-spec yarn.node-labels.fs-store.root-dir yarn.node-labels.manager-class yarn.nodemanager.container-executor.os.sched.priority.adjustment yarn.nodemanager.container-monitor.process-tree.class yarn.nodemanager.disk-health-checker.enable yarn.nodemanager.docker-container-executor.image-name yarn.nodemanager.linux-container-executor.cgroups.delete-timeout-ms yarn.nodemanager.linux-container-executor.group yarn.nodemanager.log.deletion-threads-count yarn.nodemanager.user-home-dir yarn.nodemanager.webapp.https.address yarn.nodemanager.webapp.spnego-keytab-file yarn.nodemanager.webapp.spnego-principal
[jira] [Commented] (YARN-1462) AHS API and other AHS changes to handle tags for completed MR jobs
[ https://issues.apache.org/jira/browse/YARN-1462?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14569648#comment-14569648 ] Siddharth Seth commented on YARN-1462: -- ApplicationReport.newInstance is used by mapreduce and Tez, and potentially by other applications whose AMs may be modeled along the same lines. It'll be useful to make the API change here compatible. This is along the lines of newInstances being used for various constructs like ContainerId, AppId, etc. With the change, I don't believe MR2.6 will work with a 2.8 cluster - depending on how the classpath is set up. AHS API and other AHS changes to handle tags for completed MR jobs -- Key: YARN-1462 URL: https://issues.apache.org/jira/browse/YARN-1462 Project: Hadoop YARN Issue Type: Sub-task Affects Versions: 2.2.0 Reporter: Karthik Kambatla Assignee: Xuan Gong Fix For: 2.8.0 Attachments: YARN-1462-branch-2.7-1.2.patch, YARN-1462-branch-2.7-1.patch, YARN-1462.1.patch, YARN-1462.2.patch, YARN-1462.3.patch AHS related work for tags. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-3585) NodeManager cannot exit on SHUTDOWN event triggered and NM recovery is enabled
[ https://issues.apache.org/jira/browse/YARN-3585?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14569773#comment-14569773 ] Jason Lowe commented on YARN-3585: -- +1 latest patch lgtm. Will commit this tomorrow if there are no objections. NodeManager cannot exit on SHUTDOWN event triggered and NM recovery is enabled -- Key: YARN-3585 URL: https://issues.apache.org/jira/browse/YARN-3585 Project: Hadoop YARN Issue Type: Bug Affects Versions: 2.6.0 Reporter: Peng Zhang Assignee: Rohith Priority: Critical Attachments: 0001-YARN-3585.patch, YARN-3585.patch With NM recovery enabled, after decommission, the nodemanager log shows it stopping but the process cannot exit. Non-daemon threads:
{noformat}
DestroyJavaVM prio=10 tid=0x7f3460011800 nid=0x29ec waiting on condition [0x]
leveldb prio=10 tid=0x7f3354001800 nid=0x2a97 runnable [0x]
VM Thread prio=10 tid=0x7f3460167000 nid=0x29f8 runnable
Gang worker#0 (Parallel GC Threads) prio=10 tid=0x7f346002 nid=0x29ed runnable
Gang worker#1 (Parallel GC Threads) prio=10 tid=0x7f3460022000 nid=0x29ee runnable
Gang worker#2 (Parallel GC Threads) prio=10 tid=0x7f3460024000 nid=0x29ef runnable
Gang worker#3 (Parallel GC Threads) prio=10 tid=0x7f3460025800 nid=0x29f0 runnable
Gang worker#4 (Parallel GC Threads) prio=10 tid=0x7f3460027800 nid=0x29f1 runnable
Gang worker#5 (Parallel GC Threads) prio=10 tid=0x7f3460029000 nid=0x29f2 runnable
Gang worker#6 (Parallel GC Threads) prio=10 tid=0x7f346002b000 nid=0x29f3 runnable
Gang worker#7 (Parallel GC Threads) prio=10 tid=0x7f346002d000 nid=0x29f4 runnable
Concurrent Mark-Sweep GC Thread prio=10 tid=0x7f3460120800 nid=0x29f7 runnable
Gang worker#0 (Parallel CMS Threads) prio=10 tid=0x7f346011c800 nid=0x29f5 runnable
Gang worker#1 (Parallel CMS Threads) prio=10 tid=0x7f346011e800 nid=0x29f6 runnable
VM Periodic Task Thread prio=10 tid=0x7f346019f800 nid=0x2a01 waiting on condition
{noformat}
and the JNI leveldb thread stack:
{noformat}
Thread 12 (Thread 0x7f33dd842700 (LWP 10903)):
#0 0x003d8340b43c in pthread_cond_wait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
#1 0x7f33dfce2a3b in leveldb::(anonymous namespace)::PosixEnv::BGThreadWrapper(void*) () from /tmp/libleveldbjni-64-1-6922178968300745716.8
#2 0x003d83407851 in start_thread () from /lib64/libpthread.so.0
#3 0x003d830e811d in clone () from /lib64/libc.so.6
{noformat}
-- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (YARN-2392) add more diags about app retry limits on AM failures
[ https://issues.apache.org/jira/browse/YARN-2392?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Steve Loughran updated YARN-2392: - Priority: Minor (was: Major) add more diags about app retry limits on AM failures Key: YARN-2392 URL: https://issues.apache.org/jira/browse/YARN-2392 Project: Hadoop YARN Issue Type: Sub-task Components: resourcemanager Affects Versions: 2.6.0 Reporter: Steve Loughran Assignee: Steve Loughran Priority: Minor Attachments: YARN-2392-001.patch, YARN-2392-002.patch, YARN-2392-002.patch # when an app fails the failure count is shown, but not what the global + local limits are. If the two are different, they should both be printed. # the YARN-2242 strings don't have enough whitespace between text and the URL -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-3591) Resource Localisation on a bad disk causes subsequent containers failure
[ https://issues.apache.org/jira/browse/YARN-3591?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14569727#comment-14569727 ] zhihai xu commented on YARN-3591: - Hi [~lavkesh], I think we can create a separate JIRA for storing local error directories in the NM state store, which will be a good enhancement. Thanks [~sunilg]! Adding a new API to get local error directories is also a good suggestion. But I think it will be enough to just check newErrorDirs instead of all errorDirs. To better support NM recovery and keep the DirsChangeListener interface simple, I propose the following changes: 1. In DirectoryCollection, notify the listener when any set of dirs (localDirs, errorDirs and fullDirs) is changed. The code change in {{DirectoryCollection#checkDirs}} looks like the following:
{code}
boolean needNotifyListener = setChanged;
for (String dir : preCheckFullDirs) {
  if (postCheckOtherDirs.contains(dir)) {
    needNotifyListener = true;
    LOG.warn("Directory " + dir + " error: " + dirsFailedCheck.get(dir).message);
  }
}
for (String dir : preCheckOtherErrorDirs) {
  if (postCheckFullDirs.contains(dir)) {
    needNotifyListener = true;
    LOG.warn("Directory " + dir + " error: " + dirsFailedCheck.get(dir).message);
  }
}
if (needNotifyListener) {
  for (DirsChangeListener listener : dirsChangeListeners) {
    listener.onDirsChanged();
  }
}
{code}
2. Add an API to get local error directories. As [~sunilg] suggested, we can add an API {{synchronized List<String> getErrorDirs()}} in DirectoryCollection.java. We also need to add an API {{public List<String> getLocalErrorDirs()}} in LocalDirsHandlerService.java, which will call {{DirectoryCollection#getErrorDirs}}. 3. Add a field {{Set<String> preLocalErrorDirs}} in ResourceLocalizationService.java to store the previous local error directories. {{ResourceLocalizationService#preLocalErrorDirs}} should be loaded from the state store at the beginning if we support storing local error directories in the NM state store. 4. The following is pseudo code for {{localDirsChangeListener#onDirsChanged}}:
{code}
Set<String> curLocalErrorDirs = new HashSet<String>(dirsHandler.getLocalErrorDirs());
List<String> newErrorDirs = new ArrayList<String>();
List<String> newRepairedDirs = new ArrayList<String>();
for (String dir : curLocalErrorDirs) {
  if (!preLocalErrorDirs.contains(dir)) {
    newErrorDirs.add(dir);
  }
}
for (String dir : preLocalErrorDirs) {
  if (!curLocalErrorDirs.contains(dir)) {
    newRepairedDirs.add(dir);
  }
}
for (String localDir : newRepairedDirs) {
  cleanUpLocalDir(lfs, delService, localDir);
}
if (!newErrorDirs.isEmpty()) {
  // As Sunil suggested, checkLocalizedResources will call removeResource on those
  // localized resources whose parent is present in newErrorDirs.
  publicRsrc.checkLocalizedResources(newErrorDirs);
  for (LocalResourcesTracker tracker : privateRsrc.values()) {
    tracker.checkLocalizedResources(newErrorDirs);
  }
}
if (!newErrorDirs.isEmpty() || !newRepairedDirs.isEmpty()) {
  preLocalErrorDirs = curLocalErrorDirs;
  stateStore.storeLocalErrorDirs(
      StringUtils.arrayToString(curLocalErrorDirs.toArray(new String[0])));
}
checkAndInitializeLocalDirs();
{code}
5. It will be better to move {{verifyDirUsingMkdir(testDir)}} right after {{DiskChecker.checkDir(testDir)}} in {{DirectoryCollection#testDirs}}, so we can detect an error directory before detecting a full directory. Please feel free to change or add more to my proposal.
Resource Localisation on a bad disk causes subsequent containers failure - Key: YARN-3591 URL: https://issues.apache.org/jira/browse/YARN-3591 Project: Hadoop YARN Issue Type: Bug Affects Versions: 2.7.0 Reporter: Lavkesh Lahngir Assignee: Lavkesh Lahngir Attachments: 0001-YARN-3591.1.patch, 0001-YARN-3591.patch, YARN-3591.2.patch, YARN-3591.3.patch, YARN-3591.4.patch It happens when a resource is localised on a disk and, after localising, that disk has gone bad. The NM keeps paths for localised resources in memory. At the time of a resource request, isResourcePresent(rsrc) will be called, which calls file.exists() on the localised path. In some cases when the disk has gone bad, inodes are still cached and file.exists() returns true. But at the time of reading, the file will not open. Note: file.exists() actually calls stat64 natively, which returns true because it was able to find inode information from the OS. A proposal is to call file.list() on the parent path of the resource, which will call open() natively. If the disk is good it should return an array of paths with length at least 1. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
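As a concrete illustration of the proposal above, here is a minimal, self-contained sketch (the class and method names are hypothetical, not the actual NM change): instead of trusting {{file.exists()}}, list the parent directory, which forces a native directory read and fails when the disk is bad even though the inode is still cached.
{code}
import java.io.File;

public class LocalizedResourceCheck {
  // Hypothetical helper: verify a localized resource by listing its parent directory
  // (a native opendir/readdir) rather than relying on file.exists()/stat64.
  static boolean isResourceReadable(File localizedPath) {
    File parent = localizedPath.getParentFile();
    if (parent == null) {
      return localizedPath.exists();
    }
    String[] children = parent.list(); // returns null if the directory cannot be read
    if (children == null || children.length < 1) {
      return false; // parent unreadable or empty: treat the disk as suspect
    }
    for (String child : children) {
      if (child.equals(localizedPath.getName())) {
        return true;
      }
    }
    return false;
  }
}
{code}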
[jira] [Updated] (YARN-2392) add more diags about app retry limits on AM failures
[ https://issues.apache.org/jira/browse/YARN-2392?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Steve Loughran updated YARN-2392: - Attachment: YARN-2392-002.patch Patch 002 * in sync with trunk * uses String.format for a more readable format of the response * includes sliding window details in the message There's no test here, for which I apologise. To test this I'd need a test to trigger failures and look for the final error message, which seems excessive for a log tuning. If there's a test for the sliding-window retry that could be patched, I'll do it there. add more diags about app retry limits on AM failures Key: YARN-2392 URL: https://issues.apache.org/jira/browse/YARN-2392 Project: Hadoop YARN Issue Type: Sub-task Components: resourcemanager Affects Versions: 2.6.0 Reporter: Steve Loughran Assignee: Steve Loughran Attachments: YARN-2392-001.patch, YARN-2392-002.patch, YARN-2392-002.patch # when an app fails the failure count is shown, but not what the global + local limits are. If the two are different, they should both be printed. # the YARN-2242 strings don't have enough whitespace between text and the URL -- This message was sent by Atlassian JIRA (v6.3.4#6332)
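As a rough sketch of the String.format-based diagnostics described above (the message text and parameter names are hypothetical, not taken from the actual YARN-2392 patch), the failure message can carry both the per-app and global attempt limits plus the sliding-window interval:
{code}
public class AmRetryDiagnostics {
  static String buildDiagnostics(int failures, int maxAppAttempts,
      int globalMaxAttempts, long failuresValidityIntervalMs) {
    // Mention the window only when a sliding-window interval is configured.
    String window = failuresValidityIntervalMs > 0
        ? String.format(" in the last %d ms", failuresValidityIntervalMs)
        : "";
    return String.format(
        "Application failed %d times%s (app max attempts: %d, global max attempts: %d).",
        failures, window, maxAppAttempts, globalMaxAttempts);
  }
}
{code}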
[jira] [Commented] (YARN-2194) Cgroups cease to work in RHEL7
[ https://issues.apache.org/jira/browse/YARN-2194?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14569778#comment-14569778 ] Matthew Jacobs commented on YARN-2194: -- I'm confused, does this mean that you'll re-mount the cpu and cpuacct controllers? Do we know that other components in the RHEL7 world don't expect them to be in the default place? Cgroups cease to work in RHEL7 -- Key: YARN-2194 URL: https://issues.apache.org/jira/browse/YARN-2194 Project: Hadoop YARN Issue Type: Bug Components: nodemanager Affects Versions: 2.7.0 Reporter: Wei Yan Assignee: Wei Yan Priority: Critical Attachments: YARN-2194-1.patch, YARN-2194-2.patch, YARN-2194-3.patch In RHEL7, the CPU controller is named cpu,cpuacct. The comma in the controller name leads to container launch failure. RHEL7 deprecates libcgroup and recommends the user of systemd. However, systemd has certain shortcomings as identified in this JIRA (see comments). This JIRA only fixes the failure, and doesn't try to use systemd. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-3510) Create an extension of ProportionalCapacityPreemptionPolicy which preempts a number of containers from each application in a way which respects fairness
[ https://issues.apache.org/jira/browse/YARN-3510?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14570039#comment-14570039 ] Craig Welch commented on YARN-3510: --- [~leftnoteasy] and I had some offline discussion. The patch currently here is simply meant to keep from unbalancing whatever allocation process is active by, generally, keeping relative usage between applications the same. It doesn't attempt to actively re-allocate in a way which achieves the overall allocation policy, i.e., as if all the applications had started at once (this is a more complex proposition, obviously). There's a desire to have this because, among other things, sometime down the road we may do preemption just among users/applications in a queue, and it will be necessary for the preemption to actively work toward the allocation goals to do that, rather than just maintain current levels. This will add some medium-level complexity to the current patch; the deltas from the current approach are: Since the effect of preemption on ordering for fairness doesn't occur until the container is released, and we want to consider it right away, there will be a need to retain info about pending preemption for comparison on the app resources (it will be a deduction from usage for ordering purposes, as if the preemption had already happened). The preemptEvenly loop will need to reorder the app which was preempted after each preemption and then restart the iteration over apps (not necessarily over all apps, again, just until the first preemption). Create an extension of ProportionalCapacityPreemptionPolicy which preempts a number of containers from each application in a way which respects fairness Key: YARN-3510 URL: https://issues.apache.org/jira/browse/YARN-3510 Project: Hadoop YARN Issue Type: Sub-task Components: yarn Reporter: Craig Welch Assignee: Craig Welch Attachments: YARN-3510.2.patch, YARN-3510.3.patch, YARN-3510.5.patch, YARN-3510.6.patch The ProportionalCapacityPreemptionPolicy preempts as many containers from applications as it can during its preemption run. For fifo this makes sense, as it is preempting in reverse order, therefore maintaining the primacy of the oldest. For fair ordering this does not have the desired effect - instead, it should preempt a number of containers from each application which maintains a fair balance / close to a fair balance between them -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-2194) Cgroups cease to work in RHEL7
[ https://issues.apache.org/jira/browse/YARN-2194?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14570040#comment-14570040 ] Matthew Jacobs commented on YARN-2194: -- Thanks, [sidharta-s]. So the change would be in how the container-executor accepts lists of paths, not attempting to re-mount the controllers, right? If I understand it correctly, that sounds like a good plan to me. Cgroups cease to work in RHEL7 -- Key: YARN-2194 URL: https://issues.apache.org/jira/browse/YARN-2194 Project: Hadoop YARN Issue Type: Bug Components: nodemanager Affects Versions: 2.7.0 Reporter: Wei Yan Assignee: Wei Yan Priority: Critical Attachments: YARN-2194-1.patch, YARN-2194-2.patch, YARN-2194-3.patch In RHEL7, the CPU controller is named cpu,cpuacct. The comma in the controller name leads to container launch failure. RHEL7 deprecates libcgroup and recommends the user of systemd. However, systemd has certain shortcomings as identified in this JIRA (see comments). This JIRA only fixes the failure, and doesn't try to use systemd. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (YARN-3762) FairScheduler: CME on FSParentQueue#getQueueUserAclInfo
[ https://issues.apache.org/jira/browse/YARN-3762?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Karthik Kambatla updated YARN-3762: --- Attachment: yarn-3762-1.patch Here is a patch that protects FSParentQueue members with read-write locks. FairScheduler: CME on FSParentQueue#getQueueUserAclInfo --- Key: YARN-3762 URL: https://issues.apache.org/jira/browse/YARN-3762 Project: Hadoop YARN Issue Type: Bug Components: fairscheduler Affects Versions: 2.7.0 Reporter: Karthik Kambatla Assignee: Karthik Kambatla Priority: Blocker Attachments: yarn-3762-1.patch In our testing, we ran into the following ConcurrentModificationException: {noformat} halxg.cloudera.com:8042, nodeRackName/rackvb07, nodeNumContainers0 15/05/22 13:02:22 INFO distributedshell.Client: Queue info, queueName=root.testyarnpool3, queueCurrentCapacity=0.0, queueMaxCapacity=-1.0, queueApplicationCount=0, queueChildQueueCount=0 15/05/22 13:02:22 FATAL distributedshell.Client: Error running Client java.util.ConcurrentModificationException: java.util.ConcurrentModificationException at java.util.ArrayList$Itr.checkForComodification(ArrayList.java:901) at java.util.ArrayList$Itr.next(ArrayList.java:851) at org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FSParentQueue.getQueueUserAclInfo(FSParentQueue.java:155) at org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler.getQueueUserAclInfo(FairScheduler.java:1395) at org.apache.hadoop.yarn.server.resourcemanager.ClientRMService.getQueueUserAcls(ClientRMService.java:880) {noformat} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
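For readers following along, a minimal sketch of the read-write-lock approach (illustrative only, with made-up names rather than the actual FSParentQueue code): readers iterate or copy the child list under the read lock, writers mutate it under the write lock, so a getQueueUserAclInfo-style traversal can no longer race with a queue being added.
{code}
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.locks.ReentrantReadWriteLock;

public class GuardedQueueList {
  private final List<String> childQueues = new ArrayList<>();
  private final ReentrantReadWriteLock rwLock = new ReentrantReadWriteLock();

  public void addChildQueue(String name) {
    rwLock.writeLock().lock();
    try {
      childQueues.add(name); // mutation only under the write lock
    } finally {
      rwLock.writeLock().unlock();
    }
  }

  public List<String> snapshotChildQueues() {
    rwLock.readLock().lock();
    try {
      return new ArrayList<>(childQueues); // iterate/copy only under the read lock
    } finally {
      rwLock.readLock().unlock();
    }
  }
}
{code}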
[jira] [Commented] (YARN-2194) Cgroups cease to work in RHEL7
[ https://issues.apache.org/jira/browse/YARN-2194?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14570037#comment-14570037 ] Sidharta Seethana commented on YARN-2194: - There are two different issues here : * container-executor binary invocation uses ‘,’ as a separator when supplying a list of paths - which breaks when the path contains ‘,’ * cpu,cpuacct are mounted together by default on RHEL7 Now, for the latter issue : In {{CgroupsLCEResourcesHandler}}, the following steps occur : * If the {{yarn.nodemanager.linux-container-executor.cgroups.mount}} switch is enabled , the ‘cpu’ controller is explicitly mounted at the specified path. * (irrespective of the state of the switch) The {{/proc/mounts}} file (possibly updated by the previous step) is subsequently parsed to determine the mount locations for the various cgroup controllers - this parsing code seems to be correct even if cpu and cpuacct are mounted in one location. So, the thing we need to fix is the separator issue and we should be good. The important thing to remember is that there are *two* cgroups implementation classes ( {{CgroupsLCEResourcesHandler}} and {{CGroupsHandlerImpl}} ). Hopefully, this will be addressed soon ( YARN-3542 ) - or we risk divergence. Cgroups cease to work in RHEL7 -- Key: YARN-2194 URL: https://issues.apache.org/jira/browse/YARN-2194 Project: Hadoop YARN Issue Type: Bug Components: nodemanager Affects Versions: 2.7.0 Reporter: Wei Yan Assignee: Wei Yan Priority: Critical Attachments: YARN-2194-1.patch, YARN-2194-2.patch, YARN-2194-3.patch In RHEL7, the CPU controller is named cpu,cpuacct. The comma in the controller name leads to container launch failure. RHEL7 deprecates libcgroup and recommends the user of systemd. However, systemd has certain shortcomings as identified in this JIRA (see comments). This JIRA only fixes the failure, and doesn't try to use systemd. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
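As a rough illustration of the /proc/mounts parsing mentioned above (a sketch under assumptions, not the actual {{CgroupsLCEResourcesHandler}} code), the mount options only need to be split on ',' so that both cpu and cpuacct resolve to the same co-mounted hierarchy, e.g. /sys/fs/cgroup/cpu,cpuacct on RHEL7:
{code}
import java.io.BufferedReader;
import java.io.FileReader;
import java.io.IOException;
import java.util.HashMap;
import java.util.Map;

public class CgroupMountParser {
  // Maps each cgroup mount option (including controller names such as cpu and cpuacct)
  // to the mount point of the hierarchy it belongs to.
  static Map<String, String> findCgroupMounts() throws IOException {
    Map<String, String> optionToPath = new HashMap<>();
    try (BufferedReader reader = new BufferedReader(new FileReader("/proc/mounts"))) {
      String line;
      while ((line = reader.readLine()) != null) {
        // /proc/mounts format: <device> <mount-point> <fs-type> <options> <dump> <pass>
        String[] fields = line.split("\\s+");
        if (fields.length < 4 || !"cgroup".equals(fields[2])) {
          continue;
        }
        for (String option : fields[3].split(",")) {
          optionToPath.put(option, fields[1]);
        }
      }
    }
    return optionToPath;
  }
}
{code}
With this, looking up "cpu" and "cpuacct" returns the same path on a co-mounted hierarchy, and no comma-separated list of controller paths has to be passed to the container-executor at all.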
[jira] [Updated] (YARN-3762) FairScheduler: CME on FSParentQueue#getQueueUserAclInfo
[ https://issues.apache.org/jira/browse/YARN-3762?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Karthik Kambatla updated YARN-3762: --- Attachment: yarn-3762-1.patch Sorry, I forgot to rebase and included some HDFS change as well. FairScheduler: CME on FSParentQueue#getQueueUserAclInfo --- Key: YARN-3762 URL: https://issues.apache.org/jira/browse/YARN-3762 Project: Hadoop YARN Issue Type: Bug Components: fairscheduler Affects Versions: 2.7.0 Reporter: Karthik Kambatla Assignee: Karthik Kambatla Priority: Blocker Attachments: yarn-3762-1.patch, yarn-3762-1.patch In our testing, we ran into the following ConcurrentModificationException: {noformat} halxg.cloudera.com:8042, nodeRackName/rackvb07, nodeNumContainers0 15/05/22 13:02:22 INFO distributedshell.Client: Queue info, queueName=root.testyarnpool3, queueCurrentCapacity=0.0, queueMaxCapacity=-1.0, queueApplicationCount=0, queueChildQueueCount=0 15/05/22 13:02:22 FATAL distributedshell.Client: Error running Client java.util.ConcurrentModificationException: java.util.ConcurrentModificationException at java.util.ArrayList$Itr.checkForComodification(ArrayList.java:901) at java.util.ArrayList$Itr.next(ArrayList.java:851) at org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FSParentQueue.getQueueUserAclInfo(FSParentQueue.java:155) at org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler.getQueueUserAclInfo(FairScheduler.java:1395) at org.apache.hadoop.yarn.server.resourcemanager.ClientRMService.getQueueUserAcls(ClientRMService.java:880) {noformat} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-3762) FairScheduler: CME on FSParentQueue#getQueueUserAclInfo
[ https://issues.apache.org/jira/browse/YARN-3762?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14569992#comment-14569992 ] Karthik Kambatla commented on YARN-3762: Changed it to critical and targeting 2.8.0, as it only fails the application and not the RM. FairScheduler: CME on FSParentQueue#getQueueUserAclInfo --- Key: YARN-3762 URL: https://issues.apache.org/jira/browse/YARN-3762 Project: Hadoop YARN Issue Type: Bug Components: fairscheduler Affects Versions: 2.7.0 Reporter: Karthik Kambatla Assignee: Karthik Kambatla Priority: Critical Attachments: yarn-3762-1.patch, yarn-3762-1.patch In our testing, we ran into the following ConcurrentModificationException: {noformat} halxg.cloudera.com:8042, nodeRackName/rackvb07, nodeNumContainers0 15/05/22 13:02:22 INFO distributedshell.Client: Queue info, queueName=root.testyarnpool3, queueCurrentCapacity=0.0, queueMaxCapacity=-1.0, queueApplicationCount=0, queueChildQueueCount=0 15/05/22 13:02:22 FATAL distributedshell.Client: Error running Client java.util.ConcurrentModificationException: java.util.ConcurrentModificationException at java.util.ArrayList$Itr.checkForComodification(ArrayList.java:901) at java.util.ArrayList$Itr.next(ArrayList.java:851) at org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FSParentQueue.getQueueUserAclInfo(FSParentQueue.java:155) at org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler.getQueueUserAclInfo(FairScheduler.java:1395) at org.apache.hadoop.yarn.server.resourcemanager.ClientRMService.getQueueUserAcls(ClientRMService.java:880) {noformat} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (YARN-3762) FairScheduler: CME on FSParentQueue#getQueueUserAclInfo
[ https://issues.apache.org/jira/browse/YARN-3762?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Karthik Kambatla updated YARN-3762: --- Priority: Critical (was: Blocker) Target Version/s: 2.8.0 (was: 2.7.1) FairScheduler: CME on FSParentQueue#getQueueUserAclInfo --- Key: YARN-3762 URL: https://issues.apache.org/jira/browse/YARN-3762 Project: Hadoop YARN Issue Type: Bug Components: fairscheduler Affects Versions: 2.7.0 Reporter: Karthik Kambatla Assignee: Karthik Kambatla Priority: Critical Attachments: yarn-3762-1.patch, yarn-3762-1.patch In our testing, we ran into the following ConcurrentModificationException: {noformat} halxg.cloudera.com:8042, nodeRackName/rackvb07, nodeNumContainers0 15/05/22 13:02:22 INFO distributedshell.Client: Queue info, queueName=root.testyarnpool3, queueCurrentCapacity=0.0, queueMaxCapacity=-1.0, queueApplicationCount=0, queueChildQueueCount=0 15/05/22 13:02:22 FATAL distributedshell.Client: Error running Client java.util.ConcurrentModificationException: java.util.ConcurrentModificationException at java.util.ArrayList$Itr.checkForComodification(ArrayList.java:901) at java.util.ArrayList$Itr.next(ArrayList.java:851) at org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FSParentQueue.getQueueUserAclInfo(FSParentQueue.java:155) at org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler.getQueueUserAclInfo(FairScheduler.java:1395) at org.apache.hadoop.yarn.server.resourcemanager.ClientRMService.getQueueUserAcls(ClientRMService.java:880) {noformat} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-3725) App submission via REST API is broken in secure mode due to Timeline DT service address is empty
[ https://issues.apache.org/jira/browse/YARN-3725?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14570007#comment-14570007 ] Zhijie Shen commented on YARN-3725: --- bq. is there a JIRA for the longer term fix? Yeah, I've filed YARN-3761 previously. App submission via REST API is broken in secure mode due to Timeline DT service address is empty Key: YARN-3725 URL: https://issues.apache.org/jira/browse/YARN-3725 Project: Hadoop YARN Issue Type: Bug Components: resourcemanager, timelineserver Affects Versions: 2.7.0 Reporter: Zhijie Shen Assignee: Zhijie Shen Priority: Blocker Fix For: 2.7.1 Attachments: YARN-3725.1.patch YARN-2971 changes TimelineClient to use the service address from the Timeline DT to renew the DT instead of the configured address. This breaks the procedure of submitting a YARN app via the REST API in secure mode. The problem is that the service address is set by the client instead of the server in Java code. The REST API response is an encoded token String, such that it is inconvenient to deserialize it, set the service address, and serialize it again. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (YARN-3534) Collect memory/cpu usage on the node
[ https://issues.apache.org/jira/browse/YARN-3534?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Inigo Goiri updated YARN-3534: -- Attachment: YARN-3534-10.patch Addressed some review comments Collect memory/cpu usage on the node Key: YARN-3534 URL: https://issues.apache.org/jira/browse/YARN-3534 Project: Hadoop YARN Issue Type: Sub-task Components: nodemanager, resourcemanager Affects Versions: 2.7.0 Reporter: Inigo Goiri Assignee: Inigo Goiri Attachments: YARN-3534-1.patch, YARN-3534-10.patch, YARN-3534-2.patch, YARN-3534-3.patch, YARN-3534-3.patch, YARN-3534-4.patch, YARN-3534-5.patch, YARN-3534-6.patch, YARN-3534-7.patch, YARN-3534-8.patch, YARN-3534-9.patch Original Estimate: 336h Remaining Estimate: 336h YARN should be aware of the resource utilization of the nodes when scheduling containers. For this, this task will implement the collection of memory/cpu usage on the node. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
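As a rough, Linux-only illustration of the kind of node-level sampling this JIRA describes (a hedged sketch, not the YARN-3534 patch itself), memory figures can be read directly from /proc/meminfo:
{code}
import java.io.BufferedReader;
import java.io.FileReader;
import java.io.IOException;

public class NodeMemorySampler {
  // Returns the value (in kB) of a /proc/meminfo key such as "MemTotal" or "MemFree",
  // or -1 if the key is not present.
  static long readMeminfoKb(String key) throws IOException {
    try (BufferedReader reader = new BufferedReader(new FileReader("/proc/meminfo"))) {
      String line;
      while ((line = reader.readLine()) != null) {
        String[] parts = line.split("\\s+"); // e.g. "MemTotal:  16303508 kB"
        if (parts.length >= 2 && parts[0].equals(key + ":")) {
          return Long.parseLong(parts[1]);
        }
      }
    }
    return -1;
  }
}
{code}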
[jira] [Commented] (YARN-3762) FairScheduler: CME on FSParentQueue#getQueueUserAclInfo
[ https://issues.apache.org/jira/browse/YARN-3762?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14570082#comment-14570082 ] Hadoop QA commented on YARN-3762: - \\ \\ | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:red}-1{color} | pre-patch | 15m 28s | Findbugs (version ) appears to be broken on trunk. | | {color:green}+1{color} | @author | 0m 0s | The patch does not contain any @author tags. | | {color:red}-1{color} | tests included | 0m 0s | The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. | | {color:green}+1{color} | javac | 7m 53s | There were no new javac warning messages. | | {color:green}+1{color} | javadoc | 9m 51s | There were no new javadoc warning messages. | | {color:green}+1{color} | release audit | 0m 23s | The applied patch does not increase the total number of release audit warnings. | | {color:green}+1{color} | checkstyle | 0m 32s | There were no new checkstyle issues. | | {color:green}+1{color} | whitespace | 0m 1s | The patch has no lines that end in whitespace. | | {color:green}+1{color} | install | 1m 37s | mvn install still works. | | {color:green}+1{color} | eclipse:eclipse | 0m 34s | The patch built with eclipse:eclipse. | | {color:green}+1{color} | findbugs | 1m 28s | The patch does not introduce any new Findbugs (version 3.0.0) warnings. | | {color:green}+1{color} | yarn tests | 50m 25s | Tests passed in hadoop-yarn-server-resourcemanager. | | | | 88m 17s | | \\ \\ || Subsystem || Report/Notes || | Patch URL | http://issues.apache.org/jira/secure/attachment/12737043/yarn-3762-1.patch | | Optional Tests | javadoc javac unit findbugs checkstyle | | git revision | trunk / c1d50a9 | | hadoop-yarn-server-resourcemanager test log | https://builds.apache.org/job/PreCommit-YARN-Build/8170/artifact/patchprocess/testrun_hadoop-yarn-server-resourcemanager.txt | | Test Results | https://builds.apache.org/job/PreCommit-YARN-Build/8170/testReport/ | | Java | 1.7.0_55 | | uname | Linux asf903.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux | | Console output | https://builds.apache.org/job/PreCommit-YARN-Build/8170/console | This message was automatically generated. 
FairScheduler: CME on FSParentQueue#getQueueUserAclInfo --- Key: YARN-3762 URL: https://issues.apache.org/jira/browse/YARN-3762 Project: Hadoop YARN Issue Type: Bug Components: fairscheduler Affects Versions: 2.7.0 Reporter: Karthik Kambatla Assignee: Karthik Kambatla Priority: Critical Attachments: yarn-3762-1.patch, yarn-3762-1.patch In our testing, we ran into the following ConcurrentModificationException: {noformat} halxg.cloudera.com:8042, nodeRackName/rackvb07, nodeNumContainers0 15/05/22 13:02:22 INFO distributedshell.Client: Queue info, queueName=root.testyarnpool3, queueCurrentCapacity=0.0, queueMaxCapacity=-1.0, queueApplicationCount=0, queueChildQueueCount=0 15/05/22 13:02:22 FATAL distributedshell.Client: Error running Client java.util.ConcurrentModificationException: java.util.ConcurrentModificationException at java.util.ArrayList$Itr.checkForComodification(ArrayList.java:901) at java.util.ArrayList$Itr.next(ArrayList.java:851) at org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FSParentQueue.getQueueUserAclInfo(FSParentQueue.java:155) at org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler.getQueueUserAclInfo(FairScheduler.java:1395) at org.apache.hadoop.yarn.server.resourcemanager.ClientRMService.getQueueUserAcls(ClientRMService.java:880) {noformat} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (YARN-3762) FairScheduler: CME on FSParentQueue#getQueueUserAclInfo
Karthik Kambatla created YARN-3762: -- Summary: FairScheduler: CME on FSParentQueue#getQueueUserAclInfo Key: YARN-3762 URL: https://issues.apache.org/jira/browse/YARN-3762 Project: Hadoop YARN Issue Type: Bug Components: fairscheduler Affects Versions: 2.7.0 Reporter: Karthik Kambatla Assignee: Karthik Kambatla Priority: Blocker In our testing, we ran into the following ConcurrentModificationException: {noformat} halxg.cloudera.com:8042, nodeRackName/rackvb07, nodeNumContainers0 15/05/22 13:02:22 INFO distributedshell.Client: Queue info, queueName=root.testyarnpool3, queueCurrentCapacity=0.0, queueMaxCapacity=-1.0, queueApplicationCount=0, queueChildQueueCount=0 15/05/22 13:02:22 FATAL distributedshell.Client: Error running Client java.util.ConcurrentModificationException: java.util.ConcurrentModificationException at java.util.ArrayList$Itr.checkForComodification(ArrayList.java:901) at java.util.ArrayList$Itr.next(ArrayList.java:851) at org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FSParentQueue.getQueueUserAclInfo(FSParentQueue.java:155) at org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler.getQueueUserAclInfo(FairScheduler.java:1395) at org.apache.hadoop.yarn.server.resourcemanager.ClientRMService.getQueueUserAcls(ClientRMService.java:880) {noformat} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-3733) DominantRC#compare() does not work as expected if cluster resource is empty
[ https://issues.apache.org/jira/browse/YARN-3733?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14570025#comment-14570025 ] Wangda Tan commented on YARN-3733: -- Took a look at the patch and discussion. Thanks for working on this [~rohithsharma]. I think the approach [~sunilg] mentioned at https://issues.apache.org/jira/browse/YARN-3733?focusedCommentId=14568880page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-14568880 makes sense to me. If the clusterResource is 0, we can compare the individual resource types. It could be:
{code}
Returns >: when l.mem > right.mem || l.cpu > right.cpu
Returns =: when (l.mem >= right.mem && l.cpu <= right.cpu) || (l.mem <= right.mem && l.cpu >= right.cpu)
Returns <: when l.mem < right.mem || l.cpu < right.cpu
{code}
This produces the same result as the INF approach in the patch, but can also compare when both l/r have 0 values. The reason I prefer this is: I'm sure the patch can solve the am-resource-percent problem, but with the suggested approach we can make sure we get a more reasonable result if we need to compare non-zero resources when clusterResource is zero (for example, sorting applications by their requirements when clusterResource is zero). And to avoid future regression, could you add a test to verify the am-resource-limit problem is solved? DominantRC#compare() does not work as expected if cluster resource is empty --- Key: YARN-3733 URL: https://issues.apache.org/jira/browse/YARN-3733 Project: Hadoop YARN Issue Type: Bug Components: resourcemanager Affects Versions: 2.7.0 Environment: Suse 11 Sp3 , 2 NM , 2 RM one NM - 3 GB 6 v core Reporter: Bibin A Chundatt Assignee: Rohith Priority: Blocker Attachments: 0001-YARN-3733.patch, YARN-3733.patch Steps to reproduce = 1. Install HA with 2 RM 2 NM (3072 MB * 2 total cluster) 2. Configure map and reduce size to 512 MB after changing scheduler minimum size to 512 MB 3. Configure capacity scheduler and AM limit to .5 (DominantResourceCalculator is configured) 4. Submit 30 concurrent task 5. Switch RM Actual = For 12 Jobs AM gets allocated and all 12 starts running No other Yarn child is initiated , *all 12 Jobs in Running state for ever* Expected === Only 6 should be running at a time since max AM allocated is .5 (3072 MB) -- This message was sent by Atlassian JIRA (v6.3.4#6332)
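One possible reading of the per-resource-type fallback above, expressed as a hedged sketch (the field names and the tie-breaking order are assumptions, not the actual DominantResourceCalculator change):
{code}
public final class EmptyClusterResourceComparison {
  // Returns 1 if l dominates r, -1 if r dominates l, and 0 when they are equal or
  // each side is larger on a different resource type.
  static int compare(long lMem, long lCpu, long rMem, long rCpu) {
    boolean lGreater = lMem > rMem || lCpu > rCpu;
    boolean rGreater = rMem > lMem || rCpu > lCpu;
    if (lGreater && !rGreater) {
      return 1;  // l is >= on both types and strictly greater on at least one
    }
    if (rGreater && !lGreater) {
      return -1; // r is >= on both types and strictly greater on at least one
    }
    return 0;
  }
}
{code}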
[jira] [Commented] (YARN-2194) Cgroups cease to work in RHEL7
[ https://issues.apache.org/jira/browse/YARN-2194?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14569899#comment-14569899 ] Philip Langdale commented on YARN-2194: --- You can remount controllers if you retain the same combination as the existing mount point, so I guess you could replace the ',' with something your parsing code can handle (or you could fix the parsing code). In general, life is a lot easier if you can avoid remounting as you then don't have to worry about managing their lifecycle. I'd argue the most robust thing to do is discover the existing mount point from /proc/mounts and then use it (assuming the comma parsing can be fixed) if it's present (and don't forget to respect the NodeManager's cgroup paths from /proc/self/mounts) Cgroups cease to work in RHEL7 -- Key: YARN-2194 URL: https://issues.apache.org/jira/browse/YARN-2194 Project: Hadoop YARN Issue Type: Bug Components: nodemanager Affects Versions: 2.7.0 Reporter: Wei Yan Assignee: Wei Yan Priority: Critical Attachments: YARN-2194-1.patch, YARN-2194-2.patch, YARN-2194-3.patch In RHEL7, the CPU controller is named cpu,cpuacct. The comma in the controller name leads to container launch failure. RHEL7 deprecates libcgroup and recommends the user of systemd. However, systemd has certain shortcomings as identified in this JIRA (see comments). This JIRA only fixes the failure, and doesn't try to use systemd. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-2392) add more diags about app retry limits on AM failures
[ https://issues.apache.org/jira/browse/YARN-2392?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14569869#comment-14569869 ] Hadoop QA commented on YARN-2392: - \\ \\ | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | pre-patch | 16m 23s | Pre-patch trunk compilation is healthy. | | {color:green}+1{color} | @author | 0m 0s | The patch does not contain any @author tags. | | {color:red}-1{color} | tests included | 0m 0s | The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. | | {color:green}+1{color} | javac | 9m 25s | There were no new javac warning messages. | | {color:green}+1{color} | javadoc | 11m 23s | There were no new javadoc warning messages. | | {color:green}+1{color} | release audit | 0m 24s | The applied patch does not increase the total number of release audit warnings. | | {color:red}-1{color} | checkstyle | 0m 56s | The applied patch generated 2 new checkstyle issues (total was 244, now 245). | | {color:green}+1{color} | whitespace | 0m 0s | The patch has no lines that end in whitespace. | | {color:green}+1{color} | install | 1m 46s | mvn install still works. | | {color:green}+1{color} | eclipse:eclipse | 0m 35s | The patch built with eclipse:eclipse. | | {color:green}+1{color} | findbugs | 1m 42s | The patch does not introduce any new Findbugs (version 3.0.0) warnings. | | {color:green}+1{color} | yarn tests | 52m 1s | Tests passed in hadoop-yarn-server-resourcemanager. | | | | 94m 38s | | \\ \\ || Subsystem || Report/Notes || | Patch URL | http://issues.apache.org/jira/secure/attachment/12737003/YARN-2392-002.patch | | Optional Tests | javadoc javac unit findbugs checkstyle | | git revision | trunk / 03fb5c6 | | checkstyle | https://builds.apache.org/job/PreCommit-YARN-Build/8169/artifact/patchprocess/diffcheckstylehadoop-yarn-server-resourcemanager.txt | | hadoop-yarn-server-resourcemanager test log | https://builds.apache.org/job/PreCommit-YARN-Build/8169/artifact/patchprocess/testrun_hadoop-yarn-server-resourcemanager.txt | | Test Results | https://builds.apache.org/job/PreCommit-YARN-Build/8169/testReport/ | | Java | 1.7.0_55 | | uname | Linux asf903.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux | | Console output | https://builds.apache.org/job/PreCommit-YARN-Build/8169/console | This message was automatically generated. add more diags about app retry limits on AM failures Key: YARN-2392 URL: https://issues.apache.org/jira/browse/YARN-2392 Project: Hadoop YARN Issue Type: Sub-task Components: resourcemanager Affects Versions: 2.6.0 Reporter: Steve Loughran Assignee: Steve Loughran Priority: Minor Attachments: YARN-2392-001.patch, YARN-2392-002.patch, YARN-2392-002.patch # when an app fails the failure count is shown, but not what the global + local limits are. If the two are different, they should both be printed. # the YARN-2242 strings don't have enough whitespace between text and the URL -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-3725) App submission via REST API is broken in secure mode due to Timeline DT service address is empty
[ https://issues.apache.org/jira/browse/YARN-3725?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14569918#comment-14569918 ] Vinod Kumar Vavilapalli commented on YARN-3725: --- [~zjshen], is there a JIRA for the longer term fix? App submission via REST API is broken in secure mode due to Timeline DT service address is empty Key: YARN-3725 URL: https://issues.apache.org/jira/browse/YARN-3725 Project: Hadoop YARN Issue Type: Bug Components: resourcemanager, timelineserver Affects Versions: 2.7.0 Reporter: Zhijie Shen Assignee: Zhijie Shen Priority: Blocker Fix For: 2.7.1 Attachments: YARN-3725.1.patch YARN-2971 changes TimelineClient to use the service address from the Timeline DT to renew the DT instead of the configured address. This breaks the procedure of submitting a YARN app via the REST API in secure mode. The problem is that the service address is set by the client instead of the server in Java code. The REST API response is an encoded token String, such that it is inconvenient to deserialize it, set the service address, and serialize it again. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-3749) We should make a copy of configuration when init MiniYARNCluster with multiple RMs
[ https://issues.apache.org/jira/browse/YARN-3749?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14570202#comment-14570202 ] Chun Chen commented on YARN-3749: - Thanks for reviewing the patch, [~zxu] ! We should make a copy of configuration when init MiniYARNCluster with multiple RMs -- Key: YARN-3749 URL: https://issues.apache.org/jira/browse/YARN-3749 Project: Hadoop YARN Issue Type: Bug Reporter: Chun Chen Assignee: Chun Chen Attachments: YARN-3749.2.patch, YARN-3749.3.patch, YARN-3749.4.patch, YARN-3749.5.patch, YARN-3749.6.patch, YARN-3749.7.patch, YARN-3749.patch When I was trying to write a test case for YARN-2674, I found DS client trying to connect to both rm1 and rm2 with the same address 0.0.0.0:18032 when RM failover. But I initially set yarn.resourcemanager.address.rm1=0.0.0.0:18032, yarn.resourcemanager.address.rm2=0.0.0.0:28032 After digging, I found it is in ClientRMService where the value of yarn.resourcemanager.address.rm2 changed to 0.0.0.0:18032. See the following code in ClientRMService: {code} clientBindAddress = conf.updateConnectAddr(YarnConfiguration.RM_BIND_HOST, YarnConfiguration.RM_ADDRESS, YarnConfiguration.DEFAULT_RM_ADDRESS, server.getListenerAddress()); {code} Since we use the same instance of configuration in rm1 and rm2 and init both RM before we start both RM, we will change yarn.resourcemanager.ha.id to rm2 during init of rm2 and yarn.resourcemanager.ha.id will become rm2 during starting of rm1. So I think it is safe to make a copy of configuration when init both of the rm. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
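A minimal sketch of the fix described above, with hypothetical helper names (the actual MiniYARNCluster patch may differ): give each RM its own copy of the configuration, so that {{updateConnectAddr()}} running in one RM cannot rewrite the address configured for the other.
{code}
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.yarn.conf.YarnConfiguration;

public class PerRmConfig {
  // Clone the base configuration once per RM; mutations made by one RM's services
  // (e.g. updateConnectAddr) then stay local to that RM's own copy.
  static Configuration[] configsForRms(Configuration base, int numRMs) {
    Configuration[] perRm = new Configuration[numRMs];
    for (int i = 0; i < numRMs; i++) {
      perRm[i] = new YarnConfiguration(new Configuration(base));
      perRm[i].set(YarnConfiguration.RM_HA_ID, "rm" + (i + 1));
    }
    return perRm;
  }
}
{code}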
[jira] [Assigned] (YARN-3558) Additional containers getting reserved from RM in case of Fair scheduler
[ https://issues.apache.org/jira/browse/YARN-3558?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sunil G reassigned YARN-3558: - Assignee: Sunil G Additional containers getting reserved from RM in case of Fair scheduler Key: YARN-3558 URL: https://issues.apache.org/jira/browse/YARN-3558 Project: Hadoop YARN Issue Type: Bug Components: fairscheduler, resourcemanager Affects Versions: 2.7.0 Environment: OS :Suse 11 Sp3 Setup : 2 RM 2 NM Scheduler : Fair scheduler Reporter: Bibin A Chundatt Assignee: Sunil G Attachments: Amlog.txt, rm.log Submit PI job with 16 maps Total container expected : 16 MAPS + 1 Reduce + 1 AM Total containers reserved by RM is 21 Below set of containers are not being used for execution container_1430213948957_0001_01_20 container_1430213948957_0001_01_19 RM Containers reservation and states {code} Processing container_1430213948957_0001_01_01 of type START Processing container_1430213948957_0001_01_01 of type ACQUIRED Processing container_1430213948957_0001_01_01 of type LAUNCHED Processing container_1430213948957_0001_01_02 of type START Processing container_1430213948957_0001_01_03 of type START Processing container_1430213948957_0001_01_02 of type ACQUIRED Processing container_1430213948957_0001_01_03 of type ACQUIRED Processing container_1430213948957_0001_01_04 of type START Processing container_1430213948957_0001_01_05 of type START Processing container_1430213948957_0001_01_04 of type ACQUIRED Processing container_1430213948957_0001_01_05 of type ACQUIRED Processing container_1430213948957_0001_01_02 of type LAUNCHED Processing container_1430213948957_0001_01_04 of type LAUNCHED Processing container_1430213948957_0001_01_06 of type RESERVED Processing container_1430213948957_0001_01_03 of type LAUNCHED Processing container_1430213948957_0001_01_05 of type LAUNCHED Processing container_1430213948957_0001_01_07 of type START Processing container_1430213948957_0001_01_07 of type ACQUIRED Processing container_1430213948957_0001_01_07 of type LAUNCHED Processing container_1430213948957_0001_01_08 of type RESERVED Processing container_1430213948957_0001_01_02 of type FINISHED Processing container_1430213948957_0001_01_06 of type START Processing container_1430213948957_0001_01_06 of type ACQUIRED Processing container_1430213948957_0001_01_06 of type LAUNCHED Processing container_1430213948957_0001_01_04 of type FINISHED Processing container_1430213948957_0001_01_09 of type START Processing container_1430213948957_0001_01_09 of type ACQUIRED Processing container_1430213948957_0001_01_09 of type LAUNCHED Processing container_1430213948957_0001_01_10 of type RESERVED Processing container_1430213948957_0001_01_03 of type FINISHED Processing container_1430213948957_0001_01_08 of type START Processing container_1430213948957_0001_01_08 of type ACQUIRED Processing container_1430213948957_0001_01_08 of type LAUNCHED Processing container_1430213948957_0001_01_05 of type FINISHED Processing container_1430213948957_0001_01_11 of type START Processing container_1430213948957_0001_01_11 of type ACQUIRED Processing container_1430213948957_0001_01_11 of type LAUNCHED Processing container_1430213948957_0001_01_07 of type FINISHED Processing container_1430213948957_0001_01_12 of type START Processing container_1430213948957_0001_01_12 of type ACQUIRED Processing container_1430213948957_0001_01_12 of type LAUNCHED Processing container_1430213948957_0001_01_13 of type RESERVED Processing container_1430213948957_0001_01_06 of type FINISHED Processing 
container_1430213948957_0001_01_10 of type START Processing container_1430213948957_0001_01_10 of type ACQUIRED Processing container_1430213948957_0001_01_10 of type LAUNCHED Processing container_1430213948957_0001_01_09 of type FINISHED Processing container_1430213948957_0001_01_14 of type START Processing container_1430213948957_0001_01_14 of type ACQUIRED Processing container_1430213948957_0001_01_14 of type LAUNCHED Processing container_1430213948957_0001_01_15 of type RESERVED Processing container_1430213948957_0001_01_08 of type FINISHED Processing container_1430213948957_0001_01_13 of type START Processing container_1430213948957_0001_01_16 of type RESERVED Processing container_1430213948957_0001_01_13 of type ACQUIRED Processing container_1430213948957_0001_01_13 of type LAUNCHED Processing container_1430213948957_0001_01_11 of type FINISHED
[jira] [Commented] (YARN-3044) [Event producers] Implement RM writing app lifecycle events to ATS
[ https://issues.apache.org/jira/browse/YARN-3044?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14570171#comment-14570171 ] Zhijie Shen commented on YARN-3044: --- [~Naganarasimha], I'm fine with the last patch. Will do some local test. However, the patch doesn't apply because of YARN-1462. I think we need to add tag info for v2 publisher too. Would you mind taking care of it? [Event producers] Implement RM writing app lifecycle events to ATS -- Key: YARN-3044 URL: https://issues.apache.org/jira/browse/YARN-3044 Project: Hadoop YARN Issue Type: Sub-task Components: timelineserver Reporter: Sangjin Lee Assignee: Naganarasimha G R Attachments: YARN-3044-YARN-2928.004.patch, YARN-3044-YARN-2928.005.patch, YARN-3044-YARN-2928.006.patch, YARN-3044-YARN-2928.007.patch, YARN-3044-YARN-2928.008.patch, YARN-3044-YARN-2928.009.patch, YARN-3044.20150325-1.patch, YARN-3044.20150406-1.patch, YARN-3044.20150416-1.patch Per design in YARN-2928, implement RM writing app lifecycle events to ATS. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-3755) Log the command of launching containers
[ https://issues.apache.org/jira/browse/YARN-3755?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14570275#comment-14570275 ] Jeff Zhang commented on YARN-3755: -- bq. How about we let individual frameworks like MapReduce/Tez log them as needed? That seems like the right place for debugging too - app developers don't always get access to the daemon logs. Makes sense. Log the command of launching containers --- Key: YARN-3755 URL: https://issues.apache.org/jira/browse/YARN-3755 Project: Hadoop YARN Issue Type: Improvement Affects Versions: 2.7.0 Reporter: Jeff Zhang Assignee: Jeff Zhang Attachments: YARN-3755-1.patch, YARN-3755-2.patch In the resource manager log, YARN logs the command for launching the AM, which is very useful. But there's no such log in the NM log for launching containers. It would be difficult to diagnose when containers fail to launch due to some issue in the commands. Although users can look at the commands in the container launch script file, that is an internal detail of YARN that users usually don't know about. From the user's perspective, they only know what commands they specified when building the YARN application. {code} 2015-06-01 16:06:42,245 INFO org.apache.hadoop.yarn.server.resourcemanager.amlauncher.AMLauncher: Command to launch container container_1433145984561_0001_01_01 : $JAVA_HOME/bin/java -server -Djava.net.preferIPv4Stack=true -Dhadoop.metrics.log.level=WARN -Xmx1024m -Dlog4j.configuratorClass=org.apache.tez.common.TezLog4jConfigurator -Dlog4j.configuration=tez-container-log4j.properties -Dyarn.app.container.log.dir=LOG_DIR -Dtez.root.logger=info,CLA -Dsun.nio.ch.bugLevel='' org.apache.tez.dag.app.DAGAppMaster 1LOG_DIR/stdout 2LOG_DIR/stderr {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-3755) Log the command of launching containers
[ https://issues.apache.org/jira/browse/YARN-3755?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14570276#comment-14570276 ] Jeff Zhang commented on YARN-3755: -- Closing it as won't fix. Log the command of launching containers --- Key: YARN-3755 URL: https://issues.apache.org/jira/browse/YARN-3755 Project: Hadoop YARN Issue Type: Improvement Affects Versions: 2.7.0 Reporter: Jeff Zhang Assignee: Jeff Zhang Attachments: YARN-3755-1.patch, YARN-3755-2.patch In the resource manager log, YARN logs the command for launching the AM, which is very useful. But there's no such log in the NM log for launching containers. It would be difficult to diagnose when containers fail to launch due to some issue in the commands. Although users can look at the commands in the container launch script file, that is an internal detail of YARN that users usually don't know about. From the user's perspective, they only know what commands they specified when building the YARN application. {code} 2015-06-01 16:06:42,245 INFO org.apache.hadoop.yarn.server.resourcemanager.amlauncher.AMLauncher: Command to launch container container_1433145984561_0001_01_01 : $JAVA_HOME/bin/java -server -Djava.net.preferIPv4Stack=true -Dhadoop.metrics.log.level=WARN -Xmx1024m -Dlog4j.configuratorClass=org.apache.tez.common.TezLog4jConfigurator -Dlog4j.configuration=tez-container-log4j.properties -Dyarn.app.container.log.dir=LOG_DIR -Dtez.root.logger=info,CLA -Dsun.nio.ch.bugLevel='' org.apache.tez.dag.app.DAGAppMaster 1LOG_DIR/stdout 2LOG_DIR/stderr {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-2194) Cgroups cease to work in RHEL7
[ https://issues.apache.org/jira/browse/YARN-2194?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14570284#comment-14570284 ] Sidharta Seethana commented on YARN-2194: - [~mjacobs] , Yes, that is what I am proposing. If we handle the path separation correctly, we should be able to continue using the current (deprecated, but still workable) mechanism for using cgroups. Cgroups cease to work in RHEL7 -- Key: YARN-2194 URL: https://issues.apache.org/jira/browse/YARN-2194 Project: Hadoop YARN Issue Type: Bug Components: nodemanager Affects Versions: 2.7.0 Reporter: Wei Yan Assignee: Wei Yan Priority: Critical Attachments: YARN-2194-1.patch, YARN-2194-2.patch, YARN-2194-3.patch In RHEL7, the CPU controller is named cpu,cpuacct. The comma in the controller name leads to container launch failure. RHEL7 deprecates libcgroup and recommends the user of systemd. However, systemd has certain shortcomings as identified in this JIRA (see comments). This JIRA only fixes the failure, and doesn't try to use systemd. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (YARN-3763) Support for fuzzy search in ATS
Jeff Zhang created YARN-3763: Summary: Support for fuzzy search in ATS Key: YARN-3763 URL: https://issues.apache.org/jira/browse/YARN-3763 Project: Hadoop YARN Issue Type: Improvement Components: timelineserver Affects Versions: 2.7.0 Reporter: Jeff Zhang Currently ATS only supports exact match. Sometimes fuzzy match may be helpful when the entities in the ATS have some common prefix or suffix. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (YARN-3763) Support fuzzy search in ATS
[ https://issues.apache.org/jira/browse/YARN-3763?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jeff Zhang updated YARN-3763: - Summary: Support fuzzy search in ATS (was: Support for fuzzy search in ATS) Support fuzzy search in ATS --- Key: YARN-3763 URL: https://issues.apache.org/jira/browse/YARN-3763 Project: Hadoop YARN Issue Type: Improvement Components: timelineserver Affects Versions: 2.7.0 Reporter: Jeff Zhang Currently ATS only supports exact match. Sometimes fuzzy match may be helpful when the entities in the ATS have some common prefix or suffix. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (YARN-3763) Support fuzzy search in ATS
[ https://issues.apache.org/jira/browse/YARN-3763?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jeff Zhang updated YARN-3763: - Description: Currently ATS only supports exact match. Sometimes fuzzy match may be helpful when the entities in the ATS have some common prefix or suffix. Link with TEZ-2531 (was: Currently ATS only supports exact match. Sometimes fuzzy match may be helpful when the entities in the ATS have some common prefix or suffix. ) Support fuzzy search in ATS --- Key: YARN-3763 URL: https://issues.apache.org/jira/browse/YARN-3763 Project: Hadoop YARN Issue Type: Improvement Components: timelineserver Affects Versions: 2.7.0 Reporter: Jeff Zhang Currently ATS only supports exact match. Sometimes fuzzy match may be helpful when the entities in the ATS have some common prefix or suffix. Link with TEZ-2531 -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-3733) DominantRC#compare() does not work as expected if cluster resource is empty
[ https://issues.apache.org/jira/browse/YARN-3733?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14568913#comment-14568913 ] Hadoop QA commented on YARN-3733: - \\ \\ | (/) *{color:green}+1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | pre-patch | 16m 6s | Pre-patch trunk compilation is healthy. | | {color:green}+1{color} | @author | 0m 0s | The patch does not contain any @author tags. | | {color:green}+1{color} | tests included | 0m 0s | The patch appears to include 1 new or modified test files. | | {color:green}+1{color} | javac | 7m 33s | There were no new javac warning messages. | | {color:green}+1{color} | javadoc | 9m 36s | There were no new javadoc warning messages. | | {color:green}+1{color} | release audit | 0m 22s | The applied patch does not increase the total number of release audit warnings. | | {color:green}+1{color} | checkstyle | 0m 54s | There were no new checkstyle issues. | | {color:green}+1{color} | whitespace | 0m 0s | The patch has no lines that end in whitespace. | | {color:green}+1{color} | install | 1m 33s | mvn install still works. | | {color:green}+1{color} | eclipse:eclipse | 0m 32s | The patch built with eclipse:eclipse. | | {color:green}+1{color} | findbugs | 1m 33s | The patch does not introduce any new Findbugs (version 3.0.0) warnings. | | {color:green}+1{color} | yarn tests | 1m 57s | Tests passed in hadoop-yarn-common. | | | | 40m 10s | | \\ \\ || Subsystem || Report/Notes || | Patch URL | http://issues.apache.org/jira/secure/attachment/12736802/0001-YARN-3733.patch | | Optional Tests | javadoc javac unit findbugs checkstyle | | git revision | trunk / 990078b | | hadoop-yarn-common test log | https://builds.apache.org/job/PreCommit-YARN-Build/8166/artifact/patchprocess/testrun_hadoop-yarn-common.txt | | Test Results | https://builds.apache.org/job/PreCommit-YARN-Build/8166/testReport/ | | Java | 1.7.0_55 | | uname | Linux asf905.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux | | Console output | https://builds.apache.org/job/PreCommit-YARN-Build/8166/console | This message was automatically generated. DominantRC#compare() does not work as expected if cluster resource is empty --- Key: YARN-3733 URL: https://issues.apache.org/jira/browse/YARN-3733 Project: Hadoop YARN Issue Type: Bug Components: resourcemanager Affects Versions: 2.7.0 Environment: Suse 11 Sp3 , 2 NM , 2 RM one NM - 3 GB 6 v core Reporter: Bibin A Chundatt Assignee: Rohith Priority: Blocker Attachments: 0001-YARN-3733.patch, YARN-3733.patch Steps to reproduce = 1. Install HA with 2 RM 2 NM (3072 MB * 2 total cluster) 2. Configure map and reduce size to 512 MB after changing scheduler minimum size to 512 MB 3. Configure capacity scheduler and AM limit to .5 (DominantResourceCalculator is configured) 4. Submit 30 concurrent task 5. Switch RM Actual = For 12 Jobs AM gets allocated and all 12 starts running No other Yarn child is initiated , *all 12 Jobs in Running state for ever* Expected === Only 6 should be running at a time since max AM allocated is .5 (3072 MB) -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-1462) AHS API and other AHS changes to handle tags for completed MR jobs
[ https://issues.apache.org/jira/browse/YARN-1462?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14568926#comment-14568926 ] Hudson commented on YARN-1462: -- FAILURE: Integrated in Hadoop-Yarn-trunk #946 (See [https://builds.apache.org/job/Hadoop-Yarn-trunk/946/]) YARN-1462. Correct fix version from branch-2.7.1 to branch-2.8 in (xgong: rev 0b5cfacde638bc25cc010cd9236369237b4e51a8) * hadoop-yarn-project/CHANGES.txt AHS API and other AHS changes to handle tags for completed MR jobs -- Key: YARN-1462 URL: https://issues.apache.org/jira/browse/YARN-1462 Project: Hadoop YARN Issue Type: Sub-task Affects Versions: 2.2.0 Reporter: Karthik Kambatla Assignee: Xuan Gong Fix For: 2.8.0 Attachments: YARN-1462-branch-2.7-1.2.patch, YARN-1462-branch-2.7-1.patch, YARN-1462.1.patch, YARN-1462.2.patch, YARN-1462.3.patch AHS related work for tags. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (YARN-3759) Include command line, localization info and env vars on AM launch failure
Steve Loughran created YARN-3759: Summary: Include command line, localization info and env vars on AM launch failure Key: YARN-3759 URL: https://issues.apache.org/jira/browse/YARN-3759 Project: Hadoop YARN Issue Type: Sub-task Components: nodemanager Affects Versions: 2.7.0 Reporter: Steve Loughran Priority: Minor While trying to diagnose AM launch failures, it's important to be able to get at the final, expanded {{CLASSPATH}} and other env variables. We don't get that today: you can log the unexpanded values on the client, and tweak NM ContainerExecutor log levels to DEBUG to get some of this, but you don't get it in the task logs, and tuning NM log level isn't viable on a large, busy cluster. Launch failures should include some env specifics: # list of env vars (ideally, full getenv values), with some stripping of sensitive options (I'm thinking AWS env vars here) # command line # path localisations These can go in the task logs, we don't need to include them in the application report. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
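As a sketch of item 1 in the list above (the redaction pattern and helper name are illustrative assumptions, not part of this issue): dump the launch environment while masking values whose names look sensitive, such as AWS credentials.
{code}
import java.util.Map;

// Illustrative only: log each env var, redacting values whose names suggest
// secrets. The regex below is an assumed example, not an agreed-upon list.
static void logEnv(Map<String, String> env) {
  for (Map.Entry<String, String> e : env.entrySet()) {
    boolean sensitive = e.getKey()
        .matches("(?i).*(SECRET|TOKEN|PASSWORD|AWS_ACCESS|AWS_SECRET).*");
    System.out.println(e.getKey() + "=" + (sensitive ? "<redacted>" : e.getValue()));
  }
}
{code}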
[jira] [Commented] (YARN-41) The RM should handle the graceful shutdown of the NM.
[ https://issues.apache.org/jira/browse/YARN-41?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14568947#comment-14568947 ] Junping Du commented on YARN-41: bq. Junping Du I have updated the patch with review comments. Can you have a look into this? Sorry for being late on this; I was traveling last week. I will review your latest patch today. The RM should handle the graceful shutdown of the NM. - Key: YARN-41 URL: https://issues.apache.org/jira/browse/YARN-41 Project: Hadoop YARN Issue Type: New Feature Components: nodemanager, resourcemanager Reporter: Ravi Teja Ch N V Assignee: Devaraj K Attachments: MAPREDUCE-3494.1.patch, MAPREDUCE-3494.2.patch, MAPREDUCE-3494.patch, YARN-41-1.patch, YARN-41-2.patch, YARN-41-3.patch, YARN-41-4.patch, YARN-41-5.patch, YARN-41-6.patch, YARN-41-7.patch, YARN-41-8.patch, YARN-41.patch Instead of waiting for the NM expiry, RM should remove and handle the NM, which is shutdown gracefully. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-1462) AHS API and other AHS changes to handle tags for completed MR jobs
[ https://issues.apache.org/jira/browse/YARN-1462?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14568920#comment-14568920 ] Hudson commented on YARN-1462: -- FAILURE: Integrated in Hadoop-Yarn-trunk-Java8 #216 (See [https://builds.apache.org/job/Hadoop-Yarn-trunk-Java8/216/]) YARN-1462. Correct fix version from branch-2.7.1 to branch-2.8 in (xgong: rev 0b5cfacde638bc25cc010cd9236369237b4e51a8) * hadoop-yarn-project/CHANGES.txt AHS API and other AHS changes to handle tags for completed MR jobs -- Key: YARN-1462 URL: https://issues.apache.org/jira/browse/YARN-1462 Project: Hadoop YARN Issue Type: Sub-task Affects Versions: 2.2.0 Reporter: Karthik Kambatla Assignee: Xuan Gong Fix For: 2.8.0 Attachments: YARN-1462-branch-2.7-1.2.patch, YARN-1462-branch-2.7-1.patch, YARN-1462.1.patch, YARN-1462.2.patch, YARN-1462.3.patch AHS related work for tags. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-1462) AHS API and other AHS changes to handle tags for completed MR jobs
[ https://issues.apache.org/jira/browse/YARN-1462?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14569171#comment-14569171 ] Hudson commented on YARN-1462: -- FAILURE: Integrated in Hadoop-Hdfs-trunk #2144 (See [https://builds.apache.org/job/Hadoop-Hdfs-trunk/2144/]) YARN-1462. Correct fix version from branch-2.7.1 to branch-2.8 in (xgong: rev 0b5cfacde638bc25cc010cd9236369237b4e51a8) * hadoop-yarn-project/CHANGES.txt AHS API and other AHS changes to handle tags for completed MR jobs -- Key: YARN-1462 URL: https://issues.apache.org/jira/browse/YARN-1462 Project: Hadoop YARN Issue Type: Sub-task Affects Versions: 2.2.0 Reporter: Karthik Kambatla Assignee: Xuan Gong Fix For: 2.8.0 Attachments: YARN-1462-branch-2.7-1.2.patch, YARN-1462-branch-2.7-1.patch, YARN-1462.1.patch, YARN-1462.2.patch, YARN-1462.3.patch AHS related work for tags. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (YARN-3733) DominantRC#compare() does not work as expected if cluster resource is empty
[ https://issues.apache.org/jira/browse/YARN-3733?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rohith updated YARN-3733: - Attachment: 0001-YARN-3733.patch The updated patch fixes the 2nd and 3rd scenarios in the above table (the scenarios of this issue) and refactors the test code. For an overall solution that also handles input combinations like the 4th and 5th from the above table, we need to explore further how to define the fraction and how to decide which resource is dominant. Any suggestions on this? DominantRC#compare() does not work as expected if cluster resource is empty --- Key: YARN-3733 URL: https://issues.apache.org/jira/browse/YARN-3733 Project: Hadoop YARN Issue Type: Bug Components: resourcemanager Affects Versions: 2.7.0 Environment: Suse 11 Sp3 , 2 NM , 2 RM one NM - 3 GB 6 v core Reporter: Bibin A Chundatt Assignee: Rohith Priority: Blocker Attachments: 0001-YARN-3733.patch, YARN-3733.patch Steps to reproduce = 1. Install HA with 2 RM 2 NM (3072 MB * 2 total cluster) 2. Configure map and reduce size to 512 MB after changing scheduler minimum size to 512 MB 3. Configure capacity scheduler and AM limit to .5 (DominantResourceCalculator is configured) 4. Submit 30 concurrent task 5. Switch RM Actual = For 12 Jobs AM gets allocated and all 12 starts running No other Yarn child is initiated , *all 12 Jobs in Running state for ever* Expected === Only 6 should be running at a time since max AM allocated is .5 (3072 MB) -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-3733) DominantRC#compare() does not work as expected if cluster resource is empty
[ https://issues.apache.org/jira/browse/YARN-3733?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14568880#comment-14568880 ] Sunil G commented on YARN-3733: --- Hi [~rohithsharma] Thanks for the detailed scenario. Scenario 4 can be possible, correct? clusterResource (0,0) : lhs (2,2) and rhs (3,2). Currently getResourceAsValue gives back the max ratio of mem/vcores if it is dominant, else it gives the min ratio. If clusterResource is 0, could we directly send the max of mem/vcores if dominant, and the min in the other case? This has to be made a better algorithm when more resource types come in; it is not completely perfect as we treat memory and vcores leniently. Please share your thoughts. DominantRC#compare() does not work as expected if cluster resource is empty --- Key: YARN-3733 URL: https://issues.apache.org/jira/browse/YARN-3733 Project: Hadoop YARN Issue Type: Bug Components: resourcemanager Affects Versions: 2.7.0 Environment: Suse 11 Sp3 , 2 NM , 2 RM one NM - 3 GB 6 v core Reporter: Bibin A Chundatt Assignee: Rohith Priority: Blocker Attachments: 0001-YARN-3733.patch, YARN-3733.patch Steps to reproduce = 1. Install HA with 2 RM 2 NM (3072 MB * 2 total cluster) 2. Configure map and reduce size to 512 MB after changing scheduler minimum size to 512 MB 3. Configure capacity scheduler and AM limit to .5 (DominantResourceCalculator is configured) 4. Submit 30 concurrent task 5. Switch RM Actual = For 12 Jobs AM gets allocated and all 12 starts running No other Yarn child is initiated , *all 12 Jobs in Running state for ever* Expected === Only 6 should be running at a time since max AM allocated is .5 (3072 MB) -- This message was sent by Atlassian JIRA (v6.3.4#6332)
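To make the comment above concrete, here is a rough sketch of a fallback comparison for the case where the cluster resource is empty (purely illustrative; this is not the actual DominantResourceCalculator code): compare the dominant (max) components first, then the non-dominant (min) ones.
{code}
import org.apache.hadoop.yarn.api.records.Resource;

// Sketch of a fallback when clusterResource is (0,0): no usage ratios can be
// computed, so compare the larger (dominant) components first and break ties
// on the smaller components. Names and structure are illustrative.
static int compareWhenClusterEmpty(Resource lhs, Resource rhs) {
  int lhsMax = Math.max(lhs.getMemory(), lhs.getVirtualCores());
  int rhsMax = Math.max(rhs.getMemory(), rhs.getVirtualCores());
  if (lhsMax != rhsMax) {
    return Integer.compare(lhsMax, rhsMax);   // dominant component decides
  }
  int lhsMin = Math.min(lhs.getMemory(), lhs.getVirtualCores());
  int rhsMin = Math.min(rhs.getMemory(), rhs.getVirtualCores());
  return Integer.compare(lhsMin, rhsMin);     // tie-break on the other component
}
{code}
With clusterResource (0,0), lhs (2,2) and rhs (3,2) would then compare as 2 versus 3 on the dominant component, so rhs is treated as larger.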
[jira] [Commented] (YARN-41) The RM should handle the graceful shutdown of the NM.
[ https://issues.apache.org/jira/browse/YARN-41?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14569182#comment-14569182 ] Junping Du commented on YARN-41: Thanks [~devaraj.k] for updating the patch and addressing the previous comments! Latest patch LGTM, +1. Will commit it tomorrow if there are no further comments on the code from other reviewers. In addition, the patch introduces a new SHUTDOWN category in NodeState, the UI and Cluster Metrics. Although it doesn't break any public APIs, we should mark this JIRA as incompatible because its behavior differs from previous releases in UI, CLI and Metrics (to notify users or third-party management/monitoring software). In general, I think it should be fine to keep the plan to include this patch in 2.x releases. However, please comment here to let us know if you have any concerns. The RM should handle the graceful shutdown of the NM. - Key: YARN-41 URL: https://issues.apache.org/jira/browse/YARN-41 Project: Hadoop YARN Issue Type: New Feature Components: nodemanager, resourcemanager Reporter: Ravi Teja Ch N V Assignee: Devaraj K Attachments: MAPREDUCE-3494.1.patch, MAPREDUCE-3494.2.patch, MAPREDUCE-3494.patch, YARN-41-1.patch, YARN-41-2.patch, YARN-41-3.patch, YARN-41-4.patch, YARN-41-5.patch, YARN-41-6.patch, YARN-41-7.patch, YARN-41-8.patch, YARN-41.patch Instead of waiting for the NM expiry, RM should remove and handle the NM, which is shutdown gracefully. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (YARN-3603) Application Attempts page confusing
[ https://issues.apache.org/jira/browse/YARN-3603?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sunil G updated YARN-3603: -- Attachment: 0002-YARN-3603.patch Attaching an updated version of the patch, along with screenshots of the UI. [~tgraves] Could you please take a look at this? Thank you. Application Attempts page confusing --- Key: YARN-3603 URL: https://issues.apache.org/jira/browse/YARN-3603 Project: Hadoop YARN Issue Type: Bug Components: webapp Affects Versions: 2.8.0 Reporter: Thomas Graves Assignee: Sunil G Attachments: 0001-YARN-3603.patch, 0002-YARN-3603.patch, ahs1.png The application attempts page (http://RM:8088/cluster/appattempt/appattempt_1431101480046_0003_01) is a bit confusing on what is going on. I think the table of containers there is for only Running containers and when the app is completed or killed its empty. The table should have a label on it stating so. Also the AM Container field is a link when running but not when its killed. That might be confusing. There is no link to the logs in this page but there is in the app attempt table when looking at http:// rm:8088/cluster/app/application_1431101480046_0003 -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (YARN-3603) Application Attempts page confusing
[ https://issues.apache.org/jira/browse/YARN-3603?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sunil G updated YARN-3603: -- Attachment: ahs1.png Application Attempts page confusing --- Key: YARN-3603 URL: https://issues.apache.org/jira/browse/YARN-3603 Project: Hadoop YARN Issue Type: Bug Components: webapp Affects Versions: 2.8.0 Reporter: Thomas Graves Assignee: Sunil G Attachments: 0001-YARN-3603.patch, 0002-YARN-3603.patch, ahs1.png The application attempts page (http://RM:8088/cluster/appattempt/appattempt_1431101480046_0003_01) is a bit confusing on what is going on. I think the table of containers there is for only Running containers and when the app is completed or killed its empty. The table should have a label on it stating so. Also the AM Container field is a link when running but not when its killed. That might be confusing. There is no link to the logs in this page but there is in the app attempt table when looking at http:// rm:8088/cluster/app/application_1431101480046_0003 -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (YARN-3754) Race condition when the NodeManager is shutting down and container is launched
[ https://issues.apache.org/jira/browse/YARN-3754?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Karthik Kambatla updated YARN-3754: --- Priority: Critical (was: Major) Target Version/s: 2.7.1 Race condition when the NodeManager is shutting down and container is launched -- Key: YARN-3754 URL: https://issues.apache.org/jira/browse/YARN-3754 Project: Hadoop YARN Issue Type: Bug Components: nodemanager Environment: Suse 11 Sp3 Reporter: Bibin A Chundatt Assignee: Sunil G Priority: Critical Container is launched and returned to ContainerImpl NodeManager closed the DB connection which resulting in {{org.iq80.leveldb.DBException: Closed}}. *Attaching the exception trace* {code} 2015-05-30 02:11:49,122 WARN org.apache.hadoop.yarn.server.nodemanager.containermanager.container.ContainerImpl: Unable to update state store diagnostics for container_e310_1432817693365_3338_01_02 java.io.IOException: org.iq80.leveldb.DBException: Closed at org.apache.hadoop.yarn.server.nodemanager.recovery.NMLeveldbStateStoreService.storeContainerDiagnostics(NMLeveldbStateStoreService.java:261) at org.apache.hadoop.yarn.server.nodemanager.containermanager.container.ContainerImpl$ContainerDiagnosticsUpdateTransition.transition(ContainerImpl.java:1109) at org.apache.hadoop.yarn.server.nodemanager.containermanager.container.ContainerImpl$ContainerDiagnosticsUpdateTransition.transition(ContainerImpl.java:1101) at org.apache.hadoop.yarn.state.StateMachineFactory$SingleInternalArc.doTransition(StateMachineFactory.java:362) at org.apache.hadoop.yarn.state.StateMachineFactory.doTransition(StateMachineFactory.java:302) at org.apache.hadoop.yarn.state.StateMachineFactory.access$300(StateMachineFactory.java:46) at org.apache.hadoop.yarn.state.StateMachineFactory$InternalStateMachine.doTransition(StateMachineFactory.java:448) at org.apache.hadoop.yarn.server.nodemanager.containermanager.container.ContainerImpl.handle(ContainerImpl.java:1129) at org.apache.hadoop.yarn.server.nodemanager.containermanager.container.ContainerImpl.handle(ContainerImpl.java:83) at org.apache.hadoop.yarn.server.nodemanager.DefaultContainerExecutor.launchContainer(DefaultContainerExecutor.java:246) at org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:302) at org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:82) at java.util.concurrent.FutureTask.run(FutureTask.java:266) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) at java.lang.Thread.run(Thread.java:745) Caused by: org.iq80.leveldb.DBException: Closed at org.fusesource.leveldbjni.internal.JniDB.put(JniDB.java:123) at org.fusesource.leveldbjni.internal.JniDB.put(JniDB.java:106) at org.apache.hadoop.yarn.server.nodemanager.recovery.NMLeveldbStateStoreService.storeContainerDiagnostics(NMLeveldbStateStoreService.java:259) ... 15 more {code} we can add a check whether DB is closed while we move container from ACQUIRED state. As per the discussion in YARN-3585 have add the same -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (YARN-3754) Race condition when the NodeManager is shutting down and container is launched
[ https://issues.apache.org/jira/browse/YARN-3754?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Karthik Kambatla updated YARN-3754: --- Target Version/s: 2.8.0 (was: 2.7.1) Race condition when the NodeManager is shutting down and container is launched -- Key: YARN-3754 URL: https://issues.apache.org/jira/browse/YARN-3754 Project: Hadoop YARN Issue Type: Bug Components: nodemanager Environment: Suse 11 Sp3 Reporter: Bibin A Chundatt Assignee: Sunil G Priority: Critical Container is launched and returned to ContainerImpl NodeManager closed the DB connection which resulting in {{org.iq80.leveldb.DBException: Closed}}. *Attaching the exception trace* {code} 2015-05-30 02:11:49,122 WARN org.apache.hadoop.yarn.server.nodemanager.containermanager.container.ContainerImpl: Unable to update state store diagnostics for container_e310_1432817693365_3338_01_02 java.io.IOException: org.iq80.leveldb.DBException: Closed at org.apache.hadoop.yarn.server.nodemanager.recovery.NMLeveldbStateStoreService.storeContainerDiagnostics(NMLeveldbStateStoreService.java:261) at org.apache.hadoop.yarn.server.nodemanager.containermanager.container.ContainerImpl$ContainerDiagnosticsUpdateTransition.transition(ContainerImpl.java:1109) at org.apache.hadoop.yarn.server.nodemanager.containermanager.container.ContainerImpl$ContainerDiagnosticsUpdateTransition.transition(ContainerImpl.java:1101) at org.apache.hadoop.yarn.state.StateMachineFactory$SingleInternalArc.doTransition(StateMachineFactory.java:362) at org.apache.hadoop.yarn.state.StateMachineFactory.doTransition(StateMachineFactory.java:302) at org.apache.hadoop.yarn.state.StateMachineFactory.access$300(StateMachineFactory.java:46) at org.apache.hadoop.yarn.state.StateMachineFactory$InternalStateMachine.doTransition(StateMachineFactory.java:448) at org.apache.hadoop.yarn.server.nodemanager.containermanager.container.ContainerImpl.handle(ContainerImpl.java:1129) at org.apache.hadoop.yarn.server.nodemanager.containermanager.container.ContainerImpl.handle(ContainerImpl.java:83) at org.apache.hadoop.yarn.server.nodemanager.DefaultContainerExecutor.launchContainer(DefaultContainerExecutor.java:246) at org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:302) at org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:82) at java.util.concurrent.FutureTask.run(FutureTask.java:266) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) at java.lang.Thread.run(Thread.java:745) Caused by: org.iq80.leveldb.DBException: Closed at org.fusesource.leveldbjni.internal.JniDB.put(JniDB.java:123) at org.fusesource.leveldbjni.internal.JniDB.put(JniDB.java:106) at org.apache.hadoop.yarn.server.nodemanager.recovery.NMLeveldbStateStoreService.storeContainerDiagnostics(NMLeveldbStateStoreService.java:259) ... 15 more {code} we can add a check whether DB is closed while we move container from ACQUIRED state. As per the discussion in YARN-3585 have add the same -- This message was sent by Atlassian JIRA (v6.3.4#6332)
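A minimal sketch of the guard proposed in the description of YARN-3754 above, assuming a hypothetical isClosed() accessor on the NM state store (the real NMLeveldbStateStoreService API may differ): skip the diagnostics write once the store has been closed during shutdown instead of letting it fail with DBException: Closed.
{code}
// Hypothetical guard, illustrative only; isClosed() is an assumed helper.
if (!stateStore.isClosed()) {
  stateStore.storeContainerDiagnostics(containerId, diagnostics);
} else {
  LOG.warn("Skipping diagnostics update for " + containerId
      + " because the NM state store is already closed");
}
{code}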
[jira] [Commented] (YARN-3760) Log aggregation failures
[ https://issues.apache.org/jira/browse/YARN-3760?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14569289#comment-14569289 ] Daryn Sharp commented on YARN-3760: --- Cancelled tokens trigger the retry proxy bug. Log aggregation failures - Key: YARN-3760 URL: https://issues.apache.org/jira/browse/YARN-3760 Project: Hadoop YARN Issue Type: Bug Components: nodemanager Affects Versions: 2.4.0 Reporter: Daryn Sharp Priority: Critical The aggregated log file does not appear to be properly closed when writes fail. This leaves a lease renewer active in the NM that spams the NN with lease renewals. If the token is marked not to be cancelled, the renewals appear to continue until the token expires. If the token is cancelled, the periodic renew spam turns into a flood of failed connections until the lease renewer gives up. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-2962) ZKRMStateStore: Limit the number of znodes under a znode
[ https://issues.apache.org/jira/browse/YARN-2962?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14569262#comment-14569262 ] Karthik Kambatla commented on YARN-2962: YARN-3643 should help alleviate most of the issues users face. This JIRA could be targeted only at trunk, without worrying about rolling upgrades. ZKRMStateStore: Limit the number of znodes under a znode Key: YARN-2962 URL: https://issues.apache.org/jira/browse/YARN-2962 Project: Hadoop YARN Issue Type: Improvement Components: resourcemanager Affects Versions: 2.6.0 Reporter: Karthik Kambatla Assignee: Varun Saxena Priority: Critical Attachments: YARN-2962.01.patch, YARN-2962.2.patch, YARN-2962.3.patch We ran into this issue where we were hitting the default ZK server message size configs, primarily because the message had too many znodes even though individually they were all small. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (YARN-3760) Log aggregation failures
Daryn Sharp created YARN-3760: - Summary: Log aggregation failures Key: YARN-3760 URL: https://issues.apache.org/jira/browse/YARN-3760 Project: Hadoop YARN Issue Type: Bug Components: nodemanager Affects Versions: 2.4.0 Reporter: Daryn Sharp Priority: Critical The aggregated log file does not appear to be properly closed when writes fail. This leaves a lease renewer active in the NM that spams the NN with lease renewals. If the token is marked not to be cancelled, the renewals appear to continue until the token expires. If the token is cancelled, the periodic renew spam turns into a flood of failed connections until the lease renewer gives up. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-3758) The mininum memory setting(yarn.scheduler.minimum-allocation-mb) is not working in container
[ https://issues.apache.org/jira/browse/YARN-3758?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14569237#comment-14569237 ] Jason Lowe commented on YARN-3758: -- First off, one should never set the heap size and the container size to the same value. The container size needs to be big enough to hold the entire process, not just the heap, so it needs to also consider the overhead of the JVM itself and any off-heap usage (e.g.: JVM code, data, thread stacks, shared libs, off-heap allocations, etc.). If you set the heap size to the same size as the container then when the heap fills up the process overall will be bigger than the heap size and YARN will kill the container. Couple of things to check: - Does the job configuration show that it is indeed asking for only 256 MB containers for tasks? Check the job configuration link for the job on the job history server or the configuration link for the AM's UI while the job is running. - Check the RM logs to verify what minimum allocation size it is loading from the configs and what request size it is allocating per task The mininum memory setting(yarn.scheduler.minimum-allocation-mb) is not working in container Key: YARN-3758 URL: https://issues.apache.org/jira/browse/YARN-3758 Project: Hadoop YARN Issue Type: Bug Components: resourcemanager Affects Versions: 2.4.0 Reporter: skrho Hello there~~ I have 2 clusters First cluster is 5 node , default 1 application queue, Capacity scheduler, 8G Physical memory each node Second cluster is 10 node, 2 application queuey, fair-scheduler, 230G Physical memory each node Wherever a mapreduce job is running, I want resourcemanager is to set the minimum memory 256m to container So I was changing configuration in yarn-site.xml mapred-site.xml yarn.scheduler.minimum-allocation-mb : 256 mapreduce.map.java.opts : -Xms256m mapreduce.reduce.java.opts : -Xms256m mapreduce.map.memory.mb : 256 mapreduce.reduce.memory.mb : 256 In First cluster whenever a mapreduce job is running , I can see used memory 256m in web console( http://installedIP:8088/cluster/nodes ) But In Second cluster whenever a mapreduce job is running , I can see used memory 1024m in web console( http://installedIP:8088/cluster/nodes ) I know default memory value is 1024m, so if there is not changing memory setting, the default value is working. I have been testing for two weeks, but I don't know why mimimum memory setting is not working in second cluster Why this difference is happened? Am I wrong setting configuration? or Is there bug? Thank you for reading~~ -- This message was sent by Atlassian JIRA (v6.3.4#6332)
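To illustrate the sizing advice in the comment above, a job could request 512 MB containers while capping the task heap well below that; the 512 MB value and the 0.8 ratio are illustrative assumptions, not recommendations from this issue.
{code}
import org.apache.hadoop.conf.Configuration;

// Sketch: keep the task heap (-Xmx) noticeably smaller than the container size
// so JVM overhead and off-heap usage still fit inside the container limit.
Configuration conf = new Configuration();
int containerMb = 512;                    // requested container size per task
int heapMb = (int) (containerMb * 0.8);   // leave roughly 20% headroom for non-heap memory
conf.setInt("mapreduce.map.memory.mb", containerMb);
conf.setInt("mapreduce.reduce.memory.mb", containerMb);
conf.set("mapreduce.map.java.opts", "-Xmx" + heapMb + "m");
conf.set("mapreduce.reduce.java.opts", "-Xmx" + heapMb + "m");
{code}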
[jira] [Commented] (YARN-3753) RM failed to come up with java.io.IOException: Wait for ZKClient creation timed out
[ https://issues.apache.org/jira/browse/YARN-3753?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14569267#comment-14569267 ] Karthik Kambatla commented on YARN-3753: Fix looks reasonable to me. RM failed to come up with java.io.IOException: Wait for ZKClient creation timed out - Key: YARN-3753 URL: https://issues.apache.org/jira/browse/YARN-3753 Project: Hadoop YARN Issue Type: Bug Components: yarn Reporter: Sumana Sathish Assignee: Jian He Priority: Critical Attachments: YARN-3753.1.patch, YARN-3753.2.patch, YARN-3753.patch RM failed to come up with the following error while submitting an mapreduce job. {code:title=RM log} 015-05-30 03:40:12,190 ERROR recovery.RMStateStore (RMStateStore.java:transition(179)) - Error storing app: application_1432956515242_0006 java.io.IOException: Wait for ZKClient creation timed out at org.apache.hadoop.yarn.server.resourcemanager.recovery.ZKRMStateStore$ZKAction.runWithCheck(ZKRMStateStore.java:1098) at org.apache.hadoop.yarn.server.resourcemanager.recovery.ZKRMStateStore$ZKAction.runWithRetries(ZKRMStateStore.java:1122) at org.apache.hadoop.yarn.server.resourcemanager.recovery.ZKRMStateStore.doStoreMultiWithRetries(ZKRMStateStore.java:923) at org.apache.hadoop.yarn.server.resourcemanager.recovery.ZKRMStateStore.doStoreMultiWithRetries(ZKRMStateStore.java:937) at org.apache.hadoop.yarn.server.resourcemanager.recovery.ZKRMStateStore.createWithRetries(ZKRMStateStore.java:970) at org.apache.hadoop.yarn.server.resourcemanager.recovery.ZKRMStateStore.storeApplicationStateInternal(ZKRMStateStore.java:609) at org.apache.hadoop.yarn.server.resourcemanager.recovery.RMStateStore$StoreAppTransition.transition(RMStateStore.java:175) at org.apache.hadoop.yarn.server.resourcemanager.recovery.RMStateStore$StoreAppTransition.transition(RMStateStore.java:160) at org.apache.hadoop.yarn.state.StateMachineFactory$SingleInternalArc.doTransition(StateMachineFactory.java:362) at org.apache.hadoop.yarn.state.StateMachineFactory.doTransition(StateMachineFactory.java:302) at org.apache.hadoop.yarn.state.StateMachineFactory.access$300(StateMachineFactory.java:46) at org.apache.hadoop.yarn.state.StateMachineFactory$InternalStateMachine.doTransition(StateMachineFactory.java:448) at org.apache.hadoop.yarn.server.resourcemanager.recovery.RMStateStore.handleStoreEvent(RMStateStore.java:837) at org.apache.hadoop.yarn.server.resourcemanager.recovery.RMStateStore$ForwardingEventHandler.handle(RMStateStore.java:900) at org.apache.hadoop.yarn.server.resourcemanager.recovery.RMStateStore$ForwardingEventHandler.handle(RMStateStore.java:895) at org.apache.hadoop.yarn.event.AsyncDispatcher.dispatch(AsyncDispatcher.java:175) at org.apache.hadoop.yarn.event.AsyncDispatcher$1.run(AsyncDispatcher.java:108) at java.lang.Thread.run(Thread.java:745) 2015-05-30 03:40:12,194 FATAL resourcemanager.ResourceManager (ResourceManager.java:handle(750)) - Received a org.apache.hadoop.yarn.server.resourcemanager.RMFatalEvent of type STATE_STORE_OP_FAILED. 
Cause: java.io.IOException: Wait for ZKClient creation timed out at org.apache.hadoop.yarn.server.resourcemanager.recovery.ZKRMStateStore$ZKAction.runWithCheck(ZKRMStateStore.java:1098) at org.apache.hadoop.yarn.server.resourcemanager.recovery.ZKRMStateStore$ZKAction.runWithRetries(ZKRMStateStore.java:1122) at org.apache.hadoop.yarn.server.resourcemanager.recovery.ZKRMStateStore.doStoreMultiWithRetries(ZKRMStateStore.java:923) at org.apache.hadoop.yarn.server.resourcemanager.recovery.ZKRMStateStore.doStoreMultiWithRetries(ZKRMStateStore.java:937) at org.apache.hadoop.yarn.server.resourcemanager.recovery.ZKRMStateStore.createWithRetries(ZKRMStateStore.java:970) at org.apache.hadoop.yarn.server.resourcemanager.recovery.ZKRMStateStore.storeApplicationStateInternal(ZKRMStateStore.java:609) at org.apache.hadoop.yarn.server.resourcemanager.recovery.RMStateStore$StoreAppTransition.transition(RMStateStore.java:175) at org.apache.hadoop.yarn.server.resourcemanager.recovery.RMStateStore$StoreAppTransition.transition(RMStateStore.java:160) at org.apache.hadoop.yarn.state.StateMachineFactory$SingleInternalArc.doTransition(StateMachineFactory.java:362) at org.apache.hadoop.yarn.state.StateMachineFactory.doTransition(StateMachineFactory.java:302) at org.apache.hadoop.yarn.state.StateMachineFactory.access$300(StateMachineFactory.java:46) at
[jira] [Commented] (YARN-41) The RM should handle the graceful shutdown of the NM.
[ https://issues.apache.org/jira/browse/YARN-41?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14569278#comment-14569278 ] Devaraj K commented on YARN-41: --- Thanks a lot [~djp] for your review and comments, I really appreciate your help on reviewing the patch. The RM should handle the graceful shutdown of the NM. - Key: YARN-41 URL: https://issues.apache.org/jira/browse/YARN-41 Project: Hadoop YARN Issue Type: New Feature Components: nodemanager, resourcemanager Reporter: Ravi Teja Ch N V Assignee: Devaraj K Attachments: MAPREDUCE-3494.1.patch, MAPREDUCE-3494.2.patch, MAPREDUCE-3494.patch, YARN-41-1.patch, YARN-41-2.patch, YARN-41-3.patch, YARN-41-4.patch, YARN-41-5.patch, YARN-41-6.patch, YARN-41-7.patch, YARN-41-8.patch, YARN-41.patch Instead of waiting for the NM expiry, RM should remove and handle the NM, which is shutdown gracefully. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-1462) AHS API and other AHS changes to handle tags for completed MR jobs
[ https://issues.apache.org/jira/browse/YARN-1462?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14569342#comment-14569342 ] Hudson commented on YARN-1462: -- FAILURE: Integrated in Hadoop-Mapreduce-trunk #2162 (See [https://builds.apache.org/job/Hadoop-Mapreduce-trunk/2162/]) YARN-1462. Correct fix version from branch-2.7.1 to branch-2.8 in (xgong: rev 0b5cfacde638bc25cc010cd9236369237b4e51a8) * hadoop-yarn-project/CHANGES.txt AHS API and other AHS changes to handle tags for completed MR jobs -- Key: YARN-1462 URL: https://issues.apache.org/jira/browse/YARN-1462 Project: Hadoop YARN Issue Type: Sub-task Affects Versions: 2.2.0 Reporter: Karthik Kambatla Assignee: Xuan Gong Fix For: 2.8.0 Attachments: YARN-1462-branch-2.7-1.2.patch, YARN-1462-branch-2.7-1.patch, YARN-1462.1.patch, YARN-1462.2.patch, YARN-1462.3.patch AHS related work for tags. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-1462) AHS API and other AHS changes to handle tags for completed MR jobs
[ https://issues.apache.org/jira/browse/YARN-1462?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14569322#comment-14569322 ] Hudson commented on YARN-1462: -- FAILURE: Integrated in Hadoop-Mapreduce-trunk-Java8 #214 (See [https://builds.apache.org/job/Hadoop-Mapreduce-trunk-Java8/214/]) YARN-1462. Correct fix version from branch-2.7.1 to branch-2.8 in (xgong: rev 0b5cfacde638bc25cc010cd9236369237b4e51a8) * hadoop-yarn-project/CHANGES.txt AHS API and other AHS changes to handle tags for completed MR jobs -- Key: YARN-1462 URL: https://issues.apache.org/jira/browse/YARN-1462 Project: Hadoop YARN Issue Type: Sub-task Affects Versions: 2.2.0 Reporter: Karthik Kambatla Assignee: Xuan Gong Fix For: 2.8.0 Attachments: YARN-1462-branch-2.7-1.2.patch, YARN-1462-branch-2.7-1.patch, YARN-1462.1.patch, YARN-1462.2.patch, YARN-1462.3.patch AHS related work for tags. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-3754) Race condition when the NodeManager is shutting down and container is launched
[ https://issues.apache.org/jira/browse/YARN-3754?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14569381#comment-14569381 ] Sunil G commented on YARN-3754: --- [~bibinchundatt] Could u also please attach NM logs here. Race condition when the NodeManager is shutting down and container is launched -- Key: YARN-3754 URL: https://issues.apache.org/jira/browse/YARN-3754 Project: Hadoop YARN Issue Type: Bug Components: nodemanager Environment: Suse 11 Sp3 Reporter: Bibin A Chundatt Assignee: Sunil G Priority: Critical Container is launched and returned to ContainerImpl NodeManager closed the DB connection which resulting in {{org.iq80.leveldb.DBException: Closed}}. *Attaching the exception trace* {code} 2015-05-30 02:11:49,122 WARN org.apache.hadoop.yarn.server.nodemanager.containermanager.container.ContainerImpl: Unable to update state store diagnostics for container_e310_1432817693365_3338_01_02 java.io.IOException: org.iq80.leveldb.DBException: Closed at org.apache.hadoop.yarn.server.nodemanager.recovery.NMLeveldbStateStoreService.storeContainerDiagnostics(NMLeveldbStateStoreService.java:261) at org.apache.hadoop.yarn.server.nodemanager.containermanager.container.ContainerImpl$ContainerDiagnosticsUpdateTransition.transition(ContainerImpl.java:1109) at org.apache.hadoop.yarn.server.nodemanager.containermanager.container.ContainerImpl$ContainerDiagnosticsUpdateTransition.transition(ContainerImpl.java:1101) at org.apache.hadoop.yarn.state.StateMachineFactory$SingleInternalArc.doTransition(StateMachineFactory.java:362) at org.apache.hadoop.yarn.state.StateMachineFactory.doTransition(StateMachineFactory.java:302) at org.apache.hadoop.yarn.state.StateMachineFactory.access$300(StateMachineFactory.java:46) at org.apache.hadoop.yarn.state.StateMachineFactory$InternalStateMachine.doTransition(StateMachineFactory.java:448) at org.apache.hadoop.yarn.server.nodemanager.containermanager.container.ContainerImpl.handle(ContainerImpl.java:1129) at org.apache.hadoop.yarn.server.nodemanager.containermanager.container.ContainerImpl.handle(ContainerImpl.java:83) at org.apache.hadoop.yarn.server.nodemanager.DefaultContainerExecutor.launchContainer(DefaultContainerExecutor.java:246) at org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:302) at org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:82) at java.util.concurrent.FutureTask.run(FutureTask.java:266) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) at java.lang.Thread.run(Thread.java:745) Caused by: org.iq80.leveldb.DBException: Closed at org.fusesource.leveldbjni.internal.JniDB.put(JniDB.java:123) at org.fusesource.leveldbjni.internal.JniDB.put(JniDB.java:106) at org.apache.hadoop.yarn.server.nodemanager.recovery.NMLeveldbStateStoreService.storeContainerDiagnostics(NMLeveldbStateStoreService.java:259) ... 15 more {code} we can add a check whether DB is closed while we move container from ACQUIRED state. As per the discussion in YARN-3585 have add the same -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-3755) Log the command of launching containers
[ https://issues.apache.org/jira/browse/YARN-3755?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14569347#comment-14569347 ] Vinod Kumar Vavilapalli commented on YARN-3755: --- We had this long ago in YARN, but removed it as the log files were getting inundated in large/high throughput clusters. If you combine the command line with the environment (classpath etc), this can get very long. How about we let individual frameworks like MapReduce/Tez log them as needed? That seems like the right place for debugging too - app developers don't always get access to the daemon logs. Log the command of launching containers --- Key: YARN-3755 URL: https://issues.apache.org/jira/browse/YARN-3755 Project: Hadoop YARN Issue Type: Improvement Affects Versions: 2.7.0 Reporter: Jeff Zhang Assignee: Jeff Zhang Attachments: YARN-3755-1.patch, YARN-3755-2.patch In the resource manager log, yarn would log the command for launching AM, this is very useful. But there's no such log in the NN log for launching containers. It would be difficult to diagnose when containers fails to launch due to some issue in the commands. Although user can look at the commands in the container launch script file, this is an internal things of yarn, usually user don't know that. In user's perspective, they only know what commands they specify when building yarn application. {code} 2015-06-01 16:06:42,245 INFO org.apache.hadoop.yarn.server.resourcemanager.amlauncher.AMLauncher: Command to launch container container_1433145984561_0001_01_01 : $JAVA_HOME/bin/java -server -Djava.net.preferIPv4Stack=true -Dhadoop.metrics.log.level=WARN -Xmx1024m -Dlog4j.configuratorClass=org.apache.tez.common.TezLog4jConfigurator -Dlog4j.configuration=tez-container-log4j.properties -Dyarn.app.container.log.dir=LOG_DIR -Dtez.root.logger=info,CLA -Dsun.nio.ch.bugLevel='' org.apache.tez.dag.app.DAGAppMaster 1LOG_DIR/stdout 2LOG_DIR/stderr {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
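A sketch of the framework-side alternative suggested in the comment above (illustrative; the logger is assumed to be the framework's own): the submitting framework already holds the ContainerLaunchContext, so it can log the commands it is about to ask YARN to run.
{code}
import org.apache.commons.logging.Log;
import org.apache.hadoop.yarn.api.records.ContainerLaunchContext;

// Illustrative: log the launch commands from the framework side before
// submitting the container, instead of relying on NM daemon logs.
static void logLaunchCommands(ContainerLaunchContext ctx, Log log) {
  log.info("Container launch commands: " + ctx.getCommands());
}
{code}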
[jira] [Created] (YARN-3761) Set delegation token service address at the server side
Zhijie Shen created YARN-3761: - Summary: Set delegation token service address at the server side Key: YARN-3761 URL: https://issues.apache.org/jira/browse/YARN-3761 Project: Hadoop YARN Issue Type: Improvement Components: security Reporter: Zhijie Shen Nowadays, YARN components generate the delegation token without the service address set, and leave it to the client to set. With our java client library, it is usually fine. However, if users are using REST API, it's going to be a problem: The delegation token is returned as a url string. It's so unfriendly for the thin client to deserialize the url string, set the token service address and serialize it again for further usage. If we move the task of setting the service address to the server side, the client can get rid of this trouble. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
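For context, this is roughly what a Java client has to do today with the url-string form of a delegation token (a sketch; everything other than the Hadoop classes shown is a placeholder): deserialize it and fill in the service address before the token can be used, which is exactly the step a thin REST client struggles with.
{code}
import java.io.IOException;
import org.apache.hadoop.net.NetUtils;
import org.apache.hadoop.security.SecurityUtil;
import org.apache.hadoop.security.token.Token;
import org.apache.hadoop.security.token.TokenIdentifier;

// Sketch: turn the url-string returned over REST back into a Token and set the
// service address on the client side. serverAddress is e.g. "rm-host:8032".
static Token<TokenIdentifier> tokenFromUrlString(String tokenUrlString,
    String serverAddress) throws IOException {
  Token<TokenIdentifier> token = new Token<TokenIdentifier>();
  token.decodeFromUrlString(tokenUrlString);
  SecurityUtil.setTokenService(token, NetUtils.createSocketAddr(serverAddress));
  return token;
}
{code}
Moving that last step to the server side, as proposed, would let REST clients use the token exactly as returned.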
[jira] [Commented] (YARN-2618) Avoid over-allocation of disk resources
[ https://issues.apache.org/jira/browse/YARN-2618?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14569437#comment-14569437 ] Varun Vasudev commented on YARN-2618: - [~kasha] - should we commit this to the YARN-2139 branch? Should we get the branch up to date with trunk first? Avoid over-allocation of disk resources --- Key: YARN-2618 URL: https://issues.apache.org/jira/browse/YARN-2618 Project: Hadoop YARN Issue Type: Sub-task Reporter: Wei Yan Assignee: Wei Yan Labels: BB2015-05-TBR Attachments: YARN-2618-1.patch, YARN-2618-2.patch, YARN-2618-3.patch, YARN-2618-4.patch, YARN-2618-5.patch, YARN-2618-6.patch, YARN-2618-7.patch Subtask of YARN-2139. This should include - Add API support for introducing disk I/O as the 3rd type resource. - NM should report this information to the RM - RM should consider this to avoid over-allocation -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-2618) Avoid over-allocation of disk resources
[ https://issues.apache.org/jira/browse/YARN-2618?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14569453#comment-14569453 ] Hadoop QA commented on YARN-2618: - \\ \\ | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:red}-1{color} | patch | 0m 0s | The patch command could not apply the patch during dryrun. | \\ \\ || Subsystem || Report/Notes || | Patch URL | http://issues.apache.org/jira/secure/attachment/12723515/YARN-2618-7.patch | | Optional Tests | javadoc javac unit findbugs checkstyle | | git revision | trunk / a2bd621 | | Console output | https://builds.apache.org/job/PreCommit-YARN-Build/8167/console | This message was automatically generated. Avoid over-allocation of disk resources --- Key: YARN-2618 URL: https://issues.apache.org/jira/browse/YARN-2618 Project: Hadoop YARN Issue Type: Sub-task Reporter: Wei Yan Assignee: Wei Yan Labels: BB2015-05-TBR Attachments: YARN-2618-1.patch, YARN-2618-2.patch, YARN-2618-3.patch, YARN-2618-4.patch, YARN-2618-5.patch, YARN-2618-6.patch, YARN-2618-7.patch Subtask of YARN-2139. This should include - Add API support for introducing disk I/O as the 3rd type resource. - NM should report this information to the RM - RM should consider this to avoid over-allocation -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-3753) RM failed to come up with java.io.IOException: Wait for ZKClient creation timed out
[ https://issues.apache.org/jira/browse/YARN-3753?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14569456#comment-14569456 ] Xuan Gong commented on YARN-3753: - +1, LGTM. Check this in RM failed to come up with java.io.IOException: Wait for ZKClient creation timed out - Key: YARN-3753 URL: https://issues.apache.org/jira/browse/YARN-3753 Project: Hadoop YARN Issue Type: Bug Components: yarn Reporter: Sumana Sathish Assignee: Jian He Priority: Critical Attachments: YARN-3753.1.patch, YARN-3753.2.patch, YARN-3753.patch RM failed to come up with the following error while submitting an mapreduce job. {code:title=RM log} 015-05-30 03:40:12,190 ERROR recovery.RMStateStore (RMStateStore.java:transition(179)) - Error storing app: application_1432956515242_0006 java.io.IOException: Wait for ZKClient creation timed out at org.apache.hadoop.yarn.server.resourcemanager.recovery.ZKRMStateStore$ZKAction.runWithCheck(ZKRMStateStore.java:1098) at org.apache.hadoop.yarn.server.resourcemanager.recovery.ZKRMStateStore$ZKAction.runWithRetries(ZKRMStateStore.java:1122) at org.apache.hadoop.yarn.server.resourcemanager.recovery.ZKRMStateStore.doStoreMultiWithRetries(ZKRMStateStore.java:923) at org.apache.hadoop.yarn.server.resourcemanager.recovery.ZKRMStateStore.doStoreMultiWithRetries(ZKRMStateStore.java:937) at org.apache.hadoop.yarn.server.resourcemanager.recovery.ZKRMStateStore.createWithRetries(ZKRMStateStore.java:970) at org.apache.hadoop.yarn.server.resourcemanager.recovery.ZKRMStateStore.storeApplicationStateInternal(ZKRMStateStore.java:609) at org.apache.hadoop.yarn.server.resourcemanager.recovery.RMStateStore$StoreAppTransition.transition(RMStateStore.java:175) at org.apache.hadoop.yarn.server.resourcemanager.recovery.RMStateStore$StoreAppTransition.transition(RMStateStore.java:160) at org.apache.hadoop.yarn.state.StateMachineFactory$SingleInternalArc.doTransition(StateMachineFactory.java:362) at org.apache.hadoop.yarn.state.StateMachineFactory.doTransition(StateMachineFactory.java:302) at org.apache.hadoop.yarn.state.StateMachineFactory.access$300(StateMachineFactory.java:46) at org.apache.hadoop.yarn.state.StateMachineFactory$InternalStateMachine.doTransition(StateMachineFactory.java:448) at org.apache.hadoop.yarn.server.resourcemanager.recovery.RMStateStore.handleStoreEvent(RMStateStore.java:837) at org.apache.hadoop.yarn.server.resourcemanager.recovery.RMStateStore$ForwardingEventHandler.handle(RMStateStore.java:900) at org.apache.hadoop.yarn.server.resourcemanager.recovery.RMStateStore$ForwardingEventHandler.handle(RMStateStore.java:895) at org.apache.hadoop.yarn.event.AsyncDispatcher.dispatch(AsyncDispatcher.java:175) at org.apache.hadoop.yarn.event.AsyncDispatcher$1.run(AsyncDispatcher.java:108) at java.lang.Thread.run(Thread.java:745) 2015-05-30 03:40:12,194 FATAL resourcemanager.ResourceManager (ResourceManager.java:handle(750)) - Received a org.apache.hadoop.yarn.server.resourcemanager.RMFatalEvent of type STATE_STORE_OP_FAILED. 
Cause: java.io.IOException: Wait for ZKClient creation timed out at org.apache.hadoop.yarn.server.resourcemanager.recovery.ZKRMStateStore$ZKAction.runWithCheck(ZKRMStateStore.java:1098) at org.apache.hadoop.yarn.server.resourcemanager.recovery.ZKRMStateStore$ZKAction.runWithRetries(ZKRMStateStore.java:1122) at org.apache.hadoop.yarn.server.resourcemanager.recovery.ZKRMStateStore.doStoreMultiWithRetries(ZKRMStateStore.java:923) at org.apache.hadoop.yarn.server.resourcemanager.recovery.ZKRMStateStore.doStoreMultiWithRetries(ZKRMStateStore.java:937) at org.apache.hadoop.yarn.server.resourcemanager.recovery.ZKRMStateStore.createWithRetries(ZKRMStateStore.java:970) at org.apache.hadoop.yarn.server.resourcemanager.recovery.ZKRMStateStore.storeApplicationStateInternal(ZKRMStateStore.java:609) at org.apache.hadoop.yarn.server.resourcemanager.recovery.RMStateStore$StoreAppTransition.transition(RMStateStore.java:175) at org.apache.hadoop.yarn.server.resourcemanager.recovery.RMStateStore$StoreAppTransition.transition(RMStateStore.java:160) at org.apache.hadoop.yarn.state.StateMachineFactory$SingleInternalArc.doTransition(StateMachineFactory.java:362) at org.apache.hadoop.yarn.state.StateMachineFactory.doTransition(StateMachineFactory.java:302) at org.apache.hadoop.yarn.state.StateMachineFactory.access$300(StateMachineFactory.java:46) at
[jira] [Commented] (YARN-3753) RM failed to come up with java.io.IOException: Wait for ZKClient creation timed out
[ https://issues.apache.org/jira/browse/YARN-3753?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14569465#comment-14569465 ] Xuan Gong commented on YARN-3753: - Committed into branch-2.7. Thanks, Jian RM failed to come up with java.io.IOException: Wait for ZKClient creation timed out - Key: YARN-3753 URL: https://issues.apache.org/jira/browse/YARN-3753 Project: Hadoop YARN Issue Type: Bug Components: yarn Reporter: Sumana Sathish Assignee: Jian He Priority: Critical Fix For: 2.7.1 Attachments: YARN-3753.1.patch, YARN-3753.2.patch, YARN-3753.patch RM failed to come up with the following error while submitting an mapreduce job. {code:title=RM log} 015-05-30 03:40:12,190 ERROR recovery.RMStateStore (RMStateStore.java:transition(179)) - Error storing app: application_1432956515242_0006 java.io.IOException: Wait for ZKClient creation timed out at org.apache.hadoop.yarn.server.resourcemanager.recovery.ZKRMStateStore$ZKAction.runWithCheck(ZKRMStateStore.java:1098) at org.apache.hadoop.yarn.server.resourcemanager.recovery.ZKRMStateStore$ZKAction.runWithRetries(ZKRMStateStore.java:1122) at org.apache.hadoop.yarn.server.resourcemanager.recovery.ZKRMStateStore.doStoreMultiWithRetries(ZKRMStateStore.java:923) at org.apache.hadoop.yarn.server.resourcemanager.recovery.ZKRMStateStore.doStoreMultiWithRetries(ZKRMStateStore.java:937) at org.apache.hadoop.yarn.server.resourcemanager.recovery.ZKRMStateStore.createWithRetries(ZKRMStateStore.java:970) at org.apache.hadoop.yarn.server.resourcemanager.recovery.ZKRMStateStore.storeApplicationStateInternal(ZKRMStateStore.java:609) at org.apache.hadoop.yarn.server.resourcemanager.recovery.RMStateStore$StoreAppTransition.transition(RMStateStore.java:175) at org.apache.hadoop.yarn.server.resourcemanager.recovery.RMStateStore$StoreAppTransition.transition(RMStateStore.java:160) at org.apache.hadoop.yarn.state.StateMachineFactory$SingleInternalArc.doTransition(StateMachineFactory.java:362) at org.apache.hadoop.yarn.state.StateMachineFactory.doTransition(StateMachineFactory.java:302) at org.apache.hadoop.yarn.state.StateMachineFactory.access$300(StateMachineFactory.java:46) at org.apache.hadoop.yarn.state.StateMachineFactory$InternalStateMachine.doTransition(StateMachineFactory.java:448) at org.apache.hadoop.yarn.server.resourcemanager.recovery.RMStateStore.handleStoreEvent(RMStateStore.java:837) at org.apache.hadoop.yarn.server.resourcemanager.recovery.RMStateStore$ForwardingEventHandler.handle(RMStateStore.java:900) at org.apache.hadoop.yarn.server.resourcemanager.recovery.RMStateStore$ForwardingEventHandler.handle(RMStateStore.java:895) at org.apache.hadoop.yarn.event.AsyncDispatcher.dispatch(AsyncDispatcher.java:175) at org.apache.hadoop.yarn.event.AsyncDispatcher$1.run(AsyncDispatcher.java:108) at java.lang.Thread.run(Thread.java:745) 2015-05-30 03:40:12,194 FATAL resourcemanager.ResourceManager (ResourceManager.java:handle(750)) - Received a org.apache.hadoop.yarn.server.resourcemanager.RMFatalEvent of type STATE_STORE_OP_FAILED. 
Cause: java.io.IOException: Wait for ZKClient creation timed out at org.apache.hadoop.yarn.server.resourcemanager.recovery.ZKRMStateStore$ZKAction.runWithCheck(ZKRMStateStore.java:1098) at org.apache.hadoop.yarn.server.resourcemanager.recovery.ZKRMStateStore$ZKAction.runWithRetries(ZKRMStateStore.java:1122) at org.apache.hadoop.yarn.server.resourcemanager.recovery.ZKRMStateStore.doStoreMultiWithRetries(ZKRMStateStore.java:923) at org.apache.hadoop.yarn.server.resourcemanager.recovery.ZKRMStateStore.doStoreMultiWithRetries(ZKRMStateStore.java:937) at org.apache.hadoop.yarn.server.resourcemanager.recovery.ZKRMStateStore.createWithRetries(ZKRMStateStore.java:970) at org.apache.hadoop.yarn.server.resourcemanager.recovery.ZKRMStateStore.storeApplicationStateInternal(ZKRMStateStore.java:609) at org.apache.hadoop.yarn.server.resourcemanager.recovery.RMStateStore$StoreAppTransition.transition(RMStateStore.java:175) at org.apache.hadoop.yarn.server.resourcemanager.recovery.RMStateStore$StoreAppTransition.transition(RMStateStore.java:160) at org.apache.hadoop.yarn.state.StateMachineFactory$SingleInternalArc.doTransition(StateMachineFactory.java:362) at org.apache.hadoop.yarn.state.StateMachineFactory.doTransition(StateMachineFactory.java:302) at org.apache.hadoop.yarn.state.StateMachineFactory.access$300(StateMachineFactory.java:46) at
[jira] [Commented] (YARN-3591) Resource Localisation on a bad disk causes subsequent containers failure
[ https://issues.apache.org/jira/browse/YARN-3591?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14569480#comment-14569480 ] Sunil G commented on YARN-3591: --- If we have a new API which returns the present set of error dirs alone (w/o full dirs) {code} synchronized List<String> getErrorDirs() {code} then could we modify LocalResourcesTrackerImpl#checkLocalizedResources in such a way that we call *removeResource* on those localized resources whose parent is present in ErrorDirs? Resource Localisation on a bad disk causes subsequent containers failure - Key: YARN-3591 URL: https://issues.apache.org/jira/browse/YARN-3591 Project: Hadoop YARN Issue Type: Bug Affects Versions: 2.7.0 Reporter: Lavkesh Lahngir Assignee: Lavkesh Lahngir Attachments: 0001-YARN-3591.1.patch, 0001-YARN-3591.patch, YARN-3591.2.patch, YARN-3591.3.patch, YARN-3591.4.patch It happens when a resource is localised on the disk, after localising that disk has gone bad. NM keeps paths for localised resources in memory. At the time of resource request isResourcePresent(rsrc) will be called which calls file.exists() on the localised path. In some cases when disk has gone bad, inodes are still cached and file.exists() returns true. But at the time of reading, the file will not open. Note: file.exists() actually calls stat64 natively which returns true because it was able to find inode information from the OS. A proposal is to call file.list() on the parent path of the resource, which will call open() natively. If the disk is good it should return an array of paths with length at least 1. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
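A rough sketch of the presence check proposed in the description above (method and variable names are illustrative, not the actual LocalResourcesTrackerImpl code): list the parent directory, which forces a native open() and fails on a bad disk, rather than trusting file.exists() alone.
{code}
import java.io.File;

// Illustrative only: file.exists() can return true from cached inode data even
// when the disk has gone bad; listing the parent directory forces an open()
// and is a stronger check that the localized resource is actually readable.
private static boolean isResourceUsable(File localizedPath) {
  File parent = localizedPath.getParentFile();
  if (parent == null) {
    return localizedPath.exists();
  }
  String[] children = parent.list();   // null if the directory cannot be opened
  return children != null && children.length >= 1 && localizedPath.exists();
}
{code}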
[jira] [Updated] (YARN-3733) DominantRC#compare() does not work as expected if cluster resource is empty
[ https://issues.apache.org/jira/browse/YARN-3733?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rohith updated YARN-3733: - Summary: DominantRC#compare() does not work as expected if cluster resource is empty (was: On RM restart AM getting more than maximum possible memory when many tasks in queue) DominantRC#compare() does not work as expected if cluster resource is empty --- Key: YARN-3733 URL: https://issues.apache.org/jira/browse/YARN-3733 Project: Hadoop YARN Issue Type: Bug Components: resourcemanager Affects Versions: 2.7.0 Environment: Suse 11 Sp3 , 2 NM , 2 RM one NM - 3 GB 6 v core Reporter: Bibin A Chundatt Assignee: Rohith Priority: Blocker Attachments: YARN-3733.patch Steps to reproduce = 1. Install HA with 2 RM 2 NM (3072 MB * 2 total cluster) 2. Configure map and reduce size to 512 MB after changing scheduler minimum size to 512 MB 3. Configure capacity scheduler and AM limit to .5 (DominantResourceCalculator is configured) 4. Submit 30 concurrent task 5. Switch RM Actual = For 12 Jobs AM gets allocated and all 12 starts running No other Yarn child is initiated , *all 12 Jobs in Running state for ever* Expected === Only 6 should be running at a time since max AM allocated is .5 (3072 MB) -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-3749) We should make a copy of configuration when init MiniYARNCluster with multiple RMs
[ https://issues.apache.org/jira/browse/YARN-3749?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14568586#comment-14568586 ] Chun Chen commented on YARN-3749: - bq. It looks like we need keep conf.set(YarnConfiguration.RM_HA_ID, RM1_NODE_ID); in TestRMEmbeddedElector to fix this test failure. Sorry, my bad. Uploaded YARN-3749.7.patch to fix that and added a test in {{TestYarnConfiguration}} to make sure {{YarnConfiguration#updateConnectAddr}} won't add a suffix to NM service address configurations. We should make a copy of configuration when init MiniYARNCluster with multiple RMs -- Key: YARN-3749 URL: https://issues.apache.org/jira/browse/YARN-3749 Project: Hadoop YARN Issue Type: Bug Reporter: Chun Chen Assignee: Chun Chen Attachments: YARN-3749.2.patch, YARN-3749.3.patch, YARN-3749.4.patch, YARN-3749.5.patch, YARN-3749.6.patch, YARN-3749.7.patch, YARN-3749.patch When I was trying to write a test case for YARN-2674, I found DS client trying to connect to both rm1 and rm2 with the same address 0.0.0.0:18032 when RM failover. But I initially set yarn.resourcemanager.address.rm1=0.0.0.0:18032, yarn.resourcemanager.address.rm2=0.0.0.0:28032 After digging, I found it is in ClientRMService where the value of yarn.resourcemanager.address.rm2 changed to 0.0.0.0:18032. See the following code in ClientRMService: {code} clientBindAddress = conf.updateConnectAddr(YarnConfiguration.RM_BIND_HOST, YarnConfiguration.RM_ADDRESS, YarnConfiguration.DEFAULT_RM_ADDRESS, server.getListenerAddress()); {code} Since we use the same instance of configuration in rm1 and rm2 and init both RM before we start both RM, we will change yarn.resourcemanager.ha.id to rm2 during init of rm2 and yarn.resourcemanager.ha.id will become rm2 during starting of rm1. So I think it is safe to make a copy of configuration when init both of the rm. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
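A minimal sketch of the fix idea in this JIRA, assuming Configuration's copy constructor (which copies the properties into a new instance): give each RM in the MiniYARNCluster its own Configuration so that a per-RM setting such as yarn.resourcemanager.ha.id set during one RM's init cannot leak into the other.
{code}
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.yarn.conf.YarnConfiguration;

// Sketch: copy the incoming conf per RM instead of sharing one mutable instance.
Configuration rm1Conf = new YarnConfiguration(conf);   // copy for RM1
rm1Conf.set(YarnConfiguration.RM_HA_ID, "rm1");
Configuration rm2Conf = new YarnConfiguration(conf);   // copy for RM2
rm2Conf.set(YarnConfiguration.RM_HA_ID, "rm2");
// Each ResourceManager is then initialized with its own copy, so updateConnectAddr()
// in one RM cannot rewrite the addresses the other RM resolves.
{code}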
[jira] [Commented] (YARN-3585) NodeManager cannot exit on SHUTDOWN event triggered and NM recovery is enabled
[ https://issues.apache.org/jira/browse/YARN-3585?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14568593#comment-14568593 ] Hadoop QA commented on YARN-3585: - \\ \\ | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | pre-patch | 16m 30s | Pre-patch trunk compilation is healthy. | | {color:green}+1{color} | @author | 0m 0s | The patch does not contain any @author tags. | | {color:red}-1{color} | tests included | 0m 0s | The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. | | {color:green}+1{color} | javac | 7m 51s | There were no new javac warning messages. | | {color:green}+1{color} | javadoc | 9m 56s | There were no new javadoc warning messages. | | {color:green}+1{color} | release audit | 0m 23s | The applied patch does not increase the total number of release audit warnings. | | {color:green}+1{color} | checkstyle | 0m 40s | There were no new checkstyle issues. | | {color:green}+1{color} | whitespace | 0m 0s | The patch has no lines that end in whitespace. | | {color:green}+1{color} | install | 1m 37s | mvn install still works. | | {color:green}+1{color} | eclipse:eclipse | 0m 34s | The patch built with eclipse:eclipse. | | {color:green}+1{color} | findbugs | 1m 14s | The patch does not introduce any new Findbugs (version 3.0.0) warnings. | | {color:green}+1{color} | yarn tests | 6m 14s | Tests passed in hadoop-yarn-server-nodemanager. | | | | 45m 3s | | \\ \\ || Subsystem || Report/Notes || | Patch URL | http://issues.apache.org/jira/secure/attachment/12736738/0001-YARN-3585.patch | | Optional Tests | javadoc javac unit findbugs checkstyle | | git revision | trunk / 990078b | | hadoop-yarn-server-nodemanager test log | https://builds.apache.org/job/PreCommit-YARN-Build/8159/artifact/patchprocess/testrun_hadoop-yarn-server-nodemanager.txt | | Test Results | https://builds.apache.org/job/PreCommit-YARN-Build/8159/testReport/ | | Java | 1.7.0_55 | | uname | Linux asf903.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux | | Console output | https://builds.apache.org/job/PreCommit-YARN-Build/8159/console | This message was automatically generated. NodeManager cannot exit on SHUTDOWN event triggered and NM recovery is enabled -- Key: YARN-3585 URL: https://issues.apache.org/jira/browse/YARN-3585 Project: Hadoop YARN Issue Type: Bug Affects Versions: 2.6.0 Reporter: Peng Zhang Assignee: Rohith Priority: Critical Attachments: 0001-YARN-3585.patch, YARN-3585.patch With NM recovery enabled, after decommission, nodemanager log show stop but process cannot end. 
Non-daemon threads:
{noformat}
DestroyJavaVM prio=10 tid=0x7f3460011800 nid=0x29ec waiting on condition [0x]
leveldb prio=10 tid=0x7f3354001800 nid=0x2a97 runnable [0x]
VM Thread prio=10 tid=0x7f3460167000 nid=0x29f8 runnable
Gang worker#0 (Parallel GC Threads) prio=10 tid=0x7f346002 nid=0x29ed runnable
Gang worker#1 (Parallel GC Threads) prio=10 tid=0x7f3460022000 nid=0x29ee runnable
Gang worker#2 (Parallel GC Threads) prio=10 tid=0x7f3460024000 nid=0x29ef runnable
Gang worker#3 (Parallel GC Threads) prio=10 tid=0x7f3460025800 nid=0x29f0 runnable
Gang worker#4 (Parallel GC Threads) prio=10 tid=0x7f3460027800 nid=0x29f1 runnable
Gang worker#5 (Parallel GC Threads) prio=10 tid=0x7f3460029000 nid=0x29f2 runnable
Gang worker#6 (Parallel GC Threads) prio=10 tid=0x7f346002b000 nid=0x29f3 runnable
Gang worker#7 (Parallel GC Threads) prio=10 tid=0x7f346002d000 nid=0x29f4 runnable
Concurrent Mark-Sweep GC Thread prio=10 tid=0x7f3460120800 nid=0x29f7 runnable
Gang worker#0 (Parallel CMS Threads) prio=10 tid=0x7f346011c800 nid=0x29f5 runnable
Gang worker#1 (Parallel CMS Threads) prio=10 tid=0x7f346011e800 nid=0x29f6 runnable
VM Periodic Task Thread prio=10 tid=0x7f346019f800 nid=0x2a01 waiting on condition
{noformat}
and the JNI leveldb thread stack:
{noformat}
Thread 12 (Thread 0x7f33dd842700 (LWP 10903)):
#0 0x003d8340b43c in pthread_cond_wait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
#1 0x7f33dfce2a3b in leveldb::(anonymous namespace)::PosixEnv::BGThreadWrapper(void*) () from /tmp/libleveldbjni-64-1-6922178968300745716.8
#2 0x003d83407851 in start_thread () from /lib64/libpthread.so.0
#3 0x003d830e811d in clone () from /lib64/libc.so.6
{noformat}
-- This message was sent by Atlassian JIRA
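The dump shows why the process hangs: the leveldb helper used by NM recovery runs as a non-daemon native thread, so even after all NM services stop, the JVM will not exit on its own. A hedged sketch (not the attached YARN-3585 patch; the stop logic is a placeholder) of one way to guarantee the process exits once shutdown has completed:
{code}
// Hedged sketch only -- not the actual NodeManager shutdown code. It shows one
// way to guarantee process exit when a lingering non-daemon thread (such as
// leveldb's native background thread) would otherwise keep the JVM alive
// after all services have been stopped.
import org.apache.hadoop.util.ExitUtil;

public class ForceExitSketch {
  public static void main(String[] args) {
    try {
      // Placeholder: stop NM services and close the recovery state store
      // (which owns the leveldb handle) here.
    } finally {
      // Explicitly terminate the JVM; waiting for all non-daemon threads to
      // finish is not enough when a JNI thread stays runnable.
      ExitUtil.terminate(0);
    }
  }
}
{code}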
[jira] [Commented] (YARN-3755) Log the command of launching containers
[ https://issues.apache.org/jira/browse/YARN-3755?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14568606#comment-14568606 ] Hadoop QA commented on YARN-3755: - \\ \\ | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | pre-patch | 15m 43s | Pre-patch trunk compilation is healthy. | | {color:green}+1{color} | @author | 0m 0s | The patch does not contain any @author tags. | | {color:red}-1{color} | tests included | 0m 0s | The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. | | {color:green}+1{color} | javac | 7m 35s | There were no new javac warning messages. | | {color:green}+1{color} | javadoc | 9m 41s | There were no new javadoc warning messages. | | {color:green}+1{color} | release audit | 0m 23s | The applied patch does not increase the total number of release audit warnings. | | {color:red}-1{color} | checkstyle | 0m 37s | The applied patch generated 3 new checkstyle issues (total was 58, now 60). | | {color:green}+1{color} | whitespace | 0m 0s | The patch has no lines that end in whitespace. | | {color:green}+1{color} | install | 1m 34s | mvn install still works. | | {color:green}+1{color} | eclipse:eclipse | 0m 33s | The patch built with eclipse:eclipse. | | {color:green}+1{color} | findbugs | 1m 13s | The patch does not introduce any new Findbugs (version 3.0.0) warnings. | | {color:green}+1{color} | yarn tests | 6m 9s | Tests passed in hadoop-yarn-server-nodemanager. | | | | 43m 32s | | \\ \\ || Subsystem || Report/Notes || | Patch URL | http://issues.apache.org/jira/secure/attachment/12736742/YARN-3755-1.patch | | Optional Tests | javadoc javac unit findbugs checkstyle | | git revision | trunk / 990078b | | checkstyle | https://builds.apache.org/job/PreCommit-YARN-Build/8160/artifact/patchprocess/diffcheckstylehadoop-yarn-server-nodemanager.txt | | hadoop-yarn-server-nodemanager test log | https://builds.apache.org/job/PreCommit-YARN-Build/8160/artifact/patchprocess/testrun_hadoop-yarn-server-nodemanager.txt | | Test Results | https://builds.apache.org/job/PreCommit-YARN-Build/8160/testReport/ | | Java | 1.7.0_55 | | uname | Linux asf905.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux | | Console output | https://builds.apache.org/job/PreCommit-YARN-Build/8160/console | This message was automatically generated. Log the command of launching containers --- Key: YARN-3755 URL: https://issues.apache.org/jira/browse/YARN-3755 Project: Hadoop YARN Issue Type: Improvement Affects Versions: 2.7.0 Reporter: Jeff Zhang Assignee: Jeff Zhang Attachments: YARN-3755-1.patch In the ResourceManager log, YARN logs the command used to launch the AM, which is very useful. But there is no such log in the NodeManager (NM) log for launching containers, so it can be difficult to diagnose why a container fails to launch due to an issue in its commands. Although users can look at the commands in the container launch script file, that is an internal detail of YARN that users usually don't know about. From a user's perspective, they only know the commands they specified when building the YARN application.
{code}
2015-06-01 16:06:42,245 INFO org.apache.hadoop.yarn.server.resourcemanager.amlauncher.AMLauncher: Command to launch container container_1433145984561_0001_01_01 : $JAVA_HOME/bin/java -server -Djava.net.preferIPv4Stack=true -Dhadoop.metrics.log.level=WARN -Xmx1024m -Dlog4j.configuratorClass=org.apache.tez.common.TezLog4jConfigurator -Dlog4j.configuration=tez-container-log4j.properties -Dyarn.app.container.log.dir=<LOG_DIR> -Dtez.root.logger=info,CLA -Dsun.nio.ch.bugLevel='' org.apache.tez.dag.app.DAGAppMaster 1><LOG_DIR>/stdout 2><LOG_DIR>/stderr
{code}
-- This message was sent by Atlassian JIRA (v6.3.4#6332)
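As a rough illustration of the kind of NodeManager-side logging the issue asks for (the class, method, and hook point below are hypothetical and are not the YARN-3755 patch):
{code}
// Hypothetical helper, for illustration only; not the YARN-3755 patch.
import java.util.List;
import org.apache.commons.logging.Log;
import org.apache.commons.logging.LogFactory;

public class ContainerCommandLogger {
  private static final Log LOG = LogFactory.getLog(ContainerCommandLogger.class);

  // Intended to be called from the NodeManager's container-launch path so the
  // submitted commands appear in the NM log, mirroring the "Command to launch
  // container" line that AMLauncher already prints on the RM side.
  public static void logLaunchCommands(String containerId, List<String> commands) {
    if (LOG.isInfoEnabled()) {
      StringBuilder sb = new StringBuilder();
      for (String cmd : commands) {
        sb.append(cmd).append(' ');
      }
      LOG.info("Commands to launch container " + containerId + " : " + sb.toString().trim());
    }
  }
}
{code}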