[jira] [Updated] (HDFS-9068) SBN checkpoint could not work after the only name directory recovery from failure

2015-09-13 Thread He Xiaoqiao (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-9068?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

He Xiaoqiao updated HDFS-9068:
--
Attachment: HDFS-9068.patch

Attached a patch: before saving the fsimage, check whether a failed storage directory is usable again.

> SBN checkpoint could not work after the only name directory recovery from 
> failure
> -
>
> Key: HDFS-9068
> URL: https://issues.apache.org/jira/browse/HDFS-9068
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: namenode
>Affects Versions: 2.4.1
>Reporter: He Xiaoqiao
> Attachments: HDFS-9068.patch
>
>
> The SBN periodically checkpoints to {{dfs.namenode.name.dir}}, but the 
> checkpointer stops working when {{dfs.namenode.name.dir}} is configured with 
> only one directory and the disk holding that directory fails and later 
> recovers.
> The affected class is org.apache.hadoop.hdfs.server.namenode.FSImage:
> {code:title=org.apache.hadoop.hdfs.server.namenode.FSImage.java|borderStyle=solid}
> @Override
> public void run() {
>   try {
>     saveFSImage(context, sd, nnf);
>   } catch (SaveNamespaceCancelledException snce) {
>     LOG.info("Cancelled image saving for " + sd.getRoot() +
>         ": " + snce.getMessage());
>     // don't report an error on the storage dir!
>   } catch (Throwable t) {
>     LOG.error("Unable to save image for " + sd.getRoot(), t);
>     context.reportErrorOnStorageDirectory(sd);
>   }
> }
> {code}
> When {{saveFSImage(context, sd, nnf)}} fails because the storage is 
> unavailable or has failed, {{context.reportErrorOnStorageDirectory(sd)}} adds 
> sd to errorSDs, and sd is never used again, even after it recovers from the 
> failure. Because the checkpointer keeps failing, the JournalNodes accumulate a 
> large number of editlog files, and the NameNode then takes a long time to 
> restart.
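
For illustration only, the idea stated in the patch summary (check whether a failed directory is usable again before saving the fsimage) could be sketched as follows. This is not the attached patch: {{StorageDirs}}, {{reportError}}, {{restoreRecovered}}, and {{activeDirs}} are hypothetical names, and the usability test ({{isDirectory}} plus {{canWrite}}) is an assumption about what "OK" means.

```java
import java.io.File;
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;

/** Hypothetical stand-in for the NameNode's storage-directory bookkeeping. */
final class StorageDirs {
  private final List<File> active = new ArrayList<>();
  private final List<File> failed = new ArrayList<>();

  StorageDirs(List<File> dirs) {
    active.addAll(dirs);
  }

  /** Same effect as reporting an error on a directory: stop using it. */
  void reportError(File dir) {
    if (active.remove(dir)) {
      failed.add(dir);
    }
  }

  /** Call before each save: failed dirs that are usable again become active. */
  void restoreRecovered() {
    Iterator<File> it = failed.iterator();
    while (it.hasNext()) {
      File dir = it.next();
      // Assumed health check; a real check would likely be stricter.
      if (dir.isDirectory() && dir.canWrite()) {
        it.remove();
        active.add(dir);
      }
    }
  }

  /** Returns a copy of the directories currently eligible for saving. */
  List<File> activeDirs() {
    return new ArrayList<>(active);
  }
}
```

Without a step like {{restoreRecovered()}}, a configuration with a single directory is left with no active directory after the first failure, which matches the behaviour described above.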



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-8993) Balancer throws NPE

2015-08-29 Thread He Xiaoqiao (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-8993?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

He Xiaoqiao updated HDFS-8993:
--
Description: 
Balancer may throw an NPE because of the following {{if}} statement:
{code:title=Balancer.java|firstline=705}
synchronized (block) {
  // update locations
  for (String datanodeUuid : blk.getDatanodeUuids()) {
    final BalancerDatanode d = datanodeMap.get(datanodeUuid);
    if (datanode != null) { // not an unknown datanode
      block.addLocation(d);
    }
  }
}
{code}
Before moving blocks, the Balancer gets its data from the NN in two steps: 
first the info of all DNs ({{getDatanodeReport}}), then some of the blocks on 
each DN ({{getBlockList}}). If a DN is commissioned after 
{{getDatanodeReport}} but before {{getBlockList}}, and a block happens to be 
located on that DN, the lookup in {{datanodeMap}} returns null for it and the 
dispatcher throws an NPE.

  was:
Balancer may throw an NPE because of the following {{if}} statement:
{code:java|firstline=705}
synchronized (block) {
  // update locations
  for (String datanodeUuid : blk.getDatanodeUuids()) {
    final BalancerDatanode d = datanodeMap.get(datanodeUuid);
    if (datanode != null) { // not an unknown datanode
      block.addLocation(d);
    }
  }
}
{code}
Before moving blocks, the Balancer gets its data from the NN in two steps: 
first the info of all DNs ({{getDatanodeReport}}), then some of the blocks on 
each DN ({{getBlockList}}). If a DN is commissioned after 
{{getDatanodeReport}} but before {{getBlockList}}, and a block happens to be 
located on that DN, the lookup in {{datanodeMap}} returns null for it and the 
dispatcher throws an NPE.


 Balancer throws NPE
 ---

 Key: HDFS-8993
 URL: https://issues.apache.org/jira/browse/HDFS-8993
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: balancer & mover
Affects Versions: 2.4.1
Reporter: He Xiaoqiao

 Balancer may throw an NPE because of the following {{if}} statement:
 {code:title=Balancer.java|firstline=705}
 synchronized (block) {
   // update locations
   for (String datanodeUuid : blk.getDatanodeUuids()) {
     final BalancerDatanode d = datanodeMap.get(datanodeUuid);
     if (datanode != null) { // not an unknown datanode
       block.addLocation(d);
     }
   }
 }
 {code}
 Before moving blocks, the Balancer gets its data from the NN in two steps: 
 first the info of all DNs ({{getDatanodeReport}}), then some of the blocks on 
 each DN ({{getBlockList}}). If a DN is commissioned after 
 {{getDatanodeReport}} but before {{getBlockList}}, and a block happens to be 
 located on that DN, the lookup in {{datanodeMap}} returns null for it and the 
 dispatcher throws an NPE.





[jira] [Commented] (HDFS-8992) Balancer throws NPE

2015-08-29 Thread He Xiaoqiao (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-8992?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14721016#comment-14721016
 ] 

He Xiaoqiao commented on HDFS-8992:
---

Thanks, Tsz Wo Nicholas Sze, for your comments. There is only one line of stack 
trace:
{code}
2015-06-29 14:06:35,280 WARN org.apache.hadoop.hdfs.server.balancer.Balancer: Dispatcher thread failed
java.lang.NullPointerException
{code}

 Balancer throws NPE
 ---

 Key: HDFS-8992
 URL: https://issues.apache.org/jira/browse/HDFS-8992
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: balancer & mover
Affects Versions: 2.4.1
Reporter: He Xiaoqiao

 Balancer may throw an NPE because of the following {{if}} statement:
 {code:java|firstline=705}
 synchronized (block) {
   // update locations
   for (String datanodeUuid : blk.getDatanodeUuids()) {
     final BalancerDatanode d = datanodeMap.get(datanodeUuid);
     if (datanode != null) { // not an unknown datanode
       block.addLocation(d);
     }
   }
 }
 {code}
 Before moving blocks, the Balancer gets its data from the NN in two steps: 
 first the info of all DNs ({{getDatanodeReport}}), then some of the blocks on 
 each DN ({{getBlockList}}). If a DN is commissioned after 
 {{getDatanodeReport}} but before {{getBlockList}}, and a block happens to be 
 located on that DN, the lookup in {{datanodeMap}} returns null for it and the 
 dispatcher throws an NPE.





[jira] [Updated] (HDFS-8993) Balancer throws NPE

2015-08-29 Thread He Xiaoqiao (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-8993?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

He Xiaoqiao updated HDFS-8993:
--
Description: 
Balancer may throw an NPE because of the following {{if}} statement at line 709 
of Balancer.java:
{code:title=Balancer.java}
synchronized (block) {
  // update locations
  for (String datanodeUuid : blk.getDatanodeUuids()) {
    final BalancerDatanode d = datanodeMap.get(datanodeUuid);
    if (datanode != null) { // not an unknown datanode
      block.addLocation(d);
    }
  }
}
{code}
Before moving blocks, the Balancer gets its data from the NN in two steps: 
first the info of all DNs ({{getDatanodeReport}}), then some of the blocks on 
each DN ({{getBlockList}}). If a DN is commissioned after 
{{getDatanodeReport}} but before {{getBlockList}}, and a block happens to be 
located on that DN, the lookup in {{datanodeMap}} returns null for it and the 
dispatcher throws an NPE.

  was:
Balancer may throw an NPE because of the following {{if}} statement:
{code:title=Balancer.java|firstline=705}
synchronized (block) {
  // update locations
  for (String datanodeUuid : blk.getDatanodeUuids()) {
    final BalancerDatanode d = datanodeMap.get(datanodeUuid);
    if (datanode != null) { // not an unknown datanode
      block.addLocation(d);
    }
  }
}
{code}
Before moving blocks, the Balancer gets its data from the NN in two steps: 
first the info of all DNs ({{getDatanodeReport}}), then some of the blocks on 
each DN ({{getBlockList}}). If a DN is commissioned after 
{{getDatanodeReport}} but before {{getBlockList}}, and a block happens to be 
located on that DN, the lookup in {{datanodeMap}} returns null for it and the 
dispatcher throws an NPE.


[jira] [Resolved] (HDFS-8992) Balancer throws NPE

2015-08-29 Thread He Xiaoqiao (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-8992?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

He Xiaoqiao resolved HDFS-8992.
---
Resolution: Duplicate


[jira] [Commented] (HDFS-8992) Balancer throws NPE

2015-08-29 Thread He Xiaoqiao (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-8992?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14721030#comment-14721030
 ] 

He Xiaoqiao commented on HDFS-8992:
---

Thanks [~brahmareddy] for your comment. I will close this ticket.


[jira] [Created] (HDFS-8992) Balancer throws NPE

2015-08-29 Thread He Xiaoqiao (JIRA)
He Xiaoqiao created HDFS-8992:
-

 Summary: Balancer throws NPE
 Key: HDFS-8992
 URL: https://issues.apache.org/jira/browse/HDFS-8992
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: balancer & mover
Affects Versions: 2.4.1
Reporter: He Xiaoqiao


Balancer may throw an NPE because of the following {{if}} statement:
{code:java|firstline=705}
synchronized (block) {
  // update locations
  for (String datanodeUuid : blk.getDatanodeUuids()) {
    final BalancerDatanode d = datanodeMap.get(datanodeUuid);
    if (datanode != null) { // not an unknown datanode
      block.addLocation(d);
    }
  }
}
{code}
Before moving blocks, the Balancer gets its data from the NN in two steps: 
first the info of all DNs ({{getDatanodeReport}}), then some of the blocks on 
each DN ({{getBlockList}}). If a DN is commissioned after 
{{getDatanodeReport}} but before {{getBlockList}}, and a block happens to be 
located on that DN, the lookup in {{datanodeMap}} returns null for it and the 
dispatcher throws an NPE.





[jira] [Created] (HDFS-8993) Balancer throws NPE

2015-08-29 Thread He Xiaoqiao (JIRA)
He Xiaoqiao created HDFS-8993:
-

 Summary: Balancer throws NPE
 Key: HDFS-8993
 URL: https://issues.apache.org/jira/browse/HDFS-8993
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: balancer & mover
Affects Versions: 2.4.1
Reporter: He Xiaoqiao


Balancer may throw an NPE because of the following {{if}} statement:
{code:java|firstline=705}
synchronized (block) {
  // update locations
  for (String datanodeUuid : blk.getDatanodeUuids()) {
    final BalancerDatanode d = datanodeMap.get(datanodeUuid);
    if (datanode != null) { // not an unknown datanode
      block.addLocation(d);
    }
  }
}
{code}
Before moving blocks, the Balancer gets its data from the NN in two steps: 
first the info of all DNs ({{getDatanodeReport}}), then some of the blocks on 
each DN ({{getBlockList}}). If a DN is commissioned after 
{{getDatanodeReport}} but before {{getBlockList}}, and a block happens to be 
located on that DN, the lookup in {{datanodeMap}} returns null for it and the 
dispatcher throws an NPE.





[jira] [Updated] (HDFS-8992) Balancer throws NPE

2015-08-29 Thread He Xiaoqiao (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-8992?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

He Xiaoqiao updated HDFS-8992:
--
Description: 
Balancer may throw an NPE because of the following {{if}} statement at line 709 
of Balancer.java:
{code:java|firstline=705}
synchronized (block) {
  // update locations
  for (String datanodeUuid : blk.getDatanodeUuids()) {
    final BalancerDatanode d = datanodeMap.get(datanodeUuid);
    if (datanode != null) { // not an unknown datanode
      block.addLocation(d);
    }
  }
}
{code}
Before moving blocks, the Balancer gets its data from the NN in two steps: 
first the info of all DNs ({{getDatanodeReport}}), then some of the blocks on 
each DN ({{getBlockList}}). If a DN is commissioned after 
{{getDatanodeReport}} but before {{getBlockList}}, and a block happens to be 
located on that DN, the lookup in {{datanodeMap}} returns null for it and the 
dispatcher throws an NPE.

  was:
Balancer may throw an NPE because of the following {{if}} statement:
{code:java|firstline=705}
synchronized (block) {
  // update locations
  for (String datanodeUuid : blk.getDatanodeUuids()) {
    final BalancerDatanode d = datanodeMap.get(datanodeUuid);
    if (datanode != null) { // not an unknown datanode
      block.addLocation(d);
    }
  }
}
{code}
Before moving blocks, the Balancer gets its data from the NN in two steps: 
first the info of all DNs ({{getDatanodeReport}}), then some of the blocks on 
each DN ({{getBlockList}}). If a DN is commissioned after 
{{getDatanodeReport}} but before {{getBlockList}}, and a block happens to be 
located on that DN, the lookup in {{datanodeMap}} returns null for it and the 
dispatcher throws an NPE.


[jira] [Commented] (HDFS-8993) Balancer throws NPE

2015-08-29 Thread He Xiaoqiao (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-8993?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14721021#comment-14721021
 ] 

He Xiaoqiao commented on HDFS-8993:
---

Sorry for creating the same ticket twice; the first creation did not seem to 
complete. BTW, how do I delete this ticket?


[jira] [Commented] (HDFS-8992) Balancer throws NPE

2015-08-29 Thread He Xiaoqiao (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-8992?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14721018#comment-14721018
 ] 

He Xiaoqiao commented on HDFS-8992:
---

Actually, it can be fixed simply by adding a condition to the {{if}} statement:
{code}
synchronized (block) {
  // update locations
  for (String datanodeUuid : blk.getDatanodeUuids()) {
    final BalancerDatanode d = datanodeMap.get(datanodeUuid);
    if (datanode != null && null != d) { // not an unknown datanode
      block.addLocation(d);
    }
  }
}
{code}
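
As a self-contained illustration of the same guard outside of Hadoop, here is a minimal sketch. {{LocationUpdate}}, {{Node}}, and {{knownLocations}} are hypothetical names standing in for the Balancer types; the only point is that the value returned by the map lookup is the one that gets null-checked.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

final class LocationUpdate {
  /** Hypothetical, simplified stand-in for BalancerDatanode. */
  static final class Node {
    final String uuid;

    Node(String uuid) {
      this.uuid = uuid;
    }
  }

  /**
   * Resolves reported UUIDs against a map built from an earlier datanode
   * report, skipping UUIDs that the map does not contain.
   */
  static List<Node> knownLocations(Map<String, Node> datanodeMap, String[] uuids) {
    List<Node> locations = new ArrayList<>();
    for (String uuid : uuids) {
      // null when the DN joined after the map was built
      final Node d = datanodeMap.get(uuid);
      if (d != null) { // check the looked-up value itself
        locations.add(d);
      }
    }
    return locations;
  }
}
```

Calling {{knownLocations}} with a UUID that is missing from the map simply skips that location instead of passing null on.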


[jira] [Commented] (HDFS-8973) NameNode exit without any exception log

2015-08-27 Thread He Xiaoqiao (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-8973?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14716600#comment-14716600
 ] 

He Xiaoqiao commented on HDFS-8973:
---

Thanks, Kanaka, for your comments. The following are the main GC messages in 
the .out file before the process exited. According to this output and our 
monitoring system, GC was working normally. At 19:35:02.508, when 'log4j:ERROR 
Failed to flush writer' appeared, no more logs were written to the .log file, 
but GC info continued to be printed for about 5 minutes, until 19:40:10, when 
the NameNode process exited. At that time the JVM still had enough memory.
{code:borderStyle=solid}
2015-08-26T19:34:30.537+0800: [GC [ParNew: 8315771K->63022K(9292032K), 0.1909130 secs] 96423904K->88172502K(133185344K), 0.1910150 secs] [Times: user=3.37 sys=0.01, real=0.19 secs]
2015-08-26T19:34:42.296+0800: [GC [ParNew: 8322670K->71664K(9292032K), 0.2214550 secs] 96432150K->88183374K(133185344K), 0.2215720 secs] [Times: user=3.92 sys=0.01, real=0.22 secs]
2015-08-26T19:34:52.412+0800: [GC [ParNew: 8331312K->82431K(9292032K), 0.2173850 secs] 96443022K->88195492K(133185344K), 0.2174950 secs] [Times: user=3.86 sys=0.00, real=0.22 secs]
2015-08-26T19:35:02.508+0800: [GC [ParNew: 8342079K->101837K(9292032K), 0.1873830 secs] 96455140K->88216487K(133185344K), 0.1874800 secs] [Times: user=3.26 sys=0.02, real=0.18 secs]
log4j:ERROR Failed to flush writer,
java.io.IOException: Bad file descriptor
    at java.io.FileOutputStream.writeBytes(Native Method)
    at java.io.FileOutputStream.write(FileOutputStream.java:318)
    at sun.nio.cs.StreamEncoder.writeBytes(StreamEncoder.java:221)
    at sun.nio.cs.StreamEncoder.implFlushBuffer(StreamEncoder.java:291)
    at sun.nio.cs.StreamEncoder.implFlush(StreamEncoder.java:295)
    at sun.nio.cs.StreamEncoder.flush(StreamEncoder.java:141)
    at java.io.OutputStreamWriter.flush(OutputStreamWriter.java:229)
    at org.apache.log4j.helpers.QuietWriter.flush(QuietWriter.java:59)
    at org.apache.log4j.WriterAppender.subAppend(WriterAppender.java:324)
    at org.apache.log4j.RollingFileAppender.subAppend(RollingFileAppender.java:276)
    at org.apache.log4j.WriterAppender.append(WriterAppender.java:162)
    at org.apache.log4j.AppenderSkeleton.doAppend(AppenderSkeleton.java:251)
    at org.apache.log4j.helpers.AppenderAttachableImpl.appendLoopOnAppenders(AppenderAttachableImpl.java:66)
    at org.apache.log4j.Category.callAppenders(Category.java:206)
    at org.apache.log4j.Category.forcedLog(Category.java:391)
    at org.apache.log4j.Category.log(Category.java:856)
    at org.apache.commons.logging.impl.Log4JLogger.info(Log4JLogger.java:176)
    at org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.logAddStoredBlock(BlockManager.java:2391)
    at org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.addStoredBlock(BlockManager.java:2312)
    at org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.processAndHandleReportedBlock(BlockManager.java:2919)
    at org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.addBlock(BlockManager.java:2894)
    at org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.processIncrementalBlockReport(BlockManager.java:2976)
    at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.processIncrementalBlockReport(FSNamesystem.java:5432)
    at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.blockReceivedAndDeleted(NameNodeRpcServer.java:1061)
    at org.apache.hadoop.hdfs.protocolPB.DatanodeProtocolServerSideTranslatorPB.blockReceivedAndDeleted(DatanodeProtocolServerSideTranslatorPB.java:209)
    at org.apache.hadoop.hdfs.protocol.proto.DatanodeProtocolProtos$DatanodeProtocolService$2.callBlockingMethod(DatanodeProtocolProtos.java:28065)
    at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:585)
    at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:928)
    at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2013)
    at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2009)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:415)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1556)
    at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2007)
2015-08-26T19:35:14.959+0800: [GC [ParNew: 8361485K->93419K(9292032K), 0.1904630 secs] 96476135K->88211796K(133185344K), 0.1905540 secs] [Times: user=3.38 sys=0.00, real=0.19 secs]
2015-08-26T19:35:25.424+0800: [GC [ParNew: 8353067K->54117K(9292032K), 0.1892230 secs] 96471444K->88174133K(133185344K), 0.1893260 secs] [Times: user=3.31 sys=0.01, real=0.19 secs]
2015-08-26T19:35:36.512+0800: [GC [ParNew: 8313765K->55946K(9292032K), 0.1901160 secs] 96433781K->88177578K(133185344K), 0.1902050 secs] 

[jira] [Commented] (HDFS-8973) NameNode exit without any exception log

2015-08-27 Thread He Xiaoqiao (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-8973?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14716414#comment-14716414
 ] 

He Xiaoqiao commented on HDFS-8973:
---

Thanks, Kanaka. Before logging stopped, everything actually looked fine: 
neither WARN nor ERROR occurred. After that, GC info continued to be printed to 
the .out file for about 5 minutes and also looked fine. Apart from 'log4j:ERROR 
Failed to flush writer', there was no other useful info. 
I suspect that multiple threads were using the same log4j handler: when the log 
file was rolled, one thread closed the stream while another thread continued 
writing to it, so an exception was thrown and interrupted that thread. Once all 
NameNode threads had been interrupted, the main thread exited.
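
To make the suspected failure mode concrete, here is a minimal sketch of what happens in plain Java when a stream is written to after it has been closed. It is not log4j code and does not test the hypothesis above: the close and the write happen sequentially in one thread, {{ClosedStreamWrite}} and {{writeAfterCloseFails}} are hypothetical names, and the exception message produced here is not expected to match the 'Bad file descriptor' text in the .out file.

```java
import java.io.File;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;

final class ClosedStreamWrite {
  /** Returns true if a write issued after close() fails with an IOException. */
  static boolean writeAfterCloseFails() {
    FileOutputStream out;
    try {
      File f = File.createTempFile("rolling", ".log");
      f.deleteOnExit();
      out = new FileOutputStream(f);
      out.write("before roll\n".getBytes(StandardCharsets.UTF_8));
      out.close(); // stands in for the thread that rolls the log file
    } catch (IOException e) {
      throw new UncheckedIOException(e); // setup failed, not the case under test
    }
    try {
      // stands in for the other thread that still holds the old stream
      out.write("after roll\n".getBytes(StandardCharsets.UTF_8));
      return false;
    } catch (IOException expected) {
      return true;
    }
  }
}
```

The sketch only shows that a stale stream reference fails on write; whether that can actually happen inside the appender here is exactly the open question.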

 NameNode exit without any exception log
 ---

 Key: HDFS-8973
 URL: https://issues.apache.org/jira/browse/HDFS-8973
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: namenode
Affects Versions: 2.4.1
Reporter: He Xiaoqiao
Priority: Critical

 The NameNode process exited without any useful WARN/ERROR log. After output 
 to the .log file stopped, the .out file continued to show GC logs for about 5 
 minutes. When the .log file stopped, the .out file printed the following 
 ERROR, which may be a hint: the exit seems to be caused by this log4j ERROR.
 {code:title=namenode.out|borderStyle=solid}
 log4j:ERROR Failed to flush writer,
 java.io.IOException: Bad file descriptor
     at java.io.FileOutputStream.writeBytes(Native Method)
     at java.io.FileOutputStream.write(FileOutputStream.java:318)
     at sun.nio.cs.StreamEncoder.writeBytes(StreamEncoder.java:221)
     at sun.nio.cs.StreamEncoder.implFlushBuffer(StreamEncoder.java:291)
     at sun.nio.cs.StreamEncoder.implFlush(StreamEncoder.java:295)
     at sun.nio.cs.StreamEncoder.flush(StreamEncoder.java:141)
     at java.io.OutputStreamWriter.flush(OutputStreamWriter.java:229)
     at org.apache.log4j.helpers.QuietWriter.flush(QuietWriter.java:59)
     at org.apache.log4j.WriterAppender.subAppend(WriterAppender.java:324)
     at org.apache.log4j.RollingFileAppender.subAppend(RollingFileAppender.java:276)
     at org.apache.log4j.WriterAppender.append(WriterAppender.java:162)
     at org.apache.log4j.AppenderSkeleton.doAppend(AppenderSkeleton.java:251)
     at org.apache.log4j.helpers.AppenderAttachableImpl.appendLoopOnAppenders(AppenderAttachableImpl.java:66)
     at org.apache.log4j.Category.callAppenders(Category.java:206)
     at org.apache.log4j.Category.forcedLog(Category.java:391)
     at org.apache.log4j.Category.log(Category.java:856)
     at org.apache.commons.logging.impl.Log4JLogger.info(Log4JLogger.java:176)
     at org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.logAddStoredBlock(BlockManager.java:2391)
     at org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.addStoredBlock(BlockManager.java:2312)
     at org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.processAndHandleReportedBlock(BlockManager.java:2919)
     at org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.addBlock(BlockManager.java:2894)
     at org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.processIncrementalBlockReport(BlockManager.java:2976)
     at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.processIncrementalBlockReport(FSNamesystem.java:5432)
     at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.blockReceivedAndDeleted(NameNodeRpcServer.java:1061)
     at org.apache.hadoop.hdfs.protocolPB.DatanodeProtocolServerSideTranslatorPB.blockReceivedAndDeleted(DatanodeProtocolServerSideTranslatorPB.java:209)
     at org.apache.hadoop.hdfs.protocol.proto.DatanodeProtocolProtos$DatanodeProtocolService$2.callBlockingMethod(DatanodeProtocolProtos.java:28065)
     at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:585)
     at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:928)
     at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2013)
     at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2009)
     at java.security.AccessController.doPrivileged(Native Method)
     at javax.security.auth.Subject.doAs(Subject.java:415)
     at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1556)
     at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2007)
 {code}





[jira] [Updated] (HDFS-8973) NameNode exit without any exception log

2015-08-27 Thread He Xiaoqiao (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-8973?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

He Xiaoqiao updated HDFS-8973:
--
Description: 
The NameNode process exited without any useful WARN/ERROR log. After output to 
the .log file stopped, the .out file continued to show GC logs for about 5 
minutes. When the .log file stopped, the .out file printed the following 
ERROR, which may be a hint: the exit seems to be caused by this log4j ERROR.

{code:title=namenode.out|borderStyle=solid}
log4j:ERROR Failed to flush writer,
java.io.IOException: Bad file descriptor
    at java.io.FileOutputStream.writeBytes(Native Method)
    at java.io.FileOutputStream.write(FileOutputStream.java:318)
    at sun.nio.cs.StreamEncoder.writeBytes(StreamEncoder.java:221)
    at sun.nio.cs.StreamEncoder.implFlushBuffer(StreamEncoder.java:291)
    at sun.nio.cs.StreamEncoder.implFlush(StreamEncoder.java:295)
    at sun.nio.cs.StreamEncoder.flush(StreamEncoder.java:141)
    at java.io.OutputStreamWriter.flush(OutputStreamWriter.java:229)
    at org.apache.log4j.helpers.QuietWriter.flush(QuietWriter.java:59)
    at org.apache.log4j.WriterAppender.subAppend(WriterAppender.java:324)
    at org.apache.log4j.RollingFileAppender.subAppend(RollingFileAppender.java:276)
    at org.apache.log4j.WriterAppender.append(WriterAppender.java:162)
    at org.apache.log4j.AppenderSkeleton.doAppend(AppenderSkeleton.java:251)
    at org.apache.log4j.helpers.AppenderAttachableImpl.appendLoopOnAppenders(AppenderAttachableImpl.java:66)
    at org.apache.log4j.Category.callAppenders(Category.java:206)
    at org.apache.log4j.Category.forcedLog(Category.java:391)
    at org.apache.log4j.Category.log(Category.java:856)
    at org.apache.commons.logging.impl.Log4JLogger.info(Log4JLogger.java:176)
    at org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.logAddStoredBlock(BlockManager.java:2391)
    at org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.addStoredBlock(BlockManager.java:2312)
    at org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.processAndHandleReportedBlock(BlockManager.java:2919)
    at org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.addBlock(BlockManager.java:2894)
    at org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.processIncrementalBlockReport(BlockManager.java:2976)
    at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.processIncrementalBlockReport(FSNamesystem.java:5432)
    at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.blockReceivedAndDeleted(NameNodeRpcServer.java:1061)
    at org.apache.hadoop.hdfs.protocolPB.DatanodeProtocolServerSideTranslatorPB.blockReceivedAndDeleted(DatanodeProtocolServerSideTranslatorPB.java:209)
    at org.apache.hadoop.hdfs.protocol.proto.DatanodeProtocolProtos$DatanodeProtocolService$2.callBlockingMethod(DatanodeProtocolProtos.java:28065)
    at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:585)
    at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:928)
    at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2013)
    at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2009)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:415)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1556)
    at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2007)
{code}

  was:
The NameNode process exits without any useful WARN/ERROR log. After output to the 
.log file is interrupted, the .out file continues to show GC logs for about 5 
minutes. When the .log file is interrupted, the .out file prints the following 
ERROR, which may offer a hint. The exit seems to be caused by a log4j ERROR.

{code:title=namenode.out|borderStyle=solid}
log4j:ERROR Failed to flush writer,
Total time for which application threads were stopped: 0.0047800 seconds
java.io.IOException: Bad file descriptor
at java.io.FileOutputStream.writeBytes(Native Method)
at java.io.FileOutputStream.write(FileOutputStream.java:318)
at sun.nio.cs.StreamEncoder.writeBytes(StreamEncoder.java:221)
at sun.nio.cs.StreamEncoder.implFlushBuffer(StreamEncoder.java:291)
at sun.nio.cs.StreamEncoder.implFlush(StreamEncoder.java:295)
at sun.nio.cs.StreamEncoder.flush(StreamEncoder.java:141)
at java.io.OutputStreamWriter.flush(OutputStreamWriter.java:229)
at org.apache.log4j.helpers.QuietWriter.flush(QuietWriter.java:59)
at org.apache.log4j.WriterAppender.subAppend(WriterAppender.java:324)
at org.apache.log4j.RollingFileAppender.subAppend(RollingFileAppender.java:276)
at org.apache.log4j.WriterAppender.append(WriterAppender.java:162)
at org.apache.log4j.AppenderSkeleton.doAppend(AppenderSkeleton.java:251)
at 

[jira] [Created] (HDFS-8973) NameNode exit without any exception log

2015-08-27 Thread He Xiaoqiao (JIRA)
He Xiaoqiao created HDFS-8973:
-

 Summary: NameNode exit without any exception log
 Key: HDFS-8973
 URL: https://issues.apache.org/jira/browse/HDFS-8973
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: namenode
Affects Versions: 2.4.1
Reporter: He Xiaoqiao
Priority: Critical


The NameNode process exits without any useful WARN/ERROR log. After output to the 
.log file is interrupted, the .out file continues to show GC logs for about 5 
minutes. When the .log file is interrupted, the .out file prints the following 
ERROR, which may offer a hint. The exit seems to be caused by a log4j ERROR.

{code:title=namenode.out|borderStyle=solid}
log4j:ERROR Failed to flush writer,
Total time for which application threads were stopped: 0.0047800 seconds
java.io.IOException: Bad file descriptor
at java.io.FileOutputStream.writeBytes(Native Method)
at java.io.FileOutputStream.write(FileOutputStream.java:318)
at sun.nio.cs.StreamEncoder.writeBytes(StreamEncoder.java:221)
at sun.nio.cs.StreamEncoder.implFlushBuffer(StreamEncoder.java:291)
at sun.nio.cs.StreamEncoder.implFlush(StreamEncoder.java:295)
at sun.nio.cs.StreamEncoder.flush(StreamEncoder.java:141)
at java.io.OutputStreamWriter.flush(OutputStreamWriter.java:229)
at org.apache.log4j.helpers.QuietWriter.flush(QuietWriter.java:59)
at org.apache.log4j.WriterAppender.subAppend(WriterAppender.java:324)
at org.apache.log4j.RollingFileAppender.subAppend(RollingFileAppender.java:276)
at org.apache.log4j.WriterAppender.append(WriterAppender.java:162)
at org.apache.log4j.AppenderSkeleton.doAppend(AppenderSkeleton.java:251)
at org.apache.log4j.helpers.AppenderAttachableImpl.appendLoopOnAppenders(AppenderAttachableImpl.java:66)
at org.apache.log4j.Category.callAppenders(Category.java:206)
at org.apache.log4j.Category.forcedLog(Category.java:391)
at org.apache.log4j.Category.log(Category.java:856)
at org.apache.commons.logging.impl.Log4JLogger.info(Log4JLogger.java:176)
at org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.logAddStoredBlock(BlockManager.java:2391)
at org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.addStoredBlock(BlockManager.java:2312)
at org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.processAndHandleReportedBlock(BlockManager.java:2919)
at org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.addBlock(BlockManager.java:2894)
at org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.processIncrementalBlockReport(BlockManager.java:2976)
at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.processIncrementalBlockReport(FSNamesystem.java:5432)
at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.blockReceivedAndDeleted(NameNodeRpcServer.java:1061)
at org.apache.hadoop.hdfs.protocolPB.DatanodeProtocolServerSideTranslatorPB.blockReceivedAndDeleted(DatanodeProtocolServerSideTranslatorPB.java:209)
at org.apache.hadoop.hdfs.protocol.proto.DatanodeProtocolProtos$DatanodeProtocolService$2.callBlockingMethod(DatanodeProtocolProtos.java:28065)
at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:585)
at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:928)
at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2013)
at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2009)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:415)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1556)
at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2007)
{code}




