[jira] [Commented] (HDFS-7446) HDFS inotify should have the ability to determine what txid it has read up to
[ https://issues.apache.org/jira/browse/HDFS-7446?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14700274#comment-14700274 ] Colin Patrick McCabe commented on HDFS-7446: I would like to see this backported to 2.6.1 just because otherwise it will create hassles for people who want to start using inotify. Do you think this is feasible? HDFS inotify should have the ability to determine what txid it has read up to - Key: HDFS-7446 URL: https://issues.apache.org/jira/browse/HDFS-7446 Project: Hadoop HDFS Issue Type: Improvement Components: hdfs-client Affects Versions: 2.6.0 Reporter: Colin Patrick McCabe Assignee: Colin Patrick McCabe Labels: 2.6.1-candidate Fix For: 2.7.0 Attachments: HDFS-7446.001.patch, HDFS-7446.002.patch, HDFS-7446.003.patch HDFS inotify should have the ability to determine what txid it has read up to. This will allow users who want to avoid missing any events to record this txid and use it to resume reading events at the spot they left off. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
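As context for the API this change adds: below is a minimal sketch (assumed usage, not code from the attached patches) of the resume pattern it enables. A consumer records the txid of each consumed batch via {{EventBatch#getTxid()}} and passes it back to {{HdfsAdmin#getInotifyEventStream(long)}} after a restart; the URI and the way the txid is persisted are illustrative.
{code}
import java.net.URI;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hdfs.DFSInotifyEventInputStream;
import org.apache.hadoop.hdfs.client.HdfsAdmin;
import org.apache.hadoop.hdfs.inotify.EventBatch;

public class InotifyResumeSketch {
  public static void main(String[] args) throws Exception {
    long lastReadTxid = Long.parseLong(args[0]); // txid persisted by a previous run
    HdfsAdmin admin = new HdfsAdmin(URI.create("hdfs://nameservice"), new Configuration());
    DFSInotifyEventInputStream stream = admin.getInotifyEventStream(lastReadTxid);
    while (true) {
      EventBatch batch = stream.take(); // blocks until events are available
      // ... handle batch.getEvents() ...
      lastReadTxid = batch.getTxid();   // persist this to resume without missing events
    }
  }
}
{code}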
[jira] [Commented] (HDFS-8897) Loadbalancer always exits with : java.io.IOException: Another Balancer is running.. Exiting ...
[ https://issues.apache.org/jira/browse/HDFS-8897?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14699337#comment-14699337 ] Rakesh R commented on HDFS-8897: Hi [~Alexandre LINTE], thanks for reporting this. The Jira description is a bit confusing; could you please give more details about your test scenario and the expected output? Loadbalancer always exits with : java.io.IOException: Another Balancer is running.. Exiting ... Key: HDFS-8897 URL: https://issues.apache.org/jira/browse/HDFS-8897 Project: Hadoop HDFS Issue Type: Bug Components: balancer mover Affects Versions: 2.7.1 Environment: Centos 6.6 Reporter: LINTE When the balancer is launched, it should test whether a /system/balancer.id file already exists in HDFS. Even when the file doesn't exist, the balancer refuses to run: 15/08/14 16:35:12 INFO balancer.Balancer: namenodes = [hdfs://sandbox/, hdfs://sandbox] 15/08/14 16:35:12 INFO balancer.Balancer: parameters = Balancer.Parameters[BalancingPolicy.Node, threshold=10.0, max idle iteration = 5, number of nodes to be excluded = 0, number of nodes to be included = 0] Time Stamp Iteration# Bytes Already Moved Bytes Left To Move Bytes Being Moved 15/08/14 16:35:14 INFO balancer.KeyManager: Block token params received from NN: update interval=10hrs, 0sec, token lifetime=10hrs, 0sec 15/08/14 16:35:14 INFO block.BlockTokenSecretManager: Setting block keys 15/08/14 16:35:14 INFO balancer.KeyManager: Update block keys every 2hrs, 30mins, 0sec 15/08/14 16:35:14 INFO block.BlockTokenSecretManager: Setting block keys 15/08/14 16:35:14 INFO balancer.KeyManager: Block token params received from NN: update interval=10hrs, 0sec, token lifetime=10hrs, 0sec 15/08/14 16:35:14 INFO block.BlockTokenSecretManager: Setting block keys 15/08/14 16:35:14 INFO balancer.KeyManager: Update block keys every 2hrs, 30mins, 0sec java.io.IOException: Another Balancer is running.. Exiting ... Aug 14, 2015 4:35:14 PM Balancing took 2.408 seconds Looking at the audit log while trying to run the balancer, the balancer creates /system/balancer.id and then deletes it on exiting ... 2015-08-14 16:37:45,844 INFO FSNamesystem.audit: allowed=true ugi=hdfs@SANDBOX.HADOOP (auth:KERBEROS) ip=/x.x.x.x cmd=getfileinfo src=/system/balancer.id dst=null perm=null proto=rpc 2015-08-14 16:37:45,900 INFO FSNamesystem.audit: allowed=true ugi=hdfs@SANDBOX.HADOOP (auth:KERBEROS) ip=/x.x.x.x cmd=create src=/system/balancer.id dst=null perm=hdfs:hadoop:rw-r- proto=rpc 2015-08-14 16:37:45,919 INFO FSNamesystem.audit: allowed=true ugi=hdfs@SANDBOX.HADOOP (auth:KERBEROS) ip=/x.x.x.x cmd=getfileinfo src=/system/balancer.id dst=null perm=null proto=rpc 2015-08-14 16:37:46,090 INFO FSNamesystem.audit: allowed=true ugi=hdfs@SANDBOX.HADOOP (auth:KERBEROS) ip=/x.x.x.x cmd=getfileinfo src=/system/balancer.id dst=null perm=null proto=rpc 2015-08-14 16:37:46,112 INFO FSNamesystem.audit: allowed=true ugi=hdfs@SANDBOX.HADOOP (auth:KERBEROS) ip=/x.x.x.x cmd=getfileinfo src=/system/balancer.id dst=null perm=null proto=rpc 2015-08-14 16:37:46,117 INFO FSNamesystem.audit: allowed=true ugi=hdfs@SANDBOX.HADOOP (auth:KERBEROS) ip=/x.x.x.x cmd=delete src=/system/balancer.id dst=null perm=null proto=rpc The error seems to be located in org/apache/hadoop/hdfs/server/balancer/NameNodeConnector.java. The function checkAndMarkRunning returns null even if /system/balancer.id doesn't exist before entering this function; if it exists, it is deleted and the balancer exits with the same error.
private OutputStream checkAndMarkRunning() throws IOException {
  try {
    if (fs.exists(idPath)) {
      // try appending to it so that it will fail fast if another balancer is
      // running.
      IOUtils.closeStream(fs.append(idPath));
      fs.delete(idPath, true);
    }
    final FSDataOutputStream fsout = fs.create(idPath, false);
    // mark balancer idPath to be deleted during filesystem closure
    fs.deleteOnExit(idPath);
    if (write2IdFile) {
      fsout.writeBytes(InetAddress.getLocalHost().getHostName());
      fsout.hflush();
    }
    return fsout;
  } catch (RemoteException e) {
    if (AlreadyBeingCreatedException.class.getName().equals(e.getClassName())) {
      return null;
    } else {
      throw e;
    }
  }
}

Regards -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-8713) Convert DatanodeDescriptor to use SLF4J logging
[ https://issues.apache.org/jira/browse/HDFS-8713?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14699358#comment-14699358 ] Hadoop QA commented on HDFS-8713: - \\ \\ | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:red}-1{color} | pre-patch | 15m 32s | Findbugs (version ) appears to be broken on trunk. | | {color:green}+1{color} | @author | 0m 0s | The patch does not contain any @author tags. | | {color:red}-1{color} | tests included | 0m 0s | The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. | | {color:green}+1{color} | javac | 7m 41s | There were no new javac warning messages. | | {color:green}+1{color} | javadoc | 9m 42s | There were no new javadoc warning messages. | | {color:green}+1{color} | release audit | 0m 24s | The applied patch does not increase the total number of release audit warnings. | | {color:green}+1{color} | checkstyle | 0m 34s | There were no new checkstyle issues. | | {color:green}+1{color} | whitespace | 0m 0s | The patch has no lines that end in whitespace. | | {color:green}+1{color} | install | 1m 31s | mvn install still works. | | {color:green}+1{color} | eclipse:eclipse | 0m 32s | The patch built with eclipse:eclipse. | | {color:green}+1{color} | findbugs | 2m 31s | The patch does not introduce any new Findbugs (version 3.0.0) warnings. | | {color:green}+1{color} | native | 3m 5s | Pre-build of native portion | | {color:red}-1{color} | hdfs tests | 172m 48s | Tests failed in hadoop-hdfs. | | | | 214m 24s | | \\ \\ || Reason || Tests || | Timed out tests | org.apache.hadoop.cli.TestHDFSCLI | \\ \\ || Subsystem || Report/Notes || | Patch URL | http://issues.apache.org/jira/secure/attachment/12743378/hdfs-8713.001.patch | | Optional Tests | javadoc javac unit findbugs checkstyle | | git revision | trunk / 13604bd | | hadoop-hdfs test log | https://builds.apache.org/job/PreCommit-HDFS-Build/12007/artifact/patchprocess/testrun_hadoop-hdfs.txt | | Test Results | https://builds.apache.org/job/PreCommit-HDFS-Build/12007/testReport/ | | Java | 1.7.0_55 | | uname | Linux asf901.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux | | Console output | https://builds.apache.org/job/PreCommit-HDFS-Build/12007/console | This message was automatically generated. Convert DatanodeDescriptor to use SLF4J logging --- Key: HDFS-8713 URL: https://issues.apache.org/jira/browse/HDFS-8713 Project: Hadoop HDFS Issue Type: Improvement Affects Versions: 2.0.6-alpha Reporter: Andrew Wang Assignee: Andrew Wang Priority: Trivial Attachments: hdfs-8713.001.patch Let's convert this class to use SLF4J -- This message was sent by Atlassian JIRA (v6.3.4#6332)
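For reference, the typical shape of such a conversion, illustrative only and not the attached patch:
{code}
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;

class DatanodeDescriptorLoggingSketch {
  // was: commons-logging's LogFactory.getLog(...)
  static final Logger LOG =
      LoggerFactory.getLogger(DatanodeDescriptorLoggingSketch.class);

  void example(String node) {
    // parameterized messages avoid string concatenation when the level is off
    LOG.warn("Removing stale storage on {}", node);
  }
}
{code}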
[jira] [Commented] (HDFS-8897) Loadbalancer always exits with : java.io.IOException: Another Balancer is running.. Exiting ...
[ https://issues.apache.org/jira/browse/HDFS-8897?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14699454#comment-14699454 ] Rakesh R commented on HDFS-8897: I can see that hdfs://sandbox appears twice: {code}15/08/14 16:35:12 INFO balancer.Balancer: namenodes = [hdfs://sandbox/, hdfs://sandbox]{code} Are those duplicate entries, or am I missing something? Loadbalancer always exits with : java.io.IOException: Another Balancer is running.. Exiting ... Key: HDFS-8897 URL: https://issues.apache.org/jira/browse/HDFS-8897 Project: Hadoop HDFS Issue Type: Bug Components: balancer mover Affects Versions: 2.7.1 Environment: Centos 6.6 Reporter: LINTE When the balancer is launched, it should test whether a /system/balancer.id file already exists in HDFS. Even when the file doesn't exist, the balancer refuses to run: 15/08/14 16:35:12 INFO balancer.Balancer: namenodes = [hdfs://sandbox/, hdfs://sandbox] 15/08/14 16:35:12 INFO balancer.Balancer: parameters = Balancer.Parameters[BalancingPolicy.Node, threshold=10.0, max idle iteration = 5, number of nodes to be excluded = 0, number of nodes to be included = 0] Time Stamp Iteration# Bytes Already Moved Bytes Left To Move Bytes Being Moved 15/08/14 16:35:14 INFO balancer.KeyManager: Block token params received from NN: update interval=10hrs, 0sec, token lifetime=10hrs, 0sec 15/08/14 16:35:14 INFO block.BlockTokenSecretManager: Setting block keys 15/08/14 16:35:14 INFO balancer.KeyManager: Update block keys every 2hrs, 30mins, 0sec 15/08/14 16:35:14 INFO block.BlockTokenSecretManager: Setting block keys 15/08/14 16:35:14 INFO balancer.KeyManager: Block token params received from NN: update interval=10hrs, 0sec, token lifetime=10hrs, 0sec 15/08/14 16:35:14 INFO block.BlockTokenSecretManager: Setting block keys 15/08/14 16:35:14 INFO balancer.KeyManager: Update block keys every 2hrs, 30mins, 0sec java.io.IOException: Another Balancer is running.. Exiting ... Aug 14, 2015 4:35:14 PM Balancing took 2.408 seconds Looking at the audit log while trying to run the balancer, the balancer creates /system/balancer.id and then deletes it on exiting ... 2015-08-14 16:37:45,844 INFO FSNamesystem.audit: allowed=true ugi=hdfs@SANDBOX.HADOOP (auth:KERBEROS) ip=/x.x.x.x cmd=getfileinfo src=/system/balancer.id dst=null perm=null proto=rpc 2015-08-14 16:37:45,900 INFO FSNamesystem.audit: allowed=true ugi=hdfs@SANDBOX.HADOOP (auth:KERBEROS) ip=/x.x.x.x cmd=create src=/system/balancer.id dst=null perm=hdfs:hadoop:rw-r- proto=rpc 2015-08-14 16:37:45,919 INFO FSNamesystem.audit: allowed=true ugi=hdfs@SANDBOX.HADOOP (auth:KERBEROS) ip=/x.x.x.x cmd=getfileinfo src=/system/balancer.id dst=null perm=null proto=rpc 2015-08-14 16:37:46,090 INFO FSNamesystem.audit: allowed=true ugi=hdfs@SANDBOX.HADOOP (auth:KERBEROS) ip=/x.x.x.x cmd=getfileinfo src=/system/balancer.id dst=null perm=null proto=rpc 2015-08-14 16:37:46,112 INFO FSNamesystem.audit: allowed=true ugi=hdfs@SANDBOX.HADOOP (auth:KERBEROS) ip=/x.x.x.x cmd=getfileinfo src=/system/balancer.id dst=null perm=null proto=rpc 2015-08-14 16:37:46,117 INFO FSNamesystem.audit: allowed=true ugi=hdfs@SANDBOX.HADOOP (auth:KERBEROS) ip=/x.x.x.x cmd=delete src=/system/balancer.id dst=null perm=null proto=rpc The error seems to be located in org/apache/hadoop/hdfs/server/balancer/NameNodeConnector.java. The function checkAndMarkRunning returns null even if /system/balancer.id doesn't exist before entering this function; if it exists, it is deleted and the balancer exits with the same error.
private OutputStream checkAndMarkRunning() throws IOException {
  try {
    if (fs.exists(idPath)) {
      // try appending to it so that it will fail fast if another balancer is
      // running.
      IOUtils.closeStream(fs.append(idPath));
      fs.delete(idPath, true);
    }
    final FSDataOutputStream fsout = fs.create(idPath, false);
    // mark balancer idPath to be deleted during filesystem closure
    fs.deleteOnExit(idPath);
    if (write2IdFile) {
      fsout.writeBytes(InetAddress.getLocalHost().getHostName());
      fsout.hflush();
    }
    return fsout;
  } catch (RemoteException e) {
    if (AlreadyBeingCreatedException.class.getName().equals(e.getClassName())) {
      return null;
    } else {
      throw e;
    }
  }
}

Regards -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-8845) DiskChecker should not traverse the entire tree
[ https://issues.apache.org/jira/browse/HDFS-8845?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Colin Patrick McCabe updated HDFS-8845: --- Resolution: Fixed Fix Version/s: 2.8.0 Target Version/s: 2.8.0 Status: Resolved (was: Patch Available) Committed to 2.8. Thanks, [~lichangleo]. DiskChecker should not traverse the entire tree --- Key: HDFS-8845 URL: https://issues.apache.org/jira/browse/HDFS-8845 Project: Hadoop HDFS Issue Type: Bug Reporter: Chang Li Assignee: Chang Li Fix For: 2.8.0 Attachments: HDFS-8845.patch DiskChecker should not traverse the entire tree, because doing so causes heavy disk load in checkDiskError() -- This message was sent by Atlassian JIRA (v6.3.4#6332)
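The idea of the fix, as a hedged sketch (method name and checks are illustrative, not the committed patch): validate only the directory itself instead of walking every file underneath it.
{code}
import java.io.File;
import java.io.IOException;

class ShallowDiskCheckSketch {
  /** Check only the directory itself rather than traversing its subtree. */
  static void checkDir(File dir) throws IOException {
    if (!dir.isDirectory()) {
      throw new IOException("Not a directory: " + dir);
    }
    if (!dir.canRead() || !dir.canWrite() || !dir.canExecute()) {
      throw new IOException("Insufficient permissions on: " + dir);
    }
  }
}
{code}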
[jira] [Updated] (HDFS-8833) Erasure coding: store EC schema and cell size in INodeFile and eliminate notion of EC zones
[ https://issues.apache.org/jira/browse/HDFS-8833?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zhe Zhang updated HDFS-8833: Attachment: HDFS-8833-HDFS-7285-merge.01.patch Updating the patch to:
# Remove all references to {{ErasureCodingZone}} in {{src/main}}; {{test}} will be handled separately.
# Update {{INodeFile}}, changing {{isStriped}} to {{erasureCodingPolicy}} in the header, and add an API.
# Change how {{FSDirErasureCodingOp#getErasureCodingPolicyForPath}} queries the EC policy, taking into consideration the policy stored in the file header.
# Update the main test, {{TestErasureCodingZones}}.
The version 00 patch is mostly refactoring; this patch has some logic-level changes. Erasure coding: store EC schema and cell size in INodeFile and eliminate notion of EC zones --- Key: HDFS-8833 URL: https://issues.apache.org/jira/browse/HDFS-8833 Project: Hadoop HDFS Issue Type: Sub-task Components: namenode Affects Versions: HDFS-7285 Reporter: Zhe Zhang Assignee: Zhe Zhang Attachments: HDFS-8833-HDFS-7285-merge.00.patch, HDFS-8833-HDFS-7285-merge.01.patch We have [discussed | https://issues.apache.org/jira/browse/HDFS-7285?focusedCommentId=14357754page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-14357754] storing EC schema with files instead of EC zones and recently revisited the discussion under HDFS-8059. As a recap, the _zone_ concept has severe limitations including renaming and nested configuration. Those limitations are valid in encryption for security reasons, but it doesn't make sense to carry them over to EC. This JIRA aims to store EC schema and cell size at the {{INodeFile}} level. For simplicity, we should first implement it as an xattr and consider memory optimizations (such as moving it to the file header) as a follow-on. We should also disable changing the EC policy on a non-empty file / dir in the first phase. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-8908) TestAppendSnapshotTruncate may fail with IOException: Failed to replace a bad datanode
[ https://issues.apache.org/jira/browse/HDFS-8908?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14700478#comment-14700478 ] Hadoop QA commented on HDFS-8908: - \\ \\ | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | pre-patch | 7m 41s | Pre-patch trunk compilation is healthy. | | {color:green}+1{color} | @author | 0m 0s | The patch does not contain any @author tags. | | {color:green}+1{color} | tests included | 0m 0s | The patch appears to include 1 new or modified test files. | | {color:green}+1{color} | javac | 7m 40s | There were no new javac warning messages. | | {color:green}+1{color} | release audit | 0m 20s | The applied patch does not increase the total number of release audit warnings. | | {color:green}+1{color} | checkstyle | 1m 24s | There were no new checkstyle issues. | | {color:green}+1{color} | whitespace | 0m 0s | The patch has no lines that end in whitespace. | | {color:green}+1{color} | install | 1m 20s | mvn install still works. | | {color:green}+1{color} | eclipse:eclipse | 0m 32s | The patch built with eclipse:eclipse. | | {color:green}+1{color} | findbugs | 2m 31s | The patch does not introduce any new Findbugs (version 3.0.0) warnings. | | {color:green}+1{color} | native | 1m 7s | Pre-build of native portion | | {color:red}-1{color} | hdfs tests | 176m 23s | Tests failed in hadoop-hdfs. | | | | 199m 1s | | \\ \\ || Reason || Tests || | Timed out tests | org.apache.hadoop.cli.TestHDFSCLI | \\ \\ || Subsystem || Report/Notes || | Patch URL | http://issues.apache.org/jira/secure/attachment/12750861/h8908_20150817.patch | | Optional Tests | javac unit findbugs checkstyle | | git revision | trunk / c77bd6a | | hadoop-hdfs test log | https://builds.apache.org/job/PreCommit-HDFS-Build/12011/artifact/patchprocess/testrun_hadoop-hdfs.txt | | Test Results | https://builds.apache.org/job/PreCommit-HDFS-Build/12011/testReport/ | | Java | 1.7.0_55 | | uname | Linux asf900.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux | | Console output | https://builds.apache.org/job/PreCommit-HDFS-Build/12011/console | This message was automatically generated. TestAppendSnapshotTruncate may fail with IOException: Failed to replace a bad datanode -- Key: HDFS-8908 URL: https://issues.apache.org/jira/browse/HDFS-8908 Project: Hadoop HDFS Issue Type: Bug Components: test Reporter: Tsz Wo Nicholas Sze Assignee: Tsz Wo Nicholas Sze Priority: Minor Attachments: h8908_20150817.patch See https://builds.apache.org/job/PreCommit-HDFS-Build/12005/testReport/org.apache.hadoop.hdfs/TestAppendSnapshotTruncate/testAST/ -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-8911) NameNode Metric : Add WAL counters as a JMX metric
[ https://issues.apache.org/jira/browse/HDFS-8911?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Anu Engineer updated HDFS-8911: --- Status: Patch Available (was: Open) NameNode Metric : Add WAL counters as a JMX metric -- Key: HDFS-8911 URL: https://issues.apache.org/jira/browse/HDFS-8911 Project: Hadoop HDFS Issue Type: Improvement Components: HDFS Affects Versions: 2.7.1 Reporter: Anu Engineer Assignee: Anu Engineer Attachments: HDFS-8911.001.patch Today we log Write Ahead Log metrics in the log. This JIRA proposes to expose those metrics via JMX. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-8911) NameNode Metric : Add WAL counters as a JMX metric
[ https://issues.apache.org/jira/browse/HDFS-8911?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Anu Engineer updated HDFS-8911: --- Attachment: HDFS-8911.001.patch Adds TotalSyncCount and TotalSyncTimes metrics to NameNodeMetrics NameNode Metric : Add WAL counters as a JMX metric -- Key: HDFS-8911 URL: https://issues.apache.org/jira/browse/HDFS-8911 Project: Hadoop HDFS Issue Type: Improvement Components: HDFS Affects Versions: 2.7.1 Reporter: Anu Engineer Assignee: Anu Engineer Attachments: HDFS-8911.001.patch Today we log Write Ahead Log metrics in the log. This JIRA proposes to expose those metrics via JMX. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
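Since the comment names the new counters, here is a hedged sketch of how such a counter is typically exposed through Hadoop's metrics2 framework (the class and wiring are illustrative, not the attached patch); sources registered this way are published over JMX automatically.
{code}
import org.apache.hadoop.metrics2.annotation.Metric;
import org.apache.hadoop.metrics2.annotation.Metrics;
import org.apache.hadoop.metrics2.lib.DefaultMetricsSystem;
import org.apache.hadoop.metrics2.lib.MutableCounterLong;

@Metrics(name = "EditLogMetricsSketch", about = "Edit log sync metrics", context = "dfs")
class EditLogMetricsSketch {
  @Metric("Total number of edit log syncs")
  MutableCounterLong totalSyncCount;

  static EditLogMetricsSketch create() {
    DefaultMetricsSystem.initialize("NameNode");
    return DefaultMetricsSystem.instance().register(new EditLogMetricsSketch());
  }

  void onSync() {
    totalSyncCount.incr(); // appears as a JMX attribute on the metrics MBean
  }
}
{code}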
[jira] [Commented] (HDFS-8278) HDFS Balancer should consider remaining storage % when checking for under-utilized machines
[ https://issues.apache.org/jira/browse/HDFS-8278?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14700522#comment-14700522 ] Hadoop QA commented on HDFS-8278: - \\ \\ | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | pre-patch | 17m 48s | Pre-patch trunk compilation is healthy. | | {color:green}+1{color} | @author | 0m 0s | The patch does not contain any @author tags. | | {color:red}-1{color} | tests included | 0m 0s | The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. | | {color:green}+1{color} | javac | 7m 58s | There were no new javac warning messages. | | {color:green}+1{color} | javadoc | 9m 56s | There were no new javadoc warning messages. | | {color:green}+1{color} | release audit | 0m 23s | The applied patch does not increase the total number of release audit warnings. | | {color:green}+1{color} | checkstyle | 1m 26s | There were no new checkstyle issues. | | {color:green}+1{color} | whitespace | 0m 0s | The patch has no lines that end in whitespace. | | {color:green}+1{color} | install | 1m 23s | mvn install still works. | | {color:green}+1{color} | eclipse:eclipse | 0m 33s | The patch built with eclipse:eclipse. | | {color:green}+1{color} | findbugs | 2m 32s | The patch does not introduce any new Findbugs (version 3.0.0) warnings. | | {color:green}+1{color} | native | 3m 10s | Pre-build of native portion | | {color:red}-1{color} | hdfs tests | 174m 22s | Tests failed in hadoop-hdfs. | | | | 219m 35s | | \\ \\ || Reason || Tests || | Failed unit tests | hadoop.fs.viewfs.TestViewFsWithXAttrs | | Timed out tests | org.apache.hadoop.cli.TestHDFSCLI | \\ \\ || Subsystem || Report/Notes || | Patch URL | http://issues.apache.org/jira/secure/attachment/12750868/h8278_20150817.patch | | Optional Tests | javadoc javac unit findbugs checkstyle | | git revision | trunk / c77bd6a | | hadoop-hdfs test log | https://builds.apache.org/job/PreCommit-HDFS-Build/12012/artifact/patchprocess/testrun_hadoop-hdfs.txt | | Test Results | https://builds.apache.org/job/PreCommit-HDFS-Build/12012/testReport/ | | Java | 1.7.0_55 | | uname | Linux asf903.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux | | Console output | https://builds.apache.org/job/PreCommit-HDFS-Build/12012/console | This message was automatically generated. HDFS Balancer should consider remaining storage % when checking for under-utilized machines --- Key: HDFS-8278 URL: https://issues.apache.org/jira/browse/HDFS-8278 Project: Hadoop HDFS Issue Type: Sub-task Components: balancer mover Affects Versions: 2.8.0 Reporter: Gopal V Assignee: Tsz Wo Nicholas Sze Attachments: h8278_20150817.patch DFS balancer mistakenly identifies a node with very little storage space remaining as an underutilized node and tries to move large amounts of data to that particular node. All these block moves fail to execute successfully, as the % utilization is less relevant than the dfs remaining storage on that node. {code} 15/04/24 04:25:55 INFO balancer.Balancer: 0 over-utilized: [] 15/04/24 04:25:55 INFO balancer.Balancer: 1 underutilized: [172.19.1.46:50010:DISK] 15/04/24 04:25:55 INFO balancer.Balancer: Need to move 47.68 GB to make the cluster balanced. 
15/04/24 04:25:55 INFO balancer.Balancer: Decided to move 413.08 MB bytes from 172.19.1.52:50010:DISK to 172.19.1.46:50010:DISK 15/04/24 04:25:55 INFO balancer.Balancer: Will move 413.08 MB in this iteration 15/04/24 04:25:55 WARN balancer.Dispatcher: Failed to move blk_1078689321_1099517353638 with size=131146 from 172.19.1.52:50010:DISK to 172.19.1.46:50010:DISK through 172.19.1.53:50010: Got error, status message opReplaceBlock BP-942051088-172.18.1.41-1370508013893:blk_1078689321_1099517353638 received exception org.apache.hadoop.util.DiskChecker$DiskOutOfSpaceException: Out of space: The volume with the most available space (=225042432 B) is less than the block size (=268435456 B)., block move is failed {code} The machine in concern is under-full when it comes to the BP utilization, but has very little free space available for blocks. {code} Decommission Status : Normal Configured Capacity: 3826907185152 (3.48 TB) DFS Used: 2817262833664 (2.56 TB) Non DFS Used: 1000621305856 (931.90 GB) DFS Remaining: 9023045632 (8.40 GB) DFS Used%: 73.62% DFS Remaining%: 0.24% Configured Cache Capacity: 8589934592 (8 GB) Cache Used: 0 (0 B) Cache Remaining: 8589934592 (8 GB) Cache Used%: 0.00% Cache Remaining%: 100.00% Xceivers: 3 Last contact: Fri Apr 24 04:28:36 PDT 2015 {code} The machine has 0.40 Gb of non-RAM storage available on that node, so it is futile to attempt to move any blocks to that particular machine. This is a similar concern when a machine loses disks, since the comparisons of utilization always compare percentages per-node. Even that scenario needs to cap data movement to that node to the DFS Remaining % variable. Trying to move any more data than that to a given node will always fail. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
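A hypothetical illustration of the check this report argues for (names and threshold are assumptions, not the committed patch): a node whose absolute remaining space cannot hold even one block should not be treated as an under-utilized target, regardless of its utilization percentage.
{code}
class TargetCheckSketch {
  /**
   * Reject balancing targets that cannot absorb data: the node above has only
   * 0.24% remaining, and its fullest volume (~225 MB free) cannot even hold a
   * single 256 MB block, so every scheduled move would fail.
   */
  static boolean isUsableTarget(long dfsRemaining, long capacity,
                                long blockSize, double minRemainingPercent) {
    double remainingPercent = 100.0 * dfsRemaining / capacity;
    return dfsRemaining >= blockSize && remainingPercent >= minRemainingPercent;
  }
}
{code}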
[jira] [Commented] (HDFS-6955) DN should reserve disk space for a full block when creating tmp files
[ https://issues.apache.org/jira/browse/HDFS-6955?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14700530#comment-14700530 ] Hadoop QA commented on HDFS-6955: - \\ \\ | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | pre-patch | 17m 50s | Pre-patch trunk compilation is healthy. | | {color:green}+1{color} | @author | 0m 0s | The patch does not contain any @author tags. | | {color:green}+1{color} | tests included | 0m 0s | The patch appears to include 4 new or modified test files. | | {color:green}+1{color} | javac | 8m 2s | There were no new javac warning messages. | | {color:green}+1{color} | javadoc | 9m 54s | There were no new javadoc warning messages. | | {color:green}+1{color} | release audit | 0m 25s | The applied patch does not increase the total number of release audit warnings. | | {color:red}-1{color} | checkstyle | 1m 24s | The applied patch generated 3 new checkstyle issues (total was 154, now 155). | | {color:green}+1{color} | whitespace | 0m 2s | The patch has no lines that end in whitespace. | | {color:green}+1{color} | install | 1m 24s | mvn install still works. | | {color:green}+1{color} | eclipse:eclipse | 0m 34s | The patch built with eclipse:eclipse. | | {color:green}+1{color} | findbugs | 2m 36s | The patch does not introduce any new Findbugs (version 3.0.0) warnings. | | {color:green}+1{color} | native | 3m 9s | Pre-build of native portion | | {color:red}-1{color} | hdfs tests | 176m 49s | Tests failed in hadoop-hdfs. | | | | 222m 14s | | \\ \\ || Reason || Tests || | Timed out tests | org.apache.hadoop.cli.TestHDFSCLI | \\ \\ || Subsystem || Report/Notes || | Patch URL | http://issues.apache.org/jira/secure/attachment/12750865/HDFS-6955-02.patch | | Optional Tests | javadoc javac unit findbugs checkstyle | | git revision | trunk / c77bd6a | | checkstyle | https://builds.apache.org/job/PreCommit-HDFS-Build/12013/artifact/patchprocess/diffcheckstylehadoop-hdfs.txt | | hadoop-hdfs test log | https://builds.apache.org/job/PreCommit-HDFS-Build/12013/artifact/patchprocess/testrun_hadoop-hdfs.txt | | Test Results | https://builds.apache.org/job/PreCommit-HDFS-Build/12013/testReport/ | | Java | 1.7.0_55 | | uname | Linux asf903.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux | | Console output | https://builds.apache.org/job/PreCommit-HDFS-Build/12013/console | This message was automatically generated. DN should reserve disk space for a full block when creating tmp files - Key: HDFS-6955 URL: https://issues.apache.org/jira/browse/HDFS-6955 Project: Hadoop HDFS Issue Type: Bug Components: datanode Affects Versions: 2.5.0 Reporter: Arpit Agarwal Assignee: kanaka kumar avvaru Attachments: HDFS-6955-01.patch, HDFS-6955-02.patch HDFS-6898 is introducing disk space reservation for RBW files to avoid running out of disk space midway through block creation. This Jira is to introduce similar reservation for tmp files. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-8862) Improve BlockManager#excessReplicateMap
[ https://issues.apache.org/jira/browse/HDFS-8862?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14700560#comment-14700560 ] Yi Liu commented on HDFS-8862: -- Thanks [~cmccabe] for the review. Will commit the patch later. {{LightWeightLinkedSet<BlockInfo>}} can shrink. Improve BlockManager#excessReplicateMap --- Key: HDFS-8862 URL: https://issues.apache.org/jira/browse/HDFS-8862 Project: Hadoop HDFS Issue Type: Sub-task Components: namenode Reporter: Yi Liu Assignee: Yi Liu Attachments: HDFS-8862.001.patch Per [~cmccabe]'s comments in HDFS-8792, this JIRA is to discuss improving {{BlockManager#excessReplicateMap}}. It's true that a HashMap never shrinks when elements are removed, but a TreeMap entry has to store more references (left, right, parent) than a HashMap entry (only one next reference), and even when removals leave some buckets empty, an empty HashMap bucket is just a {{null}} reference (4 bytes), so the two are close on memory. On the other hand, the key of {{excessReplicateMap}} is the datanode UUID, so the number of entries is almost fixed, which makes HashMap's memory footprint better than TreeMap's in this case. Most important is the search/insert/remove performance, where HashMap is clearly better than TreeMap. Since we don't need sorted order, we should use a HashMap instead of a TreeMap. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
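Paraphrasing the change under discussion as a sketch (field and value types follow the comments here, not the committed patch verbatim):
{code}
// Keys are datanode UUIDs, so the entry count is nearly fixed and no sorted
// iteration is needed: HashMap's O(1) expected operations beat TreeMap's O(log n).
private final Map<String, LightWeightLinkedSet<BlockInfo>> excessReplicateMap =
    new HashMap<>(); // previously: new TreeMap<>()
{code}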
[jira] [Updated] (HDFS-8862) BlockManager#excessReplicateMap should use a HashMap
[ https://issues.apache.org/jira/browse/HDFS-8862?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yi Liu updated HDFS-8862: - Summary: BlockManager#excessReplicateMap should use a HashMap (was: Improve BlockManager#excessReplicateMap) BlockManager#excessReplicateMap should use a HashMap Key: HDFS-8862 URL: https://issues.apache.org/jira/browse/HDFS-8862 Project: Hadoop HDFS Issue Type: Sub-task Components: namenode Reporter: Yi Liu Assignee: Yi Liu Attachments: HDFS-8862.001.patch Per [~cmccabe]'s comments in HDFS-8792, this JIRA is to discuss improving {{BlockManager#excessReplicateMap}}. It's true that a HashMap never shrinks when elements are removed, but a TreeMap entry has to store more references (left, right, parent) than a HashMap entry (only one next reference), and even when removals leave some buckets empty, an empty HashMap bucket is just a {{null}} reference (4 bytes), so the two are close on memory. On the other hand, the key of {{excessReplicateMap}} is the datanode UUID, so the number of entries is almost fixed, which makes HashMap's memory footprint better than TreeMap's in this case. Most important is the search/insert/remove performance, where HashMap is clearly better than TreeMap. Since we don't need sorted order, we should use a HashMap instead of a TreeMap. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-8912) Implement ShrinkableHashMap extends java HashMap and use properly
[ https://issues.apache.org/jira/browse/HDFS-8912?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14700620#comment-14700620 ] Yi Liu commented on HDFS-8912: -- Hi [~cmccabe], what do you think about it? Thanks. Implement ShrinkableHashMap extends java HashMap and use properly - Key: HDFS-8912 URL: https://issues.apache.org/jira/browse/HDFS-8912 Project: Hadoop HDFS Issue Type: Sub-task Reporter: Yi Liu Assignee: Yi Liu Currently {{LightWeightHashSet}} and {{LightWeightLinkedSet}} are used in HDFS; they have two advantages over the Java HashSet: each entry requires less memory, and they are shrinkable. In a real cluster, HDFS is a long-running service, and a {{set}} may become very large at some point and small again afterwards, so shrinking the {{set}} when its size hits the shrink threshold is necessary and improves NN memory usage. The same applies to {{map}}s: some HashMaps used in BlockManager (e.g., the hashmap in CorruptReplicasMap) would be better off shrinkable. I think it's worth implementing a ShrinkableHashMap that extends the Java HashMap; at a quick glance, little code seems to be needed. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
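A minimal sketch of the idea. Note that {{java.util.HashMap}} exposes no public API to reduce its capacity, so this sketch uses composition and rebuilds a right-sized map, rather than the subclassing the JIRA suggests; the class name and shrink policy are assumptions, not a committed design.
{code}
import java.util.HashMap;
import java.util.Map;

class ShrinkableHashMap<K, V> {
  private static final int MIN_CAPACITY = 16;
  private Map<K, V> inner = new HashMap<>(MIN_CAPACITY);
  private int peakSize; // high-water mark of the entry count

  V put(K key, V value) {
    V old = inner.put(key, value);
    peakSize = Math.max(peakSize, inner.size());
    return old;
  }

  V get(Object key) {
    return inner.get(key);
  }

  V remove(Object key) {
    V old = inner.remove(key);
    // Shrink once the map drops below a quarter of its peak: copying into a
    // fresh HashMap is the portable way to release the oversized table.
    if (peakSize >= 4 * MIN_CAPACITY && inner.size() < peakSize / 4) {
      Map<K, V> rebuilt = new HashMap<>(Math.max(MIN_CAPACITY, inner.size() * 2));
      rebuilt.putAll(inner);
      inner = rebuilt;
      peakSize = inner.size();
    }
    return old;
  }

  int size() {
    return inner.size();
  }
}
{code}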
[jira] [Commented] (HDFS-8911) NameNode Metric : Add WAL counters as a JMX metric
[ https://issues.apache.org/jira/browse/HDFS-8911?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14700485#comment-14700485 ] Arpit Agarwal commented on HDFS-8911: - +1 pending Jenkins. NameNode Metric : Add WAL counters as a JMX metric -- Key: HDFS-8911 URL: https://issues.apache.org/jira/browse/HDFS-8911 Project: Hadoop HDFS Issue Type: Improvement Components: HDFS Affects Versions: 2.7.1 Reporter: Anu Engineer Assignee: Anu Engineer Attachments: HDFS-8911.001.patch Today we log Write Ahead Log metrics in the log. This JIRA proposes to expose those metrics via JMX. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-8911) NameNode Metric : Add WAL counters as a JMX metric
[ https://issues.apache.org/jira/browse/HDFS-8911?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14700500#comment-14700500 ] Andrew Wang commented on HDFS-8911: --- This is a small correction, but the edit log is write behind not write ahead. Could you update the JIRA summary to just say edit log rather than WAL? NameNode Metric : Add WAL counters as a JMX metric -- Key: HDFS-8911 URL: https://issues.apache.org/jira/browse/HDFS-8911 Project: Hadoop HDFS Issue Type: Improvement Components: HDFS Affects Versions: 2.7.1 Reporter: Anu Engineer Assignee: Anu Engineer Attachments: HDFS-8911.001.patch Today we log Write Ahead Log metrics in the log. This JIRA proposes to expose those metrics via JMX. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-8862) BlockManager#excessReplicateMap should use a HashMap
[ https://issues.apache.org/jira/browse/HDFS-8862?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yi Liu updated HDFS-8862: - Resolution: Fixed Hadoop Flags: Reviewed Fix Version/s: 2.8.0 Status: Resolved (was: Patch Available) BlockManager#excessReplicateMap should use a HashMap Key: HDFS-8862 URL: https://issues.apache.org/jira/browse/HDFS-8862 Project: Hadoop HDFS Issue Type: Sub-task Components: namenode Reporter: Yi Liu Assignee: Yi Liu Fix For: 2.8.0 Attachments: HDFS-8862.001.patch Per [~cmccabe]'s comments in HDFS-8792, this JIRA is to discuss improving {{BlockManager#excessReplicateMap}}. It's true that a HashMap never shrinks when elements are removed, but a TreeMap entry has to store more references (left, right, parent) than a HashMap entry (only one next reference), and even when removals leave some buckets empty, an empty HashMap bucket is just a {{null}} reference (4 bytes), so the two are close on memory. On the other hand, the key of {{excessReplicateMap}} is the datanode UUID, so the number of entries is almost fixed, which makes HashMap's memory footprint better than TreeMap's in this case. Most important is the search/insert/remove performance, where HashMap is clearly better than TreeMap. Since we don't need sorted order, we should use a HashMap instead of a TreeMap. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-8826) Balancer may not move blocks efficiently in some cases
[ https://issues.apache.org/jira/browse/HDFS-8826?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14700575#comment-14700575 ] Arpit Agarwal commented on HDFS-8826: - +1 for the patch. The test failures look unrelated, although a couple of the checkstyle issues look valid. Balancer may not move blocks efficiently in some cases -- Key: HDFS-8826 URL: https://issues.apache.org/jira/browse/HDFS-8826 Project: Hadoop HDFS Issue Type: Sub-task Components: balancer mover Reporter: Tsz Wo Nicholas Sze Assignee: Tsz Wo Nicholas Sze Attachments: h8826_20150811.patch, h8826_20150816.patch Balancer is inefficient in the following case: || Datanode || Utilization || Rack || | D1 | 95% | A | | D2 | 30% | B | | D3, D4, D5 | 0% | B | The average utilization is 25%, so D2 is within the 10% threshold. However, Balancer currently will first move blocks from D2 to D3, D4 and D5 since they are under the same rack. Then, it will move blocks from D1. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-8278) HDFS Balancer should consider remaining storage % when checking for under-utilized machines
[ https://issues.apache.org/jira/browse/HDFS-8278?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tsz Wo Nicholas Sze updated HDFS-8278: -- Resolution: Fixed Hadoop Flags: Reviewed Fix Version/s: 2.8.0 Status: Resolved (was: Patch Available) Thanks Jing for reviewing the patch. I have committed this. HDFS Balancer should consider remaining storage % when checking for under-utilized machines --- Key: HDFS-8278 URL: https://issues.apache.org/jira/browse/HDFS-8278 Project: Hadoop HDFS Issue Type: Sub-task Components: balancer mover Affects Versions: 2.8.0 Reporter: Gopal V Assignee: Tsz Wo Nicholas Sze Fix For: 2.8.0 Attachments: h8278_20150817.patch DFS balancer mistakenly identifies a node with very little storage space remaining as an underutilized node and tries to move large amounts of data to that particular node. All these block moves fail to execute successfully, as the % utilization is less relevant than the dfs remaining storage on that node. {code} 15/04/24 04:25:55 INFO balancer.Balancer: 0 over-utilized: [] 15/04/24 04:25:55 INFO balancer.Balancer: 1 underutilized: [172.19.1.46:50010:DISK] 15/04/24 04:25:55 INFO balancer.Balancer: Need to move 47.68 GB to make the cluster balanced. 15/04/24 04:25:55 INFO balancer.Balancer: Decided to move 413.08 MB bytes from 172.19.1.52:50010:DISK to 172.19.1.46:50010:DISK 15/04/24 04:25:55 INFO balancer.Balancer: Will move 413.08 MB in this iteration 15/04/24 04:25:55 WARN balancer.Dispatcher: Failed to move blk_1078689321_1099517353638 with size=131146 from 172.19.1.52:50010:DISK to 172.19.1.46:50010:DISK through 172.19.1.53:50010: Got error, status message opReplaceBlock BP-942051088-172.18.1.41-1370508013893:blk_1078689321_1099517353638 received exception org.apache.hadoop.util.DiskChecker$DiskOutOfSpaceException: Out of space: The volume with the most available space (=225042432 B) is less than the block size (=268435456 B)., block move is failed {code} The machine in concern is under-full when it comes to the BP utilization, but has very little free space available for blocks. {code} Decommission Status : Normal Configured Capacity: 3826907185152 (3.48 TB) DFS Used: 2817262833664 (2.56 TB) Non DFS Used: 1000621305856 (931.90 GB) DFS Remaining: 9023045632 (8.40 GB) DFS Used%: 73.62% DFS Remaining%: 0.24% Configured Cache Capacity: 8589934592 (8 GB) Cache Used: 0 (0 B) Cache Remaining: 8589934592 (8 GB) Cache Used%: 0.00% Cache Remaining%: 100.00% Xceivers: 3 Last contact: Fri Apr 24 04:28:36 PDT 2015 {code} The machine has 0.40 Gb of non-RAM storage available on that node, so it is futile to attempt to move any blocks to that particular machine. This is a similar concern when a machine loses disks, since the comparisons of utilization always compare percentages per-node. Even that scenario needs to cap data movement to that node to the DFS Remaining % variable. Trying to move any more data than that to a given node will always fail. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-8880) NameNode metrics logging
[ https://issues.apache.org/jira/browse/HDFS-8880?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14700457#comment-14700457 ] Jitendra Nath Pandey commented on HDFS-8880: +1 NameNode metrics logging Key: HDFS-8880 URL: https://issues.apache.org/jira/browse/HDFS-8880 Project: Hadoop HDFS Issue Type: Improvement Components: namenode Reporter: Arpit Agarwal Assignee: Arpit Agarwal Attachments: HDFS-8880.01.patch, HDFS-8880.02.patch, HDFS-8880.03.patch, HDFS-8880.04.patch, namenode-metrics.log The NameNode can periodically log metrics to help debugging when the cluster is not setup with another metrics monitoring scheme. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
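In the spirit of this change, a hedged sketch of periodic metrics logging (logger name, MBean pattern, and period are assumptions, not the committed patch): a scheduled task snapshots JMX attribute values into a dedicated log.
{code}
import java.lang.management.ManagementFactory;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import javax.management.MBeanAttributeInfo;
import javax.management.MBeanServer;
import javax.management.ObjectName;
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;

class MetricsLoggerSketch {
  private static final Logger METRICS_LOG =
      LoggerFactory.getLogger("NameNodeMetricsLog");

  static void start(long periodSeconds) {
    MBeanServer server = ManagementFactory.getPlatformMBeanServer();
    ScheduledExecutorService ses = Executors.newSingleThreadScheduledExecutor();
    ses.scheduleWithFixedDelay(() -> {
      try {
        // Snapshot every attribute of the NameNode's Hadoop MBeans.
        for (ObjectName mbean :
            server.queryNames(new ObjectName("Hadoop:service=NameNode,*"), null)) {
          for (MBeanAttributeInfo attr : server.getMBeanInfo(mbean).getAttributes()) {
            METRICS_LOG.info("{}:{}={}", mbean, attr.getName(),
                server.getAttribute(mbean, attr.getName()));
          }
        }
      } catch (Exception e) {
        METRICS_LOG.warn("Failed to log metrics", e);
      }
    }, 0, periodSeconds, TimeUnit.SECONDS);
  }
}
{code}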
[jira] [Commented] (HDFS-8895) Remove deprecated BlockStorageLocation APIs
[ https://issues.apache.org/jira/browse/HDFS-8895?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14700527#comment-14700527 ] Andrew Wang commented on HDFS-8895: --- Quite a few flaky tests, but they all passed for me locally. Will commit shortly; thanks Eddy for reviewing. Remove deprecated BlockStorageLocation APIs --- Key: HDFS-8895 URL: https://issues.apache.org/jira/browse/HDFS-8895 Project: Hadoop HDFS Issue Type: Improvement Affects Versions: 3.0.0 Reporter: Andrew Wang Assignee: Andrew Wang Attachments: HDFS-8895.001.patch HDFS-8887 supersedes DistributedFileSystem#getFileBlockStorageLocations, so it can be removed from trunk. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-8895) Remove deprecated BlockStorageLocation APIs
[ https://issues.apache.org/jira/browse/HDFS-8895?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andrew Wang updated HDFS-8895: -- Resolution: Fixed Fix Version/s: 3.0.0 Release Note: This removes the deprecated DistributedFileSystem#getFileBlockStorageLocations API used for getting VolumeIds of block replicas. Applications interested in the volume of a replica can instead consult BlockLocation#getStorageIds to obtain equivalent information. (was: This removes the deprecated DistributedFileSystem#getFileBlockStorageLocations API used for getting VolumeIds of block replicas. Instead, use BlockLocation#getStorageIds to get very similar information.) Status: Resolved (was: Patch Available) Committed to trunk, thanks again for reviewing Eddy. Remove deprecated BlockStorageLocation APIs --- Key: HDFS-8895 URL: https://issues.apache.org/jira/browse/HDFS-8895 Project: Hadoop HDFS Issue Type: Improvement Affects Versions: 3.0.0 Reporter: Andrew Wang Assignee: Andrew Wang Fix For: 3.0.0 Attachments: HDFS-8895.001.patch HDFS-8887 supersedes DistributedFileSystem#getFileBlockStorageLocations, so it can be removed from trunk. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
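A short illustration of the migration path named in the release note (the file path and printing are hypothetical):
{code}
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.BlockLocation;
import org.apache.hadoop.fs.FileStatus;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

class StorageIdsSketch {
  static void printStorageIds(Configuration conf, Path file) throws Exception {
    FileSystem fs = FileSystem.get(conf);
    FileStatus status = fs.getFileStatus(file);
    for (BlockLocation loc : fs.getFileBlockLocations(status, 0, status.getLen())) {
      for (String storageId : loc.getStorageIds()) { // one ID per replica
        System.out.println(loc + " -> " + storageId);
      }
    }
  }
}
{code}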
[jira] [Commented] (HDFS-8862) BlockManager#excessReplicateMap should use a HashMap
[ https://issues.apache.org/jira/browse/HDFS-8862?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14700577#comment-14700577 ] Yi Liu commented on HDFS-8862: -- One more discussion point: do you think it's worthwhile to extend the Java HashMap and implement {{shrink}}? It would be better to have a shrinkable HashMap in some places. From my point of view, I think it's worthwhile, and at a quick glance little code seems to be needed. BlockManager#excessReplicateMap should use a HashMap Key: HDFS-8862 URL: https://issues.apache.org/jira/browse/HDFS-8862 Project: Hadoop HDFS Issue Type: Sub-task Components: namenode Reporter: Yi Liu Assignee: Yi Liu Fix For: 2.8.0 Attachments: HDFS-8862.001.patch Per [~cmccabe]'s comments in HDFS-8792, this JIRA is to discuss improving {{BlockManager#excessReplicateMap}}. It's true that a HashMap never shrinks when elements are removed, but a TreeMap entry has to store more references (left, right, parent) than a HashMap entry (only one next reference), and even when removals leave some buckets empty, an empty HashMap bucket is just a {{null}} reference (4 bytes), so the two are close on memory. On the other hand, the key of {{excessReplicateMap}} is the datanode UUID, so the number of entries is almost fixed, which makes HashMap's memory footprint better than TreeMap's in this case. Most important is the search/insert/remove performance, where HashMap is clearly better than TreeMap. Since we don't need sorted order, we should use a HashMap instead of a TreeMap. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-8826) Balancer may not move blocks efficiently in some cases
[ https://issues.apache.org/jira/browse/HDFS-8826?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14700576#comment-14700576 ] Arpit Agarwal commented on HDFS-8826: - Also it may be a good idea to add a separate option to source from the most over-utilized DataNodes first so the administrator does not have to pass the source DNs manually. We can add it in a separate Jira. Balancer may not move blocks efficiently in some cases -- Key: HDFS-8826 URL: https://issues.apache.org/jira/browse/HDFS-8826 Project: Hadoop HDFS Issue Type: Sub-task Components: balancer mover Reporter: Tsz Wo Nicholas Sze Assignee: Tsz Wo Nicholas Sze Attachments: h8826_20150811.patch, h8826_20150816.patch Balancer is inefficient in the following case: || Datanode || Utilization || Rack || | D1 | 95% | A | | D2 | 30% | B | | D3, D4, D5 | 0% | B | The average utilization is 25% so that D2 is within 10% threshold. However, Balancer currently will first move blocks from D2 to D3, D4 and D5 since they are under the same rack. Then, it will move blocks from D1. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-8895) Remove deprecated BlockStorageLocation APIs
[ https://issues.apache.org/jira/browse/HDFS-8895?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14700380#comment-14700380 ] Hadoop QA commented on HDFS-8895: - \\ \\ | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | pre-patch | 20m 28s | Pre-patch trunk compilation is healthy. | | {color:green}+1{color} | @author | 0m 0s | The patch does not contain any @author tags. | | {color:green}+1{color} | tests included | 0m 0s | The patch appears to include 4 new or modified test files. | | {color:green}+1{color} | javac | 8m 9s | There were no new javac warning messages. | | {color:green}+1{color} | javadoc | 10m 8s | There were no new javadoc warning messages. | | {color:green}+1{color} | release audit | 0m 23s | The applied patch does not increase the total number of release audit warnings. | | {color:green}+1{color} | checkstyle | 2m 42s | There were no new checkstyle issues. | | {color:green}+1{color} | whitespace | 0m 0s | The patch has no lines that end in whitespace. | | {color:green}+1{color} | install | 1m 34s | mvn install still works. | | {color:green}+1{color} | eclipse:eclipse | 0m 33s | The patch built with eclipse:eclipse. | | {color:green}+1{color} | findbugs | 4m 37s | The patch does not introduce any new Findbugs (version 3.0.0) warnings. | | {color:green}+1{color} | native | 3m 8s | Pre-build of native portion | | {color:red}-1{color} | hdfs tests | 175m 21s | Tests failed in hadoop-hdfs. | | {color:green}+1{color} | hdfs tests | 0m 34s | Tests passed in hadoop-hdfs-client. | | | | 227m 43s | | \\ \\ || Reason || Tests || | Failed unit tests | hadoop.hdfs.TestCrcCorruption | | | hadoop.hdfs.TestAppendSnapshotTruncate | | | hadoop.hdfs.server.namenode.ha.TestDFSUpgradeWithHA | | | hadoop.hdfs.server.namenode.ha.TestLossyRetryInvocationHandler | | Timed out tests | org.apache.hadoop.cli.TestHDFSCLI | \\ \\ || Subsystem || Report/Notes || | Patch URL | http://issues.apache.org/jira/secure/attachment/12750398/HDFS-8895.001.patch | | Optional Tests | javac unit javadoc findbugs checkstyle | | git revision | trunk / e535e0f | | hadoop-hdfs test log | https://builds.apache.org/job/PreCommit-HDFS-Build/12010/artifact/patchprocess/testrun_hadoop-hdfs.txt | | hadoop-hdfs-client test log | https://builds.apache.org/job/PreCommit-HDFS-Build/12010/artifact/patchprocess/testrun_hadoop-hdfs-client.txt | | Test Results | https://builds.apache.org/job/PreCommit-HDFS-Build/12010/testReport/ | | Java | 1.7.0_55 | | uname | Linux asf904.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux | | Console output | https://builds.apache.org/job/PreCommit-HDFS-Build/12010/console | This message was automatically generated. Remove deprecated BlockStorageLocation APIs --- Key: HDFS-8895 URL: https://issues.apache.org/jira/browse/HDFS-8895 Project: Hadoop HDFS Issue Type: Improvement Affects Versions: 3.0.0 Reporter: Andrew Wang Assignee: Andrew Wang Attachments: HDFS-8895.001.patch HDFS-8887 supercedes DistributedFileSystem#getFileBlockStorageLocations, so it can be removed from trunk. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-8833) Erasure coding: store EC schema and cell size in INodeFile and eliminate notion of EC zones
[ https://issues.apache.org/jira/browse/HDFS-8833?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14700454#comment-14700454 ] Zhe Zhang commented on HDFS-8833: - [~jingzhao] [~walter.k.su] [~andrew.wang] Thanks for the discussions. Again, the non-empty directory change is really simple so I left it as-is (allowing setting EC policy on non-empty dirs). Let's continue that discussion and reach a consensus. Erasure coding: store EC schema and cell size in INodeFile and eliminate notion of EC zones --- Key: HDFS-8833 URL: https://issues.apache.org/jira/browse/HDFS-8833 Project: Hadoop HDFS Issue Type: Sub-task Components: namenode Affects Versions: HDFS-7285 Reporter: Zhe Zhang Assignee: Zhe Zhang Attachments: HDFS-8833-HDFS-7285-merge.00.patch, HDFS-8833-HDFS-7285-merge.01.patch We have [discussed | https://issues.apache.org/jira/browse/HDFS-7285?focusedCommentId=14357754page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-14357754] storing EC schema with files instead of EC zones and recently revisited the discussion under HDFS-8059. As a recap, the _zone_ concept has severe limitations including renaming and nested configuration. Those limitations are valid in encryption for security reasons and it doesn't make sense to carry them over in EC. This JIRA aims to store EC schema and cell size on {{INodeFile}} level. For simplicity, we should first implement it as an xattr and consider memory optimizations (such as moving it to file header) as a follow-on. We should also disable changing EC policy on a non-empty file / dir in the first phase. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-8909) Erasure coding: update BlockInfoContiguousUC and BlockInfoStripedUC to use BlockUnderConstructionFeature
[ https://issues.apache.org/jira/browse/HDFS-8909?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14700456#comment-14700456 ] Zhe Zhang commented on HDFS-8909: - [~jingzhao] I just realized we said {{HDFS-7285-rebase}} instead of {{HDFS-7285-merge}} above. The two branches have similar {{HEAD}} so it doesn't make much difference. But in general, which branch do you prefer we use for other pending tasks? Erasure coding: update BlockInfoContiguousUC and BlockInfoStripedUC to use BlockUnderConstructionFeature Key: HDFS-8909 URL: https://issues.apache.org/jira/browse/HDFS-8909 Project: Hadoop HDFS Issue Type: Sub-task Components: namenode Affects Versions: HDFS-7285 Reporter: Zhe Zhang Assignee: Jing Zhao HDFS-8801 converts {{BlockInfoUC}} as a feature. We should consolidate {{BlockInfoContiguousUC}} and {{BlockInfoStripedUC}} logics to use this feature. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-8792) BlockManager#postponedMisreplicatedBlocks should use a LightWeightHashSet to save memory
[ https://issues.apache.org/jira/browse/HDFS-8792?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14700543#comment-14700543 ] Yi Liu commented on HDFS-8792: -- Thanks [~cmccabe] for the review and commit! BlockManager#postponedMisreplicatedBlocks should use a LightWeightHashSet to save memory Key: HDFS-8792 URL: https://issues.apache.org/jira/browse/HDFS-8792 Project: Hadoop HDFS Issue Type: Sub-task Reporter: Yi Liu Assignee: Yi Liu Fix For: 2.8.0 Attachments: HDFS-8792.001.patch, HDFS-8792.002.patch, HDFS-8792.003.patch {{LightWeightHashSet}} requires less memory than the Java HashSet. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
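The shape of the swap, paraphrased as a sketch (field and element types follow the JIRA title, not the committed patch verbatim):
{code}
// LightWeightHashSet is an HDFS-internal set that uses less memory per entry
// than java.util.HashSet and can shrink its backing table as elements leave.
private final LightWeightHashSet<Block> postponedMisreplicatedBlocks =
    new LightWeightHashSet<>(); // previously: new HashSet<>()
{code}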
[jira] [Updated] (HDFS-8912) Implement ShrinkableHashMap extends java HashMap and use properly
[ https://issues.apache.org/jira/browse/HDFS-8912?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yi Liu updated HDFS-8912: - Description: Currently {{LightWeightHashSet}} and {{LightWeightLinkedSet}} are used in HDFS; they have two advantages over the Java HashSet: each entry requires less memory, and they are shrinkable. In a real cluster, HDFS is a long-running service, and a {{set}} may become large at some point and small again afterwards, so shrinking the {{set}} when its size hits the shrink threshold is necessary and improves NN memory usage. The same applies to {{map}}s: some HashMaps used in BlockManager (e.g., the hashmap in CorruptReplicasMap) would be better off shrinkable. I think it's worth implementing a ShrinkableHashMap that extends the Java HashMap; at a quick glance, little code seems to be needed. was: Currently {{LightWeightHashSet}} and {{LightWeightLinkedSet}} are used in HDFS; they have two advantages over the Java HashSet: each entry requires less memory, and they are shrinkable. In a real cluster, HDFS is a long-running service, and a {{set}} may become very large at some point and small again afterwards, so shrinking the {{set}} when its size hits the shrink threshold is necessary and improves NN memory usage. The same applies to {{map}}s: some HashMaps used in BlockManager (e.g., the hashmap in CorruptReplicasMap) would be better off shrinkable. I think it's worth implementing a ShrinkableHashMap that extends the Java HashMap; at a quick glance, little code seems to be needed. Implement ShrinkableHashMap extends java HashMap and use properly - Key: HDFS-8912 URL: https://issues.apache.org/jira/browse/HDFS-8912 Project: Hadoop HDFS Issue Type: Sub-task Reporter: Yi Liu Assignee: Yi Liu Currently {{LightWeightHashSet}} and {{LightWeightLinkedSet}} are used in HDFS; they have two advantages over the Java HashSet: each entry requires less memory, and they are shrinkable. In a real cluster, HDFS is a long-running service, and a {{set}} may become large at some point and small again afterwards, so shrinking the {{set}} when its size hits the shrink threshold is necessary and improves NN memory usage. The same applies to {{map}}s: some HashMaps used in BlockManager (e.g., the hashmap in CorruptReplicasMap) would be better off shrinkable. I think it's worth implementing a ShrinkableHashMap that extends the Java HashMap; at a quick glance, little code seems to be needed. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-8792) BlockManager#postponedMisreplicatedBlocks should use a LightWeightHashSet to save memory
[ https://issues.apache.org/jira/browse/HDFS-8792?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14700333#comment-14700333 ] Hudson commented on HDFS-8792: -- FAILURE: Integrated in Hadoop-trunk-Commit #8314 (See [https://builds.apache.org/job/Hadoop-trunk-Commit/8314/]) HDFS-8792. BlockManager#postponedMisreplicatedBlocks should use a LightWeightHashSet to save memory (Yi Liu via Colin P. McCabe) (cmccabe: rev c77bd6af16cbc26f88a2c6d8220db83a3e1caa2c) * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/BlockManager.java * hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt * hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/util/TestLightWeightHashSet.java * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/util/LightWeightHashSet.java BlockManager#postponedMisreplicatedBlocks should use a LightWeightHashSet to save memory Key: HDFS-8792 URL: https://issues.apache.org/jira/browse/HDFS-8792 Project: Hadoop HDFS Issue Type: Sub-task Reporter: Yi Liu Assignee: Yi Liu Fix For: 2.8.0 Attachments: HDFS-8792.001.patch, HDFS-8792.002.patch, HDFS-8792.003.patch {{LightWeightHashSet}} requires less memory than the Java HashSet. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-8845) DiskChecker should not traverse the entire tree
[ https://issues.apache.org/jira/browse/HDFS-8845?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Colin Patrick McCabe updated HDFS-8845: --- Summary: DiskChecker should not traverse the entire tree (was: DiskChecker should not traverse entire tree) DiskChecker should not traverse the entire tree --- Key: HDFS-8845 URL: https://issues.apache.org/jira/browse/HDFS-8845 Project: Hadoop HDFS Issue Type: Bug Reporter: Chang Li Assignee: Chang Li Attachments: HDFS-8845.patch DiskChecker should not traverse the entire tree, because doing so causes heavy disk load in checkDiskError() -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-8845) DiskChecker should not traverse the entire tree
[ https://issues.apache.org/jira/browse/HDFS-8845?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14700353#comment-14700353 ] Andrew Wang commented on HDFS-8845: --- Yeah, sounds good to me too. Colin explained to me offline that the BlockScanner processes suspect blocks like these first, so anything that would get caught by a path checker will quickly be caught by the BlockScanner. Thanks all. DiskChecker should not traverse the entire tree --- Key: HDFS-8845 URL: https://issues.apache.org/jira/browse/HDFS-8845 Project: Hadoop HDFS Issue Type: Bug Reporter: Chang Li Assignee: Chang Li Fix For: 2.8.0 Attachments: HDFS-8845.patch DiskChecker should not traverse the entire tree, because doing so causes heavy disk load in checkDiskError() -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-8909) Erasure coding: update BlockInfoContiguousUC and BlockInfoStripedUC to use BlockUnderConstructionFeature
[ https://issues.apache.org/jira/browse/HDFS-8909?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14700556#comment-14700556 ] Walter Su commented on HDFS-8909: - I'm afraid the patch will make the earlier branch commits impossible to rebase. To [~zhz]: the patch should be squashed together with the others. Erasure coding: update BlockInfoContiguousUC and BlockInfoStripedUC to use BlockUnderConstructionFeature Key: HDFS-8909 URL: https://issues.apache.org/jira/browse/HDFS-8909 Project: Hadoop HDFS Issue Type: Sub-task Components: namenode Affects Versions: HDFS-7285 Reporter: Zhe Zhang Assignee: Jing Zhao Attachments: HDFS-8909.000.patch HDFS-8801 converts {{BlockInfoUC}} into a feature. We should consolidate the {{BlockInfoContiguousUC}} and {{BlockInfoStripedUC}} logic to use this feature. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-8880) NameNode metrics logging
[ https://issues.apache.org/jira/browse/HDFS-8880?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14700606#comment-14700606 ] Hudson commented on HDFS-8880: -- FAILURE: Integrated in Hadoop-trunk-Commit #8315 (See [https://builds.apache.org/job/Hadoop-trunk-Commit/8315/]) HDFS-8880. NameNode metrics logging. (Arpit Agarwal) (arp: rev a88f31ebf3433392419127816f168136de0a9e77) * hadoop-hdfs-project/hadoop-hdfs/src/test/resources/log4j.properties * hadoop-common-project/hadoop-common/src/main/conf/log4j.properties * hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/namenode/TestNameNodeMetricsLogger.java * hadoop-hdfs-project/hadoop-hdfs/src/main/resources/hdfs-default.xml * hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/NameNode.java * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/DFSConfigKeys.java * hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/metrics2/util/MBeans.java NameNode metrics logging Key: HDFS-8880 URL: https://issues.apache.org/jira/browse/HDFS-8880 Project: Hadoop HDFS Issue Type: Improvement Components: namenode Reporter: Arpit Agarwal Assignee: Arpit Agarwal Fix For: 2.8.0 Attachments: HDFS-8880.01.patch, HDFS-8880.02.patch, HDFS-8880.03.patch, HDFS-8880.04.patch, namenode-metrics.log The NameNode can periodically log metrics to help debugging when the cluster is not set up with another metrics monitoring scheme. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-8845) DiskChecker should not traverse the entire tree
[ https://issues.apache.org/jira/browse/HDFS-8845?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14700604#comment-14700604 ] Hudson commented on HDFS-8845: -- FAILURE: Integrated in Hadoop-trunk-Commit #8315 (See [https://builds.apache.org/job/Hadoop-trunk-Commit/8315/]) HDFS-8845. DiskChecker should not traverse the entire tree (Chang Li via Colin P. McCabe) (cmccabe: rev ec183faadcf7edaf432aca3b25d24215d505c2ec) * hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/fsdataset/impl/BlockPoolSlice.java DiskChecker should not traverse the entire tree --- Key: HDFS-8845 URL: https://issues.apache.org/jira/browse/HDFS-8845 Project: Hadoop HDFS Issue Type: Bug Reporter: Chang Li Assignee: Chang Li Fix For: 2.8.0 Attachments: HDFS-8845.patch DiskChecker should not traverse the entire tree, because doing so causes heavy disk load in checkDiskError() -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-8278) HDFS Balancer should consider remaining storage % when checking for under-utilized machines
[ https://issues.apache.org/jira/browse/HDFS-8278?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14700607#comment-14700607 ] Hudson commented on HDFS-8278: -- FAILURE: Integrated in Hadoop-trunk-Commit #8315 (See [https://builds.apache.org/job/Hadoop-trunk-Commit/8315/]) HDFS-8278. When computing max-size-to-move in Balancer, count only the storage with remaining >= default block size. (szetszwo: rev 51a00964da0e399718d1cec25ff692a32d7642b7) * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/balancer/Balancer.java * hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt HDFS Balancer should consider remaining storage % when checking for under-utilized machines --- Key: HDFS-8278 URL: https://issues.apache.org/jira/browse/HDFS-8278 Project: Hadoop HDFS Issue Type: Sub-task Components: balancer mover Affects Versions: 2.8.0 Reporter: Gopal V Assignee: Tsz Wo Nicholas Sze Fix For: 2.8.0 Attachments: h8278_20150817.patch The DFS balancer mistakenly identifies a node with very little storage space remaining as an under-utilized node and tries to move large amounts of data to that particular node. All these block moves fail to execute, as the % utilization is less relevant than the DFS remaining storage on that node.
{code}
15/04/24 04:25:55 INFO balancer.Balancer: 0 over-utilized: []
15/04/24 04:25:55 INFO balancer.Balancer: 1 underutilized: [172.19.1.46:50010:DISK]
15/04/24 04:25:55 INFO balancer.Balancer: Need to move 47.68 GB to make the cluster balanced.
15/04/24 04:25:55 INFO balancer.Balancer: Decided to move 413.08 MB bytes from 172.19.1.52:50010:DISK to 172.19.1.46:50010:DISK
15/04/24 04:25:55 INFO balancer.Balancer: Will move 413.08 MB in this iteration
15/04/24 04:25:55 WARN balancer.Dispatcher: Failed to move blk_1078689321_1099517353638 with size=131146 from 172.19.1.52:50010:DISK to 172.19.1.46:50010:DISK through 172.19.1.53:50010: Got error, status message opReplaceBlock BP-942051088-172.18.1.41-1370508013893:blk_1078689321_1099517353638 received exception org.apache.hadoop.util.DiskChecker$DiskOutOfSpaceException: Out of space: The volume with the most available space (=225042432 B) is less than the block size (=268435456 B)., block move is failed
{code}
The machine in question is under-full when it comes to the BP utilization, but has very little free space available for blocks.
{code}
Decommission Status : Normal
Configured Capacity: 3826907185152 (3.48 TB)
DFS Used: 2817262833664 (2.56 TB)
Non DFS Used: 1000621305856 (931.90 GB)
DFS Remaining: 9023045632 (8.40 GB)
DFS Used%: 73.62%
DFS Remaining%: 0.24%
Configured Cache Capacity: 8589934592 (8 GB)
Cache Used: 0 (0 B)
Cache Remaining: 8589934592 (8 GB)
Cache Used%: 0.00%
Cache Remaining%: 100.00%
Xceivers: 3
Last contact: Fri Apr 24 04:28:36 PDT 2015
{code}
The machine has 0.40 Gb of non-RAM storage available on that node, so it is futile to attempt to move any blocks to that particular machine. This is a similar concern when a machine loses disks, since the utilization comparisons always compare percentages per node. Even that scenario needs to cap data movement to that node by the DFS Remaining % value. Trying to move any more data than that to a given node will always fail. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
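For clarity, here is the arithmetic behind the failed moves above as a tiny sketch. The numbers come straight from the Dispatcher warning; the check shown is a simplification of the idea behind HDFS-8278, not the actual patch code.
{code}
public class BalancerSpaceCheck {
  public static void main(String[] args) {
    // Numbers from the Dispatcher warning above.
    long mostAvailable = 225042432L;    // ~214.6 MB free on the best volume
    long defaultBlockSize = 268435456L; // 256 MB default block size
    // HDFS-8278: only count a storage as a move target when its remaining
    // space can hold at least one default-sized block.
    boolean usableTarget = mostAvailable >= defaultBlockSize;
    System.out.println("usable as balancing target? " + usableTarget); // false
  }
}
{code}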
[jira] [Commented] (HDFS-8862) BlockManager#excessReplicateMap should use a HashMap
[ https://issues.apache.org/jira/browse/HDFS-8862?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14700603#comment-14700603 ] Hudson commented on HDFS-8862: -- FAILURE: Integrated in Hadoop-trunk-Commit #8315 (See [https://builds.apache.org/job/Hadoop-trunk-Commit/8315/]) HDFS-8862. BlockManager#excessReplicateMap should use a HashMap. (yliu) (yliu: rev 71566e23820d33e0110ca55eded3299735e970b9) * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/BlockManager.java * hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt BlockManager#excessReplicateMap should use a HashMap Key: HDFS-8862 URL: https://issues.apache.org/jira/browse/HDFS-8862 Project: Hadoop HDFS Issue Type: Sub-task Components: namenode Reporter: Yi Liu Assignee: Yi Liu Fix For: 2.8.0 Attachments: HDFS-8862.001.patch Per [~cmccabe]'s comments in HDFS-8792, this JIRA is to discuss improving {{BlockManager#excessReplicateMap}}. It's true that a HashMap doesn't ever shrink when elements are removed, but a TreeMap entry needs to store more references (left, right, parent) than a HashMap entry (only one reference, next); even when removals leave some buckets empty, an empty HashMap bucket is just a {{null}} reference (4 bytes), so the two are close on this point. On the other hand, the key of {{excessReplicateMap}} is the datanode UUID, so the number of entries is almost fixed, and in this case HashMap memory usage is better than TreeMap memory usage. I think the most important factor is search/insert/remove performance, where HashMap is clearly better than TreeMap. Because we don't need sorting, we should use HashMap instead of TreeMap. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
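As a rough sketch of the change under discussion (the field declaration below only approximates the BlockManager code; the value type and generics are assumptions, not a verbatim diff):
{code}
import java.util.HashMap;
import java.util.Map;

public class ExcessReplicateMapSketch {
  // Keys are DataNode UUIDs, so no sort order is needed. A HashMap gives
  // O(1) search/insert/remove versus TreeMap's O(log n), each entry holds
  // one 'next' reference instead of left/right/parent, and because the set
  // of datanodes is roughly fixed, HashMap's inability to shrink is fine.
  //
  // Before (sorted): new TreeMap<String, Object>()
  private final Map<String, Object> excessReplicateMap = new HashMap<>();
  // (In BlockManager the value is a light-weight set of blocks; Object is
  // used here only to keep the sketch self-contained.)
}
{code}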
[jira] [Commented] (HDFS-8895) Remove deprecated BlockStorageLocation APIs
[ https://issues.apache.org/jira/browse/HDFS-8895?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14700605#comment-14700605 ] Hudson commented on HDFS-8895: -- FAILURE: Integrated in Hadoop-trunk-Commit #8315 (See [https://builds.apache.org/job/Hadoop-trunk-Commit/8315/]) HDFS-8895. Remove deprecated BlockStorageLocation APIs. (wang: rev eee4d716b48074825e1afcd9c74038a393ddeb69) * hadoop-hdfs-project/hadoop-hdfs/src/main/resources/hdfs-default.xml * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/fs/VolumeId.java * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/DFSClient.java * hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/datanode/extdataset/ExternalDatasetImpl.java * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/protocolPB/ClientDatanodeProtocolServerSideTranslatorPB.java * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/fs/BlockStorageLocation.java * hadoop-hdfs-project/hadoop-hdfs-client/src/main/proto/ClientDatanodeProtocol.proto * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/client/impl/DfsClientConf.java * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/protocol/ClientDatanodeProtocol.java * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/HdfsConfiguration.java * hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/fs/TestVolumeId.java * hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestDistributedFileSystem.java * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/protocolPB/ClientDatanodeProtocolTranslatorPB.java * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/DistributedFileSystem.java * hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/datanode/SimulatedFSDataset.java * hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/fsdataset/FsDatasetSpi.java * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/protocol/HdfsBlocksMetadata.java * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/fs/HdfsVolumeId.java * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/BlockStorageLocationUtil.java * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/DFSConfigKeys.java * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/DataNode.java * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/fsdataset/impl/FsDatasetImpl.java Remove deprecated BlockStorageLocation APIs --- Key: HDFS-8895 URL: https://issues.apache.org/jira/browse/HDFS-8895 Project: Hadoop HDFS Issue Type: Improvement Affects Versions: 3.0.0 Reporter: Andrew Wang Assignee: Andrew Wang Fix For: 3.0.0 Attachments: HDFS-8895.001.patch HDFS-8887 supercedes DistributedFileSystem#getFileBlockStorageLocations, so it can be removed from trunk. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-8908) TestAppendSnapshotTruncate may fail with IOException: Failed to replace a bad datanode
[ https://issues.apache.org/jira/browse/HDFS-8908?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14700325#comment-14700325 ] Hadoop QA commented on HDFS-8908: - \\ \\ | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:red}-1{color} | pre-patch | 5m 54s | Findbugs (version ) appears to be broken on trunk. | | {color:green}+1{color} | @author | 0m 0s | The patch does not contain any @author tags. | | {color:green}+1{color} | tests included | 0m 0s | The patch appears to include 1 new or modified test files. | | {color:green}+1{color} | javac | 7m 34s | There were no new javac warning messages. | | {color:green}+1{color} | release audit | 0m 21s | The applied patch does not increase the total number of release audit warnings. | | {color:green}+1{color} | checkstyle | 0m 46s | There were no new checkstyle issues. | | {color:green}+1{color} | whitespace | 0m 0s | The patch has no lines that end in whitespace. | | {color:green}+1{color} | install | 1m 23s | mvn install still works. | | {color:green}+1{color} | eclipse:eclipse | 0m 32s | The patch built with eclipse:eclipse. | | {color:green}+1{color} | findbugs | 2m 28s | The patch does not introduce any new Findbugs (version 3.0.0) warnings. | | {color:green}+1{color} | native | 1m 6s | Pre-build of native portion | | {color:red}-1{color} | hdfs tests | 175m 2s | Tests failed in hadoop-hdfs. | | | | 195m 10s | | \\ \\ || Reason || Tests || | Failed build | hadoop-hdfs | \\ \\ || Subsystem || Report/Notes || | Patch URL | http://issues.apache.org/jira/secure/attachment/12750861/h8908_20150817.patch | | Optional Tests | javac unit findbugs checkstyle | | git revision | trunk / e535e0f | | hadoop-hdfs test log | https://builds.apache.org/job/PreCommit-HDFS-Build/12009/artifact/patchprocess/testrun_hadoop-hdfs.txt | | Test Results | https://builds.apache.org/job/PreCommit-HDFS-Build/12009/testReport/ | | Java | 1.7.0_55 | | uname | Linux asf906.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux | | Console output | https://builds.apache.org/job/PreCommit-HDFS-Build/12009/console | This message was automatically generated. TestAppendSnapshotTruncate may fail with IOException: Failed to replace a bad datanode -- Key: HDFS-8908 URL: https://issues.apache.org/jira/browse/HDFS-8908 Project: Hadoop HDFS Issue Type: Bug Components: test Reporter: Tsz Wo Nicholas Sze Assignee: Tsz Wo Nicholas Sze Priority: Minor Attachments: h8908_20150817.patch See https://builds.apache.org/job/PreCommit-HDFS-Build/12005/testReport/org.apache.hadoop.hdfs/TestAppendSnapshotTruncate/testAST/ -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-8911) NameNode Metric : Add WAL counters as a JMX metric
[ https://issues.apache.org/jira/browse/HDFS-8911?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14700503#comment-14700503 ] Anu Engineer commented on HDFS-8911: [~andrew.wang] Thanks for letting me know. I will fix that. NameNode Metric : Add WAL counters as a JMX metric -- Key: HDFS-8911 URL: https://issues.apache.org/jira/browse/HDFS-8911 Project: Hadoop HDFS Issue Type: Improvement Components: HDFS Affects Versions: 2.7.1 Reporter: Anu Engineer Assignee: Anu Engineer Attachments: HDFS-8911.001.patch Today we log Write Ahead Log metrics in the log. This JIRA proposes to expose those metrics via JMX. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-8911) NameNode Metric : Add Editlog counters as a JMX metric
[ https://issues.apache.org/jira/browse/HDFS-8911?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Anu Engineer updated HDFS-8911: --- Summary: NameNode Metric : Add Editlog counters as a JMX metric (was: NameNode Metric : Add WAL counters as a JMX metric) NameNode Metric : Add Editlog counters as a JMX metric -- Key: HDFS-8911 URL: https://issues.apache.org/jira/browse/HDFS-8911 Project: Hadoop HDFS Issue Type: Improvement Components: HDFS Affects Versions: 2.7.1 Reporter: Anu Engineer Assignee: Anu Engineer Attachments: HDFS-8911.001.patch Today we log Write Ahead Log metrics in the log. This JIRA proposes to expose those metrics via JMX. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-8911) NameNode Metric : Add Editlog counters as a JMX metric
[ https://issues.apache.org/jira/browse/HDFS-8911?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Anu Engineer updated HDFS-8911: --- Description: Today we log editlog metrics in the log. This JIRA proposes to expose those metrics via JMX. (was: Today we log Write Ahead Log metrics in the log. This JIRA proposes to expose those metrics via JMX.) NameNode Metric : Add Editlog counters as a JMX metric -- Key: HDFS-8911 URL: https://issues.apache.org/jira/browse/HDFS-8911 Project: Hadoop HDFS Issue Type: Improvement Components: HDFS Affects Versions: 2.7.1 Reporter: Anu Engineer Assignee: Anu Engineer Attachments: HDFS-8911.001.patch Today we log editlog metrics in the log. This JIRA proposes to expose those metrics via JMX. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Comment Edited] (HDFS-8862) BlockManager#excessReplicateMap should use a HashMap
[ https://issues.apache.org/jira/browse/HDFS-8862?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14700577#comment-14700577 ] Yi Liu edited comment on HDFS-8862 at 8/18/15 1:39 AM: --- One more discussion: do you think it's worth extending the Java HashMap to implement {{shrink}}? It would be better to have a shrinkable HashMap in some places. From my point of view, I think it's worthwhile, and at a quick glance little code seems to be needed. was (Author: hitliuyi): One more discussion: do you think it's worth extending the Java HashMap to implement {{shrink}}? It would be better to have the shrinked HashMap in some places. From my point of view, I think it's worthwhile, and at a quick glance little code seems to be needed. BlockManager#excessReplicateMap should use a HashMap Key: HDFS-8862 URL: https://issues.apache.org/jira/browse/HDFS-8862 Project: Hadoop HDFS Issue Type: Sub-task Components: namenode Reporter: Yi Liu Assignee: Yi Liu Fix For: 2.8.0 Attachments: HDFS-8862.001.patch Per [~cmccabe]'s comments in HDFS-8792, this JIRA is to discuss improving {{BlockManager#excessReplicateMap}}. It's true that a HashMap doesn't ever shrink when elements are removed, but a TreeMap entry needs to store more references (left, right, parent) than a HashMap entry (only one reference, next); even when removals leave some buckets empty, an empty HashMap bucket is just a {{null}} reference (4 bytes), so the two are close on this point. On the other hand, the key of {{excessReplicateMap}} is the datanode UUID, so the number of entries is almost fixed, and in this case HashMap memory usage is better than TreeMap memory usage. I think the most important factor is search/insert/remove performance, where HashMap is clearly better than TreeMap. Because we don't need sorting, we should use HashMap instead of TreeMap. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-8909) Erasure coding: update BlockInfoContiguousUC and BlockInfoStripedUC to use BlockUnderConstructionFeature
[ https://issues.apache.org/jira/browse/HDFS-8909?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14700610#comment-14700610 ] Walter Su commented on HDFS-8909: - bq. Instead of continuing to do git rebase, maybe we should switch to git merge now. We can skip HDFS-8801 when merging trunk changes. bq. a small change in trunk can cause conflicts for rebasing a large number of commits in the feature branch --- quote from Jing Zhao \[jing.apa...@gmail.com\] in the common-dev mailing list. Totally agree. HDFS-8801 is an example. HDFS-8909 just tries to merge some parts from trunk to the branch, which is no different from 'git merge'. Erasure coding: update BlockInfoContiguousUC and BlockInfoStripedUC to use BlockUnderConstructionFeature Key: HDFS-8909 URL: https://issues.apache.org/jira/browse/HDFS-8909 Project: Hadoop HDFS Issue Type: Sub-task Components: namenode Affects Versions: HDFS-7285 Reporter: Zhe Zhang Assignee: Jing Zhao Attachments: HDFS-8909.000.patch HDFS-8801 converts {{BlockInfoUC}} into a feature. We should consolidate the {{BlockInfoContiguousUC}} and {{BlockInfoStripedUC}} logic to use this feature. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (HDFS-8912) Implement ShrinkableHashMap extends java HashMap and use properly
Yi Liu created HDFS-8912: Summary: Implement ShrinkableHashMap extends java HashMap and use properly Key: HDFS-8912 URL: https://issues.apache.org/jira/browse/HDFS-8912 Project: Hadoop HDFS Issue Type: Sub-task Reporter: Yi Liu Assignee: Yi Liu Currently {{LightWeightHashSet}} and {{LightWeightLinkedSet}} are used in HDFS; they have two advantages over the Java HashSet: each entry requires less memory, and the set is shrinkable. In a real cluster, HDFS is a long-running service, and a {{set}} may become very large at some point and small again later, so shrinking the {{set}} when its size hits the shrink threshold is necessary and can improve NN memory usage. The same applies to {{map}}s: some HashMaps used in BlockManager (e.g., the HashMap in CorruptReplicasMap) would be better off shrinkable. I think it is worth implementing a ShrinkableHashMap that extends the Java HashMap; at a quick glance, little code seems to be needed. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-8823) Move replication factor into individual blocks
[ https://issues.apache.org/jira/browse/HDFS-8823?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14700639#comment-14700639 ] Hadoop QA commented on HDFS-8823: - \\ \\ | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | pre-patch | 17m 17s | Pre-patch trunk compilation is healthy. | | {color:green}+1{color} | @author | 0m 0s | The patch does not contain any @author tags. | | {color:green}+1{color} | tests included | 0m 0s | The patch appears to include 7 new or modified test files. | | {color:green}+1{color} | javac | 7m 41s | There were no new javac warning messages. | | {color:green}+1{color} | javadoc | 9m 45s | There were no new javadoc warning messages. | | {color:green}+1{color} | release audit | 0m 23s | The applied patch does not increase the total number of release audit warnings. | | {color:red}-1{color} | checkstyle | 1m 24s | The applied patch generated 7 new checkstyle issues (total was 649, now 651). | | {color:green}+1{color} | whitespace | 0m 7s | The patch has no lines that end in whitespace. | | {color:green}+1{color} | install | 1m 20s | mvn install still works. | | {color:green}+1{color} | eclipse:eclipse | 0m 32s | The patch built with eclipse:eclipse. | | {color:red}-1{color} | findbugs | 2m 37s | The patch appears to introduce 1 new Findbugs (version 3.0.0) warnings. | | {color:green}+1{color} | native | 3m 3s | Pre-build of native portion | | {color:red}-1{color} | hdfs tests | 175m 5s | Tests failed in hadoop-hdfs. | | | | 219m 20s | | \\ \\ || Reason || Tests || | FindBugs | module:hadoop-hdfs | | Failed unit tests | hadoop.hdfs.server.namenode.ha.TestRetryCacheWithHA | | | hadoop.hdfs.web.TestWebHDFSAcl | | | hadoop.hdfs.TestAppendSnapshotTruncate | | Timed out tests | org.apache.hadoop.cli.TestHDFSCLI | \\ \\ || Subsystem || Report/Notes || | Patch URL | http://issues.apache.org/jira/secure/attachment/12750885/HDFS-8823.005.patch | | Optional Tests | javadoc javac unit findbugs checkstyle | | git revision | trunk / ec183fa | | checkstyle | https://builds.apache.org/job/PreCommit-HDFS-Build/12014/artifact/patchprocess/diffcheckstylehadoop-hdfs.txt | | Findbugs warnings | https://builds.apache.org/job/PreCommit-HDFS-Build/12014/artifact/patchprocess/newPatchFindbugsWarningshadoop-hdfs.html | | hadoop-hdfs test log | https://builds.apache.org/job/PreCommit-HDFS-Build/12014/artifact/patchprocess/testrun_hadoop-hdfs.txt | | Test Results | https://builds.apache.org/job/PreCommit-HDFS-Build/12014/testReport/ | | Java | 1.7.0_55 | | uname | Linux asf901.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux | | Console output | https://builds.apache.org/job/PreCommit-HDFS-Build/12014/console | This message was automatically generated. Move replication factor into individual blocks -- Key: HDFS-8823 URL: https://issues.apache.org/jira/browse/HDFS-8823 Project: Hadoop HDFS Issue Type: Improvement Reporter: Haohui Mai Assignee: Haohui Mai Attachments: HDFS-8823.000.patch, HDFS-8823.001.patch, HDFS-8823.002.patch, HDFS-8823.003.patch, HDFS-8823.004.patch, HDFS-8823.005.patch This jira proposes to record the replication factor in the {{BlockInfo}} class. The changes have two advantages: * Decoupling the namespace and the block management layer. It is a prerequisite step to move block management off the heap or to a separate process. * Increased flexibility on replicating blocks. 
Currently the replication factors of all blocks have to be the same: the replication factors of these blocks are equal to the highest replication factor across all snapshots. The changes will allow blocks in a file to have different replication factors, potentially saving some space. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
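To illustrate the direction, here is a toy class invented for this example; the real changes are in the attached patches, and the names here are not from them.
{code}
// Sketch: once each block records its own replication factor, block
// management no longer needs to reach back into the INodeFile, and blocks
// of one file (e.g., blocks kept alive only by an old snapshot) can carry
// lower replication factors than the file's current setting.
public class BlockInfoSketch {
  private final long blockId;
  private short replication;

  public BlockInfoSketch(long blockId, short replication) {
    this.blockId = blockId;
    this.replication = replication;
  }

  public long getBlockId() { return blockId; }
  public short getReplication() { return replication; }
  public void setReplication(short replication) { this.replication = replication; }
}
{code}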
[jira] [Commented] (HDFS-8862) Improve BlockManager#excessReplicateMap
[ https://issues.apache.org/jira/browse/HDFS-8862?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14700352#comment-14700352 ] Colin Patrick McCabe commented on HDFS-8862: I guess the set of datanodes is not going to shrink that much over the life of the cluster, so the fact that this data structure can't shrink should be OK. We may want to look into whether that {{LightWeightLinkedSet<BlockInfo>}} can shrink... but that is outside the scope of this JIRA. +1. Improve BlockManager#excessReplicateMap --- Key: HDFS-8862 URL: https://issues.apache.org/jira/browse/HDFS-8862 Project: Hadoop HDFS Issue Type: Sub-task Components: namenode Reporter: Yi Liu Assignee: Yi Liu Attachments: HDFS-8862.001.patch Per [~cmccabe]'s comments in HDFS-8792, this JIRA is to discuss improving {{BlockManager#excessReplicateMap}}. It's true that a HashMap doesn't ever shrink when elements are removed, but a TreeMap entry needs to store more references (left, right, parent) than a HashMap entry (only one reference, next); even when removals leave some buckets empty, an empty HashMap bucket is just a {{null}} reference (4 bytes), so the two are close on this point. On the other hand, the key of {{excessReplicateMap}} is the datanode UUID, so the number of entries is almost fixed, and in this case HashMap memory usage is better than TreeMap memory usage. I think the most important factor is search/insert/remove performance, where HashMap is clearly better than TreeMap. Because we don't need sorting, we should use HashMap instead of TreeMap. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-8846) Create edit log files with old layout version for upgrade testing
[ https://issues.apache.org/jira/browse/HDFS-8846?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14700361#comment-14700361 ] Colin Patrick McCabe commented on HDFS-8846: Hi [~zhz], It seems that you left out the binary changes: {code} diff --git a/hadoop-hdfs-project/hadoop-hdfs/src/test/resources/hadoop-252-dfs-dir.tgz b/hadoop-hdfs-project/hadoop-hdfs/src/test/resources/hadoop-252-dfs-dir.tgz new file mode 100644 index 000..2aaab18 Binary files /dev/null and b/hadoop-hdfs-project/hadoop-hdfs/src/test/resources/hadoop-252-dfs-dir.tgz differ {code} You should create the patch with {{\-\-binary}} so that these are included Create edit log files with old layout version for upgrade testing - Key: HDFS-8846 URL: https://issues.apache.org/jira/browse/HDFS-8846 Project: Hadoop HDFS Issue Type: Bug Components: namenode Affects Versions: 2.7.1 Reporter: Zhe Zhang Assignee: Zhe Zhang Attachments: HDFS-8846.00.patch Per discussion under HDFS-8480, we should create some edit log files with old layout version, to test whether they can be correctly handled in upgrades. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-8435) createNonRecursive support needed in WebHdfsFileSystem to support HBase
[ https://issues.apache.org/jira/browse/HDFS-8435?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jakob Homan updated HDFS-8435: -- Status: Open (was: Patch Available) createNonRecursive support needed in WebHdfsFileSystem to support HBase --- Key: HDFS-8435 URL: https://issues.apache.org/jira/browse/HDFS-8435 Project: Hadoop HDFS Issue Type: Improvement Components: webhdfs Affects Versions: 2.6.0 Reporter: Vinoth Sathappan Assignee: Jakob Homan Attachments: HDFS-8435-branch-2.7.001.patch, HDFS-8435.001.patch, HDFS-8435.002.patch, HDFS-8435.003.patch The WebHdfsFileSystem implementation doesn't support createNonRecursive. HBase depends on it extensively for proper functioning. Currently, when the region servers are started over WebHDFS, they crash with:
createNonRecursive unsupported for this filesystem class org.apache.hadoop.hdfs.web.SWebHdfsFileSystem
at org.apache.hadoop.fs.FileSystem.createNonRecursive(FileSystem.java:1137)
at org.apache.hadoop.fs.FileSystem.createNonRecursive(FileSystem.java:1112)
at org.apache.hadoop.fs.FileSystem.createNonRecursive(FileSystem.java:1088)
at org.apache.hadoop.hbase.regionserver.wal.ProtobufLogWriter.init(ProtobufLogWriter.java:85)
at org.apache.hadoop.hbase.regionserver.wal.HLogFactory.createWriter(HLogFactory.java:198)
-- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-8909) Erasure coding: update BlockInfoContiguousUC and BlockInfoStripedUC to use BlockUnderConstructionFeature
[ https://issues.apache.org/jira/browse/HDFS-8909?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14700589#comment-14700589 ] Jing Zhao commented on HDFS-8909: - Instead of continuing to do git rebase, maybe we should switch to git merge now. We can skip HDFS-8801 when merging trunk changes. Erasure coding: update BlockInfoContiguousUC and BlockInfoStripedUC to use BlockUnderConstructionFeature Key: HDFS-8909 URL: https://issues.apache.org/jira/browse/HDFS-8909 Project: Hadoop HDFS Issue Type: Sub-task Components: namenode Affects Versions: HDFS-7285 Reporter: Zhe Zhang Assignee: Jing Zhao Attachments: HDFS-8909.000.patch HDFS-8801 converts {{BlockInfoUC}} into a feature. We should consolidate the {{BlockInfoContiguousUC}} and {{BlockInfoStripedUC}} logic to use this feature. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (HDFS-8911) NameNode Metric : Add WAL counters as a JMX metric
Anu Engineer created HDFS-8911: -- Summary: NameNode Metric : Add WAL counters as a JMX metric Key: HDFS-8911 URL: https://issues.apache.org/jira/browse/HDFS-8911 Project: Hadoop HDFS Issue Type: Improvement Components: HDFS Affects Versions: 2.7.1 Reporter: Anu Engineer Assignee: Anu Engineer Today we log Write Ahead Log metrics in the log. This JIRA proposes to expose those metrics via JMX. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-8846) Create edit log files with old layout version for upgrade testing
[ https://issues.apache.org/jira/browse/HDFS-8846?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zhe Zhang updated HDFS-8846: Attachment: HDFS-8846.01.patch Thanks for the good catch Colin! Updating the patch with the binary diff. Create edit log files with old layout version for upgrade testing - Key: HDFS-8846 URL: https://issues.apache.org/jira/browse/HDFS-8846 Project: Hadoop HDFS Issue Type: Bug Components: namenode Affects Versions: 2.7.1 Reporter: Zhe Zhang Assignee: Zhe Zhang Attachments: HDFS-8846.00.patch, HDFS-8846.01.patch Per discussion under HDFS-8480, we should create some edit log files with old layout version, to test whether they can be correctly handled in upgrades. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-8909) Erasure coding: update BlockInfoContiguousUC and BlockInfoStripedUC to use BlockUnderConstructionFeature
[ https://issues.apache.org/jira/browse/HDFS-8909?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14700481#comment-14700481 ] Jing Zhao commented on HDFS-8909: - I think that, to save us time from now on, we can use git merge to merge trunk changes into the EC feature branch. So either feature branch is OK with me. Which one do you prefer? Erasure coding: update BlockInfoContiguousUC and BlockInfoStripedUC to use BlockUnderConstructionFeature Key: HDFS-8909 URL: https://issues.apache.org/jira/browse/HDFS-8909 Project: Hadoop HDFS Issue Type: Sub-task Components: namenode Affects Versions: HDFS-7285 Reporter: Zhe Zhang Assignee: Jing Zhao HDFS-8801 converts {{BlockInfoUC}} into a feature. We should consolidate the {{BlockInfoContiguousUC}} and {{BlockInfoStripedUC}} logic to use this feature. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-8909) Erasure coding: update BlockInfoContiguousUC and BlockInfoStripedUC to use BlockUnderConstructionFeature
[ https://issues.apache.org/jira/browse/HDFS-8909?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jing Zhao updated HDFS-8909: Attachment: HDFS-8909.000.patch Patch against the HDFS-7285-REBASE branch. Erasure coding: update BlockInfoContiguousUC and BlockInfoStripedUC to use BlockUnderConstructionFeature Key: HDFS-8909 URL: https://issues.apache.org/jira/browse/HDFS-8909 Project: Hadoop HDFS Issue Type: Sub-task Components: namenode Affects Versions: HDFS-7285 Reporter: Zhe Zhang Assignee: Jing Zhao Attachments: HDFS-8909.000.patch HDFS-8801 converts {{BlockInfoUC}} into a feature. We should consolidate the {{BlockInfoContiguousUC}} and {{BlockInfoStripedUC}} logic to use this feature. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-8880) NameNode metrics logging
[ https://issues.apache.org/jira/browse/HDFS-8880?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Arpit Agarwal updated HDFS-8880: Resolution: Fixed Hadoop Flags: Reviewed Fix Version/s: 2.8.0 Target Version/s: (was: 2.8.0) Status: Resolved (was: Patch Available) Committed to trunk and branch-2. NameNode metrics logging Key: HDFS-8880 URL: https://issues.apache.org/jira/browse/HDFS-8880 Project: Hadoop HDFS Issue Type: Improvement Components: namenode Reporter: Arpit Agarwal Assignee: Arpit Agarwal Fix For: 2.8.0 Attachments: HDFS-8880.01.patch, HDFS-8880.02.patch, HDFS-8880.03.patch, HDFS-8880.04.patch, namenode-metrics.log The NameNode can periodically log metrics to help debugging when the cluster is not set up with another metrics monitoring scheme. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-8435) createNonRecursive support needed in WebHdfsFileSystem to support HBase
[ https://issues.apache.org/jira/browse/HDFS-8435?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jakob Homan updated HDFS-8435: -- Attachment: HDFS-8435.004.patch Fixed the javadoc and whitespace complaints. Unfortunately, as we're adding a deprecated API to WebHDFS, the javac warning is unavoidable. Unit tests that failed or timed out on Jenkins pass repeatedly for me; I consider them spurious. createNonRecursive support needed in WebHdfsFileSystem to support HBase --- Key: HDFS-8435 URL: https://issues.apache.org/jira/browse/HDFS-8435 Project: Hadoop HDFS Issue Type: Improvement Components: webhdfs Affects Versions: 2.6.0 Reporter: Vinoth Sathappan Assignee: Jakob Homan Attachments: HDFS-8435-branch-2.7.001.patch, HDFS-8435.001.patch, HDFS-8435.002.patch, HDFS-8435.003.patch, HDFS-8435.004.patch The WebHdfsFileSystem implementation doesn't support createNonRecursive. HBase depends on it extensively for proper functioning. Currently, when the region servers are started over WebHDFS, they crash with:
createNonRecursive unsupported for this filesystem class org.apache.hadoop.hdfs.web.SWebHdfsFileSystem
at org.apache.hadoop.fs.FileSystem.createNonRecursive(FileSystem.java:1137)
at org.apache.hadoop.fs.FileSystem.createNonRecursive(FileSystem.java:1112)
at org.apache.hadoop.fs.FileSystem.createNonRecursive(FileSystem.java:1088)
at org.apache.hadoop.hbase.regionserver.wal.ProtobufLogWriter.init(ProtobufLogWriter.java:85)
at org.apache.hadoop.hbase.regionserver.wal.HLogFactory.createWriter(HLogFactory.java:198)
-- This message was sent by Atlassian JIRA (v6.3.4#6332)
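For context, this is roughly the (long-deprecated) FileSystem call that HBase's WAL writer makes and that the patch wires up for WebHDFS; the path, buffer size, replication, and block size below are placeholders for this example:
{code}
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FSDataOutputStream;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class CreateNonRecursiveExample {
  public static void main(String[] args) throws Exception {
    FileSystem fs = FileSystem.get(new Configuration());
    // Unlike create(), createNonRecursive() fails when a parent directory
    // is missing; HBase relies on this failure mode for its WAL handling.
    Path wal = new Path("/hbase/WALs/example.log"); // placeholder path
    try (FSDataOutputStream out = fs.createNonRecursive(
        wal, true /* overwrite */, 4096, (short) 3, 128L << 20, null)) {
      out.writeBytes("entry");
    }
  }
}
{code}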
[jira] [Updated] (HDFS-8435) createNonRecursive support needed in WebHdfsFileSystem to support HBase
[ https://issues.apache.org/jira/browse/HDFS-8435?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jakob Homan updated HDFS-8435: -- Status: Patch Available (was: Open) createNonRecursive support needed in WebHdfsFileSystem to support HBase --- Key: HDFS-8435 URL: https://issues.apache.org/jira/browse/HDFS-8435 Project: Hadoop HDFS Issue Type: Improvement Components: webhdfs Affects Versions: 2.6.0 Reporter: Vinoth Sathappan Assignee: Jakob Homan Attachments: HDFS-8435-branch-2.7.001.patch, HDFS-8435.001.patch, HDFS-8435.002.patch, HDFS-8435.003.patch, HDFS-8435.004.patch The WebHdfsFileSystem implementation doesn't support createNonRecursive. HBase depends on it extensively for proper functioning. Currently, when the region servers are started over WebHDFS, they crash with:
createNonRecursive unsupported for this filesystem class org.apache.hadoop.hdfs.web.SWebHdfsFileSystem
at org.apache.hadoop.fs.FileSystem.createNonRecursive(FileSystem.java:1137)
at org.apache.hadoop.fs.FileSystem.createNonRecursive(FileSystem.java:1112)
at org.apache.hadoop.fs.FileSystem.createNonRecursive(FileSystem.java:1088)
at org.apache.hadoop.hbase.regionserver.wal.ProtobufLogWriter.init(ProtobufLogWriter.java:85)
at org.apache.hadoop.hbase.regionserver.wal.HLogFactory.createWriter(HLogFactory.java:198)
-- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-8901) Use ByteBuffer in striping positional read
[ https://issues.apache.org/jira/browse/HDFS-8901?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kai Zheng updated HDFS-8901: Summary: Use ByteBuffer in striping positional read (was: Use ByteBuffer/DirectByteBuffer in striping positional read) Use ByteBuffer in striping positional read -- Key: HDFS-8901 URL: https://issues.apache.org/jira/browse/HDFS-8901 Project: Hadoop HDFS Issue Type: Sub-task Reporter: Kai Zheng Assignee: Kai Zheng Native erasure coders prefer direct ByteBuffers for performance reasons. To prepare for that, this change uses ByteBuffer throughout the code implementing striping positional read. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
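A small sketch of the intent: here {{preferDirect}} stands in for however the coder advertises its preference (Hadoop's raw coder interface exposes a similar capability flag), and the cell size is arbitrary.
{code}
import java.nio.ByteBuffer;

public final class StripingBufferAllocator {
  private StripingBufferAllocator() {}

  // A native coder avoids an extra copy when handed direct buffers, while
  // a pure-Java coder is fastest on heap buffers backed by byte[].
  public static ByteBuffer allocateChunkBuffer(boolean preferDirect, int cellSize) {
    return preferDirect
        ? ByteBuffer.allocateDirect(cellSize)
        : ByteBuffer.allocate(cellSize);
  }
}
{code}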
[jira] [Commented] (HDFS-8897) Loadbalancer always exits with : java.io.IOException: Another Balancer is running.. Exiting ...
[ https://issues.apache.org/jira/browse/HDFS-8897?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14699538#comment-14699538 ] LINTE commented on HDFS-8897: - Thank you for your attention. In fact the balancer seems to look at 2 values:
- fs.defaultFS
- dfs.nameservices
I had fs.defaultFS = hdfs://sandbox/ and dfs.nameservices = sandbox. I removed the / at the end of fs.defaultFS and that solved my error. The balancer should use only one of these values. Regards, Loadbalancer always exits with : java.io.IOException: Another Balancer is running.. Exiting ... Key: HDFS-8897 URL: https://issues.apache.org/jira/browse/HDFS-8897 Project: Hadoop HDFS Issue Type: Bug Components: balancer mover Affects Versions: 2.7.1 Environment: Centos 6.6 Reporter: LINTE When the balancer is launched, it should test whether there is already a /system/balancer.id file in HDFS. Even when the file doesn't exist, the balancer refuses to run:
15/08/14 16:35:12 INFO balancer.Balancer: namenodes = [hdfs://sandbox/, hdfs://sandbox]
15/08/14 16:35:12 INFO balancer.Balancer: parameters = Balancer.Parameters[BalancingPolicy.Node, threshold=10.0, max idle iteration = 5, number of nodes to be excluded = 0, number of nodes to be included = 0]
Time Stamp Iteration# Bytes Already Moved Bytes Left To Move Bytes Being Moved
15/08/14 16:35:14 INFO balancer.KeyManager: Block token params received from NN: update interval=10hrs, 0sec, token lifetime=10hrs, 0sec
15/08/14 16:35:14 INFO block.BlockTokenSecretManager: Setting block keys
15/08/14 16:35:14 INFO balancer.KeyManager: Update block keys every 2hrs, 30mins, 0sec
15/08/14 16:35:14 INFO block.BlockTokenSecretManager: Setting block keys
15/08/14 16:35:14 INFO balancer.KeyManager: Block token params received from NN: update interval=10hrs, 0sec, token lifetime=10hrs, 0sec
15/08/14 16:35:14 INFO block.BlockTokenSecretManager: Setting block keys
15/08/14 16:35:14 INFO balancer.KeyManager: Update block keys every 2hrs, 30mins, 0sec
java.io.IOException: Another Balancer is running.. Exiting ...
Aug 14, 2015 4:35:14 PM Balancing took 2.408 seconds
Looking at the audit log file when trying to run the balancer, the balancer creates /system/balancer.id and then deletes it on exiting ...
2015-08-14 16:37:45,844 INFO FSNamesystem.audit: allowed=true ugi=hdfs@SANDBOX.HADOOP (auth:KERBEROS) ip=/x.x.x.x cmd=getfileinfo src=/system/balancer.id dst=null perm=null proto=rpc
2015-08-14 16:37:45,900 INFO FSNamesystem.audit: allowed=true ugi=hdfs@SANDBOX.HADOOP (auth:KERBEROS) ip=/x.x.x.x cmd=create src=/system/balancer.id dst=null perm=hdfs:hadoop:rw-r- proto=rpc
2015-08-14 16:37:45,919 INFO FSNamesystem.audit: allowed=true ugi=hdfs@SANDBOX.HADOOP (auth:KERBEROS) ip=/x.x.x.x cmd=getfileinfo src=/system/balancer.id dst=null perm=null proto=rpc
2015-08-14 16:37:46,090 INFO FSNamesystem.audit: allowed=true ugi=hdfs@SANDBOX.HADOOP (auth:KERBEROS) ip=/x.x.x.x cmd=getfileinfo src=/system/balancer.id dst=null perm=null proto=rpc
2015-08-14 16:37:46,112 INFO FSNamesystem.audit: allowed=true ugi=hdfs@SANDBOX.HADOOP (auth:KERBEROS) ip=/x.x.x.x cmd=getfileinfo src=/system/balancer.id dst=null perm=null proto=rpc
2015-08-14 16:37:46,117 INFO FSNamesystem.audit: allowed=true ugi=hdfs@SANDBOX.HADOOP (auth:KERBEROS) ip=/x.x.x.x cmd=delete src=/system/balancer.id dst=null perm=null proto=rpc
The error seems to be located in org/apache/hadoop/hdfs/server/balancer/NameNodeConnector.java. The function checkAndMarkRunning returns null even if /system/balancer.id doesn't exist before entering this function; if it exists, then it is deleted and the balancer exits with the same error.
{code}
private OutputStream checkAndMarkRunning() throws IOException {
  try {
    if (fs.exists(idPath)) {
      // try appending to it so that it will fail fast if another balancer is
      // running.
      IOUtils.closeStream(fs.append(idPath));
      fs.delete(idPath, true);
    }
    final FSDataOutputStream fsout = fs.create(idPath, false);
    // mark balancer idPath to be deleted during filesystem closure
    fs.deleteOnExit(idPath);
    if (write2IdFile) {
      fsout.writeBytes(InetAddress.getLocalHost().getHostName());
      fsout.hflush();
    }
    return fsout;
  } catch (RemoteException e) {
    if (AlreadyBeingCreatedException.class.getName().equals(e.getClassName())) {
      return null;
    } else {
      throw e;
    }
  }
}
{code}
Regards -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (HDFS-8905) Refactor DFSInputStream#ReaderStrategy
Kai Zheng created HDFS-8905: --- Summary: Refactor DFSInputStream#ReaderStrategy Key: HDFS-8905 URL: https://issues.apache.org/jira/browse/HDFS-8905 Project: Hadoop HDFS Issue Type: Sub-task Reporter: Kai Zheng Assignee: Kai Zheng The DFSInputStream#ReaderStrategy family doesn't look very good. This refactors it a little bit to make it more sensible. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-8902) Uses ByteBuffer on heap or direct ByteBuffer according to used erasure coder in striping read (position and stateful)
[ https://issues.apache.org/jira/browse/HDFS-8902?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14699582#comment-14699582 ] Kai Zheng commented on HDFS-8902: - Thanks [~hitliuyi] for the pointer! I hadn't noticed HDFS-8668; I think I can use it to tie all the related issues together. My rough code indicated that the required change isn't trivial and would better be broken down into pieces, at least two or three. Yes, I can take HDFS-8668 and see how best to proceed. Sounds good? Thanks. Uses ByteBuffer on heap or direct ByteBuffer according to used erasure coder in striping read (position and stateful) - Key: HDFS-8902 URL: https://issues.apache.org/jira/browse/HDFS-8902 Project: Hadoop HDFS Issue Type: Sub-task Reporter: Kai Zheng Assignee: Kai Zheng We should choose an on-heap ByteBuffer or a direct ByteBuffer according to the erasure coder in use in striping read (positional and stateful), for performance reasons. A pure-Java coder favors the on-heap buffer, while a native coder prefers the direct one, which avoids a data copy. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-8897) Loadbalancer always exits with : java.io.IOException: Another Balancer is running.. Exiting ...
[ https://issues.apache.org/jira/browse/HDFS-8897?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14699488#comment-14699488 ] LINTE commented on HDFS-8897: - Hi, Below is a part of hdfs-site.xml; NameNode HA is used, maybe that is the origin of this issue? It was working fine with HDFS 2.6.0.
{code}
<property>
  <name>dfs.nameservices</name>
  <value>sandbox</value>
</property>
<property>
  <name>dfs.ha.namenodes.sandbox</name>
  <value>nn1,nn2</value>
</property>
{code}
Loadbalancer always exits with : java.io.IOException: Another Balancer is running.. Exiting ... Key: HDFS-8897 URL: https://issues.apache.org/jira/browse/HDFS-8897 Project: Hadoop HDFS Issue Type: Bug Components: balancer mover Affects Versions: 2.7.1 Environment: Centos 6.6 Reporter: LINTE When the balancer is launched, it should test whether there is already a /system/balancer.id file in HDFS. Even when the file doesn't exist, the balancer refuses to run:
15/08/14 16:35:12 INFO balancer.Balancer: namenodes = [hdfs://sandbox/, hdfs://sandbox]
15/08/14 16:35:12 INFO balancer.Balancer: parameters = Balancer.Parameters[BalancingPolicy.Node, threshold=10.0, max idle iteration = 5, number of nodes to be excluded = 0, number of nodes to be included = 0]
Time Stamp Iteration# Bytes Already Moved Bytes Left To Move Bytes Being Moved
15/08/14 16:35:14 INFO balancer.KeyManager: Block token params received from NN: update interval=10hrs, 0sec, token lifetime=10hrs, 0sec
15/08/14 16:35:14 INFO block.BlockTokenSecretManager: Setting block keys
15/08/14 16:35:14 INFO balancer.KeyManager: Update block keys every 2hrs, 30mins, 0sec
15/08/14 16:35:14 INFO block.BlockTokenSecretManager: Setting block keys
15/08/14 16:35:14 INFO balancer.KeyManager: Block token params received from NN: update interval=10hrs, 0sec, token lifetime=10hrs, 0sec
15/08/14 16:35:14 INFO block.BlockTokenSecretManager: Setting block keys
15/08/14 16:35:14 INFO balancer.KeyManager: Update block keys every 2hrs, 30mins, 0sec
java.io.IOException: Another Balancer is running.. Exiting ...
Aug 14, 2015 4:35:14 PM Balancing took 2.408 seconds
Looking at the audit log file when trying to run the balancer, the balancer creates /system/balancer.id and then deletes it on exiting ...
2015-08-14 16:37:45,844 INFO FSNamesystem.audit: allowed=true ugi=hdfs@SANDBOX.HADOOP (auth:KERBEROS) ip=/x.x.x.x cmd=getfileinfo src=/system/balancer.id dst=null perm=null proto=rpc
2015-08-14 16:37:45,900 INFO FSNamesystem.audit: allowed=true ugi=hdfs@SANDBOX.HADOOP (auth:KERBEROS) ip=/x.x.x.x cmd=create src=/system/balancer.id dst=null perm=hdfs:hadoop:rw-r- proto=rpc
2015-08-14 16:37:45,919 INFO FSNamesystem.audit: allowed=true ugi=hdfs@SANDBOX.HADOOP (auth:KERBEROS) ip=/x.x.x.x cmd=getfileinfo src=/system/balancer.id dst=null perm=null proto=rpc
2015-08-14 16:37:46,090 INFO FSNamesystem.audit: allowed=true ugi=hdfs@SANDBOX.HADOOP (auth:KERBEROS) ip=/x.x.x.x cmd=getfileinfo src=/system/balancer.id dst=null perm=null proto=rpc
2015-08-14 16:37:46,112 INFO FSNamesystem.audit: allowed=true ugi=hdfs@SANDBOX.HADOOP (auth:KERBEROS) ip=/x.x.x.x cmd=getfileinfo src=/system/balancer.id dst=null perm=null proto=rpc
2015-08-14 16:37:46,117 INFO FSNamesystem.audit: allowed=true ugi=hdfs@SANDBOX.HADOOP (auth:KERBEROS) ip=/x.x.x.x cmd=delete src=/system/balancer.id dst=null perm=null proto=rpc
The error seems to be located in org/apache/hadoop/hdfs/server/balancer/NameNodeConnector.java. The function checkAndMarkRunning returns null even if /system/balancer.id doesn't exist before entering this function; if it exists, then it is deleted and the balancer exits with the same error.
{code}
private OutputStream checkAndMarkRunning() throws IOException {
  try {
    if (fs.exists(idPath)) {
      // try appending to it so that it will fail fast if another balancer is
      // running.
      IOUtils.closeStream(fs.append(idPath));
      fs.delete(idPath, true);
    }
    final FSDataOutputStream fsout = fs.create(idPath, false);
    // mark balancer idPath to be deleted during filesystem closure
    fs.deleteOnExit(idPath);
    if (write2IdFile) {
      fsout.writeBytes(InetAddress.getLocalHost().getHostName());
      fsout.hflush();
    }
    return fsout;
  } catch (RemoteException e) {
    if (AlreadyBeingCreatedException.class.getName().equals(e.getClassName())) {
      return null;
    } else {
      throw e;
    }
  }
}
{code}
Regards -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (HDFS-8904) Uses ByteBuffer on heap or direct ByteBuffer according to used erasure coder in striping recovery on DataNode side
Kai Zheng created HDFS-8904: --- Summary: Uses ByteBuffer on heap or direct ByteBuffer according to used erasure coder in striping recovery on DataNode side Key: HDFS-8904 URL: https://issues.apache.org/jira/browse/HDFS-8904 Project: Hadoop HDFS Issue Type: Sub-task Reporter: Kai Zheng Assignee: Kai Zheng We should choose an on-heap ByteBuffer or a direct ByteBuffer according to the erasure coder in use in striping recovery on the DataNode side, mirroring the corresponding work on the client side, for performance reasons. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-8907) Configurable striping read buffer threshold
[ https://issues.apache.org/jira/browse/HDFS-8907?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kai Zheng updated HDFS-8907: Description: In the striping input stream, positional read merges all the possible strips together, while stateful read reads one strip at a time. The former is efficient but may require chunk buffers too large for a client to afford; the latter is simple but can be improved for better throughput. This would consolidate the two and use a configurable (new or existing) buffer threshold to control how reads proceed. Fixed chunk buffers for the read will be allocated accordingly and reused again and again, as the existing stateful read does. The number of aligned strips to read at a time may be computed against the threshold. (was: In the striping input stream, positional read merges all the possible strips together, while stateful read reads one strip at a time. The former is efficient but may require chunk buffers too large for a client to afford; the latter is simple but can be improved for better throughput. This would consolidate the two and use a configurable (new or existing) buffer threshold to control how reads proceed. Fixed chunk buffers for the read will be allocated accordingly and reused again and again. The number of aligned strips to read at a time may be computed against the threshold. ) Configurable striping read buffer threshold -- Key: HDFS-8907 URL: https://issues.apache.org/jira/browse/HDFS-8907 Project: Hadoop HDFS Issue Type: Sub-task Reporter: Kai Zheng Assignee: Kai Zheng In the striping input stream, positional read merges all the possible strips together, while stateful read reads one strip at a time. The former is efficient but may require chunk buffers too large for a client to afford; the latter is simple but can be improved for better throughput. This would consolidate the two and use a configurable (new or existing) buffer threshold to control how reads proceed. Fixed chunk buffers for the read will be allocated accordingly and reused again and again, as the existing stateful read does. The number of aligned strips to read at a time may be computed against the threshold. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (HDFS-8907) Configurable striping read buffer threshold
Kai Zheng created HDFS-8907: --- Summary: Configurable striping read buffer threshold Key: HDFS-8907 URL: https://issues.apache.org/jira/browse/HDFS-8907 Project: Hadoop HDFS Issue Type: Sub-task Reporter: Kai Zheng Assignee: Kai Zheng In the striping input stream, positional read merges all the possible strips together, while stateful read reads one strip at a time. The former is efficient but may require chunk buffers too large for a client to afford; the latter is simple but can be improved for better throughput. This would consolidate the two and use a configurable (new or existing) buffer threshold to control how reads proceed. Fixed chunk buffers for the read will be allocated accordingly and reused again and again. The number of aligned strips to read at a time may be computed against the threshold. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
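A minimal sketch of how such a threshold could bound each read; the sizing rule and names are invented for illustration and are not from any patch:
{code}
public final class StripingReadSizer {
  private StripingReadSizer() {}

  // Read as many whole strips per iteration as fit under the byte
  // threshold, but always at least one, so a fixed, reusable buffer of
  // stripsPerIteration * cellSize * dataBlkNum bytes suffices.
  public static int stripsPerIteration(long thresholdBytes, int cellSize,
      int dataBlkNum) {
    long stripBytes = (long) cellSize * dataBlkNum; // one full strip
    return (int) Math.max(1L, thresholdBytes / stripBytes);
  }
}
{code}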
[jira] [Comment Edited] (HDFS-8902) Uses ByteBuffer on heap or direct ByteBuffer according to used erasure coder in striping read (position and stateful)
[ https://issues.apache.org/jira/browse/HDFS-8902?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14699579#comment-14699579 ] Yi Liu edited comment on HDFS-8902 at 8/17/15 2:15 PM: --- Kai, HDFS-8668 will handle the Java ByteBuffer and direct buffer for all EC-related encoding/decoding, so I think the several new JIRAs related to this that you just created are duplicates. If you want, you can take that JIRA, but I think we can handle them all in one JIRA; there is no need to create separate JIRAs. was (Author: hitliuyi): Kai, HDFS-8668 will handle the Java ByteBuffer and direct buffer for all EC-related encoding/decoding, so I think the several new JIRAs you just created are duplicates. If you want, you can take that JIRA, but I think we can handle them all in one JIRA; there is no need to create separate JIRAs. Uses ByteBuffer on heap or direct ByteBuffer according to used erasure coder in striping read (position and stateful) - Key: HDFS-8902 URL: https://issues.apache.org/jira/browse/HDFS-8902 Project: Hadoop HDFS Issue Type: Sub-task Reporter: Kai Zheng Assignee: Kai Zheng We should choose an on-heap ByteBuffer or a direct ByteBuffer according to the erasure coder in use in striping read (positional and stateful), for performance reasons. A pure-Java coder favors the on-heap buffer, while a native coder prefers the direct one, which avoids a data copy. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-8902) Uses ByteBuffer on heap or direct ByteBuffer according to used erasure coder in striping read (position and stateful)
[ https://issues.apache.org/jira/browse/HDFS-8902?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14699602#comment-14699602 ] Kai Zheng commented on HDFS-8902: - Thank you! Yes, HDFS-8668 looks like a good umbrella to contain all the pieces. Uses ByteBuffer on heap or direct ByteBuffer according to used erasure coder in striping read (position and stateful) - Key: HDFS-8902 URL: https://issues.apache.org/jira/browse/HDFS-8902 Project: Hadoop HDFS Issue Type: Sub-task Reporter: Kai Zheng Assignee: Kai Zheng We should choose an on-heap ByteBuffer or a direct ByteBuffer according to the erasure coder in use in striping read (positional and stateful), for performance reasons. A pure-Java coder favors the on-heap buffer, while a native coder prefers the direct one, which avoids a data copy. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-8901) Use ByteBuffer in striping positional read
[ https://issues.apache.org/jira/browse/HDFS-8901?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kai Zheng updated HDFS-8901: Description: The native erasure coder prefers direct ByteBuffers for performance reasons. To prepare for that, this change uses ByteBuffer throughout the code implementing striping positional read. It will also avoid unnecessary data copying between striping read chunk buffers and decode input buffers. (was: The native erasure coder prefers direct ByteBuffers for performance reasons. To prepare for that, this change uses ByteBuffer throughout the code implementing striping positional read. ) Use ByteBuffer in striping positional read -- Key: HDFS-8901 URL: https://issues.apache.org/jira/browse/HDFS-8901 Project: Hadoop HDFS Issue Type: Sub-task Reporter: Kai Zheng Assignee: Kai Zheng The native erasure coder prefers direct ByteBuffers for performance reasons. To prepare for that, this change uses ByteBuffer throughout the code implementing striping positional read. It will also avoid unnecessary data copying between striping read chunk buffers and decode input buffers. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-8668) Erasure Coding: revisit buffer used for encoding and decoding.
[ https://issues.apache.org/jira/browse/HDFS-8668?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yi Liu updated HDFS-8668: - Assignee: Kai Zheng (was: Yi Liu) Erasure Coding: revisit buffer used for encoding and decoding. -- Key: HDFS-8668 URL: https://issues.apache.org/jira/browse/HDFS-8668 Project: Hadoop HDFS Issue Type: Sub-task Reporter: Yi Liu Assignee: Kai Zheng For encoding and decoding buffers, some places currently use a Java heap ByteBuffer, some use a direct ByteBuffer, and some use a Java byte array. If the coder implementation is native, we should use a direct ByteBuffer. This jira is to revisit all encoding/decoding buffers and improve them. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
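[Editor's note] A minimal sketch of the buffer choice the HDFS-8668 description calls for. The boolean flag stands in for whatever hint the coder API eventually exposes (a preferDirectBuffer()-style method is an assumption here, not confirmed API):
{code}
import java.nio.ByteBuffer;

public class CoderBuffers {
  /**
   * Pick the buffer type by coder implementation: a native coder wants a
   * direct buffer (no copy across JNI); a pure-Java coder is happier on-heap.
   */
  public static ByteBuffer allocate(boolean preferDirect, int size) {
    return preferDirect ? ByteBuffer.allocateDirect(size)
                        : ByteBuffer.allocate(size);
  }
}
{code}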
[jira] [Commented] (HDFS-8902) Uses ByteBuffer on heap or direct ByteBuffer according to used erasure coder in striping read (position and stateful)
[ https://issues.apache.org/jira/browse/HDFS-8902?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14699587#comment-14699587 ] Yi Liu commented on HDFS-8902: -- Sure, I just assigned it to you, thanks for working on it. Uses ByteBuffer on heap or direct ByteBuffer according to used erasure coder in striping read (position and stateful) - Key: HDFS-8902 URL: https://issues.apache.org/jira/browse/HDFS-8902 Project: Hadoop HDFS Issue Type: Sub-task Reporter: Kai Zheng Assignee: Kai Zheng We would choose ByteBuffer on heap or direct ByteBuffer according to used erasure coder in striping read (position and stateful), for performance consideration. Pure Java implemented coder favors on heap one, though native coder likes more direct one, avoiding data copy. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-8668) Erasure Coding: revisit buffer used for encoding and decoding.
[ https://issues.apache.org/jira/browse/HDFS-8668?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14699599#comment-14699599 ] Kai Zheng commented on HDFS-8668: - Thanks [~hitliuyi] for the issue and for assigning it to me. Since some parts aren't easy (we first have to switch from byte arrays to ByteBuffer) and other parts couple with the buffer pool work, I'll break this overall effort into smaller issues and work on them. Erasure Coding: revisit buffer used for encoding and decoding. -- Key: HDFS-8668 URL: https://issues.apache.org/jira/browse/HDFS-8668 Project: Hadoop HDFS Issue Type: Sub-task Reporter: Yi Liu Assignee: Kai Zheng For encoding and decoding buffers, some places currently use a Java heap ByteBuffer, some use a direct ByteBuffer, and some use a Java byte array. If the coder implementation is native, we should use a direct ByteBuffer. This jira is to revisit all encoding/decoding buffers and improve them. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-8897) Loadbalancer always exits with : java.io.IOException: Another Balancer is running.. Exiting ...
[ https://issues.apache.org/jira/browse/HDFS-8897?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14699513#comment-14699513 ] Rakesh R commented on HDFS-8897: My observation about the case: the Balancer is seeing two nameservice IDs that both point to the same cluster, one with a trailing slash ({{hdfs://sandbox/}}) and the other without ({{hdfs://sandbox}}). While running, the balancer establishes a NameNodeConnector per nameservice and internally creates the idFilePath {{balancer.id}} to prevent simultaneous balancer operations. Since both nameservice IDs point to the same cluster, {{balancer.id}} creation succeeds for the first connector; when the balancer then tries to create {{balancer.id}} for the second connector, it sees that the idFilePath already exists, resulting in failure. IMHO, we should find the reason for the two occurrences of the same cluster ID to understand this well, right? bq. It was working fine with hdfs 2.6.0. The validation that prevents simultaneous balancing was modified in 2.7.1; that's why you are not seeing the problem with 2.6.0. Loadbalancer always exits with : java.io.IOException: Another Balancer is running.. Exiting ... Key: HDFS-8897 URL: https://issues.apache.org/jira/browse/HDFS-8897 Project: Hadoop HDFS Issue Type: Bug Components: balancer mover Affects Versions: 2.7.1 Environment: Centos 6.6 Reporter: LINTE When the balancer is launched, it should test whether a /system/balancer.id file already exists in HDFS. Even when the file doesn't exist, the balancer refuses to run:
15/08/14 16:35:12 INFO balancer.Balancer: namenodes = [hdfs://sandbox/, hdfs://sandbox]
15/08/14 16:35:12 INFO balancer.Balancer: parameters = Balancer.Parameters[BalancingPolicy.Node, threshold=10.0, max idle iteration = 5, number of nodes to be excluded = 0, number of nodes to be included = 0]
Time Stamp Iteration# Bytes Already Moved Bytes Left To Move Bytes Being Moved
15/08/14 16:35:14 INFO balancer.KeyManager: Block token params received from NN: update interval=10hrs, 0sec, token lifetime=10hrs, 0sec
15/08/14 16:35:14 INFO block.BlockTokenSecretManager: Setting block keys
15/08/14 16:35:14 INFO balancer.KeyManager: Update block keys every 2hrs, 30mins, 0sec
15/08/14 16:35:14 INFO block.BlockTokenSecretManager: Setting block keys
15/08/14 16:35:14 INFO balancer.KeyManager: Block token params received from NN: update interval=10hrs, 0sec, token lifetime=10hrs, 0sec
15/08/14 16:35:14 INFO block.BlockTokenSecretManager: Setting block keys
15/08/14 16:35:14 INFO balancer.KeyManager: Update block keys every 2hrs, 30mins, 0sec
java.io.IOException: Another Balancer is running.. Exiting ...
Aug 14, 2015 4:35:14 PM Balancing took 2.408 seconds
Looking at the audit log while trying to run the balancer, the balancer creates /system/balancer.id and then deletes it on exiting ...
2015-08-14 16:37:45,844 INFO FSNamesystem.audit: allowed=true ugi=hdfs@SANDBOX.HADOOP (auth:KERBEROS) ip=/x.x.x.x cmd=getfileinfo src=/system/balancer.id dst=null perm=null proto=rpc
2015-08-14 16:37:45,900 INFO FSNamesystem.audit: allowed=true ugi=hdfs@SANDBOX.HADOOP (auth:KERBEROS) ip=/x.x.x.x cmd=create src=/system/balancer.id dst=null perm=hdfs:hadoop:rw-r- proto=rpc
2015-08-14 16:37:45,919 INFO FSNamesystem.audit: allowed=true ugi=hdfs@SANDBOX.HADOOP (auth:KERBEROS) ip=/x.x.x.x cmd=getfileinfo src=/system/balancer.id dst=null perm=null proto=rpc
2015-08-14 16:37:46,090 INFO FSNamesystem.audit: allowed=true ugi=hdfs@SANDBOX.HADOOP (auth:KERBEROS) ip=/x.x.x.x cmd=getfileinfo src=/system/balancer.id dst=null perm=null proto=rpc
2015-08-14 16:37:46,112 INFO FSNamesystem.audit: allowed=true ugi=hdfs@SANDBOX.HADOOP (auth:KERBEROS) ip=/x.x.x.x cmd=getfileinfo src=/system/balancer.id dst=null perm=null proto=rpc
2015-08-14 16:37:46,117 INFO FSNamesystem.audit: allowed=true ugi=hdfs@SANDBOX.HADOOP (auth:KERBEROS) ip=/x.x.x.x cmd=delete src=/system/balancer.id dst=null perm=null proto=rpc
The error seems to be located in org/apache/hadoop/hdfs/server/balancer/NameNodeConnector.java. The function checkAndMarkRunning returns null even if /system/balancer.id doesn't exist before entering the function; if it does exist, it is deleted and the balancer exits with the same error.
{code}
private OutputStream checkAndMarkRunning() throws IOException {
  try {
    if (fs.exists(idPath)) {
      // try appending to it so that it will fail fast if another balancer is
      // running.
      IOUtils.closeStream(fs.append(idPath));
      // (snippet truncated in the original report)
{code}
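[Editor's note] A sketch of the failure mode and one possible normalization, based purely on Rakesh's analysis above; this is illustrative code, not the actual Balancer fix:
{code}
import java.net.URI;
import java.util.Collection;
import java.util.LinkedHashSet;

public class NameserviceDedup {
  /**
   * "hdfs://sandbox/" and "hdfs://sandbox" differ only by a trailing slash,
   * so naive URI equality yields two NameNodeConnectors to the same cluster,
   * and the second balancer.id creation fails with "Another Balancer is
   * running". Normalizing before building connectors would avoid that.
   */
  public static Collection<URI> dedupe(Collection<URI> namenodes) {
    Collection<URI> unique = new LinkedHashSet<>();
    for (URI nn : namenodes) {
      String s = nn.toString();
      while (s.endsWith("/")) {
        s = s.substring(0, s.length() - 1);  // strip trailing slashes
      }
      unique.add(URI.create(s));
    }
    return unique;
  }
}
{code}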
[jira] [Created] (HDFS-8901) Use ByteBuffer/DirectByteBuffer in striping positional read
Kai Zheng created HDFS-8901: --- Summary: Use ByteBuffer/DirectByteBuffer in striping positional read Key: HDFS-8901 URL: https://issues.apache.org/jira/browse/HDFS-8901 Project: Hadoop HDFS Issue Type: Sub-task Reporter: Kai Zheng Assignee: Kai Zheng The native erasure coder prefers direct ByteBuffers for performance reasons. To prepare for that, this change uses ByteBuffer throughout the code implementing striping positional read. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (HDFS-8902) Uses ByteBuffer on heap or direct ByteBuffer according to used erasure coder in striping read (position and stateful)
Kai Zheng created HDFS-8902: --- Summary: Uses ByteBuffer on heap or direct ByteBuffer according to used erasure coder in striping read (position and stateful) Key: HDFS-8902 URL: https://issues.apache.org/jira/browse/HDFS-8902 Project: Hadoop HDFS Issue Type: Sub-task Reporter: Kai Zheng Assignee: Kai Zheng We would choose an on-heap ByteBuffer or a direct ByteBuffer according to the erasure coder used in striping read (positional and stateful), for performance. A pure-Java coder favors the on-heap buffer, while a native coder prefers the direct one, avoiding data copies. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (HDFS-8903) Uses ByteBuffer on heap or direct ByteBuffer according to used erasure coder in striping write
Kai Zheng created HDFS-8903: --- Summary: Uses ByteBuffer on heap or direct ByteBuffer according to used erasure coder in striping write Key: HDFS-8903 URL: https://issues.apache.org/jira/browse/HDFS-8903 Project: Hadoop HDFS Issue Type: Sub-task Reporter: Kai Zheng Assignee: Kai Zheng We would choose an on-heap ByteBuffer or a direct ByteBuffer according to the erasure coder used in striping write, for performance. A pure-Java coder favors the on-heap buffer, while a native coder prefers the direct one, avoiding data copies. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-8902) Uses ByteBuffer on heap or direct ByteBuffer according to used erasure coder in striping read (position and stateful)
[ https://issues.apache.org/jira/browse/HDFS-8902?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14699579#comment-14699579 ] Yi Liu commented on HDFS-8902: -- Kai, HDFS-8668 will handle the java bytebuffer and direct buffer for all EC related encoding/decoding, so I think several new JIRAs you just created are duplicated. If you want, you can take that JIRA, but I think we can handle them all in one JIRA, no need to create separate JIRAs. Uses ByteBuffer on heap or direct ByteBuffer according to used erasure coder in striping read (position and stateful) - Key: HDFS-8902 URL: https://issues.apache.org/jira/browse/HDFS-8902 Project: Hadoop HDFS Issue Type: Sub-task Reporter: Kai Zheng Assignee: Kai Zheng We would choose ByteBuffer on heap or direct ByteBuffer according to used erasure coder in striping read (position and stateful), for performance consideration. Pure Java implemented coder favors on heap one, though native coder likes more direct one, avoiding data copy. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-8705) BlockStoragePolicySuite uses equalsIgnoreCase for name lookup, won't work in all locales
[ https://issues.apache.org/jira/browse/HDFS-8705?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14699478#comment-14699478 ] Walter Su commented on HDFS-8705: - You can use {{StringUtils.toLowerCase(String)}} instead. BlockStoragePolicySuite uses equalsIgnoreCase for name lookup, won't work in all locales Key: HDFS-8705 URL: https://issues.apache.org/jira/browse/HDFS-8705 Project: Hadoop HDFS Issue Type: Bug Components: namenode Affects Versions: 2.8.0 Reporter: Steve Loughran Assignee: Brahma Reddy Battula Priority: Minor Attachments: HDFS-8705.patch {{BlockStoragePolicySuite.getPolicy(name)}} is using {{equalsIgnoreCase()}} to find a policy matching a name. This will not work in all locales. It must use {{toLowerCase(Locale.ENGLISH).equals(name)}} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
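[Editor's note] A small runnable demonstration of why the locale matters here. Walter's suggestion works because Hadoop's StringUtils.toLowerCase lower-cases with a fixed English locale; the storage-policy name used below is just a convenient example containing the letter 'I':
{code}
import java.util.Locale;

public class LocaleLowerCaseDemo {
  public static void main(String[] args) {
    String policy = "LAZY_PERSIST";
    // In a Turkish locale, 'I' lower-cases to dotless 'ı', so the name no
    // longer matches "lazy_persist".
    System.out.println(policy.toLowerCase(new Locale("tr")));   // lazy_persıst
    // Pinning the locale makes the comparison behave the same everywhere,
    // which is what StringUtils.toLowerCase(String) does via Locale.ENGLISH.
    System.out.println(policy.toLowerCase(Locale.ENGLISH)
        .equals("lazy_persist"));                               // true
  }
}
{code}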
[jira] [Created] (HDFS-8906) Non Authenticated Data node Allowed to Join HDFS
John J. Howard created HDFS-8906: Summary: Non Authenticated Data node Allowed to Join HDFS Key: HDFS-8906 URL: https://issues.apache.org/jira/browse/HDFS-8906 Project: Hadoop HDFS Issue Type: Bug Components: datanode, namenode Affects Versions: 0.20.2 Environment: CentOS 6.7 Reporter: John J. Howard Priority: Minor An attacker with network access to a Hadoop cluster can create a spoof datanode that the namenode will accept into the cluster without authentication, allowing the attacker to run MapReduce jobs on the cluster in order to steal data. The spoof datanode is created by adding the namenode RSA SSH public key to the known hosts directory, starting Hadoop services, setting the IP address to be the same as a legitimate node on the Hadoop cluster and sending the namenode a heartbeat message with an empty namespace ID. This will cause the namenode to think that the spoof datanode is a node that had previously crashed and lost its data. The namenode will then connect to the spoof datanode using its SSH credentials and start replicating data on the spoof datanode, incorporating the spoof datanode into the cluster. Once incorporated, the spoof node can start issuing MapReduce jobs to retrieve cluster data. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-8833) Erasure coding: store EC schema and cell size in INodeFile and eliminate notion of EC zones
[ https://issues.apache.org/jira/browse/HDFS-8833?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14700751#comment-14700751 ] Hadoop QA commented on HDFS-8833: -
| (x) *{color:red}-1 overall{color}* |
|| Vote || Subsystem || Runtime || Comment ||
| {color:red}-1{color} | patch | 0m 0s | The patch command could not apply the patch during dryrun. |
|| Subsystem || Report/Notes ||
| Patch URL | http://issues.apache.org/jira/secure/attachment/12750930/HDFS-8833-HDFS-7285-merge.01.patch |
| Optional Tests | javadoc javac unit findbugs checkstyle |
| git revision | HDFS-7285 / b57c9a3 |
| Console output | https://builds.apache.org/job/PreCommit-HDFS-Build/12017/console |
This message was automatically generated. Erasure coding: store EC schema and cell size in INodeFile and eliminate notion of EC zones --- Key: HDFS-8833 URL: https://issues.apache.org/jira/browse/HDFS-8833 Project: Hadoop HDFS Issue Type: Sub-task Components: namenode Affects Versions: HDFS-7285 Reporter: Zhe Zhang Assignee: Zhe Zhang Attachments: HDFS-8833-HDFS-7285-merge.00.patch, HDFS-8833-HDFS-7285-merge.01.patch We have [discussed | https://issues.apache.org/jira/browse/HDFS-7285?focusedCommentId=14357754page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-14357754] storing EC schema with files instead of EC zones and recently revisited the discussion under HDFS-8059. As a recap, the _zone_ concept has severe limitations including renaming and nested configuration. Those limitations are valid in encryption for security reasons and it doesn't make sense to carry them over in EC. This JIRA aims to store EC schema and cell size on {{INodeFile}} level. For simplicity, we should first implement it as an xattr and consider memory optimizations (such as moving it to file header) as a follow-on. We should also disable changing EC policy on a non-empty file / dir in the first phase. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-8846) Create edit log files with old layout version for upgrade testing
[ https://issues.apache.org/jira/browse/HDFS-8846?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14700746#comment-14700746 ] Hadoop QA commented on HDFS-8846: -
| (x) *{color:red}-1 overall{color}* |
|| Vote || Subsystem || Runtime || Comment ||
| {color:red}-1{color} | pre-patch | 6m 31s | Findbugs (version ) appears to be broken on trunk. |
| {color:green}+1{color} | @author | 0m 0s | The patch does not contain any @author tags. |
| {color:green}+1{color} | tests included | 0m 0s | The patch appears to include 2 new or modified test files. |
| {color:green}+1{color} | javac | 8m 14s | There were no new javac warning messages. |
| {color:green}+1{color} | release audit | 0m 21s | The applied patch does not increase the total number of release audit warnings. |
| {color:green}+1{color} | checkstyle | 0m 46s | There were no new checkstyle issues. |
| {color:red}-1{color} | whitespace | 0m 0s | The patch has 1 line(s) that end in whitespace. Use git apply --whitespace=fix. |
| {color:green}+1{color} | install | 1m 37s | mvn install still works. |
| {color:green}+1{color} | eclipse:eclipse | 0m 34s | The patch built with eclipse:eclipse. |
| {color:green}+1{color} | findbugs | 2m 46s | The patch does not introduce any new Findbugs (version 3.0.0) warnings. |
| {color:green}+1{color} | native | 1m 6s | Pre-build of native portion |
| {color:red}-1{color} | hdfs tests | 193m 14s | Tests failed in hadoop-hdfs. |
| | | 215m 13s | |
|| Reason || Tests ||
| Failed unit tests | hadoop.hdfs.server.namenode.TestNameNodeMetricsLogger |
| | hadoop.hdfs.TestHDFSTrash |
| Timed out tests | org.apache.hadoop.cli.TestHDFSCLI |
|| Subsystem || Report/Notes ||
| Patch URL | http://issues.apache.org/jira/secure/attachment/12750924/HDFS-8846.01.patch |
| Optional Tests | javac unit findbugs checkstyle |
| git revision | trunk / 71566e2 |
| whitespace | https://builds.apache.org/job/PreCommit-HDFS-Build/12015/artifact/patchprocess/whitespace.txt |
| hadoop-hdfs test log | https://builds.apache.org/job/PreCommit-HDFS-Build/12015/artifact/patchprocess/testrun_hadoop-hdfs.txt |
| Test Results | https://builds.apache.org/job/PreCommit-HDFS-Build/12015/testReport/ |
| Java | 1.7.0_55 |
| uname | Linux asf904.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux |
| Console output | https://builds.apache.org/job/PreCommit-HDFS-Build/12015/console |
This message was automatically generated. Create edit log files with old layout version for upgrade testing - Key: HDFS-8846 URL: https://issues.apache.org/jira/browse/HDFS-8846 Project: Hadoop HDFS Issue Type: Bug Components: namenode Affects Versions: 2.7.1 Reporter: Zhe Zhang Assignee: Zhe Zhang Attachments: HDFS-8846.00.patch, HDFS-8846.01.patch Per discussion under HDFS-8480, we should create some edit log files with an old layout version, to test whether they can be correctly handled in upgrades. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-8906) Non Authenticated Data node Allowed to Join HDFS
[ https://issues.apache.org/jira/browse/HDFS-8906?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14699627#comment-14699627 ] Allen Wittenauer commented on HDFS-8906: Hadoop 0.20.2 had no (real) security features in it. This is the least of its problems: setting hadoop.job.ugi would allow anyone to connect as anyone else. This and other issues have since been fixed in subsequent versions of Hadoop. Given that 0.20.2 is over 5 years old at this point and unless there is something else, I'll be closing this as won't fix. Non Authenticated Data node Allowed to Join HDFS Key: HDFS-8906 URL: https://issues.apache.org/jira/browse/HDFS-8906 Project: Hadoop HDFS Issue Type: Bug Components: datanode, namenode Affects Versions: 0.20.2 Environment: CentOS 6.7 Reporter: John J. Howard Priority: Minor Labels: security An attacker with network access to a Hadoop cluster can create a spoof datanode that the namenode will accept into the cluster without authentication, allowing the attacker to run MapReduce jobs on the cluster in order to steal data. The spoof datanode is created by adding the namenode RSA SSH public key to the known hosts directory, starting Hadoop services, setting the IP address to be the same as a legitimate node on the Hadoop cluster and sending the namenode a heartbeat message with an empty namespace ID. This will cause the namenode to think that the spoof datanode is a node that had previously crashed and lost its data. The namenode will then connect to the spoof datanode using its SSH credentials and start replicating data on the spoof datanode, incorporating the spoof datanode into the cluster. Once incorporated, the spoof node can start issuing MapReduce jobs to retrieve cluster data. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-8828) Utilize Snapshot diff report to build copy list in distcp
[ https://issues.apache.org/jira/browse/HDFS-8828?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14699664#comment-14699664 ] Yongjun Zhang commented on HDFS-8828: - Hi [~jingzhao], Thanks for the review and good comments! Some thoughts: {{DistCpOptions}} is currently the only vehicle for passing info between the different stages of distcp. To address your comments 1 and 2, we need to add something new to pass additional info (which is derived data, and can be held by a new class) between the sync and copyListing stages. There are two choices: 1. Pass an object of this new class as a standalone parameter between stages, which requires changing quite a few method signatures, even though most of those places don't use the new parameter. 2. Put the object of this class as a member of DistCpOptions, so there is no need to change method signatures. We could create a new class {{DistCpDerivedInput}} to hold the derived input, with all derived data as its members. If we define DistCpOptions as holding only command-line options, then this choice is not perfect; however, if we define DistCpOptions as possibly containing derived input too, then it's OK. Which choice do you like better? Or do you have additional thoughts? Thanks. Utilize Snapshot diff report to build copy list in distcp - Key: HDFS-8828 URL: https://issues.apache.org/jira/browse/HDFS-8828 Project: Hadoop HDFS Issue Type: Improvement Components: distcp, snapshots Reporter: Yufei Gu Assignee: Yufei Gu Attachments: HDFS-8828.001.patch, HDFS-8828.002.patch, HDFS-8828.003.patch, HDFS-8828.004.patch, HDFS-8828.005.patch, HDFS-8828.006.patch, HDFS-8828.007.patch Some users reported a huge time cost to build the file copy list in distcp (30 hours for 1.6M files). We can leverage the snapshot diff report to build a copy list containing only the files/dirs changed between two snapshots (or a snapshot and a normal dir). This speeds up the process in two ways: 1. less copy-list building time; 2. fewer file copy MR jobs. The HDFS snapshot diff report provides information about file/directory creation, deletion, rename and modification between two snapshots, or between a snapshot and a normal directory. HDFS-7535 synchronizes deletion and rename, then falls back to the default distcp, so it still relies on the default distcp to build the complete list of files under the source dir. This patch puts only created and modified files into the copy list based on the snapshot diff report, so we can minimize the number of files to copy. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
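[Editor's note] To make option 2 above concrete, a rough sketch of the holder class named in the comment; the fields are purely illustrative and nothing here is committed DistCp API:
{code}
/**
 * Holder for data derived from the snapshot diff, attached to DistCpOptions
 * so existing method signatures between distcp stages stay unchanged.
 */
public class DistCpDerivedInput {
  private boolean useSnapshotDiff;    // whether the diff-based copy list applies
  private String fromSnapshot;        // hypothetical, e.g. "s1"
  private String toSnapshot;          // hypothetical, e.g. "s2"

  public boolean isUseSnapshotDiff() { return useSnapshotDiff; }
  public void setUseSnapshotDiff(boolean b) { this.useSnapshotDiff = b; }
  public String getFromSnapshot() { return fromSnapshot; }
  public void setFromSnapshot(String s) { this.fromSnapshot = s; }
  public String getToSnapshot() { return toSnapshot; }
  public void setToSnapshot(String s) { this.toSnapshot = s; }
  // DistCpOptions would carry one instance of this class.
}
{code}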
[jira] [Commented] (HDFS-8713) Convert DatanodeDescriptor to use SLF4J logging
[ https://issues.apache.org/jira/browse/HDFS-8713?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14699845#comment-14699845 ] Andrew Wang commented on HDFS-8713: --- Thanks for reminding me about this one Yi, I ran the timed out test locally successfully. Will commit to trunk and branch-2 based on Eddy's earlier +1. Convert DatanodeDescriptor to use SLF4J logging --- Key: HDFS-8713 URL: https://issues.apache.org/jira/browse/HDFS-8713 Project: Hadoop HDFS Issue Type: Improvement Affects Versions: 2.0.6-alpha Reporter: Andrew Wang Assignee: Andrew Wang Priority: Trivial Attachments: hdfs-8713.001.patch Let's convert this class to use SLF4J -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (HDFS-8908) TestAppendSnapshotTruncate may fail with IOException: Failed to replace a bad datanode
Tsz Wo Nicholas Sze created HDFS-8908: - Summary: TestAppendSnapshotTruncate may fail with IOException: Failed to replace a bad datanode Key: HDFS-8908 URL: https://issues.apache.org/jira/browse/HDFS-8908 Project: Hadoop HDFS Issue Type: Bug Components: test Reporter: Tsz Wo Nicholas Sze Assignee: Tsz Wo Nicholas Sze Priority: Minor See https://builds.apache.org/job/PreCommit-HDFS-Build/12005/testReport/org.apache.hadoop.hdfs/TestAppendSnapshotTruncate/testAST/ -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-8713) Convert DatanodeDescriptor to use SLF4J logging
[ https://issues.apache.org/jira/browse/HDFS-8713?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andrew Wang updated HDFS-8713: -- Resolution: Fixed Fix Version/s: 2.8.0 Status: Resolved (was: Patch Available) Committed to trunk and branch-2, thanks again Eddy + Yi. Convert DatanodeDescriptor to use SLF4J logging --- Key: HDFS-8713 URL: https://issues.apache.org/jira/browse/HDFS-8713 Project: Hadoop HDFS Issue Type: Improvement Affects Versions: 2.0.6-alpha Reporter: Andrew Wang Assignee: Andrew Wang Priority: Trivial Fix For: 2.8.0 Attachments: hdfs-8713.001.patch Let's convert this class to use SLF4J -- This message was sent by Atlassian JIRA (v6.3.4#6332)
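[Editor's note] For readers unfamiliar with the conversion, this is the general shape of a commons-logging to SLF4J switch; the class below is an illustrative stand-in, not the actual HDFS-8713 patch:
{code}
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;

class Slf4jConversionExample {
  // SLF4J replaces the commons-logging Log/LogFactory pair; parameterized
  // messages avoid string concatenation when the level is disabled.
  private static final Logger LOG =
      LoggerFactory.getLogger(Slf4jConversionExample.class);

  void report(String node, long capacity) {
    LOG.debug("Node {} reported capacity {}", node, capacity);
  }
}
{code}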
[jira] [Commented] (HDFS-8895) Remove deprecated BlockStorageLocation APIs
[ https://issues.apache.org/jira/browse/HDFS-8895?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14699899#comment-14699899 ] Andrew Wang commented on HDFS-8895: --- Kicking a rebuild since Jenkins ate the test output (?). [~eddyxu] you mind reviewing this one too since you looked at HDFS-8887? Remove deprecated BlockStorageLocation APIs --- Key: HDFS-8895 URL: https://issues.apache.org/jira/browse/HDFS-8895 Project: Hadoop HDFS Issue Type: Improvement Affects Versions: 3.0.0 Reporter: Andrew Wang Assignee: Andrew Wang Attachments: HDFS-8895.001.patch HDFS-8887 supercedes DistributedFileSystem#getFileBlockStorageLocations, so it can be removed from trunk. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-8833) Erasure coding: store EC schema and cell size in INodeFile and eliminate notion of EC zones
[ https://issues.apache.org/jira/browse/HDFS-8833?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14699910#comment-14699910 ] Jing Zhao commented on HDFS-8833: - Yes, a conversion tool can be helpful. Erasure coding: store EC schema and cell size in INodeFile and eliminate notion of EC zones --- Key: HDFS-8833 URL: https://issues.apache.org/jira/browse/HDFS-8833 Project: Hadoop HDFS Issue Type: Sub-task Components: namenode Affects Versions: HDFS-7285 Reporter: Zhe Zhang Assignee: Zhe Zhang Attachments: HDFS-8833-HDFS-7285-merge.00.patch We have [discussed | https://issues.apache.org/jira/browse/HDFS-7285?focusedCommentId=14357754page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-14357754] storing EC schema with files instead of EC zones and recently revisited the discussion under HDFS-8059. As a recap, the _zone_ concept has severe limitations including renaming and nested configuration. Those limitations are valid in encryption for security reasons and it doesn't make sense to carry them over in EC. This JIRA aims to store EC schema and cell size on {{INodeFile}} level. For simplicity, we should first implement it as an xattr and consider memory optimizations (such as moving it to file header) as a follow-on. We should also disable changing EC policy on a non-empty file / dir in the first phase. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-7285) Erasure Coding Support inside HDFS
[ https://issues.apache.org/jira/browse/HDFS-7285?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14699916#comment-14699916 ] Zhe Zhang commented on HDFS-7285: - [~vinayrpet] Sure, let's solicit more feedback. Your rebase doesn't cause any additional failures: [Jenkins results | https://builds.apache.org/job/Hadoop-HDFS-7285-REBASE/]. Did you run Jenkins before posting the branch? Otherwise, a nice ace :) Erasure Coding Support inside HDFS -- Key: HDFS-7285 URL: https://issues.apache.org/jira/browse/HDFS-7285 Project: Hadoop HDFS Issue Type: New Feature Reporter: Weihua Jiang Assignee: Zhe Zhang Attachments: Consolidated-20150707.patch, Consolidated-20150806.patch, Consolidated-20150810.patch, ECAnalyzer.py, ECParser.py, HDFS-7285-initial-PoC.patch, HDFS-7285-merge-consolidated-01.patch, HDFS-7285-merge-consolidated-trunk-01.patch, HDFS-7285-merge-consolidated.trunk.03.patch, HDFS-7285-merge-consolidated.trunk.04.patch, HDFS-EC-Merge-PoC-20150624.patch, HDFS-EC-merge-consolidated-01.patch, HDFS-bistriped.patch, HDFSErasureCodingDesign-20141028.pdf, HDFSErasureCodingDesign-20141217.pdf, HDFSErasureCodingDesign-20150204.pdf, HDFSErasureCodingDesign-20150206.pdf, HDFSErasureCodingPhaseITestPlan.pdf, fsimage-analysis-20150105.pdf Erasure Coding (EC) can greatly reduce the storage overhead without sacrifice of data reliability, comparing to the existing HDFS 3-replica approach. For example, if we use a 10+4 Reed Solomon coding, we can allow loss of 4 blocks, with storage overhead only being 40%. This makes EC a quite attractive alternative for big data storage, particularly for cold data. Facebook had a related open source project called HDFS-RAID. It used to be one of the contribute packages in HDFS but had been removed since Hadoop 2.0 for maintain reason. The drawbacks are: 1) it is on top of HDFS and depends on MapReduce to do encoding and decoding tasks; 2) it can only be used for cold files that are intended not to be appended anymore; 3) the pure Java EC coding implementation is extremely slow in practical use. Due to these, it might not be a good idea to just bring HDFS-RAID back. We (Intel and Cloudera) are working on a design to build EC into HDFS that gets rid of any external dependencies, makes it self-contained and independently maintained. This design lays the EC feature on the storage type support and considers compatible with existing HDFS features like caching, snapshot, encryption, high availability and etc. This design will also support different EC coding schemes, implementations and policies for different deployment scenarios. By utilizing advanced libraries (e.g. Intel ISA-L library), an implementation can greatly improve the performance of EC encoding/decoding and makes the EC solution even more attractive. We will post the design document soon. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-6955) DN should reserve disk space for a full block when creating tmp files
[ https://issues.apache.org/jira/browse/HDFS-6955?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14699926#comment-14699926 ] kanaka kumar avvaru commented on HDFS-6955: --- Updated patch for white space errors and a checkstyle issue {{FsVolumeImpl.java:462:69: 'reserved' hides a field.}}. DN should reserve disk space for a full block when creating tmp files - Key: HDFS-6955 URL: https://issues.apache.org/jira/browse/HDFS-6955 Project: Hadoop HDFS Issue Type: Bug Components: datanode Affects Versions: 2.5.0 Reporter: Arpit Agarwal Assignee: kanaka kumar avvaru Attachments: HDFS-6955-01.patch, HDFS-6955-02.patch HDFS-6898 is introducing disk space reservation for RBW files to avoid running out of disk space midway through block creation. This Jira is to introduce similar reservation for tmp files. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
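[Editor's note] The reservation idea from HDFS-6898/HDFS-6955 in miniature; method and field names below are invented for illustration and do not match the FsVolumeImpl API:
{code}
import java.util.concurrent.atomic.AtomicLong;

class VolumeSpaceReserver {
  private final AtomicLong reservedForReplicas = new AtomicLong();

  /** Reserve a full block's worth of space when a tmp/RBW file is created. */
  void reserve(long blockSize) {
    reservedForReplicas.addAndGet(blockSize);
  }

  /** Release the reservation when the replica is finalized or aborted. */
  void release(long bytes) {
    reservedForReplicas.addAndGet(-bytes);
  }

  /** Available = raw free space minus what in-flight replicas may still use. */
  long getAvailable(long rawFree) {
    return Math.max(0L, rawFree - reservedForReplicas.get());
  }
}
{code}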
[jira] [Updated] (HDFS-6407) Add sorting and pagination in the datanode tab of the NN Web UI
[ https://issues.apache.org/jira/browse/HDFS-6407?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Haohui Mai updated HDFS-6407: - Issue Type: Improvement (was: Bug) Add sorting and pagination in the datanode tab of the NN Web UI --- Key: HDFS-6407 URL: https://issues.apache.org/jira/browse/HDFS-6407 Project: Hadoop HDFS Issue Type: Improvement Components: namenode Affects Versions: 2.4.0 Reporter: Nathan Roberts Assignee: Haohui Mai Priority: Critical Labels: BB2015-05-TBR Attachments: 002-datanodes-sorted-capacityUsed.png, 002-datanodes.png, 002-filebrowser.png, 002-snapshots.png, HDFS-6407-002.patch, HDFS-6407-003.patch, HDFS-6407.008.patch, HDFS-6407.009.patch, HDFS-6407.010.patch, HDFS-6407.011.patch, HDFS-6407.4.patch, HDFS-6407.5.patch, HDFS-6407.6.patch, HDFS-6407.7.patch, HDFS-6407.patch, browse_directory.png, datanodes.png, snapshots.png, sorting 2.png, sorting table.png old ui supported clicking on column header to sort on that column. The new ui seems to have dropped this very useful feature. There are a few tables in the Namenode UI to display datanodes information, directory listings and snapshots. When there are many items in the tables, it is useful to have ability to sort on the different columns. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-8278) HDFS Balancer should consider remaining storage % when checking for under-utilized machines
[ https://issues.apache.org/jira/browse/HDFS-8278?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tsz Wo Nicholas Sze updated HDFS-8278: -- Attachment: h8278_20150817.patch h8278_20150817.patch: counts only the storage with remaining storage >= default block size. I also removed the use of threshold in computeMaxSize2Move(..). HDFS Balancer should consider remaining storage % when checking for under-utilized machines --- Key: HDFS-8278 URL: https://issues.apache.org/jira/browse/HDFS-8278 Project: Hadoop HDFS Issue Type: Sub-task Components: balancer mover Affects Versions: 2.8.0 Reporter: Gopal V Assignee: Tsz Wo Nicholas Sze Attachments: h8278_20150817.patch DFS balancer mistakenly identifies a node with very little storage space remaining as an under-utilized node and tries to move large amounts of data to that particular node. All these block moves fail to execute successfully, as the % utilization is less relevant than the DFS remaining storage on that node.
{code}
15/04/24 04:25:55 INFO balancer.Balancer: 0 over-utilized: []
15/04/24 04:25:55 INFO balancer.Balancer: 1 underutilized: [172.19.1.46:50010:DISK]
15/04/24 04:25:55 INFO balancer.Balancer: Need to move 47.68 GB to make the cluster balanced.
15/04/24 04:25:55 INFO balancer.Balancer: Decided to move 413.08 MB bytes from 172.19.1.52:50010:DISK to 172.19.1.46:50010:DISK
15/04/24 04:25:55 INFO balancer.Balancer: Will move 413.08 MB in this iteration
15/04/24 04:25:55 WARN balancer.Dispatcher: Failed to move blk_1078689321_1099517353638 with size=131146 from 172.19.1.52:50010:DISK to 172.19.1.46:50010:DISK through 172.19.1.53:50010: Got error, status message opReplaceBlock BP-942051088-172.18.1.41-1370508013893:blk_1078689321_1099517353638 received exception org.apache.hadoop.util.DiskChecker$DiskOutOfSpaceException: Out of space: The volume with the most available space (=225042432 B) is less than the block size (=268435456 B)., block move is failed
{code}
The machine in question is under-full in terms of BP utilization, but has very little free space available for blocks.
{code}
Decommission Status : Normal
Configured Capacity: 3826907185152 (3.48 TB)
DFS Used: 2817262833664 (2.56 TB)
Non DFS Used: 1000621305856 (931.90 GB)
DFS Remaining: 9023045632 (8.40 GB)
DFS Used%: 73.62%
DFS Remaining%: 0.24%
Configured Cache Capacity: 8589934592 (8 GB)
Cache Used: 0 (0 B)
Cache Remaining: 8589934592 (8 GB)
Cache Used%: 0.00%
Cache Remaining%: 100.00%
Xceivers: 3
Last contact: Fri Apr 24 04:28:36 PDT 2015
{code}
The machine has 0.40 Gb of non-RAM storage available on that node, so it is futile to attempt to move any blocks to that particular machine. This is a similar concern when a machine loses disks, since the comparisons of utilization always compare percentages per-node. Even that scenario needs to cap data movement to that node to the DFS Remaining % variable. Trying to move any more data than that to a given node will always fail. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
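[Editor's note] The essence of the check the patch adds, per Nicholas's summary above; a sketch with illustrative names, not the Balancer source:
{code}
class StorageFilter {
  /**
   * When classifying under-utilized nodes, count a storage only if it could
   * actually accept a block. A node whose fullest volume cannot fit one block
   * (the DiskOutOfSpaceException above: ~225 MB free vs. a ~268 MB block)
   * must not be chosen as a balancing target.
   */
  static boolean usableTarget(long remainingBytes, long defaultBlockSize) {
    return remainingBytes >= defaultBlockSize;
  }
}
{code}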
[jira] [Updated] (HDFS-8883) NameNode Metrics : Add FSNameSystem lock Queue Length
[ https://issues.apache.org/jira/browse/HDFS-8883?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiaoyu Yao updated HDFS-8883: - Resolution: Fixed Hadoop Flags: Reviewed Status: Resolved (was: Patch Available) Thanks [~anu] for the contribution. I've committed the patch to trunk and branch-2. NameNode Metrics : Add FSNameSystem lock Queue Length - Key: HDFS-8883 URL: https://issues.apache.org/jira/browse/HDFS-8883 Project: Hadoop HDFS Issue Type: Improvement Components: HDFS Affects Versions: 2.7.1 Reporter: Anu Engineer Assignee: Anu Engineer Fix For: 2.8.0 Attachments: HDFS-8883.001.patch FSNameSystemLock can have contention when NameNode is under load. This patch adds LockQueueLength -- the number of threads waiting on FSNameSystemLock -- as a metric in NameNode. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
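[Editor's note] What the new metric measures, in miniature: the JDK lock already tracks its wait queue, so exposing it is cheap. A sketch, not the committed FSNamesystem code:
{code}
import java.util.concurrent.locks.ReentrantReadWriteLock;

class LockQueueMetricExample {
  private final ReentrantReadWriteLock fsLock = new ReentrantReadWriteLock(true);

  /** Estimated count of threads queued on either the read or write lock. */
  public int getLockQueueLength() {
    return fsLock.getQueueLength();
  }
}
{code}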
[jira] [Updated] (HDFS-8908) TestAppendSnapshotTruncate may fail with IOException: Failed to replace a bad datanode
[ https://issues.apache.org/jira/browse/HDFS-8908?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tsz Wo Nicholas Sze updated HDFS-8908: -- Attachment: h8908_20150817.patch TestAppendSnapshotTruncate may fail with IOException: Failed to replace a bad datanode -- Key: HDFS-8908 URL: https://issues.apache.org/jira/browse/HDFS-8908 Project: Hadoop HDFS Issue Type: Bug Components: test Reporter: Tsz Wo Nicholas Sze Assignee: Tsz Wo Nicholas Sze Priority: Minor Attachments: h8908_20150817.patch See https://builds.apache.org/job/PreCommit-HDFS-Build/12005/testReport/org.apache.hadoop.hdfs/TestAppendSnapshotTruncate/testAST/ -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-8801) Convert BlockInfoUnderConstruction as a feature
[ https://issues.apache.org/jira/browse/HDFS-8801?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14699936#comment-14699936 ] Haohui Mai commented on HDFS-8801: -- The current approach requires exposing {{setGenerationStampAndVerifyReplicas()}} and {{commitBlock()}} into the {{BlockInfo}} class. It's not ideal and it requires further refactoring, but I think given the scope of the changes it's okay to address it in a separate jira. +1. Convert BlockInfoUnderConstruction as a feature --- Key: HDFS-8801 URL: https://issues.apache.org/jira/browse/HDFS-8801 Project: Hadoop HDFS Issue Type: Improvement Components: namenode Reporter: Zhe Zhang Assignee: Jing Zhao Attachments: HDFS-8801.000.patch Per discussion under HDFS-8499, with the erasure coding feature, there will be 4 types of {{BlockInfo}} forming a multi-inheritance: {{complete+contiguous}}, {{complete+striping}}, {{UC+contiguous}}, {{UC+striped}}. We had the same challenge with {{INodeFile}} and the solution was building feature classes like {{FileUnderConstructionFeature}}. This JIRA aims to implement the same idea on {{BlockInfo}}. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-8908) TestAppendSnapshotTruncate may fail with IOException: Failed to replace a bad datanode
[ https://issues.apache.org/jira/browse/HDFS-8908?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tsz Wo Nicholas Sze updated HDFS-8908: -- Attachment: (was: h8908_20150817.patch) TestAppendSnapshotTruncate may fail with IOException: Failed to replace a bad datanode -- Key: HDFS-8908 URL: https://issues.apache.org/jira/browse/HDFS-8908 Project: Hadoop HDFS Issue Type: Bug Components: test Reporter: Tsz Wo Nicholas Sze Assignee: Tsz Wo Nicholas Sze Priority: Minor Attachments: h8908_20150817.patch See https://builds.apache.org/job/PreCommit-HDFS-Build/12005/testReport/org.apache.hadoop.hdfs/TestAppendSnapshotTruncate/testAST/ -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-6407) new namenode UI, lost ability to sort columns in datanode tab
[ https://issues.apache.org/jira/browse/HDFS-6407?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14699893#comment-14699893 ] Benoy Antony commented on HDFS-6407: It will be good to specify the version information of the datatables component. This will help in maintaining this functionality. For other Js components, the version information is included in the file name. new namenode UI, lost ability to sort columns in datanode tab - Key: HDFS-6407 URL: https://issues.apache.org/jira/browse/HDFS-6407 Project: Hadoop HDFS Issue Type: Bug Components: namenode Affects Versions: 2.4.0 Reporter: Nathan Roberts Assignee: Haohui Mai Priority: Critical Labels: BB2015-05-TBR Attachments: 002-datanodes-sorted-capacityUsed.png, 002-datanodes.png, 002-filebrowser.png, 002-snapshots.png, HDFS-6407-002.patch, HDFS-6407-003.patch, HDFS-6407.008.patch, HDFS-6407.009.patch, HDFS-6407.010.patch, HDFS-6407.011.patch, HDFS-6407.4.patch, HDFS-6407.5.patch, HDFS-6407.6.patch, HDFS-6407.7.patch, HDFS-6407.patch, browse_directory.png, datanodes.png, snapshots.png, sorting 2.png, sorting table.png old ui supported clicking on column header to sort on that column. The new ui seems to have dropped this very useful feature. There are a few tables in the Namenode UI to display datanodes information, directory listings and snapshots. When there are many items in the tables, it is useful to have ability to sort on the different columns. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-7285) Erasure Coding Support inside HDFS
[ https://issues.apache.org/jira/browse/HDFS-7285?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14699906#comment-14699906 ] Zhe Zhang commented on HDFS-7285: - Thanks Walter for the questions. bq. Will you rebase HDFS-7285 weekly after squashes it? Yes that's my plan. Actually since we are close to merging I plan to rebase more frequently. Currently I'm rebasing HDFS-7285-merge daily bq. Should we squash HDFS-8854 and HDFS-8833 as well? To make the future rebasing easier, also to try to avoid a second squash. [~andrew.wang] has started a discussion thread on common-dev regarding the rebase workflow. I'll wait until we reach a consensus there before squashing the 2 new big patches. bq. The commit message is inaccurate because HDFS-7285 is not finished yet. Good point. I'll reword it in next rebase. Erasure Coding Support inside HDFS -- Key: HDFS-7285 URL: https://issues.apache.org/jira/browse/HDFS-7285 Project: Hadoop HDFS Issue Type: New Feature Reporter: Weihua Jiang Assignee: Zhe Zhang Attachments: Consolidated-20150707.patch, Consolidated-20150806.patch, Consolidated-20150810.patch, ECAnalyzer.py, ECParser.py, HDFS-7285-initial-PoC.patch, HDFS-7285-merge-consolidated-01.patch, HDFS-7285-merge-consolidated-trunk-01.patch, HDFS-7285-merge-consolidated.trunk.03.patch, HDFS-7285-merge-consolidated.trunk.04.patch, HDFS-EC-Merge-PoC-20150624.patch, HDFS-EC-merge-consolidated-01.patch, HDFS-bistriped.patch, HDFSErasureCodingDesign-20141028.pdf, HDFSErasureCodingDesign-20141217.pdf, HDFSErasureCodingDesign-20150204.pdf, HDFSErasureCodingDesign-20150206.pdf, HDFSErasureCodingPhaseITestPlan.pdf, fsimage-analysis-20150105.pdf Erasure Coding (EC) can greatly reduce the storage overhead without sacrifice of data reliability, comparing to the existing HDFS 3-replica approach. For example, if we use a 10+4 Reed Solomon coding, we can allow loss of 4 blocks, with storage overhead only being 40%. This makes EC a quite attractive alternative for big data storage, particularly for cold data. Facebook had a related open source project called HDFS-RAID. It used to be one of the contribute packages in HDFS but had been removed since Hadoop 2.0 for maintain reason. The drawbacks are: 1) it is on top of HDFS and depends on MapReduce to do encoding and decoding tasks; 2) it can only be used for cold files that are intended not to be appended anymore; 3) the pure Java EC coding implementation is extremely slow in practical use. Due to these, it might not be a good idea to just bring HDFS-RAID back. We (Intel and Cloudera) are working on a design to build EC into HDFS that gets rid of any external dependencies, makes it self-contained and independently maintained. This design lays the EC feature on the storage type support and considers compatible with existing HDFS features like caching, snapshot, encryption, high availability and etc. This design will also support different EC coding schemes, implementations and policies for different deployment scenarios. By utilizing advanced libraries (e.g. Intel ISA-L library), an implementation can greatly improve the performance of EC encoding/decoding and makes the EC solution even more attractive. We will post the design document soon. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-6407) Add sorting and pagination in the datanode tab of the NN Web UI
[ https://issues.apache.org/jira/browse/HDFS-6407?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Haohui Mai updated HDFS-6407: - Summary: Add sorting and pagination in the datanode tab of the NN Web UI (was: new namenode UI, lost ability to sort columns in datanode tab) Add sorting and pagination in the datanode tab of the NN Web UI --- Key: HDFS-6407 URL: https://issues.apache.org/jira/browse/HDFS-6407 Project: Hadoop HDFS Issue Type: Bug Components: namenode Affects Versions: 2.4.0 Reporter: Nathan Roberts Assignee: Haohui Mai Priority: Critical Labels: BB2015-05-TBR Attachments: 002-datanodes-sorted-capacityUsed.png, 002-datanodes.png, 002-filebrowser.png, 002-snapshots.png, HDFS-6407-002.patch, HDFS-6407-003.patch, HDFS-6407.008.patch, HDFS-6407.009.patch, HDFS-6407.010.patch, HDFS-6407.011.patch, HDFS-6407.4.patch, HDFS-6407.5.patch, HDFS-6407.6.patch, HDFS-6407.7.patch, HDFS-6407.patch, browse_directory.png, datanodes.png, snapshots.png, sorting 2.png, sorting table.png old ui supported clicking on column header to sort on that column. The new ui seems to have dropped this very useful feature. There are a few tables in the Namenode UI to display datanodes information, directory listings and snapshots. When there are many items in the tables, it is useful to have ability to sort on the different columns. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-6407) Add sorting and pagination in the datanode tab of the NN Web UI
[ https://issues.apache.org/jira/browse/HDFS-6407?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Haohui Mai updated HDFS-6407: - Resolution: Fixed Hadoop Flags: Reviewed Fix Version/s: 2.8.0 Status: Resolved (was: Patch Available) I've committed the patch to trunk and branch-2. Thanks all for the reviews and the contribution. Add sorting and pagination in the datanode tab of the NN Web UI --- Key: HDFS-6407 URL: https://issues.apache.org/jira/browse/HDFS-6407 Project: Hadoop HDFS Issue Type: Improvement Components: namenode Affects Versions: 2.4.0 Reporter: Nathan Roberts Assignee: Haohui Mai Priority: Critical Labels: BB2015-05-TBR Fix For: 2.8.0 Attachments: 002-datanodes-sorted-capacityUsed.png, 002-datanodes.png, 002-filebrowser.png, 002-snapshots.png, HDFS-6407-002.patch, HDFS-6407-003.patch, HDFS-6407.008.patch, HDFS-6407.009.patch, HDFS-6407.010.patch, HDFS-6407.011.patch, HDFS-6407.4.patch, HDFS-6407.5.patch, HDFS-6407.6.patch, HDFS-6407.7.patch, HDFS-6407.patch, browse_directory.png, datanodes.png, snapshots.png, sorting 2.png, sorting table.png old ui supported clicking on column header to sort on that column. The new ui seems to have dropped this very useful feature. There are a few tables in the Namenode UI to display datanodes information, directory listings and snapshots. When there are many items in the tables, it is useful to have ability to sort on the different columns. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-8908) TestAppendSnapshotTruncate may fail with IOException: Failed to replace a bad datanode
[ https://issues.apache.org/jira/browse/HDFS-8908?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tsz Wo Nicholas Sze updated HDFS-8908: -- Status: Patch Available (was: Open) TestAppendSnapshotTruncate may fail with IOException: Failed to replace a bad datanode -- Key: HDFS-8908 URL: https://issues.apache.org/jira/browse/HDFS-8908 Project: Hadoop HDFS Issue Type: Bug Components: test Reporter: Tsz Wo Nicholas Sze Assignee: Tsz Wo Nicholas Sze Priority: Minor Attachments: h8908_20150817.patch See https://builds.apache.org/job/PreCommit-HDFS-Build/12005/testReport/org.apache.hadoop.hdfs/TestAppendSnapshotTruncate/testAST/ -- This message was sent by Atlassian JIRA (v6.3.4#6332)