[jira] [Commented] (HDFS-5579) Under construction files make DataNode decommission take very long hours

2014-01-08 Thread zhaoyunjiong (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-5579?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13865202#comment-13865202
 ] 

zhaoyunjiong commented on HDFS-5579:


It's already in the patch.
+if (bc.isUnderConstruction()) {
+  if (block.equals(bc.getLastBlock()) && curReplicas > minReplication) {
+    continue;
+  }
+  underReplicatedInOpenFiles++;
+}
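
For readers skimming the thread, here is a small, self-contained restatement of the decision that check makes (the method name and parameters are illustrative only, not part of the patch):

{code}
// Hedged restatement of the patch's check: during the decommission scan, the
// last block of an under-construction file is skipped once it already has more
// live replicas than minReplication, so an open file no longer stalls
// decommission indefinitely.
static boolean skipForDecommissionCheck(boolean isUnderConstruction,
    boolean isLastBlock, int curReplicas, int minReplication) {
  return isUnderConstruction && isLastBlock && curReplicas > minReplication;
}
{code}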

 Under construction files make DataNode decommission take very long hours
 

 Key: HDFS-5579
 URL: https://issues.apache.org/jira/browse/HDFS-5579
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: namenode
Affects Versions: 1.2.0, 2.2.0
Reporter: zhaoyunjiong
Assignee: zhaoyunjiong
 Attachments: HDFS-5579-branch-1.2.patch, HDFS-5579.patch


 We noticed that decommissioning DataNodes sometimes takes a very long time, even 
 exceeding 100 hours.
 After checking the code, I found that 
 BlockManager:computeReplicationWorkForBlocks(List<List<Block>> 
 blocksToReplicate) won't replicate blocks which belong to under-construction 
 files; however, in 
 BlockManager:isReplicationInProgress(DatanodeDescriptor srcNode), if there 
 is a block that needs replication, no matter whether it belongs to an 
 under-construction file or not, the decommission stays in progress.
 That's the reason the decommission sometimes takes a very long time.



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Created] (HDFS-5729) Lower chance to hit NPE in allocateNodeLocal

2014-01-08 Thread wenwupeng (JIRA)
wenwupeng created HDFS-5729:
---

 Summary: Lower chance to hit NPE in allocateNodeLocal 
 Key: HDFS-5729
 URL: https://issues.apache.org/jira/browse/HDFS-5729
 Project: Hadoop HDFS
  Issue Type: Bug
Reporter: wenwupeng


We hit an NPE in allocateNodeLocal intermittently when running a benchmark (4 
hits in 20 runs).

Steps:
1. Set up a Hadoop 2.2.0 environment.
2. Run for i in {1..10}; do /hadoop/hadoop-smoke/bin/hadoop jar 
/hadoop/hadoop-smoke/share/hadoop/mapreduce/hadoop-mapreduce-client-common-*.jar
 org.apache.hadoop.fs.TestDFSIO -write -nrFiles 30 -fileSize 64MB; sleep 10;done


2014-01-08 03:56:14,082 FATAL 
org.apache.hadoop.yarn.server.resourcemanager.ResourceManager: Error in 
handling event type NODE_UPDATE to the scheduler
java.lang.NullPointerException
at 
org.apache.hadoop.yarn.server.resourcemanager.scheduler.AppSchedulingInfo.allocateNodeLocal(AppSchedulingInfo.java:291)
at 
org.apache.hadoop.yarn.server.resourcemanager.scheduler.AppSchedulingInfo.allocate(AppSchedulingInfo.java:252)
at 
org.apache.hadoop.yarn.server.resourcemanager.scheduler.common.fica.FiCaSchedulerApp.allocate(FiCaSchedulerApp.java:294)
at 
org.apache.hadoop.yarn.server.resourcemanager.scheduler.fifo.FifoScheduler.assignContainer(FifoScheduler.java:614)
at 
org.apache.hadoop.yarn.server.resourcemanager.scheduler.fifo.FifoScheduler.assignNodeLocalContainers(FifoScheduler.java:524)
at 
org.apache.hadoop.yarn.server.resourcemanager.scheduler.fifo.FifoScheduler.assignContainersOnNode(FifoScheduler.java:482)
at 
org.apache.hadoop.yarn.server.resourcemanager.scheduler.fifo.FifoScheduler.assignContainers(FifoScheduler.java:419)
at 
org.apache.hadoop.yarn.server.resourcemanager.scheduler.fifo.FifoScheduler.nodeUpdate(FifoScheduler.java:658)
at 
org.apache.hadoop.yarn.server.resourcemanager.scheduler.fifo.FifoScheduler.handle(FifoScheduler.java:687)
at 
org.apache.hadoop.yarn.server.resourcemanager.scheduler.fifo.FifoScheduler.handle(FifoScheduler.java:95)
at 
org.apache.hadoop.yarn.server.resourcemanager.ResourceManager$SchedulerEventDispatcher$EventProcessor.run(ResourceManager.java:440)
at java.lang.Thread.run(Thread.java:662)

Will attach the log and configuration files later.

Note: 
My topology file:
10.111.89.230   /QE1/sin2-pekaurora-bdcqe046.eng.vmware.com
10.111.89.231   /QE1/sin2-pekaurora-bdcqe046.eng.vmware.com
10.111.89.232   /QE1/sin2-pekaurora-bdcqe046.eng.vmware.com
10.111.89.239   /QE1/sin2-pekaurora-bdcqe046.eng.vmware.com
10.111.89.233   /QE1/sin2-pekaurora-bdcqe017.eng.vmware.com
10.111.89.234   /QE1/sin2-pekaurora-bdcqe017.eng.vmware.com
10.111.89.240   /QE1/sin2-pekaurora-bdcqe017.eng.vmware.com
10.111.89.236   /QE2/sin2-pekaurora-bdcqe047.eng.vmware.com
10.111.89.241   /QE2/sin2-pekaurora-bdcqe047.eng.vmware.com
10.111.89.238   /QE2/sin2-pekaurora-bdcqe048.eng.vmware.com
10.111.89.242   /QE2/sin2-pekaurora-bdcqe048.eng.vmware.com




--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Updated] (HDFS-5729) Lower chance to hit NPE in allocateNodeLocal

2014-01-08 Thread wenwupeng (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-5729?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

wenwupeng updated HDFS-5729:


Attachment: log.tar.gz
conf.tar.gz

Attached the logs and configuration files.

 Lower chance to hit NPE in allocateNodeLocal 
 -

 Key: HDFS-5729
 URL: https://issues.apache.org/jira/browse/HDFS-5729
 Project: Hadoop HDFS
  Issue Type: Bug
Reporter: wenwupeng
 Attachments: conf.tar.gz, log.tar.gz


 We hit an NPE in allocateNodeLocal intermittently when running a benchmark (4 
 hits in 20 runs).
 Steps:
 1. Set up a Hadoop 2.2.0 environment.
 2. Run for i in {1..10}; do /hadoop/hadoop-smoke/bin/hadoop jar 
 /hadoop/hadoop-smoke/share/hadoop/mapreduce/hadoop-mapreduce-client-common-*.jar
  org.apache.hadoop.fs.TestDFSIO -write -nrFiles 30 -fileSize 64MB; sleep 
 10;done
 2014-01-08 03:56:14,082 FATAL 
 org.apache.hadoop.yarn.server.resourcemanager.ResourceManager: Error in 
 handling event type NODE_UPDATE to the scheduler
 java.lang.NullPointerException
 at 
 org.apache.hadoop.yarn.server.resourcemanager.scheduler.AppSchedulingInfo.allocateNodeLocal(AppSchedulingInfo.java:291)
 at 
 org.apache.hadoop.yarn.server.resourcemanager.scheduler.AppSchedulingInfo.allocate(AppSchedulingInfo.java:252)
 at 
 org.apache.hadoop.yarn.server.resourcemanager.scheduler.common.fica.FiCaSchedulerApp.allocate(FiCaSchedulerApp.java:294)
 at 
 org.apache.hadoop.yarn.server.resourcemanager.scheduler.fifo.FifoScheduler.assignContainer(FifoScheduler.java:614)
 at 
 org.apache.hadoop.yarn.server.resourcemanager.scheduler.fifo.FifoScheduler.assignNodeLocalContainers(FifoScheduler.java:524)
 at 
 org.apache.hadoop.yarn.server.resourcemanager.scheduler.fifo.FifoScheduler.assignContainersOnNode(FifoScheduler.java:482)
 at 
 org.apache.hadoop.yarn.server.resourcemanager.scheduler.fifo.FifoScheduler.assignContainers(FifoScheduler.java:419)
 at 
 org.apache.hadoop.yarn.server.resourcemanager.scheduler.fifo.FifoScheduler.nodeUpdate(FifoScheduler.java:658)
 at 
 org.apache.hadoop.yarn.server.resourcemanager.scheduler.fifo.FifoScheduler.handle(FifoScheduler.java:687)
 at 
 org.apache.hadoop.yarn.server.resourcemanager.scheduler.fifo.FifoScheduler.handle(FifoScheduler.java:95)
 at 
 org.apache.hadoop.yarn.server.resourcemanager.ResourceManager$SchedulerEventDispatcher$EventProcessor.run(ResourceManager.java:440)
 at java.lang.Thread.run(Thread.java:662)
 will attach log and configure files later
 Note: 
 My topology file:
 10.111.89.230   /QE1/sin2-pekaurora-bdcqe046.eng.vmware.com
 10.111.89.231   /QE1/sin2-pekaurora-bdcqe046.eng.vmware.com
 10.111.89.232   /QE1/sin2-pekaurora-bdcqe046.eng.vmware.com
 10.111.89.239   /QE1/sin2-pekaurora-bdcqe046.eng.vmware.com
 10.111.89.233   /QE1/sin2-pekaurora-bdcqe017.eng.vmware.com
 10.111.89.234   /QE1/sin2-pekaurora-bdcqe017.eng.vmware.com
 10.111.89.240   /QE1/sin2-pekaurora-bdcqe017.eng.vmware.com
 10.111.89.236   /QE2/sin2-pekaurora-bdcqe047.eng.vmware.com
 10.111.89.241   /QE2/sin2-pekaurora-bdcqe047.eng.vmware.com
 10.111.89.238   /QE2/sin2-pekaurora-bdcqe048.eng.vmware.com
 10.111.89.242   /QE2/sin2-pekaurora-bdcqe048.eng.vmware.com



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Commented] (HDFS-4273) Fix some issue in DFSInputstream

2014-01-08 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-4273?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13865220#comment-13865220
 ] 

Hadoop QA commented on HDFS-4273:
-

{color:green}+1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12621932/HDFS-4273.v8.patch
  against trunk revision .

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 2 new 
or modified test files.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  The javadoc tool did not generate any 
warning messages.

{color:green}+1 eclipse:eclipse{color}.  The patch built with 
eclipse:eclipse.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 1.3.9) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:green}+1 core tests{color}.  The patch passed unit tests in 
hadoop-hdfs-project/hadoop-hdfs.

{color:green}+1 contrib tests{color}.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-HDFS-Build/5844//testReport/
Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/5844//console

This message is automatically generated.

 Fix some issue in DFSInputstream
 

 Key: HDFS-4273
 URL: https://issues.apache.org/jira/browse/HDFS-4273
 Project: Hadoop HDFS
  Issue Type: Bug
Affects Versions: 2.0.2-alpha
Reporter: Binglin Chang
Assignee: Binglin Chang
Priority: Minor
 Attachments: HDFS-4273-v2.patch, HDFS-4273.patch, HDFS-4273.v3.patch, 
 HDFS-4273.v4.patch, HDFS-4273.v5.patch, HDFS-4273.v6.patch, 
 HDFS-4273.v7.patch, HDFS-4273.v8.patch, TestDFSInputStream.java


 Following issues in DFSInputStream are addressed in this jira:
 1. read may not retry enough in some cases, causing early failure.
 Assume the following call logic:
 {noformat}
 readWithStrategy()
   -> blockSeekTo()
   -> readBuffer()
      -> reader.doRead()
      -> seekToNewSource() add currentNode to deadnode, wish to get a 
 different datanode
         -> blockSeekTo()
            -> chooseDataNode()
               -> block missing, clear deadNodes and pick the currentNode again
      seekToNewSource() return false
   readBuffer() re-throw the exception, quit loop
 readWithStrategy() got the exception, and may fail the read call before 
 MaxBlockAcquireFailures retries were tried.
 {noformat}
 2. In a multi-threaded scenario (like HBase), DFSInputStream.failures has a 
 race condition: it is cleared to 0 while it is still used by another thread, so 
 it is possible that some read thread never quits. Changing failures to a local 
 variable solves this issue (see the sketch below).
 3. If the local datanode is added to deadNodes, it is not removed from 
 deadNodes even after the DN comes back alive. We need a way to remove the local 
 datanode from deadNodes when it becomes live again.
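 To make issue 2 concrete, a hedged, self-contained illustration (not the actual 
 DFSInputStream code; the class and method names are made up):
 {code}
 import java.io.IOException;

 class RetryCounterExample {
   private int failures; // shared across threads, as DFSInputStream.failures is today

   // Problematic pattern: another thread resetting 'failures' to 0 can keep
   // this loop from ever reaching maxFailures.
   void readWithSharedCounter(int maxFailures) throws IOException {
     while (true) {
       try { doRead(); return; }
       catch (IOException e) { if (++failures >= maxFailures) throw e; }
     }
   }

   // Fix suggested in the description: a local counter cannot be reset by other
   // threads, so the loop stops after maxFailures attempts.
   void readWithLocalCounter(int maxFailures) throws IOException {
     int localFailures = 0;
     while (true) {
       try { doRead(); return; }
       catch (IOException e) { if (++localFailures >= maxFailures) throw e; }
     }
   }

   void doRead() throws IOException { /* placeholder for the real read */ }
 }
 {code}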



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Created] (HDFS-5730) Inconsistent Audit logging for HDFS APIs

2014-01-08 Thread Uma Maheswara Rao G (JIRA)
Uma Maheswara Rao G created HDFS-5730:
-

 Summary: Inconsistent Audit logging for HDFS APIs
 Key: HDFS-5730
 URL: https://issues.apache.org/jira/browse/HDFS-5730
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: namenode
Affects Versions: 2.2.0, 3.0.0
Reporter: Uma Maheswara Rao G
Assignee: Uma Maheswara Rao G


While looking at the audit logs in HDFS, I am seeing some inconsistencies 
between what the older APIs log and what the recently added APIs log.

For more details please check the comments.



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Commented] (HDFS-5730) Inconsistent Audit logging for HDFS APIs

2014-01-08 Thread Uma Maheswara Rao G (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-5730?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13865273#comment-13865273
 ] 

Uma Maheswara Rao G commented on HDFS-5730:
---

HDFS audit logging interface:
{code}
 /**
   * Same as
   * {@link #logAuditEvent(boolean, String, InetAddress, String, String, 
String, FileStatus)}
   * with additional parameters related to logging delegation token tracking
   * IDs.
   * 
   * @param succeeded Whether authorization succeeded.
   * @param userName Name of the user executing the request.
   * @param addr Remote address of the request.
   * @param cmd The requested command.
   * @param src Path of affected source file.
   * @param dst Path of affected destination file (if any).
   * @param stat File information for operations that change the file's metadata
   *  (permissions, owner, times, etc).
   * @param ugi UserGroupInformation of the current user, or null if not logging
   *  token tracking information
   * @param dtSecretManager The token secret manager, or null if not logging
   *  token tracking information
   */
  public abstract void logAuditEvent(boolean succeeded, String userName,
  InetAddress addr, String cmd, String src, String dst,
  FileStatus stat, UserGroupInformation ugi,
  DelegationTokenSecretManager dtSecretManager);
{code}

Here the succeeded parameter indicates whether the authorization check succeeded.

Recent APIs like addCacheDirective, modifyCacheDirective, 
removeCacheDirective, etc. use that parameter to indicate whether the whole 
operation succeeded or not.

{code}
boolean success = false;
...
writeLock();
try {
  checkOperation(OperationCategory.WRITE);
  if (isInSafeMode()) {
    throw new SafeModeException(
        "Cannot add cache directive", safeMode);
  }
  cacheManager.modifyDirective(directive, pc, flags);
  getEditLog().logModifyCacheDirectiveInfo(directive,
      cacheEntry != null);
  success = true;
} finally {
  writeUnlock();
  if (success) {
    getEditLog().logSync();
  }
  if (isAuditEnabled() && isExternalInvocation()) {
    logAuditEvent(success, "modifyCacheDirective", null, null, null);
  }
  RetryCache.setState(cacheEntry, success);
}

{code}

But all the older APIs like startFile, etc. handle AccessControlException 
explicitly and pass the first parameter as false on failure; nothing is logged 
for other IOExceptions.


Also, the snapshot-related APIs follow yet another pattern: they log only on 
success.

{code}
String createSnapshot(String snapshotRoot, String snapshotName)
    throws SafeModeException, IOException {
  ...
  getEditLog().logSync();

  if (auditLog.isInfoEnabled() && isExternalInvocation()) {
    logAuditEvent(true, "createSnapshot", snapshotRoot, snapshotPath, null);
  }
  return snapshotPath;
}
{code}

So, we have to unify the audit logging here in all APIs.
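
A hedged sketch of one way the pattern could be unified (illustrative only; it 
reuses the helpers shown in the snippets above, with the operation name and 
paths left as placeholders):

{code}
boolean success = false;
try {
  checkOperation(OperationCategory.WRITE);
  // ... perform the operation and edit logging ...
  success = true;
} finally {
  // Always audit with the recorded outcome, so AccessControlException and any
  // other IOException are reported consistently across APIs.
  if (isAuditEnabled() && isExternalInvocation()) {
    logAuditEvent(success, "operationName", null, null, null); // placeholders
  }
}
{code}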

 Inconsistent Audit logging for HDFS APIs
 

 Key: HDFS-5730
 URL: https://issues.apache.org/jira/browse/HDFS-5730
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: namenode
Affects Versions: 3.0.0, 2.2.0
Reporter: Uma Maheswara Rao G
Assignee: Uma Maheswara Rao G

 While looking at the audit logs in HDFS, I am seeing some inconsistencies 
 between what the older APIs log and what the recently added APIs log.
 For more details please check the comments.



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Commented] (HDFS-5721) sharedEditsImage in Namenode#initializeSharedEdits() should be closed before method returns

2014-01-08 Thread Uma Maheswara Rao G (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-5721?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13865281#comment-13865281
 ] 

Uma Maheswara Rao G commented on HDFS-5721:
---

{quote}
There are also other places with the similar issues that not get close in 
finally block. i.e. Namenode#Format(), FSNamesystem# loadFromDisk(), etc. I 
think we should fix all these similar issues in one JIRA
{quote}
I agree with closing the streams. Actually, in most of these cases the JVM 
terminates immediately after the command execution (e.g. format, etc.), so the 
system does not keep running for long with the leaked streams. But if we have 
hit any issue caused by not closing them, closing the streams now would be fine. 
Did I miss something here?
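
A hedged sketch of the kind of change being discussed for initializeSharedEdits 
(FSImage is Closeable in this code base; the surrounding variables are taken 
from the snippet quoted below):

{code}
FSImage sharedEditsImage = null;
try {
  sharedEditsImage = new FSImage(conf,
      Lists.<URI>newArrayList(),
      sharedEditsDirs);
  // ... use the image to initialize the shared edits directories ...
} finally {
  if (sharedEditsImage != null) {
    sharedEditsImage.close(); // close on every exit path instead of leaking it
  }
}
{code}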


 sharedEditsImage in Namenode#initializeSharedEdits() should be closed before 
 method returns
 ---

 Key: HDFS-5721
 URL: https://issues.apache.org/jira/browse/HDFS-5721
 Project: Hadoop HDFS
  Issue Type: Bug
Reporter: Ted Yu
Assignee: Ted Yu
Priority: Minor
 Attachments: hdfs-5721-v1.txt, hdfs-5721-v2.txt


 At line 901:
 {code}
   FSImage sharedEditsImage = new FSImage(conf,
       Lists.<URI>newArrayList(),
       sharedEditsDirs);
 {code}
 sharedEditsImage is not closed before the method returns.



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Commented] (HDFS-5729) Lower chance to hit NPE in allocateNodeLocal

2014-01-08 Thread Uma Maheswara Rao G (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-5729?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13865288#comment-13865288
 ] 

Uma Maheswara Rao G commented on HDFS-5729:
---

Should we move this to YARN?

 Lower chance to hit NPE in allocateNodeLocal 
 -

 Key: HDFS-5729
 URL: https://issues.apache.org/jira/browse/HDFS-5729
 Project: Hadoop HDFS
  Issue Type: Bug
Reporter: wenwupeng
 Attachments: conf.tar.gz, log.tar.gz


 We hit an NPE in allocateNodeLocal intermittently when running a benchmark (4 
 hits in 20 runs).
 Steps:
 1. Set up a Hadoop 2.2.0 environment.
 2. Run for i in {1..10}; do /hadoop/hadoop-smoke/bin/hadoop jar 
 /hadoop/hadoop-smoke/share/hadoop/mapreduce/hadoop-mapreduce-client-common-*.jar
  org.apache.hadoop.fs.TestDFSIO -write -nrFiles 30 -fileSize 64MB; sleep 
 10;done
 2014-01-08 03:56:14,082 FATAL 
 org.apache.hadoop.yarn.server.resourcemanager.ResourceManager: Error in 
 handling event type NODE_UPDATE to the scheduler
 java.lang.NullPointerException
 at 
 org.apache.hadoop.yarn.server.resourcemanager.scheduler.AppSchedulingInfo.allocateNodeLocal(AppSchedulingInfo.java:291)
 at 
 org.apache.hadoop.yarn.server.resourcemanager.scheduler.AppSchedulingInfo.allocate(AppSchedulingInfo.java:252)
 at 
 org.apache.hadoop.yarn.server.resourcemanager.scheduler.common.fica.FiCaSchedulerApp.allocate(FiCaSchedulerApp.java:294)
 at 
 org.apache.hadoop.yarn.server.resourcemanager.scheduler.fifo.FifoScheduler.assignContainer(FifoScheduler.java:614)
 at 
 org.apache.hadoop.yarn.server.resourcemanager.scheduler.fifo.FifoScheduler.assignNodeLocalContainers(FifoScheduler.java:524)
 at 
 org.apache.hadoop.yarn.server.resourcemanager.scheduler.fifo.FifoScheduler.assignContainersOnNode(FifoScheduler.java:482)
 at 
 org.apache.hadoop.yarn.server.resourcemanager.scheduler.fifo.FifoScheduler.assignContainers(FifoScheduler.java:419)
 at 
 org.apache.hadoop.yarn.server.resourcemanager.scheduler.fifo.FifoScheduler.nodeUpdate(FifoScheduler.java:658)
 at 
 org.apache.hadoop.yarn.server.resourcemanager.scheduler.fifo.FifoScheduler.handle(FifoScheduler.java:687)
 at 
 org.apache.hadoop.yarn.server.resourcemanager.scheduler.fifo.FifoScheduler.handle(FifoScheduler.java:95)
 at 
 org.apache.hadoop.yarn.server.resourcemanager.ResourceManager$SchedulerEventDispatcher$EventProcessor.run(ResourceManager.java:440)
 at java.lang.Thread.run(Thread.java:662)
 will attach log and configure files later
 Note: 
 My topology file:
 10.111.89.230   /QE1/sin2-pekaurora-bdcqe046.eng.vmware.com
 10.111.89.231   /QE1/sin2-pekaurora-bdcqe046.eng.vmware.com
 10.111.89.232   /QE1/sin2-pekaurora-bdcqe046.eng.vmware.com
 10.111.89.239   /QE1/sin2-pekaurora-bdcqe046.eng.vmware.com
 10.111.89.233   /QE1/sin2-pekaurora-bdcqe017.eng.vmware.com
 10.111.89.234   /QE1/sin2-pekaurora-bdcqe017.eng.vmware.com
 10.111.89.240   /QE1/sin2-pekaurora-bdcqe017.eng.vmware.com
 10.111.89.236   /QE2/sin2-pekaurora-bdcqe047.eng.vmware.com
 10.111.89.241   /QE2/sin2-pekaurora-bdcqe047.eng.vmware.com
 10.111.89.238   /QE2/sin2-pekaurora-bdcqe048.eng.vmware.com
 10.111.89.242   /QE2/sin2-pekaurora-bdcqe048.eng.vmware.com



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Commented] (HDFS-5726) Fix compilation error in AbstractINodeDiff for JDK7

2014-01-08 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-5726?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13865317#comment-13865317
 ] 

Hudson commented on HDFS-5726:
--

SUCCESS: Integrated in Hadoop-Yarn-trunk #446 (See 
[https://builds.apache.org/job/Hadoop-Yarn-trunk/446/])
HDFS-5726. Fix compilation error in AbstractINodeDiff for JDK7. Contributed by 
Jing Zhao. (jing9: 
http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1556433)
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/snapshot/AbstractINodeDiff.java


 Fix compilation error in AbstractINodeDiff for JDK7
 ---

 Key: HDFS-5726
 URL: https://issues.apache.org/jira/browse/HDFS-5726
 Project: Hadoop HDFS
  Issue Type: Sub-task
  Components: namenode
Affects Versions: 3.0.0
Reporter: Jing Zhao
Assignee: Jing Zhao
Priority: Minor
 Fix For: 3.0.0

 Attachments: HDFS-5726.000.patch


 HDFS-5715 breaks JDK7 build for the following error:
 {code}
 [ERROR] 
 /home/kasha/code/hadoop-trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/snapshot/AbstractINodeDiff.java:[134,53]
  error: snapshotId has private access in AbstractINodeDiff
 {code}
 This jira will fix the issue.



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Commented] (HDFS-5715) Use Snapshot ID to indicate the corresponding Snapshot for a FileDiff/DirectoryDiff

2014-01-08 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-5715?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13865313#comment-13865313
 ] 

Hudson commented on HDFS-5715:
--

SUCCESS: Integrated in Hadoop-Yarn-trunk #446 (See 
[https://builds.apache.org/job/Hadoop-Yarn-trunk/446/])
HDFS-5715. Use Snapshot ID to indicate the corresponding Snapshot for a 
FileDiff/DirectoryDiff. Contributed by Jing Zhao. (jing9: 
http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1556353)
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/CacheReplicationMonitor.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/CacheManager.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FSDirectory.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FSEditLogLoader.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FSImage.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FSImageFormat.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FSNamesystem.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FSPermissionChecker.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/INode.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/INodeDirectory.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/INodeFile.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/INodeMap.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/INodeReference.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/INodeSymlink.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/INodeWithAdditionalFields.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/INodesInPath.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/snapshot/AbstractINodeDiff.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/snapshot/AbstractINodeDiffList.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/snapshot/DirectoryWithSnapshotFeature.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/snapshot/FileDiff.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/snapshot/FileDiffList.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/snapshot/FileWithSnapshotFeature.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/snapshot/INodeDirectorySnapshottable.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/snapshot/Snapshot.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/snapshot/SnapshotFSImageFormat.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/snapshot/SnapshotManager.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/namenode/TestFSDirectory.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/namenode/TestFSImageWithSnapshot.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/namenode/TestINodeFile.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/namenode/TestSnapshotPathINodes.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/namenode/snapshot/SnapshotTestHelper.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/namenode/snapshot/TestINodeFileUnderConstructionWithSnapshot.java
* 

[jira] [Commented] (HDFS-5649) Unregister NFS and Mount service when NFS gateway is shutting down

2014-01-08 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-5649?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13865316#comment-13865316
 ] 

Hudson commented on HDFS-5649:
--

SUCCESS: Integrated in Hadoop-Yarn-trunk #446 (See 
[https://builds.apache.org/job/Hadoop-Yarn-trunk/446/])
HDFS-5649. Unregister NFS and Mount service when NFS gateway is shutting down. 
Contributed by Brandon Li (brandonli: 
http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1556405)
* 
/hadoop/common/trunk/hadoop-common-project/hadoop-nfs/src/main/java/org/apache/hadoop/mount/MountdBase.java
* 
/hadoop/common/trunk/hadoop-common-project/hadoop-nfs/src/main/java/org/apache/hadoop/nfs/nfs3/Nfs3Base.java
* 
/hadoop/common/trunk/hadoop-common-project/hadoop-nfs/src/main/java/org/apache/hadoop/oncrpc/RpcProgram.java
* 
/hadoop/common/trunk/hadoop-common-project/hadoop-nfs/src/main/java/org/apache/hadoop/portmap/PortmapRequest.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs-nfs/src/main/java/org/apache/hadoop/hdfs/nfs/nfs3/DFSClientCache.java
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt


 Unregister NFS and Mount service when NFS gateway is shutting down
 --

 Key: HDFS-5649
 URL: https://issues.apache.org/jira/browse/HDFS-5649
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: nfs
Affects Versions: 3.0.0
Reporter: Brandon Li
Assignee: Brandon Li
 Fix For: 2.3.0

 Attachments: HDFS-5649.001.patch, HDFS-5649.002.patch


 The services should be unregistered if the gateway is asked to shutdown 
 gracefully.



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Commented] (HDFS-5724) modifyCacheDirective logging audit log command wrongly as addCacheDirective

2014-01-08 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-5724?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13865314#comment-13865314
 ] 

Hudson commented on HDFS-5724:
--

SUCCESS: Integrated in Hadoop-Yarn-trunk #446 (See 
[https://builds.apache.org/job/Hadoop-Yarn-trunk/446/])
HDFS-5724. modifyCacheDirective logging audit log command wrongly as 
addCacheDirective (Uma Maheswara Rao G via Colin Patrick McCabe) (cmccabe: 
http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1556386)
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FSNamesystem.java


 modifyCacheDirective logging audit log command wrongly as addCacheDirective
 ---

 Key: HDFS-5724
 URL: https://issues.apache.org/jira/browse/HDFS-5724
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: namenode
Affects Versions: 3.0.0
Reporter: Uma Maheswara Rao G
Assignee: Uma Maheswara Rao G
Priority: Minor
  Labels: caching
 Attachments: HDFS-5724.patch


 modifyCacheDirective:
 {code}
  if (isAuditEnabled() && isExternalInvocation()) {
    logAuditEvent(success, "addCacheDirective", null, null, null);
  }
 {code}



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Updated] (HDFS-5723) Append failed FINALIZED replica should not be accepted as valid when that block is underconstruction

2014-01-08 Thread Vinay (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-5723?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vinay updated HDFS-5723:


Assignee: Vinay
  Status: Patch Available  (was: Open)

 Append failed FINALIZED replica should not be accepted as valid when that 
 block is underconstruction
 

 Key: HDFS-5723
 URL: https://issues.apache.org/jira/browse/HDFS-5723
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: namenode
Affects Versions: 2.2.0
Reporter: Vinay
Assignee: Vinay
 Attachments: HDFS-5723.patch


 Scenario:
 1. 3 node cluster with 
 dfs.client.block.write.replace-datanode-on-failure.enable set to false.
 2. One file is written with 3 replicas, blk_id_gs1
 3. One of the datanodes, DN1, goes down.
 4. The file is opened for append, more data is added and synced (to only the 2 
 live nodes DN2 and DN3) -- blk_id_gs2.
 5. Now DN1 is restarted.
 6. In its block report, DN1 reports the FINALIZED block blk_id_gs1; this should 
 be marked corrupt.
 But since the NN holds the appended block's state as UnderConstruction, it does 
 not detect this replica as corrupt at that time and adds it to the valid block 
 locations.
 As long as the namenode stays up, this datanode will also be considered a valid 
 replica location, and read/append on that datanode will fail.
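 A self-contained restatement of the expected decision, for clarity (the method 
 and parameter names are illustrative; this is not the attached patch):
 {code}
 // A FINALIZED replica reported with a stale generation stamp (gs1 < gs2 in the
 // scenario above) should be treated as corrupt even while the NameNode still
 // has the block marked as under construction.
 static boolean isStaleFinalizedReplica(long reportedGenStamp,
     long storedGenStamp, boolean reportedFinalized) {
   return reportedFinalized && reportedGenStamp < storedGenStamp;
 }
 {code}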



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Updated] (HDFS-3752) BOOTSTRAPSTANDBY for new Standby node will not work just after saveNameSpace at ANN in case of BKJM

2014-01-08 Thread Rakesh R (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-3752?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rakesh R updated HDFS-3752:
---

Attachment: HDFS-3752-testcase.patch

 BOOTSTRAPSTANDBY for new Standby node will not work just after saveNameSpace 
 at ANN in case of BKJM
 ---

 Key: HDFS-3752
 URL: https://issues.apache.org/jira/browse/HDFS-3752
 Project: Hadoop HDFS
  Issue Type: Sub-task
  Components: ha
Affects Versions: 2.0.0-alpha
Reporter: Vinay
 Attachments: HDFS-3752-testcase.patch


 1. Do {{saveNameSpace}} on the ANN node by entering safemode.
 2. On another new node, install a standby NN and run BOOTSTRAPSTANDBY.
 3. Now the standby NN is not able to copy fsimage_txid from the ANN.
 This is because the SNN cannot find the next txid (txid+1) in shared 
 storage.
 Just after {{saveNameSpace}}, shared storage has a new log segment containing 
 only the START_LOG_SEGMENT edit op,
 and BookKeeper is not able to read the last entry from an in-progress ledger.



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Updated] (HDFS-5723) Append failed FINALIZED replica should not be accepted as valid when that block is underconstruction

2014-01-08 Thread Vinay (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-5723?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vinay updated HDFS-5723:


Attachment: HDFS-5723.patch

Attached the patch, please review.

 Append failed FINALIZED replica should not be accepted as valid when that 
 block is underconstruction
 

 Key: HDFS-5723
 URL: https://issues.apache.org/jira/browse/HDFS-5723
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: namenode
Affects Versions: 2.2.0
Reporter: Vinay
 Attachments: HDFS-5723.patch


 Scenario:
 1. 3 node cluster with 
 dfs.client.block.write.replace-datanode-on-failure.enable set to false.
 2. One file is written with 3 replicas, blk_id_gs1
 3. One of the datanodes, DN1, goes down.
 4. The file is opened for append, more data is added and synced (to only the 2 
 live nodes DN2 and DN3) -- blk_id_gs2.
 5. Now DN1 is restarted.
 6. In its block report, DN1 reports the FINALIZED block blk_id_gs1; this should 
 be marked corrupt.
 But since the NN holds the appended block's state as UnderConstruction, it does 
 not detect this replica as corrupt at that time and adds it to the valid block 
 locations.
 As long as the namenode stays up, this datanode will also be considered a valid 
 replica location, and read/append on that datanode will fail.



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Commented] (HDFS-3752) BOOTSTRAPSTANDBY for new Standby node will not work just after saveNameSpace at ANN in case of BKJM

2014-01-08 Thread Rakesh R (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-3752?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13865367#comment-13865367
 ] 

Rakesh R commented on HDFS-3752:


Hi, as I understood from the discussion, when bootstrapping the standby it is 
not really required to see the transactions present in the 'in_progress' node, 
and skipping 'in_progress' will not cause any inconsistencies. In any case, the 
standby-to-active transition will always ensure the edit delta transactions are 
read from the shared edit dirs, so the node can reliably start as Active.

bq.we could add an easy workaround flag, like bootstrapStandby 
-skipSharedEditsCheck, since the check here is just to help out the user and 
not actually necessary for correct operation.
I also agree with skipping the shared edits check during bootstrapStandby; in 
that case there is no special fix required for this JIRA. 
Presently there are no test cases for bootstrap with BKJM shared edits, and I've 
tried a few. Could you please review the attached test case patch? If everyone 
agrees, we can push it in and close this JIRA once HDFS-4120 is in. Any 
thoughts?


 BOOTSTRAPSTANDBY for new Standby node will not work just after saveNameSpace 
 at ANN in case of BKJM
 ---

 Key: HDFS-3752
 URL: https://issues.apache.org/jira/browse/HDFS-3752
 Project: Hadoop HDFS
  Issue Type: Sub-task
  Components: ha
Affects Versions: 2.0.0-alpha
Reporter: Vinay
 Attachments: HDFS-3752-testcase.patch


 1. Do {{saveNameSpace}} on the ANN node by entering safemode.
 2. On another new node, install a standby NN and run BOOTSTRAPSTANDBY.
 3. Now the standby NN is not able to copy fsimage_txid from the ANN.
 This is because the SNN cannot find the next txid (txid+1) in shared 
 storage.
 Just after {{saveNameSpace}}, shared storage has a new log segment containing 
 only the START_LOG_SEGMENT edit op,
 and BookKeeper is not able to read the last entry from an in-progress ledger.



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Commented] (HDFS-5715) Use Snapshot ID to indicate the corresponding Snapshot for a FileDiff/DirectoryDiff

2014-01-08 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-5715?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13865432#comment-13865432
 ] 

Hudson commented on HDFS-5715:
--

SUCCESS: Integrated in Hadoop-Hdfs-trunk #1638 (See 
[https://builds.apache.org/job/Hadoop-Hdfs-trunk/1638/])
HDFS-5715. Use Snapshot ID to indicate the corresponding Snapshot for a 
FileDiff/DirectoryDiff. Contributed by Jing Zhao. (jing9: 
http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1556353)
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/CacheReplicationMonitor.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/CacheManager.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FSDirectory.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FSEditLogLoader.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FSImage.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FSImageFormat.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FSNamesystem.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FSPermissionChecker.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/INode.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/INodeDirectory.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/INodeFile.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/INodeMap.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/INodeReference.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/INodeSymlink.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/INodeWithAdditionalFields.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/INodesInPath.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/snapshot/AbstractINodeDiff.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/snapshot/AbstractINodeDiffList.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/snapshot/DirectoryWithSnapshotFeature.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/snapshot/FileDiff.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/snapshot/FileDiffList.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/snapshot/FileWithSnapshotFeature.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/snapshot/INodeDirectorySnapshottable.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/snapshot/Snapshot.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/snapshot/SnapshotFSImageFormat.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/snapshot/SnapshotManager.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/namenode/TestFSDirectory.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/namenode/TestFSImageWithSnapshot.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/namenode/TestINodeFile.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/namenode/TestSnapshotPathINodes.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/namenode/snapshot/SnapshotTestHelper.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/namenode/snapshot/TestINodeFileUnderConstructionWithSnapshot.java
* 

[jira] [Commented] (HDFS-5649) Unregister NFS and Mount service when NFS gateway is shutting down

2014-01-08 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-5649?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13865435#comment-13865435
 ] 

Hudson commented on HDFS-5649:
--

SUCCESS: Integrated in Hadoop-Hdfs-trunk #1638 (See 
[https://builds.apache.org/job/Hadoop-Hdfs-trunk/1638/])
HDFS-5649. Unregister NFS and Mount service when NFS gateway is shutting down. 
Contributed by Brandon Li (brandonli: 
http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1556405)
* 
/hadoop/common/trunk/hadoop-common-project/hadoop-nfs/src/main/java/org/apache/hadoop/mount/MountdBase.java
* 
/hadoop/common/trunk/hadoop-common-project/hadoop-nfs/src/main/java/org/apache/hadoop/nfs/nfs3/Nfs3Base.java
* 
/hadoop/common/trunk/hadoop-common-project/hadoop-nfs/src/main/java/org/apache/hadoop/oncrpc/RpcProgram.java
* 
/hadoop/common/trunk/hadoop-common-project/hadoop-nfs/src/main/java/org/apache/hadoop/portmap/PortmapRequest.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs-nfs/src/main/java/org/apache/hadoop/hdfs/nfs/nfs3/DFSClientCache.java
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt


 Unregister NFS and Mount service when NFS gateway is shutting down
 --

 Key: HDFS-5649
 URL: https://issues.apache.org/jira/browse/HDFS-5649
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: nfs
Affects Versions: 3.0.0
Reporter: Brandon Li
Assignee: Brandon Li
 Fix For: 2.3.0

 Attachments: HDFS-5649.001.patch, HDFS-5649.002.patch


 The services should be unregistered if the gateway is asked to shutdown 
 gracefully.



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Commented] (HDFS-5726) Fix compilation error in AbstractINodeDiff for JDK7

2014-01-08 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-5726?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13865436#comment-13865436
 ] 

Hudson commented on HDFS-5726:
--

SUCCESS: Integrated in Hadoop-Hdfs-trunk #1638 (See 
[https://builds.apache.org/job/Hadoop-Hdfs-trunk/1638/])
HDFS-5726. Fix compilation error in AbstractINodeDiff for JDK7. Contributed by 
Jing Zhao. (jing9: 
http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1556433)
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/snapshot/AbstractINodeDiff.java


 Fix compilation error in AbstractINodeDiff for JDK7
 ---

 Key: HDFS-5726
 URL: https://issues.apache.org/jira/browse/HDFS-5726
 Project: Hadoop HDFS
  Issue Type: Sub-task
  Components: namenode
Affects Versions: 3.0.0
Reporter: Jing Zhao
Assignee: Jing Zhao
Priority: Minor
 Fix For: 3.0.0

 Attachments: HDFS-5726.000.patch


 HDFS-5715 breaks JDK7 build for the following error:
 {code}
 [ERROR] 
 /home/kasha/code/hadoop-trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/snapshot/AbstractINodeDiff.java:[134,53]
  error: snapshotId has private access in AbstractINodeDiff
 {code}
 This jira will fix the issue.



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Updated] (HDFS-5727) introduce a self-maintaining io queue handling mechanism

2014-01-08 Thread Richard Chen (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-5727?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Richard Chen updated HDFS-5727:
---

Summary: introduce a self-maintaining io queue handling mechanism  (was: 
introduce a self-maintain io queue handling mechanism)

 introduce a self-maintaining io queue handling mechanism
 

 Key: HDFS-5727
 URL: https://issues.apache.org/jira/browse/HDFS-5727
 Project: Hadoop HDFS
  Issue Type: New Feature
  Components: datanode
Affects Versions: 3.0.0
Reporter: Liang Xie
Assignee: Liang Xie

 Currently the datanode read/write SLA is difficult to guarantee for HBase's 
 online requirements. One of the major reasons is that we don't support IO 
 priority or IO request reordering inside the datanode.
 I propose introducing a self-maintaining IO queue mechanism to handle IO 
 request priority. Imagine there are lots of concurrent read/write requests 
 from the HBase side, and a background datanode block scanner happens to be 
 running (the default interval is every 21 days, IIRC) at the same time; then 
 the HBase read/write 99% or 99.9% percentile latency would suffer despite the 
 scanner's background-thread throttling...
 I have not thought the reordering part through completely, but reordering in 
 an application-side queue would definitely beat relying on the OS's IO queue 
 merging as we do now.
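 To illustrate the general direction only (a hedged sketch, not a design from 
 this JIRA; all names are made up), an application-side priority queue for IO 
 requests could look like:
 {code}
 import java.util.concurrent.PriorityBlockingQueue;

 class IoRequest implements Comparable<IoRequest> {
   final int priority;   // lower value = more urgent (e.g. HBase online reads)
   final Runnable work;
   IoRequest(int priority, Runnable work) { this.priority = priority; this.work = work; }
   @Override public int compareTo(IoRequest other) {
     return Integer.compare(priority, other.priority);
   }
 }

 class PrioritizedIoQueue {
   private final PriorityBlockingQueue<IoRequest> queue = new PriorityBlockingQueue<>();
   void submit(IoRequest request) { queue.add(request); }                  // enqueue by priority
   void runOne() throws InterruptedException { queue.take().work.run(); }  // serve most urgent first
 }
 {code}
 For example, block-scanner reads could be submitted with a larger priority 
 value than HBase-facing reads, so they are served only when no urgent request 
 is waiting.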



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Updated] (HDFS-5727) introduce a self-maintain io queue handling mechanism

2014-01-08 Thread Richard Chen (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-5727?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Richard Chen updated HDFS-5727:
---

Description: 
Currently the datanode read/write SLA is difficult to guarantee for HBase's 
online requirements. One of the major reasons is that we don't support IO 
priority or IO request reordering inside the datanode.
I propose introducing a self-maintaining IO queue mechanism to handle IO request 
priority. Imagine there are lots of concurrent read/write requests from the 
HBase side, and a background datanode block scanner happens to be running (the 
default interval is every 21 days, IIRC) at the same time; then the HBase 
read/write 99% or 99.9% percentile latency would suffer despite the scanner's 
background-thread throttling...
I have not thought the reordering part through completely, but reordering in an 
application-side queue would definitely beat relying on the OS's IO queue 
merging as we do now.

  was:
Currently the datanode read/write SLA is dfficult to be ganranteed for HBase 
online requirement. One of major reasons is we don't support io priority or io 
reqeust reorder inside datanode.
I proposal introducing a self-maintain io queue mechanism to handle io request 
priority. Image there're lots of concurrent read/write reqeust from HBase side, 
and a background datanode block scanner is running(default is every 21 days, 
IIRC) just in time, then the HBase read/write 99% or 99.9% percentile latency 
would be vulnerable despite we have a bg thread throttling...
the reorder stuf i have not thought clearly enough, but definitely the reorder 
in the queue in the app side would beat the currently relying OS's io queue 
merge.


 introduce a self-maintain io queue handling mechanism
 -

 Key: HDFS-5727
 URL: https://issues.apache.org/jira/browse/HDFS-5727
 Project: Hadoop HDFS
  Issue Type: New Feature
  Components: datanode
Affects Versions: 3.0.0
Reporter: Liang Xie
Assignee: Liang Xie

 Currently the datanode read/write SLA is difficult to guarantee for HBase's 
 online requirements. One of the major reasons is that we don't support IO 
 priority or IO request reordering inside the datanode.
 I propose introducing a self-maintaining IO queue mechanism to handle IO 
 request priority. Imagine there are lots of concurrent read/write requests 
 from the HBase side, and a background datanode block scanner happens to be 
 running (the default interval is every 21 days, IIRC) at the same time; then 
 the HBase read/write 99% or 99.9% percentile latency would suffer despite the 
 scanner's background-thread throttling...
 I have not thought the reordering part through completely, but reordering in 
 an application-side queue would definitely beat relying on the OS's IO queue 
 merging as we do now.



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Created] (HDFS-5731) Refactoring to define interfaces between BM and NN and simplify the flow between them

2014-01-08 Thread Amir Langer (JIRA)
Amir Langer created HDFS-5731:
-

 Summary: Refactoring to define interfaces between BM and NN and 
simplify the flow between them
 Key: HDFS-5731
 URL: https://issues.apache.org/jira/browse/HDFS-5731
 Project: Hadoop HDFS
  Issue Type: Sub-task
  Components: namenode
Reporter: Amir Langer


Start the separation of BlockManager (BM) from NameNode (NN) by simplifying the 
flow between the two components and defining API interfaces between them. 
The two components still exist in the same VM and use the same memory space 
(using the same instances).
Logic for handling calls from Datanodes should be in the BM.
The NN should interact with the BM through a few calls, and the BM should use 
return types as much as possible to pass information back to the NN.
The APIs between them should be defined as interfaces so that they can later be 
changed to not share object instances and be turned into a real protocol.
This still assumes a one to one relation between NN and BM, same VM and does 
not handle lifecycle of the service.
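
As a hedged illustration only (the JIRA does not define concrete interfaces yet; 
all names below are made up), the kind of boundary described above might look 
like:

{code}
// NN-facing view of the BM: datanode-driven logic lives behind this API, and
// results are passed back through return types rather than shared state.
interface BlockManagerApi {
  BlockReportResult processBlockReport(String datanodeUuid, long[] reportedBlockIds);
}

final class BlockReportResult {
  final int blocksProcessed;
  BlockReportResult(int blocksProcessed) { this.blocksProcessed = blocksProcessed; }
}
{code}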




--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Updated] (HDFS-5731) Refactoring to define interfaces between BM and NN and simplify the flow between them

2014-01-08 Thread Amir Langer (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-5731?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Amir Langer updated HDFS-5731:
--

Description: 
Start the separation of BlockManager (BM) from NameNode (NN) by simplifying the 
flow between the two components and defining API interfaces between them. 
The two components still exist in the same VM and use the same memory space 
(using the same instances).
Logic for handling calls from Datanodes should be in the BM.
The NN should interact with the BM through a few calls, and the BM should use 
return types as much as possible to pass information back to the NN.
The APIs between them should be defined as interfaces so that they can later be 
changed to not share object instances and be turned into a real protocol.
This still assumes a one to one relation between NN and BM, same VM and does 
not handle lifecycle of the service.

This task should maintain backward compatibility


  was:
Start the separation of BlockManager (BM) from NameNode (NN) by simplifying the 
flow between the two components and defining API interfaces between them. 
The two components still exist in the same VM and use the same memory space 
(using the same instances).
Logic to calls from Datanodes should be in the BM.
NN should interact with BM using few calls and BM should use the return types 
as much as possible to pass information to the NN.
APIs between them should be defined as interfaces so later it can be improved 
to not use the same object instances and turned into a real protocol.
This still assumes a one to one relation between NN and BM, same VM and does 
not handle lifecycle of the service.



 Refactoring to define interfaces between BM and NN and simplify the flow 
 between them
 -

 Key: HDFS-5731
 URL: https://issues.apache.org/jira/browse/HDFS-5731
 Project: Hadoop HDFS
  Issue Type: Sub-task
  Components: namenode
Reporter: Amir Langer

 Start the separation of BlockManager (BM) from NameNode (NN) by simplifying 
 the flow between the two components and defining API interfaces between them. 
 The two components still exist in the same VM and use the same memory space 
 (using the same instances).
 Logic to calls from Datanodes should be in the BM.
 NN should interact with BM using few calls and BM should use the return types 
 as much as possible to pass information to the NN.
 APIs between them should be defined as interfaces so later it can be improved 
 to not use the same object instances and turned into a real protocol.
 This still assumes a one to one relation between NN and BM, same VM and does 
 not handle lifecycle of the service.
 This task should maintain backward compatibility



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Created] (HDFS-5732) Separate memory space between BM and NN

2014-01-08 Thread Amir Langer (JIRA)
Amir Langer created HDFS-5732:
-

 Summary: Separate memory space between BM and NN
 Key: HDFS-5732
 URL: https://issues.apache.org/jira/browse/HDFS-5732
 Project: Hadoop HDFS
  Issue Type: Sub-task
  Components: namenode
Reporter: Amir Langer


Change created APIs to not rely on the same instance being shared in both BM 
and NN. Use immutable objects / keep state in sync.
BM and NN will still exist in the same VM; work on a new BM service as an 
independent process is deferred to later tasks.
Also, a one to one relation between BM and NN is assumed. 
This task should maintain backward compatibility.



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Created] (HDFS-5733) Separate concurrency control between BM and NN

2014-01-08 Thread Amir Langer (JIRA)
Amir Langer created HDFS-5733:
-

 Summary: Separate concurrency control between BM and NN
 Key: HDFS-5733
 URL: https://issues.apache.org/jira/browse/HDFS-5733
 Project: Hadoop HDFS
  Issue Type: Sub-task
Reporter: Amir Langer


Replace the BM's usage of the namesystem locking mechanism with its own 
concurrency control guarding its own internal state.

Both NN and BM will still run from the same VM.
This task should maintain backward compatibility.

 



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Commented] (HDFS-5727) introduce a self-maintaining io queue handling mechanism

2014-01-08 Thread Richard Chen (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-5727?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13865467#comment-13865467
 ] 

Richard Chen commented on HDFS-5727:


Interesting, but if you can improve the wording further it will help the 
audience better understand what you intend to do. My team is working on 
something similar. I am thinking of adding your problem to our design 
scope. We can certainly collaborate on this. Let me know your thoughts.

 introduce a self-maintaining io queue handling mechanism
 

 Key: HDFS-5727
 URL: https://issues.apache.org/jira/browse/HDFS-5727
 Project: Hadoop HDFS
  Issue Type: New Feature
  Components: datanode
Affects Versions: 3.0.0
Reporter: Liang Xie
Assignee: Liang Xie

 Currently the datanode read/write SLA is difficult to guarantee for HBase's 
 online requirements. One of the major reasons is that we don't support IO 
 priority or IO request reordering inside the datanode.
 I propose introducing a self-maintaining IO queue mechanism to handle IO 
 request priority. Imagine there are lots of concurrent read/write requests 
 from the HBase side, and a background datanode block scanner happens to be 
 running (the default interval is every 21 days, IIRC) at the same time; then 
 the HBase read/write 99% or 99.9% percentile latency would suffer despite the 
 scanner's background-thread throttling...
 I have not thought the reordering part through completely, but reordering in 
 an application-side queue would definitely beat relying on the OS's IO queue 
 merging as we do now.



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Created] (HDFS-5734) A NN-internal RPC BM service

2014-01-08 Thread Amir Langer (JIRA)
Amir Langer created HDFS-5734:
-

 Summary: A NN-internal RPC BM service
 Key: HDFS-5734
 URL: https://issues.apache.org/jira/browse/HDFS-5734
 Project: Hadoop HDFS
  Issue Type: Sub-task
  Components: namenode
Reporter: Amir Langer


Separate the BM from the NN by running it with its own thread-pool and RPC 
protocol but still in the same process as the NN.
NN and BM will interact through some loopback call that will simulate a 
separate service.
This sprint still assumes a one to one relation between NN and BM and does not 
split the BM to a separate process, only simulates such a split inside the same 
VM. This allows us to defer any configuration issue / Testing support / scripts 
changes to later tasks. 
This task will therefore also not handle any HA issue to the BM itself. It 
will, however, deal with having BM code actually running in a different thread 
to the NN code and will handle building the initialisation / lifecycle code to 
an independent BM.



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Created] (HDFS-5735) Testing support for BM as a service

2014-01-08 Thread Amir Langer (JIRA)
Amir Langer created HDFS-5735:
-

 Summary: Testing support for BM as a service
 Key: HDFS-5735
 URL: https://issues.apache.org/jira/browse/HDFS-5735
 Project: Hadoop HDFS
  Issue Type: Sub-task
  Components: namenode
Reporter: Amir Langer


Testing support for an independent BM service. Modify tests to start it / use 
MiniDFSCluster if they require a BM. Verify that all tests still pass with an 
independent BM (running off MiniDFSCluster).




--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Created] (HDFS-5736) BM service as a separate process

2014-01-08 Thread Amir Langer (JIRA)
Amir Langer created HDFS-5736:
-

 Summary: BM service as a separate process
 Key: HDFS-5736
 URL: https://issues.apache.org/jira/browse/HDFS-5736
 Project: Hadoop HDFS
  Issue Type: Sub-task
  Components: namenode
Reporter: Amir Langer


Add scripts / configuration to allow running the BM as a separate service.



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Updated] (HDFS-5731) Refactoring to define interfaces between BM and NN and simplify the flow between them

2014-01-08 Thread Amir Langer (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-5731?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Amir Langer updated HDFS-5731:
--

Attachment: 0001-Separation-of-BM-from-NN-Step1-introduce-APIs-as-int.patch

The patch contains changes done on top of trunk and was last rebased to start from
this commit:

HADOOP-10175. Har files system authority should preserve userinfo. Contributed 
by Chuan Liu.  
git-svn-id: https://svn.apache.org/repos/asf/hadoop/common/trunk@1553169 
13f79535-47bb-0310-9956-ffa450edef68



 Refactoring to define interfaces between BM and NN and simplify the flow 
 between them
 -

 Key: HDFS-5731
 URL: https://issues.apache.org/jira/browse/HDFS-5731
 Project: Hadoop HDFS
  Issue Type: Sub-task
  Components: namenode
Reporter: Amir Langer
 Attachments: 
 0001-Separation-of-BM-from-NN-Step1-introduce-APIs-as-int.patch


 Start the separation of BlockManager (BM) from NameNode (NN) by simplifying 
 the flow between the two components and defining API interfaces between them. 
 The two components still exist in the same VM and use the same memory space 
 (using the same instances).
 Logic to handle calls from Datanodes should be in the BM.
 The NN should interact with the BM using few calls, and the BM should use the 
 return types as much as possible to pass information back to the NN.
 APIs between them should be defined as interfaces so that they can later be 
 improved to not use the same object instances and be turned into a real protocol.
 This still assumes a one-to-one relation between NN and BM in the same VM and 
 does not handle the lifecycle of the service.
 This task should maintain backward compatibility.
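
For illustration only, a hypothetical sketch of the kind of narrow, 
return-value-driven interface described above (these are not the APIs in the 
attached patch):

{code}
// Hypothetical sketch; illustrative types only, not the patch's APIs.
interface BlockManagerApi {
  /** Handle a full block report from a DataNode and return any follow-up work. */
  BlockReportResult processBlockReport(String datanodeUuid, long[] reportedBlockIds);

  /** Ask for replication targets; the NN only consumes the returned value. */
  java.util.List<String> chooseTargets(long blockId, int additionalReplicas);
}

// Immutable result object so the NN does not share mutable state with the BM.
final class BlockReportResult {
  private final int toAdd;
  private final int toRemove;

  BlockReportResult(int toAdd, int toRemove) {
    this.toAdd = toAdd;
    this.toRemove = toRemove;
  }

  int getToAdd() { return toAdd; }
  int getToRemove() { return toRemove; }
}
{code}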



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Commented] (HDFS-5723) Append failed FINALIZED replica should not be accepted as valid when that block is underconstruction

2014-01-08 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-5723?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13865487#comment-13865487
 ] 

Hadoop QA commented on HDFS-5723:
-

{color:green}+1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12621965/HDFS-5723.patch
  against trunk revision .

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 1 new 
or modified test files.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  The javadoc tool did not generate any 
warning messages.

{color:green}+1 eclipse:eclipse{color}.  The patch built with 
eclipse:eclipse.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 1.3.9) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:green}+1 core tests{color}.  The patch passed unit tests in 
hadoop-hdfs-project/hadoop-hdfs.

{color:green}+1 contrib tests{color}.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-HDFS-Build/5845//testReport/
Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/5845//console

This message is automatically generated.

 Append failed FINALIZED replica should not be accepted as valid when that 
 block is underconstruction
 

 Key: HDFS-5723
 URL: https://issues.apache.org/jira/browse/HDFS-5723
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: namenode
Affects Versions: 2.2.0
Reporter: Vinay
Assignee: Vinay
 Attachments: HDFS-5723.patch


 Scenario:
 1. 3-node cluster with 
 dfs.client.block.write.replace-datanode-on-failure.enable set to false.
 2. One file is written with 3 replicas, blk_id_gs1.
 3. One of the datanodes, DN1, is down.
 4. The file is opened with append, some more data is added to the file and 
 synced (to only the 2 live nodes DN2 and DN3) -- blk_id_gs2.
 5. Now DN1 is restarted.
 6. In its block report, DN1 reports the FINALIZED block blk_id_gs1, which should 
 be marked corrupt.
 But since the NN has the appended block's state as UnderConstruction, at this time 
 it does not detect this block as corrupt and adds it to the valid block locations.
 As long as the namenode is alive, this datanode will also be considered a 
 valid replica, and read/append will fail on that datanode.



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Commented] (HDFS-5724) modifyCacheDirective logging audit log command wrongly as addCacheDirective

2014-01-08 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-5724?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13865503#comment-13865503
 ] 

Hudson commented on HDFS-5724:
--

SUCCESS: Integrated in Hadoop-Mapreduce-trunk #1663 (See 
[https://builds.apache.org/job/Hadoop-Mapreduce-trunk/1663/])
HDFS-5724. modifyCacheDirective logging audit log command wrongly as 
addCacheDirective (Uma Maheswara Rao G via Colin Patrick McCabe) (cmccabe: 
http://svn.apache.org/viewcvs.cgi/?root=Apache-SVNview=revrev=1556386)
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FSNamesystem.java


 modifyCacheDirective logging audit log command wrongly as addCacheDirective
 ---

 Key: HDFS-5724
 URL: https://issues.apache.org/jira/browse/HDFS-5724
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: namenode
Affects Versions: 3.0.0
Reporter: Uma Maheswara Rao G
Assignee: Uma Maheswara Rao G
Priority: Minor
  Labels: caching
 Attachments: HDFS-5724.patch


 modifyCacheDirective:
 {code}
  if (isAuditEnabled()  isExternalInvocation()) {
 logAuditEvent(success, addCacheDirective, null, null, null);
   }
 {code}



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Commented] (HDFS-5715) Use Snapshot ID to indicate the corresponding Snapshot for a FileDiff/DirectoryDiff

2014-01-08 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-5715?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13865502#comment-13865502
 ] 

Hudson commented on HDFS-5715:
--

SUCCESS: Integrated in Hadoop-Mapreduce-trunk #1663 (See 
[https://builds.apache.org/job/Hadoop-Mapreduce-trunk/1663/])
HDFS-5715. Use Snapshot ID to indicate the corresponding Snapshot for a 
FileDiff/DirectoryDiff. Contributed by Jing Zhao. (jing9: 
http://svn.apache.org/viewcvs.cgi/?root=Apache-SVNview=revrev=1556353)
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/CacheReplicationMonitor.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/CacheManager.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FSDirectory.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FSEditLogLoader.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FSImage.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FSImageFormat.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FSNamesystem.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FSPermissionChecker.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/INode.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/INodeDirectory.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/INodeFile.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/INodeMap.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/INodeReference.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/INodeSymlink.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/INodeWithAdditionalFields.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/INodesInPath.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/snapshot/AbstractINodeDiff.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/snapshot/AbstractINodeDiffList.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/snapshot/DirectoryWithSnapshotFeature.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/snapshot/FileDiff.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/snapshot/FileDiffList.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/snapshot/FileWithSnapshotFeature.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/snapshot/INodeDirectorySnapshottable.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/snapshot/Snapshot.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/snapshot/SnapshotFSImageFormat.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/snapshot/SnapshotManager.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/namenode/TestFSDirectory.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/namenode/TestFSImageWithSnapshot.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/namenode/TestINodeFile.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/namenode/TestSnapshotPathINodes.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/namenode/snapshot/SnapshotTestHelper.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/namenode/snapshot/TestINodeFileUnderConstructionWithSnapshot.java
* 

[jira] [Commented] (HDFS-5726) Fix compilation error in AbstractINodeDiff for JDK7

2014-01-08 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-5726?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13865506#comment-13865506
 ] 

Hudson commented on HDFS-5726:
--

SUCCESS: Integrated in Hadoop-Mapreduce-trunk #1663 (See 
[https://builds.apache.org/job/Hadoop-Mapreduce-trunk/1663/])
HDFS-5726. Fix compilation error in AbstractINodeDiff for JDK7. Contributed by 
Jing Zhao. (jing9: 
http://svn.apache.org/viewcvs.cgi/?root=Apache-SVNview=revrev=1556433)
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/snapshot/AbstractINodeDiff.java


 Fix compilation error in AbstractINodeDiff for JDK7
 ---

 Key: HDFS-5726
 URL: https://issues.apache.org/jira/browse/HDFS-5726
 Project: Hadoop HDFS
  Issue Type: Sub-task
  Components: namenode
Affects Versions: 3.0.0
Reporter: Jing Zhao
Assignee: Jing Zhao
Priority: Minor
 Fix For: 3.0.0

 Attachments: HDFS-5726.000.patch


 HDFS-5715 breaks JDK7 build for the following error:
 {code}
 [ERROR] 
 /home/kasha/code/hadoop-trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/snapshot/AbstractINodeDiff.java:[134,53]
  error: snapshotId has private access in AbstractINodeDiff
 {code}
 This jira will fix the issue.



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Commented] (HDFS-5649) Unregister NFS and Mount service when NFS gateway is shutting down

2014-01-08 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-5649?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13865505#comment-13865505
 ] 

Hudson commented on HDFS-5649:
--

SUCCESS: Integrated in Hadoop-Mapreduce-trunk #1663 (See 
[https://builds.apache.org/job/Hadoop-Mapreduce-trunk/1663/])
HDFS-5649. Unregister NFS and Mount service when NFS gateway is shutting down. 
Contributed by Brandon Li (brandonli: 
http://svn.apache.org/viewcvs.cgi/?root=Apache-SVNview=revrev=1556405)
* 
/hadoop/common/trunk/hadoop-common-project/hadoop-nfs/src/main/java/org/apache/hadoop/mount/MountdBase.java
* 
/hadoop/common/trunk/hadoop-common-project/hadoop-nfs/src/main/java/org/apache/hadoop/nfs/nfs3/Nfs3Base.java
* 
/hadoop/common/trunk/hadoop-common-project/hadoop-nfs/src/main/java/org/apache/hadoop/oncrpc/RpcProgram.java
* 
/hadoop/common/trunk/hadoop-common-project/hadoop-nfs/src/main/java/org/apache/hadoop/portmap/PortmapRequest.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs-nfs/src/main/java/org/apache/hadoop/hdfs/nfs/nfs3/DFSClientCache.java
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt


 Unregister NFS and Mount service when NFS gateway is shutting down
 --

 Key: HDFS-5649
 URL: https://issues.apache.org/jira/browse/HDFS-5649
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: nfs
Affects Versions: 3.0.0
Reporter: Brandon Li
Assignee: Brandon Li
 Fix For: 2.3.0

 Attachments: HDFS-5649.001.patch, HDFS-5649.002.patch


 The services should be unregistered if the gateway is asked to shutdown 
 gracefully.



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Updated] (HDFS-5721) sharedEditsImage in Namenode#initializeSharedEdits() should be closed before method returns

2014-01-08 Thread Ted Yu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-5721?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ted Yu updated HDFS-5721:
-

Attachment: hdfs-5721-v3.txt

Patch v3 addresses Junping's comment.

 sharedEditsImage in Namenode#initializeSharedEdits() should be closed before 
 method returns
 ---

 Key: HDFS-5721
 URL: https://issues.apache.org/jira/browse/HDFS-5721
 Project: Hadoop HDFS
  Issue Type: Bug
Reporter: Ted Yu
Assignee: Ted Yu
Priority: Minor
 Attachments: hdfs-5721-v1.txt, hdfs-5721-v2.txt, hdfs-5721-v3.txt


 At line 901:
 {code}
   FSImage sharedEditsImage = new FSImage(conf,
       Lists.<URI>newArrayList(),
       sharedEditsDirs);
 {code}
 sharedEditsImage is not closed before the method returns.



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Commented] (HDFS-5734) A NN-internal RPC BM service

2014-01-08 Thread jay vyas (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-5734?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13865628#comment-13865628
 ] 

jay vyas commented on HDFS-5734:


Sorry to ask, but... what's BM? Is that the BackupNameNode?

 A NN-internal RPC BM service
 

 Key: HDFS-5734
 URL: https://issues.apache.org/jira/browse/HDFS-5734
 Project: Hadoop HDFS
  Issue Type: Sub-task
  Components: namenode
Reporter: Amir Langer

 Separate the BM from NN by running it with its own thread pool and RPC 
 protocol, but still in the same process as the NN.
 NN and BM will interact through a loopback call that simulates a 
 separate service.
 This sprint still assumes a one-to-one relation between NN and BM and does 
 not split the BM into a separate process; it only simulates such a split inside 
 the same VM. This allows us to defer any configuration issues / testing 
 support / script changes to later tasks. 
 This task will therefore also not handle any HA issues for the BM itself. It 
 will, however, deal with having BM code actually running in a different 
 thread from the NN code and will handle building the initialisation / lifecycle 
 code for an independent BM.



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Commented] (HDFS-5721) sharedEditsImage in Namenode#initializeSharedEdits() should be closed before method returns

2014-01-08 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-5721?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13865669#comment-13865669
 ] 

Hadoop QA commented on HDFS-5721:
-

{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12621987/hdfs-5721-v3.txt
  against trunk revision .

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:red}-1 tests included{color}.  The patch doesn't appear to include 
any new or modified tests.
Please justify why no new tests are needed for this 
patch.
Also please list what manual steps were performed to 
verify this patch.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  The javadoc tool did not generate any 
warning messages.

{color:green}+1 eclipse:eclipse{color}.  The patch built with 
eclipse:eclipse.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 1.3.9) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:green}+1 core tests{color}.  The patch passed unit tests in 
hadoop-hdfs-project/hadoop-hdfs.

{color:green}+1 contrib tests{color}.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-HDFS-Build/5846//testReport/
Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/5846//console

This message is automatically generated.

 sharedEditsImage in Namenode#initializeSharedEdits() should be closed before 
 method returns
 ---

 Key: HDFS-5721
 URL: https://issues.apache.org/jira/browse/HDFS-5721
 Project: Hadoop HDFS
  Issue Type: Bug
Reporter: Ted Yu
Assignee: Ted Yu
Priority: Minor
 Attachments: hdfs-5721-v1.txt, hdfs-5721-v2.txt, hdfs-5721-v3.txt


 At line 901:
 {code}
   FSImage sharedEditsImage = new FSImage(conf,
       Lists.<URI>newArrayList(),
       sharedEditsDirs);
 {code}
 sharedEditsImage is not closed before the method returns.



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Assigned] (HDFS-2261) AOP unit tests are not getting compiled or run

2014-01-08 Thread Karthik Kambatla (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-2261?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Karthik Kambatla reassigned HDFS-2261:
--

Assignee: (was: Karthik Kambatla)

 AOP unit tests are not getting compiled or run 
 ---

 Key: HDFS-2261
 URL: https://issues.apache.org/jira/browse/HDFS-2261
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: test
Affects Versions: 2.0.0-alpha, 2.0.4-alpha
 Environment: 
 https://builds.apache.org/job/Hadoop-Hdfs-trunk-Commit/834/console
 -compile-fault-inject ant target 
Reporter: Giridharan Kesavan
Priority: Minor
 Attachments: hdfs-2261.patch


 The tests in src/test/aop are not getting compiled or run.



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Commented] (HDFS-5579) Under construction files make DataNode decommission take very long hours

2014-01-08 Thread Jing Zhao (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-5579?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13865705#comment-13865705
 ] 

Jing Zhao commented on HDFS-5579:
-

{code}
+    if (bc.isUnderConstruction()) {
+      if (block.equals(bc.getLastBlock()) && curReplicas > minReplication) {
+        continue;
+      }
+      underReplicatedInOpenFiles++;
+    }
{code}

Here, if {{block}} is not the last block and {{block}} is not under-replicated, 
will underReplicatedInOpenFiles still increase?

 Under construction files make DataNode decommission take very long hours
 

 Key: HDFS-5579
 URL: https://issues.apache.org/jira/browse/HDFS-5579
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: namenode
Affects Versions: 1.2.0, 2.2.0
Reporter: zhaoyunjiong
Assignee: zhaoyunjiong
 Attachments: HDFS-5579-branch-1.2.patch, HDFS-5579.patch


 We noticed that some times decommission DataNodes takes very long time, even 
 exceeds 100 hours.
 After check the code, I found that in 
 BlockManager:computeReplicationWorkForBlocks(ListListBlock 
 blocksToReplicate) it won't replicate blocks which belongs to under 
 construction files, however in 
 BlockManager:isReplicationInProgress(DatanodeDescriptor srcNode), if there  
 is block need replicate no matter whether it belongs to under construction or 
 not, the decommission progress will continue running.
 That's the reason some time the decommission takes very long time.



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Created] (HDFS-5737) Replacing only the default ACL can fail to copy unspecified base entries from the access ACL.

2014-01-08 Thread Chris Nauroth (JIRA)
Chris Nauroth created HDFS-5737:
---

 Summary: Replacing only the default ACL can fail to copy 
unspecified base entries from the access ACL.
 Key: HDFS-5737
 URL: https://issues.apache.org/jira/browse/HDFS-5737
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: namenode
Affects Versions: HDFS ACLs (HDFS-4685)
Reporter: Chris Nauroth
Assignee: Chris Nauroth


The final round of changes in HDFS-5673 switched to a search approach instead 
of a scan approach for finding base access entries that need to be copied to 
the default ACL.  However, in the case of doing full replacement on the default 
ACL, the list may not be sorted properly at this point in the code, causing the 
searches to miss the access entries.



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Work started] (HDFS-5737) Replacing only the default ACL can fail to copy unspecified base entries from the access ACL.

2014-01-08 Thread Chris Nauroth (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-5737?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Work on HDFS-5737 started by Chris Nauroth.

 Replacing only the default ACL can fail to copy unspecified base entries from 
 the access ACL.
 -

 Key: HDFS-5737
 URL: https://issues.apache.org/jira/browse/HDFS-5737
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: namenode
Affects Versions: HDFS ACLs (HDFS-4685)
Reporter: Chris Nauroth
Assignee: Chris Nauroth

 The final round of changes in HDFS-5673 switched to a search approach instead 
 of a scan approach for finding base access entries that need to be copied to 
 the default ACL.  However, in the case of doing full replacement on the 
 default ACL, the list may not be sorted properly at this point in the code, 
 causing the searches to miss the access entries.



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Updated] (HDFS-5737) Replacing only the default ACL can fail to copy unspecified base entries from the access ACL.

2014-01-08 Thread Chris Nauroth (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-5737?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chris Nauroth updated HDFS-5737:


Attachment: (was: HDFS-5673.1.patch)

 Replacing only the default ACL can fail to copy unspecified base entries from 
 the access ACL.
 -

 Key: HDFS-5737
 URL: https://issues.apache.org/jira/browse/HDFS-5737
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: namenode
Affects Versions: HDFS ACLs (HDFS-4685)
Reporter: Chris Nauroth
Assignee: Chris Nauroth
 Attachments: HDFS-5737.1.patch


 The final round of changes in HDFS-5673 switched to a search approach instead 
 of a scan approach for finding base access entries that need to be copied to 
 the default ACL.  However, in the case of doing full replacement on the 
 default ACL, the list may not be sorted properly at this point in the code, 
 causing the searches to miss the access entries.



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Updated] (HDFS-5737) Replacing only the default ACL can fail to copy unspecified base entries from the access ACL.

2014-01-08 Thread Chris Nauroth (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-5737?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chris Nauroth updated HDFS-5737:


Attachment: HDFS-5737.1.patch

 Replacing only the default ACL can fail to copy unspecified base entries from 
 the access ACL.
 -

 Key: HDFS-5737
 URL: https://issues.apache.org/jira/browse/HDFS-5737
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: namenode
Affects Versions: HDFS ACLs (HDFS-4685)
Reporter: Chris Nauroth
Assignee: Chris Nauroth
 Attachments: HDFS-5737.1.patch


 The final round of changes in HDFS-5673 switched to a search approach instead 
 of a scan approach for finding base access entries that need to be copied to 
 the default ACL.  However, in the case of doing full replacement on the 
 default ACL, the list may not be sorted properly at this point in the code, 
 causing the searches to miss the access entries.



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Updated] (HDFS-5737) Replacing only the default ACL can fail to copy unspecified base entries from the access ACL.

2014-01-08 Thread Chris Nauroth (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-5737?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chris Nauroth updated HDFS-5737:


Attachment: HDFS-5673.1.patch

Here is a patch to fix the bug.
# The easiest way to fix this is to do another sort at the start of 
{{AclTransformation#copyDefaultsIfNeeded}}.
# This bug had been causing us to produce invalid default ACLs that are missing 
the base entries (owner, group, other).  As an extra defense, I changed the 
validation logic so that it requires the base entries for both access and 
default.  Previously, this was just enforced for access.  To do this, I rewrote 
this portion of the logic to use the search approach, similar to what people 
found more readable for {{AclTransformation#copyDefaultsIfNeeded}}.  In theory, 
the checks on the default ACL should never fail, because we should always copy 
the missing required entries from the access ACL.  However, if there is a bug, 
then it's better to bail earlier instead of producing an invalid default ACL 
that gets used later.
# Added one more test in {{TestAclTransformation}}.  This test failed before I 
made the fix in {{AclTransformation}}.
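
For illustration, a minimal sketch of the re-sort described in point 1 (the 
comparator parameter is a stand-in; this is not the actual patch):

{code}
import java.util.Collections;
import java.util.Comparator;
import java.util.List;

// Illustrative sketch of point 1 above: re-sort the working list before the
// binary searches so lookups of the access base entries cannot miss.
// "entryOrdering" stands in for whatever comparator AclTransformation sorts by.
final class CopyDefaultsSketch {
  static <T> void copyDefaultsIfNeeded(List<T> aclBuilder, Comparator<T> entryOrdering) {
    Collections.sort(aclBuilder, entryOrdering);   // defensive re-sort
    // ... then binary-search for the unnamed owner/group/other access entries
    // and copy any that are missing into the default ACL ...
  }
}
{code}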

 Replacing only the default ACL can fail to copy unspecified base entries from 
 the access ACL.
 -

 Key: HDFS-5737
 URL: https://issues.apache.org/jira/browse/HDFS-5737
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: namenode
Affects Versions: HDFS ACLs (HDFS-4685)
Reporter: Chris Nauroth
Assignee: Chris Nauroth
 Attachments: HDFS-5737.1.patch


 The final round of changes in HDFS-5673 switched to a search approach instead 
 of a scan approach for finding base access entries that need to be copied to 
 the default ACL.  However, in the case of doing full replacement on the 
 default ACL, the list may not be sorted properly at this point in the code, 
 causing the searches to miss the access entries.



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Updated] (HDFS-5714) Use byte array to represent UnderConstruction feature and Snapshot feature for INodeFile

2014-01-08 Thread Jing Zhao (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-5714?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jing Zhao updated HDFS-5714:


Attachment: HDFS-5714.000.patch

Early patch for review. In general, the patch:
1. Encodes the whole FileDiffList into a byte array. Instead of always keeping 
the byte array in memory, the patch currently only encodes a FileDiffList to a 
byte array when loading it from the FSImage for the first time. Later, if the 
corresponding snapshot information is accessed, the byte array is decoded back 
to the FileDiffList and is not encoded again (until the next NN restart).
2. Removes ClientNode from FileUnderConstructionFeature and uses a byte array to 
represent the ClientName and ClientMachine.
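
A rough sketch of the lazy encode/decode idea in point 1 above; all names here 
are hypothetical and not taken from the attached patch:

{code}
// Rough sketch only; the codec and field names are hypothetical.
final class LazyFileDiffList {
  private byte[] encoded;          // compact form, built once when loading from the FSImage
  private Object decodedDiffs;     // stands in for the real FileDiffList

  // Called while loading the FSImage: keep only the compact byte[] in memory.
  void loadEncoded(byte[] serializedDiffs) {
    this.encoded = serializedDiffs;
    this.decodedDiffs = null;
  }

  // First access to the snapshot information pays the decode cost once; after
  // that the expanded object stays in memory until the next NameNode restart.
  Object getDiffs() {
    if (decodedDiffs == null && encoded != null) {
      decodedDiffs = decode(encoded);
      encoded = null;              // not re-encoded until the image is written again
    }
    return decodedDiffs;
  }

  private Object decode(byte[] bytes) {
    return new Object();           // placeholder for the protobuf decoding
  }
}
{code}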

 Use byte array to represent UnderConstruction feature and Snapshot feature 
 for INodeFile
 

 Key: HDFS-5714
 URL: https://issues.apache.org/jira/browse/HDFS-5714
 Project: Hadoop HDFS
  Issue Type: Sub-task
  Components: namenode
Reporter: Jing Zhao
Assignee: Jing Zhao
 Attachments: HDFS-5714.000.patch


 Currently we define specific classes to represent different INode features, 
 such as FileUnderConstructionFeature and FileWithSnapshotFeature. While 
 recording these feature information in memory, the internal information and 
 object references can still cost a lot of memory. For example, for 
 FileWithSnapshotFeature, not considering the INode's local name, the whole 
 FileDiff list (with size n) can cost around 120n bytes.
 In order to decrease the memory usage, we plan to use byte array to record 
 the UnderConstruction feature and Snapshot feature for INodeFile. 
 Specifically, if we use protobuf's encoding, the memory usage for a 
 FileWithSnapshotFeature can be less than 56n bytes.



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Updated] (HDFS-5738) Serialize INode information in protobuf

2014-01-08 Thread Haohui Mai (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-5738?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Haohui Mai updated HDFS-5738:
-

Attachment: HDFS-5738.000.patch

 Serialize INode information in protobuf
 ---

 Key: HDFS-5738
 URL: https://issues.apache.org/jira/browse/HDFS-5738
 Project: Hadoop HDFS
  Issue Type: Sub-task
Reporter: Haohui Mai
Assignee: Haohui Mai
 Attachments: HDFS-5738.000.patch


 This jira proposes to serialize inode information with protobuf. 
 Snapshot-related information is out of the scope of this jira.



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Created] (HDFS-5738) Serialize INode information in protobuf

2014-01-08 Thread Haohui Mai (JIRA)
Haohui Mai created HDFS-5738:


 Summary: Serialize INode information in protobuf
 Key: HDFS-5738
 URL: https://issues.apache.org/jira/browse/HDFS-5738
 Project: Hadoop HDFS
  Issue Type: Sub-task
Reporter: Haohui Mai
Assignee: Haohui Mai
 Attachments: HDFS-5738.000.patch

This jira proposes to serialize inode information with protobuf. 
Snapshot-related information is out of the scope of this jira.



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Commented] (HDFS-5738) Serialize INode information in protobuf

2014-01-08 Thread Jing Zhao (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-5738?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13865762#comment-13865762
 ] 

Jing Zhao commented on HDFS-5738:
-

Can you give more details about how you serialize the inode information (e.g., 
traversing the fsdirectory tree or using the inodesMap, etc.)? This information 
will help others get a better understanding of your patch.

 Serialize INode information in protobuf
 ---

 Key: HDFS-5738
 URL: https://issues.apache.org/jira/browse/HDFS-5738
 Project: Hadoop HDFS
  Issue Type: Sub-task
Reporter: Haohui Mai
Assignee: Haohui Mai
 Attachments: HDFS-5738.000.patch


 This jira proposes to serialize inode information with protobuf. 
 Snapshot-related information is out of the scope of this jira.



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Commented] (HDFS-5483) NN should gracefully handle multiple block replicas on same DN

2014-01-08 Thread Eric Sirianni (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-5483?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13865782#comment-13865782
 ] 

Eric Sirianni commented on HDFS-5483:
-

Arpit - I noticed that the supplied patch only ignores the extra replica in the 
full Block Report code path ({{processReport()}}).  Doesn't this leave the 
assertion still exposed on the {{BLOCK_RECEIVED}} 
({{processIncrementalReportedBlock()}}) path?  It seems like this code might 
need to be changed to search based on storage ID also:
{code}
if (reportedState == ReplicaState.FINALIZED
    && (storedBlock.findDatanode(dn) < 0
        || corruptReplicas.isReplicaCorrupt(storedBlock, dn))) {
  toAdd.add(storedBlock);
}
{code}
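
For illustration only, the same check keyed by storage rather than by DataNode 
might look roughly like this; the {{findStorageInfo}} call signature and the 
{{storageInfo}} variable are assumptions here, not a quote from any patch:

{code}
// Illustrative variant only (not the committed fix): search by storage,
// assuming a findStorageInfo lookup that returns a negative index when the
// replica is not yet recorded for that storage.
if (reportedState == ReplicaState.FINALIZED
    && (storedBlock.findStorageInfo(storageInfo) < 0
        || corruptReplicas.isReplicaCorrupt(storedBlock, dn))) {
  toAdd.add(storedBlock);
}
{code}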


 NN should gracefully handle multiple block replicas on same DN
 --

 Key: HDFS-5483
 URL: https://issues.apache.org/jira/browse/HDFS-5483
 Project: Hadoop HDFS
  Issue Type: Sub-task
  Components: namenode
Affects Versions: Heterogeneous Storage (HDFS-2832)
Reporter: Arpit Agarwal
 Fix For: 3.0.0

 Attachments: h5483.02.patch


 {{BlockManager#reportDiff}} can cause an assertion failure in 
 {{BlockInfo#moveBlockToHead}} if the block report shows the same block as 
 belonging to more than one storage.
 The issue is that {{moveBlockToHead}} assumes it will find the 
 DatanodeStorageInfo for the given block.
 Exception details:
 {code}
 java.lang.AssertionError: Index is out of bound
 at 
 org.apache.hadoop.hdfs.server.blockmanagement.BlockInfo.setNext(BlockInfo.java:152)
 at 
 org.apache.hadoop.hdfs.server.blockmanagement.BlockInfo.moveBlockToHead(BlockInfo.java:351)
 at 
 org.apache.hadoop.hdfs.server.blockmanagement.DatanodeStorageInfo.moveBlockToHead(DatanodeStorageInfo.java:243)
 at 
 org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.reportDiff(BlockManager.java:1841)
 at 
 org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.processReport(BlockManager.java:1709)
 at 
 org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.processReport(BlockManager.java:1637)
 at 
 org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.blockReport(NameNodeRpcServer.java:984)
 at 
 org.apache.hadoop.hdfs.server.datanode.TestDataNodeVolumeFailure.testVolumeFailure(TestDataNodeVolumeFailure.java:165)
 {code}



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Created] (HDFS-5739) ACL RPC must allow null name or null permissions in ACL entries.

2014-01-08 Thread Chris Nauroth (JIRA)
Chris Nauroth created HDFS-5739:
---

 Summary: ACL RPC must allow null name or null permissions in ACL 
entries.
 Key: HDFS-5739
 URL: https://issues.apache.org/jira/browse/HDFS-5739
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: hdfs-client, namenode
Affects Versions: HDFS ACLs (HDFS-4685)
Reporter: Chris Nauroth
Assignee: Chris Nauroth


Currently, the ACL RPC defines ACL entries with required fields for name and 
permissions.  These fields actually need to be optional.  The name can be null 
to represent unnamed ACL entries, such as the file owner or mask.  Permissions 
can be null when passed in an ACL spec to remove ACL entries via 
{{FileSystem#removeAclEntries}}.



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Work started] (HDFS-5739) ACL RPC must allow null name or null permissions in ACL entries.

2014-01-08 Thread Chris Nauroth (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-5739?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Work on HDFS-5739 started by Chris Nauroth.

 ACL RPC must allow null name or null permissions in ACL entries.
 

 Key: HDFS-5739
 URL: https://issues.apache.org/jira/browse/HDFS-5739
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: hdfs-client, namenode
Affects Versions: HDFS ACLs (HDFS-4685)
Reporter: Chris Nauroth
Assignee: Chris Nauroth

 Currently, the ACL RPC defines ACL entries with required fields for name and 
 permissions.  These fields actually need to be optional.  The name can be 
 null to represent unnamed ACL entries, such as the file owner or mask.  
 Permissions can be null when passed in an ACL spec to remove ACL entries via 
 {{FileSystem#removeAclEntries}}.



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Work started] (HDFS-5677) Need error checking for HA cluster configuration

2014-01-08 Thread Vincent Sheffer (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-5677?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Work on HDFS-5677 started by Vincent Sheffer.

 Need error checking for HA cluster configuration
 

 Key: HDFS-5677
 URL: https://issues.apache.org/jira/browse/HDFS-5677
 Project: Hadoop HDFS
  Issue Type: Improvement
  Components: datanode, ha
Affects Versions: 2.0.6-alpha
 Environment: centos6.5, oracle jdk6 45, 
Reporter: Vincent Sheffer
Assignee: Vincent Sheffer
Priority: Minor
 Fix For: 3.0.0, 2.3.0


 If a node is declared in the *dfs.ha.namenodes.myCluster* but is _not_ later 
 defined in subsequent *dfs.namenode.servicerpc-address.myCluster.nodename* or 
 *dfs.namenode.rpc-address.myCluster.XXX* properties no error or warning 
 message is provided to indicate that.
 The only indication of a problem is a log message like the following:
 {code}
 WARN org.apache.hadoop.hdfs.server.datanode.DataNode: Problem connecting to 
 server: myCluster:8020
 {code}
 Another way to look at this is that no error or warning is provided when a 
 servicerpc-address/rpc-address property is defined for a node without a 
 corresponding node declared in *dfs.ha.namenodes.myCluster*.
 This arose when I had a typo in the *dfs.ha.namenodes.myCluster* property for 
 one of my node names.  It would be very helpful to have at least a warning 
 message on startup if there is a configuration problem like this.
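
For illustration, a minimal sketch of such a startup check using the standard 
Configuration API (the class and method names below are hypothetical; the 
property names are the ones quoted above):

{code}
import org.apache.hadoop.conf.Configuration;

// Sketch of the kind of warning the reporter asks for; illustrative only,
// not an existing DataNode/NameNode validation routine.
final class HaConfigCheckSketch {
  static void warnOnDanglingNameNodeIds(Configuration conf, String nameservice) {
    for (String nnId : conf.getTrimmedStrings("dfs.ha.namenodes." + nameservice)) {
      String rpcKey = "dfs.namenode.rpc-address." + nameservice + "." + nnId;
      String serviceRpcKey = "dfs.namenode.servicerpc-address." + nameservice + "." + nnId;
      if (conf.get(rpcKey) == null && conf.get(serviceRpcKey) == null) {
        System.err.println("WARN: NameNode id '" + nnId + "' is listed in dfs.ha.namenodes."
            + nameservice + " but has no rpc-address/servicerpc-address configured");
      }
    }
  }
}
{code}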



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Updated] (HDFS-5739) ACL RPC must allow null name or null permissions in ACL entries.

2014-01-08 Thread Chris Nauroth (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-5739?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chris Nauroth updated HDFS-5739:


Attachment: HDFS-5739.1.patch

This patch switches the fields to optional in the protobuf spec, updates the 
translation logic in {{PBHelper}} and expands on the tests in {{TestPBHelper}} 
to cover these cases.
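
As a rough illustration of that translation pattern (not code from the patch; 
the builder calls just follow the usual protobuf-generated style), an optional 
field is only copied when it was actually set:

{code}
// Illustrative fragment: only copy the name when it is present, so unnamed
// entries (owner, mask, ...) leave the protobuf field unset.
static AclEntryProto toProto(AclEntry e) {
  AclEntryProto.Builder builder = AclEntryProto.newBuilder();
  if (e.getName() != null) {
    builder.setName(e.getName());
  }
  // ... scope, type and permissions conversion elided ...
  return builder.build();
}
{code}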

 ACL RPC must allow null name or null permissions in ACL entries.
 

 Key: HDFS-5739
 URL: https://issues.apache.org/jira/browse/HDFS-5739
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: hdfs-client, namenode
Affects Versions: HDFS ACLs (HDFS-4685)
Reporter: Chris Nauroth
Assignee: Chris Nauroth
 Attachments: HDFS-5739.1.patch


 Currently, the ACL RPC defines ACL entries with required fields for name and 
 permissions.  These fields actually need to be optional.  The name can be 
 null to represent unnamed ACL entries, such as the file owner or mask.  
 Permissions can be null when passed in an ACL spec to remove ACL entries via 
 {{FileSystem#removeAclEntries}}.



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Created] (HDFS-5740) getmerge file system shell command needs error message for user error

2014-01-08 Thread John Pfuntner (JIRA)
John Pfuntner created HDFS-5740:
---

 Summary: getmerge file system shell command needs error message 
for user error
 Key: HDFS-5740
 URL: https://issues.apache.org/jira/browse/HDFS-5740
 Project: Hadoop HDFS
  Issue Type: Improvement
  Components: hdfs-client
Affects Versions: 1.1.2
 Environment: {noformat}[jpfuntner@h58 tmp]$ cat /etc/redhat-release
Red Hat Enterprise Linux Server release 6.0 (Santiago)
[jpfuntner@h58 tmp]$ hadoop version
Hadoop 1.1.2.21
Subversion  -r 
Compiled by jenkins on Thu Jan 10 03:38:39 PST 2013
From source with checksum ce0aa0de785f572347f1afee69c73861{noformat}
Reporter: John Pfuntner
Priority: Minor


I naively tried a {{getmerge}} operation but it didn't seem to do anything and 
there was no error message:

{noformat}[jpfuntner@h58 tmp]$ hadoop fs -mkdir /user/jpfuntner/tmp
[jpfuntner@h58 tmp]$ num=0; while [ $num -lt 5 ]; do echo file$num | hadoop fs 
-put - /user/jpfuntner/tmp/file$num; let num=num+1; done
[jpfuntner@h58 tmp]$ ls -A
[jpfuntner@h58 tmp]$ hadoop fs -getmerge /user/jpfuntner/tmp/file* files.txt
[jpfuntner@h58 tmp]$ ls -A
[jpfuntner@h58 tmp]$ hadoop fs -ls /user/jpfuntner/tmp
Found 5 items
-rw---   3 jpfuntner hdfs  6 2014-01-08 17:37 
/user/jpfuntner/tmp/file0
-rw---   3 jpfuntner hdfs  6 2014-01-08 17:37 
/user/jpfuntner/tmp/file1
-rw---   3 jpfuntner hdfs  6 2014-01-08 17:37 
/user/jpfuntner/tmp/file2
-rw---   3 jpfuntner hdfs  6 2014-01-08 17:37 
/user/jpfuntner/tmp/file3
-rw---   3 jpfuntner hdfs  6 2014-01-08 17:37 
/user/jpfuntner/tmp/file4
[jpfuntner@h58 tmp]$ {noformat}

It was pointed out to me that I made a mistake and my source should have been a 
directory not a set of regular files.  It works if I use the directory:

{noformat}[jpfuntner@h58 tmp]$ hadoop fs -getmerge /user/jpfuntner/tmp/ 
files.txt
[jpfuntner@h58 tmp]$ ls -A
files.txt  .files.txt.crc
[jpfuntner@h58 tmp]$ cat files.txt
file0
file1
file2
file3
file4
[jpfuntner@h58 tmp]$ {noformat}

I think the {{getmerge}} command should issue an error message to let the user 
know they made a mistake.



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Commented] (HDFS-5737) Replacing only the default ACL can fail to copy unspecified base entries from the access ACL.

2014-01-08 Thread Haohui Mai (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-5737?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13865903#comment-13865903
 ] 

Haohui Mai commented on HDFS-5737:
--

The patch looks good to me. +1. However, there are a couple of efficiency issues 
that can be addressed in separate jiras:

# Implement your own binary search so that (1) it supports searching within a 
sub-list of the collection, and (2) it always returns the lowest matching element 
in the list. That way you can make finding the pivot more efficient, and you don't 
need to create sub-lists in {{copyDefaultsIfNeeded}}.
# Since you know the pivot, you can insert the default entries at the pivot 
position and sort that sub-list. Alternatively you can separate the ACLs into 
default entries and access entries, and concatenate them at the very end.
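
For illustration, a minimal sketch of such a range-aware, lowest-match binary 
search (generic and self-contained; not tied to the ACL code):

{code}
import java.util.Comparator;
import java.util.List;

// Minimal sketch of the custom search suggested above: search only the
// [from, to) range and, on a hit, return the lowest index whose element
// compares equal to the key (unlike Collections#binarySearch, which may
// return any matching index).
final class LowerBoundSearch {
  static <T> int lowerBound(List<T> list, int from, int to, T key, Comparator<? super T> cmp) {
    int lo = from;
    int hi = to;
    while (lo < hi) {
      int mid = (lo + hi) >>> 1;
      if (cmp.compare(list.get(mid), key) < 0) {
        lo = mid + 1;
      } else {
        hi = mid;
      }
    }
    // lo is now the first index in [from, to) whose element is >= key.
    return (lo < to && cmp.compare(list.get(lo), key) == 0) ? lo : -(lo + 1);
  }
}
{code}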

 Replacing only the default ACL can fail to copy unspecified base entries from 
 the access ACL.
 -

 Key: HDFS-5737
 URL: https://issues.apache.org/jira/browse/HDFS-5737
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: namenode
Affects Versions: HDFS ACLs (HDFS-4685)
Reporter: Chris Nauroth
Assignee: Chris Nauroth
 Attachments: HDFS-5737.1.patch


 The final round of changes in HDFS-5673 switched to a search approach instead 
 of a scan approach for finding base access entries that need to be copied to 
 the default ACL.  However, in the case of doing full replacement on the 
 default ACL, the list may not be sorted properly at this point in the code, 
 causing the searches to miss the access entries.



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Commented] (HDFS-5739) ACL RPC must allow null name or null permissions in ACL entries.

2014-01-08 Thread Haohui Mai (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-5739?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13865918#comment-13865918
 ] 

Haohui Mai commented on HDFS-5739:
--

The name part looks good.

Since {{AclEntry#permissions}} is an enum, from a semantic point of view I would 
prefer that it be non-nullable. Is it possible to simply ignore the value in 
{{removeAclEntries}}?

 ACL RPC must allow null name or null permissions in ACL entries.
 

 Key: HDFS-5739
 URL: https://issues.apache.org/jira/browse/HDFS-5739
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: hdfs-client, namenode
Affects Versions: HDFS ACLs (HDFS-4685)
Reporter: Chris Nauroth
Assignee: Chris Nauroth
 Attachments: HDFS-5739.1.patch


 Currently, the ACL RPC defines ACL entries with required fields for name and 
 permissions.  These fields actually need to be optional.  The name can be 
 null to represent unnamed ACL entries, such as the file owner or mask.  
 Permissions can be null when passed in an ACL spec to remove ACL entries via 
 {{FileSystem#removeAclEntries}}.



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Commented] (HDFS-5612) NameNode: change all permission checks to enforce ACLs in addition to permissions.

2014-01-08 Thread Haohui Mai (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-5612?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13865931#comment-13865931
 ] 

Haohui Mai commented on HDFS-5612:
--

Can you specify the invariants (i.e., the correctness conditions) of a valid 
list of AclEntry? I think it is important to document them, as {{checkAcl}} 
depends on these invariants.

It seems that the following invariants hold for a valid list of AclEntry:

# The list has to be sorted. 
# Each entry in the list is unique.
# Default entries do not have names.
# There is at least one user / group / other entry that does not have a name. (Why?)

I guess it is not immediately clear to me what the semantics of the name of 
an entry are. Can you please explain?

 NameNode: change all permission checks to enforce ACLs in addition to 
 permissions.
 --

 Key: HDFS-5612
 URL: https://issues.apache.org/jira/browse/HDFS-5612
 Project: Hadoop HDFS
  Issue Type: Sub-task
  Components: namenode
Affects Versions: HDFS ACLs (HDFS-4685)
Reporter: Chris Nauroth
Assignee: Chris Nauroth
 Attachments: HDFS-5612.1.patch, HDFS-5612.2.patch


 All {{NameNode}} code paths that enforce permissions must be updated so that 
 they also enforce ACLs.



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Commented] (HDFS-5738) Serialize INode information in protobuf

2014-01-08 Thread Haohui Mai (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-5738?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13865941#comment-13865941
 ] 

Haohui Mai commented on HDFS-5738:
--

This patch serializes the inode information into two sections, INODE and 
INODE_DIRECTORY. At a high level, the inode information can be seen as a graph, 
where the inodes are the vertices and the references are the edges. The INODE 
section records information about each inode, such as atime / mtime. The 
INODE_DIRECTORY section records all the children of each inode.

The design simplifies the serialization of snapshot information.
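
A hypothetical outline of that two-section layout (the writer and record types 
below are illustrative, not the ones in the attached patch):

{code}
import java.util.List;

// Hypothetical outline of the two-section layout described above.
final class InodeSectionSketch {
  interface SectionWriter {
    void beginSection(String name);
    void writeRecord(Object... fields);
    void endSection();
  }

  static final class InodeRecord {
    long id;
    String name;
    boolean isDirectory;
    List<Long> childIds;   // edges of the graph: references to child inodes
  }

  static void saveInodes(SectionWriter out, Iterable<InodeRecord> inodes) {
    out.beginSection("INODE");                 // vertices: per-inode attributes
    for (InodeRecord inode : inodes) {
      out.writeRecord(inode.id, inode.name);
    }
    out.endSection();

    out.beginSection("INODE_DIRECTORY");       // edges: children per directory
    for (InodeRecord inode : inodes) {
      if (inode.isDirectory) {
        out.writeRecord(inode.id, inode.childIds);
      }
    }
    out.endSection();
  }
}
{code}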

 Serialize INode information in protobuf
 ---

 Key: HDFS-5738
 URL: https://issues.apache.org/jira/browse/HDFS-5738
 Project: Hadoop HDFS
  Issue Type: Sub-task
Reporter: Haohui Mai
Assignee: Haohui Mai
 Attachments: HDFS-5738.000.patch


 This jira proposes to serialize inode information with protobuf. 
 Snapshot-related information is out of the scope of this jira.



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Updated] (HDFS-5737) Replacing only the default ACL can fail to copy unspecified base entries from the access ACL.

2014-01-08 Thread Chris Nauroth (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-5737?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chris Nauroth updated HDFS-5737:


Hadoop Flags: Reviewed

Thanks for the review, Haohui.  I'll commit this in a moment.

bq. Implement your own binary search so that (1) it supports finding in a sub 
list of the collection, and (2) it always returns the lowest element in the 
list. That way you can make finding the pivot more efficient, and you don't 
need to create sub lists in copyDefaultsIfNeeded.

My understanding is that {{ArrayList#subList}} returns an alternative view over 
the same underlying array, just with a different offset and length to pin it 
within the requested range.  This would mean that there is no cost incurred for 
copying the underlying data, just some extra math to deal with offset 
calculations, so perhaps the efficiency gain would be minor.  Here is the code 
for {{ArrayList#subList}}:

http://grepcode.com/file/repository.grepcode.com/java/root/jdk/openjdk/6-b14/java/util/ArrayList.java#876

Agreed on point 2 though that we'd need a custom binary search variant if we 
want to do that.  {{Collections#binarySearch}} can't do it.
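
A quick self-contained demonstration that the sub-list is a view over the 
backing list rather than a copy:

{code}
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Demonstrates that ArrayList#subList returns a view: changes made through the
// sub-list are visible in the backing list, and creating the range copies no data.
public class SubListViewDemo {
  public static void main(String[] args) {
    List<String> entries = new ArrayList<>(Arrays.asList("a", "b", "c", "d"));
    List<String> view = entries.subList(1, 3);   // covers "b" and "c"
    view.set(0, "B");                            // writes through to the backing list
    System.out.println(entries);                 // prints [a, B, c, d]
  }
}
{code}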

 Replacing only the default ACL can fail to copy unspecified base entries from 
 the access ACL.
 -

 Key: HDFS-5737
 URL: https://issues.apache.org/jira/browse/HDFS-5737
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: namenode
Affects Versions: HDFS ACLs (HDFS-4685)
Reporter: Chris Nauroth
Assignee: Chris Nauroth
 Attachments: HDFS-5737.1.patch


 The final round of changes in HDFS-5673 switched to a search approach instead 
 of a scan approach for finding base access entries that need to be copied to 
 the default ACL.  However, in the case of doing full replacement on the 
 default ACL, the list may not be sorted properly at this point in the code, 
 causing the searches to miss the access entries.



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Commented] (HDFS-5483) NN should gracefully handle multiple block replicas on same DN

2014-01-08 Thread Arpit Agarwal (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-5483?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13865965#comment-13865965
 ] 

Arpit Agarwal commented on HDFS-5483:
-

Eric, the blockreceived path won't assert since it doesn't try to manipulate 
the BlockInfo list directly.

However looking at it some more I think we can eliminate the findDatanode 
routine, or at least make it 'private'. I'll file a separate Jira for it.

 NN should gracefully handle multiple block replicas on same DN
 --

 Key: HDFS-5483
 URL: https://issues.apache.org/jira/browse/HDFS-5483
 Project: Hadoop HDFS
  Issue Type: Sub-task
  Components: namenode
Affects Versions: Heterogeneous Storage (HDFS-2832)
Reporter: Arpit Agarwal
 Fix For: 3.0.0

 Attachments: h5483.02.patch


 {{BlockManager#reportDiff}} can cause an assertion failure in 
 {{BlockInfo#moveBlockToHead}} if the block report shows the same block as 
 belonging to more than one storage.
 The issue is that {{moveBlockToHead}} assumes it will find the 
 DatanodeStorageInfo for the given block.
 Exception details:
 {code}
 java.lang.AssertionError: Index is out of bound
 at 
 org.apache.hadoop.hdfs.server.blockmanagement.BlockInfo.setNext(BlockInfo.java:152)
 at 
 org.apache.hadoop.hdfs.server.blockmanagement.BlockInfo.moveBlockToHead(BlockInfo.java:351)
 at 
 org.apache.hadoop.hdfs.server.blockmanagement.DatanodeStorageInfo.moveBlockToHead(DatanodeStorageInfo.java:243)
 at 
 org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.reportDiff(BlockManager.java:1841)
 at 
 org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.processReport(BlockManager.java:1709)
 at 
 org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.processReport(BlockManager.java:1637)
 at 
 org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.blockReport(NameNodeRpcServer.java:984)
 at 
 org.apache.hadoop.hdfs.server.datanode.TestDataNodeVolumeFailure.testVolumeFailure(TestDataNodeVolumeFailure.java:165)
 {code}



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Created] (HDFS-5741) BlockInfo#findDataNode can be deprecated

2014-01-08 Thread Arpit Agarwal (JIRA)
Arpit Agarwal created HDFS-5741:
---

 Summary: BlockInfo#findDataNode can be deprecated
 Key: HDFS-5741
 URL: https://issues.apache.org/jira/browse/HDFS-5741
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: namenode
Affects Versions: 3.0.0, 2.4.0
Reporter: Arpit Agarwal
Assignee: Arpit Agarwal


{{BlockInfo#findDataNode}} can be replaced with {{BlockInfo#findStorageInfo}} 
everywhere else except in {{#addStorage}}.



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Updated] (HDFS-5741) BlockInfo#findDataNode can be deprecated

2014-01-08 Thread Arpit Agarwal (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-5741?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Arpit Agarwal updated HDFS-5741:


Remaining Estimate: (was: 2h)
 Original Estimate: (was: 2h)

 BlockInfo#findDataNode can be deprecated
 

 Key: HDFS-5741
 URL: https://issues.apache.org/jira/browse/HDFS-5741
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: namenode
Affects Versions: 3.0.0, 2.4.0
Reporter: Arpit Agarwal
Assignee: Arpit Agarwal

 {{BlockInfo#findDataNode}} can be replaced with {{BlockInfo#findStorageInfo}} 
 everywhere else except in {{#addStorage}}.



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Updated] (HDFS-5741) BlockInfo#findDataNode can be deprecated

2014-01-08 Thread Arpit Agarwal (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-5741?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Arpit Agarwal updated HDFS-5741:


Priority: Minor  (was: Major)

 BlockInfo#findDataNode can be deprecated
 

 Key: HDFS-5741
 URL: https://issues.apache.org/jira/browse/HDFS-5741
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: namenode
Affects Versions: 3.0.0, 2.4.0
Reporter: Arpit Agarwal
Assignee: Arpit Agarwal
Priority: Minor

 NN now tracks replicas by storage, so {{BlockInfo#findDataNode}} can be 
 replaced with {{BlockInfo#findStorageInfo}}.
 {{BlockManager#reportDiff}} is being fixed as part of HDFS-5483, this Jira is 
 to fix the rest of the callers.
 [suggested by [~sirianni] on HDFS-5483]



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Updated] (HDFS-5741) BlockInfo#findDataNode can be deprecated

2014-01-08 Thread Arpit Agarwal (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-5741?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Arpit Agarwal updated HDFS-5741:


Description: 
NN now tracks replicas by storage, so {{BlockInfo#findDataNode}} can be 
replaced with {{BlockInfo#findStorageInfo}}.

{{BlockManager#reportDiff}} is being fixed as part of HDFS-5483, this Jira is 
to fix the rest of the callers.

[suggested by [~sirianni] on HDFS-5483]

  was:{{BlockInfo#findDataNode}} can be replaced with 
{{BlockInfo#findStorageInfo}} everywhere else except in {{#addStorage}}.


 BlockInfo#findDataNode can be deprecated
 

 Key: HDFS-5741
 URL: https://issues.apache.org/jira/browse/HDFS-5741
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: namenode
Affects Versions: 3.0.0, 2.4.0
Reporter: Arpit Agarwal
Assignee: Arpit Agarwal

 NN now tracks replicas by storage, so {{BlockInfo#findDataNode}} can be 
 replaced with {{BlockInfo#findStorageInfo}}.
 {{BlockManager#reportDiff}} is being fixed as part of HDFS-5483, this Jira is 
 to fix the rest of the callers.
 [suggested by [~sirianni] on HDFS-5483]



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Updated] (HDFS-5739) ACL RPC must allow null name or null permissions in ACL entries.

2014-01-08 Thread Chris Nauroth (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-5739?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chris Nauroth updated HDFS-5739:


Attachment: HDFS-5739.2.patch

Thanks for the review, Haohui.  I'm attaching patch version 2 to show what this 
looks like when we keep permissions required.

bq. Is it possible to simply ignore the value in removeAclEntries?

Yes, the logic currently ignores it.  If we wanted to strictly match existing 
implementations like Linux, then we would actually send an error back to the 
user if they tried to specify permissions in a remove call.  I don't know that 
we need to be rigid about that, and we could always choose to implement that 
check at the CLI layer if we want it, so I'm fine with this approach.

The effect of this is that protobuf will default-initialize the enum field to 
the 0th element (NONE) on conversion from proto to model.  For symmetry, this 
patch adds the corresponding logic in the conversion from model to proto too.
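A minimal, self-contained sketch of that symmetry, with hypothetical names (the real PB translator and generated protobuf classes differ): an absent permission is encoded as the 0th enum element on the wire and mapped back to null in the model.

{code}
// Hypothetical stand-ins; the real PBHelper/AclEntry classes differ.
enum FsActionProto { NONE, READ, WRITE, READ_WRITE }

class AclEntryModel {
  final String name;          // may be null for unnamed entries
  final FsActionProto perm;   // may be null in a removeAclEntries spec
  AclEntryModel(String name, FsActionProto perm) { this.name = name; this.perm = perm; }
}

public class AclPbConversionSketch {
  // Model -> proto: an unset permission is encoded as the 0th element, NONE.
  static FsActionProto toProto(FsActionProto modelPerm) {
    return modelPerm == null ? FsActionProto.NONE : modelPerm;
  }

  // Proto -> model: NONE on the wire is decoded back to "no permission given".
  static FsActionProto fromProto(FsActionProto wirePerm) {
    return wirePerm == FsActionProto.NONE ? null : wirePerm;
  }

  public static void main(String[] args) {
    AclEntryModel removeSpec = new AclEntryModel("haohui", null);
    FsActionProto onWire = toProto(removeSpec.perm);   // NONE
    FsActionProto decoded = fromProto(onWire);         // null again
    System.out.println(onWire + " -> " + decoded);
  }
}
{code}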

 ACL RPC must allow null name or null permissions in ACL entries.
 

 Key: HDFS-5739
 URL: https://issues.apache.org/jira/browse/HDFS-5739
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: hdfs-client, namenode
Affects Versions: HDFS ACLs (HDFS-4685)
Reporter: Chris Nauroth
Assignee: Chris Nauroth
 Attachments: HDFS-5739.1.patch, HDFS-5739.2.patch


 Currently, the ACL RPC defines ACL entries with required fields for name and 
 permissions.  These fields actually need to be optional.  The name can be 
 null to represent unnamed ACL entries, such as the file owner or mask.  
 Permissions can be null when passed in an ACL spec to remove ACL entries via 
 {{FileSystem#removeAclEntries}}.



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Created] (HDFS-5742) DatanodeCluster (mini cluster of DNs) fails to start

2014-01-08 Thread Arpit Agarwal (JIRA)
Arpit Agarwal created HDFS-5742:
---

 Summary: DatanodeCluster (mini cluster of DNs) fails to start
 Key: HDFS-5742
 URL: https://issues.apache.org/jira/browse/HDFS-5742
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: test
Affects Versions: 3.0.0
Reporter: Arpit Agarwal
Assignee: Arpit Agarwal
Priority: Minor


DatanodeCluster fails to start with NPE in MiniDFSCluster.

Looks like a simple bug in {{MiniDFSCluster#determineDfsBaseDir}} - missing 
check for null configuration.
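A hypothetical sketch of the kind of guard implied above; the configuration key and fallback value here are assumptions for illustration, and the real MiniDFSCluster logic is more involved.

{code}
import java.util.Properties;

public class BaseDirSketch {
  static final String DEFAULT_BASE_DIR = "target/test/data/dfs";       // assumed default
  static final String HDFS_MINIDFS_BASEDIR = "hdfs.minidfs.basedir";   // assumed key

  static String determineDfsBaseDir(Properties conf) {
    if (conf != null) {
      String dir = conf.getProperty(HDFS_MINIDFS_BASEDIR);
      if (dir != null) {
        return dir;
      }
    }
    // Fall back to a default instead of dereferencing a null configuration.
    return DEFAULT_BASE_DIR;
  }

  public static void main(String[] args) {
    System.out.println(determineDfsBaseDir(null));  // no NPE, prints the default
  }
}
{code}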



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Commented] (HDFS-5739) ACL RPC must allow null name or null permissions in ACL entries.

2014-01-08 Thread Haohui Mai (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-5739?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13865994#comment-13865994
 ] 

Haohui Mai commented on HDFS-5739:
--

I think that it is fine to check it at the CLI layer. +1 on the v2 patch.

 ACL RPC must allow null name or null permissions in ACL entries.
 

 Key: HDFS-5739
 URL: https://issues.apache.org/jira/browse/HDFS-5739
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: hdfs-client, namenode
Affects Versions: HDFS ACLs (HDFS-4685)
Reporter: Chris Nauroth
Assignee: Chris Nauroth
 Attachments: HDFS-5739.1.patch, HDFS-5739.2.patch


 Currently, the ACL RPC defines ACL entries with required fields for name and 
 permissions.  These fields actually need to be optional.  The name can be 
 null to represent unnamed ACL entries, such as the file owner or mask.  
 Permissions can be null when passed in an ACL spec to remove ACL entries via 
 {{FileSystem#removeAclEntries}}.



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Commented] (HDFS-5612) NameNode: change all permission checks to enforce ACLs in addition to permissions.

2014-01-08 Thread Chris Nauroth (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-5612?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13865997#comment-13865997
 ] 

Chris Nauroth commented on HDFS-5612:
-

Sure thing.  Here is a list of the invariants.  I'll also fold this list into 
the comments in a new patch later.
# The list must be sorted.
# Each entry in the list is unique.
# There is exactly one each of the unnamed user / group / other entries.  These 
entries are identical to the classic owner / group / other permissions encoded 
in permission bits today.  The ACL enforcement algorithm states that owner 
permissions trump named user permissions.  This becomes important if the file 
owner also has a named user entry in the ACL.  Assume the file owner is haohui, 
and the owner permissions are rw-, but there is also a named user entry for 
user:haohui:r--.  In this case, the owner entry must take precedence over the 
named user entry so that you get read-write access.  Additionally, the 
effective permissions granted to a user through groups must include the 
permissions of the file's group (if the user is a member).
# The mask entry, if present, must not have a name.  (The name would be 
meaningless.)
# The owner entry must not have a name.  (The name would be meaningless.)
# There may be any number of named user entries.  These entries are used if the 
username is a specific match (assuming the user is not the owner as discussed 
above).
# There may be any number of named group entries.  Assuming the user is not the 
owner, and there is no named user entry matching that user, and the user is a 
member of at least one named group or the file's group, then the user's 
effective permissions are the union of permissions for all such groups in which 
the user is a member.
# Default entries are ignored during permission enforcement.

Regarding default entries, these are not used during permission enforcement at 
all, so there really are no invariants related to the default ACL within the 
context of {{checkAcl}}.  However, the default ACL on a directory will be 
copied to the access ACL of its newly created child inodes.  Since the default 
ACL eventually becomes an access ACL for a different inode, we can say that the 
same set of invariants must hold for the default ACL entries.  (Otherwise, we'd 
have a violation of invariants later when it comes time to run {{checkAcl}} on 
that child inode.)
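To make the enforcement rules above concrete, here is a small, self-contained sketch (not the real FSPermissionChecker, and mask application is omitted) of how effective permissions could be resolved under those invariants:

{code}
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Objects;
import java.util.Set;

public class AclCheckSketch {
  enum Type { USER, GROUP, OTHER, MASK }

  // Simplified placeholder entry; the real NameNode ACL classes are richer.
  static final class Entry {
    final Type type; final String name; final int perms;  // perms as rwx bits: 4|2|1
    Entry(Type type, String name, int perms) { this.type = type; this.name = name; this.perms = perms; }
  }

  /** Resolve effective permissions for a user per the invariants listed above. */
  static int effectivePerms(List<Entry> acl, String owner, String user,
                            Set<String> userGroups, String fileGroup) {
    // The unnamed owner entry trumps any named user entry for the file owner.
    if (user.equals(owner)) {
      return find(acl, Type.USER, null);
    }
    // A matching named user entry is used next (mask application omitted here).
    Integer named = tryFind(acl, Type.USER, user);
    if (named != null) {
      return named;
    }
    // Otherwise union the permissions of all matching group entries, including
    // the file's (unnamed) group entry when the user is a member of that group.
    int union = 0;
    boolean matched = false;
    for (Entry e : acl) {
      if (e.type != Type.GROUP) continue;
      boolean isFileGroup = e.name == null && userGroups.contains(fileGroup);
      boolean isNamedGroup = e.name != null && userGroups.contains(e.name);
      if (isFileGroup || isNamedGroup) { union |= e.perms; matched = true; }
    }
    if (matched) return union;
    // Fall back to the unnamed "other" entry.
    return find(acl, Type.OTHER, null);
  }

  static Integer tryFind(List<Entry> acl, Type t, String name) {
    for (Entry e : acl) {
      if (e.type == t && Objects.equals(e.name, name)) return e.perms;
    }
    return null;
  }

  static int find(List<Entry> acl, Type t, String name) {
    Integer p = tryFind(acl, t, name);
    return p == null ? 0 : p;
  }

  public static void main(String[] args) {
    List<Entry> acl = new ArrayList<>(Arrays.asList(
        new Entry(Type.USER, null, 6),       // owner rw-
        new Entry(Type.USER, "haohui", 4),   // named user r--
        new Entry(Type.GROUP, null, 4),      // file group r--
        new Entry(Type.OTHER, null, 0)));
    Set<String> groups = new HashSet<>(Arrays.asList("hdfs"));
    // Owner "haohui" gets rw- (6) even though the named entry only grants r--.
    System.out.println(effectivePerms(acl, "haohui", "haohui", groups, "hdfs"));
  }
}
{code}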


 NameNode: change all permission checks to enforce ACLs in addition to 
 permissions.
 --

 Key: HDFS-5612
 URL: https://issues.apache.org/jira/browse/HDFS-5612
 Project: Hadoop HDFS
  Issue Type: Sub-task
  Components: namenode
Affects Versions: HDFS ACLs (HDFS-4685)
Reporter: Chris Nauroth
Assignee: Chris Nauroth
 Attachments: HDFS-5612.1.patch, HDFS-5612.2.patch


 All {{NameNode}} code paths that enforce permissions must be updated so that 
 they also enforce ACLs.



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Updated] (HDFS-5742) DatanodeCluster (mini cluster of DNs) fails to start

2014-01-08 Thread Arpit Agarwal (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-5742?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Arpit Agarwal updated HDFS-5742:


Attachment: HDFS-5742.01.patch

 DatanodeCluster (mini cluster of DNs) fails to start
 

 Key: HDFS-5742
 URL: https://issues.apache.org/jira/browse/HDFS-5742
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: test
Affects Versions: 3.0.0
Reporter: Arpit Agarwal
Assignee: Arpit Agarwal
Priority: Minor
 Attachments: HDFS-5742.01.patch


 DatanodeCluster fails to start with NPE in MiniDFSCluster.
 Looks like a simple bug in {{MiniDFSCluster#determineDfsBaseDir}} - missing 
 check for null configuration.



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Updated] (HDFS-5739) ACL RPC must allow null name or unspecified permissions in ACL entries.

2014-01-08 Thread Chris Nauroth (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-5739?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chris Nauroth updated HDFS-5739:


Summary: ACL RPC must allow null name or unspecified permissions in ACL 
entries.  (was: ACL RPC must allow null name or null permissions in ACL 
entries.)

 ACL RPC must allow null name or unspecified permissions in ACL entries.
 ---

 Key: HDFS-5739
 URL: https://issues.apache.org/jira/browse/HDFS-5739
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: hdfs-client, namenode
Affects Versions: HDFS ACLs (HDFS-4685)
Reporter: Chris Nauroth
Assignee: Chris Nauroth
 Attachments: HDFS-5739.1.patch, HDFS-5739.2.patch


 Currently, the ACL RPC defines ACL entries with required fields for name and 
 permissions.  These fields actually need to be optional.  The name can be 
 null to represent unnamed ACL entries, such as the file owner or mask.  
 Permissions can be null when passed in an ACL spec to remove ACL entries via 
 {{FileSystem#removeAclEntries}}.



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Resolved] (HDFS-5737) Replacing only the default ACL can fail to copy unspecified base entries from the access ACL.

2014-01-08 Thread Chris Nauroth (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-5737?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chris Nauroth resolved HDFS-5737.
-

   Resolution: Fixed
Fix Version/s: HDFS ACLs (HDFS-4685)

I committed this to the HDFS-4685 feature branch.

 Replacing only the default ACL can fail to copy unspecified base entries from 
 the access ACL.
 -

 Key: HDFS-5737
 URL: https://issues.apache.org/jira/browse/HDFS-5737
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: namenode
Affects Versions: HDFS ACLs (HDFS-4685)
Reporter: Chris Nauroth
Assignee: Chris Nauroth
 Fix For: HDFS ACLs (HDFS-4685)

 Attachments: HDFS-5737.1.patch


 The final round of changes in HDFS-5673 switched to a search approach instead 
 of a scan approach for finding base access entries that need to be copied to 
 the default ACL.  However, in the case of doing full replacement on the 
 default ACL, the list may not be sorted properly at this point in the code, 
 causing the searches to miss the access entries.



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Resolved] (HDFS-5739) ACL RPC must allow null name or unspecified permissions in ACL entries.

2014-01-08 Thread Chris Nauroth (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-5739?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chris Nauroth resolved HDFS-5739.
-

   Resolution: Fixed
Fix Version/s: HDFS ACLs (HDFS-4685)
 Hadoop Flags: Reviewed

I committed the v2 patch to the HDFS-4685 feature branch.  Thanks again for the 
review, Haohui.

 ACL RPC must allow null name or unspecified permissions in ACL entries.
 ---

 Key: HDFS-5739
 URL: https://issues.apache.org/jira/browse/HDFS-5739
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: hdfs-client, namenode
Affects Versions: HDFS ACLs (HDFS-4685)
Reporter: Chris Nauroth
Assignee: Chris Nauroth
 Fix For: HDFS ACLs (HDFS-4685)

 Attachments: HDFS-5739.1.patch, HDFS-5739.2.patch


 Currently, the ACL RPC defines ACL entries with required fields for name and 
 permissions.  These fields actually need to be optional.  The name can be 
 null to represent unnamed ACL entries, such as the file owner or mask.  
 Permissions can be null when passed in an ACL spec to remove ACL entries via 
 {{FileSystem#removeAclEntries}}.



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Created] (HDFS-5743) Use protobuf to serialize snapshot information

2014-01-08 Thread Haohui Mai (JIRA)
Haohui Mai created HDFS-5743:


 Summary: Use protobuf to serialize snapshot information
 Key: HDFS-5743
 URL: https://issues.apache.org/jira/browse/HDFS-5743
 Project: Hadoop HDFS
  Issue Type: Sub-task
Reporter: Haohui Mai
Assignee: Jing Zhao


This jira tracks the efforts of using protobuf to serialize snapshot-related 
information in FSImage.



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Updated] (HDFS-5742) DatanodeCluster (mini cluster of DNs) fails to start

2014-01-08 Thread Jing Zhao (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-5742?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jing Zhao updated HDFS-5742:


Status: Patch Available  (was: Open)

 DatanodeCluster (mini cluster of DNs) fails to start
 

 Key: HDFS-5742
 URL: https://issues.apache.org/jira/browse/HDFS-5742
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: test
Affects Versions: 3.0.0
Reporter: Arpit Agarwal
Assignee: Arpit Agarwal
Priority: Minor
 Attachments: HDFS-5742.01.patch


 DatanodeCluster fails to start with NPE in MiniDFSCluster.
 Looks like a simple bug in {{MiniDFSCluster#determineDfsBaseDir}} - missing 
 check for null configuration.



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Commented] (HDFS-5742) DatanodeCluster (mini cluster of DNs) fails to start

2014-01-08 Thread Jing Zhao (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-5742?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13866011#comment-13866011
 ] 

Jing Zhao commented on HDFS-5742:
-

+1 Patch looks good to me.

 DatanodeCluster (mini cluster of DNs) fails to start
 

 Key: HDFS-5742
 URL: https://issues.apache.org/jira/browse/HDFS-5742
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: test
Affects Versions: 3.0.0
Reporter: Arpit Agarwal
Assignee: Arpit Agarwal
Priority: Minor
 Attachments: HDFS-5742.01.patch


 DatanodeCluster fails to start with NPE in MiniDFSCluster.
 Looks like a simple bug in {{MiniDFSCluster#determineDfsBaseDir}} - missing 
 check for null configuration.



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Updated] (HDFS-5743) Use protobuf to serialize snapshot information

2014-01-08 Thread Haohui Mai (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-5743?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Haohui Mai updated HDFS-5743:
-

Target Version/s: HDFS-5698 (FSImage in protobuf)

 Use protobuf to serialize snapshot information
 --

 Key: HDFS-5743
 URL: https://issues.apache.org/jira/browse/HDFS-5743
 Project: Hadoop HDFS
  Issue Type: Sub-task
Reporter: Haohui Mai
Assignee: Jing Zhao

 This jira tracks the efforts of using protobuf to serialize snapshot-related 
 information in FSImage.



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Updated] (HDFS-5738) Serialize INode information in protobuf

2014-01-08 Thread Haohui Mai (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-5738?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Haohui Mai updated HDFS-5738:
-

Target Version/s: HDFS-5698 (FSImage in protobuf)

 Serialize INode information in protobuf
 ---

 Key: HDFS-5738
 URL: https://issues.apache.org/jira/browse/HDFS-5738
 Project: Hadoop HDFS
  Issue Type: Sub-task
Reporter: Haohui Mai
Assignee: Haohui Mai
 Attachments: HDFS-5738.000.patch


 This jira proposes to serialize inode information with protobuf. 
 Snapshot-related information is out of the scope of this jira.



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Updated] (HDFS-5717) Save FSImage header in protobuf

2014-01-08 Thread Haohui Mai (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-5717?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Haohui Mai updated HDFS-5717:
-

Target Version/s: HDFS-5698 (FSImage in protobuf)

 Save FSImage header in protobuf
 ---

 Key: HDFS-5717
 URL: https://issues.apache.org/jira/browse/HDFS-5717
 Project: Hadoop HDFS
  Issue Type: Sub-task
Affects Versions: HDFS-5698 (FSImage in protobuf)
Reporter: Haohui Mai
Assignee: Haohui Mai
 Attachments: HDFS-5717.000.patch, HDFS-5717.001.patch, 
 HDFS-5717.002.patch


 This jira introduces the basic framework to serialize and deserialize FSImage 
 in protobuf, and it serializes some header information in the new protobuf 
 format.



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Commented] (HDFS-5722) Implement compression in the HTTP server of SNN / SBN instead of FSImage

2014-01-08 Thread Haohui Mai (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-5722?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13866104#comment-13866104
 ] 

Haohui Mai commented on HDFS-5722:
--

Indeed, the difficulties come from the efficiency perspective. Currently, 
skipping N bytes in a compressed stream requires decompressing the data. That can 
be problematic because N can be huge (e.g., when skipping the inode section, N 
can be as large as a couple of GB).

Just to clarify, the code will continue to support old FSImages that have 
compression enabled. This jira only proposes to move compression support out of 
the new FSImage format.
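A small illustration of the skipping cost, using java.util.zip only as a stand-in for whatever codec the image uses: InputStream#skip on a compressed stream still decompresses everything it passes over, so skipping a multi-GB inode section is not free.

{code}
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

public class SkipCompressedSketch {
  public static void main(String[] args) throws IOException {
    // Build a small compressed blob standing in for one section of the image.
    ByteArrayOutputStream buf = new ByteArrayOutputStream();
    try (GZIPOutputStream gz = new GZIPOutputStream(buf)) {
      byte[] section = new byte[1 << 20];            // 1 MB "inode section"
      gz.write(section);
      gz.write("next section".getBytes("UTF-8"));
    }

    // Skipping over the first section still decompresses it under the hood:
    // a gzip stream has no way to seek in the uncompressed byte space.
    try (GZIPInputStream in = new GZIPInputStream(
        new ByteArrayInputStream(buf.toByteArray()))) {
      long toSkip = 1 << 20;
      long skipped = 0;
      while (skipped < toSkip) {
        long n = in.skip(toSkip - skipped);          // decompresses as it goes
        if (n <= 0) break;
        skipped += n;
      }
      byte[] rest = new byte[32];
      int len = in.read(rest);
      System.out.println(new String(rest, 0, len, "UTF-8"));  // "next section"
    }
  }
}
{code}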

 Implement compression in the HTTP server of SNN / SBN instead of FSImage
 

 Key: HDFS-5722
 URL: https://issues.apache.org/jira/browse/HDFS-5722
 Project: Hadoop HDFS
  Issue Type: Sub-task
Reporter: Haohui Mai

 The current FSImage format supports compression; there is a field in the 
 header which specifies the compression codec used to compress the data in the 
 image. The main motivation was to reduce the number of bytes to be 
 transferred between SNN / SBN / NN.
 The main disadvantage, however, is that it requires the client to access the 
 FSImage in strictly sequential order. This might not fit well with the new 
 design of the FSImage. For example, serializing the data in protobuf allows the 
 client to quickly skip data that it does not understand. The compression 
 built into the format, however, complicates the calculation of offsets and 
 lengths. Recovering from a corrupted, compressed FSImage is also non-trivial, 
 as off-the-shelf tools like bzip2recover are inapplicable.
 This jira proposes to move the compression from the format of the FSImage to 
 the transport layer, namely, the HTTP server of the SNN / SBN. This design 
 simplifies the format of the FSImage, opens up the opportunity to quickly 
 navigate through the FSImage, and eases the process of recovery. It also 
 retains the benefit of reducing the number of bytes transferred across 
 the wire, since compression happens at the transport layer.



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Updated] (HDFS-5742) DatanodeCluster (mini cluster of DNs) fails to start

2014-01-08 Thread Arpit Agarwal (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-5742?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Arpit Agarwal updated HDFS-5742:


Status: Open  (was: Patch Available)

Withdrawing the patch for now. There are other bugs in DatanodeCluster.

Will submit a combined patch later.

 DatanodeCluster (mini cluster of DNs) fails to start
 

 Key: HDFS-5742
 URL: https://issues.apache.org/jira/browse/HDFS-5742
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: test
Affects Versions: 3.0.0
Reporter: Arpit Agarwal
Assignee: Arpit Agarwal
Priority: Minor
 Attachments: HDFS-5742.01.patch, HDFS-5742.02.patch


 DatanodeCluster fails to start with NPE in MiniDFSCluster.
 Looks like a simple bug in {{MiniDFSCluster#determineDfsBaseDir}} - missing 
 check for null configuration.



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Commented] (HDFS-5722) Implement compression in the HTTP server of SNN / SBN instead of FSImage

2014-01-08 Thread Haohui Mai (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-5722?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13866114#comment-13866114
 ] 

Haohui Mai commented on HDFS-5722:
--

Had an offline discussion with @Jing Zhao, and dug into the original jira 
(HDFS-1435) that added the compression support.

One concern is that it might increase disk I/O when writing the FSImage 
uncompressed to disk. The following table shows that it does not seem to 
be a problem:

https://issues.apache.org/jira/browse/HDFS-1435?focusedCommentId=12921060page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-12921060

Based on the data, I think it makes sense to move compression out of the 
FSImage format. The code can either compress the data on the fly when transferring 
it over HTTP, or write the FSImage uncompressed to disk and then compute the 
digest and compress the whole file in the background. Both solutions can 
reduce the time the NN spends in safe mode when saving the namespace.
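A rough sketch of the "compress on the fly" option, with invented names (the real image transfer servlet differs): the on-disk file stays uncompressed and a codec is applied only at the transport boundary.

{code}
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.zip.GZIPOutputStream;

public class TransferCompressionSketch {
  /**
   * Stream an uncompressed on-disk fsimage to an HTTP response body,
   * compressing only on the wire. 'out' stands in for the servlet's
   * response output stream.
   */
  static void transferImage(Path fsimage, OutputStream out) throws IOException {
    try (InputStream in = Files.newInputStream(fsimage);
         GZIPOutputStream gz = new GZIPOutputStream(out, 64 * 1024)) {
      byte[] buf = new byte[64 * 1024];
      int n;
      while ((n = in.read(buf)) > 0) {
        gz.write(buf, 0, n);   // compression happens here, not in the file format
      }
      gz.finish();
    }
  }

  public static void main(String[] args) throws IOException {
    Path image = Paths.get(args.length > 0 ? args[0] : "fsimage");
    transferImage(image, System.out);
  }
}
{code}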


 Implement compression in the HTTP server of SNN / SBN instead of FSImage
 

 Key: HDFS-5722
 URL: https://issues.apache.org/jira/browse/HDFS-5722
 Project: Hadoop HDFS
  Issue Type: Sub-task
Reporter: Haohui Mai

 The current FSImage format supports compression; there is a field in the 
 header which specifies the compression codec used to compress the data in the 
 image. The main motivation was to reduce the number of bytes to be 
 transferred between SNN / SBN / NN.
 The main disadvantage, however, is that it requires the client to access the 
 FSImage in strictly sequential order. This might not fit well with the new 
 design of the FSImage. For example, serializing the data in protobuf allows the 
 client to quickly skip data that it does not understand. The compression 
 built into the format, however, complicates the calculation of offsets and 
 lengths. Recovering from a corrupted, compressed FSImage is also non-trivial, 
 as off-the-shelf tools like bzip2recover are inapplicable.
 This jira proposes to move the compression from the format of the FSImage to 
 the transport layer, namely, the HTTP server of the SNN / SBN. This design 
 simplifies the format of the FSImage, opens up the opportunity to quickly 
 navigate through the FSImage, and eases the process of recovery. It also 
 retains the benefit of reducing the number of bytes transferred across 
 the wire, since compression happens at the transport layer.



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Commented] (HDFS-5653) Log namenode hostname in various exceptions being thrown in a HA setup

2014-01-08 Thread Jing Zhao (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-5653?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13866118#comment-13866118
 ] 

Jing Zhao commented on HDFS-5653:
-

For the current patch, since getCurrentProxyInfo and getProxy are called in 
different places, is it possible that a failover happens in between 
(e.g., triggered by another RPC call)? I think another possible solution is to 
let getProxy return (Proxy + extra tag), where the tag can be used to identify 
the NN. 
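A hypothetical sketch of the "proxy plus tag" idea; the names here are invented and do not match the actual failover proxy provider API, but they show how returning both values from one call avoids correlating two separate calls across a failover.

{code}
import java.net.InetSocketAddress;

// Hypothetical wrapper; the real failover proxy provider API differs.
public class ProxyWithInfoSketch {
  static final class ProxyAndTag<T> {
    final T proxy;
    final InetSocketAddress nnAddress;   // the "tag" identifying which NN served this proxy
    ProxyAndTag(T proxy, InetSocketAddress nnAddress) {
      this.proxy = proxy;
      this.nnAddress = nnAddress;
    }
  }

  interface ClientProtocol { String getFileInfo(String path); }

  // The provider returns proxy and tag together, so a concurrent failover cannot
  // make the caller attribute an exception to the wrong NameNode.
  static ProxyAndTag<ClientProtocol> getProxy() {
    InetSocketAddress nn = new InetSocketAddress("nn1.example.com", 8020);
    ClientProtocol proxy = path -> "stub info for " + path;
    return new ProxyAndTag<>(proxy, nn);
  }

  public static void main(String[] args) {
    ProxyAndTag<ClientProtocol> p = getProxy();
    try {
      System.out.println(p.proxy.getFileInfo("/tmp"));
    } catch (RuntimeException e) {
      // The tag lets us name the NN in the error message.
      System.err.println("Operation failed on " + p.nnAddress + ": " + e);
    }
  }
}
{code}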

 Log namenode hostname in various exceptions being thrown in a HA setup
 --

 Key: HDFS-5653
 URL: https://issues.apache.org/jira/browse/HDFS-5653
 Project: Hadoop HDFS
  Issue Type: Improvement
  Components: ha
Affects Versions: 2.2.0
Reporter: Arpit Gupta
Assignee: Haohui Mai
Priority: Minor
 Attachments: HDFS-5653.000.patch, HDFS-5653.001.patch, 
 HDFS-5653.002.patch, HDFS-5653.003.patch


 In a HA setup any time we see an exception such as safemode or namenode in 
 standby etc we dont know which namenode it came from. The user has to go to 
 the logs of the namenode and determine which one was active and/or standby 
 around the same time.
 I think it would help with debugging if any such exceptions could include the 
 namenode hostname so the user could know exactly which namenode served the 
 request.



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Commented] (HDFS-5717) Save FSImage header in protobuf

2014-01-08 Thread Jing Zhao (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-5717?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13866145#comment-13866145
 ] 

Jing Zhao commented on HDFS-5717:
-

[~wheat9], could you update the description and give more details about your 
basic designs? +1 after that.

 Save FSImage header in protobuf
 ---

 Key: HDFS-5717
 URL: https://issues.apache.org/jira/browse/HDFS-5717
 Project: Hadoop HDFS
  Issue Type: Sub-task
Affects Versions: HDFS-5698 (FSImage in protobuf)
Reporter: Haohui Mai
Assignee: Haohui Mai
 Attachments: HDFS-5717.000.patch, HDFS-5717.001.patch, 
 HDFS-5717.002.patch


 This jira introduces the basic framework to serialize and deserialize FSImage 
 in protobuf, and it serializes some header information in the new protobuf 
 format.



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Commented] (HDFS-5717) Save FSImage header in protobuf

2014-01-08 Thread Jing Zhao (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-5717?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13866147#comment-13866147
 ] 

Jing Zhao commented on HDFS-5717:
-

bq. Don't call newLoader

newLoader is the method which creates the real Loader instance, which can be 
either the old loader or the new protobuf-based loader. Thus the name newLoader 
makes sense to me.

 Save FSImage header in protobuf
 ---

 Key: HDFS-5717
 URL: https://issues.apache.org/jira/browse/HDFS-5717
 Project: Hadoop HDFS
  Issue Type: Sub-task
Affects Versions: HDFS-5698 (FSImage in protobuf)
Reporter: Haohui Mai
Assignee: Haohui Mai
 Attachments: HDFS-5717.000.patch, HDFS-5717.001.patch, 
 HDFS-5717.002.patch


 This jira introduces the basic framework to serialize and deserialize FSImage 
 in protobuf, and it serializes some header information in the new protobuf 
 format.



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Commented] (HDFS-5742) DatanodeCluster (mini cluster of DNs) fails to start

2014-01-08 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-5742?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13866161#comment-13866161
 ] 

Hadoop QA commented on HDFS-5742:
-

{color:green}+1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12622054/HDFS-5742.02.patch
  against trunk revision .

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 2 new 
or modified test files.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  The javadoc tool did not generate any 
warning messages.

{color:green}+1 eclipse:eclipse{color}.  The patch built with 
eclipse:eclipse.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 1.3.9) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:green}+1 core tests{color}.  The patch passed unit tests in 
hadoop-hdfs-project/hadoop-hdfs.

{color:green}+1 contrib tests{color}.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-HDFS-Build/5847//testReport/
Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/5847//console

This message is automatically generated.

 DatanodeCluster (mini cluster of DNs) fails to start
 

 Key: HDFS-5742
 URL: https://issues.apache.org/jira/browse/HDFS-5742
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: test
Affects Versions: 3.0.0
Reporter: Arpit Agarwal
Assignee: Arpit Agarwal
Priority: Minor
 Attachments: HDFS-5742.01.patch, HDFS-5742.02.patch


 DatanodeCluster fails to start with NPE in MiniDFSCluster.
 Looks like a simple bug in {{MiniDFSCluster#determineDfsBaseDir}} - missing 
 check for null configuration.



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Commented] (HDFS-3544) Ability to use SimpleRegeratingCode to fix missing blocks

2014-01-08 Thread Chris Li (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-3544?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13866170#comment-13866170
 ] 

Chris Li commented on HDFS-3544:


Any updates on this issue? We're interested in trying this out to save space on 
our cold files.

 Ability to use SimpleRegeratingCode to fix missing blocks
 -

 Key: HDFS-3544
 URL: https://issues.apache.org/jira/browse/HDFS-3544
 Project: Hadoop HDFS
  Issue Type: Improvement
  Components: contrib/raid
Reporter: dhruba borthakur
Assignee: Weiyan Wang

 Reed-Solomon encoding (n, k) has n storage nodes and can tolerate n-k 
 failures. Regenerating a block needs to access k blocks. This is a problem 
 when n and k are large. Instead, we can use simple regenerating codes (n, k, 
 f) that first do Reed-Solomon (n, k) and then do XOR with stripe size f. 
 Then, a single disk failure needs to access only f nodes, and f can be 
 very small.
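To illustrate only the XOR layer of that scheme (the Reed-Solomon part and real striping are omitted), a toy sketch: one parity block per group of f data blocks lets a single missing block be rebuilt from the rest of its group.

{code}
import java.util.Arrays;

public class XorGroupSketch {
  // XOR-combine a group of equally sized blocks into a parity block.
  static byte[] parity(byte[][] blocks) {
    byte[] p = new byte[blocks[0].length];
    for (byte[] b : blocks) {
      for (int i = 0; i < p.length; i++) {
        p[i] ^= b[i];
      }
    }
    return p;
  }

  // Rebuild the missing block by XOR-ing the parity with the surviving blocks.
  static byte[] rebuild(byte[] parity, byte[][] survivors) {
    byte[] missing = parity.clone();
    for (byte[] b : survivors) {
      for (int i = 0; i < missing.length; i++) {
        missing[i] ^= b[i];
      }
    }
    return missing;
  }

  public static void main(String[] args) {
    byte[][] group = { {1, 2, 3}, {4, 5, 6}, {7, 8, 9} };  // f = 3 data blocks
    byte[] p = parity(group);
    byte[] recovered = rebuild(p, new byte[][] { group[0], group[2] });  // block 1 lost
    System.out.println(Arrays.equals(recovered, group[1]));  // true
  }
}
{code}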



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Commented] (HDFS-5722) Implement compression in the HTTP server of SNN / SBN instead of FSImage

2014-01-08 Thread Colin Patrick McCabe (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-5722?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13866181#comment-13866181
 ] 

Colin Patrick McCabe commented on HDFS-5722:


[~tlipcon], [~atm], [~hairong], how do you feel about removing support for 
on-disk FSImage compression?

It seems to me that we should just add an option for doing HTTP compression, 
but keep the old option for on-disk compression.  It concerns me that someone 
with a small disk might upgrade to a new version of Hadoop and then be unable 
to save his (much larger) fsimage on a small partition once compression support 
has been removed.  I also think that for really large FSImages, loading a 
compressed version could be faster, if the compression were offloaded to a 
worker thread like Todd suggested in HDFS-1435.

The FSImage is always read sequentially.  If we implement optional sections, 
that won't change this fact.  So I just don't see a reason for messing with 
this.  But maybe there's something I have overlooked.

Thoughts?

 Implement compression in the HTTP server of SNN / SBN instead of FSImage
 

 Key: HDFS-5722
 URL: https://issues.apache.org/jira/browse/HDFS-5722
 Project: Hadoop HDFS
  Issue Type: Sub-task
Reporter: Haohui Mai

 The current FSImage format supports compression; there is a field in the 
 header which specifies the compression codec used to compress the data in the 
 image. The main motivation was to reduce the number of bytes to be 
 transferred between SNN / SBN / NN.
 The main disadvantage, however, is that it requires the client to access the 
 FSImage in strictly sequential order. This might not fit well with the new 
 design of the FSImage. For example, serializing the data in protobuf allows the 
 client to quickly skip data that it does not understand. The compression 
 built into the format, however, complicates the calculation of offsets and 
 lengths. Recovering from a corrupted, compressed FSImage is also non-trivial, 
 as off-the-shelf tools like bzip2recover are inapplicable.
 This jira proposes to move the compression from the format of the FSImage to 
 the transport layer, namely, the HTTP server of the SNN / SBN. This design 
 simplifies the format of the FSImage, opens up the opportunity to quickly 
 navigate through the FSImage, and eases the process of recovery. It also 
 retains the benefit of reducing the number of bytes transferred across 
 the wire, since compression happens at the transport layer.



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Updated] (HDFS-5717) Save FSImage header in protobuf

2014-01-08 Thread Haohui Mai (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-5717?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Haohui Mai updated HDFS-5717:
-

Description: 
This jira introduces several basic components to serialize / deserialize the 
FSImage in protobuf, including:

* Using protobuf to describe the skeleton of the new FSImage format.
* Introducing a separate code path to serialize and deserialize the new FSImage 
format.
* Saving the summary of the FSImage in the new format.

  was:This jira introduces the basic framework to serialize and deserialize 
FSImage in protobuf, and it serializes some header information in the new 
protobuf format.
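A toy sketch of the section-plus-summary layout idea, not the actual HDFS-5698 format: sections are written back to back, a summary of (name, offset, length) entries is appended, and a fixed-size trailer points at the summary so a reader can seek straight to the sections it cares about.

{code}
import java.io.DataOutputStream;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.RandomAccessFile;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

public class SectionedImageSketch {
  static final class Section {
    final String name; final long offset, length;
    Section(String n, long o, long l) { name = n; offset = o; length = l; }
  }

  public static void main(String[] args) throws IOException {
    String file = "image.sketch";
    List<Section> summary = new ArrayList<>();
    try (DataOutputStream out = new DataOutputStream(new FileOutputStream(file))) {
      long offset = 0;
      offset = writeSection(out, summary, "INODE", new byte[128], offset);
      offset = writeSection(out, summary, "SNAPSHOT", new byte[32], offset);
      long summaryOffset = offset;
      for (Section s : summary) {                   // the "summary" of the image
        byte[] name = s.name.getBytes(StandardCharsets.UTF_8);
        out.writeInt(name.length); out.write(name);
        out.writeLong(s.offset);   out.writeLong(s.length);
      }
      out.writeInt(summary.size());
      out.writeLong(summaryOffset);                 // fixed-size, 12-byte trailer
    }

    // A reader can jump to the trailer, read the summary, and skip sections
    // it does not understand without scanning the whole file.
    try (RandomAccessFile in = new RandomAccessFile(file, "r")) {
      in.seek(in.length() - 12);
      int count = in.readInt();
      long summaryOffset = in.readLong();
      System.out.println(count + " sections, summary at offset " + summaryOffset);
    }
  }

  static long writeSection(DataOutputStream out, List<Section> summary,
                           String name, byte[] body, long offset) throws IOException {
    out.write(body);
    summary.add(new Section(name, offset, body.length));
    return offset + body.length;
  }
}
{code}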


 Save FSImage header in protobuf
 ---

 Key: HDFS-5717
 URL: https://issues.apache.org/jira/browse/HDFS-5717
 Project: Hadoop HDFS
  Issue Type: Sub-task
Affects Versions: HDFS-5698 (FSImage in protobuf)
Reporter: Haohui Mai
Assignee: Haohui Mai
 Attachments: HDFS-5717.000.patch, HDFS-5717.001.patch, 
 HDFS-5717.002.patch


 This jira introduces several basic components to serialize / deserialize the 
 FSImage in protobuf, including:
 * Using protobuf to describe the skeleton of the new FSImage format.
 * Introducing a separate code path to serialize and deserialize the new 
 FSImage format.
 * Saving the summary of the FSImage in the new format.



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Resolved] (HDFS-5717) Save FSImage header in protobuf

2014-01-08 Thread Jing Zhao (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-5717?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jing Zhao resolved HDFS-5717.
-

   Resolution: Fixed
Fix Version/s: HDFS-5698 (FSImage in protobuf)
 Hadoop Flags: Reviewed

I've committed this.

 Save FSImage header in protobuf
 ---

 Key: HDFS-5717
 URL: https://issues.apache.org/jira/browse/HDFS-5717
 Project: Hadoop HDFS
  Issue Type: Sub-task
Affects Versions: HDFS-5698 (FSImage in protobuf)
Reporter: Haohui Mai
Assignee: Haohui Mai
 Fix For: HDFS-5698 (FSImage in protobuf)

 Attachments: HDFS-5717.000.patch, HDFS-5717.001.patch, 
 HDFS-5717.002.patch


 This jira introduces several basic components to serialize / deserialize the 
 FSImage in protobuf, including:
 * Using protobuf to describe the skeleton of the new FSImage format.
 * Introducing a separate code path to serialize and deserialize the new 
 FSImage format.
 * Saving the summary of the FSImage in the new format.



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Updated] (HDFS-5738) Serialize INode information in protobuf

2014-01-08 Thread Haohui Mai (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-5738?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Haohui Mai updated HDFS-5738:
-

Attachment: HDFS-5738.001.patch

Rebase on the current branch.

 Serialize INode information in protobuf
 ---

 Key: HDFS-5738
 URL: https://issues.apache.org/jira/browse/HDFS-5738
 Project: Hadoop HDFS
  Issue Type: Sub-task
Reporter: Haohui Mai
Assignee: Haohui Mai
 Attachments: HDFS-5738.000.patch, HDFS-5738.001.patch


 This jira proposes to serialize inode information with protobuf. 
 Snapshot-related information is out of the scope of this jira.



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Updated] (HDFS-5579) Under construction files make DataNode decommission take very long hours

2014-01-08 Thread zhaoyunjiong (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-5579?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

zhaoyunjiong updated HDFS-5579:
---

Attachment: HDFS-5579-branch-1.2.patch
HDFS-5579.patch

Good point. Thanks, Jing.
Updated patches to fix this problem.

 Under construction files make DataNode decommission take very long hours
 

 Key: HDFS-5579
 URL: https://issues.apache.org/jira/browse/HDFS-5579
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: namenode
Affects Versions: 1.2.0, 2.2.0
Reporter: zhaoyunjiong
Assignee: zhaoyunjiong
 Attachments: HDFS-5579-branch-1.2.patch, HDFS-5579-branch-1.2.patch, 
 HDFS-5579.patch, HDFS-5579.patch


 We noticed that sometimes decommissioning DataNodes takes a very long time, 
 even exceeding 100 hours.
 After checking the code, I found that in 
 BlockManager:computeReplicationWorkForBlocks(List<List<Block>> 
 blocksToReplicate) it won't replicate blocks which belong to 
 under-construction files; however, in 
 BlockManager:isReplicationInProgress(DatanodeDescriptor srcNode), if there 
 is a block that needs replication, no matter whether it belongs to an 
 under-construction file or not, the decommission process will continue running.
 That's the reason the decommission sometimes takes a very long time.



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Updated] (HDFS-5579) Under construction files make DataNode decommission take very long hours

2014-01-08 Thread zhaoyunjiong (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-5579?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

zhaoyunjiong updated HDFS-5579:
---

Attachment: (was: HDFS-5579-branch-1.2.patch)

 Under construction files make DataNode decommission take very long hours
 

 Key: HDFS-5579
 URL: https://issues.apache.org/jira/browse/HDFS-5579
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: namenode
Affects Versions: 1.2.0, 2.2.0
Reporter: zhaoyunjiong
Assignee: zhaoyunjiong
 Attachments: HDFS-5579-branch-1.2.patch, HDFS-5579.patch


 We noticed that sometimes decommissioning DataNodes takes a very long time, 
 even exceeding 100 hours.
 After checking the code, I found that in 
 BlockManager:computeReplicationWorkForBlocks(List<List<Block>> 
 blocksToReplicate) it won't replicate blocks which belong to 
 under-construction files; however, in 
 BlockManager:isReplicationInProgress(DatanodeDescriptor srcNode), if there 
 is a block that needs replication, no matter whether it belongs to an 
 under-construction file or not, the decommission process will continue running.
 That's the reason the decommission sometimes takes a very long time.



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Updated] (HDFS-5579) Under construction files make DataNode decommission take very long hours

2014-01-08 Thread zhaoyunjiong (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-5579?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

zhaoyunjiong updated HDFS-5579:
---

Attachment: (was: HDFS-5579.patch)

 Under construction files make DataNode decommission take very long hours
 

 Key: HDFS-5579
 URL: https://issues.apache.org/jira/browse/HDFS-5579
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: namenode
Affects Versions: 1.2.0, 2.2.0
Reporter: zhaoyunjiong
Assignee: zhaoyunjiong
 Attachments: HDFS-5579-branch-1.2.patch, HDFS-5579.patch


 We noticed that sometimes decommissioning DataNodes takes a very long time, 
 even exceeding 100 hours.
 After checking the code, I found that in 
 BlockManager:computeReplicationWorkForBlocks(List<List<Block>> 
 blocksToReplicate) it won't replicate blocks which belong to 
 under-construction files; however, in 
 BlockManager:isReplicationInProgress(DatanodeDescriptor srcNode), if there 
 is a block that needs replication, no matter whether it belongs to an 
 under-construction file or not, the decommission process will continue running.
 That's the reason the decommission sometimes takes a very long time.



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Commented] (HDFS-5579) Under construction files make DataNode decommission take very long hours

2014-01-08 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-5579?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13866258#comment-13866258
 ] 

Hadoop QA commented on HDFS-5579:
-

{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  
http://issues.apache.org/jira/secure/attachment/12622097/HDFS-5579-branch-1.2.patch
  against trunk revision .

{color:red}-1 patch{color}.  The patch command could not apply the patch.

Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/5848//console

This message is automatically generated.

 Under construction files make DataNode decommission take very long hours
 

 Key: HDFS-5579
 URL: https://issues.apache.org/jira/browse/HDFS-5579
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: namenode
Affects Versions: 1.2.0, 2.2.0
Reporter: zhaoyunjiong
Assignee: zhaoyunjiong
 Attachments: HDFS-5579-branch-1.2.patch, HDFS-5579.patch


 We noticed that sometimes decommissioning DataNodes takes a very long time, 
 even exceeding 100 hours.
 After checking the code, I found that in 
 BlockManager:computeReplicationWorkForBlocks(List<List<Block>> 
 blocksToReplicate) it won't replicate blocks which belong to 
 under-construction files; however, in 
 BlockManager:isReplicationInProgress(DatanodeDescriptor srcNode), if there 
 is a block that needs replication, no matter whether it belongs to an 
 under-construction file or not, the decommission process will continue running.
 That's the reason the decommission sometimes takes a very long time.



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Updated] (HDFS-5579) Under construction files make DataNode decommission take very long hours

2014-01-08 Thread zhaoyunjiong (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-5579?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

zhaoyunjiong updated HDFS-5579:
---

Attachment: HDFS-5579.patch

 Under construction files make DataNode decommission take very long hours
 

 Key: HDFS-5579
 URL: https://issues.apache.org/jira/browse/HDFS-5579
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: namenode
Affects Versions: 1.2.0, 2.2.0
Reporter: zhaoyunjiong
Assignee: zhaoyunjiong
 Attachments: HDFS-5579-branch-1.2.patch, HDFS-5579.patch


 We noticed that sometimes decommissioning DataNodes takes a very long time, 
 even exceeding 100 hours.
 After checking the code, I found that in 
 BlockManager:computeReplicationWorkForBlocks(List<List<Block>> 
 blocksToReplicate) it won't replicate blocks which belong to 
 under-construction files; however, in 
 BlockManager:isReplicationInProgress(DatanodeDescriptor srcNode), if there 
 is a block that needs replication, no matter whether it belongs to an 
 under-construction file or not, the decommission process will continue running.
 That's the reason the decommission sometimes takes a very long time.



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Updated] (HDFS-5579) Under construction files make DataNode decommission take very long hours

2014-01-08 Thread zhaoyunjiong (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-5579?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

zhaoyunjiong updated HDFS-5579:
---

Attachment: (was: HDFS-5579.patch)

 Under construction files make DataNode decommission take very long hours
 

 Key: HDFS-5579
 URL: https://issues.apache.org/jira/browse/HDFS-5579
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: namenode
Affects Versions: 1.2.0, 2.2.0
Reporter: zhaoyunjiong
Assignee: zhaoyunjiong
 Attachments: HDFS-5579-branch-1.2.patch, HDFS-5579.patch


 We noticed that sometimes decommissioning DataNodes takes a very long time, 
 even exceeding 100 hours.
 After checking the code, I found that in 
 BlockManager:computeReplicationWorkForBlocks(List<List<Block>> 
 blocksToReplicate) it won't replicate blocks which belong to 
 under-construction files; however, in 
 BlockManager:isReplicationInProgress(DatanodeDescriptor srcNode), if there 
 is a block that needs replication, no matter whether it belongs to an 
 under-construction file or not, the decommission process will continue running.
 That's the reason the decommission sometimes takes a very long time.



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Updated] (HDFS-5645) Support upgrade marker in editlog streams

2014-01-08 Thread Tsz Wo (Nicholas), SZE (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-5645?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tsz Wo (Nicholas), SZE updated HDFS-5645:
-

Attachment: h5645_20130109.patch

h5645_20130109.patch: updated with the branch.

Since the patch also applies to trunk, let me try submitting it.

 Support upgrade marker in editlog streams
 -

 Key: HDFS-5645
 URL: https://issues.apache.org/jira/browse/HDFS-5645
 Project: Hadoop HDFS
  Issue Type: Sub-task
  Components: namenode
Reporter: Tsz Wo (Nicholas), SZE
Assignee: Tsz Wo (Nicholas), SZE
 Attachments: editsStored, h5645_20130103.patch, h5645_20130109.patch


 During upgrade, a marker can be inserted into the editlog streams so that it 
 is possible to roll back to the marker transaction.



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Updated] (HDFS-5645) Support upgrade marker in editlog streams

2014-01-08 Thread Tsz Wo (Nicholas), SZE (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-5645?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tsz Wo (Nicholas), SZE updated HDFS-5645:
-

Status: Patch Available  (was: Open)

 Support upgrade marker in editlog streams
 -

 Key: HDFS-5645
 URL: https://issues.apache.org/jira/browse/HDFS-5645
 Project: Hadoop HDFS
  Issue Type: Sub-task
  Components: namenode
Reporter: Tsz Wo (Nicholas), SZE
Assignee: Tsz Wo (Nicholas), SZE
 Attachments: editsStored, h5645_20130103.patch, h5645_20130109.patch


 During upgrade, a marker can be inserted into the editlog streams so that it 
 is possible to roll back to the marker transaction.



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Updated] (HDFS-5721) sharedEditsImage in Namenode#initializeSharedEdits() should be closed before method returns

2014-01-08 Thread Junping Du (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-5721?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Junping Du updated HDFS-5721:
-

Issue Type: Improvement  (was: Bug)

 sharedEditsImage in Namenode#initializeSharedEdits() should be closed before 
 method returns
 ---

 Key: HDFS-5721
 URL: https://issues.apache.org/jira/browse/HDFS-5721
 Project: Hadoop HDFS
  Issue Type: Improvement
Reporter: Ted Yu
Assignee: Ted Yu
Priority: Minor
 Attachments: hdfs-5721-v1.txt, hdfs-5721-v2.txt, hdfs-5721-v3.txt


 At line 901:
 {code}
   FSImage sharedEditsImage = new FSImage(conf,
    Lists.<URI>newArrayList(),
   sharedEditsDirs);
 {code}
 sharedEditsImage is not closed before the method returns.
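A hedged sketch of the shape of the fix (the actual patch may differ): construct the image, do the work, and close it in a finally block so every exit path releases it.

{code}
import java.io.Closeable;
import java.io.IOException;

// Sketch of the fix shape only; the real initializeSharedEdits works on
// FSImage/NNStorage and has more steps between construction and return.
public class CloseSharedEditsSketch {
  static final class FakeImage implements Closeable {
    void doWork() { /* format shared edits dirs, etc. */ }
    @Override public void close() throws IOException { System.out.println("closed"); }
  }

  static void initializeSharedEdits() throws IOException {
    FakeImage sharedEditsImage = new FakeImage();
    try {
      sharedEditsImage.doWork();
    } finally {
      sharedEditsImage.close();   // closed on every exit path, including exceptions
    }
  }

  public static void main(String[] args) throws IOException {
    initializeSharedEdits();
  }
}
{code}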



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)

