[jira] [Commented] (HDFS-3443) Unable to catch up edits during standby to active switch due to NPE

2015-01-20 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-3443?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14283600#comment-14283600
 ] 

Hadoop QA commented on HDFS-3443:
-

{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12693217/HDFS-3443-006.patch
  against trunk revision 5a6c084.

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:red}-1 tests included{color}.  The patch doesn't appear to include 
any new or modified tests.
Please justify why no new tests are needed for this 
patch.
Also please list what manual steps were performed to 
verify this patch.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  There were no new javadoc warning messages.

{color:green}+1 eclipse:eclipse{color}.  The patch built with 
eclipse:eclipse.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 2.0.3) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:red}-1 core tests{color}.  The patch failed these unit tests in 
hadoop-hdfs-project/hadoop-hdfs:

  
org.apache.hadoop.hdfs.server.namenode.ha.TestPipelinesFailover
  org.apache.hadoop.hdfs.server.namenode.TestFileTruncate

Test results: 
https://builds.apache.org/job/PreCommit-HDFS-Build/9274//testReport/
Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/9274//console

This message is automatically generated.

> Unable to catch up edits during standby to active switch due to NPE
> ---
>
> Key: HDFS-3443
> URL: https://issues.apache.org/jira/browse/HDFS-3443
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: auto-failover, ha
>Reporter: suja s
>Assignee: Vinayakumar B
> Attachments: HDFS-3443-003.patch, HDFS-3443-004.patch, 
> HDFS-3443-005.patch, HDFS-3443-006.patch, HDFS-3443_1.patch, HDFS-3443_1.patch
>
>
> Start the NN.
> Let the NN standby services be started.
> Before the editLogTailer is initialised, start ZKFC and allow the 
> active services startup to proceed further.
> Here editLogTailer.catchupDuringFailover() will throw an NPE.
> {code}
> void startActiveServices() throws IOException {
> LOG.info("Starting services required for active state");
> writeLock();
> try {
>   FSEditLog editLog = dir.fsImage.getEditLog();
>   
>   if (!editLog.isOpenForWrite()) {
> // During startup, we're already open for write during initialization.
> editLog.initJournalsForWrite();
> // May need to recover
> editLog.recoverUnclosedStreams();
> 
> LOG.info("Catching up to latest edits from old active before " +
> "taking over writer role in edits logs.");
> editLogTailer.catchupDuringFailover();
> {code}
> {noformat}
> 2012-05-18 16:51:27,585 WARN org.apache.hadoop.ipc.Server: IPC Server 
> Responder, call org.apache.hadoop.ha.HAServiceProtocol.getServiceStatus from 
> XX.XX.XX.55:58003: output error
> 2012-05-18 16:51:27,586 WARN org.apache.hadoop.ipc.Server: IPC Server handler 
> 8 on 8020, call org.apache.hadoop.ha.HAServiceProtocol.transitionToActive 
> from XX.XX.XX.55:58004: error: java.lang.NullPointerException
> java.lang.NullPointerException
>   at 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.startActiveServices(FSNamesystem.java:602)
>   at 
> org.apache.hadoop.hdfs.server.namenode.NameNode$NameNodeHAContext.startActiveServices(NameNode.java:1287)
>   at 
> org.apache.hadoop.hdfs.server.namenode.ha.ActiveState.enterState(ActiveState.java:61)
>   at 
> org.apache.hadoop.hdfs.server.namenode.ha.HAState.setStateInternal(HAState.java:63)
>   at 
> org.apache.hadoop.hdfs.server.namenode.ha.StandbyState.setState(StandbyState.java:49)
>   at 
> org.apache.hadoop.hdfs.server.namenode.NameNode.transitionToActive(NameNode.java:1219)
>   at 
> org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.transitionToActive(NameNodeRpcServer.java:978)
>   at 
> org.apache.hadoop.ha.protocolPB.HAServiceProtocolServerSideTranslatorPB.transitionToActive(HAServiceProtocolServerSideTranslatorPB.java:107)
>   at 
> org.apache.hadoop.ha.proto.HAServiceProtocolProtos$HAServiceProtocolService$2.callBlockingMethod(HAServiceProtocolProtos.java:3633)
>   at 
> org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:427)
>   at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:916)
>   at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1692)
>  

[jira] [Commented] (HDFS-3443) Unable to catch up edits during standby to active switch due to NPE

2015-01-20 Thread Vinayakumar B (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-3443?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14283613#comment-14283613
 ] 

Vinayakumar B commented on HDFS-3443:
-

The above test failures are unrelated.

> Unable to catch up edits during standby to active switch due to NPE
> ---
>
> Key: HDFS-3443
> URL: https://issues.apache.org/jira/browse/HDFS-3443
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: auto-failover, ha
>Reporter: suja s
>Assignee: Vinayakumar B
> Attachments: HDFS-3443-003.patch, HDFS-3443-004.patch, 
> HDFS-3443-005.patch, HDFS-3443-006.patch, HDFS-3443_1.patch, HDFS-3443_1.patch
>
>
> Start the NN.
> Let the NN standby services be started.
> Before the editLogTailer is initialised, start ZKFC and allow the 
> active services startup to proceed further.
> Here editLogTailer.catchupDuringFailover() will throw an NPE.
> {code}
> void startActiveServices() throws IOException {
> LOG.info("Starting services required for active state");
> writeLock();
> try {
>   FSEditLog editLog = dir.fsImage.getEditLog();
>   
>   if (!editLog.isOpenForWrite()) {
> // During startup, we're already open for write during initialization.
> editLog.initJournalsForWrite();
> // May need to recover
> editLog.recoverUnclosedStreams();
> 
> LOG.info("Catching up to latest edits from old active before " +
> "taking over writer role in edits logs.");
> editLogTailer.catchupDuringFailover();
> {code}
> {noformat}
> 2012-05-18 16:51:27,585 WARN org.apache.hadoop.ipc.Server: IPC Server 
> Responder, call org.apache.hadoop.ha.HAServiceProtocol.getServiceStatus from 
> XX.XX.XX.55:58003: output error
> 2012-05-18 16:51:27,586 WARN org.apache.hadoop.ipc.Server: IPC Server handler 
> 8 on 8020, call org.apache.hadoop.ha.HAServiceProtocol.transitionToActive 
> from XX.XX.XX.55:58004: error: java.lang.NullPointerException
> java.lang.NullPointerException
>   at 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.startActiveServices(FSNamesystem.java:602)
>   at 
> org.apache.hadoop.hdfs.server.namenode.NameNode$NameNodeHAContext.startActiveServices(NameNode.java:1287)
>   at 
> org.apache.hadoop.hdfs.server.namenode.ha.ActiveState.enterState(ActiveState.java:61)
>   at 
> org.apache.hadoop.hdfs.server.namenode.ha.HAState.setStateInternal(HAState.java:63)
>   at 
> org.apache.hadoop.hdfs.server.namenode.ha.StandbyState.setState(StandbyState.java:49)
>   at 
> org.apache.hadoop.hdfs.server.namenode.NameNode.transitionToActive(NameNode.java:1219)
>   at 
> org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.transitionToActive(NameNodeRpcServer.java:978)
>   at 
> org.apache.hadoop.ha.protocolPB.HAServiceProtocolServerSideTranslatorPB.transitionToActive(HAServiceProtocolServerSideTranslatorPB.java:107)
>   at 
> org.apache.hadoop.ha.proto.HAServiceProtocolProtos$HAServiceProtocolService$2.callBlockingMethod(HAServiceProtocolProtos.java:3633)
>   at 
> org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:427)
>   at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:916)
>   at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1692)
>   at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1688)
>   at java.security.AccessController.doPrivileged(Native Method)
>   at javax.security.auth.Subject.doAs(Subject.java:396)
>   at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1232)
>   at org.apache.hadoop.ipc.Server$Handler.run(Server.java:1686)
> 2012-05-18 16:51:27,586 INFO org.apache.hadoop.ipc.Server: IPC Server handler 
> 9 on 8020 caught an exception
> java.nio.channels.ClosedChannelException
>   at 
> sun.nio.ch.SocketChannelImpl.ensureWriteOpen(SocketChannelImpl.java:133)
>   at sun.nio.ch.SocketChannelImpl.write(SocketChannelImpl.java:324)
>   at org.apache.hadoop.ipc.Server.channelWrite(Server.java:2092)
>   at org.apache.hadoop.ipc.Server.access$2000(Server.java:107)
>   at 
> org.apache.hadoop.ipc.Server$Responder.processResponse(Server.java:930)
>   at org.apache.hadoop.ipc.Server$Responder.doRespond(Server.java:994)
>   at org.apache.hadoop.ipc.Server$Handler.run(Server.java:1738)
> {noformat}
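
For readers following along, here is a minimal, self-contained sketch of the race described above and the kind of guard that would avoid the NPE. All class and method names below are illustrative placeholders, not the actual FSNamesystem code or the committed HDFS-3443 patch.

{code}
import java.io.IOException;

/**
 * Illustrative sketch only: a failover transition must not call into the edit
 * log tailer before the standby startup path has assigned it. Names here are
 * placeholders, not the real HDFS classes.
 */
class FailoverGuardSketch {
  private volatile EditLogTailerStub editLogTailer; // set by standby startup

  void startStandbyServices() {
    editLogTailer = new EditLogTailerStub();        // may race with failover
  }

  void startActiveServices() throws IOException {
    EditLogTailerStub tailer = editLogTailer;
    if (tailer == null) {
      // Without a guard, the original code dereferences null and throws NPE.
      throw new IOException("Standby services not fully initialized; retry the transition");
    }
    tailer.catchupDuringFailover();
  }

  static class EditLogTailerStub {
    void catchupDuringFailover() { /* catch up edits from the shared journal */ }
  }
}
{code}

Rejecting (or delaying) the transition until standby initialization completes is one way to break the race; the actual patch may resolve it differently.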



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-7641) Update archival storage user doc for list/set/get block storage policies

2015-01-20 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-7641?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14283646#comment-14283646
 ] 

Hadoop QA commented on HDFS-7641:
-

{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12693249/HDFS-7641.001.patch
  against trunk revision c94c0d2.

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+0 tests included{color}.  The patch appears to be a 
documentation patch that doesn't require tests.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  There were no new javadoc warning messages.

{color:green}+1 eclipse:eclipse{color}.  The patch built with 
eclipse:eclipse.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 2.0.3) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:red}-1 core tests{color}.  The following test timeouts occurred in 
hadoop-hdfs-project/hadoop-hdfs:

org.apache.hadoop.hdfs.TestFileCreation

Test results: 
https://builds.apache.org/job/PreCommit-HDFS-Build/9275//testReport/
Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/9275//console

This message is automatically generated.

> Update archival storage user doc for list/set/get block storage policies
> 
>
> Key: HDFS-7641
> URL: https://issues.apache.org/jira/browse/HDFS-7641
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: documentation
>Affects Versions: 2.7.0
>Reporter: Yi Liu
>Assignee: Yi Liu
>Priority: Minor
> Attachments: HDFS-7641.001.patch
>
>
> After HDFS-7323, the list/set/get block storage policy commands are 
> different, so we should update the corresponding user doc.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-7637) Fix the check condition for reserved path

2015-01-20 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-7637?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14283684#comment-14283684
 ] 

Hudson commented on HDFS-7637:
--

FAILURE: Integrated in Hadoop-Yarn-trunk-Java8 #79 (See 
[https://builds.apache.org/job/Hadoop-Yarn-trunk-Java8/79/])
HDFS-7637. Fix the check condition for reserved path. Contributed by Yi Liu. 
(jing9: rev e843a0a8cee5c704a5d28cf14b5a4050094d341b)
* hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
* 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FSDirectory.java


> Fix the check condition for reserved path
> -
>
> Key: HDFS-7637
> URL: https://issues.apache.org/jira/browse/HDFS-7637
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: Yi Liu
>Assignee: Yi Liu
>Priority: Minor
> Fix For: 2.7.0
>
> Attachments: HDFS-7637.001.patch
>
>
> Currently the {{.reserved}} path check function is:
> {code}
> public static boolean isReservedName(String src) {
>   return src.startsWith(DOT_RESERVED_PATH_PREFIX);
> }
> {code}
> And {{DOT_RESERVED_PATH_PREFIX}} is {{/.reserved}}; it should be 
> {{/.reserved/}}. For example, if some other directory merely starts with the 
> prefix _/.reserved_, say _/.reservedpath_, the check wrongly treats it as reserved.
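
To make the point concrete, here is a self-contained sketch of the stricter check the description argues for, reusing the {{DOT_RESERVED_PATH_PREFIX}} value quoted above. It is an illustration only, not necessarily the exact logic of the committed patch.

{code}
/** Sketch of the stricter reserved-path check described above. */
class ReservedPathCheckSketch {
  static final String DOT_RESERVED_PATH_PREFIX = "/.reserved";

  static boolean isReservedName(String src) {
    // "/.reserved" itself and anything under "/.reserved/" are reserved,
    // but a sibling such as "/.reservedpath" must not match.
    return src.equals(DOT_RESERVED_PATH_PREFIX)
        || src.startsWith(DOT_RESERVED_PATH_PREFIX + "/");
  }

  public static void main(String[] args) {
    System.out.println(isReservedName("/.reserved/raw")); // true
    System.out.println(isReservedName("/.reservedpath")); // false
  }
}
{code}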



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-7640) print NFS Client in the NFS log

2015-01-20 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-7640?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14283689#comment-14283689
 ] 

Hudson commented on HDFS-7640:
--

FAILURE: Integrated in Hadoop-Yarn-trunk-Java8 #79 (See 
[https://builds.apache.org/job/Hadoop-Yarn-trunk-Java8/79/])
HDFS-7640. print NFS Client in the NFS log. Contributed by Brandon Li. (wheat9: 
rev 5e5e35b1856293503124b77d5d4998a4d8e83082)
* hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
* 
hadoop-hdfs-project/hadoop-hdfs-nfs/src/main/java/org/apache/hadoop/hdfs/nfs/nfs3/RpcProgramNfs3.java


> print NFS Client in the NFS log
> ---
>
> Key: HDFS-7640
> URL: https://issues.apache.org/jira/browse/HDFS-7640
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: nfs
>Affects Versions: 2.2.0
>Reporter: Brandon Li
>Assignee: Brandon Li
>Priority: Trivial
> Fix For: 2.7.0
>
> Attachments: HDFS-7640.001.patch
>
>
> Currently the hdfs-nfs logs do not have any information about NFS clients.
> When multiple clients are using NFS, it becomes hard to distinguish which 
> request came from which client.
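
As a rough illustration of the request, the sketch below includes the client's address in a per-request log line; the handleRead method and clientAddress parameter are placeholders, not the actual RpcProgramNfs3 API.

{code}
import java.net.InetSocketAddress;
import java.util.logging.Logger;

/** Sketch only: tag each NFS request log line with the calling client. */
class NfsRequestLoggingSketch {
  private static final Logger LOG =
      Logger.getLogger(NfsRequestLoggingSketch.class.getName());

  void handleRead(InetSocketAddress clientAddress, long fileId) {
    // Including the remote address lets operators tell apart requests
    // coming from different NFS clients.
    LOG.info("NFS READ fileId=" + fileId + " client=" + clientAddress);
  }
}
{code}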



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-5631) Expose interfaces required by FsDatasetSpi implementations

2015-01-20 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-5631?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14283681#comment-14283681
 ] 

Hudson commented on HDFS-5631:
--

FAILURE: Integrated in Hadoop-Yarn-trunk-Java8 #79 (See 
[https://builds.apache.org/job/Hadoop-Yarn-trunk-Java8/79/])
HDFS-5631. Change BlockMetadataHeader.readHeader(..), ChunkChecksum class and 
constructor to public; and fix FsDatasetSpi to use generic type instead of 
FsVolumeImpl.  Contributed by David Powell and Joe Pallas (szetszwo: rev 
4a4450836c8972480b9387b5e31bab57ae2b5baa)
* 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/ChunkChecksum.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/datanode/extdataset/ExternalReplica.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/datanode/extdataset/ExternalRollingLogs.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/fsdataset/impl/RamDiskAsyncLazyPersistService.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/datanode/extdataset/TestExternalDataset.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/fsdataset/FsDatasetSpi.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/datanode/extdataset/ExternalVolumeImpl.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/datanode/extdataset/ExternalReplicaInPipeline.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/BlockMetadataHeader.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/datanode/extdataset/ExternalDatasetImpl.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/datanode/SimulatedFSDataset.java
* hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt


> Expose interfaces required by FsDatasetSpi implementations
> --
>
> Key: HDFS-5631
> URL: https://issues.apache.org/jira/browse/HDFS-5631
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: datanode
>Affects Versions: 3.0.0
>Reporter: David Powell
>Assignee: Joe Pallas
>Priority: Minor
> Fix For: 3.0.0
>
> Attachments: HDFS-5631-LazyPersist.patch, 
> HDFS-5631-LazyPersist.patch, HDFS-5631.patch, HDFS-5631.patch
>
>
> This sub-task addresses section 4.1 of the document attached to HDFS-5194,
> the exposure of interfaces needed by a FsDatasetSpi implementation.
> Specifically it makes ChunkChecksum public and BlockMetadataHeader's
> readHeader() and writeHeader() methods public.
> The changes to BlockReaderUtil (and related classes) discussed by section
> 4.1 are only needed if supporting short-circuit, and should be addressed
> as part of an effort to provide such support rather than this JIRA.
> To help ensure these changes are complete and are not regressed in the
> future, tests that gauge the accessibility (though *not* behavior)
> of interfaces needed by a FsDatasetSpi subclass are also included.
> These take the form of a dummy FsDatasetSpi subclass -- a successful
> compilation is effectively a pass.  Trivial unit tests are included so
> that there is something tangible to track.
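
The "compilation is effectively a pass" idea can be illustrated with a small self-contained analogue. ExampleSpi and the other names below are stand-ins, not the real FsDatasetSpi hierarchy; in the actual patch the dummy implementation lives in a separate package (extdataset), so it only compiles when every type and method it touches is public.

{code}
/** Stand-in for the public SPI whose accessibility is being exercised. */
interface ExampleSpi {
  ExampleReplica createReplica(String blockId);
}

/** Stand-in for a type the SPI exposes; members are deliberately public. */
class ExampleReplica {
  public ExampleReplica() {}
  public long getBytesOnDisk() { return 0; }
}

/** The "external" dummy implementation; compiling it is the real test. */
class ExternalExampleImpl implements ExampleSpi {
  @Override
  public ExampleReplica createReplica(String blockId) {
    return new ExampleReplica();
  }
}

/** The trivial, tangible test: simply exercise the external implementation. */
class TestExternalExampleSpi {
  public static void main(String[] args) {
    ExampleSpi spi = new ExternalExampleImpl();
    System.out.println(spi.createReplica("blk_1").getBytesOnDisk());
  }
}
{code}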



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-7638) Small fix and few refinements for FSN#truncate

2015-01-20 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-7638?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14283685#comment-14283685
 ] 

Hudson commented on HDFS-7638:
--

FAILURE: Integrated in Hadoop-Yarn-trunk-Java8 #79 (See 
[https://builds.apache.org/job/Hadoop-Yarn-trunk-Java8/79/])
HDFS-7638: Small fix and few refinements for FSN#truncate. (yliu) (yliu: rev 
5a6c084f074990a1f412475b147fd4f040b57d57)
* 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FSNamesystem.java
* hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt


> Small fix and few refinements for FSN#truncate
> --
>
> Key: HDFS-7638
> URL: https://issues.apache.org/jira/browse/HDFS-7638
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: datanode, namenode
>Reporter: Yi Liu
>Assignee: Yi Liu
> Fix For: 3.0.0
>
> Attachments: HDFS-7638.001.patch
>
>
> *1.* 
> {code}
> removeBlocks(collectedBlocks);
> {code}
> should come after {{logSync}}, as we do in other FSN places (rename, delete, 
> write with overwrite); the reason is discussed in HDFS-2815 and 
> https://issues.apache.org/jira/browse/HDFS-6871?focusedCommentId=14110068&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-14110068
> *2.*
> {code}
> stat = FSDirStatAndListingOp.getFileInfo(dir, src, false,
> FSDirectory.isReservedRawName(src), true);
> {code}
> We'd better use {{dir.getAuditFileInfo}}, since it is only needed for the audit 
> log. If the audit log is not on, we don't need to get the file info.
> *3.*
> In {{truncateInternal}}, 
> {code}
> INodeFile file = iip.getLastINode().asFile();
> {code}
> is not necessary. 
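
As an illustration of point 1, the sketch below shows the recommended ordering: durably sync the edit log before physically removing the collected blocks. The class and helper names are simplified placeholders, not FSNamesystem itself.

{code}
import java.util.List;

/** Sketch only: flush the edit-log record before removing blocks. */
class TruncateOrderingSketch {
  private final StringBuilder journal = new StringBuilder();

  void truncate(String src, List<String> collectedBlocks) {
    // 1. Record the operation in the (simulated) edit log.
    journal.append("OP_TRUNCATE ").append(src).append('\n');
    // 2. Durably sync the edit log first ...
    logSync();
    // 3. ... and only then remove the collected blocks, as rename/delete/
    //    overwrite already do, so a crash cannot leave blocks deleted
    //    without the corresponding edit persisted.
    removeBlocks(collectedBlocks);
  }

  private void logSync() { /* force the journal to stable storage */ }

  private void removeBlocks(List<String> blocks) { blocks.clear(); }
}
{code}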



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-7637) Fix the check condition for reserved path

2015-01-20 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-7637?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14283738#comment-14283738
 ] 

Hudson commented on HDFS-7637:
--

FAILURE: Integrated in Hadoop-Yarn-trunk #813 (See 
[https://builds.apache.org/job/Hadoop-Yarn-trunk/813/])
HDFS-7637. Fix the check condition for reserved path. Contributed by Yi Liu. 
(jing9: rev e843a0a8cee5c704a5d28cf14b5a4050094d341b)
* 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FSDirectory.java
* hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt


> Fix the check condition for reserved path
> -
>
> Key: HDFS-7637
> URL: https://issues.apache.org/jira/browse/HDFS-7637
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: Yi Liu
>Assignee: Yi Liu
>Priority: Minor
> Fix For: 2.7.0
>
> Attachments: HDFS-7637.001.patch
>
>
> Currently the {{.reserved}} path check function is:
> {code}
> public static boolean isReservedName(String src) {
>   return src.startsWith(DOT_RESERVED_PATH_PREFIX);
> }
> {code}
> And {{DOT_RESERVED_PATH_PREFIX}} is {{/.reserved}}; it should be 
> {{/.reserved/}}. For example, if some other directory merely starts with the 
> prefix _/.reserved_, say _/.reservedpath_, the check wrongly treats it as reserved.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-7640) print NFS Client in the NFS log

2015-01-20 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-7640?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14283743#comment-14283743
 ] 

Hudson commented on HDFS-7640:
--

FAILURE: Integrated in Hadoop-Yarn-trunk #813 (See 
[https://builds.apache.org/job/Hadoop-Yarn-trunk/813/])
HDFS-7640. print NFS Client in the NFS log. Contributed by Brandon Li. (wheat9: 
rev 5e5e35b1856293503124b77d5d4998a4d8e83082)
* hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
* 
hadoop-hdfs-project/hadoop-hdfs-nfs/src/main/java/org/apache/hadoop/hdfs/nfs/nfs3/RpcProgramNfs3.java


> print NFS Client in the NFS log
> ---
>
> Key: HDFS-7640
> URL: https://issues.apache.org/jira/browse/HDFS-7640
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: nfs
>Affects Versions: 2.2.0
>Reporter: Brandon Li
>Assignee: Brandon Li
>Priority: Trivial
> Fix For: 2.7.0
>
> Attachments: HDFS-7640.001.patch
>
>
> Currently the hdfs-nfs logs do not have any information about NFS clients.
> When multiple clients are using NFS, it becomes hard to distinguish which 
> request came from which client.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-5631) Expose interfaces required by FsDatasetSpi implementations

2015-01-20 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-5631?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14283735#comment-14283735
 ] 

Hudson commented on HDFS-5631:
--

FAILURE: Integrated in Hadoop-Yarn-trunk #813 (See 
[https://builds.apache.org/job/Hadoop-Yarn-trunk/813/])
HDFS-5631. Change BlockMetadataHeader.readHeader(..), ChunkChecksum class and 
constructor to public; and fix FsDatasetSpi to use generic type instead of 
FsVolumeImpl.  Contributed by David Powell and Joe Pallas (szetszwo: rev 
4a4450836c8972480b9387b5e31bab57ae2b5baa)
* 
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/datanode/extdataset/ExternalReplicaInPipeline.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/BlockMetadataHeader.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/datanode/extdataset/TestExternalDataset.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/datanode/extdataset/ExternalRollingLogs.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/ChunkChecksum.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/fsdataset/FsDatasetSpi.java
* hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
* 
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/datanode/extdataset/ExternalDatasetImpl.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/fsdataset/impl/RamDiskAsyncLazyPersistService.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/datanode/extdataset/ExternalVolumeImpl.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/datanode/SimulatedFSDataset.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/datanode/extdataset/ExternalReplica.java


> Expose interfaces required by FsDatasetSpi implementations
> --
>
> Key: HDFS-5631
> URL: https://issues.apache.org/jira/browse/HDFS-5631
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: datanode
>Affects Versions: 3.0.0
>Reporter: David Powell
>Assignee: Joe Pallas
>Priority: Minor
> Fix For: 3.0.0
>
> Attachments: HDFS-5631-LazyPersist.patch, 
> HDFS-5631-LazyPersist.patch, HDFS-5631.patch, HDFS-5631.patch
>
>
> This sub-task addresses section 4.1 of the document attached to HDFS-5194,
> the exposure of interfaces needed by a FsDatasetSpi implementation.
> Specifically it makes ChunkChecksum public and BlockMetadataHeader's
> readHeader() and writeHeader() methods public.
> The changes to BlockReaderUtil (and related classes) discussed by section
> 4.1 are only needed if supporting short-circuit, and should be addressed
> as part of an effort to provide such support rather than this JIRA.
> To help ensure these changes are complete and are not regressed in the
> future, tests that gauge the accessibility (though *not* behavior)
> of interfaces needed by a FsDatasetSpi subclass are also included.
> These take the form of a dummy FsDatasetSpi subclass -- a successful
> compilation is effectively a pass.  Trivial unit tests are included so
> that there is something tangible to track.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-7638) Small fix and few refinements for FSN#truncate

2015-01-20 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-7638?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14283739#comment-14283739
 ] 

Hudson commented on HDFS-7638:
--

FAILURE: Integrated in Hadoop-Yarn-trunk #813 (See 
[https://builds.apache.org/job/Hadoop-Yarn-trunk/813/])
HDFS-7638: Small fix and few refinements for FSN#truncate. (yliu) (yliu: rev 
5a6c084f074990a1f412475b147fd4f040b57d57)
* 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FSNamesystem.java
* hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt


> Small fix and few refinements for FSN#truncate
> --
>
> Key: HDFS-7638
> URL: https://issues.apache.org/jira/browse/HDFS-7638
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: datanode, namenode
>Reporter: Yi Liu
>Assignee: Yi Liu
> Fix For: 3.0.0
>
> Attachments: HDFS-7638.001.patch
>
>
> *1.* 
> {code}
> removeBlocks(collectedBlocks);
> {code}
> should come after {{logSync}}, as we do in other FSN places (rename, delete, 
> write with overwrite); the reason is discussed in HDFS-2815 and 
> https://issues.apache.org/jira/browse/HDFS-6871?focusedCommentId=14110068&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-14110068
> *2.*
> {code}
> stat = FSDirStatAndListingOp.getFileInfo(dir, src, false,
> FSDirectory.isReservedRawName(src), true);
> {code}
> We'd better use {{dir.getAuditFileInfo}}, since it is only needed for the audit 
> log. If the audit log is not on, we don't need to get the file info.
> *3.*
> In {{truncateInternal}}, 
> {code}
> INodeFile file = iip.getLastINode().asFile();
> {code}
> is not necessary. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-7421) Move processing of postponed over-replicated blocks to a background task

2015-01-20 Thread Kihwal Lee (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-7421?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14283832#comment-14283832
 ] 

Kihwal Lee commented on HDFS-7421:
--

Isn't it a dupe of HDFS-6425?

> Move processing of postponed over-replicated blocks to a background task
> 
>
> Key: HDFS-7421
> URL: https://issues.apache.org/jira/browse/HDFS-7421
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: ha, namenode
>Affects Versions: 2.6.0
>Reporter: Aaron T. Myers
>Assignee: Aaron T. Myers
>
> In an HA environment, we postpone sending block invalidates to DNs until all 
> DNs holding a given block have done at least one block report to the NN after 
> it became active. When that first block report after becoming active does 
> occur, we attempt to reprocess all postponed misreplicated blocks inline with 
> the block report RPC. In the case where there are many postponed 
> misreplicated blocks, this can cause block report RPCs to take an 
> inordinately long time to complete, sometimes on the order of minutes, which 
> has the potential to tie up RPC handlers, block incoming RPCs, etc. There's 
> no need to hurriedly process all postponed misreplicated blocks so that we 
> can quickly send invalidate commands back to DNs, so let's move this 
> processing outside of the RPC handler context and into a background thread.
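
A minimal sketch of the pattern being proposed is shown below: postponed blocks are only enqueued from the block-report path and are re-scanned in bounded batches by a background task. All names are illustrative, not the actual NameNode code.

{code}
import java.util.Queue;
import java.util.concurrent.ConcurrentLinkedQueue;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

/** Sketch only: keep postponed-block processing off the RPC handler path. */
class PostponedBlockRescanSketch {
  private final Queue<Long> postponedBlocks = new ConcurrentLinkedQueue<>();
  private final ScheduledExecutorService rescanner =
      Executors.newSingleThreadScheduledExecutor();

  void start() {
    // The background task processes a bounded batch per pass, so it never
    // ties up handler threads (or locks) for minutes at a time.
    rescanner.scheduleWithFixedDelay(this::rescanBatch, 3, 3, TimeUnit.SECONDS);
  }

  /** Called from the block-report path: record the block, do not process inline. */
  void postpone(long blockId) {
    postponedBlocks.add(blockId);
  }

  private void rescanBatch() {
    for (int i = 0; i < 1000; i++) {
      Long blockId = postponedBlocks.poll();
      if (blockId == null) {
        return;
      }
      // Re-check replication for blockId and queue invalidates if needed.
    }
  }

  void stop() {
    rescanner.shutdownNow();
  }
}
{code}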



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-7638) Small fix and few refinements for FSN#truncate

2015-01-20 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-7638?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14283864#comment-14283864
 ] 

Hudson commented on HDFS-7638:
--

FAILURE: Integrated in Hadoop-Hdfs-trunk-Java8 #76 (See 
[https://builds.apache.org/job/Hadoop-Hdfs-trunk-Java8/76/])
HDFS-7638: Small fix and few refinements for FSN#truncate. (yliu) (yliu: rev 
5a6c084f074990a1f412475b147fd4f040b57d57)
* 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FSNamesystem.java
* hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt


> Small fix and few refinements for FSN#truncate
> --
>
> Key: HDFS-7638
> URL: https://issues.apache.org/jira/browse/HDFS-7638
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: datanode, namenode
>Reporter: Yi Liu
>Assignee: Yi Liu
> Fix For: 3.0.0
>
> Attachments: HDFS-7638.001.patch
>
>
> *1.* 
> {code}
> removeBlocks(collectedBlocks);
> {code}
> should come after {{logSync}}, as we do in other FSN places (rename, delete, 
> write with overwrite); the reason is discussed in HDFS-2815 and 
> https://issues.apache.org/jira/browse/HDFS-6871?focusedCommentId=14110068&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-14110068
> *2.*
> {code}
> stat = FSDirStatAndListingOp.getFileInfo(dir, src, false,
> FSDirectory.isReservedRawName(src), true);
> {code}
> We'd better use {{dir.getAuditFileInfo}}, since it is only needed for the audit 
> log. If the audit log is not on, we don't need to get the file info.
> *3.*
> In {{truncateInternal}}, 
> {code}
> INodeFile file = iip.getLastINode().asFile();
> {code}
> is not necessary. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-5631) Expose interfaces required by FsDatasetSpi implementations

2015-01-20 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-5631?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14283860#comment-14283860
 ] 

Hudson commented on HDFS-5631:
--

FAILURE: Integrated in Hadoop-Hdfs-trunk-Java8 #76 (See 
[https://builds.apache.org/job/Hadoop-Hdfs-trunk-Java8/76/])
HDFS-5631. Change BlockMetadataHeader.readHeader(..), ChunkChecksum class and 
constructor to public; and fix FsDatasetSpi to use generic type instead of 
FsVolumeImpl.  Contributed by David Powell and Joe Pallas (szetszwo: rev 
4a4450836c8972480b9387b5e31bab57ae2b5baa)
* 
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/datanode/extdataset/ExternalDatasetImpl.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/datanode/extdataset/TestExternalDataset.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/fsdataset/impl/RamDiskAsyncLazyPersistService.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/datanode/extdataset/ExternalReplica.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/ChunkChecksum.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/BlockMetadataHeader.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/datanode/SimulatedFSDataset.java
* hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
* 
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/datanode/extdataset/ExternalRollingLogs.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/fsdataset/FsDatasetSpi.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/datanode/extdataset/ExternalVolumeImpl.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/datanode/extdataset/ExternalReplicaInPipeline.java


> Expose interfaces required by FsDatasetSpi implementations
> --
>
> Key: HDFS-5631
> URL: https://issues.apache.org/jira/browse/HDFS-5631
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: datanode
>Affects Versions: 3.0.0
>Reporter: David Powell
>Assignee: Joe Pallas
>Priority: Minor
> Fix For: 3.0.0
>
> Attachments: HDFS-5631-LazyPersist.patch, 
> HDFS-5631-LazyPersist.patch, HDFS-5631.patch, HDFS-5631.patch
>
>
> This sub-task addresses section 4.1 of the document attached to HDFS-5194,
> the exposure of interfaces needed by a FsDatasetSpi implementation.
> Specifically it makes ChunkChecksum public and BlockMetadataHeader's
> readHeader() and writeHeader() methods public.
> The changes to BlockReaderUtil (and related classes) discussed by section
> 4.1 are only needed if supporting short-circuit, and should be addressed
> as part of an effort to provide such support rather than this JIRA.
> To help ensure these changes are complete and are not regressed in the
> future, tests that gauge the accessibility (though *not* behavior)
> of interfaces needed by a FsDatasetSpi subclass are also included.
> These take the form of a dummy FsDatasetSpi subclass -- a successful
> compilation is effectively a pass.  Trivial unit tests are included so
> that there is something tangible to track.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-7637) Fix the check condition for reserved path

2015-01-20 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-7637?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14283863#comment-14283863
 ] 

Hudson commented on HDFS-7637:
--

FAILURE: Integrated in Hadoop-Hdfs-trunk-Java8 #76 (See 
[https://builds.apache.org/job/Hadoop-Hdfs-trunk-Java8/76/])
HDFS-7637. Fix the check condition for reserved path. Contributed by Yi Liu. 
(jing9: rev e843a0a8cee5c704a5d28cf14b5a4050094d341b)
* hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
* 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FSDirectory.java


> Fix the check condition for reserved path
> -
>
> Key: HDFS-7637
> URL: https://issues.apache.org/jira/browse/HDFS-7637
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: Yi Liu
>Assignee: Yi Liu
>Priority: Minor
> Fix For: 2.7.0
>
> Attachments: HDFS-7637.001.patch
>
>
> Currently the {{.reserved}} path check function is:
> {code}
> public static boolean isReservedName(String src) {
>   return src.startsWith(DOT_RESERVED_PATH_PREFIX);
> }
> {code}
> And {{DOT_RESERVED_PATH_PREFIX}} is {{/.reserved}}; it should be 
> {{/.reserved/}}. For example, if some other directory merely starts with the 
> prefix _/.reserved_, say _/.reservedpath_, the check wrongly treats it as reserved.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-7640) print NFS Client in the NFS log

2015-01-20 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-7640?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14283868#comment-14283868
 ] 

Hudson commented on HDFS-7640:
--

FAILURE: Integrated in Hadoop-Hdfs-trunk-Java8 #76 (See 
[https://builds.apache.org/job/Hadoop-Hdfs-trunk-Java8/76/])
HDFS-7640. print NFS Client in the NFS log. Contributed by Brandon Li. (wheat9: 
rev 5e5e35b1856293503124b77d5d4998a4d8e83082)
* 
hadoop-hdfs-project/hadoop-hdfs-nfs/src/main/java/org/apache/hadoop/hdfs/nfs/nfs3/RpcProgramNfs3.java
* hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt


> print NFS Client in the NFS log
> ---
>
> Key: HDFS-7640
> URL: https://issues.apache.org/jira/browse/HDFS-7640
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: nfs
>Affects Versions: 2.2.0
>Reporter: Brandon Li
>Assignee: Brandon Li
>Priority: Trivial
> Fix For: 2.7.0
>
> Attachments: HDFS-7640.001.patch
>
>
> Currently the hdfs-nfs logs do not have any information about NFS clients.
> When multiple clients are using NFS, it becomes hard to distinguish which 
> request came from which client.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-7638) Small fix and few refinements for FSN#truncate

2015-01-20 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-7638?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14283877#comment-14283877
 ] 

Hudson commented on HDFS-7638:
--

SUCCESS: Integrated in Hadoop-Hdfs-trunk #2011 (See 
[https://builds.apache.org/job/Hadoop-Hdfs-trunk/2011/])
HDFS-7638: Small fix and few refinements for FSN#truncate. (yliu) (yliu: rev 
5a6c084f074990a1f412475b147fd4f040b57d57)
* hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
* 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FSNamesystem.java


> Small fix and few refinements for FSN#truncate
> --
>
> Key: HDFS-7638
> URL: https://issues.apache.org/jira/browse/HDFS-7638
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: datanode, namenode
>Reporter: Yi Liu
>Assignee: Yi Liu
> Fix For: 3.0.0
>
> Attachments: HDFS-7638.001.patch
>
>
> *1.* 
> {code}
> removeBlocks(collectedBlocks);
> {code}
> should come after {{logSync}}, as we do in other FSN places (rename, delete, 
> write with overwrite); the reason is discussed in HDFS-2815 and 
> https://issues.apache.org/jira/browse/HDFS-6871?focusedCommentId=14110068&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-14110068
> *2.*
> {code}
> stat = FSDirStatAndListingOp.getFileInfo(dir, src, false,
> FSDirectory.isReservedRawName(src), true);
> {code}
> We'd better use {{dir.getAuditFileInfo}}, since it is only needed for the audit 
> log. If the audit log is not on, we don't need to get the file info.
> *3.*
> In {{truncateInternal}}, 
> {code}
> INodeFile file = iip.getLastINode().asFile();
> {code}
> is not necessary. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-7637) Fix the check condition for reserved path

2015-01-20 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-7637?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14283876#comment-14283876
 ] 

Hudson commented on HDFS-7637:
--

SUCCESS: Integrated in Hadoop-Hdfs-trunk #2011 (See 
[https://builds.apache.org/job/Hadoop-Hdfs-trunk/2011/])
HDFS-7637. Fix the check condition for reserved path. Contributed by Yi Liu. 
(jing9: rev e843a0a8cee5c704a5d28cf14b5a4050094d341b)
* 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FSDirectory.java
* hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt


> Fix the check condition for reserved path
> -
>
> Key: HDFS-7637
> URL: https://issues.apache.org/jira/browse/HDFS-7637
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: Yi Liu
>Assignee: Yi Liu
>Priority: Minor
> Fix For: 2.7.0
>
> Attachments: HDFS-7637.001.patch
>
>
> Currently the {{.reserved}} path check function is:
> {code}
> public static boolean isReservedName(String src) {
>   return src.startsWith(DOT_RESERVED_PATH_PREFIX);
> }
> {code}
> And {{DOT_RESERVED_PATH_PREFIX}} is {{/.reserved}}; it should be 
> {{/.reserved/}}. For example, if some other directory merely starts with the 
> prefix _/.reserved_, say _/.reservedpath_, the check wrongly treats it as reserved.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-7640) print NFS Client in the NFS log

2015-01-20 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-7640?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14283881#comment-14283881
 ] 

Hudson commented on HDFS-7640:
--

SUCCESS: Integrated in Hadoop-Hdfs-trunk #2011 (See 
[https://builds.apache.org/job/Hadoop-Hdfs-trunk/2011/])
HDFS-7640. print NFS Client in the NFS log. Contributed by Brandon Li. (wheat9: 
rev 5e5e35b1856293503124b77d5d4998a4d8e83082)
* 
hadoop-hdfs-project/hadoop-hdfs-nfs/src/main/java/org/apache/hadoop/hdfs/nfs/nfs3/RpcProgramNfs3.java
* hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt


> print NFS Client in the NFS log
> ---
>
> Key: HDFS-7640
> URL: https://issues.apache.org/jira/browse/HDFS-7640
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: nfs
>Affects Versions: 2.2.0
>Reporter: Brandon Li
>Assignee: Brandon Li
>Priority: Trivial
> Fix For: 2.7.0
>
> Attachments: HDFS-7640.001.patch
>
>
> Currently the hdfs-nfs logs do not have any information about NFS clients.
> When multiple clients are using NFS, it becomes hard to distinguish which 
> request came from which client.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-5631) Expose interfaces required by FsDatasetSpi implementations

2015-01-20 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-5631?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14283873#comment-14283873
 ] 

Hudson commented on HDFS-5631:
--

SUCCESS: Integrated in Hadoop-Hdfs-trunk #2011 (See 
[https://builds.apache.org/job/Hadoop-Hdfs-trunk/2011/])
HDFS-5631. Change BlockMetadataHeader.readHeader(..), ChunkChecksum class and 
constructor to public; and fix FsDatasetSpi to use generic type instead of 
FsVolumeImpl.  Contributed by David Powell and Joe Pallas (szetszwo: rev 
4a4450836c8972480b9387b5e31bab57ae2b5baa)
* 
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/datanode/extdataset/ExternalRollingLogs.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/ChunkChecksum.java
* hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
* 
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/datanode/extdataset/ExternalVolumeImpl.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/datanode/extdataset/TestExternalDataset.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/datanode/extdataset/ExternalReplica.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/fsdataset/impl/RamDiskAsyncLazyPersistService.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/datanode/extdataset/ExternalReplicaInPipeline.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/BlockMetadataHeader.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/datanode/extdataset/ExternalDatasetImpl.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/datanode/SimulatedFSDataset.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/fsdataset/FsDatasetSpi.java


> Expose interfaces required by FsDatasetSpi implementations
> --
>
> Key: HDFS-5631
> URL: https://issues.apache.org/jira/browse/HDFS-5631
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: datanode
>Affects Versions: 3.0.0
>Reporter: David Powell
>Assignee: Joe Pallas
>Priority: Minor
> Fix For: 3.0.0
>
> Attachments: HDFS-5631-LazyPersist.patch, 
> HDFS-5631-LazyPersist.patch, HDFS-5631.patch, HDFS-5631.patch
>
>
> This sub-task addresses section 4.1 of the document attached to HDFS-5194,
> the exposure of interfaces needed by a FsDatasetSpi implementation.
> Specifically it makes ChunkChecksum public and BlockMetadataHeader's
> readHeader() and writeHeader() methods public.
> The changes to BlockReaderUtil (and related classes) discussed by section
> 4.1 are only needed if supporting short-circuit, and should be addressed
> as part of an effort to provide such support rather than this JIRA.
> To help ensure these changes are complete and are not regressed in the
> future, tests that gauge the accessibility (though *not* behavior)
> of interfaces needed by a FsDatasetSpi subclass are also included.
> These take the form of a dummy FsDatasetSpi subclass -- a successful
> compilation is effectively a pass.  Trivial unit tests are included so
> that there is something tangible to track.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-7638) Small fix and few refinements for FSN#truncate

2015-01-20 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-7638?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14283902#comment-14283902
 ] 

Hudson commented on HDFS-7638:
--

FAILURE: Integrated in Hadoop-Mapreduce-trunk-Java8 #80 (See 
[https://builds.apache.org/job/Hadoop-Mapreduce-trunk-Java8/80/])
HDFS-7638: Small fix and few refinements for FSN#truncate. (yliu) (yliu: rev 
5a6c084f074990a1f412475b147fd4f040b57d57)
* 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FSNamesystem.java
* hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt


> Small fix and few refinements for FSN#truncate
> --
>
> Key: HDFS-7638
> URL: https://issues.apache.org/jira/browse/HDFS-7638
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: datanode, namenode
>Reporter: Yi Liu
>Assignee: Yi Liu
> Fix For: 3.0.0
>
> Attachments: HDFS-7638.001.patch
>
>
> *1.* 
> {code}
> removeBlocks(collectedBlocks);
> {code}
> should come after {{logSync}}, as we do in other FSN places (rename, delete, 
> write with overwrite); the reason is discussed in HDFS-2815 and 
> https://issues.apache.org/jira/browse/HDFS-6871?focusedCommentId=14110068&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-14110068
> *2.*
> {code}
> stat = FSDirStatAndListingOp.getFileInfo(dir, src, false,
> FSDirectory.isReservedRawName(src), true);
> {code}
> We'd better use {{dir.getAuditFileInfo}}, since it is only needed for the audit 
> log. If the audit log is not on, we don't need to get the file info.
> *3.*
> In {{truncateInternal}}, 
> {code}
> INodeFile file = iip.getLastINode().asFile();
> {code}
> is not necessary. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-7637) Fix the check condition for reserved path

2015-01-20 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-7637?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14283901#comment-14283901
 ] 

Hudson commented on HDFS-7637:
--

FAILURE: Integrated in Hadoop-Mapreduce-trunk-Java8 #80 (See 
[https://builds.apache.org/job/Hadoop-Mapreduce-trunk-Java8/80/])
HDFS-7637. Fix the check condition for reserved path. Contributed by Yi Liu. 
(jing9: rev e843a0a8cee5c704a5d28cf14b5a4050094d341b)
* hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
* 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FSDirectory.java


> Fix the check condition for reserved path
> -
>
> Key: HDFS-7637
> URL: https://issues.apache.org/jira/browse/HDFS-7637
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: Yi Liu
>Assignee: Yi Liu
>Priority: Minor
> Fix For: 2.7.0
>
> Attachments: HDFS-7637.001.patch
>
>
> Currently the {{.reserved}} path check function is:
> {code}
> public static boolean isReservedName(String src) {
>   return src.startsWith(DOT_RESERVED_PATH_PREFIX);
> }
> {code}
> And {{DOT_RESERVED_PATH_PREFIX}} is {{/.reserved}}; it should be 
> {{/.reserved/}}. For example, if some other directory merely starts with the 
> prefix _/.reserved_, say _/.reservedpath_, the check wrongly treats it as reserved.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-5631) Expose interfaces required by FsDatasetSpi implementations

2015-01-20 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-5631?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14283898#comment-14283898
 ] 

Hudson commented on HDFS-5631:
--

FAILURE: Integrated in Hadoop-Mapreduce-trunk-Java8 #80 (See 
[https://builds.apache.org/job/Hadoop-Mapreduce-trunk-Java8/80/])
HDFS-5631. Change BlockMetadataHeader.readHeader(..), ChunkChecksum class and 
constructor to public; and fix FsDatasetSpi to use generic type instead of 
FsVolumeImpl.  Contributed by David Powell and Joe Pallas (szetszwo: rev 
4a4450836c8972480b9387b5e31bab57ae2b5baa)
* 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/fsdataset/FsDatasetSpi.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/datanode/extdataset/ExternalReplica.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/datanode/extdataset/ExternalReplicaInPipeline.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/datanode/extdataset/TestExternalDataset.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/ChunkChecksum.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/datanode/extdataset/ExternalRollingLogs.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/datanode/extdataset/ExternalDatasetImpl.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/datanode/SimulatedFSDataset.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/datanode/extdataset/ExternalVolumeImpl.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/fsdataset/impl/RamDiskAsyncLazyPersistService.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/BlockMetadataHeader.java
* hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt


> Expose interfaces required by FsDatasetSpi implementations
> --
>
> Key: HDFS-5631
> URL: https://issues.apache.org/jira/browse/HDFS-5631
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: datanode
>Affects Versions: 3.0.0
>Reporter: David Powell
>Assignee: Joe Pallas
>Priority: Minor
> Fix For: 3.0.0
>
> Attachments: HDFS-5631-LazyPersist.patch, 
> HDFS-5631-LazyPersist.patch, HDFS-5631.patch, HDFS-5631.patch
>
>
> This sub-task addresses section 4.1 of the document attached to HDFS-5194,
> the exposure of interfaces needed by a FsDatasetSpi implementation.
> Specifically it makes ChunkChecksum public and BlockMetadataHeader's
> readHeader() and writeHeader() methods public.
> The changes to BlockReaderUtil (and related classes) discussed by section
> 4.1 are only needed if supporting short-circuit, and should be addressed
> as part of an effort to provide such support rather than this JIRA.
> To help ensure these changes are complete and are not regressed in the
> future, tests that gauge the accessibility (though *not* behavior)
> of interfaces needed by a FsDatasetSpi subclass are also included.
> These take the form of a dummy FsDatasetSpi subclass -- a successful
> compilation is effectively a pass.  Trivial unit tests are included so
> that there is something tangible to track.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-7640) print NFS Client in the NFS log

2015-01-20 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-7640?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14283905#comment-14283905
 ] 

Hudson commented on HDFS-7640:
--

FAILURE: Integrated in Hadoop-Mapreduce-trunk-Java8 #80 (See 
[https://builds.apache.org/job/Hadoop-Mapreduce-trunk-Java8/80/])
HDFS-7640. print NFS Client in the NFS log. Contributed by Brandon Li. (wheat9: 
rev 5e5e35b1856293503124b77d5d4998a4d8e83082)
* 
hadoop-hdfs-project/hadoop-hdfs-nfs/src/main/java/org/apache/hadoop/hdfs/nfs/nfs3/RpcProgramNfs3.java
* hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt


> print NFS Client in the NFS log
> ---
>
> Key: HDFS-7640
> URL: https://issues.apache.org/jira/browse/HDFS-7640
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: nfs
>Affects Versions: 2.2.0
>Reporter: Brandon Li
>Assignee: Brandon Li
>Priority: Trivial
> Fix For: 2.7.0
>
> Attachments: HDFS-7640.001.patch
>
>
> Currently the hdfs-nfs logs do not have any information about NFS clients.
> When multiple clients are using NFS, it becomes hard to distinguish which 
> request came from which client.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-7638) Small fix and few refinements for FSN#truncate

2015-01-20 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-7638?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14283925#comment-14283925
 ] 

Hudson commented on HDFS-7638:
--

FAILURE: Integrated in Hadoop-Mapreduce-trunk #2030 (See 
[https://builds.apache.org/job/Hadoop-Mapreduce-trunk/2030/])
HDFS-7638: Small fix and few refinements for FSN#truncate. (yliu) (yliu: rev 
5a6c084f074990a1f412475b147fd4f040b57d57)
* hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
* 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FSNamesystem.java


> Small fix and few refinements for FSN#truncate
> --
>
> Key: HDFS-7638
> URL: https://issues.apache.org/jira/browse/HDFS-7638
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: datanode, namenode
>Reporter: Yi Liu
>Assignee: Yi Liu
> Fix For: 3.0.0
>
> Attachments: HDFS-7638.001.patch
>
>
> *1.* 
> {code}
> removeBlocks(collectedBlocks);
> {code}
> should come after {{logSync}}, as we do in other FSN places (rename, delete, 
> write with overwrite); the reason is discussed in HDFS-2815 and 
> https://issues.apache.org/jira/browse/HDFS-6871?focusedCommentId=14110068&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-14110068
> *2.*
> {code}
> stat = FSDirStatAndListingOp.getFileInfo(dir, src, false,
> FSDirectory.isReservedRawName(src), true);
> {code}
> We'd better use {{dir.getAuditFileInfo}}, since it is only needed for the audit 
> log. If the audit log is not on, we don't need to get the file info.
> *3.*
> In {{truncateInternal}}, 
> {code}
> INodeFile file = iip.getLastINode().asFile();
> {code}
> is not necessary. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-5631) Expose interfaces required by FsDatasetSpi implementations

2015-01-20 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-5631?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14283921#comment-14283921
 ] 

Hudson commented on HDFS-5631:
--

FAILURE: Integrated in Hadoop-Mapreduce-trunk #2030 (See 
[https://builds.apache.org/job/Hadoop-Mapreduce-trunk/2030/])
HDFS-5631. Change BlockMetadataHeader.readHeader(..), ChunkChecksum class and 
constructor to public; and fix FsDatasetSpi to use generic type instead of 
FsVolumeImpl.  Contributed by David Powell and Joe Pallas (szetszwo: rev 
4a4450836c8972480b9387b5e31bab57ae2b5baa)
* 
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/datanode/SimulatedFSDataset.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/datanode/extdataset/TestExternalDataset.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/datanode/extdataset/ExternalVolumeImpl.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/ChunkChecksum.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/BlockMetadataHeader.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/datanode/extdataset/ExternalRollingLogs.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/datanode/extdataset/ExternalDatasetImpl.java
* hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
* 
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/datanode/extdataset/ExternalReplicaInPipeline.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/fsdataset/impl/RamDiskAsyncLazyPersistService.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/fsdataset/FsDatasetSpi.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/datanode/extdataset/ExternalReplica.java


> Expose interfaces required by FsDatasetSpi implementations
> --
>
> Key: HDFS-5631
> URL: https://issues.apache.org/jira/browse/HDFS-5631
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: datanode
>Affects Versions: 3.0.0
>Reporter: David Powell
>Assignee: Joe Pallas
>Priority: Minor
> Fix For: 3.0.0
>
> Attachments: HDFS-5631-LazyPersist.patch, 
> HDFS-5631-LazyPersist.patch, HDFS-5631.patch, HDFS-5631.patch
>
>
> This sub-task addresses section 4.1 of the document attached to HDFS-5194,
> the exposure of interfaces needed by a FsDatasetSpi implementation.
> Specifically it makes ChunkChecksum public and BlockMetadataHeader's
> readHeader() and writeHeader() methods public.
> The changes to BlockReaderUtil (and related classes) discussed by section
> 4.1 are only needed if supporting short-circuit, and should be addressed
> as part of an effort to provide such support rather than this JIRA.
> To help ensure these changes are complete and are not regressed in the
> future, tests that gauge the accessibility (though *not* behavior)
> of interfaces needed by a FsDatasetSpi subclass are also included.
> These take the form of a dummy FsDatasetSpi subclass -- a successful
> compilation is effectively a pass.  Trivial unit tests are included so
> that there is something tangible to track.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-7640) print NFS Client in the NFS log

2015-01-20 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-7640?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14283928#comment-14283928
 ] 

Hudson commented on HDFS-7640:
--

FAILURE: Integrated in Hadoop-Mapreduce-trunk #2030 (See 
[https://builds.apache.org/job/Hadoop-Mapreduce-trunk/2030/])
HDFS-7640. print NFS Client in the NFS log. Contributed by Brandon Li. (wheat9: 
rev 5e5e35b1856293503124b77d5d4998a4d8e83082)
* 
hadoop-hdfs-project/hadoop-hdfs-nfs/src/main/java/org/apache/hadoop/hdfs/nfs/nfs3/RpcProgramNfs3.java
* hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt


> print NFS Client in the NFS log
> ---
>
> Key: HDFS-7640
> URL: https://issues.apache.org/jira/browse/HDFS-7640
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: nfs
>Affects Versions: 2.2.0
>Reporter: Brandon Li
>Assignee: Brandon Li
>Priority: Trivial
> Fix For: 2.7.0
>
> Attachments: HDFS-7640.001.patch
>
>
> Currently the hdfs-nfs logs do not have any information about NFS clients.
> When multiple clients are using NFS, it becomes hard to distinguish which 
> request came from which client.
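
As a rough sketch of the kind of logging this asks for (the class and method names are 
placeholders, not the actual RpcProgramNfs3 code):
{code}
// Sketch only: include the remote client address in NFS request log messages.
import java.net.SocketAddress;

import org.apache.commons.logging.Log;
import org.apache.commons.logging.LogFactory;

class NfsRequestLogSketch {
  private static final Log LOG = LogFactory.getLog(NfsRequestLogSketch.class);

  void logRequest(String op, long fileHandle, SocketAddress client) {
    if (LOG.isDebugEnabled()) {
      // Including the remote address makes it possible to tell which NFS
      // client issued the request when several clients are mounted.
      LOG.debug("NFS " + op + " fileHandle: " + fileHandle
          + " client: " + client);
    }
  }
}
{code}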



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-7433) DatanodeManager#datanodeMap should be a HashMap, not a TreeMap, to optimize lookup performance

2015-01-20 Thread Kihwal Lee (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-7433?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14283918#comment-14283918
 ] 

Kihwal Lee commented on HDFS-7433:
--

The precommit did not run. Although it is a comment-only change over the 
previous patch, a new precommit run would be nice since the last run was over a 
month ago.

> DatanodeManager#datanodeMap should be a HashMap, not a TreeMap, to optimize 
> lookup performance
> --
>
> Key: HDFS-7433
> URL: https://issues.apache.org/jira/browse/HDFS-7433
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: namenode
>Affects Versions: 2.0.0-alpha, 3.0.0
>Reporter: Daryn Sharp
>Assignee: Daryn Sharp
>Priority: Critical
> Attachments: HDFS-7433.patch, HDFS-7433.patch, HDFS-7433.patch
>
>
> The datanode map is currently a {{TreeMap}}.  For many thousands of 
> datanodes, tree lookups are ~10X more expensive than a {{HashMap}}.  
> Insertions and removals are up to 100X more expensive.
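
For illustration, a rough standalone micro-benchmark of the lookup difference; the key 
format is just a stand-in for the datanode storage ID, and the absolute numbers depend 
on the JVM and hardware:
{code}
// Sketch only: compares TreeMap and HashMap lookup cost for many keys.
import java.util.HashMap;
import java.util.Map;
import java.util.TreeMap;

public class MapLookupDemo {
  public static void main(String[] args) {
    int n = 100_000;
    Map<String, Integer> tree = new TreeMap<>();
    Map<String, Integer> hash = new HashMap<>();
    for (int i = 0; i < n; i++) {
      String uuid = "datanode-uuid-" + i;   // stands in for the storage ID key
      tree.put(uuid, i);
      hash.put(uuid, i);
    }
    long t0 = System.nanoTime();
    for (int i = 0; i < n; i++) tree.get("datanode-uuid-" + i);
    long t1 = System.nanoTime();
    for (int i = 0; i < n; i++) hash.get("datanode-uuid-" + i);
    long t2 = System.nanoTime();
    System.out.println("TreeMap lookups: " + (t1 - t0) / 1e6 + " ms");
    System.out.println("HashMap lookups: " + (t2 - t1) / 1e6 + " ms");
  }
}
{code}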



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Comment Edited] (HDFS-7433) DatanodeManager#datanodeMap should be a HashMap, not a TreeMap, to optimize lookup performance

2015-01-20 Thread Kihwal Lee (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-7433?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14283918#comment-14283918
 ] 

Kihwal Lee edited comment on HDFS-7433 at 1/20/15 3:24 PM:
---

The precommit did not run, so I just kicked it. Although it is a comment-only 
change over the previous patch, a new precommit run would be nice since the last 
run was over a month ago.


was (Author: kihwal):
The precommit did not run. Although it is a comment only change over the 
previous patch, a new precommit run will be nice since the last run was over a 
month ago.

> DatanodeManager#datanodeMap should be a HashMap, not a TreeMap, to optimize 
> lookup performance
> --
>
> Key: HDFS-7433
> URL: https://issues.apache.org/jira/browse/HDFS-7433
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: namenode
>Affects Versions: 2.0.0-alpha, 3.0.0
>Reporter: Daryn Sharp
>Assignee: Daryn Sharp
>Priority: Critical
> Attachments: HDFS-7433.patch, HDFS-7433.patch, HDFS-7433.patch
>
>
> The datanode map is currently a {{TreeMap}}.  For many thousands of 
> datanodes, tree lookups are ~10X more expensive than a {{HashMap}}.  
> Insertions and removals are up to 100X more expensive.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-7637) Fix the check condition for reserved path

2015-01-20 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-7637?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14283924#comment-14283924
 ] 

Hudson commented on HDFS-7637:
--

FAILURE: Integrated in Hadoop-Mapreduce-trunk #2030 (See 
[https://builds.apache.org/job/Hadoop-Mapreduce-trunk/2030/])
HDFS-7637. Fix the check condition for reserved path. Contributed by Yi Liu. 
(jing9: rev e843a0a8cee5c704a5d28cf14b5a4050094d341b)
* hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
* 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FSDirectory.java


> Fix the check condition for reserved path
> -
>
> Key: HDFS-7637
> URL: https://issues.apache.org/jira/browse/HDFS-7637
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: Yi Liu
>Assignee: Yi Liu
>Priority: Minor
> Fix For: 2.7.0
>
> Attachments: HDFS-7637.001.patch
>
>
> Currently the {{.reserved}} path check function is:
> {code}
> public static boolean isReservedName(String src) {
>   return src.startsWith(DOT_RESERVED_PATH_PREFIX);
> }
> {code}
> And {{DOT_RESERVED_PATH_PREFIX}} is {{/.reserved}}; it should be 
> {{/.reserved/}}. For example, if some other directory name is prefixed with 
> _/.reserved_, say _/.reservedpath_, then the check wrongly treats it as reserved.
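
For illustration, a minimal standalone demonstration of the problem and one possible 
tightened check; this is a sketch, not the actual FSDirectory code:
{code}
// Sketch only: shows why a bare prefix check misclassifies "/.reservedpath".
public class ReservedPathCheckSketch {
  static final String DOT_RESERVED_PATH_PREFIX = "/.reserved";

  // Current check: matches any path that merely starts with the prefix,
  // so "/.reservedpath" is wrongly treated as reserved.
  static boolean isReservedNameCurrent(String src) {
    return src.startsWith(DOT_RESERVED_PATH_PREFIX);
  }

  // Tightened check: reserved only if the path is exactly "/.reserved" or
  // lives under "/.reserved/".
  static boolean isReservedNameFixed(String src) {
    return src.equals(DOT_RESERVED_PATH_PREFIX)
        || src.startsWith(DOT_RESERVED_PATH_PREFIX + "/");
  }

  public static void main(String[] args) {
    System.out.println(isReservedNameCurrent("/.reservedpath/file")); // true (wrong)
    System.out.println(isReservedNameFixed("/.reservedpath/file"));   // false
    System.out.println(isReservedNameFixed("/.reserved/raw/file"));   // true
  }
}
{code}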



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-7339) Allocating and persisting block groups in NameNode

2015-01-20 Thread Zhe Zhang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-7339?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhe Zhang updated HDFS-7339:

Attachment: HDFS-7339-004.patch

Updated patch with some polishing edits. The Jenkins error is a timeout and 
looks unrelated.

> Allocating and persisting block groups in NameNode
> --
>
> Key: HDFS-7339
> URL: https://issues.apache.org/jira/browse/HDFS-7339
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Reporter: Zhe Zhang
>Assignee: Zhe Zhang
> Attachments: HDFS-7339-001.patch, HDFS-7339-002.patch, 
> HDFS-7339-003.patch, HDFS-7339-004.patch, Meta-striping.jpg, NN-stripping.jpg
>
>
> All erasure codec operations center around the concept of _block group_; they 
> are formed in initial encoding and looked up in recoveries and conversions. A 
> lightweight class {{BlockGroup}} is created to record the original and parity 
> blocks in a coding group, as well as a pointer to the codec schema (pluggable 
> codec schemas will be supported in HDFS-7337). With the striping layout, the 
> HDFS client needs to operate on all blocks in a {{BlockGroup}} concurrently. 
> Therefore we propose to extend a file’s inode to switch between _contiguous_ 
> and _striping_ modes, with the current mode recorded in a binary flag. An 
> array of BlockGroups (or BlockGroup IDs) is added, which remains empty for 
> “traditional” HDFS files with contiguous block layout.
> The NameNode creates and maintains {{BlockGroup}} instances through the new 
> {{ECManager}} component; the attached figure has an illustration of the 
> architecture. As a simple example, when a {_Striping+EC_} file is created and 
> written to, it will serve requests from the client to allocate new 
> {{BlockGroups}} and store them under the {{INodeFile}}. In the current phase, 
> {{BlockGroups}} are allocated both in initial online encoding and in the 
> conversion from replication to EC. {{ECManager}} also facilitates the lookup 
> of {{BlockGroup}} information for block recovery work.
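
For illustration, a minimal sketch of the lightweight {{BlockGroup}} described above; 
the field and type names are placeholders and the real class on the feature branch 
may well differ:
{code}
// Sketch only: records the original and parity blocks of a coding group plus
// a pointer to the codec schema.
import java.util.Arrays;

class BlockGroupSketch {
  private final long groupId;           // ID shared by the whole group
  private final long[] dataBlockIds;    // the original (data) blocks
  private final long[] parityBlockIds;  // the parity blocks produced by the codec
  private final String schemaName;      // pointer to the pluggable codec schema

  BlockGroupSketch(long groupId, long[] dataBlockIds, long[] parityBlockIds,
                   String schemaName) {
    this.groupId = groupId;
    this.dataBlockIds = Arrays.copyOf(dataBlockIds, dataBlockIds.length);
    this.parityBlockIds = Arrays.copyOf(parityBlockIds, parityBlockIds.length);
    this.schemaName = schemaName;
  }

  long getGroupId() { return groupId; }
  long[] getDataBlockIds() { return dataBlockIds.clone(); }
  long[] getParityBlockIds() { return parityBlockIds.clone(); }
  String getSchemaName() { return schemaName; }
}
{code}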



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Closed] (HDFS-7631) Deep learn about hadoop

2015-01-20 Thread Arpit Agarwal (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-7631?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Arpit Agarwal closed HDFS-7631.
---

> Deep learn about hadoop
> ---
>
> Key: HDFS-7631
> URL: https://issues.apache.org/jira/browse/HDFS-7631
> Project: Hadoop HDFS
>  Issue Type: Wish
>Reporter: frank
>Priority: Trivial
>
> I want to learn more about the Hadoop code. Are there any books that can help 
> me?



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-7641) Update archival storage user doc for list/set/get block storage policies

2015-01-20 Thread Jing Zhao (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-7641?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14284147#comment-14284147
 ] 

Jing Zhao commented on HDFS-7641:
-

Thanks for working on this, Yi! One minor comment: the current commands use 
"-path <path>" and "-policy <policy>" to specify the path and policy. We need to 
update the doc accordingly. 

> Update archival storage user doc for list/set/get block storage policies
> 
>
> Key: HDFS-7641
> URL: https://issues.apache.org/jira/browse/HDFS-7641
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: documentation
>Affects Versions: 2.7.0
>Reporter: Yi Liu
>Assignee: Yi Liu
>Priority: Minor
> Attachments: HDFS-7641.001.patch
>
>
> After HDFS-7323, the list/set/get block storage policy commands are 
> different; we should update the corresponding user doc.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Assigned] (HDFS-7639) Remove the limitation imposed by dfs.balancer.moverThreads

2015-01-20 Thread Chen He (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-7639?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chen He reassigned HDFS-7639:
-

Assignee: Chen He

> Remove the limitation imposed by dfs.balancer.moverThreads
> --
>
> Key: HDFS-7639
> URL: https://issues.apache.org/jira/browse/HDFS-7639
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: balancer & mover
>Reporter: Tsz Wo Nicholas Sze
>Assignee: Chen He
>
> In Balancer/Mover, the number of dispatcher threads 
> (dfs.balancer.moverThreads) limits the number of concurrent moves.  Each 
> dispatcher thread sends a request to a datanode and then blocks waiting for 
> the response.  We should remove this limitation.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-7639) Remove the limitation imposed by dfs.balancer.moverThreads

2015-01-20 Thread Chen He (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-7639?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14284182#comment-14284182
 ] 

Chen He commented on HDFS-7639:
---

Hi [~szetszwo], I am a little bit confused about this JIRA. In HDFS-6595, 
maximum thread limits were added to avoid the balancer overwhelming the datanode. 
If I understand correctly, do you mean we should weaken the limits and make the 
moves non-blocking? 

> Remove the limitation imposed by dfs.balancer.moverThreads
> --
>
> Key: HDFS-7639
> URL: https://issues.apache.org/jira/browse/HDFS-7639
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: balancer & mover
>Reporter: Tsz Wo Nicholas Sze
>Assignee: Chen He
>
> In Balancer/Mover, the number of dispatcher threads 
> (dfs.balancer.moverThreads) limits the number of concurrent moves.  Each 
> dispatcher thread sends a request to a datanode and then blocks waiting for 
> the response.  We should remove this limitation.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-7433) DatanodeManager#datanodeMap should be a HashMap, not a TreeMap, to optimize lookup performance

2015-01-20 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-7433?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14284197#comment-14284197
 ] 

Hadoop QA commented on HDFS-7433:
-

{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12692603/HDFS-7433.patch
  against trunk revision c94c0d2.

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:red}-1 tests included{color}.  The patch doesn't appear to include 
any new or modified tests.
Please justify why no new tests are needed for this 
patch.
Also please list what manual steps were performed to 
verify this patch.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  There were no new javadoc warning messages.

{color:green}+1 eclipse:eclipse{color}.  The patch built with 
eclipse:eclipse.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 2.0.3) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:red}-1 core tests{color}.  The patch failed these unit tests in 
hadoop-hdfs-project/hadoop-hdfs:

  
org.apache.hadoop.hdfs.server.blockmanagement.TestDatanodeManager

Test results: 
https://builds.apache.org/job/PreCommit-HDFS-Build/9276//testReport/
Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/9276//console

This message is automatically generated.

> DatanodeManager#datanodeMap should be a HashMap, not a TreeMap, to optimize 
> lookup performance
> --
>
> Key: HDFS-7433
> URL: https://issues.apache.org/jira/browse/HDFS-7433
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: namenode
>Affects Versions: 2.0.0-alpha, 3.0.0
>Reporter: Daryn Sharp
>Assignee: Daryn Sharp
>Priority: Critical
> Attachments: HDFS-7433.patch, HDFS-7433.patch, HDFS-7433.patch
>
>
> The datanode map is currently a {{TreeMap}}.  For many thousands of 
> datanodes, tree lookups are ~10X more expensive than a {{HashMap}}.  
> Insertions and removals are up to 100X more expensive.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-7587) Edit log corruption can happen if append fails with a quota violation

2015-01-20 Thread Xiaoyu Yao (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-7587?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14284272#comment-14284272
 ] 

Xiaoyu Yao commented on HDFS-7587:
--

Agree with [~szetszwo] that we should use verifyQuota() instead of 
updateSpaceConsumed(). Also, can we add a unit test to verify the correctness 
of the quota usage after the exception is thrown for this case?

> Edit log corruption can happen if append fails with a quota violation
> -
>
> Key: HDFS-7587
> URL: https://issues.apache.org/jira/browse/HDFS-7587
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: namenode
>Reporter: Kihwal Lee
>Assignee: Daryn Sharp
>Priority: Blocker
> Attachments: HDFS-7587.patch
>
>
> We have seen a standby namenode crashing due to edit log corruption. It was 
> complaining that {{OP_CLOSE}} cannot be applied because the file is not 
> under-construction.
> When a client was trying to append to the file, the remaining space quota was 
> very small. This caused a failure in {{prepareFileForWrite()}}, but after the 
> inode was already converted for writing and a lease added. Since these were 
> not undone when the quota violation was detected, the file was left in 
> under-construction with an active lease without edit logging {{OP_ADD}}.
> A subsequent {{append()}} eventually caused a lease recovery after the soft 
> limit period. This resulted in {{commitBlockSynchronization()}}, which closed 
> the file with {{OP_CLOSE}} being logged.  Since there was no corresponding 
> {{OP_ADD}}, edit replaying could not apply this.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-3443) Unable to catch up edits during standby to active switch due to NPE

2015-01-20 Thread Tsz Wo Nicholas Sze (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-3443?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14284276#comment-14284276
 ] 

Tsz Wo Nicholas Sze commented on HDFS-3443:
---

Thanks Vinay.  Some comments on the patch:
- NameNode.started should be volatile or use AtomicBoolean.
- Need to add checkNNStartup() for the rpc methods below.
-* getGroupsForUser(String)
-* refresh(String, String[])
-* refreshCallQueue()
-* refreshSuperUserGroupsConfiguration()
-* refreshUserToGroupsMappings()
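
For illustration, a minimal sketch of the suggested guard using an {{AtomicBoolean}}; 
the names are placeholders, and the real NameNode code may prefer a retriable 
exception type over a plain IOException:
{code}
// Sketch only: guard RPC entry points until startup has completed.
import java.io.IOException;
import java.util.concurrent.atomic.AtomicBoolean;

class NameNodeStartupGuardSketch {
  private final AtomicBoolean started = new AtomicBoolean(false);

  void markStarted() {
    started.set(true);               // set once initialization is complete
  }

  void checkNNStartup() throws IOException {
    if (!started.get()) {
      // The real implementation could throw a retriable exception so that
      // clients back off and retry instead of failing with an NPE.
      throw new IOException("NameNode still not started");
    }
  }

  // Example of guarding one of the RPC methods listed above.
  String[] getGroupsForUser(String user) throws IOException {
    checkNNStartup();
    return new String[0];            // placeholder body
  }
}
{code}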


> Unable to catch up edits during standby to active switch due to NPE
> ---
>
> Key: HDFS-3443
> URL: https://issues.apache.org/jira/browse/HDFS-3443
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: auto-failover, ha
>Reporter: suja s
>Assignee: Vinayakumar B
> Attachments: HDFS-3443-003.patch, HDFS-3443-004.patch, 
> HDFS-3443-005.patch, HDFS-3443-006.patch, HDFS-3443_1.patch, HDFS-3443_1.patch
>
>
> Start NN
> Let NN standby services be started.
> Before the editLogTailer is initialised start ZKFC and allow the 
> activeservices start to proceed further.
> Here editLogTailer.catchupDuringFailover() will throw NPE.
> void startActiveServices() throws IOException {
> LOG.info("Starting services required for active state");
> writeLock();
> try {
>   FSEditLog editLog = dir.fsImage.getEditLog();
>   
>   if (!editLog.isOpenForWrite()) {
> // During startup, we're already open for write during initialization.
> editLog.initJournalsForWrite();
> // May need to recover
> editLog.recoverUnclosedStreams();
> 
> LOG.info("Catching up to latest edits from old active before " +
> "taking over writer role in edits logs.");
> editLogTailer.catchupDuringFailover();
> {noformat}
> 2012-05-18 16:51:27,585 WARN org.apache.hadoop.ipc.Server: IPC Server 
> Responder, call org.apache.hadoop.ha.HAServiceProtocol.getServiceStatus from 
> XX.XX.XX.55:58003: output error
> 2012-05-18 16:51:27,586 WARN org.apache.hadoop.ipc.Server: IPC Server handler 
> 8 on 8020, call org.apache.hadoop.ha.HAServiceProtocol.transitionToActive 
> from XX.XX.XX.55:58004: error: java.lang.NullPointerException
> java.lang.NullPointerException
>   at 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.startActiveServices(FSNamesystem.java:602)
>   at 
> org.apache.hadoop.hdfs.server.namenode.NameNode$NameNodeHAContext.startActiveServices(NameNode.java:1287)
>   at 
> org.apache.hadoop.hdfs.server.namenode.ha.ActiveState.enterState(ActiveState.java:61)
>   at 
> org.apache.hadoop.hdfs.server.namenode.ha.HAState.setStateInternal(HAState.java:63)
>   at 
> org.apache.hadoop.hdfs.server.namenode.ha.StandbyState.setState(StandbyState.java:49)
>   at 
> org.apache.hadoop.hdfs.server.namenode.NameNode.transitionToActive(NameNode.java:1219)
>   at 
> org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.transitionToActive(NameNodeRpcServer.java:978)
>   at 
> org.apache.hadoop.ha.protocolPB.HAServiceProtocolServerSideTranslatorPB.transitionToActive(HAServiceProtocolServerSideTranslatorPB.java:107)
>   at 
> org.apache.hadoop.ha.proto.HAServiceProtocolProtos$HAServiceProtocolService$2.callBlockingMethod(HAServiceProtocolProtos.java:3633)
>   at 
> org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:427)
>   at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:916)
>   at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1692)
>   at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1688)
>   at java.security.AccessController.doPrivileged(Native Method)
>   at javax.security.auth.Subject.doAs(Subject.java:396)
>   at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1232)
>   at org.apache.hadoop.ipc.Server$Handler.run(Server.java:1686)
> 2012-05-18 16:51:27,586 INFO org.apache.hadoop.ipc.Server: IPC Server handler 
> 9 on 8020 caught an exception
> java.nio.channels.ClosedChannelException
>   at 
> sun.nio.ch.SocketChannelImpl.ensureWriteOpen(SocketChannelImpl.java:133)
>   at sun.nio.ch.SocketChannelImpl.write(SocketChannelImpl.java:324)
>   at org.apache.hadoop.ipc.Server.channelWrite(Server.java:2092)
>   at org.apache.hadoop.ipc.Server.access$2000(Server.java:107)
>   at 
> org.apache.hadoop.ipc.Server$Responder.processResponse(Server.java:930)
>   at org.apache.hadoop.ipc.Server$Responder.doRespond(Server.java:994)
>   at org.apache.hadoop.ipc.Server$Handler.run(Server.java:1738)
> {noformat}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-7639) Remove the limitation imposed by dfs.balancer.moverThreads

2015-01-20 Thread Tsz Wo Nicholas Sze (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-7639?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14284292#comment-14284292
 ] 

Tsz Wo Nicholas Sze commented on HDFS-7639:
---

HDFS-6595 adds dfs.datanode.balance.max.concurrent.moves, which is a datanode 
configuration.  This JIRA is about dfs.balancer.moverThreads, which is a 
Balancer configuration.

The Balancer uses a dispatcher thread to send each request to a datanode.  The thread 
is blocked until the replica transfer is completed, so the number of dispatcher 
threads (dfs.balancer.moverThreads) limits the number of concurrent moves.

I suggest changing the dispatcher threads so that they only send requests and do not 
wait for the responses.  Then, use a separate thread pool to wait for the 
responses using non-blocking I/O.
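
For illustration, a minimal sketch of separating request sending from response 
handling with two pools; this is a simplified stand-in, not the actual 
Balancer/Dispatcher code, and uses futures rather than real non-blocking I/O:
{code}
// Sketch only: the sender pool never blocks on responses, so a small pool can
// keep many moves in flight; a separate pool handles responses as they arrive.
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

class MoveDispatcherSketch {
  private final ExecutorService senders = Executors.newFixedThreadPool(8);
  private final ExecutorService responseHandlers = Executors.newFixedThreadPool(2);

  void dispatchMove(Runnable sendRequest, Runnable handleResponse) {
    CompletableFuture
        .runAsync(sendRequest, senders)          // issue the request and return
        .thenRunAsync(handleResponse, responseHandlers);  // handle reply later
  }
}
{code}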

> Remove the limitation imposed by dfs.balancer.moverThreads
> --
>
> Key: HDFS-7639
> URL: https://issues.apache.org/jira/browse/HDFS-7639
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: balancer & mover
>Reporter: Tsz Wo Nicholas Sze
>Assignee: Chen He
>
> In Balancer/Mover, the number of dispatcher threads 
> (dfs.balancer.moverThreads) limits the number of concurrent moves.  Each 
> dispatcher thread sends a request to a datanode and then blocks waiting for 
> the response.  We should remove this limitation.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-7634) Lazy persist (memory) file should not support truncate currently

2015-01-20 Thread Arpit Agarwal (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-7634?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14284312#comment-14284312
 ] 

Arpit Agarwal commented on HDFS-7634:
-

+1 for the patch, thanks for catching this [~hitliuyi].

[~shv] if you have no objections I can commit it today evening. Thanks.

> Lazy persist (memory) file should not support truncate currently
> 
>
> Key: HDFS-7634
> URL: https://issues.apache.org/jira/browse/HDFS-7634
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: datanode, namenode
>Reporter: Yi Liu
>Assignee: Yi Liu
> Fix For: 3.0.0
>
> Attachments: HDFS-7634.001.patch, HDFS-7634.002.patch
>
>
> Similar to {{append}}, lazy persist (memory) files should not support 
> truncate currently. Quoting the reason from the HDFS-6581 design doc:
> {quote}
> Appends to files created with the LAZY_PERSIST flag will not be allowed in the 
> initial implementation to avoid the complexity of keeping in-memory and 
> on-disk replicas in sync on a given DataNode.
> {quote}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (HDFS-7642) NameNode should periodically log DataNode decommissioning progress

2015-01-20 Thread Zhe Zhang (JIRA)
Zhe Zhang created HDFS-7642:
---

 Summary: NameNode should periodically log DataNode decommissioning 
progress
 Key: HDFS-7642
 URL: https://issues.apache.org/jira/browse/HDFS-7642
 Project: Hadoop HDFS
  Issue Type: Improvement
Reporter: Zhe Zhang
Assignee: Zhe Zhang
Priority: Minor


We've seen a case where decommissioning was stuck because some files had 
more replicas than DNs. HDFS-5662 fixes this particular issue, but there are 
other cases where the decommissioning process might get stuck or slow down. 
Some monitoring / logging will help in debugging those issues.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Assigned] (HDFS-7611) TestFileTruncate.testTruncateEditLogLoad times out waiting for Mini HDFS Cluster to start

2015-01-20 Thread Byron Wong (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-7611?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Byron Wong reassigned HDFS-7611:


Assignee: Byron Wong

> TestFileTruncate.testTruncateEditLogLoad times out waiting for Mini HDFS 
> Cluster to start
> -
>
> Key: HDFS-7611
> URL: https://issues.apache.org/jira/browse/HDFS-7611
> Project: Hadoop HDFS
>  Issue Type: Bug
>Affects Versions: 3.0.0
>Reporter: Konstantin Shvachko
>Assignee: Byron Wong
> Attachments: testTruncateEditLogLoad.log
>
>
> I've seen it failing on Jenkins a couple of times. Somehow the cluster does not 
> come up ready after the NN restart.
> Not sure if it is truncate-specific, as I've seen the same behaviour with other 
> tests that restart the NameNode.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-7634) Lazy persist (memory) file should not support truncate currently

2015-01-20 Thread Konstantin Shvachko (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-7634?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14284320#comment-14284320
 ] 

Konstantin Shvachko commented on HDFS-7634:
---

No objections. +1

> Lazy persist (memory) file should not support truncate currently
> 
>
> Key: HDFS-7634
> URL: https://issues.apache.org/jira/browse/HDFS-7634
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: datanode, namenode
>Reporter: Yi Liu
>Assignee: Yi Liu
> Fix For: 3.0.0
>
> Attachments: HDFS-7634.001.patch, HDFS-7634.002.patch
>
>
> Similar to {{append}}, lazy persist (memory) files should not support 
> truncate currently. Quoting the reason from the HDFS-6581 design doc:
> {quote}
> Appends to files created with the LAZY_PERSIST flag will not be allowed in the 
> initial implementation to avoid the complexity of keeping in-memory and 
> on-disk replicas in sync on a given DataNode.
> {quote}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-7633) When Datanode has too many blocks, BlockPoolSliceScanner.getNewBlockScanTime throws IllegalArgumentException

2015-01-20 Thread Arpit Agarwal (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-7633?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14284323#comment-14284323
 ] 

Arpit Agarwal commented on HDFS-7633:
-

Thanks for catching this [~walter.k.su] and submitting a patch. Not sure why 
Jenkins does not like the patch; it applies fine for me with 'git apply'. Could 
you try regenerating the patch simply with 'git diff'?

+1 for the change once we get a Jenkins run.




> When Datanode has too many blocks, BlockPoolSliceScanner.getNewBlockScanTime 
> throws IllegalArgumentException
> 
>
> Key: HDFS-7633
> URL: https://issues.apache.org/jira/browse/HDFS-7633
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: datanode
>Affects Versions: 2.6.0
>Reporter: Walter Su
>Assignee: Walter Su
>Priority: Minor
> Attachments: h7633_20150116.patch
>
>
> issue:
> When the total number of blocks on one of my DNs reaches 33554432, it refuses 
> to accept more blocks. This is the error:
> 2015-01-16 15:21:44,571 | ERROR | DataXceiver for client  at /172.1.1.8:50490 
> [Receiving block 
> BP-1976278848-172.1.1.2-1419846518085:blk_1221043436_147936990] | 
> datasight-198:25009:DataXceiver error processing WRITE_BLOCK operation  src: 
> /172.1.1.8:50490 dst: /172.1.1.11:25009 | 
> org.apache.hadoop.hdfs.server.datanode.DataXceiver.run(DataXceiver.java:250)
> java.lang.IllegalArgumentException: n must be positive
> at java.util.Random.nextInt(Random.java:300)
> at 
> org.apache.hadoop.hdfs.server.datanode.BlockPoolSliceScanner.getNewBlockScanTime(BlockPoolSliceScanner.java:263)
> at 
> org.apache.hadoop.hdfs.server.datanode.BlockPoolSliceScanner.addBlock(BlockPoolSliceScanner.java:276)
> at 
> org.apache.hadoop.hdfs.server.datanode.DataBlockScanner.addBlock(DataBlockScanner.java:193)
> at 
> org.apache.hadoop.hdfs.server.datanode.DataNode.closeBlock(DataNode.java:1733)
> at 
> org.apache.hadoop.hdfs.server.datanode.DataXceiver.writeBlock(DataXceiver.java:765)
> at 
> org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.opWriteBlock(Receiver.java:124)
> at 
> org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.processOp(Receiver.java:71)
> at 
> org.apache.hadoop.hdfs.server.datanode.DataXceiver.run(DataXceiver.java:232)
> at java.lang.Thread.run(Thread.java:745)
> analysis:
> in function 
> org.apache.hadoop.hdfs.server.datanode.BlockPoolSliceScanner.getNewBlockScanTime()
> when blockMap.size() is too big,
> Math.max(blockMap.size(),1)  * 600  is int type, and negative
> Math.max(blockMap.size(),1) * 600 * 1000L is long type, and negative
> (int)period  is Integer.MIN_VALUE
> Math.abs((int)period) is Integer.MIN_VALUE , which is negative
> DFSUtil.getRandom().nextInt(periodInt)  will throw IllegalArgumentException
> I use Java HotSpot (build 1.7.0_05-b05)
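
For illustration, a standalone demonstration of the overflow described above and one 
possible fix that keeps the arithmetic in {{long}}; this is a sketch, not the actual 
BlockPoolSliceScanner code:
{code}
// Sketch only: reproduces the int overflow and shows long arithmetic instead.
import java.util.Random;

public class ScanPeriodOverflowDemo {
  public static void main(String[] args) {
    int blockCount = 33554432;                 // blocks on the datanode
    long broken = blockCount * 600 * 1000L;    // int*int overflows before *1000L
    long fixed  = blockCount * 600L * 1000L;   // promote to long first
    System.out.println("broken period = " + broken);   // negative
    System.out.println("fixed period  = " + fixed);    // 20,132,659,200,000

    // Random#nextInt(n) requires n > 0, hence the IllegalArgumentException.
    int periodInt = Math.abs((int) broken);    // here this is Integer.MIN_VALUE
    if (periodInt > 0) {
      System.out.println(new Random().nextInt(periodInt));
    } else {
      System.out.println("would throw IllegalArgumentException: n must be positive");
    }
  }
}
{code}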



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-7634) Disallow truncation of Lazy persist files

2015-01-20 Thread Arpit Agarwal (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-7634?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Arpit Agarwal updated HDFS-7634:

Summary: Disallow truncation of Lazy persist files  (was: Lazy persist 
(memory) file should not support truncate currently)

> Disallow truncation of Lazy persist files
> -
>
> Key: HDFS-7634
> URL: https://issues.apache.org/jira/browse/HDFS-7634
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: datanode, namenode
>Reporter: Yi Liu
>Assignee: Yi Liu
> Fix For: 3.0.0
>
> Attachments: HDFS-7634.001.patch, HDFS-7634.002.patch
>
>
> Similar to {{append}}, lazy persist (memory) files should not support 
> truncate currently. Quoting the reason from the HDFS-6581 design doc:
> {quote}
> Appends to files created with the LAZY_PERSIST flag will not be allowed in the 
> initial implementation to avoid the complexity of keeping in-memory and 
> on-disk replicas in sync on a given DataNode.
> {quote}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (HDFS-7643) Test case to ensure lazy persist files cannot be truncated

2015-01-20 Thread Arpit Agarwal (JIRA)
Arpit Agarwal created HDFS-7643:
---

 Summary: Test case to ensure lazy persist files cannot be truncated
 Key: HDFS-7643
 URL: https://issues.apache.org/jira/browse/HDFS-7643
 Project: Hadoop HDFS
  Issue Type: Sub-task
  Components: test
Affects Versions: 2.7.0
Reporter: Arpit Agarwal


Task to add a test case for HDFS-7634. Ensure that an attempt to truncate a file 
created with the LAZY_PERSIST policy is rejected by the NameNode. For reference see 
{{TestLazyPersistFiles#testAppendIsDenied}}.
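
For illustration, a rough standalone sketch of such a test; the real test would reuse 
the fixtures and helpers of {{TestLazyPersistFiles}}, so the details here (block size, 
replication, exception type) are assumptions rather than the final implementation:
{code}
// Sketch only: create a LAZY_PERSIST file, then expect truncate to be refused.
import java.io.IOException;
import java.util.EnumSet;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.CreateFlag;
import org.apache.hadoop.fs.FSDataOutputStream;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.fs.permission.FsPermission;
import org.apache.hadoop.hdfs.DistributedFileSystem;
import org.apache.hadoop.hdfs.MiniDFSCluster;
import org.junit.Assert;
import org.junit.Test;

public class TestTruncateIsDeniedSketch {
  @Test
  public void testTruncateIsDenied() throws Exception {
    Configuration conf = new Configuration();
    MiniDFSCluster cluster =
        new MiniDFSCluster.Builder(conf).numDataNodes(1).build();
    try {
      cluster.waitActive();
      DistributedFileSystem fs = cluster.getFileSystem();
      Path path = new Path("/lazyPersistFile");
      // Create a LAZY_PERSIST file and write some data into it.
      FSDataOutputStream out = fs.create(path, FsPermission.getFileDefault(),
          EnumSet.of(CreateFlag.CREATE, CreateFlag.LAZY_PERSIST),
          4096, (short) 1, 128 * 1024 * 1024, null);
      out.write(new byte[1024]);
      out.close();
      try {
        fs.truncate(path, 512);
        Assert.fail("truncate of a LAZY_PERSIST file should be rejected");
      } catch (IOException e) {
        // Expected: the NameNode refuses to truncate lazy persist files.
      }
    } finally {
      cluster.shutdown();
    }
  }
}
{code}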



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-7634) Disallow truncation of Lazy persist files

2015-01-20 Thread Arpit Agarwal (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-7634?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14284332#comment-14284332
 ] 

Arpit Agarwal commented on HDFS-7634:
-

I filed HDFS-7643 to add a test case for this issue. I will commit the patch 
shortly.

> Disallow truncation of Lazy persist files
> -
>
> Key: HDFS-7634
> URL: https://issues.apache.org/jira/browse/HDFS-7634
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: datanode, namenode
>Reporter: Yi Liu
>Assignee: Yi Liu
> Fix For: 3.0.0
>
> Attachments: HDFS-7634.001.patch, HDFS-7634.002.patch
>
>
> Similar to {{append}}, lazy persist (memory) files should not support 
> truncate currently. Quoting the reason from the HDFS-6581 design doc:
> {quote}
> Appends to files created with the LAZY_PERSIST flag will not be allowed in the 
> initial implementation to avoid the complexity of keeping in-memory and 
> on-disk replicas in sync on a given DataNode.
> {quote}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-7634) Disallow truncation of Lazy persist files

2015-01-20 Thread Arpit Agarwal (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-7634?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Arpit Agarwal updated HDFS-7634:

  Resolution: Fixed
Target Version/s:   (was: 2.7.0)
Hadoop Flags: Reviewed
  Status: Resolved  (was: Patch Available)

Thanks for the review [~shv]. Committed to trunk.

> Disallow truncation of Lazy persist files
> -
>
> Key: HDFS-7634
> URL: https://issues.apache.org/jira/browse/HDFS-7634
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: datanode, namenode
>Reporter: Yi Liu
>Assignee: Yi Liu
> Fix For: 3.0.0
>
> Attachments: HDFS-7634.001.patch, HDFS-7634.002.patch
>
>
> Similar to {{append}}, lazy persist (memory) files should not support 
> truncate currently. Quoting the reason from the HDFS-6581 design doc:
> {quote}
> Appends to files created with the LAZY_PERSIST flag will not be allowed in the 
> initial implementation to avoid the complexity of keeping in-memory and 
> on-disk replicas in sync on a given DataNode.
> {quote}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-7611) TestFileTruncate.testTruncateEditLogLoad times out waiting for Mini HDFS Cluster to start

2015-01-20 Thread Byron Wong (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-7611?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14284351#comment-14284351
 ] 

Byron Wong commented on HDFS-7611:
--

This bug happens only in tests with restarts, because blocks from files created 
in previous tests are not being deleted when replaying the edit logs.
1) I'm still investigating the root cause, but at some point while replaying 
edits, {{DirectoryWithSnapshotFeature$cleanDirectory}} can decrement an INode's 
namespace quota to a negative value. Either the namespace count was overcounted while 
cleaning directories or snapshot diffs, or the INode's namespace quota wasn't 
counted up properly in the first place.
2) If the INode's namespace quota happens to be -1, the blocks associated with 
that inode will not be deleted. When we call {{fsd.removeLastINode(iip)}} in 
{{FSDirDeleteOp$unprotectedDelete}}, we explicitly check whether its return 
code is -1. In that case, we skip collecting the blocks that should be deleted. 
Notice that in {{FSDirectory$removeLastINode}}, one of the possible returns is 
{{return counts.get(Quota.NAMESPACE)}}.
3) Now there are blocks in the blocksMap that shouldn't be there. This will 
increase the number of blocks needed to get out of safeMode. The test failure 
depends on whether the namenode receives these blocks. If it does, then the 
namenode will exit safeMode and the test will succeed.
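
For illustration, a simplified standalone sketch of the hazard described in point 2, 
where -1 serves both as the "nothing removed" sentinel and as a value a corrupted 
namespace count can take; this is not the actual FSDirectory code:
{code}
// Sketch only: an overloaded -1 return value makes the caller skip block
// collection even though the inode may actually have been removed.
class RemoveLastINodeSketch {
  /**
   * Returns -1 when nothing was removed, otherwise the removed inode's
   * namespace count.
   */
  static long removeLastINode(boolean removed, long namespaceCount) {
    if (!removed) {
      return -1;
    }
    return namespaceCount;   // a corrupted count of -1 collides with the sentinel
  }

  static void deleteAndCollectBlocks(boolean removed, long namespaceCount) {
    long ret = removeLastINode(removed, namespaceCount);
    if (ret == -1) {
      // The caller believes nothing was removed and skips collecting the
      // blocks to delete, leaving stale entries behind.
      return;
    }
    // ... collect the inode's blocks and schedule them for deletion ...
  }
}
{code}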

> TestFileTruncate.testTruncateEditLogLoad times out waiting for Mini HDFS 
> Cluster to start
> -
>
> Key: HDFS-7611
> URL: https://issues.apache.org/jira/browse/HDFS-7611
> Project: Hadoop HDFS
>  Issue Type: Bug
>Affects Versions: 3.0.0
>Reporter: Konstantin Shvachko
>Assignee: Byron Wong
> Attachments: testTruncateEditLogLoad.log
>
>
> I've seen it failing on Jenkins a couple of times. Somehow the cluster is not 
> comming ready after NN restart.
> Not sure if it is truncate specific, as I've seen same behaviour with other 
> tests that restart the NameNode.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-7634) Disallow truncation of Lazy persist files

2015-01-20 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-7634?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14284350#comment-14284350
 ] 

Hudson commented on HDFS-7634:
--

SUCCESS: Integrated in Hadoop-trunk-Commit #6894 (See 
[https://builds.apache.org/job/Hadoop-trunk-Commit/6894/])
HDFS-7634. Disallow truncation of Lazy persist files. (Contributed by Yi Liu) 
(arp: rev c09c65b2125908855a5f1d0047bc164ea4bea04d)
* hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
* 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FSNamesystem.java
HDFS-7634. Fix CHANGES.txt (arp: rev dd0228b8f7d9b3851aa408398eef516b93522e95)
* hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt


> Disallow truncation of Lazy persist files
> -
>
> Key: HDFS-7634
> URL: https://issues.apache.org/jira/browse/HDFS-7634
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: datanode, namenode
>Reporter: Yi Liu
>Assignee: Yi Liu
> Fix For: 3.0.0
>
> Attachments: HDFS-7634.001.patch, HDFS-7634.002.patch
>
>
> Similar to {{append}}, lazy persist (memory) files should not support 
> truncate currently. Quoting the reason from the HDFS-6581 design doc:
> {quote}
> Appends to files created with the LAZY_PERSIST flag will not be allowed in the 
> initial implementation to avoid the complexity of keeping in-memory and 
> on-disk replicas in sync on a given DataNode.
> {quote}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-7587) Edit log corruption can happen if append fails with a quota violation

2015-01-20 Thread Daryn Sharp (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-7587?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14284357#comment-14284357
 ] 

Daryn Sharp commented on HDFS-7587:
---

{{verifyQuota}} is already invoked so the quota counts shouldn't go out of 
sync.  {{updateSpaceConsumed}} calls {{updateCount}}, which calls 
{{verifyQuota}} prior to invoking {{unprotectedUpdateCount}}.  The quotas 
aren't going to change so it seems calling {{verifyQuota}} explicitly is wasted 
processing time.

bq.  Otherwise, the quote counts will be incorrect if there is an exception 
thrown later on.

Do you have a scenario in mind?  Ie. what is "later on"?  Moving the file to UC 
and associating the lease aren't going to throw checked exceptions.  They might 
throw a runtime exception.  The NN has no concept of a transaction (no 
rollback), so we're fully committed to finishing the op once we start updating 
datastructures.  In this patch, once the quota update is successful, we're 
committed to moving the file to UC and assigning a lease.  If we think those 
final steps will throw, then we're in trouble because we can't rollback.  Even 
if that were to happen, an out of sync quota is better than a corrupted 
in-memory state and edit logs caused by the NN throwing runtime exceptions that 
don't cause an abort.

> Edit log corruption can happen if append fails with a quota violation
> -
>
> Key: HDFS-7587
> URL: https://issues.apache.org/jira/browse/HDFS-7587
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: namenode
>Reporter: Kihwal Lee
>Assignee: Daryn Sharp
>Priority: Blocker
> Attachments: HDFS-7587.patch
>
>
> We have seen a standby namenode crashing due to edit log corruption. It was 
> complaining that {{OP_CLOSE}} cannot be applied because the file is not 
> under-construction.
> When a client was trying to append to the file, the remaining space quota was 
> very small. This caused a failure in {{prepareFileForWrite()}}, but after the 
> inode was already converted for writing and a lease added. Since these were 
> not undone when the quota violation was detected, the file was left in 
> under-construction with an active lease without edit logging {{OP_ADD}}.
> A subsequent {{append()}} eventually caused a lease recovery after the soft 
> limit period. This resulted in {{commitBlockSynchronization()}}, which closed 
> the file with {{OP_CLOSE}} being logged.  Since there was no corresponding 
> {{OP_ADD}}, edit replaying could not apply this.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (HDFS-7644) minor typo in HffpFS doc

2015-01-20 Thread Charles Lamb (JIRA)
Charles Lamb created HDFS-7644:
--

 Summary: minor typo in HffpFS doc
 Key: HDFS-7644
 URL: https://issues.apache.org/jira/browse/HDFS-7644
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: documentation
Affects Versions: 2.7.0
Reporter: Charles Lamb
Assignee: Charles Lamb
Priority: Trivial


In hadoop-httpfs/src/site/apt/index.apt.vm, s/seening/seen/




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-7644) minor typo in HffpFS doc

2015-01-20 Thread Charles Lamb (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-7644?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Charles Lamb updated HDFS-7644:
---
Attachment: HDFS-7644.000.patch

> minor typo in HffpFS doc
> 
>
> Key: HDFS-7644
> URL: https://issues.apache.org/jira/browse/HDFS-7644
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: documentation
>Affects Versions: 2.7.0
>Reporter: Charles Lamb
>Assignee: Charles Lamb
>Priority: Trivial
> Attachments: HDFS-7644.000.patch
>
>
> In hadoop-httpfs/src/site/apt/index.apt.vm, s/seening/seen/



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-7496) Fix FsVolume removal race conditions on the DataNode

2015-01-20 Thread Colin Patrick McCabe (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-7496?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14284398#comment-14284398
 ] 

Colin Patrick McCabe commented on HDFS-7496:


Unfortunately, this no longer applies.  Can you rebase the patch?

> Fix FsVolume removal race conditions on the DataNode 
> -
>
> Key: HDFS-7496
> URL: https://issues.apache.org/jira/browse/HDFS-7496
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: Colin Patrick McCabe
>Assignee: Lei (Eddy) Xu
> Attachments: HDFS-7496.000.patch, HDFS-7496.001.patch, 
> HDFS-7496.002.patch, HDFS-7496.003.patch, HDFS-7496.003.patch, 
> HDFS-7496.004.patch, HDFS-7496.005.patch
>
>
> We discussed a few FsVolume removal race conditions on the DataNode in 
> HDFS-7489.  We should figure out a way to make removing an FsVolume safe.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-4173) If the NameNode has already been formatted, but a QuroumJournal has not, auto-format it on startup

2015-01-20 Thread Colin Patrick McCabe (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-4173?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14284403#comment-14284403
 ] 

Colin Patrick McCabe commented on HDFS-4173:


The NameNode no longer auto-formats edit log directories on the local 
filesystem.  So I think we should close this as WONTFIX since the original 
reason for doing it (consistency with the local edit log directories) no longer 
applies.

> If the NameNode has already been formatted, but a QuroumJournal has not, 
> auto-format it on startup
> --
>
> Key: HDFS-4173
> URL: https://issues.apache.org/jira/browse/HDFS-4173
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: journal-node, namenode
>Affects Versions: 2.0.3-alpha
>Reporter: Colin Patrick McCabe
>Assignee: Colin Patrick McCabe
>Priority: Minor
> Attachments: HDFS-4173.001.patch
>
>
> If we have multiple edit log directories, and some of them are formatted, but 
> others are not, we format the unformatted ones.  However, when we implemented 
> QuorumJournalManager, we did not extend this behavior to it.  It makes sense 
> to do this.
> One use case is if you want to add a QuorumJournalManager URI 
> ({{journal://}}) to an existing {{NameNode}}, without reformatting 
> everything.  There is currently no easy way to do this, since {{namenode 
> \-format}} will nuke everything, and there's no other way to format the 
> {{JournalNodes}}.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-7639) Remove the limitation imposed by dfs.balancer.moverThreads

2015-01-20 Thread Chen He (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-7639?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14284407#comment-14284407
 ] 

Chen He commented on HDFS-7639:
---

Thank you for the comments, [~szetszwo]. 

> Remove the limitation imposed by dfs.balancer.moverThreads
> --
>
> Key: HDFS-7639
> URL: https://issues.apache.org/jira/browse/HDFS-7639
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: balancer & mover
>Reporter: Tsz Wo Nicholas Sze
>Assignee: Chen He
>
> In Balancer/Mover, the number of dispatcher threads 
> (dfs.balancer.moverThreads) limits the number of concurrent moves.  Each 
> dispatcher thread sends a request to a datanode and then blocks waiting for 
> the response.  We should remove this limitation.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-7496) Fix FsVolume removal race conditions on the DataNode

2015-01-20 Thread Lei (Eddy) Xu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-7496?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lei (Eddy) Xu updated HDFS-7496:

Attachment: HDFS-7496.006.patch

Rebased the patch to resolve conflicts with the latest trunk.

> Fix FsVolume removal race conditions on the DataNode 
> -
>
> Key: HDFS-7496
> URL: https://issues.apache.org/jira/browse/HDFS-7496
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: Colin Patrick McCabe
>Assignee: Lei (Eddy) Xu
> Attachments: HDFS-7496.000.patch, HDFS-7496.001.patch, 
> HDFS-7496.002.patch, HDFS-7496.003.patch, HDFS-7496.003.patch, 
> HDFS-7496.004.patch, HDFS-7496.005.patch, HDFS-7496.006.patch
>
>
> We discussed a few FsVolume removal race conditions on the DataNode in 
> HDFS-7489.  We should figure out a way to make removing an FsVolume safe.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-7610) Should use StorageDirectory.getCurrentDIr() to construct FsVolumeImpl

2015-01-20 Thread Colin Patrick McCabe (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-7610?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14284413#comment-14284413
 ] 

Colin Patrick McCabe commented on HDFS-7610:


{code}
+  try {
+volumeSet.add(sl.getFile().getCanonicalFile());
+  } catch (IOException e) {
+// Thrown because File#getCanoicalFile(). Ignored.
+  }
{code}

We can't ignore exceptions like this.  In any case, I don't think we need the 
"canonical" filename.  If someone is playing games with symlinks, that's 
their own problem, not ours.  Just get the absolute pathname, an operation that 
can't fail.
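
For illustration, a minimal sketch of that suggestion; the class and variable names 
are placeholders adapted from the quoted diff, not the actual patch code:
{code}
// Sketch only: collect absolute paths instead of canonical paths, so there is
// no checked exception to swallow.
import java.io.File;
import java.util.HashSet;
import java.util.Set;

class VolumePathCollectorSketch {
  static Set<File> collectVolumePaths(Iterable<File> storageLocations) {
    Set<File> volumeSet = new HashSet<File>();
    for (File location : storageLocations) {
      // getAbsoluteFile() cannot throw, unlike getCanonicalFile(), which
      // declares IOException and would otherwise be silently ignored.
      volumeSet.add(location.getAbsoluteFile());
    }
    return volumeSet;
  }
}
{code}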

{code}
-Set expectedVolumes = new HashSet();
+Set expectedVolumes = new HashSet<>();
{code}

Changes like this are just churn that makes it harder to read the diff, so 
let's not include them.

thanks

> Should use StorageDirectory.getCurrentDIr() to construct FsVolumeImpl
> -
>
> Key: HDFS-7610
> URL: https://issues.apache.org/jira/browse/HDFS-7610
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: datanode
>Affects Versions: 2.6.0
>Reporter: Lei (Eddy) Xu
>Assignee: Lei (Eddy) Xu
> Attachments: HDFS-7610.000.patch
>
>
> In the hot swap feature, {{FsDatasetImpl#addVolume}} uses the base volume dir 
> (e.g. "{{/foo/data0}}") instead of the volume's current dir 
> ("{{/foo/data/current}}") to construct {{FsVolumeImpl}}. As a result, the DataNode 
> can not remove this newly added volume, because its 
> {{FsVolumeImpl#getBasePath}} returns "{{/foo}}".



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-4173) If the NameNode has already been formatted, but a QuroumJournal has not, auto-format it on startup

2015-01-20 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-4173?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14284418#comment-14284418
 ] 

Hadoop QA commented on HDFS-4173:
-

{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12552952/HDFS-4173.001.patch
  against trunk revision dd0228b.

{color:red}-1 patch{color}.  The patch command could not apply the patch.

Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/9278//console

This message is automatically generated.

> If the NameNode has already been formatted, but a QuroumJournal has not, 
> auto-format it on startup
> --
>
> Key: HDFS-4173
> URL: https://issues.apache.org/jira/browse/HDFS-4173
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: journal-node, namenode
>Affects Versions: 2.0.3-alpha
>Reporter: Colin Patrick McCabe
>Assignee: Colin Patrick McCabe
>Priority: Minor
> Attachments: HDFS-4173.001.patch
>
>
> If we have multiple edit log directories, and some of them are formatted, but 
> others are not, we format the unformatted ones.  However, when we implemented 
> QuorumJournalManager, we did not extend this behavior to it.  It makes sense 
> to do this.
> One use case is if you want to add a QuorumJournalManager URI 
> ({{journal://}}) to an existing {{NameNode}}, without reformatting 
> everything.  There is currently no easy way to do this, since {{namenode 
> \-format}} will nuke everything, and there's no other way to format the 
> {{JournalNodes}}.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (HDFS-7645) Rolling upgrade is restoring blocks from trash multiple times

2015-01-20 Thread Nathan Roberts (JIRA)
Nathan Roberts created HDFS-7645:


 Summary: Rolling upgrade is restoring blocks from trash multiple 
times
 Key: HDFS-7645
 URL: https://issues.apache.org/jira/browse/HDFS-7645
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: datanode
Affects Versions: 2.6.0
Reporter: Nathan Roberts


When performing an HDFS rolling upgrade, the trash directory is getting 
restored twice, when under normal circumstances it shouldn't need to be restored 
at all. IIUC, the only time these blocks should be restored is if we need to 
roll back a rolling upgrade. 

On a busy cluster, this can cause significant and unnecessary block churn both 
on the datanodes, and more importantly in the namenode.

The two times this happens are:
1) restart of DN onto new software
{code}
  private void doTransition(DataNode datanode, StorageDirectory sd,
  NamespaceInfo nsInfo, StartupOption startOpt) throws IOException {
if (startOpt == StartupOption.ROLLBACK && sd.getPreviousDir().exists()) {
  Preconditions.checkState(!getTrashRootDir(sd).exists(),
  sd.getPreviousDir() + " and " + getTrashRootDir(sd) + " should not " +
  " both be present.");
  doRollback(sd, nsInfo); // rollback if applicable
} else {
  // Restore all the files in the trash. The restored files are retained
  // during rolling upgrade rollback. They are deleted during rolling
  // upgrade downgrade.
  int restored = restoreBlockFilesFromTrash(getTrashRootDir(sd));
  LOG.info("Restored " + restored + " block files from trash.");
}
{code}

2) When heartbeat response no longer indicates a rollingupgrade is in progress
{code}
  /**
   * Signal the current rolling upgrade status as indicated by the NN.
   * @param inProgress true if a rolling upgrade is in progress
   */
  void signalRollingUpgrade(boolean inProgress) throws IOException {
String bpid = getBlockPoolId();
if (inProgress) {
  dn.getFSDataset().enableTrash(bpid);
  dn.getFSDataset().setRollingUpgradeMarker(bpid);
} else {
  dn.getFSDataset().restoreTrash(bpid);
  dn.getFSDataset().clearRollingUpgradeMarker(bpid);
}
  }
{code}

HDFS-6800 and HDFS-6981 modified this behavior, which makes it not completely 
clear whether this is somehow intentional. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-7339) Allocating and persisting block groups in NameNode

2015-01-20 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-7339?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14284427#comment-14284427
 ] 

Hadoop QA commented on HDFS-7339:
-

{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12693336/HDFS-7339-004.patch
  against trunk revision c94c0d2.

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 1 new 
or modified test files.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  There were no new javadoc warning messages.

{color:green}+1 eclipse:eclipse{color}.  The patch built with 
eclipse:eclipse.

{color:red}-1 findbugs{color}.  The patch appears to introduce 2 new 
Findbugs (version 2.0.3) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:green}+1 core tests{color}.  The patch passed unit tests in 
hadoop-hdfs-project/hadoop-hdfs.

Test results: 
https://builds.apache.org/job/PreCommit-HDFS-Build/9277//testReport/
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HDFS-Build/9277//artifact/patchprocess/newPatchFindbugsWarningshadoop-hdfs.html
Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/9277//console

This message is automatically generated.

> Allocating and persisting block groups in NameNode
> --
>
> Key: HDFS-7339
> URL: https://issues.apache.org/jira/browse/HDFS-7339
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Reporter: Zhe Zhang
>Assignee: Zhe Zhang
> Attachments: HDFS-7339-001.patch, HDFS-7339-002.patch, 
> HDFS-7339-003.patch, HDFS-7339-004.patch, Meta-striping.jpg, NN-stripping.jpg
>
>
> All erasure codec operations center around the concept of _block group_; they 
> are formed in initial encoding and looked up in recoveries and conversions. A 
> lightweight class {{BlockGroup}} is created to record the original and parity 
> blocks in a coding group, as well as a pointer to the codec schema (pluggable 
> codec schemas will be supported in HDFS-7337). With the striping layout, the 
> HDFS client needs to operate on all blocks in a {{BlockGroup}} concurrently. 
> Therefore we propose to extend a file’s inode to switch between _contiguous_ 
> and _striping_ modes, with the current mode recorded in a binary flag. An 
> array of BlockGroups (or BlockGroup IDs) is added, which remains empty for 
> “traditional” HDFS files with contiguous block layout.
> The NameNode creates and maintains {{BlockGroup}} instances through the new 
> {{ECManager}} component; the attached figure has an illustration of the 
> architecture. As a simple example, when a {_Striping+EC_} file is created and 
> written to, it will serve requests from the client to allocate new 
> {{BlockGroups}} and store them under the {{INodeFile}}. In the current phase, 
> {{BlockGroups}} are allocated both in initial online encoding and in the 
> conversion from replication to EC. {{ECManager}} also facilitates the lookup 
> of {{BlockGroup}} information for block recovery work.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-7496) Fix FsVolume removal race conditions on the DataNode

2015-01-20 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-7496?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14284429#comment-14284429
 ] 

Hadoop QA commented on HDFS-7496:
-

{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12693384/HDFS-7496.006.patch
  against trunk revision dd0228b.

{color:red}-1 patch{color}.  Trunk compilation may be broken.

Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/9279//console

This message is automatically generated.

> Fix FsVolume removal race conditions on the DataNode 
> -
>
> Key: HDFS-7496
> URL: https://issues.apache.org/jira/browse/HDFS-7496
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: Colin Patrick McCabe
>Assignee: Lei (Eddy) Xu
> Attachments: HDFS-7496.000.patch, HDFS-7496.001.patch, 
> HDFS-7496.002.patch, HDFS-7496.003.patch, HDFS-7496.003.patch, 
> HDFS-7496.004.patch, HDFS-7496.005.patch, HDFS-7496.006.patch
>
>
> We discussed a few FsVolume removal race conditions on the DataNode in 
> HDFS-7489.  We should figure out a way to make removing an FsVolume safe.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-7411) Refactor and improve decommissioning logic into DecommissionManager

2015-01-20 Thread Andrew Wang (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-7411?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14284438#comment-14284438
 ] 

Andrew Wang commented on HDFS-7411:
---

I think these test failures are unrelated. HDFS-7527 is tracking the 
TestDecommission#testIncludeByRegistrationName failure. I ran 
TestRollingUpgrade locally a few times and it worked.

> Refactor and improve decommissioning logic into DecommissionManager
> ---
>
> Key: HDFS-7411
> URL: https://issues.apache.org/jira/browse/HDFS-7411
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Affects Versions: 2.5.1
>Reporter: Andrew Wang
>Assignee: Andrew Wang
> Attachments: hdfs-7411.001.patch, hdfs-7411.002.patch, 
> hdfs-7411.003.patch, hdfs-7411.004.patch, hdfs-7411.005.patch, 
> hdfs-7411.006.patch, hdfs-7411.007.patch
>
>
> Would be nice to split out decommission logic from DatanodeManager to 
> DecommissionManager.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-7496) Fix FsVolume removal race conditions on the DataNode

2015-01-20 Thread Lei (Eddy) Xu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-7496?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lei (Eddy) Xu updated HDFS-7496:

Attachment: HDFS-7496.007.patch

HDFS-5631 introduced {{ExternalVolumeImpl}} and {{ExternalDatasetImpl}}. I 
updated the patch to add missing functions in these two classes.

> Fix FsVolume removal race conditions on the DataNode 
> -
>
> Key: HDFS-7496
> URL: https://issues.apache.org/jira/browse/HDFS-7496
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: Colin Patrick McCabe
>Assignee: Lei (Eddy) Xu
> Attachments: HDFS-7496.000.patch, HDFS-7496.001.patch, 
> HDFS-7496.002.patch, HDFS-7496.003.patch, HDFS-7496.003.patch, 
> HDFS-7496.004.patch, HDFS-7496.005.patch, HDFS-7496.006.patch, 
> HDFS-7496.007.patch
>
>
> We discussed a few FsVolume removal race conditions on the DataNode in 
> HDFS-7489.  We should figure out a way to make removing an FsVolume safe.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-7645) Rolling upgrade is restoring blocks from trash multiple times

2015-01-20 Thread Colin Patrick McCabe (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-7645?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14284445#comment-14284445
 ] 

Colin Patrick McCabe commented on HDFS-7645:


I think we should get rid of trash and just always create a previous/ directory 
when doing rolling upgrade, the same as we do with regular upgrade.  The speed 
is clearly acceptable since we've done these upgrades in the field when 
switching to the blockid-based layout with no problems.  And it will be a lot 
more maintainable and less confusing.

> Rolling upgrade is restoring blocks from trash multiple times
> -
>
> Key: HDFS-7645
> URL: https://issues.apache.org/jira/browse/HDFS-7645
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: datanode
>Affects Versions: 2.6.0
>Reporter: Nathan Roberts
>
> When performing an HDFS rolling upgrade, the trash directory is getting 
> restored twice when under normal circumstances it shouldn't need to be 
> restored at all. As I understand it, the only time these blocks should be 
> restored is if we need to roll back a rolling upgrade. 
> On a busy cluster, this can cause significant and unnecessary block churn, 
> both on the datanodes and, more importantly, in the namenode.
> The two times this happens are:
> 1) restart of DN onto new software
> {code}
>   private void doTransition(DataNode datanode, StorageDirectory sd,
>   NamespaceInfo nsInfo, StartupOption startOpt) throws IOException {
> if (startOpt == StartupOption.ROLLBACK && sd.getPreviousDir().exists()) {
>   Preconditions.checkState(!getTrashRootDir(sd).exists(),
>       sd.getPreviousDir() + " and " + getTrashRootDir(sd) + " should not " +
>       " both be present.");
>   doRollback(sd, nsInfo); // rollback if applicable
> } else {
>   // Restore all the files in the trash. The restored files are retained
>   // during rolling upgrade rollback. They are deleted during rolling
>   // upgrade downgrade.
>   int restored = restoreBlockFilesFromTrash(getTrashRootDir(sd));
>   LOG.info("Restored " + restored + " block files from trash.");
> }
> {code}
> 2) When the heartbeat response no longer indicates a rolling upgrade is in progress
> {code}
>   /**
>* Signal the current rolling upgrade status as indicated by the NN.
>* @param inProgress true if a rolling upgrade is in progress
>*/
>   void signalRollingUpgrade(boolean inProgress) throws IOException {
> String bpid = getBlockPoolId();
> if (inProgress) {
>   dn.getFSDataset().enableTrash(bpid);
>   dn.getFSDataset().setRollingUpgradeMarker(bpid);
> } else {
>   dn.getFSDataset().restoreTrash(bpid);
>   dn.getFSDataset().clearRollingUpgradeMarker(bpid);
> }
>   }
> {code}
> HDFS-6800 and HDFS-6981 were modifying this behavior making it not completely 
> clear whether this is somehow intentional. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-7548) Corrupt block reporting delayed until datablock scanner thread detects it

2015-01-20 Thread Rushabh S Shah (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-7548?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rushabh S Shah updated HDFS-7548:
-
Status: Open  (was: Patch Available)

Cancelling the patch to address Daryn's comment.

> Corrupt block reporting delayed until datablock scanner thread detects it
> -
>
> Key: HDFS-7548
> URL: https://issues.apache.org/jira/browse/HDFS-7548
> Project: Hadoop HDFS
>  Issue Type: Bug
>Affects Versions: 2.5.0
>Reporter: Rushabh S Shah
>Assignee: Rushabh S Shah
> Attachments: HDFS-7548-v2.patch, HDFS-7548-v3.patch, HDFS-7548.patch
>
>
> When there is only one datanode holding the block and that block happens to be
> corrupt, the namenode keeps trying to replicate the block repeatedly, but it 
> only reports the block as corrupt when the data block scanner thread of the 
> datanode picks up this bad block.
> Requesting an improvement in namenode reporting so that the corrupt replica is 
> reported when there is only 1 replica and the replication of that replica 
> keeps failing with the checksum error.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (HDFS-7646) HDFS truncate may remove data from the supposedly read-only previous/ directory during an upgrade

2015-01-20 Thread Colin Patrick McCabe (JIRA)
Colin Patrick McCabe created HDFS-7646:
--

 Summary: HDFS truncate may remove data from the supposedly 
read-only previous/ directory during an upgrade
 Key: HDFS-7646
 URL: https://issues.apache.org/jira/browse/HDFS-7646
 Project: Hadoop HDFS
  Issue Type: Bug
Reporter: Colin Patrick McCabe
Priority: Blocker


During a DataNode layout version upgrade, HDFS creates hardlinks.  These 
hardlinks allow the same block to be accessible from both the current/ and 
previous/ directories.  Rollback is possible by deleting the current/ directory 
and renaming previous/ to current/.

However, if the user truncates one of these hardlinked block files, it 
effectively eliminates the ability to roll back to the previous data.

We probably need to disable truncation-in-place during a DataNode upgrade.
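A minimal, self-contained illustration of the failure mode with plain JDK APIs (not DataNode code): because a hardlink shares the underlying data, truncating through one name shrinks what the other name sees, which is exactly what breaks rollback.

{code}
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.nio.file.StandardOpenOption;

public class HardlinkTruncateDemo {
  public static void main(String[] args) throws Exception {
    Path current = Files.write(Paths.get("blk_current"), new byte[1024]);
    Path previous = Paths.get("blk_previous");
    Files.deleteIfExists(previous);
    Files.createLink(previous, current);     // previous/-style hardlink to the same data

    // "Truncate in place" through the current/ name...
    try (FileChannel ch = FileChannel.open(current, StandardOpenOption.WRITE)) {
      ch.truncate(100);
    }

    // ...and the previous/ copy is shortened too, so the rollback data is gone.
    System.out.println(Files.size(previous)); // prints 100, not 1024
  }
}
{code}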



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-7610) Should use StorageDirectory.getCurrentDIr() to construct FsVolumeImpl

2015-01-20 Thread Lei (Eddy) Xu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-7610?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lei (Eddy) Xu updated HDFS-7610:

Attachment: HDFS-7610.001.patch

[~cmccabe] Thanks for reviewing.

I have updated the patch to address your comments above.

> Should use StorageDirectory.getCurrentDIr() to construct FsVolumeImpl
> -
>
> Key: HDFS-7610
> URL: https://issues.apache.org/jira/browse/HDFS-7610
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: datanode
>Affects Versions: 2.6.0
>Reporter: Lei (Eddy) Xu
>Assignee: Lei (Eddy) Xu
> Attachments: HDFS-7610.000.patch, HDFS-7610.001.patch
>
>
> In the hot swap feature, {{FsDatasetImpl#addVolume}} uses the base volume dir 
> (e.g. "{{/foo/data0}}") instead of the volume's current dir 
> "{{/foo/data/current}}" to construct {{FsVolumeImpl}}. As a result, the DataNode 
> cannot remove this newly added volume, because its 
> {{FsVolumeImpl#getBasePath}} returns "{{/foo}}".



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-7548) Corrupt block reporting delayed until datablock scanner thread detects it

2015-01-20 Thread Rushabh S Shah (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-7548?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rushabh S Shah updated HDFS-7548:
-
Attachment: HDFS-7548-v4.patch

Attaching a new patch addressing Daryn's comments.

> Corrupt block reporting delayed until datablock scanner thread detects it
> -
>
> Key: HDFS-7548
> URL: https://issues.apache.org/jira/browse/HDFS-7548
> Project: Hadoop HDFS
>  Issue Type: Bug
>Affects Versions: 2.5.0
>Reporter: Rushabh S Shah
>Assignee: Rushabh S Shah
> Attachments: HDFS-7548-v2.patch, HDFS-7548-v3.patch, 
> HDFS-7548-v4.patch, HDFS-7548.patch
>
>
> When there is only one datanode holding the block and that block happens to be
> corrupt, the namenode keeps trying to replicate the block repeatedly, but it 
> only reports the block as corrupt when the data block scanner thread of the 
> datanode picks up this bad block.
> Requesting an improvement in namenode reporting so that the corrupt replica is 
> reported when there is only 1 replica and the replication of that replica 
> keeps failing with the checksum error.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-7548) Corrupt block reporting delayed until datablock scanner thread detects it

2015-01-20 Thread Rushabh S Shah (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-7548?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rushabh S Shah updated HDFS-7548:
-
Status: Patch Available  (was: Open)

> Corrupt block reporting delayed until datablock scanner thread detects it
> -
>
> Key: HDFS-7548
> URL: https://issues.apache.org/jira/browse/HDFS-7548
> Project: Hadoop HDFS
>  Issue Type: Bug
>Affects Versions: 2.5.0
>Reporter: Rushabh S Shah
>Assignee: Rushabh S Shah
> Attachments: HDFS-7548-v2.patch, HDFS-7548-v3.patch, 
> HDFS-7548-v4.patch, HDFS-7548.patch
>
>
> When there is only one datanode holding the block and that block happens to be
> corrupt, the namenode keeps trying to replicate the block repeatedly, but it 
> only reports the block as corrupt when the data block scanner thread of the 
> datanode picks up this bad block.
> Requesting an improvement in namenode reporting so that the corrupt replica is 
> reported when there is only 1 replica and the replication of that replica 
> keeps failing with the checksum error.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-6673) Add Delimited format supports for PB OIV tool

2015-01-20 Thread Andrew Wang (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-6673?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14284506#comment-14284506
 ] 

Andrew Wang commented on HDFS-6673:
---

Hi Eddy, thanks for doing these benchmarks. I only have some nitty stuff; it 
overall looks great. I'm +1 pending these changes.

* Could use slf4j logging for new code.
* The use of {{String.format}} is often unnecessary, since slf4j and 
Preconditions can already do the substitutions (see the short example after 
this list). It's more efficient to defer, and using substitutions is also 
better than concatenating strings yourself with {{+}}.
* Can we add some class javadoc on PBImageTextWriter describing the overall 
process? It's nice to have an overview, even if a lot of the info is in other 
bits of javadoc. For instance, mentioning the format of the PB image with 
sections of records for INodes and directories, the two passes through the 
fsimage used to first build the two different maps and then to print the 
delimited format, being able to use LevelDB or the InMemoryDB to store the maps.
* Do we actually need to sync the metadata maps? This is all generated data, so 
doesn't seem necessary, especially since this will typically be a one-time 
thing and might decrease performance.
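For reference, the kind of deferred substitution being suggested in the second bullet (the logger and message here are just placeholders, not code from the patch):

{code}
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;
import com.google.common.base.Preconditions;

public class SubstitutionExample {
  private static final Logger LOG =
      LoggerFactory.getLogger(SubstitutionExample.class);

  void process(String path, long inodeId) {
    // Avoid: the String is built even when DEBUG is disabled.
    // LOG.debug(String.format("Processing %s (inode %d)", path, inodeId));

    // Prefer: slf4j only formats the message if the level is enabled.
    LOG.debug("Processing {} (inode {})", path, inodeId);

    // Preconditions does the same kind of %s substitution lazily.
    Preconditions.checkArgument(inodeId >= 0,
        "Invalid inode id %s for path %s", inodeId, path);
  }
}
{code}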

> Add Delimited format supports for PB OIV tool
> -
>
> Key: HDFS-6673
> URL: https://issues.apache.org/jira/browse/HDFS-6673
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Affects Versions: 2.4.0
>Reporter: Lei (Eddy) Xu
>Assignee: Lei (Eddy) Xu
>Priority: Minor
> Attachments: HDFS-6673.000.patch, HDFS-6673.001.patch, 
> HDFS-6673.002.patch, HDFS-6673.003.patch, HDFS-6673.004.patch, 
> HDFS-6673.005.patch
>
>
> The new oiv tool, which is designed for Protobuf fsimage, lacks a few 
> features supported in the old {{oiv}} tool. 
> This task adds supports of _Delimited_ processor to the oiv tool. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-6673) Add Delimited format supports for PB OIV tool

2015-01-20 Thread Haohui Mai (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-6673?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14284564#comment-14284564
 ] 

Haohui Mai commented on HDFS-6673:
--

bq. By IN || parent_id || localName, do you mean concatenating the inode, parent 
inode and INode localName as the key in LevelDB? In this case, since INode is the 
prefix of the key, is the order of keys still determined by inode?

You can take a look at 
https://issues.apache.org/jira/secure/attachment/12643478/HDFS-6293.001.patch

Please do not commit this patch until the seek issue is addressed.

> Add Delimited format supports for PB OIV tool
> -
>
> Key: HDFS-6673
> URL: https://issues.apache.org/jira/browse/HDFS-6673
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Affects Versions: 2.4.0
>Reporter: Lei (Eddy) Xu
>Assignee: Lei (Eddy) Xu
>Priority: Minor
> Attachments: HDFS-6673.000.patch, HDFS-6673.001.patch, 
> HDFS-6673.002.patch, HDFS-6673.003.patch, HDFS-6673.004.patch, 
> HDFS-6673.005.patch
>
>
> The new oiv tool, which is designed for Protobuf fsimage, lacks a few 
> features supported in the old {{oiv}} tool. 
> This task adds supports of _Delimited_ processor to the oiv tool. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-7339) Allocating and persisting block groups in NameNode

2015-01-20 Thread Zhe Zhang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-7339?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhe Zhang updated HDFS-7339:

Attachment: HDFS-7339-005.patch

Fix findbugs issues.

> Allocating and persisting block groups in NameNode
> --
>
> Key: HDFS-7339
> URL: https://issues.apache.org/jira/browse/HDFS-7339
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Reporter: Zhe Zhang
>Assignee: Zhe Zhang
> Attachments: HDFS-7339-001.patch, HDFS-7339-002.patch, 
> HDFS-7339-003.patch, HDFS-7339-004.patch, HDFS-7339-005.patch, 
> Meta-striping.jpg, NN-stripping.jpg
>
>
> All erasure codec operations center around the concept of _block group_; they 
> are formed in initial encoding and looked up in recoveries and conversions. A 
> lightweight class {{BlockGroup}} is created to record the original and parity 
> blocks in a coding group, as well as a pointer to the codec schema (pluggable 
> codec schemas will be supported in HDFS-7337). With the striping layout, the 
> HDFS client needs to operate on all blocks in a {{BlockGroup}} concurrently. 
> Therefore we propose to extend a file’s inode to switch between _contiguous_ 
> and _striping_ modes, with the current mode recorded in a binary flag. An 
> array of BlockGroups (or BlockGroup IDs) is added, which remains empty for 
> “traditional” HDFS files with contiguous block layout.
> The NameNode creates and maintains {{BlockGroup}} instances through the new 
> {{ECManager}} component; the attached figure has an illustration of the 
> architecture. As a simple example, when a {_Striping+EC_} file is created and 
> written to, it will serve requests from the client to allocate new 
> {{BlockGroups}} and store them under the {{INodeFile}}. In the current phase, 
> {{BlockGroups}} are allocated both in initial online encoding and in the 
> conversion from replication to EC. {{ECManager}} also facilitates the lookup 
> of {{BlockGroup}} information for block recovery work.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-6673) Add Delimited format supports for PB OIV tool

2015-01-20 Thread Haohui Mai (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-6673?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14284572#comment-14284572
 ] 

Haohui Mai commented on HDFS-6673:
--

Here is a link to the related comments from HDFS-6293: 
https://issues.apache.org/jira/browse/HDFS-6293?focusedCommentId=13989358&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13989358

> Add Delimited format supports for PB OIV tool
> -
>
> Key: HDFS-6673
> URL: https://issues.apache.org/jira/browse/HDFS-6673
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Affects Versions: 2.4.0
>Reporter: Lei (Eddy) Xu
>Assignee: Lei (Eddy) Xu
>Priority: Minor
> Attachments: HDFS-6673.000.patch, HDFS-6673.001.patch, 
> HDFS-6673.002.patch, HDFS-6673.003.patch, HDFS-6673.004.patch, 
> HDFS-6673.005.patch
>
>
> The new oiv tool, which is designed for Protobuf fsimage, lacks a few 
> features supported in the old {{oiv}} tool. 
> This task adds supports of _Delimited_ processor to the oiv tool. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-6673) Add Delimited format supports for PB OIV tool

2015-01-20 Thread Lei (Eddy) Xu (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-6673?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14284580#comment-14284580
 ] 

Lei (Eddy) Xu commented on HDFS-6673:
-

Hi, [~wheat9]

Thank you very much for pointing this out. In your patch, you dumped inodes 
to LevelDB sorted by their parent ID. I tried this method, but in my 
experiments the time to dump the inodes and scan LevelDB _sequentially_ 
outweighs the benefits of sequential scanning.

In the current patch, I assume that each directory was written to the fsimage 
sequentially. Thus, in the second pass that scans the INode section to generate 
text output, the parent directory INode stays fairly stably cached in the 
LRU cache, as 

{code}
@Override
public String getParentPath(long inode) throws IOException {
  if (inode == INodeId.ROOT_INODE_ID) {
    return "/";
  }
  byte[] bytes = dirChildMap.get(toBytes(inode));
  Preconditions.checkState(bytes != null && bytes.length == 8,
      "Can not find parent directory for inode %s, "
          + "fsimage might be corrupted", inode);
  long parent = toLong(bytes);
  if (!dirPathCache.containsKey(parent)) {
    bytes = dirMap.get(toBytes(parent));
    if (parent != INodeId.ROOT_INODE_ID) {
      Preconditions.checkState(bytes != null,
          "Can not find parent directory for inode %s, "
              + "the fsimage might be corrupted.", parent);
    }
    String parentName = toString(bytes);
    String parentPath =
        new File(getParentPath(parent), parentName).toString();
    dirPathCache.put(parent, parentPath);
  }
  return dirPathCache.get(parent);
}
{code}

Thus, even though it is not a completely sequential scan on directory ID, it 
involves only one seek per INode. 

Does this make sense to you?

> Add Delimited format supports for PB OIV tool
> -
>
> Key: HDFS-6673
> URL: https://issues.apache.org/jira/browse/HDFS-6673
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Affects Versions: 2.4.0
>Reporter: Lei (Eddy) Xu
>Assignee: Lei (Eddy) Xu
>Priority: Minor
> Attachments: HDFS-6673.000.patch, HDFS-6673.001.patch, 
> HDFS-6673.002.patch, HDFS-6673.003.patch, HDFS-6673.004.patch, 
> HDFS-6673.005.patch
>
>
> The new oiv tool, which is designed for Protobuf fsimage, lacks a few 
> features supported in the old {{oiv}} tool. 
> This task adds supports of _Delimited_ processor to the oiv tool. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-6673) Add Delimited format supports for PB OIV tool

2015-01-20 Thread Lei (Eddy) Xu (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-6673?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14284590#comment-14284590
 ] 

Lei (Eddy) Xu commented on HDFS-6673:
-

One more thing: since we'd also like to have the full path of each file, storing 
an inode as {{IN || parent id || localName}} still requires looking up the parent 
directory somehow. 

Additionally, since the first {{inode -> parent id}} mapping is very small 
(<100MB for this 3.3GB fsimage), it should be reasonably well cached by the OS. It 
also avoids LevelDB's write amplification to some extent in my tests.

> Add Delimited format supports for PB OIV tool
> -
>
> Key: HDFS-6673
> URL: https://issues.apache.org/jira/browse/HDFS-6673
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Affects Versions: 2.4.0
>Reporter: Lei (Eddy) Xu
>Assignee: Lei (Eddy) Xu
>Priority: Minor
> Attachments: HDFS-6673.000.patch, HDFS-6673.001.patch, 
> HDFS-6673.002.patch, HDFS-6673.003.patch, HDFS-6673.004.patch, 
> HDFS-6673.005.patch
>
>
> The new oiv tool, which is designed for Protobuf fsimage, lacks a few 
> features supported in the old {{oiv}} tool. 
> This task adds supports of _Delimited_ processor to the oiv tool. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-7587) Edit log corruption can happen if append fails with a quota violation

2015-01-20 Thread Tsz Wo Nicholas Sze (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-7587?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14284594#comment-14284594
 ] 

Tsz Wo Nicholas Sze commented on HDFS-7587:
---

> ... The quotas aren't going to change so it seems calling verifyQuota 
> explicitly is wasted processing time.

We may call verifyQuota in the beginning and update quota without checking at 
the end.

> Do you have a scenario in mind? Ie. what is "later on"? Moving the file to UC 
> and associating the lease aren't going to throw checked exceptions. ...

convertLastBlockToUnderConstruction does throw IOException.

> ... Even if that were to happen, an out of sync quota is better than a 
> corrupted in-memory state and edit logs caused by the NN throwing runtime 
> exceptions that don't cause an abort.

Agree.  An out-of-sync quota is better than a corrupted in-memory state.  Also, 
an in-sync quota is better than an out-of-sync quota.  We could have both an 
in-sync quota and an uncorrupted in-memory state.  I think no one is saying that 
we prefer an in-sync quota over an uncorrupted in-memory state.
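A minimal sketch of the ordering being discussed, with entirely hypothetical names (this is not the FSDirectory/FSNamesystem API): verify the quota before any in-memory mutation, then perform the conversion, the lease addition, and the quota update without re-checking, so a violation can no longer leave the file half-converted.

{code}
// Illustrative only -- all names are placeholders, not the actual NameNode code.
class QuotaOrderingSketch {
  static class QuotaExceededException extends Exception {}

  private long remainingQuota = 1024;   // stands in for namespace/diskspace quota
  private boolean underConstruction;
  private boolean leaseHeld;

  void prepareForAppend(long delta) throws QuotaExceededException {
    // 1. Verify the quota up front, before touching any in-memory state.
    if (delta > remainingQuota) {
      throw new QuotaExceededException();
    }
    // 2. Only after the check passes, mutate state.  Nothing below can fail on
    //    quota, so we never leave a half-converted file with an active lease.
    underConstruction = true;      // stands in for converting the file to UC
    leaseHeld = true;              // stands in for adding the lease
    remainingQuota -= delta;       // update the quota without re-checking
  }
}
{code}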

> Edit log corruption can happen if append fails with a quota violation
> -
>
> Key: HDFS-7587
> URL: https://issues.apache.org/jira/browse/HDFS-7587
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: namenode
>Reporter: Kihwal Lee
>Assignee: Daryn Sharp
>Priority: Blocker
> Attachments: HDFS-7587.patch
>
>
> We have seen a standby namenode crashing due to edit log corruption. It was 
> complaining that {{OP_CLOSE}} cannot be applied because the file is not 
> under-construction.
> When a client was trying to append to the file, the remaining space quota was 
> very small. This caused a failure in {{prepareFileForWrite()}}, but after the 
> inode was already converted for writing and a lease added. Since these were 
> not undone when the quota violation was detected, the file was left in 
> under-construction with an active lease without edit logging {{OP_ADD}}.
> A subsequent {{append()}} eventually caused a lease recovery after the soft 
> limit period. This resulted in {{commitBlockSynchronization()}}, which closed 
> the file with {{OP_CLOSE}} being logged.  Since there was no corresponding 
> {{OP_ADD}}, edit replaying could not apply this.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-6673) Add Delimited format supports for PB OIV tool

2015-01-20 Thread Haohui Mai (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-6673?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14284596#comment-14284596
 ] 

Haohui Mai commented on HDFS-6673:
--

bq. Thank you very much to pointing this out. In your patch, you have dumped 
inodes to LevelDB sorted by its parent ID. I have tried this method, but in my 
experiments, the time to dumping inodes and scan leveldb sequentially 
overweights the benefits of sequential scanning.

For this particular purpose you don't necessarily need to store the inode in the 
db -- putting the key in the db is sufficient.

bq. I assume that one directory was sequentially written to fsimage. 

This does not hold. The FSImage stores the inodes in no particular order. See 
{{FSImageFormatPBINode#serializeINodeSection}}.

> Add Delimited format supports for PB OIV tool
> -
>
> Key: HDFS-6673
> URL: https://issues.apache.org/jira/browse/HDFS-6673
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Affects Versions: 2.4.0
>Reporter: Lei (Eddy) Xu
>Assignee: Lei (Eddy) Xu
>Priority: Minor
> Attachments: HDFS-6673.000.patch, HDFS-6673.001.patch, 
> HDFS-6673.002.patch, HDFS-6673.003.patch, HDFS-6673.004.patch, 
> HDFS-6673.005.patch
>
>
> The new oiv tool, which is designed for Protobuf fsimage, lacks a few 
> features supported in the old {{oiv}} tool. 
> This task adds supports of _Delimited_ processor to the oiv tool. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-6673) Add Delimited format supports for PB OIV tool

2015-01-20 Thread Lei (Eddy) Xu (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-6673?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14284600#comment-14284600
 ] 

Lei (Eddy) Xu commented on HDFS-6673:
-

bq. For this particular purpose you don't necessarily store the inode into the 
db – putting the key in the db is sufficient.

How could we find the INode to extract the fields to print out in this scenario?

> Add Delimited format supports for PB OIV tool
> -
>
> Key: HDFS-6673
> URL: https://issues.apache.org/jira/browse/HDFS-6673
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Affects Versions: 2.4.0
>Reporter: Lei (Eddy) Xu
>Assignee: Lei (Eddy) Xu
>Priority: Minor
> Attachments: HDFS-6673.000.patch, HDFS-6673.001.patch, 
> HDFS-6673.002.patch, HDFS-6673.003.patch, HDFS-6673.004.patch, 
> HDFS-6673.005.patch
>
>
> The new oiv tool, which is designed for Protobuf fsimage, lacks a few 
> features supported in the old {{oiv}} tool. 
> This task adds supports of _Delimited_ processor to the oiv tool. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (HDFS-7647) DatanodeManager.sortLocatedBlocks() sorts DatanodeIDs but not StorageIDs

2015-01-20 Thread Milan Desai (JIRA)
Milan Desai created HDFS-7647:
-

 Summary: DatanodeManager.sortLocatedBlocks() sorts DatanodeIDs but 
not StorageIDs
 Key: HDFS-7647
 URL: https://issues.apache.org/jira/browse/HDFS-7647
 Project: Hadoop HDFS
  Issue Type: Bug
Affects Versions: 3.0.0
Reporter: Milan Desai
Assignee: Milan Desai


DatanodeManager.sortLocatedBlocks() sorts the array of DatanodeIDs inside each 
LocatedBlock, but does not touch the array of StorageIDs and StorageTypes. As a 
result, the DatanodeIDs and StorageIDs/StorageTypes are mismatched. The method 
is called by FSNamesystem.getBlockLocations(), so the client will not know 
which StorageID/Type corresponds to which DatanodeID.
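A generic illustration of the bug class (plain Java, not the DatanodeManager code): when several parallel arrays describe the same entries, a sort has to permute all of them together, for example by sorting an index array and applying that permutation everywhere.

{code}
import java.util.Arrays;
import java.util.Comparator;

public class ParallelSortExample {
  /** Sort nodes by key and keep the parallel storageIds/storageTypes arrays aligned. */
  static void sortTogether(String[] nodes, String[] storageIds, String[] storageTypes) {
    Integer[] order = new Integer[nodes.length];
    for (int i = 0; i < order.length; i++) {
      order[i] = i;
    }
    // Sort only the index permutation, keyed by the node array.
    Arrays.sort(order, Comparator.comparing((Integer i) -> nodes[i]));

    // Apply the same permutation to every parallel array.
    String[] n = nodes.clone(), s = storageIds.clone(), t = storageTypes.clone();
    for (int i = 0; i < order.length; i++) {
      nodes[i] = n[order[i]];
      storageIds[i] = s[order[i]];
      storageTypes[i] = t[order[i]];
    }
  }
}
{code}

Sorting only {{nodes}} and leaving the other two arrays untouched is precisely the mismatch described above.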



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-7005) DFS input streams do not timeout

2015-01-20 Thread Colin Patrick McCabe (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-7005?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14284645#comment-14284645
 ] 

Colin Patrick McCabe commented on HDFS-7005:


[~zsl2007], it appears that the DataNode is setting both a write and a read 
timeout on its sockets, but the DFSClient is only setting a read timeout.  If 
you want to file another JIRA to add a write timeout to DFSClient sockets, that 
might be a good idea.
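For background, this is the kind of read timeout in question, shown with plain {{java.net.Socket}} rather than the DFSClient/Peer classes: without {{setSoTimeout}}, a read on a half-dead connection can block indefinitely; with it, the read fails fast with a {{SocketTimeoutException}} that the caller can handle.

{code}
import java.io.InputStream;
import java.net.InetSocketAddress;
import java.net.Socket;
import java.net.SocketTimeoutException;

public class ReadTimeoutExample {
  public static void main(String[] args) throws Exception {
    try (Socket s = new Socket()) {
      s.connect(new InetSocketAddress("example.com", 80), 10_000); // connect timeout
      s.setSoTimeout(60_000);      // read timeout: a blocked read throws after 60s
      InputStream in = s.getInputStream();
      try {
        int b = in.read();         // would otherwise block forever if the peer vanishes
        System.out.println("read: " + b);
      } catch (SocketTimeoutException e) {
        System.err.println("read timed out; the connection is probably dead");
      }
    }
  }
}
{code}

Note that {{setSoTimeout}} only covers reads; blocking writes need a separate mechanism, which is why a write timeout for DFSClient would be its own change.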

> DFS input streams do not timeout
> 
>
> Key: HDFS-7005
> URL: https://issues.apache.org/jira/browse/HDFS-7005
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: hdfs-client
>Affects Versions: 3.0.0, 2.5.0
>Reporter: Daryn Sharp
>Assignee: Daryn Sharp
>Priority: Critical
> Fix For: 2.6.0
>
> Attachments: HDFS-7005.patch
>
>
> Input streams lost their timeout.  The problem appears to be 
> {{DFSClient#newConnectedPeer}} does not set the read timeout.  During a 
> temporary network interruption the server will close the socket, unbeknownst 
> to the client host, which blocks on a read forever.
> The results are dire.  Services such as the RM, JHS, NMs, oozie servers, etc 
> all need to be restarted to recover - unless you want to wait many hours for 
> the tcp stack keepalive to detect the broken socket.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-7611) TestFileTruncate.testTruncateEditLogLoad times out waiting for Mini HDFS Cluster to start

2015-01-20 Thread Konstantin Shvachko (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-7611?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14284651#comment-14284651
 ] 

Konstantin Shvachko commented on HDFS-7611:
---

Was looking at {{TestOpenFilesWithSnapshot}}, which also restarts the NameNode and 
fails intermittently with the same timeout. I see similar behavior to what Byron 
described.
The test creates two files {{/test/test/test2}} and {{/test/test/test3}}, then 
aborts the streams, creates a snapshot, deletes the files, and restarts the 
NameNode. If any of the replicas of the files were created on any of the DNs, then 
the test succeeds. If the stream is aborted before the replicas are created, 
then the test fails.
So some blocks that were deleted before the NN restart are not being garbage 
collected on restart, and the NN then cannot get out of safe mode.
This test does not use truncate, but it does use snapshots.

> TestFileTruncate.testTruncateEditLogLoad times out waiting for Mini HDFS 
> Cluster to start
> -
>
> Key: HDFS-7611
> URL: https://issues.apache.org/jira/browse/HDFS-7611
> Project: Hadoop HDFS
>  Issue Type: Bug
>Affects Versions: 3.0.0
>Reporter: Konstantin Shvachko
>Assignee: Byron Wong
> Attachments: testTruncateEditLogLoad.log
>
>
> I've seen it failing on Jenkins a couple of times. Somehow the cluster is not 
> coming ready after the NN restart.
> Not sure if it is truncate specific, as I've seen the same behaviour with other 
> tests that restart the NameNode.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-7623) Add htrace configuration properties to core-default.xml and update user doc about how to enable htrace

2015-01-20 Thread Colin Patrick McCabe (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-7623?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14284654#comment-14284654
 ] 

Colin Patrick McCabe commented on HDFS-7623:


bq.  using the open source tracing library, 
{{{https://git-wip-us.apache.org/repos/asf/incubator-htrace.git}HTrace}}.

Should say "Apache HTrace" :)

+1 when that's resolved.  thanks, Yi, and sorry for the delays in reviewing 
(long weekend)

> Add htrace configuration properties to core-default.xml and update user doc 
> about how to enable htrace
> --
>
> Key: HDFS-7623
> URL: https://issues.apache.org/jira/browse/HDFS-7623
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Reporter: Yi Liu
>Assignee: Yi Liu
> Attachments: HDFS-7623.001.patch
>
>
> This JIRA does the following things:
> *1.* Add htrace configuration properties to core-default.xml.
> *2.* Update the user doc about how to enable htrace.
> *3.* A few fixes in the user doc.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-3519) Checkpoint upload may interfere with a concurrent saveNamespace

2015-01-20 Thread Ming Ma (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-3519?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ming Ma updated HDFS-3519:
--
Attachment: HDFS-3519-3.patch

Thanks, Chris. Good point. Here is the updated patch with your suggestions.

> Checkpoint upload may interfere with a concurrent saveNamespace
> ---
>
> Key: HDFS-3519
> URL: https://issues.apache.org/jira/browse/HDFS-3519
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: namenode
>Reporter: Todd Lipcon
>Assignee: Ming Ma
>Priority: Critical
> Attachments: HDFS-3519-2.patch, HDFS-3519-3.patch, HDFS-3519.patch, 
> test-output.txt
>
>
> TestStandbyCheckpoints failed in [precommit build 
> 2620|https://builds.apache.org/job/PreCommit-HDFS-Build/2620//testReport/] 
> due to the following issue:
> - both nodes were in Standby state, and configured to checkpoint "as fast as 
> possible"
> - NN1 starts to save its own namespace
> - NN2 starts to upload a checkpoint for the same txid. So, both threads are 
> writing to the same file fsimage.ckpt_12, but the actual file contents 
> correspond to the uploading thread's data.
> - NN1 finished its saveNamespace operation while NN2 was still uploading. So, 
> it renamed the ckpt file. However, the contents of the file are still empty 
> since NN2 hasn't sent any bytes
> - NN2 finishes the upload, and the rename() call fails, which causes the 
> directory to be marked failed, etc.
> The result is that there is a file fsimage_12 which appears to be a finalized 
> image but in fact is incompletely transferred. When the transfer completes, 
> the problem "heals itself" so there wouldn't be persistent corruption unless 
> the machine crashes at the same time. And even then, we'd still have the 
> earlier checkpoint to restore from.
> This same race could occur in a non-HA setup if a user puts the NN in safe 
> mode and issues saveNamespace operations concurrent with a 2NN checkpointing, 
> I believe.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-7575) NameNode not handling heartbeats properly after HDFS-2832

2015-01-20 Thread Colin Patrick McCabe (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-7575?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14284681#comment-14284681
 ] 

Colin Patrick McCabe commented on HDFS-7575:


So there are two approaches here:
1. silently (i.e., without user intervention), dedupe duplicate storage IDs 
when starting up the DataNode
2. create a new DataNode layout version and dedupe duplicate storage IDs during 
the upgrade.

Arguments in favor of approach #1:
* Collisions might happen that we need to dedupe repeatedly.  This argument seems 
specious since the probability is effectively less than the chance of cosmic 
rays causing errors (as Nicholas pointed out); a back-of-the-envelope estimate 
follows this list.  I think the probabilities outlined here make this argument a 
non-starter: 
https://en.wikipedia.org/wiki/Universally_unique_identifier#Random_UUID_probability_of_duplicates.
  Also, approach #1 only dedupes on a single datanode, but there can be many 
datanodes in the cluster.

* As Suresh pointed out, the old software can easily handle cases where the 
Storage IDs are unique.  So using a new layout version is not required to flip 
back and forth between old and new software.  While this is true, we have 
bumped the layout version in the past even when the old software could handle 
the new layout.  For example, HDFS-6482 added a new DN layout version even 
though the old software could use the new blockid-based layout.  So this 
argument is basically just saying "approach #1 is viable."  But it doesn't tell 
us whether approach #1 is a good idea.

* Nobody has made this argument yet, but you could argue that the upgrade 
process will be faster with approach #1 than approach #2.  However, we've done 
datanode layout version upgrades on production clusters in the past and time 
hasn't been an issue.  The JNI hardlink code (and soon, the Java7 hardlink 
code) eliminated the long delays that resulted from spawning shell commands.  
So I don't think this argument is persuasive.
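As a back-of-the-envelope check on that first bullet (a standard birthday-bound estimate, not anything from the patch), even a wildly generous number of storage directories gives a vanishingly small collision probability for random 122-bit UUIDs:

{code}
public class UuidCollisionEstimate {
  public static void main(String[] args) {
    double space = Math.pow(2, 122);   // random bits in a version-4 UUID
    long n = 10_000_000L;              // a very generous count of storage directories
    // Birthday bound: P(any collision) ~= n(n-1) / (2 * 2^122), valid while p << 1.
    double p = ((double) n * (n - 1)) / (2 * space);
    System.out.printf("P(any collision) among %d UUIDs ~= %.1e%n", n, p); // ~ 9e-24
  }
}
{code}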

Arguments in favor of approach #2:
* Changing the storage ID during startup basically changes storage ID from 
being a permanent identifier to a temporary one.  This seems like a small 
change, but I would argue that it's really a big one, architecturally.  For 
example, suppose we wanted to persist this information at some point.  We 
couldn't really do that if it's changing all the time.

* With approach #1, we have to carry the burden of the dedupe code forever.  We 
can't ever stop deduping, even in Hadoop 3.0, because for all we know, the user 
has just upgraded, and was previously running 2.6 (a version with the bug) that 
we will have to correct.  The extra run time isn't an issue, but the complexity 
is.  What if our write to VERSION fails on one of the volume directories?  What 
do we do then?  And then if volume failures are tolerated, this directory could 
later come back and be an issue.  The purpose of layout versions is so that we 
don't have to think about these kind of "mix and match" issues.

* Approach #1 leaves us open to some weird scenarios.  For example, what if I 
have /storage1 -> /foo and /storage2 -> /foo.  In other words, you have what 
appears to be two volume root directories, but it's really the same directory.  
Approach #2 will complain, but approach #1 will happily rename the storageID of 
the /foo directory and continue with the corrupt configuration.  This is what 
happens when you fudge error checking.

So in conclusion I would argue for approach #2.  Thoughts?

> NameNode not handling heartbeats properly after HDFS-2832
> -
>
> Key: HDFS-7575
> URL: https://issues.apache.org/jira/browse/HDFS-7575
> Project: Hadoop HDFS
>  Issue Type: Bug
>Affects Versions: 2.4.0, 2.5.0, 2.6.0
>Reporter: Lars Francke
>Assignee: Arpit Agarwal
>Priority: Critical
> Attachments: HDFS-7575.01.patch, HDFS-7575.02.patch, 
> HDFS-7575.03.binary.patch, HDFS-7575.03.patch, HDFS-7575.04.binary.patch, 
> HDFS-7575.04.patch, HDFS-7575.05.binary.patch, HDFS-7575.05.patch, 
> testUpgrade22via24GeneratesStorageIDs.tgz, 
> testUpgradeFrom22GeneratesStorageIDs.tgz, 
> testUpgradeFrom24PreservesStorageId.tgz
>
>
> Before HDFS-2832 each DataNode would have a unique storageId which included 
> its IP address. Since HDFS-2832 the DataNodes have a unique storageId per 
> storage directory which is just a random UUID.
> They send reports per storage directory in their heartbeats. This heartbeat 
> is processed on the NameNode in the 
> {{DatanodeDescriptor#updateHeartbeatState}} method. Pre HDFS-2832 this would 
> just store the information per Datanode. After the patch though each DataNode 
> can have multiple different storages so it's stored in a map keyed by the 
> storage Id.
> This works fine for all clusters that have been installed post HDFS-2832 as 
> they get a UUID for their storage Id.

[jira] [Commented] (HDFS-6673) Add Delimited format supports for PB OIV tool

2015-01-20 Thread Lei (Eddy) Xu (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-6673?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14284696#comment-14284696
 ] 

Lei (Eddy) Xu commented on HDFS-6673:
-

[~wheat9] To provide more background, I described what I had tried here:

1. I tried using {{directory ID || inode Id}} as the key and the {{INode}} protobuf 
as the value to store all INodes in LevelDB. The end-to-end time was about 40-50 
minutes, while the time to dump the INodes alone was about 20-ish minutes, which is 
already larger than the current end-to-end time (10 minutes). Moreover, when the 
LevelDB became larger (about 1GB, as I recall), the write performance dropped 
significantly. I suspect that is because of 
[write amplification|https://github.com/facebook/rocksdb/wiki/RocksDB-Basics]. 
I also tried to split one large LevelDB into multiple smaller ones, but it was not 
worth the complexity. As a result, I dropped this approach and chose not to 
re-order inodes.

2. 
bq. This does not hold. FSImage stores the inodes with no order. See 
{{FSImageFormatPBINode#serializeINodeSection.}}

Yes, you are right.  But judging by {{INode#hashCode()}}, it seems that they 
are not completely random when {{INode <= 2 ** 32}}. Regardless, since 
{{dirChildMap}} uses {{Long}} as keys and values, the size of {{dirChildMap}} 
is two orders of magnitude smaller than the fsimage.  So if the fsimage is 
{{50GB}}, the LevelDB is less than 1GB and can fit reasonably well into the 
OS cache on a laptop.  Thus one seek per INode is maybe not terribly bad?

3. The {{DirPathCache}} caches the *full path* of the parent directory, with 16K 
entries. Supposing the average full path of a directory is about 128 bytes, it 
uses only about ~1MB of memory. I suppose we can increase the capacity of this 
LRU cache later, once we actually measure the hit rates. I believe that this 
LRU cache should work, given that the measured performance of this 
approach is faster.

4. Unlike in {{FileDistributionCalculator}}, we need the full path of an inode 
when printing it.  Since directories and inodes are stored out of order in the 
fsimage, we need to sort at least the directories or the inodes to some extent. I 
chose to sort directories, because 

# The total # of directories is much smaller.
# The LRU cache is more (really, only) effective for directories. 

Does this make sense to you, [~wheat9]? It would be great if I could get a +1 from 
you.

Thanks!

> Add Delimited format supports for PB OIV tool
> -
>
> Key: HDFS-6673
> URL: https://issues.apache.org/jira/browse/HDFS-6673
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Affects Versions: 2.4.0
>Reporter: Lei (Eddy) Xu
>Assignee: Lei (Eddy) Xu
>Priority: Minor
> Attachments: HDFS-6673.000.patch, HDFS-6673.001.patch, 
> HDFS-6673.002.patch, HDFS-6673.003.patch, HDFS-6673.004.patch, 
> HDFS-6673.005.patch
>
>
> The new oiv tool, which is designed for Protobuf fsimage, lacks a few 
> features supported in the old {{oiv}} tool. 
> This task adds supports of _Delimited_ processor to the oiv tool. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-7646) HDFS truncate may remove data from the supposedly read-only previous/ directory during an upgrade

2015-01-20 Thread Tsz Wo Nicholas Sze (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-7646?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14284708#comment-14284708
 ] 

Tsz Wo Nicholas Sze commented on HDFS-7646:
---

Do we have copy-on-truncate?  We discussed this earlier; see [~shv]'s 
[comment|https://issues.apache.org/jira/browse/HDFS-3107?focusedCommentId=14189216&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-14189216].

> HDFS truncate may remove data from the supposedly read-only previous/ 
> directory during an upgrade
> -
>
> Key: HDFS-7646
> URL: https://issues.apache.org/jira/browse/HDFS-7646
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: Colin Patrick McCabe
>Priority: Blocker
>
> During a DataNode layout version upgrade, HDFS creates hardlinks.  These 
> hardlinks allow the same block to be accessible from both the current/ and 
> previous/ directories.  Rollback is possible by deleting the current/ 
> directory and renaming previous/ to current/.
> However, if the user truncates one of these hardlinked block files, it 
> effectively eliminates the ability to roll back to the previous data.
> We probably need to disable truncation-in-place during a DataNode upgrade.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-6673) Add Delimited format supports for PB OIV tool

2015-01-20 Thread Lei (Eddy) Xu (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-6673?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14284889#comment-14284889
 ] 

Lei (Eddy) Xu commented on HDFS-6673:
-

I went back to check the sizes of the two LevelDB metadata maps for this 3.3GB 
fsimage:

{code}
$ du -h dirMap/
46M     dirMap/
$ du -h dirChildMap/
244M    dirChildMap/
{code}

> Add Delimited format supports for PB OIV tool
> -
>
> Key: HDFS-6673
> URL: https://issues.apache.org/jira/browse/HDFS-6673
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Affects Versions: 2.4.0
>Reporter: Lei (Eddy) Xu
>Assignee: Lei (Eddy) Xu
>Priority: Minor
> Attachments: HDFS-6673.000.patch, HDFS-6673.001.patch, 
> HDFS-6673.002.patch, HDFS-6673.003.patch, HDFS-6673.004.patch, 
> HDFS-6673.005.patch
>
>
> The new oiv tool, which is designed for Protobuf fsimage, lacks a few 
> features supported in the old {{oiv}} tool. 
> This task adds supports of _Delimited_ processor to the oiv tool. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-7646) HDFS truncate may remove data from the supposedly read-only previous/ directory during an upgrade

2015-01-20 Thread Konstantin Shvachko (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-7646?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14284897#comment-14284897
 ] 

Konstantin Shvachko commented on HDFS-7646:
---

In-place truncate is disabled during upgrades. The NN will do copy-on-truncate 
until the upgrade is finalized or rolled back. This is reflected in the design doc 
of HDFS-3107, and there is a unit test verifying that.

> HDFS truncate may remove data from the supposedly read-only previous/ 
> directory during an upgrade
> -
>
> Key: HDFS-7646
> URL: https://issues.apache.org/jira/browse/HDFS-7646
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: Colin Patrick McCabe
>Priority: Blocker
>
> During a DataNode layout version upgrade, HDFS creates hardlinks.  These 
> hardlinks allow the same block to be accessible from both the current/ and 
> previous/ directories.  Rollback is possible by deleting the current/ 
> directory and renaming previous/ to current/.
> However, if the user truncates one of these hardlinked block files, it 
> effectively eliminates the ability to roll back to the previous data.
> We probably need to disable truncation-in-place during a DataNode upgrade.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-7634) Disallow truncation of Lazy persist files

2015-01-20 Thread Yi Liu (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-7634?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14284906#comment-14284906
 ] 

Yi Liu commented on HDFS-7634:
--

Thanks a lot to [~arpitagarwal] and [~shv] for the review and commit.

> Disallow truncation of Lazy persist files
> -
>
> Key: HDFS-7634
> URL: https://issues.apache.org/jira/browse/HDFS-7634
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: datanode, namenode
>Reporter: Yi Liu
>Assignee: Yi Liu
> Fix For: 3.0.0
>
> Attachments: HDFS-7634.001.patch, HDFS-7634.002.patch
>
>
> Similar to {{append}}, lazy persist (memory) files should not support 
> truncate currently. Quoting the reason from the HDFS-6581 design doc:
> {quote}
> Appends to files created with the LAZY_PERSIST flag will not be allowed in the 
> initial implementation to avoid the complexity of keeping in-memory and 
> on-disk replicas in sync on a given DataNode.
> {quote}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-7647) DatanodeManager.sortLocatedBlocks() sorts DatanodeIDs but not StorageIDs

2015-01-20 Thread Konstantin Shvachko (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-7647?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Konstantin Shvachko updated HDFS-7647:
--
 Target Version/s: 2.7.0
Affects Version/s: (was: 3.0.0)
   2.6.0

> DatanodeManager.sortLocatedBlocks() sorts DatanodeIDs but not StorageIDs
> 
>
> Key: HDFS-7647
> URL: https://issues.apache.org/jira/browse/HDFS-7647
> Project: Hadoop HDFS
>  Issue Type: Bug
>Affects Versions: 2.6.0
>Reporter: Milan Desai
>Assignee: Milan Desai
>
> DatanodeManager.sortLocatedBlocks() sorts the array of DatanodeIDs inside 
> each LocatedBlock, but does not touch the array of StorageIDs and 
> StorageTypes. As a result, the DatanodeIDs and StorageIDs/StorageTypes are 
> mismatched. The method is called by FSNamesystem.getBlockLocations(), so the 
> client will not know which StorageID/Type corresponds to which DatanodeID.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-7496) Fix FsVolume removal race conditions on the DataNode

2015-01-20 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-7496?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14284913#comment-14284913
 ] 

Hadoop QA commented on HDFS-7496:
-

{color:green}+1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12693390/HDFS-7496.007.patch
  against trunk revision dd0228b.

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 11 new 
or modified test files.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  There were no new javadoc warning messages.

{color:green}+1 eclipse:eclipse{color}.  The patch built with 
eclipse:eclipse.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 2.0.3) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:green}+1 core tests{color}.  The patch passed unit tests in 
hadoop-hdfs-project/hadoop-hdfs.

Test results: 
https://builds.apache.org/job/PreCommit-HDFS-Build/9280//testReport/
Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/9280//console

This message is automatically generated.

> Fix FsVolume removal race conditions on the DataNode 
> -
>
> Key: HDFS-7496
> URL: https://issues.apache.org/jira/browse/HDFS-7496
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: Colin Patrick McCabe
>Assignee: Lei (Eddy) Xu
> Attachments: HDFS-7496.000.patch, HDFS-7496.001.patch, 
> HDFS-7496.002.patch, HDFS-7496.003.patch, HDFS-7496.003.patch, 
> HDFS-7496.004.patch, HDFS-7496.005.patch, HDFS-7496.006.patch, 
> HDFS-7496.007.patch
>
>
> We discussed a few FsVolume removal race conditions on the DataNode in 
> HDFS-7489.  We should figure out a way to make removing an FsVolume safe.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-7575) NameNode not handling heartbeats properly after HDFS-2832

2015-01-20 Thread Tsz Wo Nicholas Sze (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-7575?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14284937#comment-14284937
 ] 

Tsz Wo Nicholas Sze commented on HDFS-7575:
---

Layout version defines the layout format, not the software (don't confuse it 
with the software version).  The question here is whether there is a layout 
format change.  Are we changing from a layout, where some storage IDs 
could be the same, to a new layout, where all storage IDs have to be distinct?  
I think the answer is no, since the same storage ID does not work even with the 
old software.

> NameNode not handling heartbeats properly after HDFS-2832
> -
>
> Key: HDFS-7575
> URL: https://issues.apache.org/jira/browse/HDFS-7575
> Project: Hadoop HDFS
>  Issue Type: Bug
>Affects Versions: 2.4.0, 2.5.0, 2.6.0
>Reporter: Lars Francke
>Assignee: Arpit Agarwal
>Priority: Critical
> Attachments: HDFS-7575.01.patch, HDFS-7575.02.patch, 
> HDFS-7575.03.binary.patch, HDFS-7575.03.patch, HDFS-7575.04.binary.patch, 
> HDFS-7575.04.patch, HDFS-7575.05.binary.patch, HDFS-7575.05.patch, 
> testUpgrade22via24GeneratesStorageIDs.tgz, 
> testUpgradeFrom22GeneratesStorageIDs.tgz, 
> testUpgradeFrom24PreservesStorageId.tgz
>
>
> Before HDFS-2832 each DataNode would have a unique storageId which included 
> its IP address. Since HDFS-2832 the DataNodes have a unique storageId per 
> storage directory which is just a random UUID.
> They send reports per storage directory in their heartbeats. This heartbeat 
> is processed on the NameNode in the 
> {{DatanodeDescriptor#updateHeartbeatState}} method. Pre HDFS-2832 this would 
> just store the information per Datanode. After the patch though each DataNode 
> can have multiple different storages so it's stored in a map keyed by the 
> storage Id.
> This works fine for all clusters that have been installed post HDFS-2832 as 
> they get a UUID for their storage Id. So a DN with 8 drives has a map with 8 
> different keys. On each Heartbeat the Map is searched and updated 
> ({{DatanodeStorageInfo storage = storageMap.get(s.getStorageID());}}):
> {code:title=DatanodeStorageInfo}
>   void updateState(StorageReport r) {
> capacity = r.getCapacity();
> dfsUsed = r.getDfsUsed();
> remaining = r.getRemaining();
> blockPoolUsed = r.getBlockPoolUsed();
>   }
> {code}
> On clusters that were upgraded from a pre HDFS-2832 version though the 
> storage Id has not been rewritten (at least not on the four clusters I 
> checked) so each directory will have the exact same storageId. That means 
> there'll be only a single entry in the {{storageMap}} and it'll be 
> overwritten by a random {{StorageReport}} from the DataNode. This can be seen 
> in the {{updateState}} method above. This just assigns the capacity from the 
> received report; instead it should probably sum the values up per received 
> heartbeat.
> The Balancer seems to be one of the only things that actually uses this 
> information so it now considers the utilization of a random drive per 
> DataNode for balancing purposes.
> Things get even worse when a drive has been added or replaced as this will 
> now get a new storage Id so there'll be two entries in the storageMap. As new 
> drives are usually empty, it skews the balancer's decision in a way that this 
> node will never be considered over-utilized.
> Another problem is that old StorageReports are never removed from the 
> storageMap. So if I replace a drive and it gets a new storage Id the old one 
> will still be in place and used for all calculations by the Balancer until a 
> restart of the NameNode.
> I can try providing a patch that does the following:
> * Instead of using a Map I could just store the array we receive or instead 
> of storing an array sum up the values for reports with the same Id
> * On each heartbeat clear the map (so we know we have up to date information)
> Does that sound sensible?



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-6673) Add Delimited format supports for PB OIV tool

2015-01-20 Thread Haohui Mai (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-6673?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14284942#comment-14284942
 ] 

Haohui Mai commented on HDFS-6673:
--

Just to recap, the current approach is (please correct me if I'm wrong):

  # Scan the {{INodeDirectorySection}} linearly and put a map {{childId -> 
parentId}} into LevelDB. 
  # Scan the {{INodeSection}} and store a map {{id -> localName}} into LevelDB 
for all directories.
  # Scan the {{INodeSection}} and, for each inode, construct the full path by 
looking up in the LevelDB.

The size of LevelDB is {{#inodes * sizeof(inodeid) * 2}} + {{local names for 
all directories}} (as every inode has a parent). For a rough estimate, the size 
of LevelDB is more than 8G for an image that contains 400M inodes. This is 
large enough that it may not fit in the working set.

Step (3) requires several LevelDB lookups per inode. (I'm skeptical that an LRU 
cache actually helps, since there is really no locality here, as mentioned 
earlier.) My concern is that once the LevelDB fails to fit in the working set, 
each lookup becomes at least one seek. Note that a typical HDD serves around 
100 IOPS, thus for 400M inodes it takes 400M / 100 = 4M seconds ~ 1000 
hours to complete.

My proposal is:

  # Scan the {{INodeDirectorySection}} linearly and put a map {{childId -> 
parentId}} in memory.
  # Scan the {{INodeSection}} and, for each inode, store an entry {{parentId || 
localName -> info}} into LevelDB.
  # Scan the LevelDB using DFS and then output the result.

The differences are: (1) it has more writes, as it stores all required 
information into the LevelDB; (2) it requires a bigger working set at step (1); 
(3) there is only one seek per directory instead of one seek per file in step 
(3), which bounds the total time when processing a large fsimage.

More comments:

bq. the end-to-end time is about 40-50 minutes, while the time to dump INodes 
along is about 20-ish minutes, which is already larger than the end-to-end time 
now (10 minutes) ... I had tried use directory ID || inode Id as key and INode 
protobuf as value to store all INodes in LevelDB

This is an apples-to-oranges comparison. Protobuf has significant overhead due 
to excessive object creation; I found it takes ~30% of the total processing 
time when building the PB-based fsimage. I suggest dumping only the required 
information in a customized format for this patch.
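To make the proposed key layout concrete, here is a small sketch of steps (2) and (3), with a {{TreeMap}} standing in for LevelDB's sorted key space (all names hypothetical, not the HDFS-6673 patch): children of a directory are contiguous under a {{parentId || localName}} key, so one range scan per directory emits every full path.

{code:title=Key layout and DFS (sketch, TreeMap as a stand-in for LevelDB)}
import java.util.Map;
import java.util.TreeMap;

public class OivDelimitedSketch {
  // sorted key/value store; a real implementation would use a LevelDB handle
  private final TreeMap<String, String> db = new TreeMap<>();

  // Step (2): key = zero-padded parentId + '/' + localName, value = "inodeId,isDir".
  void put(long parentId, String localName, long inodeId, boolean isDir) {
    db.put(key(parentId, localName), inodeId + "," + isDir);
  }

  private static String key(long parentId, String name) {
    // zero-padding makes lexicographic order match numeric order of parentId
    return String.format("%019d/%s", parentId, name);
  }

  // Step (3): DFS over the sorted keys; one range scan ("seek") per directory.
  void dfs(long dirId, String path) {
    String prefix = String.format("%019d/", dirId);
    for (Map.Entry<String, String> e : db.tailMap(prefix).entrySet()) {
      if (!e.getKey().startsWith(prefix)) {
        break;                                   // left this directory's key range
      }
      String full = path + "/" + e.getKey().substring(prefix.length());
      System.out.println(full);                  // emit the full path (plus info columns)
      String[] v = e.getValue().split(",");
      if (Boolean.parseBoolean(v[1])) {
        dfs(Long.parseLong(v[0]), full);         // recurse into a sub-directory
      }
    }
  }
}
{code}

With a real LevelDB the prefix scan is one seek followed by sequential reads, which is where the one-seek-per-directory bound comes from.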

> Add Delimited format supports for PB OIV tool
> -
>
> Key: HDFS-6673
> URL: https://issues.apache.org/jira/browse/HDFS-6673
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Affects Versions: 2.4.0
>Reporter: Lei (Eddy) Xu
>Assignee: Lei (Eddy) Xu
>Priority: Minor
> Attachments: HDFS-6673.000.patch, HDFS-6673.001.patch, 
> HDFS-6673.002.patch, HDFS-6673.003.patch, HDFS-6673.004.patch, 
> HDFS-6673.005.patch
>
>
> The new oiv tool, which is designed for the Protobuf fsimage, lacks a few 
> features supported in the old {{oiv}} tool. 
> This task adds support for the _Delimited_ processor to the oiv tool. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-7610) Should use StorageDirectory.getCurrentDIr() to construct FsVolumeImpl

2015-01-20 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-7610?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14284950#comment-14284950
 ] 

Hadoop QA commented on HDFS-7610:
-

{color:green}+1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12693398/HDFS-7610.001.patch
  against trunk revision dd0228b.

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 1 new 
or modified test files.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  There were no new javadoc warning messages.

{color:green}+1 eclipse:eclipse{color}.  The patch built with 
eclipse:eclipse.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 2.0.3) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:green}+1 core tests{color}.  The patch passed unit tests in 
hadoop-hdfs-project/hadoop-hdfs.

Test results: 
https://builds.apache.org/job/PreCommit-HDFS-Build/9281//testReport/
Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/9281//console

This message is automatically generated.

> Should use StorageDirectory.getCurrentDIr() to construct FsVolumeImpl
> -
>
> Key: HDFS-7610
> URL: https://issues.apache.org/jira/browse/HDFS-7610
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: datanode
>Affects Versions: 2.6.0
>Reporter: Lei (Eddy) Xu
>Assignee: Lei (Eddy) Xu
> Attachments: HDFS-7610.000.patch, HDFS-7610.001.patch
>
>
> In the hot swap feature, {{FsDatasetImpl#addVolume}} uses the base volume dir 
> (e.g. "{{/foo/data0}}") instead of the volume's current dir 
> (e.g. "{{/foo/data0/current}}") to construct {{FsVolumeImpl}}. As a result, the 
> DataNode cannot remove this newly added volume, because its 
> {{FsVolumeImpl#getBasePath}} returns "{{/foo}}".
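A minimal sketch of the fix idea, assuming (as the description implies) that {{FsVolumeImpl#getBasePath}} is derived as the parent of the directory the volume is constructed with; names and paths below are illustrative only:

{code:title=Base dir vs. current dir (sketch, hypothetical names)}
import java.io.File;

class VolumePathSketch {
  // what a getBasePath()-style method would return for the directory passed in
  static File basePathOf(File dirPassedToVolume) {
    return dirPassedToVolume.getParentFile();
  }

  public static void main(String[] args) {
    File volumeRoot = new File("/foo/data0");
    File currentDir = new File(volumeRoot, "current"); // StorageDirectory#getCurrentDir()
    System.out.println(basePathOf(volumeRoot));  // /foo       -> configured dir never matches
    System.out.println(basePathOf(currentDir));  // /foo/data0 -> matches, volume can be removed
  }
}
{code}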



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (HDFS-7648) Verify the datanode directory layout

2015-01-20 Thread Tsz Wo Nicholas Sze (JIRA)
Tsz Wo Nicholas Sze created HDFS-7648:
-

 Summary: Verify the datanode directory layout
 Key: HDFS-7648
 URL: https://issues.apache.org/jira/browse/HDFS-7648
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: datanode
Reporter: Tsz Wo Nicholas Sze


HDFS-6482 changed the datanode layout to use the block ID to determine the 
directory in which to store the block.  We should have some mechanism to verify 
it; either DirectoryScanner or block report generation could do the check.
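A hedged sketch of the kind of check this could be, assuming the layout derives two subdirectory indices from bits of the block ID (the exact shifts and masks belong to {{DatanodeUtil#idToBlockDir}}; the helper below is hypothetical):

{code:title=Layout verification (sketch, hypothetical helper)}
import java.io.File;

class LayoutCheckSketch {
  // Stand-in for DatanodeUtil#idToBlockDir; shifts/masks are an assumption here.
  static File expectedDir(File finalizedRoot, long blockId) {
    int d1 = (int) ((blockId >> 16) & 0xFF);
    int d2 = (int) ((blockId >> 8) & 0xFF);
    return new File(finalizedRoot, "subdir" + d1 + File.separator + "subdir" + d2);
  }

  // DirectoryScanner or block report generation could flag misplaced replicas.
  static boolean isMisplaced(File finalizedRoot, long blockId, File actualDir) {
    return !expectedDir(finalizedRoot, blockId).equals(actualDir);
  }
}
{code}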



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-6833) DirectoryScanner should not register a deleting block with memory of DataNode

2015-01-20 Thread Tsz Wo Nicholas Sze (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-6833?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14284959#comment-14284959
 ] 

Tsz Wo Nicholas Sze commented on HDFS-6833:
---

Sure, will review the patch.

> DirectoryScanner should not register a deleting block with memory of DataNode
> -
>
> Key: HDFS-6833
> URL: https://issues.apache.org/jira/browse/HDFS-6833
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: datanode
>Affects Versions: 3.0.0, 2.5.0, 2.5.1
>Reporter: Shinichi Yamashita
>Assignee: Shinichi Yamashita
>Priority: Critical
> Attachments: HDFS-6833-10.patch, HDFS-6833-11.patch, 
> HDFS-6833-12.patch, HDFS-6833-13.patch, HDFS-6833-14.patch, 
> HDFS-6833-6-2.patch, HDFS-6833-6-3.patch, HDFS-6833-6.patch, 
> HDFS-6833-7-2.patch, HDFS-6833-7.patch, HDFS-6833.8.patch, HDFS-6833.9.patch, 
> HDFS-6833.patch, HDFS-6833.patch, HDFS-6833.patch, HDFS-6833.patch, 
> HDFS-6833.patch
>
>
> When a block is deleted in DataNode, the following messages are usually 
> output.
> {code}
> 2014-08-07 17:53:11,606 INFO 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetAsyncDiskService:
>  Scheduling blk_1073741825_1001 file 
> /hadoop/data1/dfs/data/current/BP-1887080305-172.28.0.101-1407398838872/current/finalized/subdir0/subdir0/blk_1073741825
>  for deletion
> 2014-08-07 17:53:11,617 INFO 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetAsyncDiskService:
>  Deleted BP-1887080305-172.28.0.101-1407398838872 blk_1073741825_1001 file 
> /hadoop/data1/dfs/data/current/BP-1887080305-172.28.0.101-1407398838872/current/finalized/subdir0/subdir0/blk_1073741825
> {code}
> However, in the current implementation DirectoryScanner may be executed while 
> the DataNode is deleting the block, and the following messages are output.
> {code}
> 2014-08-07 17:53:30,519 INFO 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetAsyncDiskService:
>  Scheduling blk_1073741825_1001 file 
> /hadoop/data1/dfs/data/current/BP-1887080305-172.28.0.101-1407398838872/current/finalized/subdir0/subdir0/blk_1073741825
>  for deletion
> 2014-08-07 17:53:31,426 INFO 
> org.apache.hadoop.hdfs.server.datanode.DirectoryScanner: BlockPool 
> BP-1887080305-172.28.0.101-1407398838872 Total blocks: 1, missing metadata 
> files:0, missing block files:0, missing blocks in memory:1, mismatched 
> blocks:0
> 2014-08-07 17:53:31,426 WARN 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl: Added 
> missing block to memory FinalizedReplica, blk_1073741825_1001, FINALIZED
>   getNumBytes() = 21230663
>   getBytesOnDisk()  = 21230663
>   getVisibleLength()= 21230663
>   getVolume()   = /hadoop/data1/dfs/data/current
>   getBlockFile()= 
> /hadoop/data1/dfs/data/current/BP-1887080305-172.28.0.101-1407398838872/current/finalized/subdir0/subdir0/blk_1073741825
>   unlinked  =false
> 2014-08-07 17:53:31,531 INFO 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetAsyncDiskService:
>  Deleted BP-1887080305-172.28.0.101-1407398838872 blk_1073741825_1001 file 
> /hadoop/data1/dfs/data/current/BP-1887080305-172.28.0.101-1407398838872/current/finalized/subdir0/subdir0/blk_1073741825
> {code}
> The information for the block being deleted is re-registered in the DataNode's 
> memory.
> When the DataNode then sends a block report, the NameNode receives wrong block 
> information.
> For example, when we recommission a node or change the replication factor, the 
> NameNode may delete a valid block as "ExcessReplicate" because of this problem, 
> and "Under-Replicated Blocks" and "Missing Blocks" occur.
> When the DataNode runs DirectoryScanner, it should not register a block that is 
> being deleted.
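A minimal sketch of the last point, assuming the DataNode keeps some record of blocks handed to the async disk service for deletion (all names hypothetical, not the attached patch):

{code:title=Skip deleting blocks during reconciliation (sketch, hypothetical names)}
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

class ReconcileSketch {
  // block IDs currently scheduled for asynchronous deletion
  private final Set<Long> deletingBlocks = ConcurrentHashMap.newKeySet();

  void onScheduleDeletion(long blockId) { deletingBlocks.add(blockId); }
  void onDeletionDone(long blockId)     { deletingBlocks.remove(blockId); }

  // called when the scanner finds a block file on disk that is missing in memory
  boolean shouldAddMissingBlock(long blockId) {
    return !deletingBlocks.contains(blockId);   // do not re-register a deleting block
  }
}
{code}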



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-7548) Corrupt block reporting delayed until datablock scanner thread detects it

2015-01-20 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-7548?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14284960#comment-14284960
 ] 

Hadoop QA commented on HDFS-7548:
-

{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12693402/HDFS-7548-v4.patch
  against trunk revision dd0228b.

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 1 new 
or modified test files.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  There were no new javadoc warning messages.

{color:green}+1 eclipse:eclipse{color}.  The patch built with 
eclipse:eclipse.

{color:red}-1 findbugs{color}.  The patch appears to introduce 1 new 
Findbugs (version 2.0.3) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:green}+1 core tests{color}.  The patch passed unit tests in 
hadoop-hdfs-project/hadoop-hdfs.

Test results: 
https://builds.apache.org/job/PreCommit-HDFS-Build/9282//testReport/
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HDFS-Build/9282//artifact/patchprocess/newPatchFindbugsWarningshadoop-hdfs.html
Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/9282//console

This message is automatically generated.

> Corrupt block reporting delayed until datablock scanner thread detects it
> -
>
> Key: HDFS-7548
> URL: https://issues.apache.org/jira/browse/HDFS-7548
> Project: Hadoop HDFS
>  Issue Type: Bug
>Affects Versions: 2.5.0
>Reporter: Rushabh S Shah
>Assignee: Rushabh S Shah
> Attachments: HDFS-7548-v2.patch, HDFS-7548-v3.patch, 
> HDFS-7548-v4.patch, HDFS-7548.patch
>
>
> When there is only one datanode holding the block and that block happens to be
> corrupt, the namenode keeps trying to replicate the block repeatedly, but the 
> block is reported as corrupt only when the data block scanner thread of the 
> datanode picks up this bad block.
> Requesting an improvement in namenode reporting so that the corrupt replica is 
> reported when there is only 1 replica and replication of that replica keeps 
> failing with a checksum error.
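A sketch only of the requested behaviour, not an actual HDFS API: when serving the lone replica for re-replication fails with a checksum error, report it as corrupt immediately instead of waiting for the periodic block scanner (the reporter interface below is hypothetical):

{code:title=Report on checksum failure (sketch, hypothetical names)}
class CorruptReportSketch {
  // hypothetical hook standing in for the DataNode-to-NameNode bad-block report
  interface BadBlockReporter { void reportCorrupt(long blockId); }

  private final BadBlockReporter reporter;
  CorruptReportSketch(BadBlockReporter reporter) { this.reporter = reporter; }

  // called after each attempt to serve the replica for re-replication
  void onReplicationAttempt(long blockId, boolean checksumOk) {
    if (!checksumOk) {
      reporter.reportCorrupt(blockId);          // do not wait for the scanner thread
    }
  }
}
{code}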



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-7623) Add htrace configuration properties to core-default.xml and update user doc about how to enable htrace

2015-01-20 Thread Yi Liu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-7623?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yi Liu updated HDFS-7623:
-
Attachment: HDFS-7623.002.patch

No problem :) Thanks a lot for the review, Colin.
Updated the patch to address the comment; will commit shortly.

> Add htrace configuration properties to core-default.xml and update user doc 
> about how to enable htrace
> --
>
> Key: HDFS-7623
> URL: https://issues.apache.org/jira/browse/HDFS-7623
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Reporter: Yi Liu
>Assignee: Yi Liu
> Attachments: HDFS-7623.001.patch, HDFS-7623.002.patch
>
>
> This JIRA does the following things:
> *1.* Add htrace configuration properties to core-default.xml.
> *2.* Update the user doc about how to enable htrace.
> *3.* A few fixes in the user doc.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-7575) NameNode not handling heartbeats properly after HDFS-2832

2015-01-20 Thread Tsz Wo Nicholas Sze (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-7575?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14284973#comment-14284973
 ] 

Tsz Wo Nicholas Sze commented on HDFS-7575:
---

We should log the old (invalid) storage id.

+1 on HDFS-7575.05.patch other than that.

> NameNode not handling heartbeats properly after HDFS-2832
> -
>
> Key: HDFS-7575
> URL: https://issues.apache.org/jira/browse/HDFS-7575
> Project: Hadoop HDFS
>  Issue Type: Bug
>Affects Versions: 2.4.0, 2.5.0, 2.6.0
>Reporter: Lars Francke
>Assignee: Arpit Agarwal
>Priority: Critical
> Attachments: HDFS-7575.01.patch, HDFS-7575.02.patch, 
> HDFS-7575.03.binary.patch, HDFS-7575.03.patch, HDFS-7575.04.binary.patch, 
> HDFS-7575.04.patch, HDFS-7575.05.binary.patch, HDFS-7575.05.patch, 
> testUpgrade22via24GeneratesStorageIDs.tgz, 
> testUpgradeFrom22GeneratesStorageIDs.tgz, 
> testUpgradeFrom24PreservesStorageId.tgz
>
>
> Before HDFS-2832 each DataNode would have a unique storageId which included 
> its IP address. Since HDFS-2832 the DataNodes have a unique storageId per 
> storage directory which is just a random UUID.
> They send reports per storage directory in their heartbeats. This heartbeat 
> is processed on the NameNode in the 
> {{DatanodeDescriptor#updateHeartbeatState}} method. Pre HDFS-2832 this would 
> just store the information per Datanode. After the patch though each DataNode 
> can have multiple different storages so it's stored in a map keyed by the 
> storage Id.
> This works fine for all clusters that have been installed post HDFS-2832 as 
> they get a UUID for their storage Id. So a DN with 8 drives has a map with 8 
> different keys. On each Heartbeat the Map is searched and updated 
> ({{DatanodeStorageInfo storage = storageMap.get(s.getStorageID());}}):
> {code:title=DatanodeStorageInfo}
>   void updateState(StorageReport r) {
> capacity = r.getCapacity();
> dfsUsed = r.getDfsUsed();
> remaining = r.getRemaining();
> blockPoolUsed = r.getBlockPoolUsed();
>   }
> {code}
> On clusters that were upgraded from a pre HDFS-2832 version though the 
> storage Id has not been rewritten (at least not on the four clusters I 
> checked) so each directory will have the exact same storageId. That means 
> there'll be only a single entry in the {{storageMap}} and it'll be 
> overwritten by a random {{StorageReport}} from the DataNode. This can be seen 
> in the {{updateState}} method above, which just assigns the capacity from the 
> received report; instead, it should probably sum the values across all reports 
> received in a heartbeat.
> The Balancer seems to be one of the only things that actually uses this 
> information so it now considers the utilization of a random drive per 
> DataNode for balancing purposes.
> Things get even worse when a drive has been added or replaced, as this will 
> now get a new storage Id, so there'll be two entries in the storageMap. As new 
> drives are usually empty, this skews the balancer's decision in a way that this 
> node will never be considered over-utilized.
> Another problem is that old StorageReports are never removed from the 
> storageMap. So if I replace a drive and it gets a new storage Id the old one 
> will still be in place and used for all calculations by the Balancer until a 
> restart of the NameNode.
> I can try providing a patch that does the following:
> * Instead of using a Map, I could just store the array we receive, or, instead 
> of storing an array, sum up the values for reports with the same Id
> * On each heartbeat, clear the map (so we know we have up-to-date information)
> Does that sound sensible?



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

