[jira] [Commented] (HDFS-6825) Edit log corruption due to delayed block removal

2014-08-11 Thread Yongjun Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-6825?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14093809#comment-14093809
 ] 

Yongjun Zhang commented on HDFS-6825:
-

Hi [~andrew.wang] and [~atm],

Thanks a lot for the review and comments. I attached version 004 to address 
them.

To answer ATM's question 3: the code is necessary because otherwise 
commitBlockSynchronization would throw, in TestCommitBlockSynchronization, the 
FileNotFoundException introduced by this fix (see 
https://builds.apache.org/job/PreCommit-HDFS-Build/7584//testReport/). The code 
is added so that isFileDeleted returns false for the file; thus the intended 
test can run instead of hitting that FileNotFoundException. I added a comment 
at the beginning of this change:
{code}
// set file's parent and put the file to inodeMap, so FSNamesystem's
// isFileDeleted() method will return false on this file
{code}
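
For concreteness, a minimal sketch of what that setup amounts to in the test 
(the helper names are illustrative, not the literal patch):

{code}
// Sketch only: give the INodeFile a parent and register it in FSDirectory's
// inodeMap, so FSNamesystem#isFileDeleted() returns false and
// commitBlockSynchronization() can proceed instead of throwing
// FileNotFoundException.
INodeFile file = mockFileUnderConstruction();   // hypothetical test helper
file.setParent(mock(INodeDirectory.class));     // Mockito mock as the parent
namesystem.dir.getINodeMap().put(file);
{code}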

Thanks.


> Edit log corruption due to delayed block removal
> 
>
> Key: HDFS-6825
> URL: https://issues.apache.org/jira/browse/HDFS-6825
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: namenode
>Affects Versions: 2.5.0
>Reporter: Yongjun Zhang
>Assignee: Yongjun Zhang
> Attachments: HDFS-6825.001.patch, HDFS-6825.002.patch, 
> HDFS-6825.003.patch, HDFS-6825.004.patch
>
>
> Observed the following stack:
> {code}
> 2014-08-04 23:49:44,133 INFO 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem: 
> commitBlockSynchronization(lastblock=BP-.., newgenerationstamp=..., 
> newlength=..., newtargets=..., closeFile=true, deleteBlock=false)
> 2014-08-04 23:49:44,133 WARN 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem: Unexpected exception 
> while updating disk space. 
> java.io.FileNotFoundException: Path not found: 
> /solr/hierarchy/core_node1/data/tlog/tlog.xyz
> at 
> org.apache.hadoop.hdfs.server.namenode.FSDirectory.updateSpaceConsumed(FSDirectory.java:1807)
> at 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.commitOrCompleteLastBlock(FSNamesystem.java:3975)
> at 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.closeFileCommitBlocks(FSNamesystem.java:4178)
> at 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.commitBlockSynchronization(FSNamesystem.java:4146)
> at 
> org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.commitBlockSynchronization(NameNodeRpcServer.java:662)
> at 
> org.apache.hadoop.hdfs.protocolPB.DatanodeProtocolServerSideTranslatorPB.commitBlockSynchronization(DatanodeProtocolServerSideTranslatorPB.java:270)
> at 
> org.apache.hadoop.hdfs.protocol.proto.DatanodeProtocolProtos$DatanodeProtocolService$2.callBlockingMethod(DatanodeProtocolProtos.java:28073)
> at 
> org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:585)
> at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:1026)
> at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1986)
> at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1982)
> at java.security.AccessController.doPrivileged(Native Method)
> at javax.security.auth.Subject.doAs(Subject.java:415)
> at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1554)
> at org.apache.hadoop.ipc.Server$Handler.run(Server.java:1980)
> {code}
> Here is what we found happened:
> - the client created file /solr/hierarchy/core_node1/data/tlog/tlog.xyz
> - the client tried to append to this file, but the lease had expired, so 
> lease recovery was started and the append failed
> - the file was deleted, but some pending blocks of this file had not been 
> removed yet
> - commitBlockSynchronization() was then called (see stack above), and an 
> INodeFile was created from the pending block, unaware that the file had 
> already been deleted
> - FileNotFoundException was thrown by FSDirectory.updateSpaceConsumed, but 
> swallowed by commitOrCompleteLastBlock
> - closeFileCommitBlocks went on to call finalizeINodeFileUnderConstruction 
> and wrote a CloseOp to the edit log



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (HDFS-6825) Edit log corruption due to delayed block removal

2014-08-11 Thread Yongjun Zhang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-6825?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yongjun Zhang updated HDFS-6825:


Attachment: HDFS-6825.004.patch

> Edit log corruption due to delayed block removal
> 
>
> Key: HDFS-6825
> URL: https://issues.apache.org/jira/browse/HDFS-6825
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: namenode
>Affects Versions: 2.5.0
>Reporter: Yongjun Zhang
>Assignee: Yongjun Zhang
> Attachments: HDFS-6825.001.patch, HDFS-6825.002.patch, 
> HDFS-6825.003.patch, HDFS-6825.004.patch
>
>
> Observed the following stack:
> {code}
> 2014-08-04 23:49:44,133 INFO 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem: 
> commitBlockSynchronization(lastblock=BP-.., newgenerationstamp=..., 
> newlength=..., newtargets=..., closeFile=true, deleteBlock=false)
> 2014-08-04 23:49:44,133 WARN 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem: Unexpected exception 
> while updating disk space. 
> java.io.FileNotFoundException: Path not found: 
> /solr/hierarchy/core_node1/data/tlog/tlog.xyz
> at 
> org.apache.hadoop.hdfs.server.namenode.FSDirectory.updateSpaceConsumed(FSDirectory.java:1807)
> at 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.commitOrCompleteLastBlock(FSNamesystem.java:3975)
> at 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.closeFileCommitBlocks(FSNamesystem.java:4178)
> at 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.commitBlockSynchronization(FSNamesystem.java:4146)
> at 
> org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.commitBlockSynchronization(NameNodeRpcServer.java:662)
> at 
> org.apache.hadoop.hdfs.protocolPB.DatanodeProtocolServerSideTranslatorPB.commitBlockSynchronization(DatanodeProtocolServerSideTranslatorPB.java:270)
> at 
> org.apache.hadoop.hdfs.protocol.proto.DatanodeProtocolProtos$DatanodeProtocolService$2.callBlockingMethod(DatanodeProtocolProtos.java:28073)
> at 
> org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:585)
> at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:1026)
> at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1986)
> at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1982)
> at java.security.AccessController.doPrivileged(Native Method)
> at javax.security.auth.Subject.doAs(Subject.java:415)
> at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1554)
> at org.apache.hadoop.ipc.Server$Handler.run(Server.java:1980)
> {code}
> Here is what we found happened:
> - the client created file /solr/hierarchy/core_node1/data/tlog/tlog.xyz
> - the client tried to append to this file, but the lease had expired, so 
> lease recovery was started and the append failed
> - the file was deleted, but some pending blocks of this file had not been 
> removed yet
> - commitBlockSynchronization() was then called (see stack above), and an 
> INodeFile was created from the pending block, unaware that the file had 
> already been deleted
> - FileNotFoundException was thrown by FSDirectory.updateSpaceConsumed, but 
> swallowed by commitOrCompleteLastBlock
> - closeFileCommitBlocks went on to call finalizeINodeFileUnderConstruction 
> and wrote a CloseOp to the edit log



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (HDFS-6825) Edit log corruption due to delayed block removal

2014-08-11 Thread Yongjun Zhang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-6825?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yongjun Zhang updated HDFS-6825:


Attachment: (was: HDFS-6825.004.patch)

> Edit log corruption due to delayed block removal
> 
>
> Key: HDFS-6825
> URL: https://issues.apache.org/jira/browse/HDFS-6825
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: namenode
>Affects Versions: 2.5.0
>Reporter: Yongjun Zhang
>Assignee: Yongjun Zhang
> Attachments: HDFS-6825.001.patch, HDFS-6825.002.patch, 
> HDFS-6825.003.patch
>
>
> Observed the following stack:
> {code}
> 2014-08-04 23:49:44,133 INFO 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem: 
> commitBlockSynchronization(lastblock=BP-.., newgenerationstamp=..., 
> newlength=..., newtargets=..., closeFile=true, deleteBlock=false)
> 2014-08-04 23:49:44,133 WARN 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem: Unexpected exception 
> while updating disk space. 
> java.io.FileNotFoundException: Path not found: 
> /solr/hierarchy/core_node1/data/tlog/tlog.xyz
> at 
> org.apache.hadoop.hdfs.server.namenode.FSDirectory.updateSpaceConsumed(FSDirectory.java:1807)
> at 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.commitOrCompleteLastBlock(FSNamesystem.java:3975)
> at 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.closeFileCommitBlocks(FSNamesystem.java:4178)
> at 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.commitBlockSynchronization(FSNamesystem.java:4146)
> at 
> org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.commitBlockSynchronization(NameNodeRpcServer.java:662)
> at 
> org.apache.hadoop.hdfs.protocolPB.DatanodeProtocolServerSideTranslatorPB.commitBlockSynchronization(DatanodeProtocolServerSideTranslatorPB.java:270)
> at 
> org.apache.hadoop.hdfs.protocol.proto.DatanodeProtocolProtos$DatanodeProtocolService$2.callBlockingMethod(DatanodeProtocolProtos.java:28073)
> at 
> org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:585)
> at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:1026)
> at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1986)
> at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1982)
> at java.security.AccessController.doPrivileged(Native Method)
> at javax.security.auth.Subject.doAs(Subject.java:415)
> at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1554)
> at org.apache.hadoop.ipc.Server$Handler.run(Server.java:1980)
> {code}
> Here is what we found happened:
> - the client created file /solr/hierarchy/core_node1/data/tlog/tlog.xyz
> - the client tried to append to this file, but the lease had expired, so 
> lease recovery was started and the append failed
> - the file was deleted, but some pending blocks of this file had not been 
> removed yet
> - commitBlockSynchronization() was then called (see stack above), and an 
> INodeFile was created from the pending block, unaware that the file had 
> already been deleted
> - FileNotFoundException was thrown by FSDirectory.updateSpaceConsumed, but 
> swallowed by commitOrCompleteLastBlock
> - closeFileCommitBlocks went on to call finalizeINodeFileUnderConstruction 
> and wrote a CloseOp to the edit log



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (HDFS-6825) Edit log corruption due to delayed block removal

2014-08-11 Thread Yongjun Zhang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-6825?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yongjun Zhang updated HDFS-6825:


Attachment: HDFS-6825.004.patch

> Edit log corruption due to delayed block removal
> 
>
> Key: HDFS-6825
> URL: https://issues.apache.org/jira/browse/HDFS-6825
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: namenode
>Affects Versions: 2.5.0
>Reporter: Yongjun Zhang
>Assignee: Yongjun Zhang
> Attachments: HDFS-6825.001.patch, HDFS-6825.002.patch, 
> HDFS-6825.003.patch, HDFS-6825.004.patch
>
>
> Observed the following stack:
> {code}
> 2014-08-04 23:49:44,133 INFO 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem: 
> commitBlockSynchronization(lastblock=BP-.., newgenerationstamp=..., 
> newlength=..., newtargets=..., closeFile=true, deleteBlock=false)
> 2014-08-04 23:49:44,133 WARN 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem: Unexpected exception 
> while updating disk space. 
> java.io.FileNotFoundException: Path not found: 
> /solr/hierarchy/core_node1/data/tlog/tlog.xyz
> at 
> org.apache.hadoop.hdfs.server.namenode.FSDirectory.updateSpaceConsumed(FSDirectory.java:1807)
> at 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.commitOrCompleteLastBlock(FSNamesystem.java:3975)
> at 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.closeFileCommitBlocks(FSNamesystem.java:4178)
> at 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.commitBlockSynchronization(FSNamesystem.java:4146)
> at 
> org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.commitBlockSynchronization(NameNodeRpcServer.java:662)
> at 
> org.apache.hadoop.hdfs.protocolPB.DatanodeProtocolServerSideTranslatorPB.commitBlockSynchronization(DatanodeProtocolServerSideTranslatorPB.java:270)
> at 
> org.apache.hadoop.hdfs.protocol.proto.DatanodeProtocolProtos$DatanodeProtocolService$2.callBlockingMethod(DatanodeProtocolProtos.java:28073)
> at 
> org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:585)
> at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:1026)
> at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1986)
> at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1982)
> at java.security.AccessController.doPrivileged(Native Method)
> at javax.security.auth.Subject.doAs(Subject.java:415)
> at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1554)
> at org.apache.hadoop.ipc.Server$Handler.run(Server.java:1980)
> {code}
> Here is what we found happened:
> - the client created file /solr/hierarchy/core_node1/data/tlog/tlog.xyz
> - the client tried to append to this file, but the lease had expired, so 
> lease recovery was started and the append failed
> - the file was deleted, but some pending blocks of this file had not been 
> removed yet
> - commitBlockSynchronization() was then called (see stack above), and an 
> INodeFile was created from the pending block, unaware that the file had 
> already been deleted
> - FileNotFoundException was thrown by FSDirectory.updateSpaceConsumed, but 
> swallowed by commitOrCompleteLastBlock
> - closeFileCommitBlocks went on to call finalizeINodeFileUnderConstruction 
> and wrote a CloseOp to the edit log



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (HDFS-6833) DirectoryScanner should not register a deleting block with memory of DataNode

2014-08-11 Thread Shinichi Yamashita (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-6833?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shinichi Yamashita updated HDFS-6833:
-

Attachment: HDFS-6833.patch

> DirectoryScanner should not register a deleting block with memory of DataNode
> -
>
> Key: HDFS-6833
> URL: https://issues.apache.org/jira/browse/HDFS-6833
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: datanode
>Affects Versions: 3.0.0
>Reporter: Shinichi Yamashita
>Assignee: Shinichi Yamashita
> Attachments: HDFS-6833.patch, HDFS-6833.patch
>
>
> When a block is deleted in DataNode, the following messages are usually 
> output.
> {code}
> 2014-08-07 17:53:11,606 INFO 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetAsyncDiskService:
>  Scheduling blk_1073741825_1001 file 
> /hadoop/data1/dfs/data/current/BP-1887080305-172.28.0.101-1407398838872/current/finalized/subdir0/subdir0/blk_1073741825
>  for deletion
> 2014-08-07 17:53:11,617 INFO 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetAsyncDiskService:
>  Deleted BP-1887080305-172.28.0.101-1407398838872 blk_1073741825_1001 file 
> /hadoop/data1/dfs/data/current/BP-1887080305-172.28.0.101-1407398838872/current/finalized/subdir0/subdir0/blk_1073741825
> {code}
> However, in the current implementation DirectoryScanner may run while the 
> DataNode is deleting the block, and the following messages are output.
> {code}
> 2014-08-07 17:53:30,519 INFO 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetAsyncDiskService:
>  Scheduling blk_1073741825_1001 file 
> /hadoop/data1/dfs/data/current/BP-1887080305-172.28.0.101-1407398838872/current/finalized/subdir0/subdir0/blk_1073741825
>  for deletion
> 2014-08-07 17:53:31,426 INFO 
> org.apache.hadoop.hdfs.server.datanode.DirectoryScanner: BlockPool 
> BP-1887080305-172.28.0.101-1407398838872 Total blocks: 1, missing metadata 
> files:0, missing block files:0, missing blocks in memory:1, mismatched 
> blocks:0
> 2014-08-07 17:53:31,426 WARN 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl: Added 
> missing block to memory FinalizedReplica, blk_1073741825_1001, FINALIZED
>   getNumBytes() = 21230663
>   getBytesOnDisk()  = 21230663
>   getVisibleLength()= 21230663
>   getVolume()   = /hadoop/data1/dfs/data/current
>   getBlockFile()= 
> /hadoop/data1/dfs/data/current/BP-1887080305-172.28.0.101-1407398838872/current/finalized/subdir0/subdir0/blk_1073741825
>   unlinked  =false
> 2014-08-07 17:53:31,531 INFO 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetAsyncDiskService:
>  Deleted BP-1887080305-172.28.0.101-1407398838872 blk_1073741825_1001 file 
> /hadoop/data1/dfs/data/current/BP-1887080305-172.28.0.101-1407398838872/current/finalized/subdir0/subdir0/blk_1073741825
> {code}
> The block being deleted is thus re-registered in the DataNode's memory, and 
> when the DataNode sends a block report, the NameNode receives wrong block 
> information.
> For example, when we recommission a node or change the replication factor, 
> this problem may cause the NameNode to delete a valid block as an 
> "ExcessReplicate", and "Under-Replicated Blocks" and "Missing Blocks" occur.
> When the DataNode runs DirectoryScanner, it should not register a block that 
> is being deleted.
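
A sketch of the kind of guard this implies; isDeletingBlock() is a 
hypothetical check standing in for whatever bookkeeping the fix adds around 
FsDatasetAsyncDiskService:

{code}
// In DirectoryScanner reconciliation (sketch only, not the actual patch):
// a block file found on disk but missing from memory may simply be one that
// FsDatasetAsyncDiskService has already been asked to delete.
if (dataset.isDeletingBlock(bpid, diskBlockId)) {
  continue;  // mid-deletion: do not re-register it in the replica map
}
// otherwise, reconcile the on-disk replica with memory as before
{code}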



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HDFS-6134) Transparent data at rest encryption

2014-08-11 Thread Tsz Wo Nicholas Sze (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-6134?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14093777#comment-14093777
 ] 

Tsz Wo Nicholas Sze commented on HDFS-6134:
---

> ... The recommended access method is instead HttpFS, which runs as a 
> non-superuser. ...

Could you give more details?  Do you mean that each user has to run an HttpFS 
server for their application?

> Transparent data at rest encryption
> ---
>
> Key: HDFS-6134
> URL: https://issues.apache.org/jira/browse/HDFS-6134
> Project: Hadoop HDFS
>  Issue Type: New Feature
>  Components: security
>Affects Versions: 3.0.0, 2.3.0
>Reporter: Alejandro Abdelnur
>Assignee: Charles Lamb
> Attachments: HDFS-6134.001.patch, HDFS-6134.002.patch, 
> HDFS-6134_test_plan.pdf, HDFSDataatRestEncryption.pdf, 
> HDFSDataatRestEncryptionProposal_obsolete.pdf, 
> HDFSEncryptionConceptualDesignProposal-2014-06-20.pdf
>
>
> Because of privacy and security regulations, for many industries, sensitive 
> data at rest must be in encrypted form. For example: the healthcare industry 
> (HIPAA regulations), the card payment industry (PCI DSS regulations) or the 
> US government (FISMA regulations).
> This JIRA aims to provide a mechanism to encrypt HDFS data at rest that can 
> be used transparently by any application accessing HDFS via Hadoop Filesystem 
> Java API, Hadoop libhdfs C library, or WebHDFS REST API.
> The resulting implementation should be able to be used in compliance with 
> different regulation requirements.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


Re: Synchronization among Mappers in map-reduce task

2014-08-11 Thread saurabh jain
> Hi Folks,
>
> I have been writing a map-reduce application that takes an input file
> containing records, where every field in a record is separated by some
> delimiter.
>
> In addition, the user provides a list of columns to look up in a master
> properties file (stored in HDFS). If such a column (let's call it a key) is
> present in the master properties file, the code gets the corresponding
> value and updates the key with this value; if the key is not present in the
> master properties file, the code creates a new value for this key, writes
> it to the properties file, and also updates the record.
>
> I have written this application and tested it, and everything has worked
> fine so far.
>
> *e.g.:* *Input record:* This | is | the | test | record
>
> *Columns:* 2,4 (that means the code will look up only the fields *"is" and
> "test"* in the master properties file.)
>
> Here I have a question.
>
> *Q 1:* When my input file is huge and is split across multiple mappers, I
> was getting the exception below, where all the other mapper tasks were
> failing. *Also, when I started the job, the master properties file was
> initially empty.* My code checks whether this file (master properties)
> exists and creates a new empty file before submitting the job itself.
>
> e.g.: If I have 4 splits of data, then 3 map tasks fail. But after this,
> all the failed map tasks restart and the job finally succeeds.
>
> So, *here is the question: is it possible to make sure that when one of
> the mapper tasks is writing to a file, the others wait until the first one
> is finished?* I have read that mapper tasks don't interact with each
> other.
>
> Also, what happens when I start multiple parallel map-reduce jobs that all
> work on the same properties file? *Is there any way to synchronize two
> independent map-reduce jobs?*
>
> I have also read that ZooKeeper can be used in such scenarios; is that
> correct? (See the sketch after this message.)
>
>
> Error: 
> com.techidiocy.hadoop.filesystem.api.exceptions.HDFSFileSystemException: 
> IOException - failed while appending data to the file ->Failed to create file 
> [/user/cloudera/lob/master/bank.properties] for 
> [DFSClient_attempt_1407778869492_0032_m_02_0_1618418105_1] on client 
> [10.X.X.17], because this file is already being created by
> [DFSClient_attempt_1407778869492_0032_m_05_0_-949968337_1] on [10.X.X.17]
> at 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.recoverLeaseInternal(FSNamesystem.java:2548)
> at 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.appendFileInternal(FSNamesystem.java:2377)
> at 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.appendFileInt(FSNamesystem.java:2612)
> at 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.appendFile(FSNamesystem.java:2575)
> at 
> org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.append(NameNodeRpcServer.java:522)
> at 
> org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.append(ClientNamenodeProtocolServerSideTranslatorPB.java:373)
> at 
> org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)
> at 
> org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:585)
> at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:1026)
> at 
> org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1986)
> at 
> org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1982)
> at java.security.AccessController.doPrivileged(Native Method)
> at javax.security.auth.Subject.doAs(Subject.java:415)
> at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1554)
> at org.apache.hadoop.ipc.Server$Handler.run(Server.java:1980)
>
>
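
For the two synchronization questions, a distributed lock is the standard 
answer: mapper tasks within one job, and tasks across independent jobs, can 
serialize on the same lock before appending to the shared properties file. 
Below is a minimal sketch using Apache Curator's InterProcessMutex recipe for 
ZooKeeper; the connect string, lock path, and the guarded append are 
illustrative assumptions:

{code}
import org.apache.curator.framework.CuratorFramework;
import org.apache.curator.framework.CuratorFrameworkFactory;
import org.apache.curator.framework.recipes.locks.InterProcessMutex;
import org.apache.curator.retry.ExponentialBackoffRetry;

public class PropertiesFileLock {
  public static void main(String[] args) throws Exception {
    // Connect to the ZooKeeper ensemble (address is illustrative).
    CuratorFramework client = CuratorFrameworkFactory.newClient(
        "zkhost:2181", new ExponentialBackoffRetry(1000, 3));
    client.start();
    // Every writer -- in this job or any other -- uses the same lock path,
    // so only one task appends to the shared properties file at a time.
    InterProcessMutex lock =
        new InterProcessMutex(client, "/locks/bank-properties");
    lock.acquire();
    try {
      // append to /user/cloudera/lob/master/bank.properties here
    } finally {
      lock.release();
      client.close();
    }
  }
}
{code}

Note that the error quoted above is HDFS's single-writer lease enforcement: 
two attempts tried to append to the same file concurrently. A lock like this 
prevents the concurrent appends rather than working around the lease.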


[jira] [Commented] (HDFS-6774) Make FsDataset and DataStore support removing volumes.

2014-08-11 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-6774?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14093694#comment-14093694
 ] 

Hadoop QA commented on HDFS-6774:
-

{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12661102/HDFS-6774.001.patch
  against trunk revision .

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 2 new 
or modified test files.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  There were no new javadoc warning messages.

{color:green}+1 eclipse:eclipse{color}.  The patch built with 
eclipse:eclipse.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 2.0.3) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:red}-1 core tests{color}.  The patch failed these unit tests in 
hadoop-hdfs-project/hadoop-hdfs:

  
org.apache.hadoop.hdfs.server.balancer.TestBalancerWithEncryptedTransfer

{color:green}+1 contrib tests{color}.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-HDFS-Build/7612//testReport/
Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/7612//console

This message is automatically generated.

> Make FsDataset and DataStore support removing volumes.
> --
>
> Key: HDFS-6774
> URL: https://issues.apache.org/jira/browse/HDFS-6774
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: datanode
>Affects Versions: 2.4.1
>Reporter: Lei (Eddy) Xu
>Assignee: Lei (Eddy) Xu
> Attachments: HDFS-6774.000.patch, HDFS-6774.001.patch
>
>
> Managing volumes on DataNode includes decommissioning an active volume 
> without restarting DataNode. 
> This task adds support to remove volumes from {{DataStorage}} and 
> {{BlockPoolSliceStorage}} dynamically.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (HDFS-6839) Fix TestCLI to expect new output

2014-08-11 Thread Charles Lamb (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-6839?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Charles Lamb updated HDFS-6839:
---

Status: In Progress  (was: Patch Available)

> Fix TestCLI to expect new output
> 
>
> Key: HDFS-6839
> URL: https://issues.apache.org/jira/browse/HDFS-6839
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: security
>Affects Versions: fs-encryption (HADOOP-10150 and HDFS-6134)
>Reporter: Charles Lamb
>Assignee: Charles Lamb
> Attachments: HDFS-6839.001.patch
>
>
> TestCLI is failing because HADOOP-10919 changed the output of the cp usage 
> command.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (HDFS-6839) Fix TestCLI to expect new output

2014-08-11 Thread Charles Lamb (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-6839?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Charles Lamb updated HDFS-6839:
---

Status: Patch Available  (was: In Progress)

HADOOP-10919 changed the output of cp's usage. Submitting patch to get a 
jenkins run.


> Fix TestCLI to expect new output
> 
>
> Key: HDFS-6839
> URL: https://issues.apache.org/jira/browse/HDFS-6839
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: security
>Affects Versions: fs-encryption (HADOOP-10150 and HDFS-6134)
>Reporter: Charles Lamb
>Assignee: Charles Lamb
> Attachments: HDFS-6839.001.patch
>
>
> TestCLI is failing because HADOOP-10919 changed the output of the cp usage 
> command.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HDFS-6840) Clients are always sent to the same datanode when read is off rack

2014-08-11 Thread Jason Lowe (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-6840?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14093613#comment-14093613
 ] 

Jason Lowe commented on HDFS-6840:
--

I think the previous behavior was not deterministic because of this code, 
which the HDFS-6268 patch removed:

{code}
// put a random node at position 0 if it is not a local/local-rack node
if(tempIndex == 0 && localRackNode == -1 && nodes.length != 0) {
  swap(nodes, 0, r.nextInt(nodes.length));
}
{code}

The list used to be mostly deterministic, but the first node in the list (i.e. 
the only one most clients are likely to use) was random.

I have not done the bisect to prove beyond doubt that it was HDFS-6268, but 
we've run builds based on 2.4.1+ and on 2.5, and this behavior is brand-new 
with 2.5.  There weren't a lot of changes in the topology-sorting arena 
between 2.4.1 and 2.5.0 besides this one, and the code and JIRA for HDFS-6268 
state that it intentionally does not randomize the datanode list between 
clients.  Besides the bisect approach, I can probably try replacing the 
network topology class with the one from before HDFS-6268 and see if the 
behavior reverts to what it used to be.

> Clients are always sent to the same datanode when read is off rack
> --
>
> Key: HDFS-6840
> URL: https://issues.apache.org/jira/browse/HDFS-6840
> Project: Hadoop HDFS
>  Issue Type: Bug
>Affects Versions: 2.5.0
>Reporter: Jason Lowe
>Priority: Critical
>
> After HDFS-6268 the sorting order of block locations is deterministic for a 
> given block and locality level (e.g. local, rack, off-rack), so off-rack 
> clients all see the same datanode for the same block.  This leads to very 
> poor behavior in distributed cache localization and other scenarios where 
> many clients all want the same block data at approximately the same time.  
> The one datanode is crushed by the load while the other replicas only handle 
> local and rack-local requests.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HDFS-6134) Transparent data at rest encryption

2014-08-11 Thread Andrew Wang (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-6134?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14093603#comment-14093603
 ] 

Andrew Wang commented on HDFS-6134:
---

Hey Sanjay, thanks for reviewing things,

Regarding HAR, could you lay out the use case you have in mind? When the user 
makes the HAR, they'll need access to all the input files (encrypted or 
unencrypted); then if they write it within an EZ it'll be encrypted, and 
otherwise unencrypted. This behavior seems reasonable to me.

Regarding webhdfs, it's not a recommended deployment. I'm going to doc this 
additionally in HDFS-6824. It requires giving the DNs (thus the HDFS superuser) 
access to EZ keys, which is not particularly secure. There is HTTPS transport 
via swebhdfs, but that doesn't fix the key access issue. The recommended access 
method is instead HttpFS, which runs as a non-superuser. So, yes distcp will 
work too. This will definitely be covered during our testing.

Regarding scalability, you can put the KMS behind a load balancer, which should 
make scalability a non-issue. Tucu can comment better on this than me since 
he's done some KMS benchmarking, but I think a single instance should be able 
to handle O(1000s) of req/s.

> Transparent data at rest encryption
> ---
>
> Key: HDFS-6134
> URL: https://issues.apache.org/jira/browse/HDFS-6134
> Project: Hadoop HDFS
>  Issue Type: New Feature
>  Components: security
>Affects Versions: 3.0.0, 2.3.0
>Reporter: Alejandro Abdelnur
>Assignee: Charles Lamb
> Attachments: HDFS-6134.001.patch, HDFS-6134.002.patch, 
> HDFS-6134_test_plan.pdf, HDFSDataatRestEncryption.pdf, 
> HDFSDataatRestEncryptionProposal_obsolete.pdf, 
> HDFSEncryptionConceptualDesignProposal-2014-06-20.pdf
>
>
> Because of privacy and security regulations, for many industries, sensitive 
> data at rest must be in encrypted form. For example: the healthcare industry 
> (HIPAA regulations), the card payment industry (PCI DSS regulations) or the 
> US government (FISMA regulations).
> This JIRA aims to provide a mechanism to encrypt HDFS data at rest that can 
> be used transparently by any application accessing HDFS via Hadoop Filesystem 
> Java API, Hadoop libhdfs C library, or WebHDFS REST API.
> The resulting implementation should be able to be used in compliance with 
> different regulation requirements.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (HDFS-6268) Better sorting in NetworkTopology#pseudoSortByDistance when no local node is found

2014-08-11 Thread Jason Lowe (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-6268?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jason Lowe updated HDFS-6268:
-

Fix Version/s: 2.5.0

We also ran into the massive skew issue during localization that 
[~ashwinshankar77] encountered.  Previously the result was not deterministic, 
since a random node was swapped into the first position if it wasn't local or 
rack-local, but now all off-rack requests are always sent to the same node.

Localization is a pretty common process, so I'm not sure sending most of the 
cluster to a single datanode is a good default.  Filed HDFS-6840 to discuss 
this further.

> Better sorting in NetworkTopology#pseudoSortByDistance when no local node is 
> found
> --
>
> Key: HDFS-6268
> URL: https://issues.apache.org/jira/browse/HDFS-6268
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Affects Versions: 2.4.0
>Reporter: Andrew Wang
>Assignee: Andrew Wang
>Priority: Minor
> Fix For: 3.0.0, 2.5.0
>
> Attachments: hdfs-6268-1.patch, hdfs-6268-2.patch, hdfs-6268-3.patch, 
> hdfs-6268-4.patch, hdfs-6268-5.patch, hdfs-6268-branch-2.001.patch
>
>
> In NetworkTopology#pseudoSortByDistance, if no local node is found, it will 
> always place the first rack-local node in the list in front.
> This became an issue when a dataset was loaded from a single datanode: that 
> datanode ended up being the first replica for all the blocks in the dataset. 
> When running an Impala query, the non-local reads that occur when reading 
> past a block boundary were all hitting this node, causing massive load skew.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HDFS-6840) Clients are always sent to the same datanode when read is off rack

2014-08-11 Thread Andrew Wang (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-6840?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14093595#comment-14093595
 ] 

Andrew Wang commented on HDFS-6840:
---

Hey Jason, did you bisect this behavior to HDFS-6268? My impression of the old 
pseudo-sort was that it was deterministic. AFAIK there wasn't a Random doing a 
shuffle. The idea of not-shuffling by default was to preserve the old behavior, 
and in case there were any nice page cache effects from directing reads to the 
same replica.

> Clients are always sent to the same datanode when read is off rack
> --
>
> Key: HDFS-6840
> URL: https://issues.apache.org/jira/browse/HDFS-6840
> Project: Hadoop HDFS
>  Issue Type: Bug
>Affects Versions: 2.5.0
>Reporter: Jason Lowe
>Priority: Critical
>
> After HDFS-6268 the sorting order of block locations is deterministic for a 
> given block and locality level (e.g. local, rack, off-rack), so off-rack 
> clients all see the same datanode for the same block.  This leads to very 
> poor behavior in distributed cache localization and other scenarios where 
> many clients all want the same block data at approximately the same time.  
> The one datanode is crushed by the load while the other replicas only handle 
> local and rack-local requests.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HDFS-6134) Transparent data at rest encryption

2014-08-11 Thread Sanjay Radia (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-6134?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14093583#comment-14093583
 ] 

Sanjay Radia commented on HDFS-6134:


One of the items raised at the meeting, and summarized by Owen in his 
meeting-minutes comment (June 26), is the scalability concern. How is that 
being addressed? Can a job client get the keys prior to job submission?

> Transparent data at rest encryption
> ---
>
> Key: HDFS-6134
> URL: https://issues.apache.org/jira/browse/HDFS-6134
> Project: Hadoop HDFS
>  Issue Type: New Feature
>  Components: security
>Affects Versions: 3.0.0, 2.3.0
>Reporter: Alejandro Abdelnur
>Assignee: Charles Lamb
> Attachments: HDFS-6134.001.patch, HDFS-6134.002.patch, 
> HDFS-6134_test_plan.pdf, HDFSDataatRestEncryption.pdf, 
> HDFSDataatRestEncryptionProposal_obsolete.pdf, 
> HDFSEncryptionConceptualDesignProposal-2014-06-20.pdf
>
>
> Because of privacy and security regulations, for many industries, sensitive 
> data at rest must be in encrypted form. For example: the healthcare industry 
> (HIPAA regulations), the card payment industry (PCI DSS regulations) or the 
> US government (FISMA regulations).
> This JIRA aims to provide a mechanism to encrypt HDFS data at rest that can 
> be used transparently by any application accessing HDFS via Hadoop Filesystem 
> Java API, Hadoop libhdfs C library, or WebHDFS REST API.
> The resulting implementation should be able to be used in compliance with 
> different regulation requirements.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HDFS-6840) Clients are always sent to the same datanode when read is off rack

2014-08-11 Thread Jason Lowe (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-6840?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14093584#comment-14093584
 ] 

Jason Lowe commented on HDFS-6840:
--

HDFS-6701 gives the option to randomize the returned datanodes, but the 
default is off.  I'm not sure defaulting to off is a good thing, given the 
significantly different load behavior and the heavy skew toward the one 
datanode.  If that skew is desired then I think it should be opt-in, rather 
than having to opt out to avoid the skew.

> Clients are always sent to the same datanode when read is off rack
> --
>
> Key: HDFS-6840
> URL: https://issues.apache.org/jira/browse/HDFS-6840
> Project: Hadoop HDFS
>  Issue Type: Bug
>Affects Versions: 2.5.0
>Reporter: Jason Lowe
>Priority: Critical
>
> After HDFS-6268 the sorting order of block locations is deterministic for a 
> given block and locality level (e.g. local, rack, off-rack), so off-rack 
> clients all see the same datanode for the same block.  This leads to very 
> poor behavior in distributed cache localization and other scenarios where 
> many clients all want the same block data at approximately the same time.  
> The one datanode is crushed by the load while the other replicas only handle 
> local and rack-local requests.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (HDFS-6774) Make FsDataset and DataStore support removing volumes.

2014-08-11 Thread Lei (Eddy) Xu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-6774?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lei (Eddy) Xu updated HDFS-6774:


Attachment: HDFS-6774.001.patch

Refactor the patch and fix inconsistent comments. 

> Make FsDataset and DataStore support removing volumes.
> --
>
> Key: HDFS-6774
> URL: https://issues.apache.org/jira/browse/HDFS-6774
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: datanode
>Affects Versions: 2.4.1
>Reporter: Lei (Eddy) Xu
>Assignee: Lei (Eddy) Xu
> Attachments: HDFS-6774.000.patch, HDFS-6774.001.patch
>
>
> Managing volumes on DataNode includes decommissioning an active volume 
> without restarting DataNode. 
> This task adds support to remove volumes from {{DataStorage}} and 
> {{BlockPoolSliceStorage}} dynamically.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HDFS-6134) Transparent data at rest encryption

2014-08-11 Thread Sanjay Radia (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-6134?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14093579#comment-14093579
 ] 

Sanjay Radia commented on HDFS-6134:


With regard to webhdfs, the document says that the decryption/encryption will 
happen in the Datanode. 
*  Will the DN be able to access the key necessary to do this?
* The data will be transmitted in the clear - is that what we want? For the 
normal HDFS API the decryption/encryption happens at the client side.
* There are two aspects to Webhdfs: the rest client and the webhdfs Filesystem. 
Have you considered both use cases?
* Will distcp work via webhdfs? Customers often use  webhdfs instead of hdfs 
for cross-cluster copies.

> Transparent data at rest encryption
> ---
>
> Key: HDFS-6134
> URL: https://issues.apache.org/jira/browse/HDFS-6134
> Project: Hadoop HDFS
>  Issue Type: New Feature
>  Components: security
>Affects Versions: 3.0.0, 2.3.0
>Reporter: Alejandro Abdelnur
>Assignee: Charles Lamb
> Attachments: HDFS-6134.001.patch, HDFS-6134.002.patch, 
> HDFS-6134_test_plan.pdf, HDFSDataatRestEncryption.pdf, 
> HDFSDataatRestEncryptionProposal_obsolete.pdf, 
> HDFSEncryptionConceptualDesignProposal-2014-06-20.pdf
>
>
> Because of privacy and security regulations, for many industries, sensitive 
> data at rest must be in encrypted form. For example: the healthcare industry 
> (HIPAA regulations), the card payment industry (PCI DSS regulations) or the 
> US government (FISMA regulations).
> This JIRA aims to provide a mechanism to encrypt HDFS data at rest that can 
> be used transparently by any application accessing HDFS via Hadoop Filesystem 
> Java API, Hadoop libhdfs C library, or WebHDFS REST API.
> The resulting implementation should be able to be used in compliance with 
> different regulation requirements.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Created] (HDFS-6840) Clients are always sent to the same datanode when read is off rack

2014-08-11 Thread Jason Lowe (JIRA)
Jason Lowe created HDFS-6840:


 Summary: Clients are always sent to the same datanode when read is 
off rack
 Key: HDFS-6840
 URL: https://issues.apache.org/jira/browse/HDFS-6840
 Project: Hadoop HDFS
  Issue Type: Bug
Affects Versions: 2.5.0
Reporter: Jason Lowe
Priority: Critical


After HDFS-6268 the sorting order of block locations is deterministic for a 
given block and locality level (e.g. local, rack, off-rack), so off-rack 
clients all see the same datanode for the same block.  This leads to very poor 
behavior in distributed cache localization and other scenarios where many 
clients all want the same block data at approximately the same time.  The one 
datanode is crushed by the load while the other replicas only handle local and 
rack-local requests.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HDFS-6425) Large postponedMisreplicatedBlocks has impact on blockReport latency

2014-08-11 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-6425?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14093558#comment-14093558
 ] 

Hadoop QA commented on HDFS-6425:
-

{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12661072/HDFS-6425-2.patch
  against trunk revision .

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 3 new 
or modified test files.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  There were no new javadoc warning messages.

{color:green}+1 eclipse:eclipse{color}.  The patch built with 
eclipse:eclipse.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 2.0.3) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:red}-1 core tests{color}.  The patch failed these unit tests in 
hadoop-hdfs-project/hadoop-hdfs:

  
org.apache.hadoop.hdfs.server.namenode.ha.TestPipelinesFailover

{color:green}+1 contrib tests{color}.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-HDFS-Build/7611//testReport/
Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/7611//console

This message is automatically generated.

> Large postponedMisreplicatedBlocks has impact on blockReport latency
> 
>
> Key: HDFS-6425
> URL: https://issues.apache.org/jira/browse/HDFS-6425
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: Ming Ma
>Assignee: Ming Ma
> Attachments: HDFS-6425-2.patch, HDFS-6425-Test-Case.pdf, 
> HDFS-6425.patch
>
>
> Sometimes we have a large number of over-replicated blocks when the NN fails 
> over. When the new active NN takes over, over-replicated blocks are put into 
> postponedMisreplicatedBlocks until all DNs for that block are no longer 
> stale.
> We have a case where the NNs flip-flop: the NN failed over again and again 
> before postponedMisreplicatedBlocks could become empty, so 
> postponedMisreplicatedBlocks just kept growing until the cluster stabilized.
> In addition, a large postponedMisreplicatedBlocks set can make 
> rescanPostponedMisreplicatedBlocks slow. rescanPostponedMisreplicatedBlocks 
> takes the write lock, so it can slow down block report processing.
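
One general way to bound the lock hold time (a sketch of the idea, not 
necessarily this patch) is to rescan the postponed set in fixed-size batches, 
releasing the namesystem write lock between batches so block reports can 
interleave:

{code}
// Sketch only; the batch size and iteration details are assumptions.
private static final int RESCAN_BATCH = 10000;  // hypothetical cap

void rescanPostponedMisreplicatedBlocks() {
  Iterator<Block> it = postponedMisreplicatedBlocks.iterator();
  while (it.hasNext()) {
    namesystem.writeLock();
    try {
      for (int i = 0; i < RESCAN_BATCH && it.hasNext(); i++) {
        BlockInfo bi = blocksMap.getStoredBlock(it.next());
        if (bi == null
            || processMisReplicatedBlock(bi) != MisReplicationResult.POSTPONE) {
          it.remove();  // block no longer needs to be postponed
        }
      }
    } finally {
      namesystem.writeUnlock();  // block reports may be processed here
    }
  }
}
{code}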



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HDFS-6134) Transparent data at rest encryption

2014-08-11 Thread Sanjay Radia (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-6134?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14093538#comment-14093538
 ] 

Sanjay Radia commented on HDFS-6134:


.bq. Charles posted a design doc for how distcp will work with encryption at 
HDFS-6509.
 I did a quick glance over it. We also need to do the same for har. I think the 
same .raw should work ...

> Transparent data at rest encryption
> ---
>
> Key: HDFS-6134
> URL: https://issues.apache.org/jira/browse/HDFS-6134
> Project: Hadoop HDFS
>  Issue Type: New Feature
>  Components: security
>Affects Versions: 3.0.0, 2.3.0
>Reporter: Alejandro Abdelnur
>Assignee: Charles Lamb
> Attachments: HDFS-6134.001.patch, HDFS-6134.002.patch, 
> HDFS-6134_test_plan.pdf, HDFSDataatRestEncryption.pdf, 
> HDFSDataatRestEncryptionProposal_obsolete.pdf, 
> HDFSEncryptionConceptualDesignProposal-2014-06-20.pdf
>
>
> Because of privacy and security regulations, for many industries, sensitive 
> data at rest must be in encrypted form. For example: the healthcare industry 
> (HIPAA regulations), the card payment industry (PCI DSS regulations) or the 
> US government (FISMA regulations).
> This JIRA aims to provide a mechanism to encrypt HDFS data at rest that can 
> be used transparently by any application accessing HDFS via Hadoop Filesystem 
> Java API, Hadoop libhdfs C library, or WebHDFS REST API.
> The resulting implementation should be able to be used in compliance with 
> different regulation requirements.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Comment Edited] (HDFS-6134) Transparent data at rest encryption

2014-08-11 Thread Sanjay Radia (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-6134?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14093538#comment-14093538
 ] 

Sanjay Radia edited comment on HDFS-6134 at 8/12/14 12:19 AM:
--

bq. Charles posted a design doc for how distcp will work with encryption at 
HDFS-6509.
 I did a quick glance over it. We also need to do the same for har. I think the 
same .raw should work ...


was (Author: sanjay.radia):
.bq. Charles posted a design doc for how distcp will work with encryption at 
HDFS-6509.
 I did a quick glance over it. We also need to do the same for har. I think the 
same .raw should work ...

> Transparent data at rest encryption
> ---
>
> Key: HDFS-6134
> URL: https://issues.apache.org/jira/browse/HDFS-6134
> Project: Hadoop HDFS
>  Issue Type: New Feature
>  Components: security
>Affects Versions: 3.0.0, 2.3.0
>Reporter: Alejandro Abdelnur
>Assignee: Charles Lamb
> Attachments: HDFS-6134.001.patch, HDFS-6134.002.patch, 
> HDFS-6134_test_plan.pdf, HDFSDataatRestEncryption.pdf, 
> HDFSDataatRestEncryptionProposal_obsolete.pdf, 
> HDFSEncryptionConceptualDesignProposal-2014-06-20.pdf
>
>
> Because of privacy and security regulations, for many industries, sensitive 
> data at rest must be in encrypted form. For example: the healthcare industry 
> (HIPAA regulations), the card payment industry (PCI DSS regulations) or the 
> US government (FISMA regulations).
> This JIRA aims to provide a mechanism to encrypt HDFS data at rest that can 
> be used transparently by any application accessing HDFS via Hadoop Filesystem 
> Java API, Hadoop libhdfs C library, or WebHDFS REST API.
> The resulting implementation should be able to be used in compliance with 
> different regulation requirements.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (HDFS-6835) Archival Storage: Add a new API to set storage policy

2014-08-11 Thread Jing Zhao (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-6835?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jing Zhao updated HDFS-6835:


Attachment: HDFS-6835.000.patch

Initial patch. Add a new API setStoragePolicy(String src, String policyName).
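
For context, a sketch of how the new call might be used from a client once it 
is plumbed through; the DistributedFileSystem overload and the "COLD" policy 
name are assumptions, not the committed patch:

{code}
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.hdfs.DistributedFileSystem;

public class SetPolicyExample {
  public static void main(String[] args) throws Exception {
    // Sketch: mark a directory with a storage policy by name, so the
    // HDFS-6801 migration tool can later move its blocks onto matching
    // storage types. Assumes the default filesystem is HDFS.
    Configuration conf = new Configuration();
    DistributedFileSystem dfs = (DistributedFileSystem) FileSystem.get(conf);
    dfs.setStoragePolicy(new Path("/archive/logs"), "COLD");
    dfs.close();
  }
}
{code}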

> Archival Storage: Add a new API to set storage policy
> -
>
> Key: HDFS-6835
> URL: https://issues.apache.org/jira/browse/HDFS-6835
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: hdfs-client, namenode
>Reporter: Tsz Wo Nicholas Sze
>Assignee: Jing Zhao
> Attachments: HDFS-6835.000.patch
>
>
> The new data migration tool proposed in HDFS-6801 will determine if the 
> storage policy of files needs to be updated.  The tool needs a new API to 
> set storage policy.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HDFS-6820) Namenode fails to boot if the file system reorders rename operations

2014-08-11 Thread Colin Patrick McCabe (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-6820?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14093483#comment-14093483
 ] 

Colin Patrick McCabe commented on HDFS-6820:


Hmm, good find.  You can fix this problem by calling fsync on the directory 
that contains both files after the first rename (but before the second).  
Unfortunately, this requires JNI.
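
For reference, on POSIX systems a directory can also be fsynced from pure 
Java via NIO, which may avoid the JNI dependency; a sketch (behavior is 
platform-dependent, and opening a directory for read is not supported 
everywhere):

{code}
import java.io.IOException;
import java.nio.channels.FileChannel;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

// Force the directory's entry updates (the renames) to disk. On Linux,
// opening the directory read-only and calling force() maps to fsync(2).
static void fsyncDirectory(Path dir) throws IOException {
  try (FileChannel ch = FileChannel.open(dir, StandardOpenOption.READ)) {
    ch.force(true);
  }
}
{code}

Called on the storage directory between the two renames, this pins their 
on-disk order without native code.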

> Namenode fails to boot if the file system reorders rename operations
> 
>
> Key: HDFS-6820
> URL: https://issues.apache.org/jira/browse/HDFS-6820
> Project: Hadoop HDFS
>  Issue Type: Bug
>Affects Versions: 2.2.0
>Reporter: Samer Al-Kiswany
>Priority: Minor
>
> After studying the steps the HDFS NameNode takes to update its logs, we 
> found the following bug. The bug may not manifest in all current file system 
> implementations, but it is possible in file systems that reorder metadata 
> operations, e.g. btrfs.
> Looking at the strace of the HDFS NameNode, we see the following when 
> updating the image:
> create(fsimage.chk)
> append(fsimage.chk)
> fsync(fsimage.chk)
> create(fsimage.md5.tmp)
> append(fsimage.md5.tmp)
> fsync(fsimage.md5.tmp)
> rename(fsimage.md5.tmp, fsimage.md5)
> rename(fsimage.chk, fsimage)
> If the file system reorders the last two rename operations and the system 
> crashes before the second rename is persisted on disk, the system may end up 
> with an fsimage that does not have a corresponding md5 file. In this case 
> the HDFS NameNode does not boot.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HDFS-6634) inotify in HDFS

2014-08-11 Thread Andrew Wang (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-6634?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14093481#comment-14093481
 ] 

Andrew Wang commented on HDFS-6634:
---

Thanks for revving James, few more comments. I haven't dug into the QJM stuff 
in detail yet.

Misc/meta:
* We have some TODOs sprinkled around. I like to turn these into JIRAs, and 
then reference the JIRA in the comment. JIRA is a much better tracking tool 
than TODOs :)
* Comment in HdfsAdmin has "TODO complete this"?
* hdfs-default.xml, would be good if you could provide guidance about how to 
size this appropriately. Basically, why and how was 1000 chosen?
* Maybe we leave the InterfaceStability at Unstable for now just in case, we 
might want to change things as future work flows in.
* CNPServerSideTranslatorPB, unrelated whitespace changes
* EditsDoubleBuffer, does that visibility need to be changed to public?
* NNRpcServer, unnecessary RejectedExecutionException import?
* The PB definition for getCurrentTxid is a uint64, but I see that we return a 
-1 if the edit log is not open. Seems like it should be an int64. Same issue 
for lastWriterEpoch in QJournalProtocol.
* QuorumJournalManager, unused import to RemoteEditLogManifest

DFSInotifyEventInputStream
* You could add a "_MS" suffix to the two constants, which self-documents that 
they're in milliseconds. I think one of these also isn't being used.
* Some javadoc lines are longer than 80 chars
* Can we add a {{@link}} to the best-documented version of {{next}} from the 
other versions of {{next}}?
* Using LinkedBlockingQueue as a reference, it'd be nice to provide blocking 
calls as well for convenience. You could use the same terminology of poll and 
take from LinkedBlockingQueue if you like (see the sketch after this list).
* The {{namenode}} RPC proxy has its own timeout and retry policy built in, so 
the user could very well end up waiting for minutes regardless of what they 
pass in. I think if we used Futures and cancellation (or some other method of 
interrupting), it would be more timely. You could also create a new RPC proxy 
with a more fitting timeout/retry policy.
* Also I don't think doing additional backoff is necessary, since the RPC proxy 
already has timeout/retry policy. I think since our timeouts are quite 
conservative (60s?), backoff isn't as important.
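
A sketch of the poll/take surface that suggestion points at, by analogy with 
java.util.concurrent.BlockingQueue; the names and signatures are assumptions, 
not the committed API:

{code}
// Hypothetical additions to DFSInotifyEventInputStream (sketch only):
public Event poll() throws IOException;    // non-blocking; null if no event
public Event poll(long time, TimeUnit tu)  // wait up to the given bound
    throws IOException, InterruptedException;
public Event take()                        // block until an event arrives
    throws IOException, InterruptedException;
{code}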

Tests:
* some lines longer than 80 chars
* Nit: "Uncomitted" should have two m's
* Wish I had told you about this before, but there's DFSTestUtil#runOperations 
which runs through a bunch of different operations. Is this reusable in 
testBasic? It'd be nice, since TestOfflineEditsViewer checks that all opcodes 
are generated by runOperations, so we'll only need to update this function when 
we add more opcodes in the future. Or we could do the same assertion about all 
the opcodes being present and tested.
* Can we add conservative timeouts on the new tests, e.g. 12 ?

> inotify in HDFS
> ---
>
> Key: HDFS-6634
> URL: https://issues.apache.org/jira/browse/HDFS-6634
> Project: Hadoop HDFS
>  Issue Type: New Feature
>  Components: hdfs-client, namenode, qjm
>Reporter: James Thomas
>Assignee: James Thomas
> Attachments: HDFS-6634.2.patch, HDFS-6634.3.patch, HDFS-6634.patch, 
> inotify-design.2.pdf, inotify-design.pdf, inotify-intro.2.pdf, 
> inotify-intro.pdf
>
>
> Design a mechanism for applications like search engines to access the HDFS 
> edit stream.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HDFS-6819) make HDFS fault injection framework working with maven

2014-08-11 Thread Colin Patrick McCabe (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-6819?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14093478#comment-14093478
 ] 

Colin Patrick McCabe commented on HDFS-6819:


I don't see a lot of value in this, given that we already have stuff like 
{{MiniDFSCluster}} and {{DFSClientFaultInjector}}.

> make HDFS fault injection framework working with maven
> --
>
> Key: HDFS-6819
> URL: https://issues.apache.org/jira/browse/HDFS-6819
> Project: Hadoop HDFS
>  Issue Type: Task
>Reporter: George Wong
>Assignee: George Wong
>
> In the current trunk code repo, the FI framework does not work, because the 
> maven build process does not execute the AspectJ injection.
> Since FI is very useful for testing and for reproducing bugs, it would be 
> better to make the FI framework work on trunk.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (HDFS-6818) FSShell's get operation should have the ability to take a "length" argument

2014-08-11 Thread Colin Patrick McCabe (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-6818?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Colin Patrick McCabe updated HDFS-6818:
---

Summary: FSShell's get operation should have the ability to take a "length" 
argument  (was: Enhance hdfs get to copy file with length)

> FSShell's get operation should have the ability to take a "length" argument
> ---
>
> Key: HDFS-6818
> URL: https://issues.apache.org/jira/browse/HDFS-6818
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Reporter: Guo Ruijing
>
> Use Case: if an HDFS file is corrupted, a tool can be used to copy out the 
> good part of the corrupted file. We may enhance "hdfs -get" to copy out the 
> good part.
> Existing:
> hadoop fs [generic options] -get [-p] [-ignoreCrc] [-crc] <src> ... <localdst>
> Proposal:
> hadoop fs [generic options] -get [-p] [-ignoreCrc] [-crc] [-length <length>] <src> ... <localdst>
> 



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (HDFS-6425) Large postponedMisreplicatedBlocks has impact on blockReport latency

2014-08-11 Thread Ming Ma (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-6425?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ming Ma updated HDFS-6425:
--

Attachment: HDFS-6425-2.patch

Updated patch for latest trunk.

> Large postponedMisreplicatedBlocks has impact on blockReport latency
> 
>
> Key: HDFS-6425
> URL: https://issues.apache.org/jira/browse/HDFS-6425
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: Ming Ma
>Assignee: Ming Ma
> Attachments: HDFS-6425-2.patch, HDFS-6425-Test-Case.pdf, 
> HDFS-6425.patch
>
>
> Sometimes we have a large number of over-replicated blocks when the NN fails 
> over. When the new active NN takes over, over-replicated blocks are put into 
> postponedMisreplicatedBlocks until all DNs for that block are no longer 
> stale.
> We have a case where the NNs flip-flop. Before postponedMisreplicatedBlocks 
> became empty, the NN failed over again and again, so 
> postponedMisreplicatedBlocks just kept increasing until the cluster 
> stabilized.
> In addition, a large postponedMisreplicatedBlocks could make 
> rescanPostponedMisreplicatedBlocks slow. rescanPostponedMisreplicatedBlocks 
> takes the write lock, so it can slow down block report processing.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HDFS-6838) Code cleanup for unnecessary INode replacement

2014-08-11 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-6838?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14093353#comment-14093353
 ] 

Hudson commented on HDFS-6838:
--

FAILURE: Integrated in Hadoop-trunk-Commit #6048 (See 
[https://builds.apache.org/job/Hadoop-trunk-Commit/6048/])
HDFS-6838. Code cleanup for unnecessary INode replacement. Contributed by Jing 
Zhao. (wheat9: 
http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1617361)
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FSDirectory.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FSNamesystem.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/INode.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/INodeDirectory.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/INodeFile.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/INodeMap.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/INodeReference.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/INodeSymlink.java


> Code cleanup for unnecessary INode replacement
> --
>
> Key: HDFS-6838
> URL: https://issues.apache.org/jira/browse/HDFS-6838
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: namenode
>Reporter: Jing Zhao
>Assignee: Jing Zhao
>Priority: Minor
> Fix For: 2.6.0
>
> Attachments: HDFS-6838.000.patch
>
>
> With INode features we now no longer have INode replacement when converting a 
> file to an under-construction/with-snapshot file or converting a directory to 
> a snapshottable/with-snapshot directory. This jira plans to remove some 
> unnecessary code that is only useful for INode replacement.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HDFS-6837) Code cleanup for Balancer and Dispatcher

2014-08-11 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-6837?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14093352#comment-14093352
 ] 

Hudson commented on HDFS-6837:
--

FAILURE: Integrated in Hadoop-trunk-Commit #6048 (See 
[https://builds.apache.org/job/Hadoop-trunk-Commit/6048/])
HDFS-6837. Code cleanup for Balancer and Dispatcher. Contributed by Tsz Wo 
Nicholas Sze. (jing9: 
http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1617337)
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/balancer/Balancer.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/balancer/Dispatcher.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/balancer/ExitStatus.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/balancer/TestBalancer.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/balancer/TestBalancerWithHANameNodes.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/balancer/TestBalancerWithMultipleNameNodes.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/balancer/TestBalancerWithNodeGroup.java


> Code cleanup for Balancer and Dispatcher
> 
>
> Key: HDFS-6837
> URL: https://issues.apache.org/jira/browse/HDFS-6837
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: balancer
>Reporter: Tsz Wo Nicholas Sze
>Assignee: Tsz Wo Nicholas Sze
>Priority: Minor
> Fix For: 2.6.0
>
> Attachments: h6837_20140810.patch, h6837_20140810b.patch
>
>
> A few minor code cleanup changes:
> - The constructor of Dispatcher should not read Balancer's conf properties; 
> the values should be passed by parameters.
> - StorageGroup.utilization can be removed; it is only used in toString().
> - Move Balancer.ReturnStatus to a standalone class.
> - In Dispatcher, rename BalancerDatanode to DDatanode.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HDFS-6582) Missing null check in RpcProgramNfs3#read(XDR, SecurityHandler)

2014-08-11 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-6582?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14093350#comment-14093350
 ] 

Hudson commented on HDFS-6582:
--

FAILURE: Integrated in Hadoop-trunk-Commit #6048 (See 
[https://builds.apache.org/job/Hadoop-trunk-Commit/6048/])
HDFS-6582. Missing null check in RpcProgramNfs3#read(XDR, SecurityHandler). 
Contributed by Abhiraj Butala (brandonli: 
http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1617366)
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs-nfs/src/main/java/org/apache/hadoop/hdfs/nfs/nfs3/RpcProgramNfs3.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs-nfs/src/test/java/org/apache/hadoop/hdfs/nfs/nfs3/TestRpcProgramNfs3.java
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt


> Missing null check in RpcProgramNfs3#read(XDR, SecurityHandler)
> ---
>
> Key: HDFS-6582
> URL: https://issues.apache.org/jira/browse/HDFS-6582
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: nfs
>Affects Versions: 2.2.0
>Reporter: Ted Yu
>Assignee: Abhiraj Butala
>Priority: Minor
> Fix For: 2.6.0
>
> Attachments: HDFS-6582.patch
>
>
> Around line 691:
> {code}
> FSDataInputStream fis = clientCache.getDfsInputStream(userName,
> Nfs3Utils.getFileIdPath(handle));
> try {
>   readCount = fis.read(offset, readbuffer, 0, count);
> {code}
> fis may be null, leading to NullPointerException



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (HDFS-6582) Missing null check in RpcProgramNfs3#read(XDR, SecurityHandler)

2014-08-11 Thread Brandon Li (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-6582?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Brandon Li updated HDFS-6582:
-

Fix Version/s: 2.6.0

> Missing null check in RpcProgramNfs3#read(XDR, SecurityHandler)
> ---
>
> Key: HDFS-6582
> URL: https://issues.apache.org/jira/browse/HDFS-6582
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: nfs
>Affects Versions: 2.2.0
>Reporter: Ted Yu
>Assignee: Abhiraj Butala
>Priority: Minor
> Fix For: 2.6.0
>
> Attachments: HDFS-6582.patch
>
>
> Around line 691:
> {code}
> FSDataInputStream fis = clientCache.getDfsInputStream(userName,
> Nfs3Utils.getFileIdPath(handle));
> try {
>   readCount = fis.read(offset, readbuffer, 0, count);
> {code}
> fis may be null, leading to NullPointerException



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (HDFS-6582) Missing null check in RpcProgramNfs3#read(XDR, SecurityHandler)

2014-08-11 Thread Brandon Li (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-6582?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Brandon Li updated HDFS-6582:
-

  Resolution: Fixed
Hadoop Flags: Reviewed
  Status: Resolved  (was: Patch Available)

> Missing null check in RpcProgramNfs3#read(XDR, SecurityHandler)
> ---
>
> Key: HDFS-6582
> URL: https://issues.apache.org/jira/browse/HDFS-6582
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: nfs
>Affects Versions: 2.2.0
>Reporter: Ted Yu
>Assignee: Abhiraj Butala
>Priority: Minor
> Attachments: HDFS-6582.patch
>
>
> Around line 691:
> {code}
> FSDataInputStream fis = clientCache.getDfsInputStream(userName,
> Nfs3Utils.getFileIdPath(handle));
> try {
>   readCount = fis.read(offset, readbuffer, 0, count);
> {code}
> fis may be null, leading to NullPointerException



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (HDFS-6582) Missing null check in RpcProgramNfs3#read(XDR, SecurityHandler)

2014-08-11 Thread Brandon Li (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-6582?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Brandon Li updated HDFS-6582:
-

Affects Version/s: 2.2.0

> Missing null check in RpcProgramNfs3#read(XDR, SecurityHandler)
> ---
>
> Key: HDFS-6582
> URL: https://issues.apache.org/jira/browse/HDFS-6582
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: nfs
>Affects Versions: 2.2.0
>Reporter: Ted Yu
>Assignee: Abhiraj Butala
>Priority: Minor
> Fix For: 2.6.0
>
> Attachments: HDFS-6582.patch
>
>
> Around line 691:
> {code}
> FSDataInputStream fis = clientCache.getDfsInputStream(userName,
> Nfs3Utils.getFileIdPath(handle));
> try {
>   readCount = fis.read(offset, readbuffer, 0, count);
> {code}
> fis may be null, leading to NullPointerException



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HDFS-6582) Missing null check in RpcProgramNfs3#read(XDR, SecurityHandler)

2014-08-11 Thread Brandon Li (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-6582?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14093339#comment-14093339
 ] 

Brandon Li commented on HDFS-6582:
--

I've committed the patch. Thank you, [~abutala], for the contribution!
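
For the record, the fix is the usual guard pattern; roughly, mirroring the 
fragment quoted in the description (the error response below is schematic, not 
necessarily the committed code):
{code}
FSDataInputStream fis = clientCache.getDfsInputStream(userName,
    Nfs3Utils.getFileIdPath(handle));
if (fis == null) {
  // No stream for this handle: fail the READ with an I/O error instead of
  // hitting the NPE below. (Response construction here is schematic.)
  return new READ3Response(Nfs3Status.NFS3ERR_IO);
}
try {
  readCount = fis.read(offset, readbuffer, 0, count);
  // ... rest of the read path unchanged
{code}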


> Missing null check in RpcProgramNfs3#read(XDR, SecurityHandler)
> ---
>
> Key: HDFS-6582
> URL: https://issues.apache.org/jira/browse/HDFS-6582
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: nfs
>Affects Versions: 2.2.0
>Reporter: Ted Yu
>Assignee: Abhiraj Butala
>Priority: Minor
> Fix For: 2.6.0
>
> Attachments: HDFS-6582.patch
>
>
> Around line 691:
> {code}
> FSDataInputStream fis = clientCache.getDfsInputStream(userName,
> Nfs3Utils.getFileIdPath(handle));
> try {
>   readCount = fis.read(offset, readbuffer, 0, count);
> {code}
> fis may be null, leading to NullPointerException



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HDFS-6582) Missing null check in RpcProgramNfs3#read(XDR, SecurityHandler)

2014-08-11 Thread Brandon Li (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-6582?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14093337#comment-14093337
 ] 

Brandon Li commented on HDFS-6582:
--

+1

> Missing null check in RpcProgramNfs3#read(XDR, SecurityHandler)
> ---
>
> Key: HDFS-6582
> URL: https://issues.apache.org/jira/browse/HDFS-6582
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: nfs
>Affects Versions: 2.2.0
>Reporter: Ted Yu
>Assignee: Abhiraj Butala
>Priority: Minor
> Fix For: 2.6.0
>
> Attachments: HDFS-6582.patch
>
>
> Around line 691:
> {code}
> FSDataInputStream fis = clientCache.getDfsInputStream(userName,
> Nfs3Utils.getFileIdPath(handle));
> try {
>   readCount = fis.read(offset, readbuffer, 0, count);
> {code}
> fis may be null, leading to NullPointerException



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (HDFS-6838) Code cleanup for unnecessary INode replacement

2014-08-11 Thread Haohui Mai (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-6838?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Haohui Mai updated HDFS-6838:
-

   Resolution: Fixed
Fix Version/s: 2.6.0
 Hadoop Flags: Reviewed
   Status: Resolved  (was: Patch Available)

I've committed the patch to trunk and branch-2. Thanks [~jingzhao] for the 
contribution.

> Code cleanup for unnecessary INode replacement
> --
>
> Key: HDFS-6838
> URL: https://issues.apache.org/jira/browse/HDFS-6838
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: namenode
>Reporter: Jing Zhao
>Assignee: Jing Zhao
>Priority: Minor
> Fix For: 2.6.0
>
> Attachments: HDFS-6838.000.patch
>
>
> With INode features we now no longer have INode replacement when converting a 
> file to an under-construction/with-snapshot file or converting a directory to 
> a snapshottable/with-snapshot directory. This jira plans to remove some 
> unnecessary code that is only useful for INode replacement.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HDFS-6838) Code cleanup for unnecessary INode replacement

2014-08-11 Thread Haohui Mai (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-6838?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14093319#comment-14093319
 ] 

Haohui Mai commented on HDFS-6838:
--

Looks good to me. +1

> Code cleanup for unnecessary INode replacement
> --
>
> Key: HDFS-6838
> URL: https://issues.apache.org/jira/browse/HDFS-6838
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: namenode
>Reporter: Jing Zhao
>Assignee: Jing Zhao
>Priority: Minor
> Attachments: HDFS-6838.000.patch
>
>
> With INode features we now no longer have INode replacement when converting a 
> file to an under-construction/with-snapshot file or converting a directory to 
> a snapshottable/with-snapshot directory. This jira plans to remove some 
> unnecessary code that is only useful for INode replacement.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HDFS-6838) Code cleanup for unnecessary INode replacement

2014-08-11 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-6838?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14093303#comment-14093303
 ] 

Hadoop QA commented on HDFS-6838:
-

{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12661032/HDFS-6838.000.patch
  against trunk revision .

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:red}-1 tests included{color}.  The patch doesn't appear to include 
any new or modified tests.
Please justify why no new tests are needed for this 
patch.
Also please list what manual steps were performed to 
verify this patch.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  There were no new javadoc warning messages.

{color:green}+1 eclipse:eclipse{color}.  The patch built with 
eclipse:eclipse.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 2.0.3) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:red}-1 core tests{color}.  The patch failed these unit tests in 
hadoop-hdfs-project/hadoop-hdfs:

  
org.apache.hadoop.hdfs.server.namenode.ha.TestPipelinesFailover

{color:green}+1 contrib tests{color}.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-HDFS-Build/7610//testReport/
Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/7610//console

This message is automatically generated.

> Code cleanup for unnecessary INode replacement
> --
>
> Key: HDFS-6838
> URL: https://issues.apache.org/jira/browse/HDFS-6838
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: namenode
>Reporter: Jing Zhao
>Assignee: Jing Zhao
>Priority: Minor
> Attachments: HDFS-6838.000.patch
>
>
> With INode features we now no longer have INode replacement when converting a 
> file to an under-construction/with-snapshot file or converting a directory to 
> a snapshottable/with-snapshot directory. This jira plans to remove some 
> unnecessary code that is only useful for INode replacement.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HDFS-6757) Simplify lease manager with INodeID

2014-08-11 Thread Haohui Mai (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-6757?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14093276#comment-14093276
 ] 

Haohui Mai commented on HDFS-6757:
--

Thanks [~cmccabe] and [~daryn] for the comments.

Just to make sure that I don't accidentally miss anything, can you guys list 
the issues I need to address in the next patch? So far I plan to change the 
code so that the NN bails out when it fails to replay the OP_CLOSE operations. 
[~cmccabe], do you have anything specific that you want to address?
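
To make the intent concrete, the core simplification is replacing the 
path-keyed map with an id-keyed one, sketched below with hypothetical field and 
method names (not the actual LeaseManager code):
{code}
// Sketch only.
// Before: keyed by path, so every rename/delete must rewrite keys.
Map<String, Lease> leasesByPath;

// After: keyed by inode id, which is stable across renames.
Map<Long, Lease> leasesById;

void onRename(String src, String dst) {
  // no lease bookkeeping needed any more: inode ids don't change on rename
}

void onDelete(long inodeId) {
  leasesById.remove(inodeId); // O(1) cleanup, no path tracking
}
{code}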

> Simplify lease manager with INodeID
> ---
>
> Key: HDFS-6757
> URL: https://issues.apache.org/jira/browse/HDFS-6757
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Reporter: Haohui Mai
>Assignee: Haohui Mai
> Attachments: HDFS-6757.000.patch, HDFS-6757.001.patch, 
> HDFS-6757.002.patch, HDFS-6757.003.patch, HDFS-6757.004.patch
>
>
> Currently the lease manager records leases based on path instead of inode 
> ids. Therefore, the lease manager needs to carefully keep track of the path 
> of active leases during renames and deletes. This can be a non-trivial task.
> This jira proposes to simplify the logic by tracking leases using inodeids 
> instead of paths.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HDFS-6803) Documenting DFSClient#DFSInputStream expectations reading and preading in concurrent context

2014-08-11 Thread Colin Patrick McCabe (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-6803?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14093266#comment-14093266
 ] 

Colin Patrick McCabe commented on HDFS-6803:


[~stev...@iseran.com]: Hmm.  I don't see much advantage in making 
non-positional reads concurrent.  When two threads do non-positional reads, 
they inherently interfere with each other by modifying the position.  So 
concurrent non-positional reads would not be very useful for most programmers, 
since you would basically not know what offset your read was starting at; it 
would depend on the peculiarities of thread timing.

Concurrent positional reads (preads) are useful precisely because they don't 
have this problem.  You're not sharing a stream position with any other thread, 
so you know what you're getting with your pread.

I think if we do allow concurrent non-positional reads, we should also document 
that this is optional, and that the stream never reads the same byte offset 
more than once.

bq. getPos() may block for an arbitrary amount of time if another thread is 
attempting to perform a positioned read and is having some problem 
communicating with the far end.  Is that something we really want? Is it 
something people expect?

HDFS has had this behavior for a long time. I checked back in Hadoop 0.20 and 
the {{synchronized}} is there on getPos and read.  I would be ok with getPos 
returning the position locklessly (perhaps from an AtomicLong?) but to my 
knowledge, nobody has ever requested that we change this.
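
A sketch of the lockless variant, in case it helps the discussion (this is not 
current {{DFSInputStream}} code, just the AtomicLong idea spelled out; the I/O 
helper is hypothetical):
{code}
import java.util.concurrent.atomic.AtomicLong;

// Sketch: the position lives in an AtomicLong so getPos() never takes the
// stream lock, and pread never touches it, preserving the getPos guarantee.
class PositionTrackingStream {
  private final AtomicLong pos = new AtomicLong(0);

  public long getPos() {
    return pos.get(); // lock-free
  }

  public synchronized int read(byte[] buf, int off, int len) {
    int n = doRead(pos.get(), buf, off, len); // hypothetical I/O helper
    if (n > 0) {
      pos.addAndGet(n);
    }
    return n;
  }

  public int read(long position, byte[] buf, int off, int len) {
    return doRead(position, buf, off, len); // pread: stream position untouched
  }

  private int doRead(long position, byte[] buf, int off, int len) {
    return -1; // elided: actual block reader I/O
  }
}
{code}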

{{pread}} should never affect the output of {{getPos}}.  That would go against 
the basic guarantee of positional read: that it doesn't alter the current 
stream position.  It doesn't really help FSes that implement pread as 
seek+read+seek, either.  Those filesystems have a basic problem-- the inability 
to do concurrent preads-- that weakening the {{getPos}} guarantee can't 
possibly solve.  The real solution is to add a better pread implementation to 
those filesystems.

(I do not think that concurrent pread should be required of all hadoop FSes, 
but it should be highly encouraged for all implementors.  And implementing 
pread as seek+read+seek should be highly discouraged)

I like the idea of saying that operations in "group P" (read, seek, skip, 
zero-copy read, releaseBuffer) can block each other, and every other operation 
is asynchronous.  I think that fits the needs of HBase, MR, and other clients 
very well; what do you think?

> Documenting DFSClient#DFSInputStream expectations reading and preading in 
> concurrent context
> 
>
> Key: HDFS-6803
> URL: https://issues.apache.org/jira/browse/HDFS-6803
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: hdfs-client
>Affects Versions: 2.4.1
>Reporter: stack
> Attachments: DocumentingDFSClientDFSInputStream (1).pdf
>
>
> Reviews of the patch posted the parent task suggest that we be more explicit 
> about how DFSIS is expected to behave when being read by contending threads. 
> It is also suggested that presumptions made internally be made explicit 
> documenting expectations.
> Before we put up a patch we've made a document of assertions we'd like to 
> make into tenets of DFSInputSteam.  If agreement, we'll attach to this issue 
> a patch that weaves the assumptions into DFSIS as javadoc and class comments. 



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HDFS-6826) Plugin interface to enable delegation of HDFS authorization assertions

2014-08-11 Thread Alejandro Abdelnur (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-6826?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14093249#comment-14093249
 ] 

Alejandro Abdelnur commented on HDFS-6826:
--

[~daryn], looking forward to your alternate approach. 

BTW, the plugin could be implemented in a way that the whole 'external' 
authorization data is fetched in advance, or so that the external call has a 
maxCallTime and times out, returning default fallback strict permissions.
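
A sketch of the maxCallTime idea, assuming a hypothetical external lookup and 
fallback type (none of these names are from the actual patch):
{code}
import java.util.concurrent.*;

// Sketch: bound the external authz call so an NN handler never blocks on it.
ExecutorService pool = Executors.newCachedThreadPool();

PermissionInfo checkExternal(final String path, long maxCallTimeMs) {
  Future<PermissionInfo> f = pool.submit(new Callable<PermissionInfo>() {
    public PermissionInfo call() throws Exception {
      return externalAuthzLookup(path); // hypothetical external call
    }
  });
  try {
    return f.get(maxCallTimeMs, TimeUnit.MILLISECONDS);
  } catch (Exception e) { // timeout, interrupt, or lookup failure
    f.cancel(true);
    return PermissionInfo.strictDefaults(); // hypothetical strict fallback
  }
}
{code}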

> Plugin interface to enable delegation of HDFS authorization assertions
> --
>
> Key: HDFS-6826
> URL: https://issues.apache.org/jira/browse/HDFS-6826
> Project: Hadoop HDFS
>  Issue Type: New Feature
>  Components: security
>Affects Versions: 2.4.1
>Reporter: Alejandro Abdelnur
>Assignee: Alejandro Abdelnur
> Attachments: HDFS-6826-idea.patch, 
> HDFSPluggableAuthorizationProposal.pdf
>
>
> When Hbase data, HiveMetaStore data or Search data is accessed via services 
> (Hbase region servers, HiveServer2, Impala, Solr) the services can enforce 
> permissions on corresponding entities (databases, tables, views, columns, 
> search collections, documents). It is desirable, when the data is accessed 
> directly by users accessing the underlying data files (i.e. from a MapReduce 
> job), that the permission of the data files map to the permissions of the 
> corresponding data entity (i.e. table, column family or search collection).
> To enable this we need to have the necessary hooks in place in the NameNode 
> to delegate authorization to an external system that can map HDFS 
> files/directories to data entities and resolve their permissions based on the 
> data entities permissions.
> I’ll be posting a design proposal in the next few days.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HDFS-6826) Plugin interface to enable delegation of HDFS authorization assertions

2014-08-11 Thread Daryn Sharp (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-6826?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14093239#comment-14093239
 ] 

Daryn Sharp commented on HDFS-6826:
---

It doesn't matter which or how many paths go through the custom plugin.  Adding 
anything executed in a handler that can block, with or w/o the fsn lock, will 
put the entire NN in jeopardy.

When it comes to problems with a slow external authz:
# Best-worst case: the number of special authz clients < ipc handlers.  Authz 
clients suffocate the throughput of "normal" clients, DN heartbeats, and block 
reports, but the NN limps along.
# Worst-worst case: the number of special authz clients >= ipc handlers.  The 
NN is effectively stalled.  If the external authz service is down, and not just 
extremely slow, the latency from connection timeouts will cause the NN to go 
into an overloaded death spiral.

I'll post an alternate approach that should require no client code changes 
shortly.

> Plugin interface to enable delegation of HDFS authorization assertions
> --
>
> Key: HDFS-6826
> URL: https://issues.apache.org/jira/browse/HDFS-6826
> Project: Hadoop HDFS
>  Issue Type: New Feature
>  Components: security
>Affects Versions: 2.4.1
>Reporter: Alejandro Abdelnur
>Assignee: Alejandro Abdelnur
> Attachments: HDFS-6826-idea.patch, 
> HDFSPluggableAuthorizationProposal.pdf
>
>
> When Hbase data, HiveMetaStore data or Search data is accessed via services 
> (Hbase region servers, HiveServer2, Impala, Solr) the services can enforce 
> permissions on corresponding entities (databases, tables, views, columns, 
> search collections, documents). It is desirable, when the data is accessed 
> directly by users accessing the underlying data files (i.e. from a MapReduce 
> job), that the permission of the data files map to the permissions of the 
> corresponding data entity (i.e. table, column family or search collection).
> To enable this we need to have the necessary hooks in place in the NameNode 
> to delegate authorization to an external system that can map HDFS 
> files/directories to data entities and resolve their permissions based on the 
> data entities permissions.
> I’ll be posting a design proposal in the next few days.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (HDFS-6826) Plugin interface to enable delegation of HDFS authorization assertions

2014-08-11 Thread Alejandro Abdelnur (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-6826?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alejandro Abdelnur updated HDFS-6826:
-

Attachment: HDFS-6826-idea.patch

[~daryn],

A custom plugin would have a list of region prefixes that are subject to 
'external' permissions; any path not matching these prefixes would go straight 
to the default plugin. Only paths matching the region prefixes would be subject 
to an 'external' permissions check.

Attached is an initial prototype, with a basic testcase using a custom plugin 
showing the proposed solution.
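
The routing itself is tiny; here's a sketch of the idea with made-up interface 
names (the prototype's actual API may differ):
{code}
// Sketch: route authz checks by path prefix; all names are hypothetical.
class PrefixDelegatingAuthzPlugin {
  private final List<String> externalPrefixes; // e.g. "/user/hive/warehouse"
  private final AuthzPlugin external;
  private final AuthzPlugin dflt;

  PrefixDelegatingAuthzPlugin(List<String> prefixes,
      AuthzPlugin external, AuthzPlugin dflt) {
    this.externalPrefixes = prefixes;
    this.external = external;
    this.dflt = dflt;
  }

  void checkPermission(String path, Action action) throws AccessControlException {
    for (String prefix : externalPrefixes) {
      if (path.startsWith(prefix)) {
        external.checkPermission(path, action); // external region
        return;
      }
    }
    dflt.checkPermission(path, action); // everything else: default behavior
  }
}
{code}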

> Plugin interface to enable delegation of HDFS authorization assertions
> --
>
> Key: HDFS-6826
> URL: https://issues.apache.org/jira/browse/HDFS-6826
> Project: Hadoop HDFS
>  Issue Type: New Feature
>  Components: security
>Affects Versions: 2.4.1
>Reporter: Alejandro Abdelnur
>Assignee: Alejandro Abdelnur
> Attachments: HDFS-6826-idea.patch, 
> HDFSPluggableAuthorizationProposal.pdf
>
>
> When Hbase data, HiveMetaStore data or Search data is accessed via services 
> (Hbase region servers, HiveServer2, Impala, Solr) the services can enforce 
> permissions on corresponding entities (databases, tables, views, columns, 
> search collections, documents). It is desirable, when the data is accessed 
> directly by users accessing the underlying data files (i.e. from a MapReduce 
> job), that the permission of the data files map to the permissions of the 
> corresponding data entity (i.e. table, column family or search collection).
> To enable this we need to have the necessary hooks in place in the NameNode 
> to delegate authorization to an external system that can map HDFS 
> files/directories to data entities and resolve their permissions based on the 
> data entities permissions.
> I’ll be posting a design proposal in the next few days.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HDFS-6825) Edit log corruption due to delayed block removal

2014-08-11 Thread Aaron T. Myers (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-6825?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14093183#comment-14093183
 ] 

Aaron T. Myers commented on HDFS-6825:
--

Hey Yongjun, thanks a lot for working on this. A few questions/comments for you:

# Please correct me if I'm wrong, but I don't think it's necessary for the test 
to use an HA minicluster, and instead a normal single NN minicluster would 
demonstrate this bug as well. If I'm right about that, please change the test. 
Making it HA distracts a bit from the actual bug, and makes the helper 
functions a tad more complex.
# I don't think that {{loopRecoverLease}} is actually called anywhere but 
TestPipelinesFailover, so no need to move it to DFSTestUtil.
# I don't follow why the change in TestCommitBlockSynchronization is necessary. 
Doesn't seem like it should be. Am I missing something?
# I agree with Andrew that using the {{DeleteThread}} in the test case seems 
like it's fragile and unnecessary.

> Edit log corruption due to delayed block removal
> 
>
> Key: HDFS-6825
> URL: https://issues.apache.org/jira/browse/HDFS-6825
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: namenode
>Affects Versions: 2.5.0
>Reporter: Yongjun Zhang
>Assignee: Yongjun Zhang
> Attachments: HDFS-6825.001.patch, HDFS-6825.002.patch, 
> HDFS-6825.003.patch
>
>
> Observed the following stack:
> {code}
> 2014-08-04 23:49:44,133 INFO 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem: 
> commitBlockSynchronization(lastblock=BP-.., newgenerationstamp=..., 
> newlength=..., newtargets=..., closeFile=true, deleteBlock=false)
> 2014-08-04 23:49:44,133 WARN 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem: Unexpected exception 
> while updating disk space. 
> java.io.FileNotFoundException: Path not found: 
> /solr/hierarchy/core_node1/data/tlog/tlog.xyz
> at 
> org.apache.hadoop.hdfs.server.namenode.FSDirectory.updateSpaceConsumed(FSDirectory.java:1807)
> at 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.commitOrCompleteLastBlock(FSNamesystem.java:3975)
> at 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.closeFileCommitBlocks(FSNamesystem.java:4178)
> at 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.commitBlockSynchronization(FSNamesystem.java:4146)
> at 
> org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.commitBlockSynchronization(NameNodeRpcServer.java:662)
> at 
> org.apache.hadoop.hdfs.protocolPB.DatanodeProtocolServerSideTranslatorPB.commitBlockSynchronization(DatanodeProtocolServerSideTranslatorPB.java:270)
> at 
> org.apache.hadoop.hdfs.protocol.proto.DatanodeProtocolProtos$DatanodeProtocolService$2.callBlockingMethod(DatanodeProtocolProtos.java:28073)
> at 
> org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:585)
> at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:1026)
> at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1986)
> at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1982)
> at java.security.AccessController.doPrivileged(Native Method)
> at javax.security.auth.Subject.doAs(Subject.java:415)
> at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1554)
> at org.apache.hadoop.ipc.Server$Handler.run(Server.java:1980)
> {code}
> Found this is what happened:
> - client created file /solr/hierarchy/core_node1/data/tlog/tlog.xyz
> - client tried to append to this file, but the lease had expired, so lease 
> recovery was started and the append failed
> - the file got deleted; however, there were still pending blocks of this file 
> not yet deleted
> - then the commitBlockSynchronization() method was called (see stack above); 
> an INodeFile was created out of the pending block, unaware that the file had 
> already been deleted
> - a FileNotFoundException was thrown by FSDirectory.updateSpaceConsumed, but 
> swallowed by commitOrCompleteLastBlock
> - closeFileCommitBlocks went on to call finalizeINodeFileUnderConstruction 
> and wrote a CloseOp to the edit log



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (HDFS-6839) Fix TestCLI to expect new output

2014-08-11 Thread Charles Lamb (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-6839?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Charles Lamb updated HDFS-6839:
---

Attachment: HDFS-6839.001.patch

A patch with the new output is attached. Might as well wait for the jenkins run.

> Fix TestCLI to expect new output
> 
>
> Key: HDFS-6839
> URL: https://issues.apache.org/jira/browse/HDFS-6839
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: security
>Affects Versions: fs-encryption (HADOOP-10150 and HDFS-6134)
>Reporter: Charles Lamb
>Assignee: Charles Lamb
> Attachments: HDFS-6839.001.patch
>
>
> TestCLI is failing because HADOOP-10919 changed the output of the cp usage 
> command.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Work started] (HDFS-6839) Fix TestCLI to expect new output

2014-08-11 Thread Charles Lamb (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-6839?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Work on HDFS-6839 started by Charles Lamb.

> Fix TestCLI to expect new output
> 
>
> Key: HDFS-6839
> URL: https://issues.apache.org/jira/browse/HDFS-6839
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: security
>Affects Versions: fs-encryption (HADOOP-10150 and HDFS-6134)
>Reporter: Charles Lamb
>Assignee: Charles Lamb
> Attachments: HDFS-6839.001.patch
>
>
> TestCLI is failing because HADOOP-10919 changed the output of the cp usage 
> command.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Created] (HDFS-6839) Fix TestCLI to expect new output

2014-08-11 Thread Charles Lamb (JIRA)
Charles Lamb created HDFS-6839:
--

 Summary: Fix TestCLI to expect new output
 Key: HDFS-6839
 URL: https://issues.apache.org/jira/browse/HDFS-6839
 Project: Hadoop HDFS
  Issue Type: Sub-task
Affects Versions: fs-encryption (HADOOP-10150 and HDFS-6134)
Reporter: Charles Lamb
Assignee: Charles Lamb


TestCLI is failing because HADOOP-10919 changed the output of the cp usage 
command.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (HDFS-6838) Code cleanup for unnecessary INode replacement

2014-08-11 Thread Jing Zhao (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-6838?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jing Zhao updated HDFS-6838:


Priority: Minor  (was: Major)

> Code cleanup for unnecessary INode replacement
> --
>
> Key: HDFS-6838
> URL: https://issues.apache.org/jira/browse/HDFS-6838
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: namenode
>Reporter: Jing Zhao
>Assignee: Jing Zhao
>Priority: Minor
> Attachments: HDFS-6838.000.patch
>
>
> With INode features we now no longer have INode replacement when converting a 
> file to an under-construction/with-snapshot file or converting a directory to 
> a snapshottable/with-snapshot directory. This jira plans to remove some 
> unnecessary code that is only useful for INode replacement.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (HDFS-6838) Code cleanup for unnecessary INode replacement

2014-08-11 Thread Jing Zhao (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-6838?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jing Zhao updated HDFS-6838:


Attachment: HDFS-6838.000.patch

Main changes:
# INode#recordModification no longer needs to return an INode. 
# Clean up the INode#setXxx methods accordingly.
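
The shape of the change, sketched (not the exact diff; the setter name is 
illustrative):
{code}
// Before: callers used the (possibly replaced) inode returned by the call.
INodeFile nodeToUpdate = file.recordModification(latestSnapshotId);
nodeToUpdate.setReplication(replication); // hypothetical setter

// After: no replacement can happen, so the return value is dropped.
file.recordModification(latestSnapshotId); // records in place
file.setReplication(replication); // keep using the same inode
{code}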

> Code cleanup for unnecessary INode replacement
> --
>
> Key: HDFS-6838
> URL: https://issues.apache.org/jira/browse/HDFS-6838
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: namenode
>Reporter: Jing Zhao
>Assignee: Jing Zhao
> Attachments: HDFS-6838.000.patch
>
>
> With INode features we now no longer have INode replacement when converting a 
> file to an under-construction/with-snapshot file or converting a directory to 
> a snapshottable/with-snapshot directory. This jira plans to remove some 
> unnecessary code that is only useful for INode replacement.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (HDFS-6838) Code cleanup for unnecessary INode replacement

2014-08-11 Thread Jing Zhao (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-6838?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jing Zhao updated HDFS-6838:


Status: Patch Available  (was: Open)

> Code cleanup for unnecessary INode replacement
> --
>
> Key: HDFS-6838
> URL: https://issues.apache.org/jira/browse/HDFS-6838
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: namenode
>Reporter: Jing Zhao
>Assignee: Jing Zhao
> Attachments: HDFS-6838.000.patch
>
>
> With INode features we now no longer have INode replacement when converting a 
> file to an under-construction/with-snapshot file or converting a directory to 
> a snapshottable/with-snapshot directory. This jira plans to remove some 
> unnecessary code that is only useful for INode replacement.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Created] (HDFS-6838) Code cleanup for unnecessary INode replacement

2014-08-11 Thread Jing Zhao (JIRA)
Jing Zhao created HDFS-6838:
---

 Summary: Code cleanup for unnecessary INode replacement
 Key: HDFS-6838
 URL: https://issues.apache.org/jira/browse/HDFS-6838
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: namenode
Reporter: Jing Zhao
Assignee: Jing Zhao


With INode features we now no longer have INode replacement when converting a 
file to an under-construction/with-snapshot file or converting a directory to a 
snapshottable/with-snapshot directory. This jira plans to remove some 
unnecessary code that is only useful for INode replacement.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (HDFS-6837) Code cleanup for Balancer and Dispatcher

2014-08-11 Thread Jing Zhao (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-6837?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jing Zhao updated HDFS-6837:


   Resolution: Fixed
Fix Version/s: 2.6.0
 Hadoop Flags: Reviewed
   Status: Resolved  (was: Patch Available)

I've committed this to trunk and branch-2.

> Code cleanup for Balancer and Dispatcher
> 
>
> Key: HDFS-6837
> URL: https://issues.apache.org/jira/browse/HDFS-6837
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: balancer
>Reporter: Tsz Wo Nicholas Sze
>Assignee: Tsz Wo Nicholas Sze
>Priority: Minor
> Fix For: 2.6.0
>
> Attachments: h6837_20140810.patch, h6837_20140810b.patch
>
>
> A few minor code cleanup changes:
> - The constructor of Dispatcher should not read Balancer's conf properties; 
> the values should be passed by parameters.
> - StorageGroup.utilization can be removed; it is only used in toString().
> - Move Balancer.ReturnStatus to a standalone class.
> - In Dispatcher, rename BalancerDatanode to DDatanode.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HDFS-6837) Code cleanup for Balancer and Dispatcher

2014-08-11 Thread Jing Zhao (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-6837?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14093060#comment-14093060
 ] 

Jing Zhao commented on HDFS-6837:
-

+1. I will commit the patch shortly.

> Code cleanup for Balancer and Dispatcher
> 
>
> Key: HDFS-6837
> URL: https://issues.apache.org/jira/browse/HDFS-6837
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: balancer
>Reporter: Tsz Wo Nicholas Sze
>Assignee: Tsz Wo Nicholas Sze
>Priority: Minor
> Attachments: h6837_20140810.patch, h6837_20140810b.patch
>
>
> A few minor code cleanup changes:
> - The constructor of Dispatcher should not read Balancer's conf properties; 
> the values should be passed by parameters.
> - StorageGroup.utilization can be removed; it is only used in toString().
> - Move Balancer.ReturnStatus to a standalone class.
> - In Dispatcher, rename BalancerDatanode to DDatanode.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HDFS-6826) Plugin interface to enable delegation of HDFS authorization assertions

2014-08-11 Thread Daryn Sharp (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-6826?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14093045#comment-14093045
 ] 

Daryn Sharp commented on HDFS-6826:
---

I understand the motivation but there has to be a better approach.  Isn't this 
akin to an nfs server or ext4 basing its permission model on a mysql query to 
access raw mysql files?

Every external dependency introduces latency and additional HA concerns.  Tying 
up handlers, whether or not the fsn lock is held, during an operation is very 
dangerous and unacceptable for the reasons I originally cited.  Currently 
non-local edit logs, e.g. a shared nfs edit dir or journal node, are the only 
external dependency (that I'm aware of).  This critical dependency is 
unavoidable for durability and consistency.

However, if an external service exposing data entities in hdfs uses a 
supplemental authz scheme, it should be its responsibility to arbitrate access 
when fs-level permissions are insufficient.

> Plugin interface to enable delegation of HDFS authorization assertions
> --
>
> Key: HDFS-6826
> URL: https://issues.apache.org/jira/browse/HDFS-6826
> Project: Hadoop HDFS
>  Issue Type: New Feature
>  Components: security
>Affects Versions: 2.4.1
>Reporter: Alejandro Abdelnur
>Assignee: Alejandro Abdelnur
> Attachments: HDFSPluggableAuthorizationProposal.pdf
>
>
> When Hbase data, HiveMetaStore data or Search data is accessed via services 
> (Hbase region servers, HiveServer2, Impala, Solr) the services can enforce 
> permissions on corresponding entities (databases, tables, views, columns, 
> search collections, documents). It is desirable, when the data is accessed 
> directly by users accessing the underlying data files (i.e. from a MapReduce 
> job), that the permission of the data files map to the permissions of the 
> corresponding data entity (i.e. table, column family or search collection).
> To enable this we need to have the necessary hooks in place in the NameNode 
> to delegate authorization to an external system that can map HDFS 
> files/directories to data entities and resolve their permissions based on the 
> data entities permissions.
> I’ll be posting a design proposal in the next few days.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HDFS-6569) OOB message can't be sent to the client when DataNode shuts down for upgrade

2014-08-11 Thread Brandon Li (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-6569?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14093039#comment-14093039
 ] 

Brandon Li commented on HDFS-6569:
--

The test failure was not introduced by this patch. 

> OOB message can't be sent to the client when DataNode shuts down for upgrade
> 
>
> Key: HDFS-6569
> URL: https://issues.apache.org/jira/browse/HDFS-6569
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: datanode
>Affects Versions: 3.0.0, 2.4.0
>Reporter: Brandon Li
>Assignee: Kihwal Lee
> Attachments: HDFS-6569.001.patch, HDFS-6569.002.patch, 
> test-hdfs-6569.patch
>
>
> The socket is closed too early before the OOB message can be sent to client, 
> which causes the write pipeline failure.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HDFS-6793) Missing changes in HftpFileSystem when Reintroduce dfs.http.port / dfs.https.port in branch-2

2014-08-11 Thread Juan Yu (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-6793?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14092983#comment-14092983
 ] 

Juan Yu commented on HDFS-6793:
---

Thanks Andrew and Yongjun!

> Missing changes in HftpFileSystem when Reintroduce dfs.http.port / 
> dfs.https.port in branch-2
> -
>
> Key: HDFS-6793
> URL: https://issues.apache.org/jira/browse/HDFS-6793
> Project: Hadoop HDFS
>  Issue Type: Bug
>Affects Versions: 2.5.0
>Reporter: Juan Yu
>Assignee: Juan Yu
>Priority: Blocker
> Fix For: 2.5.0
>
> Attachments: HDFS-6793.branch2.patch, HDFS-6793.patch, HDFS-6793.patch
>
>
> HDFS-6632 reintroduced dfs.http.port / dfs.https.port in branch-2, but it 
> didn't include changes to HftpFileSystem.
> HftpFileSystem has been removed from trunk but is still in 2.5, so in 2.5 we 
> need to use dfs.http.port / dfs.https.port.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HDFS-6803) Documenting DFSClient#DFSInputStream expectations reading and preading in concurrent context

2014-08-11 Thread Steve Loughran (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-6803?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14092958#comment-14092958
 ] 

Steve Loughran commented on HDFS-6803:
--

I'm very tempted to push for a lax model which says

# either the preads block or they are concurrent, if the FS supports it
# preads and classic (read@pos) reads may be concurrent or blocking
# getPos() may expose the position of a positioned read

I know #3 goes up against all the rules of hiding things, but think of this: if 
we mandate that {{getPos()}} hides all intermediate positions on pread, then 
any class which uses the base implementation of seek+read+seek will require 
{{getPos()}} to be synced with the read, which implies that:

{{getPos()}} may block for an arbitrary amount of time if another thread is 
attempting to perform a positioned read and is having some problem 
communicating with the far end.

Is that something we really want? Is it something people expect? 
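
For context, the fallback pread that streams inherit looks roughly like the 
sketch below (the seek+read+seek pattern), which is why a synchronized 
{{getPos()}} could stall behind it:
{code}
// Roughly the inherited fallback: pread holds the stream lock for the whole
// round trip, so a synchronized getPos() waits until it finishes.
public synchronized int read(long position, byte[] buffer, int offset, int length)
    throws IOException {
  long oldPos = getPos();
  int nread = -1;
  try {
    seek(position); // the intermediate position is where getPos() could leak
    nread = read(buffer, offset, length);
  } finally {
    seek(oldPos); // restore, possibly long after the caller wanted getPos()
  }
  return nread;
}
{code}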

> Documenting DFSClient#DFSInputStream expectations reading and preading in 
> concurrent context
> 
>
> Key: HDFS-6803
> URL: https://issues.apache.org/jira/browse/HDFS-6803
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: hdfs-client
>Affects Versions: 2.4.1
>Reporter: stack
> Attachments: DocumentingDFSClientDFSInputStream (1).pdf
>
>
> Reviews of the patch posted the parent task suggest that we be more explicit 
> about how DFSIS is expected to behave when being read by contending threads. 
> It is also suggested that presumptions made internally be made explicit 
> documenting expectations.
> Before we put up a patch we've made a document of assertions we'd like to 
> make into tenets of DFSInputSteam.  If agreement, we'll attach to this issue 
> a patch that weaves the assumptions into DFSIS as javadoc and class comments. 



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HDFS-6660) Use int instead of object reference to DatanodeStorageInfo in BlockInfo triplets,

2014-08-11 Thread Daryn Sharp (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-6660?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14092874#comment-14092874
 ] 

Daryn Sharp commented on HDFS-6660:
---

As a safety precaution, is it possible to add at least an assert to help catch 
if the indices get munged?
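
Something along these lines, perhaps (a sketch of the index map plus the 
assert; names are illustrative, not the patch's):
{code}
import java.util.ArrayList;

// Sketch: map each storage to a stable int; the triplets store the int.
class StorageIndexMap {
  private final ArrayList<DatanodeStorageInfo> byIndex =
      new ArrayList<DatanodeStorageInfo>();

  synchronized int register(DatanodeStorageInfo storage) {
    byIndex.add(storage);
    return byIndex.size() - 1; // this int replaces the object reference
  }

  DatanodeStorageInfo get(int index) {
    assert index >= 0 && index < byIndex.size()
        : "munged storage index: " + index;
    return byIndex.get(index);
  }
}
{code}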

> Use int instead of object reference to DatanodeStorageInfo in BlockInfo 
> triplets,
> -
>
> Key: HDFS-6660
> URL: https://issues.apache.org/jira/browse/HDFS-6660
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: namenode
>Affects Versions: 2.4.1
>Reporter: Amir Langer
>Assignee: Amir Langer
> Attachments: 
> 0002-add-an-integer-id-to-all-storages-in-DatanodeManager.patch
>
>
> Map an int index to every DatanodeStorageInfo and use it instead of an object 
> reference in the BlockInfo triplets data structure.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HDFS-6659) Create a Block List

2014-08-11 Thread Daryn Sharp (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-6659?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14092871#comment-14092871
 ] 

Daryn Sharp commented on HDFS-6659:
---

I'd suggest trying to create a subclass of {{ChunkedArrayList}} that implements 
the custom hole tracking you need.
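
To illustrate the gap handling the description proposes, a minimal sketch 
(growth is shown as a copy here for brevity; the actual proposal chains arrays 
instead):
{code}
// Sketch: removed slots form a free list threaded through the array itself,
// so gaps cost no extra memory. Inserts reuse gaps before growing.
class GappyList<T> {
  private Object[] slots = new Object[16];
  private int size = 0;
  private int gapHead = -1; // index of the first free slot, -1 if none

  int add(T item) {
    if (gapHead >= 0) { // reuse a gap first
      int idx = gapHead;
      gapHead = (Integer) slots[idx]; // the next gap was stored in the slot
      slots[idx] = item;
      return idx;
    }
    if (size == slots.length) {
      slots = java.util.Arrays.copyOf(slots, size * 2); // proposal: chain arrays instead
    }
    slots[size] = item;
    return size++;
  }

  void remove(int idx) {
    slots[idx] = gapHead; // push this slot onto the gap list
    gapHead = idx;
  }
}
{code}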

> Create a Block List
> ---
>
> Key: HDFS-6659
> URL: https://issues.apache.org/jira/browse/HDFS-6659
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: namenode
>Affects Versions: 2.4.1
>Reporter: Amir Langer
>Assignee: Amir Langer
> Attachments: 
> 0001-BlockList-a-list-of-Blocks-that-saves-memory-by-mana.patch
>
>
> BlockList - An efficient array based list that can extend its capacity with 
> two main features:
> 1. Gaps (result of remove operations) are managed internally without the need 
> for extra memory - We create a linked list of gaps by using the array index 
> as references + An int to the head of the gaps list. In every insert 
> operation, we first use any available gap before extending the array.
> 2. Array extension is done by chaining different arrays, not by allocating a 
> larger array and copying all its data across. This is a lot less heavy in 
> terms of latency for that particular call. It also avoids having large amount 
> of contiguous heap space and so behaves nicer with garbage collection.
>  



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HDFS-6833) DirectoryScanner should not register a deleting block with memory of DataNode

2014-08-11 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-6833?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14092737#comment-14092737
 ] 

Hadoop QA commented on HDFS-6833:
-

{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12660983/HDFS-6833.patch
  against trunk revision .

{color:red}-1 patch{color}.  The patch command could not apply the patch.

Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/7609//console

This message is automatically generated.

> DirectoryScanner should not register a deleting block with memory of DataNode
> -
>
> Key: HDFS-6833
> URL: https://issues.apache.org/jira/browse/HDFS-6833
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: datanode
>Affects Versions: 3.0.0
>Reporter: Shinichi Yamashita
>Assignee: Shinichi Yamashita
> Attachments: HDFS-6833.patch
>
>
> When a block is deleted in DataNode, the following messages are usually 
> output.
> {code}
> 2014-08-07 17:53:11,606 INFO 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetAsyncDiskService:
>  Scheduling blk_1073741825_1001 file 
> /hadoop/data1/dfs/data/current/BP-1887080305-172.28.0.101-1407398838872/current/finalized/subdir0/subdir0/blk_1073741825
>  for deletion
> 2014-08-07 17:53:11,617 INFO 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetAsyncDiskService:
>  Deleted BP-1887080305-172.28.0.101-1407398838872 blk_1073741825_1001 file 
> /hadoop/data1/dfs/data/current/BP-1887080305-172.28.0.101-1407398838872/current/finalized/subdir0/subdir0/blk_1073741825
> {code}
> However, DirectoryScanner may be executed while DataNode is deleting the 
> block in the current implementation. And the following messages are output.
> {code}
> 2014-08-07 17:53:30,519 INFO 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetAsyncDiskService:
>  Scheduling blk_1073741825_1001 file 
> /hadoop/data1/dfs/data/current/BP-1887080305-172.28.0.101-1407398838872/current/finalized/subdir0/subdir0/blk_1073741825
>  for deletion
> 2014-08-07 17:53:31,426 INFO 
> org.apache.hadoop.hdfs.server.datanode.DirectoryScanner: BlockPool 
> BP-1887080305-172.28.0.101-1407398838872 Total blocks: 1, missing metadata 
> files:0, missing block files:0, missing blocks in memory:1, mismatched 
> blocks:0
> 2014-08-07 17:53:31,426 WARN 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl: Added 
> missing block to memory FinalizedReplica, blk_1073741825_1001, FINALIZED
>   getNumBytes() = 21230663
>   getBytesOnDisk()  = 21230663
>   getVisibleLength()= 21230663
>   getVolume()   = /hadoop/data1/dfs/data/current
>   getBlockFile()= 
> /hadoop/data1/dfs/data/current/BP-1887080305-172.28.0.101-1407398838872/current/finalized/subdir0/subdir0/blk_1073741825
>   unlinked  =false
> 2014-08-07 17:53:31,531 INFO 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetAsyncDiskService:
>  Deleted BP-1887080305-172.28.0.101-1407398838872 blk_1073741825_1001 file 
> /hadoop/data1/dfs/data/current/BP-1887080305-172.28.0.101-1407398838872/current/finalized/subdir0/subdir0/blk_1073741825
> {code}
> Information about the block being deleted is registered in DataNode's memory.
> And when DataNode sends a block report, NameNode receives wrong block 
> information.
> For example, when we execute recommission or change the replication factor, 
> NameNode may delete the correct block as "ExcessReplicate" because of this 
> problem, and "Under-Replicated Blocks" and "Missing Blocks" occur.
> When DataNode runs DirectoryScanner, it should not register a block that is 
> being deleted.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (HDFS-6833) DirectoryScanner should not register a deleting block with memory of DataNode

2014-08-11 Thread Shinichi Yamashita (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-6833?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shinichi Yamashita updated HDFS-6833:
-

Status: Patch Available  (was: Open)

> DirectoryScanner should not register a deleting block with memory of DataNode
> -
>
> Key: HDFS-6833
> URL: https://issues.apache.org/jira/browse/HDFS-6833
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: datanode
>Affects Versions: 3.0.0
>Reporter: Shinichi Yamashita
>Assignee: Shinichi Yamashita
> Attachments: HDFS-6833.patch
>
>
> When a block is deleted in DataNode, the following messages are usually 
> output.
> {code}
> 2014-08-07 17:53:11,606 INFO 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetAsyncDiskService:
>  Scheduling blk_1073741825_1001 file 
> /hadoop/data1/dfs/data/current/BP-1887080305-172.28.0.101-1407398838872/current/finalized/subdir0/subdir0/blk_1073741825
>  for deletion
> 2014-08-07 17:53:11,617 INFO 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetAsyncDiskService:
>  Deleted BP-1887080305-172.28.0.101-1407398838872 blk_1073741825_1001 file 
> /hadoop/data1/dfs/data/current/BP-1887080305-172.28.0.101-1407398838872/current/finalized/subdir0/subdir0/blk_1073741825
> {code}
> However, DirectoryScanner may be executed while DataNode is deleting the 
> block in the current implementation. And the following messages are output.
> {code}
> 2014-08-07 17:53:30,519 INFO 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetAsyncDiskService:
>  Scheduling blk_1073741825_1001 file 
> /hadoop/data1/dfs/data/current/BP-1887080305-172.28.0.101-1407398838872/current/finalized/subdir0/subdir0/blk_1073741825
>  for deletion
> 2014-08-07 17:53:31,426 INFO 
> org.apache.hadoop.hdfs.server.datanode.DirectoryScanner: BlockPool 
> BP-1887080305-172.28.0.101-1407398838872 Total blocks: 1, missing metadata 
> files:0, missing block files:0, missing blocks in memory:1, mismatched 
> blocks:0
> 2014-08-07 17:53:31,426 WARN 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl: Added 
> missing block to memory FinalizedReplica, blk_1073741825_1001, FINALIZED
>   getNumBytes() = 21230663
>   getBytesOnDisk()  = 21230663
>   getVisibleLength()= 21230663
>   getVolume()   = /hadoop/data1/dfs/data/current
>   getBlockFile()= 
> /hadoop/data1/dfs/data/current/BP-1887080305-172.28.0.101-1407398838872/current/finalized/subdir0/subdir0/blk_1073741825
>   unlinked  =false
> 2014-08-07 17:53:31,531 INFO 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetAsyncDiskService:
>  Deleted BP-1887080305-172.28.0.101-1407398838872 blk_1073741825_1001 file 
> /hadoop/data1/dfs/data/current/BP-1887080305-172.28.0.101-1407398838872/current/finalized/subdir0/subdir0/blk_1073741825
> {code}
> As a result, the information of a block that is being deleted is re-registered 
> in the DataNode's memory, and when the DataNode sends a block report, the 
> NameNode receives wrong block information.
> For example, when we recommission a node or change the replication factor, the 
> NameNode may delete a valid block as "ExcessReplicate" because of this problem, 
> and "Under-Replicated Blocks" and "Missing Blocks" occur.
> When the DataNode runs DirectoryScanner, it should not re-register a block 
> that is being deleted.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (HDFS-6833) DirectoryScanner should not register a deleting block with memory of DataNode

2014-08-11 Thread Shinichi Yamashita (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-6833?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shinichi Yamashita updated HDFS-6833:
-

Attachment: HDFS-6833.patch

I have attached a patch file that adds handling for blocks that are in the 
process of being deleted.
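
For context, here is a minimal sketch of the general idea (illustrative class 
and method names, not necessarily what the attached patch does): the DataNode 
tracks block IDs that have been queued for asynchronous deletion, and 
DirectoryScanner consults that set before re-adding an on-disk replica to 
memory.
{code}
import java.util.Collections;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

public class DeletingBlockTracker {
  // Block pool ID -> IDs of blocks currently queued for async deletion.
  private final ConcurrentMap<String, Set<Long>> deleting =
      new ConcurrentHashMap<String, Set<Long>>();

  /** Called just before a replica is handed to the async deletion service. */
  public void markDeleting(String bpid, long blockId) {
    Set<Long> ids = deleting.get(bpid);
    if (ids == null) {
      Set<Long> fresh =
          Collections.newSetFromMap(new ConcurrentHashMap<Long, Boolean>());
      ids = deleting.putIfAbsent(bpid, fresh);
      if (ids == null) {
        ids = fresh;
      }
    }
    ids.add(blockId);
  }

  /** Called after the async deletion service has removed the block file. */
  public void markDeleted(String bpid, long blockId) {
    Set<Long> ids = deleting.get(bpid);
    if (ids != null) {
      ids.remove(blockId);
    }
  }

  /**
   * DirectoryScanner would check this before re-registering an on-disk file
   * as a "missing block in memory": a block awaiting deletion is skipped.
   */
  public boolean isDeleting(String bpid, long blockId) {
    Set<Long> ids = deleting.get(bpid);
    return ids != null && ids.contains(blockId);
  }
}
{code}
With a guard like this, the scanner's reconciliation pass would leave a 
replica alone while its file deletion is still in flight, instead of logging 
"Added missing block to memory" as in the log above.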

> DirectoryScanner should not register a deleting block with memory of DataNode
> -
>
> Key: HDFS-6833
> URL: https://issues.apache.org/jira/browse/HDFS-6833
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: datanode
>Affects Versions: 3.0.0
>Reporter: Shinichi Yamashita
>Assignee: Shinichi Yamashita
> Attachments: HDFS-6833.patch
>
>
> When a block is deleted on a DataNode, the following messages are usually 
> output.
> {code}
> 2014-08-07 17:53:11,606 INFO 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetAsyncDiskService:
>  Scheduling blk_1073741825_1001 file 
> /hadoop/data1/dfs/data/current/BP-1887080305-172.28.0.101-1407398838872/current/finalized/subdir0/subdir0/blk_1073741825
>  for deletion
> 2014-08-07 17:53:11,617 INFO 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetAsyncDiskService:
>  Deleted BP-1887080305-172.28.0.101-1407398838872 blk_1073741825_1001 file 
> /hadoop/data1/dfs/data/current/BP-1887080305-172.28.0.101-1407398838872/current/finalized/subdir0/subdir0/blk_1073741825
> {code}
> However, in the current implementation, DirectoryScanner may run while the 
> DataNode is deleting the block, and then the following messages are output.
> {code}
> 2014-08-07 17:53:30,519 INFO 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetAsyncDiskService:
>  Scheduling blk_1073741825_1001 file 
> /hadoop/data1/dfs/data/current/BP-1887080305-172.28.0.101-1407398838872/current/finalized/subdir0/subdir0/blk_1073741825
>  for deletion
> 2014-08-07 17:53:31,426 INFO 
> org.apache.hadoop.hdfs.server.datanode.DirectoryScanner: BlockPool 
> BP-1887080305-172.28.0.101-1407398838872 Total blocks: 1, missing metadata 
> files:0, missing block files:0, missing blocks in memory:1, mismatched 
> blocks:0
> 2014-08-07 17:53:31,426 WARN 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl: Added 
> missing block to memory FinalizedReplica, blk_1073741825_1001, FINALIZED
>   getNumBytes() = 21230663
>   getBytesOnDisk()  = 21230663
>   getVisibleLength()= 21230663
>   getVolume()   = /hadoop/data1/dfs/data/current
>   getBlockFile()= 
> /hadoop/data1/dfs/data/current/BP-1887080305-172.28.0.101-1407398838872/current/finalized/subdir0/subdir0/blk_1073741825
>   unlinked  =false
> 2014-08-07 17:53:31,531 INFO 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetAsyncDiskService:
>  Deleted BP-1887080305-172.28.0.101-1407398838872 blk_1073741825_1001 file 
> /hadoop/data1/dfs/data/current/BP-1887080305-172.28.0.101-1407398838872/current/finalized/subdir0/subdir0/blk_1073741825
> {code}
> As a result, the information of a block that is being deleted is re-registered 
> in the DataNode's memory, and when the DataNode sends a block report, the 
> NameNode receives wrong block information.
> For example, when we recommission a node or change the replication factor, the 
> NameNode may delete a valid block as "ExcessReplicate" because of this problem, 
> and "Under-Replicated Blocks" and "Missing Blocks" occur.
> When the DataNode runs DirectoryScanner, it should not re-register a block 
> that is being deleted.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HDFS-6664) HDFS permissions guide documentation states incorrect default group mapping class.

2014-08-11 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-6664?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14092567#comment-14092567
 ] 

Hadoop QA commented on HDFS-6664:
-

{color:green}+1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12660925/HDFS6664-03.patch
  against trunk revision .

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+0 tests included{color}.  The patch appears to be a 
documentation patch that doesn't require tests.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  There were no new javadoc warning messages.

{color:green}+1 eclipse:eclipse{color}.  The patch built with 
eclipse:eclipse.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 2.0.3) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:green}+1 core tests{color}.  The patch passed unit tests in 
hadoop-hdfs-project/hadoop-hdfs.

{color:green}+1 contrib tests{color}.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-HDFS-Build/7606//testReport/
Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/7606//console

This message is automatically generated.

> HDFS permissions guide documentation states incorrect default group mapping 
> class.
> --
>
> Key: HDFS-6664
> URL: https://issues.apache.org/jira/browse/HDFS-6664
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: documentation
>Affects Versions: 3.0.0, 2.5.0
>Reporter: Chris Nauroth
>Priority: Trivial
>  Labels: newbie
> Attachments: HDFS6664-01.patch, HDFS6664-02.patch, HDFS6664-03.patch
>
>
> The HDFS permissions guide states that our default group mapping class is 
> {{org.apache.hadoop.security.ShellBasedUnixGroupsMapping}}.  This is no 
> longer true.  The default has been changed to 
> {{org.apache.hadoop.security.JniBasedUnixGroupsMappingWithFallback}}.
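
For illustration, the group mapping implementation is selected in 
core-site.xml. A minimal sketch, assuming the standard 
{{hadoop.security.group.mapping}} key (shown here set explicitly to what is 
now the default):
{code}
<!-- core-site.xml: select the group mapping implementation explicitly.
     JniBasedUnixGroupsMappingWithFallback is already the default; it uses
     the native (JNI) lookup when libhadoop is available and falls back to
     the shell-based mapping otherwise. -->
<property>
  <name>hadoop.security.group.mapping</name>
  <value>org.apache.hadoop.security.JniBasedUnixGroupsMappingWithFallback</value>
</property>
{code}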



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HDFS-6261) Add document for enabling node group layer in HDFS

2014-08-11 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-6261?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14092565#comment-14092565
 ] 

Hadoop QA commented on HDFS-6261:
-

{color:green}+1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12660927/HDFS-6261.v3.patch
  against trunk revision .

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+0 tests included{color}.  The patch appears to be a 
documentation patch that doesn't require tests.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  There were no new javadoc warning messages.

{color:green}+1 eclipse:eclipse{color}.  The patch built with 
eclipse:eclipse.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 2.0.3) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:green}+1 core tests{color}.  The patch passed unit tests in 
hadoop-hdfs-project/hadoop-hdfs.

{color:green}+1 contrib tests{color}.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-HDFS-Build/7607//testReport/
Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/7607//console

This message is automatically generated.

> Add document for enabling node group layer in HDFS
> --
>
> Key: HDFS-6261
> URL: https://issues.apache.org/jira/browse/HDFS-6261
> Project: Hadoop HDFS
>  Issue Type: Task
>  Components: documentation
>Reporter: Wenwu Peng
>Assignee: Binglin Chang
>  Labels: documentation
> Attachments: 2-layer-topology.png, 3-layer-topology.png, 
> 3layer-topology.png, 4layer-topology.png, HDFS-6261.v1.patch, 
> HDFS-6261.v1.patch, HDFS-6261.v2.patch, HDFS-6261.v3.patch
>
>
> Most of the patches from umbrella JIRA HADOOP-8468 have been committed. 
> However, there is no documentation that introduces NodeGroup awareness 
> (Hadoop Virtualization Extensions) or explains how to configure it, so we 
> need to document it:
> 1. Document the NodeGroup-aware topology in http://hadoop.apache.org/docs/current
> 2. Document the NodeGroup-aware properties in core-default.xml.
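
As a starting point for that documentation, here is a hedged configuration 
sketch of enabling the node group layer (property names as introduced by the 
HADOOP-8468 umbrella work; they should be verified against the committed 
patches):
{code}
<!-- core-site.xml: use the node-group-aware network topology
     (datacenter/rack/nodegroup/node). -->
<property>
  <name>net.topology.impl</name>
  <value>org.apache.hadoop.net.NetworkTopologyWithNodeGroup</value>
</property>
<property>
  <name>net.topology.nodegroup.aware</name>
  <value>true</value>
</property>

<!-- hdfs-site.xml: choose the node-group-aware block placement policy. -->
<property>
  <name>dfs.block.replicator.classname</name>
  <value>org.apache.hadoop.hdfs.server.blockmanagement.BlockPlacementPolicyWithNodeGroup</value>
</property>
{code}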



--
This message was sent by Atlassian JIRA
(v6.2#6252)