[jira] [Updated] (HDFS-14626) Decommission all nodes hosting last block of open file succeeds unexpectedly

2020-04-09 Thread Brahma Reddy Battula (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-14626?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Brahma Reddy Battula updated HDFS-14626:

Target Version/s: 3.4.0  (was: 3.3.0)

Bulk update: moved all 3.3.0 non-blocker issues, please move back if it is a 
blocker.

> Decommission all nodes hosting last block of open file succeeds unexpectedly 
> -
>
> Key: HDFS-14626
> URL: https://issues.apache.org/jira/browse/HDFS-14626
> Project: Hadoop HDFS
>  Issue Type: Bug
>Affects Versions: 3.3.0
>Reporter: Stephen O'Donnell
>Assignee: Stephen O'Donnell
>Priority: Major
> Attachments: test-to-reproduce.patch
>
>
> I have been investigating scenarios that cause decommission to hang, 
especially around one long-standing issue. That is, an open block on the host 
> which is being decommissioned can cause the process to never complete.
> Checking the history, there seems to have been at least one change in 
> HDFS-5579 which greatly improved the situation, but from reading comments and 
> support cases, there still seems to be some scenarios where open blocks on a 
> DN host cause the decommission to get stuck.
> No matter what I try, I have not been able to reproduce this, but I think I 
> have uncovered another issue that may partly explain why.
> If I do the following, the nodes will decommission without any issues:
> 1. Create a file and write to it so it crosses a block boundary. Then there 
> is one complete block and one under construction block. Keep the file open, 
> and write a few bytes periodically.
> 2. Now note the nodes which the UC block is currently being written on, and 
> decommission them all.
> 3. The decommission should succeed.
> 4. Now attempt to close the open file, and it will fail to close with an 
> error like below, probably as decommissioned nodes are not allowed to send 
> IBRs:
> {code:java}
> java.io.IOException: Unable to close file because the last block 
> BP-646926902-192.168.0.20-1562099323291:blk_1073741827_1003 does not have 
> enough number of replicas.
>     at 
> org.apache.hadoop.hdfs.DFSOutputStream.completeFile(DFSOutputStream.java:968)
>     at 
> org.apache.hadoop.hdfs.DFSOutputStream.completeFile(DFSOutputStream.java:911)
>     at 
> org.apache.hadoop.hdfs.DFSOutputStream.closeImpl(DFSOutputStream.java:894)
>     at org.apache.hadoop.hdfs.DFSOutputStream.close(DFSOutputStream.java:849)
>     at 
> org.apache.hadoop.fs.FSDataOutputStream$PositionCache.close(FSDataOutputStream.java:72)
>     at 
> org.apache.hadoop.fs.FSDataOutputStream.close(FSDataOutputStream.java:101){code}
> Interestingly, if you recommission the nodes without restarting them before 
> closing the file, it will close OK, and writes to it can continue even once 
> decommission has completed.
> I don't think this is expected - i.e. decommission should not be able to 
> complete on all nodes hosting the last UC block of a file.
> From what I have figured out, I don't think UC blocks are considered in the 
> DatanodeAdminManager at all. This is because the original list of blocks it 
> cares about, are taken from the Datanode block Iterator, which takes them 
> from the DatanodeStorageInfo objects attached to the datanode instance. I 
> believe UC blocks don't make it into the DatanodeStorageInfo until after 
> they have been completed and an IBR sent, so the decommission logic never 
> considers them.
> What troubles me about this explanation, is how did open files previously 
> cause decommission to get stuck if it never checks for them, so I suspect I 
> am missing something.
> I will attach a patch with a test case that demonstrates this issue. This 
> reproduces on trunk and I also tested on CDH 5.8.1, which is based on the 2.6 
> branch, but with a lot of backports.
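The mechanism described above - UC blocks never making it into the DatanodeStorageInfo
block iterator, so the decommission logic never considers them - can be sketched as a
toy model. All class and method names below are hypothetical and do not mirror the
actual DatanodeAdminManager code:

```java
import java.util.HashSet;
import java.util.Set;

// Toy model of the behaviour described above; names are hypothetical and do
// not mirror the real Hadoop sources.
class DecommissionModel {
    // Blocks the DatanodeStorageInfo knows about: only completed blocks that
    // have been reported via an IBR ever land here.
    static Set<String> storageBlocks = new HashSet<>();

    // A readiness check that iterates only reported blocks: any block missing
    // from storageBlocks can never hold up decommission.
    static boolean canDecommission(Set<String> insufficientlyReplicated) {
        for (String blk : storageBlocks) {
            if (insufficientlyReplicated.contains(blk)) {
                return false; // a known block still needs more replicas
            }
        }
        return true;
    }

    public static void main(String[] args) {
        storageBlocks.add("blk_1"); // completed block, reported via IBR
        Set<String> underReplicated = new HashSet<>();
        underReplicated.add("blk_2_UC"); // UC block: never reached the storage

        // Succeeds even though blk_2_UC is only hosted here - the unexpected
        // behaviour this issue reports.
        System.out.println(canDecommission(underReplicated));
    }
}
```

In this model the UC block is simply invisible to the readiness check, which matches
the reproduction steps above.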



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-13671) Namenode deletes large dir slowly caused by FoldedTreeSet#removeAndGet

2020-04-09 Thread Brahma Reddy Battula (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-13671?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Brahma Reddy Battula updated HDFS-13671:

Target Version/s: 3.4.0  (was: 3.3.0)

Bulk update: moved all 3.3.0 non-blocker issues, please move back if it is a 
blocker.

> Namenode deletes large dir slowly caused by FoldedTreeSet#removeAndGet
> --
>
> Key: HDFS-13671
> URL: https://issues.apache.org/jira/browse/HDFS-13671
> Project: Hadoop HDFS
>  Issue Type: Bug
>Affects Versions: 3.1.0, 3.0.3
>Reporter: Yiqun Lin
>Priority: Major
>
> NameNode hung when deleting large files/blocks. The stack info:
> {code}
> "IPC Server handler 4 on 8020" #87 daemon prio=5 os_prio=0 
> tid=0x7fb505b27800 nid=0x94c3 runnable [0x7fa861361000]
>java.lang.Thread.State: RUNNABLE
>   at 
> org.apache.hadoop.hdfs.util.FoldedTreeSet.compare(FoldedTreeSet.java:474)
>   at 
> org.apache.hadoop.hdfs.util.FoldedTreeSet.removeAndGet(FoldedTreeSet.java:849)
>   at 
> org.apache.hadoop.hdfs.util.FoldedTreeSet.remove(FoldedTreeSet.java:911)
>   at 
> org.apache.hadoop.hdfs.server.blockmanagement.DatanodeStorageInfo.removeBlock(DatanodeStorageInfo.java:252)
>   at 
> org.apache.hadoop.hdfs.server.blockmanagement.BlocksMap.removeBlock(BlocksMap.java:194)
>   at 
> org.apache.hadoop.hdfs.server.blockmanagement.BlocksMap.removeBlock(BlocksMap.java:108)
>   at 
> org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.removeBlockFromMap(BlockManager.java:3813)
>   at 
> org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.removeBlock(BlockManager.java:3617)
>   at 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.removeBlocks(FSNamesystem.java:4270)
>   at 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.deleteInternal(FSNamesystem.java:4244)
>   at 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.deleteInt(FSNamesystem.java:4180)
>   at 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.delete(FSNamesystem.java:4164)
>   at 
> org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.delete(NameNodeRpcServer.java:871)
>   at 
> org.apache.hadoop.hdfs.server.namenode.AuthorizationProviderProxyClientProtocol.delete(AuthorizationProviderProxyClientProtocol.java:311)
>   at 
> org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.delete(ClientNamenodeProtocolServerSideTranslatorPB.java:625)
>   at 
> org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)
>   at 
> org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:617)
> {code}
> In the current deletion logic in the NameNode, there are mainly two steps:
> * Collect the INodes and all blocks to be deleted, then delete the INodes.
> * Remove the blocks chunk by chunk in a loop.
> Intuitively the first step should be the more expensive operation and take 
> more time. However, we now consistently see the NN hang during the 
> remove-block operation.
> Looking into this: the new structure {{FoldedTreeSet}} was introduced for 
> better performance in handling FBRs/IBRs. But compared with the earlier 
> implementation of the remove-block logic, {{FoldedTreeSet}} seems slower, 
> since it takes additional time to rebalance the tree on each removal. When 
> there are many blocks to be removed/deleted, this looks bad.
> For get-type operations, {{DatanodeStorageInfo}} only provides 
> {{getBlockIterator}} to return a block iterator; there is no get operation 
> for a specified block. Do we still need {{FoldedTreeSet}} in 
> {{DatanodeStorageInfo}}? As we know, {{FoldedTreeSet}} benefits gets, not 
> updates. Maybe we can revert this to the earlier implementation.
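The chunk-by-chunk removal in the second step can be sketched as below. This is a
simplified model rather than the FSNamesystem code, and the chunk size is an
assumption; the point is that the per-block removal inside each chunk is where the
tree-rebalancing cost is paid:

```java
import java.util.ArrayList;
import java.util.List;

// Simplified model of the chunked remove-blocks loop described above.
class ChunkedBlockRemoval {
    static final int BLOCK_DELETION_INCREMENT = 1000; // assumed chunk size

    // Returns the number of chunks (i.e. lock acquisitions) needed.
    static int removeBlocks(List<String> collectedBlocks) {
        int chunks = 0;
        for (int start = 0; start < collectedBlocks.size();
             start += BLOCK_DELETION_INCREMENT) {
            // writeLock(); remove blocks [start, start + increment) from the
            // BlocksMap and each DatanodeStorageInfo - this inner removal is
            // where a balanced tree pays O(log n) per block, while the earlier
            // intrusive-list implementation unlinked in O(1); writeUnlock();
            chunks++;
        }
        return chunks;
    }

    public static void main(String[] args) {
        List<String> blocks = new ArrayList<>();
        for (int i = 0; i < 2500; i++) blocks.add("blk_" + i);
        System.out.println(removeBlocks(blocks));
    }
}
```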






[jira] [Updated] (HDFS-12197) Do the HDFS dist stitching in hadoop-hdfs-project

2020-04-09 Thread Brahma Reddy Battula (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-12197?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Brahma Reddy Battula updated HDFS-12197:

Target Version/s: 3.4.0  (was: 3.3.0)

Bulk update: moved all 3.3.0 non-blocker issues, please move back if it is a 
blocker.

> Do the HDFS dist stitching in hadoop-hdfs-project
> -
>
> Key: HDFS-12197
> URL: https://issues.apache.org/jira/browse/HDFS-12197
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: build
>Affects Versions: 3.0.0-alpha4
>Reporter: Andrew Wang
>Priority: Major
>
> Problem reported by [~lars_francke] on HDFS-11596. We can no longer easily 
> start a namenode and datanode from the source directory without doing a full 
> build per the wiki instructions: 
> https://wiki.apache.org/hadoop/HowToSetupYourDevelopmentEnvironment
> This is because we don't have a top-level dist for HDFS. $HADOOP_YARN_HOME 
> for instance can be set to {{hadoop-yarn-project/target}}, but 
> $HADOOP_HDFS_HOME goes into the submodule: 
> {{hadoop-hdfs-project/hadoop-hdfs/target}}. This means it's missing the files 
> from the sibling hadoop-hdfs-client module (which is required by the 
> namenode), but also other siblings like nfs and httpfs.
> So, I think the right fix is doing the dist stitching at the 
> {{hadoop-hdfs-project}} level where we can aggregate all the child modules, 
> and pointing $HADOOP_HDFS_HOME at this directory.






[jira] [Updated] (HDFS-12207) A few DataXceiver#writeBlock cleanups related to optional storage IDs and types

2020-04-09 Thread Brahma Reddy Battula (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-12207?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Brahma Reddy Battula updated HDFS-12207:

Target Version/s: 3.4.0  (was: 3.3.0)

Bulk update: moved all 3.3.0 non-blocker issues, please move back if it is a 
blocker.

> A few DataXceiver#writeBlock cleanups related to optional storage IDs and 
> types
> ---
>
> Key: HDFS-12207
> URL: https://issues.apache.org/jira/browse/HDFS-12207
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: datanode
>Affects Versions: 3.0.0-alpha4
>Reporter: Andrew Wang
>Assignee: Sean Mackrory
>Priority: Major
>
> Here's the conversation that [~ehiggs] and I had on HDFS-12151 regarding some 
> improvements:
> bq. Should we use nst > 0 rather than targetStorageTypes.length > 0 (amended) 
> here for clarity?
> Yes.
> bq. Should the targetStorageTypes.length > 0 check really be nsi > 0? We 
> could elide it then since it's already captured in the outside if.
> This does look redundant since targetStorageIds.length will be either 0 or == 
> targetStorageTypes.length
> bq. Finally, I don't understand why we need to add the targeted ID/type for 
> checkAccess. Each DN only needs to validate itself, yea? BTSM#checkAccess 
> indicates this in its javadoc, but it looks like we run through ourselves and 
> the targets each time:
> That seems like a good simplification. I think I had assumed the BTI and 
> requested types being checked should be the same (String - String, uint64 - 
> uint64); but I don't see a reason why they have to be. Chris Douglas, what do 
> you think?






[jira] [Updated] (HDFS-13177) Investigate and fix DFSStripedOutputStream handling of DSQuotaExceededException

2020-04-09 Thread Brahma Reddy Battula (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-13177?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Brahma Reddy Battula updated HDFS-13177:

Target Version/s: 3.4.0  (was: 3.3.0)

Bulk update: moved all 3.3.0 non-blocker issues, please move back if it is a 
blocker.

> Investigate and fix DFSStripedOutputStream handling of 
> DSQuotaExceededException
> ---
>
> Key: HDFS-13177
> URL: https://issues.apache.org/jira/browse/HDFS-13177
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: Xiao Chen
>Priority: Major
>
> This is the DFSStripedOutputStream equivalent of HDFS-13164






[jira] [Updated] (HDFS-13879) FileSystem: Add allowSnapshot, disallowSnapshot, getSnapshotDiffReport and getSnapshottableDirListing

2020-04-09 Thread Brahma Reddy Battula (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-13879?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Brahma Reddy Battula updated HDFS-13879:

Target Version/s: 3.4.0  (was: 3.3.0)

Bulk update: moved all 3.3.0 non-blocker issues, please move back if it is a 
blocker.

> FileSystem: Add allowSnapshot, disallowSnapshot, getSnapshotDiffReport and 
> getSnapshottableDirListing
> -
>
> Key: HDFS-13879
> URL: https://issues.apache.org/jira/browse/HDFS-13879
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Affects Versions: 3.1.1
>Reporter: Siyao Meng
>Priority: Major
>
> I wonder whether we should add allowSnapshot() and disallowSnapshot() to 
> FileSystem abstract class.
> I think we should because createSnapshot(), renameSnapshot() and 
> deleteSnapshot() are already part of it.
> Any reason why we don't want to do this?
> Thanks!
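A minimal sketch of what such additions might look like, following the existing
{{FileSystem}} convention where unsupported operations throw
UnsupportedOperationException in the base class and subclasses like
DistributedFileSystem override them. The signatures below are assumptions for
illustration, not a committed API:

```java
import java.io.IOException;

// Hedged sketch: base-class defaults that throw, mirroring how
// createSnapshot() behaves in FileSystem today. Signatures are assumptions.
abstract class SnapshotCapableFileSystem {
    public void allowSnapshot(String path) throws IOException {
        throw new UnsupportedOperationException(
            getClass().getSimpleName() + " doesn't support allowSnapshot");
    }

    public void disallowSnapshot(String path) throws IOException {
        throw new UnsupportedOperationException(
            getClass().getSimpleName() + " doesn't support disallowSnapshot");
    }
}
```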






[jira] [Updated] (HDFS-11091) Implement a getTrashRoot that does not fall-back

2020-04-09 Thread Brahma Reddy Battula (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-11091?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Brahma Reddy Battula updated HDFS-11091:

Target Version/s: 3.4.0  (was: 3.3.0)

Bulk update: moved all 3.3.0 non-blocker issues, please move back if it is a 
blocker.

> Implement a getTrashRoot that does not fall-back
> 
>
> Key: HDFS-11091
> URL: https://issues.apache.org/jira/browse/HDFS-11091
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Affects Versions: 2.8.0
>Reporter: Xiao Chen
>Assignee: Yuanbo Liu
>Priority: Major
>
> From HDFS-10756's 
> [discussion|https://issues.apache.org/jira/browse/HDFS-10756?focusedCommentId=15623755&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-15623755]:
> {{getTrashRoot}} is supposed to return the trash dir considering encryption 
> zone. But if there's an error encountered (e.g. access control exception), it 
> falls back to the default trash dir.
> Although there is a warning message about this, it is still a somewhat 
> surprising behavior. The fall back was added by HDFS-9799 for compatibility 
> reasons. This jira is to propose we add a getTrashRoot that throws, which 
> will actually be more user-friendly.
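The difference between the current fall-back and the proposed throwing variant can be
sketched as follows; the names are hypothetical and the NameNode error is simulated,
so this is not the actual FileSystem API:

```java
// Toy illustration: fallback getTrashRoot vs a strict variant that throws.
class TrashRootSketch {
    // Simulates resolving the encryption-zone trash root when the NameNode
    // call fails (e.g. with an access control exception).
    static String ezTrashRoot(String user) throws Exception {
        throw new Exception("AccessControlException contacting the NameNode");
    }

    // Current behaviour: swallow the error and silently fall back to the
    // default trash dir, which can surprise callers.
    static String getTrashRoot(String user) {
        try {
            return ezTrashRoot(user);
        } catch (Exception e) {
            return "/user/" + user + "/.Trash"; // fall-back kept by HDFS-9799
        }
    }

    // Proposed behaviour: propagate the error so callers can react to it.
    static String getTrashRootStrict(String user) throws Exception {
        return ezTrashRoot(user);
    }
}
```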






[jira] [Updated] (HDFS-11109) ViewFileSystem Df command should work even when the backing NameServices are down

2020-04-09 Thread Brahma Reddy Battula (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-11109?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Brahma Reddy Battula updated HDFS-11109:

Target Version/s: 3.4.0  (was: 3.3.0)

Bulk update: moved all 3.3.0 non-blocker issues, please move back if it is a 
blocker.

> ViewFileSystem Df command should work even when the backing NameServices are 
> down
> -
>
> Key: HDFS-11109
> URL: https://issues.apache.org/jira/browse/HDFS-11109
> Project: Hadoop HDFS
>  Issue Type: Task
>Affects Versions: 3.0.0-alpha1
>Reporter: Manoj Govindassamy
>Assignee: Manoj Govindassamy
>Priority: Major
>  Labels: viewfs
>
> With HDFS-11058, the Df command works well with ViewFileSystem. A federated 
> cluster can be backed by several NameServices, each managing its own 
> namespace. Even when some of the NameServices are down, the federated 
> cluster will continue to work well for the NameServices that are alive.
> But the {{hadoop fs -df}} command, when run against the federated cluster, 
> expects all the backing NameServices to be up and running; otherwise the 
> command errors out with an exception.
> It would be preferable to have the federated cluster commands highly 
> available, to match the namespace partition availability.
> {noformat}
> #hadoop fs -df -h /
> df: Call From manoj-mbp.local/172.16.3.66 to localhost:52001 failed on 
> connection exception: java.net.ConnectException: Connection refused; For more 
> details see:  http://wiki.apache.org/hadoop/ConnectionRefused
> {noformat}
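The desired behaviour can be sketched as an aggregation that skips NameServices
which are down instead of failing the whole command. This is a hypothetical helper,
not the actual ViewFileSystem code:

```java
import java.util.Map;

// Sketch: aggregate df capacity across mount points; a null capacity models a
// NameService that refused the connection.
class ViewFsDfSketch {
    static long aggregateCapacity(Map<String, Long> nsCapacity) {
        long total = 0;
        for (Map.Entry<String, Long> e : nsCapacity.entrySet()) {
            if (e.getValue() == null) {
                continue; // NameService down: report what we can, don't fail
            }
            total += e.getValue();
        }
        return total;
    }
}
```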






[jira] [Updated] (HDFS-12597) Add CryptoOutputStream to WebHdfsFileSystem create call.

2020-04-09 Thread Brahma Reddy Battula (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-12597?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Brahma Reddy Battula updated HDFS-12597:

Target Version/s: 3.4.0  (was: 3.3.0)

Bulk update: moved all 3.3.0 non-blocker issues, please move back if it is a 
blocker.

> Add CryptoOutputStream to WebHdfsFileSystem create call.
> 
>
> Key: HDFS-12597
> URL: https://issues.apache.org/jira/browse/HDFS-12597
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: encryption, kms, webhdfs
>Reporter: Rushabh Shah
>Assignee: Rushabh Shah
>Priority: Major
>







[jira] [Updated] (HDFS-13928) Add the corner case testings for log roll

2020-04-09 Thread Brahma Reddy Battula (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-13928?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Brahma Reddy Battula updated HDFS-13928:

Target Version/s: 3.4.0  (was: 3.3.0)

Bulk update: moved all 3.3.0 non-blocker issues, please move back if it is a 
blocker.

> Add the corner case testings for log roll
> -
>
> Key: HDFS-13928
> URL: https://issues.apache.org/jira/browse/HDFS-13928
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Affects Versions: 3.1.1
>Reporter: Yiqun Lin
>Priority: Major
>
> We found some corner cases for log roll when doing JournalNode migration in 
> our cluster. We use an online approach for the migration, and it can trigger 
> some corner cases:
>  * Multiple edits_inprogress* files exist in the edits dir if log roll is 
> not updated correctly. We can divide this into two cases:
>  1. A redundant in-progress file with a higher txid than the current 
> in-progress file
>  2. A redundant in-progress file with a lower txid than the current 
> in-progress file
>  * In HA mode, if the SBN is down, what does the behavior of log roll become?
> We can complete the log roll UTs for the above cases.






[jira] [Updated] (HDFS-11066) Improve test coverage for ISA-L native coder

2020-04-09 Thread Brahma Reddy Battula (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-11066?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Brahma Reddy Battula updated HDFS-11066:

Target Version/s: 3.4.0  (was: 3.3.0)

Bulk update: moved all 3.3.0 non-blocker issues, please move back if it is a 
blocker.

> Improve test coverage for ISA-L native coder
> 
>
> Key: HDFS-11066
> URL: https://issues.apache.org/jira/browse/HDFS-11066
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Affects Versions: 3.0.0-alpha1
>Reporter: Wei-Chiu Chuang
>Assignee: Huafeng Wang
>Priority: Major
>  Labels: hdfs-ec-3.0-nice-to-have
>
> Some issues were introduced but not found in time due to the lack of 
> necessary Jenkins support for the ISA-L related build options. We should 
> re-enable the ISA-L related build options in the Jenkins system, so as to 
> ensure the quality of the related native code.






[jira] [Updated] (HDFS-14786) A new block placement policy tolerating availability zone failure

2020-04-09 Thread Brahma Reddy Battula (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-14786?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Brahma Reddy Battula updated HDFS-14786:

Target Version/s: 3.4.0  (was: 3.3.0, 2.9.3)

Bulk update: moved all 3.3.0 non-blocker issues, please move back if it is a 
blocker.

> A new block placement policy tolerating availability zone failure
> -
>
> Key: HDFS-14786
> URL: https://issues.apache.org/jira/browse/HDFS-14786
> Project: Hadoop HDFS
>  Issue Type: New Feature
>  Components: block placement
>Reporter: Mingliang Liu
>Priority: Major
>
> {{NetworkTopology}} assumes "/datacenter/rack/host" 3 layer topology. Default 
> block placement policies are rack awareness for better fault tolerance. Newer 
> block placement policy like {{BlockPlacementPolicyRackFaultTolerant}} tries 
> its best to place the replicas to most racks, which further tolerates more 
> racks failing. HADOOP-8470 brought {{NetworkTopologyWithNodeGroup}} to add 
> another layer under rack, i.e. "/datacenter/rack/host/nodegroup" 4 layer 
> topology. With that, replicas within a rack can be placed in different node 
> groups for better isolation.
> Existing block placement policies tolerate one rack failure since at least 
> two racks are chosen in those cases. Chances are all replicas could be placed 
> in the same datacenter, though there are multiple data centers in the same 
> cluster topology. In other words, fault of higher layers beyond rack is not 
> well tolerated.
> However, more deployments in the public cloud are leveraging multiple 
> availability zones (AZs) for high availability, since the inter-AZ latency 
> seems affordable in many cases. In a single AZ, some cloud providers like 
> AWS support 
> [partitioned placement 
> groups|https://docs.aws.amazon.com/AWSEC2/latest/UserGuide/placement-groups.html#placement-groups-partition]
>  which basically are different racks. A simple network topology mapped to 
> HDFS is "/availabilityzone/rack/host" 3 layers.
> To achieve high availability tolerating zone failure, this JIRA proposes a 
> new data placement policy which tries its best to place replicas in most AZs, 
> most racks, and most evenly distributed.
> Examples with 3 replicas, we choose racks as following:
>  - 1AZ: fall back to {{BlockPlacementPolicyRackFaultTolerant}} to place among 
> most racks
>  - 2AZ: randomly choose one rack in one AZ and randomly choose two racks in 
> the other AZ
>  - 3AZ: randomly choose one rack in every AZ
>  - 4AZ: randomly choose three AZs and randomly choose one rack in every AZ
> After racks are picked, hosts are chosen randomly within racks, honoring 
> local storage, favored nodes, excluded nodes, storage types, etc. Data may 
> become imbalanced if the topology is very uneven across AZs. This seems not 
> to be a problem, as in the public cloud infrastructure provisioning is more 
> flexible than 1P.
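The rack choices enumerated above can be sketched as a small helper. This is
hypothetical, assumes replication factor 3, and is not the actual
BlockPlacementPolicy API:

```java
import java.util.Arrays;

// Sketch of the per-AZ rack counts for 3 replicas, as enumerated above.
class AzRackChoice {
    // Returns how many racks to pick in each chosen AZ, given the number of
    // AZs present in the topology.
    static int[] racksPerAz(int numAzs) {
        if (numAzs <= 1) {
            // Single AZ: fall back to rack-fault-tolerant placement, i.e.
            // spread the 3 replicas over up to 3 racks.
            return new int[] {3};
        } else if (numAzs == 2) {
            return new int[] {1, 2}; // one rack in one AZ, two in the other
        } else {
            return new int[] {1, 1, 1}; // pick 3 AZs, one rack in each
        }
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(racksPerAz(2)));
    }
}
```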






[jira] [Updated] (HDFS-12953) XORRawDecoder.doDecode throws NullPointerException

2020-04-09 Thread Brahma Reddy Battula (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-12953?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Brahma Reddy Battula updated HDFS-12953:

Target Version/s: 3.4.0  (was: 3.3.0)

Bulk update: moved all 3.3.0 non-blocker issues, please move back if it is a 
blocker.

> XORRawDecoder.doDecode throws NullPointerException
> --
>
> Key: HDFS-12953
> URL: https://issues.apache.org/jira/browse/HDFS-12953
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: erasure-coding
>Affects Versions: 3.0.0
>Reporter: Lei (Eddy) Xu
>Assignee: Aihua Xu
>Priority: Major
> Attachments: HDFS-12953.test.patch
>
>
> Thanks [~danielpol] for the report on HDFS-12860.
> {noformat}
> 17/11/30 04:19:55 INFO mapreduce.Job: map 0% reduce 0%
> 17/11/30 04:20:01 INFO mapreduce.Job: Task Id : 
> attempt_1512036058655_0003_m_02_0, Status : FAILED
> Error: java.lang.NullPointerException
> at 
> org.apache.hadoop.io.erasurecode.rawcoder.XORRawDecoder.doDecode(XORRawDecoder.java:83)
> at 
> org.apache.hadoop.io.erasurecode.rawcoder.RawErasureDecoder.decode(RawErasureDecoder.java:106)
> at 
> org.apache.hadoop.io.erasurecode.rawcoder.RawErasureDecoder.decode(RawErasureDecoder.java:170)
> at 
> org.apache.hadoop.hdfs.StripeReader.decodeAndFillBuffer(StripeReader.java:423)
> at 
> org.apache.hadoop.hdfs.StatefulStripeReader.decode(StatefulStripeReader.java:94)
> at org.apache.hadoop.hdfs.StripeReader.readStripe(StripeReader.java:382)
> at 
> org.apache.hadoop.hdfs.DFSStripedInputStream.readOneStripe(DFSStripedInputStream.java:318)
> at 
> org.apache.hadoop.hdfs.DFSStripedInputStream.readWithStrategy(DFSStripedInputStream.java:391)
> at org.apache.hadoop.hdfs.DFSInputStream.read(DFSInputStream.java:813)
> at java.io.DataInputStream.read(DataInputStream.java:149)
> at 
> org.apache.hadoop.examples.terasort.TeraInputFormat$TeraRecordReader.nextKeyValue(TeraInputFormat.java:257)
> at 
> org.apache.hadoop.mapred.MapTask$NewTrackingRecordReader.nextKeyValue(MapTask.java:563)
> at 
> org.apache.hadoop.mapreduce.task.MapContextImpl.nextKeyValue(MapContextImpl.java:80)
> at 
> org.apache.hadoop.mapreduce.lib.map.WrappedMapper$Context.nextKeyValue(WrappedMapper.java:91)
> at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:145)
> at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:794)
> at org.apache.hadoop.mapred.MapTask.run(MapTask.java:342)
> at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:174)
> at java.security.AccessController.doPrivileged(Native Method)
> at javax.security.auth.Subject.doAs(Subject.java:422)
> at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1962)
> at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:168)
> {noformat}






[jira] [Updated] (HDFS-12355) Webhdfs needs to support encryption zones.

2020-04-09 Thread Brahma Reddy Battula (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-12355?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Brahma Reddy Battula updated HDFS-12355:

Target Version/s: 3.4.0  (was: 3.3.0)

Bulk update: moved all 3.3.0 non-blocker issues, please move back if it is a 
blocker.

> Webhdfs needs to support encryption zones.
> --
>
> Key: HDFS-12355
> URL: https://issues.apache.org/jira/browse/HDFS-12355
> Project: Hadoop HDFS
>  Issue Type: New Feature
>  Components: encryption, kms
>Reporter: Rushabh Shah
>Assignee: Rushabh Shah
>Priority: Major
>
> Will create sub-tasks.
> 1. Add fsserverdefaults to {{NamenodeWebhdfsMethods}}.
> 2. Return the file encryption info in the {{GETFILESTATUS}} call from 
> {{NamenodeWebhdfsMethods}}.
> 3. Add {{CryptoInputStream}} and {{CryptoOutputStream}} wrapping to the 
> input and output streams.
> 4. {{WebhdfsFilesystem}} needs to acquire a KMS delegation token from the 
> KMS servers.






[jira] [Updated] (HDFS-13285) Improve runtime for TestReadStripedFileWithMissingBlocks#testReadFileWithMissingBlocks

2020-04-09 Thread Brahma Reddy Battula (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-13285?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Brahma Reddy Battula updated HDFS-13285:

Target Version/s: 3.4.0  (was: 3.3.0)

Bulk update: moved all 3.3.0 non-blocker issues, please move back if it is a 
blocker.

> Improve runtime for 
> TestReadStripedFileWithMissingBlocks#testReadFileWithMissingBlocks 
> ---
>
> Key: HDFS-13285
> URL: https://issues.apache.org/jira/browse/HDFS-13285
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Reporter: Ajay Kumar
>Priority: Major
>
> TestReadStripedFileWithMissingBlocks#testReadFileWithMissingBlocks takes 
> anywhere between 2-4 minutes depending on the host machine. This jira 
> intends to make it leaner.
> cc: [~elgoiri]






[jira] [Updated] (HDFS-12240) Document WebHDFS rename API parameter renameoptions

2020-04-09 Thread Brahma Reddy Battula (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-12240?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Brahma Reddy Battula updated HDFS-12240:

Target Version/s: 3.4.0  (was: 3.3.0)

Bulk update: moved all 3.3.0 non-blocker issues, please move back if it is a 
blocker.

> Document WebHDFS rename API parameter renameoptions
> ---
>
> Key: HDFS-12240
> URL: https://issues.apache.org/jira/browse/HDFS-12240
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: Wei-Chiu Chuang
>Assignee: Wei-Chiu Chuang
>Priority: Major
>
> The {{FileSystem#rename}} API has an overloaded version that carries an extra 
> parameter "renameoptions". The extra parameter can be used to support trash 
> or support overwriting.
> The WebHDFS REST API does not document this parameter, so filing this jira 
> to get it documented.






[jira] [Updated] (HDFS-12379) NameNode getListing should use FileStatus instead of HdfsFileStatus

2020-04-09 Thread Brahma Reddy Battula (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-12379?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Brahma Reddy Battula updated HDFS-12379:

Target Version/s: 3.4.0  (was: 3.3.0)

Bulk update: moved all 3.3.0 non-blocker issues, please move back if it is a 
blocker.

> NameNode getListing should use FileStatus instead of HdfsFileStatus
> ---
>
> Key: HDFS-12379
> URL: https://issues.apache.org/jira/browse/HDFS-12379
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: namenode
>Reporter: Zhe Zhang
>Priority: Major
>
> The public {{listStatus}} APIs in {{FileSystem}} and 
> {{DistributedFileSystem}} expose {{FileStatus}} instead of 
> {{HdfsFileStatus}}. Therefore it is a waste to create the more expensive 
> {{HdfsFileStatus}} objects on NameNode.
> It should be a simple change similar to HDFS-11641. Marking this incompatible 
> because the wire protocol change is incompatible. Not sure which downstream apps are 
> affected by this incompatibility; maybe those directly using curl, or writing 
> their own HDFS client.






[jira] [Commented] (HDFS-15269) NameNode should check the authorization API version only once during initialization

2020-04-09 Thread Brahma Reddy Battula (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-15269?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17079611#comment-17079611
 ] 

Brahma Reddy Battula commented on HDFS-15269:
-

[~weichiu] thanks for reporting and fixing this. Added fix version 3.4.0, as this 
was merged to 3.4.0 as well.

> NameNode should check the authorization API version only once during 
> initialization
> ---
>
> Key: HDFS-15269
> URL: https://issues.apache.org/jira/browse/HDFS-15269
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: namenode
>Affects Versions: 3.3.0
>Reporter: Wei-Chiu Chuang
>Assignee: Wei-Chiu Chuang
>Priority: Blocker
> Fix For: 3.3.0, 3.4.0
>
>
> After HDFS-14743, every authorization check logs a message like the following
> {noformat}
> 2020-04-07 23:44:55,276 INFO org.apache.hadoop.security.UserGroupInformation: 
> Default authorization provider supports the new authorization provider API
> 2020-04-07 23:44:55,276 INFO org.apache.hadoop.security.UserGroupInformation: 
> Default authorization provider supports the new authorization provider API
> 2020-04-07 23:44:55,277 INFO org.apache.hadoop.security.UserGroupInformation: 
> Default authorization provider supports the new authorization provider API
> 2020-04-07 23:44:55,278 INFO org.apache.hadoop.security.UserGroupInformation: 
> Default authorization provider supports the new authorization provider API
> {noformat}
> The intent was to check the authorization provider's API compatibility only during 
> initialization, but apparently it is checked on every call. This results in a serious 
> performance regression.
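The fix idea, probing the provider's API level once and caching the result instead of repeating the (logging) check per authorization call, can be sketched as follows. This is a hypothetical illustration; names like `probeProviderApi` and `authorize` are invented and are not the HDFS code:

```java
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical sketch of the fix idea: probe the provider API once at class
// initialization, cache the result, and never repeat the probe per call.
public class OnceCheckDemo {
  static final AtomicInteger probes = new AtomicInteger();

  // Stands in for the expensive "does the provider support the new API?" probe
  // that also emits the INFO log line.
  static boolean probeProviderApi() {
    probes.incrementAndGet();
    System.out.println("Default authorization provider supports the new API");
    return true;
  }

  // Evaluated exactly once, during initialization.
  static final boolean NEW_API_SUPPORTED = probeProviderApi();

  static void authorize(String path) {
    if (NEW_API_SUPPORTED) {
      // fast path: no per-call probe, no per-call log line
    }
  }

  public static void main(String[] args) {
    for (int i = 0; i < 1000; i++) {
      authorize("/p" + i);
    }
    System.out.println(probes.get()); // probed only once
  }
}
```

A thousand authorization calls produce a single probe and a single log line, instead of one per call.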






[jira] [Updated] (HDFS-15269) NameNode should check the authorization API version only once during initialization

2020-04-09 Thread Brahma Reddy Battula (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-15269?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Brahma Reddy Battula updated HDFS-15269:

Fix Version/s: 3.4.0

> NameNode should check the authorization API version only once during 
> initialization
> ---
>
> Key: HDFS-15269
> URL: https://issues.apache.org/jira/browse/HDFS-15269
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: namenode
>Affects Versions: 3.3.0
>Reporter: Wei-Chiu Chuang
>Assignee: Wei-Chiu Chuang
>Priority: Blocker
> Fix For: 3.3.0, 3.4.0
>
>
> After HDFS-14743, every authorization check logs a message like the following
> {noformat}
> 2020-04-07 23:44:55,276 INFO org.apache.hadoop.security.UserGroupInformation: 
> Default authorization provider supports the new authorization provider API
> 2020-04-07 23:44:55,276 INFO org.apache.hadoop.security.UserGroupInformation: 
> Default authorization provider supports the new authorization provider API
> 2020-04-07 23:44:55,277 INFO org.apache.hadoop.security.UserGroupInformation: 
> Default authorization provider supports the new authorization provider API
> 2020-04-07 23:44:55,278 INFO org.apache.hadoop.security.UserGroupInformation: 
> Default authorization provider supports the new authorization provider API
> {noformat}
> The intent was to check the authorization provider's API compatibility only during 
> initialization, but apparently it is checked on every call. This results in a serious 
> performance regression.






[jira] [Commented] (HDFS-15241) Distcp print wrong log info when use -log

2020-03-27 Thread Brahma Reddy Battula (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-15241?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17068582#comment-17068582
 ] 

Brahma Reddy Battula commented on HDFS-15241:
-

[~rain_lyy] nice finding; we can use "bytescopied" here.

> Distcp print wrong log info when use -log
> -
>
> Key: HDFS-15241
> URL: https://issues.apache.org/jira/browse/HDFS-15241
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: distcp
>Affects Versions: 3.1.1
>Reporter: liuyanyu
>Priority: Minor
> Attachments: image-2020-03-25-17-28-33-394.png
>
>
> When running distcp with -log /logpath -v, distcp will print the copy status and file 
> info to /logpath, but it logs the wrong file size. The log prints as 
> follows:
> FILE_COPIED: source=hdfs://ns1/test/stax2-api-3.1.4.jar, size=161867 --> 
> target=hdfs://ns1/tmp/target/stax2-api-3.1.4.jar, size=0
> As I analyzed, the root cause is as follows:
> targetFileStatus is obtained before copying, so targetFileStatus is null. It should 
> be fetched again after the file copy completes.
> !image-2020-03-25-17-28-33-394.png!
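The root cause above can be reproduced with plain java.nio (a toy, not the distcp source; `copyAndMeasure` is an invented name): a size captured before the copy runs is stale, so the target's status must be re-fetched after the copy completes.

```java
import java.io.IOException;
import java.nio.file.*;

// Toy illustration: measuring the target before the copy yields 0,
// measuring it again after the copy yields the real size.
public class CopyLogDemo {
  static long[] copyAndMeasure(Path src, Path dst) throws IOException {
    long before = Files.exists(dst) ? Files.size(dst) : 0; // status taken too early
    Files.copy(src, dst);
    long after = Files.size(dst); // re-fetched after the copy completes
    return new long[] { before, after };
  }

  public static void main(String[] args) throws IOException {
    Path dir = Files.createTempDirectory("copylog");
    Path src = Files.writeString(dir.resolve("src"), "0123456789");
    long[] sizes = copyAndMeasure(src, dir.resolve("dst"));
    System.out.println("before=" + sizes[0] + " after=" + sizes[1]);
    // prints before=0 after=10
  }
}
```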






[jira] [Commented] (HDFS-15244) Make Decommission related configurations runtime refreshable

2020-03-27 Thread Brahma Reddy Battula (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-15244?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17068561#comment-17068561
 ] 

Brahma Reddy Battula commented on HDFS-15244:
-

Duplicate of which jira?

> Make Decommission related configurations runtime refreshable
> 
>
> Key: HDFS-15244
> URL: https://issues.apache.org/jira/browse/HDFS-15244
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Reporter: Ayush Saxena
>Assignee: Ayush Saxena
>Priority: Major
>
> There are certain configurations that can be tuned to increase the speed of 
> decommissioning, but restarting the NameNode to apply them is a heavy operation.
> Propose to make these configurations reconfigurable at runtime in order 
> to speed up decommissioning as needed.
> Proposed Configs :
> * {{dfs.namenode.replication.work.multiplier.per.iteration}}
> * {{dfs.namenode.replication.max-streams-hard-limit}}
> * {{dfs.namenode.replication.max-streams}}
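The idea can be sketched as a generic hot-reloadable setting store. This is a hypothetical sketch under invented names (`LiveConf`, `refresh`); the real work would go through Hadoop's reconfiguration machinery rather than a bare map:

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Generic sketch: hot settings live behind a concurrent store, so an admin
// RPC can update them and readers pick up the new value without a restart.
public class LiveConf {
  private final Map<String, Integer> values = new ConcurrentHashMap<>();

  public LiveConf() {
    values.put("dfs.namenode.replication.max-streams", 2);
  }

  public int getInt(String key, int dflt) {
    return values.getOrDefault(key, dflt);
  }

  // Called from an admin "reconfig" endpoint; readers see the new value
  // on their next get, no NameNode restart required.
  public void refresh(String key, int newValue) {
    values.put(key, newValue);
  }

  public static void main(String[] args) {
    LiveConf conf = new LiveConf();
    System.out.println(conf.getInt("dfs.namenode.replication.max-streams", 2)); // 2
    conf.refresh("dfs.namenode.replication.max-streams", 8); // speed up decommission
    System.out.println(conf.getInt("dfs.namenode.replication.max-streams", 2)); // 8
  }
}
```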






[jira] [Assigned] (HDFS-15118) [SBN Read] Slow clients when Observer reads are enabled but there are no Observers on the cluster.

2020-03-23 Thread Brahma Reddy Battula (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-15118?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Brahma Reddy Battula reassigned HDFS-15118:
---

Assignee: Chen Liang  (was: Brahma Reddy Battula)

> [SBN Read] Slow clients when Observer reads are enabled but there are no 
> Observers on the cluster.
> --
>
> Key: HDFS-15118
> URL: https://issues.apache.org/jira/browse/HDFS-15118
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: hdfs-client
>Affects Versions: 2.10.0
>Reporter: Konstantin Shvachko
>Assignee: Chen Liang
>Priority: Major
> Fix For: 3.3.0, 3.1.4, 3.2.2, 2.10.1
>
> Attachments: HDFS-15118.001.patch, HDFS-15118.002.patch
>
>
> We see substantial degradation in the performance of HDFS clients when Observer 
> reads are enabled via {{ObserverReadProxyProvider}}, but there are no 
> ObserverNodes on the cluster.






[jira] [Assigned] (HDFS-15118) [SBN Read] Slow clients when Observer reads are enabled but there are no Observers on the cluster.

2020-03-23 Thread Brahma Reddy Battula (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-15118?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Brahma Reddy Battula reassigned HDFS-15118:
---

Assignee: Brahma Reddy Battula  (was: Chen Liang)

> [SBN Read] Slow clients when Observer reads are enabled but there are no 
> Observers on the cluster.
> --
>
> Key: HDFS-15118
> URL: https://issues.apache.org/jira/browse/HDFS-15118
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: hdfs-client
>Affects Versions: 2.10.0
>Reporter: Konstantin Shvachko
>Assignee: Brahma Reddy Battula
>Priority: Major
> Fix For: 3.3.0, 3.1.4, 3.2.2, 2.10.1
>
> Attachments: HDFS-15118.001.patch, HDFS-15118.002.patch
>
>
> We see substantial degradation in the performance of HDFS clients when Observer 
> reads are enabled via {{ObserverReadProxyProvider}}, but there are no 
> ObserverNodes on the cluster.






[jira] [Resolved] (HDFS-15226) Ranger integrates HDFS and discovers NPE

2020-03-16 Thread Brahma Reddy Battula (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-15226?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Brahma Reddy Battula resolved HDFS-15226.
-
   Fix Version/s: (was: 3.2.1)
  (was: 3.2.0)
Target Version/s:   (was: 3.2.1)
  Resolution: Duplicate

Linking the duplicate defect for future reference.

> Ranger integrates HDFS and discovers NPE
> 
>
> Key: HDFS-15226
> URL: https://issues.apache.org/jira/browse/HDFS-15226
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: hdfs
>Affects Versions: 2.7.6
> Environment: Apache Ranger1.2 && Hadoop2.7.6
>Reporter: bianqi
>Priority: Critical
>
> When I integrated Ranger 1.2 with Hadoop 2.7.6, the following NPE occurred 
> when executing hdfs dfs -ls /.
>  However, when I integrated Ranger 1.2 with Hadoop 2.7.1, hdfs dfs -ls / ran 
> without any errors, and the directory listing was displayed normally.
> {quote}java.lang.NullPointerException
>  at java.lang.String.checkBounds(String.java:384)
>  at java.lang.String.(String.java:425)
>  at org.apache.hadoop.hdfs.DFSUtil.bytes2String(DFSUtil.java:337)
>  at org.apache.hadoop.hdfs.DFSUtil.bytes2String(DFSUtil.java:319)
>  at 
> org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker.getINodeAttrs(FSPermissionChecker.java:238)
>  at 
> org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker.checkPermission(FSPermissionChecker.java:183)
>  at 
> org.apache.hadoop.hdfs.server.namenode.FSDirectory.checkPermission(FSDirectory.java:1752)
>  at 
> org.apache.hadoop.hdfs.server.namenode.FSDirStatAndListingOp.getFileInfo(FSDirStatAndListingOp.java:100)
>  at 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getFileInfo(FSNamesystem.java:3832)
>  at 
> org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.getFileInfo(NameNodeRpcServer.java:1012)
>  at 
> org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.getFileInfo(ClientNamenodeProtocolServerSideTranslatorPB.java:855)
>  at 
> org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)
>  at 
> org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:616)
>  at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:982)
>  at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2217)
>  at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2213)
>  at java.security.AccessController.doPrivileged(Native Method)
>  at javax.security.auth.Subject.doAs(Subject.java:422)
>  at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1758)
>  at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2213)
>  DEBUG org.apache.hadoop.ipc.Server: IPC Server handler 1 on 8020: responding 
> to org.apache.hadoop.hdfs.protocol.ClientProtocol.getFileInfo from 
> xx:8502 Call#0 Retry#0
> {quote}
> When I checked and debugged the HDFS source code, I found that 
> pathByNameArr[i] is null.
> {quote}private INodeAttributes getINodeAttrs(byte[][] pathByNameArr, int 
> pathIdx,
>  INode inode, int snapshotId) {
>  INodeAttributes inodeAttrs = inode.getSnapshotINode(snapshotId);
>  if (getAttributesProvider() != null) {
>  String[] elements = new String[pathIdx + 1];
>  for (int i = 0; i < elements.length; i++) {
>  elements[i] = DFSUtil.bytes2String(pathByNameArr[i]);
>  }
>  inodeAttrs = getAttributesProvider().getAttributes(elements, inodeAttrs);
>  }
>  return inodeAttrs;
>  }
>  
> {quote}
> I found that this has been fixed on the trunk branch, but the fix has not yet 
> been merged into the latest 3.2.1 release.
> I hope this patch can be merged into the other branches as soon as 
> possible; thank you very much! 
>  
> {quote}private INodeAttributes getINodeAttrs(byte[][] pathByNameArr, int pathIdx,
>     INode inode, int snapshotId) {
>   INodeAttributes inodeAttrs = inode.getSnapshotINode(snapshotId);
>   if (getAttributesProvider() != null) {
>     String[] elements = new String[pathIdx + 1];
>     /**
>      * {@link INode#getPathComponents(String)} returns a null component
>      * for the root only path "/". Assign an empty string if so.
>      */
>     if (pathByNameArr.length == 1 && pathByNameArr[0] == null) {
>       elements[0] = "";
>     } else {
>       for (int i = 0; i < elements.length; i++) {
>         elements[i] = DFSUtil.bytes2String(pathByNameArr[i]);
>       }
>     }
>     inodeAttrs = getAttributesProvider().getAttributes(elements, inodeAttrs);
>   }
>   return inodeAttrs;
> }
> {quote}
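The guard in the fix can be rendered self-contained (toy names; `DFSUtil.bytes2String` is replaced by an inline UTF-8 conversion, and `toElements` is an invented helper): for the root path "/", the path-components array is `{null}`, so the conversion must substitute an empty string instead of dereferencing null.

```java
import java.nio.charset.StandardCharsets;

// Self-contained rendering of the root-path guard: pathByNameArr for "/" is
// a single null component, which must map to "" rather than throw an NPE.
public class RootPathGuardDemo {
  static String bytes2String(byte[] b) {
    return new String(b, StandardCharsets.UTF_8); // NPE if b is null
  }

  static String[] toElements(byte[][] pathByNameArr, int pathIdx) {
    String[] elements = new String[pathIdx + 1];
    if (pathByNameArr.length == 1 && pathByNameArr[0] == null) {
      elements[0] = ""; // root-only path "/" has a single null component
    } else {
      for (int i = 0; i < elements.length; i++) {
        elements[i] = bytes2String(pathByNameArr[i]);
      }
    }
    return elements;
  }

  public static void main(String[] args) {
    System.out.println(toElements(new byte[][] { null }, 0)[0].isEmpty()); // true
    System.out.println(toElements(
        new byte[][] { "a".getBytes(), "b".getBytes() }, 1)[1]);           // b
  }
}
```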




[jira] [Reopened] (HDFS-15226) Ranger integrates HDFS and discovers NPE

2020-03-16 Thread Brahma Reddy Battula (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-15226?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Brahma Reddy Battula reopened HDFS-15226:
-

> Ranger integrates HDFS and discovers NPE
> 
>
> Key: HDFS-15226
> URL: https://issues.apache.org/jira/browse/HDFS-15226
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: hdfs
>Affects Versions: 2.7.6
> Environment: Apache Ranger1.2 && Hadoop2.7.6
>Reporter: bianqi
>Priority: Critical
> Fix For: 3.2.0, 3.2.1
>
>
> When I integrated Ranger 1.2 with Hadoop 2.7.6, the following NPE occurred 
> when executing hdfs dfs -ls /.
>  However, when I integrated Ranger 1.2 with Hadoop 2.7.1, hdfs dfs -ls / ran 
> without any errors, and the directory listing was displayed normally.
> {quote}java.lang.NullPointerException
>  at java.lang.String.checkBounds(String.java:384)
>  at java.lang.String.(String.java:425)
>  at org.apache.hadoop.hdfs.DFSUtil.bytes2String(DFSUtil.java:337)
>  at org.apache.hadoop.hdfs.DFSUtil.bytes2String(DFSUtil.java:319)
>  at 
> org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker.getINodeAttrs(FSPermissionChecker.java:238)
>  at 
> org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker.checkPermission(FSPermissionChecker.java:183)
>  at 
> org.apache.hadoop.hdfs.server.namenode.FSDirectory.checkPermission(FSDirectory.java:1752)
>  at 
> org.apache.hadoop.hdfs.server.namenode.FSDirStatAndListingOp.getFileInfo(FSDirStatAndListingOp.java:100)
>  at 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getFileInfo(FSNamesystem.java:3832)
>  at 
> org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.getFileInfo(NameNodeRpcServer.java:1012)
>  at 
> org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.getFileInfo(ClientNamenodeProtocolServerSideTranslatorPB.java:855)
>  at 
> org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)
>  at 
> org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:616)
>  at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:982)
>  at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2217)
>  at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2213)
>  at java.security.AccessController.doPrivileged(Native Method)
>  at javax.security.auth.Subject.doAs(Subject.java:422)
>  at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1758)
>  at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2213)
>  DEBUG org.apache.hadoop.ipc.Server: IPC Server handler 1 on 8020: responding 
> to org.apache.hadoop.hdfs.protocol.ClientProtocol.getFileInfo from 
> xx:8502 Call#0 Retry#0
> {quote}
> When I checked and debugged the HDFS source code, I found that 
> pathByNameArr[i] is null.
> {quote}private INodeAttributes getINodeAttrs(byte[][] pathByNameArr, int 
> pathIdx,
>  INode inode, int snapshotId) {
>  INodeAttributes inodeAttrs = inode.getSnapshotINode(snapshotId);
>  if (getAttributesProvider() != null) {
>  String[] elements = new String[pathIdx + 1];
>  for (int i = 0; i < elements.length; i++) {
>  elements[i] = DFSUtil.bytes2String(pathByNameArr[i]);
>  }
>  inodeAttrs = getAttributesProvider().getAttributes(elements, inodeAttrs);
>  }
>  return inodeAttrs;
>  }
>  
> {quote}
> I found that this has been fixed on the trunk branch, but the fix has not yet 
> been merged into the latest 3.2.1 release.
> I hope this patch can be merged into the other branches as soon as 
> possible; thank you very much! 
>  
> {quote}private INodeAttributes getINodeAttrs(byte[][] pathByNameArr, int pathIdx,
>     INode inode, int snapshotId) {
>   INodeAttributes inodeAttrs = inode.getSnapshotINode(snapshotId);
>   if (getAttributesProvider() != null) {
>     String[] elements = new String[pathIdx + 1];
>     /**
>      * {@link INode#getPathComponents(String)} returns a null component
>      * for the root only path "/". Assign an empty string if so.
>      */
>     if (pathByNameArr.length == 1 && pathByNameArr[0] == null) {
>       elements[0] = "";
>     } else {
>       for (int i = 0; i < elements.length; i++) {
>         elements[i] = DFSUtil.bytes2String(pathByNameArr[i]);
>       }
>     }
>     inodeAttrs = getAttributesProvider().getAttributes(elements, inodeAttrs);
>   }
>   return inodeAttrs;
> }
> {quote}






[jira] [Commented] (HDFS-15226) Ranger integrates HDFS and discovers NPE

2020-03-16 Thread Brahma Reddy Battula (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-15226?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17060206#comment-17060206
 ] 

Brahma Reddy Battula commented on HDFS-15226:
-

Hope you are asking about HDFS-12614; this is already merged to 3.2 as well.

FYI.. 
[https://github.com/apache/hadoop/blame/branch-3.2/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FSPermissionChecker.java]

> Ranger integrates HDFS and discovers NPE
> 
>
> Key: HDFS-15226
> URL: https://issues.apache.org/jira/browse/HDFS-15226
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: hdfs
>Affects Versions: 2.7.6
> Environment: Apache Ranger1.2 && Hadoop2.7.6
>Reporter: bianqi
>Priority: Critical
> Fix For: 3.2.0, 3.2.1
>
>
> When I integrated Ranger 1.2 with Hadoop 2.7.6, the following NPE occurred 
> when executing hdfs dfs -ls /.
>  However, when I integrated Ranger 1.2 with Hadoop 2.7.1, hdfs dfs -ls / ran 
> without any errors, and the directory listing was displayed normally.
> {quote}java.lang.NullPointerException
>  at java.lang.String.checkBounds(String.java:384)
>  at java.lang.String.(String.java:425)
>  at org.apache.hadoop.hdfs.DFSUtil.bytes2String(DFSUtil.java:337)
>  at org.apache.hadoop.hdfs.DFSUtil.bytes2String(DFSUtil.java:319)
>  at 
> org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker.getINodeAttrs(FSPermissionChecker.java:238)
>  at 
> org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker.checkPermission(FSPermissionChecker.java:183)
>  at 
> org.apache.hadoop.hdfs.server.namenode.FSDirectory.checkPermission(FSDirectory.java:1752)
>  at 
> org.apache.hadoop.hdfs.server.namenode.FSDirStatAndListingOp.getFileInfo(FSDirStatAndListingOp.java:100)
>  at 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getFileInfo(FSNamesystem.java:3832)
>  at 
> org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.getFileInfo(NameNodeRpcServer.java:1012)
>  at 
> org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.getFileInfo(ClientNamenodeProtocolServerSideTranslatorPB.java:855)
>  at 
> org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)
>  at 
> org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:616)
>  at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:982)
>  at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2217)
>  at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2213)
>  at java.security.AccessController.doPrivileged(Native Method)
>  at javax.security.auth.Subject.doAs(Subject.java:422)
>  at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1758)
>  at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2213)
>  DEBUG org.apache.hadoop.ipc.Server: IPC Server handler 1 on 8020: responding 
> to org.apache.hadoop.hdfs.protocol.ClientProtocol.getFileInfo from 
> xx:8502 Call#0 Retry#0
> {quote}
> When I checked and debugged the HDFS source code, I found that 
> pathByNameArr[i] is null.
> {quote}private INodeAttributes getINodeAttrs(byte[][] pathByNameArr, int 
> pathIdx,
>  INode inode, int snapshotId) {
>  INodeAttributes inodeAttrs = inode.getSnapshotINode(snapshotId);
>  if (getAttributesProvider() != null) {
>  String[] elements = new String[pathIdx + 1];
>  for (int i = 0; i < elements.length; i++) {
>  elements[i] = DFSUtil.bytes2String(pathByNameArr[i]);
>  }
>  inodeAttrs = getAttributesProvider().getAttributes(elements, inodeAttrs);
>  }
>  return inodeAttrs;
>  }
>  
> {quote}
> I found that this has been fixed on the trunk branch, but the fix has not yet 
> been merged into the latest 3.2.1 release.
> I hope this patch can be merged into the other branches as soon as 
> possible; thank you very much! 
>  
> {quote}private INodeAttributes getINodeAttrs(byte[][] pathByNameArr, int pathIdx,
>     INode inode, int snapshotId) {
>   INodeAttributes inodeAttrs = inode.getSnapshotINode(snapshotId);
>   if (getAttributesProvider() != null) {
>     String[] elements = new String[pathIdx + 1];
>     /**
>      * {@link INode#getPathComponents(String)} returns a null component
>      * for the root only path "/". Assign an empty string if so.
>      */
>     if (pathByNameArr.length == 1 && pathByNameArr[0] == null) {
>       elements[0] = "";
>     } else {
>       for (int i = 0; i < elements.length; i++) {
>         elements[i] = DFSUtil.bytes2String(pathByNameArr[i]);
>       }
>     }
>     inodeAttrs = getAttributesProvider().getAttributes(elements, inodeAttrs);
>   }
>   return inodeAttrs;
> }
> {quote}





[jira] [Commented] (HDFS-15113) Missing IBR when NameNode restart if open processCommand async feature

2020-03-13 Thread Brahma Reddy Battula (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-15113?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17058959#comment-17058959
 ] 

Brahma Reddy Battula commented on HDFS-15113:
-

[~weichiu] do you have cycles to review this? Shall I go ahead and commit it?

> Missing IBR when NameNode restart if open processCommand async feature
> --
>
> Key: HDFS-15113
> URL: https://issues.apache.org/jira/browse/HDFS-15113
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: datanode
>Reporter: Xiaoqiao He
>Assignee: Xiaoqiao He
>Priority: Blocker
> Attachments: HDFS-15113.001.patch, HDFS-15113.002.patch, 
> HDFS-15113.003.patch, HDFS-15113.004.patch, HDFS-15113.005.patch
>
>
> Recently, I met a case where the NameNode was missing blocks after a restart, which is 
> related to HDFS-14997.
> a. During a NameNode restart, it returns the `DNA_REGISTER` command to a DataNode 
> when it receives some RPC request from that DataNode.
> b. When the DataNode receives the `DNA_REGISTER` command, it runs #reRegister 
> asynchronously.
> {code:java}
>   void reRegister() throws IOException {
> if (shouldRun()) {
>   // re-retrieve namespace info to make sure that, if the NN
>   // was restarted, we still match its version (HDFS-2120)
>   NamespaceInfo nsInfo = retrieveNamespaceInfo();
>   // and re-register
>   register(nsInfo);
>   scheduler.scheduleHeartbeat();
>   // HDFS-9917,Standby NN IBR can be very huge if standby namenode is down
>   // for sometime.
>   if (state == HAServiceState.STANDBY || state == 
> HAServiceState.OBSERVER) {
> ibrManager.clearIBRs();
>   }
> }
>   }
> {code}
> c. As we know, #register triggers a full block report (FBR) immediately.
> d. Because #reRegister runs asynchronously, we cannot be sure whether the FBR is sent 
> before or after the IBRs are cleared. If the IBRs are cleared first, it is OK. 
> But if the FBR is sent first and the IBRs are cleared afterwards, blocks received 
> between those two points in time will be missing until the next FBR.
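The race in step d can be modeled deterministically (a toy, not the DataNode code; `run`, `sendFBR`, and the other names are invented): a block received between the FBR and the clear survives only when the pending IBRs are cleared before the FBR is sent.

```java
import java.util.ArrayList;
import java.util.List;

// Deterministic toy of the ordering hazard: clearing IBRs after the FBR
// silently drops any block received in between, until the next FBR.
public class IbrOrderingDemo {
  final List<String> ibrs = new ArrayList<>();   // pending incremental reports
  final List<String> nnView = new ArrayList<>(); // what the NameNode has seen

  void receiveBlock(String b) { ibrs.add(b); }
  void sendFBR() { /* the FBR snapshot does not include pending IBRs */ }
  void clearIBRs() { ibrs.clear(); }
  void sendIBRs() { nnView.addAll(ibrs); ibrs.clear(); }

  static List<String> run(boolean clearBeforeFbr) {
    IbrOrderingDemo dn = new IbrOrderingDemo();
    if (clearBeforeFbr) dn.clearIBRs();      // safe ordering
    dn.sendFBR();
    dn.receiveBlock("blk_1001");             // arrives between FBR and the async clear
    if (!clearBeforeFbr) dn.clearIBRs();     // racy ordering: the clear loses
    dn.sendIBRs();
    return dn.nnView;
  }

  public static void main(String[] args) {
    System.out.println(run(true));  // the block is reported
    System.out.println(run(false)); // empty: the block is silently dropped
  }
}
```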






[jira] [Commented] (HDFS-15113) Missing IBR when NameNode restart if open processCommand async feature

2020-03-10 Thread Brahma Reddy Battula (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-15113?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17055774#comment-17055774
 ] 

Brahma Reddy Battula commented on HDFS-15113:
-

+1 from my side. Holding off the commit until others review.

> Missing IBR when NameNode restart if open processCommand async feature
> --
>
> Key: HDFS-15113
> URL: https://issues.apache.org/jira/browse/HDFS-15113
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: datanode
>Reporter: Xiaoqiao He
>Assignee: Xiaoqiao He
>Priority: Blocker
> Attachments: HDFS-15113.001.patch, HDFS-15113.002.patch, 
> HDFS-15113.003.patch, HDFS-15113.004.patch, HDFS-15113.005.patch
>
>
> Recently, I met a case where the NameNode was missing blocks after a restart, which is 
> related to HDFS-14997.
> a. During a NameNode restart, it returns the `DNA_REGISTER` command to a DataNode 
> when it receives some RPC request from that DataNode.
> b. When the DataNode receives the `DNA_REGISTER` command, it runs #reRegister 
> asynchronously.
> {code:java}
>   void reRegister() throws IOException {
> if (shouldRun()) {
>   // re-retrieve namespace info to make sure that, if the NN
>   // was restarted, we still match its version (HDFS-2120)
>   NamespaceInfo nsInfo = retrieveNamespaceInfo();
>   // and re-register
>   register(nsInfo);
>   scheduler.scheduleHeartbeat();
>   // HDFS-9917,Standby NN IBR can be very huge if standby namenode is down
>   // for sometime.
>   if (state == HAServiceState.STANDBY || state == 
> HAServiceState.OBSERVER) {
> ibrManager.clearIBRs();
>   }
> }
>   }
> {code}
> c. As we know, #register triggers a full block report (FBR) immediately.
> d. Because #reRegister runs asynchronously, we cannot be sure whether the FBR is sent 
> before or after the IBRs are cleared. If the IBRs are cleared first, it is OK. 
> But if the FBR is sent first and the IBRs are cleared afterwards, blocks received 
> between those two points in time will be missing until the next FBR.






[jira] [Commented] (HDFS-15113) Missing IBR when NameNode restart if open processCommand async feature

2020-03-05 Thread Brahma Reddy Battula (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-15113?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17052124#comment-17052124
 ] 

Brahma Reddy Battula commented on HDFS-15113:
-

+1 on the latest patch apart from the Jenkins errors. Please handle the 
checkstyle errors.

> Missing IBR when NameNode restart if open processCommand async feature
> --
>
> Key: HDFS-15113
> URL: https://issues.apache.org/jira/browse/HDFS-15113
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: datanode
>Reporter: Xiaoqiao He
>Assignee: Xiaoqiao He
>Priority: Blocker
> Attachments: HDFS-15113.001.patch, HDFS-15113.002.patch, 
> HDFS-15113.003.patch, HDFS-15113.004.patch
>
>
> Recently, I met a case where the NameNode was missing blocks after a restart, which is 
> related to HDFS-14997.
> a. During a NameNode restart, it returns the `DNA_REGISTER` command to a DataNode 
> when it receives some RPC request from that DataNode.
> b. When the DataNode receives the `DNA_REGISTER` command, it runs #reRegister 
> asynchronously.
> {code:java}
>   void reRegister() throws IOException {
> if (shouldRun()) {
>   // re-retrieve namespace info to make sure that, if the NN
>   // was restarted, we still match its version (HDFS-2120)
>   NamespaceInfo nsInfo = retrieveNamespaceInfo();
>   // and re-register
>   register(nsInfo);
>   scheduler.scheduleHeartbeat();
>   // HDFS-9917,Standby NN IBR can be very huge if standby namenode is down
>   // for sometime.
>   if (state == HAServiceState.STANDBY || state == 
> HAServiceState.OBSERVER) {
> ibrManager.clearIBRs();
>   }
> }
>   }
> {code}
> c. As we know, #register triggers a full block report (FBR) immediately.
> d. Because #reRegister runs asynchronously, we cannot be sure whether the FBR is sent 
> before or after the IBRs are cleared. If the IBRs are cleared first, it is OK. 
> But if the FBR is sent first and the IBRs are cleared afterwards, blocks received 
> between those two points in time will be missing until the next FBR.






[jira] [Commented] (HDFS-15113) Missing IBR when NameNode restart if open processCommand async feature

2020-02-28 Thread Brahma Reddy Battula (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-15113?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17047738#comment-17047738
 ] 

Brahma Reddy Battula commented on HDFS-15113:
-

{quote}This has to be a blocker for 3.3.0. Updated jira to reflect the reality.
{quote}
Ok, I will consider this. 

> Missing IBR when NameNode restart if open processCommand async feature
> --
>
> Key: HDFS-15113
> URL: https://issues.apache.org/jira/browse/HDFS-15113
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: datanode
>Reporter: Xiaoqiao He
>Assignee: Xiaoqiao He
>Priority: Blocker
> Attachments: HDFS-15113.001.patch, HDFS-15113.002.patch, 
> HDFS-15113.003.patch
>
>
> Recently, I meet one case that NameNode missing block after restart which is 
> related with HDFS-14997.
> a. during NameNode restart, it will return command `DNA_REGISTER` to DataNode 
> when receive some RPC request from DataNode.
> b. when DataNode receive `DNA_REGISTER` command, it will run #reRegister 
> async.
> {code:java}
>   void reRegister() throws IOException {
> if (shouldRun()) {
>   // re-retrieve namespace info to make sure that, if the NN
>   // was restarted, we still match its version (HDFS-2120)
>   NamespaceInfo nsInfo = retrieveNamespaceInfo();
>   // and re-register
>   register(nsInfo);
>   scheduler.scheduleHeartbeat();
>   // HDFS-9917,Standby NN IBR can be very huge if standby namenode is down
>   // for sometime.
>   if (state == HAServiceState.STANDBY || state == HAServiceState.OBSERVER) {
> ibrManager.clearIBRs();
>   }
> }
>   }
> {code}
> c. As we know, #register triggers a full block report (FBR) immediately.
> d. Because #reRegister runs async, we cannot be sure whether the FBR is sent 
> before or after the IBRs are cleared. If the IBRs are cleared first, everything 
> is fine; but if the FBR is sent first and the IBRs are cleared afterwards, any 
> blocks received between those two points are missing until the next FBR.
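The race in step (d) can be made concrete with a toy model. This is a hedged sketch: all names below (`namenodeView`, `blk_1001`, the sets standing in for the IBR queue and FBR snapshot) are illustrative, not the real DataNode classes. It shows that taking the FBR snapshot before clearing the IBR queue drops any block received in between.

```java
import java.util.LinkedHashSet;
import java.util.Set;

public class IbrRaceSketch {

  // Models what the NameNode ends up knowing after re-registration,
  // depending on whether the IBR queue is cleared before or after the
  // full-block-report (FBR) snapshot is taken.
  static Set<String> namenodeView(boolean clearIbrFirst) {
    Set<String> ibrQueue = new LinkedHashSet<>();
    if (clearIbrFirst) {
      ibrQueue.clear();                        // safe ordering: clear, then FBR
    }
    Set<String> fbrSnapshot = new LinkedHashSet<>(); // FBR built here (empty disk)
    ibrQueue.add("blk_1001");                  // block received between the two steps
    if (!clearIbrFirst) {
      ibrQueue.clear();                        // unsafe ordering: blk_1001 is lost
    }
    Set<String> nnBlocks = new LinkedHashSet<>(fbrSnapshot);
    nnBlocks.addAll(ibrQueue);                 // IBRs flushed on the next heartbeat
    return nnBlocks;
  }

  public static void main(String[] args) {
    System.out.println(namenodeView(true).contains("blk_1001"));  // true
    System.out.println(namenodeView(false).contains("blk_1001")); // false
  }
}
```

The blocks lost in the unsafe ordering only reappear at the next FBR, which matches the symptom described above.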






[jira] [Resolved] (HDFS-15194) ERROR log print wrong user info when run distcp

2020-02-26 Thread Brahma Reddy Battula (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-15194?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Brahma Reddy Battula resolved HDFS-15194.
-
Resolution: Duplicate

[~rain_lyy] thanks for confirmation. Closing as duplicate of HDFS-13626.

> ERROR log print wrong user info when run distcp
> ---
>
> Key: HDFS-15194
> URL: https://issues.apache.org/jira/browse/HDFS-15194
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: hdfs
>Affects Versions: 3.1.1
>Reporter: liuyanyu
>Priority: Minor
> Attachments: distcp.log, image-2020-02-26-14-10-19-654.png
>
>
> Use user test, which belongs to group hadoop, to run distcp with -pbugpaxt, copying 
> a directory whose owner is super; distcp fails with an error log as follows:
> 2020-02-26 11:17:02,755 INFO [IPC Server handler 5 on 27101] 
> org.apache.hadoop.mapred.TaskAttemptListenerImpl: Diagnostics report from 
> attempt_1582635453769_0003_m_21_0: Error: 
> org.apache.hadoop.security.AccessControlException: User super is not a super 
> user (non-super user cannot change owner).
>  at 
> org.apache.hadoop.hdfs.server.namenode.FSDirAttrOp.setOwner(FSDirAttrOp.java:85)
>  at 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.setOwner(FSNamesystem.java:1927)
>  at 
> org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.setOwner(NameNodeRpcServer.java:870)
>  at 
> org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.setOwner(ClientNamenodeProtocolServerSideTranslatorPB.java:566)
>  at
> ...
> The current user is test, not super; the log prints the wrong user info.
>  






[jira] [Commented] (HDFS-15194) ERROR log print wrong user info when run distcp

2020-02-26 Thread Brahma Reddy Battula (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-15194?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17045777#comment-17045777
 ] 

Brahma Reddy Battula commented on HDFS-15194:
-

[~rain_lyy] thanks for reporting. It looks like a duplicate of HDFS-13626.

> ERROR log print wrong user info when run distcp
> ---
>
> Key: HDFS-15194
> URL: https://issues.apache.org/jira/browse/HDFS-15194
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: hdfs
>Affects Versions: 3.1.1
>Reporter: liuyanyu
>Priority: Minor
> Attachments: distcp.log, image-2020-02-26-14-10-19-654.png
>
>
> Use user test, which belongs to group hadoop, to run distcp with -pbugpaxt, copying 
> a directory whose owner is super; distcp fails with an error log as follows:
> 2020-02-26 11:17:02,755 INFO [IPC Server handler 5 on 27101] 
> org.apache.hadoop.mapred.TaskAttemptListenerImpl: Diagnostics report from 
> attempt_1582635453769_0003_m_21_0: Error: 
> org.apache.hadoop.security.AccessControlException: User super is not a super 
> user (non-super user cannot change owner).
>  at 
> org.apache.hadoop.hdfs.server.namenode.FSDirAttrOp.setOwner(FSDirAttrOp.java:85)
>  at 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.setOwner(FSNamesystem.java:1927)
>  at 
> org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.setOwner(NameNodeRpcServer.java:870)
>  at 
> org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.setOwner(ClientNamenodeProtocolServerSideTranslatorPB.java:566)
>  at
> ...
> The current user is test, not super; the log prints the wrong user info.
>  






[jira] [Commented] (HDFS-15087) RBF: Balance/Rename across federation namespaces

2020-01-28 Thread Brahma Reddy Battula (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-15087?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17025589#comment-17025589
 ] 

Brahma Reddy Battula commented on HDFS-15087:
-

{quote}First support DFSAdmin command balance. Then integrate it to Router. 
Finally support smart balance
{quote}
Ok, this will be good to have.
{quote}When I trying to use snapshot I meet a tricky problem. I envisioned 2 
ways:
In fun1, we don't do hardlink in the loop. So the final hardlink will cost a 
lot. From the performance chapter we can see the HardLink part costs 
474,701ms(82.22%) for a large path. The larger the path is, the higer the 
hardlink proportion. So in this way the benefit of snapshot delta is not 
much(17.78%).
{quote}
If we use snapshots, I don't think we require hardlinks.
{quote}We need to check the existence of the dst-path. If the dst-path exists 
then the rpc succeeds. Otherwise the rpc fails.
{quote}
When the dst path has multiple files and subfolders, will the existence of all 
of them be checked?

 
{quote}Hi everyone, any further comments? Please let me know your thoughts, 
thanks !
{quote}
I don't have further comments. Maybe you can update the design doc with these 
improvements/future plans, then create the branch and start contributing.

 
 
 

> RBF: Balance/Rename across federation namespaces
> 
>
> Key: HDFS-15087
> URL: https://issues.apache.org/jira/browse/HDFS-15087
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Reporter: Jinglun
>Priority: Major
> Attachments: HDFS-15087.initial.patch, HFR_Rename Across Federation 
> Namespaces.pdf
>
>
> The Xiaomi storage team has developed a new feature called HFR(HDFS 
> Federation Rename) that enables us to do balance/rename across federation 
> namespaces. The idea is to first move the meta to the dst NameNode and then 
> link all the replicas. It has been working in our largest production cluster 
> for 2 months. We use it to balance the namespaces. It turns out HFR is fast 
> and flexible. The detail could be found in the design doc. 
> Looking forward to a lively discussion.






[jira] [Commented] (HDFS-15133) Use rocksdb to store NameNode inode and blockInfo

2020-01-28 Thread Brahma Reddy Battula (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-15133?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17025536#comment-17025536
 ] 

Brahma Reddy Battula commented on HDFS-15133:
-

As discussed earlier, make this configurable to avoid the confusion.

Once the POC test is done, please publish the results.

> Use rocksdb to store NameNode inode and blockInfo
> -
>
> Key: HDFS-15133
> URL: https://issues.apache.org/jira/browse/HDFS-15133
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: namenode
>Affects Versions: 3.3.0
>Reporter: maobaolong
>Priority: Major
> Attachments: image-2020-01-28-12-30-33-015.png
>
>
> Maybe we don't need to checkpoint to an fsimage file; the RocksDB checkpoint can 
> achieve the same result.
> This is the way Ozone and Alluxio manage the master node's metadata.






[jira] [Commented] (HDFS-15133) Use rocksdb to store NameNode inode and blockInfo

2020-01-22 Thread Brahma Reddy Battula (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-15133?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17021264#comment-17021264
 ] 

Brahma Reddy Battula commented on HDFS-15133:
-

[~maobaolong] thanks for proposing this; I would like to hear the POC results 
if you have done any testing on this. Thanks.

> Use rocksdb to store NameNode inode and blockInfo
> -
>
> Key: HDFS-15133
> URL: https://issues.apache.org/jira/browse/HDFS-15133
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: namenode
>Affects Versions: 3.3.0
>Reporter: maobaolong
>Priority: Major
>
> Maybe we don't need to checkpoint to an fsimage file; the RocksDB checkpoint can 
> achieve the same result.
> This is the way Ozone and Alluxio manage the master node's metadata.






[jira] [Comment Edited] (HDFS-15113) Missing IBR when NameNode restart if open processCommand async feature

2020-01-14 Thread Brahma Reddy Battula (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-15113?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17014841#comment-17014841
 ] 

Brahma Reddy Battula edited comment on HDFS-15113 at 1/14/20 8:13 AM:
--

Hi [~hexiaoqiao]

thanks for reporting. Does this have a high chance of occurring when 
"dfs.blockreport.initialDelay" is configured to "0"? The UT passes without the 
fix; can you update the UT?


was (Author: brahmareddy):
Hi [~hexiaoqiao]

thanks for reporting..is this have high chance when 
"dfs.blockreport.initialDelay" is configured with "0"..?

> Missing IBR when NameNode restart if open processCommand async feature
> --
>
> Key: HDFS-15113
> URL: https://issues.apache.org/jira/browse/HDFS-15113
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: datanode
>Reporter: Xiaoqiao He
>Assignee: Xiaoqiao He
>Priority: Major
> Attachments: HDFS-15113.001.patch, HDFS-15113.002.patch
>
>
> Recently I met a case where the NameNode was missing blocks after a restart; 
> it is related to HDFS-14997.
> a. During NameNode restart, the NameNode returns the `DNA_REGISTER` command to 
> a DataNode when it receives some RPC request from that DataNode.
> b. When the DataNode receives the `DNA_REGISTER` command, it runs #reRegister 
> asynchronously.
> {code:java}
>   void reRegister() throws IOException {
> if (shouldRun()) {
>   // re-retrieve namespace info to make sure that, if the NN
>   // was restarted, we still match its version (HDFS-2120)
>   NamespaceInfo nsInfo = retrieveNamespaceInfo();
>   // and re-register
>   register(nsInfo);
>   scheduler.scheduleHeartbeat();
>   // HDFS-9917,Standby NN IBR can be very huge if standby namenode is down
>   // for sometime.
>   if (state == HAServiceState.STANDBY || state == HAServiceState.OBSERVER) {
> ibrManager.clearIBRs();
>   }
> }
>   }
> {code}
> c. As we know, #register triggers a full block report (FBR) immediately.
> d. Because #reRegister runs async, we cannot be sure whether the FBR is sent 
> before or after the IBRs are cleared. If the IBRs are cleared first, everything 
> is fine; but if the FBR is sent first and the IBRs are cleared afterwards, any 
> blocks received between those two points are missing until the next FBR.






[jira] [Commented] (HDFS-15113) Missing IBR when NameNode restart if open processCommand async feature

2020-01-13 Thread Brahma Reddy Battula (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-15113?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17014841#comment-17014841
 ] 

Brahma Reddy Battula commented on HDFS-15113:
-

Hi [~hexiaoqiao]

thanks for reporting. Does this have a high chance of occurring when 
"dfs.blockreport.initialDelay" is configured to "0"?

> Missing IBR when NameNode restart if open processCommand async feature
> --
>
> Key: HDFS-15113
> URL: https://issues.apache.org/jira/browse/HDFS-15113
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: datanode
>Reporter: Xiaoqiao He
>Assignee: Xiaoqiao He
>Priority: Major
> Attachments: HDFS-15113.001.patch, HDFS-15113.002.patch
>
>
> Recently I met a case where the NameNode was missing blocks after a restart; 
> it is related to HDFS-14997.
> a. During NameNode restart, the NameNode returns the `DNA_REGISTER` command to 
> a DataNode when it receives some RPC request from that DataNode.
> b. When the DataNode receives the `DNA_REGISTER` command, it runs #reRegister 
> asynchronously.
> {code:java}
>   void reRegister() throws IOException {
> if (shouldRun()) {
>   // re-retrieve namespace info to make sure that, if the NN
>   // was restarted, we still match its version (HDFS-2120)
>   NamespaceInfo nsInfo = retrieveNamespaceInfo();
>   // and re-register
>   register(nsInfo);
>   scheduler.scheduleHeartbeat();
>   // HDFS-9917,Standby NN IBR can be very huge if standby namenode is down
>   // for sometime.
>   if (state == HAServiceState.STANDBY || state == HAServiceState.OBSERVER) {
> ibrManager.clearIBRs();
>   }
> }
>   }
> {code}
> c. As we know, #register triggers a full block report (FBR) immediately.
> d. Because #reRegister runs async, we cannot be sure whether the FBR is sent 
> before or after the IBRs are cleared. If the IBRs are cleared first, everything 
> is fine; but if the FBR is sent first and the IBRs are cleared afterwards, any 
> blocks received between those two points are missing until the next FBR.






[jira] [Commented] (HDFS-15087) RBF: Balance/Rename across federation namespaces

2020-01-11 Thread Brahma Reddy Battula (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-15087?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17013422#comment-17013422
 ] 

Brahma Reddy Battula commented on HDFS-15087:
-

[~LiJinglun], I have gone through the design document. Nice feature and neatly 
documented. (y)

 
 # Will the scheduler have the ability to identify which NS is full and 
automatically schedule the job? Maybe each NS can configure the threshold?
 # Can the scheduler be enhanced based on RPC load/usage (balancing the RPC load 
as well)?
 # How is consistency ensured if you don't use a snapshot, as [~linyiqun] and 
[~goiri] mentioned? After the *saveTree* step and before creating the mount 
table, how will the delta be processed? (Or will block writes be allowed until 
the job succeeds? That might delay other applications, since you make the tree 
immutable; I didn't see the mount table being made read-only.) By the way, the 
editlog idea looks good here.
 # Are mount table properties (or attributes) also preserved?
 # Are saveTree() and graftTree() idempotent? On NameNode failover, will these 
be re-executed once connected to the newly active NameNode?

Please do correct me if I am not in same page with you guys.

By the way, I am planning the 3.3.0 release for mid-March; it would be good to 
finish before that, so this can be included in the 3.3 release.

> RBF: Balance/Rename across federation namespaces
> 
>
> Key: HDFS-15087
> URL: https://issues.apache.org/jira/browse/HDFS-15087
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Reporter: Jinglun
>Priority: Major
> Attachments: HDFS-15087.initial.patch, HFR_Rename Across Federation 
> Namespaces.pdf
>
>
> The Xiaomi storage team has developed a new feature called HFR(HDFS 
> Federation Rename) that enables us to do balance/rename across federation 
> namespaces. The idea is to first move the meta to the dst NameNode and then 
> link all the replicas. It has been working in our largest production cluster 
> for 2 months. We use it to balance the namespaces. It turns out HFR is fast 
> and flexible. The detail could be found in the design doc. 
> Looking forward to a lively discussion.






[jira] [Commented] (HDFS-14878) DataStreamer's ResponseProceesor#run() should log with Warn loglevel

2019-11-11 Thread Brahma Reddy Battula (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-14878?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16971817#comment-16971817
 ] 

Brahma Reddy Battula commented on HDFS-14878:
-

[~kihwal], please let us know your opinion on this jira.

> DataStreamer's ResponseProceesor#run() should log with Warn loglevel
> 
>
> Key: HDFS-14878
> URL: https://issues.apache.org/jira/browse/HDFS-14878
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Reporter: hemanthboyina
>Assignee: hemanthboyina
>Priority: Major
> Attachments: HDFS-14878.001.patch
>
>
> {code:java}
>   if (duration > dfsclientSlowLogThresholdMs) {
> LOG.info("Slow ReadProcessor read fields for block " + block
>   + " took " + duration + "ms (threshold="
>   + dfsclientSlowLogThresholdMs + "ms); ack: " + ack
>   + ", targets: " + Arrays.asList(targets));
>   } {code}
> log level should be warn here






[jira] [Commented] (HDFS-14974) RBF: Make tests use free ports

2019-11-11 Thread Brahma Reddy Battula (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-14974?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16971800#comment-16971800
 ] 

Brahma Reddy Battula commented on HDFS-14974:
-

[~elgoiri], have a look at org.apache.hadoop.net.ServerSocketUtil, which might 
be usable here.
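For illustration, the ":0" approach the issue describes boils down to letting the OS pick an ephemeral port. A minimal stand-alone sketch (not the actual ServerSocketUtil API; the class name is illustrative):

```java
import java.io.IOException;
import java.net.ServerSocket;

public class FreePortSketch {

  // Binding to port 0 asks the OS for a currently free ephemeral port.
  static int freePort() {
    try (ServerSocket s = new ServerSocket(0)) {
      return s.getLocalPort();
    } catch (IOException e) {
      return -1; // no free port could be obtained (should not happen in tests)
    }
  }

  public static void main(String[] args) {
    System.out.println("OS-assigned free port: " + freePort());
  }
}
```

Note the usual caveat: a port obtained this way can be taken by another process before the test binds to it, which is why binding the Router itself to :0 is the more robust option.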

> RBF: Make tests use free ports
> --
>
> Key: HDFS-14974
> URL: https://issues.apache.org/jira/browse/HDFS-14974
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Reporter: Íñigo Goiri
>Assignee: Íñigo Goiri
>Priority: Major
> Attachments: HDFS-14974.000.patch
>
>
> Currently, {{TestRouterSecurityManager#testCreateCredentials}} creates a 
> Router with the default ports. However, these ports might be in use. We should 
> set them to :0 so they are assigned dynamically.






[jira] [Commented] (HDFS-14909) DFSNetworkTopology#chooseRandomWithStorageType() should not decrease storage count for excluded node which is already part of excluded scope

2019-10-16 Thread Brahma Reddy Battula (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-14909?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16953415#comment-16953415
 ] 

Brahma Reddy Battula commented on HDFS-14909:
-

+1, nice finding.

> DFSNetworkTopology#chooseRandomWithStorageType() should not decrease storage 
> count for excluded node which is already part of excluded scope 
> -
>
> Key: HDFS-14909
> URL: https://issues.apache.org/jira/browse/HDFS-14909
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: namenode
>Affects Versions: 3.1.1
>Reporter: Surendra Singh Lilhore
>Assignee: Surendra Singh Lilhore
>Priority: Major
> Attachments: HDFS-14909.001.patch, HDFS-14909.002.patch, 
> HDFS-14909.003.patch
>
>







[jira] [Comment Edited] (HDFS-14284) RBF: Log Router identifier when reporting exceptions

2019-10-02 Thread Brahma Reddy Battula (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-14284?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16943111#comment-16943111
 ] 

Brahma Reddy Battula edited comment on HDFS-14284 at 10/2/19 7:53 PM:
--

Ok. I just want to confirm that when the router can't access the state store, 
we can shut the router down.
{quote}This shouldn't break compatibility as it would be a new field in the new 
remote exception.
{quote}
I was talking about the new NoNamenodesAvailableException, where we are going to 
add one more field (and this exception was introduced before a release). I was 
concerned about this.

[~ayushtkn] and [~inigoiri], if you both are ok. Then I am ok.

 

[~hemanthboyina] you can update the patch, as [~crh] suggested.

 


was (Author: brahmareddy):
Ok.Just I want to confirm when router is can't access state store we can 
shutdown the router.
{quote}This shouldn't break compatibility as it would be a new field in the new 
remote exception.
{quote}
I was talking about "new NoNamenodesAvailableException"  where we are going to 
add one more field( and this exception was introduced b. I was concerned about 
this.

[~ayushtkn] and [~inigoiri], if you both are ok. Then I am ok.

 

[~hemanthboyina] you can update the patch,as [~crh] suggested.

 

> RBF: Log Router identifier when reporting exceptions
> 
>
> Key: HDFS-14284
> URL: https://issues.apache.org/jira/browse/HDFS-14284
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Reporter: Íñigo Goiri
>Assignee: hemanthboyina
>Priority: Major
> Attachments: HDFS-14284.001.patch, HDFS-14284.002.patch
>
>
> The typical setup is to use multiple Routers through 
> ConfiguredFailoverProxyProvider.
> In a regular HA Namenode setup, it is easy to know which NN was used.
> However, in RBF, any Router can be the one reporting the exception and it is 
> hard to know which was the one.
> We should have a way to identify which Router/Namenode was the one triggering 
> the exception.
> This would also apply with Observer Namenodes.






[jira] [Commented] (HDFS-14284) RBF: Log Router identifier when reporting exceptions

2019-10-02 Thread Brahma Reddy Battula (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-14284?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16943111#comment-16943111
 ] 

Brahma Reddy Battula commented on HDFS-14284:
-

Ok. I just want to confirm that when the router can't access the state store, 
we can shut the router down.
{quote}This shouldn't break compatibility as it would be a new field in the new 
remote exception.
{quote}
I was talking about the new NoNamenodesAvailableException, where we are going to 
add one more field (and this exception was introduced before a release). I was 
concerned about this.

[~ayushtkn] and [~inigoiri], if you both are ok. Then I am ok.

 

[~hemanthboyina] you can update the patch, as [~crh] suggested.

 

> RBF: Log Router identifier when reporting exceptions
> 
>
> Key: HDFS-14284
> URL: https://issues.apache.org/jira/browse/HDFS-14284
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Reporter: Íñigo Goiri
>Assignee: hemanthboyina
>Priority: Major
> Attachments: HDFS-14284.001.patch, HDFS-14284.002.patch
>
>
> The typical setup is to use multiple Routers through 
> ConfiguredFailoverProxyProvider.
> In a regular HA Namenode setup, it is easy to know which NN was used.
> However, in RBF, any Router can be the one reporting the exception and it is 
> hard to know which was the one.
> We should have a way to identify which Router/Namenode was the one triggering 
> the exception.
> This would also apply with Observer Namenodes.






[jira] [Commented] (HDFS-14284) RBF: Log Router identifier when reporting exceptions

2019-10-01 Thread Brahma Reddy Battula (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-14284?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16942477#comment-16942477
 ] 

Brahma Reddy Battula commented on HDFS-14284:
-

{quote}We will try to cover as many cases as possible but not easy to get all 
of them down.
{quote}
[~elgoiri] can you highlight why the router does not have access to the state 
store? Do you think the router should allow requests in such a case?

 

> RBF: Log Router identifier when reporting exceptions
> 
>
> Key: HDFS-14284
> URL: https://issues.apache.org/jira/browse/HDFS-14284
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Reporter: Íñigo Goiri
>Assignee: hemanthboyina
>Priority: Major
> Attachments: HDFS-14284.001.patch, HDFS-14284.002.patch
>
>
> The typical setup is to use multiple Routers through 
> ConfiguredFailoverProxyProvider.
> In a regular HA Namenode setup, it is easy to know which NN was used.
> However, in RBF, any Router can be the one reporting the exception and it is 
> hard to know which was the one.
> We should have a way to identify which Router/Namenode was the one triggering 
> the exception.
> This would also apply with Observer Namenodes.






[jira] [Commented] (HDFS-14878) DataStreamer's ResponseProceesor#run() should log with Warn loglevel

2019-10-01 Thread Brahma Reddy Battula (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-14878?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16942474#comment-16942474
 ] 

Brahma Reddy Battula commented on HDFS-14878:
-

[~ayushtkn] you can raise a separate Jira to track that.

> DataStreamer's ResponseProceesor#run() should log with Warn loglevel
> 
>
> Key: HDFS-14878
> URL: https://issues.apache.org/jira/browse/HDFS-14878
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Reporter: hemanthboyina
>Assignee: hemanthboyina
>Priority: Major
> Attachments: HDFS-14878.001.patch
>
>
> {code:java}
>   if (duration > dfsclientSlowLogThresholdMs) {
> LOG.info("Slow ReadProcessor read fields for block " + block
>   + " took " + duration + "ms (threshold="
>   + dfsclientSlowLogThresholdMs + "ms); ack: " + ack
>   + ", targets: " + Arrays.asList(targets));
>   } {code}
> log level should be warn here






[jira] [Commented] (HDFS-14878) DataStreamer's ResponseProceesor#run() should log with Warn loglevel

2019-10-01 Thread Brahma Reddy Battula (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-14878?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16942386#comment-16942386
 ] 

Brahma Reddy Battula commented on HDFS-14878:
-

Hope you are asking about 
[this|https://github.com/apache/hadoop/blob/trunk/hadoop-hdfs-project/hadoop-hdfs-client/src/main/java/org/apache/hadoop/hdfs/DataStreamer.java#L1232]
 and not 
[this|https://github.com/apache/hadoop/blob/trunk/hadoop-hdfs-project/hadoop-hdfs-client/src/main/java/org/apache/hadoop/hdfs/DataStreamer.java#L1248],
 which is already warn.

I am not sure why it is an info log, when setupPipelineForAppendOrRecovery() 
logs [warn|#L1476].

Please correct me if I missed anything.
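As a hedged sketch of the proposed change (illustrative names; it uses java.util.logging so it runs stand-alone, whereas DataStreamer itself uses SLF4J, where the fix is simply LOG.warn instead of LOG.info):

```java
import java.util.Arrays;
import java.util.logging.Level;
import java.util.logging.Logger;

public class SlowAckLogSketch {
  private static final Logger LOG = Logger.getLogger("SlowAckLogSketch");

  // Exceeding the client slow-log threshold signals a degraded pipeline,
  // so it deserves WARN rather than INFO.
  static Level levelFor(long durationMs, long thresholdMs) {
    return durationMs > thresholdMs ? Level.WARNING : Level.FINE;
  }

  public static void main(String[] args) {
    long duration = 35_000, threshold = 30_000;
    LOG.log(levelFor(duration, threshold),
        "Slow ReadProcessor read fields for block blk_42 took " + duration
            + "ms (threshold=" + threshold + "ms); targets: "
            + Arrays.asList("dn1:9866", "dn2:9866"));
  }
}
```

The point of routing the severity through one helper is that both slow-ack sites mentioned above would then log at the same level.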

 

> DataStreamer's ResponseProceesor#run() should log with Warn loglevel
> 
>
> Key: HDFS-14878
> URL: https://issues.apache.org/jira/browse/HDFS-14878
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Reporter: hemanthboyina
>Assignee: hemanthboyina
>Priority: Major
> Attachments: HDFS-14878.001.patch
>
>
> {code:java}
>   if (duration > dfsclientSlowLogThresholdMs) {
> LOG.info("Slow ReadProcessor read fields for block " + block
>   + " took " + duration + "ms (threshold="
>   + dfsclientSlowLogThresholdMs + "ms); ack: " + ack
>   + ", targets: " + Arrays.asList(targets));
>   } {code}
> log level should be warn here






[jira] [Commented] (HDFS-14284) RBF: Log Router identifier when reporting exceptions

2019-10-01 Thread Brahma Reddy Battula (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-14284?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16942360#comment-16942360
 ] 

Brahma Reddy Battula commented on HDFS-14284:
-

{quote}Router that had issues (no access to the store) and it was very hard to 
find it.
{quote}
I think we should shut down the router in that case. One example, which is not 
handled currently, is below:
{code:java}
if (!zkManager.getCurator().isStarted()) {
 throw new StateStoreUnavailableException(
 "Cannot get data, " + "ZKCurator is STOPPED:" + e.getMessage());
}{code}
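A minimal sketch of the "shut down the router when the state store is unavailable" idea. The StateStore interface and all names here are illustrative stand-ins, not the real RBF API:

```java
public class RouterStoreGuard {

  // Illustrative stand-in for the state store driver's health check.
  interface StateStore {
    boolean isDriverReady();
  }

  // Fail fast: a Router that cannot reach its state store should stop
  // serving instead of failing requests one by one.
  static boolean shouldShutdown(StateStore store) {
    return !store.isDriverReady();
  }

  public static void main(String[] args) {
    StateStore stoppedCurator = () -> false; // e.g. ZKCurator is STOPPED
    if (shouldShutdown(stoppedCurator)) {
      System.out.println("State store unavailable; shutting down this Router");
      // in the real service this would trigger the Router's stop/shutdown hook
    }
  }
}
```

Shutting down also makes the failing Router visible to the failover proxy provider, so clients move on to a healthy Router instead of retrying against a broken one.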

> RBF: Log Router identifier when reporting exceptions
> 
>
> Key: HDFS-14284
> URL: https://issues.apache.org/jira/browse/HDFS-14284
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Reporter: Íñigo Goiri
>Assignee: hemanthboyina
>Priority: Major
> Attachments: HDFS-14284.001.patch, HDFS-14284.002.patch
>
>
> The typical setup is to use multiple Routers through 
> ConfiguredFailoverProxyProvider.
> In a regular HA Namenode setup, it is easy to know which NN was used.
> However, in RBF, any Router can be the one reporting the exception and it is 
> hard to know which was the one.
> We should have a way to identify which Router/Namenode was the one triggering 
> the exception.
> This would also apply with Observer Namenodes.






[jira] [Commented] (HDFS-14090) RBF: Improved isolation for downstream name nodes. {Static}

2019-10-01 Thread Brahma Reddy Battula (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-14090?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16942355#comment-16942355
 ] 

Brahma Reddy Battula commented on HDFS-14090:
-

[~crh] thanks for the great work here. I too liked the first approach. Sorry 
for the late reply.

The overall approach looks good, apart from the following minor suggestions, 
if you agree.

 

i) The following might mislead; maybe we can log how many handlers are 
overloaded, since we throw the same message.
{code:java}
LOG.debug("Permission denied for ugi: {} for method: {}",
 ugi, m.getName()); 
{code}
ii) The following will give fairness, unlike *tryAcquire()*:
{code:java}
public boolean tryAcquire(long timeout, TimeUnit unit){code}
iii) As this demands that the total number of handlers configured across all 
the nameservices be less than or equal to the total handlers of RBF, maybe we 
need to document this in HDFS-14558.

iv) The naming of methods and classes might be improved? E.g. instead of 
"FairnessManager.java", something like RBFRpcFairnessManager or 
RBFHandlerFairnessManager.java; acquirePermit(..) -> acquireHandler() (it looks 
like the permits come from a semaphore). Any thoughts?

v) Can we expose the number of available or used handlers at the NS level?

I would like to see how the dynamic allocation (HDFS-14750) and the observer 
load will be distributed (static allocation might not bring much benefit, since 
cluster load is not predictable).
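Suggestion (ii) can be sketched with a fair java.util.concurrent.Semaphore. This is a hedged sketch: the class and method names (NsPermitSketch, acquireHandler, releaseHandler) are illustrative, not the patch's actual API.

```java
import java.util.concurrent.Semaphore;
import java.util.concurrent.TimeUnit;

public class NsPermitSketch {
  private final Semaphore permits;

  NsPermitSketch(int handlersForNs) {
    this.permits = new Semaphore(handlersForNs, true); // fair = FIFO among waiters
  }

  // Timed, fair acquisition: callers queue up instead of barging the way
  // the no-argument tryAcquire() does.
  boolean acquireHandler(long timeoutMs) {
    try {
      return permits.tryAcquire(timeoutMs, TimeUnit.MILLISECONDS);
    } catch (InterruptedException ie) {
      Thread.currentThread().interrupt();
      return false;
    }
  }

  void releaseHandler() {
    permits.release();
  }

  public static void main(String[] args) {
    NsPermitSketch ns = new NsPermitSketch(1);
    System.out.println(ns.acquireHandler(10)); // true: one permit available
    System.out.println(ns.acquireHandler(10)); // false: times out
    ns.releaseHandler();
  }
}
```

Per the Semaphore javadoc, the timed tryAcquire honors the fairness setting, while the zero-argument tryAcquire() deliberately does not.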
  

> RBF: Improved isolation for downstream name nodes. {Static}
> ---
>
> Key: HDFS-14090
> URL: https://issues.apache.org/jira/browse/HDFS-14090
> Project: Hadoop HDFS
>  Issue Type: New Feature
>Reporter: CR Hota
>Assignee: CR Hota
>Priority: Major
> Attachments: HDFS-14090-HDFS-13891.001.patch, 
> HDFS-14090-HDFS-13891.002.patch, HDFS-14090-HDFS-13891.003.patch, 
> HDFS-14090-HDFS-13891.004.patch, HDFS-14090-HDFS-13891.005.patch, 
> HDFS-14090.006.patch, HDFS-14090.007.patch, HDFS-14090.008.patch, 
> HDFS-14090.009.patch, HDFS-14090.010.patch, HDFS-14090.011.patch, 
> HDFS-14090.012.patch, HDFS-14090.013.patch, HDFS-14090.014.patch, RBF_ 
> Isolation design.pdf
>
>
> Router is a gateway to underlying name nodes. Gateway architectures should 
> help minimize the impact on clients connecting to healthy clusters vs unhealthy 
> clusters.
> For example - if there are 2 name nodes downstream, and one of them is 
> heavily loaded with calls spiking RPC queue times, due to back pressure the 
> same will start reflecting on the router. As a result of this, clients 
> connecting to healthy/faster name nodes will also slow down, as the same RPC queue 
> is maintained for all calls at the router layer. Essentially the same IPC 
> thread pool is used by the router to connect to all name nodes.
> Currently the router uses one single RPC queue for all calls. Let's discuss how we 
> can change the architecture and add some throttling logic for 
> unhealthy/slow/overloaded name nodes.
> One way could be to read from current call queue, immediately identify 
> downstream name node and maintain a separate queue for each underlying name 
> node. Another simpler way is to maintain some sort of rate limiter configured 
> for each name node and let routers drop/reject/send error requests after 
> certain threshold. 
> This won’t be a simple change as router’s ‘Server’ layer would need redesign 
> and implementation. Currently this layer is the same as name node.
> Opening this ticket to discuss, design and implement this feature.
>  



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-14284) RBF: Log Router identifier when reporting exceptions

2019-10-01 Thread Brahma Reddy Battula (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-14284?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16942294#comment-16942294
 ] 

Brahma Reddy Battula commented on HDFS-14284:
-

Agree with [~crh]. Maybe we can handle that in a separate Jira and handle only 
"NoNamenodesAvailableException" here?

 

I have one question here: what are we going to achieve by adding the router ID to 
the following? Will the client retry another router? (And won't it be incompatible 
if there is some automation parsing the message?)
Exception in thread "main" 
org.apache.hadoop.ipc.RemoteException(java.io.IOException): No namenode 
available under nameservice ns0

> RBF: Log Router identifier when reporting exceptions
> 
>
> Key: HDFS-14284
> URL: https://issues.apache.org/jira/browse/HDFS-14284
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Reporter: Íñigo Goiri
>Assignee: hemanthboyina
>Priority: Major
> Attachments: HDFS-14284.001.patch, HDFS-14284.002.patch
>
>
> The typical setup is to use multiple Routers through 
> ConfiguredFailoverProxyProvider.
> In a regular HA Namenode setup, it is easy to know which NN was used.
> However, in RBF, any Router can be the one reporting the exception and it is 
> hard to know which was the one.
> We should have a way to identify which Router/Namenode was the one triggering 
> the exception.
> This would also apply with Observer Namenodes.






[jira] [Commented] (HDFS-10303) DataStreamer#ResponseProcessor calculates packet ack latency incorrectly.

2019-10-01 Thread Brahma Reddy Battula (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-10303?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16942254#comment-16942254
 ] 

Brahma Reddy Battula commented on HDFS-10303:
-

{quote} - The existing log level may not be ideal. This is directly visible to 
users and unless something actually failed, it is better to log at 
{{INFO}}.{quote}
 

As per the description below, I feel this log level should be "WARN". Please correct 
me if I am wrong.
 * *Info* - Generally useful information to log (service start/stop, 
configuration assumptions, etc). Info I want to always have available but 
usually don't care about under normal circumstances. This is my out-of-the-box 
config level.
 * *Warn* - Anything that can potentially cause application oddities, but for 
which I am automatically recovering. (Such as switching from a primary to 
backup server, retrying an operation, missing secondary data, etc.)

> DataStreamer#ResponseProcessor calculates packet ack latency incorrectly.
> -
>
> Key: HDFS-10303
> URL: https://issues.apache.org/jira/browse/HDFS-10303
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: hdfs-client
>Affects Versions: 2.7.2
>Reporter: Surendra Singh Lilhore
>Assignee: Surendra Singh Lilhore
>Priority: Major
> Fix For: 2.8.0, 3.0.0-alpha1
>
> Attachments: HDFS-10303-001.patch, HDFS-10303-002.patch
>
>
> Packets acknowledge duration should be calculated based on the packet send 
> time.






[jira] [Commented] (HDFS-14878) DataStreamer's ResponseProceesor#run() should log with Warn loglevel

2019-10-01 Thread Brahma Reddy Battula (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-14878?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16942252#comment-16942252
 ] 

Brahma Reddy Battula commented on HDFS-14878:
-

IMO, this log level should be "WARN" so that users can look into the system. Usually 
"INFO" logs are ignored.

As [~hemanthboyina] mentioned for the above case, the slow logs in BlockReceiver are 
"WARN", which indicates that something happened.

And if we set the log level to "INFO", WARN logs are also printed.
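A hedged sketch of the suggested change, with the message built separately from the logging call so the threshold logic is visible on its own; method and variable names are illustrative, not the actual DataStreamer code:

```java
import java.util.Arrays;
import java.util.logging.Logger;

// Illustrative sketch only: the slow-ack message from the quoted snippet,
// emitted at WARN level so operators notice it without enabling INFO.
public class SlowAckLogging {
    private static final Logger LOG =
        Logger.getLogger(SlowAckLogging.class.getName());

    /** Returns the slow-ack message, or null when under the threshold. */
    static String slowAckMessage(long durationMs, long thresholdMs,
                                 String block, String[] targets) {
        if (durationMs <= thresholdMs) {
            return null;
        }
        return "Slow ReadProcessor read fields for block " + block
            + " took " + durationMs + "ms (threshold=" + thresholdMs
            + "ms); targets: " + Arrays.asList(targets);
    }

    static void logIfSlow(long durationMs, long thresholdMs,
                          String block, String[] targets) {
        String msg = slowAckMessage(durationMs, thresholdMs, block, targets);
        if (msg != null) {
            LOG.warning(msg);  // WARN: a recoverable oddity worth attention
        }
    }

    public static void main(String[] args) {
        logIfSlow(120, 30, "blk_1001", new String[] {"dn1:9866", "dn2:9866"});
    }
}
```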

> DataStreamer's ResponseProceesor#run() should log with Warn loglevel
> 
>
> Key: HDFS-14878
> URL: https://issues.apache.org/jira/browse/HDFS-14878
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Reporter: hemanthboyina
>Assignee: hemanthboyina
>Priority: Major
> Attachments: HDFS-14878.001.patch
>
>
> {code:java}
>   if (duration > dfsclientSlowLogThresholdMs) {
> LOG.info("Slow ReadProcessor read fields for block " + block
>   + " took " + duration + "ms (threshold="
>   + dfsclientSlowLogThresholdMs + "ms); ack: " + ack
>   + ", targets: " + Arrays.asList(targets));
>   } {code}
> log level should be warn here






[jira] [Commented] (HDFS-14495) RBF: Duplicate FederationRPCMetrics

2019-10-01 Thread Brahma Reddy Battula (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-14495?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16942022#comment-16942022
 ] 

Brahma Reddy Battula commented on HDFS-14495:
-

IMO, we can go ahead with this change on trunk only, as it won't look good to have 
duplicate values.
{quote}Not sure what the guideline is for JMX, etc.
{quote}
 

Reference for [JMX 
Compatibility|https://hadoop.apache.org/docs/stable/hadoop-project-dist/hadoop-common/Compatibility.html#Metrics.2FJMX]

> RBF: Duplicate FederationRPCMetrics
> ---
>
> Key: HDFS-14495
> URL: https://issues.apache.org/jira/browse/HDFS-14495
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: metrics
>Reporter: Akira Ajisaka
>Assignee: hemanthboyina
>Priority: Major
>
> There are two FederationRPCMetrics displayed in the Web UI (http://<hostname>:<port>/jmx) and most of the metrics are the same.
> * FederationRPCMetrics via {{@Metrics}} and {{@Metric}} annotations
> * FederationRPCMetrics via registering FederationRPCMBean
> Can we remove {{@Metrics}} and {{@Metric}} annotations to remove duplication?






[jira] [Commented] (HDFS-14509) DN throws InvalidToken due to inequality of password when upgrade NN 2.x to 3.x

2019-09-27 Thread Brahma Reddy Battula (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-14509?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16939658#comment-16939658
 ] 

Brahma Reddy Battula commented on HDFS-14509:
-

[~shv] thanks for the explanation.
{quote}NN 3.x does not include storage types into block token until the upgrade 
is finalized.
 This will require changes on branch-3.x only.
{quote}
Yes, I was targeting existing 2.7 (2.8 or 2.6) versions so that they can upgrade 
smoothly (they can't port every issue before they plan an upgrade). 

 
{quote}As I said #2 seems more general, so let's just go with it. If nobody 
objects.
{quote}
This approach should also be fine; existing clusters might need to apply this 
patch before upgrading.
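The mismatch described in this issue can be illustrated with plain HMACs: the block token password is an HMAC over the serialized identifier bytes, so when the NN 3.x serializes fields that the DN 2.x identifier class drops on readFields(), the DN recomputes a different password. The field strings below are made up for illustration, and HmacSHA1 is used only as an example algorithm:

```java
import javax.crypto.Mac;
import javax.crypto.spec.SecretKeySpec;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Illustrative sketch of why checkAccess() throws InvalidToken across versions;
// not the actual BlockTokenSecretManager code.
public class TokenPasswordMismatch {
    static byte[] hmac(byte[] key, byte[] msg) throws Exception {
        Mac mac = Mac.getInstance("HmacSHA1");
        mac.init(new SecretKeySpec(key, "HmacSHA1"));
        return mac.doFinal(msg);
    }

    public static void main(String[] args) throws Exception {
        byte[] key = "shared-block-key".getBytes(StandardCharsets.UTF_8);
        // NN 3.x serializes the identifier with new fields appended.
        byte[] nnIdentifier =
            "user,blockId=1001,storageTypes=DISK".getBytes(StandardCharsets.UTF_8);
        // DN 2.x re-serializes only the fields its identifier class knows about.
        byte[] dnIdentifier =
            "user,blockId=1001".getBytes(StandardCharsets.UTF_8);

        byte[] tokenPassword = hmac(key, nnIdentifier); // sent along with the token
        byte[] recomputed = hmac(key, dnIdentifier);    // DN's retrievePassword(id)

        // The Arrays.equals comparison in checkAccess() fails -> InvalidToken.
        System.out.println(Arrays.equals(tokenPassword, recomputed)); // prints "false"
    }
}
```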

 

> DN throws InvalidToken due to inequality of password when upgrade NN 2.x to 
> 3.x
> ---
>
> Key: HDFS-14509
> URL: https://issues.apache.org/jira/browse/HDFS-14509
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: Yuxuan Wang
>Priority: Blocker
>  Labels: release-blocker
> Attachments: HDFS-14509-001.patch
>
>
> According to the doc, if we want to upgrade a cluster from 2.x to 3.x, we need 
> to upgrade the NN first. And there will be an intermediate state where the NN is 
> 3.x and the DN is 2.x. At that moment, if a client reads (or writes) a block, it 
> will get a block token from the NN and then deliver the token to the DN, which 
> verifies the token. But the verification in the code now is:
> {code:title=BlockTokenSecretManager.java|borderStyle=solid}
> public void checkAccess(...)
> {
> ...
> id.readFields(new DataInputStream(new 
> ByteArrayInputStream(token.getIdentifier(;
> ...
> if (!Arrays.equals(retrievePassword(id), token.getPassword())) {
>   throw new InvalidToken("Block token with " + id.toString()
>   + " doesn't have the correct token password");
> }
> }
> {code} 
> And {{retrievePassword(id)}} is:
> {code} 
> public byte[] retrievePassword(BlockTokenIdentifier identifier)
> {
> ...
> return createPassword(identifier.getBytes(), key.getKey());
> }
> {code} 
> So, if the NN's identifier adds new fields, the DN will lose the fields and 
> compute the wrong password.






[jira] [Commented] (HDFS-13891) HDFS RBF stabilization phase I

2019-09-12 Thread Brahma Reddy Battula (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-13891?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16928857#comment-16928857
 ] 

Brahma Reddy Battula commented on HDFS-13891:
-

Once again thanks to all.

> HDFS RBF stabilization phase I  
> 
>
> Key: HDFS-13891
> URL: https://issues.apache.org/jira/browse/HDFS-13891
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Affects Versions: 3.2.0
>Reporter: Brahma Reddy Battula
>Priority: Major
>  Labels: RBF
>
> RBF (Router Based Federation) shipped in 3.0+ and 2.9.
> Now that it is out, various corner cases, scale and error handling issues are 
> surfacing.
> And we are targeting the security feature (HDFS-13532) also.
> This umbrella is to fix all those issues and support missing 
> protocols (HDFS-13655) before the next 3.3 release.






[jira] [Commented] (HDFS-13891) HDFS RBF stabilization phase I

2019-09-12 Thread Brahma Reddy Battula (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-13891?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16928856#comment-16928856
 ] 

Brahma Reddy Battula commented on HDFS-13891:
-

Hopefully I have set the fix version for all the Jiras under this umbrella, hence 
going to close this now.

> HDFS RBF stabilization phase I  
> 
>
> Key: HDFS-13891
> URL: https://issues.apache.org/jira/browse/HDFS-13891
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Affects Versions: 3.2.0
>Reporter: Brahma Reddy Battula
>Priority: Major
>  Labels: RBF
>
> RBF (Router Based Federation) shipped in 3.0+ and 2.9.
> Now that it is out, various corner cases, scale and error handling issues are 
> surfacing.
> And we are targeting the security feature (HDFS-13532) also.
> This umbrella is to fix all those issues and support missing 
> protocols (HDFS-13655) before the next 3.3 release.






[jira] [Updated] (HDFS-14526) RBF: Update the document of RBF related metrics

2019-09-12 Thread Brahma Reddy Battula (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-14526?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Brahma Reddy Battula updated HDFS-14526:

Fix Version/s: 3.3.0

> RBF: Update the document of RBF related metrics
> ---
>
> Key: HDFS-14526
> URL: https://issues.apache.org/jira/browse/HDFS-14526
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: documentation
>Reporter: Takanobu Asanuma
>Assignee: Takanobu Asanuma
>Priority: Major
>  Labels: RBF
> Fix For: 3.3.0, HDFS-13891
>
> Attachments: HDFS-14526-HDFS-13891.1.patch, 
> HDFS-14526-HDFS-13891.2.patch, HDFS-14526-HDFS-13891.3.patch, 
> federationmetrics_v1.png
>
>
> This is a follow-on task of HDFS-14508. We need to update 
> {{HDFSRouterFederation.md#Metrics}} and {{Metrics.md}}.






[jira] [Updated] (HDFS-14545) RBF: Router should support GetUserMappingsProtocol

2019-09-12 Thread Brahma Reddy Battula (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-14545?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Brahma Reddy Battula updated HDFS-14545:

Fix Version/s: 3.3.0

> RBF: Router should support GetUserMappingsProtocol
> --
>
> Key: HDFS-14545
> URL: https://issues.apache.org/jira/browse/HDFS-14545
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Reporter: Íñigo Goiri
>Assignee: Ayush Saxena
>Priority: Major
> Fix For: 3.3.0, HDFS-13891
>
> Attachments: HDFS-14545-HDFS-13891-01.patch, 
> HDFS-14545-HDFS-13891-02.patch, HDFS-14545-HDFS-13891-03.patch, 
> HDFS-14545-HDFS-13891-04.patch, HDFS-14545-HDFS-13891-05.patch, 
> HDFS-14545-HDFS-13891-06.patch, HDFS-14545-HDFS-13891-07.patch, 
> HDFS-14545-HDFS-13891-08.patch, HDFS-14545-HDFS-13891-09.patch, 
> HDFS-14545-HDFS-13891-10.patch, HDFS-14545-HDFS-13891.000.patch
>
>
> We should be able to check the groups for a user from a Router.






[jira] [Updated] (HDFS-14550) RBF: Failed to get statistics from NameNodes before 2.9.0

2019-09-12 Thread Brahma Reddy Battula (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-14550?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Brahma Reddy Battula updated HDFS-14550:

Fix Version/s: 3.3.0

> RBF: Failed to get statistics from NameNodes before 2.9.0
> -
>
> Key: HDFS-14550
> URL: https://issues.apache.org/jira/browse/HDFS-14550
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Reporter: Akira Ajisaka
>Assignee: He Xiaoqiao
>Priority: Major
> Fix For: 3.3.0, HDFS-13891
>
> Attachments: HDFS-14550-HDFS-13891.001.patch
>
>
> DFSRouter fails to get stats from NameNodes that do not have HDFS-7877
> {noformat}
> 2019-06-03 17:40:15,407 ERROR 
> org.apache.hadoop.hdfs.server.federation.router.NamenodeHeartbeatService: 
> Cannot get stat from nn1:nn01:8022 using JMX
> org.codehaus.jettison.json.JSONException: 
> JSONObject["NumInMaintenanceLiveDataNodes"] not found.
> at org.codehaus.jettison.json.JSONObject.get(JSONObject.java:360)
> at org.codehaus.jettison.json.JSONObject.getInt(JSONObject.java:421)
> at 
> org.apache.hadoop.hdfs.server.federation.router.NamenodeHeartbeatService.updateJMXParameters(NamenodeHeartbeatService.java:345)
> at 
> org.apache.hadoop.hdfs.server.federation.router.NamenodeHeartbeatService.getNamenodeStatusReport(NamenodeHeartbeatService.java:278)
> at 
> org.apache.hadoop.hdfs.server.federation.router.NamenodeHeartbeatService.updateState(NamenodeHeartbeatService.java:206)
> at 
> org.apache.hadoop.hdfs.server.federation.router.NamenodeHeartbeatService.periodicInvoke(NamenodeHeartbeatService.java:160)
> at 
> org.apache.hadoop.hdfs.server.federation.router.PeriodicService$1.run(PeriodicService.java:178)
> at 
> java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
> at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:308)
> at 
> java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:180)
> at 
> java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:294)
> at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
> at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
> at java.lang.Thread.run(Thread.java:748)
> {noformat}






[jira] [Updated] (HDFS-14508) RBF: Clean-up and refactor UI components

2019-09-12 Thread Brahma Reddy Battula (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-14508?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Brahma Reddy Battula updated HDFS-14508:

Fix Version/s: 3.3.0

> RBF: Clean-up and refactor UI components
> 
>
> Key: HDFS-14508
> URL: https://issues.apache.org/jira/browse/HDFS-14508
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Reporter: CR Hota
>Assignee: Takanobu Asanuma
>Priority: Minor
> Fix For: 3.3.0, HDFS-13891
>
> Attachments: HDFS-14508-HDFS-13891.1.patch, 
> HDFS-14508-HDFS-13891.2.patch, HDFS-14508-HDFS-13891.3.patch, 
> HDFS-14508-HDFS-13891.4.patch, HDFS-14508-HDFS-13891.5.patch
>
>
> Router UI has tags that are not used or incorrectly set. The code should be 
> cleaned-up.
> One such example is 
> Path : 
> (\hadoop-hdfs-project\hadoop-hdfs-rbf\src\main\webapps\router\federationhealth.js)
> {code:java}
> {"name": "routerstat", "url": 
> "/jmx?qry=Hadoop:service=NameNode,name=NameNodeStatus"},{code}






[jira] [Updated] (HDFS-13480) RBF: Separate namenodeHeartbeat and routerHeartbeat to different config key

2019-09-12 Thread Brahma Reddy Battula (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-13480?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Brahma Reddy Battula updated HDFS-13480:

Fix Version/s: 3.3.0

> RBF: Separate namenodeHeartbeat and routerHeartbeat to different config key
> ---
>
> Key: HDFS-13480
> URL: https://issues.apache.org/jira/browse/HDFS-13480
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Reporter: maobaolong
>Assignee: Ayush Saxena
>Priority: Major
> Fix For: 3.3.0, HDFS-13891
>
> Attachments: HDFS-13480-HDFS-13891-05.patch, 
> HDFS-13480-HDFS-13891-06.patch, HDFS-13480-HDFS-13891-07.patch, 
> HDFS-13480-HDFS-13891-08.patch, HDFS-13480.001.patch, HDFS-13480.002.patch, 
> HDFS-13480.002.patch, HDFS-13480.003.patch, HDFS-13480.004.patch
>
>
> Now, if I enable heartbeat.enable but do not want to monitor any 
> namenode, I get an ERROR log like:
> {code:java}
> [2018-04-19T14:00:03.057+08:00] [ERROR] 
> federation.router.Router.serviceInit(Router.java 214) [main] : Heartbeat is 
> enabled but there are no namenodes to monitor
> {code}
> And if I disable heartbeat.enable, we cannot get any mount table updates, 
> because of the following logic in Router.java:
> {code:java}
> if (conf.getBoolean(
> RBFConfigKeys.DFS_ROUTER_HEARTBEAT_ENABLE,
> RBFConfigKeys.DFS_ROUTER_HEARTBEAT_ENABLE_DEFAULT)) {
>   // Create status updater for each monitored Namenode
>   this.namenodeHeartbeatServices = createNamenodeHeartbeatServices();
>   for (NamenodeHeartbeatService hearbeatService :
>   this.namenodeHeartbeatServices) {
> addService(hearbeatService);
>   }
>   if (this.namenodeHeartbeatServices.isEmpty()) {
> LOG.error("Heartbeat is enabled but there are no namenodes to 
> monitor");
>   }
>   // Periodically update the router state
>   this.routerHeartbeatService = new RouterHeartbeatService(this);
>   addService(this.routerHeartbeatService);
> }
> {code}
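The separation asked for above can be sketched with two independent boolean keys, so the router heartbeat can stay enabled while namenode monitoring is disabled. This is a standalone sketch with a Map standing in for Hadoop's Configuration; the key names are illustrative, not necessarily the final RBFConfigKeys constants:

```java
import java.util.HashMap;
import java.util.Map;

// Illustrative sketch of splitting the single heartbeat.enable key into two,
// mirroring the Router.serviceInit() logic quoted above.
public class HeartbeatConfigSplit {
    static boolean getBoolean(Map<String, String> conf, String key, boolean dflt) {
        return conf.containsKey(key) ? Boolean.parseBoolean(conf.get(key)) : dflt;
    }

    public static void main(String[] args) {
        Map<String, String> conf = new HashMap<>();
        // Monitor no namenodes, without losing router state updates.
        conf.put("dfs.federation.router.namenode.heartbeat.enable", "false");

        if (getBoolean(conf, "dfs.federation.router.namenode.heartbeat.enable", true)) {
            System.out.println("starting NamenodeHeartbeatServices");
        }
        // Router heartbeat no longer depends on namenode monitoring being on.
        if (getBoolean(conf, "dfs.federation.router.heartbeat.enable", true)) {
            System.out.println("starting RouterHeartbeatService");
        }
    }
}
```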






[jira] [Updated] (HDFS-14516) RBF: Create hdfs-rbf-site.xml for RBF specific properties

2019-09-12 Thread Brahma Reddy Battula (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-14516?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Brahma Reddy Battula updated HDFS-14516:

Fix Version/s: 3.3.0

> RBF: Create hdfs-rbf-site.xml for RBF specific properties
> -
>
> Key: HDFS-14516
> URL: https://issues.apache.org/jira/browse/HDFS-14516
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: rbf
>Reporter: Takanobu Asanuma
>Assignee: Takanobu Asanuma
>Priority: Major
> Fix For: 3.3.0, HDFS-13891
>
> Attachments: HDFS-14516.1.patch, HDFS-14516.2.patch
>
>
> Currently, users write RBF properties in {{hdfs-site.xml}} though the 
> definitions are in {{hdfs-rbf-default.xml}}. Like other modules, it would be 
> better if there were a specific configuration file, {{hdfs-rbf-site.xml}}.
> {{hdfs-rbf-default.xml}} should also be loaded when it exists in the 
> configuration directory. It is just a document at the moment.
> There is an early discussion in HDFS-13215.






[jira] [Updated] (HDFS-14490) RBF: Remove unnecessary quota checks

2019-09-12 Thread Brahma Reddy Battula (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-14490?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Brahma Reddy Battula updated HDFS-14490:

Fix Version/s: 3.3.0

> RBF: Remove unnecessary quota checks
> 
>
> Key: HDFS-14490
> URL: https://issues.apache.org/jira/browse/HDFS-14490
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Reporter: Ayush Saxena
>Assignee: Ayush Saxena
>Priority: Major
> Fix For: 3.3.0, HDFS-13891
>
> Attachments: HDFS-14490-HDFS-13891-01.patch, 
> HDFS-14490-HDFS-13891-02.patch, HDFS-14490-HDFS-13891-03.patch
>
>
> Remove unnecessary quota checks for unrelated operations such as setEcPolicy, 
> getEcPolicy and similar  






[jira] [Updated] (HDFS-14457) RBF: Add order text SPACE in CLI command 'hdfs dfsrouteradmin'

2019-09-12 Thread Brahma Reddy Battula (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-14457?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Brahma Reddy Battula updated HDFS-14457:

Fix Version/s: 3.3.0

> RBF: Add order text SPACE in CLI command 'hdfs dfsrouteradmin'
> --
>
> Key: HDFS-14457
> URL: https://issues.apache.org/jira/browse/HDFS-14457
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: rbf
>Affects Versions: HDFS-13891
>Reporter: luhuachao
>Assignee: luhuachao
>Priority: Major
>  Labels: RBF
> Fix For: 3.3.0, HDFS-13891
>
> Attachments: HDFS-14457-HDFS-13891-01.patch, 
> HDFS-14457-HDFS-13891-02.patch, HDFS-14457.01.patch
>
>
> When executing the CLI command 'hdfs dfsrouteradmin', the text for -order does 
> not contain SPACE.






[jira] [Updated] (HDFS-14454) RBF: getContentSummary() should allow non-existing folders

2019-09-12 Thread Brahma Reddy Battula (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-14454?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Brahma Reddy Battula updated HDFS-14454:

Fix Version/s: 3.3.0

> RBF: getContentSummary() should allow non-existing folders
> --
>
> Key: HDFS-14454
> URL: https://issues.apache.org/jira/browse/HDFS-14454
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Reporter: Íñigo Goiri
>Assignee: Íñigo Goiri
>Priority: Major
> Fix For: 3.3.0, HDFS-13891
>
> Attachments: HDFS-14454-HDFS-13891.000.patch, 
> HDFS-14454-HDFS-13891.001.patch, HDFS-14454-HDFS-13891.002.patch, 
> HDFS-14454-HDFS-13891.003.patch, HDFS-14454-HDFS-13891.004.patch, 
> HDFS-14454-HDFS-13891.005.patch, HDFS-14454-HDFS-13891.006.patch
>
>
> We have a mount point with HASH_ALL and one of the subclusters does not 
> contain the folder.
> In this case, getContentSummary() throws FileNotFoundException.






[jira] [Updated] (HDFS-14447) RBF: Router should support RefreshUserMappingsProtocol

2019-09-12 Thread Brahma Reddy Battula (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-14447?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Brahma Reddy Battula updated HDFS-14447:

Fix Version/s: 3.3.0

> RBF: Router should support RefreshUserMappingsProtocol
> --
>
> Key: HDFS-14447
> URL: https://issues.apache.org/jira/browse/HDFS-14447
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: rbf
>Affects Versions: 3.1.0
>Reporter: Shen Yinjie
>Assignee: Shen Yinjie
>Priority: Major
> Fix For: 3.3.0, HDFS-13891
>
> Attachments: HDFS-14447-HDFS-13891.01.patch, 
> HDFS-14447-HDFS-13891.02.patch, HDFS-14447-HDFS-13891.03.patch, 
> HDFS-14447-HDFS-13891.04.patch, HDFS-14447-HDFS-13891.05.patch, 
> HDFS-14447-HDFS-13891.06.patch, HDFS-14447-HDFS-13891.07.patch, 
> HDFS-14447-HDFS-13891.08.patch, HDFS-14447-HDFS-13891.09.patch, error.png
>
>
> HDFS with RBF
> We configure hadoop.proxyuser.xx.yy, then execute hdfs dfsadmin 
> -Dfs.defaultFS=hdfs://router-fed -refreshSuperUserGroupsConfiguration,
> and it throws "Unknown protocol: ...RefreshUserMappingProtocol".
> RouterAdminServer should support RefreshUserMappingsProtocol, or a proxyuser 
> client would be refused when impersonating, as shown in the screenshot.






[jira] [Updated] (HDFS-14351) RBF: Optimize configuration item resolving for monitor namenode

2019-09-12 Thread Brahma Reddy Battula (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-14351?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Brahma Reddy Battula updated HDFS-14351:

Fix Version/s: 3.3.0

> RBF: Optimize configuration item resolving for monitor namenode
> ---
>
> Key: HDFS-14351
> URL: https://issues.apache.org/jira/browse/HDFS-14351
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: rbf
>Reporter: He Xiaoqiao
>Assignee: He Xiaoqiao
>Priority: Major
> Fix For: 3.3.0, HDFS-13891
>
> Attachments: HDFS-14351-HDFS-13891.001.patch, 
> HDFS-14351-HDFS-13891.002.patch, HDFS-14351-HDFS-13891.003.patch, 
> HDFS-14351-HDFS-13891.004.patch, HDFS-14351-HDFS-13891.005.patch, 
> HDFS-14351-HDFS-13891.006.patch, HDFS-14351.001.patch, HDFS-14351.002.patch
>
>
> We invoke {{configuration.get}} to resolve the configuration item 
> `dfs.federation.router.monitor.namenode` in `Router.java`, then split the 
> value by comma to get nsId and nnId. This may confuse users, since it is not 
> tolerant of blank spaces while other common parameters are. The 
> following segment shows an example where resolving fails.
> {code:java}
>   
> dfs.federation.router.monitor.namenode
> nameservice1.nn1, nameservice1.nn2
> 
>   The identifier of the namenodes to monitor and heartbeat.
> 
>   
> {code}
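The failure above comes from splitting on "," without trimming the pieces. A whitespace-tolerant parse along the lines of what Hadoop's Configuration.getTrimmedStrings() does can be sketched standalone (this is not the Router.java change itself; names are illustrative):

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative sketch: parse a comma-separated "nsId.nnId" list while
// tolerating blank space around the commas, as in the XML value above.
public class MonitorNamenodeParser {
    static List<String[]> parse(String value) {
        List<String[]> result = new ArrayList<>();
        for (String entry : value.split(",")) {
            String trimmed = entry.trim();            // tolerate "ns1.nn1, ns1.nn2"
            if (trimmed.isEmpty()) {
                continue;                             // skip empty segments
            }
            // Split "nsId.nnId" into at most two parts on the first dot.
            result.add(trimmed.split("\\.", 2));
        }
        return result;
    }

    public static void main(String[] args) {
        for (String[] nn : parse("nameservice1.nn1, nameservice1.nn2")) {
            System.out.println(nn[0] + " / " + nn[1]);
        }
    }
}
```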






[jira] [Updated] (HDFS-14369) RBF: Fix trailing "/" for webhdfs

2019-09-12 Thread Brahma Reddy Battula (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-14369?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Brahma Reddy Battula updated HDFS-14369:

Fix Version/s: 3.3.0

> RBF: Fix trailing "/" for webhdfs
> -
>
> Key: HDFS-14369
> URL: https://issues.apache.org/jira/browse/HDFS-14369
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Reporter: CR Hota
>Assignee: Akira Ajisaka
>Priority: Major
> Fix For: 3.3.0, HDFS-13891
>
> Attachments: HDFS-14369-HDFS-13891-regressiontest-001.patch, 
> HDFS-14369-HDFS-13891.001.patch, HDFS-14369-HDFS-13891.002.patch, 
> HDFS-14369-HDFS-13891.003.patch, HDFS-14369-HDFS-13891.004.patch, 
> HDFS-14369-HDFS-13891.005.patch, HDFS-14369-HDFS-13891.006.patch
>
>
> WebHDFS doesn't trim the trailing slash, causing a discrepancy in operations.
> An example is below.
> --
> Using HDFS API, two directory are listed.
> {code}
> $ hdfs dfs -ls hdfs://:/tmp/
> Found 2 items
> drwxrwxrwx   - hdfs supergroup  0 2018-11-09 17:50 
> hdfs://:/tmp/tmp1
> drwxrwxrwx   - hdfs supergroup  0 2018-11-09 17:50 
> hdfs://:/tmp/tmp2
> {code}
> Using WebHDFS API, only one directory is listed.
> {code}
> $ curl -u : --negotiate -i 
> "http://:50071/webhdfs/v1/tmp/?op=LISTSTATUS"
> (snip)
> {"FileStatuses":{"FileStatus":[
> {"accessTime":0,"blockSize":0,"childrenNum":0,"fileId":16387,"group":"supergroup","length":0,"modificationTime":1552016766769,"owner":"hdfs","pathSuffix":"tmp1","permission":"755","replication":0,"storagePolicy":0,"type":"DIRECTORY"}
> ]}}
> {code}
> The mount table is as follows:
> {code}
> $ hdfs dfsrouteradmin -ls /tmp
> Mount Table Entries:
> SourceDestinations  Owner 
> Group Mode  Quota/Usage  
> /tmp  ns1->/tmp aajisaka  
> users rwxr-xr-x [NsQuota: -/-, SsQuota: 
> -/-]
> /tmp/tmp1 ns1->/tmp/tmp1aajisaka  
> users rwxr-xr-x [NsQuota: -/-, SsQuota: 
> -/-]
> /tmp/tmp2 ns2->/tmp/tmp2aajisaka  
> users rwxr-xr-x [NsQuota: -/-, SsQuota: 
> -/-]
> {code}
> Without a trailing slash, two directories are listed.
> {code}
> $ curl -u : --negotiate -i 
> "http://:50071/webhdfs/v1/tmp?op=LISTSTATUS"
> (snip)
> {"FileStatuses":{"FileStatus":[
> {"accessTime":1541753421917,"blockSize":0,"childrenNum":0,"fileId":0,"group":"supergroup","length":0,"modificationTime":1541753421917,"owner":"hdfs","pathSuffix":"tmp1","permission":"777","replication":0,"storagePolicy":0,"symlink":"","type":"DIRECTORY"},
> {"accessTime":1541753429812,"blockSize":0,"childrenNum":0,"fileId":0,"group":"supergroup","length":0,"modificationTime":1541753429812,"owner":"hdfs","pathSuffix":"tmp2","permission":"777","replication":0,"storagePolicy":0,"symlink":"","type":"DIRECTORY"}
> ]}}
> {code}
> [~ajisakaa] Thanks for reporting this, I borrowed the text from 
> HDFS-13972
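The fix amounts to normalizing the request path before mount-table resolution, so that "/tmp/" and "/tmp" resolve to the same set of mount entries. A minimal sketch, assuming a hypothetical helper (`PathNormalizer` and `trimTrailingSlash` are illustrative names, not the actual RBF code):

```java
// Hypothetical helper illustrating the fix: trim trailing slashes from a
// WebHDFS path before mount-table resolution, so "/tmp/" and "/tmp" are
// treated identically. The root path "/" is kept intact.
public class PathNormalizer {
    static String trimTrailingSlash(String path) {
        int end = path.length();
        // Strip one or more trailing slashes, but never reduce "/" to "".
        while (end > 1 && path.charAt(end - 1) == '/') {
            end--;
        }
        return path.substring(0, end);
    }

    public static void main(String[] args) {
        System.out.println(trimTrailingSlash("/tmp/")); // /tmp
        System.out.println(trimTrailingSlash("/tmp"));  // /tmp
        System.out.println(trimTrailingSlash("/"));     // /
    }
}
```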



--
This message was sent by Atlassian Jira
(v8.3.2#803003)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-14440) RBF: Optimize the file write process in case of multiple destinations.

2019-09-12 Thread Brahma Reddy Battula (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-14440?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Brahma Reddy Battula updated HDFS-14440:

Fix Version/s: 3.3.0

> RBF: Optimize the file write process in case of multiple destinations.
> --
>
> Key: HDFS-14440
> URL: https://issues.apache.org/jira/browse/HDFS-14440
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Reporter: Ayush Saxena
>Assignee: Ayush Saxena
>Priority: Major
> Fix For: 3.3.0, HDFS-13891
>
> Attachments: HDFS-14440-HDFS-13891-01.patch, 
> HDFS-14440-HDFS-13891-02.patch, HDFS-14440-HDFS-13891-03.patch, 
> HDFS-14440-HDFS-13891-04.patch, HDFS-14440-HDFS-13891-05.patch, 
> HDFS-14440-HDFS-13891-06.patch
>
>
> In the case of multiple destinations, we need to check whether the file 
> already exists in one of the subclusters, for which we use the existing 
> getBlockLocation() API, which is by default a sequential call.
> In the common scenario where the file needs to be created, each subcluster is 
> currently checked sequentially; this can be done concurrently to save time.
> In the other case, where the file is found but its last block is null, we 
> need to call getFileInfo() on all the locations to find where the file 
> exists. This too can be avoided by using a concurrent call, since we already 
> have the remote location for which getBlockLocation() returned a non-null 
> entry.
>  
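The sequential-vs-concurrent existence check described above can be sketched with a plain `ExecutorService`; `existsInAny` and `checkExists` are illustrative stand-ins, not the actual `RouterRpcClient` API:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.function.Predicate;

// Sketch of replacing a sequential per-subcluster existence check with a
// concurrent one. checkExists is a stand-in for the per-subcluster
// getBlockLocations() call.
public class ConcurrentExistenceCheck {
    static boolean existsInAny(List<String> subclusters, Predicate<String> checkExists) {
        ExecutorService pool = Executors.newFixedThreadPool(subclusters.size());
        try {
            // Fan out: issue all subcluster checks at once.
            List<Future<Boolean>> results = new ArrayList<>();
            for (String ns : subclusters) {
                results.add(pool.submit(() -> checkExists.test(ns)));
            }
            // Fan in: the file exists if any subcluster reported it.
            for (Future<Boolean> f : results) {
                if (f.get()) {
                    return true;
                }
            }
            return false;
        } catch (InterruptedException | ExecutionException e) {
            throw new RuntimeException(e);
        } finally {
            pool.shutdownNow();
        }
    }

    public static void main(String[] args) {
        boolean found = existsInAny(List.of("ns1", "ns2"), ns -> ns.equals("ns2"));
        System.out.println(found); // true
    }
}
```

The latency of the check then tracks the slowest subcluster rather than the sum of all of them.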






[jira] [Updated] (HDFS-14422) RBF: Router shouldn't allow READ operations in safe mode

2019-09-12 Thread Brahma Reddy Battula (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-14422?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Brahma Reddy Battula updated HDFS-14422:

Fix Version/s: 3.3.0

> RBF: Router shouldn't allow READ operations in safe mode
> 
>
> Key: HDFS-14422
> URL: https://issues.apache.org/jira/browse/HDFS-14422
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Reporter: Íñigo Goiri
>Assignee: Íñigo Goiri
>Priority: Major
> Fix For: 3.3.0, HDFS-13891
>
> Attachments: HDFS-14422-HDFS-13891.000.patch, 
> HDFS-14422-HDFS-13891.001.patch
>
>
> We are currently seeing:
> org.apache.hadoop.hdfs.server.federation.store.StateStoreUnavailableException:
>  Mount Table not initialized
>   at 
> org.apache.hadoop.hdfs.server.federation.resolver.MountTableResolver.verifyMountTable(MountTableResolver.java:521)
>   at 
> org.apache.hadoop.hdfs.server.federation.resolver.MountTableResolver.getDestinationForPath(MountTableResolver.java:394)
>   at 
> org.apache.hadoop.hdfs.server.federation.resolver.MultipleDestinationMountTableResolver.getDestinationForPath(MultipleDestinationMountTableResolver.java:87)
>   at 
> org.apache.hadoop.hdfs.server.federation.router.RouterRpcServer.getLocationsForPath(RouterRpcServer.java:1258)
>   at 
> org.apache.hadoop.hdfs.server.federation.router.RouterClientProtocol.getFileInfo(RouterClientProtocol.java:747)
>   at 
> org.apache.hadoop.hdfs.server.federation.router.RouterRpcServer.getFileInfo(RouterRpcServer.java:749)
>   at 
> org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.getFileInfo(ClientNamenodeProtocolServerSideTranslatorPB.java:881)
>   at 
> org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)
>   at 
> org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:513)
>   at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:1011)
>   at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:871)
>   at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:817)
>   at java.security.AccessController.doPrivileged(Native Method)
>   at javax.security.auth.Subject.doAs(Subject.java:422)
>   at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1915)
>   at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2621)
> The Namenode allows READ operations in safe mode, but when the Router cannot 
> access the State Store, read operations are affected as well.






[jira] [Updated] (HDFS-14388) RBF: Prevent loading metric system when disabled

2019-09-12 Thread Brahma Reddy Battula (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-14388?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Brahma Reddy Battula updated HDFS-14388:

Fix Version/s: 3.3.0

> RBF: Prevent loading metric system when disabled
> 
>
> Key: HDFS-14388
> URL: https://issues.apache.org/jira/browse/HDFS-14388
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Reporter: Íñigo Goiri
>Assignee: Íñigo Goiri
>Priority: Major
> Fix For: 3.3.0, HDFS-13891
>
> Attachments: HDFS-14388-HDFS-13891.000.patch, 
> HDFS-14388-HDFS-13891.001.patch
>
>
> Currently, the Router and the State Store try to initialize the metrics even 
> when they are explicitly disabled. This produces a lot of verbose logs in 
> tests without metrics.






[jira] [Updated] (HDFS-14343) RBF: Fix renaming folders spread across multiple subclusters

2019-09-12 Thread Brahma Reddy Battula (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-14343?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Brahma Reddy Battula updated HDFS-14343:

Fix Version/s: HDFS-13891
   3.3.0

> RBF: Fix renaming folders spread across multiple subclusters
> 
>
> Key: HDFS-14343
> URL: https://issues.apache.org/jira/browse/HDFS-14343
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Reporter: Íñigo Goiri
>Assignee: Ayush Saxena
>Priority: Major
> Fix For: 3.3.0, HDFS-13891
>
> Attachments: HDFS-14343-HDFS-13891-01.patch, 
> HDFS-14343-HDFS-13891-02.patch, HDFS-14343-HDFS-13891-03.patch, 
> HDFS-14343-HDFS-13891-04.patch, HDFS-14343-HDFS-13891-05.patch
>
>
> The {{RouterClientProtocol#rename()}} function assumes that we are renaming 
> files and only renames one of them (i.e., {{invokeSequential()}}). In the 
> case of folders which are in all subclusters (e.g., HASH_ALL) we should 
> rename all locations (i.e., {{invokeAll()}}).
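The two invocation strategies named in the description can be contrasted with a minimal sketch; the `invoke*` methods below are simplified stand-ins for the `RouterRpcClient` API, not the real signatures:

```java
import java.util.List;
import java.util.function.Predicate;

// invokeSequential() stops at the first successful location, which is wrong
// for a folder replicated across subclusters (e.g., HASH_ALL); invokeAll()
// applies the operation to every location.
public class RenameStrategy {
    static int invokeSequential(List<String> locations, Predicate<String> op) {
        for (String loc : locations) {
            if (op.test(loc)) {
                return 1; // first success wins; remaining locations untouched
            }
        }
        return 0;
    }

    static int invokeAll(List<String> locations, Predicate<String> op) {
        int applied = 0;
        for (String loc : locations) {
            if (op.test(loc)) {
                applied++; // apply to every location
            }
        }
        return applied;
    }

    public static void main(String[] args) {
        List<String> locations = List.of("ns0/dir", "ns1/dir");
        System.out.println(invokeSequential(locations, l -> true)); // 1
        System.out.println(invokeAll(locations, l -> true));        // 2
    }
}
```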






[jira] [Updated] (HDFS-14334) RBF: Use human readable format for long numbers in the Router UI

2019-09-12 Thread Brahma Reddy Battula (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-14334?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Brahma Reddy Battula updated HDFS-14334:

Fix Version/s: 3.3.0

> RBF: Use human readable format for long numbers in the Router UI
> 
>
> Key: HDFS-14334
> URL: https://issues.apache.org/jira/browse/HDFS-14334
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Reporter: Íñigo Goiri
>Assignee: Íñigo Goiri
>Priority: Major
> Fix For: 3.3.0, HDFS-13891
>
> Attachments: HDFS-14334-HDFS-13891.000.patch, 
> HDFS-14334-HDFS-13891.001.patch, block-files-numbers-after.png, 
> block-files-numbers.png
>
>
> Currently, for the number of files, we show the raw number. When it starts to 
> get into millions, it is hard to read. We should use a human readable version 
> similar to what we do with PB, GB, MB,...
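A human-readable rendering along these lines can be sketched as follows; `HumanReadable.format` is a hypothetical helper, and the real UI code may use different unit names and thresholds:

```java
import java.util.Locale;

// Sketch of rendering raw counts in a human-readable form for the UI,
// analogous to how byte sizes are shown as MB/GB/PB.
public class HumanReadable {
    private static final String[] UNITS = {"", "k", "M", "B", "T"};

    static String format(long value) {
        double v = value;
        int unit = 0;
        // Divide down by 1000 until the value fits in three digits.
        while (v >= 1000 && unit < UNITS.length - 1) {
            v /= 1000;
            unit++;
        }
        // Small values are shown exactly; larger ones with one decimal place.
        return unit == 0 ? Long.toString(value)
                         : String.format(Locale.ROOT, "%.1f%s", v, UNITS[unit]);
    }

    public static void main(String[] args) {
        System.out.println(format(999));        // 999
        System.out.println(format(3_200_000));  // 3.2M
    }
}
```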






[jira] [Updated] (HDFS-14331) RBF: IOE While Removing Mount Entry

2019-09-12 Thread Brahma Reddy Battula (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-14331?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Brahma Reddy Battula updated HDFS-14331:

Fix Version/s: 3.3.0

> RBF: IOE While Removing Mount Entry
> ---
>
> Key: HDFS-14331
> URL: https://issues.apache.org/jira/browse/HDFS-14331
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: rbf
>Affects Versions: 3.1.1
>Reporter: Surendra Singh Lilhore
>Assignee: Ayush Saxena
>Priority: Major
> Fix For: 3.3.0, HDFS-13891
>
> Attachments: HDFS-14331-HDFS-13891-01.patch, 
> HDFS-14331-HDFS-13891-02.patch, HDFS-14331-HDFS-13891-03.patch
>
>
> IOException while trying to remove the mount entry when the actual 
> destination doesn't exist.
> {noformat}
> java.io.IOException: Directory does not exist: /mount at 
> org.apache.hadoop.hdfs.server.namenode.INodeDirectory.valueOf(INodeDirectory.java:59)
>  at 
> org.apache.hadoop.hdfs.server.namenode.FSDirAttrOp.unprotectedSetQuota(FSDirAttrOp.java:334)
>  at 
> org.apache.hadoop.hdfs.server.namenode.FSDirAttrOp.setQuota(FSDirAttrOp.java:244)
>  at 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.setQuota(FSNamesystem.java:3352)
>  at 
> org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.setQuota(NameNodeRpcServer.java:1484)
>  at 
> org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.setQuota(ClientNamenodeProtocolServerSideTranslatorPB.java:1042)
>  at 
> org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java:37182)
>  at 
> org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:530)
>  at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:1070) at 
> org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:943) at 
> org.apache.hadoop.ipc.Server$Call.run(Server.java:1) at 
> javax.security.auth.Subject.doAs(Subject.java:422) at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1891)
>  at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2825)
> {noformat}






[jira] [Updated] (HDFS-14316) RBF: Support unavailable subclusters for mount points with multiple destinations

2019-09-12 Thread Brahma Reddy Battula (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-14316?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Brahma Reddy Battula updated HDFS-14316:

Fix Version/s: 3.3.0

> RBF: Support unavailable subclusters for mount points with multiple 
> destinations
> 
>
> Key: HDFS-14316
> URL: https://issues.apache.org/jira/browse/HDFS-14316
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Reporter: Íñigo Goiri
>Assignee: Íñigo Goiri
>Priority: Major
> Fix For: 3.3.0, HDFS-13891
>
> Attachments: HDFS-14316-HDFS-13891.000.patch, 
> HDFS-14316-HDFS-13891.001.patch, HDFS-14316-HDFS-13891.002.patch, 
> HDFS-14316-HDFS-13891.003.patch, HDFS-14316-HDFS-13891.004.patch, 
> HDFS-14316-HDFS-13891.005.patch, HDFS-14316-HDFS-13891.006.patch, 
> HDFS-14316-HDFS-13891.007.patch, HDFS-14316-HDFS-13891.008.patch, 
> HDFS-14316-HDFS-13891.009.patch, HDFS-14316-HDFS-13891.010.patch, 
> HDFS-14316-HDFS-13891.011.patch, HDFS-14316-HDFS-13891.012.patch, 
> HDFS-14316-HDFS-13891.013.patch, HDFS-14316-HDFS-13891.014.patch, 
> HDFS-14316-HDFS-13891.015.patch
>
>
> Currently mount points with multiple destinations (e.g., HASH_ALL) fail 
> writes when the destination subcluster is down. We need an option to allow 
> writing in other subclusters when one is down.






[jira] [Updated] (HDFS-14329) RBF: Add maintenance nodes to federation metrics

2019-09-12 Thread Brahma Reddy Battula (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-14329?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Brahma Reddy Battula updated HDFS-14329:

Fix Version/s: 3.3.0

> RBF: Add maintenance nodes to federation metrics
> 
>
> Key: HDFS-14329
> URL: https://issues.apache.org/jira/browse/HDFS-14329
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Reporter: Ayush Saxena
>Assignee: Ayush Saxena
>Priority: Major
> Fix For: 3.3.0, HDFS-13891
>
> Attachments: HDFS-14329-HDFS-13891-01.patch, 
> HDFS-14329-HDFS-13891-02.patch
>
>
> Extend datanode maintenance related metrics into federation metrics.






[jira] [Updated] (HDFS-14268) RBF: Fix the location of the DNs in getDatanodeReport()

2019-09-12 Thread Brahma Reddy Battula (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-14268?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Brahma Reddy Battula updated HDFS-14268:

Fix Version/s: 3.3.0

> RBF: Fix the location of the DNs in getDatanodeReport()
> ---
>
> Key: HDFS-14268
> URL: https://issues.apache.org/jira/browse/HDFS-14268
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Reporter: Íñigo Goiri
>Assignee: Íñigo Goiri
>Priority: Major
> Fix For: 3.3.0, HDFS-13891
>
> Attachments: HDFS-14268-HDFS-13891.000.patch, 
> HDFS-14268-HDFS-13891.001.patch, HDFS-14268-HDFS-13891.002.patch, 
> HDFS-14268-HDFS-13891.003.patch, HDFS-14268-HDFS-13891.004.patch
>
>
> When getting all the DNs in the federation, the Router queries each of the 
> subclusters and aggregates them assigning the subcluster id to the location. 
> This query uses a {{HashSet}} which provides a "random" order for the results.
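The ordering issue can be demonstrated with plain collections: `HashSet` gives no iteration-order guarantee, while an insertion-ordered set keeps the aggregated report deterministic. A sketch under those assumptions, not the Router code itself:

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

// HashSet iteration order is unspecified and can differ between runs or JVMs;
// LinkedHashSet preserves insertion order, making the DN listing stable.
public class SetOrdering {
    public static void main(String[] args) {
        List<String> dns = Arrays.asList("ns1/dn3", "ns0/dn1", "ns1/dn2");
        Set<String> hashed = new HashSet<>(dns);       // iteration order unspecified
        Set<String> linked = new LinkedHashSet<>(dns); // preserves insertion order
        System.out.println(linked); // [ns1/dn3, ns0/dn1, ns1/dn2]
    }
}
```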






[jira] [Updated] (HDFS-14230) RBF: Throw RetriableException instead of IOException when no namenodes available

2019-09-12 Thread Brahma Reddy Battula (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-14230?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Brahma Reddy Battula updated HDFS-14230:

Fix Version/s: 3.3.0

> RBF: Throw RetriableException instead of IOException when no namenodes 
> available
> 
>
> Key: HDFS-14230
> URL: https://issues.apache.org/jira/browse/HDFS-14230
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Affects Versions: 3.2.0, 3.1.1, 2.9.2, 3.0.3
>Reporter: Fei Hui
>Assignee: Fei Hui
>Priority: Major
> Fix For: 3.3.0, HDFS-13891
>
> Attachments: HDFS-14230-HDFS-13891.001.patch, 
> HDFS-14230-HDFS-13891.002.patch, HDFS-14230-HDFS-13891.003.patch, 
> HDFS-14230-HDFS-13891.004.patch, HDFS-14230-HDFS-13891.005.patch, 
> HDFS-14230-HDFS-13891.006.patch
>
>
> Failover usually happens when upgrading namenodes, and for some seconds there 
> are no active namenodes; accessing HDFS through the Router fails at that 
> moment. This can cause jobs to fail or hang. Some Hive job logs are as 
> follows:  
> {code:java}
> 2019-01-03 16:12:08,337 Stage-1 map = 100%,  reduce = 100%, Cumulative CPU 
> 133.33 sec
> MapReduce Total cumulative CPU time: 2 minutes 13 seconds 330 msec
> Ended Job = job_1542178952162_24411913
> Launching Job 4 out of 6
> Exception in thread "Thread-86" java.lang.RuntimeException: 
> org.apache.hadoop.ipc.RemoteException(java.io.IOException): No namenode 
> available under nameservice Cluster3
> at 
> org.apache.hadoop.hdfs.server.federation.router.RouterRpcClient.shouldRetry(RouterRpcClient.java:328)
> at 
> org.apache.hadoop.hdfs.server.federation.router.RouterRpcClient.invoke(RouterRpcClient.java:488)
> at 
> org.apache.hadoop.hdfs.server.federation.router.RouterRpcClient.invoke(RouterRpcClient.java:495)
> at 
> org.apache.hadoop.hdfs.server.federation.router.RouterRpcClient.invokeMethod(RouterRpcClient.java:385)
> at 
> org.apache.hadoop.hdfs.server.federation.router.RouterRpcClient.invokeSequential(RouterRpcClient.java:760)
> at 
> org.apache.hadoop.hdfs.server.federation.router.RouterRpcServer.getFileInfo(RouterRpcServer.java:1152)
> at 
> org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.getFileInfo(ClientNamenodeProtocolServerSideTranslatorPB.java:849)
> at 
> org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)
> at 
> org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:616)
> at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:982)
> at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2134)
> at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2130)
> at java.security.AccessController.doPrivileged(Native Method)
> at javax.security.auth.Subject.doAs(Subject.java:422)
> at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1867)
> at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2130)
> Caused by: 
> org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.ipc.StandbyException):
>  Operation category READ is not supported in state standby
> at 
> org.apache.hadoop.hdfs.server.namenode.ha.StandbyState.checkOperation(StandbyState.java:87)
> at 
> org.apache.hadoop.hdfs.server.namenode.NameNode$NameNodeHAContext.checkOperation(NameNode.java:1804)
> at 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.checkOperation(FSNamesystem.java:1338)
> at 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getFileInfo(FSNamesystem.java:3925)
> at 
> org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.getFileInfo(NameNodeRpcServer.java:1014)
> at 
> org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.getFileInfo(ClientNamenodeProtocolServerSideTranslatorPB.java:849)
> at 
> org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)
> at 
> org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:616)
> at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:982)
> at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2134)
> at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2130)
> at java.security.AccessController.doPrivileged(Native Method)
> at javax.security.auth.Subject.doAs(Subject.java:422)
> at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1867)
> at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2130)
> {code}
> Digging into the code: maybe we can throw a StandbyException when no 
> namenodes are available. The client will fail after some 

[jira] [Updated] (HDFS-14259) RBF: Fix safemode message for Router

2019-09-12 Thread Brahma Reddy Battula (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-14259?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Brahma Reddy Battula updated HDFS-14259:

Fix Version/s: 3.3.0

> RBF: Fix safemode message for Router
> 
>
> Key: HDFS-14259
> URL: https://issues.apache.org/jira/browse/HDFS-14259
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Reporter: Íñigo Goiri
>Assignee: Ranith Sardar
>Priority: Major
> Fix For: 3.3.0, HDFS-13891
>
> Attachments: HDFS-14259-HDFS-13891.000.patch, 
> HDFS-14259-HDFS-13891.001.patch, HDFS-14259-HDFS-13891.002.patch
>
>
> Currently, the {{getSafemode()}} bean checks the state of the Router but 
> returns the safe-mode message when the status is anything other than SAFEMODE:
> {code}
>   public String getSafemode() {
>     try {
>       if (!getRouter().isRouterState(RouterServiceState.SAFEMODE)) {
>         return "Safe mode is ON. " + this.getSafeModeTip();
>       }
>     } catch (IOException e) {
>       return "Failed to get safemode status. Please check router "
>           + "log for more detail.";
>     }
>     return "";
>   }
> {code}
> The condition should be reversed.
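A minimal runnable sketch of the reversed condition, with `Router` and `RouterServiceState` simplified to a local enum rather than the real RBF types:

```java
// The message should be returned when the state IS SAFEMODE, not when it is
// anything else (the bug quoted above negates the check).
public class SafemodeBean {
    enum RouterServiceState { RUNNING, SAFEMODE }

    static String getSafemode(RouterServiceState state) {
        if (state == RouterServiceState.SAFEMODE) {
            return "Safe mode is ON.";
        }
        return ""; // not in safe mode: no message
    }

    public static void main(String[] args) {
        System.out.println(getSafemode(RouterServiceState.SAFEMODE)); // Safe mode is ON.
        System.out.println(getSafemode(RouterServiceState.RUNNING));  // (empty line)
    }
}
```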






[jira] [Updated] (HDFS-14252) RBF : Exceptions are exposing the actual sub cluster path

2019-09-12 Thread Brahma Reddy Battula (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-14252?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Brahma Reddy Battula updated HDFS-14252:

Fix Version/s: HDFS-13891
   3.3.0

> RBF : Exceptions are exposing the actual sub cluster path
> -
>
> Key: HDFS-14252
> URL: https://issues.apache.org/jira/browse/HDFS-14252
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Reporter: Ayush Saxena
>Assignee: Ayush Saxena
>Priority: Major
> Fix For: 3.3.0, HDFS-13891
>
> Attachments: HDFS-14252-HDFS-13891-01.patch, 
> HDFS-14252-HDFS-13891-02.patch, HDFS-14252-HDFS-13891-03.patch
>
>
> In the case of a file-not-found exception, if only one destination is 
> available (either only one was mounted, or multiple were mounted but only 
> one was available during the operation, e.g. a disabled nameservice), the 
> exception is not processed and is thrown directly. This exposes the actual 
> subcluster destination path instead of the path relative to the mount point.






[jira] [Updated] (HDFS-14249) RBF: Tooling to identify the subcluster location of a file

2019-09-12 Thread Brahma Reddy Battula (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-14249?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Brahma Reddy Battula updated HDFS-14249:

Fix Version/s: HDFS-13891
   3.3.0

> RBF: Tooling to identify the subcluster location of a file
> --
>
> Key: HDFS-14249
> URL: https://issues.apache.org/jira/browse/HDFS-14249
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Reporter: Íñigo Goiri
>Assignee: Íñigo Goiri
>Priority: Major
> Fix For: 3.3.0, HDFS-13891
>
> Attachments: HDFS-14249-HDFS-13891.000.patch, 
> HDFS-14249-HDFS-13891.001.patch, HDFS-14249-HDFS-13891.002.patch
>
>
> Mount points can spread files across multiple subclusters depending on a 
> policy (e.g., HASH, HASH_ALL). Administrators need a way to identify the 
> location of a file.






[jira] [Updated] (HDFS-14226) RBF: Setting attributes should set on all subclusters' directories.

2019-09-12 Thread Brahma Reddy Battula (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-14226?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Brahma Reddy Battula updated HDFS-14226:

Fix Version/s: 3.3.0

> RBF: Setting attributes should set on all subclusters' directories.
> ---
>
> Key: HDFS-14226
> URL: https://issues.apache.org/jira/browse/HDFS-14226
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Reporter: Takanobu Asanuma
>Assignee: Ayush Saxena
>Priority: Major
>  Labels: RBF
> Fix For: 3.3.0, HDFS-13891
>
> Attachments: HDFS-14226-HDFS-13891-01.patch, 
> HDFS-14226-HDFS-13891-02.patch, HDFS-14226-HDFS-13891-03.patch, 
> HDFS-14226-HDFS-13891-04.patch, HDFS-14226-HDFS-13891-05.patch, 
> HDFS-14226-HDFS-13891-06.patch, HDFS-14226-HDFS-13891-07.patch, 
> HDFS-14226-HDFS-13891-WIP1.patch
>
>
> Currently, the attribute is set on only one subcluster:
> {noformat}
> // create a mount point of multiple subclusters
> hdfs dfsrouteradmin -add /all_data ns1 /data1
> hdfs dfsrouteradmin -add /all_data ns2 /data2
> hdfs ec -Dfs.defaultFS=hdfs://router: -setPolicy -path /all_data -policy 
> RS-3-2-1024k
> Set RS-3-2-1024k erasure coding policy on /all_data
> hdfs ec -Dfs.defaultFS=hdfs://router: -getPolicy -path /all_data
> RS-3-2-1024k
> hdfs ec -Dfs.defaultFS=hdfs://ns1-namenode:8020 -getPolicy -path /data1
> RS-3-2-1024k
> hdfs ec -Dfs.defaultFS=hdfs://ns2-namenode:8020 -getPolicy -path /data2
> The erasure coding policy of /data2 is unspecified
> {noformat}






[jira] [Updated] (HDFS-14223) RBF: Add configuration documents for using multiple sub-clusters

2019-09-12 Thread Brahma Reddy Battula (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-14223?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Brahma Reddy Battula updated HDFS-14223:

Fix Version/s: 3.3.0

> RBF: Add configuration documents for using multiple sub-clusters
> 
>
> Key: HDFS-14223
> URL: https://issues.apache.org/jira/browse/HDFS-14223
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: documentation
>Reporter: Takanobu Asanuma
>Assignee: Takanobu Asanuma
>Priority: Major
>  Labels: RBF
> Fix For: 3.3.0, HDFS-13891
>
> Attachments: HDFS-14223-HDFS-13891.1.patch, 
> HDFS-14223-HDFS-13891.2.patch
>
>
> When using multiple sub-clusters for a mount point, we need to set 
> {{dfs.federation.router.file.resolver.client.class}} to 
> {{MultipleDestinationMountTableResolver}}. The current documents lack this 
> explanation. We should add it to HDFSRouterFederation.md and 
> hdfs-rbf-default.xml.






[jira] [Updated] (HDFS-14215) RBF: Remove dependency on availability of default namespace

2019-09-12 Thread Brahma Reddy Battula (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-14215?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Brahma Reddy Battula updated HDFS-14215:

Fix Version/s: 3.3.0

> RBF: Remove dependency on availability of default namespace
> ---
>
> Key: HDFS-14215
> URL: https://issues.apache.org/jira/browse/HDFS-14215
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Reporter: Ayush Saxena
>Assignee: Ayush Saxena
>Priority: Major
> Fix For: 3.3.0, HDFS-13891
>
> Attachments: HDFS-14215-HDFS-13891-01.patch, 
> HDFS-14215-HDFS-13891-02.patch, HDFS-14215-HDFS-13891-03.patch, 
> HDFS-14215-HDFS-13891-04.patch, HDFS-14215-HDFS-13891-05.patch, 
> HDFS-14215-HDFS-13891-05.patch, HDFS-14215-HDFS-13891-06.patch, 
> HDFS-14215-HDFS-13891-07.patch, HDFS-14215-HDFS-13891-08.patch, 
> HDFS-14215-HDFS-13891-09.patch
>
>
> Remove the dependency of all APIs on the availability of the default namespace.






[jira] [Updated] (HDFS-14224) RBF: NPE in getContentSummary() for getEcPolicy() in case of multiple destinations

2019-09-12 Thread Brahma Reddy Battula (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-14224?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Brahma Reddy Battula updated HDFS-14224:

Fix Version/s: 3.3.0

> RBF: NPE in getContentSummary() for getEcPolicy() in case of multiple 
> destinations
> --
>
> Key: HDFS-14224
> URL: https://issues.apache.org/jira/browse/HDFS-14224
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Reporter: Ayush Saxena
>Assignee: Ayush Saxena
>Priority: Major
> Fix For: 3.3.0, HDFS-13891
>
> Attachments: HDFS-14224-HDFS-13891-01.patch, 
> HDFS-14224-HDFS-13891-02.patch, HDFS-14224-HDFS-13891-03.patch, 
> HDFS-14224-HDFS-13891-04.patch, HDFS-14224-HDFS-13891-05.patch, 
> HDFS-14224-HDFS-13891-06.patch
>
>
> NullPointerException in getContentSummary() for the EC policy when there are 
> multiple destinations.






[jira] [Updated] (HDFS-14225) RBF : MiniRouterDFSCluster should configure the failover proxy provider for namespace

2019-09-12 Thread Brahma Reddy Battula (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-14225?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Brahma Reddy Battula updated HDFS-14225:

Fix Version/s: 3.3.0

> RBF : MiniRouterDFSCluster should configure the failover proxy provider for 
> namespace
> -
>
> Key: HDFS-14225
> URL: https://issues.apache.org/jira/browse/HDFS-14225
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: federation
>Affects Versions: 3.1.1
>Reporter: Surendra Singh Lilhore
>Assignee: Ranith Sardar
>Priority: Minor
> Fix For: 3.3.0, HDFS-13891
>
> Attachments: HDFS-14225-HDFS-13891.000.patch
>
>
> Getting UnknownHostException in UT.
> {noformat}
> org.apache.hadoop.ipc.RemoteException(java.lang.IllegalArgumentException): 
> java.net.UnknownHostException: ns0
> {noformat}






[jira] [Updated] (HDFS-14209) RBF: setQuota() through router is working for only the mount Points under the Source column in MountTable

2019-09-12 Thread Brahma Reddy Battula (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-14209?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Brahma Reddy Battula updated HDFS-14209:

Fix Version/s: 3.3.0

> RBF: setQuota() through router is working for only the mount Points under the 
> Source column in MountTable
> -
>
> Key: HDFS-14209
> URL: https://issues.apache.org/jira/browse/HDFS-14209
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Reporter: Shubham Dewan
>Assignee: Shubham Dewan
>Priority: Major
> Fix For: 3.3.0, HDFS-13891
>
> Attachments: HDFS-14209-HDFS-13891.002.patch, 
> HDFS-14209-HDFS-13891.003.patch, HDFS-14209.001.patch
>
>
> Through the Router, we are only able to set a quota on the directories under 
> the Source column of the mount table.
>  For any other directory that is not a mount table entry, a "No remote 
> locations available" IOException is thrown.
>  We should be able to set a quota on any directory that exists.






[jira] [Updated] (HDFS-14210) RBF: ACL commands should work over all the destinations

2019-09-12 Thread Brahma Reddy Battula (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-14210?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Brahma Reddy Battula updated HDFS-14210:

Fix Version/s: 3.3.0

> RBF: ACL commands should work over all the destinations
> ---
>
> Key: HDFS-14210
> URL: https://issues.apache.org/jira/browse/HDFS-14210
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Reporter: Shubham Dewan
>Assignee: Ayush Saxena
>Priority: Major
> Fix For: 3.3.0, HDFS-13891
>
> Attachments: HDFS-14210-HDFS-13891-04.patch, 
> HDFS-14210-HDFS-13891-05.patch, HDFS-14210-HDFS-13891.002.patch, 
> HDFS-14210-HDFS-13891.003.patch, HDFS-14210.001.patch
>
>
> 1) A mount point with multiple destinations.
> 2) ./bin/hdfs dfs -setfacl -m user:abc:rwx /testacl
> 3) where /testacl => /test1, /test2
> 4) The command works for only one destination.
> The ACL should be set on both destinations.






[jira] [Updated] (HDFS-14156) RBF: rollEdit() command fails with Router

2019-09-12 Thread Brahma Reddy Battula (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-14156?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Brahma Reddy Battula updated HDFS-14156:

Fix Version/s: 3.3.0

> RBF: rollEdit() command fails with Router
> -
>
> Key: HDFS-14156
> URL: https://issues.apache.org/jira/browse/HDFS-14156
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Affects Versions: 3.1.1
>Reporter: Harshakiran Reddy
>Assignee: Shubham Dewan
>Priority: Major
>  Labels: RBF
> Fix For: 3.3.0, HDFS-13891
>
> Attachments: HDFS-14156-HDFS-13891.006.patch, 
> HDFS-14156-HDFS-13891.007.patch, HDFS-14156.001.patch, HDFS-14156.002.patch, 
> HDFS-14156.003.patch, HDFS-14156.004.patch, HDFS-14156.005.patch
>
>
> {noformat}
> bin> ./hdfs dfsadmin -rollEdits
> rollEdits: Cannot cast java.lang.Long to long
> bin>
> {noformat}
> Trace :-
> {noformat}
> org.apache.hadoop.ipc.RemoteException(java.lang.ClassCastException): Cannot 
> cast java.lang.Long to long
> at java.lang.Class.cast(Class.java:3369)
> at 
> org.apache.hadoop.hdfs.server.federation.router.RouterRpcClient.invokeConcurrent(RouterRpcClient.java:1085)
> at 
> org.apache.hadoop.hdfs.server.federation.router.RouterRpcClient.invokeConcurrent(RouterRpcClient.java:982)
> at 
> org.apache.hadoop.hdfs.server.federation.router.RouterClientProtocol.rollEdits(RouterClientProtocol.java:900)
> at 
> org.apache.hadoop.hdfs.server.federation.router.RouterRpcServer.rollEdits(RouterRpcServer.java:862)
> at 
> org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.rollEdits(ClientNamenodeProtocolServerSideTranslatorPB.java:899)
> at 
> org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)
> at 
> org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:524)
> at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:991)
> at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:878)
> at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:824)
> at java.security.AccessController.doPrivileged(Native Method)
> at javax.security.auth.Subject.doAs(Subject.java:422)
> at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1729)
> at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2684)
> at org.apache.hadoop.ipc.Client.getRpcResponse(Client.java:1520)
> at org.apache.hadoop.ipc.Client.call(Client.java:1466)
> at org.apache.hadoop.ipc.Client.call(Client.java:1376)
> at 
> org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:229)
> at 
> org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:116)
> at com.sun.proxy.$Proxy11.rollEdits(Unknown Source)
> at 
> org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.rollEdits(ClientNamenodeProtocolTranslatorPB.java:804)
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
> at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> at java.lang.reflect.Method.invoke(Method.java:498)
> at 
> org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:422)
> at 
> org.apache.hadoop.io.retry.RetryInvocationHandler$Call.invokeMethod(RetryInvocationHandler.java:165)
> at 
> org.apache.hadoop.io.retry.RetryInvocationHandler$Call.invoke(RetryInvocationHandler.java:157)
> at 
> org.apache.hadoop.io.retry.RetryInvocationHandler$Call.invokeOnce(RetryInvocationHandler.java:95)
> at 
> org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:359)
> at com.sun.proxy.$Proxy12.rollEdits(Unknown Source)
> at org.apache.hadoop.hdfs.DFSClient.rollEdits(DFSClient.java:2350)
> at 
> org.apache.hadoop.hdfs.DistributedFileSystem.rollEdits(DistributedFileSystem.java:1550)
> at org.apache.hadoop.hdfs.tools.DFSAdmin.rollEdits(DFSAdmin.java:850)
> at org.apache.hadoop.hdfs.tools.DFSAdmin.run(DFSAdmin.java:2353)
> at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:76)
> at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:90)
> at org.apache.hadoop.hdfs.tools.DFSAdmin.main(DFSAdmin.java:2568)
> {noformat}
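The trace is consistent with `Class.cast()` being invoked on the primitive `long.class` token: no object is ever an instance of a primitive type, so the cast always fails. A minimal standalone sketch of the failure mode (not the Router code itself):

```java
public class PrimitiveCastDemo {
    public static void main(String[] args) {
        // long.class is the primitive type token (same object as Long.TYPE);
        // isInstance() is always false for it, so cast() always throws.
        Class<Long> primitive = long.class;
        try {
            primitive.cast(Long.valueOf(42L));
        } catch (ClassCastException e) {
            // Same shape as the message in the trace above.
            System.out.println(e.getMessage());
        }
        // The boxed token works as expected.
        System.out.println(Long.class.cast(Long.valueOf(42L)));
    }
}
```

The fix direction this suggests is to use the boxed `Long.class` (or avoid the reflective cast entirely) when collecting results in `invokeConcurrent`.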




[jira] [Updated] (HDFS-14193) RBF: Inconsistency with the Default Namespace

2019-09-12 Thread Brahma Reddy Battula (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-14193?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Brahma Reddy Battula updated HDFS-14193:

Fix Version/s: 3.3.0

> RBF: Inconsistency with the Default Namespace
> -
>
> Key: HDFS-14193
> URL: https://issues.apache.org/jira/browse/HDFS-14193
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Reporter: Ayush Saxena
>Assignee: Ayush Saxena
>Priority: Major
> Fix For: 3.3.0, HDFS-13891
>
> Attachments: HDFS-14193-HDFS-13891-01.patch, 
> HDFS-14193-HDFS-13891-02.patch
>
>
> In the present scenario, if the default nameservice is not explicitly
> configured, each Router falls back to its local namespace as the default.
> As a result, different Routers can have different default namespaces, which
> leads to inconsistent operations and even blocks maintaining a uniform
> global state: the output depends on which Router serves the request and
> differs between Routers.






[jira] [Updated] (HDFS-14206) RBF: Cleanup quota modules

2019-09-12 Thread Brahma Reddy Battula (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-14206?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Brahma Reddy Battula updated HDFS-14206:

Fix Version/s: 3.3.0

> RBF: Cleanup quota modules
> --
>
> Key: HDFS-14206
> URL: https://issues.apache.org/jira/browse/HDFS-14206
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Reporter: Íñigo Goiri
>Assignee: Íñigo Goiri
>Priority: Major
> Fix For: 3.3.0, HDFS-13891
>
> Attachments: HDFS-14206-HDFS-13891.000.patch, 
> HDFS-14206-HDFS-13891.001.patch, HDFS-14206-HDFS-13891.002.patch
>
>
> The quota part needs some cleanup.






[jira] [Updated] (HDFS-14191) RBF: Remove hard coded router status from FederationMetrics.

2019-09-12 Thread Brahma Reddy Battula (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-14191?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Brahma Reddy Battula updated HDFS-14191:

Fix Version/s: 3.3.0

> RBF: Remove hard coded router status from FederationMetrics.
> 
>
> Key: HDFS-14191
> URL: https://issues.apache.org/jira/browse/HDFS-14191
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Affects Versions: 3.1.1
>Reporter: Ranith Sardar
>Assignee: Ranith Sardar
>Priority: Major
> Fix For: 3.3.0, HDFS-13891
>
> Attachments: HDFS-14191-HDFS-13891.002.patch, 
> HDFS-14191-HDFS-13891.003.patch, HDFS-14191.001.patch, 
> IMG_20190109_023713.jpg, image-2019-01-08-16-05-34-736.png, 
> image-2019-01-08-16-09-46-648.png
>
>
> The status values in "Router Information" and in the Overview tab do not
> match when the Router is in SAFEMODE.






[jira] [Updated] (HDFS-14161) RBF: Throw StandbyException instead of IOException so that client can retry when can not get connection

2019-09-12 Thread Brahma Reddy Battula (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-14161?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Brahma Reddy Battula updated HDFS-14161:

Fix Version/s: 3.3.0

> RBF: Throw StandbyException instead of IOException so that client can retry 
> when can not get connection
> ---
>
> Key: HDFS-14161
> URL: https://issues.apache.org/jira/browse/HDFS-14161
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Affects Versions: 3.1.1, 2.9.2, 3.0.3
>Reporter: Fei Hui
>Assignee: Fei Hui
>Priority: Major
> Fix For: 3.3.0, HDFS-13891
>
> Attachments: HDFS-14161-HDFS-13891.001.patch, 
> HDFS-14161-HDFS-13891.002.patch, HDFS-14161-HDFS-13891.003.patch, 
> HDFS-14161-HDFS-13891.004.patch, HDFS-14161-HDFS-13891.005.patch, 
> HDFS-14161-HDFS-13891.006.patch, HDFS-14161.001.patch
>
>
> The Hive client may hang when it gets an IOException; the stack trace follows:
> {code:java}
> Exception in thread "Thread-150" java.lang.RuntimeException: 
> org.apache.hadoop.ipc.RemoteException(java.io.IOException): Cannot get a 
> connection to bigdata-nn20.g01:8020
>   at 
> org.apache.hadoop.hdfs.server.federation.router.RouterRpcClient.getConnection(RouterRpcClient.java:262)
>   at 
> org.apache.hadoop.hdfs.server.federation.router.RouterRpcClient.invokeMethod(RouterRpcClient.java:380)
>   at 
> org.apache.hadoop.hdfs.server.federation.router.RouterRpcClient.invokeSequential(RouterRpcClient.java:752)
>   at 
> org.apache.hadoop.hdfs.server.federation.router.RouterRpcServer.getFileInfo(RouterRpcServer.java:1152)
>   at 
> org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.getFileInfo(ClientNamenodeProtocolServerSideTranslatorPB.java:849)
>   at 
> org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)
>   at 
> org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:616)
>   at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:982)
>   at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2134)
>   at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2130)
>   at java.security.AccessController.doPrivileged(Native Method)
>   at javax.security.auth.Subject.doAs(Subject.java:422)
>   at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1867)
>   at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2130)
>   at 
> org.apache.hadoop.hive.ql.session.SessionState.start(SessionState.java:554)
>   at org.apache.hadoop.hive.ql.exec.TaskRunner.run(TaskRunner.java:74)
> Caused by: org.apache.hadoop.ipc.RemoteException(java.io.IOException): Cannot 
> get a connection to bigdata-nn20.g01:8020
>   at 
> org.apache.hadoop.hdfs.server.federation.router.RouterRpcClient.getConnection(RouterRpcClient.java:262)
>   at 
> org.apache.hadoop.hdfs.server.federation.router.RouterRpcClient.invokeMethod(RouterRpcClient.java:380)
>   at 
> org.apache.hadoop.hdfs.server.federation.router.RouterRpcClient.invokeSequential(RouterRpcClient.java:752)
>   at 
> org.apache.hadoop.hdfs.server.federation.router.RouterRpcServer.getFileInfo(RouterRpcServer.java:1152)
>   at 
> org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.getFileInfo(ClientNamenodeProtocolServerSideTranslatorPB.java:849)
>   at 
> org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)
>   at 
> org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:616)
>   at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:982)
>   at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2134)
>   at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2130)
>   at java.security.AccessController.doPrivileged(Native Method)
>   at javax.security.auth.Subject.doAs(Subject.java:422)
>   at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1867)
>   at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2130)
>   at org.apache.hadoop.ipc.Client.call(Client.java:1503)
>   at org.apache.hadoop.ipc.Client.call(Client.java:1441)
>   at 
> org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:229)
>   at com.sun.proxy.$Proxy14.getFileInfo(Unknown Source)
>   at 
> org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.getFileInfo(ClientNamenodeProtocolTranslatorPB.java:775)
>   at sun.reflect.GeneratedMethodAccessor9.invoke(Unknown Source)
>   at 
> 

[jira] [Updated] (HDFS-14150) RBF: Quotas of the sub-cluster should be removed when removing the mount point

2019-09-12 Thread Brahma Reddy Battula (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-14150?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Brahma Reddy Battula updated HDFS-14150:

Fix Version/s: 3.3.0

> RBF: Quotas of the sub-cluster should be removed when removing the mount point
> --
>
> Key: HDFS-14150
> URL: https://issues.apache.org/jira/browse/HDFS-14150
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Reporter: Takanobu Asanuma
>Assignee: Takanobu Asanuma
>Priority: Major
>  Labels: RBF
> Fix For: 3.3.0, HDFS-13891
>
> Attachments: HDFS-14150-HDFS-13891.2.patch, 
> HDFS-14150-HDFS-13891.3.patch, HDFS-14150.1.patch, HDFS-14150.2.patch
>
>
> From HDFS-14143
> {noformat}
> $ hdfs dfsrouteradmin -add /ns1_data ns1 /data
> $ hdfs dfsrouteradmin -setQuota /ns1_data -nsQuota 10 -ssQuota 10
> $ hdfs dfsrouteradmin -ls /ns1_data
> SourceDestinations  Owner 
> Group Mode  Quota/Usage
> /ns1_datans1->/data tasanuma
> users  rwxr-xr-x [NsQuota: 10/1, SsQuota: 
> 10 B/0 B]
> $ hdfs dfsrouteradmin -rm /ns1_data
> $ hdfs dfsrouteradmin -add /ns1_data ns1 /data
> $ hdfs dfsrouteradmin -ls /ns1_data
> SourceDestinations  Owner 
> Group Mode  Quota/Usage
> /ns1_datans1->/data tasanuma
> users  rwxr-xr-x [NsQuota: -/-, SsQuota: 
> -/-]
> $ hadoop fs -put file1 /ns1_data/file1
> put: The DiskSpace quota of /data is exceeded: quota = 10 B = 10 B but 
> diskspace consumed = 402653184 B = 384 MB
> {noformat}
> This is because the quotas of the subclusters still remain after "hdfs
> dfsrouteradmin -rm", and "hdfs dfsrouteradmin -add" does not reflect the
> existing quotas.






[jira] [Updated] (HDFS-14152) RBF: Fix a typo in RouterAdmin usage

2019-09-12 Thread Brahma Reddy Battula (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-14152?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Brahma Reddy Battula updated HDFS-14152:

Fix Version/s: 3.3.0

> RBF: Fix a typo in RouterAdmin usage
> 
>
> Key: HDFS-14152
> URL: https://issues.apache.org/jira/browse/HDFS-14152
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Reporter: Takanobu Asanuma
>Assignee: Ayush Saxena
>Priority: Major
>  Labels: RBF, newbie
> Fix For: 3.3.0, HDFS-13891
>
> Attachments: HDFS-14152-HDFS-13891-01.patch
>
>
> {{routeradmin}} is wrong.
> {noformat}
> Usage: hdfs routeradmin
> {noformat}






[jira] [Updated] (HDFS-14167) RBF: Add stale nodes to federation metrics

2019-09-12 Thread Brahma Reddy Battula (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-14167?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Brahma Reddy Battula updated HDFS-14167:

Fix Version/s: 3.3.0

> RBF: Add stale nodes to federation metrics
> --
>
> Key: HDFS-14167
> URL: https://issues.apache.org/jira/browse/HDFS-14167
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Reporter: Íñigo Goiri
>Assignee: Íñigo Goiri
>Priority: Major
> Fix For: 3.3.0, HDFS-13891
>
> Attachments: HDFS-14167-HDFS-13891.000.patch
>
>
> The federation metrics mimic the Namenode FSNamesystemState. However, the 
> stale datanodes are not collected.






[jira] [Updated] (HDFS-13856) RBF: RouterAdmin should support dfsrouteradmin -refreshRouterArgs command

2019-09-12 Thread Brahma Reddy Battula (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-13856?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Brahma Reddy Battula updated HDFS-13856:

Fix Version/s: 3.3.0

> RBF: RouterAdmin should support dfsrouteradmin -refreshRouterArgs command
> -
>
> Key: HDFS-13856
> URL: https://issues.apache.org/jira/browse/HDFS-13856
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: federation, hdfs
>Affects Versions: 3.0.0, 3.1.0, 2.9.1
>Reporter: yanghuafeng
>Assignee: yanghuafeng
>Priority: Major
> Fix For: 3.3.0, HDFS-13891
>
> Attachments: HDFS-13856-HDFS-13891.001.patch, 
> HDFS-13856-HDFS-13891.002.patch, HDFS-13856-HDFS-13891.003.patch, 
> HDFS-13856.001.patch, HDFS-13856.002.patch
>
>
> Like the NameNode, the Router should support refreshing policies
> individually. For example, we have implemented simple password
> authentication per RPC connection; the password dictionary can be refreshed
> through the generic refresh policy. We want to support this in
> RouterAdminServer as well.






[jira] [Updated] (HDFS-14085) RBF: LS command for root shows wrong owner and permission information.

2019-09-12 Thread Brahma Reddy Battula (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-14085?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Brahma Reddy Battula updated HDFS-14085:

Fix Version/s: 3.3.0

> RBF: LS command for root shows wrong owner and permission information.
> --
>
> Key: HDFS-14085
> URL: https://issues.apache.org/jira/browse/HDFS-14085
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Reporter: Ayush Saxena
>Assignee: Ayush Saxena
>Priority: Major
> Fix For: 3.3.0, HDFS-13891
>
> Attachments: HDFS-14085-HDFS-13891-01.patch, 
> HDFS-14085-HDFS-13891-02.patch, HDFS-14085-HDFS-13891-03.patch, 
> HDFS-14085-HDFS-13891-04.patch, HDFS-14085-HDFS-13891-05.patch, 
> HDFS-14085-HDFS-13891-06.patch, HDFS-14085-HDFS-13891-07.patch, 
> HDFS-14085-HDFS-13891-08.patch, HDFS-14085-HDFS-13891-09.patch
>
>
> The ls command for / lists all the mount entries, but the permission
> displayed is the default (777) and the owner and group info is that of the
> calling user; it should instead match the destination of the mount point.






[jira] [Updated] (HDFS-14151) RBF: Make the read-only column of Mount Table clearly understandable

2019-09-12 Thread Brahma Reddy Battula (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-14151?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Brahma Reddy Battula updated HDFS-14151:

Fix Version/s: 3.3.0

> RBF: Make the read-only column of Mount Table clearly understandable
> 
>
> Key: HDFS-14151
> URL: https://issues.apache.org/jira/browse/HDFS-14151
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Reporter: Takanobu Asanuma
>Assignee: Takanobu Asanuma
>Priority: Major
>  Labels: RBF
> Fix For: 3.3.0, HDFS-13891
>
> Attachments: HDFS-14151.1.patch, HDFS-14151.2.patch, 
> HDFS-14151.3.patch, mount_table_3rd_patch.png, mount_table_before.png, 
> read_only_a.png, read_only_b.png
>
>
> The read-only column of Mount Table is a little confusing now.






[jira] [Updated] (HDFS-14024) RBF: ProvidedCapacityTotal json exception in NamenodeHeartbeatService

2019-09-12 Thread Brahma Reddy Battula (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-14024?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Brahma Reddy Battula updated HDFS-14024:

Fix Version/s: 3.3.0

> RBF: ProvidedCapacityTotal json exception in NamenodeHeartbeatService
> -
>
> Key: HDFS-14024
> URL: https://issues.apache.org/jira/browse/HDFS-14024
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Reporter: CR Hota
>Assignee: CR Hota
>Priority: Major
> Fix For: 3.3.0, HDFS-13891
>
> Attachments: HDFS-14024-HDFS-13891.0.patch, HDFS-14024.0.patch
>
>
> Routers may be proxying for a downstream NameNode that is NOT migrated to
> report "ProvidedCapacityTotal". The updateJMXParameters method in
> NamenodeHeartbeatService should handle this without breaking.
>  
> {code:java}
> jsonObject.getLong("MissingBlocks"),
> jsonObject.getLong("PendingReplicationBlocks"),
> jsonObject.getLong("UnderReplicatedBlocks"),
> jsonObject.getLong("PendingDeletionBlocks"),
> jsonObject.getLong("ProvidedCapacityTotal"));
> {code}
> One way to do this is to create a JSON wrapper that returns a default value
> when a JSON node is not found.
>  
>  
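A minimal, hypothetical sketch of such a defaulting wrapper. The method name is illustrative and a plain `Map` stands in for the parsed JMX JSON bean; this is not the committed fix:

```java
import java.util.Map;

public class JmxBeanWrapper {
    // Hypothetical helper: returns a default when the downstream NameNode
    // does not report the attribute (e.g. "ProvidedCapacityTotal" on an
    // older release), instead of throwing on the missing key.
    static long getLongOrDefault(Map<String, Object> bean, String key, long dflt) {
        Object v = bean.get(key);
        return (v instanceof Number) ? ((Number) v).longValue() : dflt;
    }

    public static void main(String[] args) {
        // An older NameNode's bean: has MissingBlocks, lacks ProvidedCapacityTotal.
        Map<String, Object> oldNameNode = Map.of("MissingBlocks", 3L);
        System.out.println(getLongOrDefault(oldNameNode, "MissingBlocks", 0L));
        System.out.println(getLongOrDefault(oldNameNode, "ProvidedCapacityTotal", 0L));
    }
}
```

With this shape, the `jsonObject.getLong(...)` calls above would each become a single wrapper call that can never break the heartbeat on a missing attribute.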






[jira] [Updated] (HDFS-14114) RBF: MIN_ACTIVE_RATIO should be configurable

2019-09-12 Thread Brahma Reddy Battula (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-14114?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Brahma Reddy Battula updated HDFS-14114:

Fix Version/s: 3.3.0

> RBF: MIN_ACTIVE_RATIO should be configurable
> 
>
> Key: HDFS-14114
> URL: https://issues.apache.org/jira/browse/HDFS-14114
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Reporter: Fei Hui
>Assignee: Fei Hui
>Priority: Major
> Fix For: 3.3.0, HDFS-13891
>
> Attachments: HDFS-14114-HDFS-13891.001.patch, 
> HDFS-14114-HDFS-13891.002.patch, HDFS-14114.001.patch, HDFS-14114.002.patch, 
> HDFS-14114.003.patch, HDFS-14114.004.patch, HDFS-14114.005.patch, 
> HDFS-14114.006.patch, HDFS-14114.007.patch, HDFS-14114.008.patch
>
>
> The following code uses MIN_ACTIVE_RATIO:
> {code:java}
>   if (timeSinceLastActive > connectionCleanupPeriodMs ||
>   active < MIN_ACTIVE_RATIO * total) {
> // Remove and close 1 connection
> List<ConnectionContext> conns = pool.removeConnections(1);
> for (ConnectionContext conn : conns) {
>   conn.close();
> }
> LOG.debug("Removed connection {} used {} seconds ago. " +
> "Pool has {}/{} connections", pool.getConnectionPoolId(),
> TimeUnit.MILLISECONDS.toSeconds(timeSinceLastActive),
> pool.getNumConnections(), pool.getMaxSize());
>   }
> ...
> if (pool.getNumConnections() < pool.getMaxSize() &&
> active >= MIN_ACTIVE_RATIO * total) {
>   ConnectionContext conn = pool.newConnection();
>   pool.addConnection(conn);
> } else {
>   LOG.debug("Cannot add more than {} connections to {}",
>   pool.getMaxSize(), pool);
> }
> {code}
> It affects both cleaning up and creating connections. It should be
> configurable so that the ratio can be tuned to improve performance.
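A hedged sketch of reading the ratio from configuration instead of a constant. The key name and 0.5 default are assumptions rather than the committed values, and a plain `Map` stands in for Hadoop's `Configuration`:

```java
import java.util.Map;

public class MinActiveRatioSketch {
    // Assumed key and default; the names in the actual patch may differ.
    static final String MIN_ACTIVE_RATIO_KEY =
        "dfs.federation.router.connection.min-active-ratio";
    static final float MIN_ACTIVE_RATIO_DEFAULT = 0.5f;

    // Stand-in for org.apache.hadoop.conf.Configuration#getFloat(key, default).
    static float minActiveRatio(Map<String, String> conf) {
        String v = conf.get(MIN_ACTIVE_RATIO_KEY);
        return (v == null) ? MIN_ACTIVE_RATIO_DEFAULT : Float.parseFloat(v);
    }

    public static void main(String[] args) {
        System.out.println(minActiveRatio(Map.of()));                            // default
        System.out.println(minActiveRatio(Map.of(MIN_ACTIVE_RATIO_KEY, "0.8"))); // tuned
    }
}
```

The cleanup and creation checks quoted above would then compare `active` against `minActiveRatio(conf) * total` instead of the hard-coded constant.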





