[jira] [Commented] (HDFS-15253) Set default throttle value on dfs.image.transfer.bandwidthPerSec

2020-04-11 Thread Karthik Palanisamy (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-15253?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17081210#comment-17081210
 ] 

Karthik Palanisamy commented on HDFS-15253:
---

Thank you [~dineshchitlangia]. I think 1M txns is even low for a normal workload.
Most of the time I see regular checkpoints every 1-2 hours, so I am ok with keeping 
the default setting. [~weichiu] what do you think?

If we set dfs.image.compress to true then the user can't use the 
dfs.image.parallel.load option.

Right now, I will update only dfs.image.transfer.bandwidthPerSec and leave the 
other settings as they are.
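
For reference, a minimal sketch of the proposed throttle, shown through the plain Configuration API for illustration only (in a real cluster this would simply go into hdfs-site.xml):
{code:java}
import org.apache.hadoop.conf.Configuration;

public class ImageTransferThrottleSketch {
  public static void main(String[] args) {
    Configuration conf = new Configuration();
    // Cap fsimage transfer during checkpointing at 50 MB/s; the default of 0 means unthrottled.
    conf.setLong("dfs.image.transfer.bandwidthPerSec", 50L * 1024 * 1024);
    System.out.println(conf.get("dfs.image.transfer.bandwidthPerSec"));
  }
}
{code}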

> Set default throttle value on dfs.image.transfer.bandwidthPerSec
> 
>
> Key: HDFS-15253
> URL: https://issues.apache.org/jira/browse/HDFS-15253
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: namenode
>Reporter: Karthik Palanisamy
>Assignee: Karthik Palanisamy
>Priority: Major
>
> The default value of dfs.image.transfer.bandwidthPerSec is 0, so fsimage transfers 
> during checkpointing can use the maximum available bandwidth. I 
> think we should throttle this. Many users have experienced namenode failover 
> when transferring a large image (e.g. >25 GB) along with fsimage replication on 
> dfs.namenode.name.dir.
> Thought to set:
> dfs.image.transfer.bandwidthPerSec=52428800 (50 MB/s)
> dfs.namenode.checkpoint.txns=200 (Default is 1M, good to avoid frequent 
> checkpoints. However, the default checkpoint runs once every 6 hours)
>  



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Created] (HDFS-15271) Update DFSUtil.getSpnegoKeytabKey based on hadoop.http.authentication.kerberos.keytab

2020-04-11 Thread Masatake Iwasaki (Jira)
Masatake Iwasaki created HDFS-15271:
---

 Summary: Update DFSUtil.getSpnegoKeytabKey based on 
hadoop.http.authentication.kerberos.keytab
 Key: HDFS-15271
 URL: https://issues.apache.org/jira/browse/HDFS-15271
 Project: Hadoop HDFS
  Issue Type: Improvement
  Components: security
Affects Versions: 3.3.0
Reporter: Masatake Iwasaki
Assignee: Masatake Iwasaki


After HADOOP-16314, setting {{dfs.web.authentication.kerberos.keytab}} has no 
effect. The keytab file can now be specified by 
{{hadoop.http.authentication.kerberos.keytab}}. DFSUtil.getSpnegoKeytabKey 
should take the new property into account while keeping backward compatibility. Since 
{{getSpnegoKeytabKey}} is only called by NameNodeHttpServer and the value 
is not used, deprecating it could be an option.
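
A minimal sketch (not the actual change) of what a backward-compatible key selection could look like; {{chooseSpnegoKeytabKey}} is a hypothetical helper and the property names are the ones mentioned above:
{code:java}
import org.apache.hadoop.conf.Configuration;

public final class SpnegoKeytabKeySketch {
  static String chooseSpnegoKeytabKey(Configuration conf, String defaultKey) {
    final String oldKey = "dfs.web.authentication.kerberos.keytab";
    final String newKey = "hadoop.http.authentication.kerberos.keytab";
    // Backward compatibility: honour the old dfs.* key if the user still sets it,
    // otherwise prefer the new hadoop.http.* key, and finally the caller's default.
    if (conf.get(oldKey) != null && !conf.get(oldKey).isEmpty()) {
      return oldKey;
    }
    if (conf.get(newKey) != null && !conf.get(newKey).isEmpty()) {
      return newKey;
    }
    return defaultKey;
  }
}
{code}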



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-15217) Add more information to longest write/read lock held log

2020-04-11 Thread Toshihiro Suzuki (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-15217?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17081309#comment-17081309
 ] 

Toshihiro Suzuki commented on HDFS-15217:
-

I created a PR for this. After this patch, we can see additional information in 
the lock report message as follows:
{code:java}
2020-04-11 23:04:36,020 [IPC Server handler 5 on default port 62641] INFO  
namenode.FSNamesystem (FSNamesystemLock.java:writeUnlock(321)) - Number of 
suppressed write-lock reports: 0
Longest write-lock held at 2020-04-11 23:04:36,020+0900 for 3ms by 
delete (ugi=bob (auth:SIMPLE),ip=/127.0.0.1,src=/file,dst=null,perm=null) via 
java.lang.Thread.getStackTrace(Thread.java:1559)
org.apache.hadoop.util.StringUtils.getStackTrace(StringUtils.java:1058)
org.apache.hadoop.hdfs.server.namenode.FSNamesystemLock.writeUnlock(FSNamesystemLock.java:302)
org.apache.hadoop.hdfs.server.namenode.FSNamesystemLock.writeUnlock(FSNamesystemLock.java:261)
org.apache.hadoop.hdfs.server.namenode.FSNamesystem.writeUnlock(FSNamesystem.java:1746)
org.apache.hadoop.hdfs.server.namenode.FSNamesystem.delete(FSNamesystem.java:3274)
org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.delete(NameNodeRpcServer.java:1130)
org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.delete(ClientNamenodeProtocolServerSideTranslatorPB.java:724)
org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)
org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:529)
org.apache.hadoop.ipc.RPC$Server.call(RPC.java:1070)
org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:1016)
org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:944)
java.security.AccessController.doPrivileged(Native Method)
javax.security.auth.Subject.doAs(Subject.java:422)
org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1845)
org.apache.hadoop.ipc.Server$Handler.run(Server.java:2948)

Total suppressed write-lock held time: 0.0

{code}
This patch adds the additional information *"by delete (ugi=bob 
(auth:SIMPLE),ip=/127.0.0.1,src=/file,dst=null,perm=null)"* which is similar to 
the audit log format.
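
For illustration only, a small sketch of how such an audit-style operation description can be assembled; {{buildOpDescription}} is a hypothetical helper, not the code in the PR:
{code:java}
public final class LockReportContextSketch {
  // Formats "by <op> (ugi=...,ip=...,src=...,dst=...,perm=...)" like the audit log.
  static String buildOpDescription(String opName, String ugi, String ip,
      String src, String dst, String perm) {
    return String.format("by %s (ugi=%s,ip=%s,src=%s,dst=%s,perm=%s)",
        opName, ugi, ip, src, dst, perm);
  }

  public static void main(String[] args) {
    System.out.println(buildOpDescription("delete", "bob (auth:SIMPLE)",
        "/127.0.0.1", "/file", null, null));
    // -> by delete (ugi=bob (auth:SIMPLE),ip=/127.0.0.1,src=/file,dst=null,perm=null)
  }
}
{code}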

> Add more information to longest write/read lock held log
> 
>
> Key: HDFS-15217
> URL: https://issues.apache.org/jira/browse/HDFS-15217
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Reporter: Toshihiro Suzuki
>Assignee: Toshihiro Suzuki
>Priority: Major
>
> Currently, we can see the stack trace in the longest write/read lock held 
> log, but sometimes we need more information, for example, a target path of 
> deletion:
> {code:java}
> 2020-03-10 21:51:21,116 [main] INFO  namenode.FSNamesystem 
> (FSNamesystemLock.java:writeUnlock(276)) - Number of suppressed 
> write-lock reports: 0
>   Longest write-lock held at 2020-03-10 21:51:21,107+0900 for 6ms via 
> java.lang.Thread.getStackTrace(Thread.java:1559)
> org.apache.hadoop.util.StringUtils.getStackTrace(StringUtils.java:1058)
> org.apache.hadoop.hdfs.server.namenode.FSNamesystemLock.writeUnlock(FSNamesystemLock.java:257)
> org.apache.hadoop.hdfs.server.namenode.FSNamesystemLock.writeUnlock(FSNamesystemLock.java:233)
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.writeUnlock(FSNamesystem.java:1706)
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.delete(FSNamesystem.java:3188)
> ...
> {code}
> Adding more information (opName, path, etc.) to the log is useful for 
> troubleshooting.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-15243) Child directory should not be deleted or renamed if parent directory is a protected directory

2020-04-11 Thread Ayush Saxena (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-15243?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17081344#comment-17081344
 ] 

Ayush Saxena commented on HDFS-15243:
-

Thanx [~rain_lyy] for the patch.
* Please include a UT covering all the changes
* Move the configuration to HDFS rather than keeping it in common, since this would 
be HDFS-specific only

> Child directory should not be deleted or renamed if parent directory is a 
> protected directory
> -
>
> Key: HDFS-15243
> URL: https://issues.apache.org/jira/browse/HDFS-15243
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: 3.1.1
>Affects Versions: 3.1.1
>Reporter: liuyanyu
>Assignee: liuyanyu
>Priority: Major
> Attachments: HDFS-15243.001.patch, image-2020-03-28-09-23-31-335.png
>
>
> HDFS-8983 added fs.protected.directories to support protected directories on the 
> NameNode. But as I tested, when a parent directory (e.g. /testA) is set as a 
> protected directory, the child directory (e.g. /testA/testB) can still be 
> deleted or renamed. We protect a directory mainly to protect the data under 
> it, so I think the child directory should not be deleted or renamed if the 
> parent directory is a protected directory.
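
A minimal sketch of the check the description asks for: walk up the path and reject the delete/rename if any ancestor is protected. {{isUnderProtectedDirectory}} is a hypothetical helper, not the attached patch:
{code:java}
import java.util.Collections;
import java.util.Set;

public final class ProtectedDirSketch {
  // Returns true if the path itself or any of its ancestors is a protected directory.
  static boolean isUnderProtectedDirectory(Set<String> protectedDirs, String path) {
    for (String p = path; p != null && !p.isEmpty() && !p.equals("/");
         p = p.substring(0, Math.max(p.lastIndexOf('/'), 1))) {
      if (protectedDirs.contains(p)) {
        return true;
      }
    }
    return false;
  }

  public static void main(String[] args) {
    Set<String> protectedDirs = Collections.singleton("/testA");
    // Deleting or renaming /testA/testB would be rejected because /testA is protected.
    System.out.println(isUnderProtectedDirectory(protectedDirs, "/testA/testB")); // true
  }
}
{code}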



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-11041) Unable to unregister FsDatasetState MBean if if DataNode is shutdown twice

2020-04-11 Thread Ayush Saxena (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-11041?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17081367#comment-17081367
 ] 

Ayush Saxena commented on HDFS-11041:
-

Thanx [~weichiu] for the report.
Is this issue still there? Any reasons for holding this?

> Unable to unregister FsDatasetState MBean if if DataNode is shutdown twice
> --
>
> Key: HDFS-11041
> URL: https://issues.apache.org/jira/browse/HDFS-11041
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: datanode
>Reporter: Wei-Chiu Chuang
>Assignee: Wei-Chiu Chuang
>Priority: Trivial
> Attachments: HDFS-11041.01.patch, HDFS-11041.02.patch, 
> HDFS-11041.03.patch
>
>
> I saw error message like the following in some tests
> {noformat}
> 2016-10-21 04:09:03,900 [main] WARN  util.MBeans 
> (MBeans.java:unregister(114)) - Error unregistering 
> Hadoop:service=DataNode,name=FSDatasetState-33cd714c-0b1a-471f-8efe-f431d7d874bc
> javax.management.InstanceNotFoundException: 
> Hadoop:service=DataNode,name=FSDatasetState-33cd714c-0b1a-471f-8efe-f431d7d874bc
>   at 
> com.sun.jmx.interceptor.DefaultMBeanServerInterceptor.getMBean(DefaultMBeanServerInterceptor.java:1095)
>   at 
> com.sun.jmx.interceptor.DefaultMBeanServerInterceptor.exclusiveUnregisterMBean(DefaultMBeanServerInterceptor.java:427)
>   at 
> com.sun.jmx.interceptor.DefaultMBeanServerInterceptor.unregisterMBean(DefaultMBeanServerInterceptor.java:415)
>   at 
> com.sun.jmx.mbeanserver.JmxMBeanServer.unregisterMBean(JmxMBeanServer.java:546)
>   at org.apache.hadoop.metrics2.util.MBeans.unregister(MBeans.java:112)
>   at 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl.shutdown(FsDatasetImpl.java:2127)
>   at 
> org.apache.hadoop.hdfs.server.datanode.DataNode.shutdown(DataNode.java:2016)
>   at 
> org.apache.hadoop.hdfs.MiniDFSCluster.shutdownDataNodes(MiniDFSCluster.java:1985)
>   at 
> org.apache.hadoop.hdfs.MiniDFSCluster.shutdown(MiniDFSCluster.java:1962)
>   at 
> org.apache.hadoop.hdfs.MiniDFSCluster.shutdown(MiniDFSCluster.java:1936)
>   at 
> org.apache.hadoop.hdfs.MiniDFSCluster.shutdown(MiniDFSCluster.java:1929)
>   at 
> org.apache.hadoop.hdfs.TestDatanodeReport.testDatanodeReport(TestDatanodeReport.java:144)
> {noformat}
> The test shuts down the datanode and then shuts down the cluster, which shuts down 
> the datanode twice. Resetting the FsDatasetSpi reference in DataNode to null 
> resolves the issue.
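
A minimal sketch of the general pattern the last sentence points at: clear the dataset reference after the first shutdown so a second shutdown becomes a no-op and does not try to unregister the MBean again. {{FsDatasetLike}} is a hypothetical stand-in for FsDatasetSpi, not the actual patch:
{code:java}
public class DoubleShutdownSketch {
  interface FsDatasetLike {
    void shutdown();
  }

  private FsDatasetLike data =
      () -> System.out.println("unregistering FSDatasetState MBean");

  synchronized void shutdown() {
    if (data != null) {
      data.shutdown(); // unregisters the MBean exactly once
      data = null;     // a second shutdown() finds null and skips the unregister
    }
  }

  public static void main(String[] args) {
    DoubleShutdownSketch dn = new DoubleShutdownSketch();
    dn.shutdown();
    dn.shutdown();     // no InstanceNotFoundException on the second call
  }
}
{code}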



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-15247) RBF: Provide Non DFS Used per DataNode in DataNode UI

2020-04-11 Thread Lisheng Sun (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-15247?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lisheng Sun updated HDFS-15247:
---
Attachment: HDFS-15247.002.patch

> RBF: Provide Non DFS Used per DataNode in DataNode UI
> -
>
> Key: HDFS-15247
> URL: https://issues.apache.org/jira/browse/HDFS-15247
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Reporter: Ayush Saxena
>Assignee: Lisheng Sun
>Priority: Major
> Attachments: HDFS-15247.001.patch, HDFS-15247.002.patch, Screenshot 
> from 2020-04-09 17-19-30.png
>
>




--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-15247) RBF: Provide Non DFS Used per DataNode in DataNode UI

2020-04-11 Thread Lisheng Sun (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-15247?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lisheng Sun updated HDFS-15247:
---
Attachment: (was: HDFS-15247.002.patch)

> RBF: Provide Non DFS Used per DataNode in DataNode UI
> -
>
> Key: HDFS-15247
> URL: https://issues.apache.org/jira/browse/HDFS-15247
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Reporter: Ayush Saxena
>Assignee: Lisheng Sun
>Priority: Major
> Attachments: HDFS-15247.001.patch, Screenshot from 2020-04-09 
> 17-19-30.png
>
>




--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-12862) CacheDirective becomes invalid when NN restart or failover

2020-04-11 Thread Brahma Reddy Battula (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-12862?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Brahma Reddy Battula updated HDFS-12862:

Target Version/s: 3.4.0

Bulk update: moved all 3.3.0 non-blocker issues, please move back if it is a 
blocker.



> CacheDirective becomes invalid when NN restart or failover
> --
>
> Key: HDFS-12862
> URL: https://issues.apache.org/jira/browse/HDFS-12862
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: caching, hdfs
>Affects Versions: 2.7.1
> Environment: 
>Reporter: Wang XL
>Assignee: Wang XL
>Priority: Major
>  Labels: patch
> Fix For: 3.3.0
>
> Attachments: HDFS-12862-branch-2.7.1.001.patch, 
> HDFS-12862-trunk.002.patch, HDFS-12862-trunk.003.patch, 
> HDFS-12862-trunk.004.patch, HDFS-12862.005.patch, HDFS-12862.006.patch, 
> HDFS-12862.007.patch, HDFS-12862.branch-3.1.patch
>
>
> The logic in FSNDNCacheOp#modifyCacheDirective is not correct. When modifying a 
> cacheDirective, the expiration in the directive may be a relative expiryTime, and 
> the EditLog will serialize a relative expiry time.
> {code:java}
> // Some comments here
> static void modifyCacheDirective(
>   FSNamesystem fsn, CacheManager cacheManager, CacheDirectiveInfo 
> directive,
>   EnumSet<CacheFlag> flags, boolean logRetryCache) throws IOException {
> final FSPermissionChecker pc = getFsPermissionChecker(fsn);
> cacheManager.modifyDirective(directive, pc, flags);
> fsn.getEditLog().logModifyCacheDirectiveInfo(directive, logRetryCache);
>   }
> {code}
> But when the SBN replays the log, it will invoke 
> FSImageSerialization#readCacheDirectiveInfo, which reads it as an absolute 
> expiryTime. This will result in an inconsistency.
> {code:java}
>   public static CacheDirectiveInfo readCacheDirectiveInfo(DataInput in)
>   throws IOException {
> CacheDirectiveInfo.Builder builder =
> new CacheDirectiveInfo.Builder();
> builder.setId(readLong(in));
> int flags = in.readInt();
> if ((flags & 0x1) != 0) {
>   builder.setPath(new Path(readString(in)));
> }
> if ((flags & 0x2) != 0) {
>   builder.setReplication(readShort(in));
> }
> if ((flags & 0x4) != 0) {
>   builder.setPool(readString(in));
> }
> if ((flags & 0x8) != 0) {
>   builder.setExpiration(
>   CacheDirectiveInfo.Expiration.newAbsolute(readLong(in)));
> }
> if ((flags & ~0xF) != 0) {
>   throw new IOException("unknown flags set in " +
>   "ModifyCacheDirectiveInfoOp: " + flags);
> }
> return builder.build();
>   }
> {code}
> In other words, fsn.getEditLog().logModifyCacheDirectiveInfo(directive, 
> logRetryCache) may serialize a relative expiry time, but 
> builder.setExpiration(CacheDirectiveInfo.Expiration.newAbsolute(readLong(in)))
> reads it as an absolute expiryTime.
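
Put differently, a relative expiry would have to be converted to an absolute timestamp before it is logged, because the read side always treats the stored value as absolute. A minimal sketch of that conversion ({{toAbsoluteMillis}} is a hypothetical helper, not the actual fix):
{code:java}
public final class CacheExpirationSketch {
  // A relative expiry is an offset from "now"; the edit log reader expects an absolute time.
  static long toAbsoluteMillis(long expiryMillis, boolean isRelative, long nowMillis) {
    return isRelative ? nowMillis + expiryMillis : expiryMillis;
  }

  public static void main(String[] args) {
    long now = System.currentTimeMillis();
    // A relative expiry of 1 hour becomes an absolute point in time before logging.
    System.out.println("absolute expiry to log: "
        + toAbsoluteMillis(60L * 60 * 1000, true, now));
  }
}
{code}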



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-10919) Provide admin/debug tool to dump out info of a given block

2020-04-11 Thread Brahma Reddy Battula (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-10919?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Brahma Reddy Battula updated HDFS-10919:

Target Version/s: 3.4.0  (was: 3.3.0)

Bulk update: moved all 3.3.0 non-blocker issues, please move back if it is a 
blocker.



> Provide admin/debug tool to dump out info of a given block
> --
>
> Key: HDFS-10919
> URL: https://issues.apache.org/jira/browse/HDFS-10919
> Project: Hadoop HDFS
>  Issue Type: New Feature
>  Components: hdfs
>Reporter: Yongjun Zhang
>Assignee: Shweta
>Priority: Major
>
> We have fsck to find out the blocks associated with a file, which is nice. 
> Sometimes, when we see trouble with a specific block, we'd like to collect info 
> about this block, such as:
> * what file this block belongs to, 
> * where the replicas of this block are located, 
> * whether the block is EC coded; 
> * if a block is EC coded, whether it's a data block or a code (parity) block,
> * if a block is EC coded, what the codec is,
> * if a block is EC coded, what the block group is,
> * for the block group, what the other blocks are.
> Creating this jira to provide such a util, as dfsadmin or a debug tool.
> Thanks.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-11942) make new chooseDataNode policy work in more operation like seek, fetch

2020-04-11 Thread Brahma Reddy Battula (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-11942?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Brahma Reddy Battula updated HDFS-11942:

Target Version/s: 3.4.0  (was: 3.3.0)

Bulk update: moved all 3.3.0 non-blocker issues, please move back if it is a 
blocker.



> make new  chooseDataNode policy  work in more operation like seek, fetch
> 
>
> Key: HDFS-11942
> URL: https://issues.apache.org/jira/browse/HDFS-11942
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: hdfs-client
>Affects Versions: 2.6.0, 2.7.0, 3.0.0-alpha3
>Reporter: Fangyuan Deng
>Priority: Major
> Attachments: HDFS-11942.0.patch, HDFS-11942.1.patch, 
> ssd-first-disable(default).png, ssd-first-enable.png
>
>
> With the default policy, if a file is ONE_SSD, the client will prefer reading the 
> local disk replica rather than the remote SSD replica.
> But now, PCI-e SSDs and 10G ethernet make reading a remote SSD faster than 
> the local disk.
> HDFS-9666 gives us a patch, but the code is not complete and has not been updated 
> for a long time.
> This sub-task issue provides a complete patch, and 
> we have tested it on three machines [ 32 core cpu, 128G mem, 1000M network, 
> 1.2T HDD, 800G SSD (Intel P3600) ].
> With this feature, the throughput of an HBase table (ONE_SSD) is double that 
> without this feature.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-11161) Incorporate Baidu Yun BOS file system implementation

2020-04-11 Thread Brahma Reddy Battula (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-11161?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Brahma Reddy Battula updated HDFS-11161:

Target Version/s: 3.4.0  (was: 3.3.0)

Bulk update: moved all 3.3.0 non-blocker issues, please move back if it is a 
blocker.



> Incorporate Baidu Yun BOS file system implementation
> 
>
> Key: HDFS-11161
> URL: https://issues.apache.org/jira/browse/HDFS-11161
> Project: Hadoop HDFS
>  Issue Type: New Feature
>  Components: fs
>Reporter: Faen Zhang
>Priority: Major
>   Original Estimate: 840h
>  Remaining Estimate: 840h
>
> Baidu Yun ( https://cloud.baidu.com/ ) is one of the top-tier cloud computing 
> providers. Baidu Yun BOS is widely used among China's cloud users, but 
> currently it is not easy to access data stored on BOS from a user's 
> Hadoop/Spark application, because there is no native support for BOS in Hadoop.
> This work aims to integrate Baidu Yun BOS with Hadoop. With simple 
> configuration, Spark/Hadoop applications can read/write data from BOS without 
> any code change, narrowing the gap between the user's application and data 
> storage, as has been done for S3 and Aliyun OSS in Hadoop.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-15131) FoldedTreeSet appears to degrade over time

2020-04-11 Thread Brahma Reddy Battula (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-15131?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Brahma Reddy Battula updated HDFS-15131:

Target Version/s: 3.4.0  (was: 3.3.0)

Bulk update: moved all 3.3.0 non-blocker issues, please move back if it is a 
blocker.



> FoldedTreeSet appears to degrade over time
> --
>
> Key: HDFS-15131
> URL: https://issues.apache.org/jira/browse/HDFS-15131
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: datanode, namenode
>Affects Versions: 3.3.0
>Reporter: Stephen O'Donnell
>Assignee: Stephen O'Donnell
>Priority: Major
>
> We have seen some occurrences of the Namenode getting very slow on delete 
> operations, to the point where IBRs get blocked frequently and files fail to 
> close. On one cluster in particular, after about 4 weeks of uptime, the 
> Namenode started responding very poorly. Restarting it corrected the problem 
> for another 4 weeks. 
> In that example, jstacks in the namenode always pointed to slow operations 
> around an HDFS delete call which was performing an operation on the 
> FoldedTreeSet structure. The captured jstacks always pointed at an operation 
> on the folded tree set each time they were sampled:
> {code}
> "IPC Server handler 573 on 8020" #663 daemon prio=5 os_prio=0 
> tid=0x7fe6a4087800 nid=0x97a6 runnable [0x7fe67bdfd000]
>java.lang.Thread.State: RUNNABLE
>   at 
> org.apache.hadoop.hdfs.util.FoldedTreeSet.removeAndGet(FoldedTreeSet.java:879)
>   at 
> org.apache.hadoop.hdfs.util.FoldedTreeSet.remove(FoldedTreeSet.java:911)
>   at 
> org.apache.hadoop.hdfs.server.blockmanagement.DatanodeStorageInfo.removeBlock(DatanodeStorageInfo.java:263)
>   at 
> org.apache.hadoop.hdfs.server.blockmanagement.BlocksMap.removeBlock(BlocksMap.java:194)
>   at 
> org.apache.hadoop.hdfs.server.blockmanagement.BlocksMap.removeBlock(BlocksMap.java:108)
>   at 
> org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.removeBlockFromMap(BlockManager.java:3676)
>   at 
> org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.removeBlock(BlockManager.java:3507)
>   at 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.removeBlocks(FSNamesystem.java:4158)
>   at 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.deleteInternal(FSNamesystem.java:4132)
>   at 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.deleteInt(FSNamesystem.java:4069)
>   at 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.delete(FSNamesystem.java:4053)
>   at 
> org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.delete(NameNodeRpcServer.java:845)
>   at 
> org.apache.hadoop.hdfs.server.namenode.AuthorizationProviderProxyClientProtocol.delete(AuthorizationProviderProxyClientProtocol.java:308)
>   at 
> org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.delete(ClientNamenodeProtocolServerSideTranslatorPB.java:603)
>   at 
> org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)
>   at 
> org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:617)
>   at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:1073)
>   at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2216)
>   at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2212)
>   at java.security.AccessController.doPrivileged(Native Method)
>   at javax.security.auth.Subject.doAs(Subject.java:422)
> {code}
> The observation in this case was that the namenode worked fine after a 
> restart and then at some point after about 4 weeks of uptime, this problem 
> started happening, and it would persist until the namenode was restarted. 
> Then the problem did not return for about another 4 weeks.
> On a completely different cluster and version, I recently came across a 
> problem where files were again failing to close (last block does not have 
> sufficient number of replicas) and the datanodes were logging a lot of 
> messages like the following:
> {code}
> 2019-11-27 09:00:49,678 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: 
> Took 21540ms to process 1 commands from NN
> {code}
> These messages had a range of durations and were fairly frequent. Focusing on 
> the longer messages at around 20 seconds and checking a few different 
> occurrences, jstacks showed a very similar problem to the NN issue, in that 
> the FoldedTreeSet seems to be at the root of the problem.
> The heartbeat thread is waiting on a lock:
> {code}
> "Thread-26" #243 daemon prio=5 os_prio=0 tid=0x7f575225a000 nid=0x1d8bae 
> waiting for monitor entry [0x7f571ed76000]
>java.lang.Thread.State: BLOCKED (on object monitor)
> at 
> org.ap

[jira] [Updated] (HDFS-8430) Erasure coding: compute file checksum for striped files (stripe by stripe)

2020-04-11 Thread Brahma Reddy Battula (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-8430?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Brahma Reddy Battula updated HDFS-8430:
---
Target Version/s: 3.4.0  (was: 3.3.0)

Bulk update: moved all 3.3.0 non-blocker issues, please move back if it is a 
blocker.



> Erasure coding: compute file checksum for striped files (stripe by stripe)
> --
>
> Key: HDFS-8430
> URL: https://issues.apache.org/jira/browse/HDFS-8430
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: erasure-coding
>Affects Versions: HDFS-7285
>Reporter: Walter Su
>Assignee: Kai Zheng
>Priority: Major
> Attachments: HDFS-8430-poc1.patch
>
>
> HADOOP-3981 introduces a distributed file checksum algorithm. It's designed 
> for replicated blocks.
> {{DFSClient.getFileChecksum()}} needs some updates so that it can work for striped 
> block groups.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-13287) TestINodeFile#testGetBlockType results in NPE when run alone

2020-04-11 Thread Brahma Reddy Battula (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-13287?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Brahma Reddy Battula updated HDFS-13287:

Target Version/s: 3.4.0

Bulk update: moved all 3.3.0 non-blocker issues, please move back if it is a 
blocker.



> TestINodeFile#testGetBlockType results in NPE when run alone
> 
>
> Key: HDFS-13287
> URL: https://issues.apache.org/jira/browse/HDFS-13287
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: Virajith Jalaparti
>Assignee: Virajith Jalaparti
>Priority: Minor
> Fix For: 3.3.0, 3.1.4
>
> Attachments: HDFS-13287.01.patch
>
>
> When TestINodeFile#testGetBlockType is run by itself, it results in the 
> following error:
> {code:java}
> [ERROR] Tests run: 1, Failures: 0, Errors: 1, Skipped: 0, Time elapsed: 0.218 
> s <<< FAILURE! - in org.apache.hadoop.hdfs.server.namenode.TestINodeFile
> [ERROR] 
> testGetBlockType(org.apache.hadoop.hdfs.server.namenode.TestINodeFile)  Time 
> elapsed: 0.023 s  <<< ERROR!
> java.lang.NullPointerException
> at 
> org.apache.hadoop.hdfs.server.namenode.ErasureCodingPolicyManager.getPolicyInfoByID(ErasureCodingPolicyManager.java:220)
> at 
> org.apache.hadoop.hdfs.server.namenode.ErasureCodingPolicyManager.getByID(ErasureCodingPolicyManager.java:208)
> at 
> org.apache.hadoop.hdfs.server.namenode.INodeFile$HeaderFormat.getBlockLayoutRedundancy(INodeFile.java:207)
> at 
> org.apache.hadoop.hdfs.server.namenode.INodeFile.(INodeFile.java:266)
> at 
> org.apache.hadoop.hdfs.server.namenode.TestINodeFile.createStripedINodeFile(TestINodeFile.java:112)
> at 
> org.apache.hadoop.hdfs.server.namenode.TestINodeFile.testGetBlockType(TestINodeFile.java:299)
> 
> {code}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-13998) ECAdmin NPE with -setPolicy -replicate

2020-04-11 Thread Brahma Reddy Battula (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-13998?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Brahma Reddy Battula updated HDFS-13998:

Target Version/s: 3.4.0  (was: 3.3.0)

Bulk update: moved all 3.3.0 non-blocker issues, please move back if it is a 
blocker.



> ECAdmin NPE with -setPolicy -replicate
> --
>
> Key: HDFS-13998
> URL: https://issues.apache.org/jira/browse/HDFS-13998
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: erasure-coding
>Affects Versions: 3.2.0, 3.1.2
>Reporter: Xiao Chen
>Assignee: Zsolt Venczel
>Priority: Major
> Attachments: HDFS-13998.01.patch, HDFS-13998.02.patch, 
> HDFS-13998.03.patch
>
>
> HDFS-13732 tried to improve the output of the console tool. But we missed the 
> fact that for replication, {{getErasureCodingPolicy}} would return null.
> This jira is to fix it in ECAdmin, and add a unit test.
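
A minimal sketch of the shape of such a fix: when a path uses plain replication the policy lookup yields null, so the console output has to branch instead of dereferencing the policy. {{describePolicy}} is a hypothetical helper, not the actual ECAdmin code:
{code:java}
public final class EcPolicyOutputSketch {
  static String describePolicy(String ecPolicyName, String path) {
    if (ecPolicyName == null) {
      // Replicated path: there is no erasure coding policy object to print.
      return "Set " + path + " to use replication";
    }
    return "Set erasure coding policy " + ecPolicyName + " on " + path;
  }

  public static void main(String[] args) {
    System.out.println(describePolicy(null, "/data"));          // replication case
    System.out.println(describePolicy("RS-6-3-1024k", "/ec"));  // EC case
  }
}
{code}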



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-11503) Integrate Chocolate Cloud RS coder implementation

2020-04-11 Thread Brahma Reddy Battula (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-11503?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Brahma Reddy Battula updated HDFS-11503:

Target Version/s: 3.4.0  (was: 3.3.0)

Bulk update: moved all 3.3.0 non-blocker issues, please move back if it is a 
blocker.



> Integrate Chocolate Cloud RS coder implementation
> -
>
> Key: HDFS-11503
> URL: https://issues.apache.org/jira/browse/HDFS-11503
> Project: Hadoop HDFS
>  Issue Type: New Feature
>  Components: erasure-coding
>Affects Versions: 3.0.0-alpha2
>Reporter: Andrew Wang
>Assignee: Marcell Feher
>Priority: Major
> Attachments: HDFS-11503.patch
>
>
> Quote from Marcell on HDFS-7285:
> First of all let me introduce ourselves: we are Chocolate Cloud from Denmark, 
> we use erasure coding to improve storage solutions. We already have 
> Reed-Solomon and Random Linear Network Coding backends for Liberasurecode, 
> and now we are at the final stage of developing our RS plugin to HDFS-EC. The 
> performance of our plugin is similar to ISA-L's, in some configurations we 
> are better, in others we are worse (our initial speed comparison charts can 
> be found here: https://www.chocolate-cloud.cc/Plugins/HDFS-EC/hdfs.html).
> We would like our plugin to become officially supported in Hadoop 3.0. We can 
> already provide a preliminary version of our (native) library and a patch 
> with the necessary glue code for the next alpha release.
> I'd like to know your thoughts about whether it's possible and how it could 
> be achieved.
> P.S: I'm happy to share more details if there's interest



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-14903) Update access time in toCompleteFile

2020-04-11 Thread Brahma Reddy Battula (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-14903?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Brahma Reddy Battula updated HDFS-14903:

Target Version/s: 3.4.0  (was: 3.3.0)

Bulk update: moved all 3.3.0 non-blocker issues, please move back if it is a 
blocker.



> Update access time in toCompleteFile
> 
>
> Key: HDFS-14903
> URL: https://issues.apache.org/jira/browse/HDFS-14903
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Affects Versions: 3.3.0
>Reporter: lihanran
>Assignee: lihanran
>Priority: Major
> Attachments: HDFS-14903.001.patch
>
>
> When creating a file, the access time and modification time are different.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-10274) Move NameSystem#isInStartupSafeMode() to BlockManagerSafeMode

2020-04-11 Thread Brahma Reddy Battula (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-10274?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Brahma Reddy Battula updated HDFS-10274:

Target Version/s: 3.4.0  (was: 3.3.0)

Bulk update: moved all 3.3.0 non-blocker issues, please move back if it is a 
blocker.



> Move NameSystem#isInStartupSafeMode() to BlockManagerSafeMode
> -
>
> Key: HDFS-10274
> URL: https://issues.apache.org/jira/browse/HDFS-10274
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Reporter: Vinayakumar B
>Assignee: Vinayakumar B
>Priority: Major
> Attachments: HDFS-10274-01.patch
>
>
> To reduce the number of methods in the Namesystem interface and for a cleaner 
> refactor, it's better to move {{isInStartupSafeMode()}} to BlockManager and 
> BlockManagerSafeMode, as most of the callers are in BlockManager. This way one 
> more interface method can be removed.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-7273) Add start and stop wrapper scripts for mover

2020-04-11 Thread Brahma Reddy Battula (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-7273?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Brahma Reddy Battula updated HDFS-7273:
---
Target Version/s: 3.4.0  (was: 3.3.0)

Bulk update: moved all 3.3.0 non-blocker issues, please move back if it is a 
blocker.



> Add start and stop wrapper scripts for mover
> 
>
> Key: HDFS-7273
> URL: https://issues.apache.org/jira/browse/HDFS-7273
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: balancer & mover
>Affects Versions: 2.6.0
>Reporter: Benoy Antony
>Assignee: Benoy Antony
>Priority: Minor
>
> Similar to the balancer, we need start/stop mover scripts to run the data migration 
> tool as a daemon.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-11260) Slow writer threads are not stopped

2020-04-11 Thread Brahma Reddy Battula (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-11260?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Brahma Reddy Battula updated HDFS-11260:

Target Version/s: 3.4.0  (was: 3.3.0)

Bulk update: moved all 3.3.0 non-blocker issues, please move back if it is a 
blocker.



> Slow writer threads are not stopped
> ---
>
> Key: HDFS-11260
> URL: https://issues.apache.org/jira/browse/HDFS-11260
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: datanode
>Affects Versions: 2.7.0
> Environment: CDH5.8.0
>Reporter: Wei-Chiu Chuang
>Assignee: Wei-Chiu Chuang
>Priority: Major
>
> If a DataNode receives a transferred block, it tries to stop the writer to the 
> same block. However, this may not work, and we saw the following error 
> message and stacktrace.
> Fundamentally, the assumption in {{ReplicaInPipeline#stopWriter}} is wrong. 
> It assumes the writer thread must be a DataXceiver thread, which can be 
> interrupted and terminates afterwards. However, an IPC thread may also be the 
> writer thread by calling initReplicaRecovery, and it ignores the interrupt and 
> does not terminate. 
> {noformat}
> 2016-12-16 19:58:56,167 WARN org.apache.hadoop.hdfs.server.datanode.DataNode: 
> Join on writer thread Thread[IPC Server handler 6 on 50020,5,main] timed out
> sun.misc.Unsafe.park(Native Method)
> java.util.concurrent.locks.LockSupport.parkNanos(LockSupport.java:226)
> java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.awaitNanos(AbstractQueuedSynchronizer.java:2082)
> java.util.concurrent.LinkedBlockingQueue.poll(LinkedBlockingQueue.java:467)
> org.apache.hadoop.ipc.CallQueueManager.take(CallQueueManager.java:135)
> org.apache.hadoop.ipc.Server$Handler.run(Server.java:2052)
> 2016-12-16 19:58:56,167 WARN org.apache.hadoop.hdfs.server.datanode.DataNode: 
> IOException in BlockReceiver constructor. Cause is
> 2016-12-16 19:58:56,168 ERROR 
> org.apache.hadoop.hdfs.server.datanode.DataNode: 
> sj1dra082.corp.adobe.com:50010:DataXceiver error processing WRITE_BLOCK 
> operation  src: /10.10.0.80:44105 dst: /10.10.0.82:50010
> java.io.IOException: Join on writer thread Thread[IPC Server handler 6 on 
> 50020,5,main] timed out
> at 
> org.apache.hadoop.hdfs.server.datanode.ReplicaInPipeline.stopWriter(ReplicaInPipeline.java:212)
> at 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl.createTemporary(FsDatasetImpl.java:1579)
> at 
> org.apache.hadoop.hdfs.server.datanode.BlockReceiver.(BlockReceiver.java:195)
> at 
> org.apache.hadoop.hdfs.server.datanode.DataXceiver.writeBlock(DataXceiver.java:669)
> at 
> org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.opWriteBlock(Receiver.java:169)
> at 
> org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.processOp(Receiver.java:106)
> at 
> org.apache.hadoop.hdfs.server.datanode.DataXceiver.run(DataXceiver.java:246)
> at java.lang.Thread.run(Thread.java:745)
> {noformat}
> There is also a logic error in FsDatasetImpl#createTemporary, in which if the 
> code in the synchronized block executes for more than 60 seconds (in theory), 
> it could throw an exception, without trying to stop the existing slow writer.
> We saw an FsDatasetImpl#createTemporary call fail after nearly 10 minutes, and 
> it's unclear why yet. It's my understanding that the code intends to stop 
> slow writers after 1 minute by default. Some code rewrite is probably needed 
> to get the logic right.
> {noformat}
> 2016-12-16 23:12:24,636 WARN 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl: Unable 
> to stop existing writer for block 
> BP-1527842723-10.0.0.180-1367984731269:blk_4313782210_1103780331023 after 
> 568320 miniseconds.
> {noformat}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-14353) Erasure Coding: metrics xmitsInProgress become to negative.

2020-04-11 Thread Brahma Reddy Battula (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-14353?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Brahma Reddy Battula updated HDFS-14353:

Target Version/s: 3.4.0

Bulk update: moved all 3.3.0 non-blocker issues, please move back if it is a 
blocker.



> Erasure Coding: metrics xmitsInProgress become to negative.
> ---
>
> Key: HDFS-14353
> URL: https://issues.apache.org/jira/browse/HDFS-14353
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: datanode, erasure-coding
>Affects Versions: 3.3.0
>Reporter: maobaolong
>Assignee: maobaolong
>Priority: Major
> Fix For: 3.3.0
>
> Attachments: HDFS-14353.001.patch, HDFS-14353.002.patch, 
> HDFS-14353.003.patch, HDFS-14353.004.patch, HDFS-14353.005.patch, 
> HDFS-14353.006.patch, HDFS-14353.007.patch, HDFS-14353.008.patch, 
> HDFS-14353.009.patch, screenshot-1.png
>
>




--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-14897) [hadoop-hdfs] Fix order of actual and expected expression in assert statements

2020-04-11 Thread Brahma Reddy Battula (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-14897?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Brahma Reddy Battula updated HDFS-14897:

Target Version/s: 3.4.0  (was: 3.3.0)

Bulk update: moved all 3.3.0 non-blocker issues, please move back if it is a 
blocker.



> [hadoop-hdfs] Fix order of actual and expected expression in assert statements
> --
>
> Key: HDFS-14897
> URL: https://issues.apache.org/jira/browse/HDFS-14897
> Project: Hadoop HDFS
>  Issue Type: Bug
>Affects Versions: 3.2.0
>Reporter: Adam Antal
>Priority: Major
>
> Fix the order of the actual and expected expressions in assert statements, which 
> gives a misleading message when a test case fails. The attached file has some of 
> the places where the order is wrong.
> {code:java}
> [ERROR] 
> testNodeRemovalGracefully(org.apache.hadoop.yarn.server.resourcemanager.TestResourceTrackerService)
>   Time elapsed: 3.385 s  <<< FAILURE!
> java.lang.AssertionError: Shutdown nodes should be 0 now expected:<1> but 
> was:<0>
> {code}
> For long term, [AssertJ|http://joel-costigliola.github.io/assertj/] can be 
> used for new test cases which avoids such mistakes.
> This is a follow-up jira for the hadoop-hdfs project.
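
For illustration, a short sketch of the pattern in question, assuming JUnit 4's {{assertEquals(message, expected, actual)}}; swapping the last two arguments is what produces misleading text like the one quoted above:
{code:java}
import static org.junit.Assert.assertEquals;

public class AssertOrderSketch {
  void wrongOrder(int shutdownNodes) {
    // Arguments swapped: on failure the message reads "expected:<actual> but was:<expected>".
    assertEquals("Shutdown nodes should be 0 now", shutdownNodes, 0);
  }

  void correctOrder(int shutdownNodes) {
    // Expected value first, observed value second.
    assertEquals("Shutdown nodes should be 0 now", 0, shutdownNodes);
  }
}
{code}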



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-14261) Kerberize JournalNodeSyncer unit test

2020-04-11 Thread Brahma Reddy Battula (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-14261?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Brahma Reddy Battula updated HDFS-14261:

Target Version/s: 3.4.0

Bulk update: moved all 3.3.0 non-blocker issues, please move back if it is a 
blocker.



> Kerberize JournalNodeSyncer unit test
> -
>
> Key: HDFS-14261
> URL: https://issues.apache.org/jira/browse/HDFS-14261
> Project: Hadoop HDFS
>  Issue Type: Test
>  Components: journal-node, security, test
>Affects Versions: 3.2.0, 3.1.2
>Reporter: Siyao Meng
>Assignee: Siyao Meng
>Priority: Major
> Fix For: 3.3.0
>
> Attachments: HDFS-14261.001.patch
>
>
> This jira is an addition to HDFS-14140, making the unit tests in 
> TestJournalNodeSync run on a Kerberized cluster.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-12208) NN should consider DataNode#xmitInProgress when placing new block

2020-04-11 Thread Brahma Reddy Battula (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-12208?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Brahma Reddy Battula updated HDFS-12208:

Target Version/s: 3.4.0  (was: 3.3.0)

Bulk update: moved all 3.3.0 non-blocker issues, please move back if it is a 
blocker.



> NN should consider DataNode#xmitInProgress when placing new block
> -
>
> Key: HDFS-12208
> URL: https://issues.apache.org/jira/browse/HDFS-12208
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: block placement, erasure-coding
>Affects Versions: 3.0.0-alpha4
>Reporter: Lei (Eddy) Xu
>Assignee: Lei (Eddy) Xu
>Priority: Minor
>
> As discussed in HDFS-12044, the NN only considers xceiver counts on the DN when 
> placing new blocks. The NN should also consider background reconstruction work, 
> represented by xmits on the DN.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-14904) Balancer should pick nodes based on utilization in each iteration

2020-04-11 Thread Brahma Reddy Battula (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-14904?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Brahma Reddy Battula updated HDFS-14904:

Target Version/s: 3.4.0  (was: 3.3.0)

Bulk update: moved all 3.3.0 non-blocker issues, please move back if it is a 
blocker.



> Balancer should pick nodes based on utilization in each iteration
> -
>
> Key: HDFS-14904
> URL: https://issues.apache.org/jira/browse/HDFS-14904
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: balancer & mover
>Reporter: Leon Gao
>Assignee: Leon Gao
>Priority: Major
>
> In each iteration, the balancer should pick the nodes with the highest/lowest 
> usage first, as suggested in HDFS-14894.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-14607) Support OpenTracing in WebHDFS

2020-04-11 Thread Brahma Reddy Battula (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-14607?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Brahma Reddy Battula updated HDFS-14607:

Target Version/s: 3.4.0  (was: 3.3.0)

Bulk update: moved all 3.3.0 non-blocker issues, please move back if it is a 
blocker.



> Support OpenTracing in WebHDFS
> --
>
> Key: HDFS-14607
> URL: https://issues.apache.org/jira/browse/HDFS-14607
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Reporter: Nikhil Navadiya
>Assignee: Nikhil Navadiya
>Priority: Major
> Attachments: HDFS-14607.WIP.patch
>
>




--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-11611) Handle unsuccessful encode and decode calls

2020-04-11 Thread Brahma Reddy Battula (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-11611?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Brahma Reddy Battula updated HDFS-11611:

Target Version/s: 3.4.0  (was: 3.3.0)

Bulk update: moved all 3.3.0 non-blocker issues, please move back if it is a 
blocker.



> Handle unsuccessful encode and decode calls
> ---
>
> Key: HDFS-11611
> URL: https://issues.apache.org/jira/browse/HDFS-11611
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: erasure-coding
>Affects Versions: 3.0.0-alpha2
>Reporter: László Bence Nagy
>Priority: Minor
>
> Normally encode and decode operations are performed successfully, but there 
> can be some rare cases when they fail for some reason, specifically with the 
> native implementations. The framework should be ready to detect these cases 
> (for example by catching exceptions or by checking the operations' return 
> values) and then do some kind of rescheduling of the failed operation.
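
A minimal sketch of the catch-and-reschedule pattern the description suggests, with a hypothetical {{Coder}} interface standing in for the native encode/decode call:
{code:java}
import java.io.IOException;

public final class CoderRetrySketch {
  interface Coder {
    void decode(byte[][] inputs, int[] erasedIndexes, byte[][] outputs) throws IOException;
  }

  static void decodeWithRetry(Coder coder, byte[][] inputs, int[] erased,
      byte[][] outputs, int maxAttempts) throws IOException {
    IOException last = null;
    for (int attempt = 1; attempt <= maxAttempts; attempt++) {
      try {
        coder.decode(inputs, erased, outputs);
        return;        // success
      } catch (IOException e) {
        last = e;      // rare native failure: remember it and retry/reschedule
      }
    }
    throw last;        // give up after maxAttempts and surface the failure
  }
}
{code}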



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-11927) hdfs oiv_legacy:Normalize the verification of input parameter

2020-04-11 Thread Brahma Reddy Battula (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-11927?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Brahma Reddy Battula updated HDFS-11927:

Target Version/s: 3.4.0  (was: 3.3.0)

Bulk update: moved all 3.3.0 non-blocker issues, please move back if it is a 
blocker.



> hdfs oiv_legacy:Normalize the verification of input parameter
> -
>
> Key: HDFS-11927
> URL: https://issues.apache.org/jira/browse/HDFS-11927
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: tools
>Affects Versions: 3.0.0-alpha3
>Reporter: LiXin Ge
>Assignee: LiXin Ge
>Priority: Major
> Attachments: HDFS-11927.002.patch, HDFS-11927.patch
>
>
> At present, the hdfs oiv_legacy tool lacks verification of its input 
> parameters. People can type in an irrelevant option like:
> bq../hdfs oiv_legacy -i fsimage_000 -o out -p XML -step 1024
> or type an option with the wrong format which they think takes effect but 
> actually does not: 
> bq../hdfs oiv_legacy -i fsimage_000 -o out -p 
> FileDistribution maxSize 4096 step 512 format
> or some meaningless words which can also get through:
> bq../hdfs oiv_legacy -i fsimage_000 -o out -p XML Hello World
> We'd better not let these cases go unchecked. 
> By the way, this patch fixed some checkstyle warnings in 
> +DelimitedImageVisitor.java+



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-12332) Logging improvement suggestion for SampleStat function MinMax.add

2020-04-11 Thread Brahma Reddy Battula (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-12332?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Brahma Reddy Battula updated HDFS-12332:

Target Version/s: 3.4.0  (was: 3.3.0)

Bulk update: moved all 3.3.0 non-blocker issues, please move back if it is a 
blocker.



> Logging improvement suggestion for SampleStat function MinMax.add
> -
>
> Key: HDFS-12332
> URL: https://issues.apache.org/jira/browse/HDFS-12332
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: metrics
>Affects Versions: 3.0.0-alpha4
>Reporter: Xu Zhao
>Priority: Minor
> Attachments: hdfs-12332.patch
>
>
> In order to debug performance anomalies, it is better to see when the max and 
> min values get updated in metrics.
> For example, in our workload we would like to see the longest latency of a 
> write block request on the DataNode.
> The following metric-updating code could possibly update the min/max value in 
> the SampleStat class:
> {code:java}
> datanode.getMetrics().addWriteBlockOp(elapsed());
> datanode.getMetrics().incrWritesFromClient(peer.isLocal(), size);
> {code}
> Here I am attaching an attempt to patch it.
> The patch just adds additional DEBUG level logging in the MinMax class and 
> prints the new value when it gets updated.
> Please let me know what you think. For example, if this patch would introduce 
> too much noise, what is the best way to trace the read/writeBlock that has 
> the minimum/maximum latency?
> Any comments are appreciated!
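
A minimal sketch of the kind of DEBUG logging proposed, assuming an SLF4J logger: log only when the running min or max actually changes, so steady-state samples stay quiet:
{code:java}
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;

public class MinMaxLoggingSketch {
  private static final Logger LOG = LoggerFactory.getLogger(MinMaxLoggingSketch.class);

  private double min = Double.MAX_VALUE;
  private double max = -Double.MAX_VALUE;

  void add(double value) {
    if (value > max) {
      max = value;
      LOG.debug("New maximum sample value: {}", value);
    }
    if (value < min) {
      min = value;
      LOG.debug("New minimum sample value: {}", value);
    }
  }
}
{code}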



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-11020) Add more doc for HDFS transparent encryption

2020-04-11 Thread Brahma Reddy Battula (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-11020?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Brahma Reddy Battula updated HDFS-11020:

Target Version/s: 3.4.0  (was: 3.3.0)

Bulk update: moved all 3.3.0 non-blocker issues, please move back if it is a 
blocker.



> Add more doc for HDFS transparent encryption
> 
>
> Key: HDFS-11020
> URL: https://issues.apache.org/jira/browse/HDFS-11020
> Project: Hadoop HDFS
>  Issue Type: Task
>  Components: documentation, encryption, fs
>Reporter: Yi Liu
>Assignee: Jennica Pounds
>Priority: Minor
>
> We need a correct version of OpenSSL that supports hardware acceleration of 
> AES CTR.
> Let's add more doc about how to configure the correct OpenSSL.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-11542) Fix RawErasureCoderBenchmark decoding operation

2020-04-11 Thread Brahma Reddy Battula (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-11542?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Brahma Reddy Battula updated HDFS-11542:

Target Version/s: 3.4.0  (was: 3.3.0)

Bulk update: moved all 3.3.0 non-blocker issues, please move back if it is a 
blocker.



> Fix RawErasureCoderBenchmark decoding operation
> ---
>
> Key: HDFS-11542
> URL: https://issues.apache.org/jira/browse/HDFS-11542
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: erasure-coding
>Affects Versions: 3.0.0-alpha2
>Reporter: László Bence Nagy
>Priority: Minor
>  Labels: test
>
> There are some issues with the decode operation in the 
> *RawErasureCoderBenchmark.java* file. The decoding method is called like 
> this: *decoder.decode(decodeInputs, ERASED_INDEXES, outputs);*. 
> Using RS 6+3 configuration it could be called with these parameters correctly 
> like this: *decode([ d0, NULL, d2, d3, NULL, d5, p0, NULL, p2 ], [ 1, 4, 7 ], 
> [ -, -, - ])*. The 1,4,7 indexes are in the *ERASED_INDEXES* array so in the 
> *decodeInputs* array the values at those indexes are set to NULL, all other 
> data and parity packets are present in the array. The *outputs* array's 
> length is 3, where the d1, d4 and p1 packets should be reconstructed. This 
> would be the right solution.
> Right now this example would be called like this: *decode([ d0, d1, d2, d3, 
> d4, d5, -, -, - ], [ 1, 4, 7 ], [ -, -, - ])*. So it has two main problems 
> with the *decodeInputs* array. Firstly, the packets are not set to NULL where 
> they should be based on the *ERASED_INDEXES* array. Secondly, it does not 
> have any parity packets for decoding.
> The first problem is easy to solve, the values at the proper indexes need to 
> be set to NULL. The latter one is a little harder because right now multiple 
> rounds of encode operations are done one after another and similarly multiple 
> decode operations are called one by one. Encode and decode pairs should be 
> called one after another so that the encoded parity packets can be used in 
> the *decodeInputs* array as a parameter for decode. (Of course, their 
> performance should be still measured separately.)
> Moreover, there is one more problem in this file. Right now it works with RS 
> 6+3 and the *ERASED_INDEXES* array is fixed to *[ 6, 7, 8 ]*. So the three 
> parity packets need to be reconstructed. This means that no real decode 
> performance is measured because no data packet needs to be reconstructed 
> (even if the decode works properly); actually, only new parity packets need 
> to be encoded. The exact implementation depends on the underlying 
> erasure coding plugin, but the point is that data packets should also be 
> erased to measure real decode performance.
> In addition to this, more RS configurations (not just 6+3) could be measured 
> as well to be able to compare them.
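
For illustration, a small sketch of how the decodeInputs array described above can be assembled: the full data plus the parity produced by a prior encode, with the erased positions nulled out (the helper below is hypothetical, not the benchmark code):
{code:java}
public final class DecodeInputsSketch {
  static byte[][] buildDecodeInputs(byte[][] data, byte[][] parity, int[] erasedIndexes) {
    byte[][] inputs = new byte[data.length + parity.length][];
    System.arraycopy(data, 0, inputs, 0, data.length);
    System.arraycopy(parity, 0, inputs, data.length, parity.length);
    for (int idx : erasedIndexes) {
      inputs[idx] = null; // the decoder must not see the "erased" packets
    }
    return inputs; // e.g. [ d0, null, d2, d3, null, d5, p0, null, p2 ] for erased {1, 4, 7}
  }
}
{code}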



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-13064) Httpfs should return json instead of html when writting to a file without Content-Type

2020-04-11 Thread Brahma Reddy Battula (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-13064?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Brahma Reddy Battula updated HDFS-13064:

Target Version/s: 3.4.0  (was: 3.3.0)

Bulk update: moved all 3.3.0 non-blocker issues, please move back if it is a 
blocker.



> Httpfs should return json instead of html when writting to a file without 
> Content-Type
> --
>
> Key: HDFS-13064
> URL: https://issues.apache.org/jira/browse/HDFS-13064
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: httpfs
>Affects Versions: 3.0.0
>Reporter: fang zhenyi
>Assignee: fang zhenyi
>Priority: Minor
> Attachments: HDFS-13064.001.patch, HDFS-13064.002.patch
>
>
> When I create an HDFS file, I get the following response.
>  
> {code:java}
> zdh102:~ # curl -i -X PUT 
> "http://10.43.183.103:14000/webhdfs/v1/2.txt?op=CREATE&user.name=hdfs&data=true";
> HTTP/1.1 400 Bad Request
> Server: Apache-Coyote/1.1
> Set-Cookie: 
> hadoop.auth="u=hdfs&p=hdfs&t=simple&e=1516901333684&s=wYqDlu/ovRxay9d6I6UmoH77KKI=";
>  Path=/; Expires= , 25- -2018 17:28:53 GMT; HttpOnly
> Content-Type: text/html;charset=utf-8
> Content-Language: en
> Content-Length: 1122
> Date: Thu, 25 Jan 2018 07:28:53 GMT
> Connection: close
> Apache Tomcat/7.0.82 - Error report
> HTTP Status 400 - Data upload requests must have content-type set to 
> 'application/octet-stream'
> type: Status report
> message: Data upload requests must have content-type set to 
> 'application/octet-stream'
> description: The request sent by the client was syntactically incorrect.
> Apache Tomcat/7.0.82
> zdh102:~ # 
> {code}
>  
>  



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-12261) Improve HdfsAdmin's javadoc regarding superusers.

2020-04-11 Thread Brahma Reddy Battula (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-12261?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Brahma Reddy Battula updated HDFS-12261:

Target Version/s: 3.4.0  (was: 3.3.0)

Bulk update: moved all 3.3.0 non-blocker issues, please move back if it is a 
blocker.



> Improve HdfsAdmin's javadoc regarding superusers.
> -
>
> Key: HDFS-12261
> URL: https://issues.apache.org/jira/browse/HDFS-12261
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: documentation
>Reporter: Xiao Chen
>Assignee: Xiao Chen
>Priority: Minor
>
> From [discussions 
> of|https://issues.apache.org/jira/browse/HDFS-10899?focusedCommentId=16113267&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-16113267]
>  HDFS-10899, we should improve the javadoc accordingly.
> {quote}
> This is not actually true, this class is just for HDFS-specific operations. 
> Putting "Admin" in the name is a misnomer, and since this continues to be 
> confusing, maybe we should enhance the class javadoc to make this explicit.
> {quote}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-12084) Scheduled Count will not decrement when file is deleted before all IBR's received

2020-04-11 Thread Brahma Reddy Battula (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-12084?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Brahma Reddy Battula updated HDFS-12084:

Target Version/s: 3.4.0  (was: 3.3.0)

Bulk update: moved all 3.3.0 non-blocker issues, please move back if it is a 
blocker.



> Scheduled Count will not decrement when file is deleted before all IBR's 
> received
> -
>
> Key: HDFS-12084
> URL: https://issues.apache.org/jira/browse/HDFS-12084
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: namenode
>Reporter: Brahma Reddy Battula
>Assignee: Brahma Reddy Battula
>Priority: Major
> Attachments: HDFS-12084-001.patch, HDFS-12084-002.patch, 
> HDFS-12084-003.patch, HDFS-12084-branch-2.patch
>
>
> When small file creation && deletion happens very frequently and the DNs have not 
> reported the blocks to the NN before deletion, the scheduled count keeps 
> incrementing and is never decremented because the blocks are already deleted.
> *Note*: Every 20 mins this count can be rolled, but within those 20 mins the count 
> can grow large because of the many operations.
> This is observed more often when batchIBR is enabled with committed allowed=1.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-11397) TestThrottledAsyncChecker#testCancellation timed out

2020-04-11 Thread Brahma Reddy Battula (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-11397?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Brahma Reddy Battula updated HDFS-11397:

Target Version/s: 3.4.0  (was: 3.3.0)

Bulk update: moved all 3.3.0 non-blocker issues, please move back if it is a 
blocker.



> TestThrottledAsyncChecker#testCancellation timed out
> 
>
> Key: HDFS-11397
> URL: https://issues.apache.org/jira/browse/HDFS-11397
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: datanode, test
>Affects Versions: 3.0.0-alpha4
>Reporter: John Zhuge
>Assignee: Manjunath Anand
>Priority: Minor
> Attachments: HDFS-11397-V01.patch
>
>
> {noformat}
> Tests run: 5, Failures: 0, Errors: 1, Skipped: 0, Time elapsed: 61.153 sec 
> <<< FAILURE! - in 
> org.apache.hadoop.hdfs.server.datanode.checker.TestThrottledAsyncChecker
> testCancellation(org.apache.hadoop.hdfs.server.datanode.checker.TestThrottledAsyncChecker)
>   Time elapsed: 60.033 sec  <<< ERROR!
> java.lang.Exception: test timed out after 6 milliseconds
>   at sun.misc.Unsafe.park(Native Method)
>   at java.util.concurrent.locks.LockSupport.park(LockSupport.java:175)
>   at java.util.concurrent.FutureTask.awaitDone(FutureTask.java:429)
>   at java.util.concurrent.FutureTask.get(FutureTask.java:191)
>   at 
> org.apache.hadoop.hdfs.server.datanode.checker.TestThrottledAsyncChecker.testCancellation(TestThrottledAsyncChecker.java:114)
> {noformat}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-12763) DataStreamer should heartbeat during flush

2020-04-11 Thread Brahma Reddy Battula (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-12763?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Brahma Reddy Battula updated HDFS-12763:

Target Version/s: 3.4.0  (was: 3.3.0)

Bulk update: moved all 3.3.0 non-blocker issues, please move back if it is a 
blocker.



> DataStreamer should heartbeat during flush
> --
>
> Key: HDFS-12763
> URL: https://issues.apache.org/jira/browse/HDFS-12763
> Project: Hadoop HDFS
>  Issue Type: Bug
>Affects Versions: 2.8.1
>Reporter: Kuhu Shukla
>Assignee: Kuhu Shukla
>Priority: Major
> Attachments: HDFS-12763.001.patch
>
>
> From HDFS-5032:
> bq. Absence of heartbeat during flush will be fixed in a separate jira by 
> Daryn Sharp
> This JIRA tracks the case where absence of heartbeat can cause the pipeline 
> to fail if operations like flush take some time to complete.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-11398) TestDataNodeVolumeFailure#testUnderReplicationAfterVolFailure still fails intermittently

2020-04-11 Thread Brahma Reddy Battula (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-11398?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Brahma Reddy Battula updated HDFS-11398:

Target Version/s: 3.4.0  (was: 3.3.0)

Bulk update: moved all 3.3.0 non-blocker issues, please move back if it is a 
blocker.



> TestDataNodeVolumeFailure#testUnderReplicationAfterVolFailure still fails 
> intermittently
> 
>
> Key: HDFS-11398
> URL: https://issues.apache.org/jira/browse/HDFS-11398
> Project: Hadoop HDFS
>  Issue Type: Bug
>Affects Versions: 3.0.0-alpha2
>Reporter: Yiqun Lin
>Assignee: Yiqun Lin
>Priority: Major
> Attachments: HDFS-11398-reproduce.patch, HDFS-11398.001.patch, 
> HDFS-11398.002.patch, failure.log
>
>
> The test {{TestDataNodeVolumeFailure#testUnderReplicationAfterVolFailure}} 
> still fails intermittently in trunk after HDFS-11316. The stack infos:
> {code}
> testUnderReplicationAfterVolFailure(org.apache.hadoop.hdfs.server.datanode.TestDataNodeVolumeFailure)
>   Time elapsed: 95.021 sec  <<< ERROR!
> java.util.concurrent.TimeoutException: Timed out waiting for condition. 
> Thread diagnostics:
> Timestamp: 2017-02-07 07:00:34,193
> 
> java.lang.Thread.State: RUNNABLE
> at org.apache.hadoop.net.unix.DomainSocketWatcher.doPoll0(Native 
> Method)
> at 
> org.apache.hadoop.net.unix.DomainSocketWatcher.access$900(DomainSocketWatcher.java:52)
> at 
> org.apache.hadoop.net.unix.DomainSocketWatcher$2.run(DomainSocketWatcher.java:511)
> at java.lang.Thread.run(Thread.java:745)
>   at 
> org.apache.hadoop.test.GenericTestUtils.waitFor(GenericTestUtils.java:276)
>   at 
> org.apache.hadoop.hdfs.server.datanode.TestDataNodeVolumeFailure.testUnderReplicationAfterVolFailure(TestDataNodeVolumeFailure.java:412)
> {code}
> I looked into this and found there is a chance that the value 
> {{UnderReplicatedBlocksCount}} will no longer be > 0. The following is my 
> analysis:
> In test {{TestDataNodeVolumeFailure.testUnderReplicationAfterVolFailure}}, it 
> uses creating file to trigger the disk error checking. The related codes:
> {code}
> Path file1 = new Path("/test1");
> DFSTestUtil.createFile(fs, file1, 1024, (short)3, 1L);
> DFSTestUtil.waitReplication(fs, file1, (short)3);
> // Fail the first volume on both datanodes
> File dn1Vol1 = new File(dataDir, "data"+(2*0+1));
> File dn2Vol1 = new File(dataDir, "data"+(2*1+1));
> DataNodeTestUtils.injectDataDirFailure(dn1Vol1, dn2Vol1);
> Path file2 = new Path("/test2");
> DFSTestUtil.createFile(fs, file2, 1024, (short)3, 1L);
> DFSTestUtil.waitReplication(fs, file2, (short)3);
> {code}
> This leads to one problem: if the cluster is busy, it can take a long time for 
> the replication of file2 to reach the desired value. During this time, the 
> under-replicated blocks of file1 can also be re-replicated in the cluster. If that 
> happens, the condition {{underReplicatedBlocks > 0}} will never be satisfied.
> And this can be reproduced in my local env.
> Actually, we can use an easier approach, {{DataNodeTestUtils.waitForDiskError}}, to 
> replace this; it runs fast and is more reliable.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-13639) SlotReleaser is not fast enough

2020-04-11 Thread Brahma Reddy Battula (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-13639?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Brahma Reddy Battula updated HDFS-13639:

Target Version/s: 3.4.0

Bulk update: moved all 3.3.0 non-blocker issues, please move back if it is a 
blocker.



> SlotReleaser is not fast enough
> ---
>
> Key: HDFS-13639
> URL: https://issues.apache.org/jira/browse/HDFS-13639
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: hdfs-client
>Affects Versions: 2.4.0, 2.6.0, 3.0.2
> Environment: 1. YCSB:
>  recordcount=20
>  fieldcount=1
>  fieldlength=1000
>  operationcount=1000
>  
>  workload=com.yahoo.ycsb.workloads.CoreWorkload
>  
>  table=ycsb-test
>  columnfamily=C
>  readproportion=1
>  updateproportion=0
>  insertproportion=0
>  scanproportion=0
>  
>  maxscanlength=0
>  requestdistribution=zipfian
>  
>  # default 
>  readallfields=true
>  writeallfields=true
>  scanlengthdistribution=constant
> 2. datanode:
> -Xmx2048m -Xms2048m -Xmn1024m -XX:MaxDirectMemorySize=1024m 
> -XX:MaxPermSize=256m -Xloggc:$run_dir/stdout/datanode_gc_${start_time}.log 
> -XX:+DisableExplicitGC -XX:+HeapDumpOnOutOfMemoryError 
> -XX:HeapDumpPath=$log_dir -XX:+PrintGCApplicationStoppedTime 
> -XX:+UseConcMarkSweepGC -XX:CMSInitiatingOccupancyFraction=80 
> -XX:+UseCMSInitiatingOccupancyOnly -XX:+CMSParallelRemarkEnabled 
> -XX:+CMSClassUnloadingEnabled -XX:CMSMaxAbortablePrecleanTime=1 
> -XX:+CMSScavengeBeforeRemark -XX:+PrintPromotionFailure 
> -XX:+CMSConcurrentMTEnabled -XX:+ExplicitGCInvokesConcurrent 
> -XX:+SafepointTimeout -XX:MonitorBound=16384 -XX:-UseBiasedLocking 
> -verbose:gc -XX:+PrintGCDetails -XX:+PrintHeapAtGC -XX:+PrintGCDateStamps
> 3. regionserver:
> -Xmx10g -Xms10g -XX:MaxDirectMemorySize=10g 
> -XX:MaxGCPauseMillis=150 -XX:MaxTenuringThreshold=2 
> -XX:+UnlockExperimentalVMOptions -XX:G1NewSizePercent=5 
> -Xloggc:$run_dir/stdout/regionserver_gc_${start_time}.log -Xss256k 
> -XX:+HeapDumpOnOutOfMemoryError -XX:HeapDumpPath=$log_dir -verbose:gc 
> -XX:+PrintGC -XX:+PrintGCDetails -XX:+PrintGCApplicationStoppedTime 
> -XX:+PrintHeapAtGC -XX:+PrintGCDateStamps -XX:+PrintAdaptiveSizePolicy 
> -XX:+PrintTenuringDistribution -XX:+PrintSafepointStatistics 
> -XX:PrintSafepointStatisticsCount=1 -XX:PrintFLSStatistics=1 
> -XX:+UseGCLogFileRotation -XX:NumberOfGCLogFiles=100 -XX:GCLogFileSize=128m 
> -XX:+SafepointTimeout -XX:MonitorBound=16384 -XX:-UseBiasedLocking 
> -XX:+UseG1GC -XX:InitiatingHeapOccupancyPercent=65 
> -XX:+ParallelRefProcEnabled -XX:ConcGCThreads=4 -XX:ParallelGCThreads=16 
> -XX:G1HeapRegionSize=32m -XX:G1MixedGCCountTarget=64 
> -XX:G1OldCSetRegionThresholdPercent=5{color}
> block cache is disabled:
>  hbase.bucketcache.size
>  0.9
>  
>Reporter: Gang Xie
>Assignee: Lisheng Sun
>Priority: Major
> Fix For: 3.3.0
>
> Attachments: HDFS-13639-2.4.diff, HDFS-13639.001.patch, 
> HDFS-13639.002.patch, ShortCircuitCache_new_slotReleaser.diff, 
> perf_after_improve_SlotReleaser.png, perf_before_improve_SlotReleaser.png
>
>
> When testing the performance of HDFS short-circuit reads with YCSB, we 
> found that the SlotReleaser of the ShortCircuitCache has a performance issue. 
> The problem is that the qps of slot releasing could only reach 1000+ 
> while the qps of slot allocating is ~3000. This means that the replica 
> info on the datanode could not be released in time, which causes a lot of GCs and 
> finally full GCs.
>  
> The flame graph shows that SlotReleaser spends a lot of time connecting to the 
> domain socket and throwing/catching exceptions when closing the domain 
> socket and its streams. It doesn't make any sense to do the connecting and 
> closing each time. Every time we connect to the domain socket, the Datanode 
> allocates a new thread to free the slot. There is a lot of initialization 
> work, and it's costly. We need to reuse the domain socket. 
>  
> After switch to reuse the domain socket(see diff attached), we get great 
> improvement(see the perf):
>  # Without reusing the domain socket, the YCSB get qps keeps getting worse 
> and worse, and after about 45 mins full GC starts. When we reuse the domain 
> socket, no full GC is found, the stress test finishes smoothly, and the 
> qps of allocating and releasing match.
>  # Due to the datanode young GC, without the improvement the YCSB get qps is 
> even smaller than with the improvement, ~3700 vs ~4200.
> The diff is against 2.4, and I think this issue exists up to the latest version. I 
> don't have a test env with 2.7 or higher versions. 
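To make the proposed change concrete, a minimal sketch of the reuse pattern (illustrative only, not the actual SlotReleaser code; {{sendReleaseRequest}} and {{domainSocketPath}} are assumed helpers/fields):

{code:java}
// keep one connected DomainSocket and only reconnect when a request fails,
// instead of connect/close on every released slot
private DomainSocket releaseSock;   // used only by the single releaser thread

private void releaseSlot(Slot slot) throws IOException {
  if (releaseSock == null) {
    releaseSock = DomainSocket.connect(domainSocketPath);  // connect once, then reuse
  }
  try {
    sendReleaseRequest(releaseSock, slot);                  // assumed helper
  } catch (IOException e) {
    try {
      releaseSock.close();                                  // drop the broken connection
    } catch (IOException ignored) {
    }
    releaseSock = null;                                     // the next call reconnects
    throw e;
  }
}
{code}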



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-

[jira] [Updated] (HDFS-14894) Add balancer parameter to balance top used nodes

2020-04-11 Thread Brahma Reddy Battula (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-14894?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Brahma Reddy Battula updated HDFS-14894:

Target Version/s: 3.4.0  (was: 3.3.0)

Bulk update: moved all 3.3.0 non-blocker issues, please move back if it is a 
blocker.



> Add balancer parameter to balance top used nodes
> 
>
> Key: HDFS-14894
> URL: https://issues.apache.org/jira/browse/HDFS-14894
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: balancer & mover
>Reporter: Leon Gao
>Assignee: Leon Gao
>Priority: Major
> Attachments: HDFS-14894.001.patch
>
>
> We sometimes see a few of our datanodes reach very high usage (due to various 
> reasons) and we need to reduce their usage in an urgent situation.
> We see two ways to achieve it currently:
> - Calculate and reset the balancing threshold.
> - Pick nodes manually according to usage stats, put them in a file and use the 
> `-resource` flag.
> However, both of them are either not very intuitive or require too much manual work 
> in an urgent close-to-outage situation. Adding a small feature to automatically 
> pick top used hosts would be a straightforward option, for example 
> `-sourceThreshold 95` to only target datanodes with >95% usage. 
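For illustration, a sketch of how the proposed option might be invoked ({{-sourceThreshold}} is the flag suggested above and not an existing balancer option; {{-threshold}} already exists):

{noformat}
# today: balance everything that deviates more than 10% from average utilization
hdfs balancer -threshold 10
# proposed: additionally restrict the source set to datanodes above 95% usage
hdfs balancer -threshold 10 -sourceThreshold 95
{noformat}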



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-14718) HttpFS: Sort response by key names as WebHDFS does

2020-04-11 Thread Brahma Reddy Battula (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-14718?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Brahma Reddy Battula updated HDFS-14718:

Target Version/s: 3.4.0  (was: 3.3.0)

Bulk update: moved all 3.3.0 non-blocker issues, please move back if it is a 
blocker.



> HttpFS: Sort response by key names as WebHDFS does
> --
>
> Key: HDFS-14718
> URL: https://issues.apache.org/jira/browse/HDFS-14718
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: httpfs
>Reporter: Siyao Meng
>Assignee: Siyao Meng
>Priority: Major
>
> *Example*
> See description of HDFS-14665 for an example of LISTSTATUS.
> *Analysis*
> WebHDFS is [using a 
> TreeMap|https://github.com/apache/hadoop/blob/99bf1dc9eb18f9b4d0338986d1b8fd2232f1232f/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/web/JsonUtil.java#L120]
>  to serialize HdfsFileStatus, while HttpFS [uses a 
> LinkedHashMap|https://github.com/apache/hadoop/blob/6fcc5639ae32efa5a5d55a6b6cf23af06fc610c3/hadoop-hdfs-project/hadoop-hdfs-httpfs/src/main/java/org/apache/hadoop/fs/http/server/FSOperations.java#L107]
>  to serialize FileStatus.
> *Questions*
> Why the difference? Is this intentional?
> - I looked into the Git history. It seems it's simply because WebHDFS uses 
> TreeMap from the beginning; and HttpFS uses LinkedHashMap from the beginning. 
> It is not only limited to LISTSTATUS, but ALL other request's JSON 
> serialization.
> Now the real question: Could/Should we replace ALL LinkedHashMap into TreeMap 
> in HttpFS serialization in FSOperations class?
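For illustration, a tiny standalone sketch (plain JDK, not HttpFS code) of the difference the two map types make to key ordering, which is all the JSON output difference comes down to:

{code:java}
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.TreeMap;

public class KeyOrderDemo {
  public static void main(String[] args) {
    Map<String, Object> linked = new LinkedHashMap<>();
    linked.put("pathSuffix", "f1");
    linked.put("accessTime", 0L);
    System.out.println(linked.keySet());   // [pathSuffix, accessTime] -- insertion order

    Map<String, Object> sorted = new TreeMap<>(linked);
    System.out.println(sorted.keySet());   // [accessTime, pathSuffix] -- sorted by key
  }
}
{code}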



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-7368) Support HDFS specific 'shell' on command 'hdfs dfs' invocation

2020-04-11 Thread Brahma Reddy Battula (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-7368?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Brahma Reddy Battula updated HDFS-7368:
---
Target Version/s: 3.4.0  (was: 3.3.0)

Bulk update: moved all 3.3.0 non-blocker issues, please move back if it is a 
blocker.



> Support HDFS specific 'shell' on command 'hdfs dfs' invocation
> --
>
> Key: HDFS-7368
> URL: https://issues.apache.org/jira/browse/HDFS-7368
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: hdfs-client
>Reporter: Vinayakumar B
>Assignee: Vinayakumar B
>Priority: Major
> Attachments: HDFS-7368-001.patch
>
>
> * *hadoop fs* is the generic implementation for all filesystem 
> implementations, but some of the operations are supported only in some 
> filesystems. Ex: snapshot commands, acl commands, xattr commands.
> * *hdfs dfs* is recommended in all hdfs related docs in current releases.
> In the current code, both *hdfs shell* and *hadoop fs* point to the hadoop-common 
> implementation of FSShell.
> It would be better to have an HDFS-specific extension of FSShell which includes 
> HDFS-only commands in the future.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-11392) FSPermissionChecker#checkSubAccess should support inodeattribute provider

2020-04-11 Thread Brahma Reddy Battula (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-11392?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Brahma Reddy Battula updated HDFS-11392:

Target Version/s: 3.4.0  (was: 3.3.0)

Bulk update: moved all 3.3.0 non-blocker issues, please move back if it is a 
blocker.



> FSPermissionChecker#checkSubAccess should support inodeattribute provider
> -
>
> Key: HDFS-11392
> URL: https://issues.apache.org/jira/browse/HDFS-11392
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: namenode
>Affects Versions: 2.7.0
>Reporter: John Zhuge
>Priority: Minor
>
> HDFS-6826 added this TODO in {{FSPermissionChecker#checkSubAccess}}:
> {code:title=FSPermissionChecker#checkSubAccess}
> //TODO have to figure this out with inodeattribute provider
> INodeAttributes inodeAttr =
> getINodeAttrs(components, pathIdx, d, snapshotId);
> {code}
> If an inode attribute provider is in play, it always incorrectly returns the root 
> attr of the subtree even when it descends multiple levels down the subtree, 
> because the components array is for the root.
> {code:title=FSPermissionChecker#getINodeAttrs}
>   private INodeAttributes getINodeAttrs(byte[][] pathByNameArr, int pathIdx,
>   INode inode, int snapshotId) {
> INodeAttributes inodeAttrs = inode.getSnapshotINode(snapshotId);
> if (getAttributesProvider() != null) {
>   String[] elements = new String[pathIdx + 1];
>   for (int i = 0; i < elements.length; i++) {
> elements[i] = DFSUtil.bytes2String(pathByNameArr[i]);
>   }
>   inodeAttrs = getAttributesProvider().getAttributes(elements, 
> inodeAttrs);
> }
> return inodeAttrs;
>   }
> {code}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-11381) PreCommit TestDataNodeOutlierDetectionViaMetrics failure

2020-04-11 Thread Brahma Reddy Battula (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-11381?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Brahma Reddy Battula updated HDFS-11381:

Target Version/s: 3.4.0  (was: 3.3.0)

Bulk update: moved all 3.3.0 non-blocker issues, please move back if it is a 
blocker.



> PreCommit TestDataNodeOutlierDetectionViaMetrics failure
> 
>
> Key: HDFS-11381
> URL: https://issues.apache.org/jira/browse/HDFS-11381
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: test
>Affects Versions: 3.0.0-alpha4
>Reporter: John Zhuge
>Assignee: Arpit Agarwal
>Priority: Minor
>
> https://builds.apache.org/job/PreCommit-HDFS-Build/18285/artifact/patchprocess/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt
> {noformat}
> Tests run: 2, Failures: 1, Errors: 0, Skipped: 0, Time elapsed: 0.457 sec <<< 
> FAILURE! - in 
> org.apache.hadoop.hdfs.server.datanode.metrics.TestDataNodeOutlierDetectionViaMetrics
> testOutlierIsDetected(org.apache.hadoop.hdfs.server.datanode.metrics.TestDataNodeOutlierDetectionViaMetrics)
>   Time elapsed: 0.27 sec  <<< FAILURE!
> java.lang.AssertionError: 
> Expected: is <1>
>  but: was <0>
>   at org.hamcrest.MatcherAssert.assertThat(MatcherAssert.java:20)
>   at org.junit.Assert.assertThat(Assert.java:865)
>   at org.junit.Assert.assertThat(Assert.java:832)
>   at 
> org.apache.hadoop.hdfs.server.datanode.metrics.TestDataNodeOutlierDetectionViaMetrics.testOutlierIsDetected(TestDataNodeOutlierDetectionViaMetrics.java:85)
> {noformat}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-11039) Expose more configuration properties to hdfs-default.xml

2020-04-11 Thread Brahma Reddy Battula (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-11039?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Brahma Reddy Battula updated HDFS-11039:

Target Version/s: 3.4.0  (was: 3.3.0)

Bulk update: moved all 3.3.0 non-blocker issues, please move back if it is a 
blocker.



> Expose more configuration properties to hdfs-default.xml
> 
>
> Key: HDFS-11039
> URL: https://issues.apache.org/jira/browse/HDFS-11039
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: documentation, newbie
>Reporter: Yi Liu
>Assignee: Jennica Pounds
>Priority: Minor
>
> There are some configuration properties for HDFS that have not been exposed 
> in hdfs-default.xml.
> It would be convenient for Hadoop users/admins if we added them to hdfs-default.xml.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-11485) HttpFS should warn about weak SSL ciphers

2020-04-11 Thread Brahma Reddy Battula (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-11485?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Brahma Reddy Battula updated HDFS-11485:

Target Version/s: 3.4.0  (was: 3.3.0)

Bulk update: moved all 3.3.0 non-blocker issues, please move back if it is a 
blocker.



> HttpFS should warn about weak SSL ciphers
> -
>
> Key: HDFS-11485
> URL: https://issues.apache.org/jira/browse/HDFS-11485
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: httpfs
>Affects Versions: 2.9.0
>Reporter: John Zhuge
>Assignee: John Zhuge
>Priority: Minor
>
> HDFS-11418 sets a list of default ciphers that contain a few weak ciphers in 
> order to maintain backwards compatibility. In addition, users can select weak 
> ciphers via the env variable {{HTTPFS_SSL_CIPHERS}}. It'd be nice to get warnings 
> about the weak ciphers.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-9821) HDFS configuration should accept friendly time units

2020-04-11 Thread Brahma Reddy Battula (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-9821?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Brahma Reddy Battula updated HDFS-9821:
---
Target Version/s: 3.4.0  (was: 3.3.0)

Bulk update: moved all 3.3.0 non-blocker issues, please move back if it is a 
blocker.



> HDFS configuration should accept friendly time units
> 
>
> Key: HDFS-9821
> URL: https://issues.apache.org/jira/browse/HDFS-9821
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: datanode, namenode
>Affects Versions: 2.8.0
>Reporter: Arpit Agarwal
>Assignee: Xiaobing Zhou
>Priority: Major
>
> HDFS configuration keys that define time intervals use units inconsistently 
> (Hours, seconds, milliseconds).
> Not all keys have the unit as part of their name. Related keys may use 
> different units e.g. {{dfs.blockreport.intervalMsec}} accepts msec while 
> {{dfs.blockreport.initialDelay}} accepts seconds. Milliseconds is rarely 
> useful as a time unit which makes these values hard to parse when reading 
> config files.
> We can either
> # Let existing keys use friendly units e.g. 100ms, 60s, 5m, 1d, 6w etc. This 
> can be done compatibly since there will be no conflict with existing valid 
> configuration. If no suffix is specified just default to the current time 
> unit.
> # Just deprecate the existing keys and define new ones that accept friendly 
> units.
> We continue to use fine-grained time units (usually ms) internally in code 
> and also accept "ms" option for tests.
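As a sketch of option 1 (assuming the usual {{Configuration#getTimeDuration}} handling of suffixes such as ms, s, m, h, d; the key and default below are just the block report example from above):

{code:java}
Configuration conf = new Configuration();
// a suffixed value instead of a raw millisecond count
conf.set("dfs.blockreport.intervalMsec", "6h");
// falls back to the key's current unit (milliseconds) when no suffix is given
long intervalMs = conf.getTimeDuration(
    "dfs.blockreport.intervalMsec", 21600000L, TimeUnit.MILLISECONDS);
{code}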



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-14788) Use dynamic regex filter to ignore copy of source files in Distcp

2020-04-11 Thread Brahma Reddy Battula (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-14788?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Brahma Reddy Battula updated HDFS-14788:


Bulk update: moved all 3.3.0 non-blocker issues, please move back if it is a 
blocker.



> Use dynamic regex filter to ignore copy of source files in Distcp
> -
>
> Key: HDFS-14788
> URL: https://issues.apache.org/jira/browse/HDFS-14788
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: distcp
>Affects Versions: 3.2.1
>Reporter: Mukund Thakur
>Assignee: Mukund Thakur
>Priority: Major
> Fix For: 3.3.0
>
>
> There is a feature in Distcp where we can exclude specific files from being copied 
> to the destination. This is currently based on a filter regex which is read 
> from a specific file. The process of creating a different regex file for 
> different distcp jobs seems like a tedious task. What we are proposing is to 
> expose a regex_filter parameter which can be set during Distcp job creation 
> and use this filter in a new CopyFilter implementation class. 
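A sketch of the difference in usage (the {{-filters}} option exists today; the property in the second command is only illustrative of the proposal, not a real DistCp option):

{noformat}
# today: regexes live in a file that has to be created and passed per job
hadoop distcp -filters /path/to/filters.txt hdfs://src/dir hdfs://dst/dir
# proposed: pass the regex directly when creating the job (property name hypothetical)
hadoop distcp -Ddistcp.exclude-file-regex=".*\.tmp$" hdfs://src/dir hdfs://dst/dir
{noformat}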



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-11203) Rename support during re-encrypt EDEK

2020-04-11 Thread Brahma Reddy Battula (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-11203?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Brahma Reddy Battula updated HDFS-11203:

Target Version/s: 3.4.0  (was: 3.3.0)

Bulk update: moved all 3.3.0 non-blocker issues, please move back if it is a 
blocker.



> Rename support during re-encrypt EDEK
> -
>
> Key: HDFS-11203
> URL: https://issues.apache.org/jira/browse/HDFS-11203
> Project: Hadoop HDFS
>  Issue Type: Task
>  Components: encryption
>Reporter: Xiao Chen
>Assignee: Xiao Chen
>Priority: Major
>
> Currently HDFS-10899 disables renames within the EZ if it's under 
> re-encryption. (similar to current cross-zone rename checks).
> We'd like to support rename in the long run, so cluster is fully functioning 
> during re-encryption.
> The reason rename is particularly difficult is:
> - We want to re-encrypt all files under an EZ in one pass, without missing any
> - We want to iterate through the files and keep track of where we are (i.e. a 
> cursor), so in case of NN failover/crash, we can resume from fsimage/edits.
> - We cannot guarantee the namespace is not changed during re-encryption. Newly 
> created files automatically have a new EDEK, and we don't care about deleted 
> files. But if a file is renamed from behind the cursor to before it, it may be 
> missed in the re-encryption.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-7702) Move metadata across namenode - Effort to a real distributed namenode

2020-04-11 Thread Brahma Reddy Battula (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-7702?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Brahma Reddy Battula updated HDFS-7702:
---
Target Version/s: 3.4.0  (was: 3.3.0)

Bulk update: moved all 3.3.0 non-blocker issues, please move back if it is a 
blocker.



> Move metadata across namenode - Effort to a real distributed namenode
> -
>
> Key: HDFS-7702
> URL: https://issues.apache.org/jira/browse/HDFS-7702
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Reporter: Ray Zhang
>Assignee: Ray Zhang
>Priority: Major
> Attachments: Namespace Moving Tool Design Proposal.pdf
>
>
> Implement a tool that can show the in-memory namespace tree structure with 
> weight (size) and an API that can move metadata across different namenodes. The 
> purpose is to move data efficiently and faster, without moving blocks on the 
> datanodes.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-10854) Remove createStripedFile and addBlockToFile by creating real EC files

2020-04-11 Thread Brahma Reddy Battula (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-10854?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Brahma Reddy Battula updated HDFS-10854:

Target Version/s: 3.4.0  (was: 3.3.0)

Bulk update: moved all 3.3.0 non-blocker issues, please move back if it is a 
blocker.



> Remove createStripedFile and addBlockToFile by creating real EC files
> -
>
> Key: HDFS-10854
> URL: https://issues.apache.org/jira/browse/HDFS-10854
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: erasure-coding, test
>Affects Versions: 3.0.0-alpha2
>Reporter: Zhe Zhang
>Assignee: Sammi Chen
>Priority: Major
>
> {{DFSTestUtil#createStripedFile}} and {{addBlockToFile}} were developed 
> before we completed the EC client. They were used to test the {{NameNode}} EC 
> logic when the client was unable to really create/read/write EC files.
> They are causing confusion in other issues about the {{NameNode}}. For example, 
> in one of the patches under {{HDFS-10301}}, 
> {{testProcessOverReplicatedAndMissingStripedBlock}} fails because in the test 
> we fake a block report from a DN, with a randomly generated storage ID. The 
> DN itself is never aware of that storage. This is not possible in a real 
> production environment.
> {code}
>   DatanodeStorage storage = new 
> DatanodeStorage(UUID.randomUUID().toString());
>   StorageReceivedDeletedBlocks[] reports = DFSTestUtil
>   .makeReportForReceivedBlock(block,
>   ReceivedDeletedBlockInfo.BlockStatus.RECEIVING_BLOCK, storage);
>   for (StorageReceivedDeletedBlocks report : reports) {
> ns.processIncrementalBlockReport(dn.getDatanodeId(), report);
>   }
> {code}
> Now that we have a fully functional EC client, we should remove the old 
> testing logic and use similar logic as non-EC tests (creating real files and 
> emulate blocks missing / being corrupt).
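A minimal sketch of the suggested direction (test code, illustrative only; the policy name and file length are arbitrary examples):

{code:java}
DistributedFileSystem fs = cluster.getFileSystem();
Path dir = new Path("/ec");
fs.mkdirs(dir);
fs.setErasureCodingPolicy(dir, "RS-6-3-1024k");            // make the directory striped
Path file = new Path(dir, "real-ec-file");
DFSTestUtil.createFile(fs, file, 6 * 1024 * 1024, (short) 1, 0L);  // a real EC file
// then corrupt or delete real block files on the datanodes, as the non-EC tests do,
// instead of faking block reports for storages the DN never had
{code}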



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-3570) Balancer shouldn't rely on "DFS Space Used %" as that ignores non-DFS used space

2020-04-11 Thread Brahma Reddy Battula (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-3570?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Brahma Reddy Battula updated HDFS-3570:
---
Target Version/s: 3.4.0  (was: 3.3.0)

Bulk update: moved all 3.3.0 non-blocker issues, please move back if it is a 
blocker.



> Balancer shouldn't rely on "DFS Space Used %" as that ignores non-DFS used 
> space
> 
>
> Key: HDFS-3570
> URL: https://issues.apache.org/jira/browse/HDFS-3570
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: balancer & mover
>Affects Versions: 2.0.0-alpha
>Reporter: Harsh J
>Assignee: Akira Ajisaka
>Priority: Minor
> Attachments: HDFS-3570.003.patch, HDFS-3570.2.patch, 
> HDFS-3570.aash.1.patch
>
>
> Report from a user here: 
> https://groups.google.com/a/cloudera.org/d/msg/cdh-user/pIhNyDVxdVY/b7ENZmEvBjIJ,
>  post archived at http://pastebin.com/eVFkk0A0
> This user had a specific DN that had a large non-DFS usage among 
> dfs.data.dirs, and very little DFS usage (which is computed against total 
> possible capacity). 
> Balancer apparently only looks at the usage, and fails to consider that 
> non-DFS usage may also be high on a DN/cluster. Hence, it thinks that if a 
> DFS usage report from a DN is only 8%, the DN has a lot of free space to write 
> more blocks, when that isn't true as shown by the case of this user. It went 
> on scheduling writes to the DN to balance it out, but the DN simply can't 
> accept any more blocks as a result of its disks' state.
> I think it would be better if we _computed_ the actual utilization based on 
> {{(capacity - actual remaining space)/(capacity)}}, as opposed to the current 
> {{(dfs used)/(capacity)}}. Thoughts?
> This isn't very critical, however, because it is very rare to see DN space 
> being used for non-DN data, but it does expose a valid bug.
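To make the proposal concrete, a small sketch of the two computations (plain arithmetic, not Balancer code; {{capacity}}, {{remaining}} and {{dfsUsed}} stand for the values a DN reports):

{code:java}
// what the balancer effectively uses today: only DFS-used counts
double currentUtil  = (double) dfsUsed / capacity;
// proposed: base utilization on what is actually left, so non-DFS usage counts too
double proposedUtil = (double) (capacity - remaining) / capacity;
{code}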



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-4319) FSShell copyToLocal creates files with the executable bit set

2020-04-11 Thread Brahma Reddy Battula (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-4319?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Brahma Reddy Battula updated HDFS-4319:
---
Target Version/s: 3.4.0  (was: 3.3.0)

Bulk update: moved all 3.3.0 non-blocker issues, please move back if it is a 
blocker.



> FSShell copyToLocal creates files with the executable bit set
> -
>
> Key: HDFS-4319
> URL: https://issues.apache.org/jira/browse/HDFS-4319
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: namenode
>Affects Versions: 2.0.3-alpha
>Reporter: Colin McCabe
>Priority: Minor
>
> With the default value of {{fs.permissions.umask-mode}}, {{022}},  {{FSShell 
> copyToLocal}} creates files with the executable bit set.
> If, on the other hand, you change {{fs.permissions.umask-mode}} to something 
> like {{133}}, you encounter a different problem.  When you use 
> {{copyToLocal}} to create directories, they don't have the executable bit 
> set, meaning they do not have search permission.
> Since HDFS doesn't allow the executable bit to be set on files, it seems 
> illogical to add it in when using {{copyToLocal}}.  This is also a 
> regression, since branch 1 did not have this problem.
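For illustration, the permission arithmetic behind the two symptoms (assuming the umask is applied to a base mode of 0666 for files and 0777 for directories):

{noformat}
expected for a file with umask 022:    0666 & ~0022 = 0644 (rw-r--r--)
what copyToLocal produces instead:     0777 & ~0022 = 0755 (rwxr-xr-x)  <- executable bit set
directory created with umask 133:      0777 & ~0133 = 0644 (rw-r--r--)  <- no search permission
{noformat}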



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-12308) Erasure Coding: Provide DistributedFileSystem & DFSClient API to return the effective EC policy on a directory or file, including the replication policy

2020-04-11 Thread Brahma Reddy Battula (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-12308?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Brahma Reddy Battula updated HDFS-12308:

Target Version/s: 3.4.0  (was: 3.3.0)

Bulk update: moved all 3.3.0 non-blocker issues, please move back if it is a 
blocker.



> Erasure Coding: Provide DistributedFileSystem &  DFSClient API to return the 
> effective EC policy on a directory or file, including the replication policy
> -
>
> Key: HDFS-12308
> URL: https://issues.apache.org/jira/browse/HDFS-12308
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: erasure-coding
>Reporter: Sammi Chen
>Assignee: chencan
>Priority: Major
>  Labels: hdfs-ec-3.0-nice-to-have
> Attachments: HADOOP-12308.002.patch, HADOOP-12308.patch, 
> HDFS-12308.003.patch, HDFS-12308.004.patch
>
>
> Provide DistributedFileSystem & DFSClient API to return the effective EC 
> policy on a directory or file, including the replication policy. The API names 
> would be something like {{getNominalErasureCodingPolicy(PATH)}} and 
> {{getAllNominalErasureCodingPolicies}}.
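A hypothetical sketch of what the additions could look like (names follow the description above; this is not an existing API):

{code:java}
public interface NominalErasureCodingPolicies {
  /** Effective policy for a path: the nearest ancestor's EC policy, or the
   *  replication "policy" when no EC policy applies. */
  ErasureCodingPolicy getNominalErasureCodingPolicy(Path path) throws IOException;

  /** All policies a path could resolve to, including the replication policy. */
  Collection<ErasureCodingPolicy> getAllNominalErasureCodingPolicies() throws IOException;
}
{code}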



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-4389) Non-HA DFSClients do not attempt reconnects

2020-04-11 Thread Brahma Reddy Battula (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-4389?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Brahma Reddy Battula updated HDFS-4389:
---
Target Version/s: 3.4.0  (was: 3.3.0)

Bulk update: moved all 3.3.0 non-blocker issues, please move back if it is a 
blocker.



> Non-HA DFSClients do not attempt reconnects
> ---
>
> Key: HDFS-4389
> URL: https://issues.apache.org/jira/browse/HDFS-4389
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: ha, hdfs-client
>Affects Versions: 2.0.0-alpha, 3.0.0-alpha1
>Reporter: Daryn Sharp
>Priority: Major
>
> The HA retry policy implementation appears to have broken non-HA 
> {{DFSClient}} connect retries.  The ipc 
> {{Client.Connection#handleConnectionFailure}} used to perform 45 connection 
> attempts, but now it consults a retry policy.  For non-HA proxies, the policy 
> does not handle {{ConnectException}}.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-7902) Expose truncate API for libwebhdfs

2020-04-11 Thread Brahma Reddy Battula (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-7902?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Brahma Reddy Battula updated HDFS-7902:
---
Target Version/s: 3.4.0  (was: 3.3.0)

Bulk update: moved all 3.3.0 non-blocker issues, please move back if it is a 
blocker.



> Expose truncate API for libwebhdfs
> --
>
> Key: HDFS-7902
> URL: https://issues.apache.org/jira/browse/HDFS-7902
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: native, webhdfs
>Affects Versions: 2.7.0
>Reporter: Yi Liu
>Assignee: Yi Liu
>Priority: Major
> Attachments: HDFS-7902.001.patch, HDFS-7902.002.patch
>
>
> As Colin suggested in HDFS-7838, we will add truncate support for libwebhdfs.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-8538) Change the default volume choosing policy to AvailableSpaceVolumeChoosingPolicy

2020-04-11 Thread Brahma Reddy Battula (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-8538?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Brahma Reddy Battula updated HDFS-8538:
---
Target Version/s: 3.4.0  (was: 3.3.0)

Bulk update: moved all 3.3.0 non-blocker issues, please move back if it is a 
blocker.



> Change the default volume choosing policy to 
> AvailableSpaceVolumeChoosingPolicy
> ---
>
> Key: HDFS-8538
> URL: https://issues.apache.org/jira/browse/HDFS-8538
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Affects Versions: 2.7.0
>Reporter: Andrew Wang
>Assignee: Andrew Wang
>Priority: Major
> Attachments: hdfs-8538.001.patch
>
>
> Datanodes with different sized disks almost always want the 
> available space policy. Users with homogeneous disks are unaffected.
> Since this code has baked for a while, let's change it to be the default.
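For reference, this is the setting in question; the snippet below shows how the available-space policy is selected explicitly today (the proposal is simply to make this value the default):

{code:xml}
<property>
  <name>dfs.datanode.fsdataset.volume.choosing.policy</name>
  <value>org.apache.hadoop.hdfs.server.datanode.fsdataset.AvailableSpaceVolumeChoosingPolicy</value>
</property>
{code}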



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-12859) Admin command resetBalancerBandwidth

2020-04-11 Thread Brahma Reddy Battula (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-12859?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Brahma Reddy Battula updated HDFS-12859:

Target Version/s: 3.4.0  (was: 3.3.0)

Bulk update: moved all 3.3.0 non-blocker issues, please move back if it is a 
blocker.



> Admin command resetBalancerBandwidth
> 
>
> Key: HDFS-12859
> URL: https://issues.apache.org/jira/browse/HDFS-12859
> Project: Hadoop HDFS
>  Issue Type: New Feature
>  Components: balancer & mover
>Reporter: Jianfei Jiang
>Assignee: Jianfei Jiang
>Priority: Major
> Attachments: 
> 0003-HDFS-12859-Admin-command-resetBalancerBandwidth.patch, 
> 0004-HDFS-12859-Admin-command-resetBalancerBandwidth.patch, HDFS-12859.patch
>
>
> We can already set the balancer bandwidth dynamically using the 
> setBalancerBandwidth command. The value set this way is not persistent and is not 
> stored in the configuration file. Different datanodes could have different default 
> or former settings in their configuration.
> We wanted to develop a scheduled balancer task which runs at midnight 
> every day. We set a larger bandwidth for it and hope to reset the value after 
> finishing. However, we found it difficult to reset the different settings for 
> different datanodes, as the setBalancerBandwidth command can only set the same 
> value on all datanodes. If we want to use a unique setting for every datanode, 
> we have to reset the datanodes.
> So it would be useful to have a command to synchronize the setting with the 
> configuration file. 
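A sketch of the intended usage ({{-setBalancerBandwidth}} exists today; {{-resetBalancerBandwidth}} is the command proposed here, not an existing option):

{noformat}
# existing: push one value to every datanode (bytes per second)
hdfs dfsadmin -setBalancerBandwidth 104857600
# proposed: have each datanode go back to its own configured
# dfs.datanode.balance.bandwidthPerSec value
hdfs dfsadmin -resetBalancerBandwidth
{noformat}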



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-12243) Trash emptier should use Time.monotonicNow()

2020-04-11 Thread Brahma Reddy Battula (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-12243?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Brahma Reddy Battula updated HDFS-12243:

Target Version/s: 3.4.0  (was: 3.3.0)

Bulk update: moved all 3.3.0 non-blocker issues, please move back if it is a 
blocker.



> Trash emptier should use Time.monotonicNow()
> 
>
> Key: HDFS-12243
> URL: https://issues.apache.org/jira/browse/HDFS-12243
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: fs
>Reporter: Wei-Chiu Chuang
>Assignee: Lisheng Sun
>Priority: Minor
> Attachments: HDFS-12243.001.patch
>
>




--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-10529) Df reports incorrect usage when appending less than block size

2020-04-11 Thread Brahma Reddy Battula (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-10529?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Brahma Reddy Battula updated HDFS-10529:

Target Version/s: 3.4.0  (was: 3.3.0)

Bulk update: moved all 3.3.0 non-blocker issues, please move back if it is a 
blocker.



> Df reports incorrect usage when appending less than block size
> --
>
> Key: HDFS-10529
> URL: https://issues.apache.org/jira/browse/HDFS-10529
> Project: Hadoop HDFS
>  Issue Type: Bug
>Affects Versions: 2.7.2, 3.0.0-alpha1
>Reporter: Pranav Prakash
>Assignee: Pranav Prakash
>Priority: Minor
>  Labels: datanode, fs, hdfs
> Attachments: HDFS-10529.000.patch
>
>
> Steps to recreate issue:
> 1. Create a 100MB file on HDFS cluster with 128MB blocksize and replication 
> factor 3
> 2. Append 100MB to the file
> 3. Df reports around 900MB even though it should only be around 600MB.
> Looking at the blocks confirms that df is incorrect, as there exist only two 
> blocks on each DN -- a 128MB block and a 72MB block.
> This issue seems to arise because BlockPoolSlice does not account for the 
> delta increase in dfsUsage when an append happens to a partially-filled 
> block, and instead naively adds the total block size. For instance, in the 
> example scenario when the block is "filled" from 100 to 128MB, 
> addFinalizedBlock() in BlockPoolSlice adds the size of the newly created 
> block into the total instead of accounting for the difference/delta in block 
> size between old and new.  This has the effect of double-counting the old 
> partially-filled block: it is counted once when it is first created (in the 
> example scenario when the 100MB file is created) and again when it becomes 
> part of the filled block (in the example scenario when the 128MB block is 
> formed from the initial 100MB block). Thus the perceived size becomes 100MB + 
> 128MB + 72 = 300 MB for each DN, or 900MB across the cluster.
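A minimal sketch of the delta accounting described above (hypothetical names, not the actual {{BlockPoolSlice}} code):

{code:java}
// on finalizing an appended block, add only the growth, not the full block length
long delta = finalizedBlockBytes - previouslyAccountedBytes;  // e.g. 128MB - 100MB = 28MB
incDfsUsed(delta);   // instead of incDfsUsed(finalizedBlockBytes)
{code}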



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-8115) Make PermissionStatusFormat public

2020-04-11 Thread Brahma Reddy Battula (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-8115?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Brahma Reddy Battula updated HDFS-8115:
---
Target Version/s: 3.4.0  (was: 3.3.0)

Bulk update: moved all 3.3.0 non-blocker issues, please move back if it is a 
blocker.



> Make PermissionStatusFormat public
> --
>
> Key: HDFS-8115
> URL: https://issues.apache.org/jira/browse/HDFS-8115
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: Arun Suresh
>Assignee: Arun Suresh
>Priority: Minor
> Attachments: HDFS-8115.1.patch
>
>
> implementations of {{INodeAttributeProvider}} are required to provide an 
> implementation of {{getPermissionLong()}} method. Unfortunately, the long 
> permission format is an encoding of the user, group and mode with each field 
> converted to int using {{SerialNumberManager}} which is package protected.
> Thus it would be nice to make the {{PermissionStatusFormat}} enum public (and 
> also make the {{toLong()}} static method public) so that user specified 
> implementations of {{INodeAttributeProvider}} may use it.
> This would also make it more consistent with {{AclStatusFormat}} which I 
> guess has been made public for the same reason.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-11423) Allow ReplicaInfo to be subclassed outside of package

2020-04-11 Thread Brahma Reddy Battula (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-11423?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Brahma Reddy Battula updated HDFS-11423:

Target Version/s: 3.4.0  (was: 3.3.0)

Bulk update: moved all 3.3.0 non-blocker issues, please move back if it is a 
blocker.



> Allow ReplicaInfo to be subclassed outside of package
> -
>
> Key: HDFS-11423
> URL: https://issues.apache.org/jira/browse/HDFS-11423
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: datanode
>Affects Versions: 3.0.0-alpha2
>Reporter: Joe Pallas
>Priority: Minor
> Attachments: HDFS-11423.001.patch
>
>
> The constructor for {{ReplicaInfo}} is package-scope instead of protected.  
> This means that an alternative dataset implementation in its own package 
> cannot create Replica classes that inherit from ReplicaInfo the way 
> {{LocalReplica}} does (or the way {{ProvidedReplica}} does in HDFS-9806).



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-12122) TestEditLogTailer is flaky

2020-04-11 Thread Brahma Reddy Battula (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-12122?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Brahma Reddy Battula updated HDFS-12122:

Target Version/s: 3.4.0  (was: 3.3.0)

Bulk update: moved all 3.3.0 non-blocker issues, please move back if it is a 
blocker.



> TestEditLogTailer is flaky
> --
>
> Key: HDFS-12122
> URL: https://issues.apache.org/jira/browse/HDFS-12122
> Project: Hadoop HDFS
>  Issue Type: Test
>  Components: namenode
>Affects Versions: 3.0.0-alpha4
>Reporter: John Zhuge
>Priority: Minor
>  Labels: flaky-test
>
> https://builds.apache.org/job/PreCommit-HDFS-Build/20225/artifact/patchprocess/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt
> {noformat}
> Tests run: 12, Failures: 5, Errors: 0, Skipped: 0, Time elapsed: 46.901 sec 
> <<< FAILURE! - in org.apache.hadoop.hdfs.server.namenode.ha.TestEditLogTailer
> testTriggersLogRollsForAllStandbyNN[0](org.apache.hadoop.hdfs.server.namenode.ha.TestEditLogTailer)
>   Time elapsed: 10.728 sec  <<< FAILURE!
> java.lang.AssertionError: Test resulted in an unexpected exit
>   at 
> org.apache.hadoop.hdfs.MiniDFSCluster.shutdown(MiniDFSCluster.java:1957)
>   at 
> org.apache.hadoop.hdfs.MiniDFSCluster.shutdown(MiniDFSCluster.java:1944)
>   at 
> org.apache.hadoop.hdfs.MiniDFSCluster.shutdown(MiniDFSCluster.java:1937)
>   at 
> org.apache.hadoop.hdfs.server.namenode.ha.TestEditLogTailer.testTriggersLogRollsForAllStandbyNN(TestEditLogTailer.java:245)
> testNN1TriggersLogRolls[1](org.apache.hadoop.hdfs.server.namenode.ha.TestEditLogTailer)
>   Time elapsed: 0.394 sec  <<< FAILURE!
> java.lang.AssertionError: Test resulted in an unexpected exit
>   at 
> org.apache.hadoop.hdfs.MiniDFSCluster.shutdown(MiniDFSCluster.java:1957)
>   at 
> org.apache.hadoop.hdfs.MiniDFSCluster.shutdown(MiniDFSCluster.java:1944)
>   at 
> org.apache.hadoop.hdfs.MiniDFSCluster.shutdown(MiniDFSCluster.java:1937)
>   at 
> org.apache.hadoop.hdfs.server.namenode.ha.TestEditLogTailer.testStandbyTriggersLogRolls(TestEditLogTailer.java:201)
>   at 
> org.apache.hadoop.hdfs.server.namenode.ha.TestEditLogTailer.testNN1TriggersLogRolls(TestEditLogTailer.java:154)
> testNN2TriggersLogRolls[1](org.apache.hadoop.hdfs.server.namenode.ha.TestEditLogTailer)
>   Time elapsed: 1.287 sec  <<< FAILURE!
> java.lang.AssertionError: Test resulted in an unexpected exit
>   at 
> org.apache.hadoop.hdfs.MiniDFSCluster.shutdown(MiniDFSCluster.java:1957)
>   at 
> org.apache.hadoop.hdfs.MiniDFSCluster.shutdown(MiniDFSCluster.java:1944)
>   at 
> org.apache.hadoop.hdfs.MiniDFSCluster.shutdown(MiniDFSCluster.java:1937)
>   at 
> org.apache.hadoop.hdfs.server.namenode.ha.TestEditLogTailer.testStandbyTriggersLogRolls(TestEditLogTailer.java:201)
>   at 
> org.apache.hadoop.hdfs.server.namenode.ha.TestEditLogTailer.testNN2TriggersLogRolls(TestEditLogTailer.java:159)
> testNN0TriggersLogRolls[1](org.apache.hadoop.hdfs.server.namenode.ha.TestEditLogTailer)
>   Time elapsed: 1.286 sec  <<< FAILURE!
> java.lang.AssertionError: Test resulted in an unexpected exit
>   at 
> org.apache.hadoop.hdfs.MiniDFSCluster.shutdown(MiniDFSCluster.java:1957)
>   at 
> org.apache.hadoop.hdfs.MiniDFSCluster.shutdown(MiniDFSCluster.java:1944)
>   at 
> org.apache.hadoop.hdfs.MiniDFSCluster.shutdown(MiniDFSCluster.java:1937)
>   at 
> org.apache.hadoop.hdfs.server.namenode.ha.TestEditLogTailer.testStandbyTriggersLogRolls(TestEditLogTailer.java:201)
>   at 
> org.apache.hadoop.hdfs.server.namenode.ha.TestEditLogTailer.testNN0TriggersLogRolls(TestEditLogTailer.java:149)
> testTriggersLogRollsForAllStandbyNN[1](org.apache.hadoop.hdfs.server.namenode.ha.TestEditLogTailer)
>   Time elapsed: 10.469 sec  <<< FAILURE!
> java.lang.AssertionError: Test resulted in an unexpected exit
>   at 
> org.apache.hadoop.hdfs.MiniDFSCluster.shutdown(MiniDFSCluster.java:1957)
>   at 
> org.apache.hadoop.hdfs.MiniDFSCluster.shutdown(MiniDFSCluster.java:1944)
>   at 
> org.apache.hadoop.hdfs.MiniDFSCluster.shutdown(MiniDFSCluster.java:1937)
>   at 
> org.apache.hadoop.hdfs.server.namenode.ha.TestEditLogTailer.testTriggersLogRollsForAllStandbyNN(TestEditLogTailer.java:245)
> {noformat}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-11455) Fix javac warnings in HDFS that caused by deprecated FileSystem APIs

2020-04-11 Thread Brahma Reddy Battula (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-11455?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Brahma Reddy Battula updated HDFS-11455:

Target Version/s: 3.4.0  (was: 3.3.0)

Bulk update: moved all 3.3.0 non-blocker issues, please move back if it is a 
blocker.



> Fix javac warnings in HDFS that caused by deprecated FileSystem APIs
> 
>
> Key: HDFS-11455
> URL: https://issues.apache.org/jira/browse/HDFS-11455
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: Yiqun Lin
>Assignee: Yiqun Lin
>Priority: Minor
> Attachments: HDFS-11455.001.patch, HDFS-11455.002.patch, 
> HDFS-11455.003.patch
>
>
> There are many javac warnings coming out after the FileSystem APIs which promote 
> inefficient call patterns were deprecated in HADOOP-13321. The relevant 
> warnings:
> {code}
> [WARNING] 
> /testptch/hadoop/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestQuota.java:[320,18]
>  [deprecation] isFile(Path) in FileSystem has been deprecated
> [WARNING] 
> /testptch/hadoop/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestQuota.java:[1409,18]
>  [deprecation] isFile(Path) in FileSystem has been deprecated
> [WARNING] 
> /testptch/hadoop/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/namenode/TestINodeFile.java:[778,19]
>  [deprecation] isDirectory(Path) in FileSystem has been deprecated
> [WARNING] 
> /testptch/hadoop/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/namenode/TestINodeFile.java:[787,20]
>  [deprecation] isDirectory(Path) in FileSystem has been deprecated
> [WARNING] 
> /testptch/hadoop/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/namenode/TestQuotaByStorageType.java:[834,18]
>  [deprecation] isFile(Path) in FileSystem has been 
> {code}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-15140) Replace FoldedTreeSet in Datanode with SortedSet or TreeMap

2020-04-11 Thread Brahma Reddy Battula (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-15140?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Brahma Reddy Battula updated HDFS-15140:

Target Version/s: 3.4.0  (was: 3.3.0)

Bulk update: moved all 3.3.0 non-blocker issues, please move back if it is a 
blocker.



> Replace FoldedTreeSet in Datanode with SortedSet or TreeMap
> ---
>
> Key: HDFS-15140
> URL: https://issues.apache.org/jira/browse/HDFS-15140
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: datanode
>Affects Versions: 3.3.0
>Reporter: Stephen O'Donnell
>Assignee: Stephen O'Donnell
>Priority: Major
> Attachments: HDFS-15140.001.patch, HDFS-15140.002.patch
>
>
> Based on the problems discussed in HDFS-15131, I would like to explore 
> replacing the FoldedTreeSet structure in the datanode with a builtin Java 
> equivalent - either SortedSet or TreeMap.
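
As a rough illustration of the direction (not the actual patch), a block map keyed by block id in a {{TreeMap}} keeps the sorted-iteration property while relying on a stock JDK structure. The class and field names below are assumptions for the sketch, and a generic {{Object}} stands in for the DataNode's replica type:

{code:java}
import java.util.Map;
import java.util.TreeMap;

// Minimal sketch: a sorted replica map keyed by block id, no custom tree needed.
class ReplicaMapSketch {
  private final TreeMap<Long, Object> replicas = new TreeMap<>();

  void add(long blockId, Object replica) {
    replicas.put(blockId, replica);      // O(log n), no periodic fold/compaction
  }

  Object get(long blockId) {
    return replicas.get(blockId);        // lookup by block id
  }

  Iterable<Map.Entry<Long, Object>> sorted() {
    return replicas.entrySet();          // iteration stays ordered by block id
  }
}
{code}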



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-11543) Test multiple erasure coding implementations

2020-04-11 Thread Brahma Reddy Battula (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-11543?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Brahma Reddy Battula updated HDFS-11543:

Target Version/s: 3.4.0  (was: 3.3.0)

Bulk update: moved all 3.3.0 non-blocker issues, please move back if it is a 
blocker.



> Test multiple erasure coding implementations
> 
>
> Key: HDFS-11543
> URL: https://issues.apache.org/jira/browse/HDFS-11543
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: erasure-coding
>Affects Versions: 3.0.0-alpha2
>Reporter: László Bence Nagy
>Priority: Minor
>  Labels: test
>
> Potentially, multiple native erasure coding plugins will be available to be 
> used from HDFS later on. These plugins should be tested as well. For example, 
> the *NativeRSRawErasureCoderFactory* class - which is used for instantiating 
> the native ISA-L plugin's encoder and decoder objects - is used in 5 test 
> files under the 
> *hadoop/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/*
>  directory. The files are:
> - *TestDFSStripedInputStream.java*
> - *TestDFSStripedOutputStream.java*
> - *TestDFSStripedOutputStreamWithFailure.java*
> - *TestReconstructStripedFile.java*
> - *TestUnsetAndChangeDirectoryEcPolicy.java*
> Other erasure coding plugins should also be tested in these cases in a clean 
> way (not, for example, by making a new file for every new erasure coding 
> plugin). For this purpose [parameterized 
> tests|https://github.com/junit-team/junit4/wiki/parameterized-tests] might be 
> used.
> This is also true for the 
> *hadoop/hadoop-common-project/hadoop-common/src/test/java/org/apache/hadoop/io/erasurecode/rawcoder/*
>  directory where this approach could be used for example for the 
> interoperability tests (when it is checked that certain erasure coding 
> implementations are compatible with each other by doing the encoding and 
> decoding operations with different plugins and verifying their results). The 
> plugin pairs which should be tested could be the parameters for the 
> parameterized tests.
> The parameterized test is just an idea; there can be other solutions as well.
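
A minimal JUnit 4 parameterized-test sketch of what that could look like. The test class name is hypothetical and the factory class names in the parameter list are only examples of plugins one might wire in:

{code:java}
import static org.junit.Assert.assertNotNull;

import java.util.Arrays;
import java.util.Collection;
import org.junit.Test;
import org.junit.runner.RunWith;
import org.junit.runners.Parameterized;
import org.junit.runners.Parameterized.Parameters;

// Sketch: run the same striped-read test once per raw coder factory class name.
@RunWith(Parameterized.class)
public class TestStripedReadWithCoders {
  private final String coderFactoryClassName;

  public TestStripedReadWithCoders(String coderFactoryClassName) {
    this.coderFactoryClassName = coderFactoryClassName;
  }

  @Parameters(name = "coder={0}")
  public static Collection<Object[]> coders() {
    // Example plugin class names; a real test would list whichever factories are built.
    return Arrays.asList(new Object[][] {
        {"org.apache.hadoop.io.erasurecode.rawcoder.RSRawErasureCoderFactory"},
        {"org.apache.hadoop.io.erasurecode.rawcoder.NativeRSRawErasureCoderFactory"},
    });
  }

  @Test
  public void testReadWithConfiguredCoder() {
    // Configure the cluster/codec with coderFactoryClassName, then run the shared assertions.
    assertNotNull(coderFactoryClassName);
  }
}
{code}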



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-7408) Add a counter in the log that shows the number of block reports processed

2020-04-11 Thread Brahma Reddy Battula (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-7408?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Brahma Reddy Battula updated HDFS-7408:
---
Target Version/s: 3.4.0  (was: 3.3.0)

Bulk update: moved all 3.3.0 non-blocker issues, please move back if it is a 
blocker.



> Add a counter in the log that shows the number of block reports processed
> -
>
> Key: HDFS-7408
> URL: https://issues.apache.org/jira/browse/HDFS-7408
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: namenode
>Reporter: Suresh Srinivas
>Assignee: Surendra Singh Lilhore
>Priority: Major
> Attachments: HDFS-7408.001.patch
>
>
> It would be great if the info log for block report processing printed how 
> many block reports have been processed so far. This can be useful for 
> debugging when the namenode is unresponsive, especially during startup, to 
> understand whether datanodes are sending block reports multiple times.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-11489) KMS should warning about weak SSL ciphers

2020-04-11 Thread Brahma Reddy Battula (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-11489?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Brahma Reddy Battula updated HDFS-11489:

Target Version/s: 3.4.0  (was: 3.3.0)

Bulk update: moved all 3.3.0 non-blocker issues, please move back if it is a 
blocker.



> KMS should warning about weak SSL ciphers
> -
>
> Key: HDFS-11489
> URL: https://issues.apache.org/jira/browse/HDFS-11489
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: kms
>Affects Versions: 2.9.0
>Reporter: John Zhuge
>Assignee: John Zhuge
>Priority: Minor
>
> HADOOP-14083 sets a list of default ciphers that contain a few weak ciphers 
> in order to maintain backwards compatibility. In addition, users can select 
> weak ciphers via the env var {{KMS_SSL_CIPHERS}}. It'd be nice to get warnings 
> about the weak ciphers.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-11356) figure out what to do about hadoop-hdfs-project/hadoop-hdfs/src/main/native

2020-04-11 Thread Brahma Reddy Battula (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-11356?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Brahma Reddy Battula updated HDFS-11356:

Target Version/s: 3.4.0  (was: 3.3.0)

Bulk update: moved all 3.3.0 non-blocker issues, please move back if it is a 
blocker.



> figure out what to do about hadoop-hdfs-project/hadoop-hdfs/src/main/native
> ---
>
> Key: HDFS-11356
> URL: https://issues.apache.org/jira/browse/HDFS-11356
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: build, documentation
>Affects Versions: 3.0.0-alpha2
>Reporter: Allen Wittenauer
>Assignee: Wei-Chiu Chuang
>Priority: Major
> Attachments: HDFS-11356.001.patch
>
>
> The move of code during the hdfs-client-native creation caused all sorts of 
> loose ends, and this is just another one.  



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-14305) Serial number in BlockTokenSecretManager could overlap between different namenodes

2020-04-11 Thread Brahma Reddy Battula (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-14305?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Brahma Reddy Battula updated HDFS-14305:

Target Version/s: 3.4.0  (was: 3.3.0, 3.1.4, 3.2.2, 2.10.1)

Bulk update: moved all 3.3.0 non-blocker issues, please move back if it is a 
blocker.



> Serial number in BlockTokenSecretManager could overlap between different 
> namenodes
> --
>
> Key: HDFS-14305
> URL: https://issues.apache.org/jira/browse/HDFS-14305
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: namenode, security
>Reporter: Chao Sun
>Assignee: Konstantin Shvachko
>Priority: Major
>  Labels: multi-sbnn
> Attachments: HDFS-14305-007.patch, HDFS-14305-008.patch, 
> HDFS-14305.001.patch, HDFS-14305.002.patch, HDFS-14305.003.patch, 
> HDFS-14305.004.patch, HDFS-14305.005.patch, HDFS-14305.006.patch
>
>
> Currently, a {{BlockTokenSecretManager}} starts with a random integer as the 
> initial serial number, and then use this formula to rotate it:
> {code:java}
> this.intRange = Integer.MAX_VALUE / numNNs;
> this.nnRangeStart = intRange * nnIndex;
> this.serialNo = (this.serialNo % intRange) + (nnRangeStart);
>  {code}
> where {{numNNs}} is the total number of NameNodes in the cluster, and 
> {{nnIndex}} is the index of the current NameNode specified in the 
> configuration {{dfs.ha.namenodes.}}.
> However, with this approach, different NameNode could have overlapping ranges 
> for serial number. For simplicity, let's assume {{Integer.MAX_VALUE}} is 100, 
> and we have 2 NameNodes {{nn1}} and {{nn2}} in configuration. Then the ranges 
> for these two are:
> {code}
> nn1 -> [-49, 49]
> nn2 -> [1, 99]
> {code}
> This is because the initial serial number could be any negative integer.
> Moreover, when the keys are updated, the serial number will again be updated 
> with the formula:
> {code}
> this.serialNo = (this.serialNo % intRange) + (nnRangeStart);
> {code}
> which means the new serial number could be updated to a range that belongs to 
> a different NameNode, thus increasing the chance of collision again.
> When the collision happens, DataNodes could overwrite an existing key which 
> will cause clients to fail because of {{InvalidToken}} error.
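
To make the overlap concrete, here is a small sketch reproducing the range arithmetic from the description, with the maximum scaled down to 100 as in the example above; the class and variable names are ours, not the actual BlockTokenSecretManager code:

{code:java}
// Sketch of the serial-number range arithmetic, MAX = 100 and 2 NameNodes.
public class SerialRangeDemo {
  public static void main(String[] args) {
    final int max = 100;                 // stand-in for Integer.MAX_VALUE
    final int numNNs = 2;
    final int intRange = max / numNNs;   // 50

    for (int nnIndex = 0; nnIndex < numNNs; nnIndex++) {
      int nnRangeStart = intRange * nnIndex;    // 0 for nn1, 50 for nn2
      // serialNo starts as a random int, so (serialNo % intRange) falls in
      // (-intRange, intRange); adding nnRangeStart gives the per-NN range.
      int lo = -(intRange - 1) + nnRangeStart;  // -49 for nn1,  1 for nn2
      int hi = (intRange - 1) + nnRangeStart;   //  49 for nn1, 99 for nn2
      System.out.println("nn" + (nnIndex + 1) + " -> [" + lo + ", " + hi + "]");
    }
  }
}
{code}

Running it prints the same overlapping ranges quoted in the description, which is the collision window the issue describes.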



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-8065) Erasure coding: Support truncate at striped group boundary

2020-04-11 Thread Brahma Reddy Battula (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-8065?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Brahma Reddy Battula updated HDFS-8065:
---
Target Version/s: 3.4.0  (was: 3.3.0)

Bulk update: moved all 3.3.0 non-blocker issues, please move back if it is a 
blocker.



> Erasure coding: Support truncate at striped group boundary
> --
>
> Key: HDFS-8065
> URL: https://issues.apache.org/jira/browse/HDFS-8065
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: namenode
>Reporter: Yi Liu
>Assignee: Rakesh Radhakrishnan
>Priority: Major
> Attachments: HDFS-8065-00.patch, HDFS-8065-01.patch
>
>
> We can support truncate at the striped group boundary first.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-11672) Fix some misleading log messages related to EC block groups

2020-04-11 Thread Brahma Reddy Battula (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-11672?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Brahma Reddy Battula updated HDFS-11672:

Target Version/s: 3.4.0  (was: 3.3.0)

Bulk update: moved all 3.3.0 non-blocker issues, please move back if it is a 
blocker.



> Fix some misleading log messages related to EC block groups
> ---
>
> Key: HDFS-11672
> URL: https://issues.apache.org/jira/browse/HDFS-11672
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: erasure-coding
>Affects Versions: 3.0.0-alpha1
>Reporter: Andrew Wang
>Assignee: Manoj Govindassamy
>Priority: Minor
>  Labels: newbie
>
> I turned the NameNode's BlockStateChange and hdfs.StateChange logs up to 
> debug and noticed some out-of-date log messages like the following:
> {noformat}
> 2017-04-18 17:19:50,639 [IPC Server handler 4 on 45133] DEBUG 
> hdfs.StateChange (FSDirWriteFileOp.java:addBlock(507)) - DIR* 
> FSDirectory.addBlock: /test2RecoveryTasksForSameBlockGroup with 
> blk_-9223372036854775792_1001 block is added to the in-memory file system
> 2017-04-18 17:19:50,836 [Block report processor] DEBUG BlockStateChange 
> (LowRedundancyBlocks.java:update(353)) - BLOCK* 
> NameSystem.LowRedundancyBlock.update: blk_-9223372036854775792_1001 has only 
> 8 replicas and needs 9 replicas so is added to neededReconstructions at 
> priority level 2
> 2017-04-18 17:19:51,235 [IPC Server handler 4 on 45133] DEBUG 
> hdfs.StateChange (FSNamesystem.java:closeFile(3740)) - closeFile: 
> /test2RecoveryTasksForSameBlockGroup with 1 blocks is persisted to the file 
> system
> {noformat}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-10616) Improve performance of path handling

2020-04-11 Thread Brahma Reddy Battula (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-10616?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Brahma Reddy Battula updated HDFS-10616:

Target Version/s: 3.4.0  (was: 3.3.0)

Bulk update: moved all 3.3.0 non-blocker issues, please move back if it is a 
blocker.



> Improve performance of path handling
> 
>
> Key: HDFS-10616
> URL: https://issues.apache.org/jira/browse/HDFS-10616
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: hdfs
>Affects Versions: 2.0.0-alpha
>Reporter: Daryn Sharp
>Assignee: Daryn Sharp
>Priority: Major
> Attachments: 2.6-2.7.1-heap.png
>
>
> Path handling in the namesystem and directory is very inefficient.  The path 
> is repeatedly resolved, decomposed into path components, recombined to a full 
> path, and parsed again, throughout the system.  This hurts performance 
> directly, and indirectly via unnecessary pressure on young gen GC.
> The namesystem should only operate on paths, parse it once into inodes, and 
> the directory should only operate on inodes.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-14312) KMS-o-meter: Scale test KMS using kms audit log

2020-04-11 Thread Brahma Reddy Battula (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-14312?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Brahma Reddy Battula updated HDFS-14312:

Target Version/s: 3.4.0  (was: 3.3.0)

Bulk update: moved all 3.3.0 non-blocker issues, please move back if it is a 
blocker.



> KMS-o-meter: Scale test KMS using kms audit log
> ---
>
> Key: HDFS-14312
> URL: https://issues.apache.org/jira/browse/HDFS-14312
> Project: Hadoop HDFS
>  Issue Type: New Feature
>  Components: kms
>Affects Versions: 3.3.0
>Reporter: Wei-Chiu Chuang
>Assignee: Wei-Chiu Chuang
>Priority: Major
>
> It appears to me that Dynamometer's architecture allows KMS scale tests too.
> I imagine there are two ways to scale test a KMS.
> # Take KMS audit logs, and replay the logs against a KMS.
> # Configure Dynamometer to start KMS in addition to NameNode. Assuming the 
> fsimage comes from an encrypted cluster, replaying HDFS audit log also tests 
> KMS.
> It would be even more interesting to have a tool that converts an unencrypted 
> cluster fsimage to an encrypted one.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-11350) Document the missing settings relevant to DataNode disk IO statistics

2020-04-11 Thread Brahma Reddy Battula (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-11350?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Brahma Reddy Battula updated HDFS-11350:

Target Version/s: 3.4.0  (was: 3.3.0)

Bulk update: moved all 3.3.0 non-blocker issues, please move back if it is a 
blocker.



> Document the missing settings relevant to DataNode disk IO statistics
> -
>
> Key: HDFS-11350
> URL: https://issues.apache.org/jira/browse/HDFS-11350
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: documentation
>Affects Versions: 3.0.0-alpha2
>Reporter: Yiqun Lin
>Assignee: Yiqun Lin
>Priority: Minor
> Attachments: HDFS-11350.001.patch
>
>
> HDFS-11299 and HDFS-11339 introduced some new settings relevant to profiling 
> hooks that expose latency statistics around DataNode disk IO. These settings 
> are intended for users but they are not documented. In total, three 
> relevant settings:
> 1.dfs.datanode.enable.fileio.profiling
> 2.dfs.datanode.enable.fileio.fault.injection
> 3.dfs.datanode.fileio.profiling.sampling.fraction
> Actually, only the setting {{dfs.datanode.enable.fileio.fault.injection}} 
> doesn't need to be documented.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-13656) Logging more info when client completes file error

2020-04-11 Thread Brahma Reddy Battula (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-13656?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Brahma Reddy Battula updated HDFS-13656:

Target Version/s: 3.4.0  (was: 3.3.0)

Bulk update: moved all 3.3.0 non-blocker issues, please move back if it is a 
blocker.



> Logging more info when client completes file error
> --
>
> Key: HDFS-13656
> URL: https://issues.apache.org/jira/browse/HDFS-13656
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Affects Versions: 3.0.2
>Reporter: Yiqun Lin
>Assignee: Yiqun Lin
>Priority: Minor
> Attachments: HDFS-13656.001.patch, error-hdfs.png
>
>
> We found this error when the dfs client completes a file.
> !error-hdfs.png!
> Currently the error log is too simple and does not provide enough info for 
> debugging. This error will fail the write operation and eventually cause our 
> critical tasks to fail. It would be good to print the retry time and file path 
> info in {{DFSOutputStream#completeFile}}. In addition, the log level can be 
> raised from INFO to WARN so it won't be easily ignored by users.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-7550) Minor followon cleanups from HDFS-7543

2020-04-11 Thread Brahma Reddy Battula (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-7550?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Brahma Reddy Battula updated HDFS-7550:
---
Target Version/s: 3.4.0  (was: 3.3.0)

Bulk update: moved all 3.3.0 non-blocker issues, please move back if it is a 
blocker.



> Minor followon cleanups from HDFS-7543
> --
>
> Key: HDFS-7550
> URL: https://issues.apache.org/jira/browse/HDFS-7550
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: namenode
>Affects Versions: 2.7.0
>Reporter: Charles Lamb
>Priority: Minor
> Attachments: HDFS-7550.001.patch
>
>
> The commit of HDFS-7543 crossed paths with these comments:
> FSDirMkdirOp.java
> in #mkdirs, you removed the final String srcArg = src. This should be left 
> in. Many IDEs will whine about making assignments to formal args and that's 
> why it was put in in the first place.
> FSDirRenameOp.java
> #renameToInt, dstIIP (and resultingStat) could benefit from final's.
> FSDirXAttrOp.java
> I'm not sure why you've moved the call to getINodesInPath4Write and 
> checkXAttrChangeAccess inside the writeLock.
> FSDirStatAndListing.java
> The javadoc for the @param src needs to be changed to reflect that it's an 
> INodesInPath, not a String. Nit: it might be better to rename the 
> INodesInPath arg from src to iip.
> #getFileInfo4DotSnapshot is now unused since you in-lined it into 
> #getFileInfo.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-14528) Failover from Active to Standby Failed

2020-04-11 Thread Brahma Reddy Battula (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-14528?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Brahma Reddy Battula updated HDFS-14528:

Target Version/s: 3.4.0  (was: 3.3.0)

Bulk update: moved all 3.3.0 non-blocker issues, please move back if it is a 
blocker.



> Failover from Active to Standby Failed  
> 
>
> Key: HDFS-14528
> URL: https://issues.apache.org/jira/browse/HDFS-14528
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: ha
>Reporter: Ravuri Sushma sree
>Assignee: Ravuri Sushma sree
>Priority: Major
>  Labels: multi-sbnn
> Attachments: HDFS-14528.003.patch, HDFS-14528.004.patch, 
> HDFS-14528.005.patch, HDFS-14528.006.patch, HDFS-14528.007.patch, 
> HDFS-14528.2.Patch, ZKFC_issue.patch
>
>
>  *In a cluster with more than one Standby namenode, manual failover throws an 
> exception in some cases.*
> *When trying to execute the failover command from active to standby* 
> *._/hdfs haadmin  -failover nn1 nn2, the below Exception is thrown_*
>   Operation failed: Call From X-X-X-X/X-X-X-X to Y-Y-Y-Y: failed on 
> connection exception: java.net.ConnectException: Connection refused
> This is encountered in the following cases :
>  Scenario 1 : 
> Namenodes - NN1(Active) , NN2(Standby), NN3(Standby)
> When trying to manually failover from NN1 to NN2 if NN3 is down, Exception is 
> thrown
> Scenario 2 :
>  Namenodes - NN1(Active) , NN2(Standby), NN3(Standby)
> ZKFC's -              ZKFC1,            ZKFC2,            ZKFC3
> When trying to manually failover from NN1 to NN3, if NN3's ZKFC (ZKFC3) is 
> down, an Exception is thrown



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-12257) Expose getSnapshottableDirListing as a public API in HdfsAdmin

2020-04-11 Thread Brahma Reddy Battula (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-12257?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Brahma Reddy Battula updated HDFS-12257:

Target Version/s: 3.4.0  (was: 3.3.0)

Bulk update: moved all 3.3.0 non-blocker issues, please move back if it is a 
blocker.



> Expose getSnapshottableDirListing as a public API in HdfsAdmin
> --
>
> Key: HDFS-12257
> URL: https://issues.apache.org/jira/browse/HDFS-12257
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: snapshots
>Affects Versions: 2.6.5
>Reporter: Andrew Wang
>Assignee: Huafeng Wang
>Priority: Major
> Attachments: HDFS-12257.001.patch, HDFS-12257.002.patch, 
> HDFS-12257.003.patch
>
>
> Found at HIVE-16294. We have a CLI API for listing snapshottable dirs, but no 
> programmatic API. Other snapshot APIs are exposed in HdfsAdmin, I think we 
> should expose listing there as well.
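
For context, a sketch of what the programmatic call looks like today via {{DistributedFileSystem}}; exposing an equivalent method on {{HdfsAdmin}} is what this issue asks for, so the {{HdfsAdmin}} call in the trailing comment is the proposal, not an existing API:

{code:java}
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.hdfs.DistributedFileSystem;
import org.apache.hadoop.hdfs.protocol.SnapshottableDirectoryStatus;

// Today callers have to downcast to DistributedFileSystem to list snapshottable dirs.
public class ListSnapshottableDirs {
  public static void main(String[] args) throws Exception {
    Configuration conf = new Configuration();
    FileSystem fs = FileSystem.get(conf);
    if (fs instanceof DistributedFileSystem) {
      SnapshottableDirectoryStatus[] dirs =
          ((DistributedFileSystem) fs).getSnapshottableDirListing();
      if (dirs != null) {
        for (SnapshottableDirectoryStatus s : dirs) {
          System.out.println(s.getFullPath());
        }
      }
    }
    // Proposed (hypothetical) public API:
    //   new HdfsAdmin(fs.getUri(), conf).getSnapshottableDirListing()
  }
}
{code}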



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-10743) MiniDFSCluster test runtimes can be drastically reduce

2020-04-11 Thread Brahma Reddy Battula (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-10743?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Brahma Reddy Battula updated HDFS-10743:

Target Version/s: 3.4.0  (was: 3.3.0)

Bulk update: moved all 3.3.0 non-blocker issues, please move back if it is a 
blocker.



> MiniDFSCluster test runtimes can be drastically reduce
> --
>
> Key: HDFS-10743
> URL: https://issues.apache.org/jira/browse/HDFS-10743
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: hdfs
>Affects Versions: 2.0.0-alpha
>Reporter: Daryn Sharp
>Assignee: Kuhu Shukla
>Priority: Major
> Attachments: HDFS-10743.001.patch, HDFS-10743.002.patch, 
> HDFS-10743.003.patch
>
>
> {{MiniDFSCluster}} tests have excessive runtimes.  The main problem appears 
> to be the heartbeat interval.  The NN may have to wait up to 3s (default 
> value) for all DNs to heartbeat, triggering registration, so NN can go 
> active.  Tests that repeatedly restart the NN are severely affected.
> Example for varying heartbeat intervals for {{TestFSImageWithAcl}}:
> * 3s = ~70s -- (disgusting, why I investigated)
> * 1s = ~27s
> * 500ms = ~17s -- (had to hack DNConf for millisecond precision)
> That's a 4x improvement in runtime.
> 17s is still excessively long for what the test does.  Further areas to 
> explore when running tests:
> * Reduce numerous sleeps intervals in DN's {{BPServiceActor}}.
> * Ensure heartbeats and initial BR are sent immediately upon (re)registration.
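
A minimal sketch of the first knob mentioned above, shortening the heartbeat interval in the test configuration before building the cluster (1s being the smallest value at the current second granularity); the helper class is illustrative:

{code:java}
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hdfs.DFSConfigKeys;
import org.apache.hadoop.hdfs.HdfsConfiguration;
import org.apache.hadoop.hdfs.MiniDFSCluster;

// Sketch: shrink the DN heartbeat interval so NN restarts in tests wait ~1s, not ~3s.
public class FastHeartbeatCluster {
  public static MiniDFSCluster start() throws Exception {
    Configuration conf = new HdfsConfiguration();
    conf.setLong(DFSConfigKeys.DFS_HEARTBEAT_INTERVAL_KEY, 1L);  // seconds; default 3
    MiniDFSCluster cluster = new MiniDFSCluster.Builder(conf)
        .numDataNodes(1)
        .build();
    cluster.waitActive();
    return cluster;
  }
}
{code}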



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-10429) DataStreamer interrupted warning always appears when using CLI upload file

2020-04-11 Thread Brahma Reddy Battula (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-10429?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Brahma Reddy Battula updated HDFS-10429:

Target Version/s: 3.4.0  (was: 3.3.0)

Bulk update: moved all 3.3.0 non-blocker issues, please move back if it is a 
blocker.



> DataStreamer interrupted warning  always appears when using CLI upload file
> ---
>
> Key: HDFS-10429
> URL: https://issues.apache.org/jira/browse/HDFS-10429
> Project: Hadoop HDFS
>  Issue Type: Bug
>Affects Versions: 2.7.3
>Reporter: Zhiyuan Yang
>Priority: Minor
> Attachments: HDFS-10429.1.patch, HDFS-10429.2.patch, 
> HDFS-10429.3.patch
>
>
> Every time I use 'hdfs dfs -put' to upload a file, this warning is printed:
> {code:java}
> 16/05/18 20:57:56 WARN hdfs.DataStreamer: Caught exception
> java.lang.InterruptedException
>   at java.lang.Object.wait(Native Method)
>   at java.lang.Thread.join(Thread.java:1245)
>   at java.lang.Thread.join(Thread.java:1319)
>   at 
> org.apache.hadoop.hdfs.DataStreamer.closeResponder(DataStreamer.java:871)
>   at org.apache.hadoop.hdfs.DataStreamer.endBlock(DataStreamer.java:519)
>   at org.apache.hadoop.hdfs.DataStreamer.run(DataStreamer.java:696)
> {code}
> The reason is this: originally, DataStreamer::closeResponder always prints a 
> warning about InterruptedException; since HDFS-9812, 
> DFSOutputStream::closeImpl  always forces threads to close, which causes 
> InterruptedException.
> A simple fix is to use debug level log instead of warning level.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-10911) Change edit log OP_UPDATE_BLOCKS to store delta blocks only.

2020-04-11 Thread Brahma Reddy Battula (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-10911?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Brahma Reddy Battula updated HDFS-10911:

Target Version/s: 3.4.0  (was: 3.3.0)

Bulk update: moved all 3.3.0 non-blocker issues, please move back if it is a 
blocker.



> Change edit log OP_UPDATE_BLOCKS to store delta blocks only.
> 
>
> Key: HDFS-10911
> URL: https://issues.apache.org/jira/browse/HDFS-10911
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: namenode
>Affects Versions: 2.7.3, 3.0.0-alpha1
>Reporter: Lei (Eddy) Xu
>Assignee: Lei (Eddy) Xu
>Priority: Major
>
> Every time an HDFS client {{close}}s or {{hflush}}es an open file, the NameNode 
> enumerates all the blocks and stores them into the edit log (OP_UPDATE_BLOCKS). 
> This can cause problems when the client is appending to a large file frequently 
> (e.g., a WAL). 
> Because HDFS is append-only, we could store only the blocks that have been 
> changed (delta blocks) in the edit log.
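
A sketch of the idea in the last paragraph, assuming the NameNode knows how many blocks it already persisted for the file; the names are illustrative, not the real edit-log writer:

{code:java}
import java.util.List;

// Sketch: log only the blocks added/changed since the last OP_UPDATE_BLOCKS,
// instead of re-serializing the whole block list on every hflush/close.
class DeltaBlocksSketch {
  /** @return the suffix of {@code allBlocks} not yet persisted in the edit log. */
  static <B> List<B> deltaSince(List<B> allBlocks, int blocksAlreadyPersisted) {
    // HDFS files are append-only, so earlier blocks are immutable; only the last
    // persisted block may have grown, so re-log from one before the persisted count.
    int from = Math.max(0, blocksAlreadyPersisted - 1);
    return allBlocks.subList(from, allBlocks.size());
  }
}
{code}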



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-13495) RBF: Support Router Admin REST API

2020-04-11 Thread Brahma Reddy Battula (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-13495?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Brahma Reddy Battula updated HDFS-13495:

Target Version/s: 3.4.0  (was: 3.3.0)

Bulk update: moved all 3.3.0 non-blocker issues, please move back if it is a 
blocker.



> RBF: Support Router Admin REST API
> --
>
> Key: HDFS-13495
> URL: https://issues.apache.org/jira/browse/HDFS-13495
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Reporter: Mohammad Arshad
>Assignee: Fengnan Li
>Priority: Major
>  Labels: RBF
> Attachments: HDFS-13495-001.patch
>
>
> This JIRA intends to add REST API support for all admin commands. Router 
> Admin REST APIs can be useful in managing the Routers from a central 
> management layer tool. 



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-11232) System.err should be System.out

2020-04-11 Thread Brahma Reddy Battula (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-11232?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Brahma Reddy Battula updated HDFS-11232:

Target Version/s: 3.4.0  (was: 3.3.0)

Bulk update: moved all 3.3.0 non-blocker issues, please move back if it is a 
blocker.



> System.err should be System.out
> ---
>
> Key: HDFS-11232
> URL: https://issues.apache.org/jira/browse/HDFS-11232
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: namenode
>Reporter: Ethan Li
>Priority: Trivial
> Attachments: HDFS-11232.001.patch, HDFS-11232.002.patch
>
>
> In 
> /Users/Ethan/Worksplace/IntelliJWorkspace/hadoop/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/NameNode.java,
>  System.err.println("Generating new cluster id:"); is used. I think it should 
> be System.out.println(...) since this is not an error message



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-11885) createEncryptionZone should not block on initializing EDEK cache

2020-04-11 Thread Brahma Reddy Battula (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-11885?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Brahma Reddy Battula updated HDFS-11885:

Target Version/s: 3.4.0  (was: 3.3.0)

Bulk update: moved all 3.3.0 non-blocker issues, please move back if it is a 
blocker.



> createEncryptionZone should not block on initializing EDEK cache
> 
>
> Key: HDFS-11885
> URL: https://issues.apache.org/jira/browse/HDFS-11885
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: encryption
>Affects Versions: 2.6.5
>Reporter: Andrew Wang
>Assignee: Andrew Wang
>Priority: Major
> Attachments: HDFS-11885.001.patch, HDFS-11885.002.patch, 
> HDFS-11885.003.patch, HDFS-11885.004.patch
>
>
> When creating an encryption zone, we call {{ensureKeyIsInitialized}}, which 
> calls {{provider.warmUpEncryptedKeys(keyName)}}. This is a blocking call, 
> which attempts to fill the key cache up to the low watermark.
> If the KMS is down or slow, this can take a very long time, and cause the 
> createZone RPC to fail with a timeout.
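
One possible shape of a fix, sketched under the assumption that the warm-up can simply be handed to a background executor and any failure logged rather than failing the createZone RPC; the class and method names here are illustrative, not the actual patch:

{code:java}
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import org.apache.hadoop.crypto.key.KeyProviderCryptoExtension;

// Sketch: warm the EDEK cache asynchronously so createEncryptionZone returns
// promptly even when the KMS is slow or down.
class AsyncEdekWarmUp {
  private final ExecutorService warmUpPool = Executors.newSingleThreadExecutor();

  void warmUpAsync(KeyProviderCryptoExtension provider, String keyName) {
    warmUpPool.submit(() -> {
      try {
        provider.warmUpEncryptedKeys(keyName);  // blocking call moved off the RPC handler
      } catch (Exception e) {
        // the cache will still be filled lazily on first use; just record the failure
        System.err.println("EDEK warm-up for " + keyName + " failed: " + e);
      }
    });
  }
}
{code}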



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-11464) Improve the selection in choosing storage for blocks

2020-04-11 Thread Brahma Reddy Battula (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-11464?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Brahma Reddy Battula updated HDFS-11464:

Target Version/s: 3.4.0  (was: 3.3.0)

Bulk update: moved all 3.3.0 non-blocker issues, please move back if it is a 
blocker.



> Improve the selection in choosing storage for blocks
> 
>
> Key: HDFS-11464
> URL: https://issues.apache.org/jira/browse/HDFS-11464
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: namenode
>Reporter: Yiqun Lin
>Assignee: Yiqun Lin
>Priority: Major
> Attachments: HDFS-11464.001.patch, HDFS-11464.002.patch, 
> HDFS-11464.003.patch, HDFS-11464.004.patch, HDFS-11464.005.patch
>
>
> Currently the logic for choosing storage for blocks is not a good approach. It 
> always uses the first valid storage of a given StorageType (see 
> {{DataNodeDescriptor#chooseStorage4Block}}). This is not a good 
> selection: it means blocks will always be written to the same volume (the first 
> volume) and other valid volumes never get chosen. This problem is brought up 
> by this comment ( 
> https://issues.apache.org/jira/browse/HDFS-9807?focusedCommentId=15878382&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-15878382
>  )
> Here is one possible solution:
> * First, based on existing storages in one node, extract all the valid 
> storages into a collection.
> * Then, shuffle the order of these valid storages to get a new collection.
> * Finally, get the first storage from the new storages collection.
> These steps (sketched below) will be executed in 
> {{DataNodeDescriptor#chooseStorage4Block}} and replace the current logic. I think 
> this improvement can be done as a subtask under HDFS-11419. Any further 
> comments are welcome.
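
A condensed sketch of the three steps above; a stand-in {{Storage}} interface replaces the real {{DatanodeStorageInfo}}, so treat the whole snippet as an assumption rather than the final patch:

{code:java}
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Sketch of the proposed selection: collect all valid storages of the requested
// type, shuffle them, and take the first, instead of always returning storage #0.
class ChooseStorageSketch {
  interface Storage {
    boolean isValid(long blockSize);   // stand-in for state + remaining-space checks
  }

  static Storage chooseStorage4Block(List<Storage> storagesOfType, long blockSize) {
    List<Storage> valid = new ArrayList<>();
    for (Storage s : storagesOfType) {
      if (s.isValid(blockSize)) {
        valid.add(s);
      }
    }
    if (valid.isEmpty()) {
      return null;
    }
    Collections.shuffle(valid);        // disrupt the order so all volumes get a chance
    return valid.get(0);
  }
}
{code}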



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-14476) lock too long when fix inconsistent blocks between disk and in-memory

2020-04-11 Thread Brahma Reddy Battula (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-14476?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Brahma Reddy Battula updated HDFS-14476:

Target Version/s: 3.4.0

Bulk update: moved all 3.3.0 non-blocker issues, please move back if it is a 
blocker.



> lock too long when fix inconsistent blocks between disk and in-memory
> -
>
> Key: HDFS-14476
> URL: https://issues.apache.org/jira/browse/HDFS-14476
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: datanode
>Affects Versions: 2.6.0, 2.7.0, 3.0.3
>Reporter: Sean Chow
>Assignee: Sean Chow
>Priority: Major
> Fix For: 3.3.0
>
> Attachments: HDFS-14476-branch-2.01.patch, 
> HDFS-14476-branch-2.02.patch, HDFS-14476-branch-2.10.02.patch, 
> HDFS-14476.00.patch, HDFS-14476.002.patch, HDFS-14476.01.patch, 
> HDFS-14476.branch-3.2.001.patch, datanode-with-patch-14476.png
>
>
> When the DirectoryScanner finds differences between on-disk and in-memory 
> blocks, it will try to run {{checkAndUpdate}} to fix them. However, 
> {{FsDatasetImpl.checkAndUpdate}} is a synchronized call.
> As I have about 6 million blocks on every datanode and every 6-hour scan 
> finds about 25000 abnormal blocks to fix, this leads to a long lock being 
> held on the FsDatasetImpl object.
> Let's assume every block needs 10ms to fix (because of SAS disk latency); 
> that will take 250 seconds to finish. That means all reads and writes will be 
> blocked for about 4 minutes on that datanode.
>  
> {code:java}
> 2019-05-06 08:06:51,704 INFO 
> org.apache.hadoop.hdfs.server.datanode.DirectoryScanner: BlockPool 
> BP-1644920766-10.223.143.220-1450099987967 Total blocks: 6850197, missing 
> metadata files:23574, missing block files:23574, missing blocks in 
> memory:47625, mismatched blocks:0
> ...
> 2019-05-06 08:16:41,625 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: 
> Took 588402ms to process 1 commands from NN
> {code}
> It takes a long time to process commands from the NN because threads are 
> blocked, and the namenode will see a long lastContact time for this datanode.
> This may affect all HDFS versions.
> *How to fix:*
> Just like invalidate commands from the namenode are processed with a batch 
> size of 1000, fixing these abnormal blocks should be handled in batches too, 
> sleeping 2 seconds between batches to allow normal block reads/writes (see 
> the sketch below).
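
A rough sketch of that batching idea, assuming the per-block fix stays inside the dataset lock but the lock is released and a short sleep inserted between batches; the constants, {{Diff}} placeholder, and method names are assumptions, not the actual {{FsDatasetImpl}} code:

{code:java}
import java.util.List;

// Sketch: fix scan differences in batches so the FsDatasetImpl lock is not held
// for the whole 25k-block backlog at once.
class BatchedCheckAndUpdate {
  private static final int BATCH_SIZE = 1000;   // mirrors the invalidate-command batching
  private static final long PAUSE_MS = 2000L;   // let normal reads/writes proceed

  interface Diff { void fix(); }                // placeholder for one scan difference

  void checkAndUpdate(Object datasetLock, List<Diff> diffs) throws InterruptedException {
    for (int i = 0; i < diffs.size(); i += BATCH_SIZE) {
      int end = Math.min(i + BATCH_SIZE, diffs.size());
      synchronized (datasetLock) {              // hold the lock per batch, not per full scan
        for (Diff d : diffs.subList(i, end)) {
          d.fix();
        }
      }
      if (end < diffs.size()) {
        Thread.sleep(PAUSE_MS);                 // breathing room between batches
      }
    }
  }
}
{code}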



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-12667) KMSClientProvider#ValueQueue does synchronous fetch of edeks in background async thread.

2020-04-11 Thread Brahma Reddy Battula (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-12667?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Brahma Reddy Battula updated HDFS-12667:

Target Version/s: 3.4.0  (was: 3.3.0)

Bulk update: moved all 3.3.0 non-blocker issues, please move back if it is a 
blocker.



> KMSClientProvider#ValueQueue does synchronous fetch of edeks in background 
> async thread.
> 
>
> Key: HDFS-12667
> URL: https://issues.apache.org/jira/browse/HDFS-12667
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: encryption, kms
>Affects Versions: 3.0.0-alpha4
>Reporter: Rushabh Shah
>Assignee: Rushabh Shah
>Priority: Major
> Attachments: HDFS-12667-001.patch, HDFS-12667-002.patch
>
>
> There are a couple of issues in KMSClientProvider#ValueQueue.
> 1.
>  {code:title=ValueQueue.java|borderStyle=solid}
>   private final LoadingCache> keyQueues;
>   // Stripped rwlocks based on key name to synchronize the queue from
>   // the sync'ed rw-thread and the background async refill thread.
>   private final List lockArray =
>   new ArrayList<>(LOCK_ARRAY_SIZE);
> {code}
> It hashes the key name into 16 buckets.
> In the code chunk below,
>  {code:title=ValueQueue.java|borderStyle=solid}
> public List getAtMost(String keyName, int num) throws IOException,
>   ExecutionException {
>  ...
>  ...
>  readLock(keyName);
> E val = keyQueue.poll();
> readUnlock(keyName);
>  ...
>   }
>   private void submitRefillTask(final String keyName,
>   final Queue keyQueue) throws InterruptedException {
>   ...
>   ...
>   writeLock(keyName); // It holds the write lock while the key is 
> being asynchronously fetched. So the read requests for all the keys that 
> hashes to this bucket will essentially be blocked.
>   try {
> if (keyQueue.size() < threshold && !isCanceled()) {
>   refiller.fillQueueForKey(name, keyQueue,
>   cacheSize - keyQueue.size());
> }
>  ...
>   } finally {
> writeUnlock(keyName);
>   }
> }
>   }
> {code}
> According to the above code chunk, if two keys (let's say key1 and key2) hash to 
> the same bucket (between 1 and 16), then while key1 is asynchronously being 
> refetched, all getKey calls for key2 will be blocked.
> 2. Due to the striped rw locks, the asynchronous refill of keys is now 
> synchronous with respect to other handler threads.
> I understand that the locks were added so that we don't kick off multiple 
> asynchronous refilling threads for the same key.
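
To see the collision in issue 1 concretely, here is a small sketch of the 16-way striping: two different key names that land in the same bucket share a lock, so a slow refill of one blocks readers of the other. The bucket math follows the description above, not the exact ValueQueue code:

{code:java}
import java.util.concurrent.locks.ReadWriteLock;
import java.util.concurrent.locks.ReentrantReadWriteLock;

// Sketch of striped read/write locks keyed by key-name hash, as described above.
class StripedKeyLocks {
  private static final int LOCK_ARRAY_SIZE = 16;
  private final ReadWriteLock[] locks = new ReadWriteLock[LOCK_ARRAY_SIZE];

  StripedKeyLocks() {
    for (int i = 0; i < LOCK_ARRAY_SIZE; i++) {
      locks[i] = new ReentrantReadWriteLock();
    }
  }

  private int bucket(String keyName) {
    return (keyName.hashCode() & Integer.MAX_VALUE) % LOCK_ARRAY_SIZE;
  }

  public static void main(String[] args) {
    StripedKeyLocks s = new StripedKeyLocks();
    // If these two keys hash to the same bucket, a write-locked refill of "key1"
    // blocks every getAtMost("key2") reader until the KMS fetch finishes.
    System.out.println("key1 -> bucket " + s.bucket("key1"));
    System.out.println("key2 -> bucket " + s.bucket("key2"));
  }
}
{code}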



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-10237) Support specifying checksum type in WebHDFS/HTTPFS writers

2020-04-11 Thread Brahma Reddy Battula (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-10237?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Brahma Reddy Battula updated HDFS-10237:

Target Version/s: 3.4.0  (was: 3.3.0)

Bulk update: moved all 3.3.0 non-blocker issues, please move back if it is a 
blocker.



> Support specifying checksum type in WebHDFS/HTTPFS writers
> --
>
> Key: HDFS-10237
> URL: https://issues.apache.org/jira/browse/HDFS-10237
> Project: Hadoop HDFS
>  Issue Type: New Feature
>  Components: webhdfs
>Affects Versions: 2.8.0
>Reporter: Harsh J
>Priority: Minor
> Attachments: HDFS-10237.000.patch, HDFS-10237.001.patch, 
> HDFS-10237.002.patch, HDFS-10237.002.patch
>
>
> Currently you cannot set a desired checksum type over a WebHDFS or HTTPFS 
> writer, as you can with the regular DFS writer (done via HADOOP-8240)
> This JIRA covers the changes necessary to bring the same ability to WebHDFS 
> and HTTPFS.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-9940) Balancer should not use property dfs.datanode.balance.max.concurrent.moves

2020-04-11 Thread Brahma Reddy Battula (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-9940?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Brahma Reddy Battula updated HDFS-9940:
---
Target Version/s: 3.4.0  (was: 3.3.0)

Bulk update: moved all 3.3.0 non-blocker issues, please move back if it is a 
blocker.



> Balancer should not use property dfs.datanode.balance.max.concurrent.moves
> --
>
> Key: HDFS-9940
> URL: https://issues.apache.org/jira/browse/HDFS-9940
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: balancer & mover
>Affects Versions: 2.6.0
>Reporter: John Zhuge
>Assignee: John Zhuge
>Priority: Minor
>  Labels: supportability
>
> It is very confusing for both Balancer and Datanode to use the same property 
> {{dfs.datanode.balance.max.concurrent.moves}}. It is especially so for the 
> Balancer because the property has "datanode" in the name string. Many 
> customers forget to set the property for the Balancer.
> Change the Balancer to use a new property 
> {{dfs.balancer.max.concurrent.moves}}.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-12811) A cheaper, faster, less memory intensive Hadoop fsck

2020-04-11 Thread Brahma Reddy Battula (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-12811?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Brahma Reddy Battula updated HDFS-12811:

Target Version/s: 3.4.0  (was: 3.3.0)

Bulk update: moved all 3.3.0 non-blocker issues, please move back if it is a 
blocker.



> A cheaper, faster, less memory intensive Hadoop fsck
> 
>
> Key: HDFS-12811
> URL: https://issues.apache.org/jira/browse/HDFS-12811
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: namenode
>Reporter: Rushabh Shah
>Assignee: Rushabh Shah
>Priority: Major
>
> A cheaper, faster, less memory-intensive approach is to eliminate the traversal 
> by directly scanning the inode table.
>  A side-effect is that paths would be scanned and displayed in random order. A 
> new option, e.g. "-direct" or "-fast", would likely be required to avoid 
> compatibility issues.
> PS: This enhancement is valid only if the path is root directory {{/}}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-10848) Move hadoop-hdfs-native-client module into hadoop-hdfs-client

2020-04-11 Thread Brahma Reddy Battula (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-10848?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Brahma Reddy Battula updated HDFS-10848:

Target Version/s: 3.4.0  (was: 3.3.0)

Bulk update: moved all 3.3.0 non-blocker issues, please move back if it is a 
blocker.



> Move hadoop-hdfs-native-client module into hadoop-hdfs-client
> -
>
> Key: HDFS-10848
> URL: https://issues.apache.org/jira/browse/HDFS-10848
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: hdfs-client
>Reporter: Akira Ajisaka
>Assignee: Huafeng Wang
>Priority: Major
> Attachments: HDFS-10848.001.patch
>
>
> When a patch changes the hadoop-hdfs-client module, Jenkins does not pick up the 
> tests in the native code. As a result, we overlooked a test failure when 
> committing the patch (e.g. HDFS-10844).
> [~aw] said in HDFS-10844,
> bq. Ideally, all of this native code would be hdfs-client. Then when a change 
> is made to to that code, this code will also get tested.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-10364) Log current node in reversexml tool when parse failed

2020-04-11 Thread Brahma Reddy Battula (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-10364?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Brahma Reddy Battula updated HDFS-10364:

Target Version/s: 3.4.0  (was: 3.3.0)

Bulk update: moved all 3.3.0 non-blocker issues, please move back if it is a 
blocker.



> Log current node in reversexml tool when parse failed
> -
>
> Key: HDFS-10364
> URL: https://issues.apache.org/jira/browse/HDFS-10364
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Affects Versions: 2.8.0
>Reporter: Xiao Chen
>Assignee: Xiao Chen
>Priority: Trivial
> Attachments: HDFS-10364.01.patch
>
>
> Sometimes we want to modify the XML before converting it. If an error 
> happens, it's hard to find out where. Adding a log line that tells where the 
> failure occurred would be helpful.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-12116) BlockReportTestBase#blockReport_08 and #blockReport_08 intermittently fail

2020-04-11 Thread Brahma Reddy Battula (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-12116?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Brahma Reddy Battula updated HDFS-12116:

Target Version/s: 3.4.0  (was: 3.3.0)

Bulk update: moved all 3.3.0 non-blocker issues, please move back if it is a 
blocker.



> BlockReportTestBase#blockReport_08 and #blockReport_08 intermittently fail
> --
>
> Key: HDFS-12116
> URL: https://issues.apache.org/jira/browse/HDFS-12116
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: test
>Affects Versions: 0.22.0
>Reporter: Xiao Chen
>Assignee: Zsolt Venczel
>Priority: Major
> Attachments: HDFS-12116.01.patch, HDFS-12116.02.patch, 
> HDFS-12116.03.patch, 
> TEST-org.apache.hadoop.hdfs.server.datanode.TestNNHandlesBlockReportPerStorage.xml
>
>
> This seems to be long-standing, but the failure rate (~10%) is slightly 
> higher in dist-test runs using CDH.
> In both _08 and _09 tests:
> # an attempt is made to make a replica in {{TEMPORARY}}
>  state, by {{waitForTempReplica}}.
> # Once that's returned, the test goes on to verify block reports shows 
> correct pending replication blocks.
> But there's a race condition. If the replica is replicated between steps #1 
> and #2, {{getPendingReplicationBlocks}} could return 0 or 1, depending on how 
> many replicas are replicated, hence failing the test.
> Failures are seen on both {{TestNNHandlesBlockReportPerStorage}} and 
> {{TestNNHandlesCombinedBlockReport}}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-11987) DistributedFileSystem#create and append do not honor CreateFlag.CREATE|APPEND

2020-04-11 Thread Brahma Reddy Battula (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-11987?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Brahma Reddy Battula updated HDFS-11987:

Target Version/s: 3.4.0  (was: 3.3.0)

Bulk update: moved all 3.3.0 non-blocker issues, please move back if it is a 
blocker.



> DistributedFileSystem#create and append do not honor CreateFlag.CREATE|APPEND
> -
>
> Key: HDFS-11987
> URL: https://issues.apache.org/jira/browse/HDFS-11987
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: hdfs
>Affects Versions: 2.8.1, 3.0.0-alpha3
>Reporter: Lei (Eddy) Xu
>Assignee: Lei (Eddy) Xu
>Priority: Major
> Attachments: HDFS-11987.00.patch
>
>
> {{DistributedFileSystem#create()}} and {{DistributedFIleSystem#append()}} do 
> not honor the expected behavior on {{CreateFlag.CREATE|APPEND}}.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-11825) Make dfshealth datanode information more readable

2020-04-11 Thread Brahma Reddy Battula (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-11825?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Brahma Reddy Battula updated HDFS-11825:

Target Version/s: 3.4.0  (was: 3.3.0)

Bulk update: moved all 3.3.0 non-blocker issues, please move back if it is a 
blocker.



> Make dfshealth datanode information more readable
> -
>
> Key: HDFS-11825
> URL: https://issues.apache.org/jira/browse/HDFS-11825
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: hdfs
>Affects Versions: 3.0.0-alpha2
>Reporter: Sarah Victor
>Assignee: Sarah Victor
>Priority: Trivial
> Attachments: HADOOP-11825.1.patch
>
>
> On the dfshealth.html#tab-datanode web page to monitor health of Hadoop nodes 
> (hadoop-hdfs-project/hadoop-hdfs/src/main/webapps/hdfs/dfshealth.html)
> (1) The Capacity column reports the node capacity but is sorted by another 
> value that is not reported (i.e., the % Capacity Used). It would be good to name 
> the column heading "Capacity Used [%]" and report both the Capacity Used 
> and the Capacity Used %.
> (2) The Block Pool Used column reports the block pools used but is sorted by 
> the % Block Pool Used, It will be good to name the column heading as Block 
> Pool Used [%]
> (3) The "Last contact" column is more easily understood if we call it "Last 
> Heartbeat". Adding the measurement unit [secs] to the header makes it more 
> easily understood.
> (4) Adding the measurement unit [mins] to the "Last Block Report" header 
> makes it more easily understood.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-12667) KMSClientProvider#ValueQueue does synchronous fetch of edeks in background async thread.

2020-04-11 Thread Hadoop QA (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-12667?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17081492#comment-17081492
 ] 

Hadoop QA commented on HDFS-12667:
--

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m  
0s{color} | {color:blue} Docker mode activated. {color} |
| {color:red}-1{color} | {color:red} patch {color} | {color:red}  0m  8s{color} 
| {color:red} HDFS-12667 does not apply to trunk. Rebase required? Wrong 
Branch? See https://wiki.apache.org/hadoop/HowToContribute for help. {color} |
\\
\\
|| Subsystem || Report/Notes ||
| JIRA Issue | HDFS-12667 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12895068/HDFS-12667-002.patch |
| Console output | 
https://builds.apache.org/job/PreCommit-HDFS-Build/29113/console |
| Powered by | Apache Yetus 0.8.0   http://yetus.apache.org |


This message was automatically generated.



> KMSClientProvider#ValueQueue does synchronous fetch of edeks in background 
> async thread.
> 
>
> Key: HDFS-12667
> URL: https://issues.apache.org/jira/browse/HDFS-12667
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: encryption, kms
>Affects Versions: 3.0.0-alpha4
>Reporter: Rushabh Shah
>Assignee: Rushabh Shah
>Priority: Major
> Attachments: HDFS-12667-001.patch, HDFS-12667-002.patch
>
>
> There are a couple of issues in KMSClientProvider#ValueQueue.
> 1.
>  {code:title=ValueQueue.java|borderStyle=solid}
>   private final LoadingCache> keyQueues;
>   // Stripped rwlocks based on key name to synchronize the queue from
>   // the sync'ed rw-thread and the background async refill thread.
>   private final List lockArray =
>   new ArrayList<>(LOCK_ARRAY_SIZE);
> {code}
> It hashes the key name into 16 buckets.
> In the code chunk below,
>  {code:title=ValueQueue.java|borderStyle=solid}
> public List getAtMost(String keyName, int num) throws IOException,
>   ExecutionException {
>  ...
>  ...
>  readLock(keyName);
> E val = keyQueue.poll();
> readUnlock(keyName);
>  ...
>   }
>   private void submitRefillTask(final String keyName,
>   final Queue keyQueue) throws InterruptedException {
>   ...
>   ...
>   writeLock(keyName); // It holds the write lock while the key is 
> being asynchronously fetched. So the read requests for all the keys that 
> hashes to this bucket will essentially be blocked.
>   try {
> if (keyQueue.size() < threshold && !isCanceled()) {
>   refiller.fillQueueForKey(name, keyQueue,
>   cacheSize - keyQueue.size());
> }
>  ...
>   } finally {
> writeUnlock(keyName);
>   }
> }
>   }
> {code}
> According to the above code chunk, if two keys (let's say key1 and key2) hash to 
> the same bucket (between 1 and 16), then while key1 is asynchronously being 
> refetched, all getKey calls for key2 will be blocked.
> 2. Due to the striped rw locks, the asynchronous refill of keys is now 
> synchronous with respect to other handler threads.
> I understand that the locks were added so that we don't kick off multiple 
> asynchronous refilling threads for the same key.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-12859) Admin command resetBalancerBandwidth

2020-04-11 Thread Hadoop QA (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-12859?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17081495#comment-17081495
 ] 

Hadoop QA commented on HDFS-12859:
--

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m  
0s{color} | {color:blue} Docker mode activated. {color} |
| {color:red}-1{color} | {color:red} patch {color} | {color:red}  0m  8s{color} 
| {color:red} HDFS-12859 does not apply to trunk. Rebase required? Wrong 
Branch? See https://wiki.apache.org/hadoop/HowToContribute for help. {color} |
\\
\\
|| Subsystem || Report/Notes ||
| JIRA Issue | HDFS-12859 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12899783/0004-HDFS-12859-Admin-command-resetBalancerBandwidth.patch
 |
| Console output | 
https://builds.apache.org/job/PreCommit-HDFS-Build/29110/console |
| Powered by | Apache Yetus 0.8.0   http://yetus.apache.org |


This message was automatically generated.



> Admin command resetBalancerBandwidth
> 
>
> Key: HDFS-12859
> URL: https://issues.apache.org/jira/browse/HDFS-12859
> Project: Hadoop HDFS
>  Issue Type: New Feature
>  Components: balancer & mover
>Reporter: Jianfei Jiang
>Assignee: Jianfei Jiang
>Priority: Major
> Attachments: 
> 0003-HDFS-12859-Admin-command-resetBalancerBandwidth.patch, 
> 0004-HDFS-12859-Admin-command-resetBalancerBandwidth.patch, HDFS-12859.patch
>
>
> We can already set the balancer bandwidth dynamically using the 
> setBalancerBandwidth command. The setting is not persistent and not stored in 
> the configuration file, and different datanodes may have different default or 
> previous settings in their configuration.
> We planned to develop a scheduled balancer task which runs at midnight 
> every day. We set a larger bandwidth for it and hoped to reset the value after 
> it finishes. However, we found it difficult to reset the different settings for 
> different datanodes, as the setBalancerBandwidth command can only set the same 
> value on all datanodes. If we want to use a unique setting for every datanode, 
> we have to reset the datanodes.
> So it would be useful to have a command that synchronizes the setting with the 
> configuration file. 



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-7368) Support HDFS specific 'shell' on command 'hdfs dfs' invocation

2020-04-11 Thread Hadoop QA (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-7368?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17081491#comment-17081491
 ] 

Hadoop QA commented on HDFS-7368:
-

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m  
0s{color} | {color:blue} Docker mode activated. {color} |
| {color:red}-1{color} | {color:red} patch {color} | {color:red}  0m  7s{color} 
| {color:red} HDFS-7368 does not apply to trunk. Rebase required? Wrong Branch? 
See https://wiki.apache.org/hadoop/HowToContribute for help. {color} |
\\
\\
|| Subsystem || Report/Notes ||
| JIRA Issue | HDFS-7368 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12679814/HDFS-7368-001.patch |
| Console output | 
https://builds.apache.org/job/PreCommit-HDFS-Build/29116/console |
| Powered by | Apache Yetus 0.8.0   http://yetus.apache.org |


This message was automatically generated.



> Support HDFS specific 'shell' on command 'hdfs dfs' invocation
> --
>
> Key: HDFS-7368
> URL: https://issues.apache.org/jira/browse/HDFS-7368
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: hdfs-client
>Reporter: Vinayakumar B
>Assignee: Vinayakumar B
>Priority: Major
> Attachments: HDFS-7368-001.patch
>
>
> * *hadoop fs* is the generic shell for all filesystem implementations, but 
> some operations are supported only by certain filesystems, e.g. snapshot, 
> acl and xattr commands.
> * *hdfs dfs* is recommended in all HDFS-related docs in current releases.
> In the current code, both *hdfs dfs* and *hadoop fs* point to the Hadoop 
> Common implementation of FsShell.
> It would be better to have an HDFS-specific extension of FsShell that 
> includes HDFS-only commands in the future.
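A minimal sketch of what such an extension could look like, reusing the 
generic FsShell/CommandFactory machinery. HdfsShell and the commented-out 
command classes are hypothetical names for illustration; this is not the 
patch attached to this issue.

{code:java}
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FsShell;
import org.apache.hadoop.fs.shell.CommandFactory;
import org.apache.hadoop.fs.shell.FsCommand;
import org.apache.hadoop.util.ToolRunner;

// Hypothetical HDFS-specific shell built on top of the generic FsShell.
public class HdfsShell extends FsShell {

  @Override
  protected void registerCommands(CommandFactory factory) {
    // Register the generic fs commands for this subclass.
    factory.registerCommands(FsCommand.class);
    // Hypothetical HDFS-only command registrars would be added here, e.g.:
    // factory.registerCommands(HdfsSnapshotCommands.class);
    // factory.registerCommands(HdfsAclCommands.class);
  }

  public static void main(String[] args) throws Exception {
    HdfsShell shell = new HdfsShell();
    shell.setConf(new Configuration());
    int res = ToolRunner.run(shell, args);
    System.exit(res);
  }
}
{code}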



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-11464) Improve the selection in choosing storage for blocks

2020-04-11 Thread Hadoop QA (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-11464?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17081493#comment-17081493
 ] 

Hadoop QA commented on HDFS-11464:
--

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m  
0s{color} | {color:blue} Docker mode activated. {color} |
| {color:red}-1{color} | {color:red} patch {color} | {color:red}  0m  8s{color} 
| {color:red} HDFS-11464 does not apply to trunk. Rebase required? Wrong 
Branch? See https://wiki.apache.org/hadoop/HowToContribute for help. {color} |
\\
\\
|| Subsystem || Report/Notes ||
| JIRA Issue | HDFS-11464 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12867091/HDFS-11464.005.patch |
| Console output | 
https://builds.apache.org/job/PreCommit-HDFS-Build/29111/console |
| Powered by | Apache Yetus 0.8.0   http://yetus.apache.org |


This message was automatically generated.



> Improve the selection in choosing storage for blocks
> 
>
> Key: HDFS-11464
> URL: https://issues.apache.org/jira/browse/HDFS-11464
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: namenode
>Reporter: Yiqun Lin
>Assignee: Yiqun Lin
>Priority: Major
> Attachments: HDFS-11464.001.patch, HDFS-11464.002.patch, 
> HDFS-11464.003.patch, HDFS-11464.004.patch, HDFS-11464.005.patch
>
>
> Currently the logic for choosing a storage for blocks is not ideal. It 
> always uses the first valid storage of a given StorageType (see 
> {{DataNodeDescriptor#chooseStorage4Block}}). This is not a good selection 
> policy: blocks will always be written to the same (first) volume while the 
> other valid volumes are never chosen. This problem was brought up in this 
> comment ( 
> https://issues.apache.org/jira/browse/HDFS-9807?focusedCommentId=15878382&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-15878382
>  )
> One solution I propose:
> * First, based on the existing storages in a node, extract all the valid 
> storages into a collection.
> * Then, shuffle the order of these valid storages to get a new collection.
> * Finally, take the first storage from the shuffled collection.
> These steps would be executed in {{DataNodeDescriptor#chooseStorage4Block}} 
> and replace the current logic. I think this improvement can be done as a 
> subtask under HDFS-11419. Any further comments are welcome.
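A self-contained sketch of the three steps described above (collect the valid 
storages, shuffle them, take the first). The Storage type here is a 
placeholder used only for illustration, not the actual DatanodeStorageInfo 
API.

{code:java}
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

// Illustrative sketch of the proposed shuffle-based storage selection.
public class ShuffledStorageChooser {

  // Placeholder for a datanode storage; not the real HDFS type.
  static class Storage {
    final String id;
    final String type;          // e.g. "DISK", "SSD"
    final long remainingBytes;
    Storage(String id, String type, long remainingBytes) {
      this.id = id;
      this.type = type;
      this.remainingBytes = remainingBytes;
    }
  }

  // Returns a randomly chosen valid storage, or null if none qualifies.
  static Storage chooseStorage4Block(List<Storage> storages,
                                     String requiredType, long blockSize) {
    List<Storage> valid = new ArrayList<>();
    for (Storage s : storages) {                  // step 1: collect valid storages
      if (s.type.equals(requiredType) && s.remainingBytes >= blockSize) {
        valid.add(s);
      }
    }
    Collections.shuffle(valid);                   // step 2: randomize the order
    return valid.isEmpty() ? null : valid.get(0); // step 3: take the first
  }

  public static void main(String[] args) {
    List<Storage> storages = Arrays.asList(
        new Storage("s1", "DISK", 10L << 30),
        new Storage("s2", "DISK", 20L << 30),
        new Storage("s3", "SSD", 5L << 30));
    System.out.println(chooseStorage4Block(storages, "DISK", 1L << 28).id);
  }
}
{code}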



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-10848) Move hadoop-hdfs-native-client module into hadoop-hdfs-client

2020-04-11 Thread Hadoop QA (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-10848?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17081496#comment-17081496
 ] 

Hadoop QA commented on HDFS-10848:
--

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m  
0s{color} | {color:blue} Docker mode activated. {color} |
| {color:red}-1{color} | {color:red} patch {color} | {color:red}  0m  9s{color} 
| {color:red} HDFS-10848 does not apply to trunk. Rebase required? Wrong 
Branch? See https://wiki.apache.org/hadoop/HowToContribute for help. {color} |
\\
\\
|| Subsystem || Report/Notes ||
| JIRA Issue | HDFS-10848 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12861393/HDFS-10848.001.patch |
| Console output | 
https://builds.apache.org/job/PreCommit-HDFS-Build/29122/console |
| Powered by | Apache Yetus 0.8.0   http://yetus.apache.org |


This message was automatically generated.



> Move hadoop-hdfs-native-client module into hadoop-hdfs-client
> -
>
> Key: HDFS-10848
> URL: https://issues.apache.org/jira/browse/HDFS-10848
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: hdfs-client
>Reporter: Akira Ajisaka
>Assignee: Huafeng Wang
>Priority: Major
> Attachments: HDFS-10848.001.patch
>
>
> When a patch changes the hadoop-hdfs-client module, Jenkins does not pick up 
> the tests in the native code. Because of that, we overlooked a test failure 
> when committing such a patch (e.g. HDFS-10844).
> [~aw] said in HDFS-10844,
> bq. Ideally, all of this native code would be hdfs-client. Then when a change 
> is made to that code, this code will also get tested.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org


