[jira] [Updated] (HDFS-15024) [SBN read] In FailoverOnNetworkExceptionRetry , Number of NameNodes as a condition of calculation of sleep time
[ https://issues.apache.org/jira/browse/HDFS-15024?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Ayush Saxena updated HDFS-15024:
Description:
When Observer NameNode (ONN) is enabled, the client configuration lists three NameNodes, for example:

dfs.ha.namenodes.ns1
nn2,nn3,nn1

Currently:
nn2 is in standby state
nn3 is in observer state
nn1 is in active state

When the user runs an HDFS operation such as
./bin/hadoop --loglevel debug fs -Ddfs.client.failover.proxy.provider.ns1=org.apache.hadoop.hdfs.server.namenode.ha.ObserverReadProxyProvider -mkdir /user/haiyang1/test8
the msync call has to reach nn1.
The client actually connects to nn2 first, and a failover is required.
The connection to nn3 does not meet the requirements either, so another failover is needed, but this time the client sleeps for a while before failing over.
Only after that sleep does the request finally connect to nn1 and succeed.

In FailoverOnNetworkExceptionRetry, getFailoverOrRetrySleepTime by default calculates a sleep time once more than one failover has been performed.
I think it is more reasonable to use the number of NameNodes as the condition for calculating the sleep time.
That is, in the current test, the failover after connecting to nn3 should not need to sleep; the client should connect directly to the next NameNode.
See client_error.log for details.

was:
{code:java}
When Observer NameNode (ONN) is enabled, the client configuration lists three NameNodes, for example:

dfs.ha.namenodes.ns1
nn2,nn3,nn1

Currently:
nn2 is in standby state
nn3 is in observer state
nn1 is in active state

When the user runs an HDFS operation such as
./bin/hadoop --loglevel debug fs -Ddfs.client.failover.proxy.provider.ns1=org.apache.hadoop.hdfs.server.namenode.ha.ObserverReadProxyProvider -mkdir /user/haiyang1/test8
the msync call has to reach nn1.
The client actually connects to nn2 first, and a failover is required.
The connection to nn3 does not meet the requirements either, so another failover is needed, but this time the client sleeps for a while before failing over.
Only after that sleep does the request finally connect to nn1 and succeed.

In FailoverOnNetworkExceptionRetry, getFailoverOrRetrySleepTime by default calculates a sleep time once more than one failover has been performed.
I think it is more reasonable to use the number of NameNodes as the condition for calculating the sleep time.
That is, in the current test, the failover after connecting to nn3 should not need to sleep; the client should connect directly to the next NameNode.
See client_error.log for details.
{code}

> [SBN read] In FailoverOnNetworkExceptionRetry , Number of NameNodes as a condition of calculation of sleep time
> ---
>
> Key: HDFS-15024
> URL: https://issues.apache.org/jira/browse/HDFS-15024
> Project: Hadoop HDFS
> Issue Type: Improvement
> Affects Versions: 2.10.0, 3.3.0, 3.2.1
> Reporter: huhaiyang
> Priority: Major
> Attachments: HDFS-15024.001.patch, client_error.log

--
This message was sent by Atlassian Jira (v8.3.4#803005)

To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
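The sleep-time rule proposed in HDFS-15024 can be sketched roughly as follows. This is a minimal illustration, not the Hadoop implementation: the class FailoverSleepSketch, its method names, the parameter names, and the jitter-free backoff formula are all assumptions made for the example. Here failovers means the number of failovers already performed before the one being scheduled.

```java
/** Hypothetical sketch of the sleep-time rule discussed in HDFS-15024; not Hadoop code. */
final class FailoverSleepSketch {

  /** Deterministic exponential backoff capped at maxMillis (assumed formula, no jitter). */
  static long backoffMillis(int failovers, long baseMillis, long maxMillis) {
    int shift = Math.min(failovers, 30); // bound the shift so small bases cannot overflow
    return Math.min(maxMillis, baseMillis << shift);
  }

  /** Behaviour as the issue describes it: only the very first failover is free. */
  static long currentSleepMillis(int failovers, long baseMillis, long maxMillis) {
    return failovers == 0 ? 0 : backoffMillis(failovers, baseMillis, maxMillis);
  }

  /** Proposed rule: no sleep until every configured NameNode has been tried once. */
  static long proposedSleepMillis(int failovers, int nameNodes,
                                  long baseMillis, long maxMillis) {
    return failovers < nameNodes - 1 ? 0 : backoffMillis(failovers, baseMillis, maxMillis);
  }

  public static void main(String[] args) {
    // Three NameNodes (nn2, nn3, nn1), as in the reported configuration.
    for (int failovers = 0; failovers < 4; failovers++) {
      System.out.printf("failovers=%d current=%dms proposed=%dms%n", failovers,
          currentSleepMillis(failovers, 500, 15000),
          proposedSleepMillis(failovers, 3, 500, 15000));
    }
  }
}
```

With three NameNodes, the second failover (nn3 to nn1) sleeps under the first rule but not under the second, which is the difference the reporter is asking for.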
[jira] [Commented] (HDFS-13811) RBF: Race condition between router admin quota update and periodic quota update service
[ https://issues.apache.org/jira/browse/HDFS-13811?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16984780#comment-16984780 ]

Jinglun commented on HDFS-13811:

Hi [~linyiqun], thanks for your comments! I hadn't thought about the web UI before, thanks for the reminder! I haven't looked into the code, so this is just a guess: is the web UI usage fetched from the state store now? Maybe we can display the usage from the local cache (RouterQuotaManager)? I had a look at [~dibyendu_hadoop]'s approach, but not in detail. I'll try to figure out how it works and how to fix the web UI this weekend.

> RBF: Race condition between router admin quota update and periodic quota update service
> ---
>
> Key: HDFS-13811
> URL: https://issues.apache.org/jira/browse/HDFS-13811
> Project: Hadoop HDFS
> Issue Type: Sub-task
> Reporter: Dibyendu Karmakar
> Assignee: Jinglun
> Priority: Major
> Attachments: HDFS-13811-000.patch, HDFS-13811-HDFS-13891-000.patch, HDFS-13811.001.patch, HDFS-13811.002.patch, HDFS-13811.003.patch
>
> If we try to update the quota of an existing mount entry while the periodic quota update service is running on the same mount entry, the mount table is left in an _inconsistent state._
> The transactions are:
> A - Quota update service fetches the mount table entries.
> B - Quota update service updates the mount table with the current usage.
> A' - User updates the quota using the admin command.
> With the transaction sequence [ A A' B ], the quota update service updates the mount table with the old quota value.
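The [ A A' B ] sequence from the issue description can be replayed deterministically in a few lines. The sketch below is a hypothetical model, not router code: QuotaRaceSketch, Entry, and the AtomicReference standing in for the State Store are all invented for illustration. The second method shows one possible remedy (the periodic step writes only the usage field), which is not necessarily what the attached patches do.

```java
import java.util.concurrent.atomic.AtomicReference;

/** Hypothetical single-threaded replay of the [ A A' B ] sequence; not router code. */
final class QuotaRaceSketch {

  /** Immutable stand-in for a mount table entry: an admin-set quota and a measured usage. */
  static final class Entry {
    final long quota;
    final long usage;

    Entry(long quota, long usage) {
      this.quota = quota;
      this.usage = usage;
    }
  }

  /** B writes back the whole record fetched in A, so the admin's new quota is lost. */
  static Entry wholeRecordWrite() {
    AtomicReference<Entry> store = new AtomicReference<>(new Entry(100, 10));
    Entry fetched = store.get();                      // A: service fetches the entry
    store.set(new Entry(200, store.get().usage));     // A': admin raises the quota to 200
    store.set(new Entry(fetched.quota, 42));          // B: stale quota 100 overwrites 200
    return store.get();
  }

  /** B replaces only the usage, keeping whatever quota is stored at that moment. */
  static Entry usageOnlyWrite() {
    AtomicReference<Entry> store = new AtomicReference<>(new Entry(100, 10));
    store.get();                                      // A: fetched copy is not reused for the quota
    store.set(new Entry(200, store.get().usage));     // A': admin raises the quota to 200
    store.updateAndGet(e -> new Entry(e.quota, 42));  // B: quota 200 survives
    return store.get();
  }

  public static void main(String[] args) {
    System.out.println("whole-record write quota: " + wholeRecordWrite().quota);
    System.out.println("usage-only write quota:   " + usageOnlyWrite().quota);
  }
}
```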
[jira] [Commented] (HDFS-15024) [SBN read] In FailoverOnNetworkExceptionRetry , Number of NameNodes as a condition of calculation of sleep time
[ https://issues.apache.org/jira/browse/HDFS-15024?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16984776#comment-16984776 ]

huhaiyang commented on HDFS-15024:

[~weichiu] [~vagarychen] Hello, how do you handle this problem in practice? Thanks!
[jira] [Commented] (HDFS-15024) [SBN read] In FailoverOnNetworkExceptionRetry , Number of NameNodes as a condition of calculation of sleep time
[ https://issues.apache.org/jira/browse/HDFS-15024?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16984772#comment-16984772 ]

huhaiyang commented on HDFS-15024:

./bin/hadoop --loglevel debug fs -Ddfs.client.failover.proxy.provider.ns1=org.apache.hadoop.hdfs.server.namenode.ha.ObserverReadProxyProvider -mkdir /user/haiyang1/test8
...
19/11/29 14:26:55 DEBUG ipc.Client: The ping interval is 6 ms.
19/11/29 14:26:55 DEBUG ipc.Client: Connecting to nn2/xx:8020
19/11/29 14:26:55 DEBUG ipc.Client: IPC Client (1337335626) connection to nn2/xx:8020 from hadoop: starting, having connections 1
19/11/29 14:26:55 DEBUG ipc.Client: IPC Client (1337335626) connection to nn2/xx:8020 from hadoop sending #0 org.apache.hadoop.hdfs.protocol.ClientProtocol.msync
19/11/29 14:26:55 DEBUG ipc.Client: IPC Client (1337335626) connection to nn2/xx:8020 from hadoop got value #0
19/11/29 14:26:55 DEBUG retry.RetryInvocationHandler: org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.ipc.StandbyException): Operation category WRITE is not supported in state standby. Visit https://s.apache.org/sbnn-error
    at org.apache.hadoop.hdfs.server.namenode.ha.StandbyState.checkOperation(StandbyState.java:98)
    at org.apache.hadoop.hdfs.server.namenode.NameNode$NameNodeHAContext.checkOperation(NameNode.java:2018)
    at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.checkOperation(FSNamesystem.java:1461)
    at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.msync(NameNodeRpcServer.java:1384)
    at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.msync(ClientNamenodeProtocolServerSideTranslatorPB.java:1907)
    at org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)
    at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:531)
    at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:1070)
    at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:928)
    at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:863)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:422)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1903)
    at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2815)
, while invoking $Proxy4.getFileInfo over [nn3/xx:8020,nn2/xx:8020,nn1/xx:8020]. Trying to failover immediately.
org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.ipc.StandbyException): Operation category WRITE is not supported in state standby. Visit https://s.apache.org/sbnn-error
    at org.apache.hadoop.hdfs.server.namenode.ha.StandbyState.checkOperation(StandbyState.java:98)
    at org.apache.hadoop.hdfs.server.namenode.NameNode$NameNodeHAContext.checkOperation(NameNode.java:2018)
    at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.checkOperation(FSNamesystem.java:1461)
    at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.msync(NameNodeRpcServer.java:1384)
    at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.msync(ClientNamenodeProtocolServerSideTranslatorPB.java:1907)
    at org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)
    at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:531)
    at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:1070)
    at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:928)
    at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:863)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:422)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1903)
    at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2815)
    at org.apache.hadoop.ipc.Client.getRpcResponse(Client.java:1543)
    at org.apache.hadoop.ipc.Client.call(Client.java:1489)
    at org.apache.hadoop.ipc.Client.call(Client.java:1388)
    at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:233)
    at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:118)
    at com.sun.proxy.$Proxy15.msync(Unknown Source)
    at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.msync(ClientNamenodeProtocolTranslatorPB.java:1958)
    at org.apache.hadoop.hdfs.server.namenode.ha.ObserverReadProxyProvider.initializeMsync(ObserverReadProxyProvider.java:318)
    at
[jira] [Updated] (HDFS-15024) [SBN read] In FailoverOnNetworkExceptionRetry , Number of NameNodes as a condition of calculation of sleep time
[ https://issues.apache.org/jira/browse/HDFS-15024?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

huhaiyang updated HDFS-15024:
Description:
{code:java}
When Observer NameNode (ONN) is enabled, the client configuration lists three NameNodes, for example:

dfs.ha.namenodes.ns1
nn2,nn3,nn1

Currently:
nn2 is in standby state
nn3 is in observer state
nn1 is in active state

When the user runs an HDFS operation such as
./bin/hadoop --loglevel debug fs -Ddfs.client.failover.proxy.provider.ns1=org.apache.hadoop.hdfs.server.namenode.ha.ObserverReadProxyProvider -mkdir /user/haiyang1/test8
the msync call has to reach nn1.
The client actually connects to nn2 first, and a failover is required.
The connection to nn3 does not meet the requirements either, so another failover is needed, but this time the client sleeps for a while before failing over.
Only after that sleep does the request finally connect to nn1 and succeed.

In FailoverOnNetworkExceptionRetry, getFailoverOrRetrySleepTime by default calculates a sleep time once more than one failover has been performed.
I think it is more reasonable to use the number of NameNodes as the condition for calculating the sleep time.
That is, in the current test, the failover after connecting to nn3 should not need to sleep; the client should connect directly to the next NameNode.
See client_error.log for details.
{code}

was:
{code:java}
When Observer NameNode (ONN) is enabled, the client configuration lists three NameNodes, for example:

dfs.ha.namenodes.ns1
nn2,nn3,nn1

Currently:
nn2 is in standby state
nn3 is in observer state
nn1 is in active state

The user then runs an HDFS operation:
./bin/hadoop --loglevel debug fs -Ddfs.client.failover.proxy.provider.ns1=org.apache.hadoop.hdfs.server.namenode.ha.ObserverReadProxyProvider -ls /user/haiyang1/
{code}
[jira] [Comment Edited] (HDFS-14986) ReplicaCachingGetSpaceUsed throws ConcurrentModificationException
[ https://issues.apache.org/jira/browse/HDFS-14986?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16984766#comment-16984766 ]

fanghanyun edited comment on HDFS-14986 at 11/29/19 7:05 AM:

hadoop version 2.6.0-cdh5.13.1

    public Set deepCopyReplica(String bpid) throws IOException {
      //Set replicas = new HashSet<>(volumeMap.replicas(bpid) == null ? Collections.EMPTY_SET
      //    : volumeMap.replicas(bpid));
      Set replicas = null;
      try (AutoCloseableLock lock = datasetLock.acquire()) {
        replicas = new HashSet<>(volumeMap.replicas(bpid) == null ? Collections.EMPTY_SET
            : volumeMap.replicas(bpid));
      }

Cannot resolve symbol 'datasetLock'

was (Author: fanghanyun):
hadoop version 2.6.0-cdh5.13.1
!image-2019-11-29-14-59-11-179.png!

> ReplicaCachingGetSpaceUsed throws ConcurrentModificationException
> --
>
> Key: HDFS-14986
> URL: https://issues.apache.org/jira/browse/HDFS-14986
> Project: Hadoop HDFS
> Issue Type: Bug
> Components: datanode, performance
> Affects Versions: 2.10.0
> Reporter: Ryan Wu
> Assignee: Aiphago
> Priority: Major
> Fix For: 3.3.0, 2.10.1, 2.11.0
> Attachments: HDFS-14986.001.patch, HDFS-14986.002.patch, HDFS-14986.003.patch, HDFS-14986.004.patch, HDFS-14986.005.patch, HDFS-14986.006.patch
>
> Running DU across lots of disks is very expensive. We applied the patch HDFS-14313 to get used space from ReplicaInfo in memory. However, new du threads throw the exception
> {code:java}
> // 2019-11-08 18:07:13,858 ERROR [refreshUsed-/home/vipshop/hard_disk/7/dfs/dn/current/BP-1203969992--1450855658517] org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.ReplicaCachingGetSpaceUsed: ReplicaCachingGetSpaceUsed refresh error
> java.util.ConcurrentModificationException: Tree has been modified outside of iterator
>     at org.apache.hadoop.hdfs.util.FoldedTreeSet$TreeSetIterator.checkForModification(FoldedTreeSet.java:311)
>     at org.apache.hadoop.hdfs.util.FoldedTreeSet$TreeSetIterator.hasNext(FoldedTreeSet.java:256)
>     at java.util.AbstractCollection.addAll(AbstractCollection.java:343)
>     at java.util.HashSet.<init>(HashSet.java:120)
>     at org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl.deepCopyReplica(FsDatasetImpl.java:1052)
>     at org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.ReplicaCachingGetSpaceUsed.refresh(ReplicaCachingGetSpaceUsed.java:73)
>     at org.apache.hadoop.fs.CachingGetSpaceUsed$RefreshThread.run(CachingGetSpaceUsed.java:178)
>     at java.lang.Thread.run(Thread.java:748)
> {code}
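The failure mode in the stack trace above, and the copy-under-lock idea in the quoted deepCopyReplica snippet, can be illustrated with plain JDK classes. This is a sketch under stated assumptions, not DataNode code: ReplicaCopySketch and its members are hypothetical, java.util.TreeSet stands in for FoldedTreeSet, and java.util.concurrent.locks.ReentrantLock stands in for Hadoop's AutoCloseableLock.

```java
import java.util.Arrays;
import java.util.ConcurrentModificationException;
import java.util.HashSet;
import java.util.Iterator;
import java.util.Set;
import java.util.TreeSet;
import java.util.concurrent.locks.ReentrantLock;

/** Hypothetical sketch of copying a shared set under a lock; not DataNode code. */
final class ReplicaCopySketch {
  private final Set<String> replicas = new TreeSet<>();
  private final ReentrantLock datasetLock = new ReentrantLock();

  /** Writers take the same lock as the copier, so a copy never sees a half-applied change. */
  void add(String replicaId) {
    datasetLock.lock();
    try {
      replicas.add(replicaId);
    } finally {
      datasetLock.unlock();
    }
  }

  /** Snapshot taken while holding the lock; the caller may iterate it freely afterwards. */
  Set<String> deepCopy() {
    datasetLock.lock();
    try {
      return new HashSet<>(replicas);
    } finally {
      datasetLock.unlock();
    }
  }

  /** Reproduces the fail-fast behaviour: the set changes after the iterator was created. */
  static boolean iteratorFailsAfterModification() {
    Set<String> set = new TreeSet<>(Arrays.asList("a", "b"));
    Iterator<String> it = set.iterator();
    set.add("c"); // structural modification outside the iterator
    try {
      it.next();
      return false;
    } catch (ConcurrentModificationException e) {
      return true;
    }
  }

  public static void main(String[] args) {
    ReplicaCopySketch dataset = new ReplicaCopySketch();
    dataset.add("blk_1");
    Set<String> snapshot = dataset.deepCopy();
    dataset.add("blk_2"); // does not affect the snapshot
    System.out.println("snapshot size: " + snapshot.size());
    System.out.println("fail-fast reproduced: " + iteratorFailsAfterModification());
  }
}
```

The lock only helps if every writer of the underlying set takes it as well, which is why add() acquires the same lock in this sketch.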
[jira] [Commented] (HDFS-14986) ReplicaCachingGetSpaceUsed throws ConcurrentModificationException
[ https://issues.apache.org/jira/browse/HDFS-14986?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16984766#comment-16984766 ]

fanghanyun commented on HDFS-14986:

hadoop version 2.6.0-cdh5.13.1
!image-2019-11-29-14-59-11-179.png!
[jira] [Comment Edited] (HDFS-13811) RBF: Race condition between router admin quota update and periodic quota update service
[ https://issues.apache.org/jira/browse/HDFS-13811?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16984758#comment-16984758 ]

Yiqun Lin edited comment on HDFS-13811 at 11/29/19 6:55 AM:

Hi [~LiJinglun], I took a look at the major change in the patch. I agree that updating the mount table in the state store should be done only by admin calls, with the periodic service updating only the local cache. However, this will break one thing: the usage in the mount table can no longer be updated, so the quota usage displayed in the web UI will be invalid. How do we plan to fix this? Can you use an approach similar to what [~dibyendu_hadoop] did before, and extract the quota usage from the quota manager when getting the mount table entries?
{noformat}
@@ -142,6 +143,20 @@ public GetMountTableEntriesResponse getMountTableEntries(
           it.remove();
         }
       }
+      // If quotamanager is not null, update quota usage from quota cache.
+      if (this.getQuotaManager() != null && request.isUpdateQuotaCache()) {
+        RouterQuotaUsage quota =
+            this.getQuotaManager().getQuotaUsage(record.getSourcePath());
+        if(quota != null) {
+          RouterQuotaUsage oldquota = record.getQuota();
+          RouterQuotaUsage newQuota = new RouterQuotaUsage.Builder()
+              .fileAndDirectoryCount(quota.getFileAndDirectoryCount())
+              .quota(oldquota.getQuota())
+              .spaceConsumed(quota.getSpaceConsumed())
+              .spaceQuota(oldquota.getSpaceQuota()).build();
+          record.setQuota(newQuota);
+        }
+      }
     }
   }
{noformat}

was (Author: linyiqun):
Hi [~LiJinglun], I took a look at the major change in the patch. I agree that updating the mount table in the state store should be done only by admin calls, with the periodic service updating only the local cache. However, this will break one thing: the usage in the mount table can no longer be updated, so the quota usage displayed in the web UI will be invalid. How do we plan to fix this? The definition of the update service is:
{noformat}
/**
 * Service to periodically update the {@link RouterQuotaUsage}
 * cached information in the {@link Router} and update corresponding
 * mount table in State Store.
 */
public class RouterQuotaUpdateService extends PeriodicService {
{noformat}
[jira] [Comment Edited] (HDFS-13811) RBF: Race condition between router admin quota update and periodic quota update service
[ https://issues.apache.org/jira/browse/HDFS-13811?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16984758#comment-16984758 ]

Yiqun Lin edited comment on HDFS-13811 at 11/29/19 6:49 AM:

Hi [~LiJinglun], I took a look at the major change in the patch. I agree that updating the mount table in the state store should be done only by admin calls, with the periodic service updating only the local cache. However, this will break one thing: the usage in the mount table can no longer be updated, so the quota usage displayed in the web UI will be invalid. How do we plan to fix this? The definition of the update service is:
{noformat}
/**
 * Service to periodically update the {@link RouterQuotaUsage}
 * cached information in the {@link Router} and update corresponding
 * mount table in State Store.
 */
public class RouterQuotaUpdateService extends PeriodicService {
{noformat}

was (Author: linyiqun):
Hi [~LiJinglun], I took a look at the major change in the patch. I agree that updating the mount table in the state store should be done only by admin calls, with the periodic service updating only the local cache. However, this will break one thing: the usage in the mount table can no longer be updated, so the quota usage displayed in the web UI will be invalid. How do we plan to fix this?
[jira] [Commented] (HDFS-13811) RBF: Race condition between router admin quota update and periodic quota update service
[ https://issues.apache.org/jira/browse/HDFS-13811?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16984758#comment-16984758 ]

Yiqun Lin commented on HDFS-13811:

Hi [~LiJinglun], I took a look at the major change in the patch. I agree that updating the mount table in the state store should be done only by admin calls, with the periodic service updating only the local cache. However, this will break one thing: the usage in the mount table can no longer be updated, so the quota usage displayed in the web UI will be invalid. How do we plan to fix this?
[jira] [Updated] (HDFS-15024) [SBN read] In FailoverOnNetworkExceptionRetry , Number of NameNodes as a condition of calculation of sleep time
[ https://issues.apache.org/jira/browse/HDFS-15024?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

huhaiyang updated HDFS-15024:
Description:
{code:java}
When Observer NameNode (ONN) is enabled, the client configuration lists three NameNodes, for example:

dfs.ha.namenodes.ns1
nn2,nn3,nn1

Currently:
nn2 is in standby state
nn3 is in observer state
nn1 is in active state

The user then runs an HDFS operation:
./bin/hadoop --loglevel debug fs -Ddfs.client.failover.proxy.provider.ns1=org.apache.hadoop.hdfs.server.namenode.ha.ObserverReadProxyProvider -ls /user/haiyang1/
{code}

was:
{code:java}
When Observer NameNode (ONN) is enabled, the client configuration lists three NameNodes, for example:

dfs.ha.namenodes.ns1
nn2,nn3,nn1

Currently:
nn2 is in standby state
nn3 is in observer state
nn1 is in active state

The user then runs an HDFS operation:
./bin/hadoop --loglevel debug fs -Ddfs.client.failover.proxy.provider.ns1=org.apache.hadoop.hdfs.server.namenode.ha.ObserverReadProxyProvider -ls /user/haiyang1/
{code}
[jira] [Updated] (HDFS-15024) [SBN read] In FailoverOnNetworkExceptionRetry , Number of NameNodes as a condition of calculation of sleep time
[ https://issues.apache.org/jira/browse/HDFS-15024?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] huhaiyang updated HDFS-15024: - Attachment: (was: client_error.log) > [SBN read] In FailoverOnNetworkExceptionRetry , Number of NameNodes as a > condition of calculation of sleep time > --- > > Key: HDFS-15024 > URL: https://issues.apache.org/jira/browse/HDFS-15024 > Project: Hadoop HDFS > Issue Type: Improvement >Affects Versions: 2.10.0, 3.3.0, 3.2.1 >Reporter: huhaiyang >Priority: Major > Attachments: HDFS-15024.001.patch, client_error.log > > > {code:java} > When we enable the ONN, the client configuration contains three NameNodes, > for example: > > dfs.ha.namenodes.ns1 = nn2,nn3,nn1 > > Currently, nn2 is in standby state, nn3 is in observer state, and nn1 is in > active state. > When the user performs an HDFS operation: > ./bin/hadoop --loglevel debug fs > -Ddfs.client.failover.proxy.provider.ns1=org.apache.hadoop.hdfs.server.namenode.ha.ObserverReadProxyProvider > -ls /user/haiyang1/ > {code}
[jira] [Updated] (HDFS-15024) [SBN read] In FailoverOnNetworkExceptionRetry , Number of NameNodes as a condition of calculation of sleep time
[ https://issues.apache.org/jira/browse/HDFS-15024?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] huhaiyang updated HDFS-15024: - Attachment: client_error.log > [SBN read] In FailoverOnNetworkExceptionRetry , Number of NameNodes as a > condition of calculation of sleep time > --- > > Key: HDFS-15024 > URL: https://issues.apache.org/jira/browse/HDFS-15024 > Project: Hadoop HDFS > Issue Type: Improvement >Affects Versions: 2.10.0, 3.3.0, 3.2.1 >Reporter: huhaiyang >Priority: Major > Attachments: HDFS-15024.001.patch, client_error.log > > > {code:java} > When we enable the ONN, the client configuration contains three NameNodes, > for example: > > dfs.ha.namenodes.ns1 = nn2,nn3,nn1 > > Currently, nn2 is in standby state, nn3 is in observer state, and nn1 is in > active state. > When the user performs an HDFS operation: > ./bin/hadoop --loglevel debug fs > -Ddfs.client.failover.proxy.provider.ns1=org.apache.hadoop.hdfs.server.namenode.ha.ObserverReadProxyProvider > -ls /user/haiyang1/ > {code}
[jira] [Commented] (HDFS-15003) RBF: Make Router support storage type quota.
[ https://issues.apache.org/jira/browse/HDFS-15003?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16984752#comment-16984752 ] Jinglun commented on HDFS-15003: Hi [~ayushtkn] [~elgoiri], would you help review v02? Thanks very much! > RBF: Make Router support storage type quota. > > > Key: HDFS-15003 > URL: https://issues.apache.org/jira/browse/HDFS-15003 > Project: Hadoop HDFS > Issue Type: Improvement >Reporter: Jinglun >Assignee: Jinglun >Priority: Major > Attachments: HDFS-15003.001.patch, HDFS-15003.002.patch > > > Make Router support storage type quota.
[jira] [Commented] (HDFS-15023) [SBN read] ZKFC should check the state before joining the election
[ https://issues.apache.org/jira/browse/HDFS-15023?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16984750#comment-16984750 ] Fei Hui commented on HDFS-15023: [~ayushtkn] Thanks for your comments. I will try to add a UT. > [SBN read] ZKFC should check the state before joining the election > -- > > Key: HDFS-15023 > URL: https://issues.apache.org/jira/browse/HDFS-15023 > Project: Hadoop HDFS > Issue Type: Improvement >Reporter: Fei Hui >Assignee: Fei Hui >Priority: Major > Attachments: HDFS-15023.001.patch > > > As discussed in HDFS-14961, ZKFC should not join the election when its state is > observer. > Right now, when a NameNode is an observer, it joins the election and > becomes a standby. > The MonitorDaemon thread call chain is: > doHealthChecks -> enterState(State.SERVICE_HEALTHY) -> recheckElectability() > -> elector.joinElection(targetToData(localTarget)) -> joinElectionInternal -> > createLockNodeAsync > The ZooKeeper callback is: > processResult -> becomeStandby
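The behaviour requested in HDFS-15023 can be sketched as a state check placed in front of the election call. This is only an illustration under stated assumptions: `ElectionGuardSketch`, `HAState`, and `FakeElector` are hypothetical stand-ins and not the real ZKFC classes, and the attached patch may implement the check differently.

```java
// Hypothetical sketch of "check the state before joining the election".
// None of these types are the real ZKFC API; they only model the idea.
final class ElectionGuardSketch {

    /** Simplified stand-in for the HA states named in the issue. */
    enum HAState { ACTIVE, STANDBY, OBSERVER }

    /** Stand-in for the elector; it only records whether an election was joined. */
    static final class FakeElector {
        boolean joined = false;

        void joinElection() {
            joined = true;
        }
    }

    /**
     * Joins the election only when the local node is healthy and is not an
     * observer. Returns true if the election was joined.
     */
    static boolean recheckElectability(HAState localState, boolean healthy,
                                       FakeElector elector) {
        if (!healthy) {
            return false; // unhealthy nodes never join
        }
        if (localState == HAState.OBSERVER) {
            // Assumption from the issue text: an observer must stay out of the
            // election, otherwise the callback would turn it into a standby.
            return false;
        }
        elector.joinElection();
        return true;
    }

    public static void main(String[] args) {
        FakeElector elector = new FakeElector();
        boolean joined = recheckElectability(HAState.OBSERVER, true, elector);
        System.out.println("observer joined election: " + joined);
    }
}
```

The guard sits before `joinElection()` rather than in the callback, so an observer never creates the lock node in the first place.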
[jira] [Commented] (HDFS-15023) [SBN read] ZKFC should check the state before joining the election
[ https://issues.apache.org/jira/browse/HDFS-15023?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16984749#comment-16984749 ] Hadoop QA commented on HDFS-15023: -- | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 39s{color} | {color:blue} Docker mode activated. {color} | || || || || {color:brown} Prechecks {color} || | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} | | {color:red}-1{color} | {color:red} test4tests {color} | {color:red} 0m 0s{color} | {color:red} The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. {color} | || || || || {color:brown} trunk Compile Tests {color} || | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 19m 14s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 16m 10s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 40s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 1m 18s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 14m 48s{color} | {color:green} branch has no errors when building and testing our client artifacts. 
{color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 1m 38s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 20s{color} | {color:green} trunk passed {color} | || || || || {color:brown} Patch Compile Tests {color} || | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 0m 47s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 15m 32s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 15m 32s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 42s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 1m 10s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s{color} | {color:green} The patch has no whitespace issues. {color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 12m 49s{color} | {color:green} patch has no errors when building and testing our client artifacts. {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 1m 48s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 21s{color} | {color:green} the patch passed {color} | || || || || {color:brown} Other Tests {color} || | {color:red}-1{color} | {color:red} unit {color} | {color:red} 21m 3s{color} | {color:red} hadoop-common in the patch failed. {color} | | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 41s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} | | {color:black}{color} | {color:black} {color} | {color:black}111m 34s{color} | {color:black} {color} | \\ \\ || Reason || Tests || | Failed junit tests | hadoop.ha.TestZKFailoverControllerStress | \\ \\ || Subsystem || Report/Notes || | Docker | Client=19.03.5 Server=19.03.5 Image:yetus/hadoop:104ccca9169 | | JIRA Issue | HDFS-15023 | | JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12987115/HDFS-15023.001.patch | | Optional Tests | dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient findbugs checkstyle | | uname | Linux fc69a6df21d0 4.15.0-66-generic #75-Ubuntu SMP Tue Oct 1 05:24:09 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux | | Build tool | maven | | Personality | /testptch/patchprocess/precommit/personality/provided.sh | | git revision | trunk / 44f7b91 | | maven | version: Apache Maven 3.3.9 | | Default Java | 1.8.0_222 | | findbugs | v3.1.0-RC1 | | unit | https://builds.apache.org/job/PreCommit-HDFS-Build/28424/artifact/out/patch-unit-hadoop-common-project_hadoop-common.txt | | Test Results | https://builds.apache.org/job/PreCommit-HDFS-Build/28424/testReport/ | | Max. process+thread count | 1343 (vs. ulimit of 5500) | | modules | C: hadoop-common-project/hadoop-common U: hadoop-common-project/hadoop-common | | Console output |
[jira] [Updated] (HDFS-15024) [SBN read] In FailoverOnNetworkExceptionRetry , Number of NameNodes as a condition of calculation of sleep time
[ https://issues.apache.org/jira/browse/HDFS-15024?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] huhaiyang updated HDFS-15024: - Description: {code:java} When we enable the ONN , there will be three NN nodes for the client configuration, Such as configuration dfs.ha.namenodes.ns1 nn2,nn3,nn1 Currently, nn2 is in standby state nn3 is in observer state nn1 is in active state When the user performs an access HDFS operation ./bin/hadoop --loglevel debug fs -Ddfs.client.failover.proxy.provider.ns1=org.apache.hadoop.hdfs.server.namenode.ha.ObserverReadProxyProvider -ls /user/haiyang1/ {code} was: {code:java} When we enable the ONN , there will be three NN nodes for the client configuration, Such as configuration dfs.ha.namenodes.ns1 nn2,nn3,nn1 Currently, nn2 is in standby state nn3 is in observer state nn1 is in active state When the user performs an access HDFS operation ./bin/hadoop --loglevel debug fs -Ddfs.client.failover.proxy.provider.ns1=org.apache.hadoop.hdfs.server.namenode.ha.ObserverReadProxyProvider -ls /user/haiyang1/ 16:49:13 DEBUG ipc.Client: The ping interval is 6 ms. 19/11/28 16:49:13 DEBUG ipc.Client: Connecting to xx/xx:8020 ... 19/11/28 16:49:13 DEBUG retry.RetryInvocationHandler: org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.ipc.StandbyException): Operation category WRITE is not supported in state observer. 
Visit https://s.apache.org/sbnn-error at org.apache.hadoop.hdfs.server.namenode.ha.StandbyState.checkOperation(StandbyState.java:98) at org.apache.hadoop.hdfs.server.namenode.NameNode$NameNodeHAContext.checkOperation(NameNode.java:2018) at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.checkOperation(FSNamesystem.java:1461) at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.msync(NameNodeRpcServer.java:1384) at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.msync(ClientNamenodeProtocolServerSideTranslatorPB.java:1907) at org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java) at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:531) at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:1070) at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:928) at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:863) at java.security.AccessController.doPrivileged(Native Method) at javax.security.auth.Subject.doAs(Subject.java:422) at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1903) at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2815) , while invoking $Proxy4.getFileInfo over [xx/xx:8020,xx/xx:8020,xx/xx:8020]. Trying to failover immediately. org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.ipc.StandbyException): Operation category WRITE is not supported in state observer. 
Visit https://s.apache.org/sbnn-error at org.apache.hadoop.hdfs.server.namenode.ha.StandbyState.checkOperation(StandbyState.java:98) at org.apache.hadoop.hdfs.server.namenode.NameNode$NameNodeHAContext.checkOperation(NameNode.java:2018) at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.checkOperation(FSNamesystem.java:1461) at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.msync(NameNodeRpcServer.java:1384) at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.msync(ClientNamenodeProtocolServerSideTranslatorPB.java:1907) at org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java) at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:531) at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:1070) at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:928) at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:863) at java.security.AccessController.doPrivileged(Native Method) at javax.security.auth.Subject.doAs(Subject.java:422) at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1903) at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2815) at org.apache.hadoop.ipc.Client.getRpcResponse(Client.java:1543) at org.apache.hadoop.ipc.Client.call(Client.java:1489) at org.apache.hadoop.ipc.Client.call(Client.java:1388) at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:233) at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:118) at com.sun.proxy.$Proxy15.msync(Unknown Source) at
[jira] [Updated] (HDFS-15024) [SBN read] In FailoverOnNetworkExceptionRetry , Number of NameNodes as a condition of calculation of sleep time
[ https://issues.apache.org/jira/browse/HDFS-15024?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] huhaiyang updated HDFS-15024: - Description: {code:java} When we enable the ONN , there will be three NN nodes for the client configuration, Such as configuration dfs.ha.namenodes.ns1 nn2,nn3,nn1 Currently, nn2 is in standby state nn3 is in observer state nn1 is in active state When the user performs an access HDFS operation ./bin/hadoop --loglevel debug fs -Ddfs.client.failover.proxy.provider.ns1=org.apache.hadoop.hdfs.server.namenode.ha.ObserverReadProxyProvider -ls /user/haiyang1/ 16:49:13 DEBUG ipc.Client: The ping interval is 6 ms. 19/11/28 16:49:13 DEBUG ipc.Client: Connecting to xx/xx:8020 ... 19/11/28 16:49:13 DEBUG retry.RetryInvocationHandler: org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.ipc.StandbyException): Operation category WRITE is not supported in state observer. Visit https://s.apache.org/sbnn-error at org.apache.hadoop.hdfs.server.namenode.ha.StandbyState.checkOperation(StandbyState.java:98) at org.apache.hadoop.hdfs.server.namenode.NameNode$NameNodeHAContext.checkOperation(NameNode.java:2018) at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.checkOperation(FSNamesystem.java:1461) at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.msync(NameNodeRpcServer.java:1384) at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.msync(ClientNamenodeProtocolServerSideTranslatorPB.java:1907) at org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java) at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:531) at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:1070) at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:928) at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:863) at java.security.AccessController.doPrivileged(Native Method) at 
javax.security.auth.Subject.doAs(Subject.java:422) at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1903) at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2815) , while invoking $Proxy4.getFileInfo over [xx/xx:8020,xx/xx:8020,xx/xx:8020]. Trying to failover immediately. org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.ipc.StandbyException): Operation category WRITE is not supported in state observer. Visit https://s.apache.org/sbnn-error at org.apache.hadoop.hdfs.server.namenode.ha.StandbyState.checkOperation(StandbyState.java:98) at org.apache.hadoop.hdfs.server.namenode.NameNode$NameNodeHAContext.checkOperation(NameNode.java:2018) at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.checkOperation(FSNamesystem.java:1461) at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.msync(NameNodeRpcServer.java:1384) at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.msync(ClientNamenodeProtocolServerSideTranslatorPB.java:1907) at org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java) at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:531) at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:1070) at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:928) at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:863) at java.security.AccessController.doPrivileged(Native Method) at javax.security.auth.Subject.doAs(Subject.java:422) at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1903) at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2815) at org.apache.hadoop.ipc.Client.getRpcResponse(Client.java:1543) at org.apache.hadoop.ipc.Client.call(Client.java:1489) at org.apache.hadoop.ipc.Client.call(Client.java:1388) at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:233) at 
org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:118) at com.sun.proxy.$Proxy15.msync(Unknown Source) at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.msync(ClientNamenodeProtocolTranslatorPB.java:1958) at org.apache.hadoop.hdfs.server.namenode.ha.ObserverReadProxyProvider.initializeMsync(ObserverReadProxyProvider.java:318) at org.apache.hadoop.hdfs.server.namenode.ha.ObserverReadProxyProvider.access$500(ObserverReadProxyProvider.java:69) at
[jira] [Updated] (HDFS-15024) [SBN read] In FailoverOnNetworkExceptionRetry , Number of NameNodes as a condition of calculation of sleep time
[ https://issues.apache.org/jira/browse/HDFS-15024?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] huhaiyang updated HDFS-15024: - Attachment: HDFS-15024.001.patch Affects Version/s: 3.3.0 2.10.0 3.2.1 Description: {code:java} When we enable the ONN, the client configuration contains three NameNodes, for example: dfs.ha.namenodes.ns1 = nn2,nn3,nn1 Currently, nn2 is in standby state, nn3 is in observer state, and nn1 is in active state. When the user performs an HDFS operation {code} Summary: [SBN read] In FailoverOnNetworkExceptionRetry , Number of NameNodes as a condition of calculation of sleep time (was: [SBN read] In FailoverOnNetworkExceptionRetry getFailoverOrRetrySleepTime method, Number of NameNodes as a condition of calculation of sleep time) > [SBN read] In FailoverOnNetworkExceptionRetry , Number of NameNodes as a > condition of calculation of sleep time > --- > > Key: HDFS-15024 > URL: https://issues.apache.org/jira/browse/HDFS-15024 > Project: Hadoop HDFS > Issue Type: Improvement >Affects Versions: 2.10.0, 3.3.0, 3.2.1 >Reporter: huhaiyang >Priority: Major > Attachments: HDFS-15024.001.patch > > > {code:java} > When we enable the ONN, the client configuration contains three NameNodes, > for example: > > dfs.ha.namenodes.ns1 = nn2,nn3,nn1 > > Currently, nn2 is in standby state, nn3 is in observer state, and nn1 is in > active state. > When the user performs an HDFS operation > {code}
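One way to read the summary above is that the failover sleep time should depend on how many NameNodes are configured, so that the client tries each NameNode once before it starts backing off. The sketch below is an assumption-laden illustration, not the attached patch and not Hadoop's actual `getFailoverOrRetrySleepTime`: the class `FailoverSleepSketch`, its method names, and the jitter-free backoff are all hypothetical simplifications.

```java
// Hypothetical sketch: failover sleep time conditioned on the NameNode count.
// This is not the Hadoop implementation; names and formula are illustrative.
final class FailoverSleepSketch {

    /** Deterministic exponential backoff, capped; jitter is omitted on purpose. */
    static long exponentialBackoff(long baseMillis, int retries, long capMillis) {
        int shift = Math.min(retries, 30); // clamp the shift to avoid overflow
        return Math.min(baseMillis * (1L << shift), capMillis);
    }

    /**
     * Returns 0 while there are still untried NameNodes, i.e. while fewer than
     * (nameNodeCount - 1) failovers have happened, and backs off afterwards.
     * With two NameNodes this only skips the sleep for the first failover.
     */
    static long sleepMillis(int failovers, int nameNodeCount,
                            long baseMillis, long capMillis) {
        if (failovers < nameNodeCount - 1) {
            return 0L;
        }
        return exponentialBackoff(baseMillis, failovers, capMillis);
    }

    public static void main(String[] args) {
        // Three NameNodes configured as nn2,nn3,nn1 (standby, observer, active).
        for (int failovers = 0; failovers < 4; failovers++) {
            System.out.println("failovers=" + failovers + " sleepMillis="
                + sleepMillis(failovers, 3, 500L, 15000L));
        }
    }
}
```

In the three-NameNode scenario from the description, the client would move from nn2 to nn3 and then to nn1 without sleeping, and would only back off once every NameNode has been tried.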
[jira] [Created] (HDFS-15024) [SBN read] In FailoverOnNetworkExceptionRetry getFailoverOrRetrySleepTime method, Number of NameNodes as a condition of calculation of sleep time
huhaiyang created HDFS-15024: Summary: [SBN read] In FailoverOnNetworkExceptionRetry getFailoverOrRetrySleepTime method, Number of NameNodes as a condition of calculation of sleep time Key: HDFS-15024 URL: https://issues.apache.org/jira/browse/HDFS-15024 Project: Hadoop HDFS Issue Type: Improvement Reporter: huhaiyang
[jira] [Updated] (HDFS-15023) [SBN read] ZKFC should check the state before joining the election
[ https://issues.apache.org/jira/browse/HDFS-15023?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ayush Saxena updated HDFS-15023: Issue Type: Improvement (was: Bug) > [SBN read] ZKFC should check the state before joining the election > -- > > Key: HDFS-15023 > URL: https://issues.apache.org/jira/browse/HDFS-15023 > Project: Hadoop HDFS > Issue Type: Improvement >Reporter: Fei Hui >Assignee: Fei Hui >Priority: Major > Attachments: HDFS-15023.001.patch > > > As discussed in HDFS-14961, ZKFC should not join the election when its state is > observer. > Right now, when a NameNode is an observer, it joins the election and > becomes a standby. > The MonitorDaemon thread call chain is: > doHealthChecks -> enterState(State.SERVICE_HEALTHY) -> recheckElectability() > -> elector.joinElection(targetToData(localTarget)) -> joinElectionInternal -> > createLockNodeAsync > The ZooKeeper callback is: > processResult -> becomeStandby
[jira] [Commented] (HDFS-15023) [SBN read] ZKFC should check the state before joining the election
[ https://issues.apache.org/jira/browse/HDFS-15023?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16984739#comment-16984739 ] Ayush Saxena commented on HDFS-15023: - Thanks [~ferhui], did you change the fix here? Wasn't it checking for not Observer? That was more descriptive. If the current check is apt, you can also add a one-line comment explaining the reason. Is it possible to cover the change with a test? > [SBN read] ZKFC should check the state before joining the election > -- > > Key: HDFS-15023 > URL: https://issues.apache.org/jira/browse/HDFS-15023 > Project: Hadoop HDFS > Issue Type: Bug >Reporter: Fei Hui >Assignee: Fei Hui >Priority: Major > Attachments: HDFS-15023.001.patch > > > As discussed in HDFS-14961, ZKFC should not join the election when its state is > observer. > Right now, when a NameNode is an observer, it joins the election and > becomes a standby. > The MonitorDaemon thread call chain is: > doHealthChecks -> enterState(State.SERVICE_HEALTHY) -> recheckElectability() > -> elector.joinElection(targetToData(localTarget)) -> joinElectionInternal -> > createLockNodeAsync > The ZooKeeper callback is: > processResult -> becomeStandby
[jira] [Commented] (HDFS-9695) HTTPFS - CHECKACCESS operation missing
[ https://issues.apache.org/jira/browse/HDFS-9695?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16984732#comment-16984732 ] Takanobu Asanuma commented on HDFS-9695: Thanks for updating the patch, [~hemanthboyina]. It looks almost good. Some minor comments: * Please remove the blank line in the FSAccess constructor. * About {{FSAccess#execute}}, the javadoc comment of {{@return}} seems wrong. > HTTPFS - CHECKACCESS operation missing > -- > > Key: HDFS-9695 > URL: https://issues.apache.org/jira/browse/HDFS-9695 > Project: Hadoop HDFS > Issue Type: Bug >Reporter: Bert Hekman >Assignee: hemanthboyina >Priority: Major > Attachments: HDFS-9695.001.patch, HDFS-9695.002.patch, > HDFS-9695.003.patch, HDFS-9695.004.patch > > > Hi, > The CHECKACCESS operation seems to be missing in HTTPFS. I'm getting the > following error: > {code} > QueryParamException: java.lang.IllegalArgumentException: No enum constant > org.apache.hadoop.fs.http.client.HttpFSFileSystem.Operation.CHECKACCESS > {code} > A quick look into the org.apache.hadoop.fs.http.client.HttpFSFileSystem class > reveals that CHECKACCESS is not defined at all.
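The "No enum constant" error quoted in HDFS-9695 is what `Enum.valueOf` raises when a name has no matching constant. The sketch below reproduces that mechanism with a reduced, hypothetical `Operation` enum; it is not the real `HttpFSFileSystem.Operation`, and the class name `OperationLookupSketch` is made up for illustration.

```java
// Hypothetical sketch of why a missing enum constant surfaces as
// IllegalArgumentException. The enum below is a reduced stand-in, not the
// real HttpFSFileSystem.Operation.
final class OperationLookupSketch {

    /** A few operation names only; CHECKACCESS is deliberately absent. */
    enum Operation { OPEN, GETFILESTATUS, LISTSTATUS }

    /** Returns true if the name maps to a constant, false if valueOf rejects it. */
    static boolean isSupported(String name) {
        try {
            Operation.valueOf(name);
            return true;
        } catch (IllegalArgumentException e) {
            // Enum.valueOf throws IllegalArgumentException for unknown names.
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println("OPEN supported: " + isSupported("OPEN"));
        System.out.println("CHECKACCESS supported: " + isSupported("CHECKACCESS"));
    }
}
```

Adding the missing constant to the enum is what makes the lookup succeed; catching the exception, as done here, only demonstrates the failure mode.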
[jira] [Commented] (HDFS-15003) RBF: Make Router support storage type quota.
[ https://issues.apache.org/jira/browse/HDFS-15003?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16984730#comment-16984730 ] Hadoop QA commented on HDFS-15003: -- | (/) *{color:green}+1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 43s{color} | {color:blue} Docker mode activated. {color} | || || || || {color:brown} Prechecks {color} || | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} | | {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s{color} | {color:green} The patch appears to include 2 new or modified test files. {color} | || || || || {color:brown} trunk Compile Tests {color} || | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 19m 37s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 29s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 20s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 31s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 13m 34s{color} | {color:green} branch has no errors when building and testing our client artifacts. 
{color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 0m 57s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 47s{color} | {color:green} trunk passed {color} | || || || || {color:brown} Patch Compile Tests {color} || | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 0m 27s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 23s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 23s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 15s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 27s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s{color} | {color:green} The patch has no whitespace issues. {color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 13m 39s{color} | {color:green} patch has no errors when building and testing our client artifacts. {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 1m 1s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 44s{color} | {color:green} the patch passed {color} | || || || || {color:brown} Other Tests {color} || | {color:green}+1{color} | {color:green} unit {color} | {color:green} 7m 11s{color} | {color:green} hadoop-hdfs-rbf in the patch passed. {color} | | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 26s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} | | {color:black}{color} | {color:black} {color} | {color:black} 62m 17s{color} | {color:black} {color} | \\ \\ || Subsystem || Report/Notes || | Docker | Client=19.03.5 Server=19.03.5 Image:yetus/hadoop:104ccca9169 | | JIRA Issue | HDFS-15003 | | JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12987114/HDFS-15003.002.patch | | Optional Tests | dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient findbugs checkstyle | | uname | Linux 33d3e79aa3ab 4.15.0-66-generic #75-Ubuntu SMP Tue Oct 1 05:24:09 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux | | Build tool | maven | | Personality | /testptch/patchprocess/precommit/personality/provided.sh | | git revision | trunk / 44f7b91 | | maven | version: Apache Maven 3.3.9 | | Default Java | 1.8.0_222 | | findbugs | v3.1.0-RC1 | | Test Results | https://builds.apache.org/job/PreCommit-HDFS-Build/28423/testReport/ | | Max. process+thread count | 2737 (vs. ulimit of 5500) | | modules | C: hadoop-hdfs-project/hadoop-hdfs-rbf U: hadoop-hdfs-project/hadoop-hdfs-rbf | | Console output | https://builds.apache.org/job/PreCommit-HDFS-Build/28423/console | | Powered by | Apache Yetus 0.8.0 http://yetus.apache.org | This message was automatically generated. > RBF: Make Router support storage type quota. > > > Key: HDFS-15003 > URL:
[jira] [Commented] (HDFS-15023) [SBN read] ZKFC should check the state before joining the election
[ https://issues.apache.org/jira/browse/HDFS-15023?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16984729#comment-16984729 ] Fei Hui commented on HDFS-15023: Uploaded a simple fix. > [SBN read] ZKFC should check the state before joining the election > -- > > Key: HDFS-15023 > URL: https://issues.apache.org/jira/browse/HDFS-15023 > Project: Hadoop HDFS > Issue Type: Bug >Reporter: Fei Hui >Assignee: Fei Hui >Priority: Major > Attachments: HDFS-15023.001.patch > > > As discussed in HDFS-14961, ZKFC should not join the election when its state is > observer. > Right now, when a NameNode is an observer, it joins the election and > becomes a standby. > The MonitorDaemon thread call chain is: > doHealthChecks -> enterState(State.SERVICE_HEALTHY) -> recheckElectability() > -> elector.joinElection(targetToData(localTarget)) -> joinElectionInternal -> > createLockNodeAsync > The ZooKeeper callback is: > processResult -> becomeStandby -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDFS-15023) [SBN read] ZKFC should check the state before joining the election
[ https://issues.apache.org/jira/browse/HDFS-15023?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Fei Hui updated HDFS-15023: --- Status: Patch Available (was: Open) > [SBN read] ZKFC should check the state before joining the election > -- > > Key: HDFS-15023 > URL: https://issues.apache.org/jira/browse/HDFS-15023 > Project: Hadoop HDFS > Issue Type: Bug >Reporter: Fei Hui >Assignee: Fei Hui >Priority: Major > Attachments: HDFS-15023.001.patch > > > As discussed in HDFS-14961, ZKFC should not join the election when its state is > observer. > Right now, when a NameNode is an observer, it joins the election and > becomes a standby. > The MonitorDaemon thread call chain is: > doHealthChecks -> enterState(State.SERVICE_HEALTHY) -> recheckElectability() > -> elector.joinElection(targetToData(localTarget)) -> joinElectionInternal -> > createLockNodeAsync > The ZooKeeper callback is: > processResult -> becomeStandby -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDFS-15023) [SBN read] ZKFC should check the state before joining the election
[ https://issues.apache.org/jira/browse/HDFS-15023?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Fei Hui updated HDFS-15023: --- Attachment: HDFS-15023.001.patch > [SBN read] ZKFC should check the state before joining the election > -- > > Key: HDFS-15023 > URL: https://issues.apache.org/jira/browse/HDFS-15023 > Project: Hadoop HDFS > Issue Type: Bug >Reporter: Fei Hui >Assignee: Fei Hui >Priority: Major > Attachments: HDFS-15023.001.patch > > > As discussed in HDFS-14961, ZKFC should not join the election when its state is > observer. > Right now, when a NameNode is an observer, it joins the election and > becomes a standby. > The MonitorDaemon thread call chain is: > doHealthChecks -> enterState(State.SERVICE_HEALTHY) -> recheckElectability() > -> elector.joinElection(targetToData(localTarget)) -> joinElectionInternal -> > createLockNodeAsync > The ZooKeeper callback is: > processResult -> becomeStandby -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDFS-15023) [SBN read] ZKFC should check the state before joining the election
[ https://issues.apache.org/jira/browse/HDFS-15023?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Fei Hui updated HDFS-15023: --- Description: As discussed in HDFS-14961, ZKFC should not join the election when its state is observer. Right now, when a NameNode is an observer, it joins the election and becomes a standby. The MonitorDaemon thread call chain is: doHealthChecks -> enterState(State.SERVICE_HEALTHY) -> recheckElectability() -> elector.joinElection(targetToData(localTarget)) -> joinElectionInternal -> createLockNodeAsync The ZooKeeper callback is: processResult -> becomeStandby was: As discussed HDFS-14961, ZKFC should not join election when its state is observer > [SBN read] ZKFC should check the state before joining the election > -- > > Key: HDFS-15023 > URL: https://issues.apache.org/jira/browse/HDFS-15023 > Project: Hadoop HDFS > Issue Type: Improvement >Reporter: Fei Hui >Assignee: Fei Hui >Priority: Major > Attachments: HDFS-15023.001.patch > > > As discussed in HDFS-14961, ZKFC should not join the election when its state is > observer. > Right now, when a NameNode is an observer, it joins the election and > becomes a standby. > The MonitorDaemon thread call chain is: > doHealthChecks -> enterState(State.SERVICE_HEALTHY) -> recheckElectability() > -> elector.joinElection(targetToData(localTarget)) -> joinElectionInternal -> > createLockNodeAsync > The ZooKeeper callback is: > processResult -> becomeStandby -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDFS-15023) [SBN read] ZKFC should check the state before joining the election
[ https://issues.apache.org/jira/browse/HDFS-15023?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Fei Hui updated HDFS-15023: --- Issue Type: Bug (was: Improvement) > [SBN read] ZKFC should check the state before joining the election > -- > > Key: HDFS-15023 > URL: https://issues.apache.org/jira/browse/HDFS-15023 > Project: Hadoop HDFS > Issue Type: Bug >Reporter: Fei Hui >Assignee: Fei Hui >Priority: Major > Attachments: HDFS-15023.001.patch > > > As discussed in HDFS-14961, ZKFC should not join the election when its state is > observer. > Right now, when a NameNode is an observer, it joins the election and > becomes a standby. > The MonitorDaemon thread call chain is: > doHealthChecks -> enterState(State.SERVICE_HEALTHY) -> recheckElectability() > -> elector.joinElection(targetToData(localTarget)) -> joinElectionInternal -> > createLockNodeAsync > The ZooKeeper callback is: > processResult -> becomeStandby -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDFS-15003) RBF: Make Router support storage type quota.
[ https://issues.apache.org/jira/browse/HDFS-15003?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jinglun updated HDFS-15003: --- Attachment: HDFS-15003.002.patch > RBF: Make Router support storage type quota. > > > Key: HDFS-15003 > URL: https://issues.apache.org/jira/browse/HDFS-15003 > Project: Hadoop HDFS > Issue Type: Improvement >Reporter: Jinglun >Assignee: Jinglun >Priority: Major > Attachments: HDFS-15003.001.patch, HDFS-15003.002.patch > > > Make Router support storage type quota. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-15021) RBF: Delegation Token can't remove correctly in absence of cancelToken and restart the router.
[ https://issues.apache.org/jira/browse/HDFS-15021?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16984716#comment-16984716 ] Yuxuan Wang commented on HDFS-15021: I think TestZKDelegationTokenSecretManager#testNodesLoadedAfterRestart() already covers this case. > RBF: Delegation Token can't remove correctly in absence of cancelToken and > restart the router. > --- > > Key: HDFS-15021 > URL: https://issues.apache.org/jira/browse/HDFS-15021 > Project: Hadoop HDFS > Issue Type: Sub-task >Reporter: Weidong Duan >Priority: Major > > The ZKDelegationTokenSecretManager cannot remove expired DTs from > ZooKeeper as expected when the Router is restarted without the > method `ZKDelegationTokenSecretManager#cancelToken` having been invoked. > This leaves many stale DTs in ZooKeeper, which may cause > performance problems for the Router. > I think this is a bug and should be resolved later. Is that right? -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Assigned] (HDFS-15023) [SBN read] ZKFC should check the state before joining the election
[ https://issues.apache.org/jira/browse/HDFS-15023?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Fei Hui reassigned HDFS-15023: -- Assignee: Fei Hui > [SBN read] ZKFC should check the state before joining the election > -- > > Key: HDFS-15023 > URL: https://issues.apache.org/jira/browse/HDFS-15023 > Project: Hadoop HDFS > Issue Type: Improvement >Reporter: Fei Hui >Assignee: Fei Hui >Priority: Major > > As discussed in HDFS-14961, ZKFC should not join the election when its state is > observer. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Created] (HDFS-15023) [SBN read] ZKFC should check the state before joining the election
Fei Hui created HDFS-15023: -- Summary: [SBN read] ZKFC should check the state before joining the election Key: HDFS-15023 URL: https://issues.apache.org/jira/browse/HDFS-15023 Project: Hadoop HDFS Issue Type: Improvement Reporter: Fei Hui As discussed in HDFS-14961, ZKFC should not join the election when its state is observer. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-15013) Reduce NameNode overview tab response time
[ https://issues.apache.org/jira/browse/HDFS-15013?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16984618#comment-16984618 ] Hudson commented on HDFS-15013: --- SUCCESS: Integrated in Jenkins build Hadoop-trunk-Commit #17710 (See [https://builds.apache.org/job/Hadoop-trunk-Commit/17710/]) HDFS-15013. Reduce NameNode overview tab response time. Contributed by (surendralilhore: rev 44f7b9159d8eec151f199231bafe0677f9383dc3) * (edit) hadoop-hdfs-project/hadoop-hdfs/src/main/webapps/hdfs/dfshealth.js > Reduce NameNode overview tab response time > -- > > Key: HDFS-15013 > URL: https://issues.apache.org/jira/browse/HDFS-15013 > Project: Hadoop HDFS > Issue Type: Improvement > Components: namenode >Reporter: HuangTao >Assignee: HuangTao >Priority: Minor > Fix For: 3.3.0 > > Attachments: HDFS-15013.001.patch, HDFS-15013.002.patch, > image-2019-11-26-10-05-39-640.png, image-2019-11-26-10-09-07-952.png > > > Currently, the overview tab loads /conf synchronously, as shown in the following picture. > !image-2019-11-26-10-05-39-640.png! > This issue changes it to an asynchronous call. The effect is shown > below. > !image-2019-11-26-10-09-07-952.png! -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-15013) Reduce NameNode overview tab response time
[ https://issues.apache.org/jira/browse/HDFS-15013?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16984614#comment-16984614 ] Surendra Singh Lilhore commented on HDFS-15013: --- +1 > Reduce NameNode overview tab response time > -- > > Key: HDFS-15013 > URL: https://issues.apache.org/jira/browse/HDFS-15013 > Project: Hadoop HDFS > Issue Type: Improvement > Components: namenode >Reporter: HuangTao >Assignee: HuangTao >Priority: Minor > Fix For: 3.3.0 > > Attachments: HDFS-15013.001.patch, HDFS-15013.002.patch, > image-2019-11-26-10-05-39-640.png, image-2019-11-26-10-09-07-952.png > > > Currently, the overview tab loads /conf synchronously, as shown in the following picture. > !image-2019-11-26-10-05-39-640.png! > This issue changes it to an asynchronous call. The effect is shown > below. > !image-2019-11-26-10-09-07-952.png! -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDFS-15013) Reduce NameNode overview tab response time
[ https://issues.apache.org/jira/browse/HDFS-15013?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Surendra Singh Lilhore updated HDFS-15013: -- Resolution: Fixed Status: Resolved (was: Patch Available) Thanks [~marvelrock] for the contribution. Thanks [~elgoiri], [~ayushtkn], [~hemanthboyina] for the review. Committed to trunk. > Reduce NameNode overview tab response time > -- > > Key: HDFS-15013 > URL: https://issues.apache.org/jira/browse/HDFS-15013 > Project: Hadoop HDFS > Issue Type: Improvement > Components: namenode >Reporter: HuangTao >Assignee: HuangTao >Priority: Minor > Fix For: 3.3.0 > > Attachments: HDFS-15013.001.patch, HDFS-15013.002.patch, > image-2019-11-26-10-05-39-640.png, image-2019-11-26-10-09-07-952.png > > > Currently, the overview tab loads /conf synchronously, as shown in the following picture. > !image-2019-11-26-10-05-39-640.png! > This issue changes it to an asynchronous call. The effect is shown > below. > !image-2019-11-26-10-09-07-952.png! -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-6874) Add GETFILEBLOCKLOCATIONS operation to HttpFS
[ https://issues.apache.org/jira/browse/HDFS-6874?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16984611#comment-16984611 ] hemanthboyina commented on HDFS-6874: - {quote}But I think that only applies to branch-2. {quote} Yes, it only applies to branch-2. I think we can go ahead with this. > Add GETFILEBLOCKLOCATIONS operation to HttpFS > - > > Key: HDFS-6874 > URL: https://issues.apache.org/jira/browse/HDFS-6874 > Project: Hadoop HDFS > Issue Type: Improvement > Components: httpfs >Affects Versions: 2.4.1, 2.7.3 >Reporter: Gao Zhong Liang >Assignee: Weiwei Yang >Priority: Major > Labels: BB2015-05-TBR > Attachments: HDFS-6874-1.patch, HDFS-6874-branch-2.6.0.patch, > HDFS-6874.02.patch, HDFS-6874.03.patch, HDFS-6874.04.patch, > HDFS-6874.05.patch, HDFS-6874.06.patch, HDFS-6874.07.patch, > HDFS-6874.08.patch, HDFS-6874.09.patch, HDFS-6874.10.patch, HDFS-6874.patch > > > GETFILEBLOCKLOCATIONS operation is missing in HttpFS, which is already > supported in WebHDFS. For the request of GETFILEBLOCKLOCATIONS in > org.apache.hadoop.fs.http.server.HttpFSServer, BAD_REQUEST is returned so far: > ... > case GETFILEBLOCKLOCATIONS: { > response = Response.status(Response.Status.BAD_REQUEST).build(); > break; > } > -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDFS-15010) BlockPoolSlice#addReplicaThreadPool static pool should be initialized by static method
[ https://issues.apache.org/jira/browse/HDFS-15010?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Surendra Singh Lilhore updated HDFS-15010: -- Fix Version/s: 3.2.2 3.1.4 3.3.0 Resolution: Fixed Status: Resolved (was: Patch Available) Thanks [~elgoiri] for the review. Committed to trunk, branch-3.2 and branch-3.1! > BlockPoolSlice#addReplicaThreadPool static pool should be initialized by > static method > -- > > Key: HDFS-15010 > URL: https://issues.apache.org/jira/browse/HDFS-15010 > Project: Hadoop HDFS > Issue Type: Bug > Components: datanode >Affects Versions: 3.1.2 >Reporter: Surendra Singh Lilhore >Assignee: Surendra Singh Lilhore >Priority: Major > Fix For: 3.3.0, 3.1.4, 3.2.2 > > Attachments: HDFS-15010.001.patch, HDFS-15010.02.patch, > HDFS-15010.03.patch, HDFS-15010.04.patch, HDFS-15010.05.patch > > > The {{BlockPoolSlice#initializeAddReplicaPool()}} method currently initializes the > static thread pool instance. But when two {{BPServiceActor}} actors try to > load block pools in parallel, it may create different instances. > So the {{BlockPoolSlice#initializeAddReplicaPool()}} method should be a static > method. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
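For readers following HDFS-15010, here is a minimal sketch of the general pattern the description asks for: a static pool created through a static, synchronized method so that concurrent callers share one instance. The class name ReplicaPoolSketch, the use of ForkJoinPool, and the parallelism parameter are assumptions for illustration only; this is not the committed patch.

```java
import java.util.concurrent.ForkJoinPool;

// Hypothetical stand-in for the class that owns the static pool.
final class ReplicaPoolSketch {
  // Shared by all instances; guarded by the class lock.
  private static ForkJoinPool addReplicaThreadPool;

  private ReplicaPoolSketch() {
  }

  // Static and synchronized: callers on different threads lock on the
  // class itself, so only the first call creates the pool.
  static synchronized ForkJoinPool initializeAddReplicaPool(int parallelism) {
    if (addReplicaThreadPool == null) {
      addReplicaThreadPool = new ForkJoinPool(parallelism);
    }
    return addReplicaThreadPool;
  }

  public static void main(String[] args) {
    ForkJoinPool first = initializeAddReplicaPool(2);
    ForkJoinPool second = initializeAddReplicaPool(4);
    System.out.println("same pool instance: " + (first == second));
    first.shutdown();
  }
}
```

With a synchronized instance method, each caller would lock on its own object, so two threads could both see a null pool and each create one. Locking on the class avoids that.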
[jira] [Commented] (HDFS-15010) BlockPoolSlice#addReplicaThreadPool static pool should be initialized by static method
[ https://issues.apache.org/jira/browse/HDFS-15010?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16984600#comment-16984600 ] Hudson commented on HDFS-15010: --- SUCCESS: Integrated in Jenkins build Hadoop-trunk-Commit #17709 (See [https://builds.apache.org/job/Hadoop-trunk-Commit/17709/]) HDFS-15010. BlockPoolSlice#addReplicaThreadPool static pool should be (surendralilhore: rev 0384687811446a52009b96cc85bf961a3e83afc4) * (edit) hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/fsdataset/impl/BlockPoolSlice.java * (edit) hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/datanode/fsdataset/impl/TestFsVolumeList.java > BlockPoolSlice#addReplicaThreadPool static pool should be initialized by > static method > -- > > Key: HDFS-15010 > URL: https://issues.apache.org/jira/browse/HDFS-15010 > Project: Hadoop HDFS > Issue Type: Bug > Components: datanode >Affects Versions: 3.1.2 >Reporter: Surendra Singh Lilhore >Assignee: Surendra Singh Lilhore >Priority: Major > Attachments: HDFS-15010.001.patch, HDFS-15010.02.patch, > HDFS-15010.03.patch, HDFS-15010.04.patch, HDFS-15010.05.patch > > > The {{BlockPoolSlice#initializeAddReplicaPool()}} method currently initializes the > static thread pool instance. But when two {{BPServiceActor}} actors try to > load block pools in parallel, it may create different instances. > So the {{BlockPoolSlice#initializeAddReplicaPool()}} method should be a static > method. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-14984) HDFS setQuota: Error message should be added for invalid input max range value to hdfs dfsadmin -setQuota command
[ https://issues.apache.org/jira/browse/HDFS-14984?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16984589#comment-16984589 ] hemanthboyina commented on HDFS-14984: --

If we set the invalid input 9223372036854775807 (which is Long.MAX_VALUE), we would expect it to go through the if condition below and throw an exception (going by the error thrown for other invalid input).

{code:java}
// DFSClient.java
if ((namespaceQuota <= 0
      && namespaceQuota != HdfsConstants.QUOTA_DONT_SET
      && namespaceQuota != HdfsConstants.QUOTA_RESET)
    || (storagespaceQuota < 0
      && storagespaceQuota != HdfsConstants.QUOTA_DONT_SET
      && storagespaceQuota != HdfsConstants.QUOTA_RESET)) {
  throw new IllegalArgumentException("Invalid values for quota : "
      + namespaceQuota + " and " + storagespaceQuota);
}
{code}

But in FSDirAttrOp.java there is an else-if check: if nsQuota equals Long.MAX_VALUE, we keep the old NS quota.

{code:java}
final QuotaCounts oldQuota = dirNode.getQuotaCounts();
final long oldNsQuota = oldQuota.getNameSpace();
final long oldSsQuota = oldQuota.getStorageSpace();

if (dirNode.isRoot() && nsQuota == HdfsConstants.QUOTA_RESET) {
  nsQuota = HdfsConstants.QUOTA_DONT_SET;
} else if (nsQuota == HdfsConstants.QUOTA_DONT_SET) {
  nsQuota = oldNsQuota;
}

// unchanged space/namespace quota
if (type == null && oldNsQuota == nsQuota && oldSsQuota == ssQuota) {
  return null;
}
{code}

Either the exception message is not proper or the if condition in DFSClient is not correct. Please correct me if I am wrong.
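To make the behaviour discussed in this comment easier to check, here is a small stand-alone sketch of the two quoted conditions. It assumes that HdfsConstants.QUOTA_DONT_SET equals Long.MAX_VALUE and HdfsConstants.QUOTA_RESET equals -1, which is consistent with the error message "Invalid values for quota : 0 and 9223372036854775807" reported in the issue. The class and method names are hypothetical and only mirror the quoted logic.

```java
// Hypothetical sketch; constants are assumed to mirror HdfsConstants.
final class QuotaCheckSketch {
  static final long QUOTA_DONT_SET = Long.MAX_VALUE; // assumed value
  static final long QUOTA_RESET = -1L;               // assumed value

  private QuotaCheckSketch() {
  }

  // Mirrors the client-side condition quoted from DFSClient.java.
  static boolean isRejectedByClient(long nsQuota, long ssQuota) {
    return (nsQuota <= 0
            && nsQuota != QUOTA_DONT_SET
            && nsQuota != QUOTA_RESET)
        || (ssQuota < 0
            && ssQuota != QUOTA_DONT_SET
            && ssQuota != QUOTA_RESET);
  }

  // Mirrors the else-if branch quoted from FSDirAttrOp.java:
  // QUOTA_DONT_SET means "keep the old namespace quota".
  static long effectiveNsQuota(long requested, long oldNsQuota) {
    return requested == QUOTA_DONT_SET ? oldNsQuota : requested;
  }

  public static void main(String[] args) {
    // "hdfs dfsadmin -setQuota 0 /quota": namespace quota 0, space quota untouched.
    System.out.println(isRejectedByClient(0L, QUOTA_DONT_SET));
    // "hdfs dfsadmin -setQuota 9223372036854775807 /quota".
    System.out.println(isRejectedByClient(Long.MAX_VALUE, QUOTA_DONT_SET));
    System.out.println(effectiveNsQuota(Long.MAX_VALUE, 100L));
  }
}
```

Under these assumptions, the value 9223372036854775807 is indistinguishable from the "don't set" sentinel, so the client check lets it through and the old quota is kept silently.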
> HDFS setQuota: Error message should be added for invalid input max range > value to hdfs dfsadmin -setQuota command > - > > Key: HDFS-14984 > URL: https://issues.apache.org/jira/browse/HDFS-14984 > Project: Hadoop HDFS > Issue Type: Improvement > Components: hdfs >Affects Versions: 3.1.2 >Reporter: Souryakanta Dwivedy >Priority: Minor > Attachments: image-2019-11-13-14-05-19-603.png, > image-2019-11-13-14-07-04-536.png > > > An error message should be added for invalid input max range value > "9223372036854775807" to hdfs dfsadmin -setQuota command > * set quota for a directory with invalid input vlaue as > "9223372036854775807"- set quota for a directory with invalid input vlaue as > "9223372036854775807" the command will be successful without displaying any > result.Quota value will not be set for the directory internally,but it > will be better from user usage point of view if an error message will > display for the invalid max range value "9223372036854775807" as it is > displaying while setting the input value as "0" For example "hdfs > dfsadmin -setQuota 9223372036854775807 /quota" > !image-2019-11-13-14-05-19-603.png! > > * - Try to set quota for a directory with invalid input value as "0" It > will throw an error message as "setQuota: Invalid values for quota : 0 and > 9223372036854775807" For example "hdfs dfsadmin -setQuota 0 /quota" > !image-2019-11-13-14-07-04-536.png! -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-15022) Add new RPC to transfer data block with external shell script across Datanode
[ https://issues.apache.org/jira/browse/HDFS-15022?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16984557#comment-16984557 ] Hadoop QA commented on HDFS-15022: -- | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 37s{color} | {color:blue} Docker mode activated. {color} | || || || || {color:brown} Prechecks {color} || | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} | | {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s{color} | {color:green} The patch appears to include 2 new or modified test files. {color} | || || || || {color:brown} trunk Compile Tests {color} || | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 13s{color} | {color:blue} Maven dependency ordering for branch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 18m 28s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 3m 25s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 1m 11s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 1m 59s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 15m 2s{color} | {color:green} branch has no errors when building and testing our client artifacts. 
{color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 4m 18s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 54s{color} | {color:green} trunk passed {color} | || || || || {color:brown} Patch Compile Tests {color} || | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 14s{color} | {color:blue} Maven dependency ordering for patch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 1m 43s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 3m 3s{color} | {color:green} the patch passed {color} | | {color:red}-1{color} | {color:red} cc {color} | {color:red} 3m 3s{color} | {color:red} hadoop-hdfs-project generated 4 new + 15 unchanged - 4 fixed = 19 total (was 19) {color} | | {color:red}-1{color} | {color:red} javac {color} | {color:red} 3m 3s{color} | {color:red} hadoop-hdfs-project generated 3 new + 741 unchanged - 0 fixed = 744 total (was 741) {color} | | {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange} 1m 5s{color} | {color:orange} hadoop-hdfs-project: The patch generated 95 new + 931 unchanged - 3 fixed = 1026 total (was 934) {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 1m 40s{color} | {color:green} the patch passed {color} | | {color:red}-1{color} | {color:red} whitespace {color} | {color:red} 0m 0s{color} | {color:red} The patch has 2 line(s) that end in whitespace. Use git apply --whitespace=fix <>. Refer https://git-scm.com/docs/git-apply {color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 12m 57s{color} | {color:green} patch has no errors when building and testing our client artifacts. 
{color} | | {color:red}-1{color} | {color:red} findbugs {color} | {color:red} 2m 23s{color} | {color:red} hadoop-hdfs-project/hadoop-hdfs generated 1 new + 0 unchanged - 0 fixed = 1 total (was 0) {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 52s{color} | {color:green} the patch passed {color} | || || || || {color:brown} Other Tests {color} || | {color:green}+1{color} | {color:green} unit {color} | {color:green} 1m 56s{color} | {color:green} hadoop-hdfs-client in the patch passed. {color} | | {color:red}-1{color} | {color:red} unit {color} | {color:red} 91m 11s{color} | {color:red} hadoop-hdfs in the patch failed. {color} | | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 35s{color} | {color:green} The patch does not generate ASF License warnings. {color} | | {color:black}{color} | {color:black} {color} | {color:black}167m 0s{color} | {color:black} {color} | \\ \\ || Reason || Tests || | FindBugs | module:hadoop-hdfs-project/hadoop-hdfs | | | Exceptional return value of java.io.File.delete() ignored in org.apache.hadoop.hdfs.server.datanode.DataXceiver.linkBlock(ExtendedBlock, Token, String, DatanodeInfo, StorageType, String, DatanodeInfo, StorageType) At DataXceiver.java:ignored in org.apache.hadoop.hdfs.server.datanode.DataXceiver.linkBlock(ExtendedBlock, Token, String,
[jira] [Commented] (HDFS-14901) RBF: Add Encryption Zone related ClientProtocol APIs
[ https://issues.apache.org/jira/browse/HDFS-14901?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16984554#comment-16984554 ] hemanthboyina commented on HDFS-14901: -- Thanks for the review, [~ayushtkn] [~elgoiri]. {quote} if no specific reason, you can use routerDFS only for both and chunk of having {{routerProtocol}} from the test. {quote} Some of the APIs, such as getDataEncryptionKey(), are not present in routerDFS, so we need to use routerProtocol. > RBF: Add Encryption Zone related ClientProtocol APIs > > > Key: HDFS-14901 > URL: https://issues.apache.org/jira/browse/HDFS-14901 > Project: Hadoop HDFS > Issue Type: Sub-task >Reporter: hemanthboyina >Assignee: hemanthboyina >Priority: Major > Attachments: HDFS-14901.001.patch, HDFS-14901.002.patch > > > Currently the listEncryptionZones, reencryptEncryptionZone, and listReencryptionStatus > APIs are not implemented in the Router. > This JIRA intends to implement the above-mentioned APIs. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-15009) FSCK "-list-corruptfileblocks" return Invalid Entries
[ https://issues.apache.org/jira/browse/HDFS-15009?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16984515#comment-16984515 ] Hadoop QA commented on HDFS-15009: -- | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 1m 23s{color} | {color:blue} Docker mode activated. {color} | || || || || {color:brown} Prechecks {color} || | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} | | {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s{color} | {color:green} The patch appears to include 1 new or modified test files. {color} | || || || || {color:brown} trunk Compile Tests {color} || | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 14s{color} | {color:blue} Maven dependency ordering for branch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 23m 36s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 4m 14s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 1m 12s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 2m 8s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 18m 14s{color} | {color:green} branch has no errors when building and testing our client artifacts. 
{color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 3m 52s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 2m 22s{color} | {color:green} trunk passed {color} | || || || || {color:brown} Patch Compile Tests {color} || | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 12s{color} | {color:blue} Maven dependency ordering for patch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 2m 11s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 4m 42s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 4m 42s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 1m 8s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 1m 57s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s{color} | {color:green} The patch has no whitespace issues. {color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 15m 51s{color} | {color:green} patch has no errors when building and testing our client artifacts. {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 3m 34s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 2m 1s{color} | {color:green} the patch passed {color} | || || || || {color:brown} Other Tests {color} || | {color:red}-1{color} | {color:red} unit {color} | {color:red}137m 26s{color} | {color:red} hadoop-hdfs in the patch failed. 
{color} | | {color:green}+1{color} | {color:green} unit {color} | {color:green} 7m 16s{color} | {color:green} hadoop-hdfs-rbf in the patch passed. {color} | | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 33s{color} | {color:green} The patch does not generate ASF License warnings. {color} | | {color:black}{color} | {color:black} {color} | {color:black}232m 54s{color} | {color:black} {color} | \\ \\ || Reason || Tests || | Failed junit tests | hadoop.hdfs.server.namenode.TestStoragePolicySatisfierWithHA | | | hadoop.hdfs.server.namenode.TestAddStripedBlocks | | | hadoop.hdfs.server.namenode.TestDeleteRace | | | hadoop.hdfs.TestDeadNodeDetection | | | hadoop.hdfs.server.namenode.TestNameNodeMXBean | | | hadoop.hdfs.server.namenode.TestNameNodeXAttr | | | hadoop.hdfs.server.namenode.TestAuditLogs | | | hadoop.hdfs.server.namenode.ha.TestConsistentReadsObserver | | | hadoop.hdfs.server.namenode.TestFSDirectory | | | hadoop.hdfs.server.namenode.TestReencryptionWithKMS | | | hadoop.hdfs.server.namenode.TestNameEditsConfigs | | | hadoop.hdfs.server.namenode.TestAddBlockRetry | | | hadoop.hdfs.server.namenode.TestAddStripedBlockInFBR | | | hadoop.hdfs.server.namenode.TestCommitBlockWithInvalidGenStamp | | | hadoop.hdfs.server.namenode.ha.TestStandbyCheckpoints | |
[jira] [Commented] (HDFS-9695) HTTPFS - CHECKACCESS operation missing
[ https://issues.apache.org/jira/browse/HDFS-9695?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16984460#comment-16984460 ]

Hadoop QA commented on HDFS-9695:

| (/) *+1 overall* |

|| Vote || Subsystem || Runtime || Comment ||
| 0 | reexec | 0m 25s | Docker mode activated. |
|| || || || Prechecks ||
| +1 | @author | 0m 0s | The patch does not contain any @author tags. |
| +1 | test4tests | 0m 0s | The patch appears to include 2 new or modified test files. |
|| || || || trunk Compile Tests ||
| +1 | mvninstall | 17m 0s | trunk passed |
| +1 | compile | 0m 25s | trunk passed |
| +1 | checkstyle | 0m 30s | trunk passed |
| +1 | mvnsite | 0m 33s | trunk passed |
| +1 | shadedclient | 12m 49s | branch has no errors when building and testing our client artifacts. |
| +1 | findbugs | 0m 33s | trunk passed |
| +1 | javadoc | 0m 25s | trunk passed |
|| || || || Patch Compile Tests ||
| +1 | mvninstall | 0m 22s | the patch passed |
| +1 | compile | 0m 19s | the patch passed |
| +1 | javac | 0m 19s | the patch passed |
| -0 | checkstyle | 0m 25s | hadoop-hdfs-project/hadoop-hdfs-httpfs: The patch generated 1 new + 455 unchanged - 0 fixed = 456 total (was 455) |
| +1 | mvnsite | 0m 26s | the patch passed |
| +1 | whitespace | 0m 0s | The patch has no whitespace issues. |
| +1 | shadedclient | 12m 37s | patch has no errors when building and testing our client artifacts. |
| +1 | findbugs | 0m 39s | the patch passed |
| +1 | javadoc | 0m 21s | the patch passed |
|| || || || Other Tests ||
| +1 | unit | 4m 39s | hadoop-hdfs-httpfs in the patch passed. |
| +1 | asflicense | 0m 29s | The patch does not generate ASF License warnings. |
| | | 53m 49s | |

|| Subsystem || Report/Notes ||
| Docker | Client=19.03.5 Server=19.03.5 Image:yetus/hadoop:104ccca9169 |
| JIRA Issue | HDFS-9695 |
| JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12987067/HDFS-9695.004.patch |
| Optional Tests | dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient findbugs checkstyle |
| uname | Linux d1795487e895 4.15.0-58-generic #64-Ubuntu SMP Tue Aug 6 11:12:41 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/patchprocess/precommit/personality/provided.sh |
| git revision | trunk / 46166bd |
| maven | version: Apache Maven 3.3.9 |
| Default Java | 1.8.0_222 |
| findbugs | v3.1.0-RC1 |
| checkstyle | https://builds.apache.org/job/PreCommit-HDFS-Build/28421/artifact/out/diff-checkstyle-hadoop-hdfs-project_hadoop-hdfs-httpfs.txt |
| Test Results | https://builds.apache.org/job/PreCommit-HDFS-Build/28421/testReport/ |
| Max. process+thread count | 632 (vs. ulimit of 5500) |
| modules | C: hadoop-hdfs-project/hadoop-hdfs-httpfs U: hadoop-hdfs-project/hadoop-hdfs-httpfs |
| Console output | https://builds.apache.org/job/PreCommit-HDFS-Build/28421/console |
| Powered by | Apache Yetus
[jira] [Updated] (HDFS-15022) Add new RPC to transfer data block with external shell script across Datanode
[ https://issues.apache.org/jira/browse/HDFS-15022?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Yang Yun updated HDFS-15022:
    Attachment: HDFS-15022.patch
    Status: Patch Available (was: Open)

> Add new RPC to transfer data block with external shell script across Datanode
>
> Key: HDFS-15022
> URL: https://issues.apache.org/jira/browse/HDFS-15022
> Project: Hadoop HDFS
> Issue Type: Improvement
> Components: datanode
> Reporter: Yang Yun
> Assignee: Yang Yun
> Priority: Minor
> Attachments: HDFS-15022.patch
>
> Replicating data blocks is expensive when some DataNodes are down, especially
> on slow storage. Add a new RPC that replicates a block across DataNodes using
> an external shell script, so that users can choose a more efficient way to
> copy block files.
> In our setup, Archive volumes are backed by reliable remote storage, so during
> replication we just add a link file on the new DataNode that points to the
> remote file.

-- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Created] (HDFS-15022) Add new RPC to transfer data block with external shell script across Datanode
Yang Yun created HDFS-15022:

    Summary: Add new RPC to transfer data block with external shell script across Datanode
    Key: HDFS-15022
    URL: https://issues.apache.org/jira/browse/HDFS-15022
    Project: Hadoop HDFS
    Issue Type: Improvement
    Components: datanode
    Reporter: Yang Yun
    Assignee: Yang Yun

Replicating data blocks is expensive when some DataNodes are down, especially on slow storage. Add a new RPC that replicates a block across DataNodes using an external shell script, so that users can choose a more efficient way to copy block files.

In our setup, Archive volumes are backed by reliable remote storage, so during replication we just add a link file on the new DataNode that points to the remote file.
[jira] [Updated] (HDFS-13571) Deadnode detection
[ https://issues.apache.org/jira/browse/HDFS-13571?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Lisheng Sun updated HDFS-13571:
    Release Note: When dead node blocks DFSInputStream, Deadnode detection can find it and share this information to other DFSInputStreams in the same DFSClient. Thus, these DFSInputStreams will not read from the dead node and be blocked by this dead node. (was: When dead node blocks DFSInputStream,Deadnode detection can find it and share this information to other DFSInputStreams in the same DFSClient. Thus, these DFSInputStreams will not read from the dead node and be blocked by this dead node. )

> Deadnode detection
>
> Key: HDFS-13571
> URL: https://issues.apache.org/jira/browse/HDFS-13571
> Project: Hadoop HDFS
> Issue Type: Improvement
> Components: hdfs-client
> Affects Versions: 2.4.0, 2.6.0, 3.0.2
> Reporter: Gang Xie
> Assignee: Lisheng Sun
> Priority: Major
> Fix For: 3.3.0
> Attachments: DeadNodeDetectorDesign.pdf, HDFS-13571-2.6.diff, node status machine.png
>
> Currently, the information about dead datanodes in DFSInputStream is stored
> locally, so it cannot be shared among the input streams of the same
> DFSClient. In our production environment, some datanodes die every day for
> different reasons. When that happens, the first input stream that is blocked
> detects the dead node, but it cannot share this information with the others
> in the same DFSClient, so the other input streams are still blocked by the
> dead node for some time, which can cause bad service latency.
> To eliminate this impact of dead datanodes, we designed a dead datanode
> detector, which detects the dead ones in advance and shares this information
> among all the input streams in the same client. This improvement has been
> running online for some months and works fine, so we decided to port it to
> 3.0 (the versions used in our production environment are 2.4 and 2.6).
> I will do the porting work and upload the code later.
[jira] [Updated] (HDFS-13571) Deadnode detection
[ https://issues.apache.org/jira/browse/HDFS-13571?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Lisheng Sun updated HDFS-13571: --- Release Note: When dead node blocks DFSInputStream,Deadnode detection can find it and share this information to other DFSInputStreams in the same DFSClient. Thus, these DFSInputStreams will not read from the dead node and be blocked by this dead node. (was: When dead node blocks DFSInputStream,Deadnode detection can find it and share this information to other DFSInputStreams in the same DFSClient. Thus, these DFSInputStreams will not read from the dead node and be blocked by this dead node. ) > Deadnode detection > -- > > Key: HDFS-13571 > URL: https://issues.apache.org/jira/browse/HDFS-13571 > Project: Hadoop HDFS > Issue Type: Improvement > Components: hdfs-client >Affects Versions: 2.4.0, 2.6.0, 3.0.2 >Reporter: Gang Xie >Assignee: Lisheng Sun >Priority: Major > Fix For: 3.3.0 > > Attachments: DeadNodeDetectorDesign.pdf, HDFS-13571-2.6.diff, node > status machine.png > > > Currently, the information of the dead datanode in DFSInputStream in stored > locally. So, it could not be shared among the inputstreams of the same > DFSClient. In our production env, every days, some datanodes dies with > different causes. At this time, after the first inputstream blocked and > detect this, it could share this information to others in the same DFSClient, > thus, the ohter inputstreams are still blocked by the dead node for some > time, which could cause bad service latency. > To eliminate this impact from dead datanode, we designed a dead datanode > detector, which detect the dead ones in advance, and share this information > among all the inputstreams in the same client. This improvement has being > online for some months and works fine. So, we decide to port to the 3.0 (the > version used in our production env is 2.4 and 2.6). > I will do the porting work and upload the code later. 
[jira] [Commented] (HDFS-13571) Deadnode detection
[ https://issues.apache.org/jira/browse/HDFS-13571?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16984443#comment-16984443 ] Lisheng Sun commented on HDFS-13571: Thank [~linyiqun] for patient review and good comments. I have added release note for this JIRA. > Deadnode detection > -- > > Key: HDFS-13571 > URL: https://issues.apache.org/jira/browse/HDFS-13571 > Project: Hadoop HDFS > Issue Type: Improvement > Components: hdfs-client >Affects Versions: 2.4.0, 2.6.0, 3.0.2 >Reporter: Gang Xie >Assignee: Lisheng Sun >Priority: Major > Fix For: 3.3.0 > > Attachments: DeadNodeDetectorDesign.pdf, HDFS-13571-2.6.diff, node > status machine.png > > > Currently, the information of the dead datanode in DFSInputStream in stored > locally. So, it could not be shared among the inputstreams of the same > DFSClient. In our production env, every days, some datanodes dies with > different causes. At this time, after the first inputstream blocked and > detect this, it could share this information to others in the same DFSClient, > thus, the ohter inputstreams are still blocked by the dead node for some > time, which could cause bad service latency. > To eliminate this impact from dead datanode, we designed a dead datanode > detector, which detect the dead ones in advance, and share this information > among all the inputstreams in the same client. This improvement has being > online for some months and works fine. So, we decide to port to the 3.0 (the > version used in our production env is 2.4 and 2.6). > I will do the porting work and upload the code later. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
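The idea in the description above, one stream detecting a dead node and every other stream of the same client avoiding it, can be sketched as follows. This is only an illustration under stated assumptions: `DeadNodeRegistry` and `InputStreamStub` are hypothetical names invented here, not the classes from the HDFS-13571 patch, and real detection (probing, timeouts, recovery of nodes) is left out.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

public class SharedDeadNodeSketch {

    /** Hypothetical client-wide registry of nodes known to be dead. */
    static final class DeadNodeRegistry {
        private final Set<String> deadNodes = ConcurrentHashMap.newKeySet();

        void markDead(String node) { deadNodes.add(node); }

        boolean isDead(String node) { return deadNodes.contains(node); }
    }

    /** Hypothetical stand-in for an input stream that picks a replica to read from. */
    static final class InputStreamStub {
        private final DeadNodeRegistry registry;

        InputStreamStub(DeadNodeRegistry registry) { this.registry = registry; }

        /** Returns the first replica not known to be dead, or null if none is left. */
        String chooseNode(List<String> replicas) {
            for (String node : replicas) {
                if (!registry.isDead(node)) {
                    return node;
                }
            }
            return null;
        }

        /** Called by a stream after a read from the node failed or timed out. */
        void reportDead(String node) { registry.markDead(node); }
    }

    public static void main(String[] args) {
        // Both streams share one registry, as streams of one client would.
        DeadNodeRegistry registry = new DeadNodeRegistry();
        InputStreamStub first = new InputStreamStub(registry);
        InputStreamStub second = new InputStreamStub(registry);
        List<String> replicas = Arrays.asList("dn1", "dn2", "dn3");

        first.reportDead("dn1");                          // first stream hits the dead node
        System.out.println(second.chooseNode(replicas));  // second stream skips it
    }
}
```

Keeping the registry at client scope rather than stream scope is the point of the sketch: with a per-stream set, `second` would still try `dn1` and block on it.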
[jira] [Updated] (HDFS-13571) Deadnode detection
[ https://issues.apache.org/jira/browse/HDFS-13571?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Lisheng Sun updated HDFS-13571: --- Release Note: When dead node blocks DFSInputStream,Deadnode detection can find it and share this information to other DFSInputStreams in the same DFSClient. Thus, these DFSInputStreams will not read from the dead node and be blocked by this dead node. > Deadnode detection > -- > > Key: HDFS-13571 > URL: https://issues.apache.org/jira/browse/HDFS-13571 > Project: Hadoop HDFS > Issue Type: Improvement > Components: hdfs-client >Affects Versions: 2.4.0, 2.6.0, 3.0.2 >Reporter: Gang Xie >Assignee: Lisheng Sun >Priority: Major > Fix For: 3.3.0 > > Attachments: DeadNodeDetectorDesign.pdf, HDFS-13571-2.6.diff, node > status machine.png > > > Currently, the information of the dead datanode in DFSInputStream in stored > locally. So, it could not be shared among the inputstreams of the same > DFSClient. In our production env, every days, some datanodes dies with > different causes. At this time, after the first inputstream blocked and > detect this, it could share this information to others in the same DFSClient, > thus, the ohter inputstreams are still blocked by the dead node for some > time, which could cause bad service latency. > To eliminate this impact from dead datanode, we designed a dead datanode > detector, which detect the dead ones in advance, and share this information > among all the inputstreams in the same client. This improvement has being > online for some months and works fine. So, we decide to port to the 3.0 (the > version used in our production env is 2.4 and 2.6). > I will do the porting work and upload the code later. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDFS-13571) Deadnode detection
[ https://issues.apache.org/jira/browse/HDFS-13571?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Lisheng Sun updated HDFS-13571: --- Summary: Deadnode detection (was: Dead DataNode Detector) > Deadnode detection > -- > > Key: HDFS-13571 > URL: https://issues.apache.org/jira/browse/HDFS-13571 > Project: Hadoop HDFS > Issue Type: Improvement > Components: hdfs-client >Affects Versions: 2.4.0, 2.6.0, 3.0.2 >Reporter: Gang Xie >Assignee: Lisheng Sun >Priority: Major > Fix For: 3.3.0 > > Attachments: DeadNodeDetectorDesign.pdf, HDFS-13571-2.6.diff, node > status machine.png > > > Currently, the information of the dead datanode in DFSInputStream in stored > locally. So, it could not be shared among the inputstreams of the same > DFSClient. In our production env, every days, some datanodes dies with > different causes. At this time, after the first inputstream blocked and > detect this, it could share this information to others in the same DFSClient, > thus, the ohter inputstreams are still blocked by the dead node for some > time, which could cause bad service latency. > To eliminate this impact from dead datanode, we designed a dead datanode > detector, which detect the dead ones in advance, and share this information > among all the inputstreams in the same client. This improvement has being > online for some months and works fine. So, we decide to port to the 3.0 (the > version used in our production env is 2.4 and 2.6). > I will do the porting work and upload the code later. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-9695) HTTPFS - CHECKACCESS operation missing
[ https://issues.apache.org/jira/browse/HDFS-9695?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16984415#comment-16984415 ] hemanthboyina commented on HDFS-9695: - updated the patch with test failures and findbugs fixed. please review. > HTTPFS - CHECKACCESS operation missing > -- > > Key: HDFS-9695 > URL: https://issues.apache.org/jira/browse/HDFS-9695 > Project: Hadoop HDFS > Issue Type: Bug >Reporter: Bert Hekman >Assignee: hemanthboyina >Priority: Major > Attachments: HDFS-9695.001.patch, HDFS-9695.002.patch, > HDFS-9695.003.patch, HDFS-9695.004.patch > > > Hi, > The CHECKACCESS operation seems to be missing in HTTPFS. I'm getting the > following error: > {code} > QueryParamException: java.lang.IllegalArgumentException: No enum constant > org.apache.hadoop.fs.http.client.HttpFSFileSystem.Operation.CHECKACCESS > {code} > A quick look into the org.apache.hadoop.fs.http.client.HttpFSFileSystem class > reveals that CHECKACCESS is not defined at all. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
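The error quoted in the description is what `Enum.valueOf` produces for a name that has no matching constant. A minimal sketch of that failure mode follows; the `Operation` enum here is a hypothetical three-value stand-in, not the real `HttpFSFileSystem.Operation`, and `parse` is an illustrative helper.

```java
public class MissingEnumConstantSketch {

    /** Hypothetical subset of operations; CHECKACCESS is deliberately absent. */
    enum Operation { OPEN, GETFILESTATUS, LISTSTATUS }

    /** Resolves an operation name, reporting the lookup failure as text. */
    static String parse(String name) {
        try {
            return Operation.valueOf(name).name();
        } catch (IllegalArgumentException e) {
            // Enum.valueOf throws when no constant has the given name.
            return "error: " + e.getMessage();
        }
    }

    public static void main(String[] args) {
        System.out.println(parse("OPEN"));
        System.out.println(parse("CHECKACCESS"));
    }
}
```

In this sketch the lookup only succeeds once the constant is declared in the enum, which matches the reporter's observation that CHECKACCESS is not defined in the class.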
[jira] [Updated] (HDFS-9695) HTTPFS - CHECKACCESS operation missing
[ https://issues.apache.org/jira/browse/HDFS-9695?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] hemanthboyina updated HDFS-9695: Attachment: HDFS-9695.004.patch > HTTPFS - CHECKACCESS operation missing > -- > > Key: HDFS-9695 > URL: https://issues.apache.org/jira/browse/HDFS-9695 > Project: Hadoop HDFS > Issue Type: Bug >Reporter: Bert Hekman >Assignee: hemanthboyina >Priority: Major > Attachments: HDFS-9695.001.patch, HDFS-9695.002.patch, > HDFS-9695.003.patch, HDFS-9695.004.patch > > > Hi, > The CHECKACCESS operation seems to be missing in HTTPFS. I'm getting the > following error: > {code} > QueryParamException: java.lang.IllegalArgumentException: No enum constant > org.apache.hadoop.fs.http.client.HttpFSFileSystem.Operation.CHECKACCESS > {code} > A quick look into the org.apache.hadoop.fs.http.client.HttpFSFileSystem class > reveals that CHECKACCESS is not defined at all. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-14961) [SBN read] Prevent ZKFC changing Observer Namenode state
[ https://issues.apache.org/jira/browse/HDFS-14961?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16984407#comment-16984407 ] Hudson commented on HDFS-14961: --- SUCCESS: Integrated in Jenkins build Hadoop-trunk-Commit #17708 (See [https://builds.apache.org/job/Hadoop-trunk-Commit/17708/]) HDFS-14961. [SBN read] Prevent ZKFC changing Observer Namenode state. (ayushsaxena: rev 46166bd8d1be6f25bd38703fb9b0a417e3ef750b) * (edit) hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/tools/TestDFSZKFailoverController.java * (edit) hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/NameNodeRpcServer.java > [SBN read] Prevent ZKFC changing Observer Namenode state > > > Key: HDFS-14961 > URL: https://issues.apache.org/jira/browse/HDFS-14961 > Project: Hadoop HDFS > Issue Type: Bug >Reporter: Íñigo Goiri >Assignee: Ayush Saxena >Priority: Major > Fix For: 3.3.0 > > Attachments: HDFS-14961-01.patch, HDFS-14961-02.patch, > HDFS-14961-03.patch, HDFS-14961-04.patch, ZKFC-TEST-14961.patch > > > HDFS-14130 made ZKFC aware of the Observer Namenode and hence allows ZKFC > running along with the observer NOde. > The Observer namenode isn't suppose to be part of ZKFC election process. > But if the Namenode was part of election, before turning into Observer by > transitionToObserver Command. The ZKFC still sends instruction to the > Namenode as a result of previous participation and sometimes tend to change > the state of Observer to Standby. > This is also the reason for failure in TestDFSZKFailoverController. > TestDFSZKFailoverController has been consistently failing with a time out > waiting in testManualFailoverWithDFSHAAdmin(). In particular > {{waitForHAState(1, HAServiceState.OBSERVER);}}. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDFS-14961) [SBN read] Prevent ZKFC changing Observer Namenode state
[ https://issues.apache.org/jira/browse/HDFS-14961?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ayush Saxena updated HDFS-14961: Fix Version/s: 3.3.0 Hadoop Flags: Reviewed Resolution: Fixed Status: Resolved (was: Patch Available) > [SBN read] Prevent ZKFC changing Observer Namenode state > > > Key: HDFS-14961 > URL: https://issues.apache.org/jira/browse/HDFS-14961 > Project: Hadoop HDFS > Issue Type: Bug >Reporter: Íñigo Goiri >Assignee: Ayush Saxena >Priority: Major > Fix For: 3.3.0 > > Attachments: HDFS-14961-01.patch, HDFS-14961-02.patch, > HDFS-14961-03.patch, HDFS-14961-04.patch, ZKFC-TEST-14961.patch > > > HDFS-14130 made ZKFC aware of the Observer Namenode and hence allows ZKFC > running along with the observer NOde. > The Observer namenode isn't suppose to be part of ZKFC election process. > But if the Namenode was part of election, before turning into Observer by > transitionToObserver Command. The ZKFC still sends instruction to the > Namenode as a result of previous participation and sometimes tend to change > the state of Observer to Standby. > This is also the reason for failure in TestDFSZKFailoverController. > TestDFSZKFailoverController has been consistently failing with a time out > waiting in testManualFailoverWithDFSHAAdmin(). In particular > {{waitForHAState(1, HAServiceState.OBSERVER);}}. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-14961) [SBN read] Prevent ZKFC changing Observer Namenode state
[ https://issues.apache.org/jira/browse/HDFS-14961?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16984398#comment-16984398 ] Ayush Saxena commented on HDFS-14961: - Committed to trunk. Thanx [~elgoiri] for the report and review, [~vinayakumarb], [~ferhui] and [~csun] for the reviews!!! > [SBN read] Prevent ZKFC changing Observer Namenode state > > > Key: HDFS-14961 > URL: https://issues.apache.org/jira/browse/HDFS-14961 > Project: Hadoop HDFS > Issue Type: Bug >Reporter: Íñigo Goiri >Assignee: Ayush Saxena >Priority: Major > Attachments: HDFS-14961-01.patch, HDFS-14961-02.patch, > HDFS-14961-03.patch, HDFS-14961-04.patch, ZKFC-TEST-14961.patch > > > HDFS-14130 made ZKFC aware of the Observer Namenode and hence allows ZKFC > running along with the observer NOde. > The Observer namenode isn't suppose to be part of ZKFC election process. > But if the Namenode was part of election, before turning into Observer by > transitionToObserver Command. The ZKFC still sends instruction to the > Namenode as a result of previous participation and sometimes tend to change > the state of Observer to Standby. > This is also the reason for failure in TestDFSZKFailoverController. > TestDFSZKFailoverController has been consistently failing with a time out > waiting in testManualFailoverWithDFSHAAdmin(). In particular > {{waitForHAState(1, HAServiceState.OBSERVER);}}. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Created] (HDFS-15021) RBF: Delegation Token can't remove correctly in absence of cancelToken and restart the router.
Weidong Duan created HDFS-15021:

    Summary: RBF: Delegation Token can't remove correctly in absence of cancelToken and restart the router.
    Key: HDFS-15021
    URL: https://issues.apache.org/jira/browse/HDFS-15021
    Project: Hadoop HDFS
    Issue Type: Sub-task
    Reporter: Weidong Duan

The ZKDelegationTokenSecretManager does not remove expired delegation tokens (DTs) from ZooKeeper as expected when the Router is restarted without the method `ZKDelegationTokenSecretManager#cancelToken` having been invoked. This leaves many stale DTs in ZooKeeper and may cause performance problems for the Router.

I think this is a bug and should be resolved later. Is that right?
[jira] [Commented] (HDFS-15009) FSCK "-list-corruptfileblocks" return Invalid Entries
[ https://issues.apache.org/jira/browse/HDFS-15009?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16984367#comment-16984367 ]

hemanthboyina commented on HDFS-15009:

Thanks for the suggestion, [~ayushtkn]. Updated the patch, please review.

> FSCK "-list-corruptfileblocks" return Invalid Entries
>
> Key: HDFS-15009
> URL: https://issues.apache.org/jira/browse/HDFS-15009
> Project: Hadoop HDFS
> Issue Type: Bug
> Reporter: hemanthboyina
> Assignee: hemanthboyina
> Priority: Major
> Attachments: HDFS-15009.001.patch, HDFS-15009.002.patch, HDFS-15009.003.patch, HDFS-15009.004.patch
>
> Scenario: we have two directories, dir1 and dir10, and only dir10 has
> corrupt files.
> If we now run -list-corruptfileblocks for dir1, the corrupt file count shown
> for dir1 is that of dir10.
> {code:java}
> while (blkIterator.hasNext()) {
>   BlockInfo blk = blkIterator.next();
>   final INodeFile inode = getBlockCollection(blk);
>   skip++;
>   if (inode != null) {
>     String src = inode.getFullPathName();
>     if (src.startsWith(path)){
>       corruptFiles.add(new CorruptFileBlockInfo(src, blk));
>       count++;
>       if (count >= DEFAULT_MAX_CORRUPT_FILEBLOCKS_RETURNED)
>         break;
>     }
>   }
> }
> {code}
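The scenario above follows from matching paths with a plain string prefix: "/dir10/file" starts with "/dir1". The sketch below contrasts that check with a separator-aware one. It is an illustration only: `naiveMatch` and `isUnder` are hypothetical helpers, and the check in the attached patches may well differ.

```java
public class PathPrefixSketch {

    /** The check from the quoted loop: a plain string prefix test. */
    static boolean naiveMatch(String src, String path) {
        return src.startsWith(path);
    }

    /** Separator-aware variant: src must equal path or lie below it. */
    static boolean isUnder(String src, String path) {
        // Drop a trailing slash so "/dir1/" and "/dir1" behave the same.
        String base = (path.length() > 1 && path.endsWith("/"))
            ? path.substring(0, path.length() - 1)
            : path;
        if (base.equals("/")) {
            return src.startsWith("/");
        }
        return src.equals(base) || src.startsWith(base + "/");
    }

    public static void main(String[] args) {
        System.out.println(naiveMatch("/dir10/file", "/dir1"));
        System.out.println(isUnder("/dir10/file", "/dir1"));
        System.out.println(isUnder("/dir1/file", "/dir1"));
    }
}
```

Appending the separator before the prefix test is what keeps sibling directories with a common name prefix apart.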
[jira] [Updated] (HDFS-15009) FSCK "-list-corruptfileblocks" return Invalid Entries
[ https://issues.apache.org/jira/browse/HDFS-15009?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

hemanthboyina updated HDFS-15009:
    Attachment: HDFS-15009.004.patch

> FSCK "-list-corruptfileblocks" return Invalid Entries
>
> Key: HDFS-15009
> URL: https://issues.apache.org/jira/browse/HDFS-15009
> Project: Hadoop HDFS
> Issue Type: Bug
> Reporter: hemanthboyina
> Assignee: hemanthboyina
> Priority: Major
> Attachments: HDFS-15009.001.patch, HDFS-15009.002.patch, HDFS-15009.003.patch, HDFS-15009.004.patch
>
> Scenario: we have two directories, dir1 and dir10, and only dir10 has
> corrupt files.
> If we now run -list-corruptfileblocks for dir1, the corrupt file count shown
> for dir1 is that of dir10.
> {code:java}
> while (blkIterator.hasNext()) {
>   BlockInfo blk = blkIterator.next();
>   final INodeFile inode = getBlockCollection(blk);
>   skip++;
>   if (inode != null) {
>     String src = inode.getFullPathName();
>     if (src.startsWith(path)){
>       corruptFiles.add(new CorruptFileBlockInfo(src, blk));
>       count++;
>       if (count >= DEFAULT_MAX_CORRUPT_FILEBLOCKS_RETURNED)
>         break;
>     }
>   }
> }
> {code}
[jira] [Commented] (HDFS-14961) [SBN read] Prevent ZKFC changing Observer Namenode state
[ https://issues.apache.org/jira/browse/HDFS-14961?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16984338#comment-16984338 ]

Vinayakumar B commented on HDFS-14961:

Thanks [~ayushtkn] for the analysis and the fix. The fix looks good to me. +1.

There is already a check in the HealthMonitor thread that quits the election when the NameNode state is found to be OBSERVER.
{code:java}
if (changedState == HAServiceState.OBSERVER) {
  elector.quitElection(true);
  serviceState = HAServiceState.OBSERVER;
  return;
}
{code}
But this is asynchronous monitoring that happens every 1 second. In case of a manual transition, the state can change directly in the NameNode, so ZKFC only syncs during monitoring and then quits the election.

As [~ferhui] suggested, checking the state before joining the election also doesn't hurt. It can be added as a separate Improvement Jira, as [~ayushtkn] already said.
{code:java}
if (serviceState != HAServiceState.OBSERVER) {
  elector.joinElection(targetToData(localTarget));
}
{code}

> [SBN read] Prevent ZKFC changing Observer Namenode state
>
> Key: HDFS-14961
> URL: https://issues.apache.org/jira/browse/HDFS-14961
> Project: Hadoop HDFS
> Issue Type: Bug
> Reporter: Íñigo Goiri
> Assignee: Ayush Saxena
> Priority: Major
> Attachments: HDFS-14961-01.patch, HDFS-14961-02.patch, HDFS-14961-03.patch, HDFS-14961-04.patch, ZKFC-TEST-14961.patch
>
> HDFS-14130 made ZKFC aware of the Observer NameNode and hence allows ZKFC to
> run alongside the Observer node.
> The Observer NameNode isn't supposed to be part of the ZKFC election process.
> But if the NameNode was part of the election before being turned into an
> Observer by the transitionToObserver command, the ZKFC still sends
> instructions to the NameNode as a result of its previous participation and
> sometimes tends to change the state of the Observer to Standby.
> This is also the reason for the failure in TestDFSZKFailoverController.
> TestDFSZKFailoverController has been consistently failing with a timeout
> while waiting in testManualFailoverWithDFSHAAdmin(), in particular at
> {{waitForHAState(1, HAServiceState.OBSERVER);}}.
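The two checks discussed in the comment above (quit the election when the monitored state becomes OBSERVER, and do not join while the state is OBSERVER) can be put side by side in a self-contained sketch. Everything here is a stand-in: the local `HAServiceState` enum lists only the values needed, and `Elector` is a hypothetical recorder, not Hadoop's `ActiveStandbyElector`.

```java
public class ObserverElectionGuardSketch {

    /** Local stand-in for Hadoop's HAServiceState; only the values needed here. */
    enum HAServiceState { ACTIVE, STANDBY, OBSERVER }

    /** Hypothetical minimal elector that only records election membership. */
    static final class Elector {
        private boolean inElection;

        void joinElection() { inElection = true; }

        void quitElection() { inElection = false; }

        boolean isInElection() { return inElection; }
    }

    /** Monitor-side check: leave the election once the node is an Observer. */
    static void onStateChange(HAServiceState changedState, Elector elector) {
        if (changedState == HAServiceState.OBSERVER) {
            elector.quitElection();
        }
    }

    /** Suggested guard: never join (or rejoin) while the node is an Observer. */
    static void joinIfEligible(HAServiceState serviceState, Elector elector) {
        if (serviceState != HAServiceState.OBSERVER) {
            elector.joinElection();
        }
    }

    public static void main(String[] args) {
        Elector elector = new Elector();
        joinIfEligible(HAServiceState.STANDBY, elector);   // joins
        onStateChange(HAServiceState.OBSERVER, elector);   // quits
        joinIfEligible(HAServiceState.OBSERVER, elector);  // stays out
        System.out.println(elector.isInElection());
    }
}
```

The guard on the join path closes the window between a manual transition and the next monitoring pass, which is the gap the comment points out.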
[jira] [Updated] (HDFS-15020) Add a test case of storage type quota to TestHdfsAdmin.
[ https://issues.apache.org/jira/browse/HDFS-15020?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jinglun updated HDFS-15020:
    Resolution: Not A Problem
    Status: Resolved (was: Patch Available)

Hi [~ayushtkn], thanks for the reminder! Since HdfsAdmin.setQuotaByStorageType() does nothing but call DistributedFileSystem.setQuotaByStorageType(src, type, quota), I think TestQuota.testQuotaByStorageType() already covers it. I'll close this jira.

> Add a test case of storage type quota to TestHdfsAdmin.
>
> Key: HDFS-15020
> URL: https://issues.apache.org/jira/browse/HDFS-15020
> Project: Hadoop HDFS
> Issue Type: Improvement
> Reporter: Jinglun
> Assignee: Jinglun
> Priority: Minor
> Attachments: HDFS-15020.001.patch
[jira] [Commented] (HDFS-12348) disable removing blocks to trash while rolling upgrade
[ https://issues.apache.org/jira/browse/HDFS-12348?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16984287#comment-16984287 ]

lindongdong commented on HDFS-12348:
------------------------------------

Hi [~surendrasingh], thanks for your patch.

I found a problem with the patch:

After we do rolling upgrade prepare, the old DN will move deleted files to trash. With this patch, the new DN will never delete the trash dir.

> disable removing blocks to trash while rolling upgrade
> ------------------------------------------------------
>
> Key: HDFS-12348
> URL: https://issues.apache.org/jira/browse/HDFS-12348
> Project: Hadoop HDFS
> Issue Type: New Feature
> Components: datanode
> Reporter: Jiandan Yang
> Assignee: Jiandan Yang
> Priority: Major
> Attachments: HDFS-12348.001.patch, HDFS-12348.002.patch, HDFS-12348.003.patch
>
>
> During a rolling upgrade, the DataNode moves block files and meta files to
> trash and deletes them only when finalize is executed.
> This can fill the DataNode's disks, because
> (1) files are created and deleted frequently (e.g., HBase compaction);
> (2) the cluster is very big, and a rolling upgrade often lasts several days.
> Our current solution is to clean the trash by hand, but this is very
> dangerous in a production environment.
> We think disabling the DataNode trash may be a good way to keep the disks
> from filling up.
[jira] [Commented] (HDFS-13811) RBF: Race condition between router admin quota update and periodic quota update service
[ https://issues.apache.org/jira/browse/HDFS-13811?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16984238#comment-16984238 ]

Yiqun Lin commented on HDFS-13811:
----------------------------------

[~LiJinglun], sorry for the delayed review. I was busy reviewing other patches. I will give my review comments in the next few days.

> RBF: Race condition between router admin quota update and periodic quota update service
> ----------------------------------------------------------------------------------------
>
> Key: HDFS-13811
> URL: https://issues.apache.org/jira/browse/HDFS-13811
> Project: Hadoop HDFS
> Issue Type: Sub-task
> Reporter: Dibyendu Karmakar
> Assignee: Jinglun
> Priority: Major
> Attachments: HDFS-13811-000.patch, HDFS-13811-HDFS-13891-000.patch, HDFS-13811.001.patch, HDFS-13811.002.patch, HDFS-13811.003.patch
>
>
> If we try to update the quota of an existing mount entry while the periodic
> quota update service is running on the same mount entry, the mount table is
> left in an _inconsistent state_.
> The transactions are:
> A - Quota update service fetches the mount table entries.
> B - Quota update service updates the mount table with the current usage.
> A' - User updates the quota using the admin cmd.
> With the transaction sequence [ A A' B ], the quota update service
> overwrites the mount table with the old quota value.
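The [ A A' B ] interleaving described in HDFS-13811 above is a classic lost update, and it can be replayed deterministically in a few lines. The sketch below is illustrative only: `QuotaRaceSketch`, `MountEntry`, the `/mnt` path and the numeric values are all hypothetical, the in-memory map merely stands in for the router's mount table store, and the "merged" variant is one possible mitigation, not necessarily what the attached patches do.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

public class QuotaRaceSketch {
    // Hypothetical, simplified mount entry: a quota limit plus the current usage.
    static final class MountEntry {
        final long quota;
        final long usage;
        MountEntry(long quota, long usage) {
            this.quota = quota;
            this.usage = usage;
        }
    }

    // Hypothetical in-memory stand-in for the mount table store.
    private final ConcurrentMap<String, MountEntry> table = new ConcurrentHashMap<>();

    // Replays [A, A', B] where B writes back the whole entry it read in A.
    static long replayUnsafe() {
        QuotaRaceSketch t = new QuotaRaceSketch();
        t.table.put("/mnt", new MountEntry(100, 0));
        MountEntry snapshot = t.table.get("/mnt");                          // A: service reads the entry
        t.table.put("/mnt", new MountEntry(200, t.table.get("/mnt").usage)); // A': admin sets quota to 200
        t.table.put("/mnt", new MountEntry(snapshot.quota, 42));            // B: stale quota written back
        return t.table.get("/mnt").quota;
    }

    // Same order, but B only merges the usage into whatever entry is current.
    static long replayMerged() {
        QuotaRaceSketch t = new QuotaRaceSketch();
        t.table.put("/mnt", new MountEntry(100, 0));
        t.table.get("/mnt");                                                // A: read (result unused on purpose)
        t.table.put("/mnt", new MountEntry(200, t.table.get("/mnt").usage)); // A': admin sets quota to 200
        t.table.compute("/mnt", (k, cur) -> new MountEntry(cur.quota, 42)); // B: atomic per-key merge
        return t.table.get("/mnt").quota;
    }

    public static void main(String[] args) {
        System.out.println("whole-entry write-back, final quota: " + replayUnsafe());
        System.out.println("usage-only merge, final quota: " + replayMerged());
    }
}
```

With whole-entry write-back the admin's new quota (200) is replaced by the stale value (100); with the usage-only merge the admin's value survives, because step B never writes a quota it read earlier.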