[jira] [Updated] (HDFS-15024) [SBN read] In FailoverOnNetworkExceptionRetry , Number of NameNodes as a condition of calculation of sleep time

2019-11-28 Thread Ayush Saxena (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-15024?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ayush Saxena updated HDFS-15024:

Description: 
When Observer NameNode (ONN) reads are enabled, the client configuration 
lists three NameNodes, for example:

dfs.ha.namenodes.ns1
nn2,nn3,nn1

Currently:
nn2 is in standby state
nn3 is in observer state
nn1 is in active state

When the user runs an HDFS operation such as
./bin/hadoop --loglevel debug fs 
-Ddfs.client.failover.proxy.provider.ns1=org.apache.hadoop.hdfs.server.namenode.ha.ObserverReadProxyProvider
 -mkdir /user/haiyang1/test8

the msync call must be served by nn1, the active NameNode. The client 
actually connects to nn2 first, so a failover is required. The next 
connection, to nn3, cannot serve the request either, so a second failover is 
needed, and this time the client sleeps before failing over. The request 
finally succeeds on nn1, but only after that sleep.

In FailoverOnNetworkExceptionRetry, getFailoverOrRetrySleepTime currently 
calculates a sleep time whenever more than one failover is performed.

I think it is more reasonable to use the number of NameNodes as the condition 
for calculating the sleep time. That is, in the test above, the failover away 
from nn3 should not need to sleep; the client should connect directly to the 
next NameNode.

See client_error.log for details.
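
The condition proposed above can be sketched as follows. This is a simplified, hypothetical model, not the Hadoop source and not the attached HDFS-15024.001.patch: the class FailoverSleepSketch, its method names and the delay values are assumptions made for illustration, and the backoff deliberately omits random jitter so that the results are deterministic.

```java
/**
 * Hypothetical sketch of the sleep-time condition discussed in HDFS-15024.
 * All names and parameters are illustrative assumptions, not Hadoop's API.
 */
final class FailoverSleepSketch {

  /** Exponential backoff without jitter: base * 2^attempt, capped at max. */
  public static long backoffMillis(long baseMillis, long maxMillis, int attempt) {
    int shift = Math.min(attempt, 30); // keep the shift small to avoid overflow
    return Math.min(maxMillis, baseMillis * (1L << shift));
  }

  /** Behaviour described in the issue: only the very first failover is free. */
  public static long currentSleepMillis(int failovers, long baseMillis,
      long maxMillis) {
    return failovers == 0 ? 0 : backoffMillis(baseMillis, maxMillis, failovers);
  }

  /**
   * Proposed behaviour: do not sleep until every other configured NameNode
   * has been tried once, i.e. until failovers reaches numNameNodes - 1.
   */
  public static long proposedSleepMillis(int failovers, int numNameNodes,
      long baseMillis, long maxMillis) {
    if (failovers < numNameNodes - 1) {
      return 0;
    }
    return backoffMillis(baseMillis, maxMillis, failovers);
  }

  public static void main(String[] args) {
    // Three NameNodes (nn2, nn3, nn1); failovers counts those already done.
    for (int failovers = 0; failovers < 3; failovers++) {
      System.out.println("failovers=" + failovers
          + " current=" + currentSleepMillis(failovers, 500, 15000)
          + " proposed=" + proposedSleepMillis(failovers, 3, 500, 15000));
    }
  }
}
```

With three NameNodes, the failover away from nn3 (one failover already done) sleeps under the first rule but not under the second; with two NameNodes the two rules give the same result.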




> [SBN read] In FailoverOnNetworkExceptionRetry , Number of NameNodes as a 
> condition of calculation of sleep time
> ---
>
> Key: HDFS-15024
> URL: https://issues.apache.org/jira/browse/HDFS-15024
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Affects Versions: 2.10.0, 3.3.0, 3.2.1
>Reporter: huhaiyang
>Priority: Major
> Attachments: HDFS-15024.001.patch, client_error.log
>
>



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-13811) RBF: Race condition between router admin quota update and periodic quota update service

2019-11-28 Thread Jinglun (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-13811?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16984780#comment-16984780
 ] 

Jinglun commented on HDFS-13811:


Hi [~linyiqun], thanks for your comments! I hadn't thought about the web UI 
before, so thanks for the reminder.

I haven't looked into the code yet, so this is just a guess: is the usage 
shown in the web UI currently fetched from the state store? Maybe we could 
display the usage from the local cache (RouterQuotaManager) instead?

I had a look at [~dibyendu_hadoop]'s approach, but not in detail. I'll try to 
figure out how it works and how to fix the web UI this weekend.

> RBF: Race condition between router admin quota update and periodic quota 
> update service
> ---
>
> Key: HDFS-13811
> URL: https://issues.apache.org/jira/browse/HDFS-13811
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Reporter: Dibyendu Karmakar
>Assignee: Jinglun
>Priority: Major
> Attachments: HDFS-13811-000.patch, HDFS-13811-HDFS-13891-000.patch, 
> HDFS-13811.001.patch, HDFS-13811.002.patch, HDFS-13811.003.patch
>
>
> If we try to update the quota of an existing mount entry while the periodic 
> quota update service is running on the same mount entry, the mount table is 
> left in an _inconsistent state._
> The transactions are:
> A - Quota update service fetches the mount table entries.
> B - Quota update service updates the mount table with the current usage.
> A' - User updates the quota using the admin command.
> With the transaction sequence [ A A' B ], the quota update service 
> overwrites the mount table with the old quota value.
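
The [ A A' B ] interleaving described above can be replayed deterministically in a few lines. This is a minimal sketch under stated assumptions: QuotaRaceSketch, Entry, the "/mnt" path and the numbers are hypothetical stand-ins, not the RBF mount table API, and the second method shows only one possible direction (writing back usage against the latest record), not the change made in the attached patches.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical replay of the A, A', B sequence; not the RBF API. */
final class QuotaRaceSketch {

  /** Stand-in for a mount table record: a quota limit plus current usage. */
  static final class Entry {
    final long quota;
    final long usage;

    Entry(long quota, long usage) {
      this.quota = quota;
      this.usage = usage;
    }
  }

  /** B writes back the whole record it read in A, so A' is lost. */
  public static Entry replayStaleWrite() {
    Map<String, Entry> store = new HashMap<>();
    store.put("/mnt", new Entry(100, 0));

    Entry snapshot = store.get("/mnt");                 // A: service reads
    store.put("/mnt", new Entry(200, snapshot.usage));  // A': admin sets 200
    store.put("/mnt", new Entry(snapshot.quota, 42));   // B: stale quota 100
    return store.get("/mnt");
  }

  /** B updates only the usage on top of the latest record. */
  public static Entry replayUsageOnlyWrite() {
    Map<String, Entry> store = new HashMap<>();
    store.put("/mnt", new Entry(100, 0));

    Entry snapshot = store.get("/mnt");                 // A: service reads
    store.put("/mnt", new Entry(200, snapshot.usage));  // A': admin sets 200
    store.compute("/mnt",
        (path, latest) -> new Entry(latest.quota, 42)); // B: keeps quota 200
    return store.get("/mnt");
  }

  public static void main(String[] args) {
    System.out.println("stale write quota: " + replayStaleWrite().quota);
    System.out.println("usage-only quota:  " + replayUsageOnlyWrite().quota);
  }
}
```

In the first replay the admin's new quota is silently replaced by the value the service read earlier; in the second, the usage is refreshed while the admin's quota survives.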






[jira] [Commented] (HDFS-15024) [SBN read] In FailoverOnNetworkExceptionRetry , Number of NameNodes as a condition of calculation of sleep time

2019-11-28 Thread huhaiyang (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-15024?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16984776#comment-16984776
 ] 

huhaiyang commented on HDFS-15024:
--

[~weichiu] [~vagarychen] Hello, how do you handle this problem in practice?

Thanks!


> [SBN read] In FailoverOnNetworkExceptionRetry , Number of NameNodes as a 
> condition of calculation of sleep time
> ---
>
> Key: HDFS-15024
> URL: https://issues.apache.org/jira/browse/HDFS-15024
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Affects Versions: 2.10.0, 3.3.0, 3.2.1
>Reporter: huhaiyang
>Priority: Major
> Attachments: HDFS-15024.001.patch, client_error.log
>
>






[jira] [Commented] (HDFS-15024) [SBN read] In FailoverOnNetworkExceptionRetry , Number of NameNodes as a condition of calculation of sleep time

2019-11-28 Thread huhaiyang (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-15024?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16984772#comment-16984772
 ] 

huhaiyang commented on HDFS-15024:
--

./bin/hadoop --loglevel debug fs 
-Ddfs.client.failover.proxy.provider.ns1=org.apache.hadoop.hdfs.server.namenode.ha.ObserverReadProxyProvider
  -mkdir /user/haiyang1/test8
...
19/11/29 14:26:55 DEBUG ipc.Client: The ping interval is 6 ms.
19/11/29 14:26:55 DEBUG ipc.Client: Connecting to nn2/xx:8020
19/11/29 14:26:55 DEBUG ipc.Client: IPC Client (1337335626) connection to 
nn2/xx:8020 from hadoop: starting, having connections 1
19/11/29 14:26:55 DEBUG ipc.Client: IPC Client (1337335626) connection to 
nn2/xx:8020 from hadoop sending #0 
org.apache.hadoop.hdfs.protocol.ClientProtocol.msync
19/11/29 14:26:55 DEBUG ipc.Client: IPC Client (1337335626) connection to 
nn2/xx:8020 from hadoop got value #0
19/11/29 14:26:55 DEBUG retry.RetryInvocationHandler: 
org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.ipc.StandbyException): 
Operation category WRITE is not supported in state standby. Visit 
https://s.apache.org/sbnn-error
at 
org.apache.hadoop.hdfs.server.namenode.ha.StandbyState.checkOperation(StandbyState.java:98)
at 
org.apache.hadoop.hdfs.server.namenode.NameNode$NameNodeHAContext.checkOperation(NameNode.java:2018)
at 
org.apache.hadoop.hdfs.server.namenode.FSNamesystem.checkOperation(FSNamesystem.java:1461)
at 
org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.msync(NameNodeRpcServer.java:1384)
at 
org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.msync(ClientNamenodeProtocolServerSideTranslatorPB.java:1907)
at 
org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)
at 
org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:531)
at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:1070)
at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:928)
at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:863)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:422)
at 
org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1903)
at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2815)
, while invoking $Proxy4.getFileInfo over 
[nn3/xx:8020,nn2/xx:8020,nn1/xx:8020]. Trying to failover immediately.
org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.ipc.StandbyException): 
Operation category WRITE is not supported in state standby. Visit 
https://s.apache.org/sbnn-error
at 
org.apache.hadoop.hdfs.server.namenode.ha.StandbyState.checkOperation(StandbyState.java:98)
at 
org.apache.hadoop.hdfs.server.namenode.NameNode$NameNodeHAContext.checkOperation(NameNode.java:2018)
at 
org.apache.hadoop.hdfs.server.namenode.FSNamesystem.checkOperation(FSNamesystem.java:1461)
at 
org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.msync(NameNodeRpcServer.java:1384)
at 
org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.msync(ClientNamenodeProtocolServerSideTranslatorPB.java:1907)
at 
org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)
at 
org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:531)
at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:1070)
at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:928)
at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:863)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:422)
at 
org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1903)
at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2815)

at org.apache.hadoop.ipc.Client.getRpcResponse(Client.java:1543)
at org.apache.hadoop.ipc.Client.call(Client.java:1489)
at org.apache.hadoop.ipc.Client.call(Client.java:1388)
at 
org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:233)
at 
org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:118)
at com.sun.proxy.$Proxy15.msync(Unknown Source)
at 
org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.msync(ClientNamenodeProtocolTranslatorPB.java:1958)
at 
org.apache.hadoop.hdfs.server.namenode.ha.ObserverReadProxyProvider.initializeMsync(ObserverReadProxyProvider.java:318)
at 

[jira] [Updated] (HDFS-15024) [SBN read] In FailoverOnNetworkExceptionRetry , Number of NameNodes as a condition of calculation of sleep time

2019-11-28 Thread huhaiyang (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-15024?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

huhaiyang updated HDFS-15024:
-
Description: 
{code:java}
When Observer NameNode (ONN) reads are enabled, the client configuration 
lists three NameNodes, for example:

dfs.ha.namenodes.ns1
nn2,nn3,nn1

Currently:
nn2 is in standby state
nn3 is in observer state
nn1 is in active state

When the user runs an HDFS operation such as
./bin/hadoop --loglevel debug fs 
-Ddfs.client.failover.proxy.provider.ns1=org.apache.hadoop.hdfs.server.namenode.ha.ObserverReadProxyProvider
 -mkdir /user/haiyang1/test8

the msync call must be served by nn1, the active NameNode. The client 
actually connects to nn2 first, so a failover is required. The next 
connection, to nn3, cannot serve the request either, so a second failover is 
needed, and this time the client sleeps before failing over. The request 
finally succeeds on nn1, but only after that sleep.

In FailoverOnNetworkExceptionRetry, getFailoverOrRetrySleepTime currently 
calculates a sleep time whenever more than one failover is performed.

I think it is more reasonable to use the number of NameNodes as the condition 
for calculating the sleep time. That is, in the test above, the failover away 
from nn3 should not need to sleep; the client should connect directly to the 
next NameNode.

See client_error.log for details.



{code}



> [SBN read] In FailoverOnNetworkExceptionRetry , Number of NameNodes as a 
> condition of calculation of sleep time
> ---
>
> Key: HDFS-15024
> URL: https://issues.apache.org/jira/browse/HDFS-15024
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Affects Versions: 2.10.0, 3.3.0, 3.2.1
>Reporter: huhaiyang
>Priority: Major
> Attachments: HDFS-15024.001.patch, client_error.log
>
>






[jira] [Comment Edited] (HDFS-14986) ReplicaCachingGetSpaceUsed throws ConcurrentModificationException

2019-11-28 Thread fanghanyun (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-14986?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16984766#comment-16984766
 ] 

fanghanyun edited comment on HDFS-14986 at 11/29/19 7:05 AM:
-

Hadoop version: 2.6.0-cdh5.13.1

public Set deepCopyReplica(String bpid) throws IOException {
  // Set replicas = new HashSet<>(volumeMap.replicas(bpid) == null
  //     ? Collections.EMPTY_SET : volumeMap.replicas(bpid));
  Set replicas = null;
  try (AutoCloseableLock lock = datasetLock.acquire()) {
    replicas = new HashSet<>(volumeMap.replicas(bpid) == null
        ? Collections.EMPTY_SET : volumeMap.replicas(bpid));
  }

Cannot resolve symbol 'datasetLock'
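
For context, the snippet above takes the copy while holding a lock so that no writer can modify the set during the copy. A minimal, self-contained sketch of that pattern follows. Everything in it is an assumption for illustration: ReplicaSnapshotSketch, the String block ids and the ReentrantLock are stand-ins, not FsDatasetImpl. On a branch that has no datasetLock field, the copy would presumably need to be guarded by whatever lock or monitor that branch uses for writes to the replica map, which is not confirmed here.

```java
import java.util.Collections;
import java.util.HashSet;
import java.util.Set;
import java.util.TreeSet;
import java.util.concurrent.locks.ReentrantLock;

/** Hypothetical stand-in for a dataset whose replica set is lock-protected. */
final class ReplicaSnapshotSketch {

  // Plays the role that datasetLock plays in the snippet above (assumption).
  private final ReentrantLock lock = new ReentrantLock();
  private final Set<String> replicas = new TreeSet<>();

  /** Writers take the same lock that readers use for copying. */
  public void add(String blockId) {
    lock.lock();
    try {
      replicas.add(blockId);
    } finally {
      lock.unlock();
    }
  }

  /** Copies the set under the lock and returns a read-only snapshot. */
  public Set<String> deepCopy() {
    lock.lock();
    try {
      return Collections.unmodifiableSet(new HashSet<>(replicas));
    } finally {
      lock.unlock();
    }
  }

  public static void main(String[] args) {
    ReplicaSnapshotSketch dataset = new ReplicaSnapshotSketch();
    dataset.add("blk_1");
    Set<String> snapshot = dataset.deepCopy();
    dataset.add("blk_2"); // does not affect the earlier snapshot
    System.out.println(snapshot.size() + " vs " + dataset.deepCopy().size());
  }
}
```

The important property is that the writer and the copier share one lock; copying under a lock that writers do not take would not prevent the exception.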


was (Author: fanghanyun):
hadoop version 2.6.0-cdh5.13.1

!image-2019-11-29-14-59-11-179.png!

> ReplicaCachingGetSpaceUsed throws  ConcurrentModificationException
> --
>
> Key: HDFS-14986
> URL: https://issues.apache.org/jira/browse/HDFS-14986
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: datanode, performance
>Affects Versions: 2.10.0
>Reporter: Ryan Wu
>Assignee: Aiphago
>Priority: Major
> Fix For: 3.3.0, 2.10.1, 2.11.0
>
> Attachments: HDFS-14986.001.patch, HDFS-14986.002.patch, 
> HDFS-14986.003.patch, HDFS-14986.004.patch, HDFS-14986.005.patch, 
> HDFS-14986.006.patch
>
>
> Running DU across lots of disks is very expensive. We applied the patch 
> HDFS-14313 to get used space from ReplicaInfo in memory. However, the new du 
> threads throw the following exception:
> {code:java}
> // 2019-11-08 18:07:13,858 ERROR [refreshUsed-/home/vipshop/hard_disk/7/dfs/dn/current/BP-1203969992--1450855658517] org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.ReplicaCachingGetSpaceUsed: ReplicaCachingGetSpaceUsed refresh error
> java.util.ConcurrentModificationException: Tree has been modified outside of iterator
> at org.apache.hadoop.hdfs.util.FoldedTreeSet$TreeSetIterator.checkForModification(FoldedTreeSet.java:311)
> at org.apache.hadoop.hdfs.util.FoldedTreeSet$TreeSetIterator.hasNext(FoldedTreeSet.java:256)
> at java.util.AbstractCollection.addAll(AbstractCollection.java:343)
> at java.util.HashSet.<init>(HashSet.java:120)
> at org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl.deepCopyReplica(FsDatasetImpl.java:1052)
> at org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.ReplicaCachingGetSpaceUsed.refresh(ReplicaCachingGetSpaceUsed.java:73)
> at org.apache.hadoop.fs.CachingGetSpaceUsed$RefreshThread.run(CachingGetSpaceUsed.java:178)
> at java.lang.Thread.run(Thread.java:748)
> {code}






[jira] [Commented] (HDFS-14986) ReplicaCachingGetSpaceUsed throws ConcurrentModificationException

2019-11-28 Thread fanghanyun (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-14986?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16984766#comment-16984766
 ] 

fanghanyun commented on HDFS-14986:
---

hadoop version 2.6.0-cdh5.13.1

!image-2019-11-29-14-59-11-179.png!

> ReplicaCachingGetSpaceUsed throws  ConcurrentModificationException
> --
>
> Key: HDFS-14986
> URL: https://issues.apache.org/jira/browse/HDFS-14986
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: datanode, performance
>Affects Versions: 2.10.0
>Reporter: Ryan Wu
>Assignee: Aiphago
>Priority: Major
> Fix For: 3.3.0, 2.10.1, 2.11.0
>
> Attachments: HDFS-14986.001.patch, HDFS-14986.002.patch, 
> HDFS-14986.003.patch, HDFS-14986.004.patch, HDFS-14986.005.patch, 
> HDFS-14986.006.patch
>
>






[jira] [Comment Edited] (HDFS-13811) RBF: Race condition between router admin quota update and periodic quota update service

2019-11-28 Thread Yiqun Lin (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-13811?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16984758#comment-16984758
 ] 

Yiqun Lin edited comment on HDFS-13811 at 11/29/19 6:55 AM:


Hi [~LiJinglun], I took a look at the major change in the patch. I agree that 
the mount table in the state store should be updated only by admin calls, 
with the periodic service updating just the local cache. However, this breaks 
one thing: the usage in the mount table can no longer be updated, so the 
quota usage displayed in the web UI will be invalid. How do we plan to fix 
this?

Can you use an approach similar to what [~dibyendu_hadoop] did before and 
take the quota usage from the quota manager when getting the mount table 
entries?
{noformat}
@@ -142,6 +143,20 @@ public GetMountTableEntriesResponse getMountTableEntries(
 it.remove();
   }
 }
+// If quotamanager is not null, update quota usage from quota cache.
+if (this.getQuotaManager() != null && request.isUpdateQuotaCache()) {
+  RouterQuotaUsage quota =
+  this.getQuotaManager().getQuotaUsage(record.getSourcePath());
+  if(quota != null) {
+RouterQuotaUsage oldquota = record.getQuota();
+RouterQuotaUsage newQuota = new RouterQuotaUsage.Builder()
+.fileAndDirectoryCount(quota.getFileAndDirectoryCount())
+.quota(oldquota.getQuota())
+.spaceConsumed(quota.getSpaceConsumed())
+.spaceQuota(oldquota.getSpaceQuota()).build();
+record.setQuota(newQuota);
+  }
+}
   }
 }
{noformat}



> RBF: Race condition between router admin quota update and periodic quota 
> update service
> ---
>
> Key: HDFS-13811
> URL: https://issues.apache.org/jira/browse/HDFS-13811
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Reporter: Dibyendu Karmakar
>Assignee: Jinglun
>Priority: Major
> Attachments: HDFS-13811-000.patch, HDFS-13811-HDFS-13891-000.patch, 
> HDFS-13811.001.patch, HDFS-13811.002.patch, HDFS-13811.003.patch
>
>






[jira] [Comment Edited] (HDFS-13811) RBF: Race condition between router admin quota update and periodic quota update service

2019-11-28 Thread Yiqun Lin (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-13811?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16984758#comment-16984758
 ] 

Yiqun Lin edited comment on HDFS-13811 at 11/29/19 6:49 AM:


Hi [~LiJinglun], I took a look at the major change in the patch. I agree that 
the mount table in the state store should be updated only by admin calls, 
with the periodic service updating just the local cache. However, this breaks 
one thing: the usage in the mount table can no longer be updated, so the 
quota usage displayed in the web UI will be invalid. How do we plan to fix 
this? The update service is defined as:

{noformat}
/**
 * Service to periodically update the {@link RouterQuotaUsage}
 * cached information in the {@link Router} and update corresponding
 * mount table in State Store.
 */
public class RouterQuotaUpdateService extends PeriodicService {
{noformat}



> RBF: Race condition between router admin quota update and periodic quota 
> update service
> ---
>
> Key: HDFS-13811
> URL: https://issues.apache.org/jira/browse/HDFS-13811
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Reporter: Dibyendu Karmakar
>Assignee: Jinglun
>Priority: Major
> Attachments: HDFS-13811-000.patch, HDFS-13811-HDFS-13891-000.patch, 
> HDFS-13811.001.patch, HDFS-13811.002.patch, HDFS-13811.003.patch
>
>






[jira] [Commented] (HDFS-13811) RBF: Race condition between router admin quota update and periodic quota update service

2019-11-28 Thread Yiqun Lin (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-13811?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16984758#comment-16984758
 ] 

Yiqun Lin commented on HDFS-13811:
--

Hi [~LiJinglun], I took a look at the major change in the patch. I agree that 
the mount table in the state store should be updated only by admin calls, 
with the periodic service updating just the local cache. However, this breaks 
one thing: the usage in the mount table can no longer be updated, so the 
quota usage displayed in the web UI will be invalid. How do we plan to fix 
this?

> RBF: Race condition between router admin quota update and periodic quota 
> update service
> ---
>
> Key: HDFS-13811
> URL: https://issues.apache.org/jira/browse/HDFS-13811
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Reporter: Dibyendu Karmakar
>Assignee: Jinglun
>Priority: Major
> Attachments: HDFS-13811-000.patch, HDFS-13811-HDFS-13891-000.patch, 
> HDFS-13811.001.patch, HDFS-13811.002.patch, HDFS-13811.003.patch
>
>






[jira] [Updated] (HDFS-15024) [SBN read] In FailoverOnNetworkExceptionRetry , Number of NameNodes as a condition of calculation of sleep time

2019-11-28 Thread huhaiyang (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-15024?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

huhaiyang updated HDFS-15024:
-
Description: 
{code:java}
When Observer NameNode (ONN) reads are enabled, the client configuration 
lists three NameNodes, for example:

dfs.ha.namenodes.ns1
nn2,nn3,nn1

Currently:
nn2 is in standby state
nn3 is in observer state
nn1 is in active state

When the user runs an HDFS operation such as
./bin/hadoop --loglevel debug fs 
-Ddfs.client.failover.proxy.provider.ns1=org.apache.hadoop.hdfs.server.namenode.ha.ObserverReadProxyProvider
 -ls /user/haiyang1/



{code}

  was:
{code:java}
When we enable the ONN , there will be three NN nodes for the client 
configuration,
Such as configuration


dfs.ha.namenodes.ns1
nn2,nn3,nn1


Currently, 
nn2 is in standby state
nn3 is in observer state 
nn1 is in active state

When the user performs an access HDFS operation
./bin/hadoop --loglevel debug fs 
-Ddfs.client.failover.proxy.provider.ns1=org.apache.hadoop.hdfs.server.namenode.ha.ObserverReadProxyProvider
 -ls /user/haiyang1/

{code}


> [SBN read] In FailoverOnNetworkExceptionRetry , Number of NameNodes as a 
> condition of calculation of sleep time
> ---
>
> Key: HDFS-15024
> URL: https://issues.apache.org/jira/browse/HDFS-15024
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Affects Versions: 2.10.0, 3.3.0, 3.2.1
>Reporter: huhaiyang
>Priority: Major
> Attachments: HDFS-15024.001.patch, client_error.log
>
>
> {code:java}
> When we enable the ONN , there will be three NN nodes for the client 
> configuration,
> Such as configuration
> 
> dfs.ha.namenodes.ns1
> nn2,nn3,nn1
> 
> Currently, 
> nn2 is in standby state
> nn3 is in observer state 
> nn1 is in active state
> When the user performs an access HDFS operation
> ./bin/hadoop --loglevel debug fs 
> -Ddfs.client.failover.proxy.provider.ns1=org.apache.hadoop.hdfs.server.namenode.ha.ObserverReadProxyProvider
>  -ls /user/haiyang1/
> {code}






[jira] [Updated] (HDFS-15024) [SBN read] In FailoverOnNetworkExceptionRetry , Number of NameNodes as a condition of calculation of sleep time

2019-11-28 Thread huhaiyang (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-15024?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

huhaiyang updated HDFS-15024:
-
Attachment: (was: client_error.log)

> [SBN read] In FailoverOnNetworkExceptionRetry , Number of NameNodes as a 
> condition of calculation of sleep time
> ---
>
> Key: HDFS-15024
> URL: https://issues.apache.org/jira/browse/HDFS-15024
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Affects Versions: 2.10.0, 3.3.0, 3.2.1
>Reporter: huhaiyang
>Priority: Major
> Attachments: HDFS-15024.001.patch, client_error.log
>
>
> {code:java}
> When we enable the ONN , there will be three NN nodes for the client 
> configuration,
> Such as configuration
> 
> dfs.ha.namenodes.ns1
> nn2,nn3,nn1
> 
> Currently, 
> nn2 is in standby state
> nn3 is in observer state 
> nn1 is in active state
> When the user performs an access HDFS operation
> ./bin/hadoop --loglevel debug fs 
> -Ddfs.client.failover.proxy.provider.ns1=org.apache.hadoop.hdfs.server.namenode.ha.ObserverReadProxyProvider
>  -ls /user/haiyang1/
> {code}






[jira] [Updated] (HDFS-15024) [SBN read] In FailoverOnNetworkExceptionRetry , Number of NameNodes as a condition of calculation of sleep time

2019-11-28 Thread huhaiyang (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-15024?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

huhaiyang updated HDFS-15024:
-
Attachment: client_error.log

> [SBN read] In FailoverOnNetworkExceptionRetry , Number of NameNodes as a 
> condition of calculation of sleep time
> ---
>
> Key: HDFS-15024
> URL: https://issues.apache.org/jira/browse/HDFS-15024
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Affects Versions: 2.10.0, 3.3.0, 3.2.1
>Reporter: huhaiyang
>Priority: Major
> Attachments: HDFS-15024.001.patch, client_error.log
>
>
> {code:java}
> When we enable the ONN , there will be three NN nodes for the client 
> configuration,
> Such as configuration
> 
> dfs.ha.namenodes.ns1
> nn2,nn3,nn1
> 
> Currently, 
> nn2 is in standby state
> nn3 is in observer state 
> nn1 is in active state
> When the user performs an access HDFS operation
> ./bin/hadoop --loglevel debug fs 
> -Ddfs.client.failover.proxy.provider.ns1=org.apache.hadoop.hdfs.server.namenode.ha.ObserverReadProxyProvider
>  -ls /user/haiyang1/
> {code}






[jira] [Commented] (HDFS-15003) RBF: Make Router support storage type quota.

2019-11-28 Thread Jinglun (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-15003?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16984752#comment-16984752
 ] 

Jinglun commented on HDFS-15003:


Hi [~ayushtkn] [~elgoiri], could you help review v02? Thanks very 
much!

> RBF: Make Router support storage type quota.
> 
>
> Key: HDFS-15003
> URL: https://issues.apache.org/jira/browse/HDFS-15003
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Reporter: Jinglun
>Assignee: Jinglun
>Priority: Major
> Attachments: HDFS-15003.001.patch, HDFS-15003.002.patch
>
>
> Make Router support storage type quota.






[jira] [Commented] (HDFS-15023) [SBN read] ZKFC should check the state before joining the election

2019-11-28 Thread Fei Hui (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-15023?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16984750#comment-16984750
 ] 

Fei Hui commented on HDFS-15023:


[~ayushtkn] Thanks for your comments.
I will try to add a UT.

> [SBN read] ZKFC should check the state before joining the election
> --
>
> Key: HDFS-15023
> URL: https://issues.apache.org/jira/browse/HDFS-15023
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Reporter: Fei Hui
>Assignee: Fei Hui
>Priority: Major
> Attachments: HDFS-15023.001.patch
>
>
> As discussed in HDFS-14961, ZKFC should not join the election when its state 
> is observer.
> Right now, when a namenode is an observer, it joins the election and 
> becomes a standby.
> The MonitorDaemon thread call chain is:
> doHealthChecks -> enterState(State.SERVICE_HEALTHY) -> recheckElectability() 
> -> elector.joinElection(targetToData(localTarget)) -> joinElectionInternal -> 
> createLockNodeAsync
> The ZooKeeper callback is:
> processResult -> becomeStandby
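
The idea of checking the state before joining can be sketched as a small guard. This is only an illustration under stated assumptions: ElectionGuardSketch, HaState, and shouldJoinElection are hypothetical names, not the ZKFailoverController API, and the attached patch may implement the check differently.

```java
final class ElectionGuardSketch {
  /** Hypothetical HA states for illustration only. */
  enum HaState { ACTIVE, STANDBY, OBSERVER }

  /**
   * Returns true only when the local node is healthy and is not an observer,
   * so that an observer never reaches the join-election step.
   */
  static boolean shouldJoinElection(boolean healthy, HaState state) {
    return healthy && state != HaState.OBSERVER;
  }

  public static void main(String[] args) {
    for (HaState state : HaState.values()) {
      System.out.println(state + " joins: " + shouldJoinElection(true, state));
    }
  }
}
```

In the call chain quoted above, a guard like this would sit before the joinElection step, so a healthy observer stays out of the election and never receives the becomeStandby callback.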






[jira] [Commented] (HDFS-15023) [SBN read] ZKFC should check the state before joining the election

2019-11-28 Thread Hadoop QA (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-15023?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16984749#comment-16984749
 ] 

Hadoop QA commented on HDFS-15023:
--

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
39s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:red}-1{color} | {color:red} test4tests {color} | {color:red}  0m  
0s{color} | {color:red} The patch doesn't appear to include any new or modified 
tests. Please justify why no new tests are needed for this patch. Also please 
list what manual steps were performed to verify this patch. {color} |
|| || || || {color:brown} trunk Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 19m 
14s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 16m 
10s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
40s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  1m 
18s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
14m 48s{color} | {color:green} branch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  1m 
38s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m 
20s{color} | {color:green} trunk passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  0m 
47s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 15m 
32s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 15m 
32s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
42s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  1m 
10s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
12m 49s{color} | {color:green} patch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  1m 
48s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m 
21s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 21m  3s{color} 
| {color:red} hadoop-common in the patch failed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
41s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black}111m 34s{color} | 
{color:black} {color} |
\\
\\
|| Reason || Tests ||
| Failed junit tests | hadoop.ha.TestZKFailoverControllerStress |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=19.03.5 Server=19.03.5 Image:yetus/hadoop:104ccca9169 |
| JIRA Issue | HDFS-15023 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12987115/HDFS-15023.001.patch |
| Optional Tests |  dupname  asflicense  compile  javac  javadoc  mvninstall  
mvnsite  unit  shadedclient  findbugs  checkstyle  |
| uname | Linux fc69a6df21d0 4.15.0-66-generic #75-Ubuntu SMP Tue Oct 1 
05:24:09 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/patchprocess/precommit/personality/provided.sh |
| git revision | trunk / 44f7b91 |
| maven | version: Apache Maven 3.3.9 |
| Default Java | 1.8.0_222 |
| findbugs | v3.1.0-RC1 |
| unit | 
https://builds.apache.org/job/PreCommit-HDFS-Build/28424/artifact/out/patch-unit-hadoop-common-project_hadoop-common.txt
 |
|  Test Results | 
https://builds.apache.org/job/PreCommit-HDFS-Build/28424/testReport/ |
| Max. process+thread count | 1343 (vs. ulimit of 5500) |
| modules | C: hadoop-common-project/hadoop-common U: 
hadoop-common-project/hadoop-common |
| Console output | 

[jira] [Updated] (HDFS-15024) [SBN read] In FailoverOnNetworkExceptionRetry , Number of NameNodes as a condition of calculation of sleep time

2019-11-28 Thread huhaiyang (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-15024?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

huhaiyang updated HDFS-15024:
-
Attachment: client_error.log

> [SBN read] In FailoverOnNetworkExceptionRetry , Number of NameNodes as a 
> condition of calculation of sleep time
> ---
>
> Key: HDFS-15024
> URL: https://issues.apache.org/jira/browse/HDFS-15024
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Affects Versions: 2.10.0, 3.3.0, 3.2.1
>Reporter: huhaiyang
>Priority: Major
> Attachments: HDFS-15024.001.patch, client_error.log
>
>
> {code:java}
> When we enable the ONN , there will be three NN nodes for the client 
> configuration,
> Such as configuration
> 
> dfs.ha.namenodes.ns1
> nn2,nn3,nn1
> 
> Currently, 
> nn2 is in standby state
> nn3 is in observer state 
> nn1 is in active state
> When the user performs an access HDFS operation
> ./bin/hadoop --loglevel debug fs 
> -Ddfs.client.failover.proxy.provider.ns1=org.apache.hadoop.hdfs.server.namenode.ha.ObserverReadProxyProvider
>  -ls /user/haiyang1/
> {code}






[jira] [Updated] (HDFS-15024) [SBN read] In FailoverOnNetworkExceptionRetry , Number of NameNodes as a condition of calculation of sleep time

2019-11-28 Thread huhaiyang (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-15024?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

huhaiyang updated HDFS-15024:
-
Description: 
{code:java}
When we enable the ONN , there will be three NN nodes for the client 
configuration,
Such as configuration


dfs.ha.namenodes.ns1
nn2,nn3,nn1


Currently, 
nn2 is in standby state
nn3 is in observer state 
nn1 is in active state

When the user performs an access HDFS operation
./bin/hadoop --loglevel debug fs 
-Ddfs.client.failover.proxy.provider.ns1=org.apache.hadoop.hdfs.server.namenode.ha.ObserverReadProxyProvider
 -ls /user/haiyang1/

{code}

  was:
{code:java}
When we enable the ONN , there will be three NN nodes for the client 
configuration,
Such as configuration


dfs.ha.namenodes.ns1
nn2,nn3,nn1


Currently, 
nn2 is in standby state
nn3 is in observer state 
nn1 is in active state

When the user performs an access HDFS operation
./bin/hadoop --loglevel debug fs 
-Ddfs.client.failover.proxy.provider.ns1=org.apache.hadoop.hdfs.server.namenode.ha.ObserverReadProxyProvider
 -ls /user/haiyang1/
16:49:13 DEBUG ipc.Client: The ping interval is 6 ms.
19/11/28 16:49:13 DEBUG ipc.Client: Connecting to xx/xx:8020
...
19/11/28 16:49:13 DEBUG retry.RetryInvocationHandler: 
org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.ipc.StandbyException): 
Operation category WRITE is not supported in state observer. Visit 
https://s.apache.org/sbnn-error
at 
org.apache.hadoop.hdfs.server.namenode.ha.StandbyState.checkOperation(StandbyState.java:98)
at 
org.apache.hadoop.hdfs.server.namenode.NameNode$NameNodeHAContext.checkOperation(NameNode.java:2018)
at 
org.apache.hadoop.hdfs.server.namenode.FSNamesystem.checkOperation(FSNamesystem.java:1461)
at 
org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.msync(NameNodeRpcServer.java:1384)
at 
org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.msync(ClientNamenodeProtocolServerSideTranslatorPB.java:1907)
at 
org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)
at 
org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:531)
at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:1070)
at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:928)
at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:863)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:422)
at 
org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1903)
at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2815)
, while invoking $Proxy4.getFileInfo over [xx/xx:8020,xx/xx:8020,xx/xx:8020]. 
Trying to failover immediately.
org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.ipc.StandbyException): 
Operation category WRITE is not supported in state observer. Visit 
https://s.apache.org/sbnn-error
at 
org.apache.hadoop.hdfs.server.namenode.ha.StandbyState.checkOperation(StandbyState.java:98)
at 
org.apache.hadoop.hdfs.server.namenode.NameNode$NameNodeHAContext.checkOperation(NameNode.java:2018)
at 
org.apache.hadoop.hdfs.server.namenode.FSNamesystem.checkOperation(FSNamesystem.java:1461)
at 
org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.msync(NameNodeRpcServer.java:1384)
at 
org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.msync(ClientNamenodeProtocolServerSideTranslatorPB.java:1907)
at 
org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)
at 
org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:531)
at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:1070)
at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:928)
at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:863)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:422)
at 
org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1903)
at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2815)

at org.apache.hadoop.ipc.Client.getRpcResponse(Client.java:1543)
at org.apache.hadoop.ipc.Client.call(Client.java:1489)
at org.apache.hadoop.ipc.Client.call(Client.java:1388)
at 
org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:233)
at 
org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:118)
at com.sun.proxy.$Proxy15.msync(Unknown Source)
at 

[jira] [Updated] (HDFS-15024) [SBN read] In FailoverOnNetworkExceptionRetry , Number of NameNodes as a condition of calculation of sleep time

2019-11-28 Thread huhaiyang (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-15024?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

huhaiyang updated HDFS-15024:
-
Description: 
{code:java}
When we enable the ONN , there will be three NN nodes for the client 
configuration,
Such as configuration


dfs.ha.namenodes.ns1
nn2,nn3,nn1


Currently, 
nn2 is in standby state
nn3 is in observer state 
nn1 is in active state

When the user performs an access HDFS operation
./bin/hadoop --loglevel debug fs 
-Ddfs.client.failover.proxy.provider.ns1=org.apache.hadoop.hdfs.server.namenode.ha.ObserverReadProxyProvider
 -ls /user/haiyang1/
16:49:13 DEBUG ipc.Client: The ping interval is 6 ms.
19/11/28 16:49:13 DEBUG ipc.Client: Connecting to xx/xx:8020
...
19/11/28 16:49:13 DEBUG retry.RetryInvocationHandler: 
org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.ipc.StandbyException): 
Operation category WRITE is not supported in state observer. Visit 
https://s.apache.org/sbnn-error
at 
org.apache.hadoop.hdfs.server.namenode.ha.StandbyState.checkOperation(StandbyState.java:98)
at 
org.apache.hadoop.hdfs.server.namenode.NameNode$NameNodeHAContext.checkOperation(NameNode.java:2018)
at 
org.apache.hadoop.hdfs.server.namenode.FSNamesystem.checkOperation(FSNamesystem.java:1461)
at 
org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.msync(NameNodeRpcServer.java:1384)
at 
org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.msync(ClientNamenodeProtocolServerSideTranslatorPB.java:1907)
at 
org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)
at 
org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:531)
at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:1070)
at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:928)
at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:863)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:422)
at 
org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1903)
at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2815)
, while invoking $Proxy4.getFileInfo over [xx/xx:8020,xx/xx:8020,xx/xx:8020]. 
Trying to failover immediately.
org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.ipc.StandbyException): 
Operation category WRITE is not supported in state observer. Visit 
https://s.apache.org/sbnn-error
at 
org.apache.hadoop.hdfs.server.namenode.ha.StandbyState.checkOperation(StandbyState.java:98)
at 
org.apache.hadoop.hdfs.server.namenode.NameNode$NameNodeHAContext.checkOperation(NameNode.java:2018)
at 
org.apache.hadoop.hdfs.server.namenode.FSNamesystem.checkOperation(FSNamesystem.java:1461)
at 
org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.msync(NameNodeRpcServer.java:1384)
at 
org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.msync(ClientNamenodeProtocolServerSideTranslatorPB.java:1907)
at 
org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)
at 
org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:531)
at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:1070)
at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:928)
at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:863)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:422)
at 
org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1903)
at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2815)

at org.apache.hadoop.ipc.Client.getRpcResponse(Client.java:1543)
at org.apache.hadoop.ipc.Client.call(Client.java:1489)
at org.apache.hadoop.ipc.Client.call(Client.java:1388)
at 
org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:233)
at 
org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:118)
at com.sun.proxy.$Proxy15.msync(Unknown Source)
at 
org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.msync(ClientNamenodeProtocolTranslatorPB.java:1958)
at 
org.apache.hadoop.hdfs.server.namenode.ha.ObserverReadProxyProvider.initializeMsync(ObserverReadProxyProvider.java:318)
at 
org.apache.hadoop.hdfs.server.namenode.ha.ObserverReadProxyProvider.access$500(ObserverReadProxyProvider.java:69)
at 

[jira] [Updated] (HDFS-15024) [SBN read] In FailoverOnNetworkExceptionRetry , Number of NameNodes as a condition of calculation of sleep time

2019-11-28 Thread huhaiyang (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-15024?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

huhaiyang updated HDFS-15024:
-
   Attachment: HDFS-15024.001.patch
Affects Version/s: 3.3.0
   2.10.0
   3.2.1
  Description: 
{code:java}
When we enable the ONN , there will be three NN nodes for the client 
configuration,
Such as configuration


dfs.ha.namenodes.ns1
nn2,nn3,nn1


Currently, nn2 is in standby state, nn3 is in observer state, and nn1 is in 
active state
When the user performs an access HDFS operation


{code}
  Summary: [SBN read] In FailoverOnNetworkExceptionRetry , Number 
of NameNodes as a condition of calculation of sleep time  (was: [SBN read] In 
FailoverOnNetworkExceptionRetry getFailoverOrRetrySleepTime method, Number of 
NameNodes as a condition of calculation of sleep time)

> [SBN read] In FailoverOnNetworkExceptionRetry , Number of NameNodes as a 
> condition of calculation of sleep time
> ---
>
> Key: HDFS-15024
> URL: https://issues.apache.org/jira/browse/HDFS-15024
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Affects Versions: 2.10.0, 3.3.0, 3.2.1
>Reporter: huhaiyang
>Priority: Major
> Attachments: HDFS-15024.001.patch
>
>
> {code:java}
> When we enable the ONN , there will be three NN nodes for the client 
> configuration,
> Such as configuration
> 
> dfs.ha.namenodes.ns1
> nn2,nn3,nn1
> 
> Currently, nn2 is in standby state, nn3 is in observer state, and nn1 is in 
> active state
> When the user performs an access HDFS operation
> {code}






[jira] [Created] (HDFS-15024) [SBN read] In FailoverOnNetworkExceptionRetry getFailoverOrRetrySleepTime method, Number of NameNodes as a condition of calculation of sleep time

2019-11-28 Thread huhaiyang (Jira)
huhaiyang created HDFS-15024:


 Summary: [SBN read] In FailoverOnNetworkExceptionRetry 
getFailoverOrRetrySleepTime method, Number of NameNodes as a condition of 
calculation of sleep time
 Key: HDFS-15024
 URL: https://issues.apache.org/jira/browse/HDFS-15024
 Project: Hadoop HDFS
  Issue Type: Improvement
Reporter: huhaiyang









[jira] [Updated] (HDFS-15023) [SBN read] ZKFC should check the state before joining the election

2019-11-28 Thread Ayush Saxena (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-15023?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ayush Saxena updated HDFS-15023:

Issue Type: Improvement  (was: Bug)

> [SBN read] ZKFC should check the state before joining the election
> --
>
> Key: HDFS-15023
> URL: https://issues.apache.org/jira/browse/HDFS-15023
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Reporter: Fei Hui
>Assignee: Fei Hui
>Priority: Major
> Attachments: HDFS-15023.001.patch
>
>
> As discussed in HDFS-14961, ZKFC should not join the election when its state 
> is observer.
> Right now, when a namenode is an observer, it joins the election and 
> becomes a standby.
> The MonitorDaemon thread call chain is:
> doHealthChecks -> enterState(State.SERVICE_HEALTHY) -> recheckElectability() 
> -> elector.joinElection(targetToData(localTarget)) -> joinElectionInternal -> 
> createLockNodeAsync
> The ZooKeeper callback is:
> processResult -> becomeStandby






[jira] [Commented] (HDFS-15023) [SBN read] ZKFC should check the state before joining the election

2019-11-28 Thread Ayush Saxena (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-15023?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16984739#comment-16984739
 ] 

Ayush Saxena commented on HDFS-15023:
-

Thanks [~ferhui], did you change the fix here?
Wasn't it checking for not Observer? That was more descriptive. If this is apt, 
you can also add a comment line explaining the reason.
Is it possible to cover the change with a test?


> [SBN read] ZKFC should check the state before joining the election
> --
>
> Key: HDFS-15023
> URL: https://issues.apache.org/jira/browse/HDFS-15023
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: Fei Hui
>Assignee: Fei Hui
>Priority: Major
> Attachments: HDFS-15023.001.patch
>
>
> As discussed in HDFS-14961, ZKFC should not join the election when its state 
> is observer.
> Right now, when a namenode is an observer, it joins the election and 
> becomes a standby.
> The MonitorDaemon thread call chain is:
> doHealthChecks -> enterState(State.SERVICE_HEALTHY) -> recheckElectability() 
> -> elector.joinElection(targetToData(localTarget)) -> joinElectionInternal -> 
> createLockNodeAsync
> The ZooKeeper callback is:
> processResult -> becomeStandby






[jira] [Commented] (HDFS-9695) HTTPFS - CHECKACCESS operation missing

2019-11-28 Thread Takanobu Asanuma (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-9695?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16984732#comment-16984732
 ] 

Takanobu Asanuma commented on HDFS-9695:


Thanks for updating the patch, [~hemanthboyina]. It looks almost good. Some 
minor comments:
 * Please remove the blank line in the FSAccess constructor.
 * About {{FSAccess#execute}}, the javadoc comment of {{@return}} seems wrong.

> HTTPFS - CHECKACCESS operation missing
> --
>
> Key: HDFS-9695
> URL: https://issues.apache.org/jira/browse/HDFS-9695
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: Bert Hekman
>Assignee: hemanthboyina
>Priority: Major
> Attachments: HDFS-9695.001.patch, HDFS-9695.002.patch, 
> HDFS-9695.003.patch, HDFS-9695.004.patch
>
>
> Hi,
> The CHECKACCESS operation seems to be missing in HTTPFS. I'm getting the 
> following error:
> {code}
> QueryParamException: java.lang.IllegalArgumentException: No enum constant 
> org.apache.hadoop.fs.http.client.HttpFSFileSystem.Operation.CHECKACCESS
> {code}
> A quick look into the org.apache.hadoop.fs.http.client.HttpFSFileSystem class 
> reveals that CHECKACCESS is not defined at all.
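
The error quoted above is the standard behavior of an enum lookup by name when the constant is not defined. A minimal sketch, assuming a hypothetical Operation enum that holds only a few constants (it is not the real HttpFSFileSystem.Operation):

```java
final class EnumLookupSketch {
  /** Hypothetical subset of operations; CHECKACCESS is deliberately absent. */
  enum Operation { OPEN, GETFILESTATUS, LISTSTATUS }

  /** Returns true if the name maps to a constant, false if valueOf rejects it. */
  static boolean isSupported(String name) {
    try {
      Operation.valueOf(name);
      return true;
    } catch (IllegalArgumentException e) {
      // Enum.valueOf throws this with a "No enum constant ..." message.
      return false;
    }
  }

  public static void main(String[] args) {
    System.out.println("OPEN supported: " + isSupported("OPEN"));
    System.out.println("CHECKACCESS supported: " + isSupported("CHECKACCESS"));
  }
}
```

Adding the missing constant to the enum makes the lookup succeed; the operation would still need client and server handling, which this sketch does not cover.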






[jira] [Commented] (HDFS-15003) RBF: Make Router support storage type quota.

2019-11-28 Thread Hadoop QA (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-15003?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16984730#comment-16984730
 ] 

Hadoop QA commented on HDFS-15003:
--

| (/) *{color:green}+1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 43s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m  0s{color} | {color:green} The patch appears to include 2 new or modified test files. {color} |
|| || || || {color:brown} trunk Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 19m 37s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 29s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 20s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 31s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 13m 34s{color} | {color:green} branch has no errors when building and testing our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  0m 57s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 47s{color} | {color:green} trunk passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  0m 27s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 23s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  0m 23s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 15s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 27s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m  0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 13m 39s{color} | {color:green} patch has no errors when building and testing our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  1m  1s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 44s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  7m 11s{color} | {color:green} hadoop-hdfs-rbf in the patch passed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 26s{color} | {color:green} The patch does not generate ASF License warnings. {color} |
| {color:black}{color} | {color:black} {color} | {color:black} 62m 17s{color} | {color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=19.03.5 Server=19.03.5 Image:yetus/hadoop:104ccca9169 |
| JIRA Issue | HDFS-15003 |
| JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12987114/HDFS-15003.002.patch |
| Optional Tests | dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient findbugs checkstyle |
| uname | Linux 33d3e79aa3ab 4.15.0-66-generic #75-Ubuntu SMP Tue Oct 1 05:24:09 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/patchprocess/precommit/personality/provided.sh |
| git revision | trunk / 44f7b91 |
| maven | version: Apache Maven 3.3.9 |
| Default Java | 1.8.0_222 |
| findbugs | v3.1.0-RC1 |
| Test Results | https://builds.apache.org/job/PreCommit-HDFS-Build/28423/testReport/ |
| Max. process+thread count | 2737 (vs. ulimit of 5500) |
| modules | C: hadoop-hdfs-project/hadoop-hdfs-rbf U: hadoop-hdfs-project/hadoop-hdfs-rbf |
| Console output | https://builds.apache.org/job/PreCommit-HDFS-Build/28423/console |
| Powered by | Apache Yetus 0.8.0   http://yetus.apache.org |


This message was automatically generated.



> RBF: Make Router support storage type quota.
> 
>
> Key: HDFS-15003
> URL: 

[jira] [Commented] (HDFS-15023) [SBN read] ZKFC should check the state before joining the election

2019-11-28 Thread Fei Hui (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-15023?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16984729#comment-16984729
 ] 

Fei Hui commented on HDFS-15023:


Upload the simple fix

> [SBN read] ZKFC should check the state before joining the election
> --
>
> Key: HDFS-15023
> URL: https://issues.apache.org/jira/browse/HDFS-15023
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: Fei Hui
>Assignee: Fei Hui
>Priority: Major
> Attachments: HDFS-15023.001.patch
>
>
> As discussed in HDFS-14961, ZKFC should not join the election when its state is 
> observer.
> Right now, when the NameNode is an observer, it joins the election and then 
> becomes a standby.
> The MonitorDaemon thread call chain is:
> doHealthChecks -> enterState(State.SERVICE_HEALTHY) -> recheckElectability() 
> -> elector.joinElection(targetToData(localTarget)) -> joinElectionInternal -> 
> createLockNodeAsync
> The ZooKeeper callback is:
> processResult -> becomeStandby
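
The call chain above suggests that a guard in front of joinElection would keep an observer out of the election. The following is only a minimal sketch of that idea, not the attached patch: the classes NodeState, ElectorSketch and FailoverControllerSketch are hypothetical stand-ins and do not use the real ZKFC or ActiveStandbyElector APIs.

```java
// Hypothetical, simplified model of the election guard; not the real ZKFC classes.
enum NodeState { ACTIVE, STANDBY, OBSERVER }

// Stand-in for the elector: it only records whether we are in the election.
class ElectorSketch {
  private boolean joined = false;

  void joinElection() { joined = true; }

  void quitElection() { joined = false; }

  boolean hasJoined() { return joined; }
}

class FailoverControllerSketch {
  private final ElectorSketch elector = new ElectorSketch();
  private NodeState localState;

  FailoverControllerSketch(NodeState initialState) {
    this.localState = initialState;
  }

  void setLocalState(NodeState state) {
    this.localState = state;
  }

  // Plays the role of recheckElectability() after a healthy health check.
  void recheckElectability() {
    if (localState == NodeState.OBSERVER) {
      // An observer must not compete for the lock; otherwise the election
      // callback would end in becomeStandby and change its state.
      elector.quitElection();
      return;
    }
    elector.joinElection();
  }

  boolean inElection() {
    return elector.hasJoined();
  }
}
```

With this guard, a standby still joins the election, while a node whose state is observer stays out of it and leaves the election if it had joined earlier.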



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-15023) [SBN read] ZKFC should check the state before joining the election

2019-11-28 Thread Fei Hui (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-15023?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Fei Hui updated HDFS-15023:
---
Status: Patch Available  (was: Open)

> [SBN read] ZKFC should check the state before joining the election
> --
>
> Key: HDFS-15023
> URL: https://issues.apache.org/jira/browse/HDFS-15023
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: Fei Hui
>Assignee: Fei Hui
>Priority: Major
> Attachments: HDFS-15023.001.patch
>
>
> As discussed in HDFS-14961, ZKFC should not join the election when its state is 
> observer.
> Right now, when the NameNode is an observer, it joins the election and then 
> becomes a standby.
> The MonitorDaemon thread call chain is:
> doHealthChecks -> enterState(State.SERVICE_HEALTHY) -> recheckElectability() 
> -> elector.joinElection(targetToData(localTarget)) -> joinElectionInternal -> 
> createLockNodeAsync
> The ZooKeeper callback is:
> processResult -> becomeStandby






[jira] [Updated] (HDFS-15023) [SBN read] ZKFC should check the state before joining the election

2019-11-28 Thread Fei Hui (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-15023?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Fei Hui updated HDFS-15023:
---
Attachment: HDFS-15023.001.patch

> [SBN read] ZKFC should check the state before joining the election
> --
>
> Key: HDFS-15023
> URL: https://issues.apache.org/jira/browse/HDFS-15023
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: Fei Hui
>Assignee: Fei Hui
>Priority: Major
> Attachments: HDFS-15023.001.patch
>
>
> As discussed in HDFS-14961, ZKFC should not join the election when its state is 
> observer.
> Right now, when the NameNode is an observer, it joins the election and then 
> becomes a standby.
> The MonitorDaemon thread call chain is:
> doHealthChecks -> enterState(State.SERVICE_HEALTHY) -> recheckElectability() 
> -> elector.joinElection(targetToData(localTarget)) -> joinElectionInternal -> 
> createLockNodeAsync
> The ZooKeeper callback is:
> processResult -> becomeStandby






[jira] [Updated] (HDFS-15023) [SBN read] ZKFC should check the state before joining the election

2019-11-28 Thread Fei Hui (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-15023?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Fei Hui updated HDFS-15023:
---
Description: 
As discussed in HDFS-14961, ZKFC should not join the election when its state is 
observer.

Right now, when the NameNode is an observer, it joins the election and then 
becomes a standby.

The MonitorDaemon thread call chain is:
doHealthChecks -> enterState(State.SERVICE_HEALTHY) -> recheckElectability() -> 
elector.joinElection(targetToData(localTarget)) -> joinElectionInternal -> 
createLockNodeAsync

The ZooKeeper callback is:
processResult -> becomeStandby

  was:As discussed HDFS-14961, ZKFC should not join election when its state is 
observer


> [SBN read] ZKFC should check the state before joining the election
> --
>
> Key: HDFS-15023
> URL: https://issues.apache.org/jira/browse/HDFS-15023
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Reporter: Fei Hui
>Assignee: Fei Hui
>Priority: Major
> Attachments: HDFS-15023.001.patch
>
>
> As discussed in HDFS-14961, ZKFC should not join the election when its state is 
> observer.
> Right now, when the NameNode is an observer, it joins the election and then 
> becomes a standby.
> The MonitorDaemon thread call chain is:
> doHealthChecks -> enterState(State.SERVICE_HEALTHY) -> recheckElectability() 
> -> elector.joinElection(targetToData(localTarget)) -> joinElectionInternal -> 
> createLockNodeAsync
> The ZooKeeper callback is:
> processResult -> becomeStandby






[jira] [Updated] (HDFS-15023) [SBN read] ZKFC should check the state before joining the election

2019-11-28 Thread Fei Hui (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-15023?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Fei Hui updated HDFS-15023:
---
Issue Type: Bug  (was: Improvement)

> [SBN read] ZKFC should check the state before joining the election
> --
>
> Key: HDFS-15023
> URL: https://issues.apache.org/jira/browse/HDFS-15023
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: Fei Hui
>Assignee: Fei Hui
>Priority: Major
> Attachments: HDFS-15023.001.patch
>
>
> As discussed in HDFS-14961, ZKFC should not join the election when its state is 
> observer.
> Right now, when the NameNode is an observer, it joins the election and then 
> becomes a standby.
> The MonitorDaemon thread call chain is:
> doHealthChecks -> enterState(State.SERVICE_HEALTHY) -> recheckElectability() 
> -> elector.joinElection(targetToData(localTarget)) -> joinElectionInternal -> 
> createLockNodeAsync
> The ZooKeeper callback is:
> processResult -> becomeStandby






[jira] [Updated] (HDFS-15003) RBF: Make Router support storage type quota.

2019-11-28 Thread Jinglun (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-15003?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jinglun updated HDFS-15003:
---
Attachment: HDFS-15003.002.patch

> RBF: Make Router support storage type quota.
> 
>
> Key: HDFS-15003
> URL: https://issues.apache.org/jira/browse/HDFS-15003
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Reporter: Jinglun
>Assignee: Jinglun
>Priority: Major
> Attachments: HDFS-15003.001.patch, HDFS-15003.002.patch
>
>
> Make Router support storage type quota.






[jira] [Commented] (HDFS-15021) RBF: Delegation Token can't remove correctly in absence of cancelToken and restart the router.

2019-11-28 Thread Yuxuan Wang (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-15021?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16984716#comment-16984716
 ] 

Yuxuan Wang commented on HDFS-15021:


I think TestZKDelegationTokenSecretManager#testNodesLoadedAfterRestart() 
already covers this case.

> RBF:  Delegation Token can't remove correctly in absence of cancelToken and 
> restart the router.
> ---
>
> Key: HDFS-15021
> URL: https://issues.apache.org/jira/browse/HDFS-15021
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Reporter: Weidong Duan
>Priority: Major
>
> The ZKDelegationTokenSecretManager does not remove expired DTs from 
> ZooKeeper as expected when the Router is restarted without the method 
> `ZKDelegationTokenSecretManager#cancelToken` having been invoked.
> This leaves many stale DTs in ZooKeeper and may cause performance problems 
> for the Router.
> I think this is a bug and should be resolved later. Is that right?






[jira] [Assigned] (HDFS-15023) [SBN read] ZKFC should check the state before joining the election

2019-11-28 Thread Fei Hui (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-15023?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Fei Hui reassigned HDFS-15023:
--

Assignee: Fei Hui

> [SBN read] ZKFC should check the state before joining the election
> --
>
> Key: HDFS-15023
> URL: https://issues.apache.org/jira/browse/HDFS-15023
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Reporter: Fei Hui
>Assignee: Fei Hui
>Priority: Major
>
> As discussed in HDFS-14961, ZKFC should not join the election when its state is 
> observer.






[jira] [Created] (HDFS-15023) [SBN read] ZKFC should check the state before joining the election

2019-11-28 Thread Fei Hui (Jira)
Fei Hui created HDFS-15023:
--

 Summary: [SBN read] ZKFC should check the state before joining the 
election
 Key: HDFS-15023
 URL: https://issues.apache.org/jira/browse/HDFS-15023
 Project: Hadoop HDFS
  Issue Type: Improvement
Reporter: Fei Hui


As discussed in HDFS-14961, ZKFC should not join the election when its state is 
observer.






[jira] [Commented] (HDFS-15013) Reduce NameNode overview tab response time

2019-11-28 Thread Hudson (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-15013?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16984618#comment-16984618
 ] 

Hudson commented on HDFS-15013:
---

SUCCESS: Integrated in Jenkins build Hadoop-trunk-Commit #17710 (See 
[https://builds.apache.org/job/Hadoop-trunk-Commit/17710/])
HDFS-15013. Reduce NameNode overview tab response time. Contributed by 
(surendralilhore: rev 44f7b9159d8eec151f199231bafe0677f9383dc3)
* (edit) hadoop-hdfs-project/hadoop-hdfs/src/main/webapps/hdfs/dfshealth.js


> Reduce NameNode overview tab response time
> --
>
> Key: HDFS-15013
> URL: https://issues.apache.org/jira/browse/HDFS-15013
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: namenode
>Reporter: HuangTao
>Assignee: HuangTao
>Priority: Minor
> Fix For: 3.3.0
>
> Attachments: HDFS-15013.001.patch, HDFS-15013.002.patch, 
> image-2019-11-26-10-05-39-640.png, image-2019-11-26-10-09-07-952.png
>
>
> Currently, the overview tab loads /conf synchronously, as shown in the 
> following picture.
>  !image-2019-11-26-10-05-39-640.png! 
> This issue changes it to load asynchronously. The effect is shown in the 
> following picture.
>  !image-2019-11-26-10-09-07-952.png! 






[jira] [Commented] (HDFS-15013) Reduce NameNode overview tab response time

2019-11-28 Thread Surendra Singh Lilhore (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-15013?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16984614#comment-16984614
 ] 

Surendra Singh Lilhore commented on HDFS-15013:
---

+1

> Reduce NameNode overview tab response time
> --
>
> Key: HDFS-15013
> URL: https://issues.apache.org/jira/browse/HDFS-15013
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: namenode
>Reporter: HuangTao
>Assignee: HuangTao
>Priority: Minor
> Fix For: 3.3.0
>
> Attachments: HDFS-15013.001.patch, HDFS-15013.002.patch, 
> image-2019-11-26-10-05-39-640.png, image-2019-11-26-10-09-07-952.png
>
>
> Currently, the overview tab loads /conf synchronously, as shown in the 
> following picture.
>  !image-2019-11-26-10-05-39-640.png! 
> This issue changes it to load asynchronously. The effect is shown in the 
> following picture.
>  !image-2019-11-26-10-09-07-952.png! 






[jira] [Updated] (HDFS-15013) Reduce NameNode overview tab response time

2019-11-28 Thread Surendra Singh Lilhore (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-15013?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Surendra Singh Lilhore updated HDFS-15013:
--
Resolution: Fixed
Status: Resolved  (was: Patch Available)

Thanks [~marvelrock] for the contribution. Thanks [~elgoiri], [~ayushtkn], 
[~hemanthboyina] for the review.

Committed to trunk.

> Reduce NameNode overview tab response time
> --
>
> Key: HDFS-15013
> URL: https://issues.apache.org/jira/browse/HDFS-15013
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: namenode
>Reporter: HuangTao
>Assignee: HuangTao
>Priority: Minor
> Fix For: 3.3.0
>
> Attachments: HDFS-15013.001.patch, HDFS-15013.002.patch, 
> image-2019-11-26-10-05-39-640.png, image-2019-11-26-10-09-07-952.png
>
>
> Currently, the overview tab loads /conf synchronously, as shown in the 
> following picture.
>  !image-2019-11-26-10-05-39-640.png! 
> This issue changes it to load asynchronously. The effect is shown in the 
> following picture.
>  !image-2019-11-26-10-09-07-952.png! 






[jira] [Commented] (HDFS-6874) Add GETFILEBLOCKLOCATIONS operation to HttpFS

2019-11-28 Thread hemanthboyina (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-6874?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16984611#comment-16984611
 ] 

hemanthboyina commented on HDFS-6874:
-

{quote}But I think that only applies to branch-2.
{quote}
Yes, it only applies to branch-2.

I think we can go ahead with this.

> Add GETFILEBLOCKLOCATIONS operation to HttpFS
> -
>
> Key: HDFS-6874
> URL: https://issues.apache.org/jira/browse/HDFS-6874
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: httpfs
>Affects Versions: 2.4.1, 2.7.3
>Reporter: Gao Zhong Liang
>Assignee: Weiwei Yang
>Priority: Major
>  Labels: BB2015-05-TBR
> Attachments: HDFS-6874-1.patch, HDFS-6874-branch-2.6.0.patch, 
> HDFS-6874.02.patch, HDFS-6874.03.patch, HDFS-6874.04.patch, 
> HDFS-6874.05.patch, HDFS-6874.06.patch, HDFS-6874.07.patch, 
> HDFS-6874.08.patch, HDFS-6874.09.patch, HDFS-6874.10.patch, HDFS-6874.patch
>
>
> The GETFILEBLOCKLOCATIONS operation, which WebHDFS already supports, is 
> missing in HttpFS. For a GETFILEBLOCKLOCATIONS request, 
> org.apache.hadoop.fs.http.server.HttpFSServer currently returns BAD_REQUEST:
> ...
>     case GETFILEBLOCKLOCATIONS: {
>       response = Response.status(Response.Status.BAD_REQUEST).build();
>       break;
>     }
>  






[jira] [Updated] (HDFS-15010) BlockPoolSlice#addReplicaThreadPool static pool should be initialized by static method

2019-11-28 Thread Surendra Singh Lilhore (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-15010?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Surendra Singh Lilhore updated HDFS-15010:
--
Fix Version/s: 3.2.2
   3.1.4
   3.3.0
   Resolution: Fixed
   Status: Resolved  (was: Patch Available)

Thanks [~elgoiri] for the review.

Committed to trunk, branch-3.2 and branch-3.1!

> BlockPoolSlice#addReplicaThreadPool static pool should be initialized by 
> static method
> --
>
> Key: HDFS-15010
> URL: https://issues.apache.org/jira/browse/HDFS-15010
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: datanode
>Affects Versions: 3.1.2
>Reporter: Surendra Singh Lilhore
>Assignee: Surendra Singh Lilhore
>Priority: Major
> Fix For: 3.3.0, 3.1.4, 3.2.2
>
> Attachments: HDFS-15010.001.patch, HDFS-15010.02.patch, 
> HDFS-15010.03.patch, HDFS-15010.04.patch, HDFS-15010.05.patch
>
>
> The {{BlockPoolSlice#initializeAddReplicaPool()}} method currently initializes 
> the static thread pool instance. But when two {{BPServiceActor}} actors try to 
> load block pools in parallel, it may create different instances. 
> So {{BlockPoolSlice#initializeAddReplicaPool()}} should be a static 
> method.
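
The race described above can be illustrated with a small sketch. It is a simplified, hypothetical class (BlockPoolSliceSketch, with a fixed pool size of 2 chosen only for the example), not the committed patch. It shows why the initializer has to be static as well as synchronized: a static synchronized method locks on the class, whereas a synchronized instance method would lock on each instance separately and still allow two pools to be created.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

// Hypothetical stand-in for BlockPoolSlice; only the pool handling is modelled.
class BlockPoolSliceSketch {
  // One pool shared by every instance, like the static pool in the issue.
  private static ExecutorService addReplicaThreadPool;

  BlockPoolSliceSketch() {
    initializeAddReplicaPool();
  }

  // static + synchronized: the lock is the class object, so two constructors
  // running in parallel cannot both see null and create two pools.
  private static synchronized void initializeAddReplicaPool() {
    if (addReplicaThreadPool == null) {
      // Pool size 2 is an arbitrary value for this sketch.
      addReplicaThreadPool = Executors.newFixedThreadPool(2);
    }
  }

  static synchronized ExecutorService pool() {
    return addReplicaThreadPool;
  }

  // Helper for the sketch so the shared pool can be released again.
  static synchronized void shutdownPool() {
    if (addReplicaThreadPool != null) {
      addReplicaThreadPool.shutdown();
      addReplicaThreadPool = null;
    }
  }
}
```

Creating any number of BlockPoolSliceSketch objects leaves BlockPoolSliceSketch.pool() pointing at the same ExecutorService.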






[jira] [Commented] (HDFS-15010) BlockPoolSlice#addReplicaThreadPool static pool should be initialized by static method

2019-11-28 Thread Hudson (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-15010?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16984600#comment-16984600
 ] 

Hudson commented on HDFS-15010:
---

SUCCESS: Integrated in Jenkins build Hadoop-trunk-Commit #17709 (See 
[https://builds.apache.org/job/Hadoop-trunk-Commit/17709/])
HDFS-15010. BlockPoolSlice#addReplicaThreadPool static pool should be 
(surendralilhore: rev 0384687811446a52009b96cc85bf961a3e83afc4)
* (edit) 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/fsdataset/impl/BlockPoolSlice.java
* (edit) 
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/datanode/fsdataset/impl/TestFsVolumeList.java


> BlockPoolSlice#addReplicaThreadPool static pool should be initialized by 
> static method
> --
>
> Key: HDFS-15010
> URL: https://issues.apache.org/jira/browse/HDFS-15010
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: datanode
>Affects Versions: 3.1.2
>Reporter: Surendra Singh Lilhore
>Assignee: Surendra Singh Lilhore
>Priority: Major
> Attachments: HDFS-15010.001.patch, HDFS-15010.02.patch, 
> HDFS-15010.03.patch, HDFS-15010.04.patch, HDFS-15010.05.patch
>
>
> The {{BlockPoolSlice#initializeAddReplicaPool()}} method currently initializes 
> the static thread pool instance. But when two {{BPServiceActor}} actors try to 
> load block pools in parallel, it may create different instances. 
> So {{BlockPoolSlice#initializeAddReplicaPool()}} should be a static 
> method.






[jira] [Commented] (HDFS-14984) HDFS setQuota: Error message should be added for invalid input max range value to hdfs dfsadmin -setQuota command

2019-11-28 Thread hemanthboyina (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-14984?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16984589#comment-16984589
 ] 

hemanthboyina commented on HDFS-14984:
--

If we pass the invalid input 9223372036854775807 (which is Long.MAX_VALUE), it 
should hit the if condition below and throw an exception (judging by the error 
thrown for other invalid input).
{code:java}
// DFSClient.java
if ((namespaceQuota <= 0 &&
      namespaceQuota != HdfsConstants.QUOTA_DONT_SET &&
      namespaceQuota != HdfsConstants.QUOTA_RESET) ||
    (storagespaceQuota < 0 &&
      storagespaceQuota != HdfsConstants.QUOTA_DONT_SET &&
      storagespaceQuota != HdfsConstants.QUOTA_RESET)) {
  throw new IllegalArgumentException("Invalid values for quota : " +
      namespaceQuota + " and " +
      storagespaceQuota);
}
{code}
But FSDirAttrOp.java has an else-if check: if nsQuota equals Long.MAX_VALUE, 
the old NS quota is kept.
{code:java}
final QuotaCounts oldQuota = dirNode.getQuotaCounts();
final long oldNsQuota = oldQuota.getNameSpace();
final long oldSsQuota = oldQuota.getStorageSpace();
if (dirNode.isRoot() && nsQuota == HdfsConstants.QUOTA_RESET) {
  nsQuota = HdfsConstants.QUOTA_DONT_SET;
} else if (nsQuota == HdfsConstants.QUOTA_DONT_SET) {
  nsQuota = oldNsQuota;
}
// unchanged space/namespace quota
if (type == null && oldNsQuota == nsQuota && oldSsQuota == ssQuota) {
  return null;
}
{code}
Either the exception message is not appropriate or the if condition in 
DFSClient is not correct.

Please correct me if I am wrong.
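
To make the behaviour easier to check, here is a standalone sketch of the same condition. The class QuotaCheckSketch and its accepts helper are hypothetical; the two constants are written out under the assumption that HdfsConstants defines QUOTA_DONT_SET as Long.MAX_VALUE and QUOTA_RESET as -1.

```java
// Hypothetical standalone copy of the DFSClient quota check, for illustration.
class QuotaCheckSketch {
  // Assumed to match HdfsConstants: the "don't set" sentinel is Long.MAX_VALUE.
  static final long QUOTA_DONT_SET = Long.MAX_VALUE;
  static final long QUOTA_RESET = -1L;

  // Same condition as the DFSClient snippet in this comment.
  static void checkQuota(long namespaceQuota, long storagespaceQuota) {
    if ((namespaceQuota <= 0
            && namespaceQuota != QUOTA_DONT_SET
            && namespaceQuota != QUOTA_RESET)
        || (storagespaceQuota < 0
            && storagespaceQuota != QUOTA_DONT_SET
            && storagespaceQuota != QUOTA_RESET)) {
      throw new IllegalArgumentException("Invalid values for quota : "
          + namespaceQuota + " and " + storagespaceQuota);
    }
  }

  // Returns true when the check lets the pair through, false when it throws.
  static boolean accepts(long namespaceQuota, long storagespaceQuota) {
    try {
      checkQuota(namespaceQuota, storagespaceQuota);
      return true;
    } catch (IllegalArgumentException e) {
      return false;
    }
  }
}
```

Under that assumption, accepts(0, Long.MAX_VALUE) is false, which matches the "Invalid values for quota : 0 and 9223372036854775807" error from the issue, while accepts(Long.MAX_VALUE, Long.MAX_VALUE) is true: the namespace clause only rejects values that are zero or negative, so 9223372036854775807 passes the client-side check and is later treated as "keep the old quota".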

> HDFS setQuota: Error message should be added for invalid input max range 
> value to hdfs dfsadmin -setQuota command
> -
>
> Key: HDFS-14984
> URL: https://issues.apache.org/jira/browse/HDFS-14984
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: hdfs
>Affects Versions: 3.1.2
>Reporter: Souryakanta Dwivedy
>Priority: Minor
> Attachments: image-2019-11-13-14-05-19-603.png, 
> image-2019-11-13-14-07-04-536.png
>
>
> An error message should be added for the invalid input max range value 
> "9223372036854775807" to the hdfs dfsadmin -setQuota command.
>  * Set the quota for a directory with the invalid input value 
> "9223372036854775807". The command succeeds without displaying any result. 
> The quota value is not set for the directory internally, but from a usage 
> point of view it would be better to display an error message for the invalid 
> max range value "9223372036854775807", as is done when the input value is 
> "0". For example: "hdfs dfsadmin -setQuota 9223372036854775807 /quota"
>  !image-2019-11-13-14-05-19-603.png!
>  
>  * Try to set the quota for a directory with the invalid input value "0". It 
> throws the error message "setQuota: Invalid values for quota : 0 and 
> 9223372036854775807". For example: "hdfs dfsadmin -setQuota 0 /quota"
>  !image-2019-11-13-14-07-04-536.png!






[jira] [Commented] (HDFS-15022) Add new RPC to transfer data block with external shell script across Datanode

2019-11-28 Thread Hadoop QA (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-15022?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16984557#comment-16984557
 ] 

Hadoop QA commented on HDFS-15022:
--

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 37s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m  0s{color} | {color:green} The patch appears to include 2 new or modified test files. {color} |
|| || || || {color:brown} trunk Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 13s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 18m 28s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  3m 25s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  1m 11s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  1m 59s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 15m  2s{color} | {color:green} branch has no errors when building and testing our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  4m 18s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m 54s{color} | {color:green} trunk passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 14s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  1m 43s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  3m  3s{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} cc {color} | {color:red}  3m  3s{color} | {color:red} hadoop-hdfs-project generated 4 new + 15 unchanged - 4 fixed = 19 total (was 19) {color} |
| {color:red}-1{color} | {color:red} javac {color} | {color:red}  3m  3s{color} | {color:red} hadoop-hdfs-project generated 3 new + 741 unchanged - 0 fixed = 744 total (was 741) {color} |
| {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange}  1m  5s{color} | {color:orange} hadoop-hdfs-project: The patch generated 95 new + 931 unchanged - 3 fixed = 1026 total (was 934) {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  1m 40s{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} whitespace {color} | {color:red}  0m  0s{color} | {color:red} The patch has 2 line(s) that end in whitespace. Use git apply --whitespace=fix <>. Refer https://git-scm.com/docs/git-apply {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 12m 57s{color} | {color:green} patch has no errors when building and testing our client artifacts. {color} |
| {color:red}-1{color} | {color:red} findbugs {color} | {color:red}  2m 23s{color} | {color:red} hadoop-hdfs-project/hadoop-hdfs generated 1 new + 0 unchanged - 0 fixed = 1 total (was 0) {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m 52s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  1m 56s{color} | {color:green} hadoop-hdfs-client in the patch passed. {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 91m 11s{color} | {color:red} hadoop-hdfs in the patch failed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 35s{color} | {color:green} The patch does not generate ASF License warnings. {color} |
| {color:black}{color} | {color:black} {color} | {color:black}167m  0s{color} | {color:black} {color} |
\\
\\
|| Reason || Tests ||
| FindBugs | module:hadoop-hdfs-project/hadoop-hdfs |
|  |  Exceptional return value of java.io.File.delete() ignored in 
org.apache.hadoop.hdfs.server.datanode.DataXceiver.linkBlock(ExtendedBlock, 
Token, String, DatanodeInfo, StorageType, String, DatanodeInfo, StorageType)  
At DataXceiver.java:ignored in 
org.apache.hadoop.hdfs.server.datanode.DataXceiver.linkBlock(ExtendedBlock, 
Token, String, 

[jira] [Commented] (HDFS-14901) RBF: Add Encryption Zone related ClientProtocol APIs

2019-11-28 Thread hemanthboyina (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-14901?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16984554#comment-16984554
 ] 

hemanthboyina commented on HDFS-14901:
--

Thanks for the review, [~ayushtkn] and [~elgoiri].
{quote} if no specific reason, you can use routerDFS only for both and chunk of 
having {{routerProtocol}} from the test.
{quote}
Some of the APIs, such as getDataEncryptionKey(), are not present in routerDFS, 
so we need to use routerProtocol.

> RBF: Add Encryption Zone related ClientProtocol APIs
> 
>
> Key: HDFS-14901
> URL: https://issues.apache.org/jira/browse/HDFS-14901
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Reporter: hemanthboyina
>Assignee: hemanthboyina
>Priority: Major
> Attachments: HDFS-14901.001.patch, HDFS-14901.002.patch
>
>
> Currently the listEncryptionZones, reencryptEncryptionZone, and 
> listReencryptionStatus APIs are not implemented in the Router.
> This JIRA intends to implement the above-mentioned APIs.






[jira] [Commented] (HDFS-15009) FSCK "-list-corruptfileblocks" return Invalid Entries

2019-11-28 Thread Hadoop QA (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-15009?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16984515#comment-16984515
 ] 

Hadoop QA commented on HDFS-15009:
--

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  1m 23s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m  0s{color} | {color:green} The patch appears to include 1 new or modified test files. {color} |
|| || || || {color:brown} trunk Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 14s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 23m 36s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  4m 14s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  1m 12s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  2m  8s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 18m 14s{color} | {color:green} branch has no errors when building and testing our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  3m 52s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  2m 22s{color} | {color:green} trunk passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 12s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  2m 11s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  4m 42s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  4m 42s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  1m  8s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  1m 57s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m  0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 15m 51s{color} | {color:green} patch has no errors when building and testing our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  3m 34s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  2m  1s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:red}-1{color} | {color:red} unit {color} | {color:red}137m 26s{color} | {color:red} hadoop-hdfs in the patch failed. {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  7m 16s{color} | {color:green} hadoop-hdfs-rbf in the patch passed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 33s{color} | {color:green} The patch does not generate ASF License warnings. {color} |
| {color:black}{color} | {color:black} {color} | {color:black}232m 54s{color} | {color:black} {color} |
\\
\\
|| Reason || Tests ||
| Failed junit tests | hadoop.hdfs.server.namenode.TestStoragePolicySatisfierWithHA |
|   | hadoop.hdfs.server.namenode.TestAddStripedBlocks |
|   | hadoop.hdfs.server.namenode.TestDeleteRace |
|   | hadoop.hdfs.TestDeadNodeDetection |
|   | hadoop.hdfs.server.namenode.TestNameNodeMXBean |
|   | hadoop.hdfs.server.namenode.TestNameNodeXAttr |
|   | hadoop.hdfs.server.namenode.TestAuditLogs |
|   | hadoop.hdfs.server.namenode.ha.TestConsistentReadsObserver |
|   | hadoop.hdfs.server.namenode.TestFSDirectory |
|   | hadoop.hdfs.server.namenode.TestReencryptionWithKMS |
|   | hadoop.hdfs.server.namenode.TestNameEditsConfigs |
|   | hadoop.hdfs.server.namenode.TestAddBlockRetry |
|   | hadoop.hdfs.server.namenode.TestAddStripedBlockInFBR |
|   | hadoop.hdfs.server.namenode.TestCommitBlockWithInvalidGenStamp |
|   | hadoop.hdfs.server.namenode.ha.TestStandbyCheckpoints |
| 

[jira] [Commented] (HDFS-9695) HTTPFS - CHECKACCESS operation missing

2019-11-28 Thread Hadoop QA (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-9695?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16984460#comment-16984460
 ] 

Hadoop QA commented on HDFS-9695:
-

| (/) *{color:green}+1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 25s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m  0s{color} | {color:green} The patch appears to include 2 new or modified test files. {color} |
|| || || || {color:brown} trunk Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 17m  0s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 25s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 30s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 33s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 12m 49s{color} | {color:green} branch has no errors when building and testing our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  0m 33s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 25s{color} | {color:green} trunk passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  0m 22s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 19s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  0m 19s{color} | {color:green} the patch passed {color} |
| {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange}  0m 25s{color} | {color:orange} hadoop-hdfs-project/hadoop-hdfs-httpfs: The patch generated 1 new + 455 unchanged - 0 fixed = 456 total (was 455) {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 26s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m  0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 12m 37s{color} | {color:green} patch has no errors when building and testing our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  0m 39s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 21s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  4m 39s{color} | {color:green} hadoop-hdfs-httpfs in the patch passed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 29s{color} | {color:green} The patch does not generate ASF License warnings. {color} |
| {color:black}{color} | {color:black} {color} | {color:black} 53m 49s{color} | {color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=19.03.5 Server=19.03.5 Image:yetus/hadoop:104ccca9169 |
| JIRA Issue | HDFS-9695 |
| JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12987067/HDFS-9695.004.patch |
| Optional Tests |  dupname  asflicense  compile  javac  javadoc  mvninstall  mvnsite  unit  shadedclient  findbugs  checkstyle  |
| uname | Linux d1795487e895 4.15.0-58-generic #64-Ubuntu SMP Tue Aug 6 11:12:41 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/patchprocess/precommit/personality/provided.sh |
| git revision | trunk / 46166bd |
| maven | version: Apache Maven 3.3.9 |
| Default Java | 1.8.0_222 |
| findbugs | v3.1.0-RC1 |
| checkstyle | https://builds.apache.org/job/PreCommit-HDFS-Build/28421/artifact/out/diff-checkstyle-hadoop-hdfs-project_hadoop-hdfs-httpfs.txt |
| Test Results | https://builds.apache.org/job/PreCommit-HDFS-Build/28421/testReport/ |
| Max. process+thread count | 632 (vs. ulimit of 5500) |
| modules | C: hadoop-hdfs-project/hadoop-hdfs-httpfs U: hadoop-hdfs-project/hadoop-hdfs-httpfs |
| Console output | https://builds.apache.org/job/PreCommit-HDFS-Build/28421/console |
| Powered by | Apache Yetus 

[jira] [Updated] (HDFS-15022) Add new RPC to transfer data block with external shell script across Datanode

2019-11-28 Thread Yang Yun (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-15022?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yang Yun updated HDFS-15022:

Attachment: HDFS-15022.patch
Status: Patch Available  (was: Open)

> Add new RPC to transfer data block with external shell script across Datanode
> -
>
> Key: HDFS-15022
> URL: https://issues.apache.org/jira/browse/HDFS-15022
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: datanode
>Reporter: Yang Yun
>Assignee: Yang Yun
>Priority: Minor
> Attachments: HDFS-15022.patch
>
>
> Replicating data blocks is expensive when some Datanodes are down, especially 
> for slow storage. Add a new RPC to replicate a block across Datanodes with an 
> external shell script, so that users can choose a more efficient way to copy 
> block files.
> In our setup, Archive volumes are configured on remote reliable storage, so 
> during replication we just add a new link file on the new Datanode that points 
> to the remote file.
>  






[jira] [Created] (HDFS-15022) Add new RPC to transfer data block with external shell script across Datanode

2019-11-28 Thread Yang Yun (Jira)
Yang Yun created HDFS-15022:
---

 Summary: Add new RPC to transfer data block with external shell 
script across Datanode
 Key: HDFS-15022
 URL: https://issues.apache.org/jira/browse/HDFS-15022
 Project: Hadoop HDFS
  Issue Type: Improvement
  Components: datanode
Reporter: Yang Yun
Assignee: Yang Yun


Replicating data blocks is expensive when some Datanodes are down, especially 
for slow storage. Add a new RPC to replicate a block across Datanodes with an 
external shell script, so that users can choose a more efficient way to copy 
block files.


In our setup, Archive volumes are configured on remote reliable storage, so 
during replication we just add a new link file on the new Datanode that points 
to the remote file.

 






[jira] [Updated] (HDFS-13571) Deadnode detection

2019-11-28 Thread Lisheng Sun (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-13571?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lisheng Sun updated HDFS-13571:
---
Release Note: When dead node blocks DFSInputStream, Deadnode detection can 
find it and share this information to other DFSInputStreams in the same 
DFSClient. Thus, these DFSInputStreams will not read from the dead node and be 
blocked by this dead node.   (was: When dead node blocks 
DFSInputStream,Deadnode detection can find it and share this information to 
other DFSInputStreams in the same DFSClient. Thus, these DFSInputStreams will 
not read from the dead node and be blocked by this dead node. )

> Deadnode detection
> --
>
> Key: HDFS-13571
> URL: https://issues.apache.org/jira/browse/HDFS-13571
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: hdfs-client
>Affects Versions: 2.4.0, 2.6.0, 3.0.2
>Reporter: Gang Xie
>Assignee: Lisheng Sun
>Priority: Major
> Fix For: 3.3.0
>
> Attachments: DeadNodeDetectorDesign.pdf, HDFS-13571-2.6.diff, node 
> status machine.png
>
>
> Currently, information about dead datanodes is stored locally in each 
> DFSInputStream, so it cannot be shared among the input streams of the same 
> DFSClient. In our production environment, some datanodes die every day for 
> various reasons. When that happens, the first input stream that gets blocked 
> detects the dead node but cannot share this information with the others in the 
> same DFSClient, so the other input streams are still blocked by the dead node 
> for some time, which can cause bad service latency.
> To eliminate this impact, we designed a dead datanode detector, which detects 
> dead nodes in advance and shares this information among all the input streams 
> in the same client. This improvement has been online for some months and works 
> fine, so we decided to port it to 3.0 (the versions used in our production 
> environment are 2.4 and 2.6).
> I will do the porting work and upload the code later.
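
The idea described above, sharing dead-node knowledge across all input streams of one client, can be illustrated with a small sketch. This is not the HDFS-13571 implementation: the class names `Client` and `Stream`, the string node identifiers, and the methods shown are all hypothetical and stand in for DFSClient and DFSInputStream only loosely.

```java
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

class SharedDeadNodesDemo {
  /** Hypothetical per-client registry of datanodes believed to be dead. */
  static class Client {
    // Concurrent set, since many streams may report or query at once.
    private final Set<String> deadNodes = ConcurrentHashMap.newKeySet();

    void reportDead(String node) {
      deadNodes.add(node);
    }

    boolean isDead(String node) {
      return deadNodes.contains(node);
    }
  }

  /** Hypothetical input stream that consults the client-wide registry. */
  static class Stream {
    private final Client client;

    Stream(Client client) {
      this.client = client;
    }

    /** Returns the first replica not known to be dead, or null if none. */
    String chooseNode(String... replicas) {
      for (String replica : replicas) {
        if (!client.isDead(replica)) {
          return replica;
        }
      }
      return null;
    }

    /** Called when a read from the given node fails or times out. */
    void onReadFailure(String node) {
      client.reportDead(node);
    }
  }

  public static void main(String[] args) {
    Client client = new Client();
    Stream a = new Stream(client);
    Stream b = new Stream(client);
    a.onReadFailure("dn1");                         // stream a hits the dead node
    System.out.println(b.chooseNode("dn1", "dn2")); // dn2: stream b skips dn1
  }
}
```

Keeping the set on the client rather than on each stream is what lets stream `b` avoid `dn1` without ever having been blocked by it.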






[jira] [Updated] (HDFS-13571) Deadnode detection

2019-11-28 Thread Lisheng Sun (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-13571?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lisheng Sun updated HDFS-13571:
---
Release Note: When dead node blocks DFSInputStream,Deadnode detection can 
find it and share this information to other DFSInputStreams in the same 
DFSClient. Thus, these DFSInputStreams will not read from the dead node and be 
blocked by this dead node.   (was: When dead node blocks 
DFSInputStream,Deadnode detection can find it and share this information to 
other DFSInputStreams in the same DFSClient.
 Thus, these DFSInputStreams will not read from the dead node and be blocked by 
this dead node. )







[jira] [Commented] (HDFS-13571) Deadnode detection

2019-11-28 Thread Lisheng Sun (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-13571?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16984443#comment-16984443
 ] 

Lisheng Sun commented on HDFS-13571:


Thanks [~linyiqun] for the patient review and helpful comments.

I have added a release note for this JIRA.







[jira] [Updated] (HDFS-13571) Deadnode detection

2019-11-28 Thread Lisheng Sun (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-13571?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lisheng Sun updated HDFS-13571:
---
Release Note: 
When dead node blocks DFSInputStream,Deadnode detection can find it and share 
this information to other DFSInputStreams in the same DFSClient.
 Thus, these DFSInputStreams will not read from the dead node and be blocked by 
this dead node. 







[jira] [Updated] (HDFS-13571) Deadnode detection

2019-11-28 Thread Lisheng Sun (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-13571?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lisheng Sun updated HDFS-13571:
---
Summary: Deadnode detection  (was: Dead DataNode Detector)







[jira] [Commented] (HDFS-9695) HTTPFS - CHECKACCESS operation missing

2019-11-28 Thread hemanthboyina (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-9695?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16984415#comment-16984415
 ] 

hemanthboyina commented on HDFS-9695:
-

Updated the patch with the test failures and FindBugs warnings fixed.

Please review.

> HTTPFS - CHECKACCESS operation missing
> --
>
> Key: HDFS-9695
> URL: https://issues.apache.org/jira/browse/HDFS-9695
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: Bert Hekman
>Assignee: hemanthboyina
>Priority: Major
> Attachments: HDFS-9695.001.patch, HDFS-9695.002.patch, 
> HDFS-9695.003.patch, HDFS-9695.004.patch
>
>
> Hi,
> The CHECKACCESS operation seems to be missing in HTTPFS. I'm getting the 
> following error:
> {code}
> QueryParamException: java.lang.IllegalArgumentException: No enum constant 
> org.apache.hadoop.fs.http.client.HttpFSFileSystem.Operation.CHECKACCESS
> {code}
> A quick look into the org.apache.hadoop.fs.http.client.HttpFSFileSystem class 
> reveals that CHECKACCESS is not defined at all.
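
The error quoted above is the standard message produced by `Enum.valueOf` when no constant matches. A minimal sketch of that failure mode follows; the `Operation` enum here is a hypothetical stand-in with a few illustrative constants, not the real `HttpFSFileSystem.Operation`, and `parse` is an assumed helper rather than HttpFS code.

```java
import java.util.Locale;

class OperationLookupDemo {
  /** Hypothetical stand-in for an operation enum that lacks CHECKACCESS. */
  enum Operation { OPEN, GETFILESTATUS, LISTSTATUS }

  /** Parses an "op" parameter; throws IllegalArgumentException if unknown. */
  static Operation parse(String op) {
    return Operation.valueOf(op.toUpperCase(Locale.ENGLISH));
  }

  public static void main(String[] args) {
    System.out.println(parse("open"));
    try {
      parse("checkaccess");
    } catch (IllegalArgumentException e) {
      // The message names the enum type and the missing constant.
      System.out.println(e.getMessage());
    }
  }
}
```

Under this reading, supporting the operation means adding the constant to the enum and handling it wherever operations are dispatched; which classes that touches in HttpFS is for the attached patches to show.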






[jira] [Updated] (HDFS-9695) HTTPFS - CHECKACCESS operation missing

2019-11-28 Thread hemanthboyina (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-9695?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

hemanthboyina updated HDFS-9695:

Attachment: HDFS-9695.004.patch







[jira] [Commented] (HDFS-14961) [SBN read] Prevent ZKFC changing Observer Namenode state

2019-11-28 Thread Hudson (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-14961?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16984407#comment-16984407
 ] 

Hudson commented on HDFS-14961:
---

SUCCESS: Integrated in Jenkins build Hadoop-trunk-Commit #17708 (See 
[https://builds.apache.org/job/Hadoop-trunk-Commit/17708/])
HDFS-14961. [SBN read] Prevent ZKFC changing Observer Namenode state. 
(ayushsaxena: rev 46166bd8d1be6f25bd38703fb9b0a417e3ef750b)
* (edit) 
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/tools/TestDFSZKFailoverController.java
* (edit) 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/NameNodeRpcServer.java


> [SBN read] Prevent ZKFC changing Observer Namenode state
> 
>
> Key: HDFS-14961
> URL: https://issues.apache.org/jira/browse/HDFS-14961
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: Íñigo Goiri
>Assignee: Ayush Saxena
>Priority: Major
> Fix For: 3.3.0
>
> Attachments: HDFS-14961-01.patch, HDFS-14961-02.patch, 
> HDFS-14961-03.patch, HDFS-14961-04.patch, ZKFC-TEST-14961.patch
>
>
> HDFS-14130 made ZKFC aware of the Observer Namenode and hence allows ZKFC 
> to run alongside the Observer node.
> The Observer Namenode isn't supposed to be part of the ZKFC election process.
> But if the Namenode was part of the election before being turned into an 
> Observer by the transitionToObserver command, the ZKFC still sends 
> instructions to the Namenode as a result of its previous participation and 
> sometimes tends to change the state of the Observer to Standby.
> This is also the reason for the failure in TestDFSZKFailoverController.
> TestDFSZKFailoverController has been consistently failing with a timeout 
> waiting in testManualFailoverWithDFSHAAdmin(), in particular at 
> {{waitForHAState(1, HAServiceState.OBSERVER);}}.
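
One way to picture the behaviour the issue asks for is a guard that rejects state changes requested by the failover controller while the node is an Observer. The sketch below only illustrates that idea and is not the committed patch: the `Node` class, the `Source` enum, and the choice of exception are assumptions made for the example.

```java
class ObserverGuardDemo {
  enum State { ACTIVE, STANDBY, OBSERVER }

  /** Hypothetical origin of a transition request. */
  enum Source { USER, ZKFC }

  /** Hypothetical node that refuses ZKFC-driven changes while Observer. */
  static class Node {
    private State state;

    Node(State initial) {
      this.state = initial;
    }

    State getState() {
      return state;
    }

    void transitionToStandby(Source source) {
      if (state == State.OBSERVER && source == Source.ZKFC) {
        throw new IllegalStateException(
            "ZKFC may not change the state of an Observer node");
      }
      state = State.STANDBY;
    }
  }

  public static void main(String[] args) {
    Node node = new Node(State.OBSERVER);
    try {
      node.transitionToStandby(Source.ZKFC);
    } catch (IllegalStateException e) {
      System.out.println("rejected: " + e.getMessage());
    }
    System.out.println(node.getState()); // OBSERVER: state is unchanged
    node.transitionToStandby(Source.USER);
    System.out.println(node.getState()); // STANDBY: an admin request still works
  }
}
```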






[jira] [Updated] (HDFS-14961) [SBN read] Prevent ZKFC changing Observer Namenode state

2019-11-28 Thread Ayush Saxena (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-14961?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ayush Saxena updated HDFS-14961:

Fix Version/s: 3.3.0
 Hadoop Flags: Reviewed
   Resolution: Fixed
   Status: Resolved  (was: Patch Available)







[jira] [Commented] (HDFS-14961) [SBN read] Prevent ZKFC changing Observer Namenode state

2019-11-28 Thread Ayush Saxena (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-14961?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16984398#comment-16984398
 ] 

Ayush Saxena commented on HDFS-14961:
-

Committed to trunk.
Thanks [~elgoiri] for the report and review, and [~vinayakumarb], [~ferhui] and 
[~csun] for the reviews!

> [SBN read] Prevent ZKFC changing Observer Namenode state
> 
>
> Key: HDFS-14961
> URL: https://issues.apache.org/jira/browse/HDFS-14961
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: Íñigo Goiri
>Assignee: Ayush Saxena
>Priority: Major
> Attachments: HDFS-14961-01.patch, HDFS-14961-02.patch, 
> HDFS-14961-03.patch, HDFS-14961-04.patch, ZKFC-TEST-14961.patch
>
>
> HDFS-14130 made ZKFC aware of the Observer Namenode and hence allows ZKFC 
> to run alongside the Observer node.
> The Observer Namenode isn't supposed to be part of the ZKFC election process.
> But if the Namenode was part of the election before being turned into an 
> Observer by the transitionToObserver command, the ZKFC still sends 
> instructions to the Namenode as a result of its previous participation and 
> sometimes tends to change the state of the Observer to Standby.
> This is also the reason for the failure in TestDFSZKFailoverController.
> TestDFSZKFailoverController has been consistently failing with a timeout 
> waiting in testManualFailoverWithDFSHAAdmin(), in particular at 
> {{waitForHAState(1, HAServiceState.OBSERVER);}}.






[jira] [Created] (HDFS-15021) RBF: Delegation Token can't remove correctly in absence of cancelToken and restart the router.

2019-11-28 Thread Weidong Duan (Jira)
Weidong Duan created HDFS-15021:
---

 Summary: RBF:  Delegation Token can't remove correctly in absence 
of cancelToken and restart the router.
 Key: HDFS-15021
 URL: https://issues.apache.org/jira/browse/HDFS-15021
 Project: Hadoop HDFS
  Issue Type: Sub-task
Reporter: Weidong Duan


The ZKDelegationTokenSecretManager does not remove expired DTs from 
ZooKeeper as expected when the Router is restarted without the method 
`ZKDelegationTokenSecretManager#cancelToken` having been invoked.

As a result, many stale DTs are left in ZooKeeper, which may cause 
performance problems for the Router.

I think this is a bug and should be resolved later. Is that right?






[jira] [Commented] (HDFS-15009) FSCK "-list-corruptfileblocks" return Invalid Entries

2019-11-28 Thread hemanthboyina (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-15009?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16984367#comment-16984367
 ] 

hemanthboyina commented on HDFS-15009:
--

Thanks for the suggestion, [~ayushtkn].

I have updated the patch; please review.

> FSCK "-list-corruptfileblocks" return Invalid Entries
> -
>
> Key: HDFS-15009
> URL: https://issues.apache.org/jira/browse/HDFS-15009
> Project: Hadoop HDFS
> Issue Type: Bug
> Reporter: hemanthboyina
> Assignee: hemanthboyina
> Priority: Major
> Attachments: HDFS-15009.001.patch, HDFS-15009.002.patch, 
> HDFS-15009.003.patch, HDFS-15009.004.patch
>
>
> Scenario: we have two directories, dir1 and dir10, and only dir10 has 
> corrupt files. 
> If we now run -list-corruptfileblocks for dir1, the corrupt file count 
> reported for dir1 is actually that of dir10.
> {code:java}
>   while (blkIterator.hasNext()) {
>     BlockInfo blk = blkIterator.next();
>     final INodeFile inode = getBlockCollection(blk);
>     skip++;
>     if (inode != null) {
>       String src = inode.getFullPathName();
>       // Plain prefix match: "/dir10/..." also starts with "/dir1".
>       if (src.startsWith(path)) {
>         corruptFiles.add(new CorruptFileBlockInfo(src, blk));
>         count++;
>         if (count >= DEFAULT_MAX_CORRUPT_FILEBLOCKS_RETURNED)
>           break;
>       }
>     }
>   }
> {code}
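
The scenario above comes down to a plain string-prefix test on paths. The following is a minimal, self-contained sketch of that effect and of one possible stricter check. It is not taken from any of the attached patches; the class name CorruptPathFilter and the helper isUnder are hypothetical names used only for illustration.

```java
class CorruptPathFilter {

  /** Naive check, as in the quoted loop: a plain string prefix test. */
  static boolean naiveMatch(String src, String path) {
    return src.startsWith(path);
  }

  /**
   * Hypothetical stricter check: src matches only if it is the directory
   * itself or lies below it, so the prefix must end at a path separator.
   */
  static boolean isUnder(String src, String path) {
    if (path.equals("/")) {
      return src.startsWith("/");
    }
    // Drop a trailing slash so "/dir1/" and "/dir1" behave the same.
    String dir = path.endsWith("/")
        ? path.substring(0, path.length() - 1) : path;
    return src.equals(dir) || src.startsWith(dir + "/");
  }

  public static void main(String[] args) {
    System.out.println(naiveMatch("/dir10/file", "/dir1")); // true
    System.out.println(isUnder("/dir10/file", "/dir1"));    // false
    System.out.println(isUnder("/dir1/file", "/dir1"));     // true
  }
}
```

Whether the real fix compares path components, appends a separator, or does something else is for the patch itself to decide; the sketch only shows why dir10 entries leak into a dir1 listing.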






[jira] [Updated] (HDFS-15009) FSCK "-list-corruptfileblocks" return Invalid Entries

2019-11-28 Thread hemanthboyina (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-15009?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

hemanthboyina updated HDFS-15009:
-
Attachment: HDFS-15009.004.patch







[jira] [Commented] (HDFS-14961) [SBN read] Prevent ZKFC changing Observer Namenode state

2019-11-28 Thread Vinayakumar B (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-14961?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16984338#comment-16984338
 ] 

Vinayakumar B commented on HDFS-14961:
--

Thanks [~ayushtkn] for the analysis and the fix.
The fix looks good to me. +1.

There is already a check in the HealthMonitor thread that calls quitElection 
when the NameNode state is found to be OBSERVER.
{code:java}
if (changedState == HAServiceState.OBSERVER) {
  elector.quitElection(true);
  serviceState = HAServiceState.OBSERVER;
  return;
}
{code}
However, this monitoring is asynchronous and runs every 1 second. With a manual 
transition, the state can change directly in the NameNode, so ZKFC only picks 
up the change during monitoring and then quits the election.

As [~ferhui] suggested, checking the state before joining the election also 
doesn't hurt. It can be added in a separate Improvement Jira, as [~ayushtkn] 
already said.
{code:java}
if (serviceState != HAServiceState.OBSERVER) {
  elector.joinElection(targetToData(localTarget));
}
{code}
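
To see how the two checks in this comment fit together, here is a small toy model. It is only a sketch under stated assumptions: State, FakeElector and ObserverElectionGuard are hypothetical stand-ins, not Hadoop's HAServiceState, ActiveStandbyElector or ZKFailoverController, and the real methods take arguments that are omitted here.

```java
class ObserverElectionGuard {

  /** Hypothetical stand-in for the HA service states. */
  enum State { ACTIVE, STANDBY, OBSERVER }

  /** Hypothetical elector that only records election membership. */
  static final class FakeElector {
    boolean inElection;

    void joinElection() { inElection = true; }

    void quitElection() { inElection = false; }
  }

  final FakeElector elector = new FakeElector();
  State serviceState = State.STANDBY;

  /** Models the monitor-side check: leave the election on OBSERVER. */
  void onStateChange(State changedState) {
    serviceState = changedState;
    if (changedState == State.OBSERVER) {
      elector.quitElection();
    }
  }

  /** Models the suggested guard: never join while an observer. */
  void maybeJoinElection() {
    if (serviceState != State.OBSERVER) {
      elector.joinElection();
    }
  }

  public static void main(String[] args) {
    ObserverElectionGuard zkfc = new ObserverElectionGuard();
    zkfc.maybeJoinElection();
    System.out.println(zkfc.elector.inElection); // true
    zkfc.onStateChange(State.OBSERVER);
    zkfc.maybeJoinElection();
    System.out.println(zkfc.elector.inElection); // false
  }
}
```

In this model the guard in maybeJoinElection() only helps once serviceState has been refreshed, which matches the point above that the state is synced asynchronously during monitoring.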
 

> [SBN read] Prevent ZKFC changing Observer Namenode state
> 
>
> Key: HDFS-14961
> URL: https://issues.apache.org/jira/browse/HDFS-14961
> Project: Hadoop HDFS
> Issue Type: Bug
> Reporter: Íñigo Goiri
> Assignee: Ayush Saxena
> Priority: Major
> Attachments: HDFS-14961-01.patch, HDFS-14961-02.patch, 
> HDFS-14961-03.patch, HDFS-14961-04.patch, ZKFC-TEST-14961.patch
>
>
> HDFS-14130 made ZKFC aware of the Observer NameNode and hence allows ZKFC 
> to run alongside an Observer node.
> The Observer NameNode isn't supposed to be part of the ZKFC election process.
> But if the NameNode took part in the election before being turned into an 
> Observer by the transitionToObserver command, the ZKFC still sends 
> instructions to the NameNode as a result of that earlier participation, and 
> sometimes changes the state of the Observer to Standby.
> This is also the reason for the failure in TestDFSZKFailoverController.
> TestDFSZKFailoverController has been consistently failing with a timeout 
> while waiting in testManualFailoverWithDFSHAAdmin(), in particular at 
> {{waitForHAState(1, HAServiceState.OBSERVER);}}.






[jira] [Updated] (HDFS-15020) Add a test case of storage type quota to TestHdfsAdmin.

2019-11-28 Thread Jinglun (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-15020?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jinglun updated HDFS-15020:
---
Resolution: Not A Problem
Status: Resolved  (was: Patch Available)

Hi [~ayushtkn], thanks for the reminder! Since 
HdfsAdmin.setQuotaByStorageType() does nothing but call 
DistributedFileSystem.setQuotaByStorageType(src, type, quota), I think 
TestQuota.testQuotaByStorageType() already covers it. I'll close this Jira.

> Add a test case of storage type quota to TestHdfsAdmin.
> ---
>
> Key: HDFS-15020
> URL: https://issues.apache.org/jira/browse/HDFS-15020
> Project: Hadoop HDFS
> Issue Type: Improvement
> Reporter: Jinglun
> Assignee: Jinglun
> Priority: Minor
> Attachments: HDFS-15020.001.patch
>
>







[jira] [Commented] (HDFS-12348) disable removing blocks to trash while rolling upgrade

2019-11-28 Thread lindongdong (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-12348?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16984287#comment-16984287
 ] 

lindongdong commented on HDFS-12348:


Hi [~surendrasingh], thanks for your patch.

I found a problem with the patch:

After we run the rolling upgrade prepare step, the old DN moves deleted files 
to trash.

With this patch, the new DN will never delete the trash dir.

> disable removing blocks to trash while rolling upgrade
> --
>
> Key: HDFS-12348
> URL: https://issues.apache.org/jira/browse/HDFS-12348
> Project: Hadoop HDFS
> Issue Type: New Feature
> Components: datanode
> Reporter: Jiandan Yang
> Assignee: Jiandan Yang
> Priority: Major
> Attachments: HDFS-12348.001.patch, HDFS-12348.002.patch, 
> HDFS-12348.003.patch
>
>
> During a rolling upgrade, the DataNode moves block files and meta files to 
> trash, and only deletes them when finalize is executed. 
> This can fill up the DataNode's disks, because
> (1) files are frequently created and deleted (e.g. HBase compaction);
> (2) the cluster is very big, and a rolling upgrade often lasts several days.
> Our current solution is to clean the trash by hand, but this is very 
> dangerous in a production environment. 
> We think disabling the DataNode trash may be a good way to keep the disks 
> from filling up.






[jira] [Commented] (HDFS-13811) RBF: Race condition between router admin quota update and periodic quota update service

2019-11-28 Thread Yiqun Lin (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-13811?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16984238#comment-16984238
 ] 

Yiqun Lin commented on HDFS-13811:
--

[~LiJinglun], sorry for the delayed review. I was busy reviewing other 
patches. I will post my review comments in the next few days.

> RBF: Race condition between router admin quota update and periodic quota 
> update service
> ---
>
> Key: HDFS-13811
> URL: https://issues.apache.org/jira/browse/HDFS-13811
> Project: Hadoop HDFS
> Issue Type: Sub-task
> Reporter: Dibyendu Karmakar
> Assignee: Jinglun
> Priority: Major
> Attachments: HDFS-13811-000.patch, HDFS-13811-HDFS-13891-000.patch, 
> HDFS-13811.001.patch, HDFS-13811.002.patch, HDFS-13811.003.patch
>
>
> If we try to update the quota of an existing mount entry while the 
> periodic quota update service is running on the same mount entry, the 
> mount table ends up in an _inconsistent state._
> The transactions involved are:
> A - The quota update service fetches the mount table entries.
> B - The quota update service updates the mount table with the current usage.
> A' - The user updates the quota using the admin command.
> With the transaction sequence [ A A' B ], the quota update service 
> overwrites the mount table with the old quota value.
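
The [ A A' B ] interleaving described above is a classic lost update. The sketch below replays it deterministically on a single in-memory entry and contrasts it with a write that merges only the usage field into the latest value. It is an illustration under assumptions, not the approach of any attached patch: Entry, MountEntryRace and the numeric values are hypothetical, and the real mount table lives in the Router state store rather than in an AtomicReference.

```java
import java.util.concurrent.atomic.AtomicReference;

class MountEntryRace {

  /** Hypothetical mount entry: admin-set quota plus measured usage. */
  static final class Entry {
    final long quota;
    final long usage;

    Entry(long quota, long usage) {
      this.quota = quota;
      this.usage = usage;
    }
  }

  /** Replays [ A A' B ] where B writes back the whole stale entry. */
  static long staleWriteQuota() {
    AtomicReference<Entry> store = new AtomicReference<>(new Entry(100, 0));
    Entry snapshot = store.get();                   // A: service reads
    store.set(new Entry(200, store.get().usage));   // A': admin sets quota
    store.set(new Entry(snapshot.quota, 42));       // B: stale quota wins
    return store.get().quota;
  }

  /** Same sequence, but B merges only the usage into the latest entry. */
  static long mergedWriteQuota() {
    AtomicReference<Entry> store = new AtomicReference<>(new Entry(100, 0));
    store.get();                                    // A: service reads
    store.set(new Entry(200, store.get().usage));   // A': admin sets quota
    store.updateAndGet(cur -> new Entry(cur.quota, 42)); // B: usage only
    return store.get().quota;
  }

  public static void main(String[] args) {
    System.out.println(staleWriteQuota());  // 100, the admin update is lost
    System.out.println(mergedWriteQuota()); // 200, the admin update survives
  }
}
```

Other designs, such as locking around both operations or versioned writes, would address the same interleaving; the sketch only makes the ordering problem concrete.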


