[jira] [Commented] (HDFS-13821) RBF: Add dfs.federation.router.mount-table.cache.enable so that users can disable cache

2021-10-06 Thread Janus Chow (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-13821?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17424818#comment-17424818
 ] 

Janus Chow commented on HDFS-13821:
---

Thanks [~elgoiri] for the comment.
{quote}Can you post a PR modifying these parameters?
{quote}
Sure, will raise another ticket for the parameter changes.
{quote}Overall, we are trying to get out of guava, is it possible here? I have 
to say that the loading cache interface is pretty neat
{quote}
Personally, I do like using guava cache. But seems it's a little out of hand 
for guava here. Since Router will cache each path, one namespace can have 
millions of files. Router will forward to multi namespaces, so the hit rate 
won't be too high, the operation of loadCache and expireCache would be some 
overhead for this situation. I think the hit rate needs to be improved a lot.

> RBF: Add dfs.federation.router.mount-table.cache.enable so that users can 
> disable cache
> ---
>
> Key: HDFS-13821
> URL: https://issues.apache.org/jira/browse/HDFS-13821
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: hdfs
>Affects Versions: 3.1.0, 2.9.1, 3.0.3
>Reporter: Hui Fei
>Assignee: Hui Fei
>Priority: Major
> Fix For: 2.10.0, 3.2.0, 2.9.2, 3.0.4, 3.1.2
>
> Attachments: HDFS-13821.001.patch, HDFS-13821.002.patch, 
> HDFS-13821.003.patch, HDFS-13821.004.patch, HDFS-13821.005.patch, 
> HDFS-13821.006.patch, HDFS-13821.007.patch, HDFS-13821.008.patch, 
> LocalCacheTest.java, image-2018-08-13-11-27-49-023.png
>
>
> When i test rbf, if found performance problem.
> I found that ProxyAvgTime From Ganglia is so high, i run jstack on Router and 
> get the following stack frames
> {quote}
>    java.lang.Thread.State: WAITING (parking)
>     at sun.misc.Unsafe.park(Native Method)
>     - parking to wait for  <0x0005c264acd8> (a 
> java.util.concurrent.locks.ReentrantLock$NonfairSync)
>     at java.util.concurrent.locks.LockSupport.park(LockSupport.java:175)
>     at 
> java.util.concurrent.locks.AbstractQueuedSynchronizer.parkAndCheckInterrupt(AbstractQueuedSynchronizer.java:836)
>     at 
> java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireQueued(AbstractQueuedSynchronizer.java:870)
>     at 
> java.util.concurrent.locks.AbstractQueuedSynchronizer.acquire(AbstractQueuedSynchronizer.java:1199)
>     at 
> java.util.concurrent.locks.ReentrantLock$NonfairSync.lock(ReentrantLock.java:209)
>     at 
> java.util.concurrent.locks.ReentrantLock.lock(ReentrantLock.java:285)
>     at 
> com.google.common.cache.LocalCache$Segment.lockedGetOrLoad(LocalCache.java:2249)
>     at 
> com.google.common.cache.LocalCache$Segment.get(LocalCache.java:2228)
>     at com.google.common.cache.LocalCache.get(LocalCache.java:3965)
>     at 
> com.google.common.cache.LocalCache$LocalManualCache.get(LocalCache.java:4764)
>     at 
> org.apache.hadoop.hdfs.server.federation.resolver.MountTableResolver.getDestinationForPath(MountTableResolver.java:380)
>     at 
> org.apache.hadoop.hdfs.server.federation.router.RouterRpcServer.getLocationsForPath(RouterRpcServer.java:2104)
>     at 
> org.apache.hadoop.hdfs.server.federation.router.RouterRpcServer.getLocationsForPath(RouterRpcServer.java:2087)
>     at 
> org.apache.hadoop.hdfs.server.federation.router.RouterRpcServer.getListing(RouterRpcServer.java:1050)
>     at 
> org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.getListing(ClientNamenodeProtocolServerSideTranslatorPB.java:640)
>     at 
> org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)
>     at 
> org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:616)
>     at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:982)
>     at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2115)
>     at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2111)
>     at java.security.AccessController.doPrivileged(Native Method)
>     at javax.security.auth.Subject.doAs(Subject.java:422)
> {quote}
> Many threads blocked on *LocalCache*
> After disable the cache, ProxyAvgTime is down as follow showed
>  !image-2018-08-13-11-27-49-023.png! 



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-13821) RBF: Add dfs.federation.router.mount-table.cache.enable so that users can disable cache

2021-10-05 Thread Jira


[ 
https://issues.apache.org/jira/browse/HDFS-13821?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17424576#comment-17424576
 ] 

Íñigo Goiri commented on HDFS-13821:


Thanks [~Symious]; that's significant, a couple of questions:
* Can you post a PR modifying these parameters?
* Overall, we are trying to get out of guava, is it possible here? I have to 
say that the loading cache interface is pretty neat

> RBF: Add dfs.federation.router.mount-table.cache.enable so that users can 
> disable cache
> ---
>
> Key: HDFS-13821
> URL: https://issues.apache.org/jira/browse/HDFS-13821
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: hdfs
>Affects Versions: 3.1.0, 2.9.1, 3.0.3
>Reporter: Hui Fei
>Assignee: Hui Fei
>Priority: Major
> Fix For: 2.10.0, 3.2.0, 2.9.2, 3.0.4, 3.1.2
>
> Attachments: HDFS-13821.001.patch, HDFS-13821.002.patch, 
> HDFS-13821.003.patch, HDFS-13821.004.patch, HDFS-13821.005.patch, 
> HDFS-13821.006.patch, HDFS-13821.007.patch, HDFS-13821.008.patch, 
> LocalCacheTest.java, image-2018-08-13-11-27-49-023.png
>
>
> When i test rbf, if found performance problem.
> I found that ProxyAvgTime From Ganglia is so high, i run jstack on Router and 
> get the following stack frames
> {quote}
>    java.lang.Thread.State: WAITING (parking)
>     at sun.misc.Unsafe.park(Native Method)
>     - parking to wait for  <0x0005c264acd8> (a 
> java.util.concurrent.locks.ReentrantLock$NonfairSync)
>     at java.util.concurrent.locks.LockSupport.park(LockSupport.java:175)
>     at 
> java.util.concurrent.locks.AbstractQueuedSynchronizer.parkAndCheckInterrupt(AbstractQueuedSynchronizer.java:836)
>     at 
> java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireQueued(AbstractQueuedSynchronizer.java:870)
>     at 
> java.util.concurrent.locks.AbstractQueuedSynchronizer.acquire(AbstractQueuedSynchronizer.java:1199)
>     at 
> java.util.concurrent.locks.ReentrantLock$NonfairSync.lock(ReentrantLock.java:209)
>     at 
> java.util.concurrent.locks.ReentrantLock.lock(ReentrantLock.java:285)
>     at 
> com.google.common.cache.LocalCache$Segment.lockedGetOrLoad(LocalCache.java:2249)
>     at 
> com.google.common.cache.LocalCache$Segment.get(LocalCache.java:2228)
>     at com.google.common.cache.LocalCache.get(LocalCache.java:3965)
>     at 
> com.google.common.cache.LocalCache$LocalManualCache.get(LocalCache.java:4764)
>     at 
> org.apache.hadoop.hdfs.server.federation.resolver.MountTableResolver.getDestinationForPath(MountTableResolver.java:380)
>     at 
> org.apache.hadoop.hdfs.server.federation.router.RouterRpcServer.getLocationsForPath(RouterRpcServer.java:2104)
>     at 
> org.apache.hadoop.hdfs.server.federation.router.RouterRpcServer.getLocationsForPath(RouterRpcServer.java:2087)
>     at 
> org.apache.hadoop.hdfs.server.federation.router.RouterRpcServer.getListing(RouterRpcServer.java:1050)
>     at 
> org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.getListing(ClientNamenodeProtocolServerSideTranslatorPB.java:640)
>     at 
> org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)
>     at 
> org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:616)
>     at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:982)
>     at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2115)
>     at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2111)
>     at java.security.AccessController.doPrivileged(Native Method)
>     at javax.security.auth.Subject.doAs(Subject.java:422)
> {quote}
> Many threads blocked on *LocalCache*
> After disable the cache, ProxyAvgTime is down as follow showed
>  !image-2018-08-13-11-27-49-023.png! 



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-13821) RBF: Add dfs.federation.router.mount-table.cache.enable so that users can disable cache

2021-10-05 Thread Janus Chow (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-13821?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17424511#comment-17424511
 ] 

Janus Chow commented on HDFS-13821:
---

A comparison of the LocalCacheTest result on local PC would be as follows:
||Argument||original stats (ms)||set initialCapacity (ms)||
|1 8 1000|0.059|0.042|
|1 16 1000|0.131|0.076|
|1 32 1000|0.268|0.1|
|1 64 1000|0.453|0.129|
|1 128 1000|0.871|0.175|
|1 256 1000|1.796|0.258|
|1 512 1000|3.322|0.457|
|1 1024 1000|7.107|0.796|
|1 1024 1|7.6741|0.5801|

 

> RBF: Add dfs.federation.router.mount-table.cache.enable so that users can 
> disable cache
> ---
>
> Key: HDFS-13821
> URL: https://issues.apache.org/jira/browse/HDFS-13821
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: hdfs
>Affects Versions: 3.1.0, 2.9.1, 3.0.3
>Reporter: Hui Fei
>Assignee: Hui Fei
>Priority: Major
> Fix For: 2.10.0, 3.2.0, 2.9.2, 3.0.4, 3.1.2
>
> Attachments: HDFS-13821.001.patch, HDFS-13821.002.patch, 
> HDFS-13821.003.patch, HDFS-13821.004.patch, HDFS-13821.005.patch, 
> HDFS-13821.006.patch, HDFS-13821.007.patch, HDFS-13821.008.patch, 
> LocalCacheTest.java, image-2018-08-13-11-27-49-023.png
>
>
> When i test rbf, if found performance problem.
> I found that ProxyAvgTime From Ganglia is so high, i run jstack on Router and 
> get the following stack frames
> {quote}
>    java.lang.Thread.State: WAITING (parking)
>     at sun.misc.Unsafe.park(Native Method)
>     - parking to wait for  <0x0005c264acd8> (a 
> java.util.concurrent.locks.ReentrantLock$NonfairSync)
>     at java.util.concurrent.locks.LockSupport.park(LockSupport.java:175)
>     at 
> java.util.concurrent.locks.AbstractQueuedSynchronizer.parkAndCheckInterrupt(AbstractQueuedSynchronizer.java:836)
>     at 
> java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireQueued(AbstractQueuedSynchronizer.java:870)
>     at 
> java.util.concurrent.locks.AbstractQueuedSynchronizer.acquire(AbstractQueuedSynchronizer.java:1199)
>     at 
> java.util.concurrent.locks.ReentrantLock$NonfairSync.lock(ReentrantLock.java:209)
>     at 
> java.util.concurrent.locks.ReentrantLock.lock(ReentrantLock.java:285)
>     at 
> com.google.common.cache.LocalCache$Segment.lockedGetOrLoad(LocalCache.java:2249)
>     at 
> com.google.common.cache.LocalCache$Segment.get(LocalCache.java:2228)
>     at com.google.common.cache.LocalCache.get(LocalCache.java:3965)
>     at 
> com.google.common.cache.LocalCache$LocalManualCache.get(LocalCache.java:4764)
>     at 
> org.apache.hadoop.hdfs.server.federation.resolver.MountTableResolver.getDestinationForPath(MountTableResolver.java:380)
>     at 
> org.apache.hadoop.hdfs.server.federation.router.RouterRpcServer.getLocationsForPath(RouterRpcServer.java:2104)
>     at 
> org.apache.hadoop.hdfs.server.federation.router.RouterRpcServer.getLocationsForPath(RouterRpcServer.java:2087)
>     at 
> org.apache.hadoop.hdfs.server.federation.router.RouterRpcServer.getListing(RouterRpcServer.java:1050)
>     at 
> org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.getListing(ClientNamenodeProtocolServerSideTranslatorPB.java:640)
>     at 
> org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)
>     at 
> org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:616)
>     at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:982)
>     at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2115)
>     at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2111)
>     at java.security.AccessController.doPrivileged(Native Method)
>     at javax.security.auth.Subject.doAs(Subject.java:422)
> {quote}
> Many threads blocked on *LocalCache*
> After disable the cache, ProxyAvgTime is down as follow showed
>  !image-2018-08-13-11-27-49-023.png! 



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-13821) RBF: Add dfs.federation.router.mount-table.cache.enable so that users can disable cache

2021-10-05 Thread Janus Chow (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-13821?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17424499#comment-17424499
 ] 

Janus Chow commented on HDFS-13821:
---

The performance bottleneck should be related to the bug mentioned in 
[https://github.com/google/guava/issues/1055.] 

We can work around this issue by setting initialCapacity to maxCacheSize 
(mentioned in [https://unportant.info/chasing-down-guava-cache-slowness.html).] 

In branch 2.10, the guava version is 11.0.2, it's still affected.

> RBF: Add dfs.federation.router.mount-table.cache.enable so that users can 
> disable cache
> ---
>
> Key: HDFS-13821
> URL: https://issues.apache.org/jira/browse/HDFS-13821
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: hdfs
>Affects Versions: 3.1.0, 2.9.1, 3.0.3
>Reporter: Hui Fei
>Assignee: Hui Fei
>Priority: Major
> Fix For: 2.10.0, 3.2.0, 2.9.2, 3.0.4, 3.1.2
>
> Attachments: HDFS-13821.001.patch, HDFS-13821.002.patch, 
> HDFS-13821.003.patch, HDFS-13821.004.patch, HDFS-13821.005.patch, 
> HDFS-13821.006.patch, HDFS-13821.007.patch, HDFS-13821.008.patch, 
> LocalCacheTest.java, image-2018-08-13-11-27-49-023.png
>
>
> When i test rbf, if found performance problem.
> I found that ProxyAvgTime From Ganglia is so high, i run jstack on Router and 
> get the following stack frames
> {quote}
>    java.lang.Thread.State: WAITING (parking)
>     at sun.misc.Unsafe.park(Native Method)
>     - parking to wait for  <0x0005c264acd8> (a 
> java.util.concurrent.locks.ReentrantLock$NonfairSync)
>     at java.util.concurrent.locks.LockSupport.park(LockSupport.java:175)
>     at 
> java.util.concurrent.locks.AbstractQueuedSynchronizer.parkAndCheckInterrupt(AbstractQueuedSynchronizer.java:836)
>     at 
> java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireQueued(AbstractQueuedSynchronizer.java:870)
>     at 
> java.util.concurrent.locks.AbstractQueuedSynchronizer.acquire(AbstractQueuedSynchronizer.java:1199)
>     at 
> java.util.concurrent.locks.ReentrantLock$NonfairSync.lock(ReentrantLock.java:209)
>     at 
> java.util.concurrent.locks.ReentrantLock.lock(ReentrantLock.java:285)
>     at 
> com.google.common.cache.LocalCache$Segment.lockedGetOrLoad(LocalCache.java:2249)
>     at 
> com.google.common.cache.LocalCache$Segment.get(LocalCache.java:2228)
>     at com.google.common.cache.LocalCache.get(LocalCache.java:3965)
>     at 
> com.google.common.cache.LocalCache$LocalManualCache.get(LocalCache.java:4764)
>     at 
> org.apache.hadoop.hdfs.server.federation.resolver.MountTableResolver.getDestinationForPath(MountTableResolver.java:380)
>     at 
> org.apache.hadoop.hdfs.server.federation.router.RouterRpcServer.getLocationsForPath(RouterRpcServer.java:2104)
>     at 
> org.apache.hadoop.hdfs.server.federation.router.RouterRpcServer.getLocationsForPath(RouterRpcServer.java:2087)
>     at 
> org.apache.hadoop.hdfs.server.federation.router.RouterRpcServer.getListing(RouterRpcServer.java:1050)
>     at 
> org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.getListing(ClientNamenodeProtocolServerSideTranslatorPB.java:640)
>     at 
> org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)
>     at 
> org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:616)
>     at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:982)
>     at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2115)
>     at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2111)
>     at java.security.AccessController.doPrivileged(Native Method)
>     at javax.security.auth.Subject.doAs(Subject.java:422)
> {quote}
> Many threads blocked on *LocalCache*
> After disable the cache, ProxyAvgTime is down as follow showed
>  !image-2018-08-13-11-27-49-023.png! 



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-13821) RBF: Add dfs.federation.router.mount-table.cache.enable so that users can disable cache

2019-08-12 Thread JIRA


[ 
https://issues.apache.org/jira/browse/HDFS-13821?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16905350#comment-16905350
 ] 

Íñigo Goiri commented on HDFS-13821:


>From the description that [~ferhui] had, the source of this is that the cache 
>implementation is not very efficient.
>From what I remember, we should replace:
{code}
/** Path -> Remote location. */
private Cache locationCache;
{code}
We can do the same benchmark [~ferhui] did with a different implementation.

> RBF: Add dfs.federation.router.mount-table.cache.enable so that users can 
> disable cache
> ---
>
> Key: HDFS-13821
> URL: https://issues.apache.org/jira/browse/HDFS-13821
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: hdfs
>Affects Versions: 3.1.0, 2.9.1, 3.0.3
>Reporter: Fei Hui
>Assignee: Fei Hui
>Priority: Major
> Fix For: 2.10.0, 3.2.0, 2.9.2, 3.0.4, 3.1.2
>
> Attachments: HDFS-13821.001.patch, HDFS-13821.002.patch, 
> HDFS-13821.003.patch, HDFS-13821.004.patch, HDFS-13821.005.patch, 
> HDFS-13821.006.patch, HDFS-13821.007.patch, HDFS-13821.008.patch, 
> LocalCacheTest.java, image-2018-08-13-11-27-49-023.png
>
>
> When i test rbf, if found performance problem.
> I found that ProxyAvgTime From Ganglia is so high, i run jstack on Router and 
> get the following stack frames
> {quote}
>    java.lang.Thread.State: WAITING (parking)
>     at sun.misc.Unsafe.park(Native Method)
>     - parking to wait for  <0x0005c264acd8> (a 
> java.util.concurrent.locks.ReentrantLock$NonfairSync)
>     at java.util.concurrent.locks.LockSupport.park(LockSupport.java:175)
>     at 
> java.util.concurrent.locks.AbstractQueuedSynchronizer.parkAndCheckInterrupt(AbstractQueuedSynchronizer.java:836)
>     at 
> java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireQueued(AbstractQueuedSynchronizer.java:870)
>     at 
> java.util.concurrent.locks.AbstractQueuedSynchronizer.acquire(AbstractQueuedSynchronizer.java:1199)
>     at 
> java.util.concurrent.locks.ReentrantLock$NonfairSync.lock(ReentrantLock.java:209)
>     at 
> java.util.concurrent.locks.ReentrantLock.lock(ReentrantLock.java:285)
>     at 
> com.google.common.cache.LocalCache$Segment.lockedGetOrLoad(LocalCache.java:2249)
>     at 
> com.google.common.cache.LocalCache$Segment.get(LocalCache.java:2228)
>     at com.google.common.cache.LocalCache.get(LocalCache.java:3965)
>     at 
> com.google.common.cache.LocalCache$LocalManualCache.get(LocalCache.java:4764)
>     at 
> org.apache.hadoop.hdfs.server.federation.resolver.MountTableResolver.getDestinationForPath(MountTableResolver.java:380)
>     at 
> org.apache.hadoop.hdfs.server.federation.router.RouterRpcServer.getLocationsForPath(RouterRpcServer.java:2104)
>     at 
> org.apache.hadoop.hdfs.server.federation.router.RouterRpcServer.getLocationsForPath(RouterRpcServer.java:2087)
>     at 
> org.apache.hadoop.hdfs.server.federation.router.RouterRpcServer.getListing(RouterRpcServer.java:1050)
>     at 
> org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.getListing(ClientNamenodeProtocolServerSideTranslatorPB.java:640)
>     at 
> org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)
>     at 
> org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:616)
>     at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:982)
>     at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2115)
>     at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2111)
>     at java.security.AccessController.doPrivileged(Native Method)
>     at javax.security.auth.Subject.doAs(Subject.java:422)
> {quote}
> Many threads blocked on *LocalCache*
> After disable the cache, ProxyAvgTime is down as follow showed
>  !image-2018-08-13-11-27-49-023.png! 



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-13821) RBF: Add dfs.federation.router.mount-table.cache.enable so that users can disable cache

2019-08-11 Thread xuzq (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-13821?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16904827#comment-16904827
 ] 

xuzq commented on HDFS-13821:
-

Thanks [~elgoiri].

I found the same problem in pressure measuring RBF, default ProcessingAvgTime 
will be 20ms/min, ops is 2.48M/min.

After close the cache, ProcessingAvgTime down to 0.05ms/min. 

> RBF: Add dfs.federation.router.mount-table.cache.enable so that users can 
> disable cache
> ---
>
> Key: HDFS-13821
> URL: https://issues.apache.org/jira/browse/HDFS-13821
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: hdfs
>Affects Versions: 3.1.0, 2.9.1, 3.0.3
>Reporter: Fei Hui
>Assignee: Fei Hui
>Priority: Major
> Fix For: 2.10.0, 3.2.0, 2.9.2, 3.0.4, 3.1.2
>
> Attachments: HDFS-13821.001.patch, HDFS-13821.002.patch, 
> HDFS-13821.003.patch, HDFS-13821.004.patch, HDFS-13821.005.patch, 
> HDFS-13821.006.patch, HDFS-13821.007.patch, HDFS-13821.008.patch, 
> LocalCacheTest.java, image-2018-08-13-11-27-49-023.png
>
>
> When i test rbf, if found performance problem.
> I found that ProxyAvgTime From Ganglia is so high, i run jstack on Router and 
> get the following stack frames
> {quote}
>    java.lang.Thread.State: WAITING (parking)
>     at sun.misc.Unsafe.park(Native Method)
>     - parking to wait for  <0x0005c264acd8> (a 
> java.util.concurrent.locks.ReentrantLock$NonfairSync)
>     at java.util.concurrent.locks.LockSupport.park(LockSupport.java:175)
>     at 
> java.util.concurrent.locks.AbstractQueuedSynchronizer.parkAndCheckInterrupt(AbstractQueuedSynchronizer.java:836)
>     at 
> java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireQueued(AbstractQueuedSynchronizer.java:870)
>     at 
> java.util.concurrent.locks.AbstractQueuedSynchronizer.acquire(AbstractQueuedSynchronizer.java:1199)
>     at 
> java.util.concurrent.locks.ReentrantLock$NonfairSync.lock(ReentrantLock.java:209)
>     at 
> java.util.concurrent.locks.ReentrantLock.lock(ReentrantLock.java:285)
>     at 
> com.google.common.cache.LocalCache$Segment.lockedGetOrLoad(LocalCache.java:2249)
>     at 
> com.google.common.cache.LocalCache$Segment.get(LocalCache.java:2228)
>     at com.google.common.cache.LocalCache.get(LocalCache.java:3965)
>     at 
> com.google.common.cache.LocalCache$LocalManualCache.get(LocalCache.java:4764)
>     at 
> org.apache.hadoop.hdfs.server.federation.resolver.MountTableResolver.getDestinationForPath(MountTableResolver.java:380)
>     at 
> org.apache.hadoop.hdfs.server.federation.router.RouterRpcServer.getLocationsForPath(RouterRpcServer.java:2104)
>     at 
> org.apache.hadoop.hdfs.server.federation.router.RouterRpcServer.getLocationsForPath(RouterRpcServer.java:2087)
>     at 
> org.apache.hadoop.hdfs.server.federation.router.RouterRpcServer.getListing(RouterRpcServer.java:1050)
>     at 
> org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.getListing(ClientNamenodeProtocolServerSideTranslatorPB.java:640)
>     at 
> org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)
>     at 
> org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:616)
>     at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:982)
>     at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2115)
>     at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2111)
>     at java.security.AccessController.doPrivileged(Native Method)
>     at javax.security.auth.Subject.doAs(Subject.java:422)
> {quote}
> Many threads blocked on *LocalCache*
> After disable the cache, ProxyAvgTime is down as follow showed
>  !image-2018-08-13-11-27-49-023.png! 



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-13821) RBF: Add dfs.federation.router.mount-table.cache.enable so that users can disable cache

2019-08-11 Thread JIRA


[ 
https://issues.apache.org/jira/browse/HDFS-13821?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16904823#comment-16904823
 ] 

Íñigo Goiri commented on HDFS-13821:


In the past we always cached so to keep backwards compatibility we should keep 
this enabled by default.

> RBF: Add dfs.federation.router.mount-table.cache.enable so that users can 
> disable cache
> ---
>
> Key: HDFS-13821
> URL: https://issues.apache.org/jira/browse/HDFS-13821
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: hdfs
>Affects Versions: 3.1.0, 2.9.1, 3.0.3
>Reporter: Fei Hui
>Assignee: Fei Hui
>Priority: Major
> Fix For: 2.10.0, 3.2.0, 2.9.2, 3.0.4, 3.1.2
>
> Attachments: HDFS-13821.001.patch, HDFS-13821.002.patch, 
> HDFS-13821.003.patch, HDFS-13821.004.patch, HDFS-13821.005.patch, 
> HDFS-13821.006.patch, HDFS-13821.007.patch, HDFS-13821.008.patch, 
> LocalCacheTest.java, image-2018-08-13-11-27-49-023.png
>
>
> When i test rbf, if found performance problem.
> I found that ProxyAvgTime From Ganglia is so high, i run jstack on Router and 
> get the following stack frames
> {quote}
>    java.lang.Thread.State: WAITING (parking)
>     at sun.misc.Unsafe.park(Native Method)
>     - parking to wait for  <0x0005c264acd8> (a 
> java.util.concurrent.locks.ReentrantLock$NonfairSync)
>     at java.util.concurrent.locks.LockSupport.park(LockSupport.java:175)
>     at 
> java.util.concurrent.locks.AbstractQueuedSynchronizer.parkAndCheckInterrupt(AbstractQueuedSynchronizer.java:836)
>     at 
> java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireQueued(AbstractQueuedSynchronizer.java:870)
>     at 
> java.util.concurrent.locks.AbstractQueuedSynchronizer.acquire(AbstractQueuedSynchronizer.java:1199)
>     at 
> java.util.concurrent.locks.ReentrantLock$NonfairSync.lock(ReentrantLock.java:209)
>     at 
> java.util.concurrent.locks.ReentrantLock.lock(ReentrantLock.java:285)
>     at 
> com.google.common.cache.LocalCache$Segment.lockedGetOrLoad(LocalCache.java:2249)
>     at 
> com.google.common.cache.LocalCache$Segment.get(LocalCache.java:2228)
>     at com.google.common.cache.LocalCache.get(LocalCache.java:3965)
>     at 
> com.google.common.cache.LocalCache$LocalManualCache.get(LocalCache.java:4764)
>     at 
> org.apache.hadoop.hdfs.server.federation.resolver.MountTableResolver.getDestinationForPath(MountTableResolver.java:380)
>     at 
> org.apache.hadoop.hdfs.server.federation.router.RouterRpcServer.getLocationsForPath(RouterRpcServer.java:2104)
>     at 
> org.apache.hadoop.hdfs.server.federation.router.RouterRpcServer.getLocationsForPath(RouterRpcServer.java:2087)
>     at 
> org.apache.hadoop.hdfs.server.federation.router.RouterRpcServer.getListing(RouterRpcServer.java:1050)
>     at 
> org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.getListing(ClientNamenodeProtocolServerSideTranslatorPB.java:640)
>     at 
> org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)
>     at 
> org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:616)
>     at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:982)
>     at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2115)
>     at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2111)
>     at java.security.AccessController.doPrivileged(Native Method)
>     at javax.security.auth.Subject.doAs(Subject.java:422)
> {quote}
> Many threads blocked on *LocalCache*
> After disable the cache, ProxyAvgTime is down as follow showed
>  !image-2018-08-13-11-27-49-023.png! 



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-13821) RBF: Add dfs.federation.router.mount-table.cache.enable so that users can disable cache

2019-08-11 Thread xuzq (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-13821?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16904781#comment-16904781
 ] 

xuzq commented on HDFS-13821:
-

I'm sorry to discuss this again, I think 
FEDERATION_MOUNT_TABLE_CACHE_ENABLE_DEFAULT should be false. The cache is 
turned off by default.

> RBF: Add dfs.federation.router.mount-table.cache.enable so that users can 
> disable cache
> ---
>
> Key: HDFS-13821
> URL: https://issues.apache.org/jira/browse/HDFS-13821
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: hdfs
>Affects Versions: 3.1.0, 2.9.1, 3.0.3
>Reporter: Fei Hui
>Assignee: Fei Hui
>Priority: Major
> Fix For: 2.10.0, 3.2.0, 2.9.2, 3.0.4, 3.1.2
>
> Attachments: HDFS-13821.001.patch, HDFS-13821.002.patch, 
> HDFS-13821.003.patch, HDFS-13821.004.patch, HDFS-13821.005.patch, 
> HDFS-13821.006.patch, HDFS-13821.007.patch, HDFS-13821.008.patch, 
> LocalCacheTest.java, image-2018-08-13-11-27-49-023.png
>
>
> When i test rbf, if found performance problem.
> I found that ProxyAvgTime From Ganglia is so high, i run jstack on Router and 
> get the following stack frames
> {quote}
>    java.lang.Thread.State: WAITING (parking)
>     at sun.misc.Unsafe.park(Native Method)
>     - parking to wait for  <0x0005c264acd8> (a 
> java.util.concurrent.locks.ReentrantLock$NonfairSync)
>     at java.util.concurrent.locks.LockSupport.park(LockSupport.java:175)
>     at 
> java.util.concurrent.locks.AbstractQueuedSynchronizer.parkAndCheckInterrupt(AbstractQueuedSynchronizer.java:836)
>     at 
> java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireQueued(AbstractQueuedSynchronizer.java:870)
>     at 
> java.util.concurrent.locks.AbstractQueuedSynchronizer.acquire(AbstractQueuedSynchronizer.java:1199)
>     at 
> java.util.concurrent.locks.ReentrantLock$NonfairSync.lock(ReentrantLock.java:209)
>     at 
> java.util.concurrent.locks.ReentrantLock.lock(ReentrantLock.java:285)
>     at 
> com.google.common.cache.LocalCache$Segment.lockedGetOrLoad(LocalCache.java:2249)
>     at 
> com.google.common.cache.LocalCache$Segment.get(LocalCache.java:2228)
>     at com.google.common.cache.LocalCache.get(LocalCache.java:3965)
>     at 
> com.google.common.cache.LocalCache$LocalManualCache.get(LocalCache.java:4764)
>     at 
> org.apache.hadoop.hdfs.server.federation.resolver.MountTableResolver.getDestinationForPath(MountTableResolver.java:380)
>     at 
> org.apache.hadoop.hdfs.server.federation.router.RouterRpcServer.getLocationsForPath(RouterRpcServer.java:2104)
>     at 
> org.apache.hadoop.hdfs.server.federation.router.RouterRpcServer.getLocationsForPath(RouterRpcServer.java:2087)
>     at 
> org.apache.hadoop.hdfs.server.federation.router.RouterRpcServer.getListing(RouterRpcServer.java:1050)
>     at 
> org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.getListing(ClientNamenodeProtocolServerSideTranslatorPB.java:640)
>     at 
> org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)
>     at 
> org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:616)
>     at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:982)
>     at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2115)
>     at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2111)
>     at java.security.AccessController.doPrivileged(Native Method)
>     at javax.security.auth.Subject.doAs(Subject.java:422)
> {quote}
> Many threads blocked on *LocalCache*
> After disable the cache, ProxyAvgTime is down as follow showed
>  !image-2018-08-13-11-27-49-023.png! 



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-13821) RBF: Add dfs.federation.router.mount-table.cache.enable so that users can disable cache

2018-08-22 Thread Fei Hui (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-13821?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16588530#comment-16588530
 ] 

Fei Hui commented on HDFS-13821:


[~linyiqun][~elgoiri] Thanks for review & commit

> RBF: Add dfs.federation.router.mount-table.cache.enable so that users can 
> disable cache
> ---
>
> Key: HDFS-13821
> URL: https://issues.apache.org/jira/browse/HDFS-13821
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: hdfs
>Affects Versions: 3.1.0, 2.9.1, 3.0.3
>Reporter: Fei Hui
>Assignee: Fei Hui
>Priority: Major
> Fix For: 2.10.0, 3.2.0, 2.9.2, 3.0.4, 3.1.2
>
> Attachments: HDFS-13821.001.patch, HDFS-13821.002.patch, 
> HDFS-13821.003.patch, HDFS-13821.004.patch, HDFS-13821.005.patch, 
> HDFS-13821.006.patch, HDFS-13821.007.patch, HDFS-13821.008.patch, 
> LocalCacheTest.java, image-2018-08-13-11-27-49-023.png
>
>
> When i test rbf, if found performance problem.
> I found that ProxyAvgTime From Ganglia is so high, i run jstack on Router and 
> get the following stack frames
> {quote}
>    java.lang.Thread.State: WAITING (parking)
>     at sun.misc.Unsafe.park(Native Method)
>     - parking to wait for  <0x0005c264acd8> (a 
> java.util.concurrent.locks.ReentrantLock$NonfairSync)
>     at java.util.concurrent.locks.LockSupport.park(LockSupport.java:175)
>     at 
> java.util.concurrent.locks.AbstractQueuedSynchronizer.parkAndCheckInterrupt(AbstractQueuedSynchronizer.java:836)
>     at 
> java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireQueued(AbstractQueuedSynchronizer.java:870)
>     at 
> java.util.concurrent.locks.AbstractQueuedSynchronizer.acquire(AbstractQueuedSynchronizer.java:1199)
>     at 
> java.util.concurrent.locks.ReentrantLock$NonfairSync.lock(ReentrantLock.java:209)
>     at 
> java.util.concurrent.locks.ReentrantLock.lock(ReentrantLock.java:285)
>     at 
> com.google.common.cache.LocalCache$Segment.lockedGetOrLoad(LocalCache.java:2249)
>     at 
> com.google.common.cache.LocalCache$Segment.get(LocalCache.java:2228)
>     at com.google.common.cache.LocalCache.get(LocalCache.java:3965)
>     at 
> com.google.common.cache.LocalCache$LocalManualCache.get(LocalCache.java:4764)
>     at 
> org.apache.hadoop.hdfs.server.federation.resolver.MountTableResolver.getDestinationForPath(MountTableResolver.java:380)
>     at 
> org.apache.hadoop.hdfs.server.federation.router.RouterRpcServer.getLocationsForPath(RouterRpcServer.java:2104)
>     at 
> org.apache.hadoop.hdfs.server.federation.router.RouterRpcServer.getLocationsForPath(RouterRpcServer.java:2087)
>     at 
> org.apache.hadoop.hdfs.server.federation.router.RouterRpcServer.getListing(RouterRpcServer.java:1050)
>     at 
> org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.getListing(ClientNamenodeProtocolServerSideTranslatorPB.java:640)
>     at 
> org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)
>     at 
> org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:616)
>     at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:982)
>     at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2115)
>     at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2111)
>     at java.security.AccessController.doPrivileged(Native Method)
>     at javax.security.auth.Subject.doAs(Subject.java:422)
> {quote}
> Many threads blocked on *LocalCache*
> After disable the cache, ProxyAvgTime is down as follow showed
>  !image-2018-08-13-11-27-49-023.png! 



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-13821) RBF: Add dfs.federation.router.mount-table.cache.enable so that users can disable cache

2018-08-21 Thread Hudson (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-13821?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16588349#comment-16588349
 ] 

Hudson commented on HDFS-13821:
---

FAILURE: Integrated in Jenkins build Hadoop-trunk-Commit #14813 (See 
[https://builds.apache.org/job/Hadoop-trunk-Commit/14813/])
HDFS-13821. RBF: Add dfs.federation.router.mount-table.cache.enable so (yqlin: 
rev 81847392badcd58d934333e7c3b5bf14b4fa1f3f)
* (edit) 
hadoop-hdfs-project/hadoop-hdfs-rbf/src/test/java/org/apache/hadoop/hdfs/server/federation/resolver/TestMountTableResolver.java
* (edit) 
hadoop-hdfs-project/hadoop-hdfs-rbf/src/main/java/org/apache/hadoop/hdfs/server/federation/resolver/MountTableResolver.java
* (edit) 
hadoop-hdfs-project/hadoop-hdfs-rbf/src/main/java/org/apache/hadoop/hdfs/server/federation/router/RBFConfigKeys.java
* (edit) 
hadoop-hdfs-project/hadoop-hdfs-rbf/src/main/resources/hdfs-rbf-default.xml


> RBF: Add dfs.federation.router.mount-table.cache.enable so that users can 
> disable cache
> ---
>
> Key: HDFS-13821
> URL: https://issues.apache.org/jira/browse/HDFS-13821
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: hdfs
>Affects Versions: 3.1.0, 2.9.1, 3.0.3
>Reporter: Fei Hui
>Priority: Major
> Attachments: HDFS-13821.001.patch, HDFS-13821.002.patch, 
> HDFS-13821.003.patch, HDFS-13821.004.patch, HDFS-13821.005.patch, 
> HDFS-13821.006.patch, HDFS-13821.007.patch, HDFS-13821.008.patch, 
> LocalCacheTest.java, image-2018-08-13-11-27-49-023.png
>
>
> When i test rbf, if found performance problem.
> I found that ProxyAvgTime From Ganglia is so high, i run jstack on Router and 
> get the following stack frames
> {quote}
>    java.lang.Thread.State: WAITING (parking)
>     at sun.misc.Unsafe.park(Native Method)
>     - parking to wait for  <0x0005c264acd8> (a 
> java.util.concurrent.locks.ReentrantLock$NonfairSync)
>     at java.util.concurrent.locks.LockSupport.park(LockSupport.java:175)
>     at 
> java.util.concurrent.locks.AbstractQueuedSynchronizer.parkAndCheckInterrupt(AbstractQueuedSynchronizer.java:836)
>     at 
> java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireQueued(AbstractQueuedSynchronizer.java:870)
>     at 
> java.util.concurrent.locks.AbstractQueuedSynchronizer.acquire(AbstractQueuedSynchronizer.java:1199)
>     at 
> java.util.concurrent.locks.ReentrantLock$NonfairSync.lock(ReentrantLock.java:209)
>     at 
> java.util.concurrent.locks.ReentrantLock.lock(ReentrantLock.java:285)
>     at 
> com.google.common.cache.LocalCache$Segment.lockedGetOrLoad(LocalCache.java:2249)
>     at 
> com.google.common.cache.LocalCache$Segment.get(LocalCache.java:2228)
>     at com.google.common.cache.LocalCache.get(LocalCache.java:3965)
>     at 
> com.google.common.cache.LocalCache$LocalManualCache.get(LocalCache.java:4764)
>     at 
> org.apache.hadoop.hdfs.server.federation.resolver.MountTableResolver.getDestinationForPath(MountTableResolver.java:380)
>     at 
> org.apache.hadoop.hdfs.server.federation.router.RouterRpcServer.getLocationsForPath(RouterRpcServer.java:2104)
>     at 
> org.apache.hadoop.hdfs.server.federation.router.RouterRpcServer.getLocationsForPath(RouterRpcServer.java:2087)
>     at 
> org.apache.hadoop.hdfs.server.federation.router.RouterRpcServer.getListing(RouterRpcServer.java:1050)
>     at 
> org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.getListing(ClientNamenodeProtocolServerSideTranslatorPB.java:640)
>     at 
> org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)
>     at 
> org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:616)
>     at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:982)
>     at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2115)
>     at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2111)
>     at java.security.AccessController.doPrivileged(Native Method)
>     at javax.security.auth.Subject.doAs(Subject.java:422)
> {quote}
> Many threads blocked on *LocalCache*
> After disable the cache, ProxyAvgTime is down as follow showed
>  !image-2018-08-13-11-27-49-023.png! 



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-13821) RBF: Add dfs.federation.router.mount-table.cache.enable so that users can disable cache

2018-08-21 Thread Yiqun Lin (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-13821?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16588345#comment-16588345
 ] 

Yiqun Lin commented on HDFS-13821:
--

Thanks [~elgoiri] for the confirmation. Committing this.

> RBF: Add dfs.federation.router.mount-table.cache.enable so that users can 
> disable cache
> ---
>
> Key: HDFS-13821
> URL: https://issues.apache.org/jira/browse/HDFS-13821
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: hdfs
>Affects Versions: 3.1.0, 2.9.1, 3.0.3
>Reporter: Fei Hui
>Priority: Major
> Attachments: HDFS-13821.001.patch, HDFS-13821.002.patch, 
> HDFS-13821.003.patch, HDFS-13821.004.patch, HDFS-13821.005.patch, 
> HDFS-13821.006.patch, HDFS-13821.007.patch, HDFS-13821.008.patch, 
> LocalCacheTest.java, image-2018-08-13-11-27-49-023.png
>
>
> When i test rbf, if found performance problem.
> I found that ProxyAvgTime From Ganglia is so high, i run jstack on Router and 
> get the following stack frames
> {quote}
>    java.lang.Thread.State: WAITING (parking)
>     at sun.misc.Unsafe.park(Native Method)
>     - parking to wait for  <0x0005c264acd8> (a 
> java.util.concurrent.locks.ReentrantLock$NonfairSync)
>     at java.util.concurrent.locks.LockSupport.park(LockSupport.java:175)
>     at 
> java.util.concurrent.locks.AbstractQueuedSynchronizer.parkAndCheckInterrupt(AbstractQueuedSynchronizer.java:836)
>     at 
> java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireQueued(AbstractQueuedSynchronizer.java:870)
>     at 
> java.util.concurrent.locks.AbstractQueuedSynchronizer.acquire(AbstractQueuedSynchronizer.java:1199)
>     at 
> java.util.concurrent.locks.ReentrantLock$NonfairSync.lock(ReentrantLock.java:209)
>     at 
> java.util.concurrent.locks.ReentrantLock.lock(ReentrantLock.java:285)
>     at 
> com.google.common.cache.LocalCache$Segment.lockedGetOrLoad(LocalCache.java:2249)
>     at 
> com.google.common.cache.LocalCache$Segment.get(LocalCache.java:2228)
>     at com.google.common.cache.LocalCache.get(LocalCache.java:3965)
>     at 
> com.google.common.cache.LocalCache$LocalManualCache.get(LocalCache.java:4764)
>     at 
> org.apache.hadoop.hdfs.server.federation.resolver.MountTableResolver.getDestinationForPath(MountTableResolver.java:380)
>     at 
> org.apache.hadoop.hdfs.server.federation.router.RouterRpcServer.getLocationsForPath(RouterRpcServer.java:2104)
>     at 
> org.apache.hadoop.hdfs.server.federation.router.RouterRpcServer.getLocationsForPath(RouterRpcServer.java:2087)
>     at 
> org.apache.hadoop.hdfs.server.federation.router.RouterRpcServer.getListing(RouterRpcServer.java:1050)
>     at 
> org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.getListing(ClientNamenodeProtocolServerSideTranslatorPB.java:640)
>     at 
> org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)
>     at 
> org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:616)
>     at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:982)
>     at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2115)
>     at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2111)
>     at java.security.AccessController.doPrivileged(Native Method)
>     at javax.security.auth.Subject.doAs(Subject.java:422)
> {quote}
> Many threads blocked on *LocalCache*
> After disable the cache, ProxyAvgTime is down as follow showed
>  !image-2018-08-13-11-27-49-023.png! 



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-13821) RBF: Add dfs.federation.router.mount-table.cache.enable so that users can disable cache

2018-08-21 Thread JIRA


[ 
https://issues.apache.org/jira/browse/HDFS-13821?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16588065#comment-16588065
 ] 

Íñigo Goiri commented on HDFS-13821:


+1 on  [^HDFS-13821.008.patch].

I'd like to eventually improve the performance of the cache though.
Let's do it in a separate JIRA.

> RBF: Add dfs.federation.router.mount-table.cache.enable so that users can 
> disable cache
> ---
>
> Key: HDFS-13821
> URL: https://issues.apache.org/jira/browse/HDFS-13821
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: hdfs
>Affects Versions: 3.1.0, 2.9.1, 3.0.3
>Reporter: Fei Hui
>Priority: Major
> Attachments: HDFS-13821.001.patch, HDFS-13821.002.patch, 
> HDFS-13821.003.patch, HDFS-13821.004.patch, HDFS-13821.005.patch, 
> HDFS-13821.006.patch, HDFS-13821.007.patch, HDFS-13821.008.patch, 
> LocalCacheTest.java, image-2018-08-13-11-27-49-023.png
>
>
> When i test rbf, if found performance problem.
> I found that ProxyAvgTime From Ganglia is so high, i run jstack on Router and 
> get the following stack frames
> {quote}
>    java.lang.Thread.State: WAITING (parking)
>     at sun.misc.Unsafe.park(Native Method)
>     - parking to wait for  <0x0005c264acd8> (a 
> java.util.concurrent.locks.ReentrantLock$NonfairSync)
>     at java.util.concurrent.locks.LockSupport.park(LockSupport.java:175)
>     at 
> java.util.concurrent.locks.AbstractQueuedSynchronizer.parkAndCheckInterrupt(AbstractQueuedSynchronizer.java:836)
>     at 
> java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireQueued(AbstractQueuedSynchronizer.java:870)
>     at 
> java.util.concurrent.locks.AbstractQueuedSynchronizer.acquire(AbstractQueuedSynchronizer.java:1199)
>     at 
> java.util.concurrent.locks.ReentrantLock$NonfairSync.lock(ReentrantLock.java:209)
>     at 
> java.util.concurrent.locks.ReentrantLock.lock(ReentrantLock.java:285)
>     at 
> com.google.common.cache.LocalCache$Segment.lockedGetOrLoad(LocalCache.java:2249)
>     at 
> com.google.common.cache.LocalCache$Segment.get(LocalCache.java:2228)
>     at com.google.common.cache.LocalCache.get(LocalCache.java:3965)
>     at 
> com.google.common.cache.LocalCache$LocalManualCache.get(LocalCache.java:4764)
>     at 
> org.apache.hadoop.hdfs.server.federation.resolver.MountTableResolver.getDestinationForPath(MountTableResolver.java:380)
>     at 
> org.apache.hadoop.hdfs.server.federation.router.RouterRpcServer.getLocationsForPath(RouterRpcServer.java:2104)
>     at 
> org.apache.hadoop.hdfs.server.federation.router.RouterRpcServer.getLocationsForPath(RouterRpcServer.java:2087)
>     at 
> org.apache.hadoop.hdfs.server.federation.router.RouterRpcServer.getListing(RouterRpcServer.java:1050)
>     at 
> org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.getListing(ClientNamenodeProtocolServerSideTranslatorPB.java:640)
>     at 
> org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)
>     at 
> org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:616)
>     at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:982)
>     at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2115)
>     at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2111)
>     at java.security.AccessController.doPrivileged(Native Method)
>     at javax.security.auth.Subject.doAs(Subject.java:422)
> {quote}
> Many threads blocked on *LocalCache*
> After disable the cache, ProxyAvgTime is down as follow showed
>  !image-2018-08-13-11-27-49-023.png! 



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-13821) RBF: Add dfs.federation.router.mount-table.cache.enable so that users can disable cache

2018-08-20 Thread Yiqun Lin (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-13821?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16586829#comment-16586829
 ] 

Yiqun Lin commented on HDFS-13821:
--

LGTM, +1. I will hold off the commit to let [~elgoiri] to have a final check.

> RBF: Add dfs.federation.router.mount-table.cache.enable so that users can 
> disable cache
> ---
>
> Key: HDFS-13821
> URL: https://issues.apache.org/jira/browse/HDFS-13821
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: hdfs
>Affects Versions: 3.1.0, 2.9.1, 3.0.3
>Reporter: Fei Hui
>Priority: Major
> Attachments: HDFS-13821.001.patch, HDFS-13821.002.patch, 
> HDFS-13821.003.patch, HDFS-13821.004.patch, HDFS-13821.005.patch, 
> HDFS-13821.006.patch, HDFS-13821.007.patch, HDFS-13821.008.patch, 
> LocalCacheTest.java, image-2018-08-13-11-27-49-023.png
>
>
> When i test rbf, if found performance problem.
> I found that ProxyAvgTime From Ganglia is so high, i run jstack on Router and 
> get the following stack frames
> {quote}
>    java.lang.Thread.State: WAITING (parking)
>     at sun.misc.Unsafe.park(Native Method)
>     - parking to wait for  <0x0005c264acd8> (a 
> java.util.concurrent.locks.ReentrantLock$NonfairSync)
>     at java.util.concurrent.locks.LockSupport.park(LockSupport.java:175)
>     at 
> java.util.concurrent.locks.AbstractQueuedSynchronizer.parkAndCheckInterrupt(AbstractQueuedSynchronizer.java:836)
>     at 
> java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireQueued(AbstractQueuedSynchronizer.java:870)
>     at 
> java.util.concurrent.locks.AbstractQueuedSynchronizer.acquire(AbstractQueuedSynchronizer.java:1199)
>     at 
> java.util.concurrent.locks.ReentrantLock$NonfairSync.lock(ReentrantLock.java:209)
>     at 
> java.util.concurrent.locks.ReentrantLock.lock(ReentrantLock.java:285)
>     at 
> com.google.common.cache.LocalCache$Segment.lockedGetOrLoad(LocalCache.java:2249)
>     at 
> com.google.common.cache.LocalCache$Segment.get(LocalCache.java:2228)
>     at com.google.common.cache.LocalCache.get(LocalCache.java:3965)
>     at 
> com.google.common.cache.LocalCache$LocalManualCache.get(LocalCache.java:4764)
>     at 
> org.apache.hadoop.hdfs.server.federation.resolver.MountTableResolver.getDestinationForPath(MountTableResolver.java:380)
>     at 
> org.apache.hadoop.hdfs.server.federation.router.RouterRpcServer.getLocationsForPath(RouterRpcServer.java:2104)
>     at 
> org.apache.hadoop.hdfs.server.federation.router.RouterRpcServer.getLocationsForPath(RouterRpcServer.java:2087)
>     at 
> org.apache.hadoop.hdfs.server.federation.router.RouterRpcServer.getListing(RouterRpcServer.java:1050)
>     at 
> org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.getListing(ClientNamenodeProtocolServerSideTranslatorPB.java:640)
>     at 
> org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)
>     at 
> org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:616)
>     at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:982)
>     at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2115)
>     at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2111)
>     at java.security.AccessController.doPrivileged(Native Method)
>     at javax.security.auth.Subject.doAs(Subject.java:422)
> {quote}
> Many threads blocked on *LocalCache*
> After disable the cache, ProxyAvgTime is down as follow showed
>  !image-2018-08-13-11-27-49-023.png! 



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-13821) RBF: Add dfs.federation.router.mount-table.cache.enable so that users can disable cache

2018-08-20 Thread genericqa (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-13821?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16586239#comment-16586239
 ] 

genericqa commented on HDFS-13821:
--

| (/) *{color:green}+1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
22s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 1 new or modified test 
files. {color} |
|| || || || {color:brown} trunk Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 18m 
19s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
28s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
19s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
31s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
11m 51s{color} | {color:green} branch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  0m 
48s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
32s{color} | {color:green} trunk passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  0m 
30s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
25s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  0m 
25s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
16s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
27s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} xml {color} | {color:green}  0m  
1s{color} | {color:green} The patch has no ill-formed XML file. {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
12m 22s{color} | {color:green} patch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  0m 
54s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
30s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 16m  
0s{color} | {color:green} hadoop-hdfs-rbf in the patch passed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
25s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 65m 47s{color} | 
{color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:ba1ab08 |
| JIRA Issue | HDFS-13821 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12936290/HDFS-13821.008.patch |
| Optional Tests |  asflicense  compile  javac  javadoc  mvninstall  mvnsite  
unit  shadedclient  findbugs  checkstyle  xml  |
| uname | Linux a1fce67646f4 3.13.0-153-generic #203-Ubuntu SMP Thu Jun 14 
08:52:28 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/patchprocess/precommit/personality/provided.sh |
| git revision | trunk / 65e7469 |
| maven | version: Apache Maven 3.3.9 |
| Default Java | 1.8.0_181 |
| findbugs | v3.1.0-RC1 |
|  Test Results | 
https://builds.apache.org/job/PreCommit-HDFS-Build/24818/testReport/ |
| Max. process+thread count | 953 (vs. ulimit of 1) |
| modules | C: hadoop-hdfs-project/hadoop-hdfs-rbf U: 
hadoop-hdfs-project/hadoop-hdfs-rbf |
| Console output | 
https://builds.apache.org/job/PreCommit-HDFS-Build/24818/console |
| Powered by | Apache Yetus 0.8.0-SNAPSHOT   http://yetus.apache.org |


This message was automatically generated.



> RBF: Add 

[jira] [Commented] (HDFS-13821) RBF: Add dfs.federation.router.mount-table.cache.enable so that users can disable cache

2018-08-20 Thread Fei Hui (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-13821?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16586151#comment-16586151
 ] 

Fei Hui commented on HDFS-13821:


[~linyiqun] Thanks for detail comments
Upload v008 patch

> RBF: Add dfs.federation.router.mount-table.cache.enable so that users can 
> disable cache
> ---
>
> Key: HDFS-13821
> URL: https://issues.apache.org/jira/browse/HDFS-13821
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: hdfs
>Affects Versions: 3.1.0, 2.9.1, 3.0.3
>Reporter: Fei Hui
>Priority: Major
> Attachments: HDFS-13821.001.patch, HDFS-13821.002.patch, 
> HDFS-13821.003.patch, HDFS-13821.004.patch, HDFS-13821.005.patch, 
> HDFS-13821.006.patch, HDFS-13821.007.patch, HDFS-13821.008.patch, 
> LocalCacheTest.java, image-2018-08-13-11-27-49-023.png
>
>
> When i test rbf, if found performance problem.
> I found that ProxyAvgTime From Ganglia is so high, i run jstack on Router and 
> get the following stack frames
> {quote}
>    java.lang.Thread.State: WAITING (parking)
>     at sun.misc.Unsafe.park(Native Method)
>     - parking to wait for  <0x0005c264acd8> (a 
> java.util.concurrent.locks.ReentrantLock$NonfairSync)
>     at java.util.concurrent.locks.LockSupport.park(LockSupport.java:175)
>     at 
> java.util.concurrent.locks.AbstractQueuedSynchronizer.parkAndCheckInterrupt(AbstractQueuedSynchronizer.java:836)
>     at 
> java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireQueued(AbstractQueuedSynchronizer.java:870)
>     at 
> java.util.concurrent.locks.AbstractQueuedSynchronizer.acquire(AbstractQueuedSynchronizer.java:1199)
>     at 
> java.util.concurrent.locks.ReentrantLock$NonfairSync.lock(ReentrantLock.java:209)
>     at 
> java.util.concurrent.locks.ReentrantLock.lock(ReentrantLock.java:285)
>     at 
> com.google.common.cache.LocalCache$Segment.lockedGetOrLoad(LocalCache.java:2249)
>     at 
> com.google.common.cache.LocalCache$Segment.get(LocalCache.java:2228)
>     at com.google.common.cache.LocalCache.get(LocalCache.java:3965)
>     at 
> com.google.common.cache.LocalCache$LocalManualCache.get(LocalCache.java:4764)
>     at 
> org.apache.hadoop.hdfs.server.federation.resolver.MountTableResolver.getDestinationForPath(MountTableResolver.java:380)
>     at 
> org.apache.hadoop.hdfs.server.federation.router.RouterRpcServer.getLocationsForPath(RouterRpcServer.java:2104)
>     at 
> org.apache.hadoop.hdfs.server.federation.router.RouterRpcServer.getLocationsForPath(RouterRpcServer.java:2087)
>     at 
> org.apache.hadoop.hdfs.server.federation.router.RouterRpcServer.getListing(RouterRpcServer.java:1050)
>     at 
> org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.getListing(ClientNamenodeProtocolServerSideTranslatorPB.java:640)
>     at 
> org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)
>     at 
> org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:616)
>     at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:982)
>     at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2115)
>     at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2111)
>     at java.security.AccessController.doPrivileged(Native Method)
>     at javax.security.auth.Subject.doAs(Subject.java:422)
> {quote}
> Many threads blocked on *LocalCache*
> After disable the cache, ProxyAvgTime is down as follow showed
>  !image-2018-08-13-11-27-49-023.png! 



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-13821) RBF: Add dfs.federation.router.mount-table.cache.enable so that users can disable cache

2018-08-20 Thread Yiqun Lin (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-13821?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16586114#comment-16586114
 ] 

Yiqun Lin commented on HDFS-13821:
--

[~ferhui], I afraid that you haven't fully addressed my comments:

* It will be a good behaviour to add failed message in fail() check, like 
fail("getCacheSize call should fail.").
* The description of the new setting can be simplified as:
{noformat}
 Set to true to enable mount table cache (Path to Remote Location cache).
 Disabling the cache is recommended when a large amount of unique paths are 
queried.
{noformat}

> RBF: Add dfs.federation.router.mount-table.cache.enable so that users can 
> disable cache
> ---
>
> Key: HDFS-13821
> URL: https://issues.apache.org/jira/browse/HDFS-13821
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: hdfs
>Affects Versions: 3.1.0, 2.9.1, 3.0.3
>Reporter: Fei Hui
>Priority: Major
> Attachments: HDFS-13821.001.patch, HDFS-13821.002.patch, 
> HDFS-13821.003.patch, HDFS-13821.004.patch, HDFS-13821.005.patch, 
> HDFS-13821.006.patch, HDFS-13821.007.patch, LocalCacheTest.java, 
> image-2018-08-13-11-27-49-023.png
>
>
> When i test rbf, if found performance problem.
> I found that ProxyAvgTime From Ganglia is so high, i run jstack on Router and 
> get the following stack frames
> {quote}
>    java.lang.Thread.State: WAITING (parking)
>     at sun.misc.Unsafe.park(Native Method)
>     - parking to wait for  <0x0005c264acd8> (a 
> java.util.concurrent.locks.ReentrantLock$NonfairSync)
>     at java.util.concurrent.locks.LockSupport.park(LockSupport.java:175)
>     at 
> java.util.concurrent.locks.AbstractQueuedSynchronizer.parkAndCheckInterrupt(AbstractQueuedSynchronizer.java:836)
>     at 
> java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireQueued(AbstractQueuedSynchronizer.java:870)
>     at 
> java.util.concurrent.locks.AbstractQueuedSynchronizer.acquire(AbstractQueuedSynchronizer.java:1199)
>     at 
> java.util.concurrent.locks.ReentrantLock$NonfairSync.lock(ReentrantLock.java:209)
>     at 
> java.util.concurrent.locks.ReentrantLock.lock(ReentrantLock.java:285)
>     at 
> com.google.common.cache.LocalCache$Segment.lockedGetOrLoad(LocalCache.java:2249)
>     at 
> com.google.common.cache.LocalCache$Segment.get(LocalCache.java:2228)
>     at com.google.common.cache.LocalCache.get(LocalCache.java:3965)
>     at 
> com.google.common.cache.LocalCache$LocalManualCache.get(LocalCache.java:4764)
>     at 
> org.apache.hadoop.hdfs.server.federation.resolver.MountTableResolver.getDestinationForPath(MountTableResolver.java:380)
>     at 
> org.apache.hadoop.hdfs.server.federation.router.RouterRpcServer.getLocationsForPath(RouterRpcServer.java:2104)
>     at 
> org.apache.hadoop.hdfs.server.federation.router.RouterRpcServer.getLocationsForPath(RouterRpcServer.java:2087)
>     at 
> org.apache.hadoop.hdfs.server.federation.router.RouterRpcServer.getListing(RouterRpcServer.java:1050)
>     at 
> org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.getListing(ClientNamenodeProtocolServerSideTranslatorPB.java:640)
>     at 
> org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)
>     at 
> org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:616)
>     at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:982)
>     at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2115)
>     at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2111)
>     at java.security.AccessController.doPrivileged(Native Method)
>     at javax.security.auth.Subject.doAs(Subject.java:422)
> {quote}
> Many threads blocked on *LocalCache*
> After disable the cache, ProxyAvgTime is down as follow showed
>  !image-2018-08-13-11-27-49-023.png! 



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-13821) RBF: Add dfs.federation.router.mount-table.cache.enable so that users can disable cache

2018-08-20 Thread genericqa (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-13821?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16585829#comment-16585829
 ] 

genericqa commented on HDFS-13821:
--

| (/) *{color:green}+1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
16s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 1 new or modified test 
files. {color} |
|| || || || {color:brown} trunk Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 22m 
30s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
29s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
21s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
31s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
12m  8s{color} | {color:green} branch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  0m 
55s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
35s{color} | {color:green} trunk passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  0m 
31s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
25s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  0m 
25s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
16s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
28s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} xml {color} | {color:green}  0m  
1s{color} | {color:green} The patch has no ill-formed XML file. {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
13m 11s{color} | {color:green} patch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  0m 
57s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
29s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 14m 
58s{color} | {color:green} hadoop-hdfs-rbf in the patch passed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
25s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 70m 16s{color} | 
{color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:ba1ab08 |
| JIRA Issue | HDFS-13821 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12936255/HDFS-13821.007.patch |
| Optional Tests |  asflicense  compile  javac  javadoc  mvninstall  mvnsite  
unit  shadedclient  findbugs  checkstyle  xml  |
| uname | Linux 7cfce179dc86 3.13.0-139-generic #188-Ubuntu SMP Tue Jan 9 
14:43:09 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/patchprocess/precommit/personality/provided.sh |
| git revision | trunk / e3d73bb |
| maven | version: Apache Maven 3.3.9 |
| Default Java | 1.8.0_171 |
| findbugs | v3.1.0-RC1 |
|  Test Results | 
https://builds.apache.org/job/PreCommit-HDFS-Build/24816/testReport/ |
| Max. process+thread count | 951 (vs. ulimit of 1) |
| modules | C: hadoop-hdfs-project/hadoop-hdfs-rbf U: 
hadoop-hdfs-project/hadoop-hdfs-rbf |
| Console output | 
https://builds.apache.org/job/PreCommit-HDFS-Build/24816/console |
| Powered by | Apache Yetus 0.8.0-SNAPSHOT   http://yetus.apache.org |


This message was automatically generated.



> RBF: Add 

[jira] [Commented] (HDFS-13821) RBF: Add dfs.federation.router.mount-table.cache.enable so that users can disable cache

2018-08-20 Thread Fei Hui (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-13821?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16585749#comment-16585749
 ] 

Fei Hui commented on HDFS-13821:


[~linyiqun] Thanks. 
Upload v007 patch according to your suggestions

> RBF: Add dfs.federation.router.mount-table.cache.enable so that users can 
> disable cache
> ---
>
> Key: HDFS-13821
> URL: https://issues.apache.org/jira/browse/HDFS-13821
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: hdfs
>Affects Versions: 3.1.0, 2.9.1, 3.0.3
>Reporter: Fei Hui
>Priority: Major
> Attachments: HDFS-13821.001.patch, HDFS-13821.002.patch, 
> HDFS-13821.003.patch, HDFS-13821.004.patch, HDFS-13821.005.patch, 
> HDFS-13821.006.patch, HDFS-13821.007.patch, LocalCacheTest.java, 
> image-2018-08-13-11-27-49-023.png
>
>
> When i test rbf, if found performance problem.
> I found that ProxyAvgTime From Ganglia is so high, i run jstack on Router and 
> get the following stack frames
> {quote}
>    java.lang.Thread.State: WAITING (parking)
>     at sun.misc.Unsafe.park(Native Method)
>     - parking to wait for  <0x0005c264acd8> (a 
> java.util.concurrent.locks.ReentrantLock$NonfairSync)
>     at java.util.concurrent.locks.LockSupport.park(LockSupport.java:175)
>     at 
> java.util.concurrent.locks.AbstractQueuedSynchronizer.parkAndCheckInterrupt(AbstractQueuedSynchronizer.java:836)
>     at 
> java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireQueued(AbstractQueuedSynchronizer.java:870)
>     at 
> java.util.concurrent.locks.AbstractQueuedSynchronizer.acquire(AbstractQueuedSynchronizer.java:1199)
>     at 
> java.util.concurrent.locks.ReentrantLock$NonfairSync.lock(ReentrantLock.java:209)
>     at 
> java.util.concurrent.locks.ReentrantLock.lock(ReentrantLock.java:285)
>     at 
> com.google.common.cache.LocalCache$Segment.lockedGetOrLoad(LocalCache.java:2249)
>     at 
> com.google.common.cache.LocalCache$Segment.get(LocalCache.java:2228)
>     at com.google.common.cache.LocalCache.get(LocalCache.java:3965)
>     at 
> com.google.common.cache.LocalCache$LocalManualCache.get(LocalCache.java:4764)
>     at 
> org.apache.hadoop.hdfs.server.federation.resolver.MountTableResolver.getDestinationForPath(MountTableResolver.java:380)
>     at 
> org.apache.hadoop.hdfs.server.federation.router.RouterRpcServer.getLocationsForPath(RouterRpcServer.java:2104)
>     at 
> org.apache.hadoop.hdfs.server.federation.router.RouterRpcServer.getLocationsForPath(RouterRpcServer.java:2087)
>     at 
> org.apache.hadoop.hdfs.server.federation.router.RouterRpcServer.getListing(RouterRpcServer.java:1050)
>     at 
> org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.getListing(ClientNamenodeProtocolServerSideTranslatorPB.java:640)
>     at 
> org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)
>     at 
> org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:616)
>     at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:982)
>     at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2115)
>     at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2111)
>     at java.security.AccessController.doPrivileged(Native Method)
>     at javax.security.auth.Subject.doAs(Subject.java:422)
> {quote}
> Many threads blocked on *LocalCache*
> After disable the cache, ProxyAvgTime is down as follow showed
>  !image-2018-08-13-11-27-49-023.png! 



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-13821) RBF: Add dfs.federation.router.mount-table.cache.enable so that users can disable cache

2018-08-20 Thread genericqa (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-13821?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16585592#comment-16585592
 ] 

genericqa commented on HDFS-13821:
--

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
16s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 1 new or modified test 
files. {color} |
|| || || || {color:brown} trunk Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 19m 
12s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
32s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
21s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
34s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
12m  1s{color} | {color:green} branch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  0m 
52s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
30s{color} | {color:green} trunk passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  0m 
28s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
24s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  0m 
24s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
16s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
29s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} xml {color} | {color:green}  0m  
2s{color} | {color:green} The patch has no ill-formed XML file. {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
12m 33s{color} | {color:green} patch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  0m 
57s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
30s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 16m  9s{color} 
| {color:red} hadoop-hdfs-rbf in the patch failed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
23s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 67m 14s{color} | 
{color:black} {color} |
\\
\\
|| Reason || Tests ||
| Failed junit tests | 
hadoop.fs.contract.router.web.TestRouterWebHDFSContractAppend |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:ba1ab08 |
| JIRA Issue | HDFS-13821 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12936229/HDFS-13821.006.patch |
| Optional Tests |  asflicense  compile  javac  javadoc  mvninstall  mvnsite  
unit  shadedclient  findbugs  checkstyle  xml  |
| uname | Linux cf8ad2f09c9c 3.13.0-139-generic #188-Ubuntu SMP Tue Jan 9 
14:43:09 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/patchprocess/precommit/personality/provided.sh |
| git revision | trunk / 6425ed2 |
| maven | version: Apache Maven 3.3.9 |
| Default Java | 1.8.0_171 |
| findbugs | v3.1.0-RC1 |
| unit | 
https://builds.apache.org/job/PreCommit-HDFS-Build/24813/artifact/out/patch-unit-hadoop-hdfs-project_hadoop-hdfs-rbf.txt
 |
|  Test Results | 
https://builds.apache.org/job/PreCommit-HDFS-Build/24813/testReport/ |
| Max. process+thread count | 947 (vs. ulimit of 1) |
| modules | C: hadoop-hdfs-project/hadoop-hdfs-rbf U: 

[jira] [Commented] (HDFS-13821) RBF: Add dfs.federation.router.mount-table.cache.enable so that users can disable cache

2018-08-20 Thread Yiqun Lin (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-13821?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16585509#comment-16585509
 ] 

Yiqun Lin commented on HDFS-13821:
--

Thanks [~ferhui] for updating the patch, [~ferhui]. Two comments:

* Could you describe more about the new setting 
{{dfs.federation.router.mount-table.cache.enable}} in config file 
hdfs-rbf-default.xml? At least, we might mention the usage of this cache and 
that will be good understanding for the users to enable/disable the cache.
* Add a fail(...) check after line {{tmpMountTable.getCacheSize();}}, so that 
we can ensure that the getCacheSize call will throw IOExceotion.

> RBF: Add dfs.federation.router.mount-table.cache.enable so that users can 
> disable cache
> ---
>
> Key: HDFS-13821
> URL: https://issues.apache.org/jira/browse/HDFS-13821
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: hdfs
>Affects Versions: 3.1.0, 2.9.1, 3.0.3
>Reporter: Fei Hui
>Priority: Major
> Attachments: HDFS-13821.001.patch, HDFS-13821.002.patch, 
> HDFS-13821.003.patch, HDFS-13821.004.patch, HDFS-13821.005.patch, 
> HDFS-13821.006.patch, LocalCacheTest.java, image-2018-08-13-11-27-49-023.png
>
>
> When i test rbf, if found performance problem.
> I found that ProxyAvgTime From Ganglia is so high, i run jstack on Router and 
> get the following stack frames
> {quote}
>    java.lang.Thread.State: WAITING (parking)
>     at sun.misc.Unsafe.park(Native Method)
>     - parking to wait for  <0x0005c264acd8> (a 
> java.util.concurrent.locks.ReentrantLock$NonfairSync)
>     at java.util.concurrent.locks.LockSupport.park(LockSupport.java:175)
>     at 
> java.util.concurrent.locks.AbstractQueuedSynchronizer.parkAndCheckInterrupt(AbstractQueuedSynchronizer.java:836)
>     at 
> java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireQueued(AbstractQueuedSynchronizer.java:870)
>     at 
> java.util.concurrent.locks.AbstractQueuedSynchronizer.acquire(AbstractQueuedSynchronizer.java:1199)
>     at 
> java.util.concurrent.locks.ReentrantLock$NonfairSync.lock(ReentrantLock.java:209)
>     at 
> java.util.concurrent.locks.ReentrantLock.lock(ReentrantLock.java:285)
>     at 
> com.google.common.cache.LocalCache$Segment.lockedGetOrLoad(LocalCache.java:2249)
>     at 
> com.google.common.cache.LocalCache$Segment.get(LocalCache.java:2228)
>     at com.google.common.cache.LocalCache.get(LocalCache.java:3965)
>     at 
> com.google.common.cache.LocalCache$LocalManualCache.get(LocalCache.java:4764)
>     at 
> org.apache.hadoop.hdfs.server.federation.resolver.MountTableResolver.getDestinationForPath(MountTableResolver.java:380)
>     at 
> org.apache.hadoop.hdfs.server.federation.router.RouterRpcServer.getLocationsForPath(RouterRpcServer.java:2104)
>     at 
> org.apache.hadoop.hdfs.server.federation.router.RouterRpcServer.getLocationsForPath(RouterRpcServer.java:2087)
>     at 
> org.apache.hadoop.hdfs.server.federation.router.RouterRpcServer.getListing(RouterRpcServer.java:1050)
>     at 
> org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.getListing(ClientNamenodeProtocolServerSideTranslatorPB.java:640)
>     at 
> org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)
>     at 
> org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:616)
>     at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:982)
>     at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2115)
>     at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2111)
>     at java.security.AccessController.doPrivileged(Native Method)
>     at javax.security.auth.Subject.doAs(Subject.java:422)
> {quote}
> Many threads blocked on *LocalCache*
> After disable the cache, ProxyAvgTime is down as follow showed
>  !image-2018-08-13-11-27-49-023.png! 



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-13821) RBF: Add dfs.federation.router.mount-table.cache.enable so that users can disable cache

2018-08-19 Thread genericqa (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-13821?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16585444#comment-16585444
 ] 

genericqa commented on HDFS-13821:
--

| (/) *{color:green}+1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
20s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 1 new or modified test 
files. {color} |
|| || || || {color:brown} trunk Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 19m 
46s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
29s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
19s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
30s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
12m  1s{color} | {color:green} branch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  0m 
48s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
32s{color} | {color:green} trunk passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  0m 
29s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
24s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  0m 
24s{color} | {color:green} the patch passed {color} |
| {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange}  
0m 15s{color} | {color:orange} hadoop-hdfs-project/hadoop-hdfs-rbf: The patch 
generated 1 new + 0 unchanged - 0 fixed = 1 total (was 0) {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
26s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} xml {color} | {color:green}  0m  
2s{color} | {color:green} The patch has no ill-formed XML file. {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
12m 26s{color} | {color:green} patch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  0m 
53s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
30s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 16m  
4s{color} | {color:green} hadoop-hdfs-rbf in the patch passed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
23s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 67m 25s{color} | 
{color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:ba1ab08 |
| JIRA Issue | HDFS-13821 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12936215/HDFS-13821.005.patch |
| Optional Tests |  asflicense  compile  javac  javadoc  mvninstall  mvnsite  
unit  shadedclient  findbugs  checkstyle  xml  |
| uname | Linux 7dd4d7aef87e 3.13.0-153-generic #203-Ubuntu SMP Thu Jun 14 
08:52:28 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/patchprocess/precommit/personality/provided.sh |
| git revision | trunk / 4aacbff |
| maven | version: Apache Maven 3.3.9 |
| Default Java | 1.8.0_181 |
| findbugs | v3.1.0-RC1 |
| checkstyle | 
https://builds.apache.org/job/PreCommit-HDFS-Build/24810/artifact/out/diff-checkstyle-hadoop-hdfs-project_hadoop-hdfs-rbf.txt
 |
|  Test Results | 
https://builds.apache.org/job/PreCommit-HDFS-Build/24810/testReport/ |
| Max. process+thread count | 953 (vs. ulimit of 1) |
| modules | C: hadoop-hdfs-project/hadoop-hdfs-rbf U: 

[jira] [Commented] (HDFS-13821) RBF: Add dfs.federation.router.mount-table.cache.enable so that users can disable cache

2018-08-19 Thread Fei Hui (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-13821?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16585401#comment-16585401
 ] 

Fei Hui commented on HDFS-13821:


Upload v005 fix checkstyle

> RBF: Add dfs.federation.router.mount-table.cache.enable so that users can 
> disable cache
> ---
>
> Key: HDFS-13821
> URL: https://issues.apache.org/jira/browse/HDFS-13821
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: hdfs
>Affects Versions: 3.1.0, 2.9.1, 3.0.3
>Reporter: Fei Hui
>Priority: Major
> Attachments: HDFS-13821.001.patch, HDFS-13821.002.patch, 
> HDFS-13821.003.patch, HDFS-13821.004.patch, HDFS-13821.005.patch, 
> LocalCacheTest.java, image-2018-08-13-11-27-49-023.png
>
>
> When i test rbf, if found performance problem.
> I found that ProxyAvgTime From Ganglia is so high, i run jstack on Router and 
> get the following stack frames
> {quote}
>    java.lang.Thread.State: WAITING (parking)
>     at sun.misc.Unsafe.park(Native Method)
>     - parking to wait for  <0x0005c264acd8> (a 
> java.util.concurrent.locks.ReentrantLock$NonfairSync)
>     at java.util.concurrent.locks.LockSupport.park(LockSupport.java:175)
>     at 
> java.util.concurrent.locks.AbstractQueuedSynchronizer.parkAndCheckInterrupt(AbstractQueuedSynchronizer.java:836)
>     at 
> java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireQueued(AbstractQueuedSynchronizer.java:870)
>     at 
> java.util.concurrent.locks.AbstractQueuedSynchronizer.acquire(AbstractQueuedSynchronizer.java:1199)
>     at 
> java.util.concurrent.locks.ReentrantLock$NonfairSync.lock(ReentrantLock.java:209)
>     at 
> java.util.concurrent.locks.ReentrantLock.lock(ReentrantLock.java:285)
>     at 
> com.google.common.cache.LocalCache$Segment.lockedGetOrLoad(LocalCache.java:2249)
>     at 
> com.google.common.cache.LocalCache$Segment.get(LocalCache.java:2228)
>     at com.google.common.cache.LocalCache.get(LocalCache.java:3965)
>     at 
> com.google.common.cache.LocalCache$LocalManualCache.get(LocalCache.java:4764)
>     at 
> org.apache.hadoop.hdfs.server.federation.resolver.MountTableResolver.getDestinationForPath(MountTableResolver.java:380)
>     at 
> org.apache.hadoop.hdfs.server.federation.router.RouterRpcServer.getLocationsForPath(RouterRpcServer.java:2104)
>     at 
> org.apache.hadoop.hdfs.server.federation.router.RouterRpcServer.getLocationsForPath(RouterRpcServer.java:2087)
>     at 
> org.apache.hadoop.hdfs.server.federation.router.RouterRpcServer.getListing(RouterRpcServer.java:1050)
>     at 
> org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.getListing(ClientNamenodeProtocolServerSideTranslatorPB.java:640)
>     at 
> org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)
>     at 
> org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:616)
>     at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:982)
>     at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2115)
>     at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2111)
>     at java.security.AccessController.doPrivileged(Native Method)
>     at javax.security.auth.Subject.doAs(Subject.java:422)
> {quote}
> Many threads blocked on *LocalCache*
> After disable the cache, ProxyAvgTime is down as follow showed
>  !image-2018-08-13-11-27-49-023.png! 



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-13821) RBF: Add dfs.federation.router.mount-table.cache.enable so that users can disable cache

2018-08-19 Thread genericqa (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-13821?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16585386#comment-16585386
 ] 

genericqa commented on HDFS-13821:
--

| (/) *{color:green}+1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
23s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 1 new or modified test 
files. {color} |
|| || || || {color:brown} trunk Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 22m 
28s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
31s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
23s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
37s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
13m 10s{color} | {color:green} branch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  0m 
57s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
37s{color} | {color:green} trunk passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  0m 
33s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
30s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  0m 
30s{color} | {color:green} the patch passed {color} |
| {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange}  
0m 19s{color} | {color:orange} hadoop-hdfs-project/hadoop-hdfs-rbf: The patch 
generated 1 new + 0 unchanged - 0 fixed = 1 total (was 0) {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
33s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} xml {color} | {color:green}  0m  
1s{color} | {color:green} The patch has no ill-formed XML file. {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
14m  2s{color} | {color:green} patch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  1m 
11s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
38s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 18m 
26s{color} | {color:green} hadoop-hdfs-rbf in the patch passed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
30s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 76m 54s{color} | 
{color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:ba1ab08 |
| JIRA Issue | HDFS-13821 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12936210/HDFS-13821.004.patch |
| Optional Tests |  asflicense  compile  javac  javadoc  mvninstall  mvnsite  
unit  shadedclient  findbugs  checkstyle  xml  |
| uname | Linux abf33119fcec 3.13.0-153-generic #203-Ubuntu SMP Thu Jun 14 
08:52:28 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/patchprocess/precommit/personality/provided.sh |
| git revision | trunk / 4aacbff |
| maven | version: Apache Maven 3.3.9 |
| Default Java | 1.8.0_181 |
| findbugs | v3.1.0-RC1 |
| checkstyle | 
https://builds.apache.org/job/PreCommit-HDFS-Build/24807/artifact/out/diff-checkstyle-hadoop-hdfs-project_hadoop-hdfs-rbf.txt
 |
|  Test Results | 
https://builds.apache.org/job/PreCommit-HDFS-Build/24807/testReport/ |
| Max. process+thread count | 966 (vs. ulimit of 1) |
| modules | C: hadoop-hdfs-project/hadoop-hdfs-rbf U: 

[jira] [Commented] (HDFS-13821) RBF: Add dfs.federation.router.mount-table.cache.enable so that users can disable cache

2018-08-19 Thread Fei Hui (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-13821?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16585348#comment-16585348
 ] 

Fei Hui commented on HDFS-13821:


[~elgoiri] Thanks. Change the default value and add a unit test according to 
your suggestions.
Upload v004 patch

> RBF: Add dfs.federation.router.mount-table.cache.enable so that users can 
> disable cache
> ---
>
> Key: HDFS-13821
> URL: https://issues.apache.org/jira/browse/HDFS-13821
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: hdfs
>Affects Versions: 3.1.0, 2.9.1, 3.0.3
>Reporter: Fei Hui
>Priority: Major
> Attachments: HDFS-13821.001.patch, HDFS-13821.002.patch, 
> HDFS-13821.003.patch, LocalCacheTest.java, image-2018-08-13-11-27-49-023.png
>
>
> When i test rbf, if found performance problem.
> I found that ProxyAvgTime From Ganglia is so high, i run jstack on Router and 
> get the following stack frames
> {quote}
>    java.lang.Thread.State: WAITING (parking)
>     at sun.misc.Unsafe.park(Native Method)
>     - parking to wait for  <0x0005c264acd8> (a 
> java.util.concurrent.locks.ReentrantLock$NonfairSync)
>     at java.util.concurrent.locks.LockSupport.park(LockSupport.java:175)
>     at 
> java.util.concurrent.locks.AbstractQueuedSynchronizer.parkAndCheckInterrupt(AbstractQueuedSynchronizer.java:836)
>     at 
> java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireQueued(AbstractQueuedSynchronizer.java:870)
>     at 
> java.util.concurrent.locks.AbstractQueuedSynchronizer.acquire(AbstractQueuedSynchronizer.java:1199)
>     at 
> java.util.concurrent.locks.ReentrantLock$NonfairSync.lock(ReentrantLock.java:209)
>     at 
> java.util.concurrent.locks.ReentrantLock.lock(ReentrantLock.java:285)
>     at 
> com.google.common.cache.LocalCache$Segment.lockedGetOrLoad(LocalCache.java:2249)
>     at 
> com.google.common.cache.LocalCache$Segment.get(LocalCache.java:2228)
>     at com.google.common.cache.LocalCache.get(LocalCache.java:3965)
>     at 
> com.google.common.cache.LocalCache$LocalManualCache.get(LocalCache.java:4764)
>     at 
> org.apache.hadoop.hdfs.server.federation.resolver.MountTableResolver.getDestinationForPath(MountTableResolver.java:380)
>     at 
> org.apache.hadoop.hdfs.server.federation.router.RouterRpcServer.getLocationsForPath(RouterRpcServer.java:2104)
>     at 
> org.apache.hadoop.hdfs.server.federation.router.RouterRpcServer.getLocationsForPath(RouterRpcServer.java:2087)
>     at 
> org.apache.hadoop.hdfs.server.federation.router.RouterRpcServer.getListing(RouterRpcServer.java:1050)
>     at 
> org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.getListing(ClientNamenodeProtocolServerSideTranslatorPB.java:640)
>     at 
> org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)
>     at 
> org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:616)
>     at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:982)
>     at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2115)
>     at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2111)
>     at java.security.AccessController.doPrivileged(Native Method)
>     at javax.security.auth.Subject.doAs(Subject.java:422)
> {quote}
> Many threads blocked on *LocalCache*
> After disable the cache, ProxyAvgTime is down as follow showed
>  !image-2018-08-13-11-27-49-023.png! 



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-13821) RBF: Add dfs.federation.router.mount-table.cache.enable so that users can disable cache

2018-08-18 Thread JIRA


[ 
https://issues.apache.org/jira/browse/HDFS-13821?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16584701#comment-16584701
 ] 

Íñigo Goiri commented on HDFS-13821:


[^HDFS-13821.003.patch] LGTM.

It might be a little overdone, but I think we should add a unit test where we 
disable the cache.
We can then spy the object and verify that the cache is null but we still 
resolve paths.

By the way, the default behavior before was using caching so I think we should 
keep the default as enabled.

> RBF: Add dfs.federation.router.mount-table.cache.enable so that users can 
> disable cache
> ---
>
> Key: HDFS-13821
> URL: https://issues.apache.org/jira/browse/HDFS-13821
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: hdfs
>Affects Versions: 3.1.0, 2.9.1, 3.0.3
>Reporter: Fei Hui
>Priority: Major
> Attachments: HDFS-13821.001.patch, HDFS-13821.002.patch, 
> HDFS-13821.003.patch, LocalCacheTest.java, image-2018-08-13-11-27-49-023.png
>
>
> When i test rbf, if found performance problem.
> I found that ProxyAvgTime From Ganglia is so high, i run jstack on Router and 
> get the following stack frames
> {quote}
>    java.lang.Thread.State: WAITING (parking)
>     at sun.misc.Unsafe.park(Native Method)
>     - parking to wait for  <0x0005c264acd8> (a 
> java.util.concurrent.locks.ReentrantLock$NonfairSync)
>     at java.util.concurrent.locks.LockSupport.park(LockSupport.java:175)
>     at 
> java.util.concurrent.locks.AbstractQueuedSynchronizer.parkAndCheckInterrupt(AbstractQueuedSynchronizer.java:836)
>     at 
> java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireQueued(AbstractQueuedSynchronizer.java:870)
>     at 
> java.util.concurrent.locks.AbstractQueuedSynchronizer.acquire(AbstractQueuedSynchronizer.java:1199)
>     at 
> java.util.concurrent.locks.ReentrantLock$NonfairSync.lock(ReentrantLock.java:209)
>     at 
> java.util.concurrent.locks.ReentrantLock.lock(ReentrantLock.java:285)
>     at 
> com.google.common.cache.LocalCache$Segment.lockedGetOrLoad(LocalCache.java:2249)
>     at 
> com.google.common.cache.LocalCache$Segment.get(LocalCache.java:2228)
>     at com.google.common.cache.LocalCache.get(LocalCache.java:3965)
>     at 
> com.google.common.cache.LocalCache$LocalManualCache.get(LocalCache.java:4764)
>     at 
> org.apache.hadoop.hdfs.server.federation.resolver.MountTableResolver.getDestinationForPath(MountTableResolver.java:380)
>     at 
> org.apache.hadoop.hdfs.server.federation.router.RouterRpcServer.getLocationsForPath(RouterRpcServer.java:2104)
>     at 
> org.apache.hadoop.hdfs.server.federation.router.RouterRpcServer.getLocationsForPath(RouterRpcServer.java:2087)
>     at 
> org.apache.hadoop.hdfs.server.federation.router.RouterRpcServer.getListing(RouterRpcServer.java:1050)
>     at 
> org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.getListing(ClientNamenodeProtocolServerSideTranslatorPB.java:640)
>     at 
> org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)
>     at 
> org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:616)
>     at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:982)
>     at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2115)
>     at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2111)
>     at java.security.AccessController.doPrivileged(Native Method)
>     at javax.security.auth.Subject.doAs(Subject.java:422)
> {quote}
> Many threads blocked on *LocalCache*
> After disable the cache, ProxyAvgTime is down as follow showed
>  !image-2018-08-13-11-27-49-023.png! 



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-13821) RBF: Add dfs.federation.router.mount-table.cache.enable so that users can disable cache

2018-08-17 Thread Fei Hui (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-13821?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16584143#comment-16584143
 ] 

Fei Hui commented on HDFS-13821:


Errors come from HDFS-13790

> RBF: Add dfs.federation.router.mount-table.cache.enable so that users can 
> disable cache
> ---
>
> Key: HDFS-13821
> URL: https://issues.apache.org/jira/browse/HDFS-13821
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: hdfs
>Affects Versions: 3.1.0, 2.9.1, 3.0.3
>Reporter: Fei Hui
>Priority: Major
> Attachments: HDFS-13821.001.patch, HDFS-13821.002.patch, 
> HDFS-13821.003.patch, LocalCacheTest.java, image-2018-08-13-11-27-49-023.png
>
>
> When i test rbf, if found performance problem.
> I found that ProxyAvgTime From Ganglia is so high, i run jstack on Router and 
> get the following stack frames
> {quote}
>    java.lang.Thread.State: WAITING (parking)
>     at sun.misc.Unsafe.park(Native Method)
>     - parking to wait for  <0x0005c264acd8> (a 
> java.util.concurrent.locks.ReentrantLock$NonfairSync)
>     at java.util.concurrent.locks.LockSupport.park(LockSupport.java:175)
>     at 
> java.util.concurrent.locks.AbstractQueuedSynchronizer.parkAndCheckInterrupt(AbstractQueuedSynchronizer.java:836)
>     at 
> java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireQueued(AbstractQueuedSynchronizer.java:870)
>     at 
> java.util.concurrent.locks.AbstractQueuedSynchronizer.acquire(AbstractQueuedSynchronizer.java:1199)
>     at 
> java.util.concurrent.locks.ReentrantLock$NonfairSync.lock(ReentrantLock.java:209)
>     at 
> java.util.concurrent.locks.ReentrantLock.lock(ReentrantLock.java:285)
>     at 
> com.google.common.cache.LocalCache$Segment.lockedGetOrLoad(LocalCache.java:2249)
>     at 
> com.google.common.cache.LocalCache$Segment.get(LocalCache.java:2228)
>     at com.google.common.cache.LocalCache.get(LocalCache.java:3965)
>     at 
> com.google.common.cache.LocalCache$LocalManualCache.get(LocalCache.java:4764)
>     at 
> org.apache.hadoop.hdfs.server.federation.resolver.MountTableResolver.getDestinationForPath(MountTableResolver.java:380)
>     at 
> org.apache.hadoop.hdfs.server.federation.router.RouterRpcServer.getLocationsForPath(RouterRpcServer.java:2104)
>     at 
> org.apache.hadoop.hdfs.server.federation.router.RouterRpcServer.getLocationsForPath(RouterRpcServer.java:2087)
>     at 
> org.apache.hadoop.hdfs.server.federation.router.RouterRpcServer.getListing(RouterRpcServer.java:1050)
>     at 
> org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.getListing(ClientNamenodeProtocolServerSideTranslatorPB.java:640)
>     at 
> org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)
>     at 
> org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:616)
>     at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:982)
>     at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2115)
>     at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2111)
>     at java.security.AccessController.doPrivileged(Native Method)
>     at javax.security.auth.Subject.doAs(Subject.java:422)
> {quote}
> Many threads blocked on *LocalCache*
> After disable the cache, ProxyAvgTime is down as follow showed
>  !image-2018-08-13-11-27-49-023.png! 



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-13821) RBF: Add dfs.federation.router.mount-table.cache.enable so that users can disable cache

2018-08-17 Thread genericqa (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-13821?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16583821#comment-16583821
 ] 

genericqa commented on HDFS-13821:
--

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
37s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 1 new or modified test 
files. {color} |
|| || || || {color:brown} trunk Compile Tests {color} ||
| {color:red}-1{color} | {color:red} mvninstall {color} | {color:red} 28m 
15s{color} | {color:red} root in trunk failed. {color} |
| {color:red}-1{color} | {color:red} compile {color} | {color:red}  0m 
23s{color} | {color:red} hadoop-hdfs-rbf in trunk failed. {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
19s{color} | {color:green} trunk passed {color} |
| {color:red}-1{color} | {color:red} mvnsite {color} | {color:red}  0m 
25s{color} | {color:red} hadoop-hdfs-rbf in trunk failed. {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
12m  4s{color} | {color:green} branch has no errors when building and testing 
our client artifacts. {color} |
| {color:red}-1{color} | {color:red} findbugs {color} | {color:red}  0m 
24s{color} | {color:red} hadoop-hdfs-rbf in trunk failed. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
31s{color} | {color:green} trunk passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:red}-1{color} | {color:red} mvninstall {color} | {color:red}  0m 
19s{color} | {color:red} hadoop-hdfs-rbf in the patch failed. {color} |
| {color:red}-1{color} | {color:red} compile {color} | {color:red}  0m 
19s{color} | {color:red} hadoop-hdfs-rbf in the patch failed. {color} |
| {color:red}-1{color} | {color:red} javac {color} | {color:red}  0m 19s{color} 
| {color:red} hadoop-hdfs-rbf in the patch failed. {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
14s{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} mvnsite {color} | {color:red}  0m 
20s{color} | {color:red} hadoop-hdfs-rbf in the patch failed. {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} xml {color} | {color:green}  0m  
2s{color} | {color:green} The patch has no ill-formed XML file. {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
12m 27s{color} | {color:green} patch has no errors when building and testing 
our client artifacts. {color} |
| {color:red}-1{color} | {color:red} findbugs {color} | {color:red}  0m 
21s{color} | {color:red} hadoop-hdfs-rbf in the patch failed. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
30s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:red}-1{color} | {color:red} unit {color} | {color:red}  0m 21s{color} 
| {color:red} hadoop-hdfs-rbf in the patch failed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
24s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 59m  3s{color} | 
{color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:ba1ab08 |
| JIRA Issue | HDFS-13821 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12936001/HDFS-13821.003.patch |
| Optional Tests |  asflicense  compile  javac  javadoc  mvninstall  mvnsite  
unit  shadedclient  findbugs  checkstyle  xml  |
| uname | Linux 2ae92eca58cb 3.13.0-153-generic #203-Ubuntu SMP Thu Jun 14 
08:52:28 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/patchprocess/precommit/personality/provided.sh |
| git revision | trunk / fa121eb |
| maven | version: Apache Maven 3.3.9 |
| Default Java | 1.8.0_181 |
| mvninstall | 
https://builds.apache.org/job/PreCommit-HDFS-Build/24799/artifact/out/branch-mvninstall-root.txt
 |
| compile | 
https://builds.apache.org/job/PreCommit-HDFS-Build/24799/artifact/out/branch-compile-hadoop-hdfs-project_hadoop-hdfs-rbf.txt
 |
| mvnsite | 
https://builds.apache.org/job/PreCommit-HDFS-Build/24799/artifact/out/branch-mvnsite-hadoop-hdfs-project_hadoop-hdfs-rbf.txt
 |
| 

[jira] [Commented] (HDFS-13821) RBF: Add dfs.federation.router.mount-table.cache.enable so that users can disable cache

2018-08-17 Thread Fei Hui (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-13821?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16583793#comment-16583793
 ] 

Fei Hui commented on HDFS-13821:


[~elgoiri] Thanks. Fix the code according to your suggestions and fix 
checkstyle.
Upload v003 patch. 

> RBF: Add dfs.federation.router.mount-table.cache.enable so that users can 
> disable cache
> ---
>
> Key: HDFS-13821
> URL: https://issues.apache.org/jira/browse/HDFS-13821
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: hdfs
>Affects Versions: 3.1.0, 2.9.1, 3.0.3
>Reporter: Fei Hui
>Priority: Major
> Attachments: HDFS-13821.001.patch, HDFS-13821.002.patch, 
> HDFS-13821.003.patch, LocalCacheTest.java, image-2018-08-13-11-27-49-023.png
>
>
> When i test rbf, if found performance problem.
> I found that ProxyAvgTime From Ganglia is so high, i run jstack on Router and 
> get the following stack frames
> {quote}
>    java.lang.Thread.State: WAITING (parking)
>     at sun.misc.Unsafe.park(Native Method)
>     - parking to wait for  <0x0005c264acd8> (a 
> java.util.concurrent.locks.ReentrantLock$NonfairSync)
>     at java.util.concurrent.locks.LockSupport.park(LockSupport.java:175)
>     at 
> java.util.concurrent.locks.AbstractQueuedSynchronizer.parkAndCheckInterrupt(AbstractQueuedSynchronizer.java:836)
>     at 
> java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireQueued(AbstractQueuedSynchronizer.java:870)
>     at 
> java.util.concurrent.locks.AbstractQueuedSynchronizer.acquire(AbstractQueuedSynchronizer.java:1199)
>     at 
> java.util.concurrent.locks.ReentrantLock$NonfairSync.lock(ReentrantLock.java:209)
>     at 
> java.util.concurrent.locks.ReentrantLock.lock(ReentrantLock.java:285)
>     at 
> com.google.common.cache.LocalCache$Segment.lockedGetOrLoad(LocalCache.java:2249)
>     at 
> com.google.common.cache.LocalCache$Segment.get(LocalCache.java:2228)
>     at com.google.common.cache.LocalCache.get(LocalCache.java:3965)
>     at 
> com.google.common.cache.LocalCache$LocalManualCache.get(LocalCache.java:4764)
>     at 
> org.apache.hadoop.hdfs.server.federation.resolver.MountTableResolver.getDestinationForPath(MountTableResolver.java:380)
>     at 
> org.apache.hadoop.hdfs.server.federation.router.RouterRpcServer.getLocationsForPath(RouterRpcServer.java:2104)
>     at 
> org.apache.hadoop.hdfs.server.federation.router.RouterRpcServer.getLocationsForPath(RouterRpcServer.java:2087)
>     at 
> org.apache.hadoop.hdfs.server.federation.router.RouterRpcServer.getListing(RouterRpcServer.java:1050)
>     at 
> org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.getListing(ClientNamenodeProtocolServerSideTranslatorPB.java:640)
>     at 
> org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)
>     at 
> org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:616)
>     at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:982)
>     at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2115)
>     at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2111)
>     at java.security.AccessController.doPrivileged(Native Method)
>     at javax.security.auth.Subject.doAs(Subject.java:422)
> {quote}
> Many threads blocked on *LocalCache*
> After disable the cache, ProxyAvgTime is down as follow showed
>  !image-2018-08-13-11-27-49-023.png! 



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-13821) RBF: Add dfs.federation.router.mount-table.cache.enable so that users can disable cache

2018-08-17 Thread genericqa (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-13821?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16583741#comment-16583741
 ] 

genericqa commented on HDFS-13821:
--

| (/) *{color:green}+1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
34s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 1 new or modified test 
files. {color} |
|| || || || {color:brown} trunk Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 29m 
15s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
25s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
18s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
28s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
10m 22s{color} | {color:green} branch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  0m 
50s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
30s{color} | {color:green} trunk passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  0m 
27s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
22s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  0m 
22s{color} | {color:green} the patch passed {color} |
| {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange}  
0m 12s{color} | {color:orange} hadoop-hdfs-project/hadoop-hdfs-rbf: The patch 
generated 2 new + 0 unchanged - 0 fixed = 2 total (was 0) {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
25s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} xml {color} | {color:green}  0m  
1s{color} | {color:green} The patch has no ill-formed XML file. {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
11m 10s{color} | {color:green} patch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  0m 
57s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
32s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 17m 
28s{color} | {color:green} hadoop-hdfs-rbf in the patch passed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
27s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 75m 30s{color} | 
{color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:ba1ab08 |
| JIRA Issue | HDFS-13821 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12935993/HDFS-13821.002.patch |
| Optional Tests |  asflicense  compile  javac  javadoc  mvninstall  mvnsite  
unit  shadedclient  findbugs  checkstyle  xml  |
| uname | Linux 264049b23ad7 4.4.0-130-generic #156-Ubuntu SMP Thu Jun 14 
08:53:28 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/patchprocess/precommit/personality/provided.sh |
| git revision | trunk / c67b065 |
| maven | version: Apache Maven 3.3.9 |
| Default Java | 1.8.0_181 |
| findbugs | v3.1.0-RC1 |
| checkstyle | 
https://builds.apache.org/job/PreCommit-HDFS-Build/24798/artifact/out/diff-checkstyle-hadoop-hdfs-project_hadoop-hdfs-rbf.txt
 |
|  Test Results | 
https://builds.apache.org/job/PreCommit-HDFS-Build/24798/testReport/ |
| Max. process+thread count | 1353 (vs. ulimit of 1) |
| modules | C: hadoop-hdfs-project/hadoop-hdfs-rbf U: 

[jira] [Commented] (HDFS-13821) RBF: Add dfs.federation.router.mount-table.cache.enable so that users can disable cache

2018-08-17 Thread JIRA


[ 
https://issues.apache.org/jira/browse/HDFS-13821?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16583684#comment-16583684
 ] 

Íñigo Goiri commented on HDFS-13821:


Thanks [~ferhui] for  [^HDFS-13821.002.patch].
I think we could just use the cache as null to mark that is disabled. In 
addition, we can keep the final for the  field. 

> RBF: Add dfs.federation.router.mount-table.cache.enable so that users can 
> disable cache
> ---
>
> Key: HDFS-13821
> URL: https://issues.apache.org/jira/browse/HDFS-13821
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: hdfs
>Affects Versions: 3.1.0, 2.9.1, 3.0.3
>Reporter: Fei Hui
>Priority: Major
> Attachments: HDFS-13821.001.patch, HDFS-13821.002.patch, 
> LocalCacheTest.java, image-2018-08-13-11-27-49-023.png
>
>
> When i test rbf, if found performance problem.
> I found that ProxyAvgTime From Ganglia is so high, i run jstack on Router and 
> get the following stack frames
> {quote}
>    java.lang.Thread.State: WAITING (parking)
>     at sun.misc.Unsafe.park(Native Method)
>     - parking to wait for  <0x0005c264acd8> (a 
> java.util.concurrent.locks.ReentrantLock$NonfairSync)
>     at java.util.concurrent.locks.LockSupport.park(LockSupport.java:175)
>     at 
> java.util.concurrent.locks.AbstractQueuedSynchronizer.parkAndCheckInterrupt(AbstractQueuedSynchronizer.java:836)
>     at 
> java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireQueued(AbstractQueuedSynchronizer.java:870)
>     at 
> java.util.concurrent.locks.AbstractQueuedSynchronizer.acquire(AbstractQueuedSynchronizer.java:1199)
>     at 
> java.util.concurrent.locks.ReentrantLock$NonfairSync.lock(ReentrantLock.java:209)
>     at 
> java.util.concurrent.locks.ReentrantLock.lock(ReentrantLock.java:285)
>     at 
> com.google.common.cache.LocalCache$Segment.lockedGetOrLoad(LocalCache.java:2249)
>     at 
> com.google.common.cache.LocalCache$Segment.get(LocalCache.java:2228)
>     at com.google.common.cache.LocalCache.get(LocalCache.java:3965)
>     at 
> com.google.common.cache.LocalCache$LocalManualCache.get(LocalCache.java:4764)
>     at 
> org.apache.hadoop.hdfs.server.federation.resolver.MountTableResolver.getDestinationForPath(MountTableResolver.java:380)
>     at 
> org.apache.hadoop.hdfs.server.federation.router.RouterRpcServer.getLocationsForPath(RouterRpcServer.java:2104)
>     at 
> org.apache.hadoop.hdfs.server.federation.router.RouterRpcServer.getLocationsForPath(RouterRpcServer.java:2087)
>     at 
> org.apache.hadoop.hdfs.server.federation.router.RouterRpcServer.getListing(RouterRpcServer.java:1050)
>     at 
> org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.getListing(ClientNamenodeProtocolServerSideTranslatorPB.java:640)
>     at 
> org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)
>     at 
> org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:616)
>     at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:982)
>     at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2115)
>     at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2111)
>     at java.security.AccessController.doPrivileged(Native Method)
>     at javax.security.auth.Subject.doAs(Subject.java:422)
> {quote}
> Many threads blocked on *LocalCache*
> After disable the cache, ProxyAvgTime is down as follow showed
>  !image-2018-08-13-11-27-49-023.png! 



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-13821) RBF: Add dfs.federation.router.mount-table.cache.enable so that users can disable cache

2018-08-17 Thread Fei Hui (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-13821?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16583567#comment-16583567
 ] 

Fei Hui commented on HDFS-13821:


[~elgoiri] [~linyiqun] Thanks for so meaningful discussions !
Upload v002 patch for fixing unit test testCompareConfigurationClassAgainstXml

> RBF: Add dfs.federation.router.mount-table.cache.enable so that users can 
> disable cache
> ---
>
> Key: HDFS-13821
> URL: https://issues.apache.org/jira/browse/HDFS-13821
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: hdfs
>Affects Versions: 3.1.0, 2.9.1, 3.0.3
>Reporter: Fei Hui
>Priority: Major
> Attachments: HDFS-13821.001.patch, LocalCacheTest.java, 
> image-2018-08-13-11-27-49-023.png
>
>
> When i test rbf, if found performance problem.
> I found that ProxyAvgTime From Ganglia is so high, i run jstack on Router and 
> get the following stack frames
> {quote}
>    java.lang.Thread.State: WAITING (parking)
>     at sun.misc.Unsafe.park(Native Method)
>     - parking to wait for  <0x0005c264acd8> (a 
> java.util.concurrent.locks.ReentrantLock$NonfairSync)
>     at java.util.concurrent.locks.LockSupport.park(LockSupport.java:175)
>     at 
> java.util.concurrent.locks.AbstractQueuedSynchronizer.parkAndCheckInterrupt(AbstractQueuedSynchronizer.java:836)
>     at 
> java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireQueued(AbstractQueuedSynchronizer.java:870)
>     at 
> java.util.concurrent.locks.AbstractQueuedSynchronizer.acquire(AbstractQueuedSynchronizer.java:1199)
>     at 
> java.util.concurrent.locks.ReentrantLock$NonfairSync.lock(ReentrantLock.java:209)
>     at 
> java.util.concurrent.locks.ReentrantLock.lock(ReentrantLock.java:285)
>     at 
> com.google.common.cache.LocalCache$Segment.lockedGetOrLoad(LocalCache.java:2249)
>     at 
> com.google.common.cache.LocalCache$Segment.get(LocalCache.java:2228)
>     at com.google.common.cache.LocalCache.get(LocalCache.java:3965)
>     at 
> com.google.common.cache.LocalCache$LocalManualCache.get(LocalCache.java:4764)
>     at 
> org.apache.hadoop.hdfs.server.federation.resolver.MountTableResolver.getDestinationForPath(MountTableResolver.java:380)
>     at 
> org.apache.hadoop.hdfs.server.federation.router.RouterRpcServer.getLocationsForPath(RouterRpcServer.java:2104)
>     at 
> org.apache.hadoop.hdfs.server.federation.router.RouterRpcServer.getLocationsForPath(RouterRpcServer.java:2087)
>     at 
> org.apache.hadoop.hdfs.server.federation.router.RouterRpcServer.getListing(RouterRpcServer.java:1050)
>     at 
> org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.getListing(ClientNamenodeProtocolServerSideTranslatorPB.java:640)
>     at 
> org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)
>     at 
> org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:616)
>     at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:982)
>     at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2115)
>     at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2111)
>     at java.security.AccessController.doPrivileged(Native Method)
>     at javax.security.auth.Subject.doAs(Subject.java:422)
> {quote}
> Many threads blocked on *LocalCache*
> After disable the cache, ProxyAvgTime is down as follow showed
>  !image-2018-08-13-11-27-49-023.png! 



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-13821) RBF: Add dfs.federation.router.mount-table.cache.enable so that users can disable cache

2018-08-17 Thread JIRA


[ 
https://issues.apache.org/jira/browse/HDFS-13821?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16583503#comment-16583503
 ] 

Íñigo Goiri commented on HDFS-13821:


[~ferhui], thanks for the results.
They look pretty bad... we should replace this cache for something more 
reliable.
I would prefer to solve the issue, but for now this is fine.

> RBF: Add dfs.federation.router.mount-table.cache.enable so that users can 
> disable cache
> ---
>
> Key: HDFS-13821
> URL: https://issues.apache.org/jira/browse/HDFS-13821
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: hdfs
>Affects Versions: 3.1.0, 2.9.1, 3.0.3
>Reporter: Fei Hui
>Priority: Major
> Attachments: HDFS-13821.001.patch, LocalCacheTest.java, 
> image-2018-08-13-11-27-49-023.png
>
>
> When i test rbf, if found performance problem.
> I found that ProxyAvgTime From Ganglia is so high, i run jstack on Router and 
> get the following stack frames
> {quote}
>    java.lang.Thread.State: WAITING (parking)
>     at sun.misc.Unsafe.park(Native Method)
>     - parking to wait for  <0x0005c264acd8> (a 
> java.util.concurrent.locks.ReentrantLock$NonfairSync)
>     at java.util.concurrent.locks.LockSupport.park(LockSupport.java:175)
>     at 
> java.util.concurrent.locks.AbstractQueuedSynchronizer.parkAndCheckInterrupt(AbstractQueuedSynchronizer.java:836)
>     at 
> java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireQueued(AbstractQueuedSynchronizer.java:870)
>     at 
> java.util.concurrent.locks.AbstractQueuedSynchronizer.acquire(AbstractQueuedSynchronizer.java:1199)
>     at 
> java.util.concurrent.locks.ReentrantLock$NonfairSync.lock(ReentrantLock.java:209)
>     at 
> java.util.concurrent.locks.ReentrantLock.lock(ReentrantLock.java:285)
>     at 
> com.google.common.cache.LocalCache$Segment.lockedGetOrLoad(LocalCache.java:2249)
>     at 
> com.google.common.cache.LocalCache$Segment.get(LocalCache.java:2228)
>     at com.google.common.cache.LocalCache.get(LocalCache.java:3965)
>     at 
> com.google.common.cache.LocalCache$LocalManualCache.get(LocalCache.java:4764)
>     at 
> org.apache.hadoop.hdfs.server.federation.resolver.MountTableResolver.getDestinationForPath(MountTableResolver.java:380)
>     at 
> org.apache.hadoop.hdfs.server.federation.router.RouterRpcServer.getLocationsForPath(RouterRpcServer.java:2104)
>     at 
> org.apache.hadoop.hdfs.server.federation.router.RouterRpcServer.getLocationsForPath(RouterRpcServer.java:2087)
>     at 
> org.apache.hadoop.hdfs.server.federation.router.RouterRpcServer.getListing(RouterRpcServer.java:1050)
>     at 
> org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.getListing(ClientNamenodeProtocolServerSideTranslatorPB.java:640)
>     at 
> org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)
>     at 
> org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:616)
>     at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:982)
>     at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2115)
>     at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2111)
>     at java.security.AccessController.doPrivileged(Native Method)
>     at javax.security.auth.Subject.doAs(Subject.java:422)
> {quote}
> Many threads blocked on *LocalCache*
> After disable the cache, ProxyAvgTime is down as follow showed
>  !image-2018-08-13-11-27-49-023.png! 



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-13821) RBF: Add dfs.federation.router.mount-table.cache.enable so that users can disable cache

2018-08-17 Thread Fei Hui (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-13821?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16583469#comment-16583469
 ] 

Fei Hui commented on HDFS-13821:


[~elgoiri] Here's my testing result about localcache

{code:java}
hadoop jar test-1.0-SNAPSHOT.jar LocalCacheTest 1 8 1000 
elapse:0.166 ms per op each thread
hadoop jar test-1.0-SNAPSHOT.jar LocalCacheTest 1 16 1000
elapse:0.358 ms per op each thread
hadoop jar test-1.0-SNAPSHOT.jar LocalCacheTest 1 32 1000
elapse:0.775 ms per op each thread
hadoop jar test-1.0-SNAPSHOT.jar LocalCacheTest 1 64 1000
elapse:1.571 ms per op each thread
hadoop jar test-1.0-SNAPSHOT.jar LocalCacheTest 1 128 1000
elapse:2.688 ms per op each thread
hadoop jar test-1.0-SNAPSHOT.jar LocalCacheTest 1 256 1000
elapse:6.212 ms per op each thread
hadoop jar test-1.0-SNAPSHOT.jar LocalCacheTest 1 512 1000
elapse:12.359 ms per op each thread
{code}


> RBF: Add dfs.federation.router.mount-table.cache.enable so that users can 
> disable cache
> ---
>
> Key: HDFS-13821
> URL: https://issues.apache.org/jira/browse/HDFS-13821
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: hdfs
>Affects Versions: 3.1.0, 2.9.1, 3.0.3
>Reporter: Fei Hui
>Priority: Major
> Attachments: HDFS-13821.001.patch, LocalCacheTest.java, 
> image-2018-08-13-11-27-49-023.png
>
>
> When i test rbf, if found performance problem.
> I found that ProxyAvgTime From Ganglia is so high, i run jstack on Router and 
> get the following stack frames
> {quote}
>    java.lang.Thread.State: WAITING (parking)
>     at sun.misc.Unsafe.park(Native Method)
>     - parking to wait for  <0x0005c264acd8> (a 
> java.util.concurrent.locks.ReentrantLock$NonfairSync)
>     at java.util.concurrent.locks.LockSupport.park(LockSupport.java:175)
>     at 
> java.util.concurrent.locks.AbstractQueuedSynchronizer.parkAndCheckInterrupt(AbstractQueuedSynchronizer.java:836)
>     at 
> java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireQueued(AbstractQueuedSynchronizer.java:870)
>     at 
> java.util.concurrent.locks.AbstractQueuedSynchronizer.acquire(AbstractQueuedSynchronizer.java:1199)
>     at 
> java.util.concurrent.locks.ReentrantLock$NonfairSync.lock(ReentrantLock.java:209)
>     at 
> java.util.concurrent.locks.ReentrantLock.lock(ReentrantLock.java:285)
>     at 
> com.google.common.cache.LocalCache$Segment.lockedGetOrLoad(LocalCache.java:2249)
>     at 
> com.google.common.cache.LocalCache$Segment.get(LocalCache.java:2228)
>     at com.google.common.cache.LocalCache.get(LocalCache.java:3965)
>     at 
> com.google.common.cache.LocalCache$LocalManualCache.get(LocalCache.java:4764)
>     at 
> org.apache.hadoop.hdfs.server.federation.resolver.MountTableResolver.getDestinationForPath(MountTableResolver.java:380)
>     at 
> org.apache.hadoop.hdfs.server.federation.router.RouterRpcServer.getLocationsForPath(RouterRpcServer.java:2104)
>     at 
> org.apache.hadoop.hdfs.server.federation.router.RouterRpcServer.getLocationsForPath(RouterRpcServer.java:2087)
>     at 
> org.apache.hadoop.hdfs.server.federation.router.RouterRpcServer.getListing(RouterRpcServer.java:1050)
>     at 
> org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.getListing(ClientNamenodeProtocolServerSideTranslatorPB.java:640)
>     at 
> org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)
>     at 
> org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:616)
>     at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:982)
>     at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2115)
>     at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2111)
>     at java.security.AccessController.doPrivileged(Native Method)
>     at javax.security.auth.Subject.doAs(Subject.java:422)
> {quote}
> Many threads blocked on *LocalCache*
> After disable the cache, ProxyAvgTime is down as follow showed
>  !image-2018-08-13-11-27-49-023.png! 



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-13821) RBF: Add dfs.federation.router.mount-table.cache.enable so that users can disable cache

2018-08-17 Thread JIRA


[ 
https://issues.apache.org/jira/browse/HDFS-13821?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16583462#comment-16583462
 ] 

Íñigo Goiri commented on HDFS-13821:


Can we give it a try setting the concurrency level? [~ferhui] any chance you 
can benchmark it with a level of 32 or so? 

> RBF: Add dfs.federation.router.mount-table.cache.enable so that users can 
> disable cache
> ---
>
> Key: HDFS-13821
> URL: https://issues.apache.org/jira/browse/HDFS-13821
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: hdfs
>Affects Versions: 3.1.0, 2.9.1, 3.0.3
>Reporter: Fei Hui
>Priority: Major
> Attachments: HDFS-13821.001.patch, LocalCacheTest.java, 
> image-2018-08-13-11-27-49-023.png
>
>
> When i test rbf, if found performance problem.
> I found that ProxyAvgTime From Ganglia is so high, i run jstack on Router and 
> get the following stack frames
> {quote}
>    java.lang.Thread.State: WAITING (parking)
>     at sun.misc.Unsafe.park(Native Method)
>     - parking to wait for  <0x0005c264acd8> (a 
> java.util.concurrent.locks.ReentrantLock$NonfairSync)
>     at java.util.concurrent.locks.LockSupport.park(LockSupport.java:175)
>     at 
> java.util.concurrent.locks.AbstractQueuedSynchronizer.parkAndCheckInterrupt(AbstractQueuedSynchronizer.java:836)
>     at 
> java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireQueued(AbstractQueuedSynchronizer.java:870)
>     at 
> java.util.concurrent.locks.AbstractQueuedSynchronizer.acquire(AbstractQueuedSynchronizer.java:1199)
>     at 
> java.util.concurrent.locks.ReentrantLock$NonfairSync.lock(ReentrantLock.java:209)
>     at 
> java.util.concurrent.locks.ReentrantLock.lock(ReentrantLock.java:285)
>     at 
> com.google.common.cache.LocalCache$Segment.lockedGetOrLoad(LocalCache.java:2249)
>     at 
> com.google.common.cache.LocalCache$Segment.get(LocalCache.java:2228)
>     at com.google.common.cache.LocalCache.get(LocalCache.java:3965)
>     at 
> com.google.common.cache.LocalCache$LocalManualCache.get(LocalCache.java:4764)
>     at 
> org.apache.hadoop.hdfs.server.federation.resolver.MountTableResolver.getDestinationForPath(MountTableResolver.java:380)
>     at 
> org.apache.hadoop.hdfs.server.federation.router.RouterRpcServer.getLocationsForPath(RouterRpcServer.java:2104)
>     at 
> org.apache.hadoop.hdfs.server.federation.router.RouterRpcServer.getLocationsForPath(RouterRpcServer.java:2087)
>     at 
> org.apache.hadoop.hdfs.server.federation.router.RouterRpcServer.getListing(RouterRpcServer.java:1050)
>     at 
> org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.getListing(ClientNamenodeProtocolServerSideTranslatorPB.java:640)
>     at 
> org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)
>     at 
> org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:616)
>     at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:982)
>     at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2115)
>     at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2111)
>     at java.security.AccessController.doPrivileged(Native Method)
>     at javax.security.auth.Subject.doAs(Subject.java:422)
> {quote}
> Many threads blocked on *LocalCache*
> After disable the cache, ProxyAvgTime is down as follow showed
>  !image-2018-08-13-11-27-49-023.png! 



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-13821) RBF: Add dfs.federation.router.mount-table.cache.enable so that users can disable cache

2018-08-16 Thread Yiqun Lin (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-13821?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16583288#comment-16583288
 ] 

Yiqun Lin commented on HDFS-13821:
--

I prefer to disable the cache as a temporary approach if we don't have a good 
way to improve this right now.

> RBF: Add dfs.federation.router.mount-table.cache.enable so that users can 
> disable cache
> ---
>
> Key: HDFS-13821
> URL: https://issues.apache.org/jira/browse/HDFS-13821
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: hdfs
>Affects Versions: 3.1.0, 2.9.1, 3.0.3
>Reporter: Fei Hui
>Priority: Major
> Attachments: HDFS-13821.001.patch, LocalCacheTest.java, 
> image-2018-08-13-11-27-49-023.png
>
>
> When i test rbf, if found performance problem.
> I found that ProxyAvgTime From Ganglia is so high, i run jstack on Router and 
> get the following stack frames
> {quote}
>    java.lang.Thread.State: WAITING (parking)
>     at sun.misc.Unsafe.park(Native Method)
>     - parking to wait for  <0x0005c264acd8> (a 
> java.util.concurrent.locks.ReentrantLock$NonfairSync)
>     at java.util.concurrent.locks.LockSupport.park(LockSupport.java:175)
>     at 
> java.util.concurrent.locks.AbstractQueuedSynchronizer.parkAndCheckInterrupt(AbstractQueuedSynchronizer.java:836)
>     at 
> java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireQueued(AbstractQueuedSynchronizer.java:870)
>     at 
> java.util.concurrent.locks.AbstractQueuedSynchronizer.acquire(AbstractQueuedSynchronizer.java:1199)
>     at 
> java.util.concurrent.locks.ReentrantLock$NonfairSync.lock(ReentrantLock.java:209)
>     at 
> java.util.concurrent.locks.ReentrantLock.lock(ReentrantLock.java:285)
>     at 
> com.google.common.cache.LocalCache$Segment.lockedGetOrLoad(LocalCache.java:2249)
>     at 
> com.google.common.cache.LocalCache$Segment.get(LocalCache.java:2228)
>     at com.google.common.cache.LocalCache.get(LocalCache.java:3965)
>     at 
> com.google.common.cache.LocalCache$LocalManualCache.get(LocalCache.java:4764)
>     at 
> org.apache.hadoop.hdfs.server.federation.resolver.MountTableResolver.getDestinationForPath(MountTableResolver.java:380)
>     at 
> org.apache.hadoop.hdfs.server.federation.router.RouterRpcServer.getLocationsForPath(RouterRpcServer.java:2104)
>     at 
> org.apache.hadoop.hdfs.server.federation.router.RouterRpcServer.getLocationsForPath(RouterRpcServer.java:2087)
>     at 
> org.apache.hadoop.hdfs.server.federation.router.RouterRpcServer.getListing(RouterRpcServer.java:1050)
>     at 
> org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.getListing(ClientNamenodeProtocolServerSideTranslatorPB.java:640)
>     at 
> org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)
>     at 
> org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:616)
>     at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:982)
>     at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2115)
>     at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2111)
>     at java.security.AccessController.doPrivileged(Native Method)
>     at javax.security.auth.Subject.doAs(Subject.java:422)
> {quote}
> Many threads blocked on *LocalCache*
> After disable the cache, ProxyAvgTime is down as follow showed
>  !image-2018-08-13-11-27-49-023.png! 



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-13821) RBF: Add dfs.federation.router.mount-table.cache.enable so that users can disable cache

2018-08-16 Thread JIRA


[ 
https://issues.apache.org/jira/browse/HDFS-13821?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16582848#comment-16582848
 ] 

Íñigo Goiri commented on HDFS-13821:


Let's replace the local cache then. Any proposals? 

> RBF: Add dfs.federation.router.mount-table.cache.enable so that users can 
> disable cache
> ---
>
> Key: HDFS-13821
> URL: https://issues.apache.org/jira/browse/HDFS-13821
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: hdfs
>Affects Versions: 3.1.0, 2.9.1, 3.0.3
>Reporter: Fei Hui
>Priority: Major
> Attachments: HDFS-13821.001.patch, LocalCacheTest.java, 
> image-2018-08-13-11-27-49-023.png
>
>
> When i test rbf, if found performance problem.
> I found that ProxyAvgTime From Ganglia is so high, i run jstack on Router and 
> get the following stack frames
> {quote}
>    java.lang.Thread.State: WAITING (parking)
>     at sun.misc.Unsafe.park(Native Method)
>     - parking to wait for  <0x0005c264acd8> (a 
> java.util.concurrent.locks.ReentrantLock$NonfairSync)
>     at java.util.concurrent.locks.LockSupport.park(LockSupport.java:175)
>     at 
> java.util.concurrent.locks.AbstractQueuedSynchronizer.parkAndCheckInterrupt(AbstractQueuedSynchronizer.java:836)
>     at 
> java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireQueued(AbstractQueuedSynchronizer.java:870)
>     at 
> java.util.concurrent.locks.AbstractQueuedSynchronizer.acquire(AbstractQueuedSynchronizer.java:1199)
>     at 
> java.util.concurrent.locks.ReentrantLock$NonfairSync.lock(ReentrantLock.java:209)
>     at 
> java.util.concurrent.locks.ReentrantLock.lock(ReentrantLock.java:285)
>     at 
> com.google.common.cache.LocalCache$Segment.lockedGetOrLoad(LocalCache.java:2249)
>     at 
> com.google.common.cache.LocalCache$Segment.get(LocalCache.java:2228)
>     at com.google.common.cache.LocalCache.get(LocalCache.java:3965)
>     at 
> com.google.common.cache.LocalCache$LocalManualCache.get(LocalCache.java:4764)
>     at 
> org.apache.hadoop.hdfs.server.federation.resolver.MountTableResolver.getDestinationForPath(MountTableResolver.java:380)
>     at 
> org.apache.hadoop.hdfs.server.federation.router.RouterRpcServer.getLocationsForPath(RouterRpcServer.java:2104)
>     at 
> org.apache.hadoop.hdfs.server.federation.router.RouterRpcServer.getLocationsForPath(RouterRpcServer.java:2087)
>     at 
> org.apache.hadoop.hdfs.server.federation.router.RouterRpcServer.getListing(RouterRpcServer.java:1050)
>     at 
> org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.getListing(ClientNamenodeProtocolServerSideTranslatorPB.java:640)
>     at 
> org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)
>     at 
> org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:616)
>     at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:982)
>     at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2115)
>     at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2111)
>     at java.security.AccessController.doPrivileged(Native Method)
>     at javax.security.auth.Subject.doAs(Subject.java:422)
> {quote}
> Many threads blocked on *LocalCache*
> After disable the cache, ProxyAvgTime is down as follow showed
>  !image-2018-08-13-11-27-49-023.png! 



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-13821) RBF: Add dfs.federation.router.mount-table.cache.enable so that users can disable cache

2018-08-15 Thread Yiqun Lin (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-13821?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16580749#comment-16580749
 ] 

Yiqun Lin commented on HDFS-13821:
--

Thanks [~ferhui] for providing the test results!
As [~ferhui] pointed that the bottleneck seems in the localcache. I look into 
the localcache instance, it uses reentrant lock (not read/write lock) for the 
thread-safe operation. So here the problem is that when multiple read/write 
operations are doing  for the cache, the cache maybe looks bad.

{quote}
{quote}

> RBF: Add dfs.federation.router.mount-table.cache.enable so that users can 
> disable cache
> ---
>
> Key: HDFS-13821
> URL: https://issues.apache.org/jira/browse/HDFS-13821
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: hdfs
>Affects Versions: 3.1.0, 2.9.1, 3.0.3
>Reporter: Fei Hui
>Priority: Major
> Attachments: HDFS-13821.001.patch, LocalCacheTest.java, 
> image-2018-08-13-11-27-49-023.png
>
>
> When i test rbf, if found performance problem.
> I found that ProxyAvgTime From Ganglia is so high, i run jstack on Router and 
> get the following stack frames
> {quote}
>    java.lang.Thread.State: WAITING (parking)
>     at sun.misc.Unsafe.park(Native Method)
>     - parking to wait for  <0x0005c264acd8> (a 
> java.util.concurrent.locks.ReentrantLock$NonfairSync)
>     at java.util.concurrent.locks.LockSupport.park(LockSupport.java:175)
>     at 
> java.util.concurrent.locks.AbstractQueuedSynchronizer.parkAndCheckInterrupt(AbstractQueuedSynchronizer.java:836)
>     at 
> java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireQueued(AbstractQueuedSynchronizer.java:870)
>     at 
> java.util.concurrent.locks.AbstractQueuedSynchronizer.acquire(AbstractQueuedSynchronizer.java:1199)
>     at 
> java.util.concurrent.locks.ReentrantLock$NonfairSync.lock(ReentrantLock.java:209)
>     at 
> java.util.concurrent.locks.ReentrantLock.lock(ReentrantLock.java:285)
>     at 
> com.google.common.cache.LocalCache$Segment.lockedGetOrLoad(LocalCache.java:2249)
>     at 
> com.google.common.cache.LocalCache$Segment.get(LocalCache.java:2228)
>     at com.google.common.cache.LocalCache.get(LocalCache.java:3965)
>     at 
> com.google.common.cache.LocalCache$LocalManualCache.get(LocalCache.java:4764)
>     at 
> org.apache.hadoop.hdfs.server.federation.resolver.MountTableResolver.getDestinationForPath(MountTableResolver.java:380)
>     at 
> org.apache.hadoop.hdfs.server.federation.router.RouterRpcServer.getLocationsForPath(RouterRpcServer.java:2104)
>     at 
> org.apache.hadoop.hdfs.server.federation.router.RouterRpcServer.getLocationsForPath(RouterRpcServer.java:2087)
>     at 
> org.apache.hadoop.hdfs.server.federation.router.RouterRpcServer.getListing(RouterRpcServer.java:1050)
>     at 
> org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.getListing(ClientNamenodeProtocolServerSideTranslatorPB.java:640)
>     at 
> org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)
>     at 
> org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:616)
>     at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:982)
>     at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2115)
>     at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2111)
>     at java.security.AccessController.doPrivileged(Native Method)
>     at javax.security.auth.Subject.doAs(Subject.java:422)
> {quote}
> Many threads blocked on *LocalCache*
> After disable the cache, ProxyAvgTime is down as follow showed
>  !image-2018-08-13-11-27-49-023.png! 



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-13821) RBF: Add dfs.federation.router.mount-table.cache.enable so that users can disable cache

2018-08-14 Thread Fei Hui (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-13821?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16579955#comment-16579955
 ] 

Fei Hui commented on HDFS-13821:


[~elgoiri] Thanks for your reply.
{quote}
The hit ratio of the cache over time
{quote}
The hit ratio is nearyly zero in my test case
{quote}
The stats on the read/write lock
{quote}
Maybe it's unrelated to the read/write lock, because my approach code is in 
read lock block as follow
{code:java}
  @Override
  public PathLocation getDestinationForPath(final String path)
  throws IOException {
verifyMountTable();
readLock.lock();
try {
  if (!mountTableCacheEnable) {
return lookupLocation(path);
  }
  Callable meh = new Callable() {
@Override
public PathLocation call() throws Exception {
  return lookupLocation(path);
}
  };
  return this.locationCache.get(path, meh);
} catch (ExecutionException e) {
  throw new IOException(e);
} finally {
  readLock.unlock();
}
  }
{code}
{quote}
The time for a hit or a miss
{quote}
I do not stat it, but i am guessing localcache is the bottleneck in my test 
case and i do a test. 
The test code is uploaded *LocalCacheTest.java*. i run 'hadoop jar 
test-1.0-SNAPSHOT.jar LocalCacheTest 1 1024 1000' which means cachesize is 
1, threads number is 1024 and 1000 get ops each thread. The result : 
elapse:24.555 ms per op each thread.
There is a lock in localcache. I think In my test case 1024 concurrent threads 
computing is better than one thread holds the write lock and other threads 
blocked.


> RBF: Add dfs.federation.router.mount-table.cache.enable so that users can 
> disable cache
> ---
>
> Key: HDFS-13821
> URL: https://issues.apache.org/jira/browse/HDFS-13821
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: hdfs
>Affects Versions: 3.1.0, 2.9.1, 3.0.3
>Reporter: Fei Hui
>Priority: Major
> Attachments: HDFS-13821.001.patch, image-2018-08-13-11-27-49-023.png
>
>
> When i test rbf, if found performance problem.
> I found that ProxyAvgTime From Ganglia is so high, i run jstack on Router and 
> get the following stack frames
> {quote}
>    java.lang.Thread.State: WAITING (parking)
>     at sun.misc.Unsafe.park(Native Method)
>     - parking to wait for  <0x0005c264acd8> (a 
> java.util.concurrent.locks.ReentrantLock$NonfairSync)
>     at java.util.concurrent.locks.LockSupport.park(LockSupport.java:175)
>     at 
> java.util.concurrent.locks.AbstractQueuedSynchronizer.parkAndCheckInterrupt(AbstractQueuedSynchronizer.java:836)
>     at 
> java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireQueued(AbstractQueuedSynchronizer.java:870)
>     at 
> java.util.concurrent.locks.AbstractQueuedSynchronizer.acquire(AbstractQueuedSynchronizer.java:1199)
>     at 
> java.util.concurrent.locks.ReentrantLock$NonfairSync.lock(ReentrantLock.java:209)
>     at 
> java.util.concurrent.locks.ReentrantLock.lock(ReentrantLock.java:285)
>     at 
> com.google.common.cache.LocalCache$Segment.lockedGetOrLoad(LocalCache.java:2249)
>     at 
> com.google.common.cache.LocalCache$Segment.get(LocalCache.java:2228)
>     at com.google.common.cache.LocalCache.get(LocalCache.java:3965)
>     at 
> com.google.common.cache.LocalCache$LocalManualCache.get(LocalCache.java:4764)
>     at 
> org.apache.hadoop.hdfs.server.federation.resolver.MountTableResolver.getDestinationForPath(MountTableResolver.java:380)
>     at 
> org.apache.hadoop.hdfs.server.federation.router.RouterRpcServer.getLocationsForPath(RouterRpcServer.java:2104)
>     at 
> org.apache.hadoop.hdfs.server.federation.router.RouterRpcServer.getLocationsForPath(RouterRpcServer.java:2087)
>     at 
> org.apache.hadoop.hdfs.server.federation.router.RouterRpcServer.getListing(RouterRpcServer.java:1050)
>     at 
> org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.getListing(ClientNamenodeProtocolServerSideTranslatorPB.java:640)
>     at 
> org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)
>     at 
> org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:616)
>     at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:982)
>     at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2115)
>     at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2111)
>     at java.security.AccessController.doPrivileged(Native Method)
>     at javax.security.auth.Subject.doAs(Subject.java:422)
> {quote}
> Many threads blocked on *LocalCache*
> After disable the cache, ProxyAvgTime is 

[jira] [Commented] (HDFS-13821) RBF: Add dfs.federation.router.mount-table.cache.enable so that users can disable cache

2018-08-13 Thread JIRA


[ 
https://issues.apache.org/jira/browse/HDFS-13821?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16578297#comment-16578297
 ] 

Íñigo Goiri commented on HDFS-13821:


bq. Profiling LocalCache is interesing, should we adress a new JIRA?
I would try to improve the caching approach in this JIRA.
You can always use your first approach in  [^HDFS-13821.001.patch] as a 
temporary mitigation.

[~ferhui], can you profile of where the time goes?
The main point is to check where is the time going: the cache is slow or is the 
read/write lock?
Can you provide stats on?
* The hit ratio of the cache over time.
* The stats on the read/write lock.
* The time for a hit or a miss.

> RBF: Add dfs.federation.router.mount-table.cache.enable so that users can 
> disable cache
> ---
>
> Key: HDFS-13821
> URL: https://issues.apache.org/jira/browse/HDFS-13821
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: hdfs
>Affects Versions: 3.1.0, 2.9.1, 3.0.3
>Reporter: Fei Hui
>Priority: Major
> Attachments: HDFS-13821.001.patch, image-2018-08-13-11-27-49-023.png
>
>
> When i test rbf, if found performance problem.
> I found that ProxyAvgTime From Ganglia is so high, i run jstack on Router and 
> get the following stack frames
> {quote}
>    java.lang.Thread.State: WAITING (parking)
>     at sun.misc.Unsafe.park(Native Method)
>     - parking to wait for  <0x0005c264acd8> (a 
> java.util.concurrent.locks.ReentrantLock$NonfairSync)
>     at java.util.concurrent.locks.LockSupport.park(LockSupport.java:175)
>     at 
> java.util.concurrent.locks.AbstractQueuedSynchronizer.parkAndCheckInterrupt(AbstractQueuedSynchronizer.java:836)
>     at 
> java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireQueued(AbstractQueuedSynchronizer.java:870)
>     at 
> java.util.concurrent.locks.AbstractQueuedSynchronizer.acquire(AbstractQueuedSynchronizer.java:1199)
>     at 
> java.util.concurrent.locks.ReentrantLock$NonfairSync.lock(ReentrantLock.java:209)
>     at 
> java.util.concurrent.locks.ReentrantLock.lock(ReentrantLock.java:285)
>     at 
> com.google.common.cache.LocalCache$Segment.lockedGetOrLoad(LocalCache.java:2249)
>     at 
> com.google.common.cache.LocalCache$Segment.get(LocalCache.java:2228)
>     at com.google.common.cache.LocalCache.get(LocalCache.java:3965)
>     at 
> com.google.common.cache.LocalCache$LocalManualCache.get(LocalCache.java:4764)
>     at 
> org.apache.hadoop.hdfs.server.federation.resolver.MountTableResolver.getDestinationForPath(MountTableResolver.java:380)
>     at 
> org.apache.hadoop.hdfs.server.federation.router.RouterRpcServer.getLocationsForPath(RouterRpcServer.java:2104)
>     at 
> org.apache.hadoop.hdfs.server.federation.router.RouterRpcServer.getLocationsForPath(RouterRpcServer.java:2087)
>     at 
> org.apache.hadoop.hdfs.server.federation.router.RouterRpcServer.getListing(RouterRpcServer.java:1050)
>     at 
> org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.getListing(ClientNamenodeProtocolServerSideTranslatorPB.java:640)
>     at 
> org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)
>     at 
> org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:616)
>     at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:982)
>     at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2115)
>     at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2111)
>     at java.security.AccessController.doPrivileged(Native Method)
>     at javax.security.auth.Subject.doAs(Subject.java:422)
> {quote}
> Many threads blocked on *LocalCache*
> After disable the cache, ProxyAvgTime is down as follow showed
>  !image-2018-08-13-11-27-49-023.png! 



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-13821) RBF: Add dfs.federation.router.mount-table.cache.enable so that users can disable cache

2018-08-13 Thread Yiqun Lin (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-13821?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16578179#comment-16578179
 ] 

Yiqun Lin commented on HDFS-13821:
--

Thanks for describing more, [~ferhui]. I'm +1 for the idea for disabling cache 
when a lot of unique paths were accessed.
{quote}
Other options I can think of are:
Try to figure out when to stop caching.
...
{quote}
For the opinion of 'Try to figure out when to stop caching.', we can do a count 
stat for the hit ratio in mount table resolver, and print log info of this. 
Then users decide to enable/disable the cache.

> RBF: Add dfs.federation.router.mount-table.cache.enable so that users can 
> disable cache
> ---
>
> Key: HDFS-13821
> URL: https://issues.apache.org/jira/browse/HDFS-13821
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: hdfs
>Affects Versions: 3.1.0, 2.9.1, 3.0.3
>Reporter: Fei Hui
>Priority: Major
> Attachments: HDFS-13821.001.patch, image-2018-08-13-11-27-49-023.png
>
>
> When i test rbf, if found performance problem.
> I found that ProxyAvgTime From Ganglia is so high, i run jstack on Router and 
> get the following stack frames
> {quote}
>    java.lang.Thread.State: WAITING (parking)
>     at sun.misc.Unsafe.park(Native Method)
>     - parking to wait for  <0x0005c264acd8> (a 
> java.util.concurrent.locks.ReentrantLock$NonfairSync)
>     at java.util.concurrent.locks.LockSupport.park(LockSupport.java:175)
>     at 
> java.util.concurrent.locks.AbstractQueuedSynchronizer.parkAndCheckInterrupt(AbstractQueuedSynchronizer.java:836)
>     at 
> java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireQueued(AbstractQueuedSynchronizer.java:870)
>     at 
> java.util.concurrent.locks.AbstractQueuedSynchronizer.acquire(AbstractQueuedSynchronizer.java:1199)
>     at 
> java.util.concurrent.locks.ReentrantLock$NonfairSync.lock(ReentrantLock.java:209)
>     at 
> java.util.concurrent.locks.ReentrantLock.lock(ReentrantLock.java:285)
>     at 
> com.google.common.cache.LocalCache$Segment.lockedGetOrLoad(LocalCache.java:2249)
>     at 
> com.google.common.cache.LocalCache$Segment.get(LocalCache.java:2228)
>     at com.google.common.cache.LocalCache.get(LocalCache.java:3965)
>     at 
> com.google.common.cache.LocalCache$LocalManualCache.get(LocalCache.java:4764)
>     at 
> org.apache.hadoop.hdfs.server.federation.resolver.MountTableResolver.getDestinationForPath(MountTableResolver.java:380)
>     at 
> org.apache.hadoop.hdfs.server.federation.router.RouterRpcServer.getLocationsForPath(RouterRpcServer.java:2104)
>     at 
> org.apache.hadoop.hdfs.server.federation.router.RouterRpcServer.getLocationsForPath(RouterRpcServer.java:2087)
>     at 
> org.apache.hadoop.hdfs.server.federation.router.RouterRpcServer.getListing(RouterRpcServer.java:1050)
>     at 
> org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.getListing(ClientNamenodeProtocolServerSideTranslatorPB.java:640)
>     at 
> org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)
>     at 
> org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:616)
>     at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:982)
>     at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2115)
>     at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2111)
>     at java.security.AccessController.doPrivileged(Native Method)
>     at javax.security.auth.Subject.doAs(Subject.java:422)
> {quote}
> Many threads blocked on *LocalCache*
> After disable the cache, ProxyAvgTime is down as follow showed
>  !image-2018-08-13-11-27-49-023.png! 



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-13821) RBF: Add dfs.federation.router.mount-table.cache.enable so that users can disable cache

2018-08-13 Thread Fei Hui (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-13821?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16578110#comment-16578110
 ] 

Fei Hui commented on HDFS-13821:


[~elgoiri] Profiling LocalCache is interesing, should we adress a new JIRA? 
Disabling Cache is efficient when hundreds of millions files are access,  in 
such scenes ProxyAvgTime only costs 0.0x ms by directly computing remote path.

> RBF: Add dfs.federation.router.mount-table.cache.enable so that users can 
> disable cache
> ---
>
> Key: HDFS-13821
> URL: https://issues.apache.org/jira/browse/HDFS-13821
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: hdfs
>Affects Versions: 3.1.0, 2.9.1, 3.0.3
>Reporter: Fei Hui
>Priority: Major
> Attachments: HDFS-13821.001.patch, image-2018-08-13-11-27-49-023.png
>
>
> When i test rbf, if found performance problem.
> I found that ProxyAvgTime From Ganglia is so high, i run jstack on Router and 
> get the following stack frames
> {quote}
>    java.lang.Thread.State: WAITING (parking)
>     at sun.misc.Unsafe.park(Native Method)
>     - parking to wait for  <0x0005c264acd8> (a 
> java.util.concurrent.locks.ReentrantLock$NonfairSync)
>     at java.util.concurrent.locks.LockSupport.park(LockSupport.java:175)
>     at 
> java.util.concurrent.locks.AbstractQueuedSynchronizer.parkAndCheckInterrupt(AbstractQueuedSynchronizer.java:836)
>     at 
> java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireQueued(AbstractQueuedSynchronizer.java:870)
>     at 
> java.util.concurrent.locks.AbstractQueuedSynchronizer.acquire(AbstractQueuedSynchronizer.java:1199)
>     at 
> java.util.concurrent.locks.ReentrantLock$NonfairSync.lock(ReentrantLock.java:209)
>     at 
> java.util.concurrent.locks.ReentrantLock.lock(ReentrantLock.java:285)
>     at 
> com.google.common.cache.LocalCache$Segment.lockedGetOrLoad(LocalCache.java:2249)
>     at 
> com.google.common.cache.LocalCache$Segment.get(LocalCache.java:2228)
>     at com.google.common.cache.LocalCache.get(LocalCache.java:3965)
>     at 
> com.google.common.cache.LocalCache$LocalManualCache.get(LocalCache.java:4764)
>     at 
> org.apache.hadoop.hdfs.server.federation.resolver.MountTableResolver.getDestinationForPath(MountTableResolver.java:380)
>     at 
> org.apache.hadoop.hdfs.server.federation.router.RouterRpcServer.getLocationsForPath(RouterRpcServer.java:2104)
>     at 
> org.apache.hadoop.hdfs.server.federation.router.RouterRpcServer.getLocationsForPath(RouterRpcServer.java:2087)
>     at 
> org.apache.hadoop.hdfs.server.federation.router.RouterRpcServer.getListing(RouterRpcServer.java:1050)
>     at 
> org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.getListing(ClientNamenodeProtocolServerSideTranslatorPB.java:640)
>     at 
> org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)
>     at 
> org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:616)
>     at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:982)
>     at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2115)
>     at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2111)
>     at java.security.AccessController.doPrivileged(Native Method)
>     at javax.security.auth.Subject.doAs(Subject.java:422)
> {quote}
> Many threads blocked on *LocalCache*
> After disable the cache, ProxyAvgTime is down as follow showed
>  !image-2018-08-13-11-27-49-023.png! 



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-13821) RBF: Add dfs.federation.router.mount-table.cache.enable so that users can disable cache

2018-08-13 Thread JIRA


[ 
https://issues.apache.org/jira/browse/HDFS-13821?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16577918#comment-16577918
 ] 

Íñigo Goiri commented on HDFS-13821:


I've seen similar situations in the past and that was the reason to add the max 
cache size in HDFS-12802.
I'm guessing that the MR jobs accesses a lot of unique paths and we generate a 
new location for each one of them.
Adding the option to disable the cache might be a solution but I'd like to go a 
little deeper before going there.
Other options I can think of are:
* Try to figure out when to stop caching.
* Cache addresses at a level with a better hit ratio.
* Improve the locking model. From the trace [~ferhui] posted, I'm guessing that 
the issue is that we are holding the write lock a lot.

CC [~crh] and [~csun] as they have been doing some test on the performance.

> RBF: Add dfs.federation.router.mount-table.cache.enable so that users can 
> disable cache
> ---
>
> Key: HDFS-13821
> URL: https://issues.apache.org/jira/browse/HDFS-13821
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: hdfs
>Affects Versions: 3.1.0, 2.9.1, 3.0.3
>Reporter: Fei Hui
>Priority: Major
> Attachments: HDFS-13821.001.patch, image-2018-08-13-11-27-49-023.png
>
>
> When i test rbf, if found performance problem.
> I found that ProxyAvgTime From Ganglia is so high, i run jstack on Router and 
> get the following stack frames
> {quote}
>    java.lang.Thread.State: WAITING (parking)
>     at sun.misc.Unsafe.park(Native Method)
>     - parking to wait for  <0x0005c264acd8> (a 
> java.util.concurrent.locks.ReentrantLock$NonfairSync)
>     at java.util.concurrent.locks.LockSupport.park(LockSupport.java:175)
>     at 
> java.util.concurrent.locks.AbstractQueuedSynchronizer.parkAndCheckInterrupt(AbstractQueuedSynchronizer.java:836)
>     at 
> java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireQueued(AbstractQueuedSynchronizer.java:870)
>     at 
> java.util.concurrent.locks.AbstractQueuedSynchronizer.acquire(AbstractQueuedSynchronizer.java:1199)
>     at 
> java.util.concurrent.locks.ReentrantLock$NonfairSync.lock(ReentrantLock.java:209)
>     at 
> java.util.concurrent.locks.ReentrantLock.lock(ReentrantLock.java:285)
>     at 
> com.google.common.cache.LocalCache$Segment.lockedGetOrLoad(LocalCache.java:2249)
>     at 
> com.google.common.cache.LocalCache$Segment.get(LocalCache.java:2228)
>     at com.google.common.cache.LocalCache.get(LocalCache.java:3965)
>     at 
> com.google.common.cache.LocalCache$LocalManualCache.get(LocalCache.java:4764)
>     at 
> org.apache.hadoop.hdfs.server.federation.resolver.MountTableResolver.getDestinationForPath(MountTableResolver.java:380)
>     at 
> org.apache.hadoop.hdfs.server.federation.router.RouterRpcServer.getLocationsForPath(RouterRpcServer.java:2104)
>     at 
> org.apache.hadoop.hdfs.server.federation.router.RouterRpcServer.getLocationsForPath(RouterRpcServer.java:2087)
>     at 
> org.apache.hadoop.hdfs.server.federation.router.RouterRpcServer.getListing(RouterRpcServer.java:1050)
>     at 
> org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.getListing(ClientNamenodeProtocolServerSideTranslatorPB.java:640)
>     at 
> org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)
>     at 
> org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:616)
>     at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:982)
>     at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2115)
>     at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2111)
>     at java.security.AccessController.doPrivileged(Native Method)
>     at javax.security.auth.Subject.doAs(Subject.java:422)
> {quote}
> Many threads blocked on *LocalCache*
> After disable the cache, ProxyAvgTime is down as follow showed
>  !image-2018-08-13-11-27-49-023.png! 



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org