[ https://issues.apache.org/jira/browse/HDFS-6515?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14029234#comment-14029234 ]
Tony Reix commented on HDFS-6515:
---------------------------------

Running

  grep "waiting for " Issue3.testPageRounder.5 | grep cached

gives on PPC64:

  (TestFsDatasetCache.java:get(530)) - waiting for 1 to be cached. Right now only 0 blocks are cached.
  (TestFsDatasetCache.java:get(563)) - waiting for directive 2 to be cached. stats = {bytesNeeded: 65536, bytesCached: 0, filesNeeded: 1, filesCached: 0, hasExpired: false}
  (TestFsDatasetCache.java:get(563)) - waiting for directive 2 to be cached. stats = {bytesNeeded: 65536, bytesCached: 0, filesNeeded: 1, filesCached: 0, hasExpired: false}
  (TestFsDatasetCache.java:get(563)) - waiting for directive 2 to be cached. stats = {bytesNeeded: 65536, bytesCached: 0, filesNeeded: 1, filesCached: 0, hasExpired: false}

and on x86_64:

  (TestFsDatasetCache.java:get(530)) - waiting for 16 to be cached. Right now only 0 blocks are cached.
  (TestFsDatasetCache.java:get(563)) - waiting for directive 2 to be cached. stats = {bytesNeeded: 4096, bytesCached: 0, filesNeeded: 1, filesCached: 0, hasExpired: false}
  (TestFsDatasetCache.java:get(563)) - waiting for directive 2 to be cached. stats = {bytesNeeded: 4096, bytesCached: 0, filesNeeded: 1, filesCached: 0, hasExpired: false}
  (TestFsDatasetCache.java:get(563)) - waiting for directive 2 to be cached. stats = {bytesNeeded: 4096, bytesCached: 0, filesNeeded: 1, filesCached: 0, hasExpired: false}

In summary:

  PPC64:  bytesNeeded: 65536
  x86_64: bytesNeeded: 4096

The mlock() calls appear in the traces as:

  nativeio.NativeIO (NativeIO.java:mlock(160)) - mlocking /home/tony/hadoop-3.X.Y-FromGitHub/hadoop-common/hadoop-hdfs-project/hadoop-hdfs/target/test/data/dfs/data/data1/current/BP-2013187742-127.0.1.1-1402063475487/current/finalized/blk_1073741825

mlock() is called 10 times on PPC64 and 44 times on x86_64.
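
The two bytesNeeded values above (65536 and 4096) are what one would get if a small block were rounded up to a whole memory page of 64 KiB on PPC64 and 4 KiB on x86_64. The following is only a sketch of that arithmetic, not the actual Hadoop code: the class name, the roundUp helper, and the 512-byte block size are assumptions made for illustration.

```java
// Hypothetical illustration of rounding a byte count up to a whole
// number of pages. Not taken from the Hadoop sources.
final class PageRounderSketch {

    // Round count up to the next multiple of pageSize (pageSize > 0).
    static long roundUp(long count, long pageSize) {
        return ((count + pageSize - 1) / pageSize) * pageSize;
    }

    public static void main(String[] args) {
        long blockSize = 512L;          // assumed small block size
        long x86PageSize = 4096L;       // 4 KiB pages
        long ppc64PageSize = 65536L;    // 64 KiB pages

        System.out.println("x86_64: bytesNeeded = "
            + roundUp(blockSize, x86PageSize));
        System.out.println("PPC64:  bytesNeeded = "
            + roundUp(blockSize, ppc64PageSize));
    }
}
```

If this is indeed what happens, any cache capacity expressed as a fixed number of bytes would hold 16 times fewer such blocks on PPC64 than on x86_64, which would be consistent with the "waiting for 1" versus "waiting for 16" messages. This remains a hypothesis to be checked against the test code.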
> testPageRounder (org.apache.hadoop.hdfs.server.datanode.TestFsDatasetCache)
> -----------------------------------------------------------------------------
>
>                 Key: HDFS-6515
>                 URL: https://issues.apache.org/jira/browse/HDFS-6515
>             Project: Hadoop HDFS
>          Issue Type: Bug
>          Components: datanode
>    Affects Versions: 2.4.0
>         Environment: Linux on PPC64
>            Reporter: Tony Reix
>            Priority: Blocker
>              Labels: test
>
> I have an issue with the test:
>    testPageRounder (org.apache.hadoop.hdfs.server.datanode.TestFsDatasetCache)
> on Linux/PowerPC.
> On Linux/Intel, the test runs fine.
> On Linux/PowerPC, I have:
> testPageRounder(org.apache.hadoop.hdfs.server.datanode.TestFsDatasetCache)
> Time elapsed: 64.037 sec  <<< ERROR!
> java.lang.Exception: test timed out after 60000 milliseconds
> Looking at the details, I see that some "Failed to cache " messages appear in
> the traces: only 10 on Intel, but 186 on PPC64.
> On PPC64, it looks like some thread is waiting for something that never
> happens, which causes the timeout.
> I'm now using the IBM JVM; however, I've just checked that the issue also
> appears with OpenJDK.
> I'm now using the latest Hadoop; however, the issue already appeared with
> Hadoop 2.4.0.
> I need help understanding what the test is doing and what traces are
> expected, in order to find the root cause.

--
This message was sent by Atlassian JIRA
(v6.2#6252)