[ https://issues.apache.org/jira/browse/HDFS-6515?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14029234#comment-14029234 ]

Tony Reix commented on HDFS-6515:
---------------------------------

grep "waiting for " Issue3.testPageRounder.5 | grep cached

gives on PPC64:

(TestFsDatasetCache.java:get(530)) - waiting for 1 to be cached.  Right now only 0 blocks are cached.
(TestFsDatasetCache.java:get(563)) - waiting for directive 2 to be cached.  stats = {bytesNeeded: 65536, bytesCached: 0, filesNeeded: 1, filesCached: 0, hasExpired: false}
(TestFsDatasetCache.java:get(563)) - waiting for directive 2 to be cached.  stats = {bytesNeeded: 65536, bytesCached: 0, filesNeeded: 1, filesCached: 0, hasExpired: false}
(TestFsDatasetCache.java:get(563)) - waiting for directive 2 to be cached.  stats = {bytesNeeded: 65536, bytesCached: 0, filesNeeded: 1, filesCached: 0, hasExpired: false}

and on x86_64:

(TestFsDatasetCache.java:get(530)) - waiting for 16 to be cached.  Right now only 0 blocks are cached.
(TestFsDatasetCache.java:get(563)) - waiting for directive 2 to be cached.  stats = {bytesNeeded: 4096, bytesCached: 0, filesNeeded: 1, filesCached: 0, hasExpired: false}
(TestFsDatasetCache.java:get(563)) - waiting for directive 2 to be cached.  stats = {bytesNeeded: 4096, bytesCached: 0, filesNeeded: 1, filesCached: 0, hasExpired: false}
(TestFsDatasetCache.java:get(563)) - waiting for directive 2 to be cached.  stats = {bytesNeeded: 4096, bytesCached: 0, filesNeeded: 1, filesCached: 0, hasExpired: false}

PPC64:   bytesNeeded: 65536
x86_64: bytesNeeded: 4096
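
These values match the usual Linux page sizes (4 KB on x86_64, 64 KB on most PPC64 distributions), so the difference is presumably the cache reservation being rounded up to the OS page size, not a different amount of data. A minimal sketch of that rounding (assumption only, not the Hadoop source; the 4096-byte block length is taken from the x86_64 trace):

// Sketch: round a block length up to the OS page size, as a "page rounder"
// would before reserving cache space for an mlock'ed block.
public class PageRoundingSketch {
    static long roundUpToPageSize(long length, long pageSize) {
        // Round length up to the next multiple of pageSize.
        return ((length + pageSize - 1) / pageSize) * pageSize;
    }

    public static void main(String[] args) {
        long blockLength = 4096L;  // block length seen in the x86_64 trace
        System.out.println(roundUpToPageSize(blockLength, 4096L));   // x86_64, 4 KB pages  -> 4096
        System.out.println(roundUpToPageSize(blockLength, 65536L));  // PPC64, 64 KB pages  -> 65536
    }
}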



nativeio.NativeIO (NativeIO.java:mlock(160)) - mlocking /home/tony/hadoop-3.X.Y-FromGitHub/hadoop-common/hadoop-hdfs-project/hadoop-hdfs/target/test/data/dfs/data/data1/current/BP-2013187742-127.0.1.1-1402063475487/current/finalized/blk_1073741825

mlock() is called 10 times on PPC64 and 44 times on x86_64.
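
To confirm the page-size difference on the two machines, `getconf PAGESIZE` can be run on each box, or the page size visible to the JVM can be printed (assumption: sun.misc.Unsafe is accessible, which should hold for both the IBM JVM and OpenJDK; this is only a diagnostic sketch):

// Sketch: print the OS page size the JVM sees, to confirm the
// 4 KB (x86_64) vs 64 KB (PPC64) difference behind the bytesNeeded values.
import java.lang.reflect.Field;

public class PageSizeCheck {
    public static void main(String[] args) throws Exception {
        Field f = sun.misc.Unsafe.class.getDeclaredField("theUnsafe");
        f.setAccessible(true);
        sun.misc.Unsafe unsafe = (sun.misc.Unsafe) f.get(null);
        System.out.println("OS page size: " + unsafe.pageSize() + " bytes");
    }
}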

> testPageRounder   (org.apache.hadoop.hdfs.server.datanode.TestFsDatasetCache)
> -----------------------------------------------------------------------------
>
>                 Key: HDFS-6515
>                 URL: https://issues.apache.org/jira/browse/HDFS-6515
>             Project: Hadoop HDFS
>          Issue Type: Bug
>          Components: datanode
>    Affects Versions: 2.4.0
>         Environment: Linux on PPC64
>            Reporter: Tony Reix
>            Priority: Blocker
>              Labels: test
>
> I have an issue with the test:
>    testPageRounder
>   (org.apache.hadoop.hdfs.server.datanode.TestFsDatasetCache)
> on Linux/PowerPC.
> On Linux/Intel, the test runs fine.
> On Linux/PowerPC, I get:
> testPageRounder(org.apache.hadoop.hdfs.server.datanode.TestFsDatasetCache)  
> Time elapsed: 64.037 sec  <<< ERROR!
> java.lang.Exception: test timed out after 60000 milliseconds
> Looking at the details, I see that some "Failed to cache " messages appear in
> the traces: only 10 on Intel, but 186 on PPC64.
> On PPC64, it looks like some thread is waiting for something that never
> happens, which causes the timeout.
> I'm currently using the IBM JVM, but I've checked that the issue also appears
> with OpenJDK.
> I'm using the latest Hadoop, but the issue already appeared with Hadoop 2.4.0.
> I need help understanding what the test does and which traces are expected,
> so that I can find the root cause.



--
This message was sent by Atlassian JIRA
(v6.2#6252)
