[ 
https://issues.apache.org/jira/browse/HDFS-5394?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13813397#comment-13813397
 ] 

Colin Patrick McCabe commented on HDFS-5394:
--------------------------------------------

Looks like this is an environment issue on the build machines.  They only 
have 64k of available mlock space:

{code}
2013-11-04 20:34:51,387 WARN  impl.FsDatasetCache 
(FsDatasetCache.java:run(329)) - Failed to cache block 1073741842 in 
BP-1183768563-67.195.138.24-1383597287811
ENOMEM: Cannot allocate memory
    at org.apache.hadoop.io.nativeio.NativeIO$POSIX.mlock_native(Native Method)
    at org.apache.hadoop.io.nativeio.NativeIO$POSIX.mlock(NativeIO.java:255)
    at 
org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.MappableBlock$PosixMlocker.mlock(MappableBlock.java:54)
    at 
org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.MappableBlock.load(MappableBlock.java:99)
    at 
org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetCache$CachingTask.run(FsDatasetCache.java:321)
    at 
java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
    at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
    at java.lang.Thread.run(Thread.java:662)
2013-11-04 20:34:51,388 WARN  impl.FsDatasetCache 
(FsDatasetCache.java:run(329)) - Failed to cache block 1073741841 in 
BP-1183768563-67.195.138.24-1383597287811
ENOMEM: Cannot allocate memory
    at org.apache.hadoop.io.nativeio.NativeIO$POSIX.mlock_native(Native Method)
    at org.apache.hadoop.io.nativeio.NativeIO$POSIX.mlock(NativeIO.java:255)
    at 
org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.MappableBlock$PosixMlocker.mlock(MappableBlock.java:54)
    at 
org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.MappableBlock.load(MappableBlock.java:99)
    at 
org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetCache$CachingTask.run(FsDatasetCache.java:321)
    at 
java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
    at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
    at java.lang.Thread.run(Thread.java:662)
2013-11-04 20:34:51,543 INFO  datanode.TestFsDatasetCache 
(TestFsDatasetCache.java:get(190)) - 
{code}

and then later...
{code}
verifyExpectedCacheUsage: expected 65535, got 60074; memlock limit = 65536.  
Waiting...
{code}

I'm not sure why we can't get all the way up to 65535, given that the memlock 
limit of 65536 is just above it.  I'll see if I can reproduce this locally.
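For anyone hitting the same ENOMEM on their own machines: a quick sketch of how 
to inspect and raise the locked-memory limit from the shell (the exact limits 
values and the {{hdfs}} user below are assumptions, not taken from the build 
machines themselves):

{code}
# Print the max locked-memory size for this shell (reported in KiB).
# A value of 64 corresponds to the 65536-byte limit seen in the log above.
ulimit -l

# Raise it for the current session (requires root, or a hard limit
# that is already high enough):
#   ulimit -l unlimited

# To make it persistent, an entry like this in /etc/security/limits.conf
# (assuming the DataNode runs as the "hdfs" user):
#   hdfs  -  memlock  unlimited
{code}

The DataNode reads this limit at startup, so it has to be raised before the 
process is launched, not after.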

> fix race conditions in DN caching and uncaching
> -----------------------------------------------
>
>                 Key: HDFS-5394
>                 URL: https://issues.apache.org/jira/browse/HDFS-5394
>             Project: Hadoop HDFS
>          Issue Type: Sub-task
>          Components: datanode, namenode
>    Affects Versions: 3.0.0
>            Reporter: Colin Patrick McCabe
>            Assignee: Colin Patrick McCabe
>         Attachments: HDFS-5394-caching.001.patch, 
> HDFS-5394-caching.002.patch, HDFS-5394-caching.003.patch, 
> HDFS-5394-caching.004.patch, HDFS-5394.005.patch, HDFS-5394.006.patch
>
>
> The DN needs to handle situations where it is asked to cache the same replica 
> more than once.  (Currently, it can actually do two mmaps and mlocks.)  It 
> also needs to handle the situation where caching a replica is cancelled 
> before said caching completes.



--
This message was sent by Atlassian JIRA
(v6.1#6144)