[ https://issues.apache.org/jira/browse/HADOOP-2172?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12541110 ]

Doug Cutting commented on HADOOP-2172:
--------------------------------------

The specific claim (from IRC) is that random access to a MapFile stored on 
the LocalFileSystem has become much slower, and that much of the time is spent 
seeking in the CRC file.  Seeks within the current buffer should not require 
system calls, but perhaps that optimization was lost?
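
To make the optimization in question concrete, here is a minimal, 
self-contained sketch of a buffered reader that tracks its own position and 
satisfies a seek from the buffer whenever the target is already loaded.  This 
is not Hadoop's code: CountingSource, BufferedSeekable and underlyingSeeksFor 
are hypothetical names, and the in-memory byte array only stands in for a file 
whose seek() would be a system call.

```java
public class BufferedSeekSketch {

    /** Hypothetical stand-in for a raw file; counts the expensive operations. */
    static final class CountingSource {
        private final byte[] data;
        private long pos;
        int seekCalls;   // each of these would be a system call on a real file
        int readCalls;

        CountingSource(byte[] data) { this.data = data; }

        void seek(long newPos) { seekCalls++; pos = newPos; }

        int read(byte[] dst, int off, int len) {
            readCalls++;
            if (pos >= data.length) return -1;
            int n = (int) Math.min(len, data.length - pos);
            System.arraycopy(data, (int) pos, dst, off, n);
            pos += n;
            return n;
        }
    }

    /** Buffered reader that caches its position and seeks inside the buffer when it can. */
    static final class BufferedSeekable {
        private final CountingSource in;
        private final byte[] buf;
        private long bufStart;  // file offset of buf[0]
        private int bufLen;     // number of valid bytes in buf
        private int bufPos;     // index of the next byte to return from buf

        BufferedSeekable(CountingSource in, int bufferSize) {
            this.in = in;
            this.buf = new byte[bufferSize];
        }

        /** Answered from cached state; never touches the source. */
        long getPos() { return bufStart + bufPos; }

        void seek(long target) {
            if (target >= bufStart && target < bufStart + bufLen) {
                bufPos = (int) (target - bufStart);  // inside the buffer: no source call
                return;
            }
            in.seek(target);                         // outside the buffer: real seek
            bufStart = target;
            bufLen = 0;
            bufPos = 0;
        }

        int read() {
            if (bufPos >= bufLen) {                  // buffer exhausted: refill
                bufStart += bufLen;
                bufPos = 0;
                int n = in.read(buf, 0, buf.length);
                bufLen = Math.max(n, 0);
                if (n <= 0) return -1;
            }
            return buf[bufPos++] & 0xff;
        }
    }

    /** Repeats seek(0) + read() and reports how many seeks reached the source. */
    static int underlyingSeeksFor(int iterations) {
        CountingSource src = new CountingSource(new byte[8192]);
        BufferedSeekable in = new BufferedSeekable(src, 4096);
        for (int i = 0; i < iterations; i++) {
            in.seek(0);
            in.read();
        }
        return src.seekCalls;
    }

    public static void main(String[] args) {
        System.out.println("underlying seeks: " + underlyingSeeksFor(100_000));
    }
}
```

In this sketch only the first seek(0) reaches the source; every later one is 
served from the buffer.  If the in-buffer check were dropped, each iteration 
would pay for a seek and a refill.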

I hacked TestArrayFile to test this: 100,000 calls to ArrayFile.get(0) took 
around one second in 0.13 and now take over ten seconds.


> PositionCache was removed from FSDataInputStream, causes extremely bad 
> MapFile performance
> ------------------------------------------------------------------------------------------
>
>                 Key: HADOOP-2172
>                 URL: https://issues.apache.org/jira/browse/HADOOP-2172
>             Project: Hadoop
>          Issue Type: Bug
>    Affects Versions: 0.14.3, 0.15.0
>            Reporter: Johan Oskarsson
>            Assignee: Johan Oskarsson
>            Priority: Blocker
>         Attachments: positioncache-v1.patch
>
>
> The PositionCache in FSDataInputStream seems to have been removed in 
> HADOOP-1470. This makes, for example, MapFile.get extremely slow, because 
> the file position is no longer cached in memory.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
