[ https://issues.apache.org/jira/browse/HDFS-11379?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15843457#comment-15843457 ]

Daryn Sharp commented on HDFS-11379:
------------------------------------

Found when Hive jobs collided.  Tasks opened ORC files and other tasks then 
replaced them, so when the original tasks tried to read the footer (outside the 
initial fetch range) the stream went into an infinite loop requesting locations.  
The issue was difficult to isolate because by default the stream fetches 10 
blocks of locations up front, so it only manifested for multi-GB files.
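
To make the "multi-GB" threshold concrete, here is a rough sketch of the 
arithmetic.  It assumes a 128 MiB block size, which is not stated in this 
report and is often overridden per cluster, together with the 10-block 
prefetch mentioned above.  The class and method names are hypothetical and 
are not part of the HDFS client API.

```java
// Hypothetical helper for illustration only; not HDFS client code.
final class PrefetchWindow {
    // Assumption: 128 MiB blocks. The 10-block prefetch is from the comment above.
    static final long ASSUMED_BLOCK_SIZE = 128L * 1024 * 1024;
    static final int ASSUMED_PREFETCH_BLOCKS = 10;

    /** Bytes covered by the block locations fetched when the stream is opened. */
    static long prefetchedBytes(long blockSize, int blocks) {
        return blockSize * blocks;
    }

    /** True if a read at this offset falls outside the initially fetched range. */
    static boolean needsLocationFetch(long offset, long blockSize, int blocks) {
        return offset >= prefetchedBytes(blockSize, blocks);
    }

    public static void main(String[] args) {
        long gib = 1024L * 1024 * 1024;
        long lastByteOf1GiBFile = gib - 1;
        long lastByteOf4GiBFile = 4 * gib - 1;

        System.out.println("prefetched bytes: "
                + prefetchedBytes(ASSUMED_BLOCK_SIZE, ASSUMED_PREFETCH_BLOCKS));
        System.out.println("1 GiB footer needs fetch: "
                + needsLocationFetch(lastByteOf1GiBFile, ASSUMED_BLOCK_SIZE, ASSUMED_PREFETCH_BLOCKS));
        System.out.println("4 GiB footer needs fetch: "
                + needsLocationFetch(lastByteOf4GiBFile, ASSUMED_BLOCK_SIZE, ASSUMED_PREFETCH_BLOCKS));
    }
}
```

Under these assumptions the initial range covers 1.25 GiB, so a footer read 
only triggers a further location request for files larger than that.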

> DFSInputStream may infinite loop requesting block locations
> -----------------------------------------------------------
>
>                 Key: HDFS-11379
>                 URL: https://issues.apache.org/jira/browse/HDFS-11379
>             Project: Hadoop HDFS
>          Issue Type: Bug
>          Components: hdfs-client
>    Affects Versions: 2.7.0
>            Reporter: Daryn Sharp
>            Assignee: Daryn Sharp
>            Priority: Critical
>
> DFSInputStream creation caches file size and initial range of locations.  If 
> the file is truncated (or replaced) and the client attempts to read outside 
> the initial range, the client goes into a tight infinite loop requesting 
> locations for the nonexistent range.
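
The failure mode described above can be modelled with a small standalone 
sketch.  This is not DFSInputStream code: fetchLocations is a hypothetical 
stand-in for the NameNode call, and the maxAttempts cap is added here only so 
the demo terminates.  It is not necessarily how the issue was fixed.

```java
import java.util.ArrayList;
import java.util.List;

// Simplified model of the stale-length problem; all names are hypothetical.
final class StaleLengthDemo {

    /**
     * Stand-in for the NameNode: returns the start offsets of blocks that
     * overlap [offset, offset + length), based on the file's CURRENT length.
     * A range entirely past the end of the file yields an empty list.
     */
    static List<Long> fetchLocations(long currentLength, long blockSize,
                                     long offset, long length) {
        List<Long> starts = new ArrayList<>();
        if (offset >= currentLength) {
            return starts; // nonexistent range: nothing to return
        }
        long end = Math.min(offset + length, currentLength);
        for (long start = (offset / blockSize) * blockSize; start < end; start += blockSize) {
            starts.add(start);
        }
        return starts;
    }

    /**
     * Client side: the offset is validated against the length cached at open
     * time, so the client believes the block must exist and keeps asking.
     * Returns the number of requests made, or -1 if maxAttempts was exhausted.
     */
    static int locateBlock(long cachedLength, long currentLength, long blockSize,
                           long offset, int maxAttempts) {
        if (offset >= cachedLength) {
            throw new IllegalArgumentException("offset beyond cached file length");
        }
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            if (!fetchLocations(currentLength, blockSize, offset, blockSize).isEmpty()) {
                return attempt;
            }
            // Without the cap, an empty result would be retried here forever.
        }
        return -1;
    }

    public static void main(String[] args) {
        // File unchanged since open: the first request succeeds.
        System.out.println(locateBlock(4000, 4000, 100, 3900, 5));
        // File truncated to 1000 bytes after open: every request comes back empty.
        System.out.println(locateBlock(4000, 1000, 100, 3900, 5));
    }
}
```

The second call shows the mismatch: the cached length says offset 3900 is 
valid, while the current file has no block there, so only the cap ends the 
loop.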



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
