[ 
https://issues.apache.org/jira/browse/HIVE-6016?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13845936#comment-13845936
 ] 

Prasanth J commented on HIVE-6016:
----------------------------------

This should fix the hcatalog unit test failure in TestOrcDynamicPartitioned on hadoop2.

> Hadoop23Shims has a bug in listLocatedStatus impl.
> --------------------------------------------------
>
>                 Key: HIVE-6016
>                 URL: https://issues.apache.org/jira/browse/HIVE-6016
>             Project: Hive
>          Issue Type: Bug
>          Components: Shims
>    Affects Versions: 0.13.0
>            Reporter: Sushanth Sowmyan
>            Assignee: Prasanth J
>         Attachments: HIVE-6016.1.patch
>
>
> Prasanth and I discovered that the implementation of the wrapping Iterator in 
> listLocatedStatus at 
> https://github.com/apache/hive/blob/2d2f89c21618341987c1257a88691981f1f606c7/shims/src/0.23/java/org/apache/hadoop/hive/shims/Hadoop23Shims.java#L350-L393
>  is broken.
> Basically, if you have files (a,b,_s) and a filter that is supposed to 
> filter out _s, the expected output is (a,b). The iterator actually produces 
> (a,b,null), but hasNext looks at the next value and treats a null as the end 
> of the entries, so (a,b,_s) still comes out as (a,b).
> There is a boundary condition on the very first pick, however, which bypasses 
> the filter: (_s,a,b) comes out as the unfiltered (_s,a,b), which orc breaks 
> on.
> The effect of this bug is that Orc will not be able to read directories where 
> a file such as _SUCCESS is the first entry returned in the FileStatus listing.
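
To illustrate the behaviour described above, here is a minimal sketch of a filtering iterator that looks ahead without using null as an end marker. It is not the code from HIVE-6016.1.patch or from Hadoop23Shims. The class name FilteringIterator is hypothetical, and plain java.util.Iterator and Predicate stand in for the Hadoop iterator and path filter types so that the example compiles on its own.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;
import java.util.NoSuchElementException;
import java.util.function.Predicate;

/** Hypothetical stand-in for the wrapping iterator; not the actual Hive code. */
final class FilteringIterator<T> implements Iterator<T> {
    private final Iterator<T> source;
    private final Predicate<? super T> accept;
    private T buffered;            // next element that passed the filter
    private boolean hasBuffered;   // explicit flag, so null is never an end marker

    FilteringIterator(Iterator<T> source, Predicate<? super T> accept) {
        this.source = source;
        this.accept = accept;
    }

    @Override
    public boolean hasNext() {
        // Advance until an accepted element is buffered or the source runs out.
        // The same path is taken for the first element as for every later one.
        while (!hasBuffered && source.hasNext()) {
            T candidate = source.next();
            if (accept.test(candidate)) {
                buffered = candidate;
                hasBuffered = true;
            }
        }
        return hasBuffered;
    }

    @Override
    public T next() {
        if (!hasNext()) {
            throw new NoSuchElementException();
        }
        T result = buffered;
        buffered = null;
        hasBuffered = false;
        return result;
    }

    public static void main(String[] args) {
        // Assumed filter for illustration: hide names starting with an underscore.
        Predicate<String> visible = name -> !name.startsWith("_");
        List<String> out = new ArrayList<>();
        Iterator<String> it =
            new FilteringIterator<>(Arrays.asList("_s", "a", "b").iterator(), visible);
        while (it.hasNext()) {
            out.add(it.next());
        }
        System.out.println(out); // prints [a, b]
    }
}
```

Because the filter is applied inside hasNext for every element, the order of the entries does not matter in this sketch: (a,b,_s) and (_s,a,b) both yield (a,b), and no null is ever handed to the caller.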



--
This message was sent by Atlassian JIRA
(v6.1.4#6159)
