[jira] [Commented] (HIVE-12712) HiveInputFormat may fail to column names to read in some cases

Prasanth Jayachandran (JIRA) Mon, 21 Dec 2015 15:58:57 -0800

    [ 
https://issues.apache.org/jira/browse/HIVE-12712?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15067285#comment-15067285
 ]


Prasanth Jayachandran commented on HIVE-12712:
----------------------------------------------

The test failures are unrelated to this patch. 18 tests are failing for other
patches as well. The dynamic partition pruning test failure depends on the JDK
version: the test passes on JDK 7 and fails on JDK 8 because of a HashMap
ordering difference, which is a known issue.

> HiveInputFormat may fail to column names to read in some cases
> --------------------------------------------------------------
>
>                 Key: HIVE-12712
>                 URL: https://issues.apache.org/jira/browse/HIVE-12712
>             Project: Hive
>          Issue Type: Bug
>    Affects Versions: 2.0.0, 2.1.0
>            Reporter: Takahiko Saito
>            Assignee: Prasanth Jayachandran
>         Attachments: HIVE-12712.1.patch, HIVE-12712.2.patch
>
>
> The primary issue is that when the plan is generated, the pathToAliases map
> is populated with directory paths mapped to table aliases.
> pathToAliases.put() uses path.toString() as the map key, but probing uses
> path.toUri().toString(). This can cause probe misses when the path contains
> spaces: path.toUri() escapes the spaces, whereas path.toString() does not.
> As a result, HiveInputFormat can take a different code path, which can fail
> to set the list of columns to read from the source table. This was causing
> an unexpected NPE in OrcInputFormat after the HIVE-11705 refactoring, which
> removed the null check for column names. The resulting exception is
> {code}
> Caused by: java.lang.RuntimeException: ORC split generation failed with exception: java.lang.NullPointerException
>         at org.apache.hadoop.hive.ql.io.orc.OrcInputFormat.generateSplitsInfo(OrcInputFormat.java:1288)
>         at org.apache.hadoop.hive.ql.io.orc.OrcInputFormat.getSplits(OrcInputFormat.java:1354)
>         at org.apache.hadoop.hive.ql.io.HiveInputFormat.addSplitsForGroup(HiveInputFormat.java:367)
>         at org.apache.hadoop.hive.ql.io.HiveInputFormat.getSplits(HiveInputFormat.java:457)
>         at org.apache.hadoop.hive.ql.exec.tez.HiveSplitGenerator.initialize(HiveSplitGenerator.java:152)
>         at org.apache.tez.dag.app.dag.RootInputInitializerManager$InputInitializerCallable$1.run(RootInputInitializerManager.java:246)
>         at org.apache.tez.dag.app.dag.RootInputInitializerManager$InputInitializerCallable$1.run(RootInputInitializerManager.java:240)
>         at java.security.AccessController.doPrivileged(Native Method)
>         at javax.security.auth.Subject.doAs(Subject.java:422)
>         at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1657)
>         at org.apache.tez.dag.app.dag.RootInputInitializerManager$InputInitializerCallable.call(RootInputInitializerManager.java:240)
>         at org.apache.tez.dag.app.dag.RootInputInitializerManager$InputInitializerCallable.call(RootInputInitializerManager.java:227)
>         at java.util.concurrent.FutureTask.run(FutureTask.java:266)
>         ... 3 more
> Caused by: java.util.concurrent.ExecutionException: java.lang.NullPointerException
>         at java.util.concurrent.FutureTask.report(FutureTask.java:122)
>         at java.util.concurrent.FutureTask.get(FutureTask.java:192)
>         at org.apache.hadoop.hive.ql.io.orc.OrcInputFormat.generateSplitsInfo(OrcInputFormat.java:1282)
>         ... 15 more
> Caused by: java.lang.NullPointerException
>         at org.apache.hadoop.hive.ql.io.orc.OrcInputFormat.extractNeededColNames(OrcInputFormat.java:422)
>         at org.apache.hadoop.hive.ql.io.orc.OrcInputFormat.extractNeededColNames(OrcInputFormat.java:417)
>         at org.apache.hadoop.hive.ql.io.orc.OrcInputFormat.access$2000(OrcInputFormat.java:134)
>         at org.apache.hadoop.hive.ql.io.orc.OrcInputFormat$SplitGenerator.call(OrcInputFormat.java:1072)
>         at org.apache.hadoop.hive.ql.io.orc.OrcInputFormat$SplitGenerator.call(OrcInputFormat.java:919)
>         ... 4 more
> {code}
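
The key mismatch described in the issue can be sketched with plain JDK classes.
This is only an illustration, not the Hive code: it uses java.net.URI in place
of Hadoop's Path, the class name PathKeyMismatch, the helper methods, the
"hdfs"/"nn:8020" values, and the alias "t1" are all made up, and the map value
is simplified to a single String. It assumes, as the description states, that
the URI form escapes spaces while the plain string form does not.

```java
import java.net.URI;
import java.net.URISyntaxException;
import java.util.HashMap;
import java.util.Map;

public class PathKeyMismatch {

    // Builds a URI with the multi-argument java.net.URI constructor, which
    // percent-escapes illegal characters in the path, such as spaces.
    static URI dirUri(String dir) {
        try {
            // "hdfs" and "nn:8020" are made-up values for illustration.
            return new URI("hdfs", "nn:8020", dir, null, null);
        } catch (URISyntaxException e) {
            throw new IllegalArgumentException(e);
        }
    }

    // Unescaped rendering: a stand-in for the key form that the description
    // attributes to path.toString().
    static String unescapedKey(URI uri) {
        return uri.getScheme() + "://" + uri.getAuthority() + uri.getPath();
    }

    // Escaped rendering: a stand-in for path.toUri().toString().
    static String escapedKey(URI uri) {
        return uri.toString();
    }

    // Populates the map with the unescaped key, then probes it with either
    // the escaped or the unescaped key. Returns null on a probe miss.
    static String lookupAlias(String dir, boolean probeEscaped) {
        URI uri = dirUri(dir);
        Map<String, String> pathToAliases = new HashMap<>();
        pathToAliases.put(unescapedKey(uri), "t1");
        String probe = probeEscaped ? escapedKey(uri) : unescapedKey(uri);
        return pathToAliases.get(probe);
    }

    public static void main(String[] args) {
        String dir = "/warehouse/my table";
        System.out.println(unescapedKey(dirUri(dir)));
        System.out.println(escapedKey(dirUri(dir)));
        System.out.println(lookupAlias(dir, true));   // probe miss
        System.out.println(lookupAlias(dir, false));  // probe hit
    }
}
```

With a directory that contains no spaces, both key forms are identical and the
probe succeeds either way, which is why the problem only shows up for some
paths.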



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
