CombinedHiveInputFormat should parse the inputpath correctly
------------------------------------------------------------
Key: HIVE-1001
URL: https://issues.apache.org/jira/browse/HIVE-1001
Project: Hadoop Hive
Issue Type: Bug
Affects Versions: 0.5.0
Reporter: Zheng Shao
>From David Lerman:
"
I'm running into errors where CombinedHiveInputFormat is combining data from
two different tables which is causing problems because the tables have
different input formats.
It looks like the problem is in
org.apache.hadoop.hive.shims.Hadoop20Shims.getInputPathsShim. It calls
CombineFileInputFormat.getInputPaths which returns the list of input paths
and then chops off the first 5 characters to remove file: from the
beginning, but the return value I'm getting from getInputPaths is actually
hdfs://domain/path. So then when it creates the pools using these paths,
none of the input paths match the pools (since they're just the file path
which protocol or domain).
"
We should use Path.getPath() to get the path part of an URI instead of just
chopping off 5 chars.
--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.