[
https://issues.apache.org/jira/browse/HIVE-1006?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Dave Lerman updated HIVE-1006:
------------------------------
Attachment: hive.1006.2.patch
Sorry about that - uploaded the wrong patch for this and 1007.
> getPartitionDescFromPath failing from CombineHiveInputFormat
> ------------------------------------------------------------
>
> Key: HIVE-1006
> URL: https://issues.apache.org/jira/browse/HIVE-1006
> Project: Hadoop Hive
> Issue Type: Bug
> Components: Query Processor
> Affects Versions: 0.4.1
> Reporter: Dave Lerman
> Attachments: hive.1006.1.patch, hive.1006.2.patch
>
>
> When HiveInputFormat.getPartitionDescFromPath is called from
> CombineHiveInputFormat, it sometimes fails to return a matching partitionDesc
> which then causes an Exception down the line since the split doesn't have an
> inputFormatClassName.
> The issue is that the path format used as the key in pathToPartitionInfo
> varies between stages: in the first stage it's the complete path as returned
> from the table definitions (e.g. hdfs://server/path), and in subsequent
> stages it's the complete path, with port (e.g. hdfs://server:8020/path), of
> the result of the previous stage. This isn't a problem in HiveInputFormat since
> the directory you're looking up always uses the same format as the keys, but
> in CombineHiveInputFormat, we take that path and look up its children in the
> file system to get all the block information, and then use one of the
> returned paths to get the partition info -- and that returned path does not
> include the port. So, in any stage after the first, we are looking for a
> path without the port, but all the keys in the map contain a port, so we
> don't find a match.
> The attached patch may not be ideal -- it doesn't fix the underlying problem
> of inconsistent path formats in pathToPartitionInfo -- it just works around
> it by walking through the map and looking for a matching path rather than
> doing a hash lookup.
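
For illustration only, here is a minimal Java sketch of the mismatch described above and of a map-walking lookup. It is not the attached patch: the class name PartitionPathLookup, the helper findByPath, the String values standing in for partitionDesc objects, and the example paths are all assumptions made for this sketch. It compares only the path component of each URI, so scheme, host, and port are ignored.

```java
import java.net.URI;
import java.util.LinkedHashMap;
import java.util.Map;

public class PartitionPathLookup {

    /**
     * Hypothetical helper: walks the map and returns the value whose key has
     * the same URI path component as dir, ignoring scheme, host, and port.
     * Returns null when no key matches.
     */
    public static <V> V findByPath(Map<String, V> pathToPartitionInfo, String dir) {
        String wanted = URI.create(dir).getPath();
        for (Map.Entry<String, V> entry : pathToPartitionInfo.entrySet()) {
            String keyPath = URI.create(entry.getKey()).getPath();
            if (keyPath.equals(wanted)) {
                return entry.getValue();
            }
        }
        return null;
    }

    public static void main(String[] args) {
        // Key as it might appear in a later stage: the path includes the port.
        Map<String, String> pathToPartitionInfo = new LinkedHashMap<>();
        pathToPartitionInfo.put("hdfs://server:8020/tmp/stage1", "partitionDesc-for-stage1");

        // Path in the format described for the file system listing: no port.
        String fromFileSystem = "hdfs://server/tmp/stage1";

        // Plain hash lookup misses because the strings differ.
        System.out.println(pathToPartitionInfo.get(fromFileSystem)); // prints "null"

        // Walking the map and comparing path components finds the entry.
        System.out.println(findByPath(pathToPartitionInfo, fromFileSystem)); // prints "partitionDesc-for-stage1"
    }
}
```

The walk is linear in the number of map entries, and, like the workaround described above, it leaves the inconsistent key formats in place. The sketch also does not handle trailing slashes or a lookup path that is a child of a key.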