Luke Lovett created HIVE-9771:
---------------------------------

Summary: HiveCombineInputFormat does not appropriately call getSplits on InputFormats for native tables
Key: HIVE-9771
URL: https://issues.apache.org/jira/browse/HIVE-9771
Project: Hive
Issue Type: Bug
Affects Versions: 0.12.0
Environment: Hive 0.12.0
    Hadoop 2.4.1
    java version "1.7.0_51"
    Java(TM) SE Runtime Environment (build 1.7.0_51-b13)
    Java HotSpot(TM) 64-Bit Server VM (build 24.51-b03, mixed mode)
Reporter: Luke Lovett

{{CombineHiveInputFormat}} never calls {{getSplits}} on a custom {{InputFormat}} when that InputFormat is used by a native table. If I run {{set hive.input.format=org.apache.hadoop.hive.ql.io.HiveInputFormat;}}, then {{getSplits}} is called. I'm not the first user to have experienced this either; see [this post from the hive-user mailing list|https://mail-archives.apache.org/mod_mbox/hive-user/201410.mbox/%3CCAENxBwy+XB1OB2ZOjz=4=NxKNMsWA==o0ibrd+gopxgqrj2...@mail.gmail.com%3E].

The purpose of this ticket is to discover:

- Is this difference in behavior between {{CombineHiveInputFormat}} and {{HiveInputFormat}} intentional?
- Is there any way of forcing {{CombineHiveInputFormat}} to call {{getSplits}} on my own InputFormat? I was reading through the code for {{CombineHiveInputFormat}}, and it looks like it might only call my own InputFormat's {{getSplits}} method if the table is non-native. I'm not sure if I'm interpreting this correctly.
- For the purpose of creating an InputFormat to be used by other users, is it better to set {{hive.input.format}} to work around this, or to create a StorageHandler and make non-native tables?
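
The reading described in the second question can be written down as a toy model, which may make it easier to confirm or correct. This is only a sketch of the suspected behavior, not the actual Hive code: it uses no Hive or Hadoop classes, and every name in it ({{SplitDispatchSketch}}, {{RecordingFormat}}, {{plainGetSplits}}, {{combineGetSplits}}) is hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

// Toy model only: none of these types are Hive or Hadoop classes.
final class SplitDispatchSketch {

    /** Stand-in for a table's custom InputFormat; counts getSplits calls. */
    static final class RecordingFormat {
        int getSplitsCalls = 0;

        List<String> getSplits(String path) {
            getSplitsCalls++;
            List<String> splits = new ArrayList<>();
            splits.add("custom:" + path);
            return splits;
        }
    }

    /** Models the observed HiveInputFormat behavior: always delegates. */
    static List<String> plainGetSplits(RecordingFormat format, String path) {
        return format.getSplits(path);
    }

    /**
     * Models the suspected combine path (an assumption, not verified):
     * the table's own getSplits is consulted only for non-native tables.
     */
    static List<String> combineGetSplits(RecordingFormat format, String path,
                                         boolean nonNative) {
        if (nonNative) {
            return plainGetSplits(format, path);
        }
        // Native table: splits are derived from the files directly,
        // so the custom format's getSplits is never called.
        List<String> splits = new ArrayList<>();
        splits.add("combined:" + path);
        return splits;
    }

    public static void main(String[] args) {
        RecordingFormat nativeTable = new RecordingFormat();
        combineGetSplits(nativeTable, "/warehouse/native_t", false);
        System.out.println("native, combine path: "
                + nativeTable.getSplitsCalls + " getSplits call(s)");

        RecordingFormat nonNativeTable = new RecordingFormat();
        combineGetSplits(nonNativeTable, "/warehouse/handler_t", true);
        System.out.println("non-native, combine path: "
                + nonNativeTable.getSplitsCalls + " getSplits call(s)");
    }
}
```

If this model matches the real code, the two workarounds in the last question correspond to the two branches: setting {{hive.input.format}} selects the always-delegating path, while a StorageHandler makes the table non-native so that the combine path delegates.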