Luke Lovett created HIVE-9771:
---------------------------------

             Summary: HiveCombineInputFormat does not appropriately call 
getSplits on InputFormats for native tables
                 Key: HIVE-9771
                 URL: https://issues.apache.org/jira/browse/HIVE-9771
             Project: Hive
          Issue Type: Bug
    Affects Versions: 0.12.0
         Environment: Hive 0.12.0
Hadoop 2.4.1
java version "1.7.0_51"
Java(TM) SE Runtime Environment (build 1.7.0_51-b13)
Java HotSpot(TM) 64-Bit Server VM (build 24.51-b03, mixed mode)
            Reporter: Luke Lovett


{{CombineHiveInputFormat}} never calls {{getSplits}} on a custom 
{{InputFormat}} when that InputFormat is used by a native table. If I {{set}} 
{{hive.input.format=org.apache.hadoop.hive.ql.io.HiveInputFormat;}}, then 
{{getSplits}} is called. I'm not the first user to have experienced this either; 
see [this post from the hive-user mailing 
list|https://mail-archives.apache.org/mod_mbox/hive-user/201410.mbox/%3CCAENxBwy+XB1OB2ZOjz=4=NxKNMsWA==o0ibrd+gopxgqrj2...@mail.gmail.com%3E].

The purpose of this ticket is to discover:
- Is this difference in behavior between CombineHiveInputFormat and 
HiveInputFormat intentional?
- Is there any way of forcing CombineHiveInputFormat to call getSplits 
on my own InputFormat? I was reading through the code for 
CombineHiveInputFormat, and it looks like it might only call my own 
InputFormat's getSplits method if the table is non-native. I'm not sure 
if I'm interpreting this correctly.
- For the purpose of creating an InputFormat to be used by other users, is it 
better to set {{hive.input.format}} to work around this, or to 
create a StorageHandler and make non-native tables?
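
For reference, the session-level workaround mentioned above looks roughly like the following. This is only a sketch: {{my_native_table}} is a hypothetical table backed by the custom InputFormat, and the second {{set}} assumes the session was previously using CombineHiveInputFormat.

{code:sql}
-- Use HiveInputFormat so that the custom InputFormat's getSplits is called.
set hive.input.format=org.apache.hadoop.hive.ql.io.HiveInputFormat;

-- Hypothetical query against a native table that uses the custom InputFormat.
SELECT COUNT(*) FROM my_native_table;

-- Switch back to CombineHiveInputFormat afterwards (assumed previous value).
set hive.input.format=org.apache.hadoop.hive.ql.io.CombineHiveInputFormat;
{code}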



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
