[ 
https://issues.apache.org/jira/browse/HIVE-3638?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13486597#comment-13486597
 ] 

Chris Drome commented on HIVE-3638:
-----------------------------------

@Shreepadma: I had traced the code a while back and found that Hive was 
creating an empty file for these types of queries. Hadoop20 didn't care that 
the file was empty and would create a split, which would get a mapper. With 
Hadoop23 I noticed that there is a condition which checks to see whether the 
file is empty or not. If it is empty it doesn't create a split and hence 
doesn't get a mapper. In this way Hive could trick Hadoop20 into running an MR 
job, but this tactic doesn't work on Hadoop23. I don't remember the classes 
off-hand.
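
The difference described above can be sketched as a toy model. This is not Hadoop code: the function names, the `FileInfo` type, and the placeholder path are all hypothetical, and the only behaviour modelled is the one reported in this comment (an empty file gets a split on Hadoop20 but not on Hadoop23).

```python
from dataclasses import dataclass


@dataclass
class FileInfo:
    path: str
    length: int  # file size in bytes


def splits_always(files):
    """Hadoop20-like behaviour as described above: every file yields a
    split, even when it is empty. (Hypothetical model, not Hadoop's API.)"""
    return [(f.path, 0, f.length) for f in files]


def splits_skip_empty(files):
    """Hadoop23-like behaviour as described above: a file of length zero
    yields no split. (Hypothetical model, not Hadoop's API.)"""
    return [(f.path, 0, f.length) for f in files if f.length > 0]


# The empty placeholder file Hive writes for these queries (path is made up).
placeholder = [FileInfo("/tmp/hypothetical/emptyFile", 0)]

# One mapper is launched per split in this model.
mappers_h20 = len(splits_always(placeholder))
mappers_h23 = len(splits_skip_empty(placeholder))
print(mappers_h20, mappers_h23)
```

In this model the placeholder file yields one split under the first policy and none under the second, so no mapper ever runs for the query in the second case.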

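Here is the diff of the generated vs. expected output ("<" lines are generated, 
">" lines are expected). If I remember correctly, a query returns NULL when no 
splits are generated. The line numbers in the diff correspond to these queries: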

Line 84: select max(ds) from TEST1; (no partitions exist)
Line 211: alter table TEST1 add partition (ds='1'); select max(ds) from TEST1;
Line 337: select count(distinct ds) from TEST1;
Line 1080: alter table TEST2 add partition (ds='1', hr='1'); alter table TEST2 
add partition (ds='1', hr='2'); alter table TEST2 add partition (ds='1', 
hr='3'); select ds, count(distinct hr) from TEST2 group by ds;
Line 1453: alter table TEST1 add partition (ds='2'); select max(ds) from TEST1;

    [junit] diff -a /export/crawlspace/cdrome/workspace/hive/build/ql/test/logs/clientpositive/metadataonly1.q.out /export/crawlspace/cdrome/workspace/hive/ql/src/test/results/clientpositive/metadataonly1.q.out
    [junit] 84c84
    [junit] < NULL
    [junit] ---
    [junit] > 
    [junit] 211c211
    [junit] < NULL
    [junit] ---
    [junit] > 1
    [junit] 337c337
    [junit] < 0
    [junit] ---
    [junit] > 1
    [junit] 1080a1081
    [junit] > 1 3
    [junit] 1453c1454
    [junit] < NULL
    [junit] ---
    [junit] > 2
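
The generated ("<") values above are what SQL aggregates return over zero input rows, which is consistent with no mapper having run. A minimal sketch, assuming SQLite as a stand-in for Hive: the tables are plain and empty, so it reproduces only the generated side of the diff. The expected side depends on Hive partition metadata, which SQLite does not model.

```python
import sqlite3

# In-memory database with empty stand-ins for TEST1 and TEST2.
# Assumption: ds and hr are modelled as ordinary columns, not partitions.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE test1 (ds TEXT)")
conn.execute("CREATE TABLE test2 (ds TEXT, hr TEXT)")

# max() over zero rows yields a single row containing NULL (None in Python).
max_ds = conn.execute("SELECT max(ds) FROM test1").fetchone()[0]

# count(distinct ...) over zero rows yields 0.
distinct_ds = conn.execute("SELECT count(DISTINCT ds) FROM test1").fetchone()[0]

# A GROUP BY over zero rows yields no rows at all.
grouped = conn.execute(
    "SELECT ds, count(DISTINCT hr) FROM test2 GROUP BY ds"
).fetchall()

print(max_ds, distinct_ds, grouped)
conn.close()
```

These three results line up with the NULL at lines 84, 211 and 1453, the 0 at line 337, and the missing "1 3" row at line 1080.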
                
> metadataonly1.q test fails with Hadoop23
> ----------------------------------------
>
>                 Key: HIVE-3638
>                 URL: https://issues.apache.org/jira/browse/HIVE-3638
>             Project: Hive
>          Issue Type: Bug
>          Components: Tests
>    Affects Versions: 0.10.0, 0.9.1
>            Reporter: Chris Drome
>
> Hive creates an empty file as a hack to get Hadoop to run a mapper.
> This no longer works with Hadoop23, causing this test to fail. Note that this 
> tests empty partitions.

