I ran into an interesting bug. Basically, if your FROM() source is
a partitioned table and you use a where clause that prunes, all of the
INSERT HERE SELECT * WHERE x=y ignores each specified where clause.  This
does not occur if the source partition is not specified, but if the source
as where partition = 'x' then the where on each individual insert is
ignored...

I've included some files here

testdata.tsv - Tab delimited data to prove the issue
create_tables.hive - Creates a database and tables as well as loads the
data from the TSV

Test Cases:
I created these test case files in a way that there are three types of
insert in each case: 1. Load all data from initial statement, 2. Load
partial data (use a limiting clause such as where day >= '2013-01-05', and
3 Load NO data from the initial statement (where 1 = 0)

These tests are all run on hive 0.9

multi-flat-flat.hive - The source table and the dest tables are not
partitioned, the where clauses work as expected:

19 Rows loaded to multi_bug_flat
0 Rows loaded to multi_bug_flat3
15 Rows loaded to multi_bug_flat2

multi-part-part.hive - The source table and the dest tables are
partitioned. The where clauses are not honored.

9 Rows loaded to multi_bug_part3
9 Rows loaded to multi_bug_part2
9 Rows loaded to multi_bug_part

multi-flat-part.hive - The source table is flat, the dest table is
partitioned - The where clauses work as expected:

0 Rows loaded to multi_bug_part3
15 Rows loaded to multi_bug_part2
19 Rows loaded to multi_bug_part

multi-part-flat.hive - The source table is partitioned, the dest table is
flat - The where clauses are not honored:

9 Rows loaded to multi_bug_flat
9 Rows loaded to multi_bug_flat3
9 Rows loaded to multi_bug_flat2

multi-part-specified.hive - The source and dest are partitioned, but there
is no partition pruning statement in the from ()  this works as expected

0 Rows loaded to multi_bug_part3
15 Rows loaded to multi_bug_part2
19 Rows loaded to multi_bug_part


Thoughts?

Attachment: create_tables.hive
Description: Binary data

Attachment: multi-flat-flat.hive
Description: Binary data

Attachment: multi-flat-part.hive
Description: Binary data

Attachment: multi-part-flat.hive
Description: Binary data

Attachment: multi-part-part.hive
Description: Binary data

Attachment: multi-part-specified.hive
Description: Binary data

2013-01-09 00:00:002013-01-09 23:59:582013-01-09
2013-01-12 00:00:002013-01-12 23:59:592013-01-12
2013-01-03 00:00:002013-01-03 23:59:582013-01-03
2013-01-18 00:00:062013-01-18 23:59:582013-01-18
2013-01-15 00:00:002013-01-15 23:59:592013-01-15
2013-01-02 00:00:002013-01-02 23:59:592013-01-02
2013-01-17 00:00:032013-01-17 23:59:552013-01-17
2013-01-05 00:00:002013-01-05 23:59:592013-01-05
2013-01-07 00:00:002013-01-07 23:59:592013-01-07
2013-01-10 00:00:012013-01-10 23:59:582013-01-10
2013-01-06 00:00:002013-01-06 23:59:592013-01-06
2013-01-13 00:00:002013-01-13 23:59:592013-01-13
2013-01-01 00:00:012013-01-01 23:59:582013-01-01
2013-01-16 00:00:002013-01-16 23:59:582013-01-16
2013-01-08 00:00:002013-01-08 23:59:592013-01-08
2013-01-11 00:00:002013-01-11 23:59:592013-01-11
2013-01-04 00:00:002013-01-04 23:59:592013-01-04
2013-01-19 00:00:012013-01-19 23:59:532013-01-19
2013-01-14 00:00:002013-01-14 23:59:582013-01-14

Reply via email to