Vineet Garg created HIVE-21330:
----------------------------------
Summary: Bucketing id varies b/w data loaded through streaming
apis and regular query
Key: HIVE-21330
URL: https://issues.apache.org/jira/browse/HIVE-21330
Project: Hive
Issue Type: Bug
Reporter: Vineet Garg
The test at
[https://github.com/apache/hive/blob/master/hcatalog/streaming/src/test/org/apache/hive/hcatalog/streaming/TestStreaming.java#L439]
tests for this case. It currently passes, but for the wrong reason. The test
checks for an empty result set, and the result sets are empty because a prior
INSERT fails to load data, not because the bucketing scheme is different.
The INSERT error is fixed in https://github.com/apache/hive/pull/552.
With that patch applied, the test fails because the underlying bucketing ids
generated are different.
These tests run on MR instead of Tez, which could explain the different
bucketing ids.
I don't really know what the repercussions of having different bucketing ids
are, or why they are expected to be the same, but since there is a test for
this logic, the case is worth investigating.
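For illustration only, here is a generic sketch of how two write paths can
assign the same key to different buckets. It is not Hive's implementation: the
class and method names are hypothetical, and String.hashCode() and CRC32 are
stand-ins for "two different hash computations". The only point is that a
reader that derives the bucket one way may not find rows a writer placed
another way.

```java
import java.nio.charset.StandardCharsets;
import java.util.zip.CRC32;

// Hypothetical sketch, not Hive code: two write paths that hash the same key
// differently can disagree on the bucket id.
class BucketIdSketch {
    // Map a 32-bit hash to a bucket: clear the sign bit, then take the modulo.
    static int bucketFor(int hash, int numBuckets) {
        return (hash & Integer.MAX_VALUE) % numBuckets;
    }

    // Assumed write path A: hashes the key with String.hashCode().
    static int bucketViaJavaHash(String key, int numBuckets) {
        return bucketFor(key.hashCode(), numBuckets);
    }

    // Assumed write path B: hashes the key's UTF-8 bytes with CRC32.
    static int bucketViaCrc32(String key, int numBuckets) {
        CRC32 crc = new CRC32();
        crc.update(key.getBytes(StandardCharsets.UTF_8));
        return bucketFor((int) crc.getValue(), numBuckets);
    }

    public static void main(String[] args) {
        int numBuckets = 4;
        for (String key : new String[] {"a", "b", "c", "d"}) {
            int a = bucketViaJavaHash(key, numBuckets);
            int b = bucketViaCrc32(key, numBuckets);
            // A mismatch means a lookup in bucket a would miss a row written to bucket b.
            System.out.println(key + ": pathA=" + a + " pathB=" + b
                    + (a == b ? "" : " (mismatch)"));
        }
    }
}
```

Whether the MR and Tez paths actually differ in this way is exactly what needs
to be investigated; the sketch only shows the failure mode the test appears to
guard against.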
--
This message was sent by Atlassian JIRA
(v7.6.3#76005)