Gopal V created HIVE-11306:
------------------------------
Summary: Add a bloom-1 filter for Hybrid MapJoin spills
Key: HIVE-11306
URL: https://issues.apache.org/jira/browse/HIVE-11306
Project: Hive
Issue Type: Improvement
Components: Hive
Affects Versions: 1.3.0, 2.0.0
Reporter: Gopal V
Assignee: Gopal V
HIVE-9277 implemented spillable joins for Tez, which suffer from a corner-case
performance issue when joining a wide small table against a narrow big table
(for example, a user info table joined against an events stream).
Spilling the wide table causes extra IO, even though the number of distinct
values (nDV) of the join key might only be in the thousands.
A cheap bloom-1 filter would give such queries a large performance gain by
cutting down the spill IO for the big-table spills.
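
For illustration only, here is a minimal sketch of what a bloom-1 filter could
look like: a Bloom filter variant in which all probe bits for a key fall inside
a single 64-bit word, so a lookup touches one memory location. The class name
Bloom1Filter, the 16-bits-per-key sizing, the 4 probe bits, and the hash mixer
are all assumptions made for this example, not the code proposed for Hive.

```java
/** Hypothetical bloom-1 sketch: every key sets/tests bits in exactly one long. */
final class Bloom1Filter {
    // Number of probe bits per key; an illustrative choice, not a tuned value.
    private static final int BITS_PER_KEY = 4;

    private final long[] words;
    private final int mask; // words.length - 1; length is a power of two

    Bloom1Filter(int expectedKeys) {
        // Assume roughly 16 bits per expected key, rounded up to a
        // power-of-two number of 64-bit words (capped to keep the array sane).
        long wanted = Math.max(1L, ((long) expectedKeys * 16 + 63) / 64);
        int size = 1;
        while (size < wanted && size < (1 << 26)) {
            size <<= 1;
        }
        this.words = new long[size];
        this.mask = size - 1;
    }

    /** 64-bit finalizer-style mixer (same constants as MurmurHash3's fmix64). */
    private static long mix(long h) {
        h ^= h >>> 33;
        h *= 0xff51afd7ed558ccdL;
        h ^= h >>> 33;
        h *= 0xc4ceb9fe1a85ec53L;
        h ^= h >>> 33;
        return h;
    }

    /** Builds the in-word bit mask from the upper hash bits, 6 bits per probe. */
    private static long probeMask(long h) {
        long m = 0L;
        for (int i = 0; i < BITS_PER_KEY; i++) {
            int bit = (int) ((h >>> (32 + 6 * i)) & 63);
            m |= 1L << bit;
        }
        return m;
    }

    /** Records a join key hash, e.g. while loading the small table. */
    void add(long keyHash) {
        long h = mix(keyHash);
        words[((int) h) & mask] |= probeMask(h);
    }

    /** False means the key was definitely never added; true means "maybe". */
    boolean mightContain(long keyHash) {
        long h = mix(keyHash);
        long m = probeMask(h);
        return (words[((int) h) & mask] & m) == m;
    }
}
```

Under these assumptions, the filter would be populated with the join keys of
the small table, and a big-table row headed for a spilled partition could be
dropped instead of written to disk whenever mightContain returns false for its
key. False positives only cost an unnecessary spill; there are no false
negatives, so no matching row is lost.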