[ https://issues.apache.org/jira/browse/HIVE-8151?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Prasanth J updated HIVE-8151:
-----------------------------
    Attachment: HIVE-8151.5.patch

Rebased patch against latest trunk.

> Dynamic partition sort optimization inserts record wrongly to partition when used with GroupBy
> ----------------------------------------------------------------------------------------------
>
>                 Key: HIVE-8151
>                 URL: https://issues.apache.org/jira/browse/HIVE-8151
>             Project: Hive
>          Issue Type: Bug
>    Affects Versions: 0.14.0, 0.13.1
>            Reporter: Prasanth J
>            Assignee: Prasanth J
>            Priority: Blocker
>         Attachments: HIVE-8151.1.patch, HIVE-8151.2.patch, HIVE-8151.3.patch, HIVE-8151.4.patch, HIVE-8151.5.patch
>
>
> HIVE-6455 added the dynamic partition sort optimization. It added a startGroup()
> method to the FileSink operator to look for changes in the reduce key when creating
> partition directories. This method, however, is not reliable, because the key seen by
> startGroup() is different from the key seen by processOp():
> startGroup() is called with the newly changed key, whereas processOp() is called
> with the previously aggregated key. As a result, processOp() writes the
> last row of the previous group as the first row of the next group. This happens only
> when the optimization is used with a group by operator.
> The fix is to not rely on startGroup() and to do the partition directory
> creation in processOp() itself.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
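
The failure mode described in the issue can be sketched with a small, self-contained Java model. This is only an illustration under stated assumptions: ToySink, ToyGroupBy, and run() are hypothetical names, not Hive's FileSinkOperator or GroupByOperator, and the signatures are simplified (the key is passed directly to startGroup() here). The sketch assumes that the group-by stage forwards its aggregated row for a key only after the downstream sink has already been notified of the next key.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class DynPartSketch {

    /** Hypothetical sink: records which rows land in which partition directory. */
    static class ToySink {
        final boolean trustStartGroup;   // true = behaviour described as buggy
        final Map<String, List<String>> written = new LinkedHashMap<>();
        String currentPartition;

        ToySink(boolean trustStartGroup) {
            this.trustStartGroup = trustStartGroup;
        }

        void startGroup(String newKey) {
            // Buggy variant: switch partition as soon as a new key is announced.
            if (trustStartGroup) {
                currentPartition = newKey;
            }
        }

        void processRow(String rowKey, String row) {
            // Fixed variant: derive the partition from the row being written.
            if (!trustStartGroup) {
                currentPartition = rowKey;
            }
            written.computeIfAbsent(currentPartition, k -> new ArrayList<>()).add(row);
        }
    }

    /** Hypothetical group-by: counts rows per key, emits "key:count" when the key changes. */
    static class ToyGroupBy {
        final ToySink child;
        String key;
        int count;

        ToyGroupBy(ToySink child) {
            this.child = child;
        }

        void startGroup(String newKey) {
            // The notification reaches the sink before the pending aggregate is flushed.
            child.startGroup(newKey);
        }

        void processRow(String rowKey) {
            if (key != null && !key.equals(rowKey)) {
                flush();                 // emits the aggregate of the PREVIOUS key
            }
            key = rowKey;
            count++;
        }

        void close() {
            if (key != null) {
                flush();
            }
        }

        private void flush() {
            child.processRow(key, key + ":" + count);
            count = 0;
        }
    }

    /** Feeds keys (already sorted, as a reducer would see them) through the toy pipeline. */
    static Map<String, List<String>> run(boolean trustStartGroup, String... sortedKeys) {
        ToySink sink = new ToySink(trustStartGroup);
        ToyGroupBy groupBy = new ToyGroupBy(sink);
        String previous = null;
        for (String k : sortedKeys) {
            if (!k.equals(previous)) {
                groupBy.startGroup(k);
            }
            groupBy.processRow(k);
            previous = k;
        }
        groupBy.close();
        return sink.written;
    }

    public static void main(String[] args) {
        // prints: buggy: {b=[a:2, b:3]}
        System.out.println("buggy: " + run(true, "a", "a", "b", "b", "b"));
        // prints: fixed: {a=[a:2], b=[b:3]}
        System.out.println("fixed: " + run(false, "a", "a", "b", "b", "b"));
    }
}
```

In the buggy variant the aggregate for key "a" is written after the sink has already switched to partition "b", which mirrors the misplaced row described in the issue. Choosing the partition inside processRow() keeps the decision tied to the row actually being written.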