[ https://issues.apache.org/jira/browse/HIVE-8151?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Prasanth J updated HIVE-8151:
-----------------------------
    Status: Patch Available  (was: Open)

> Dynamic partition sort optimization inserts record wrongly to partition when used with GroupBy
> ----------------------------------------------------------------------------------------------
>
>                 Key: HIVE-8151
>                 URL: https://issues.apache.org/jira/browse/HIVE-8151
>             Project: Hive
>          Issue Type: Bug
>    Affects Versions: 0.13.1, 0.14.0
>            Reporter: Prasanth J
>            Assignee: Prasanth J
>            Priority: Critical
>         Attachments: HIVE-8151.1.patch
>
>
> HIVE-6455 added the dynamic partition sort optimization. It added a
> startGroup() method to the FileSink operator to detect changes in the
> reduce key and create partition directories accordingly. This method,
> however, is not reliable, because the key passed to startGroup() differs
> from the key seen by processOp(). startGroup() is called with the newly
> changed key, whereas processOp() is called with the previously aggregated
> key. As a result, processOp() writes the last row of the previous group as
> the first row of the next group. This happens only when the optimization
> is used with a group by operator.
> The fix is to stop relying on startGroup() and to create the partition
> directory in processOp() itself.
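
The ordering problem described above can be illustrated with a small, self-contained model. This is only a sketch under stated assumptions: the classes Sink and GroupBy, the "partition/item" key layout, and the relyOnStartGroup flag are all hypothetical and are not Hive's actual operators or the contents of HIVE-8151.1.patch. The model assumes the group-by forwards the aggregate for a key only after the next key has started, which is the ordering the description attributes to the bug.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Simplified model; these classes are hypothetical and are not Hive's operators.
class DynamicPartitionSketch {

    // Assumed key layout "partition/item"; the partition is the part before '/'.
    static String partitionOf(String key) {
        return key.substring(0, key.indexOf('/'));
    }

    /** Stand-in for the file sink: collects the rows written to each partition. */
    static class Sink {
        final Map<String, List<String>> partitions = new LinkedHashMap<>();
        final boolean relyOnStartGroup; // true models the old behaviour
        String currentPartition;        // only updated by startGroup()

        Sink(boolean relyOnStartGroup) {
            this.relyOnStartGroup = relyOnStartGroup;
        }

        // Receives the NEW reduce key as soon as the key changes.
        void startGroup(String newKey) {
            if (relyOnStartGroup) {
                currentPartition = partitionOf(newKey);
            }
        }

        // Receives an aggregated row, which may belong to the PREVIOUS key.
        void processOp(String rowKey, long count) {
            String target = relyOnStartGroup ? currentPartition : partitionOf(rowKey);
            partitions.computeIfAbsent(target, p -> new ArrayList<>()).add(rowKey + "=" + count);
        }
    }

    /** Stand-in for a group-by that counts rows per key. */
    static class GroupBy {
        final Sink child;
        String pendingKey;
        long pendingCount;

        GroupBy(Sink child) {
            this.child = child;
        }

        // The key change is propagated before the pending aggregate is forwarded.
        void startGroup(String newKey) {
            child.startGroup(newKey);
        }

        void processOp(String key) {
            if (pendingKey != null && !pendingKey.equals(key)) {
                flush(); // forwards the aggregate of the previous key
            }
            if (pendingKey == null) {
                pendingKey = key;
                pendingCount = 0;
            }
            pendingCount++;
        }

        void flush() {
            if (pendingKey != null) {
                child.processOp(pendingKey, pendingCount);
                pendingKey = null;
            }
        }
    }

    // Drives the pipeline the way a reducer would over keys sorted by partition.
    static Map<String, List<String>> run(List<String> sortedKeys, boolean relyOnStartGroup) {
        Sink sink = new Sink(relyOnStartGroup);
        GroupBy groupBy = new GroupBy(sink);
        String previous = null;
        for (String key : sortedKeys) {
            if (!key.equals(previous)) {
                groupBy.startGroup(key);
                previous = key;
            }
            groupBy.processOp(key);
        }
        groupBy.flush();
        return sink.partitions;
    }

    public static void main(String[] args) {
        List<String> keys = List.of("p1/x", "p1/x", "p1/y", "p2/z");
        System.out.println("startGroup-based: " + run(keys, true));
        System.out.println("row-based:        " + run(keys, false));
    }
}
```

In this model the startGroup-based sink yields {p1=[p1/x=2], p2=[p1/y=1, p2/z=1]}, so the last aggregate of p1 lands in p2. Choosing the partition from the row inside processOp() yields {p1=[p1/x=2, p1/y=1], p2=[p2/z=1]}.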