[ https://issues.apache.org/jira/browse/HIVE-8637?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Alan Gates updated HIVE-8637:
-----------------------------
Attachment: HIVE-8637.patch
This is not a permanent fix. The patch changes HiveInputFormat.getInputSplits
to call a new method in Utilities that copies values from the table properties
into the job conf whether or not they are already set. This seems safe, since
each table should properly understand its own properties.
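
To illustrate the difference between "set only if absent" and "set regardless",
here is a minimal sketch. It is not the code in the patch: a plain Map stands in
for the job conf, java.util.Properties stands in for the table properties, and
the class, method, and key names are all hypothetical. It also assumes that the
existing behavior is to leave an already-set value alone, which the description
above implies but does not state.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Properties;

// Hypothetical stand-ins: Map<String, String> plays the job conf,
// Properties plays a table's properties. Not Hive's actual classes.
class TablePropsOverwriteSketch {

    // Assumed existing behavior: a value already in the conf wins.
    static void copyIfAbsent(Properties tableProps, Map<String, String> jobConf) {
        for (String name : tableProps.stringPropertyNames()) {
            jobConf.putIfAbsent(name, tableProps.getProperty(name));
        }
    }

    // Behavior described for the patch: the table's value always wins.
    static void copyAlways(Properties tableProps, Map<String, String> jobConf) {
        for (String name : tableProps.stringPropertyNames()) {
            jobConf.put(name, tableProps.getProperty(name));
        }
    }

    public static void main(String[] args) {
        // The conf already carries a value written on behalf of table X.
        Map<String, String> jobConf = new HashMap<>();
        jobConf.put("example.key", "from-X");

        // Table Y has its own value for the same (made-up) key.
        Properties tableY = new Properties();
        tableY.setProperty("example.key", "from-Y");

        copyIfAbsent(tableY, jobConf);
        System.out.println("if absent: " + jobConf.get("example.key"));

        copyAlways(tableY, jobConf);
        System.out.println("always:    " + jobConf.get("example.key"));
    }
}
```

With the conditional copy, Y's input format would still see X's value; with the
unconditional copy it sees its own.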
I believe the correct long-term solution is to make sure a different copy of
JobConf goes to the input and output tables, so each can write whatever it
wants there. I think that would have to be done in ExecDriver.execute, since
calls to checkOutputSpecs and getInputSplits are done by Hadoop after Hive
submits the job. I think that would fix the MR case. I'm sure the fix for Tez
would be slightly different (since the job is submitted all at once).
But this would also destroy any ability to communicate information across jobs
via the conf file. I don't know whether anything is doing that. I'm loath
to make that big a change when [~hagleitn] has said he wants to cut a release
in a week.
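
A rough sketch of the separate-copies idea and its trade-off, again using a
plain Map as a hypothetical stand-in for JobConf (all names and keys below are
made up for illustration, and this is not how ExecDriver is structured today):

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical illustration: each side of the query gets its own copy
// of the conf, so neither can clobber the other's values.
class SeparateConfCopiesSketch {

    // Returns an independent copy; later writes do not reach the original.
    static Map<String, String> copyOf(Map<String, String> conf) {
        return new HashMap<>(conf);
    }

    public static void main(String[] args) {
        Map<String, String> shared = new HashMap<>();
        shared.put("example.job.name", "insert-into-X");

        Map<String, String> inputConf = copyOf(shared);   // for table Y
        Map<String, String> outputConf = copyOf(shared);  // for table X

        // Each table writes whatever it wants into its own copy.
        outputConf.put("example.key", "from-X");
        inputConf.put("example.key", "from-Y");

        System.out.println("input sees:  " + inputConf.get("example.key"));
        System.out.println("output sees: " + outputConf.get("example.key"));
        // The downside: nothing written to a copy reaches the shared conf.
        System.out.println("shared sees: " + shared.get("example.key"));
    }
}
```

The last line of output shows the cost mentioned above: values written into a
per-table copy are invisible to anything that reads the original conf.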
So, I propose this smaller change now, and we file a JIRA for the bigger, more
complete fix.
> In insert into X select from Y, table properties from X are clobbering those
> from Y
> -----------------------------------------------------------------------------------
>
> Key: HIVE-8637
> URL: https://issues.apache.org/jira/browse/HIVE-8637
> Project: Hive
> Issue Type: Task
> Affects Versions: 0.14.0
> Reporter: Alan Gates
> Assignee: Alan Gates
> Priority: Critical
> Fix For: 0.14.0
>
> Attachments: HIVE-8637.patch
>
>
> With a query like:
> {code}
> insert into table X select * from Y;
> {code}
> the table properties from table X are being sent to the input formats for
> table Y.