[ https://issues.apache.org/jira/browse/HIVE-17403?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16159338#comment-16159338 ]

Sergey Shelukhin commented on HIVE-17403:
-----------------------------------------

+1

> Fail concatenation for unmanaged and transactional tables
> ---------------------------------------------------------
>
>                 Key: HIVE-17403
>                 URL: https://issues.apache.org/jira/browse/HIVE-17403
>             Project: Hive
>          Issue Type: Bug
>    Affects Versions: 1.3.0, 3.0.0, 2.4.0
>            Reporter: Prasanth Jayachandran
>            Assignee: Prasanth Jayachandran
>            Priority: Blocker
>         Attachments: HIVE-17403.1.patch, HIVE-17403.2.patch
>
>
> ALTER TABLE .. CONCATENATE should fail if the table is not managed by Hive.
> For unmanaged tables, file names can be anything. Hive makes assumptions
> about file names, and those assumptions can result in data loss for
> unmanaged tables.
> An example is a table/partition containing two different files
> (part-m-00000__1417075294718 and part-m-00018__1417075294718). Although they
> are completely different files, Hive treats them as files generated by
> separate instances of the same task (because of failure or speculative
> execution) and ends up removing one of them:
> {code}
> 2017-08-28T18:19:29,516 WARN [b27f10d5-d957-4695-ab2a-1453401793df main]:
> exec.Utilities (:()) - Duplicate taskid file removed:
> file:/Users/table/part=20141120/.hive-staging_hive_2017-08-28_18-19-27_210_3381701454205724533-1/_tmp.-ext-10000/part-m-00018__1417075294718
> with length 958510. Existing file:
> file:/Users/table/part=20141120/.hive-staging_hive_2017-08-28_18-19-27_210_3381701454205724533-1/_tmp.-ext-10000/part-m-00000__1417075294718
> with length 1123116
> {code}
> DDL should restrict concatenation for unmanaged tables.
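
To illustrate how two unrelated files can be mistaken for outputs of the same task, here is a minimal sketch. It assumes, purely for illustration, that the task id is taken from the last run of digits in the file name; this is a simplified stand-in, not Hive's actual Utilities code, and the helper names `task_id` and `find_collisions` are hypothetical.

```python
import re

# Assumption: simplified stand-in for task-id extraction, not Hive's real logic.
# The last run of digits in the file name is treated as the task id.
TRAILING_DIGITS = re.compile(r"([0-9]+)$")


def task_id(file_name):
    """Return the trailing digit run of file_name, or the name itself if none."""
    match = TRAILING_DIGITS.search(file_name)
    return match.group(1) if match else file_name


def find_collisions(file_names):
    """Group file names by derived task id and keep ids shared by several files."""
    groups = {}
    for name in file_names:
        groups.setdefault(task_id(name), []).append(name)
    return {tid: names for tid, names in groups.items() if len(names) > 1}


# The two file names from the report above.
files = ["part-m-00000__1417075294718", "part-m-00018__1417075294718"]
collisions = find_collisions(files)
```

Under this assumption both names map to the same id, so a cleanup step that keeps one file per task id would drop the other even though the contents differ.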