[ 
https://issues.apache.org/jira/browse/HIVE-17403?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16159338#comment-16159338
 ] 

Sergey Shelukhin commented on HIVE-17403:
-----------------------------------------

+1

> Fail concatenation for unmanaged and transactional tables
> ---------------------------------------------------------
>
>                 Key: HIVE-17403
>                 URL: https://issues.apache.org/jira/browse/HIVE-17403
>             Project: Hive
>          Issue Type: Bug
>    Affects Versions: 1.3.0, 3.0.0, 2.4.0
>            Reporter: Prasanth Jayachandran
>            Assignee: Prasanth Jayachandran
>            Priority: Blocker
>         Attachments: HIVE-17403.1.patch, HIVE-17403.2.patch
>
>
> ALTER TABLE .. CONCATENATE should fail if the table is not managed by Hive. 
> For unmanaged tables, file names can be anything. Hive makes assumptions 
> about file names that can result in data loss for unmanaged tables. 
> An example is a table/partition containing 2 different files 
> (part-m-00000__1417075294718 and part-m-00018__1417075294718). Although these 
> are completely different files, Hive treats them as files generated by 
> separate instances of the same task (because of failure or speculative 
> execution). Hive ends up removing one of them as a duplicate:
> {code}
> 2017-08-28T18:19:29,516 WARN  [b27f10d5-d957-4695-ab2a-1453401793df main]: 
> exec.Utilities (:()) - Duplicate taskid file removed: 
> file:/Users/table/part=20141120/.hive-staging_hive_2017-08-28_18-19-27_210_3381701454205724533-1/_tmp.-ext-10000/part-m-00018__1417075294718
>  with length 958510. Existing file: 
> file:/Users/table/part=20141120/.hive-staging_hive_2017-08-28_18-19-27_210_3381701454205724533-1/_tmp.-ext-10000/part-m-00000__1417075294718
>  with length 1123116
> {code}
> DDL should restrict concatenation for unmanaged tables. 
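
To make the failure mode in the description concrete, here is a minimal sketch of how name-based duplicate detection can collapse two unrelated files into one. The regular expression, the function names, and the "keep the larger file" rule are assumptions chosen to be consistent with the log above; this is not Hive's actual code.

```python
import re

# Hypothetical pattern (an assumption for illustration): take the last run of
# digits in the name, optionally followed by "_<attempt>" and an extension,
# as the task id.
TASK_ID_PATTERN = re.compile(r"^.*?([0-9]+)(_[0-9]{1,6})?(\..*)?$")


def task_id(filename):
    """Return the inferred task id, or the whole name if nothing matches."""
    match = TASK_ID_PATTERN.match(filename)
    return match.group(1) if match else filename


def dedupe_by_task_id(files):
    """Keep one file per inferred task id, preferring the larger file.

    `files` maps file name -> length in bytes.
    Returns (kept_names, removed_names).
    """
    kept = {}      # task id -> (name, length)
    removed = []
    for name, length in files.items():
        tid = task_id(name)
        if tid not in kept:
            kept[tid] = (name, length)
        elif length > kept[tid][1]:
            # The new file is larger: drop the one kept so far.
            removed.append(kept[tid][0])
            kept[tid] = (name, length)
        else:
            removed.append(name)
    return sorted(name for name, _ in kept.values()), removed


# File names and lengths taken from the log in the issue description.
files = {
    "part-m-00000__1417075294718": 1123116,
    "part-m-00018__1417075294718": 958510,
}
kept, removed = dedupe_by_task_id(files)
print("kept:", kept)
print("removed:", removed)
```

Under this assumed pattern both names reduce to the same id (the shared trailing timestamp), so the second file is discarded even though it holds different data. A managed table avoids this only because Hive itself controls the file naming.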



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
