[ https://issues.apache.org/jira/browse/HIVE-17658?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16190702#comment-16190702 ]
Sergey Shelukhin commented on HIVE-17658:
-----------------------------------------

Currently it will, at the very least, fail as badly as HIVE-17675 :)

> Bucketed/Sorted tables - SMB join
> ---------------------------------
>
>                 Key: HIVE-17658
>                 URL: https://issues.apache.org/jira/browse/HIVE-17658
>             Project: Hive
>          Issue Type: Sub-task
>          Components: Transactions
>            Reporter: Eugene Koifman
>
> How does this handle tables that are bucketed + sorted?
> insert into T values(1,2),(5,6); creates something like delta_2_2/bucket_1
> insert into T values(3,4),(7,8) creates delta_3_3/bucket_1
> The expectation for any reader would be to see some contiguous subset of
> (1,2),(3,4),(5,6),(7,8),
> but this would require a special reader, which I don't see.
> In particular, it's not clear how SMB join can work.
> This looks like a general problem:
> For a plain Hive table, if you do 2 inserts, and the 1st one creates 00000_0,
> then the 2nd one will create 00000_0_copy_1.
> There is nothing to merge these files at query time to produce a single sort
> order (like the Acid reader in full acid tables).
> It should at least throw in this case.
> Current "CONCATENATE" doesn't support bucketed or sorted tables.
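
To make the sort-order concern in the description concrete, here is a minimal Python sketch. It is not Hive code: the two lists stand in for the rows of delta_2_2/bucket_1 and delta_3_3/bucket_1 from the example, and the helper names (naive_read, merging_read, is_sorted) are hypothetical. It only illustrates why reading the files back to back breaks the sort order that SMB join relies on, while a k-way merge at read time restores it.

```python
import heapq

# Assumed contents of the two files from the example. Each file is sorted on
# its own, but neither is a contiguous slice of the overall table order.
delta_2_2_bucket_1 = [(1, 2), (5, 6)]
delta_3_3_bucket_1 = [(3, 4), (7, 8)]


def naive_read(*files):
    """Read the files one after another, as a reader with no merge step would."""
    return [row for f in files for row in f]


def merging_read(*files):
    """K-way merge of individually sorted files into one sorted stream."""
    return list(heapq.merge(*files))


def is_sorted(rows):
    """True if every row is <= the row that follows it."""
    return all(a <= b for a, b in zip(rows, rows[1:]))


naive = naive_read(delta_2_2_bucket_1, delta_3_3_bucket_1)
merged = merging_read(delta_2_2_bucket_1, delta_3_3_bucket_1)

print(naive, is_sorted(naive))
print(merged, is_sorted(merged))
```

The merge step here plays the role the description attributes to the Acid reader in full acid tables; whether an equivalent reader exists for the tables discussed in this issue is exactly the open question.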