[ 
https://issues.apache.org/jira/browse/HIVE-1193?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12838958#action_12838958
 ] 

He Yongqiang commented on HIVE-1193:
------------------------------------

@Zheng,
>>1. How do we make sure that the data is bucketed / sorted? By adding an 
>>additional map-reduce job?
Yes. 
>>2. What if the user already specified "CLUSTER BY key" in his query?
As in 1, a new job will be added to redistribute the data. 
If the user specifies a CLUSTER BY column different from the table's sort and 
bucket properties, maybe we should let the query fail. But right now that 
CLUSTER BY is actually ignored.
>>3. Do we disable merging of small files when we do this?
Yes. We should disable it whenever enforceBucketing or enforceSorting is 
enabled.
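
To make answers 1 and 3 concrete, here is a rough Python sketch of what the 
additional job would have to do. It is not Hive code: the function name 
`redistribute`, the modulo partitioner, and the sample rows are all 
assumptions made for illustration only.

```python
from collections import defaultdict


def redistribute(rows, num_buckets, key=lambda row: row[0]):
    """Emulate the extra job: route rows to buckets, then sort each bucket."""
    buckets = defaultdict(list)
    for row in rows:
        # Stand-in for the partitioner (assumption: non-negative integer keys).
        buckets[key(row) % num_buckets].append(row)
    # Each "reducer" produces exactly one sorted output file per bucket.
    return [sorted(buckets[b], key=key) for b in range(num_buckets)]


rows = [(5, "e"), (2, "b"), (4, "d"), (1, "a"), (3, "c")]
files = redistribute(rows, num_buckets=2)
for bucket_id, contents in enumerate(files):
    print(bucket_id, contents)
```

Each bucket ends up as a single sorted output. Merging those outputs 
afterwards would presumably break the one-file-per-bucket layout, which is 
why small-file merging has to be off in this mode.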


> ensure sorting properties for a table
> -------------------------------------
>
>                 Key: HIVE-1193
>                 URL: https://issues.apache.org/jira/browse/HIVE-1193
>             Project: Hadoop Hive
>          Issue Type: New Feature
>          Components: Query Processor
>            Reporter: Namit Jain
>            Assignee: Namit Jain
>             Fix For: 0.6.0
>
>         Attachments: hive.1193.1.patch
>
>
> If a table is sorted and data is being inserted into it, we currently 
> don't make sure that the data is sorted. That might be useful for some 
> downstream operations.
> This cannot be made the default due to backward compatibility, but an option 
> can be added for it.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
