[ 
https://issues.apache.org/jira/browse/HIVE-5023?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13738537#comment-13738537
 ] 

Shuaishuai Nie commented on HIVE-5023:
--------------------------------------

Thanks [~ashutoshc] [~vikram.dixit] [~sushanth]
                
> Hive get wrong result when partition has the same path but different schema 
> or authority
> ----------------------------------------------------------------------------------------
>
>                 Key: HIVE-5023
>                 URL: https://issues.apache.org/jira/browse/HIVE-5023
>             Project: Hive
>          Issue Type: Bug
>            Reporter: Shuaishuai Nie
>            Assignee: Shuaishuai Nie
>             Fix For: 0.12.0
>
>         Attachments: HIVE-5023.1.patch, HIVE-5023.2.patch
>
>
> Hive does not differentiate scheme and authority in file URIs, which causes 
> wrong results when partitions have the same path but a different scheme or 
> authority. Here is a simple repro.
> Partition file paths:
> asv://contain...@secondary1.blob.core.windows.net/2013-08-05/00/text1.txt
> with content "2013-08-05 00:00:00"
> asv://contain...@secondary1.blob.core.windows.net/2013-08-05/00/text2.txt
> with content "2013-08-05 00:00:20"
> {noformat}
> CREATE EXTERNAL TABLE IF NOT EXISTS T1 (t STRING) PARTITIONED BY (ProcessDate 
> STRING, Hour STRING, ClusterName STRING) ROW FORMAT DELIMITED FIELDS 
> TERMINATED by '\t' STORED AS TEXTFILE;
> ALTER TABLE T1 DROP IF EXISTS PARTITION(processDate='2013-08-05', Hour='00', 
> clusterName ='ClusterA');
> ALTER TABLE T1 ADD IF NOT EXISTS PARTITION(processDate='2013-08-05', 
> Hour='00', clusterName ='ClusterA') LOCATION 
> 'asv://contain...@secondary1.blob.core.windows.net/2013-08-05/00';
> ALTER TABLE T1 DROP IF EXISTS PARTITION(processDate='2013-08-05', Hour='00', 
> clusterName ='ClusterB');
> ALTER TABLE T1 ADD IF NOT EXISTS PARTITION(processDate='2013-08-05', 
> Hour='00', clusterName ='ClusterB') LOCATION 
> 'asv://contain...@secondary1.blob.core.windows.net/2013-08-05/00';
> {noformat}
> The expected output of the Hive query
> {noformat}
> SELECT ClusterName, t FROM T1 WHERE ProcessDate='2013-08-05' AND Hour='00';
> {noformat}
> should be
> {noformat}
> ClusterA        2013-08-05 00:00:00
> ClusterB        2013-08-05 00:00:20
> {noformat}
> However, the actual output is
> {noformat}
> ClusterA        2013-08-05 00:00:00
> ClusterA        2013-08-05 00:00:20
> {noformat}
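
The symptom described above can be illustrated with a minimal sketch. This is not Hive's actual code. The class name, the helper names, and the two locations (container names and host) are hypothetical, since the real locations are truncated in this message. The sketch only assumes that a partition lookup keyed by the path component alone cannot tell two locations apart when they differ only in scheme or authority.

```java
import java.net.URI;
import java.util.LinkedHashMap;
import java.util.Map;

public class PartitionPathDemo {
    // Hypothetical lookup key that keeps only the path (drops scheme and authority).
    static String pathOnlyKey(String location) {
        return URI.create(location).getPath();
    }

    // Hypothetical lookup key that keeps scheme, authority, and path.
    static String fullUriKey(String location) {
        URI u = URI.create(location);
        return u.getScheme() + "://" + u.getAuthority() + u.getPath();
    }

    public static void main(String[] args) {
        // Hypothetical locations: same path, different authority.
        String locA = "asv://containerA@example.invalid/2013-08-05/00";
        String locB = "asv://containerB@example.invalid/2013-08-05/00";

        // Keyed by path only: the second partition collides with the first.
        Map<String, String> byPath = new LinkedHashMap<>();
        byPath.putIfAbsent(pathOnlyKey(locA), "ClusterA");
        byPath.putIfAbsent(pathOnlyKey(locB), "ClusterB");

        // Keyed by the full URI: both partitions stay distinct.
        Map<String, String> byUri = new LinkedHashMap<>();
        byUri.putIfAbsent(fullUriKey(locA), "ClusterA");
        byUri.putIfAbsent(fullUriKey(locB), "ClusterB");

        System.out.println("path-only keys: " + byPath);
        System.out.println("full-URI keys:  " + byUri);
    }
}
```

With path-only keys, both locations resolve to the single entry for ClusterA, which mirrors the wrong output in the report. With full-URI keys, each location keeps its own partition.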
