[jira] [Commented] (HIVE-10980) Merge of dynamic partitions loads all data to default partition

2015-09-12 Thread Ashutosh Chauhan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-10980?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14742145#comment-14742145
 ] 

Ashutosh Chauhan commented on HIVE-10980:
-

+1

> Merge of dynamic partitions loads all data to default partition
> ---
>
> Key: HIVE-10980
> URL: https://issues.apache.org/jira/browse/HIVE-10980
> Project: Hive
> Issue Type: Bug
> Components: Hive
> Affects Versions: 0.14.0
> Environment: HDP 2.2.4 (also reproduced on apache hive built from trunk)
> Reporter: Illya Yalovyy
> Assignee: Illya Yalovyy
> Attachments: HIVE-10980.patch
>
>
> Conditions that lead to the issue:
> 1. Execution engine set to MapReduce
> 2. Partition columns have different types
> 3. Both static and dynamic partitions are used in the query
> 4. Dynamically generated partitions require merge
> Result: Final data is loaded to "__HIVE_DEFAULT_PARTITION__".
> Steps to reproduce:
> set hive.exec.dynamic.partition=true;
> set hive.exec.dynamic.partition.mode=strict;
> set hive.optimize.sort.dynamic.partition=false;
> set hive.merge.mapfiles=true;
> set hive.merge.mapredfiles=true;
> set hive.input.format=org.apache.hadoop.hive.ql.io.HiveInputFormat;
> set hive.execution.engine=mr;
> create external table sdp (
>   dataint bigint,
>   hour int,
>   req string,
>   cid string,
>   caid string
> )
> row format delimited
> fields terminated by ',';
> load data local inpath '../../data/files/dynpartdata1.txt' into table sdp;
> load data local inpath '../../data/files/dynpartdata2.txt' into table sdp;
> ...
> load data local inpath '../../data/files/dynpartdataN.txt' into table sdp;
> create table tdp (cid string, caid string)
> partitioned by (dataint bigint, hour int, req string);
> insert overwrite table tdp partition (dataint=20150316, hour=16, req)
> select cid, caid, req from sdp where dataint=20150316 and hour=16;
> select * from tdp order by caid;
> show partitions tdp;
> Example of the input file:
> 20150316,16,reqA,clusterIdA,cacheId1
> 20150316,16,reqB,clusterIdB,cacheId2 
> 20150316,16,reqA,clusterIdC,cacheId3  
> 20150316,16,reqD,clusterIdD,cacheId4
> 20150316,16,reqA,clusterIdA,cacheId5  
> Actual result:
> clusterIdA  cacheId1  20150316  16  __HIVE_DEFAULT_PARTITION__
> clusterIdA  cacheId1  20150316  16  __HIVE_DEFAULT_PARTITION__
> clusterIdB  cacheId2  20150316  16  __HIVE_DEFAULT_PARTITION__
> clusterIdC  cacheId3  20150316  16  __HIVE_DEFAULT_PARTITION__
> clusterIdD  cacheId4  20150316  16  __HIVE_DEFAULT_PARTITION__
> clusterIdA  cacheId5  20150316  16  __HIVE_DEFAULT_PARTITION__
> clusterIdD  cacheId8  20150316  16  __HIVE_DEFAULT_PARTITION__
> clusterIdB  cacheId9  20150316  16  __HIVE_DEFAULT_PARTITION__
>
> dataint=20150316/hour=16/req=__HIVE_DEFAULT_PARTITION__
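
For comparison with the actual result quoted above, here is a minimal sketch (not part of the original report) of the partitions the INSERT would be expected to create from the sample input file. It assumes every `req` value is non-empty and simply mirrors the query's WHERE clause and its dynamic `req` partition column. The names `SAMPLE_ROWS` and `expected_partitions` are hypothetical and exist only for this illustration.

```python
# Sample rows copied from the report; column order: dataint,hour,req,cid,caid
SAMPLE_ROWS = """\
20150316,16,reqA,clusterIdA,cacheId1
20150316,16,reqB,clusterIdB,cacheId2
20150316,16,reqA,clusterIdC,cacheId3
20150316,16,reqD,clusterIdD,cacheId4
20150316,16,reqA,clusterIdA,cacheId5
"""


def expected_partitions(text, dataint=20150316, hour=16):
    """Return the sorted partition specs the INSERT should produce.

    Hypothetical helper: dataint and hour are the static partition values
    from the query, req is the dynamic partition column.
    """
    specs = set()
    for line in text.splitlines():
        fields = [field.strip() for field in line.split(",")]
        if len(fields) != 5:
            continue  # skip blank or malformed lines
        row_dataint, row_hour, req, _cid, _caid = fields
        # Mirror the WHERE clause: dataint=20150316 and hour=16
        if int(row_dataint) == dataint and int(row_hour) == hour:
            specs.add(f"dataint={dataint}/hour={hour}/req={req}")
    return sorted(specs)


if __name__ == "__main__":
    for spec in expected_partitions(SAMPLE_ROWS):
        print(spec)
```

Under these assumptions the sketch lists one partition per distinct `req` value (reqA, reqB, reqD), whereas the reported `show partitions tdp` output contains only `req=__HIVE_DEFAULT_PARTITION__`.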



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-10980) Merge of dynamic partitions loads all data to default partition

2015-09-11 Thread Illya Yalovyy (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-10980?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14741553#comment-14741553
 ] 

Illya Yalovyy commented on HIVE-10980:
--

Thank you!
The patch is re-submitted.



[jira] [Commented] (HIVE-10980) Merge of dynamic partitions loads all data to default partition

2015-09-11 Thread Lefty Leverenz (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-10980?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14741524#comment-14741524
 ] 

Lefty Leverenz commented on HIVE-10980:
---

[~yalovyyi], you can re-run tests by cancelling the patch (using a button on 
the top line) and then resubmitting it (using the Submit Patch button that will 
appear in the same place as the Cancel Patch button).



[jira] [Commented] (HIVE-10980) Merge of dynamic partitions loads all data to default partition

2015-09-11 Thread Illya Yalovyy (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-10980?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14741476#comment-14741476
 ] 

Illya Yalovyy commented on HIVE-10980:
--

[~gopalv], I have reviewed the failed tests:
org.apache.hive.hcatalog.api.TestHCatClient.testTableSchemaPropagation - was failing for many builds before my patch.
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_index_auto_mult_tables_compact - is failing for other patches as well, and I wasn't able to reproduce this failure locally.

What is the best way to re-run tests?





[jira] [Commented] (HIVE-10980) Merge of dynamic partitions loads all data to default partition

2015-09-10 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-10980?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14740210#comment-14740210
 ] 

Hive QA commented on HIVE-10980:




{color:red}Overall{color}: -1 at least one tests failed

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12755080/HIVE-10980.patch

{color:red}ERROR:{color} -1 due to 2 failed/errored test(s), 9437 tests executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_index_auto_mult_tables_compact
org.apache.hive.hcatalog.api.TestHCatClient.testTableSchemaPropagation
{noformat}

Test results: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/5232/testReport
Console output: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/5232/console
Test logs: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-5232/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 2 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12755080 - PreCommit-HIVE-TRUNK-Build



[jira] [Commented] (HIVE-10980) Merge of dynamic partitions loads all data to default partition

2015-09-10 Thread Illya Yalovyy (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-10980?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14739561#comment-14739561
 ] 

Illya Yalovyy commented on HIVE-10980:
--

Patch is submitted for review:
https://reviews.apache.org/r/38268/



[jira] [Commented] (HIVE-10980) Merge of dynamic partitions loads all data to default partition

2015-06-12 Thread Illya Yalovyy (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-10980?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14583655#comment-14583655
 ] 

Illya Yalovyy commented on HIVE-10980:
--

I have a patch for this. Will upload it soon.



[jira] [Commented] (HIVE-10980) Merge of dynamic partitions loads all data to default partition

2015-06-11 Thread Illya Yalovyy (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-10980?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14582092#comment-14582092
 ] 

Illya Yalovyy commented on HIVE-10980:
--

Good point. I observed this behavior on MapReduce. I'll update the ticket.

> Merge of dynamic partitions loads all data to default partition
> ---
>
> Key: HIVE-10980
> URL: https://issues.apache.org/jira/browse/HIVE-10980
> Project: Hive
> Issue Type: Bug
> Components: Hive
> Affects Versions: 0.14.0
> Environment: HDP 2.2.4 (also reproduced on apache hive built from trunk)
> Reporter: Illya Yalovyy
>
> Conditions that lead to the issue:
> 1. Partition columns have different types
> 2. Both static and dynamic partitions are used in the query
> 3. Dynamically generated partitions require merge
> Result: Final data is loaded to "__HIVE_DEFAULT_PARTITION__".
> Steps to reproduce:
> set hive.exec.dynamic.partition=true;
> set hive.exec.dynamic.partition.mode=strict;
> set hive.optimize.sort.dynamic.partition=false;
> set hive.merge.mapfiles=true;
> set hive.merge.mapredfiles=true;
> set hive.input.format=org.apache.hadoop.hive.ql.io.HiveInputFormat;
> create external table sdp (
>   dataint bigint,
>   hour int,
>   req string,
>   cid string,
>   caid string
> )
> row format delimited
> fields terminated by ',';
> load data local inpath '../../data/files/dynpartdata1.txt' into table sdp;
> load data local inpath '../../data/files/dynpartdata2.txt' into table sdp;
> ...
> load data local inpath '../../data/files/dynpartdataN.txt' into table sdp;
> create table tdp (cid string, caid string)
> partitioned by (dataint bigint, hour int, req string);
> insert overwrite table tdp partition (dataint=20150316, hour=16, req)
> select cid, caid, req from sdp where dataint=20150316 and hour=16;
> select * from tdp order by caid;
> show partitions tdp;
> Example of the input file:
> 20150316,16,reqA,clusterIdA,cacheId1
> 20150316,16,reqB,clusterIdB,cacheId2 
> 20150316,16,reqA,clusterIdC,cacheId3  
> 20150316,16,reqD,clusterIdD,cacheId4
> 20150316,16,reqA,clusterIdA,cacheId5  
> Actual result:
> clusterIdA  cacheId1  20150316  16  __HIVE_DEFAULT_PARTITION__
> clusterIdA  cacheId1  20150316  16  __HIVE_DEFAULT_PARTITION__
> clusterIdB  cacheId2  20150316  16  __HIVE_DEFAULT_PARTITION__
> clusterIdC  cacheId3  20150316  16  __HIVE_DEFAULT_PARTITION__
> clusterIdD  cacheId4  20150316  16  __HIVE_DEFAULT_PARTITION__
> clusterIdA  cacheId5  20150316  16  __HIVE_DEFAULT_PARTITION__
> clusterIdD  cacheId8  20150316  16  __HIVE_DEFAULT_PARTITION__
> clusterIdB  cacheId9  20150316  16  __HIVE_DEFAULT_PARTITION__
>
> dataint=20150316/hour=16/req=__HIVE_DEFAULT_PARTITION__





[jira] [Commented] (HIVE-10980) Merge of dynamic partitions loads all data to default partition

2015-06-10 Thread Gopal V (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-10980?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14581540#comment-14581540
 ] 

Gopal V commented on HIVE-10980:


Are you using MapReduce or Tez?
