[jira] [Commented] (HIVE-2117) insert overwrite ignoring partition location

2011-05-23 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-2117?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13038354#comment-13038354
 ] 

Hudson commented on HIVE-2117:
--

Integrated in Hive-trunk-h0.21 #745 (See 
[https://builds.apache.org/hudson/job/Hive-trunk-h0.21/745/])
HIVE-2117. Insert overwrite ignoring partition location (Patrick Hunt via 
cws)

cws : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1126726
Files : 
* /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/metadata/Hive.java
* /hive/trunk/ql/src/test/org/apache/hadoop/hive/ql/TestLocationQueries.java
* /hive/trunk/ql/src/test/org/apache/hadoop/hive/ql/TestMTQueries.java
* /hive/trunk/ql/src/test/results/clientpositive/alter5.q.out
* /hive/trunk/ql/src/test/queries/clientpositive/alter5.q
* /hive/trunk/ql/src/test/org/apache/hadoop/hive/ql/QTestUtil.java
* /hive/trunk/ql/src/test/org/apache/hadoop/hive/ql/BaseTestQueries.java


> insert overwrite ignoring partition location
> 
>
> Key: HIVE-2117
> URL: https://issues.apache.org/jira/browse/HIVE-2117
> Project: Hive
>  Issue Type: Bug
>  Components: Metastore, Query Processor
>Affects Versions: 0.7.1, 0.8.0
>Reporter: Patrick Hunt
>Assignee: Patrick Hunt
>Priority: Blocker
> Attachments: HIVE-2117_br07.patch, HIVE-2117_br07.patch, 
> HIVE-2117_trunk.patch, data.txt
>
>
> The following code works differently in 0.5.0 vs. 0.7.0.
> In 0.5.0 the partition location is respected. 
> However, in 0.7.0, while the initial partition is created with the specified 
> location (ending in "/parta"), the "insert overwrite ..." results in the 
> partition being written to a path ending in "/dt=a" (note that the parent 
> path is the same in both cases).
> {code}
> create table foo_stg (bar INT, car INT); 
> load data local inpath 'data.txt' into table foo_stg;
>  
> create table foo4 (bar INT, car INT) partitioned by (dt STRING) LOCATION 
> '/user/hive/warehouse/foo4'; 
> alter table foo4 add partition (dt='a') location 
> '/user/hive/warehouse/foo4/parta';
>  
> from foo_stg fs insert overwrite table foo4 partition (dt='a') select *;
> {code}
> From what I can tell HIVE-1707 introduced this via a change to
> org.apache.hadoop.hive.ql.metadata.Hive.loadPartition(Path, String, 
> Map, boolean, boolean)
> specifically:
> {code}
> +  Path partPath = new Path(tbl.getDataLocation().getPath(),
> +  Warehouse.makePartPath(partSpec));
> +
> +  Path newPartPath = new Path(loadPath.toUri().getScheme(), loadPath
> +  .toUri().getAuthority(), partPath.toUri().getPath());
> {code}
> Reading the description on HIVE-1707, it seems that this may have been done 
> purposefully. However, given that the partition location is explicitly 
> specified for the partition in question, it seems like that should be honored 
> (especially given that the table location has not changed).
> This difference in behavior is causing a regression in existing production 
> Hive-based code. I'd like to take a stab at addressing this; any suggestions?
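
To make the reported behavior concrete, here is a minimal sketch of how the quoted lines derive the target directory. It uses only java.net.URI rather than Hadoop's Path and Hive's Warehouse classes. The class name, the makePartPath stand-in, and the hdfs://nn:8020 authority are illustrative assumptions, not Hive code, and the real Warehouse.makePartPath may treat special characters differently.

```java
import java.net.URI;
import java.net.URISyntaxException;
import java.util.LinkedHashMap;
import java.util.Map;

public class PartPathSketch {
    // Hypothetical stand-in for Warehouse.makePartPath: joins key=value pairs with '/'.
    static String makePartPath(Map<String, String> partSpec) {
        StringBuilder sb = new StringBuilder();
        for (Map.Entry<String, String> e : partSpec.entrySet()) {
            if (sb.length() > 0) {
                sb.append('/');
            }
            sb.append(e.getKey()).append('=').append(e.getValue());
        }
        return sb.toString();
    }

    // Mirrors the quoted lines: table path plus partition spec, re-rooted on the
    // scheme and authority of the load path. Any location recorded for the
    // partition itself is never consulted.
    static URI derivedPartPath(URI tableLocation, URI loadPath, Map<String, String> partSpec) {
        String path = tableLocation.getPath() + "/" + makePartPath(partSpec);
        try {
            return new URI(loadPath.getScheme(), loadPath.getAuthority(), path, null, null);
        } catch (URISyntaxException ex) {
            throw new IllegalArgumentException(ex);
        }
    }

    public static void main(String[] args) {
        Map<String, String> partSpec = new LinkedHashMap<>();
        partSpec.put("dt", "a");
        URI table = URI.create("hdfs://nn:8020/user/hive/warehouse/foo4");
        URI load = URI.create("hdfs://nn:8020/tmp/hive-scratch/out");
        // The partition was registered at .../foo4/parta, but that value plays no part here.
        System.out.println(derivedPartPath(table, load, partSpec));
    }
}
```

Because the result depends only on the table location and the partition spec, the directory given in "alter table ... add partition ... location" cannot influence where the data lands, which matches the symptom described above.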

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HIVE-2117) insert overwrite ignoring partition location

2011-05-20 Thread Carl Steinbach (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-2117?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13037096#comment-13037096
 ] 

Carl Steinbach commented on HIVE-2117:
--

+1. Will commit if tests pass.



[jira] [Commented] (HIVE-2117) insert overwrite ignoring partition location

2011-05-20 Thread jirapos...@reviews.apache.org (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-2117?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13036914#comment-13036914
 ] 

jirapos...@reviews.apache.org commented on HIVE-2117:
-


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/773/
---

Review request for hive and Carl Steinbach.


Summary
---

This change resolves a regression introduced by HIVE-1707, specifically that 
the partition location (set via alter table partition location) is not being 
respected.

I addressed this by using the user-specified location (as was done originally), 
except in the case of cross-filesystem moves (which was the concern in HIVE-1707).
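
The rule described in this summary can be sketched roughly as follows. This is not the patch itself (see the diff link below for that). It is a hypothetical, Hadoop-free illustration using java.net.URI, and it assumes that "same filesystem" means matching scheme and authority and that the cross-filesystem case falls back to the path derived from the table location. All names here are invented for the example.

```java
import java.net.URI;
import java.net.URISyntaxException;
import java.util.Objects;

public class PartitionTargetSketch {
    // Assumption: two URIs are on the same filesystem when scheme and authority match.
    static boolean sameFileSystem(URI a, URI b) {
        return Objects.equals(a.getScheme(), b.getScheme())
            && Objects.equals(a.getAuthority(), b.getAuthority());
    }

    // existingPartLocation: location recorded for the partition, or null if it is new.
    // defaultPartPath: path derived from the table location and the partition spec.
    // loadPath: where the query wrote its output.
    static URI chooseTarget(URI existingPartLocation, URI defaultPartPath, URI loadPath) {
        if (existingPartLocation != null && sameFileSystem(existingPartLocation, loadPath)) {
            // Honor the user-specified location.
            return existingPartLocation;
        }
        // New partition or cross-filesystem move: use the default path,
        // re-rooted on the load path's filesystem (assumed fallback).
        try {
            return new URI(loadPath.getScheme(), loadPath.getAuthority(),
                defaultPartPath.getPath(), null, null);
        } catch (URISyntaxException ex) {
            throw new IllegalArgumentException(ex);
        }
    }

    public static void main(String[] args) {
        URI existing = URI.create("hdfs://nn:8020/user/hive/warehouse/foo4/parta");
        URI dflt = URI.create("hdfs://nn:8020/user/hive/warehouse/foo4/dt=a");
        System.out.println(chooseTarget(existing, dflt, URI.create("hdfs://nn:8020/tmp/scratch")));
        System.out.println(chooseTarget(existing, dflt, URI.create("hdfs://other:8020/tmp/scratch")));
    }
}
```

In the first call the recorded location is kept; in the second, the load path lives on a different authority, so the sketch falls back to the derived path on that filesystem.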


This addresses bug HIVE-2117.
https://issues.apache.org/jira/browse/HIVE-2117


Diffs
-

  ql/src/java/org/apache/hadoop/hive/ql/metadata/Hive.java bcacd35 
  ql/src/test/org/apache/hadoop/hive/ql/BaseTestQueries.java PRE-CREATION 
  ql/src/test/org/apache/hadoop/hive/ql/QTestUtil.java 06a0447 
  ql/src/test/org/apache/hadoop/hive/ql/TestLocationQueries.java PRE-CREATION 
  ql/src/test/org/apache/hadoop/hive/ql/TestMTQueries.java 8c7c0b8 
  ql/src/test/queries/clientpositive/alter5.q PRE-CREATION 
  ql/src/test/results/clientpositive/alter5.q.out PRE-CREATION 

Diff: https://reviews.apache.org/r/773/diff


Testing
---

I added a new test which verifies the partition location explicitly, as the 
existing tests ignore this detail. This test fails without my fix applied and 
passes with it.
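
As an illustration of what verifying the location explicitly could look like, here is a small hypothetical helper that pulls the location field out of describe-style output and checks its last path component. It is not the code in TestLocationQueries.java. The "location:..." field format, the sample string, and all names are assumptions made for this sketch.

```java
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class LocationCheckSketch {
    // Assumption: the output contains a field of the form "location:<uri>"
    // terminated by a comma or a closing parenthesis.
    private static final Pattern LOCATION = Pattern.compile("location:([^,)]+)");

    // Returns the first location value found in the output, if any.
    static Optional<String> extractLocation(String describeOutput) {
        Matcher m = LOCATION.matcher(describeOutput);
        return m.find() ? Optional.of(m.group(1)) : Optional.empty();
    }

    // True when a location is present and its last path component is expectedSubdir.
    static boolean locationEndsWith(String describeOutput, String expectedSubdir) {
        return extractLocation(describeOutput)
            .map(loc -> loc.endsWith("/" + expectedSubdir))
            .orElse(false);
    }

    public static void main(String[] args) {
        // Hypothetical sample output, not captured from a real Hive run.
        String sample = "Partition(values:[a], tableName:foo4, "
            + "sd:StorageDescriptor(location:pfile:/build/warehouse/foo4/parta, compressed:false))";
        System.out.println(extractLocation(sample).orElse("<none>"));
        System.out.println(locationEndsWith(sample, "parta"));
    }
}
```

A check of this kind would reject a partition that ended up under "dt=a" after the insert overwrite, which is the detail the existing tests did not look at.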


Thanks,

Patrick





[jira] [Commented] (HIVE-2117) insert overwrite ignoring partition location

2011-05-20 Thread Patrick Hunt (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-2117?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13036912#comment-13036912
 ] 

Patrick Hunt commented on HIVE-2117:


I posted reviews up on reviewboard:
trunk: https://reviews.apache.org/r/773/
branch-0.7: https://reviews.apache.org/r/772/




[jira] [Commented] (HIVE-2117) insert overwrite ignoring partition location

2011-05-20 Thread jirapos...@reviews.apache.org (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-2117?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13036908#comment-13036908
 ] 

jirapos...@reviews.apache.org commented on HIVE-2117:
-


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/772/
---

Review request for hive and Carl Steinbach.


Summary
---

This change resolves a regression introduced by HIVE-1707, specifically that 
the partition location (set via alter table partition location) is not being 
respected.

I addressed this by using the user-specified location (as was done originally), 
except in the case of cross-filesystem moves (which was the concern in HIVE-1707).


This addresses bug HIVE-2117.
https://issues.apache.org/jira/browse/HIVE-2117


Diffs
-

  ql/src/java/org/apache/hadoop/hive/ql/metadata/Hive.java 916b235 
  ql/src/test/org/apache/hadoop/hive/ql/BaseTestQueries.java PRE-CREATION 
  ql/src/test/org/apache/hadoop/hive/ql/QTestUtil.java 4685471 
  ql/src/test/org/apache/hadoop/hive/ql/TestLocationQueries.java PRE-CREATION 
  ql/src/test/org/apache/hadoop/hive/ql/TestMTQueries.java 8c7c0b8 
  ql/src/test/queries/clientpositive/alter5.q PRE-CREATION 
  ql/src/test/results/clientpositive/alter5.q.out PRE-CREATION 

Diff: https://reviews.apache.org/r/772/diff


Testing
---

I added a new test which verifies the partition location explicitly, as the 
existing tests ignore this detail. This test fails without my fix applied and 
passes with it.


Thanks,

Patrick


