> On Dec. 5, 2016, 11:11 p.m., Illya Yalovyy wrote:
> > itests/hive-blobstore/src/test/queries/clientpositive/insert_into_dynamic_partitions.q,
> >  lines 15-16
> > <https://reviews.apache.org/r/54393/diff/1/?file=1576929#file1576929line15>
> >
> >     Why explicit ADD PARTITION statement is required? I think insert into 
> > will create missing partitions.

That was part of the test I did when I found the issue. But it is not needed. I 
will remove it just to keep clarity on the type of testing.


> On Dec. 5, 2016, 11:11 p.m., Illya Yalovyy wrote:
> > itests/hive-blobstore/src/test/queries/clientpositive/insert_overwrite_dynamic_partitions.q,
> >  line 25
> > <https://reviews.apache.org/r/54393/diff/1/?file=1576931#file1576931line25>
> >
> >     Will "SHOW PARTITIONS" be a good validation in this case?

That's a good one. I'll add it.


> On Dec. 5, 2016, 11:11 p.m., Illya Yalovyy wrote:
> > ql/src/java/org/apache/hadoop/hive/ql/optimizer/GenMapRedUtils.java, line 
> > 1763
> > <https://reviews.apache.org/r/54393/diff/1/?file=1576938#file1576938line1763>
> >
> >     Should we take into account HIVE_BLOBSTORE_USE_BLOBSTORE_AS_SCRATCHDIR?

It's not necessary. This is just an optimization when two MoveTask are on 
BlobStore. The condition in the method verifies that.

return condOutputPath.equals(linkedSourcePath)
        && BlobStorageUtils.isBlobStoragePath(conf, condInputPath)
        && BlobStorageUtils.isBlobStoragePath(conf, linkedTargetPath);
        
If the user has HIVE_BLOBSTORE_USE_BLOBSTORE_AS_SCRATCHDIR=false, then the 
condInputPath might be on HDFS, and the merge won't happen.


> On Dec. 5, 2016, 11:11 p.m., Illya Yalovyy wrote:
> > ql/src/java/org/apache/hadoop/hive/ql/optimizer/GenMapRedUtils.java, line 
> > 1784
> > <https://reviews.apache.org/r/54393/diff/1/?file=1576938#file1576938line1784>
> >
> >     Just note: s3 to s3 copy is *more* efficient than hdfs to s3 copy.

This is only merging two MoveWorks into one, so the hdfs to s3 copy does not 
apply here I think. I changed the comment thought:
* This is an optimization for BlobStore systems to avoid doing two renames or 
copies that are not necessary.


> On Dec. 5, 2016, 11:11 p.m., Illya Yalovyy wrote:
> > ql/src/java/org/apache/hadoop/hive/ql/optimizer/GenMapRedUtils.java, lines 
> > 1885-1894
> > <https://reviews.apache.org/r/54393/diff/1/?file=1576938#file1576938line1885>
> >
> >     I feel like this logic should be inside "addDependentMoveTasks" method.

It is confusing. The "addDependentMoveTasks" is used to link a MoveTask to the 
desired task. 
For instance: addDependentMoveTasks(moveTaskToLink, conf, moveOnlyMoveTask, 
dependencyTask);
The above method will link: 
  moveOnlyMoveTask -> moveTaskToLink -> otherChildTasks
  
  
But I don't want to do that with the optimization, I want to copy the 
moveTaskToLink child tasks to moveOnlyMoveTask instead.
Like:
  moveOnlyMoveTask -> otherChildTasks


On Dec. 5, 2016, 11:11 p.m., Sergio Pena wrote:
> > Could you also add a test with a strict dynamic partitioning, like:
> > INSERT OVERWRITE TABLE t2 PARTITION(p1="1", p2) SELECT \*, c1 AS p2 FROM t1;

Thanks. I will add it.


- Sergio


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/54393/#review158069
-----------------------------------------------------------


On Dec. 5, 2016, 9:56 p.m., Sergio Pena wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/54393/
> -----------------------------------------------------------
> 
> (Updated Dec. 5, 2016, 9:56 p.m.)
> 
> 
> Review request for hive.
> 
> 
> Bugs: HIVE-15361
>     https://issues.apache.org/jira/browse/HIVE-15361
> 
> 
> Repository: hive-git
> 
> 
> Description
> -------
> 
> Problem:
> - DynamicPartitionCtx and ListBucketingCtx objects weren't set on the new 
> MoveWork created when merging the two MoveWork objects from the 
> ConditionalTask.
> 
> Solution
> - Set the DynamicPartitionCtx and ListBucketingCtx objects to the new 
> MoveWork created for the S3 optimization.
> 
> Other changes
> - Merge the MoveWork objects inside the createCondTask() method for better 
> error handling.
> - Only merge the MoveWork related to the moveOnlyMoveTask. The MoveWork from 
> the mergeAndMoveMoveTask may cause other issues that are not correctly tested.
> - Two new private methods are added to check and merge the conditional 
> input/output tasks to the linked MoveWork.
> 
> 
> Diffs
> -----
> 
>   
> itests/hive-blobstore/src/test/queries/clientpositive/insert_into_dynamic_partitions.q
>  PRE-CREATION 
>   itests/hive-blobstore/src/test/queries/clientpositive/insert_into_table.q 
> 25e2e7007ff539223d9244ca9822aa65d1441eb0 
>   
> itests/hive-blobstore/src/test/queries/clientpositive/insert_overwrite_dynamic_partitions.q
>  PRE-CREATION 
>   
> itests/hive-blobstore/src/test/queries/clientpositive/insert_overwrite_table.q
>  846b2b113f09a74a3f05c13ffb56163e81dc1e8e 
>   
> itests/hive-blobstore/src/test/results/clientpositive/insert_into_dynamic_partitions.q.out
>  PRE-CREATION 
>   
> itests/hive-blobstore/src/test/results/clientpositive/insert_into_table.q.out 
> fbb52c132a331aefe870264e035c397078f3c82e 
>   
> itests/hive-blobstore/src/test/results/clientpositive/insert_overwrite_directory.q.out
>  9f575a66ecefc3933b16dff554bdcc1c1f6420ee 
>   
> itests/hive-blobstore/src/test/results/clientpositive/insert_overwrite_dynamic_partitions.q.out
>  PRE-CREATION 
>   
> itests/hive-blobstore/src/test/results/clientpositive/insert_overwrite_table.q.out
>  c725c96cbb6b0374e67308a54204c7c25e827567 
>   ql/src/java/org/apache/hadoop/hive/ql/optimizer/GenMapRedUtils.java 
> adc1188f09c8019a8aa60403d5813d6fa4509ceb 
>   ql/src/java/org/apache/hadoop/hive/ql/plan/LoadDesc.java 
> bcd3125ab4ad20c00fec565e5004ee200c0187d5 
>   ql/src/java/org/apache/hadoop/hive/ql/plan/LoadFileDesc.java 
> 9a868a04ce93d5c2ee75b5c6e96a1401cea93133 
>   ql/src/java/org/apache/hadoop/hive/ql/plan/LoadTableDesc.java 
> 771a919ccd0bd75fe6197299ae057647ece89a7e 
>   ql/src/java/org/apache/hadoop/hive/ql/plan/MoveWork.java 
> 9f498c7fb88a7a9f77b8c6739c097a2b26b0c617 
>   
> ql/src/test/org/apache/hadoop/hive/ql/optimizer/TestGenMapRedUtilsCreateConditionalTask.java
>  e6ec44504685bd9e53f158cc359b8a7b79fd0166 
> 
> Diff: https://reviews.apache.org/r/54393/diff/
> 
> 
> Testing
> -------
> 
> All itests/hive-blobstore tests run.
> 
> Added new blobstore tests:
> - insert_into_dynamic_partitions.q
> - insert_overwrite_dynamic_partitions.q
> 
> Waiting for HiveQA to run the rest of the q-tests.
> 
> 
> Thanks,
> 
> Sergio Pena
> 
>

Reply via email to