> On Dec. 5, 2016, 11:11 p.m., Illya Yalovyy wrote: > > itests/hive-blobstore/src/test/queries/clientpositive/insert_into_dynamic_partitions.q, > > lines 15-16 > > <https://reviews.apache.org/r/54393/diff/1/?file=1576929#file1576929line15> > > > > Why explicit ADD PARTITION statement is required? I think insert into > > will create missing partitions.
That was part of the test I did when I found the issue. But it is not needed. I will remove it just to keep clarity on the type of testing. > On Dec. 5, 2016, 11:11 p.m., Illya Yalovyy wrote: > > itests/hive-blobstore/src/test/queries/clientpositive/insert_overwrite_dynamic_partitions.q, > > line 25 > > <https://reviews.apache.org/r/54393/diff/1/?file=1576931#file1576931line25> > > > > Will "SHOW PARTITIONS" be a good validation in this case? That's a good one. I'll add it. > On Dec. 5, 2016, 11:11 p.m., Illya Yalovyy wrote: > > ql/src/java/org/apache/hadoop/hive/ql/optimizer/GenMapRedUtils.java, line > > 1763 > > <https://reviews.apache.org/r/54393/diff/1/?file=1576938#file1576938line1763> > > > > Should we take into account HIVE_BLOBSTORE_USE_BLOBSTORE_AS_SCRATCHDIR? It's not necessary. This is just an optimization when two MoveTask are on BlobStore. The condition in the method verifies that. return condOutputPath.equals(linkedSourcePath) && BlobStorageUtils.isBlobStoragePath(conf, condInputPath) && BlobStorageUtils.isBlobStoragePath(conf, linkedTargetPath); If the user has HIVE_BLOBSTORE_USE_BLOBSTORE_AS_SCRATCHDIR=false, then the condInputPath might be on HDFS, and the merge won't happen. > On Dec. 5, 2016, 11:11 p.m., Illya Yalovyy wrote: > > ql/src/java/org/apache/hadoop/hive/ql/optimizer/GenMapRedUtils.java, line > > 1784 > > <https://reviews.apache.org/r/54393/diff/1/?file=1576938#file1576938line1784> > > > > Just note: s3 to s3 copy is *more* efficient than hdfs to s3 copy. This is only merging two MoveWorks into one, so the hdfs to s3 copy does not apply here I think. I changed the comment thought: * This is an optimization for BlobStore systems to avoid doing two renames or copies that are not necessary. > On Dec. 5, 2016, 11:11 p.m., Illya Yalovyy wrote: > > ql/src/java/org/apache/hadoop/hive/ql/optimizer/GenMapRedUtils.java, lines > > 1885-1894 > > <https://reviews.apache.org/r/54393/diff/1/?file=1576938#file1576938line1885> > > > > I feel like this logic should be inside "addDependentMoveTasks" method. It is confusing. The "addDependentMoveTasks" is used to link a MoveTask to the desired task. For instance: addDependentMoveTasks(moveTaskToLink, conf, moveOnlyMoveTask, dependencyTask); The above method will link: moveOnlyMoveTask -> moveTaskToLink -> otherChildTasks But I don't want to do that with the optimization, I want to copy the moveTaskToLink child tasks to moveOnlyMoveTask instead. Like: moveOnlyMoveTask -> otherChildTasks On Dec. 5, 2016, 11:11 p.m., Sergio Pena wrote: > > Could you also add a test with a strict dynamic partitioning, like: > > INSERT OVERWRITE TABLE t2 PARTITION(p1="1", p2) SELECT \*, c1 AS p2 FROM t1; Thanks. I will add it. - Sergio ----------------------------------------------------------- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/54393/#review158069 ----------------------------------------------------------- On Dec. 5, 2016, 9:56 p.m., Sergio Pena wrote: > > ----------------------------------------------------------- > This is an automatically generated e-mail. To reply, visit: > https://reviews.apache.org/r/54393/ > ----------------------------------------------------------- > > (Updated Dec. 5, 2016, 9:56 p.m.) > > > Review request for hive. > > > Bugs: HIVE-15361 > https://issues.apache.org/jira/browse/HIVE-15361 > > > Repository: hive-git > > > Description > ------- > > Problem: > - DynamicPartitionCtx and ListBucketingCtx objects weren't set on the new > MoveWork created when merging the two MoveWork objects from the > ConditionalTask. > > Solution > - Set the DynamicPartitionCtx and ListBucketingCtx objects to the new > MoveWork created for the S3 optimization. > > Other changes > - Merge the MoveWork objects inside the createCondTask() method for better > error handling. > - Only merge the MoveWork related to the moveOnlyMoveTask. The MoveWork from > the mergeAndMoveMoveTask may cause other issues that are not correctly tested. > - Two new private methods are added to check and merge the conditional > input/output tasks to the linked MoveWork. > > > Diffs > ----- > > > itests/hive-blobstore/src/test/queries/clientpositive/insert_into_dynamic_partitions.q > PRE-CREATION > itests/hive-blobstore/src/test/queries/clientpositive/insert_into_table.q > 25e2e7007ff539223d9244ca9822aa65d1441eb0 > > itests/hive-blobstore/src/test/queries/clientpositive/insert_overwrite_dynamic_partitions.q > PRE-CREATION > > itests/hive-blobstore/src/test/queries/clientpositive/insert_overwrite_table.q > 846b2b113f09a74a3f05c13ffb56163e81dc1e8e > > itests/hive-blobstore/src/test/results/clientpositive/insert_into_dynamic_partitions.q.out > PRE-CREATION > > itests/hive-blobstore/src/test/results/clientpositive/insert_into_table.q.out > fbb52c132a331aefe870264e035c397078f3c82e > > itests/hive-blobstore/src/test/results/clientpositive/insert_overwrite_directory.q.out > 9f575a66ecefc3933b16dff554bdcc1c1f6420ee > > itests/hive-blobstore/src/test/results/clientpositive/insert_overwrite_dynamic_partitions.q.out > PRE-CREATION > > itests/hive-blobstore/src/test/results/clientpositive/insert_overwrite_table.q.out > c725c96cbb6b0374e67308a54204c7c25e827567 > ql/src/java/org/apache/hadoop/hive/ql/optimizer/GenMapRedUtils.java > adc1188f09c8019a8aa60403d5813d6fa4509ceb > ql/src/java/org/apache/hadoop/hive/ql/plan/LoadDesc.java > bcd3125ab4ad20c00fec565e5004ee200c0187d5 > ql/src/java/org/apache/hadoop/hive/ql/plan/LoadFileDesc.java > 9a868a04ce93d5c2ee75b5c6e96a1401cea93133 > ql/src/java/org/apache/hadoop/hive/ql/plan/LoadTableDesc.java > 771a919ccd0bd75fe6197299ae057647ece89a7e > ql/src/java/org/apache/hadoop/hive/ql/plan/MoveWork.java > 9f498c7fb88a7a9f77b8c6739c097a2b26b0c617 > > ql/src/test/org/apache/hadoop/hive/ql/optimizer/TestGenMapRedUtilsCreateConditionalTask.java > e6ec44504685bd9e53f158cc359b8a7b79fd0166 > > Diff: https://reviews.apache.org/r/54393/diff/ > > > Testing > ------- > > All itests/hive-blobstore tests run. > > Added new blobstore tests: > - insert_into_dynamic_partitions.q > - insert_overwrite_dynamic_partitions.q > > Waiting for HiveQA to run the rest of the q-tests. > > > Thanks, > > Sergio Pena > >