> On July 1, 2015, midnight, Xuefu Zhang wrote:
> > ql/src/java/org/apache/hadoop/hive/ql/optimizer/physical/SparkMapJoinResolver.java, line 142
> > <https://reviews.apache.org/r/34666/diff/1/?file=971707#file971707line142>
> >
> >     Why do we need this now?

This is to prevent a newly generated task from being processed again. If the task contains local work, it may be overwritten. See HIVE-9424 for more details.
However, I just found out that this was also fixed as part of HIVE-9659, so I'll remove this code now.


> On July 1, 2015, midnight, Xuefu Zhang wrote:
> > ql/src/java/org/apache/hadoop/hive/ql/parse/spark/GenSparkUtils.java, line 227
> > <https://reviews.apache.org/r/34666/diff/1/?file=971711#file971711line227>
> >
> >     why putting the old work in the map.

This is because even though we are cloning the op tree, we still retain the old work. So, after we've created a new root op, we need to update the rootToWorkMap and map the cloned root op to the old work. This mapping is later used in getEnclosingWork.
We also need to keep the old entry, because SparkPartitionPruningSink still stores the old TableScanOperator, and processPartitionPruningSink looks up that op to get the corresponding target work.
Added more comments in the code.


- Chao


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/34666/#review89972
-----------------------------------------------------------


On May 26, 2015, 4:28 p.m., Chao Sun wrote:
>
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/34666/
> -----------------------------------------------------------
>
> (Updated May 26, 2015, 4:28 p.m.)
>
>
> Review request for hive, chengxiang li and Xuefu Zhang.
>
>
> Bugs: HIVE-9152
>     https://issues.apache.org/jira/browse/HIVE-9152
>
>
> Repository: hive-git
>
>
> Description
> -------
>
> Tez implemented dynamic partition pruning in HIVE-7826. This is a nice
> optimization and we should implement the same in HOS.
>
>
> Diffs
> -----
>
>   common/src/java/org/apache/hadoop/hive/conf/HiveConf.java 43c53fc
>   itests/src/test/resources/testconfiguration.properties 2a5f7e3
>   metastore/src/gen/thrift/gen-cpp/ThriftHiveMetastore.h 0f86117
>   metastore/src/gen/thrift/gen-cpp/ThriftHiveMetastore.cpp a0b34cb
>   metastore/src/gen/thrift/gen-cpp/hive_metastore_types.h 55e0385
>   metastore/src/gen/thrift/gen-cpp/hive_metastore_types.cpp 749c97a
>   metastore/src/gen/thrift/gen-py/hive_metastore/ThriftHiveMetastore.py 4cc54e8
>   ql/if/queryplan.thrift c8dfa35
>   ql/src/gen/thrift/gen-cpp/queryplan_types.h ac73bc5
>   ql/src/gen/thrift/gen-cpp/queryplan_types.cpp 19d4806
>   ql/src/gen/thrift/gen-javabean/org/apache/hadoop/hive/ql/plan/api/OperatorType.java e18f935
>   ql/src/gen/thrift/gen-php/Types.php 7121ed4
>   ql/src/gen/thrift/gen-py/queryplan/ttypes.py 53c0106
>   ql/src/gen/thrift/gen-rb/queryplan_types.rb c2c4220
>   ql/src/java/org/apache/hadoop/hive/ql/exec/GroupByOperator.java 9867739
>   ql/src/java/org/apache/hadoop/hive/ql/exec/OperatorFactory.java 91e8a02
>   ql/src/java/org/apache/hadoop/hive/ql/exec/spark/HiveSparkClientFactory.java 21398d8
>   ql/src/java/org/apache/hadoop/hive/ql/exec/spark/SparkDynamicPartitionPruner.java PRE-CREATION
>   ql/src/java/org/apache/hadoop/hive/ql/exec/spark/SparkUtilities.java e6c845c
>   ql/src/java/org/apache/hadoop/hive/ql/exec/vector/VectorSparkPartitionPruningSinkOperator.java PRE-CREATION
>   ql/src/java/org/apache/hadoop/hive/ql/io/CombineHiveInputFormat.java 1de7e40
>   ql/src/java/org/apache/hadoop/hive/ql/io/HiveInputFormat.java 9d5730d
>   ql/src/java/org/apache/hadoop/hive/ql/optimizer/Optimizer.java ea5efe5
>   ql/src/java/org/apache/hadoop/hive/ql/optimizer/SparkDynamicPartitionPruningOptimization.java PRE-CREATION
>   ql/src/java/org/apache/hadoop/hive/ql/optimizer/SparkRemoveDynamicPruningBySize.java PRE-CREATION
>   ql/src/java/org/apache/hadoop/hive/ql/optimizer/physical/SparkMapJoinResolver.java 8e56263
>   ql/src/java/org/apache/hadoop/hive/ql/optimizer/physical/Vectorizer.java 5f731d7
>   ql/src/java/org/apache/hadoop/hive/ql/optimizer/spark/SparkPartitionPruningSinkDesc.java PRE-CREATION
>   ql/src/java/org/apache/hadoop/hive/ql/parse/spark/GenSparkProcContext.java 447f104
>   ql/src/java/org/apache/hadoop/hive/ql/parse/spark/GenSparkUtils.java e27ce0d
>   ql/src/java/org/apache/hadoop/hive/ql/parse/spark/OptimizeSparkProcContext.java f7586a4
>   ql/src/java/org/apache/hadoop/hive/ql/parse/spark/SparkCompiler.java 19aae70
>   ql/src/java/org/apache/hadoop/hive/ql/parse/spark/SparkPartitionPruningOptimizer.java PRE-CREATION
>   ql/src/java/org/apache/hadoop/hive/ql/parse/spark/SparkPartitionPruningSinkOperator.java PRE-CREATION
>   ql/src/java/org/apache/hadoop/hive/ql/plan/MapWork.java 05a5841
>   ql/src/java/org/apache/hadoop/hive/ql/plan/TableScanDesc.java aa291b9
>   ql/src/java/org/apache/hadoop/hive/ql/ppd/SyntheticJoinPredicate.java 363e49e
>   ql/src/test/queries/clientpositive/spark_dynamic_partition_pruning.q PRE-CREATION
>   ql/src/test/queries/clientpositive/spark_dynamic_partition_pruning_2.q PRE-CREATION
>   ql/src/test/results/clientpositive/spark/bucket2.q.out 89c3b4c
>   ql/src/test/results/clientpositive/spark/bucket3.q.out 2fc4855
>   ql/src/test/results/clientpositive/spark/bucket4.q.out 44e0f9f
>   ql/src/test/results/clientpositive/spark/column_access_stats.q.out 3e16f61
>   ql/src/test/results/clientpositive/spark/limit_partition_metadataonly.q.out e95d2ab
>   ql/src/test/results/clientpositive/spark/list_bucket_dml_2.q.java1.7.out e38ccf8
>   ql/src/test/results/clientpositive/spark/optimize_nullscan.q.out 881f41a
>   ql/src/test/results/clientpositive/spark/pcr.q.out 4c22f0b
>   ql/src/test/results/clientpositive/spark/sample3.q.out 2fe6b0d
>   ql/src/test/results/clientpositive/spark/sample9.q.out c9823f7
>   ql/src/test/results/clientpositive/spark/smb_mapjoin_11.q.out c3f996f
>   ql/src/test/results/clientpositive/spark/spark_dynamic_partition_pruning.q.out PRE-CREATION
>   ql/src/test/results/clientpositive/spark/spark_dynamic_partition_pruning_2.q.out PRE-CREATION
>   ql/src/test/results/clientpositive/spark/temp_table.q.out 16d663d
>   ql/src/test/results/clientpositive/spark/udf_example_add.q.out 7916679
>   ql/src/test/results/clientpositive/spark/udf_in_file.q.out c769d1f
>   ql/src/test/results/clientpositive/spark/union_view.q.out 593ce40
>   ql/src/test/results/clientpositive/spark/vector_elt.q.out 180ea15
>   ql/src/test/results/clientpositive/spark/vector_string_concat.q.out 9ec8538
>   ql/src/test/results/clientpositive/spark/vectorization_decimal_date.q.out bafd62f
>   ql/src/test/results/clientpositive/spark/vectorization_div0.q.out 30d116f
>   ql/src/test/results/clientpositive/spark/vectorized_case.q.out daf6ad3
>   ql/src/test/results/clientpositive/spark/vectorized_math_funcs.q.out 470d9a9
>   ql/src/test/results/clientpositive/spark/vectorized_string_funcs.q.out ef98ae9
>   serde/src/gen/thrift/gen-cpp/complex_types.h 3f4c760
>   serde/src/gen/thrift/gen-cpp/complex_types.cpp 411e1b0
>   serde/src/gen/thrift/gen-cpp/megastruct_types.cpp 2d46b7f
>   serde/src/gen/thrift/gen-cpp/testthrift_types.h 6c84b9f
>   serde/src/gen/thrift/gen-cpp/testthrift_types.cpp 7949f23
>   serde/src/gen/thrift/gen-javabean/org/apache/hadoop/hive/serde/test/ThriftTestObj.java dda3c5f
>   serde/src/gen/thrift/gen-javabean/org/apache/hadoop/hive/serde2/thrift/test/Complex.java ff0c1f2
>   serde/src/gen/thrift/gen-javabean/org/apache/hadoop/hive/serde2/thrift/test/MegaStruct.java fba49e4
>   serde/src/gen/thrift/gen-javabean/org/apache/hadoop/hive/serde2/thrift/test/PropValueUnion.java a50a508
>   serde/src/gen/thrift/gen-javabean/org/apache/hadoop/hive/serde2/thrift/test/SetIntString.java 334d225
>   service/src/gen/thrift/gen-cpp/TCLIService.h 030475b
>   service/src/gen/thrift/gen-cpp/TCLIService.cpp 209ce63
>   service/src/gen/thrift/gen-cpp/TCLIService_types.h 7bceabd
>   service/src/gen/thrift/gen-cpp/TCLIService_types.cpp 86eeea3
>   service/src/gen/thrift/gen-cpp/ThriftHive.h b84362b
>   service/src/gen/thrift/gen-cpp/ThriftHive.cpp 865db69
>   service/src/gen/thrift/gen-cpp/hive_service_types.h bc0e652
>   service/src/gen/thrift/gen-cpp/hive_service_types.cpp 255fb00
>   service/src/gen/thrift/gen-javabean/org/apache/hadoop/hive/service/ThriftHive.java 1c44789
>   service/src/gen/thrift/gen-javabean/org/apache/hive/service/cli/thrift/TBinaryColumn.java 6b1b054
>   service/src/gen/thrift/gen-javabean/org/apache/hive/service/cli/thrift/TBoolColumn.java efd571c
>   service/src/gen/thrift/gen-javabean/org/apache/hive/service/cli/thrift/TByteColumn.java 169bfde
>   service/src/gen/thrift/gen-javabean/org/apache/hive/service/cli/thrift/TDoubleColumn.java 4fc5454
>   service/src/gen/thrift/gen-javabean/org/apache/hive/service/cli/thrift/TGetTablesReq.java c973fcc
>   service/src/gen/thrift/gen-javabean/org/apache/hive/service/cli/thrift/TI16Column.java c836630
>   service/src/gen/thrift/gen-javabean/org/apache/hive/service/cli/thrift/TI32Column.java 6c6c5f3
>   service/src/gen/thrift/gen-javabean/org/apache/hive/service/cli/thrift/TI64Column.java cc383ed
>   service/src/gen/thrift/gen-javabean/org/apache/hive/service/cli/thrift/TRow.java a44cfb0
>   service/src/gen/thrift/gen-javabean/org/apache/hive/service/cli/thrift/TRowSet.java d16c8a4
>   service/src/gen/thrift/gen-javabean/org/apache/hive/service/cli/thrift/TStatus.java 24a746e
>   service/src/gen/thrift/gen-javabean/org/apache/hive/service/cli/thrift/TStringColumn.java 3dae460
>   service/src/gen/thrift/gen-javabean/org/apache/hive/service/cli/thrift/TTableSchema.java ff5e54d
>   service/src/gen/thrift/gen-javabean/org/apache/hive/service/cli/thrift/TTypeDesc.java 251f86a
>   service/src/gen/thrift/gen-py/hive_service/ThriftHive.py 33912f9
>
> Diff: https://reviews.apache.org/r/34666/diff/
>
>
> Testing
> -------
>
> spark_dynamic_partition_pruning.q, spark_dynamic_partition_pruning_2.q - both
> are cloned from Tez's tests.
>
>
> Thanks,
>
> Chao Sun
>
>
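
For readers following the rootToWorkMap explanation in the reply above, here is a minimal sketch of the idea, not the actual patch. The classes `Op` and `Work`, the helper `registerClonedRoot`, and the choice of `IdentityHashMap` are all hypothetical stand-ins for illustration; the real Hive operator and work classes and the real map type are not shown in this thread. The point is only that the cloned root is added as a second key for the same work, and the old key is left in place so a component still holding the old operator can find its target work.

```java
import java.util.IdentityHashMap;
import java.util.Map;

// Hypothetical stand-ins for Hive's operator and work classes; not the real API.
public class RootToWorkSketch {
    static final class Op {
        final String name;
        Op(String name) { this.name = name; }
    }

    static final class Work {
        final String name;
        Work(String name) { this.name = name; }
    }

    /**
     * Maps the cloned root op to the work already registered for the old root.
     * The old entry is deliberately kept, so lookups by the old op still succeed.
     */
    static void registerClonedRoot(Map<Op, Work> rootToWorkMap, Op oldRoot, Op clonedRoot) {
        Work oldWork = rootToWorkMap.get(oldRoot);
        if (oldWork == null) {
            throw new IllegalStateException("no work registered for " + oldRoot.name);
        }
        rootToWorkMap.put(clonedRoot, oldWork);
    }

    public static void main(String[] args) {
        // Assumption: keys are compared by identity, since a clone is a distinct object.
        Map<Op, Work> rootToWorkMap = new IdentityHashMap<>();
        Op oldTableScan = new Op("TS[0]");
        Op clonedTableScan = new Op("TS[0] (clone)");
        Work mapWork = new Work("Map 1");
        rootToWorkMap.put(oldTableScan, mapWork);

        registerClonedRoot(rootToWorkMap, oldTableScan, clonedTableScan);

        // A lookup from the cloned tree and a lookup by the old op
        // both resolve to the same work object.
        System.out.println(rootToWorkMap.get(clonedTableScan) == mapWork);
        System.out.println(rootToWorkMap.get(oldTableScan) == mapWork);
    }
}
```

Removing the old entry instead of keeping it would break the second lookup, which is the case the reply describes for the pruning sink.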