-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/34666/#review89826
-----------------------------------------------------------



itests/src/test/resources/testconfiguration.properties (line 894)
<https://reviews.apache.org/r/34666/#comment142628>

    Are there more test cases that can be turned on?



ql/src/java/org/apache/hadoop/hive/ql/exec/vector/VectorSparkPartitionPruningSinkOperator.java
 (line 68)
<https://reviews.apache.org/r/34666/#comment142851>

    I think we should delegate the processing to the parent when processing one 
row from the batch. Refer to VectorReduceSinkOperator for an example.
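
A rough sketch of that delegation pattern follows. It uses hypothetical stand-in types (RowSinkOperator, ColumnBatch, VectorRowSinkOperator), not the actual Hive classes, and it assumes the parent already holds the row-mode logic:

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for a row-mode operator (not a Hive class).
class RowSinkOperator {
    final List<Object[]> seen = new ArrayList<>();

    // Row-mode processing: handles exactly one row.
    void process(Object[] row) {
        seen.add(row);
    }
}

// Hypothetical stand-in for a column-oriented batch (not VectorizedRowBatch).
class ColumnBatch {
    final long[][] cols; // cols[c][r] holds column c of row r
    final int size;      // number of valid rows in the batch

    ColumnBatch(long[][] cols, int size) {
        this.cols = cols;
        this.size = size;
    }
}

// Vectorized variant: unpacks the batch and hands each row to the parent.
class VectorRowSinkOperator extends RowSinkOperator {
    void processBatch(ColumnBatch batch) {
        for (int r = 0; r < batch.size; r++) {
            Object[] row = new Object[batch.cols.length];
            for (int c = 0; c < batch.cols.length; c++) {
                row[c] = batch.cols[c][r];
            }
            super.process(row); // reuse the parent's row logic instead of copying it
        }
    }
}
```

The point of the pattern is that the vectorized subclass only converts the batch into rows; everything else stays in one place in the parent.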



ql/src/java/org/apache/hadoop/hive/ql/optimizer/SparkDynamicPartitionPruningOptimization.java
 (line 74)
<https://reviews.apache.org/r/34666/#comment142852>

    Is there anything specific to Spark? If not, we should probably reuse 
rather than copying.



ql/src/java/org/apache/hadoop/hive/ql/optimizer/SparkRemoveDynamicPruningBySize.java
 (line 45)
<https://reviews.apache.org/r/34666/#comment142853>

    Same as above. We should probably reuse.



ql/src/java/org/apache/hadoop/hive/ql/parse/spark/GenSparkUtils.java (line 375)
<https://reviews.apache.org/r/34666/#comment142860>

    Rather than throwing an AssertionError, should we use a conditional 
assertion?
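
One possible reading of this suggestion is Java's assert statement with a condition. The sketch below contrasts the two styles; the class name, method names, and the parentCount invariant are hypothetical and are not taken from GenSparkUtils:

```java
class AssertStyles {
    // Style 1: explicit branch that throws when the invariant is violated.
    static int explicitThrow(int parentCount) {
        if (parentCount != 1) {
            throw new AssertionError("expected exactly one parent, got " + parentCount);
        }
        return parentCount;
    }

    // Style 2: state the invariant as a condition.
    // Note: this is only checked when the JVM runs with assertions enabled (-ea).
    static int conditional(int parentCount) {
        assert parentCount == 1 : "expected exactly one parent, got " + parentCount;
        return parentCount;
    }
}
```

The trade-off is that the second form is shorter but is skipped when assertions are disabled, so it only suits checks that are safe to drop in production.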



ql/src/java/org/apache/hadoop/hive/ql/parse/spark/GenSparkUtils.java (line 589)
<https://reviews.apache.org/r/34666/#comment142870>

    It seems that an operator might be visited multiple times.
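
A common guard against repeated visits is a visited set. This is a minimal sketch on a hypothetical OpNode type (not Hive's Operator class), assuming the traversal walks from parents to children:

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical operator node (not Hive's Operator class).
class OpNode {
    final String name;
    final List<OpNode> children = new ArrayList<>();

    OpNode(String name) {
        this.name = name;
    }
}

class OpWalker {
    // Depth-first walk that handles each operator once, even when it is
    // reachable through several parents (e.g. below a join).
    static List<String> visitOnce(OpNode root) {
        List<String> order = new ArrayList<>();
        Set<OpNode> visited = new HashSet<>();
        Deque<OpNode> stack = new ArrayDeque<>();
        stack.push(root);
        while (!stack.isEmpty()) {
            OpNode op = stack.pop();
            if (!visited.add(op)) {
                continue; // already handled via another path
            }
            order.add(op.name);
            for (OpNode child : op.children) {
                stack.push(child);
            }
        }
        return order;
    }
}
```

With a diamond-shaped tree (one scan feeding two branches that meet in a join), the join is reported once rather than twice.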



ql/src/java/org/apache/hadoop/hive/ql/parse/spark/SparkCompiler.java (line 219)
<https://reviews.apache.org/r/34666/#comment142758>

    The comment here is a little confusing. "break op tree" seems to have 
already happened above.



ql/src/java/org/apache/hadoop/hive/ql/parse/spark/SparkCompiler.java (line 224)
<https://reviews.apache.org/r/34666/#comment142759>

    Nit: add comments here, like "regenerate task dependency".



ql/src/java/org/apache/hadoop/hive/ql/parse/spark/SparkCompiler.java (line 262)
<https://reviews.apache.org/r/34666/#comment142756>

    Rename generateWorkTree() to generateTaskTreeHelper() or something like 
that.



ql/src/java/org/apache/hadoop/hive/ql/parse/spark/SparkPartitionPruningOptimizer.java
 (line 71)
<https://reviews.apache.org/r/34666/#comment142757>

    Rename the class to something like OperatorTreeSplitterForPPD().



ql/src/java/org/apache/hadoop/hive/ql/parse/spark/SparkPartitionPruningOptimizer.java
 (line 81)
<https://reviews.apache.org/r/34666/#comment142760>

    Nit: Split this into two lines instead.



ql/src/java/org/apache/hadoop/hive/ql/parse/spark/SparkPartitionPruningOptimizer.java
 (line 107)
<https://reviews.apache.org/r/34666/#comment142764>

    For the cloned tree, don't we need to remove the branches that are not 
connected to the pruning sink operator, i.e., RS->Join?



ql/src/java/org/apache/hadoop/hive/ql/parse/spark/SparkPartitionPruningOptimizer.java
 (line 111)
<https://reviews.apache.org/r/34666/#comment142768>

    This is not cloned as part of cloneOperatorTree()?



ql/src/java/org/apache/hadoop/hive/ql/parse/spark/SparkPartitionPruningSinkOperator.java
 (line 69)
<https://reviews.apache.org/r/34666/#comment142765>

    Nit: remove the blank line.



ql/src/java/org/apache/hadoop/hive/ql/parse/spark/SparkPartitionPruningSinkOperator.java
 (line 92)
<https://reviews.apache.org/r/34666/#comment142766>

    Can we still get conflicts in the file name?



ql/src/java/org/apache/hadoop/hive/ql/parse/spark/SparkPartitionPruningSinkOperator.java
 (line 98)
<https://reviews.apache.org/r/34666/#comment142767>

    Nit: Potential leak of BufferedOutputStream.
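
One way to avoid such a leak is try-with-resources (Java 7+). This is a minimal sketch; the class and method names are hypothetical and the real operator's write logic is not shown:

```java
import java.io.BufferedOutputStream;
import java.io.IOException;
import java.io.OutputStream;

class PruningOutputWriter {
    // Writes the payload through a buffered stream. try-with-resources
    // flushes and closes the stream even if write() throws.
    // Note: closing the BufferedOutputStream also closes the wrapped sink.
    static void writeAll(OutputStream sink, byte[] payload) throws IOException {
        try (BufferedOutputStream out = new BufferedOutputStream(sink)) {
            out.write(payload);
        }
    }
}
```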


- Xuefu Zhang


On May 26, 2015, 4:28 p.m., Chao Sun wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/34666/
> -----------------------------------------------------------
> 
> (Updated May 26, 2015, 4:28 p.m.)
> 
> 
> Review request for hive, chengxiang li and Xuefu Zhang.
> 
> 
> Bugs: HIVE-9152
>     https://issues.apache.org/jira/browse/HIVE-9152
> 
> 
> Repository: hive-git
> 
> 
> Description
> -------
> 
> Tez implemented dynamic partition pruning in HIVE-7826. This is a nice 
> optimization and we should implement the same in Hive on Spark (HOS).
> 
> 
> Diffs
> -----
> 
>   common/src/java/org/apache/hadoop/hive/conf/HiveConf.java 43c53fc 
>   itests/src/test/resources/testconfiguration.properties 2a5f7e3 
>   metastore/src/gen/thrift/gen-cpp/ThriftHiveMetastore.h 0f86117 
>   metastore/src/gen/thrift/gen-cpp/ThriftHiveMetastore.cpp a0b34cb 
>   metastore/src/gen/thrift/gen-cpp/hive_metastore_types.h 55e0385 
>   metastore/src/gen/thrift/gen-cpp/hive_metastore_types.cpp 749c97a 
>   metastore/src/gen/thrift/gen-py/hive_metastore/ThriftHiveMetastore.py 
> 4cc54e8 
>   ql/if/queryplan.thrift c8dfa35 
>   ql/src/gen/thrift/gen-cpp/queryplan_types.h ac73bc5 
>   ql/src/gen/thrift/gen-cpp/queryplan_types.cpp 19d4806 
>   
> ql/src/gen/thrift/gen-javabean/org/apache/hadoop/hive/ql/plan/api/OperatorType.java
>  e18f935 
>   ql/src/gen/thrift/gen-php/Types.php 7121ed4 
>   ql/src/gen/thrift/gen-py/queryplan/ttypes.py 53c0106 
>   ql/src/gen/thrift/gen-rb/queryplan_types.rb c2c4220 
>   ql/src/java/org/apache/hadoop/hive/ql/exec/GroupByOperator.java 9867739 
>   ql/src/java/org/apache/hadoop/hive/ql/exec/OperatorFactory.java 91e8a02 
>   
> ql/src/java/org/apache/hadoop/hive/ql/exec/spark/HiveSparkClientFactory.java 
> 21398d8 
>   
> ql/src/java/org/apache/hadoop/hive/ql/exec/spark/SparkDynamicPartitionPruner.java
>  PRE-CREATION 
>   ql/src/java/org/apache/hadoop/hive/ql/exec/spark/SparkUtilities.java 
> e6c845c 
>   
> ql/src/java/org/apache/hadoop/hive/ql/exec/vector/VectorSparkPartitionPruningSinkOperator.java
>  PRE-CREATION 
>   ql/src/java/org/apache/hadoop/hive/ql/io/CombineHiveInputFormat.java 
> 1de7e40 
>   ql/src/java/org/apache/hadoop/hive/ql/io/HiveInputFormat.java 9d5730d 
>   ql/src/java/org/apache/hadoop/hive/ql/optimizer/Optimizer.java ea5efe5 
>   
> ql/src/java/org/apache/hadoop/hive/ql/optimizer/SparkDynamicPartitionPruningOptimization.java
>  PRE-CREATION 
>   
> ql/src/java/org/apache/hadoop/hive/ql/optimizer/SparkRemoveDynamicPruningBySize.java
>  PRE-CREATION 
>   
> ql/src/java/org/apache/hadoop/hive/ql/optimizer/physical/SparkMapJoinResolver.java
>  8e56263 
>   ql/src/java/org/apache/hadoop/hive/ql/optimizer/physical/Vectorizer.java 
> 5f731d7 
>   
> ql/src/java/org/apache/hadoop/hive/ql/optimizer/spark/SparkPartitionPruningSinkDesc.java
>  PRE-CREATION 
>   ql/src/java/org/apache/hadoop/hive/ql/parse/spark/GenSparkProcContext.java 
> 447f104 
>   ql/src/java/org/apache/hadoop/hive/ql/parse/spark/GenSparkUtils.java 
> e27ce0d 
>   
> ql/src/java/org/apache/hadoop/hive/ql/parse/spark/OptimizeSparkProcContext.java
>  f7586a4 
>   ql/src/java/org/apache/hadoop/hive/ql/parse/spark/SparkCompiler.java 
> 19aae70 
>   
> ql/src/java/org/apache/hadoop/hive/ql/parse/spark/SparkPartitionPruningOptimizer.java
>  PRE-CREATION 
>   
> ql/src/java/org/apache/hadoop/hive/ql/parse/spark/SparkPartitionPruningSinkOperator.java
>  PRE-CREATION 
>   ql/src/java/org/apache/hadoop/hive/ql/plan/MapWork.java 05a5841 
>   ql/src/java/org/apache/hadoop/hive/ql/plan/TableScanDesc.java aa291b9 
>   ql/src/java/org/apache/hadoop/hive/ql/ppd/SyntheticJoinPredicate.java 
> 363e49e 
>   ql/src/test/queries/clientpositive/spark_dynamic_partition_pruning.q 
> PRE-CREATION 
>   ql/src/test/queries/clientpositive/spark_dynamic_partition_pruning_2.q 
> PRE-CREATION 
>   ql/src/test/results/clientpositive/spark/bucket2.q.out 89c3b4c 
>   ql/src/test/results/clientpositive/spark/bucket3.q.out 2fc4855 
>   ql/src/test/results/clientpositive/spark/bucket4.q.out 44e0f9f 
>   ql/src/test/results/clientpositive/spark/column_access_stats.q.out 3e16f61 
>   ql/src/test/results/clientpositive/spark/limit_partition_metadataonly.q.out 
> e95d2ab 
>   ql/src/test/results/clientpositive/spark/list_bucket_dml_2.q.java1.7.out 
> e38ccf8 
>   ql/src/test/results/clientpositive/spark/optimize_nullscan.q.out 881f41a 
>   ql/src/test/results/clientpositive/spark/pcr.q.out 4c22f0b 
>   ql/src/test/results/clientpositive/spark/sample3.q.out 2fe6b0d 
>   ql/src/test/results/clientpositive/spark/sample9.q.out c9823f7 
>   ql/src/test/results/clientpositive/spark/smb_mapjoin_11.q.out c3f996f 
>   
> ql/src/test/results/clientpositive/spark/spark_dynamic_partition_pruning.q.out
>  PRE-CREATION 
>   
> ql/src/test/results/clientpositive/spark/spark_dynamic_partition_pruning_2.q.out
>  PRE-CREATION 
>   ql/src/test/results/clientpositive/spark/temp_table.q.out 16d663d 
>   ql/src/test/results/clientpositive/spark/udf_example_add.q.out 7916679 
>   ql/src/test/results/clientpositive/spark/udf_in_file.q.out c769d1f 
>   ql/src/test/results/clientpositive/spark/union_view.q.out 593ce40 
>   ql/src/test/results/clientpositive/spark/vector_elt.q.out 180ea15 
>   ql/src/test/results/clientpositive/spark/vector_string_concat.q.out 9ec8538 
>   ql/src/test/results/clientpositive/spark/vectorization_decimal_date.q.out 
> bafd62f 
>   ql/src/test/results/clientpositive/spark/vectorization_div0.q.out 30d116f 
>   ql/src/test/results/clientpositive/spark/vectorized_case.q.out daf6ad3 
>   ql/src/test/results/clientpositive/spark/vectorized_math_funcs.q.out 
> 470d9a9 
>   ql/src/test/results/clientpositive/spark/vectorized_string_funcs.q.out 
> ef98ae9 
>   serde/src/gen/thrift/gen-cpp/complex_types.h 3f4c760 
>   serde/src/gen/thrift/gen-cpp/complex_types.cpp 411e1b0 
>   serde/src/gen/thrift/gen-cpp/megastruct_types.cpp 2d46b7f 
>   serde/src/gen/thrift/gen-cpp/testthrift_types.h 6c84b9f 
>   serde/src/gen/thrift/gen-cpp/testthrift_types.cpp 7949f23 
>   
> serde/src/gen/thrift/gen-javabean/org/apache/hadoop/hive/serde/test/ThriftTestObj.java
>  dda3c5f 
>   
> serde/src/gen/thrift/gen-javabean/org/apache/hadoop/hive/serde2/thrift/test/Complex.java
>  ff0c1f2 
>   
> serde/src/gen/thrift/gen-javabean/org/apache/hadoop/hive/serde2/thrift/test/MegaStruct.java
>  fba49e4 
>   
> serde/src/gen/thrift/gen-javabean/org/apache/hadoop/hive/serde2/thrift/test/PropValueUnion.java
>  a50a508 
>   
> serde/src/gen/thrift/gen-javabean/org/apache/hadoop/hive/serde2/thrift/test/SetIntString.java
>  334d225 
>   service/src/gen/thrift/gen-cpp/TCLIService.h 030475b 
>   service/src/gen/thrift/gen-cpp/TCLIService.cpp 209ce63 
>   service/src/gen/thrift/gen-cpp/TCLIService_types.h 7bceabd 
>   service/src/gen/thrift/gen-cpp/TCLIService_types.cpp 86eeea3 
>   service/src/gen/thrift/gen-cpp/ThriftHive.h b84362b 
>   service/src/gen/thrift/gen-cpp/ThriftHive.cpp 865db69 
>   service/src/gen/thrift/gen-cpp/hive_service_types.h bc0e652 
>   service/src/gen/thrift/gen-cpp/hive_service_types.cpp 255fb00 
>   
> service/src/gen/thrift/gen-javabean/org/apache/hadoop/hive/service/ThriftHive.java
>  1c44789 
>   
> service/src/gen/thrift/gen-javabean/org/apache/hive/service/cli/thrift/TBinaryColumn.java
>  6b1b054 
>   
> service/src/gen/thrift/gen-javabean/org/apache/hive/service/cli/thrift/TBoolColumn.java
>  efd571c 
>   
> service/src/gen/thrift/gen-javabean/org/apache/hive/service/cli/thrift/TByteColumn.java
>  169bfde 
>   
> service/src/gen/thrift/gen-javabean/org/apache/hive/service/cli/thrift/TDoubleColumn.java
>  4fc5454 
>   
> service/src/gen/thrift/gen-javabean/org/apache/hive/service/cli/thrift/TGetTablesReq.java
>  c973fcc 
>   
> service/src/gen/thrift/gen-javabean/org/apache/hive/service/cli/thrift/TI16Column.java
>  c836630 
>   
> service/src/gen/thrift/gen-javabean/org/apache/hive/service/cli/thrift/TI32Column.java
>  6c6c5f3 
>   
> service/src/gen/thrift/gen-javabean/org/apache/hive/service/cli/thrift/TI64Column.java
>  cc383ed 
>   
> service/src/gen/thrift/gen-javabean/org/apache/hive/service/cli/thrift/TRow.java
>  a44cfb0 
>   
> service/src/gen/thrift/gen-javabean/org/apache/hive/service/cli/thrift/TRowSet.java
>  d16c8a4 
>   
> service/src/gen/thrift/gen-javabean/org/apache/hive/service/cli/thrift/TStatus.java
>  24a746e 
>   
> service/src/gen/thrift/gen-javabean/org/apache/hive/service/cli/thrift/TStringColumn.java
>  3dae460 
>   
> service/src/gen/thrift/gen-javabean/org/apache/hive/service/cli/thrift/TTableSchema.java
>  ff5e54d 
>   
> service/src/gen/thrift/gen-javabean/org/apache/hive/service/cli/thrift/TTypeDesc.java
>  251f86a 
>   service/src/gen/thrift/gen-py/hive_service/ThriftHive.py 33912f9 
> 
> Diff: https://reviews.apache.org/r/34666/diff/
> 
> 
> Testing
> -------
> 
> spark_dynamic_partition_pruning.q, spark_dynamic_partition_pruning_2.q - both 
> are cloned from Tez's tests.
> 
> 
> Thanks,
> 
> Chao Sun
> 
>
