kgyrtkirk commented on a change in pull request #2137:
URL: https://github.com/apache/hive/pull/2137#discussion_r611516875
##########
File path: ql/src/java/org/apache/hadoop/hive/ql/exec/tez/HiveSplitGenerator.java
##########
@@ -171,7 +171,7 @@ private void prepare(InputInitializerContext initializerContext) throws IOExcept
// perform dynamic partition pruning
if (pruner != null) {
pruner.initialize(getContext(), work, jobConf);
- pruner.prune();
+ pruner.prune(jobConf);
Review comment:
note: `jobConf` was already passed to the `pruner` in the `initialize` call above
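
To make the suggestion concrete: a minimal sketch, assuming `initialize` stores the `JobConf` it receives in a field (the field and the `throws` clause below are illustrative, not the actual code):
```java
// Sketch: if initialize(...) keeps the JobConf, prune() can stay parameterless.
private JobConf jobConf; // hypothetical field, assigned once in initialize()

public void initialize(InputInitializerContext context, MapWork work, JobConf jobConf) {
  this.jobConf = jobConf;
  // ... existing initialization ...
}

public void prune() throws Exception {
  // read configuration from this.jobConf instead of taking it as a parameter
}
```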
##########
File path: iceberg-handler/src/main/java/org/apache/iceberg/mr/hive/HiveIcebergStorageHandler.java
##########
@@ -250,4 +305,74 @@ static void overlayTableProperties(Configuration configuration, TableDesc tableD
// this is an exception to the interface documentation, but it's a safe operation to add this property
props.put(InputFormatConfig.TABLE_SCHEMA, schemaJson);
}
+
+ /**
+ * Recursively collects the column names from the predicate.
+ * @param node The node we are traversing
+ * @param columns The already collected column names
+ */
+ private void columns(ExprNodeDesc node, Collection<String> columns) {
Review comment:
why is this a `Collection`? Do we need to collect the same column multiple times?
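
For illustration, a sketch of the same collector taking a `Set` instead; the body is a guess at the usual shape of such a traversal (assuming `ExprNodeColumnDesc#getColumn` is the name being collected), not the code under review:
```java
// Sketch: a Set makes the de-duplication explicit.
// Needs java.util.Set plus org.apache.hadoop.hive.ql.plan.{ExprNodeDesc, ExprNodeColumnDesc}.
private void columns(ExprNodeDesc node, Set<String> columns) {
  if (node instanceof ExprNodeColumnDesc) {
    columns.add(((ExprNodeColumnDesc) node).getColumn());
  } else if (node.getChildren() != null) {
    for (ExprNodeDesc child : node.getChildren()) {
      columns(child, columns);
    }
  }
}
```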
##########
File path: ql/src/java/org/apache/hadoop/hive/ql/exec/tez/DynamicPartitionPruner.java
##########
@@ -514,4 +569,29 @@ private boolean checkForSourceCompletion(String name) {
}
return false;
}
+
+ /**
+ * Recursively replaces the ExprNodeDynamicListDesc with the list of the actual values. As a result of this call the
+ * original expression is modified so it can be used for pushing down to the TableScan for filtering the data at the
+ * source.
+ * <p>
+ * Please make sure to clone the predicate if needed since the original node
will be modified.
+ * @param node The node we are traversing
+ * @param dynArgs The constant values we are substituting
+ */
+ private void replaceDynamicLists(ExprNodeDesc node, Collection<ExprNodeConstantDesc> dynArgs) {
+ List<ExprNodeDesc> children = node.getChildren();
+ if (children != null && !children.isEmpty()) {
+ ListIterator<ExprNodeDesc> iterator = node.getChildren().listIterator();
+ while (iterator.hasNext()) {
+ ExprNodeDesc child = iterator.next();
+ if (child instanceof ExprNodeDynamicListDesc) {
+ iterator.remove();
+ dynArgs.forEach(iterator::add);
Review comment:
I strongly suspect that this method is problematic; what will happen if you have a filter for 2 different columns or 2 different values?
```
a IN L1 and b IN L2
```
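
Spelling the concern out: the loop splices the same `dynArgs` constants into every `ExprNodeDynamicListDesc` it encounters, so both IN clauses above would receive one merged value list. One possible shape of a fix, keying the substitution per dynamic list (the `Map`-valued parameter is hypothetical, not existing code):
```java
// Sketch: substitute each dynamic list only with the constants produced for it.
// Needs java.util.Map in addition to the imports already in the file.
private void replaceDynamicLists(ExprNodeDesc node,
    Map<ExprNodeDynamicListDesc, Collection<ExprNodeConstantDesc>> dynArgsByList) {
  List<ExprNodeDesc> children = node.getChildren();
  if (children == null || children.isEmpty()) {
    return;
  }
  ListIterator<ExprNodeDesc> iterator = children.listIterator();
  while (iterator.hasNext()) {
    ExprNodeDesc child = iterator.next();
    if (child instanceof ExprNodeDynamicListDesc) {
      Collection<ExprNodeConstantDesc> dynArgs = dynArgsByList.get(child);
      if (dynArgs != null) { // constants gathered for this particular list
        iterator.remove();
        dynArgs.forEach(iterator::add);
      }
    } else {
      replaceDynamicLists(child, dynArgsByList);
    }
  }
}
```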
##########
File path: ql/src/java/org/apache/hadoop/hive/ql/io/HiveInputFormat.java
##########
@@ -890,6 +894,24 @@ public static void pushFilters(JobConf jobConf, TableScanOperator tableScan,
Utilities.setColumnTypeList(jobConf, tableScan);
// push down filters
ExprNodeGenericFuncDesc filterExpr = scanDesc.getFilterExpr();
+ String pruningFilter = jobConf.get(TableScanDesc.PARTITION_PRUNING_FILTER);
+ // If we have a pruning filter then combine it with the original
+ if (pruningFilter != null) {
+ ExprNodeGenericFuncDesc pruningExpr = SerializationUtilities.deserializeExpression(pruningFilter);
+ if (filterExpr != null) {
+ // Combine the 2 filters with AND
+ filterExpr = new ExprNodeGenericFuncDesc(TypeInfoFactory.booleanTypeInfo, new GenericUDFOPAnd(), "and",
Review comment:
note: you could probably use `ExprNodeDescUtils#conjunction` (or move
this method there...)
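
For reference, with `ExprNodeDescUtils#conjunction` the combination might shrink to roughly the following (the exact overload and the cast are assumptions, not verified against this branch):
```java
// Sketch: build "filterExpr AND pruningExpr" via the helper instead of
// constructing the GenericUDFOPAnd node by hand. Needs java.util.Arrays.
ExprNodeGenericFuncDesc pruningExpr =
    SerializationUtilities.deserializeExpression(pruningFilter);
filterExpr = (ExprNodeGenericFuncDesc)
    ExprNodeDescUtils.conjunction(Arrays.asList(filterExpr, pruningExpr));
```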