[GitHub] [hive] jcamachor commented on a change in pull request #1315: [HIVE-23951] Support parameterized queries in WHERE/HAVING clause

2020-08-05 Thread GitBox


jcamachor commented on a change in pull request #1315:
URL: https://github.com/apache/hive/pull/1315#discussion_r466085537



##
File path: ql/src/java/org/apache/hadoop/hive/ql/QueryPlan.java
##
@@ -121,6 +121,8 @@
   private final DDLDescWithWriteId acidDdlDesc;
   private Boolean autoCommitValue;
 
+  private Boolean prepareQuery;

Review comment:
   It seems this should be a `boolean`, given the return type of its getter/setter methods.
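   For illustration, a minimal sketch of the suggested shape (the field name comes from the hunk above; the accessor names are assumptions):

   private boolean prepareQuery;

   public boolean isPrepareQuery() {
     return prepareQuery;
   }

   public void setPrepareQuery(boolean prepareQuery) {
     this.prepareQuery = prepareQuery;
   }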

##
File path: ql/src/java/org/apache/hadoop/hive/ql/Compiler.java
##
@@ -338,12 +339,22 @@ private QueryPlan createPlan(BaseSemanticAnalyzer sem) {
 plan.setOptimizedCBOPlan(context.getCalcitePlan());
 plan.setOptimizedQueryString(context.getOptimizedSql());
 
+// this is required so that the driver can later skip executing prepare queries
+if (sem.getIsPrepareQuery()) {

Review comment:
   `getIsPrepareQuery` -> `isPrepareQuery`
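   i.e. the check in `createPlan` would then read along these lines (a sketch only; `setPrepareQuery` on `QueryPlan` is assumed from the field added in the other hunk):

   // mark the plan so the driver can later skip executing PREPARE-only queries
   if (sem.isPrepareQuery()) {
     plan.setPrepareQuery(true);
   }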

##
File path: parser/src/java/org/apache/hadoop/hive/ql/parse/PrepareStatementParser.g
##
@@ -0,0 +1,66 @@
+/**
+   Licensed to the Apache Software Foundation (ASF) under one or more 
+   contributor license agreements.  See the NOTICE file distributed with 
+   this work for additional information regarding copyright ownership.
+   The ASF licenses this file to You under the Apache License, Version 2.0
+   (the "License"); you may not use this file except in compliance with 
+   the License.  You may obtain a copy of the License at
+
+   http://www.apache.org/licenses/LICENSE-2.0
+
+   Unless required by applicable law or agreed to in writing, software
+   distributed under the License is distributed on an "AS IS" BASIS,
+   WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+   See the License for the specific language governing permissions and
+   limitations under the License.
+*/
+parser grammar PrepareStatementParser;
+
+options
+{
+output=AST;
+ASTLabelType=ASTNode;
+backtrack=false;
+k=3;
+}
+
+@members {
+  @Override
+  public Object recoverFromMismatchedSet(IntStream input,
+  RecognitionException re, BitSet follow) throws RecognitionException {
+    throw re;
+  }
+  @Override
+  public void displayRecognitionError(String[] tokenNames,
+  RecognitionException e) {
+    gParent.errors.add(new ParseError(gParent, e, tokenNames));
+  }
+}
+
+@rulecatch {
+catch (RecognitionException e) {
+  throw e;
+}
+}
+
+//----------------------- Rules for parsing Prepare statement -----------------------
+prepareStatement
+@init { gParent.pushMsg("prepare statement ", state); }
+@after { gParent.popMsg(state); }
+: KW_PREPARE identifier KW_FROM queryStatementExpression
+-> ^(TOK_PREPARE queryStatementExpression identifier)
+;
+
+executeStatement
+@init { gParent.pushMsg("execute statement ", state); }
+@after { gParent.popMsg(state); }
+: KW_EXECUTE identifier KW_USING executeParamList
+-> ^(TOK_EXECUTE executeParamList identifier)
+;
+
+executeParamList
+@init { gParent.pushMsg("execute param list", state); }
+@after { gParent.popMsg(state); }
+: constant (COMMA constant)*

Review comment:
   Is there a JIRA?
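   For context, the rules above correspond to statements of the form `PREPARE <name> FROM <query>` and `EXECUTE <name> USING <constant list>`. A minimal JDBC-style sketch of how they might be exercised (connection URL, table, and statement name are made up, JDBC-style `?` placeholders for the dynamic parameters are assumed, and it requires a running HiveServer2 plus the hive-jdbc driver on the classpath):

   import java.sql.Connection;
   import java.sql.DriverManager;
   import java.sql.ResultSet;
   import java.sql.Statement;

   public class PrepareExecuteExample {
     public static void main(String[] args) throws Exception {
       try (Connection con = DriverManager.getConnection("jdbc:hive2://localhost:10000/default");
            Statement stmt = con.createStatement()) {
         // PREPARE <identifier> FROM <queryStatementExpression>
         stmt.execute("PREPARE pquery FROM SELECT count(*) FROM src WHERE key > ?");
         // EXECUTE <identifier> USING <executeParamList> -- only constants are accepted here
         try (ResultSet rs = stmt.executeQuery("EXECUTE pquery USING 200")) {
           while (rs.next()) {
             System.out.println(rs.getLong(1));
           }
         }
       }
     }
   }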

##
File path: ql/src/test/results/clientpositive/llap/udf_greatest.q.out
##
@@ -63,7 +63,7 @@ STAGE PLANS:
   alias: src
   Row Limit Per Split: 1
   Select Operator
-expressions: 'c' (type: string), 'a' (type: string), 'AaA' (type: string), 'AAA' (type: string), '13' (type: string), '2' (type: string), '03' (type: string), '1' (type: string), null (type: double), null (type: double), null (type: double), null (type: double), null (type: double), null (type: double)

Review comment:
   Why did these types change? Seems unrelated to this patch. Was it an 
existing bug?

##
File path: ql/src/java/org/apache/hadoop/hive/ql/stats/StatsUtils.java
##
@@ -1619,6 +1620,12 @@ public static ColStatistics getColStatisticsFromExpression(HiveConf conf, Statis
   colName = enfd.getFieldName();
   colType = enfd.getTypeString();
   countDistincts = numRows;
+} else if (end instanceof ExprDynamicParamDesc) {
+  //skip collecting stats for parameters

Review comment:
   JIRA?

##
File path: ql/src/java/org/apache/hadoop/hive/ql/parse/ExecuteStatementAnalyzer.java
##
@@ -0,0 +1,320 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" 

[GitHub] [hive] jcamachor commented on a change in pull request #1315: [HIVE-23951] Support parameterized queries in WHERE/HAVING clause

2020-08-04 Thread GitBox


jcamachor commented on a change in pull request #1315:
URL: https://github.com/apache/hive/pull/1315#discussion_r465443070



##
File path: ql/src/java/org/apache/hadoop/hive/ql/ddl/table/drop/ExecuteStatementAnalyzer.java
##
@@ -0,0 +1,377 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.hadoop.hive.ql.ddl.table.drop;
+
+import org.apache.hadoop.hive.ql.QueryState;
+import org.apache.hadoop.hive.ql.ddl.DDLSemanticAnalyzerFactory.DDLType;
+import org.apache.hadoop.hive.ql.exec.ExplainTask;
+import org.apache.hadoop.hive.ql.exec.FetchTask;
+import org.apache.hadoop.hive.ql.exec.FilterOperator;
+import org.apache.hadoop.hive.ql.exec.Operator;
+import org.apache.hadoop.hive.ql.exec.OperatorUtils;
+import org.apache.hadoop.hive.ql.exec.SelectOperator;
+import org.apache.hadoop.hive.ql.exec.SerializationUtilities;
+import org.apache.hadoop.hive.ql.exec.Task;
+import org.apache.hadoop.hive.ql.exec.Utilities;
+import org.apache.hadoop.hive.ql.exec.tez.TezTask;
+import org.apache.hadoop.hive.ql.exec.vector.VectorSelectOperator;
+import org.apache.hadoop.hive.ql.parse.ASTNode;
+import org.apache.hadoop.hive.ql.parse.BaseSemanticAnalyzer;
+import org.apache.hadoop.hive.ql.parse.HiveParser;
+import org.apache.hadoop.hive.ql.parse.SemanticException;
+import org.apache.hadoop.hive.ql.parse.type.ExprNodeDescExprFactory;
+import org.apache.hadoop.hive.ql.plan.BaseWork;
+import org.apache.hadoop.hive.ql.plan.ExprDynamicParamDesc;
+import org.apache.hadoop.hive.ql.plan.ExprNodeConstantDesc;
+import org.apache.hadoop.hive.ql.plan.ExprNodeDesc;
+import org.apache.hadoop.hive.ql.session.SessionState;
+import org.apache.hadoop.hive.serde2.typeinfo.CharTypeInfo;
+import org.apache.hadoop.hive.serde2.typeinfo.TypeInfo;
+import org.apache.hadoop.hive.serde2.typeinfo.TypeInfoFactory;
+import org.apache.hadoop.hive.serde2.typeinfo.VarcharTypeInfo;
+
+import java.io.ByteArrayInputStream;
+import java.io.ByteArrayOutputStream;
+import java.util.ArrayList;
+import java.util.HashMap;
+import java.util.HashSet;
+import java.util.List;
+import java.util.Map;
+import java.util.Set;
+
+/**
+ * Analyzer for Execute statement.
+ * This analyzer
+ *  retrieves the cached {@link BaseSemanticAnalyzer},
+ *  makes a copy of all tasks by serializing/deserializing them,
+ *  binds dynamic parameters inside the cached {@link BaseSemanticAnalyzer} using the provided values.
+ */
+@DDLType(types = HiveParser.TOK_EXECUTE)
+public class ExecuteStatementAnalyzer extends BaseSemanticAnalyzer {
+
+  public ExecuteStatementAnalyzer(QueryState queryState) throws SemanticException {
+super(queryState);
+  }
+
+  /**
+   * This class encapsulates all {@link Task}s that need to be copied.
+   * This is required because the {@link FetchTask} and the list of {@link Task}s may hold references to the same
+   * objects (e.g. the list of result files) and therefore must be serialized/de-serialized together.
+   */
+  private class PlanCopy {
+FetchTask fetchTask;
+List<Task<?>> tasks;
+
+PlanCopy(FetchTask fetchTask, List<Task<?>> tasks) {
+  this.fetchTask = fetchTask;
+  this.tasks = tasks;
+}
+
+FetchTask getFetchTask() {
+  return fetchTask;
+}
+
+List<Task<?>> getTasks() {
+  return tasks;
+}
+  }
+
+  private String getQueryName(ASTNode root) {
+ASTNode queryNameAST = (ASTNode)(root.getChild(1));
+return queryNameAST.getText();
+  }
+
+  /**
+   * Utility method to create a copy of the provided object using Kryo serialization/de-serialization.
+   */
+  private <T> T makeCopy(final Object task, Class<T> objClass) {
+ByteArrayOutputStream baos = new ByteArrayOutputStream();
+SerializationUtilities.serializePlan(task, baos);
+
+return SerializationUtilities.deserializePlan(
+new ByteArrayInputStream(baos.toByteArray()), objClass);
+  }
+
+  /**
+   * Given a cached {@link BaseSemanticAnalyzer}, this method makes copies of all tasks
+   * (including the {@link FetchTask}) and updates the existing {@link ExecuteStatementAnalyzer}.
+   */
+  private void createTaskCopy(final BaseSemanticAnalyzer cachedPlan) {

Review comment:
   Yes, I thought about `QueryPlan` since I assumed it would have all the 
information that is needed. It seems more natural to 
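   For reference, the copy mechanism being discussed boils down to a single Kryo round trip through `SerializationUtilities`, so that objects shared between the `FetchTask` and the task DAG stay shared after deserialization. A sketch under that reading (the accessor names on the cached analyzer are the standard `BaseSemanticAnalyzer` ones; everything else is illustrative, not the actual patch):

   // bundle both pieces so shared references survive one serialize/deserialize pass
   PlanCopy bundle = new PlanCopy(cachedPlan.getFetchTask(), cachedPlan.getRootTasks());
   ByteArrayOutputStream baos = new ByteArrayOutputStream();
   SerializationUtilities.serializePlan(bundle, baos);
   PlanCopy copy = SerializationUtilities.deserializePlan(
       new ByteArrayInputStream(baos.toByteArray()), PlanCopy.class);
   // the copied fetch task / task list can then replace this analyzer's own plan pieces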

[GitHub] [hive] jcamachor commented on a change in pull request #1315: [HIVE-23951] Support parameterized queries in WHERE/HAVING clause

2020-08-03 Thread GitBox


jcamachor commented on a change in pull request #1315:
URL: https://github.com/apache/hive/pull/1315#discussion_r464729855



##
File path: ql/src/java/org/apache/hadoop/hive/ql/exec/OperatorUtils.java
##
@@ -49,6 +50,8 @@
 import com.google.common.collect.Lists;
 import com.google.common.collect.Multimap;
 
+import static org.apache.hadoop.hive.ql.optimizer.physical.AnnotateRunTimeStatsOptimizer.getAllOperatorsForSimpleFetch;

Review comment:
   Yes, I meant `getAllOperatorsForSimpleFetch`, since it seems it is used beyond the scope of `AnnotateRunTimeStatsOptimizer`?
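   In other words, the traversal helper could live in `OperatorUtils` itself rather than being statically imported from the optimizer. A rough sketch of the kind of breadth-first walk such a helper performs (signature and body are assumptions for illustration, not the existing implementation; imports: java.util.*, org.apache.hadoop.hive.ql.exec.Operator):

   public static Set<Operator<?>> getAllOperatorsForSimpleFetch(Collection<Operator<?>> roots) {
     Set<Operator<?>> found = new HashSet<>();
     Deque<Operator<?>> pending = new ArrayDeque<>(roots);
     while (!pending.isEmpty()) {
       Operator<?> op = pending.poll();
       // visit each operator once and queue its children
       if (found.add(op) && op.getChildOperators() != null) {
         pending.addAll(op.getChildOperators());
       }
     }
     return found;
   }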




