[jira] [Work logged] (HIVE-24154) Missing simplification opportunity with IN and EQUALS clauses

2020-09-13 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-24154?focusedWorklogId=483631&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-483631
 ]

ASF GitHub Bot logged work on HIVE-24154:
-

Author: ASF GitHub Bot
Created on: 13/Sep/20 10:28
Start Date: 13/Sep/20 10:28
Worklog Time Spent: 10m 
  Work Description: kgyrtkirk commented on a change in pull request #1492:
URL: https://github.com/apache/hive/pull/1492#discussion_r487511233



##
File path: 
ql/src/java/org/apache/hadoop/hive/ql/optimizer/calcite/rules/HivePointLookupOptimizerRule.java
##
@@ -734,6 +721,30 @@ private RexNode useStructIfNeeded(List 
columns) {
   }
   operands.remove(i);
   --i;
+} else if (operand.getKind() == SqlKind.EQUALS) {
+  Constraint c = Constraint.of(operand);
+  if (c == null || !HiveCalciteUtil.isDeterministic(c.exprNode)) {
+continue;
+  }
+  String ref = c.exprNode.toString();
+  stringToExpr.put(ref, c.exprNode);
+  if (inLHSExprToRHSExprs.containsKey(ref)) {
+String expr = c.constNode.toString();
+stringToExpr.put(expr, c.constNode);

Review comment:
   note: RexNode-s are comparable now, so this string2expr stuff is not 
necessarily needed anymore
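
   To illustrate the note above, here is a minimal sketch. It assumes that 
expression nodes have value-based equals() and hashCode(). Expr and 
NodeKeyedLookup are hypothetical stand-ins, not Calcite or Hive classes. With 
such nodes the map can be keyed by the node itself, so a separate 
string-to-expression table is not needed.

```java
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.LinkedHashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical stand-in for an expression node; NOT Calcite's RexNode.
final class Expr {
  final String digest;

  Expr(String digest) {
    this.digest = digest;
  }

  // Value-based equality: two nodes with the same digest are the same key.
  @Override
  public boolean equals(Object o) {
    return o instanceof Expr && ((Expr) o).digest.equals(digest);
  }

  @Override
  public int hashCode() {
    return digest.hashCode();
  }

  @Override
  public String toString() {
    return digest;
  }
}

// Hypothetical helper: groups candidate constants by left-hand expression.
final class NodeKeyedLookup {
  // Keyed by the node itself, so no string -> node map has to be kept in sync.
  private final Map<Expr, Set<Expr>> lhsToRhs = new LinkedHashMap<>();

  void add(Expr lhs, Expr constant) {
    lhsToRhs.computeIfAbsent(lhs, k -> new LinkedHashSet<>()).add(constant);
  }

  Set<Expr> constantsFor(Expr lhs) {
    return lhsToRhs.getOrDefault(lhs, Collections.emptySet());
  }
}
```

   Adding ($1, 1998), ($1, 1999) and ($1, 1999) again leaves two constants 
under the single key $1, even though each Expr was created separately.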

##
File path: 
ql/src/test/results/clientpositive/perf/tez/constraints/cbo_query74.q.out
##
@@ -147,7 +147,7 @@ HiveSortLimit(sort0=[$2], sort1=[$0], sort2=[$1], 
dir0=[ASC], dir1=[ASC], dir2=[
   HiveProject(ss_sold_date_sk=[$0], 
ss_customer_sk=[$3], ss_net_paid=[$20])
 HiveTableScan(table=[[default, store_sales]], 
table:alias=[store_sales])
   HiveProject(d_date_sk=[$0])
-HiveFilter(condition=[AND(=($1, 1999), IN($1, 1998, 
1999))])
+HiveFilter(condition=[=($1, 1999)])

Review comment:
   I see that this patch works, but I would have thought that we didn't need 
something like this anymore: IN is opened in an early phase, so 
Calcite should see a bunch of ANDs and ORs, and if that's true, RexSimplify 
could make this simplification and there would be no need to enhance 
PointLookup...
   
   I wonder how the IN was retained ...or re-created... 

##
File path: 
ql/src/java/org/apache/hadoop/hive/ql/optimizer/calcite/rules/HivePointLookupOptimizerRule.java
##
@@ -599,7 +587,7 @@ public ConstraintGroup(RexNode rexNode) {
   if (constraint == null) {
 throw new SemanticException("Unable to find constraint which was 
earlier added.");
   }
-  ret.add(constraint.exprNode);
+  ret.add(constraint.constNode);

Review comment:
   oh my; Constraint had its constructor arguments swapped! what a typo!





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 483631)
Time Spent: 20m  (was: 10m)

> Missing simplification opportunity with IN and EQUALS clauses
> -
>
> Key: HIVE-24154
> URL: https://issues.apache.org/jira/browse/HIVE-24154
> Project: Hive
>  Issue Type: Improvement
>  Components: CBO
>Reporter: Jesus Camacho Rodriguez
>Assignee: Jesus Camacho Rodriguez
>Priority: Minor
>  Labels: pull-request-available
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> For instance, in perf driver CBO query 74, there are several filters that 
> could be simplified further:
> {code}
> HiveFilter(condition=[AND(=($1, 1999), IN($1, 1998, 1999))])
> {code}
> This may lead to incorrect estimates and unnecessary execution time.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (HIVE-18284) NPE when inserting data with 'distribute by' clause with dynpart sort optimization

2020-09-13 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-18284?focusedWorklogId=483644&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-483644
 ]

ASF GitHub Bot logged work on HIVE-18284:
-

Author: ASF GitHub Bot
Created on: 13/Sep/20 14:45
Start Date: 13/Sep/20 14:45
Worklog Time Spent: 10m 
  Work Description: shameersss1 commented on pull request #1400:
URL: https://github.com/apache/hive/pull/1400#issuecomment-691680467


   @kgyrtkirk @jcamachor Ping for review request!





Issue Time Tracking
---

Worklog Id: (was: 483644)
Time Spent: 1.5h  (was: 1h 20m)

> NPE when inserting data with 'distribute by' clause with dynpart sort 
> optimization
> --
>
> Key: HIVE-18284
> URL: https://issues.apache.org/jira/browse/HIVE-18284
> Project: Hive
>  Issue Type: Bug
>  Components: Query Processor
>Affects Versions: 2.3.1, 2.3.2
>Reporter: Aki Tanaka
>Assignee: Syed Shameerur Rahman
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 1.5h
>  Remaining Estimate: 0h
>
> A Null Pointer Exception occurs when inserting data with 'distribute by' 
> clause. The following snippet query reproduces this issue:
> *(non-vectorized , non-llap mode)*
> {code:java}
> create table table1 (col1 string, datekey int);
> insert into table1 values ('ROW1', 1), ('ROW2', 2), ('ROW3', 1);
> create table table2 (col1 string) partitioned by (datekey int);
> set hive.vectorized.execution.enabled=false;
> set hive.optimize.sort.dynamic.partition=true;
> set hive.exec.dynamic.partition.mode=nonstrict;
> insert into table table2
> PARTITION(datekey)
> select col1,
> datekey
> from table1
> distribute by datekey ;
> {code}
> I could run the insert query without the error if I remove Distribute By  or 
> use Cluster By clause.
> It seems that the issue happens because Distribute By does not guarantee 
> clustering or sorting properties on the distributed keys.
> FileSinkOperator removes the previous fsp, which might be re-used when we 
> use Distribute By.
> https://github.com/apache/hive/blob/branch-2.3/ql/src/java/org/apache/hadoop/hive/ql/exec/FileSinkOperator.java#L972
> The following stack trace is logged.
> {code:java}
> Vertex failed, vertexName=Reducer 2, vertexId=vertex_1513111717879_0056_1_01, 
> diagnostics=[Task failed, taskId=task_1513111717879_0056_1_01_00, 
> diagnostics=[TaskAttempt 0 failed, info=[Error: Error while running task ( 
> failure ) : 
> attempt_1513111717879_0056_1_01_00_0:java.lang.RuntimeException: 
> org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error while 
> processing row (tag=0) {"key":{},"value":{"_col0":"ROW3","_col1":1}}
>   at 
> org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:211)
>   at 
> org.apache.hadoop.hive.ql.exec.tez.TezProcessor.run(TezProcessor.java:168)
>   at 
> org.apache.tez.runtime.LogicalIOProcessorRuntimeTask.run(LogicalIOProcessorRuntimeTask.java:370)
>   at 
> org.apache.tez.runtime.task.TaskRunner2Callable$1.run(TaskRunner2Callable.java:73)
>   at 
> org.apache.tez.runtime.task.TaskRunner2Callable$1.run(TaskRunner2Callable.java:61)
>   at java.security.AccessController.doPrivileged(Native Method)
>   at javax.security.auth.Subject.doAs(Subject.java:422)
>   at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1698)
>   at 
> org.apache.tez.runtime.task.TaskRunner2Callable.callInternal(TaskRunner2Callable.java:61)
>   at 
> org.apache.tez.runtime.task.TaskRunner2Callable.callInternal(TaskRunner2Callable.java:37)
>   at org.apache.tez.common.CallableWithNdc.call(CallableWithNdc.java:36)
>   at java.util.concurrent.FutureTask.run(FutureTask.java:266)
>   at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
>   at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
>   at java.lang.Thread.run(Thread.java:748)
> Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime 
> Error while processing row (tag=0) 
> {"key":{},"value":{"_col0":"ROW3","_col1":1}}
>   at 
> org.apache.hadoop.hive.ql.exec.tez.ReduceRecordSource$GroupIterator.next(ReduceRecordSource.java:365)
>   at 
> org.apache.hadoop.hive.ql.exec.tez.ReduceRecordSource.pushRecord(

[jira] [Work logged] (HIVE-22782) Consolidate metastore call to fetch constraints

2020-09-13 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-22782?focusedWorklogId=483645&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-483645
 ]

ASF GitHub Bot logged work on HIVE-22782:
-

Author: ASF GitHub Bot
Created on: 13/Sep/20 14:53
Start Date: 13/Sep/20 14:53
Worklog Time Spent: 10m 
  Work Description: sankarh commented on a change in pull request #1419:
URL: https://github.com/apache/hive/pull/1419#discussion_r487538671



##
File path: ql/src/java/org/apache/hadoop/hive/ql/metadata/Hive.java
##
@@ -5661,184 +5664,69 @@ public void dropConstraint(String dbName, String 
tableName, String constraintNam
 }
   }
 
-  public List getDefaultConstraintList(String dbName, 
String tblName) throws HiveException, NoSuchObjectException {
-try {
-  return getMSC().getDefaultConstraints(new 
DefaultConstraintsRequest(getDefaultCatalog(conf), dbName, tblName));
-} catch (NoSuchObjectException e) {
-  throw e;
-} catch (Exception e) {
-  throw new HiveException(e);
-}
-  }
-
-  public List getCheckConstraintList(String dbName, String 
tblName) throws HiveException, NoSuchObjectException {
+  public SQLAllTableConstraints getTableConstraints(String dbName, String 
tblName)
+  throws HiveException, NoSuchObjectException {
 try {
-  return getMSC().getCheckConstraints(new 
CheckConstraintsRequest(getDefaultCatalog(conf),
-  dbName, tblName));
+  return getMSC().getAllTableConstraints(new 
AllTableConstraintsRequest(dbName, tblName, getDefaultCatalog(conf)));
 } catch (NoSuchObjectException e) {

Review comment:
   Will NoSuchObjectException come here? From the interface, it seems not.

##
File path: ql/src/java/org/apache/hadoop/hive/ql/metadata/Hive.java
##
@@ -5661,184 +5663,79 @@ public void dropConstraint(String dbName, String 
tableName, String constraintNam
 }
   }
 
-  public List getDefaultConstraintList(String dbName, 
String tblName) throws HiveException, NoSuchObjectException {
+  public SQLAllTableConstraints getTableConstraints(String dbName, String 
tblName) throws HiveException, NoSuchObjectException {
 try {
-  return getMSC().getDefaultConstraints(new 
DefaultConstraintsRequest(getDefaultCatalog(conf), dbName, tblName));
+  AllTableConstraintsRequest tableConstraintsRequest = new 
AllTableConstraintsRequest();
+  tableConstraintsRequest.setDbName(dbName);
+  tableConstraintsRequest.setTblName(tblName);
+  tableConstraintsRequest.setCatName(getDefaultCatalog(conf));
+  return getMSC().getAllTableConstraints(tableConstraintsRequest);
 } catch (NoSuchObjectException e) {
   throw e;
 } catch (Exception e) {
   throw new HiveException(e);
 }
   }
-
-  public List getCheckConstraintList(String dbName, String 
tblName) throws HiveException, NoSuchObjectException {
-try {
-  return getMSC().getCheckConstraints(new 
CheckConstraintsRequest(getDefaultCatalog(conf),
-  dbName, tblName));
-} catch (NoSuchObjectException e) {
-  throw e;
-} catch (Exception e) {
-  throw new HiveException(e);
-}
+  public TableConstraintsInfo getAllTableConstraints(String dbName, String 
tblName) throws HiveException {
+return getTableConstraints(dbName, tblName, false, false);
   }
 
-  /**
-   * Get all primary key columns associated with the table.
-   *
-   * @param dbName Database Name
-   * @param tblName Table Name
-   * @return Primary Key associated with the table.
-   * @throws HiveException
-   */
-  public PrimaryKeyInfo getPrimaryKeys(String dbName, String tblName) throws 
HiveException {
-return getPrimaryKeys(dbName, tblName, false);
+  public TableConstraintsInfo getReliableAndEnableTableConstraints(String 
dbName, String tblName) throws HiveException {
+return getTableConstraints(dbName, tblName, true, true);
   }
 
-  /**
-   * Get primary key columns associated with the table that are available for 
optimization.
-   *
-   * @param dbName Database Name
-   * @param tblName Table Name
-   * @return Primary Key associated with the table.
-   * @throws HiveException
-   */
-  public PrimaryKeyInfo getReliablePrimaryKeys(String dbName, String tblName) 
throws HiveException {
-return getPrimaryKeys(dbName, tblName, true);
-  }
-
-  private PrimaryKeyInfo getPrimaryKeys(String dbName, String tblName, boolean 
onlyReliable)
+  private TableConstraintsInfo getTableConstraints(String dbName, String 
tblName, boolean reliable, boolean enable)
   throws HiveException {
 PerfLogger perfLogger = SessionState.getPerfLogger();
-perfLogger.perfLogBegin(CLASS_NAME, PerfLogger.HIVE_GET_PK);
-try {
-  List primaryKeys = getMSC().getPrimaryKeys(new 
PrimaryKeysRequest(dbName, tblName));
-  if (onlyReliable && primaryKeys != null && !primaryKeys.isEmpty()) {
-primaryKeys = primaryKeys.

[jira] [Work logged] (HIVE-22782) Consolidate metastore call to fetch constraints

2020-09-13 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-22782?focusedWorklogId=483646&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-483646
 ]

ASF GitHub Bot logged work on HIVE-22782:
-

Author: ASF GitHub Bot
Created on: 13/Sep/20 14:54
Start Date: 13/Sep/20 14:54
Worklog Time Spent: 10m 
  Work Description: sankarh commented on a change in pull request #1419:
URL: https://github.com/apache/hive/pull/1419#discussion_r487538671



##
File path: ql/src/java/org/apache/hadoop/hive/ql/metadata/Hive.java
##
@@ -5661,184 +5664,69 @@ public void dropConstraint(String dbName, String 
tableName, String constraintNam
 }
   }
 
-  public List getDefaultConstraintList(String dbName, 
String tblName) throws HiveException, NoSuchObjectException {
-try {
-  return getMSC().getDefaultConstraints(new 
DefaultConstraintsRequest(getDefaultCatalog(conf), dbName, tblName));
-} catch (NoSuchObjectException e) {
-  throw e;
-} catch (Exception e) {
-  throw new HiveException(e);
-}
-  }
-
-  public List getCheckConstraintList(String dbName, String 
tblName) throws HiveException, NoSuchObjectException {
+  public SQLAllTableConstraints getTableConstraints(String dbName, String 
tblName)
+  throws HiveException, NoSuchObjectException {
 try {
-  return getMSC().getCheckConstraints(new 
CheckConstraintsRequest(getDefaultCatalog(conf),
-  dbName, tblName));
+  return getMSC().getAllTableConstraints(new 
AllTableConstraintsRequest(dbName, tblName, getDefaultCatalog(conf)));
 } catch (NoSuchObjectException e) {

Review comment:
   Will NoSuchObjectException come here? From the interface, it seems 
not. Remove the handling here too.







Issue Time Tracking
---

Worklog Id: (was: 483646)
Time Spent: 5h 50m  (was: 5h 40m)

> Consolidate metastore call to fetch constraints
> ---
>
> Key: HIVE-22782
> URL: https://issues.apache.org/jira/browse/HIVE-22782
> Project: Hive
>  Issue Type: Improvement
>  Components: Query Planning
>Reporter: Vineet Garg
>Assignee: Ashish Sharma
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 5h 50m
>  Remaining Estimate: 0h
>
> Currently separate calls are made to the metastore to fetch constraints like 
> PK, FK, not null, etc. Since the planner always retrieves these constraints, 
> we should retrieve all of them in one call.





[jira] [Work logged] (HIVE-22782) Consolidate metastore call to fetch constraints

2020-09-13 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-22782?focusedWorklogId=483647&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-483647
 ]

ASF GitHub Bot logged work on HIVE-22782:
-

Author: ASF GitHub Bot
Created on: 13/Sep/20 15:12
Start Date: 13/Sep/20 15:12
Worklog Time Spent: 10m 
  Work Description: sankarh commented on a change in pull request #1419:
URL: https://github.com/apache/hive/pull/1419#discussion_r487540360



##
File path: ql/src/java/org/apache/hadoop/hive/ql/metadata/Table.java
##
@@ -1176,145 +1166,101 @@ public void setStatsStateLikeNewTable() {
*  Note that set apis are used by DESCRIBE only, although get apis return 
RELY or ENABLE
*  constraints DESCRIBE could set all type of constraints
* */
-
-  /* This only return PK which are created with RELY */
-  public PrimaryKeyInfo getPrimaryKeyInfo() {
-if(!this.isPKFetched) {
+  public TableConstraintsInfo getTableConstraintsInfo() {
+if (!this.isTableConstraintsFetched) {
   try {
-pki = Hive.get().getReliablePrimaryKeys(this.getDbName(), 
this.getTableName());
-this.isPKFetched = true;
+tableConstraintsInfo = 
Hive.get().getTableConstraints(this.getDbName(), this.getTableName(), true, 
true);
+this.isTableConstraintsFetched = true;
   } catch (HiveException e) {
-LOG.warn("Cannot retrieve PK info for table : " + this.getTableName()
-+ " ignoring exception: " + e);
+LOG.warn(

Review comment:
   Callers assume tableConstraintsInfo won't be null after invoking 
this method, but it can be null if there is an exception.
   We can initialize it with the default constructor in this flow.
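
   A minimal sketch of the suggestion above, under stated assumptions: 
ConstraintsHolder stands in for TableConstraintsInfo, and the Callable stands 
in for the metastore call. Both names are hypothetical. On failure the getter 
returns an empty object instead of null, and the fetched flag stays false so a 
later call can retry.

```java
import java.util.Collections;
import java.util.List;
import java.util.concurrent.Callable;

// Hypothetical holder; stands in for TableConstraintsInfo.
final class ConstraintsHolder {
  final List<String> constraintNames;

  // Default constructor: an empty, non-null result.
  ConstraintsHolder() {
    this(Collections.emptyList());
  }

  ConstraintsHolder(List<String> constraintNames) {
    this.constraintNames = constraintNames;
  }
}

// Hypothetical lazy getter illustrating the exception flow.
final class LazyConstraints {
  private final Callable<ConstraintsHolder> fetcher; // stands in for the metastore call
  private ConstraintsHolder info;
  private boolean fetched = false;

  LazyConstraints(Callable<ConstraintsHolder> fetcher) {
    this.fetcher = fetcher;
  }

  ConstraintsHolder get() {
    if (!fetched) {
      try {
        ConstraintsHolder result = fetcher.call();
        info = (result != null) ? result : new ConstraintsHolder();
        fetched = true;
      } catch (Exception e) {
        // Fall back to an empty object so callers never see null.
        // fetched stays false, so the next call tries the fetch again.
        info = new ConstraintsHolder();
      }
    }
    return info;
  }
}
```

   With this shape, a check on info != null alone cannot tell a real result 
from the fallback, which is why a separate flag is still useful here.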

##
File path: ql/src/java/org/apache/hadoop/hive/ql/metadata/Table.java
##
@@ -116,22 +116,12 @@
   private transient Boolean outdatedForRewritingMaterializedView;
 
   /** Constraint related objects */
-  private transient PrimaryKeyInfo pki;
-  private transient ForeignKeyInfo fki;
-  private transient UniqueConstraint uki;
-  private transient NotNullConstraint nnc;
-  private transient DefaultConstraint dc;
-  private transient CheckConstraint cc;
+  private transient TableConstraintsInfo tableConstraintsInfo;
 
   /** Constraint related flags
*  This is to track if constraints are retrieved from metastore or not
*/
-  private transient boolean isPKFetched=false;
-  private transient boolean isFKFetched=false;
-  private transient boolean isUniqueFetched=false;
-  private transient boolean isNotNullFetched=false;
-  private transient boolean isDefaultFetched=false;
-  private transient boolean isCheckFetched=false;
+  private transient boolean isTableConstraintsFetched = false;

Review comment:
   We could remove this flag, as we can check tableConstraintsInfo != null 
instead. Btw, we do need this flag if we use the default constructor in the 
exception flow of the getTableConstraintsInfo() method.

##
File path: 
standalone-metastore/metastore-common/src/main/java/org/apache/hadoop/hive/metastore/IMetaStoreClient.java
##
@@ -3631,6 +3631,17 @@ boolean cacheFileMetadata(String dbName, String 
tableName, String partName,
   List getCheckConstraints(CheckConstraintsRequest 
request) throws MetaException,
   NoSuchObjectException, TException;
 
+  /**
+   * Get all constraints of given table
+   * @param request Request info
+   * @return all constraints of this table
+   * @throws MetaException
+   * @throws NoSuchObjectException
+   * @throws TException
+   */
+  SQLAllTableConstraints getAllTableConstraints(AllTableConstraintsRequest 
request)

Review comment:
   I can still see it here.

##
File path: 
standalone-metastore/metastore-common/src/main/thrift/hive_metastore.thrift
##
@@ -720,6 +729,15 @@ struct CheckConstraintsResponse {
   1: required list checkConstraints
 }
 
+struct AllTableConstraintsRequest {
+  1: required string dbName,
+  2: required string tblName,
+  3: required string catName

Review comment:
   catName can be optional.

##
File path: 
standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/HiveMetaStore.java
##
@@ -9330,6 +9330,31 @@ public CheckConstraintsResponse 
get_check_constraints(CheckConstraintsRequest re
   return new CheckConstraintsResponse(ret);
 }
 
+/**
+ * Api to fetch all table constraints at once
+ * @param request it consist of catalog name, database name and table name 
to identify the table in metastore
+ * @return all constraints attached to given table
+ * @throws TException
+ */
+@Override
+public AllTableConstraintsResponse 
get_all_table_constraints(AllTableConstraintsRequest request) throws TException 
{

Review comment:
   This method also throws MetaException. 

##
File path: 
standalone-metastore/metastore-common/src/main/thrift/

[jira] [Work logged] (HIVE-24154) Missing simplification opportunity with IN and EQUALS clauses

2020-09-13 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-24154?focusedWorklogId=483659&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-483659
 ]

ASF GitHub Bot logged work on HIVE-24154:
-

Author: ASF GitHub Bot
Created on: 13/Sep/20 15:48
Start Date: 13/Sep/20 15:48
Worklog Time Spent: 10m 
  Work Description: jcamachor commented on a change in pull request #1492:
URL: https://github.com/apache/hive/pull/1492#discussion_r487545193



##
File path: 
ql/src/java/org/apache/hadoop/hive/ql/optimizer/calcite/rules/HivePointLookupOptimizerRule.java
##
@@ -599,7 +587,7 @@ public ConstraintGroup(RexNode rexNode) {
   if (constraint == null) {
 throw new SemanticException("Unable to find constraint which was 
earlier added.");
   }
-  ret.add(constraint.exprNode);
+  ret.add(constraint.constNode);

Review comment:
   I was also surprised we did not catch it before! But that's OK, it was 
working as expected and now they are swapped back! 







Issue Time Tracking
---

Worklog Id: (was: 483659)
Time Spent: 0.5h  (was: 20m)

> Missing simplification opportunity with IN and EQUALS clauses
> -
>
> Key: HIVE-24154
> URL: https://issues.apache.org/jira/browse/HIVE-24154
> Project: Hive
>  Issue Type: Improvement
>  Components: CBO
>Reporter: Jesus Camacho Rodriguez
>Assignee: Jesus Camacho Rodriguez
>Priority: Minor
>  Labels: pull-request-available
>  Time Spent: 0.5h
>  Remaining Estimate: 0h
>
> For instance, in perf driver CBO query 74, there are several filters that 
> could be simplified further:
> {code}
> HiveFilter(condition=[AND(=($1, 1999), IN($1, 1998, 1999))])
> {code}
> This may lead to incorrect estimates and unnecessary execution time.





[jira] [Work logged] (HIVE-24154) Missing simplification opportunity with IN and EQUALS clauses

2020-09-13 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-24154?focusedWorklogId=483661&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-483661
 ]

ASF GitHub Bot logged work on HIVE-24154:
-

Author: ASF GitHub Bot
Created on: 13/Sep/20 16:03
Start Date: 13/Sep/20 16:03
Worklog Time Spent: 10m 
  Work Description: jcamachor commented on a change in pull request #1492:
URL: https://github.com/apache/hive/pull/1492#discussion_r487546716



##
File path: 
ql/src/java/org/apache/hadoop/hive/ql/optimizer/calcite/rules/HivePointLookupOptimizerRule.java
##
@@ -734,6 +721,30 @@ private RexNode useStructIfNeeded(List 
columns) {
   }
   operands.remove(i);
   --i;
+} else if (operand.getKind() == SqlKind.EQUALS) {
+  Constraint c = Constraint.of(operand);
+  if (c == null || !HiveCalciteUtil.isDeterministic(c.exprNode)) {
+continue;
+  }
+  String ref = c.exprNode.toString();
+  stringToExpr.put(ref, c.exprNode);
+  if (inLHSExprToRHSExprs.containsKey(ref)) {
+String expr = c.constNode.toString();
+stringToExpr.put(expr, c.constNode);

Review comment:
   Makes sense, I will simplify this code.







Issue Time Tracking
---

Worklog Id: (was: 483661)
Time Spent: 50m  (was: 40m)

> Missing simplification opportunity with IN and EQUALS clauses
> -
>
> Key: HIVE-24154
> URL: https://issues.apache.org/jira/browse/HIVE-24154
> Project: Hive
>  Issue Type: Improvement
>  Components: CBO
>Reporter: Jesus Camacho Rodriguez
>Assignee: Jesus Camacho Rodriguez
>Priority: Minor
>  Labels: pull-request-available
>  Time Spent: 50m
>  Remaining Estimate: 0h
>
> For instance, in perf driver CBO query 74, there are several filters that 
> could be simplified further:
> {code}
> HiveFilter(condition=[AND(=($1, 1999), IN($1, 1998, 1999))])
> {code}
> This may lead to incorrect estimates and unnecessary execution time.





[jira] [Work logged] (HIVE-24154) Missing simplification opportunity with IN and EQUALS clauses

2020-09-13 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-24154?focusedWorklogId=483660&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-483660
 ]

ASF GitHub Bot logged work on HIVE-24154:
-

Author: ASF GitHub Bot
Created on: 13/Sep/20 16:03
Start Date: 13/Sep/20 16:03
Worklog Time Spent: 10m 
  Work Description: jcamachor commented on a change in pull request #1492:
URL: https://github.com/apache/hive/pull/1492#discussion_r487546646



##
File path: 
ql/src/test/results/clientpositive/perf/tez/constraints/cbo_query74.q.out
##
@@ -147,7 +147,7 @@ HiveSortLimit(sort0=[$2], sort1=[$0], sort2=[$1], 
dir0=[ASC], dir1=[ASC], dir2=[
   HiveProject(ss_sold_date_sk=[$0], 
ss_customer_sk=[$3], ss_net_paid=[$20])
 HiveTableScan(table=[[default, store_sales]], 
table:alias=[store_sales])
   HiveProject(d_date_sk=[$0])
-HiveFilter(condition=[AND(=($1, 1999), IN($1, 1998, 
1999))])
+HiveFilter(condition=[=($1, 1999)])

Review comment:
   What seems to happen is that we close these IN clauses before we call 
the join propagation rule, which leads to the inference of new predicates. Then 
the RexSimplify does not deal with the IN+EQUALS clauses and the 
HivePointLookup cannot deal with the EQUALS.
   We could possibly change the rules order too but I was not brave enough to 
do that :) Plus I think supporting the degenerate case of EQUALS in 
HivePointLookup should be fine.
   Iirc there is work going on in Calcite to finally have more extensive 
support for IN in RexSimplify, which you suggested some time ago... I hope that 
work goes in and we can simplify this whole code path.
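
   The degenerate case mentioned above can be sketched as follows. This is an 
illustration only, not the code in the patch: InEqualsSketch is a hypothetical 
class, constants are plain integers, and the result is returned as plan-like 
text. It treats the EQUALS as a one-element IN and intersects the two constant 
sets. Returning false for an empty intersection is assumed to be acceptable 
only in a filter, where rows with a NULL or FALSE condition are both dropped.

```java
import java.util.Collections;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

// Hypothetical illustration of AND(=(col, c), IN(col, c1, ..., cn)).
final class InEqualsSketch {
  static String simplify(String column, int equalsConstant, List<Integer> inConstants) {
    // Treat the EQUALS as IN(col, c) and intersect with the IN list.
    Set<Integer> remaining = new LinkedHashSet<>(inConstants);
    remaining.retainAll(Collections.singleton(equalsConstant));
    if (remaining.isEmpty()) {
      // The two predicates can never both hold.
      return "false";
    }
    // Only the EQUALS constant survives, so the IN is redundant.
    return "=(" + column + ", " + equalsConstant + ")";
  }
}
```

   For the filter in cbo_query74, simplify("$1", 1999, [1998, 1999]) yields 
=($1, 1999), which matches the updated q.out line.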







Issue Time Tracking
---

Worklog Id: (was: 483660)
Time Spent: 40m  (was: 0.5h)

> Missing simplification opportunity with IN and EQUALS clauses
> -
>
> Key: HIVE-24154
> URL: https://issues.apache.org/jira/browse/HIVE-24154
> Project: Hive
>  Issue Type: Improvement
>  Components: CBO
>Reporter: Jesus Camacho Rodriguez
>Assignee: Jesus Camacho Rodriguez
>Priority: Minor
>  Labels: pull-request-available
>  Time Spent: 40m
>  Remaining Estimate: 0h
>
> For instance, in perf driver CBO query 74, there are several filters that 
> could be simplified further:
> {code}
> HiveFilter(condition=[AND(=($1, 1999), IN($1, 1998, 1999))])
> {code}
> This may lead to incorrect estimates and unnecessary execution time.





[jira] [Work logged] (HIVE-24154) Missing simplification opportunity with IN and EQUALS clauses

2020-09-13 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-24154?focusedWorklogId=483662&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-483662
 ]

ASF GitHub Bot logged work on HIVE-24154:
-

Author: ASF GitHub Bot
Created on: 13/Sep/20 16:04
Start Date: 13/Sep/20 16:04
Worklog Time Spent: 10m 
  Work Description: jcamachor commented on pull request #1492:
URL: https://github.com/apache/hive/pull/1492#issuecomment-691690227


   Thanks for checking, Zoltan! I had it in draft because I wanted to regenerate 
the q files before making it final, but the code was ready to be reviewed.





Issue Time Tracking
---

Worklog Id: (was: 483662)
Time Spent: 1h  (was: 50m)

> Missing simplification opportunity with IN and EQUALS clauses
> -
>
> Key: HIVE-24154
> URL: https://issues.apache.org/jira/browse/HIVE-24154
> Project: Hive
>  Issue Type: Improvement
>  Components: CBO
>Reporter: Jesus Camacho Rodriguez
>Assignee: Jesus Camacho Rodriguez
>Priority: Minor
>  Labels: pull-request-available
>  Time Spent: 1h
>  Remaining Estimate: 0h
>
> For instance, in perf driver CBO query 74, there are several filters that 
> could be simplified further:
> {code}
> HiveFilter(condition=[AND(=($1, 1999), IN($1, 1998, 1999))])
> {code}
> This may lead to incorrect estimates and unnecessary execution time.





[jira] [Work logged] (HIVE-22782) Consolidate metastore call to fetch constraints

2020-09-13 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-22782?focusedWorklogId=483663&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-483663
 ]

ASF GitHub Bot logged work on HIVE-22782:
-

Author: ASF GitHub Bot
Created on: 13/Sep/20 16:46
Start Date: 13/Sep/20 16:46
Worklog Time Spent: 10m 
  Work Description: ashish-kumar-sharma commented on a change in pull 
request #1419:
URL: https://github.com/apache/hive/pull/1419#discussion_r487551175



##
File path: ql/src/java/org/apache/hadoop/hive/ql/metadata/Table.java
##
@@ -116,22 +116,12 @@
   private transient Boolean outdatedForRewritingMaterializedView;
 
   /** Constraint related objects */
-  private transient PrimaryKeyInfo pki;
-  private transient ForeignKeyInfo fki;
-  private transient UniqueConstraint uki;
-  private transient NotNullConstraint nnc;
-  private transient DefaultConstraint dc;
-  private transient CheckConstraint cc;
+  private transient TableConstraintsInfo tableConstraintsInfo;
 
   /** Constraint related flags
*  This is to track if constraints are retrieved from metastore or not
*/
-  private transient boolean isPKFetched=false;
-  private transient boolean isFKFetched=false;
-  private transient boolean isUniqueFetched=false;
-  private transient boolean isNotNullFetched=false;
-  private transient boolean isDefaultFetched=false;
-  private transient boolean isCheckFetched=false;
+  private transient boolean isTableConstraintsFetched = false;

Review comment:
   Currently we don't have any implementation that calls the default 
constructor. But in order to have a parameterised constructor we also need a 
default constructor. As a result, some future implementation might 
use the default constructor. So it's better to move back to 
flag-based verification.







Issue Time Tracking
---

Worklog Id: (was: 483663)
Time Spent: 6h 10m  (was: 6h)

> Consolidate metastore call to fetch constraints
> ---
>
> Key: HIVE-22782
> URL: https://issues.apache.org/jira/browse/HIVE-22782
> Project: Hive
>  Issue Type: Improvement
>  Components: Query Planning
>Reporter: Vineet Garg
>Assignee: Ashish Sharma
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 6h 10m
>  Remaining Estimate: 0h
>
> Currently, separate calls are made to the metastore to fetch constraints such 
> as PK, FK, NOT NULL, etc. Since the planner always retrieves these 
> constraints, we should retrieve all of them in one call.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (HIVE-22782) Consolidate metastore call to fetch constraints

2020-09-13 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-22782?focusedWorklogId=483665&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-483665
 ]

ASF GitHub Bot logged work on HIVE-22782:
-

Author: ASF GitHub Bot
Created on: 13/Sep/20 16:50
Start Date: 13/Sep/20 16:50
Worklog Time Spent: 10m 
  Work Description: ashish-kumar-sharma commented on a change in pull 
request #1419:
URL: https://github.com/apache/hive/pull/1419#discussion_r487551559



##
File path: ql/src/java/org/apache/hadoop/hive/ql/metadata/Table.java
##
@@ -1176,145 +1166,101 @@ public void setStatsStateLikeNewTable() {
*  Note that set apis are used by DESCRIBE only, although get apis return RELY or ENABLE
*  constraints DESCRIBE could set all type of constraints
* */
-
-  /* This only return PK which are created with RELY */
-  public PrimaryKeyInfo getPrimaryKeyInfo() {
-if(!this.isPKFetched) {
+  public TableConstraintsInfo getTableConstraintsInfo() {
+if (!this.isTableConstraintsFetched) {
   try {
-pki = Hive.get().getReliablePrimaryKeys(this.getDbName(), this.getTableName());
-this.isPKFetched = true;
+tableConstraintsInfo = Hive.get().getReliableAndEnableTableConstraints(this.getDbName(), this.getTableName());
+this.isTableConstraintsFetched = true;
   } catch (HiveException e) {
-LOG.warn("Cannot retrieve PK info for table : " + this.getTableName()
-+ " ignoring exception: " + e);
+LOG.warn(
+"Cannot retrieve table constraints info for table : " + this.getTableName() + " ignoring exception: " + e);
   }
 }
-return pki;
+return tableConstraintsInfo;
   }
 
-  public void setPrimaryKeyInfo(PrimaryKeyInfo pki) {
-this.pki = pki;
-this.isPKFetched = true;
+  /**
+   * TableConstraintsInfo setter
+   * @param tableConstraintsInfo
+   */
+  public void setTableConstraintsInfo(TableConstraintsInfo tableConstraintsInfo) {
+this.tableConstraintsInfo = tableConstraintsInfo;
+this.isTableConstraintsFetched = true;
   }
 
-  /* This only return FK constraints which are created with RELY */
-  public ForeignKeyInfo getForeignKeyInfo() {
-if(!isFKFetched) {
-  try {
-fki = Hive.get().getReliableForeignKeys(this.getDbName(), this.getTableName());
-this.isFKFetched = true;
-  } catch (HiveException e) {
-LOG.warn(
-"Cannot retrieve FK info for table : " + this.getTableName()
-+ " ignoring exception: " + e);
-  }
+  /**
+   * This only return PK which are created with RELY
+   * @return primary key constraint list
+   */
+  public PrimaryKeyInfo getPrimaryKeyInfo() {
+if (!this.isTableConstraintsFetched) {
+  getTableConstraintsInfo();
 }
-return fki;
+return tableConstraintsInfo.getPrimaryKeyInfo();
   }
 
-  public void setForeignKeyInfo(ForeignKeyInfo fki) {
-this.fki = fki;
-this.isFKFetched = true;
+  /**
+   * This only return FK constraints which are created with RELY
+   * @return foreign key constraint list
+   */
+  public ForeignKeyInfo getForeignKeyInfo() {
+if (!isTableConstraintsFetched) {
+  getTableConstraintsInfo();
+}
+return tableConstraintsInfo.getForeignKeyInfo();
   }
 
-  /* This only return UNIQUE constraint defined with RELY */
+  /**
+   * This only return UNIQUE constraint defined with RELY
+   * @return unique constraint list
+   */
   public UniqueConstraint getUniqueKeyInfo() {
-if(!isUniqueFetched) {
-  try {
-uki = Hive.get().getReliableUniqueConstraints(this.getDbName(), this.getTableName());
-this.isUniqueFetched = true;
-  } catch (HiveException e) {
-LOG.warn(
-"Cannot retrieve Unique Key info for table : " + this.getTableName()
-+ " ignoring exception: " + e);
-  }
+if (!isTableConstraintsFetched) {
+  getTableConstraintsInfo();
 }
-return uki;
-  }
-
-  public void setUniqueKeyInfo(UniqueConstraint uki) {
-this.uki = uki;
-this.isUniqueFetched = true;
+return tableConstraintsInfo.getUniqueConstraint();
   }
 
-  /* This only return NOT NULL constraint defined with RELY */
+  /**
+   * This only return NOT NULL constraint defined with RELY
+   * @return not null constraint list
+   */
   public NotNullConstraint getNotNullConstraint() {
-if(!isNotNullFetched) {
-  try {
-nnc = Hive.get().getReliableNotNullConstraints(this.getDbName(), this.getTableName());
-this.isNotNullFetched = true;
-  } catch (HiveException e) {
-LOG.warn("Cannot retrieve Not Null constraint info for table : "
-+ this.getTableName() + " ignoring exception: " + e);
-  }
+if (!isTableConstraintsFetched) {
+  getTableConstraintsInfo();
 }
-return nnc;
-  }
-
-  public void setNotNullConstraint(NotNullCon

[jira] [Work logged] (HIVE-24155) Upgrade Arrow version to 1.0.1

2020-09-13 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-24155?focusedWorklogId=483669&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-483669
 ]

ASF GitHub Bot logged work on HIVE-24155:
-

Author: ASF GitHub Bot
Created on: 13/Sep/20 18:06
Start Date: 13/Sep/20 18:06
Worklog Time Spent: 10m 
  Work Description: medb opened a new pull request #1493:
URL: https://github.com/apache/hive/pull/1493


   Change-Id: I951d7649dfa2f1a4dedf660fd9517a2aaa7b43d5



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 483669)
Remaining Estimate: 0h
Time Spent: 10m

> Upgrade Arrow version to 1.0.1
> --
>
> Key: HIVE-24155
> URL: https://issues.apache.org/jira/browse/HIVE-24155
> Project: Hive
>  Issue Type: Task
>Affects Versions: 3.1.2
>Reporter: Igor Dvorzhak
>Priority: Major
>  Time Spent: 10m
>  Remaining Estimate: 0h
>




--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (HIVE-24155) Upgrade Arrow version to 1.0.1

2020-09-13 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-24155?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated HIVE-24155:
--
Labels: pull-request-available  (was: )

> Upgrade Arrow version to 1.0.1
> --
>
> Key: HIVE-24155
> URL: https://issues.apache.org/jira/browse/HIVE-24155
> Project: Hive
>  Issue Type: Task
>Affects Versions: 3.1.2
>Reporter: Igor Dvorzhak
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 10m
>  Remaining Estimate: 0h
>




--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (HIVE-24156) Hive 3 explicity returning select cast("0" as boolean) as false even though hive.lazysimple.extended_boolean_literal is set to false

2020-09-13 Thread Mudit Sharma (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-24156?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mudit Sharma updated HIVE-24156:

Summary: Hive 3 explicity returning select cast("0" as boolean) as false 
even though hive.lazysimple.extended_boolean_literal is set to false  (was: 
Hive 3 explicity returning select cast("0" as boolean) as true even though 
hive.lazysimple.extended_boolean_literal is set to false)

> Hive 3 explicity returning select cast("0" as boolean) as false even though 
> hive.lazysimple.extended_boolean_literal is set to false
> 
>
> Key: HIVE-24156
> URL: https://issues.apache.org/jira/browse/HIVE-24156
> Project: Hive
>  Issue Type: Improvement
>Reporter: Mudit Sharma
>Priority: Major
>
> As per https://issues.apache.org/jira/browse/HIVE-3635, select cast("0" as 
> boolean) should return false only if 
> hive.lazysimple.extended_boolean_literal is set to true. Even though the 
> config is set to false, the query still returns false.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (HIVE-24156) Hive 3 explicity returning select cast("0" as boolean) as false even though hive.lazysimple.extended_boolean_literal is set to false

2020-09-13 Thread Mudit Sharma (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-24156?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mudit Sharma updated HIVE-24156:

Labels: TODOC  (was: )

> Hive 3 explicity returning select cast("0" as boolean) as false even though 
> hive.lazysimple.extended_boolean_literal is set to false
> 
>
> Key: HIVE-24156
> URL: https://issues.apache.org/jira/browse/HIVE-24156
> Project: Hive
>  Issue Type: Improvement
>Reporter: Mudit Sharma
>Priority: Major
>  Labels: TODOC
>
> As per https://issues.apache.org/jira/browse/HIVE-3635, select cast("0" as 
> boolean) should return false only if 
> hive.lazysimple.extended_boolean_literal is set to true. Even though the 
> config is set to false, the query still returns false.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (HIVE-23795) Add Additional Debugging Help for Import SQL

2020-09-13 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-23795?focusedWorklogId=483716&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-483716
 ]

ASF GitHub Bot logged work on HIVE-23795:
-

Author: ASF GitHub Bot
Created on: 14/Sep/20 00:46
Start Date: 14/Sep/20 00:46
Worklog Time Spent: 10m 
  Work Description: github-actions[bot] closed pull request #1199:
URL: https://github.com/apache/hive/pull/1199


   



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 483716)
Time Spent: 50m  (was: 40m)

> Add Additional Debugging Help for Import SQL
> 
>
> Key: HIVE-23795
> URL: https://issues.apache.org/jira/browse/HIVE-23795
> Project: Hive
>  Issue Type: Improvement
>Reporter: David Mollitor
>Assignee: David Mollitor
>Priority: Minor
>  Labels: pull-request-available
>  Time Spent: 50m
>  Remaining Estimate: 0h
>
> Add some things that were helpful to me when I was recently debugging an 
> issue with importing SQL.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (HIVE-23838) KafkaRecordIteratorTest is flaky

2020-09-13 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-23838?focusedWorklogId=483715&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-483715
 ]

ASF GitHub Bot logged work on HIVE-23838:
-

Author: ASF GitHub Bot
Created on: 14/Sep/20 00:46
Start Date: 14/Sep/20 00:46
Worklog Time Spent: 10m 
  Work Description: github-actions[bot] commented on pull request #1245:
URL: https://github.com/apache/hive/pull/1245#issuecomment-691752817


   This pull request has been automatically marked as stale because it has not 
had recent activity. It will be closed if no further activity occurs.
   Feel free to reach out on the d...@hive.apache.org list if the patch is in 
need of reviews.



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 483715)
Time Spent: 1h 40m  (was: 1.5h)

> KafkaRecordIteratorTest is flaky
> 
>
> Key: HIVE-23838
> URL: https://issues.apache.org/jira/browse/HIVE-23838
> Project: Hive
>  Issue Type: Bug
>  Components: kafka integration
>Affects Versions: 4.0.0
>Reporter: Karen Coppage
>Assignee: Karen Coppage
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 1h 40m
>  Remaining Estimate: 0h
>
> Failed on [4th run of flaky test 
> checker|http://ci.hive.apache.org/job/hive-flaky-check/69/] with
> org.apache.kafka.common.errors.TimeoutException: Timeout expired after 
> 1milliseconds while awaiting InitProducerId



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (HIVE-23793) Review of QueryInfo Class

2020-09-13 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-23793?focusedWorklogId=483717&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-483717
 ]

ASF GitHub Bot logged work on HIVE-23793:
-

Author: ASF GitHub Bot
Created on: 14/Sep/20 00:46
Start Date: 14/Sep/20 00:46
Worklog Time Spent: 10m 
  Work Description: github-actions[bot] commented on pull request #1197:
URL: https://github.com/apache/hive/pull/1197#issuecomment-691752836


   This pull request has been automatically marked as stale because it has not 
had recent activity. It will be closed if no further activity occurs.
   Feel free to reach out on the d...@hive.apache.org list if the patch is in 
need of reviews.



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 483717)
Time Spent: 2.5h  (was: 2h 20m)

> Review of QueryInfo Class
> -
>
> Key: HIVE-23793
> URL: https://issues.apache.org/jira/browse/HIVE-23793
> Project: Hive
>  Issue Type: Improvement
>Reporter: David Mollitor
>Assignee: David Mollitor
>Priority: Minor
>  Labels: pull-request-available
>  Time Spent: 2.5h
>  Remaining Estimate: 0h
>




--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (HIVE-23805) ValidReadTxnList need not be constructed multiple times in AcidUtils::getAcidState

2020-09-13 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-23805?focusedWorklogId=483718&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-483718
 ]

ASF GitHub Bot logged work on HIVE-23805:
-

Author: ASF GitHub Bot
Created on: 14/Sep/20 00:47
Start Date: 14/Sep/20 00:47
Worklog Time Spent: 10m 
  Work Description: github-actions[bot] closed pull request #1224:
URL: https://github.com/apache/hive/pull/1224


   



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 483718)
Time Spent: 1h  (was: 50m)

> ValidReadTxnList need not be constructed multiple times in 
> AcidUtils::getAcidState 
> ---
>
> Key: HIVE-23805
> URL: https://issues.apache.org/jira/browse/HIVE-23805
> Project: Hive
>  Issue Type: Improvement
>Reporter: Rajesh Balamohan
>Assignee: Peter Varga
>Priority: Major
>  Labels: pull-request-available
> Attachments: Screenshot 2020-07-06 at 4.53.44 PM.png
>
>  Time Spent: 1h
>  Remaining Estimate: 0h
>
> [https://github.com/apache/hive/blob/master/ql/src/java/org/apache/hadoop/hive/ql/io/AcidUtils.java#L1273]
> [https://github.com/apache/hive/blob/master/ql/src/java/org/apache/hadoop/hive/ql/io/AcidUtils.java#L1286]
>  
> {code:java}
> String s = conf.get(ValidTxnList.VALID_TXNS_KEY);
> if (!Strings.isNullOrEmpty(s)) {
>   ...
>   ...
>   validTxnList.readFromString(s);
> }
> {code}
>  
>  
> !Screenshot 2020-07-06 at 4.53.44 PM.png|width=610,height=621!
> The AM spends a good amount of CPU parsing the same validtxnlist multiple times.
>  
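
One possible way to avoid the repeated parsing described above is to memoize the parsed result per serialized string. The sketch below is only an illustration under that assumption and is not necessarily how the issue was resolved: `ParseOnceCache` is a hypothetical helper, the parser is passed in as a plain `Function`, and the class is not thread-safe.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

/** Hypothetical helper: parses each distinct serialized string at most once. */
class ParseOnceCache<T> {
  private final Map<String, T> cache = new HashMap<>();
  private final Function<String, T> parser; // the expensive parse step
  int parseCount = 0;                       // exposed only so the sketch can be checked

  ParseOnceCache(Function<String, T> parser) {
    this.parser = parser;
  }

  T get(String serialized) {
    T parsed = cache.get(serialized);
    if (parsed == null) {
      parseCount++;
      parsed = parser.apply(serialized); // parser is assumed to return non-null
      cache.put(serialized, parsed);
    }
    return parsed;
  }
}
```

Passing the already parsed object to the callers that need it would avoid the cache entirely. The memoized variant is shown only because it keeps the call sites unchanged.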



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (HIVE-24146) Cleanup TaskExecutionException in GenericUDTFExplode

2020-09-13 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-24146?focusedWorklogId=483724&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-483724
 ]

ASF GitHub Bot logged work on HIVE-24146:
-

Author: ASF GitHub Bot
Created on: 14/Sep/20 01:34
Start Date: 14/Sep/20 01:34
Worklog Time Spent: 10m 
  Work Description: dengzhhu653 closed pull request #1483:
URL: https://github.com/apache/hive/pull/1483


   



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 483724)
Time Spent: 40m  (was: 0.5h)

> Cleanup TaskExecutionException in GenericUDTFExplode
> 
>
> Key: HIVE-24146
> URL: https://issues.apache.org/jira/browse/HIVE-24146
> Project: Hive
>  Issue Type: Improvement
>  Components: UDF
>Reporter: Zhihua Deng
>Assignee: Zhihua Deng
>Priority: Minor
>  Labels: pull-request-available
>  Time Spent: 40m
>  Remaining Estimate: 0h
>
> - Remove TaskExecutionException, which may not be used anymore;
> - Remove the default handling in GenericUDTFExplode#process, which has 
> already been verified during function initialization.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (HIVE-24146) Cleanup TaskExecutionException in GenericUDTFExplode

2020-09-13 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-24146?focusedWorklogId=483725&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-483725
 ]

ASF GitHub Bot logged work on HIVE-24146:
-

Author: ASF GitHub Bot
Created on: 14/Sep/20 01:35
Start Date: 14/Sep/20 01:35
Worklog Time Spent: 10m 
  Work Description: dengzhhu653 opened a new pull request #1483:
URL: https://github.com/apache/hive/pull/1483


   
   
   
   



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 483725)
Time Spent: 50m  (was: 40m)

> Cleanup TaskExecutionException in GenericUDTFExplode
> 
>
> Key: HIVE-24146
> URL: https://issues.apache.org/jira/browse/HIVE-24146
> Project: Hive
>  Issue Type: Improvement
>  Components: UDF
>Reporter: Zhihua Deng
>Assignee: Zhihua Deng
>Priority: Minor
>  Labels: pull-request-available
>  Time Spent: 50m
>  Remaining Estimate: 0h
>
> - Remove TaskExecutionException, which may not be used anymore;
> - Remove the default handling in GenericUDTFExplode#process, which has 
> already been verified during function initialization.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (HIVE-23800) Add hooks when HiveServer2 stops due to OutOfMemoryError

2020-09-13 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-23800?focusedWorklogId=483726&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-483726
 ]

ASF GitHub Bot logged work on HIVE-23800:
-

Author: ASF GitHub Bot
Created on: 14/Sep/20 01:36
Start Date: 14/Sep/20 01:36
Worklog Time Spent: 10m 
  Work Description: dengzhhu653 closed pull request #1205:
URL: https://github.com/apache/hive/pull/1205


   



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 483726)
Time Spent: 3h 50m  (was: 3h 40m)

> Add hooks when HiveServer2 stops due to OutOfMemoryError
> 
>
> Key: HIVE-23800
> URL: https://issues.apache.org/jira/browse/HIVE-23800
> Project: Hive
>  Issue Type: Improvement
>  Components: HiveServer2
>Reporter: Zhihua Deng
>Priority: Minor
>  Labels: pull-request-available
>  Time Spent: 3h 50m
>  Remaining Estimate: 0h
>
> Make the OOM hook an interface of HiveServer2, so users can implement the 
> hook to do something before HS2 stops, such as dumping the heap or alerting 
> the devops.
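
A pluggable hook of the kind described above could look roughly like the sketch below. Everything here is hypothetical and chosen for illustration: the names `OomHookSketch` and `OomHookRunner` and the `beforeStop` method are not taken from the HiveServer2 code base.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical hook interface: called before the server stops because of an OOM. */
interface OomHookSketch {
  void beforeStop(OutOfMemoryError error);
}

/** Hypothetical runner that invokes every registered hook in registration order. */
class OomHookRunner {
  private final List<OomHookSketch> hooks = new ArrayList<>();

  void register(OomHookSketch hook) {
    hooks.add(hook);
  }

  /** Runs all hooks and returns how many completed without throwing. */
  int runAll(OutOfMemoryError error) {
    int completed = 0;
    for (OomHookSketch hook : hooks) {
      try {
        hook.beforeStop(error);
        completed++;
      } catch (RuntimeException e) {
        // A failing hook must not keep the remaining hooks from running.
      }
    }
    return completed;
  }
}
```

Isolating each hook in its own `try` block is a design choice of this sketch: a heap-dump hook that fails should not prevent an alerting hook from firing.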



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (HIVE-23800) Add hooks when HiveServer2 stops due to OutOfMemoryError

2020-09-13 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-23800?focusedWorklogId=483727&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-483727
 ]

ASF GitHub Bot logged work on HIVE-23800:
-

Author: ASF GitHub Bot
Created on: 14/Sep/20 01:37
Start Date: 14/Sep/20 01:37
Worklog Time Spent: 10m 
  Work Description: dengzhhu653 opened a new pull request #1205:
URL: https://github.com/apache/hive/pull/1205


   



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 483727)
Time Spent: 4h  (was: 3h 50m)

> Add hooks when HiveServer2 stops due to OutOfMemoryError
> 
>
> Key: HIVE-23800
> URL: https://issues.apache.org/jira/browse/HIVE-23800
> Project: Hive
>  Issue Type: Improvement
>  Components: HiveServer2
>Reporter: Zhihua Deng
>Priority: Minor
>  Labels: pull-request-available
>  Time Spent: 4h
>  Remaining Estimate: 0h
>
> Make the OOM hook an interface of HiveServer2, so users can implement the 
> hook to do something before HS2 stops, such as dumping the heap or alerting 
> the devops.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (HIVE-24035) Add Jenkinsfile for branch-2.3

2020-09-13 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-24035?focusedWorklogId=483731&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-483731
 ]

ASF GitHub Bot logged work on HIVE-24035:
-

Author: ASF GitHub Bot
Created on: 14/Sep/20 02:07
Start Date: 14/Sep/20 02:07
Worklog Time Spent: 10m 
  Work Description: sunchao commented on pull request #1398:
URL: https://github.com/apache/hive/pull/1398#issuecomment-691770335


   @kgyrtkirk it is indeed a surefire issue - after upgrading to 3.0.0-M4 the 
issue was resolved and each pod was able to run only the tests assigned to it. 
The job still seems to time out, though, since one pod was assigned 
`TestMiniLlapLocalCliDriver`, which seems to be very slow. 
   
   Could you take another look? 



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 483731)
Time Spent: 3h  (was: 2h 50m)

> Add Jenkinsfile for branch-2.3
> --
>
> Key: HIVE-24035
> URL: https://issues.apache.org/jira/browse/HIVE-24035
> Project: Hive
>  Issue Type: Test
>Reporter: Chao Sun
>Assignee: Chao Sun
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 3h
>  Remaining Estimate: 0h
>
> To enable precommit tests for github PR, we need to have a Jenkinsfile in the 
> repo. This is already done for master and branch-2. This adds the same for 
> branch-2.3



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (HIVE-24035) Add Jenkinsfile for branch-2.3

2020-09-13 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-24035?focusedWorklogId=483732&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-483732
 ]

ASF GitHub Bot logged work on HIVE-24035:
-

Author: ASF GitHub Bot
Created on: 14/Sep/20 02:08
Start Date: 14/Sep/20 02:08
Worklog Time Spent: 10m 
  Work Description: sunchao edited a comment on pull request #1398:
URL: https://github.com/apache/hive/pull/1398#issuecomment-691770335


   @kgyrtkirk it is indeed a surefire issue - after upgrading to 3.0.0-M4 the 
issue was resolved and each pod was able to run only the tests assigned to it. 
The job still seems to time out, though, since one pod was assigned 
`TestMiniLlapLocalCliDriver`, which seems to be very slow. 
   
   Could you take another look? I'm not sure if it is easy to solve the timeout 
issue.
   



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 483732)
Time Spent: 3h 10m  (was: 3h)

> Add Jenkinsfile for branch-2.3
> --
>
> Key: HIVE-24035
> URL: https://issues.apache.org/jira/browse/HIVE-24035
> Project: Hive
>  Issue Type: Test
>Reporter: Chao Sun
>Assignee: Chao Sun
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 3h 10m
>  Remaining Estimate: 0h
>
> To enable precommit tests for github PR, we need to have a Jenkinsfile in the 
> repo. This is already done for master and branch-2. This adds the same for 
> branch-2.3



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (HIVE-24069) HiveHistory should log the task that ends abnormally

2020-09-13 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-24069?focusedWorklogId=483733&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-483733
 ]

ASF GitHub Bot logged work on HIVE-24069:
-

Author: ASF GitHub Bot
Created on: 14/Sep/20 02:09
Start Date: 14/Sep/20 02:09
Worklog Time Spent: 10m 
  Work Description: dengzhhu653 commented on pull request #1429:
URL: https://github.com/apache/hive/pull/1429#issuecomment-691770939


   @miklosgergely @kgyrtkirk Could you take a look at the changes?



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 483733)
Time Spent: 20m  (was: 10m)

> HiveHistory should log the task that ends abnormally
> 
>
> Key: HIVE-24069
> URL: https://issues.apache.org/jira/browse/HIVE-24069
> Project: Hive
>  Issue Type: Improvement
>  Components: HiveServer2
>Reporter: Zhihua Deng
>Priority: Minor
>  Labels: pull-request-available
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> When a task returns with an exitVal not equal to 0, the Executor skips 
> marking the task return code and calling endTask. This may leave the 
> history log incomplete for such tasks.
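
The behaviour asked for above, recording the return code and the end of the task regardless of the exit value, can be sketched as follows. This is a simplified illustration with hypothetical names (`TaskHistorySketch`, `runAndRecord`) and does not reproduce the HiveHistory or Executor APIs.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.IntSupplier;

/** Hypothetical history recorder: logs start, return code and end for every task. */
class TaskHistorySketch {
  final List<String> events = new ArrayList<>();

  int runAndRecord(String taskId, IntSupplier task) {
    events.add("start:" + taskId);
    int exitVal = task.getAsInt();
    // Record the return code and close the task even when exitVal != 0,
    // so that failed tasks do not leave the history incomplete.
    events.add("rc:" + taskId + "=" + exitVal);
    events.add("end:" + taskId);
    return exitVal;
  }
}
```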



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (HIVE-24106) Abort polling on the operation state when the current thread is interrupted

2020-09-13 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-24106?focusedWorklogId=483735&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-483735
 ]

ASF GitHub Bot logged work on HIVE-24106:
-

Author: ASF GitHub Bot
Created on: 14/Sep/20 02:14
Start Date: 14/Sep/20 02:14
Worklog Time Spent: 10m 
  Work Description: dengzhhu653 commented on pull request #1456:
URL: https://github.com/apache/hive/pull/1456#issuecomment-691771848


   @belugabehr @pvary any thoughts or comments on this? Thanks 



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 483735)
Time Spent: 20m  (was: 10m)

> Abort polling on the operation state when the current thread is interrupted
> ---
>
> Key: HIVE-24106
> URL: https://issues.apache.org/jira/browse/HIVE-24106
> Project: Hive
>  Issue Type: Improvement
>  Components: JDBC
>Reporter: Zhihua Deng
>Assignee: Zhihua Deng
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> If a HiveStatement is run asynchronously as a task, e.g. in a thread or 
> future, and the task is interrupted, the HiveStatement continues to poll the 
> operation state until it finishes. It may be better to provide a way to abort 
> the execution in such a case.
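
The idea of aborting the poll loop on interruption can be sketched generically as below. This is not the HiveStatement implementation: `InterruptiblePoller`, `pollUntilDone`, and the `BooleanSupplier` that stands in for the operation-state check are assumptions made for the example.

```java
import java.util.function.BooleanSupplier;

/** Hypothetical polling helper that stops as soon as the current thread is interrupted. */
class InterruptiblePoller {

  /**
   * Polls isDone until it reports completion.
   * Returns true when the operation finished, false when polling was aborted
   * because the thread was interrupted. The interrupt flag is left set so the
   * caller can still observe the interruption.
   */
  static boolean pollUntilDone(BooleanSupplier isDone, long sleepMillis) {
    while (!isDone.getAsBoolean()) {
      if (Thread.currentThread().isInterrupted()) {
        return false;
      }
      try {
        Thread.sleep(sleepMillis);
      } catch (InterruptedException e) {
        Thread.currentThread().interrupt(); // restore the flag cleared by sleep()
        return false;
      }
    }
    return true;
  }
}
```

A caller that receives `false` could then cancel the remote operation. How that cancellation is done is outside the scope of this sketch.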



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (HIVE-24107) Fix typo in ReloadFunctionsOperation

2020-09-13 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-24107?focusedWorklogId=483736&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-483736
 ]

ASF GitHub Bot logged work on HIVE-24107:
-

Author: ASF GitHub Bot
Created on: 14/Sep/20 02:17
Start Date: 14/Sep/20 02:17
Worklog Time Spent: 10m 
  Work Description: dengzhhu653 commented on pull request #1457:
URL: https://github.com/apache/hive/pull/1457#issuecomment-691772540


   @miklosgergely @kgyrtkirk any comments on this change? thank you



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 483736)
Time Spent: 50m  (was: 40m)

> Fix typo in ReloadFunctionsOperation
> 
>
> Key: HIVE-24107
> URL: https://issues.apache.org/jira/browse/HIVE-24107
> Project: Hive
>  Issue Type: Improvement
>  Components: HiveServer2
>Reporter: Zhihua Deng
>Assignee: Zhihua Deng
>Priority: Minor
>  Labels: pull-request-available
>  Time Spent: 50m
>  Remaining Estimate: 0h
>
> Hive.get() registers all functions because doRegisterAllFns is true, so 
> Hive.get().reloadFunctions() may load all functions from the metastore twice. 
> Using Hive.get(false) instead may be better.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (HIVE-24157) Strict mode to fail on CAST timestamp <-> numeral

2020-09-13 Thread Jesus Camacho Rodriguez (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-24157?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jesus Camacho Rodriguez updated HIVE-24157:
---
Description: 
There is some interest in enforcing that CAST numeral <\-> timestamp is 
disallowed to avoid confusion among users, e.g., SQL standard does not allow 
numeral <\-> timestamp casting, timestamp type is timezone agnostic, etc.

We should introduce a strict config for timestamp (similar to others before): 
If the config is true, we shall fail while compiling the query with a 
meaningful message.

To provide similar behavior, Hive has multiple functions that provide clearer 
semantics for numeral to timestamp conversion (and vice versa):
https://cwiki.apache.org/confluence/display/Hive/LanguageManual+UDF#LanguageManualUDF-DateFunctions

  was:
There is some interest in enforcing that CAST numeral <-> timestamp is 
disallowed to avoid confusion among users, e.g., SQL standard does not allow 
numeral <-> timestamp casting, timestamp type is timezone agnostic, etc.

We should introduce a strict config for timestamp (similar to others before): 
If the config is true, we shall fail while compiling the query with a 
meaningful message.

To provide similar behavior, Hive has multiple functions that provide clearer 
semantics for numeral to timestamp conversion (and vice versa):
https://cwiki.apache.org/confluence/display/Hive/LanguageManual+UDF#LanguageManualUDF-DateFunctions


> Strict mode to fail on CAST timestamp <-> numeral
> -
>
> Key: HIVE-24157
> URL: https://issues.apache.org/jira/browse/HIVE-24157
> Project: Hive
>  Issue Type: Improvement
>  Components: SQL
>Reporter: Jesus Camacho Rodriguez
>Priority: Major
>
> There is some interest in enforcing that CAST numeral <\-> timestamp is 
> disallowed to avoid confusion among users, e.g., SQL standard does not allow 
> numeral <\-> timestamp casting, timestamp type is timezone agnostic, etc.
> We should introduce a strict config for timestamp (similar to others before): 
> If the config is true, we shall fail while compiling the query with a 
> meaningful message.
> To provide similar behavior, Hive has multiple functions that provide clearer 
> semantics for numeral to timestamp conversion (and vice versa):
> https://cwiki.apache.org/confluence/display/Hive/LanguageManual+UDF#LanguageManualUDF-DateFunctions





[jira] [Updated] (HIVE-24157) Strict mode to fail on CAST timestamp <-> numeric

2020-09-13 Thread Jesus Camacho Rodriguez (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-24157?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jesus Camacho Rodriguez updated HIVE-24157:
---
Summary: Strict mode to fail on CAST timestamp <-> numeric  (was: Strict 
mode to fail on CAST timestamp <-> numeral)

> Strict mode to fail on CAST timestamp <-> numeric
> -
>
> Key: HIVE-24157
> URL: https://issues.apache.org/jira/browse/HIVE-24157
> Project: Hive
>  Issue Type: Improvement
>  Components: SQL
>Reporter: Jesus Camacho Rodriguez
>Priority: Major
>
> There is some interest in enforcing that CAST numeral <\-> timestamp is 
> disallowed to avoid confusion among users, e.g., SQL standard does not allow 
> numeral <\-> timestamp casting, timestamp type is timezone agnostic, etc.
> We should introduce a strict config for timestamp (similar to others before): 
> If the config is true, we shall fail while compiling the query with a 
> meaningful message.
> To provide similar behavior, Hive has multiple functions that provide clearer 
> semantics for numeral to timestamp conversion (and vice versa):
> https://cwiki.apache.org/confluence/display/Hive/LanguageManual+UDF#LanguageManualUDF-DateFunctions





[jira] [Updated] (HIVE-24157) Strict mode to fail on CAST timestamp <-> numeric

2020-09-13 Thread Jesus Camacho Rodriguez (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-24157?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jesus Camacho Rodriguez updated HIVE-24157:
---
Description: 
There is some interest in enforcing that CAST numeric <\-> timestamp is 
disallowed to avoid confusion among users, e.g., SQL standard does not allow 
numeric <\-> timestamp casting, timestamp type is timezone agnostic, etc.

We should introduce a strict config for timestamp (similar to others before): 
If the config is true, we shall fail while compiling the query with a 
meaningful message.

To provide similar behavior, Hive has multiple functions that provide clearer 
semantics for numeric to timestamp conversion (and vice versa):
https://cwiki.apache.org/confluence/display/Hive/LanguageManual+UDF#LanguageManualUDF-DateFunctions

  was:
There is some interest in enforcing that CAST numeral <\-> timestamp is 
disallowed to avoid confusion among users, e.g., SQL standard does not allow 
numeral <\-> timestamp casting, timestamp type is timezone agnostic, etc.

We should introduce a strict config for timestamp (similar to others before): 
If the config is true, we shall fail while compiling the query with a 
meaningful message.

To provide similar behavior, Hive has multiple functions that provide clearer 
semantics for numeral to timestamp conversion (and vice versa):
https://cwiki.apache.org/confluence/display/Hive/LanguageManual+UDF#LanguageManualUDF-DateFunctions


> Strict mode to fail on CAST timestamp <-> numeric
> -
>
> Key: HIVE-24157
> URL: https://issues.apache.org/jira/browse/HIVE-24157
> Project: Hive
>  Issue Type: Improvement
>  Components: SQL
>Reporter: Jesus Camacho Rodriguez
>Priority: Major
>
> There is some interest in enforcing that CAST numeric <\-> timestamp is 
> disallowed to avoid confusion among users, e.g., SQL standard does not allow 
> numeric <\-> timestamp casting, timestamp type is timezone agnostic, etc.
> We should introduce a strict config for timestamp (similar to others before): 
> If the config is true, we shall fail while compiling the query with a 
> meaningful message.
> To provide similar behavior, Hive has multiple functions that provide clearer 
> semantics for numeric to timestamp conversion (and vice versa):
> https://cwiki.apache.org/confluence/display/Hive/LanguageManual+UDF#LanguageManualUDF-DateFunctions





[jira] [Commented] (HIVE-24070) ObjectStore.cleanWriteNotificationEvents OutOfMemory on large number of pending events

2020-09-13 Thread Riju Trivedi (Jira)


[ 
https://issues.apache.org/jira/browse/HIVE-24070?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17195184#comment-17195184
 ] 

Riju Trivedi commented on HIVE-24070:
-

[~rameshkumar] [~nareshpr] I think we are trying to address the same issue in 
both of these Jiras; see [HIVE-22290|https://issues.apache.org/jira/browse/HIVE-22290]

> ObjectStore.cleanWriteNotificationEvents OutOfMemory on large number of 
> pending events
> --
>
> Key: HIVE-24070
> URL: https://issues.apache.org/jira/browse/HIVE-24070
> Project: Hive
>  Issue Type: Bug
>  Components: repl
>Affects Versions: 4.0.0
>Reporter: Ramesh Kumar Thangarajan
>Assignee: Ramesh Kumar Thangarajan
>Priority: Major
> Fix For: 4.0.0
>
>
> If there is a large number of events that haven't been cleaned up for some 
> reason, then ObjectStore.cleanWriteNotificationEvents() can run out of memory 
> while it loads all the events to be deleted.
>  It should fetch events in batches.
> Similar to https://issues.apache.org/jira/browse/HIVE-19430
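
A minimal sketch of the batching idea, under stated assumptions: the event
store is modeled as an in-memory map from event id to timestamp, and
BatchedCleanupSketch, cleanInBatches, and batchSize are hypothetical names, not
the ObjectStore API. The point is only that each round holds at most batchSize
ids in memory instead of every expired event at once.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical in-memory stand-in for the notification event table.
final class BatchedCleanupSketch {

    // Deletes every event older than cutoff, fetching at most batchSize ids
    // per round. Returns the number of batches that were needed.
    static int cleanInBatches(Map<Long, Long> eventTimeById, long cutoff, int batchSize) {
        if (batchSize <= 0) {
            throw new IllegalArgumentException("batchSize must be positive");
        }
        int batches = 0;
        while (true) {
            // "Fetch" one bounded batch of expired event ids.
            List<Long> batch = new ArrayList<>(batchSize);
            for (Map.Entry<Long, Long> e : eventTimeById.entrySet()) {
                if (e.getValue() < cutoff) {
                    batch.add(e.getKey());
                    if (batch.size() == batchSize) {
                        break;
                    }
                }
            }
            if (batch.isEmpty()) {
                return batches;
            }
            // Delete only this batch, then loop for the next one.
            for (Long id : batch) {
                eventTimeById.remove(id);
            }
            batches++;
        }
    }

    public static void main(String[] args) {
        Map<Long, Long> events = new LinkedHashMap<>();
        for (long id = 0; id < 10; id++) {
            events.put(id, id); // timestamp equals id in this toy data
        }
        int batches = cleanInBatches(events, 7, 3);
        System.out.println(batches + " batches, " + events.size() + " events left");
    }
}
```

A real implementation would presumably issue a bounded query and commit per
batch; the loop structure is the part this sketch is meant to show.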





[jira] [Work logged] (HIVE-23851) MSCK REPAIR Command With Partition Filtering Fails While Dropping Partitions

2020-09-13 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-23851?focusedWorklogId=483779&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-483779
 ]

ASF GitHub Bot logged work on HIVE-23851:
-

Author: ASF GitHub Bot
Created on: 14/Sep/20 04:58
Start Date: 14/Sep/20 04:58
Worklog Time Spent: 10m 
  Work Description: shameersss1 commented on pull request #1271:
URL: https://github.com/apache/hive/pull/1271#issuecomment-691810348


   @kgyrtkirk Ping for re-review request!





Issue Time Tracking
---

Worklog Id: (was: 483779)
Time Spent: 4h 20m  (was: 4h 10m)

> MSCK REPAIR Command With Partition Filtering Fails While Dropping Partitions
> 
>
> Key: HIVE-23851
> URL: https://issues.apache.org/jira/browse/HIVE-23851
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 4.0.0
>Reporter: Syed Shameerur Rahman
>Assignee: Syed Shameerur Rahman
>Priority: Major
>  Labels: pull-request-available
> Fix For: 4.0.0
>
>  Time Spent: 4h 20m
>  Remaining Estimate: 0h
>
> *Steps to reproduce:*
> # Create external table
> # Run msck command to sync all the partitions with metastore
> # Remove one of the partition paths
> # Run msck repair with partition filtering
> *Stack Trace:*
> {code:java}
>  2020-07-15T02:10:29,045 ERROR [4dad298b-28b1-4e6b-94b6-aa785b60c576 main] 
> ppr.PartitionExpressionForMetastore: Failed to deserialize the expression
>  java.lang.IndexOutOfBoundsException: Index: 110, Size: 0
>  at java.util.ArrayList.rangeCheck(ArrayList.java:657) ~[?:1.8.0_192]
>  at java.util.ArrayList.get(ArrayList.java:433) ~[?:1.8.0_192]
>  at 
> org.apache.hive.com.esotericsoftware.kryo.util.MapReferenceResolver.getReadObject(MapReferenceResolver.java:60)
>  ~[hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
>  at 
> org.apache.hive.com.esotericsoftware.kryo.Kryo.readReferenceOrNull(Kryo.java:857)
>  ~[hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
>  at org.apache.hive.com.esotericsoftware.kryo.Kryo.readObject(Kryo.java:707) 
> ~[hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
>  at 
> org.apache.hadoop.hive.ql.exec.SerializationUtilities$KryoWithHooks.readObject(SerializationUtilities.java:211)
>  ~[hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
>  at 
> org.apache.hadoop.hive.ql.exec.SerializationUtilities.deserializeObjectFromKryo(SerializationUtilities.java:806)
>  ~[hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
>  at 
> org.apache.hadoop.hive.ql.exec.SerializationUtilities.deserializeExpressionFromKryo(SerializationUtilities.java:775)
>  ~[hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
>  at 
> org.apache.hadoop.hive.ql.optimizer.ppr.PartitionExpressionForMetastore.deserializeExpr(PartitionExpressionForMetastore.java:96)
>  [hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
>  at 
> org.apache.hadoop.hive.ql.optimizer.ppr.PartitionExpressionForMetastore.convertExprToFilter(PartitionExpressionForMetastore.java:52)
>  [hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
>  at 
> org.apache.hadoop.hive.metastore.PartFilterExprUtil.makeExpressionTree(PartFilterExprUtil.java:48)
>  [hive-standalone-metastore-server-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
>  at 
> org.apache.hadoop.hive.metastore.ObjectStore.getPartitionsByExprInternal(ObjectStore.java:3593)
>  [hive-standalone-metastore-server-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
>  at 
> org.apache.hadoop.hive.metastore.VerifyingObjectStore.getPartitionsByExpr(VerifyingObjectStore.java:80)
>  [hive-standalone-metastore-server-4.0.0-SNAPSHOT-tests.jar:4.0.0-SNAPSHOT]
>  at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) ~[?:1.8.0_192]
>  at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) 
> ~[?:1.8.0_192]
> {code}
> *Cause:*
> In the case of msck repair with partition filtering, we expect the expression 
> proxy class to be set to PartitionExpressionForMetastore ( 
> https://github.com/apache/hive/blob/master/ql/src/java/org/apache/hadoop/hive/ql/ddl/misc/msck/MsckAnalyzer.java#L78
>  ). While dropping partitions, we serialize the drop-partition filter 
> expression as ( 
> https://github.com/apache/hive/blob/master/standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/Msck.java#L589
>  ), which is incompatible with the deserialization happening in 
> PartitionExpressionForMetastore ( 
> https://github.com/apache/hive/blob/master/ql/src/java/org/apache/hadoop/hive/ql/optimizer/ppr/PartitionExpression

[jira] [Commented] (HIVE-24156) Hive 3 explicity returning select cast("0" as boolean) as false even though hive.lazysimple.extended_boolean_literal is set to false

2020-09-13 Thread Mudit Sharma (Jira)


[ 
https://issues.apache.org/jira/browse/HIVE-24156?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17195190#comment-17195190
 ] 

Mudit Sharma commented on HIVE-24156:
-

After this code block: 
[https://github.com/apache/hive/blob/branch-3.1/serde/src/java/org/apache/hadoop/hive/serde2/objectinspector/primitive/PrimitiveObjectInspectorUtils.java#L590]

the code does not seem to be guarded by the config; it accepts 
"false", "off", "no", "0" and "" as FALSE by default.

Let me know if I am wrong here. In branch-2.3 this was not behind 
any config either; it simply checked the length of the string: 
[https://github.com/apache/hive/blob/branch-2.3/ql/src/java/org/apache/hadoop/hive/ql/udf/UDFToBoolean.java#L175]

I see that this changed after version 3.

However, the documentation does not state that this behavior changed in 
major version 3 of Hive.
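
The two behaviors contrasted above can be sketched as follows. This is an
illustration of the comment's description only, not the code from
PrimitiveObjectInspectorUtils or UDFToBoolean: the class and method names are
hypothetical, matching is exact (case and whitespace handling are not covered
by the comment and are left out), and null input is not handled.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

final class StringToBooleanSketch {
    // Literals the comment says branch-3.1 treats as FALSE by default.
    private static final Set<String> FALSE_LITERALS =
        new HashSet<>(Arrays.asList("false", "off", "no", "0", ""));

    // branch-3.1 as described: listed literals are FALSE, anything else TRUE.
    static boolean castAsDescribedForHive3(String s) {
        return !FALSE_LITERALS.contains(s);
    }

    // branch-2.3 as described: only the string length is checked.
    static boolean castAsDescribedForHive2(String s) {
        return s.length() != 0;
    }

    public static void main(String[] args) {
        for (String s : new String[] {"0", "no", "", "yes"}) {
            System.out.println("\"" + s + "\" -> hive3-style="
                + castAsDescribedForHive3(s)
                + ", hive2-style=" + castAsDescribedForHive2(s));
        }
    }
}
```

Under this reading, cast("0" as boolean) is false in the 3.x-style check and
true in the 2.3-style check, which is the difference the issue reports.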

 

> Hive 3 explicity returning select cast("0" as boolean) as false even though 
> hive.lazysimple.extended_boolean_literal is set to false
> 
>
> Key: HIVE-24156
> URL: https://issues.apache.org/jira/browse/HIVE-24156
> Project: Hive
>  Issue Type: Improvement
>Reporter: Mudit Sharma
>Priority: Major
>  Labels: TODOC
>
> As per https://issues.apache.org/jira/browse/HIVE-3635, select cast("0" as 
> boolean) should return as false only if 
> hive.lazysimple.extended_boolean_literal is set to true. Even though the 
> config is set to false, this is returning as false.





[jira] [Work logged] (HIVE-24139) VectorGroupByOperator is not flushing hash table entries as needed

2020-09-13 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-24139?focusedWorklogId=483791&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-483791
 ]

ASF GitHub Bot logged work on HIVE-24139:
-

Author: ASF GitHub Bot
Created on: 14/Sep/20 05:43
Start Date: 14/Sep/20 05:43
Worklog Time Spent: 10m 
  Work Description: rbalamohan commented on pull request #1481:
URL: https://github.com/apache/hive/pull/1481#issuecomment-691824652


   LGTM. +1





Issue Time Tracking
---

Worklog Id: (was: 483791)
Time Spent: 40m  (was: 0.5h)

> VectorGroupByOperator is not flushing hash table entries as needed
> --
>
> Key: HIVE-24139
> URL: https://issues.apache.org/jira/browse/HIVE-24139
> Project: Hive
>  Issue Type: Bug
>Reporter: Mustafa Iman
>Assignee: Mustafa Iman
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 40m
>  Remaining Estimate: 0h
>
> https://issues.apache.org/jira/browse/HIVE-23975 introduced a bug where 
> copyKey mutates some key wrappers while copying. This Jira is to fix it.





[jira] [Work logged] (HIVE-24139) VectorGroupByOperator is not flushing hash table entries as needed

2020-09-13 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-24139?focusedWorklogId=483792&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-483792
 ]

ASF GitHub Bot logged work on HIVE-24139:
-

Author: ASF GitHub Bot
Created on: 14/Sep/20 05:44
Start Date: 14/Sep/20 05:44
Worklog Time Spent: 10m 
  Work Description: rbalamohan edited a comment on pull request #1481:
URL: https://github.com/apache/hive/pull/1481#issuecomment-691824652


   LGTM. +1 (pending tests)





Issue Time Tracking
---

Worklog Id: (was: 483792)
Time Spent: 50m  (was: 40m)

> VectorGroupByOperator is not flushing hash table entries as needed
> --
>
> Key: HIVE-24139
> URL: https://issues.apache.org/jira/browse/HIVE-24139
> Project: Hive
>  Issue Type: Bug
>Reporter: Mustafa Iman
>Assignee: Mustafa Iman
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 50m
>  Remaining Estimate: 0h
>
> https://issues.apache.org/jira/browse/HIVE-23975 introduced a bug where 
> copyKey mutates some key wrappers while copying. This Jira is to fix it.





[jira] [Updated] (HIVE-24127) Dump events from default catalog only

2020-09-13 Thread Anishek Agarwal (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-24127?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Anishek Agarwal updated HIVE-24127:
---
Resolution: Fixed
Status: Resolved  (was: Patch Available)

thanks for the patch [~aasha] and review [~pkumarsinha]

> Dump events from default catalog only
> -
>
> Key: HIVE-24127
> URL: https://issues.apache.org/jira/browse/HIVE-24127
> Project: Hive
>  Issue Type: Task
>Reporter: Aasha Medhi
>Assignee: Aasha Medhi
>Priority: Major
>  Labels: pull-request-available
> Attachments: HIVE-24127.01.patch, HIVE-24127.02.patch, 
> HIVE-24127.03.patch
>
>  Time Spent: 40m
>  Remaining Estimate: 0h
>
> Don't dump events from the Spark catalog. In bootstrap we skip Spark tables. In 
> incremental load we should also skip Spark events.





[jira] [Updated] (HIVE-24117) Fix for not setting managed table location in incremental load

2020-09-13 Thread Anishek Agarwal (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-24117?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Anishek Agarwal updated HIVE-24117:
---
Resolution: Fixed
Status: Resolved  (was: Patch Available)

patch committed to master, thanks for the patch [~aasha] and review 
[~pkumarsinha]

> Fix for not setting managed table location in incremental load
> --
>
> Key: HIVE-24117
> URL: https://issues.apache.org/jira/browse/HIVE-24117
> Project: Hive
>  Issue Type: Task
>Reporter: Aasha Medhi
>Assignee: Aasha Medhi
>Priority: Major
>  Labels: pull-request-available
> Attachments: HIVE-24117.01.patch, HIVE-24117.02.patch
>
>  Time Spent: 0.5h
>  Remaining Estimate: 0h
>



