[jira] [Assigned] (HIVE-22626) Fix TestStatsReplicationScenariosACIDNoAutogather

2021-07-02 Thread Ayush Saxena (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-22626?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ayush Saxena reassigned HIVE-22626:
---

Assignee: Ayush Saxena

> Fix TestStatsReplicationScenariosACIDNoAutogather
> -
>
> Key: HIVE-22626
> URL: https://issues.apache.org/jira/browse/HIVE-22626
> Project: Hive
>  Issue Type: Sub-task
>  Components: Test
>Reporter: Zoltan Haindrich
>Assignee: Ayush Saxena
>Priority: Major
> Attachments: qalogs.tgz
>
>
> This test already runs "alone", but it still sometimes takes more than 
> 40m, which results in a timeout.
> A Jira search reveals that this was pretty common: 
> https://issues.apache.org/jira/issues/?jql=text%20~%20%22TestStatsReplicationScenariosACIDNoAutogather%22%20order%20by%20updated%20desc
> from the hive logs:
> * it seems that a few minutes after this test starts, there is an exception:
> {code}
> 2019-12-10T22:43:19,594 DEBUG [Finalizer] metastore.HiveMetaStoreClient: 
> Unable to shutdown metastore client. Will try closing transport directly.
> org.apache.thrift.transport.TTransportException: java.net.SocketException: 
> Socket closed
> at 
> org.apache.thrift.transport.TIOStreamTransport.flush(TIOStreamTransport.java:161)
>  ~[libthrift-0.9.3-1.jar:0.9.3-1]
> at org.apache.thrift.TServiceClient.sendBase(TServiceClient.java:73) 
> ~[libthrift-0.9.3-1.jar:0.9.3-1]
> at 
> org.apache.thrift.TServiceClient.sendBaseOneway(TServiceClient.java:66) 
> ~[libthrift-0.9.3-1.jar:0.9.3-1]
> at 
> com.facebook.fb303.FacebookService$Client.send_shutdown(FacebookService.java:436)
>  ~[libfb303-0.9.3.jar:?]
> at 
> com.facebook.fb303.FacebookService$Client.shutdown(FacebookService.java:430) 
> ~[libfb303-0.9.3.jar:?]
> at 
> org.apache.hadoop.hive.metastore.HiveMetaStoreClient.close(HiveMetaStoreClient.java:776)
>  [hive-standalone-metastore-common-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) 
> ~[?:1.8.0_102]
> at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) 
> ~[?:1.8.0_102]
> at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>  ~[?:1.8.0_102]
> at java.lang.reflect.Method.invoke(Method.java:498) ~[?:1.8.0_102]
> at 
> org.apache.hadoop.hive.metastore.RetryingMetaStoreClient.invoke(RetryingMetaStoreClient.java:212)
>  [hive-standalone-metastore-common-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
> at com.sun.proxy.$Proxy62.close(Unknown Source) [?:?]
> at org.apache.hadoop.hive.ql.metadata.Hive.close(Hive.java:542) 
> [hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
> at org.apache.hadoop.hive.ql.metadata.Hive.finalize(Hive.java:514) 
> [hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
> at java.lang.System$2.invokeFinalize(System.java:1270) [?:1.8.0_102]
> at java.lang.ref.Finalizer.runFinalizer(Finalizer.java:98) 
> [?:1.8.0_102]
> at java.lang.ref.Finalizer.access$100(Finalizer.java:34) [?:1.8.0_102]
> at java.lang.ref.Finalizer$FinalizerThread.run(Finalizer.java:210) 
> [?:1.8.0_102]
> Caused by: java.net.SocketException: Socket closed
> at 
> java.net.SocketOutputStream.socketWrite(SocketOutputStream.java:116) 
> ~[?:1.8.0_102]
> at java.net.SocketOutputStream.write(SocketOutputStream.java:153) 
> ~[?:1.8.0_102]
> at 
> java.io.BufferedOutputStream.flushBuffer(BufferedOutputStream.java:82) 
> ~[?:1.8.0_102]
> at java.io.BufferedOutputStream.flush(BufferedOutputStream.java:140) 
> ~[?:1.8.0_102]
> at 
> org.apache.thrift.transport.TIOStreamTransport.flush(TIOStreamTransport.java:159)
>  ~[libthrift-0.9.3-1.jar:0.9.3-1]
> {code}
> * after that some NoSuchObjectExceptions follow
> * and then some replications seem to happen
> I don't fully understand this; I'll attach the logs...



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (HIVE-25267) Fix TestReplicationScenariosAcidTables

2021-07-02 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25267?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated HIVE-25267:
--
Labels: pull-request-available  (was: )

> Fix TestReplicationScenariosAcidTables
> --
>
> Key: HIVE-25267
> URL: https://issues.apache.org/jira/browse/HIVE-25267
> Project: Hive
>  Issue Type: Bug
>Reporter: Zoltan Haindrich
>Assignee: Pravin Sinha
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> test is unstable
> http://ci.hive.apache.org/job/hive-flaky-check/242/





[jira] [Work logged] (HIVE-25267) Fix TestReplicationScenariosAcidTables

2021-07-02 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25267?focusedWorklogId=617992&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-617992
 ]

ASF GitHub Bot logged work on HIVE-25267:
-

Author: ASF GitHub Bot
Created on: 02/Jul/21 07:27
Start Date: 02/Jul/21 07:27
Worklog Time Spent: 10m 
  Work Description: pkumarsinha opened a new pull request #2444:
URL: https://github.com/apache/hive/pull/2444


   
   
   ### What changes were proposed in this pull request?
   
   
   
   ### Why are the changes needed?
   
   
   
   ### Does this PR introduce _any_ user-facing change?
   
   
   
   ### How was this patch tested?
   
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: gitbox-unsubscr...@hive.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 617992)
Remaining Estimate: 0h
Time Spent: 10m

> Fix TestReplicationScenariosAcidTables
> --
>
> Key: HIVE-25267
> URL: https://issues.apache.org/jira/browse/HIVE-25267
> Project: Hive
>  Issue Type: Bug
>Reporter: Zoltan Haindrich
>Assignee: Pravin Sinha
>Priority: Major
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> test is unstable
> http://ci.hive.apache.org/job/hive-flaky-check/242/





[jira] [Updated] (HIVE-25267) Fix TestReplicationScenariosAcidTables

2021-07-02 Thread Pravin Sinha (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25267?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pravin Sinha updated HIVE-25267:

Status: Patch Available  (was: Open)

> Fix TestReplicationScenariosAcidTables
> --
>
> Key: HIVE-25267
> URL: https://issues.apache.org/jira/browse/HIVE-25267
> Project: Hive
>  Issue Type: Bug
>Reporter: Zoltan Haindrich
>Assignee: Pravin Sinha
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> test is unstable
> http://ci.hive.apache.org/job/hive-flaky-check/242/





[jira] [Work logged] (HIVE-25297) Refactor GenericUDFDateDiff

2021-07-02 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25297?focusedWorklogId=618007&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-618007
 ]

ASF GitHub Bot logged work on HIVE-25297:
-

Author: ASF GitHub Bot
Created on: 02/Jul/21 08:41
Start Date: 02/Jul/21 08:41
Worklog Time Spent: 10m 
  Work Description: ashish-kumar-sharma commented on pull request #2437:
URL: https://github.com/apache/hive/pull/2437#issuecomment-872827430


   @zabetak Thank you for the review !




Issue Time Tracking
---

Worklog Id: (was: 618007)
Time Spent: 1h 50m  (was: 1h 40m)

> Refactor GenericUDFDateDiff
> ---
>
> Key: HIVE-25297
> URL: https://issues.apache.org/jira/browse/HIVE-25297
> Project: Hive
>  Issue Type: Task
>  Components: UDF
>Affects Versions: All Versions
>Reporter: Ashish Sharma
>Assignee: Ashish Sharma
>Priority: Trivial
>  Labels: pull-request-available
> Fix For: 4.0.0
>
>  Time Spent: 1h 50m
>  Remaining Estimate: 0h
>
> Description
> Remove redundant code and refactor entire GenericUDFDateDiff.class code





[jira] [Work logged] (HIVE-25297) Refactor GenericUDFDateDiff

2021-07-02 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25297?focusedWorklogId=618012&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-618012
 ]

ASF GitHub Bot logged work on HIVE-25297:
-

Author: ASF GitHub Bot
Created on: 02/Jul/21 08:52
Start Date: 02/Jul/21 08:52
Worklog Time Spent: 10m 
  Work Description: sankarh merged pull request #2437:
URL: https://github.com/apache/hive/pull/2437


   




Issue Time Tracking
---

Worklog Id: (was: 618012)
Time Spent: 2h  (was: 1h 50m)

> Refactor GenericUDFDateDiff
> ---
>
> Key: HIVE-25297
> URL: https://issues.apache.org/jira/browse/HIVE-25297
> Project: Hive
>  Issue Type: Task
>  Components: UDF
>Affects Versions: All Versions
>Reporter: Ashish Sharma
>Assignee: Ashish Sharma
>Priority: Trivial
>  Labels: pull-request-available
> Fix For: 4.0.0
>
>  Time Spent: 2h
>  Remaining Estimate: 0h
>
> Description
> Remove redundant code and refactor entire GenericUDFDateDiff.class code





[jira] [Resolved] (HIVE-25297) Refactor GenericUDFDateDiff

2021-07-02 Thread Sankar Hariappan (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25297?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sankar Hariappan resolved HIVE-25297.
-
Resolution: Fixed

Merged to master!
Thanks [~ashish-kumar-sharma] for the patch and [~zabetak] for the review!

> Refactor GenericUDFDateDiff
> ---
>
> Key: HIVE-25297
> URL: https://issues.apache.org/jira/browse/HIVE-25297
> Project: Hive
>  Issue Type: Task
>  Components: UDF
>Affects Versions: All Versions
>Reporter: Ashish Sharma
>Assignee: Ashish Sharma
>Priority: Trivial
>  Labels: pull-request-available
> Fix For: 4.0.0
>
>  Time Spent: 2h
>  Remaining Estimate: 0h
>
> Description
> Remove redundant code and refactor entire GenericUDFDateDiff.class code





[jira] [Assigned] (HIVE-25305) Replayed transactions are not cleaned up properly on open txn timeout

2021-07-02 Thread Pravin Sinha (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25305?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pravin Sinha reassigned HIVE-25305:
---


> Replayed transactions are not cleaned up properly on open txn timeout  
> ---
>
> Key: HIVE-25305
> URL: https://issues.apache.org/jira/browse/HIVE-25305
> Project: Hive
>  Issue Type: Bug
>Reporter: Pravin Sinha
>Assignee: Pravin Sinha
>Priority: Major
>






[jira] [Resolved] (HIVE-25304) Fix test org.apache.hive.spark.client.rpc.TestRpc.testServerPort

2021-07-02 Thread Haymant Mangla (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25304?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Haymant Mangla resolved HIVE-25304.
---
Resolution: Fixed

> Fix test org.apache.hive.spark.client.rpc.TestRpc.testServerPort
> 
>
> Key: HIVE-25304
> URL: https://issues.apache.org/jira/browse/HIVE-25304
> Project: Hive
>  Issue Type: Improvement
>Reporter: Haymant Mangla
>Priority: Major
>






[jira] [Work logged] (HIVE-25246) Fix the clean up of open repl created transactions

2021-07-02 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25246?focusedWorklogId=618121&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-618121
 ]

ASF GitHub Bot logged work on HIVE-25246:
-

Author: ASF GitHub Bot
Created on: 02/Jul/21 13:48
Start Date: 02/Jul/21 13:48
Worklog Time Spent: 10m 
  Work Description: pkumarsinha commented on a change in pull request #2396:
URL: https://github.com/apache/hive/pull/2396#discussion_r663019158



##
File path: 
standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/txn/TxnHandler.java
##
@@ -1122,12 +1154,79 @@ public void abortTxns(AbortTxnsRequest rqst) throws 
MetaException {
 }
   }
 
+  private long getDatabaseId(Connection dbConn, String database, String 
catalog) throws SQLException {
+ResultSet rs = null;
+PreparedStatement pst = null;
+try {
+  String query = "select \"DB_ID\" from \"DBS\" where \"NAME\" = ?  and 
\"CTLG_NAME\" = ?";
+  pst = sqlGenerator.prepareStmtWithParameters(dbConn, query, 
Arrays.asList(database, catalog));
+  LOG.debug("Going to execute query <" + query.replaceAll("\\?", "{}") + 
">",
+  quoteString(database), quoteString(catalog));
+  rs = pst.executeQuery();
+  if (!rs.next()) {
+LOG.error("Database: " + database + " does not exist in catalog " + 
catalog);
+return -1;
+  }
+  return rs.getLong(1);
+} finally {
+  close(rs);
+  closeStmt(pst);
+}
+  }
+
+  private void updateDatabaseProp(Connection dbConn, long dbId, String prop, 
String propValue) throws SQLException {
+ResultSet rs = null;
+PreparedStatement pst = null;
+try {
+  String query = "SELECT \"PARAM_VALUE\" FROM \"DATABASE_PARAMS\" WHERE 
\"PARAM_KEY\" = " +
+  "'" + prop + "' AND \"DB_ID\" = " + dbId;
+  pst = sqlGenerator.prepareStmtWithParameters(dbConn, query, null);
+  rs = pst.executeQuery();
+  query = null;
+  if (!rs.next()) {
+query = "INSERT INTO \"DATABASE_PARAMS\" VALUES ( " + dbId + " , '" + 
prop + "' , ? )";
+  } else if (!rs.getString(1).equals(propValue)) {
+query = "UPDATE \"DATABASE_PARAMS\" SET \"PARAM_VALUE\" = ? WHERE 
\"DB_ID\" = " + dbId +
+" AND \"PARAM_KEY\" = '" + prop + "'";
+  }
+  closeStmt(pst);
+  if (query != null) {
+pst = sqlGenerator.prepareStmtWithParameters(dbConn, query, 
Arrays.asList(propValue));
+LOG.debug("Updating " + prop + " for db <" + query.replaceAll("\\?", 
"{}") + ">", propValue);

Review comment:
   Db name would be helpful.

##
File path: 
standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/txn/TxnHandler.java
##
@@ -1122,12 +1154,79 @@ public void abortTxns(AbortTxnsRequest rqst) throws 
MetaException {
 }
   }
 
+  private long getDatabaseId(Connection dbConn, String database, String 
catalog) throws SQLException {
+ResultSet rs = null;
+PreparedStatement pst = null;
+try {
+  String query = "select \"DB_ID\" from \"DBS\" where \"NAME\" = ?  and 
\"CTLG_NAME\" = ?";
+  pst = sqlGenerator.prepareStmtWithParameters(dbConn, query, 
Arrays.asList(database, catalog));
+  LOG.debug("Going to execute query <" + query.replaceAll("\\?", "{}") + 
">",
+  quoteString(database), quoteString(catalog));
+  rs = pst.executeQuery();
+  if (!rs.next()) {
+LOG.error("Database: " + database + " does not exist in catalog " + 
catalog);
+return -1;
+  }
+  return rs.getLong(1);
+} finally {
+  close(rs);
+  closeStmt(pst);
+}
+  }
+
+  private void updateDatabaseProp(Connection dbConn, long dbId, String prop, 
String propValue) throws SQLException {
+ResultSet rs = null;
+PreparedStatement pst = null;
+try {
+  String query = "SELECT \"PARAM_VALUE\" FROM \"DATABASE_PARAMS\" WHERE 
\"PARAM_KEY\" = " +
+  "'" + prop + "' AND \"DB_ID\" = " + dbId;
+  pst = sqlGenerator.prepareStmtWithParameters(dbConn, query, null);
+  rs = pst.executeQuery();
+  query = null;
+  if (!rs.next()) {
+query = "INSERT INTO \"DATABASE_PARAMS\" VALUES ( " + dbId + " , '" + 
prop + "' , ? )";
+  } else if (!rs.getString(1).equals(propValue)) {
+query = "UPDATE \"DATABASE_PARAMS\" SET \"PARAM_VALUE\" = ? WHERE 
\"DB_ID\" = " + dbId +
+" AND \"PARAM_KEY\" = '" + prop + "'";
+  }
+  closeStmt(pst);
+  if (query != null) {

Review comment:
   if query == null, having a log message should be useful.
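
A minimal sketch of what the two review comments above ask for: include the database name in the log message, and log something when no statement needs to run. This is an illustration under stated assumptions, not the code from the pull request. The class `DatabasePropUpdateSketch`, the `Action` enum and the `decide` method are hypothetical names, `java.util.logging` stands in for the logger used in `TxnHandler` so the example stays self-contained, and a `null` existing value is assumed to mean "no row found".

```java
import java.util.logging.Logger;

/** Hypothetical helper; names are illustrative and not taken from TxnHandler. */
final class DatabasePropUpdateSketch {
  private static final Logger LOG =
      Logger.getLogger(DatabasePropUpdateSketch.class.getName());

  /** The three outcomes of comparing the stored value with the new one. */
  enum Action { INSERT, UPDATE, NONE }

  /**
   * Decides which statement (if any) is needed for a database property.
   * Assumption: existingValue == null means no row exists for this key.
   */
  static Action decide(String dbName, String prop, String existingValue, String newValue) {
    if (existingValue == null) {
      // Database name is part of the message, as the first comment suggests.
      LOG.fine(() -> "Inserting " + prop + "=" + newValue + " for db " + dbName);
      return Action.INSERT;
    }
    if (!existingValue.equals(newValue)) {
      LOG.fine(() -> "Updating " + prop + " from " + existingValue
          + " to " + newValue + " for db " + dbName);
      return Action.UPDATE;
    }
    // The "query == null" case: nothing to execute, but still leave a trace.
    LOG.fine(() -> "No update needed for " + prop + " on db " + dbName
        + "; value is already " + newValue);
    return Action.NONE;
  }

  public static void main(String[] args) {
    System.out.println(decide("db1", "some.prop", null, "5")); // INSERT
    System.out.println(decide("db1", "some.prop", "4", "5"));  // UPDATE
    System.out.println(decide("db1", "some.prop", "5", "5"));  // NONE
  }
}
```

Separating the decision from statement execution keeps every branch, including the no-op one, easy to log and to unit test.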

##
File path: 
standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/txn/TxnHandler.java
##
@@ -1143,37 +1242,13 @@ private void updateReplId(Connection dbConn, 
ReplLastIdInfo replLastIdInfo) thro
 stmt.execute(s);
   }
 
-   

[jira] [Work logged] (HIVE-24609) Fix ArrayIndexOutOfBoundsException when execute full outer join

2021-07-02 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-24609?focusedWorklogId=618137&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-618137
 ]

ASF GitHub Bot logged work on HIVE-24609:
-

Author: ASF GitHub Bot
Created on: 02/Jul/21 14:41
Start Date: 02/Jul/21 14:41
Worklog Time Spent: 10m 
  Work Description: ashish-kumar-sharma commented on a change in pull 
request #1844:
URL: https://github.com/apache/hive/pull/1844#discussion_r663063144



##
File path: 
ql/src/java/org/apache/hadoop/hive/ql/optimizer/ConvertJoinMapJoin.java
##
@@ -1629,6 +1629,9 @@ private boolean 
checkShuffleSizeForLargeTable(JoinOperator joinOp, int position,
   // Max is disabled, we can safely return false
   return false;
 }
+if(position < 0){

Review comment:
   @lijufeng2016 Please reopen the PR. Also add UTs and a qtest.
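
For illustration, a hedged sketch of the kind of guard and unit check under discussion. The diff only shows the `if(position < 0){` line, so what the PR does inside the guard is not known here. The class `PositionGuardSketch`, the method `exceedsLimit`, its parameters, the choice to return `false` for an out-of-range position, and treating a non-positive limit as "disabled" are all assumptions made for this example.

```java
/** Hypothetical stand-in; not the code from ConvertJoinMapJoin. */
final class PositionGuardSketch {

  /**
   * Returns true when sizes[position] is larger than maxSize.
   * Assumptions: a non-positive maxSize means the limit is disabled, and an
   * out-of-range position is answered with false instead of an exception.
   */
  static boolean exceedsLimit(long[] sizes, int position, long maxSize) {
    if (maxSize <= 0) {
      // Limit is disabled, nothing can exceed it.
      return false;
    }
    if (position < 0 || position >= sizes.length) {
      // Guard against ArrayIndexOutOfBoundsException on the lookup below.
      return false;
    }
    return sizes[position] > maxSize;
  }

  public static void main(String[] args) {
    long[] sizes = {10L, 200L};
    System.out.println(exceedsLimit(sizes, -1, 100L)); // false
    System.out.println(exceedsLimit(sizes, 1, 100L));  // true
  }
}
```

A unit test for the real change would presumably exercise the same negative-position case against the actual method.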






Issue Time Tracking
---

Worklog Id: (was: 618137)
Time Spent: 1h  (was: 50m)

> Fix ArrayIndexOutOfBoundsException when execute full outer join
> ---
>
> Key: HIVE-24609
> URL: https://issues.apache.org/jira/browse/HIVE-24609
> Project: Hive
>  Issue Type: Bug
>  Components: Query Planning
>Affects Versions: 3.1.0
>Reporter: jufeng li
>Assignee: Printer Setup
>Priority: Blocker
>  Labels: pull-request-available
> Fix For: 4.0.0
>
>  Time Spent: 1h
>  Remaining Estimate: 0h
>
> here is my hive-sql:
> {code:java}
> select 
> ..
>  from A 
>  full outer join B on A.id = B.id
> {code}
>  
> It cannot be executed; I got an ArrayIndexOutOfBoundsException. I then debugged 
> HiveServer2 and found that, in some situations, compiling SQL that contains a full outer 
> join throws an ArrayIndexOutOfBoundsException.





[jira] [Work logged] (HIVE-23688) Vectorization: IndexArrayOutOfBoundsException For map type column which includes null value

2021-07-02 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-23688?focusedWorklogId=618255&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-618255
 ]

ASF GitHub Bot logged work on HIVE-23688:
-

Author: ASF GitHub Bot
Created on: 02/Jul/21 20:36
Start Date: 02/Jul/21 20:36
Worklog Time Spent: 10m 
  Work Description: SparksFyz opened a new pull request #1122:
URL: https://github.com/apache/hive/pull/1122


   https://issues.apache.org/jira/browse/HIVE-23688




Issue Time Tracking
---

Worklog Id: (was: 618255)
Time Spent: 40m  (was: 0.5h)

> Vectorization: IndexArrayOutOfBoundsException For map type column which 
> includes null value
> ---
>
> Key: HIVE-23688
> URL: https://issues.apache.org/jira/browse/HIVE-23688
> Project: Hive
>  Issue Type: Bug
>  Components: Parquet, storage-api, Vectorization
>Affects Versions: All Versions
>Reporter: 范宜臻
>Assignee: 范宜臻
>Priority: Critical
>  Labels: pull-request-available
> Fix For: 3.0.0, 4.0.0
>
> Attachments: HIVE-23688.patch
>
>  Time Spent: 40m
>  Remaining Estimate: 0h
>
> {color:#de350b}start{color} and {color:#de350b}length{color} are empty arrays 
> in MapColumnVector.values(BytesColumnVector) when values in map contain 
> {color:#de350b}null{color}
> reproduce in master branch:
> {code:java}
> set hive.vectorized.execution.enabled=true; 
> CREATE TABLE parquet_map_type (id int,stringMap map<string,string>) 
> stored as parquet; 
> insert overwrite table parquet_map_type SELECT 1, MAP('k1', null, 'k2', 
> 'bar'); 
> select id, stringMap['k1'] from parquet_map_type group by 1,2;
> {code}
> query explain:
> {code:java}
> Stage-0
>   Fetch Operator
> limit:-1
> Stage-1
>   Reducer 2 vectorized
>   File Output Operator [FS_12]
> Group By Operator [GBY_11] (rows=5 width=2)
>   Output:["_col0","_col1"],keys:KEY._col0, KEY._col1
> <-Map 1 [SIMPLE_EDGE] vectorized
>   SHUFFLE [RS_10]
> PartitionCols:_col0, _col1
> Group By Operator [GBY_9] (rows=10 width=2)
>   Output:["_col0","_col1"],keys:_col0, _col1
>   Select Operator [SEL_8] (rows=10 width=2)
> Output:["_col0","_col1"]
> TableScan [TS_0] (rows=10 width=2)
>   
> temp@parquet_map_type_fyz,parquet_map_type_fyz,Tbl:COMPLETE,Col:NONE,Output:["id","stringmap"]
> {code}
> runtime error:
> {code:java}
> Vertex failed, vertexName=Map 1, vertexId=vertex_1592040015150_0001_3_00, 
> diagnostics=[Task failed, taskId=task_1592040015150_0001_3_00_00, 
> diagnostics=[TaskAttempt 0 failed, info=[Error: Error while running task ( 
> failure ) : 
> attempt_1592040015150_0001_3_00_00_0:java.lang.RuntimeException: 
> java.lang.RuntimeException: org.apache.hadoop.hive.ql.metadata.HiveException: 
> Hive Runtime Error while processing row 
>   at 
> org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:296)
>   at 
> org.apache.hadoop.hive.ql.exec.tez.TezProcessor.run(TezProcessor.java:250)
>   at 
> org.apache.tez.runtime.LogicalIOProcessorRuntimeTask.run(LogicalIOProcessorRuntimeTask.java:374)
>   at 
> org.apache.tez.runtime.task.TaskRunner2Callable$1.run(TaskRunner2Callable.java:73)
>   at 
> org.apache.tez.runtime.task.TaskRunner2Callable$1.run(TaskRunner2Callable.java:61)
>   at java.security.AccessController.doPrivileged(Native Method)
>   at javax.security.auth.Subject.doAs(Subject.java:422)
>   at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1682)
>   at 
> org.apache.tez.runtime.task.TaskRunner2Callable.callInternal(TaskRunner2Callable.java:61)
>   at 
> org.apache.tez.runtime.task.TaskRunner2Callable.callInternal(TaskRunner2Callable.java:37)
>   at org.apache.tez.common.CallableWithNdc.call(CallableWithNdc.java:36)
>   at 
> com.google.common.util.concurrent.TrustedListenableFutureTask$TrustedFutureInterruptibleTask.runInterruptibly(TrustedListenableFutureTask.java:108)
>   at 
> com.google.common.util.concurrent.InterruptibleTask.run(InterruptibleTask.java:41)
>   at 
> com.google.common.util.concurrent.TrustedListenableFutureTask.run(TrustedListenableFutureTask.java:77)
>   at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
>   at 
> java.util.

[jira] [Work logged] (HIVE-23688) Vectorization: IndexArrayOutOfBoundsException For map type column which includes null value

2021-07-02 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-23688?focusedWorklogId=618256&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-618256
 ]

ASF GitHub Bot logged work on HIVE-23688:
-

Author: ASF GitHub Bot
Created on: 02/Jul/21 20:36
Start Date: 02/Jul/21 20:36
Worklog Time Spent: 10m 
  Work Description: abstractdog commented on pull request #1122:
URL: https://github.com/apache/hive/pull/1122#issuecomment-873244798


   This change solved an issue that we found on the customer side. I reopened this PR 
and I'll review it later.




Issue Time Tracking
---

Worklog Id: (was: 618256)
Time Spent: 50m  (was: 40m)

> Vectorization: IndexArrayOutOfBoundsException For map type column which 
> includes null value
> ---
>
> Key: HIVE-23688
> URL: https://issues.apache.org/jira/browse/HIVE-23688
> Project: Hive
>  Issue Type: Bug
>  Components: Parquet, storage-api, Vectorization
>Affects Versions: All Versions
>Reporter: 范宜臻
>Assignee: 范宜臻
>Priority: Critical
>  Labels: pull-request-available
> Fix For: 3.0.0, 4.0.0
>
> Attachments: HIVE-23688.patch
>
>  Time Spent: 50m
>  Remaining Estimate: 0h
>

[jira] [Work logged] (HIVE-23688) Vectorization: IndexArrayOutOfBoundsException For map type column which includes null value

2021-07-02 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-23688?focusedWorklogId=618257&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-618257
 ]

ASF GitHub Bot logged work on HIVE-23688:
-

Author: ASF GitHub Bot
Created on: 02/Jul/21 20:40
Start Date: 02/Jul/21 20:40
Worklog Time Spent: 10m 
  Work Description: abstractdog edited a comment on pull request #1122:
URL: https://github.com/apache/hive/pull/1122#issuecomment-873244798


   This change solved an issue that we found on the customer side. I reopened this PR 
and I'll review it later.
   @SparksFyz could you please rebase the patch on master?




Issue Time Tracking
---

Worklog Id: (was: 618257)
Time Spent: 1h  (was: 50m)

> Vectorization: IndexArrayOutOfBoundsException For map type column which 
> includes null value
> ---
>
> Key: HIVE-23688
> URL: https://issues.apache.org/jira/browse/HIVE-23688
> Project: Hive
>  Issue Type: Bug
>  Components: Parquet, storage-api, Vectorization
>Affects Versions: All Versions
>Reporter: 范宜臻
>Assignee: 范宜臻
>Priority: Critical
>  Labels: pull-request-available
> Fix For: 3.0.0, 4.0.0
>
> Attachments: HIVE-23688.patch
>
>  Time Spent: 1h
>  Remaining Estimate: 0h
>

[jira] [Work logged] (HIVE-25048) Refine the start/end functions in HMSHandler

2021-07-02 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25048?focusedWorklogId=618294&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-618294
 ]

ASF GitHub Bot logged work on HIVE-25048:
-

Author: ASF GitHub Bot
Created on: 02/Jul/21 23:08
Start Date: 02/Jul/21 23:08
Worklog Time Spent: 10m 
  Work Description: dengzhhu653 closed pull request #2441:
URL: https://github.com/apache/hive/pull/2441


   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: gitbox-unsubscr...@hive.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 618294)
Time Spent: 1h 10m  (was: 1h)

> Refine the start/end functions in HMSHandler
> 
>
> Key: HIVE-25048
> URL: https://issues.apache.org/jira/browse/HIVE-25048
> Project: Hive
>  Issue Type: Improvement
>  Components: Standalone Metastore
>Reporter: Zhihua Deng
>Assignee: Zhihua Deng
>Priority: Minor
>  Labels: pull-request-available
>  Time Spent: 1h 10m
>  Remaining Estimate: 0h
>
> Some start/end functions in HMSHandler are incomplete. These functions can 
> audit user actions, monitor performance, and notify listeners.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (HIVE-25048) Refine the start/end functions in HMSHandler

2021-07-02 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25048?focusedWorklogId=618295&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-618295
 ]

ASF GitHub Bot logged work on HIVE-25048:
-

Author: ASF GitHub Bot
Created on: 02/Jul/21 23:10
Start Date: 02/Jul/21 23:10
Worklog Time Spent: 10m 
  Work Description: dengzhhu653 opened a new pull request #2441:
URL: https://github.com/apache/hive/pull/2441


   
   
   ### What changes were proposed in this pull request?
   
   
   
   ### Why are the changes needed?
   
   
   
   ### Does this PR introduce _any_ user-facing change?
   
   
   
   ### How was this patch tested?
   
   




Issue Time Tracking
---

Worklog Id: (was: 618295)
Time Spent: 1h 20m  (was: 1h 10m)

> Refine the start/end functions in HMSHandler
> 
>
> Key: HIVE-25048
> URL: https://issues.apache.org/jira/browse/HIVE-25048
> Project: Hive
>  Issue Type: Improvement
>  Components: Standalone Metastore
>Reporter: Zhihua Deng
>Assignee: Zhihua Deng
>Priority: Minor
>  Labels: pull-request-available
>  Time Spent: 1h 20m
>  Remaining Estimate: 0h
>
> Some start/end functions in HMSHandler are incomplete. These functions can 
> audit user actions, monitor performance, and notify listeners.





[jira] [Work logged] (HIVE-24713) HS2 never knows deregistering from Zookeeper in the particular case

2021-07-02 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-24713?focusedWorklogId=618300&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-618300
 ]

ASF GitHub Bot logged work on HIVE-24713:
-

Author: ASF GitHub Bot
Created on: 03/Jul/21 00:08
Start Date: 03/Jul/21 00:08
Worklog Time Spent: 10m 
  Work Description: github-actions[bot] commented on pull request #1932:
URL: https://github.com/apache/hive/pull/1932#issuecomment-873311345


   This pull request has been automatically marked as stale because it has not 
had recent activity. It will be closed if no further activity occurs.
   Feel free to reach out on the d...@hive.apache.org list if the patch is in 
need of reviews.




Issue Time Tracking
---

Worklog Id: (was: 618300)
Time Spent: 1h 40m  (was: 1.5h)

> HS2 never knows deregistering from Zookeeper in the particular case
> ---
>
> Key: HIVE-24713
> URL: https://issues.apache.org/jira/browse/HIVE-24713
> Project: Hive
>  Issue Type: Bug
>  Components: HiveServer2
>Reporter: Eugene Chung
>Assignee: Eugene Chung
>Priority: Major
>  Labels: pull-request-available
> Fix For: 4.0.0
>
>  Time Spent: 1h 40m
>  Remaining Estimate: 0h
>
> In ZooKeeper discovery mode, the following problem occurs reliably: HS2 never 
> learns that it has been deregistered from ZooKeeper.
> The problem is simple to reproduce:
>  # Find one of the ZK servers that holds the DeRegisterWatcher watches of 
> HS2 instances. If the ZK server version is 3.5.0 or above, it is easily 
> found with [http://zk-server:8080/commands/watches] (ZK AdminServer feature).
>  # Check which HS2 instance is watching the ZK server found in step 1; say it is 
> _hs2-of-2_.
>  # Restart the ZK server found in step 1.
>  # Deregister _hs2-of-2_ with the command
> {noformat}
> hive --service hiveserver2 -deregister hs2-of-2{noformat}
>  # _hs2-of-2_ never learns that it must shut down, because the watch event 
> of DeRegisterWatcher was already fired during step 3.
> The reason for the problem is explained at 
> [https://zookeeper.apache.org/doc/r3.3.3/zookeeperProgrammers.html#sc_WatchRememberThese]
> I added some logging to DeRegisterWatcher and checked which events 
> occurred during step 3 (the restart of the ZK server):
>  # WatchedEvent state:Disconnected type:None path:null
>  # WatchedEvent[WatchedEvent state:SyncConnected type:None path:null]
>  # WatchedEvent[WatchedEvent state:SaslAuthenticated type:None path:null]
>  # WatchedEvent[WatchedEvent state:SyncConnected type:NodeDataChanged
>  path:/hiveserver2/serverUri=hs2-of-2:1;version=3.1.2;sequence=000711]
> As the ZK manual says, watches are one-time triggers. When the connection to 
> the ZK server is reestablished, state:SyncConnected type:NodeDataChanged is 
> fired for the path, and the watch is then gone. *DeRegisterWatcher must be 
> registered again on the same znode to receive a future NodeDeleted event.*
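
The one-time-trigger behaviour described above can be illustrated with a small, self-contained toy model. This is only a sketch: it does not use the ZooKeeper client API, and `OneShotWatchRegistry`, `RearmingDeregisterWatcher`, `WatchDemo` and the znode path are hypothetical names invented for the illustration. Only the event type names (`NodeDataChanged`, `NodeDeleted`) are taken from the log above.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Consumer;

/** Toy stand-in for a server with one-time watches (NOT the ZooKeeper API). */
class OneShotWatchRegistry {
    private final Map<String, List<Consumer<String>>> watches = new HashMap<>();

    /** Sets a watch on a path; it is dropped as soon as it has fired once. */
    void watch(String path, Consumer<String> watcher) {
        watches.computeIfAbsent(path, p -> new ArrayList<>()).add(watcher);
    }

    /** Delivers an event to the watches currently set on the path and consumes them. */
    void fire(String path, String eventType) {
        List<Consumer<String>> current = watches.remove(path);
        if (current == null) {
            return; // nobody is watching any more, so the event is lost
        }
        for (Consumer<String> watcher : current) {
            watcher.accept(eventType);
        }
    }
}

/** Hypothetical watcher that can re-register itself after non-delete events. */
class RearmingDeregisterWatcher implements Consumer<String> {
    private final OneShotWatchRegistry registry;
    private final String path;
    private final boolean rearm;
    boolean shutdownRequested = false;

    RearmingDeregisterWatcher(OneShotWatchRegistry registry, String path, boolean rearm) {
        this.registry = registry;
        this.path = path;
        this.rearm = rearm;
    }

    @Override
    public void accept(String eventType) {
        if ("NodeDeleted".equals(eventType)) {
            shutdownRequested = true;      // the deregistration was noticed
        } else if (rearm) {
            registry.watch(path, this);    // set the watch again for future events
        }
    }
}

class WatchDemo {
    /** Replays steps 3 and 4 of the reproduction against the toy registry. */
    static boolean simulate(boolean rearm) {
        OneShotWatchRegistry zk = new OneShotWatchRegistry();
        String znode = "/hiveserver2/hs2-of-2"; // hypothetical path
        RearmingDeregisterWatcher watcher = new RearmingDeregisterWatcher(zk, znode, rearm);
        zk.watch(znode, watcher);
        zk.fire(znode, "NodeDataChanged"); // step 3: fired after the reconnect
        zk.fire(znode, "NodeDeleted");     // step 4: the deregister command
        return watcher.shutdownRequested;
    }

    public static void main(String[] args) {
        System.out.println("without re-registration: " + simulate(false));
        System.out.println("with re-registration:    " + simulate(true));
    }
}
```

In this model the watcher that does not re-register never sees the delete event, while the re-registering one does. How the actual fix in the pull request re-registers the watch is not shown in this message.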





[jira] [Work logged] (HIVE-25082) Make updateTimezone a default method on SettableTreeReader

2021-07-02 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25082?focusedWorklogId=618301&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-618301
 ]

ASF GitHub Bot logged work on HIVE-25082:
-

Author: ASF GitHub Bot
Created on: 03/Jul/21 00:08
Start Date: 03/Jul/21 00:08
Worklog Time Spent: 10m 
  Work Description: github-actions[bot] commented on pull request #2236:
URL: https://github.com/apache/hive/pull/2236#issuecomment-873311330


   This pull request has been automatically marked as stale because it has not 
had recent activity. It will be closed if no further activity occurs.
   Feel free to reach out on the d...@hive.apache.org list if the patch is in 
need of reviews.




Issue Time Tracking
---

Worklog Id: (was: 618301)
Time Spent: 40m  (was: 0.5h)

> Make updateTimezone a default method on SettableTreeReader
> --
>
> Key: HIVE-25082
> URL: https://issues.apache.org/jira/browse/HIVE-25082
> Project: Hive
>  Issue Type: Improvement
>Reporter: Panagiotis Garefalakis
>Assignee: Panagiotis Garefalakis
>Priority: Minor
>  Labels: pull-request-available
>  Time Spent: 40m
>  Remaining Estimate: 0h
>
> Avoid useless TimestampStreamReader instance checks by making 
> updateTimezone() a default method in SettableTreeReader
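
The idea of replacing instance checks with a default method can be sketched as follows. The types below are simplified, hypothetical stand-ins that only borrow the names from the issue; the method signature, the `LongStreamReader` sibling class, and the `TimezoneDemo` helper are assumptions made for the illustration and are not taken from the Hive source.

```java
import java.util.Arrays;
import java.util.List;

/** Simplified stand-in for the interface named in the issue (signature is assumed). */
interface SettableTreeReader {
    /** Default no-op, inherited by readers that do not depend on a time zone. */
    default void updateTimezone(String writerTimezoneId) {
        // nothing to do for most readers
    }
}

/** Hypothetical reader without time zone state; it inherits the no-op default. */
class LongStreamReader implements SettableTreeReader {
}

/** Stand-in for the timestamp reader; the only one that overrides the default. */
class TimestampStreamReader implements SettableTreeReader {
    String timezoneId = "UTC";

    @Override
    public void updateTimezone(String writerTimezoneId) {
        this.timezoneId = writerTimezoneId;
    }
}

class TimezoneDemo {
    /** The caller needs no instanceof check: every reader accepts the call. */
    static void updateAll(List<SettableTreeReader> readers, String writerTimezoneId) {
        for (SettableTreeReader reader : readers) {
            reader.updateTimezone(writerTimezoneId);
        }
    }

    public static void main(String[] args) {
        TimestampStreamReader timestamps = new TimestampStreamReader();
        List<SettableTreeReader> readers = Arrays.asList(new LongStreamReader(), timestamps);
        updateAll(readers, "Asia/Tokyo");
        System.out.println(timestamps.timezoneId);
    }
}
```

With the default in place, the loop calls `updateTimezone` uniformly and only the timestamp reader reacts to it.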


